linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/4] Improved seccomp logging
@ 2017-02-14  3:45 Tyler Hicks
  2017-02-14  3:45 ` [PATCH v3 1/4] seccomp: Add sysctl to display available actions Tyler Hicks
                   ` (4 more replies)
  0 siblings, 5 replies; 29+ messages in thread
From: Tyler Hicks @ 2017-02-14  3:45 UTC (permalink / raw)
  To: Paul Moore, Eric Paris, Kees Cook, Andy Lutomirski, Will Drewry
  Cc: linux-audit, linux-kernel, John Crispin, linux-api

This patch set is the third revision of the following two previously
submitted patch sets:

v1: http://lkml.kernel.org/r/1483375990-14948-1-git-send-email-tyhicks@canonical.com
v1: http://lkml.kernel.org/r/1483377999-15019-2-git-send-email-tyhicks@canonical.com

v2: http://lkml.kernel.org/r/1486100262-32391-1-git-send-email-tyhicks@canonical.com

The patch set aims to address some known deficiencies in seccomp's current
logging capabilities:

  1. Inability to log all filter actions.
  2. Inability to selectively enable filtering; e.g. devs want noisy logging,
     users want relative quiet.
  3. Consistent behavior with audit enabled and disabled.
  4. Inability to easily develop a filter due to the lack of a
     permissive/complain mode.

Changes since v2 to address feedback from Kees:
- Patch 1
  + Log a warning when sysctl registration fails
  + Move comment describing SECCOMP_RET_*_NAME from PATCH 2
  + Document the actions_avail sysctl
- Patch 2
  + Inline seccomp_log()
  + Optimize logging for RET_ALLOW hot path
  + Use "{ }" for name buffer initialization
  + Make a copy of the ctl_table and only modify the copy
  + Rename max_action_to_log sysctl to log_max_action
  + Document the log_max_action sysctl
- Patch 3
  + Put some space between RET_LOG and RET_ALLOW for future actions
  + Separate the RET_ALLOW and RET_LOG cases in __seccomp_filter()
- Patch 4
  + Adjust the selftests for the updated RET_LOG value

Tyler


Tyler Hicks (4):
  seccomp: Add sysctl to display available actions
  seccomp: Add sysctl to configure actions that should be logged
  seccomp: Create an action to log before allowing
  seccomp: Add tests for SECCOMP_RET_LOG

 Documentation/prctl/seccomp_filter.txt        |  43 ++++++
 Documentation/sysctl/kernel.txt               |   1 +
 include/linux/audit.h                         |   6 +-
 include/uapi/linux/seccomp.h                  |   1 +
 kernel/seccomp.c                              | 185 +++++++++++++++++++++++++-
 tools/testing/selftests/seccomp/seccomp_bpf.c |  94 +++++++++++++
 6 files changed, 322 insertions(+), 8 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v3 1/4] seccomp: Add sysctl to display available actions
  2017-02-14  3:45 [PATCH v3 0/4] Improved seccomp logging Tyler Hicks
@ 2017-02-14  3:45 ` Tyler Hicks
  2017-02-16  1:00   ` Kees Cook
  2017-02-16  3:14   ` Andy Lutomirski
  2017-02-14  3:45 ` [PATCH v3 2/4] seccomp: Add sysctl to configure actions that should be logged Tyler Hicks
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 29+ messages in thread
From: Tyler Hicks @ 2017-02-14  3:45 UTC (permalink / raw)
  To: Paul Moore, Eric Paris, Kees Cook, Andy Lutomirski, Will Drewry
  Cc: linux-audit, linux-kernel, John Crispin, linux-api

This patch creates a read-only sysctl containing an ordered list of
seccomp actions that the kernel supports. The ordering, from left to
right, is the lowest action value (kill) to the highest action value
(allow). Currently, a read of the sysctl file would return "kill trap
errno trace allow". The contents of this sysctl file can be useful for
userspace code as well as the system administrator.

The path to the sysctl is:

  /proc/sys/kernel/seccomp/actions_avail

libseccomp and other userspace code can easily determine which actions
the current kernel supports. The set of actions supported by the current
kernel may be different than the set of action macros found in kernel
headers that were installed where the userspace code was built.

In addition, this sysctl will allow system administrators to know which
actions are supported by the kernel and make it easier to configure
exactly what seccomp logs through the audit subsystem. Support for this
level of logging configuration will come in a future patch.

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
---
 Documentation/prctl/seccomp_filter.txt | 16 ++++++++++
 Documentation/sysctl/kernel.txt        |  1 +
 kernel/seccomp.c                       | 55 ++++++++++++++++++++++++++++++++++
 3 files changed, 72 insertions(+)

diff --git a/Documentation/prctl/seccomp_filter.txt b/Documentation/prctl/seccomp_filter.txt
index 1e469ef..a5554ff 100644
--- a/Documentation/prctl/seccomp_filter.txt
+++ b/Documentation/prctl/seccomp_filter.txt
@@ -166,7 +166,23 @@ The samples/seccomp/ directory contains both an x86-specific example
 and a more generic example of a higher level macro interface for BPF
 program generation.
 
+Sysctls
+-------
+
+Seccomp's sysctl files can be found in the /proc/sys/kernel/seccomp/
+directory. Here's a description of each file in that directory:
+
+actions_avail:
+	A read-only ordered list of seccomp return values (refer to the
+	SECCOMP_RET_* macros above) in string form. The ordering, from
+	left-to-right, is the least permissive return value to the most
+	permissive return value.
 
+	The list represents the set of seccomp return values supported
+	by the kernel. A userspace program may use this list to
+	determine if the actions found in the seccomp.h, when the
+	program was built, differs from the set of actions actually
+	supported in the current running kernel.
 
 Adding architecture support
 -----------------------
diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index a32b4b7..56f9b29 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -74,6 +74,7 @@ show up in /proc/sys/kernel:
 - reboot-cmd                  [ SPARC only ]
 - rtsig-max
 - rtsig-nr
+- seccomp/                    ==> Documentation/prctl/seccomp_filter.txt
 - sem
 - sem_next_id		      [ sysv ipc ]
 - sg-big-buff                 [ generic SCSI device (sg) ]
diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index f7ce79a..e36dfe9 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -16,10 +16,12 @@
 #include <linux/atomic.h>
 #include <linux/audit.h>
 #include <linux/compat.h>
+#include <linux/kmemleak.h>
 #include <linux/sched.h>
 #include <linux/seccomp.h>
 #include <linux/slab.h>
 #include <linux/syscalls.h>
+#include <linux/sysctl.h>
 
 #ifdef CONFIG_HAVE_ARCH_SECCOMP_FILTER
 #include <asm/syscall.h>
@@ -905,3 +907,56 @@ long seccomp_get_filter(struct task_struct *task, unsigned long filter_off,
 	return ret;
 }
 #endif
+
+#ifdef CONFIG_SYSCTL
+
+/* Human readable action names for friendly sysctl interaction */
+#define SECCOMP_RET_KILL_NAME		"kill"
+#define SECCOMP_RET_TRAP_NAME		"trap"
+#define SECCOMP_RET_ERRNO_NAME		"errno"
+#define SECCOMP_RET_TRACE_NAME		"trace"
+#define SECCOMP_RET_ALLOW_NAME		"allow"
+
+static char seccomp_actions_avail[] = SECCOMP_RET_KILL_NAME	" "
+				      SECCOMP_RET_TRAP_NAME	" "
+				      SECCOMP_RET_ERRNO_NAME	" "
+				      SECCOMP_RET_TRACE_NAME	" "
+				      SECCOMP_RET_ALLOW_NAME;
+
+static struct ctl_path seccomp_sysctl_path[] = {
+	{ .procname = "kernel", },
+	{ .procname = "seccomp", },
+	{ }
+};
+
+static struct ctl_table seccomp_sysctl_table[] = {
+	{
+		.procname	= "actions_avail",
+		.data		= &seccomp_actions_avail,
+		.maxlen		= sizeof(seccomp_actions_avail),
+		.mode		= 0444,
+		.proc_handler	= proc_dostring,
+	},
+	{ }
+};
+
+static int __init seccomp_sysctl_init(void)
+{
+	struct ctl_table_header *hdr;
+
+	hdr = register_sysctl_paths(seccomp_sysctl_path, seccomp_sysctl_table);
+	if (!hdr)
+		pr_warn("seccomp: sysctl registration failed\n");
+	else
+		kmemleak_not_leak(hdr);
+
+	return 0;
+}
+
+#else /* CONFIG_SYSCTL */
+
+static __init int seccomp_sysctl_init(void) { return 0; }
+
+#endif /* CONFIG_SYSCTL */
+
+device_initcall(seccomp_sysctl_init)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 2/4] seccomp: Add sysctl to configure actions that should be logged
  2017-02-14  3:45 [PATCH v3 0/4] Improved seccomp logging Tyler Hicks
  2017-02-14  3:45 ` [PATCH v3 1/4] seccomp: Add sysctl to display available actions Tyler Hicks
@ 2017-02-14  3:45 ` Tyler Hicks
  2017-02-14  3:45 ` [PATCH v3 3/4] seccomp: Create an action to log before allowing Tyler Hicks
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 29+ messages in thread
From: Tyler Hicks @ 2017-02-14  3:45 UTC (permalink / raw)
  To: Paul Moore, Eric Paris, Kees Cook, Andy Lutomirski, Will Drewry
  Cc: linux-audit, linux-kernel, John Crispin, linux-api

Administrators can write to this sysctl to set the maximum seccomp
action that should be logged. Any actions with values greater than
what's written to the sysctl will not be logged.

For example, all SECCOMP_RET_KILL, SECCOMP_RET_TRAP, and
SECCOMP_RET_ERRNO actions would be logged if "errno" were written to the
sysctl. SECCOMP_RET_TRACE and SECCOMP_RET_ALLOW actions would not be
logged since their values are higher than SECCOMP_RET_ERRNO.

The path to the sysctl is:

 /proc/sys/kernel/seccomp/log_max_action

The actions_avail sysctl can be read to discover the valid action names
that can be written to the log_max_action sysctl. The actions_avail
sysctl is also useful in understanding the ordering of actions used when
deciding the maximum action to log.

The default setting for the sysctl is to only log SECCOMP_RET_KILL
actions which matches the existing behavior.

There's one important exception to this sysctl. If a task is
specifically being audited, meaning that an audit context has been
allocated for the task, seccomp will log all actions other than
SECCOMP_RET_ALLOW despite the value of log_max_action. This exception
preserves the existing auditing behavior of tasks with an allocated
audit context.

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
---
 Documentation/prctl/seccomp_filter.txt |  21 ++++++
 include/linux/audit.h                  |   6 +-
 kernel/seccomp.c                       | 123 ++++++++++++++++++++++++++++++++-
 3 files changed, 142 insertions(+), 8 deletions(-)

diff --git a/Documentation/prctl/seccomp_filter.txt b/Documentation/prctl/seccomp_filter.txt
index a5554ff..487cb0c 100644
--- a/Documentation/prctl/seccomp_filter.txt
+++ b/Documentation/prctl/seccomp_filter.txt
@@ -184,6 +184,27 @@ actions_avail:
 	program was built, differs from the set of actions actually
 	supported in the current running kernel.
 
+log_max_action:
+	A read-write file containing the most permissive seccomp return
+	value that should be logged. The list of valid strings that can
+	be written to this file can be found in the actions_avail sysctl.
+
+	The actions_avail sysctl can also serve as a way to discover the
+	ordering of seccomp return values so that you can easily
+	understand which return values will be logged. For example,
+	assume that the actions_avail file contains
+	"kill trap errno trace allow" and "errno" is written to the
+	log_max_action file. The SECCOMP_RET_KILL, SECCOMP_RET_TRAP, and
+	SECCOMP_RET_ERRNO actions will be logged.
+
+	It is important to note that the value of log_max_action does
+	not prevent certain actions from being logged when the audit
+	subsystem is configured to audit a task. If the action is more
+	permissive than what's written to log_max_action, the final
+	decision on whether to audit the action for that task is
+	ultimately left up to the audit subsystem to decide for all
+	seccomp return values other than SECCOMP_RET_ALLOW.
+
 Adding architecture support
 -----------------------
 
diff --git a/include/linux/audit.h b/include/linux/audit.h
index f51fca8d..e0d95fc 100644
--- a/include/linux/audit.h
+++ b/include/linux/audit.h
@@ -315,11 +315,7 @@ void audit_core_dumps(long signr);
 
 static inline void audit_seccomp(unsigned long syscall, long signr, int code)
 {
-	if (!audit_enabled)
-		return;
-
-	/* Force a record to be reported if a signal was delivered. */
-	if (signr || unlikely(!audit_dummy_context()))
+	if (audit_enabled && unlikely(!audit_dummy_context()))
 		__audit_seccomp(syscall, signr, code);
 }
 
diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index e36dfe9..270a227 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -509,6 +509,22 @@ static void seccomp_send_sigsys(int syscall, int reason)
 }
 #endif	/* CONFIG_SECCOMP_FILTER */
 
+static u32 seccomp_log_max_action = SECCOMP_RET_KILL;
+
+static inline void seccomp_log(unsigned long syscall, long signr, u32 action)
+{
+	/* Force an audit message to be emitted when the action is not greater
+	 * than the configured maximum action.
+	 */
+	if (action <= seccomp_log_max_action)
+		return __audit_seccomp(syscall, signr, action);
+
+	/* Let the audit subsystem decide if the action should be audited based
+	 * on whether the current task itself is being audited.
+	 */
+	return audit_seccomp(syscall, signr, action);
+}
+
 /*
  * Secure computing mode 1 allows only read/write/exit/sigreturn.
  * To be fully secure this must be combined with rlimit
@@ -534,7 +550,7 @@ static void __secure_computing_strict(int this_syscall)
 #ifdef SECCOMP_DEBUG
 	dump_stack();
 #endif
-	audit_seccomp(this_syscall, SIGKILL, SECCOMP_RET_KILL);
+	seccomp_log(this_syscall, SIGKILL, SECCOMP_RET_KILL);
 	do_exit(SIGKILL);
 }
 
@@ -633,18 +649,30 @@ static int __seccomp_filter(int this_syscall, const struct seccomp_data *sd,
 		return 0;
 
 	case SECCOMP_RET_ALLOW:
+		/* Open-coded seccomp_log(), optimized for the RET_ALLOW hot
+		 * path.
+		 *
+		 * We only want to log RET_ALLOW actions when the admin has
+		 * configured them to be logged via the log_max_action sysctl.
+		 * Therefore, call __audit_seccomp() directly so that RET_ALLOW
+		 * actions are not audited simply because the task is being
+		 * audited.
+		 */
+		if (unlikely(seccomp_log_max_action == SECCOMP_RET_ALLOW))
+			__audit_seccomp(this_syscall, 0, action);
+
 		return 0;
 
 	case SECCOMP_RET_KILL:
 	default:
-		audit_seccomp(this_syscall, SIGSYS, action);
+		seccomp_log(this_syscall, SIGSYS, action);
 		do_exit(SIGSYS);
 	}
 
 	unreachable();
 
 skip:
-	audit_seccomp(this_syscall, 0, action);
+	seccomp_log(this_syscall, 0, action);
 	return -1;
 }
 #else
@@ -917,12 +945,96 @@ long seccomp_get_filter(struct task_struct *task, unsigned long filter_off,
 #define SECCOMP_RET_TRACE_NAME		"trace"
 #define SECCOMP_RET_ALLOW_NAME		"allow"
 
+/* Largest strlen() of all action names */
+#define SECCOMP_RET_MAX_NAME_LEN	5
+
 static char seccomp_actions_avail[] = SECCOMP_RET_KILL_NAME	" "
 				      SECCOMP_RET_TRAP_NAME	" "
 				      SECCOMP_RET_ERRNO_NAME	" "
 				      SECCOMP_RET_TRACE_NAME	" "
 				      SECCOMP_RET_ALLOW_NAME;
 
+struct seccomp_action_name {
+	u32		action;
+	const char	*name;
+};
+
+static struct seccomp_action_name seccomp_action_names[] = {
+	{ SECCOMP_RET_KILL, SECCOMP_RET_KILL_NAME },
+	{ SECCOMP_RET_TRAP, SECCOMP_RET_TRAP_NAME },
+	{ SECCOMP_RET_ERRNO, SECCOMP_RET_ERRNO_NAME },
+	{ SECCOMP_RET_TRACE, SECCOMP_RET_TRACE_NAME },
+	{ SECCOMP_RET_ALLOW, SECCOMP_RET_ALLOW_NAME },
+	{ }
+};
+
+static bool seccomp_name_from_action(const char **namep, u32 action)
+{
+	struct seccomp_action_name *cur;
+
+	for (cur = seccomp_action_names; cur->name; cur++) {
+		if (cur->action == action) {
+			*namep = cur->name;
+			return true;
+		}
+	}
+
+	return false;
+}
+
+static bool seccomp_action_from_name(u32 *action, const char *name)
+{
+	struct seccomp_action_name *cur;
+
+	for (cur = seccomp_action_names; cur->name; cur++) {
+		if (!strcmp(cur->name, name)) {
+			*action = cur->action;
+			return true;
+		}
+	}
+
+	return false;
+}
+
+static int seccomp_log_max_action_handler(struct ctl_table *ro_table, int write,
+					  void __user *buffer, size_t *lenp,
+					  loff_t *ppos)
+{
+	char name[SECCOMP_RET_MAX_NAME_LEN + 1] = { };
+	struct ctl_table table;
+	int ret;
+
+	if (write && !capable(CAP_SYS_ADMIN))
+		return -EPERM;
+
+	if (!write) {
+		const char *namep;
+
+		if (!seccomp_name_from_action(&namep, seccomp_log_max_action))
+			return -EINVAL;
+
+		strncpy(name, namep, sizeof(name) - 1);
+	}
+
+	table = *ro_table;
+	table.data = name;
+	table.maxlen = sizeof(name);
+	ret = proc_dostring(&table, write, buffer, lenp, ppos);
+	if (ret)
+		return ret;
+
+	if (write) {
+		u32 action;
+
+		if (!seccomp_action_from_name(&action, table.data))
+			return -EINVAL;
+
+		seccomp_log_max_action = action;
+	}
+
+	return 0;
+}
+
 static struct ctl_path seccomp_sysctl_path[] = {
 	{ .procname = "kernel", },
 	{ .procname = "seccomp", },
@@ -937,6 +1049,11 @@ static struct ctl_table seccomp_sysctl_table[] = {
 		.mode		= 0444,
 		.proc_handler	= proc_dostring,
 	},
+	{
+		.procname	= "log_max_action",
+		.mode		= 0644,
+		.proc_handler	= seccomp_log_max_action_handler,
+	},
 	{ }
 };
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 3/4] seccomp: Create an action to log before allowing
  2017-02-14  3:45 [PATCH v3 0/4] Improved seccomp logging Tyler Hicks
  2017-02-14  3:45 ` [PATCH v3 1/4] seccomp: Add sysctl to display available actions Tyler Hicks
  2017-02-14  3:45 ` [PATCH v3 2/4] seccomp: Add sysctl to configure actions that should be logged Tyler Hicks
@ 2017-02-14  3:45 ` Tyler Hicks
  2017-02-14  3:45 ` [PATCH v3 4/4] seccomp: Add tests for SECCOMP_RET_LOG Tyler Hicks
  2017-02-16  3:24 ` [PATCH v3 0/4] Improved seccomp logging Andy Lutomirski
  4 siblings, 0 replies; 29+ messages in thread
From: Tyler Hicks @ 2017-02-14  3:45 UTC (permalink / raw)
  To: Paul Moore, Eric Paris, Kees Cook, Andy Lutomirski, Will Drewry
  Cc: linux-audit, linux-kernel, John Crispin, linux-api

Add a new action, SECCOMP_RET_LOG, that logs a syscall before allowing
the syscall. At the implementation level, this action is identical to
the existing SECCOMP_RET_ALLOW action. However, it can be very useful when
initially developing a seccomp filter for an application. The developer
can set the default action to be SECCOMP_RET_LOG, maybe mark any
obviously needed syscalls with SECCOMP_RET_ALLOW, and then put the
application through its paces. A list of syscalls that triggered the
default action (SECCOMP_RET_LOG) can be easily gleaned from the logs and
that list can be used to build the syscall whitelist. Finally, the
developer can change the default action to the desired value.

This provides a more friendly experience than seeing the application get
killed, then updating the filter and rebuilding the app, seeing the
application get killed due to a different syscall, then updating the
filter and rebuilding the app, etc.

The functionality is similar to what's supported by the various LSMs.
SELinux has permissive mode, AppArmor has complain mode, SMACK has
bring-up mode, etc.

SECCOMP_RET_LOG is given a lower value than SECCOMP_RET_ALLOW so that
"allow" can be written to the max_action_to_log sysctl in order to get a
list of logged actions without the, potentially larger, set of allowed
actions.

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
---
 Documentation/prctl/seccomp_filter.txt | 6 ++++++
 include/uapi/linux/seccomp.h           | 1 +
 kernel/seccomp.c                       | 7 +++++++
 3 files changed, 14 insertions(+)

diff --git a/Documentation/prctl/seccomp_filter.txt b/Documentation/prctl/seccomp_filter.txt
index 487cb0c..b776bc7 100644
--- a/Documentation/prctl/seccomp_filter.txt
+++ b/Documentation/prctl/seccomp_filter.txt
@@ -138,6 +138,12 @@ SECCOMP_RET_TRACE:
 	allow use of ptrace, even of other sandboxed processes, without
 	extreme care; ptracers can use this mechanism to escape.)
 
+SECCOMP_RET_LOG:
+	Results in the system call being executed after it is logged. This
+	should be used by application developers to learn which syscalls their
+	application needs without having to iterate through multiple test and
+	development cycles to build the list.
+
 SECCOMP_RET_ALLOW:
 	Results in the system call being executed.
 
diff --git a/include/uapi/linux/seccomp.h b/include/uapi/linux/seccomp.h
index 0f238a4..bb7d57d 100644
--- a/include/uapi/linux/seccomp.h
+++ b/include/uapi/linux/seccomp.h
@@ -29,6 +29,7 @@
 #define SECCOMP_RET_TRAP	0x00030000U /* disallow and force a SIGSYS */
 #define SECCOMP_RET_ERRNO	0x00050000U /* returns an errno */
 #define SECCOMP_RET_TRACE	0x7ff00000U /* pass to a tracer or disallow */
+#define SECCOMP_RET_LOG		0x7ffc0000U /* allow after logging */
 #define SECCOMP_RET_ALLOW	0x7fff0000U /* allow */
 
 /* Masks for the return value sections. */
diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index 270a227..1f52c56 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -648,6 +648,10 @@ static int __seccomp_filter(int this_syscall, const struct seccomp_data *sd,
 
 		return 0;
 
+	case SECCOMP_RET_LOG:
+		seccomp_log(this_syscall, 0, action);
+		return 0;
+
 	case SECCOMP_RET_ALLOW:
 		/* Open-coded seccomp_log(), optimized for the RET_ALLOW hot
 		 * path.
@@ -943,6 +947,7 @@ long seccomp_get_filter(struct task_struct *task, unsigned long filter_off,
 #define SECCOMP_RET_TRAP_NAME		"trap"
 #define SECCOMP_RET_ERRNO_NAME		"errno"
 #define SECCOMP_RET_TRACE_NAME		"trace"
+#define SECCOMP_RET_LOG_NAME		"log"
 #define SECCOMP_RET_ALLOW_NAME		"allow"
 
 /* Largest strlen() of all action names */
@@ -952,6 +957,7 @@ static char seccomp_actions_avail[] = SECCOMP_RET_KILL_NAME	" "
 				      SECCOMP_RET_TRAP_NAME	" "
 				      SECCOMP_RET_ERRNO_NAME	" "
 				      SECCOMP_RET_TRACE_NAME	" "
+				      SECCOMP_RET_LOG_NAME	" "
 				      SECCOMP_RET_ALLOW_NAME;
 
 struct seccomp_action_name {
@@ -964,6 +970,7 @@ static struct seccomp_action_name seccomp_action_names[] = {
 	{ SECCOMP_RET_TRAP, SECCOMP_RET_TRAP_NAME },
 	{ SECCOMP_RET_ERRNO, SECCOMP_RET_ERRNO_NAME },
 	{ SECCOMP_RET_TRACE, SECCOMP_RET_TRACE_NAME },
+	{ SECCOMP_RET_LOG, SECCOMP_RET_LOG_NAME },
 	{ SECCOMP_RET_ALLOW, SECCOMP_RET_ALLOW_NAME },
 	{ }
 };
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 4/4] seccomp: Add tests for SECCOMP_RET_LOG
  2017-02-14  3:45 [PATCH v3 0/4] Improved seccomp logging Tyler Hicks
                   ` (2 preceding siblings ...)
  2017-02-14  3:45 ` [PATCH v3 3/4] seccomp: Create an action to log before allowing Tyler Hicks
@ 2017-02-14  3:45 ` Tyler Hicks
  2017-02-16  3:24 ` [PATCH v3 0/4] Improved seccomp logging Andy Lutomirski
  4 siblings, 0 replies; 29+ messages in thread
From: Tyler Hicks @ 2017-02-14  3:45 UTC (permalink / raw)
  To: Paul Moore, Eric Paris, Kees Cook, Andy Lutomirski, Will Drewry
  Cc: linux-audit, linux-kernel, John Crispin, linux-api

Extend the kernel selftests for seccomp to test the newly added
SECCOMP_RET_LOG action. The added tests follow the example of existing
tests.

Unfortunately, the tests are not capable of inspecting the audit log to
verify that the syscall was logged.

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
---
 tools/testing/selftests/seccomp/seccomp_bpf.c | 94 +++++++++++++++++++++++++++
 1 file changed, 94 insertions(+)

diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
index 03f1fa4..5633b42 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -87,6 +87,10 @@ struct seccomp_data {
 };
 #endif
 
+#ifndef SECCOMP_RET_LOG
+#define SECCOMP_RET_LOG       0x7ffc0000U /* allow after logging */
+#endif
+
 #if __BYTE_ORDER == __LITTLE_ENDIAN
 #define syscall_arg(_n) (offsetof(struct seccomp_data, args[_n]))
 #elif __BYTE_ORDER == __BIG_ENDIAN
@@ -342,6 +346,28 @@ TEST(empty_prog)
 	EXPECT_EQ(EINVAL, errno);
 }
 
+TEST(log_all)
+{
+	struct sock_filter filter[] = {
+		BPF_STMT(BPF_RET|BPF_K, SECCOMP_RET_LOG),
+	};
+	struct sock_fprog prog = {
+		.len = (unsigned short)ARRAY_SIZE(filter),
+		.filter = filter,
+	};
+	long ret;
+	pid_t parent = getppid();
+
+	ret = prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
+	ASSERT_EQ(0, ret);
+
+	ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
+	ASSERT_EQ(0, ret);
+
+	/* getppid() should succeed and be logged (no check for logging) */
+	EXPECT_EQ(parent, syscall(__NR_getppid));
+}
+
 TEST_SIGNAL(unknown_ret_is_kill_inside, SIGSYS)
 {
 	struct sock_filter filter[] = {
@@ -735,6 +761,7 @@ TEST_F(TRAP, handler)
 
 FIXTURE_DATA(precedence) {
 	struct sock_fprog allow;
+	struct sock_fprog log;
 	struct sock_fprog trace;
 	struct sock_fprog error;
 	struct sock_fprog trap;
@@ -746,6 +773,13 @@ FIXTURE_SETUP(precedence)
 	struct sock_filter allow_insns[] = {
 		BPF_STMT(BPF_RET|BPF_K, SECCOMP_RET_ALLOW),
 	};
+	struct sock_filter log_insns[] = {
+		BPF_STMT(BPF_LD|BPF_W|BPF_ABS,
+			offsetof(struct seccomp_data, nr)),
+		BPF_JUMP(BPF_JMP|BPF_JEQ|BPF_K, __NR_getpid, 1, 0),
+		BPF_STMT(BPF_RET|BPF_K, SECCOMP_RET_ALLOW),
+		BPF_STMT(BPF_RET|BPF_K, SECCOMP_RET_LOG),
+	};
 	struct sock_filter trace_insns[] = {
 		BPF_STMT(BPF_LD|BPF_W|BPF_ABS,
 			offsetof(struct seccomp_data, nr)),
@@ -782,6 +816,7 @@ FIXTURE_SETUP(precedence)
 	memcpy(self->_x.filter, &_x##_insns, sizeof(_x##_insns)); \
 	self->_x.len = (unsigned short)ARRAY_SIZE(_x##_insns)
 	FILTER_ALLOC(allow);
+	FILTER_ALLOC(log);
 	FILTER_ALLOC(trace);
 	FILTER_ALLOC(error);
 	FILTER_ALLOC(trap);
@@ -792,6 +827,7 @@ FIXTURE_TEARDOWN(precedence)
 {
 #define FILTER_FREE(_x) if (self->_x.filter) free(self->_x.filter)
 	FILTER_FREE(allow);
+	FILTER_FREE(log);
 	FILTER_FREE(trace);
 	FILTER_FREE(error);
 	FILTER_FREE(trap);
@@ -809,6 +845,8 @@ TEST_F(precedence, allow_ok)
 
 	ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->allow);
 	ASSERT_EQ(0, ret);
+	ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->log);
+	ASSERT_EQ(0, ret);
 	ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->trace);
 	ASSERT_EQ(0, ret);
 	ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->error);
@@ -833,6 +871,8 @@ TEST_F_SIGNAL(precedence, kill_is_highest, SIGSYS)
 
 	ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->allow);
 	ASSERT_EQ(0, ret);
+	ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->log);
+	ASSERT_EQ(0, ret);
 	ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->trace);
 	ASSERT_EQ(0, ret);
 	ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->error);
@@ -864,6 +904,8 @@ TEST_F_SIGNAL(precedence, kill_is_highest_in_any_order, SIGSYS)
 	ASSERT_EQ(0, ret);
 	ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->error);
 	ASSERT_EQ(0, ret);
+	ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->log);
+	ASSERT_EQ(0, ret);
 	ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->trace);
 	ASSERT_EQ(0, ret);
 	ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->trap);
@@ -885,6 +927,8 @@ TEST_F_SIGNAL(precedence, trap_is_second, SIGSYS)
 
 	ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->allow);
 	ASSERT_EQ(0, ret);
+	ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->log);
+	ASSERT_EQ(0, ret);
 	ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->trace);
 	ASSERT_EQ(0, ret);
 	ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->error);
@@ -910,6 +954,8 @@ TEST_F_SIGNAL(precedence, trap_is_second_in_any_order, SIGSYS)
 	ASSERT_EQ(0, ret);
 	ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->trap);
 	ASSERT_EQ(0, ret);
+	ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->log);
+	ASSERT_EQ(0, ret);
 	ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->trace);
 	ASSERT_EQ(0, ret);
 	ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->error);
@@ -931,6 +977,8 @@ TEST_F(precedence, errno_is_third)
 
 	ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->allow);
 	ASSERT_EQ(0, ret);
+	ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->log);
+	ASSERT_EQ(0, ret);
 	ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->trace);
 	ASSERT_EQ(0, ret);
 	ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->error);
@@ -949,6 +997,8 @@ TEST_F(precedence, errno_is_third_in_any_order)
 	ret = prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
 	ASSERT_EQ(0, ret);
 
+	ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->log);
+	ASSERT_EQ(0, ret);
 	ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->error);
 	ASSERT_EQ(0, ret);
 	ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->trace);
@@ -971,6 +1021,8 @@ TEST_F(precedence, trace_is_fourth)
 
 	ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->allow);
 	ASSERT_EQ(0, ret);
+	ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->log);
+	ASSERT_EQ(0, ret);
 	ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->trace);
 	ASSERT_EQ(0, ret);
 	/* Should work just fine. */
@@ -992,12 +1044,54 @@ TEST_F(precedence, trace_is_fourth_in_any_order)
 	ASSERT_EQ(0, ret);
 	ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->allow);
 	ASSERT_EQ(0, ret);
+	ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->log);
+	ASSERT_EQ(0, ret);
 	/* Should work just fine. */
 	EXPECT_EQ(parent, syscall(__NR_getppid));
 	/* No ptracer */
 	EXPECT_EQ(-1, syscall(__NR_getpid));
 }
 
+TEST_F(precedence, log_is_fifth)
+{
+	pid_t mypid, parent;
+	long ret;
+
+	mypid = getpid();
+	parent = getppid();
+	ret = prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
+	ASSERT_EQ(0, ret);
+
+	ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->allow);
+	ASSERT_EQ(0, ret);
+	ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->log);
+	ASSERT_EQ(0, ret);
+	/* Should work just fine. */
+	EXPECT_EQ(parent, syscall(__NR_getppid));
+	/* Should also work just fine */
+	EXPECT_EQ(mypid, syscall(__NR_getpid));
+}
+
+TEST_F(precedence, log_is_fifth_in_any_order)
+{
+	pid_t mypid, parent;
+	long ret;
+
+	mypid = getpid();
+	parent = getppid();
+	ret = prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
+	ASSERT_EQ(0, ret);
+
+	ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->log);
+	ASSERT_EQ(0, ret);
+	ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->allow);
+	ASSERT_EQ(0, ret);
+	/* Should work just fine. */
+	EXPECT_EQ(parent, syscall(__NR_getppid));
+	/* Should also work just fine */
+	EXPECT_EQ(mypid, syscall(__NR_getpid));
+}
+
 #ifndef PTRACE_O_TRACESECCOMP
 #define PTRACE_O_TRACESECCOMP	0x00000080
 #endif
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 1/4] seccomp: Add sysctl to display available actions
  2017-02-14  3:45 ` [PATCH v3 1/4] seccomp: Add sysctl to display available actions Tyler Hicks
@ 2017-02-16  1:00   ` Kees Cook
  2017-02-16 18:43     ` Tyler Hicks
  2017-02-16  3:14   ` Andy Lutomirski
  1 sibling, 1 reply; 29+ messages in thread
From: Kees Cook @ 2017-02-16  1:00 UTC (permalink / raw)
  To: Tyler Hicks
  Cc: Paul Moore, Eric Paris, Andy Lutomirski, Will Drewry,
	linux-audit, LKML, John Crispin, linux-api

On Mon, Feb 13, 2017 at 7:45 PM, Tyler Hicks <tyhicks@canonical.com> wrote:
> This patch creates a read-only sysctl containing an ordered list of
> seccomp actions that the kernel supports. The ordering, from left to
> right, is the lowest action value (kill) to the highest action value
> (allow). Currently, a read of the sysctl file would return "kill trap
> errno trace allow". The contents of this sysctl file can be useful for
> userspace code as well as the system administrator.
>
> The path to the sysctl is:
>
>   /proc/sys/kernel/seccomp/actions_avail
>
> libseccomp and other userspace code can easily determine which actions
> the current kernel supports. The set of actions supported by the current
> kernel may be different than the set of action macros found in kernel
> headers that were installed where the userspace code was built.
>
> In addition, this sysctl will allow system administrators to know which
> actions are supported by the kernel and make it easier to configure
> exactly what seccomp logs through the audit subsystem. Support for this
> level of logging configuration will come in a future patch.
>
> Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
> ---
>  Documentation/prctl/seccomp_filter.txt | 16 ++++++++++
>  Documentation/sysctl/kernel.txt        |  1 +
>  kernel/seccomp.c                       | 55 ++++++++++++++++++++++++++++++++++
>  3 files changed, 72 insertions(+)
>
> diff --git a/Documentation/prctl/seccomp_filter.txt b/Documentation/prctl/seccomp_filter.txt
> index 1e469ef..a5554ff 100644
> --- a/Documentation/prctl/seccomp_filter.txt
> +++ b/Documentation/prctl/seccomp_filter.txt
> @@ -166,7 +166,23 @@ The samples/seccomp/ directory contains both an x86-specific example
>  and a more generic example of a higher level macro interface for BPF
>  program generation.
>
> +Sysctls
> +-------
> +
> +Seccomp's sysctl files can be found in the /proc/sys/kernel/seccomp/
> +directory. Here's a description of each file in that directory:
> +
> +actions_avail:
> +       A read-only ordered list of seccomp return values (refer to the
> +       SECCOMP_RET_* macros above) in string form. The ordering, from
> +       left-to-right, is the least permissive return value to the most
> +       permissive return value.
>
> +       The list represents the set of seccomp return values supported
> +       by the kernel. A userspace program may use this list to
> +       determine if the actions found in the seccomp.h, when the
> +       program was built, differs from the set of actions actually
> +       supported in the current running kernel.
>
>  Adding architecture support
>  -----------------------
> diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
> index a32b4b7..56f9b29 100644
> --- a/Documentation/sysctl/kernel.txt
> +++ b/Documentation/sysctl/kernel.txt
> @@ -74,6 +74,7 @@ show up in /proc/sys/kernel:
>  - reboot-cmd                  [ SPARC only ]
>  - rtsig-max
>  - rtsig-nr
> +- seccomp/                    ==> Documentation/prctl/seccomp_filter.txt
>  - sem
>  - sem_next_id                [ sysv ipc ]
>  - sg-big-buff                 [ generic SCSI device (sg) ]
> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> index f7ce79a..e36dfe9 100644
> --- a/kernel/seccomp.c
> +++ b/kernel/seccomp.c
> @@ -16,10 +16,12 @@
>  #include <linux/atomic.h>
>  #include <linux/audit.h>
>  #include <linux/compat.h>
> +#include <linux/kmemleak.h>
>  #include <linux/sched.h>
>  #include <linux/seccomp.h>
>  #include <linux/slab.h>
>  #include <linux/syscalls.h>
> +#include <linux/sysctl.h>
>
>  #ifdef CONFIG_HAVE_ARCH_SECCOMP_FILTER
>  #include <asm/syscall.h>
> @@ -905,3 +907,56 @@ long seccomp_get_filter(struct task_struct *task, unsigned long filter_off,
>         return ret;
>  }
>  #endif
> +
> +#ifdef CONFIG_SYSCTL
> +
> +/* Human readable action names for friendly sysctl interaction */
> +#define SECCOMP_RET_KILL_NAME          "kill"
> +#define SECCOMP_RET_TRAP_NAME          "trap"
> +#define SECCOMP_RET_ERRNO_NAME         "errno"
> +#define SECCOMP_RET_TRACE_NAME         "trace"
> +#define SECCOMP_RET_ALLOW_NAME         "allow"
> +
> +static char seccomp_actions_avail[] = SECCOMP_RET_KILL_NAME    " "
> +                                     SECCOMP_RET_TRAP_NAME     " "
> +                                     SECCOMP_RET_ERRNO_NAME    " "
> +                                     SECCOMP_RET_TRACE_NAME    " "
> +                                     SECCOMP_RET_ALLOW_NAME;
> +
> +static struct ctl_path seccomp_sysctl_path[] = {
> +       { .procname = "kernel", },
> +       { .procname = "seccomp", },
> +       { }
> +};
> +
> +static struct ctl_table seccomp_sysctl_table[] = {
> +       {
> +               .procname       = "actions_avail",
> +               .data           = &seccomp_actions_avail,
> +               .maxlen         = sizeof(seccomp_actions_avail),
> +               .mode           = 0444,
> +               .proc_handler   = proc_dostring,
> +       },
> +       { }
> +};
> +
> +static int __init seccomp_sysctl_init(void)
> +{
> +       struct ctl_table_header *hdr;
> +
> +       hdr = register_sysctl_paths(seccomp_sysctl_path, seccomp_sysctl_table);
> +       if (!hdr)
> +               pr_warn("seccomp: sysctl registration failed\n");
> +       else
> +               kmemleak_not_leak(hdr);
> +
> +       return 0;
> +}
> +
> +#else /* CONFIG_SYSCTL */
> +
> +static __init int seccomp_sysctl_init(void) { return 0; }
> +
> +#endif /* CONFIG_SYSCTL */
> +
> +device_initcall(seccomp_sysctl_init)

Can the device_initcall() just live in the CONFIG_SYSCTL #ifdef to
avoid the #else and stub?

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 1/4] seccomp: Add sysctl to display available actions
  2017-02-14  3:45 ` [PATCH v3 1/4] seccomp: Add sysctl to display available actions Tyler Hicks
  2017-02-16  1:00   ` Kees Cook
@ 2017-02-16  3:14   ` Andy Lutomirski
  2017-02-16 18:47     ` Tyler Hicks
  1 sibling, 1 reply; 29+ messages in thread
From: Andy Lutomirski @ 2017-02-16  3:14 UTC (permalink / raw)
  To: Tyler Hicks
  Cc: Paul Moore, Eric Paris, Kees Cook, Will Drewry, linux-audit,
	linux-kernel, John Crispin, linux-api

On Mon, Feb 13, 2017 at 7:45 PM, Tyler Hicks <tyhicks@canonical.com> wrote:
> This patch creates a read-only sysctl containing an ordered list of
> seccomp actions that the kernel supports. The ordering, from left to
> right, is the lowest action value (kill) to the highest action value
> (allow). Currently, a read of the sysctl file would return "kill trap
> errno trace allow". The contents of this sysctl file can be useful for
> userspace code as well as the system administrator.

Would this make more sense as a new seccomp(2) mode a la
SECCOMP_HAS_ACTION?  Then sandboxy things that have no fs access could
use it.

--Andy

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 0/4] Improved seccomp logging
  2017-02-14  3:45 [PATCH v3 0/4] Improved seccomp logging Tyler Hicks
                   ` (3 preceding siblings ...)
  2017-02-14  3:45 ` [PATCH v3 4/4] seccomp: Add tests for SECCOMP_RET_LOG Tyler Hicks
@ 2017-02-16  3:24 ` Andy Lutomirski
  2017-02-16 19:37   ` Tyler Hicks
  2017-02-16 23:29   ` Kees Cook
  4 siblings, 2 replies; 29+ messages in thread
From: Andy Lutomirski @ 2017-02-16  3:24 UTC (permalink / raw)
  To: Tyler Hicks
  Cc: Paul Moore, Eric Paris, Kees Cook, Will Drewry, linux-audit,
	linux-kernel, John Crispin, linux-api

On Mon, Feb 13, 2017 at 7:45 PM, Tyler Hicks <tyhicks@canonical.com> wrote:
> This patch set is the third revision of the following two previously
> submitted patch sets:
>
> v1: http://lkml.kernel.org/r/1483375990-14948-1-git-send-email-tyhicks@canonical.com
> v1: http://lkml.kernel.org/r/1483377999-15019-2-git-send-email-tyhicks@canonical.com
>
> v2: http://lkml.kernel.org/r/1486100262-32391-1-git-send-email-tyhicks@canonical.com
>
> The patch set aims to address some known deficiencies in seccomp's current
> logging capabilities:
>
>   1. Inability to log all filter actions.
>   2. Inability to selectively enable filtering; e.g. devs want noisy logging,
>      users want relative quiet.
>   3. Consistent behavior with audit enabled and disabled.
>   4. Inability to easily develop a filter due to the lack of a
>      permissive/complain mode.

I think I dislike this, but I think my dislikes may be fixable with
minor changes.

What I dislike is that this mixes app-specific built-in configuration
(seccomp) with global privileged stuff (audit).  The result is a
potentially difficult to use situation in which you need to modify an
app to make it loggable (using RET_LOG) and then fiddle with
privileged config (auditctl, etc) to actually see the logs.

What if, instead of logging straight to the audit log, SECCOMP_RET_LOG
[1] merely meant "tell our parent about this syscall"?  (Ideally we'd
also figure out a way to express "log this and allow", "log this and
do ERRNO", etc.)  Then we could have another mechanism that installs a
layer in the seccomp stack that, instead of catching syscalls, catches
log events and sticks them in a ring buffer (or audit).

Concretely, it might work like this.  If a filter returns
SECCOMP_RET_LOG, then we "log" and keep processing.  SECCOMP_RET_LOG
is otherwise treated literally like SECCOMP_RET_ALLOW and has no
effect on return value.  If you want log-and-kill, you install two
filters.

There's a new seccomp(2) action that returns an fd.  That fd
references a new thing in the seccomp stack that is a BPF program that
is called whenever SECCOMP_RET_LOG is returned from lower down.  The
output of this filter determines whether the log event is ignored,
stuck in the ring buffer, or passed up the stack for further
processing.  You read(2) the fd to access the ring buffer.

Using this mechanism, you could write a simple seccomptrace tool that
needs no privilege and dumps SECCOMP_RET_LOG events from the target
program to stderr.

Thoughts?

[1] If we went this route, it might want to be renamed.

P.S. We ought to be able to write a BPF verifier pass that makes sure
that filters don't return unsupported return values if we cared to do
so.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 1/4] seccomp: Add sysctl to display available actions
  2017-02-16  1:00   ` Kees Cook
@ 2017-02-16 18:43     ` Tyler Hicks
  0 siblings, 0 replies; 29+ messages in thread
From: Tyler Hicks @ 2017-02-16 18:43 UTC (permalink / raw)
  To: Kees Cook
  Cc: Paul Moore, Eric Paris, Andy Lutomirski, Will Drewry,
	linux-audit, LKML, John Crispin


[-- Attachment #1.1: Type: text/plain, Size: 6056 bytes --]

On 02/15/2017 07:00 PM, Kees Cook wrote:
> On Mon, Feb 13, 2017 at 7:45 PM, Tyler Hicks <tyhicks@canonical.com> wrote:
>> This patch creates a read-only sysctl containing an ordered list of
>> seccomp actions that the kernel supports. The ordering, from left to
>> right, is the lowest action value (kill) to the highest action value
>> (allow). Currently, a read of the sysctl file would return "kill trap
>> errno trace allow". The contents of this sysctl file can be useful for
>> userspace code as well as the system administrator.
>>
>> The path to the sysctl is:
>>
>>   /proc/sys/kernel/seccomp/actions_avail
>>
>> libseccomp and other userspace code can easily determine which actions
>> the current kernel supports. The set of actions supported by the current
>> kernel may be different than the set of action macros found in kernel
>> headers that were installed where the userspace code was built.
>>
>> In addition, this sysctl will allow system administrators to know which
>> actions are supported by the kernel and make it easier to configure
>> exactly what seccomp logs through the audit subsystem. Support for this
>> level of logging configuration will come in a future patch.
>>
>> Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
>> ---
>>  Documentation/prctl/seccomp_filter.txt | 16 ++++++++++
>>  Documentation/sysctl/kernel.txt        |  1 +
>>  kernel/seccomp.c                       | 55 ++++++++++++++++++++++++++++++++++
>>  3 files changed, 72 insertions(+)
>>
>> diff --git a/Documentation/prctl/seccomp_filter.txt b/Documentation/prctl/seccomp_filter.txt
>> index 1e469ef..a5554ff 100644
>> --- a/Documentation/prctl/seccomp_filter.txt
>> +++ b/Documentation/prctl/seccomp_filter.txt
>> @@ -166,7 +166,23 @@ The samples/seccomp/ directory contains both an x86-specific example
>>  and a more generic example of a higher level macro interface for BPF
>>  program generation.
>>
>> +Sysctls
>> +-------
>> +
>> +Seccomp's sysctl files can be found in the /proc/sys/kernel/seccomp/
>> +directory. Here's a description of each file in that directory:
>> +
>> +actions_avail:
>> +       A read-only ordered list of seccomp return values (refer to the
>> +       SECCOMP_RET_* macros above) in string form. The ordering, from
>> +       left-to-right, is the least permissive return value to the most
>> +       permissive return value.
>>
>> +       The list represents the set of seccomp return values supported
>> +       by the kernel. A userspace program may use this list to
>> +       determine if the actions found in the seccomp.h, when the
>> +       program was built, differs from the set of actions actually
>> +       supported in the current running kernel.
>>
>>  Adding architecture support
>>  -----------------------
>> diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
>> index a32b4b7..56f9b29 100644
>> --- a/Documentation/sysctl/kernel.txt
>> +++ b/Documentation/sysctl/kernel.txt
>> @@ -74,6 +74,7 @@ show up in /proc/sys/kernel:
>>  - reboot-cmd                  [ SPARC only ]
>>  - rtsig-max
>>  - rtsig-nr
>> +- seccomp/                    ==> Documentation/prctl/seccomp_filter.txt
>>  - sem
>>  - sem_next_id                [ sysv ipc ]
>>  - sg-big-buff                 [ generic SCSI device (sg) ]
>> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
>> index f7ce79a..e36dfe9 100644
>> --- a/kernel/seccomp.c
>> +++ b/kernel/seccomp.c
>> @@ -16,10 +16,12 @@
>>  #include <linux/atomic.h>
>>  #include <linux/audit.h>
>>  #include <linux/compat.h>
>> +#include <linux/kmemleak.h>
>>  #include <linux/sched.h>
>>  #include <linux/seccomp.h>
>>  #include <linux/slab.h>
>>  #include <linux/syscalls.h>
>> +#include <linux/sysctl.h>
>>
>>  #ifdef CONFIG_HAVE_ARCH_SECCOMP_FILTER
>>  #include <asm/syscall.h>
>> @@ -905,3 +907,56 @@ long seccomp_get_filter(struct task_struct *task, unsigned long filter_off,
>>         return ret;
>>  }
>>  #endif
>> +
>> +#ifdef CONFIG_SYSCTL
>> +
>> +/* Human readable action names for friendly sysctl interaction */
>> +#define SECCOMP_RET_KILL_NAME          "kill"
>> +#define SECCOMP_RET_TRAP_NAME          "trap"
>> +#define SECCOMP_RET_ERRNO_NAME         "errno"
>> +#define SECCOMP_RET_TRACE_NAME         "trace"
>> +#define SECCOMP_RET_ALLOW_NAME         "allow"
>> +
>> +static char seccomp_actions_avail[] = SECCOMP_RET_KILL_NAME    " "
>> +                                     SECCOMP_RET_TRAP_NAME     " "
>> +                                     SECCOMP_RET_ERRNO_NAME    " "
>> +                                     SECCOMP_RET_TRACE_NAME    " "
>> +                                     SECCOMP_RET_ALLOW_NAME;
>> +
>> +static struct ctl_path seccomp_sysctl_path[] = {
>> +       { .procname = "kernel", },
>> +       { .procname = "seccomp", },
>> +       { }
>> +};
>> +
>> +static struct ctl_table seccomp_sysctl_table[] = {
>> +       {
>> +               .procname       = "actions_avail",
>> +               .data           = &seccomp_actions_avail,
>> +               .maxlen         = sizeof(seccomp_actions_avail),
>> +               .mode           = 0444,
>> +               .proc_handler   = proc_dostring,
>> +       },
>> +       { }
>> +};
>> +
>> +static int __init seccomp_sysctl_init(void)
>> +{
>> +       struct ctl_table_header *hdr;
>> +
>> +       hdr = register_sysctl_paths(seccomp_sysctl_path, seccomp_sysctl_table);
>> +       if (!hdr)
>> +               pr_warn("seccomp: sysctl registration failed\n");
>> +       else
>> +               kmemleak_not_leak(hdr);
>> +
>> +       return 0;
>> +}
>> +
>> +#else /* CONFIG_SYSCTL */
>> +
>> +static __init int seccomp_sysctl_init(void) { return 0; }
>> +
>> +#endif /* CONFIG_SYSCTL */
>> +
>> +device_initcall(seccomp_sysctl_init)
> 
> Can the device_initcall() just live in the CONFIG_SYSCTL #ifdef to
> avoid the #else and stub?

Yes. That'll be a nice cleanup.

Tyler

> 
> -Kees
> 



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 1/4] seccomp: Add sysctl to display available actions
  2017-02-16  3:14   ` Andy Lutomirski
@ 2017-02-16 18:47     ` Tyler Hicks
  2017-02-16 19:01       ` Andy Lutomirski
  0 siblings, 1 reply; 29+ messages in thread
From: Tyler Hicks @ 2017-02-16 18:47 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Paul Moore, Eric Paris, Kees Cook, Will Drewry, linux-audit,
	linux-kernel, John Crispin


[-- Attachment #1.1: Type: text/plain, Size: 1078 bytes --]

On 02/15/2017 09:14 PM, Andy Lutomirski wrote:
> On Mon, Feb 13, 2017 at 7:45 PM, Tyler Hicks <tyhicks@canonical.com> wrote:
>> This patch creates a read-only sysctl containing an ordered list of
>> seccomp actions that the kernel supports. The ordering, from left to
>> right, is the lowest action value (kill) to the highest action value
>> (allow). Currently, a read of the sysctl file would return "kill trap
>> errno trace allow". The contents of this sysctl file can be useful for
>> userspace code as well as the system administrator.
> 
> Would this make more sense as a new seccomp(2) mode a la
> SECCOMP_HAS_ACTION?  Then sandboxy things that have no fs access could
> use it.
> 

It would make sense for code that needs to check which actions are
available. It wouldn't make sense for administrators that need to check
which actions are available unless libseccomp provided a wrapper utility.

Is this a theoretical concern or do you know of a sandboxed piece of
code that cannot access the sysctl before constructing a seccomp filter?

Tyler


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 1/4] seccomp: Add sysctl to display available actions
  2017-02-16 18:47     ` Tyler Hicks
@ 2017-02-16 19:01       ` Andy Lutomirski
  2017-02-16 20:05         ` Tyler Hicks
  0 siblings, 1 reply; 29+ messages in thread
From: Andy Lutomirski @ 2017-02-16 19:01 UTC (permalink / raw)
  To: Tyler Hicks
  Cc: Paul Moore, Eric Paris, Kees Cook, Will Drewry, linux-audit,
	linux-kernel, John Crispin

On Thu, Feb 16, 2017 at 10:47 AM, Tyler Hicks <tyhicks@canonical.com> wrote:
> On 02/15/2017 09:14 PM, Andy Lutomirski wrote:
>> On Mon, Feb 13, 2017 at 7:45 PM, Tyler Hicks <tyhicks@canonical.com> wrote:
>>> This patch creates a read-only sysctl containing an ordered list of
>>> seccomp actions that the kernel supports. The ordering, from left to
>>> right, is the lowest action value (kill) to the highest action value
>>> (allow). Currently, a read of the sysctl file would return "kill trap
>>> errno trace allow". The contents of this sysctl file can be useful for
>>> userspace code as well as the system administrator.
>>
>> Would this make more sense as a new seccomp(2) mode a la
>> SECCOMP_HAS_ACTION?  Then sandboxy things that have no fs access could
>> use it.
>>
>
> It would make sense for code that needs to check which actions are
> available. It wouldn't make sense for administrators that need to check
> which actions are available unless libseccomp provided a wrapper utility.
>
> Is this a theoretical concern or do you know of a sandboxed piece of
> code that cannot access the sysctl before constructing a seccomp filter?
>

It's semi-theoretical.  But suppose I unshare namespaces, unmount a
bunch of stuff, then ask libseccomp to install a filter.  (I've
written code that does exactly that.)   libseccomp won't be able to
read the sysctl.

--Andy

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 0/4] Improved seccomp logging
  2017-02-16  3:24 ` [PATCH v3 0/4] Improved seccomp logging Andy Lutomirski
@ 2017-02-16 19:37   ` Tyler Hicks
  2017-02-16 23:29   ` Kees Cook
  1 sibling, 0 replies; 29+ messages in thread
From: Tyler Hicks @ 2017-02-16 19:37 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Paul Moore, Eric Paris, Kees Cook, Will Drewry, linux-audit,
	linux-kernel, John Crispin


[-- Attachment #1.1: Type: text/plain, Size: 4154 bytes --]

On 02/15/2017 09:24 PM, Andy Lutomirski wrote:
> On Mon, Feb 13, 2017 at 7:45 PM, Tyler Hicks <tyhicks@canonical.com> wrote:
>> This patch set is the third revision of the following two previously
>> submitted patch sets:
>>
>> v1: http://lkml.kernel.org/r/1483375990-14948-1-git-send-email-tyhicks@canonical.com
>> v1: http://lkml.kernel.org/r/1483377999-15019-2-git-send-email-tyhicks@canonical.com
>>
>> v2: http://lkml.kernel.org/r/1486100262-32391-1-git-send-email-tyhicks@canonical.com
>>
>> The patch set aims to address some known deficiencies in seccomp's current
>> logging capabilities:
>>
>>   1. Inability to log all filter actions.
>>   2. Inability to selectively enable filtering; e.g. devs want noisy logging,
>>      users want relative quiet.
>>   3. Consistent behavior with audit enabled and disabled.
>>   4. Inability to easily develop a filter due to the lack of a
>>      permissive/complain mode.
> 
> I think I dislike this, but I think my dislikes may be fixable with
> minor changes.
> 
> What I dislike is that this mixes app-specific built-in configuration
> (seccomp) with global privileged stuff (audit).  The result is a
> potentially difficult to use situation in which you need to modify an
> app to make it loggable (using RET_LOG) and then fiddle with
> privileged config (auditctl, etc) to actually see the logs.

There's no need to fiddle with audictl. You would only need to write
"log" to /proc/sys/kernel/seccomp/log_max_action. I wouldn't think that
would be an issue considering that this will typically be be done on a
developer's system. Additionally, I plan to make "log" the default for
log_max_action in Ubuntu so there'd be nothing to change to get RET_LOG
working.

If auditd is running, the messages will go to the audit.log. If auditd
is not running, the messages will go to the syslog.

What if we make "log" the default for log_max_action in this patch set?
Would that address your concerns?

> What if, instead of logging straight to the audit log, SECCOMP_RET_LOG
> [1] merely meant "tell our parent about this syscall"?  (Ideally we'd
> also figure out a way to express "log this and allow", "log this and
> do ERRNO", etc.)  Then we could have another mechanism that installs a
> layer in the seccomp stack that, instead of catching syscalls, catches
> log events and sticks them in a ring buffer (or audit).
> 
> Concretely, it might work like this.  If a filter returns
> SECCOMP_RET_LOG, then we "log" and keep processing.  SECCOMP_RET_LOG
> is otherwise treated literally like SECCOMP_RET_ALLOW and has no
> effect on return value.  If you want log-and-kill, you install two
> filters.
> 
> There's a new seccomp(2) action that returns an fd.  That fd
> references a new thing in the seccomp stack that is a BPF program that
> is called whenever SECCOMP_RET_LOG is returned from lower down.  The
> output of this filter determines whether the log event is ignored,
> stuck in the ring buffer, or passed up the stack for further
> processing.  You read(2) the fd to access the ring buffer.
> 
> Using this mechanism, you could write a simple seccomptrace tool that
> needs no privilege and dumps SECCOMP_RET_LOG events from the target
> program to stderr.
> 
> Thoughts?

IMHO, this sounds like yet another logging daemon. It is also more
complex than what's needed for my use case of SECCOMP_RET_LOG. I only
need a seccomp learning mode, to accompany the learning modes
implemented in the various LSMs, that allows for everything marked with
RET_LOG to hit the syslog (or audit log).

Having to wedge seccomptrace tool between my application launcher, which
handles sandboxing, and the application being launched is less than
ideal. The application launcher could reimplement what seccomptrace is
doing but it'd be nice if that was left up to auditd/rsyslogd/journald/etc.

Tyler

> 
> [1] If we went this route, it might want to be renamed.
> 
> P.S. We ought to be able to write a BPF verifier pass that makes sure
> that filters don't return unsupported return values if we cared to do
> so.
> 



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 1/4] seccomp: Add sysctl to display available actions
  2017-02-16 19:01       ` Andy Lutomirski
@ 2017-02-16 20:05         ` Tyler Hicks
  0 siblings, 0 replies; 29+ messages in thread
From: Tyler Hicks @ 2017-02-16 20:05 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Paul Moore, Eric Paris, Kees Cook, Will Drewry, linux-audit,
	linux-kernel, John Crispin


[-- Attachment #1.1: Type: text/plain, Size: 1638 bytes --]

On 02/16/2017 01:01 PM, Andy Lutomirski wrote:
> On Thu, Feb 16, 2017 at 10:47 AM, Tyler Hicks <tyhicks@canonical.com> wrote:
>> On 02/15/2017 09:14 PM, Andy Lutomirski wrote:
>>> On Mon, Feb 13, 2017 at 7:45 PM, Tyler Hicks <tyhicks@canonical.com> wrote:
>>>> This patch creates a read-only sysctl containing an ordered list of
>>>> seccomp actions that the kernel supports. The ordering, from left to
>>>> right, is the lowest action value (kill) to the highest action value
>>>> (allow). Currently, a read of the sysctl file would return "kill trap
>>>> errno trace allow". The contents of this sysctl file can be useful for
>>>> userspace code as well as the system administrator.
>>>
>>> Would this make more sense as a new seccomp(2) mode a la
>>> SECCOMP_HAS_ACTION?  Then sandboxy things that have no fs access could
>>> use it.
>>>
>>
>> It would make sense for code that needs to check which actions are
>> available. It wouldn't make sense for administrators that need to check
>> which actions are available unless libseccomp provided a wrapper utility.
>>
>> Is this a theoretical concern or do you know of a sandboxed piece of
>> code that cannot access the sysctl before constructing a seccomp filter?
>>
> 
> It's semi-theoretical.  But suppose I unshare namespaces, unmount a
> bunch of stuff, then ask libseccomp to install a filter.  (I've
> written code that does exactly that.)   libseccomp won't be able to
> read the sysctl.

That's a good point. It seems like we might need both mechanisms
(SECCOMP_HAS_ACTION for code and actions_avail for humans).

Tyler

> 
> --Andy
> 



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 0/4] Improved seccomp logging
  2017-02-16  3:24 ` [PATCH v3 0/4] Improved seccomp logging Andy Lutomirski
  2017-02-16 19:37   ` Tyler Hicks
@ 2017-02-16 23:29   ` Kees Cook
  2017-02-17 17:00     ` Andy Lutomirski
  2017-02-22 18:46     ` Kees Cook
  1 sibling, 2 replies; 29+ messages in thread
From: Kees Cook @ 2017-02-16 23:29 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Tyler Hicks, Paul Moore, Eric Paris, Will Drewry, linux-audit,
	linux-kernel, John Crispin, linux-api

On Wed, Feb 15, 2017 at 7:24 PM, Andy Lutomirski <luto@amacapital.net> wrote:
> On Mon, Feb 13, 2017 at 7:45 PM, Tyler Hicks <tyhicks@canonical.com> wrote:
>> This patch set is the third revision of the following two previously
>> submitted patch sets:
>>
>> v1: http://lkml.kernel.org/r/1483375990-14948-1-git-send-email-tyhicks@canonical.com
>> v1: http://lkml.kernel.org/r/1483377999-15019-2-git-send-email-tyhicks@canonical.com
>>
>> v2: http://lkml.kernel.org/r/1486100262-32391-1-git-send-email-tyhicks@canonical.com
>>
>> The patch set aims to address some known deficiencies in seccomp's current
>> logging capabilities:
>>
>>   1. Inability to log all filter actions.
>>   2. Inability to selectively enable filtering; e.g. devs want noisy logging,
>>      users want relative quiet.
>>   3. Consistent behavior with audit enabled and disabled.
>>   4. Inability to easily develop a filter due to the lack of a
>>      permissive/complain mode.
>
> I think I dislike this, but I think my dislikes may be fixable with
> minor changes.
>
> What I dislike is that this mixes app-specific built-in configuration
> (seccomp) with global privileged stuff (audit).  The result is a
> potentially difficult to use situation in which you need to modify an
> app to make it loggable (using RET_LOG) and then fiddle with
> privileged config (auditctl, etc) to actually see the logs.

You make a good point about RET_LOG vs log_max_action. I think making
RET_LOG the default value would work for 99% of the cases.

> What if, instead of logging straight to the audit log, SECCOMP_RET_LOG
> [1] merely meant "tell our parent about this syscall"?  (Ideally we'd
> also figure out a way to express "log this and allow", "log this and
> do ERRNO", etc.)  Then we could have another mechanism that installs a
> layer in the seccomp stack that, instead of catching syscalls, catches
> log events and sticks them in a ring buffer (or audit).

So, I really don't like this because it's yet another logging system.
We already have a security event logger: audit. This continues to use
that subsystem without changing semantics very much.

> Concretely, it might work like this.  If a filter returns
> SECCOMP_RET_LOG, then we "log" and keep processing.  SECCOMP_RET_LOG
> is otherwise treated literally like SECCOMP_RET_ALLOW and has no
> effect on return value.  If you want log-and-kill, you install two
> filters.
>
> There's a new seccomp(2) action that returns an fd.  That fd
> references a new thing in the seccomp stack that is a BPF program that
> is called whenever SECCOMP_RET_LOG is returned from lower down.  The
> output of this filter determines whether the log event is ignored,
> stuck in the ring buffer, or passed up the stack for further
> processing.  You read(2) the fd to access the ring buffer.
>
> Using this mechanism, you could write a simple seccomptrace tool that
> needs no privilege and dumps SECCOMP_RET_LOG events from the target
> program to stderr.

If someone was going to do this, they could just as well set up a
tracer to use RET_TRAP. (And this is what things like minijail does
already, IIRC.) The reality of the situation is that this is way too
much overhead for the common case. We need a generalized logging
system that uses the existing logging mechanisms.

> Thoughts?
>
> [1] If we went this route, it might want to be renamed.
>
> P.S. We ought to be able to write a BPF verifier pass that makes sure
> that filters don't return unsupported return values if we cared to do
> so.

Can we? I thought the BPF_RET used the BPF registers, and validating
that might be less-than-easy?

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 0/4] Improved seccomp logging
  2017-02-16 23:29   ` Kees Cook
@ 2017-02-17 17:00     ` Andy Lutomirski
  2017-02-22 18:39       ` Kees Cook
  2017-02-22 18:46     ` Kees Cook
  1 sibling, 1 reply; 29+ messages in thread
From: Andy Lutomirski @ 2017-02-17 17:00 UTC (permalink / raw)
  To: Kees Cook
  Cc: Tyler Hicks, Paul Moore, Eric Paris, Will Drewry, linux-audit,
	linux-kernel, John Crispin, linux-api

On Thu, Feb 16, 2017 at 3:29 PM, Kees Cook <keescook@chromium.org> wrote:
> On Wed, Feb 15, 2017 at 7:24 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>> On Mon, Feb 13, 2017 at 7:45 PM, Tyler Hicks <tyhicks@canonical.com> wrote:
>>> This patch set is the third revision of the following two previously
>>> submitted patch sets:
>>>
>>> v1: http://lkml.kernel.org/r/1483375990-14948-1-git-send-email-tyhicks@canonical.com
>>> v1: http://lkml.kernel.org/r/1483377999-15019-2-git-send-email-tyhicks@canonical.com
>>>
>>> v2: http://lkml.kernel.org/r/1486100262-32391-1-git-send-email-tyhicks@canonical.com
>>>
>>> The patch set aims to address some known deficiencies in seccomp's current
>>> logging capabilities:
>>>
>>>   1. Inability to log all filter actions.
>>>   2. Inability to selectively enable filtering; e.g. devs want noisy logging,
>>>      users want relative quiet.
>>>   3. Consistent behavior with audit enabled and disabled.
>>>   4. Inability to easily develop a filter due to the lack of a
>>>      permissive/complain mode.
>>
>> I think I dislike this, but I think my dislikes may be fixable with
>> minor changes.
>>
>> What I dislike is that this mixes app-specific built-in configuration
>> (seccomp) with global privileged stuff (audit).  The result is a
>> potentially difficult to use situation in which you need to modify an
>> app to make it loggable (using RET_LOG) and then fiddle with
>> privileged config (auditctl, etc) to actually see the logs.
>
> You make a good point about RET_LOG vs log_max_action. I think making
> RET_LOG the default value would work for 99% of the cases.
>
>> What if, instead of logging straight to the audit log, SECCOMP_RET_LOG
>> [1] merely meant "tell our parent about this syscall"?  (Ideally we'd
>> also figure out a way to express "log this and allow", "log this and
>> do ERRNO", etc.)  Then we could have another mechanism that installs a
>> layer in the seccomp stack that, instead of catching syscalls, catches
>> log events and sticks them in a ring buffer (or audit).
>
> So, I really don't like this because it's yet another logging system.
> We already have a security event logger: audit. This continues to use
> that subsystem without changing semantics very much.

Audit sucks for this kind of thing, though.  It's mostly useless in a
container, for example.  But let me propose a middle ground.

>
>> Concretely, it might work like this.  If a filter returns
>> SECCOMP_RET_LOG, then we "log" and keep processing.  SECCOMP_RET_LOG
>> is otherwise treated literally like SECCOMP_RET_ALLOW and has no
>> effect on return value.  If you want log-and-kill, you install two
>> filters.
>>
>> There's a new seccomp(2) action that returns an fd.  That fd
>> references a new thing in the seccomp stack that is a BPF program that
>> is called whenever SECCOMP_RET_LOG is returned from lower down.  The
>> output of this filter determines whether the log event is ignored,
>> stuck in the ring buffer, or passed up the stack for further
>> processing.  You read(2) the fd to access the ring buffer.
>>
>> Using this mechanism, you could write a simple seccomptrace tool that
>> needs no privilege and dumps SECCOMP_RET_LOG events from the target
>> program to stderr.
>
> If someone was going to do this, they could just as well set up a
> tracer to use RET_TRAP. (And this is what things like minijail does
> already, IIRC.) The reality of the situation is that this is way too
> much overhead for the common case. We need a generalized logging
> system that uses the existing logging mechanisms.

True.  And we can always add this part later if we want to.

But let me propose a different, much more minor change to the patches:

First, we currently have seccomp_run_filters running the whole stack
and keeping (more or less) the lowest value.  What if we changed it a
bit so that return values of 0xff???????? were special.  Specifically,
a return value of 0xff?????? from a filter means "take some action
right now but don't change the outcome of the filter stack".  Then we
define SECCOMP_RET_LOG as 0xff000000 and perhaps reserve a few bits to
be a number reflected in the log entry.  (e.g. SECCOMP_RET_LOG(x) =
0xff000000 | (x & 0xff)).

Now SECCOMP_RET_LOG or SECCOMP_RET_LOG(0) does approximately what it
does in the current patch series if used in isolation, but you can
install two filters, one of which logs and one of which kills, to get
"log and kill".

If we do this, we might want SECCOMP_RET_KILL to stop running filters
so that filters farther up the stack don't log the syscall.

What do you think?  This should be a very small delta on top of the
current patches.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 0/4] Improved seccomp logging
  2017-02-17 17:00     ` Andy Lutomirski
@ 2017-02-22 18:39       ` Kees Cook
  0 siblings, 0 replies; 29+ messages in thread
From: Kees Cook @ 2017-02-22 18:39 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Tyler Hicks, Paul Moore, Eric Paris, Will Drewry, linux-audit,
	linux-kernel, John Crispin, linux-api

On Fri, Feb 17, 2017 at 9:00 AM, Andy Lutomirski <luto@amacapital.net> wrote:
> On Thu, Feb 16, 2017 at 3:29 PM, Kees Cook <keescook@chromium.org> wrote:
>> On Wed, Feb 15, 2017 at 7:24 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>> If someone was going to do this, they could just as well set up a
>> tracer to use RET_TRAP. (And this is what things like minijail does
>> already, IIRC.) The reality of the situation is that this is way too
>> much overhead for the common case. We need a generalized logging
>> system that uses the existing logging mechanisms.
>
> True.  And we can always add this part later if we want to.
>
> But let me propose a different, much more minor change to the patches:
>
> First, we currently have seccomp_run_filters running the whole stack
> and keeping (more or less) the lowest value.  What if we changed it a
> bit so that return values of 0xff???????? were special.  Specifically,
> a return value of 0xff?????? from a filter means "take some action
> right now but don't change the outcome of the filter stack".  Then we
> define SECCOMP_RET_LOG as 0xff000000 and perhaps reserve a few bits to
> be a number reflected in the log entry.  (e.g. SECCOMP_RET_LOG(x) =
> 0xff000000 | (x & 0xff)).

I'm not a fan of adding more logic to the filter-running loop, but
this idea is tempting.

> Now SECCOMP_RET_LOG or SECCOMP_RET_LOG(0) does approximately what it
> does in the current patch series if used in isolation, but you can
> install two filters, one of which logs and one of which kills, to get
> "log and kill".
>
> If we do this, we might want SECCOMP_RET_KILL to stop running filters
> so that filters farther up the stack don't log the syscall.

I don't want to change the semantics of filter execution for this, so
I'd prefer to avoid adding an early abort.

> What do you think?  This should be a very small delta on top of the
> current patches.

What I don't like about this is that the logging and the action taken
are now totally separate. You could even end up having something
install multiple RET_LOGs, which aren't tied to what seccomp actually
decides to do in the end.

I'm not entirely opposed to the 0xff.... idea, but I don't think it
overlaps with logging very well. I would want seccomp logging to
reflect the actual action taken. Okay, I will now begin
stream-of-consciousness dumping to email...

I'm sure gmail will mangle whitespace, but here's sort of the 0xff
idea, well, actually 0x80....

diff --git a/include/uapi/linux/seccomp.h b/include/uapi/linux/seccomp.h
index 0f238a43ff1e..f61a0b783f6d 100644
--- a/include/uapi/linux/seccomp.h
+++ b/include/uapi/linux/seccomp.h
@@ -32,6 +32,7 @@
 #define SECCOMP_RET_ALLOW      0x7fff0000U /* allow */

 /* Masks for the return value sections. */
+#define SECCOMP_RET_OOB 0x80000000U
 #define SECCOMP_RET_ACTION     0x7fff0000U
 #define SECCOMP_RET_DATA       0x0000ffffU

diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index f8f88ebcb3ba..4fac64fdfdc1 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -197,6 +197,9 @@ static u32 seccomp_run_filters(const struct
seccomp_data *sd)
        for (; f; f = f->prev) {
                u32 cur_ret = BPF_PROG_RUN(f->prog, sd);

+               if (unlikely(cur_ret & SECCOMP_RET_OOB))
+                       seccomp_oob_action(cur_ret);
+
                if ((cur_ret & SECCOMP_RET_ACTION) < (ret & SECCOMP_RET_ACTION))
                        ret = cur_ret;
        }

So, if SECCOMP_RET_OOB_LOG existed, it'd need to be installed as a
stand-alone filter, since it's still a BPF return value. That seems
... unfriendly ... as it would have to duplicate all the same logic as
another filter running along side it. e.g.:

filter 1 (actual actions):
- check arch
- check syscalls, jump to resolution
- ok: ret_allow
- unexpected: ret_allow
- weird: ret_errno
- bad: ret_kill

filter 2 ("oob" logging):
- check arch
- check syscalls, jump to resolution
- ok: ret_allow
- unexpected: ret_oob_log <- only difference
- weird: ret_errno
- bad: ret_kill

I mean, filter 2 could also just be:
- check arch
- check syscalls, jump to resolution
- unexpected: ret_oob_log
- everything else: ret_allow

But this kind of split logic seems needlessly complex and error-prone
on the filter developer's end. I don't think OOB logging should be
used here. Adding RET_LOG seems like the right approach to me.

The observations that it's per-process logging vs system-wide log
reporting is, I think, still problematic, but the case where a
developer is using RET_LOG is under non-production development and it
can be argued that the general case is developer==system owner, so
this is, again, sufficient.

So... I remain convinced that Tyler's series is the correct direction
to solve the bulk of the logging needs currently expressed by
developers and system owners.

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 0/4] Improved seccomp logging
  2017-02-16 23:29   ` Kees Cook
  2017-02-17 17:00     ` Andy Lutomirski
@ 2017-02-22 18:46     ` Kees Cook
  2017-04-07 22:16       ` Tyler Hicks
  1 sibling, 1 reply; 29+ messages in thread
From: Kees Cook @ 2017-02-22 18:46 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Tyler Hicks, Paul Moore, Eric Paris, Will Drewry, linux-audit,
	linux-kernel, John Crispin, Linux API

On Thu, Feb 16, 2017 at 3:29 PM, Kees Cook <keescook@chromium.org> wrote:
> On Wed, Feb 15, 2017 at 7:24 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>> On Mon, Feb 13, 2017 at 7:45 PM, Tyler Hicks <tyhicks@canonical.com> wrote:
>>> This patch set is the third revision of the following two previously
>>> submitted patch sets:
>>>
>>> v1: http://lkml.kernel.org/r/1483375990-14948-1-git-send-email-tyhicks@canonical.com
>>> v1: http://lkml.kernel.org/r/1483377999-15019-2-git-send-email-tyhicks@canonical.com
>>>
>>> v2: http://lkml.kernel.org/r/1486100262-32391-1-git-send-email-tyhicks@canonical.com
>>>
>>> The patch set aims to address some known deficiencies in seccomp's current
>>> logging capabilities:
>>>
>>>   1. Inability to log all filter actions.
>>>   2. Inability to selectively enable filtering; e.g. devs want noisy logging,
>>>      users want relative quiet.
>>>   3. Consistent behavior with audit enabled and disabled.
>>>   4. Inability to easily develop a filter due to the lack of a
>>>      permissive/complain mode.
>>
>> I think I dislike this, but I think my dislikes may be fixable with
>> minor changes.
>>
>> What I dislike is that this mixes app-specific built-in configuration
>> (seccomp) with global privileged stuff (audit).  The result is a
>> potentially difficult to use situation in which you need to modify an
>> app to make it loggable (using RET_LOG) and then fiddle with
>> privileged config (auditctl, etc) to actually see the logs.
>
> You make a good point about RET_LOG vs log_max_action. I think making
> RET_LOG the default value would work for 99% of the cases.

Actually, I take this back: making "log" the default means that
everything else gets logged too, include "expected" return values like
errno, trap, etc. I think that would be extremely noisy as a default
(for upstream or Ubuntu).

Perhaps RET_LOG should unconditionally log? Or maybe the logged
actions should be a bit field instead of a single value? Then the
default could be "RET_KILL and RET_LOG", but an admin could switch it
to just RET_KILL, or even nothing at all? Hmmm...

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 0/4] Improved seccomp logging
  2017-02-22 18:46     ` Kees Cook
@ 2017-04-07 22:16       ` Tyler Hicks
  2017-04-07 22:46         ` Kees Cook
                           ` (2 more replies)
  0 siblings, 3 replies; 29+ messages in thread
From: Tyler Hicks @ 2017-04-07 22:16 UTC (permalink / raw)
  To: Kees Cook
  Cc: Andy Lutomirski, Paul Moore, Eric Paris, Will Drewry,
	linux-audit, linux-kernel, John Crispin, Linux API


[-- Attachment #1.1: Type: text/plain, Size: 4844 bytes --]

On 02/22/2017 12:46 PM, Kees Cook wrote:
> On Thu, Feb 16, 2017 at 3:29 PM, Kees Cook <keescook@chromium.org> wrote:
>> On Wed, Feb 15, 2017 at 7:24 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>>> On Mon, Feb 13, 2017 at 7:45 PM, Tyler Hicks <tyhicks@canonical.com> wrote:
>>>> This patch set is the third revision of the following two previously
>>>> submitted patch sets:
>>>>
>>>> v1: http://lkml.kernel.org/r/1483375990-14948-1-git-send-email-tyhicks@canonical.com
>>>> v1: http://lkml.kernel.org/r/1483377999-15019-2-git-send-email-tyhicks@canonical.com
>>>>
>>>> v2: http://lkml.kernel.org/r/1486100262-32391-1-git-send-email-tyhicks@canonical.com
>>>>
>>>> The patch set aims to address some known deficiencies in seccomp's current
>>>> logging capabilities:
>>>>
>>>>   1. Inability to log all filter actions.
>>>>   2. Inability to selectively enable filtering; e.g. devs want noisy logging,
>>>>      users want relative quiet.
>>>>   3. Consistent behavior with audit enabled and disabled.
>>>>   4. Inability to easily develop a filter due to the lack of a
>>>>      permissive/complain mode.
>>>
>>> I think I dislike this, but I think my dislikes may be fixable with
>>> minor changes.
>>>
>>> What I dislike is that this mixes app-specific built-in configuration
>>> (seccomp) with global privileged stuff (audit).  The result is a
>>> potentially difficult to use situation in which you need to modify an
>>> app to make it loggable (using RET_LOG) and then fiddle with
>>> privileged config (auditctl, etc) to actually see the logs.
>>
>> You make a good point about RET_LOG vs log_max_action. I think making
>> RET_LOG the default value would work for 99% of the cases.
> 
> Actually, I take this back: making "log" the default means that
> everything else gets logged too, include "expected" return values like
> errno, trap, etc. I think that would be extremely noisy as a default
> (for upstream or Ubuntu).
> 
> Perhaps RET_LOG should unconditionally log? Or maybe the logged
> actions should be a bit field instead of a single value? Then the
> default could be "RET_KILL and RET_LOG", but an admin could switch it
> to just RET_KILL, or even nothing at all? Hmmm...

Hi Kees - my apologies for going quiet on this topic after we spoke
about it more in IRC. I needed to tend to other work but now I'm able to
return to this seccomp logging patch set.

To summarize what we discussed in IRC, the Chrome browser makes
extensive use of RET_ERRNO, RET_TRACE, etc., to sandbox code that may
not ever be adjusted to keep from bump into the sandbox walls.
Therefore, it is unreasonable to enable logging of something like
RET_ERRNO on a system-wide level where Chrome browser is in use.

In contrast, snapd wants to set up "noisier" sandboxes for applications
to make it clear to the developers and the users that the sandboxed
application is bumping into the sandbox walls. Developers will know why
their code may not be working as intended and users will know that the
application is doing things that the platform doesn't agree with. These
sandboxes will end up using RET_ERRNO in the majority of cases.

This means that with the current design of this patch set, Chrome
browser will either be unintentionally spamming the logs or snapd's
sandboxes will be helplessly silent when both Chrome and snapd is
installed at the same time, depending on the admin's preferences.

To bring it back up a level, two applications may have a very different
outlook on how acceptable a given seccomp action is and they may
disagree on whether or not the user/administrator should be informed.

I've been giving thought to the idea of providing a way for the
application setting up the filter to opt into logging of certain
actions. Here's a high level breakdown:

 - The administrator will have control of system-wide seccomp logging
   and the application will have control of how its seccomp actions are
   logged
 - Both the administrator's and application's logging preferences are
   bitmasks of actions that should be logged
 - The default administrator bitmask is set to log all actions except
   RET_ALLOW
   + Configurable via a sysctl
   + Very similar to the log_max_action syscall that was proposed in
     this patch set but a bitmask instead of a max action
 - The default application bitmask is set to log only RET_KILL and
   RET_LOG
   + Configurable via the seccomp(2) syscall (details TBD)
 - Actions are only logged when the application has requested the
   action to be logged and the administrator has allowed the action to
   be logged
   + By default, this is only RET_KILL and RET_LOG actions

Let me know what you think about this and I can turn out another patch
set next week if it sounds reasonable.

Tyler


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 0/4] Improved seccomp logging
  2017-04-07 22:16       ` Tyler Hicks
@ 2017-04-07 22:46         ` Kees Cook
  2017-04-07 23:46           ` Tyler Hicks
  2017-04-10 15:18         ` Steve Grubb
  2017-04-10 15:57         ` Andy Lutomirski
  2 siblings, 1 reply; 29+ messages in thread
From: Kees Cook @ 2017-04-07 22:46 UTC (permalink / raw)
  To: Tyler Hicks
  Cc: Andy Lutomirski, Paul Moore, Eric Paris, Will Drewry,
	linux-audit, linux-kernel, John Crispin, Linux API

On Fri, Apr 7, 2017 at 3:16 PM, Tyler Hicks <tyhicks@canonical.com> wrote:
> On 02/22/2017 12:46 PM, Kees Cook wrote:
>> On Thu, Feb 16, 2017 at 3:29 PM, Kees Cook <keescook@chromium.org> wrote:
>>> On Wed, Feb 15, 2017 at 7:24 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>>>> On Mon, Feb 13, 2017 at 7:45 PM, Tyler Hicks <tyhicks@canonical.com> wrote:
>>>>> This patch set is the third revision of the following two previously
>>>>> submitted patch sets:
>>>>>
>>>>> v1: http://lkml.kernel.org/r/1483375990-14948-1-git-send-email-tyhicks@canonical.com
>>>>> v1: http://lkml.kernel.org/r/1483377999-15019-2-git-send-email-tyhicks@canonical.com
>>>>>
>>>>> v2: http://lkml.kernel.org/r/1486100262-32391-1-git-send-email-tyhicks@canonical.com
>>>>>
>>>>> The patch set aims to address some known deficiencies in seccomp's current
>>>>> logging capabilities:
>>>>>
>>>>>   1. Inability to log all filter actions.
>>>>>   2. Inability to selectively enable filtering; e.g. devs want noisy logging,
>>>>>      users want relative quiet.
>>>>>   3. Consistent behavior with audit enabled and disabled.
>>>>>   4. Inability to easily develop a filter due to the lack of a
>>>>>      permissive/complain mode.
>>>>
>>>> I think I dislike this, but I think my dislikes may be fixable with
>>>> minor changes.
>>>>
>>>> What I dislike is that this mixes app-specific built-in configuration
>>>> (seccomp) with global privileged stuff (audit).  The result is a
>>>> potentially difficult to use situation in which you need to modify an
>>>> app to make it loggable (using RET_LOG) and then fiddle with
>>>> privileged config (auditctl, etc) to actually see the logs.
>>>
>>> You make a good point about RET_LOG vs log_max_action. I think making
>>> RET_LOG the default value would work for 99% of the cases.
>>
>> Actually, I take this back: making "log" the default means that
>> everything else gets logged too, include "expected" return values like
>> errno, trap, etc. I think that would be extremely noisy as a default
>> (for upstream or Ubuntu).
>>
>> Perhaps RET_LOG should unconditionally log? Or maybe the logged
>> actions should be a bit field instead of a single value? Then the
>> default could be "RET_KILL and RET_LOG", but an admin could switch it
>> to just RET_KILL, or even nothing at all? Hmmm...
>
> Hi Kees - my apologies for going quiet on this topic after we spoke
> about it more in IRC. I needed to tend to other work but now I'm able to
> return to this seccomp logging patch set.
>
> To summarize what we discussed in IRC, the Chrome browser makes
> extensive use of RET_ERRNO, RET_TRACE, etc., to sandbox code that may
> not ever be adjusted to keep from bump into the sandbox walls.
> Therefore, it is unreasonable to enable logging of something like
> RET_ERRNO on a system-wide level where Chrome browser is in use.
>
> In contrast, snapd wants to set up "noisier" sandboxes for applications
> to make it clear to the developers and the users that the sandboxed
> application is bumping into the sandbox walls. Developers will know why
> their code may not be working as intended and users will know that the
> application is doing things that the platform doesn't agree with. These
> sandboxes will end up using RET_ERRNO in the majority of cases.
>
> This means that with the current design of this patch set, Chrome
> browser will either be unintentionally spamming the logs or snapd's
> sandboxes will be helplessly silent when both Chrome and snapd is
> installed at the same time, depending on the admin's preferences.
>
> To bring it back up a level, two applications may have a very different
> outlook on how acceptable a given seccomp action is and they may
> disagree on whether or not the user/administrator should be informed.
>
> I've been giving thought to the idea of providing a way for the
> application setting up the filter to opt into logging of certain
> actions. Here's a high level breakdown:
>
>  - The administrator will have control of system-wide seccomp logging
>    and the application will have control of how its seccomp actions are
>    logged
>  - Both the administrator's and application's logging preferences are
>    bitmasks of actions that should be logged
>  - The default administrator bitmask is set to log all actions except
>    RET_ALLOW
>    + Configurable via a sysctl
>    + Very similar to the log_max_action syscall that was proposed in
>      this patch set but a bitmask instead of a max action
>  - The default application bitmask is set to log only RET_KILL and
>    RET_LOG
>    + Configurable via the seccomp(2) syscall (details TBD)
>  - Actions are only logged when the application has requested the
>    action to be logged and the administrator has allowed the action to
>    be logged
>    + By default, this is only RET_KILL and RET_LOG actions
>
> Let me know what you think about this and I can turn out another patch
> set next week if it sounds reasonable.

This seems good to me. With a bitmask we don't have to play crazy
games with levels, and with the app able to declare what it wants
logged, we get the flexibility needed by app developers.

One change I think I'd like to make is that an app can't block
RET_KILL from being logged (the sysadmin can, though). What do think
of that minor tweak?

Is RET_ALLOW allowed to be logged by an application? I guess with it
default-off for an admin, it should be okay.

Does the app-controlled bitmask apply to the filter, the process, the
process tree, or something else? e.g. systemd launches an app with a
filter, leaving the defaults alone, then later process installs a
filter and wants everything logged -- will things from the earlier
filter get logged?

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 0/4] Improved seccomp logging
  2017-04-07 22:46         ` Kees Cook
@ 2017-04-07 23:46           ` Tyler Hicks
  2017-04-11  3:59             ` Kees Cook
  0 siblings, 1 reply; 29+ messages in thread
From: Tyler Hicks @ 2017-04-07 23:46 UTC (permalink / raw)
  To: Kees Cook
  Cc: Andy Lutomirski, Paul Moore, Eric Paris, Will Drewry,
	linux-audit, linux-kernel, John Crispin, Linux API


[-- Attachment #1.1: Type: text/plain, Size: 8585 bytes --]

On 04/07/2017 05:46 PM, Kees Cook wrote:
> On Fri, Apr 7, 2017 at 3:16 PM, Tyler Hicks <tyhicks@canonical.com> wrote:
>> On 02/22/2017 12:46 PM, Kees Cook wrote:
>>> On Thu, Feb 16, 2017 at 3:29 PM, Kees Cook <keescook@chromium.org> wrote:
>>>> On Wed, Feb 15, 2017 at 7:24 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>>>>> On Mon, Feb 13, 2017 at 7:45 PM, Tyler Hicks <tyhicks@canonical.com> wrote:
>>>>>> This patch set is the third revision of the following two previously
>>>>>> submitted patch sets:
>>>>>>
>>>>>> v1: http://lkml.kernel.org/r/1483375990-14948-1-git-send-email-tyhicks@canonical.com
>>>>>> v1: http://lkml.kernel.org/r/1483377999-15019-2-git-send-email-tyhicks@canonical.com
>>>>>>
>>>>>> v2: http://lkml.kernel.org/r/1486100262-32391-1-git-send-email-tyhicks@canonical.com
>>>>>>
>>>>>> The patch set aims to address some known deficiencies in seccomp's current
>>>>>> logging capabilities:
>>>>>>
>>>>>>   1. Inability to log all filter actions.
>>>>>>   2. Inability to selectively enable filtering; e.g. devs want noisy logging,
>>>>>>      users want relative quiet.
>>>>>>   3. Consistent behavior with audit enabled and disabled.
>>>>>>   4. Inability to easily develop a filter due to the lack of a
>>>>>>      permissive/complain mode.
>>>>>
>>>>> I think I dislike this, but I think my dislikes may be fixable with
>>>>> minor changes.
>>>>>
>>>>> What I dislike is that this mixes app-specific built-in configuration
>>>>> (seccomp) with global privileged stuff (audit).  The result is a
>>>>> potentially difficult to use situation in which you need to modify an
>>>>> app to make it loggable (using RET_LOG) and then fiddle with
>>>>> privileged config (auditctl, etc) to actually see the logs.
>>>>
>>>> You make a good point about RET_LOG vs log_max_action. I think making
>>>> RET_LOG the default value would work for 99% of the cases.
>>>
>>> Actually, I take this back: making "log" the default means that
>>> everything else gets logged too, include "expected" return values like
>>> errno, trap, etc. I think that would be extremely noisy as a default
>>> (for upstream or Ubuntu).
>>>
>>> Perhaps RET_LOG should unconditionally log? Or maybe the logged
>>> actions should be a bit field instead of a single value? Then the
>>> default could be "RET_KILL and RET_LOG", but an admin could switch it
>>> to just RET_KILL, or even nothing at all? Hmmm...
>>
>> Hi Kees - my apologies for going quiet on this topic after we spoke
>> about it more in IRC. I needed to tend to other work but now I'm able to
>> return to this seccomp logging patch set.
>>
>> To summarize what we discussed in IRC, the Chrome browser makes
>> extensive use of RET_ERRNO, RET_TRACE, etc., to sandbox code that may
>> not ever be adjusted to keep from bump into the sandbox walls.
>> Therefore, it is unreasonable to enable logging of something like
>> RET_ERRNO on a system-wide level where Chrome browser is in use.
>>
>> In contrast, snapd wants to set up "noisier" sandboxes for applications
>> to make it clear to the developers and the users that the sandboxed
>> application is bumping into the sandbox walls. Developers will know why
>> their code may not be working as intended and users will know that the
>> application is doing things that the platform doesn't agree with. These
>> sandboxes will end up using RET_ERRNO in the majority of cases.
>>
>> This means that with the current design of this patch set, Chrome
>> browser will either be unintentionally spamming the logs or snapd's
>> sandboxes will be helplessly silent when both Chrome and snapd is
>> installed at the same time, depending on the admin's preferences.
>>
>> To bring it back up a level, two applications may have a very different
>> outlook on how acceptable a given seccomp action is and they may
>> disagree on whether or not the user/administrator should be informed.
>>
>> I've been giving thought to the idea of providing a way for the
>> application setting up the filter to opt into logging of certain
>> actions. Here's a high level breakdown:
>>
>>  - The administrator will have control of system-wide seccomp logging
>>    and the application will have control of how its seccomp actions are
>>    logged
>>  - Both the administrator's and application's logging preferences are
>>    bitmasks of actions that should be logged
>>  - The default administrator bitmask is set to log all actions except
>>    RET_ALLOW
>>    + Configurable via a sysctl
>>    + Very similar to the log_max_action syscall that was proposed in
>>      this patch set but a bitmask instead of a max action
>>  - The default application bitmask is set to log only RET_KILL and
>>    RET_LOG
>>    + Configurable via the seccomp(2) syscall (details TBD)
>>  - Actions are only logged when the application has requested the
>>    action to be logged and the administrator has allowed the action to
>>    be logged
>>    + By default, this is only RET_KILL and RET_LOG actions
>>
>> Let me know what you think about this and I can turn out another patch
>> set next week if it sounds reasonable.
> 
> This seems good to me. With a bitmask we don't have to play crazy
> games with levels, and with the app able to declare what it wants
> logged, we get the flexibility needed by app developers.
> 
> One change I think I'd like to make is that an app can't block
> RET_KILL from being logged (the sysadmin can, though). What do think
> of that minor tweak?

That's fine from my point of view.

> 
> Is RET_ALLOW allowed to be logged by an application? I guess with it
> default-off for an admin, it should be okay.

As you say, it should be fine to allow but I don't know if there's any
real use in it. I can go either way here.

> 
> Does the app-controlled bitmask apply to the filter, the process, the
> process tree, or something else? e.g. systemd launches an app with a
> filter, leaving the defaults alone, then later process installs a
> filter and wants everything logged -- will things from the earlier
> filter get logged?

I think implementation preferences may decide many of these questions.
As I see it, here are the options in order of my preference:

A) Claim the MSB of the filter return value and make the app logging
   preference per-rule
   - If the bit is set, log the rule
   - Provides very fine-grained logging control at the "high cost" of
     the remaining free bit in the filter return bitmask
   - The bit can be ignored in the case of RET_KILL
   - Can be synced across all threads in the calling task with the
     SECCOMP_FILTER_FLAG_TSYNC filter flag

B) Claim a few bits in the filter flags and make the app logging
   preference per-filter
   - Something like SECCOMP_FILTER_FLAG_LOG_TRAP,
     SECCOMP_FILTER_FLAG_LOG_ERRNO, and
     SECCOMP_FILTER_FLAG_LOG_TRACE
   - Logging for RET_KILL and RET_LOG can't be turned off
   - I'd prefer not to waste a bit for RET_ALLOW in this case so it
     simply won't be loggable
   - Works with the SECCOMP_FILTER_FLAG_TSYNC filter flag
   - Doesn't scale well if many new actions are added in the future

C) A simplified version of 'B' where only a single mode bit is claimed
   to enable logging for all actions except RET_ALLOW
   - Something like SECCOMP_FILTER_FLAG_LOG_ACTIONS
   - Filters without this flag only log RET_KILL and RET_LOG
   - Scales much better than 'B' at the expense of less flexibility
   - Works with the SECCOMP_FILTER_FLAG_TSYNC filter flag

D) Claim a bit in the filter mode and make the app logging preference
   per-process
   - This new SECCOMP_MODE_ENABLE_LOGGING mode would take a bitmask of
     actions that should be logged
   - Incurs a small per-task increase in memory footprint in the form
     of an additional member in 'struct seccomp'
   - Has odd behavior you described above where launchers may set the
     logging preference and then launched application may want
     something different

I think 'A' is the cleanest design but I don't know if highly
configurable logging is deserving of the MSB bit in the filter return.
I'd like to hear your thoughts there.

I _barely_ prefer 'B' over 'C'. They're essential equal in my use case.

To be honest, I haven't completely wrapped my head around how 'D' would
actually work in practice so I may be writing it off prematurely.

Am I missing any more clever options that you can think of? Let me know
what you think of the possibilities.

Tyler


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 0/4] Improved seccomp logging
  2017-04-07 22:16       ` Tyler Hicks
  2017-04-07 22:46         ` Kees Cook
@ 2017-04-10 15:18         ` Steve Grubb
  2017-04-10 15:57         ` Andy Lutomirski
  2 siblings, 0 replies; 29+ messages in thread
From: Steve Grubb @ 2017-04-10 15:18 UTC (permalink / raw)
  To: linux-audit
  Cc: Tyler Hicks, Kees Cook, Will Drewry, Linux API, linux-kernel,
	Andy Lutomirski, John Crispin

On Friday, April 7, 2017 6:16:08 PM EDT Tyler Hicks wrote:
> On 02/22/2017 12:46 PM, Kees Cook wrote:
> > On Thu, Feb 16, 2017 at 3:29 PM, Kees Cook <keescook@chromium.org> wrote:
> >> On Wed, Feb 15, 2017 at 7:24 PM, Andy Lutomirski <luto@amacapital.net> 
wrote:
> >>> On Mon, Feb 13, 2017 at 7:45 PM, Tyler Hicks <tyhicks@canonical.com> 
wrote:
> >>>> This patch set is the third revision of the following two previously
> >>>> submitted patch sets:
> >>>> 
> >>>> v1:
> >>>> http://lkml.kernel.org/r/1483375990-14948-1-git-send-email-tyhicks@can
> >>>> onical.com v1:
> >>>> http://lkml.kernel.org/r/1483377999-15019-2-git-send-email-tyhicks@can
> >>>> onical.com
> >>>> 
> >>>> v2:
> >>>> http://lkml.kernel.org/r/1486100262-32391-1-git-send-email-tyhicks@can
> >>>> onical.com
> >>>> 
> >>>> The patch set aims to address some known deficiencies in seccomp's
> >>>> current
> >>>> 
> >>>> logging capabilities:
> >>>>   1. Inability to log all filter actions.
> >>>>   2. Inability to selectively enable filtering; e.g. devs want noisy
> >>>>   logging,
> >>>>   
> >>>>      users want relative quiet.
> >>>>   
> >>>>   3. Consistent behavior with audit enabled and disabled.
> >>>>   4. Inability to easily develop a filter due to the lack of a
> >>>>   
> >>>>      permissive/complain mode.
> >>> 
> >>> I think I dislike this, but I think my dislikes may be fixable with
> >>> minor changes.
> >>> 
> >>> What I dislike is that this mixes app-specific built-in configuration
> >>> (seccomp) with global privileged stuff (audit).  The result is a
> >>> potentially difficult to use situation in which you need to modify an
> >>> app to make it loggable (using RET_LOG) and then fiddle with
> >>> privileged config (auditctl, etc) to actually see the logs.
> >> 
> >> You make a good point about RET_LOG vs log_max_action. I think making
> >> RET_LOG the default value would work for 99% of the cases.
> > 
> > Actually, I take this back: making "log" the default means that
> > everything else gets logged too, include "expected" return values like
> > errno, trap, etc. I think that would be extremely noisy as a default
> > (for upstream or Ubuntu).
> > 
> > Perhaps RET_LOG should unconditionally log? Or maybe the logged
> > actions should be a bit field instead of a single value? Then the
> > default could be "RET_KILL and RET_LOG", but an admin could switch it
> > to just RET_KILL, or even nothing at all? Hmmm...
> 
> Hi Kees - my apologies for going quiet on this topic after we spoke
> about it more in IRC. I needed to tend to other work but now I'm able to
> return to this seccomp logging patch set.
> 
> To summarize what we discussed in IRC, the Chrome browser makes
> extensive use of RET_ERRNO, RET_TRACE, etc., to sandbox code that may
> not ever be adjusted to keep from bump into the sandbox walls.
> Therefore, it is unreasonable to enable logging of something like
> RET_ERRNO on a system-wide level where Chrome browser is in use.

Just to illustrate what can happen, /usr/lib64/qt5/libexec/QtWebEngineProcess 
is causing hundreds of errno SECCOMP audit events.

See bz https://bugzilla.redhat.com/show_bug.cgi?id=1438973

The developers should not have enabled this for distribution's runtime. It 
should be a debug option.

-Steve

> In contrast, snapd wants to set up "noisier" sandboxes for applications
> to make it clear to the developers and the users that the sandboxed
> application is bumping into the sandbox walls. Developers will know why
> their code may not be working as intended and users will know that the
> application is doing things that the platform doesn't agree with. These
> sandboxes will end up using RET_ERRNO in the majority of cases.
> 
> This means that with the current design of this patch set, Chrome
> browser will either be unintentionally spamming the logs or snapd's
> sandboxes will be helplessly silent when both Chrome and snapd is
> installed at the same time, depending on the admin's preferences.
> 
> To bring it back up a level, two applications may have a very different
> outlook on how acceptable a given seccomp action is and they may
> disagree on whether or not the user/administrator should be informed.
> 
> I've been giving thought to the idea of providing a way for the
> application setting up the filter to opt into logging of certain
> actions. Here's a high level breakdown:
> 
>  - The administrator will have control of system-wide seccomp logging
>    and the application will have control of how its seccomp actions are
>    logged
>  - Both the administrator's and application's logging preferences are
>    bitmasks of actions that should be logged
>  - The default administrator bitmask is set to log all actions except
>    RET_ALLOW
>    + Configurable via a sysctl
>    + Very similar to the log_max_action syscall that was proposed in
>      this patch set but a bitmask instead of a max action
>  - The default application bitmask is set to log only RET_KILL and
>    RET_LOG
>    + Configurable via the seccomp(2) syscall (details TBD)
>  - Actions are only logged when the application has requested the
>    action to be logged and the administrator has allowed the action to
>    be logged
>    + By default, this is only RET_KILL and RET_LOG actions
> 
> Let me know what you think about this and I can turn out another patch
> set next week if it sounds reasonable.
> 
> Tyler

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 0/4] Improved seccomp logging
  2017-04-07 22:16       ` Tyler Hicks
  2017-04-07 22:46         ` Kees Cook
  2017-04-10 15:18         ` Steve Grubb
@ 2017-04-10 15:57         ` Andy Lutomirski
  2017-04-10 19:22           ` Tyler Hicks
  2017-04-11  4:03           ` Kees Cook
  2 siblings, 2 replies; 29+ messages in thread
From: Andy Lutomirski @ 2017-04-10 15:57 UTC (permalink / raw)
  To: Tyler Hicks
  Cc: Kees Cook, Paul Moore, Eric Paris, Will Drewry, linux-audit,
	linux-kernel, John Crispin, Linux API

On Fri, Apr 7, 2017 at 3:16 PM, Tyler Hicks <tyhicks@canonical.com> wrote:
> On 02/22/2017 12:46 PM, Kees Cook wrote:
>> On Thu, Feb 16, 2017 at 3:29 PM, Kees Cook <keescook@chromium.org> wrote:
>>> On Wed, Feb 15, 2017 at 7:24 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>>>> On Mon, Feb 13, 2017 at 7:45 PM, Tyler Hicks <tyhicks@canonical.com> wrote:
>>>>> This patch set is the third revision of the following two previously
>>>>> submitted patch sets:
>>>>>
>>>>> v1: http://lkml.kernel.org/r/1483375990-14948-1-git-send-email-tyhicks@canonical.com
>>>>> v1: http://lkml.kernel.org/r/1483377999-15019-2-git-send-email-tyhicks@canonical.com
>>>>>
>>>>> v2: http://lkml.kernel.org/r/1486100262-32391-1-git-send-email-tyhicks@canonical.com
>>>>>
>>>>> The patch set aims to address some known deficiencies in seccomp's current
>>>>> logging capabilities:
>>>>>
>>>>>   1. Inability to log all filter actions.
>>>>>   2. Inability to selectively enable filtering; e.g. devs want noisy logging,
>>>>>      users want relative quiet.
>>>>>   3. Consistent behavior with audit enabled and disabled.
>>>>>   4. Inability to easily develop a filter due to the lack of a
>>>>>      permissive/complain mode.
>>>>
>>>> I think I dislike this, but I think my dislikes may be fixable with
>>>> minor changes.
>>>>
>>>> What I dislike is that this mixes app-specific built-in configuration
>>>> (seccomp) with global privileged stuff (audit).  The result is a
>>>> potentially difficult to use situation in which you need to modify an
>>>> app to make it loggable (using RET_LOG) and then fiddle with
>>>> privileged config (auditctl, etc) to actually see the logs.
>>>
>>> You make a good point about RET_LOG vs log_max_action. I think making
>>> RET_LOG the default value would work for 99% of the cases.
>>
>> Actually, I take this back: making "log" the default means that
>> everything else gets logged too, include "expected" return values like
>> errno, trap, etc. I think that would be extremely noisy as a default
>> (for upstream or Ubuntu).
>>
>> Perhaps RET_LOG should unconditionally log? Or maybe the logged
>> actions should be a bit field instead of a single value? Then the
>> default could be "RET_KILL and RET_LOG", but an admin could switch it
>> to just RET_KILL, or even nothing at all? Hmmm...
>
> Hi Kees - my apologies for going quiet on this topic after we spoke
> about it more in IRC. I needed to tend to other work but now I'm able to
> return to this seccomp logging patch set.
>
> To summarize what we discussed in IRC, the Chrome browser makes
> extensive use of RET_ERRNO, RET_TRACE, etc., to sandbox code that may
> not ever be adjusted to keep from bump into the sandbox walls.
> Therefore, it is unreasonable to enable logging of something like
> RET_ERRNO on a system-wide level where Chrome browser is in use.
>
> In contrast, snapd wants to set up "noisier" sandboxes for applications
> to make it clear to the developers and the users that the sandboxed
> application is bumping into the sandbox walls. Developers will know why
> their code may not be working as intended and users will know that the
> application is doing things that the platform doesn't agree with. These
> sandboxes will end up using RET_ERRNO in the majority of cases.
>
> This means that with the current design of this patch set, Chrome
> browser will either be unintentionally spamming the logs or snapd's
> sandboxes will be helplessly silent when both Chrome and snapd is
> installed at the same time, depending on the admin's preferences.
>
> To bring it back up a level, two applications may have a very different
> outlook on how acceptable a given seccomp action is and they may
> disagree on whether or not the user/administrator should be informed.
>
> I've been giving thought to the idea of providing a way for the
> application setting up the filter to opt into logging of certain
> actions. Here's a high level breakdown:

At the risk of overcomplicating things, I think this issue may be a
decent argument for doing something more like what I suggested
earlier: let seccomp users emit loggable things and let their parents
(optionally) catch and handle those things.  Then we get real scoping
rather than fiddling with bitmasks and hoping we get what we want to
see without generating massive log spam.

--Andy

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 0/4] Improved seccomp logging
  2017-04-10 15:57         ` Andy Lutomirski
@ 2017-04-10 19:22           ` Tyler Hicks
  2017-04-11  4:03           ` Kees Cook
  1 sibling, 0 replies; 29+ messages in thread
From: Tyler Hicks @ 2017-04-10 19:22 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Kees Cook, Paul Moore, Eric Paris, Will Drewry, linux-audit,
	linux-kernel, John Crispin, Linux API


[-- Attachment #1.1: Type: text/plain, Size: 5331 bytes --]

On 04/10/2017 10:57 AM, Andy Lutomirski wrote:
> On Fri, Apr 7, 2017 at 3:16 PM, Tyler Hicks <tyhicks@canonical.com> wrote:
>> On 02/22/2017 12:46 PM, Kees Cook wrote:
>>> On Thu, Feb 16, 2017 at 3:29 PM, Kees Cook <keescook@chromium.org> wrote:
>>>> On Wed, Feb 15, 2017 at 7:24 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>>>>> On Mon, Feb 13, 2017 at 7:45 PM, Tyler Hicks <tyhicks@canonical.com> wrote:
>>>>>> This patch set is the third revision of the following two previously
>>>>>> submitted patch sets:
>>>>>>
>>>>>> v1: http://lkml.kernel.org/r/1483375990-14948-1-git-send-email-tyhicks@canonical.com
>>>>>> v1: http://lkml.kernel.org/r/1483377999-15019-2-git-send-email-tyhicks@canonical.com
>>>>>>
>>>>>> v2: http://lkml.kernel.org/r/1486100262-32391-1-git-send-email-tyhicks@canonical.com
>>>>>>
>>>>>> The patch set aims to address some known deficiencies in seccomp's current
>>>>>> logging capabilities:
>>>>>>
>>>>>>   1. Inability to log all filter actions.
>>>>>>   2. Inability to selectively enable filtering; e.g. devs want noisy logging,
>>>>>>      users want relative quiet.
>>>>>>   3. Consistent behavior with audit enabled and disabled.
>>>>>>   4. Inability to easily develop a filter due to the lack of a
>>>>>>      permissive/complain mode.
>>>>>
>>>>> I think I dislike this, but I think my dislikes may be fixable with
>>>>> minor changes.
>>>>>
>>>>> What I dislike is that this mixes app-specific built-in configuration
>>>>> (seccomp) with global privileged stuff (audit).  The result is a
>>>>> potentially difficult to use situation in which you need to modify an
>>>>> app to make it loggable (using RET_LOG) and then fiddle with
>>>>> privileged config (auditctl, etc) to actually see the logs.
>>>>
>>>> You make a good point about RET_LOG vs log_max_action. I think making
>>>> RET_LOG the default value would work for 99% of the cases.
>>>
>>> Actually, I take this back: making "log" the default means that
>>> everything else gets logged too, include "expected" return values like
>>> errno, trap, etc. I think that would be extremely noisy as a default
>>> (for upstream or Ubuntu).
>>>
>>> Perhaps RET_LOG should unconditionally log? Or maybe the logged
>>> actions should be a bit field instead of a single value? Then the
>>> default could be "RET_KILL and RET_LOG", but an admin could switch it
>>> to just RET_KILL, or even nothing at all? Hmmm...
>>
>> Hi Kees - my apologies for going quiet on this topic after we spoke
>> about it more in IRC. I needed to tend to other work but now I'm able to
>> return to this seccomp logging patch set.
>>
>> To summarize what we discussed in IRC, the Chrome browser makes
>> extensive use of RET_ERRNO, RET_TRACE, etc., to sandbox code that may
>> not ever be adjusted to keep from bump into the sandbox walls.
>> Therefore, it is unreasonable to enable logging of something like
>> RET_ERRNO on a system-wide level where Chrome browser is in use.
>>
>> In contrast, snapd wants to set up "noisier" sandboxes for applications
>> to make it clear to the developers and the users that the sandboxed
>> application is bumping into the sandbox walls. Developers will know why
>> their code may not be working as intended and users will know that the
>> application is doing things that the platform doesn't agree with. These
>> sandboxes will end up using RET_ERRNO in the majority of cases.
>>
>> This means that with the current design of this patch set, Chrome
>> browser will either be unintentionally spamming the logs or snapd's
>> sandboxes will be helplessly silent when both Chrome and snapd is
>> installed at the same time, depending on the admin's preferences.
>>
>> To bring it back up a level, two applications may have a very different
>> outlook on how acceptable a given seccomp action is and they may
>> disagree on whether or not the user/administrator should be informed.
>>
>> I've been giving thought to the idea of providing a way for the
>> application setting up the filter to opt into logging of certain
>> actions. Here's a high level breakdown:
> 
> At the risk of overcomplicating things, I think this issue may be a
> decent argument for doing something more like what I suggested
> earlier: let seccomp users emit loggable things and let their parents
> (optionally) catch and handle those things.  Then we get real scoping
> rather than fiddling with bitmasks and hoping we get what we want to
> see without generating massive log spam.

Hi Andy - I remembered your proposal and considered it when I was
writing up mine. I still feel like it is easier to get the logging
bitmask right, if you even need to adjust the default bitmask, than to
force parent processes to act as a relay for log messages.

The logging bitmasks can easily be adjusted in debug builds or when a
user specifies a runtime option to enable seccomp logging in a program
that sets up filters.

Another potential issue with the parent process handling the logging is
that it isn't always possible when audit is in use.
audit_log_user_message() requires CAP_AUDIT_WRITE (seems like
CAP_AUDIT_CONTROL might also be required but I can't remember why) so
unprivileged processes would have to divert their log messages elsewhere.

Tyler



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 0/4] Improved seccomp logging
  2017-04-07 23:46           ` Tyler Hicks
@ 2017-04-11  3:59             ` Kees Cook
  2017-04-27 22:17               ` Tyler Hicks
  0 siblings, 1 reply; 29+ messages in thread
From: Kees Cook @ 2017-04-11  3:59 UTC (permalink / raw)
  To: Tyler Hicks
  Cc: Andy Lutomirski, Paul Moore, Eric Paris, Will Drewry,
	linux-audit, linux-kernel, John Crispin, Linux API

On Fri, Apr 7, 2017 at 4:46 PM, Tyler Hicks <tyhicks@canonical.com> wrote:
> On 04/07/2017 05:46 PM, Kees Cook wrote:
>> Does the app-controlled bitmask apply to the filter, the process, the
>> process tree, or something else? e.g. systemd launches an app with a
>> filter, leaving the defaults alone, then later process installs a
>> filter and wants everything logged -- will things from the earlier
>> filter get logged?
>
> I think implementation preferences may decide many of these questions.
> As I see it, here are the options in order of my preference:
>
> A) Claim the MSB of the filter return value and make the app logging
>    preference per-rule
>    - If the bit is set, log the rule
>    - Provides very fine-grained logging control at the "high cost" of
>      the remaining free bit in the filter return bitmask
>    - The bit can be ignored in the case of RET_KILL
>    - Can be synced across all threads in the calling task with the
>      SECCOMP_FILTER_FLAG_TSYNC filter flag
>
> B) Claim a few bits in the filter flags and make the app logging
>    preference per-filter
>    - Something like SECCOMP_FILTER_FLAG_LOG_TRAP,
>      SECCOMP_FILTER_FLAG_LOG_ERRNO, and
>      SECCOMP_FILTER_FLAG_LOG_TRACE
>    - Logging for RET_KILL and RET_LOG can't be turned off
>    - I'd prefer not to waste a bit for RET_ALLOW in this case so it
>      simply won't be loggable
>    - Works with the SECCOMP_FILTER_FLAG_TSYNC filter flag
>    - Doesn't scale well if many new actions are added in the future
>
> C) A simplified version of 'B' where only a single mode bit is claimed
>    to enable logging for all actions except RET_ALLOW
>    - Something like SECCOMP_FILTER_FLAG_LOG_ACTIONS
>    - Filters without this flag only log RET_KILL and RET_LOG
>    - Scales much better than 'B' at the expense of less flexibility
>    - Works with the SECCOMP_FILTER_FLAG_TSYNC filter flag
>
> D) Claim a bit in the filter mode and make the app logging preference
>    per-process
>    - This new SECCOMP_MODE_ENABLE_LOGGING mode would take a bitmask of
>      actions that should be logged
>    - Incurs a small per-task increase in memory footprint in the form
>      of an additional member in 'struct seccomp'
>    - Has odd behavior you described above where launchers may set the
>      logging preference and then launched application may want
>      something different
>
> I think 'A' is the cleanest design but I don't know if highly
> configurable logging is deserving of the MSB bit in the filter return.
> I'd like to hear your thoughts there.
>
> I _barely_ prefer 'B' over 'C'. They're essential equal in my use case.
>
> To be honest, I haven't completely wrapped my head around how 'D' would
> actually work in practice so I may be writing it off prematurely.
>
> Am I missing any more clever options that you can think of? Let me know
> what you think of the possibilities.

Hmm, so, I think we can just make this a bitmask in the process
seccomp struct. It'll get inherited across forks, and any filter that
wants to make sure it never changes again can just blacklist the
seccomp syscall with that argument. I don't see anything about the
logging that should be considered private, considering the logs are
going through syslog or auditd. Since it's already out-of-band, this
won't change the behavior of ptrace monitors, etc.

So, how about seccomp(SECCOMP_SET_LOGGING, flags, user_ptr) and
...GET_LOGGING? flags likely 0, and user_ptr can point to:

struct seccomp_logging {
    u32 count;
    u32 values[];
};

Where each value entry is a filter return value to log. (That way
bitmasks are just an internal storage detail and we're allowed to add
new filter returns without breaking a bitmask UAPI.)

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 0/4] Improved seccomp logging
  2017-04-10 15:57         ` Andy Lutomirski
  2017-04-10 19:22           ` Tyler Hicks
@ 2017-04-11  4:03           ` Kees Cook
  1 sibling, 0 replies; 29+ messages in thread
From: Kees Cook @ 2017-04-11  4:03 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Tyler Hicks, Paul Moore, Eric Paris, Will Drewry, linux-audit,
	linux-kernel, John Crispin, Linux API

On Mon, Apr 10, 2017 at 8:57 AM, Andy Lutomirski <luto@kernel.org> wrote:
> On Fri, Apr 7, 2017 at 3:16 PM, Tyler Hicks <tyhicks@canonical.com> wrote:
>> On 02/22/2017 12:46 PM, Kees Cook wrote:
>>> On Thu, Feb 16, 2017 at 3:29 PM, Kees Cook <keescook@chromium.org> wrote:
>>>> On Wed, Feb 15, 2017 at 7:24 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>>>>> On Mon, Feb 13, 2017 at 7:45 PM, Tyler Hicks <tyhicks@canonical.com> wrote:
>>>>>> This patch set is the third revision of the following two previously
>>>>>> submitted patch sets:
>>>>>>
>>>>>> v1: http://lkml.kernel.org/r/1483375990-14948-1-git-send-email-tyhicks@canonical.com
>>>>>> v1: http://lkml.kernel.org/r/1483377999-15019-2-git-send-email-tyhicks@canonical.com
>>>>>>
>>>>>> v2: http://lkml.kernel.org/r/1486100262-32391-1-git-send-email-tyhicks@canonical.com
>>>>>>
>>>>>> The patch set aims to address some known deficiencies in seccomp's current
>>>>>> logging capabilities:
>>>>>>
>>>>>>   1. Inability to log all filter actions.
>>>>>>   2. Inability to selectively enable filtering; e.g. devs want noisy logging,
>>>>>>      users want relative quiet.
>>>>>>   3. Consistent behavior with audit enabled and disabled.
>>>>>>   4. Inability to easily develop a filter due to the lack of a
>>>>>>      permissive/complain mode.
>>>>>
>>>>> I think I dislike this, but I think my dislikes may be fixable with
>>>>> minor changes.
>>>>>
>>>>> What I dislike is that this mixes app-specific built-in configuration
>>>>> (seccomp) with global privileged stuff (audit).  The result is a
>>>>> potentially difficult to use situation in which you need to modify an
>>>>> app to make it loggable (using RET_LOG) and then fiddle with
>>>>> privileged config (auditctl, etc) to actually see the logs.
>>>>
>>>> You make a good point about RET_LOG vs log_max_action. I think making
>>>> RET_LOG the default value would work for 99% of the cases.
>>>
>>> Actually, I take this back: making "log" the default means that
>>> everything else gets logged too, include "expected" return values like
>>> errno, trap, etc. I think that would be extremely noisy as a default
>>> (for upstream or Ubuntu).
>>>
>>> Perhaps RET_LOG should unconditionally log? Or maybe the logged
>>> actions should be a bit field instead of a single value? Then the
>>> default could be "RET_KILL and RET_LOG", but an admin could switch it
>>> to just RET_KILL, or even nothing at all? Hmmm...
>>
>> Hi Kees - my apologies for going quiet on this topic after we spoke
>> about it more in IRC. I needed to tend to other work but now I'm able to
>> return to this seccomp logging patch set.
>>
>> To summarize what we discussed in IRC, the Chrome browser makes
>> extensive use of RET_ERRNO, RET_TRACE, etc., to sandbox code that may
>> not ever be adjusted to keep from bump into the sandbox walls.
>> Therefore, it is unreasonable to enable logging of something like
>> RET_ERRNO on a system-wide level where Chrome browser is in use.
>>
>> In contrast, snapd wants to set up "noisier" sandboxes for applications
>> to make it clear to the developers and the users that the sandboxed
>> application is bumping into the sandbox walls. Developers will know why
>> their code may not be working as intended and users will know that the
>> application is doing things that the platform doesn't agree with. These
>> sandboxes will end up using RET_ERRNO in the majority of cases.
>>
>> This means that with the current design of this patch set, Chrome
>> browser will either be unintentionally spamming the logs or snapd's
>> sandboxes will be helplessly silent when both Chrome and snapd is
>> installed at the same time, depending on the admin's preferences.
>>
>> To bring it back up a level, two applications may have a very different
>> outlook on how acceptable a given seccomp action is and they may
>> disagree on whether or not the user/administrator should be informed.
>>
>> I've been giving thought to the idea of providing a way for the
>> application setting up the filter to opt into logging of certain
>> actions. Here's a high level breakdown:
>
> At the risk of overcomplicating things, I think this issue may be a
> decent argument for doing something more like what I suggested
> earlier: let seccomp users emit loggable things and let their parents
> (optionally) catch and handle those things.  Then we get real scoping
> rather than fiddling with bitmasks and hoping we get what we want to
> see without generating massive log spam.

I still think this is overkill. We already have an out-of-band logging
system, and having something that is tied to the parent duplicates
much of ptrace monitor capabilities without really adding much. I'd
like to continue with the syslog/auditd approach until we have an
actual use-case like what you've described. I think Tyler (and others,
e.g. openWRT) have demonstrated a need that would be addressed via the
syslog path.

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 0/4] Improved seccomp logging
  2017-04-11  3:59             ` Kees Cook
@ 2017-04-27 22:17               ` Tyler Hicks
  2017-04-27 23:42                 ` Kees Cook
  0 siblings, 1 reply; 29+ messages in thread
From: Tyler Hicks @ 2017-04-27 22:17 UTC (permalink / raw)
  To: Kees Cook
  Cc: Andy Lutomirski, Paul Moore, Eric Paris, Will Drewry,
	linux-audit, linux-kernel, John Crispin, Linux API


[-- Attachment #1.1: Type: text/plain, Size: 4890 bytes --]

On 04/10/2017 10:59 PM, Kees Cook wrote:
> On Fri, Apr 7, 2017 at 4:46 PM, Tyler Hicks <tyhicks@canonical.com> wrote:
>> On 04/07/2017 05:46 PM, Kees Cook wrote:
>>> Does the app-controlled bitmask apply to the filter, the process, the
>>> process tree, or something else? e.g. systemd launches an app with a
>>> filter, leaving the defaults alone, then later process installs a
>>> filter and wants everything logged -- will things from the earlier
>>> filter get logged?
>>
>> I think implementation preferences may decide many of these questions.
>> As I see it, here are the options in order of my preference:
>>
>> A) Claim the MSB of the filter return value and make the app logging
>>    preference per-rule
>>    - If the bit is set, log the rule
>>    - Provides very fine-grained logging control at the "high cost" of
>>      the remaining free bit in the filter return bitmask
>>    - The bit can be ignored in the case of RET_KILL
>>    - Can be synced across all threads in the calling task with the
>>      SECCOMP_FILTER_FLAG_TSYNC filter flag
>>
>> B) Claim a few bits in the filter flags and make the app logging
>>    preference per-filter
>>    - Something like SECCOMP_FILTER_FLAG_LOG_TRAP,
>>      SECCOMP_FILTER_FLAG_LOG_ERRNO, and
>>      SECCOMP_FILTER_FLAG_LOG_TRACE
>>    - Logging for RET_KILL and RET_LOG can't be turned off
>>    - I'd prefer not to waste a bit for RET_ALLOW in this case so it
>>      simply won't be loggable
>>    - Works with the SECCOMP_FILTER_FLAG_TSYNC filter flag
>>    - Doesn't scale well if many new actions are added in the future
>>
>> C) A simplified version of 'B' where only a single mode bit is claimed
>>    to enable logging for all actions except RET_ALLOW
>>    - Something like SECCOMP_FILTER_FLAG_LOG_ACTIONS
>>    - Filters without this flag only log RET_KILL and RET_LOG
>>    - Scales much better than 'B' at the expense of less flexibility
>>    - Works with the SECCOMP_FILTER_FLAG_TSYNC filter flag
>>
>> D) Claim a bit in the filter mode and make the app logging preference
>>    per-process
>>    - This new SECCOMP_MODE_ENABLE_LOGGING mode would take a bitmask of
>>      actions that should be logged
>>    - Incurs a small per-task increase in memory footprint in the form
>>      of an additional member in 'struct seccomp'
>>    - Has odd behavior you described above where launchers may set the
>>      logging preference and then launched application may want
>>      something different
>>
>> I think 'A' is the cleanest design but I don't know if highly
>> configurable logging is deserving of the MSB bit in the filter return.
>> I'd like to hear your thoughts there.
>>
>> I _barely_ prefer 'B' over 'C'. They're essential equal in my use case.
>>
>> To be honest, I haven't completely wrapped my head around how 'D' would
>> actually work in practice so I may be writing it off prematurely.
>>
>> Am I missing any more clever options that you can think of? Let me know
>> what you think of the possibilities.
> 
> Hmm, so, I think we can just make this a bitmask in the process
> seccomp struct. It'll get inherited across forks, and any filter that
> wants to make sure it never changes again can just blacklist the
> seccomp syscall with that argument. I don't see anything about the
> logging that should be considered private, considering the logs are
> going through syslog or auditd. Since it's already out-of-band, this
> won't change the behavior of ptrace monitors, etc.
> 
> So, how about seccomp(SECCOMP_SET_LOGGING, flags, user_ptr) and
> ...GET_LOGGING? flags likely 0, and user_ptr can point to:
> 
> struct seccomp_logging {
>     u32 count;
>     u32 values[];
> };
> 
> Where each value entry is a filter return value to log. (That way
> bitmasks are just an internal storage detail and we're allowed to add
> new filter returns without breaking a bitmask UAPI.)

Quick update... I finished the move from the high-water mark
log_max_action sysctl to the bitmask based actions_logged sysctl.

Unfortunately, I've just realized that SECCOMP_SET_LOGGING, or any
process-wide logging configuration mechanism, will not work. It is fine
for the situation where two unrelated processes set up seccomp filters
that should be logged differently. However, it fails when two closely
related processes, such as parent and child, need to set up seccomp
filters that should be logged differently. Imagine a launcher that sets
up an application sandbox (including a seccomp filter) and then launches
an electron app which will have its own seccomp filter for sandboxing
untrusted code that it runs. Unless the launcher and app completely
agree on actions that should be logged, the logging won't work as
intended for both processes.

I think this needs to be configured at the filter level.

Tyler


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 0/4] Improved seccomp logging
  2017-04-27 22:17               ` Tyler Hicks
@ 2017-04-27 23:42                 ` Kees Cook
  2017-05-02  2:41                   ` Tyler Hicks
  0 siblings, 1 reply; 29+ messages in thread
From: Kees Cook @ 2017-04-27 23:42 UTC (permalink / raw)
  To: Tyler Hicks
  Cc: Andy Lutomirski, Paul Moore, Eric Paris, Will Drewry,
	linux-audit, linux-kernel, John Crispin, Linux API

On Thu, Apr 27, 2017 at 3:17 PM, Tyler Hicks <tyhicks@canonical.com> wrote:
> Quick update... I finished the move from the high-water mark
> log_max_action sysctl to the bitmask based actions_logged sysctl.

Awesome!

> Unfortunately, I've just realized that SECCOMP_SET_LOGGING, or any
> process-wide logging configuration mechanism, will not work. It is fine
> for the situation where two unrelated processes set up seccomp filters
> that should be logged differently. However, it fails when two closely
> related processes, such as parent and child, need to set up seccomp
> filters that should be logged differently. Imagine a launcher that sets
> up an application sandbox (including a seccomp filter) and then launches
> an electron app which will have its own seccomp filter for sandboxing
> untrusted code that it runs. Unless the launcher and app completely
> agree on actions that should be logged, the logging won't work as
> intended for both processes.

Oh, you mean the forked process sets up the logging it wants for the
filters it just installed, then after exec a process sets up new
logging requirements?

> I think this needs to be configured at the filter level.

I'm not sure that's even the right way to compose the logging desires.

So, my initial thought was "whatever ran SECCOMP_SET_LOGGING knows
what it's doing" and it should be the actual value.

If the launcher wants logs of everything the application does with its
filters, then a purely-tied-to-filter approach won't work either.

Perhaps log bits can only be enabled? I.e. SECCOMP_SET_LOGGING
performs an OR instead of an assignment?

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 0/4] Improved seccomp logging
  2017-04-27 23:42                 ` Kees Cook
@ 2017-05-02  2:41                   ` Tyler Hicks
  2017-05-02 16:14                     ` Andy Lutomirski
  0 siblings, 1 reply; 29+ messages in thread
From: Tyler Hicks @ 2017-05-02  2:41 UTC (permalink / raw)
  To: Kees Cook
  Cc: Andy Lutomirski, Paul Moore, Eric Paris, Will Drewry,
	linux-audit, linux-kernel, John Crispin, Linux API

On 04/27/2017 07:42 PM, Kees Cook wrote:
> On Thu, Apr 27, 2017 at 3:17 PM, Tyler Hicks <tyhicks@canonical.com> wrote:
>> Quick update... I finished the move from the high-water mark
>> log_max_action sysctl to the bitmask based actions_logged sysctl.
> 
> Awesome!
> 
>> Unfortunately, I've just realized that SECCOMP_SET_LOGGING, or any
>> process-wide logging configuration mechanism, will not work. It is fine
>> for the situation where two unrelated processes set up seccomp filters
>> that should be logged differently. However, it fails when two closely
>> related processes, such as parent and child, need to set up seccomp
>> filters that should be logged differently. Imagine a launcher that sets
>> up an application sandbox (including a seccomp filter) and then launches
>> an electron app which will have its own seccomp filter for sandboxing
>> untrusted code that it runs. Unless the launcher and app completely
>> agree on actions that should be logged, the logging won't work as
>> intended for both processes.
> 
> Oh, you mean the forked process sets up the logging it wants for the
> filters it just installed, then after exec a process sets up new
> logging requirements?

Yes - see below.

> 
>> I think this needs to be configured at the filter level.
> 
> I'm not sure that's even the right way to compose the logging desires.
> 
> So, my initial thought was "whatever ran SECCOMP_SET_LOGGING knows
> what it's doing" and it should be the actual value.
> 
> If the launcher wants logs of everything the application does with its
> filters, then a purely-tied-to-filter approach won't work either.
> 
> Perhaps log bits can only be enabled? I.e. SECCOMP_SET_LOGGING
> performs an OR instead of an assignment?

The problem that I'm envisioning with this design is this:

1. Launcher is told to launch Chrome and forks off a process.

2. Launcher sets up a filter using RET_ERRNO for all unacceptable
syscalls and enables auditing of RET_ERRNO.

3. Launcher execs Chrome.

4. Chrome then sets up its own, more restrictive filter that uses
RET_ERRNO, among other actions, but does not want auditing of RET_ERRNO.

If we use process-wide auditing controls, the logs will be filled with
RET_ERRNO messages that were unintended and unrelated to the RET_ERRNO
actions set up in the launcher's filter.

Unfortunately, the OR'ing idea doesn't solve the problem.

Tyler

> 
> -Kees
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 0/4] Improved seccomp logging
  2017-05-02  2:41                   ` Tyler Hicks
@ 2017-05-02 16:14                     ` Andy Lutomirski
  0 siblings, 0 replies; 29+ messages in thread
From: Andy Lutomirski @ 2017-05-02 16:14 UTC (permalink / raw)
  To: Tyler Hicks
  Cc: Kees Cook, Paul Moore, Eric Paris, Will Drewry, linux-audit,
	linux-kernel, John Crispin, Linux API

On Mon, May 1, 2017 at 7:41 PM, Tyler Hicks <tyhicks@canonical.com> wrote:
> On 04/27/2017 07:42 PM, Kees Cook wrote:
>> On Thu, Apr 27, 2017 at 3:17 PM, Tyler Hicks <tyhicks@canonical.com> wrote:
>>> Quick update... I finished the move from the high-water mark
>>> log_max_action sysctl to the bitmask based actions_logged sysctl.
>>
>> Awesome!
>>
>>> Unfortunately, I've just realized that SECCOMP_SET_LOGGING, or any
>>> process-wide logging configuration mechanism, will not work. It is fine
>>> for the situation where two unrelated processes set up seccomp filters
>>> that should be logged differently. However, it fails when two closely
>>> related processes, such as parent and child, need to set up seccomp
>>> filters that should be logged differently. Imagine a launcher that sets
>>> up an application sandbox (including a seccomp filter) and then launches
>>> an electron app which will have its own seccomp filter for sandboxing
>>> untrusted code that it runs. Unless the launcher and app completely
>>> agree on actions that should be logged, the logging won't work as
>>> intended for both processes.
>>
>> Oh, you mean the forked process sets up the logging it wants for the
>> filters it just installed, then after exec a process sets up new
>> logging requirements?
>
> Yes - see below.
>
>>
>>> I think this needs to be configured at the filter level.
>>
>> I'm not sure that's even the right way to compose the logging desires.
>>
>> So, my initial thought was "whatever ran SECCOMP_SET_LOGGING knows
>> what it's doing" and it should be the actual value.
>>
>> If the launcher wants logs of everything the application does with its
>> filters, then a purely-tied-to-filter approach won't work either.
>>
>> Perhaps log bits can only be enabled? I.e. SECCOMP_SET_LOGGING
>> performs an OR instead of an assignment?
>
> The problem that I'm envisioning with this design is this:
>
> 1. Launcher is told to launch Chrome and forks off a process.
>
> 2. Launcher sets up a filter using RET_ERRNO for all unacceptable
> syscalls and enables auditing of RET_ERRNO.
>
> 3. Launcher execs Chrome.
>
> 4. Chrome then sets up its own, more restrictive filter that uses
> RET_ERRNO, among other actions, but does not want auditing of RET_ERRNO.
>
> If we use process-wide auditing controls, the logs will be filled with
> RET_ERRNO messages that were unintended and unrelated to the RET_ERRNO
> actions set up in the launcher's filter.
>
> Unfortunately, the OR'ing idea doesn't solve the problem.

Things like my more complicated solution solve this completely, I
think.  The launcher would, by whatever means, say "RET_ERRNO and log
this".  The more restrictive sandbox would say "RET_ERROR and don't
log this" and we'd just make sure that the composition rules mean the
inner rule wins.

--Andy

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2017-05-02 16:15 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-02-14  3:45 [PATCH v3 0/4] Improved seccomp logging Tyler Hicks
2017-02-14  3:45 ` [PATCH v3 1/4] seccomp: Add sysctl to display available actions Tyler Hicks
2017-02-16  1:00   ` Kees Cook
2017-02-16 18:43     ` Tyler Hicks
2017-02-16  3:14   ` Andy Lutomirski
2017-02-16 18:47     ` Tyler Hicks
2017-02-16 19:01       ` Andy Lutomirski
2017-02-16 20:05         ` Tyler Hicks
2017-02-14  3:45 ` [PATCH v3 2/4] seccomp: Add sysctl to configure actions that should be logged Tyler Hicks
2017-02-14  3:45 ` [PATCH v3 3/4] seccomp: Create an action to log before allowing Tyler Hicks
2017-02-14  3:45 ` [PATCH v3 4/4] seccomp: Add tests for SECCOMP_RET_LOG Tyler Hicks
2017-02-16  3:24 ` [PATCH v3 0/4] Improved seccomp logging Andy Lutomirski
2017-02-16 19:37   ` Tyler Hicks
2017-02-16 23:29   ` Kees Cook
2017-02-17 17:00     ` Andy Lutomirski
2017-02-22 18:39       ` Kees Cook
2017-02-22 18:46     ` Kees Cook
2017-04-07 22:16       ` Tyler Hicks
2017-04-07 22:46         ` Kees Cook
2017-04-07 23:46           ` Tyler Hicks
2017-04-11  3:59             ` Kees Cook
2017-04-27 22:17               ` Tyler Hicks
2017-04-27 23:42                 ` Kees Cook
2017-05-02  2:41                   ` Tyler Hicks
2017-05-02 16:14                     ` Andy Lutomirski
2017-04-10 15:18         ` Steve Grubb
2017-04-10 15:57         ` Andy Lutomirski
2017-04-10 19:22           ` Tyler Hicks
2017-04-11  4:03           ` Kees Cook

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).