linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v5 0/3] arm64: Add seccomp support
@ 2014-07-22  9:14 AKASHI Takahiro
  2014-07-22  9:14 ` [PATCH v5 1/3] arm64: ptrace: reload a syscall number after ptrace operations AKASHI Takahiro
                   ` (3 more replies)
  0 siblings, 4 replies; 36+ messages in thread
From: AKASHI Takahiro @ 2014-07-22  9:14 UTC (permalink / raw)
  To: wad, catalin.marinas, will.deacon, keescook
  Cc: dsaxena, linux-arm-kernel, linaro-kernel, linux-kernel, AKASHI Takahiro

(Please apply this patch after my audit patch in order to avoid some
conflict on arm64/Kconfig.)

This patch enables secure computing (system call filtering) on arm64.
System calls can be allowed or denied by loaded bpf-style rules.
Architecture specific part is to run secure_computing() on syscall entry
and check the result. See [3/3]

Prerequisites are:
 * "arm64: Add audit support" patch

This code is tested on ARMv8 fast model using
 * libseccomp v2.1.1 with modifications for arm64 and verified by its "live"
   tests, 20, 21 and 24.
 * modified version of Kees' seccomp test for 'changing/skipping a syscall'
   behavior

Changes v4 -> v5:
* rebased to v3.16-rc
* add patch [1/3] to allow ptrace to change a system call
  (please note that this patch should be applied even without seccomp.)

Changes v3 -> v4:
* removed the following patch and moved it to "arm64: prerequisites for
  audit and ftrace" patchset since it is required for audit and ftrace in
  case of !COMPAT, too.
  "arm64: is_compat_task is defined both in asm/compat.h and linux/compat.h"

Changes v2 -> v3:
* removed unnecessary 'type cast' operations [2/3]
* check for a return value (-1) of secure_computing() explicitly [2/3]
* aligned with the patch, "arm64: split syscall_trace() into separate
  functions for enter/exit" [2/3]
* changed default of CONFIG_SECCOMP to n [2/3]

Changes v1 -> v2:
* added generic seccomp.h for arm64 to utilize it [1,2/3] 
* changed syscall_trace() to return more meaningful value (-EPERM)
  on seccomp failure case [2/3]
* aligned with the change in "arm64: make a single hook to syscall_trace()
  for all syscall features" v2 [2/3]
* removed is_compat_task() definition from compat.h [3/3]

AKASHI Takahiro (3):
  arm64: ptrace: reload a syscall number after ptrace operations
  asm-generic: Add generic seccomp.h for secure computing mode 1
  arm64: Add seccomp support

 arch/arm64/Kconfig               |   14 ++++++++++++++
 arch/arm64/include/asm/seccomp.h |   25 +++++++++++++++++++++++++
 arch/arm64/include/asm/unistd.h  |    3 +++
 arch/arm64/kernel/entry.S        |    2 ++
 arch/arm64/kernel/ptrace.c       |   18 ++++++++++++++++++
 include/asm-generic/seccomp.h    |   28 ++++++++++++++++++++++++++++
 6 files changed, 90 insertions(+)
 create mode 100644 arch/arm64/include/asm/seccomp.h
 create mode 100644 include/asm-generic/seccomp.h

-- 
1.7.9.5


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v5 1/3] arm64: ptrace: reload a syscall number after ptrace operations
  2014-07-22  9:14 [PATCH v5 0/3] arm64: Add seccomp support AKASHI Takahiro
@ 2014-07-22  9:14 ` AKASHI Takahiro
  2014-07-22 20:15   ` Kees Cook
  2014-07-24  3:54   ` Andy Lutomirski
  2014-07-22  9:14 ` [PATCH v5 2/3] asm-generic: Add generic seccomp.h for secure computing mode 1 AKASHI Takahiro
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 36+ messages in thread
From: AKASHI Takahiro @ 2014-07-22  9:14 UTC (permalink / raw)
  To: wad, catalin.marinas, will.deacon, keescook
  Cc: dsaxena, linux-arm-kernel, linaro-kernel, linux-kernel, AKASHI Takahiro

Arm64 holds a syscall number in w8(x8) register. Ptrace tracer may change
its value either to:
  * any valid syscall number to alter a system call, or
  * -1 to skip a system call

This patch implements this behavior by reloading that value into syscallno
in struct pt_regs after tracehook_report_syscall_entry() or
secure_computing(). In case of '-1', a return value of system call can also
be changed by the tracer setting the value to x0 register, and so
sys_ni_nosyscall() should not be called.

See also:
    42309ab4, ARM: 8087/1: ptrace: reload syscall number after
	      secure_computing() check

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
---
 arch/arm64/kernel/entry.S  |    2 ++
 arch/arm64/kernel/ptrace.c |   13 +++++++++++++
 2 files changed, 15 insertions(+)

diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 5141e79..de8bdbc 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -628,6 +628,8 @@ ENDPROC(el0_svc)
 __sys_trace:
 	mov	x0, sp
 	bl	syscall_trace_enter
+	cmp	w0, #-1				// skip syscall?
+	b.eq	ret_to_user
 	adr	lr, __sys_trace_return		// return address
 	uxtw	scno, w0			// syscall number (possibly new)
 	mov	x1, sp				// pointer to regs
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index 70526cf..100d7d1 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -21,6 +21,7 @@
 
 #include <linux/audit.h>
 #include <linux/compat.h>
+#include <linux/errno.h>
 #include <linux/kernel.h>
 #include <linux/sched.h>
 #include <linux/mm.h>
@@ -1109,9 +1110,21 @@ static void tracehook_report_syscall(struct pt_regs *regs,
 
 asmlinkage int syscall_trace_enter(struct pt_regs *regs)
 {
+	unsigned long saved_x0, saved_x8;
+
+	saved_x0 = regs->regs[0];
+	saved_x8 = regs->regs[8];
+
 	if (test_thread_flag(TIF_SYSCALL_TRACE))
 		tracehook_report_syscall(regs, PTRACE_SYSCALL_ENTER);
 
+	regs->syscallno = regs->regs[8];
+	if ((long)regs->syscallno == ~0UL) { /* skip this syscall */
+		regs->regs[8] = saved_x8;
+		if (regs->regs[0] == saved_x0) /* not changed by user */
+			regs->regs[0] = -ENOSYS;
+	}
+
 	if (test_thread_flag(TIF_SYSCALL_TRACEPOINT))
 		trace_sys_enter(regs, regs->syscallno);
 
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v5 2/3] asm-generic: Add generic seccomp.h for secure computing mode 1
  2014-07-22  9:14 [PATCH v5 0/3] arm64: Add seccomp support AKASHI Takahiro
  2014-07-22  9:14 ` [PATCH v5 1/3] arm64: ptrace: reload a syscall number after ptrace operations AKASHI Takahiro
@ 2014-07-22  9:14 ` AKASHI Takahiro
  2014-07-24  3:40   ` Andy Lutomirski
  2014-07-22  9:14 ` [PATCH v5 3/3] arm64: Add seccomp support AKASHI Takahiro
  2014-07-22 20:16 ` [PATCH v5 0/3] " Kees Cook
  3 siblings, 1 reply; 36+ messages in thread
From: AKASHI Takahiro @ 2014-07-22  9:14 UTC (permalink / raw)
  To: wad, catalin.marinas, will.deacon, keescook
  Cc: dsaxena, linux-arm-kernel, linaro-kernel, linux-kernel, AKASHI Takahiro

Those values (__NR_seccomp_*) are used solely in secure_computing()
to identify mode 1 system calls. If compat system calls have different
syscall numbers, asm/seccomp.h may override them.

Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
---
 include/asm-generic/seccomp.h |   28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)
 create mode 100644 include/asm-generic/seccomp.h

diff --git a/include/asm-generic/seccomp.h b/include/asm-generic/seccomp.h
new file mode 100644
index 0000000..5e97022
--- /dev/null
+++ b/include/asm-generic/seccomp.h
@@ -0,0 +1,28 @@
+/*
+ * include/asm-generic/seccomp.h
+ *
+ * Copyright (C) 2014 Linaro Limited
+ * Author: AKASHI Takahiro <takahiro.akashi@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#ifndef _ASM_GENERIC_SECCOMP_H
+#define _ASM_GENERIC_SECCOMP_H
+
+#include <asm-generic/unistd.h>
+
+#if defined(CONFIG_COMPAT) && !defined(__NR_seccomp_read_32)
+#define __NR_seccomp_read_32		__NR_read
+#define __NR_seccomp_write_32		__NR_write
+#define __NR_seccomp_exit_32		__NR_exit
+#define __NR_seccomp_sigreturn_32	__NR_rt_sigreturn
+#endif /* CONFIG_COMPAT && ! already defined */
+
+#define __NR_seccomp_read		__NR_read
+#define __NR_seccomp_write		__NR_write
+#define __NR_seccomp_exit		__NR_exit
+#define __NR_seccomp_sigreturn		__NR_rt_sigreturn
+
+#endif /* _ASM_GENERIC_SECCOMP_H */
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v5 3/3] arm64: Add seccomp support
  2014-07-22  9:14 [PATCH v5 0/3] arm64: Add seccomp support AKASHI Takahiro
  2014-07-22  9:14 ` [PATCH v5 1/3] arm64: ptrace: reload a syscall number after ptrace operations AKASHI Takahiro
  2014-07-22  9:14 ` [PATCH v5 2/3] asm-generic: Add generic seccomp.h for secure computing mode 1 AKASHI Takahiro
@ 2014-07-22  9:14 ` AKASHI Takahiro
  2014-07-24  3:52   ` Andy Lutomirski
  2014-07-22 20:16 ` [PATCH v5 0/3] " Kees Cook
  3 siblings, 1 reply; 36+ messages in thread
From: AKASHI Takahiro @ 2014-07-22  9:14 UTC (permalink / raw)
  To: wad, catalin.marinas, will.deacon, keescook
  Cc: dsaxena, linux-arm-kernel, linaro-kernel, linux-kernel, AKASHI Takahiro

secure_computing() should always be called first in syscall_trace_enter().

If secure_computing() returns -1, we should stop further handling. Then
that system call may eventually fail with a specified return value (errno),
be trapped or the process itself be killed depending on loaded rules.
In these cases, syscall_trace_enter() also returns -1, that results in
skiping a normal syscall handling as well as syscall_trace_exit().

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
---
 arch/arm64/Kconfig               |   14 ++++++++++++++
 arch/arm64/include/asm/seccomp.h |   25 +++++++++++++++++++++++++
 arch/arm64/include/asm/unistd.h  |    3 +++
 arch/arm64/kernel/ptrace.c       |    5 +++++
 4 files changed, 47 insertions(+)
 create mode 100644 arch/arm64/include/asm/seccomp.h

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 3a18571..eeac003 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -32,6 +32,7 @@ config ARM64
 	select HAVE_ARCH_AUDITSYSCALL
 	select HAVE_ARCH_JUMP_LABEL
 	select HAVE_ARCH_KGDB
+	select HAVE_ARCH_SECCOMP_FILTER
 	select HAVE_ARCH_TRACEHOOK
 	select HAVE_C_RECORDMCOUNT
 	select HAVE_DEBUG_BUGVERBOSE
@@ -259,6 +260,19 @@ config ARCH_HAS_CACHE_LINE_SIZE
 
 source "mm/Kconfig"
 
+config SECCOMP
+	bool "Enable seccomp to safely compute untrusted bytecode"
+	---help---
+	  This kernel feature is useful for number crunching applications
+	  that may need to compute untrusted bytecode during their
+	  execution. By using pipes or other transports made available to
+	  the process as file descriptors supporting the read/write
+	  syscalls, it's possible to isolate those applications in
+	  their own address space using seccomp. Once seccomp is
+	  enabled via prctl(PR_SET_SECCOMP), it cannot be disabled
+	  and the task is only allowed to execute a few safe syscalls
+	  defined by each seccomp mode.
+
 config XEN_DOM0
 	def_bool y
 	depends on XEN
diff --git a/arch/arm64/include/asm/seccomp.h b/arch/arm64/include/asm/seccomp.h
new file mode 100644
index 0000000..c76fac9
--- /dev/null
+++ b/arch/arm64/include/asm/seccomp.h
@@ -0,0 +1,25 @@
+/*
+ * arch/arm64/include/asm/seccomp.h
+ *
+ * Copyright (C) 2014 Linaro Limited
+ * Author: AKASHI Takahiro <takahiro.akashi@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#ifndef _ASM_SECCOMP_H
+#define _ASM_SECCOMP_H
+
+#include <asm/unistd.h>
+
+#ifdef CONFIG_COMPAT
+#define __NR_seccomp_read_32		__NR_compat_read
+#define __NR_seccomp_write_32		__NR_compat_write
+#define __NR_seccomp_exit_32		__NR_compat_exit
+#define __NR_seccomp_sigreturn_32	__NR_compat_rt_sigreturn
+#endif /* CONFIG_COMPAT */
+
+#include <asm-generic/seccomp.h>
+
+#endif /* _ASM_SECCOMP_H */
diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
index c980ab7..729c155 100644
--- a/arch/arm64/include/asm/unistd.h
+++ b/arch/arm64/include/asm/unistd.h
@@ -31,6 +31,9 @@
  * Compat syscall numbers used by the AArch64 kernel.
  */
 #define __NR_compat_restart_syscall	0
+#define __NR_compat_exit		1
+#define __NR_compat_read		3
+#define __NR_compat_write		4
 #define __NR_compat_sigreturn		119
 #define __NR_compat_rt_sigreturn	173
 
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index 100d7d1..e477f6f 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -28,6 +28,7 @@
 #include <linux/smp.h>
 #include <linux/ptrace.h>
 #include <linux/user.h>
+#include <linux/seccomp.h>
 #include <linux/security.h>
 #include <linux/init.h>
 #include <linux/signal.h>
@@ -1115,6 +1116,10 @@ asmlinkage int syscall_trace_enter(struct pt_regs *regs)
 	saved_x0 = regs->regs[0];
 	saved_x8 = regs->regs[8];
 
+	if (secure_computing(regs->syscallno) == -1)
+		/* seccomp failures shouldn't expose any additional code. */
+		return -1;
+
 	if (test_thread_flag(TIF_SYSCALL_TRACE))
 		tracehook_report_syscall(regs, PTRACE_SYSCALL_ENTER);
 
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH v5 1/3] arm64: ptrace: reload a syscall number after ptrace operations
  2014-07-22  9:14 ` [PATCH v5 1/3] arm64: ptrace: reload a syscall number after ptrace operations AKASHI Takahiro
@ 2014-07-22 20:15   ` Kees Cook
  2014-07-23  7:03     ` AKASHI Takahiro
  2014-07-24  3:54   ` Andy Lutomirski
  1 sibling, 1 reply; 36+ messages in thread
From: Kees Cook @ 2014-07-22 20:15 UTC (permalink / raw)
  To: AKASHI Takahiro
  Cc: Will Drewry, Catalin Marinas, Will Deacon, dsaxena,
	linux-arm-kernel, linaro-kernel, LKML

On Tue, Jul 22, 2014 at 2:14 AM, AKASHI Takahiro
<takahiro.akashi@linaro.org> wrote:
> Arm64 holds a syscall number in w8(x8) register. Ptrace tracer may change
> its value either to:
>   * any valid syscall number to alter a system call, or
>   * -1 to skip a system call
>
> This patch implements this behavior by reloading that value into syscallno
> in struct pt_regs after tracehook_report_syscall_entry() or
> secure_computing(). In case of '-1', a return value of system call can also
> be changed by the tracer setting the value to x0 register, and so
> sys_ni_nosyscall() should not be called.
>
> See also:
>     42309ab4, ARM: 8087/1: ptrace: reload syscall number after
>               secure_computing() check
>
> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> ---
>  arch/arm64/kernel/entry.S  |    2 ++
>  arch/arm64/kernel/ptrace.c |   13 +++++++++++++
>  2 files changed, 15 insertions(+)
>
> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index 5141e79..de8bdbc 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -628,6 +628,8 @@ ENDPROC(el0_svc)
>  __sys_trace:
>         mov     x0, sp
>         bl      syscall_trace_enter
> +       cmp     w0, #-1                         // skip syscall?
> +       b.eq    ret_to_user
>         adr     lr, __sys_trace_return          // return address
>         uxtw    scno, w0                        // syscall number (possibly new)
>         mov     x1, sp                          // pointer to regs
> diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
> index 70526cf..100d7d1 100644
> --- a/arch/arm64/kernel/ptrace.c
> +++ b/arch/arm64/kernel/ptrace.c
> @@ -21,6 +21,7 @@
>
>  #include <linux/audit.h>
>  #include <linux/compat.h>
> +#include <linux/errno.h>
>  #include <linux/kernel.h>
>  #include <linux/sched.h>
>  #include <linux/mm.h>
> @@ -1109,9 +1110,21 @@ static void tracehook_report_syscall(struct pt_regs *regs,
>
>  asmlinkage int syscall_trace_enter(struct pt_regs *regs)
>  {
> +       unsigned long saved_x0, saved_x8;
> +
> +       saved_x0 = regs->regs[0];
> +       saved_x8 = regs->regs[8];
> +
>         if (test_thread_flag(TIF_SYSCALL_TRACE))
>                 tracehook_report_syscall(regs, PTRACE_SYSCALL_ENTER);
>
> +       regs->syscallno = regs->regs[8];
> +       if ((long)regs->syscallno == ~0UL) { /* skip this syscall */
> +               regs->regs[8] = saved_x8;
> +               if (regs->regs[0] == saved_x0) /* not changed by user */
> +                       regs->regs[0] = -ENOSYS;

I'm not sure this is right compared to other architectures. Generally
when a tracer performs a syscall skip, it's up to them to also adjust
the return value. They may want to be faking a syscall, and what if
the value they want to return happens to be what x0 was going into the
tracer? It would have no way to avoid this -ENOSYS case. I think
things are fine without this test.

-Kees

> +       }
> +
>         if (test_thread_flag(TIF_SYSCALL_TRACEPOINT))
>                 trace_sys_enter(regs, regs->syscallno);
>
> --
> 1.7.9.5
>



-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v5 0/3] arm64: Add seccomp support
  2014-07-22  9:14 [PATCH v5 0/3] arm64: Add seccomp support AKASHI Takahiro
                   ` (2 preceding siblings ...)
  2014-07-22  9:14 ` [PATCH v5 3/3] arm64: Add seccomp support AKASHI Takahiro
@ 2014-07-22 20:16 ` Kees Cook
  2014-07-23  7:09   ` AKASHI Takahiro
  3 siblings, 1 reply; 36+ messages in thread
From: Kees Cook @ 2014-07-22 20:16 UTC (permalink / raw)
  To: AKASHI Takahiro
  Cc: Will Drewry, Catalin Marinas, Will Deacon, dsaxena,
	linux-arm-kernel, linaro-kernel, LKML

On Tue, Jul 22, 2014 at 2:14 AM, AKASHI Takahiro
<takahiro.akashi@linaro.org> wrote:
> (Please apply this patch after my audit patch in order to avoid some
> conflict on arm64/Kconfig.)
>
> This patch enables secure computing (system call filtering) on arm64.
> System calls can be allowed or denied by loaded bpf-style rules.
> Architecture specific part is to run secure_computing() on syscall entry
> and check the result. See [3/3]

Thanks for working on this!

> Prerequisites are:
>  * "arm64: Add audit support" patch
>
> This code is tested on ARMv8 fast model using
>  * libseccomp v2.1.1 with modifications for arm64 and verified by its "live"
>    tests, 20, 21 and 24.
>  * modified version of Kees' seccomp test for 'changing/skipping a syscall'
>    behavior

Would you be able to share this? I'd love to add it to the seccomp
regression suite for the arm64-specific parts.

Thanks!

-Kees

>
> Changes v4 -> v5:
> * rebased to v3.16-rc
> * add patch [1/3] to allow ptrace to change a system call
>   (please note that this patch should be applied even without seccomp.)
>
> Changes v3 -> v4:
> * removed the following patch and moved it to "arm64: prerequisites for
>   audit and ftrace" patchset since it is required for audit and ftrace in
>   case of !COMPAT, too.
>   "arm64: is_compat_task is defined both in asm/compat.h and linux/compat.h"
>
> Changes v2 -> v3:
> * removed unnecessary 'type cast' operations [2/3]
> * check for a return value (-1) of secure_computing() explicitly [2/3]
> * aligned with the patch, "arm64: split syscall_trace() into separate
>   functions for enter/exit" [2/3]
> * changed default of CONFIG_SECCOMP to n [2/3]
>
> Changes v1 -> v2:
> * added generic seccomp.h for arm64 to utilize it [1,2/3]
> * changed syscall_trace() to return more meaningful value (-EPERM)
>   on seccomp failure case [2/3]
> * aligned with the change in "arm64: make a single hook to syscall_trace()
>   for all syscall features" v2 [2/3]
> * removed is_compat_task() definition from compat.h [3/3]
>
> AKASHI Takahiro (3):
>   arm64: ptrace: reload a syscall number after ptrace operations
>   asm-generic: Add generic seccomp.h for secure computing mode 1
>   arm64: Add seccomp support
>
>  arch/arm64/Kconfig               |   14 ++++++++++++++
>  arch/arm64/include/asm/seccomp.h |   25 +++++++++++++++++++++++++
>  arch/arm64/include/asm/unistd.h  |    3 +++
>  arch/arm64/kernel/entry.S        |    2 ++
>  arch/arm64/kernel/ptrace.c       |   18 ++++++++++++++++++
>  include/asm-generic/seccomp.h    |   28 ++++++++++++++++++++++++++++
>  6 files changed, 90 insertions(+)
>  create mode 100644 arch/arm64/include/asm/seccomp.h
>  create mode 100644 include/asm-generic/seccomp.h
>
> --
> 1.7.9.5
>



-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v5 1/3] arm64: ptrace: reload a syscall number after ptrace operations
  2014-07-22 20:15   ` Kees Cook
@ 2014-07-23  7:03     ` AKASHI Takahiro
  2014-07-23  8:25       ` Will Deacon
  2014-07-23 15:13       ` Kees Cook
  0 siblings, 2 replies; 36+ messages in thread
From: AKASHI Takahiro @ 2014-07-23  7:03 UTC (permalink / raw)
  To: Kees Cook
  Cc: Will Drewry, Catalin Marinas, Will Deacon, dsaxena,
	linux-arm-kernel, linaro-kernel, LKML

On 07/23/2014 05:15 AM, Kees Cook wrote:
> On Tue, Jul 22, 2014 at 2:14 AM, AKASHI Takahiro
> <takahiro.akashi@linaro.org> wrote:
>> Arm64 holds a syscall number in w8(x8) register. Ptrace tracer may change
>> its value either to:
>>    * any valid syscall number to alter a system call, or
>>    * -1 to skip a system call
>>
>> This patch implements this behavior by reloading that value into syscallno
>> in struct pt_regs after tracehook_report_syscall_entry() or
>> secure_computing(). In case of '-1', a return value of system call can also
>> be changed by the tracer setting the value to x0 register, and so
>> sys_ni_nosyscall() should not be called.
>>
>> See also:
>>      42309ab4, ARM: 8087/1: ptrace: reload syscall number after
>>                secure_computing() check
>>
>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>> ---
>>   arch/arm64/kernel/entry.S  |    2 ++
>>   arch/arm64/kernel/ptrace.c |   13 +++++++++++++
>>   2 files changed, 15 insertions(+)
>>
>> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
>> index 5141e79..de8bdbc 100644
>> --- a/arch/arm64/kernel/entry.S
>> +++ b/arch/arm64/kernel/entry.S
>> @@ -628,6 +628,8 @@ ENDPROC(el0_svc)
>>   __sys_trace:
>>          mov     x0, sp
>>          bl      syscall_trace_enter
>> +       cmp     w0, #-1                         // skip syscall?
>> +       b.eq    ret_to_user
>>          adr     lr, __sys_trace_return          // return address
>>          uxtw    scno, w0                        // syscall number (possibly new)
>>          mov     x1, sp                          // pointer to regs
>> diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
>> index 70526cf..100d7d1 100644
>> --- a/arch/arm64/kernel/ptrace.c
>> +++ b/arch/arm64/kernel/ptrace.c
>> @@ -21,6 +21,7 @@
>>
>>   #include <linux/audit.h>
>>   #include <linux/compat.h>
>> +#include <linux/errno.h>
>>   #include <linux/kernel.h>
>>   #include <linux/sched.h>
>>   #include <linux/mm.h>
>> @@ -1109,9 +1110,21 @@ static void tracehook_report_syscall(struct pt_regs *regs,
>>
>>   asmlinkage int syscall_trace_enter(struct pt_regs *regs)
>>   {
>> +       unsigned long saved_x0, saved_x8;
>> +
>> +       saved_x0 = regs->regs[0];
>> +       saved_x8 = regs->regs[8];
>> +
>>          if (test_thread_flag(TIF_SYSCALL_TRACE))
>>                  tracehook_report_syscall(regs, PTRACE_SYSCALL_ENTER);
>>
>> +       regs->syscallno = regs->regs[8];
>> +       if ((long)regs->syscallno == ~0UL) { /* skip this syscall */
>> +               regs->regs[8] = saved_x8;
>> +               if (regs->regs[0] == saved_x0) /* not changed by user */
>> +                       regs->regs[0] = -ENOSYS;
>
> I'm not sure this is right compared to other architectures. Generally
> when a tracer performs a syscall skip, it's up to them to also adjust
> the return value. They may want to be faking a syscall, and what if
> the value they want to return happens to be what x0 was going into the
> tracer? It would have no way to avoid this -ENOSYS case. I think
> things are fine without this test.

Yeah, I know this issue, but was not sure that setting a return value
is mandatory. (x86 seems to return -ENOSYS by default if not explicitly
specified.)
Is "fake a system call" a more appropriate word than "skip"?

I will defer to Will.

Thanks,
-Takahiro AKASHI

> -Kees
>
>> +       }
>> +
>>          if (test_thread_flag(TIF_SYSCALL_TRACEPOINT))
>>                  trace_sys_enter(regs, regs->syscallno);
>>
>> --
>> 1.7.9.5
>>
>
>
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v5 0/3] arm64: Add seccomp support
  2014-07-22 20:16 ` [PATCH v5 0/3] " Kees Cook
@ 2014-07-23  7:09   ` AKASHI Takahiro
  2014-07-23 15:36     ` Kees Cook
  0 siblings, 1 reply; 36+ messages in thread
From: AKASHI Takahiro @ 2014-07-23  7:09 UTC (permalink / raw)
  To: Kees Cook
  Cc: Will Drewry, Catalin Marinas, Will Deacon, dsaxena,
	linux-arm-kernel, linaro-kernel, LKML

On 07/23/2014 05:16 AM, Kees Cook wrote:
> On Tue, Jul 22, 2014 at 2:14 AM, AKASHI Takahiro
> <takahiro.akashi@linaro.org> wrote:
>> (Please apply this patch after my audit patch in order to avoid some
>> conflict on arm64/Kconfig.)
>>
>> This patch enables secure computing (system call filtering) on arm64.
>> System calls can be allowed or denied by loaded bpf-style rules.
>> Architecture specific part is to run secure_computing() on syscall entry
>> and check the result. See [3/3]
>
> Thanks for working on this!
>
>> Prerequisites are:
>>   * "arm64: Add audit support" patch
>>
>> This code is tested on ARMv8 fast model using
>>   * libseccomp v2.1.1 with modifications for arm64 and verified by its "live"
>>     tests, 20, 21 and 24.
>>   * modified version of Kees' seccomp test for 'changing/skipping a syscall'
>>     behavior
>
> Would you be able to share this? I'd love to add it to the seccomp
> regression suite for the arm64-specific parts.

Yep, I forked your repo here:
https://github.com/t-akashi/seccomp.git
(See trace_arm64 branch)

Thanks,
-Takahiro AKASHI

> Thanks!
>
> -Kees
>
>>
>> Changes v4 -> v5:
>> * rebased to v3.16-rc
>> * add patch [1/3] to allow ptrace to change a system call
>>    (please note that this patch should be applied even without seccomp.)
>>
>> Changes v3 -> v4:
>> * removed the following patch and moved it to "arm64: prerequisites for
>>    audit and ftrace" patchset since it is required for audit and ftrace in
>>    case of !COMPAT, too.
>>    "arm64: is_compat_task is defined both in asm/compat.h and linux/compat.h"
>>
>> Changes v2 -> v3:
>> * removed unnecessary 'type cast' operations [2/3]
>> * check for a return value (-1) of secure_computing() explicitly [2/3]
>> * aligned with the patch, "arm64: split syscall_trace() into separate
>>    functions for enter/exit" [2/3]
>> * changed default of CONFIG_SECCOMP to n [2/3]
>>
>> Changes v1 -> v2:
>> * added generic seccomp.h for arm64 to utilize it [1,2/3]
>> * changed syscall_trace() to return more meaningful value (-EPERM)
>>    on seccomp failure case [2/3]
>> * aligned with the change in "arm64: make a single hook to syscall_trace()
>>    for all syscall features" v2 [2/3]
>> * removed is_compat_task() definition from compat.h [3/3]
>>
>> AKASHI Takahiro (3):
>>    arm64: ptrace: reload a syscall number after ptrace operations
>>    asm-generic: Add generic seccomp.h for secure computing mode 1
>>    arm64: Add seccomp support
>>
>>   arch/arm64/Kconfig               |   14 ++++++++++++++
>>   arch/arm64/include/asm/seccomp.h |   25 +++++++++++++++++++++++++
>>   arch/arm64/include/asm/unistd.h  |    3 +++
>>   arch/arm64/kernel/entry.S        |    2 ++
>>   arch/arm64/kernel/ptrace.c       |   18 ++++++++++++++++++
>>   include/asm-generic/seccomp.h    |   28 ++++++++++++++++++++++++++++
>>   6 files changed, 90 insertions(+)
>>   create mode 100644 arch/arm64/include/asm/seccomp.h
>>   create mode 100644 include/asm-generic/seccomp.h
>>
>> --
>> 1.7.9.5
>>
>
>
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v5 1/3] arm64: ptrace: reload a syscall number after ptrace operations
  2014-07-23  7:03     ` AKASHI Takahiro
@ 2014-07-23  8:25       ` Will Deacon
  2014-07-23  9:09         ` AKASHI Takahiro
  2014-07-23 15:13       ` Kees Cook
  1 sibling, 1 reply; 36+ messages in thread
From: Will Deacon @ 2014-07-23  8:25 UTC (permalink / raw)
  To: AKASHI Takahiro
  Cc: Kees Cook, Will Drewry, Catalin Marinas, dsaxena,
	linux-arm-kernel, linaro-kernel, LKML

On Wed, Jul 23, 2014 at 08:03:47AM +0100, AKASHI Takahiro wrote:
> On 07/23/2014 05:15 AM, Kees Cook wrote:
> > On Tue, Jul 22, 2014 at 2:14 AM, AKASHI Takahiro
> > <takahiro.akashi@linaro.org> wrote:
> >>   asmlinkage int syscall_trace_enter(struct pt_regs *regs)
> >>   {
> >> +       unsigned long saved_x0, saved_x8;
> >> +
> >> +       saved_x0 = regs->regs[0];
> >> +       saved_x8 = regs->regs[8];
> >> +
> >>          if (test_thread_flag(TIF_SYSCALL_TRACE))
> >>                  tracehook_report_syscall(regs, PTRACE_SYSCALL_ENTER);
> >>
> >> +       regs->syscallno = regs->regs[8];
> >> +       if ((long)regs->syscallno == ~0UL) { /* skip this syscall */
> >> +               regs->regs[8] = saved_x8;
> >> +               if (regs->regs[0] == saved_x0) /* not changed by user */
> >> +                       regs->regs[0] = -ENOSYS;
> >
> > I'm not sure this is right compared to other architectures. Generally
> > when a tracer performs a syscall skip, it's up to them to also adjust
> > the return value. They may want to be faking a syscall, and what if
> > the value they want to return happens to be what x0 was going into the
> > tracer? It would have no way to avoid this -ENOSYS case. I think
> > things are fine without this test.
> 
> Yeah, I know this issue, but was not sure that setting a return value
> is mandatory. (x86 seems to return -ENOSYS by default if not explicitly
> specified.)
> Is "fake a system call" a more appropriate word than "skip"?
> 
> I will defer to Will.

I agree with Kees -- iirc, I only suggested restoring x8.

Will

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v5 1/3] arm64: ptrace: reload a syscall number after ptrace operations
  2014-07-23  8:25       ` Will Deacon
@ 2014-07-23  9:09         ` AKASHI Takahiro
  0 siblings, 0 replies; 36+ messages in thread
From: AKASHI Takahiro @ 2014-07-23  9:09 UTC (permalink / raw)
  To: Will Deacon
  Cc: Kees Cook, Will Drewry, Catalin Marinas, dsaxena,
	linux-arm-kernel, linaro-kernel, LKML

On 07/23/2014 05:25 PM, Will Deacon wrote:
> On Wed, Jul 23, 2014 at 08:03:47AM +0100, AKASHI Takahiro wrote:
>> On 07/23/2014 05:15 AM, Kees Cook wrote:
>>> On Tue, Jul 22, 2014 at 2:14 AM, AKASHI Takahiro
>>> <takahiro.akashi@linaro.org> wrote:
>>>>    asmlinkage int syscall_trace_enter(struct pt_regs *regs)
>>>>    {
>>>> +       unsigned long saved_x0, saved_x8;
>>>> +
>>>> +       saved_x0 = regs->regs[0];
>>>> +       saved_x8 = regs->regs[8];
>>>> +
>>>>           if (test_thread_flag(TIF_SYSCALL_TRACE))
>>>>                   tracehook_report_syscall(regs, PTRACE_SYSCALL_ENTER);
>>>>
>>>> +       regs->syscallno = regs->regs[8];
>>>> +       if ((long)regs->syscallno == ~0UL) { /* skip this syscall */
>>>> +               regs->regs[8] = saved_x8;
>>>> +               if (regs->regs[0] == saved_x0) /* not changed by user */
>>>> +                       regs->regs[0] = -ENOSYS;
>>>
>>> I'm not sure this is right compared to other architectures. Generally
>>> when a tracer performs a syscall skip, it's up to them to also adjust
>>> the return value. They may want to be faking a syscall, and what if
>>> the value they want to return happens to be what x0 was going into the
>>> tracer? It would have no way to avoid this -ENOSYS case. I think
>>> things are fine without this test.
>>
>> Yeah, I know this issue, but was not sure that setting a return value
>> is mandatory. (x86 seems to return -ENOSYS by default if not explicitly
>> specified.)
>> Is "fake a system call" a more appropriate word than "skip"?
>>
>> I will defer to Will.
>
> I agree with Kees -- iirc, I only suggested restoring x8.

OK.

-Takahiro AKASHI

> Will
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v5 1/3] arm64: ptrace: reload a syscall number after ptrace operations
  2014-07-23  7:03     ` AKASHI Takahiro
  2014-07-23  8:25       ` Will Deacon
@ 2014-07-23 15:13       ` Kees Cook
  1 sibling, 0 replies; 36+ messages in thread
From: Kees Cook @ 2014-07-23 15:13 UTC (permalink / raw)
  To: AKASHI Takahiro
  Cc: Will Drewry, Catalin Marinas, Will Deacon, dsaxena,
	linux-arm-kernel, linaro-kernel, LKML

On Wed, Jul 23, 2014 at 12:03 AM, AKASHI Takahiro
<takahiro.akashi@linaro.org> wrote:
> On 07/23/2014 05:15 AM, Kees Cook wrote:
>>
>> On Tue, Jul 22, 2014 at 2:14 AM, AKASHI Takahiro
>> <takahiro.akashi@linaro.org> wrote:
>>>
>>> Arm64 holds a syscall number in w8(x8) register. Ptrace tracer may change
>>> its value either to:
>>>    * any valid syscall number to alter a system call, or
>>>    * -1 to skip a system call
>>>
>>> This patch implements this behavior by reloading that value into
>>> syscallno
>>> in struct pt_regs after tracehook_report_syscall_entry() or
>>> secure_computing(). In case of '-1', a return value of system call can
>>> also
>>> be changed by the tracer setting the value to x0 register, and so
>>> sys_ni_nosyscall() should not be called.
>>>
>>> See also:
>>>      42309ab4, ARM: 8087/1: ptrace: reload syscall number after
>>>                secure_computing() check
>>>
>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>> ---
>>>   arch/arm64/kernel/entry.S  |    2 ++
>>>   arch/arm64/kernel/ptrace.c |   13 +++++++++++++
>>>   2 files changed, 15 insertions(+)
>>>
>>> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
>>> index 5141e79..de8bdbc 100644
>>> --- a/arch/arm64/kernel/entry.S
>>> +++ b/arch/arm64/kernel/entry.S
>>> @@ -628,6 +628,8 @@ ENDPROC(el0_svc)
>>>   __sys_trace:
>>>          mov     x0, sp
>>>          bl      syscall_trace_enter
>>> +       cmp     w0, #-1                         // skip syscall?
>>> +       b.eq    ret_to_user
>>>          adr     lr, __sys_trace_return          // return address
>>>          uxtw    scno, w0                        // syscall number
>>> (possibly new)
>>>          mov     x1, sp                          // pointer to regs
>>> diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
>>> index 70526cf..100d7d1 100644
>>> --- a/arch/arm64/kernel/ptrace.c
>>> +++ b/arch/arm64/kernel/ptrace.c
>>> @@ -21,6 +21,7 @@
>>>
>>>   #include <linux/audit.h>
>>>   #include <linux/compat.h>
>>> +#include <linux/errno.h>
>>>   #include <linux/kernel.h>
>>>   #include <linux/sched.h>
>>>   #include <linux/mm.h>
>>> @@ -1109,9 +1110,21 @@ static void tracehook_report_syscall(struct
>>> pt_regs *regs,
>>>
>>>   asmlinkage int syscall_trace_enter(struct pt_regs *regs)
>>>   {
>>> +       unsigned long saved_x0, saved_x8;
>>> +
>>> +       saved_x0 = regs->regs[0];
>>> +       saved_x8 = regs->regs[8];
>>> +
>>>          if (test_thread_flag(TIF_SYSCALL_TRACE))
>>>                  tracehook_report_syscall(regs, PTRACE_SYSCALL_ENTER);
>>>
>>> +       regs->syscallno = regs->regs[8];
>>> +       if ((long)regs->syscallno == ~0UL) { /* skip this syscall */
>>> +               regs->regs[8] = saved_x8;
>>> +               if (regs->regs[0] == saved_x0) /* not changed by user */
>>> +                       regs->regs[0] = -ENOSYS;
>>
>>
>> I'm not sure this is right compared to other architectures. Generally
>> when a tracer performs a syscall skip, it's up to them to also adjust
>> the return value. They may want to be faking a syscall, and what if
>> the value they want to return happens to be what x0 was going into the
>> tracer? It would have no way to avoid this -ENOSYS case. I think
>> things are fine without this test.
>
>
> Yeah, I know this issue, but was not sure that setting a return value
> is mandatory. (x86 seems to return -ENOSYS by default if not explicitly
> specified.)
> Is "fake a system call" a more appropriate word than "skip"?

I think this is just a matter of semantics and perspective. From the
kernel's perspective, it's always a "skip" since the syscall is never
actually executed. But from the perspective of userspace, it's really
up to the tracer to decide how it should be seen: the tracer could
return -ENOSYS, or a fake return value, etc. But generally, I think
"skip" is the most accurate term for this.

-Kees

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v5 0/3] arm64: Add seccomp support
  2014-07-23  7:09   ` AKASHI Takahiro
@ 2014-07-23 15:36     ` Kees Cook
  0 siblings, 0 replies; 36+ messages in thread
From: Kees Cook @ 2014-07-23 15:36 UTC (permalink / raw)
  To: AKASHI Takahiro
  Cc: Will Drewry, Catalin Marinas, Will Deacon, Deepak Saxena,
	linux-arm-kernel, linaro-kernel, LKML

On Wed, Jul 23, 2014 at 12:09 AM, AKASHI Takahiro
<takahiro.akashi@linaro.org> wrote:
> On 07/23/2014 05:16 AM, Kees Cook wrote:
>>
>> On Tue, Jul 22, 2014 at 2:14 AM, AKASHI Takahiro
>> <takahiro.akashi@linaro.org> wrote:
>>>
>>> (Please apply this patch after my audit patch in order to avoid some
>>> conflict on arm64/Kconfig.)
>>>
>>> This patch enables secure computing (system call filtering) on arm64.
>>> System calls can be allowed or denied by loaded bpf-style rules.
>>> Architecture specific part is to run secure_computing() on syscall entry
>>> and check the result. See [3/3]
>>
>>
>> Thanks for working on this!
>>
>>> Prerequisites are:
>>>   * "arm64: Add audit support" patch
>>>
>>> This code is tested on ARMv8 fast model using
>>>   * libseccomp v2.1.1 with modifications for arm64 and verified by its
>>> "live"
>>>     tests, 20, 21 and 24.
>>>   * modified version of Kees' seccomp test for 'changing/skipping a
>>> syscall'
>>>     behavior
>>
>>
>> Would you be able to share this? I'd love to add it to the seccomp
>> regression suite for the arm64-specific parts.
>
>
> Yep, I forked your repo here:
> https://github.com/t-akashi/seccomp.git
> (See trace_arm64 branch)

Great, thanks! I'll incorporate your changes into my trace branch. (It
looks like using PTRACE_GETREGSET works on all archs, so I'll switch
to using that for all.)

-Kees

>
> Thanks,
> -Takahiro AKASHI
>
>
>> Thanks!
>>
>> -Kees
>>
>>>
>>> Changes v4 -> v5:
>>> * rebased to v3.16-rc
>>> * add patch [1/3] to allow ptrace to change a system call
>>>    (please note that this patch should be applied even without seccomp.)
>>>
>>> Changes v3 -> v4:
>>> * removed the following patch and moved it to "arm64: prerequisites for
>>>    audit and ftrace" patchset since it is required for audit and ftrace
>>> in
>>>    case of !COMPAT, too.
>>>    "arm64: is_compat_task is defined both in asm/compat.h and
>>> linux/compat.h"
>>>
>>> Changes v2 -> v3:
>>> * removed unnecessary 'type cast' operations [2/3]
>>> * check for a return value (-1) of secure_computing() explicitly [2/3]
>>> * aligned with the patch, "arm64: split syscall_trace() into separate
>>>    functions for enter/exit" [2/3]
>>> * changed default of CONFIG_SECCOMP to n [2/3]
>>>
>>> Changes v1 -> v2:
>>> * added generic seccomp.h for arm64 to utilize it [1,2/3]
>>> * changed syscall_trace() to return more meaningful value (-EPERM)
>>>    on seccomp failure case [2/3]
>>> * aligned with the change in "arm64: make a single hook to
>>> syscall_trace()
>>>    for all syscall features" v2 [2/3]
>>> * removed is_compat_task() definition from compat.h [3/3]
>>>
>>> AKASHI Takahiro (3):
>>>    arm64: ptrace: reload a syscall number after ptrace operations
>>>    asm-generic: Add generic seccomp.h for secure computing mode 1
>>>    arm64: Add seccomp support
>>>
>>>   arch/arm64/Kconfig               |   14 ++++++++++++++
>>>   arch/arm64/include/asm/seccomp.h |   25 +++++++++++++++++++++++++
>>>   arch/arm64/include/asm/unistd.h  |    3 +++
>>>   arch/arm64/kernel/entry.S        |    2 ++
>>>   arch/arm64/kernel/ptrace.c       |   18 ++++++++++++++++++
>>>   include/asm-generic/seccomp.h    |   28 ++++++++++++++++++++++++++++
>>>   6 files changed, 90 insertions(+)
>>>   create mode 100644 arch/arm64/include/asm/seccomp.h
>>>   create mode 100644 include/asm-generic/seccomp.h
>>>
>>> --
>>> 1.7.9.5
>>>
>>
>>
>>
>



-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v5 2/3] asm-generic: Add generic seccomp.h for secure computing mode 1
  2014-07-22  9:14 ` [PATCH v5 2/3] asm-generic: Add generic seccomp.h for secure computing mode 1 AKASHI Takahiro
@ 2014-07-24  3:40   ` Andy Lutomirski
  2014-07-24  4:41     ` Kees Cook
  0 siblings, 1 reply; 36+ messages in thread
From: Andy Lutomirski @ 2014-07-24  3:40 UTC (permalink / raw)
  To: AKASHI Takahiro, wad, catalin.marinas, will.deacon, keescook
  Cc: dsaxena, linux-arm-kernel, linaro-kernel, linux-kernel

On 07/22/2014 02:14 AM, AKASHI Takahiro wrote:
> Those values (__NR_seccomp_*) are used solely in secure_computing()
> to identify mode 1 system calls. If compat system calls have different
> syscall numbers, asm/seccomp.h may override them.
>
> Acked-by: Arnd Bergmann <arnd@arndb.de>
> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> ---
>   include/asm-generic/seccomp.h |   28 ++++++++++++++++++++++++++++
>   1 file changed, 28 insertions(+)
>   create mode 100644 include/asm-generic/seccomp.h
>
> diff --git a/include/asm-generic/seccomp.h b/include/asm-generic/seccomp.h
> new file mode 100644
> index 0000000..5e97022
> --- /dev/null
> +++ b/include/asm-generic/seccomp.h
> @@ -0,0 +1,28 @@
> +/*
> + * include/asm-generic/seccomp.h
> + *
> + * Copyright (C) 2014 Linaro Limited
> + * Author: AKASHI Takahiro <takahiro.akashi@linaro.org>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +#ifndef _ASM_GENERIC_SECCOMP_H
> +#define _ASM_GENERIC_SECCOMP_H
> +
> +#include <asm-generic/unistd.h>
> +
> +#if defined(CONFIG_COMPAT) && !defined(__NR_seccomp_read_32)
> +#define __NR_seccomp_read_32		__NR_read
> +#define __NR_seccomp_write_32		__NR_write
> +#define __NR_seccomp_exit_32		__NR_exit
> +#define __NR_seccomp_sigreturn_32	__NR_rt_sigreturn
> +#endif /* CONFIG_COMPAT && ! already defined */
> +
> +#define __NR_seccomp_read		__NR_read
> +#define __NR_seccomp_write		__NR_write
> +#define __NR_seccomp_exit		__NR_exit
> +#define __NR_seccomp_sigreturn		__NR_rt_sigreturn

I don't like these names.  __NR_seccomp_read sounds like the number of a 
syscall called seccomp_read.

Also, shouldn't something be including this header?  I'm confused.

--Andy

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v5 3/3] arm64: Add seccomp support
  2014-07-22  9:14 ` [PATCH v5 3/3] arm64: Add seccomp support AKASHI Takahiro
@ 2014-07-24  3:52   ` Andy Lutomirski
  2014-07-24  5:40     ` AKASHI Takahiro
  0 siblings, 1 reply; 36+ messages in thread
From: Andy Lutomirski @ 2014-07-24  3:52 UTC (permalink / raw)
  To: AKASHI Takahiro, wad, catalin.marinas, will.deacon, keescook
  Cc: dsaxena, linux-arm-kernel, linaro-kernel, linux-kernel

On 07/22/2014 02:14 AM, AKASHI Takahiro wrote:
> secure_computing() should always be called first in syscall_trace_enter().
>
> If secure_computing() returns -1, we should stop further handling. Then
> that system call may eventually fail with a specified return value (errno),
> be trapped or the process itself be killed depending on loaded rules.
> In these cases, syscall_trace_enter() also returns -1, that results in
> skiping a normal syscall handling as well as syscall_trace_exit().
>
> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> ---
>   arch/arm64/Kconfig               |   14 ++++++++++++++
>   arch/arm64/include/asm/seccomp.h |   25 +++++++++++++++++++++++++
>   arch/arm64/include/asm/unistd.h  |    3 +++
>   arch/arm64/kernel/ptrace.c       |    5 +++++
>   4 files changed, 47 insertions(+)
>   create mode 100644 arch/arm64/include/asm/seccomp.h
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 3a18571..eeac003 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -32,6 +32,7 @@ config ARM64
>   	select HAVE_ARCH_AUDITSYSCALL
>   	select HAVE_ARCH_JUMP_LABEL
>   	select HAVE_ARCH_KGDB
> +	select HAVE_ARCH_SECCOMP_FILTER
>   	select HAVE_ARCH_TRACEHOOK
>   	select HAVE_C_RECORDMCOUNT
>   	select HAVE_DEBUG_BUGVERBOSE
> @@ -259,6 +260,19 @@ config ARCH_HAS_CACHE_LINE_SIZE
>
>   source "mm/Kconfig"
>
> +config SECCOMP
> +	bool "Enable seccomp to safely compute untrusted bytecode"
> +	---help---
> +	  This kernel feature is useful for number crunching applications
> +	  that may need to compute untrusted bytecode during their
> +	  execution. By using pipes or other transports made available to
> +	  the process as file descriptors supporting the read/write
> +	  syscalls, it's possible to isolate those applications in
> +	  their own address space using seccomp. Once seccomp is
> +	  enabled via prctl(PR_SET_SECCOMP), it cannot be disabled
> +	  and the task is only allowed to execute a few safe syscalls
> +	  defined by each seccomp mode.
> +
>   config XEN_DOM0
>   	def_bool y
>   	depends on XEN
> diff --git a/arch/arm64/include/asm/seccomp.h b/arch/arm64/include/asm/seccomp.h
> new file mode 100644
> index 0000000..c76fac9
> --- /dev/null
> +++ b/arch/arm64/include/asm/seccomp.h
> @@ -0,0 +1,25 @@
> +/*
> + * arch/arm64/include/asm/seccomp.h
> + *
> + * Copyright (C) 2014 Linaro Limited
> + * Author: AKASHI Takahiro <takahiro.akashi@linaro.org>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +#ifndef _ASM_SECCOMP_H
> +#define _ASM_SECCOMP_H
> +
> +#include <asm/unistd.h>
> +
> +#ifdef CONFIG_COMPAT
> +#define __NR_seccomp_read_32		__NR_compat_read
> +#define __NR_seccomp_write_32		__NR_compat_write
> +#define __NR_seccomp_exit_32		__NR_compat_exit
> +#define __NR_seccomp_sigreturn_32	__NR_compat_rt_sigreturn
> +#endif /* CONFIG_COMPAT */
> +
> +#include <asm-generic/seccomp.h>
> +
> +#endif /* _ASM_SECCOMP_H */
> diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
> index c980ab7..729c155 100644
> --- a/arch/arm64/include/asm/unistd.h
> +++ b/arch/arm64/include/asm/unistd.h
> @@ -31,6 +31,9 @@
>    * Compat syscall numbers used by the AArch64 kernel.
>    */
>   #define __NR_compat_restart_syscall	0
> +#define __NR_compat_exit		1
> +#define __NR_compat_read		3
> +#define __NR_compat_write		4
>   #define __NR_compat_sigreturn		119
>   #define __NR_compat_rt_sigreturn	173
>
> diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
> index 100d7d1..e477f6f 100644
> --- a/arch/arm64/kernel/ptrace.c
> +++ b/arch/arm64/kernel/ptrace.c
> @@ -28,6 +28,7 @@
>   #include <linux/smp.h>
>   #include <linux/ptrace.h>
>   #include <linux/user.h>
> +#include <linux/seccomp.h>
>   #include <linux/security.h>
>   #include <linux/init.h>
>   #include <linux/signal.h>
> @@ -1115,6 +1116,10 @@ asmlinkage int syscall_trace_enter(struct pt_regs *regs)
>   	saved_x0 = regs->regs[0];
>   	saved_x8 = regs->regs[8];
>
> +	if (secure_computing(regs->syscallno) == -1)
> +		/* seccomp failures shouldn't expose any additional code. */
> +		return -1;
> +

This will conflict with the fastpath stuff in Kees' tree.  (Actually, 
it's likely to apply cleanly, but fail to compile.)  The fix is trivial, 
but, given that the fastpath stuff is new, can you take a look and see 
if arm64 can use it effectively?

I suspect that the performance considerations are rather different on 
arm64 as compared to x86 (I really hope that x86 is the only 
architecture with the absurd sysret vs. iret distinction), but at least 
the seccomp_data stuff ought to help anywhere.  (It looks like there's a 
distinct fast path, too, so the two-phase thing might also be a fairly 
large win if it's supportable.)

See:

https://git.kernel.org/cgit/linux/kernel/git/kees/linux.git/log/?h=seccomp/fastpath

Also, I'll ask the usual question?  What are all of the factors other 
than nr and args that affect syscall execution?  What are the audit arch 
values?  Do they match correctly?

For example, it looks like, if arm64 adds OABI support, you'll have a 
problem.  (Note that arm currently disables audit and seccomp if OABI is 
enabled for exactly this reason.)

Do any syscall implementations care whether the user code is LE or BE? 
Are the arguments encoded the same way?

An arm-specific question: will there be any confusion as a result of the 
fact that compat syscalls seems to stick nr in w7, but arm64 puts them 
somewhere else?

--Andy

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v5 1/3] arm64: ptrace: reload a syscall number after ptrace operations
  2014-07-22  9:14 ` [PATCH v5 1/3] arm64: ptrace: reload a syscall number after ptrace operations AKASHI Takahiro
  2014-07-22 20:15   ` Kees Cook
@ 2014-07-24  3:54   ` Andy Lutomirski
  2014-07-24  5:57     ` AKASHI Takahiro
  1 sibling, 1 reply; 36+ messages in thread
From: Andy Lutomirski @ 2014-07-24  3:54 UTC (permalink / raw)
  To: AKASHI Takahiro, wad, catalin.marinas, will.deacon, keescook
  Cc: dsaxena, linux-arm-kernel, linaro-kernel, linux-kernel

On 07/22/2014 02:14 AM, AKASHI Takahiro wrote:
> Arm64 holds a syscall number in w8(x8) register. Ptrace tracer may change
> its value either to:
>    * any valid syscall number to alter a system call, or
>    * -1 to skip a system call
>
> This patch implements this behavior by reloading that value into syscallno
> in struct pt_regs after tracehook_report_syscall_entry() or
> secure_computing(). In case of '-1', a return value of system call can also
> be changed by the tracer setting the value to x0 register, and so
> sys_ni_nosyscall() should not be called.
>
> See also:
>      42309ab4, ARM: 8087/1: ptrace: reload syscall number after
> 	      secure_computing() check
>
> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> ---
>   arch/arm64/kernel/entry.S  |    2 ++
>   arch/arm64/kernel/ptrace.c |   13 +++++++++++++
>   2 files changed, 15 insertions(+)
>
> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index 5141e79..de8bdbc 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -628,6 +628,8 @@ ENDPROC(el0_svc)
>   __sys_trace:
>   	mov	x0, sp
>   	bl	syscall_trace_enter
> +	cmp	w0, #-1				// skip syscall?
> +	b.eq	ret_to_user

Does this mean that skipped syscalls will cause exit tracing to be 
skipped?  If so, then you risk (at least) introducing a nice 
user-triggerable OOPS if audit is enabled.  This bug existed for *years* 
on x86_32, and it amazes me that no one ever triggered it by accident. 
(Grr, audit.)

--Andy

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v5 2/3] asm-generic: Add generic seccomp.h for secure computing mode 1
  2014-07-24  3:40   ` Andy Lutomirski
@ 2014-07-24  4:41     ` Kees Cook
  2014-07-24  5:17       ` AKASHI Takahiro
  0 siblings, 1 reply; 36+ messages in thread
From: Kees Cook @ 2014-07-24  4:41 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: AKASHI Takahiro, Will Drewry, Catalin Marinas, Will Deacon,
	dsaxena, linux-arm-kernel, linaro-kernel, LKML

On Wed, Jul 23, 2014 at 8:40 PM, Andy Lutomirski <luto@amacapital.net> wrote:
> On 07/22/2014 02:14 AM, AKASHI Takahiro wrote:
>>
>> Those values (__NR_seccomp_*) are used solely in secure_computing()
>> to identify mode 1 system calls. If compat system calls have different
>> syscall numbers, asm/seccomp.h may override them.
>>
>> Acked-by: Arnd Bergmann <arnd@arndb.de>
>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>> ---
>>   include/asm-generic/seccomp.h |   28 ++++++++++++++++++++++++++++
>>   1 file changed, 28 insertions(+)
>>   create mode 100644 include/asm-generic/seccomp.h
>>
>> diff --git a/include/asm-generic/seccomp.h b/include/asm-generic/seccomp.h
>> new file mode 100644
>> index 0000000..5e97022
>> --- /dev/null
>> +++ b/include/asm-generic/seccomp.h
>> @@ -0,0 +1,28 @@
>> +/*
>> + * include/asm-generic/seccomp.h
>> + *
>> + * Copyright (C) 2014 Linaro Limited
>> + * Author: AKASHI Takahiro <takahiro.akashi@linaro.org>
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + */
>> +#ifndef _ASM_GENERIC_SECCOMP_H
>> +#define _ASM_GENERIC_SECCOMP_H
>> +
>> +#include <asm-generic/unistd.h>
>> +
>> +#if defined(CONFIG_COMPAT) && !defined(__NR_seccomp_read_32)
>> +#define __NR_seccomp_read_32           __NR_read
>> +#define __NR_seccomp_write_32          __NR_write
>> +#define __NR_seccomp_exit_32           __NR_exit
>> +#define __NR_seccomp_sigreturn_32      __NR_rt_sigreturn
>> +#endif /* CONFIG_COMPAT && ! already defined */
>> +
>> +#define __NR_seccomp_read              __NR_read
>> +#define __NR_seccomp_write             __NR_write
>> +#define __NR_seccomp_exit              __NR_exit
>> +#define __NR_seccomp_sigreturn         __NR_rt_sigreturn
>
>
> I don't like these names.  __NR_seccomp_read sounds like the number of a
> syscall called seccomp_read.
>
> Also, shouldn't something be including this header?  I'm confused.

Ah! Good catch. These names are correct (see kernel/seccomp.c's
mode1_syscalls and mode1_syscalls_32 arrays), but the location of this
change was unexpected. I was expecting this file to live in
arch/*/include/asm/seccomp.h, not in include/asm-generic/seccomp.h.

However, since it's always the same list, it might make sense to
consolidate them into a single place as a default to make arch porting
easier. However, I think that should be a separate patch.

-Kees

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v5 2/3] asm-generic: Add generic seccomp.h for secure computing mode 1
  2014-07-24  4:41     ` Kees Cook
@ 2014-07-24  5:17       ` AKASHI Takahiro
  2014-07-24 14:57         ` Andy Lutomirski
  0 siblings, 1 reply; 36+ messages in thread
From: AKASHI Takahiro @ 2014-07-24  5:17 UTC (permalink / raw)
  To: Kees Cook, Andy Lutomirski
  Cc: Will Drewry, Catalin Marinas, Will Deacon, dsaxena,
	linux-arm-kernel, linaro-kernel, LKML

On 07/24/2014 01:41 PM, Kees Cook wrote:
> On Wed, Jul 23, 2014 at 8:40 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>> On 07/22/2014 02:14 AM, AKASHI Takahiro wrote:
>>>
>>> Those values (__NR_seccomp_*) are used solely in secure_computing()
>>> to identify mode 1 system calls. If compat system calls have different
>>> syscall numbers, asm/seccomp.h may override them.
>>>
>>> Acked-by: Arnd Bergmann <arnd@arndb.de>
>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>> ---
>>>    include/asm-generic/seccomp.h |   28 ++++++++++++++++++++++++++++
>>>    1 file changed, 28 insertions(+)
>>>    create mode 100644 include/asm-generic/seccomp.h
>>>
>>> diff --git a/include/asm-generic/seccomp.h b/include/asm-generic/seccomp.h
>>> new file mode 100644
>>> index 0000000..5e97022
>>> --- /dev/null
>>> +++ b/include/asm-generic/seccomp.h
>>> @@ -0,0 +1,28 @@
>>> +/*
>>> + * include/asm-generic/seccomp.h
>>> + *
>>> + * Copyright (C) 2014 Linaro Limited
>>> + * Author: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>> + *
>>> + * This program is free software; you can redistribute it and/or modify
>>> + * it under the terms of the GNU General Public License version 2 as
>>> + * published by the Free Software Foundation.
>>> + */
>>> +#ifndef _ASM_GENERIC_SECCOMP_H
>>> +#define _ASM_GENERIC_SECCOMP_H
>>> +
>>> +#include <asm-generic/unistd.h>
>>> +
>>> +#if defined(CONFIG_COMPAT) && !defined(__NR_seccomp_read_32)
>>> +#define __NR_seccomp_read_32           __NR_read
>>> +#define __NR_seccomp_write_32          __NR_write
>>> +#define __NR_seccomp_exit_32           __NR_exit
>>> +#define __NR_seccomp_sigreturn_32      __NR_rt_sigreturn
>>> +#endif /* CONFIG_COMPAT && ! already defined */
>>> +
>>> +#define __NR_seccomp_read              __NR_read
>>> +#define __NR_seccomp_write             __NR_write
>>> +#define __NR_seccomp_exit              __NR_exit
>>> +#define __NR_seccomp_sigreturn         __NR_rt_sigreturn
>>
>>
>> I don't like these names.  __NR_seccomp_read sounds like the number of a
>> syscall called seccomp_read.
>>
>> Also, shouldn't something be including this header?  I'm confused.
>
> Ah! Good catch. These names are correct (see kernel/seccomp.c's
> mode1_syscalls and mode1_syscalls_32 arrays), but the location of this
> change was unexpected. I was expecting this file to live in
> arch/*/include/asm/seccomp.h, not in include/asm-generic/seccomp.h.
>
> However, since it's always the same list, it might make sense to
> consolidate them into a single place as a default to make arch porting
> easier.

Yeah, that is why I put this file under include/asm-generic.

> However, I think that should be a separate patch.

Do you mean that the code for all the existing archs should also be changed
to use this (common) header?

-Takahiro AKASHI
>
> -Kees
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v5 3/3] arm64: Add seccomp support
  2014-07-24  3:52   ` Andy Lutomirski
@ 2014-07-24  5:40     ` AKASHI Takahiro
  2014-07-24 15:00       ` Andy Lutomirski
  0 siblings, 1 reply; 36+ messages in thread
From: AKASHI Takahiro @ 2014-07-24  5:40 UTC (permalink / raw)
  To: Andy Lutomirski, wad, catalin.marinas, will.deacon, keescook
  Cc: dsaxena, linux-arm-kernel, linaro-kernel, linux-kernel

On 07/24/2014 12:52 PM, Andy Lutomirski wrote:
> On 07/22/2014 02:14 AM, AKASHI Takahiro wrote:
>> secure_computing() should always be called first in syscall_trace_enter().
>>
>> If secure_computing() returns -1, we should stop further handling. Then
>> that system call may eventually fail with a specified return value (errno),
>> be trapped or the process itself be killed depending on loaded rules.
>> In these cases, syscall_trace_enter() also returns -1, that results in
>> skiping a normal syscall handling as well as syscall_trace_exit().
>>
>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>> ---
>>   arch/arm64/Kconfig               |   14 ++++++++++++++
>>   arch/arm64/include/asm/seccomp.h |   25 +++++++++++++++++++++++++
>>   arch/arm64/include/asm/unistd.h  |    3 +++
>>   arch/arm64/kernel/ptrace.c       |    5 +++++
>>   4 files changed, 47 insertions(+)
>>   create mode 100644 arch/arm64/include/asm/seccomp.h
>>
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index 3a18571..eeac003 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -32,6 +32,7 @@ config ARM64
>>       select HAVE_ARCH_AUDITSYSCALL
>>       select HAVE_ARCH_JUMP_LABEL
>>       select HAVE_ARCH_KGDB
>> +    select HAVE_ARCH_SECCOMP_FILTER
>>       select HAVE_ARCH_TRACEHOOK
>>       select HAVE_C_RECORDMCOUNT
>>       select HAVE_DEBUG_BUGVERBOSE
>> @@ -259,6 +260,19 @@ config ARCH_HAS_CACHE_LINE_SIZE
>>
>>   source "mm/Kconfig"
>>
>> +config SECCOMP
>> +    bool "Enable seccomp to safely compute untrusted bytecode"
>> +    ---help---
>> +      This kernel feature is useful for number crunching applications
>> +      that may need to compute untrusted bytecode during their
>> +      execution. By using pipes or other transports made available to
>> +      the process as file descriptors supporting the read/write
>> +      syscalls, it's possible to isolate those applications in
>> +      their own address space using seccomp. Once seccomp is
>> +      enabled via prctl(PR_SET_SECCOMP), it cannot be disabled
>> +      and the task is only allowed to execute a few safe syscalls
>> +      defined by each seccomp mode.
>> +
>>   config XEN_DOM0
>>       def_bool y
>>       depends on XEN
>> diff --git a/arch/arm64/include/asm/seccomp.h b/arch/arm64/include/asm/seccomp.h
>> new file mode 100644
>> index 0000000..c76fac9
>> --- /dev/null
>> +++ b/arch/arm64/include/asm/seccomp.h
>> @@ -0,0 +1,25 @@
>> +/*
>> + * arch/arm64/include/asm/seccomp.h
>> + *
>> + * Copyright (C) 2014 Linaro Limited
>> + * Author: AKASHI Takahiro <takahiro.akashi@linaro.org>
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + */
>> +#ifndef _ASM_SECCOMP_H
>> +#define _ASM_SECCOMP_H
>> +
>> +#include <asm/unistd.h>
>> +
>> +#ifdef CONFIG_COMPAT
>> +#define __NR_seccomp_read_32        __NR_compat_read
>> +#define __NR_seccomp_write_32        __NR_compat_write
>> +#define __NR_seccomp_exit_32        __NR_compat_exit
>> +#define __NR_seccomp_sigreturn_32    __NR_compat_rt_sigreturn
>> +#endif /* CONFIG_COMPAT */
>> +
>> +#include <asm-generic/seccomp.h>
>> +
>> +#endif /* _ASM_SECCOMP_H */
>> diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
>> index c980ab7..729c155 100644
>> --- a/arch/arm64/include/asm/unistd.h
>> +++ b/arch/arm64/include/asm/unistd.h
>> @@ -31,6 +31,9 @@
>>    * Compat syscall numbers used by the AArch64 kernel.
>>    */
>>   #define __NR_compat_restart_syscall    0
>> +#define __NR_compat_exit        1
>> +#define __NR_compat_read        3
>> +#define __NR_compat_write        4
>>   #define __NR_compat_sigreturn        119
>>   #define __NR_compat_rt_sigreturn    173
>>
>> diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
>> index 100d7d1..e477f6f 100644
>> --- a/arch/arm64/kernel/ptrace.c
>> +++ b/arch/arm64/kernel/ptrace.c
>> @@ -28,6 +28,7 @@
>>   #include <linux/smp.h>
>>   #include <linux/ptrace.h>
>>   #include <linux/user.h>
>> +#include <linux/seccomp.h>
>>   #include <linux/security.h>
>>   #include <linux/init.h>
>>   #include <linux/signal.h>
>> @@ -1115,6 +1116,10 @@ asmlinkage int syscall_trace_enter(struct pt_regs *regs)
>>       saved_x0 = regs->regs[0];
>>       saved_x8 = regs->regs[8];
>>
>> +    if (secure_computing(regs->syscallno) == -1)
>> +        /* seccomp failures shouldn't expose any additional code. */
>> +        return -1;
>> +
>
> This will conflict with the fastpath stuff in Kees' tree.  (Actually, it's likely to apply cleanly, but fail to
> compile.)  The fix is trivial, but, given that the fastpath stuff is new, can you take a look and see if arm64 can use
> it effectively?

I will look into the code later.

> I suspect that the performance considerations are rather different on arm64 as compared to x86 (I really hope that x86
> is the only architecture with the absurd sysret vs. iret distinction), but at least the seccomp_data stuff ought to help
> anywhere.  (It looks like there's a distinct fast path, too, so the two-phase thing might also be a fairly large win if
> it's supportable.)
>
> See:
>
> https://git.kernel.org/cgit/linux/kernel/git/kees/linux.git/log/?h=seccomp/fastpath
>
> Also, I'll ask the usual question?  What are all of the factors other than nr and args that affect syscall execution?
> What are the audit arch values?  Do they match correctly?

As far as I know,

> For example, it looks like, if arm64 adds OABI support, you'll have a problem.  (Note that arm currently disables audit
> and seccomp if OABI is enabled for exactly this reason.)

I don't think that arm64 will add OABI support in the future.

> Do any syscall implementations care whether the user code is LE or BE? Are the arguments encoded the same way?

when I implemented audit for arm64, the assumptions were
* If userspace is LE, then the kernel is also LE and if BE, then the kernel is BE.
* the syscall numbers and how arguments are encoded are the same btw BE and LE.
So syscall_get_arch() always return the same value.

> An arm-specific question: will there be any confusion as a result of the fact that compat syscalls seems to stick nr in
> w7, but arm64 puts them somewhere else?

I don't know, but syscall_get_arch() returns ARCH_ARM for 32-bit tasks.

Thanks,
-Takahiro AKASHI

> --Andy

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v5 1/3] arm64: ptrace: reload a syscall number after ptrace operations
  2014-07-24  3:54   ` Andy Lutomirski
@ 2014-07-24  5:57     ` AKASHI Takahiro
  2014-07-24 15:01       ` Andy Lutomirski
  0 siblings, 1 reply; 36+ messages in thread
From: AKASHI Takahiro @ 2014-07-24  5:57 UTC (permalink / raw)
  To: Andy Lutomirski, wad, catalin.marinas, will.deacon, keescook
  Cc: dsaxena, linux-arm-kernel, linaro-kernel, linux-kernel

On 07/24/2014 12:54 PM, Andy Lutomirski wrote:
> On 07/22/2014 02:14 AM, AKASHI Takahiro wrote:
>> Arm64 holds a syscall number in w8(x8) register. Ptrace tracer may change
>> its value either to:
>>    * any valid syscall number to alter a system call, or
>>    * -1 to skip a system call
>>
>> This patch implements this behavior by reloading that value into syscallno
>> in struct pt_regs after tracehook_report_syscall_entry() or
>> secure_computing(). In case of '-1', a return value of system call can also
>> be changed by the tracer setting the value to x0 register, and so
>> sys_ni_nosyscall() should not be called.
>>
>> See also:
>>      42309ab4, ARM: 8087/1: ptrace: reload syscall number after
>>           secure_computing() check
>>
>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>> ---
>>   arch/arm64/kernel/entry.S  |    2 ++
>>   arch/arm64/kernel/ptrace.c |   13 +++++++++++++
>>   2 files changed, 15 insertions(+)
>>
>> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
>> index 5141e79..de8bdbc 100644
>> --- a/arch/arm64/kernel/entry.S
>> +++ b/arch/arm64/kernel/entry.S
>> @@ -628,6 +628,8 @@ ENDPROC(el0_svc)
>>   __sys_trace:
>>       mov    x0, sp
>>       bl    syscall_trace_enter
>> +    cmp    w0, #-1                // skip syscall?
>> +    b.eq    ret_to_user
>
> Does this mean that skipped syscalls will cause exit tracing to be skipped?

Yes. (and I guess yes on arm, too)

 > If so, then you risk (at least) introducing
> a nice user-triggerable OOPS if audit is enabled.

Can you please elaborate this?
Since I didn't find any definition of audit's behavior when syscall is
rewritten to -1, I thought it is reasonable to skip "exit tracing" of
"skipped" syscall.
(otherwise, "fake" seems to be more appropriate :)

-Takahiro AKASHI

> This bug existed for *years* on x86_32, and it amazes me that no one
> ever triggered it by accident. (Grr, audit.)
>
> --Andy

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v5 2/3] asm-generic: Add generic seccomp.h for secure computing mode 1
  2014-07-24  5:17       ` AKASHI Takahiro
@ 2014-07-24 14:57         ` Andy Lutomirski
  2014-07-25  8:52           ` AKASHI Takahiro
  0 siblings, 1 reply; 36+ messages in thread
From: Andy Lutomirski @ 2014-07-24 14:57 UTC (permalink / raw)
  To: AKASHI Takahiro
  Cc: Deepak Saxena, Catalin Marinas, linaro-kernel, LKML, Will Deacon,
	Will Drewry, Kees Cook, linux-arm-kernel

On Jul 23, 2014 10:17 PM, "AKASHI Takahiro" <takahiro.akashi@linaro.org> wrote:
>
> On 07/24/2014 01:41 PM, Kees Cook wrote:
>>
>> On Wed, Jul 23, 2014 at 8:40 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>>>
>>> On 07/22/2014 02:14 AM, AKASHI Takahiro wrote:
>>>>
>>>>
>>>> Those values (__NR_seccomp_*) are used solely in secure_computing()
>>>> to identify mode 1 system calls. If compat system calls have different
>>>> syscall numbers, asm/seccomp.h may override them.
>>>>
>>>> Acked-by: Arnd Bergmann <arnd@arndb.de>
>>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>> ---
>>>>    include/asm-generic/seccomp.h |   28 ++++++++++++++++++++++++++++
>>>>    1 file changed, 28 insertions(+)
>>>>    create mode 100644 include/asm-generic/seccomp.h
>>>>
>>>> diff --git a/include/asm-generic/seccomp.h b/include/asm-generic/seccomp.h
>>>> new file mode 100644
>>>> index 0000000..5e97022
>>>> --- /dev/null
>>>> +++ b/include/asm-generic/seccomp.h
>>>> @@ -0,0 +1,28 @@
>>>> +/*
>>>> + * include/asm-generic/seccomp.h
>>>> + *
>>>> + * Copyright (C) 2014 Linaro Limited
>>>> + * Author: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>> + *
>>>> + * This program is free software; you can redistribute it and/or modify
>>>> + * it under the terms of the GNU General Public License version 2 as
>>>> + * published by the Free Software Foundation.
>>>> + */
>>>> +#ifndef _ASM_GENERIC_SECCOMP_H
>>>> +#define _ASM_GENERIC_SECCOMP_H
>>>> +
>>>> +#include <asm-generic/unistd.h>
>>>> +
>>>> +#if defined(CONFIG_COMPAT) && !defined(__NR_seccomp_read_32)
>>>> +#define __NR_seccomp_read_32           __NR_read
>>>> +#define __NR_seccomp_write_32          __NR_write
>>>> +#define __NR_seccomp_exit_32           __NR_exit
>>>> +#define __NR_seccomp_sigreturn_32      __NR_rt_sigreturn
>>>> +#endif /* CONFIG_COMPAT && ! already defined */
>>>> +
>>>> +#define __NR_seccomp_read              __NR_read
>>>> +#define __NR_seccomp_write             __NR_write
>>>> +#define __NR_seccomp_exit              __NR_exit
>>>> +#define __NR_seccomp_sigreturn         __NR_rt_sigreturn
>>>
>>>
>>>
>>> I don't like these names.  __NR_seccomp_read sounds like the number of a
>>> syscall called seccomp_read.
>>>
>>> Also, shouldn't something be including this header?  I'm confused.
>>
>>
>> Ah! Good catch. These names are correct (see kernel/seccomp.c's
>> mode1_syscalls and mode1_syscalls_32 arrays), but the location of this
>> change was unexpected. I was expecting this file to live in
>> arch/*/include/asm/seccomp.h, not in include/asm-generic/seccomp.h.
>>
>> However, since it's always the same list, it might make sense to
>> consolidate them into a single place as a default to make arch porting
>> easier.
>
>
> Yeah, that is why I put this file under include/asm-generic.

It seems odd that the header would be added without any users.  I
guess it's okay, since arm64 uses it in the followup patch.

>
>> However, I think that should be a separate patch.
>
>
> Do you mean that the code for all the existing archs should also be changed
> to use this (common) header?

If that works, yes.

--Andy

>
> -Takahiro AKASHI
>>
>>
>> -Kees
>>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v5 3/3] arm64: Add seccomp support
  2014-07-24  5:40     ` AKASHI Takahiro
@ 2014-07-24 15:00       ` Andy Lutomirski
  2014-07-24 15:16         ` Catalin Marinas
  2014-07-25  9:37         ` AKASHI Takahiro
  0 siblings, 2 replies; 36+ messages in thread
From: Andy Lutomirski @ 2014-07-24 15:00 UTC (permalink / raw)
  To: AKASHI Takahiro
  Cc: Deepak Saxena, Will Drewry, Kees Cook, linaro-kernel,
	linux-arm-kernel, Will Deacon, Catalin Marinas, linux-kernel

On Jul 23, 2014 10:40 PM, "AKASHI Takahiro" <takahiro.akashi@linaro.org> wrote:
>
> On 07/24/2014 12:52 PM, Andy Lutomirski wrote:
>>
>> On 07/22/2014 02:14 AM, AKASHI Takahiro wrote:
>>>
>>> secure_computing() should always be called first in syscall_trace_enter().
>>>
>>> If secure_computing() returns -1, we should stop further handling. Then
>>> that system call may eventually fail with a specified return value (errno),
>>> be trapped or the process itself be killed depending on loaded rules.
>>> In these cases, syscall_trace_enter() also returns -1, that results in
>>> skiping a normal syscall handling as well as syscall_trace_exit().
>>>
>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>> ---
>>>   arch/arm64/Kconfig               |   14 ++++++++++++++
>>>   arch/arm64/include/asm/seccomp.h |   25 +++++++++++++++++++++++++
>>>   arch/arm64/include/asm/unistd.h  |    3 +++
>>>   arch/arm64/kernel/ptrace.c       |    5 +++++
>>>   4 files changed, 47 insertions(+)
>>>   create mode 100644 arch/arm64/include/asm/seccomp.h
>>>
>>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>>> index 3a18571..eeac003 100644
>>> --- a/arch/arm64/Kconfig
>>> +++ b/arch/arm64/Kconfig
>>> @@ -32,6 +32,7 @@ config ARM64
>>>       select HAVE_ARCH_AUDITSYSCALL
>>>       select HAVE_ARCH_JUMP_LABEL
>>>       select HAVE_ARCH_KGDB
>>> +    select HAVE_ARCH_SECCOMP_FILTER
>>>       select HAVE_ARCH_TRACEHOOK
>>>       select HAVE_C_RECORDMCOUNT
>>>       select HAVE_DEBUG_BUGVERBOSE
>>> @@ -259,6 +260,19 @@ config ARCH_HAS_CACHE_LINE_SIZE
>>>
>>>   source "mm/Kconfig"
>>>
>>> +config SECCOMP
>>> +    bool "Enable seccomp to safely compute untrusted bytecode"
>>> +    ---help---
>>> +      This kernel feature is useful for number crunching applications
>>> +      that may need to compute untrusted bytecode during their
>>> +      execution. By using pipes or other transports made available to
>>> +      the process as file descriptors supporting the read/write
>>> +      syscalls, it's possible to isolate those applications in
>>> +      their own address space using seccomp. Once seccomp is
>>> +      enabled via prctl(PR_SET_SECCOMP), it cannot be disabled
>>> +      and the task is only allowed to execute a few safe syscalls
>>> +      defined by each seccomp mode.
>>> +
>>>   config XEN_DOM0
>>>       def_bool y
>>>       depends on XEN
>>> diff --git a/arch/arm64/include/asm/seccomp.h b/arch/arm64/include/asm/seccomp.h
>>> new file mode 100644
>>> index 0000000..c76fac9
>>> --- /dev/null
>>> +++ b/arch/arm64/include/asm/seccomp.h
>>> @@ -0,0 +1,25 @@
>>> +/*
>>> + * arch/arm64/include/asm/seccomp.h
>>> + *
>>> + * Copyright (C) 2014 Linaro Limited
>>> + * Author: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>> + *
>>> + * This program is free software; you can redistribute it and/or modify
>>> + * it under the terms of the GNU General Public License version 2 as
>>> + * published by the Free Software Foundation.
>>> + */
>>> +#ifndef _ASM_SECCOMP_H
>>> +#define _ASM_SECCOMP_H
>>> +
>>> +#include <asm/unistd.h>
>>> +
>>> +#ifdef CONFIG_COMPAT
>>> +#define __NR_seccomp_read_32        __NR_compat_read
>>> +#define __NR_seccomp_write_32        __NR_compat_write
>>> +#define __NR_seccomp_exit_32        __NR_compat_exit
>>> +#define __NR_seccomp_sigreturn_32    __NR_compat_rt_sigreturn
>>> +#endif /* CONFIG_COMPAT */
>>> +
>>> +#include <asm-generic/seccomp.h>
>>> +
>>> +#endif /* _ASM_SECCOMP_H */
>>> diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
>>> index c980ab7..729c155 100644
>>> --- a/arch/arm64/include/asm/unistd.h
>>> +++ b/arch/arm64/include/asm/unistd.h
>>> @@ -31,6 +31,9 @@
>>>    * Compat syscall numbers used by the AArch64 kernel.
>>>    */
>>>   #define __NR_compat_restart_syscall    0
>>> +#define __NR_compat_exit        1
>>> +#define __NR_compat_read        3
>>> +#define __NR_compat_write        4
>>>   #define __NR_compat_sigreturn        119
>>>   #define __NR_compat_rt_sigreturn    173
>>>
>>> diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
>>> index 100d7d1..e477f6f 100644
>>> --- a/arch/arm64/kernel/ptrace.c
>>> +++ b/arch/arm64/kernel/ptrace.c
>>> @@ -28,6 +28,7 @@
>>>   #include <linux/smp.h>
>>>   #include <linux/ptrace.h>
>>>   #include <linux/user.h>
>>> +#include <linux/seccomp.h>
>>>   #include <linux/security.h>
>>>   #include <linux/init.h>
>>>   #include <linux/signal.h>
>>> @@ -1115,6 +1116,10 @@ asmlinkage int syscall_trace_enter(struct pt_regs *regs)
>>>       saved_x0 = regs->regs[0];
>>>       saved_x8 = regs->regs[8];
>>>
>>> +    if (secure_computing(regs->syscallno) == -1)
>>> +        /* seccomp failures shouldn't expose any additional code. */
>>> +        return -1;
>>> +
>>
>>
>> This will conflict with the fastpath stuff in Kees' tree.  (Actually, it's likely to apply cleanly, but fail to
>> compile.)  The fix is trivial, but, given that the fastpath stuff is new, can you take a look and see if arm64 can use
>> it effectively?
>
>
> I will look into the code later.
>
>
>> I suspect that the performance considerations are rather different on arm64 as compared to x86 (I really hope that x86
>> is the only architecture with the absurd sysret vs. iret distinction), but at least the seccomp_data stuff ought to help
>> anywhere.  (It looks like there's a distinct fast path, too, so the two-phase thing might also be a fairly large win if
>> it's supportable.)
>>
>> See:
>>
>> https://git.kernel.org/cgit/linux/kernel/git/kees/linux.git/log/?h=seccomp/fastpath
>>
>> Also, I'll ask the usual question?  What are all of the factors other than nr and args that affect syscall execution?
>> What are the audit arch values?  Do they match correctly?
>
>
> As far as I know,
>
>
>> For example, it looks like, if arm64 adds OABI support, you'll have a problem.  (Note that arm currently disables audit
>> and seccomp if OABI is enabled for exactly this reason.)
>
>
> I don't think that arm64 will add OABI support in the future.
>
>
>> Do any syscall implementations care whether the user code is LE or BE? Are the arguments encoded the same way?
>
>
> when I implemented audit for arm64, the assumptions were
> * If userspace is LE, then the kernel is also LE and if BE, then the kernel is BE.
> * the syscall numbers and how arguments are encoded are the same btw BE and LE.
> So syscall_get_arch() always return the same value.

If arm64 ever adds support for mixed-endian userspace, this could
become awkward.  Hmm.

IMO this matters more for seccomp than for audit.  The audit code
doesn't seem to do anything terribly interesting w/ the arch field, at
least in terms of interpretation of syscall args.

>
>
>> An arm-specific question: will there be any confusion as a result of the fact that compat syscalls seems to stick nr in
>> w7, but arm64 puts them somewhere else?
>
>
> I don't know, but syscall_get_arch() returns ARCH_ARM for 32-bit tasks.

Will 32-bit tracers be compatible between arm and arm64 kernels?  That
is, if a 32-bit program installs a seccomp filter with a trace action
and traces a 32-bit process, will everything work correctly?  (Kees'
and Will's tests should work for this, I think.)

--Andy

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v5 1/3] arm64: ptrace: reload a syscall number after ptrace operations
  2014-07-24  5:57     ` AKASHI Takahiro
@ 2014-07-24 15:01       ` Andy Lutomirski
  2014-07-25 10:36         ` AKASHI Takahiro
  0 siblings, 1 reply; 36+ messages in thread
From: Andy Lutomirski @ 2014-07-24 15:01 UTC (permalink / raw)
  To: AKASHI Takahiro
  Cc: Deepak Saxena, Will Drewry, Kees Cook, linaro-kernel,
	linux-arm-kernel, Will Deacon, Catalin Marinas, linux-kernel

On Jul 23, 2014 10:57 PM, "AKASHI Takahiro" <takahiro.akashi@linaro.org> wrote:
>
> On 07/24/2014 12:54 PM, Andy Lutomirski wrote:
>>
>> On 07/22/2014 02:14 AM, AKASHI Takahiro wrote:
>>>
>>> Arm64 holds a syscall number in w8(x8) register. Ptrace tracer may change
>>> its value either to:
>>>    * any valid syscall number to alter a system call, or
>>>    * -1 to skip a system call
>>>
>>> This patch implements this behavior by reloading that value into syscallno
>>> in struct pt_regs after tracehook_report_syscall_entry() or
>>> secure_computing(). In case of '-1', a return value of system call can also
>>> be changed by the tracer setting the value to x0 register, and so
>>> sys_ni_nosyscall() should not be called.
>>>
>>> See also:
>>>      42309ab4, ARM: 8087/1: ptrace: reload syscall number after
>>>           secure_computing() check
>>>
>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>> ---
>>>   arch/arm64/kernel/entry.S  |    2 ++
>>>   arch/arm64/kernel/ptrace.c |   13 +++++++++++++
>>>   2 files changed, 15 insertions(+)
>>>
>>> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
>>> index 5141e79..de8bdbc 100644
>>> --- a/arch/arm64/kernel/entry.S
>>> +++ b/arch/arm64/kernel/entry.S
>>> @@ -628,6 +628,8 @@ ENDPROC(el0_svc)
>>>   __sys_trace:
>>>       mov    x0, sp
>>>       bl    syscall_trace_enter
>>> +    cmp    w0, #-1                // skip syscall?
>>> +    b.eq    ret_to_user
>>
>>
>> Does this mean that skipped syscalls will cause exit tracing to be skipped?
>
>
> Yes. (and I guess yes on arm, too)
>
>
> > If so, then you risk (at least) introducing
>>
>> a nice user-triggerable OOPS if audit is enabled.
>
>
> Can you please elaborate this?
> Since I didn't find any definition of audit's behavior when syscall is
> rewritten to -1, I thought it is reasonable to skip "exit tracing" of
> "skipped" syscall.
> (otherwise, "fake" seems to be more appropriate :)

The audit entry hook will oops if you call it twice in a row without
calling the exit hook in between.  I can also imagine ptracers getting
confused if ptrace entry and exit don't line up.

What happens if user code directly issues syscall ~0?  Does the return
value register get set?  Is the behavior different between traced and
untraced syscalls?  The current approach seems a bit scary.

--Andy

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v5 3/3] arm64: Add seccomp support
  2014-07-24 15:00       ` Andy Lutomirski
@ 2014-07-24 15:16         ` Catalin Marinas
  2014-07-25  9:37         ` AKASHI Takahiro
  1 sibling, 0 replies; 36+ messages in thread
From: Catalin Marinas @ 2014-07-24 15:16 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: AKASHI Takahiro, Deepak Saxena, Will Drewry, Kees Cook,
	linaro-kernel, linux-arm-kernel, Will Deacon, linux-kernel

On Thu, Jul 24, 2014 at 04:00:03PM +0100, Andy Lutomirski wrote:
> On Jul 23, 2014 10:40 PM, "AKASHI Takahiro" <takahiro.akashi@linaro.org> wrote:
> > when I implemented audit for arm64, the assumptions were
> > * If userspace is LE, then the kernel is also LE and if BE, then the kernel is BE.
> > * the syscall numbers and how arguments are encoded are the same btw BE and LE.
> > So syscall_get_arch() always return the same value.
> 
> If arm64 ever adds support for mixed-endian userspace, this could
> become awkward.  Hmm.

I really doubt we would ever support mixed endian user space. Too many
problems with translating syscalls, futexes (someone looked into this
and gave up eventually).

-- 
Catalin

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v5 2/3] asm-generic: Add generic seccomp.h for secure computing mode 1
  2014-07-24 14:57         ` Andy Lutomirski
@ 2014-07-25  8:52           ` AKASHI Takahiro
  0 siblings, 0 replies; 36+ messages in thread
From: AKASHI Takahiro @ 2014-07-25  8:52 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Deepak Saxena, Catalin Marinas, linaro-kernel, LKML, Will Deacon,
	Will Drewry, Kees Cook, linux-arm-kernel

On 07/24/2014 11:57 PM, Andy Lutomirski wrote:
> On Jul 23, 2014 10:17 PM, "AKASHI Takahiro" <takahiro.akashi@linaro.org> wrote:
>>
>> On 07/24/2014 01:41 PM, Kees Cook wrote:
>>>
>>> On Wed, Jul 23, 2014 at 8:40 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>>>>
>>>> On 07/22/2014 02:14 AM, AKASHI Takahiro wrote:
>>>>>
>>>>>
>>>>> Those values (__NR_seccomp_*) are used solely in secure_computing()
>>>>> to identify mode 1 system calls. If compat system calls have different
>>>>> syscall numbers, asm/seccomp.h may override them.
>>>>>
>>>>> Acked-by: Arnd Bergmann <arnd@arndb.de>
>>>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>> ---
>>>>>     include/asm-generic/seccomp.h |   28 ++++++++++++++++++++++++++++
>>>>>     1 file changed, 28 insertions(+)
>>>>>     create mode 100644 include/asm-generic/seccomp.h
>>>>>
>>>>> diff --git a/include/asm-generic/seccomp.h b/include/asm-generic/seccomp.h
>>>>> new file mode 100644
>>>>> index 0000000..5e97022
>>>>> --- /dev/null
>>>>> +++ b/include/asm-generic/seccomp.h
>>>>> @@ -0,0 +1,28 @@
>>>>> +/*
>>>>> + * include/asm-generic/seccomp.h
>>>>> + *
>>>>> + * Copyright (C) 2014 Linaro Limited
>>>>> + * Author: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>> + *
>>>>> + * This program is free software; you can redistribute it and/or modify
>>>>> + * it under the terms of the GNU General Public License version 2 as
>>>>> + * published by the Free Software Foundation.
>>>>> + */
>>>>> +#ifndef _ASM_GENERIC_SECCOMP_H
>>>>> +#define _ASM_GENERIC_SECCOMP_H
>>>>> +
>>>>> +#include <asm-generic/unistd.h>
>>>>> +
>>>>> +#if defined(CONFIG_COMPAT) && !defined(__NR_seccomp_read_32)
>>>>> +#define __NR_seccomp_read_32           __NR_read
>>>>> +#define __NR_seccomp_write_32          __NR_write
>>>>> +#define __NR_seccomp_exit_32           __NR_exit
>>>>> +#define __NR_seccomp_sigreturn_32      __NR_rt_sigreturn
>>>>> +#endif /* CONFIG_COMPAT && ! already defined */
>>>>> +
>>>>> +#define __NR_seccomp_read              __NR_read
>>>>> +#define __NR_seccomp_write             __NR_write
>>>>> +#define __NR_seccomp_exit              __NR_exit
>>>>> +#define __NR_seccomp_sigreturn         __NR_rt_sigreturn
>>>>
>>>>
>>>>
>>>> I don't like these names.  __NR_seccomp_read sounds like the number of a
>>>> syscall called seccomp_read.
>>>>
>>>> Also, shouldn't something be including this header?  I'm confused.
>>>
>>>
>>> Ah! Good catch. These names are correct (see kernel/seccomp.c's
>>> mode1_syscalls and mode1_syscalls_32 arrays), but the location of this
>>> change was unexpected. I was expecting this file to live in
>>> arch/*/include/asm/seccomp.h, not in include/asm-generic/seccomp.h.
>>>
>>> However, since it's always the same list, it might make sense to
>>> consolidate them into a single place as a default to make arch porting
>>> easier.
>>
>>
>> Yeah, that is why I put this file under include/asm-generic.
>
> It seems odd that the header would be added without any users.  I
> guess it's okay, since arm64 uses it in the followup patch.
>
>>
>>> However, I think that should be a separate patch.
>>
>>
>> Do you mean that the code for all the existing archs should also be changed
>> to use this (common) header?
>
> If that works, yes.

As is often the case, the patch itself is quite simple, but I can't
test it on other architectures.

-Takahiro AKASHI

> --Andy
>
>>
>> -Takahiro AKASHI
>>>
>>>
>>> -Kees
>>>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v5 3/3] arm64: Add seccomp support
  2014-07-24 15:00       ` Andy Lutomirski
  2014-07-24 15:16         ` Catalin Marinas
@ 2014-07-25  9:37         ` AKASHI Takahiro
  2014-08-05 15:08           ` Kees Cook
  1 sibling, 1 reply; 36+ messages in thread
From: AKASHI Takahiro @ 2014-07-25  9:37 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Deepak Saxena, Will Drewry, Kees Cook, linaro-kernel,
	linux-arm-kernel, Will Deacon, Catalin Marinas, linux-kernel

On 07/25/2014 12:00 AM, Andy Lutomirski wrote:
> On Jul 23, 2014 10:40 PM, "AKASHI Takahiro" <takahiro.akashi@linaro.org> wrote:
>>
>> On 07/24/2014 12:52 PM, Andy Lutomirski wrote:
>>>
>>> On 07/22/2014 02:14 AM, AKASHI Takahiro wrote:
>>>>
>>>> secure_computing() should always be called first in syscall_trace_enter().
>>>>
>>>> If secure_computing() returns -1, we should stop further handling. Then
>>>> that system call may eventually fail with a specified return value (errno),
>>>> be trapped or the process itself be killed depending on loaded rules.
>>>> In these cases, syscall_trace_enter() also returns -1, that results in
>>>> skiping a normal syscall handling as well as syscall_trace_exit().
>>>>
>>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>> ---
>>>>    arch/arm64/Kconfig               |   14 ++++++++++++++
>>>>    arch/arm64/include/asm/seccomp.h |   25 +++++++++++++++++++++++++
>>>>    arch/arm64/include/asm/unistd.h  |    3 +++
>>>>    arch/arm64/kernel/ptrace.c       |    5 +++++
>>>>    4 files changed, 47 insertions(+)
>>>>    create mode 100644 arch/arm64/include/asm/seccomp.h
>>>>
>>>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>>>> index 3a18571..eeac003 100644
>>>> --- a/arch/arm64/Kconfig
>>>> +++ b/arch/arm64/Kconfig
>>>> @@ -32,6 +32,7 @@ config ARM64
>>>>        select HAVE_ARCH_AUDITSYSCALL
>>>>        select HAVE_ARCH_JUMP_LABEL
>>>>        select HAVE_ARCH_KGDB
>>>> +    select HAVE_ARCH_SECCOMP_FILTER
>>>>        select HAVE_ARCH_TRACEHOOK
>>>>        select HAVE_C_RECORDMCOUNT
>>>>        select HAVE_DEBUG_BUGVERBOSE
>>>> @@ -259,6 +260,19 @@ config ARCH_HAS_CACHE_LINE_SIZE
>>>>
>>>>    source "mm/Kconfig"
>>>>
>>>> +config SECCOMP
>>>> +    bool "Enable seccomp to safely compute untrusted bytecode"
>>>> +    ---help---
>>>> +      This kernel feature is useful for number crunching applications
>>>> +      that may need to compute untrusted bytecode during their
>>>> +      execution. By using pipes or other transports made available to
>>>> +      the process as file descriptors supporting the read/write
>>>> +      syscalls, it's possible to isolate those applications in
>>>> +      their own address space using seccomp. Once seccomp is
>>>> +      enabled via prctl(PR_SET_SECCOMP), it cannot be disabled
>>>> +      and the task is only allowed to execute a few safe syscalls
>>>> +      defined by each seccomp mode.
>>>> +
>>>>    config XEN_DOM0
>>>>        def_bool y
>>>>        depends on XEN
>>>> diff --git a/arch/arm64/include/asm/seccomp.h b/arch/arm64/include/asm/seccomp.h
>>>> new file mode 100644
>>>> index 0000000..c76fac9
>>>> --- /dev/null
>>>> +++ b/arch/arm64/include/asm/seccomp.h
>>>> @@ -0,0 +1,25 @@
>>>> +/*
>>>> + * arch/arm64/include/asm/seccomp.h
>>>> + *
>>>> + * Copyright (C) 2014 Linaro Limited
>>>> + * Author: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>> + *
>>>> + * This program is free software; you can redistribute it and/or modify
>>>> + * it under the terms of the GNU General Public License version 2 as
>>>> + * published by the Free Software Foundation.
>>>> + */
>>>> +#ifndef _ASM_SECCOMP_H
>>>> +#define _ASM_SECCOMP_H
>>>> +
>>>> +#include <asm/unistd.h>
>>>> +
>>>> +#ifdef CONFIG_COMPAT
>>>> +#define __NR_seccomp_read_32        __NR_compat_read
>>>> +#define __NR_seccomp_write_32        __NR_compat_write
>>>> +#define __NR_seccomp_exit_32        __NR_compat_exit
>>>> +#define __NR_seccomp_sigreturn_32    __NR_compat_rt_sigreturn
>>>> +#endif /* CONFIG_COMPAT */
>>>> +
>>>> +#include <asm-generic/seccomp.h>
>>>> +
>>>> +#endif /* _ASM_SECCOMP_H */
>>>> diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
>>>> index c980ab7..729c155 100644
>>>> --- a/arch/arm64/include/asm/unistd.h
>>>> +++ b/arch/arm64/include/asm/unistd.h
>>>> @@ -31,6 +31,9 @@
>>>>     * Compat syscall numbers used by the AArch64 kernel.
>>>>     */
>>>>    #define __NR_compat_restart_syscall    0
>>>> +#define __NR_compat_exit        1
>>>> +#define __NR_compat_read        3
>>>> +#define __NR_compat_write        4
>>>>    #define __NR_compat_sigreturn        119
>>>>    #define __NR_compat_rt_sigreturn    173
>>>>
>>>> diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
>>>> index 100d7d1..e477f6f 100644
>>>> --- a/arch/arm64/kernel/ptrace.c
>>>> +++ b/arch/arm64/kernel/ptrace.c
>>>> @@ -28,6 +28,7 @@
>>>>    #include <linux/smp.h>
>>>>    #include <linux/ptrace.h>
>>>>    #include <linux/user.h>
>>>> +#include <linux/seccomp.h>
>>>>    #include <linux/security.h>
>>>>    #include <linux/init.h>
>>>>    #include <linux/signal.h>
>>>> @@ -1115,6 +1116,10 @@ asmlinkage int syscall_trace_enter(struct pt_regs *regs)
>>>>        saved_x0 = regs->regs[0];
>>>>        saved_x8 = regs->regs[8];
>>>>
>>>> +    if (secure_computing(regs->syscallno) == -1)
>>>> +        /* seccomp failures shouldn't expose any additional code. */
>>>> +        return -1;
>>>> +
>>>
>>>
>>> This will conflict with the fastpath stuff in Kees' tree.  (Actually, it's likely to apply cleanly, but fail to
>>> compile.)  The fix is trivial, but, given that the fastpath stuff is new, can you take a look and see if arm64 can use
>>> it effectively?
>>
>>
>> I will look into the code later.
>>
>>
>>> I suspect that the performance considerations are rather different on arm64 as compared to x86 (I really hope that x86
>>> is the only architecture with the absurd sysret vs. iret distinction), but at least the seccomp_data stuff ought to help
>>> anywhere.  (It looks like there's a distinct fast path, too, so the two-phase thing might also be a fairly large win if
>>> it's supportable.)
>>>
>>> See:
>>>
>>> https://git.kernel.org/cgit/linux/kernel/git/kees/linux.git/log/?h=seccomp/fastpath
>>>
>>> Also, I'll ask the usual question?  What are all of the factors other than nr and args that affect syscall execution?
>>> What are the audit arch values?  Do they match correctly?
>>
>>
>> As far as I know,
>>
>>
>>> For example, it looks like, if arm64 adds OABI support, you'll have a problem.  (Note that arm currently disables audit
>>> and seccomp if OABI is enabled for exactly this reason.)
>>
>>
>> I don't think that arm64 will add OABI support in the future.
>>
>>
>>> Do any syscall implementations care whether the user code is LE or BE? Are the arguments encoded the same way?
>>
>>
>> when I implemented audit for arm64, the assumptions were
>> * If userspace is LE, then the kernel is also LE and if BE, then the kernel is BE.
>> * the syscall numbers and how arguments are encoded are the same btw BE and LE.
>> So syscall_get_arch() always return the same value.
>
> If arm64 ever adds support for mixed-endian userspace, this could
> become awkward.  Hmm.
>
> IMO this matters more for seccomp than for audit.  The audit code
> doesn't seem to do anything terribly interesting w/ the arch field, at
> least in terms of interpretation of syscall args.

I digged into libseccomp source files, and found that there is some
endianness-dependent code. Especially, "classic" BPF interpreter handles
only 32-bit accumulator/registers, and so special care should be taken
when a filter wants to check a 64-bit argument of system call.
If we don't support mixed-endianness, this issue can be fixed by statically
compiling the library with BYTE_ORDER macro. But otherwise
syscall_get_arch() should return a dedicated value for BE kernel and this change
will also have some impact on audit commands.

>>
>>
>>> An arm-specific question: will there be any confusion as a result of the fact that compat syscalls seems to stick nr in
>>> w7, but arm64 puts them somewhere else?
>>
>>
>> I don't know, but syscall_get_arch() returns ARCH_ARM for 32-bit tasks.
>
> Will 32-bit tracers be compatible between arm and arm64 kernels?  That
> is, if a 32-bit program installs a seccomp filter with a trace action
> and traces a 32-bit process, will everything work correctly?  (Kees'
> and Will's tests should work for this, I think.)

I found a bug in my current patch (v5). When 32-bit tracer skips a system call,
we should not update syscallno from x8 since syscallno is re-written directly
via ptrace(PTRACE_SET_SYSCALL).
I'm sure that my next version should work with 32/64-bit tracers on 64-bit kernel.

Thanks,
-Takahiro AKASHI

> --Andy
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v5 1/3] arm64: ptrace: reload a syscall number after ptrace operations
  2014-07-24 15:01       ` Andy Lutomirski
@ 2014-07-25 10:36         ` AKASHI Takahiro
  2014-07-25 11:03           ` Will Deacon
  0 siblings, 1 reply; 36+ messages in thread
From: AKASHI Takahiro @ 2014-07-25 10:36 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Deepak Saxena, Will Drewry, Kees Cook, linaro-kernel,
	linux-arm-kernel, Will Deacon, Catalin Marinas, linux-kernel

On 07/25/2014 12:01 AM, Andy Lutomirski wrote:
> On Jul 23, 2014 10:57 PM, "AKASHI Takahiro" <takahiro.akashi@linaro.org> wrote:
>>
>> On 07/24/2014 12:54 PM, Andy Lutomirski wrote:
>>>
>>> On 07/22/2014 02:14 AM, AKASHI Takahiro wrote:
>>>>
>>>> Arm64 holds a syscall number in w8(x8) register. Ptrace tracer may change
>>>> its value either to:
>>>>     * any valid syscall number to alter a system call, or
>>>>     * -1 to skip a system call
>>>>
>>>> This patch implements this behavior by reloading that value into syscallno
>>>> in struct pt_regs after tracehook_report_syscall_entry() or
>>>> secure_computing(). In case of '-1', a return value of system call can also
>>>> be changed by the tracer setting the value to x0 register, and so
>>>> sys_ni_nosyscall() should not be called.
>>>>
>>>> See also:
>>>>       42309ab4, ARM: 8087/1: ptrace: reload syscall number after
>>>>            secure_computing() check
>>>>
>>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>> ---
>>>>    arch/arm64/kernel/entry.S  |    2 ++
>>>>    arch/arm64/kernel/ptrace.c |   13 +++++++++++++
>>>>    2 files changed, 15 insertions(+)
>>>>
>>>> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
>>>> index 5141e79..de8bdbc 100644
>>>> --- a/arch/arm64/kernel/entry.S
>>>> +++ b/arch/arm64/kernel/entry.S
>>>> @@ -628,6 +628,8 @@ ENDPROC(el0_svc)
>>>>    __sys_trace:
>>>>        mov    x0, sp
>>>>        bl    syscall_trace_enter
>>>> +    cmp    w0, #-1                // skip syscall?
>>>> +    b.eq    ret_to_user
>>>
>>>
>>> Does this mean that skipped syscalls will cause exit tracing to be skipped?
>>
>>
>> Yes. (and I guess yes on arm, too)
>>
>>
>>> If so, then you risk (at least) introducing
>>>
>>> a nice user-triggerable OOPS if audit is enabled.
>>
>>
>> Can you please elaborate this?
>> Since I didn't find any definition of audit's behavior when syscall is
>> rewritten to -1, I thought it is reasonable to skip "exit tracing" of
>> "skipped" syscall.
>> (otherwise, "fake" seems to be more appropriate :)
>
> The audit entry hook will oops if you call it twice in a row without
> calling the exit hook in between.

Thank you, I could reproduce this problem which hits BUG(in_syscall) in
audit_syscall_entry(). Really bad, and I fixed it in my next version and
now a "skipped" system call is also traced by audit.

I ran libseccomp test and Kees' test under auditd running with a rule,
	auditctl -a exit,always -S all
and all the tests seemed to pass.

   I can also imagine ptracers getting
> confused if ptrace entry and exit don't line up.

FYI, on arm64, we can distinguish syscall enter/exit with x12 register.

> What happens if user code directly issues syscall ~0?  Does the return
> value register get set?  Is the behavior different between traced and
> untraced syscalls?

Interesting cases. Let me think about it.

Thanks,
-Takahiro AKASHI

>  The current approach seems a bit scary.
>
> --Andy
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v5 1/3] arm64: ptrace: reload a syscall number after ptrace operations
  2014-07-25 10:36         ` AKASHI Takahiro
@ 2014-07-25 11:03           ` Will Deacon
  2014-07-29  6:49             ` AKASHI Takahiro
  0 siblings, 1 reply; 36+ messages in thread
From: Will Deacon @ 2014-07-25 11:03 UTC (permalink / raw)
  To: AKASHI Takahiro
  Cc: Andy Lutomirski, Deepak Saxena, Will Drewry, Kees Cook,
	linaro-kernel, linux-arm-kernel, Catalin Marinas, linux-kernel

On Fri, Jul 25, 2014 at 11:36:49AM +0100, AKASHI Takahiro wrote:
> On 07/25/2014 12:01 AM, Andy Lutomirski wrote:
> >>> If so, then you risk (at least) introducing
> >>>
> >>> a nice user-triggerable OOPS if audit is enabled.
> >>
> >>
> >> Can you please elaborate this?
> >> Since I didn't find any definition of audit's behavior when syscall is
> >> rewritten to -1, I thought it is reasonable to skip "exit tracing" of
> >> "skipped" syscall.
> >> (otherwise, "fake" seems to be more appropriate :)
> >
> > The audit entry hook will oops if you call it twice in a row without
> > calling the exit hook in between.
> 
> Thank you, I could reproduce this problem which hits BUG(in_syscall) in
> audit_syscall_entry(). Really bad, and I fixed it in my next version and
> now a "skipped" system call is also traced by audit.

Can you reproduce this on arch/arm/ too? If so, we should also fix the code
there.

Will

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v5 1/3] arm64: ptrace: reload a syscall number after ptrace operations
  2014-07-25 11:03           ` Will Deacon
@ 2014-07-29  6:49             ` AKASHI Takahiro
  2014-07-29 13:26               ` Will Deacon
  0 siblings, 1 reply; 36+ messages in thread
From: AKASHI Takahiro @ 2014-07-29  6:49 UTC (permalink / raw)
  To: Will Deacon
  Cc: Andy Lutomirski, Deepak Saxena, Will Drewry, Kees Cook,
	linaro-kernel, linux-arm-kernel, Catalin Marinas, linux-kernel

On 07/25/2014 08:03 PM, Will Deacon wrote:
> On Fri, Jul 25, 2014 at 11:36:49AM +0100, AKASHI Takahiro wrote:
>> On 07/25/2014 12:01 AM, Andy Lutomirski wrote:
>>>>> If so, then you risk (at least) introducing
>>>>>
>>>>> a nice user-triggerable OOPS if audit is enabled.
>>>>
>>>>
>>>> Can you please elaborate this?
>>>> Since I didn't find any definition of audit's behavior when syscall is
>>>> rewritten to -1, I thought it is reasonable to skip "exit tracing" of
>>>> "skipped" syscall.
>>>> (otherwise, "fake" seems to be more appropriate :)
>>>
>>> The audit entry hook will oops if you call it twice in a row without
>>> calling the exit hook in between.
>>
>> Thank you, I could reproduce this problem which hits BUG(in_syscall) in
>> audit_syscall_entry(). Really bad, and I fixed it in my next version and
>> now a "skipped" system call is also traced by audit.
>
> Can you reproduce this on arch/arm/ too? If so, we should also fix the code
> there.

As far as I tried on arm with syscall auditing enabled,

1) Changing a syscall number to -1 under seccomp doesn't hit BUG_ON(in_syscall).
2) But, in fact, audit_syscall_entry() is NOT called in this case because
    __secure_computing() returns -1 and then it causes the succeeding tracing
    in syscall_trace_enter(), including audit_syscall_entry(), skipped.
3) On the other hand, calling syscall(-1) from userspace hits BUG_ON because
    the return path, ret_slow_syscall, doesn't contain syscall_trace_exit().
4) When we re-write a syscall number to -1 without seccomp, we will also see
    BUG_ON hit, although I didn't try yet.

Fixing case 3 is easy, but should we also fix case 2?

Please note that, even if we call audit_syscall_exit() in case 2 or 3, no log against
syscall -1 will be recorded because audit_filter_syscall() doesn't allow logging
for any syscall number which is greater than 2048.
This behavior was introduced by Andy's patch, a3c54931, in v3.16-rc.
If the intention of "-1" is to fake a system call, this behavior seems to be a bit odd.

Thanks,
-Takahiro AKASHI


> Will
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v5 1/3] arm64: ptrace: reload a syscall number after ptrace operations
  2014-07-29  6:49             ` AKASHI Takahiro
@ 2014-07-29 13:26               ` Will Deacon
  0 siblings, 0 replies; 36+ messages in thread
From: Will Deacon @ 2014-07-29 13:26 UTC (permalink / raw)
  To: AKASHI Takahiro
  Cc: Andy Lutomirski, Deepak Saxena, Will Drewry, Kees Cook,
	linaro-kernel, linux-arm-kernel, Catalin Marinas, linux-kernel

On Tue, Jul 29, 2014 at 07:49:47AM +0100, AKASHI Takahiro wrote:
> On 07/25/2014 08:03 PM, Will Deacon wrote:
> > On Fri, Jul 25, 2014 at 11:36:49AM +0100, AKASHI Takahiro wrote:
> >> On 07/25/2014 12:01 AM, Andy Lutomirski wrote:
> >>>>> If so, then you risk (at least) introducing
> >>>>>
> >>>>> a nice user-triggerable OOPS if audit is enabled.
> >>>>
> >>>>
> >>>> Can you please elaborate this?
> >>>> Since I didn't find any definition of audit's behavior when syscall is
> >>>> rewritten to -1, I thought it is reasonable to skip "exit tracing" of
> >>>> "skipped" syscall.
> >>>> (otherwise, "fake" seems to be more appropriate :)
> >>>
> >>> The audit entry hook will oops if you call it twice in a row without
> >>> calling the exit hook in between.
> >>
> >> Thank you, I could reproduce this problem which hits BUG(in_syscall) in
> >> audit_syscall_entry(). Really bad, and I fixed it in my next version and
> >> now a "skipped" system call is also traced by audit.
> >
> > Can you reproduce this on arch/arm/ too? If so, we should also fix the code
> > there.
> 
> As far as I tried on arm with syscall auditing enabled,
> 
> 1) Changing a syscall number to -1 under seccomp doesn't hit BUG_ON(in_syscall).
> 2) But, in fact, audit_syscall_entry() is NOT called in this case because
>     __secure_computing() returns -1 and then it causes the succeeding tracing
>     in syscall_trace_enter(), including audit_syscall_entry(), skipped.

What happens if CONFIG_SECCOMP=n?

> 3) On the other hand, calling syscall(-1) from userspace hits BUG_ON because
>     the return path, ret_slow_syscall, doesn't contain syscall_trace_exit().
> 4) When we re-write a syscall number to -1 without seccomp, we will also see
>     BUG_ON hit, although I didn't try yet.
> 
> Fixing case 3 is easy, but should we also fix case 2?

I think so.

Will

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v5 3/3] arm64: Add seccomp support
  2014-07-25  9:37         ` AKASHI Takahiro
@ 2014-08-05 15:08           ` Kees Cook
  2014-08-08  7:35             ` AKASHI Takahiro
  0 siblings, 1 reply; 36+ messages in thread
From: Kees Cook @ 2014-08-05 15:08 UTC (permalink / raw)
  To: AKASHI Takahiro
  Cc: Andy Lutomirski, Deepak Saxena, Will Drewry, linaro-kernel,
	linux-arm-kernel, Will Deacon, Catalin Marinas, linux-kernel,
	Lee Campbell

On Fri, Jul 25, 2014 at 2:37 AM, AKASHI Takahiro
<takahiro.akashi@linaro.org> wrote:
> I found a bug in my current patch (v5). When 32-bit tracer skips a system call,
> we should not update syscallno from x8 since syscallno is re-written directly
> via ptrace(PTRACE_SET_SYSCALL).

Ah, yes. Will aarch64 have a PTRACE_SET_SYSCALL option, or is this
strictly a 32-bit vs 64-bit issue?

> I'm sure that my next version should work with 32/64-bit tracers on 64-bit
> kernel.

Do you have a git tree uploaded anywhere? I'd love to follow this more
closely. When do you expect a v6?

Thanks!

-Kees

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v5 3/3] arm64: Add seccomp support
  2014-08-05 15:08           ` Kees Cook
@ 2014-08-08  7:35             ` AKASHI Takahiro
  2014-08-11  9:24               ` Will Deacon
  0 siblings, 1 reply; 36+ messages in thread
From: AKASHI Takahiro @ 2014-08-08  7:35 UTC (permalink / raw)
  To: Kees Cook
  Cc: Andy Lutomirski, Deepak Saxena, Will Drewry, linaro-kernel,
	linux-arm-kernel, Will Deacon, Catalin Marinas, linux-kernel,
	Lee Campbell

On 08/06/2014 12:08 AM, Kees Cook wrote:
> On Fri, Jul 25, 2014 at 2:37 AM, AKASHI Takahiro
> <takahiro.akashi@linaro.org> wrote:
>> I found a bug in my current patch (v5). When 32-bit tracer skips a system call,
>> we should not update syscallno from x8 since syscallno is re-written directly
>> via ptrace(PTRACE_SET_SYSCALL).
>
> Ah, yes. Will aarch64 have a PTRACE_SET_SYSCALL option, or is this
> strictly a 32-bit vs 64-bit issue?

As discussed in a few weeks ago, aarch64 won't support PTRACE_SET_SYSCALL.

>> I'm sure that my next version should work with 32/64-bit tracers on 64-bit
>> kernel.
>
> Do you have a git tree uploaded anywhere? I'd love to follow this more
> closely. When do you expect a v6?

I'd like to submit v6 as soon as possible, but
(1) how we should handle syscall(-1) is annoying me.
     Without ptracer, we will normally return -ENOSYS but, for example,
     what if some seccomp filter is installed and it does allow (or doesn't
     have any rule against) '-1' syscall? Since the kernel doesn't know tracer's
     intention, we should just let syscall(-1) return a bogus value.
     Thus we will see inconsistent results of syscall(-1).

(2) I'm investigating some failures in Kees' test suite.
   * 'TRACE.handler' case on compat task:
     Now I found a bug in arm64's compat_siginfo_t and fixed it.
   * 'TSYNC.two_siblings_*' cases on 32/64-bit task:
     I rebased my patch on pre-v3.17-rc1, but those cases still fail.
     I have no clues at this moment.

So please be patient for a while.

-Takahiro AKASHI


> Thanks!
>
> -Kees
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v5 3/3] arm64: Add seccomp support
  2014-08-08  7:35             ` AKASHI Takahiro
@ 2014-08-11  9:24               ` Will Deacon
  2014-08-12  6:57                 ` AKASHI Takahiro
  0 siblings, 1 reply; 36+ messages in thread
From: Will Deacon @ 2014-08-11  9:24 UTC (permalink / raw)
  To: AKASHI Takahiro
  Cc: Kees Cook, Andy Lutomirski, Deepak Saxena, Will Drewry,
	linaro-kernel, linux-arm-kernel, Catalin Marinas, linux-kernel,
	Lee Campbell

On Fri, Aug 08, 2014 at 08:35:42AM +0100, AKASHI Takahiro wrote:
> On 08/06/2014 12:08 AM, Kees Cook wrote:
> > On Fri, Jul 25, 2014 at 2:37 AM, AKASHI Takahiro
> > <takahiro.akashi@linaro.org> wrote:
> >> I found a bug in my current patch (v5). When 32-bit tracer skips a system call,
> >> we should not update syscallno from x8 since syscallno is re-written directly
> >> via ptrace(PTRACE_SET_SYSCALL).
> >
> > Ah, yes. Will aarch64 have a PTRACE_SET_SYSCALL option, or is this
> > strictly a 32-bit vs 64-bit issue?
> 
> As discussed in a few weeks ago, aarch64 won't support PTRACE_SET_SYSCALL.

Well, I don't think anything was set in stone. If you have a compelling
reason why adding the new request gives you something over setting w8
directly, then we can extend ptrace.

Will

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v5 3/3] arm64: Add seccomp support
  2014-08-11  9:24               ` Will Deacon
@ 2014-08-12  6:57                 ` AKASHI Takahiro
  2014-08-12  9:40                   ` Will Deacon
  0 siblings, 1 reply; 36+ messages in thread
From: AKASHI Takahiro @ 2014-08-12  6:57 UTC (permalink / raw)
  To: Will Deacon
  Cc: linaro-kernel, Will Drewry, Kees Cook, Catalin Marinas,
	linux-kernel, Andy Lutomirski, Deepak Saxena, Lee Campbell,
	linux-arm-kernel

Will,

On 08/11/2014 06:24 PM, Will Deacon wrote:
> On Fri, Aug 08, 2014 at 08:35:42AM +0100, AKASHI Takahiro wrote:
>> On 08/06/2014 12:08 AM, Kees Cook wrote:
>>> On Fri, Jul 25, 2014 at 2:37 AM, AKASHI Takahiro
>>> <takahiro.akashi@linaro.org> wrote:
>>>> I found a bug in my current patch (v5). When 32-bit tracer skips a system call,
>>>> we should not update syscallno from x8 since syscallno is re-written directly
>>>> via ptrace(PTRACE_SET_SYSCALL).
>>>
>>> Ah, yes. Will aarch64 have a PTRACE_SET_SYSCALL option, or is this
>>> strictly a 32-bit vs 64-bit issue?
>>
>> As discussed in a few weeks ago, aarch64 won't support PTRACE_SET_SYSCALL.
>
> Well, I don't think anything was set in stone. If you have a compelling
> reason why adding the new request gives you something over setting w8
> directly, then we can extend ptrace.

Yeah, I think I may have to change my mind. Looking into __secure_computing(),
I found the code below:

 >     case SECCOMP_MODE_FILTER:
 >         case SECCOMP_RET_TRACE:
 >             ...
 >             if (syscall_get_nr(current, regs) < 0)
 >                 goto skip;

This implies that we should modify syscallno *before* __secure_computing() returns.

I assumed, in my next version, we could skip a system call by overwriting syscallno
with x8 in syscall_trace_enter() after __secure_computing() returns 0, and it actually
works.
But we'd better implement PTRACE_SET_SYSCALL to comply with what __secure_computing()
expects.

-Takahiro AKASHI


> Will
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v5 3/3] arm64: Add seccomp support
  2014-08-12  6:57                 ` AKASHI Takahiro
@ 2014-08-12  9:40                   ` Will Deacon
  2014-08-12 11:17                     ` AKASHI Takahiro
  0 siblings, 1 reply; 36+ messages in thread
From: Will Deacon @ 2014-08-12  9:40 UTC (permalink / raw)
  To: AKASHI Takahiro
  Cc: linaro-kernel, Will Drewry, Kees Cook, Catalin Marinas,
	linux-kernel, Andy Lutomirski, Deepak Saxena, Lee Campbell,
	linux-arm-kernel

Hi Akashi,

On Tue, Aug 12, 2014 at 07:57:25AM +0100, AKASHI Takahiro wrote:
> On 08/11/2014 06:24 PM, Will Deacon wrote:
> > On Fri, Aug 08, 2014 at 08:35:42AM +0100, AKASHI Takahiro wrote:
> >> As discussed in a few weeks ago, aarch64 won't support PTRACE_SET_SYSCALL.
> >
> > Well, I don't think anything was set in stone. If you have a compelling
> > reason why adding the new request gives you something over setting w8
> > directly, then we can extend ptrace.
> 
> Yeah, I think I may have to change my mind. Looking into __secure_computing(),
> I found the code below:
> 
>  >     case SECCOMP_MODE_FILTER:
>  >         case SECCOMP_RET_TRACE:
>  >             ...
>  >             if (syscall_get_nr(current, regs) < 0)
>  >                 goto skip;
> 
> This implies that we should modify syscallno *before* __secure_computing()
> returns.

Why does it imply that? There are four competing entities here:

 - seccomp
 - tracehook
 - ftrace (trace_sys_*)
 - audit

With the exception of ftrace, they can all potentially rewrite the pt_regs
(the code you cite above is just below a ptrace_event call), so we have
to choose some order in which to call them.

On entry, x86 and arm call them in the order I listed above, so it seems
sensible to follow that.

> I assumed, in my next version, we could skip a system call by overwriting
> syscallno with x8 in syscall_trace_enter() after __secure_computing()
> returns 0, and it actually works.

Why does overwriting the syscallno with x8 skip the syscall?

I thought the idea was that we would save w8 prior to each call that could
change the pt_regs, then if it was changed to -1 we would replace it with
the saved value and return -1? The only confusion I have is whether we
should call the exit hooks after skipping a syscall. I *think* x86 does
call them, but ARM doesn't. Andy says this can trigger an OOPs:

  http://lists.infradead.org/pipermail/linux-arm-kernel/2014-July/274988.html

so we should fix that for ARM while we're here.

Will

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v5 3/3] arm64: Add seccomp support
  2014-08-12  9:40                   ` Will Deacon
@ 2014-08-12 11:17                     ` AKASHI Takahiro
  2014-08-15 14:33                       ` Will Deacon
  0 siblings, 1 reply; 36+ messages in thread
From: AKASHI Takahiro @ 2014-08-12 11:17 UTC (permalink / raw)
  To: Will Deacon
  Cc: linaro-kernel, Will Drewry, Kees Cook, Catalin Marinas,
	linux-kernel, Andy Lutomirski, Deepak Saxena, Lee Campbell,
	linux-arm-kernel

On 08/12/2014 06:40 PM, Will Deacon wrote:
> Hi Akashi,
>
> On Tue, Aug 12, 2014 at 07:57:25AM +0100, AKASHI Takahiro wrote:
>> On 08/11/2014 06:24 PM, Will Deacon wrote:
>>> On Fri, Aug 08, 2014 at 08:35:42AM +0100, AKASHI Takahiro wrote:
>>>> As discussed in a few weeks ago, aarch64 won't support PTRACE_SET_SYSCALL.
>>>
>>> Well, I don't think anything was set in stone. If you have a compelling
>>> reason why adding the new request gives you something over setting w8
>>> directly, then we can extend ptrace.
>>
>> Yeah, I think I may have to change my mind. Looking into __secure_computing(),
>> I found the code below:
>>
>>   >     case SECCOMP_MODE_FILTER:
>>   >         case SECCOMP_RET_TRACE:
>>   >             ...
>>   >             if (syscall_get_nr(current, regs) < 0)
>>   >                 goto skip;
>>
>> This implies that we should modify syscallno *before* __secure_computing()
>> returns.
>
> Why does it imply that? There are four competing entities here:
>
>   - seccomp
>   - tracehook
>   - ftrace (trace_sys_*)
>   - audit
>
> With the exception of ftrace, they can all potentially rewrite the pt_regs
> (the code you cite above is just below a ptrace_event call), so we have
> to choose some order in which to call them.

(audit won't change registers.)

> On entry, x86 and arm call them in the order I listed above, so it seems
> sensible to follow that.

Right, but as far as I understand, ptrace_event() in __secure_computing()
calls ptrace_notify(), and eventually executes ptrace_stop(), which can
be stopped while tracer runs (until ptrace(PTRACE_CONT)?).
So syscall_get_nr() is expected to return -1 if trace changes a syscall number to -1
(as far as sycall_get_nr() refers to syscallno in pt_regs).

That is why I think we should have PTRACE_SET_SYSCALL.

>> I assumed, in my next version, we could skip a system call by overwriting
>> syscallno with x8 in syscall_trace_enter() after __secure_computing()
>> returns 0, and it actually works.
>
> Why does overwriting the syscallno with x8 skip the syscall?
>
> I thought the idea was that we would save w8 prior to each call that could
> change the pt_regs, then if it was changed to -1 we would replace it with
> the saved value and return -1?

I think its the right way to do.
But x86 rewrites orig_ax and arm rewrites syscallno directly, and
refer to these values as "syscall numbers" later on, for example,
see the arguments to audit_syscall_entry().
So if we don't update syscallno, we may see different behaviors from x86 or arm?

> The only confusion I have is whether we
> should call the exit hooks after skipping a syscall. I *think* x86 does
> call them, but ARM doesn't. Andy says this can trigger an OOPs:

Again, right. we should definitely avoid OOPs.
But we may be able to avoid OOPs by not calling entry hooks for skipped system
calls, instead of calling exit hooks, if we rewrite syscallno as mentioned above.
(Please note, as I mentioned, audit_syscall_xx() ignores any request for logging
invalid system calls.)

Thanks,
-Takahiro AKASHI

>    http://lists.infradead.org/pipermail/linux-arm-kernel/2014-July/274988.html
>
> so we should fix that for ARM while we're here.
>
> Will
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v5 3/3] arm64: Add seccomp support
  2014-08-12 11:17                     ` AKASHI Takahiro
@ 2014-08-15 14:33                       ` Will Deacon
  0 siblings, 0 replies; 36+ messages in thread
From: Will Deacon @ 2014-08-15 14:33 UTC (permalink / raw)
  To: AKASHI Takahiro
  Cc: linaro-kernel, Will Drewry, Kees Cook, Catalin Marinas,
	linux-kernel, Andy Lutomirski, Deepak Saxena, Lee Campbell,
	linux-arm-kernel

On Tue, Aug 12, 2014 at 12:17:53PM +0100, AKASHI Takahiro wrote:
> On 08/12/2014 06:40 PM, Will Deacon wrote:
> > On Tue, Aug 12, 2014 at 07:57:25AM +0100, AKASHI Takahiro wrote:
> >>
> >>   >     case SECCOMP_MODE_FILTER:
> >>   >         case SECCOMP_RET_TRACE:
> >>   >             ...
> >>   >             if (syscall_get_nr(current, regs) < 0)
> >>   >                 goto skip;
> >>
> >> This implies that we should modify syscallno *before* __secure_computing()
> >> returns.
> >
> > Why does it imply that? There are four competing entities here:
> >
> >   - seccomp
> >   - tracehook
> >   - ftrace (trace_sys_*)
> >   - audit
> >
> > With the exception of ftrace, they can all potentially rewrite the pt_regs
> > (the code you cite above is just below a ptrace_event call), so we have
> > to choose some order in which to call them.
> 
> (audit won't change registers.)

Sorry, you're quite right.

> > On entry, x86 and arm call them in the order I listed above, so it seems
> > sensible to follow that.
> 
> Right, but as far as I understand, ptrace_event() in __secure_computing()
> calls ptrace_notify(), and eventually executes ptrace_stop(), which can
> be stopped while tracer runs (until ptrace(PTRACE_CONT)?).
> So syscall_get_nr() is expected to return -1 if trace changes a syscall number to -1
> (as far as sycall_get_nr() refers to syscallno in pt_regs).
> 
> That is why I think we should have PTRACE_SET_SYSCALL.

Gotcha, yeah that looks like the cleanest approach after all. Thanks for the
explanation.

Will

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2014-08-15 14:34 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-07-22  9:14 [PATCH v5 0/3] arm64: Add seccomp support AKASHI Takahiro
2014-07-22  9:14 ` [PATCH v5 1/3] arm64: ptrace: reload a syscall number after ptrace operations AKASHI Takahiro
2014-07-22 20:15   ` Kees Cook
2014-07-23  7:03     ` AKASHI Takahiro
2014-07-23  8:25       ` Will Deacon
2014-07-23  9:09         ` AKASHI Takahiro
2014-07-23 15:13       ` Kees Cook
2014-07-24  3:54   ` Andy Lutomirski
2014-07-24  5:57     ` AKASHI Takahiro
2014-07-24 15:01       ` Andy Lutomirski
2014-07-25 10:36         ` AKASHI Takahiro
2014-07-25 11:03           ` Will Deacon
2014-07-29  6:49             ` AKASHI Takahiro
2014-07-29 13:26               ` Will Deacon
2014-07-22  9:14 ` [PATCH v5 2/3] asm-generic: Add generic seccomp.h for secure computing mode 1 AKASHI Takahiro
2014-07-24  3:40   ` Andy Lutomirski
2014-07-24  4:41     ` Kees Cook
2014-07-24  5:17       ` AKASHI Takahiro
2014-07-24 14:57         ` Andy Lutomirski
2014-07-25  8:52           ` AKASHI Takahiro
2014-07-22  9:14 ` [PATCH v5 3/3] arm64: Add seccomp support AKASHI Takahiro
2014-07-24  3:52   ` Andy Lutomirski
2014-07-24  5:40     ` AKASHI Takahiro
2014-07-24 15:00       ` Andy Lutomirski
2014-07-24 15:16         ` Catalin Marinas
2014-07-25  9:37         ` AKASHI Takahiro
2014-08-05 15:08           ` Kees Cook
2014-08-08  7:35             ` AKASHI Takahiro
2014-08-11  9:24               ` Will Deacon
2014-08-12  6:57                 ` AKASHI Takahiro
2014-08-12  9:40                   ` Will Deacon
2014-08-12 11:17                     ` AKASHI Takahiro
2014-08-15 14:33                       ` Will Deacon
2014-07-22 20:16 ` [PATCH v5 0/3] " Kees Cook
2014-07-23  7:09   ` AKASHI Takahiro
2014-07-23 15:36     ` Kees Cook

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).