* [PATCH v4] seccomp: add ptrace options for suspend/resume
@ 2015-06-10  0:49 ` Tycho Andersen
  0 siblings, 0 replies; 29+ messages in thread
From: Tycho Andersen @ 2015-06-10  0:49 UTC (permalink / raw)
  To: linux-kernel, linux-api
  Cc: Tycho Andersen, Kees Cook, Andy Lutomirski, Will Drewry,
	Roland McGrath, Oleg Nesterov, Pavel Emelyanov, Serge E. Hallyn

This patch is the first step in enabling checkpoint/restore of processes
with seccomp enabled.

One of the things CRIU does while dumping tasks is inject code into them
via ptrace to collect information that is only available to the process
itself. However, if we are in a seccomp mode where these processes are
prohibited from making these syscalls, then what CRIU does kills the task.

This patch adds a new ptrace option, PTRACE_O_SUSPEND_SECCOMP, that enables
a task from the init user namespace which has CAP_SYS_ADMIN and no seccomp
filters to disable (and re-enable) seccomp filters for another task so that
they can be successfully dumped (and restored). We restrict the set of
processes that can disable seccomp through ptrace because although today
ptrace can be used to bypass seccomp, there is some discussion of closing
this loophole in the future and we would like this patch to not depend on
that behavior and be future-proofed for when it is removed.

Note that seccomp can be suspended before any filters are actually
installed; this behavior is useful on criu restore, so that we can suspend
seccomp, restore the filters, unmap our restore code from the restored
process' address space, and then resume the task by detaching and have the
filters resumed as well.

v2 changes:

* require that the tracer have no seccomp filters installed
* drop TIF_NOTSC manipulation from the patch
* change from ptrace command to a ptrace option and use this ptrace option
  as the flag to check. This means that as soon as the tracer
  detaches/dies, seccomp is re-enabled and as a corollary that one cannot
  disable seccomp across PTRACE_ATTACHs.

v3 changes:

* get rid of various #ifdefs everywhere
* report more sensible errors when PTRACE_O_SUSPEND_SECCOMP is incorrectly
  used

v4 changes:

* get rid of may_suspend_seccomp() in favor of a capable() check in ptrace
  directly

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
CC: Kees Cook <keescook@chromium.org>
CC: Andy Lutomirski <luto@amacapital.net>
CC: Will Drewry <wad@chromium.org>
CC: Roland McGrath <roland@hack.frob.com>
CC: Oleg Nesterov <oleg@redhat.com>
CC: Pavel Emelyanov <xemul@parallels.com>
CC: Serge E. Hallyn <serge.hallyn@ubuntu.com>
---
 include/linux/ptrace.h      | 1 +
 include/uapi/linux/ptrace.h | 6 ++++--
 kernel/ptrace.c             | 9 +++++++++
 kernel/seccomp.c            | 8 ++++++++
 4 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
index 987a73a..061265f 100644
--- a/include/linux/ptrace.h
+++ b/include/linux/ptrace.h
@@ -34,6 +34,7 @@
 #define PT_TRACE_SECCOMP	PT_EVENT_FLAG(PTRACE_EVENT_SECCOMP)
 
 #define PT_EXITKILL		(PTRACE_O_EXITKILL << PT_OPT_FLAG_SHIFT)
+#define PT_SUSPEND_SECCOMP	(PTRACE_O_SUSPEND_SECCOMP << PT_OPT_FLAG_SHIFT)
 
 /* single stepping state bits (used on ARM and PA-RISC) */
 #define PT_SINGLESTEP_BIT	31
diff --git a/include/uapi/linux/ptrace.h b/include/uapi/linux/ptrace.h
index cf1019e..a7a6979 100644
--- a/include/uapi/linux/ptrace.h
+++ b/include/uapi/linux/ptrace.h
@@ -89,9 +89,11 @@ struct ptrace_peeksiginfo_args {
 #define PTRACE_O_TRACESECCOMP	(1 << PTRACE_EVENT_SECCOMP)
 
 /* eventless options */
-#define PTRACE_O_EXITKILL	(1 << 20)
+#define PTRACE_O_EXITKILL		(1 << 20)
+#define PTRACE_O_SUSPEND_SECCOMP	(1 << 21)
 
-#define PTRACE_O_MASK		(0x000000ff | PTRACE_O_EXITKILL)
+#define PTRACE_O_MASK		(\
+	0x000000ff | PTRACE_O_EXITKILL | PTRACE_O_SUSPEND_SECCOMP)
 
 #include <asm/ptrace.h>
 
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index c8e0e05..11fa460 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -556,6 +556,15 @@ static int ptrace_setoptions(struct task_struct *child, unsigned long data)
 	if (data & ~(unsigned long)PTRACE_O_MASK)
 		return -EINVAL;
 
+	if (unlikely(data & PTRACE_O_SUSPEND_SECCOMP)) {
+		if (!config_enabled(CONFIG_CHECKPOINT_RESTORE) ||
+		    !config_enabled(CONFIG_SECCOMP))
+			return -EINVAL;
+
+		if (!capable(CAP_SYS_ADMIN))
+			return -EPERM;
+	}
+
 	/* Avoid intermediate state when all opts are cleared */
 	flags = child->ptrace;
 	flags &= ~(PTRACE_O_MASK << PT_OPT_FLAG_SHIFT);
diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index 980fd26..645e42d 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -590,6 +590,10 @@ void secure_computing_strict(int this_syscall)
 {
 	int mode = current->seccomp.mode;
 
+	if (config_enabled(CONFIG_CHECKPOINT_RESTORE) &&
+	    unlikely(current->ptrace & PT_SUSPEND_SECCOMP))
+		return;
+
 	if (mode == 0)
 		return;
 	else if (mode == SECCOMP_MODE_STRICT)
@@ -691,6 +695,10 @@ u32 seccomp_phase1(struct seccomp_data *sd)
 	int this_syscall = sd ? sd->nr :
 		syscall_get_nr(current, task_pt_regs(current));
 
+	if (config_enabled(CONFIG_CHECKPOINT_RESTORE) &&
+	    unlikely(current->ptrace & PT_SUSPEND_SECCOMP))
+		return SECCOMP_PHASE1_OK;
+
 	switch (mode) {
 	case SECCOMP_MODE_STRICT:
 		__secure_computing_strict(this_syscall);  /* may call do_exit */
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 29+ messages in thread
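
To make the control flow of the kernel/ptrace.c hunk above easier to follow,
here is a minimal userspace model of it. This is a sketch, not kernel code.
The option values and the order of the checks are taken from the patch;
PT_OPT_FLAG_SHIFT is assumed to be 3 (its value in include/linux/ptrace.h is
not shown in the diff), and struct model_env and model_setoptions() are
hypothetical names standing in for config_enabled() and
capable(CAP_SYS_ADMIN).

```c
#include <errno.h>
#include <stdbool.h>

/* Values copied from the patch. */
#define PTRACE_O_EXITKILL        (1UL << 20)
#define PTRACE_O_SUSPEND_SECCOMP (1UL << 21)
#define PTRACE_O_MASK \
	(0x000000ffUL | PTRACE_O_EXITKILL | PTRACE_O_SUSPEND_SECCOMP)

/* Assumption: shift used when options are stored in task->ptrace. */
#define PT_OPT_FLAG_SHIFT  3
#define PT_SUSPEND_SECCOMP (PTRACE_O_SUSPEND_SECCOMP << PT_OPT_FLAG_SHIFT)

/* Hypothetical stand-ins for kernel config and capable(CAP_SYS_ADMIN). */
struct model_env {
	bool checkpoint_restore;	/* CONFIG_CHECKPOINT_RESTORE */
	bool seccomp;			/* CONFIG_SECCOMP */
	bool cap_sys_admin;		/* capable(CAP_SYS_ADMIN) */
};

/*
 * Userspace model of the v4 ptrace_setoptions() logic. On success the
 * option bits are stored, shifted, in *child_ptrace and 0 is returned.
 * On failure *child_ptrace is left untouched.
 */
static int model_setoptions(const struct model_env *env,
			    unsigned long *child_ptrace, unsigned long data)
{
	unsigned long flags;

	/* Unknown option bits are rejected first. */
	if (data & ~PTRACE_O_MASK)
		return -EINVAL;

	if (data & PTRACE_O_SUSPEND_SECCOMP) {
		/* Option is meaningless without both config symbols. */
		if (!env->checkpoint_restore || !env->seccomp)
			return -EINVAL;

		if (!env->cap_sys_admin)
			return -EPERM;
	}

	/* Replace all option bits at once, keeping unrelated flags. */
	flags = *child_ptrace;
	flags &= ~(PTRACE_O_MASK << PT_OPT_FLAG_SHIFT);
	flags |= data << PT_OPT_FLAG_SHIFT;
	*child_ptrace = flags;
	return 0;
}
```

With cap_sys_admin false the model returns -EPERM and leaves the stored flags
alone; with all three fields true it stores the option at bit 24 (21 plus the
assumed shift).
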

* Re: [PATCH v4] seccomp: add ptrace options for suspend/resume
@ 2015-06-10  1:08   ` Andy Lutomirski
  0 siblings, 0 replies; 29+ messages in thread
From: Andy Lutomirski @ 2015-06-10  1:08 UTC (permalink / raw)
  To: Tycho Andersen
  Cc: linux-kernel, Linux API, Kees Cook, Will Drewry, Roland McGrath,
	Oleg Nesterov, Pavel Emelyanov, Serge E. Hallyn

On Tue, Jun 9, 2015 at 5:49 PM, Tycho Andersen
<tycho.andersen@canonical.com> wrote:
> This patch is the first step in enabling checkpoint/restore of processes
> with seccomp enabled.
>
> One of the things CRIU does while dumping tasks is inject code into them
> via ptrace to collect information that is only available to the process
> itself. However, if we are in a seccomp mode where these processes are
> prohibited from making these syscalls, then what CRIU does kills the task.
>
> This patch adds a new ptrace option, PTRACE_O_SUSPEND_SECCOMP, that enables
> a task from the init user namespace which has CAP_SYS_ADMIN and no seccomp
> filters to disable (and re-enable) seccomp filters for another task so that
> they can be successfully dumped (and restored). We restrict the set of
> processes that can disable seccomp through ptrace because although today
> ptrace can be used to bypass seccomp, there is some discussion of closing
> this loophole in the future and we would like this patch to not depend on
> that behavior and be future-proofed for when it is removed.
>
> Note that seccomp can be suspended before any filters are actually
> installed; this behavior is useful on criu restore, so that we can suspend
> seccomp, restore the filters, unmap our restore code from the restored
> process' address space, and then resume the task by detaching and have the
> filters resumed as well.
>
> v2 changes:
>
> * require that the tracer have no seccomp filters installed
> * drop TIF_NOTSC manipulation from the patch
> * change from ptrace command to a ptrace option and use this ptrace option
>   as the flag to check. This means that as soon as the tracer
>   detaches/dies, seccomp is re-enabled and as a corollary that one cannot
>   disable seccomp across PTRACE_ATTACHs.
>
> v3 changes:
>
> * get rid of various #ifdefs everywhere
> * report more sensible errors when PTRACE_O_SUSPEND_SECCOMP is incorrectly
>   used
>
> v4 changes:
>
> * get rid of may_suspend_seccomp() in favor of a capable() check in ptrace
>   directly
>
> Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
> CC: Kees Cook <keescook@chromium.org>
> CC: Andy Lutomirski <luto@amacapital.net>
> CC: Will Drewry <wad@chromium.org>
> CC: Roland McGrath <roland@hack.frob.com>
> CC: Oleg Nesterov <oleg@redhat.com>
> CC: Pavel Emelyanov <xemul@parallels.com>
> CC: Serge E. Hallyn <serge.hallyn@ubuntu.com>
> ---
>  include/linux/ptrace.h      | 1 +
>  include/uapi/linux/ptrace.h | 6 ++++--
>  kernel/ptrace.c             | 9 +++++++++
>  kernel/seccomp.c            | 8 ++++++++
>  4 files changed, 22 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
> index 987a73a..061265f 100644
> --- a/include/linux/ptrace.h
> +++ b/include/linux/ptrace.h
> @@ -34,6 +34,7 @@
>  #define PT_TRACE_SECCOMP       PT_EVENT_FLAG(PTRACE_EVENT_SECCOMP)
>
>  #define PT_EXITKILL            (PTRACE_O_EXITKILL << PT_OPT_FLAG_SHIFT)
> +#define PT_SUSPEND_SECCOMP     (PTRACE_O_SUSPEND_SECCOMP << PT_OPT_FLAG_SHIFT)
>
>  /* single stepping state bits (used on ARM and PA-RISC) */
>  #define PT_SINGLESTEP_BIT      31
> diff --git a/include/uapi/linux/ptrace.h b/include/uapi/linux/ptrace.h
> index cf1019e..a7a6979 100644
> --- a/include/uapi/linux/ptrace.h
> +++ b/include/uapi/linux/ptrace.h
> @@ -89,9 +89,11 @@ struct ptrace_peeksiginfo_args {
>  #define PTRACE_O_TRACESECCOMP  (1 << PTRACE_EVENT_SECCOMP)
>
>  /* eventless options */
> -#define PTRACE_O_EXITKILL      (1 << 20)
> +#define PTRACE_O_EXITKILL              (1 << 20)
> +#define PTRACE_O_SUSPEND_SECCOMP       (1 << 21)
>
> -#define PTRACE_O_MASK          (0x000000ff | PTRACE_O_EXITKILL)
> +#define PTRACE_O_MASK          (\
> +       0x000000ff | PTRACE_O_EXITKILL | PTRACE_O_SUSPEND_SECCOMP)
>
>  #include <asm/ptrace.h>
>
> diff --git a/kernel/ptrace.c b/kernel/ptrace.c
> index c8e0e05..11fa460 100644
> --- a/kernel/ptrace.c
> +++ b/kernel/ptrace.c
> @@ -556,6 +556,15 @@ static int ptrace_setoptions(struct task_struct *child, unsigned long data)
>         if (data & ~(unsigned long)PTRACE_O_MASK)
>                 return -EINVAL;
>
> +       if (unlikely(data & PTRACE_O_SUSPEND_SECCOMP)) {
> +               if (!config_enabled(CONFIG_CHECKPOINT_RESTORE) ||
> +                   !config_enabled(CONFIG_SECCOMP))
> +                       return -EINVAL;
> +
> +               if (!capable(CAP_SYS_ADMIN))
> +                       return -EPERM;

I tend to think that we should also require that current not be using
seccomp.  Otherwise, in principle, there's a seccomp bypass for
privileged-but-seccomped programs.  In any event, CRIU isn't going to
work well if you run the restorer under seccomp, since it'll start
nesting in a manner that probably isn't desirable.

> +       }
> +
>         /* Avoid intermediate state when all opts are cleared */
>         flags = child->ptrace;
>         flags &= ~(PTRACE_O_MASK << PT_OPT_FLAG_SHIFT);
> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> index 980fd26..645e42d 100644
> --- a/kernel/seccomp.c
> +++ b/kernel/seccomp.c
> @@ -590,6 +590,10 @@ void secure_computing_strict(int this_syscall)
>  {
>         int mode = current->seccomp.mode;
>
> +       if (config_enabled(CONFIG_CHECKPOINT_RESTORE) &&
> +           unlikely(current->ptrace & PT_SUSPEND_SECCOMP))
> +               return;
> +
>         if (mode == 0)
>                 return;
>         else if (mode == SECCOMP_MODE_STRICT)
> @@ -691,6 +695,10 @@ u32 seccomp_phase1(struct seccomp_data *sd)
>         int this_syscall = sd ? sd->nr :
>                 syscall_get_nr(current, task_pt_regs(current));
>
> +       if (config_enabled(CONFIG_CHECKPOINT_RESTORE) &&
> +           unlikely(current->ptrace & PT_SUSPEND_SECCOMP))
> +               return SECCOMP_PHASE1_OK;
> +

If it's not hard, it might still be nice to try to fold this into
mode.  This code is rather hot.  If it would be a mess, then don't
worry about it for now.

Otherwise seems reasonable.

--Andy

^ permalink raw reply	[flat|nested] 29+ messages in thread
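
The extra requirement suggested in the reply above (the tracer must not be
using seccomp itself) could look roughly like the following userspace model.
It is only a sketch under stated assumptions: struct model_tracer and
model_may_suspend() are hypothetical names, PT_OPT_FLAG_SHIFT is assumed to
be 3, and treating a tracer whose own seccomp is merely suspended as "using
seccomp" is an assumption that goes beyond what the reply says. The
SECCOMP_MODE_* values are the ones from the uapi header.

```c
#include <errno.h>
#include <stdbool.h>

#define PTRACE_O_SUSPEND_SECCOMP (1UL << 21)	/* from the patch */
#define PT_OPT_FLAG_SHIFT        3		/* assumption */
#define PT_SUSPEND_SECCOMP (PTRACE_O_SUSPEND_SECCOMP << PT_OPT_FLAG_SHIFT)

#define SECCOMP_MODE_DISABLED 0
#define SECCOMP_MODE_STRICT   1
#define SECCOMP_MODE_FILTER   2

/* Hypothetical model of the calling task ("current" in the kernel). */
struct model_tracer {
	bool cap_sys_admin;	/* capable(CAP_SYS_ADMIN) */
	int seccomp_mode;	/* current->seccomp.mode */
	unsigned long ptrace;	/* current->ptrace */
};

/* Returns 0 if the tracer may set PTRACE_O_SUSPEND_SECCOMP, else -EPERM. */
static int model_may_suspend(const struct model_tracer *cur)
{
	if (!cur->cap_sys_admin)
		return -EPERM;

	/*
	 * Suggested extra condition: a privileged-but-seccomped tracer
	 * must not be able to lift seccomp for another task. A tracer
	 * whose own seccomp is suspended is rejected too (assumption).
	 */
	if (cur->seccomp_mode != SECCOMP_MODE_DISABLED ||
	    (cur->ptrace & PT_SUSPEND_SECCOMP))
		return -EPERM;

	return 0;
}
```

Both failures map to -EPERM here, so userspace cannot tell them apart; that
choice is part of the sketch, not something the thread settles.
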

* Re: [PATCH v4] seccomp: add ptrace options for suspend/resume
@ 2015-06-10 15:19     ` Tycho Andersen
  0 siblings, 0 replies; 29+ messages in thread
From: Tycho Andersen @ 2015-06-10 15:19 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: linux-kernel, Linux API, Kees Cook, Will Drewry, Roland McGrath,
	Oleg Nesterov, Pavel Emelyanov, Serge E. Hallyn

Hi Andy,

On Tue, Jun 09, 2015 at 06:08:42PM -0700, Andy Lutomirski wrote:
>
> > +       if (unlikely(data & PTRACE_O_SUSPEND_SECCOMP)) {
> > +               if (!config_enabled(CONFIG_CHECKPOINT_RESTORE) ||
> > +                   !config_enabled(CONFIG_SECCOMP))
> > +                       return -EINVAL;
> > +
> > +               if (!capable(CAP_SYS_ADMIN))
> > +                       return -EPERM;
> 
> I tend to think that we should also require that current not be using
> seccomp.  Otherwise, in principle, there's a seccomp bypass for
> privileged-but-seccomped programs.  In any event, CRIU isn't going to
> work well if you run the restorer under seccomp, since it'll start
> nesting in a manner that probably isn't desirable.

Ok, I can resend with that. (sorry Oleg :)

> > +       }
> > +
> >         /* Avoid intermediate state when all opts are cleared */
> >         flags = child->ptrace;
> >         flags &= ~(PTRACE_O_MASK << PT_OPT_FLAG_SHIFT);
> > diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> > index 980fd26..645e42d 100644
> > --- a/kernel/seccomp.c
> > +++ b/kernel/seccomp.c
> > @@ -590,6 +590,10 @@ void secure_computing_strict(int this_syscall)
> >  {
> >         int mode = current->seccomp.mode;
> >
> > +       if (config_enabled(CONFIG_CHECKPOINT_RESTORE) &&
> > +           unlikely(current->ptrace & PT_SUSPEND_SECCOMP))
> > +               return;
> > +
> >         if (mode == 0)
> >                 return;
> >         else if (mode == SECCOMP_MODE_STRICT)
> > @@ -691,6 +695,10 @@ u32 seccomp_phase1(struct seccomp_data *sd)
> >         int this_syscall = sd ? sd->nr :
> >                 syscall_get_nr(current, task_pt_regs(current));
> >
> > +       if (config_enabled(CONFIG_CHECKPOINT_RESTORE) &&
> > +           unlikely(current->ptrace & PT_SUSPEND_SECCOMP))
> > +               return SECCOMP_PHASE1_OK;
> > +
> 
> If it's not hard, it might still be nice to try to fold this into
> mode.  This code is rather hot.  If it would be a mess, then don't
> worry about it for now.

The part I'm not immediately clear on is what to do when the tracer
dies and the task is running. Oleg pointed out that we can't play with
TIF_SECCOMP (or we could, but restoring it in this case is
complicated), and I'm not sure if playing with ->seccomp.mode has any
similar complications. I /think/ it should be ok to just re-enable it,
but I'm not sure.

I'd like to leave this patch as is (modulo the extra check) for now.
I'm still looking at a way to export mode 2 filters, so there will
hopefully be more patches in this area soon and we can reexamine then.

Thanks for the review.

Tycho

^ permalink raw reply	[flat|nested] 29+ messages in thread
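
The concern about a tracer dying while the tracee runs is easier to see with
a small model of why the patch keeps the state in ->ptrace rather than in
->seccomp.mode. The sketch below is userspace C, not kernel code: struct
model_task, model_seccomp_applies() and model_detach() are hypothetical
names, PT_OPT_FLAG_SHIFT is assumed to be 3, and the detach step assumes the
kernel clears the tracee's ptrace flags when the tracer goes away.

```c
#include <stdbool.h>

#define PTRACE_O_SUSPEND_SECCOMP (1UL << 21)	/* from the patch */
#define PT_OPT_FLAG_SHIFT        3		/* assumption */
#define PT_SUSPEND_SECCOMP (PTRACE_O_SUSPEND_SECCOMP << PT_OPT_FLAG_SHIFT)

#define SECCOMP_MODE_DISABLED 0
#define SECCOMP_MODE_FILTER   2

/* Hypothetical model of the tracee. */
struct model_task {
	unsigned long ptrace;	/* task->ptrace */
	int seccomp_mode;	/* task->seccomp.mode, never modified here */
};

/*
 * Mirrors the early return added to seccomp_phase1(): returns true if a
 * syscall made by the task would be checked by seccomp.
 */
static bool model_seccomp_applies(const struct model_task *t)
{
	if (t->ptrace & PT_SUSPEND_SECCOMP)
		return false;

	return t->seccomp_mode != SECCOMP_MODE_DISABLED;
}

/* Detach or tracer death: all ptrace flags go away (assumption). */
static void model_detach(struct model_task *t)
{
	t->ptrace = 0;
}
```

Because seccomp_mode is never written, there is nothing to restore on detach:
clearing the ptrace flags is enough for filtering to resume.
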

* Re: [PATCH v4] seccomp: add ptrace options for suspend/resume
  2015-06-10  1:08   ` Andy Lutomirski
  (?)
  (?)
@ 2015-06-10 16:31   ` Oleg Nesterov
  2015-06-10 17:20       ` Andy Lutomirski
  -1 siblings, 1 reply; 29+ messages in thread
From: Oleg Nesterov @ 2015-06-10 16:31 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Tycho Andersen, linux-kernel, Linux API, Kees Cook, Will Drewry,
	Roland McGrath, Pavel Emelyanov, Serge E. Hallyn

On 06/09, Andy Lutomirski wrote:
>
> On Tue, Jun 9, 2015 at 5:49 PM, Tycho Andersen
> >
> > @@ -556,6 +556,15 @@ static int ptrace_setoptions(struct task_struct *child, unsigned long data)
> >         if (data & ~(unsigned long)PTRACE_O_MASK)
> >                 return -EINVAL;
> >
> > +       if (unlikely(data & PTRACE_O_SUSPEND_SECCOMP)) {

Well, we should do this if

			(data & O_SUSPEND) && !(flags & O_SUSPEND)

or at least if

			(data ^ flags) & O_SUSPEND


> > +               if (!config_enabled(CONFIG_CHECKPOINT_RESTORE) ||
> > +                   !config_enabled(CONFIG_SECCOMP))
> > +                       return -EINVAL;
> > +
> > +               if (!capable(CAP_SYS_ADMIN))
> > +                       return -EPERM;
>
> I tend to think that we should also require that current not be using
> seccomp.  Otherwise, in principle, there's a seccomp bypass for
> privileged-but-seccomped programs.

Andy, I simply can't understand why we need any security check at all.

OK, yes, in theory we can have a seccomped CAP_SYS_ADMIN process, seccomp
doesn't filter ptrace, you hack that process and force it to attach to
another CAP_SYS_ADMIN/seccomped process, etc, etc... Looks too paranoid
to me.

But damn, I said many times that I won't argue ;)

> > @@ -590,6 +590,10 @@ void secure_computing_strict(int this_syscall)
> >  {
> >         int mode = current->seccomp.mode;
> >
> > +       if (config_enabled(CONFIG_CHECKPOINT_RESTORE) &&
> > +           unlikely(current->ptrace & PT_SUSPEND_SECCOMP))
> > +               return;
> > +
> >         if (mode == 0)
> >                 return;
> >         else if (mode == SECCOMP_MODE_STRICT)
> > @@ -691,6 +695,10 @@ u32 seccomp_phase1(struct seccomp_data *sd)
> >         int this_syscall = sd ? sd->nr :
> >                 syscall_get_nr(current, task_pt_regs(current));
> >
> > +       if (config_enabled(CONFIG_CHECKPOINT_RESTORE) &&
> > +           unlikely(current->ptrace & PT_SUSPEND_SECCOMP))
> > +               return SECCOMP_PHASE1_OK;
> > +
>
> If it's not hard, it might still be nice to try to fold this into
> mode.  This code is rather hot.  If it would be a mess, then don't
> worry about it for now.

IMO, this would be a mess ;) At least compared to this simple patch.

Suppose we add SECCOMP_MODE_SUSPENDED. This adds problems with detach
if the tracer dies, and that is not all.

We need to change copy_seccomp(). And it is not clear what we should
do if the child is traced too.

We need to change prctl_set_seccomp() paths.

And even the "tracee->seccomp.mode = SECCOMP_MODE_SUSPENDED" code needs
some locking even if the tracee is stopped; we need to avoid races
with SECCOMP_FILTER_FLAG_TSYNC from other threads.
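
A small userspace model may make the trade-off concrete. It assumes the
suspend state lives in the tracee's ptrace flags, as in the patch, so
detaching clears it without ever writing seccomp.mode. The struct, enum
and function names here are hypothetical and are not the kernel's.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for PT_SUSPEND_SECCOMP. */
#define MODEL_PT_SUSPEND_SECCOMP (1U << 0)

enum model_mode { MODEL_DISABLED = 0, MODEL_STRICT = 1, MODEL_FILTER = 2 };

struct model_task {
	unsigned int ptrace;   /* flags owned by the tracer */
	enum model_mode mode;  /* seccomp mode, never written by suspend */
};

/* Mirrors the early return in the patch: suspended means no enforcement. */
static bool model_seccomp_enforced(const struct model_task *t)
{
	if (t->ptrace & MODEL_PT_SUSPEND_SECCOMP)
		return false;
	return t->mode != MODEL_DISABLED;
}

/* Detach, or tracer death, drops every ptrace flag at once. */
static void model_detach(struct model_task *t)
{
	t->ptrace = 0;
}

/* Suspend, then detach, and confirm the mode was never modified. */
static void check_suspend_cycle(void)
{
	struct model_task t = { .ptrace = 0, .mode = MODEL_FILTER };

	assert(model_seccomp_enforced(&t));
	t.ptrace |= MODEL_PT_SUSPEND_SECCOMP;
	assert(!model_seccomp_enforced(&t));
	model_detach(&t);
	assert(model_seccomp_enforced(&t));
	assert(t.mode == MODEL_FILTER);
}
```

Because the mode field is untouched, nothing in this model needs to
restore a saved mode on detach or coordinate with other threads that may
be changing it.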

Oleg.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v4] seccomp: add ptrace options for suspend/resume
@ 2015-06-10 17:20       ` Andy Lutomirski
  0 siblings, 0 replies; 29+ messages in thread
From: Andy Lutomirski @ 2015-06-10 17:20 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Tycho Andersen, linux-kernel, Linux API, Kees Cook, Will Drewry,
	Roland McGrath, Pavel Emelyanov, Serge E. Hallyn

On Wed, Jun 10, 2015 at 9:31 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> On 06/09, Andy Lutomirski wrote:
>>
>> On Tue, Jun 9, 2015 at 5:49 PM, Tycho Andersen
>> >
>> > @@ -556,6 +556,15 @@ static int ptrace_setoptions(struct task_struct *child, unsigned long data)
>> >         if (data & ~(unsigned long)PTRACE_O_MASK)
>> >                 return -EINVAL;
>> >
>> > +       if (unlikely(data & PTRACE_O_SUSPEND_SECCOMP)) {
>
> Well, we should do this if
>
>                         (data & O_SUSPEND) && !(flags & O_SUSPEND)
>
> or at least if
>
>                         (data ^ flags) & O_SUSPEND
>
>
>> > +               if (!config_enabled(CONFIG_CHECKPOINT_RESTORE) ||
>> > +                   !config_enabled(CONFIG_SECCOMP))
>> > +                       return -EINVAL;
>> > +
>> > +               if (!capable(CAP_SYS_ADMIN))
>> > +                       return -EPERM;
>>
>> I tend to think that we should also require that current not be using
>> seccomp.  Otherwise, in principle, there's a seccomp bypass for
>> privileged-but-seccomped programs.
>
> Andy, I simply can't understand why do we need any security check at all.
>
> OK, yes, in theory we can have a seccomped CAP_SYS_ADMIN process, seccomp
> doesn't filter ptrace, you hack that process and force it to attach to
> another CAP_SYS_ADMIN/seccomped process, etc, etc... Looks too paranoid
> to me.

I've sometimes considered having privileged processes I write fork and
seccomp their child.  Of course, if you're allowing ptrace through
your seccomp filter, you open a giant can of worms, but I think we
should take the more paranoid approach to start and relax it later as
needed.  After all, for the intended use of this patch, stuff will
break regardless of what we do if the ptracer is itself seccomped.

I could be convinced that if the ptracer is outside seccomp then we
shouldn't need the CAP_SYS_ADMIN check.  That would at least make this
work in a user namespace.

>> > @@ -590,6 +590,10 @@ void secure_computing_strict(int this_syscall)
>> >  {
>> >         int mode = current->seccomp.mode;
>> >
>> > +       if (config_enabled(CONFIG_CHECKPOINT_RESTORE) &&
>> > +           unlikely(current->ptrace & PT_SUSPEND_SECCOMP))
>> > +               return;
>> > +
>> >         if (mode == 0)
>> >                 return;
>> >         else if (mode == SECCOMP_MODE_STRICT)
>> > @@ -691,6 +695,10 @@ u32 seccomp_phase1(struct seccomp_data *sd)
>> >         int this_syscall = sd ? sd->nr :
>> >                 syscall_get_nr(current, task_pt_regs(current));
>> >
>> > +       if (config_enabled(CONFIG_CHECKPOINT_RESTORE) &&
>> > +           unlikely(current->ptrace & PT_SUSPEND_SECCOMP))
>> > +               return SECCOMP_PHASE1_OK;
>> > +
>>
>> If it's not hard, it might still be nice to try to fold this into
>> mode.  This code is rather hot.  If it would be a mess, then don't
>> worry about it for now.
>
> IMO, this would be a mess ;) At least compared to this simple patch.
>
> Suppose we add SECCOMP_MODE_SUSPENDED. Not only this adds the problems
> with detach if the tracer dies.
>
> We need to change copy_seccomp(). And it is not clear what should we
> do if the child is traced too.
>
> We need to change prctl_set_seccomp() paths.
>
> And even the "tracee->seccomp.mode = SECCOMP_MODE_SUSPENDED" code needs
> some locking even if the tracee is stopped, we need to avoid the races
> with SECCOMP_FILTER_FLAG_TSYNC from other threads.
>

Agreed.  Let's hold off until this becomes a problem (if it ever does).

--Andy

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v4] seccomp: add ptrace options for suspend/resume
@ 2015-06-10 17:29         ` Serge Hallyn
  0 siblings, 0 replies; 29+ messages in thread
From: Serge Hallyn @ 2015-06-10 17:29 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Oleg Nesterov, Tycho Andersen, linux-kernel, Linux API,
	Kees Cook, Will Drewry, Roland McGrath, Pavel Emelyanov

Quoting Andy Lutomirski (luto@amacapital.net):
> On Wed, Jun 10, 2015 at 9:31 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> > On 06/09, Andy Lutomirski wrote:
> >>
> >> On Tue, Jun 9, 2015 at 5:49 PM, Tycho Andersen
> >> >
> >> > @@ -556,6 +556,15 @@ static int ptrace_setoptions(struct task_struct *child, unsigned long data)
> >> >         if (data & ~(unsigned long)PTRACE_O_MASK)
> >> >                 return -EINVAL;
> >> >
> >> > +       if (unlikely(data & PTRACE_O_SUSPEND_SECCOMP)) {
> >
> > Well, we should do this if
> >
> >                         (data & O_SUSPEND) && !(flags & O_SUSPEND)
> >
> > or at least if
> >
> >                         (data ^ flags) & O_SUSPEND
> >
> >
> >> > +               if (!config_enabled(CONFIG_CHECKPOINT_RESTORE) ||
> >> > +                   !config_enabled(CONFIG_SECCOMP))
> >> > +                       return -EINVAL;
> >> > +
> >> > +               if (!capable(CAP_SYS_ADMIN))
> >> > +                       return -EPERM;
> >>
> >> I tend to think that we should also require that current not be using
> >> seccomp.  Otherwise, in principle, there's a seccomp bypass for
> >> privileged-but-seccomped programs.
> >
> > Andy, I simply can't understand why do we need any security check at all.
> >
> > OK, yes, in theory we can have a seccomped CAP_SYS_ADMIN process, seccomp
> > doesn't filter ptrace, you hack that process and force it to attach to
> > another CAP_SYS_ADMIN/seccomped process, etc, etc... Looks too paranoid
> > to me.
> 
> I've sometimes considered having privileged processes I write fork and
> seccomp their child.  Of course, if you're allowing ptrace through
> your seccomp filter, you open a giant can of worms, but I think we
> should take the more paranoid approach to start and relax it later as

I really do intend to look at your old proposed tree for improving that...
have only done a once-over so far, though.

> needed.  After all, for the intended use of this patch, stuff will
> break regardless of what we do if the ptracer is itself seccomped.
> 
> I could be convinced that if the ptracer is outside seccomp then we
> shouldn't need the CAP_SYS_ADMIN check.  That would at least make this
> work in a user namespace.
> 
> >> > @@ -590,6 +590,10 @@ void secure_computing_strict(int this_syscall)
> >> >  {
> >> >         int mode = current->seccomp.mode;
> >> >
> >> > +       if (config_enabled(CONFIG_CHECKPOINT_RESTORE) &&
> >> > +           unlikely(current->ptrace & PT_SUSPEND_SECCOMP))
> >> > +               return;
> >> > +
> >> >         if (mode == 0)
> >> >                 return;
> >> >         else if (mode == SECCOMP_MODE_STRICT)
> >> > @@ -691,6 +695,10 @@ u32 seccomp_phase1(struct seccomp_data *sd)
> >> >         int this_syscall = sd ? sd->nr :
> >> >                 syscall_get_nr(current, task_pt_regs(current));
> >> >
> >> > +       if (config_enabled(CONFIG_CHECKPOINT_RESTORE) &&
> >> > +           unlikely(current->ptrace & PT_SUSPEND_SECCOMP))
> >> > +               return SECCOMP_PHASE1_OK;
> >> > +
> >>
> >> If it's not hard, it might still be nice to try to fold this into
> >> mode.  This code is rather hot.  If it would be a mess, then don't
> >> worry about it for now.
> >
> > IMO, this would be a mess ;) At least compared to this simple patch.
> >
> > Suppose we add SECCOMP_MODE_SUSPENDED. Not only this adds the problems
> > with detach if the tracer dies.
> >
> > We need to change copy_seccomp(). And it is not clear what should we
> > do if the child is traced too.
> >
> > We need to change prctl_set_seccomp() paths.
> >
> > And even the "tracee->seccomp.mode = SECCOMP_MODE_SUSPENDED" code needs
> > some locking even if the tracee is stopped, we need to avoid the races
> > with SECCOMP_FILTER_FLAG_TSYNC from other threads.
> >
> 
> Agreed.  Let's hold off until this becomes a problem (if it ever does).
> 
> --Andy

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v4] seccomp: add ptrace options for suspend/resume
  2015-06-10 17:29         ` Serge Hallyn
@ 2015-06-10 17:42           ` Andy Lutomirski
  -1 siblings, 0 replies; 29+ messages in thread
From: Andy Lutomirski @ 2015-06-10 17:42 UTC (permalink / raw)
  To: Serge Hallyn
  Cc: Oleg Nesterov, Tycho Andersen, linux-kernel, Linux API,
	Kees Cook, Will Drewry, Roland McGrath, Pavel Emelyanov

On Wed, Jun 10, 2015 at 10:29 AM, Serge Hallyn <serge.hallyn@ubuntu.com> wrote:
> Quoting Andy Lutomirski (luto@amacapital.net):
>> On Wed, Jun 10, 2015 at 9:31 AM, Oleg Nesterov <oleg@redhat.com> wrote:
>> > On 06/09, Andy Lutomirski wrote:
>> >>
>> >> On Tue, Jun 9, 2015 at 5:49 PM, Tycho Andersen
>> >> >
>> >> > @@ -556,6 +556,15 @@ static int ptrace_setoptions(struct task_struct *child, unsigned long data)
>> >> >         if (data & ~(unsigned long)PTRACE_O_MASK)
>> >> >                 return -EINVAL;
>> >> >
>> >> > +       if (unlikely(data & PTRACE_O_SUSPEND_SECCOMP)) {
>> >
>> > Well, we should do this if
>> >
>> >                         (data & O_SUSPEND) && !(flags & O_SUSPEND)
>> >
>> > or at least if
>> >
>> >                         (data ^ flags) & O_SUSPEND
>> >
>> >
>> >> > +               if (!config_enabled(CONFIG_CHECKPOINT_RESTORE) ||
>> >> > +                   !config_enabled(CONFIG_SECCOMP))
>> >> > +                       return -EINVAL;
>> >> > +
>> >> > +               if (!capable(CAP_SYS_ADMIN))
>> >> > +                       return -EPERM;
>> >>
>> >> I tend to think that we should also require that current not be using
>> >> seccomp.  Otherwise, in principle, there's a seccomp bypass for
>> >> privileged-but-seccomped programs.
>> >
>> > Andy, I simply can't understand why do we need any security check at all.
>> >
>> > OK, yes, in theory we can have a seccomped CAP_SYS_ADMIN process, seccomp
>> > doesn't filter ptrace, you hack that process and force it to attach to
>> > another CAP_SYS_ADMIN/seccomped process, etc, etc... Looks too paranoid
>> > to me.
>>
>> I've sometimes considered having privileged processes I write fork and
>> seccomp their child.  Of course, if you're allowing ptrace through
>> your seccomp filter, you open a giant can of worms, but I think we
>> should take the more paranoid approach to start and relax it later as
>
> I really do intend to look at your old proposed tree for improving that...
> have only done a once-over so far, though.

Don't read it yet.  It's unnecessarily complicated due to the mess
that is x86's entry code, and I want to clean up the entry code first.

--Andy

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v4] seccomp: add ptrace options for suspend/resume
@ 2015-06-10 19:20         ` Oleg Nesterov
  0 siblings, 0 replies; 29+ messages in thread
From: Oleg Nesterov @ 2015-06-10 19:20 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Tycho Andersen, linux-kernel, Linux API, Kees Cook, Will Drewry,
	Roland McGrath, Pavel Emelyanov, Serge E. Hallyn

On 06/10, Andy Lutomirski wrote:
>
> On Wed, Jun 10, 2015 at 9:31 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> >
> > Andy, I simply can't understand why do we need any security check at all.
...
> I think we
> should take the more paranoid approach to start and relax it later as
> needed.

OK. I didn't really try to argue. I actually replied to the other part
of your email, and simply could not resist ;)

Oleg.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v4] seccomp: add ptrace options for suspend/resume
@ 2015-06-10 20:18         ` Kees Cook
  0 siblings, 0 replies; 29+ messages in thread
From: Kees Cook @ 2015-06-10 20:18 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Oleg Nesterov, Tycho Andersen, linux-kernel, Linux API,
	Will Drewry, Roland McGrath, Pavel Emelyanov, Serge E. Hallyn

On Wed, Jun 10, 2015 at 10:20 AM, Andy Lutomirski <luto@amacapital.net> wrote:
> On Wed, Jun 10, 2015 at 9:31 AM, Oleg Nesterov <oleg@redhat.com> wrote:
>> On 06/09, Andy Lutomirski wrote:
>>>
>>> On Tue, Jun 9, 2015 at 5:49 PM, Tycho Andersen
>>> >
>>> > @@ -556,6 +556,15 @@ static int ptrace_setoptions(struct task_struct *child, unsigned long data)
>>> >         if (data & ~(unsigned long)PTRACE_O_MASK)
>>> >                 return -EINVAL;
>>> >
>>> > +       if (unlikely(data & PTRACE_O_SUSPEND_SECCOMP)) {
>>
>> Well, we should do this if
>>
>>                         (data & O_SUSPEND) && !(flags & O_SUSPEND)
>>
>> or at least if
>>
>>                         (data ^ flags) & O_SUSPEND
>>
>>
>>> > +               if (!config_enabled(CONFIG_CHECKPOINT_RESTORE) ||
>>> > +                   !config_enabled(CONFIG_SECCOMP))
>>> > +                       return -EINVAL;
>>> > +
>>> > +               if (!capable(CAP_SYS_ADMIN))
>>> > +                       return -EPERM;
>>>
>>> I tend to think that we should also require that current not be using
>>> seccomp.  Otherwise, in principle, there's a seccomp bypass for
>>> privileged-but-seccomped programs.
>>
>> Andy, I simply can't understand why do we need any security check at all.
>>
>> OK, yes, in theory we can have a seccomped CAP_SYS_ADMIN process, seccomp
>> doesn't filter ptrace, you hack that process and force it to attach to
>> another CAP_SYS_ADMIN/seccomped process, etc, etc... Looks too paranoid
>> to me.
>
> I've sometimes considered having privileged processes I write fork and
> seccomp their child.  Of course, if you're allowing ptrace through
> your seccomp filter, you open a giant can of worms, but I think we
> should take the more paranoid approach to start and relax it later as
> needed.  After all, for the intended use of this patch, stuff will
> break regardless of what we do if the ptracer is itself seccomped.
>
> I could be convinced that if the ptracer is outside seccomp then we
> shouldn't need the CAP_SYS_ADMIN check.  That would at least make this
> work in a user namespace.

But not if that namespace is running under a manager that has added a
seccomp filter to do things like drop finit_module, as lxc does.

Let's start with CAP_SYS_ADMIN, and when we have an actual use-case,
we can change it then.
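
For reference, the conservative policy discussed in this thread (require
CAP_SYS_ADMIN, and also require that the tracer itself is not under
seccomp) could be modelled in userspace roughly as below. This is a sketch
under stated assumptions, not the patch: the struct, the function name,
and the choice of -EPERM for a seccomped tracer are all hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical view of the tracer's credentials and seccomp state. */
struct model_tracer {
	bool has_cap_sys_admin;
	bool seccomp_active;
};

/* Returns 0 if the tracer may suspend seccomp, a negative errno if not. */
static int model_may_suspend_seccomp(const struct model_tracer *tracer)
{
	if (!tracer->has_cap_sys_admin)
		return -EPERM;
	/* Assumed error code: the thread does not name one for this case. */
	if (tracer->seccomp_active)
		return -EPERM;
	return 0;
}

/* Only a privileged tracer outside seccomp passes both checks. */
static void check_policy(void)
{
	struct model_tracer ok = { true, false };
	struct model_tracer unprivileged = { false, false };
	struct model_tracer sandboxed = { true, true };

	assert(model_may_suspend_seccomp(&ok) == 0);
	assert(model_may_suspend_seccomp(&unprivileged) == -EPERM);
	assert(model_may_suspend_seccomp(&sandboxed) == -EPERM);
}
```

Relaxing the policy later would mean dropping the first check while
keeping the second, which is the direction considered above for user
namespaces.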

>
>>> > @@ -590,6 +590,10 @@ void secure_computing_strict(int this_syscall)
>>> >  {
>>> >         int mode = current->seccomp.mode;
>>> >
>>> > +       if (config_enabled(CONFIG_CHECKPOINT_RESTORE) &&
>>> > +           unlikely(current->ptrace & PT_SUSPEND_SECCOMP))
>>> > +               return;
>>> > +
>>> >         if (mode == 0)
>>> >                 return;
>>> >         else if (mode == SECCOMP_MODE_STRICT)
>>> > @@ -691,6 +695,10 @@ u32 seccomp_phase1(struct seccomp_data *sd)
>>> >         int this_syscall = sd ? sd->nr :
>>> >                 syscall_get_nr(current, task_pt_regs(current));
>>> >
>>> > +       if (config_enabled(CONFIG_CHECKPOINT_RESTORE) &&
>>> > +           unlikely(current->ptrace & PT_SUSPEND_SECCOMP))
>>> > +               return SECCOMP_PHASE1_OK;
>>> > +
>>>
>>> If it's not hard, it might still be nice to try to fold this into
>>> mode.  This code is rather hot.  If it would be a mess, then don't
>>> worry about it for now.
>>
>> IMO, this would be a mess ;) At least compared to this simple patch.
>>
>> Suppose we add SECCOMP_MODE_SUSPENDED. Not only this adds the problems
>> with detach if the tracer dies.
>>
>> We need to change copy_seccomp(). And it is not clear what should we
>> do if the child is traced too.
>>
>> We need to change prctl_set_seccomp() paths.
>>
>> And even the "tracee->seccomp.mode = SECCOMP_MODE_SUSPENDED" code needs
>> some locking even if the tracee is stopped, we need to avoid the races
>> with SECCOMP_FILTER_FLAG_TSYNC from other threads.
>>
>
> Agreed.  Let's hold off until this becomes a problem (if it ever does).

Arg, right, no. I don't want this represented in seccomp.mode. Way too
much would get touched for little benefit.

Thanks! And sorry Tycho as we all disagree about how to disagree with
your patch... :)

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v4] seccomp: add ptrace options for suspend/resume
@ 2015-06-10 20:18         ` Kees Cook
  0 siblings, 0 replies; 29+ messages in thread
From: Kees Cook @ 2015-06-10 20:18 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Oleg Nesterov, Tycho Andersen,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Linux API, Will Drewry,
	Roland McGrath, Pavel Emelyanov, Serge E. Hallyn

On Wed, Jun 10, 2015 at 10:20 AM, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote:
> On Wed, Jun 10, 2015 at 9:31 AM, Oleg Nesterov <oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
>> On 06/09, Andy Lutomirski wrote:
>>>
>>> On Tue, Jun 9, 2015 at 5:49 PM, Tycho Andersen
>>> >
>>> > @@ -556,6 +556,15 @@ static int ptrace_setoptions(struct task_struct *child, unsigned long data)
>>> >         if (data & ~(unsigned long)PTRACE_O_MASK)
>>> >                 return -EINVAL;
>>> >
>>> > +       if (unlikely(data & PTRACE_O_SUSPEND_SECCOMP)) {
>>
>> Well, we should do this if
>>
>>                         (data & O_SUSPEND) && !(flags & O_SUSPEND)
>>
>> or at least if
>>
>>                         (data ^ flags) & O_SUSPEND
>>
>>
>>> > +               if (!config_enabled(CONFIG_CHECKPOINT_RESTORE) ||
>>> > +                   !config_enabled(CONFIG_SECCOMP))
>>> > +                       return -EINVAL;
>>> > +
>>> > +               if (!capable(CAP_SYS_ADMIN))
>>> > +                       return -EPERM;
>>>
>>> I tend to think that we should also require that current not be using
>>> seccomp.  Otherwise, in principle, there's a seccomp bypass for
>>> privileged-but-seccomped programs.
>>
>> Andy, I simply can't understand why do we need any security check at all.
>>
>> OK, yes, in theory we can have a seccomped CAP_SYS_ADMIN process, seccomp
>> doesn't filter ptrace, you hack that process and force it to attach to
>> another CAP_SYS_ADMIN/seccomped process, etc, etc... Looks too paranoid
>> to me.
>
> I've sometimes considered having privileged processes I write fork and
> seccomp their child.  Of course, if you're allowing ptrace through
> your seccomp filter, you open a giant can of worms, but I think we
> should take the more paranoid approach to start and relax it later as
> needed.  After all, for the intended use of this patch, stuff will
> break regardless of what we do if the ptracer is itself seccomped.
>
> I could be convinced that if the ptracer is outside seccomp then we
> shouldn't need the CAP_SYS_ADMIN check.  That would at least make this
> work in a user namespace.

But not if that namespace is running under a manager that has added a
seccomp filter to do things like drop finit_module, as lxc does.

Let's start with CAP_SYS_ADMIN, and when we have an actual use-case,
we can change it then.
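
For reference, the two conditions Oleg proposes in the quoted review above differ only in whether clearing the option also triggers the checks. A minimal userspace sketch, assuming `O_SUSPEND` stands for the new option bit and that `flags` holds the previously set options in the same bit position (in the kernel the stored flags are shifted, which this sketch ignores); the function names are made up for illustration:

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-in for the O_SUSPEND shorthand used in the review. */
#define O_SUSPEND (1UL << 21)

static_assert(O_SUSPEND == 0x200000UL, "bit 21, as in the quoted uapi hunk");

/* True only when the option is being turned on: requested now, not set before. */
bool suspend_newly_set(unsigned long data, unsigned long flags)
{
	return (data & O_SUSPEND) && !(flags & O_SUSPEND);
}

/* True when the option changes in either direction (turned on or turned off). */
bool suspend_changed(unsigned long data, unsigned long flags)
{
	return ((data ^ flags) & O_SUSPEND) != 0;
}
```

With the first form, a tracer that already holds the option can set other options or clear this one without being re-checked; the second form also runs the checks when the option is cleared.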

>
>>> > @@ -590,6 +590,10 @@ void secure_computing_strict(int this_syscall)
>>> >  {
>>> >         int mode = current->seccomp.mode;
>>> >
>>> > +       if (config_enabled(CONFIG_CHECKPOINT_RESTORE) &&
>>> > +           unlikely(current->ptrace & PT_SUSPEND_SECCOMP))
>>> > +               return;
>>> > +
>>> >         if (mode == 0)
>>> >                 return;
>>> >         else if (mode == SECCOMP_MODE_STRICT)
>>> > @@ -691,6 +695,10 @@ u32 seccomp_phase1(struct seccomp_data *sd)
>>> >         int this_syscall = sd ? sd->nr :
>>> >                 syscall_get_nr(current, task_pt_regs(current));
>>> >
>>> > +       if (config_enabled(CONFIG_CHECKPOINT_RESTORE) &&
>>> > +           unlikely(current->ptrace & PT_SUSPEND_SECCOMP))
>>> > +               return SECCOMP_PHASE1_OK;
>>> > +
>>>
>>> If it's not hard, it might still be nice to try to fold this into
>>> mode.  This code is rather hot.  If it would be a mess, then don't
>>> worry about it for now.
>>
>> IMO, this would be a mess ;) At least compared to this simple patch.
>>
>> Suppose we add SECCOMP_MODE_SUSPENDED. Not only does this add problems
>> with detach if the tracer dies.
>>
>> We need to change copy_seccomp(). And it is not clear what we should
>> do if the child is traced too.
>>
>> We need to change prctl_set_seccomp() paths.
>>
>> And even the "tracee->seccomp.mode = SECCOMP_MODE_SUSPENDED" code needs
>> some locking even if the tracee is stopped, we need to avoid the races
>> with SECCOMP_FILTER_FLAG_TSYNC from other threads.
>>
>
> Agreed.  Let's hold off until this becomes a problem (if it ever does).

Arg, right, no. I don't want this represented in seccomp.mode. Way too
much would get touched for little benefit.
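
A rough userspace model of why keeping the state in the ptrace flags stays small: the suspend bit lives in a word the tracer owns, so dropping that word on detach re-enables seccomp without ever rewriting the mode. The struct and function names below are hypothetical; the shift value of 3 is taken from include/linux/ptrace.h and the option bit from the quoted uapi hunk.

```c
#include <assert.h>
#include <stdbool.h>

#define PT_OPT_FLAG_SHIFT        3           /* as in include/linux/ptrace.h */
#define PTRACE_O_SUSPEND_SECCOMP (1u << 21)  /* from the quoted uapi hunk */
#define PT_SUSPEND_SECCOMP       (PTRACE_O_SUSPEND_SECCOMP << PT_OPT_FLAG_SHIFT)

static_assert(PT_SUSPEND_SECCOMP == (1u << 24),
	      "option bit 21 lands on task flag bit 24");

/* Hypothetical stand-in for the two task fields involved. */
struct task_model {
	unsigned int ptrace;   /* tracer-owned flags, dropped on detach */
	int seccomp_mode;      /* 0 = disabled, 1 = strict, 2 = filter */
};

/* Mirrors the early return the patch adds to the seccomp entry points. */
bool seccomp_checks_apply(const struct task_model *t)
{
	if (t->ptrace & PT_SUSPEND_SECCOMP)
		return false;
	return t->seccomp_mode != 0;
}

/* Models detach or tracer death: the flags go away, the mode is untouched. */
void model_detach(struct task_model *t)
{
	t->ptrace = 0;
}
```

A SECCOMP_MODE_SUSPENDED value would instead have to remember the previous mode and restore it on every detach path, which is the extra churn being argued against here.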

Thanks! And sorry Tycho as we all disagree about how to disagree with
your patch... :)

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v4] seccomp: add ptrace options for suspend/resume
@ 2015-06-10 20:26           ` Oleg Nesterov
  0 siblings, 0 replies; 29+ messages in thread
From: Oleg Nesterov @ 2015-06-10 20:26 UTC (permalink / raw)
  To: Kees Cook
  Cc: Andy Lutomirski, Tycho Andersen, linux-kernel, Linux API,
	Will Drewry, Roland McGrath, Pavel Emelyanov, Serge E. Hallyn

On 06/10, Kees Cook wrote:
>
> And sorry Tycho as we all disagree about how to disagree with
> your patch... :)

Yes ;)

So, just in case, I am fine with this version.

Andy wants another security check; OK, this is fine with me too.

Oleg.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v4] seccomp: add ptrace options for suspend/resume
@ 2015-06-10 20:33   ` Kees Cook
  0 siblings, 0 replies; 29+ messages in thread
From: Kees Cook @ 2015-06-10 20:33 UTC (permalink / raw)
  To: Tycho Andersen
  Cc: LKML, Linux API, Andy Lutomirski, Will Drewry, Roland McGrath,
	Oleg Nesterov, Pavel Emelyanov, Serge E. Hallyn

On Tue, Jun 9, 2015 at 5:49 PM, Tycho Andersen
<tycho.andersen@canonical.com> wrote:
> This patch is the first step in enabling checkpoint/restore of processes
> with seccomp enabled.
>
> One of the things CRIU does while dumping tasks is inject code into them
> via ptrace to collect information that is only available to the process
> itself. However, if we are in a seccomp mode where these processes are
> prohibited from making these syscalls, then what CRIU does kills the task.
>
> This patch adds a new ptrace option, PTRACE_O_SUSPEND_SECCOMP, that enables
> a task from the init user namespace which has CAP_SYS_ADMIN and no seccomp
> filters to disable (and re-enable) seccomp filters for another task so that
> they can be successfully dumped (and restored). We restrict the set of
> processes that can disable seccomp through ptrace because although today
> ptrace can be used to bypass seccomp, there is some discussion of closing
> this loophole in the future and we would like this patch to not depend on
> that behavior and be future-proofed for when it is removed.
>
> Note that seccomp can be suspended before any filters are actually
> installed; this behavior is useful on criu restore, so that we can suspend
> seccomp, restore the filters, unmap our restore code from the restored
> process' address space, and then resume the task by detaching and have the
> filters resumed as well.
>
> v2 changes:
>
> * require that the tracer have no seccomp filters installed
> * drop TIF_NOTSC manipulation from the patch
> * change from ptrace command to a ptrace option and use this ptrace option
>   as the flag to check. This means that as soon as the tracer
>   detaches/dies, seccomp is re-enabled and as a corollary that one cannot
>   disable seccomp across PTRACE_ATTACHs.
>
> v3 changes:
>
> * get rid of various #ifdefs everywhere
> * report more sensible errors when PTRACE_O_SUSPEND_SECCOMP is incorrectly
>   used
>
> v4 changes:
>
> * get rid of may_suspend_seccomp() in favor of a capable() check in ptrace
>   directly
>
> Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
> CC: Kees Cook <keescook@chromium.org>
> CC: Andy Lutomirski <luto@amacapital.net>
> CC: Will Drewry <wad@chromium.org>
> CC: Roland McGrath <roland@hack.frob.com>
> CC: Oleg Nesterov <oleg@redhat.com>
> CC: Pavel Emelyanov <xemul@parallels.com>
> CC: Serge E. Hallyn <serge.hallyn@ubuntu.com>
> ---
>  include/linux/ptrace.h      | 1 +
>  include/uapi/linux/ptrace.h | 6 ++++--
>  kernel/ptrace.c             | 9 +++++++++
>  kernel/seccomp.c            | 8 ++++++++
>  4 files changed, 22 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
> index 987a73a..061265f 100644
> --- a/include/linux/ptrace.h
> +++ b/include/linux/ptrace.h
> @@ -34,6 +34,7 @@
>  #define PT_TRACE_SECCOMP       PT_EVENT_FLAG(PTRACE_EVENT_SECCOMP)
>
>  #define PT_EXITKILL            (PTRACE_O_EXITKILL << PT_OPT_FLAG_SHIFT)
> +#define PT_SUSPEND_SECCOMP     (PTRACE_O_SUSPEND_SECCOMP << PT_OPT_FLAG_SHIFT)
>
>  /* single stepping state bits (used on ARM and PA-RISC) */
>  #define PT_SINGLESTEP_BIT      31
> diff --git a/include/uapi/linux/ptrace.h b/include/uapi/linux/ptrace.h
> index cf1019e..a7a6979 100644
> --- a/include/uapi/linux/ptrace.h
> +++ b/include/uapi/linux/ptrace.h
> @@ -89,9 +89,11 @@ struct ptrace_peeksiginfo_args {
>  #define PTRACE_O_TRACESECCOMP  (1 << PTRACE_EVENT_SECCOMP)
>
>  /* eventless options */
> -#define PTRACE_O_EXITKILL      (1 << 20)
> +#define PTRACE_O_EXITKILL              (1 << 20)
> +#define PTRACE_O_SUSPEND_SECCOMP       (1 << 21)
>
> -#define PTRACE_O_MASK          (0x000000ff | PTRACE_O_EXITKILL)
> +#define PTRACE_O_MASK          (\
> +       0x000000ff | PTRACE_O_EXITKILL | PTRACE_O_SUSPEND_SECCOMP)
>
>  #include <asm/ptrace.h>
>
> diff --git a/kernel/ptrace.c b/kernel/ptrace.c
> index c8e0e05..11fa460 100644
> --- a/kernel/ptrace.c
> +++ b/kernel/ptrace.c
> @@ -556,6 +556,15 @@ static int ptrace_setoptions(struct task_struct *child, unsigned long data)
>         if (data & ~(unsigned long)PTRACE_O_MASK)
>                 return -EINVAL;
>
> +       if (unlikely(data & PTRACE_O_SUSPEND_SECCOMP)) {
> +               if (!config_enabled(CONFIG_CHECKPOINT_RESTORE) ||
> +                   !config_enabled(CONFIG_SECCOMP))
> +                       return -EINVAL;
> +
> +               if (!capable(CAP_SYS_ADMIN))
> +                       return -EPERM;
> +       }
> +
>         /* Avoid intermediate state when all opts are cleared */
>         flags = child->ptrace;
>         flags &= ~(PTRACE_O_MASK << PT_OPT_FLAG_SHIFT);
> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> index 980fd26..645e42d 100644
> --- a/kernel/seccomp.c
> +++ b/kernel/seccomp.c
> @@ -590,6 +590,10 @@ void secure_computing_strict(int this_syscall)
>  {
>         int mode = current->seccomp.mode;
>
> +       if (config_enabled(CONFIG_CHECKPOINT_RESTORE) &&
> +           unlikely(current->ptrace & PT_SUSPEND_SECCOMP))
> +               return;
> +
>         if (mode == 0)
>                 return;
>         else if (mode == SECCOMP_MODE_STRICT)
> @@ -691,6 +695,10 @@ u32 seccomp_phase1(struct seccomp_data *sd)
>         int this_syscall = sd ? sd->nr :
>                 syscall_get_nr(current, task_pt_regs(current));
>
> +       if (config_enabled(CONFIG_CHECKPOINT_RESTORE) &&
> +           unlikely(current->ptrace & PT_SUSPEND_SECCOMP))
> +               return SECCOMP_PHASE1_OK;
> +
>         switch (mode) {
>         case SECCOMP_MODE_STRICT:
>                 __secure_computing_strict(this_syscall);  /* may call do_exit */
> --
> 2.1.4
>
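
From the tracer's side, the option in the quoted patch would be set through PTRACE_SETOPTIONS, the path the ptrace.c hunk guards. A minimal sketch, assuming a libc whose <sys/ptrace.h> already exposes PTRACE_O_SUSPEND_SECCOMP, a caller with CAP_SYS_ADMIN, and a tracee that is already attached and in ptrace-stop; the helper name is hypothetical:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>

static_assert(PTRACE_O_SUSPEND_SECCOMP == (1 << 21),
	      "value from the quoted uapi hunk");

/*
 * Suspend or resume seccomp for an attached, stopped tracee.
 * Returns 0 on success or a negative errno value on failure.
 * Note that PTRACE_SETOPTIONS replaces the tracee's whole option set,
 * so a real tool would OR in any other options it relies on.
 */
int set_seccomp_suspended(pid_t tracee, bool suspend)
{
	unsigned long opts = suspend ? PTRACE_O_SUSPEND_SECCOMP : 0;

	if (ptrace(PTRACE_SETOPTIONS, tracee, NULL, (void *)opts) == -1)
		return -errno;
	return 0;
}
```

Per the quoted hunk, the expected failures are -EINVAL when the kernel lacks CONFIG_CHECKPOINT_RESTORE or CONFIG_SECCOMP, and -EPERM when the caller lacks CAP_SYS_ADMIN. Detaching drops the option again, so no explicit resume call is strictly needed before PTRACE_DETACH.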

And if I've convinced Andy to be okay with this patch, consider v4:

Acked-by: Kees Cook <keescook@chromium.org>

-Kees

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v4] seccomp: add ptrace options for suspend/resume
@ 2015-06-10 20:57     ` Tycho Andersen
  0 siblings, 0 replies; 29+ messages in thread
From: Tycho Andersen @ 2015-06-10 20:57 UTC (permalink / raw)
  To: Kees Cook
  Cc: LKML, Linux API, Andy Lutomirski, Will Drewry, Roland McGrath,
	Oleg Nesterov, Pavel Emelyanov, Serge E. Hallyn

On Wed, Jun 10, 2015 at 01:33:21PM -0700, Kees Cook wrote:
>
> And if I've convinced Andy to be okay with this patch, consider v4:
> 
> Acked-by: Kees Cook <keescook@chromium.org>

Thanks, I'm happy to send a v5 with checking seccomp (and
->ptrace & PT_SUSPEND_SECCOMP) if you'd feel better with that, Andy.

Tycho

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v4] seccomp: add ptrace options for suspend/resume
@ 2015-06-12 23:27           ` Andy Lutomirski
  0 siblings, 0 replies; 29+ messages in thread
From: Andy Lutomirski @ 2015-06-12 23:27 UTC (permalink / raw)
  To: Kees Cook
  Cc: Oleg Nesterov, Tycho Andersen, linux-kernel, Linux API,
	Will Drewry, Roland McGrath, Pavel Emelyanov, Serge E. Hallyn

On Wed, Jun 10, 2015 at 1:18 PM, Kees Cook <keescook@chromium.org> wrote:
> On Wed, Jun 10, 2015 at 10:20 AM, Andy Lutomirski <luto@amacapital.net> wrote:
>> On Wed, Jun 10, 2015 at 9:31 AM, Oleg Nesterov <oleg@redhat.com> wrote:
>>> On 06/09, Andy Lutomirski wrote:
>>>>
>>>> On Tue, Jun 9, 2015 at 5:49 PM, Tycho Andersen
>>>> >
>>>> > @@ -556,6 +556,15 @@ static int ptrace_setoptions(struct task_struct *child, unsigned long data)
>>>> >         if (data & ~(unsigned long)PTRACE_O_MASK)
>>>> >                 return -EINVAL;
>>>> >
>>>> > +       if (unlikely(data & PTRACE_O_SUSPEND_SECCOMP)) {
>>>
>>> Well, we should do this if
>>>
>>>                         (data & O_SUSPEND) && !(flags & O_SUSPEND)
>>>
>>> or at least if
>>>
>>>                         (data ^ flags) & O_SUSPEND
>>>
>>>
>>>> > +               if (!config_enabled(CONFIG_CHECKPOINT_RESTORE) ||
>>>> > +                   !config_enabled(CONFIG_SECCOMP))
>>>> > +                       return -EINVAL;
>>>> > +
>>>> > +               if (!capable(CAP_SYS_ADMIN))
>>>> > +                       return -EPERM;
>>>>
>>>> I tend to think that we should also require that current not be using
>>>> seccomp.  Otherwise, in principle, there's a seccomp bypass for
>>>> privileged-but-seccomped programs.
>>>
>>> Andy, I simply can't understand why we need any security check at all.
>>>
>>> OK, yes, in theory we can have a seccomped CAP_SYS_ADMIN process, seccomp
>>> doesn't filter ptrace, you hack that process and force it to attach to
>>> another CAP_SYS_ADMIN/seccomped process, etc, etc... Looks too paranoid
>>> to me.
>>
>> I've sometimes considered having privileged processes I write fork and
>> seccomp their child.  Of course, if you're allowing ptrace through
>> your seccomp filter, you open a giant can of worms, but I think we
>> should take the more paranoid approach to start and relax it later as
>> needed.  After all, for the intended use of this patch, stuff will
>> break regardless of what we do if the ptracer is itself seccomped.
>>
>> I could be convinced that if the ptracer is outside seccomp then we
>> shouldn't need the CAP_SYS_ADMIN check.  That would at least make this
>> work in a user namespace.
>
> But not if that namespace is running under a manager that has added a
> seccomp filter to do things like drop finit_module, as lxc does.

In that case, criu isn't going to handle seccomp right regardless of
what our security check is, so I think we can safely deal with the
security aspects of that case once we figure out the functionality
part.

IOW, I think I still like the direct "you must not be seccomped in
order to suspend seccomp" rule.
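
As a rough illustration of that rule stacked on top of the checks already in v4, here is a userspace model. It is a sketch under assumptions, not the eventual kernel code: the struct, the function name, and the `config_ok` parameter are invented for the example, and treating a tracer whose own seccomp is currently suspended as "seccomped" follows Tycho's suggestion earlier in the thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define PT_OPT_FLAG_SHIFT        3            /* as in include/linux/ptrace.h */
#define PTRACE_O_SUSPEND_SECCOMP (1UL << 21)  /* from the quoted uapi hunk */
#define PT_SUSPEND_SECCOMP       (PTRACE_O_SUSPEND_SECCOMP << PT_OPT_FLAG_SHIFT)

/* Hypothetical view of the calling tracer ("current" in the kernel). */
struct tracer_model {
	bool cap_sys_admin;     /* would be capable(CAP_SYS_ADMIN) */
	int seccomp_mode;       /* 0 = disabled, 1 = strict, 2 = filter */
	unsigned long ptrace;   /* the tracer's own ptrace flags */
};

/* Returns 0 if the requested options are acceptable, else a negative errno. */
int check_suspend_seccomp(const struct tracer_model *cur, unsigned long data,
			  bool config_ok)
{
	assert(cur != NULL);

	if (!(data & PTRACE_O_SUSPEND_SECCOMP))
		return 0;       /* option not requested: nothing to check */
	if (!config_ok)
		return -EINVAL; /* checkpoint/restore or seccomp not built in */
	if (!cur->cap_sys_admin)
		return -EPERM;
	/* "You must not be seccomped in order to suspend seccomp." */
	if (cur->seccomp_mode != 0 || (cur->ptrace & PT_SUSPEND_SECCOMP))
		return -EPERM;
	return 0;
}
```

The last test also rejects a tracer that is itself being traced with seccomp suspended, since its filters would come back as soon as its own tracer detaches.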

--Andy

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v4] seccomp: add ptrace options for suspend/resume
@ 2015-06-12 23:29             ` Kees Cook
  0 siblings, 0 replies; 29+ messages in thread
From: Kees Cook @ 2015-06-12 23:29 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Oleg Nesterov, Tycho Andersen, linux-kernel, Linux API,
	Will Drewry, Roland McGrath, Pavel Emelyanov, Serge E. Hallyn

On Fri, Jun 12, 2015 at 4:27 PM, Andy Lutomirski <luto@amacapital.net> wrote:
> On Wed, Jun 10, 2015 at 1:18 PM, Kees Cook <keescook@chromium.org> wrote:
>> On Wed, Jun 10, 2015 at 10:20 AM, Andy Lutomirski <luto@amacapital.net> wrote:
>>> On Wed, Jun 10, 2015 at 9:31 AM, Oleg Nesterov <oleg@redhat.com> wrote:
>>>> On 06/09, Andy Lutomirski wrote:
>>>>>
>>>>> On Tue, Jun 9, 2015 at 5:49 PM, Tycho Andersen
>>>>> >
>>>>> > @@ -556,6 +556,15 @@ static int ptrace_setoptions(struct task_struct *child, unsigned long data)
>>>>> >         if (data & ~(unsigned long)PTRACE_O_MASK)
>>>>> >                 return -EINVAL;
>>>>> >
>>>>> > +       if (unlikely(data & PTRACE_O_SUSPEND_SECCOMP)) {
>>>>
>>>> Well, we should do this if
>>>>
>>>>                         (data & O_SUSPEND) && !(flags & O_SUSPEND)
>>>>
>>>> or at least if
>>>>
>>>>                         (data ^ flags) & O_SUSPEND
>>>>
>>>>
>>>>> > +               if (!config_enabled(CONFIG_CHECKPOINT_RESTORE) ||
>>>>> > +                   !config_enabled(CONFIG_SECCOMP))
>>>>> > +                       return -EINVAL;
>>>>> > +
>>>>> > +               if (!capable(CAP_SYS_ADMIN))
>>>>> > +                       return -EPERM;
>>>>>
>>>>> I tend to think that we should also require that current not be using
>>>>> seccomp.  Otherwise, in principle, there's a seccomp bypass for
>>>>> privileged-but-seccomped programs.
>>>>
>>>> Andy, I simply can't understand why we need any security check at all.
>>>>
>>>> OK, yes, in theory we can have a seccomped CAP_SYS_ADMIN process, seccomp
>>>> doesn't filter ptrace, you hack that process and force it to attach to
>>>> another CAP_SYS_ADMIN/seccomped process, etc, etc... Looks too paranoid
>>>> to me.
>>>
>>> I've sometimes considered having privileged processes I write fork and
>>> seccomp their child.  Of course, if you're allowing ptrace through
>>> your seccomp filter, you open a giant can of worms, but I think we
>>> should take the more paranoid approach to start and relax it later as
>>> needed.  After all, for the intended use of this patch, stuff will
>>> break regardless of what we do if the ptracer is itself seccomped.
>>>
>>> I could be convinced that if the ptracer is outside seccomp then we
>>> shouldn't need the CAP_SYS_ADMIN check.  That would at least make this
>>> work in a user namespace.
>>
>> But not if that namespace is running under a manager that has added a
>> seccomp filter to do things like drop finit_module, as lxc does.
>
> In that case, criu isn't going to handle seccomp right regardless of
> what our security check is, so I think we can safely deal with the
> security aspects of that case once we figure out the functionality
> part.
>
> IOW, I think I still like the direct "you must not be seccomped in
> order to suspend seccomp" rule.

Adding that restriction would be fine by me.

-Kees

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v4] seccomp: add ptrace options for suspend/resume
@ 2015-06-13 15:06               ` Tycho Andersen
  0 siblings, 0 replies; 29+ messages in thread
From: Tycho Andersen @ 2015-06-13 15:06 UTC (permalink / raw)
  To: Kees Cook
  Cc: Andy Lutomirski, Oleg Nesterov, linux-kernel, Linux API,
	Will Drewry, Roland McGrath, Pavel Emelyanov, Serge E. Hallyn

On Fri, Jun 12, 2015 at 04:29:00PM -0700, Kees Cook wrote:
> On Fri, Jun 12, 2015 at 4:27 PM, Andy Lutomirski <luto@amacapital.net> wrote:
> > On Wed, Jun 10, 2015 at 1:18 PM, Kees Cook <keescook@chromium.org> wrote:
> >> On Wed, Jun 10, 2015 at 10:20 AM, Andy Lutomirski <luto@amacapital.net> wrote:
> >>> On Wed, Jun 10, 2015 at 9:31 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> >>>> On 06/09, Andy Lutomirski wrote:
> >>>>>
> >>>>> On Tue, Jun 9, 2015 at 5:49 PM, Tycho Andersen
> >>>>> >
> >>>>> > @@ -556,6 +556,15 @@ static int ptrace_setoptions(struct task_struct *child, unsigned long data)
> >>>>> >         if (data & ~(unsigned long)PTRACE_O_MASK)
> >>>>> >                 return -EINVAL;
> >>>>> >
> >>>>> > +       if (unlikely(data & PTRACE_O_SUSPEND_SECCOMP)) {
> >>>>
> >>>> Well, we should do this if
> >>>>
> >>>>                         (data & O_SUSPEND) && !(flags & O_SUSPEND)
> >>>>
> >>>> or at least if
> >>>>
> >>>>                         (data ^ flags) & O_SUSPEND
> >>>>
> >>>>
> >>>>> > +               if (!config_enabled(CONFIG_CHECKPOINT_RESTORE) ||
> >>>>> > +                   !config_enabled(CONFIG_SECCOMP))
> >>>>> > +                       return -EINVAL;
> >>>>> > +
> >>>>> > +               if (!capable(CAP_SYS_ADMIN))
> >>>>> > +                       return -EPERM;
> >>>>>
> >>>>> I tend to think that we should also require that current not be using
> >>>>> seccomp.  Otherwise, in principle, there's a seccomp bypass for
> >>>>> privileged-but-seccomped programs.
> >>>>
> >>>> Andy, I simply can't understand why do we need any security check at all.
> >>>>
> >>>> OK, yes, in theory we can have a seccomped CAP_SYS_ADMIN process, seccomp
> >>>> doesn't filter ptrace, you hack that process and force it to attach to
> >>>> another CAP_SYS_ADMIN/seccomped process, etc, etc... Looks too paranoid
> >>>> to me.
> >>>
> >>> I've sometimes considered having privileged processes I write fork and
> >>> seccomp their child.  Of course, if you're allowing ptrace through
> >>> your seccomp filter, you open a giant can of worms, but I think we
> >>> should take the more paranoid approach to start and relax it later as
> >>> needed.  After all, for the intended use of this patch, stuff will
> >>> break regardless of what we do if the ptracer is itself seccomped.
> >>>
> >>> I could be convinced that if the ptracer is outside seccomp then we
> >>> shouldn't need the CAP_SYS_ADMIN check.  That would at least make this
> >>> work in a user namespace.
> >>
> >> But not if that namespace is running under a manager that has added a
> >> seccomp filter to do things like drop finit_module, as lxc does.
> >
> > In that case, criu isn't going to handle seccomp right regardless of
> > what our security check is, so I think we can safely deal with the
> > security aspects of that case once we figure out the functionality
> > part.
> >
> > IOW, I think I still like the direct "you must not be seccomped in
> > order to suspend seccomp" rule.
> 
> Adding that restriction would be fine by me.

Ok, I just sent v5 with this change. I didn't carry your ack in the
hopes that I could get you to take this patch in the seccomp tree. Let
me know if that's not the right thing to do.

Thanks,

Tycho
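
For reference, the two conditions Oleg suggests in the quoted text differ
only in when they fire. The sketch below is a standalone illustration, not
kernel code: MODEL_O_SUSPEND and both helper names are hypothetical, and it
assumes "data" (the requested options) and "flags" (the options currently
set) share one bit layout, which is a simplification.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-in for the option bit; the value is an assumption. */
#define MODEL_O_SUSPEND (1UL << 21)

/*
 * First form: (data & O_SUSPEND) && !(flags & O_SUSPEND)
 * True only when the option is being turned on and was not set before.
 */
static bool model_suspend_newly_set(unsigned long data, unsigned long flags)
{
	return (data & MODEL_O_SUSPEND) && !(flags & MODEL_O_SUSPEND);
}

/*
 * Second form: (data ^ flags) & O_SUSPEND
 * True whenever the bit differs between the request and the current state,
 * so it also fires when the option is being cleared.
 */
static bool model_suspend_changed(unsigned long data, unsigned long flags)
{
	return ((data ^ flags) & MODEL_O_SUSPEND) != 0;
}
```

Under these assumptions, the first form would skip the checks when a tracer
that already holds the option sets its options again, while the second would
also run them when the option is dropped.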

end of thread, other threads:[~2015-06-13 15:06 UTC | newest]

Thread overview: 29+ messages
2015-06-10  0:49 [PATCH v4] seccomp: add ptrace options for suspend/resume Tycho Andersen
2015-06-10  0:49 ` Tycho Andersen
2015-06-10  1:08 ` Andy Lutomirski
2015-06-10  1:08   ` Andy Lutomirski
2015-06-10 15:19   ` Tycho Andersen
2015-06-10 15:19     ` Tycho Andersen
2015-06-10 16:31   ` Oleg Nesterov
2015-06-10 17:20     ` Andy Lutomirski
2015-06-10 17:20       ` Andy Lutomirski
2015-06-10 17:29       ` Serge Hallyn
2015-06-10 17:29         ` Serge Hallyn
2015-06-10 17:42         ` Andy Lutomirski
2015-06-10 17:42           ` Andy Lutomirski
2015-06-10 19:20       ` Oleg Nesterov
2015-06-10 19:20         ` Oleg Nesterov
2015-06-10 20:18       ` Kees Cook
2015-06-10 20:18         ` Kees Cook
2015-06-10 20:26         ` Oleg Nesterov
2015-06-10 20:26           ` Oleg Nesterov
2015-06-12 23:27         ` Andy Lutomirski
2015-06-12 23:27           ` Andy Lutomirski
2015-06-12 23:29           ` Kees Cook
2015-06-12 23:29             ` Kees Cook
2015-06-13 15:06             ` Tycho Andersen
2015-06-13 15:06               ` Tycho Andersen
2015-06-10 20:33 ` Kees Cook
2015-06-10 20:33   ` Kees Cook
2015-06-10 20:57   ` Tycho Andersen
2015-06-10 20:57     ` Tycho Andersen
