linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Will Drewry <wad@chromium.org>
To: Indan Zupancic <indan@nul.nu>
Cc: linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
	linux-doc@vger.kernel.org, kernel-hardening@lists.openwall.com,
	netdev@vger.kernel.org, x86@kernel.org, arnd@arndb.de,
	davem@davemloft.net, hpa@zytor.com, mingo@redhat.com,
	oleg@redhat.com, peterz@infradead.org, rdunlap@xenotime.net,
	mcgrathr@chromium.org, tglx@linutronix.de, luto@mit.edu,
	eparis@redhat.com, serge.hallyn@canonical.com, djm@mindrot.org,
	scarybeasts@gmail.com, pmoore@redhat.com,
	akpm@linux-foundation.org, corbet@lwn.net,
	eric.dumazet@gmail.com, markus@chromium.org,
	coreyb@linux.vnet.ibm.com, keescook@chromium.org
Subject: Re: [PATCH v11 06/12] seccomp: add system call filtering using BPF
Date: Tue, 28 Feb 2012 11:17:59 -0600	[thread overview]
Message-ID: <CABqD9haZpe-i8kyHJDj4umCLyZ3E877aobabvW8_+aXeC+kTbg@mail.gmail.com> (raw)
In-Reply-To: <b983ece9dc6209269876869188303518.squirrel@webmail.greenhost.nl>

On Tue, Feb 28, 2012 at 12:51 AM, Indan Zupancic <indan@nul.nu> wrote:
> Hello,
>
> On Sat, February 25, 2012 04:21, Will Drewry wrote:
>> @@ -169,6 +170,7 @@ void free_task(struct task_struct *tsk)
>>       free_thread_info(tsk->stack);
>>       rt_mutex_debug_task_free(tsk);
>>       ftrace_graph_exit_task(tsk);
>> +     put_seccomp_filter(tsk->seccomp.filter);
>>       free_task_struct(tsk);
>> }
>> EXPORT_SYMBOL(free_task);
>> @@ -1113,6 +1115,7 @@ static struct task_struct *copy_process(unsigned long clone_flags,
>>               goto fork_out;
>>
>>       ftrace_graph_init_task(p);
>> +     copy_seccomp(&p->seccomp, &current->seccomp);
>
> I agree it's more symmetrical when get_seccomp_filter() is used here
> directly instead of copy_seccomp(). That should put Kees at ease.

Makes sense to me too. Once I'd dropped the other bits, it seemed
silly to keep copy_seccomp.

>> +static void seccomp_filter_log_failure(int syscall)
>> +{
>> +     int compat = 0;
>> +#ifdef CONFIG_COMPAT
>> +     compat = is_compat_task();
>> +#endif
>> +     pr_info("%s[%d]: %ssystem call %d blocked at 0x%lx\n",
>> +             current->comm, task_pid_nr(current),
>> +             (compat ? "compat " : ""),
>> +             syscall, KSTK_EIP(current));
>> +}
>
> This should be at least rate limited, but could be dropped altogether,
> as it's mostly useful for debugging filters. There is no kernel message
> when a process is killed because it exceeds a ulimit either. The death
> by SIGSYS is hopefully clear enough for users, and filter writers can
> return different errno values when debugging where it goes wrong.

I'll pull Kees's patch into the series this next go-round.

>> +/**
>> + * bpf_load: checks and returns a pointer to the requested offset
>> + * @nr: int syscall passed as a void * to bpf_run_filter
>> + * @off: offset into struct seccomp_data to load from
>
> Must be aligned, that's worth mentioning.

True - thanks!

>> + * @size: number of bytes to load (must be 4)
>> + * @buffer: temporary storage supplied by bpf_run_filter (4 bytes)
>
> '@buf'.

Oops - fixed.

>> +/**
>> + * copy_seccomp: manages inheritance on fork
>> + * @child: forkee's seccomp
>> + * @parent: forker's seccomp
>> + *
>> + * Ensures that @child inherits filters if in use.
>> + */
>> +void copy_seccomp(struct seccomp *child, const struct seccomp *parent)
>> +{
>> +     /* Other fields are handled by dup_task_struct. */
>> +     child->filter = get_seccomp_filter(parent->filter);
>> +}
>> +#endif       /* CONFIG_SECCOMP_FILTER */
>
> Yeah, just get rid of this and use get_seccomp_filter directly, and make
> it return void instead of a pointer.

It'll be updated.

>>
>> /*
>>  * Secure computing mode 1 allows only read/write/exit/sigreturn.
>> @@ -34,10 +293,11 @@ static int mode1_syscalls_32[] = {
>> void __secure_computing(int this_syscall)
>> {
>>       int mode = current->seccomp.mode;
>> -     int * syscall;
>> +     int exit_code = SIGKILL;
>> +     int *syscall;
>>
>>       switch (mode) {
>> -     case 1:
>> +     case SECCOMP_MODE_STRICT:
>>               syscall = mode1_syscalls;
>> #ifdef CONFIG_COMPAT
>>               if (is_compat_task())
>> @@ -48,6 +308,14 @@ void __secure_computing(int this_syscall)
>>                               return;
>>               } while (*++syscall);
>>               break;
>> +#ifdef CONFIG_SECCOMP_FILTER
>> +     case SECCOMP_MODE_FILTER:
>> +             if (seccomp_run_filters(this_syscall) == SECCOMP_RET_ALLOW)
>> +                     return;
>> +             seccomp_filter_log_failure(this_syscall);
>> +             exit_code = SIGSYS;
>
> Wouldn't it make more sense to always kill with SIGSYS, also for mode 1?
> I suppose it's too late for that now.

It would but I don't want to go and change existing ABI.

>> +/**
>> + * prctl_set_seccomp: configures current->seccomp.mode
>> + * @seccomp_mode: requested mode to use
>> + * @filter: optional struct sock_fprog for use with SECCOMP_MODE_FILTER
>> + *
>> + * This function may be called repeatedly with a @seccomp_mode of
>> + * SECCOMP_MODE_FILTER to install additional filters.  Every filter
>> + * successfully installed will be evaluated (in reverse order) for each system
>> + * call the task makes.
>> + *
>> + * It is not possible to transition change the current->seccomp.mode after
>> + * one has been set on a task.
>
> That reads awkwardly, do you mean the mode can't be changed once it's set?

Yup - I will fix that.  It doesn't even make sense :)

> ---
>
> Reviewed-by: Indan Zupancic <indan@nul.nu>

Thanks!

> All in all I'm quite happy with how the code is now, at least this bit.
> I'm still unsure about the networking patch and the 32-bit BPF on 64-bit
> archs mismatch.

Yeah - that is really the most challenging set of compromises, but I
think they are the right ones.

> As for the unlimited filter count, I'm not sure how to fix that.
> The problem is that filters are inherited and shared. Limiting the
> list length (tree depth) helps a bit, then the total memory usage
> is limited by max number of processes. It may make sense to limit
> the total instruction count instead of the number of filters.

I'll take a stab at tree path size for starters and hopefully we can
encourage consumers of the API to check for errors on return.

Thanks for the continued review and feedback!
will

  parent reply	other threads:[~2012-02-28 17:18 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-02-25  3:21 [PATCH v11 01/12] sk_run_filter: add support for custom load_pointer Will Drewry
2012-02-25  3:21 ` [PATCH v11 02/12] net/compat.c,linux/filter.h: share compat_sock_fprog Will Drewry
2012-02-25  3:21 ` [PATCH v11 03/12] seccomp: kill the seccomp_t typedef Will Drewry
2012-02-25  3:21 ` [PATCH v11 04/12] asm/syscall.h: add syscall_get_arch Will Drewry
2012-02-25  3:21 ` [PATCH v11 05/12] arch/x86: add syscall_get_arch to syscall.h Will Drewry
2012-02-25  3:21 ` [PATCH v11 06/12] seccomp: add system call filtering using BPF Will Drewry
2012-02-26 20:28   ` Kees Cook
2012-02-27 16:23     ` Will Drewry
2012-02-27 16:49       ` Eric Paris
2012-02-27 18:55         ` Kees Cook
2012-02-27 19:25           ` Eric Paris
2012-02-27 20:00             ` Kees Cook
2012-02-27 20:34               ` Eric Paris
2012-02-27 20:49                 ` Kees Cook
2012-02-27 17:09   ` Oleg Nesterov
2012-02-27 19:54     ` Will Drewry
2012-02-27 20:15       ` Kees Cook
2012-02-28 15:13       ` Oleg Nesterov
2012-02-28 17:18         ` Will Drewry
2012-02-28  6:51   ` Indan Zupancic
2012-02-28  7:52     ` Kees Cook
2012-02-28 17:17     ` Will Drewry [this message]
2012-02-28 17:47       ` Markus Gutschke
2012-02-25  3:21 ` [PATCH v11 07/12] seccomp: add SECCOMP_RET_ERRNO Will Drewry
2012-02-25 20:20   ` Kees Cook
2012-02-27 16:22     ` Will Drewry
2012-02-27 17:11   ` Oleg Nesterov
2012-02-27 18:09     ` Kees Cook
2012-02-27 18:14       ` Oleg Nesterov
2012-02-27 18:35         ` Andrew Lutomirski
2012-02-27 19:14           ` Kees Cook
2012-02-27 19:54             ` Will Drewry
2012-02-25  3:21 ` [PATCH v11 08/12] signal, x86: add SIGSYS info and make it synchronous Will Drewry
2012-02-27 17:22   ` Oleg Nesterov
2012-02-27 17:34     ` Roland McGrath
2012-02-27 18:08       ` Oleg Nesterov
2012-02-27 20:24     ` Will Drewry
2012-02-28 16:02       ` Oleg Nesterov
2012-02-28 17:06         ` Will Drewry
2012-02-25  3:21 ` [PATCH v11 09/12] seccomp: Add SECCOMP_RET_TRAP Will Drewry
2012-02-25  3:21 ` [PATCH v11 10/12] ptrace,seccomp: Add PTRACE_SECCOMP support Will Drewry
2012-02-27 17:54   ` Oleg Nesterov
2012-02-27 19:47     ` Will Drewry
2012-02-28 16:43       ` Oleg Nesterov
2012-02-28 17:04         ` Will Drewry
2012-02-28 18:34           ` Will Drewry
2012-02-29 16:14             ` Oleg Nesterov
2012-02-29 16:33               ` Will Drewry
2012-02-29 17:09                 ` Oleg Nesterov
2012-02-29 17:41                   ` Roland McGrath
2012-02-29 17:51                     ` Will Drewry
2012-02-25  3:21 ` [PATCH v11 11/12] x86: Enable HAVE_ARCH_SECCOMP_FILTER Will Drewry
2012-02-25  3:21 ` [PATCH v11 12/12] Documentation: prctl/seccomp_filter Will Drewry

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CABqD9haZpe-i8kyHJDj4umCLyZ3E877aobabvW8_+aXeC+kTbg@mail.gmail.com \
    --to=wad@chromium.org \
    --cc=akpm@linux-foundation.org \
    --cc=arnd@arndb.de \
    --cc=corbet@lwn.net \
    --cc=coreyb@linux.vnet.ibm.com \
    --cc=davem@davemloft.net \
    --cc=djm@mindrot.org \
    --cc=eparis@redhat.com \
    --cc=eric.dumazet@gmail.com \
    --cc=hpa@zytor.com \
    --cc=indan@nul.nu \
    --cc=keescook@chromium.org \
    --cc=kernel-hardening@lists.openwall.com \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@mit.edu \
    --cc=markus@chromium.org \
    --cc=mcgrathr@chromium.org \
    --cc=mingo@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=oleg@redhat.com \
    --cc=peterz@infradead.org \
    --cc=pmoore@redhat.com \
    --cc=rdunlap@xenotime.net \
    --cc=scarybeasts@gmail.com \
    --cc=serge.hallyn@canonical.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).