From: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
To: Jann Horn <jann@thejh.net>, Kees Cook <keescook@chromium.org>
Cc: mtk.manpages@gmail.com, linux-man <linux-man@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Andy Lutomirski <luto@amacapital.net>,
	Will Drewry <wad@chromium.org>
Subject: Re: [PATCH v2 1/2] seccomp.2: Explain blacklisting problems, expand example
Date: Sun, 29 Mar 2015 18:01:35 +0200
Message-ID: <551821DF.8090204@gmail.com>
In-Reply-To: <20150324183833.GB5677@pc.thejh.net>

Hi Jann,

Thanks for the patch. I've applied it.

Cheers,

Michael


On 03/24/2015 07:38 PM, Jann Horn wrote:
> ---
>  man2/seccomp.2 | 73 +++++++++++++++++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 67 insertions(+), 6 deletions(-)
> 
> diff --git a/man2/seccomp.2 b/man2/seccomp.2
> index e2a5060..b596fb8 100644
> --- a/man2/seccomp.2
> +++ b/man2/seccomp.2
> @@ -250,6 +250,55 @@ struct seccomp_data {
>  .fi
>  .in
>  
> +Because the numbers of system calls vary between architectures and
> +some architectures (e.g. X86-64) allow user-space code to use
> +the calling conventions of multiple architectures, it is usually
> +necessary to verify the value of the
> +.IR arch
> +field.
> +
> +It is strongly recommended to use a whitelisting approach whenever
> +possible because such an approach is more robust and simple.
> +A blacklist will have to be updated whenever a potentially
> +dangerous syscall is added (or a dangerous flag or option if those
> +are blacklisted), and it is often possible to alter the
> +representation of a value without altering its meaning, leading to
> +a blacklist bypass.
> +
> +The
> +.IR arch
> +field is not unique for all calling conventions. The X86-64 ABI and
> +the X32 ABI both use
> +.BR AUDIT_ARCH_X86_64
> +as
> +.IR arch ,
> +and they run on the same processors. Instead, the mask
> +.BR __X32_SYSCALL_BIT
> +is used on the system call number to tell the two ABIs apart.
> +This means that in order to create a seccomp-based
> +blacklist for system calls performed through the X86-64 ABI,
> +it is necessary to not only check that
> +.IR arch
> +equals
> +.BR AUDIT_ARCH_X86_64 ,
> +but also to explicitly reject all syscalls that contain
> +.BR __X32_SYSCALL_BIT
> +in
> +.IR nr .
> +
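
As an illustration of the paragraph above, here is a minimal sketch in
plain C of the two-part test it describes. The helper name and the
EXAMPLE_ARCH_X86_64 macro are hypothetical and not part of the patch;
the macro repeats the numeric value of AUDIT_ARCH_X86_64 so that the
sketch builds without kernel headers. A real filter would perform the
same test in BPF on the 'arch' and 'nr' fields of 'seccomp_data'.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same value the patch defines further down as X32_SYSCALL_BIT. */
#define X32_SYSCALL_BIT 0x40000000u

/* Assumption: numeric value of AUDIT_ARCH_X86_64 from <linux/audit.h>,
   repeated here so the sketch builds without kernel headers. */
#define EXAMPLE_ARCH_X86_64 0xC000003Eu

/* Hypothetical helper: true only for calls made through the plain
   x86-64 ABI. Checking 'arch' alone is not enough, because x32 calls
   report the same 'arch' and differ only in the bit set in 'nr'. */
static bool
is_plain_x86_64_call(uint32_t arch, uint32_t nr)
{
    if (arch != EXAMPLE_ARCH_X86_64)
        return false;
    return (nr & X32_SYSCALL_BIT) == 0;
}
```

For instance, write(2) is system call 1 in the x86-64 ABI, while the
x32 variant arrives as X32_SYSCALL_BIT | 1; only the first passes the
check.
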
> +When checking values from
> +.IR args
> +against a blacklist, keep in mind that arguments are often
> +silently truncated before being processed, but after the seccomp
> +check. For example, this happens if the i386 ABI is used on an
> +X86-64 kernel: Although the kernel will normally not look beyond
> +the 32 lowest bits of the arguments, the values of the full
> +64-bit registers will be present in the seccomp data. A less
> +surprising example is that if the X86-64 ABI is used to perform
> +a syscall that takes an argument of type int, the
> +more-significant half of the argument register is ignored by
> +the syscall, but visible in the seccomp data.
> +
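
The truncation issue described above can be sketched in plain C. Both
function names are hypothetical, and the cast only models a syscall
that takes an argument of type int; it is not kernel code. The point is
that a comparison on the full 64-bit value from 'args' can miss a value
that the syscall itself will treat as a match.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical naive blacklist test: compares the full 64-bit register
   value, as found in seccomp_data.args, against the blocked value. */
static bool
naive_is_blocked(uint64_t arg, int blocked)
{
    return arg == (uint64_t)(uint32_t)blocked;
}

/* Models what a syscall taking an 'int' argument works with: only the
   low 32 bits of the register. */
static int
as_seen_by_syscall(uint64_t arg)
{
    return (int)(uint32_t)arg;
}
```

With blocked == 5, the register value 0x100000005 is not caught by
naive_is_blocked(), yet as_seen_by_syscall() yields 5. A filter that
blacklists argument values therefore has to inspect the upper half as
well, or reject calls where it is nonzero.
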
>  A seccomp filter returns a 32-bit value consisting of two parts:
>  the most significant 16 bits
>  (corresponding to the mask defined by the constant
> @@ -616,38 +665,50 @@ cecilia
>  #include <linux/seccomp.h>
>  #include <sys/prctl.h>
>  
> +#define X32_SYSCALL_BIT 0x40000000
> +
>  static int
>  install_filter(int syscall_nr, int t_arch, int f_errno)
>  {
> +    unsigned int upper_nr_limit = 0xffffffff;
> +    /* assume that AUDIT_ARCH_X86_64 means the normal X86-64 ABI */
> +    if (t_arch == AUDIT_ARCH_X86_64)
> +        upper_nr_limit = X32_SYSCALL_BIT - 1;
> +
>      struct sock_filter filter[] = {
>          /* [0] Load architecture from 'seccomp_data' buffer into
>                 accumulator */
>          BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
>                   (offsetof(struct seccomp_data, arch))),
>  
> -        /* [1] Jump forward 4 instructions if architecture does not
> +        /* [1] Jump forward 5 instructions if architecture does not
>                 match 't_arch' */
> -        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, t_arch, 0, 4),
> +        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, t_arch, 0, 5),
>  
>          /* [2] Load system call number from 'seccomp_data' buffer into
>                 accumulator */
>          BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
>                   (offsetof(struct seccomp_data, nr))),
>  
> -        /* [3] Jump forward 1 instruction if system call number
> +        /* [3] Check ABI - only needed for X86-64 in blacklist usecases.
> +               Use JGT instead of checking against the bitmask to avoid
> +               having to reload the syscall number. */
> +        BPF_JUMP(BPF_JMP | BPF_JGT | BPF_K, upper_nr_limit, 3, 0),
> +
> +        /* [4] Jump forward 1 instruction if system call number
>                 does not match 'syscall_nr' */
>          BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, syscall_nr, 0, 1),
>  
> -        /* [4] Matching architecture and system call: don't execute
> +        /* [5] Matching architecture and system call: don't execute
>  	       the system call, and return 'f_errno' in 'errno' */
>          BPF_STMT(BPF_RET | BPF_K,
>                   SECCOMP_RET_ERRNO | (f_errno & SECCOMP_RET_DATA)),
>  
> -        /* [5] Destination of system call number mismatch: allow other
> +        /* [6] Destination of system call number mismatch: allow other
>                 system calls */
>          BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
>  
> -        /* [6] Destination of architecture mismatch: kill process */
> +        /* [7] Destination of architecture mismatch: kill process */
>          BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL),
>      };
>  
> 
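
To double-check the renumbered jump offsets, the control flow of the
eight-instruction filter can be mirrored in ordinary C. This is only a
sketch of the logic: the enum, the function name, and the
EXAMPLE_ARCH_X86_64 macro are hypothetical, and the macro repeats the
numeric value of AUDIT_ARCH_X86_64 so that no kernel headers are
needed.

```c
#include <stdint.h>

#define X32_SYSCALL_BIT 0x40000000u

/* Assumption: numeric value of AUDIT_ARCH_X86_64 from <linux/audit.h>. */
#define EXAMPLE_ARCH_X86_64 0xC000003Eu

enum verdict { VERDICT_ALLOW, VERDICT_ERRNO, VERDICT_KILL };

/* Hypothetical model of the filter built by install_filter(). */
static enum verdict
filter_verdict(uint32_t arch, uint32_t nr,
               uint32_t t_arch, uint32_t syscall_nr)
{
    uint32_t upper_nr_limit = 0xffffffffu;

    if (t_arch == EXAMPLE_ARCH_X86_64)
        upper_nr_limit = X32_SYSCALL_BIT - 1;

    /* [0]-[1]: architecture mismatch jumps 5 ahead, to [7] */
    if (arch != t_arch)
        return VERDICT_KILL;

    /* [2]-[3]: number above the limit jumps 3 ahead, to [7] */
    if (nr > upper_nr_limit)
        return VERDICT_KILL;

    /* [4]: mismatch jumps 1 ahead, to [6]; match falls through to [5] */
    if (nr != syscall_nr)
        return VERDICT_ALLOW;

    return VERDICT_ERRNO;
}
```

Note that the greater-than test is slightly stricter than a bitmask
test: a number such as 0x80000000 does not contain X32_SYSCALL_BIT but
is still above the limit, so it also ends in the kill branch.
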


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
