From: David Drysdale <drysdale@google.com>
To: Andy Lutomirski <luto@amacapital.net>
Cc: Julien Tinnes <jln@google.com>, Kees Cook <keescook@chromium.org>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Paolo Bonzini <pbonzini@redhat.com>,
	LSM List <linux-security-module@vger.kernel.org>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Paul Moore <paul@paul-moore.com>,
	James Morris <james.l.morris@oracle.com>,
	Linux API <linux-api@vger.kernel.org>,
	Meredydd Luff <meredydd@senatehouse.org>,
	Christoph Hellwig <hch@infradead.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 11/11] seccomp: Add tgid and tid into seccomp_data
Date: Sun, 27 Jul 2014 13:10:10 +0100
Message-ID: <CAHse=S-9L5no=+R6Oh4KcMZ6C2sK+7EVLckuQikQgmA+MFD2oA@mail.gmail.com>
In-Reply-To: <CALCETrWrCU1bw+-xP_xxoRfv6L7j+GhZS_YwrWFHd2uhSp8ySw@mail.gmail.com>

On Fri, Jul 25, 2014 at 7:32 PM, Andy Lutomirski <luto@amacapital.net> wrote:
> On Fri, Jul 25, 2014 at 11:22 AM, Julien Tinnes <jln@google.com> wrote:
>> On Fri, Jul 25, 2014 at 10:38 AM, Kees Cook <keescook@chromium.org> wrote:
>>>
>>> On Fri, Jul 25, 2014 at 10:18 AM, Andy Lutomirski <luto@amacapital.net>
>>> wrote:
>>> > [cc: Eric Biederman]
>>> >
>>> > On Fri, Jul 25, 2014 at 10:10 AM, Kees Cook <keescook@chromium.org>
>>> > wrote:
>>>
>>> >> Julien had been wanting something like this too (though he'd suggested
>>> >> it via prctl): limit the signal functions to "self" only. I wonder if
>>> >> adding a prctl like done for O_BENEATH could work for signal sending?
>>> >>
>>> >
>>> >
>>> > Can we do one better and add a flag to prevent any non-self pid
>>> > lookups?  This might actually be easy on top of the pid namespace work
>>> > (e.g. we could change the way that find_task_by_vpid works).
>>>
>>> Ooh, that would be extremely interesting, yes. Kind of an extreme form
>>> of pid namespace without actually being a namespace.
>>>
>>> > It's far from just being signals.  There's access_process_vm, ptrace,
>>> > all the signal functions, clock_gettime (see CPUCLOCK_PID -- yes, this
>>> > is ridiculous), and probably some others that I've forgotten about or
>>> > never noticed in the first place.
>>>
>>> Yeah, that would be very interesting.
>>
>>
>> Yes, this would be incredibly useful.
>>
>> 1. For Chromium [1], I dislike relying on seccomp purely for
>> "access-control" (to other processes or files), because it's really hard to
>> think about everything (things like CPUCLOCK_PID bite, see
>> https://crbug.com/374479).
>
> Not public :(
>
>> So we have a first layer of sandboxing (using PID + NET namespaces and
>> chroot) for "access-control" and a second layer for kernel attack surface
>> reduction and a few other things using seccomp-bpf.
>>
>> The first layer isn't currently very good; it's heavyweight and complex (you
>> need an init(1) per namespace and that init cannot be multi-purposed as a
>> useful process because pid = 1 can never receive signals). One PID namespace
>> per process isn't something that scales well. (Also before USER_NS it
>> required a setuid root program).
>>
>> 2. Even with a safe pure seccomp-bpf sandbox that prevents sending signals
>> to other processes / ptrace() et al. and that restricts clock_gettime(2)
>> properly, things quickly become very tedious because as far as the kernel is
>> concerned, the process under this BPF program can still pass
>> ptrace_may_access() to other processes. This means for instance that no
>> matter what you do, a model where open() is allowed can't work if /proc is
>> available. We need a mode that says "ptrace_may_access()" will never pass.
>>
>> So yes, I really would like:
>> - a prctl that says: "I'm dropping privileges and I now can't interact with
>> other thread groups (via signals, ptrace, etc.)".
>> - Something to drop access to the file system. It could be an unprivileged
>> way to chroot() to an empty directory (unprivileged namespaces work for
>> that, except if you're already in a chroot). This is a little tricky
>> without allowing chroot escapes, so I suspect we would want to express it in
>> terms of mount namespace, or something else, rather than chroot.
>
> Capsicum will give you this.

Yep, that's the idea.  As long as there aren't any open directory file
descriptors (DFDs) for "/proc" on entry to capability mode, there shouldn't be
a way to access it later -- but it is still possible to openat(2) new files
(relative to a pre-opened DFD).
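
For illustration only, here is a minimal userspace sketch of the pre-opened
DFD pattern using plain openat(2) on an unpatched kernel.  The helper names
open_beneath() and has_dotdot() are hypothetical and not part of the patch
series.  The path checks are only a rough stand-in for what O_BENEATH or
capability mode would enforce inside the kernel, and returning EACCES is an
arbitrary choice made for this sketch.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: return 1 if any path component is exactly "..". */
static int has_dotdot(const char *p)
{
	while (*p) {
		size_t n = strcspn(p, "/");	/* length of this component */

		if (n == 2 && p[0] == '.' && p[1] == '.')
			return 1;
		p += n;
		while (*p == '/')		/* skip separators */
			p++;
	}
	return 0;
}

/*
 * Hypothetical helper: open `name` relative to the pre-opened directory
 * descriptor `dfd`, refusing absolute paths and ".." components.
 *
 * This is a userspace approximation only.  O_NOFOLLOW covers just the final
 * component, so a symlink in the middle of the path can still lead outside
 * the directory; closing that hole needs kernel enforcement.
 */
static int open_beneath(int dfd, const char *name, int flags)
{
	if (name[0] == '/' || has_dotdot(name)) {
		errno = EACCES;		/* assumption: error code picked for the sketch */
		return -1;
	}
	/* The mode argument is ignored unless O_CREAT is set in flags. */
	return openat(dfd, name, flags | O_NOFOLLOW, 0600);
}
```

A sandboxed process would open its working directory once (for example with
open(dir, O_RDONLY | O_DIRECTORY)) before dropping privileges, and afterwards
reach files only through open_beneath(dfd, "relative/name", flags).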

> See the other thread for a more concrete proposal.  prctl is getting
> out of hand.
>
> --Andy
