LKML Archive on lore.kernel.org
From: "Mickaël Salaün" <mic@digikod.net>
To: Andy Lutomirski <luto@amacapital.net>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Casey Schaufler <casey@schaufler-ca.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	David Drysdale <drysdale@google.com>,
	"David S . Miller" <davem@davemloft.net>,
	"Eric W . Biederman" <ebiederm@xmission.com>,
	James Morris <james.l.morris@oracle.com>,
	Jann Horn <jann@thejh.net>, Jonathan Corbet <corbet@lwn.net>,
	Matthew Garrett <mjg59@srcf.ucam.org>,
	Michael Kerrisk <mtk.manpages@gmail.com>,
	Kees Cook <keescook@chromium.org>,
	Paul Moore <paul@paul-moore.com>,
	Sargun Dhillon <sargun@sargun.me>,
	"Serge E . Hallyn" <serge@hallyn.com>,
	Shuah Khan <shuah@kernel.org>, Tejun Heo <tj@kernel.org>,
	Thomas Graf <tgraf@suug.ch>, Will Drewry <wad@chromium.org>,
	"kernel-hardening@lists.openwall.com" 
	<kernel-hardening@lists.openwall.com>,
	Linux API <linux-api@vger.kernel.org>,
	LSM List <linux-security-module@vger.kernel.org>,
	Network Development <netdev@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH v5 06/10] seccomp,landlock: Handle Landlock events per process hierarchy
Date: Wed, 1 Mar 2017 23:14:09 +0100
Message-ID: <e6c917cb-d4a6-4147-9d6e-9c9fc906621d@digikod.net> (raw)
In-Reply-To: <CALCETrWExjKHccdj-cYxPtis1NtXTPEYfReNLh065J1hHKvFiA@mail.gmail.com>


On 28/02/2017 21:01, Andy Lutomirski wrote:
> On Tue, Feb 21, 2017 at 5:26 PM, Mickaël Salaün <mic@digikod.net> wrote:
>> The seccomp(2) syscall can be used to apply a Landlock rule to the
>> current process. As with a seccomp filter, the Landlock rule is enforced
>> for all its future children. An inherited rule tree can be updated
>> (append-only) by the owner of inherited Landlock nodes (e.g. a parent
>> process that creates a new rule)
> 
> Can you clarify exactly what this type of update does?  Is it
> something that should be supported by normal seccomp rules as well?

There are two main structures involved here: struct landlock_node and
struct landlock_rule, both defined in include/linux/landlock.h [02/10].

Let's take an example with a seccomp filter and then with Landlock:
* seccomp filter: Process P1 creates and applies a seccomp filter F1 to
itself. Then it forks and creates a child P2, which inherits P1's
filters, hence F1. Now, if P1 adds a new seccomp filter F2 to itself, P2
*won't get it*. P2's filter list will still contain only F1, not F2.
If P2 sets up and applies a new filter F3 to itself, its filter list
will contain F1 and F3.
* Landlock: Process P1 creates and applies a Landlock rule R1 to itself.
Underneath, the kernel creates a new node N1 dedicated to P1, which
contains all its rules. Then P1 forks and creates a child P2, which
inherits P1's rules, hence R1. Underneath, P2 inherits N1. Now, if P1
adds a new Landlock rule R2 to itself, P2 *will get it* as well (because
R2 is part of N1). If P2 creates and applies a new rule R3 to itself,
its rules will contain R1, R2 and R3. Underneath, the kernel creates a
new node N2 for P2, which only contains R3 but inherits/links to N1.
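
To make the difference concrete, here is a minimal userspace sketch of the
two inheritance models. It is not the kernel code: the types and helpers
(struct filter_list, struct rule_node, filter_add(), node_add(), ...) are
made up for illustration, and copying the seccomp list by value at fork is
a simplification of the kernel's reference-counted filter chain.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ITEMS 8

/* seccomp-like model (simplified): fork copies the list as it is then. */
struct filter_list {
	const char *names[MAX_ITEMS];
	size_t count;
};

/* Landlock-like model: a node holds rules and links to the inherited node. */
struct rule_node {
	const char *rules[MAX_ITEMS];
	size_t count;
	const struct rule_node *prev;
};

static void filter_add(struct filter_list *list, const char *name)
{
	assert(list->count < MAX_ITEMS);
	list->names[list->count++] = name;
}

static int filter_has(const struct filter_list *list, const char *name)
{
	for (size_t i = 0; i < list->count; i++)
		if (strcmp(list->names[i], name) == 0)
			return 1;
	return 0;
}

static void node_add(struct rule_node *node, const char *rule)
{
	assert(node->count < MAX_ITEMS);
	node->rules[node->count++] = rule;
}

/* A process sees the rules of its own node and of every inherited node. */
static int node_has(const struct rule_node *node, const char *rule)
{
	for (; node; node = node->prev)
		for (size_t i = 0; i < node->count; i++)
			if (strcmp(node->rules[i], rule) == 0)
				return 1;
	return 0;
}

/* Replays the P1/P2 scenario; returns 1 if both models behave as described. */
static int replay_scenario(void)
{
	struct filter_list p1_filters = { .count = 0 }, p2_filters;
	struct rule_node n1 = { .count = 0, .prev = NULL };
	struct rule_node n2 = { .count = 0, .prev = &n1 };
	const struct rule_node *p1_rules, *p2_rules;

	/* seccomp: P2 snapshots the list at fork, so F2 never reaches it. */
	filter_add(&p1_filters, "F1");
	p2_filters = p1_filters;		/* fork */
	filter_add(&p1_filters, "F2");
	filter_add(&p2_filters, "F3");
	if (!filter_has(&p2_filters, "F1") || filter_has(&p2_filters, "F2") ||
	    !filter_has(&p2_filters, "F3"))
		return 0;

	/* Landlock: P2 shares N1, so R2 reaches it; R3 lives in P2's own N2. */
	node_add(&n1, "R1");
	p1_rules = p2_rules = &n1;		/* fork */
	node_add(&n1, "R2");
	node_add(&n2, "R3");
	p2_rules = &n2;
	return node_has(p2_rules, "R1") && node_has(p2_rules, "R2") &&
	       node_has(p2_rules, "R3") && !node_has(p1_rules, "R3");
}
```

The only point of the sketch is where the late addition lands: F2 goes into
P1's private copy, whereas R2 goes into N1, which P2 still references.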

This design makes it possible for a process to add more constraints to
its children on the fly. I think it is a good feature to have and a
safer default inheritance mechanism, but it could be guarded by an
option flag if we want both mechanisms to be available. The same design
could be used by seccomp filters too.


> 
>> +/**
>> + * landlock_run_prog - run Landlock program for a syscall
> 
> Unless this is actually specific to syscalls, s/for a syscall//, perhaps?

Right, it is not specific to syscalls anymore.

> 
>> +               if (new_events->nodes[event_idx]->owner ==
>> +                               &new_events->nodes[event_idx]) {
>> +                       /* We are the owner, we can then update the node. */
>> +                       add_landlock_rule(new_events, rule);
> 
> This is the part I don't get.  Adding a rule if you're the owner (BTW,
> why is ownership visible to userspace at all?) for just yourself and
> future children is very different from adding it so it applies to
> preexisting children too.

Node ownership is not (directly) visible to userspace.

The current inheritance mechanism doesn't make it possible to add a rule
to the current process only. The rule will also be inherited by its
children (starting from the children created after the first applied
rule). An option flag NEW_RULE_HIERARCHY (or maybe another seccomp
operation) could create a new node for the current process, so that the
new rule is not inherited by its existing children.
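
As a rough userspace sketch of how such a flag could behave (an assumption,
not the patch's logic): a task appends in place when it owns its node, and
gets a fresh node linked to the inherited one when it is not the owner or
when the flag is set. Only the name NEW_RULE_HIERARCHY comes from the
proposal above; its value, struct task, task_add_rule() and
task_rule_count() are hypothetical, and reference counting and freeing are
left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_RULES 8
/* Hypothetical value; only the flag's name is proposed in this thread. */
#define NEW_RULE_HIERARCHY 0x1u

struct task;

struct rule_node {
	const char *rules[MAX_RULES];
	size_t count;
	struct rule_node *prev;		/* inherited node, or NULL */
	const struct task *owner;	/* task allowed to append in place */
};

/* Hypothetical stand-in for the per-task Landlock state. */
struct task {
	struct rule_node *node;
};

static struct rule_node *node_new(struct rule_node *prev,
				  const struct task *owner)
{
	struct rule_node *node = calloc(1, sizeof(*node));

	if (node) {
		node->prev = prev;
		node->owner = owner;
	}
	return node;
}

/* Returns 0 on success, -1 if the node is full or allocation fails. */
static int task_add_rule(struct task *t, const char *rule, unsigned int flags)
{
	struct rule_node *node;

	assert(t && rule);
	node = t->node;
	if (!node || node->owner != t || (flags & NEW_RULE_HIERARCHY)) {
		/* Not the owner, or a new hierarchy was requested: new node. */
		node = node_new(t->node, t);
		if (!node)
			return -1;
		t->node = node;
	}
	if (node->count >= MAX_RULES)
		return -1;
	node->rules[node->count++] = rule;
	return 0;
}

/* Number of rules visible to a task through its chain of nodes. */
static size_t task_rule_count(const struct task *t)
{
	size_t total = 0;

	for (const struct rule_node *n = t->node; n; n = n->prev)
		total += n->count;
	return total;
}
```

With this model, a rule added by the owner without the flag is seen by every
task sharing the node, while a rule added with the flag only applies to the
caller and to the children it creates afterwards.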


> 
> 
>> +               } else if (atomic_read(&current_events->usage) == 1) {
>> +                       WARN_ON(new_events->nodes[event_idx]->owner);
>> +                       /*
>> +                        * We can become the new owner if no other task use it.
>> +                        * This avoid an unnecessary allocation.
>> +                        */
>> +                       new_events->nodes[event_idx]->owner =
>> +                               &new_events->nodes[event_idx];
>> +                       add_landlock_rule(new_events, rule);
>> +               } else {
>> +                       /*
>> +                        * We are not the owner, we need to fork current_events
>> +                        * and then add a new node.
>> +                        */
>> +                       struct landlock_node *node;
>> +                       size_t i;
>> +
>> +                       node = kmalloc(sizeof(*node), GFP_KERNEL);
>> +                       if (!node) {
>> +                               new_events = ERR_PTR(-ENOMEM);
>> +                               goto put_rule;
>> +                       }
>> +                       atomic_set(&node->usage, 1);
>> +                       /* set the previous node after the new_events
>> +                        * allocation */
>> +                       node->prev = NULL;
>> +                       /* do not increment the previous node usage */
>> +                       node->owner = &new_events->nodes[event_idx];
>> +                       /* rule->prev is already NULL */
>> +                       atomic_set(&rule->usage, 1);
>> +                       node->rule = rule;
>> +
>> +                       new_events = new_raw_landlock_events();
>> +                       if (IS_ERR(new_events)) {
>> +                               /* put the rule as well */
>> +                               put_landlock_node(node);
>> +                               return ERR_PTR(-ENOMEM);
>> +                       }
>> +                       for (i = 0; i < ARRAY_SIZE(new_events->nodes); i++) {
>> +                               new_events->nodes[i] =
>> +                                       lockless_dereference(
>> +                                                       current_events->nodes[i]);
>> +                               if (i == event_idx)
>> +                                       node->prev = new_events->nodes[i];
>> +                               if (!WARN_ON(!new_events->nodes[i]))
>> +                                       atomic_inc(&new_events->nodes[i]->usage);
>> +                       }
>> +                       new_events->nodes[event_idx] = node;
>> +
>> +                       /*
>> +                        * @current_events will not be freed here because it's usage
>> +                        * field is > 1. It is only prevented to be freed by another
>> +                        * subject thanks to the caller of landlock_append_prog() which
>> +                        * should be locked if needed.
>> +                        */
>> +                       put_landlock_events(current_events);
>> +               }
>> +       }
>> +       return new_events;
>> +
>> +put_prog:
>> +       bpf_prog_put(prog);
>> +       return new_events;
>> +
>> +put_rule:
>> +       put_landlock_rule(rule);
>> +       return new_events;
>> +}
>> +
>> +/**
>> + * landlock_seccomp_append_prog - attach a Landlock rule to the current process
>> + *
>> + * current->seccomp.landlock_events is lazily allocated. When a process fork,
>> + * only a pointer is copied. When a new event is added by a process, if there
>> + * is other references to this process' landlock_events, then a new allocation
>> + * is made to contains an array pointing to Landlock rule lists. This design
>> + * has low-performance impact and is memory efficient while keeping the
>> + * property of append-only rules.
>> + *
>> + * @flags: not used for now, but could be used for TSYNC
>> + * @user_bpf_fd: file descriptor pointing to a loaded Landlock rule
>> + */
>> +#ifdef CONFIG_SECCOMP_FILTER
>> +int landlock_seccomp_append_prog(unsigned int flags, const char __user *user_bpf_fd)
>> +{
>> +       struct landlock_events *new_events;
>> +       struct bpf_prog *prog;
>> +       int bpf_fd;
>> +
>> +       /* force no_new_privs to limit privilege escalation */
>> +       if (!task_no_new_privs(current))
>> +               return -EPERM;
>> +       /* will be removed in the future to allow unprivileged tasks */
>> +       if (!capable(CAP_SYS_ADMIN))
>> +               return -EPERM;
>> +       if (!user_bpf_fd)
>> +               return -EFAULT;
>> +       if (flags)
>> +               return -EINVAL;
>> +       if (copy_from_user(&bpf_fd, user_bpf_fd, sizeof(bpf_fd)))
>> +               return -EFAULT;
>> +       prog = bpf_prog_get(bpf_fd);
>> +       if (IS_ERR(prog))
>> +               return PTR_ERR(prog);
>> +
>> +       /*
>> +        * We don't need to lock anything for the current process hierarchy,
>> +        * everything is guarded by the atomic counters.
>> +        */
>> +       new_events = landlock_append_prog(current->seccomp.landlock_events, prog);
> 
> Do you need to check that it's the right *kind* of bpf prog or is that
> handled elsewhere?

The program type is checked at the beginning of landlock_append_prog().

 Mickaël


