LKML Archive on lore.kernel.org
* [PATCH v5 00/10] Landlock LSM: Toward unprivileged sandboxing
@ 2017-02-22  1:26 Mickaël Salaün
From: Mickaël Salaün @ 2017-02-22  1:26 UTC
  To: linux-kernel
  Cc: Mickaël Salaün, Alexei Starovoitov, Andy Lutomirski,
	Arnaldo Carvalho de Melo, Casey Schaufler, Daniel Borkmann,
	David Drysdale, David S . Miller, Eric W . Biederman,
	James Morris, Jann Horn, Jonathan Corbet, Matthew Garrett,
	Michael Kerrisk, Kees Cook, Paul Moore, Sargun Dhillon,
	Serge E . Hallyn, Shuah Khan, Tejun Heo, Thomas Graf,
	Will Drewry, kernel-hardening, linux-api, linux-security-module,
	netdev

Hi,

This series follows 4 previous RFCs [1].  This is the first step of the roadmap
discussed at LPC [2] (with the inheritance feature included).  The big change
is an abstraction over LSM hooks.  Instead of exposing a similar interface to
userland, Landlock wraps LSM hooks into Landlock events.  A Landlock rule is a
dedicated eBPF program enforced for the calling process through the seccomp(2)
syscall.  While the intended final goal is to allow unprivileged users to use
Landlock, this series allows only a process with global CAP_SYS_ADMIN to load
and enforce a rule.  This may help to get feedback and avoid unexpected
behaviors.  The documentation patch contains some kernel documentation and
explanations on how to use Landlock.  The compiled documentation can be found
here:
https://landlock-lsm.github.io/linux-doc/landlock-v5/security/landlock/index.html

This series can be applied on top of net-next: e623a9e9dec2 ("net: socket: fix
recvmmsg not returning error from sock_error"). You may need a commit from the
perf tree: 7a5980f9c006 ("tools lib bpf: Add missing header to the library").
This can be tested with CONFIG_SECCOMP_FILTER and CONFIG_SECURITY_LANDLOCK.  I
would really appreciate constructive comments on the usability, architecture,
code, userland API or use cases.


# Landlock LSM

The goal of this new stackable Linux Security Module (LSM) called Landlock is
to allow any process, including unprivileged ones, to create powerful security
sandboxes comparable to XNU Sandbox or OpenBSD Pledge. This kind of sandbox is
expected to help mitigate the security impact of bugs or unexpected/malicious
behaviors in user-space applications.

The approach taken is to add the minimum amount of code while still allowing
the user-space application to create quite complex access rules.  A dedicated
security policy language such as the one used by SELinux, AppArmor and other
major LSMs involves a lot of code and is usually permitted to only a trusted
user (i.e. root).  On the contrary, eBPF programs already exist and are
designed to be safely loaded by unprivileged user-space.

This design does not seem too intrusive but is flexible enough to allow a
powerful sandbox mechanism accessible by any process on Linux.  Seccomp and
Landlock are easier to use through a user-space library (e.g. libseccomp) that
could provide a high-level language to express a security policy instead of
raw eBPF programs.  Moreover, thanks to the LLVM front-end, it is quite easy to
write an eBPF program with a subset of the C language.


# Landlock events and rule enforcement

Unlike syscalls, LSM hooks are security checkpoints and are not architecture
dependent. They are designed to match a security need associated with a
security policy (e.g. access to a file).  The approach taken for Landlock is to
abstract these hooks with Landlock events such as a generic filesystem event
(LANDLOCK_SUBTYPE_EVENT_FS).  Further explanations can be found in the
documentation.

This series uses seccomp(2) only as an entry point to apply a rule to the
calling process and its future children.  It is planned to restore the ability
to use cgroup as an alternative way to enforce a Landlock rule.

There is as yet no way to allow a process to access only a subset of the
filesystem where the subset is specified via a path or a file descriptor.  This
feature is intentionally left out so as to minimize the amount of code of this
patch series but will come in a following series.  However, it is possible to
check the file type, as done in the following example.


# Sandbox example with a read-only filesystem

This example is provided in the samples/bpf directory.  It creates a read-only
environment for all kinds of file access except for character devices such as a
TTY.

  # :> X
  # echo $?
  0
  # ./samples/bpf/landlock1 /bin/sh -i
  Launching a new sandboxed process.
  # :> Y
  cannot create Y: Operation not permitted


# Warning on read-only filesystems

Other than owning a mount namespace and remounting every accessible mount
point as read-only, which may not be possible for an unprivileged security
sandbox, there is no way of preventing a process from changing the access time
of a file, including anonymous inodes.  This provides a trivial way to leak
information from a sandboxed environment.  A new LSM hook has been proposed to
allow an LSM to enforce a real read-only filesystem view, but it has not
received strong support so far [5].


# Frequently asked questions

## Why is seccomp-bpf not enough?

A seccomp filter can access only raw syscall arguments (i.e. the register
values), which means that it is not possible to filter according to the value
pointed to by an argument, such as a file pathname.  As an embryonic Landlock
version demonstrated, filtering at the syscall level is complicated (e.g. it
needs to take care of race conditions).  This is mainly because the access
control checkpoints of the kernel are not at this high level but lower down, at
the LSM-hook level.  The LSM hooks are designed to handle this kind of check.
Landlock abstracts this approach to leverage the ability of unprivileged users
to limit themselves.

Cf. section "What it isn't?" in Documentation/prctl/seccomp_filter.txt
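
To make the point about raw syscall arguments concrete, here is a small
user-space sketch.  It assumes a simplified view of what a filter receives (a
syscall number plus six register values, loosely mirroring struct
seccomp_data); struct raw_syscall_view and both functions are hypothetical
names defined only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Simplified mock of what a seccomp filter is handed: the syscall number
 * and six raw register values.  Defined here for illustration only.
 */
struct raw_syscall_view {
	int nr;
	uint64_t args[6];
};

/* Build the view for a hypothetical open(pathname, flags) call. */
struct raw_syscall_view view_of_open(int nr, const char *pathname, int flags)
{
	struct raw_syscall_view v;

	memset(&v, 0, sizeof(v));
	v.nr = nr;
	/* The filter sees an address, not the string stored at it. */
	v.args[0] = (uint64_t)(uintptr_t)pathname;
	v.args[1] = (uint64_t)flags;
	return v;
}

/* All a raw-argument filter can do: compare register values. */
bool filter_sees_same_args(const struct raw_syscall_view *a,
			   const struct raw_syscall_view *b)
{
	return a->nr == b->nr &&
	       memcmp(a->args, b->args, sizeof(a->args)) == 0;
}
```

Two buffers holding the same pathname produce different views because only
their addresses differ, so a filter limited to this data cannot express a rule
about the pathname itself.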


## Why use the seccomp(2) syscall?

Landlock uses the same semantics as seccomp to apply access rule restrictions.
It adds a new layer of security for the current process which is inherited by
its children.  It makes sense to use a unique access-restricting syscall (that
should be allowed by seccomp filters) which can only drop privileges.
Moreover, a Landlock rule could come from outside a process (e.g. passed
through a UNIX socket).  It is then useful to differentiate the creation/load
of Landlock eBPF programs via bpf(2) from rule enforcement via seccomp(2).
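
The "new layer" semantics can be sketched as a chain of rules in which each
enforcement stacks on top of the layers already in place and access is granted
only if every layer agrees.  This is a simplified user-space model under that
assumption, not the data structure used by the series; all names
(access_request, rule_layer, push_rule(), layers_allow()) are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request: what is being accessed and how. */
struct access_request {
	bool is_char_device;
	bool wants_write;
};

typedef bool (*rule_fn)(const struct access_request *req);

/* Each enforced rule points to the layer that was in place before it. */
struct rule_layer {
	rule_fn check;
	const struct rule_layer *prev;
};

/* Stack a new rule on top of the current layers; nothing is replaced. */
struct rule_layer push_rule(rule_fn check, const struct rule_layer *current)
{
	struct rule_layer layer = { check, current };

	return layer;
}

/* Access is granted only if every layer in the chain allows it. */
bool layers_allow(const struct rule_layer *top,
		  const struct access_request *req)
{
	for (; top; top = top->prev)
		if (!top->check(req))
			return false;
	return true;
}

/* Two example rules: one denies every write, one allows everything. */
bool deny_writes(const struct access_request *req)
{
	return !req->wants_write;
}

bool allow_all(const struct access_request *req)
{
	(void)req;
	return true;
}
```

In this model a child that inherits a deny_writes layer and then adds an
allow_all layer is still unable to write, which is the "can only drop
privileges" property described above.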


## Why a new LSM? Are SELinux, AppArmor, Smack and Tomoyo not good enough?

The current access control LSMs are fine for their purpose which is to give the
*root* the ability to enforce a security policy for the *system*. What is
missing is a way to enforce a security policy for any application by its
developer and *unprivileged user* as seccomp can do for raw syscall filtering.

Differences from other (access control) LSMs:
* not only dedicated to administrators (i.e. no_new_privs);
* limited kernel attack surface (e.g. policy parsing);
* constrained policy rules (no DoS: deterministic execution time);
* do not leak more information than the loader process can legitimately have
  access to (minimize metadata inference).


# Changes since v4

* revamp Landlock to not expose an LSM hook interface but wrap and abstract
  them with Landlock events (currently one for all filesystem related
  operations: LANDLOCK_SUBTYPE_EVENT_FS)
* wrap all filesystem kernel objects through the same FS handle (struct
  landlock_handle_fs): struct file, struct inode, struct path and struct dentry
* a rule does not return an errno code but only a boolean to allow or deny an
  access request
* handle all filesystem related LSM hooks
* add some tests and a sample:
  * BPF context tests
  * Landlock sandboxing tests and sample
  * write Landlock rules in C and compile them with LLVM
* change field names of eBPF program subtype
* remove arraymap of handles for now (will be replaced with a revamped map)
* remove cgroup handling for now
* add user and kernel documentation
* rebase on net-next (which includes some needed commits already upstreamed)


# Changes since v3

* use abstract LSM hook arguments with custom types (e.g. *_LANDLOCK_ARG_FS for
  struct file, struct inode and struct path)
* add more LSM hooks to support full filesystem access control
* improve the sandbox example
* fix races and RCU issues:
  * eBPF program execution and eBPF helpers
  * revamp the arraymap of handles to cleanly deal with update/delete
* eBPF program subtype for Landlock:
  * remove the "origin" field
  * add an "option" field
* rebase onto Daniel Mack's patches v7 [3]
* remove merged commit 1955351da41c ("bpf: Set register type according to
  is_valid_access()")
* fix spelling mistakes
* cleanup some type and variable names
* split patches
* for now, remove cgroup delegation handling for unprivileged user
* remove extra access check for cgroup_get_from_fd()
* remove unused example code dealing with skb
* remove seccomp-bpf link:
  * no more seccomp cookie
  * for now, it is no longer possible to check the current syscall properties


# Changes since v2

* revamp cgroup handling:
  * use Daniel Mack's patches "Add eBPF hooks for cgroups" v5
  * remove bpf_landlock_cmp_cgroup_beneath()
  * make BPF_PROG_ATTACH usable with delegated cgroups
  * add a new CGRP_NO_NEW_PRIVS flag for safe cgroups
  * handle Landlock sandboxing for cgroups hierarchy
  * allow unprivileged processes to attach Landlock eBPF program to cgroups
* add subtype to eBPF programs:
  * replace Landlock hook identification by custom eBPF program types with a
    dedicated subtype field
  * manage fine-grained privileged Landlock programs
  * register Landlock programs for dedicated trigger origins (e.g. syscall,
    return from seccomp filter and/or interruption)
* performance and memory optimizations: use an array to access Landlock hooks
  directly but do not duplicate it for each thread (seccomp-based)
* allow running Landlock programs without seccomp filter
* fix seccomp-related issues
* remove extra errno bounding check for Landlock programs
* add some examples for optional eBPF functions or context access (network
  related) according to security checks to allow more features for privileged
  programs (e.g. Checmate)


# Changes since v1

* focus on the LSM hooks, not the syscalls:
  * much simpler implementation
  * does not need audit cache tricks to avoid race conditions
  * simpler to use and more generic because using the LSM hook abstraction
    directly
  * more efficient because only checking in LSM hooks
  * architecture agnostic
* switch from cBPF to eBPF:
  * new eBPF program types dedicated to Landlock
  * custom functions used by the eBPF program
  * gain some new features (e.g. 10 registers, can load values of different
    sizes, LLVM translator) but only a few functions allowed and a dedicated
    map type
  * new context: LSM hook ID, cookie and LSM hook arguments
  * need to set the sysctl kernel.unprivileged_bpf_disable to 0 (default value)
    to be able to load hook filters as unprivileged users
* smaller and simpler:
  * no more checker groups but dedicated arraymap of handles
  * simpler userland structs thanks to eBPF functions
* distinctive name: Landlock


[1] https://lkml.kernel.org/r/20161026065654.19166-1-mic@digikod.net
[2] https://lkml.kernel.org/r/5828776A.1010104@digikod.net
[3] https://lkml.kernel.org/r/1477390454-12553-1-git-send-email-daniel@zonque.org
[4] https://lkml.kernel.org/r/20160829114542.GA20836@ircssh.c.rugged-nimbus-611.internal
[5] https://lkml.kernel.org/r/20161221231506.19800-1-mic@digikod.net

Regards,

Mickaël Salaün (10):
  bpf: Add eBPF program subtype and is_valid_subtype() verifier
  bpf,landlock: Define an eBPF program type for Landlock
  bpf: Define handle_fs and add a new helper bpf_handle_fs_get_mode()
  landlock: Add LSM hooks related to filesystem
  seccomp: Split put_seccomp_filter() with put_seccomp()
  seccomp,landlock: Handle Landlock events per process hierarchy
  bpf: Add a Landlock sandbox example
  seccomp: Enhance test_harness with an assert step mechanism
  bpf,landlock: Add tests for Landlock
  landlock: Add user and kernel documentation for Landlock

 Documentation/security/index.rst                   |   1 +
 Documentation/security/landlock/index.rst          |  19 +
 Documentation/security/landlock/kernel.rst         | 132 +++
 Documentation/security/landlock/user.rst           | 298 +++++++
 include/linux/bpf.h                                |  40 +-
 include/linux/filter.h                             |   1 +
 include/linux/landlock.h                           |  80 ++
 include/linux/lsm_hooks.h                          |   5 +
 include/linux/seccomp.h                            |  12 +-
 include/uapi/linux/bpf.h                           | 125 ++-
 include/uapi/linux/seccomp.h                       |   1 +
 kernel/bpf/Makefile                                |   2 +-
 kernel/bpf/helpers_fs.c                            |  52 ++
 kernel/bpf/syscall.c                               |   5 +-
 kernel/bpf/verifier.c                              |  16 +-
 kernel/fork.c                                      |  16 +-
 kernel/seccomp.c                                   |  26 +-
 kernel/trace/bpf_trace.c                           |  15 +-
 net/core/filter.c                                  |  48 +-
 samples/bpf/.gitignore                             |  32 +
 samples/bpf/Makefile                               |   4 +
 samples/bpf/bpf_helpers.h                          |   2 +
 samples/bpf/bpf_load.c                             |  29 +-
 samples/bpf/fds_example.c                          |   2 +-
 samples/bpf/landlock1_kern.c                       |  46 +
 samples/bpf/landlock1_user.c                       | 102 +++
 samples/bpf/sock_example.c                         |   2 +-
 samples/bpf/test_cgrp2_attach.c                    |   2 +-
 samples/bpf/test_cgrp2_attach2.c                   |   2 +-
 samples/bpf/test_cgrp2_sock.c                      |   2 +-
 security/Kconfig                                   |   1 +
 security/Makefile                                  |   2 +
 security/landlock/Kconfig                          |  18 +
 security/landlock/Makefile                         |   5 +
 security/landlock/common.h                         |  25 +
 security/landlock/hooks.c                          | 962 +++++++++++++++++++++
 security/landlock/manager.c                        | 321 +++++++
 security/security.c                                |   7 +-
 tools/include/uapi/linux/bpf.h                     | 125 ++-
 tools/lib/bpf/bpf.c                                |   5 +-
 tools/lib/bpf/bpf.h                                |   2 +-
 tools/lib/bpf/libbpf.c                             |   4 +-
 tools/perf/tests/bpf.c                             |   2 +-
 tools/testing/selftests/Makefile                   |   1 +
 tools/testing/selftests/bpf/test_tag.c             |   2 +-
 tools/testing/selftests/bpf/test_verifier.c        |  57 +-
 tools/testing/selftests/landlock/.gitignore        |   2 +
 tools/testing/selftests/landlock/Makefile          |  47 +
 tools/testing/selftests/landlock/rules/Makefile    |  52 ++
 tools/testing/selftests/landlock/rules/README.rst  |   1 +
 .../testing/selftests/landlock/rules/bpf_helpers.h |   1 +
 tools/testing/selftests/landlock/rules/fs1.c       |  31 +
 tools/testing/selftests/landlock/rules/fs2.c       |  31 +
 tools/testing/selftests/landlock/test_fs.c         | 347 ++++++++
 tools/testing/selftests/seccomp/test_harness.h     |   8 +-
 55 files changed, 3116 insertions(+), 62 deletions(-)
 create mode 100644 Documentation/security/landlock/index.rst
 create mode 100644 Documentation/security/landlock/kernel.rst
 create mode 100644 Documentation/security/landlock/user.rst
 create mode 100644 include/linux/landlock.h
 create mode 100644 kernel/bpf/helpers_fs.c
 create mode 100644 samples/bpf/.gitignore
 create mode 100644 samples/bpf/landlock1_kern.c
 create mode 100644 samples/bpf/landlock1_user.c
 create mode 100644 security/landlock/Kconfig
 create mode 100644 security/landlock/Makefile
 create mode 100644 security/landlock/common.h
 create mode 100644 security/landlock/hooks.c
 create mode 100644 security/landlock/manager.c
 create mode 100644 tools/testing/selftests/landlock/.gitignore
 create mode 100644 tools/testing/selftests/landlock/Makefile
 create mode 100644 tools/testing/selftests/landlock/rules/Makefile
 create mode 120000 tools/testing/selftests/landlock/rules/README.rst
 create mode 120000 tools/testing/selftests/landlock/rules/bpf_helpers.h
 create mode 100644 tools/testing/selftests/landlock/rules/fs1.c
 create mode 100644 tools/testing/selftests/landlock/rules/fs2.c
 create mode 100644 tools/testing/selftests/landlock/test_fs.c

-- 
2.11.0


* [PATCH v5 01/10] bpf: Add eBPF program subtype and is_valid_subtype() verifier
  2017-02-22  1:26 [PATCH v5 00/10] Landlock LSM: Toward unprivileged sandboxing Mickaël Salaün
@ 2017-02-22  1:26 ` Mickaël Salaün
From: Mickaël Salaün @ 2017-02-22  1:26 UTC
  To: linux-kernel
  Cc: Mickaël Salaün, Alexei Starovoitov, Andy Lutomirski,
	Arnaldo Carvalho de Melo, Casey Schaufler, Daniel Borkmann,
	David Drysdale, David S . Miller, Eric W . Biederman,
	James Morris, Jann Horn, Jonathan Corbet, Matthew Garrett,
	Michael Kerrisk, Kees Cook, Paul Moore, Sargun Dhillon,
	Serge E . Hallyn, Shuah Khan, Tejun Heo, Thomas Graf,
	Will Drewry, kernel-hardening, linux-api, linux-security-module,
	netdev

The goal of the program subtype is to be able to have different static
fine-grained verifications for a unique program type.

The struct bpf_verifier_ops gets a new optional function:
is_valid_subtype(). This new verifier is called at the beginning of the
eBPF program verification to check if the (optional) program subtype is
valid.

For now, only Landlock eBPF programs are using a program subtype (see
next commit) but this could be used by other program types in the future.
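
The control flow described above can be sketched in user space as follows.
The mock_* types only imitate the shape of the structures touched by this
patch, and example_is_valid_subtype() with its "version 1, non-zero event"
policy is a made-up check for illustration, not the validation performed by
Landlock.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mock of the subtype layout proposed here, using user-space types. */
union mock_prog_subtype {
	struct {
		uint32_t version;
		uint32_t event;
		uint64_t ability;
		uint64_t option;
	} landlock_rule;
};

/* Mock verifier ops: only the new, optional callback is modelled. */
struct mock_verifier_ops {
	bool (*is_valid_subtype)(const union mock_prog_subtype *subtype);
};

/*
 * Run the subtype check before anything else.  Program types that do not
 * provide the callback skip the check entirely.
 */
int mock_check_subtype(const struct mock_verifier_ops *ops,
		       const union mock_prog_subtype *subtype)
{
	if (ops->is_valid_subtype && !ops->is_valid_subtype(subtype))
		return -EINVAL;
	return 0; /* the rest of the verification would follow here */
}

/* Hypothetical policy: accept only version 1 with a non-zero event. */
bool example_is_valid_subtype(const union mock_prog_subtype *subtype)
{
	return subtype->landlock_rule.version == 1 &&
	       subtype->landlock_rule.event != 0;
}
```

Keeping the callback optional means existing program types need no subtype
logic of their own; they only have to accept the extra argument in their
other verifier callbacks.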

Changes since v4:
* replace the "status" field with "version" (more generic)
* replace the "access" field with "ability" (less confusing)

Changes since v3:
* remove the "origin" field
* add an "option" field
* cleanup comments

Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David S. Miller <davem@davemloft.net>
Link: https://lkml.kernel.org/r/20160827205559.GA43880@ast-mbp.thefacebook.com
---
 include/linux/bpf.h                         |  7 +++--
 include/linux/filter.h                      |  1 +
 include/uapi/linux/bpf.h                    | 10 ++++++
 kernel/bpf/syscall.c                        |  5 +--
 kernel/bpf/verifier.c                       | 10 ++++--
 kernel/trace/bpf_trace.c                    | 15 ++++++---
 net/core/filter.c                           | 48 ++++++++++++++++++-----------
 samples/bpf/bpf_load.c                      |  3 +-
 samples/bpf/fds_example.c                   |  2 +-
 samples/bpf/sock_example.c                  |  2 +-
 samples/bpf/test_cgrp2_attach.c             |  2 +-
 samples/bpf/test_cgrp2_attach2.c            |  2 +-
 samples/bpf/test_cgrp2_sock.c               |  2 +-
 tools/include/uapi/linux/bpf.h              | 10 ++++++
 tools/lib/bpf/bpf.c                         |  5 ++-
 tools/lib/bpf/bpf.h                         |  2 +-
 tools/lib/bpf/libbpf.c                      |  4 +--
 tools/perf/tests/bpf.c                      |  2 +-
 tools/testing/selftests/bpf/test_tag.c      |  2 +-
 tools/testing/selftests/bpf/test_verifier.c |  3 +-
 20 files changed, 95 insertions(+), 42 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 909fc033173a..dd954048aa19 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -154,19 +154,22 @@ struct bpf_prog;
 
 struct bpf_verifier_ops {
 	/* return eBPF function prototype for verification */
-	const struct bpf_func_proto *(*get_func_proto)(enum bpf_func_id func_id);
+	const struct bpf_func_proto *(*get_func_proto)(enum bpf_func_id func_id,
+				      union bpf_prog_subtype *prog_subtype);
 
 	/* return true if 'size' wide access at offset 'off' within bpf_context
 	 * with 'type' (read or write) is allowed
 	 */
 	bool (*is_valid_access)(int off, int size, enum bpf_access_type type,
-				enum bpf_reg_type *reg_type);
+				enum bpf_reg_type *reg_type,
+				union bpf_prog_subtype *prog_subtype);
 	int (*gen_prologue)(struct bpf_insn *insn, bool direct_write,
 			    const struct bpf_prog *prog);
 	u32 (*convert_ctx_access)(enum bpf_access_type type,
 				  const struct bpf_insn *src,
 				  struct bpf_insn *dst,
 				  struct bpf_prog *prog);
+	bool (*is_valid_subtype)(union bpf_prog_subtype *prog_subtype);
 };
 
 struct bpf_prog_type_list {
diff --git a/include/linux/filter.h b/include/linux/filter.h
index 0c167fdee5f7..1f49b19a87c1 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -417,6 +417,7 @@ struct bpf_prog {
 	enum bpf_prog_type	type;		/* Type of BPF program */
 	u32			len;		/* Number of filter blocks */
 	u8			tag[BPF_TAG_SIZE];
+	union bpf_prog_subtype	subtype;	/* For fine-grained verifications */
 	struct bpf_prog_aux	*aux;		/* Auxiliary fields */
 	struct sock_fprog_kern	*orig_prog;	/* Original BPF program */
 	unsigned int		(*bpf_func)(const void *ctx,
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 0539a0ceef38..240c76f09d0d 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -145,6 +145,15 @@ enum bpf_attach_type {
  */
 #define BPF_F_NO_COMMON_LRU	(1U << 1)
 
+union bpf_prog_subtype {
+	struct {
+		__u32		version; /* cf. documentation */
+		__u32		event; /* enum landlock_subtype_event */
+		__aligned_u64	ability; /* LANDLOCK_SUBTYPE_ABILITY_* */
+		__aligned_u64	option; /* LANDLOCK_SUBTYPE_OPTION_* */
+	} landlock_rule;
+} __attribute__((aligned(8)));
+
 union bpf_attr {
 	struct { /* anonymous struct used by BPF_MAP_CREATE command */
 		__u32	map_type;	/* one of enum bpf_map_type */
@@ -173,6 +182,7 @@ union bpf_attr {
 		__u32		log_size;	/* size of user buffer */
 		__aligned_u64	log_buf;	/* user supplied buffer */
 		__u32		kern_version;	/* checked when prog_type=kprobe */
+		union bpf_prog_subtype prog_subtype;
 	};
 
 	struct { /* anonymous struct used by BPF_OBJ_* commands */
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 461eb1e66a0f..23f7ca14e898 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -628,7 +628,7 @@ static void fixup_bpf_calls(struct bpf_prog *prog)
 				continue;
 			}
 
-			fn = prog->aux->ops->get_func_proto(insn->imm);
+			fn = prog->aux->ops->get_func_proto(insn->imm, &prog->subtype);
 			/* all functions that have prototype and verifier allowed
 			 * programs to call them, must be real in-kernel functions
 			 */
@@ -827,7 +827,7 @@ struct bpf_prog *bpf_prog_get_type(u32 ufd, enum bpf_prog_type type)
 EXPORT_SYMBOL_GPL(bpf_prog_get_type);
 
 /* last field in 'union bpf_attr' used by this command */
-#define	BPF_PROG_LOAD_LAST_FIELD kern_version
+#define	BPF_PROG_LOAD_LAST_FIELD prog_subtype
 
 static int bpf_prog_load(union bpf_attr *attr)
 {
@@ -885,6 +885,7 @@ static int bpf_prog_load(union bpf_attr *attr)
 	err = find_prog_type(type, prog);
 	if (err < 0)
 		goto free_prog;
+	prog->subtype = attr->prog_subtype;
 
 	/* run eBPF verifier */
 	err = bpf_check(&prog, attr);
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index d2bded2b250c..fe26ec007a9a 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -740,7 +740,8 @@ static int check_ctx_access(struct bpf_verifier_env *env, int off, int size,
 		return 0;
 
 	if (env->prog->aux->ops->is_valid_access &&
-	    env->prog->aux->ops->is_valid_access(off, size, t, reg_type)) {
+	    env->prog->aux->ops->is_valid_access(off, size, t, reg_type,
+					         &env->prog->subtype)) {
 		/* remember the offset of last byte accessed in ctx */
 		if (env->prog->aux->max_ctx_offset < off + size)
 			env->prog->aux->max_ctx_offset = off + size;
@@ -1290,7 +1291,8 @@ static int check_call(struct bpf_verifier_env *env, int func_id)
 	}
 
 	if (env->prog->aux->ops->get_func_proto)
-		fn = env->prog->aux->ops->get_func_proto(func_id);
+		fn = env->prog->aux->ops->get_func_proto(func_id,
+							 &env->prog->subtype);
 
 	if (!fn) {
 		verbose("unknown func %s#%d\n", func_id_name(func_id), func_id);
@@ -3261,6 +3263,10 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr)
 	struct bpf_verifier_env *env;
 	int ret = -EINVAL;
 
+	if ((*prog)->aux->ops->is_valid_subtype &&
+	    !(*prog)->aux->ops->is_valid_subtype(&(*prog)->subtype))
+		return -EINVAL;
+
 	/* 'struct bpf_verifier_env' can be global, but since it's not small,
 	 * allocate/free it every time bpf_check() is called
 	 */
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index cee9802cf3e0..e71ee1bb7abf 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -469,7 +469,8 @@ static const struct bpf_func_proto *tracing_func_proto(enum bpf_func_id func_id)
 	}
 }
 
-static const struct bpf_func_proto *kprobe_prog_func_proto(enum bpf_func_id func_id)
+static const struct bpf_func_proto *kprobe_prog_func_proto(enum bpf_func_id func_id,
+		union bpf_prog_subtype *prog_subtype)
 {
 	switch (func_id) {
 	case BPF_FUNC_perf_event_output:
@@ -483,7 +484,8 @@ static const struct bpf_func_proto *kprobe_prog_func_proto(enum bpf_func_id func
 
 /* bpf+kprobe programs can access fields of 'struct pt_regs' */
 static bool kprobe_prog_is_valid_access(int off, int size, enum bpf_access_type type,
-					enum bpf_reg_type *reg_type)
+					enum bpf_reg_type *reg_type,
+					union bpf_prog_subtype *prog_subtype)
 {
 	if (off < 0 || off >= sizeof(struct pt_regs))
 		return false;
@@ -558,7 +560,8 @@ static const struct bpf_func_proto bpf_get_stackid_proto_tp = {
 	.arg3_type	= ARG_ANYTHING,
 };
 
-static const struct bpf_func_proto *tp_prog_func_proto(enum bpf_func_id func_id)
+static const struct bpf_func_proto *tp_prog_func_proto(enum bpf_func_id func_id,
+		union bpf_prog_subtype *prog_subtype)
 {
 	switch (func_id) {
 	case BPF_FUNC_perf_event_output:
@@ -571,7 +574,8 @@ static const struct bpf_func_proto *tp_prog_func_proto(enum bpf_func_id func_id)
 }
 
 static bool tp_prog_is_valid_access(int off, int size, enum bpf_access_type type,
-				    enum bpf_reg_type *reg_type)
+				    enum bpf_reg_type *reg_type,
+				    union bpf_prog_subtype *prog_subtype)
 {
 	if (off < sizeof(void *) || off >= PERF_MAX_TRACE_SIZE)
 		return false;
@@ -595,7 +599,8 @@ static struct bpf_prog_type_list tracepoint_tl __ro_after_init = {
 };
 
 static bool pe_prog_is_valid_access(int off, int size, enum bpf_access_type type,
-				    enum bpf_reg_type *reg_type)
+				    enum bpf_reg_type *reg_type,
+				    union bpf_prog_subtype *prog_subtype)
 {
 	if (off < 0 || off >= sizeof(struct bpf_perf_event_data))
 		return false;
diff --git a/net/core/filter.c b/net/core/filter.c
index e466e0040137..ac25920b5eae 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -2600,7 +2600,8 @@ static const struct bpf_func_proto bpf_xdp_event_output_proto = {
 };
 
 static const struct bpf_func_proto *
-bpf_base_func_proto(enum bpf_func_id func_id)
+bpf_base_func_proto(enum bpf_func_id func_id,
+		    union bpf_prog_subtype *prog_subtype)
 {
 	switch (func_id) {
 	case BPF_FUNC_map_lookup_elem:
@@ -2628,18 +2629,20 @@ bpf_base_func_proto(enum bpf_func_id func_id)
 }
 
 static const struct bpf_func_proto *
-sk_filter_func_proto(enum bpf_func_id func_id)
+sk_filter_func_proto(enum bpf_func_id func_id,
+		     union bpf_prog_subtype *prog_subtype)
 {
 	switch (func_id) {
 	case BPF_FUNC_skb_load_bytes:
 		return &bpf_skb_load_bytes_proto;
 	default:
-		return bpf_base_func_proto(func_id);
+		return bpf_base_func_proto(func_id, prog_subtype);
 	}
 }
 
 static const struct bpf_func_proto *
-tc_cls_act_func_proto(enum bpf_func_id func_id)
+tc_cls_act_func_proto(enum bpf_func_id func_id,
+		      union bpf_prog_subtype *prog_subtype)
 {
 	switch (func_id) {
 	case BPF_FUNC_skb_store_bytes:
@@ -2693,12 +2696,13 @@ tc_cls_act_func_proto(enum bpf_func_id func_id)
 	case BPF_FUNC_skb_under_cgroup:
 		return &bpf_skb_under_cgroup_proto;
 	default:
-		return bpf_base_func_proto(func_id);
+		return bpf_base_func_proto(func_id, prog_subtype);
 	}
 }
 
 static const struct bpf_func_proto *
-xdp_func_proto(enum bpf_func_id func_id)
+xdp_func_proto(enum bpf_func_id func_id,
+	       union bpf_prog_subtype *prog_subtype)
 {
 	switch (func_id) {
 	case BPF_FUNC_perf_event_output:
@@ -2708,23 +2712,25 @@ xdp_func_proto(enum bpf_func_id func_id)
 	case BPF_FUNC_xdp_adjust_head:
 		return &bpf_xdp_adjust_head_proto;
 	default:
-		return bpf_base_func_proto(func_id);
+		return bpf_base_func_proto(func_id, prog_subtype);
 	}
 }
 
 static const struct bpf_func_proto *
-cg_skb_func_proto(enum bpf_func_id func_id)
+cg_skb_func_proto(enum bpf_func_id func_id,
+		  union bpf_prog_subtype *prog_subtype)
 {
 	switch (func_id) {
 	case BPF_FUNC_skb_load_bytes:
 		return &bpf_skb_load_bytes_proto;
 	default:
-		return bpf_base_func_proto(func_id);
+		return bpf_base_func_proto(func_id, prog_subtype);
 	}
 }
 
 static const struct bpf_func_proto *
-lwt_inout_func_proto(enum bpf_func_id func_id)
+lwt_inout_func_proto(enum bpf_func_id func_id,
+		     union bpf_prog_subtype *prog_subtype)
 {
 	switch (func_id) {
 	case BPF_FUNC_skb_load_bytes:
@@ -2746,12 +2752,13 @@ lwt_inout_func_proto(enum bpf_func_id func_id)
 	case BPF_FUNC_skb_under_cgroup:
 		return &bpf_skb_under_cgroup_proto;
 	default:
-		return bpf_base_func_proto(func_id);
+		return bpf_base_func_proto(func_id, prog_subtype);
 	}
 }
 
 static const struct bpf_func_proto *
-lwt_xmit_func_proto(enum bpf_func_id func_id)
+lwt_xmit_func_proto(enum bpf_func_id func_id,
+		    union bpf_prog_subtype *prog_subtype)
 {
 	switch (func_id) {
 	case BPF_FUNC_skb_get_tunnel_key:
@@ -2781,7 +2788,7 @@ lwt_xmit_func_proto(enum bpf_func_id func_id)
 	case BPF_FUNC_set_hash_invalid:
 		return &bpf_set_hash_invalid_proto;
 	default:
-		return lwt_inout_func_proto(func_id);
+		return lwt_inout_func_proto(func_id, prog_subtype);
 	}
 }
 
@@ -2811,7 +2818,8 @@ static bool __is_valid_access(int off, int size)
 
 static bool sk_filter_is_valid_access(int off, int size,
 				      enum bpf_access_type type,
-				      enum bpf_reg_type *reg_type)
+				      enum bpf_reg_type *reg_type,
+				      union bpf_prog_subtype *prog_subtype)
 {
 	switch (off) {
 	case offsetof(struct __sk_buff, tc_classid):
@@ -2835,7 +2843,8 @@ static bool sk_filter_is_valid_access(int off, int size,
 
 static bool lwt_is_valid_access(int off, int size,
 				enum bpf_access_type type,
-				enum bpf_reg_type *reg_type)
+				enum bpf_reg_type *reg_type,
+				union bpf_prog_subtype *prog_subtype)
 {
 	switch (off) {
 	case offsetof(struct __sk_buff, tc_classid):
@@ -2868,7 +2877,8 @@ static bool lwt_is_valid_access(int off, int size,
 
 static bool sock_filter_is_valid_access(int off, int size,
 					enum bpf_access_type type,
-					enum bpf_reg_type *reg_type)
+					enum bpf_reg_type *reg_type,
+					union bpf_prog_subtype *prog_subtype)
 {
 	if (type == BPF_WRITE) {
 		switch (off) {
@@ -2931,7 +2941,8 @@ static int tc_cls_act_prologue(struct bpf_insn *insn_buf, bool direct_write,
 
 static bool tc_cls_act_is_valid_access(int off, int size,
 				       enum bpf_access_type type,
-				       enum bpf_reg_type *reg_type)
+				       enum bpf_reg_type *reg_type,
+				       union bpf_prog_subtype *prog_subtype)
 {
 	if (type == BPF_WRITE) {
 		switch (off) {
@@ -2973,7 +2984,8 @@ static bool __is_valid_xdp_access(int off, int size)
 
 static bool xdp_is_valid_access(int off, int size,
 				enum bpf_access_type type,
-				enum bpf_reg_type *reg_type)
+				enum bpf_reg_type *reg_type,
+				union bpf_prog_subtype *prog_subtype)
 {
 	if (type == BPF_WRITE)
 		return false;
diff --git a/samples/bpf/bpf_load.c b/samples/bpf/bpf_load.c
index 396e204888b3..d23dc13ab0f2 100644
--- a/samples/bpf/bpf_load.c
+++ b/samples/bpf/bpf_load.c
@@ -29,6 +29,7 @@
 
 static char license[128];
 static int kern_version;
+static union bpf_prog_subtype subtype = {};
 static bool processed_sec[128];
 char bpf_log_buf[BPF_LOG_BUF_SIZE];
 int map_fd[MAX_MAPS];
@@ -98,7 +99,7 @@ static int load_and_attach(const char *event, struct bpf_insn *prog, int size)
 	}
 
 	fd = bpf_load_program(prog_type, prog, insns_cnt, license, kern_version,
-			      bpf_log_buf, BPF_LOG_BUF_SIZE);
+			      bpf_log_buf, BPF_LOG_BUF_SIZE, &subtype);
 	if (fd < 0) {
 		printf("bpf_load_program() err=%d\n%s", errno, bpf_log_buf);
 		return -1;
diff --git a/samples/bpf/fds_example.c b/samples/bpf/fds_example.c
index e29bd52ff9e8..0f4f5f6a9f9f 100644
--- a/samples/bpf/fds_example.c
+++ b/samples/bpf/fds_example.c
@@ -62,7 +62,7 @@ static int bpf_prog_create(const char *object)
 	} else {
 		return bpf_load_program(BPF_PROG_TYPE_SOCKET_FILTER,
 					insns, insns_cnt, "GPL", 0,
-					bpf_log_buf, BPF_LOG_BUF_SIZE);
+					bpf_log_buf, BPF_LOG_BUF_SIZE, NULL);
 	}
 }
 
diff --git a/samples/bpf/sock_example.c b/samples/bpf/sock_example.c
index 6fc6e193ef1b..615f4d8c29dc 100644
--- a/samples/bpf/sock_example.c
+++ b/samples/bpf/sock_example.c
@@ -60,7 +60,7 @@ static int test_sock(void)
 	size_t insns_cnt = sizeof(prog) / sizeof(struct bpf_insn);
 
 	prog_fd = bpf_load_program(BPF_PROG_TYPE_SOCKET_FILTER, prog, insns_cnt,
-				   "GPL", 0, bpf_log_buf, BPF_LOG_BUF_SIZE);
+				   "GPL", 0, bpf_log_buf, BPF_LOG_BUF_SIZE, NULL);
 	if (prog_fd < 0) {
 		printf("failed to load prog '%s'\n", strerror(errno));
 		goto cleanup;
diff --git a/samples/bpf/test_cgrp2_attach.c b/samples/bpf/test_cgrp2_attach.c
index 4bfcaf93fcf3..f8a91d2b7896 100644
--- a/samples/bpf/test_cgrp2_attach.c
+++ b/samples/bpf/test_cgrp2_attach.c
@@ -72,7 +72,7 @@ static int prog_load(int map_fd, int verdict)
 
 	return bpf_load_program(BPF_PROG_TYPE_CGROUP_SKB,
 				prog, insns_cnt, "GPL", 0,
-				bpf_log_buf, BPF_LOG_BUF_SIZE);
+				bpf_log_buf, BPF_LOG_BUF_SIZE, NULL);
 }
 
 static int usage(const char *argv0)
diff --git a/samples/bpf/test_cgrp2_attach2.c b/samples/bpf/test_cgrp2_attach2.c
index 3049b1f26267..31a0f4bd665f 100644
--- a/samples/bpf/test_cgrp2_attach2.c
+++ b/samples/bpf/test_cgrp2_attach2.c
@@ -45,7 +45,7 @@ static int prog_load(int verdict)
 
 	ret = bpf_load_program(BPF_PROG_TYPE_CGROUP_SKB,
 			       prog, insns_cnt, "GPL", 0,
-			       bpf_log_buf, BPF_LOG_BUF_SIZE);
+			       bpf_log_buf, BPF_LOG_BUF_SIZE, NULL);
 
 	if (ret < 0) {
 		log_err("Loading program");
diff --git a/samples/bpf/test_cgrp2_sock.c b/samples/bpf/test_cgrp2_sock.c
index c3cfb23e23b5..697f2db30e6a 100644
--- a/samples/bpf/test_cgrp2_sock.c
+++ b/samples/bpf/test_cgrp2_sock.c
@@ -38,7 +38,7 @@ static int prog_load(int idx)
 	size_t insns_cnt = sizeof(prog) / sizeof(struct bpf_insn);
 
 	return bpf_load_program(BPF_PROG_TYPE_CGROUP_SOCK, prog, insns_cnt,
-				"GPL", 0, bpf_log_buf, BPF_LOG_BUF_SIZE);
+				"GPL", 0, bpf_log_buf, BPF_LOG_BUF_SIZE, NULL);
 }
 
 static int usage(const char *argv0)
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 0539a0ceef38..240c76f09d0d 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -145,6 +145,15 @@ enum bpf_attach_type {
  */
 #define BPF_F_NO_COMMON_LRU	(1U << 1)
 
+union bpf_prog_subtype {
+	struct {
+		__u32		version; /* cf. documentation */
+		__u32		event; /* enum landlock_subtype_event */
+		__aligned_u64	ability; /* LANDLOCK_SUBTYPE_ABILITY_* */
+		__aligned_u64	option; /* LANDLOCK_SUBTYPE_OPTION_* */
+	} landlock_rule;
+} __attribute__((aligned(8)));
+
 union bpf_attr {
 	struct { /* anonymous struct used by BPF_MAP_CREATE command */
 		__u32	map_type;	/* one of enum bpf_map_type */
@@ -173,6 +182,7 @@ union bpf_attr {
 		__u32		log_size;	/* size of user buffer */
 		__aligned_u64	log_buf;	/* user supplied buffer */
 		__u32		kern_version;	/* checked when prog_type=kprobe */
+		union bpf_prog_subtype prog_subtype;
 	};
 
 	struct { /* anonymous struct used by BPF_OBJ_* commands */
diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index d48b70ceb25a..eb423a28e974 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -71,10 +71,12 @@ int bpf_create_map(enum bpf_map_type map_type, int key_size,
 
 int bpf_load_program(enum bpf_prog_type type, const struct bpf_insn *insns,
 		     size_t insns_cnt, const char *license,
-		     __u32 kern_version, char *log_buf, size_t log_buf_sz)
+		     __u32 kern_version, char *log_buf, size_t log_buf_sz,
+		     union bpf_prog_subtype *subtype)
 {
 	int fd;
 	union bpf_attr attr;
+	union bpf_prog_subtype st_none = {};
 
 	bzero(&attr, sizeof(attr));
 	attr.prog_type = type;
@@ -85,6 +87,7 @@ int bpf_load_program(enum bpf_prog_type type, const struct bpf_insn *insns,
 	attr.log_size = 0;
 	attr.log_level = 0;
 	attr.kern_version = kern_version;
+	attr.prog_subtype = subtype ? *subtype : st_none;
 
 	fd = sys_bpf(BPF_PROG_LOAD, &attr, sizeof(attr));
 	if (fd >= 0 || !log_buf || !log_buf_sz)
diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
index 09c3dcac0496..c2ddecfbbbba 100644
--- a/tools/lib/bpf/bpf.h
+++ b/tools/lib/bpf/bpf.h
@@ -32,7 +32,7 @@ int bpf_create_map(enum bpf_map_type map_type, int key_size, int value_size,
 int bpf_load_program(enum bpf_prog_type type, const struct bpf_insn *insns,
 		     size_t insns_cnt, const char *license,
 		     __u32 kern_version, char *log_buf,
-		     size_t log_buf_sz);
+		     size_t log_buf_sz, union bpf_prog_subtype *subtype);
 
 int bpf_map_update_elem(int fd, const void *key, const void *value,
 			__u64 flags);
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 84e6b35da4bd..c9a680faead2 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -975,7 +975,7 @@ load_program(enum bpf_prog_type type, struct bpf_insn *insns,
 		pr_warning("Alloc log buffer for bpf loader error, continue without log\n");
 
 	ret = bpf_load_program(type, insns, insns_cnt, license,
-			       kern_version, log_buf, BPF_LOG_BUF_SIZE);
+			       kern_version, log_buf, BPF_LOG_BUF_SIZE, NULL);
 
 	if (ret >= 0) {
 		*pfd = ret;
@@ -1002,7 +1002,7 @@ load_program(enum bpf_prog_type type, struct bpf_insn *insns,
 
 			fd = bpf_load_program(BPF_PROG_TYPE_KPROBE, insns,
 					      insns_cnt, license, kern_version,
-					      NULL, 0);
+					      NULL, 0, NULL);
 			if (fd >= 0) {
 				close(fd);
 				ret = -LIBBPF_ERRNO__PROGTYPE;
diff --git a/tools/perf/tests/bpf.c b/tools/perf/tests/bpf.c
index 92343f43e44a..1b67c7c39127 100644
--- a/tools/perf/tests/bpf.c
+++ b/tools/perf/tests/bpf.c
@@ -266,7 +266,7 @@ static int check_env(void)
 
 	err = bpf_load_program(BPF_PROG_TYPE_KPROBE, insns,
 			       sizeof(insns) / sizeof(insns[0]),
-			       license, kver_int, NULL, 0);
+			       license, kver_int, NULL, 0, NULL);
 	if (err < 0) {
 		pr_err("Missing basic BPF support, skip this test: %s\n",
 		       strerror(errno));
diff --git a/tools/testing/selftests/bpf/test_tag.c b/tools/testing/selftests/bpf/test_tag.c
index de409fc50c35..cf7892c87b5a 100644
--- a/tools/testing/selftests/bpf/test_tag.c
+++ b/tools/testing/selftests/bpf/test_tag.c
@@ -57,7 +57,7 @@ static int bpf_try_load_prog(int insns, int fd_map,
 
 	bpf_filler(insns, fd_map);
 	fd_prog = bpf_load_program(BPF_PROG_TYPE_SCHED_CLS, prog, insns, "", 0,
-				   NULL, 0);
+				   NULL, 0, NULL);
 	assert(fd_prog > 0);
 	if (fd_map > 0)
 		bpf_filler(insns, 0);
diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
index e1f5b9eea1e8..15eeb79104fe 100644
--- a/tools/testing/selftests/bpf/test_verifier.c
+++ b/tools/testing/selftests/bpf/test_verifier.c
@@ -51,6 +51,7 @@ struct bpf_test {
 		REJECT
 	} result, result_unpriv;
 	enum bpf_prog_type prog_type;
+	union bpf_prog_subtype prog_subtype;
 };
 
 /* Note we want this to be 64 bit aligned so that the end of our array is
@@ -4539,7 +4540,7 @@ static void do_test_single(struct bpf_test *test, bool unpriv,
 
 	fd_prog = bpf_load_program(prog_type ? : BPF_PROG_TYPE_SOCKET_FILTER,
 				   prog, prog_len, "GPL", 0, bpf_vlog,
-				   sizeof(bpf_vlog));
+				   sizeof(bpf_vlog), &test->prog_subtype);
 
 	expected_ret = unpriv && test->result_unpriv != UNDEF ?
 		       test->result_unpriv : test->result;
-- 
2.11.0


* [PATCH v5 02/10] bpf,landlock: Define an eBPF program type for Landlock
  2017-02-22  1:26 [PATCH v5 00/10] Landlock LSM: Toward unprivileged sandboxing Mickaël Salaün
  2017-02-22  1:26 ` [PATCH v5 01/10] bpf: Add eBPF program subtype and is_valid_subtype() verifier Mickaël Salaün
@ 2017-02-22  1:26 ` Mickaël Salaün
  2017-02-22  1:26 ` [PATCH v5 03/10] bpf: Define handle_fs and add a new helper bpf_handle_fs_get_mode() Mickaël Salaün
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 26+ messages in thread
From: Mickaël Salaün @ 2017-02-22  1:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mickaël Salaün, Alexei Starovoitov, Andy Lutomirski,
	Arnaldo Carvalho de Melo, Casey Schaufler, Daniel Borkmann,
	David Drysdale, David S . Miller, Eric W . Biederman,
	James Morris, Jann Horn, Jonathan Corbet, Matthew Garrett,
	Michael Kerrisk, Kees Cook, Paul Moore, Sargun Dhillon,
	Serge E . Hallyn, Shuah Khan, Tejun Heo, Thomas Graf,
	Will Drewry, kernel-hardening, linux-api, linux-security-module,
	netdev

Add a new type of eBPF program used by Landlock rules.

This new BPF program type will be registered during the Landlock LSM
initialization.

Add an initial Landlock Kconfig.
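
For illustration, the subtype checks that this program type applies at load
time can be mirrored in userspace.  This is only a sketch: the constants are
copied by hand from the UAPI additions in this patch, `struct
landlock_rule_subtype` and `subtype_is_valid()` are hypothetical names (the
patch uses an anonymous struct inside union bpf_prog_subtype), and the
capable(CAP_SYS_ADMIN) test is reduced to a boolean parameter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hand-copied from this patch; not taken from an installed header. */
#define LANDLOCK_VERSION 1 /* kernel-internal in the patch, repeated here */

enum landlock_subtype_event {
	LANDLOCK_SUBTYPE_EVENT_UNSPEC,
	LANDLOCK_SUBTYPE_EVENT_FS,
};
#define _LANDLOCK_SUBTYPE_EVENT_LAST LANDLOCK_SUBTYPE_EVENT_FS

#define LANDLOCK_SUBTYPE_ABILITY_WRITE	(1ULL << 0)
#define LANDLOCK_SUBTYPE_ABILITY_DEBUG	(1ULL << 1)
#define _LANDLOCK_SUBTYPE_ABILITY_NB	2
#define _LANDLOCK_SUBTYPE_ABILITY_MASK	((1ULL << _LANDLOCK_SUBTYPE_ABILITY_NB) - 1)
#define _LANDLOCK_SUBTYPE_OPTION_NB	0
#define _LANDLOCK_SUBTYPE_OPTION_MASK	((1ULL << _LANDLOCK_SUBTYPE_OPTION_NB) - 1)

/* Hypothetical named mirror of the landlock_rule member of the union. */
struct landlock_rule_subtype {
	uint32_t version;
	uint32_t event;
	uint64_t ability;
	uint64_t option;
};

/* Two 32-bit fields followed by two 64-bit fields: no padding expected. */
static_assert(sizeof(struct landlock_rule_subtype) == 24,
	      "unexpected subtype layout");

/* Same order of checks as bpf_landlock_is_valid_subtype() in hooks.c. */
static bool subtype_is_valid(const struct landlock_rule_subtype *st,
			     bool has_cap_sys_admin)
{
	/* version must be in the range 1..LANDLOCK_VERSION */
	if (st->version == 0 || st->version > LANDLOCK_VERSION)
		return false;
	/* UNSPEC (0) and unknown events are rejected */
	if (st->event == LANDLOCK_SUBTYPE_EVENT_UNSPEC ||
	    st->event > _LANDLOCK_SUBTYPE_EVENT_LAST)
		return false;
	/* no unknown ability or option bits (no option is defined yet) */
	if (st->ability & ~_LANDLOCK_SUBTYPE_ABILITY_MASK)
		return false;
	if (st->option & ~_LANDLOCK_SUBTYPE_OPTION_MASK)
		return false;
	/* both abilities are restricted to privileged loaders */
	if ((st->ability & (LANDLOCK_SUBTYPE_ABILITY_WRITE |
			    LANDLOCK_SUBTYPE_ABILITY_DEBUG)) &&
	    !has_cap_sys_admin)
		return false;
	return true;
}
```

With these checks, a rule that requests no ability is accepted regardless of
privileges, while a rule requesting LANDLOCK_SUBTYPE_ABILITY_WRITE or
LANDLOCK_SUBTYPE_ABILITY_DEBUG is accepted only when has_cap_sys_admin is
true.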

Changes since v4:
* merge minimal (not enabled) LSM code and Kconfig into this commit

Changes since v3:
* split commit
* revamp the landlock_context:
  * add arch, syscall_nr and syscall_cmd (ioctl, fcntl…) to be able to
    cross-check action with the event type
  * replace args array with dedicated fields to ease the addition of new
    fields
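
As a rough illustration of the revamped context, the layout and the role of
syscall_cmd can be checked from userspace.  This is a sketch under
assumptions: the struct and the action bits are copied by hand from this
patch with <stdint.h> types in place of __u32/__u64, and
`ACTION_FS_MULTIPLEXER_MASK` and `action_fills_syscall_cmd()` are
hypothetical names that do not exist in the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hand-copied mirror of struct landlock_context from this patch. */
struct landlock_context {
	uint64_t status;
	uint32_t arch;
	uint32_t syscall_nr;
	uint32_t syscall_cmd;
	uint32_t event;
	uint64_t arg1;
	uint64_t arg2;
};

/* Dedicated fields give every member a fixed, padding-free offset. */
static_assert(offsetof(struct landlock_context, arch) == 8, "arch");
static_assert(offsetof(struct landlock_context, syscall_cmd) == 16, "cmd");
static_assert(offsetof(struct landlock_context, arg1) == 24, "arg1");
static_assert(sizeof(struct landlock_context) == 40, "size");

/* Action bits copied from the landlock_action_fs documentation block. */
#define LANDLOCK_ACTION_FS_READ		(1ULL << 2)
#define LANDLOCK_ACTION_FS_IOCTL	(1ULL << 6)
#define LANDLOCK_ACTION_FS_LOCK		(1ULL << 7)
#define LANDLOCK_ACTION_FS_FCNTL	(1ULL << 8)

/* Hypothetical helper mask: the actions documented as multiplexers. */
#define ACTION_FS_MULTIPLEXER_MASK \
	(LANDLOCK_ACTION_FS_IOCTL | LANDLOCK_ACTION_FS_LOCK | \
	 LANDLOCK_ACTION_FS_FCNTL)

/*
 * Return true when the action set contains a syscall multiplexer action,
 * i.e. when syscall_cmd is documented to carry the custom command.
 */
static bool action_fills_syscall_cmd(uint64_t action)
{
	return (action & ACTION_FS_MULTIPLEXER_MASK) != 0;
}
```

The static assertions only document the layout as it appears in this patch;
they say nothing about what a later version of the series may expose.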

Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: James Morris <james.l.morris@oracle.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Serge E. Hallyn <serge@hallyn.com>
---
 include/linux/landlock.h       |  80 ++++++++++++++++++++++++++
 include/uapi/linux/bpf.h       | 105 ++++++++++++++++++++++++++++++++++
 security/Kconfig               |   1 +
 security/Makefile              |   2 +
 security/landlock/Kconfig      |  18 ++++++
 security/landlock/Makefile     |   3 +
 security/landlock/common.h     |  25 +++++++++
 security/landlock/hooks.c      | 124 +++++++++++++++++++++++++++++++++++++++++
 tools/include/uapi/linux/bpf.h | 105 ++++++++++++++++++++++++++++++++++
 9 files changed, 463 insertions(+)
 create mode 100644 include/linux/landlock.h
 create mode 100644 security/landlock/Kconfig
 create mode 100644 security/landlock/Makefile
 create mode 100644 security/landlock/common.h
 create mode 100644 security/landlock/hooks.c

diff --git a/include/linux/landlock.h b/include/linux/landlock.h
new file mode 100644
index 000000000000..6be3c02dfc7c
--- /dev/null
+++ b/include/linux/landlock.h
@@ -0,0 +1,80 @@
+/*
+ * Landlock LSM - Public headers
+ *
+ * Copyright © 2017 Mickaël Salaün <mic@digikod.net>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2, as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef _LINUX_LANDLOCK_H
+#define _LINUX_LANDLOCK_H
+#ifdef CONFIG_SECURITY_LANDLOCK
+
+#include <linux/bpf.h>	/* _LANDLOCK_SUBTYPE_EVENT_LAST */
+#include <linux/types.h> /* atomic_t */
+
+/*
+ * This is not intended for the UAPI headers. Each userland software should use
+ * a static minimal version for the required features as explained in the
+ * documentation.
+ */
+#define LANDLOCK_VERSION 1
+
+struct landlock_rule {
+	atomic_t usage;
+	struct landlock_rule *prev;
+	struct bpf_prog *prog;
+};
+
+/**
+ * struct landlock_node - node in the rule hierarchy
+ *
+ * This is created when a task inserts its first rule in the Landlock rule
+ * hierarchy. The set of Landlock rules referenced by this node is then
+ * enforced for all the tasks that inherit this node. However, if a task is
+ * cloned before inserting any rule, it doesn't get a dedicated node and its
+ * children will not inherit any rules from this task.
+ *
+ * @usage: reference count to manage the node lifetime
+ * @rule: list of Landlock rules managed by this node
+ * @prev: reference the parent node
+ * @owner: reference the address of the node in the &struct landlock_events.
+ *         This is needed to know if we need to append a rule to the current
+ *         node or create a new node.
+ */
+struct landlock_node {
+	atomic_t usage;
+	struct landlock_rule *rule;
+	struct landlock_node *prev;
+	struct landlock_node **owner;
+};
+
+/**
+ * struct landlock_events - Landlock event rules enforced on a thread
+ *
+ * This keeps the performance impact of forking a process low. Instead of
+ * copying the full array and incrementing the usage of each entry, only
+ * create a pointer to &struct landlock_events and increment its usage.
+ *
+ * @usage: reference count to manage the object lifetime. When a thread needs to
+ *         add Landlock rules and if @usage is greater than 1, then the thread
+ *         must duplicate &struct landlock_events to not change the children's
+ *         rules as well.
+ * @nodes: array of non-NULL &struct landlock_node pointers
+ */
+struct landlock_events {
+	atomic_t usage;
+	struct landlock_node *nodes[_LANDLOCK_SUBTYPE_EVENT_LAST];
+};
+
+void put_landlock_events(struct landlock_events *events);
+
+#ifdef CONFIG_SECCOMP_FILTER
+int landlock_seccomp_append_prog(unsigned int flags,
+		const char __user *user_bpf_fd);
+#endif /* CONFIG_SECCOMP_FILTER */
+
+#endif /* CONFIG_SECURITY_LANDLOCK */
+#endif /* _LINUX_LANDLOCK_H */
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 240c76f09d0d..c9c909a84f0b 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -112,6 +112,7 @@ enum bpf_prog_type {
 	BPF_PROG_TYPE_LWT_IN,
 	BPF_PROG_TYPE_LWT_OUT,
 	BPF_PROG_TYPE_LWT_XMIT,
+	BPF_PROG_TYPE_LANDLOCK,
 };
 
 enum bpf_attach_type {
@@ -643,4 +644,108 @@ struct xdp_md {
 	__u32 data_end;
 };
 
+/**
+ * enum landlock_subtype_event - event occurring when an action is performed on
+ * a particular kernel object
+ *
+ * An event is a policy decision point which exposes the same context type
+ * (especially the same arg[0-9] field types) for each rule execution.
+ *
+ * @LANDLOCK_SUBTYPE_EVENT_UNSPEC: invalid value
+ * @LANDLOCK_SUBTYPE_EVENT_FS: generic filesystem event
+ */
+enum landlock_subtype_event {
+	LANDLOCK_SUBTYPE_EVENT_UNSPEC,
+	LANDLOCK_SUBTYPE_EVENT_FS,
+};
+#define _LANDLOCK_SUBTYPE_EVENT_LAST LANDLOCK_SUBTYPE_EVENT_FS
+
+/**
+ * DOC: landlock_subtype_access
+ *
+ * eBPF context and functions allowed for a rule
+ *
+ * - LANDLOCK_SUBTYPE_ABILITY_WRITE: allows sending notifications directly to
+ *   userland (e.g. through a map), which may leak sensitive information
+ * - LANDLOCK_SUBTYPE_ABILITY_DEBUG: allows debug actions (e.g. writing
+ *   logs), which may be dangerous and should only be used for rule testing
+ */
+#define LANDLOCK_SUBTYPE_ABILITY_WRITE		(1ULL << 0)
+#define LANDLOCK_SUBTYPE_ABILITY_DEBUG		(1ULL << 1)
+#define _LANDLOCK_SUBTYPE_ABILITY_NB		2
+#define _LANDLOCK_SUBTYPE_ABILITY_MASK		((1ULL << _LANDLOCK_SUBTYPE_ABILITY_NB) - 1)
+
+/*
+ * Future options for a Landlock rule (e.g. run even if a previous rule denied
+ * an action).
+ */
+#define _LANDLOCK_SUBTYPE_OPTION_NB		0
+#define _LANDLOCK_SUBTYPE_OPTION_MASK		((1ULL << _LANDLOCK_SUBTYPE_OPTION_NB) - 1)
+
+/*
+ * Status visible in the @status field of a context (e.g. already called in
+ * this syscall session, with same args...).
+ *
+ * The @status field exposed to a rule shall depend on the rule version.
+ */
+#define _LANDLOCK_SUBTYPE_STATUS_NB		0
+#define _LANDLOCK_SUBTYPE_STATUS_MASK		((1ULL << _LANDLOCK_SUBTYPE_STATUS_NB) - 1)
+
+/**
+ * DOC: landlock_action_fs
+ *
+ * - %LANDLOCK_ACTION_FS_EXEC: execute a file or walk through a directory
+ * - %LANDLOCK_ACTION_FS_WRITE: modify a file or a directory view (which
+ *   includes mount actions)
+ * - %LANDLOCK_ACTION_FS_READ: read a file or a directory
+ * - %LANDLOCK_ACTION_FS_NEW: create a file or a directory
+ * - %LANDLOCK_ACTION_FS_GET: open or receive a file
+ * - %LANDLOCK_ACTION_FS_REMOVE: unlink a file or remove a directory
+ *
+ * Each of the following actions is specific to syscall multiplexers. They
+ * fill the syscall_cmd field from &struct landlock_context with their custom
+ * command.
+ *
+ * - %LANDLOCK_ACTION_FS_IOCTL: ioctl command
+ * - %LANDLOCK_ACTION_FS_LOCK: flock or fcntl lock command
+ * - %LANDLOCK_ACTION_FS_FCNTL: fcntl command
+ */
+#define LANDLOCK_ACTION_FS_EXEC			(1ULL << 0)
+#define LANDLOCK_ACTION_FS_WRITE		(1ULL << 1)
+#define LANDLOCK_ACTION_FS_READ			(1ULL << 2)
+#define LANDLOCK_ACTION_FS_NEW			(1ULL << 3)
+#define LANDLOCK_ACTION_FS_GET			(1ULL << 4)
+#define LANDLOCK_ACTION_FS_REMOVE		(1ULL << 5)
+#define LANDLOCK_ACTION_FS_IOCTL		(1ULL << 6)
+#define LANDLOCK_ACTION_FS_LOCK			(1ULL << 7)
+#define LANDLOCK_ACTION_FS_FCNTL		(1ULL << 8)
+#define _LANDLOCK_ACTION_FS_NB			9
+#define _LANDLOCK_ACTION_FS_MASK		((1ULL << _LANDLOCK_ACTION_FS_NB) - 1)
+
+
+/**
+ * struct landlock_context - context accessible to a Landlock rule
+ *
+ * @status: bitfield for future use (LANDLOCK_SUBTYPE_STATUS_*)
+ * @arch: indicates system call convention as an AUDIT_ARCH_* value
+ *        as defined in <linux/audit.h>
+ * @syscall_nr: the system call number called by the current process (may be
+ *              useful to debug: find out from which syscall this request came
+ *              from)
+ * @syscall_cmd: contains the command used by a multiplexer syscall (e.g.
+ *               ioctl, fcntl, flock)
+ * @event: event type (&enum landlock_subtype_event)
+ * @arg1: first event's optional argument
+ * @arg2: second event's optional argument
+ */
+struct landlock_context {
+	__u64 status;
+	__u32 arch;
+	__u32 syscall_nr;
+	__u32 syscall_cmd;
+	__u32 event;
+	__u64 arg1;
+	__u64 arg2;
+};
+
 #endif /* _UAPI__LINUX_BPF_H__ */
diff --git a/security/Kconfig b/security/Kconfig
index 118f4549404e..c63194c561c5 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -164,6 +164,7 @@ source security/tomoyo/Kconfig
 source security/apparmor/Kconfig
 source security/loadpin/Kconfig
 source security/yama/Kconfig
+source security/landlock/Kconfig
 
 source security/integrity/Kconfig
 
diff --git a/security/Makefile b/security/Makefile
index f2d71cdb8e19..3fdc2f19dc48 100644
--- a/security/Makefile
+++ b/security/Makefile
@@ -9,6 +9,7 @@ subdir-$(CONFIG_SECURITY_TOMOYO)        += tomoyo
 subdir-$(CONFIG_SECURITY_APPARMOR)	+= apparmor
 subdir-$(CONFIG_SECURITY_YAMA)		+= yama
 subdir-$(CONFIG_SECURITY_LOADPIN)	+= loadpin
+subdir-$(CONFIG_SECURITY_LANDLOCK)		+= landlock
 
 # always enable default capabilities
 obj-y					+= commoncap.o
@@ -24,6 +25,7 @@ obj-$(CONFIG_SECURITY_TOMOYO)		+= tomoyo/
 obj-$(CONFIG_SECURITY_APPARMOR)		+= apparmor/
 obj-$(CONFIG_SECURITY_YAMA)		+= yama/
 obj-$(CONFIG_SECURITY_LOADPIN)		+= loadpin/
+obj-$(CONFIG_SECURITY_LANDLOCK)	+= landlock/
 obj-$(CONFIG_CGROUP_DEVICE)		+= device_cgroup.o
 
 # Object integrity file lists
diff --git a/security/landlock/Kconfig b/security/landlock/Kconfig
new file mode 100644
index 000000000000..aa5808e116f1
--- /dev/null
+++ b/security/landlock/Kconfig
@@ -0,0 +1,18 @@
+config SECURITY_LANDLOCK
+	bool "Landlock sandbox support"
+	depends on SECURITY
+	depends on BPF_SYSCALL
+	depends on SECCOMP_FILTER
+	default y
+	help
+	  Landlock is a stackable LSM which allows to load a security policy to
+	  restrict processes (i.e. create a sandbox). The policy is a list of
+	  stacked eBPF programs, called rules, dedicated to restrict access to
+	  a type of kernel object (e.g. file).
+
+	  You need to enable seccomp filter to apply a security policy to a
+	  process hierarchy (e.g. application with built-in sandboxing).
+
+	  See Documentation/security/landlock/ for further information.
+
+	  If you are unsure how to answer this question, answer Y.
diff --git a/security/landlock/Makefile b/security/landlock/Makefile
new file mode 100644
index 000000000000..b91af42f0c32
--- /dev/null
+++ b/security/landlock/Makefile
@@ -0,0 +1,3 @@
+obj-$(CONFIG_SECURITY_LANDLOCK) := landlock.o
+
+landlock-y := hooks.o
diff --git a/security/landlock/common.h b/security/landlock/common.h
new file mode 100644
index 000000000000..a2483405349f
--- /dev/null
+++ b/security/landlock/common.h
@@ -0,0 +1,25 @@
+/*
+ * Landlock LSM - private headers
+ *
+ * Copyright © 2017 Mickaël Salaün <mic@digikod.net>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2, as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef _SECURITY_LANDLOCK_COMMON_H
+#define _SECURITY_LANDLOCK_COMMON_H
+
+/**
+ * get_index - get an index for the rules of struct landlock_events
+ *
+ * @event: a Landlock event type
+ */
+static inline int get_index(enum landlock_subtype_event event)
+{
+	/* event ID > 0 for loaded programs */
+	return event - 1;
+}
+
+#endif /* _SECURITY_LANDLOCK_COMMON_H */
diff --git a/security/landlock/hooks.c b/security/landlock/hooks.c
new file mode 100644
index 000000000000..28a26bd8c1a2
--- /dev/null
+++ b/security/landlock/hooks.c
@@ -0,0 +1,124 @@
+/*
+ * Landlock LSM - hooks
+ *
+ * Copyright © 2017 Mickaël Salaün <mic@digikod.net>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2, as
+ * published by the Free Software Foundation.
+ */
+
+#include <asm/current.h>
+#include <asm/processor.h> /* task_pt_regs() */
+#include <asm/syscall.h> /* syscall_get_nr(), syscall_get_arch() */
+#include <linux/bpf.h> /* enum bpf_access_type, enum bpf_*, enum landlock_subtype_event, struct landlock_context, struct bpf_handle_fs  */
+#include <linux/err.h> /* EPERM */
+#include <linux/filter.h> /* struct bpf_prog, BPF_PROG_RUN() */
+#include <linux/kernel.h> /* ARRAY_SIZE */
+#include <linux/landlock.h> /* struct landlock_node */
+#include <linux/lsm_hooks.h>
+#include <linux/seccomp.h> /* struct seccomp_* */
+#include <linux/stddef.h> /* offsetof */
+#include <linux/types.h> /* uintptr_t */
+
+#define CTX_ARG_NB 2
+
+
+static inline bool bpf_landlock_is_valid_access(int off, int size,
+		enum bpf_access_type type, enum bpf_reg_type *reg_type,
+		union bpf_prog_subtype *prog_subtype)
+{
+	return false;
+}
+
+static inline bool bpf_landlock_is_valid_subtype(
+		union bpf_prog_subtype *prog_subtype)
+{
+	enum landlock_subtype_event event = prog_subtype->landlock_rule.event;
+
+	switch (event) {
+	case LANDLOCK_SUBTYPE_EVENT_FS:
+		break;
+	case LANDLOCK_SUBTYPE_EVENT_UNSPEC:
+	default:
+		return false;
+	}
+	if (!prog_subtype->landlock_rule.version ||
+			prog_subtype->landlock_rule.version > LANDLOCK_VERSION)
+		return false;
+	if (!prog_subtype->landlock_rule.event ||
+			prog_subtype->landlock_rule.event > _LANDLOCK_SUBTYPE_EVENT_LAST)
+		return false;
+	if (prog_subtype->landlock_rule.ability & ~_LANDLOCK_SUBTYPE_ABILITY_MASK)
+		return false;
+	if (prog_subtype->landlock_rule.option & ~_LANDLOCK_SUBTYPE_OPTION_MASK)
+		return false;
+
+	/* check ability flags */
+	if (prog_subtype->landlock_rule.ability & LANDLOCK_SUBTYPE_ABILITY_WRITE &&
+			!capable(CAP_SYS_ADMIN))
+		return false;
+	if (prog_subtype->landlock_rule.ability & LANDLOCK_SUBTYPE_ABILITY_DEBUG &&
+			!capable(CAP_SYS_ADMIN))
+		return false;
+
+	return true;
+}
+
+static inline const struct bpf_func_proto *bpf_landlock_func_proto(
+		enum bpf_func_id func_id, union bpf_prog_subtype *prog_subtype)
+{
+	bool event_fs = (prog_subtype->landlock_rule.event ==
+			LANDLOCK_SUBTYPE_EVENT_FS);
+	bool ability_write = !!(prog_subtype->landlock_rule.ability &
+			LANDLOCK_SUBTYPE_ABILITY_WRITE);
+	bool ability_debug = !!(prog_subtype->landlock_rule.ability &
+			LANDLOCK_SUBTYPE_ABILITY_DEBUG);
+
+	switch (func_id) {
+	case BPF_FUNC_map_lookup_elem:
+		return &bpf_map_lookup_elem_proto;
+
+	/* ability_write */
+	case BPF_FUNC_map_delete_elem:
+		if (ability_write)
+			return &bpf_map_delete_elem_proto;
+		return NULL;
+	case BPF_FUNC_map_update_elem:
+		if (ability_write)
+			return &bpf_map_update_elem_proto;
+		return NULL;
+
+	/* ability_debug */
+	case BPF_FUNC_get_current_comm:
+		if (ability_debug)
+			return &bpf_get_current_comm_proto;
+		return NULL;
+	case BPF_FUNC_get_current_pid_tgid:
+		if (ability_debug)
+			return &bpf_get_current_pid_tgid_proto;
+		return NULL;
+	case BPF_FUNC_get_current_uid_gid:
+		if (ability_debug)
+			return &bpf_get_current_uid_gid_proto;
+		return NULL;
+	case BPF_FUNC_trace_printk:
+		if (ability_debug)
+			return bpf_get_trace_printk_proto();
+		return NULL;
+
+	default:
+		return NULL;
+	}
+}
+
+static const struct bpf_verifier_ops bpf_landlock_ops = {
+	.get_func_proto	= bpf_landlock_func_proto,
+	.is_valid_access = bpf_landlock_is_valid_access,
+	.is_valid_subtype = bpf_landlock_is_valid_subtype,
+};
+
+static struct bpf_prog_type_list bpf_landlock_type __ro_after_init = {
+	.ops = &bpf_landlock_ops,
+	.type = BPF_PROG_TYPE_LANDLOCK,
+};
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 240c76f09d0d..c9c909a84f0b 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -112,6 +112,7 @@ enum bpf_prog_type {
 	BPF_PROG_TYPE_LWT_IN,
 	BPF_PROG_TYPE_LWT_OUT,
 	BPF_PROG_TYPE_LWT_XMIT,
+	BPF_PROG_TYPE_LANDLOCK,
 };
 
 enum bpf_attach_type {
@@ -643,4 +644,108 @@ struct xdp_md {
 	__u32 data_end;
 };
 
+/**
+ * enum landlock_subtype_event - event occurring when an action is performed on
+ * a particular kernel object
+ *
+ * An event is a policy decision point which exposes the same context type
+ * (especially the same arg[0-9] field types) for each rule execution.
+ *
+ * @LANDLOCK_SUBTYPE_EVENT_UNSPEC: invalid value
+ * @LANDLOCK_SUBTYPE_EVENT_FS: generic filesystem event
+ */
+enum landlock_subtype_event {
+	LANDLOCK_SUBTYPE_EVENT_UNSPEC,
+	LANDLOCK_SUBTYPE_EVENT_FS,
+};
+#define _LANDLOCK_SUBTYPE_EVENT_LAST LANDLOCK_SUBTYPE_EVENT_FS
+
+/**
+ * DOC: landlock_subtype_access
+ *
+ * eBPF context and functions allowed for a rule
+ *
+ * - LANDLOCK_SUBTYPE_ABILITY_WRITE: allows sending notifications directly to
+ *   userland (e.g. through a map), which may leak sensitive information
+ * - LANDLOCK_SUBTYPE_ABILITY_DEBUG: allows debug actions (e.g. writing
+ *   logs), which may be dangerous and should only be used for rule testing
+ */
+#define LANDLOCK_SUBTYPE_ABILITY_WRITE		(1ULL << 0)
+#define LANDLOCK_SUBTYPE_ABILITY_DEBUG		(1ULL << 1)
+#define _LANDLOCK_SUBTYPE_ABILITY_NB		2
+#define _LANDLOCK_SUBTYPE_ABILITY_MASK		((1ULL << _LANDLOCK_SUBTYPE_ABILITY_NB) - 1)
+
+/*
+ * Future options for a Landlock rule (e.g. run even if a previous rule denied
+ * an action).
+ */
+#define _LANDLOCK_SUBTYPE_OPTION_NB		0
+#define _LANDLOCK_SUBTYPE_OPTION_MASK		((1ULL << _LANDLOCK_SUBTYPE_OPTION_NB) - 1)
+
+/*
+ * Status visible in the @status field of a context (e.g. already called in
+ * this syscall session, with same args...).
+ *
+ * The @status field exposed to a rule shall depend on the rule version.
+ */
+#define _LANDLOCK_SUBTYPE_STATUS_NB		0
+#define _LANDLOCK_SUBTYPE_STATUS_MASK		((1ULL << _LANDLOCK_SUBTYPE_STATUS_NB) - 1)
+
+/**
+ * DOC: landlock_action_fs
+ *
+ * - %LANDLOCK_ACTION_FS_EXEC: execute a file or walk through a directory
+ * - %LANDLOCK_ACTION_FS_WRITE: modify a file or a directory view (which
+ *   includes mount actions)
+ * - %LANDLOCK_ACTION_FS_READ: read a file or a directory
+ * - %LANDLOCK_ACTION_FS_NEW: create a file or a directory
+ * - %LANDLOCK_ACTION_FS_GET: open or receive a file
+ * - %LANDLOCK_ACTION_FS_REMOVE: unlink a file or remove a directory
+ *
+ * Each of the following actions is specific to syscall multiplexers. They
+ * fill the syscall_cmd field from &struct landlock_context with their custom
+ * command.
+ *
+ * - %LANDLOCK_ACTION_FS_IOCTL: ioctl command
+ * - %LANDLOCK_ACTION_FS_LOCK: flock or fcntl lock command
+ * - %LANDLOCK_ACTION_FS_FCNTL: fcntl command
+ */
+#define LANDLOCK_ACTION_FS_EXEC			(1ULL << 0)
+#define LANDLOCK_ACTION_FS_WRITE		(1ULL << 1)
+#define LANDLOCK_ACTION_FS_READ			(1ULL << 2)
+#define LANDLOCK_ACTION_FS_NEW			(1ULL << 3)
+#define LANDLOCK_ACTION_FS_GET			(1ULL << 4)
+#define LANDLOCK_ACTION_FS_REMOVE		(1ULL << 5)
+#define LANDLOCK_ACTION_FS_IOCTL		(1ULL << 6)
+#define LANDLOCK_ACTION_FS_LOCK			(1ULL << 7)
+#define LANDLOCK_ACTION_FS_FCNTL		(1ULL << 8)
+#define _LANDLOCK_ACTION_FS_NB			9
+#define _LANDLOCK_ACTION_FS_MASK		((1ULL << _LANDLOCK_ACTION_FS_NB) - 1)
+
+
+/**
+ * struct landlock_context - context accessible to a Landlock rule
+ *
+ * @status: bitfield for future use (LANDLOCK_SUBTYPE_STATUS_*)
+ * @arch: indicates system call convention as an AUDIT_ARCH_* value
+ *        as defined in <linux/audit.h>
+ * @syscall_nr: the system call number called by the current process (may be
+ *              useful to debug: find out from which syscall this request came
+ *              from)
+ * @syscall_cmd: contains the command used by a multiplexer syscall (e.g.
+ *               ioctl, fcntl, flock)
+ * @event: event type (&enum landlock_subtype_event)
+ * @arg1: first event's optional argument
+ * @arg2: second event's optional argument
+ */
+struct landlock_context {
+	__u64 status;
+	__u32 arch;
+	__u32 syscall_nr;
+	__u32 syscall_cmd;
+	__u32 event;
+	__u64 arg1;
+	__u64 arg2;
+};
+
 #endif /* _UAPI__LINUX_BPF_H__ */
-- 
2.11.0


* [PATCH v5 03/10] bpf: Define handle_fs and add a new helper bpf_handle_fs_get_mode()
  2017-02-22  1:26 [PATCH v5 00/10] Landlock LSM: Toward unprivileged sandboxing Mickaël Salaün
  2017-02-22  1:26 ` [PATCH v5 01/10] bpf: Add eBPF program subtype and is_valid_subtype() verifier Mickaël Salaün
  2017-02-22  1:26 ` [PATCH v5 02/10] bpf,landlock: Define an eBPF program type for Landlock Mickaël Salaün
@ 2017-02-22  1:26 ` Mickaël Salaün
  2017-03-01  9:32   ` James Morris
  2017-02-22  1:26 ` [PATCH v5 04/10] landlock: Add LSM hooks related to filesystem Mickaël Salaün
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 26+ messages in thread
From: Mickaël Salaün @ 2017-02-22  1:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mickaël Salaün, Alexei Starovoitov, Andy Lutomirski,
	Arnaldo Carvalho de Melo, Casey Schaufler, Daniel Borkmann,
	David Drysdale, David S . Miller, Eric W . Biederman,
	James Morris, Jann Horn, Jonathan Corbet, Matthew Garrett,
	Michael Kerrisk, Kees Cook, Paul Moore, Sargun Dhillon,
	Serge E . Hallyn, Shuah Khan, Tejun Heo, Thomas Graf,
	Will Drewry, kernel-hardening, linux-api, linux-security-module,
	netdev

Add an eBPF function bpf_handle_fs_get_mode(handle_fs) to get the mode
of an abstract object wrapping either a file, a dentry, a path, or an
inode.

Changes since v4:
* use a file abstraction (handle) to wrap inode, dentry, path and file
  structs
* remove bpf_landlock_cmp_fs_beneath()
* rename the BPF helper and move it to kernel/bpf/
* tighten helpers accessible by a Landlock rule

Changes since v3:
* remove bpf_landlock_cmp_fs_prop() (suggested by Alexei Starovoitov)
* add hooks dealing with struct inode and struct path pointers:
  inode_permission and inode_getattr
* add abstraction over eBPF helper arguments thanks to wrapping structs
* add bpf_landlock_get_fs_mode() helper to check file type and mode
* merge WARN_ON() (suggested by Kees Cook)
* fix and update bpf_helpers.h
* use BPF_CALL_* for eBPF helpers (suggested by Alexei Starovoitov)
* make handle arraymap safe (RCU) and remove buggy synchronize_rcu()
* factor out the arraymap walk
* use size_t to index array (suggested by Jann Horn)

Changes since v2:
* add MNT_INTERNAL check to only add file handle from user-visible FS
  (e.g. no anonymous inode)
* replace struct file* with struct path* in map_landlock_handle
* add BPF protos
* fix bpf_landlock_cmp_fs_prop_with_struct_file()

Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: James Morris <james.l.morris@oracle.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Serge E. Hallyn <serge@hallyn.com>
Cc: Jann Horn <jann@thejh.net>
---
 include/linux/bpf.h            | 33 +++++++++++++++++++++++++++
 include/uapi/linux/bpf.h       | 10 +++++++-
 kernel/bpf/Makefile            |  2 +-
 kernel/bpf/helpers_fs.c        | 52 ++++++++++++++++++++++++++++++++++++++++++
 kernel/bpf/verifier.c          |  6 +++++
 samples/bpf/bpf_helpers.h      |  2 ++
 security/landlock/hooks.c      |  8 ++++++-
 tools/include/uapi/linux/bpf.h | 10 +++++++-
 8 files changed, 119 insertions(+), 4 deletions(-)
 create mode 100644 kernel/bpf/helpers_fs.c

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index dd954048aa19..bc01a7388168 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -15,6 +15,11 @@
 #include <linux/err.h>
 #include <linux/rbtree_latch.h>
 
+/* FS helpers */
+#include <linux/dcache.h> /* struct dentry */
+#include <linux/fs.h> /* struct file, struct inode */
+#include <linux/path.h> /* struct path */
+
 struct perf_event;
 struct bpf_map;
 
@@ -82,6 +87,8 @@ enum bpf_arg_type {
 
 	ARG_PTR_TO_CTX,		/* pointer to context */
 	ARG_ANYTHING,		/* any (initialized) argument is ok */
+
+	ARG_CONST_PTR_TO_HANDLE_FS,	/* pointer to an abstract FS struct */
 };
 
 /* type of values returned from helper functions */
@@ -148,6 +155,9 @@ enum bpf_reg_type {
 	 * map element.
 	 */
 	PTR_TO_MAP_VALUE_ADJ,
+
+	/* FS helpers */
+	CONST_PTR_TO_HANDLE_FS,
 };
 
 struct bpf_prog;
@@ -220,6 +230,26 @@ struct bpf_event_entry {
 	struct rcu_head rcu;
 };
 
+/* FS helpers */
+enum bpf_handle_fs_type {
+	BPF_HANDLE_FS_TYPE_NONE,
+	BPF_HANDLE_FS_TYPE_FILE,
+	BPF_HANDLE_FS_TYPE_INODE,
+	BPF_HANDLE_FS_TYPE_PATH,
+	BPF_HANDLE_FS_TYPE_DENTRY,
+};
+
+struct bpf_handle_fs {
+	enum bpf_handle_fs_type type;
+	union {
+		struct file *file;
+		struct inode *inode;
+		const struct path *path;
+		struct dentry *dentry;
+	};
+};
+
+
 u64 bpf_tail_call(u64 ctx, u64 r2, u64 index, u64 r4, u64 r5);
 u64 bpf_get_stackid(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5);
 
@@ -358,6 +388,9 @@ extern const struct bpf_func_proto bpf_skb_vlan_push_proto;
 extern const struct bpf_func_proto bpf_skb_vlan_pop_proto;
 extern const struct bpf_func_proto bpf_get_stackid_proto;
 
+/* FS helpers */
+extern const struct bpf_func_proto bpf_handle_fs_get_mode_proto;
+
 /* Shared helpers among cBPF and eBPF. */
 void bpf_user_rnd_init_once(void);
 u64 bpf_user_rnd_u32(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5);
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index c9c909a84f0b..ffceb42ccc4e 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -467,6 +467,13 @@ union bpf_attr {
  *     Return:
  *       > 0 length of the string including the trailing NUL on success
  *       < 0 error
+ *
+ * s64 bpf_handle_fs_get_mode(handle_fs)
+ *     Get the mode of a struct bpf_handle_fs
+ *     fs: struct bpf_handle_fs address
+ *     Return:
+ *       >= 0 file mode
+ *       < 0 error
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -514,7 +521,8 @@ union bpf_attr {
 	FN(get_numa_node_id),		\
 	FN(skb_change_head),		\
 	FN(xdp_adjust_head),		\
-	FN(probe_read_str),
+	FN(probe_read_str),		\
+	FN(handle_fs_get_mode),
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
  * function eBPF program intends to call
diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile
index e1ce4f4fd7fd..be090e802aa5 100644
--- a/kernel/bpf/Makefile
+++ b/kernel/bpf/Makefile
@@ -1,6 +1,6 @@
 obj-y := core.o
 
-obj-$(CONFIG_BPF_SYSCALL) += syscall.o verifier.o inode.o helpers.o
+obj-$(CONFIG_BPF_SYSCALL) += syscall.o verifier.o inode.o helpers.o helpers_fs.o
 obj-$(CONFIG_BPF_SYSCALL) += hashtab.o arraymap.o percpu_freelist.o bpf_lru_list.o lpm_trie.o
 ifeq ($(CONFIG_PERF_EVENTS),y)
 obj-$(CONFIG_BPF_SYSCALL) += stackmap.o
diff --git a/kernel/bpf/helpers_fs.c b/kernel/bpf/helpers_fs.c
new file mode 100644
index 000000000000..8166b5b42cd7
--- /dev/null
+++ b/kernel/bpf/helpers_fs.c
@@ -0,0 +1,52 @@
+/*
+ * BPF filesystem helpers
+ *
+ * Copyright © 2017 Mickaël Salaün <mic@digikod.net>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2, as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/bpf.h> /* struct bpf_handle_fs */
+#include <linux/errno.h>
+#include <linux/filter.h> /* BPF_CALL*() */
+
+BPF_CALL_1(bpf_handle_fs_get_mode, struct bpf_handle_fs *, handle_fs)
+{
+	if (WARN_ON(!handle_fs))
+		return -EFAULT;
+	if (!handle_fs->file) {
+		/* file can be null for anonymous mmap */
+		WARN_ON(handle_fs->type != BPF_HANDLE_FS_TYPE_FILE);
+		return -ENOENT;
+	}
+	switch (handle_fs->type) {
+		case BPF_HANDLE_FS_TYPE_FILE:
+			if (WARN_ON(!handle_fs->file->f_inode))
+				return -ENOENT;
+			return handle_fs->file->f_inode->i_mode;
+		case BPF_HANDLE_FS_TYPE_INODE:
+			return handle_fs->inode->i_mode;
+		case BPF_HANDLE_FS_TYPE_PATH:
+			if (WARN_ON(!handle_fs->path->dentry ||
+					!handle_fs->path->dentry->d_inode))
+				return -ENOENT;
+			return handle_fs->path->dentry->d_inode->i_mode;
+		case BPF_HANDLE_FS_TYPE_DENTRY:
+			if (WARN_ON(!handle_fs->dentry->d_inode))
+				return -ENOENT;
+			return handle_fs->dentry->d_inode->i_mode;
+		case BPF_HANDLE_FS_TYPE_NONE:
+		default:
+			WARN_ON(1);
+			return -EFAULT;
+	}
+}
+
+const struct bpf_func_proto bpf_handle_fs_get_mode_proto = {
+	.func		= bpf_handle_fs_get_mode,
+	.gpl_only	= true,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_CONST_PTR_TO_HANDLE_FS,
+};
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index fe26ec007a9a..ab0d4fbba399 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -189,6 +189,7 @@ static const char * const reg_type_str[] = {
 	[CONST_IMM]		= "imm",
 	[PTR_TO_PACKET]		= "pkt",
 	[PTR_TO_PACKET_END]	= "pkt_end",
+	[CONST_PTR_TO_HANDLE_FS] = "handle_fs",
 };
 
 #define __BPF_FUNC_STR_FN(x) [BPF_FUNC_ ## x] = __stringify(bpf_ ## x)
@@ -546,6 +547,7 @@ static bool is_spillable_regtype(enum bpf_reg_type type)
 	case PTR_TO_PACKET_END:
 	case FRAME_PTR:
 	case CONST_PTR_TO_MAP:
+	case CONST_PTR_TO_HANDLE_FS:
 		return true;
 	default:
 		return false;
@@ -1052,6 +1054,10 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 regno,
 		expected_type = PTR_TO_CTX;
 		if (type != expected_type)
 			goto err_type;
+	} else if (arg_type == ARG_CONST_PTR_TO_HANDLE_FS) {
+		expected_type = CONST_PTR_TO_HANDLE_FS;
+		if (type != expected_type)
+			goto err_type;
 	} else if (arg_type == ARG_PTR_TO_MEM ||
 		   arg_type == ARG_PTR_TO_UNINIT_MEM) {
 		expected_type = PTR_TO_STACK;
diff --git a/samples/bpf/bpf_helpers.h b/samples/bpf/bpf_helpers.h
index faaffe2e139a..323e4b13d1d1 100644
--- a/samples/bpf/bpf_helpers.h
+++ b/samples/bpf/bpf_helpers.h
@@ -59,6 +59,8 @@ static unsigned long long (*bpf_get_prandom_u32)(void) =
 	(void *) BPF_FUNC_get_prandom_u32;
 static int (*bpf_xdp_adjust_head)(void *ctx, int offset) =
 	(void *) BPF_FUNC_xdp_adjust_head;
+static long long (*bpf_handle_fs_get_mode)(void *handle_fs) =
+	(void *) BPF_FUNC_handle_fs_get_mode;
 
 /* llvm builtin functions that eBPF C program may use to
  * emit BPF_LD_ABS and BPF_LD_IND instructions
diff --git a/security/landlock/hooks.c b/security/landlock/hooks.c
index 28a26bd8c1a2..6c1ad0e03cfc 100644
--- a/security/landlock/hooks.c
+++ b/security/landlock/hooks.c
@@ -11,7 +11,7 @@
 #include <asm/current.h>
 #include <asm/processor.h> /* task_pt_regs() */
 #include <asm/syscall.h> /* syscall_get_nr(), syscall_get_arch() */
-#include <linux/bpf.h> /* enum bpf_access_type, enum bpf_*, enum landlock_subtype_event, struct landlock_context, struct bpf_handle_fs  */
+#include <linux/bpf.h> /* enum bpf_access_type, struct landlock_context */
 #include <linux/err.h> /* EPERM */
 #include <linux/filter.h> /* struct bpf_prog, BPF_PROG_RUN() */
 #include <linux/kernel.h> /* ARRAY_SIZE */
@@ -79,6 +79,12 @@ static inline const struct bpf_func_proto *bpf_landlock_func_proto(
 	case BPF_FUNC_map_lookup_elem:
 		return &bpf_map_lookup_elem_proto;
 
+	/* event_fs */
+	case BPF_FUNC_handle_fs_get_mode:
+		if (event_fs)
+			return &bpf_handle_fs_get_mode_proto;
+		return NULL;
+
 	/* ability_write */
 	case BPF_FUNC_map_delete_elem:
 		if (ability_write)
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index c9c909a84f0b..ffceb42ccc4e 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -467,6 +467,13 @@ union bpf_attr {
  *     Return:
  *       > 0 length of the string including the trailing NUL on success
  *       < 0 error
+ *
+ * s64 bpf_handle_fs_get_mode(handle_fs)
+ *     Get the mode of a struct bpf_handle_fs
+ *     fs: struct bpf_handle_fs address
+ *     Return:
+ *       >= 0 file mode
+ *       < 0 error
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -514,7 +521,8 @@ union bpf_attr {
 	FN(get_numa_node_id),		\
 	FN(skb_change_head),		\
 	FN(xdp_adjust_head),		\
-	FN(probe_read_str),
+	FN(probe_read_str),		\
+	FN(handle_fs_get_mode),
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
  * function eBPF program intends to call
-- 
2.11.0


* [PATCH v5 04/10] landlock: Add LSM hooks related to filesystem
  2017-02-22  1:26 [PATCH v5 00/10] Landlock LSM: Toward unprivileged sandboxing Mickaël Salaün
                   ` (2 preceding siblings ...)
  2017-02-22  1:26 ` [PATCH v5 03/10] bpf: Define handle_fs and add a new helper bpf_handle_fs_get_mode() Mickaël Salaün
@ 2017-02-22  1:26 ` Mickaël Salaün
  2017-02-22  1:26 ` [PATCH v5 05/10] seccomp: Split put_seccomp_filter() with put_seccomp() Mickaël Salaün
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 26+ messages in thread
From: Mickaël Salaün @ 2017-02-22  1:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mickaël Salaün, Alexei Starovoitov, Andy Lutomirski,
	Arnaldo Carvalho de Melo, Casey Schaufler, Daniel Borkmann,
	David Drysdale, David S . Miller, Eric W . Biederman,
	James Morris, Jann Horn, Jonathan Corbet, Matthew Garrett,
	Michael Kerrisk, Kees Cook, Paul Moore, Sargun Dhillon,
	Serge E . Hallyn, Shuah Khan, Tejun Heo, Thomas Graf,
	Will Drewry, kernel-hardening, linux-api, linux-security-module,
	netdev

Handle 33 filesystem-related LSM hooks for the Landlock filesystem
event: LANDLOCK_SUBTYPE_EVENT_FS.

A Landlock event wraps LSM hooks for similar kernel object types (e.g.
struct file, struct path...). Multiple LSM hooks can trigger the same
Landlock event.

Landlock handles nine coarse-grained actions: read, write, execute, new,
get, remove, ioctl, lock and fcntl. Each of them abstracts LSM hook
access control in a way that can be extended in the future.
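
To make the action abstraction concrete, here is a small userspace
sketch (not part of this patch) of how a kernel permission mask could be
folded into an action bitmask, in the spirit of fs_may_to_access(). The
MODEL_ACTION_FS_* bit positions are hypothetical; the real
LANDLOCK_ACTION_FS_* values are defined elsewhere in the series. The
MODEL_MAY_* values are assumed stand-ins for the kernel's MAY_* bits.

```c
#include <stdint.h>

/* Assumed stand-ins for the kernel's MAY_* permission bits. */
#define MODEL_MAY_EXEC		0x01
#define MODEL_MAY_WRITE		0x02
#define MODEL_MAY_READ		0x04
#define MODEL_MAY_APPEND	0x08
#define MODEL_MAY_OPEN		0x20

/* Hypothetical bit assignments for the nine coarse-grained actions. */
#define MODEL_ACTION_FS_EXEC	(1ULL << 0)
#define MODEL_ACTION_FS_WRITE	(1ULL << 1)
#define MODEL_ACTION_FS_READ	(1ULL << 2)
#define MODEL_ACTION_FS_NEW	(1ULL << 3)
#define MODEL_ACTION_FS_GET	(1ULL << 4)
#define MODEL_ACTION_FS_REMOVE	(1ULL << 5)
#define MODEL_ACTION_FS_IOCTL	(1ULL << 6)
#define MODEL_ACTION_FS_LOCK	(1ULL << 7)
#define MODEL_ACTION_FS_FCNTL	(1ULL << 8)

/* Translate a permission mask into the coarse-grained action bitmask. */
static uint64_t model_may_to_action(int may)
{
	uint64_t ret = 0;

	if (may & MODEL_MAY_EXEC)
		ret |= MODEL_ACTION_FS_EXEC;
	if (may & MODEL_MAY_READ)
		ret |= MODEL_ACTION_FS_READ;
	/* Appending is treated as a kind of write. */
	if (may & (MODEL_MAY_WRITE | MODEL_MAY_APPEND))
		ret |= MODEL_ACTION_FS_WRITE;
	if (may & MODEL_MAY_OPEN)
		ret |= MODEL_ACTION_FS_GET;
	return ret;
}
```

Because the result is a bitmask, a rule can test for one action with a
single AND, and new actions can later take unused bits.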

The Landlock LSM hooks are registered after those of the other LSMs so
that actions from user space, via eBPF programs, only run if the access
was already granted by the major (privileged) LSMs.
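
The effect of this ordering can be sketched in userspace as follows (an
illustration only, not part of this patch). It assumes the usual stacked
hook behaviour where the first non-zero return value ends the walk. The
model_* names and the toy policies are hypothetical.

```c
#include <errno.h>
#include <stddef.h>

typedef int (*model_hook_fn)(int request);

/* Counts how often the user-supplied policy was actually consulted. */
static int model_landlock_calls;

/* Toy "major LSM": denies odd-numbered requests. */
static int model_major_lsm(int request)
{
	return (request & 1) ? -EACCES : 0;
}

/* Toy Landlock hook: denies requests above 100. */
static int model_landlock(int request)
{
	model_landlock_calls++;
	return request > 100 ? -EPERM : 0;
}

/* Walk the hooks in registration order; the first denial wins. */
static int model_call_hooks(const model_hook_fn *hooks, size_t n,
		int request)
{
	size_t i;

	for (i = 0; i < n; i++) {
		int ret = hooks[i](request);

		if (ret)
			return ret;
	}
	return 0;
}
```

With the Landlock hook placed last in the array, a request denied by the
earlier hook never reaches the unprivileged policy at all.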

Changes since v4:
* add LSM hook abstraction called Landlock event
  * use the compiler type checking to verify the hooks used by an event
  * handle all filesystem related LSM hooks (e.g. file_permission,
    mmap_file, sb_mount...)
* register BPF programs for Landlock just after LSM hooks registration
* move hooks registration after other LSMs
* add failsafes to check if a hook is not used by the kernel
* allow partial raw value access from the context (needed for programs
  generated by LLVM)
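
The partial-access rule in the last item can be sketched in userspace
like this (an illustration, not the patch's __is_valid_access()). The
struct mirrors the layout of struct landlock_context; the model_* names
are hypothetical, and arg1 is assumed to carry a pointer while arg2
carries a raw value, as for the FS event.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same field layout as struct landlock_context. */
struct model_context {
	uint64_t status;
	uint32_t arch;
	uint32_t syscall_nr;
	uint32_t syscall_cmd;
	uint32_t event;
	uint64_t arg1;	/* assumed: pointer to a handle */
	uint64_t arg2;	/* assumed: raw value */
};

/* Decide whether a read of @size bytes at @off is acceptable. */
static bool model_is_valid_read(size_t off, size_t size)
{
	size_t max_size;
	bool is_pointer = false;

	if (size == 0 || size > sizeof(uint64_t))
		return false;

	/* Only reads starting exactly at a field are accepted. */
	switch (off) {
	case offsetof(struct model_context, arch):
	case offsetof(struct model_context, syscall_nr):
	case offsetof(struct model_context, syscall_cmd):
	case offsetof(struct model_context, event):
		max_size = sizeof(uint32_t);
		break;
	case offsetof(struct model_context, status):
	case offsetof(struct model_context, arg2):
		max_size = sizeof(uint64_t);
		break;
	case offsetof(struct model_context, arg1):
		max_size = sizeof(uint64_t);
		is_pointer = true;
		break;
	default:
		return false;
	}

	/* Raw values may be read partially; pointers must be read whole. */
	return is_pointer ? size == max_size : size <= max_size;
}
```

A 4-byte load of arg2 is therefore accepted, which is the kind of narrow
load a compiler may emit, while a 4-byte load of the pointer in arg1 is
rejected.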

Changes since v3:
* split commit
* add hooks dealing with struct inode and struct path pointers:
  inode_permission and inode_getattr
* add abstraction over eBPF helper arguments thanks to wrapping structs

Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: James Morris <james.l.morris@oracle.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Serge E. Hallyn <serge@hallyn.com>
---
 include/linux/lsm_hooks.h  |   5 +
 security/landlock/Makefile |   2 +
 security/landlock/hooks.c  | 794 ++++++++++++++++++++++++++++++++++++++++++++-
 security/security.c        |   7 +-
 4 files changed, 806 insertions(+), 2 deletions(-)

diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index 558adfa5c8a8..069af34301d4 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -1933,5 +1933,10 @@ void __init loadpin_add_hooks(void);
 #else
 static inline void loadpin_add_hooks(void) { };
 #endif
+#ifdef CONFIG_SECURITY_LANDLOCK
+extern void __init landlock_add_hooks(void);
+#else
+static inline void __init landlock_add_hooks(void) { }
+#endif /* CONFIG_SECURITY_LANDLOCK */
 
 #endif /* ! __LINUX_LSM_HOOKS_H */
diff --git a/security/landlock/Makefile b/security/landlock/Makefile
index b91af42f0c32..8dc8bde660bd 100644
--- a/security/landlock/Makefile
+++ b/security/landlock/Makefile
@@ -1,3 +1,5 @@
+ccflags-$(CONFIG_SECURITY_LANDLOCK) += -Werror=unused-function
+
 obj-$(CONFIG_SECURITY_LANDLOCK) := landlock.o
 
 landlock-y := hooks.o
diff --git a/security/landlock/hooks.c b/security/landlock/hooks.c
index 6c1ad0e03cfc..88ebe3f01758 100644
--- a/security/landlock/hooks.c
+++ b/security/landlock/hooks.c
@@ -21,14 +21,797 @@
 #include <linux/stddef.h> /* offsetof */
 #include <linux/types.h> /* uintptr_t */
 
+/* permissions translation */
+#include <linux/fs.h> /* MAY_* */
+#include <linux/mman.h> /* PROT_* */
+
+/* hook arguments */
+#include <linux/cred.h>
+#include <linux/dcache.h> /* struct dentry */
+#include <linux/fs.h> /* struct inode, struct iattr */
+#include <linux/mm_types.h> /* struct vm_area_struct */
+#include <linux/mount.h> /* struct vfsmount */
+#include <linux/path.h> /* struct path */
+#include <linux/sched.h> /* struct task_struct */
+#include <linux/time.h> /* struct timespec */
+
+
+#include "common.h" /* get_index() */
+
 #define CTX_ARG_NB 2
 
+/* separators */
+#define SEP_COMMA() ,
+#define SEP_SPACE()
+#define SEP_AND() &&
+
+#define MAP2x1(s, m, x1, x2, ...) m(x1, x2)
+#define MAP2x2(s, m, x1, x2, ...) m(x1, x2) s() MAP2x1(s, m, __VA_ARGS__)
+#define MAP2x3(s, m, x1, x2, ...) m(x1, x2) s() MAP2x2(s, m, __VA_ARGS__)
+#define MAP2x4(s, m, x1, x2, ...) m(x1, x2) s() MAP2x3(s, m, __VA_ARGS__)
+#define MAP2x5(s, m, x1, x2, ...) m(x1, x2) s() MAP2x4(s, m, __VA_ARGS__)
+#define MAP2x6(s, m, x1, x2, ...) m(x1, x2) s() MAP2x5(s, m, __VA_ARGS__)
+#define MAP2x(n, ...) MAP2x##n(__VA_ARGS__)
+
+#define MAP1x1(s, m, x1, ...) m(x1)
+#define MAP1x2(s, m, x1, ...) m(x1) s() MAP1x1(s, m, __VA_ARGS__)
+#define MAP1x(n, ...) MAP1x##n(__VA_ARGS__)
+
+#define SKIP2x1(x1, x2, ...) __VA_ARGS__
+#define SKIP2x2(x1, x2, ...) SKIP2x1(__VA_ARGS__)
+#define SKIP2x3(x1, x2, ...) SKIP2x2(__VA_ARGS__)
+#define SKIP2x4(x1, x2, ...) SKIP2x3(__VA_ARGS__)
+#define SKIP2x5(x1, x2, ...) SKIP2x4(__VA_ARGS__)
+#define SKIP2x6(x1, x2, ...) SKIP2x5(__VA_ARGS__)
+#define SKIP2x(n, ...) SKIP2x##n(__VA_ARGS__)
+
+/* LSM hook argument helpers */
+#define MAP_HOOK_COMMA(n, ...) MAP2x(n, SEP_COMMA, __VA_ARGS__)
+
+#define GET_HOOK_TA(t, a) t a
+
+/* Landlock event argument helpers  */
+#define MAP_EVENT_COMMA(h, n, m, ...) MAP2x(n, SEP_COMMA, m, SKIP2x(h, __VA_ARGS__))
+#define MAP_EVENT_SPACE(h, n, m, ...) MAP2x(n, SEP_SPACE, m, SKIP2x(h, __VA_ARGS__))
+#define MAP_EVENT_AND(h, n, m, ...) MAP2x(n, SEP_AND, m, SKIP2x(h, __VA_ARGS__))
+
+#define GET_CMD(h, n, ...) SKIP2x(n, SKIP2x(h, __VA_ARGS__))
+
+#define EXPAND_TYPE(d) d##_TYPE
+#define EXPAND_BPF(d) d##_BPF
+#define EXPAND_C(d) d##_C
+
+#define GET_TYPE_BPF(t) EXPAND_BPF(t)
+#define GET_TYPE_C(t) EXPAND_C(t) *
+
+#define GET_EVENT_C(d, a) GET_TYPE_C(EXPAND_TYPE(d))
+#define GET_EVENT_U64(d, a) ((u64)(d##_VAL(a)))
+#define GET_EVENT_DEC(d, a) d##_DEC(a)
+#define GET_EVENT_OK(d, a) d##_OK(a)
+
+
+/**
+ * HOOK_ACCESS
+ *
+ * @EVENT: Landlock event name
+ * @NA: number of event arguments
+ *
+ * The __consistent_##EVENT() extern functions and __wrapcheck_* types are
+ * useful to catch inconsistencies in LSM hook definitions thanks to the
+ * compiler type checking.
+ */
+#define HOOK_ACCESS(EVENT, NA, ...) \
+	static inline bool __is_valid_access_event_##EVENT(		\
+			int off, int size, enum bpf_access_type type,	\
+			enum bpf_reg_type *reg_type,			\
+			union bpf_prog_subtype *prog_subtype)		\
+	{								\
+		enum bpf_reg_type _ctx_types[CTX_ARG_NB] = {		\
+			MAP1x(NA, SEP_COMMA, GET_TYPE_BPF, __VA_ARGS__)	\
+		};							\
+		return __is_valid_access(off, size, type, reg_type,	\
+				_ctx_types, prog_subtype);		\
+	}								\
+	extern void __consistent_##EVENT(				\
+			MAP1x(NA, SEP_COMMA, GET_TYPE_C, __VA_ARGS__));
+
+/**
+ * HOOK_NEW
+ *
+ * @INST: event instance for this hook
+ * @EVENT: Landlock event name
+ * @NE: number of event arguments
+ * @HOOK: LSM hook name
+ * @NH: number of hook arguments
+ */
+#define HOOK_NEW(INST, EVENT, NE, HOOK, NH, ...)			\
+	static int landlock_hook_##EVENT##_##HOOK##_##INST(		\
+			MAP_HOOK_COMMA(NH, GET_HOOK_TA, __VA_ARGS__))	\
+	{								\
+		if (!landlock_used())					\
+			return 0;					\
+		if (!(MAP_EVENT_AND(NH, NE, GET_EVENT_OK,		\
+						__VA_ARGS__)))		\
+			return 0;					\
+		{							\
+		MAP_EVENT_SPACE(NH, NE, GET_EVENT_DEC, __VA_ARGS__)	\
+		__u64 _ctx_values[CTX_ARG_NB] = {			\
+			MAP_EVENT_COMMA(NH, NE, GET_EVENT_U64,		\
+					__VA_ARGS__)			\
+		};							\
+		u32 _cmd = GET_CMD(NH, NE, __VA_ARGS__);		\
+		return landlock_decide(LANDLOCK_SUBTYPE_EVENT_##EVENT,	\
+				_ctx_values, _cmd, #HOOK);		\
+		}							\
+	}								\
+	extern void __consistent_##EVENT(MAP_EVENT_COMMA(		\
+				NH, NE, GET_EVENT_C, __VA_ARGS__));
+
+#define HOOK_NEW_FS(...) HOOK_NEW(1, FS, 2, __VA_ARGS__, 0)
+#define HOOK_NEW_FS2(...) HOOK_NEW(2, FS, 2, __VA_ARGS__, 0)
+#define HOOK_NEW_FS3(...) HOOK_NEW(3, FS, 2, __VA_ARGS__, 0)
+#define HOOK_NEW_FS4(...) HOOK_NEW(4, FS, 2, __VA_ARGS__, 0)
+#define HOOK_NEW_FS_CMD(...) HOOK_NEW(1, FS, 2, __VA_ARGS__)
+#define HOOK_INIT_FS(HOOK) LSM_HOOK_INIT(HOOK, landlock_hook_FS_##HOOK##_1)
+#define HOOK_INIT_FS2(HOOK) LSM_HOOK_INIT(HOOK, landlock_hook_FS_##HOOK##_2)
+#define HOOK_INIT_FS3(HOOK) LSM_HOOK_INIT(HOOK, landlock_hook_FS_##HOOK##_3)
+#define HOOK_INIT_FS4(HOOK) LSM_HOOK_INIT(HOOK, landlock_hook_FS_##HOOK##_4)
+
+/*
+ * The WRAP_TYPE_* definitions group the bpf_reg_type enum value and the C
+ * type. This C type may remain unused except to catch inconsistencies in LSM
+ * hook definitions thanks to the compiler type checking.
+ */
+
+/* WRAP_TYPE_NONE */
+#define WRAP_TYPE_NONE_BPF	NOT_INIT
+#define WRAP_TYPE_NONE_C	struct __wrapcheck_none
+WRAP_TYPE_NONE_C;
+
+/* WRAP_TYPE_RAW */
+#define WRAP_TYPE_RAW_BPF	UNKNOWN_VALUE
+#define WRAP_TYPE_RAW_C		struct __wrapcheck_raw
+WRAP_TYPE_RAW_C;
+
+/* WRAP_TYPE_FS */
+#define WRAP_TYPE_FS_BPF	CONST_PTR_TO_HANDLE_FS
+#define WRAP_TYPE_FS_C		const struct bpf_handle_fs
+
+/*
+ * The WRAP_ARG_* definitions group the LSM hook argument type (C and BPF), the
+ * wrapping struct declaration (if any) and the value to copy to the BPF
+ * context. These definitions may be used thanks to the EXPAND_* helpers.
+ *
+ * *_OK: Can we handle the argument?
+ */
+
+/* WRAP_ARG_NONE */
+#define WRAP_ARG_NONE_TYPE	WRAP_TYPE_NONE
+#define WRAP_ARG_NONE_DEC(arg)
+#define WRAP_ARG_NONE_VAL(arg)	0
+#define WRAP_ARG_NONE_OK(arg)	(!WARN_ON(true))
+
+/* WRAP_ARG_RAW */
+#define WRAP_ARG_RAW_TYPE	WRAP_TYPE_RAW
+#define WRAP_ARG_RAW_DEC(arg)
+#define WRAP_ARG_RAW_VAL(arg)	arg
+#define WRAP_ARG_RAW_OK(arg)	(true)
+
+/* WRAP_ARG_FILE */
+#define WRAP_ARG_FILE_TYPE	WRAP_TYPE_FS
+#define WRAP_ARG_FILE_DEC(arg)					\
+	EXPAND_C(WRAP_TYPE_FS) wrap_##arg =			\
+	{ .type = BPF_HANDLE_FS_TYPE_FILE, .file = arg };
+#define WRAP_ARG_FILE_VAL(arg)	(uintptr_t)&wrap_##arg
+#define WRAP_ARG_FILE_OK(arg)	(arg)
+
+/* WRAP_ARG_VMAF */
+#define WRAP_ARG_VMAF_TYPE	WRAP_TYPE_FS
+#define WRAP_ARG_VMAF_DEC(arg)					\
+	EXPAND_C(WRAP_TYPE_FS) wrap_##arg =			\
+	{ .type = BPF_HANDLE_FS_TYPE_FILE, .file = arg->vm_file };
+#define WRAP_ARG_VMAF_VAL(arg)	(uintptr_t)&wrap_##arg
+#define WRAP_ARG_VMAF_OK(arg)	(arg && arg->vm_file)
+
+/* WRAP_ARG_INODE */
+#define WRAP_ARG_INODE_TYPE	WRAP_TYPE_FS
+#define WRAP_ARG_INODE_DEC(arg)					\
+	EXPAND_C(WRAP_TYPE_FS) wrap_##arg =			\
+	{ .type = BPF_HANDLE_FS_TYPE_INODE, .inode = arg };
+#define WRAP_ARG_INODE_VAL(arg)	(uintptr_t)&wrap_##arg
+#define WRAP_ARG_INODE_OK(arg)	(arg)
+
+/* WRAP_ARG_PATH */
+#define WRAP_ARG_PATH_TYPE	WRAP_TYPE_FS
+#define WRAP_ARG_PATH_DEC(arg)					\
+	EXPAND_C(WRAP_TYPE_FS) wrap_##arg =			\
+	{ .type = BPF_HANDLE_FS_TYPE_PATH, .path = arg };
+#define WRAP_ARG_PATH_VAL(arg)	(uintptr_t)&wrap_##arg
+#define WRAP_ARG_PATH_OK(arg)	(arg)
+
+/* WRAP_ARG_DENTRY */
+#define WRAP_ARG_DENTRY_TYPE	WRAP_TYPE_FS
+#define WRAP_ARG_DENTRY_DEC(arg)				\
+	EXPAND_C(WRAP_TYPE_FS) wrap_##arg =			\
+	{ .type = BPF_HANDLE_FS_TYPE_DENTRY, .dentry = arg };
+#define WRAP_ARG_DENTRY_VAL(arg)	(uintptr_t)&wrap_##arg
+#define WRAP_ARG_DENTRY_OK(arg)	(arg)
+
+/* WRAP_ARG_SB */
+#define WRAP_ARG_SB_TYPE	WRAP_TYPE_FS
+#define WRAP_ARG_SB_DEC(arg)					\
+	EXPAND_C(WRAP_TYPE_FS) wrap_##arg =			\
+	{ .type = BPF_HANDLE_FS_TYPE_DENTRY, .dentry = arg->s_root };
+#define WRAP_ARG_SB_VAL(arg)	(uintptr_t)&wrap_##arg
+#define WRAP_ARG_SB_OK(arg)	(arg && arg->s_root)
+
+/* WRAP_ARG_MNTROOT */
+#define WRAP_ARG_MNTROOT_TYPE	WRAP_TYPE_FS
+#define WRAP_ARG_MNTROOT_DEC(arg)				\
+	EXPAND_C(WRAP_TYPE_FS) wrap_##arg =			\
+	{ .type = BPF_HANDLE_FS_TYPE_DENTRY, .dentry = arg->mnt_root };
+#define WRAP_ARG_MNTROOT_VAL(arg)	(uintptr_t)&wrap_##arg
+#define WRAP_ARG_MNTROOT_OK(arg)	(arg && arg->mnt_root)
+
+
+static inline u64 fs_may_to_access(int fs_may)
+{
+	u64 ret = 0;
+
+	if (fs_may & MAY_EXEC)
+		ret |= LANDLOCK_ACTION_FS_EXEC;
+	if (fs_may & MAY_READ)
+		ret |= LANDLOCK_ACTION_FS_READ;
+	if (fs_may & MAY_WRITE)
+		ret |= LANDLOCK_ACTION_FS_WRITE;
+	if (fs_may & MAY_APPEND)
+		ret |= LANDLOCK_ACTION_FS_WRITE;
+	if (fs_may & MAY_OPEN)
+		ret |= LANDLOCK_ACTION_FS_GET;
+	/* ignore MAY_CHDIR and MAY_ACCESS */
+
+	return ret;
+}
+
+static u64 mem_prot_to_access(unsigned long prot, bool private)
+{
+	u64 ret = 0;
+
+	/* private mappings do not write to files */
+	if (!private && (prot & PROT_WRITE))
+		ret |= LANDLOCK_ACTION_FS_WRITE;
+	if (prot & PROT_READ)
+		ret |= LANDLOCK_ACTION_FS_READ;
+	if (prot & PROT_EXEC)
+		ret |= LANDLOCK_ACTION_FS_EXEC;
+
+	return ret;
+}
+
+static inline bool landlock_used(void)
+{
+	return false;
+}
+
+static int landlock_decide(enum landlock_subtype_event event,
+		__u64 ctx_values[CTX_ARG_NB], u32 cmd, const char *hook)
+{
+	int ret = 0;
+	u32 event_idx = get_index(event);
+
+	struct landlock_context ctx = {
+		.status = 0,
+		.arch = syscall_get_arch(),
+		.syscall_nr = syscall_get_nr(current, task_pt_regs(current)),
+		.syscall_cmd = cmd,
+		.event = event,
+		.arg1 = ctx_values[0],
+		.arg2 = ctx_values[1],
+	};
+
+	/* insert manager call here */
+
+	return ret;
+}
+
+static bool __is_valid_access(int off, int size, enum bpf_access_type type,
+		enum bpf_reg_type *reg_type,
+		enum bpf_reg_type ctx_types[CTX_ARG_NB],
+		union bpf_prog_subtype *prog_subtype)
+{
+	int max_size;
+
+	if (type != BPF_READ)
+		return false;
+	if (off < 0 || off >= sizeof(struct landlock_context))
+		return false;
+	if (size <= 0 || size > sizeof(__u64))
+		return false;
+
+	/* set max size */
+	switch (off) {
+	case offsetof(struct landlock_context, arch):
+	case offsetof(struct landlock_context, syscall_nr):
+	case offsetof(struct landlock_context, syscall_cmd):
+	case offsetof(struct landlock_context, event):
+		max_size = sizeof(__u32);
+		break;
+	case offsetof(struct landlock_context, status):
+	case offsetof(struct landlock_context, arg1):
+	case offsetof(struct landlock_context, arg2):
+		max_size = sizeof(__u64);
+		break;
+	default:
+		return false;
+	}
+
+	/* set register type */
+	switch (off) {
+	case offsetof(struct landlock_context, arg1):
+		*reg_type = ctx_types[0];
+		break;
+	case offsetof(struct landlock_context, arg2):
+		*reg_type = ctx_types[1];
+		break;
+	default:
+		*reg_type = UNKNOWN_VALUE;
+	}
+
+	/* check memory range access */
+	switch (*reg_type) {
+	case NOT_INIT:
+		return false;
+	case UNKNOWN_VALUE:
+	case CONST_IMM:
+		/* allow partial raw value */
+		if (size > max_size)
+			return false;
+		break;
+	default:
+		/* deny partial pointer */
+		if (size != max_size)
+			return false;
+	}
+
+	return true;
+}
+
+
+/* hook definitions */
+
+HOOK_ACCESS(FS, 2, WRAP_TYPE_FS, WRAP_TYPE_RAW);
+
+/* binder_* hooks */
+
+HOOK_NEW_FS(binder_transfer_file, 3,
+	struct task_struct *, from,
+	struct task_struct *, to,
+	struct file *, file,
+	WRAP_ARG_FILE, file,
+	WRAP_ARG_RAW, LANDLOCK_ACTION_FS_READ
+);
+
+/* sb_* hooks */
+
+HOOK_NEW_FS(sb_statfs, 1,
+	struct dentry *, dentry,
+	WRAP_ARG_DENTRY, dentry,
+	WRAP_ARG_RAW, LANDLOCK_ACTION_FS_READ
+);
+
+/*
+ * Being able to mount on a path means being able to override the underlying
+ * filesystem view of this path, hence the need for a write access right.
+ */
+HOOK_NEW_FS(sb_mount, 5,
+	const char *, dev_name,
+	const struct path *, path,
+	const char *, type,
+	unsigned long, flags,
+	void *, data,
+	WRAP_ARG_PATH, path,
+	WRAP_ARG_RAW, LANDLOCK_ACTION_FS_WRITE
+);
+
+HOOK_NEW_FS(sb_remount, 2,
+	struct super_block *, sb,
+	void *, data,
+	WRAP_ARG_SB, sb,
+	WRAP_ARG_RAW, LANDLOCK_ACTION_FS_WRITE
+);
+
+HOOK_NEW_FS(sb_umount, 2,
+	struct vfsmount *, mnt,
+	int, flags,
+	WRAP_ARG_MNTROOT, mnt,
+	WRAP_ARG_RAW, LANDLOCK_ACTION_FS_WRITE
+);
+
+/*
+ * The old_path is similar to a destination mount point.
+ */
+HOOK_NEW_FS(sb_pivotroot, 2,
+	const struct path *, old_path,
+	const struct path *, new_path,
+	WRAP_ARG_PATH, old_path,
+	WRAP_ARG_RAW, LANDLOCK_ACTION_FS_WRITE
+);
+
+/* inode_* hooks */
+
+/* a directory inode contains only one dentry */
+HOOK_NEW_FS(inode_create, 3,
+	struct inode *, dir,
+	struct dentry *, dentry,
+	umode_t, mode,
+	WRAP_ARG_INODE, dir,
+	WRAP_ARG_RAW, LANDLOCK_ACTION_FS_WRITE
+);
+
+HOOK_NEW_FS2(inode_create, 3,
+	struct inode *, dir,
+	struct dentry *, dentry,
+	umode_t, mode,
+	WRAP_ARG_DENTRY, dentry,
+	WRAP_ARG_RAW, LANDLOCK_ACTION_FS_NEW
+);
+
+HOOK_NEW_FS(inode_link, 3,
+	struct dentry *, old_dentry,
+	struct inode *, dir,
+	struct dentry *, new_dentry,
+	WRAP_ARG_DENTRY, old_dentry,
+	WRAP_ARG_RAW, LANDLOCK_ACTION_FS_READ
+);
+
+HOOK_NEW_FS2(inode_link, 3,
+	struct dentry *, old_dentry,
+	struct inode *, dir,
+	struct dentry *, new_dentry,
+	WRAP_ARG_INODE, dir,
+	WRAP_ARG_RAW, LANDLOCK_ACTION_FS_WRITE
+);
+
+HOOK_NEW_FS3(inode_link, 3,
+	struct dentry *, old_dentry,
+	struct inode *, dir,
+	struct dentry *, new_dentry,
+	WRAP_ARG_DENTRY, new_dentry,
+	WRAP_ARG_RAW, LANDLOCK_ACTION_FS_NEW
+);
+
+HOOK_NEW_FS(inode_unlink, 2,
+	struct inode *, dir,
+	struct dentry *, dentry,
+	WRAP_ARG_INODE, dir,
+	WRAP_ARG_RAW, LANDLOCK_ACTION_FS_WRITE
+);
+
+HOOK_NEW_FS2(inode_unlink, 2,
+	struct inode *, dir,
+	struct dentry *, dentry,
+	WRAP_ARG_DENTRY, dentry,
+	WRAP_ARG_RAW, LANDLOCK_ACTION_FS_REMOVE
+);
+
+HOOK_NEW_FS(inode_symlink, 3,
+	struct inode *, dir,
+	struct dentry *, dentry,
+	const char *, old_name,
+	WRAP_ARG_INODE, dir,
+	WRAP_ARG_RAW, LANDLOCK_ACTION_FS_WRITE
+);
+
+HOOK_NEW_FS2(inode_symlink, 3,
+	struct inode *, dir,
+	struct dentry *, dentry,
+	const char *, old_name,
+	WRAP_ARG_DENTRY, dentry,
+	WRAP_ARG_RAW, LANDLOCK_ACTION_FS_NEW
+);
+
+HOOK_NEW_FS(inode_mkdir, 3,
+	struct inode *, dir,
+	struct dentry *, dentry,
+	umode_t, mode,
+	WRAP_ARG_INODE, dir,
+	WRAP_ARG_RAW, LANDLOCK_ACTION_FS_WRITE
+);
+
+HOOK_NEW_FS2(inode_mkdir, 3,
+	struct inode *, dir,
+	struct dentry *, dentry,
+	umode_t, mode,
+	WRAP_ARG_DENTRY, dentry,
+	WRAP_ARG_RAW, LANDLOCK_ACTION_FS_NEW
+);
+
+HOOK_NEW_FS(inode_rmdir, 2,
+	struct inode *, dir,
+	struct dentry *, dentry,
+	WRAP_ARG_INODE, dir,
+	WRAP_ARG_RAW, LANDLOCK_ACTION_FS_WRITE
+);
+
+HOOK_NEW_FS2(inode_rmdir, 2,
+	struct inode *, dir,
+	struct dentry *, dentry,
+	WRAP_ARG_DENTRY, dentry,
+	WRAP_ARG_RAW, LANDLOCK_ACTION_FS_REMOVE
+);
+
+HOOK_NEW_FS(inode_mknod, 4,
+	struct inode *, dir,
+	struct dentry *, dentry,
+	umode_t, mode,
+	dev_t, dev,
+	WRAP_ARG_INODE, dir,
+	WRAP_ARG_RAW, LANDLOCK_ACTION_FS_WRITE
+);
+
+HOOK_NEW_FS2(inode_mknod, 4,
+	struct inode *, dir,
+	struct dentry *, dentry,
+	umode_t, mode,
+	dev_t, dev,
+	WRAP_ARG_DENTRY, dentry,
+	WRAP_ARG_RAW, LANDLOCK_ACTION_FS_NEW
+);
+
+HOOK_NEW_FS(inode_rename, 4,
+	struct inode *, old_dir,
+	struct dentry *, old_dentry,
+	struct inode *, new_dir,
+	struct dentry *, new_dentry,
+	WRAP_ARG_INODE, old_dir,
+	WRAP_ARG_RAW, LANDLOCK_ACTION_FS_WRITE
+);
+
+HOOK_NEW_FS2(inode_rename, 4,
+	struct inode *, old_dir,
+	struct dentry *, old_dentry,
+	struct inode *, new_dir,
+	struct dentry *, new_dentry,
+	WRAP_ARG_DENTRY, old_dentry,
+	WRAP_ARG_RAW, LANDLOCK_ACTION_FS_REMOVE
+);
+
+HOOK_NEW_FS3(inode_rename, 4,
+	struct inode *, old_dir,
+	struct dentry *, old_dentry,
+	struct inode *, new_dir,
+	struct dentry *, new_dentry,
+	WRAP_ARG_INODE, new_dir,
+	WRAP_ARG_RAW, LANDLOCK_ACTION_FS_WRITE
+);
+
+HOOK_NEW_FS4(inode_rename, 4,
+	struct inode *, old_dir,
+	struct dentry *, old_dentry,
+	struct inode *, new_dir,
+	struct dentry *, new_dentry,
+	WRAP_ARG_DENTRY, new_dentry,
+	WRAP_ARG_RAW, LANDLOCK_ACTION_FS_NEW
+);
+
+HOOK_NEW_FS(inode_readlink, 1,
+	struct dentry *, dentry,
+	WRAP_ARG_DENTRY, dentry,
+	WRAP_ARG_RAW, LANDLOCK_ACTION_FS_READ
+);
+
+// XXX: handle inode?
+HOOK_NEW_FS(inode_follow_link, 3,
+	struct dentry *, dentry,
+	struct inode *, inode,
+	bool, rcu,
+	WRAP_ARG_DENTRY, dentry,
+	WRAP_ARG_RAW, LANDLOCK_ACTION_FS_READ
+);
+
+HOOK_NEW_FS(inode_permission, 2,
+	struct inode *, inode,
+	int, mask,
+	WRAP_ARG_INODE, inode,
+	WRAP_ARG_RAW, fs_may_to_access(mask)
+);
+
+HOOK_NEW_FS(inode_setattr, 2,
+	struct dentry *, dentry,
+	struct iattr *, attr,
+	WRAP_ARG_DENTRY, dentry,
+	WRAP_ARG_RAW, LANDLOCK_ACTION_FS_WRITE
+);
+
+HOOK_NEW_FS(inode_getattr, 1,
+	const struct path *, path,
+	WRAP_ARG_PATH, path,
+	WRAP_ARG_RAW, LANDLOCK_ACTION_FS_READ
+);
+
+HOOK_NEW_FS(inode_setxattr, 5,
+	struct dentry *, dentry,
+	const char *, name,
+	const void *, value,
+	size_t, size,
+	int, flags,
+	WRAP_ARG_DENTRY, dentry,
+	WRAP_ARG_RAW, LANDLOCK_ACTION_FS_WRITE
+);
+
+HOOK_NEW_FS(inode_getxattr, 2,
+	struct dentry *, dentry,
+	const char *, name,
+	WRAP_ARG_DENTRY, dentry,
+	WRAP_ARG_RAW, LANDLOCK_ACTION_FS_READ
+);
+
+HOOK_NEW_FS(inode_listxattr, 1,
+	struct dentry *, dentry,
+	WRAP_ARG_DENTRY, dentry,
+	WRAP_ARG_RAW, LANDLOCK_ACTION_FS_READ
+);
+
+HOOK_NEW_FS(inode_removexattr, 2,
+	struct dentry *, dentry,
+	const char *, name,
+	WRAP_ARG_DENTRY, dentry,
+	WRAP_ARG_RAW, LANDLOCK_ACTION_FS_WRITE
+);
+
+HOOK_NEW_FS(inode_getsecurity, 4,
+	struct inode *, inode,
+	const char *, name,
+	void **, buffer,
+	bool, alloc,
+	WRAP_ARG_INODE, inode,
+	WRAP_ARG_RAW, LANDLOCK_ACTION_FS_READ
+);
+
+HOOK_NEW_FS(inode_setsecurity, 5,
+	struct inode *, inode,
+	const char *, name,
+	const void *, value,
+	size_t, size,
+	int, flag,
+	WRAP_ARG_INODE, inode,
+	WRAP_ARG_RAW, LANDLOCK_ACTION_FS_WRITE
+);
+
+/* file_* hooks */
+
+HOOK_NEW_FS(file_permission, 2,
+	struct file *, file,
+	int, mask,
+	WRAP_ARG_FILE, file,
+	WRAP_ARG_RAW, fs_may_to_access(mask)
+);
+
+/*
+ * An ioctl command can be a read or a write. This can be checked with _IOC*()
+ * for some commands but a Landlock rule should check the ioctl command to
+ * whitelist them.
+ */
+HOOK_NEW_FS_CMD(file_ioctl, 3,
+	struct file *, file,
+	unsigned int, cmd,
+	unsigned long, arg,
+	WRAP_ARG_FILE, file,
+	WRAP_ARG_RAW, LANDLOCK_ACTION_FS_IOCTL,
+	cmd
+);
+
+HOOK_NEW_FS_CMD(file_lock, 2,
+	struct file *, file,
+	unsigned int, cmd,
+	WRAP_ARG_FILE, file,
+	WRAP_ARG_RAW, LANDLOCK_ACTION_FS_LOCK,
+	cmd
+);
+
+HOOK_NEW_FS_CMD(file_fcntl, 3,
+	struct file *, file,
+	unsigned int, cmd,
+	unsigned long, arg,
+	WRAP_ARG_FILE, file,
+	WRAP_ARG_RAW, LANDLOCK_ACTION_FS_FCNTL,
+	cmd
+);
+
+HOOK_NEW_FS(mmap_file, 4,
+	struct file *, file,
+	unsigned long, reqprot,
+	unsigned long, prot,
+	unsigned long, flags,
+	WRAP_ARG_FILE, file,
+	WRAP_ARG_RAW, mem_prot_to_access(prot, flags & MAP_PRIVATE)
+);
+
+HOOK_NEW_FS(file_mprotect, 3,
+	struct vm_area_struct *, vma,
+	unsigned long, reqprot,
+	unsigned long, prot,
+	WRAP_ARG_VMAF, vma,
+	WRAP_ARG_RAW, mem_prot_to_access(prot, !(vma->vm_flags & VM_SHARED))
+);
+
+HOOK_NEW_FS(file_receive, 1,
+	struct file *, file,
+	WRAP_ARG_FILE, file,
+	WRAP_ARG_RAW, LANDLOCK_ACTION_FS_GET
+);
+
+HOOK_NEW_FS(file_open, 2,
+	struct file *, file,
+	const struct cred *, cred,
+	WRAP_ARG_FILE, file,
+	WRAP_ARG_RAW, LANDLOCK_ACTION_FS_GET
+);
+
+static struct security_hook_list landlock_hooks[] = {
+	HOOK_INIT_FS(binder_transfer_file),
+
+	HOOK_INIT_FS(sb_statfs),
+	HOOK_INIT_FS(sb_mount),
+	HOOK_INIT_FS(sb_remount),
+	HOOK_INIT_FS(sb_umount),
+	HOOK_INIT_FS(sb_pivotroot),
+
+	HOOK_INIT_FS(inode_create),
+	HOOK_INIT_FS2(inode_create),
+	HOOK_INIT_FS(inode_link),
+	HOOK_INIT_FS2(inode_link),
+	HOOK_INIT_FS3(inode_link),
+	HOOK_INIT_FS(inode_unlink),
+	HOOK_INIT_FS2(inode_unlink),
+	HOOK_INIT_FS(inode_symlink),
+	HOOK_INIT_FS2(inode_symlink),
+	HOOK_INIT_FS(inode_mkdir),
+	HOOK_INIT_FS2(inode_mkdir),
+	HOOK_INIT_FS(inode_rmdir),
+	HOOK_INIT_FS2(inode_rmdir),
+	HOOK_INIT_FS(inode_mknod),
+	HOOK_INIT_FS2(inode_mknod),
+	HOOK_INIT_FS(inode_rename),
+	HOOK_INIT_FS2(inode_rename),
+	HOOK_INIT_FS3(inode_rename),
+	HOOK_INIT_FS4(inode_rename),
+	HOOK_INIT_FS(inode_readlink),
+	HOOK_INIT_FS(inode_follow_link),
+	HOOK_INIT_FS(inode_permission),
+	HOOK_INIT_FS(inode_setattr),
+	HOOK_INIT_FS(inode_getattr),
+	HOOK_INIT_FS(inode_setxattr),
+	HOOK_INIT_FS(inode_getxattr),
+	HOOK_INIT_FS(inode_listxattr),
+	HOOK_INIT_FS(inode_removexattr),
+	HOOK_INIT_FS(inode_getsecurity),
+	HOOK_INIT_FS(inode_setsecurity),
+
+	HOOK_INIT_FS(file_permission),
+	HOOK_INIT_FS(file_ioctl),
+	HOOK_INIT_FS(file_lock),
+	HOOK_INIT_FS(file_fcntl),
+	HOOK_INIT_FS(mmap_file),
+	HOOK_INIT_FS(file_mprotect),
+	HOOK_INIT_FS(file_receive),
+	HOOK_INIT_FS(file_open),
+};
 
 static inline bool bpf_landlock_is_valid_access(int off, int size,
 		enum bpf_access_type type, enum bpf_reg_type *reg_type,
 		union bpf_prog_subtype *prog_subtype)
 {
-	return false;
+	enum landlock_subtype_event event = prog_subtype->landlock_rule.event;
+
+	switch (event) {
+	case LANDLOCK_SUBTYPE_EVENT_FS:
+		return __is_valid_access_event_FS(off, size, type, reg_type,
+				prog_subtype);
+	case LANDLOCK_SUBTYPE_EVENT_UNSPEC:
+	default:
+		return false;
+	}
 }
 
 static inline bool bpf_landlock_is_valid_subtype(
@@ -128,3 +911,12 @@ static struct bpf_prog_type_list bpf_landlock_type __ro_after_init = {
 	.ops = &bpf_landlock_ops,
 	.type = BPF_PROG_TYPE_LANDLOCK,
 };
+
+void __init landlock_add_hooks(void)
+{
+	pr_info("landlock: Version %u, ready to sandbox with %s\n",
+			LANDLOCK_VERSION,
+			"seccomp");
+	security_add_hooks(landlock_hooks, ARRAY_SIZE(landlock_hooks));
+	bpf_register_prog_type(&bpf_landlock_type);
+}
diff --git a/security/security.c b/security/security.c
index f825304f04a7..74d2bf057f30 100644
--- a/security/security.c
+++ b/security/security.c
@@ -63,10 +63,15 @@ int __init security_init(void)
 	loadpin_add_hooks();
 
 	/*
-	 * Load all the remaining security modules.
+	 * Load all remaining privileged security modules.
 	 */
 	do_security_initcalls();
 
+	/*
+	 * Load potentially-unprivileged security modules at the end.
+	 */
+	landlock_add_hooks();
+
 	return 0;
 }
 
-- 
2.11.0


* [PATCH v5 05/10] seccomp: Split put_seccomp_filter() with put_seccomp()
  2017-02-22  1:26 [PATCH v5 00/10] Landlock LSM: Toward unprivileged sandboxing Mickaël Salaün
                   ` (3 preceding siblings ...)
  2017-02-22  1:26 ` [PATCH v5 04/10] landlock: Add LSM hooks related to filesystem Mickaël Salaün
@ 2017-02-22  1:26 ` Mickaël Salaün
  2017-02-22  1:26 ` [PATCH v5 06/10] seccomp,landlock: Handle Landlock events per process hierarchy Mickaël Salaün
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 26+ messages in thread
From: Mickaël Salaün @ 2017-02-22  1:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mickaël Salaün, Alexei Starovoitov, Andy Lutomirski,
	Arnaldo Carvalho de Melo, Casey Schaufler, Daniel Borkmann,
	David Drysdale, David S . Miller, Eric W . Biederman,
	James Morris, Jann Horn, Jonathan Corbet, Matthew Garrett,
	Michael Kerrisk, Kees Cook, Paul Moore, Sargun Dhillon,
	Serge E . Hallyn, Shuah Khan, Tejun Heo, Thomas Graf,
	Will Drewry, kernel-hardening, linux-api, linux-security-module,
	netdev

The semantics are unchanged. This will be useful for the Landlock
integration with seccomp (next commit).
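
As a rough illustration of the split (a userspace sketch, not kernel code),
the model below separates "drop a reference on a filter chain" from "drop
whatever a task holds". All model_* names are hypothetical, and a plain int
stands in for the atomic usage counter that the kernel decrements with
atomic_dec_and_test():

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins for struct seccomp_filter and a task. */
struct model_filter {
	int usage;			/* plain int; the kernel uses atomic_t */
	struct model_filter *prev;	/* parent filter, or NULL */
};

struct model_task {
	struct model_filter *filter;
};

static int model_freed;	/* counts freed filters, for illustration only */

/* Allocate a filter holding one reference, stacked on top of @prev. */
static struct model_filter *model_new_filter(struct model_filter *prev)
{
	struct model_filter *f = calloc(1, sizeof(*f));

	assert(f);
	f->usage = 1;
	f->prev = prev;
	return f;
}

/* Counterpart of the now-static put_seccomp_filter(): works on a chain. */
static void model_put_filter(struct model_filter *filter)
{
	struct model_filter *orig = filter;

	/* Clean up single-reference branches iteratively. */
	while (orig && --orig->usage == 0) {
		struct model_filter *freeme = orig;

		orig = orig->prev;
		free(freeme);
		model_freed++;
	}
}

/* Counterpart of put_seccomp(): the per-task entry point. */
static void model_put_task(struct model_task *tsk)
{
	model_put_filter(tsk->filter);
}
```

With this shape, a per-task entry point can later release other per-task
state next to the filter without touching the callers that only deal with a
filter chain.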

Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Will Drewry <wad@chromium.org>
---
 include/linux/seccomp.h |  4 ++--
 kernel/fork.c           |  2 +-
 kernel/seccomp.c        | 18 +++++++++++++-----
 3 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/include/linux/seccomp.h b/include/linux/seccomp.h
index ecc296c137cd..e25aee2cdfc0 100644
--- a/include/linux/seccomp.h
+++ b/include/linux/seccomp.h
@@ -77,10 +77,10 @@ static inline int seccomp_mode(struct seccomp *s)
 #endif /* CONFIG_SECCOMP */
 
 #ifdef CONFIG_SECCOMP_FILTER
-extern void put_seccomp_filter(struct task_struct *tsk);
+extern void put_seccomp(struct task_struct *tsk);
 extern void get_seccomp_filter(struct task_struct *tsk);
 #else  /* CONFIG_SECCOMP_FILTER */
-static inline void put_seccomp_filter(struct task_struct *tsk)
+static inline void put_seccomp(struct task_struct *tsk)
 {
 	return;
 }
diff --git a/kernel/fork.c b/kernel/fork.c
index 11c5c8ab827c..a4f0d0e8aeb2 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -352,7 +352,7 @@ void free_task(struct task_struct *tsk)
 #endif
 	rt_mutex_debug_task_free(tsk);
 	ftrace_graph_exit_task(tsk);
-	put_seccomp_filter(tsk);
+	put_seccomp(tsk);
 	arch_release_task_struct(tsk);
 	if (tsk->flags & PF_KTHREAD)
 		free_kthread_struct(tsk);
diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index f7ce79a46050..06f2f3ee454c 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -62,6 +62,8 @@ struct seccomp_filter {
 /* Limit any path through the tree to 256KB worth of instructions. */
 #define MAX_INSNS_PER_PATH ((1 << 18) / sizeof(struct sock_filter))
 
+static void put_seccomp_filter(struct seccomp_filter *filter);
+
 /*
  * Endianness is explicitly ignored and left for BPF program authors to manage
  * as per the specific architecture.
@@ -312,7 +314,7 @@ static inline void seccomp_sync_threads(void)
 		 * current's path will hold a reference.  (This also
 		 * allows a put before the assignment.)
 		 */
-		put_seccomp_filter(thread);
+		put_seccomp_filter(thread->seccomp.filter);
 		smp_store_release(&thread->seccomp.filter,
 				  caller->seccomp.filter);
 
@@ -474,10 +476,11 @@ static inline void seccomp_filter_free(struct seccomp_filter *filter)
 	}
 }
 
-/* put_seccomp_filter - decrements the ref count of tsk->seccomp.filter */
-void put_seccomp_filter(struct task_struct *tsk)
+/* put_seccomp_filter - decrements the ref count of a filter */
+static void put_seccomp_filter(struct seccomp_filter *filter)
 {
-	struct seccomp_filter *orig = tsk->seccomp.filter;
+	struct seccomp_filter *orig = filter;
+
 	/* Clean up single-reference branches iteratively. */
 	while (orig && atomic_dec_and_test(&orig->usage)) {
 		struct seccomp_filter *freeme = orig;
@@ -486,6 +489,11 @@ void put_seccomp_filter(struct task_struct *tsk)
 	}
 }
 
+void put_seccomp(struct task_struct *tsk)
+{
+	put_seccomp_filter(tsk->seccomp.filter);
+}
+
 /**
  * seccomp_send_sigsys - signals the task to allow in-process syscall emulation
  * @syscall: syscall number to send to userland
@@ -897,7 +905,7 @@ long seccomp_get_filter(struct task_struct *task, unsigned long filter_off,
 	if (copy_to_user(data, fprog->filter, bpf_classic_proglen(fprog)))
 		ret = -EFAULT;
 
-	put_seccomp_filter(task);
+	put_seccomp_filter(task->seccomp.filter);
 	return ret;
 
 out:
-- 
2.11.0


* [PATCH v5 06/10] seccomp,landlock: Handle Landlock events per process hierarchy
  2017-02-22  1:26 [PATCH v5 00/10] Landlock LSM: Toward unprivileged sandboxing Mickaël Salaün
                   ` (4 preceding siblings ...)
  2017-02-22  1:26 ` [PATCH v5 05/10] seccomp: Split put_seccomp_filter() with put_seccomp() Mickaël Salaün
@ 2017-02-22  1:26 ` Mickaël Salaün
  2017-02-28 20:01   ` Andy Lutomirski
  2017-03-02 10:22   ` [kernel-hardening] " Djalal Harouni
  2017-02-22  1:26 ` [PATCH v5 07/10] bpf: Add a Landlock sandbox example Mickaël Salaün
                   ` (3 subsequent siblings)
  9 siblings, 2 replies; 26+ messages in thread
From: Mickaël Salaün @ 2017-02-22  1:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mickaël Salaün, Alexei Starovoitov, Andy Lutomirski,
	Arnaldo Carvalho de Melo, Casey Schaufler, Daniel Borkmann,
	David Drysdale, David S . Miller, Eric W . Biederman,
	James Morris, Jann Horn, Jonathan Corbet, Matthew Garrett,
	Michael Kerrisk, Kees Cook, Paul Moore, Sargun Dhillon,
	Serge E . Hallyn, Shuah Khan, Tejun Heo, Thomas Graf,
	Will Drewry, kernel-hardening, linux-api, linux-security-module,
	netdev, Andrew Morton

The seccomp(2) syscall can be used to apply a Landlock rule to the
current process. As with a seccomp filter, the Landlock rule is enforced
for all its future children. An inherited rule tree can be updated
(append-only) by the owner of inherited Landlock nodes (e.g. a parent
process that creates a new rule). However, an intermediate task, which
did not create a rule, will not be able to update its children's rules.
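
From userspace, applying a rule could look roughly like the sketch below.
It assumes a file descriptor for an already loaded Landlock eBPF program and
uses the operation value defined later in this patch. landlock_add_rule() is
a hypothetical helper name, and a kernel without this series may reject the
operation or interpret the value differently:

```c
#include <assert.h>
#include <errno.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Value proposed by this patch; an assumption outside of this series. */
#ifndef SECCOMP_ADD_LANDLOCK_RULE
#define SECCOMP_ADD_LANDLOCK_RULE 2
#endif

/*
 * Hypothetical helper: attach the Landlock rule referenced by @bpf_fd to
 * the calling process. Returns 0 on success, -1 with errno set otherwise.
 */
static int landlock_add_rule(int bpf_fd)
{
	/* The patch returns EPERM unless no_new_privs is already set. */
	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0))
		return -1;

	/* flags must be 0; the last argument points to the file descriptor. */
	return syscall(__NR_seccomp, SECCOMP_ADD_LANDLOCK_RULE, 0, &bpf_fd) ?
		-1 : 0;
}
```

A process would call this before fork() or execve() so that its future
children inherit the rule. In this version of the series the caller also
needs CAP_SYS_ADMIN.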

Landlock rules can be tied to a Landlock event. When such an event is
triggered, a tree of rules can be evaluated. This kind of tree is
created with a first node.  This node references a list of rules and an
optional parent node. Each rule returns a 32-bit value which can
interrupt the evaluation with a non-zero value. If every rule returns
zero, the evaluation continues with the rule list of the parent node,
until the end of the tree.
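
The evaluation order described above can be modeled in userspace as follows.
This is only a sketch: the model_* types are hypothetical stand-ins for
struct landlock_node and struct landlock_rule, and a function pointer
replaces the eBPF program:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct landlock_rule. */
struct model_rule {
	uint32_t (*prog)(const void *ctx);	/* replaces the eBPF program */
	struct model_rule *prev;		/* older rule of the same node */
};

/* Hypothetical stand-in for struct landlock_node. */
struct model_node {
	struct model_rule *rule;	/* most recent rule of this node */
	struct model_node *prev;	/* parent node, or NULL */
};

/* Two trivial rules used to exercise the model. */
static uint32_t model_allow(const void *ctx) { (void)ctx; return 0; }
static uint32_t model_deny(const void *ctx) { (void)ctx; return 1; }

/*
 * Walk the rule list of each node, from the first node up to the last
 * parent. The first non-zero return value stops the evaluation.
 */
static int model_run(const struct model_node *first, const void *ctx)
{
	const struct model_node *node;

	for (node = first; node; node = node->prev) {
		const struct model_rule *rule;

		for (rule = node->rule; rule; rule = rule->prev) {
			assert(rule->prog);
			if (rule->prog(ctx))
				return -EPERM;
		}
	}
	return 0;
}
```

A child node whose own rules all return zero is therefore still subject to
the rules of its parent nodes, which is what keeps inherited restrictions in
place.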

Changes since v4:
* merge manager and seccomp patches
* return -EFAULT in seccomp(2) when user_bpf_fd is null to easily check
  if Landlock is supported
* only allow a process with the global CAP_SYS_ADMIN to use Landlock
  (will be lifted in the future)
* add an early check to exit as soon as possible if the current process
  does not have Landlock rules

Changes since v3:
* remove the hard link with seccomp (suggested by Andy Lutomirski and
  Kees Cook):
  * remove the cookie which could imply multiple evaluations of Landlock
    rules
  * remove the origin field in struct landlock_data
* remove documentation fix (merged upstream)
* rename the new seccomp command to SECCOMP_ADD_LANDLOCK_RULE
* internal renaming
* split commit
* new design to be able to inherit on the fly the parent rules

Changes since v2:
* Landlock programs can now be run without seccomp filter but for any
  syscall (from the process) or interruption
* move Landlock related functions and structs into security/landlock/*
  (to manage cgroups as well)
* fix seccomp filter handling: run Landlock programs for each of their
  legitimate seccomp filter
* properly clean up all seccomp results
* cosmetic changes to ease the understanding
* fix some ifdef

Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: James Morris <james.l.morris@oracle.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Serge E. Hallyn <serge@hallyn.com>
Cc: Will Drewry <wad@chromium.org>
---
 include/linux/seccomp.h      |   8 ++
 include/uapi/linux/seccomp.h |   1 +
 kernel/fork.c                |  14 +-
 kernel/seccomp.c             |   8 ++
 security/landlock/Makefile   |   2 +-
 security/landlock/hooks.c    |  42 +++++-
 security/landlock/manager.c  | 321 +++++++++++++++++++++++++++++++++++++++++++
 7 files changed, 392 insertions(+), 4 deletions(-)
 create mode 100644 security/landlock/manager.c

diff --git a/include/linux/seccomp.h b/include/linux/seccomp.h
index e25aee2cdfc0..9a38de3c0e72 100644
--- a/include/linux/seccomp.h
+++ b/include/linux/seccomp.h
@@ -10,6 +10,10 @@
 #include <linux/thread_info.h>
 #include <asm/seccomp.h>
 
+#if defined(CONFIG_SECCOMP_FILTER) && defined(CONFIG_SECURITY_LANDLOCK)
+struct landlock_events;
+#endif /* CONFIG_SECCOMP_FILTER && CONFIG_SECURITY_LANDLOCK */
+
 struct seccomp_filter;
 /**
  * struct seccomp - the state of a seccomp'ed process
@@ -18,6 +22,7 @@ struct seccomp_filter;
  *         system calls available to a process.
  * @filter: must always point to a valid seccomp-filter or NULL as it is
  *          accessed without locking during system call entry.
+ * @landlock_events: contains an array of Landlock rules.
  *
  *          @filter must only be accessed from the context of current as there
  *          is no read locking.
@@ -25,6 +30,9 @@ struct seccomp_filter;
 struct seccomp {
 	int mode;
 	struct seccomp_filter *filter;
+#if defined(CONFIG_SECCOMP_FILTER) && defined(CONFIG_SECURITY_LANDLOCK)
+	struct landlock_events *landlock_events;
+#endif /* CONFIG_SECCOMP_FILTER && CONFIG_SECURITY_LANDLOCK */
 };
 
 #ifdef CONFIG_HAVE_ARCH_SECCOMP_FILTER
diff --git a/include/uapi/linux/seccomp.h b/include/uapi/linux/seccomp.h
index 0f238a43ff1e..56dd692cddac 100644
--- a/include/uapi/linux/seccomp.h
+++ b/include/uapi/linux/seccomp.h
@@ -13,6 +13,7 @@
 /* Valid operations for seccomp syscall. */
 #define SECCOMP_SET_MODE_STRICT	0
 #define SECCOMP_SET_MODE_FILTER	1
+#define SECCOMP_ADD_LANDLOCK_RULE	2
 
 /* Valid flags for SECCOMP_SET_MODE_FILTER */
 #define SECCOMP_FILTER_FLAG_TSYNC	1
diff --git a/kernel/fork.c b/kernel/fork.c
index a4f0d0e8aeb2..bd5c72dffe60 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -37,6 +37,7 @@
 #include <linux/security.h>
 #include <linux/hugetlb.h>
 #include <linux/seccomp.h>
+#include <linux/landlock.h>
 #include <linux/swap.h>
 #include <linux/syscalls.h>
 #include <linux/jiffies.h>
@@ -515,7 +516,10 @@ static struct task_struct *dup_task_struct(struct task_struct *orig, int node)
 	 * the usage counts on the error path calling free_task.
 	 */
 	tsk->seccomp.filter = NULL;
-#endif
+#ifdef CONFIG_SECURITY_LANDLOCK
+	tsk->seccomp.landlock_events = NULL;
+#endif /* CONFIG_SECURITY_LANDLOCK */
+#endif /* CONFIG_SECCOMP */
 
 	setup_thread_stack(tsk, orig);
 	clear_user_return_notifier(tsk);
@@ -1388,7 +1392,13 @@ static void copy_seccomp(struct task_struct *p)
 
 	/* Ref-count the new filter user, and assign it. */
 	get_seccomp_filter(current);
-	p->seccomp = current->seccomp;
+	p->seccomp.mode = current->seccomp.mode;
+	p->seccomp.filter = current->seccomp.filter;
+#if defined(CONFIG_SECCOMP_FILTER) && defined(CONFIG_SECURITY_LANDLOCK)
+	p->seccomp.landlock_events = current->seccomp.landlock_events;
+	if (p->seccomp.landlock_events)
+		atomic_inc(&p->seccomp.landlock_events->usage);
+#endif /* CONFIG_SECCOMP_FILTER && CONFIG_SECURITY_LANDLOCK */
 
 	/*
 	 * Explicitly enable no_new_privs here in case it got set
diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index 06f2f3ee454c..ef412d95ff5d 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -32,6 +32,7 @@
 #include <linux/security.h>
 #include <linux/tracehook.h>
 #include <linux/uaccess.h>
+#include <linux/landlock.h>
 
 /**
  * struct seccomp_filter - container for seccomp BPF programs
@@ -492,6 +493,9 @@ static void put_seccomp_filter(struct seccomp_filter *filter)
 void put_seccomp(struct task_struct *tsk)
 {
 	put_seccomp_filter(tsk->seccomp.filter);
+#ifdef CONFIG_SECURITY_LANDLOCK
+	put_landlock_events(tsk->seccomp.landlock_events);
+#endif /* CONFIG_SECURITY_LANDLOCK */
 }
 
 /**
@@ -796,6 +800,10 @@ static long do_seccomp(unsigned int op, unsigned int flags,
 		return seccomp_set_mode_strict();
 	case SECCOMP_SET_MODE_FILTER:
 		return seccomp_set_mode_filter(flags, uargs);
+#if defined(CONFIG_SECCOMP_FILTER) && defined(CONFIG_SECURITY_LANDLOCK)
+	case SECCOMP_ADD_LANDLOCK_RULE:
+		return landlock_seccomp_append_prog(flags, uargs);
+#endif /* CONFIG_SECCOMP_FILTER && CONFIG_SECURITY_LANDLOCK */
 	default:
 		return -EINVAL;
 	}
diff --git a/security/landlock/Makefile b/security/landlock/Makefile
index 8dc8bde660bd..6c1b0d8bd810 100644
--- a/security/landlock/Makefile
+++ b/security/landlock/Makefile
@@ -2,4 +2,4 @@ ccflags-$(CONFIG_SECURITY_LANDLOCK) += -Werror=unused-function
 
 obj-$(CONFIG_SECURITY_LANDLOCK) := landlock.o
 
-landlock-y := hooks.o
+landlock-y := hooks.o manager.o
diff --git a/security/landlock/hooks.c b/security/landlock/hooks.c
index 88ebe3f01758..704ea18377d2 100644
--- a/security/landlock/hooks.c
+++ b/security/landlock/hooks.c
@@ -290,7 +290,44 @@ static u64 mem_prot_to_access(unsigned long prot, bool private)
 
 static inline bool landlock_used(void)
 {
+#ifdef CONFIG_SECCOMP_FILTER
+	return !!(current->seccomp.landlock_events);
+#else
 	return false;
+#endif /* CONFIG_SECCOMP_FILTER */
+}
+
+/**
+ * landlock_run_prog - run Landlock program for a syscall
+ *
+ * @event_idx: event index in the rules array
+ * @ctx: non-NULL eBPF context
+ * @events: Landlock events pointer
+ */
+static int landlock_run_prog(u32 event_idx, const struct landlock_context *ctx,
+		struct landlock_events *events)
+{
+	struct landlock_node *node;
+
+	if (!events)
+		return 0;
+
+	for (node = events->nodes[event_idx]; node; node = node->prev) {
+		struct landlock_rule *rule;
+
+		for (rule = node->rule; rule; rule = rule->prev) {
+			u32 ret;
+
+			if (WARN_ON(!rule->prog))
+				continue;
+			rcu_read_lock();
+			ret = BPF_PROG_RUN(rule->prog, (void *)ctx);
+			rcu_read_unlock();
+			if (ret)
+				return -EPERM;
+		}
+	}
+	return 0;
 }
 
 static int landlock_decide(enum landlock_subtype_event event,
@@ -309,7 +346,10 @@ static int landlock_decide(enum landlock_subtype_event event,
 		.arg2 = ctx_values[1],
 	};
 
-	/* insert manager call here */
+#ifdef CONFIG_SECCOMP_FILTER
+	ret = landlock_run_prog(event_idx, &ctx,
+			current->seccomp.landlock_events);
+#endif /* CONFIG_SECCOMP_FILTER */
 
 	return ret;
 }
diff --git a/security/landlock/manager.c b/security/landlock/manager.c
new file mode 100644
index 000000000000..00bb2944c85e
--- /dev/null
+++ b/security/landlock/manager.c
@@ -0,0 +1,321 @@
+/*
+ * Landlock LSM - seccomp manager
+ *
+ * Copyright © 2017 Mickaël Salaün <mic@digikod.net>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2, as
+ * published by the Free Software Foundation.
+ */
+
+#include <asm/page.h> /* PAGE_SIZE */
+#include <linux/atomic.h> /* atomic_*(), smp_store_release() */
+#include <linux/bpf.h> /* bpf_prog_put() */
+#include <linux/filter.h> /* struct bpf_prog */
+#include <linux/kernel.h> /* round_up() */
+#include <linux/landlock.h>
+#include <linux/sched.h> /* current_cred(), task_no_new_privs() */
+#include <linux/security.h> /* security_capable_noaudit() */
+#include <linux/slab.h> /* kzalloc(), kmalloc(), kfree() */
+#include <linux/types.h> /* atomic_t */
+#include <linux/uaccess.h> /* copy_from_user() */
+
+#include "common.h"
+
+static void put_landlock_rule(struct landlock_rule *rule)
+{
+	struct landlock_rule *orig = rule;
+
+	/* clean up single-reference branches iteratively */
+	while (orig && atomic_dec_and_test(&orig->usage)) {
+		struct landlock_rule *freeme = orig;
+
+		bpf_prog_put(orig->prog);
+		orig = orig->prev;
+		kfree(freeme);
+	}
+}
+
+static void put_landlock_node(struct landlock_node *node)
+{
+	struct landlock_node *orig = node;
+
+	/* clean up single-reference branches iteratively */
+	while (orig && atomic_dec_and_test(&orig->usage)) {
+		struct landlock_node *freeme = orig;
+
+		put_landlock_rule(orig->rule);
+		orig = orig->prev;
+		kfree(freeme);
+	}
+}
+
+void put_landlock_events(struct landlock_events *events)
+{
+	if (events && atomic_dec_and_test(&events->usage)) {
+		size_t i;
+
+		/* XXX: Do we need to use lockless_dereference() here? */
+		for (i = 0; i < ARRAY_SIZE(events->nodes); i++) {
+			if (!events->nodes[i])
+				continue;
+			/* Are we the owner of this node? */
+			if (events->nodes[i]->owner == &events->nodes[i])
+				events->nodes[i]->owner = NULL;
+			put_landlock_node(events->nodes[i]);
+		}
+		kfree(events);
+	}
+}
+
+static struct landlock_events *new_raw_landlock_events(void)
+{
+	struct landlock_events *ret;
+
+	/* array filled with NULL values */
+	ret = kzalloc(sizeof(*ret), GFP_KERNEL);
+	if (!ret)
+		return ERR_PTR(-ENOMEM);
+	atomic_set(&ret->usage, 1);
+	return ret;
+}
+
+static struct landlock_events *new_filled_landlock_events(void)
+{
+	size_t i;
+	struct landlock_events *ret;
+
+	ret = new_raw_landlock_events();
+	if (IS_ERR(ret))
+		return ret;
+	/*
+	 * We need to initially allocate every node to be able to update the
+	 * rules they are pointing to, across every (future) children of the
+	 * current task.
+	 */
+	for (i = 0; i < ARRAY_SIZE(ret->nodes); i++) {
+		struct landlock_node *node;
+
+		node = kzalloc(sizeof(*node), GFP_KERNEL);
+		if (!node)
+			goto put_events;
+		atomic_set(&node->usage, 1);
+		/* we are the owner of this node */
+		node->owner = &ret->nodes[i];
+		ret->nodes[i] = node;
+	}
+	return ret;
+
+put_events:
+	put_landlock_events(ret);
+	return ERR_PTR(-ENOMEM);
+}
+
+static void add_landlock_rule(struct landlock_events *events,
+		struct landlock_rule *rule)
+{
+	/* subtype.landlock_rule.event > 0 for loaded programs */
+	u32 event_idx = get_index(rule->prog->subtype.landlock_rule.event);
+
+	rule->prev = events->nodes[event_idx]->rule;
+	WARN_ON(atomic_read(&rule->usage));
+	atomic_set(&rule->usage, 1);
+	/* do not increment the previous rule usage */
+	smp_store_release(&events->nodes[event_idx]->rule, rule);
+}
+
+/* Limit Landlock events to 256KB. */
+#define LANDLOCK_EVENTS_MAX_PAGES (1 << 6)
+
+/**
+ * landlock_append_prog - attach a Landlock rule to @current_events
+ *
+ * @current_events: landlock_events pointer, must be locked (if needed) to
+ *                  prevent a concurrent put/free. This pointer must not be
+ *                  freed after the call.
+ * @prog: non-NULL Landlock rule to append to @current_events. @prog will be
+ *        owned by landlock_append_prog() and freed if an error happened.
+ *
+ * Return @current_events or a new pointer when OK. Return a pointer error
+ * otherwise.
+ */
+static struct landlock_events *landlock_append_prog(
+		struct landlock_events *current_events, struct bpf_prog *prog)
+{
+	struct landlock_events *new_events = current_events;
+	unsigned long pages;
+	struct landlock_rule *rule;
+	u32 event_idx;
+
+	if (prog->type != BPF_PROG_TYPE_LANDLOCK) {
+		new_events = ERR_PTR(-EINVAL);
+		goto put_prog;
+	}
+
+	/* validate memory size allocation */
+	pages = prog->pages;
+	if (current_events) {
+		size_t i;
+
+		for (i = 0; i < ARRAY_SIZE(current_events->nodes); i++) {
+			struct landlock_node *walker_n;
+
+			for (walker_n = current_events->nodes[i];
+					walker_n;
+					walker_n = walker_n->prev) {
+				struct landlock_rule *walker_r;
+
+				for (walker_r = walker_n->rule;
+						walker_r;
+						walker_r = walker_r->prev)
+					pages += walker_r->prog->pages;
+			}
+		}
+		/* count a struct landlock_events if we need to allocate one */
+		if (atomic_read(&current_events->usage) != 1)
+			pages += round_up(sizeof(*current_events), PAGE_SIZE) /
+				PAGE_SIZE;
+	}
+	if (pages > LANDLOCK_EVENTS_MAX_PAGES) {
+		new_events = ERR_PTR(-E2BIG);
+		goto put_prog;
+	}
+
+	rule = kzalloc(sizeof(*rule), GFP_KERNEL);
+	if (!rule) {
+		new_events = ERR_PTR(-ENOMEM);
+		goto put_prog;
+	}
+	rule->prog = prog;
+
+	/* subtype.landlock_rule.event > 0 for loaded programs */
+	event_idx = get_index(rule->prog->subtype.landlock_rule.event);
+
+	if (!current_events) {
+		/* add a new landlock_events, if needed */
+		new_events = new_filled_landlock_events();
+		if (IS_ERR(new_events))
+			goto put_rule;
+		add_landlock_rule(new_events, rule);
+	} else {
+		if (new_events->nodes[event_idx]->owner ==
+				&new_events->nodes[event_idx]) {
+			/* We are the owner, we can then update the node. */
+			add_landlock_rule(new_events, rule);
+		} else if (atomic_read(&current_events->usage) == 1) {
+			WARN_ON(new_events->nodes[event_idx]->owner);
+			/*
+			 * We can become the new owner if no other task uses it.
+			 * This avoids an unnecessary allocation.
+			 */
+			new_events->nodes[event_idx]->owner =
+				&new_events->nodes[event_idx];
+			add_landlock_rule(new_events, rule);
+		} else {
+			/*
+			 * We are not the owner, we need to fork current_events
+			 * and then add a new node.
+			 */
+			struct landlock_node *node;
+			size_t i;
+
+			node = kmalloc(sizeof(*node), GFP_KERNEL);
+			if (!node) {
+				new_events = ERR_PTR(-ENOMEM);
+				goto put_rule;
+			}
+			atomic_set(&node->usage, 1);
+			/* set the previous node after the new_events
+			 * allocation */
+			node->prev = NULL;
+			/* do not increment the previous node usage */
+			node->owner = &new_events->nodes[event_idx];
+			/* rule->prev is already NULL */
+			atomic_set(&rule->usage, 1);
+			node->rule = rule;
+
+			new_events = new_raw_landlock_events();
+			if (IS_ERR(new_events)) {
+				/* put the rule as well */
+				put_landlock_node(node);
+				return ERR_PTR(-ENOMEM);
+			}
+			for (i = 0; i < ARRAY_SIZE(new_events->nodes); i++) {
+				new_events->nodes[i] =
+					lockless_dereference(
+							current_events->nodes[i]);
+				if (i == event_idx)
+					node->prev = new_events->nodes[i];
+				if (!WARN_ON(!new_events->nodes[i]))
+					atomic_inc(&new_events->nodes[i]->usage);
+			}
+			new_events->nodes[event_idx] = node;
+
+			/*
+			 * @current_events will not be freed here because its usage
+			 * field is > 1. It is only prevented from being freed by
+			 * another subject thanks to the caller of
+			 * landlock_append_prog(), which should be locked if needed.
+			 */
+			put_landlock_events(current_events);
+		}
+	}
+	return new_events;
+
+put_prog:
+	bpf_prog_put(prog);
+	return new_events;
+
+put_rule:
+	put_landlock_rule(rule);
+	return new_events;
+}
+
+/**
+ * landlock_seccomp_append_prog - attach a Landlock rule to the current process
+ *
+ * current->seccomp.landlock_events is lazily allocated. When a process forks,
+ * only a pointer is copied. When a new event is added by a process, if there
+ * are other references to this process' landlock_events, then a new allocation
+ * is made to contain an array pointing to Landlock rule lists. This design
+ * has a low performance impact and is memory efficient while keeping the
+ * property of append-only rules.
+ *
+ * @flags: not used for now, but could be used for TSYNC
+ * @user_bpf_fd: file descriptor pointing to a loaded Landlock rule
+ */
+#ifdef CONFIG_SECCOMP_FILTER
+int landlock_seccomp_append_prog(unsigned int flags, const char __user *user_bpf_fd)
+{
+	struct landlock_events *new_events;
+	struct bpf_prog *prog;
+	int bpf_fd;
+
+	/* force no_new_privs to limit privilege escalation */
+	if (!task_no_new_privs(current))
+		return -EPERM;
+	/* will be removed in the future to allow unprivileged tasks */
+	if (!capable(CAP_SYS_ADMIN))
+		return -EPERM;
+	if (!user_bpf_fd)
+		return -EFAULT;
+	if (flags)
+		return -EINVAL;
+	if (copy_from_user(&bpf_fd, user_bpf_fd, sizeof(bpf_fd)))
+		return -EFAULT;
+	prog = bpf_prog_get(bpf_fd);
+	if (IS_ERR(prog))
+		return PTR_ERR(prog);
+
+	/*
+	 * We don't need to lock anything for the current process hierarchy,
+	 * everything is guarded by the atomic counters.
+	 */
+	new_events = landlock_append_prog(current->seccomp.landlock_events, prog);
+	/* @prog is managed/freed by landlock_append_prog() */
+	if (IS_ERR(new_events))
+		return PTR_ERR(new_events);
+	current->seccomp.landlock_events = new_events;
+	return 0;
+}
+#endif /* CONFIG_SECCOMP_FILTER */
-- 
2.11.0


* [PATCH v5 07/10] bpf: Add a Landlock sandbox example
  2017-02-22  1:26 [PATCH v5 00/10] Landlock LSM: Toward unprivileged sandboxing Mickaël Salaün
                   ` (5 preceding siblings ...)
  2017-02-22  1:26 ` [PATCH v5 06/10] seccomp,landlock: Handle Landlock events per process hierarchy Mickaël Salaün
@ 2017-02-22  1:26 ` Mickaël Salaün
  2017-02-23 22:13   ` Mickaël Salaün
  2017-02-22  1:26 ` [PATCH v5 08/10] seccomp: Enhance test_harness with an assert step mechanism Mickaël Salaün
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 26+ messages in thread
From: Mickaël Salaün @ 2017-02-22  1:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mickaël Salaün, Alexei Starovoitov, Andy Lutomirski,
	Arnaldo Carvalho de Melo, Casey Schaufler, Daniel Borkmann,
	David Drysdale, David S . Miller, Eric W . Biederman,
	James Morris, Jann Horn, Jonathan Corbet, Matthew Garrett,
	Michael Kerrisk, Kees Cook, Paul Moore, Sargun Dhillon,
	Serge E . Hallyn, Shuah Khan, Tejun Heo, Thomas Graf,
	Will Drewry, kernel-hardening, linux-api, linux-security-module,
	netdev

Add a basic sandbox tool to create a process isolated from some part of
the system. This sandbox creates a read-only environment. It is only
allowed to write to a character device such as a TTY:

  # :> X
  # echo $?
  0
  # ./samples/bpf/landlock1 /bin/sh -i
  Launching a new sandboxed process.
  # :> Y
  cannot create Y: Operation not permitted

Changes since v4:
* write the Landlock rule in C and compile it with LLVM
* remove cgroup handling
* remove path handling: only handle a read-only environment
* remove errno return codes

Changes since v3:
* remove seccomp and origin field: completely free from seccomp programs
* handle more FS-related hooks
* handle inode hooks and directory traversal
* add faked but consistent view thanks to ENOENT
* add /lib64 in the example
* fix spelling
* rename some types and definitions (e.g. SECCOMP_ADD_LANDLOCK_RULE)

Changes since v2:
* use BPF_PROG_ATTACH for cgroup handling

Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: James Morris <james.l.morris@oracle.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Serge E. Hallyn <serge@hallyn.com>
---
 samples/bpf/.gitignore       |  32 ++++++++++++++
 samples/bpf/Makefile         |   4 ++
 samples/bpf/bpf_load.c       |  26 +++++++++--
 samples/bpf/landlock1_kern.c |  46 +++++++++++++++++++
 samples/bpf/landlock1_user.c | 102 +++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 206 insertions(+), 4 deletions(-)
 create mode 100644 samples/bpf/.gitignore
 create mode 100644 samples/bpf/landlock1_kern.c
 create mode 100644 samples/bpf/landlock1_user.c

diff --git a/samples/bpf/.gitignore b/samples/bpf/.gitignore
new file mode 100644
index 000000000000..a7562a5ef4c2
--- /dev/null
+++ b/samples/bpf/.gitignore
@@ -0,0 +1,32 @@
+fds_example
+lathist
+lwt_len_hist
+map_perf_test
+offwaketime
+sampleip
+sockex1
+sockex2
+sockex3
+sock_example
+spintest
+tc_l2_redirect
+test_cgrp2_array_pin
+test_cgrp2_attach
+test_cgrp2_attach2
+test_cgrp2_sock
+test_cgrp2_sock2
+test_current_task_under_cgroup
+test_lru_dist
+test_overhead
+test_probe_write_user
+trace_event
+trace_output
+tracex1
+tracex2
+tracex3
+tracex4
+tracex5
+tracex6
+xdp1
+xdp2
+xdp_tx_iptunnel
diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index 09e9d535bd74..3d3afd709635 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -34,6 +34,7 @@ hostprogs-y += sampleip
 hostprogs-y += tc_l2_redirect
 hostprogs-y += lwt_len_hist
 hostprogs-y += xdp_tx_iptunnel
+hostprogs-y += landlock1
 
 # Libbpf dependencies
 LIBBPF := ../../tools/lib/bpf/bpf.o
@@ -72,6 +73,7 @@ sampleip-objs := bpf_load.o $(LIBBPF) sampleip_user.o
 tc_l2_redirect-objs := bpf_load.o $(LIBBPF) tc_l2_redirect_user.o
 lwt_len_hist-objs := bpf_load.o $(LIBBPF) lwt_len_hist_user.o
 xdp_tx_iptunnel-objs := bpf_load.o $(LIBBPF) xdp_tx_iptunnel_user.o
+landlock1-objs := bpf_load.o $(LIBBPF) landlock1_user.o
 
 # Tell kbuild to always build the programs
 always := $(hostprogs-y)
@@ -105,6 +107,7 @@ always += trace_event_kern.o
 always += sampleip_kern.o
 always += lwt_len_hist_kern.o
 always += xdp_tx_iptunnel_kern.o
+always += landlock1_kern.o
 
 HOSTCFLAGS += -I$(objtree)/usr/include
 HOSTCFLAGS += -I$(srctree)/tools/lib/
@@ -139,6 +142,7 @@ HOSTLOADLIBES_sampleip += -lelf
 HOSTLOADLIBES_tc_l2_redirect += -l elf
 HOSTLOADLIBES_lwt_len_hist += -l elf
 HOSTLOADLIBES_xdp_tx_iptunnel += -lelf
+HOSTLOADLIBES_landlock1 += -lelf
 
 # Allows pointing LLC/CLANG to a LLVM backend with bpf support, redefine on cmdline:
 #  make samples/bpf/ LLC=~/git/llvm/build/bin/llc CLANG=~/git/llvm/build/bin/clang
diff --git a/samples/bpf/bpf_load.c b/samples/bpf/bpf_load.c
index d23dc13ab0f2..78df39cf8b2f 100644
--- a/samples/bpf/bpf_load.c
+++ b/samples/bpf/bpf_load.c
@@ -68,6 +68,7 @@ static int load_and_attach(const char *event, struct bpf_insn *prog, int size)
 	bool is_perf_event = strncmp(event, "perf_event", 10) == 0;
 	bool is_cgroup_skb = strncmp(event, "cgroup/skb", 10) == 0;
 	bool is_cgroup_sk = strncmp(event, "cgroup/sock", 11) == 0;
+	bool is_landlock = strncmp(event, "landlock", 8) == 0;
 	size_t insns_cnt = size / sizeof(struct bpf_insn);
 	enum bpf_prog_type prog_type;
 	char buf[256];
@@ -93,6 +94,12 @@ static int load_and_attach(const char *event, struct bpf_insn *prog, int size)
 		prog_type = BPF_PROG_TYPE_CGROUP_SKB;
 	} else if (is_cgroup_sk) {
 		prog_type = BPF_PROG_TYPE_CGROUP_SOCK;
+	} else if (is_landlock) {
+		prog_type = BPF_PROG_TYPE_LANDLOCK;
+		if (!subtype.landlock_rule.event) {
+			printf("No subtype\n");
+			return -1;
+		}
 	} else {
 		printf("Unknown event '%s'\n", event);
 		return -1;
@@ -107,7 +114,7 @@ static int load_and_attach(const char *event, struct bpf_insn *prog, int size)
 
 	prog_fd[prog_cnt++] = fd;
 
-	if (is_xdp || is_perf_event || is_cgroup_skb || is_cgroup_sk)
+	if (is_xdp || is_perf_event || is_cgroup_skb || is_cgroup_sk || is_landlock)
 		return 0;
 
 	if (is_socket) {
@@ -278,6 +285,8 @@ int load_bpf_file(char *path)
 	Elf_Data *data, *data_prog, *symbols = NULL;
 	char *shname, *shname_prog;
 
+	subtype.landlock_rule.event = 0;
+
 	if (elf_version(EV_CURRENT) == EV_NONE)
 		return 1;
 
@@ -322,6 +331,14 @@ int load_bpf_file(char *path)
 			processed_sec[i] = true;
 			if (load_maps(data->d_buf, data->d_size))
 				return 1;
+		} else if (strcmp(shname, "subtype") == 0) {
+			processed_sec[i] = true;
+			if (data->d_size != sizeof(union bpf_prog_subtype)) {
+				printf("invalid size of subtype section %zd\n",
+				       data->d_size);
+				return 1;
+			}
+			memcpy(&subtype, data->d_buf, sizeof(union bpf_prog_subtype));
 		} else if (shdr.sh_type == SHT_SYMTAB) {
 			symbols = data;
 		}
@@ -357,14 +374,14 @@ int load_bpf_file(char *path)
 			    memcmp(shname_prog, "xdp", 3) == 0 ||
 			    memcmp(shname_prog, "perf_event", 10) == 0 ||
 			    memcmp(shname_prog, "socket", 6) == 0 ||
-			    memcmp(shname_prog, "cgroup/", 7) == 0)
+			    memcmp(shname_prog, "cgroup/", 7) == 0 ||
+			    memcmp(shname_prog, "landlock", 8) == 0)
 				load_and_attach(shname_prog, insns, data_prog->d_size);
 		}
 	}
 
 	/* load programs that don't use maps */
 	for (i = 1; i < ehdr.e_shnum; i++) {
-
 		if (processed_sec[i])
 			continue;
 
@@ -377,7 +394,8 @@ int load_bpf_file(char *path)
 		    memcmp(shname, "xdp", 3) == 0 ||
 		    memcmp(shname, "perf_event", 10) == 0 ||
 		    memcmp(shname, "socket", 6) == 0 ||
-		    memcmp(shname, "cgroup/", 7) == 0)
+		    memcmp(shname, "cgroup/", 7) == 0 ||
+		    memcmp(shname, "landlock", 8) == 0)
 			load_and_attach(shname, data->d_buf, data->d_size);
 	}
 
diff --git a/samples/bpf/landlock1_kern.c b/samples/bpf/landlock1_kern.c
new file mode 100644
index 000000000000..ff23061e6324
--- /dev/null
+++ b/samples/bpf/landlock1_kern.c
@@ -0,0 +1,46 @@
+/*
+ * Landlock LSM - Sample 1 (BPF program)
+ *
+ * Copyright © 2017 Mickaël Salaün <mic@digikod.net>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2, as
+ * published by the Free Software Foundation.
+ */
+
+#define KBUILD_MODNAME "foo"
+#include <uapi/linux/bpf.h>
+#include <uapi/linux/stat.h> /* S_ISCHR() */
+#include "bpf_helpers.h"
+
+SEC("landlock1")
+static int landlock_fs_prog1(struct landlock_context *ctx)
+{
+	char fmt_error[] = "landlock1: error: get_mode:%lld\n";
+	char fmt_name[] = "landlock1: syscall:%d\n";
+	long long ret;
+
+	if (!(ctx->arg2 & LANDLOCK_ACTION_FS_WRITE))
+		return 0;
+	ret = bpf_handle_fs_get_mode((void *)ctx->arg1);
+	if (ret < 0) {
+		bpf_trace_printk(fmt_error, sizeof(fmt_error), ret);
+		return 1;
+	}
+	if (S_ISCHR(ret))
+		return 0;
+	bpf_trace_printk(fmt_name, sizeof(fmt_name), ctx->syscall_nr);
+	return 1;
+}
+
+SEC("subtype")
+static union bpf_prog_subtype _subtype = {
+	.landlock_rule = {
+		.version = 1,
+		.event = LANDLOCK_SUBTYPE_EVENT_FS,
+		.ability = LANDLOCK_SUBTYPE_ABILITY_DEBUG,
+	}
+};
+
+SEC("license")
+static char _license[] = "GPL";
diff --git a/samples/bpf/landlock1_user.c b/samples/bpf/landlock1_user.c
new file mode 100644
index 000000000000..5feb660caed4
--- /dev/null
+++ b/samples/bpf/landlock1_user.c
@@ -0,0 +1,102 @@
+/*
+ * Landlock LSM - Sample 1 (userland)
+ *
+ * Copyright © 2017 Mickaël Salaün <mic@digikod.net>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2, as
+ * published by the Free Software Foundation.
+ */
+
+#define _GNU_SOURCE
+#include "bpf_load.h"
+#include "libbpf.h"
+
+#include <errno.h>
+#include <fcntl.h> /* open() */
+#include <linux/bpf.h>
+#include <linux/filter.h>
+#include <linux/prctl.h>
+#include <linux/seccomp.h>
+#include <stddef.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/prctl.h>
+#include <sys/syscall.h>
+#include <unistd.h>
+
+#ifndef seccomp
+static int seccomp(unsigned int op, unsigned int flags, void *args)
+{
+	errno = 0;
+	return syscall(__NR_seccomp, op, flags, args);
+}
+#endif
+
+#define ARRAY_SIZE(a)	(sizeof(a) / sizeof(a[0]))
+#define MAX_ERRNO	4095
+
+
+struct landlock_rule {
+	enum landlock_subtype_event event;
+	struct bpf_insn *bpf;
+	size_t size;
+};
+
+static int apply_sandbox(int prog_fd)
+{
+	int ret = 0;
+
+	/* set up the test sandbox */
+	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)) {
+		perror("prctl(no_new_priv)");
+		return 1;
+	}
+	if (seccomp(SECCOMP_ADD_LANDLOCK_RULE, 0, &prog_fd)) {
+		perror("seccomp(set_hook)");
+		ret = 1;
+	}
+	close(prog_fd);
+
+	return ret;
+}
+
+int main(int argc, char * const argv[], char * const *envp)
+{
+	char filename[256];
+	char *cmd_path;
+	char * const *cmd_argv;
+
+	if (argc < 2) {
+		fprintf(stderr, "usage: %s <cmd> [args]...\n\n", argv[0]);
+		fprintf(stderr, "Launch a command in a read-only environment "
+				"(except for character devices).\n");
+		fprintf(stderr, "Display debug with: "
+				"cat /sys/kernel/debug/tracing/trace_pipe &\n");
+		return 1;
+	}
+
+	snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
+	if (load_bpf_file(filename)) {
+		printf("%s", bpf_log_buf);
+		return 1;
+	}
+	if (!prog_fd[0]) {
+		if (errno) {
+			printf("load_bpf_file: %s\n", strerror(errno));
+		} else {
+			printf("load_bpf_file: Error\n");
+		}
+		return 1;
+	}
+
+	if (apply_sandbox(prog_fd[0]))
+		return 1;
+	cmd_path = argv[1];
+	cmd_argv = argv + 1;
+	fprintf(stderr, "Launching a new sandboxed process.\n");
+	execve(cmd_path, cmd_argv, envp);
+	perror("execve");
+	return 1;
+}
-- 
2.11.0

* [PATCH v5 08/10] seccomp: Enhance test_harness with an assert step mechanism
  2017-02-22  1:26 [PATCH v5 00/10] Landlock LSM: Toward unprivileged sandboxing Mickaël Salaün
                   ` (6 preceding siblings ...)
  2017-02-22  1:26 ` [PATCH v5 07/10] bpf: Add a Landlock sandbox example Mickaël Salaün
@ 2017-02-22  1:26 ` Mickaël Salaün
  2017-02-22  1:26 ` [PATCH v5 09/10] bpf,landlock: Add tests for Landlock Mickaël Salaün
  2017-02-22  1:26 ` [PATCH v5 10/10] landlock: Add user and kernel documentation " Mickaël Salaün
  9 siblings, 0 replies; 26+ messages in thread
From: Mickaël Salaün @ 2017-02-22  1:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mickaël Salaün, Alexei Starovoitov, Andy Lutomirski,
	Arnaldo Carvalho de Melo, Casey Schaufler, Daniel Borkmann,
	David Drysdale, David S . Miller, Eric W . Biederman,
	James Morris, Jann Horn, Jonathan Corbet, Matthew Garrett,
	Michael Kerrisk, Kees Cook, Paul Moore, Sargun Dhillon,
	Serge E . Hallyn, Shuah Khan, Tejun Heo, Thomas Graf,
	Will Drewry, kernel-hardening, linux-api, linux-security-module,
	netdev

This is useful to return information about an error when the test is not
able to write to TH_LOG_STREAM.

Helpers from test_harness.h may be useful outside of the seccomp
directory.
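
The decoding performed on the parent side can be sketched as follows. This
is only an illustration: failed_step_from_exit_code() is a hypothetical
helper, not part of test_harness.h, and it assumes that the harness stores
the value of WEXITSTATUS() in the signed 8-bit "passed" field.

```c
/*
 * Hypothetical helper: map a child exit code (0..255) to the number of
 * the failed step, or 0 if the exit code does not encode a step.
 */
static int failed_step_from_exit_code(int exit_code)
{
	int passed = exit_code & 0xff;

	/* reinterpret the low 8 bits as a signed value, like __s8 does */
	if (passed >= 128)
		passed -= 256;
	if (passed < 0)
		return -passed;	/* e.g. _exit(-3) is seen as 253: step #3 */
	return 0;
}
```

A child calling _exit(-3) is reported with an exit code of 253, which this
helper maps back to step 3. Non-negative values keep their usual meaning.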

Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Will Drewry <wad@chromium.org>
---
 tools/testing/selftests/seccomp/test_harness.h | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/seccomp/test_harness.h b/tools/testing/selftests/seccomp/test_harness.h
index a786c69c7584..77e407663e06 100644
--- a/tools/testing/selftests/seccomp/test_harness.h
+++ b/tools/testing/selftests/seccomp/test_harness.h
@@ -397,7 +397,7 @@ struct __test_metadata {
 	const char *name;
 	void (*fn)(struct __test_metadata *);
 	int termsig;
-	int passed;
+	__s8 passed;
 	int trigger; /* extra handler after the evaluation */
 	struct __test_metadata *prev, *next;
 };
@@ -476,6 +476,12 @@ void __run_test(struct __test_metadata *t)
 					"instead of by signal (code: %d)\n",
 					t->name,
 					WEXITSTATUS(status));
+			} else if (t->passed < 0) {
+				fprintf(TH_LOG_STREAM,
+					"%s: Failed at step #%d\n",
+					t->name,
+					t->passed * -1);
+				t->passed = 0;
 			}
 		} else if (WIFSIGNALED(status)) {
 			t->passed = 0;
-- 
2.11.0

* [PATCH v5 09/10] bpf,landlock: Add tests for Landlock
  2017-02-22  1:26 [PATCH v5 00/10] Landlock LSM: Toward unprivileged sandboxing Mickaël Salaün
                   ` (7 preceding siblings ...)
  2017-02-22  1:26 ` [PATCH v5 08/10] seccomp: Enhance test_harness with an assert step mechanism Mickaël Salaün
@ 2017-02-22  1:26 ` Mickaël Salaün
  2017-02-22  1:26 ` [PATCH v5 10/10] landlock: Add user and kernel documentation " Mickaël Salaün
  9 siblings, 0 replies; 26+ messages in thread
From: Mickaël Salaün @ 2017-02-22  1:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mickaël Salaün, Alexei Starovoitov, Andy Lutomirski,
	Arnaldo Carvalho de Melo, Casey Schaufler, Daniel Borkmann,
	David Drysdale, David S . Miller, Eric W . Biederman,
	James Morris, Jann Horn, Jonathan Corbet, Matthew Garrett,
	Michael Kerrisk, Kees Cook, Paul Moore, Sargun Dhillon,
	Serge E . Hallyn, Shuah Khan, Tejun Heo, Thomas Graf,
	Will Drewry, kernel-hardening, linux-api, linux-security-module,
	netdev

Test basic context access and filesystem event with multiple cases.
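
Once a read-only rule is applied, a test can no longer log to a file, so
the filesystem tests report the failing check through the exit code. A
minimal end-to-end sketch of that idea is shown below. CHECK_STEP() and
run_steps() are hypothetical names used for illustration only; the macro
has the same shape as ASSERT_STEP() in test_fs.c, and the child exits with
0 on success here, which is a simplification and not necessarily what the
harness does.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Count each check down from 0 and exit with that count on failure. */
#define CHECK_STEP(step, cond) \
	do { \
		(step)--; \
		if (!(cond)) \
			_exit(step); \
	} while (0)

/*
 * Run nb_steps checks in a child and make check number bad_step fail
 * (0 for none). Return the failed step seen by the parent, 0 if every
 * check passed, or -1 on fork()/waitpid() errors.
 */
static int run_steps(int nb_steps, int bad_step)
{
	int status, code, i, step = 0;
	pid_t child = fork();

	if (child < 0)
		return -1;
	if (child == 0) {
		for (i = 1; i <= nb_steps; i++)
			CHECK_STEP(step, i != bad_step);
		_exit(0);
	}
	if (waitpid(child, &status, 0) != child || !WIFEXITED(status))
		return -1;
	code = WEXITSTATUS(status);	/* low 8 bits of the exit value */
	if (code >= 128)
		code -= 256;		/* read it as a signed 8-bit value */
	return code < 0 ? -code : 0;
}
```

For example, run_steps(5, 3) makes the third check fail in the child, and
the parent recovers the step number 3 from the exit code alone.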

Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: James Morris <james.l.morris@oracle.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Serge E. Hallyn <serge@hallyn.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Will Drewry <wad@chromium.org>
---
 tools/testing/selftests/Makefile                   |   1 +
 tools/testing/selftests/bpf/test_verifier.c        |  54 +++-
 tools/testing/selftests/landlock/.gitignore        |   2 +
 tools/testing/selftests/landlock/Makefile          |  47 +++
 tools/testing/selftests/landlock/rules/Makefile    |  52 +++
 tools/testing/selftests/landlock/rules/README.rst  |   1 +
 .../testing/selftests/landlock/rules/bpf_helpers.h |   1 +
 tools/testing/selftests/landlock/rules/fs1.c       |  31 ++
 tools/testing/selftests/landlock/rules/fs2.c       |  31 ++
 tools/testing/selftests/landlock/test_fs.c         | 347 +++++++++++++++++++++
 10 files changed, 566 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/landlock/.gitignore
 create mode 100644 tools/testing/selftests/landlock/Makefile
 create mode 100644 tools/testing/selftests/landlock/rules/Makefile
 create mode 120000 tools/testing/selftests/landlock/rules/README.rst
 create mode 120000 tools/testing/selftests/landlock/rules/bpf_helpers.h
 create mode 100644 tools/testing/selftests/landlock/rules/fs1.c
 create mode 100644 tools/testing/selftests/landlock/rules/fs2.c
 create mode 100644 tools/testing/selftests/landlock/test_fs.c

diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index 831022b12848..a8dadcfa4c01 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -10,6 +10,7 @@ TARGETS += futex
 TARGETS += gpio
 TARGETS += ipc
 TARGETS += kcmp
+TARGETS += landlock
 TARGETS += lib
 TARGETS += membarrier
 TARGETS += memfd
diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
index 15eeb79104fe..ee1d439e48e4 100644
--- a/tools/testing/selftests/bpf/test_verifier.c
+++ b/tools/testing/selftests/bpf/test_verifier.c
@@ -4451,7 +4451,59 @@ static struct bpf_test tests[] = {
 		.errstr = "R0 min value is negative, either use unsigned index or do a if (index >=0) check.",
 		.result = REJECT,
 		.result_unpriv = REJECT,
-	}
+	},
+	{
+		"landlock/fs: always accept",
+		.insns = {
+			BPF_MOV32_IMM(BPF_REG_0, 0),
+			BPF_EXIT_INSN(),
+		},
+		.result = ACCEPT,
+		.prog_type = BPF_PROG_TYPE_LANDLOCK,
+		.prog_subtype = {
+			.landlock_rule = {
+				.version = 1,
+				.event = LANDLOCK_SUBTYPE_EVENT_FS,
+			}
+		},
+	},
+	{
+		"landlock/fs: read context",
+		.insns = {
+			BPF_MOV64_REG(BPF_REG_6, BPF_REG_1),
+			BPF_LDX_MEM(BPF_DW, BPF_REG_7, BPF_REG_6,
+				offsetof(struct landlock_context, status)),
+			/* test operations on raw values */
+			BPF_ALU64_IMM(BPF_ADD, BPF_REG_7, 1),
+			BPF_LDX_MEM(BPF_W, BPF_REG_7, BPF_REG_6,
+				offsetof(struct landlock_context, arch)),
+			BPF_ALU64_IMM(BPF_ADD, BPF_REG_7, 1),
+			BPF_LDX_MEM(BPF_W, BPF_REG_7, BPF_REG_6,
+				offsetof(struct landlock_context, syscall_nr)),
+			BPF_ALU64_IMM(BPF_ADD, BPF_REG_7, 1),
+			BPF_LDX_MEM(BPF_W, BPF_REG_7, BPF_REG_6,
+				offsetof(struct landlock_context, syscall_cmd)),
+			BPF_ALU64_IMM(BPF_ADD, BPF_REG_7, 1),
+			BPF_LDX_MEM(BPF_W, BPF_REG_7, BPF_REG_6,
+				offsetof(struct landlock_context, event)),
+			BPF_ALU64_IMM(BPF_ADD, BPF_REG_7, 1),
+			BPF_LDX_MEM(BPF_DW, BPF_REG_7, BPF_REG_6,
+				offsetof(struct landlock_context, arg1)),
+			BPF_LDX_MEM(BPF_DW, BPF_REG_7, BPF_REG_6,
+				offsetof(struct landlock_context, arg2)),
+			BPF_ALU64_IMM(BPF_ADD, BPF_REG_7, 1),
+			BPF_MOV32_IMM(BPF_REG_0, 0),
+			BPF_EXIT_INSN(),
+		},
+		.result = ACCEPT,
+		.prog_type = BPF_PROG_TYPE_LANDLOCK,
+		.prog_subtype = {
+			.landlock_rule = {
+				.version = 1,
+				.event = LANDLOCK_SUBTYPE_EVENT_FS,
+			}
+		},
+	},
 };
 
 static int probe_filter_length(const struct bpf_insn *fp)
diff --git a/tools/testing/selftests/landlock/.gitignore b/tools/testing/selftests/landlock/.gitignore
new file mode 100644
index 000000000000..e5ade9fc5633
--- /dev/null
+++ b/tools/testing/selftests/landlock/.gitignore
@@ -0,0 +1,2 @@
+/test_fs
+/tmp_*
diff --git a/tools/testing/selftests/landlock/Makefile b/tools/testing/selftests/landlock/Makefile
new file mode 100644
index 000000000000..9a52c82d64fa
--- /dev/null
+++ b/tools/testing/selftests/landlock/Makefile
@@ -0,0 +1,47 @@
+LIBDIR := ../../../lib
+BPFOBJ := $(LIBDIR)/bpf/bpf.o
+LOADOBJ := ../../../../samples/bpf/bpf_load.o
+
+CFLAGS += -Wl,-no-as-needed -Wall -O2 -I../../../include/uapi -I$(LIBDIR)
+LDFLAGS += -lelf
+
+test_src = $(wildcard test_*.c)
+rule_src = $(wildcard rules/*.c)
+
+test_objs := $(test_src:.c=)
+rule_objs := $(rule_src:.c=.o)
+
+TEST_PROGS := $(test_objs)
+
+.PHONY: all clean clean_tmp force
+
+all: $(test_objs) $(rule_objs)
+
+# force a rebuild of BPFOBJ when its dependencies are updated
+force:
+
+$(BPFOBJ): force
+	$(MAKE) -C $(dir $(BPFOBJ))
+
+$(LOADOBJ):
+	$(MAKE) -C $(dir $(LOADOBJ))
+
+# minimize builds
+rules/modules.order: $(rule_src)
+	$(MAKE) -C rules
+	@touch $@
+
+$(rule_objs): rules/modules.order
+	@
+
+$(test_objs): $(BPFOBJ) $(LOADOBJ)
+
+include ../lib.mk
+
+clean_tmp:
+	$(RM) -r tmp_*
+
+clean: clean_tmp
+	$(MAKE) -C rules clean
+	$(RM) $(test_objs)
+
diff --git a/tools/testing/selftests/landlock/rules/Makefile b/tools/testing/selftests/landlock/rules/Makefile
new file mode 100644
index 000000000000..bf33b67a9fc2
--- /dev/null
+++ b/tools/testing/selftests/landlock/rules/Makefile
@@ -0,0 +1,52 @@
+# kbuild trick to avoid linker error. Can be omitted if a module is built.
+obj- := dummy.o
+
+# Tell kbuild to always build the programs
+always := fs1.o
+always += fs2.o
+
+EXTRA_CFLAGS = -Wall -Wextra
+
+# Allows pointing LLC/CLANG to a LLVM backend with bpf support, redefine on cmdline:
+#  make samples/bpf/ LLC=~/git/llvm/build/bin/llc CLANG=~/git/llvm/build/bin/clang
+LLC ?= llc
+CLANG ?= clang
+
+# Verify LLVM compiler tools are available and bpf target is supported by llc
+.PHONY: all clean verify_cmds verify_target_bpf $(CLANG) $(LLC)
+
+# Trick to allow make to be run from this directory
+all:
+	$(MAKE) -C ../../../../../ $(CURDIR)/
+
+clean:
+	$(MAKE) -C ../../../../../ M=$(CURDIR) clean
+
+verify_cmds: $(CLANG) $(LLC)
+	@for TOOL in $^ ; do \
+		if ! (which -- "$${TOOL}" > /dev/null 2>&1); then \
+			echo "*** ERROR: Cannot find LLVM tool $${TOOL}" ;\
+			exit 1; \
+		else true; fi; \
+	done
+
+verify_target_bpf: verify_cmds
+	@if ! (${LLC} -march=bpf -mattr=help > /dev/null 2>&1); then \
+		echo "*** ERROR: LLVM (${LLC}) does not support 'bpf' target" ;\
+		echo "   NOTICE: LLVM version >= 3.7.1 required" ;\
+		exit 2; \
+	else true; fi
+
+%_kern.c: verify_target_bpf
+
+# asm/sysreg.h - inline assembly used by it is incompatible with llvm.
+# But, there is no easy way to fix it, so just exclude it since it is
+# useless for BPF samples.
+$(obj)/%.o: $(src)/%.c
+	$(CLANG) $(NOSTDINC_FLAGS) $(LINUXINCLUDE) $(EXTRA_CFLAGS) \
+		-D__KERNEL__ -D__ASM_SYSREG_H -Wno-unused-value -Wno-pointer-sign \
+		-Wno-compare-distinct-pointer-types \
+		-Wno-gnu-variable-sized-type-not-at-end \
+		-Wno-tautological-compare \
+		-O2 -emit-llvm -c $< -o -| $(LLC) -march=bpf -filetype=obj -o $@
+
diff --git a/tools/testing/selftests/landlock/rules/README.rst b/tools/testing/selftests/landlock/rules/README.rst
new file mode 120000
index 000000000000..605f48aa6f72
--- /dev/null
+++ b/tools/testing/selftests/landlock/rules/README.rst
@@ -0,0 +1 @@
+../../../../../samples/bpf/README.rst
\ No newline at end of file
diff --git a/tools/testing/selftests/landlock/rules/bpf_helpers.h b/tools/testing/selftests/landlock/rules/bpf_helpers.h
new file mode 120000
index 000000000000..0aa1a521b39a
--- /dev/null
+++ b/tools/testing/selftests/landlock/rules/bpf_helpers.h
@@ -0,0 +1 @@
+../../../../../samples/bpf/bpf_helpers.h
\ No newline at end of file
diff --git a/tools/testing/selftests/landlock/rules/fs1.c b/tools/testing/selftests/landlock/rules/fs1.c
new file mode 100644
index 000000000000..fb3cab7a3116
--- /dev/null
+++ b/tools/testing/selftests/landlock/rules/fs1.c
@@ -0,0 +1,31 @@
+/*
+ * Copyright © 2017 Mickaël Salaün <mic@digikod.net>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2, as
+ * published by the Free Software Foundation.
+ *
+ * Landlock test - Read-only filesystem
+ */
+
+#include <uapi/linux/bpf.h>
+#include "bpf_helpers.h"
+
+SEC("landlock1")
+static int landlock_fs_prog1(struct landlock_context *ctx)
+{
+	if (!(ctx->arg2 & LANDLOCK_ACTION_FS_WRITE))
+		return 0;
+	return 1;
+}
+
+SEC("subtype")
+static union bpf_prog_subtype _subtype = {
+	.landlock_rule = {
+		.version = 1,
+		.event = LANDLOCK_SUBTYPE_EVENT_FS,
+	}
+};
+
+SEC("license")
+static char _license[] = "GPL";
diff --git a/tools/testing/selftests/landlock/rules/fs2.c b/tools/testing/selftests/landlock/rules/fs2.c
new file mode 100644
index 000000000000..d5cb6b9e8c26
--- /dev/null
+++ b/tools/testing/selftests/landlock/rules/fs2.c
@@ -0,0 +1,31 @@
+/*
+ * Copyright © 2017 Mickaël Salaün <mic@digikod.net>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2, as
+ * published by the Free Software Foundation.
+ *
+ * Landlock test - No-open filesystem
+ */
+
+#include <uapi/linux/bpf.h>
+#include "bpf_helpers.h"
+
+SEC("landlock1")
+static int landlock_fs_prog1(struct landlock_context *ctx)
+{
+	if (!(ctx->arg2 & LANDLOCK_ACTION_FS_GET))
+		return 0;
+	return 1;
+}
+
+SEC("subtype")
+static union bpf_prog_subtype _subtype = {
+	.landlock_rule = {
+		.version = 1,
+		.event = LANDLOCK_SUBTYPE_EVENT_FS,
+	}
+};
+
+SEC("license")
+static char _license[] = "GPL";
diff --git a/tools/testing/selftests/landlock/test_fs.c b/tools/testing/selftests/landlock/test_fs.c
new file mode 100644
index 000000000000..3dcc0294324c
--- /dev/null
+++ b/tools/testing/selftests/landlock/test_fs.c
@@ -0,0 +1,347 @@
+/*
+ * Copyright © 2017 Mickaël Salaün <mic@digikod.net>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2, as
+ * published by the Free Software Foundation.
+ *
+ * Tests code for Landlock
+ */
+
+#define _GNU_SOURCE
+#include <errno.h>
+#include <linux/bpf.h>
+#include <linux/filter.h>
+#include <linux/seccomp.h>
+#include <stddef.h>
+#include <string.h>
+#include <sys/prctl.h>
+#include <sys/syscall.h>
+
+#include <fcntl.h> /* open() */
+#include <sys/mount.h>
+#include <sys/stat.h> /* mkdir() */
+#include <sys/mman.h> /* mmap() */
+
+#include "../seccomp/test_harness.h"
+#include "../../../../samples/bpf/bpf_load.h"
+
+#define TMP_PREFIX "tmp_"
+
+#ifndef SECCOMP_ADD_LANDLOCK_RULE
+#define SECCOMP_ADD_LANDLOCK_RULE	2
+#endif
+
+#ifndef seccomp
+static int seccomp(unsigned int op, unsigned int flags, void *args)
+{
+	errno = 0;
+	return syscall(__NR_seccomp, op, flags, args);
+}
+#endif
+
+static unsigned int __step_count = 0;
+
+#define ASSERT_STEP(cond) \
+	{ \
+		step--; \
+		if (!(cond)) \
+			_exit(step); \
+	}
+
+TEST(seccomp_landlock)
+{
+	int ret;
+
+	ret = prctl(PR_SET_NO_NEW_PRIVS, 1, NULL, 0, 0);
+	ASSERT_EQ(0, ret) {
+		TH_LOG("Kernel does not support PR_SET_NO_NEW_PRIVS");
+	}
+	ret = seccomp(SECCOMP_ADD_LANDLOCK_RULE, 0, NULL);
+	EXPECT_EQ(-1, ret);
+	EXPECT_EQ(EFAULT, errno) {
+		TH_LOG("Kernel does not support CONFIG_SECURITY_LANDLOCK");
+	}
+}
+
+struct layout1 {
+	int file_ro;
+	int file_rw;
+	int file_wo;
+};
+
+static void setup_layout1(struct __test_metadata *_metadata,
+		struct layout1 *l1)
+{
+	int fd;
+	char buf[] = "fs1";
+
+	l1->file_ro = -1;
+	l1->file_rw = -1;
+	l1->file_wo = -1;
+
+	fd = open(TMP_PREFIX "file_created",
+			O_CREAT | O_EXCL | O_WRONLY | O_CLOEXEC,
+			S_IRUSR | S_IWUSR);
+	ASSERT_GE(fd, 0);
+	ASSERT_EQ(sizeof(buf), write(fd, buf, sizeof(buf)));
+	ASSERT_EQ(0, close(fd));
+
+	/* mkdir() returns 0 on success, not a file descriptor to close */
+	ASSERT_EQ(0, mkdir(TMP_PREFIX "dir_created",
+				S_IRUSR | S_IWUSR));
+
+	l1->file_ro = open(TMP_PREFIX "file_ro",
+			O_CREAT | O_EXCL | O_WRONLY | O_CLOEXEC,
+			S_IRUSR | S_IWUSR);
+	ASSERT_LE(0, l1->file_ro);
+	ASSERT_EQ(sizeof(buf), write(l1->file_ro, buf, sizeof(buf)));
+	ASSERT_EQ(0, close(l1->file_ro));
+	l1->file_ro = open(TMP_PREFIX "file_ro",
+			O_RDONLY | O_CLOEXEC,
+			S_IRUSR | S_IWUSR);
+	ASSERT_LE(0, l1->file_ro);
+
+	l1->file_rw = open(TMP_PREFIX "file_rw",
+			O_CREAT | O_EXCL | O_RDWR | O_CLOEXEC,
+			S_IRUSR | S_IWUSR);
+	ASSERT_LE(0, l1->file_rw);
+	ASSERT_EQ(sizeof(buf), write(l1->file_rw, buf, sizeof(buf)));
+	ASSERT_EQ(0, lseek(l1->file_rw, 0, SEEK_SET));
+
+	l1->file_wo = open(TMP_PREFIX "file_wo",
+			O_CREAT | O_EXCL | O_WRONLY | O_CLOEXEC,
+			S_IRUSR | S_IWUSR);
+	ASSERT_LE(0, l1->file_wo);
+	ASSERT_EQ(sizeof(buf), write(l1->file_wo, buf, sizeof(buf)));
+	ASSERT_EQ(0, lseek(l1->file_wo, 0, SEEK_SET));
+}
+
+static void cleanup_layout1(void)
+{
+	unlink(TMP_PREFIX "file_created");
+	unlink(TMP_PREFIX "file_ro");
+	unlink(TMP_PREFIX "file_rw");
+	unlink(TMP_PREFIX "file_wo");
+	unlink(TMP_PREFIX "should_not_exist");
+	rmdir(TMP_PREFIX "dir_created");
+}
+
+FIXTURE(rule_fs1) {
+	struct layout1 l1;
+	int prog;
+};
+
+FIXTURE_SETUP(rule_fs1)
+{
+	cleanup_layout1();
+	setup_layout1(_metadata, &self->l1);
+
+	ASSERT_EQ(0, load_bpf_file("rules/fs1.o")) {
+		TH_LOG("%s", bpf_log_buf);
+	}
+	self->prog = prog_fd[0];
+	ASSERT_EQ(0, prctl(PR_SET_NO_NEW_PRIVS, 1, NULL, 0, 0)) {
+		TH_LOG("Kernel does not support PR_SET_NO_NEW_PRIVS");
+	}
+}
+
+FIXTURE_TEARDOWN(rule_fs1)
+{
+	EXPECT_EQ(0, close(self->prog));
+	/* cleanup_layout1() would be denied here */
+}
+
+TEST_F(rule_fs1, load_prog) {}
+
+TEST_F(rule_fs1, read_only_file)
+{
+	int fd;
+	int step = 0;
+	char buf_write[] = "should not be written";
+	char buf_read[2];
+
+	ASSERT_EQ(-1, write(self->l1.file_ro, buf_write, sizeof(buf_write)));
+	ASSERT_EQ(EBADF, errno);
+
+	ASSERT_EQ(-1, read(self->l1.file_wo, buf_read, sizeof(buf_read)));
+	ASSERT_EQ(EBADF, errno);
+
+	ASSERT_EQ(0, seccomp(SECCOMP_ADD_LANDLOCK_RULE, 0, &self->prog)) {
+		TH_LOG("Failed to apply rule fs1: %s", strerror(errno));
+	}
+
+	fd = open(".",
+			O_TMPFILE | O_EXCL | O_RDWR | O_CLOEXEC,
+			S_IRUSR | S_IWUSR);
+	ASSERT_STEP(fd == -1);
+	ASSERT_STEP(errno != EOPNOTSUPP);
+	ASSERT_STEP(errno == EPERM);
+
+	fd = open(TMP_PREFIX "file_created",
+			O_RDONLY | O_CLOEXEC);
+	ASSERT_STEP(fd >= 0);
+	ASSERT_STEP(!close(fd));
+
+	fd = open(TMP_PREFIX "file_created",
+			O_RDWR | O_CLOEXEC);
+	ASSERT_STEP(fd == -1);
+	ASSERT_STEP(errno == EPERM);
+
+	fd = open(TMP_PREFIX "file_created",
+			O_WRONLY | O_CLOEXEC);
+	ASSERT_STEP(fd == -1);
+	ASSERT_STEP(errno == EPERM);
+
+	fd = open(TMP_PREFIX "should_not_exist",
+			O_CREAT | O_EXCL | O_CLOEXEC,
+			S_IRUSR | S_IWUSR);
+	ASSERT_STEP(fd == -1);
+	ASSERT_STEP(errno == EPERM);
+
+	ASSERT_STEP(-1 ==
+			write(self->l1.file_ro, buf_write, sizeof(buf_write)));
+	ASSERT_STEP(errno == EBADF);
+	ASSERT_STEP(sizeof(buf_read) ==
+			read(self->l1.file_ro, buf_read, sizeof(buf_read)));
+
+	ASSERT_STEP(-1 ==
+			write(self->l1.file_rw, buf_write, sizeof(buf_write)));
+	ASSERT_STEP(errno == EPERM);
+	ASSERT_STEP(sizeof(buf_read) ==
+			read(self->l1.file_rw, buf_read, sizeof(buf_read)));
+
+	ASSERT_STEP(-1 == write(self->l1.file_wo, buf_write, sizeof(buf_write)));
+	ASSERT_STEP(errno == EPERM);
+	ASSERT_STEP(-1 == read(self->l1.file_wo, buf_read, sizeof(buf_read)));
+	ASSERT_STEP(errno == EBADF);
+
+	ASSERT_STEP(-1 == unlink(TMP_PREFIX "file_created"));
+	ASSERT_STEP(errno == EPERM);
+	ASSERT_STEP(-1 == rmdir(TMP_PREFIX "dir_created"));
+	ASSERT_STEP(errno == EPERM);
+
+	ASSERT_STEP(0 == close(self->l1.file_ro));
+	ASSERT_STEP(0 == close(self->l1.file_rw));
+	ASSERT_STEP(0 == close(self->l1.file_wo));
+}
+
+TEST_F(rule_fs1, read_only_mount)
+{
+	int step = 0;
+
+	ASSERT_EQ(0, mount(".", TMP_PREFIX "dir_created",
+				NULL, MS_BIND, NULL));
+	ASSERT_EQ(0, umount2(TMP_PREFIX "dir_created", MNT_FORCE));
+
+	ASSERT_EQ(0, seccomp(SECCOMP_ADD_LANDLOCK_RULE, 0, &self->prog)) {
+		TH_LOG("Failed to apply rule fs1: %s", strerror(errno));
+	}
+
+	ASSERT_STEP(-1 == mount(".", TMP_PREFIX "dir_created",
+				NULL, MS_BIND, NULL));
+	ASSERT_STEP(errno == EPERM);
+	ASSERT_STEP(-1 == umount("/"));
+	ASSERT_STEP(errno == EPERM);
+}
+
+TEST_F(rule_fs1, read_only_mem)
+{
+	int step = 0;
+	void *addr;
+
+	addr = mmap(NULL, 1, PROT_READ | PROT_WRITE,
+			MAP_SHARED, self->l1.file_rw, 0);
+	ASSERT_NE(MAP_FAILED, addr);
+	ASSERT_EQ(0, munmap(addr, 1));
+
+	ASSERT_EQ(0, seccomp(SECCOMP_ADD_LANDLOCK_RULE, 0, &self->prog)) {
+		TH_LOG("Failed to apply rule fs1: %s", strerror(errno));
+	}
+
+	addr = mmap(NULL, 1, PROT_READ, MAP_SHARED,
+			self->l1.file_rw, 0);
+	ASSERT_STEP(addr != MAP_FAILED);
+	ASSERT_STEP(-1 == mprotect(addr, 1, PROT_WRITE));
+	ASSERT_STEP(errno == EPERM);
+	ASSERT_STEP(0 == munmap(addr, 1));
+
+	addr = mmap(NULL, 1, PROT_READ | PROT_WRITE, MAP_SHARED,
+			self->l1.file_rw, 0);
+	ASSERT_STEP(addr == MAP_FAILED);
+	ASSERT_STEP(errno == EPERM);
+
+	addr = mmap(NULL, 1, PROT_READ | PROT_WRITE, MAP_PRIVATE,
+			self->l1.file_rw, 0);
+	ASSERT_STEP(addr != MAP_FAILED);
+	ASSERT_STEP(0 == munmap(addr, 1));
+}
+
+FIXTURE(rule_fs2) {
+	struct layout1 l1;
+	int prog;
+};
+
+FIXTURE_SETUP(rule_fs2)
+{
+	cleanup_layout1();
+	setup_layout1(_metadata, &self->l1);
+
+	ASSERT_EQ(0, load_bpf_file("rules/fs2.o")) {
+		TH_LOG("%s", bpf_log_buf);
+	}
+	self->prog = prog_fd[0];
+	ASSERT_EQ(0, prctl(PR_SET_NO_NEW_PRIVS, 1, NULL, 0, 0)) {
+		TH_LOG("Kernel does not support PR_SET_NO_NEW_PRIVS");
+	}
+}
+
+FIXTURE_TEARDOWN(rule_fs2)
+{
+	EXPECT_EQ(0, close(self->prog));
+	cleanup_layout1();
+}
+
+static void landlocked_deny_open(struct __test_metadata *_metadata,
+		struct layout1 *l1)
+{
+	int fd;
+	void *addr;
+
+	fd = open(".", O_DIRECTORY | O_CLOEXEC);
+	ASSERT_EQ(-1, fd);
+	ASSERT_EQ(EPERM, errno);
+
+	addr = mmap(NULL, 1, PROT_READ | PROT_WRITE,
+			MAP_SHARED, l1->file_rw, 0);
+	ASSERT_NE(MAP_FAILED, addr);
+	ASSERT_EQ(0, munmap(addr, 1));
+}
+
+TEST_F(rule_fs2, deny_open_for_hierarchy) {
+	int fd;
+	int status;
+	pid_t child;
+
+	fd = open(".", O_DIRECTORY | O_CLOEXEC);
+	ASSERT_LE(0, fd);
+	ASSERT_EQ(0, close(fd));
+
+	ASSERT_EQ(0, seccomp(SECCOMP_ADD_LANDLOCK_RULE, 0, &self->prog)) {
+		TH_LOG("Failed to apply rule fs2: %s", strerror(errno));
+	}
+
+	landlocked_deny_open(_metadata, &self->l1);
+
+	child = fork();
+	ASSERT_LE(0, child);
+	if (!child) {
+		landlocked_deny_open(_metadata, &self->l1);
+		_exit(1);
+	}
+	ASSERT_EQ(child, waitpid(child, &status, 0));
+	ASSERT_TRUE(WIFEXITED(status));
+	_exit(WEXITSTATUS(status));
+}
+
+TEST_HARNESS_MAIN
-- 
2.11.0

* [PATCH v5 10/10] landlock: Add user and kernel documentation for Landlock
  2017-02-22  1:26 [PATCH v5 00/10] Landlock LSM: Toward unprivileged sandboxing Mickaël Salaün
                   ` (8 preceding siblings ...)
  2017-02-22  1:26 ` [PATCH v5 09/10] bpf,landlock: Add tests for Landlock Mickaël Salaün
@ 2017-02-22  1:26 ` Mickaël Salaün
  2017-02-22  5:21   ` Andy Lutomirski
  9 siblings, 1 reply; 26+ messages in thread
From: Mickaël Salaün @ 2017-02-22  1:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mickaël Salaün, Alexei Starovoitov, Andy Lutomirski,
	Arnaldo Carvalho de Melo, Casey Schaufler, Daniel Borkmann,
	David Drysdale, David S . Miller, Eric W . Biederman,
	James Morris, Jann Horn, Jonathan Corbet, Matthew Garrett,
	Michael Kerrisk, Kees Cook, Paul Moore, Sargun Dhillon,
	Serge E . Hallyn, Shuah Khan, Tejun Heo, Thomas Graf,
	Will Drewry, kernel-hardening, linux-api, linux-security-module,
	netdev

This documentation can be built with the Sphinx framework.

Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: James Morris <james.l.morris@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kees Cook <keescook@chromium.org>
Cc: Serge E. Hallyn <serge@hallyn.com>
---
 Documentation/security/index.rst           |   1 +
 Documentation/security/landlock/index.rst  |  19 ++
 Documentation/security/landlock/kernel.rst | 132 +++++++++++++
 Documentation/security/landlock/user.rst   | 298 +++++++++++++++++++++++++++++
 4 files changed, 450 insertions(+)
 create mode 100644 Documentation/security/landlock/index.rst
 create mode 100644 Documentation/security/landlock/kernel.rst
 create mode 100644 Documentation/security/landlock/user.rst

diff --git a/Documentation/security/index.rst b/Documentation/security/index.rst
index 9bae6bb20e7f..21a5a6b6e666 100644
--- a/Documentation/security/index.rst
+++ b/Documentation/security/index.rst
@@ -5,3 +5,4 @@ Security documentation
 .. toctree::
 
    tpm/index
+   landlock/index
diff --git a/Documentation/security/landlock/index.rst b/Documentation/security/landlock/index.rst
new file mode 100644
index 000000000000..012bb9c2e2cb
--- /dev/null
+++ b/Documentation/security/landlock/index.rst
@@ -0,0 +1,19 @@
+============
+Landlock LSM
+============
+
+Landlock is a stackable Linux Security Module (LSM) that makes it possible to
+create security sandboxes.  This kind of sandbox is expected to help mitigate
+the security impact of bugs or unexpected/malicious behaviors in user-space
+applications.  The current version allows only a process with the global
+CAP_SYS_ADMIN capability to create such sandboxes but the ultimate goal of
+Landlock is to empower any process, including unprivileged ones, to securely
+restrict themselves.  Landlock is inspired by seccomp-bpf but instead of
+filtering syscalls and their raw arguments, a Landlock rule can inspect the use
+of kernel objects like files and hence make a decision according to the kernel
+semantic.
+
+.. toctree::
+
+    user
+    kernel
diff --git a/Documentation/security/landlock/kernel.rst b/Documentation/security/landlock/kernel.rst
new file mode 100644
index 000000000000..d31a7d6574af
--- /dev/null
+++ b/Documentation/security/landlock/kernel.rst
@@ -0,0 +1,132 @@
+==============================
+Landlock: kernel documentation
+==============================
+
+eBPF properties
+===============
+
+To get an expressive language while still being safe and small, Landlock is
+based on eBPF. Landlock should be usable by untrusted processes and must
+therefore expose a minimal attack surface. The eBPF bytecode is minimal,
+powerful, widely used and designed to be used by untrusted applications. Thus,
+reusing the eBPF support in the kernel enables a generic approach while
+minimizing new code.
+
+An eBPF program has access to an eBPF context containing some fields including
+event arguments (i.e. arg1 and arg2). These arguments can be used directly or
+passed to helper functions according to their types. It is then possible to do
+complex access checks without race conditions or inconsistent evaluation (i.e.
+`incorrect mirroring of the OS code and state
+<https://www.internetsociety.org/doc/traps-and-pitfalls-practical-problems-system-call-interposition-based-security-tools>`_).
+
+A Landlock event describes a particular access type.  For now, there is only
+one event type dedicated to filesystem related operations:
+LANDLOCK_SUBTYPE_EVENT_FS.  A Landlock rule is tied to one event type.  This
+makes it possible to statically check context accesses, potentially performed
+by such rule, and hence prevents kernel address leaks and ensures the right use
+of event arguments with eBPF functions.  Any user can add multiple Landlock
+rules per Landlock event.  They are stacked and evaluated one after the other,
+starting from the most recent rule, as seccomp-bpf does with its filters.
+Underneath, an event is an abstraction over a set of LSM hooks.
+
+
+Guiding principles
+==================
+
+Unprivileged use
+----------------
+
+* Everything potentially security sensitive which is exposed to a Landlock
+  rule, through functions or context, shall have an associated ability flag to
+  specify which kind of privilege a process must have to load such a rule.
+* Every ability flag expresses a semantic goal (e.g. debug, process
+  introspection, process modification) potentially tied to a set of
+  capabilities.
+* Landlock helpers and context should be usable by any unprivileged and
+  untrusted rule while following the system security policy enforced by other
+  access control mechanisms (e.g. DAC, LSM).
+
+
+Landlock event and context
+--------------------------
+
+* A Landlock event shall be focused on access control on kernel objects instead
+  of syscall filtering (i.e. syscall arguments), which is the purpose of
+  seccomp-bpf.
+* A Landlock context provided by an event shall express the minimal interface
+  to control an access for a kernel object. This can be achieved by wrapping
+  this raw object (e.g. file, inode, path, dentry) with an abstract
+  representation (i.e. handle) for userland/bpfland.
+* An evolution of a context's field (e.g. new flags in the status field) shall
+  only be activated for a rule if the version specified by the loading thread
+  implies this behavior.  This makes it possible to ensure that the rule code
+  makes sense (e.g. only watch flags which may be activated).
+* An event type shall guarantee that all the BPF function calls from a rule are
+  safe.  Thus, the related Landlock context arguments shall always be of the
+  same type for a particular event type.  For example, a network event could
+  share helpers with a file event because of UNIX sockets.  However, the same
+  helpers may not be compatible for a FS handle and a net handle.
+* Multiple event types may use the same context interface.
+
+
+Landlock helpers
+----------------
+
+* Landlock helpers shall be as generic as possible (i.e. using handles) while
+  at the same time being as simple as possible and following the syscall
+  creation principles (cf.  *Documentation/adding-syscalls.txt*).
+* The only behavior change allowed on a helper is to fix a (logical) bug to
+  match the initial semantic.
+* Helpers shall be reentrant, i.e. only take inputs from arguments (e.g. from
+  the BPF context) or from the current thread, to allow an event type to use a
+  cache.  Future rule options might change this cache behavior (e.g. invalidate
+  cache after some time).
+* It is quite easy to add new helpers to extend Landlock.  The main concern
+  should be about the possibility to leak information from a landlocked process
+  to another (e.g. through maps) to not reproduce the same security sensitive
+  behavior as ptrace(2).
+
+
+Rule addition and propagation
+=============================
+
+See :ref:`Documentation/security/landlock/user <inherited_rules>` for the
+intended goal of rule propagation.
+
+Structure definitions
+---------------------
+
+.. kernel-doc:: include/linux/landlock.h
+
+
+Functions for rule addition
+---------------------------
+
+.. kernel-doc:: security/landlock/manager.c
+
+
+Questions and answers
+=====================
+
+Why not create a custom event type for each kind of action?
+-----------------------------------------------------------
+
+Landlock rules can handle these checks.  Adding more exceptions to the kernel
+code would lead to more code complexity.  A decision to ignore a kind of action
+can and should be done at the beginning of a Landlock rule.
+
+
+Why does a rule not return an errno code?
+-----------------------------------------
+
+seccomp filters can return multiple kinds of codes, including an errno value,
+which may be convenient for access control.  Those return codes are hardwired
+in the userland ABI.  Instead, the Landlock approach is to return a boolean to
+allow or deny an action, which is much simpler and more generic.  Moreover, we
+do not really have a choice because, unlike seccomp, Landlock rules are not
+enforced at the syscall entry point but may be executed at any point in the
+kernel (through LSM hooks) where an errno return code may not make sense.
+However, with this simple ABI and with the ability to call helpers, Landlock
+may gain features similar to seccomp-bpf in the future while being compatible
+with previous rules.
+
diff --git a/Documentation/security/landlock/user.rst b/Documentation/security/landlock/user.rst
new file mode 100644
index 000000000000..3219aa1ed3a0
--- /dev/null
+++ b/Documentation/security/landlock/user.rst
@@ -0,0 +1,298 @@
+================================
+Landlock: userland documentation
+================================
+
+Landlock rules
+==============
+
+eBPF programs are used to create security rules.  They are contained and can
+call only a whitelist of dedicated functions. Moreover, they cannot loop, which
+protects from denial of service.  More information on BPF can be found in
+*Documentation/networking/filter.txt*.
+
+
+Writing a rule
+--------------
+
+To enforce a security policy, a thread first needs to create a Landlock rule.
+The easiest way to write an eBPF program depicting a security rule is to write
+it in the C language.  As described in *samples/bpf/README.rst*, LLVM can
+compile such programs.  Files *samples/bpf/landlock1_kern.c* and those in
+*tools/testing/selftests/landlock/rules/* can be used as examples.  The
+following example is a simple rule to forbid file creation, whatever syscall
+may be used (e.g. open, mkdir, link...).
+
+.. code-block:: c
+
+    static int deny_file_creation(struct landlock_context *ctx)
+    {
+        if (ctx->arg2 & LANDLOCK_ACTION_FS_NEW)
+            return 1;
+        return 0;
+    }
+
+Once the eBPF program is created, the next step is to create the metadata
+describing the Landlock rule.  This metadata includes a subtype which contains
+the version of Landlock, the event to which the rule is tied, and optional
+Landlock rule abilities.
+
+.. code-block:: c
+
+    static union bpf_prog_subtype subtype = {
+        .landlock_rule = {
+            .version = 1,
+            .event = LANDLOCK_SUBTYPE_EVENT_FS,
+        }
+    };
+
+The Landlock version is important to inform the kernel which features or
+behavior the rule can handle.  The user-space thread should set the lowest
+possible version to be as compatible as possible with older kernels.  For the
+list of features provided by version, see :ref:`features`.
+
+A Landlock event describes the kind of kernel object for which a rule will be
+triggered to allow or deny an action.  For example, the event
+LANDLOCK_SUBTYPE_EVENT_FS is triggered every time a landlocked thread performs
+an action related to the filesystem (e.g. open, read, write, mount...).
+
+The Landlock rule abilities should only be used if the rule needs a specific
+feature such as debugging.  This should be avoided if not strictly necessary.
+
+The next step is to fill a :c:type:`union bpf_attr <bpf_attr>` with
+BPF_PROG_TYPE_LANDLOCK, the previously created subtype and other BPF program
+metadata.  This bpf_attr must then be passed to the bpf(2) syscall alongside
+the BPF_PROG_LOAD command.  If everything is deemed correct by the kernel, the
+thread gets a file descriptor referring to this rule.
+
+In the following code, the *insn* variable is an array of BPF instructions
+which can be extracted from an ELF file as is done in load_bpf_file() from
+*samples/bpf/bpf_load.c*.
+
+.. code-block:: c
+
+    union bpf_attr attr = {
+        .prog_type = BPF_PROG_TYPE_LANDLOCK,
+        .insn_cnt = sizeof(insn) / sizeof(struct bpf_insn),
+        .insns = (__u64) (unsigned long) insn,
+        .license = (__u64) (unsigned long) "GPL",
+        .prog_subtype = &subtype,
+    };
+    int rule = bpf(BPF_PROG_LOAD, &attr, sizeof(attr));
+    if (rule == -1)
+        exit(1);
+
+
+Enforcing a rule
+----------------
+
+Once the Landlock rule has been created or received (e.g. through a UNIX
+socket), the thread willing to sandbox itself (and its future children) needs
+to perform two steps to properly enforce a rule.
+
+The thread must first request to never be allowed to get new privileges with a
+call to prctl(2) and the PR_SET_NO_NEW_PRIVS option.  More information can be
+found in *Documentation/prctl/no_new_privs.txt*.
+
+.. code-block:: c
+
+    if (prctl(PR_SET_NO_NEW_PRIVS, 1, NULL, 0, 0))
+        exit(1);
+
+A thread can apply a rule to itself by using the seccomp(2) syscall.  The
+operation is SECCOMP_ADD_LANDLOCK_RULE, the flags must be empty and the *args*
+argument must point to a valid Landlock rule file descriptor.
+
+.. code-block:: c
+
+    if (seccomp(SECCOMP_ADD_LANDLOCK_RULE, 0, &rule))
+        exit(1);
+
+If the syscall succeeds, the rule is now enforced on the calling thread and
+will be enforced on all the children it subsequently creates as well.  Once a
+thread is landlocked, there is no way to remove this security policy; only
+stacking more restrictions is allowed.
+
+
+.. _inherited_rules:
+
+Inherited rules
+---------------
+
+Every new thread resulting from a clone(2) inherits Landlock rule restrictions
+from its parent.  This is comparable to the seccomp inheritance as described in
+*Documentation/prctl/seccomp_filter.txt*, but differs for rule addition.
+
+If a thread adds a rule for a particular event, then all its future children
+and their progeny will inherit all the rules from the same event, whether any
+of those rules were added before or after the fork.  This allows a thread to
+share its security policy with its children and further restrict them over
+time.  If a thread wants its future rules to be propagated, it must then create
+at least one rule tied to the same event before any fork.
+
+
+.. _features:
+
+Landlock features
+=================
+
+In order to support new features over time without changing a rule's behavior,
+every context field, flag or helper has a minimal Landlock version in which
+it is available.  A thread needs to specify this minimal version number in
+the subtype :c:type:`struct landlock_rule <landlock_rule>` defined in
+*include/uapi/linux/bpf.h*.
+
+
+Context
+-------
+
+The arch and syscall_nr fields may be useful to tighten an access control, but
+care must be taken to avoid pitfalls as explained in
+*Documentation/prctl/seccomp_filter.txt*.
+
+.. kernel-doc:: include/uapi/linux/bpf.h
+    :functions: landlock_context
+
+
+Landlock event types
+--------------------
+
+.. kernel-doc:: include/uapi/linux/bpf.h
+    :functions: landlock_subtype_event
+
+.. flat-table:: Event types availability
+
+    * - flags
+      - since
+
+    * - LANDLOCK_SUBTYPE_EVENT_FS
+      - v1
+
+
+File system access request
+--------------------------
+
+Optional arguments from :c:type:`struct landlock_context <landlock_context>`:
+
+* arg1: filesystem handle
+* arg2: action type
+
+
+File system action types
+------------------------
+
+Flags are used to express actions.  This makes it possible to compose actions
+and leaves room for future improvements to add more fine-grained action types.
+
+.. kernel-doc:: include/uapi/linux/bpf.h
+    :doc: landlock_action_fs
+
+.. flat-table:: FS action types availability
+
+    * - flags
+      - since
+
+    * - LANDLOCK_ACTION_FS_EXEC
+      - v1
+
+    * - LANDLOCK_ACTION_FS_WRITE
+      - v1
+
+    * - LANDLOCK_ACTION_FS_READ
+      - v1
+
+    * - LANDLOCK_ACTION_FS_NEW
+      - v1
+
+    * - LANDLOCK_ACTION_FS_GET
+      - v1
+
+    * - LANDLOCK_ACTION_FS_REMOVE
+      - v1
+
+    * - LANDLOCK_ACTION_FS_IOCTL
+      - v1
+
+    * - LANDLOCK_ACTION_FS_LOCK
+      - v1
+
+    * - LANDLOCK_ACTION_FS_FCNTL
+      - v1
+
+
+Ability types
+-------------
+
+The ability of a Landlock rule describes the available features (i.e. context
+fields and helpers).  This is useful to abstract user-space privileges for
+Landlock rules, which may not need all abilities (e.g. debug).  Only the
+minimal set of abilities should be used (e.g. disable debug once in
+production).
+
+
+.. kernel-doc:: include/uapi/linux/bpf.h
+    :doc: landlock_subtype_ability
+
+.. flat-table:: Ability types availability
+
+    * - flags
+      - since
+      - capability
+
+    * - LANDLOCK_SUBTYPE_ABILITY_WRITE
+      - v1
+      - CAP_SYS_ADMIN
+
+    * - LANDLOCK_SUBTYPE_ABILITY_DEBUG
+      - v1
+      - CAP_SYS_ADMIN
+
+
+Helper functions
+----------------
+
+See *include/uapi/linux/bpf.h* for functions documentation.
+
+.. flat-table:: Generic functions availability
+
+    * - helper
+      - since
+      - ability
+
+    * - bpf_map_lookup_elem
+      - v1
+      - (none)
+
+    * - bpf_map_delete_elem
+      - v1
+      - LANDLOCK_SUBTYPE_ABILITY_WRITE
+
+    * - bpf_map_update_elem
+      - v1
+      - LANDLOCK_SUBTYPE_ABILITY_WRITE
+
+    * - bpf_get_current_comm
+      - v1
+      - LANDLOCK_SUBTYPE_ABILITY_DEBUG
+
+    * - bpf_get_current_pid_tgid
+      - v1
+      - LANDLOCK_SUBTYPE_ABILITY_DEBUG
+
+    * - bpf_get_current_uid_gid
+      - v1
+      - LANDLOCK_SUBTYPE_ABILITY_DEBUG
+
+    * - bpf_get_trace_printk
+      - v1
+      - LANDLOCK_SUBTYPE_ABILITY_DEBUG
+
+.. flat-table:: File system functions availability
+
+    * - helper
+      - since
+      - ability
+
+    * - bpf_handle_fs_get_mode
+      - v1
+      - (none)
+
-- 
2.11.0

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v5 10/10] landlock: Add user and kernel documentation for Landlock
  2017-02-22  1:26 ` [PATCH v5 10/10] landlock: Add user and kernel documentation " Mickaël Salaün
@ 2017-02-22  5:21   ` Andy Lutomirski
  2017-02-22  7:43     ` Mickaël Salaün
  0 siblings, 1 reply; 26+ messages in thread
From: Andy Lutomirski @ 2017-02-22  5:21 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: linux-kernel, Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Casey Schaufler, Daniel Borkmann, David Drysdale,
	David S . Miller, Eric W . Biederman, James Morris, Jann Horn,
	Jonathan Corbet, Matthew Garrett, Michael Kerrisk, Kees Cook,
	Paul Moore, Sargun Dhillon, Serge E . Hallyn, Shuah Khan,
	Tejun Heo, Thomas Graf, Will Drewry, kernel-hardening, Linux API,
	LSM List, Network Development

On Tue, Feb 21, 2017 at 5:26 PM, Mickaël Salaün <mic@digikod.net> wrote:
> This documentation can be built with the Sphinx framework.
>
> Signed-off-by: Mickaël Salaün <mic@digikod.net>
> Cc: Alexei Starovoitov <ast@kernel.org>
> Cc: Andy Lutomirski <luto@amacapital.net>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> Cc: David S. Miller <davem@davemloft.net>
> Cc: James Morris <james.l.morris@oracle.com>
> Cc: Jonathan Corbet <corbet@lwn.net>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Serge E. Hallyn <serge@hallyn.com>


> +
> +Writing a rule
> +--------------
> +
> +To enforce a security policy, a thread first needs to create a Landlock rule.
> +The easiest way to write an eBPF program depicting a security rule is to write
> +it in the C language.  As described in *samples/bpf/README.rst*, LLVM can
> +compile such programs.  Files *samples/bpf/landlock1_kern.c* and those in
> +*tools/testing/selftests/landlock/rules/* can be used as examples.  The
> +following example is a simple rule to forbid file creation, whatever syscall
> +may be used (e.g. open, mkdir, link...).
> +
> +.. code-block:: c
> +
> +    static int deny_file_creation(struct landlock_context *ctx)
> +    {
> +        if (ctx->arg2 & LANDLOCK_ACTION_FS_NEW)
> +            return 1;
> +        return 0;
> +    }
> +

Would it make sense to define landlock_context (or at least a prefix
thereof) in here?  Also, can't "arg2" have a better name?

Can you specify what the return value means?  Are 0 and 1 the only
choices?  Would "KILL" be useful?  How about "COREDUMP"?

> +File system action types
> +------------------------
> +
> +Flags are used to express actions.  This makes it possible to compose actions
> +and leaves room for future improvements to add more fine-grained action types.
> +
> +.. kernel-doc:: include/uapi/linux/bpf.h
> +    :doc: landlock_action_fs
> +
> +.. flat-table:: FS action types availability
> +
> +    * - flags
> +      - since
> +
> +    * - LANDLOCK_ACTION_FS_EXEC
> +      - v1
> +
> +    * - LANDLOCK_ACTION_FS_WRITE
> +      - v1
> +
> +    * - LANDLOCK_ACTION_FS_READ
> +      - v1
> +
> +    * - LANDLOCK_ACTION_FS_NEW
> +      - v1
> +
> +    * - LANDLOCK_ACTION_FS_GET
> +      - v1
> +
> +    * - LANDLOCK_ACTION_FS_REMOVE
> +      - v1
> +
> +    * - LANDLOCK_ACTION_FS_IOCTL
> +      - v1
> +
> +    * - LANDLOCK_ACTION_FS_LOCK
> +      - v1
> +
> +    * - LANDLOCK_ACTION_FS_FCNTL
> +      - v1

What happens if you run an old program on a new kernel?  Can you get
unexpected action types?

> +
> +
> +Ability types
> +-------------
> +
> +The ability of a Landlock rule describes the available features (i.e. context
> +fields and helpers).  This is useful to abstract user-space privileges for
> +Landlock rules, which may not need all abilities (e.g. debug).  Only the
> +minimal set of abilities should be used (e.g. disable debug once in
> +production).
> +
> +
> +.. kernel-doc:: include/uapi/linux/bpf.h
> +    :doc: landlock_subtype_ability
> +
> +.. flat-table:: Ability types availability
> +
> +    * - flags
> +      - since
> +      - capability
> +
> +    * - LANDLOCK_SUBTYPE_ABILITY_WRITE
> +      - v1
> +      - CAP_SYS_ADMIN
> +
> +    * - LANDLOCK_SUBTYPE_ABILITY_DEBUG
> +      - v1
> +      - CAP_SYS_ADMIN
> +

What do "WRITE" and "DEBUG" mean in this context?  I'm totally lost.

Hmm.  Reading below, "WRITE" seems to mean "modify state".  Would that
be accurate?

> +
> +Helper functions
> +----------------
> +
> +See *include/uapi/linux/bpf.h* for functions documentation.
> +
> +.. flat-table:: Generic functions availability
> +

> +
> +    * - bpf_get_current_comm
> +      - v1
> +      - LANDLOCK_SUBTYPE_ABILITY_DEBUG

What would this be used for?

> +    * - bpf_get_trace_printk
> +      - v1
> +      - LANDLOCK_SUBTYPE_ABILITY_DEBUG
> +

This is different from the other DEBUG stuff in that it has side
effects.  I wonder if it should have a different flag.

--Andy

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v5 10/10] landlock: Add user and kernel documentation for Landlock
  2017-02-22  5:21   ` Andy Lutomirski
@ 2017-02-22  7:43     ` Mickaël Salaün
  0 siblings, 0 replies; 26+ messages in thread
From: Mickaël Salaün @ 2017-02-22  7:43 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: linux-kernel, Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Casey Schaufler, Daniel Borkmann, David Drysdale,
	David S . Miller, Eric W . Biederman, James Morris, Jann Horn,
	Jonathan Corbet, Matthew Garrett, Michael Kerrisk, Kees Cook,
	Paul Moore, Sargun Dhillon, Serge E . Hallyn, Shuah Khan,
	Tejun Heo, Thomas Graf, Will Drewry, kernel-hardening, Linux API,
	LSM List, Network Development

[-- Attachment #1.1: Type: text/plain, Size: 4995 bytes --]


On 22/02/2017 06:21, Andy Lutomirski wrote:
> On Tue, Feb 21, 2017 at 5:26 PM, Mickaël Salaün <mic@digikod.net> wrote:
>> This documentation can be built with the Sphinx framework.
>>
>> Signed-off-by: Mickaël Salaün <mic@digikod.net>
>> Cc: Alexei Starovoitov <ast@kernel.org>
>> Cc: Andy Lutomirski <luto@amacapital.net>
>> Cc: Daniel Borkmann <daniel@iogearbox.net>
>> Cc: David S. Miller <davem@davemloft.net>
>> Cc: James Morris <james.l.morris@oracle.com>
>> Cc: Jonathan Corbet <corbet@lwn.net>
>> Cc: Kees Cook <keescook@chromium.org>
>> Cc: Serge E. Hallyn <serge@hallyn.com>
> 
> 
>> +
>> +Writing a rule
>> +--------------
>> +
>> +To enforce a security policy, a thread first needs to create a Landlock rule.
>> +The easiest way to write an eBPF program depicting a security rule is to write
>> +it in the C language.  As described in *samples/bpf/README.rst*, LLVM can
>> +compile such programs.  Files *samples/bpf/landlock1_kern.c* and those in
>> +*tools/testing/selftests/landlock/rules/* can be used as examples.  The
>> +following example is a simple rule to forbid file creation, whatever syscall
>> +may be used (e.g. open, mkdir, link...).
>> +
>> +.. code-block:: c
>> +
>> +    static int deny_file_creation(struct landlock_context *ctx)
>> +    {
>> +        if (ctx->arg2 & LANDLOCK_ACTION_FS_NEW)
>> +            return 1;
>> +        return 0;
>> +    }
>> +
> 
> Would it make sense to define landlock_context (or at least a prefix
> thereof) in here?  Also, can't "arg2" have a better name?

arg2 is a generic name. Its meaning depends on the Landlock event, here
it is an action bitfield (FS event).

> 
> Can you specify what the return value means?  Are 0 and 1 the only
> choices?  Would "KILL" be useful?  How about "COREDUMP"?

This is explained thereafter and in the kernel Q&A section. I need to
briefly introduce that here.
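
A minimal user-space sketch of that convention, inferred from the
example above: a rule returns 0 to allow the action and a non-zero
value to deny it.  The struct and the flag values below are
hypothetical mocks (the real definitions are in
include/uapi/linux/bpf.h of this series and are not reproduced here);
only the allow/deny logic mirrors the documented rule.

```c
#include <stdint.h>

/* Hypothetical bit positions, chosen for illustration only. */
#define MOCK_ACTION_FS_READ (1ULL << 2)
#define MOCK_ACTION_FS_NEW  (1ULL << 3)

/* Reduced mock of the rule context: only the two event arguments. */
struct mock_landlock_context {
	uint64_t arg1; /* FS event: filesystem handle */
	uint64_t arg2; /* FS event: bitfield of requested actions */
};

/* Same logic as the documented rule: non-zero denies, zero allows. */
static int deny_file_creation(const struct mock_landlock_context *ctx)
{
	if (ctx->arg2 & MOCK_ACTION_FS_NEW)
		return 1; /* deny: the request includes a creation */
	return 0;         /* allow everything else */
}
```

Since arg2 is a bitfield, a single request may carry several actions
at once; the rule above denies it as soon as the creation bit is set,
whatever else is requested.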

> 
>> +File system action types
>> +------------------------
>> +
>> +Flags are used to express actions.  This makes it possible to compose actions
>> +and leaves room for future improvements to add more fine-grained action types.
>> +
>> +.. kernel-doc:: include/uapi/linux/bpf.h
>> +    :doc: landlock_action_fs
>> +
>> +.. flat-table:: FS action types availability
>> +
>> +    * - flags
>> +      - since
>> +
>> +    * - LANDLOCK_ACTION_FS_EXEC
>> +      - v1
>> +
>> +    * - LANDLOCK_ACTION_FS_WRITE
>> +      - v1
>> +
>> +    * - LANDLOCK_ACTION_FS_READ
>> +      - v1
>> +
>> +    * - LANDLOCK_ACTION_FS_NEW
>> +      - v1
>> +
>> +    * - LANDLOCK_ACTION_FS_GET
>> +      - v1
>> +
>> +    * - LANDLOCK_ACTION_FS_REMOVE
>> +      - v1
>> +
>> +    * - LANDLOCK_ACTION_FS_IOCTL
>> +      - v1
>> +
>> +    * - LANDLOCK_ACTION_FS_LOCK
>> +      - v1
>> +
>> +    * - LANDLOCK_ACTION_FS_FCNTL
>> +      - v1
> 
> What happens if you run an old program on a new kernel?  Can you get
> unexpected action types?

The old flags will still make sense, the new ones should be ignored by
the rule.
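
As a sketch of what "ignored by the rule" could look like in practice,
assuming a rule only ever tests the action bits it was written
against: any bit added by a later Landlock version falls outside the
mask and does not change the decision.  The flag values and the
MOCK_ACTION_FS_FUTURE name are hypothetical, not taken from the
series.

```c
#include <stdint.h>

/* Hypothetical bit positions, chosen for illustration only. */
#define MOCK_ACTION_FS_WRITE  (1ULL << 1)
#define MOCK_ACTION_FS_NEW    (1ULL << 3)
#define MOCK_ACTION_FS_REMOVE (1ULL << 5)
/* Stand-in for an action type a later version might introduce. */
#define MOCK_ACTION_FS_FUTURE (1ULL << 40)

/* Actions known to the rule author when the rule was written. */
#define MOCK_DENIED_ACTIONS \
	(MOCK_ACTION_FS_WRITE | MOCK_ACTION_FS_NEW | MOCK_ACTION_FS_REMOVE)

/* Read-only policy: deny (1) only on known modifying actions. */
static int read_only_rule(uint64_t actions)
{
	return (actions & MOCK_DENIED_ACTIONS) ? 1 : 0;
}
```

The flip side of this design is that a rule written as a deny-list
lets through any action type it does not know about, so an old rule
on a new kernel is no stricter than it was when it was written.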

> 
>> +
>> +
>> +Ability types
>> +-------------
>> +
>> +The ability of a Landlock rule describes the available features (i.e. context
>> +fields and helpers).  This is useful to abstract user-space privileges for
>> +Landlock rules, which may not need all abilities (e.g. debug).  Only the
>> +minimal set of abilities should be used (e.g. disable debug once in
>> +production).
>> +
>> +
>> +.. kernel-doc:: include/uapi/linux/bpf.h
>> +    :doc: landlock_subtype_ability
>> +
>> +.. flat-table:: Ability types availability
>> +
>> +    * - flags
>> +      - since
>> +      - capability
>> +
>> +    * - LANDLOCK_SUBTYPE_ABILITY_WRITE
>> +      - v1
>> +      - CAP_SYS_ADMIN
>> +
>> +    * - LANDLOCK_SUBTYPE_ABILITY_DEBUG
>> +      - v1
>> +      - CAP_SYS_ADMIN
>> +
> 
> What do "WRITE" and "DEBUG" mean in this context?  I'm totally lost.
> 
> Hmm.  Reading below, "WRITE" seems to mean "modify state".  Would that
> be accurate?

That is correct, but handling a state in a safe way implies more than only
the ability to "write" outside bpfland (e.g. sequential execution).

> 
>> +
>> +Helper functions
>> +----------------
>> +
>> +See *include/uapi/linux/bpf.h* for functions documentation.
>> +
>> +.. flat-table:: Generic functions availability
>> +
> 
>> +
>> +    * - bpf_get_current_comm
>> +      - v1
>> +      - LANDLOCK_SUBTYPE_ABILITY_DEBUG
> 
> What would this be used for?

To get more information about the process which triggers an action?

> 
>> +    * - bpf_get_trace_printk
>> +      - v1
>> +      - LANDLOCK_SUBTYPE_ABILITY_DEBUG
>> +
> 
> This is different from the other DEBUG stuff in that it has side
> effects.  I wonder if it should have a different flag.

I think the debug flag is a clear warning to not ship a rule using this
ability. Maybe a sub-flag LANDLOCK_SUBTYPE_ABILITY_DEBUG_PRINT would fit
here?

 Mickaël


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v5 07/10] bpf: Add a Landlock sandbox example
  2017-02-22  1:26 ` [PATCH v5 07/10] bpf: Add a Landlock sandbox example Mickaël Salaün
@ 2017-02-23 22:13   ` Mickaël Salaün
  0 siblings, 0 replies; 26+ messages in thread
From: Mickaël Salaün @ 2017-02-23 22:13 UTC (permalink / raw)
  To: linux-kernel
  Cc: Alexei Starovoitov, Andy Lutomirski, Arnaldo Carvalho de Melo,
	Casey Schaufler, Daniel Borkmann, David Drysdale,
	David S . Miller, Eric W . Biederman, James Morris, Jann Horn,
	Jonathan Corbet, Matthew Garrett, Michael Kerrisk, Kees Cook,
	Paul Moore, Sargun Dhillon, Serge E . Hallyn, Shuah Khan,
	Tejun Heo, Thomas Graf, Will Drewry, kernel-hardening, linux-api,
	linux-security-module, netdev

[-- Attachment #1.1: Type: text/plain, Size: 2720 bytes --]


On 22/02/2017 02:26, Mickaël Salaün wrote:
> Add a basic sandbox tool to create a process isolated from some part of
> the system. This sandbox create a read-only environment. It is only
> allowed to write to a character device such as a TTY:
> 
>   # :> X
>   # echo $?
>   0
>   # ./samples/bpf/landlock1 /bin/sh -i
>   Launching a new sandboxed process.
>   # :> Y
>   cannot create Y: Operation not permitted
> 
> Changes since v4:
> * write Landlock rule in C and compiled it with LLVM
> * remove cgroup handling
> * remove path handling: only handle a read-only environment
> * remove errno return codes
> 
> Changes since v3:
> * remove seccomp and origin field: completely free from seccomp programs
> * handle more FS-related hooks
> * handle inode hooks and directory traversal
> * add faked but consistent view thanks to ENOENT
> * add /lib64 in the example
> * fix spelling
> * rename some types and definitions (e.g. SECCOMP_ADD_LANDLOCK_RULE)
> 
> Changes since v2:
> * use BPF_PROG_ATTACH for cgroup handling
> 
> Signed-off-by: Mickaël Salaün <mic@digikod.net>
> Cc: Alexei Starovoitov <ast@kernel.org>
> Cc: Andy Lutomirski <luto@amacapital.net>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> Cc: David S. Miller <davem@davemloft.net>
> Cc: James Morris <james.l.morris@oracle.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Serge E. Hallyn <serge@hallyn.com>
> ---
>  samples/bpf/.gitignore       |  32 ++++++++++++++
>  samples/bpf/Makefile         |   4 ++
>  samples/bpf/bpf_load.c       |  26 +++++++++--
>  samples/bpf/landlock1_kern.c |  46 +++++++++++++++++++
>  samples/bpf/landlock1_user.c | 102 +++++++++++++++++++++++++++++++++++++++++++
>  5 files changed, 206 insertions(+), 4 deletions(-)
>  create mode 100644 samples/bpf/.gitignore
>  create mode 100644 samples/bpf/landlock1_kern.c
>  create mode 100644 samples/bpf/landlock1_user.c
> 
> diff --git a/samples/bpf/.gitignore b/samples/bpf/.gitignore
> new file mode 100644
> index 000000000000..a7562a5ef4c2
> --- /dev/null
> +++ b/samples/bpf/.gitignore
> @@ -0,0 +1,32 @@
> +fds_example
> +lathist
> +lwt_len_hist
> +map_perf_test
> +offwaketime
> +sampleip
> +sockex1
> +sockex2
> +sockex3
> +sock_example
> +spintest
> +tc_l2_redirect
> +test_cgrp2_array_pin
> +test_cgrp2_attach
> +test_cgrp2_attach2
> +test_cgrp2_sock
> +test_cgrp2_sock2
> +test_current_task_under_cgroup
> +test_lru_dist
> +test_overhead
> +test_probe_write_user
> +trace_event
> +trace_output
> +tracex1
> +tracex2
> +tracex3
> +tracex4
> +tracex5
> +tracex6
> +xdp1
> +xdp2
> +xdp_tx_iptunnel

Please ignore this hunk, it was part of another patch series…



* Re: [PATCH v5 06/10] seccomp,landlock: Handle Landlock events per process hierarchy
  2017-02-22  1:26 ` [PATCH v5 06/10] seccomp,landlock: Handle Landlock events per process hierarchy Mickaël Salaün
@ 2017-02-28 20:01   ` Andy Lutomirski
  2017-03-01 22:14     ` Mickaël Salaün
  2017-03-02 10:22   ` [kernel-hardening] " Djalal Harouni
  1 sibling, 1 reply; 26+ messages in thread
From: Andy Lutomirski @ 2017-02-28 20:01 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: linux-kernel, Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Casey Schaufler, Daniel Borkmann, David Drysdale,
	David S . Miller, Eric W . Biederman, James Morris, Jann Horn,
	Jonathan Corbet, Matthew Garrett, Michael Kerrisk, Kees Cook,
	Paul Moore, Sargun Dhillon, Serge E . Hallyn, Shuah Khan,
	Tejun Heo, Thomas Graf, Will Drewry, kernel-hardening, Linux API,
	LSM List, Network Development, Andrew Morton

On Tue, Feb 21, 2017 at 5:26 PM, Mickaël Salaün <mic@digikod.net> wrote:
> The seccomp(2) syscall can be used to apply a Landlock rule to the
> current process. As with a seccomp filter, the Landlock rule is enforced
> for all its future children. An inherited rule tree can be updated
> (append-only) by the owner of inherited Landlock nodes (e.g. a parent
> process that creates a new rule)

Can you clarify exactly what this type of update does?  Is it
something that should be supported by normal seccomp rules as well?

> +/**
> + * landlock_run_prog - run Landlock program for a syscall

Unless this is actually specific to syscalls, s/for a syscall//, perhaps?

> +               if (new_events->nodes[event_idx]->owner ==
> +                               &new_events->nodes[event_idx]) {
> +                       /* We are the owner, we can then update the node. */
> +                       add_landlock_rule(new_events, rule);

This is the part I don't get.  Adding a rule if you're the owner (BTW,
why is ownership visible to userspace at all?) for just yourself and
future children is very different from adding it so it applies to
preexisting children too.


> +               } else if (atomic_read(&current_events->usage) == 1) {
> +                       WARN_ON(new_events->nodes[event_idx]->owner);
> +                       /*
> +                        * We can become the new owner if no other task use it.
> +                        * This avoid an unnecessary allocation.
> +                        */
> +                       new_events->nodes[event_idx]->owner =
> +                               &new_events->nodes[event_idx];
> +                       add_landlock_rule(new_events, rule);
> +               } else {
> +                       /*
> +                        * We are not the owner, we need to fork current_events
> +                        * and then add a new node.
> +                        */
> +                       struct landlock_node *node;
> +                       size_t i;
> +
> +                       node = kmalloc(sizeof(*node), GFP_KERNEL);
> +                       if (!node) {
> +                               new_events = ERR_PTR(-ENOMEM);
> +                               goto put_rule;
> +                       }
> +                       atomic_set(&node->usage, 1);
> +                       /* set the previous node after the new_events
> +                        * allocation */
> +                       node->prev = NULL;
> +                       /* do not increment the previous node usage */
> +                       node->owner = &new_events->nodes[event_idx];
> +                       /* rule->prev is already NULL */
> +                       atomic_set(&rule->usage, 1);
> +                       node->rule = rule;
> +
> +                       new_events = new_raw_landlock_events();
> +                       if (IS_ERR(new_events)) {
> +                               /* put the rule as well */
> +                               put_landlock_node(node);
> +                               return ERR_PTR(-ENOMEM);
> +                       }
> +                       for (i = 0; i < ARRAY_SIZE(new_events->nodes); i++) {
> +                               new_events->nodes[i] =
> +                                       lockless_dereference(
> +                                                       current_events->nodes[i]);
> +                               if (i == event_idx)
> +                                       node->prev = new_events->nodes[i];
> +                               if (!WARN_ON(!new_events->nodes[i]))
> +                                       atomic_inc(&new_events->nodes[i]->usage);
> +                       }
> +                       new_events->nodes[event_idx] = node;
> +
> +                       /*
> +                        * @current_events will not be freed here because it's usage
> +                        * field is > 1. It is only prevented to be freed by another
> +                        * subject thanks to the caller of landlock_append_prog() which
> +                        * should be locked if needed.
> +                        */
> +                       put_landlock_events(current_events);
> +               }
> +       }
> +       return new_events;
> +
> +put_prog:
> +       bpf_prog_put(prog);
> +       return new_events;
> +
> +put_rule:
> +       put_landlock_rule(rule);
> +       return new_events;
> +}
> +
> +/**
> + * landlock_seccomp_append_prog - attach a Landlock rule to the current process
> + *
> + * current->seccomp.landlock_events is lazily allocated. When a process fork,
> + * only a pointer is copied. When a new event is added by a process, if there
> + * is other references to this process' landlock_events, then a new allocation
> + * is made to contains an array pointing to Landlock rule lists. This design
> + * has low-performance impact and is memory efficient while keeping the
> + * property of append-only rules.
> + *
> + * @flags: not used for now, but could be used for TSYNC
> + * @user_bpf_fd: file descriptor pointing to a loaded Landlock rule
> + */
> +#ifdef CONFIG_SECCOMP_FILTER
> +int landlock_seccomp_append_prog(unsigned int flags, const char __user *user_bpf_fd)
> +{
> +       struct landlock_events *new_events;
> +       struct bpf_prog *prog;
> +       int bpf_fd;
> +
> +       /* force no_new_privs to limit privilege escalation */
> +       if (!task_no_new_privs(current))
> +               return -EPERM;
> +       /* will be removed in the future to allow unprivileged tasks */
> +       if (!capable(CAP_SYS_ADMIN))
> +               return -EPERM;
> +       if (!user_bpf_fd)
> +               return -EFAULT;
> +       if (flags)
> +               return -EINVAL;
> +       if (copy_from_user(&bpf_fd, user_bpf_fd, sizeof(bpf_fd)))
> +               return -EFAULT;
> +       prog = bpf_prog_get(bpf_fd);
> +       if (IS_ERR(prog))
> +               return PTR_ERR(prog);
> +
> +       /*
> +        * We don't need to lock anything for the current process hierarchy,
> +        * everything is guarded by the atomic counters.
> +        */
> +       new_events = landlock_append_prog(current->seccomp.landlock_events, prog);

Do you need to check that it's the right *kind* of bpf prog or is that
handled elsewhere?

--Andy


* Re: [PATCH v5 03/10] bpf: Define handle_fs and add a new helper bpf_handle_fs_get_mode()
  2017-02-22  1:26 ` [PATCH v5 03/10] bpf: Define handle_fs and add a new helper bpf_handle_fs_get_mode() Mickaël Salaün
@ 2017-03-01  9:32   ` James Morris
  2017-03-01 22:20     ` Mickaël Salaün
  0 siblings, 1 reply; 26+ messages in thread
From: James Morris @ 2017-03-01  9:32 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: linux-kernel, Alexei Starovoitov, Andy Lutomirski,
	Arnaldo Carvalho de Melo, Casey Schaufler, Daniel Borkmann,
	David Drysdale, David S . Miller, Eric W . Biederman,
	James Morris, Jann Horn, Jonathan Corbet, Matthew Garrett,
	Michael Kerrisk, Kees Cook, Paul Moore, Sargun Dhillon,
	Serge E . Hallyn, Shuah Khan, Tejun Heo, Thomas Graf,
	Will Drewry, kernel-hardening, linux-api, linux-security-module,
	netdev



On Wed, 22 Feb 2017, Mickaël Salaün wrote:

> Add an eBPF function bpf_handle_fs_get_mode(handle_fs) to get the mode
> of an abstract object wrapping either a file, a dentry, a path, or an
> inode.
> 
> Changes since v4:
> * use a file abstraction (handle) to wrap inode, dentry, path and file
>   structs

Good to see these abstractions.  As discussed at LPC, we need to ensure 
that we don't couple the Landlock API too closely with the LSM API, as the 
former is an ABI exposed to userland -- we don't want to lose the ability 
to change LSM internally due to breaking Landlock policies.

> @@ -82,6 +87,8 @@ enum bpf_arg_type {
>  
>  	ARG_PTR_TO_CTX,		/* pointer to context */
>  	ARG_ANYTHING,		/* any (initialized) argument is ok */
> +
> +	ARG_CONST_PTR_TO_HANDLE_FS,	/* pointer to an abstract FS struct */
>  };

Extraneous whitespace?


-- 
James Morris
<jmorris@namei.org>


* Re: [PATCH v5 06/10] seccomp,landlock: Handle Landlock events per process hierarchy
  2017-02-28 20:01   ` Andy Lutomirski
@ 2017-03-01 22:14     ` Mickaël Salaün
  2017-03-01 22:20       ` Andy Lutomirski
  0 siblings, 1 reply; 26+ messages in thread
From: Mickaël Salaün @ 2017-03-01 22:14 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: linux-kernel, Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Casey Schaufler, Daniel Borkmann, David Drysdale,
	David S . Miller, Eric W . Biederman, James Morris, Jann Horn,
	Jonathan Corbet, Matthew Garrett, Michael Kerrisk, Kees Cook,
	Paul Moore, Sargun Dhillon, Serge E . Hallyn, Shuah Khan,
	Tejun Heo, Thomas Graf, Will Drewry, kernel-hardening, Linux API,
	LSM List, Network Development, Andrew Morton



On 28/02/2017 21:01, Andy Lutomirski wrote:
> On Tue, Feb 21, 2017 at 5:26 PM, Mickaël Salaün <mic@digikod.net> wrote:
>> The seccomp(2) syscall can be used to apply a Landlock rule to the
>> current process. As with a seccomp filter, the Landlock rule is enforced
>> for all its future children. An inherited rule tree can be updated
>> (append-only) by the owner of inherited Landlock nodes (e.g. a parent
>> process that creates a new rule)
> 
> Can you clarify exactly what this type of update does?  Is it
> something that should be supported by normal seccomp rules as well?

There are two main structures involved here: struct landlock_node and
struct landlock_rule, both defined in include/linux/landlock.h [02/10].

Let's take an example with seccomp filter and then Landlock:
* seccomp filter: Process P1 creates and applies a seccomp filter F1 to
itself. Then it forks and creates a child P2, which inherits P1's
filters, hence F1. Now, if P1 adds a new seccomp filter F2 to itself, P2
*won't get it*. P2's filter list will still contain only F1, not
F2. If P2 sets up and applies a new filter F3 to itself, its filter list
will contain F1 and F3.
* Landlock: Process P1 creates and applies a Landlock rule R1 to itself.
Underneath, the kernel creates a new node N1 dedicated to P1, which
contains all its rules. Then P1 forks and creates a child P2, which
inherits P1's rules, hence R1. Underneath, P2 inherits N1. Now, if P1
adds a new Landlock rule R2 to itself, P2 *will get it* as well (because
R2 is part of N1). If P2 creates and applies a new rule R3 to itself,
its rules will contain R1, R2 and R3. Underneath, the kernel creates a
new node N2 for P2, which only contains R3 but inherits/links to N1.

This design makes it possible for a process to add more constraints to
its children on the fly. I think it is a good feature to have and a
safer default inheritance mechanism, but it could be guarded by an
option flag if we want both mechanisms to be available. The same design
could be used by seccomp filter too.
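
To make the Landlock half of the example above concrete, here is a
minimal userspace model of it. This is only a sketch, not the code from
the patch: the model_* names, the fixed-size rule array and the use of
the task's address as the owner are assumptions made for illustration.
A task holds a pointer to a node, fork shares that pointer, the owner
appends in place, and a non-owner gets a new node linked to the
inherited one.

```c
/* Userspace model only: none of these names exist in the kernel. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_MAX_RULES 8

/* A node holds an append-only list of rule ids and links to its parent. */
struct model_node {
	int rules[MODEL_MAX_RULES];
	size_t nr_rules;
	struct model_node *prev;	/* inherited node, or NULL */
	const void *owner;		/* the only task allowed to append here */
};

/* A task only holds a pointer to the most recent node it is subject to. */
struct model_task {
	struct model_node *node;
};

struct model_node *model_new_node(struct model_node *prev, const void *owner)
{
	struct model_node *node = calloc(1, sizeof(*node));

	if (!node)
		abort();
	node->prev = prev;
	node->owner = owner;
	return node;
}

/* fork(): the child shares the parent's node, nothing is copied. */
void model_fork(const struct model_task *parent, struct model_task *child)
{
	child->node = parent->node;
}

/* Append in place if the task owns its node, else start a new node. */
void model_add_rule(struct model_task *task, int rule)
{
	if (!task->node || task->node->owner != task)
		task->node = model_new_node(task->node, task);
	assert(task->node->nr_rules < MODEL_MAX_RULES);
	task->node->rules[task->node->nr_rules++] = rule;
}

/* Return 1 if the rule applies to the task, walking up the node chain. */
int model_has_rule(const struct model_task *task, int rule)
{
	const struct model_node *node;
	size_t i;

	for (node = task->node; node; node = node->prev)
		for (i = 0; i < node->nr_rules; i++)
			if (node->rules[i] == rule)
				return 1;
	return 0;
}
```

With P1 adding R1, forking P2, then adding R2, model_has_rule() reports
R2 for P2 because both tasks still point to N1. A rule added by P2 lands
in a new node whose prev is N1, so P1 never sees it.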


> 
>> +/**
>> + * landlock_run_prog - run Landlock program for a syscall
> 
> Unless this is actually specific to syscalls, s/for a syscall//, perhaps?

Right, it is not specific to syscalls anymore.

> 
>> +               if (new_events->nodes[event_idx]->owner ==
>> +                               &new_events->nodes[event_idx]) {
>> +                       /* We are the owner, we can then update the node. */
>> +                       add_landlock_rule(new_events, rule);
> 
> This is the part I don't get.  Adding a rule if you're the owner (BTW,
> why is ownership visible to userspace at all?) for just yourself and
> future children is very different from adding it so it applies to
> preexisting children too.

Node ownership is not (directly) visible to userspace.

The current inheritance mechanism doesn't make it possible to add a rule
to only the current process. The rule will be inherited by its children
(starting from the children created after the first applied rule). An
option flag NEW_RULE_HIERARCHY (or maybe another seccomp operation)
could create a new node for the current process, so that the rule is
not inherited by the previous children.


> 
> 
>> +               } else if (atomic_read(&current_events->usage) == 1) {
>> +                       WARN_ON(new_events->nodes[event_idx]->owner);
>> +                       /*
>> +                        * We can become the new owner if no other task use it.
>> +                        * This avoid an unnecessary allocation.
>> +                        */
>> +                       new_events->nodes[event_idx]->owner =
>> +                               &new_events->nodes[event_idx];
>> +                       add_landlock_rule(new_events, rule);
>> +               } else {
>> +                       /*
>> +                        * We are not the owner, we need to fork current_events
>> +                        * and then add a new node.
>> +                        */
>> +                       struct landlock_node *node;
>> +                       size_t i;
>> +
>> +                       node = kmalloc(sizeof(*node), GFP_KERNEL);
>> +                       if (!node) {
>> +                               new_events = ERR_PTR(-ENOMEM);
>> +                               goto put_rule;
>> +                       }
>> +                       atomic_set(&node->usage, 1);
>> +                       /* set the previous node after the new_events
>> +                        * allocation */
>> +                       node->prev = NULL;
>> +                       /* do not increment the previous node usage */
>> +                       node->owner = &new_events->nodes[event_idx];
>> +                       /* rule->prev is already NULL */
>> +                       atomic_set(&rule->usage, 1);
>> +                       node->rule = rule;
>> +
>> +                       new_events = new_raw_landlock_events();
>> +                       if (IS_ERR(new_events)) {
>> +                               /* put the rule as well */
>> +                               put_landlock_node(node);
>> +                               return ERR_PTR(-ENOMEM);
>> +                       }
>> +                       for (i = 0; i < ARRAY_SIZE(new_events->nodes); i++) {
>> +                               new_events->nodes[i] =
>> +                                       lockless_dereference(
>> +                                                       current_events->nodes[i]);
>> +                               if (i == event_idx)
>> +                                       node->prev = new_events->nodes[i];
>> +                               if (!WARN_ON(!new_events->nodes[i]))
>> +                                       atomic_inc(&new_events->nodes[i]->usage);
>> +                       }
>> +                       new_events->nodes[event_idx] = node;
>> +
>> +                       /*
>> +                        * @current_events will not be freed here because it's usage
>> +                        * field is > 1. It is only prevented to be freed by another
>> +                        * subject thanks to the caller of landlock_append_prog() which
>> +                        * should be locked if needed.
>> +                        */
>> +                       put_landlock_events(current_events);
>> +               }
>> +       }
>> +       return new_events;
>> +
>> +put_prog:
>> +       bpf_prog_put(prog);
>> +       return new_events;
>> +
>> +put_rule:
>> +       put_landlock_rule(rule);
>> +       return new_events;
>> +}
>> +
>> +/**
>> + * landlock_seccomp_append_prog - attach a Landlock rule to the current process
>> + *
>> + * current->seccomp.landlock_events is lazily allocated. When a process fork,
>> + * only a pointer is copied. When a new event is added by a process, if there
>> + * is other references to this process' landlock_events, then a new allocation
>> + * is made to contains an array pointing to Landlock rule lists. This design
>> + * has low-performance impact and is memory efficient while keeping the
>> + * property of append-only rules.
>> + *
>> + * @flags: not used for now, but could be used for TSYNC
>> + * @user_bpf_fd: file descriptor pointing to a loaded Landlock rule
>> + */
>> +#ifdef CONFIG_SECCOMP_FILTER
>> +int landlock_seccomp_append_prog(unsigned int flags, const char __user *user_bpf_fd)
>> +{
>> +       struct landlock_events *new_events;
>> +       struct bpf_prog *prog;
>> +       int bpf_fd;
>> +
>> +       /* force no_new_privs to limit privilege escalation */
>> +       if (!task_no_new_privs(current))
>> +               return -EPERM;
>> +       /* will be removed in the future to allow unprivileged tasks */
>> +       if (!capable(CAP_SYS_ADMIN))
>> +               return -EPERM;
>> +       if (!user_bpf_fd)
>> +               return -EFAULT;
>> +       if (flags)
>> +               return -EINVAL;
>> +       if (copy_from_user(&bpf_fd, user_bpf_fd, sizeof(bpf_fd)))
>> +               return -EFAULT;
>> +       prog = bpf_prog_get(bpf_fd);
>> +       if (IS_ERR(prog))
>> +               return PTR_ERR(prog);
>> +
>> +       /*
>> +        * We don't need to lock anything for the current process hierarchy,
>> +        * everything is guarded by the atomic counters.
>> +        */
>> +       new_events = landlock_append_prog(current->seccomp.landlock_events, prog);
> 
> Do you need to check that it's the right *kind* of bpf prog or is that
> handled elsewhere?

The program type is checked at the beginning of landlock_append_prog().

 Mickaël



* Re: [PATCH v5 06/10] seccomp,landlock: Handle Landlock events per process hierarchy
  2017-03-01 22:14     ` Mickaël Salaün
@ 2017-03-01 22:20       ` Andy Lutomirski
  2017-03-01 23:28         ` Mickaël Salaün
  0 siblings, 1 reply; 26+ messages in thread
From: Andy Lutomirski @ 2017-03-01 22:20 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: linux-kernel, Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Casey Schaufler, Daniel Borkmann, David Drysdale,
	David S . Miller, Eric W . Biederman, James Morris, Jann Horn,
	Jonathan Corbet, Matthew Garrett, Michael Kerrisk, Kees Cook,
	Paul Moore, Sargun Dhillon, Serge E . Hallyn, Shuah Khan,
	Tejun Heo, Thomas Graf, Will Drewry, kernel-hardening, Linux API,
	LSM List, Network Development, Andrew Morton

On Wed, Mar 1, 2017 at 2:14 PM, Mickaël Salaün <mic@digikod.net> wrote:
>
> On 28/02/2017 21:01, Andy Lutomirski wrote:
>> On Tue, Feb 21, 2017 at 5:26 PM, Mickaël Salaün <mic@digikod.net> wrote:
>>> The seccomp(2) syscall can be used to apply a Landlock rule to the
>>> current process. As with a seccomp filter, the Landlock rule is enforced
>>> for all its future children. An inherited rule tree can be updated
>>> (append-only) by the owner of inherited Landlock nodes (e.g. a parent
>>> process that creates a new rule)
>>
>> Can you clarify exactly what this type of update does?  Is it
>> something that should be supported by normal seccomp rules as well?
>
> There are two main structures involved here: struct landlock_node and
> struct landlock_rule, both defined in include/linux/landlock.h [02/10].
>
> Let's take an example with seccomp filter and then Landlock:
> * seccomp filter: Process P1 creates and applies a seccomp filter F1 to
> itself. Then it forks and creates a child P2, which inherits P1's
> filters, hence F1. Now, if P1 adds a new seccomp filter F2 to itself, P2
> *won't get it*. P2's filter list will still contain only F1, not
> F2. If P2 sets up and applies a new filter F3 to itself, its filter list
> will contain F1 and F3.
> * Landlock: Process P1 creates and applies a Landlock rule R1 to itself.
> Underneath, the kernel creates a new node N1 dedicated to P1, which
> contains all its rules. Then P1 forks and creates a child P2, which
> inherits P1's rules, hence R1. Underneath, P2 inherits N1. Now, if P1
> adds a new Landlock rule R2 to itself, P2 *will get it* as well (because
> R2 is part of N1). If P2 creates and applies a new rule R3 to itself,
> its rules will contain R1, R2 and R3. Underneath, the kernel creates a
> new node N2 for P2, which only contains R3 but inherits/links to N1.
>
> This design makes it possible for a process to add more constraints to
> its children on the fly. I think it is a good feature to have and a
> safer default inheritance mechanism, but it could be guarded by an
> option flag if we want both mechanisms to be available. The same design
> could be used by seccomp filter too.
>

Then let's do it right.

Currently each task has an array of seccomp filter layers.  When a
task forks, the child inherits the layers.  All the layers are
presently immutable.  With Landlock, a layer can logically be a
syscall filter layer or a Landlock layer.  This fits into the
existing model just fine.

If we want to have an interface to allow modification of an existing
layer, let's make it so that, when a layer is added, you have to
specify a flag to make the layer modifiable (by current, presumably,
although I can imagine other policies down the road).  Then have a
separate API that modifies a layer.

IOW, I think your patch is bad for three reasons, all fixable:

1. The default is wrong.  A layer should be immutable to avoid an easy
attack in which you try to sandbox *yourself* and then you just modify
the layer to weaken it.

2. The API that adds a layer should be different from the API that
modifies a layer.

3. The whole modification mechanism should be a separate patch to be
reviewed on its own merits.
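
Roughly, the split I have in mind could look like the following
userspace sketch. Everything in it is hypothetical and made up for
illustration: the sketch_* names, the SKETCH_LAYER_MODIFIABLE flag, the
fixed-size arrays and the choice of errno values. It also assumes that
"modifying" a layer only ever appends a rule to it.

```c
/* Userspace sketch only: hypothetical names, not an existing kernel API. */
#include <errno.h>
#include <stddef.h>

#define SKETCH_MAX_LAYERS 4
#define SKETCH_MAX_RULES 4
#define SKETCH_LAYER_MODIFIABLE 0x1u	/* hypothetical opt-in flag */

struct sketch_layer {
	unsigned int flags;
	int rules[SKETCH_MAX_RULES];
	size_t nr_rules;
};

struct sketch_stack {
	struct sketch_layer layers[SKETCH_MAX_LAYERS];
	size_t nr_layers;
};

/* "Add a layer": returns the new layer index or a negative errno. */
int sketch_add_layer(struct sketch_stack *stack, unsigned int flags,
		     int first_rule)
{
	struct sketch_layer *layer;

	if (flags & ~SKETCH_LAYER_MODIFIABLE)
		return -EINVAL;
	if (stack->nr_layers == SKETCH_MAX_LAYERS)
		return -E2BIG;
	layer = &stack->layers[stack->nr_layers];
	layer->flags = flags;
	layer->rules[0] = first_rule;
	layer->nr_rules = 1;
	return (int)stack->nr_layers++;
}

/* "Change a layer": append-only, and only if the layer opted in. */
int sketch_append_to_layer(struct sketch_stack *stack, size_t idx, int rule)
{
	struct sketch_layer *layer;

	if (idx >= stack->nr_layers)
		return -ENOENT;
	layer = &stack->layers[idx];
	if (!(layer->flags & SKETCH_LAYER_MODIFIABLE))
		return -EPERM;	/* immutable by default */
	if (layer->nr_rules == SKETCH_MAX_RULES)
		return -E2BIG;
	layer->rules[layer->nr_rules++] = rule;
	return 0;
}
```

A layer added with flags == 0 rejects every later append with -EPERM,
so the default stays immutable, and the two operations never share an
entry point.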

> The current inheritance mechanism doesn't make it possible to add a rule
> to only the current process. The rule will be inherited by its children
> (starting from the children created after the first applied rule). An
> option flag NEW_RULE_HIERARCHY (or maybe another seccomp operation)
> could create a new node for the current process, so that the rule is
> not inherited by the previous children.

I like my proposal above much better.  "Add a layer" and "change a
layer" should be different operations.

--Andy


* Re: [PATCH v5 03/10] bpf: Define handle_fs and add a new helper bpf_handle_fs_get_mode()
  2017-03-01  9:32   ` James Morris
@ 2017-03-01 22:20     ` Mickaël Salaün
  0 siblings, 0 replies; 26+ messages in thread
From: Mickaël Salaün @ 2017-03-01 22:20 UTC (permalink / raw)
  To: James Morris
  Cc: linux-kernel, Alexei Starovoitov, Andy Lutomirski,
	Arnaldo Carvalho de Melo, Casey Schaufler, Daniel Borkmann,
	David Drysdale, David S . Miller, Eric W . Biederman,
	James Morris, Jann Horn, Jonathan Corbet, Matthew Garrett,
	Michael Kerrisk, Kees Cook, Paul Moore, Sargun Dhillon,
	Serge E . Hallyn, Shuah Khan, Tejun Heo, Thomas Graf,
	Will Drewry, kernel-hardening, linux-api, linux-security-module,
	netdev



On 01/03/2017 10:32, James Morris wrote:
> On Wed, 22 Feb 2017, Mickaël Salaün wrote:
> 
>> Add an eBPF function bpf_handle_fs_get_mode(handle_fs) to get the mode
>> of an abstract object wrapping either a file, a dentry, a path, or an
>> inode.
>>
>> Changes since v4:
>> * use a file abstraction (handle) to wrap inode, dentry, path and file
>>   structs
> 
> Good to see these abstractions.  As discussed at LPC, we need to ensure 
> that we don't couple the Landlock API too closely with the LSM API, as the 
> former is an ABI exposed to userland -- we don't want to lose the ability 
> to change LSM internally due to breaking Landlock policies.

Right, it is the case now, especially with the Landlock events.

> 
>> @@ -82,6 +87,8 @@ enum bpf_arg_type {
>>  
>>  	ARG_PTR_TO_CTX,		/* pointer to context */
>>  	ARG_ANYTHING,		/* any (initialized) argument is ok */
>> +
>> +	ARG_CONST_PTR_TO_HANDLE_FS,	/* pointer to an abstract FS struct */
>>  };
> 
> Extraneous whitespace?

It is on purpose, following the same rules as used for this enum.

 Mickaël



* Re: [PATCH v5 06/10] seccomp,landlock: Handle Landlock events per process hierarchy
  2017-03-01 22:20       ` Andy Lutomirski
@ 2017-03-01 23:28         ` Mickaël Salaün
  2017-03-02 16:36           ` Andy Lutomirski
  0 siblings, 1 reply; 26+ messages in thread
From: Mickaël Salaün @ 2017-03-01 23:28 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: linux-kernel, Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Casey Schaufler, Daniel Borkmann, David Drysdale,
	David S . Miller, Eric W . Biederman, James Morris, Jann Horn,
	Jonathan Corbet, Matthew Garrett, Michael Kerrisk, Kees Cook,
	Paul Moore, Sargun Dhillon, Serge E . Hallyn, Shuah Khan,
	Tejun Heo, Thomas Graf, Will Drewry, kernel-hardening, Linux API,
	LSM List, Network Development, Andrew Morton




On 01/03/2017 23:20, Andy Lutomirski wrote:
> On Wed, Mar 1, 2017 at 2:14 PM, Mickaël Salaün <mic@digikod.net> wrote:
>>
>> On 28/02/2017 21:01, Andy Lutomirski wrote:
>>> On Tue, Feb 21, 2017 at 5:26 PM, Mickaël Salaün <mic@digikod.net> wrote:
>>>> The seccomp(2) syscall can be used to apply a Landlock rule to the
>>>> current process. As with a seccomp filter, the Landlock rule is enforced
>>>> for all its future children. An inherited rule tree can be updated
>>>> (append-only) by the owner of inherited Landlock nodes (e.g. a parent
>>>> process that creates a new rule)
>>>
>>> Can you clarify exactly what this type of update does?  Is it
>>> something that should be supported by normal seccomp rules as well?
>>
>> There are two main structures involved here: struct landlock_node and
>> struct landlock_rule, both defined in include/linux/landlock.h [02/10].
>>
>> Let's take an example with seccomp filter and then Landlock:
>> * seccomp filter: Process P1 creates and applies a seccomp filter F1 to
>> itself. Then it forks and creates a child P2, which inherits P1's
>> filters, hence F1. Now, if P1 adds a new seccomp filter F2 to itself, P2
>> *won't get it*. P2's filter list will still contain only F1, not
>> F2. If P2 sets up and applies a new filter F3 to itself, its filter list
>> will contain F1 and F3.
>> * Landlock: Process P1 creates and applies a Landlock rule R1 to itself.
>> Underneath, the kernel creates a new node N1 dedicated to P1, which
>> contains all its rules. Then P1 forks and creates a child P2, which
>> inherits P1's rules, hence R1. Underneath, P2 inherits N1. Now, if P1
>> adds a new Landlock rule R2 to itself, P2 *will get it* as well (because
>> R2 is part of N1). If P2 creates and applies a new rule R3 to itself,
>> its rules will contain R1, R2 and R3. Underneath, the kernel creates a
>> new node N2 for P2, which only contains R3 but inherits/links to N1.
>>
>> This design makes it possible for a process to add more constraints to
>> its children on the fly. I think it is a good feature to have and a
>> safer default inheritance mechanism, but it could be guarded by an
>> option flag if we want both mechanisms to be available. The same design
>> could be used by seccomp filter too.
>>
> 
> Then let's do it right.
> 
> Currently each task has an array of seccomp filter layers.  When a
> task forks, the child inherits the layers.  All the layers are
> presently immutable.  With Landlock, a layer can logically be a
> syscall filter layer or a Landlock layer.  This fits into the
> existing model just fine.
> 
> If we want to have an interface to allow modification of an existing
> layer, let's make it so that, when a layer is added, you have to
> specify a flag to make the layer modifiable (by current, presumably,
> although I can imagine other policies down the road).  Then have a
> separate API that modifies a layer.
> 
> IOW, I think your patch is bad for three reasons, all fixable:
> 
> 1. The default is wrong.  A layer should be immutable to avoid an easy
> attack in which you try to sandbox *yourself* and then you just modify
> the layer to weaken it.

This is not possible: there is only one operation for now,
SECCOMP_ADD_LANDLOCK_RULE. You can only add more rules to the list (as
for seccomp filter). There is no way to weaken a sandbox. The question
is: how do we want to handle the rules *tree* (from the kernel point of
view)?

> 
> 2. The API that adds a layer should be different from the API that
> modifies a layer.

Right, but it doesn't apply now because we can only add rules.

> 
> 3. The whole modification mechanism should be a separate patch to be
> reviewed on its own merits.

For a rule *replacement*, sure!

> 
>> The current inheritance mechanism doesn't make it possible to add a rule
>> to only the current process. The rule will be inherited by its children
>> (starting from the children created after the first applied rule). An
>> option flag NEW_RULE_HIERARCHY (or maybe another seccomp operation)
>> could create a new node for the current process, so that the rule is
>> not inherited by the previous children.
> 
> I like my proposal above much better.  "Add a layer" and "change a
> layer" should be different operations.

I agree, but for now it's about how to handle immutable (but growing)
inherited rules.



* Re: [kernel-hardening] [PATCH v5 06/10] seccomp,landlock: Handle Landlock events per process hierarchy
  2017-02-22  1:26 ` [PATCH v5 06/10] seccomp,landlock: Handle Landlock events per process hierarchy Mickaël Salaün
  2017-02-28 20:01   ` Andy Lutomirski
@ 2017-03-02 10:22   ` Djalal Harouni
  2017-03-03  0:54     ` Mickaël Salaün
  1 sibling, 1 reply; 26+ messages in thread
From: Djalal Harouni @ 2017-03-02 10:22 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: linux-kernel, Alexei Starovoitov, Andy Lutomirski,
	Arnaldo Carvalho de Melo, Casey Schaufler, Daniel Borkmann,
	David Drysdale, David S . Miller, Eric W . Biederman,
	James Morris, Jann Horn, Jonathan Corbet, Matthew Garrett,
	Michael Kerrisk, Kees Cook, Paul Moore, Sargun Dhillon,
	Serge E . Hallyn, Shuah Khan, Tejun Heo, Thomas Graf,
	Will Drewry, kernel-hardening, Linux API, LSM List, netdev,
	Andrew Morton

On Wed, Feb 22, 2017 at 2:26 AM, Mickaël Salaün <mic@digikod.net> wrote:
> The seccomp(2) syscall can be used to apply a Landlock rule to the
> current process. As with a seccomp filter, the Landlock rule is enforced
> for all its future children. An inherited rule tree can be updated
> (append-only) by the owner of inherited Landlock nodes (e.g. a parent
> process that creates a new rule). However, an intermediate task, which
> did not create a rule, will not be able to update its children's rules.
>
> Landlock rules can be tied to a Landlock event. When such an event is
> triggered, a tree of rules can be evaluated. This kind of tree is
> created with a first node.  This node references a list of rules and an
> optional parent node. Each rule returns a 32-bit value which can
> interrupt the evaluation with a non-zero value. If every rule returned
> zero, the evaluation continues with the rule list of the parent node,
> until the end of the tree.
>
> Changes since v4:
> * merge manager and seccomp patches
> * return -EFAULT in seccomp(2) when user_bpf_fd is null to easily check
>   if Landlock is supported
> * only allow a process with the global CAP_SYS_ADMIN to use Landlock
>   (will be lifted in the future)
> * add an early check to exit as soon as possible if the current process
>   does not have Landlock rules
>
> Changes since v3:
> * remove the hard link with seccomp (suggested by Andy Lutomirski and
>   Kees Cook):
>   * remove the cookie which could imply multiple evaluation of Landlock
>     rules
>   * remove the origin field in struct landlock_data
> * remove documentation fix (merged upstream)
> * rename the new seccomp command to SECCOMP_ADD_LANDLOCK_RULE
> * internal renaming
> * split commit
> * new design to be able to inherit on the fly the parent rules
>
> Changes since v2:
> * Landlock programs can now be run without seccomp filter but for any
>   syscall (from the process) or interruption
> * move Landlock related functions and structs into security/landlock/*
>   (to manage cgroups as well)
> * fix seccomp filter handling: run Landlock programs for each of their
>   legitimate seccomp filter
> * properly clean up all seccomp results
> * cosmetic changes to ease the understanding
> * fix some ifdef
>
> Signed-off-by: Mickaël Salaün <mic@digikod.net>
> Cc: Alexei Starovoitov <ast@kernel.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Andy Lutomirski <luto@amacapital.net>
> Cc: James Morris <james.l.morris@oracle.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Serge E. Hallyn <serge@hallyn.com>
> Cc: Will Drewry <wad@chromium.org>
> ---
>  include/linux/seccomp.h      |   8 ++
>  include/uapi/linux/seccomp.h |   1 +
>  kernel/fork.c                |  14 +-
>  kernel/seccomp.c             |   8 ++
>  security/landlock/Makefile   |   2 +-
>  security/landlock/hooks.c    |  42 +++++-
>  security/landlock/manager.c  | 321 +++++++++++++++++++++++++++++++++++++++++++
>  7 files changed, 392 insertions(+), 4 deletions(-)
>  create mode 100644 security/landlock/manager.c
>
> diff --git a/include/linux/seccomp.h b/include/linux/seccomp.h
> index e25aee2cdfc0..9a38de3c0e72 100644
> --- a/include/linux/seccomp.h
> +++ b/include/linux/seccomp.h
> @@ -10,6 +10,10 @@
>  #include <linux/thread_info.h>
>  #include <asm/seccomp.h>
>
> +#if defined(CONFIG_SECCOMP_FILTER) && defined(CONFIG_SECURITY_LANDLOCK)
> +struct landlock_events;
> +#endif /* CONFIG_SECCOMP_FILTER && CONFIG_SECURITY_LANDLOCK */
> +
>  struct seccomp_filter;
>  /**
>   * struct seccomp - the state of a seccomp'ed process
> @@ -18,6 +22,7 @@ struct seccomp_filter;
>   *         system calls available to a process.
>   * @filter: must always point to a valid seccomp-filter or NULL as it is
>   *          accessed without locking during system call entry.
> + * @landlock_events: contains an array of Landlock rules.
>   *
>   *          @filter must only be accessed from the context of current as there
>   *          is no read locking.
> @@ -25,6 +30,9 @@ struct seccomp_filter;
>  struct seccomp {
>         int mode;
>         struct seccomp_filter *filter;
> +#if defined(CONFIG_SECCOMP_FILTER) && defined(CONFIG_SECURITY_LANDLOCK)
> +       struct landlock_events *landlock_events;
> +#endif /* CONFIG_SECCOMP_FILTER && CONFIG_SECURITY_LANDLOCK */
>  };
>
>  #ifdef CONFIG_HAVE_ARCH_SECCOMP_FILTER
> diff --git a/include/uapi/linux/seccomp.h b/include/uapi/linux/seccomp.h
> index 0f238a43ff1e..56dd692cddac 100644
> --- a/include/uapi/linux/seccomp.h
> +++ b/include/uapi/linux/seccomp.h
> @@ -13,6 +13,7 @@
>  /* Valid operations for seccomp syscall. */
>  #define SECCOMP_SET_MODE_STRICT        0
>  #define SECCOMP_SET_MODE_FILTER        1
> +#define SECCOMP_ADD_LANDLOCK_RULE      2
>
>  /* Valid flags for SECCOMP_SET_MODE_FILTER */
>  #define SECCOMP_FILTER_FLAG_TSYNC      1
> diff --git a/kernel/fork.c b/kernel/fork.c
> index a4f0d0e8aeb2..bd5c72dffe60 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -37,6 +37,7 @@
>  #include <linux/security.h>
>  #include <linux/hugetlb.h>
>  #include <linux/seccomp.h>
> +#include <linux/landlock.h>
>  #include <linux/swap.h>
>  #include <linux/syscalls.h>
>  #include <linux/jiffies.h>
> @@ -515,7 +516,10 @@ static struct task_struct *dup_task_struct(struct task_struct *orig, int node)
>          * the usage counts on the error path calling free_task.
>          */
>         tsk->seccomp.filter = NULL;
> -#endif
> +#ifdef CONFIG_SECURITY_LANDLOCK
> +       tsk->seccomp.landlock_events = NULL;
> +#endif /* CONFIG_SECURITY_LANDLOCK */
> +#endif /* CONFIG_SECCOMP */
>
>         setup_thread_stack(tsk, orig);
>         clear_user_return_notifier(tsk);
> @@ -1388,7 +1392,13 @@ static void copy_seccomp(struct task_struct *p)
>
>         /* Ref-count the new filter user, and assign it. */
>         get_seccomp_filter(current);
> -       p->seccomp = current->seccomp;
> +       p->seccomp.mode = current->seccomp.mode;
> +       p->seccomp.filter = current->seccomp.filter;
> +#if defined(CONFIG_SECCOMP_FILTER) && defined(CONFIG_SECURITY_LANDLOCK)
> +       p->seccomp.landlock_events = current->seccomp.landlock_events;
> +       if (p->seccomp.landlock_events)
> +               atomic_inc(&p->seccomp.landlock_events->usage);
> +#endif /* CONFIG_SECCOMP_FILTER && CONFIG_SECURITY_LANDLOCK */
>
>         /*
>          * Explicitly enable no_new_privs here in case it got set
> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> index 06f2f3ee454c..ef412d95ff5d 100644
> --- a/kernel/seccomp.c
> +++ b/kernel/seccomp.c
> @@ -32,6 +32,7 @@
>  #include <linux/security.h>
>  #include <linux/tracehook.h>
>  #include <linux/uaccess.h>
> +#include <linux/landlock.h>
>
>  /**
>   * struct seccomp_filter - container for seccomp BPF programs
> @@ -492,6 +493,9 @@ static void put_seccomp_filter(struct seccomp_filter *filter)
>  void put_seccomp(struct task_struct *tsk)
>  {
>         put_seccomp_filter(tsk->seccomp.filter);
> +#ifdef CONFIG_SECURITY_LANDLOCK
> +       put_landlock_events(tsk->seccomp.landlock_events);
> +#endif /* CONFIG_SECURITY_LANDLOCK */
>  }
>
>  /**
> @@ -796,6 +800,10 @@ static long do_seccomp(unsigned int op, unsigned int flags,
>                 return seccomp_set_mode_strict();
>         case SECCOMP_SET_MODE_FILTER:
>                 return seccomp_set_mode_filter(flags, uargs);
> +#if defined(CONFIG_SECCOMP_FILTER) && defined(CONFIG_SECURITY_LANDLOCK)
> +       case SECCOMP_ADD_LANDLOCK_RULE:
> +               return landlock_seccomp_append_prog(flags, uargs);
> +#endif /* CONFIG_SECCOMP_FILTER && CONFIG_SECURITY_LANDLOCK */
>         default:
>                 return -EINVAL;
>         }
> diff --git a/security/landlock/Makefile b/security/landlock/Makefile
> index 8dc8bde660bd..6c1b0d8bd810 100644
> --- a/security/landlock/Makefile
> +++ b/security/landlock/Makefile
> @@ -2,4 +2,4 @@ ccflags-$(CONFIG_SECURITY_LANDLOCK) += -Werror=unused-function
>
>  obj-$(CONFIG_SECURITY_LANDLOCK) := landlock.o
>
> -landlock-y := hooks.o
> +landlock-y := hooks.o manager.o
> diff --git a/security/landlock/hooks.c b/security/landlock/hooks.c
> index 88ebe3f01758..704ea18377d2 100644
> --- a/security/landlock/hooks.c
> +++ b/security/landlock/hooks.c
> @@ -290,7 +290,44 @@ static u64 mem_prot_to_access(unsigned long prot, bool private)
>
>  static inline bool landlock_used(void)
>  {
> +#ifdef CONFIG_SECCOMP_FILTER
> +       return !!(current->seccomp.landlock_events);
> +#else
>         return false;
> +#endif /* CONFIG_SECCOMP_FILTER */
> +}
> +
> +/**
> + * landlock_run_prog - run Landlock program for a syscall
> + *
> + * @event_idx: event index in the rules array
> + * @ctx: non-NULL eBPF context
> + * @events: Landlock events pointer
> + */
> +static int landlock_run_prog(u32 event_idx, const struct landlock_context *ctx,
> +               struct landlock_events *events)
> +{
> +       struct landlock_node *node;
> +
> +       if (!events)
> +               return 0;
> +
> +       for (node = events->nodes[event_idx]; node; node = node->prev) {
> +               struct landlock_rule *rule;
> +
> +               for (rule = node->rule; rule; rule = rule->prev) {
> +                       u32 ret;
> +
> +                       if (WARN_ON(!rule->prog))
> +                               continue;
> +                       rcu_read_lock();
> +                       ret = BPF_PROG_RUN(rule->prog, (void *)ctx);
> +                       rcu_read_unlock();
> +                       if (ret)
> +                               return -EPERM;
> +               }
> +       }
> +       return 0;
>  }
>
>  static int landlock_decide(enum landlock_subtype_event event,
> @@ -309,7 +346,10 @@ static int landlock_decide(enum landlock_subtype_event event,
>                 .arg2 = ctx_values[1],
>         };
>
> -       /* insert manager call here */
> +#ifdef CONFIG_SECCOMP_FILTER
> +       ret = landlock_run_prog(event_idx, &ctx,
> +                       current->seccomp.landlock_events);
> +#endif /* CONFIG_SECCOMP_FILTER */
>
>         return ret;
>  }
> diff --git a/security/landlock/manager.c b/security/landlock/manager.c
> new file mode 100644
> index 000000000000..00bb2944c85e
> --- /dev/null
> +++ b/security/landlock/manager.c
> @@ -0,0 +1,321 @@
> +/*
> + * Landlock LSM - seccomp manager
> + *
> + * Copyright © 2017 Mickaël Salaün <mic@digikod.net>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2, as
> + * published by the Free Software Foundation.
> + */
> +
> +#include <asm/page.h> /* PAGE_SIZE */
> +#include <linux/atomic.h> /* atomic_*(), smp_store_release() */
> +#include <linux/bpf.h> /* bpf_prog_put() */
> +#include <linux/filter.h> /* struct bpf_prog */
> +#include <linux/kernel.h> /* round_up() */
> +#include <linux/landlock.h>
> +#include <linux/sched.h> /* current_cred(), task_no_new_privs() */
> +#include <linux/security.h> /* security_capable_noaudit() */
> +#include <linux/slab.h> /* alloc(), kfree() */
> +#include <linux/types.h> /* atomic_t */
> +#include <linux/uaccess.h> /* copy_from_user() */
> +
> +#include "common.h"
> +
> +static void put_landlock_rule(struct landlock_rule *rule)
> +{
> +       struct landlock_rule *orig = rule;
> +
> +       /* clean up single-reference branches iteratively */
> +       while (orig && atomic_dec_and_test(&orig->usage)) {
> +               struct landlock_rule *freeme = orig;
> +
> +               bpf_prog_put(orig->prog);
> +               orig = orig->prev;
> +               kfree(freeme);
> +       }
> +}
> +
> +static void put_landlock_node(struct landlock_node *node)
> +{
> +       struct landlock_node *orig = node;
> +
> +       /* clean up single-reference branches iteratively */
> +       while (orig && atomic_dec_and_test(&orig->usage)) {
> +               struct landlock_node *freeme = orig;
> +
> +               put_landlock_rule(orig->rule);
> +               orig = orig->prev;
> +               kfree(freeme);
> +       }
> +}
> +
> +void put_landlock_events(struct landlock_events *events)
> +{
> +       if (events && atomic_dec_and_test(&events->usage)) {
> +               size_t i;
> +
> +               /* XXX: Do we need to use lockless_dereference() here? */
> +               for (i = 0; i < ARRAY_SIZE(events->nodes); i++) {
> +                       if (!events->nodes[i])
> +                               continue;
> +                       /* Are we the owner of this node? */
> +                       if (events->nodes[i]->owner == &events->nodes[i])
> +                               events->nodes[i]->owner = NULL;
> +                       put_landlock_node(events->nodes[i]);
> +               }
> +               kfree(events);
> +       }
> +}
> +
> +static struct landlock_events *new_raw_landlock_events(void)
> +{
> +       struct landlock_events *ret;
> +
> +       /* array filled with NULL values */
> +       ret = kzalloc(sizeof(*ret), GFP_KERNEL);
> +       if (!ret)
> +               return ERR_PTR(-ENOMEM);
> +       atomic_set(&ret->usage, 1);
> +       return ret;
> +}
> +
> +static struct landlock_events *new_filled_landlock_events(void)
> +{
> +       size_t i;
> +       struct landlock_events *ret;
> +
> +       ret = new_raw_landlock_events();
> +       if (IS_ERR(ret))
> +               return ret;
> +       /*
> +        * We need to initially allocate every nodes to be able to update the
> +        * rules they are pointing to, across every (future) children of the
> +        * current task.
> +        */
> +       for (i = 0; i < ARRAY_SIZE(ret->nodes); i++) {
> +               struct landlock_node *node;
> +
> +               node = kzalloc(sizeof(*node), GFP_KERNEL);
> +               if (!node)
> +                       goto put_events;
> +               atomic_set(&node->usage, 1);
> +               /* we are the owner of this node */
> +               node->owner = &ret->nodes[i];
> +               ret->nodes[i] = node;
> +       }
> +       return ret;
> +
> +put_events:
> +       put_landlock_events(ret);
> +       return ERR_PTR(-ENOMEM);
> +}
> +
> +static void add_landlock_rule(struct landlock_events *events,
> +               struct landlock_rule *rule)
> +{
> +       /* subtype.landlock_rule.event > 0 for loaded programs */
> +       u32 event_idx = get_index(rule->prog->subtype.landlock_rule.event);
> +
> +       rule->prev = events->nodes[event_idx]->rule;
> +       WARN_ON(atomic_read(&rule->usage));
> +       atomic_set(&rule->usage, 1);
> +       /* do not increment the previous rule usage */
> +       smp_store_release(&events->nodes[event_idx]->rule, rule);
> +}
> +
> +/* Limit Landlock events to 256KB. */
> +#define LANDLOCK_EVENTS_MAX_PAGES (1 << 6)
> +
> +/**
> + * landlock_append_prog - attach a Landlock rule to @current_events
> + *
> + * @current_events: landlock_events pointer, must be locked (if needed) to
> + *                  prevent a concurrent put/free. This pointer must not be
> + *                  freed after the call.
> + * @prog: non-NULL Landlock rule to append to @current_events. @prog will be
> + *        owned by landlock_append_prog() and freed if an error happened.
> + *
> + * Return @current_events or a new pointer when OK. Return a pointer error
> + * otherwise.
> + */
> +static struct landlock_events *landlock_append_prog(
> +               struct landlock_events *current_events, struct bpf_prog *prog)
> +{
> +       struct landlock_events *new_events = current_events;
> +       unsigned long pages;
> +       struct landlock_rule *rule;
> +       u32 event_idx;
> +
> +       if (prog->type != BPF_PROG_TYPE_LANDLOCK) {
> +               new_events = ERR_PTR(-EINVAL);
> +               goto put_prog;
> +       }
> +
> +       /* validate memory size allocation */
> +       pages = prog->pages;
> +       if (current_events) {
> +               size_t i;
> +
> +               for (i = 0; i < ARRAY_SIZE(current_events->nodes); i++) {
> +                       struct landlock_node *walker_n;
> +
> +                       for (walker_n = current_events->nodes[i];
> +                                       walker_n;
> +                                       walker_n = walker_n->prev) {
> +                               struct landlock_rule *walker_r;
> +
> +                               for (walker_r = walker_n->rule;
> +                                               walker_r;
> +                                               walker_r = walker_r->prev)
> +                                       pages += walker_r->prog->pages;
> +                       }
> +               }
> +               /* count a struct landlock_events if we need to allocate one */
> +               if (atomic_read(&current_events->usage) != 1)
> +                       pages += round_up(sizeof(*current_events), PAGE_SIZE) /
> +                               PAGE_SIZE;
> +       }
> +       if (pages > LANDLOCK_EVENTS_MAX_PAGES) {
> +               new_events = ERR_PTR(-E2BIG);
> +               goto put_prog;
> +       }
> +
> +       rule = kzalloc(sizeof(*rule), GFP_KERNEL);
> +       if (!rule) {
> +               new_events = ERR_PTR(-ENOMEM);
> +               goto put_prog;
> +       }
> +       rule->prog = prog;
> +
> +       /* subtype.landlock_rule.event > 0 for loaded programs */
> +       event_idx = get_index(rule->prog->subtype.landlock_rule.event);
> +
> +       if (!current_events) {
> +               /* add a new landlock_events, if needed */
> +               new_events = new_filled_landlock_events();
> +               if (IS_ERR(new_events))
> +                       goto put_rule;
> +               add_landlock_rule(new_events, rule);
> +       } else {
> +               if (new_events->nodes[event_idx]->owner ==
> +                               &new_events->nodes[event_idx]) {
> +                       /* We are the owner, we can then update the node. */
> +                       add_landlock_rule(new_events, rule);
> +               } else if (atomic_read(&current_events->usage) == 1) {
> +                       WARN_ON(new_events->nodes[event_idx]->owner);
> +                       /*
> +                        * We can become the new owner if no other task use it.
> +                        * This avoid an unnecessary allocation.
> +                        */
> +                       new_events->nodes[event_idx]->owner =
> +                               &new_events->nodes[event_idx];
> +                       add_landlock_rule(new_events, rule);

I still don't fully follow this, but maybe to keep it simple you should
always allocate here, since the WARN_ON() suggests this case is meant
to be removed eventually. That would also avoid having to care about
whether you are using or freeing old rules of a dead task?


> +               } else {
> +                       /*
> +                        * We are not the owner, we need to fork current_events
> +                        * and then add a new node.
> +                        */
> +                       struct landlock_node *node;
> +                       size_t i;
> +
> +                       node = kmalloc(sizeof(*node), GFP_KERNEL);
> +                       if (!node) {
> +                               new_events = ERR_PTR(-ENOMEM);
> +                               goto put_rule;
> +                       }
> +                       atomic_set(&node->usage, 1);
> +                       /* set the previous node after the new_events
> +                        * allocation */
> +                       node->prev = NULL;
> +                       /* do not increment the previous node usage */
> +                       node->owner = &new_events->nodes[event_idx];
> +                       /* rule->prev is already NULL */
> +                       atomic_set(&rule->usage, 1);
> +                       node->rule = rule;
> +
> +                       new_events = new_raw_landlock_events();
> +                       if (IS_ERR(new_events)) {
> +                               /* put the rule as well */
> +                               put_landlock_node(node);
> +                               return ERR_PTR(-ENOMEM);
> +                       }
> +                       for (i = 0; i < ARRAY_SIZE(new_events->nodes); i++) {
> +                               new_events->nodes[i] =
> +                                       lockless_dereference(
> +                                                       current_events->nodes[i]);
> +                               if (i == event_idx)
> +                                       node->prev = new_events->nodes[i];
> +                               if (!WARN_ON(!new_events->nodes[i]))
> +                                       atomic_inc(&new_events->nodes[i]->usage);
> +                       }
> +                       new_events->nodes[event_idx] = node;
> +
> +                       /*
> +                        * @current_events will not be freed here because it's usage
> +                        * field is > 1. It is only prevented to be freed by another
> +                        * subject thanks to the caller of landlock_append_prog() which
> +                        * should be locked if needed.
> +                        */
> +                       put_landlock_events(current_events);
> +               }
> +       }
> +       return new_events;
> +
> +put_prog:
> +       bpf_prog_put(prog);
> +       return new_events;
> +
> +put_rule:
> +       put_landlock_rule(rule);
> +       return new_events;
> +}
> +
> +/**
> + * landlock_seccomp_append_prog - attach a Landlock rule to the current process
> + *
> + * current->seccomp.landlock_events is lazily allocated. When a process forks,
> + * only a pointer is copied. When a new event is added by a process, if there
> + * are other references to this process' landlock_events, then a new allocation
> + * is made to contain an array pointing to Landlock rule lists. This design
> + * has a low performance impact and is memory efficient while keeping the
> + * property of append-only rules.
> + *
> + * @flags: not used for now, but could be used for TSYNC
> + * @user_bpf_fd: file descriptor pointing to a loaded Landlock rule
> + */
> +#ifdef CONFIG_SECCOMP_FILTER
> +int landlock_seccomp_append_prog(unsigned int flags, const char __user *user_bpf_fd)
> +{
> +       struct landlock_events *new_events;
> +       struct bpf_prog *prog;
> +       int bpf_fd;
> +
> +       /* force no_new_privs to limit privilege escalation */
> +       if (!task_no_new_privs(current))
> +               return -EPERM;
> +       /* will be removed in the future to allow unprivileged tasks */
> +       if (!capable(CAP_SYS_ADMIN))
> +               return -EPERM;
> +       if (!user_bpf_fd)
> +               return -EFAULT;
> +       if (flags)
> +               return -EINVAL;
> +       if (copy_from_user(&bpf_fd, user_bpf_fd, sizeof(bpf_fd)))
> +               return -EFAULT;
> +       prog = bpf_prog_get(bpf_fd);
> +       if (IS_ERR(prog))
> +               return PTR_ERR(prog);
> +
> +       /*
> +        * We don't need to lock anything for the current process hierarchy,
> +        * everything is guarded by the atomic counters.
> +        */
> +       new_events = landlock_append_prog(current->seccomp.landlock_events, prog);
> +       /* @prog is managed/freed by landlock_append_prog() */
> +       if (IS_ERR(new_events))
> +               return PTR_ERR(new_events);
> +       current->seccomp.landlock_events = new_events;
> +       return 0;
> +}
> +#endif /* CONFIG_SECCOMP_FILTER */
> --
> 2.11.0
>



-- 
tixxdz


* Re: [PATCH v5 06/10] seccomp,landlock: Handle Landlock events per process hierarchy
  2017-03-01 23:28         ` Mickaël Salaün
@ 2017-03-02 16:36           ` Andy Lutomirski
  2017-03-03  0:48             ` Mickaël Salaün
  0 siblings, 1 reply; 26+ messages in thread
From: Andy Lutomirski @ 2017-03-02 16:36 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: linux-kernel, Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Casey Schaufler, Daniel Borkmann, David Drysdale,
	David S . Miller, Eric W . Biederman, James Morris, Jann Horn,
	Jonathan Corbet, Matthew Garrett, Michael Kerrisk, Kees Cook,
	Paul Moore, Sargun Dhillon, Serge E . Hallyn, Shuah Khan,
	Tejun Heo, Thomas Graf, Will Drewry, kernel-hardening, Linux API,
	LSM List, Network Development, Andrew Morton

On Wed, Mar 1, 2017 at 3:28 PM, Mickaël Salaün <mic@digikod.net> wrote:
>
>
> On 01/03/2017 23:20, Andy Lutomirski wrote:
>> On Wed, Mar 1, 2017 at 2:14 PM, Mickaël Salaün <mic@digikod.net> wrote:
>>>
>>> On 28/02/2017 21:01, Andy Lutomirski wrote:
>>>> On Tue, Feb 21, 2017 at 5:26 PM, Mickaël Salaün <mic@digikod.net> wrote:
>>> This design makes it possible for a process to add more constraints to
>>> its children on the fly. I think it is a good feature to have and a
>>> safer default inheritance mechanism, but it could be guarded by an
>>> option flag if we want both mechanisms to be available. The same design
>>> could be used by seccomp filter too.
>>>
>>
>> Then let's do it right.
>>
>> Currently each task has an array of seccomp filter layers.  When a
>> task forks, the child inherits the layers.  All the layers are
>> presently immutable.  With Landlock, a layer can logically be a
>> syscall filter layer or a Landlock layer.  This fits into the
>> existing model just fine.
>>
>> If we want to have an interface to allow modification of an existing
>> layer, let's make it so that, when a layer is added, you have to
>> specify a flag to make the layer modifiable (by current, presumably,
>> although I can imagine other policies down the road).  Then have a
>> separate API that modifies a layer.
>>
>> IOW, I think your patch is bad for three reasons, all fixable:
>>
>> 1. The default is wrong.  A layer should be immutable to avoid an easy
>> attack in which you try to sandbox *yourself* and then you just modify
>> the layer to weaken it.
>
> This is not possible, there is only one operation for now:
> SECCOMP_ADD_LANDLOCK_RULE. You can only add more rules to the list (as
> for seccomp filter). There is no way to weaken a sandbox. The question
> is: how do we want to handle the rules *tree* (from the kernel point of
> view)?
>

Fair enough.  But I still think that immutability (like regular
seccomp) should be the default.  For security, simplicity is
important.  I guess there could be two ways to relax immutability:
allowing making the layer stricter and allowing any change at all.

As a default, though, programs should be able to expect that:

seccomp(SECCOMP_ADD_WHATEVER, ...);
fork();

[parent gets compromised]
[in parent]seccomp(anything whatsoever);

will not affect the child.  If the parent wants to relax that, that's
fine, but I think it should be explicit.
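
The expectation above can be modelled in userspace. What follows is only a
rough sketch, not kernel code: struct layer, struct task, task_add_layer(),
task_fork() and task_depth() are hypothetical names made up for this
illustration, and layers are never freed to keep it short. Each layer is
immutable once linked, and adding a layer only moves the caller's own "top"
pointer, so a task forked earlier keeps seeing the stack it inherited.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace model: one immutable layer in a task's stack. */
struct layer {
	int id;			/* stands in for the attached program */
	struct layer *prev;	/* older layer, never modified once linked */
};

struct task {
	struct layer *top;	/* most recently added layer */
};

/* Push a new layer on @t only; existing layers are left untouched. */
static int task_add_layer(struct task *t, int id)
{
	struct layer *l = malloc(sizeof(*l));

	if (!l)
		return -1;
	l->id = id;
	l->prev = t->top;
	t->top = l;
	return 0;
}

/* Model of fork(): the child starts with the same stack as the parent. */
static struct task task_fork(const struct task *parent)
{
	struct task child = { .top = parent->top };

	return child;
}

/* Count the layers visible from @t. */
static size_t task_depth(const struct task *t)
{
	const struct layer *l;
	size_t n = 0;

	for (l = t->top; l; l = l->prev)
		n++;
	return n;
}
```

With this model, a parent that adds a layer, forks, and then adds a second
layer ends up with a depth of two, while the child still sees a depth of one.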

>>
>> 2. The API that adds a layer should be different from the API that
>> modifies a layer.
>
> Right, but it doesn't apply now because we can only add rules.

That's not what the code appears to do, though.  Sometimes it makes a
new layer without modifying tasks that share the layer and sometimes
it modifies the layer.

Both operations are probably okay, but they're not the same operation
and they shouldn't pretend to be.
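
To make the distinction concrete, here is a minimal userspace sketch of the
two behaviours. It is an illustration under my own assumptions, not the
patch's code: struct node, modify_shared(), add_private() and visible_rules()
are hypothetical names. An in-place update is visible to every task that
shares the node, whereas a new layer is private to the caller and only chains
to the shared one.

```c
#include <stddef.h>

/* Hypothetical model of a node shared between tasks. */
struct node {
	unsigned int nrules;	/* number of rules attached to this node */
	struct node *prev;	/* parent node, evaluated afterwards */
};

/* In-place update: every task pointing at @shared sees the new rule. */
static void modify_shared(struct node *shared)
{
	shared->nrules++;
}

/* New layer: @fresh is private to the caller and chains to @shared. */
static void add_private(struct node *fresh, struct node *shared)
{
	fresh->nrules = 1;
	fresh->prev = shared;
}

/* Total number of rules reachable from @n. */
static unsigned int visible_rules(const struct node *n)
{
	unsigned int total = 0;

	for (; n; n = n->prev)
		total += n->nrules;
	return total;
}
```

If a parent and a child both point at the same node, modify_shared() changes
what the child sees, while add_private() followed by repointing only the
parent does not.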


>
>>
>> 3. The whole modification mechanism should be a separate patch to be
>> reviewed on its own merits.
>
> For a rule *replacement*, sure!

And for modification of policy for non-current tasks.  That's a big
departure from normal seccomp and should be reviewed as such.


* Re: [PATCH v5 06/10] seccomp,landlock: Handle Landlock events per process hierarchy
  2017-03-02 16:36           ` Andy Lutomirski
@ 2017-03-03  0:48             ` Mickaël Salaün
  2017-03-03  0:55               ` Andy Lutomirski
  0 siblings, 1 reply; 26+ messages in thread
From: Mickaël Salaün @ 2017-03-03  0:48 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: linux-kernel, Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Casey Schaufler, Daniel Borkmann, David Drysdale,
	David S . Miller, Eric W . Biederman, James Morris, Jann Horn,
	Jonathan Corbet, Matthew Garrett, Michael Kerrisk, Kees Cook,
	Paul Moore, Sargun Dhillon, Serge E . Hallyn, Shuah Khan,
	Tejun Heo, Thomas Graf, Will Drewry, kernel-hardening, Linux API,
	LSM List, Network Development, Andrew Morton



On 02/03/2017 17:36, Andy Lutomirski wrote:
> On Wed, Mar 1, 2017 at 3:28 PM, Mickaël Salaün <mic@digikod.net> wrote:
>>
>>
>> On 01/03/2017 23:20, Andy Lutomirski wrote:
>>> On Wed, Mar 1, 2017 at 2:14 PM, Mickaël Salaün <mic@digikod.net> wrote:
>>>>
>>>> On 28/02/2017 21:01, Andy Lutomirski wrote:
>>>>> On Tue, Feb 21, 2017 at 5:26 PM, Mickaël Salaün <mic@digikod.net> wrote:
>>>> This design makes it possible for a process to add more constraints to
>>>> its children on the fly. I think it is a good feature to have and a
>>>> safer default inheritance mechanism, but it could be guarded by an
>>>> option flag if we want both mechanisms to be available. The same design
>>>> could be used by seccomp filter too.
>>>>
>>>
>>> Then let's do it right.
>>>
>>> Currently each task has an array of seccomp filter layers.  When a
>>> task forks, the child inherits the layers.  All the layers are
>>> presently immutable.  With Landlock, a layer can logically be a
>>> syscall filter layer or a Landlock layer.  This fits into the
>>> existing model just fine.
>>>
>>> If we want to have an interface to allow modification of an existing
>>> layer, let's make it so that, when a layer is added, you have to
>>> specify a flag to make the layer modifiable (by current, presumably,
>>> although I can imagine other policies down the road).  Then have a
>>> separate API that modifies a layer.
>>>
>>> IOW, I think your patch is bad for three reasons, all fixable:
>>>
>>> 1. The default is wrong.  A layer should be immutable to avoid an easy
>>> attack in which you try to sandbox *yourself* and then you just modify
>>> the layer to weaken it.
>>
>> This is not possible, there is only one operation for now:
>> SECCOMP_ADD_LANDLOCK_RULE. You can only add more rules to the list (as
>> for seccomp filter). There is no way to weaken a sandbox. The question
>> is: how do we want to handle the rules *tree* (from the kernel point of
>> view)?
>>
> 
> Fair enough.  But I still think that immutability (like regular
> seccomp) should be the default.  For security, simplicity is
> important.  I guess there could be two ways to relax immutability:
> allowing making the layer stricter and allowing any change at all.
> 
> As a default, though, programs should be able to expect that:
> 
> seccomp(SECCOMP_ADD_WHATEVER, ...);
> fork();
> 
> [parent gets compromised]
> [in parent]seccomp(anything whatsoever);
> 
> will not affect the child.  If the parent wants to relax that, that's
> fine, but I think it should be explicit.

Good point. However, the term "immutability" doesn't quite fit because
the process is still allowed to add more rules to itself (as with
seccomp). The difference lies in the way a rule may be "appended" (by
the current process) or "inserted" (by a parent process).

I think three or four kinds of operations (through the seccomp syscall)
make sense:
* append a rule (for the current process and its future children)
* add a node (insert point), to which the inserted rules will be tied
* insert a rule in the node, which will be inherited by future children
* (maybe a "lock" command to make a layer immutable for the current
process and its children)

This way, a process is only allowed to insert a rule if a node was
previously added. To prevent itself from inserting new rules into one of
its children, a process just needs to not add a node before forking.
With this, there is no need for special rule flags nor default behavior,
everything is explicit.
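
As a rough illustration of how these operations could interact, here is a
small userspace sketch. None of it exists in this series: the SKETCH_*
names, struct sketch_state and sketch_do_op() are hypothetical. The
assumption that "lock" blocks later node creation and insertion, but not
appending, is only one possible reading of the list above.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical operation names; none of these exist in the series. */
enum sketch_op {
	SKETCH_APPEND_RULE,	/* append a rule to the current process */
	SKETCH_ADD_NODE,	/* create an insert point */
	SKETCH_INSERT_RULE,	/* insert a rule into the existing node */
	SKETCH_LOCK,		/* assumed: forbid later nodes and inserts */
};

struct sketch_state {
	bool has_node;		/* an insert point was added */
	bool locked;		/* the layer was made immutable */
	unsigned int appended;	/* rules appended so far */
	unsigned int inserted;	/* rules inserted so far */
};

/* Apply @op to @s; return 0 on success or a negative errno value. */
static int sketch_do_op(struct sketch_state *s, enum sketch_op op)
{
	switch (op) {
	case SKETCH_APPEND_RULE:
		/* always allowed, as with a seccomp filter */
		s->appended++;
		return 0;
	case SKETCH_ADD_NODE:
		if (s->locked)
			return -EPERM;
		s->has_node = true;
		return 0;
	case SKETCH_INSERT_RULE:
		/* only allowed if a node was previously added */
		if (!s->has_node || s->locked)
			return -EPERM;
		s->inserted++;
		return 0;
	case SKETCH_LOCK:
		s->locked = true;
		return 0;
	}
	return -EINVAL;
}
```

In this model, an insert attempted before any node was added fails with
-EPERM, which is the explicit behavior described above.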

For this series, I will stick to the same behavior as seccomp filter:
only append rules to the current process (and its future children).


>>> 2. The API that adds a layer should be different from the API that
>>> modifies a layer.
>>
>> Right, but it doesn't apply now because we can only add rules.
> 
> That's not what the code appears to do, though.  Sometimes it makes a
> new layer without modifying tasks that share the layer and sometimes
> it modifies the layer.
> 
> Both operations are probably okay, but they're not the same operation
> and they shouldn't pretend to be.

It should be OK with my previous proposal. The other details could be
discussed in a separate future patch series.


>>> 3. The whole modification mechanism should be a separate patch to be
>>> reviewed on its own merits.
>>
>> For a rule *replacement*, sure!
> 
> And for modification of policy for non-current tasks.  That's a big
> departure from normal seccomp and should be reviewed as such.

Agreed


* Re: [kernel-hardening] [PATCH v5 06/10] seccomp,landlock: Handle Landlock events per process hierarchy
  2017-03-02 10:22   ` [kernel-hardening] " Djalal Harouni
@ 2017-03-03  0:54     ` Mickaël Salaün
  0 siblings, 0 replies; 26+ messages in thread
From: Mickaël Salaün @ 2017-03-03  0:54 UTC (permalink / raw)
  To: Djalal Harouni
  Cc: linux-kernel, Alexei Starovoitov, Andy Lutomirski,
	Arnaldo Carvalho de Melo, Casey Schaufler, Daniel Borkmann,
	David Drysdale, David S . Miller, Eric W . Biederman,
	James Morris, Jann Horn, Jonathan Corbet, Matthew Garrett,
	Michael Kerrisk, Kees Cook, Paul Moore, Sargun Dhillon,
	Serge E . Hallyn, Shuah Khan, Tejun Heo, Thomas Graf,
	Will Drewry, kernel-hardening, Linux API, LSM List, netdev,
	Andrew Morton

On 02/03/2017 11:22, Djalal Harouni wrote:
> On Wed, Feb 22, 2017 at 2:26 AM, Mickaël Salaün <mic@digikod.net> wrote:
>> The seccomp(2) syscall can be used to apply a Landlock rule to the
>> current process. As with a seccomp filter, the Landlock rule is enforced
>> for all its future children. An inherited rule tree can be updated
>> (append-only) by the owner of inherited Landlock nodes (e.g. a parent
>> process that creates a new rule). However, an intermediate task, which
>> did not create a rule, will not be able to update its children's rules.
>>
>> Landlock rules can be tied to a Landlock event. When such an event is
>> triggered, a tree of rules can be evaluated. This kind of tree is
>> created with a first node.  This node references a list of rules and an
>> optional parent node. Each rule returns a 32-bit value which can
>> interrupt the evaluation with a non-zero value. If every rule returned
>> zero, the evaluation continues with the rule list of the parent node,
>> until the end of the tree.
>>
>> Changes since v4:
>> * merge manager and seccomp patches
>> * return -EFAULT in seccomp(2) when user_bpf_fd is null to easily check
>>   if Landlock is supported
>> * only allow a process with the global CAP_SYS_ADMIN to use Landlock
>>   (will be lifted in the future)
>> * add an early check to exit as soon as possible if the current process
>>   does not have Landlock rules
>>
>> Changes since v3:
>> * remove the hard link with seccomp (suggested by Andy Lutomirski and
>>   Kees Cook):
>>   * remove the cookie which could imply multiple evaluation of Landlock
>>     rules
>>   * remove the origin field in struct landlock_data
>> * remove documentation fix (merged upstream)
>> * rename the new seccomp command to SECCOMP_ADD_LANDLOCK_RULE
>> * internal renaming
>> * split commit
>> * new design to be able to inherit on the fly the parent rules
>>
>> Changes since v2:
>> * Landlock programs can now be run without seccomp filter but for any
>>   syscall (from the process) or interruption
>> * move Landlock related functions and structs into security/landlock/*
>>   (to manage cgroups as well)
>> * fix seccomp filter handling: run Landlock programs for each of their
>>   legitimate seccomp filter
>> * properly clean up all seccomp results
>> * cosmetic changes to ease the understanding
>> * fix some ifdef
>>
>> Signed-off-by: Mickaël Salaün <mic@digikod.net>
>> Cc: Alexei Starovoitov <ast@kernel.org>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: Andy Lutomirski <luto@amacapital.net>
>> Cc: James Morris <james.l.morris@oracle.com>
>> Cc: Kees Cook <keescook@chromium.org>
>> Cc: Serge E. Hallyn <serge@hallyn.com>
>> Cc: Will Drewry <wad@chromium.org>
>> ---
>>  include/linux/seccomp.h      |   8 ++
>>  include/uapi/linux/seccomp.h |   1 +
>>  kernel/fork.c                |  14 +-
>>  kernel/seccomp.c             |   8 ++
>>  security/landlock/Makefile   |   2 +-
>>  security/landlock/hooks.c    |  42 +++++-
>>  security/landlock/manager.c  | 321 +++++++++++++++++++++++++++++++++++++++++++
>>  7 files changed, 392 insertions(+), 4 deletions(-)
>>  create mode 100644 security/landlock/manager.c
>>
>> diff --git a/include/linux/seccomp.h b/include/linux/seccomp.h
>> index e25aee2cdfc0..9a38de3c0e72 100644
>> --- a/include/linux/seccomp.h
>> +++ b/include/linux/seccomp.h
>> @@ -10,6 +10,10 @@
>>  #include <linux/thread_info.h>
>>  #include <asm/seccomp.h>
>>
>> +#if defined(CONFIG_SECCOMP_FILTER) && defined(CONFIG_SECURITY_LANDLOCK)
>> +struct landlock_events;
>> +#endif /* CONFIG_SECCOMP_FILTER && CONFIG_SECURITY_LANDLOCK */
>> +
>>  struct seccomp_filter;
>>  /**
>>   * struct seccomp - the state of a seccomp'ed process
>> @@ -18,6 +22,7 @@ struct seccomp_filter;
>>   *         system calls available to a process.
>>   * @filter: must always point to a valid seccomp-filter or NULL as it is
>>   *          accessed without locking during system call entry.
>> + * @landlock_events: contains an array of Landlock rules.
>>   *
>>   *          @filter must only be accessed from the context of current as there
>>   *          is no read locking.
>> @@ -25,6 +30,9 @@ struct seccomp_filter;
>>  struct seccomp {
>>         int mode;
>>         struct seccomp_filter *filter;
>> +#if defined(CONFIG_SECCOMP_FILTER) && defined(CONFIG_SECURITY_LANDLOCK)
>> +       struct landlock_events *landlock_events;
>> +#endif /* CONFIG_SECCOMP_FILTER && CONFIG_SECURITY_LANDLOCK */
>>  };
>>
>>  #ifdef CONFIG_HAVE_ARCH_SECCOMP_FILTER
>> diff --git a/include/uapi/linux/seccomp.h b/include/uapi/linux/seccomp.h
>> index 0f238a43ff1e..56dd692cddac 100644
>> --- a/include/uapi/linux/seccomp.h
>> +++ b/include/uapi/linux/seccomp.h
>> @@ -13,6 +13,7 @@
>>  /* Valid operations for seccomp syscall. */
>>  #define SECCOMP_SET_MODE_STRICT        0
>>  #define SECCOMP_SET_MODE_FILTER        1
>> +#define SECCOMP_ADD_LANDLOCK_RULE      2
>>
>>  /* Valid flags for SECCOMP_SET_MODE_FILTER */
>>  #define SECCOMP_FILTER_FLAG_TSYNC      1
>> diff --git a/kernel/fork.c b/kernel/fork.c
>> index a4f0d0e8aeb2..bd5c72dffe60 100644
>> --- a/kernel/fork.c
>> +++ b/kernel/fork.c
>> @@ -37,6 +37,7 @@
>>  #include <linux/security.h>
>>  #include <linux/hugetlb.h>
>>  #include <linux/seccomp.h>
>> +#include <linux/landlock.h>
>>  #include <linux/swap.h>
>>  #include <linux/syscalls.h>
>>  #include <linux/jiffies.h>
>> @@ -515,7 +516,10 @@ static struct task_struct *dup_task_struct(struct task_struct *orig, int node)
>>          * the usage counts on the error path calling free_task.
>>          */
>>         tsk->seccomp.filter = NULL;
>> -#endif
>> +#ifdef CONFIG_SECURITY_LANDLOCK
>> +       tsk->seccomp.landlock_events = NULL;
>> +#endif /* CONFIG_SECURITY_LANDLOCK */
>> +#endif /* CONFIG_SECCOMP */
>>
>>         setup_thread_stack(tsk, orig);
>>         clear_user_return_notifier(tsk);
>> @@ -1388,7 +1392,13 @@ static void copy_seccomp(struct task_struct *p)
>>
>>         /* Ref-count the new filter user, and assign it. */
>>         get_seccomp_filter(current);
>> -       p->seccomp = current->seccomp;
>> +       p->seccomp.mode = current->seccomp.mode;
>> +       p->seccomp.filter = current->seccomp.filter;
>> +#if defined(CONFIG_SECCOMP_FILTER) && defined(CONFIG_SECURITY_LANDLOCK)
>> +       p->seccomp.landlock_events = current->seccomp.landlock_events;
>> +       if (p->seccomp.landlock_events)
>> +               atomic_inc(&p->seccomp.landlock_events->usage);
>> +#endif /* CONFIG_SECCOMP_FILTER && CONFIG_SECURITY_LANDLOCK */
>>
>>         /*
>>          * Explicitly enable no_new_privs here in case it got set
>> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
>> index 06f2f3ee454c..ef412d95ff5d 100644
>> --- a/kernel/seccomp.c
>> +++ b/kernel/seccomp.c
>> @@ -32,6 +32,7 @@
>>  #include <linux/security.h>
>>  #include <linux/tracehook.h>
>>  #include <linux/uaccess.h>
>> +#include <linux/landlock.h>
>>
>>  /**
>>   * struct seccomp_filter - container for seccomp BPF programs
>> @@ -492,6 +493,9 @@ static void put_seccomp_filter(struct seccomp_filter *filter)
>>  void put_seccomp(struct task_struct *tsk)
>>  {
>>         put_seccomp_filter(tsk->seccomp.filter);
>> +#ifdef CONFIG_SECURITY_LANDLOCK
>> +       put_landlock_events(tsk->seccomp.landlock_events);
>> +#endif /* CONFIG_SECURITY_LANDLOCK */
>>  }
>>
>>  /**
>> @@ -796,6 +800,10 @@ static long do_seccomp(unsigned int op, unsigned int flags,
>>                 return seccomp_set_mode_strict();
>>         case SECCOMP_SET_MODE_FILTER:
>>                 return seccomp_set_mode_filter(flags, uargs);
>> +#if defined(CONFIG_SECCOMP_FILTER) && defined(CONFIG_SECURITY_LANDLOCK)
>> +       case SECCOMP_ADD_LANDLOCK_RULE:
>> +               return landlock_seccomp_append_prog(flags, uargs);
>> +#endif /* CONFIG_SECCOMP_FILTER && CONFIG_SECURITY_LANDLOCK */
>>         default:
>>                 return -EINVAL;
>>         }
>> diff --git a/security/landlock/Makefile b/security/landlock/Makefile
>> index 8dc8bde660bd..6c1b0d8bd810 100644
>> --- a/security/landlock/Makefile
>> +++ b/security/landlock/Makefile
>> @@ -2,4 +2,4 @@ ccflags-$(CONFIG_SECURITY_LANDLOCK) += -Werror=unused-function
>>
>>  obj-$(CONFIG_SECURITY_LANDLOCK) := landlock.o
>>
>> -landlock-y := hooks.o
>> +landlock-y := hooks.o manager.o
>> diff --git a/security/landlock/hooks.c b/security/landlock/hooks.c
>> index 88ebe3f01758..704ea18377d2 100644
>> --- a/security/landlock/hooks.c
>> +++ b/security/landlock/hooks.c
>> @@ -290,7 +290,44 @@ static u64 mem_prot_to_access(unsigned long prot, bool private)
>>
>>  static inline bool landlock_used(void)
>>  {
>> +#ifdef CONFIG_SECCOMP_FILTER
>> +       return !!(current->seccomp.landlock_events);
>> +#else
>>         return false;
>> +#endif /* CONFIG_SECCOMP_FILTER */
>> +}
>> +
>> +/**
>> + * landlock_run_prog - run Landlock program for a syscall
>> + *
>> + * @event_idx: event index in the rules array
>> + * @ctx: non-NULL eBPF context
>> + * @events: Landlock events pointer
>> + */
>> +static int landlock_run_prog(u32 event_idx, const struct landlock_context *ctx,
>> +               struct landlock_events *events)
>> +{
>> +       struct landlock_node *node;
>> +
>> +       if (!events)
>> +               return 0;
>> +
>> +       for (node = events->nodes[event_idx]; node; node = node->prev) {
>> +               struct landlock_rule *rule;
>> +
>> +               for (rule = node->rule; rule; rule = rule->prev) {
>> +                       u32 ret;
>> +
>> +                       if (WARN_ON(!rule->prog))
>> +                               continue;
>> +                       rcu_read_lock();
>> +                       ret = BPF_PROG_RUN(rule->prog, (void *)ctx);
>> +                       rcu_read_unlock();
>> +                       if (ret)
>> +                               return -EPERM;
>> +               }
>> +       }
>> +       return 0;
>>  }
>>
>>  static int landlock_decide(enum landlock_subtype_event event,
>> @@ -309,7 +346,10 @@ static int landlock_decide(enum landlock_subtype_event event,
>>                 .arg2 = ctx_values[1],
>>         };
>>
>> -       /* insert manager call here */
>> +#ifdef CONFIG_SECCOMP_FILTER
>> +       ret = landlock_run_prog(event_idx, &ctx,
>> +                       current->seccomp.landlock_events);
>> +#endif /* CONFIG_SECCOMP_FILTER */
>>
>>         return ret;
>>  }
>> diff --git a/security/landlock/manager.c b/security/landlock/manager.c
>> new file mode 100644
>> index 000000000000..00bb2944c85e
>> --- /dev/null
>> +++ b/security/landlock/manager.c
>> @@ -0,0 +1,321 @@
>> +/*
>> + * Landlock LSM - seccomp manager
>> + *
>> + * Copyright © 2017 Mickaël Salaün <mic@digikod.net>
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2, as
>> + * published by the Free Software Foundation.
>> + */
>> +
>> +#include <asm/page.h> /* PAGE_SIZE */
>> +#include <linux/atomic.h> /* atomic_*(), smp_store_release() */
>> +#include <linux/bpf.h> /* bpf_prog_put() */
>> +#include <linux/filter.h> /* struct bpf_prog */
>> +#include <linux/kernel.h> /* round_up() */
>> +#include <linux/landlock.h>
>> +#include <linux/sched.h> /* current_cred(), task_no_new_privs() */
>> +#include <linux/security.h> /* security_capable_noaudit() */
>> +#include <linux/slab.h> /* kzalloc(), kfree() */
>> +#include <linux/types.h> /* atomic_t */
>> +#include <linux/uaccess.h> /* copy_from_user() */
>> +
>> +#include "common.h"
>> +
>> +static void put_landlock_rule(struct landlock_rule *rule)
>> +{
>> +       struct landlock_rule *orig = rule;
>> +
>> +       /* clean up single-reference branches iteratively */
>> +       while (orig && atomic_dec_and_test(&orig->usage)) {
>> +               struct landlock_rule *freeme = orig;
>> +
>> +               bpf_prog_put(orig->prog);
>> +               orig = orig->prev;
>> +               kfree(freeme);
>> +       }
>> +}
>> +
>> +static void put_landlock_node(struct landlock_node *node)
>> +{
>> +       struct landlock_node *orig = node;
>> +
>> +       /* clean up single-reference branches iteratively */
>> +       while (orig && atomic_dec_and_test(&orig->usage)) {
>> +               struct landlock_node *freeme = orig;
>> +
>> +               put_landlock_rule(orig->rule);
>> +               orig = orig->prev;
>> +               kfree(freeme);
>> +       }
>> +}
>> +
>> +void put_landlock_events(struct landlock_events *events)
>> +{
>> +       if (events && atomic_dec_and_test(&events->usage)) {
>> +               size_t i;
>> +
>> +               /* XXX: Do we need to use lockless_dereference() here? */
>> +               for (i = 0; i < ARRAY_SIZE(events->nodes); i++) {
>> +                       if (!events->nodes[i])
>> +                               continue;
>> +                       /* Are we the owner of this node? */
>> +                       if (events->nodes[i]->owner == &events->nodes[i])
>> +                               events->nodes[i]->owner = NULL;
>> +                       put_landlock_node(events->nodes[i]);
>> +               }
>> +               kfree(events);
>> +       }
>> +}
>> +
>> +static struct landlock_events *new_raw_landlock_events(void)
>> +{
>> +       struct landlock_events *ret;
>> +
>> +       /* array filled with NULL values */
>> +       ret = kzalloc(sizeof(*ret), GFP_KERNEL);
>> +       if (!ret)
>> +               return ERR_PTR(-ENOMEM);
>> +       atomic_set(&ret->usage, 1);
>> +       return ret;
>> +}
>> +
>> +static struct landlock_events *new_filled_landlock_events(void)
>> +{
>> +       size_t i;
>> +       struct landlock_events *ret;
>> +
>> +       ret = new_raw_landlock_events();
>> +       if (IS_ERR(ret))
>> +               return ret;
>> +       /*
>> +        * We need to initially allocate every node to be able to update the
>> +        * rules they are pointing to, across every (future) children of the
>> +        * current task.
>> +        */
>> +       for (i = 0; i < ARRAY_SIZE(ret->nodes); i++) {
>> +               struct landlock_node *node;
>> +
>> +               node = kzalloc(sizeof(*node), GFP_KERNEL);
>> +               if (!node)
>> +                       goto put_events;
>> +               atomic_set(&node->usage, 1);
>> +               /* we are the owner of this node */
>> +               node->owner = &ret->nodes[i];
>> +               ret->nodes[i] = node;
>> +       }
>> +       return ret;
>> +
>> +put_events:
>> +       put_landlock_events(ret);
>> +       return ERR_PTR(-ENOMEM);
>> +}
>> +
>> +static void add_landlock_rule(struct landlock_events *events,
>> +               struct landlock_rule *rule)
>> +{
>> +       /* subtype.landlock_rule.event > 0 for loaded programs */
>> +       u32 event_idx = get_index(rule->prog->subtype.landlock_rule.event);
>> +
>> +       rule->prev = events->nodes[event_idx]->rule;
>> +       WARN_ON(atomic_read(&rule->usage));
>> +       atomic_set(&rule->usage, 1);
>> +       /* do not increment the previous rule usage */
>> +       smp_store_release(&events->nodes[event_idx]->rule, rule);
>> +}
>> +
>> +/* Limit Landlock events to 256KB. */
>> +#define LANDLOCK_EVENTS_MAX_PAGES (1 << 6)
>> +
>> +/**
>> + * landlock_append_prog - attach a Landlock rule to @current_events
>> + *
>> + * @current_events: landlock_events pointer, must be locked (if needed) to
>> + *                  prevent a concurrent put/free. This pointer must not be
>> + *                  freed after the call.
>> + * @prog: non-NULL Landlock rule to append to @current_events. @prog will be
>> + *        owned by landlock_append_prog() and freed if an error happened.
>> + *
>> + * Return @current_events or a new pointer when OK. Return a pointer error
>> + * otherwise.
>> + */
>> +static struct landlock_events *landlock_append_prog(
>> +               struct landlock_events *current_events, struct bpf_prog *prog)
>> +{
>> +       struct landlock_events *new_events = current_events;
>> +       unsigned long pages;
>> +       struct landlock_rule *rule;
>> +       u32 event_idx;
>> +
>> +       if (prog->type != BPF_PROG_TYPE_LANDLOCK) {
>> +               new_events = ERR_PTR(-EINVAL);
>> +               goto put_prog;
>> +       }
>> +
>> +       /* validate memory size allocation */
>> +       pages = prog->pages;
>> +       if (current_events) {
>> +               size_t i;
>> +
>> +               for (i = 0; i < ARRAY_SIZE(current_events->nodes); i++) {
>> +                       struct landlock_node *walker_n;
>> +
>> +                       for (walker_n = current_events->nodes[i];
>> +                                       walker_n;
>> +                                       walker_n = walker_n->prev) {
>> +                               struct landlock_rule *walker_r;
>> +
>> +                               for (walker_r = walker_n->rule;
>> +                                               walker_r;
>> +                                               walker_r = walker_r->prev)
>> +                                       pages += walker_r->prog->pages;
>> +                       }
>> +               }
>> +               /* count a struct landlock_events if we need to allocate one */
>> +               if (atomic_read(&current_events->usage) != 1)
>> +                       pages += round_up(sizeof(*current_events), PAGE_SIZE) /
>> +                               PAGE_SIZE;
>> +       }
>> +       if (pages > LANDLOCK_EVENTS_MAX_PAGES) {
>> +               new_events = ERR_PTR(-E2BIG);
>> +               goto put_prog;
>> +       }
>> +
>> +       rule = kzalloc(sizeof(*rule), GFP_KERNEL);
>> +       if (!rule) {
>> +               new_events = ERR_PTR(-ENOMEM);
>> +               goto put_prog;
>> +       }
>> +       rule->prog = prog;
>> +
>> +       /* subtype.landlock_rule.event > 0 for loaded programs */
>> +       event_idx = get_index(rule->prog->subtype.landlock_rule.event);
>> +
>> +       if (!current_events) {
>> +               /* add a new landlock_events, if needed */
>> +               new_events = new_filled_landlock_events();
>> +               if (IS_ERR(new_events))
>> +                       goto put_rule;
>> +               add_landlock_rule(new_events, rule);
>> +       } else {
>> +               if (new_events->nodes[event_idx]->owner ==
>> +                               &new_events->nodes[event_idx]) {
>> +                       /* We are the owner, we can then update the node. */
>> +                       add_landlock_rule(new_events, rule);
>> +               } else if (atomic_read(&current_events->usage) == 1) {
>> +                       WARN_ON(new_events->nodes[event_idx]->owner);
>> +                       /*
>> +                        * We can become the new owner if no other task uses it.
>> +                        * This avoids an unnecessary allocation.
>> +                        */
>> +                       new_events->nodes[event_idx]->owner =
>> +                               &new_events->nodes[event_idx];
>> +                       add_landlock_rule(new_events, rule);
> 
> I still don't get it all, but maybe here to make it simple you have to
> always allocate, since the WARN_ON() suggests it should be scheduled
> to be removed... and avoid to care about whether you are using/freeing
> old rules of the dead task... ?

I'm not sure I get your point, but I will make the inheritance behavior
similar to seccomp filter at first, hence simpler.
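
For readers following the quoted patch, the evaluation order described in
the commit message (a node's rule list first, then the parent node's
list, stopping at the first non-zero return) can be modeled in userspace
roughly as below. This is a sketch only: rules are plain function
pointers instead of eBPF programs, and rule_fn, run_rules() and the demo
helper are illustrative names, not part of the series.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for an eBPF program: returns 0 to continue, non-zero to deny. */
typedef uint32_t (*rule_fn)(const void *ctx);

struct rule {
	rule_fn fn;
	struct rule *prev;    /* previously added rule in the same node */
};

struct node {
	struct rule *rule;    /* newest rule of this node */
	struct node *prev;    /* optional parent node */
};

/* Evaluate every rule from @node up to the root; the first denial wins. */
static int run_rules(const struct node *node, const void *ctx)
{
	for (; node; node = node->prev) {
		const struct rule *rule;

		for (rule = node->rule; rule; rule = rule->prev) {
			if (!rule->fn)
				continue;
			if (rule->fn(ctx))
				return -EPERM;
		}
	}
	return 0;
}

static uint32_t rule_allow(const void *ctx)
{
	(void)ctx;
	return 0;
}

static uint32_t rule_deny(const void *ctx)
{
	(void)ctx;
	return 1;
}

/* A permissive child node cannot override a denial in its parent node. */
void demo_parent_denial(void)
{
	struct rule deny = { rule_deny, NULL };
	struct rule allow = { rule_allow, NULL };
	struct node parent = { &deny, NULL };
	struct node child = { &allow, &parent };
	struct node alone = { &allow, NULL };

	assert(run_rules(&child, NULL) == -EPERM);
	assert(run_rules(&alone, NULL) == 0);
	assert(run_rules(NULL, NULL) == 0);   /* no rules: nothing to deny */
}
```

Since every list is walked until a rule objects, adding a rule anywhere
in the chain can only make the result stricter.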


* Re: [PATCH v5 06/10] seccomp,landlock: Handle Landlock events per process hierarchy
  2017-03-03  0:48             ` Mickaël Salaün
@ 2017-03-03  0:55               ` Andy Lutomirski
  2017-03-03  1:05                 ` Mickaël Salaün
  0 siblings, 1 reply; 26+ messages in thread
From: Andy Lutomirski @ 2017-03-03  0:55 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: linux-kernel, Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Casey Schaufler, Daniel Borkmann, David Drysdale,
	David S . Miller, Eric W . Biederman, James Morris, Jann Horn,
	Jonathan Corbet, Matthew Garrett, Michael Kerrisk, Kees Cook,
	Paul Moore, Sargun Dhillon, Serge E . Hallyn, Shuah Khan,
	Tejun Heo, Thomas Graf, Will Drewry, kernel-hardening, Linux API,
	LSM List, Network Development, Andrew Morton

On Thu, Mar 2, 2017 at 4:48 PM, Mickaël Salaün <mic@digikod.net> wrote:
>
> On 02/03/2017 17:36, Andy Lutomirski wrote:
>> On Wed, Mar 1, 2017 at 3:28 PM, Mickaël Salaün <mic@digikod.net> wrote:
>>>
>>>
>>> On 01/03/2017 23:20, Andy Lutomirski wrote:
>>>> On Wed, Mar 1, 2017 at 2:14 PM, Mickaël Salaün <mic@digikod.net> wrote:
>>>>>
>>>>> On 28/02/2017 21:01, Andy Lutomirski wrote:
>>>>>> On Tue, Feb 21, 2017 at 5:26 PM, Mickaël Salaün <mic@digikod.net> wrote:
>>>>> This design makes it possible for a process to add more constraints to
>>>>> its children on the fly. I think it is a good feature to have and a
>>>>> safer default inheritance mechanism, but it could be guarded by an
>>>>> option flag if we want both mechanisms to be available. The same design
>>>>> could be used by seccomp filter too.
>>>>>
>>>>
>>>> Then let's do it right.
>>>>
>>>> Currently each task has an array of seccomp filter layers.  When a
>>>> task forks, the child inherits the layers.  All the layers are
>>>> presently immutable.  With Landlock, a layer can logically be a
>>>> syscall filter layer or a Landlock layer.  This fits into the
>>>> existing model just fine.
>>>>
>>>> If we want to have an interface to allow modification of an existing
>>>> layer, let's make it so that, when a layer is added, you have to
>>>> specify a flag to make the layer modifiable (by current, presumably,
>>>> although I can imagine other policies down the road).  Then have a
>>>> separate API that modifies a layer.
>>>>
>>>> IOW, I think your patch is bad for three reasons, all fixable:
>>>>
>>>> 1. The default is wrong.  A layer should be immutable to avoid an easy
>>>> attack in which you try to sandbox *yourself* and then you just modify
>>>> the layer to weaken it.
>>>
>>> This is not possible; there is only one operation for now:
>>> SECCOMP_ADD_LANDLOCK_RULE. You can only add more rules to the list (as
>>> for seccomp filter). There is no way to weaken a sandbox. The question
>>> is: how do we want to handle the rules *tree* (from the kernel point of
>>> view)?
>>>
>>
>> Fair enough.  But I still think that immutability (like regular
>> seccomp) should be the default.  For security, simplicity is
>> important.  I guess there could be two ways to relax immutability:
>> allowing making the layer stricter and allowing any change at all.
>>
>> As a default, though, programs should be able to expect that:
>>
>> seccomp(SECCOMP_ADD_WHATEVER, ...);
>> fork();
>>
>> [parent gets compromised]
>> [in parent]seccomp(anything whatsoever);
>>
>> will not affect the child.  If the parent wants to relax that, that's
>> fine, but I think it should be explicit.
>
> Good point. However, the term "immutability" doesn't quite fit because
> the process is still allowed to add more rules to itself (as for
> seccomp). The difference lies in the way a rule may be "appended" (by
> the current process) or "inserted" (by a parent process).
>
> I think three or four kinds of operations (through the seccomp syscall)
> make sense:
> * append a rule (for the current process and its future children)

Sure, but this operation should *never* affect existing children,
existing seccomp layers, existing nodes, etc.  It should affect
current and future children only.  Or it could simply not exist for
Landlock and instead you'd have to add a layer (see below) and then
program that layer.

> * add a node (insert point), to which the inserted rules will be tied
> * insert a rule in the node, which will be inherited by future children

I would advocate calling this a "seccomp layer" and making creation
and manipulation of them generic.

> * (maybe a "lock" command to make a layer immutable for the current
> process and its children)

Hmm, maybe.

>
> Doing so, a process is only allowed to insert a rule if a node was
> previously added. To prevent itself from inserting new rules into one of
> its children, a process just needs to not add a node before forking. This
> way, there is no need for special rule flags or default behavior;
> everything is explicit.

This is still slightly too complicated.  If you really want an
operation that adds a layer (please don't call it a node in the ABI)
and adds a rule to that layer in a single operation, it should be a
separate operation.  Please make everything explicit.

(I don't like exposing the word "node" to userspace because it means
nothing.  Having more than one layer of filter makes sense to me,
which is why I like "layer".  I'm sure that other good choices exist.)
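
To illustrate the request for explicit, separate operations, here is a
rough userspace sketch in which "add a layer" and "add a rule to an
existing layer" are distinct commands, and the second one fails instead
of silently creating a layer. All names (sandbox_ctl(), SANDBOX_ADD_LAYER,
SANDBOX_ADD_RULE) and the limits are made up for the example; this is not
a proposed ABI.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical operations; not part of the seccomp(2) ABI. */
enum sandbox_op {
	SANDBOX_ADD_LAYER,
	SANDBOX_ADD_RULE,
};

#define MAX_LAYERS 8
#define MAX_RULES  16

struct sandbox {
	size_t nr_layers;
	size_t nr_rules[MAX_LAYERS];
};

/*
 * Each operation does exactly one thing.  Returns the new layer index for
 * SANDBOX_ADD_LAYER, 0 for SANDBOX_ADD_RULE, or a negative errno value.
 */
static int sandbox_ctl(struct sandbox *sb, enum sandbox_op op, size_t layer)
{
	switch (op) {
	case SANDBOX_ADD_LAYER:
		if (sb->nr_layers == MAX_LAYERS)
			return -E2BIG;
		sb->nr_rules[sb->nr_layers] = 0;
		return (int)sb->nr_layers++;
	case SANDBOX_ADD_RULE:
		/* Error out rather than create the layer implicitly. */
		if (layer >= sb->nr_layers)
			return -ENOENT;
		if (sb->nr_rules[layer] == MAX_RULES)
			return -E2BIG;
		sb->nr_rules[layer]++;
		return 0;
	default:
		return -EINVAL;
	}
}

/* "Do what I say and error out if I said nonsense." */
void demo_explicit_ops(void)
{
	struct sandbox sb = { 0 };

	assert(sandbox_ctl(&sb, SANDBOX_ADD_RULE, 0) == -ENOENT);
	assert(sandbox_ctl(&sb, SANDBOX_ADD_LAYER, 0) == 0);
	assert(sandbox_ctl(&sb, SANDBOX_ADD_RULE, 0) == 0);
	assert(sandbox_ctl(&sb, SANDBOX_ADD_RULE, 1) == -ENOENT);
	assert(sandbox_ctl(&sb, (enum sandbox_op)42, 0) == -EINVAL);
	assert(sb.nr_layers == 1 && sb.nr_rules[0] == 1);
}
```

The caller always knows which of the two effects it asked for, so a
reviewer can reason about each operation on its own.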

>
> For this series, I will stick to the same behavior as seccomp filter:
> only append rules to the current process (and its future children).
>
>
>>>> 2. The API that adds a layer should be different from the API that
>>>> modifies a layer.
>>>
>>> Right, but it doesn't apply now because we can only add rules.
>>
>> That's not what the code appears to do, though.  Sometimes it makes a
>> new layer without modifying tasks that share the layer and sometimes
>> it modifies the layer.
>>
>> Both operations are probably okay, but they're not the same operation
>> and they shouldn't pretend to be.
>
> It should be OK with my previous proposal. The other details could be
> discussed in a separate future patch series.
>

NAK, or at least NAK pending better docs and justification.  The
operations of "add a layer and put a rule in it" and "add a rule to an
existing layer" are logically different and should not be the same
SECCOMP operation.  "Do what I mean" is a nice paradigm for a language
like Perl, but for security (and for kernel interfaces in general),
"do what I say and error out if I said nonsense" is much safer.

* Re: [PATCH v5 06/10] seccomp,landlock: Handle Landlock events per process hierarchy
  2017-03-03  0:55               ` Andy Lutomirski
@ 2017-03-03  1:05                 ` Mickaël Salaün
  0 siblings, 0 replies; 26+ messages in thread
From: Mickaël Salaün @ 2017-03-03  1:05 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: linux-kernel, Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Casey Schaufler, Daniel Borkmann, David Drysdale,
	David S . Miller, Eric W . Biederman, James Morris, Jann Horn,
	Jonathan Corbet, Matthew Garrett, Michael Kerrisk, Kees Cook,
	Paul Moore, Sargun Dhillon, Serge E . Hallyn, Shuah Khan,
	Tejun Heo, Thomas Graf, Will Drewry, kernel-hardening, Linux API,
	LSM List, Network Development, Andrew Morton

On 03/03/2017 01:55, Andy Lutomirski wrote:
> On Thu, Mar 2, 2017 at 4:48 PM, Mickaël Salaün <mic@digikod.net> wrote:
>>
>> On 02/03/2017 17:36, Andy Lutomirski wrote:
>>> On Wed, Mar 1, 2017 at 3:28 PM, Mickaël Salaün <mic@digikod.net> wrote:
>>>>
>>>>
>>>> On 01/03/2017 23:20, Andy Lutomirski wrote:
>>>>> On Wed, Mar 1, 2017 at 2:14 PM, Mickaël Salaün <mic@digikod.net> wrote:
>>>>>>
>>>>>> On 28/02/2017 21:01, Andy Lutomirski wrote:
>>>>>>> On Tue, Feb 21, 2017 at 5:26 PM, Mickaël Salaün <mic@digikod.net> wrote:
>>>>>> This design makes it possible for a process to add more constraints to
>>>>>> its children on the fly. I think it is a good feature to have and a
>>>>>> safer default inheritance mechanism, but it could be guarded by an
>>>>>> option flag if we want both mechanisms to be available. The same design
>>>>>> could be used by seccomp filter too.
>>>>>>
>>>>>
>>>>> Then let's do it right.
>>>>>
>>>>> Currently each task has an array of seccomp filter layers.  When a
>>>>> task forks, the child inherits the layers.  All the layers are
>>>>> presently immutable.  With Landlock, a layer can logically be a
>>>>> syscall filter layer or a Landlock layer.  This fits into the
>>>>> existing model just fine.
>>>>>
>>>>> If we want to have an interface to allow modification of an existing
>>>>> layer, let's make it so that, when a layer is added, you have to
>>>>> specify a flag to make the layer modifiable (by current, presumably,
>>>>> although I can imagine other policies down the road).  Then have a
>>>>> separate API that modifies a layer.
>>>>>
>>>>> IOW, I think your patch is bad for three reasons, all fixable:
>>>>>
>>>>> 1. The default is wrong.  A layer should be immutable to avoid an easy
>>>>> attack in which you try to sandbox *yourself* and then you just modify
>>>>> the layer to weaken it.
>>>>
>>>> This is not possible; there is only one operation for now:
>>>> SECCOMP_ADD_LANDLOCK_RULE. You can only add more rules to the list (as
>>>> for seccomp filter). There is no way to weaken a sandbox. The question
>>>> is: how do we want to handle the rules *tree* (from the kernel point of
>>>> view)?
>>>>
>>>
>>> Fair enough.  But I still think that immutability (like regular
>>> seccomp) should be the default.  For security, simplicity is
>>> important.  I guess there could be two ways to relax immutability:
>>> allowing making the layer stricter and allowing any change at all.
>>>
>>> As a default, though, programs should be able to expect that:
>>>
>>> seccomp(SECCOMP_ADD_WHATEVER, ...);
>>> fork();
>>>
>>> [parent gets compromised]
>>> [in parent]seccomp(anything whatsoever);
>>>
>>> will not affect the child.  If the parent wants to relax that, that's
>>> fine, but I think it should be explicit.
>>
>> Good point. However, the term "immutability" doesn't quite fit because
>> the process is still allowed to add more rules to itself (as for
>> seccomp). The difference lies in the way a rule may be "appended" (by
>> the current process) or "inserted" (by a parent process).
>>
>> I think three or four kinds of operations (through the seccomp syscall)
>> make sense:
>> * append a rule (for the current process and its future children)
> 
> Sure, but this operation should *never* affect existing children,
> existing seccomp layers, existing nodes, etc.  It should affect
> current and future children only.  Or it could simply not exist for
> Landlock and instead you'd have to add a layer (see below) and then
> program that layer.
> 
>> * add a node (insert point), from which the inserted rules will be tied
>> * insert a rule in the node, which will be inherited by future children
> 
> I would advocate calling this a "seccomp layer" and making creation
> and manipulation of them generic.
> 
>> * (maybe a "lock" command to make a layer immutable for the current
>> process and its children)
> 
> Hmm, maybe.
> 
>>
>> Doing so, a process is only allowed to insert a rule if a node was
>> previously added. To prevent itself from inserting new rules into one
>> of its children, a process just needs to not add a node before forking.
>> This way, there is no need for special rule flags or a default
>> behavior; everything is explicit.
> 
> This is still slightly too complicated.  If you really want an
> operation that adds a layer (please don't call it a node in the ABI)
> and adds a rule to that layer in a single operation, it should be a
> separate operation.  Please make everything explicit.
> 
> (I don't like exposing the word "node" to userspace because it means
> nothing.  Having more than one layer of filter makes sense to me,
> which is why I like "layer".  I'm sure that other good choices exist.)

I'll keep that for a future discussion; I'm now convinced that simple
(seccomp-like) inheritance doesn't block future extensions.
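
To make the seccomp-like inheritance concrete, here is a toy userspace
model (not kernel code, and not the Landlock implementation). It assumes
that each rule points to the older rule and is never modified after
creation, so a child that copied the chain head at fork time is not
affected by rules the parent appends later. All names (struct rule,
struct task, task_append_rule, task_fork, task_rule_count) are
hypothetical, and the model never frees memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy model only: every name here is hypothetical, not a kernel symbol. */
struct rule {
	int id;                  /* stand-in for an eBPF program */
	const struct rule *prev; /* older rule; never changed after creation */
};

struct task {
	const struct rule *rules; /* newest rule enforced on this task */
};

/* Append a rule to this task only. Returns 0, or -1 if allocation fails. */
static int task_append_rule(struct task *t, int id)
{
	struct rule *r;

	assert(t != NULL);
	r = malloc(sizeof(*r));
	if (!r)
		return -1;
	r->id = id;
	r->prev = t->rules; /* existing chain is shared, not modified */
	t->rules = r;
	return 0;
}

/* Model of fork(): the child starts from the parent's current chain. */
static struct task task_fork(const struct task *parent)
{
	struct task child;

	assert(parent != NULL);
	child.rules = parent->rules;
	return child;
}

/* Count the rules currently enforced on a task. */
static size_t task_rule_count(const struct task *t)
{
	size_t n = 0;
	const struct rule *r;

	assert(t != NULL);
	for (r = t->rules; r; r = r->prev)
		n++;
	return n;
}
```

In this model, appending a rule in the parent after task_fork() grows
only the parent's chain; the child keeps the chain it had at fork time.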

> 
>>
>> For this series, I will stick to the same behavior as seccomp filter:
>> only append rules to the current process (and its future children).
>>
>>
>>>>> 2. The API that adds a layer should be different from the API that
>>>>> modifies a layer.
>>>>
>>>> Right, but it doesn't apply now because we can only add rules.
>>>
>>> That's not what the code appears to do, though.  Sometimes it makes a
>>> new layer without modifying tasks that share the layer and sometimes
>>> it modifies the layer.
>>>
>>> Both operations are probably okay, but they're not the same operation
>>> and they shouldn't pretend to be.
>>
>> It should be OK with my previous proposal. The other details could be
>> discussed in a separate future patch series.
>>
> 
> NAK, or at least NAK pending better docs and justification.  The
> operations of "add a layer and put a rule in it" and "add a rule to an
> existing layer" are logically different and should not be the same
> SECCOMP operation.

We agree.

> "Do what I mean" is a nice paradigm for a language
> like Perl, but for security (and for kernel interfaces in general),
> "do what I say and error out if I said nonsense" is much safer.
> 

Totally agree.
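
As an illustration of keeping the two operations distinct, here is a
small userspace sketch (not the seccomp or Landlock ABI). It assumes a
fixed-size table of layers, with one function that only creates a layer
and another that only adds a rule to an existing layer and returns
-EINVAL when the layer does not exist. The names (struct layer, struct
sandbox, sandbox_add_layer, sandbox_add_rule) and the limits are
hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_LAYERS 4 /* arbitrary limits for the sketch */
#define MAX_RULES  8

struct layer {
	int rules[MAX_RULES]; /* stand-ins for eBPF programs */
	size_t nr_rules;
};

struct sandbox {
	struct layer layers[MAX_LAYERS];
	size_t nr_layers;
};

/*
 * Operation 1: create a new, empty layer.
 * Returns the layer index, or -ENOSPC. Never touches existing layers.
 */
static int sandbox_add_layer(struct sandbox *sb)
{
	assert(sb != NULL);
	if (sb->nr_layers == MAX_LAYERS)
		return -ENOSPC;
	sb->layers[sb->nr_layers].nr_rules = 0;
	return (int)sb->nr_layers++;
}

/*
 * Operation 2: add a rule to an existing layer.
 * Returns 0, -EINVAL if the layer does not exist (it is never created
 * implicitly), or -ENOSPC if the layer is full.
 */
static int sandbox_add_rule(struct sandbox *sb, int layer, int rule)
{
	struct layer *l;

	assert(sb != NULL);
	if (layer < 0 || (size_t)layer >= sb->nr_layers)
		return -EINVAL;
	l = &sb->layers[layer];
	if (l->nr_rules == MAX_RULES)
		return -ENOSPC;
	l->rules[l->nr_rules++] = rule;
	return 0;
}
```

A caller that wants "add a layer and put a rule in it" has to call both
functions explicitly; adding a rule to a layer that was never created
fails instead of guessing what was meant.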


Thread overview: 26+ messages
2017-02-22  1:26 [PATCH v5 00/10] Landlock LSM: Toward unprivileged sandboxing Mickaël Salaün
2017-02-22  1:26 ` [PATCH v5 01/10] bpf: Add eBPF program subtype and is_valid_subtype() verifier Mickaël Salaün
2017-02-22  1:26 ` [PATCH v5 02/10] bpf,landlock: Define an eBPF program type for Landlock Mickaël Salaün
2017-02-22  1:26 ` [PATCH v5 03/10] bpf: Define handle_fs and add a new helper bpf_handle_fs_get_mode() Mickaël Salaün
2017-03-01  9:32   ` James Morris
2017-03-01 22:20     ` Mickaël Salaün
2017-02-22  1:26 ` [PATCH v5 04/10] landlock: Add LSM hooks related to filesystem Mickaël Salaün
2017-02-22  1:26 ` [PATCH v5 05/10] seccomp: Split put_seccomp_filter() with put_seccomp() Mickaël Salaün
2017-02-22  1:26 ` [PATCH v5 06/10] seccomp,landlock: Handle Landlock events per process hierarchy Mickaël Salaün
2017-02-28 20:01   ` Andy Lutomirski
2017-03-01 22:14     ` Mickaël Salaün
2017-03-01 22:20       ` Andy Lutomirski
2017-03-01 23:28         ` Mickaël Salaün
2017-03-02 16:36           ` Andy Lutomirski
2017-03-03  0:48             ` Mickaël Salaün
2017-03-03  0:55               ` Andy Lutomirski
2017-03-03  1:05                 ` Mickaël Salaün
2017-03-02 10:22   ` [kernel-hardening] " Djalal Harouni
2017-03-03  0:54     ` Mickaël Salaün
2017-02-22  1:26 ` [PATCH v5 07/10] bpf: Add a Landlock sandbox example Mickaël Salaün
2017-02-23 22:13   ` Mickaël Salaün
2017-02-22  1:26 ` [PATCH v5 08/10] seccomp: Enhance test_harness with an assert step mechanism Mickaël Salaün
2017-02-22  1:26 ` [PATCH v5 09/10] bpf,landlock: Add tests for Landlock Mickaël Salaün
2017-02-22  1:26 ` [PATCH v5 10/10] landlock: Add user and kernel documentation " Mickaël Salaün
2017-02-22  5:21   ` Andy Lutomirski
2017-02-22  7:43     ` Mickaël Salaün
