On 27/08/2016 09:40, Andy Lutomirski wrote:
> On Thu, Aug 25, 2016 at 3:32 AM, Mickaël Salaün wrote:
>> Hi,
>>
>> This series is a proof of concept to fill some missing part of seccomp as the
>> ability to check syscall argument pointers or creating more dynamic security
>> policies. The goal of this new stackable Linux Security Module (LSM) called
>> Landlock is to allow any process, including unprivileged ones, to create
>> powerful security sandboxes comparable to the Seatbelt/XNU Sandbox or the
>> OpenBSD Pledge. This kind of sandbox help to mitigate the security impact of
>> bugs or unexpected/malicious behaviors in userland applications.
>>
>> The first RFC [1] was focused on extending seccomp while staying at the syscall
>> level. This brought a working PoC but with some (mitigated) ToCToU race
>> conditions due to the seccomp ptrace hole (now fixed) and the non-atomic
>> syscall argument evaluation (hence the LSM hooks).
>>
>>
>> # Landlock LSM
>>
>> This second RFC is a fresh revamp of the code while keeping some working ideas.
>> This series is mainly focused on LSM hooks, while keeping the possibility to
>> tied them to syscalls. This new code removes all race conditions by design. It
>> now use eBPF instead of a subset of cBPF (as used by seccomp-bpf). This allow
>> to remove the previous stacked cBPF hack to do complex access checks thanks to
>> dedicated eBPF functions. An eBPF program is still very limited (i.e. can only
>> call a whitelist of functions) and can not do a denial of service (i.e. no
>> loop). The other major improvement is the replacement of the previous custom
>> checker groups of syscall arguments with a new dedicated eBPF map to collect
>> and compare Landlock handles with system resources (e.g. files or network
>> connections).
>>
>> The approach taken is to add the minimum amount of code while still allowing
>> the userland to create quite complex access rules.
>> A dedicated security policy language such as used by SELinux, AppArmor and
>> other major LSMs is a lot of code and dedicated to a trusted process (i.e.
>> root/administrator).
>>
>
> I think there might be a problem with the current design.  If I add a
> seccomp filter that uses RET_LANDLOCK and some landlock filters, what
> happens if a second seccomp filter *also* uses RET_LANDLOCK?  I think
> they'll interfere with each other.  It might end up being necessary to
> require only one landlock seccomp layer at a time or to find a way to
> stick all the filters in a layer together with the LSM callbacks or
> maybe to just drop RET_LANDLOCK and let the callbacks look at the
> syscall args.

This is handled correctly. For each RET_LANDLOCK, if there are one or more
associated Landlock programs (i.e. programs created by the same thread after
this seccomp filter), one instance of each of those Landlock programs is run
for each seccomp filter that triggers them. This way, each cookie linked to a
RET_LANDLOCK is evaluated exactly once by each relevant Landlock program.

Example of a thread that loaded multiple seccomp filters (SF) and multiple
Landlock programs (LP), each associated with one LSM hook, in this order:

  SF0, SF1, LP0(file_open), SF2, LP1(file_open), LP2(file_permission)

* If SF0 returns RET_LANDLOCK(cookie0), then LP0 and LP1 are run with cookie0
  if the current syscall triggers the file_open hook, and LP2 is run with
  cookie0 if the syscall triggers the file_permission hook.

* In addition to the previous case, if SF1 returns RET_LANDLOCK(cookie1), then
  LP0 and LP1 are run with cookie1 if the current syscall triggers the
  file_open hook, and LP2 is run with cookie1 if the syscall triggers the
  file_permission hook.

* In addition to the previous cases, if SF2 returns RET_LANDLOCK(cookie2), then
  (only) LP1 is run with cookie2 if the current syscall triggers the file_open
  hook, and LP2 is run with cookie2 if the syscall triggers the
  file_permission hook.
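
To make the association rule above easier to check, here is a small userland
model in Python. It is only a sketch of the rule as described in this mail, not
kernel code: the class names, the `landlock_runs()` helper and the order in
which the (program, cookie) pairs are listed are all assumptions made for
illustration. The only rule it encodes is that a Landlock program is associated
with the seccomp filters loaded before it, and runs once per cookie returned by
those filters when its hook is triggered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SeccompFilter:
    """Hypothetical stand-in for a loaded seccomp filter (SF)."""
    name: str


@dataclass(frozen=True)
class LandlockProgram:
    """Hypothetical stand-in for a Landlock program (LP) tied to one LSM hook."""
    name: str
    hook: str


def landlock_runs(load_order, returned_cookies, hook):
    """Return the (program name, cookie) pairs run when `hook` is triggered.

    load_order:       filters and programs in the order the thread loaded them.
    returned_cookies: maps the name of each filter that returned RET_LANDLOCK
                      for the current syscall to the cookie it returned.
    hook:             name of the LSM hook triggered by the current syscall.
    """
    runs = []
    for index, item in enumerate(load_order):
        if not isinstance(item, SeccompFilter):
            continue
        if item.name not in returned_cookies:
            continue  # this filter did not return RET_LANDLOCK
        cookie = returned_cookies[item.name]
        # Only programs loaded *after* this filter are associated with it.
        for later in load_order[index + 1:]:
            if isinstance(later, LandlockProgram) and later.hook == hook:
                runs.append((later.name, cookie))
    return runs


# Load order from the example: SF0, SF1, LP0, SF2, LP1, LP2.
LOAD_ORDER = [
    SeccompFilter("SF0"),
    SeccompFilter("SF1"),
    LandlockProgram("LP0", "file_open"),
    SeccompFilter("SF2"),
    LandlockProgram("LP1", "file_open"),
    LandlockProgram("LP2", "file_permission"),
]

if __name__ == "__main__":
    cookies = {"SF0": "cookie0", "SF1": "cookie1", "SF2": "cookie2"}
    for hook in ("file_open", "file_permission"):
        print(hook, landlock_runs(LOAD_ORDER, cookies, hook))
```

With all three filters returning RET_LANDLOCK, the model runs LP0 with cookie0
and cookie1 only, since LP0 was loaded before SF2, while LP1 and LP2 see all
three cookies on their respective hooks.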
>
> BTW, what happens if an LSM hook is called outside a syscall context,
> e.g. from a page fault?

Good catch! For now, only a syscall can trigger an LSM hook because of the
RET_LANDLOCK constraint. It may be wise to also trigger the hooks without a
cookie and to add a dedicated variable in the eBPF context.

>
>>
>>
>> # Sandbox example with conditional access control depending on cgroup
>>
>>   $ mkdir /sys/fs/cgroup/sandboxed
>>   $ ls /home
>>   user1
>>   $ LANDLOCK_CGROUPS='/sys/fs/cgroup/sandboxed' \
>>       LANDLOCK_ALLOWED='/bin:/lib:/usr:/tmp:/proc/self/fd/0' \
>>       ./sandbox /bin/sh -i
>>   $ ls /home
>>   user1
>>   $ echo $$ > /sys/fs/cgroup/sandboxed/cgroup.procs
>>   $ ls /home
>>   ls: cannot open directory '/home': Permission denied
>>
>
> Something occurs to me that isn't strictly relevant to landlock but
> may be relevant to unprivileged cgroups: can you cause trouble by
> setting up a nastily-configured cgroup and running a setuid program in
> it?
>

I hope not… But the use of cgroups should not be mandatory for Landlock.