On 2018-09-27, Tycho Andersen wrote: > This patch introduces a means for syscalls matched in seccomp to notify > some other task that a particular filter has been triggered. > > The motivation for this is primarily for use with containers. For example, > if a container does an init_module(), we obviously don't want to load this > untrusted code, which may be compiled for the wrong version of the kernel > anyway. Instead, we could parse the module image, figure out which module > the container is trying to load and load it on the host. > > As another example, containers cannot mknod(), since this checks > capable(CAP_SYS_ADMIN). However, harmless devices like /dev/null or > /dev/zero should be ok for containers to mknod, but we'd like to avoid hard > coding some whitelist in the kernel. Another example is mount(), which has > many security restrictions for good reason, but configuration or runtime > knowledge could potentially be used to relax these restrictions. Minor thing, but this is no longer _entirely_ true (now it checks ns_capable(sb->s_user_ns)). I think the kernel module auto-loading is a much more interesting example, but since this is just a commit message feel free to ignore my pedantry. :P > Signed-off-by: Tycho Andersen > CC: Kees Cook > CC: Andy Lutomirski > CC: Oleg Nesterov > CC: Eric W. Biederman > CC: "Serge E. Hallyn" > CC: Christian Brauner > CC: Tyler Hicks > CC: Akihiro Suda Would you mind adding me to the Cc: list for the next round of patches? It's looking pretty neat! Thanks! -- Aleksa Sarai Senior Software Engineer (Containers) SUSE Linux GmbH