From: Michael Tirado <mtirado418@gmail.com>
To: Kees Cook <keescook@chromium.org>
Cc: LKML <linux-kernel@vger.kernel.org>
Subject: Re: eBPF / seccomp globals?
Date: Thu, 10 Sep 2015 21:55:33 +0000
Message-ID: <CAMkWEXM2pyuOx8VhGLQUQU=7jLuB4ifNydu4egK2k-HnWf6h1w@mail.gmail.com>
In-Reply-To: <CAGXu5jJ4nY36M1xLXMe99YOpE1cABWBk7UchPpzz9EyW4YUAxw@mail.gmail.com>

On Fri, Sep 4, 2015 at 8:37 PM, Kees Cook <keescook@chromium.org> wrote:
>
> Do you still need file capabilities with the availability of the new
> ambient capabilities?
>
> https://s3hh.wordpress.com/2015/07/25/ambient-capabilities/
> http://thread.gmane.org/gmane.linux.kernel.lsm/24034

Ah.. thanks for the info on this, my launcher program could use ambient
capabilities if whoever invoked it already has that capability. I am
trying to have the new environment explicitly defined as a white list,
and avoid any type of privilege escalation not already granted by root
user either by filesystem mechanisms (setuid / file caps) or
inheritable caps. I would still like to be able to launch programs
with file capabilities since we can lock those down with capability
bounding set, and maybe even setuid binaries too (with a hefty warning
message). This rules out LD_PRELOAD for me, and also some linkers may
not support it at all.

> On the TODO list is
> doing deep argument inspection, but it is not an easy thing to get
> right. :)

Yes, please do not rush such a thing!! It might even be a can of
worms not worth opening.

In case anyone is wondering what I am doing for-now(tm) while waiting
for eBPF map support, or some other way to deal with this problem: I
have crafted a very hacky patch to work around the issue that will
allow 2 system calls to pass through before the filter program is run.
I'm lazily using google webmail so, sorry if the tabs are missing :(

From: Michael R. Tirado <mtirado418@gmail.com>
Date: Thu, 10 Sep 2015 08:28:41 +0000
Subject: [PATCH] Add new seccomp filter mode + flag to allow two syscalls to pass before the filter is run.

allows a launcher program to setuid(drop caps) and exec if those two
privileges are not granted in seccomp filter whitelist.

DISCLAIMER: I am doing this as a quick temporary workaround to this
complex problem. Also, there may be a more efficient way to implement it
instead of branching in the filter loop.
---
 include/linux/seccomp.h      |  2 +-
 include/uapi/linux/seccomp.h |  2 ++
 kernel/seccomp.c             | 23 ++++++++++++++++++++---
 3 files changed, 23 insertions(+), 4 deletions(-)

diff --git a/include/linux/seccomp.h b/include/linux/seccomp.h
index a19ddac..5547448c 100644
--- a/include/linux/seccomp.h
+++ b/include/linux/seccomp.h
@@ -3,7 +3,7 @@

 #include <uapi/linux/seccomp.h>

-#define SECCOMP_FILTER_FLAG_MASK (SECCOMP_FILTER_FLAG_TSYNC)
+#define SECCOMP_FILTER_FLAG_MASK (SECCOMP_FILTER_FLAG_TSYNC | SECCOMP_FILTER_FLAG_DEFERRED)

 #ifdef CONFIG_SECCOMP

diff --git a/include/uapi/linux/seccomp.h b/include/uapi/linux/seccomp.h
index 0f238a4..43a8fb8 100644
--- a/include/uapi/linux/seccomp.h
+++ b/include/uapi/linux/seccomp.h
@@ -9,6 +9,7 @@
 #define SECCOMP_MODE_DISABLED 0 /* seccomp is not in use. */
 #define SECCOMP_MODE_STRICT 1 /* uses hard-coded filter. */
 #define SECCOMP_MODE_FILTER 2 /* uses user-supplied filter. */
+#define SECCOMP_MODE_FILTER_DEFERRED 3 /* sets filter mode + deferred flag */

 /* Valid operations for seccomp syscall. */
 #define SECCOMP_SET_MODE_STRICT 0
@@ -16,6 +17,7 @@

 /* Valid flags for SECCOMP_SET_MODE_FILTER */
 #define SECCOMP_FILTER_FLAG_TSYNC 1
+#define SECCOMP_FILTER_FLAG_DEFERRED 2 /* grant two unfiltered syscalls */

 /*
  * All BPF programs must return a 32-bit value.
diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index 245df6b..dc2a5af 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -58,6 +58,7 @@ struct seccomp_filter {
 	atomic_t usage;
 	struct seccomp_filter *prev;
 	struct bpf_prog *prog;
+	unsigned int deferred;
 };

 /* Limit any path through the tree to 256KB worth of instructions. */
@@ -196,7 +197,12 @@ static u32 seccomp_run_filters(struct seccomp_data *sd)
 	 * value always takes priority (ignoring the DATA).
 	 */
 	for (; f; f = f->prev) {
-		u32 cur_ret = BPF_PROG_RUN(f->prog, (void *)sd);
+		u32 cur_ret;
+		if (unlikely(f->deferred)) {
+			--f->deferred;
+			continue;
+		}
+		cur_ret = BPF_PROG_RUN(f->prog, (void *)sd);

 		if ((cur_ret & SECCOMP_RET_ACTION) < (ret & SECCOMP_RET_ACTION))
 			ret = cur_ret;
@@ -444,6 +450,14 @@ static long seccomp_attach_filter(unsigned int flags,
 	}

 	/*
+	 * in certain cases we may wish to defer filtering, and allow some
+	 * syscalls. eg, a launcher program will setuid(drop caps) then exec.
+	 */
+	if (flags & SECCOMP_FILTER_FLAG_DEFERRED) {
+		filter->deferred = 2;
+	}
+
+	/*
 	 * If there is an existing filter, make it the prev and don't drop its
 	 * task reference.
 	 */
@@ -838,6 +852,7 @@ long prctl_set_seccomp(unsigned long seccomp_mode, char __user *filter)
 {
 	unsigned int op;
 	char __user *uargs;
+	unsigned int flags = 0;

 	switch (seccomp_mode) {
 	case SECCOMP_MODE_STRICT:
@@ -849,6 +864,9 @@ long prctl_set_seccomp(unsigned long seccomp_mode, char __user *filter)
 		 */
 		uargs = NULL;
 		break;
+	/* set flag, older kernels lack seccomp syscall */
+	case SECCOMP_MODE_FILTER_DEFERRED:
+		flags = SECCOMP_FILTER_FLAG_DEFERRED;
 	case SECCOMP_MODE_FILTER:
 		op = SECCOMP_SET_MODE_FILTER;
 		uargs = filter;
@@ -857,6 +875,5 @@ long prctl_set_seccomp(unsigned long seccomp_mode, char __user *filter)
 		return -EINVAL;
 	}

-	/* prctl interface doesn't have flags, so they are always zero. */
-	return do_seccomp(op, 0, uargs);
+	return do_seccomp(op, flags, uargs);
 }
--
1.8.4
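
To make the behaviour of the deferred counter concrete, here is a minimal
userspace model of the patched loop in seccomp_run_filters(). It is only a
sketch, not kernel code: struct model_filter, model_run_filters(), deny_all()
and the MODEL_RET_* names are hypothetical stand-ins, and a plain function
pointer replaces the BPF program. The numeric return values are assumed to
match SECCOMP_RET_KILL, SECCOMP_RET_ALLOW and SECCOMP_RET_ACTION from the
uapi header of that era.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror the uapi SECCOMP_RET_* values; names are hypothetical. */
#define MODEL_RET_KILL   0x00000000U
#define MODEL_RET_ALLOW  0x7fff0000U
#define MODEL_RET_ACTION 0x7fff0000U

/* Hypothetical userspace stand-in for struct seccomp_filter. */
struct model_filter {
	struct model_filter *prev;   /* older filter in the chain, or NULL */
	uint32_t (*prog)(int nr);    /* replaces the BPF program */
	unsigned int deferred;       /* remaining runs to skip */
};

/*
 * Same shape as the patched loop: a filter with a non-zero deferred
 * count is skipped and its count decremented; otherwise its program
 * runs and the lowest action value wins.
 */
static uint32_t model_run_filters(struct model_filter *f, int nr)
{
	uint32_t ret = MODEL_RET_ALLOW;

	for (; f; f = f->prev) {
		uint32_t cur_ret;

		if (f->deferred) {
			--f->deferred;
			continue;
		}
		cur_ret = f->prog(nr);
		if ((cur_ret & MODEL_RET_ACTION) < (ret & MODEL_RET_ACTION))
			ret = cur_ret;
	}
	return ret;
}

/* A program that rejects every syscall, to make the skip visible. */
static uint32_t deny_all(int nr)
{
	(void)nr;
	return MODEL_RET_KILL;
}
```

With a single deny_all filter whose deferred count starts at 2, the first
two calls to model_run_filters() return MODEL_RET_ALLOW whatever syscall
number is passed, and the third reaches the program. The model, like the
patch, counts filter runs, so any two syscalls pass, not specifically
setuid and exec.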