From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755877Ab2AQVJV (ORCPT ); Tue, 17 Jan 2012 16:09:21 -0500
Received: from mail-bk0-f46.google.com ([209.85.214.46]:45255 "EHLO mail-bk0-f46.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752819Ab2AQVJU convert rfc822-to-8bit (ORCPT ); Tue, 17 Jan 2012 16:09:20 -0500
MIME-Version: 1.0
In-Reply-To:
References: <1326302710-9427-1-git-send-email-wad@chromium.org> <1326302710-9427-2-git-send-email-wad@chromium.org> <20120112162231.GA23960@redhat.com> <20120112172315.GA26295@redhat.com> <293e9587acd158b91d7d1793c7e16f7c.squirrel@webmail.greenhost.nl> <9642e1197443efe9716f418c4883489e.squirrel@webmail.greenhost.nl>
Date: Tue, 17 Jan 2012 15:09:18 -0600
Message-ID:
Subject: Re: [RFC,PATCH 1/2] seccomp_filters: system call filtering using BPF
From: Will Drewry
To: Kees Cook
Cc: Indan Zupancic , Oleg Nesterov , linux-kernel@vger.kernel.org, john.johansen@canonical.com, serge.hallyn@canonical.com, coreyb@linux.vnet.ibm.com, pmoore@redhat.com, eparis@redhat.com, djm@mindrot.org, torvalds@linux-foundation.org, segoon@openwall.com, rostedt@goodmis.org, jmorris@namei.org, Roland McGrath
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8BIT
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Jan 17, 2012 at 2:42 PM, Will Drewry wrote:
> On Tue, Jan 17, 2012 at 2:34 PM, Kees Cook wrote:
>> On Mon, Jan 16, 2012 at 10:46 PM, Indan Zupancic wrote:
>>> So call it once and store the value in a long. Then copy the low half
>>> to the right place and then the upper half when on 64 bits. It may not
>>> look too pretty, but the compiler should be able to optimise almost all
>>> overhead away and end up with 6 (or 12) int copies.
>>> Something like this:
>>>
>>> struct bpf_data {
>>>        uint32 syscall_nr;
>>>        uint32 arg_low[MAX_SC_ARGS];
>>>        uint32 arg_high[MAX_SC_ARGS];
>>> };
>>>
>>> void fill_bpf_data(struct task_struct *t, struct pt_regs *r, struct bpf_data *d)
>>> {
>>>        int i;
>>>        unsigned long arg;
>>>
>>>        d->syscall_nr = syscall_get_nr(t, r);
>>>        for (i = 0; i < MAX_SC_ARGS; ++i){
>>>                syscall_get_arguments(t, r, i, 1, &arg);
>>>                d->arg_low[i] = arg;
>>>                d->arg_high[i] = arg >> 32;
>>>        }
>>> }
>>
>> If this turns out to be expensive, it might be possible to break it up
>> and load the arguments on demand (and cache them); i.e. have
>> load_pointer() or similar notice when it is about to access something
>> other than bpf_data.syscall_nr.
>
> Makes perfect sense!  In theory (as a few other people pointed this
> out off list), it is entirely possible to never populate any data for
> load_pointer except an optional cache.  Just provide a custom
> load_pointer that knows to take the offset return the syscall nr or
> the args or some slice of the returned data.
>
> This is even easier if the struct looks like:
> struct {
>  int nr;
>  union {
>    uint32_t args32[6];
>    uint64_t args64[6];
>  }
> };
>
> since you can just use the offset without doing any endian-based
> splitting.  Another suggestion (thanks roland!) was to add
>  int syscall_arch;
> to the struct populated with the AUDIT_ARCH_* defines.  This would
> help the case Indan was worried about -- portable filter programs.
>
> It looks like there'd be some cross-arch plumbing to make the
> AUDIT_ARCH_ data available, but not too bad.
>
> Seem sane? I'm headed down this path now and I think it'll work out
> assuming there aren't major objections to the syscall_arch piece.

Hrm. I'm still not so sure about the arch bit.
Without it, BPF programs aren't directly shareable, but they could be,
as long as the values for k and the syscall numbers are adapted.
Putting arch in the program makes it more likely that every system call
will hit a BPF preamble that has to check syscall_arch.  That could
easily add 100s of nanoseconds to every call (on slower arches).

I'll probably do the next patch series without arch-checking support,
then add it if it seems needed.  Nothing forces a filter program to
check it, so we could let the author make the decision.

cheers!
will
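
The on-demand loading idea from the quoted discussion (resolve an offset
to the syscall nr or to a slice of an argument, fetching and caching only
what the filter touches) can be sketched in userspace C.  This is a rough
sketch under stated assumptions, not the kernel's load_pointer():
sc_data, sc_source, load_word, demo_nr and demo_arg are all hypothetical
names, and the two callbacks merely stand in for syscall_get_nr() and
syscall_get_arguments().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_SC_ARGS 6

/* Hypothetical layout, modeled on the struct sketched in the thread.
 * It is never filled in; it only defines the offsets a filter uses. */
struct sc_data {
	int nr;
	union {
		uint32_t args32[MAX_SC_ARGS];
		uint64_t args64[MAX_SC_ARGS];
	} u;
};

/* Hypothetical source of syscall state, with an optional argument cache. */
struct sc_source {
	int (*get_nr)(void *ctx);
	uint64_t (*get_arg)(void *ctx, unsigned int i);
	void *ctx;
	uint64_t cache[MAX_SC_ARGS];
	unsigned int cached_mask;	/* bit i set => cache[i] is valid */
};

/* Resolve a 32-bit load at byte offset 'off' within struct sc_data.
 * Returns 0 on success, -1 if the offset is unaligned or out of range. */
int load_word(struct sc_source *s, size_t off, uint32_t *out)
{
	const size_t args_off = offsetof(struct sc_data, u);
	const size_t args_end = args_off + sizeof(uint64_t) * MAX_SC_ARGS;

	if (off == offsetof(struct sc_data, nr)) {
		*out = (uint32_t)s->get_nr(s->ctx);
		return 0;
	}
	if (off < args_off || off % 4 != 0 || off + 4 > args_end)
		return -1;

	size_t rel = off - args_off;
	unsigned int i = (unsigned int)(rel / sizeof(uint64_t));

	/* Fetch the argument only on first use, then serve it from cache. */
	if (!(s->cached_mask & (1u << i))) {
		s->cache[i] = s->get_arg(s->ctx, i);
		s->cached_mask |= 1u << i;
	}
	/* Copy the word from its native position inside the 64-bit slot,
	 * so no explicit endian-based splitting is needed. */
	memcpy(out, (const unsigned char *)&s->cache[i] + rel % sizeof(uint64_t), 4);
	return 0;
}

/* Demo callbacks with made-up values; ctx may point at a fetch counter. */
int demo_nr(void *ctx)
{
	(void)ctx;
	return 42;
}

uint64_t demo_arg(void *ctx, unsigned int i)
{
	if (ctx)
		++*(unsigned int *)ctx;
	return ((uint64_t)(i + 1) << 32) | (0xA0u + i);
}
```

A filter that only looks at nr never triggers get_arg at all, and one
that reads both halves of a single argument pays for one fetch.  Which
half of the argument sits at the lower offset depends on the host byte
order, which is why the sketch copies bytes from the native 64-bit slot
instead of shifting.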