Date: Thu, 15 Aug 2019 20:31:13 +0200
From: Christian Brauner
To: Andy Lutomirski
Message-ID: <20190815183113.rtaevi3sdipdz5y2@wittgenstein>
References: <20190719093538.dhyopljyr5ns33qx@brauner.io> <201907192007.B43158B@keescook> <201908151034.CC0F7BD84@keescook>
Cc: ksummit
Subject: Re: [Ksummit-discuss] [TECH TOPIC] seccomp

On Thu, Aug 15, 2019 at 11:26:10AM -0700, Andy Lutomirski wrote:
> On Thu, Aug 15, 2019 at 10:48 AM Kees Cook wrote:
> >
> > On Wed, Aug 14, 2019 at 10:54:49AM -0700, Andy Lutomirski wrote:
> > > After thinking about this a bit more, I think that deferring the main
> > > seccomp filter invocation until arguments have been read is too
> > > problematic. It has the ordering issues you're thinking of, but it
> > > also has unpleasant effects if one of the reads faults or if
> > > SECCOMP_RET_TRACE or SECCOMP_RET_TRAP is used. I'm thinking that this
> >
> > Right, I was actually thinking of the trace/trap as being the race.
> >
> > > type of deeper inspection filter should just be a totally separate
> > > layer. Once the main seccomp logic decides that a filterable syscall
> > > will be issued then, assuming that no -EFAULT happens, a totally
> > > different program should get run with access to arguments. And there
> > > should be a way for the main program to know that the syscall nr in
> > > question is filterable on the running kernel.
> >
> > Right -- this is how I designed the original prototype: it was
> > effectively an LSM that was triggered by seccomp (since LSMs don't know
> > anything about syscalls -- their hooks are more generalized). So seccomp
> > would set a flag to make the LSM hook pay attention.
> >
> > Existing LSMs are system-owner defined, so really something like Landlock
> > is needed for a process-owned LSM to be defined. But I worry that LSM
> > hooks are still too "deep" in the kernel to have a process-oriented
> > filter author who is not a kernel developer make any sense of the
> > hooks. They're certainly oriented in a better position to gain the
> > intent of a filter. For example, if a filter says "you can't open(2)
> > /etc/foo", but it misses saying "you can't openat(2) /etc/foo", that's a
> > dumb exposure. The LSM hooks are positioned to say "you can't manipulate
> > /etc/foo through any means".
> >
> > So, I'm not entirely sure. It needs a clear design that chooses and
> > justifies the appropriate "depth" of filtering. And FWIW, the two most
> > frequent examples of argument parsing requests have been path-based
> > checking and network address checking. So any prototype needs to handle
> > these two cases sanely...
> >
>
> But also clone() flag filtering, and new clone() proposals keep
> wanting to add structs. And filtering bpf(). /me runs.

Yeah, I mentioned clone3() in my initial mail. And it is not a proposal
anymore; it has been in mainline since the 5.3 merge window. So the evil
has been done. /me (sorry-not-sorry) ducks :)
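
To make the clone() vs. clone3() point concrete, here is a minimal sketch of what a filter that only sees register values can and cannot decide. It assumes x86-64 syscall numbers and argument order. Every `sketch_`/`SKETCH_` name is a hypothetical local mirror of the kernel's definitions, not a kernel API, and the code only models the decision; it installs no filter.

```c
#include <assert.h>
#include <stdint.h>

/* Local mirror of struct seccomp_data from <linux/seccomp.h>, redefined
 * under a hypothetical name so the sketch builds without kernel headers. */
struct sketch_seccomp_data {
	int32_t  nr;                   /* syscall number */
	uint32_t arch;                 /* AUDIT_ARCH_* value */
	uint64_t instruction_pointer;
	uint64_t args[6];              /* raw register values, nothing more */
};

/* Local mirror of the initial struct clone_args that clone3() takes. */
struct sketch_clone_args {
	uint64_t flags, pidfd, child_tid, parent_tid;
	uint64_t exit_signal, stack, stack_size, tls;
};
_Static_assert(sizeof(struct sketch_clone_args) == 64,
	       "first clone_args version is 64 bytes");

#define SKETCH_NR_CLONE      56             /* clone on x86-64 (assumption) */
#define SKETCH_NR_CLONE3     435            /* clone3 */
#define SKETCH_CLONE_NEWUSER 0x10000000ULL  /* value of CLONE_NEWUSER */

enum sketch_verdict { SKETCH_ALLOW, SKETCH_DENY, SKETCH_OPAQUE };

/* Models a policy "no new user namespaces" written against seccomp_data. */
static enum sketch_verdict sketch_filter(const struct sketch_seccomp_data *sd)
{
	assert(sd != 0);
	switch (sd->nr) {
	case SKETCH_NR_CLONE:
		/* On x86-64, clone() passes its flags in the first register,
		 * so the filter can test individual bits. */
		return (sd->args[0] & SKETCH_CLONE_NEWUSER) ? SKETCH_DENY
							    : SKETCH_ALLOW;
	case SKETCH_NR_CLONE3:
		/* args[0] is a user pointer and args[1] a size. The flags
		 * live behind the pointer, which a filter cannot dereference,
		 * so the only choices are allow-all or deny-all. */
		return SKETCH_OPAQUE;
	default:
		return SKETCH_ALLOW;
	}
}
```

With clone() the policy can be expressed per flag. With clone3() the same policy collapses to an all-or-nothing choice on the syscall number, which is the gap the separate argument-inspection layer discussed above would have to fill.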