On 2020-05-20, Kees Cook wrote:
> On Wed, May 20, 2020 at 10:24:01PM +0200, Christian Brauner wrote:
> > On Wed, May 20, 2020 at 12:08:52PM -0700, Linus Torvalds wrote:
> > > On Wed, May 20, 2020 at 12:04 PM Kees Cook wrote:
> > > > Perhaps the question is "how deeply does seccomp need to inspect?"
> > > > and maybe it does not get to see anything beyond just the "top level"
> > > > struct (i.e. struct clone_args) and all pointers within THAT become
> > > > opaque? That certainly simplifies the design.
> > >
> > > Exactly. I think that's the most common situation by far. Does anybody
> > > really really need to care at a deep level, and why?
> >
> > We mostly don't and making all second-level pointers opaque is ok imho.
>
> That'll make things MUCH easier. :)

To be clear, my insistence on the second-level pointers topic is coming
from the view that we should make sure whatever model we use for the
first iteration of deep argument inspection can be expanded to
second-level pointers if we need them. The jump-table proposal I had was
just an example of how we could plan out a design that could be
implemented piece-meal (heck, we don't even need jump-tables in the
first iteration -- so long as we have an idea for how they'd work).

I also hasten to point out that if we make all second-level pointers
opaque then you won't be able to filter clone3() based on ->set_tid.
Now, maybe that's something nobody cares about, but it should be taken
into consideration that one of the handful of "obvious" syscalls will
already not be completely-filterable with second-level pointers being
opaque. But if that's fine (at least for a first iteration), then I'm
also okay with that.

> > But I think that we need some documented consensus on all that stuff
> > which I stressed in other mails before. I'll hand something in about
> > this, if that's ok than we can hash this out.
>
> Aleksa, I know you had an entire presentation[1] on the extensible
> argument syscalls, but was there any text-based design doc that you made?
>
> It would be really nice to update Documentation/process/adding-syscalls.rst
> with the specifics[2], and to (now) include the "no nested flags"
> requirement. What do you think?

Christian and I wrote a patch for adding-syscalls last year[1], but Jon
felt that it should require greater community consensus before it gets
put into adding-syscalls. But yes, I'm definitely in favour of having
this be a properly-documented aspect of new syscall design.

[1]: https://lore.kernel.org/linux-doc/20191002151437.5367-1-christian.brauner@ubuntu.com/

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH