linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: David Ahern <dsahern@gmail.com>
To: Andy Lutomirski <luto@amacapital.net>
Cc: "Alexei Starovoitov" <alexei.starovoitov@gmail.com>,
	"Andy Lutomirski" <luto@kernel.org>,
	"Daniel Mack" <daniel@zonque.org>,
	"Mickaël Salaün" <mic@digikod.net>,
	"Kees Cook" <keescook@chromium.org>, "Jann Horn" <jann@thejh.net>,
	"Tejun Heo" <tj@kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	"Thomas Graf" <tgraf@suug.ch>,
	"Michael Kerrisk" <mtk.manpages@gmail.com>,
	"Peter Zijlstra" <peterz@infradead.org>,
	"Linux API" <linux-api@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"Network Development" <netdev@vger.kernel.org>
Subject: Re: Potential issues (security and otherwise) with the current cgroup-bpf API
Date: Mon, 19 Dec 2016 19:52:15 -0700	[thread overview]
Message-ID: <80574175-3692-0278-a74e-23b752d44f73@gmail.com> (raw)
In-Reply-To: <CALCETrUW2jEYmjSsOrPj+MAjkDGGUCw_rdxQh+5Er0r4ReGLnA@mail.gmail.com>

On 12/19/16 6:56 PM, Andy Lutomirski wrote:
> On Mon, Dec 19, 2016 at 5:44 PM, David Ahern <dsahern@gmail.com> wrote:
>> On 12/19/16 5:25 PM, Andy Lutomirski wrote:
>>> net.socket_create_filter = "none": no filter
>>> net.socket_create_filter = "bpf:baadf00d": bpf filter
>>> net.socket_create_filter = "disallow": no sockets created period
>>> net.socket_create_filter = "iptables:foobar": some iptables thingy
>>> net.socket_create_filter = "nft:blahblahblah": some nft thingy
>>> net.socket_create_filter = "address_family_list:1,2,3": allow AF 1, 2, and 3
>>
>> Such a scheme works for the socket create filter b/c it is a very simple use case. It does not work for the ingress and egress which allow generic bpf filters.
> 
> Can you elaborate on what goes wrong?  (Obviously the
> "address_family_list" example makes no sense in that context.)

Being able to dump a filter or see that one exists would be a great add-on, but I don't see how 'net.socket_create_filter = "bpf:baadf00d"' is a viable API for loading generic BPF filters. Simple cases like "disallow" are easy -- just return 0 in the filter, no complicated BPF code needed. The rest are specific cases of the moment which goes against the intent of ebpf and generic programmability. 
 
>>
>> ...
>>
>>>> you're ignoring use cases I described earlier.
>>>> In vrf case there is only one ifindex it needs to bind to.
>>>
>>> I'm totally lost.  Can you explain what this has to do with the cgroup
>>> hierarchy?
>>
>> I think the point is that a group hierarchy makes no sense for the VRF use case. What I put into iproute2 is
>>
>>     cgrp2/vrf/NAME
>>
>> where NAME is the vrf name. The filter added to it binds ipv4 and ipv6 sockets to a specific device index. cgrp2/vrf is the "default" vrf and does not have a filter. A user can certainly add another layer cgrp2/vrf/NAME/NAME2 but it provides no value since VRF in a VRF does not make sense.
> 
> I tend to agree.  I still think that the mechanism as it stands is
> broken in other respects and should be fixed before it goes live.  I
> have no desire to cause problems for the vrf use case.
> 
> But keep in mind that the vrf use case is, in Linus' tree, a bit
> broken right now in its interactions with other users of the same
> mechanism.  Suppose I create a container and want to trace all of its
> created sockets.  I'll set up cgrp2/container and load my tracer as a
> socket creation hook.  Then a container sets up
> cgrp2/container/vrf/NAME (using delgation) and loads your vrf binding
> filter.  Now the tracing stops working -- oops.

There are other ways to achieve socket tracing, but I get your point -- nested cases do not work as users may want.


>>>>> I like this last one, but IT'S NOT A POSSIBLE FUTURE EXTENSION.  You
>>>>> have to do it now (or disable the feature for 4.10).  This is why I'm
>>>>> bringing this whole thing up now.
>>>>
>>>> We don't have to touch user visible api here, so extensions are fine.
>>>
>>> Huh?  My example in the original email attaches a program in a
>>> sub-hierarchy.  Are you saying that 4.11 could make that example stop
>>> working?
>>
>> Are you suggesting sub-cgroups should not be allowed to override the filter of a parent cgroup?
> 
> Yes, exactly.  I think there are two sensible behaviors:
> 
> a) sub-cgroups cannot have a filter at all of the parent has a filter.
> (This is the "punt" approach -- it lets different semantics be
> assigned later without breaking userspace.)
> 
> b) sub-cgroups can have a filter if a parent does, too.  The semantics
> are that the sub-cgroup filter runs first and all side-effects occur.
> If that filter says "reject" then ancestor filters are skipped.  If
> that filter says "accept", then the ancestor filter is run and its
> side-effects happen as well.  (And so on, all the way up to the root.)

That comes with a big performance hit for skb / data path cases.

I'm riding my use case on Daniel's work, and as I understand it the nesting case has been discussed. I'll defer to Daniel and Alexei on this part.

  reply	other threads:[~2016-12-20  2:52 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-12-17 18:18 Potential issues (security and otherwise) with the current cgroup-bpf API Andy Lutomirski
2016-12-17 19:26 ` Mickaël Salaün
2016-12-17 20:02   ` Andy Lutomirski
2016-12-19 20:56 ` Alexei Starovoitov
2016-12-19 21:23   ` Andy Lutomirski
2016-12-20  0:02     ` Alexei Starovoitov
2016-12-20  0:25       ` Andy Lutomirski
2016-12-20  1:43         ` Andy Lutomirski
2016-12-20  1:44         ` David Ahern
2016-12-20  1:56           ` Andy Lutomirski
2016-12-20  2:52             ` David Ahern [this message]
2016-12-20  3:12               ` Andy Lutomirski
2016-12-20  4:44                 ` Alexei Starovoitov
2016-12-20  5:27                   ` Andy Lutomirski
2016-12-20  5:32                     ` Alexei Starovoitov
2016-12-20  9:11             ` Peter Zijlstra
2017-01-03 10:25               ` Michal Hocko
2017-01-16  1:19                 ` Tejun Heo
2017-01-17 13:03                   ` Michal Hocko
2017-01-17 13:32                     ` Peter Zijlstra
2017-01-17 13:58                       ` Michal Hocko
2017-01-17 20:23                         ` Andy Lutomirski
2017-01-18 22:18                         ` Tejun Heo
2017-01-19  9:00                           ` Michal Hocko
2016-12-20  3:18         ` Alexei Starovoitov
2016-12-20  3:50           ` Andy Lutomirski
2016-12-20  4:41             ` Alexei Starovoitov
2016-12-20 10:21             ` Daniel Mack
2016-12-20 17:23               ` Andy Lutomirski
2016-12-20 18:36                 ` Daniel Mack
2016-12-20 18:49                   ` Andy Lutomirski
2016-12-21  4:01                     ` Alexei Starovoitov
2016-12-20  1:34       ` David Miller
2016-12-20  1:40         ` Andy Lutomirski
2016-12-20  4:51           ` Alexei Starovoitov
2016-12-20  5:26             ` Andy Lutomirski
2017-01-17  5:18 Andy Lutomirski
2017-01-18 22:41 ` Potential issues (security and otherwise) with the current cgroup-bpf API Tejun Heo
2017-01-19  0:18   ` Andy Lutomirski
2017-01-19  0:59     ` Tejun Heo
2017-01-19  2:29       ` Andy Lutomirski
2017-01-20  2:39         ` Alexei Starovoitov
2017-01-20  4:04           ` Andy Lutomirski
2017-01-23  4:31             ` Alexei Starovoitov
2017-01-23 20:20               ` Andy Lutomirski
2017-02-03 21:07                 ` Andy Lutomirski
2017-02-03 23:21                   ` Alexei Starovoitov
2017-02-04 17:10                     ` Andy Lutomirski
2017-01-19  1:01     ` Mickaël Salaün

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=80574175-3692-0278-a74e-23b752d44f73@gmail.com \
    --to=dsahern@gmail.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=daniel@zonque.org \
    --cc=davem@davemloft.net \
    --cc=jann@thejh.net \
    --cc=keescook@chromium.org \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@amacapital.net \
    --cc=luto@kernel.org \
    --cc=mic@digikod.net \
    --cc=mtk.manpages@gmail.com \
    --cc=netdev@vger.kernel.org \
    --cc=peterz@infradead.org \
    --cc=tgraf@suug.ch \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).