From mboxrd@z Thu Jan 1 00:00:00 1970
From: Pablo Neira Ayuso
Subject: Re: [PATCH v7 0/6] Add eBPF hooks for cgroups
Date: Fri, 28 Oct 2016 13:28:39 +0200
Message-ID: <20161028112839.GA29798@salvia>
References: <1477390454-12553-1-git-send-email-daniel@zonque.org>
 <20161026195933.GA2031@salvia>
 <20161027033502.GA43960@ast-mbp.thefacebook.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Daniel Mack ,
 htejun-b10kYP2dOMg@public.gmane.org,
 daniel-FeC+5ew28dpmcu3hnIyYJQ@public.gmane.org,
 ast-b10kYP2dOMg@public.gmane.org,
 davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org,
 kafai-b10kYP2dOMg@public.gmane.org,
 fw-HFFVJYpyMKqzQB+pC5nmwQ@public.gmane.org,
 harald-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org,
 netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
 sargun-GaZTRHToo+CzQB+pC5nmwQ@public.gmane.org,
 cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
To: Alexei Starovoitov
Return-path:
Content-Disposition: inline
In-Reply-To: <20161027033502.GA43960-+o4/htvd0TDFYCXBM6kdu7fOX0fSgVTm@public.gmane.org>
Sender: cgroups-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
List-Id: netdev.vger.kernel.org

Hi Alexei,

On Wed, Oct 26, 2016 at 08:35:04PM -0700, Alexei Starovoitov wrote:
> On Wed, Oct 26, 2016 at 09:59:33PM +0200, Pablo Neira Ayuso wrote:
> > On Tue, Oct 25, 2016 at 12:14:08PM +0200, Daniel Mack wrote:
> > [...]
> > > Dumping programs once they are installed is problematic because of
> > > the internal optimizations done to the eBPF program during its
> > > lifetime. Also, the references to maps etc. would need to be
> > > restored during the dump.
> > >
> > > Just exposing whether or not a program is attached would be
> > > trivial to do, however, most easily through another bpf(2)
> > > command. That can be added later on though.
> >
> > I don't know if anyone told you, but during last netconf, this topic
> > took a bit of time of discussion and it was controversial, I would say
> > 1/3 of netdev hackers there showed their concerns, and that's
> > something that should not be skipped IMO.
>
> Though I attended netconf over hangouts, I think it was pretty
> clear that bpf needs 'introspection' of loaded bpf programs and it
> was a universal desire of everyone. Not 1/3 of hackers.

Introspection is a different thing, very useful, no doubt. But this
infrastructure is allowing way more than simple innocuous introspection.

> As commit log says it's an orthogonal work and over the last
> month we've been discussing pros and cons of different approaches.
> The audit infra, tracepoints and other ideas.
> We kept the discussion in private because, unfortunately, public
> discussions are not fruitful due to threads like this one.

We need to understand what people are trying to solve and in what way.
That's why we have all those conferences and places to meet and
discuss too. Please, don't think like that, this is sending the wrong
message to everyone here, that is kind of: bypass public discussions
and don't take time to describe what you're doing since it is a waste
of time. That's not good.

> The further points below were disputed many times in the past.
> Let's address them one more time:
>
> > path. But this is adding hooks to push bpf programs in the middle of
> > our generic stack, this is way different domain.
>
> incorrect. look at socket filters, cls_bpf.

Classic socket filters don't allow you to deploy a global policy in
such a fine-grained way as this is doing. Then, cls_bpf is fine since
it is visible via tc command, so sysadmins can use tools they are
familiar with to inspect policies and say "oh look, some of the
processes I'm deploying have installed filters via cls_bpf". However,
this approach is visible in no way.

[...]
> > around this socket code in functions so we can invoke it.
> > I guess filtering of UDP and TCP should be good for you at this stage.
>
> DanielM mentioned few times that it's not only about UDP and TCP.

OK, since this is limited to the scope of inet sockets, let's revisit
what we have around:

DCCP is hopeless, who cares.

We also have SCTP, that is deployed by telcos in datacenters, it
cannot reach that domain because many Internet gateways are broken for
it, so you may not get too far with it. Arguably it would be good to
have SCTP support at some point, but I guess this is not a priority
now.

Then, UDPlite almost comes for free since it relies on the existing
UDP infrastructure, it's basically UDP with a few more features.

What else?

> > This would require more work though, but this would come with no
> > hooks in the stack and packets will not have to consume *lots of
> > cycles* just to be dropped before entering the socket queue.
>
> packets don't consume 'lots of cycles'. This is not a typical
> n-tuple firewall framework. Not a DoS mitigation either. Please read
> the cover letter and earlier submissions.
> It's a framework centered around cgroups.
> _Nothing_ in the current stack provides cgroup based monitoring
> and application protection. Earlier cgroupv1 controllers don't
> scale and we really cannot have more of v1 net controllers.
> At the same time we've been brainstorming how this patch set
> can work with v1. It's not easy. We're not giving up though.
> For now it's v2 only.
> Note that another two patchsets depend on this core cgroup+bpf framework.

I saw those, I would really like to have a closer look at David
Ahern's usecase since that skb iif mangling looks kludgy to me, and
given this is exposing a new helper for general use, not only vrf, it
would be good to make sure helpers provide something useful for
everyone. So that new helper is questionable at this stage IMO. I'm
concerned that people may start using bpf as the adhesive tape to glue
things together and work around what are probably design problems.

The other patchset, I guess you refer to the new lsm module: I would
suggest we address one thing at a time, I guess he can start without
relying on this chunk, as it can follow up later anyway.

In general, I think bpf is very useful, but I think we have to
accommodate new things in a way that fits with what we have. We have
traditionally followed an "add in-kernel infrastructure + provide
userspace interface as control plane" approach for a long time. I have
concerns (and my impression is that others were concerned in netconf
too) about whether we can go from the existing approach to a fully
infrastructure-less use-bpf-everywhere model, from one day to the
next, without even taking the time to discuss the consequences of this
decision.

Thanks.