From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
To: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Daniel Mack <daniel@zonque.org>,
	htejun@fb.com, daniel@iogearbox.net, ast@fb.com,
	davem@davemloft.net, kafai@fb.com, fw@strlen.de,
	harald@redhat.com, netdev@vger.kernel.org, sargun@sargun.me,
	cgroups@vger.kernel.org
Subject: Re: [PATCH v5 0/6] Add eBPF hooks for cgroups
Date: Tue, 13 Sep 2016 21:42:19 -0700
Message-ID: <20160914044217.GA44742@ast-mbp.thefacebook.com>
In-Reply-To: <20160913172408.GC6138@salvia>

On Tue, Sep 13, 2016 at 07:24:08PM +0200, Pablo Neira Ayuso wrote:
> On Tue, Sep 13, 2016 at 03:31:20PM +0200, Daniel Mack wrote:
> > Hi,
> > 
> > On 09/13/2016 01:56 PM, Pablo Neira Ayuso wrote:
> > > On Mon, Sep 12, 2016 at 06:12:09PM +0200, Daniel Mack wrote:
> > >> This is v5 of the patch set to allow eBPF programs for network
> > >> filtering and accounting to be attached to cgroups, so that they apply
> > >> to all sockets of all tasks placed in that cgroup. The logic also
> > >> allows to be extended for other cgroup based eBPF logic.
> > > 
> > > 1) This infrastructure can only be useful to systemd, or any similar
> > >    orchestration daemon. Look, you can only apply filtering policies
> > >    to processes that are launched by systemd, so this only works
> > >    for server processes.
> > 
> > Sorry, but both statements aren't true. The eBPF policies apply to every
> > process that is placed in a cgroup, and my example program in 6/6 shows
> > how that can be done from the command line.
> 
> Then you have to explain to me how anyone other than systemd can use
> this infrastructure?

Sounds like systemd and bpf phobia combined :)
Jokes aside. I'm puzzled why systemd is even being mentioned here.
Here we use tupperware (our internal container management system), which
uses cgroups heavily and has nothing to do with systemd.
We're working as part of the Open Container Initiative, so hopefully soon
all container management systems will benefit from what we're building.
cgroups and bpf are a crucial part of this process.

> > Also, systemd is able to control userspace processes just fine, and
> > it is not limited to 'server processes'.
> 
> My main point is that those processes *need* to be launched by the
> orchestrator, which is what I was referring to as 'server processes'.

No experience with systemd, so I cannot comment on it,
but that statement is not true for our stuff.

> > > For client processes this infrastructure is
> > >    *racy*, you have to add new processes in runtime to the cgroup,
> > >    thus there will be some little time where no filtering policy
> > >    will be applied. For quality of service, this may be an acceptable
> > >    race, but this is aiming to deploy a filtering policy.
> > 
> > That's a limitation that applies to many more control mechanisms in the
> > kernel, and it's something that can easily be solved with fork+exec.
> 
> As long as you have control to launch the processes yes, but this
> will not work in other scenarios. Just like cgroup net_cls and friends
> are broken for filtering for things that you have no control to
> fork+exec.

Not true.
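
For reference, a minimal sketch of the fork+exec approach Daniel mentions
above: the child moves itself into the cgroup before it calls exec, so the
target program never runs outside the cgroup. The helper names
(join_cgroup, spawn_in_cgroup) and the procs_path parameter are made up
for illustration; procs_path is assumed to be the cgroup's membership
file (typically <cgroup dir>/cgroup.procs). This covers only the launcher
case, not processes started by someone else.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Write the calling process's pid into a cgroup membership file.
 * procs_path is an assumption, e.g. <cgroup dir>/cgroup.procs. */
static int join_cgroup(const char *procs_path)
{
	FILE *f = fopen(procs_path, "w");

	if (!f)
		return -1;
	if (fprintf(f, "%d\n", (int)getpid()) < 0) {
		fclose(f);
		return -1;
	}
	return fclose(f) == 0 ? 0 : -1;
}

/* Fork, move the child into the cgroup, then exec argv.
 * Returns the child's pid in the parent, or -1 if fork failed. */
static pid_t spawn_in_cgroup(const char *procs_path, char *const argv[])
{
	pid_t pid = fork();

	if (pid == 0) {
		if (join_cgroup(procs_path) != 0)
			_exit(126);	/* refuse to run outside the cgroup */
		execvp(argv[0], argv);
		_exit(127);		/* exec failed */
	}
	return pid;
}
```

The child exits instead of exec'ing if it cannot join the cgroup, which is
a design choice in this sketch: a launcher that cares about policy would
rather fail than start the program unconfined.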

> To use this infrastructure from a non-launcher process, you'll have to
> rely on the proc connection to subscribe to new process events, then
> echo that pid to the cgroup, and that interface is asynchronous so
> *adding new processes to the cgroup is subject to races*.

In general, not true either. Have you worked with cgroups, or are you just speculating?
 
> *You're proposing a socket filtering facility that hooks layer 2
> output path*!

Flashback: not too long ago you were beating the drum about the netfilter
ingress hook operating at layer 2... Sounds like nobody used it
and that was a bad call? Should we remove that netfilter hook then?

Our use case is different from Daniel's.
For us this cgroup+bpf is _not_ for filtering and _not_ for security.
We run a ton of tasks in cgroups that launch all sorts of
things on their own. We need to monitor what they do from a networking
point of view. Therefore bpf programs need to monitor the traffic in a
particular part of the cgroup hierarchy. Not globally, and with no pass/drop
decisions.
The monitoring itself is complicated. For example, we need to group and
aggregate within the bpf program based on certain bits of the ipv6 address
and so on. bpf is the only programmable engine that can do this job.
nft is simply not flexible enough to do that.
I'd really love to have an alternative to bpf for such tasks,
but you seem to spend all the energy arguing against bpf whereas
nft still leaves a lot to be desired.
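
To make the kind of aggregation described above concrete, here is a rough
sketch in plain user-space C. It is not an actual bpf program; it only
models the logic: mask an IPv6 address down to some leading bits, use the
result as a key, and keep per-key packet and byte counters, roughly what a
bpf program would do with a map. All names (agg_table, mask_prefix,
account), the 64-entry limit and the choice of a leading-bit prefix are
assumptions for illustration; the mail does not say which address bits
are used.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_GROUPS 64	/* arbitrary limit for this sketch */

struct group {
	uint8_t  key[16];	/* address with non-prefix bits cleared */
	uint64_t packets;
	uint64_t bytes;
};

struct agg_table {
	unsigned int prefix_len;	/* leading bits that define a group */
	unsigned int used;
	struct group groups[MAX_GROUPS];
};

/* Keep the first prefix_len bits of addr and zero the rest. */
static void mask_prefix(const uint8_t addr[16], unsigned int prefix_len,
			uint8_t out[16])
{
	assert(prefix_len <= 128);
	for (unsigned int i = 0; i < 16; i++) {
		unsigned int bits = prefix_len > i * 8 ? prefix_len - i * 8 : 0;
		uint8_t mask = 0xff;

		if (bits < 8)
			mask = (uint8_t)(0xff << (8 - bits));
		out[i] = addr[i] & mask;
	}
}

/* Account one packet of len bytes; returns 0, or -1 if the table is full. */
static int account(struct agg_table *t, const uint8_t addr[16], uint32_t len)
{
	uint8_t key[16];

	mask_prefix(addr, t->prefix_len, key);
	for (unsigned int i = 0; i < t->used; i++) {
		if (memcmp(t->groups[i].key, key, 16) == 0) {
			t->groups[i].packets++;
			t->groups[i].bytes += len;
			return 0;
		}
	}
	if (t->used == MAX_GROUPS)
		return -1;
	memcpy(t->groups[t->used].key, key, 16);
	t->groups[t->used].packets = 1;
	t->groups[t->used].bytes = len;
	t->used++;
	return 0;
}
```

With prefix_len set to 64, two addresses in the same /64 land in one
group and an address in a different /64 gets its own. The linear scan
stands in for a hash map lookup and is only meant to keep the sketch short.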
