netdev.vger.kernel.org archive mirror
From: Alexei Starovoitov <alexei.starovoitov-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
To: Lorenzo Colitti <lorenzo-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
Cc: Daniel Mack <daniel-cYrQPVfZoowdnm+yROfE0A@public.gmane.org>,
	Pablo Neira Ayuso <pablo-Cap9r6Oaw4JrovVCs/uTlw@public.gmane.org>,
	htejun-b10kYP2dOMg@public.gmane.org,
	Daniel Borkmann <daniel-FeC+5ew28dpmcu3hnIyYJQ@public.gmane.org>,
	ast-b10kYP2dOMg@public.gmane.org,
	David Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>,
	kafai-b10kYP2dOMg@public.gmane.org,
	Florian Westphal <fw-HFFVJYpyMKqzQB+pC5nmwQ@public.gmane.org>,
	harald-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org,
	"netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
	<netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Sargun Dhillon <sargun-GaZTRHToo+CzQB+pC5nmwQ@public.gmane.org>,
	cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: [PATCH v7 0/6] Add eBPF hooks for cgroups
Date: Fri, 28 Oct 2016 23:24:47 -0700
Message-ID: <20161029062442.GA61550@ast-mbp.thefacebook.com>
In-Reply-To: <CAKD1Yr2pMk52h7BdRwTvGwnP5+ONmr4ac6cyUBoZ9P+Kt-B8jw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

On Sat, Oct 29, 2016 at 01:59:23PM +0900, Lorenzo Colitti wrote:
> On Sat, Oct 29, 2016 at 1:51 PM, Alexei Starovoitov
> <alexei.starovoitov-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> >> What's the use case for egress?
> >>
> >> We (android networking) are currently looking at implementing network
> >> accounting via eBPF in order to replace the out-of-tree xt_qtaguid
> >> code. A per-cgroup eBPF program run on all traffic would be great. But
> >> when we looked at this patchset we realized it would not be useful for
> >> accounting purposes because even if a packet is counted here, it might
> >> still be dropped by netfilter hooks.
> >
> > don't use out-of-tree and instead drop using this mechanism or
> > any other in-kernel method? ;)
> 
> Getting rid of out-of-tree code is the goal, yes. We do have a
> requirement that things continue to work, though. Accounting for a
> packet in ip{,6}_output is not correct if that packet ends up being
> dropped by iptables later on.

understood.
it could be solved by swapping the order of cgroup_bpf_run_filter()
and NF_INET_POST_ROUTING in patch 5. It was proposed some time back, but
the current patch, I think, is more symmetrical.
cgroup+bpf runs after nf hook on rx and runs before it on tx.
imo it's more consistent.
Regardless of this choice... are you going to backport cgroupv2 to
android? Because this set is v2 only.
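
The accounting concern and the two possible orderings can be illustrated
with a small user-space model. This is only a sketch under stated
assumptions: struct pkt, struct counters and both egress_*() functions are
hypothetical stand-ins invented for illustration, not kernel code. The real
hooks are cgroup_bpf_run_filter() and NF_INET_POST_ROUTING as named above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one egress packet: only what the example needs. */
struct pkt {
	size_t len;
	bool nf_drop;	/* true if a POST_ROUTING netfilter rule would drop it */
};

/* Hypothetical per-cgroup accounting state. */
struct counters {
	unsigned long packets;
	unsigned long bytes;
};

/* Stand-in for a per-cgroup accounting program: it only counts. */
static void cgroup_account(struct counters *c, const struct pkt *p)
{
	assert(c != NULL && p != NULL);
	c->packets++;
	c->bytes += p->len;
}

/*
 * Order described for patch 5: the cgroup program runs first, netfilter
 * second. A packet dropped by netfilter has already been counted.
 * Returns true if the packet leaves the host.
 */
static bool egress_cgroup_first(struct counters *c, const struct pkt *p)
{
	cgroup_account(c, p);
	return !p->nf_drop;
}

/*
 * Swapped order: netfilter runs first, so only packets that survive it
 * reach the accounting step.
 */
static bool egress_nf_first(struct counters *c, const struct pkt *p)
{
	if (p->nf_drop)
		return false;
	cgroup_account(c, p);
	return true;
}
```

With three packets of 100, 200 and 300 bytes where netfilter drops the
second one, the cgroup-first order counts 3 packets / 600 bytes, while the
nf-first order counts 2 packets / 400 bytes.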

> > We (facebook infrastructure) have been using iptables and bpf networking
> > together with great success. They nicely co-exist and complement each other.
> > There is no need to reinvent the wheel if existing solution works.
> > iptables are great for their purpose.
> 
> That doesn't really answer my "what is the use case for egress"
> question though, right? Or are you saying "we use this, but we can't
> talk about how we use it"?

if the question is "why is patch 4 alone not enough, and why is patch 5
needed?", then the answer is symmetrical access. Accounting for RX only is
a half-done job.

> > there is iptables+cBPF support. It's being used in some cases already.
> 
> Adding eBPF support to the xt_bpf iptables code would be an option for
> what we want to do, yes. I think this requires that the eBPF map to be
> an fd that is available to the process that exec()s iptables, but we
> could do that.

yes. that's certainly doable, but sooner or later such an approach will hit
a scalability issue when the number of cgroups is large. It's the same issue
we saw with cls_bpf and bpf_skb_under_cgroup(). Hence the need for this
patch set, which is centered around cgroups instead of hooks. Note that,
unlike tc and nf, there is no way to attach to a hook. The bpf program is
attached to a cgroup. It's an important distinction vs everything that
currently exists in the stack.
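
One plausible reading of the scalability point can be modeled in user
space. This is a rough sketch, not the kernel implementation: struct
cgroup_model and both account_*() functions are hypothetical names, and the
linear scan is an assumption about how a single hook-attached program would
have to test one cgroup after another.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a cgroup with a byte counter. */
struct cgroup_model {
	int id;
	unsigned long bytes;
};

/*
 * Hook-centric model: one program sits on the hook and has to work out
 * which cgroup the packet belongs to by testing candidates in turn.
 * Returns the number of membership checks performed, which grows with
 * the number of cgroups.
 */
static size_t account_via_hook(struct cgroup_model *groups, size_t n,
			       int sender_id, size_t len)
{
	size_t checks = 0;

	for (size_t i = 0; i < n; i++) {
		checks++;
		if (groups[i].id == sender_id) {
			groups[i].bytes += len;
			break;
		}
	}
	return checks;
}

/*
 * Cgroup-centric model: the program is attached to the cgroup itself, so
 * the owning cgroup is already known and no search is needed.
 * Returns the number of membership checks performed (always 0).
 */
static size_t account_via_cgroup(struct cgroup_model *owner, size_t len)
{
	assert(owner != NULL);
	owner->bytes += len;
	return 0;
}
```

In this model, accounting a packet for the last of 8 cgroups costs 8 checks
through the hook-centric path and none through the cgroup-centric path.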
