bpf.vger.kernel.org archive mirror
From: Toshiaki Makita <toshiaki.makita1@gmail.com>
To: "Toke Høiland-Jørgensen" <toke@redhat.com>,
	"John Fastabend" <john.fastabend@gmail.com>,
	"Alexei Starovoitov" <ast@kernel.org>,
	"Daniel Borkmann" <daniel@iogearbox.net>,
	"Martin KaFai Lau" <kafai@fb.com>,
	"Song Liu" <songliubraving@fb.com>, "Yonghong Song" <yhs@fb.com>,
	"David S. Miller" <davem@davemloft.net>,
	"Jakub Kicinski" <jakub.kicinski@netronome.com>,
	"Jesper Dangaard Brouer" <hawk@kernel.org>,
	"Jamal Hadi Salim" <jhs@mojatatu.com>,
	"Cong Wang" <xiyou.wangcong@gmail.com>,
	"Jiri Pirko" <jiri@resnulli.us>,
	"Pablo Neira Ayuso" <pablo@netfilter.org>,
	"Jozsef Kadlecsik" <kadlec@netfilter.org>,
	"Florian Westphal" <fw@strlen.de>,
	"Pravin B Shelar" <pshelar@ovn.org>
Cc: netdev@vger.kernel.org, bpf@vger.kernel.org,
	William Tu <u9012063@gmail.com>,
	Stanislav Fomichev <sdf@fomichev.me>
Subject: Re: [RFC PATCH v2 bpf-next 00/15] xdp_flow: Flow offload to XDP
Date: Mon, 18 Nov 2019 15:41:00 +0900
Message-ID: <6e08f714-6284-6d0d-9cbe-711c64bf97aa@gmail.com>
In-Reply-To: <87lfsiocj5.fsf@toke.dk>

On 2019/11/14 21:41, Toke Høiland-Jørgensen wrote:
> Toshiaki Makita <toshiaki.makita1@gmail.com> writes:
> 
>> On 2019/11/13 1:53, Toke Høiland-Jørgensen wrote:
>>> Toshiaki Makita <toshiaki.makita1@gmail.com> writes:
>>>>
>>>> Hi Toke,
>>>>
>>>> Sorry for the delay.
>>>>
>>>> On 2019/10/31 21:12, Toke Høiland-Jørgensen wrote:
>>>>> Toshiaki Makita <toshiaki.makita1@gmail.com> writes:
>>>>>
>>>>>> On 2019/10/28 0:21, Toke Høiland-Jørgensen wrote:
>>>>>>> Toshiaki Makita <toshiaki.makita1@gmail.com> writes:
>>>>>>>>> Yeah, you are right that it's something we're thinking about. I'm not
>>>>>>>>> sure we'll actually have the bandwidth to implement a complete solution
>>>>>>>>> ourselves, but we are very much interested in helping others do this,
>>>>>>>>> including smoothing out any rough edges (or adding missing features) in
>>>>>>>>> the core XDP feature set that is needed to achieve this :)
>>>>>>>>
>>>>>>>> I'm very interested in general usability solutions.
>>>>>>>> I'd appreciate it if you could join the discussion.
>>>>>>>>
>>>>>>>> The basic idea of my approach is to reuse the HW-offload infrastructure
>>>>>>>> in the kernel.
>>>>>>>> Typical networking features in the kernel have an offload mechanism (TC
>>>>>>>> flower, nftables, bridge, routing, and so on).
>>>>>>>> In general these are what users want to accelerate, so easy XDP use
>>>>>>>> should also support these features IMO. Given that, reusing the existing
>>>>>>>> HW-offload mechanism seems natural to me. OVS uses TC to offload
>>>>>>>> flows, so use TC for XDP as well...
>>>>>>>
>>>>>>> I agree that XDP should be able to accelerate existing kernel
>>>>>>> functionality. However, this does not necessarily mean that the kernel
>>>>>>> has to generate an XDP program and install it, like your patch does.
>>>>>>> Rather, what we should be doing is exposing the functionality through
>>>>>>> helpers so XDP can hook into the data structures already present in the
>>>>>>> kernel and make decisions based on what is contained there. We already
>>>>>>> have that for routing; L2 bridging, and some kind of connection
>>>>>>> tracking, are obvious contenders for similar additions.
>>>>>>
>>>>>> Thanks, adding helpers itself should be good, but how does this let users
>>>>>> start using XDP without having them write their own BPF code?
>>>>>
>>>>> It wouldn't in itself. But it would make it possible to write XDP
>>>>> programs that could provide the same functionality; people would then
>>>>> need to run those programs to actually opt-in to this.
>>>>>
>>>>> For some cases this would be a simple "on/off switch", e.g.,
>>>>> "xdp-route-accel --load <dev>", which would install an XDP program that
>>>>> uses the regular kernel routing table (and the same with bridging). We
>>>>> are planning to collect such utilities in the xdp-tools repo - I am
>>>>> currently working on a simple packet filter:
>>>>> https://github.com/xdp-project/xdp-tools/tree/xdp-filter
>>>>
>>>> Let me confirm how this tool adds filter rules.
>>>> Does this add another command-line tool for firewalling?
>>>>
>>>> If so, that is different from my goal.
>>>> Introducing another command-line tool will require people to learn
>>>> more.
>>>>
>>>> My proposal is to reuse the kernel interface to minimize the need for
>>>> such learning.
>>>
>>> I wasn't proposing that this particular tool should be a replacement for
>>> the kernel packet filter; it's deliberately fairly limited in
>>> functionality. My point was that we could create other such tools for
>>> specific use cases which could be more or less drop-in (similar to how
>>> nftables has a command line tool that is compatible with the iptables
>>> syntax).
>>>
>>> I'm all for exposing more of the existing kernel capabilities to XDP.
>>> However, I think it's the wrong approach to do this by reimplementing
>>> the functionality in an eBPF program and replicating the state in maps;
>>> instead, it's better to refactor the existing kernel functionality so it
>>> can be called directly from an eBPF helper function. And then ship a
>>> tool as part of xdp-tools that installs an XDP program to make use of
>>> these helpers to accelerate the functionality.
>>>
>>> Take your example of TC rules: You were proposing a flow like this:
>>>
>>> Userspace TC rule -> kernel rule table -> eBPF map -> generated XDP
>>> program
>>>
>>> Whereas what I mean is that we could do this instead:
>>>
>>> Userspace TC rule -> kernel rule table
>>>
>>> and separately
>>>
>>> XDP program -> bpf helper -> lookup in kernel rule table
>>
>> Thanks, now I see what you mean.
>> You expect an XDP program like this, right?
>>
>> int xdp_tc(struct xdp_md *ctx)
>> {
>> 	int act = bpf_xdp_tc_filter(ctx);
>> 	return act;
>> }
> 
> Yes, basically, except that the XDP program would need to parse the
> packet first, and bpf_xdp_tc_filter() would take a parameter struct with
> the parsed values. See the usage of bpf_fib_lookup() in
> samples/bpf/xdp_fwd_kern.c
> 
>> But doesn't this approach lose the chance to reduce the program to
>> only the features this device actually needs?
> 
> Not necessarily. Since the BPF program does the packet parsing and fills
> in the TC filter lookup data structure, it can limit what features are
> used that way (e.g., if I only want to do IPv6, I just parse the v6
> header, ignore TCP/UDP, and drop everything that's not IPv6). The lookup
> helper could also have a flag argument to disable some of the lookup
> features.

It's unclear to me how that would be configured.
Would users pass options when attaching the program? Something like
$ xdp_tc attach eth0 --only-with ipv6
But can users always determine the features they need in advance?
Manually reconfiguring the program every time the TC rules change does not sound nice.
Or would we add a hook to the kernel so that a daemon can listen for TC filter events
and automatically reload the attached program?
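
To make the configuration question concrete, here is a minimal userspace sketch of
attach-time feature selection. It models only the parsing step in plain C rather than
BPF. struct flow_key, the FEAT_* flags, and parse_flow() are hypothetical names, and
"--only-with ipv6" is assumed to map to FEAT_IPV6; none of this exists in the kernel
or in xdp-tools.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical feature flags; "--only-with ipv6" is assumed to map to FEAT_IPV6. */
enum {
	FEAT_IPV4 = 1u << 0,
	FEAT_IPV6 = 1u << 1,
};

/* Verdicts that mirror the numeric values of XDP_DROP and XDP_PASS. */
enum { VERDICT_DROP = 1, VERDICT_PASS = 2 };

#define ETH_HDR_LEN 14
#define ETYPE_IPV4  0x0800
#define ETYPE_IPV6  0x86DD

/* Hypothetical lookup key that a TC filter helper might take as a parameter. */
struct flow_key {
	uint16_t eth_type;	/* host byte order */
	uint8_t  src[16];	/* IPv4 uses only the first 4 bytes */
	uint8_t  dst[16];
};

/*
 * Parse only the protocols enabled in feat_mask and fill in *key.
 * Returns VERDICT_PASS when the key was filled, VERDICT_DROP otherwise.
 */
int parse_flow(const uint8_t *pkt, size_t len, unsigned int feat_mask,
	       struct flow_key *key)
{
	assert(pkt != NULL && key != NULL);
	memset(key, 0, sizeof(*key));

	if (len < ETH_HDR_LEN)
		return VERDICT_DROP;

	key->eth_type = (uint16_t)((pkt[12] << 8) | pkt[13]);

	const uint8_t *l3 = pkt + ETH_HDR_LEN;
	size_t l3_len = len - ETH_HDR_LEN;

	if (key->eth_type == ETYPE_IPV6 && (feat_mask & FEAT_IPV6)) {
		if (l3_len < 40)	/* fixed IPv6 header length */
			return VERDICT_DROP;
		memcpy(key->src, l3 + 8, 16);
		memcpy(key->dst, l3 + 24, 16);
		return VERDICT_PASS;
	}
	if (key->eth_type == ETYPE_IPV4 && (feat_mask & FEAT_IPV4)) {
		if (l3_len < 20)	/* minimum IPv4 header length */
			return VERDICT_DROP;
		memcpy(key->src, l3 + 12, 4);
		memcpy(key->dst, l3 + 16, 4);
		return VERDICT_PASS;
	}
	return VERDICT_DROP;	/* protocol not enabled for this attachment */
}
```

With feat_mask fixed at attach time, a newly added TC rule that matches IPv4 would
have no effect until someone re-attaches with a wider mask, which is the
reconfiguration burden described above.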

Another concern is key size. If we use the TC core, TC will use its own hash table with
a fixed key size. So we cannot reduce the size of the hash table key this way, can we?
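
As a rough illustration of why key size matters, the userspace sketch below compares a
wide, fixed key with a reduced one. Both layouts are hypothetical and are not the
actual TC or xdp_flow key structures; hash_key() is a plain FNV-1a hash used only to
show that the per-lookup work grows with the number of key bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "match anything" key with a fixed size, whatever the rules use. */
struct full_key {
	uint8_t  eth_src[6];
	uint8_t  eth_dst[6];
	uint16_t eth_type;
	uint16_t vlan_tci;
	uint8_t  ip_src[16];
	uint8_t  ip_dst[16];
	uint16_t l4_src;
	uint16_t l4_dst;
	uint8_t  ip_proto;
	uint8_t  pad[3];
};

/* Hypothetical reduced key for a device whose rules only match IPv6 addresses. */
struct ipv6_only_key {
	uint8_t ip_src[16];
	uint8_t ip_dst[16];
};

/* FNV-1a over the raw key bytes: one xor and one multiply per key byte. */
uint32_t hash_key(const void *key, size_t key_len)
{
	const uint8_t *p = key;
	uint32_t h = 2166136261u;	/* FNV-1a 32-bit offset basis */

	assert(key != NULL);
	for (size_t i = 0; i < key_len; i++) {
		h ^= p[i];
		h *= 16777619u;		/* FNV-1a 32-bit prime */
	}
	return h;
}
```

A program generated for the rules actually installed could hash and compare only
sizeof(struct ipv6_only_key) bytes, while a lookup that goes through a shared table
would always pay for the full fixed-size key.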

> 
> It would probably require a bit of refactoring in the kernel data
> structures so they can be used without being tied to an skb. David Ahern
> did something similar for the fib. For the routing table case, that
> resulted in a significant speedup: About 2.5x-3x the performance when
> using it via XDP (depending on the number of routes in the table).

I'm curious how much a helper function can improve performance compared to
XDP programs that emulate the kernel feature without using such helpers.
A 2.5x-3x speedup sounds a bit low for XDP to me, but that may be a routing-specific issue.

Toshiaki Makita
