linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jason Wang <jasowang@redhat.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: David Ahern <dsahern@gmail.com>,
	Jesper Dangaard Brouer <jbrouer@redhat.com>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	ast@kernel.org, daniel@iogearbox.net, mst@redhat.com,
	Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Subject: Re: [RFC PATCH net-next V2 0/6] XDP rx handler
Date: Wed, 15 Aug 2018 15:04:35 +0800	[thread overview]
Message-ID: <0809cbab-9c91-52f1-2abe-124a255d9304@redhat.com> (raw)
In-Reply-To: <20180815053550.5g4f5qeb7r4wtgm5@ast-mbp>



On 2018年08月15日 13:35, Alexei Starovoitov wrote:
> On Wed, Aug 15, 2018 at 08:29:45AM +0800, Jason Wang wrote:
>> Looks less flexible since the topology is hard coded in the XDP program
>> itself and this requires all logic to be implemented in the program on the
>> root netdev.
>>
>>> I have L3 forwarding working for vlan devices and bonds. I had not
>>> considered macvlans specifically yet, but it should be straightforward
>>> to add.
>>>
>> Yes, and all these could be done through XDP rx handler as well, and it can
>> do even more with rather simple logic:
>>
>> 1 macvlan has its own namespace, and want its own bpf logic.
>> 2 Ruse the exist topology information for dealing with more complex setup
>> like macvlan on top of bond and team. There's no need to bpf program to care
>> about topology. If you look at the code, there's even no need to attach XDP
>> on each stacked device. The calling of xdp_do_pass() can try to pass XDP
>> buff to upper device even if there's no XDP program attached to current
>> layer.
>> 3 Deliver XDP buff to userspace through macvtap.
> I think I'm getting what you're trying to achieve.
> You actually don't want any bpf programs in there at all.
> You want macvlan builtin logic to act on raw packet frames.

The built-in logic is just used to find the destination macvlan device. 
It could be done by through another bpf program. Instead of inventing 
lots of generic infrastructure on kernel with specific userspace API, 
built-in logic has its own advantages:

- support hundreds or even thousands of macvlans
- using exist tools to configure network
- immunity to topology changes

> It would have been less confusing if you said so from the beginning.

The name "XDP rx handler" is probably not good. Something like "stacked 
deivce XDP" might be better.

> I think there is little value in such work, since something still
> needs to process this raw frames eventually. If it's XDP with BPF progs
> than they can maintain the speed, but in such case there is no need
> for macvlan. The first layer can be normal xdp+bpf+xdp_redirect just fine.

I'm a little bit confused. We allow per veth XDP program, so I believe 
per macvlan XDP program makes sense as well? This allows great 
flexibility and there's no need to care about topology in bpf program. 
The configuration is also greatly simplified. The only difference is we 
can use xdp_redirect for veth since it was pair device, we can transmit 
XDP frames to one veth and do XDP on its peer. This does not work for 
the case of macvlan which is based on rx handler.

Actually, for the case of veth, if we implement XDP rx handler for 
bridge it can works seamlessly with veth like.

eth0(XDP_PASS) -> [bridge XDP rx handler and ndo_xdp_xmit()] -> veth --- 
veth (XDP).

Besides the usage for containers, we can implement macvtap RX handler 
which allows a fast packet forwarding to userspace.

> In case where there is no xdp+bpf in final processing, the frames are
> converted to skb and performance is lost, so in such cases there is no
> need for builtin macvlan acting on raw xdp frames either. Just keep
> existing macvlan acting on skbs.
>

Yes, this is how veth works as well.

Actually, the idea is not limited to macvlan but for all device that is 
based on rx handler. Consider the case of bonding, this allows to set a 
very simple XDP program on slaves and keep a single main logic XDP 
program on the bond instead of duplicating it in all slaves.

Thanks

  reply	other threads:[~2018-08-15  7:04 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-08-13  3:17 [RFC PATCH net-next V2 0/6] XDP rx handler Jason Wang
2018-08-13  3:17 ` [RFC PATCH net-next V2 1/6] net: core: factor out generic XDP check and process routine Jason Wang
2018-08-13  3:17 ` [RFC PATCH net-next V2 2/6] net: core: generic XDP support for stacked device Jason Wang
2018-08-13  3:17 ` [RFC PATCH net-next V2 3/6] net: core: introduce XDP rx handler Jason Wang
2018-08-13  3:17 ` [RFC PATCH net-next V2 4/6] macvlan: count the number of vlan in source mode Jason Wang
2018-08-13  3:17 ` [RFC PATCH net-next V2 5/6] macvlan: basic XDP support Jason Wang
2018-08-13  3:17 ` [RFC PATCH net-next V2 6/6] virtio-net: support XDP rx handler Jason Wang
2018-08-14  9:22   ` Jesper Dangaard Brouer
2018-08-14 13:01     ` Jason Wang
2018-08-14  0:32 ` [RFC PATCH net-next V2 0/6] " Alexei Starovoitov
2018-08-14  7:59   ` Jason Wang
2018-08-14 10:17     ` Jesper Dangaard Brouer
2018-08-14 13:20       ` Jason Wang
2018-08-14 14:03         ` David Ahern
2018-08-15  0:29           ` Jason Wang
2018-08-15  5:35             ` Alexei Starovoitov
2018-08-15  7:04               ` Jason Wang [this message]
2018-08-16  2:49                 ` Alexei Starovoitov
2018-08-16  4:21                   ` Jason Wang
2018-08-15 17:17             ` David Ahern
2018-08-16  3:34               ` Jason Wang
2018-08-16  4:05                 ` Alexei Starovoitov
2018-08-16  4:24                   ` Jason Wang
2018-08-17 21:15                 ` David Ahern
2018-08-20  6:34                   ` Jason Wang
2018-09-05 17:20                     ` David Ahern
2018-09-06  5:12                       ` Jason Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0809cbab-9c91-52f1-2abe-124a255d9304@redhat.com \
    --to=jasowang@redhat.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=ast@kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=dsahern@gmail.com \
    --cc=jbrouer@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=makita.toshiaki@lab.ntt.co.jp \
    --cc=mst@redhat.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).