linux-kernel.vger.kernel.org archive mirror
From: Jason Wang <jasowang@redhat.com>
To: David Ahern <dsahern@gmail.com>,
	Jesper Dangaard Brouer <jbrouer@redhat.com>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	ast@kernel.org, daniel@iogearbox.net, mst@redhat.com
Subject: Re: [RFC PATCH net-next V2 0/6] XDP rx handler
Date: Thu, 6 Sep 2018 13:12:24 +0800
Message-ID: <5fc9ace6-0101-9b56-340e-f67105f8e34b@redhat.com>
In-Reply-To: <99d3e3d0-a14d-7789-2777-67421c7d4a20@gmail.com>



On 2018-09-06 01:20, David Ahern wrote:
> [ sorry for the delay; focused on the nexthop RFC ]

No problem. Your comments are appreciated.

> On 8/20/18 12:34 AM, Jason Wang wrote:
>>
>> On 2018-08-18 05:15, David Ahern wrote:
>>> On 8/15/18 9:34 PM, Jason Wang wrote:
>>>> I may miss something but BPF forbids loop. Without a loop how can we
>>>> make sure all stacked devices is enumerated correctly without knowing
>>>> the topology in advance?
>>> netdev_for_each_upper_dev_rcu
>>>
>>> BPF helpers allow programs to do lookups in kernel tables, in this case
>>> the ability to find an upper device that would receive the packet.
>> So if I understand correctly, you mean using
>> netdev_for_each_upper_dev_rcu() inside a BPF helper? If yes, I think we
>> may still need device specific logic. E.g for macvlan,
>> netdev_for_each_upper_dev_rcu() enumerates all macvlan devices on top a
>> lower device. But what we need is one of the macvlan that matches the
>> dst mac address which is similar to what XDP rx handler did. And it
>> would become more complicated if we have multiple layers of device.
> My device lookup helper takes the base port index (starting device),
> vlan protocol, vlan tag and dest mac. So, yes, the mac address is used
> to uniquely identify the stacked device.

Ok.

>
>> So let's consider a simple case, consider we have 5 macvlan devices:
>>
>> macvlan0: doing some packet filtering before passing packets to TCP/IP
>> stack
>> macvlan1: modify packets and redirect to another interface
>> macvlan2: modify packets and transmit packet back through XDP_TX
>> macvlan3: deliver packets to AF_XDP
>> macvtap0: deliver packets raw XDP to VM
>>
>> So, with XDP rx handler, what we need to just to attach five different
>> XDP programs to each macvlan device. Your idea is to do all things in
>> the root device XDP program. This looks complicated and not flexible
>> since it needs to care a lot of things, e.g adding/removing
>> actions/policies. And XDP program needs to call BPF helper that use
>> netdev_for_each_upper_dev_rcu() to work correctly with stacked device.
>>
> Stacking on top of a nic port can have all kinds of combinations of
> vlans, bonds, bridges, vlans on bonds and bridges, macvlans, etc. I
> suspect trying to install a program for layer 3 forwarding on each one
> and iteratively running the programs would kill the performance gained
> from forwarding with xdp.

Yes, the performance may drop, but it's still much faster than the
generic XDP path.

One reason for the drop is the device specific logic, such as mac address
matching, which is also needed in the case of a single XDP program on
the root device. For macvlan, if we allow attaching XDP to macvlan, we can
offload the mac address lookup to hardware through L2 forwarding
offload, so I believe there would be no performance drop. The only overhead
introduced by the XDP rx handler itself is probably the indirect
calls. We can try to amortize them by introducing some kind of batching
on top.

As for the issue of multiple XDP program iterations: with this RFC,
if we have N stacked devices, there's no need to attach an XDP program to
each layer. The only thing needed is the XDP_PASS action in the root
device; then you can attach an XDP program to any one or several of the
stacked devices on top.
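
To illustrate the dispatch order described above, here is a minimal
userspace sketch. It is a simulation only, not kernel code and not the
code from this series: sim_dev, match_upper, sim_receive and the V_*
verdicts are hypothetical names made up for the example, and the mac
lookup is a plain linear scan standing in for whatever the stacked
device (or hardware offload) would actually do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Verdicts mirror the XDP action names only; the values are local. */
enum verdict { V_ABORTED, V_DROP, V_PASS, V_TX, V_REDIRECT };

struct pkt {
	uint8_t dst_mac[6];
};

typedef enum verdict (*xdp_prog_fn)(const struct pkt *);

/* Hypothetical stand-in for a device; not struct net_device. */
struct sim_dev {
	uint8_t mac[6];
	xdp_prog_fn prog;	/* NULL means no XDP program attached */
};

/* Device specific logic: pick the stacked device whose mac matches. */
static struct sim_dev *match_upper(struct sim_dev *uppers, size_t n,
				   const struct pkt *p)
{
	for (size_t i = 0; i < n; i++)
		if (memcmp(uppers[i].mac, p->dst_mac, 6) == 0)
			return &uppers[i];
	return NULL;
}

/*
 * Run the root program first. Only an XDP_PASS-like verdict hands the
 * packet to the matching stacked device, whose program (if any) then
 * decides. Layers without a program cost no extra program run.
 */
static enum verdict sim_receive(const struct sim_dev *root,
				struct sim_dev *uppers, size_t n,
				const struct pkt *p)
{
	assert(root != NULL && p != NULL);

	enum verdict v = root->prog ? root->prog(p) : V_PASS;
	if (v != V_PASS)
		return v;

	struct sim_dev *upper = match_upper(uppers, n, p);
	if (upper == NULL || upper->prog == NULL)
		return V_PASS;	/* continue to the normal stack */

	return upper->prog(p);
}

/* Trivial example programs. */
static enum verdict prog_pass(const struct pkt *p) { (void)p; return V_PASS; }
static enum verdict prog_drop(const struct pkt *p) { (void)p; return V_DROP; }
static enum verdict prog_tx(const struct pkt *p) { (void)p; return V_TX; }
```

With prog_pass on the root and prog_tx on only one of the stacked
devices, packets for that device get the V_TX verdict while packets for
the other devices simply pass through to the stack.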

So the RFC is not intended to replace any existing solution. It just
provides some flexibility for having native XDP on stacked devices (which
is based on the rx handler) and lets us benefit from existing tools to do
the configuration. If users want to do everything in the root device, that
should work well without any issues.

Thanks


