BPF Archive on lore.kernel.org
From: Daniel Borkmann <daniel@iogearbox.net>
To: Hangbin Liu <liuhangbin@gmail.com>, bpf@vger.kernel.org
Cc: netdev@vger.kernel.org,
	"Toke Høiland-Jørgensen" <toke@redhat.com>,
	"Jiri Benc" <jbenc@redhat.com>,
	"Jesper Dangaard Brouer" <brouer@redhat.com>,
	"Eelco Chaudron" <echaudro@redhat.com>,
	ast@kernel.org, "Lorenzo Bianconi" <lorenzo.bianconi@redhat.com>
Subject: Re: [PATCHv6 bpf-next 0/3] xdp: add a new helper for dev map multicast support
Date: Fri, 10 Jul 2020 00:37:59 +0200
Message-ID: <7c80ca4b-4c7d-0322-9483-f6f0465d6370@iogearbox.net>
In-Reply-To: <20200709013008.3900892-1-liuhangbin@gmail.com>

On 7/9/20 3:30 AM, Hangbin Liu wrote:
> This patch is for XDP multicast support, which has been discussed before [0].
> The goal is to be able to implement an OVS-like data plane in XDP, i.e.,
> a software switch that can forward XDP frames to multiple ports.
> 
> To achieve this, an application needs to specify a group of interfaces
> to forward a packet to. It is also common to want to exclude one or more
> physical interfaces from the forwarding operation - e.g., to forward a
> packet to all interfaces in the multicast group except the interface it
> arrived on. While this could be done simply by adding more groups, this
> quickly leads to a combinatorial explosion in the number of groups an
> application has to maintain.
> 
> To avoid the combinatorial explosion, we propose to include the ability
> to specify an "exclude group" as part of the forwarding operation. This
> needs to be a group (instead of just a single port index), because a
> physical interface can be part of a logical grouping, such as a bond
> device.
> 
> Thus, the logical forwarding operation becomes a "set difference"
> operation, i.e. "forward to all ports in group A that are not also in
> group B". This series implements such an operation using device maps to
> represent the groups. This means that the XDP program specifies two
> device maps, one containing the list of netdevs to redirect to, and the
> other containing the exclude list.
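
To make the "set difference" semantics above concrete, here is a minimal
userspace sketch in plain C. It is not the code from the patch: struct
port_group and forward_targets() are hypothetical names chosen for
illustration, and plain arrays of ifindex values stand in for the two
device maps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a device map: a plain array of ifindex values. */
struct port_group {
	const int *ifindex;
	size_t len;
};

/* Return true if ifindex is a member of group g; a NULL group is empty. */
static bool group_contains(const struct port_group *g, int ifindex)
{
	if (!g)
		return false;
	for (size_t i = 0; i < g->len; i++)
		if (g->ifindex[i] == ifindex)
			return true;
	return false;
}

/*
 * Compute "all ports in fwd that are not also in excl".
 * Writes at most out_cap ifindex values to out[] and returns how many
 * were written. excl may be NULL, meaning nothing is excluded.
 */
static size_t forward_targets(const struct port_group *fwd,
			      const struct port_group *excl,
			      int *out, size_t out_cap)
{
	size_t n = 0;

	assert(fwd != NULL && out != NULL);
	for (size_t i = 0; i < fwd->len && n < out_cap; i++) {
		int ifindex = fwd->ifindex[i];

		if (group_contains(excl, ifindex))
			continue;	/* present in the exclude group: skip */
		out[n++] = ifindex;
	}
	return n;
}
```

With group A = {1, 2, 3, 4} and group B = {2, 4}, forward_targets() selects
ports 1 and 3. Excluding only the ingress port is the special case of a
one-element group B.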

Could you move this description into patch 1/3 instead of the cover
letter? It helps in understanding the rationale wrt the exclusion map,
which is otherwise lacking from just looking at the patch itself.

Assuming you have a bond, how does this look in practice for your mentioned
ovs-like data plane in XDP? The map for 'group A' is shared among all XDP
progs and the map for 'group B' is managed per prog? The BPF_F_EXCLUDE_INGRESS
is clear, but how would this look wrt forwarding from a phys dev /to/ the
bond iface w/ XDP?

Also, what about tc BPF helper support for the case where not every device
might have native XDP (but they could still share the maps)?

> To achieve this, I implement a new helper, bpf_redirect_map_multi(),
> which accepts two maps: the forwarding map and the exclude map. If users
> don't want to use an exclude map and simply want to stop redirecting back
> to the ingress device, they can use the flag BPF_F_EXCLUDE_INGRESS.
> 
> The 2nd and 3rd patches are usage samples and tests, so no effort has
> been made on performance optimisation. I ran tests with pktgen
> (pkt size 64) to compare with xdp_redirect_map(). Here are the test
> results (the veth peer has a dummy XDP program that returns XDP_DROP directly):
> 
> Version         | Test                                   | Native | Generic
> 5.8 rc1         | xdp_redirect_map       i40e->i40e      |  10.0M |   1.9M
> 5.8 rc1         | xdp_redirect_map       i40e->veth      |  12.7M |   1.6M
> 5.8 rc1 + patch | xdp_redirect_map       i40e->i40e      |  10.0M |   1.9M
> 5.8 rc1 + patch | xdp_redirect_map       i40e->veth      |  12.3M |   1.6M
> 5.8 rc1 + patch | xdp_redirect_map_multi i40e->i40e      |   7.2M |   1.5M
> 5.8 rc1 + patch | xdp_redirect_map_multi i40e->veth      |   8.5M |   1.3M
> 5.8 rc1 + patch | xdp_redirect_map_multi i40e->i40e+veth |   3.0M |  0.98M
> 
> bpf_redirect_map_multi() is slower than bpf_redirect_map() because we loop
> over the arrays and clone the skb/xdpf. The native path is slower than generic
> path as we send skbs by pktgen. So the result looks reasonable.
> 
> Last but not least, thanks a lot to Jiri, Eelco, Toke and Jesper for
> suggestions and help on implementation.
> 
> [0] https://xdp-project.net/#Handling-multicast
> 
> v6: converted helper return types from int to long
> 
> v5:
> a) Check devmap_get_next_key() return value.
> b) Pass through flags to __bpf_tx_xdp_map() instead of bool value.
> c) In function dev_map_enqueue_multi(), consume xdpf for the last
>     obj instead of the first one.
> d) Update helper description and code comments to explain that we
>     use NULL target value to distinguish multicast and unicast
>     forwarding.
> e) Update memory model, memory id and frame_sz in xdpf_clone().
> f) Split the tests from sample and add a bpf kernel selftest patch.
> 
> v4: Fix bpf_xdp_redirect_map_multi_proto arg2_type typo
> 
> v3: Based on Toke's suggestion, made the following updates:
> a) Update bpf_redirect_map_multi() description in bpf.h.
> b) Fix exclude_ifindex checking order in dev_in_exclude_map().
> c) Fix one more xdpf clone in dev_map_enqueue_multi().
> d) Go find next one in dev_map_enqueue_multi() if the interface is not
>     able to forward instead of aborting the whole loop.
> e) Remove READ_ONCE/WRITE_ONCE for ex_map.
> 
> v2: Add new syscall bpf_xdp_redirect_map_multi() which could accept
> include/exclude maps directly.
> 
> Hangbin Liu (3):
>    xdp: add a new helper for dev map multicast support
>    sample/bpf: add xdp_redirect_map_multicast test
>    selftests/bpf: add xdp_redirect_multi test
> 
>   include/linux/bpf.h                           |  20 ++
>   include/linux/filter.h                        |   1 +
>   include/net/xdp.h                             |   1 +
>   include/uapi/linux/bpf.h                      |  22 +++
>   kernel/bpf/devmap.c                           | 154 ++++++++++++++++
>   kernel/bpf/verifier.c                         |   6 +
>   net/core/filter.c                             | 109 ++++++++++-
>   net/core/xdp.c                                |  29 +++
>   samples/bpf/Makefile                          |   3 +
>   samples/bpf/xdp_redirect_map_multi_kern.c     |  57 ++++++
>   samples/bpf/xdp_redirect_map_multi_user.c     | 166 +++++++++++++++++
>   tools/include/uapi/linux/bpf.h                |  22 +++
>   tools/testing/selftests/bpf/Makefile          |   4 +-
>   .../bpf/progs/xdp_redirect_multi_kern.c       |  90 +++++++++
>   .../selftests/bpf/test_xdp_redirect_multi.sh  | 164 +++++++++++++++++
>   .../selftests/bpf/xdp_redirect_multi.c        | 173 ++++++++++++++++++
>   16 files changed, 1015 insertions(+), 6 deletions(-)
>   create mode 100644 samples/bpf/xdp_redirect_map_multi_kern.c
>   create mode 100644 samples/bpf/xdp_redirect_map_multi_user.c
>   create mode 100644 tools/testing/selftests/bpf/progs/xdp_redirect_multi_kern.c
>   create mode 100755 tools/testing/selftests/bpf/test_xdp_redirect_multi.sh
>   create mode 100644 tools/testing/selftests/bpf/xdp_redirect_multi.c
> 

