From: Yoshiki Komachi <komachi.yoshiki@gmail.com>
To: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: "David S. Miller" <davem@davemloft.net>,
Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Jesper Dangaard Brouer <hawk@kernel.org>,
John Fastabend <john.fastabend@gmail.com>,
Jakub Kicinski <kuba@kernel.org>, Martin KaFai Lau <kafai@fb.com>,
Song Liu <songliubraving@fb.com>, Yonghong Song <yhs@fb.com>,
Andrii Nakryiko <andriin@fb.com>, KP Singh <kpsingh@chromium.org>,
Roopa Prabhu <roopa@cumulusnetworks.com>,
Nikolay Aleksandrov <nikolay@cumulusnetworks.com>,
David Ahern <dsahern@kernel.org>,
netdev@vger.kernel.org, bridge@lists.linux-foundation.org,
bpf@vger.kernel.org
Subject: Re: [RFC PATCH bpf-next 3/3] samples/bpf: Add a simple bridge example accelerated with XDP
Date: Tue, 4 Aug 2020 19:08:59 +0900 [thread overview]
Message-ID: <46CD5E66-D554-473A-BE9A-9AC2EF8D16B1@gmail.com> (raw)
In-Reply-To: <20200731161519.5f413f82@carbon>
> 2020/07/31 23:15、Jesper Dangaard Brouer <brouer@redhat.com>のメール:
>
>
> I really appreciate that you are working on adding this helper.
> Some comments below.
Thanks! Find my response below, please.
> On Fri, 31 Jul 2020 13:44:20 +0900
> Yoshiki Komachi <komachi.yoshiki@gmail.com> wrote:
>
>> diff --git a/samples/bpf/xdp_bridge_kern.c b/samples/bpf/xdp_bridge_kern.c
>> new file mode 100644
>> index 000000000000..00f802503199
>> --- /dev/null
>> +++ b/samples/bpf/xdp_bridge_kern.c
>> @@ -0,0 +1,129 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/* Copyright (c) 2020 NTT Corp. All Rights Reserved.
>> + *
> [...]
>> +
>> +struct {
>> + __uint(type, BPF_MAP_TYPE_DEVMAP_HASH);
>> + __uint(key_size, sizeof(int));
>> + __uint(value_size, sizeof(int));
>> + __uint(max_entries, 64);
>> +} xdp_tx_ports SEC(".maps");
>> +
>> +static __always_inline int xdp_bridge_proto(struct xdp_md *ctx, u16 br_vlan_proto)
>> +{
>> + void *data_end = (void *)(long)ctx->data_end;
>> + void *data = (void *)(long)ctx->data;
>> + struct bpf_fdb_lookup fdb_lookup_params;
>> + struct vlan_hdr *vlan_hdr = NULL;
>> + struct ethhdr *eth = data;
>> + u16 h_proto;
>> + u64 nh_off;
>> + int rc;
>> +
>> + nh_off = sizeof(*eth);
>> + if (data + nh_off > data_end)
>> + return XDP_DROP;
>> +
>> + __builtin_memset(&fdb_lookup_params, 0, sizeof(fdb_lookup_params));
>> +
>> + h_proto = eth->h_proto;
>> +
>> + if (unlikely(ntohs(h_proto) < ETH_P_802_3_MIN))
>> + return XDP_PASS;
>> +
>> + /* Handle VLAN tagged packet */
>> + if (h_proto == br_vlan_proto) {
>> + vlan_hdr = (void *)eth + nh_off;
>> + nh_off += sizeof(*vlan_hdr);
>> + if ((void *)eth + nh_off > data_end)
>> + return XDP_PASS;
>> +
>> + fdb_lookup_params.vlan_id = ntohs(vlan_hdr->h_vlan_TCI) &
>> + VLAN_VID_MASK;
>> + }
>> +
>> + /* FIXME: Although Linux bridge provides us with vlan filtering (contains
>> + * PVID) at ingress, the feature is currently unsupported in this XDP program.
>> + *
>> + * Two ideas to realize the vlan filtering are below:
>> + * 1. usespace daemon monitors bridge vlan events and notifies XDP programs
> ^^
> Typo: usespace -> userspace
I will fix this in the next version.
>> + * of them through BPF maps
>> + * 2. introduce another bpf helper to retrieve bridge vlan information
>
> The comment appears two times time this file.
I was aiming to show future implementation of the vlan filtering at ingress (not egress) to
be required here by the above comment.
>> + *
>> + *
>> + * FIXME: After the vlan filtering, learning feature is required here, but
>> + * it is currently unsupported as well. If another bpf helper for learning
>> + * is accepted, the processing could be implemented in the future.
>> + */
>> +
>> + memcpy(&fdb_lookup_params.addr, eth->h_dest, ETH_ALEN);
>> +
>> + /* Note: This program definitely takes ifindex of ingress interface as
>> + * a bridge port. Linux networking devices can be stacked and physical
>> + * interfaces are not necessarily slaves of bridges (e.g., bonding or
>> + * vlan devices can be slaves of bridges), but stacked bridge ports are
>> + * currently unsupported in this program. In such cases, XDP programs
>> + * should be attached to a lower device in order to process packets with
>> + * higher speed. Then, a new bpf helper to find upper devices will be
>> + * required here in the future because they will be registered on FDB
>> + * in the kernel.
>> + */
>> + fdb_lookup_params.ifindex = ctx->ingress_ifindex;
>> +
>> + rc = bpf_fdb_lookup(ctx, &fdb_lookup_params, sizeof(fdb_lookup_params), 0);
>> + if (rc != BPF_FDB_LKUP_RET_SUCCESS) {
>> + /* In cases of flooding, XDP_PASS will be returned here */
>> + return XDP_PASS;
>> + }
>> +
>> + /* FIXME: Although Linux bridge provides us with vlan filtering (contains
>> + * untagged policy) at egress as well, the feature is currently unsupported
>> + * in this XDP program.
>> + *
>> + * Two ideas to realize the vlan filtering are below:
>> + * 1. usespace daemon monitors bridge vlan events and notifies XDP programs
>> + * of them through BPF maps
>> + * 2. introduce another bpf helper to retrieve bridge vlan information
>> + */
>
> (2nd time the comment appears)
The 2nd one is marking for future implementation of the egress filtering.
Sorry for confusing you. I will try to remove the redundancy and confusion.
>> +
>
> A comment about below bpf_redirect_map() would be good. Explaining
> that we depend on fallback behavior, to let normal bridge code handle
> other cases (e.g. flood/broadcast). And also that if lookup fails,
> XDP_PASS/fallback also happens.
In this example, flooded packets will be transferred to the upper normal bridge by not the
bpf_redirect_map() call but the XDP_PASS action as below:
+ rc = bpf_fdb_lookup(ctx, &fdb_lookup_params, sizeof(fdb_lookup_params), 0);
+ if (rc != BPF_FDB_LKUP_RET_SUCCESS) {
+ /* In cases of flooding, XDP_PASS will be returned here */
+ return XDP_PASS;
+ }
Thus, such a comment should be described as above, IMO.
Thanks & Best regards,
>> + return bpf_redirect_map(&xdp_tx_ports, fdb_lookup_params.ifindex, XDP_PASS);
>> +}
>> +
>> +SEC("xdp_bridge")
>> +int xdp_bridge_prog(struct xdp_md *ctx)
>> +{
>> + return xdp_bridge_proto(ctx, 0);
>> +}
>> +
>> +SEC("xdp_8021q_bridge")
>> +int xdp_8021q_bridge_prog(struct xdp_md *ctx)
>> +{
>> + return xdp_bridge_proto(ctx, htons(ETH_P_8021Q));
>> +}
>> +
>> +SEC("xdp_8021ad_bridge")
>> +int xdp_8021ad_bridge_prog(struct xdp_md *ctx)
>> +{
>> + return xdp_bridge_proto(ctx, htons(ETH_P_8021AD));
>> +}
>> +
>> +char _license[] SEC("license") = "GPL";
>
>
> --
> Best regards,
> Jesper Dangaard Brouer
> MSc.CS, Principal Kernel Engineer at Red Hat
> LinkedIn: http://www.linkedin.com/in/brouer
>
—
Yoshiki Komachi
komachi.yoshiki@gmail.com
next prev parent reply other threads:[~2020-08-04 10:09 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-07-31 4:44 [RFC PATCH bpf-next 0/3] Add a new bpf helper for FDB lookup Yoshiki Komachi
2020-07-31 4:44 ` [RFC PATCH bpf-next 1/3] net/bridge: Add new function to access FDB from XDP programs Yoshiki Komachi
2020-07-31 4:44 ` [RFC PATCH bpf-next 2/3] bpf: Add helper to do forwarding lookups in kernel FDB table Yoshiki Komachi
2020-07-31 11:52 ` Maciej Fijalkowski
2020-08-04 8:44 ` Yoshiki Komachi
2020-07-31 17:15 ` David Ahern
2020-08-04 11:27 ` Yoshiki Komachi
2020-08-05 16:38 ` David Ahern
2020-08-07 8:06 ` Yoshiki Komachi
2020-07-31 21:12 ` Daniel Borkmann
2020-08-05 4:45 ` Yoshiki Komachi
2020-07-31 4:44 ` [RFC PATCH bpf-next 3/3] samples/bpf: Add a simple bridge example accelerated with XDP Yoshiki Komachi
2020-07-31 14:15 ` Jesper Dangaard Brouer
2020-08-04 10:08 ` Yoshiki Komachi [this message]
2020-07-31 17:48 ` Andrii Nakryiko
2020-08-04 10:35 ` Yoshiki Komachi
2020-07-31 21:52 ` [RFC PATCH bpf-next 0/3] Add a new bpf helper for FDB lookup John Fastabend
2020-08-05 10:26 ` Yoshiki Komachi
2020-08-05 16:36 ` David Ahern
2020-08-07 8:30 ` Yoshiki Komachi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=46CD5E66-D554-473A-BE9A-9AC2EF8D16B1@gmail.com \
--to=komachi.yoshiki@gmail.com \
--cc=andriin@fb.com \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=bridge@lists.linux-foundation.org \
--cc=brouer@redhat.com \
--cc=daniel@iogearbox.net \
--cc=davem@davemloft.net \
--cc=dsahern@kernel.org \
--cc=hawk@kernel.org \
--cc=john.fastabend@gmail.com \
--cc=kafai@fb.com \
--cc=kpsingh@chromium.org \
--cc=kuba@kernel.org \
--cc=netdev@vger.kernel.org \
--cc=nikolay@cumulusnetworks.com \
--cc=roopa@cumulusnetworks.com \
--cc=songliubraving@fb.com \
--cc=yhs@fb.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).