From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: Toshiaki Makita <toshiaki.makita1@gmail.com>,
John Fastabend <john.fastabend@gmail.com>,
Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Martin KaFai Lau <kafai@fb.com>, Song Liu <songliubraving@fb.com>,
Yonghong Song <yhs@fb.com>,
"David S. Miller" <davem@davemloft.net>,
Jakub Kicinski <jakub.kicinski@netronome.com>,
Jesper Dangaard Brouer <hawk@kernel.org>,
Jamal Hadi Salim <jhs@mojatatu.com>,
Cong Wang <xiyou.wangcong@gmail.com>,
Jiri Pirko <jiri@resnulli.us>,
Pablo Neira Ayuso <pablo@netfilter.org>,
Jozsef Kadlecsik <kadlec@netfilter.org>,
Florian Westphal <fw@strlen.de>,
Pravin B Shelar <pshelar@ovn.org>
Cc: netdev@vger.kernel.org, bpf@vger.kernel.org,
William Tu <u9012063@gmail.com>,
Stanislav Fomichev <sdf@fomichev.me>
Subject: Re: [RFC PATCH v2 bpf-next 00/15] xdp_flow: Flow offload to XDP
Date: Mon, 25 Nov 2019 14:03:14 +0100 [thread overview]
Message-ID: <87zhgk152l.fsf@toke.dk> (raw)
In-Reply-To: <c96d99ce-5fdd-dfa5-f013-ce11c6c8cfda@gmail.com>
Toshiaki Makita <toshiaki.makita1@gmail.com> writes:
> On 2019/11/22 20:54, Toke Høiland-Jørgensen wrote:
>> Toshiaki Makita <toshiaki.makita1@gmail.com> writes:
>>
>>> On 2019/11/18 19:20, Toke Høiland-Jørgensen wrote:
>>>> Toshiaki Makita <toshiaki.makita1@gmail.com> writes:
>>>>
>>>> [... trimming the context a bit ...]
>>>>
>>>>>>>> Take your example of TC rules: You were proposing a flow like this:
>>>>>>>>
>>>>>>>> Userspace TC rule -> kernel rule table -> eBPF map -> generated XDP
>>>>>>>> program
>>>>>>>>
>>>>>>>> Whereas what I mean is that we could do this instead:
>>>>>>>>
>>>>>>>> Userspace TC rule -> kernel rule table
>>>>>>>>
>>>>>>>> and separately
>>>>>>>>
>>>>>>>> XDP program -> bpf helper -> lookup in kernel rule table
>>>>>>>
>>>>>>> Thanks, now I see what you mean.
>>>>>>> You expect an XDP program like this, right?
>>>>>>>
>>>>>>> int xdp_tc(struct xdp_md *ctx)
>>>>>>> {
>>>>>>> int act = bpf_xdp_tc_filter(ctx);
>>>>>>> return act;
>>>>>>> }
>>>>>>
>>>>>> Yes, basically, except that the XDP program would need to parse the
>>>>>> packet first, and bpf_xdp_tc_filter() would take a parameter struct with
>>>>>> the parsed values. See the usage of bpf_fib_lookup() in
>>>>>> bpf/samples/xdp_fwd_kern.c
>>>>>>
>>>>>>> But doesn't this way lose a chance to reduce/minimize the program to
>>>>>>> only use necessary features for this device?
>>>>>>
>>>>>> Not necessarily. Since the BPF program does the packet parsing and fills
>>>>>> in the TC filter lookup data structure, it can limit what features are
>>>>>> used that way (e.g., if I only want to do IPv6, I just parse the v6
>>>>>> header, ignore TCP/UDP, and drop everything that's not IPv6). The lookup
>>>>>> helper could also have a flag argument to disable some of the lookup
>>>>>> features.
>>>>>
>>>>> It's unclear to me how to configure that.
>>>>> Use options when attaching the program? Something like
>>>>> $ xdp_tc attach eth0 --only-with ipv6
>>>>> But can users always determine their necessary features in advance?
>>>>
>>>> That's what I'm doing with xdp-filter now. But the answer to your second
>>>> question is likely to be 'probably not', so it would be good to not have
>>>> to do this :)
>>>>
>>>>> Frequent manual reconfiguration when TC rules frequently changes does
>>>>> not sound nice. Or, add hook to kernel to listen any TC filter event
>>>>> on some daemon and automatically reload the attached program?
>>>>
>>>> Doesn't have to be a kernel hook; we could enhance the userspace tooling
>>>> to do it. Say we integrate it into 'tc':
>>>>
>>>> - Add a new command 'tc xdp_accel enable <iface> --features [ipv6,etc]'
>>>> - When adding new rules, add the following logic:
>>>> - Check if XDP acceleration is enabled
>>>> - If it is, check whether the rule being added fits into the current
>>>> 'feature set' loaded on that interface.
>>>> - If the rule needs more features, reload the XDP program to one
>>>> with the needed additional features.
>>>> - Or, alternatively, just warn the user and let them manually
>>>> replace it?
>>>
>>> Ok, but there are other userspace tools to configure tc in wild.
>>> python and golang have their own netlink library project.
>>> OVS embeds TC netlink handling code in itself. There may be more tools like this.
>>> I think at least we should have rtnl notification about TC and monitor it
>>> from daemon, if we want to reload the program from userspace tools.
>>
>> A daemon would be one way to do this in cases where it needs to be
>> completely dynamic. My guess is that there are lots of environments
>> where that is not required, and where a user/administrator could
>> realistically specify ahead of time which feature set they want to
>> enable XDP acceleration for. So in my mind the way to go about this is
>> to implement the latter first, then add dynamic reconfiguration of it on
>> top when (or if) it turns out to be necessary...
>
> Hmm, but I think there is big difference between a daemon and a cli tool.
> Shouldn't we determine the design considering future usage?
Sure, we should make sure the design doesn't exclude either option. But
we also shouldn't end up in a "the perfect is the enemy of the good"
type of situation. And the kernel-side changes are likely to be somewhat
independent of what the userspace management ends up looking like...
-Toke
next prev parent reply other threads:[~2019-11-25 13:03 UTC|newest]
Thread overview: 58+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-10-18 4:07 [RFC PATCH v2 bpf-next 00/15] xdp_flow: Flow offload to XDP Toshiaki Makita
2019-10-18 4:07 ` [RFC PATCH v2 bpf-next 01/15] xdp_flow: Add skeleton of XDP based flow offload driver Toshiaki Makita
2019-10-18 4:07 ` [RFC PATCH v2 bpf-next 02/15] xdp_flow: Add skeleton bpf program for XDP Toshiaki Makita
2019-10-18 4:07 ` [RFC PATCH v2 bpf-next 03/15] bpf: Add API to get program from id Toshiaki Makita
2019-10-18 4:07 ` [RFC PATCH v2 bpf-next 04/15] xdp: Export dev_check_xdp and dev_change_xdp Toshiaki Makita
2019-10-18 4:07 ` [RFC PATCH v2 bpf-next 05/15] xdp_flow: Attach bpf prog to XDP in kernel after UMH loaded program Toshiaki Makita
2019-10-18 4:07 ` [RFC PATCH v2 bpf-next 06/15] xdp_flow: Prepare flow tables in bpf Toshiaki Makita
2019-10-18 4:07 ` [RFC PATCH v2 bpf-next 07/15] xdp_flow: Add flow entry insertion/deletion logic in UMH Toshiaki Makita
2019-10-18 4:07 ` [RFC PATCH v2 bpf-next 08/15] xdp_flow: Add flow handling and basic actions in bpf prog Toshiaki Makita
2019-10-18 4:07 ` [RFC PATCH v2 bpf-next 09/15] xdp_flow: Implement flow replacement/deletion logic in xdp_flow kmod Toshiaki Makita
2019-10-18 4:07 ` [RFC PATCH v2 bpf-next 10/15] xdp_flow: Add netdev feature for enabling flow offload to XDP Toshiaki Makita
2019-10-18 4:07 ` [RFC PATCH v2 bpf-next 11/15] xdp_flow: Implement redirect action Toshiaki Makita
2019-10-18 4:07 ` [RFC PATCH v2 bpf-next 12/15] xdp_flow: Implement vlan_push action Toshiaki Makita
2019-10-18 4:07 ` [RFC PATCH v2 bpf-next 13/15] bpf, selftest: Add test for xdp_flow Toshiaki Makita
2019-10-18 4:07 ` [RFC PATCH v2 bpf-next 14/15] i40e: prefetch xdp->data before running XDP prog Toshiaki Makita
2019-10-18 4:07 ` [RFC PATCH v2 bpf-next 15/15] bpf, hashtab: Compare keys in long Toshiaki Makita
2019-10-18 15:22 ` [RFC PATCH v2 bpf-next 00/15] xdp_flow: Flow offload to XDP John Fastabend
2019-10-21 7:31 ` Toshiaki Makita
2019-10-22 16:54 ` John Fastabend
2019-10-22 17:45 ` Toke Høiland-Jørgensen
2019-10-24 4:27 ` John Fastabend
2019-10-24 10:13 ` Toke Høiland-Jørgensen
2019-10-27 13:19 ` Toshiaki Makita
2019-10-27 15:21 ` Toke Høiland-Jørgensen
2019-10-28 3:16 ` David Ahern
2019-10-28 8:36 ` Toke Høiland-Jørgensen
2019-10-28 10:08 ` Jesper Dangaard Brouer
2019-10-28 19:07 ` David Ahern
2019-10-28 19:05 ` David Ahern
2019-10-31 0:18 ` Toshiaki Makita
2019-10-31 12:12 ` Toke Høiland-Jørgensen
2019-11-11 7:32 ` Toshiaki Makita
2019-11-12 16:53 ` Toke Høiland-Jørgensen
2019-11-14 10:11 ` Toshiaki Makita
2019-11-14 12:41 ` Toke Høiland-Jørgensen
2019-11-18 6:41 ` Toshiaki Makita
2019-11-18 10:20 ` Toke Høiland-Jørgensen
2019-11-22 5:42 ` Toshiaki Makita
2019-11-22 11:54 ` Toke Høiland-Jørgensen
2019-11-25 10:18 ` Toshiaki Makita
2019-11-25 13:03 ` Toke Høiland-Jørgensen [this message]
2019-11-18 10:28 ` Toke Høiland-Jørgensen
2019-10-27 13:13 ` Toshiaki Makita
2019-10-27 15:24 ` Toke Høiland-Jørgensen
2019-10-27 19:17 ` David Miller
2019-10-31 0:32 ` Toshiaki Makita
2019-11-12 17:50 ` William Tu
2019-11-14 10:06 ` Toshiaki Makita
2019-11-14 17:09 ` William Tu
2019-11-15 13:16 ` Toke Høiland-Jørgensen
2019-11-12 17:38 ` William Tu
2019-10-23 14:11 ` Jamal Hadi Salim
2019-10-24 4:38 ` John Fastabend
2019-10-24 17:05 ` Jamal Hadi Salim
2019-10-27 13:27 ` Toshiaki Makita
2019-10-27 13:06 ` Toshiaki Makita
2019-10-21 11:23 ` Björn Töpel
2019-10-21 11:47 ` Toshiaki Makita
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87zhgk152l.fsf@toke.dk \
--to=toke@redhat.com \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=davem@davemloft.net \
--cc=fw@strlen.de \
--cc=hawk@kernel.org \
--cc=jakub.kicinski@netronome.com \
--cc=jhs@mojatatu.com \
--cc=jiri@resnulli.us \
--cc=john.fastabend@gmail.com \
--cc=kadlec@netfilter.org \
--cc=kafai@fb.com \
--cc=netdev@vger.kernel.org \
--cc=pablo@netfilter.org \
--cc=pshelar@ovn.org \
--cc=sdf@fomichev.me \
--cc=songliubraving@fb.com \
--cc=toshiaki.makita1@gmail.com \
--cc=u9012063@gmail.com \
--cc=xiyou.wangcong@gmail.com \
--cc=yhs@fb.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).