Re: [PATCH bpf-next 0/9] xdp: Support multiple programs on a single interface through chain calls - Toke Høiland-Jørgensen

From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Song Liu <songliubraving@fb.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Alexei Starovoitov <ast@kernel.org>, Martin Lau <kafai@fb.com>,
	Yonghong Song <yhs@fb.com>,
	Marek Majkowski <marek@cloudflare.com>,
	Lorenz Bauer <lmb@cloudflare.com>,
	David Miller <davem@davemloft.net>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"bpf@vger.kernel.org" <bpf@vger.kernel.org>
Subject: Re: [PATCH bpf-next 0/9] xdp: Support multiple programs on a single interface through chain calls
Date: Thu, 03 Oct 2019 16:33:13 +0200
Message-ID: <87pnjdq4pi.fsf@toke.dk>
In-Reply-To: <CAADnVQKTbaxJhkukxXM7Ue7=kA9eWsGMpnkXc=Z8O3iWGSaO0A@mail.gmail.com>

Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:

> On Thu, Oct 3, 2019 at 1:53 AM Jesper Dangaard Brouer <brouer@redhat.com> wrote:
>> The xdpcap use-case is to trap any XDP return action code (and tcpdump
>> via perf event ring_buffer).  For system administrators the xdpcap
>> use-case is something we hear about all the time, so one of the missing
>> features for XDP.  As Toke also wrote, we want to extend this to ALSO
>> be able to see/dump the packet BEFORE a given XDP program.
>
> It sounds to me that 'xdpdump/xdpcap' (tcpdump equivalent) is
> the only use case both you and Toke are advocating for.
> I think such a case we can already do without new kernel code:
> - retrieve prog_id of the program attached to given xdp ifindex
> - convert to fd
> - create prog_array of one element and store that prog_fd
> - create xdpdump bpf prog that prints to ring buffer
>   and tail_calls into that prog_array
> - replace xdp prog on that ifindex
>
> Now it sees all the traffic first and existing xdp progs keep working.
> What am I missing?

Yeah, that takes care of the "run xdpdump as the first thing" use case.
But we also want to be able to run it *after* another program, *without*
modifying that program to add a tail call.
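
To make the limitation concrete, here is a rough userspace model of the
wrapper approach (plain C, not BPF code). The names prog_array,
dump_packet, xdpdump_wrapper and original_prog are made up for
illustration, and the function-pointer call only stands in for
bpf_tail_call(). The point is that the wrapper can only act on the
packet before handing over control; in real BPF a successful tail call
never returns, so nothing can run on the original program's verdict.

```c
#include <stddef.h>

/* Same values as the kernel's enum xdp_action, redefined here so the
 * model builds without kernel headers. */
enum xdp_action { XDP_ABORTED = 0, XDP_DROP, XDP_PASS, XDP_TX, XDP_REDIRECT };

struct pkt {
	const unsigned char *data;
	size_t len;
};

typedef enum xdp_action (*xdp_prog_fn)(const struct pkt *);

/* Hypothetical stand-in for a one-element prog_array map. */
static xdp_prog_fn prog_array[1];

/* Stand-in for "print to ring buffer": just count the packets seen. */
static unsigned long dumped_packets;

static void dump_packet(const struct pkt *p)
{
	(void)p;
	dumped_packets++;
}

/* The wrapper that replaces the XDP program on the ifindex. */
static enum xdp_action xdpdump_wrapper(const struct pkt *p)
{
	dump_packet(p);			/* runs BEFORE the original program */
	if (prog_array[0])
		return prog_array[0](p);	/* models the tail call: no code
						 * after this point ever runs */
	return XDP_PASS;		/* fall-through when the slot is empty */
}

/* Example "existing" program: drop anything shorter than 14 bytes. */
static enum xdp_action original_prog(const struct pkt *p)
{
	return p->len < 14 ? XDP_DROP : XDP_PASS;
}
```

Storing original_prog in prog_array[0] and calling xdpdump_wrapper() on
a packet dumps it first and then yields the original program's verdict,
which is exactly the "first thing" case and nothing more.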

More generally, we want to be able to chain XDP programs from multiple
sources in arbitrary ways. Let me try to provide a more fleshed-out
usage example:

Say a distro ships MyFirewall and MyIDS, two different upstream
projects, both of which support XDP acceleration. MyFirewall has
specified in its documentation that its XDP program will return XDP_PASS
for anything that it has determined should not be dropped. So the
sysadmin decides he wants to enable both, and of course he wants both to
be XDP-accelerated.

This particular sysadmin doesn't want to expend IDS resources on traffic
that the firewall has already decided to drop, so he'll just install the
firewall first, and then run the IDS on any traffic that gets XDP_PASS.
So he installs IDS as a chain-call XDP program on the XDP_PASS action
after the firewall.

Another sysadmin might be more paranoid (or have more CPU resources
available), and so he wants to run the IDS first, and the firewall
afterwards. So he installs the two XDP programs in the reverse order, by
chaining the firewall to the IDS' XDP_PASS action.

At the same time, the sysadmin wants to inspect what the firewall is
actually dropping, so he fires up xdpdump and tells it to show him
everything dropped by the firewall. The xdpdump tool does this by
attaching itself as a chain call program to the XDP_DROP action of the
firewall program.

In all cases, the sysadmin can't (or doesn't want to) modify any of the
XDP programs. In fact, they may just be installed as pre-compiled .so
BPF files on his system. So he needs to be able to configure the call
chain of different programs without modifying the eBPF program source
code.
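
As a sketch of the semantics in the scenarios above (a userspace model
in plain C, not the proposed kernel interface): each program gets a
per-return-code "next program" slot, and the dispatcher follows those
slots without the programs knowing about each other. All names here
(chain_add_prog, chain_link, chain_run, my_firewall, my_ids,
my_xdpdump) are hypothetical, and the model assumes the final verdict is
whatever the last program in the chain returns, so the dump program
returns XDP_DROP to preserve the firewall's decision.

```c
#include <stddef.h>

/* Same values as the kernel's enum xdp_action, plus a count. */
enum xdp_action { XDP_ABORTED = 0, XDP_DROP, XDP_PASS, XDP_TX, XDP_REDIRECT,
		  XDP_ACTION_MAX };

struct pkt {
	const unsigned char *data;
	size_t len;
};

typedef enum xdp_action (*xdp_prog_fn)(const struct pkt *);

#define MAX_PROGS	8
#define MAX_CHAIN_DEPTH	32	/* guards against loops in the chain */

struct chain_prog {
	xdp_prog_fn run;
	int next[XDP_ACTION_MAX];	/* next program per return code, -1 = none */
};

static struct chain_prog progs[MAX_PROGS];
static int nprogs;

/* Register a program; returns its index, or -1 if the table is full. */
static int chain_add_prog(xdp_prog_fn fn)
{
	if (nprogs >= MAX_PROGS)
		return -1;
	progs[nprogs].run = fn;
	for (int a = 0; a < XDP_ACTION_MAX; a++)
		progs[nprogs].next[a] = -1;
	return nprogs++;
}

/* Run 'next' whenever 'prev' returns 'act'. Neither program is modified. */
static int chain_link(int prev, enum xdp_action act, int next)
{
	if (prev < 0 || prev >= nprogs || next < 0 || next >= nprogs)
		return -1;
	if ((unsigned int)act >= XDP_ACTION_MAX)
		return -1;
	progs[prev].next[act] = next;
	return 0;
}

/* Run the chain starting at 'first'; the last program's verdict wins. */
static enum xdp_action chain_run(int first, const struct pkt *p)
{
	enum xdp_action ret = XDP_ABORTED;
	int cur = first;

	for (int depth = 0; cur >= 0 && cur < nprogs && depth < MAX_CHAIN_DEPTH;
	     depth++) {
		ret = progs[cur].run(p);
		if ((unsigned int)ret >= XDP_ACTION_MAX)
			break;
		cur = progs[cur].next[ret];
	}
	return ret;
}

static unsigned long ids_seen, dump_seen;

/* Toy firewall: drop packets whose first byte is 0xff. */
static enum xdp_action my_firewall(const struct pkt *p)
{
	return (p->len > 0 && p->data[0] == 0xff) ? XDP_DROP : XDP_PASS;
}

/* Toy IDS: inspect (count) and pass. */
static enum xdp_action my_ids(const struct pkt *p)
{
	(void)p;
	ids_seen++;
	return XDP_PASS;
}

/* Toy xdpdump on the drop path: record, then keep the drop verdict. */
static enum xdp_action my_xdpdump(const struct pkt *p)
{
	(void)p;
	dump_seen++;
	return XDP_DROP;
}
```

The first sysadmin's setup is chain_link(fw, XDP_PASS, ids) plus
chain_link(fw, XDP_DROP, dump); the paranoid one would instead start the
chain at the IDS and call chain_link(ids, XDP_PASS, fw). In both cases
only the table changes, not the programs.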

This is basically what we're trying to support with XDP chain calls
(which I guess is now turning into more general eBPF chain calls). I
think it is doable with something based on the BPF_PROG_CHAIN_* series
you posted a link to earlier; but instead of having an explicit
tail_call_next() helper, I'll just make the verifier insert the chain
calls before each BPF_EXIT instruction when this feature is turned on.
Do you see any reason why this wouldn't work?
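
To illustrate the kind of rewrite I mean, here is a toy model in plain
C. It uses a made-up three-opcode instruction set (toy_insn, TOY_MOV_R0,
TOY_EXIT, TOY_CHAIN_CALL), not real BPF instructions or verifier code,
and it has no jumps; a real pass would also have to fix up jump offsets
after inserting instructions.

```c
#include <stddef.h>

/* Toy instruction set, purely illustrative. */
enum toy_op { TOY_MOV_R0, TOY_EXIT, TOY_CHAIN_CALL };

struct toy_insn {
	enum toy_op op;
	int imm;
};

/*
 * Copy 'src' into 'dst', inserting a TOY_CHAIN_CALL in front of every
 * TOY_EXIT. Returns the number of instructions written, or -1 if 'dst'
 * (capacity 'cap') is too small.
 */
static int insert_chain_calls(const struct toy_insn *src, size_t n,
			      struct toy_insn *dst, size_t cap)
{
	size_t out = 0;

	for (size_t i = 0; i < n; i++) {
		if (src[i].op == TOY_EXIT) {
			if (out >= cap)
				return -1;
			dst[out++] = (struct toy_insn){ TOY_CHAIN_CALL, 0 };
		}
		if (out >= cap)
			return -1;
		dst[out++] = src[i];
	}
	return (int)out;
}
```

A program with two exit points grows by two instructions, and every
path out of it passes through the chain call first, without the
program's author having to add anything.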

-Toke