bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
To: "Toke Høiland-Jørgensen" <toke@redhat.com>
Cc: Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	Martin KaFai Lau <kafai@fb.com>, Song Liu <songliubraving@fb.com>,
	Yonghong Song <yhs@fb.com>,
	John Fastabend <john.fastabend@gmail.com>,
	KP Singh <kpsingh@kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>,
	Jesper Dangaard Brouer <hawk@kernel.org>,
	Network Development <netdev@vger.kernel.org>,
	bpf <bpf@vger.kernel.org>
Subject: Re: [PATCH bpf-next v7 1/3] bpf: Add "live packet" mode for XDP in bpf_prog_run()
Date: Sat, 8 Jan 2022 18:24:48 -0800	[thread overview]
Message-ID: <20220109022448.bxgatdsx3obvipbu@ast-mbp.dhcp.thefacebook.com> (raw)
In-Reply-To: <87mtk67zfm.fsf@toke.dk>

On Sat, Jan 08, 2022 at 09:19:41PM +0100, Toke Høiland-Jørgensen wrote:
> 
> Sure, totally fine with documenting it. Just seems to me the most
> obvious place to put this is in a new
> Documentation/bpf/prog_test_run.rst file with a short introduction about
> the general BPF_PROG_RUN mechanism, and then a subsection dedicated to
> this facility.

sgtm

> > I guess it's ok-ish to get stuck with 128.
> > It will be uapi that we cannot change though.
> > Are you comfortable with that?
> 
> UAPI in what sense? I'm thinking of documenting it like:
> 
> "The packet data being supplied as data_in to BPF_PROG_RUN will be used
>  for the initial run of the XDP program. However, when running the
>  program multiple times (with repeat > 1), only the packet *bounds*
>  (i.e., the data, data_end and data_meta pointers) will be reset on each
>  invocation, the packet data itself won't be rewritten. The pages
>  backing the packets are recycled, but the order depends on the path the
>  packet takes through the kernel, making it hard to predict when a
>  particular modified page makes it back to the XDP program. In practice,
>  this means that if the XDP program modifies the packet payload before
>  sending out the packet, it has to be prepared to deal with subsequent
>  invocations seeing either the initial data or the already-modified
>  packet, in arbitrary order."
> 
> I don't think this makes any promises about any particular size of the
> page pool, so how does it constitute UAPI?

Could you explain out-of-order scanario again?
It's possible only if xdp_redirect is done into different netdevs.
Then they can xmit at different times and cycle pages back into
the loop in different order. But TX or REDIRECT into the same netdev
will keep the pages in the same order. So the program can rely on that.

> >
> > reinit doesn't feel necessary.
> > How one would use this interface to send N different packets?
> > The api provides an interface for only one.
> 
> By having the XDP program react appropriately. E.g., here is the XDP
> program used by the trafficgen tool to cycle through UDP ports when
> sending out the packets - it just reads the current value and updates
> based on that, so it doesn't matter if it sees the initial page or one
> it already modified:

Sure. I think there is an untapped potential here.
With this live packet prog_run anyone can buy 10G or 100G nic equipped
server and for free transform it into $300k+ IXIA beating machine.
It could be a game changer. pktgen doesn't come close.
I'm thinking about generating and consuming test TCP traffic.
TCP blaster would xmit 1M TCP connections through this live prog_run
into eth0 and consume the traffic returning from "server under test"
via a different XDP program attached to eth0.
The prog_run's xdp prog would need to send SYN, increment sequence number,
and keep sane data in the packets. It could be HTTP request, for example.

To achive this IXIA beating setup the TCP blaster would need a full
understanding of what page pool is doing with the packets.
Just saying "in arbitrary order" is a non starter. It diminishes
this live prog_run into pktgen equivalent which is still useful,
but lots of potential is lost.

> Another question seeing as the merge window is imminent: How do you feel
> about merging this before the merge window? I can resubmit before it
> opens with the updated selftest and documentation, and we can deal with
> any tweaks during the -rcs; or would you rather postpone the whole
> thing until the next cycle?

It's already too late for this merge window, but bpf-next is always open.
Just like it was open for the last year. So please resubmit as soon as
the tests are green and this discussion is over.

  reply	other threads:[~2022-01-09  2:24 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-01-07 21:54 [PATCH bpf-next v7 0/3] Add support for transmitting packets using XDP in bpf_prog_run() Toke Høiland-Jørgensen
2022-01-07 21:54 ` [PATCH bpf-next v7 1/3] bpf: Add "live packet" mode for " Toke Høiland-Jørgensen
2022-01-08  2:44   ` Alexei Starovoitov
2022-01-08 13:19     ` Toke Høiland-Jørgensen
2022-01-08 19:28       ` Alexei Starovoitov
2022-01-08 20:19         ` Toke Høiland-Jørgensen
2022-01-09  2:24           ` Alexei Starovoitov [this message]
2022-01-09 12:30             ` Toke Høiland-Jørgensen
2022-01-13  1:37               ` Alexei Starovoitov
2022-02-11  7:19                 ` Martin KaFai Lau
2022-02-11 16:03                   ` Toke Høiland-Jørgensen
2022-01-07 21:54 ` [PATCH bpf-next v7 2/3] selftests/bpf: Move open_netns() and close_netns() into network_helpers.c Toke Høiland-Jørgensen
2022-01-07 21:54 ` [PATCH bpf-next v7 3/3] selftests/bpf: Add selftest for XDP_REDIRECT in bpf_prog_run() Toke Høiland-Jørgensen
2022-01-08 19:32   ` Alexei Starovoitov
2022-01-08 20:29     ` Toke Høiland-Jørgensen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220109022448.bxgatdsx3obvipbu@ast-mbp.dhcp.thefacebook.com \
    --to=alexei.starovoitov@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=hawk@kernel.org \
    --cc=john.fastabend@gmail.com \
    --cc=kafai@fb.com \
    --cc=kpsingh@kernel.org \
    --cc=kuba@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=songliubraving@fb.com \
    --cc=toke@redhat.com \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).