From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: John Fastabend <john.fastabend@gmail.com>,
	Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	Jakub Kicinski <kuba@kernel.org>
Cc: Lorenzo Bianconi <lorenzo@kernel.org>, bpf <bpf@vger.kernel.org>,
	Network Development <netdev@vger.kernel.org>,
	Lorenzo Bianconi <lorenzo.bianconi@redhat.com>,
	"David S. Miller" <davem@davemloft.net>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	shayagr@amazon.com, John Fastabend <john.fastabend@gmail.com>,
	David Ahern <dsahern@kernel.org>,
	Jesper Dangaard Brouer <brouer@redhat.com>,
	Eelco Chaudron <echaudro@redhat.com>,
	Jason Wang <jasowang@redhat.com>,
	Alexander Duyck <alexander.duyck@gmail.com>,
	Saeed Mahameed <saeed@kernel.org>,
	"Fijalkowski, Maciej" <maciej.fijalkowski@intel.com>,
	"Karlsson, Magnus" <magnus.karlsson@intel.com>,
	tirthendu.sarkar@intel.com
Subject: Re: [PATCH v14 bpf-next 00/18] mvneta: introduce XDP multi-buffer support
Date: Sat, 18 Sep 2021 13:53:35 +0200
Message-ID: <8735q25ccg.fsf@toke.dk>
In-Reply-To: <614511bc3408b_8d5120862@john-XPS-13-9370.notmuch>

John Fastabend <john.fastabend@gmail.com> writes:

> Alexei Starovoitov wrote:
>> On Fri, Sep 17, 2021 at 12:00 PM Jakub Kicinski <kuba@kernel.org> wrote:
>> >
>> > On Fri, 17 Sep 2021 11:43:07 -0700 Alexei Starovoitov wrote:
>> > > > If bpf_xdp_load_bytes() / bpf_xdp_store_bytes() works for most we
>> > > > can start with that. In all honesty I don't know what the exact
>> > > > use cases for looking at data are, either. I'm primarily worried
>> > > > about exposing the kernel internals too early.
>> > >
>> > > I don't mind the xdp equivalent of skb_load_bytes,
>> > > but skb_header_pointer() idea is superior.
>> > > When we did xdp with data/data_end there was no refine_retval_range
>> > > concept in the verifier (iirc or we just missed that opportunity).
>> > > We'd need something more advanced: a pointer with valid range
>> > > refined by input argument 'len' or NULL.
>> > > The verifier doesn't have such thing yet, but it fits as a combination of
>> > > value_or_null plus refine_retval_range.
>> > > The bpf_xdp_header_pointer() and bpf_skb_header_pointer()
>> > > would probably simplify bpf programs as well.
>> > > There would be no need to deal with data/data_end.
>> >
>> > What are your thoughts on inlining? Can we inline the common case
>> > of the header being in the "head"? Otherwise data/end comparisons
>> > would be faster.
>> 
>> Yeah. It can be inlined by the verifier.
>> It would still look like a call from bpf prog pov with llvm doing spill/fill
>> of scratched regs, but it's minor.
>> 
>> Also we can use the same bpf_header_pointer(ctx, ...)
>> helper for both xdp and skb program types. They will have different
>> implementation underneath, but this might make possible writing bpf
>> programs that could work in both xdp and skb context.
>> I believe cilium has fancy macros to achieve that.
>
> Hi,
>
> First a header_pointer() logic that works across skb and xdp seems like
> a great idea to me. I wonder, though, if instead of copying into a new
> buffer for offsets past the initial frag, as skb_header_pointer() does,
> we could just walk the frags and point at the new offset. This is what
> we do on the socket side with bpf_msg_pull_data(), for example.
> For XDP it should also work. The skb case would depend on clone state
> and things so might be a bit more tricky there.
>
> This has the advantage of only doing the copy when it's necessary. This
> can be useful, for example, when reading the tail of an IPsec packet.
> With a blind copy, most packets will get hit with a copy. By just writing
> pkt->data and pkt->data_end we can avoid this case.
>
> Lorenz originally implemented something similar earlier and we had the
> refine retval logic. It failed on no-alu32 for some reason, which we
> could revisit. I didn't mind the current helper returning with the data
> pointer set to the start of the frag, so we stopped following up on it.
>
> I agree, though, that the current implementation puts a lot on the BPF
> writer. So getting both cases covered ("I want to take pains in my BPF
> prog to avoid copies" and "I just want these bytes handled") behind a
> single helper seems good to me.

I'm OK with a bpf_header_pointer()-type helper - I quite like the
in-kernel version of this for SKBs, so replicating it as a BPF helper
would be great. But I'm a little worried about taking a performance hit.

I.e., if you do:

ptr = bpf_header_pointer(pkt, offset, len, stack_ptr)
*ptr = xxx;

then, if the helper ended up copying the data into the buffer that
stack_ptr points to, you didn't actually change anything in the packet,
so you need to do a writeback.
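
To make the lost write concrete, here is a minimal userspace sketch. It
is not kernel code: struct pkt, locate(), model_header_pointer() and
write_via_pointer() are hypothetical names invented for illustration,
and the two-fragment layout is an assumption. The model returns a direct
pointer when the requested range sits inside one fragment and falls back
to copying into the caller's buffer when it spans two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NFRAGS 2

/* Hypothetical userspace model of a fragmented packet (not a kernel type). */
struct frag { uint8_t *data; size_t len; };
struct pkt { struct frag frags[NFRAGS]; };

/* Map a packet offset to (fragment index, offset inside that fragment). */
static int locate(const struct pkt *p, size_t off, int *idx, size_t *in_off)
{
    for (int i = 0; i < NFRAGS; i++) {
        if (off < p->frags[i].len) {
            *idx = i;
            *in_off = off;
            return 0;
        }
        off -= p->frags[i].len;
    }
    return -1; /* offset is past the end of the packet */
}

/* Model of a header_pointer()-style helper: direct pointer if the range is
 * contiguous in one fragment, otherwise a copy in buf. NULL if out of range. */
static void *model_header_pointer(struct pkt *p, size_t offset, size_t len,
                                  void *buf)
{
    int first, last;
    size_t first_off, last_off;
    uint8_t *dst = buf;

    assert(buf != NULL);
    if (len == 0 || locate(p, offset, &first, &first_off) ||
        locate(p, offset + len - 1, &last, &last_off))
        return NULL;
    if (first == last)
        return p->frags[first].data + first_off; /* no copy needed */

    for (size_t i = 0; i < len; i++) { /* range spans fragments: copy */
        int idx;
        size_t in_off;

        locate(p, offset + i, &idx, &in_off);
        dst[i] = p->frags[idx].data[in_off];
    }
    return buf;
}

/* Write 0xAA through the returned pointer, then report the byte that is
 * actually stored in the packet at that offset (-1 if out of range). */
static int write_via_pointer(size_t offset)
{
    uint8_t a[4] = {0, 1, 2, 3}, b[4] = {4, 5, 6, 7};
    struct pkt p = { .frags = { { a, 4 }, { b, 4 } } };
    uint8_t scratch[2];
    int idx;
    size_t in_off;
    uint8_t *ptr = model_header_pointer(&p, offset, sizeof(scratch), scratch);

    if (!ptr)
        return -1;
    ptr[0] = 0xAA;
    locate(&p, offset, &idx, &in_off);
    return p.frags[idx].data[in_off];
}
```

In this model, write_via_pointer(1) lands in the packet because bytes
1-2 are in the first fragment, while write_via_pointer(3) only changes
the scratch buffer because bytes 3-4 straddle the fragment boundary.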

Jakub suggested up-thread that this should be done with some kind of
flush() helper. But you don't know whether the header_pointer()-helper
copied the data, so you always need to call the flush() helper, which
will incur overhead. If the verifier can in-line the helpers that will
lower it, but will it be enough to make it negligible?
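
As a rough userspace sketch of what that pairing could look like (again
not kernel code; model_header_pointer(), model_flush() and
write_and_flush() are hypothetical names, and the signature of a real
flush() helper was not settled in this thread): the program calls the
flush unconditionally, and the flush only copies bytes back when the
pointer it was handed is the scratch buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NFRAGS 2

/* Hypothetical userspace model of a fragmented packet (not a kernel type). */
struct frag { uint8_t *data; size_t len; };
struct pkt { struct frag frags[NFRAGS]; };

/* Map a packet offset to (fragment index, offset inside that fragment). */
static int locate(const struct pkt *p, size_t off, int *idx, size_t *in_off)
{
    for (int i = 0; i < NFRAGS; i++) {
        if (off < p->frags[i].len) {
            *idx = i;
            *in_off = off;
            return 0;
        }
        off -= p->frags[i].len;
    }
    return -1;
}

/* Direct pointer if the range is inside one fragment, else a copy in buf. */
static void *model_header_pointer(struct pkt *p, size_t offset, size_t len,
                                  void *buf)
{
    int first, last;
    size_t first_off, last_off;
    uint8_t *dst = buf;

    assert(buf != NULL);
    if (len == 0 || locate(p, offset, &first, &first_off) ||
        locate(p, offset + len - 1, &last, &last_off))
        return NULL;
    if (first == last)
        return p->frags[first].data + first_off;
    for (size_t i = 0; i < len; i++) {
        int idx;
        size_t in_off;

        locate(p, offset + i, &idx, &in_off);
        dst[i] = p->frags[idx].data[in_off];
    }
    return buf;
}

/* Model of the flush: write buf back into the packet only if ptr is buf.
 * Returns the number of bytes copied back (0 when ptr was direct). */
static size_t model_flush(struct pkt *p, size_t offset, size_t len,
                          const void *ptr, const void *buf)
{
    const uint8_t *src = buf;

    if (ptr != buf)
        return 0; /* write already went straight into the packet */
    for (size_t i = 0; i < len; i++) {
        int idx;
        size_t in_off;

        if (locate(p, offset + i, &idx, &in_off))
            return i;
        p->frags[idx].data[in_off] = src[i];
    }
    return len;
}

/* Write 0xAA, flush unconditionally, and report the byte now in the packet. */
static int write_and_flush(size_t offset, size_t *copied_back)
{
    uint8_t a[4] = {0, 1, 2, 3}, b[4] = {4, 5, 6, 7};
    struct pkt p = { .frags = { { a, 4 }, { b, 4 } } };
    uint8_t scratch[2];
    int idx;
    size_t in_off;
    uint8_t *ptr = model_header_pointer(&p, offset, sizeof(scratch), scratch);

    if (!ptr)
        return -1;
    ptr[0] = 0xAA;
    *copied_back = model_flush(&p, offset, sizeof(scratch), ptr, scratch);
    locate(&p, offset, &idx, &in_off);
    return p.frags[idx].data[in_off];
}
```

Even in the common case where the flush copies nothing, the call itself
is still made, and that per-packet cost is the part I'm asking about.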

-Toke

