netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Andrii Nakryiko <andrii.nakryiko@gmail.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Martin KaFai Lau <martin.lau@linux.dev>,
	Joanne Koong <joannelkoong@gmail.com>,
	daniel@iogearbox.net, andrii@kernel.org, martin.lau@kernel.org,
	ast@kernel.org, netdev@vger.kernel.org, memxor@gmail.com,
	kernel-team@fb.com, bpf@vger.kernel.org
Subject: Re: [PATCH v9 bpf-next 3/5] bpf: Add skb dynptrs
Date: Wed, 1 Feb 2023 17:21:39 -0800	[thread overview]
Message-ID: <CAEf4BzbTXqhsKqPd=hDANKeg75UDbKjtX318ucMGw7a1L3693w@mail.gmail.com> (raw)
In-Reply-To: <20230201004034.sea642affpiu7yfm@macbook-pro-6.dhcp.thefacebook.com>

On Tue, Jan 31, 2023 at 4:40 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Tue, Jan 31, 2023 at 04:11:47PM -0800, Andrii Nakryiko wrote:
> > >
> > > When prog is just parsing the packet it doesn't need to finalize with bpf_dynptr_write.
> > > The prog can always write into the pointer followed by if (p == buf) bpf_dynptr_write.
> > > No need for rdonly flag, but extra copy is there in case of cloned which
> > > could have been avoided with extra rd_only flag.
> >
> > Yep, given we are designing bpf_dynptr_slice for performance, extra
> > copy on reads is unfortunate. ro/rw flag or have separate
> > bpf_dynptr_slice_rw vs bpf_dynptr_slice_ro?
>
> Either flag or two kfuncs sound good to me.

Would it make sense to make bpf_dynptr_slice() as read-only variant,
and bpf_dynptr_slice_rw() for read/write? I think the common case is
read-only, right? And if users mistakenly use bpf_dynptr_slice() for
r/w case, they will get a verifier error when trying to write into the
returned pointer. While if we make bpf_dynptr_slice() as read-write,
users won't realize they are paying a performance penalty for
something that they don't actually need.

>
> > > Yes and No. bpf_skb_store_bytes is doing pull followed by memcpy,
> > > while xdp_store_bytes does scatter gather copy into frags.
> > > We should probably add similar copy to skb case to avoid allocations and pull.
> > > Then in case of:
> > >  if (p == buf) {
> > >       bpf_dynptr_write(dp, buf, 16);
> > >  }
> > >
> > > the write will guarantee to succeed for both xdp and skb and the user
> > > doesn't need to add error checking for alloc failures in case of skb.
> > >
> >
> > That seems like a nice guarantee, agreed.
>
> Just grepped through few projects that use skb_store_bytes.
> Everywhere it looks like:
> if (bpf_skb_store_byte(...))
>    return error;
>
> Not a pretty code to read.
> I should prioritize bpf_assert() work, so we can assert from inside of
> bpf_dynptr_write() eventually and remove all these IFs.
>
> > > > >
> > > > > > But I wonder if for simple cases when users are mostly sure that they
> > > > > > are going to access only header data directly we can have an option
> > > > > > for bpf_dynptr_from_skb() to specify what should be the behavior for
> > > > > > bpf_dynptr_slice():
> > > > > >
> > > > > >   - either return NULL for anything that crosses into frags (no
> > > > > > surprising perf penalty, but surprising NULLs);
> > > > > >   - do bpf_skb_pull_data() if bpf_dynptr_data() needs to point to data
> > > > > > beyond header (potential perf penalty, but on NULLs, if off+len is
> > > > > > within packet).
> > > > > >
> > > > > > And then bpf_dynptr_from_skb() can accept a flag specifying this
> > > > > > behavior and store it somewhere in struct bpf_dynptr.
> > > > >
> > > > > xdp does not have the bpf_skb_pull_data() equivalent, so xdp prog will still
> > > > > need the write back handling.
> > > > >
> > > >
> > > > Sure, unfortunately, can't have everything. I'm just thinking how to
> > > > make bpf_dynptr_data() generically usable. Think about some common BPF
> > > > routine that calculates hash for all bytes pointed to by dynptr,
> > > > regardless of underlying dynptr type; it can iterate in small chunks,
> > > > get memory slice, if possible, but fallback to generic
> > > > bpf_dynptr_read() if doesn't. This will work for skb, xdp, LOCAL,
> > > > RINGBUF, any other dynptr type.
> > >
> > > It looks to me that dynptr on top of skb, xdp, local can work as generic reader,
> > > but dynptr as a generic writer doesn't look possible.
> > > BPF_F_RECOMPUTE_CSUM and BPF_F_INVALIDATE_HASH are special to skb.
> > > There is also bpf_skb_change_proto and crazy complex bpf_skb_adjust_room.
> > > I don't think writing into skb vs xdp vs ringbuf are generalizable.
> > > The prog needs to do a ton more work to write into skb correctly.
> >
> > If that's the case, then yeah, bpf_dynptr_write() can just return
> > error for skb/xdp dynptrs?
>
> You mean to error when these skb only flags are present, but dynptr->type == xdp ?
> Yep. I don't see another option. My point was that dynptr doesn't quite work as an
> abstraction for writing into networking things.

agreed

> While libraries like: parse_http(&dynptr), compute_hash(&dynptr), find_string(&dynptr)
> can indeed be generic and work with raw bytes, skb, xdp as an input,
> which I think was on top of your wishlist for dynptr.

yep, it would be a great property

  reply	other threads:[~2023-02-02  1:21 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-27 19:16 [PATCH v9 bpf-next 0/5] Add skb + xdp dynptrs Joanne Koong
2023-01-27 19:16 ` [PATCH v9 bpf-next 1/5] bpf: Allow "sk_buff" and "xdp_buff" as valid kfunc arg types Joanne Koong
2023-01-27 19:17 ` [PATCH v9 bpf-next 2/5] bpf: Allow initializing dynptrs in kfuncs Joanne Koong
2023-01-27 19:17 ` [PATCH v9 bpf-next 3/5] bpf: Add skb dynptrs Joanne Koong
2023-01-29 23:39   ` Alexei Starovoitov
2023-01-31  0:44     ` Joanne Koong
2023-01-31  5:36       ` Alexei Starovoitov
2023-01-31 17:54         ` Joanne Koong
2023-01-31 19:50           ` Alexei Starovoitov
2023-01-31 21:29             ` Joanne Koong
2023-02-08 21:46           ` Joanne Koong
2023-02-08 23:22             ` Alexei Starovoitov
2023-01-30 22:04   ` Martin KaFai Lau
2023-01-30 22:31     ` Alexei Starovoitov
2023-01-30 23:11       ` Martin KaFai Lau
2023-01-31  1:04       ` Andrii Nakryiko
2023-01-31  1:49         ` Martin KaFai Lau
2023-01-31  4:43           ` Andrii Nakryiko
2023-01-31  5:30             ` Alexei Starovoitov
2023-01-31 22:07               ` Martin KaFai Lau
2023-01-31 23:17               ` Joanne Koong
2023-02-01  0:46                 ` Alexei Starovoitov
2023-02-01  0:11               ` Andrii Nakryiko
2023-02-01  0:40                 ` Alexei Starovoitov
2023-02-02  1:21                   ` Andrii Nakryiko [this message]
2023-02-02 11:43                     ` Alexei Starovoitov
2023-02-03 21:37                       ` Andrii Nakryiko
2023-02-08  2:25                         ` Alexei Starovoitov
2023-02-08 20:13                           ` Joanne Koong
2023-02-09  0:39                             ` Andrii Nakryiko
2023-01-31 18:30         ` Joanne Koong
2023-01-31 19:58           ` Alexei Starovoitov
2023-01-31 20:47             ` Joanne Koong
2023-01-31 21:10               ` Alexei Starovoitov
2023-01-31 21:33                 ` Joanne Koong
2023-01-31 18:18     ` Joanne Koong
2023-01-31  0:48   ` Andrii Nakryiko
2023-01-31  0:55     ` Joanne Koong
2023-01-31  1:06       ` Andrii Nakryiko
2023-01-31  1:13         ` Joanne Koong
2023-01-31  1:19           ` Andrii Nakryiko
2023-01-27 19:17 ` [PATCH v9 bpf-next 4/5] bpf: Add xdp dynptrs Joanne Koong
2023-01-27 19:17 ` [PATCH v9 bpf-next 5/5] selftests/bpf: tests for using dynptrs to parse skb and xdp buffers Joanne Koong
2023-01-31  0:49   ` Andrii Nakryiko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAEf4BzbTXqhsKqPd=hDANKeg75UDbKjtX318ucMGw7a1L3693w@mail.gmail.com' \
    --to=andrii.nakryiko@gmail.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=joannelkoong@gmail.com \
    --cc=kernel-team@fb.com \
    --cc=martin.lau@kernel.org \
    --cc=martin.lau@linux.dev \
    --cc=memxor@gmail.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).