bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Joanne Koong <joannelkoong@gmail.com>
To: Jakub Kicinski <kuba@kernel.org>
Cc: Martin KaFai Lau <kafai@fb.com>, bpf <bpf@vger.kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Alexei Starovoitov <ast@kernel.org>
Subject: Re: [PATCH bpf-next v1 1/3] bpf: Add skb dynptrs
Date: Wed, 3 Aug 2022 20:44:39 -0700	[thread overview]
Message-ID: <CAJnrk1bROt-tPOYUPB=aOJ=e35aCMdRERffCDEh8sKN6gK3Cug@mail.gmail.com> (raw)
In-Reply-To: <20220803183435.4d33dc2e@kernel.org>

On Wed, Aug 3, 2022 at 6:34 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Wed, 3 Aug 2022 18:05:44 -0700 Joanne Koong wrote:
> > I think the problem is that the skb may be cloned, so a write into any
> > portion of the paged data requires a pull. If it weren't for this,
> > then we could do the write and checksumming without pulling (eg kmap
> > the page, get the csum_partial of the bytes you'll write over, do the
> > write, get the csum_partial of the bytes you just wrote, then unkmap,
> > then update skb->csum to be skb->csum - csum of the bytes you wrote
> > over + csum of the bytes you wrote). I think we would even be able to
> > provide a direct data slice to non-contiguous pages without needing
> > the additional copy to a stack buffer (eg kmap the non-contiguous
> > pages to a contiguous virtual address that we pass back to the bpf
> > program, and then when the bpf program is finished do the cleanup for
> > the mappings).
>
> The whole read/write/data concept is not a great match for packet
> parsing. Primary use for packet parsing is that you want to read
> a header and not have to deal with frags or pulling. In that case
> you should get a direct pointer or a copy on the stack, transparently.

The selftests [0] includes some examples of packet parsing using
dynptrs. You're able to get a pointer to the headers (if it's in the
head) directly, or you can use bpf_dynptr_read() to read the data from
the frag into a buffer (without needing to pull; bpf_dynptr_read()
essentially just calls bpf_skb_load_bytes()).

Does this address the use cases you have in mind?

I think the pull and unclone stuff only pertains to the cases for
writes into the frags.

[0] https://lore.kernel.org/bpf/20220726184706.954822-4-joannelkoong@gmail.com/

>
> Maybe before I go on talking nonsense I should read up on what dynptr
> is and what it's trying to achieve. Stan says its like unique_ptr in
> C++ which tells me.. nothing :)
>
> $ git grep dynptr -- Documentation/
> $
>
> Any pointers?

Ooh thanks for the reminder, adding a page for the dynptr
documentation is on my to-do list

A dynptr (also known as fat pointers in other languages) is a pointer
that stores extra metadata alongside the address it points to. In
particular, the metadata the bpf dynptr stores includes the length of
data it points to (so in the case of skb/xdp, the size of the packet),
the type of dynptr, and properties like whether it's read-only.
Dynptrs are an interface for bpf programs to dynamically access
memory, because the helper calls enforce that the accesses are safe.
For example, bpf_dynptr_read() allows reads at offsets and lengths not
statically known at compile time (in bpf_dynptr_read, the kernel uses
the metadata to check that the offset and length doesn't violate
memory bounds); without dynptrs, the verifier can't guarantee that the
offset or length of the read is safe since those values aren't
statically known at compile time, so for example you can't directly
read a dynamic number of bytes from a packet.

With regards to skb + xdp, the main use case of dynptrs are to 1)
allow dynamic-size accesses of packet data and 2) allow easier and
simpler packet parsing (for example, accessing skb->data directly
requires multiple if checks for ensuring it's within bounds of
skb->data_end in order to satisfy the verifier; with the dynptr
interface, you are able to get a direct data slice and access it
without needing the checks. The selftests 3rd patch has some
demonstrations of this).

>
> > Three ideas I'm thinking through as a possible solution:
> > 1) Enforce that the skb is always uncloned for skb-type bpf progs (we
> > currently do this just for the skb head, see bpf_unclone_prologue()),
> > but I'm not sure if the trade-off (pulling all the packet data, even
> > if it won't be used) is acceptable.
> >
> > 2) Don't support cloned skbs for bpf_dynptr_write/data and don't do
> > any pulling. If the prog wants to use bpf_dynptr_write/data, then they
> > have to pull it first
>
> I think all output skbs from TCP are cloned, so that's not gonna work.
>
> > 2) (uglier than #1 and #2) For bpf_dynptr_write()s, pull if the write
> > is to a paged area and the skb is cloned, otherwise write to the paged
> > area without pulling; if we do this, then we always have to invalidate
> > all data slices associated with the skb (even for writes to the head)
> > since at prog load time, the verifier doesn't know if the pull happens
> > or not. For bpf_dynptr_data()s, follow the same policy.
> >
> > I'm leaning towards 2. What are your thoughts?

  reply	other threads:[~2022-08-04  3:45 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-07-26 18:47 [PATCH bpf-next v1 0/3] Add skb + xdp dynptrs Joanne Koong
2022-07-26 18:47 ` [PATCH bpf-next v1 1/3] bpf: Add skb dynptrs Joanne Koong
2022-07-27 17:13   ` sdf
2022-07-28 16:49     ` Joanne Koong
2022-07-28 17:28       ` Stanislav Fomichev
2022-07-28 17:45   ` Hao Luo
2022-07-28 18:36     ` Joanne Koong
2022-07-28 23:39   ` Martin KaFai Lau
2022-07-29 20:26     ` Joanne Koong
2022-07-29 21:39       ` Martin KaFai Lau
2022-08-01 17:52         ` Joanne Koong
2022-08-01 19:38           ` Martin KaFai Lau
2022-08-01 21:16             ` Joanne Koong
2022-08-01 22:14               ` Andrii Nakryiko
2022-08-01 22:32               ` Martin KaFai Lau
2022-08-01 22:58                 ` Andrii Nakryiko
2022-08-01 23:23                   ` Martin KaFai Lau
2022-08-02  0:56                     ` Martin KaFai Lau
2022-08-02  3:51                       ` Andrii Nakryiko
2022-08-02  4:53                         ` Joanne Koong
2022-08-02  5:14                           ` Joanne Koong
2022-08-03 20:29         ` Joanne Koong
2022-08-03 20:36           ` Andrii Nakryiko
2022-08-03 20:56           ` Martin KaFai Lau
2022-08-03 23:25           ` Jakub Kicinski
2022-08-04  1:05             ` Joanne Koong
2022-08-04  1:34               ` Jakub Kicinski
2022-08-04  3:44                 ` Joanne Koong [this message]
2022-08-04  1:27             ` Martin KaFai Lau
2022-08-04  1:44               ` Jakub Kicinski
2022-08-04 22:58             ` Kumar Kartikeya Dwivedi
2022-08-05 23:25               ` Jakub Kicinski
2022-08-01 22:11   ` Andrii Nakryiko
2022-08-02  0:15     ` Joanne Koong
2022-08-01 23:33   ` Jakub Kicinski
2022-08-02  2:12     ` Joanne Koong
2022-08-04 21:55       ` Joanne Koong
2022-08-05 23:22         ` Jakub Kicinski
2022-08-03  6:37   ` Martin KaFai Lau
2022-07-26 18:47 ` [PATCH bpf-next v1 2/3] bpf: Add xdp dynptrs Joanne Koong
2022-07-26 18:47 ` [PATCH bpf-next v1 3/3] selftests/bpf: tests for using dynptrs to parse skb and xdp buffers Joanne Koong
2022-07-26 19:44   ` Zvi Effron
2022-07-26 20:06     ` Joanne Koong
2022-08-01 17:58   ` Andrii Nakryiko
2022-08-02 22:56     ` Joanne Koong
2022-08-03  0:53       ` Andrii Nakryiko
2022-08-03 16:11         ` Joanne Koong
2022-08-04 18:45           ` Alexei Starovoitov
2022-08-05 16:29             ` Joanne Koong
2022-08-01 19:12   ` Alexei Starovoitov
2022-08-02 22:21     ` Joanne Koong
2022-08-04 21:46       ` Joanne Koong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAJnrk1bROt-tPOYUPB=aOJ=e35aCMdRERffCDEh8sKN6gK3Cug@mail.gmail.com' \
    --to=joannelkoong@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=kafai@fb.com \
    --cc=kuba@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).