bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Joanne Koong <joannelkoong@gmail.com>
To: Jakub Kicinski <kuba@kernel.org>
Cc: Martin KaFai Lau <kafai@fb.com>, bpf <bpf@vger.kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Alexei Starovoitov <ast@kernel.org>
Subject: Re: [PATCH bpf-next v1 1/3] bpf: Add skb dynptrs
Date: Wed, 3 Aug 2022 18:05:44 -0700	[thread overview]
Message-ID: <CAJnrk1YWvKqujw1py+HzV8QP1g=cCQmOJ0KtGkNXjdJ4PY3Zkg@mail.gmail.com> (raw)
In-Reply-To: <20220803162540.19d31294@kernel.org>

On Wed, Aug 3, 2022 at 4:25 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Wed, 3 Aug 2022 13:29:37 -0700 Joanne Koong wrote:
> > Thinking about this some more, I think BPF_FUNC_dynptr_from_skb needs
> > to be patched regardless in order to set the rd-only flag in the
> > metadata for the dynptr. There will be other helper functions that
> > write into dynptrs (eg memcpy with dynptrs, strncpy with dynptrs,
> > probe read user with dynptrs, ...) so I think it's more scalable if we
> > reject these writes at runtime through the rd-only flag in the
> > metadata, than for the verifier to custom-case that any helper funcs
> > that write into dynptrs will need to get dynptr type + do
> > may_access_direct_pkt_data() if it's type skb or xdp. The
> > inconsistency between not rd-only in metadata vs. rd-only in verifier
> > might be a little confusing as well.
> >
> > For these reasons, I'm leaning more towards having bpf_dynptr_write()
> > and other dynptr write helper funcs be rejected at runtime instead of
> > prog load time, but I'm eager to hear what you prefer.
> >
> > What are your thoughts?
>
> Oh. I thought dynptrs are an extension of the discussion we had about
> creating a skb_header_pointer()-like abstraction but it sounds like
> we veered quite far off that track at some point :(

I think the problem is that the skb may be cloned, so a write into any
portion of the paged data requires a pull. If it weren't for this,
then we could do the write and checksumming without pulling (eg kmap
the page, get the csum_partial of the bytes you'll write over, do the
write, get the csum_partial of the bytes you just wrote, then unkmap,
then update skb->csum to be skb->csum - csum of the bytes you wrote
over + csum of the bytes you wrote). I think we would even be able to
provide a direct data slice to non-contiguous pages without needing
the additional copy to a stack buffer (eg kmap the non-contiguous
pages to a contiguous virtual address that we pass back to the bpf
program, and then when the bpf program is finished do the cleanup for
the mappings).

Three ideas I'm thinking through as a possible solution:
1) Enforce that the skb is always uncloned for skb-type bpf progs (we
currently do this just for the skb head, see bpf_unclone_prologue()),
but I'm not sure if the trade-off (pulling all the packet data, even
if it won't be used) is acceptable.

2) Don't support cloned skbs for bpf_dynptr_write/data and don't do
any pulling. If the prog wants to use bpf_dynptr_write/data, then they
have to pull it first

2) (uglier than #1 and #2) For bpf_dynptr_write()s, pull if the write
is to a paged area and the skb is cloned, otherwise write to the paged
area without pulling; if we do this, then we always have to invalidate
all data slices associated with the skb (even for writes to the head)
since at prog load time, the verifier doesn't know if the pull happens
or not. For bpf_dynptr_data()s, follow the same policy.

I'm leaning towards 2. What are your thoughts?

>
> The point of skb_header_pointer() is to expose the chunk of the packet
> pointed to by [skb, offset, len] as a linear buffer. Potentially coping
> it out to a stack buffer *IIF* the header is not contiguous inside the
> skb head, which should very rarely happen.
>
> Here it seems we return an error so that user must pull if the data is
> not linear, which is defeating the purpose. The user of
> skb_header_pointer() wants to avoid the copy while _reliably_ getting
> a contiguous pointer. Plus pulling in the header may be far more
> expensive than a small copy to the stack.
>
> The pointer returned by skb_header_pointer is writable, but it's not
> guaranteed that the writes go to the packet, they may go to the
> on-stack buffer, so the caller must do some sort of:
>
>         if (data_ptr == stack_buf)
>                 skb_store_bits(...);
>
> Which we were thinking of wrapping in some sort of flush operation.
>
> If I'm reading this right dynptr as implemented here do not provide
> such semantics, am I confused in thinking that this is a continuation
> of the XDP multi-buff discussion? Is it a completely separate thing
> and we'll still need a header_pointer like helper?

  reply	other threads:[~2022-08-04  1:06 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-07-26 18:47 [PATCH bpf-next v1 0/3] Add skb + xdp dynptrs Joanne Koong
2022-07-26 18:47 ` [PATCH bpf-next v1 1/3] bpf: Add skb dynptrs Joanne Koong
2022-07-27 17:13   ` sdf
2022-07-28 16:49     ` Joanne Koong
2022-07-28 17:28       ` Stanislav Fomichev
2022-07-28 17:45   ` Hao Luo
2022-07-28 18:36     ` Joanne Koong
2022-07-28 23:39   ` Martin KaFai Lau
2022-07-29 20:26     ` Joanne Koong
2022-07-29 21:39       ` Martin KaFai Lau
2022-08-01 17:52         ` Joanne Koong
2022-08-01 19:38           ` Martin KaFai Lau
2022-08-01 21:16             ` Joanne Koong
2022-08-01 22:14               ` Andrii Nakryiko
2022-08-01 22:32               ` Martin KaFai Lau
2022-08-01 22:58                 ` Andrii Nakryiko
2022-08-01 23:23                   ` Martin KaFai Lau
2022-08-02  0:56                     ` Martin KaFai Lau
2022-08-02  3:51                       ` Andrii Nakryiko
2022-08-02  4:53                         ` Joanne Koong
2022-08-02  5:14                           ` Joanne Koong
2022-08-03 20:29         ` Joanne Koong
2022-08-03 20:36           ` Andrii Nakryiko
2022-08-03 20:56           ` Martin KaFai Lau
2022-08-03 23:25           ` Jakub Kicinski
2022-08-04  1:05             ` Joanne Koong [this message]
2022-08-04  1:34               ` Jakub Kicinski
2022-08-04  3:44                 ` Joanne Koong
2022-08-04  1:27             ` Martin KaFai Lau
2022-08-04  1:44               ` Jakub Kicinski
2022-08-04 22:58             ` Kumar Kartikeya Dwivedi
2022-08-05 23:25               ` Jakub Kicinski
2022-08-01 22:11   ` Andrii Nakryiko
2022-08-02  0:15     ` Joanne Koong
2022-08-01 23:33   ` Jakub Kicinski
2022-08-02  2:12     ` Joanne Koong
2022-08-04 21:55       ` Joanne Koong
2022-08-05 23:22         ` Jakub Kicinski
2022-08-03  6:37   ` Martin KaFai Lau
2022-07-26 18:47 ` [PATCH bpf-next v1 2/3] bpf: Add xdp dynptrs Joanne Koong
2022-07-26 18:47 ` [PATCH bpf-next v1 3/3] selftests/bpf: tests for using dynptrs to parse skb and xdp buffers Joanne Koong
2022-07-26 19:44   ` Zvi Effron
2022-07-26 20:06     ` Joanne Koong
2022-08-01 17:58   ` Andrii Nakryiko
2022-08-02 22:56     ` Joanne Koong
2022-08-03  0:53       ` Andrii Nakryiko
2022-08-03 16:11         ` Joanne Koong
2022-08-04 18:45           ` Alexei Starovoitov
2022-08-05 16:29             ` Joanne Koong
2022-08-01 19:12   ` Alexei Starovoitov
2022-08-02 22:21     ` Joanne Koong
2022-08-04 21:46       ` Joanne Koong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAJnrk1YWvKqujw1py+HzV8QP1g=cCQmOJ0KtGkNXjdJ4PY3Zkg@mail.gmail.com' \
    --to=joannelkoong@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=kafai@fb.com \
    --cc=kuba@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).