From: Andrii Nakryiko <andrii.nakryiko@gmail.com>
To: Joanne Koong <joannelkoong@gmail.com>
Cc: bpf@vger.kernel.org, andrii@kernel.org, ast@kernel.org,
daniel@iogearbox.net
Subject: Re: [PATCH v1 bpf-next 4/5] bpf: Add bpf_dynptr_clone
Date: Mon, 17 Apr 2023 16:46:13 -0700 [thread overview]
Message-ID: <CAEf4BzZPUmcSAPwtgVRebCDWccaU6EC3yHt99Asm=akoewbBEA@mail.gmail.com> (raw)
In-Reply-To: <CAJnrk1b+FsAUHneWdyarMs6kwd8CBNFi9s7n38KXQ2uF+NkvTw@mail.gmail.com>
On Thu, Apr 13, 2023 at 11:03 PM Joanne Koong <joannelkoong@gmail.com> wrote:
>
> On Wed, Apr 12, 2023 at 3:12 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Sat, Apr 8, 2023 at 8:34 PM Joanne Koong <joannelkoong@gmail.com> wrote:
> > >
> > > The cloned dynptr will point to the same data as its parent dynptr,
> > > with the same type, offset, size and read-only properties.
> > >
> > > Any writes to a dynptr will be reflected across all instances
> > > (by 'instance', this means any dynptrs that point to the same
> > > underlying data).
> > >
> > > Please note that data slice and dynptr invalidations will affect all
> > > instances as well. For example, if bpf_dynptr_write() is called on an
> > > skb-type dynptr, all data slices of dynptr instances to that skb
> > > will be invalidated as well (eg data slices of any clones, parents,
> > > grandparents, ...). Another example is if a ringbuf dynptr is submitted,
> > > any instance of that dynptr will be invalidated.
> > >
> > > Changing the view of the dynptr (eg advancing the offset or
> > > trimming the size) will only affect that dynptr and not affect any
> > > other instances.
> > >
> > > One example use case where cloning may be helpful is for hashing or
> > > iterating through dynptr data. Cloning will allow the user to maintain
> > > the original view of the dynptr for future use, while also allowing
> > > views to smaller subsets of the data after the offset is advanced or the
> > > size is trimmed.
> > >
> > > Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> > > ---
> > > kernel/bpf/helpers.c | 14 +++++
> > > kernel/bpf/verifier.c | 125 +++++++++++++++++++++++++++++++++++++-----
> > > 2 files changed, 126 insertions(+), 13 deletions(-)
> > >
> > > diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> > > index bac4c6fe49f0..108f3bcfa6da 100644
> > > --- a/kernel/bpf/helpers.c
> > > +++ b/kernel/bpf/helpers.c
> > > @@ -2351,6 +2351,19 @@ __bpf_kfunc __u32 bpf_dynptr_get_offset(const struct bpf_dynptr_kern *ptr)
> > > return ptr->offset;
> > > }
> > >
> > > +__bpf_kfunc int bpf_dynptr_clone(struct bpf_dynptr_kern *ptr,
> > > + struct bpf_dynptr_kern *clone__uninit)
> >
> > I think most of uses for bpf_dynptr_clone() would be to get a partial
> > view (like you mentioned above, to, e.g., do a hashing of a part of
> > the memory range). So it feels it would be best UX if clone would
> > allow you to define a new range in one go. So e.g., to create a
> > "sub-dynptr" for range of bytes [10, 30), it should be enough to:
> >
> > struct bpf_dynptr orig_ptr, new_ptr;
> >
> > bpf_dynptr_clone(&orig_ptr, 10, 30, &new_ptr);
> >
> > Instead of:
> >
> > bpf_dynptr_clone(&orig_ptr, &new_ptr);
> > bpf_dynptr_advance(&new_ptr, 10);
> > bpf_dynptr_trim(&new_ptr, bpf_dynptr_get_size(&orig_ptr) - 30);
> >
>
> I don't think we can do this, because bpf_dynptr_clone() can't be
> allowed to fail (which could happen if we took in a range and the
> range was invalid). When the clone is of a reference-counted dynptr
> (eg a ringbuf-type dynptr), the verifier must treat the clone as a
> legit dynptr even if it's invalid, since it can't know ahead of time
> whether the clone call will succeed at runtime. That means if the
> invalid clone dynptr is then passed into a reference-releasing
> function, the verifier will release the reference, but the actual
> reference won't be released at runtime (since the clone dynptr is
> invalid), which would leak the reference. An example is something
> like:
>
> // invalid range is passed, error is returned and new_ptr is invalid
> bpf_dynptr_clone(&ringbuf_ptr, 9999999, 9999999, &new_ptr);
> // this releases the reference and invalidates both new_ptr and ringbuf_ptr
> bpf_ringbuf_discard_dynptr(&new_ptr, 0);
>
> At runtime, bpf_ringbuf_discard_dynptr() will be a no-op since new_ptr
> is invalid - the ringbuf record still needs to be submitted/discarded,
> but the verifier will think this already happened
Ah, tricky, good point. But ok, I guess with the bpf_dynptr_adjust()
proposal in the other email this would be ok:

bpf_dynptr_clone(...); /* always succeeds */
bpf_dynptr_adjust(&new_ptr, 10, 30); /* could fail to adjust, but dynptr is left valid */

Right?
>
> >
> > This, btw, shows the awkwardness of the bpf_dynptr_trim() approach.
> >
> > If someone really wanted an exact full-sized copy, it's trivial:
> >
> > bpf_dynptr_clone(&orig_ptr, 0, bpf_dynptr_get_size(&orig_ptr), &new_ptr);
> >
> >
> > BTW, let's rename bpf_dynptr_get_size -> bpf_dynptr_size()? That
> > "get_" is a sore in the eye, IMO.
>
> Will do!
> >
> >
> > > +{
> > > + if (!ptr->data) {
> > > + bpf_dynptr_set_null(clone__uninit);
> > > + return -EINVAL;
> > > + }
> > > +
> > > + memcpy(clone__uninit, ptr, sizeof(*clone__uninit));
> >
> > why memcpy instead of `*clone__uninit = *ptr`?
> >
> No good reason :) I'll change this for v2
> > > +
> > > + return 0;
> > > +}
> > > +
> > > __bpf_kfunc void *bpf_cast_to_kern_ctx(void *obj)
> > > {
> > > return obj;
> > > @@ -2429,6 +2442,7 @@ BTF_ID_FLAGS(func, bpf_dynptr_is_null)
> > > BTF_ID_FLAGS(func, bpf_dynptr_is_rdonly)
> > > BTF_ID_FLAGS(func, bpf_dynptr_get_size)
> > > BTF_ID_FLAGS(func, bpf_dynptr_get_offset)
> > > +BTF_ID_FLAGS(func, bpf_dynptr_clone)
> > > BTF_SET8_END(common_btf_ids)
> > >
> > > static const struct btf_kfunc_id_set common_kfunc_set = {
> > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > index 3660b573048a..804cb50050f9 100644
> > > --- a/kernel/bpf/verifier.c
> > > +++ b/kernel/bpf/verifier.c
> > > @@ -302,6 +302,7 @@ struct bpf_kfunc_call_arg_meta {
> > > struct {
> > > enum bpf_dynptr_type type;
> > > u32 id;
> > > + u32 ref_obj_id;
> > > } initialized_dynptr;
> > > struct {
> > > u8 spi;
> > > @@ -963,24 +964,15 @@ static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_
> > > return 0;
> > > }
> > >
> > > -static int unmark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
> > > +static void invalidate_dynptr(struct bpf_verifier_env *env, struct bpf_func_state *state, int spi)
> > > {
> > > - struct bpf_func_state *state = func(env, reg);
> > > - int spi, i;
> > > -
> > > - spi = dynptr_get_spi(env, reg);
> > > - if (spi < 0)
> > > - return spi;
> > > + int i;
> > >
> > > for (i = 0; i < BPF_REG_SIZE; i++) {
> > > state->stack[spi].slot_type[i] = STACK_INVALID;
> > > state->stack[spi - 1].slot_type[i] = STACK_INVALID;
> > > }
> > >
> > > - /* Invalidate any slices associated with this dynptr */
> > > - if (dynptr_type_refcounted(state->stack[spi].spilled_ptr.dynptr.type))
> > > - WARN_ON_ONCE(release_reference(env, state->stack[spi].spilled_ptr.ref_obj_id));
> > > -
> > > __mark_reg_not_init(env, &state->stack[spi].spilled_ptr);
> > > __mark_reg_not_init(env, &state->stack[spi - 1].spilled_ptr);
> > >
> > > @@ -1007,6 +999,51 @@ static int unmark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_re
> > > */
> > > state->stack[spi].spilled_ptr.live |= REG_LIVE_WRITTEN;
> > > state->stack[spi - 1].spilled_ptr.live |= REG_LIVE_WRITTEN;
> > > +}
> > > +
> > > +static int unmark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
> > > +{
> > > + struct bpf_func_state *state = func(env, reg);
> > > + int spi;
> > > +
> > > + spi = dynptr_get_spi(env, reg);
> > > + if (spi < 0)
> > > + return spi;
> > > +
> > > + if (dynptr_type_refcounted(state->stack[spi].spilled_ptr.dynptr.type)) {
> > > + int ref_obj_id = state->stack[spi].spilled_ptr.ref_obj_id;
> > > + int i;
> > > +
> > > + /* If the dynptr has a ref_obj_id, then we need to invaldiate
> >
> > typo: invalidate
> >
> > > + * two things:
> > > + *
> > > + * 1) Any dynptrs with a matching ref_obj_id (clones)
> > > + * 2) Any slices associated with the ref_obj_id
> >
> > I think this is where this dynptr_id confusion comes from. The rule
> > should be "any slices derived from this dynptr". But as mentioned on
> > another thread, it's a separate topic which we can address later.
> >
> If there's a parent and a clone and slices derived from the parent and
> slices derived from the clone, if the clone is invalidated then we
> need to invalidate slices associated with the parent as well. So
> shouldn't it be "any slices associated with the ref obj id" not "any
> slices derived from this dynptr"? (also just a note, parent/clone
> slices will share the same ref obj id and the same dynptr id, so
> checking against either does the same thing)
So, we have a ringbuf dynptr with ref_obj_id=1, id=2, ok? We clone it,
clone gets ref_obj_id=1, id=3. If either original dynptr or clone
dynptr is released due to bpf_ringbuf_discard_dynptr(), we invalidate
all the dynptrs with ref_obj_id=1. During invalidation of each dynptr,
we invalidate all the slices with ref_obj_id==dynptr's id. So we'll
invalidate slices derived from dynptr with id=2 (original dynptr), and
then all the slices derived from dynptr with id=3?
>
> > > + */
> > > +
> > > + /* Invalidate any slices associated with this dynptr */
> > > + WARN_ON_ONCE(release_reference(env, ref_obj_id));
> > > +
> > > + /* Invalidate any dynptr clones */
> > > + for (i = 1; i < state->allocated_stack / BPF_REG_SIZE; i++) {
> > > + if (state->stack[i].spilled_ptr.ref_obj_id == ref_obj_id) {
> >
> > nit: invert if condition and continue to reduce nestedness of the rest
> > the loop body
>
> Ooh good idea
> >
> > > + /* it should always be the case that if the ref obj id
> > > + * matches then the stack slot also belongs to a
> > > + * dynptr
> > > + */
> > > + if (state->stack[i].slot_type[0] != STACK_DYNPTR) {
> > > + verbose(env, "verifier internal error: misconfigured ref_obj_id\n");
> > > + return -EFAULT;
> > > + }
> > > + if (state->stack[i].spilled_ptr.dynptr.first_slot)
> > > + invalidate_dynptr(env, state, i);
> > > + }
> > > + }
> > > +
> > > + return 0;
> > > + }
> > > +
> > > + invalidate_dynptr(env, state, spi);
> > >
> > > return 0;
> > > }
> > > @@ -6967,6 +7004,50 @@ static int process_iter_next_call(struct bpf_verifier_env *env, int insn_idx,
> > > return 0;
> > > }
> > >
> > > +static int handle_dynptr_clone(struct bpf_verifier_env *env, enum bpf_arg_type arg_type,
> > > + int regno, int insn_idx, struct bpf_kfunc_call_arg_meta *meta)
> > > +{
> > > + struct bpf_reg_state *regs = cur_regs(env), *reg = &regs[regno];
> > > + struct bpf_reg_state *first_reg_state, *second_reg_state;
> > > + struct bpf_func_state *state = func(env, reg);
> > > + enum bpf_dynptr_type dynptr_type = meta->initialized_dynptr.type;
> > > + int err, spi, ref_obj_id;
> > > +
> > > + if (!dynptr_type) {
> > > + verbose(env, "verifier internal error: no dynptr type for bpf_dynptr_clone\n");
> > > + return -EFAULT;
> > > + }
> > > + arg_type |= get_dynptr_type_flag(dynptr_type);
> >
> >
> > what about MEM_RDONLY and MEM_UNINIT flags, do we need to derive and add them?
>
> I don't think we need MEM_UNINIT because we can't clone an
> uninitialized dynptr. But yes, definitely MEM_RDONLY. I missed that, I
> will add it in in v2
>
> >
> > > +
> > > + err = process_dynptr_func(env, regno, insn_idx, arg_type);
> > > + if (err < 0)
> > > + return err;
> > > +
> > > + spi = dynptr_get_spi(env, reg);
> > > + if (spi < 0)
> > > + return spi;
> > > +
> > > + first_reg_state = &state->stack[spi].spilled_ptr;
> > > + second_reg_state = &state->stack[spi - 1].spilled_ptr;
> > > + ref_obj_id = first_reg_state->ref_obj_id;
> > > +
> > > + /* reassign the clone the same dynptr id as the original */
> > > + __mark_dynptr_reg(first_reg_state, dynptr_type, true, meta->initialized_dynptr.id);
> > > + __mark_dynptr_reg(second_reg_state, dynptr_type, false, meta->initialized_dynptr.id);
> >
> > I'm not sure why clone should have the same dynptr_id? Isn't it a new
> > instance of a dynptr? I get preserving ref_obj_id (if refcounted), but
> > why reusing dynptr_id?..
> >
> I think we need to also copy over the dynptr id because in the case of
> a non-reference counted dynptr, if the parent (or clone) is
> invalidated (eg overwriting bytes of the dynptr on the stack), we must
> also invalidate the slices of the clone (or parent)
yep, right now we'll have to do that because we have dynptr_id. But if
we get rid of it and stick to ref_obj_id and id, then clone would need
to get a new id, but keep ref_obj_id, right?
> >
> > > +
> > > + if (meta->initialized_dynptr.ref_obj_id) {
> > > + /* release the new ref obj id assigned during process_dynptr_func */
> > > + err = release_reference_state(cur_func(env), ref_obj_id);
> > > + if (err)
> > > + return err;
> >
> > ugh... this is not good to create reference just to release. If we
> > need to reuse logic, let's reuse the logic without parts that
> > shouldn't happen. Please check if we can split process_dynptr_func()
> > appropriately to allow this.
>
> I'll change this for v2. I think the simplest approach would be having
> mark_stack_slots_dynptr() take in a boolean or something that'll
> indicate whether it should acquire a new reference state or not
> >
> > > + /* reassign the clone the same ref obj id as the original */
> > > + first_reg_state->ref_obj_id = meta->initialized_dynptr.ref_obj_id;
> > > + second_reg_state->ref_obj_id = meta->initialized_dynptr.ref_obj_id;
> > > + }
> > > +
> > > + return 0;
> > > +}
> > > +
> >
> > [...]
Thread overview: 22+ messages
2023-04-09 3:34 [PATCH v1 bpf-next 0/5] Dynptr convenience helpers Joanne Koong
2023-04-09 3:34 ` [PATCH v1 bpf-next 1/5] bpf: Add bpf_dynptr_trim and bpf_dynptr_advance Joanne Koong
2023-04-12 21:46 ` Andrii Nakryiko
2023-04-14 5:15 ` Joanne Koong
2023-04-17 23:35 ` Andrii Nakryiko
2023-04-19 6:22 ` Joanne Koong
2023-04-19 16:30 ` Andrii Nakryiko
2023-04-09 3:34 ` [PATCH v1 bpf-next 2/5] bpf: Add bpf_dynptr_is_null and bpf_dynptr_is_rdonly Joanne Koong
2023-04-12 21:50 ` Andrii Nakryiko
2023-04-20 6:45 ` Joanne Koong
2023-04-09 3:34 ` [PATCH v1 bpf-next 3/5] bpf: Add bpf_dynptr_get_size and bpf_dynptr_get_offset Joanne Koong
2023-04-12 21:52 ` Andrii Nakryiko
2023-04-14 5:17 ` Joanne Koong
2023-04-09 3:34 ` [PATCH v1 bpf-next 4/5] bpf: Add bpf_dynptr_clone Joanne Koong
2023-04-12 22:12 ` Andrii Nakryiko
2023-04-14 6:02 ` Joanne Koong
2023-04-17 23:46 ` Andrii Nakryiko [this message]
2023-04-19 6:56 ` Joanne Koong
2023-04-19 16:34 ` Andrii Nakryiko
2023-04-17 18:53 ` kernel test robot
2023-04-09 3:34 ` [PATCH v1 bpf-next 5/5] selftests/bpf: add tests for dynptr convenience helpers Joanne Koong
2023-04-12 21:48 ` [PATCH v1 bpf-next 0/5] Dynptr " Andrii Nakryiko