All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dave Marchevsky <davemarchevsky@meta.com>
To: David Vernet <void@manifault.com>,
	Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Dave Marchevsky <davemarchevsky@fb.com>,
	bpf <bpf@vger.kernel.org>, Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	Kernel Team <kernel-team@fb.com>,
	Kumar Kartikeya Dwivedi <memxor@gmail.com>,
	Tejun Heo <tj@kernel.org>
Subject: Re: [PATCH v2 bpf-next 01/13] bpf: Support multiple arg regs w/ ref_obj_id for kfuncs
Date: Tue, 17 Jan 2023 12:26:32 -0500	[thread overview]
Message-ID: <afcae0f4-97a0-b06e-0a4e-7955ca7dbc7c@meta.com> (raw)
In-Reply-To: <Y63HrbJV+rTSmvVe@maniforge.lan>

On 12/29/22 12:00 PM, David Vernet wrote:
> On Thu, Dec 29, 2022 at 08:50:19AM -0800, Alexei Starovoitov wrote:
>> On Wed, Dec 28, 2022 at 10:40 PM David Vernet <void@manifault.com> wrote:
>>>
>>> On Sat, Dec 17, 2022 at 12:24:54AM -0800, Dave Marchevsky wrote:
>>>> Currently, kfuncs marked KF_RELEASE indicate that they release some
>>>> previously-acquired arg. The verifier assumes that such a function will
>>>> only have one arg reg w/ ref_obj_id set, and that that arg is the one to
>>>> be released. Multiple kfunc arg regs have ref_obj_id set is considered
>>>> an invalid state.
>>>>
>>>> For helpers, RELEASE is used to tag a particular arg in the function
>>>> proto, not the function itself. The arg with OBJ_RELEASE type tag is the
>>>> arg that the helper will release. There can only be one such tagged arg.
>>>> When verifying arg regs, multiple helper arg regs w/ ref_obj_id set is
>>>> also considered an invalid state.
>>>>
>>>> Later patches in this series will result in some linked_list helpers
>>>> marked KF_RELEASE having a valid reason to take two ref_obj_id args.
>>>> Specifically, bpf_list_push_{front,back} can push a node to a list head
>>>> which is itself part of a list node. In such a scenario both arguments
>>>> to these functions would have ref_obj_id > 0, thus would fail
>>>> verification under current logic.
>>>>
>>>> This patch changes kfunc ref_obj_id searching logic to find the last arg
>>>> reg w/ ref_obj_id and consider that the reg-to-release. This should be
>>>> backwards-compatible with all current kfuncs as they only expect one
>>>> such arg reg.
>>>
>>> Can't say I'm a huge fan of this proposal :-( While I think it's really
>>> unfortunate that kfunc flags are not defined per-arg for this exact type
>>> of reason, adding more flag-specific semantics like this is IMO a step
>>> in the wrong direction.  It's similar to the existing __sz and __k
>>> argument-naming semantics that inform the verifier that the arguments
>>> have special meaning. All of these little additions of special-case
>>> handling for kfunc flags end up requiring people writing kfuncs (and
>>> sometimes calling them) to read through the verifier to understand
>>> what's going on (though I will say that it's nice that __sz and __k are
>>> properly documented in [0]).
>>
>> Before getting to pros/cons of KF_* vs name suffix vs helper style
>> per-arg description...
>> It's important to highlight that here we're talking about
>> link list and rb tree kfuncs that are not like other kfuncs.
>> Majority of kfuncs can be added by subsystems like hid-bpf
>> without touching the verifier.
> 
> I hear you and I agree. It wasn't my intention to drag us into a larger
> discussion about kfuncs vs. helpers, but rather just to point out that I
> think we have to try hard to avoid adding special-case logic that
> requires looking into the verifier to understand the semantics. I think
> we're on the same page about this, based on this and your other
> response.
> 

In another thread you also mentioned that hypothetical "kfunc writer" persona
shouldn't have to understand kfunc flags in order to add their simple kfunc, and
I think your comments here are also presupposing a "kfunc writer" persona that
doesn't look at the verifier. Having such a person able to add kfuncs without
understanding the verifier is a good goal, but doesn't reflect current
reality when the kfunc needs to have any special semantics.

Regardless, I'd expect that anyone adding further new-style Graph
datastructures, old-style maps, or new datastructures unrelated to either,
will be closer to "verifier expert" than "random person adding a few kfuncs".

>> Here we're paving the way for graph (aka new gen data structs)
>> and so far not only kfuncs, but their arg types have to have
>> special handling inside the verifier.
>> There is not much yet to generalize and expose as generic KF_
>> flag or as a name suffix.
>> Therefore I think it's more appropriate to implement them
>> with minimal verifier changes and minimal complexity.
> 
> Agreed
> 

'Generalize' was addressed in Patch 2's thread.

>> There is no 3rd graph algorithm on the horizon after link list
>> and rbtree. Instead there is a big todo list for
>> 'multi owner graph node' and 'bpf_refcount_t'.
> 
> In this case my point in [0] of the only option for generalizing being
> to have something like KF_GRAPH_INSERT / KF_GRAPH_REMOVE is just not the
> way forward (which I also said was my opinion when I pointed it out as
> an option). Let's just special-case these kfuncs. There's already a
> precedence for doing that in the verifier anyways. Minimal complexity,
> minimal API changes. It's a win-win.
> 
> [0]: https://lore.kernel.org/all/Y63GLqZil9l1NzY4@maniforge.lan/
> 

There's certainly precedent for adding special-case "kfunc_id == KFUNC_whatever"
all over the verifier. It's a bad precedent, though, for reasons discussed in
[0].

To specifically address your points here, I don't buy the argument that
special-casing based on func id is "minimal complexity, minimal API changes".
Re: 'complexity': the logic implementing the complicated semantic will be
added regardless, it just won't have a name that's easily referenced in docs
and mailing list discussions.

Similarly, re: 'API changes': if by 'API' here you mean "API that's exposed
to folks adding kfuncs" - see my comments about "kfunc writer" persona above.
We can think of the verifier itself as an API too - with a single bpf_check
function. That API's behavior is indeed changed here, regardless of whether
the added semantics are gated by a kfunc flag or special-case checks. I don't
think that hiding complexity behind special-case checks when there could be
a named flag simplifies anything. The complexity is added regardless, question
is how many breadcrumbs and pointers we want to leave for folks trying to make
sense of it in the future.

  [0]: https://lore.kernel.org/bpf/9763aed7-0284-e400-b4dc-ed01718d8e1e@meta.com/

>> Those will require bigger changes in the verifier,
>> so I'd like to avoid premature generalization :) as analogous
>> to premature optimization :)
> 
> And of course given my points above and in other threads: agreed. I
> think we have an ideal middle-ground for minimizing complexity in the
> short term, and some nice follow-on todo-list items to work on in the
> medium-long term which will continue to improve things without
> (negatively) affecting users in any way. All SGTM

  reply	other threads:[~2023-01-17 17:32 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-17  8:24 [PATCH v2 bpf-next 00/13] BPF rbtree next-gen datastructure Dave Marchevsky
2022-12-17  8:24 ` [PATCH v2 bpf-next 01/13] bpf: Support multiple arg regs w/ ref_obj_id for kfuncs Dave Marchevsky
2022-12-29  3:24   ` Alexei Starovoitov
2022-12-29  6:40   ` David Vernet
2022-12-29 16:50     ` Alexei Starovoitov
2022-12-29 17:00       ` David Vernet
2023-01-17 17:26         ` Dave Marchevsky [this message]
2023-01-17 17:36           ` Alexei Starovoitov
2023-01-17 23:12             ` Dave Marchevsky
2023-01-20  5:13           ` David Vernet
2022-12-17  8:24 ` [PATCH v2 bpf-next 02/13] bpf: Migrate release_on_unlock logic to non-owning ref semantics Dave Marchevsky
2022-12-17  9:21   ` Dave Marchevsky
2022-12-28 23:46   ` David Vernet
2022-12-29 15:39     ` David Vernet
2022-12-29  3:56   ` Alexei Starovoitov
2022-12-29 16:54     ` David Vernet
2023-01-17 16:54       ` Dave Marchevsky
2023-01-17 16:07     ` Dave Marchevsky
2023-01-17 16:56       ` Alexei Starovoitov
2022-12-17  8:24 ` [PATCH v2 bpf-next 03/13] selftests/bpf: Update linked_list tests for " Dave Marchevsky
2022-12-17  8:24 ` [PATCH v2 bpf-next 04/13] bpf: rename list_head -> graph_root in field info types Dave Marchevsky
2022-12-17  8:24 ` [PATCH v2 bpf-next 05/13] bpf: Add basic bpf_rb_{root,node} support Dave Marchevsky
2022-12-17  8:24 ` [PATCH v2 bpf-next 06/13] bpf: Add bpf_rbtree_{add,remove,first} kfuncs Dave Marchevsky
2022-12-17  8:25 ` [PATCH v2 bpf-next 07/13] bpf: Add support for bpf_rb_root and bpf_rb_node in kfunc args Dave Marchevsky
2022-12-29  4:00   ` Alexei Starovoitov
2022-12-17  8:25 ` [PATCH v2 bpf-next 08/13] bpf: Add callback validation to kfunc verifier logic Dave Marchevsky
2022-12-17  8:25 ` [PATCH v2 bpf-next 09/13] bpf: Special verifier handling for bpf_rbtree_{remove, first} Dave Marchevsky
2022-12-29  4:02   ` Alexei Starovoitov
2022-12-17  8:25 ` [PATCH v2 bpf-next 10/13] bpf: Add bpf_rbtree_{add,remove,first} decls to bpf_experimental.h Dave Marchevsky
2022-12-17  8:25 ` [PATCH v2 bpf-next 11/13] libbpf: Make BTF mandatory if program BTF has spin_lock or alloc_obj type Dave Marchevsky
2022-12-22 18:50   ` Andrii Nakryiko
2022-12-17  8:25 ` [PATCH v2 bpf-next 12/13] selftests/bpf: Add rbtree selftests Dave Marchevsky
2022-12-17  8:25 ` [PATCH v2 bpf-next 13/13] bpf, documentation: Add graph documentation for non-owning refs Dave Marchevsky
2022-12-28 21:26   ` David Vernet
2023-01-18  2:16     ` Dave Marchevsky
2023-01-20  4:45       ` David Vernet
2022-12-17 10:23 [PATCH v2 bpf-next 02/13] bpf: Migrate release_on_unlock logic to non-owning ref semantics kernel test robot
2022-12-23 10:51 ` Dan Carpenter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=afcae0f4-97a0-b06e-0a4e-7955ca7dbc7c@meta.com \
    --to=davemarchevsky@meta.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davemarchevsky@fb.com \
    --cc=kernel-team@fb.com \
    --cc=memxor@gmail.com \
    --cc=tj@kernel.org \
    --cc=void@manifault.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.