From: Alexei Starovoitov <ast@fb.com>
To: Dave Marchevsky <davemarchevsky@fb.com>, bpf@vger.kernel.org
Cc: Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andrii@kernel.org>,
Kernel Team <kernel-team@fb.com>, Tejun Heo <tj@kernel.org>
Subject: Re: [RFC PATCH bpf-next 09/11] bpf: Add CONDITIONAL_RELEASE type flag
Date: Mon, 1 Aug 2022 15:23:29 -0700
Message-ID: <8f4c3e52-ce86-cbfc-5e76-884596ec11d7@fb.com>
In-Reply-To: <20220722183438.3319790-10-davemarchevsky@fb.com>
On 7/22/22 11:34 AM, Dave Marchevsky wrote:
> Currently, if a helper proto has an arg with the OBJ_RELEASE flag, the
> verifier assumes the 'release' logic in the helper will always succeed,
> and therefore always clears the arg reg and any other regs with the same
> ref_obj_id, and releases the reference. This poses a problem for
> 'release' helpers which may not always succeed.
>
> For example, bpf_rbtree_add will fail to add the passed-in node to a
> bpf_rbtree if the rbtree's lock is not held when the helper is called.
> In this case the helper returns NULL and the calling bpf program must
> release the node in another way before terminating to avoid leaking
> memory. An example of such logic:
>
> 1 struct node_data *node = bpf_rbtree_alloc_node(&rbtree, ...);
> 2 struct node_data *ret = bpf_rbtree_add(&rbtree, node);
> 3 if (!ret) {
> 4 bpf_rbtree_free_node(node);
> 5 return 0;
> 6 }
> 7 bpf_trace_printk("%d\n", ret->id);
>
> However, current verifier logic does not allow this: after line 2, the
> ref_obj_id of the reg holding 'node' (BPF_REG_2) will be released via
> the release_reference function, which will mark BPF_REG_2 and any other
> reg with the same ref_obj_id as unknown. As a result, neither ret nor
> node will point to anything useful if line 3's check passes.
> Additionally, since the reference is unconditionally released, the
> program would pass the verifier without the null check.
>
> This patch adds 'conditional release' semantics so that the verifier can
> handle the above example correctly. The CONDITIONAL_RELEASE type flag
> works in concert with the existing OBJ_RELEASE flag - the latter is used
> to tag an argument, while the new type flag is used to tag return type.
>
> If a helper has an OBJ_RELEASE arg type and a CONDITIONAL_RELEASE return
> type, the helper is considered to use its return value to indicate
> success or failure of the release operation: NULL is returned if the
> release fails, non-NULL otherwise.
>
> For my concrete use case - bpf_rbtree_add - CONDITIONAL_RELEASE works in
> concert with OBJ_NON_OWNING_REF: a successful release results in a
> non-owning reference being returned, allowing line 7 in the above example.
>
> Instead of unconditionally releasing the OBJ_RELEASE reference when
> doing check_helper_call, for CONDITIONAL_RELEASE helpers the verifier
> will wait until the return value is checked for NULL.
>
> If not NULL: the reference is released.
>
> If NULL: no reference is released. Since other regs w/ the same
>          ref_obj_id were not marked unknown by check_helper_call, they
>          can be used to release the reference via other means (line 4
>          above).
>
> It's necessary to prevent conditionally-released ref_obj_id regs from
> being used between the release helper and null check. For example:
>
> 1 struct node_data *node = bpf_rbtree_alloc_node(&rbtree, ...);
> 2 struct node_data *ret = bpf_rbtree_add(&rbtree, node);
> 3 do_something_with_a_node(node);
> 4 if (!ret) {
> 5 bpf_rbtree_free_node(node);
> 6 return 0;
> 7 }
>
> Line 3 shouldn't be allowed since node may have been released. The
> verifier tags all regs with ref_obj_id of the conditionally-released arg
> in the period between the helper call and null check for this reason.
>
> Why no matching CONDITIONAL_ACQUIRE type flag? Existing verifier logic
> already treats acquire of an _OR_NULL type as a conditional acquire.
> Consider this code:
>
> 1 struct thing *i = acquire_helper_that_returns_thing_or_null();
> 2 if (!i)
> 3 return 0;
> 4 manipulate_thing(i);
> 5 release_thing(i);
>
> After line 1, BPF_REG_0 will have an _OR_NULL type and a ref_obj_id set.
> When the verifier sees line 2's conditional jump, existing logic in
> mark_ptr_or_null_regs, specifically the if:
>
> if (ref_obj_id && ref_obj_id == id && is_null)
> /* regs[regno] is in the " == NULL" branch.
> * No one could have freed the reference state before
> * doing the NULL check.
> */
> WARN_ON_ONCE(release_reference_state(state, id));
>
> will release the reference in the is_null state.
>
> [ TODO: Either remove the WARN_ON_ONCE there without adding a
> CONDITIONAL_ACQUIRE flag, or add the flag and skip the WARN_ON_ONCE
> when it's set. Left out of this first pass for simplicity's sake. ]
>
> Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
> ---
> include/linux/bpf.h | 3 +
> include/linux/bpf_verifier.h | 1 +
> kernel/bpf/rbtree.c | 2 +-
> kernel/bpf/verifier.c | 122 +++++++++++++++++++++++++++++++----
> 4 files changed, 113 insertions(+), 15 deletions(-)
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index c9c4b4fb019c..a601ab30a2b1 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -413,6 +413,9 @@ enum bpf_type_flag {
> MEM_FIXED_SIZE = BIT(10 + BPF_BASE_TYPE_BITS),
>
> OBJ_NON_OWNING_REF = BIT(11 + BPF_BASE_TYPE_BITS),
> +
> + CONDITIONAL_RELEASE = BIT(12 + BPF_BASE_TYPE_BITS),
> +
> __BPF_TYPE_FLAG_MAX,
> __BPF_TYPE_LAST_FLAG = __BPF_TYPE_FLAG_MAX - 1,
> };
> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> index 9c017575c034..bdc8c48c2343 100644
> --- a/include/linux/bpf_verifier.h
> +++ b/include/linux/bpf_verifier.h
> @@ -313,6 +313,7 @@ struct bpf_verifier_state {
> u32 insn_idx;
> u32 curframe;
> u32 active_spin_lock;
> + u32 active_cond_ref_obj_id;
> bool speculative;
>
> /* first and last insn idx of this verifier state */
> diff --git a/kernel/bpf/rbtree.c b/kernel/bpf/rbtree.c
> index 34864fc83209..dcf8f69d4ada 100644
> --- a/kernel/bpf/rbtree.c
> +++ b/kernel/bpf/rbtree.c
> @@ -111,7 +111,7 @@ BPF_CALL_3(bpf_rbtree_add, struct bpf_map *, map, void *, value, void *, cb)
> const struct bpf_func_proto bpf_rbtree_add_proto = {
> .func = bpf_rbtree_add,
> .gpl_only = true,
> - .ret_type = RET_PTR_TO_BTF_ID_OR_NULL | OBJ_NON_OWNING_REF,
> + .ret_type = RET_PTR_TO_BTF_ID_OR_NULL | OBJ_NON_OWNING_REF | CONDITIONAL_RELEASE,
> .arg1_type = ARG_CONST_MAP_PTR,
> .arg2_type = ARG_PTR_TO_BTF_ID | OBJ_RELEASE,
> .arg2_btf_id = &bpf_rbtree_btf_ids[0],
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 4f46b2dfbc4b..f80e161170de 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -472,6 +472,11 @@ static bool type_is_non_owning_ref(u32 type)
> return type & OBJ_NON_OWNING_REF;
> }
>
> +static bool type_is_cond_release(u32 type)
> +{
> + return type & CONDITIONAL_RELEASE;
> +}
> +
> static bool type_may_be_null(u32 type)
> {
> return type & PTR_MAYBE_NULL;
> @@ -600,6 +605,15 @@ static const char *reg_type_str(struct bpf_verifier_env *env,
> postfix_idx += strlcpy(postfix + postfix_idx, "_non_own", 32 - postfix_idx);
> }
>
> + if (type_is_cond_release(type)) {
> + if (base_type(type) == PTR_TO_BTF_ID)
> + postfix_idx += strlcpy(postfix + postfix_idx, "cond_rel_",
> + 32 - postfix_idx);
> + else
> + postfix_idx += strlcpy(postfix + postfix_idx, "_cond_rel",
> + 32 - postfix_idx);
> + }
> +
> if (type & MEM_RDONLY)
> strncpy(prefix, "rdonly_", 32);
> if (type & MEM_ALLOC)
> @@ -1211,6 +1225,7 @@ static int copy_verifier_state(struct bpf_verifier_state *dst_state,
> dst_state->speculative = src->speculative;
> dst_state->curframe = src->curframe;
> dst_state->active_spin_lock = src->active_spin_lock;
> + dst_state->active_cond_ref_obj_id = src->active_cond_ref_obj_id;
> dst_state->branches = src->branches;
> dst_state->parent = src->parent;
> dst_state->first_insn_idx = src->first_insn_idx;
> @@ -1418,6 +1433,7 @@ static void mark_ptr_not_null_reg(struct bpf_reg_state *reg)
> return;
> }
>
> + reg->type &= ~CONDITIONAL_RELEASE;
> reg->type &= ~PTR_MAYBE_NULL;
> }
>
> @@ -6635,24 +6651,78 @@ static void release_reg_references(struct bpf_verifier_env *env,
> }
> }
>
> +static int __release_reference(struct bpf_verifier_env *env, struct bpf_verifier_state *vstate,
> + int ref_obj_id)
> +{
> + int err;
> + int i;
> +
> + err = release_reference_state(vstate->frame[vstate->curframe], ref_obj_id);
> + if (err)
> + return err;
> +
> + for (i = 0; i <= vstate->curframe; i++)
> + release_reg_references(env, vstate->frame[i], ref_obj_id);
> + return 0;
> +}
> +
> /* The pointer with the specified id has released its reference to kernel
> * resources. Identify all copies of the same pointer and clear the reference.
> */
> static int release_reference(struct bpf_verifier_env *env,
> int ref_obj_id)
> {
> - struct bpf_verifier_state *vstate = env->cur_state;
> - int err;
> + return __release_reference(env, env->cur_state, ref_obj_id);
> +}
> +
> +static void tag_reference_cond_release_regs(struct bpf_verifier_env *env,
> + struct bpf_func_state *state,
> + int ref_obj_id,
> + bool remove)
> +{
> + struct bpf_reg_state *regs = state->regs, *reg;
> int i;
>
> - err = release_reference_state(cur_func(env), ref_obj_id);
> - if (err)
> - return err;
> + for (i = 0; i < MAX_BPF_REG; i++)
> + if (regs[i].ref_obj_id == ref_obj_id) {
> + if (remove)
> + regs[i].type &= ~CONDITIONAL_RELEASE;
> + else
> + regs[i].type |= CONDITIONAL_RELEASE;
> + }
> +
> + bpf_for_each_spilled_reg(i, state, reg) {
> + if (!reg)
> + continue;
> + if (reg->ref_obj_id == ref_obj_id) {
> + if (remove)
> + reg->type &= ~CONDITIONAL_RELEASE;
> + else
> + reg->type |= CONDITIONAL_RELEASE;
> + }
> + }
> +}
> +
> +static void tag_reference_cond_release(struct bpf_verifier_env *env,
> + int ref_obj_id)
> +{
> + struct bpf_verifier_state *vstate = env->cur_state;
> + int i;
>
> for (i = 0; i <= vstate->curframe; i++)
> - release_reg_references(env, vstate->frame[i], ref_obj_id);
> + tag_reference_cond_release_regs(env, vstate->frame[i],
> + ref_obj_id, false);
> +}
>
> - return 0;
> +static void untag_reference_cond_release(struct bpf_verifier_env *env,
> + struct bpf_verifier_state *vstate,
> + int ref_obj_id)
> +{
> + int i;
> +
> + for (i = 0; i <= vstate->curframe; i++)
> + tag_reference_cond_release_regs(env, vstate->frame[i],
> + ref_obj_id, true);
> }
>
> static void clear_non_owning_ref_regs(struct bpf_verifier_env *env,
> @@ -7406,7 +7476,17 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
> if (arg_type_is_dynptr(fn->arg_type[meta.release_regno - BPF_REG_1]))
> err = unmark_stack_slots_dynptr(env, &regs[meta.release_regno]);
> else if (meta.ref_obj_id)
> - err = release_reference(env, meta.ref_obj_id);
> + if (type_is_cond_release(fn->ret_type)) {
> + if (env->cur_state->active_cond_ref_obj_id) {
> + verbose(env, "can't handle >1 cond_release\n");
> + return err;
> + }
> + env->cur_state->active_cond_ref_obj_id = meta.ref_obj_id;
> + tag_reference_cond_release(env, meta.ref_obj_id);
> + err = 0;
> + } else {
> + err = release_reference(env, meta.ref_obj_id);
> + }
> /* meta.ref_obj_id can only be 0 if register that is meant to be
> * released is NULL, which must be > R0.
> */
> @@ -10040,8 +10120,8 @@ static void __mark_ptr_or_null_regs(struct bpf_func_state *state, u32 id,
> /* The logic is similar to find_good_pkt_pointers(), both could eventually
> * be folded together at some point.
> */
> -static void mark_ptr_or_null_regs(struct bpf_verifier_state *vstate, u32 regno,
> - bool is_null)
> +static int mark_ptr_or_null_regs(struct bpf_verifier_state *vstate, u32 regno,
> + bool is_null, struct bpf_verifier_env *env)
> {
> struct bpf_func_state *state = vstate->frame[vstate->curframe];
> struct bpf_reg_state *regs = state->regs;
> @@ -10056,8 +10136,19 @@ static void mark_ptr_or_null_regs(struct bpf_verifier_state *vstate, u32 regno,
> */
> WARN_ON_ONCE(release_reference_state(state, id));
>
> + if (type_is_cond_release(regs[regno].type)) {
> + if (!is_null) {
> + __release_reference(env, vstate, vstate->active_cond_ref_obj_id);
> + vstate->active_cond_ref_obj_id = 0;
> + } else {
> + untag_reference_cond_release(env, vstate, vstate->active_cond_ref_obj_id);
> + vstate->active_cond_ref_obj_id = 0;
> + }
> + }
> for (i = 0; i <= vstate->curframe; i++)
> __mark_ptr_or_null_regs(vstate->frame[i], id, is_null);
> +
> + return 0;
> }
>
> static bool try_match_pkt_pointers(const struct bpf_insn *insn,
> @@ -10365,10 +10456,10 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
> /* Mark all identical registers in each branch as either
> * safe or unknown depending R == 0 or R != 0 conditional.
> */
> - mark_ptr_or_null_regs(this_branch, insn->dst_reg,
> - opcode == BPF_JNE);
> - mark_ptr_or_null_regs(other_branch, insn->dst_reg,
> - opcode == BPF_JEQ);
> + err = mark_ptr_or_null_regs(this_branch, insn->dst_reg,
> + opcode == BPF_JNE, env);
> + err = mark_ptr_or_null_regs(other_branch, insn->dst_reg,
> + opcode == BPF_JEQ, env);
CONDITIONAL_RELEASE concept doesn't look too horrible :)
I couldn't come up with anything better.
But what's the point of returning 0 from mark_ptr_or_null_regs()?