From: Alexei Starovoitov <ast@fb.com>
To: Dave Marchevsky <davemarchevsky@fb.com>, bpf@vger.kernel.org
Cc: Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	Kernel Team <kernel-team@fb.com>, Tejun Heo <tj@kernel.org>
Subject: Re: [RFC PATCH bpf-next 09/11] bpf: Add CONDITIONAL_RELEASE type flag
Date: Mon, 1 Aug 2022 15:23:29 -0700	[thread overview]
Message-ID: <8f4c3e52-ce86-cbfc-5e76-884596ec11d7@fb.com> (raw)
In-Reply-To: <20220722183438.3319790-10-davemarchevsky@fb.com>

On 7/22/22 11:34 AM, Dave Marchevsky wrote:
> Currently, if a helper proto has an arg with the OBJ_RELEASE flag, the
> verifier assumes the 'release' logic in the helper will always succeed.
> It therefore always clears the arg reg and any other regs with the same
> ref_obj_id, and releases the reference. This poses a problem for
> 'release' helpers which may not always succeed.
> 
> For example, bpf_rbtree_add will fail to add the passed-in node to a
> bpf_rbtree if the rbtree's lock is not held when the helper is called.
> In this case the helper returns NULL and the calling bpf program must
> release the node in another way before terminating to avoid leaking
> memory. An example of such logic:
> 
> 1  struct node_data *node = bpf_rbtree_alloc_node(&rbtree, ...);
> 2  struct node_data *ret = bpf_rbtree_add(&rbtree, node);
> 3  if (!ret) {
> 4    bpf_rbtree_free_node(node);
> 5    return 0;
> 6  }
> 7  bpf_trace_printk("%d\n", ret->id);
> 
> However, current verifier logic does not allow this: after line 2, the
> reference associated with the reg holding 'node' (BPF_REG_2) is released
> via the release_reference function, which marks BPF_REG_2 and any other
> reg with the same ref_obj_id as unknown. As a result, neither ret nor
> node will point to anything useful if line 3's check passes.
> Additionally, since the reference is unconditionally released, the
> program would pass the verifier even without the null check.
> 
> This patch adds 'conditional release' semantics so that the verifier can
> handle the above example correctly. The CONDITIONAL_RELEASE type flag
> works in concert with the existing OBJ_RELEASE flag - the latter is used
> to tag an argument, while the new type flag is used to tag return type.
> 
> If a helper has an OBJ_RELEASE arg type and a CONDITIONAL_RELEASE return
> type, the helper is considered to use its return value to indicate
> success or failure of the release operation: NULL is returned if the
> release fails, a non-NULL pointer otherwise.
> 
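> For illustration, a proto combining the two flags might look like the
> sketch below. This mirrors the bpf_rbtree_add_proto change further down
> in this patch; the 'example_release' helper and its exact arg layout are
> hypothetical.
> 
>   const struct bpf_func_proto example_release_proto = {
>   	.func		= example_release,
>   	.gpl_only	= true,
>   	/* NULL return: arg2's reference was NOT released by the helper.
>   	 * Non-NULL return: arg2's reference was released.
>   	 */
>   	.ret_type	= RET_PTR_TO_BTF_ID_OR_NULL | CONDITIONAL_RELEASE,
>   	.arg1_type	= ARG_CONST_MAP_PTR,
>   	.arg2_type	= ARG_PTR_TO_BTF_ID | OBJ_RELEASE,
>   };
> 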
> For my concrete use case - bpf_rbtree_add - CONDITIONAL_RELEASE works in
> concert with OBJ_NON_OWNING_REF: a successful release results in a
> non-owning reference being returned, allowing line 7 in the above
> example.
> 
> Instead of unconditionally releasing the OBJ_RELEASE reference in
> check_helper_call, for CONDITIONAL_RELEASE helpers the verifier will
> wait until the return value is checked for null.
> 
>    If not null: the reference is released.
> 
>    If null: no reference is released. Since other regs with the same
>             ref_obj_id were not marked unknown by check_helper_call, they
>             can be used to release the reference via other means (line 4
>             above).
> 
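> Condensed, the verifier-side flow added by this patch is roughly the
> sketch below (the real hunks are in check_helper_call() and
> mark_ptr_or_null_regs() further down; error handling is trimmed):
> 
>   /* check_helper_call(): defer the release for cond-release helpers */
>   if (type_is_cond_release(fn->ret_type)) {
>   	env->cur_state->active_cond_ref_obj_id = meta.ref_obj_id;
>   	tag_reference_cond_release(env, meta.ref_obj_id);
>   } else {
>   	err = release_reference(env, meta.ref_obj_id);
>   }
> 
>   /* mark_ptr_or_null_regs(): resolve the deferred release at the null check */
>   if (type_is_cond_release(regs[regno].type)) {
>   	if (!is_null)
>   		/* helper succeeded: the reference is dropped now */
>   		__release_reference(env, vstate, vstate->active_cond_ref_obj_id);
>   	else
>   		/* helper failed: keep the reference, only drop the tag */
>   		untag_reference_cond_release(env, vstate, vstate->active_cond_ref_obj_id);
>   	vstate->active_cond_ref_obj_id = 0;
>   }
> 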
> It's necessary to prevent conditionally-released ref_obj_id regs from
> being used between the release helper and null check. For example:
> 
> 1  struct node_data *node = bpf_rbtree_alloc_node(&rbtree, ...);
> 2  struct node_data *ret = bpf_rbtree_add(&rbtree, node);
> 3  do_something_with_a_node(node);
> 4  if (!ret) {
> 5    bpf_rbtree_free_node(node);
> 6    return 0;
> 7  }
> 
> Line 3 shouldn't be allowed since node may have been released. For this
> reason, the verifier tags all regs sharing the ref_obj_id of the
> conditionally-released arg in the window between the helper call and the
> null check.
> 
> Why no matching CONDITIONAL_ACQUIRE type flag? Existing verifier logic
> already treats acquire of an _OR_NULL type as a conditional acquire.
> Consider this code:
> 
> 1  struct thing *i = acquire_helper_that_returns_thing_or_null();
> 2  if (!i)
> 3    return 0;
> 4  manipulate_thing(i);
> 5  release_thing(i);
> 
> After line 1, BPF_REG_0 will have an _OR_NULL type and a ref_obj_id set.
> When the verifier sees line 2's conditional jump, existing logic in
> mark_ptr_or_null_regs, specifically the if:
> 
>    if (ref_obj_id && ref_obj_id == id && is_null)
>            /* regs[regno] is in the " == NULL" branch.
>             * No one could have freed the reference state before
>             * doing the NULL check.
>             */
>             WARN_ON_ONCE(release_reference_state(state, id));
> 
> will release the reference in the is_null state.
> 
> [ TODO: Either remove the WARN_ON_ONCE there without adding a
> CONDITIONAL_ACQUIRE flag, or add the flag and skip the WARN_ON_ONCE when
> it's set. Left out of this first pass for simplicity's sake. ]
> 
> Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
> ---
>   include/linux/bpf.h          |   3 +
>   include/linux/bpf_verifier.h |   1 +
>   kernel/bpf/rbtree.c          |   2 +-
>   kernel/bpf/verifier.c        | 122 +++++++++++++++++++++++++++++++----
>   4 files changed, 113 insertions(+), 15 deletions(-)
> 
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index c9c4b4fb019c..a601ab30a2b1 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -413,6 +413,9 @@ enum bpf_type_flag {
>   	MEM_FIXED_SIZE		= BIT(10 + BPF_BASE_TYPE_BITS),
>   
>   	OBJ_NON_OWNING_REF	= BIT(11 + BPF_BASE_TYPE_BITS),
> +
> +	CONDITIONAL_RELEASE	= BIT(12 + BPF_BASE_TYPE_BITS),
> +
>   	__BPF_TYPE_FLAG_MAX,
>   	__BPF_TYPE_LAST_FLAG	= __BPF_TYPE_FLAG_MAX - 1,
>   };
> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> index 9c017575c034..bdc8c48c2343 100644
> --- a/include/linux/bpf_verifier.h
> +++ b/include/linux/bpf_verifier.h
> @@ -313,6 +313,7 @@ struct bpf_verifier_state {
>   	u32 insn_idx;
>   	u32 curframe;
>   	u32 active_spin_lock;
> +	u32 active_cond_ref_obj_id;
>   	bool speculative;
>   
>   	/* first and last insn idx of this verifier state */
> diff --git a/kernel/bpf/rbtree.c b/kernel/bpf/rbtree.c
> index 34864fc83209..dcf8f69d4ada 100644
> --- a/kernel/bpf/rbtree.c
> +++ b/kernel/bpf/rbtree.c
> @@ -111,7 +111,7 @@ BPF_CALL_3(bpf_rbtree_add, struct bpf_map *, map, void *, value, void *, cb)
>   const struct bpf_func_proto bpf_rbtree_add_proto = {
>   	.func = bpf_rbtree_add,
>   	.gpl_only = true,
> -	.ret_type = RET_PTR_TO_BTF_ID_OR_NULL | OBJ_NON_OWNING_REF,
> +	.ret_type = RET_PTR_TO_BTF_ID_OR_NULL | OBJ_NON_OWNING_REF | CONDITIONAL_RELEASE,
>   	.arg1_type = ARG_CONST_MAP_PTR,
>   	.arg2_type = ARG_PTR_TO_BTF_ID | OBJ_RELEASE,
>   	.arg2_btf_id = &bpf_rbtree_btf_ids[0],
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 4f46b2dfbc4b..f80e161170de 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -472,6 +472,11 @@ static bool type_is_non_owning_ref(u32 type)
>   	return type & OBJ_NON_OWNING_REF;
>   }
>   
> +static bool type_is_cond_release(u32 type)
> +{
> +	return type & CONDITIONAL_RELEASE;
> +}
> +
>   static bool type_may_be_null(u32 type)
>   {
>   	return type & PTR_MAYBE_NULL;
> @@ -600,6 +605,15 @@ static const char *reg_type_str(struct bpf_verifier_env *env,
>   			postfix_idx += strlcpy(postfix + postfix_idx, "_non_own", 32 - postfix_idx);
>   	}
>   
> +	if (type_is_cond_release(type)) {
> +		if (base_type(type) == PTR_TO_BTF_ID)
> +			postfix_idx += strlcpy(postfix + postfix_idx, "cond_rel_",
> +					       32 - postfix_idx);
> +		else
> +			postfix_idx += strlcpy(postfix + postfix_idx, "_cond_rel",
> +					       32 - postfix_idx);
> +	}
> +
>   	if (type & MEM_RDONLY)
>   		strncpy(prefix, "rdonly_", 32);
>   	if (type & MEM_ALLOC)
> @@ -1211,6 +1225,7 @@ static int copy_verifier_state(struct bpf_verifier_state *dst_state,
>   	dst_state->speculative = src->speculative;
>   	dst_state->curframe = src->curframe;
>   	dst_state->active_spin_lock = src->active_spin_lock;
> +	dst_state->active_cond_ref_obj_id = src->active_cond_ref_obj_id;
>   	dst_state->branches = src->branches;
>   	dst_state->parent = src->parent;
>   	dst_state->first_insn_idx = src->first_insn_idx;
> @@ -1418,6 +1433,7 @@ static void mark_ptr_not_null_reg(struct bpf_reg_state *reg)
>   		return;
>   	}
>   
> +	reg->type &= ~CONDITIONAL_RELEASE;
>   	reg->type &= ~PTR_MAYBE_NULL;
>   }
>   
> @@ -6635,24 +6651,78 @@ static void release_reg_references(struct bpf_verifier_env *env,
>   	}
>   }
>   
> +static int __release_reference(struct bpf_verifier_env *env, struct bpf_verifier_state *vstate,
> +			       int ref_obj_id)
> +{
> +	int err;
> +	int i;
> +
> +	err = release_reference_state(vstate->frame[vstate->curframe], ref_obj_id);
> +	if (err)
> +		return err;
> +
> +	for (i = 0; i <= vstate->curframe; i++)
> +		release_reg_references(env, vstate->frame[i], ref_obj_id);
> +	return 0;
> +}
> +
>   /* The pointer with the specified id has released its reference to kernel
>    * resources. Identify all copies of the same pointer and clear the reference.
>    */
>   static int release_reference(struct bpf_verifier_env *env,
>   			     int ref_obj_id)
>   {
> -	struct bpf_verifier_state *vstate = env->cur_state;
> -	int err;
> +	return __release_reference(env, env->cur_state, ref_obj_id);
> +}
> +
> +static void tag_reference_cond_release_regs(struct bpf_verifier_env *env,
> +					    struct bpf_func_state *state,
> +					    int ref_obj_id,
> +					    bool remove)
> +{
> +	struct bpf_reg_state *regs = state->regs, *reg;
>   	int i;
>   
> -	err = release_reference_state(cur_func(env), ref_obj_id);
> -	if (err)
> -		return err;
> +	for (i = 0; i < MAX_BPF_REG; i++)
> +		if (regs[i].ref_obj_id == ref_obj_id) {
> +			if (remove)
> +				regs[i].type &= ~CONDITIONAL_RELEASE;
> +			else
> +				regs[i].type |= CONDITIONAL_RELEASE;
> +		}
> +
> +	bpf_for_each_spilled_reg(i, state, reg) {
> +		if (!reg)
> +			continue;
> +		if (reg->ref_obj_id == ref_obj_id) {
> +			if (remove)
> +				reg->type &= ~CONDITIONAL_RELEASE;
> +			else
> +				reg->type |= CONDITIONAL_RELEASE;
> +		}
> +	}
> +}
> +
> +static void tag_reference_cond_release(struct bpf_verifier_env *env,
> +				       int ref_obj_id)
> +{
> +	struct bpf_verifier_state *vstate = env->cur_state;
> +	int i;
>   
>   	for (i = 0; i <= vstate->curframe; i++)
> -		release_reg_references(env, vstate->frame[i], ref_obj_id);
> +		tag_reference_cond_release_regs(env, vstate->frame[i],
> +						ref_obj_id, false);
> +}
>   
> -	return 0;
> +static void untag_reference_cond_release(struct bpf_verifier_env *env,
> +					 struct bpf_verifier_state *vstate,
> +					 int ref_obj_id)
> +{
> +	int i;
> +
> +	for (i = 0; i <= vstate->curframe; i++)
> +		tag_reference_cond_release_regs(env, vstate->frame[i],
> +						ref_obj_id, true);
>   }
>   
>   static void clear_non_owning_ref_regs(struct bpf_verifier_env *env,
> @@ -7406,7 +7476,17 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
>   		if (arg_type_is_dynptr(fn->arg_type[meta.release_regno - BPF_REG_1]))
>   			err = unmark_stack_slots_dynptr(env, &regs[meta.release_regno]);
>   		else if (meta.ref_obj_id)
> -			err = release_reference(env, meta.ref_obj_id);
> +			if (type_is_cond_release(fn->ret_type)) {
> +				if (env->cur_state->active_cond_ref_obj_id) {
> +					verbose(env, "can't handle >1 cond_release\n");
> +					return err;
> +				}
> +				env->cur_state->active_cond_ref_obj_id = meta.ref_obj_id;
> +				tag_reference_cond_release(env, meta.ref_obj_id);
> +				err = 0;
> +			} else {
> +				err = release_reference(env, meta.ref_obj_id);
> +			}
>   		/* meta.ref_obj_id can only be 0 if register that is meant to be
>   		 * released is NULL, which must be > R0.
>   		 */
> @@ -10040,8 +10120,8 @@ static void __mark_ptr_or_null_regs(struct bpf_func_state *state, u32 id,
>   /* The logic is similar to find_good_pkt_pointers(), both could eventually
>    * be folded together at some point.
>    */
> -static void mark_ptr_or_null_regs(struct bpf_verifier_state *vstate, u32 regno,
> -				  bool is_null)
> +static int mark_ptr_or_null_regs(struct bpf_verifier_state *vstate, u32 regno,
> +				 bool is_null, struct bpf_verifier_env *env)
>   {
>   	struct bpf_func_state *state = vstate->frame[vstate->curframe];
>   	struct bpf_reg_state *regs = state->regs;
> @@ -10056,8 +10136,19 @@ static void mark_ptr_or_null_regs(struct bpf_verifier_state *vstate, u32 regno,
>   		 */
>   		WARN_ON_ONCE(release_reference_state(state, id));
>   
> +	if (type_is_cond_release(regs[regno].type)) {
> +		if (!is_null) {
> +			__release_reference(env, vstate, vstate->active_cond_ref_obj_id);
> +			vstate->active_cond_ref_obj_id = 0;
> +		} else {
> +			untag_reference_cond_release(env, vstate, vstate->active_cond_ref_obj_id);
> +			vstate->active_cond_ref_obj_id = 0;
> +		}
> +	}
>   	for (i = 0; i <= vstate->curframe; i++)
>   		__mark_ptr_or_null_regs(vstate->frame[i], id, is_null);
> +
> +	return 0;
>   }
>   
>   static bool try_match_pkt_pointers(const struct bpf_insn *insn,
> @@ -10365,10 +10456,10 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
>   		/* Mark all identical registers in each branch as either
>   		 * safe or unknown depending R == 0 or R != 0 conditional.
>   		 */
> -		mark_ptr_or_null_regs(this_branch, insn->dst_reg,
> -				      opcode == BPF_JNE);
> -		mark_ptr_or_null_regs(other_branch, insn->dst_reg,
> -				      opcode == BPF_JEQ);
> +		err = mark_ptr_or_null_regs(this_branch, insn->dst_reg,
> +					    opcode == BPF_JNE, env);
> +		err = mark_ptr_or_null_regs(other_branch, insn->dst_reg,
> +					    opcode == BPF_JEQ, env);

CONDITIONAL_RELEASE concept doesn't look too horrible :)
I couldn't come up with anything better.

But what's the point of returning 0 from mark_ptr_or_null_regs()?
