bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Alexei Starovoitov <ast@fb.com>
To: Dave Marchevsky <davemarchevsky@fb.com>, bpf@vger.kernel.org
Cc: Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	Kernel Team <kernel-team@fb.com>, Tejun Heo <tj@kernel.org>
Subject: Re: [RFC PATCH bpf-next 08/11] bpf: Add OBJ_NON_OWNING_REF type flag
Date: Mon, 1 Aug 2022 15:41:18 -0700	[thread overview]
Message-ID: <e08116f0-8dbe-0d9f-ca2b-3f4c27d289b6@fb.com> (raw)
In-Reply-To: <20220722183438.3319790-9-davemarchevsky@fb.com>

On 7/22/22 11:34 AM, Dave Marchevsky wrote:
> Consider a pointer to a type that would normally need acquire / release
> semantics to be safely held. There may be scenarios where such a pointer
> can be safely held without the need to acquire a reference.
> 
> For example, although a PTR_TO_BTF_ID for a rbtree_map node is released
> via bpf_rbtree_add helper, the helper doesn't change the address of the
> node and must be called with the rbtree_map's spinlock held. Since the
> only way to remove a node from the rbtree - bpf_rbtree_remove helper -
> requires the same lock, the newly-added node cannot be removed by a
> concurrently-running program until the lock is released. Therefore it is
> safe to hold a reference to this node until bpf_rbtree_unlock is called.
> 
> This patch introduces a new type flag and associated verifier logic to
> handle such "non-owning" references.
> 
> Currently the only usecase I have is the rbtree example above, so the
> verifier logic is straightforward:
>    * Tag return types of bpf_rbtree_{add,find} with OBJ_NON_OWNING_REF
>      * These both require the rbtree lock to be held to return anything
>      non-NULL
>      * Since ret type for both is PTR_TO_BTF_ID_OR_NULL, if lock is not
>      held and NULL is returned, existing mark_ptr_or_null_reg logic
>      will clear reg type.
>      * So if mark_ptr_or_null_reg logic turns the returned reg into a
>      PTR_TO_BTF_ID | OBJ_NON_OWNING_REF, verifier knows lock is held.
> 
>    * When the lock is released the verifier invalidates any regs holding
>    non owning refs similarly to existing release_reference logic - but no
>    need to clear ref_obj_id as an 'owning' reference was never acquired.
> 
> [ TODO: Currently the invalidation logic in
> clear_rbtree_node_non_owning_refs is not parametrized by map so
> unlocking any rbtree lock will invalidate all non-owning refs ]

probably should be parametrized by 'lock' instead of by 'map'.

> Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
> ---
>   include/linux/bpf.h   |  1 +
>   kernel/bpf/rbtree.c   |  4 +--
>   kernel/bpf/verifier.c | 63 +++++++++++++++++++++++++++++++++++++++----
>   3 files changed, 61 insertions(+), 7 deletions(-)
> 
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index eb8c550db0e2..c9c4b4fb019c 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -412,6 +412,7 @@ enum bpf_type_flag {
>   	/* Size is known at compile time. */
>   	MEM_FIXED_SIZE		= BIT(10 + BPF_BASE_TYPE_BITS),
>   
> +	OBJ_NON_OWNING_REF	= BIT(11 + BPF_BASE_TYPE_BITS),
>   	__BPF_TYPE_FLAG_MAX,
>   	__BPF_TYPE_LAST_FLAG	= __BPF_TYPE_FLAG_MAX - 1,
>   };
> diff --git a/kernel/bpf/rbtree.c b/kernel/bpf/rbtree.c
> index 5b1ab73e164f..34864fc83209 100644
> --- a/kernel/bpf/rbtree.c
> +++ b/kernel/bpf/rbtree.c
> @@ -111,7 +111,7 @@ BPF_CALL_3(bpf_rbtree_add, struct bpf_map *, map, void *, value, void *, cb)
>   const struct bpf_func_proto bpf_rbtree_add_proto = {
>   	.func = bpf_rbtree_add,
>   	.gpl_only = true,
> -	.ret_type = RET_PTR_TO_BTF_ID_OR_NULL,
> +	.ret_type = RET_PTR_TO_BTF_ID_OR_NULL | OBJ_NON_OWNING_REF,
>   	.arg1_type = ARG_CONST_MAP_PTR,
>   	.arg2_type = ARG_PTR_TO_BTF_ID | OBJ_RELEASE,
>   	.arg2_btf_id = &bpf_rbtree_btf_ids[0],
> @@ -133,7 +133,7 @@ BPF_CALL_3(bpf_rbtree_find, struct bpf_map *, map, void *, key, void *, cb)
>   const struct bpf_func_proto bpf_rbtree_find_proto = {
>   	.func = bpf_rbtree_find,
>   	.gpl_only = true,
> -	.ret_type = RET_PTR_TO_BTF_ID_OR_NULL,
> +	.ret_type = RET_PTR_TO_BTF_ID_OR_NULL | OBJ_NON_OWNING_REF,
>   	.ret_btf_id = &bpf_rbtree_btf_ids[0],
>   	.arg1_type = ARG_CONST_MAP_PTR,
>   	.arg2_type = ARG_ANYTHING,

To prevent race the bpf_rbtree_find/add feels not enough.
bpf_rbtree_alloc_node probably needs it too?
Otherwise after bpf_rbtree_unlock the stashed pointer from alloc
will still be accessible?
May be it's actually ok to access by the prog?
Due to the way preallocated maps work the elements can be use-after-free
within the same map. It's similar to SLAB_TYPESAFE_BY_RCU.
The elements won't be released into global kernel memory while progs are 
looking at them.
It seems to me it's ok to do away without OBJ_NON_OWNING_REF concept.
The prog might have pointers to rbtree elements and they will be
accessible by the prog even after bpf_rbtree_unlock().
This access will be racy, but it's safe.
So just drop this patch?

  reply	other threads:[~2022-08-01 22:41 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-07-22 18:34 [RFC PATCH bpf-next 00/11] bpf: Introduce rbtree map Dave Marchevsky
2022-07-22 18:34 ` [RFC PATCH bpf-next 01/11] bpf: Pull repeated reg access bounds check into helper fn Dave Marchevsky
2022-07-22 18:34 ` [RFC PATCH bpf-next 02/11] bpf: Add verifier support for custom callback return range Dave Marchevsky
2022-07-22 18:34 ` [RFC PATCH bpf-next 03/11] bpf: Add rb_node_off to bpf_map Dave Marchevsky
2022-08-01 22:19   ` Alexei Starovoitov
2022-07-22 18:34 ` [RFC PATCH bpf-next 04/11] bpf: Add rbtree map Dave Marchevsky
2022-08-01 21:49   ` Alexei Starovoitov
2022-07-22 18:34 ` [RFC PATCH bpf-next 05/11] bpf: Add bpf_spin_lock member to rbtree Dave Marchevsky
2022-08-01 22:17   ` Alexei Starovoitov
2022-08-02 13:59     ` Kumar Kartikeya Dwivedi
2022-08-02 15:30       ` Alexei Starovoitov
2022-08-10 21:46     ` Kumar Kartikeya Dwivedi
2022-08-10 22:06       ` Alexei Starovoitov
2022-08-10 23:16         ` Kumar Kartikeya Dwivedi
2022-08-15  5:33       ` Yonghong Song
2022-08-15  5:37         ` Kumar Kartikeya Dwivedi
2022-07-22 18:34 ` [RFC PATCH bpf-next 06/11] bpf: Add bpf_rbtree_{lock,unlock} helpers Dave Marchevsky
2022-08-01 21:58   ` Alexei Starovoitov
2022-07-22 18:34 ` [RFC PATCH bpf-next 07/11] bpf: Enforce spinlock hold for bpf_rbtree_{add,remove,find} Dave Marchevsky
2022-07-22 18:34 ` [RFC PATCH bpf-next 08/11] bpf: Add OBJ_NON_OWNING_REF type flag Dave Marchevsky
2022-08-01 22:41   ` Alexei Starovoitov [this message]
2022-07-22 18:34 ` [RFC PATCH bpf-next 09/11] bpf: Add CONDITIONAL_RELEASE " Dave Marchevsky
2022-08-01 22:23   ` Alexei Starovoitov
2022-07-22 18:34 ` [RFC PATCH bpf-next 10/11] bpf: Introduce PTR_ITER and PTR_ITER_END type flags Dave Marchevsky
2022-07-29 16:31   ` Tejun Heo
2022-08-01 22:44   ` Alexei Starovoitov
2022-08-02 13:05     ` Kumar Kartikeya Dwivedi
2022-08-02 15:10       ` Alexei Starovoitov
2022-08-10 17:56     ` Dave Marchevsky
2022-07-22 18:34 ` [RFC PATCH bpf-next 11/11] selftests/bpf: Add rbtree map tests Dave Marchevsky
2022-07-28  7:18   ` Yonghong Song
2022-08-10 17:48     ` Dave Marchevsky
2022-07-28  7:04 ` [RFC PATCH bpf-next 00/11] bpf: Introduce rbtree map Yonghong Song
2022-08-10 17:54   ` Dave Marchevsky
2022-08-01 21:27 ` Alexei Starovoitov
2022-08-10 18:11   ` Dave Marchevsky
2022-08-02 22:02 ` Andrii Nakryiko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e08116f0-8dbe-0d9f-ca2b-3f4c27d289b6@fb.com \
    --to=ast@fb.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davemarchevsky@fb.com \
    --cc=kernel-team@fb.com \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).