From: Kumar Kartikeya Dwivedi <memxor@gmail.com>
To: bpf@vger.kernel.org
Cc: Dave Marchevsky <davemarchevsky@fb.com>,
Alexei Starovoitov <ast@kernel.org>,
Andrii Nakryiko <andrii@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Martin KaFai Lau <martin.lau@kernel.org>,
Dave Marchevsky <davemarchevsky@meta.com>
Subject: [PATCH bpf-next v10 17/24] bpf: Add 'release on unlock' logic for bpf_list_push_{front,back}
Date: Fri, 18 Nov 2022 07:26:07 +0530 [thread overview]
Message-ID: <20221118015614.2013203-18-memxor@gmail.com> (raw)
In-Reply-To: <20221118015614.2013203-1-memxor@gmail.com>
This commit implements the delayed release logic for bpf_list_push_front
and bpf_list_push_back.
Once a node has been added to the list, it's pointer changes to
PTR_UNTRUSTED. However, it is only released once the lock protecting the
list is unlocked. For such PTR_TO_BTF_ID | MEM_ALLOC with PTR_UNTRUSTED
set but an active ref_obj_id, it is still permitted to read them as long
as the lock is held. Writing to them is not allowed.
This allows having read access to push items we no longer own until we
release the lock guarding the list, allowing a little more flexibility
when working with these APIs.
Note that enabling write support has fairly tricky interactions with
what happens inside the critical section. Just as an example, currently,
bpf_obj_drop is not permitted, but if it were, being able to write to
the PTR_UNTRUSTED pointer while the object gets released back to the
memory allocator would violate safety properties we wish to guarantee
(i.e. not crashing the kernel). The memory could be reused for a
different type in the BPF program or even in the kernel as it gets
eventually kfree'd.
Not enabling bpf_obj_drop inside the critical section would appear to
prevent all of the above, but that is more of an artifical limitation
right now. Since the write support is tangled with how we handle
potential aliasing of nodes inside the critical section that may or may
not be part of the list anymore, it has been deferred to a future patch.
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
include/linux/bpf_verifier.h | 5 ++++
kernel/bpf/verifier.c | 58 +++++++++++++++++++++++++++++++++++-
2 files changed, 62 insertions(+), 1 deletion(-)
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 3dc72d396dfc..23f30c685f28 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -223,6 +223,11 @@ struct bpf_reference_state {
* exiting a callback function.
*/
int callback_ref;
+ /* Mark the reference state to release the registers sharing the same id
+ * on bpf_spin_unlock (for nodes that we will lose ownership to but are
+ * safe to access inside the critical section).
+ */
+ bool release_on_unlock;
};
/* state of the program:
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 00d3122086c2..7f10b21f2dfc 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -5652,7 +5652,9 @@ static int process_spin_lock(struct bpf_verifier_env *env, int regno,
cur->active_lock.ptr = btf;
cur->active_lock.id = reg->id;
} else {
+ struct bpf_func_state *fstate = cur_func(env);
void *ptr;
+ int i;
if (map)
ptr = map;
@@ -5670,6 +5672,23 @@ static int process_spin_lock(struct bpf_verifier_env *env, int regno,
}
cur->active_lock.ptr = NULL;
cur->active_lock.id = 0;
+
+ for (i = 0; i < fstate->acquired_refs; i++) {
+ int err;
+
+ /* Complain on error because this reference state cannot
+ * be freed before this point, as bpf_spin_lock critical
+ * section does not allow functions that release the
+ * allocated object immediately.
+ */
+ if (!fstate->refs[i].release_on_unlock)
+ continue;
+ err = release_reference(env, fstate->refs[i].id);
+ if (err) {
+ verbose(env, "failed to release release_on_unlock reference");
+ return err;
+ }
+ }
}
return 0;
}
@@ -8260,6 +8279,42 @@ static int process_kf_arg_ptr_to_kptr_strong(struct bpf_verifier_env *env,
return 0;
}
+static int ref_set_release_on_unlock(struct bpf_verifier_env *env, u32 ref_obj_id)
+{
+ struct bpf_func_state *state = cur_func(env);
+ struct bpf_reg_state *reg;
+ int i;
+
+ /* bpf_spin_lock only allows calling list_push and list_pop, no BPF
+ * subprogs, no global functions. This means that the references would
+ * not be released inside the critical section but they may be added to
+ * the reference state, and the acquired_refs are never copied out for a
+ * different frame as BPF to BPF calls don't work in bpf_spin_lock
+ * critical sections.
+ */
+ if (!ref_obj_id) {
+ verbose(env, "verifier internal error: ref_obj_id is zero for release_on_unlock\n");
+ return -EFAULT;
+ }
+ for (i = 0; i < state->acquired_refs; i++) {
+ if (state->refs[i].id == ref_obj_id) {
+ if (state->refs[i].release_on_unlock) {
+ verbose(env, "verifier internal error: expected false release_on_unlock");
+ return -EFAULT;
+ }
+ state->refs[i].release_on_unlock = true;
+ /* Now mark everyone sharing same ref_obj_id as untrusted */
+ bpf_for_each_reg_in_vstate(env->cur_state, state, reg, ({
+ if (reg->ref_obj_id == ref_obj_id)
+ reg->type |= PTR_UNTRUSTED;
+ }));
+ return 0;
+ }
+ }
+ verbose(env, "verifier internal error: ref state missing for ref_obj_id\n");
+ return -EFAULT;
+}
+
/* Implementation details:
*
* Each register points to some region of memory, which we define as an
@@ -8433,7 +8488,8 @@ static int process_kf_arg_ptr_to_list_node(struct bpf_verifier_env *env,
btf_name_by_offset(field->list_head.btf, et->name_off));
return -EINVAL;
}
- return 0;
+ /* Set arg#1 for expiration after unlock */
+ return ref_set_release_on_unlock(env, reg->ref_obj_id);
}
static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_arg_meta *meta)
--
2.38.1
next prev parent reply other threads:[~2022-11-18 1:57 UTC|newest]
Thread overview: 35+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-11-18 1:55 [PATCH bpf-next v10 00/24] Allocated objects, BPF linked lists Kumar Kartikeya Dwivedi
2022-11-18 1:55 ` [PATCH bpf-next v10 01/24] bpf: Fix early return in map_check_btf Kumar Kartikeya Dwivedi
2022-11-18 1:55 ` [PATCH bpf-next v10 02/24] bpf: Do btf_record_free outside map_free callback Kumar Kartikeya Dwivedi
2022-11-18 1:55 ` [PATCH bpf-next v10 03/24] bpf: Free inner_map_meta when btf_record_dup fails Kumar Kartikeya Dwivedi
2022-11-18 1:55 ` [PATCH bpf-next v10 04/24] bpf: Populate field_offs for inner_map_meta Kumar Kartikeya Dwivedi
2022-11-18 1:55 ` [PATCH bpf-next v10 05/24] bpf: Introduce allocated objects support Kumar Kartikeya Dwivedi
2022-11-18 1:55 ` [PATCH bpf-next v10 06/24] bpf: Recognize lock and list fields in allocated objects Kumar Kartikeya Dwivedi
2022-11-18 1:55 ` [PATCH bpf-next v10 07/24] bpf: Verify ownership relationships for user BTF types Kumar Kartikeya Dwivedi
2022-11-18 1:55 ` [PATCH bpf-next v10 08/24] bpf: Allow locking bpf_spin_lock in allocated objects Kumar Kartikeya Dwivedi
2022-11-18 1:55 ` [PATCH bpf-next v10 09/24] bpf: Allow locking bpf_spin_lock global variables Kumar Kartikeya Dwivedi
2022-11-18 1:56 ` [PATCH bpf-next v10 10/24] bpf: Allow locking bpf_spin_lock in inner map values Kumar Kartikeya Dwivedi
2022-11-18 1:56 ` [PATCH bpf-next v10 11/24] bpf: Rewrite kfunc argument handling Kumar Kartikeya Dwivedi
2022-11-18 3:34 ` Alexei Starovoitov
2022-11-18 10:37 ` Kumar Kartikeya Dwivedi
2022-11-18 18:02 ` Alexei Starovoitov
2022-11-18 19:00 ` Kumar Kartikeya Dwivedi
2022-11-18 18:08 ` Alexei Starovoitov
2022-11-18 19:40 ` David Vernet
2022-11-20 19:25 ` Kumar Kartikeya Dwivedi
2022-11-18 1:56 ` [PATCH bpf-next v10 12/24] bpf: Support constant scalar arguments for kfuncs Kumar Kartikeya Dwivedi
2022-11-18 1:56 ` [PATCH bpf-next v10 13/24] bpf: Introduce bpf_obj_new Kumar Kartikeya Dwivedi
2022-11-18 1:56 ` [PATCH bpf-next v10 14/24] bpf: Introduce bpf_obj_drop Kumar Kartikeya Dwivedi
2022-11-18 1:56 ` [PATCH bpf-next v10 15/24] bpf: Permit NULL checking pointer with non-zero fixed offset Kumar Kartikeya Dwivedi
2022-11-18 1:56 ` [PATCH bpf-next v10 16/24] bpf: Introduce single ownership BPF linked list API Kumar Kartikeya Dwivedi
2022-11-21 18:34 ` Nathan Chancellor
2022-11-18 1:56 ` Kumar Kartikeya Dwivedi [this message]
2022-11-18 1:56 ` [PATCH bpf-next v10 18/24] bpf: Add comments for map BTF matching requirement for bpf_list_head Kumar Kartikeya Dwivedi
2022-11-18 1:56 ` [PATCH bpf-next v10 19/24] selftests/bpf: Add __contains macro to bpf_experimental.h Kumar Kartikeya Dwivedi
2022-11-18 1:56 ` [PATCH bpf-next v10 20/24] selftests/bpf: Update spinlock selftest Kumar Kartikeya Dwivedi
2022-11-18 1:56 ` [PATCH bpf-next v10 21/24] selftests/bpf: Add failure test cases for spin lock pairing Kumar Kartikeya Dwivedi
2022-11-18 1:56 ` [PATCH bpf-next v10 22/24] selftests/bpf: Add BPF linked list API tests Kumar Kartikeya Dwivedi
2022-11-18 1:56 ` [PATCH bpf-next v10 23/24] selftests/bpf: Add BTF sanity tests Kumar Kartikeya Dwivedi
2022-11-18 1:56 ` [PATCH bpf-next v10 24/24] selftests/bpf: Temporarily disable linked list tests Kumar Kartikeya Dwivedi
2022-11-22 17:24 ` Alexei Starovoitov
2022-11-18 3:40 ` [PATCH bpf-next v10 00/24] Allocated objects, BPF linked lists patchwork-bot+netdevbpf
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20221118015614.2013203-18-memxor@gmail.com \
--to=memxor@gmail.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=davemarchevsky@fb.com \
--cc=davemarchevsky@meta.com \
--cc=martin.lau@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.