bpf.vger.kernel.org archive mirror
* [PATCH RFC bpf v1 0/3] Verifier callback handling
@ 2022-08-15  5:15 Kumar Kartikeya Dwivedi
  2022-08-15  5:15 ` [PATCH RFC bpf v1 1/3] bpf: Move bpf_loop and bpf_for_each_map_elem under CAP_BPF Kumar Kartikeya Dwivedi
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-08-15  5:15 UTC (permalink / raw)
  To: bpf; +Cc: Alexei Starovoitov, Yonghong Song, Andrii Nakryiko, Daniel Borkmann

Hi, these are fixes for issues I noticed while working on the callback code.
It is an RFC because I would like to discuss how to fix one other problem.
I have a rough idea, but would like to discuss it before working on it, since
it isn't trivial: the conservative fix doesn't pass existing selftests.

It seems like the problem anticipated in [0] actually exists, in a different
form, with all the current callback based loop helpers. The main reason is
that the callback function is only explored as if it were executing once.

There are some example cases included in the selftests to show the problem and
make sure we keep catching and rejecting them. Since the callback that forms
the loop body is only verified once, when the for_each helper returns, the
execution state from the verifier's perspective is as if the callback executed
exactly once; of course, it may execute multiple times, a count that is
statically unknowable.

Patch 1 fixes the case of acquiring and releasing references; unless I missed
some corner case, it seems to work fine.
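To make the mismatch concrete, here is a tiny userspace C model (purely
illustrative, not kernel or verifier code; the function names are made up) of
how a single symbolic execution undercounts references acquired inside a
callback body:

```c
#include <assert.h>

/* Purely illustrative userspace model, not kernel code: a helper invokes its
 * callback n times at runtime, but the verifier walks the callback body only
 * once. If the body acquires one reference per invocation, the verifier's
 * reference count is wrong for every n != 1.
 */
static int refs_acquired_at_runtime(int n_iterations)
{
	int refs = 0;

	for (int i = 0; i < n_iterations; i++)
		refs++;		/* cb body: one acquire per invocation */
	return refs;
}

static int refs_seen_by_verifier(void)
{
	return 1;		/* one symbolic execution of the cb body */
}
```

This is exactly why the fix below forbids transferring acquired references out
of the callback instead of trying to count them.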

In the future, when we have a case where a synchronous callback will only be
executed once by a helper, we can improve on this by relaxing these
restrictions, so that transferring references and verifying the callback as if
it executes only once become possible. For now, all in_callback_fn functions
may be executed multiple times, hence that change is left for when it will be
needed.

Now, the important part:
------------------------

The included selftest will fail/crash, as one fix is missing from the series.
Whatever I tried so far wasn't backwards compatible with existing selftests
(e.g. conservatively marking all reads as UNKNOWN, and having all writes scrub
the stack slots). Some tests e.g. read an skb pointer from a PTR_TO_STACK
passed to the callback.

It is not clear to me what the correct fix would be for handling reads and
writes from the stack inside the callback function. If a stack slot is never
touched inside the callback in any of the states, it should be safe to allow
loading spilled registers from the caller's stack frame; even allowing precise
scalars would be safe if they are read from untouched slots.

The idea I have is roughly:

In the first pass, verify as if the callback is executing once. This is the
same as the current behavior: the callback may e.g. load a spilled precise
scalar and then overwrite it, so for insns before the overwriting store,
verification considers the register precise. In later loop iterations, though,
this assumption may be wrong.

For example:
int i = 1;
int arr[2] = {0};

bpf_loop(100, cb, &i, 0);

cb:
	int i = *ctx;		 // seen as 1
	func(*((int *)ctx - i)); // stack read ok, arr[1]
	*ctx = i + 1;		 // i is now 2

Verification ends here, but later iterations would start reading
uninitialized stack (and eventually go out of bounds of arr).
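The example above can be modeled in plain C (a hypothetical layout with arr
and i adjacent on the stack, which is how the BPF program's frame looks to the
verifier; not kernel code). It shows that the reads stay inside arr for
exactly two iterations and escape it afterwards:

```c
#include <assert.h>
#include <stddef.h>

/* Model of the bpf_loop example: slots[0..1] play the role of arr, slots[2]
 * the role of i. Returns 1 while every read the cb performs stays inside
 * arr, 0 once an iteration escapes it (which on the real stack would be an
 * uninitialized or out-of-bounds read).
 */
static int cb_reads_stay_in_bounds(int iterations)
{
	int slots[3] = {0, 0, 1};	/* arr[0], arr[1], i */
	int *ctx = &slots[2];		/* &i, as passed to bpf_loop() */

	for (int it = 0; it < iterations; it++) {
		int i = *ctx;				/* 1 on first iteration */
		ptrdiff_t idx = (ctx - i) - slots;	/* slot cb would read */

		if (idx < 0 || idx > 1)
			return 0;	/* read left arr */
		*ctx = i + 1;
	}
	return 1;
}
```

So the state at the end of the single verified iteration (i == 2) says nothing
about iteration three onwards.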

To fix this, we keep a bitmap of scratched slots for the stack in the caller
frame. Once we see BPF_EXIT, we reverify the callback, but this time we mark
_any_ register read from a slot that was scratched the first time around with
mark_reg_unknown (i.e. not precise). Moreover, in the caller stack the slot
will be marked STACK_MISC. This might not be able to track <8 byte writes, but
I guess that is fine (since that would require 512 bits instead of 64).
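A minimal sketch of that bookkeeping (the names here are hypothetical, not
existing verifier API; the constants match the 512-byte BPF stack and 8-byte
spill slot size):

```c
#include <assert.h>
#include <stdint.h>

/* One bit per 8-byte spill slot of the caller frame. With a 512-byte BPF
 * stack and 8-byte slots, 64 bits are enough; this is exactly why sub-8-byte
 * write granularity is given up (tracking it would need 512 bits).
 */
#define MAX_BPF_STACK	512
#define BPF_REG_SIZE	8
#define N_SPILL_SLOTS	(MAX_BPF_STACK / BPF_REG_SIZE)	/* 64 */

static void mark_slot_scratched(uint64_t *mask, int spi)
{
	*mask |= 1ULL << spi;
}

static int slot_is_scratched(uint64_t mask, int spi)
{
	return !!(mask & (1ULL << spi));
}
```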

The big problem I can see with this is a precise register that guards a
write in another branch which was not taken the first time around. In that
case push_stack is not done to explore the other branch, and is_branch_taken
is used to predict the outcome.

But for this to cause a problem, the precise reg guarding the branch must
change its value (which we can detect when it is read from a scratched slot),
so doing another pass would get rid of this problem, i.e. one more pass would
be needed after downgrading precise registers to unknown.
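The "one more pass" argument amounts to iterating to a fixed point on the
scratched-slot mask. A toy driver (hypothetical, not verifier code) makes the
termination argument explicit:

```c
#include <assert.h>
#include <stdint.h>

/* Rerun the symbolic pass over the callback until the mask of scratched
 * caller-stack slots stops growing. A later pass may take a branch that an
 * earlier pass predicted away via is_branch_taken (because its guard
 * register is now unknown), which can only add scratched slots, so the mask
 * grows monotonically and the loop terminates after at most one pass per
 * newly scratched slot, plus a final confirming pass.
 */
static int verify_cb_to_fixed_point(uint64_t (*one_pass)(uint64_t scratched),
				    uint64_t *mask)
{
	int passes = 0;

	for (;;) {
		uint64_t next = *mask | one_pass(*mask);

		passes++;
		if (next == *mask)
			break;		/* no new slots scratched: done */
		*mask = next;
	}
	return passes;
}

/* Example pass modeling the problem above: once slot 0 is known scratched
 * (its reader downgraded to unknown), a previously pruned branch is explored
 * and scratches slot 1 as well.
 */
static uint64_t example_pass(uint64_t scratched)
{
	uint64_t out = 1ULL << 0;

	if (scratched & (1ULL << 0))
		out |= 1ULL << 1;
	return out;
}
```

On this example the mask converges after the second real pass, with a third
pass only confirming the fixed point.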

If I can get some hints for new ideas or feedback on whether this would be the
correct approach, I can work on a proper fix for this case.

It seems like roughly the same approach will be required for the open coded
iteration mentioned in [0], so we might be able to kill two birds with one
stone.

Thanks!

  [0]: https://lore.kernel.org/bpf/33123904-5719-9e93-4af2-c7d549221520@fb.com

Kumar Kartikeya Dwivedi (3):
  bpf: Move bpf_loop and bpf_for_each_map_elem under CAP_BPF
  bpf: Fix reference state management for synchronous callbacks
  selftests/bpf: Add tests for callbacks fixes

 include/linux/bpf_verifier.h                  |  11 ++
 kernel/bpf/helpers.c                          |   8 +-
 kernel/bpf/verifier.c                         |  42 +++--
 .../selftests/bpf/prog_tests/cb_refs.c        |  49 ++++++
 tools/testing/selftests/bpf/progs/cb_refs.c   | 152 ++++++++++++++++++
 5 files changed, 249 insertions(+), 13 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/cb_refs.c
 create mode 100644 tools/testing/selftests/bpf/progs/cb_refs.c

-- 
2.34.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH RFC bpf v1 1/3] bpf: Move bpf_loop and bpf_for_each_map_elem under CAP_BPF
  2022-08-15  5:15 [PATCH RFC bpf v1 0/3] Verifier callback handling Kumar Kartikeya Dwivedi
@ 2022-08-15  5:15 ` Kumar Kartikeya Dwivedi
  2022-08-15  5:15 ` [PATCH RFC bpf v1 2/3] bpf: Fix reference state management for synchronous callbacks Kumar Kartikeya Dwivedi
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 9+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-08-15  5:15 UTC (permalink / raw)
  To: bpf; +Cc: Alexei Starovoitov, Yonghong Song, Andrii Nakryiko, Daniel Borkmann

These helpers would require func_info, which needs prog BTF anyway, which in
turn requires bpf_capable; so, just to be doubly sure, move both of these
callback-taking helpers under CAP_BPF protection as well.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 kernel/bpf/helpers.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 1f961f9982d2..d0e80926bac5 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -1633,10 +1633,6 @@ bpf_base_func_proto(enum bpf_func_id func_id)
 		return &bpf_ringbuf_submit_dynptr_proto;
 	case BPF_FUNC_ringbuf_discard_dynptr:
 		return &bpf_ringbuf_discard_dynptr_proto;
-	case BPF_FUNC_for_each_map_elem:
-		return &bpf_for_each_map_elem_proto;
-	case BPF_FUNC_loop:
-		return &bpf_loop_proto;
 	case BPF_FUNC_strncmp:
 		return &bpf_strncmp_proto;
 	case BPF_FUNC_dynptr_from_mem:
@@ -1675,6 +1671,10 @@ bpf_base_func_proto(enum bpf_func_id func_id)
 		return &bpf_timer_cancel_proto;
 	case BPF_FUNC_kptr_xchg:
 		return &bpf_kptr_xchg_proto;
+	case BPF_FUNC_for_each_map_elem:
+		return &bpf_for_each_map_elem_proto;
+	case BPF_FUNC_loop:
+		return &bpf_loop_proto;
 	default:
 		break;
 	}
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH RFC bpf v1 2/3] bpf: Fix reference state management for synchronous callbacks
  2022-08-15  5:15 [PATCH RFC bpf v1 0/3] Verifier callback handling Kumar Kartikeya Dwivedi
  2022-08-15  5:15 ` [PATCH RFC bpf v1 1/3] bpf: Move bpf_loop and bpf_for_each_map_elem under CAP_BPF Kumar Kartikeya Dwivedi
@ 2022-08-15  5:15 ` Kumar Kartikeya Dwivedi
  2022-08-19  0:51   ` Alexei Starovoitov
  2022-08-15  5:15 ` [PATCH RFC bpf v1 3/3] selftests/bpf: Add tests for callbacks fixes Kumar Kartikeya Dwivedi
  2022-08-24 23:18 ` [PATCH RFC bpf v1 0/3] Verifier callback handling Andrii Nakryiko
  3 siblings, 1 reply; 9+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-08-15  5:15 UTC (permalink / raw)
  To: bpf; +Cc: Alexei Starovoitov, Yonghong Song, Andrii Nakryiko, Daniel Borkmann

Currently, the verifier verifies callback functions (sync and async) as if
they will be executed once, i.e. it explores the execution state as if the
function were being called once. The next insn to explore is set to the start
of the subprog, and the exit from the nested frame is handled using
curframe > 0 and prepare_func_exit. In the async callback case it uses a
customized variant of push_stack, simulating a kind of branch to set up custom
state and an execution context for the async callback.

While this approach is simple and works when the callback really will be
executed only once, it is unsafe for all of our current helpers, which are
for_each style, i.e. they execute the callback multiple times.

A callback releasing acquired references of the caller may do so multiple
times, but currently the verifier sees it as one call inside the frame, which
then returns to the caller. Hence, it thinks it released some reference that
the cb e.g. got access to through callback_ctx (a register filled inside the
cb from a spilled typed register on the stack).

Similarly, it may see that an acquire call is unpaired inside the callback,
so the caller will copy the reference state of the callback and then have to
release the registers with the new ref_obj_ids. But again, the callback may
execute multiple times, while the verifier only accounts for the acquired
references of a single symbolic execution of the callback.

Note that the async callback case is different. While currently we only have
bpf_timer_set_callback, which executes the callback once, even multiple
executions would be safe, as the reference state is NULL and
check_reference_leak would force the program to release state before BPF_EXIT.
The state is also unaffected by the analysis of the caller frame. Hence the
async callback is safe.

To fix this, we disallow callbacks from transferring acquired references back
to the caller. They must be released before the callback hits BPF_EXIT: since
the number of times the callback is invoked is not known to the verifier, it
cannot reliably track how many references will be created. Likewise, the
callback is not allowed to release caller reference state, for the same
reason.

Lastly, now that the callback function cannot change the reference state it
copied from its parent, there is no need to copy the reference state back to
the parent, since it won't change. It may be changed in the callee frame, but
that state must match the parent's reference state by the time the callee
exits, and it is going to be discarded anyway; so skip this copy too. To be
clear, it wouldn't be incorrect to do the copy, but it would be inefficient
and may confuse people reading the code.

Fixes: 69c087ba6225 ("bpf: Add bpf_for_each_map_elem() helper")
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 include/linux/bpf_verifier.h | 11 ++++++++++
 kernel/bpf/verifier.c        | 42 ++++++++++++++++++++++++++++--------
 2 files changed, 44 insertions(+), 9 deletions(-)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 2e3bad8640dc..1fdddbf3546b 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -212,6 +212,17 @@ struct bpf_reference_state {
 	 * is used purely to inform the user of a reference leak.
 	 */
 	int insn_idx;
+	/* There can be a case like:
+	 * main (frame 0)
+	 *  cb (frame 1)
+	 *   func (frame 3)
+	 *    cb (frame 4)
+	 * Hence for frame 4, if callback_ref just stored boolean, it would be
+	 * impossible to distinguish nested callback refs. Hence store the
+	 * frameno and compare that to callback_ref in check_reference_leak when
+	 * exiting a callback function.
+	 */
+	int callback_ref;
 };
 
 /* state of the program:
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 096fdac70165..3e885ba88b02 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -1086,6 +1086,7 @@ static int acquire_reference_state(struct bpf_verifier_env *env, int insn_idx)
 	id = ++env->id_gen;
 	state->refs[new_ofs].id = id;
 	state->refs[new_ofs].insn_idx = insn_idx;
+	state->refs[new_ofs].callback_ref = state->in_callback_fn ? state->frameno : 0;
 
 	return id;
 }
@@ -1098,6 +1099,9 @@ static int release_reference_state(struct bpf_func_state *state, int ptr_id)
 	last_idx = state->acquired_refs - 1;
 	for (i = 0; i < state->acquired_refs; i++) {
 		if (state->refs[i].id == ptr_id) {
+			/* Cannot release caller references in callbacks */
+			if (state->in_callback_fn && state->refs[i].callback_ref != state->frameno)
+				return -EINVAL;
 			if (last_idx && i != last_idx)
 				memcpy(&state->refs[i], &state->refs[last_idx],
 				       sizeof(*state->refs));
@@ -6938,10 +6942,17 @@ static int prepare_func_exit(struct bpf_verifier_env *env, int *insn_idx)
 		caller->regs[BPF_REG_0] = *r0;
 	}
 
-	/* Transfer references to the caller */
-	err = copy_reference_state(caller, callee);
-	if (err)
-		return err;
+	/* callback_fn frame should have released its own additions to parent's
+	 * reference state at this point, or check_reference_leak would
+	 * complain, hence it must be the same as the caller. There is no need
+	 * to copy it back.
+	 */
+	if (!callee->in_callback_fn) {
+		/* Transfer references to the caller */
+		err = copy_reference_state(caller, callee);
+		if (err)
+			return err;
+	}
 
 	*insn_idx = callee->callsite + 1;
 	if (env->log.level & BPF_LOG_LEVEL) {
@@ -7065,13 +7076,20 @@ record_func_key(struct bpf_verifier_env *env, struct bpf_call_arg_meta *meta,
 static int check_reference_leak(struct bpf_verifier_env *env)
 {
 	struct bpf_func_state *state = cur_func(env);
+	bool refs_lingering = false;
 	int i;
 
+	if (state->frameno && !state->in_callback_fn)
+		return 0;
+
 	for (i = 0; i < state->acquired_refs; i++) {
+		if (state->in_callback_fn && state->refs[i].callback_ref != state->frameno)
+			continue;
 		verbose(env, "Unreleased reference id=%d alloc_insn=%d\n",
 			state->refs[i].id, state->refs[i].insn_idx);
+		refs_lingering = true;
 	}
-	return state->acquired_refs ? -EINVAL : 0;
+	return refs_lingering ? -EINVAL : 0;
 }
 
 static int check_bpf_snprintf_call(struct bpf_verifier_env *env,
@@ -12332,6 +12350,16 @@ static int do_check(struct bpf_verifier_env *env)
 					return -EINVAL;
 				}
 
+				/* We must do check_reference_leak here before
+				 * prepare_func_exit to handle the case when
+				 * state->curframe > 0, it may be a callback
+				 * function, for which reference_state must
+				 * match caller reference state when it exits.
+				 */
+				err = check_reference_leak(env);
+				if (err)
+					return err;
+
 				if (state->curframe) {
 					/* exit from nested function */
 					err = prepare_func_exit(env, &env->insn_idx);
@@ -12341,10 +12369,6 @@ static int do_check(struct bpf_verifier_env *env)
 					continue;
 				}
 
-				err = check_reference_leak(env);
-				if (err)
-					return err;
-
 				err = check_return_code(env);
 				if (err)
 					return err;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH RFC bpf v1 3/3] selftests/bpf: Add tests for callbacks fixes
  2022-08-15  5:15 [PATCH RFC bpf v1 0/3] Verifier callback handling Kumar Kartikeya Dwivedi
  2022-08-15  5:15 ` [PATCH RFC bpf v1 1/3] bpf: Move bpf_loop and bpf_for_each_map_elem under CAP_BPF Kumar Kartikeya Dwivedi
  2022-08-15  5:15 ` [PATCH RFC bpf v1 2/3] bpf: Fix reference state management for synchronous callbacks Kumar Kartikeya Dwivedi
@ 2022-08-15  5:15 ` Kumar Kartikeya Dwivedi
  2022-08-24 23:18 ` [PATCH RFC bpf v1 0/3] Verifier callback handling Andrii Nakryiko
  3 siblings, 0 replies; 9+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-08-15  5:15 UTC (permalink / raw)
  To: bpf; +Cc: Alexei Starovoitov, Yonghong Song, Andrii Nakryiko, Daniel Borkmann

These are regression tests to ensure we don't end up in an invalid runtime
state for helpers that execute callbacks multiple times. They exercise the
fixes to verifier callback handling from the previous patches.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 .../selftests/bpf/prog_tests/cb_refs.c        |  49 ++++++
 tools/testing/selftests/bpf/progs/cb_refs.c   | 152 ++++++++++++++++++
 2 files changed, 201 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/cb_refs.c
 create mode 100644 tools/testing/selftests/bpf/progs/cb_refs.c

diff --git a/tools/testing/selftests/bpf/prog_tests/cb_refs.c b/tools/testing/selftests/bpf/prog_tests/cb_refs.c
new file mode 100644
index 000000000000..a74ac3ace5c4
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/cb_refs.c
@@ -0,0 +1,49 @@
+// SPDX-License-Identifier: GPL-2.0
+#include "bpf/libbpf.h"
+#include <test_progs.h>
+#include <network_helpers.h>
+
+#include "cb_refs.skel.h"
+
+static char log_buf[1024 * 1024];
+
+struct {
+	const char *prog_name;
+	const char *err_msg;
+} cb_refs_tests[] = {
+	{ "underflow_prog", "reference has not been acquired before" },
+	{ "leak_prog", "Unreleased reference" },
+	{ "nested_cb", "Unreleased reference id=2 alloc_insn=16" },
+	{ "oob_access", "TBD..." },
+	{ "write", "TBD..." },
+};
+
+void test_cb_refs(void)
+{
+	LIBBPF_OPTS(bpf_object_open_opts, opts, .kernel_log_buf = log_buf,
+						.kernel_log_size = sizeof(log_buf),
+						.kernel_log_level = 1);
+	struct bpf_program *prog;
+	struct cb_refs *skel;
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(cb_refs_tests); i++) {
+		LIBBPF_OPTS(bpf_test_run_opts, run_opts,
+			.data_in = &pkt_v4,
+			.data_size_in = sizeof(pkt_v4),
+			.repeat = 1,
+		);
+		skel = cb_refs__open_opts(&opts);
+		if (!ASSERT_OK_PTR(skel, "cb_refs__open_and_load"))
+			return;
+		prog = bpf_object__find_program_by_name(skel->obj, cb_refs_tests[i].prog_name);
+		bpf_program__set_autoload(prog, true);
+		if (!ASSERT_ERR(cb_refs__load(skel), "cb_refs__load"))
+			bpf_prog_test_run_opts(bpf_program__fd(prog), &run_opts);
+		if (!ASSERT_OK_PTR(strstr(log_buf, cb_refs_tests[i].err_msg), "expected error message")) {
+			fprintf(stderr, "Expected: %s\n", cb_refs_tests[i].err_msg);
+			fprintf(stderr, "Verifier: %s\n", log_buf);
+		}
+		cb_refs__destroy(skel);
+	}
+}
diff --git a/tools/testing/selftests/bpf/progs/cb_refs.c b/tools/testing/selftests/bpf/progs/cb_refs.c
new file mode 100644
index 000000000000..1f3ce0b4f8b2
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/cb_refs.c
@@ -0,0 +1,152 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <vmlinux.h>
+#include <bpf/bpf_tracing.h>
+#include <bpf/bpf_helpers.h>
+
+struct map_value {
+	unsigned long data;
+	unsigned long data2;
+	struct prog_test_ref_kfunc __kptr_ref *ptr;
+};
+
+struct {
+	__uint(type, BPF_MAP_TYPE_ARRAY);
+	__type(key, int);
+	__type(value, struct map_value);
+	__uint(max_entries, 16);
+} array_map SEC(".maps");
+
+extern struct prog_test_ref_kfunc *bpf_kfunc_call_test_acquire(unsigned long *sp) __ksym;
+extern void bpf_kfunc_call_test_release(struct prog_test_ref_kfunc *p) __ksym;
+
+static __always_inline int cb1(void *map, void *key, void *value, void *ctx)
+{
+	void *p = *(void **)ctx;
+	bpf_kfunc_call_test_release(p);
+	/* Without the fix this would cause underflow */
+	return 0;
+}
+
+SEC("?tc")
+int underflow_prog(void *ctx)
+{
+	struct prog_test_ref_kfunc *p;
+	unsigned long sl = 0;
+
+	p = bpf_kfunc_call_test_acquire(&sl);
+	if (!p)
+		return 0;
+	bpf_for_each_map_elem(&array_map, cb1, &p, 0);
+	return 0;
+}
+
+static __always_inline int cb2(void *map, void *key, void *value, void *ctx)
+{
+	unsigned long sl = 0;
+
+	*(void **)ctx = bpf_kfunc_call_test_acquire(&sl);
+	/* Without the fix this would leak memory */
+	return 0;
+}
+
+SEC("?tc")
+int leak_prog(void *ctx)
+{
+	struct prog_test_ref_kfunc *p;
+	struct map_value *v;
+	unsigned long sl;
+
+	v = bpf_map_lookup_elem(&array_map, &(int){0});
+	if (!v)
+		return 0;
+
+	p = NULL;
+	bpf_for_each_map_elem(&array_map, cb2, &p, 0);
+	p = bpf_kptr_xchg(&v->ptr, p);
+	if (p)
+		bpf_kfunc_call_test_release(p);
+	return 0;
+}
+
+static __always_inline int cb(void *map, void *key, void *value, void *ctx)
+{
+	return 0;
+}
+
+static __always_inline int cb3(void *map, void *key, void *value, void *ctx)
+{
+	unsigned long sl = 0;
+	void *p;
+
+	bpf_kfunc_call_test_acquire(&sl);
+	bpf_for_each_map_elem(&array_map, cb, &p, 0);
+	/* It should only complain here, not in cb. This is why we need
+	 * callback_ref to be set to frameno.
+	 */
+	return 0;
+}
+
+SEC("?tc")
+int nested_cb(void *ctx)
+{
+	int p = 0;
+
+	bpf_for_each_map_elem(&array_map, cb3, &p, 0);
+	return 0;
+}
+
+static __always_inline int lcb(void *map, unsigned long *idx)
+{
+	unsigned long i = *idx;
+	i++;
+	*idx = i;
+	return 0;
+}
+
+SEC("?tc")
+int oob_access(void *ctx)
+{
+	unsigned long idx = 0;
+	struct map_value *v;
+
+	v = bpf_map_lookup_elem(&array_map, &(int){0});
+	if (!v)
+		return 0;
+	bpf_loop(100, lcb, &idx, 0);
+	/* Verifier would think we are accessing using idx=1 without the fix */
+	return ((unsigned long *)&v->data)[idx];
+}
+
+static __always_inline int lcb1(void *map, int *idx)
+{
+	int i = *idx;
+	i--;
+	*idx = i;
+	return 0;
+}
+
+static __always_inline int lcb2(void *map, void **pp)
+{
+	int i = *(int *)(pp + 1);
+	pp[i + 2] = (void *)0xeB9F15D34D;
+	return 0;
+}
+
+SEC("?tc")
+int write(void *ctx)
+{
+	struct {
+		struct map_value *v;
+		int idx;
+	} x = {};
+	x.v = bpf_map_lookup_elem(&array_map, &(int){0});
+	if (!x.v)
+		return 0;
+	bpf_loop(2, &lcb1, &x.idx, 0);
+	/* idx is -2, verifier thinks it is -1 */
+	bpf_loop(1, &lcb2, &x, 0);
+	/* x.v is no longer map value, but verifier thinks so */
+	return x.v->data;
+}
+
+char _license[] SEC("license") = "GPL";
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH RFC bpf v1 2/3] bpf: Fix reference state management for synchronous callbacks
  2022-08-15  5:15 ` [PATCH RFC bpf v1 2/3] bpf: Fix reference state management for synchronous callbacks Kumar Kartikeya Dwivedi
@ 2022-08-19  0:51   ` Alexei Starovoitov
  2022-08-19  5:31     ` Kumar Kartikeya Dwivedi
  0 siblings, 1 reply; 9+ messages in thread
From: Alexei Starovoitov @ 2022-08-19  0:51 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Alexei Starovoitov, Yonghong Song, Andrii Nakryiko, Daniel Borkmann

On Sun, Aug 14, 2022 at 10:15 PM Kumar Kartikeya Dwivedi
<memxor@gmail.com> wrote:
>
> Currently, verifier verifies callback functions (sync and async) as if
> they will be executed once, (i.e. it explores execution state as if the
> function was being called once). The next insn to explore is set to
> start of subprog and the exit from nested frame is handled using
> curframe > 0 and prepare_func_exit. In case of async callback it uses a
> customized variant of push_stack simulating a kind of branch to set up
> custom state and execution context for the async callback.
>
> While this approach is simple and works when callback really will be
> executed only once, it is unsafe for all of our current helpers which
> are for_each style, i.e. they execute the callback multiple times.
>
> A callback releasing acquired references of the caller may do so
> multiple times, but currently verifier sees it as one call inside the
> frame, which then returns to caller. Hence, it thinks it released some
> reference that the cb e.g. got access through callback_ctx (register
> filled inside cb from spilled typed register on stack).
>
> Similarly, it may see that an acquire call is unpaired inside the
> callback, so the caller will copy the reference state of callback and
> then will have to release the register with new ref_obj_ids. But again,
> the callback may execute multiple times, but the verifier will only
> account for acquired references for a single symbolic execution of the
> callback.
>
> Note that for async callback case, things are different. While currently
> we have bpf_timer_set_callback which only executes it once, even for
> multiple executions it would be safe, as reference state is NULL and
> check_reference_leak would force program to release state before
> BPF_EXIT. The state is also unaffected by analysis for the caller frame.
> Hence async callback is safe.
>
> To fix this, we disallow callbacks to transfer acquired references back
> to caller. They must be released before callback hits BPF_EXIT, since
> the number of times callback is invoked is not known to the verifier, it
> cannot reliably track how many references will be created. Likewise, it
> is not allowed to release caller reference state, since we don't know
> how many times the callback will be invoked.
>
> Lastly, now that callback function cannot change reference state it
> copied from its parent, there is no need to copy reference state back to
> the parent, since it won't change. It may be changed for the callee
> frame but that state must match parent reference state by the time
> callee exits, and it is going to be discarded anyway. So skip this copy
> too. To be clear, it won't be incorrect if the copy was done, but it
> would be inefficient and may be confusing to people reading the code.
>
> Fixes: 69c87ba6225 ("bpf: Add bpf_for_each_map_elem() helper")
> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> ---
>  include/linux/bpf_verifier.h | 11 ++++++++++
>  kernel/bpf/verifier.c        | 42 ++++++++++++++++++++++++++++--------
>  2 files changed, 44 insertions(+), 9 deletions(-)
>
> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> index 2e3bad8640dc..1fdddbf3546b 100644
> --- a/include/linux/bpf_verifier.h
> +++ b/include/linux/bpf_verifier.h
> @@ -212,6 +212,17 @@ struct bpf_reference_state {
>          * is used purely to inform the user of a reference leak.
>          */
>         int insn_idx;
> +       /* There can be a case like:
> +        * main (frame 0)
> +        *  cb (frame 1)
> +        *   func (frame 3)
> +        *    cb (frame 4)
> +        * Hence for frame 4, if callback_ref just stored boolean, it would be
> +        * impossible to distinguish nested callback refs. Hence store the
> +        * frameno and compare that to callback_ref in check_reference_leak when
> +        * exiting a callback function.
> +        */
> +       int callback_ref;
>  };
>
>  /* state of the program:
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 096fdac70165..3e885ba88b02 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -1086,6 +1086,7 @@ static int acquire_reference_state(struct bpf_verifier_env *env, int insn_idx)
>         id = ++env->id_gen;
>         state->refs[new_ofs].id = id;
>         state->refs[new_ofs].insn_idx = insn_idx;
> +       state->refs[new_ofs].callback_ref = state->in_callback_fn ? state->frameno : 0;
>
>         return id;
>  }
> @@ -1098,6 +1099,9 @@ static int release_reference_state(struct bpf_func_state *state, int ptr_id)
>         last_idx = state->acquired_refs - 1;
>         for (i = 0; i < state->acquired_refs; i++) {
>                 if (state->refs[i].id == ptr_id) {
> +                       /* Cannot release caller references in callbacks */
> +                       if (state->in_callback_fn && state->refs[i].callback_ref != state->frameno)
> +                               return -EINVAL;
>                         if (last_idx && i != last_idx)
>                                 memcpy(&state->refs[i], &state->refs[last_idx],
>                                        sizeof(*state->refs));
> @@ -6938,10 +6942,17 @@ static int prepare_func_exit(struct bpf_verifier_env *env, int *insn_idx)
>                 caller->regs[BPF_REG_0] = *r0;
>         }
>
> -       /* Transfer references to the caller */
> -       err = copy_reference_state(caller, callee);
> -       if (err)
> -               return err;
> +       /* callback_fn frame should have released its own additions to parent's
> +        * reference state at this point, or check_reference_leak would
> +        * complain, hence it must be the same as the caller. There is no need
> +        * to copy it back.
> +        */
> +       if (!callee->in_callback_fn) {
> +               /* Transfer references to the caller */
> +               err = copy_reference_state(caller, callee);
> +               if (err)
> +                       return err;
> +       }

This part makes sense.

>
>         *insn_idx = callee->callsite + 1;
>         if (env->log.level & BPF_LOG_LEVEL) {
> @@ -7065,13 +7076,20 @@ record_func_key(struct bpf_verifier_env *env, struct bpf_call_arg_meta *meta,
>  static int check_reference_leak(struct bpf_verifier_env *env)
>  {
>         struct bpf_func_state *state = cur_func(env);
> +       bool refs_lingering = false;
>         int i;
>
> +       if (state->frameno && !state->in_callback_fn)
> +               return 0;
> +
>         for (i = 0; i < state->acquired_refs; i++) {
> +               if (state->in_callback_fn && state->refs[i].callback_ref != state->frameno)
> +                       continue;

This part I don't understand.
Why remember callback_ref at all?
if (state->in_callback_fn)
and there is something in acquired_refs it means
that callback acquired refs and since we're not transferring
them then we can error right away.

>                 verbose(env, "Unreleased reference id=%d alloc_insn=%d\n",
>                         state->refs[i].id, state->refs[i].insn_idx);
> +               refs_lingering = true;
>         }
> -       return state->acquired_refs ? -EINVAL : 0;
> +       return refs_lingering ? -EINVAL : 0;
>  }
>
>  static int check_bpf_snprintf_call(struct bpf_verifier_env *env,
> @@ -12332,6 +12350,16 @@ static int do_check(struct bpf_verifier_env *env)
>                                         return -EINVAL;
>                                 }
>
> +                               /* We must do check_reference_leak here before
> +                                * prepare_func_exit to handle the case when
> +                                * state->curframe > 0, it may be a callback
> +                                * function, for which reference_state must
> +                                * match caller reference state when it exits.
> +                                */
> +                               err = check_reference_leak(env);
> +                               if (err)
> +                                       return err;
> +
>                                 if (state->curframe) {
>                                         /* exit from nested function */
>                                         err = prepare_func_exit(env, &env->insn_idx);
> @@ -12341,10 +12369,6 @@ static int do_check(struct bpf_verifier_env *env)
>                                         continue;
>                                 }
>
> -                               err = check_reference_leak(env);
> -                               if (err)
> -                                       return err;
> -
>                                 err = check_return_code(env);
>                                 if (err)
>                                         return err;
> --
> 2.34.1
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH RFC bpf v1 2/3] bpf: Fix reference state management for synchronous callbacks
  2022-08-19  0:51   ` Alexei Starovoitov
@ 2022-08-19  5:31     ` Kumar Kartikeya Dwivedi
  2022-08-19 18:38       ` Alexei Starovoitov
  0 siblings, 1 reply; 9+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-08-19  5:31 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: bpf, Alexei Starovoitov, Yonghong Song, Andrii Nakryiko, Daniel Borkmann

On Fri, 19 Aug 2022 at 02:51, Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Sun, Aug 14, 2022 at 10:15 PM Kumar Kartikeya Dwivedi
> <memxor@gmail.com> wrote:
> >
> > Currently, verifier verifies callback functions (sync and async) as if
> > they will be executed once, (i.e. it explores execution state as if the
> > function was being called once). The next insn to explore is set to
> > start of subprog and the exit from nested frame is handled using
> > curframe > 0 and prepare_func_exit. In case of async callback it uses a
> > customized variant of push_stack simulating a kind of branch to set up
> > custom state and execution context for the async callback.
> >
> > While this approach is simple and works when callback really will be
> > executed only once, it is unsafe for all of our current helpers which
> > are for_each style, i.e. they execute the callback multiple times.
> >
> > A callback releasing acquired references of the caller may do so
> > multiple times, but currently verifier sees it as one call inside the
> > frame, which then returns to caller. Hence, it thinks it released some
> > reference that the cb e.g. got access to through callback_ctx (a register
> > filled inside the cb from a spilled typed register on the stack).
> >
> > Similarly, it may see that an acquire call is unpaired inside the
> > callback, so the caller will copy the reference state of callback and
> > then will have to release the register with new ref_obj_ids. But again,
> > the callback may execute multiple times, but the verifier will only
> > account for acquired references for a single symbolic execution of the
> > callback.
> >
> > Note that for async callback case, things are different. While currently
> > we have bpf_timer_set_callback which only executes it once, even for
> > multiple executions it would be safe, as reference state is NULL and
> > check_reference_leak would force program to release state before
> > BPF_EXIT. The state is also unaffected by analysis for the caller frame.
> > Hence async callback is safe.
> >
> > To fix this, we disallow callbacks from transferring acquired references
> > back to the caller. They must be released before the callback hits
> > BPF_EXIT; since the number of times the callback is invoked is not known
> > to the verifier, it cannot reliably track how many references will be
> > created. Likewise, it
> > is not allowed to release caller reference state, since we don't know
> > how many times the callback will be invoked.
> >
> > Lastly, now that callback function cannot change reference state it
> > copied from its parent, there is no need to copy reference state back to
> > the parent, since it won't change. It may be changed for the callee
> > frame but that state must match parent reference state by the time
> > callee exits, and it is going to be discarded anyway. So skip this copy
> > too. To be clear, it won't be incorrect if the copy was done, but it
> > would be inefficient and may be confusing to people reading the code.
> >
> > Fixes: 69c087ba6225 ("bpf: Add bpf_for_each_map_elem() helper")
> > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> > ---
> >  include/linux/bpf_verifier.h | 11 ++++++++++
> >  kernel/bpf/verifier.c        | 42 ++++++++++++++++++++++++++++--------
> >  2 files changed, 44 insertions(+), 9 deletions(-)
> >
> > diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> > index 2e3bad8640dc..1fdddbf3546b 100644
> > --- a/include/linux/bpf_verifier.h
> > +++ b/include/linux/bpf_verifier.h
> > @@ -212,6 +212,17 @@ struct bpf_reference_state {
> >          * is used purely to inform the user of a reference leak.
> >          */
> >         int insn_idx;
> > +       /* There can be a case like:
> > +        * main (frame 0)
> > +        *  cb (frame 1)
> > +        *   func (frame 2)
> > +        *    cb (frame 3)
> > +        * Hence for frame 3, if callback_ref just stored a boolean, it would be
> > +        * impossible to distinguish nested callback refs. Hence store the
> > +        * frameno and compare that to callback_ref in check_reference_leak when
> > +        * exiting a callback function.
> > +        */
> > +       int callback_ref;
> >  };
> >
> >  /* state of the program:
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index 096fdac70165..3e885ba88b02 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -1086,6 +1086,7 @@ static int acquire_reference_state(struct bpf_verifier_env *env, int insn_idx)
> >         id = ++env->id_gen;
> >         state->refs[new_ofs].id = id;
> >         state->refs[new_ofs].insn_idx = insn_idx;
> > +       state->refs[new_ofs].callback_ref = state->in_callback_fn ? state->frameno : 0;
> >
> >         return id;
> >  }
> > @@ -1098,6 +1099,9 @@ static int release_reference_state(struct bpf_func_state *state, int ptr_id)
> >         last_idx = state->acquired_refs - 1;
> >         for (i = 0; i < state->acquired_refs; i++) {
> >                 if (state->refs[i].id == ptr_id) {
> > +                       /* Cannot release caller references in callbacks */
> > +                       if (state->in_callback_fn && state->refs[i].callback_ref != state->frameno)
> > +                               return -EINVAL;
> >                         if (last_idx && i != last_idx)
> >                                 memcpy(&state->refs[i], &state->refs[last_idx],
> >                                        sizeof(*state->refs));
> > @@ -6938,10 +6942,17 @@ static int prepare_func_exit(struct bpf_verifier_env *env, int *insn_idx)
> >                 caller->regs[BPF_REG_0] = *r0;
> >         }
> >
> > -       /* Transfer references to the caller */
> > -       err = copy_reference_state(caller, callee);
> > -       if (err)
> > -               return err;
> > +       /* callback_fn frame should have released its own additions to parent's
> > +        * reference state at this point, or check_reference_leak would
> > +        * complain, hence it must be the same as the caller. There is no need
> > +        * to copy it back.
> > +        */
> > +       if (!callee->in_callback_fn) {
> > +               /* Transfer references to the caller */
> > +               err = copy_reference_state(caller, callee);
> > +               if (err)
> > +                       return err;
> > +       }
>
> This part makes sense.
>
> >
> >         *insn_idx = callee->callsite + 1;
> >         if (env->log.level & BPF_LOG_LEVEL) {
> > @@ -7065,13 +7076,20 @@ record_func_key(struct bpf_verifier_env *env, struct bpf_call_arg_meta *meta,
> >  static int check_reference_leak(struct bpf_verifier_env *env)
> >  {
> >         struct bpf_func_state *state = cur_func(env);
> > +       bool refs_lingering = false;
> >         int i;
> >
> > +       if (state->frameno && !state->in_callback_fn)
> > +               return 0;
> > +
> >         for (i = 0; i < state->acquired_refs; i++) {
> > +               if (state->in_callback_fn && state->refs[i].callback_ref != state->frameno)
> > +                       continue;
>
> This part I don't understand.
> Why remember callback_ref at all?
> if (state->in_callback_fn)
> and there is something in acquired_refs it means
> that callback acquired refs and since we're not transferring
> them then we can error right away.
>

Consider that a prog has 3 acquired_refs, with ids 1, 2, 3.

When you invoke the callback, you still want it to be able to load
pointers spilled to the stack with a ref_obj_id and pass them to
helpers. Hence, we do copy acquired_refs to the callback after
initializing bpf_func_state. However, since we don't know how many
times it will be invoked, it is unsafe to allow it to acquire refs it
doesn't release before its bpf_exit, or to release the caller's refs.
The former can leak references, since the verifier only accounts for
refs acquired in a single symbolic iteration, and likewise for refs
released in a single iteration.

Hence, this patch completely disallows the callback from mutating the
ref state it copied from its parent. It can however acquire its own
refs and release them before bpf_exit. check_reference_leak now errors
if we are in a callback_fn and have not released the callback's own
refs. Since there can be multiple nested callbacks, like
frame 0 -> cb1 -> cb2, etc., we also need to distinguish whether a
particular ref belongs to this callback frame or to a parent frame,
and only error for our own, so we store state->frameno (which is
always non-zero for callbacks).

TL;DR: callbacks can read the parent's reference_state, so they can
use pointers acquired by the caller, but cannot mutate it. They must
undo their own changes (by releasing their own acquired_refs before
bpf_exit) on top of the caller's reference_state before returning, at
which point the caller and callback states match anyway, so there is
no need to copy back.
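To make the scheme concrete, here is a small userspace sketch (hypothetical
types and helper names, modeled on but not identical to the verifier code) of
how tagging each ref with callback_ref = frameno lets the leak check flag only
the callback's own unreleased refs while leaving the caller's refs untouched:

```c
#include <assert.h>

/* Hypothetical userspace model of the verifier's per-frame reference
 * state; field names mirror the patch but this is not kernel code.
 */
#define MAX_REFS 8

struct ref_state {
	int id;           /* non-zero id of an acquired reference */
	int callback_ref; /* frameno that acquired it, 0 for frame 0 */
};

struct func_state {
	int frameno;
	int in_callback_fn;
	int acquired_refs;
	struct ref_state refs[MAX_REFS];
};

static int acquire_ref(struct func_state *st, int id)
{
	struct ref_state *r = &st->refs[st->acquired_refs++];

	r->id = id;
	/* Refs acquired inside a callback are tagged with its frameno */
	r->callback_ref = st->in_callback_fn ? st->frameno : 0;
	return id;
}

static int release_ref(struct func_state *st, int id)
{
	int i;

	for (i = 0; i < st->acquired_refs; i++) {
		if (st->refs[i].id != id)
			continue;
		/* Cannot release caller references in callbacks */
		if (st->in_callback_fn &&
		    st->refs[i].callback_ref != st->frameno)
			return -1;
		st->refs[i] = st->refs[--st->acquired_refs];
		return 0;
	}
	return -1;
}

static int check_reference_leak(struct func_state *st)
{
	int i;

	for (i = 0; i < st->acquired_refs; i++) {
		/* Parent refs are visible but not ours to release */
		if (st->in_callback_fn &&
		    st->refs[i].callback_ref != st->frameno)
			continue;
		return -1; /* an unreleased ref of our own: leak */
	}
	return 0;
}
```

Releasing a caller ref from inside the callback fails, and only refs tagged
with the callback's own frameno count as leaks at its BPF_EXIT.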


* Re: [PATCH RFC bpf v1 2/3] bpf: Fix reference state management for synchronous callbacks
  2022-08-19  5:31     ` Kumar Kartikeya Dwivedi
@ 2022-08-19 18:38       ` Alexei Starovoitov
  0 siblings, 0 replies; 9+ messages in thread
From: Alexei Starovoitov @ 2022-08-19 18:38 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Alexei Starovoitov, Yonghong Song, Andrii Nakryiko, Daniel Borkmann

On Fri, Aug 19, 2022 at 07:31:24AM +0200, Kumar Kartikeya Dwivedi wrote:
> On Fri, 19 Aug 2022 at 02:51, Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Sun, Aug 14, 2022 at 10:15 PM Kumar Kartikeya Dwivedi
> > <memxor@gmail.com> wrote:
> > >
> > > Currently, verifier verifies callback functions (sync and async) as if
> > > they will be executed once, (i.e. it explores execution state as if the
> > > function was being called once). The next insn to explore is set to
> > > start of subprog and the exit from nested frame is handled using
> > > curframe > 0 and prepare_func_exit. In case of async callback it uses a
> > > customized variant of push_stack simulating a kind of branch to set up
> > > custom state and execution context for the async callback.
> > >
> > > While this approach is simple and works when callback really will be
> > > executed only once, it is unsafe for all of our current helpers which
> > > are for_each style, i.e. they execute the callback multiple times.
> > >
> > > A callback releasing acquired references of the caller may do so
> > > multiple times, but currently verifier sees it as one call inside the
> > > frame, which then returns to caller. Hence, it thinks it released some
> > > reference that the cb e.g. got access to through callback_ctx (a register
> > > filled inside the cb from a spilled typed register on the stack).
> > >
> > > Similarly, it may see that an acquire call is unpaired inside the
> > > callback, so the caller will copy the reference state of callback and
> > > then will have to release the register with new ref_obj_ids. But again,
> > > the callback may execute multiple times, but the verifier will only
> > > account for acquired references for a single symbolic execution of the
> > > callback.
> > >
> > > Note that for async callback case, things are different. While currently
> > > we have bpf_timer_set_callback which only executes it once, even for
> > > multiple executions it would be safe, as reference state is NULL and
> > > check_reference_leak would force program to release state before
> > > BPF_EXIT. The state is also unaffected by analysis for the caller frame.
> > > Hence async callback is safe.
> > >
> > > To fix this, we disallow callbacks from transferring acquired references
> > > back to the caller. They must be released before the callback hits
> > > BPF_EXIT; since the number of times the callback is invoked is not known
> > > to the verifier, it cannot reliably track how many references will be
> > > created. Likewise, it
> > > is not allowed to release caller reference state, since we don't know
> > > how many times the callback will be invoked.
> > >
> > > Lastly, now that callback function cannot change reference state it
> > > copied from its parent, there is no need to copy reference state back to
> > > the parent, since it won't change. It may be changed for the callee
> > > frame but that state must match parent reference state by the time
> > > callee exits, and it is going to be discarded anyway. So skip this copy
> > > too. To be clear, it won't be incorrect if the copy was done, but it
> > > would be inefficient and may be confusing to people reading the code.
> > >
> > > Fixes: 69c087ba6225 ("bpf: Add bpf_for_each_map_elem() helper")
> > > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> > > ---
> > >  include/linux/bpf_verifier.h | 11 ++++++++++
> > >  kernel/bpf/verifier.c        | 42 ++++++++++++++++++++++++++++--------
> > >  2 files changed, 44 insertions(+), 9 deletions(-)
> > >
> > > diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> > > index 2e3bad8640dc..1fdddbf3546b 100644
> > > --- a/include/linux/bpf_verifier.h
> > > +++ b/include/linux/bpf_verifier.h
> > > @@ -212,6 +212,17 @@ struct bpf_reference_state {
> > >          * is used purely to inform the user of a reference leak.
> > >          */
> > >         int insn_idx;
> > > +       /* There can be a case like:
> > > +        * main (frame 0)
> > > +        *  cb (frame 1)
> > > +        *   func (frame 2)
> > > +        *    cb (frame 3)
> > > +        * Hence for frame 3, if callback_ref just stored a boolean, it would be
> > > +        * impossible to distinguish nested callback refs. Hence store the
> > > +        * frameno and compare that to callback_ref in check_reference_leak when
> > > +        * exiting a callback function.
> > > +        */
> > > +       int callback_ref;
> > >  };
> > >
> > >  /* state of the program:
> > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > index 096fdac70165..3e885ba88b02 100644
> > > --- a/kernel/bpf/verifier.c
> > > +++ b/kernel/bpf/verifier.c
> > > @@ -1086,6 +1086,7 @@ static int acquire_reference_state(struct bpf_verifier_env *env, int insn_idx)
> > >         id = ++env->id_gen;
> > >         state->refs[new_ofs].id = id;
> > >         state->refs[new_ofs].insn_idx = insn_idx;
> > > +       state->refs[new_ofs].callback_ref = state->in_callback_fn ? state->frameno : 0;
> > >
> > >         return id;
> > >  }
> > > @@ -1098,6 +1099,9 @@ static int release_reference_state(struct bpf_func_state *state, int ptr_id)
> > >         last_idx = state->acquired_refs - 1;
> > >         for (i = 0; i < state->acquired_refs; i++) {
> > >                 if (state->refs[i].id == ptr_id) {
> > > +                       /* Cannot release caller references in callbacks */
> > > +                       if (state->in_callback_fn && state->refs[i].callback_ref != state->frameno)
> > > +                               return -EINVAL;
> > >                         if (last_idx && i != last_idx)
> > >                                 memcpy(&state->refs[i], &state->refs[last_idx],
> > >                                        sizeof(*state->refs));
> > > @@ -6938,10 +6942,17 @@ static int prepare_func_exit(struct bpf_verifier_env *env, int *insn_idx)
> > >                 caller->regs[BPF_REG_0] = *r0;
> > >         }
> > >
> > > -       /* Transfer references to the caller */
> > > -       err = copy_reference_state(caller, callee);
> > > -       if (err)
> > > -               return err;
> > > +       /* callback_fn frame should have released its own additions to parent's
> > > +        * reference state at this point, or check_reference_leak would
> > > +        * complain, hence it must be the same as the caller. There is no need
> > > +        * to copy it back.
> > > +        */
> > > +       if (!callee->in_callback_fn) {
> > > +               /* Transfer references to the caller */
> > > +               err = copy_reference_state(caller, callee);
> > > +               if (err)
> > > +                       return err;
> > > +       }
> >
> > This part makes sense.
> >
> > >
> > >         *insn_idx = callee->callsite + 1;
> > >         if (env->log.level & BPF_LOG_LEVEL) {
> > > @@ -7065,13 +7076,20 @@ record_func_key(struct bpf_verifier_env *env, struct bpf_call_arg_meta *meta,
> > >  static int check_reference_leak(struct bpf_verifier_env *env)
> > >  {
> > >         struct bpf_func_state *state = cur_func(env);
> > > +       bool refs_lingering = false;
> > >         int i;
> > >
> > > +       if (state->frameno && !state->in_callback_fn)
> > > +               return 0;
> > > +
> > >         for (i = 0; i < state->acquired_refs; i++) {
> > > +               if (state->in_callback_fn && state->refs[i].callback_ref != state->frameno)
> > > +                       continue;
> >
> > This part I don't understand.
> > Why remember callback_ref at all?
> > if (state->in_callback_fn)
> > and there is something in acquired_refs it means
> > that callback acquired refs and since we're not transferring
> > them then we can error right away.
> >
> 
> Consider that a prog has 3 acquired_refs, with ids 1, 2, 3.
> 
> When you invoke the callback, you still want it to be able to load
> pointers spilled to the stack with a ref_obj_id and pass them to
> helpers. Hence, we do copy acquired_refs to the callback after
> initializing bpf_func_state.

Ahh. I see. We do copy_reference_state() and the verifier state in the caller's
frame is indeed accessible through callback_ctx pointer passed into the callee.

> However, since we don't know how many
> times it will be invoked, it is unsafe to allow it to acquire refs it
> doesn't release before its bpf_exit, or to release the caller's refs.
> The former can leak references, since the verifier only accounts for
> refs acquired in a single symbolic iteration, and likewise for refs
> released in a single iteration.

Right.

> 
> Hence, this patch completely disallows the callback from mutating the
> ref state it copied from its parent. It can however acquire its own
> refs and release them before bpf_exit. check_reference_leak now
> errors if we are in a callback_fn and have not released the
> callback's own refs. Since there can be multiple nested callbacks,
> like frame 0 -> cb1 -> cb2, etc., we also need to distinguish whether
> a particular ref belongs to this callback frame or to a parent frame,
> and only error for our own, so we store state->frameno (which is
> always non-zero for callbacks).

Right.

> TL;DR: callbacks can read the parent's reference_state, so they can
> use pointers acquired by the caller, but cannot mutate it. They must
> undo their own changes (by releasing their own acquired_refs before
> bpf_exit) on top of the caller's reference_state before returning, at
> which point the caller and callback states match anyway, so there is
> no need to copy back.

Please add all these details to the commit log.


* Re: [PATCH RFC bpf v1 0/3] Verifier callback handling
  2022-08-15  5:15 [PATCH RFC bpf v1 0/3] Verifier callback handling Kumar Kartikeya Dwivedi
                   ` (2 preceding siblings ...)
  2022-08-15  5:15 ` [PATCH RFC bpf v1 3/3] selftests/bpf: Add tests for callbacks fixes Kumar Kartikeya Dwivedi
@ 2022-08-24 23:18 ` Andrii Nakryiko
  2022-09-06  0:19   ` Kumar Kartikeya Dwivedi
  3 siblings, 1 reply; 9+ messages in thread
From: Andrii Nakryiko @ 2022-08-24 23:18 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Alexei Starovoitov, Yonghong Song, Andrii Nakryiko, Daniel Borkmann

On Sun, Aug 14, 2022 at 10:15 PM Kumar Kartikeya Dwivedi
<memxor@gmail.com> wrote:
>
> Hi, these are fixes for issues I noticed while working on the callback code.
> It is an RFC because I would like to discuss how to fix one other problem,
> I have a rough idea but would like to discuss before working on it, since
> it isn't trivial. The conservative fix doesn't pass existing selftests.
>
> It seems like the problem anticipated in [0] actually exists with all the
> current callback based loop helpers, in a different form. The main reason is
> that callback function is only explored as if it was executing once.
>
> There are some example cases included in selftests to show the problem, and make
> sure we keep catching and rejecting them. Since the callback that forms the
> loop body is only verified once, when the for_each helper returns, the
> execution state from the verifier's perspective is as if the callback
> executed once, though of course it may execute multiple times, a number
> that is statically unknowable.
>
> Patch 1 fixes the case of acquiring and releasing references; unless I
> missed a corner case, it seems to mostly work fine.
>
> In the future, when we have a case where a synchronous callback will only be
> executed once by a helper, we can improve this by relaxing these
> restrictions, allowing transfer of references and verifying the callback as
> if it executes only once. For now, all in_callback_fn functions get executed
> multiple times, hence that change is left for when it is needed.
>
> Now, the important part:
> ------------------------
>
> The included selftest will fail/crash, as one fix is missing from the series.
> Whatever I tried so far wasn't backwards compatible with existing selftests
> (e.g. conservatively marking all reads as UNKNOWN, and all writes scrubbing
> stack slots). Some tests e.g. read skb pointer from ptr_to_stack passed to
> callback.
>
> It is not clear to me what the correct fix would be to handle reads and writes
> from the stack inside callback function. If a stack slot is never touched inside
> the callback in any of the states, it should be safe to allow it to load spilled
> registers from caller's stack frame, even allowing precise scalars would be safe
> if they are read from untouched slots.
>
> The idea I have is roughly:
>
> In the first pass, verify as if the callback executes once. This is the same
> as the current behavior; the callback may e.g. load a spilled precise scalar,
> then overwrite it, so insns before the overwriting store are verified with
> the register considered precise. In reality, in later loop iterations this
> assumption may be wrong.
>
> For example:
> int i = 1;
> int arr[2] = {0};
>
> bpf_loop(100, cb, &i, 0);
>
> cb:
>         int i = *ctx;            // seen as 1
>         func(*((int *)ctx - i)); // stack read ok, arr[1]
>         *ctx = i + 1;            // i is now 2
>
> verification ends here, but this would start reading uninit stack in later
> iterations.
>
> To fix this, we keep a bitmap of scratched slots for the stack in caller frame.
> Once we see BPF_EXIT, we do reverification of the callback, but this time mark
> _any_ register reading from slots that were scratched the first time around as
> mark_reg_unknown (i.e. not precise). Moreover, in the caller stack it will be
> marked STACK_MISC. It might not be able to track <8 byte writes but I guess
> that is fine (since that would require 512 bits instead of 64).
>
> The big problem I can see with this is when the precise register guards a
> write in another branch that was not taken the first time around. In that
> case push_stack is not done to explore the other branch, and
> is_branch_taken is used to predict the outcome.
>
> But: for this to cause a problem, the precise reg guarding the branch must
> change its value (and we can detect that if it is read from a scratched
> slot), so doing another pass would get rid of this problem. I.e. one more
> pass would be needed after downgrading precise registers to unknown.
>
> If I can get some hints for new ideas or feedback on whether this would be the
> correct approach, I can work on a proper fix for this case.
>
> It seems like roughly the same approach will be required for open coded
> iteration mentioned in [0], so we might be able to kill two birds with one shot.
>

Keep in mind that callbacks, just like open-coded iteration loops
(which I plan to work on as soon as I dig myself out of BPF mailing
list email hole), might be executed not just 1 or >1 times; they might
be executed exactly zero times (e.g., pass invalid arguments to
bpf_loop or any other such helper and your callback won't be
executed). And as such, in principle, any write to a new stack slot
and marking it as "initialized" is invalid.

So generally only state modifications that keep state with equivalent
(and not more) amount of information should be allowed. I.e., writing
to an already written stack slot and not adding more specific
knowledge to known register state should be ok regardless if callback
was called 0, 1, or N times. E.g., if we knew that r1 is [100, 200]
precise range and inside the callback we update that to [150, 175] --
it's fine. We just need to restore original [100, 200] state once we
are done validating callback. If parent's stack was non-precise and
internally callback makes it precise for internal use -- that's also
fine, but we need to restore that slot to original imprecise state
after callback validation. Similarly, if stack slot was uninitialized,
it's ok if callback uses that slot for something, but we should forget
that something once callback is done.
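A minimal sketch of this containment-and-restore rule (all names hypothetical,
not verifier code): the callback may narrow a scalar's known range but never
widen it, and the pre-callback state is what the caller resumes with:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a scalar register's known value range */
struct scalar_range {
	long min;
	long max;
};

/* A callback may narrow knowledge ([100, 200] -> [150, 175]) but never
 * widen or shift it: the post-state must be contained in the pre-state.
 */
static bool range_is_contained(const struct scalar_range *post,
			       const struct scalar_range *pre)
{
	return post->min >= pre->min && post->max <= pre->max;
}

/* After validating the callback body, the caller continues with the
 * original (pre-callback) range, since the callback may have run zero
 * times and its narrowing must not leak out.
 */
static struct scalar_range after_callback(struct scalar_range pre,
					  struct scalar_range post,
					  bool *ok)
{
	*ok = range_is_contained(&post, &pre);
	return pre; /* restore: discard the callback's refinement */
}
```

Nothing here is verifier API; it only illustrates why narrowing is acceptable
while any widening would let knowledge escape a callback that may have run
zero times.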

Iterators will also have an additional restriction that we don't
actually create a new frame for each inner loop (so what you did for
callbacks in this patch set won't be usable as-is directly, but it's
fine).


In short, possibility of zero executions makes everything much
clearer. We can't assume we know what callback writes into parent's
stack (as this write might never happen). As to how to make this a bit
more practical, we'll need to work this out in a bit more details, I
haven't spent too much time on this yet.

But I think the approach (which you are right will be conceptually
very similar to open-coded iterators) should be to do two passes:
  a) pass that assumes no callback happened and no parent function's
stack modifications happened. Program should still be correctly
verifiable (and as such, any program that assumes that callback sets
some stack slot to some known value from inside the callback is
invalid; that slot has to be initialized with some default value).
  b) pass that simulates exactly 1 execution. Once we are done with
this, let's remember this state.
  c) third pass that simulates N executions. We pass state from step
b) as input state. We validate with that state. Assuming it's
successful, we end up with some new state of the parent function. At
this point we should check that end state from step b) and state from
this step c) are "equivalent" (according to my explanation above,
there should be no expansion of knowledge of the state).


The latter check is similar to what we do for BPF infinite loop
detection. The fun part is that here we actually want to see that it
would cause an infinite loop (states shouldn't differ). At this point
we can just assume that it's fine to execute N callbacks, pop state
c), go back to state b), and continue parent program verification.
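The b)/c) part of the plan can be loosely sketched as a fixed-point check
(a hypothetical model that ignores pass a) and real verifier state; all
names are made up for illustration):

```c
#include <stdbool.h>
#include <string.h>

#define SLOTS 4

/* Hypothetical model: verifier state as per-slot knowledge tags */
enum slot_state { SLOT_INVALID, SLOT_MISC, SLOT_SPILL };

struct vstate {
	enum slot_state slot[SLOTS];
};

static bool states_equal(const struct vstate *a, const struct vstate *b)
{
	return memcmp(a->slot, b->slot, sizeof(a->slot)) == 0;
}

/* Simulate one callback run on top of the input state, remember the
 * result, simulate another run, and require a fixed point (no state
 * difference) before resuming the caller with the once-executed state.
 */
static bool validate_callback(struct vstate *caller,
			      void (*run_cb)(struct vstate *))
{
	struct vstate once = *caller, twice;

	run_cb(&once);          /* step b): exactly one execution */
	twice = once;
	run_cb(&twice);         /* step c): one more, must not differ */
	if (!states_equal(&once, &twice))
		return false;   /* knowledge kept changing: reject */
	*caller = once;
	return true;
}

/* A stable callback: overwrites slot 0 with MISC on every run */
static void cb_stable(struct vstate *st)
{
	st->slot[0] = SLOT_MISC;
}

/* A callback that initializes a fresh slot each run: not a fixed point */
static void cb_unstable(struct vstate *st)
{
	int i;

	for (i = 0; i < SLOTS; i++) {
		if (st->slot[i] == SLOT_INVALID) {
			st->slot[i] = SLOT_MISC;
			return;
		}
	}
}
```

A real implementation would of course compare full bpf_verifier_state with
the existing equivalence machinery; this only shows the shape of the check.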


I might be missing some details, but that's a rough idea and a plan I have.



> Thanks!
>
>   [0]: https://lore.kernel.org/bpf/33123904-5719-9e93-4af2-c7d549221520@fb.com
>
> Kumar Kartikeya Dwivedi (3):
>   bpf: Move bpf_loop and bpf_for_each_map_elem under CAP_BPF
>   bpf: Fix reference state management for synchronous callbacks
>   selftests/bpf: Add tests for callbacks fixes
>
>  include/linux/bpf_verifier.h                  |  11 ++
>  kernel/bpf/helpers.c                          |   8 +-
>  kernel/bpf/verifier.c                         |  42 +++--
>  .../selftests/bpf/prog_tests/cb_refs.c        |  49 ++++++
>  tools/testing/selftests/bpf/progs/cb_refs.c   | 152 ++++++++++++++++++
>  5 files changed, 249 insertions(+), 13 deletions(-)
>  create mode 100644 tools/testing/selftests/bpf/prog_tests/cb_refs.c
>  create mode 100644 tools/testing/selftests/bpf/progs/cb_refs.c
>
> --
> 2.34.1
>


* Re: [PATCH RFC bpf v1 0/3] Verifier callback handling
  2022-08-24 23:18 ` [PATCH RFC bpf v1 0/3] Verifier callback handling Andrii Nakryiko
@ 2022-09-06  0:19   ` Kumar Kartikeya Dwivedi
  0 siblings, 0 replies; 9+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-09-06  0:19 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Alexei Starovoitov, Yonghong Song, Andrii Nakryiko, Daniel Borkmann

On Thu, 25 Aug 2022 at 01:18, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
> On Sun, Aug 14, 2022 at 10:15 PM Kumar Kartikeya Dwivedi
> <memxor@gmail.com> wrote:
> >
> > Hi, these are fixes for issues I noticed while working on the callback code.
> > It is an RFC because I would like to discuss how to fix one other problem,
> > I have a rough idea but would like to discuss before working on it, since
> > it isn't trivial. The conservative fix doesn't pass existing selftests.
> >
> > It seems like the problem anticipated in [0] actually exists with all the
> > current callback based loop helpers, in a different form. The main reason is
> > that callback function is only explored as if it was executing once.
> >
> > There are some example cases included in selftests to show the problem, and make
> > sure we keep catching and rejecting them. Since the callback that forms the
> > loop body is only verified once, when the for_each helper returns, the
> > execution state from the verifier's perspective is as if the callback
> > executed once, though of course it may execute multiple times, a number
> > that is statically unknowable.
> >
> > Patch 1 fixes the case of acquiring and releasing references; unless I
> > missed a corner case, it seems to mostly work fine.
> >
> > In the future, when we have a case where a synchronous callback will only be
> > executed once by a helper, we can improve this by relaxing these
> > restrictions, allowing transfer of references and verifying the callback as
> > if it executes only once. For now, all in_callback_fn functions get executed
> > multiple times, hence that change is left for when it is needed.
> >
> > Now, the important part:
> > ------------------------
> >
> > The included selftest will fail/crash, as one fix is missing from the series.
> > Whatever I tried so far wasn't backwards compatible with existing selftests
> > (e.g. conservatively marking all reads as UNKNOWN, and all writes scrubbing
> > stack slots). Some tests e.g. read skb pointer from ptr_to_stack passed to
> > callback.
> >
> > It is not clear to me what the correct fix would be to handle reads and writes
> > from the stack inside callback function. If a stack slot is never touched inside
> > the callback in any of the states, it should be safe to allow it to load spilled
> > registers from caller's stack frame, even allowing precise scalars would be safe
> > if they are read from untouched slots.
> >
> > The idea I have is roughly:
> >
> > In the first pass, verify as if the callback executes once. This is the same
> > as the current behavior; the callback may e.g. load a spilled precise scalar,
> > then overwrite it, so insns before the overwriting store are verified with
> > the register considered precise. In reality, in later loop iterations this
> > assumption may be wrong.
> >
> > For example:
> > int i = 1;
> > int arr[2] = {0};
> >
> > bpf_loop(100, cb, &i, 0);
> >
> > cb:
> >         int i = *ctx;            // seen as 1
> >         func(*((int *)ctx - i)); // stack read ok, arr[1]
> >         *ctx = i + 1;            // i is now 2
> >
> > verification ends here, but this would start reading uninit stack in later
> > iterations.
> >
> > To fix this, we keep a bitmap of scratched slots for the stack in caller frame.
> > Once we see BPF_EXIT, we do reverification of the callback, but this time mark
> > _any_ register reading from slots that were scratched the first time around as
> > mark_reg_unknown (i.e. not precise). Moreover, in the caller stack it will be
> > marked STACK_MISC. It might not be able to track <8 byte writes but I guess
> > that is fine (since that would require 512 bits instead of 64).
> >
> > The big problem I can see with this is when the precise register guards a
> > write in another branch that was not taken the first time around. In that
> > case push_stack is not done to explore the other branch, and
> > is_branch_taken is used to predict the outcome.
> >
> > But: for this to cause a problem, the precise reg guarding the branch must
> > change its value (and we can detect that if it is read from a scratched
> > slot), so doing another pass would get rid of this problem. I.e. one more
> > pass would be needed after downgrading precise registers to unknown.
> >
> > If I can get some hints for new ideas or feedback on whether this would be the
> > correct approach, I can work on a proper fix for this case.
> >
> > It seems like roughly the same approach will be required for the open
> > coded iteration mentioned in [0], so we might be able to kill two birds
> > with one stone.
> >
>
> Keep in mind that callbacks, just like open-coded iteration loops
> (which I plan to work on as soon as I dig myself out of BPF mailing
> list email hole), might be executed not just 1 or >1 times, they might
> be executed exactly zero times (e.g., pass invalid arguments to
> bpf_loop or any other such helper and your callback won't be
> executed). And as such, in principle, any write to a new stack slot
> that marks it as "initialized" is invalid.
>
> So generally only state modifications that keep state with equivalent
> (and not more) amount of information should be allowed. I.e., writing
> to an already written stack slot and not adding more specific
> knowledge to known register state should be ok regardless if callback
> was called 0, 1, or N times. E.g., if we knew that r1 is [100, 200]
> precise range and inside the callback we update that to [150, 175] --
> it's fine. We just need to restore original [100, 200] state once we
> are done validating callback. If parent's stack was non-precise and
> internally callback makes it precise for internal use -- that's also
> fine, but we need to restore that slot to original imprecise state
> after callback validation. Similarly, if stack slot was uninitialized,
> it's ok if callback uses that slot for something, but we should forget
> that something once callback is done.
>
> Iterators will also have an additional restriction that we don't
> actually create a new frame for each inner loop (so what you did for
> callbacks in this patch set won't be usable as-is directly, but it's
> fine).
>
>
> In short, the possibility of zero executions makes everything much
> clearer. We can't assume we know what the callback writes into parent's

Indeed, the possibility of 0 callback executions seems like a significant insight.

There needs to be an exploration path that skips the loop body
entirely, and we need to check whether the rest of the program passes
verification in that case.

It seems that case could be handled even right now? It is like how we
handle branches: push_stack a state pointing to the first instruction
of the subprog in one exploration path (which simulates one execution
and continues back in the parent from bpf_exit), while for the zero
callback case you continue verifying after the callback helper call
insn.

It seems orthogonal to the N-times case. So maybe we should fix that
now while thinking about how to handle N explorations? Thoughts?

Right now, we verify the callback as if it executes once. So the
callback may e.g. spill some register variable and mark it precise
(bounded, or even constant). Then we verify the rest of the program
using that precise scalar. But it could happen that the callback never
runs, and the program simply uses the unbounded scalar thinking it is
bounded.

> stack (as this write might never happen). As to how to make this a bit
> more practical, we'll need to work this out in a bit more detail; I
> haven't spent too much time on this yet.
>
> But I think the approach (which you are right will be conceptually
> very similar to open-coded iterators) should be to do two passes:

You mean three, right? ;-)

>   a) pass that assumes no callback happened and no parent function's
> stack modifications happened. Program should still be correctly
> verifiable (and as such, any program that assumes that callback sets
> some stack slot to some known value from inside the callback is
> invalid; that slot has to be initialized with some default value).
>   b) pass that simulates exactly 1 execution. Once we are done with
> this, let's remember this state.
>   c) third pass that simulates N executions. We pass state from step
> b) as input state. We validate with that state. Assuming it's
> successful, we end up with some new state of the parent function. At
> this point we should check that end state from step b) and state from
> this step c) are "equivalent" (according to my explanation above,
> there should be no expansion of knowledge of the state).
>
>
> The latter step is similar to what we do for BPF infinite loop
> detection. The fun part is that here we actually want to see what
> would otherwise look like an infinite loop (states shouldn't differ).
> At that point we can assume that it's fine to execute N callbacks, pop
> state c), go back to state b), and continue parent program
> verification.
>
>
> I might be missing some details, but that's a rough idea and a plan I have.
>

Thanks for articulating your ideas. Looking forward to the patches.
It does seem like checking whether states will be equal should be
enough, even though it might not allow all kinds of loops.
We already ensure when pruning the search that the new state is 'safe'
given an old, already-explored safe state, so this should still allow
some state modification inside the loop.
Choosing the number of times to repeat this process will be the hard
part: some value that isn't too conservative, yet not so high that it
ends up consuming too much time and memory during verification, since
we might not start seeing equivalent states immediately on the 2nd
exploration. Maybe the callback helpers themselves can give hints to
tune the value of N. Another improvement might be identifying the
register involved in the condition used to break out of the loop; if
it is a scalar, we could learn its bounds from the branch containing
the bpf_exit with R0 set to the callback return value, which might
allow exploring complicated cases without bounding to a fixed N limit
in the verifier.


Thread overview: 9+ messages
2022-08-15  5:15 [PATCH RFC bpf v1 0/3] Verifier callback handling Kumar Kartikeya Dwivedi
2022-08-15  5:15 ` [PATCH RFC bpf v1 1/3] bpf: Move bpf_loop and bpf_for_each_map_elem under CAP_BPF Kumar Kartikeya Dwivedi
2022-08-15  5:15 ` [PATCH RFC bpf v1 2/3] bpf: Fix reference state management for synchronous callbacks Kumar Kartikeya Dwivedi
2022-08-19  0:51   ` Alexei Starovoitov
2022-08-19  5:31     ` Kumar Kartikeya Dwivedi
2022-08-19 18:38       ` Alexei Starovoitov
2022-08-15  5:15 ` [PATCH RFC bpf v1 3/3] selftests/bpf: Add tests for callbacks fixes Kumar Kartikeya Dwivedi
2022-08-24 23:18 ` [PATCH RFC bpf v1 0/3] Verifier callback handling Andrii Nakryiko
2022-09-06  0:19   ` Kumar Kartikeya Dwivedi
