* [PATCH bpf v2 0/4] BPF fixes mixed tail and bpf2bpf calls
@ 2021-06-16 22:54 John Fastabend
  2021-06-16 22:55 ` [PATCH bpf v2 1/4] bpf: Fix null ptr deref with mixed tail calls and subprogs John Fastabend
                   ` (4 more replies)
  0 siblings, 5 replies; 12+ messages in thread
From: John Fastabend @ 2021-06-16 22:54 UTC (permalink / raw)
  To: maciej.fijalkowski, ast, daniel, andriin; +Cc: john.fastabend, netdev

We recently tried to use programs that mix tail calls and subprograms,
but doing so requires the attached fixes.

Also added a new test case, tailcall_bpf2bpf_5, that simply runs the
previous test case tailcall_bpf2bpf_4 and adds some "noise". The
noise here is just a bunch of map calls that get the verifier to insert
instructions and cause code movement, and it also forces the used_maps
logic to be exercised. Originally I extended bpf2bpf_4 directly, but if
I read the feedback correctly the preference is to have a separate
test case for this.

With the attached patches our programs happily run with mixed
subprograms and tail calls.

Thanks,
John

---

John Fastabend (4):
      bpf: Fix null ptr deref with mixed tail calls and subprogs
      bpf: map_poke_descriptor is being called with an unstable poke_tab[]
      bpf: track subprog poke correctly
      bpf: selftest to verify mixing bpf2bpf calls and tailcalls with insn patch


 include/linux/bpf.h                           |  1 +
 kernel/bpf/core.c                             |  6 ++--
 kernel/bpf/verifier.c                         | 36 ++++++++++++++-----
 .../selftests/bpf/prog_tests/tailcalls.c      | 36 +++++++++++++------
 .../selftests/bpf/progs/tailcall_bpf2bpf4.c   | 20 ++++++++++-
 5 files changed, 77 insertions(+), 22 deletions(-)

--



* [PATCH bpf v2 1/4] bpf: Fix null ptr deref with mixed tail calls and subprogs
  2021-06-16 22:54 [PATCH bpf v2 0/4] BPF fixes mixed tail and bpf2bpf calls John Fastabend
@ 2021-06-16 22:55 ` John Fastabend
  2021-06-16 22:55 ` [PATCH bpf v2 2/4] bpf: map_poke_descriptor is being called with an unstable poke_tab[] John Fastabend
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: John Fastabend @ 2021-06-16 22:55 UTC (permalink / raw)
  To: maciej.fijalkowski, ast, daniel, andriin; +Cc: john.fastabend, netdev

The sub-program's prog->aux->poke_tab[] is populated in jit_subprogs() and
then used when the individual JITs emit the 'BPF_JMP|BPF_TAIL_CALL'
insn->code. The poke_tab[] slot to use is stored in insn->imm by the code
that adds the entry to that array slot. The JIT then uses imm to find the
right entry for an individual instruction. In the x86 bpf_jit_comp.c this
is done by calling emit_bpf_tail_call_direct() with the poke_tab[] entry
selected by the imm value.
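
To make the indexing concrete, the x86 JIT side of that lookup is
roughly the following (paraphrased sketch, not the verbatim upstream
code; note imm holds the descriptor index plus one, matching the
insn->imm = ret + 1 assignment in jit_subprogs()):

        case BPF_JMP | BPF_TAIL_CALL:
                if (imm32)
                        /* direct tail call, poke descriptor picked via imm */
                        emit_bpf_tail_call_direct(&bpf_prog->aux->poke_tab[imm32 - 1],
                                                  ...);
                else
                        emit_bpf_tail_call_indirect(...);
                break;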

However, we observed the null-ptr-deref below when mixing tail call
programs with subprog programs. For this to happen we just need to
mix bpf2bpf calls and tailcalls with some extra calls or instructions
that will later be patched by one of the fixup routines. So what's
happening?

Before fixup_call_args() -- where the JIT op is done -- various
code patching is done by do_misc_fixups(). This may increase the
insn count, for example when we patch map_lookup_elem using the
map_gen_lookup hook. This does two things. First, it means the
instruction index (the insn_idx field) of a tail call instruction
will move by a 'delta'.

In verifier code,

 struct bpf_jit_poke_descriptor desc = {
  .reason = BPF_POKE_REASON_TAIL_CALL,
  .tail_call.map = BPF_MAP_PTR(aux->map_ptr_state),
  .tail_call.key = bpf_map_key_immediate(aux),
  .insn_idx = i + delta,
 };

Second, the subprog start values subprog_info[i].start are updated
with the delta, and any poke descriptor insn_idx is also updated
with the delta in adjust_poke_descs(). If we look at
adjust_subprog_starts() though, we see a subprog start is only
adjusted when the new instructions were inserted before it,

        /* NOTE: fake 'exit' subprog should be updated as well. */
        for (i = 0; i <= env->subprog_cnt; i++) {
                if (env->subprog_info[i].start <= off)
                        continue;

Subprograms that start before the patched instructions are not changed
because their start values do not move. But adjust_poke_descs() applies
the delta to every descriptor indiscriminately. The result is that poke
descriptors are potentially corrupted.

Then in jit_subprogs() we only copy a poke descriptor into a
subprogram's poke_tab[] when its insn_idx falls inside that
subprogram's instruction range. Because we corrupted the insn_idx
above, we may incorrectly conclude a poke descriptor is not used in a
subprogram and omit it from that subprogram. Finally, when the JIT runs
it dereferences poke_tab while emitting the tail call instruction and
crashes as shown below, because the earlier step omitted the descriptor.
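
The selection logic in jit_subprogs() that a corrupted insn_idx defeats
looks roughly like this (paraphrased sketch of the code added by the
commit in the Fixes tag; variable names may differ slightly):

        for (j = 0; j < prog->aux->size_poke_tab; j++) {
                u32 insn_idx = prog->aux->poke_tab[j].insn_idx;

                /* Only descriptors that land inside this subprog's
                 * instruction range are copied; a corrupted insn_idx
                 * fails the check and the descriptor is silently dropped.
                 */
                if (insn_idx < subprog_start || insn_idx > subprog_end)
                        continue;

                ret = bpf_jit_add_poke_descriptor(func[i],
                                                  &prog->aux->poke_tab[j]);
                ...
        }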

With the above context the fix is straightforward. Simply move the same
check from adjust_subprog_starts() into adjust_poke_descs() and only
adjust insn_idx when needed.

[   82.396354] bpf_testmod: version magic '5.12.0-rc2alu+ SMP preempt mod_unload ' should be '5.12.0+ SMP preempt mod_unload '
[   82.623001] loop10: detected capacity change from 0 to 8
[   88.487424] ==================================================================
[   88.487438] BUG: KASAN: null-ptr-deref in do_jit+0x184a/0x3290
[   88.487455] Write of size 8 at addr 0000000000000008 by task test_progs/5295
[   88.487471] CPU: 7 PID: 5295 Comm: test_progs Tainted: G          I       5.12.0+ #386
[   88.487483] Hardware name: Dell Inc. Precision 5820 Tower/002KVM, BIOS 1.9.2 01/24/2019
[   88.487490] Call Trace:
[   88.487498]  dump_stack+0x93/0xc2
[   88.487515]  kasan_report.cold+0x5f/0xd8
[   88.487530]  ? do_jit+0x184a/0x3290
[   88.487542]  do_jit+0x184a/0x3290
 ...
[   88.487709]  bpf_int_jit_compile+0x248/0x810
 ...
[   88.487765]  bpf_check+0x3718/0x5140
 ...
[   88.487920]  bpf_prog_load+0xa22/0xf10

Reported-by: Jussi Maki <joamaki@gmail.com>
CC: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Fixes: a748c6975dea3 ("bpf: propagate poke descriptors to subprograms")
Reviewed-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 kernel/bpf/verifier.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index c6a27574242d..6e2ebcb0d66f 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -11459,7 +11459,7 @@ static void adjust_subprog_starts(struct bpf_verifier_env *env, u32 off, u32 len
 	}
 }
 
-static void adjust_poke_descs(struct bpf_prog *prog, u32 len)
+static void adjust_poke_descs(struct bpf_prog *prog, u32 off, u32 len)
 {
 	struct bpf_jit_poke_descriptor *tab = prog->aux->poke_tab;
 	int i, sz = prog->aux->size_poke_tab;
@@ -11467,6 +11467,8 @@ static void adjust_poke_descs(struct bpf_prog *prog, u32 len)
 
 	for (i = 0; i < sz; i++) {
 		desc = &tab[i];
+		if (desc->insn_idx <= off)
+			continue;
 		desc->insn_idx += len - 1;
 	}
 }
@@ -11487,7 +11489,7 @@ static struct bpf_prog *bpf_patch_insn_data(struct bpf_verifier_env *env, u32 of
 	if (adjust_insn_aux_data(env, new_prog, off, len))
 		return NULL;
 	adjust_subprog_starts(env, off, len);
-	adjust_poke_descs(new_prog, len);
+	adjust_poke_descs(new_prog, off, len);
 	return new_prog;
 }
 




* [PATCH bpf v2 2/4] bpf: map_poke_descriptor is being called with an unstable poke_tab[]
  2021-06-16 22:54 [PATCH bpf v2 0/4] BPF fixes mixed tail and bpf2bpf calls John Fastabend
  2021-06-16 22:55 ` [PATCH bpf v2 1/4] bpf: Fix null ptr deref with mixed tail calls and subprogs John Fastabend
@ 2021-06-16 22:55 ` John Fastabend
  2021-06-16 23:42   ` John Fastabend
  2021-06-22 21:54   ` Alexei Starovoitov
  2021-06-16 22:55 ` [PATCH bpf v2 3/4] bpf: track subprog poke correctly John Fastabend
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 12+ messages in thread
From: John Fastabend @ 2021-06-16 22:55 UTC (permalink / raw)
  To: maciej.fijalkowski, ast, daniel, andriin; +Cc: john.fastabend, netdev

When populating the poke_tab[] of a subprog we call map_poke_track()
after doing bpf_jit_add_poke_descriptor(). But bpf_jit_add_poke_descriptor()
may, and likely will, realloc the poke_tab[] array and free the old one,
so prog->aux->poke_tab is not stable. However, the aux pointer is
referenced from bpf_array_aux and poke_tab[] is used to 'track' the
prog<->map link. This way, when a prog is released the entry in the map
is dropped, and vice versa, when the map is released we don't drop it
too soon if a prog is in the process of calling it.
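
For reference, bpf_jit_add_poke_descriptor() grows the table roughly as
below (simplified sketch of the kernel/bpf/core.c helper, not the exact
upstream code), which is why the old poke_tab pointer can be freed out
from under us:

        int bpf_jit_add_poke_descriptor(struct bpf_prog *prog,
                                        struct bpf_jit_poke_descriptor *poke)
        {
                struct bpf_jit_poke_descriptor *tab = prog->aux->poke_tab;
                int slot = prog->aux->size_poke_tab;
                ...
                /* krealloc may return a new allocation and free the old one */
                tab = krealloc(tab, (slot + 1) * sizeof(*poke), GFP_KERNEL);
                if (!tab)
                        return -ENOMEM;

                memcpy(&tab[slot], poke, sizeof(*poke));
                prog->aux->size_poke_tab = slot + 1;
                prog->aux->poke_tab = tab;
                return slot;
        }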

I wasn't able to trigger any errors here, for example having
map_poke_run() run with a poke_tab[] pointer that was freed by
bpf_jit_add_poke_descriptor(), but it looks possible and at the very
least is very fragile.

This patch moves the map_poke_track() call out of the loop that calls
bpf_jit_add_poke_descriptor(), so that we only ever add stable
aux->poke_tab pointers to the map's bpf_array_aux struct. Further, we
need this in the next patch to fix a real bug where progs are not
'untracked'.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 kernel/bpf/verifier.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 6e2ebcb0d66f..066fac9b5460 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -12126,8 +12126,12 @@ static int jit_subprogs(struct bpf_verifier_env *env)
 			}
 
 			func[i]->insnsi[insn_idx - subprog_start].imm = ret + 1;
+		}
 
-			map_ptr = func[i]->aux->poke_tab[ret].tail_call.map;
+		for (j = 0; j < func[i]->aux->size_poke_tab; j++) {
+			int ret;
+
+			map_ptr = func[i]->aux->poke_tab[j].tail_call.map;
 			ret = map_ptr->ops->map_poke_track(map_ptr, func[i]->aux);
 			if (ret < 0) {
 				verbose(env, "tracking tail call prog failed\n");




* [PATCH bpf v2 3/4] bpf: track subprog poke correctly
  2021-06-16 22:54 [PATCH bpf v2 0/4] BPF fixes mixed tail and bpf2bpf calls John Fastabend
  2021-06-16 22:55 ` [PATCH bpf v2 1/4] bpf: Fix null ptr deref with mixed tail calls and subprogs John Fastabend
  2021-06-16 22:55 ` [PATCH bpf v2 2/4] bpf: map_poke_descriptor is being called with an unstable poke_tab[] John Fastabend
@ 2021-06-16 22:55 ` John Fastabend
  2021-06-22 22:00   ` Alexei Starovoitov
  2021-06-16 22:55 ` [PATCH bpf v2 4/4] bpf: selftest to verify mixing bpf2bpf calls and tailcalls with insn patch John Fastabend
  2021-06-17  4:30 ` [PATCH bpf v2 0/4] BPF fixes mixed tail and bpf2bpf calls Andrii Nakryiko
  4 siblings, 1 reply; 12+ messages in thread
From: John Fastabend @ 2021-06-16 22:55 UTC (permalink / raw)
  To: maciej.fijalkowski, ast, daniel, andriin; +Cc: john.fastabend, netdev

Subprograms call map_poke_track(), but there is no hook to call
map_poke_untrack() on program release. So on prog release the aux memory
is freed even though we still have a reference to it in the element list
of the map's aux data.

When we then run map_poke_run() we end up accessing freed memory. KASAN
catches this in prog_array_map_poke_run() as shown here.

[  402.824686] ==================================================================
[  402.824689] BUG: KASAN: use-after-free in prog_array_map_poke_run+0xc2/0x34e
[  402.824698] Read of size 4 at addr ffff8881905a7940 by task hubble-fgs/4337

[  402.824705] CPU: 1 PID: 4337 Comm: hubble-fgs Tainted: G          I       5.12.0+ #399
[  402.824715] Call Trace:
[  402.824719]  dump_stack+0x93/0xc2
[  402.824727]  print_address_description.constprop.0+0x1a/0x140
[  402.824736]  ? prog_array_map_poke_run+0xc2/0x34e
[  402.824740]  ? prog_array_map_poke_run+0xc2/0x34e
[  402.824744]  kasan_report.cold+0x7c/0xd8
[  402.824752]  ? prog_array_map_poke_run+0xc2/0x34e
[  402.824757]  prog_array_map_poke_run+0xc2/0x34e
[  402.824765]  bpf_fd_array_map_update_elem+0x124/0x1a0

The elements concerned are walked like this,

                for (i = 0; i < elem->aux->size_poke_tab; i++) {
                        poke = &elem->aux->poke_tab[i];

So the access to size_poke_tab is the 4B read, verified by checking offsets
in the KASAN dump,

[  402.825004] The buggy address belongs to the object at ffff8881905a7800
                which belongs to the cache kmalloc-1k of size 1024
[  402.825008] The buggy address is located 320 bytes inside of
                1024-byte region [ffff8881905a7800, ffff8881905a7c00)

With pahole output,

 struct bpf_prog_aux {
 ...
        /* --- cacheline 5 boundary (320 bytes) --- */
        u32                        size_poke_tab;        /*   320     4 */
 ...

To fix this, track the map references with a per-subprogram used_maps
array and used_map_cnt value that hold references on the maps, so that
when the subprogram is released we can untrack from the correct map
using the correct aux field.

Here we are slightly less than optimal because we insert all poke entries
into the used_maps array, even duplicates. In theory we could get by with
only unique entries, but that would require an extra collect-the-maps
stage and seems unnecessary when duplicates only cost an extra 8B each.
It also keeps the logic simpler, and the untrack hook is happy to ignore
entries that were previously removed.
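
Reusing bpf_free_used_maps() on the subprog aux gives us the untrack for
free, because __bpf_free_used_maps() already calls the untrack hook for
each used map before dropping its reference, roughly (simplified sketch
of the existing kernel/bpf/core.c helper):

        void __bpf_free_used_maps(struct bpf_prog_aux *aux,
                                  struct bpf_map **used_maps, u32 len)
        {
                struct bpf_map *map;
                u32 i;

                for (i = 0; i < len; i++) {
                        map = used_maps[i];
                        /* drop the prog<->map tracking entry ... */
                        if (map->ops->map_poke_untrack)
                                map->ops->map_poke_untrack(map, aux);
                        /* ... and the reference held on the map */
                        bpf_map_put(map);
                }
        }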

Reported-by: Jussi Maki <joamaki@gmail.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 include/linux/bpf.h   |    1 +
 kernel/bpf/core.c     |    6 ++++--
 kernel/bpf/verifier.c |   36 +++++++++++++++++++++++++-----------
 3 files changed, 30 insertions(+), 13 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 02b02cb29ce2..c037c67347c0 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1780,6 +1780,7 @@ static inline struct bpf_prog *bpf_prog_get_type(u32 ufd,
 	return bpf_prog_get_type_dev(ufd, type, false);
 }
 
+void bpf_free_used_maps(struct bpf_prog_aux *aux);
 void __bpf_free_used_maps(struct bpf_prog_aux *aux,
 			  struct bpf_map **used_maps, u32 len);
 
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 5e31ee9f7512..ce5bb8932958 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -2167,7 +2167,7 @@ void __bpf_free_used_maps(struct bpf_prog_aux *aux,
 	}
 }
 
-static void bpf_free_used_maps(struct bpf_prog_aux *aux)
+void bpf_free_used_maps(struct bpf_prog_aux *aux)
 {
 	__bpf_free_used_maps(aux, aux->used_maps, aux->used_map_cnt);
 	kfree(aux->used_maps);
@@ -2211,8 +2211,10 @@ static void bpf_prog_free_deferred(struct work_struct *work)
 #endif
 	if (aux->dst_trampoline)
 		bpf_trampoline_put(aux->dst_trampoline);
-	for (i = 0; i < aux->func_cnt; i++)
+	for (i = 0; i < aux->func_cnt; i++) {
+		bpf_free_used_maps(aux->func[i]->aux);
 		bpf_jit_free(aux->func[i]);
+	}
 	if (aux->func_cnt) {
 		kfree(aux->func);
 		bpf_prog_unlock_free(aux->prog);
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 066fac9b5460..31c0f3ad9626 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -12128,14 +12128,32 @@ static int jit_subprogs(struct bpf_verifier_env *env)
 			func[i]->insnsi[insn_idx - subprog_start].imm = ret + 1;
 		}
 
-		for (j = 0; j < func[i]->aux->size_poke_tab; j++) {
-			int ret;
+		/* overapproximate the number of map slots. Untrack will just skip
+		 * the lookup anyways and we avoid an extra layer of accounting.
+		 */
+		if (func[i]->aux->size_poke_tab) {
+			struct bpf_map **used_maps;
 
-			map_ptr = func[i]->aux->poke_tab[j].tail_call.map;
-			ret = map_ptr->ops->map_poke_track(map_ptr, func[i]->aux);
-			if (ret < 0) {
-				verbose(env, "tracking tail call prog failed\n");
+			used_maps = kmalloc_array(func[i]->aux->size_poke_tab,
+						  sizeof(struct bpf_map *),
+						  GFP_KERNEL);
+			if (!used_maps)
 				goto out_free;
+
+			func[i]->aux->used_maps = used_maps;
+
+			for (j = 0; j < func[i]->aux->size_poke_tab; j++) {
+				int ret;
+
+				map_ptr = func[i]->aux->poke_tab[j].tail_call.map;
+				ret = map_ptr->ops->map_poke_track(map_ptr, func[i]->aux);
+				if (ret < 0) {
+					verbose(env, "tracking tail call prog failed\n");
+					goto out_free;
+				}
+				bpf_map_inc(map_ptr);
+				func[i]->aux->used_map_cnt++;
+				func[i]->aux->used_maps[j] = map_ptr;
 			}
 		}
 
@@ -12259,11 +12277,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
 	for (i = 0; i < env->subprog_cnt; i++) {
 		if (!func[i])
 			continue;
-
-		for (j = 0; j < func[i]->aux->size_poke_tab; j++) {
-			map_ptr = func[i]->aux->poke_tab[j].tail_call.map;
-			map_ptr->ops->map_poke_untrack(map_ptr, func[i]->aux);
-		}
+		bpf_free_used_maps(func[i]->aux);
 		bpf_jit_free(func[i]);
 	}
 	kfree(func);




* [PATCH bpf v2 4/4] bpf: selftest to verify mixing bpf2bpf calls and tailcalls with insn patch
  2021-06-16 22:54 [PATCH bpf v2 0/4] BPF fixes mixed tail and bpf2bpf calls John Fastabend
                   ` (2 preceding siblings ...)
  2021-06-16 22:55 ` [PATCH bpf v2 3/4] bpf: track subprog poke correctly John Fastabend
@ 2021-06-16 22:55 ` John Fastabend
  2021-06-17  4:30 ` [PATCH bpf v2 0/4] BPF fixes mixed tail and bpf2bpf calls Andrii Nakryiko
  4 siblings, 0 replies; 12+ messages in thread
From: John Fastabend @ 2021-06-16 22:55 UTC (permalink / raw)
  To: maciej.fijalkowski, ast, daniel, andriin; +Cc: john.fastabend, netdev

This adds some extra noise to the tailcall_bpf2bpf4 tests that causes
the verifier to patch insns. This moves around the subprog start/end
insn indexes and the poke descriptor insn_idx to ensure that the
verifier and JIT continue to track these correctly.

If done correctly the verifier should pass this program the same as
before and the JIT should still emit the tail call logic.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 tools/testing/selftests/bpf/prog_tests/tailcalls.c |   36 ++++++++++++++------
 .../selftests/bpf/progs/tailcall_bpf2bpf4.c        |   20 +++++++++++
 2 files changed, 45 insertions(+), 11 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/tailcalls.c b/tools/testing/selftests/bpf/prog_tests/tailcalls.c
index ee27d68d2a1c..b5940e6ca67c 100644
--- a/tools/testing/selftests/bpf/prog_tests/tailcalls.c
+++ b/tools/testing/selftests/bpf/prog_tests/tailcalls.c
@@ -715,6 +715,8 @@ static void test_tailcall_bpf2bpf_3(void)
 	bpf_object__close(obj);
 }
 
+#include "tailcall_bpf2bpf4.skel.h"
+
 /* test_tailcall_bpf2bpf_4 checks that tailcall counter is correctly preserved
  * across tailcalls combined with bpf2bpf calls. for making sure that tailcall
  * counter behaves correctly, bpf program will go through following flow:
@@ -727,10 +729,15 @@ static void test_tailcall_bpf2bpf_3(void)
  * the loop begins. At the end of the test make sure that the global counter is
  * equal to 31, because tailcall counter includes the first two tailcalls
  * whereas global counter is incremented only on loop presented on flow above.
+ *
+ * The noise parameter is used to insert bpf_map_update calls into the logic
+ * to force verifier to patch instructions. This allows us to ensure jump
+ * logic remains correct with instruction movement.
  */
-static void test_tailcall_bpf2bpf_4(void)
+static void test_tailcall_bpf2bpf_4(bool noise)
 {
-	int err, map_fd, prog_fd, main_fd, data_fd, i, val;
+	int err, map_fd, prog_fd, main_fd, data_fd, i;
+	struct tailcall_bpf2bpf4__bss val;
 	struct bpf_map *prog_array, *data_map;
 	struct bpf_program *prog;
 	struct bpf_object *obj;
@@ -774,11 +781,6 @@ static void test_tailcall_bpf2bpf_4(void)
 			goto out;
 	}
 
-	err = bpf_prog_test_run(main_fd, 1, &pkt_v4, sizeof(pkt_v4), 0,
-				&duration, &retval, NULL);
-	CHECK(err || retval != sizeof(pkt_v4) * 3, "tailcall", "err %d errno %d retval %d\n",
-	      err, errno, retval);
-
 	data_map = bpf_object__find_map_by_name(obj, "tailcall.bss");
 	if (CHECK_FAIL(!data_map || !bpf_map__is_internal(data_map)))
 		return;
@@ -787,10 +789,22 @@ static void test_tailcall_bpf2bpf_4(void)
 	if (CHECK_FAIL(map_fd < 0))
 		return;
 
+	i = 0;
+	val.noise = noise;
+	val.count = 0;
+	err = bpf_map_update_elem(data_fd, &i, &val, BPF_ANY);
+	if (CHECK_FAIL(err))
+		goto out;
+
+	err = bpf_prog_test_run(main_fd, 1, &pkt_v4, sizeof(pkt_v4), 0,
+				&duration, &retval, NULL);
+	CHECK(err || retval != sizeof(pkt_v4) * 3, "tailcall", "err %d errno %d retval %d\n",
+	      err, errno, retval);
+
 	i = 0;
 	err = bpf_map_lookup_elem(data_fd, &i, &val);
-	CHECK(err || val != 31, "tailcall count", "err %d errno %d count %d\n",
-	      err, errno, val);
+	CHECK(err || val.count != 31, "tailcall count", "err %d errno %d count %d\n",
+	      err, errno, val.count);
 
 out:
 	bpf_object__close(obj);
@@ -815,5 +829,7 @@ void test_tailcalls(void)
 	if (test__start_subtest("tailcall_bpf2bpf_3"))
 		test_tailcall_bpf2bpf_3();
 	if (test__start_subtest("tailcall_bpf2bpf_4"))
-		test_tailcall_bpf2bpf_4();
+		test_tailcall_bpf2bpf_4(false);
+	if (test__start_subtest("tailcall_bpf2bpf_5"))
+		test_tailcall_bpf2bpf_4(true);
 }
diff --git a/tools/testing/selftests/bpf/progs/tailcall_bpf2bpf4.c b/tools/testing/selftests/bpf/progs/tailcall_bpf2bpf4.c
index 9a1b166b7fbe..e89368a50b97 100644
--- a/tools/testing/selftests/bpf/progs/tailcall_bpf2bpf4.c
+++ b/tools/testing/selftests/bpf/progs/tailcall_bpf2bpf4.c
@@ -2,6 +2,13 @@
 #include <linux/bpf.h>
 #include <bpf/bpf_helpers.h>
 
+struct {
+	__uint(type, BPF_MAP_TYPE_ARRAY);
+	__uint(max_entries, 1);
+	__uint(key_size, sizeof(__u32));
+	__uint(value_size, sizeof(__u32));
+} nop_table SEC(".maps");
+
 struct {
 	__uint(type, BPF_MAP_TYPE_PROG_ARRAY);
 	__uint(max_entries, 3);
@@ -9,11 +16,22 @@ struct {
 	__uint(value_size, sizeof(__u32));
 } jmp_table SEC(".maps");
 
-static volatile int count;
+int count = 0;
+int noise = 0;
+
+__always_inline int subprog_noise(void)
+{
+	__u32 key = 0;
+
+	bpf_map_lookup_elem(&nop_table, &key);
+	return 0;
+}
 
 __noinline
 int subprog_tail_2(struct __sk_buff *skb)
 {
+	if (noise)
+		subprog_noise();
 	bpf_tail_call_static(skb, &jmp_table, 2);
 	return skb->len * 3;
 }




* RE: [PATCH bpf v2 2/4] bpf: map_poke_descriptor is being called with an unstable poke_tab[]
  2021-06-16 22:55 ` [PATCH bpf v2 2/4] bpf: map_poke_descriptor is being called with an unstable poke_tab[] John Fastabend
@ 2021-06-16 23:42   ` John Fastabend
  2021-06-22 21:54   ` Alexei Starovoitov
  1 sibling, 0 replies; 12+ messages in thread
From: John Fastabend @ 2021-06-16 23:42 UTC (permalink / raw)
  To: John Fastabend, maciej.fijalkowski, ast, daniel, andriin
  Cc: john.fastabend, netdev

John Fastabend wrote:
> When populating poke_tab[] of a subprog we call map_poke_track() after
> doing bpf_jit_add_poke_descriptor(). But, bpf_jit_add_poke_descriptor()
> may, likely will, realloc the poke_tab[] structure and free the old
> one. So that prog->aux->poke_tab is not stable. However, the aux pointer
> is referenced from bpf_array_aux and poke_tab[] is used to 'track'
> prog<->map link. This way when progs are released the entry in the
> map is dropped and vice versa when the map is released we don't drop
> it too soon if a prog is in the process of calling it.
> 
> I wasn't able to trigger any errors here, for example having map_poke_run
> run with a poke_tab[] pointer that was free'd from
> bpf_jit_add_poke_descriptor(), but it looks possible and at very least
> is very fragile.
> 
> This patch moves poke_track call out of loop that is calling add_poke
> so that we only ever add stable aux->poke_tab pointers to the map's
> bpf_array_aux struct. Further, we need this in the next patch to fix
> a real bug where progs are not 'untracked'.
> 
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
> ---

Needs a fixes tag,

Fixes: a748c6975dea3 ("bpf: propagate poke descriptors to subprograms")


* Re: [PATCH bpf v2 0/4] BPF fixes mixed tail and bpf2bpf calls
  2021-06-16 22:54 [PATCH bpf v2 0/4] BPF fixes mixed tail and bpf2bpf calls John Fastabend
                   ` (3 preceding siblings ...)
  2021-06-16 22:55 ` [PATCH bpf v2 4/4] bpf: selftest to verify mixing bpf2bpf calls and tailcalls with insn patch John Fastabend
@ 2021-06-17  4:30 ` Andrii Nakryiko
  4 siblings, 0 replies; 12+ messages in thread
From: Andrii Nakryiko @ 2021-06-17  4:30 UTC (permalink / raw)
  To: John Fastabend
  Cc: Maciej Fijalkowski, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Networking

On Wed, Jun 16, 2021 at 7:34 PM John Fastabend <john.fastabend@gmail.com> wrote:
>
> We recently tried to use mixed programs that have both tail calls and
> subprograms, but it needs the attached fixes.
>
> Also added a new test case tailcall_bpf2bpf_5 that simply runs the
> previous test case tailcall_bpf2bpf_4 and adds some "noise". The
> noise here is just a bunch of map calls to get the verifier to insert
> instructions and cause code movement plus it forces used_maps logic
> to be used. Originally, I just extended bpf2bpf_4 directly, but if I
> got the feedback correct it seems the preference is to have another
> test case for this specifically.
>
> With attached patches our programs are happily running with mixed
> subprograms and tailcalls.
>
> Thanks,
> John
>
> ---

Would be nice to include bpf@vger.kernel.org as well. I bet not
everyone interested in BPF follows netdev@vger closely.

>
> John Fastabend (4):
>       bpf: Fix null ptr deref with mixed tail calls and subprogs
>       bpf: map_poke_descriptor is being called with an unstable poke_tab[]
>       bpf: track subprog poke correctly
>       bpf: selftest to verify mixing bpf2bpf calls and tailcalls with insn patch
>
>
>  include/linux/bpf.h                           |  1 +
>  kernel/bpf/core.c                             |  6 ++--
>  kernel/bpf/verifier.c                         | 36 ++++++++++++++-----
>  .../selftests/bpf/prog_tests/tailcalls.c      | 36 +++++++++++++------
>  .../selftests/bpf/progs/tailcall_bpf2bpf4.c   | 20 ++++++++++-
>  5 files changed, 77 insertions(+), 22 deletions(-)
>
> --
>


* Re: [PATCH bpf v2 2/4] bpf: map_poke_descriptor is being called with an unstable poke_tab[]
  2021-06-16 22:55 ` [PATCH bpf v2 2/4] bpf: map_poke_descriptor is being called with an unstable poke_tab[] John Fastabend
  2021-06-16 23:42   ` John Fastabend
@ 2021-06-22 21:54   ` Alexei Starovoitov
  2021-06-22 22:59     ` John Fastabend
  1 sibling, 1 reply; 12+ messages in thread
From: Alexei Starovoitov @ 2021-06-22 21:54 UTC (permalink / raw)
  To: John Fastabend; +Cc: maciej.fijalkowski, ast, daniel, andriin, netdev

On Wed, Jun 16, 2021 at 03:55:19PM -0700, John Fastabend wrote:
> When populating poke_tab[] of a subprog we call map_poke_track() after
> doing bpf_jit_add_poke_descriptor(). But, bpf_jit_add_poke_descriptor()
> may, likely will, realloc the poke_tab[] structure and free the old
> one. So that prog->aux->poke_tab is not stable. However, the aux pointer
> is referenced from bpf_array_aux and poke_tab[] is used to 'track'
> prog<->map link. This way when progs are released the entry in the
> map is dropped and vice versa when the map is released we don't drop
> it too soon if a prog is in the process of calling it.
> 
> I wasn't able to trigger any errors here, for example having map_poke_run
> run with a poke_tab[] pointer that was free'd from
> bpf_jit_add_poke_descriptor(), but it looks possible and at very least
> is very fragile.
> 
> This patch moves poke_track call out of loop that is calling add_poke
> so that we only ever add stable aux->poke_tab pointers to the map's
> bpf_array_aux struct. Further, we need this in the next patch to fix
> a real bug where progs are not 'untracked'.
> 
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
> ---
>  kernel/bpf/verifier.c |    6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 6e2ebcb0d66f..066fac9b5460 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -12126,8 +12126,12 @@ static int jit_subprogs(struct bpf_verifier_env *env)
>  			}
>  
>  			func[i]->insnsi[insn_idx - subprog_start].imm = ret + 1;
> +		}
>  
> -			map_ptr = func[i]->aux->poke_tab[ret].tail_call.map;
> +		for (j = 0; j < func[i]->aux->size_poke_tab; j++) {
> +			int ret;
> +
> +			map_ptr = func[i]->aux->poke_tab[j].tail_call.map;

I don't see why it's necessary.
poke_tab pointer will be re-read after bpf_jit_add_poke_descriptor().
The compiler is not allowed to cache it.

I've applied the patch 1.


* Re: [PATCH bpf v2 3/4] bpf: track subprog poke correctly
  2021-06-16 22:55 ` [PATCH bpf v2 3/4] bpf: track subprog poke correctly John Fastabend
@ 2021-06-22 22:00   ` Alexei Starovoitov
  2021-06-22 23:02     ` John Fastabend
  0 siblings, 1 reply; 12+ messages in thread
From: Alexei Starovoitov @ 2021-06-22 22:00 UTC (permalink / raw)
  To: John Fastabend; +Cc: maciej.fijalkowski, ast, daniel, andriin, netdev

On Wed, Jun 16, 2021 at 03:55:39PM -0700, John Fastabend wrote:
>  
> -static void bpf_free_used_maps(struct bpf_prog_aux *aux)
> +void bpf_free_used_maps(struct bpf_prog_aux *aux)
>  {
>  	__bpf_free_used_maps(aux, aux->used_maps, aux->used_map_cnt);
>  	kfree(aux->used_maps);
> @@ -2211,8 +2211,10 @@ static void bpf_prog_free_deferred(struct work_struct *work)
>  #endif
>  	if (aux->dst_trampoline)
>  		bpf_trampoline_put(aux->dst_trampoline);
> -	for (i = 0; i < aux->func_cnt; i++)
> +	for (i = 0; i < aux->func_cnt; i++) {
> +		bpf_free_used_maps(aux->func[i]->aux);
>  		bpf_jit_free(aux->func[i]);
> +	}

The sub-progs don't have all the properties of the main prog.
Only the main prog is supposed to keep the maps' refcounts incremented.
After this patch the prog with 100 subprogs will artificially bump maps
refcnt 100 times as a workaround for poke_tab access.
Maybe we can use a single poke_tab in the main prog instead.
Looks like jit_subprogs is splitting the poke_tab into individual arrays
for each subprog, but maps are tracked by the main prog only.
That's the root cause of the issue, right?
I think that split of poke_tab isn't necessary.
bpf_int_jit_compile() can look into main prog poke_tab instead.
Then the loop:
for (j = 0; j < prog->aux->size_poke_tab)
    bpf_jit_add_poke_descriptor(func[i], &prog->aux->poke_tab[j]);
can be removed (It will address the concern in patch 2 as well, right?)
And hopefully will fix UAF too?


* Re: [PATCH bpf v2 2/4] bpf: map_poke_descriptor is being called with an unstable poke_tab[]
  2021-06-22 21:54   ` Alexei Starovoitov
@ 2021-06-22 22:59     ` John Fastabend
  0 siblings, 0 replies; 12+ messages in thread
From: John Fastabend @ 2021-06-22 22:59 UTC (permalink / raw)
  To: Alexei Starovoitov, John Fastabend
  Cc: maciej.fijalkowski, ast, daniel, andriin, netdev

Alexei Starovoitov wrote:
> On Wed, Jun 16, 2021 at 03:55:19PM -0700, John Fastabend wrote:
> > When populating poke_tab[] of a subprog we call map_poke_track() after
> > doing bpf_jit_add_poke_descriptor(). But, bpf_jit_add_poke_descriptor()
> > may, likely will, realloc the poke_tab[] structure and free the old
> > one. So that prog->aux->poke_tab is not stable. However, the aux pointer
> > is referenced from bpf_array_aux and poke_tab[] is used to 'track'
> > prog<->map link. This way when progs are released the entry in the
> > map is dropped and vice versa when the map is released we don't drop
> > it too soon if a prog is in the process of calling it.
> > 
> > I wasn't able to trigger any errors here, for example having map_poke_run
> > run with a poke_tab[] pointer that was free'd from
> > bpf_jit_add_poke_descriptor(), but it looks possible and at very least
> > is very fragile.
> > 
> > This patch moves poke_track call out of loop that is calling add_poke
> > so that we only ever add stable aux->poke_tab pointers to the map's
> > bpf_array_aux struct. Further, we need this in the next patch to fix
> > a real bug where progs are not 'untracked'.
> > 
> > Signed-off-by: John Fastabend <john.fastabend@gmail.com>
> > ---
> >  kernel/bpf/verifier.c |    6 +++++-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> > 
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index 6e2ebcb0d66f..066fac9b5460 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -12126,8 +12126,12 @@ static int jit_subprogs(struct bpf_verifier_env *env)
> >  			}
> >  
> >  			func[i]->insnsi[insn_idx - subprog_start].imm = ret + 1;
> > +		}
> >  
> > -			map_ptr = func[i]->aux->poke_tab[ret].tail_call.map;
> > +		for (j = 0; j < func[i]->aux->size_poke_tab; j++) {
> > +			int ret;
> > +
> > +			map_ptr = func[i]->aux->poke_tab[j].tail_call.map;
> 
> I don't see why it's necessary.

Agree it's not necessary. Nothing else can get at poke_tab while we do
the realloc, so I agree it's fine as is. It still seems odd to me to do the
poke_track when we know we are about to do multiple reallocs in the
next round of the loop. Either way, I'll reply to the feedback on 3/4 and
I think we can avoid this patch altogether.

> poke_tab pointer will be re-read after bpf_jit_add_poke_descriptor().
> The compiler is not allowed to cache it.
> 
> I've applied the patch 1.


* Re: [PATCH bpf v2 3/4] bpf: track subprog poke correctly
  2021-06-22 22:00   ` Alexei Starovoitov
@ 2021-06-22 23:02     ` John Fastabend
  2021-06-22 23:07       ` Alexei Starovoitov
  0 siblings, 1 reply; 12+ messages in thread
From: John Fastabend @ 2021-06-22 23:02 UTC (permalink / raw)
  To: Alexei Starovoitov, John Fastabend
  Cc: maciej.fijalkowski, ast, daniel, andriin, netdev

Alexei Starovoitov wrote:
> On Wed, Jun 16, 2021 at 03:55:39PM -0700, John Fastabend wrote:
> >  
> > -static void bpf_free_used_maps(struct bpf_prog_aux *aux)
> > +void bpf_free_used_maps(struct bpf_prog_aux *aux)
> >  {
> >  	__bpf_free_used_maps(aux, aux->used_maps, aux->used_map_cnt);
> >  	kfree(aux->used_maps);
> > @@ -2211,8 +2211,10 @@ static void bpf_prog_free_deferred(struct work_struct *work)
> >  #endif
> >  	if (aux->dst_trampoline)
> >  		bpf_trampoline_put(aux->dst_trampoline);
> > -	for (i = 0; i < aux->func_cnt; i++)
> > +	for (i = 0; i < aux->func_cnt; i++) {
> > +		bpf_free_used_maps(aux->func[i]->aux);
> >  		bpf_jit_free(aux->func[i]);
> > +	}
> 
> The sub-progs don't have all the properties of the main prog.
> Only the main prog is supposed to keep the maps' refcounts incremented.
> After this patch the prog with 100 subprogs will artificially bump maps
> refcnt 100 times as a workaround for poke_tab access.

Yep.

> Maybe we can use a single poke_tab in the main prog instead.
> Looks like jit_subprogs is splitting the poke_tab into individual arrays
> for each subprog, but maps are tracked by the main prog only.
> That's the root cause of the issue, right?

Correct.

> I think that split of poke_tab isn't necessary.
> bpf_int_jit_compile() can look into main prog poke_tab instead.
> Then the loop:
> for (j = 0; j < prog->aux->size_poke_tab)
>     bpf_jit_add_poke_descriptor(func[i], &prog->aux->poke_tab[j]);
> can be removed (It will address the concern in patch 2 as well, right?)
> And hopefully will fix UAF too?

Looks like it to me as well. A few details to work out around
imm value and emit hooks on the jit side, but looks doable to me.
I'll give it a try tomorrow or tonight.

Thanks,
John


* Re: [PATCH bpf v2 3/4] bpf: track subprog poke correctly
  2021-06-22 23:02     ` John Fastabend
@ 2021-06-22 23:07       ` Alexei Starovoitov
  0 siblings, 0 replies; 12+ messages in thread
From: Alexei Starovoitov @ 2021-06-22 23:07 UTC (permalink / raw)
  To: John Fastabend
  Cc: Fijalkowski, Maciej, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Network Development

On Tue, Jun 22, 2021 at 4:02 PM John Fastabend <john.fastabend@gmail.com> wrote:
>
> Alexei Starovoitov wrote:
> > On Wed, Jun 16, 2021 at 03:55:39PM -0700, John Fastabend wrote:
> > >
> > > -static void bpf_free_used_maps(struct bpf_prog_aux *aux)
> > > +void bpf_free_used_maps(struct bpf_prog_aux *aux)
> > >  {
> > >     __bpf_free_used_maps(aux, aux->used_maps, aux->used_map_cnt);
> > >     kfree(aux->used_maps);
> > > @@ -2211,8 +2211,10 @@ static void bpf_prog_free_deferred(struct work_struct *work)
> > >  #endif
> > >     if (aux->dst_trampoline)
> > >             bpf_trampoline_put(aux->dst_trampoline);
> > > -   for (i = 0; i < aux->func_cnt; i++)
> > > +   for (i = 0; i < aux->func_cnt; i++) {
> > > +           bpf_free_used_maps(aux->func[i]->aux);
> > >             bpf_jit_free(aux->func[i]);
> > > +   }
> >
> > The sub-progs don't have all the properties of the main prog.
> > Only the main prog is supposed to keep the maps' refcounts incremented.
> > After this patch the prog with 100 subprogs will artificially bump maps
> > refcnt 100 times as a workaround for poke_tab access.
>
> Yep.
>
> > Maybe we can use a single poke_tab in the main prog instead.
> > Looks like jit_subprogs is splitting the poke_tab into individual arrays
> > for each subprog, but maps are tracked by the main prog only.
> > That's the root cause of the issue, right?
>
> Correct.
>
> > I think that split of poke_tab isn't necessary.
> > bpf_int_jit_compile() can look into main prog poke_tab instead.
> > Then the loop:
> > for (j = 0; j < prog->aux->size_poke_tab)
> >     bpf_jit_add_poke_descriptor(func[i], &prog->aux->poke_tab[j]);
> > can be removed (It will address the concern in patch 2 as well, right?)
> > And hopefully will fix UAF too?
>
> Looks like it to me as well. A few details to work out around
> imm value and emit hooks on the jit side, but looks doable to me.
> I'll give it a try tomorrow or tonight.

Perfect. Thank you John.


