* [PATCH bpf-next v2 0/4] verify scalar ids mapping in regsafe()
@ 2023-05-30 17:27 Eduard Zingerman
  2023-05-30 17:27 ` [PATCH bpf-next v2 1/4] bpf: verify scalar ids mapping in regsafe() using check_ids() Eduard Zingerman
                   ` (3 more replies)
  0 siblings, 4 replies; 33+ messages in thread
From: Eduard Zingerman @ 2023-05-30 17:27 UTC (permalink / raw)
  To: bpf, ast; +Cc: andrii, daniel, martin.lau, kernel-team, yhs, Eduard Zingerman

Update regsafe() to use check_ids() for scalar values.
Otherwise the following unsafe pattern is accepted by verifier:

  1: r9 = ... some pointer with range X ...
  2: r6 = ... unbound scalar ID=a ...
  3: r7 = ... unbound scalar ID=b ...
  4: if (r6 > r7) goto +1
  5: r6 = r7
  6: if (r6 > X) goto ...
  --- checkpoint ---
  7: r9 += r7
  8: *(u64 *)r9 = Y

This example is unsafe because not all execution paths verify r7 range.
Because of the jump at (4) the verifier would arrive at (6) in two states:
I.  r6{.id=b}, r7{.id=b} via path 1-6;
II. r6{.id=a}, r7{.id=b} via path 1-4, 6.

Currently regsafe() does not call check_ids() for scalar registers,
thus from POV of regsafe() states (I) and (II) are identical.
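
For readers who prefer C to the numbered pseudo-code, the pattern above
corresponds roughly to the following program (an illustration only: the
program name and section are made up, and the actual selftests in
patch #2 use inline assembly to keep register assignment stable):

  // SPDX-License-Identifier: GPL-2.0
  #include <linux/bpf.h>
  #include <bpf/bpf_helpers.h>

  char buf[8];

  SEC("socket")
  int unsafe_scalar_ids(void *ctx)
  {
          __u64 a = bpf_ktime_get_ns(); /* unbound scalar, ID=a */
          __u64 b = bpf_ktime_get_ns(); /* unbound scalar, ID=b */

          if (a <= b)
                  a = b;               /* on this path a and b share one ID */
          if (a < sizeof(buf))         /* bounds a; bounds b only if IDs are shared */
                  buf[b] = 1;          /* must be rejected: b may be unbounded */
          return 0;
  }

  char _license[] SEC("license") = "GPL";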

The change is split into two parts:
- patches #1,2:
  - correctness fix for regsafe();
  - test cases.
  Patch #1 causes a big hit to verification performance.
- patches #3,4:
  - modification for find_equal_scalars() to save ids of registers
    that gain range via this function;
  - modification for regsafe() to do check_ids() for scalar registers
    only for such ids;
  - test cases.
  Patch #3 restores most of the verification performance.
  
A note on design decisions for patch #3. The change in patch #1 is
simple but very heavy-handed:

  @@ -15151,6 +15151,28 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
 
 	switch (base_type(rold->type)) {
 	case SCALAR_VALUE:
  +		/* ... */
  +		if (!check_ids(rold->id, rcur->id, idmap))
  +			return false;

Ideally check_ids() should only be called for 'rold' that either:
(a) gained range via find_equal_scalars() in some child verification
    path and was then marked as precise;
(b) was a source of range information for some other register via
    find_equal_scalars() in some child verification path, and that
    register was then marked as precise.

While the rold->precise flag could be a proxy for criterion (a), it,
unfortunately, cannot be a proxy for criterion (b). E.g. for the
example above the precision marks look as follows:

                           Precise registers
  5: r6 = r7              ; r7
  6: if (r6 > X) goto ... ; r7
  7: r9 += r7             ; r7

The jump at (6) cannot be predicted, thus there is no precision mark on r6.
If (6) is ever reached via the jump at (4), the cached state will not
have a precise mark for r6.

This leaves two options:
- Modification of precision tracking to take find_equal_scalars() into
  account.
- Find a way to track which ids were used for range information
  transfer in find_equal_scalars().
  
The former is a bit complicated, because information about register id
assignments for instructions in the middle of a state is lost.
It is possible to extend bpf_verifier_state::jmp_history to track a
mask for registers / stack slots that gained range via
find_equal_scalars() and use this mask in backtrack_insn().
However, this is a significant complication of already very non-trivial code.
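
For concreteness, that extension might look roughly as follows (assuming
jmp_history keeps using struct bpf_idx_pair entries; the new field is
hypothetical and stack slots would need an additional, wider mask):

  struct bpf_idx_pair {
          u32 prev_idx;
          u32 idx;
          /* hypothetical: registers that gained range via
           * find_equal_scalars() when this jump was recorded,
           * to be consulted by backtrack_insn()
           */
          u32 equal_scalars_reg_mask;
  };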

Thus, in patch #3 I opted for the latter approach: accumulate all ids
that gain range via find_equal_scalars() in a set stored in struct
bpf_verifier_env.

To represent this set I use a u32_hashset data structure derived from
tools/lib/bpf/hashmap.h. I tested it locally (see [1]), but I think
that ideally it should be tested using KUnit. However, AFAIK, this
would be the first use of KUnit in the context of the BPF verifier.
If people are ok with this, I will prepare the tests and necessary
CI integration.
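
For reference, a minimal sketch of what such a KUnit smoke test could
look like (file placement, Kconfig wiring and the exact assertions are
my assumptions, not part of this series):

  // SPDX-License-Identifier: GPL-2.0-only
  #include <kunit/test.h>

  #include "u32_hashset.h"

  static void u32_hashset_add_find_test(struct kunit *test)
  {
          struct u32_hashset set = {};
          u32 i;

          /* empty set finds nothing */
          KUNIT_EXPECT_FALSE(test, u32_hashset_find(&set, 1));

          /* add enough items to force several grow cycles */
          for (i = 0; i < 1024; i++)
                  KUNIT_EXPECT_EQ(test, 0, u32_hashset_add(&set, i));

          KUNIT_EXPECT_TRUE(test, u32_hashset_find(&set, 0));
          KUNIT_EXPECT_TRUE(test, u32_hashset_find(&set, 1023));
          KUNIT_EXPECT_FALSE(test, u32_hashset_find(&set, 1024));

          /* adding a duplicate does not change the item count */
          KUNIT_EXPECT_EQ(test, 0, u32_hashset_add(&set, 0));
          KUNIT_EXPECT_EQ(test, (size_t)1024, set.items_cnt);

          u32_hashset_clear(&set);
          KUNIT_EXPECT_FALSE(test, u32_hashset_find(&set, 0));
  }

  static struct kunit_case u32_hashset_test_cases[] = {
          KUNIT_CASE(u32_hashset_add_find_test),
          {}
  };

  static struct kunit_suite u32_hashset_test_suite = {
          .name = "u32_hashset",
          .test_cases = u32_hashset_test_cases,
  };
  kunit_test_suite(u32_hashset_test_suite);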

Changelog:
- V1 -> V2:
  - 'rold->precise' and 'rold->id' checks are dropped as unsafe
    (thanks to discussion with Yonghong);
  - patches #3,4 add tracking of ids used for range transfer in
    order to mitigate the performance impact.
- RFC -> V1:
  - Function verifier.c:mark_equal_scalars_as_read() is dropped,
    as it was an incorrect fix for a problem solved by commit [3].
  - check_ids() is called only for precise scalar values.
  - Test case updated to use inline assembly.

[V1]  https://lore.kernel.org/bpf/20230526184126.3104040-1-eddyz87@gmail.com/
[RFC] https://lore.kernel.org/bpf/20221128163442.280187-1-eddyz87@gmail.com/
[1]   https://gist.github.com/eddyz87/a32ea7e62a27d3c201117c9a39ab4286

Eduard Zingerman (4):
  bpf: verify scalar ids mapping in regsafe() using check_ids()
  selftests/bpf: verify that check_ids() is used for scalars in
    regsafe()
  bpf: filter out scalar ids not used for range transfer in regsafe()
  selftests/bpf: check that env->range_transfer_ids has effect

 include/linux/bpf_verifier.h                  |   4 +
 kernel/bpf/Makefile                           |   1 +
 kernel/bpf/u32_hashset.c                      | 137 +++++++++++
 kernel/bpf/u32_hashset.h                      |  30 +++
 kernel/bpf/verifier.c                         |  66 +++++-
 .../selftests/bpf/prog_tests/verifier.c       |   2 +
 .../selftests/bpf/progs/verifier_scalar_ids.c | 214 ++++++++++++++++++
 7 files changed, 447 insertions(+), 7 deletions(-)
 create mode 100644 kernel/bpf/u32_hashset.c
 create mode 100644 kernel/bpf/u32_hashset.h
 create mode 100644 tools/testing/selftests/bpf/progs/verifier_scalar_ids.c

-- 
2.40.1



* [PATCH bpf-next v2 1/4] bpf: verify scalar ids mapping in regsafe() using check_ids()
  2023-05-30 17:27 [PATCH bpf-next v2 0/4] verify scalar ids mapping in regsafe() Eduard Zingerman
@ 2023-05-30 17:27 ` Eduard Zingerman
  2023-05-30 21:37   ` Andrii Nakryiko
  2023-05-30 17:27 ` [PATCH bpf-next v2 2/4] selftests/bpf: verify that check_ids() is used for scalars in regsafe() Eduard Zingerman
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 33+ messages in thread
From: Eduard Zingerman @ 2023-05-30 17:27 UTC (permalink / raw)
  To: bpf, ast; +Cc: andrii, daniel, martin.lau, kernel-team, yhs, Eduard Zingerman

Make sure that the following unsafe example is rejected by verifier:

1: r9 = ... some pointer with range X ...
2: r6 = ... unbound scalar ID=a ...
3: r7 = ... unbound scalar ID=b ...
4: if (r6 > r7) goto +1
5: r6 = r7
6: if (r6 > X) goto ...
--- checkpoint ---
7: r9 += r7
8: *(u64 *)r9 = Y

This example is unsafe because not all execution paths verify r7 range.
Because of the jump at (4) the verifier would arrive at (6) in two states:
I.  r6{.id=b}, r7{.id=b} via path 1-6;
II. r6{.id=a}, r7{.id=b} via path 1-4, 6.

Currently regsafe() does not call check_ids() for scalar registers,
thus from POV of regsafe() states (I) and (II) are identical. If the
path 1-6 is taken by verifier first, and checkpoint is created at (6)
the path [1-4, 6] would be considered safe.

This commit updates regsafe() to call check_ids() for scalar registers.

This change is costly in terms of verification performance.
Using veristat to compare number of processed states for selftests
object files listed in tools/testing/selftests/bpf/veristat.cfg and
Cilium object files from [1] gives the following statistics:

  Filter        | Number of programs
  ----------------------------------
  states_pct>10 | 40
  states_pct>20 | 20
  states_pct>30 | 15
  states_pct>40 | 11

(Out of total 177 programs)
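
The comparison was done with veristat's states_pct filter; presumably an
invocation along the lines of the one quoted in patch #3, the exact
threshold being an assumption:

  $ ./veristat -e file,prog,states -f 'states_pct>10' \
               -C master-baseline.log current.log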

In fact, impact is so bad that in no-alu32 mode the following
test_progs tests no longer pass verification:
- verif_scale2: maximal number of instructions exceeded
- verif_scale3: maximal number of instructions exceeded
- verif_scale_pyperf600: maximal number of instructions exceeded

Additionally:
- verifier_search_pruning/allocated_stack: expected pruning does not
  happen because of differences in register id allocation.

Followup patch would address these issues.

[1] git@github.com:anakryiko/cilium.git

Fixes: 75748837b7e5 ("bpf: Propagate scalar ranges through register assignments.")
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 kernel/bpf/verifier.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index af70dad655ab..9c10f2619c4f 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -15151,6 +15151,28 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
 
 	switch (base_type(rold->type)) {
 	case SCALAR_VALUE:
+		/* Why check_ids() for scalar registers?
+		 *
+		 * Consider the following BPF code:
+		 *   1: r6 = ... unbound scalar, ID=a ...
+		 *   2: r7 = ... unbound scalar, ID=b ...
+		 *   3: if (r6 > r7) goto +1
+		 *   4: r6 = r7
+		 *   5: if (r6 > X) goto ...
+		 *   6: ... memory operation using r7 ...
+		 *
+		 * First verification path is [1-6]:
+		 * - at (4) same bpf_reg_state::id (b) would be assigned to r6 and r7;
+		 * - at (5) r6 would be marked <= X, find_equal_scalars() would also mark
+		 *   r7 <= X, because r6 and r7 share same id.
+		 *
+		 * Next verification path would start from (5), because of the jump at (3).
+		 * The only state difference between first and second visits of (5) is
+		 * bpf_reg_state::id assignments for r6 and r7: (b, b) vs (a, b).
+		 * Thus, use check_ids() to distinguish these states.
+		 */
+		if (!check_ids(rold->id, rcur->id, idmap))
+			return false;
 		if (regs_exact(rold, rcur, idmap))
 			return true;
 		if (env->explore_alu_limits)
-- 
2.40.1



* [PATCH bpf-next v2 2/4] selftests/bpf: verify that check_ids() is used for scalars in regsafe()
  2023-05-30 17:27 [PATCH bpf-next v2 0/4] verify scalar ids mapping in regsafe() Eduard Zingerman
  2023-05-30 17:27 ` [PATCH bpf-next v2 1/4] bpf: verify scalar ids mapping in regsafe() using check_ids() Eduard Zingerman
@ 2023-05-30 17:27 ` Eduard Zingerman
  2023-05-30 17:27 ` [PATCH bpf-next v2 3/4] bpf: filter out scalar ids not used for range transfer " Eduard Zingerman
  2023-05-30 17:27 ` [PATCH bpf-next v2 4/4] selftests/bpf: check that env->range_transfer_ids has effect Eduard Zingerman
  3 siblings, 0 replies; 33+ messages in thread
From: Eduard Zingerman @ 2023-05-30 17:27 UTC (permalink / raw)
  To: bpf, ast; +Cc: andrii, daniel, martin.lau, kernel-team, yhs, Eduard Zingerman

Verify that the following example is rejected by verifier:

r9 = ... some pointer with range X ...
r6 = ... unbound scalar ID=a ...
r7 = ... unbound scalar ID=b ...
if (r6 > r7) goto +1
r6 = r7
if (r6 > X) goto exit
r9 += r7
*(u64 *)r9 = Y

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 .../selftests/bpf/prog_tests/verifier.c       |   2 +
 .../selftests/bpf/progs/verifier_scalar_ids.c | 108 ++++++++++++++++++
 2 files changed, 110 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/progs/verifier_scalar_ids.c

diff --git a/tools/testing/selftests/bpf/prog_tests/verifier.c b/tools/testing/selftests/bpf/prog_tests/verifier.c
index 531621adef42..070a13833c3f 100644
--- a/tools/testing/selftests/bpf/prog_tests/verifier.c
+++ b/tools/testing/selftests/bpf/prog_tests/verifier.c
@@ -50,6 +50,7 @@
 #include "verifier_regalloc.skel.h"
 #include "verifier_ringbuf.skel.h"
 #include "verifier_runtime_jit.skel.h"
+#include "verifier_scalar_ids.skel.h"
 #include "verifier_search_pruning.skel.h"
 #include "verifier_sock.skel.h"
 #include "verifier_spill_fill.skel.h"
@@ -150,6 +151,7 @@ void test_verifier_ref_tracking(void)         { RUN(verifier_ref_tracking); }
 void test_verifier_regalloc(void)             { RUN(verifier_regalloc); }
 void test_verifier_ringbuf(void)              { RUN(verifier_ringbuf); }
 void test_verifier_runtime_jit(void)          { RUN(verifier_runtime_jit); }
+void test_verifier_scalar_ids(void)           { RUN(verifier_scalar_ids); }
 void test_verifier_search_pruning(void)       { RUN(verifier_search_pruning); }
 void test_verifier_sock(void)                 { RUN(verifier_sock); }
 void test_verifier_spill_fill(void)           { RUN(verifier_spill_fill); }
diff --git a/tools/testing/selftests/bpf/progs/verifier_scalar_ids.c b/tools/testing/selftests/bpf/progs/verifier_scalar_ids.c
new file mode 100644
index 000000000000..0ea9a1f6e1ae
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/verifier_scalar_ids.c
@@ -0,0 +1,108 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include "bpf_misc.h"
+
+/* Verify that check_ids() is used by regsafe() for scalars.
+ *
+ * r9 = ... some pointer with range X ...
+ * r6 = ... unbound scalar ID=a ...
+ * r7 = ... unbound scalar ID=b ...
+ * if (r6 > r7) goto +1
+ * r6 = r7
+ * if (r6 > X) goto exit
+ * r9 += r7
+ * *(u8 *)r9 = Y
+ *
+ * The memory access is safe only if r7 is bounded,
+ * which is true for one branch and not true for another.
+ */
+SEC("socket")
+__failure __msg("register with unbounded min value")
+__flag(BPF_F_TEST_STATE_FREQ)
+__naked void ids_id_mapping_in_regsafe(void)
+{
+	asm volatile (
+	/* Bump allocated stack */
+	"r1 = 0;"
+	"*(u64*)(r10 - 8) = r1;"
+	/* r9 = pointer to stack */
+	"r9 = r10;"
+	"r9 += -8;"
+	/* r7 = ktime_get_ns() */
+	"call %[bpf_ktime_get_ns];"
+	"r7 = r0;"
+	/* r6 = ktime_get_ns() */
+	"call %[bpf_ktime_get_ns];"
+	"r6 = r0;"
+	/* if r6 > r7 is an unpredictable jump */
+	"if r6 > r7 goto l1_%=;"
+	"r6 = r7;"
+"l1_%=:"
+	/* a noop, so that a new parent state is added here */
+	"r0 = r0;"
+	/* if r6 > 4 exit(0) */
+	"if r6 > 4 goto l2_%=;"
+	/* Access memory at r9[r7] */
+	"r9 += r7;"
+	"r0 = *(u8*)(r9 + 0);"
+"l2_%=:"
+	"r0 = 0;"
+	"exit;"
+	:
+	: __imm(bpf_ktime_get_ns)
+	: __clobber_all);
+}
+
+/* Similar to the previous one, but shows that bpf_reg_state::precise
+ * cannot be used to filter out registers subject to check_ids() in
+ * verifier.c:regsafe(). At 'l0' register 'r6' does not have 'precise'
+ * flag set but it is important to have this register in the idmap.
+ */
+SEC("socket")
+__failure __msg("register with unbounded min value")
+__flag(BPF_F_TEST_STATE_FREQ)
+__naked void ids_id_mapping_in_regsafe_2(void)
+{
+	asm volatile (
+	/* Bump allocated stack */
+	"r1 = 0;"
+	"*(u64*)(r10 - 8) = r1;"
+	/* r9 = pointer to stack */
+	"r9 = r10;"
+	"r9 += -8;"
+	/* r8 = ktime_get_ns() */
+	"call %[bpf_ktime_get_ns];"
+	"r8 = r0;"
+	/* r7 = ktime_get_ns() */
+	"call %[bpf_ktime_get_ns];"
+	"r7 = r0;"
+	/* r6 = ktime_get_ns() */
+	"call %[bpf_ktime_get_ns];"
+	"r6 = r0;"
+	/* scratch .id from r0 */
+	"r0 = 0;"
+	/* if r6 > r7 is an unpredictable jump */
+	"if r6 > r7 goto l1_%=;"
+	/* tie r6 and r7 .id */
+	"r6 = r7;"
+"l0_%=:"
+	/* if r7 > 4 exit(0) */
+	"if r7 > 4 goto l2_%=;"
+	/* Access memory at r9[r7] */
+	"r9 += r6;"
+	"r0 = *(u8*)(r9 + 0);"
+"l2_%=:"
+	"r0 = 0;"
+	"exit;"
+"l1_%=:"
+	/* tie r6 and r8 .id */
+	"r6 = r8;"
+	"goto l0_%=;"
+	:
+	: __imm(bpf_ktime_get_ns)
+	: __clobber_all);
+}
+
+char _license[] SEC("license") = "GPL";
-- 
2.40.1



* [PATCH bpf-next v2 3/4] bpf: filter out scalar ids not used for range transfer in regsafe()
  2023-05-30 17:27 [PATCH bpf-next v2 0/4] verify scalar ids mapping in regsafe() Eduard Zingerman
  2023-05-30 17:27 ` [PATCH bpf-next v2 1/4] bpf: verify scalar ids mapping in regsafe() using check_ids() Eduard Zingerman
  2023-05-30 17:27 ` [PATCH bpf-next v2 2/4] selftests/bpf: verify that check_ids() is used for scalars in regsafe() Eduard Zingerman
@ 2023-05-30 17:27 ` Eduard Zingerman
  2023-05-30 17:27 ` [PATCH bpf-next v2 4/4] selftests/bpf: check that env->range_transfer_ids has effect Eduard Zingerman
  3 siblings, 0 replies; 33+ messages in thread
From: Eduard Zingerman @ 2023-05-30 17:27 UTC (permalink / raw)
  To: bpf, ast; +Cc: andrii, daniel, martin.lau, kernel-team, yhs, Eduard Zingerman

verifier.c:regsafe() uses check_ids() to verify that scalar register
id assignments match between cached and current state. This is costly
because many registers might get an id, but not all of them would
actually gain range through verifier.c:find_equal_scalars().

For example, in the following code range is transferred via
find_equal_scalars():

  1: r0 = call some_func();
  2: r1 = r0;                       // r0 and r1 get same id
  3: if r0 > 10 goto exit;          // r0 and r1 get the same range
     ... use r1 ...

In this case it is important to verify that r0 and r1 have the same id
if there is ever a jump to (3).

However, for the following code the register id mapping is not important
but gets in the way:

  1: r6 = ...
  2: if ... goto <4>;
  3: r0 = call some_func(); // r0.id == 0
  4: goto <6>;
  5: r0 = r6;
  6: if r0 > 10 goto exit;  // first visit with r0.id == 0,
                            // second visit with r0.id != 0
     ... use r0 ...

The jump from 4 to 6 would not be considered safe and the path starting
from 6 would be processed again because of the mismatch in r0 id mapping.

This commit modifies find_equal_scalars() to track which ids were
actually used for range transfer. regsafe() can safely omit id mapping
checks for ids that were never used for range transfer.

This brings back most of the performance lost because of the previous commit:

$ ./veristat -e file,prog,states -f 'states_pct!=0' \
             -C master-baseline.log current.log
File             Program                States (A)  States (B)  States (DIFF)
---------------  ---------------------  ----------  ----------  -------------
bpf_host.o       cil_from_host                  37          45   +8 (+21.62%)
bpf_sock.o       cil_sock6_connect              99         103    +4 (+4.04%)
bpf_sock.o       cil_sock6_getpeername          56          57    +1 (+1.79%)
bpf_sock.o       cil_sock6_recvmsg              56          57    +1 (+1.79%)
bpf_sock.o       cil_sock6_sendmsg              93          97    +4 (+4.30%)
test_usdt.bpf.o  usdt12                        116         117    +1 (+0.86%)

(As in the previous commit verification performance is tested for
 object files listed in tools/testing/selftests/bpf/veristat.cfg and
 Cilium object files from [1])

[1] git@github.com:anakryiko/cilium.git

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 include/linux/bpf_verifier.h |   4 +
 kernel/bpf/Makefile          |   1 +
 kernel/bpf/u32_hashset.c     | 137 +++++++++++++++++++++++++++++++++++
 kernel/bpf/u32_hashset.h     |  30 ++++++++
 kernel/bpf/verifier.c        |  46 ++++++++++--
 5 files changed, 210 insertions(+), 8 deletions(-)
 create mode 100644 kernel/bpf/u32_hashset.c
 create mode 100644 kernel/bpf/u32_hashset.h

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 5b11a3b0fec0..c5bc87403a6f 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -557,6 +557,8 @@ struct backtrack_state {
 	u64 stack_masks[MAX_CALL_FRAMES];
 };
 
+struct u32_hashset;
+
 /* single container for all structs
  * one verifier_env per bpf_check() call
  */
@@ -622,6 +624,8 @@ struct bpf_verifier_env {
 	u32 scratched_regs;
 	/* Same as scratched_regs but for stack slots */
 	u64 scratched_stack_slots;
+	/* set of ids that gain range via find_equal_scalars() */
+	struct u32_hashset *range_transfer_ids;
 	u64 prev_log_pos, prev_insn_print_pos;
 	/* buffer used to generate temporary string representations,
 	 * e.g., in reg_type_str() to generate reg_type string
diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile
index 1d3892168d32..8e94e549679e 100644
--- a/kernel/bpf/Makefile
+++ b/kernel/bpf/Makefile
@@ -12,6 +12,7 @@ obj-$(CONFIG_BPF_SYSCALL) += hashtab.o arraymap.o percpu_freelist.o bpf_lru_list
 obj-$(CONFIG_BPF_SYSCALL) += local_storage.o queue_stack_maps.o ringbuf.o
 obj-$(CONFIG_BPF_SYSCALL) += bpf_local_storage.o bpf_task_storage.o
 obj-${CONFIG_BPF_LSM}	  += bpf_inode_storage.o
+obj-$(CONFIG_BPF_SYSCALL) += u32_hashset.o
 obj-$(CONFIG_BPF_SYSCALL) += disasm.o
 obj-$(CONFIG_BPF_JIT) += trampoline.o
 obj-$(CONFIG_BPF_SYSCALL) += btf.o memalloc.o
diff --git a/kernel/bpf/u32_hashset.c b/kernel/bpf/u32_hashset.c
new file mode 100644
index 000000000000..a2c5429e34e1
--- /dev/null
+++ b/kernel/bpf/u32_hashset.c
@@ -0,0 +1,137 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include "linux/gfp_types.h"
+#include "linux/random.h"
+#include "linux/slab.h"
+#include <linux/jhash.h>
+
+#include "u32_hashset.h"
+
+static struct u32_hashset_bucket *u32_hashset_put_in_bucket(struct u32_hashset_bucket *bucket,
+							    u32 item)
+{
+	struct u32_hashset_bucket *new_bucket;
+	u32 new_cap = bucket ? 2 * bucket->cap : 1;
+	u32 cnt = bucket ? bucket->cnt : 0;
+	size_t sz;
+
+	if (!bucket || bucket->cnt == bucket->cap) {
+		sz = sizeof(struct u32_hashset_bucket) + sizeof(u32) * new_cap;
+		new_bucket = krealloc(bucket, sz, GFP_KERNEL);
+		if (!new_bucket)
+			return NULL;
+		new_bucket->cap = new_cap;
+	} else {
+		new_bucket = bucket;
+	}
+
+	new_bucket->items[cnt] = item;
+	new_bucket->cnt = cnt + 1;
+
+	return new_bucket;
+}
+
+static bool u32_hashset_needs_to_grow(struct u32_hashset *set)
+{
+	/* grow if empty or more than 75% filled */
+	return (set->buckets_cnt == 0) || ((set->items_cnt + 1) * 4 / 3 > set->buckets_cnt);
+}
+
+static void u32_hashset_free_buckets(struct u32_hashset_bucket **buckets, size_t cnt)
+{
+	size_t bkt;
+
+	for (bkt = 0; bkt < cnt; ++bkt)
+		kfree(buckets[bkt]);
+	kfree(buckets);
+}
+
+static int u32_hashset_grow(struct u32_hashset *set)
+{
+	struct u32_hashset_bucket **new_buckets;
+	size_t new_buckets_cnt;
+	size_t h, bkt, i;
+	u32 item;
+
+	new_buckets_cnt = set->buckets_cnt ? set->buckets_cnt * 2 : 4;
+	new_buckets = kcalloc(new_buckets_cnt, sizeof(new_buckets[0]), GFP_KERNEL);
+	if (!new_buckets)
+		return -ENOMEM;
+
+	for (bkt = 0; bkt < set->buckets_cnt; ++bkt) {
+		if (!set->buckets[bkt])
+			continue;
+
+		for (i = 0; i < set->buckets[bkt]->cnt; ++i) {
+			item = set->buckets[bkt]->items[i];
+			h = jhash_1word(item, set->seed) % new_buckets_cnt;
+			new_buckets[h] = u32_hashset_put_in_bucket(new_buckets[h], item);
+			if (!new_buckets[h])
+				goto nomem;
+		}
+	}
+
+	u32_hashset_free_buckets(set->buckets, set->buckets_cnt);
+	set->buckets_cnt = new_buckets_cnt;
+	set->buckets = new_buckets;
+	return 0;
+
+nomem:
+	u32_hashset_free_buckets(new_buckets, new_buckets_cnt);
+
+	return -ENOMEM;
+}
+
+void u32_hashset_clear(struct u32_hashset *set)
+{
+	u32_hashset_free_buckets(set->buckets, set->buckets_cnt);
+	set->buckets = NULL;
+	set->buckets_cnt = 0;
+	set->items_cnt = 0;
+}
+
+bool u32_hashset_find(const struct u32_hashset *set, const u32 key)
+{
+	struct u32_hashset_bucket *bkt;
+	u32 i, hash;
+
+	if (!set->buckets)
+		return false;
+
+	hash = jhash_1word(key, set->seed) % set->buckets_cnt;
+	bkt = set->buckets[hash];
+	if (!bkt)
+		return false;
+
+	for (i = 0; i < bkt->cnt; ++i)
+		if (bkt->items[i] == key)
+			return true;
+
+	return false;
+}
+
+int u32_hashset_add(struct u32_hashset *set, u32 key)
+{
+	struct u32_hashset_bucket *new_bucket;
+	u32 hash;
+	int err;
+
+	if (u32_hashset_find(set, key))
+		return 0;
+
+	if (u32_hashset_needs_to_grow(set)) {
+		err = u32_hashset_grow(set);
+		if (err)
+			return err;
+	}
+
+	hash = jhash_1word(key, set->seed) % set->buckets_cnt;
+	new_bucket = u32_hashset_put_in_bucket(set->buckets[hash], key);
+	if (!new_bucket)
+		return -ENOMEM;
+
+	set->buckets[hash] = new_bucket;
+	set->items_cnt++;
+
+	return 0;
+}
diff --git a/kernel/bpf/u32_hashset.h b/kernel/bpf/u32_hashset.h
new file mode 100644
index 000000000000..76f03e2e4f14
--- /dev/null
+++ b/kernel/bpf/u32_hashset.h
@@ -0,0 +1,30 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+/* A hashset for u32 values, based on tools/lib/bpf/hashmap.h */
+
+#ifndef __U32_HASHSET_H__
+#define __U32_HASHSET_H__
+
+#include "linux/gfp_types.h"
+#include "linux/random.h"
+#include "linux/slab.h"
+#include <linux/jhash.h>
+
+struct u32_hashset_bucket {
+	u32 cnt;
+	u32 cap;
+	u32 items[];
+};
+
+struct u32_hashset {
+	struct u32_hashset_bucket **buckets;
+	size_t buckets_cnt;
+	size_t items_cnt;
+	u32 seed;
+};
+
+void u32_hashset_clear(struct u32_hashset *set);
+bool u32_hashset_find(const struct u32_hashset *set, const u32 key);
+int u32_hashset_add(struct u32_hashset *set, u32 key);
+
+#endif /* __U32_HASHSET_H__ */
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 9c10f2619c4f..0d3a695aa4da 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -27,6 +27,7 @@
 #include <linux/module.h>
 
 #include "disasm.h"
+#include "u32_hashset.h"
 
 static const struct bpf_verifier_ops * const bpf_verifier_ops[] = {
 #define BPF_PROG_TYPE(_id, _name, prog_ctx_type, kern_ctx_type) \
@@ -13629,16 +13630,25 @@ static bool try_match_pkt_pointers(const struct bpf_insn *insn,
 	return true;
 }
 
-static void find_equal_scalars(struct bpf_verifier_state *vstate,
-			       struct bpf_reg_state *known_reg)
+static int find_equal_scalars(struct bpf_verifier_env *env,
+			      struct bpf_verifier_state *vstate,
+			      struct bpf_reg_state *known_reg)
 {
 	struct bpf_func_state *state;
 	struct bpf_reg_state *reg;
+	int err = 0, count = 0;
 
 	bpf_for_each_reg_in_vstate(vstate, state, reg, ({
-		if (reg->type == SCALAR_VALUE && reg->id == known_reg->id)
+		if (reg->type == SCALAR_VALUE && reg->id == known_reg->id) {
 			copy_register_state(reg, known_reg);
+			++count;
+		}
 	}));
+
+	if (count > 1)
+		err = u32_hashset_add(env->range_transfer_ids, known_reg->id);
+
+	return err;
 }
 
 static int check_cond_jmp_op(struct bpf_verifier_env *env,
@@ -13803,8 +13813,13 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
 						    src_reg, dst_reg, opcode);
 			if (src_reg->id &&
 			    !WARN_ON_ONCE(src_reg->id != other_branch_regs[insn->src_reg].id)) {
-				find_equal_scalars(this_branch, src_reg);
-				find_equal_scalars(other_branch, &other_branch_regs[insn->src_reg]);
+				err = find_equal_scalars(env, this_branch, src_reg);
+				if (err)
+					return err;
+				err = find_equal_scalars(env, other_branch,
+							 &other_branch_regs[insn->src_reg]);
+				if (err)
+					return err;
 			}
 
 		}
@@ -13816,8 +13831,12 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
 
 	if (dst_reg->type == SCALAR_VALUE && dst_reg->id &&
 	    !WARN_ON_ONCE(dst_reg->id != other_branch_regs[insn->dst_reg].id)) {
-		find_equal_scalars(this_branch, dst_reg);
-		find_equal_scalars(other_branch, &other_branch_regs[insn->dst_reg]);
+		err = find_equal_scalars(env, this_branch, dst_reg);
+		if (err)
+			return err;
+		err = find_equal_scalars(env, other_branch, &other_branch_regs[insn->dst_reg]);
+		if (err)
+			return err;
 	}
 
 	/* if one pointer register is compared to another pointer
@@ -15170,8 +15189,13 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
 		 * The only state difference between first and second visits of (5) is
 		 * bpf_reg_state::id assignments for r6 and r7: (b, b) vs (a, b).
 		 * Thus, use check_ids() to distinguish these states.
+		 *
+		 * All children states of 'rold' are already verified.
+		 * Thus env->range_transfer_ids contains all ids that gained range via
+		 * find_equal_scalars() during children verification.
 		 */
-		if (!check_ids(rold->id, rcur->id, idmap))
+		if (u32_hashset_find(env->range_transfer_ids, rold->id) &&
+		    !check_ids(rold->id, rcur->id, idmap))
 			return false;
 		if (regs_exact(rold, rcur, idmap))
 			return true;
@@ -19289,6 +19313,10 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
 	if (!env->explored_states)
 		goto skip_full_check;
 
+	env->range_transfer_ids = kzalloc(sizeof(*env->range_transfer_ids), GFP_KERNEL);
+	if (!env->range_transfer_ids)
+		goto skip_full_check;
+
 	ret = add_subprog_and_kfunc(env);
 	if (ret < 0)
 		goto skip_full_check;
@@ -19327,6 +19355,8 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
 
 skip_full_check:
 	kvfree(env->explored_states);
+	u32_hashset_clear(env->range_transfer_ids);
+	kvfree(env->range_transfer_ids);
 
 	if (ret == 0)
 		ret = check_max_stack_depth(env);
-- 
2.40.1



* [PATCH bpf-next v2 4/4] selftests/bpf: check that env->range_transfer_ids has effect
  2023-05-30 17:27 [PATCH bpf-next v2 0/4] verify scalar ids mapping in regsafe() Eduard Zingerman
                   ` (2 preceding siblings ...)
  2023-05-30 17:27 ` [PATCH bpf-next v2 3/4] bpf: filter out scalar ids not used for range transfer " Eduard Zingerman
@ 2023-05-30 17:27 ` Eduard Zingerman
  3 siblings, 0 replies; 33+ messages in thread
From: Eduard Zingerman @ 2023-05-30 17:27 UTC (permalink / raw)
  To: bpf, ast; +Cc: andrii, daniel, martin.lau, kernel-team, yhs, Eduard Zingerman

The previous commit adds a bpf_verifier_env::range_transfer_ids check to
verifier.c:regsafe(). This check allows skipping check_ids() for some
ids in the cached verifier state and thus improves verification
performance.

This commit adds two test cases:
- first: showing that check_ids() is indeed skipped as expected;
- second: a modification of the first where check_ids() cannot be skipped.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 .../selftests/bpf/progs/verifier_scalar_ids.c | 106 ++++++++++++++++++
 1 file changed, 106 insertions(+)

diff --git a/tools/testing/selftests/bpf/progs/verifier_scalar_ids.c b/tools/testing/selftests/bpf/progs/verifier_scalar_ids.c
index 0ea9a1f6e1ae..2c5bb72696ce 100644
--- a/tools/testing/selftests/bpf/progs/verifier_scalar_ids.c
+++ b/tools/testing/selftests/bpf/progs/verifier_scalar_ids.c
@@ -105,4 +105,110 @@ __naked void ids_id_mapping_in_regsafe_2(void)
 	: __clobber_all);
 }
 
+/* Label l1 could be reached in two combinations:
+ *
+ *   (1) r6{.id=A}, r7{.id=A}, r8{.id=B}
+ *   (2) r6{.id=B}, r7{.id=A}, r8{.id=B}
+ *
+ * However neither A nor B are used in find_equal_scalars()
+ * to transfer range information in this test.
+ * Thus states (1) and (2) should be considered identical due
+ * to bpf_verifier_env::range_transfer_ids handling.
+ *
+ * Make sure that this is the case by checking that second jump
+ * to l1 hits cached state.
+ */
+SEC("socket")
+__success __log_level(7) __msg("14: safe")
+__flag(BPF_F_TEST_STATE_FREQ)
+__naked void no_range_transfer_ids(void)
+{
+	asm volatile (
+	/* Bump allocated stack */
+	"r1 = 0;"
+	"*(u64*)(r10 - 16) = r1;"
+	/* r9 = pointer to stack */
+	"r9 = r10;"
+	"r9 += -16;"
+	/* r8 = ktime_get_ns() & 0b11 */
+	"call %[bpf_ktime_get_ns];"
+	"r8 = r0;"
+	"r8 &= 3;"
+	/* r7 = ktime_get_ns() & 0b11 */
+	"call %[bpf_ktime_get_ns];"
+	"r7 = r0;"
+	"r7 &= 3;"
+	/* if r7 > r8 is an unpredictable jump */
+	"if r7 > r8 goto l0_%=;"
+	"r6 = r7;"
+	"goto l1_%=;"
+"l0_%=:"
+	"r6 = r8;"
+"l1_%=:"
+	/* insn #14 */
+	"r9 += r6;"
+	"r9 += r7;"
+	"r9 += r8;"
+	"r0 = *(u8*)(r9 + 0);"
+	"r0 = 0;"
+	"exit;"
+	:
+	: __imm(bpf_ktime_get_ns)
+	: __clobber_all);
+}
+
+/* Same as above, but cached state for l1 has id used for
+ * range transfer:
+ *
+ *   (1) r6{.id=A}, r7{.id=A}, r8{.id=B}
+ *   (2) r6{.id=B}, r7{.id=A}, r8{.id=B}
+ *
+ * If (A) is used for range transfer (1) and (2) should not
+ * be considered identical.
+ *
+ * Check this by verifying that instruction immediately following l1
+ * is visited twice.
+ */
+SEC("socket")
+__success __log_level(7) __msg("r9 = r9") __msg("r9 = r9")
+__flag(BPF_F_TEST_STATE_FREQ)
+__naked void has_range_transfer_ids(void)
+{
+	asm volatile (
+	/* Bump allocated stack */
+	"r1 = 0;"
+	"*(u64*)(r10 - 16) = r1;"
+	/* r9 = pointer to stack */
+	"r9 = r10;"
+	"r9 += -16;"
+	/* r8 = ktime_get_ns() */
+	"call %[bpf_ktime_get_ns];"
+	"r8 = r0;"
+	/* r7 = ktime_get_ns() */
+	"call %[bpf_ktime_get_ns];"
+	"r7 = r0;"
+	/* if r7 > r8 is an unpredictable jump */
+	"if r7 > r8 goto l0_%=;"
+	"r6 = r7;"
+	"goto l1_%=;"
+"l0_%=:"
+	"r6 = r8;"
+"l1_%=:"
+	/* just a unique marker, this insn should be verified twice */
+	"r9 = r9;"
+	/* one of the instructions below transfers range for r6 */
+	"if r7 > 2 goto l2_%=;"
+	"if r8 > 2 goto l2_%=;"
+	"r9 += r6;"
+	"r9 += r7;"
+	"r9 += r8;"
+	"r0 = *(u8*)(r9 + 0);"
+"l2_%=:"
+	"r0 = 0;"
+	"exit;"
+	:
+	: __imm(bpf_ktime_get_ns)
+	: __clobber_all);
+}
+
 char _license[] SEC("license") = "GPL";
-- 
2.40.1



* Re: [PATCH bpf-next v2 1/4] bpf: verify scalar ids mapping in regsafe() using check_ids()
  2023-05-30 17:27 ` [PATCH bpf-next v2 1/4] bpf: verify scalar ids mapping in regsafe() using check_ids() Eduard Zingerman
@ 2023-05-30 21:37   ` Andrii Nakryiko
  2023-05-30 23:02     ` Eduard Zingerman
       [not found]     ` <f2abf39bcd4de841a89bb248de9e242a880aaa93.camel@gmail.com>
  0 siblings, 2 replies; 33+ messages in thread
From: Andrii Nakryiko @ 2023-05-30 21:37 UTC (permalink / raw)
  To: Eduard Zingerman; +Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yhs

On Tue, May 30, 2023 at 10:27 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> Make sure that the following unsafe example is rejected by verifier:
>
> 1: r9 = ... some pointer with range X ...
> 2: r6 = ... unbound scalar ID=a ...
> 3: r7 = ... unbound scalar ID=b ...
> 4: if (r6 > r7) goto +1
> 5: r6 = r7
> 6: if (r6 > X) goto ...
> --- checkpoint ---
> 7: r9 += r7
> 8: *(u64 *)r9 = Y
>
> This example is unsafe because not all execution paths verify r7 range.
> Because of the jump at (4) the verifier would arrive at (6) in two states:
> I.  r6{.id=b}, r7{.id=b} via path 1-6;
> II. r6{.id=a}, r7{.id=b} via path 1-4, 6.
>
> Currently regsafe() does not call check_ids() for scalar registers,
> thus from POV of regsafe() states (I) and (II) are identical. If the
> path 1-6 is taken by verifier first, and checkpoint is created at (6)
> the path [1-4, 6] would be considered safe.
>
> This commit updates regsafe() to call check_ids() for scalar registers.
>
> This change is costly in terms of verification performance.
> Using veristat to compare number of processed states for selftests
> object files listed in tools/testing/selftests/bpf/veristat.cfg and
> Cilium object files from [1] gives the following statistics:
>
>   Filter        | Number of programs
>   ----------------------------------
>   states_pct>10 | 40
>   states_pct>20 | 20
>   states_pct>30 | 15
>   states_pct>40 | 11
>
> (Out of total 177 programs)
>
> In fact, impact is so bad that in no-alu32 mode the following
> test_progs tests no longer pass verification:
> - verif_scale2: maximal number of instructions exceeded
> - verif_scale3: maximal number of instructions exceeded
> - verif_scale_pyperf600: maximal number of instructions exceeded
>
> Additionally:
> - verifier_search_pruning/allocated_stack: expected pruning does not
>   happen because of differences in register id allocation.
>
> Followup patch would address these issues.
>
> [1] git@github.com:anakryiko/cilium.git
>
> Fixes: 75748837b7e5 ("bpf: Propagate scalar ranges through register assignments.")
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> ---
>  kernel/bpf/verifier.c | 22 ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index af70dad655ab..9c10f2619c4f 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -15151,6 +15151,28 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
>
>         switch (base_type(rold->type)) {
>         case SCALAR_VALUE:
> +               /* Why check_ids() for scalar registers?
> +                *
> +                * Consider the following BPF code:
> +                *   1: r6 = ... unbound scalar, ID=a ...
> +                *   2: r7 = ... unbound scalar, ID=b ...
> +                *   3: if (r6 > r7) goto +1
> +                *   4: r6 = r7
> +                *   5: if (r6 > X) goto ...
> +                *   6: ... memory operation using r7 ...
> +                *
> +                * First verification path is [1-6]:
> +                * - at (4) same bpf_reg_state::id (b) would be assigned to r6 and r7;
> +                * - at (5) r6 would be marked <= X, find_equal_scalars() would also mark
> +                *   r7 <= X, because r6 and r7 share same id.
> +                *
> +                * Next verification path would start from (5), because of the jump at (3).
> +                * The only state difference between first and second visits of (5) is
> +                * bpf_reg_state::id assignments for r6 and r7: (b, b) vs (a, b).
> +                * Thus, use check_ids() to distinguish these states.
> +                */
> +               if (!check_ids(rold->id, rcur->id, idmap))
> +                       return false;

does this check_ids() have to be performed before regs_exact (which
also checks IDs, btw) *and* before rold->precise check?

Intuitively, it feels like `rold->precise = false` case shouldn't care
about IDs in rcur, as any value should be safe. But I haven't spent
much time thinking about this, so there might be some corner case I'm
missing.

I wonder if moving this check after we handled imprecise rold case would help.

Also, it might make sense to drop SCALAR register IDs as soon as we
have only one instance of it left (e.g., if "paired" register was
overwritten already). I.e., aggressively drop IDs when they become
useless. WDYT?
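
A minimal sketch of what such aggressive ID dropping could look like
(the helper name and the call site are hypothetical, and a complete
version would also have to scan stack slots and other frames):

  static void drop_sole_scalar_id(struct bpf_func_state *state,
                                  struct bpf_reg_state *reg)
  {
          int i, uses = 0;

          if (reg->type != SCALAR_VALUE || !reg->id)
                  return;
          /* count how many registers in this frame share reg->id */
          for (i = 0; i < MAX_BPF_REG; i++) {
                  struct bpf_reg_state *other = &state->regs[i];

                  if (other->type == SCALAR_VALUE && other->id == reg->id)
                          uses++;
          }
          /* a unique id can no longer participate in range transfer */
          if (uses <= 1)
                  reg->id = 0;
  }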

The u32_hashset.... Have you also measured verification time
regression with this? I wonder how much impact that change has?

>                 if (regs_exact(rold, rcur, idmap))
>                         return true;
>                 if (env->explore_alu_limits)
> --
> 2.40.1
>


* Re: [PATCH bpf-next v2 1/4] bpf: verify scalar ids mapping in regsafe() using check_ids()
  2023-05-30 21:37   ` Andrii Nakryiko
@ 2023-05-30 23:02     ` Eduard Zingerman
  2023-06-01  2:05       ` Alexei Starovoitov
       [not found]     ` <f2abf39bcd4de841a89bb248de9e242a880aaa93.camel@gmail.com>
  1 sibling, 1 reply; 33+ messages in thread
From: Eduard Zingerman @ 2023-05-30 23:02 UTC (permalink / raw)
  To: Andrii Nakryiko; +Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yhs

On Tue, 2023-05-30 at 14:37 -0700, Andrii Nakryiko wrote:
> On Tue, May 30, 2023 at 10:27 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > 
> > Make sure that the following unsafe example is rejected by verifier:
> > 
> > 1: r9 = ... some pointer with range X ...
> > 2: r6 = ... unbound scalar ID=a ...
> > 3: r7 = ... unbound scalar ID=b ...
> > 4: if (r6 > r7) goto +1
> > 5: r6 = r7
> > 6: if (r6 > X) goto ...
> > --- checkpoint ---
> > 7: r9 += r7
> > 8: *(u64 *)r9 = Y
> > 
> > This example is unsafe because not all execution paths verify r7 range.
> > Because of the jump at (4) the verifier would arrive at (6) in two states:
> > I.  r6{.id=b}, r7{.id=b} via path 1-6;
> > II. r6{.id=a}, r7{.id=b} via path 1-4, 6.
> > 
> > Currently regsafe() does not call check_ids() for scalar registers,
> > thus from POV of regsafe() states (I) and (II) are identical. If the
> > path 1-6 is taken by verifier first, and checkpoint is created at (6)
> > the path [1-4, 6] would be considered safe.
> > 
> > This commit updates regsafe() to call check_ids() for scalar registers.
> > 
> > This change is costly in terms of verification performance.
> > Using veristat to compare number of processed states for selftests
> > object files listed in tools/testing/selftests/bpf/veristat.cfg and
> > Cilium object files from [1] gives the following statistics:
> > 
> >   Filter        | Number of programs
> >   ----------------------------------
> >   states_pct>10 | 40
> >   states_pct>20 | 20
> >   states_pct>30 | 15
> >   states_pct>40 | 11
> > 
> > (Out of total 177 programs)
> > 
> > In fact, impact is so bad that in no-alu32 mode the following
> > test_progs tests no longer pass verification:
> > - verif_scale2: maximal number of instructions exceeded
> > - verif_scale3: maximal number of instructions exceeded
> > - verif_scale_pyperf600: maximal number of instructions exceeded
> > 
> > Additionally:
> > - verifier_search_pruning/allocated_stack: expected pruning does not
> >   happen because of differences in register id allocation.
> > 
> > Followup patch would address these issues.
> > 
> > [1] git@github.com:anakryiko/cilium.git
> > 
> > Fixes: 75748837b7e5 ("bpf: Propagate scalar ranges through register assignments.")
> > Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> > ---
> >  kernel/bpf/verifier.c | 22 ++++++++++++++++++++++
> >  1 file changed, 22 insertions(+)
> > 
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index af70dad655ab..9c10f2619c4f 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -15151,6 +15151,28 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
> > 
> >         switch (base_type(rold->type)) {
> >         case SCALAR_VALUE:
> > +               /* Why check_ids() for scalar registers?
> > +                *
> > +                * Consider the following BPF code:
> > +                *   1: r6 = ... unbound scalar, ID=a ...
> > +                *   2: r7 = ... unbound scalar, ID=b ...
> > +                *   3: if (r6 > r7) goto +1
> > +                *   4: r6 = r7
> > +                *   5: if (r6 > X) goto ...
> > +                *   6: ... memory operation using r7 ...
> > +                *
> > +                * First verification path is [1-6]:
> > +                * - at (4) same bpf_reg_state::id (b) would be assigned to r6 and r7;
> > +                * - at (5) r6 would be marked <= X, find_equal_scalars() would also mark
> > +                *   r7 <= X, because r6 and r7 share same id.
> > +                *
> > +                * Next verification path would start from (5), because of the jump at (3).
> > +                * The only state difference between first and second visits of (5) is
> > +                * bpf_reg_state::id assignments for r6 and r7: (b, b) vs (a, b).
> > +                * Thus, use check_ids() to distinguish these states.
> > +                */
> > +               if (!check_ids(rold->id, rcur->id, idmap))
> > +                       return false;
> 
> does this check_ids() have to be performed before regs_exact (which
> also checks IDs, btw) *and* before rold->precise check?

Relative position to regs_exact() does not matter (because it does check_ids).
Relative position to rold->precise *does* matter (see next answer).

> Intuitively, it feels like `rold->precise = false` case shouldn't care
> about IDs in rcur, as any value should be safe. But I haven't spent
> much time thinking about this, so there might be some corner case I'm
> missing.

I tried to explain this in the cover letter, I'll try to add more
details below. Effectively what you suggest is to modify the check as
follows:

  if (rold->precise && !check_ids(rold->id, rcur->id, idmap))
    return false;

Unfortunately, not all registers with interesting IDs would be marked
as precise. Consider the original example but with r6 and r7 swapped:

  1: r9 = ... some pointer with range X ...
  2: r6 = ... unbound scalar ID=a ...
  3: r7 = ... unbound scalar ID=b ...
  4: if (r6 > r7) goto +1
  5: r7 = r6
  6: if (r7 > X) goto ...
  7: r9 += r6

Suppose that current verification path is 1-7:
- On a way down 1-6 r7 will not be marked as precise, because
  condition (r7 > X) is not predictable (see check_cond_jmp_op());
- When (7) is reached mark_chain_precision() will start moving up
  marking the following registers as precise:

  4: if (r6 > r7) goto +1 ; r6, r7
  5: r7 = r6              ; r6
  6: if (r7 > X) goto ... ; r6
  7: r9 += r6             ; r6

- Thus, if checkpoint is created for (6) r7 would be marked as read,
  but will not be marked as precise.
  
Next, suppose that jump from 4 to 6 is verified and checkpoint for (6)
is considered:
- r6 is not precise, so check_ids() is not called for it and it is not
  added to idmap;
- r7 is precise, so check_ids() is called for it, but it is a sole
  register in the idmap;
- States are considered equal.

Here is the log (I added a few prints for states cache comparison):

  from 10 to 13: safe
    steq hit 10, cur:
      R0=scalar(id=2) R6=scalar(id=2) R7=scalar(id=1) R9=fp-8 R10=fp0 fp-8=00000000
    steq hit 10, old:
      R6_rD=Pscalar(id=2) R7_rwD=scalar(id=2) R9_rD=fp-8 R10=fp0 fp-8_rD=00000000

The verifier_scalar_ids.c:ids_id_mapping_in_regsafe_2() is another example.
(test names are so-so...).

I'll quote myself:

  Ideally check_ids() should only be called for 'rold' that either:
  (a) gained range via find_equal_scalars() in some child verification
      path and was then marked as precise;
  (b) was a source of range information for some other register via
      find_equal_scalars() in some child verification path, and that
      register was then marked as precise.

Current implementation of mark_chain_precision() does not guarantee
precision mark for point (b).

This leads to a few questions:
- Q: Should (b) be guaranteed?
  A: I don't know. If patch #1 from this patch-set is applied,
     I cannot figure out any counter-example showing that (b)
     is necessary.
  Corollary: (b) is a performance optimization for patch #1.
- Q: How hard is it to implement (b)?
  A: Information about register id assignments for instructions in the
     middle of a state is lost. I see a few ways to mitigate this:
     - Extend bpf_verifier_state::jmp_history to track a mask for
       registers / stack slots that gained range via
       find_equal_scalars() and use this mask in backtrack_insn().
     - Make it so, that every conditional jump instruction is the last
       instruction in a state. Then register ID assignments should
       actually be valid, and backtrack_insn() could search for
       registers with the same ID when marking precise registers.
- Q: If (b) is merely an optimization, what is simpler (b) or patch #3
     (u32_hashset thing)?
  A: I think that patch #3 is logically simpler.

So, it boils down to a question: should (b) be guaranteed?
What do you think?

> I wonder if moving this check after we handled imprecise rold case would help.
> 
> Also, it might make sense to drop SCALAR register IDs as soon as we
> have only one instance of it left (e.g., if "paired" register was
> overwritten already). I.e., aggressively drop IDs when they become
> useless. WDYT?

I'll try, but don't expect a big change, will report a bit later.

> The u32_hashset.... Have you also measured verification time
> regression with this? I wonder how much impact that change has?

Summing up the duration_diff field for all object files gives an
average verification time increase of 0.5%.

> 
> >                 if (regs_exact(rold, rcur, idmap))
> >                         return true;
> >                 if (env->explore_alu_limits)
> > --
> > 2.40.1
> > 



* Re: [PATCH bpf-next v2 1/4] bpf: verify scalar ids mapping in regsafe() using check_ids()
       [not found]     ` <f2abf39bcd4de841a89bb248de9e242a880aaa93.camel@gmail.com>
@ 2023-05-31 18:29       ` Andrii Nakryiko
  2023-05-31 19:30         ` Eduard Zingerman
  0 siblings, 1 reply; 33+ messages in thread
From: Andrii Nakryiko @ 2023-05-31 18:29 UTC (permalink / raw)
  To: Eduard Zingerman; +Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yhs

On Wed, May 31, 2023 at 10:21 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Tue, 2023-05-30 at 14:37 -0700, Andrii Nakryiko wrote:
> [...]
> > Also, it might make sense to drop SCALAR register IDs as soon as we
> > have only one instance of it left (e.g., if "paired" register was
> > overwritten already). I.e., aggressively drop IDs when they become
> > useless. WDYT?
>
> I added modification which resets sole scalar IDs to zero before
> states comparison, it shows some speedup but is still slow:
>
>   Filter        | Number of programs | Number of programs
>                 | patch #1           | patch #1 + sole scalar ID pruning
>   ----------------------------------------------------------------------
>   states_pct>10 | 40                 | 40
>   states_pct>20 | 20                 | 19
>   states_pct>30 | 15                 | 13
>   states_pct>40 | 11                 | 8
>
> (Out of 177 programs).
>
> I'll modify mark_chain_precision() to propagate precision marks for
> find_equal_scalars(), so that it could be compared to current patch #3
> in terms of code complexity and verification performance.
>
> If you have any thoughts regarding my previous email, please share.
>

Yep, I do. Given SCALAR registers with the same ID are meant to "share
the destiny", shouldn't it be required that when we mark r6 as precise
we should automatically mark linked r7 as precise at the same point.
So in your example:

7: r9 += r6

should be where we request both r6 and r7 (and whatever other register
has the same ID) to be marked as precise. It should be very easy to
implement, especially given my recent refactoring with
mark_chain_precision_batch.

The question I have (and again, haven't spent any time thinking about
any other corner cases, sorry) is whether that alone would be a proper
fix?


As for this u32_hashset, it just feels like a big overkill, tbh. If we
have to do something like that, wouldn't it be better to, say, set
highest bit in reg->id (for all linked registers, of course) to mark
it as "used for range checks" instead of maintaining a separate check?

But just the whole idea of keeping track of some special circumstances
under which IDs are meaningful seems wrong... All this logic is
complicated, now we are adding another layer of complexity on top. And
the complexity is not in the code, it's in thinking about all possible
scenarios and their interactions.


> [...]


* Re: [PATCH bpf-next v2 1/4] bpf: verify scalar ids mapping in regsafe() using check_ids()
  2023-05-31 18:29       ` Andrii Nakryiko
@ 2023-05-31 19:30         ` Eduard Zingerman
  2023-05-31 20:12           ` Andrii Nakryiko
  0 siblings, 1 reply; 33+ messages in thread
From: Eduard Zingerman @ 2023-05-31 19:30 UTC (permalink / raw)
  To: Andrii Nakryiko; +Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yhs

On Wed, 2023-05-31 at 11:29 -0700, Andrii Nakryiko wrote:
> On Wed, May 31, 2023 at 10:21 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > 
> > On Tue, 2023-05-30 at 14:37 -0700, Andrii Nakryiko wrote:
> > [...]
> > > Also, it might make sense to drop SCALAR register IDs as soon as we
> > > have only one instance of it left (e.g., if "paired" register was
> > > overwritten already). I.e., aggressively drop IDs when they become
> > > useless. WDYT?
> > 
> > I added modification which resets sole scalar IDs to zero before
> > states comparison, it shows some speedup but is still slow:
> > 
> >   Filter        | Number of programs | Number of programs
> >                 | patch #1           | patch #1 + sole scalar ID pruning
> >   ----------------------------------------------------------------------
> >   states_pct>10 | 40                 | 40
> >   states_pct>20 | 20                 | 19
> >   states_pct>30 | 15                 | 13
> >   states_pct>40 | 11                 | 8
> > 
> > (Out of 177 programs).
> > 
> > I'll modify mark_chain_precision() to propagate precision marks for
> > find_equal_scalars(), so that it could be compared to current patch #3
> > in terms of code complexity and verification performance.
> > 
> > If you have any thoughts regarding my previous email, please share.
> > 
> 
> Yep, I do. Given SCALAR registers with the same ID are meant to "share
> the destiny", shouldn't it be required that when we mark r6 as precise
> we should automatically mark linked r7 as precise at the same point.
> So in your example:
> 
> 7: r9 += r6
> 
> should be where we request both r6 and r7 (and whatever other register
> has the same ID) to be marked as precise. It should be very easy to
> implement, especially given my recent refactoring with
> mark_chain_precision_batch.

Ok, I'll modify the `struct backtrack_state` as follows:

  struct backtrack_state {
	struct bpf_verifier_env *env;
	u32 frame;
	u32 reg_masks[MAX_CALL_FRAMES];
+	u32 reg_ids[MAX_CALL_FRAMES];
	u64 stack_masks[MAX_CALL_FRAMES];
+	u64 stack_ids[MAX_CALL_FRAMES];
  };

And add corresponding logic to backtrack_insn().

> The question I have (and again, haven't spent any time thinking about
> any other corner cases, sorry) is whether that alone would be a proper
> fix?

As far as I understand, in terms of correctness it would be a proper fix.
In terms of performance, I hope that it would be enough but we will see.

> As for this u32_hashset, it just feels like a big overkill, tbh. If we
> have to do something like that, wouldn't it be better to, say, set
> highest bit in reg->id (for all linked registers, of course) to mark
> it as "used for range checks" instead of maintaining a separate check?

Unfortunately no, because this ID change would have to be propagated
backwards. It was the first implementation I tried.

> But just the whole idea of keeping track of some special circumstances
> under which IDs are meaningful seems wrong... All this logic is
> complicated, now we are adding another layer of complexity on top. And
> the complexity is not in the code, it's in thinking about all possible
> scenarios and their interactions.

I agree that adding more layers is a complication in itself.
Thank you for your input.

> > [...]



* Re: [PATCH bpf-next v2 1/4] bpf: verify scalar ids mapping in regsafe() using check_ids()
  2023-05-31 19:30         ` Eduard Zingerman
@ 2023-05-31 20:12           ` Andrii Nakryiko
  2023-05-31 20:31             ` Eduard Zingerman
  0 siblings, 1 reply; 33+ messages in thread
From: Andrii Nakryiko @ 2023-05-31 20:12 UTC (permalink / raw)
  To: Eduard Zingerman; +Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yhs

On Wed, May 31, 2023 at 12:30 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Wed, 2023-05-31 at 11:29 -0700, Andrii Nakryiko wrote:
> > On Wed, May 31, 2023 at 10:21 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > >
> > > On Tue, 2023-05-30 at 14:37 -0700, Andrii Nakryiko wrote:
> > > [...]
> > > > Also, it might make sense to drop SCALAR register IDs as soon as we
> > > > have only one instance of it left (e.g., if "paired" register was
> > > > overwritten already). I.e., aggressively drop IDs when they become
> > > > useless. WDYT?
> > >
> > > I added modification which resets sole scalar IDs to zero before
> > > states comparison, it shows some speedup but is still slow:
> > >
> > >   Filter        | Number of programs | Number of programs
> > >                 | patch #1           | patch #1 + sole scalar ID pruning
> > >   ----------------------------------------------------------------------
> > >   states_pct>10 | 40                 | 40
> > >   states_pct>20 | 20                 | 19
> > >   states_pct>30 | 15                 | 13
> > >   states_pct>40 | 11                 | 8
> > >
> > > (Out of 177 programs).
> > >
> > > I'll modify mark_chain_precision() to propagate precision marks for
> > > find_equal_scalars(), so that it could be compared to current patch #3
> > > in terms of code complexity and verification performance.
> > >
> > > If you have any thoughts regarding my previous email, please share.
> > >
> >
> > Yep, I do. Given SCALAR registers with the same ID are meant to "share
> > the destiny", shouldn't it be required that when we mark r6 as precise
> > we should automatically mark linked r7 as precise at the same point.
> > So in your example:
> >
> > 7: r9 += r6
> >
> > should be where we request both r6 and r7 (and whatever other register
> > has the same ID) to be marked as precise. It should be very easy to
> > implement, especially given my recent refactoring with
> > mark_chain_precision_batch.
>
> Ok, I'll modify the `struct backtrack_state` as follows:
>
>   struct backtrack_state {
>         struct bpf_verifier_env *env;
>         u32 frame;
>         u32 reg_masks[MAX_CALL_FRAMES];
> +       u32 reg_ids[MAX_CALL_FRAMES];
>         u64 stack_masks[MAX_CALL_FRAMES];
> +       u64 stack_ids[MAX_CALL_FRAMES];
>   };
>
> And add corresponding logic to backtrack_insn().

I don't see why you need to change anything about backtrack_state at
all, so we are not on the same page.

What I propose is that in mark_chain_precision(), when given regno, go
over all *current* registers with the same ID, and set all of them as
"to be marked precise". And then call mark_chain_precision_batch().
See propagate_precision() for how we do similar stuff for bulk
precision propagation.

backtrack_state shouldn't need to know about IDs, unless I'm missing something
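
Roughly something like this would do (a sketch only; bt_set_reg() here
just stands for "set the precision request bit in env->bt for this
register", the real helper name/shape may differ):

  int mark_chain_precision(struct bpf_verifier_env *env, int regno)
  {
          struct bpf_reg_state *regs = cur_regs(env);
          int i;

          bt_set_reg(&env->bt, regno);
          /* also request precision for every register currently linked
           * to regno through the same scalar ID
           */
          if (regs[regno].type == SCALAR_VALUE && regs[regno].id) {
                  for (i = 0; i < BPF_REG_FP; i++)
                          if (regs[i].type == SCALAR_VALUE &&
                              regs[i].id == regs[regno].id)
                                  bt_set_reg(&env->bt, i);
          }
          return mark_chain_precision_batch(env);
  }
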

>
> > The question I have (and again, haven't spent any time thinking about
> > any other corner cases, sorry) is whether that alone would be a proper
> > fix?
>
> As far as I understand, in terms of correctness it would be a proper fix.
> In terms of performance, I hope that it would be enough but we will see.

ok, let's try that then, before we commit to u32_hashset stuff

>
> > As for this u32_hashset, it just feels like a big overkill, tbh. If we
> > have to do something like that, wouldn't it be better to, say, set
> > highest bit in reg->id (for all linked registers, of course) to mark
> > it as "used for range checks" instead of maintaining a separate check?
>
> Unfortunately no, because this ID change would have to be propagated
> backwards. It was the first implementation I tried.
>
> > But just the whole idea of keeping track of some special circumstances
> > under which IDs are meaningful seems wrong... All this logic is
> > complicated, now we are adding another layer of complexity on top. And
> > the complexity is not in the code, it's in thinking about all possible
> > scenarios and their interactions.
>
> I agree that adding more layers is a complication in itself.
> Thank you for your input.
>
> > > [...]
>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH bpf-next v2 1/4] bpf: verify scalar ids mapping in regsafe() using check_ids()
  2023-05-31 20:12           ` Andrii Nakryiko
@ 2023-05-31 20:31             ` Eduard Zingerman
  2023-05-31 21:14               ` Andrii Nakryiko
  0 siblings, 1 reply; 33+ messages in thread
From: Eduard Zingerman @ 2023-05-31 20:31 UTC (permalink / raw)
  To: Andrii Nakryiko; +Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yhs

On Wed, 2023-05-31 at 13:12 -0700, Andrii Nakryiko wrote:
> On Wed, May 31, 2023 at 12:30 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > 
> > On Wed, 2023-05-31 at 11:29 -0700, Andrii Nakryiko wrote:
> > > On Wed, May 31, 2023 at 10:21 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > > > 
> > > > On Tue, 2023-05-30 at 14:37 -0700, Andrii Nakryiko wrote:
> > > > [...]
> > > > > Also, it might make sense to drop SCALAR register IDs as soon as we
> > > > > have only one instance of it left (e.g., if "paired" register was
> > > > > overwritten already). I.e., aggressively drop IDs when they become
> > > > > useless. WDYT?
> > > > 
> > > > I added modification which resets sole scalar IDs to zero before
> > > > states comparison, it shows some speedup but is still slow:
> > > > 
> > > >   Filter        | Number of programs | Number of programs
> > > >                 | patch #1           | patch #1 + sole scalar ID pruning
> > > >   ----------------------------------------------------------------------
> > > >   states_pct>10 | 40                 | 40
> > > >   states_pct>20 | 20                 | 19
> > > >   states_pct>30 | 15                 | 13
> > > >   states_pct>40 | 11                 | 8
> > > > 
> > > > (Out of 177 programs).
> > > > 
> > > > I'll modify mark_chain_precision() to propagate precision marks for
> > > > find_equal_scalars(), so that it could be compared to current patch #3
> > > > in terms of code complexity and verification performance.
> > > > 
> > > > If you have any thoughts regarding my previous email, please share.
> > > > 
> > > 
> > > Yep, I do. Given SCALAR registers with the same ID are meant to "share
> > > the destiny", shouldn't it be required that when we mark r6 as precise
> > > we should automatically mark linked r7 as precise at the same point.
> > > So in your example:
> > > 
> > > 7: r9 += r6
> > > 
> > > should be where we request both r6 and r7 (and whatever other register
> > > has the same ID) to be marked as precise. It should be very easy to
> > > implement, especially given my recent refactoring with
> > > mark_chain_precision_batch.
> > 
> > Ok, I'll modify the `struct backtrack_state` as follows:
> > 
> >   struct backtrack_state {
> >         struct bpf_verifier_env *env;
> >         u32 frame;
> >         u32 reg_masks[MAX_CALL_FRAMES];
> > +       u32 reg_ids[MAX_CALL_FRAMES];
> >         u64 stack_masks[MAX_CALL_FRAMES];
> > +       u64 stack_ids[MAX_CALL_FRAMES];
> >   };
> > 
> > And add corresponding logic to backtrack_insn().
> 
> I don't see why you need to change anything about backtrack_state at
> all, so we are not on the same page.
> 
> What I propose is that in mark_chain_precision(), when given regno, go
> over all *current* registers with the same ID, and set all of them as
> "to be marked precise". And then call mark_chain_precision_batch().
> See propagate_precision() for how we do similar stuff for bulk
> precision propagation.
> 
> backtrack_state shouldn't need to know about IDs, unless I'm missing something

Consider a modified version of the original example:

  1: r9 = ... some pointer with range X ...
  2: r6 = ... unbound scalar ID=a ...
  3: r7 = ... unbound scalar ID=b ...
  4: if (r6 > r7) goto +1
  5: r7 = r6
  6: if (r7 > X) goto ...
  7: r7 = 0
  8: r9 += r6

Suppose verification path is 1-8, the ID assignments are:

                            r6.id r7.id
  4: if (r6 > r7) goto +1   a     b
  5: r7 = r6                a     a
  6: if (r7 > X) goto ...   a     a
  7: r7 = 0                 a     -
  8: r9 += r6               a     -

When mark_chain_precision() is called at (8) it no longer knows that
r6 and r7 shared the same ID at some point in the past. What you
suggest won't work in this case, and what I suggest won't work either.

Looks like the only way to do it correctly is to save ID assignments
after each find_equal_scalars() (for (6) in this example) in the
bpf_verifier_state, and use this information in backtrack_insn().
For example, extend jmp_history.
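
Something like the following per-entry record would do (names are
hypothetical, just to make the idea concrete):

  /* one record per find_equal_scalars() call that transferred a range,
   * so that backtrack_insn() could mark all registers / stack slots
   * linked at that point as precise when it replays the instruction
   */
  struct linked_scalars_rec {
          u32 insn_idx;                      /* insn that narrowed the range */
          u32 reg_masks[MAX_CALL_FRAMES];    /* linked registers, per frame */
          u64 stack_masks[MAX_CALL_FRAMES];  /* linked stack slots, per frame */
  };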

> 
> > 
> > > The question I have (and again, haven't spent any time thinking about
> > > any other corner cases, sorry) is whether that alone would be a proper
> > > fix?
> > 
> > As far as I understand, in terms of correctness it would be a proper fix.
> > In terms of performance, I hope that it would be enough but we will see.
> 
> ok, let's try that then, before we commit to u32_hashset stuff
> 
> > 
> > > As for this u32_hashset, it just feels like a big overkill, tbh. If we
> > > have to do something like that, wouldn't it be better to, say, set
> > > highest bit in reg->id (for all linked registers, of course) to mark
> > > it as "used for range checks" instead of maintaining a separate check?
> > 
> > Unfortunately no, because this ID change would have to be propagated
> > backwards. It was the first implementation I tried.
> > 
> > > But just the whole idea of keeping track of some special circumstances
> > > under which IDs are meaningful seems wrong... All this logic is
> > > complicated, now we are adding another layer of complexity on top. And
> > > the complexity is not in the code, it's in thinking about all possible
> > > scenarios and their interactions.
> > 
> > I agree that adding more layers is a complication in itself.
> > Thank you for your input.
> > 
> > > > [...]
> > 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH bpf-next v2 1/4] bpf: verify scalar ids mapping in regsafe() using check_ids()
  2023-05-31 20:31             ` Eduard Zingerman
@ 2023-05-31 21:14               ` Andrii Nakryiko
  2023-05-31 22:15                 ` Eduard Zingerman
  0 siblings, 1 reply; 33+ messages in thread
From: Andrii Nakryiko @ 2023-05-31 21:14 UTC (permalink / raw)
  To: Eduard Zingerman; +Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yhs

On Wed, May 31, 2023 at 1:31 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Wed, 2023-05-31 at 13:12 -0700, Andrii Nakryiko wrote:
> > On Wed, May 31, 2023 at 12:30 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > >
> > > On Wed, 2023-05-31 at 11:29 -0700, Andrii Nakryiko wrote:
> > > > On Wed, May 31, 2023 at 10:21 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > > > >
> > > > > On Tue, 2023-05-30 at 14:37 -0700, Andrii Nakryiko wrote:
> > > > > [...]
> > > > > > Also, it might make sense to drop SCALAR register IDs as soon as we
> > > > > > have only one instance of it left (e.g., if "paired" register was
> > > > > > overwritten already). I.e., aggressively drop IDs when they become
> > > > > > useless. WDYT?
> > > > >
> > > > > I added modification which resets sole scalar IDs to zero before
> > > > > states comparison, it shows some speedup but is still slow:
> > > > >
> > > > >   Filter        | Number of programs | Number of programs
> > > > >                 | patch #1           | patch #1 + sole scalar ID pruning
> > > > >   ----------------------------------------------------------------------
> > > > >   states_pct>10 | 40                 | 40
> > > > >   states_pct>20 | 20                 | 19
> > > > >   states_pct>30 | 15                 | 13
> > > > >   states_pct>40 | 11                 | 8
> > > > >
> > > > > (Out of 177 programs).
> > > > >
> > > > > I'll modify mark_chain_precision() to propagate precision marks for
> > > > > find_equal_scalars(), so that it could be compared to current patch #3
> > > > > in terms of code complexity and verification performance.
> > > > >
> > > > > If you have any thoughts regarding my previous email, please share.
> > > > >
> > > >
> > > > Yep, I do. Given SCALAR registers with the same ID are meant to "share
> > > > the destiny", shouldn't it be required that when we mark r6 as precise
> > > > we should automatically mark linked r7 as precise at the same point.
> > > > So in your example:
> > > >
> > > > 7: r9 += r6
> > > >
> > > > should be where we request both r6 and r7 (and whatever other register
> > > > has the same ID) to be marked as precise. It should be very easy to
> > > > implement, especially given my recent refactoring with
> > > > mark_chain_precision_batch.
> > >
> > > Ok, I'll modify the `struct backtrack_state` as follows:
> > >
> > >   struct backtrack_state {
> > >         struct bpf_verifier_env *env;
> > >         u32 frame;
> > >         u32 reg_masks[MAX_CALL_FRAMES];
> > > +       u32 reg_ids[MAX_CALL_FRAMES];
> > >         u64 stack_masks[MAX_CALL_FRAMES];
> > > +       u64 stack_ids[MAX_CALL_FRAMES];
> > >   };
> > >
> > > And add corresponding logic to backtrack_insn().
> >
> > I don't see why you need to change anything about backtrack_state at
> > all, so we are not on the same page.
> >
> > What I propose is that in mark_chain_precision(), when given regno, go
> > over all *current* registers with the same ID, and set all of them as
> > "to be marked precise". And then call mark_chain_precision_batch().
> > See propagate_precision() for how we do similar stuff for bulk
> > precision propagation.
> >
> > backtrack_state shouldn't need to know about IDs, unless I'm missing something
>
> Consider a modified version of the original example:
>
>   1: r9 = ... some pointer with range X ...
>   2: r6 = ... unbound scalar ID=a ...
>   3: r7 = ... unbound scalar ID=b ...
>   4: if (r6 > r7) goto +1
>   5: r7 = r6
>   6: if (r7 > X) goto ...
>   7: r7 = 0
>   8: r9 += r6
>
> Suppose verification path is 1-8, the ID assignments are:
>
>                             r6.id r7.id
>   4: if (r6 > r7) goto +1   a     b
>   5: r7 = r6                a     a
>   6: if (r7 > X) goto ...   a     a
>   7: r7 = 0                 a     -
>   8: r9 += r6               a     -
>
> When mark_chain_precision() is called at (8) it no longer knows that
> r6 and r7 shared the same ID at some point in the past. What you
> suggest won't work in this case, and what I suggest won't work either.

Let's not jump the gun here. If r6 and r7 are not linked anymore, then
a) precision of either r6 or r7 didn't matter before and b) r6
precision doesn't imply r7 precision anymore, because r7 is completely
independent now.

If there is some other code path at which r7's precise range would
matter, then a) in that other code path r6 and r7 would still be
linked and thus proposed precision tracking will mark both as precise,
or b) r7 would be already independent and precision tracking will mark
only r7 directly.

The way I see it, in a bit of an abstract way, if two registers are
linked through the same ID, then *until the point that they diverge*,
we have to make sure their states are the same. We do that with range
on each operation, but the precise bit is different, as it's
backtracked. So that's what my proposed fix to mark_chain_precision() is
doing: it makes sure we link precise bits for these registers
together. Yes, we will miss some intermediate state where r6 and r7
were linked, but if that link didn't matter (because neither r6 nor r7
were marked as precise at that point), then it is effectively as if those
IDs were not even assigned at all.

BTW, as soon as one of the linked registers is modified, I hope we are
breaking the link and dropping the ID (I haven't checked the code, but
that is how it should work, I think). Please double check as you look
at this code, thanks!

Let's see if you can come up with another counterexample to this line
of thinking, I'm very open to me missing another detail, happens all
the time, but discussions like this are the only way to learn about
such details :)

>
> Looks like the only way to do it correctly is to save ID assignments
> after each find_equal_scalars() (for (6) in this example) in the
> bpf_verifier_state, and use this information in backtrack_insn().
> For example, extend jmp_history.
>
> >
> > >
> > > > The question I have (and again, haven't spent any time thinking about
> > > > any other corner cases, sorry) is whether that alone would be a proper
> > > > fix?
> > >
> > > As far as I understand, in terms of correctness it would be a proper fix.
> > > In terms of performance, I hope that it would be enough but we will see.
> >
> > ok, let's try that then, before we commit to u32_hashset stuff
> >
> > >
> > > > As for this u32_hashset, it just feels like a big overkill, tbh. If we
> > > > have to do something like that, wouldn't it be better to, say, set
> > > > highest bit in reg->id (for all linked registers, of course) to mark
> > > > it as "used for range checks" instead of maintaining a separate check?
> > >
> > > Unfortunately no, because this ID change would have to be propagated
> > > backwards. It was the first implementation I tried.
> > >
> > > > But just the whole idea of keeping track of some special circumstances
> > > > under which IDs are meaningful seems wrong... All this logic is
> > > > complicated, now we are adding another layer of complexity on top. And
> > > > the complexity is not in the code, it's in thinking about all possible
> > > > scenarios and their interactions.
> > >
> > > I agree that adding more layers is a complication in itself.
> > > Thank you for your input.
> > >
> > > > > [...]
> > >
>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH bpf-next v2 1/4] bpf: verify scalar ids mapping in regsafe() using check_ids()
  2023-05-31 21:14               ` Andrii Nakryiko
@ 2023-05-31 22:15                 ` Eduard Zingerman
  2023-05-31 22:54                   ` Andrii Nakryiko
  0 siblings, 1 reply; 33+ messages in thread
From: Eduard Zingerman @ 2023-05-31 22:15 UTC (permalink / raw)
  To: Andrii Nakryiko; +Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yhs

On Wed, 2023-05-31 at 14:14 -0700, Andrii Nakryiko wrote:
> On Wed, May 31, 2023 at 1:31 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > 
> > On Wed, 2023-05-31 at 13:12 -0700, Andrii Nakryiko wrote:
> > > On Wed, May 31, 2023 at 12:30 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > > > 
> > > > On Wed, 2023-05-31 at 11:29 -0700, Andrii Nakryiko wrote:
> > > > > On Wed, May 31, 2023 at 10:21 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > > > > > 
> > > > > > On Tue, 2023-05-30 at 14:37 -0700, Andrii Nakryiko wrote:
> > > > > > [...]
> > > > > > > Also, it might make sense to drop SCALAR register IDs as soon as we
> > > > > > > have only one instance of it left (e.g., if "paired" register was
> > > > > > > overwritten already). I.e., aggressively drop IDs when they become
> > > > > > > useless. WDYT?
> > > > > > 
> > > > > > I added modification which resets sole scalar IDs to zero before
> > > > > > states comparison, it shows some speedup but is still slow:
> > > > > > 
> > > > > >   Filter        | Number of programs | Number of programs
> > > > > >                 | patch #1           | patch #1 + sole scalar ID pruning
> > > > > >   ----------------------------------------------------------------------
> > > > > >   states_pct>10 | 40                 | 40
> > > > > >   states_pct>20 | 20                 | 19
> > > > > >   states_pct>30 | 15                 | 13
> > > > > >   states_pct>40 | 11                 | 8
> > > > > > 
> > > > > > (Out of 177 programs).
> > > > > > 
> > > > > > I'll modify mark_chain_precision() to propagate precision marks for
> > > > > > find_equal_scalars(), so that it could be compared to current patch #3
> > > > > > in terms of code complexity and verification performance.
> > > > > > 
> > > > > > If you have any thoughts regarding my previous email, please share.
> > > > > > 
> > > > > 
> > > > > Yep, I do. Given SCALAR registers with the same ID are meant to "share
> > > > > the destiny", shouldn't it be required that when we mark r6 as precise
> > > > > we should automatically mark linked r7 as precise at the same point.
> > > > > So in your example:
> > > > > 
> > > > > 7: r9 += r6
> > > > > 
> > > > > should be where we request both r6 and r7 (and whatever other register
> > > > > has the same ID) to be marked as precise. It should be very easy to
> > > > > implement, especially given my recent refactoring with
> > > > > mark_chain_precision_batch.
> > > > 
> > > > Ok, I'll modify the `struct backtrack_state` as follows:
> > > > 
> > > >   struct backtrack_state {
> > > >         struct bpf_verifier_env *env;
> > > >         u32 frame;
> > > >         u32 reg_masks[MAX_CALL_FRAMES];
> > > > +       u32 reg_ids[MAX_CALL_FRAMES];
> > > >         u64 stack_masks[MAX_CALL_FRAMES];
> > > > +       u64 stack_ids[MAX_CALL_FRAMES];
> > > >   };
> > > > 
> > > > And add corresponding logic to backtrack_insn().
> > > 
> > > I don't see why you need to change anything about backtrack_state at
> > > all, so we are not on the same page.
> > > 
> > > What I propose is that in mark_chain_precision(), when given regno, go
> > > over all *current* registers with the same ID, and set all of them as
> > > "to be marked precise". And then call mark_chain_precision_batch().
> > > See propagate_precision() for how we do similar stuff for bulk
> > > precision propagation.
> > > 
> > > backtrack_state shouldn't need to know about IDs, unless I'm missing something
> > 
> > Consider a modified version of the original example:
> > 
> >   1: r9 = ... some pointer with range X ...
> >   2: r6 = ... unbound scalar ID=a ...
> >   3: r7 = ... unbound scalar ID=b ...
> >   4: if (r6 > r7) goto +1
> >   5: r7 = r6
> >   6: if (r7 > X) goto ...
> >   7: r7 = 0
> >   8: r9 += r6
> > 
> > Suppose verification path is 1-8, the ID assignments are:
> > 
> >                             r6.id r7.id
> >   4: if (r6 > r7) goto +1   a     b
> >   5: r7 = r6                a     a
> >   6: if (r7 > X) goto ...   a     a
> >   7: r7 = 0                 a     -
> >   8: r9 += r6               a     -
> > 
> > When mark_chain_precision() is called at (8) it no longer knows that
> > r6 and r7 shared the same ID at some point in the past. What you
> > suggest won't work in this case, and what I suggest won't work either.
> 
> Let's not jump the gun here. If r6 and r7 are not linked anymore, then
> a) precision of either r6 or r7 didn't matter before and b) r6
> precision doesn't imply r7 precision anymore, because r7 is completely
> independent now.
> 
> If there is some other code path at which r7's precise range would
> matter, then a) in that other code path r6 and r7 would still be
> linked and thus proposed precision tracking will mark both as precise,
> or b) r7 would be already independent and precision tracking will mark
> only r7 directly.

Just to be on the same page, you suggest doing the following
modification:

  int mark_chain_precision(struct bpf_verifier_env *env, int regno)
  {
    // set env->bt flag for regno
    // set env->bt flag for each register with the same id as regno
    return mark_chain_precision_batch(env);
  }

right?

It doesn't work for the example at hand:
- First, the path 1-8 would be marked as safe.
  Precision marks would be propagated only for r6.
  Also, let's suppose that checkpoint is created at (6):

   4: if (r6 > r7) goto +1 ; r6=Pscalar, r7=Pscalar
   5: r7 = r6              ; r6=Pscalar
   --- checkpoint #1 ---
   6: if (r7 > X) goto ... ; r6=Pscalar
   7: r7 = 0               ; r6=Pscalar
   8: r9 += r6             ; r6=Pscalar

- Next, the jump (6) to ... is verified, suppose it is safe.
- Next, the jump from 4 to 6 would be verified, the checkpoint #1
  would be considered. For this checkpoint only r6 is marked precise,
  so only r6 would be fed to check_ids(), thus regsafe() would return true.
  For correct regsafe() operation we need precision marks for both
  r7 and r6 at checkpoint #1.

Now, a question, is it possible to modify your suggestion?
- all register assignments are known at the beginning of checkpoint #1;
- checkpoint #1 is a parent state of a state visiting (8)
  (the state in which mark_chain_precision(env, 6) is called);
- __mark_chain_precision() can be modified to do a similar thing every
  time parent state is reached:
  - look for IDs of registers currently marked as precise;
  - mark all registers with these IDs as precise (rough sketch below).
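
In code, that per-state step could look roughly like this (the helper is
hypothetical and its exact call site in __mark_chain_precision() is left
out; this is only to make the idea concrete):

  /* given the set of registers already requested precise (reg_mask) in
   * func state @st, also request every register that currently shares
   * a scalar ID with one of them
   */
  static u32 expand_precise_mask_by_ids(struct bpf_func_state *st, u32 reg_mask)
  {
          struct bpf_reg_state *regs = st->regs;
          int i, j;

          for (i = 0; i < BPF_REG_FP; i++) {
                  if (!(reg_mask & (1u << i)))
                          continue;
                  if (regs[i].type != SCALAR_VALUE || !regs[i].id)
                          continue;
                  for (j = 0; j < BPF_REG_FP; j++)
                          if (regs[j].type == SCALAR_VALUE &&
                              regs[j].id == regs[i].id)
                                  reg_mask |= 1u << j;
          }
          return reg_mask;
  }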

But does this modification have any loopholes for more complex cases?
E.g. can we get in trouble with the following modification of the
original example?:

   4: if (r6 > r7) goto +1 
   5: r7 = r6              
   --- checkpoint #1 ---
   6: <something>
   7: if (r7 > X) goto ... 
   8: r7 = 0               
   9: r9 += r6             

It looks like your next paragraph applies here:
- if <something> modifies r7: there is no r6/r7 link anymore;
- if <something> modifies r6: same thing;
- if <something> does not touch r7/r6 the ID assignments for r7/r6 at
  checkpoint #1 should be fine;

Still, sounds a bit fragile.

> The way I see it, in a bit of an abstract way, if two registers are
> linked through the same ID, then *until the point that they diverge*,
> we have to make sure their states are the same. We do that with range
> on each operation, but the precise bit is different, as it's
> backtracked. So that's what my proposed fix to mark_chain_precision() is
> doing: it makes sure we link precise bits for these registers
> together. Yes, we will miss some intermediate state where r6 and r7
> were linked, but if that link didn't matter (because neither r6 nor r7
> were marked as precise at that point), then it is effectively as if those
> IDs were not even assigned at all.
> 
> BTW, as soon as one of the linked registers is modified, I hope we are
> breaking the link and dropping the ID (I haven't checked the code, but
> that is how it should work, I think). Please double check as you look
> at this code, thanks!

This is my understanding as well, but will double-check.

> Let's see if you can come up with another counterexample to this line
> of thinking, I'm very open to me missing another detail, happens all
> the time, but discussions like this are the only way to learn about
> such details :)
 
Discussion is good, I knew you wouldn't like the u32_hashset :)

> > 
> > Looks like the only way to do it correctly is to save ID assignments
> > after each find_equal_scalars() (for (6) in this example) in the
> > bpf_verifier_state, and use this information in backtrack_insn().
> > For example, extend jmp_history.
> > 
> > > 
> > > > 
> > > > > The question I have (and again, haven't spent any time thinking about
> > > > > any other corner cases, sorry) is whether that alone would be a proper
> > > > > fix?
> > > > 
> > > > As far as I understand, in terms of correctness it would be a proper fix.
> > > > In terms of performance, I hope that it would be enough but we will see.
> > > 
> > > ok, let's try that then, before we commit to u32_hashset stuff
> > > 
> > > > 
> > > > > As for this u32_hashset, it just feels like a big overkill, tbh. If we
> > > > > have to do something like that, wouldn't it be better to, say, set
> > > > > highest bit in reg->id (for all linked registers, of course) to mark
> > > > > it as "used for range checks" instead of maintaining a separate check?
> > > > 
> > > > Unfortunately no, because this ID change would have to be propagated
> > > > backwards. It was the first implementation I tried.
> > > > 
> > > > > But just the whole idea of keeping track of some special circumstances
> > > > > under which IDs are meaningful seems wrong... All this logic is
> > > > > complicated, now we are adding another layer of complexity on top. And
> > > > > the complexity is not in the code, it's in thinking about all possible
> > > > > scenarios and their interactions.
> > > > 
> > > > I agree that adding more layers is a complication in itself.
> > > > Thank you for your input.
> > > > 
> > > > > > [...]
> > > > 
> > 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH bpf-next v2 1/4] bpf: verify scalar ids mapping in regsafe() using check_ids()
  2023-05-31 22:15                 ` Eduard Zingerman
@ 2023-05-31 22:54                   ` Andrii Nakryiko
  2023-05-31 23:42                     ` Eduard Zingerman
  0 siblings, 1 reply; 33+ messages in thread
From: Andrii Nakryiko @ 2023-05-31 22:54 UTC (permalink / raw)
  To: Eduard Zingerman; +Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yhs

On Wed, May 31, 2023 at 3:15 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Wed, 2023-05-31 at 14:14 -0700, Andrii Nakryiko wrote:
> > On Wed, May 31, 2023 at 1:31 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > >
> > > On Wed, 2023-05-31 at 13:12 -0700, Andrii Nakryiko wrote:
> > > > On Wed, May 31, 2023 at 12:30 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > > > >
> > > > > On Wed, 2023-05-31 at 11:29 -0700, Andrii Nakryiko wrote:
> > > > > > On Wed, May 31, 2023 at 10:21 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > > > > > >
> > > > > > > On Tue, 2023-05-30 at 14:37 -0700, Andrii Nakryiko wrote:
> > > > > > > [...]
> > > > > > > > Also, it might make sense to drop SCALAR register IDs as soon as we
> > > > > > > > have only one instance of it left (e.g., if "paired" register was
> > > > > > > > overwritten already). I.e., aggressively drop IDs when they become
> > > > > > > > useless. WDYT?
> > > > > > >
> > > > > > > I added modification which resets sole scalar IDs to zero before
> > > > > > > states comparison, it shows some speedup but is still slow:
> > > > > > >
> > > > > > >   Filter        | Number of programs | Number of programs
> > > > > > >                 | patch #1           | patch #1 + sole scalar ID pruning
> > > > > > >   ----------------------------------------------------------------------
> > > > > > >   states_pct>10 | 40                 | 40
> > > > > > >   states_pct>20 | 20                 | 19
> > > > > > >   states_pct>30 | 15                 | 13
> > > > > > >   states_pct>40 | 11                 | 8
> > > > > > >
> > > > > > > (Out of 177 programs).
> > > > > > >
> > > > > > > I'll modify mark_chain_precision() to propagate precision marks for
> > > > > > > find_equal_scalars(), so that it could be compared to current patch #3
> > > > > > > in terms of code complexity and verification performance.
> > > > > > >
> > > > > > > If you have any thoughts regarding my previous email, please share.
> > > > > > >
> > > > > >
> > > > > > Yep, I do. Given SCALAR registers with the same ID are meant to "share
> > > > > > the destiny", shouldn't it be required that when we mark r6 as precise
> > > > > > we should automatically mark linked r7 as precise at the same point.
> > > > > > So in your example:
> > > > > >
> > > > > > 7: r9 += r6
> > > > > >
> > > > > > should be where we request both r6 and r7 (and whatever other register
> > > > > > has the same ID) to be marked as precise. It should be very easy to
> > > > > > implement, especially given my recent refactoring with
> > > > > > mark_chain_precision_batch.
> > > > >
> > > > > Ok, I'll modify the `struct backtrack_state` as follows:
> > > > >
> > > > >   struct backtrack_state {
> > > > >         struct bpf_verifier_env *env;
> > > > >         u32 frame;
> > > > >         u32 reg_masks[MAX_CALL_FRAMES];
> > > > > +       u32 reg_ids[MAX_CALL_FRAMES];
> > > > >         u64 stack_masks[MAX_CALL_FRAMES];
> > > > > +       u64 stack_ids[MAX_CALL_FRAMES];
> > > > >   };
> > > > >
> > > > > And add corresponding logic to backtrack_insn().
> > > >
> > > > I don't see why you need to change anything about backtrack_state at
> > > > all, so we are not on the same page.
> > > >
> > > > What I propose is that in mark_chain_precision(), when given regno, go
> > > > over all *current* registers with the same ID, and set all of them as
> > > > "to be marked precise". And then call mark_chain_precision_batch().
> > > > See propagate_precision() for how we do similar stuff for bulk
> > > > precision propagation.
> > > >
> > > > backtrack_state shouldn't need to know about IDs, unless I'm missing something
> > >
> > > Consider a modified version of the original example:
> > >
> > >   1: r9 = ... some pointer with range X ...
> > >   2: r6 = ... unbound scalar ID=a ...
> > >   3: r7 = ... unbound scalar ID=b ...
> > >   4: if (r6 > r7) goto +1
> > >   5: r7 = r6
> > >   6: if (r7 > X) goto ...
> > >   7: r7 = 0
> > >   8: r9 += r6
> > >
> > > Suppose verification path is 1-8, the ID assignments are:
> > >
> > >                             r6.id r7.id
> > >   4: if (r6 > r7) goto +1   a     b
> > >   5: r7 = r6                a     a
> > >   6: if (r7 > X) goto ...   a     a
> > >   7: r7 = 0                 a     -
> > >   8: r9 += r6               a     -
> > >
> > > When mark_chain_precision() is called at (8) it no longer knows that
> > > r6 and r7 shared the same ID at some point in the past. What you
> > > suggest won't work in this case, and what I suggest won't work either.
> >
> > Let's not jump the gun here. If r6 and r7 are not linked anymore, then
> > a) precision of either r6 or r7 didn't matter before and b) r6
> > precision doesn't imply r7 precision anymore, because r7 is completely
> > independent now.
> >
> > If there is some other code path at which r7's precise range would
> > matter, then a) in that other code path r6 and r7 would still be
> > linked and thus proposed precision tracking will mark both as precise,
> > or b) r7 would be already independent and precision tracking will mark
> > only r7 directly.
>
> Just to be on the same page, you suggest doing the following
> modification:
>
>   int mark_chain_precision(struct bpf_verifier_env *env, int regno)
>   {
>     // set env->bt flag for regno
>         // set env->bt flag for each register with the same id as regno
>         return mark_chain_precision_batch(env);
>   }
>
> right?

yep

>
> It doesn't work for the example at hand:
> - First, the path 1-8 would be marked as safe.
>   Precision marks would be propagated only for r6.
>   Also, let's suppose that checkpoint is created at (6):
>
>    4: if (r6 > r7) goto +1 ; r6=Pscalar, r7=Pscalar
>    5: r7 = r6              ; r6=Pscalar
>    --- checkpoint #1 ---
>    6: if (r7 > X) goto ... ; r6=Pscalar
>    7: r7 = 0               ; r6=Pscalar
>    8: r9 += r6             ; r6=Pscalar
>
> - Next, the jump (6) to ... is verified, suppose it is safe.
> - Next, the jump from 4 to 6 would be verified, the checkpoint #1
>   would be considered. For this checkpoint only r6 is marked precise,
>   so only r6 would be fed to check_ids(), thus regsafe() would return true.
>   For correct regsafe() operation we need precision marks for both
>   r7 and r6 at checkpoint #1.

right :(

>
> Now, a question, is it possible to modify your suggestion?
> - all register assignments are known at the beginning of checkpoint #1;
> - checkpoint #1 is a parent state of a state visiting (8)
>   (the state in which mark_chain_precision(env, 6) is called);
> - __mark_chain_precision() can be modified to do a similar thing every
>   time parent state is reached:
>   - look for IDs of registers currently marked as precise;
>   - mark all registers with these IDs as precise.

good point, this would have to happen within each state in which we
need to mark precisions

>
> But does this modification have any loopholes for more complex cases?
> E.g. can we get in trouble with the following modification of the
> original example?:
>
>    4: if (r6 > r7) goto +1
>    5: r7 = r6
>    --- checkpoint #1 ---
>    6: <something>
>    7: if (r7 > X) goto ...
>    8: r7 = 0
>    9: r9 += r6
>
> It looks like your next paragraph applies here:
> - if <something> modifies r7: there is no r6/r7 link anymore;
> - if <something> modifies r6: same thing;
> - if <something> does not touch r7/r6 the ID assignments for r7/r6 at
>   checkpoint #1 should be fine;
>
> Still, sounds a bit fragile.

well, what can I say... the "force all imprecise" logic isn't that
straightforward either, but so far it still holds. And the big idea
here is similar: whatever happens between two checkpoints doesn't
matter if its effect is not visible at the end of the checkpoint.

So I guess the intent of my proposal is correct: every time we mark r6
as precise, we should mark r7 as well in each state in which they are
still linked. Which necessitates doing this on each walk up the state
chain.

At least let's give it a try and see how it holds up against existing
tests and whatever test you can come up with?..

>
> > The way I see it, in a bit of an abstract way, if two registers are
> > linked through the same ID, then *until the point that they diverge*,
> > we have to make sure their states are the same. We do that with range
> > on each operation, but the precise bit is different, as it's
> > backtracked. So that's what my proposed fix to mark_chain_precision() is
> > doing: it makes sure we link precise bits for these registers
> > together. Yes, we will miss some intermediate state where r6 and r7
> > were linked, but if that link didn't matter (because neither r6 nor r7
> > were marked as precise at that point), then it is effectively as if those
> > IDs were not even assigned at all.
> >
> > BTW, as soon as one of the linked registers is modified, I hope we are
> > breaking the link and dropping the ID (I haven't checked the code, but
> > that is how it should work, I think). Please double check as you look
> > at this code, thanks!
>
> This is my understanding as well, but will double-check.

sounds good

>
> > Let's see if you can come up with another counterexample to this line
> > of thinking, I'm very open to me missing another detail, happens all
> > the time, but discussions like this are the only way to learn about
> > such details :)
>
> Discussion is good, I knew you wouldn't like the u32_hashset :)

yeah, I don't.


BTW, I did contemplate extending jmp_history to contain extra flags
about "interesting" instructions, though. Specifically, last
unsupported case for precision backtracking is when register other
than r10 is used for stack access (which can happen when one passes a
pointer to a SCALAR to parent function's stack), for which having a
bit next to such instruction saying "this is really a stack access"
would help cover the last class of unsupported situations.
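
E.g. something like this (names invented for illustration, assuming
today's jmp_history entries are just (prev_idx, idx) pairs):

  struct bpf_jmp_history_entry {
          u32 prev_idx;
          u32 idx;
          u32 flags;  /* e.g. "this insn accessed stack via a reg != r10" */
  };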

But this is a pretty significant complication. And to make it really
practical, we'd need to think very hard on how to implement
jmp_history more efficiently, without constant reallocations. I have a
hunch that jmp_history should be one large resizable array that all
states share and just point into different parts of it. When state is
pushed to the stack, we just remember at which index it starts. When a
state is finalized, its jump history segment shouldn't be needed by
that state and can be reused by its siblings and parent states.
Ultimately, we only have a linear chain of actively worked-on states
which do use jmp_history, and all others either don't need it
*anymore* (verified states) or don't need it *yet* (enqueued states).
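
I.e. roughly (all names hypothetical, entry type as sketched above):

  struct jmp_history_pool {
          struct bpf_jmp_history_entry *ents; /* one shared grow-only array */
          u32 used;  /* entries used by the currently active state chain */
          u32 cap;   /* allocated capacity */
  };

  /* each in-flight state would then only remember a single u32 index
   * into the pool where its segment starts, instead of owning a private
   * reallocated copy of the history
   */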

This would allow us to even have an exact "log of execution" with
insn_idx and associated extra information, but it will be code path
dependent, unlike bpf_insn_aux. And the best property is that it will
never grow beyond 1 million instructions deep (absolute worst case).

We might not even need backtracking in its current form if we just
proactively maintain involved registers information (something that we
currently derive during instruction interpretation in backtrack_insn).

So at some point I'd like to get to thinking and implementing this,
but it isn't the most pressing need right now, of course.

>
> > >
> > > Looks like the only way to do it correctly is to save ID assignments
> > > after each find_equal_scalars() (for (6) in this example) in the
> > > bpf_verifier_state, and use this information in backtrack_insn().
> > > For example, extend jmp_history.
> > >
> > > >
> > > > >
> > > > > > The question I have (and again, haven't spent any time thinking about
> > > > > > any other corner cases, sorry) is whether that alone would be a proper
> > > > > > fix?
> > > > >
> > > > > As far as I understand, in terms of correctness it would be a proper fix.
> > > > > In terms of performance, I hope that it would be enough but we will see.
> > > >
> > > > ok, let's try that then, before we commit to u32_hashset stuff
> > > >
> > > > >
> > > > > > As for this u32_hashset, it just feels like a big overkill, tbh. If we
> > > > > > have to do something like that, wouldn't it be better to, say, set
> > > > > > highest bit in reg->id (for all linked registers, of course) to mark
> > > > > > it as "used for range checks" instead of maintaining a separate check?
> > > > >
> > > > > Unfortunately no, because this ID change would have to be propagated
> > > > > backwards. It was the first implementation I tried.
> > > > >
> > > > > > But just the whole idea of keeping track of some special circumstances
> > > > > > under which IDs are meaningful seems wrong... All this logic is
> > > > > > complicated, now we are adding another layer of complexity on top. And
> > > > > > the complexity is not in the code, it's in thinking about all possible
> > > > > > scenarios and their interactions.
> > > > >
> > > > > I agree that adding more layers is a complication in itself.
> > > > > Thank you for your input.
> > > > >
> > > > > > > [...]
> > > > >
> > >
>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH bpf-next v2 1/4] bpf: verify scalar ids mapping in regsafe() using check_ids()
  2023-05-31 22:54                   ` Andrii Nakryiko
@ 2023-05-31 23:42                     ` Eduard Zingerman
  2023-06-01  0:23                       ` Andrii Nakryiko
  0 siblings, 1 reply; 33+ messages in thread
From: Eduard Zingerman @ 2023-05-31 23:42 UTC (permalink / raw)
  To: Andrii Nakryiko; +Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yhs

On Wed, 2023-05-31 at 15:54 -0700, Andrii Nakryiko wrote:
> [...] 
>
> well, what can I say... the "force all imprecise" logic isn't that
> straightforward either, but so far it still holds. And the big idea
> here is similar: whatever happens between two checkpoints doesn't
> matter if its effect is not visible at the end of the checkpoint.
> 
> So I guess the intent of my proposal is correct: every time we mark r6
> as precise, we should mark r7 as well in each state in which they are
> still linked. Which necessitates doing this on each walk up the state
> chain.
> 
> At least let's give it a try and see how it holds up against existing
> tests and whatever test you can come up with?..
 
I'll try this thing, thanks a lot for all the input!
Hopefully will get back tomorrow.

> [...]
> 
> BTW, I did contemplate extending jmp_history to contain extra flags
> about "interesting" instructions, though. Specifically, last
> unsupported case for precision backtracking is when register other
> than r10 is used for stack access (which can happen when one passes a
> pointer to a SCALAR to parent function's stack), for which having a
> bit next to such instruction saying "this is really a stack access"
> would help cover the last class of unsupported situations.

Yes, it would have required some kind of redesign for this case as
well. My expectation is that only a few registers get range on each
find_equal_scalars() call, so storing big masks for all frame levels
is very sub-optimal.

> But this is a pretty significant complication. And to make it really
> practical, we'd need to think very hard on how to implement
> jmp_history more efficiently, without constant reallocations. I have a
> hunch that jmp_history should be one large resizable array that all
> states share and just point into different parts of it. When state is
> pushed to the stack, we just remember at which index it starts. When a
> state is finalized, its jump history segment shouldn't be needed by
> that state and can be reused by its siblings and parent states.
> Ultimately, we only have a linear chain of actively worked-on states
> which do use jmp_history, and all others either don't need it
> *anymore* (verified states) or don't need it *yet* (enqueued states).
>
> This would allow us to even have an exact "log of execution" with
> insn_idx and associated extra information, but it will be code path
> dependent, unlike bpf_insn_aux. And the best property is that it will
> never grow beyond 1 million instructions deep (absolute worst case).

I'm not sure I understand why it is bounded.
Conditionals and loops potentially give a big number of possible
execution paths over a small number of instructions.
So, keeping per-path / per-instruction information is still going to
blow up in terms of memory use. To keep it bounded (?) something smart
needs to know which visited states could never be visited again.

> We might not even need backtracking in its current form if we just
> proactively maintain involved registers information (something that we
> currently derive during instruction interpretation in backtrack_insn).

Well, precision marks still have to be propagated backwards, so some
form of "backward movement" within a state will have to happen anyway.
Like, we have a combination of forward and backward analyses in any case.
Sounds a bit like you want to converge the design to some kind of a
classical data flow analysis, but have states instead of basic blocks.

> So at some point I'd like to get to thinking and implementing this,
> but it isn't the most pressing need right now, of course.
> [...]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH bpf-next v2 1/4] bpf: verify scalar ids mapping in regsafe() using check_ids()
  2023-05-31 23:42                     ` Eduard Zingerman
@ 2023-06-01  0:23                       ` Andrii Nakryiko
  0 siblings, 0 replies; 33+ messages in thread
From: Andrii Nakryiko @ 2023-06-01  0:23 UTC (permalink / raw)
  To: Eduard Zingerman; +Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yhs

On Wed, May 31, 2023 at 4:42 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Wed, 2023-05-31 at 15:54 -0700, Andrii Nakryiko wrote:
> > [...]
> >
> > well, what can I say... the "force all imprecise" logic isn't that
> > straightforward either, but so far it still holds. And the big idea
> > here is similar: whatever happens between two checkpoints doesn't
> > matter if its effect is not visible at the end of the checkpoint.
> >
> > So I guess the intent of my proposal is correct: every time we mark r6
> > as precise, we should mark r7 as well in each state in which they are
> > still linked. Which necessitates doing this on each walk up the state
> > chain.
> >
> > At least let's give it a try and see how it holds up against existing
> > tests and whatever test you can come up with?..
>
> I'll try this thing, thanks a lot for all the input!
> Hopefully will get back tomorrow.

Great, let's see how it goes.

>
> > [...]
> >
> > BTW, I did contemplate extending jmp_history to contain extra flags
> > about "interesting" instructions, though. Specifically, last
> > unsupported case for precision backtracking is when register other
> > than r10 is used for stack access (which can happen when one passes a
> > pointer to a SCALAR to parent function's stack), for which having a
> > bit next to such instruction saying "this is really a stack access"
> > would help cover the last class of unsupported situations.
>
> Yes, it would have required some kind of redesign for this case as
> well. My expectation is that only a few registers get range on each
> find_equal_scalars() call, so storing big masks for all frame levels
> is very sub-optimal.
>
> > But this is a pretty significant complication. And to make it really
> > practical, we'd need to think very hard on how to implement
> > jmp_history more efficiently, without constant reallocations. I have a
> > hunch that jmp_history should be one large resizable array that all
> > states share and just point into different parts of it. When state is
> > pushed to the stack, we just remember at which index it starts. When a
> > state is finalized, its jump history segment shouldn't be needed by
> > that state and can be reused by its siblings and parent states.
> > Ultimately, we only have a linear chain of actively worked-on states
> > which do use jmp_history, and all others either don't need it
> > *anymore* (verified states) or don't need it *yet* (enqueued states).
> >
> > This would allow us to even have an exact "log of execution" with
> > insn_idx and associated extra information, but it will be code path
> > dependent, unlike bpf_insn_aux. And the best property is that it will
> > never grow beyond 1 million instructions deep (absolute worst case).
>
> I'm not sure I understand why it is bounded.
> Conditionals and loops potentially give a big number of possible
> execution paths over a small number of instructions.
> So, keeping per-path / per-instruction information is still going to
> blow up in terms of memory use. To keep it bounded (?) something smart
> needs to know which visited states could never be visited again.

#define BPF_COMPLEXITY_LIMIT_INSNS      1000000 /* yes. 1M insns */

that's the bound, you can't go above that

>
> > We might not even need backtracking in its current form if we just
> > proactively maintain involved registers information (something that we
> > currently derive during instruction interpretation in backtrack_insn).
>
> Well, precision marks still have to be propagated backwards, so some
> form of "backward movement" within a state will have to happen anyway.
> Like, we have a combination of forward and backward analyses in any case.
> Sounds a bit like you want to converge the design to some kind of a
> classical data flow analysis, but have states instead of basic blocks.

It's just an idea. And yes, there will still be backtracking, just
less of "let's try to understand what's going on based on instruction
itself". So instead of deriving that we are exiting from subprog, we
can have a flag that says "this instruction is exit from subprog". The
advantage would be an overall simplification and keeping instruction
interpretation in one place.

But this is offtopic, we can talk about this separately. It's not
urgent and not even imminent change.

>
> > So at some point I'd like to get to thinking and implementing this,
> > but it isn't the most pressing need right now, of course.
> > [...]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH bpf-next v2 1/4] bpf: verify scalar ids mapping in regsafe() using check_ids()
  2023-05-30 23:02     ` Eduard Zingerman
@ 2023-06-01  2:05       ` Alexei Starovoitov
  2023-06-01 16:57         ` Eduard Zingerman
  0 siblings, 1 reply; 33+ messages in thread
From: Alexei Starovoitov @ 2023-06-01  2:05 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Andrii Nakryiko, bpf, ast, andrii, daniel, martin.lau, kernel-team, yhs

On Wed, May 31, 2023 at 02:02:37AM +0300, Eduard Zingerman wrote:
> On Tue, 2023-05-30 at 14:37 -0700, Andrii Nakryiko wrote:
> > On Tue, May 30, 2023 at 10:27 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > > 
> > > Make sure that the following unsafe example is rejected by verifier:
> > > 
> > > 1: r9 = ... some pointer with range X ...
> > > 2: r6 = ... unbound scalar ID=a ...
> > > 3: r7 = ... unbound scalar ID=b ...
> > > 4: if (r6 > r7) goto +1
> > > 5: r6 = r7
> > > 6: if (r6 > X) goto ...
> > > --- checkpoint ---
> > > 7: r9 += r7
> > > 8: *(u64 *)r9 = Y
> > > 
> > > This example is unsafe because not all execution paths verify r7 range.
> > > Because of the jump at (4) the verifier would arrive at (6) in two states:
> > > I.  r6{.id=b}, r7{.id=b} via path 1-6;
> > > II. r6{.id=a}, r7{.id=b} via path 1-4, 6.
> > > 
> > > Currently regsafe() does not call check_ids() for scalar registers,
> > > thus from POV of regsafe() states (I) and (II) are identical. If the
> > > path 1-6 is taken by verifier first, and checkpoint is created at (6)
> > > the path [1-4, 6] would be considered safe.
> > > 
> > > This commit updates regsafe() to call check_ids() for scalar registers.
> > > 
> > > This change is costly in terms of verification performance.
> > > Using veristat to compare number of processed states for selftests
> > > object files listed in tools/testing/selftests/bpf/veristat.cfg and
> > > Cilium object files from [1] gives the following statistics:
> > > 
> > >   Filter        | Number of programs
> > >   ----------------------------------
> > >   states_pct>10 | 40
> > >   states_pct>20 | 20
> > >   states_pct>30 | 15
> > >   states_pct>40 | 11
> > > 
> > > (Out of total 177 programs)
> > > 
> > > In fact, impact is so bad that in no-alu32 mode the following
> > > test_progs tests no longer pass verification:
> > > - verif_scale2: maximal number of instructions exceeded
> > > - verif_scale3: maximal number of instructions exceeded
> > > - verif_scale_pyperf600: maximal number of instructions exceeded
> > > 
> > > Additionally:
> > > - verifier_search_pruning/allocated_stack: expected pruning does not
> > >   happen because of differences in register id allocation.
> > > 
> > > Followup patch would address these issues.
> > > 
> > > [1] git@github.com:anakryiko/cilium.git
> > > 
> > > Fixes: 75748837b7e5 ("bpf: Propagate scalar ranges through register assignments.")
> > > Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> > > ---
> > >  kernel/bpf/verifier.c | 22 ++++++++++++++++++++++
> > >  1 file changed, 22 insertions(+)
> > > 
> > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > index af70dad655ab..9c10f2619c4f 100644
> > > --- a/kernel/bpf/verifier.c
> > > +++ b/kernel/bpf/verifier.c
> > > @@ -15151,6 +15151,28 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
> > > 
> > >         switch (base_type(rold->type)) {
> > >         case SCALAR_VALUE:
> > > +               /* Why check_ids() for scalar registers?
> > > +                *
> > > +                * Consider the following BPF code:
> > > +                *   1: r6 = ... unbound scalar, ID=a ...
> > > +                *   2: r7 = ... unbound scalar, ID=b ...
> > > +                *   3: if (r6 > r7) goto +1
> > > +                *   4: r6 = r7
> > > +                *   5: if (r6 > X) goto ...
> > > +                *   6: ... memory operation using r7 ...
> > > +                *
> > > +                * First verification path is [1-6]:
> > > +                * - at (4) same bpf_reg_state::id (b) would be assigned to r6 and r7;
> > > +                * - at (5) r6 would be marked <= X, find_equal_scalars() would also mark
> > > +                *   r7 <= X, because r6 and r7 share same id.
> > > +                *
> > > +                * Next verification path would start from (5), because of the jump at (3).
> > > +                * The only state difference between first and second visits of (5) is
> > > +                * bpf_reg_state::id assignments for r6 and r7: (b, b) vs (a, b).
> > > +                * Thus, use check_ids() to distinguish these states.
> > > +                */
> > > +               if (!check_ids(rold->id, rcur->id, idmap))
> > > +                       return false;
> > 
> > does this check_ids() have to be performed before regs_exact (which
> > also checks IDs, btw) *and* before rold->precise check?
> 
> Relative position to regs_exact() does not matter (because it does check_ids).
> Relative position to rold->precise *does* matter (see next answer).
> 
> > Intuitively, it feels like `rold->precise = false` case shouldn't care
> > about IDs in rcur, as any value should be safe. But I haven't spent
> > much time thinking about this, so there might be some corner case I'm
> > missing.
> 
> I tried to explain this in the cover letter, I'll try to add more
> details below. Effectively what you suggest is to modify the check as
> follows:
> 
>   if (rold->precise && !check_ids(rold->id, rcur->id, idmap))
>     return false;
> 
> Unfortunately, not all registers with interesting IDs would be marked
> as precise. Consider the original example but with r6 and r7 swapped:
> 
>   1: r9 = ... some pointer with range X ...
>   2: r6 = ... unbound scalar ID=a ...
>   3: r7 = ... unbound scalar ID=b ...
>   4: if (r6 > r7) goto +1
>   5: r7 = r6
>   6: if (r7 > X) goto ...
>   7: r9 += r6
> 
> Suppose that current verification path is 1-7:
> > - On the way down 1-6 r7 will not be marked as precise, because
>   condition (r7 > X) is not predictable (see check_cond_jmp_op());
> - When (7) is reached mark_chain_precision() will start moving up
>   marking the following registers as precise:
> 
>   4: if (r6 > r7) goto +1 ; r6, r7
>   5: r7 = r6              ; r6
>   6: if (r7 > X) goto ... ; r6
>   7: r9 += r6             ; r6
> 
> - Thus, if checkpoint is created for (6) r7 would be marked as read,
>   but will not be marked as precise.
>   
> Next, suppose that jump from 4 to 6 is verified and checkpoint for (6)
> is considered:
> - r6 is not precise, so check_ids() is not called for it and it is not
>   added to idmap;
> - r7 is precise, so check_ids() is called for it, but it is a sole
>   register in the idmap;

typos in above?
r6 is precise and r7 is not precise.

> - States are considered equal.
> 
> Here is the log (I added a few prints for states cache comparison):
> 
>   from 10 to 13: safe
>     steq hit 10, cur:
>       R0=scalar(id=2) R6=scalar(id=2) R7=scalar(id=1) R9=fp-8 R10=fp0 fp-8=00000000
>     steq hit 10, old:
>       R6_rD=Pscalar(id=2) R7_rwD=scalar(id=2) R9_rD=fp-8 R10=fp0 fp-8_rD=00000000

the log is correct, though.
r6_old = Pscalar which will go through check_ids() successfully and both are unbounded.
r7_old is not precise. different id-s don't matter and different ranges don't matter.

As another potential fix...
can we call mark_chain_precision() right at the time of R1 = R2 when we do
src_reg->id = ++env->id_gen
and copy_register_state();
for both regs?
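
I.e. in the BPF_MOV reg-to-reg handling, roughly (sketch only, error
handling and the surrounding context omitted):

          if (src_reg->type == SCALAR_VALUE && !src_reg->id)
                  /* link src and dst so find_equal_scalars() can relate them */
                  src_reg->id = ++env->id_gen;
          copy_register_state(dst_reg, src_reg);
          /* proposed addition: make both ends of the link precise right away */
          err = mark_chain_precision(env, insn->src_reg);
          if (!err)
                  err = mark_chain_precision(env, insn->dst_reg);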

I think
if (rold->precise && !check_ids(rold->id, rcur->id, idmap))
would be a good property to have.
I don't like u32_hashset either.
It's more or less saying that scalar id-s are incompatible with precision.

I hope we don't need to do:
+       u32 reg_ids[MAX_CALL_FRAMES];
for backtracking either.
Hacking id-s into jmp history is equally bad.

Let's figure out a minimal fix.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH bpf-next v2 1/4] bpf: verify scalar ids mapping in regsafe() using check_ids()
  2023-06-01  2:05       ` Alexei Starovoitov
@ 2023-06-01 16:57         ` Eduard Zingerman
  2023-06-01 17:13           ` Alexei Starovoitov
  0 siblings, 1 reply; 33+ messages in thread
From: Eduard Zingerman @ 2023-06-01 16:57 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andrii Nakryiko, bpf, ast, andrii, daniel, martin.lau, kernel-team, yhs

On Wed, 2023-05-31 at 19:05 -0700, Alexei Starovoitov wrote:
> [...]
> > Suppose that current verification path is 1-7:
> > - On the way down 1-6 r7 will not be marked as precise, because
> >   condition (r7 > X) is not predictable (see check_cond_jmp_op());
> > - When (7) is reached mark_chain_precision() will start moving up
> >   marking the following registers as precise:
> > 
> >   4: if (r6 > r7) goto +1 ; r6, r7
> >   5: r7 = r6              ; r6
> >   6: if (r7 > X) goto ... ; r6
> >   7: r9 += r6             ; r6
> > 
> > - Thus, if checkpoint is created for (6) r7 would be marked as read,
> >   but will not be marked as precise.
> >   
> > Next, suppose that jump from 4 to 6 is verified and checkpoint for (6)
> > is considered:
> > - r6 is not precise, so check_ids() is not called for it and it is not
> >   added to idmap;
> > - r7 is precise, so check_ids() is called for it, but it is a sole
> >   register in the idmap;
> 
> typos in above?
> r6 is precise and r7 is not precise.

Yes, it should be the other way around in the description:
r6 precise, r7 not precise. Sorry for confusion.

> > - States are considered equal.
> > 
> > Here is the log (I added a few prints for states cache comparison):
> > 
> >   from 10 to 13: safe
> >     steq hit 10, cur:
> >       R0=scalar(id=2) R6=scalar(id=2) R7=scalar(id=1) R9=fp-8 R10=fp0 fp-8=00000000
> >     steq hit 10, old:
> >       R6_rD=Pscalar(id=2) R7_rwD=scalar(id=2) R9_rD=fp-8 R10=fp0 fp-8_rD=00000000
> 
> the log is correct, though.
> r6_old = Pscalar which will go through check_ids() successfully and both are unbounded.
> r7_old is not precise. different id-s don't matter and different ranges don't matter.
> 
> As another potential fix...
> can we mark_chain_precision() right at the time of R1 = R2 when we do
> src_reg->id = ++env->id_gen
> and copy_register_state();
> for both regs?

This won't help, e.g. for the original example precise markings would be:

  4: if (r6 > r7) goto +1 ; r6, r7
  5: r7 = r6              ; r6, r7
  6: if (r7 > X) goto ... ; r6     <-- mark for r7 is still missing
  7: r9 += r6             ; r6

What might help is to call mark_chain_precision() from
find_equal_scalars(), but I expect this to be very expensive.

> I think
> if (rold->precise && !check_ids(rold->id, rcur->id, idmap))
> would be good property to have.
> I don't like u32_hashset either.
> It's more or less saying that scalar id-s are incompatible with precision.
> 
> I hope we don't need to do:
> +       u32 reg_ids[MAX_CALL_FRAMES];
> for backtracking either.
> Hacking id-s into jmp history is equally bad.
> 
> Let's figure out a minimal fix.

Solution discussed with Andrii yesterday seems to work.
There is still a performance regression, but much less severe:

$ ./veristat -e file,prog,states -f "states_pct>5" -C master-baseline.log current.log
File                      Program                         States (A)  States (B)  States     (DIFF)
------------------------  ------------------------------  ----------  ----------  -----------------
bpf_host.o                cil_to_host                            188         198       +10 (+5.32%)
bpf_host.o                tail_handle_ipv4_from_host             225         243       +18 (+8.00%)
bpf_host.o                tail_ipv6_host_policy_ingress           98         104        +6 (+6.12%)
bpf_xdp.o                 tail_handle_nat_fwd_ipv6               648         806     +158 (+24.38%)
bpf_xdp.o                 tail_lb_ipv4                          2491        2930     +439 (+17.62%)
bpf_xdp.o                 tail_nodeport_nat_egress_ipv4          749         868     +119 (+15.89%)
bpf_xdp.o                 tail_nodeport_nat_ingress_ipv4         375         477     +102 (+27.20%)
bpf_xdp.o                 tail_rev_nodeport_lb4                  398         486      +88 (+22.11%)
loop6.bpf.o               trace_virtqueue_add_sgs                226         251      +25 (+11.06%)
pyperf600.bpf.o           on_event                             22200       45095  +22895 (+103.13%)
pyperf600_nounroll.bpf.o  on_event                             34169       37235     +3066 (+8.97%)

I need to add a bunch of tests and take a look at pyperf600.bpf.o
before submitting next patch-set version.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH bpf-next v2 1/4] bpf: verify scalar ids mapping in regsafe() using check_ids()
  2023-06-01 16:57         ` Eduard Zingerman
@ 2023-06-01 17:13           ` Alexei Starovoitov
  2023-06-01 18:52             ` Eduard Zingerman
  0 siblings, 1 reply; 33+ messages in thread
From: Alexei Starovoitov @ 2023-06-01 17:13 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Andrii Nakryiko, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Martin KaFai Lau, Kernel Team, Yonghong Song

On Thu, Jun 1, 2023 at 9:57 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Wed, 2023-05-31 at 19:05 -0700, Alexei Starovoitov wrote:
> > [...]
> > > Suppose that current verification path is 1-7:
> > > - On a way down 1-6 r7 will not be marked as precise, because
> > >   condition (r7 > X) is not predictable (see check_cond_jmp_op());
> > > - When (7) is reached mark_chain_precision() will start moving up
> > >   marking the following registers as precise:
> > >
> > >   4: if (r6 > r7) goto +1 ; r6, r7
> > >   5: r7 = r6              ; r6
> > >   6: if (r7 > X) goto ... ; r6
> > >   7: r9 += r6             ; r6
> > >
> > > - Thus, if checkpoint is created for (6) r7 would be marked as read,
> > >   but will not be marked as precise.
> > >
> > > Next, suppose that jump from 4 to 6 is verified and checkpoint for (6)
> > > is considered:
> > > - r6 is not precise, so check_ids() is not called for it and it is not
> > >   added to idmap;
> > > - r7 is precise, so check_ids() is called for it, but it is a sole
> > >   register in the idmap;
> >
> > typos in above?
> > r6 is precise and r7 is not precise.
>
> Yes, it should be the other way around in the description:
> r6 precise, r7 not precise. Sorry for confusion.
>
> > > - States are considered equal.
> > >
> > > Here is the log (I added a few prints for states cache comparison):
> > >
> > >   from 10 to 13: safe
> > >     steq hit 10, cur:
> > >       R0=scalar(id=2) R6=scalar(id=2) R7=scalar(id=1) R9=fp-8 R10=fp0 fp-8=00000000
> > >     steq hit 10, old:
> > >       R6_rD=Pscalar(id=2) R7_rwD=scalar(id=2) R9_rD=fp-8 R10=fp0 fp-8_rD=00000000
> >
> > the log is correct, though.
> > r6_old = Pscalar which will go through check_ids() successfully and both are unbounded.
> > r7_old is not precise. different id-s don't matter and different ranges don't matter.
> >
> > As another potential fix...
> > can we mark_chain_precision() right at the time of R1 = R2 when we do
> > src_reg->id = ++env->id_gen
> > and copy_register_state();
> > for both regs?
>
> This won't help, e.g. for the original example precise markings would be:
>
>   4: if (r6 > r7) goto +1 ; r6, r7
>   5: r7 = r6              ; r6, r7
>   6: if (r7 > X) goto ... ; r6     <-- mark for r7 is still missing
>   7: r9 += r6             ; r6

Because 6 is a new state and we do mark_all_scalars_imprecise() after 5 ?

> What might help is to call mark_chain_precision() from
> find_equal_scalars(), but I expect this to be very expensive.

maybe worth giving it a shot?

> > I think
> > if (rold->precise && !check_ids(rold->id, rcur->id, idmap))
> > would be good property to have.
> > I don't like u32_hashset either.
> > It's more or less saying that scalar id-s are incompatible with precision.
> >
> > I hope we don't need to do:
> > +       u32 reg_ids[MAX_CALL_FRAMES];
> > for backtracking either.
> > Hacking id-s into jmp history is equally bad.
> >
> > Let's figure out a minimal fix.
>
> Solution discussed with Andrii yesterday seems to work.

The thread is long. Could you please describe it again in pseudo code?

> There is still a performance regression, but much less severe:
>
> $ ./veristat -e file,prog,states -f "states_pct>5" -C master-baseline.log current.log
> File                      Program                         States (A)  States (B)  States     (DIFF)
> ------------------------  ------------------------------  ----------  ----------  -----------------
> bpf_host.o                cil_to_host                            188         198       +10 (+5.32%)
> bpf_host.o                tail_handle_ipv4_from_host             225         243       +18 (+8.00%)
> bpf_host.o                tail_ipv6_host_policy_ingress           98         104        +6 (+6.12%)
> bpf_xdp.o                 tail_handle_nat_fwd_ipv6               648         806     +158 (+24.38%)
> bpf_xdp.o                 tail_lb_ipv4                          2491        2930     +439 (+17.62%)
> bpf_xdp.o                 tail_nodeport_nat_egress_ipv4          749         868     +119 (+15.89%)
> bpf_xdp.o                 tail_nodeport_nat_ingress_ipv4         375         477     +102 (+27.20%)
> bpf_xdp.o                 tail_rev_nodeport_lb4                  398         486      +88 (+22.11%)
> loop6.bpf.o               trace_virtqueue_add_sgs                226         251      +25 (+11.06%)
> pyperf600.bpf.o           on_event                             22200       45095  +22895 (+103.13%)
> pyperf600_nounroll.bpf.o  on_event                             34169       37235     +3066 (+8.97%)
>
> I need to add a bunch of tests and take a look at pyperf600.bpf.o
> before submitting next patch-set version.

Great. Looking forward.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH bpf-next v2 1/4] bpf: verify scalar ids mapping in regsafe() using check_ids()
  2023-06-01 17:13           ` Alexei Starovoitov
@ 2023-06-01 18:52             ` Eduard Zingerman
  2023-06-02 18:50               ` Andrii Nakryiko
  0 siblings, 1 reply; 33+ messages in thread
From: Eduard Zingerman @ 2023-06-01 18:52 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andrii Nakryiko, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Martin KaFai Lau, Kernel Team, Yonghong Song

On Thu, 2023-06-01 at 10:13 -0700, Alexei Starovoitov wrote:
> On Thu, Jun 1, 2023 at 9:57 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > 
> > On Wed, 2023-05-31 at 19:05 -0700, Alexei Starovoitov wrote:
> > > [...]
> > > > Suppose that current verification path is 1-7:
> > > > - On a way down 1-6 r7 will not be marked as precise, because
> > > >   condition (r7 > X) is not predictable (see check_cond_jmp_op());
> > > > - When (7) is reached mark_chain_precision() will start moving up
> > > >   marking the following registers as precise:
> > > > 
> > > >   4: if (r6 > r7) goto +1 ; r6, r7
> > > >   5: r7 = r6              ; r6
> > > >   6: if (r7 > X) goto ... ; r6
> > > >   7: r9 += r6             ; r6
> > > > 
> > > > - Thus, if checkpoint is created for (6) r7 would be marked as read,
> > > >   but will not be marked as precise.
> > > > 
> > > > Next, suppose that jump from 4 to 6 is verified and checkpoint for (6)
> > > > is considered:
> > > > - r6 is not precise, so check_ids() is not called for it and it is not
> > > >   added to idmap;
> > > > - r7 is precise, so check_ids() is called for it, but it is a sole
> > > >   register in the idmap;
> > > 
> > > typos in above?
> > > r6 is precise and r7 is not precise.
> > 
> > Yes, it should be the other way around in the description:
> > r6 precise, r7 not precise. Sorry for confusion.
> > 
> > > > - States are considered equal.
> > > > 
> > > > Here is the log (I added a few prints for states cache comparison):
> > > > 
> > > >   from 10 to 13: safe
> > > >     steq hit 10, cur:
> > > >       R0=scalar(id=2) R6=scalar(id=2) R7=scalar(id=1) R9=fp-8 R10=fp0 fp-8=00000000
> > > >     steq hit 10, old:
> > > >       R6_rD=Pscalar(id=2) R7_rwD=scalar(id=2) R9_rD=fp-8 R10=fp0 fp-8_rD=00000000
> > > 
> > > the log is correct, though.
> > > r6_old = Pscalar which will go through check_ids() successfully and both are unbounded.
> > > r7_old is not precise. different id-s don't matter and different ranges don't matter.
> > > 
> > > As another potential fix...
> > > can we mark_chain_precision() right at the time of R1 = R2 when we do
> > > src_reg->id = ++env->id_gen
> > > and copy_register_state();
> > > for both regs?
> > 
> > This won't help, e.g. for the original example precise markings would be:
> > 
> >   4: if (r6 > r7) goto +1 ; r6, r7
> >   5: r7 = r6              ; r6, r7
> >   6: if (r7 > X) goto ... ; r6     <-- mark for r7 is still missing
> >   7: r9 += r6             ; r6
> 
> Because 6 is a new state and we do mark_all_scalars_imprecise() after 5 ?

Yes, precision marks are not inherited by child states.

> 
> > What might help is to call mark_chain_precision() from
> > find_equal_scalars(), but I expect this to be very expensive.
> 
> maybe worth giving it a shot?

Sure, will report a bit later today.
 
> > > I think
> > > if (rold->precise && !check_ids(rold->id, rcur->id, idmap))
> > > would be good property to have.
> > > I don't like u32_hashset either.
> > > It's more or less saying that scalar id-s are incompatible with precision.
> > > 
> > > I hope we don't need to do:
> > > +       u32 reg_ids[MAX_CALL_FRAMES];
> > > for backtracking either.
> > > Hacking id-s into jmp history is equally bad.
> > > 
> > > Let's figure out a minimal fix.
> > 
> > Solution discussed with Andrii yesterday seems to work.
> 
> The thread is long. Could you please describe it again in pseudo code?

- Add a function mark_precise_scalar_ids(struct bpf_verifier_env *env,
                                        struct bpf_verifier_state *st)
  such that it:
  - collect PRECISE_IDS: a set of IDs of all registers marked in env->bt
  - visit all registers with ids from PRECISE_IDS and make sure
    that these registers are marked in env->bt
- Call mark_precise_scalar_ids() from __mark_chain_precision()
  for each state 'st' visited by states chain processing loop,
  so that:
  - mark_precise_scalar_ids() is called for current state when
    __mark_chain_precision() is entered, reusing id assignments in
    current state;
  - mark_precise_scalar_ids() is called for each parent state, reusing
    id assignments valid at 'last_idx' instruction of that state.

The idea is that in situations like below:

   4: if (r6 > r7) goto +1 
   5: r7 = r6
   --- checkpoint #1 ---
   6: <something>
   7: if (r7 > X) goto ... 
   8: r7 = 0
   9: r9 += r6

The mark_precise_scalar_ids() would be called at:
- (9) and current id assignments would be used.
- (6) and id assignments saved in checkpoint #1 would be used.

If <something> is the code that modifies r6/r7 the link would be
broken and we would overestimate the set of precise registers.
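
A condensed sketch of mark_precise_scalar_ids() as described above, for
registers of the current frame only (the bt_is_reg_set()/bt_set_reg()
helpers operating on env->bt are assumed here, and the real change also
has to cover stack slots and all frames):

  static void mark_precise_scalar_ids(struct bpf_verifier_env *env,
                                      struct bpf_verifier_state *st)
  {
      struct bpf_reg_state *regs = st->frame[st->curframe]->regs;
      u32 ids[MAX_BPF_REG];
      int cnt = 0, i, j;

      /* pass 1: collect ids of scalars already marked in env->bt */
      for (i = 0; i < BPF_REG_FP; i++)
          if (regs[i].type == SCALAR_VALUE && regs[i].id &&
              bt_is_reg_set(&env->bt, i))
              ids[cnt++] = regs[i].id;

      /* pass 2: mark every scalar that shares one of those ids */
      for (i = 0; i < BPF_REG_FP; i++) {
          if (regs[i].type != SCALAR_VALUE || !regs[i].id)
              continue;
          for (j = 0; j < cnt; j++)
              if (ids[j] == regs[i].id)
                  bt_set_reg(&env->bt, i);
      }
  }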


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH bpf-next v2 1/4] bpf: verify scalar ids mapping in regsafe() using check_ids()
  2023-06-01 18:52             ` Eduard Zingerman
@ 2023-06-02 18:50               ` Andrii Nakryiko
  2023-06-02 19:13                 ` Eduard Zingerman
  0 siblings, 1 reply; 33+ messages in thread
From: Andrii Nakryiko @ 2023-06-02 18:50 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Alexei Starovoitov, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Martin KaFai Lau, Kernel Team, Yonghong Song

On Thu, Jun 1, 2023 at 11:52 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Thu, 2023-06-01 at 10:13 -0700, Alexei Starovoitov wrote:
> > On Thu, Jun 1, 2023 at 9:57 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > >
> > > On Wed, 2023-05-31 at 19:05 -0700, Alexei Starovoitov wrote:
> > > > [...]
> > > > > Suppose that current verification path is 1-7:
> > > > > - On a way down 1-6 r7 will not be marked as precise, because
> > > > >   condition (r7 > X) is not predictable (see check_cond_jmp_op());
> > > > > - When (7) is reached mark_chain_precision() will start moving up
> > > > >   marking the following registers as precise:
> > > > >
> > > > >   4: if (r6 > r7) goto +1 ; r6, r7
> > > > >   5: r7 = r6              ; r6
> > > > >   6: if (r7 > X) goto ... ; r6
> > > > >   7: r9 += r6             ; r6
> > > > >
> > > > > - Thus, if checkpoint is created for (6) r7 would be marked as read,
> > > > >   but will not be marked as precise.
> > > > >
> > > > > Next, suppose that jump from 4 to 6 is verified and checkpoint for (6)
> > > > > is considered:
> > > > > - r6 is not precise, so check_ids() is not called for it and it is not
> > > > >   added to idmap;
> > > > > - r7 is precise, so check_ids() is called for it, but it is a sole
> > > > >   register in the idmap;
> > > >
> > > > typos in above?
> > > > r6 is precise and r7 is not precise.
> > >
> > > Yes, it should be the other way around in the description:
> > > r6 precise, r7 not precise. Sorry for confusion.
> > >
> > > > > - States are considered equal.
> > > > >
> > > > > Here is the log (I added a few prints for states cache comparison):
> > > > >
> > > > >   from 10 to 13: safe
> > > > >     steq hit 10, cur:
> > > > >       R0=scalar(id=2) R6=scalar(id=2) R7=scalar(id=1) R9=fp-8 R10=fp0 fp-8=00000000
> > > > >     steq hit 10, old:
> > > > >       R6_rD=Pscalar(id=2) R7_rwD=scalar(id=2) R9_rD=fp-8 R10=fp0 fp-8_rD=00000000
> > > >
> > > > the log is correct, though.
> > > > r6_old = Pscalar which will go through check_ids() successfully and both are unbounded.
> > > > r7_old is not precise. different id-s don't matter and different ranges don't matter.
> > > >
> > > > As another potential fix...
> > > > can we mark_chain_precision() right at the time of R1 = R2 when we do
> > > > src_reg->id = ++env->id_gen
> > > > and copy_register_state();
> > > > for both regs?
> > >
> > > This won't help, e.g. for the original example precise markings would be:
> > >
> > >   4: if (r6 > r7) goto +1 ; r6, r7
> > >   5: r7 = r6              ; r6, r7
> > >   6: if (r7 > X) goto ... ; r6     <-- mark for r7 is still missing
> > >   7: r9 += r6             ; r6
> >
> > Because 6 is a new state and we do mark_all_scalars_imprecise() after 5 ?
>
> Yes, precision marks are not inherited by child states.
>
> >
> > > What might help is to call mark_chain_precision() from
> > > find_equal_scalars(), but I expect this to be very expensive.
> >
> > maybe worth giving it a shot?
>
> Sure, will report a bit later today.
>
> > > > I think
> > > > if (rold->precise && !check_ids(rold->id, rcur->id, idmap))
> > > > would be good property to have.
> > > > I don't like u32_hashset either.
> > > > It's more or less saying that scalar id-s are incompatible with precision.
> > > >
> > > > I hope we don't need to do:
> > > > +       u32 reg_ids[MAX_CALL_FRAMES];
> > > > for backtracking either.
> > > > Hacking id-s into jmp history is equally bad.
> > > >
> > > > Let's figure out a minimal fix.
> > >
> > > Solution discussed with Andrii yesterday seems to work.
> >
> > The thread is long. Could you please describe it again in pseudo code?
>
> - Add a function mark_precise_scalar_ids(struct bpf_verifier_env *env,
>                                         struct bpf_verifier_state *st)
>   such that it:
>   - collect PRECISE_IDS: a set of IDs of all registers marked in env->bt
>   - visit all registers with ids from PRECISE_IDS and make sure
>     that these registers are marked in env->bt
> - Call mark_precise_scalar_ids() from __mark_chain_precision()
>   for each state 'st' visited by states chain processing loop,
>   so that:
>   - mark_precise_scalar_ids() is called for current state when
>     __mark_chain_precision() is entered, reusing id assignments in
>     current state;
>   - mark_precise_scalar_ids() is called for each parent state, reusing
>     id assignments valid at 'last_idx' instruction of that state.
>
> The idea is that in situations like below:
>
>    4: if (r6 > r7) goto +1
>    5: r7 = r6
>    --- checkpoint #1 ---
>    6: <something>
>    7: if (r7 > X) goto ...
>    8: r7 = 0
>    9: r9 += r6
>
> The mark_precise_scalar_ids() would be called at:
> - (9) and current id assignments would be used.
> - (6) and id assignments saved in checkpoint #1 would be used.
>
> If <something> is the code that modifies r6/r7 the link would be
> broken and we would overestimate the set of precise registers.
>

To avoid this we need to recalculate these IDs on each new parent
state, based on requested precision marks. If we keep a simple and
small array of IDs and do a quick linear search over them for each
SCALAR register, I suspect it should be very fast. I don't think in
practice we'll have more than 1-2 IDs in that array, right?
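
Something as small as this should do for the bookkeeping (names are
illustrative, not taken from the kernel):

  struct precise_ids {
      u32 ids[16]; /* ids of scalars requested precise so far */
      int cnt;
  };

  static bool precise_ids_contain(const struct precise_ids *s, u32 id)
  {
      int i;

      for (i = 0; i < s->cnt; i++)
          if (s->ids[i] == id)
              return true;
      return false;
  }

  static void precise_ids_add(struct precise_ids *s, u32 id)
  {
      if (!id || precise_ids_contain(s, id) || s->cnt >= 16)
          return;
      s->ids[s->cnt++] = id;
  }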

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH bpf-next v2 1/4] bpf: verify scalar ids mapping in regsafe() using check_ids()
  2023-06-02 18:50               ` Andrii Nakryiko
@ 2023-06-02 19:13                 ` Eduard Zingerman
  2023-06-02 19:27                   ` Alexei Starovoitov
  2023-06-02 20:17                   ` Andrii Nakryiko
  0 siblings, 2 replies; 33+ messages in thread
From: Eduard Zingerman @ 2023-06-02 19:13 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Martin KaFai Lau, Kernel Team, Yonghong Song

On Fri, 2023-06-02 at 11:50 -0700, Andrii Nakryiko wrote:
[...]
> > > The thread is long. Could you please describe it again in pseudo code?
> > 
> > - Add a function mark_precise_scalar_ids(struct bpf_verifier_env *env,
> >                                         struct bpf_verifier_state *st)
> >   such that it:
> >   - collect PRECISE_IDS: a set of IDs of all registers marked in env->bt
> >   - visit all registers with ids from PRECISE_IDS and make sure
> >     that these registers are marked in env->bt
> > - Call mark_precise_scalar_ids() from __mark_chain_precision()
> >   for each state 'st' visited by states chain processing loop,
> >   so that:
> >   - mark_precise_scalar_ids() is called for current state when
> >     __mark_chain_precision() is entered, reusing id assignments in
> >     current state;
> >   - mark_precise_scalar_ids() is called for each parent state, reusing
> >     id assignments valid at 'last_idx' instruction of that state.
> > 
> > The idea is that in situations like below:
> > 
> >    4: if (r6 > r7) goto +1
> >    5: r7 = r6
> >    --- checkpoint #1 ---
> >    6: <something>
> >    7: if (r7 > X) goto ...
> >    8: r7 = 0
> >    9: r9 += r6
> > 
> > The mark_precise_scalar_ids() would be called at:
> > - (9) and current id assignments would be used.
> > - (6) and id assignments saved in checkpoint #1 would be used.
> > 
> > If <something> is the code that modifies r6/r7 the link would be
> > broken and we would overestimate the set of precise registers.
> > 
> 
> To avoid this we need to recalculate these IDs on each new parent
> state, based on requested precision marks. If we keep a simple and
> small array of IDs and do a quick linear search over them for each
> SCALAR register, I suspect it should be very fast. I don't think in
> practice we'll have more than 1-2 IDs in that array, right?

I'm not sure I understand, could you please describe how it should
work for e.g.?:

    3: r6 &= 0xf            // assume safe bound
    4: if (r6 > r7) goto +1
    5: r7 = r6
    --- checkpoint #1 ---
    6: r7 = 0
    7: if (r7 > 10) goto exit;
    8: r7 = 0
    9: r9 += r6

__mark_chain_precision() would get to checkpoint #1 with only r6 as
precise, what should happen next?

As a side note: I added several optimizations:
- avoid allocation of scalar ids for constants;
- remove sole scalar ids from cached states;
- do a check as follows:
  if (rold->precise && rold->id && !check_ids(idmap, rold, rcur))
    return false;

And I'm seeing almost zero performance overhead now.
So, maybe what we figured so far is good enough.
Need to add more tests, though.
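
For reference, the first optimization above amounts to roughly this in
the scalar BPF_MOV path (a sketch, exact placement in check_alu_op() is
assumed):

  if (src_reg->type == SCALAR_VALUE && !src_reg->id &&
      !tnum_is_const(src_reg->var_off))
      /* constants gain nothing from range propagation,
       * so don't link them via a shared id
       */
      src_reg->id = ++env->id_gen;
  copy_register_state(dst_reg, src_reg);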

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH bpf-next v2 1/4] bpf: verify scalar ids mapping in regsafe() using check_ids()
  2023-06-02 19:13                 ` Eduard Zingerman
@ 2023-06-02 19:27                   ` Alexei Starovoitov
  2023-06-02 19:37                     ` Eduard Zingerman
  2023-06-02 20:17                   ` Andrii Nakryiko
  1 sibling, 1 reply; 33+ messages in thread
From: Alexei Starovoitov @ 2023-06-02 19:27 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Andrii Nakryiko, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Martin KaFai Lau, Kernel Team, Yonghong Song

On Fri, Jun 2, 2023 at 12:13 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Fri, 2023-06-02 at 11:50 -0700, Andrii Nakryiko wrote:
> [...]
> > > > The thread is long. Could you please describe it again in pseudo code?
> > >
> > > - Add a function mark_precise_scalar_ids(struct bpf_verifier_env *env,
> > >                                         struct bpf_verifier_state *st)
> > >   such that it:
> > >   - collect PRECISE_IDS: a set of IDs of all registers marked in env->bt
> > >   - visit all registers with ids from PRECISE_IDS and make sure
> > >     that these registers are marked in env->bt
> > > - Call mark_precise_scalar_ids() from __mark_chain_precision()
> > >   for each state 'st' visited by states chain processing loop,
> > >   so that:
> > >   - mark_precise_scalar_ids() is called for current state when
> > >     __mark_chain_precision() is entered, reusing id assignments in
> > >     current state;
> > >   - mark_precise_scalar_ids() is called for each parent state, reusing
> > >     id assignments valid at 'last_idx' instruction of that state.
> > >
> > > The idea is that in situations like below:
> > >
> > >    4: if (r6 > r7) goto +1
> > >    5: r7 = r6
> > >    --- checkpoint #1 ---
> > >    6: <something>
> > >    7: if (r7 > X) goto ...
> > >    8: r7 = 0
> > >    9: r9 += r6
> > >
> > > The mark_precise_scalar_ids() would be called at:
> > > - (9) and current id assignments would be used.
> > > - (6) and id assignments saved in checkpoint #1 would be used.
> > >
> > > If <something> is the code that modifies r6/r7 the link would be
> > > broken and we would overestimate the set of precise registers.
> > >
> >
> > To avoid this we need to recalculate these IDs on each new parent
> > state, based on requested precision marks. If we keep a simple and
> > small array of IDs and do a quick linear search over them for each
> > SCALAR register, I suspect it should be very fast. I don't think in
> > practice we'll have more than 1-2 IDs in that array, right?
>
> I'm not sure I understand, could you please describe how it should
> work for e.g.?:
>
>     3: r6 &= 0xf            // assume safe bound
>     4: if (r6 > r7) goto +1
>     5: r7 = r6
>     --- checkpoint #1 ---
>     6: r7 = 0
>     7: if (r7 > 10) goto exit;
>     8: r7 = 0
>     9: r9 += r6
>
> __mark_chain_precision() would get to checkpoint #1 with only r6 as
> precise, what should happen next?
>
> As a side note: I added several optimizations:
> - avoid allocation of scalar ids for constants;

+1

> - remove sole scalar ids from cached states;
> - do a check as follows:
>   if (rold->precise && rold->id && !check_ids(idmap, rold, rcur))

Ignoring rcur->id > 0 ? Is it safe?

>     return false;
>
> And I'm seeing almost zero performance overhead now.
> So, maybe what we figured so far is good enough.
> Need to add more tests, though.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH bpf-next v2 1/4] bpf: verify scalar ids mapping in regsafe() using check_ids()
  2023-06-02 19:27                   ` Alexei Starovoitov
@ 2023-06-02 19:37                     ` Eduard Zingerman
  2023-06-02 19:43                       ` Alexei Starovoitov
  0 siblings, 1 reply; 33+ messages in thread
From: Eduard Zingerman @ 2023-06-02 19:37 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andrii Nakryiko, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Martin KaFai Lau, Kernel Team, Yonghong Song

On Fri, 2023-06-02 at 12:27 -0700, Alexei Starovoitov wrote:
> On Fri, Jun 2, 2023 at 12:13 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > 
> > On Fri, 2023-06-02 at 11:50 -0700, Andrii Nakryiko wrote:
> > [...]
> > > > > The thread is long. Could you please describe it again in pseudo code?
> > > > 
> > > > - Add a function mark_precise_scalar_ids(struct bpf_verifier_env *env,
> > > >                                         struct bpf_verifier_state *st)
> > > >   such that it:
> > > >   - collect PRECISE_IDS: a set of IDs of all registers marked in env->bt
> > > >   - visit all registers with ids from PRECISE_IDS and make sure
> > > >     that these registers are marked in env->bt
> > > > - Call mark_precise_scalar_ids() from __mark_chain_precision()
> > > >   for each state 'st' visited by states chain processing loop,
> > > >   so that:
> > > >   - mark_precise_scalar_ids() is called for current state when
> > > >     __mark_chain_precision() is entered, reusing id assignments in
> > > >     current state;
> > > >   - mark_precise_scalar_ids() is called for each parent state, reusing
> > > >     id assignments valid at 'last_idx' instruction of that state.
> > > > 
> > > > The idea is that in situations like below:
> > > > 
> > > >    4: if (r6 > r7) goto +1
> > > >    5: r7 = r6
> > > >    --- checkpoint #1 ---
> > > >    6: <something>
> > > >    7: if (r7 > X) goto ...
> > > >    8: r7 = 0
> > > >    9: r9 += r6
> > > > 
> > > > The mark_precise_scalar_ids() would be called at:
> > > > - (9) and current id assignments would be used.
> > > > - (6) and id assignments saved in checkpoint #1 would be used.
> > > > 
> > > > If <something> is the code that modifies r6/r7 the link would be
> > > > broken and we would overestimate the set of precise registers.
> > > > 
> > > 
> > > To avoid this we need to recalculate these IDs on each new parent
> > > state, based on requested precision marks. If we keep a simple and
> > > small array of IDs and do a quick linear search over them for each
> > > SCALAR register, I suspect it should be very fast. I don't think in
> > > practice we'll have more than 1-2 IDs in that array, right?
> > 
> > I'm not sure I understand, could you please describe how it should
> > work for e.g.?:
> > 
> >     3: r6 &= 0xf            // assume safe bound
> >     4: if (r6 > r7) goto +1
> >     5: r7 = r6
> >     --- checkpoint #1 ---
> >     6: r7 = 0
> >     7: if (r7 > 10) goto exit;
> >     8: r7 = 0
> >     9: r9 += r6
> > 
> > __mark_chain_precision() would get to checkpoint #1 with only r6 as
> > precise, what should happen next?
> > 
> > As a side note: I added several optimizations:
> > - avoid allocation of scalar ids for constants;
> 
> +1
> 
> > - remove sole scalar ids from cached states;
> > - do a check as follows:
> >   if (rold->precise && rold->id && !check_ids(idmap, rold, rcur))
> 
> Ignoring rcur->id > 0 ? Is it safe?

Well, I thought about it a bit and arrived at the following reasoning:
- suppose checkpoint C exists, is proven safe and has
  registers r6=Pscalar(range1),id=0 and r7=Pscalar(range2),id=0
- this means that C is proven safe for any value of
  r6 in range1 and any value of r7 in range2
- having same id on r6 and r7 means that r6 and r7 share same value
- so this is just a special case of what's already proven.

But having written this down, it looks like I also need to verify
that range1 and range2 overlap :(

> 
> >     return false;
> > 
> > And I'm seeing almost zero performance overhead now.
> > So, maybe what we figured so far is good enough.
> > Need to add more tests, though.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH bpf-next v2 1/4] bpf: verify scalar ids mapping in regsafe() using check_ids()
  2023-06-02 19:37                     ` Eduard Zingerman
@ 2023-06-02 19:43                       ` Alexei Starovoitov
  2023-06-02 19:51                         ` Eduard Zingerman
  0 siblings, 1 reply; 33+ messages in thread
From: Alexei Starovoitov @ 2023-06-02 19:43 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Andrii Nakryiko, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Martin KaFai Lau, Kernel Team, Yonghong Song

On Fri, Jun 2, 2023 at 12:37 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > > - do a check as follows:
> > >   if (rold->precise && rold->id && !check_ids(idmap, rold, rcur))
> >
> > Ignoring rcur->id > 0 ? Is it safe?
>
> Well, I thought about it a bit and arrived at the following reasoning:
> - suppose checkpoint C exists, is proven safe and has
>   registers r6=Pscalar(range1),id=0 and r7=Pscalar(range2),id=0
> - this means that C is proven safe for any value of
>   r6 in range1 and any value of r7 in range2
> - having same id on r6 and r7 means that r6 and r7 share same value
> - so this is just a special case of what's already proven.
>
> But having written this down, it looks like I also need to verify
> that range1 and range2 overlap :(

I'm lost.
id==0 means there is no relationship between regs.
with
if (rold->precise && rold->id && !check_ids(idmap, rold, rcur))

and r6_old->precise
we will only do range_within(rold, rcur) && tnum_in() check
and will ignore r6_cur->id and its relationship with some other reg in cur.
It could be ok.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH bpf-next v2 1/4] bpf: verify scalar ids mapping in regsafe() using check_ids()
  2023-06-02 19:43                       ` Alexei Starovoitov
@ 2023-06-02 19:51                         ` Eduard Zingerman
  2023-06-02 20:03                           ` Alexei Starovoitov
  0 siblings, 1 reply; 33+ messages in thread
From: Eduard Zingerman @ 2023-06-02 19:51 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andrii Nakryiko, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Martin KaFai Lau, Kernel Team, Yonghong Song

On Fri, 2023-06-02 at 12:43 -0700, Alexei Starovoitov wrote:
> On Fri, Jun 2, 2023 at 12:37 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > > > - do a check as follows:
> > > >   if (rold->precise && rold->id && !check_ids(idmap, rold, rcur))
> > > 
> > > Ignoring rcur->id > 0 ? Is it safe?
> > 
> > Well, I thought about it a bit and arrived at the following reasoning:
> > - suppose checkpoint C exists, is proven safe and has
> >   registers r6=Pscalar(range1),id=0 and r7=Pscalar(range2),id=0
> > - this means that C is proven safe for any value of
> >   r6 in range1 and any value of r7 in range2
> > - having same id on r6 and r7 means that r6 and r7 share same value
> > - so this is just a special case of what's already proven.
> > 
> > But having written this down, it looks like I also need to verify
> > that range1 and range2 overlap :(
> 
> I'm lost.
> id==0 means there is no relationship between regs.
> with
> if (rold->precise && rold->id && !check_ids(idmap, rold, rcur))
> 
> and r6_old->precise
> we will only do range_within(rold, rcur) && tnum_in() check
> and will ignore r6_cur->id and its relationship with some other reg in cur.
> It could be ok.

Yes, but I just realized that for the following case:

  Old                      Cur
  r6=Pscalar(range1),id=0  r6=Pscalar(range1),id=1
  r7=Pscalar(range2),id=0  r7=Pscalar(range2),id=1

For 'Cur' to be a subset of 'Old' ranges range1 and range2
have to have non-empty overlap, so my new check:

  if (rold->precise && rold->id && !check_ids(idmap, rold, rcur))

is not fully correct.

It was a "clever" attempt to ignore solo scalar IDs in Cur without modifying Cur.
I'll think a bit more, sorry for a lot of noise.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH bpf-next v2 1/4] bpf: verify scalar ids mapping in regsafe() using check_ids()
  2023-06-02 19:51                         ` Eduard Zingerman
@ 2023-06-02 20:03                           ` Alexei Starovoitov
  2023-06-02 20:22                             ` Eduard Zingerman
  0 siblings, 1 reply; 33+ messages in thread
From: Alexei Starovoitov @ 2023-06-02 20:03 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Andrii Nakryiko, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Martin KaFai Lau, Kernel Team, Yonghong Song

On Fri, Jun 2, 2023 at 12:51 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Fri, 2023-06-02 at 12:43 -0700, Alexei Starovoitov wrote:
> > On Fri, Jun 2, 2023 at 12:37 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > > > > - do a check as follows:
> > > > >   if (rold->precise && rold->id && !check_ids(idmap, rold, rcur))
> > > >
> > > > Ignoring rcur->id > 0 ? Is it safe?
> > >
> > > Well, I thought about it a bit and arrived at the following reasoning:
> > > - suppose checkpoint C exists, is proven safe and has
> > >   registers r6=Pscalar(range1),id=0 and r7=Pscalar(range2),id=0
> > > - this means that C is proven safe for any value of
> > >   r6 in range1 and any value of r7 in range2
> > > - having same id on r6 and r7 means that r6 and r7 share same value
> > > - so this is just a special case of what's already proven.
> > >
> > > But having written this down, it looks like I also need to verify
> > > that range1 and range2 overlap :(
> >
> > I'm lost.
> > id==0 means there is no relationship between regs.
> > with
> > if (rold->precise && rold->id && !check_ids(idmap, rold, rcur))
> >
> > and r6_old->precise
> > we will only do range_within(rold, rcur) && tnum_in() check
> > and will ignore r6_cur->id and its relationship with some other reg in cur.
> > It could be ok.
>
> Yes, but I just realized that for the following case:
>
>   Old                      Cur
>   r6=Pscalar(range1),id=0  r6=Pscalar(range1),id=1
>   r7=Pscalar(range2),id=0  r7=Pscalar(range2),id=1
>
> For 'Cur' to be a subset of 'Old' ranges range1 and range2
> have to have non-empty overlap, so my new check:

In theory. yes. and most likely that _was_ the case for 'old',
but 'cur' doesn't need to do that check.
'old' was successful already and 'cur' ranges just need to be within.

so

>   if (rold->precise && rold->id && !check_ids(idmap, rold, rcur))
>
> is not fully correct.

still looks correct.

>
> It was a "clever" attempt to ignore solo scalar IDs in Cur without modifying Cur.
> I'll think a bit more, sorry for a lot of noise.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH bpf-next v2 1/4] bpf: verify scalar ids mapping in regsafe() using check_ids()
  2023-06-02 19:13                 ` Eduard Zingerman
  2023-06-02 19:27                   ` Alexei Starovoitov
@ 2023-06-02 20:17                   ` Andrii Nakryiko
  2023-06-02 21:13                     ` Eduard Zingerman
  1 sibling, 1 reply; 33+ messages in thread
From: Andrii Nakryiko @ 2023-06-02 20:17 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Alexei Starovoitov, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Martin KaFai Lau, Kernel Team, Yonghong Song

On Fri, Jun 2, 2023 at 12:13 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Fri, 2023-06-02 at 11:50 -0700, Andrii Nakryiko wrote:
> [...]
> > > > The thread is long. Could you please describe it again in pseudo code?
> > >
> > > - Add a function mark_precise_scalar_ids(struct bpf_verifier_env *env,
> > >                                         struct bpf_verifier_state *st)
> > >   such that it:
> > >   - collect PRECISE_IDS: a set of IDs of all registers marked in env->bt
> > >   - visit all registers with ids from PRECISE_IDS and make sure
> > >     that these registers are marked in env->bt
> > > - Call mark_precise_scalar_ids() from __mark_chain_precision()
> > >   for each state 'st' visited by states chain processing loop,
> > >   so that:
> > >   - mark_precise_scalar_ids() is called for current state when
> > >     __mark_chain_precision() is entered, reusing id assignments in
> > >     current state;
> > >   - mark_precise_scalar_ids() is called for each parent state, reusing
> > >     id assignments valid at 'last_idx' instruction of that state.
> > >
> > > The idea is that in situations like below:
> > >
> > >    4: if (r6 > r7) goto +1
> > >    5: r7 = r6
> > >    --- checkpoint #1 ---
> > >    6: <something>
> > >    7: if (r7 > X) goto ...
> > >    8: r7 = 0
> > >    9: r9 += r6
> > >
> > > The mark_precise_scalar_ids() would be called at:
> > > - (9) and current id assignments would be used.
> > > - (6) and id assignments saved in checkpoint #1 would be used.
> > >
> > > If <something> is the code that modifies r6/r7 the link would be
> > > broken and we would overestimate the set of precise registers.
> > >
> >
> > To avoid this we need to recalculate these IDs on each new parent
> > state, based on requested precision marks. If we keep a simple and
> > small array of IDs and do a quick linear search over them for each
> > SCALAR register, I suspect it should be very fast. I don't think in
> > practice we'll have more than 1-2 IDs in that array, right?
>
> I'm not sure I understand, could you please describe how it should
> work for e.g.?:
>
>     3: r6 &= 0xf            // assume safe bound
>     4: if (r6 > r7) goto +1
>     5: r7 = r6
>     --- checkpoint #1 ---
>     6: r7 = 0
>     7: if (r7 > 10) goto exit;
>     8: r7 = 0
>     9: r9 += r6
>
> __mark_chain_precision() would get to checkpoint #1 with only r6 as
> precise, what should happen next?

it should mark all SCALARs that have r6's ID in env->bt, and then
proceed with precision propagation until next parent state? This is
where you'll mark r7, because in parent state (checkpoint #1) r6.id ==
r7.id.

It might be easier to just discuss latest code you have, there are
lots of intricacies, and code wins over words :)

>
> As a side note: I added several optimizations:
> - avoid allocation of scalar ids for constants;
> - remove sole scalar ids from cached states;

so that's what I was proposing earlier, but not just from cached
states, but from any state. As soon as we get SCALAR with some ID that
is not shared by any other SCALAR, we should try to drop that ID. The
question is only in how to implement this efficiently.

> - do a check as follows:
>   if (rold->precise && rold->id && !check_ids(idmap, rold, rcur))
>     return false;

Hm.. do we need an extra special case here? With the precision changes we
are discussing, and the removal of singular SCALAR IDs you are proposing,
just extending the existing logic to:

                if (regs_exact(rold, rcur, idmap))
                        return true;
                if (env->explore_alu_limits)
                        return false;
                if (!rold->precise)
                        return true;
                /* new val must satisfy old val knowledge */
                return range_within(rold, rcur) &&
                       check_ids(rold->id, rcur->id, idmap) &&
                       check_ids(rold->ref_obj_id, rcur->ref_obj_id, idmap) &&
                       tnum_in(rold->var_off, rcur->var_off);

wouldn't be enough?


>
> And I'm seeing almost zero performance overhead now.
> So, maybe what we figured so far is good enough.
> Need to add more tests, though.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH bpf-next v2 1/4] bpf: verify scalar ids mapping in regsafe() using check_ids()
  2023-06-02 20:03                           ` Alexei Starovoitov
@ 2023-06-02 20:22                             ` Eduard Zingerman
  0 siblings, 0 replies; 33+ messages in thread
From: Eduard Zingerman @ 2023-06-02 20:22 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andrii Nakryiko, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Martin KaFai Lau, Kernel Team, Yonghong Song

On Fri, 2023-06-02 at 13:03 -0700, Alexei Starovoitov wrote:
> On Fri, Jun 2, 2023 at 12:51 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > 
> > On Fri, 2023-06-02 at 12:43 -0700, Alexei Starovoitov wrote:
> > > On Fri, Jun 2, 2023 at 12:37 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > > > > > - do a check as follows:
> > > > > >   if (rold->precise && rold->id && !check_ids(idmap, rold, rcur))
> > > > > 
> > > > > Ignoring rcur->id > 0 ? Is it safe?
> > > > 
> > > > Well, I thought about it a bit and arrived at the following reasoning:
> > > > - suppose checkpoint C exists, is proven safe and has
> > > >   registers r6=Pscalar(range1),id=0 and r7=Pscalar(range2),id=0
> > > > - this means that C is proven safe for any value of
> > > >   r6 in range1 and any value of r7 in range2
> > > > - having same id on r6 and r7 means that r6 and r7 share same value
> > > > - so this is just a special case of what's already proven.
> > > > 
> > > > But having written this down, it looks like I also need to verify
> > > > that range1 and range2 overlap :(
> > > 
> > > I'm lost.
> > > id==0 means there is no relationship between regs.
> > > with
> > > if (rold->precise && rold->id && !check_ids(idmap, rold, rcur))
> > > 
> > > and r6_old->precise
> > > we will only do range_within(rold, rcur) && tnum_in() check
> > > and will ignore r6_cur->id and its relationship with some other reg in cur.
> > > It could be ok.
> > 
> > Yes, but I just realized that for the following case:
> > 
> >   Old                      Cur
> >   r6=Pscalar(range1),id=0  r6=Pscalar(range1),id=1
> >   r7=Pscalar(range2),id=0  r7=Pscalar(range2),id=1
> > 
> > For 'Cur' to be a subset of 'Old' ranges range1 and range2
> > have to have non-empty overlap, so my new check:
> 
> In theory. yes. and most likely that _was_ the case for 'old',
> but 'cur' doesn't need to do that check.
> 'old' was successful already and 'cur' ranges just need to be within.
> 
> so
> 
> >   if (rold->precise && rold->id && !check_ids(idmap, rold, rcur))
> > 
> > is not fully correct.
> 
> still looks correct.

Ok, that's the last piece: since Cur[r6.id=1] and Cur[r7.id=1] share the
same id, they share the same range. So as long as this range is within
both range1 and range2 it is all good. And the range is checked
downstream of regsafe().
So, yes, '(rold->precise && rold->id && !check_ids(idmap, rold, rcur))' is fine.

> 
> > 
> > It was a "clever" attempt to ignore solo scalar IDs in Cur without modifying Cur.
> > I'll think a bit more, sorry for a lot of noise.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH bpf-next v2 1/4] bpf: verify scalar ids mapping in regsafe() using check_ids()
  2023-06-02 20:17                   ` Andrii Nakryiko
@ 2023-06-02 21:13                     ` Eduard Zingerman
  2023-06-02 22:07                       ` Andrii Nakryiko
  0 siblings, 1 reply; 33+ messages in thread
From: Eduard Zingerman @ 2023-06-02 21:13 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Martin KaFai Lau, Kernel Team, Yonghong Song

On Fri, 2023-06-02 at 13:17 -0700, Andrii Nakryiko wrote:
> On Fri, Jun 2, 2023 at 12:13 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > 
> > On Fri, 2023-06-02 at 11:50 -0700, Andrii Nakryiko wrote:
> > [...]
> > > > > The thread is long. Could you please describe it again in pseudo code?
> > > > 
> > > > - Add a function mark_precise_scalar_ids(struct bpf_verifier_env *env,
> > > >                                         struct bpf_verifier_state *st)
> > > >   such that it:
> > > >   - collect PRECISE_IDS: a set of IDs of all registers marked in env->bt
> > > >   - visit all registers with ids from PRECISE_IDS and make sure
> > > >     that these registers are marked in env->bt
> > > > - Call mark_precise_scalar_ids() from __mark_chain_precision()
> > > >   for each state 'st' visited by states chain processing loop,
> > > >   so that:
> > > >   - mark_precise_scalar_ids() is called for current state when
> > > >     __mark_chain_precision() is entered, reusing id assignments in
> > > >     current state;
> > > >   - mark_precise_scalar_ids() is called for each parent state, reusing
> > > >     id assignments valid at 'last_idx' instruction of that state.
> > > > 
> > > > The idea is that in situations like below:
> > > > 
> > > >    4: if (r6 > r7) goto +1
> > > >    5: r7 = r6
> > > >    --- checkpoint #1 ---
> > > >    6: <something>
> > > >    7: if (r7 > X) goto ...
> > > >    8: r7 = 0
> > > >    9: r9 += r6
> > > > 
> > > > The mark_precise_scalar_ids() would be called at:
> > > > - (9) and current id assignments would be used.
> > > > - (6) and id assignments saved in checkpoint #1 would be used.
> > > > 
> > > > If <something> is the code that modifies r6/r7 the link would be
> > > > broken and we would overestimate the set of precise registers.
> > > > 
> > > 
> > > To avoid this we need to recalculate these IDs on each new parent
> > > state, based on requested precision marks. If we keep a simple and
> > > small array of IDs and do a quick linear search over them for each
> > > SCALAR register, I suspect it should be very fast. I don't think in
> > > practice we'll have more than 1-2 IDs in that array, right?
> > 
> > I'm not sure I understand, could you please describe how it should
> > work for e.g.?:
> > 
> >     3: r6 &= 0xf            // assume safe bound
> >     4: if (r6 > r7) goto +1
> >     5: r7 = r6
> >     --- checkpoint #1 ---
> >     6: r7 = 0
> >     7: if (r7 > 10) goto exit;
> >     8: r7 = 0
> >     9: r9 += r6
> > 
> > __mark_chain_precision() would get to checkpoint #1 with only r6 as
> > precise, what should happen next?
> 
> it should mark all SCALARs that have r6's ID in env->bt, and then
> proceed with precision propagation until next parent state? This is
> where you'll mark r7, because in parent state (checkpoint #1) r6.id ==
> r7.id.

That's what I do now.
Sorry, I thought you had a suggestion on how to avoid the precise set
overestimation (e.g. how to detect that "6: r7 = 0" breaks the link).

> It might be easier to just discuss latest code you have, there are
> lots of intricacies, and code wins over words :)

Here is what I have now:
https://github.com/kernel-patches/bpf/compare/bpf-next_base...eddyz87:bpf:verify-ids-for-scalars-in-regsafe-v3
The interesting part is mark_precise_scalar_ids().

But a few tests are not passing because expected messages have to be adjusted.
And a lot of tests have to be added.
We can delay discussion until I submit v3 (worst case tomorrow).

> > As a side note: I added several optimizations:
> > - avoid allocation of scalar ids for constants;
> > - remove sole scalar ids from cached states;
> 
> so that's what I was proposing earlier,

Yes, it turned out beneficial when I inspected logs for bpf_xdp.o.

> but not just from cached
> states, but from any state. As soon as we get SCALAR with some ID that
> is not shared by any other SCALAR, we should try to drop that ID. The
> question is only in how to implement this efficiently.

No, we don't want to do it for non-cached state, not until we generate
scalar ids on stack spills and fills. Otherwise we would break
find_equal_scalars() for the following case:

  r1 = r2         // r1 gains ID
  fp[-8] = r1     //
  r2 = 0          // 
  r1 = 0          // fp[-8] has a unique ID now
  --- checkpoint ---
  r1 = fp[-8]
  r2 = fp[-8]
  if r1 > 10 goto exit; // range propagates to r2 now,
                        // but won't propagate if fp[-8] ID
                        // is cleared at checkpoint

(A bit contrived, but conveys the idea)

And we don't really need to bother about unique IDs in non-cached state
when rold->id check discussed in a sibling thread is used:

		if (rold->precise && rold->id && !check_ids(rold->id, rcur->id, idmap))
			return false;

Here, if rcur->id is unique there are two cases:
- rold->id == 0: then rcur->id is just ignored
- rold->id != 0: then rold->id/rcur->id pair would be added to idmap,
                 there is some other precise old register with the
                 same id as rold->id, so eventually check_ids()
                 would make regsafe() return false.
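
To make the second case concrete, here is a tiny standalone model of the
idmap matching done by check_ids() (names and the fixed-size map are
mine, not the kernel's):

  #include <stdbool.h>
  #include <stdio.h>

  struct id_pair { unsigned int old, cur; };

  /* first use of old_id records old_id -> cur_id, any later use must
   * map to the same cur_id, otherwise the registers don't match
   */
  static bool check_ids_model(unsigned int old_id, unsigned int cur_id,
                              struct id_pair *idmap, int size)
  {
      int i;

      for (i = 0; i < size; i++) {
          if (!idmap[i].old) {
              idmap[i].old = old_id;
              idmap[i].cur = cur_id;
              return true;
          }
          if (idmap[i].old == old_id)
              return idmap[i].cur == cur_id;
      }
      return false;
  }

  int main(void)
  {
      struct id_pair idmap[8] = {};

      /* old: r6.id == r7.id == 1 (linked, precise)
       * cur: r6.id == 2, r7.id == 3 (both unique)
       */
      printf("r6: %d\n", check_ids_model(1, 2, idmap, 8)); /* 1, records 1 -> 2 */
      printf("r7: %d\n", check_ids_model(1, 3, idmap, 8)); /* 0, 1 already maps to 2 */
      return 0;
  }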

> > - do a check as follows:
> >   if (rold->precise && rold->id && !check_ids(idmap, rold, rcur))
> >     return false;
> 
> Hm.. do we need an extra special case here? With the precision changes we
> are discussing, and the removal of singular SCALAR IDs you are proposing,
> just extending the existing logic to:
> 
>                 if (regs_exact(rold, rcur, idmap))
>                         return true;
>                 if (env->explore_alu_limits)
>                         return false;
>                 if (!rold->precise)
>                         return true;
>                 /* new val must satisfy old val knowledge */
>                 return range_within(rold, rcur) &&
>                        check_ids(rold->id, rcur->id, idmap) &&
>                        check_ids(rold->ref_obj_id, rcur->ref_obj_id, idmap) &&
>                        tnum_in(rold->var_off, rcur->var_off);
> 
> wouldn't be enough?

Yes, it could be shortened as below:

                 return range_within(rold, rcur) &&
                        (rold->id == 0 || check_ids(rold->id, rcur->id, idmap)) &&
                        check_ids(rold->ref_obj_id, rcur->ref_obj_id, idmap) &&
                        tnum_in(rold->var_off, rcur->var_off);

but I wanted a separate place to put a long comment at.

> > 
> > And I'm seeing almost zero performance overhead now.
> > So, maybe what we figured so far is good enough.
> > Need to add more tests, though.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH bpf-next v2 1/4] bpf: verify scalar ids mapping in regsafe() using check_ids()
  2023-06-02 21:13                     ` Eduard Zingerman
@ 2023-06-02 22:07                       ` Andrii Nakryiko
  2023-06-02 23:20                         ` Eduard Zingerman
  0 siblings, 1 reply; 33+ messages in thread
From: Andrii Nakryiko @ 2023-06-02 22:07 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Alexei Starovoitov, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Martin KaFai Lau, Kernel Team, Yonghong Song

On Fri, Jun 2, 2023 at 2:13 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Fri, 2023-06-02 at 13:17 -0700, Andrii Nakryiko wrote:
> > On Fri, Jun 2, 2023 at 12:13 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > >
> > > On Fri, 2023-06-02 at 11:50 -0700, Andrii Nakryiko wrote:
> > > [...]
> > > > > > The thread is long. Could you please describe it again in pseudo code?
> > > > >
> > > > > - Add a function mark_precise_scalar_ids(struct bpf_verifier_env *env,
> > > > >                                         struct bpf_verifier_state *st)
> > > > >   such that it:
> > > > >   - collect PRECISE_IDS: a set of IDs of all registers marked in env->bt
> > > > >   - visit all registers with ids from PRECISE_IDS and make sure
> > > > >     that these registers are marked in env->bt
> > > > > - Call mark_precise_scalar_ids() from __mark_chain_precision()
> > > > >   for each state 'st' visited by states chain processing loop,
> > > > >   so that:
> > > > >   - mark_precise_scalar_ids() is called for current state when
> > > > >     __mark_chain_precision() is entered, reusing id assignments in
> > > > >     current state;
> > > > >   - mark_precise_scalar_ids() is called for each parent state, reusing
> > > > >     id assignments valid at 'last_idx' instruction of that state.
> > > > >
> > > > > The idea is that in situations like below:
> > > > >
> > > > >    4: if (r6 > r7) goto +1
> > > > >    5: r7 = r6
> > > > >    --- checkpoint #1 ---
> > > > >    6: <something>
> > > > >    7: if (r7 > X) goto ...
> > > > >    8: r7 = 0
> > > > >    9: r9 += r6
> > > > >
> > > > > The mark_precise_scalar_ids() would be called at:
> > > > > - (9) and current id assignments would be used.
> > > > > - (6) and id assignments saved in checkpoint #1 would be used.
> > > > >
> > > > > If <something> is the code that modifies r6/r7 the link would be
> > > > > broken and we would overestimate the set of precise registers.
> > > > >
> > > >
> > > > To avoid this we need to recalculate these IDs on each new parent
> > > > state, based on requested precision marks. If we keep a simple and
> > > > small array of IDs and do a quick linear search over them for each
> > > > SCALAR register, I suspect it should be very fast. I don't think in
> > > > practice we'll have more than 1-2 IDs in that array, right?
> > >
> > > I'm not sure I understand, could you please describe how it should
> > > work for e.g.?:
> > >
> > >     3: r6 &= 0xf            // assume safe bound
> > >     4: if (r6 > r7) goto +1
> > >     5: r7 = r6
> > >     --- checkpoint #1 ---
> > >     6: r7 = 0
> > >     7: if (r7 > 10) goto exit;
> > >     8: r7 = 0
> > >     9: r9 += r6
> > >
> > > __mark_chain_precision() would get to checkpoint #1 with only r6 as
> > > precise, what should happen next?
> >
> > it should mark all SCALARs that have r6's ID in env->bt, and then
> > proceed with precision propagation until next parent state? This is
> > where you'll mark r7, because in parent state (checkpoint #1) r6.id ==
> > r7.id.
>
> That's what I do now.

Ok, cool...

> Sorry, I thought you had a suggestion on how to avoid the precise set
> overestimation (e.g. how to detect that "6: r7 = 0" breaks the link).

..but "overestimation" confuses me. There is no overestimation. After
checkpoint #1 (let's say we have checkpoint #2 after instruction 9),
r6 and r7 are not linked, and if we had to mark r6 as precise, we'd
mark only r6. But as of checkpoint #1, r7.id == r6.id and they are
linked. So there is no overestimation. They are linked together as of
checkpoint #1.

But anyways, I think we are on the same page, even if we don't use the
same words :)

>
> > It might be easier to just discuss latest code you have, there are
> > lots of intricacies, and code wins over words :)
>
> Here is what I have now:
> https://github.com/kernel-patches/bpf/compare/bpf-next_base...eddyz87:bpf:verify-ids-for-scalars-in-regsafe-v3

feels a bit heavyweight to do sorting and stuff. I think in practice
you'll have only a very small number of linked SCALAR IDs, so a simple
linear search would be faster than all that sort+bsearch.

Look at check_ids(). It's called all the time and is a linear search.
I think it's fine to keep things simple here as well.
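Something like this is what I have in mind (just a sketch, not existing
verifier code; the struct and constant names below are made up):

/* Sketch only: a tiny fixed-size set of linked scalar IDs with linear
 * lookup, in the spirit of check_ids(). Names (scalar_id_set,
 * MAX_LINKED_IDS) are illustrative.
 */
#define MAX_LINKED_IDS 16

struct scalar_id_set {
	u32 ids[MAX_LINKED_IDS];
	u32 cnt;
};

static bool id_set_contains(const struct scalar_id_set *set, u32 id)
{
	u32 i;

	for (i = 0; i < set->cnt; i++)
		if (set->ids[i] == id)
			return true;
	return false;
}

/* Returns false only if the set is full and 'id' is not already present. */
static bool id_set_add(struct scalar_id_set *set, u32 id)
{
	if (id_set_contains(set, id))
		return true;
	if (set->cnt >= MAX_LINKED_IDS)
		return false;
	set->ids[set->cnt++] = id;
	return true;
}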

> The interesting part is mark_precise_scalar_ids().
>
> But a few tests are not passing because expected messages have to be adjusted.
> And a lot of tests have to be added.
> We can delay discussion until I submit v3 (worst case tomorrow).
>
> > > As a side note: I added several optimizations:
> > > - avoid allocation of scalar ids for constants;
> > > - remove sole scalar ids from cached states;
> >
> > so that's what I was proposing earlier,
>
> Yes, it turned out beneficial when I inspected logs for bpf_xdp.o.
>
> > but not just from cached
> > states, but from any state. As soon as we get SCALAR with some ID that
> > is not shared by any other SCALAR, we should try to drop that ID. The
> > question is only in how to implement this efficiently.
>
> No, we don't want to do it for non-cached state, not until we generate
> scalar ids on stack spills and fills. Otherwise we would break
> find_equal_scalars() for the following case:
>
>   r1 = r2         // r1 gains ID
>   fp[-8] = r1     //
>   r2 = 0          //
>   r1 = 0          // fp[-8] has a unique ID now

exactly, as of right now there is no linked registers anymore, so we
can clear ID


>   --- checkpoint ---
>   r1 = fp[-8]

and this is where we should generate a new ID and assign it to r1.id
and fp[-8].id.

How is this fundamentally different from just `r1 = r2`?

>   r2 = fp[-8]

and now r2.id = r1.id = fp[-8].id (but it's different ID than as
before checkpoint)

>   if r1 > 10 goto exit; // range propagates to r2 now,
>                         // but won't propagate if fp[-8] ID
>                         // is cleared at checkpoint
>
> (A bit contrived, but conveys the idea)
>
> And we don't really need to bother about unique IDs in non-cached state
> when rold->id check discussed in a sibling thread is used:
>
>                 if (rold->precise && rold->id && !check_ids(rold->id, rcur->id, idmap))
>                         return false;

It makes me worry that we are mixing no ID and ID-ed SCALARs and
making them equivalent. I need to think some more about implications
(and re-read what you and Alexei discussed). I don't feel good about
this and suspect we'll miss some non-obvious corner case if we do
this.

>
> Here, if rcur->id is unique there are two cases:
> - rold->id == 0: then rcur->id is just ignored
> - rold->id != 0: then rold->id/rcur->id pair would be added to idmap,
>                  there is some other precise old register with the
>                  same id as rold->id, so eventually check_ids()
>                  would make regsafe() return false.
>
> > > - do a check as follows:
> > >   if (rold->precise && rold->id && !check_ids(idmap, rold, rcur))
> > >     return false;
> >
> > Hm.. do we need an extra special case here? With the precision changes we are
> > discussing, and the removal of singular SCALAR IDs you are proposing,
> > just extending the existing logic to:
> >
> >                 if (regs_exact(rold, rcur, idmap))
> >                         return true;
> >                 if (env->explore_alu_limits)
> >                         return false;
> >                 if (!rold->precise)
> >                         return true;
> >                 /* new val must satisfy old val knowledge */
> >                 return range_within(rold, rcur) &&
> >                        check_ids(rold->id, rcur->id, idmap) &&
> >                        check_ids(rold->ref_obj_id, rcur->ref_obj_id, idmap) &&
> >                        tnum_in(rold->var_off, rcur->var_off);
> >
> > wouldn't be enough?
>
> Yes, it could be shortened as below:
>
>                  return range_within(rold, rcur) &&
>                         (rold->id == 0 || check_ids(rold->id, rcur->id, idmap)) &&
>                         check_ids(rold->ref_obj_id, rcur->ref_obj_id, idmap) &&
>                         tnum_in(rold->var_off, rcur->var_off);
>
> but I wanted a separate place to put a long comment at.

you can still put a big comment right here? At the very least I'd move
this new condition to after `if (!rold->precise)`. Where you put it
right now seems a bit out of place.
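I.e., roughly the following placement (a sketch only, reusing the snippet
from above; not a final patch):

	case SCALAR_VALUE:
		if (regs_exact(rold, rcur, idmap))
			return true;
		if (env->explore_alu_limits)
			return false;
		if (!rold->precise)
			return true;
		/* <big comment explaining why scalar ids have to be
		 * mapped through check_ids() goes here>
		 */
		if (rold->id && !check_ids(rold->id, rcur->id, idmap))
			return false;
		/* new val must satisfy old val knowledge */
		return range_within(rold, rcur) &&
		       check_ids(rold->ref_obj_id, rcur->ref_obj_id, idmap) &&
		       tnum_in(rold->var_off, rcur->var_off);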

>
> > >
> > > And I'm seeing almost zero performance overhead now.
> > > So, maybe what we figured so far is good enough.
> > > Need to add more tests, though.
>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH bpf-next v2 1/4] bpf: verify scalar ids mapping in regsafe() using check_ids()
  2023-06-02 22:07                       ` Andrii Nakryiko
@ 2023-06-02 23:20                         ` Eduard Zingerman
  2023-06-05 20:43                           ` Eduard Zingerman
  0 siblings, 1 reply; 33+ messages in thread
From: Eduard Zingerman @ 2023-06-02 23:20 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Martin KaFai Lau, Kernel Team, Yonghong Song

On Fri, 2023-06-02 at 15:07 -0700, Andrii Nakryiko wrote:
[...]
> > > > I'm not sure I understand, could you please describe how it should
> > > > work for e.g.?:
> > > > 
> > > >     3: r6 &= 0xf            // assume safe bound
> > > >     4: if (r6 > r7) goto +1
> > > >     5: r7 = r6
> > > >     --- checkpoint #1 ---
> > > >     6: r7 = 0
> > > >     7: if (r7 > 10) goto exit;
> > > >     8: r7 = 0
> > > >     9: r9 += r6
> > > > 
> > > > __mark_chain_precision() would get to checkpoint #1 with only r6 as
> > > > precise, what should happen next?
> > > 
> > > it should mark all SCALARs that have r6's ID in env->bt, and then
> > > proceed with precision propagation until next parent state? This is
> > > where you'll mark r7, because in parent state (checkpoint #1) r6.id ==
> > > r7.id.
> > 
> > That's what I do now.
> 
> Ok, cool...
> 
> > Sorry, I thought you had a suggestion on how to avoid the precise set
> > overestimation (e.g. how to detect that "6: r7 = 0" breaks the link).
> 
> ..but "overestimation" confuses me. There is no overestimation. After
> checkpoint #1 (let's say we have checkpoint #2 after instruction 9),
> r6 and r7 are not linked, so if we had to mark one of them as precise,
> we'd mark only that one. But as of checkpoint #1, r7.id == r6.id and they are
> linked. So there is no overestimation. They are linked together as of
> checkpoint #1.

Sorry, I should have adjusted the example a bit more, consider:

                                   precise registers
     1: r6 = scalar of range X
     2: r7 = unbound scalar
     3: r8 = unbound scalar
     4: r9 = stack of size X
     5: if (r6 > r7) goto +1
     6: r7 = r6
    --- checkpoint #1 ---          r6, r7 (because r6.id == r7.id in checkpoint #1)
     7: r7 = 0                     r6
     8: if (r7 > r8) goto exit     r6     (unpredictable jump => r{7,8} not precise)
     9: r7 = 0                     r6
    10: r9 += r6                   r6
    --- checkpoint #2 ---

Here there is no real need to mark r7 as precise, but the current
algorithm would. Thus, we overestimate the set of precise registers.
(But it might be fine if the workaround is complicated.)
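For reference, the helper I described does roughly the following
(simplified sketch: helper names like bt_is_reg_set()/bt_set_reg() are
approximate, and stack slots / parent frames are omitted for brevity):

/* Sketch of mark_precise_scalar_ids(): collect ids of scalar registers
 * already marked in env->bt, then mark every scalar register sharing one
 * of those ids. Registers of the current frame only.
 */
static void mark_precise_scalar_ids(struct bpf_verifier_env *env,
				    struct bpf_verifier_state *st)
{
	struct bpf_func_state *func = st->frame[st->curframe];
	u32 ids[MAX_BPF_REG];	/* at most one id per register here */
	int i, j, cnt = 0;

	/* Pass 1: collect ids of scalar registers already marked in env->bt */
	for (i = 0; i < BPF_REG_FP; i++) {
		struct bpf_reg_state *reg = &func->regs[i];

		if (!bt_is_reg_set(&env->bt, i) ||
		    reg->type != SCALAR_VALUE || !reg->id)
			continue;
		for (j = 0; j < cnt; j++)
			if (ids[j] == reg->id)
				break;
		if (j == cnt)
			ids[cnt++] = reg->id;
	}

	/* Pass 2: mark every scalar register that shares one of those ids */
	for (i = 0; i < BPF_REG_FP; i++) {
		struct bpf_reg_state *reg = &func->regs[i];

		if (reg->type != SCALAR_VALUE || !reg->id)
			continue;
		for (j = 0; j < cnt; j++)
			if (ids[j] == reg->id) {
				bt_set_reg(&env->bt, i);
				break;
			}
	}
}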

> But anyways, I think we are on the same page, even if we don't use the
> same words :)
> 
> > 
> > > It might be easier to just discuss latest code you have, there are
> > > lots of intricacies, and code wins over words :)
> > 
> > Here is what I have now:
> > https://github.com/kernel-patches/bpf/compare/bpf-next_base...eddyz87:bpf:verify-ids-for-scalars-in-regsafe-v3
> 
> feels a bit heavyweight to do sorting and stuff. I think in practice
> you'll have only a very small number of linked SCALAR IDs, so a simple
> linear search would be faster than all that sort+bsearch.
> 
> Look at check_ids(). It's called all the time and is a linear search.
> I think it's fine to keep things simple here as well.

:(
Ok, I can keep it linear.
In an ideal world `sort` / `bsearch` would be inlined and would have a
special case for count < 16 (or whatever the magic number is on modern CPUs).

> > The interesting part is mark_precise_scalar_ids().
> > 
> > But a few tests are not passing because expected messages have to be adjusted.
> > And a lot of tests have to be added.
> > We can delay discussion until I submit v3 (worst case tomorrow).
> > 
> > > > As a side note: I added several optimizations:
> > > > - avoid allocation of scalar ids for constants;
> > > > - remove sole scalar ids from cached states;
> > > 
> > > so that's what I was proposing earlier,
> > 
> > Yes, it turned out beneficial when I inspected logs for bpf_xdp.o.
> > 
> > > but not just from cached
> > > states, but from any state. As soon as we get SCALAR with some ID that
> > > is not shared by any other SCALAR, we should try to drop that ID. The
> > > question is only in how to implement this efficiently.
> > 
> > No, we don't want to do it for non-cached state, not until we generate
> > scalar ids on stack spills and fills. Otherwise we would break
> > find_equal_scalars() for the following case:
> > 
> >   r1 = r2         // r1 gains ID
> >   fp[-8] = r1     //
> >   r2 = 0          //
> >   r1 = 0          // fp[-8] has a unique ID now
> 
> exactly, as of right now there is no linked registers anymore, so we
> can clear ID
> 
> 
> >   --- checkpoint ---
> >   r1 = fp[-8]
> 
> and this is where we should generate a new ID and assign it to r1.id
> and fp[-8].id.
> 
> How is this fundamentally different from just `r1 = r2`?

I agree that IDs should be issued on spills and fills.
However, this is not happening right now.
I have an experimental patch-set with id generation at spill (sketched
below), and I see some verification performance issues with it.
If possible, I'd prefer to complete these two tasks separately.
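For context, the idea of that experimental change is roughly as follows
(a sketch only; the helper name and exact hook points are approximations,
not mainline code):

/* Sketch: give a scalar an id when it is spilled, so that a later fill
 * re-links the register with the stack slot (and with any other copy),
 * the same way BPF_MOV does.
 */
static void assign_scalar_id_before_spill(struct bpf_verifier_env *env,
					  struct bpf_reg_state *reg)
{
	if (reg->type == SCALAR_VALUE && !reg->id)
		reg->id = ++env->id_gen;
}

/* spill:  fp[-8] = r1
 *   assign_scalar_id_before_spill(env, &regs[BPF_REG_1]);
 *   ... copy regs[BPF_REG_1] (id included) into the stack slot ...
 *
 * fill:   r1 = fp[-8]
 *   ... copy the stack slot (id included) back into regs[BPF_REG_1] ...
 *   now r1.id == fp[-8].id, so find_equal_scalars() keeps working.
 */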

> >   r2 = fp[-8]
> 
> and now r2.id = r1.id = fp[-8].id (but it's different ID than as
> before checkpoint)
> 
> >   if r1 > 10 goto exit; // range propagates to r2 now,
> >                         // but won't propagate if fp[-8] ID
> >                         // is cleared at checkpoint
> > 
> > (A bit contrived, but conveys the idea)
> > 
> > And we don't really need to bother about unique IDs in non-cached state
> > when rold->id check discussed in a sibling thread is used:
> > 
> >                 if (rold->precise && rold->id && !check_ids(rold->id, rcur->id, idmap))
> >                         return false;
> 
> It makes me worry that we are mixing no ID and ID-ed SCALARs and
> making them equivalent. I need to think some more about implications
> (and re-read what you and Alexei discussed). I don't feel good about
> this and suspect we'll miss some non-obvious corner case if we do
> this.

Here is the complete argument in a single piece:

  Suppose:
    - There is a checkpoint state 'Old' with two registers marked as:
      - rA=precise scalar with range A, id=0;
      - rB=precise scalar with range B, id=0.
    - 'Old' is proven to be safe.
    - There is a new state 'Cur' with two registers marked as:
      - rA=scalar with range C, id=U;
      - rB=scalar with range C, id=U;
        (Note: if rA.id == rB.id the registers have identical range).

  Proposition:
    As long as range C is a subset of both range A and range B
    the state 'Cur' is safe.

  Proof:
    State 'Cur' represents a special case of state 'Old',
    a variant of 'Old' where rA and rB happen to have the same value.
    Thus 'Cur' is safe because 'Old' is safe.

> > 
> > Here, if rcur->id is unique there are two cases:
> > - rold->id == 0: then rcur->id is just ignored
> > - rold->id != 0: then rold->id/rcur->id pair would be added to idmap,
> >                  there is some other precise old register with the
> >                  same id as rold->id, so eventually check_ids()
> >                  would make regsafe() return false.
> > 
> > > > - do a check as follows:
> > > >   if (rold->precise && rold->id && !check_ids(idmap, rold, rcur))
> > > >     return false;
> > > 
> > > Hm.. do we need an extra special case here? With the precision changes we are
> > > discussing, and the removal of singular SCALAR IDs you are proposing,
> > > just extending the existing logic to:
> > > 
> > >                 if (regs_exact(rold, rcur, idmap))
> > >                         return true;
> > >                 if (env->explore_alu_limits)
> > >                         return false;
> > >                 if (!rold->precise)
> > >                         return true;
> > >                 /* new val must satisfy old val knowledge */
> > >                 return range_within(rold, rcur) &&
> > >                        check_ids(rold->id, rcur->id, idmap) &&
> > >                        check_ids(rold->ref_obj_id, rcur->ref_obj_id, idmap) &&
> > >                        tnum_in(rold->var_off, rcur->var_off);
> > > 
> > > wouldn't be enough?
> > 
> > Yes, it could be shortened as below:
> > 
> >                  return range_within(rold, rcur) &&
> >                         (rold->id == 0 || check_ids(rold->id, rcur->id, idmap)) &&
> >                         check_ids(rold->ref_obj_id, rcur->ref_obj_id, idmap) &&
> >                         tnum_in(rold->var_off, rcur->var_off);
> > 
> > but I wanted a separate place to put a long comment at.
> 
> you can still put a big comment right here? At the very least I'd move
> this new condition to after `if (!rold->precise)`. Where you put it
> right now seems a bit out of place.

I'll move the condition after `if (!rold->precise)`.

> > > > And I'm seeing almost zero performance overhead now.
> > > > So, maybe what we figured so far is good enough.
> > > > Need to add more tests, though.
> > 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH bpf-next v2 1/4] bpf: verify scalar ids mapping in regsafe() using check_ids()
  2023-06-02 23:20                         ` Eduard Zingerman
@ 2023-06-05 20:43                           ` Eduard Zingerman
  0 siblings, 0 replies; 33+ messages in thread
From: Eduard Zingerman @ 2023-06-05 20:43 UTC (permalink / raw)
  To: Andrii Nakryiko, Alexei Starovoitov
  Cc: Alexei Starovoitov, bpf, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Kernel Team, Yonghong Song

On Sat, 2023-06-03 at 02:20 +0300, Eduard Zingerman wrote:
> On Fri, 2023-06-02 at 15:07 -0700, Andrii Nakryiko wrote:
> [...]
> > > And we don't really need to bother about unique IDs in non-cached state
> > > when rold->id check discussed in a sibling thread is used:
> > > 
> > >                 if (rold->precise && rold->id && !check_ids(rold->id, rcur->id, idmap))
> > >                         return false;
> > 
> > It makes me worry that we are mixing no ID and ID-ed SCALARs and
> > making them equivalent. I need to think some more about implications
> > (and re-read what you and Alexei discussed). I don't feel good about
> > this and suspect we'll miss some non-obvious corner case if we do
> > this.
> 
> Here is the complete argument in a single piece:
> 
>   Suppose:
>     - There is a checkpoint state 'Old' with two registers marked as:
>       - rA=precise scalar with range A, id=0;
>       - rB=precise scalar with range B, id=0.
>     - 'Old' is proven to be safe.
>     - There is a new state 'Cur' with two registers marked as:
>       - rA=scalar with range C, id=U;
>       - rB=scalar with range C, id=U;
>         (Note: if rA.id == rB.id the registers have identical range).
> 
>   Proposition:
>     As long as range C is a subset of both range A and range B
>     the state 'Cur' is safe.
> 
>   Proof:
>     State 'Cur' represents a special case of state 'Old',
>     a variant of 'Old' where rA and rB happen to have the same value.
>     Thus 'Cur' is safe because 'Old' is safe.
> 

I had a separate discussion with Andrii on this topic.
Andrii is still concerned that the change from:

	if (rold->precise && !check_ids(rold->id, rcur->id, idmap))
		return false;
to:

	if (rold->precise && rold->id && !check_ids(rold->id, rcur->id, idmap))
		return false;

is not fully equivalent, and that there might be a corner case where 'Cur'
and 'Old' don't behave in the same way. I don't have a better argument
than the one stated above.

On the other hand, this optimization is needed to mitigate a relatively
small overhead:

$ ./veristat -e file,prog,states -f "states_pct>10" \
    -C master-baseline.log current.log
File         Program                         States  (DIFF)
-----------  ------------------------------  --------------
bpf_xdp.o    tail_handle_nat_fwd_ipv6        +155 (+23.92%)
bpf_xdp.o    tail_nodeport_nat_ingress_ipv4  +102 (+27.20%)
bpf_xdp.o    tail_rev_nodeport_lb4            +83 (+20.85%)
loop6.bpf.o  trace_virtqueue_add_sgs          +25 (+11.06%)

(there are also a few programs with overhead <1%; I don't list those
 for brevity).
 
On yet another hand:
- the major impact of this change comes from the elimination of unique
  IDs in the 'Cur' state;
- once the register spill/fill code is modified to generate scalar IDs
  in the same way as BPF_MOV does, it would be possible to drop unique
  scalar IDs in both 'Cur' and 'Old' in is_state_visited() /
  clean_live_states() using the same code.

Having this in mind, we decided to withhold this change for now.
I will post an updated patch-set without it, and Andrii will test it on
some more internal BPF programs to see if the impact is still small.

> > > 
> > > Here, if rcur->id is unique there are two cases:
> > > - rold->id == 0: then rcur->id is just ignored
> > > - rold->id != 0: then rold->id/rcur->id pair would be added to idmap,
> > >                  there is some other precise old register with the
> > >                  same id as rold->id, so eventually check_ids()
> > >                  would make regsafe() return false.
> > > 
> > > > > - do a check as follows:
> > > > >   if (rold->precise && rold->id && !check_ids(idmap, rold, rcur))
> > > > >     return false;
> > > > 

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2023-06-05 20:44 UTC | newest]

Thread overview: 33+ messages
2023-05-30 17:27 [PATCH bpf-next v2 0/4] verify scalar ids mapping in regsafe() Eduard Zingerman
2023-05-30 17:27 ` [PATCH bpf-next v2 1/4] bpf: verify scalar ids mapping in regsafe() using check_ids() Eduard Zingerman
2023-05-30 21:37   ` Andrii Nakryiko
2023-05-30 23:02     ` Eduard Zingerman
2023-06-01  2:05       ` Alexei Starovoitov
2023-06-01 16:57         ` Eduard Zingerman
2023-06-01 17:13           ` Alexei Starovoitov
2023-06-01 18:52             ` Eduard Zingerman
2023-06-02 18:50               ` Andrii Nakryiko
2023-06-02 19:13                 ` Eduard Zingerman
2023-06-02 19:27                   ` Alexei Starovoitov
2023-06-02 19:37                     ` Eduard Zingerman
2023-06-02 19:43                       ` Alexei Starovoitov
2023-06-02 19:51                         ` Eduard Zingerman
2023-06-02 20:03                           ` Alexei Starovoitov
2023-06-02 20:22                             ` Eduard Zingerman
2023-06-02 20:17                   ` Andrii Nakryiko
2023-06-02 21:13                     ` Eduard Zingerman
2023-06-02 22:07                       ` Andrii Nakryiko
2023-06-02 23:20                         ` Eduard Zingerman
2023-06-05 20:43                           ` Eduard Zingerman
     [not found]     ` <f2abf39bcd4de841a89bb248de9e242a880aaa93.camel@gmail.com>
2023-05-31 18:29       ` Andrii Nakryiko
2023-05-31 19:30         ` Eduard Zingerman
2023-05-31 20:12           ` Andrii Nakryiko
2023-05-31 20:31             ` Eduard Zingerman
2023-05-31 21:14               ` Andrii Nakryiko
2023-05-31 22:15                 ` Eduard Zingerman
2023-05-31 22:54                   ` Andrii Nakryiko
2023-05-31 23:42                     ` Eduard Zingerman
2023-06-01  0:23                       ` Andrii Nakryiko
2023-05-30 17:27 ` [PATCH bpf-next v2 2/4] selftests/bpf: verify that check_ids() is used for scalars in regsafe() Eduard Zingerman
2023-05-30 17:27 ` [PATCH bpf-next v2 3/4] bpf: filter out scalar ids not used for range transfer " Eduard Zingerman
2023-05-30 17:27 ` [PATCH bpf-next v2 4/4] selftests/bpf: check that env->range_transfer_ids has effect Eduard Zingerman
