* [PATCH v2 bpf-next 0/3] bpf/verifier: subprog/func_call simplifications
From: Edward Cree @ 2018-03-29 22:44 UTC
  To: Alexei Starovoitov, Daniel Borkmann; +Cc: netdev

By storing subprog boundaries as a subprogno mark on each insn, rather than
 a start (and implicit end) for each subprog, we collect a number of gains:
* More efficient determination of which subprog contains a given insn (see
  the sketch below this list), and thus of find_subprog (which subprog
  begins at a given insn).
* Number of verifier "full recursive walk" passes is reduced, since most of
  the work is done in the main insn walk (do_check()).  Leftover work in
  other passes is mostly linear scans (O(insn_cnt)) or, in the case of
  check_max_stack_depth(), a topological sort (O(subprog_cnt)).
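
As an illustration of the first point, subprog containment becomes a
 direct lookup.  A minimal sketch (hypothetical helper; it ignores the
 not-yet-seen case that find_subprog() handles):

	/* which subprog does this insn belong to?  Now O(1). */
	static u16 subprog_of(struct bpf_verifier_env *env, int insn_idx)
	{
		return env->insn_aux_data[insn_idx].subprogno;
	}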

Some other changes were also included to support this:
* Per-subprog info is stored in env->subprog_info, an array of structs,
  rather than several arrays with a common index.
* Call graph is now stored in the new bpf_subprog_info struct; used here
  for check_max_stack_depth() but may have other uses too.

Along with this, patch #3 puts parent pointers (used by liveness analysis)
 in the registers instead of the func_state or verifier_state, so that we
 don't need skip_callee() machinery.  This also does the right thing for
 stack slots, so they don't need their own special handling for liveness
 marking either.

Edward Cree (3):
  bpf/verifier: validate func_calls by marking at do_check() time
  bpf/verifier: update selftests
  bpf/verifier: per-register parent pointers

 include/linux/bpf_verifier.h                |  32 +-
 kernel/bpf/verifier.c                       | 631 +++++++++++++---------------
 tools/testing/selftests/bpf/test_verifier.c |  51 ++-
 3 files changed, 344 insertions(+), 370 deletions(-)

* [PATCH v2 bpf-next 1/3] bpf/verifier: validate func_calls by marking at do_check() time
From: Edward Cree @ 2018-03-29 22:45 UTC
  To: Alexei Starovoitov, Daniel Borkmann; +Cc: netdev

Removes a couple of passes from the verifier: one to check that subprogs
 don't overlap etc., and one to compute max stack depth (which is now done
 by topologically sorting the call graph).  This improves the asymptotic
 complexity of a number of operations, for instance the max stack depth
 check is now O(n) in the number of subprogs, rather than having to walk
 every insn of every possible call chain.
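
As a worked example (hypothetical call graph, depths already multiples of
 the interpreter's 32-byte rounding): if main (stack 64) calls A (stack 96)
 and B (stack 32), and A also calls B, then processing in topological order
 gives total(A) = 64 + 96 = 160 and total(B) = max(64 + 32, 160 + 32) = 192;
 each total is checked against MAX_BPF_STACK.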

Signed-off-by: Edward Cree <ecree@solarflare.com>
---
 include/linux/bpf_verifier.h |  24 ++-
 kernel/bpf/verifier.c        | 451 ++++++++++++++++++++++++-------------------
 2 files changed, 267 insertions(+), 208 deletions(-)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 7e61c395fddf..3af3f9cceede 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -146,6 +146,7 @@ struct bpf_insn_aux_data {
 		s32 call_imm;			/* saved imm field of call insn */
 	};
 	int ctx_field_size; /* the ctx field size for load insn, maybe 0 */
+	u16 subprogno; /* subprog in which this insn resides, valid iff @seen */
 	bool seen; /* this insn was processed by the verifier */
 };
 
@@ -173,6 +174,15 @@ static inline bool bpf_verifier_log_needed(const struct bpf_verifier_log *log)
 
 #define BPF_MAX_SUBPROGS 256
 
+struct bpf_subprog_info {
+	/* which other subprogs does this one directly call? */
+	DECLARE_BITMAP(callees, BPF_MAX_SUBPROGS);
+	u32 start; /* insn idx of function entry point */
+	u16 stack_depth; /* max. stack depth used by this function */
+	u16 total_stack_depth; /* max. stack depth used by entire call chain */
+	u16 len; /* #insns in this subprog */
+};
+
 /* single container for all structs
  * one verifier_env per bpf_check() call
  */
@@ -189,11 +199,10 @@ struct bpf_verifier_env {
 	u32 id_gen;			/* used to generate unique reg IDs */
 	bool allow_ptr_leaks;
 	bool seen_direct_write;
+	bool seen_pseudo_call;		/* populated at check_cfg() time */
 	struct bpf_insn_aux_data *insn_aux_data; /* array of per-insn state */
 	struct bpf_verifier_log log;
-	u32 subprog_starts[BPF_MAX_SUBPROGS];
-	/* computes the stack depth of each bpf function */
-	u16 subprog_stack_depth[BPF_MAX_SUBPROGS + 1];
+	struct bpf_subprog_info subprog_info[BPF_MAX_SUBPROGS];
 	u32 subprog_cnt;
 };
 
@@ -202,11 +211,16 @@ void bpf_verifier_vlog(struct bpf_verifier_log *log, const char *fmt,
 __printf(2, 3) void bpf_verifier_log_write(struct bpf_verifier_env *env,
 					   const char *fmt, ...);
 
-static inline struct bpf_reg_state *cur_regs(struct bpf_verifier_env *env)
+static inline struct bpf_func_state *cur_frame(struct bpf_verifier_env *env)
 {
 	struct bpf_verifier_state *cur = env->cur_state;
 
-	return cur->frame[cur->curframe]->regs;
+	return cur->frame[cur->curframe];
+}
+
+static inline struct bpf_reg_state *cur_regs(struct bpf_verifier_env *env)
+{
+	return cur_frame(env)->regs;
 }
 
 int bpf_prog_offload_verifier_prep(struct bpf_verifier_env *env);
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 8acd2207e412..33963357a7ef 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -736,111 +736,49 @@ enum reg_arg_type {
 	DST_OP_NO_MARK	/* same as above, check only, don't mark */
 };
 
-static int cmp_subprogs(const void *a, const void *b)
+static int find_subprog(struct bpf_verifier_env *env, int insn_idx)
 {
-	return *(int *)a - *(int *)b;
-}
-
-static int find_subprog(struct bpf_verifier_env *env, int off)
-{
-	u32 *p;
+	struct bpf_insn_aux_data *aux;
+	int insn_cnt = env->prog->len;
+	u32 subprogno;
 
-	p = bsearch(&off, env->subprog_starts, env->subprog_cnt,
-		    sizeof(env->subprog_starts[0]), cmp_subprogs);
-	if (!p)
+	if (insn_idx >= insn_cnt || insn_idx < 0) {
+		verbose(env, "find_subprog of invalid insn_idx %d\n", insn_idx);
+		return -EINVAL;
+	}
+	aux = &env->insn_aux_data[insn_idx];
+	if (!aux->seen) /* haven't visited this insn yet */
 		return -ENOENT;
-	return p - env->subprog_starts;
-
+	subprogno = aux->subprogno;
+	/* validate that we are at start of subprog */
+	if (env->subprog_info[subprogno].start != insn_idx) {
+		verbose(env, "insn_idx %d is in subprog %u but that starts at %d\n",
+			insn_idx, subprogno, env->subprog_info[subprogno].start);
+		return -EINVAL;
+	}
+	return subprogno;
 }
 
 static int add_subprog(struct bpf_verifier_env *env, int off)
 {
 	int insn_cnt = env->prog->len;
+	struct bpf_subprog_info *info;
 	int ret;
 
 	if (off >= insn_cnt || off < 0) {
 		verbose(env, "call to invalid destination\n");
 		return -EINVAL;
 	}
-	ret = find_subprog(env, off);
-	if (ret >= 0)
-		return 0;
 	if (env->subprog_cnt >= BPF_MAX_SUBPROGS) {
 		verbose(env, "too many subprograms\n");
 		return -E2BIG;
 	}
-	env->subprog_starts[env->subprog_cnt++] = off;
-	sort(env->subprog_starts, env->subprog_cnt,
-	     sizeof(env->subprog_starts[0]), cmp_subprogs, NULL);
-	return 0;
-}
-
-static int check_subprogs(struct bpf_verifier_env *env)
-{
-	int i, ret, subprog_start, subprog_end, off, cur_subprog = 0;
-	struct bpf_insn *insn = env->prog->insnsi;
-	int insn_cnt = env->prog->len;
-
-	/* determine subprog starts. The end is one before the next starts */
-	for (i = 0; i < insn_cnt; i++) {
-		if (insn[i].code != (BPF_JMP | BPF_CALL))
-			continue;
-		if (insn[i].src_reg != BPF_PSEUDO_CALL)
-			continue;
-		if (!env->allow_ptr_leaks) {
-			verbose(env, "function calls to other bpf functions are allowed for root only\n");
-			return -EPERM;
-		}
-		if (bpf_prog_is_dev_bound(env->prog->aux)) {
-			verbose(env, "function calls in offloaded programs are not supported yet\n");
-			return -EINVAL;
-		}
-		ret = add_subprog(env, i + insn[i].imm + 1);
-		if (ret < 0)
-			return ret;
-	}
-
-	if (env->log.level > 1)
-		for (i = 0; i < env->subprog_cnt; i++)
-			verbose(env, "func#%d @%d\n", i, env->subprog_starts[i]);
-
-	/* now check that all jumps are within the same subprog */
-	subprog_start = 0;
-	if (env->subprog_cnt == cur_subprog)
-		subprog_end = insn_cnt;
-	else
-		subprog_end = env->subprog_starts[cur_subprog++];
-	for (i = 0; i < insn_cnt; i++) {
-		u8 code = insn[i].code;
-
-		if (BPF_CLASS(code) != BPF_JMP)
-			goto next;
-		if (BPF_OP(code) == BPF_EXIT || BPF_OP(code) == BPF_CALL)
-			goto next;
-		off = i + insn[i].off + 1;
-		if (off < subprog_start || off >= subprog_end) {
-			verbose(env, "jump out of range from insn %d to %d\n", i, off);
-			return -EINVAL;
-		}
-next:
-		if (i == subprog_end - 1) {
-			/* to avoid fall-through from one subprog into another
-			 * the last insn of the subprog should be either exit
-			 * or unconditional jump back
-			 */
-			if (code != (BPF_JMP | BPF_EXIT) &&
-			    code != (BPF_JMP | BPF_JA)) {
-				verbose(env, "last insn is not an exit or jmp\n");
-				return -EINVAL;
-			}
-			subprog_start = subprog_end;
-			if (env->subprog_cnt == cur_subprog)
-				subprog_end = insn_cnt;
-			else
-				subprog_end = env->subprog_starts[cur_subprog++];
-		}
-	}
-	return 0;
+	ret = env->subprog_cnt++;
+	info = &env->subprog_info[ret];
+	info->start = off;
+	info->stack_depth = 0;
+	memset(info->callees, 0, sizeof(info->callees));
+	return ret;
 }
 
 static
@@ -1470,80 +1408,84 @@ static int update_stack_depth(struct bpf_verifier_env *env,
 			      const struct bpf_func_state *func,
 			      int off)
 {
-	u16 stack = env->subprog_stack_depth[func->subprogno];
+	struct bpf_subprog_info *info;
 
-	if (stack >= -off)
-		return 0;
+	if (!func)
+		return -EFAULT;
+	if (func->subprogno >= BPF_MAX_SUBPROGS)
+		return -E2BIG;
+	info = &env->subprog_info[func->subprogno];
 
 	/* update known max for given subprogram */
-	env->subprog_stack_depth[func->subprogno] = -off;
+	info->stack_depth = max_t(u16, info->stack_depth, -off);
 	return 0;
 }
 
-/* starting from main bpf function walk all instructions of the function
- * and recursively walk all callees that given function can call.
- * Ignore jump and exit insns.
- * Since recursion is prevented by check_cfg() this algorithm
- * only needs a local stack of MAX_CALL_FRAMES to remember callsites
+/* Topologically sort the call graph, and thereby determine the maximum stack
+ * depth of each subprog's worst-case call chain.  Store in total_stack_depth.
+ * The tsort is performed using Kahn's algorithm.
  */
 static int check_max_stack_depth(struct bpf_verifier_env *env)
 {
-	int depth = 0, frame = 0, subprog = 0, i = 0, subprog_end;
-	struct bpf_insn *insn = env->prog->insnsi;
-	int insn_cnt = env->prog->len;
-	int ret_insn[MAX_CALL_FRAMES];
-	int ret_prog[MAX_CALL_FRAMES];
-
-process_func:
-	/* round up to 32-bytes, since this is granularity
-	 * of interpreter stack size
-	 */
-	depth += round_up(max_t(u32, env->subprog_stack_depth[subprog], 1), 32);
-	if (depth > MAX_BPF_STACK) {
-		verbose(env, "combined stack size of %d calls is %d. Too large\n",
-			frame + 1, depth);
-		return -EACCES;
-	}
-continue_func:
-	if (env->subprog_cnt == subprog)
-		subprog_end = insn_cnt;
-	else
-		subprog_end = env->subprog_starts[subprog];
-	for (; i < subprog_end; i++) {
-		if (insn[i].code != (BPF_JMP | BPF_CALL))
-			continue;
-		if (insn[i].src_reg != BPF_PSEUDO_CALL)
-			continue;
-		/* remember insn and function to return to */
-		ret_insn[frame] = i + 1;
-		ret_prog[frame] = subprog;
-
-		/* find the callee */
-		i = i + insn[i].imm + 1;
-		subprog = find_subprog(env, i);
-		if (subprog < 0) {
-			WARN_ONCE(1, "verifier bug. No program starts at insn %d\n",
-				  i);
-			return -EFAULT;
+	DECLARE_BITMAP(frontier, BPF_MAX_SUBPROGS) = {0};
+	DECLARE_BITMAP(visited, BPF_MAX_SUBPROGS) = {0};
+	int subprog, i, j;
+
+	/* subprog 0 has no incoming edges, and should be the only such */
+	__set_bit(0, frontier);
+	env->subprog_info[0].total_stack_depth = env->subprog_info[0].stack_depth;
+
+	while (true) {
+		/* select a frontier node */
+		subprog = find_first_bit(frontier, BPF_MAX_SUBPROGS);
+		if (subprog >= BPF_MAX_SUBPROGS) /* frontier is empty, done */
+			break;
+		/* remove it from the frontier */
+		__clear_bit(subprog, frontier);
+		/* validate its total stack depth */
+		if (env->subprog_info[subprog].total_stack_depth > MAX_BPF_STACK) {
+			verbose(env, "combined stack size of calls to %d (at insn %d) is %d.  Too large\n",
+				subprog, env->subprog_info[subprog].start,
+				env->subprog_info[subprog].total_stack_depth);
+			return -EACCES;
 		}
-		subprog++;
-		frame++;
-		if (frame >= MAX_CALL_FRAMES) {
-			WARN_ONCE(1, "verifier bug. Call stack is too deep\n");
-			return -EFAULT;
+		if (env->log.level > 1) {
+			verbose(env, "combined stack size of calls to %d (at insn %d) is %d\n",
+				subprog, env->subprog_info[subprog].start,
+				env->subprog_info[subprog].total_stack_depth);
+		}
+		__set_bit(subprog, visited);
+		/* for each callee */
+		for_each_set_bit(i, env->subprog_info[subprog].callees,
+				 BPF_MAX_SUBPROGS) {
+			/* round up to 32-bytes, since this is granularity of
+			 * interpreter stack size
+			 */
+			u16 stack_depth = round_up(max_t(u16, env->subprog_info[i].stack_depth, 1), 32);
+
+			/* Update callee total stack depth */
+			env->subprog_info[i].total_stack_depth = max_t(u16,
+				env->subprog_info[i].total_stack_depth,
+				env->subprog_info[subprog].total_stack_depth +
+				stack_depth);
+			/* does it have unvisited callers? */
+			for_each_clear_bit(j, visited, env->subprog_cnt) {
+				if (test_bit(i, env->subprog_info[j].callees))
+					break;
+			}
+			/* if not, add it to the frontier */
+			if (j >= env->subprog_cnt)
+				__set_bit(i, frontier);
 		}
-		goto process_func;
 	}
-	/* end of for() loop means the last insn of the 'subprog'
-	 * was reached. Doesn't matter whether it was JA or EXIT
-	 */
-	if (frame == 0)
-		return 0;
-	depth -= round_up(max_t(u32, env->subprog_stack_depth[subprog], 1), 32);
-	frame--;
-	i = ret_insn[frame];
-	subprog = ret_prog[frame];
-	goto continue_func;
+
+	/* are any nodes left unvisited? */
+	subprog = find_first_zero_bit(visited, env->subprog_cnt);
+	if (subprog < env->subprog_cnt) {
+		/* then call graph is not acyclic, which shouldn't happen */
+		verbose(env, "verifier bug.  Call graph has a cycle including subprog %d (at insn %d)\n",
+			subprog, env->subprog_info[subprog].start);
+		return -EFAULT;
+	}
+	return 0;
 }
 
 #ifndef CONFIG_BPF_JIT_ALWAYS_ON
@@ -1558,8 +1500,7 @@ static int get_callee_stack_depth(struct bpf_verifier_env *env,
 			  start);
 		return -EFAULT;
 	}
-	subprog++;
-	return env->subprog_stack_depth[subprog];
+	return env->subprog_info[subprog].stack_depth;
 }
 #endif
 
@@ -2097,7 +2038,7 @@ static int check_map_func_compatibility(struct bpf_verifier_env *env,
 	case BPF_FUNC_tail_call:
 		if (map->map_type != BPF_MAP_TYPE_PROG_ARRAY)
 			goto error;
-		if (env->subprog_cnt) {
+		if (env->subprog_cnt > 1) {
 			verbose(env, "tail_calls are not allowed in programs with bpf-to-bpf calls\n");
 			return -EINVAL;
 		}
@@ -2233,22 +2174,33 @@ static int check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 {
 	struct bpf_verifier_state *state = env->cur_state;
 	struct bpf_func_state *caller, *callee;
-	int i, subprog, target_insn;
+	int i, subprog, target;
 
 	if (state->curframe + 1 >= MAX_CALL_FRAMES) {
 		verbose(env, "the call stack of %d frames is too deep\n",
 			state->curframe + 2);
 		return -E2BIG;
 	}
-
-	target_insn = *insn_idx + insn->imm;
-	subprog = find_subprog(env, target_insn + 1);
-	if (subprog < 0) {
-		verbose(env, "verifier bug. No program starts at insn %d\n",
-			target_insn + 1);
-		return -EFAULT;
+	if (!env->allow_ptr_leaks) {
+		verbose(env, "function calls to other bpf functions are allowed for root only\n");
+		return -EPERM;
+	}
+	if (bpf_prog_is_dev_bound(env->prog->aux)) {
+		verbose(env, "function calls in offloaded programs are not supported yet\n");
+		return -EINVAL;
 	}
 
+	target = *insn_idx + insn->imm;
+	/* We will increment insn_idx (PC) in do_check() after handling this
+	 * call insn, so the actual start of the function is target + 1.
+	 */
+	subprog = find_subprog(env, target + 1);
+	if (subprog == -ENOENT)
+		subprog = add_subprog(env, target + 1);
+	if (subprog < 0)
+		return subprog;
+
 	caller = state->frame[state->curframe];
 	if (state->frame[state->curframe + 1]) {
 		verbose(env, "verifier bug. Frame %d already allocated\n",
@@ -2261,6 +2213,9 @@ static int check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 		return -ENOMEM;
 	state->frame[state->curframe + 1] = callee;
 
+	/* record edge in call graph */
+	__set_bit(subprog, env->subprog_info[caller->subprogno].callees);
+
 	/* callee cannot access r0, r6 - r9 for reading and has to write
 	 * into its own stack before reading from it.
 	 * callee can read/write into caller's stack
@@ -2269,7 +2224,7 @@ static int check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 			/* remember the callsite, it will be used by bpf_exit */
 			*insn_idx /* callsite */,
 			state->curframe + 1 /* frameno within this callchain */,
-			subprog + 1 /* subprog number within this prog */);
+			subprog /* subprog number within this prog */);
 
 	/* copy r1 - r5 args that callee can access */
 	for (i = BPF_REG_1; i <= BPF_REG_5; i++)
@@ -2285,7 +2240,7 @@ static int check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 	state->curframe++;
 
 	/* and go analyze first insn of the callee */
-	*insn_idx = target_insn;
+	*insn_idx = target;
 
 	if (env->log.level) {
 		verbose(env, "caller:\n");
@@ -3828,13 +3783,13 @@ static int check_ld_abs(struct bpf_verifier_env *env, struct bpf_insn *insn)
 		return -EINVAL;
 	}
 
-	if (env->subprog_cnt) {
+	if (env->seen_pseudo_call) {
 		/* when program has LD_ABS insn JITs and interpreter assume
 		 * that r1 == ctx == skb which is not the case for callees
 		 * that can have arbitrary arguments. It's problematic
 		 * for main prog as well since JITs would need to analyze
 		 * all functions in order to make proper register save/restore
-		 * decisions in the main prog. Hence disallow LD_ABS with calls
+		 * decisions in the main prog. Hence disallow LD_ABS with calls.
 		 */
 		verbose(env, "BPF_LD_[ABS|IND] instructions cannot be mixed with bpf-to-bpf calls\n");
 		return -EINVAL;
@@ -4016,10 +3971,6 @@ static int check_cfg(struct bpf_verifier_env *env)
 	int ret = 0;
 	int i, t;
 
-	ret = check_subprogs(env);
-	if (ret < 0)
-		return ret;
-
 	insn_state = kcalloc(insn_cnt, sizeof(int), GFP_KERNEL);
 	if (!insn_state)
 		return -ENOMEM;
@@ -4053,6 +4004,7 @@ static int check_cfg(struct bpf_verifier_env *env)
 			if (t + 1 < insn_cnt)
 				env->explored_states[t + 1] = STATE_LIST_MARK;
 			if (insns[t].src_reg == BPF_PSEUDO_CALL) {
+				env->seen_pseudo_call = true;
 				env->explored_states[t] = STATE_LIST_MARK;
 				ret = push_insn(t, t + insns[t].imm + 1, BRANCH, env);
 				if (ret == 1)
@@ -4534,6 +4486,52 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
 	return 0;
 }
 
+/* Checks that all subprogs are contiguous */
+static int validate_subprogs(struct bpf_verifier_env *env)
+{
+	int cur_subprog = -1, i;
+
+	for (i = 0; i < env->prog->len; i++) {
+		struct bpf_insn_aux_data *aux = &env->insn_aux_data[i];
+
+		if (!aux->seen) {
+			if (cur_subprog < 0) { /* can't happen */
+				verbose(env, "unseen insn %d before subprog\n",
+					i);
+				return -EINVAL;
+			}
+			/* unreachable code belongs to the subprog in which it
+			 * is embedded.  Otherwise a forward jump could create
+			 * an apparently non-contiguous subprog.
+			 */
+			aux->subprogno = cur_subprog;
+			continue;
+		}
+		if (aux->subprogno != cur_subprog) {
+			/* change of subprog should only happen at start of
+			 * new subprog, since subprog must start with entry
+			 * point & be contiguous thereafter.
+			 */
+			if (unlikely(aux->subprogno >= env->subprog_cnt)) {
+				/* can't happen */
+				verbose(env, "verifier ran off subprog array, "
+					"%u >= %u\n",
+					aux->subprogno, env->subprog_cnt);
+				return -EINVAL;
+			}
+			if (env->subprog_info[aux->subprogno].start != i) {
+				verbose(env, "subprog %u is non-contiguous at %d\n",
+					aux->subprogno, i);
+				return -EINVAL;
+			}
+			/* entered new subprog */
+			cur_subprog = aux->subprogno;
+		}
+	}
+
+	return 0;
+}
+
 static int do_check(struct bpf_verifier_env *env)
 {
 	struct bpf_verifier_state *state;
@@ -4542,6 +4540,7 @@ static int do_check(struct bpf_verifier_env *env)
 	int insn_cnt = env->prog->len, i;
 	int insn_idx, prev_insn_idx = 0;
 	int insn_processed = 0;
+	int mainprogno;
 	bool do_print_state = false;
 
 	state = kzalloc(sizeof(struct bpf_verifier_state), GFP_KERNEL);
@@ -4555,12 +4554,17 @@ static int do_check(struct bpf_verifier_env *env)
 		return -ENOMEM;
 	}
 	env->cur_state = state;
+	mainprogno = add_subprog(env, 0);
+	if (mainprogno < 0)
+		return mainprogno;
 	init_func_state(env, state->frame[0],
 			BPF_MAIN_FUNC /* callsite */,
 			0 /* frameno */,
-			0 /* subprogno, zero == main subprog */);
+			mainprogno /* subprogno */);
 	insn_idx = 0;
 	for (;;) {
+		struct bpf_insn_aux_data *aux = &env->insn_aux_data[insn_idx];
+		struct bpf_func_state *frame = cur_frame(env);
 		struct bpf_insn *insn;
 		u8 class;
 		int err;
@@ -4581,6 +4585,17 @@ static int do_check(struct bpf_verifier_env *env)
 			return -E2BIG;
 		}
 
+		/* check the insn doesn't appear in multiple subprogs, which
+		 * would imply bad function calls/returns/jumps
+		 */
+		if (aux->seen && aux->subprogno != frame->subprogno) {
+			verbose(env, "insn %d was in subprog %u, now %u\n",
+				insn_idx, aux->subprogno, frame->subprogno);
+			return -EINVAL;
+		}
+		aux->subprogno = frame->subprogno;
+		aux->seen = true;
+
 		err = is_state_visited(env, insn_idx);
 		if (err < 0)
 			return err;
@@ -4605,7 +4620,7 @@ static int do_check(struct bpf_verifier_env *env)
 			else
 				verbose(env, "\nfrom %d to %d:",
 					prev_insn_idx, insn_idx);
-			print_verifier_state(env, state->frame[state->curframe]);
+			print_verifier_state(env, frame);
 			do_print_state = false;
 		}
 
@@ -4627,7 +4642,6 @@ static int do_check(struct bpf_verifier_env *env)
 		}
 
 		regs = cur_regs(env);
-		env->insn_aux_data[insn_idx].seen = true;
 		if (class == BPF_ALU || class == BPF_ALU64) {
 			err = check_alu_op(env, insn);
 			if (err)
@@ -4843,7 +4857,15 @@ static int do_check(struct bpf_verifier_env *env)
 					return err;
 
 				insn_idx++;
-				env->insn_aux_data[insn_idx].seen = true;
+				/* Mark second half of LD_IMM64 insn */
+				aux++;
+				if (aux->seen && aux->subprogno != frame->subprogno) {
+					verbose(env, "insn %d was in subprog %u, now %u\n",
+						insn_idx, aux->subprogno, frame->subprogno);
+					return -EINVAL;
+				}
+				aux->subprogno = frame->subprogno;
+				aux->seen = true;
 			} else {
 				verbose(env, "invalid BPF_LD mode\n");
 				return -EINVAL;
@@ -4858,16 +4880,16 @@ static int do_check(struct bpf_verifier_env *env)
 
 	verbose(env, "processed %d insns (limit %d), stack depth ",
 		insn_processed, BPF_COMPLEXITY_LIMIT_INSNS);
-	for (i = 0; i < env->subprog_cnt + 1; i++) {
-		u32 depth = env->subprog_stack_depth[i];
+	for (i = 0; i < env->subprog_cnt; i++) {
+		u32 depth = env->subprog_info[i].stack_depth;
 
-		verbose(env, "%d", depth);
-		if (i + 1 < env->subprog_cnt + 1)
+		if (i)
 			verbose(env, "+");
+		verbose(env, "%d", depth);
 	}
 	verbose(env, "\n");
-	env->prog->aux->stack_depth = env->subprog_stack_depth[0];
-	return 0;
+	env->prog->aux->stack_depth = env->subprog_info[0].stack_depth;
+	return validate_subprogs(env);
 }
 
 static int check_map_prealloc(struct bpf_map *map)
@@ -5059,8 +5081,10 @@ static int adjust_insn_aux_data(struct bpf_verifier_env *env, u32 prog_len,
 	memcpy(new_data, old_data, sizeof(struct bpf_insn_aux_data) * off);
 	memcpy(new_data + off + cnt - 1, old_data + off,
 	       sizeof(struct bpf_insn_aux_data) * (prog_len - off - cnt + 1));
-	for (i = off; i < off + cnt - 1; i++)
+	for (i = off; i < off + cnt - 1; i++) {
 		new_data[i].seen = true;
+		new_data[i].subprogno = old_data[off].subprogno;
+	}
 	env->insn_aux_data = new_data;
 	vfree(old_data);
 	return 0;
@@ -5073,9 +5097,9 @@ static void adjust_subprog_starts(struct bpf_verifier_env *env, u32 off, u32 len
 	if (len == 1)
 		return;
 	for (i = 0; i < env->subprog_cnt; i++) {
-		if (env->subprog_starts[i] < off)
+		if (env->subprog_info[i].start < off)
 			continue;
-		env->subprog_starts[i] += len - 1;
+		env->subprog_info[i].start += len - 1;
 	}
 }
 
@@ -5234,15 +5258,19 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
 static int jit_subprogs(struct bpf_verifier_env *env)
 {
 	struct bpf_prog *prog = env->prog, **func, *tmp;
-	int i, j, subprog_start, subprog_end = 0, len, subprog;
+	int i, j, subprog;
 	struct bpf_insn *insn;
 	void *old_bpf_func;
 	int err = -ENOMEM;
 
-	if (env->subprog_cnt == 0)
+	if (env->subprog_cnt <= 1)
 		return 0;
 
+	for (i = 0; i < env->subprog_cnt; i++)
+		env->subprog_info[i].len = 0;
+
 	for (i = 0, insn = prog->insnsi; i < prog->len; i++, insn++) {
+		env->subprog_info[env->insn_aux_data[i].subprogno].len++;
 		if (insn->code != (BPF_JMP | BPF_CALL) ||
 		    insn->src_reg != BPF_PSEUDO_CALL)
 			continue;
@@ -5255,7 +5283,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
 		/* temporarily remember subprog id inside insn instead of
 		 * aux_data, since next loop will split up all insns into funcs
 		 */
-		insn->off = subprog + 1;
+		insn->off = subprog;
 		/* remember original imm in case JIT fails and fallback
 		 * to interpreter will be needed
 		 */
@@ -5264,25 +5292,37 @@ static int jit_subprogs(struct bpf_verifier_env *env)
 		insn->imm = 1;
 	}
 
-	func = kzalloc(sizeof(prog) * (env->subprog_cnt + 1), GFP_KERNEL);
+	func = kzalloc(sizeof(prog) * env->subprog_cnt, GFP_KERNEL);
 	if (!func)
 		return -ENOMEM;
 
-	for (i = 0; i <= env->subprog_cnt; i++) {
-		subprog_start = subprog_end;
-		if (env->subprog_cnt == i)
-			subprog_end = prog->len;
-		else
-			subprog_end = env->subprog_starts[i];
+	for (i = 0; i < env->subprog_cnt; i++) {
+		struct bpf_subprog_info *info = &env->subprog_info[i];
+		int k = 0;
 
-		len = subprog_end - subprog_start;
-		func[i] = bpf_prog_alloc(bpf_prog_size(len), GFP_USER);
+		func[i] = bpf_prog_alloc(bpf_prog_size(info->len), GFP_USER);
 		if (!func[i])
 			goto out_free;
-		memcpy(func[i]->insnsi, &prog->insnsi[subprog_start],
-		       len * sizeof(struct bpf_insn));
+
+		for (j = 0; j < prog->len; j++) {
+			if (env->insn_aux_data[j].subprogno != i)
+				continue;
+			if (WARN_ON_ONCE(k >= info->len)) {
+				verbose(env, "Tried to add insn %d to subprog %d but previously counted %d\n",
+					k, i, info->len);
+				err = -EFAULT;
+				goto out_free;
+			}
+			memcpy(func[i]->insnsi + k++, prog->insnsi + j,
+			       sizeof(struct bpf_insn));
+		}
+		if (WARN_ON_ONCE(k != info->len)) {
+			verbose(env, "Only %d insns in subprog %d but previously counted %d\n",
+				k, i, info->len);
+			err = -EFAULT;
+			goto out_free;
+		}
+
+
 		func[i]->type = prog->type;
-		func[i]->len = len;
+		func[i]->len = info->len;
 		if (bpf_prog_calc_tag(func[i]))
 			goto out_free;
 		func[i]->is_func = 1;
@@ -5290,7 +5330,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
 		 * Long term would need debug info to populate names
 		 */
 		func[i]->aux->name[0] = 'F';
-		func[i]->aux->stack_depth = env->subprog_stack_depth[i];
+		func[i]->aux->stack_depth = env->subprog_info[i].stack_depth;
 		func[i]->jit_requested = 1;
 		func[i] = bpf_int_jit_compile(func[i]);
 		if (!func[i]->jited) {
@@ -5303,7 +5343,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
 	 * now populate all bpf_calls with correct addresses and
 	 * run last pass of JIT
 	 */
-	for (i = 0; i <= env->subprog_cnt; i++) {
+	for (i = 0; i < env->subprog_cnt; i++) {
 		insn = func[i]->insnsi;
 		for (j = 0; j < func[i]->len; j++, insn++) {
 			if (insn->code != (BPF_JMP | BPF_CALL) ||
@@ -5311,12 +5351,16 @@ static int jit_subprogs(struct bpf_verifier_env *env)
 				continue;
 			subprog = insn->off;
 			insn->off = 0;
+			if (subprog < 0 || subprog >= env->subprog_cnt) {
+				err = -EFAULT;
+				goto out_free;
+			}
 			insn->imm = (u64 (*)(u64, u64, u64, u64, u64))
 				func[subprog]->bpf_func -
 				__bpf_call_base;
 		}
 	}
-	for (i = 0; i <= env->subprog_cnt; i++) {
+	for (i = 0; i < env->subprog_cnt; i++) {
 		old_bpf_func = func[i]->bpf_func;
 		tmp = bpf_int_jit_compile(func[i]);
 		if (tmp != func[i] || func[i]->bpf_func != old_bpf_func) {
@@ -5330,7 +5374,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
 	/* finally lock prog and jit images for all functions and
 	 * populate kallsysm
 	 */
-	for (i = 0; i <= env->subprog_cnt; i++) {
+	for (i = 0; i < env->subprog_cnt; i++) {
 		bpf_prog_lock_ro(func[i]);
 		bpf_prog_kallsyms_add(func[i]);
 	}
@@ -5347,7 +5391,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
 			continue;
 		insn->off = env->insn_aux_data[i].call_imm;
 		subprog = find_subprog(env, i + insn->off + 1);
-		addr  = (unsigned long)func[subprog + 1]->bpf_func;
+		addr  = (unsigned long)func[subprog]->bpf_func;
 		addr &= PAGE_MASK;
 		insn->imm = (u64 (*)(u64, u64, u64, u64, u64))
 			    addr - __bpf_call_base;
@@ -5356,10 +5400,10 @@ static int jit_subprogs(struct bpf_verifier_env *env)
 	prog->jited = 1;
 	prog->bpf_func = func[0]->bpf_func;
 	prog->aux->func = func;
-	prog->aux->func_cnt = env->subprog_cnt + 1;
+	prog->aux->func_cnt = env->subprog_cnt;
 	return 0;
 out_free:
-	for (i = 0; i <= env->subprog_cnt; i++)
+	for (i = 0; i < env->subprog_cnt; i++)
 		if (func[i])
 			bpf_jit_free(func[i]);
 	kfree(func);
@@ -5389,6 +5433,7 @@ static int fixup_call_args(struct bpf_verifier_env *env)
 		err = jit_subprogs(env);
 		if (err == 0)
 			return 0;
+		verbose(env, "failed to jit_subprogs %d\n", err);
 	}
 #ifndef CONFIG_BPF_JIT_ALWAYS_ON
 	for (i = 0; i < prog->len; i++, insn++) {

* [PATCH v2 bpf-next 2/3] bpf/verifier: update selftests
From: Edward Cree @ 2018-03-29 22:46 UTC
  To: Alexei Starovoitov, Daniel Borkmann; +Cc: netdev

Error messages for some bad programs have changed, partly because we now
 check for loops / out-of-bounds jumps before checking subprogs.
Also added a test ("calls: interleaved functions") to ensure that subprogs
 are required to be contiguous.
It wasn't entirely clear to me what "calls: wrong recursive calls" was
 meant to test for, since all of the JMP|CALL insns are unreachable.  I've
 changed it so that they are now reachable, which causes static back-edges
 to be detected (since that, like insn reachability, is now tested before
 subprog boundaries are determined).
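
For reference, the layout of the new interleaved-functions test, as I read
 it, is:

	insn 0: call +2  ; main, calls the subprog entry at insn 3
	insn 1: r0 = 1   ; main
	insn 2: ja +2    ; main, jumps over the subprog body to insn 5
	insn 3: r0 = 2   ; subprog 1 entry
	insn 4: ja +1    ; subprog 1, jumps to insn 6
	insn 5: exit     ; main again, hence "subprog 0 is non-contiguous at 5"
	insn 6: exit     ; subprog 1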

Signed-off-by: Edward Cree <ecree@solarflare.com>
---
 tools/testing/selftests/bpf/test_verifier.c | 51 ++++++++++++++++++-----------
 1 file changed, 32 insertions(+), 19 deletions(-)

diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
index 3e7718b1a9ae..cc45a0b52439 100644
--- a/tools/testing/selftests/bpf/test_verifier.c
+++ b/tools/testing/selftests/bpf/test_verifier.c
@@ -646,7 +646,7 @@ static struct bpf_test tests[] = {
 		.insns = {
 			BPF_ALU64_REG(BPF_MOV, BPF_REG_0, BPF_REG_2),
 		},
-		.errstr = "not an exit",
+		.errstr = "jump out of range",
 		.result = REJECT,
 	},
 	{
@@ -9442,13 +9442,13 @@ static struct bpf_test tests[] = {
 			BPF_EXIT_INSN(),
 		},
 		.prog_type = BPF_PROG_TYPE_TRACEPOINT,
-		.errstr = "last insn is not an exit or jmp",
+		.errstr = "insn 1 was in subprog 1, now 0",
 		.result = REJECT,
 	},
 	{
 		"calls: wrong recursive calls",
 		.insns = {
-			BPF_JMP_IMM(BPF_JA, 0, 0, 4),
+			BPF_JMP_IMM(BPF_JA, 0, 0, 3),
 			BPF_JMP_IMM(BPF_JA, 0, 0, 4),
 			BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 1, 0, -2),
 			BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 1, 0, -2),
@@ -9457,7 +9457,7 @@ static struct bpf_test tests[] = {
 			BPF_EXIT_INSN(),
 		},
 		.prog_type = BPF_PROG_TYPE_TRACEPOINT,
-		.errstr = "jump out of range",
+		.errstr = "back-edge from insn",
 		.result = REJECT,
 	},
 	{
@@ -9508,7 +9508,7 @@ static struct bpf_test tests[] = {
 			BPF_EXIT_INSN(),
 		},
 		.prog_type = BPF_PROG_TYPE_TRACEPOINT,
-		.errstr = "jump out of range",
+		.errstr = "insn 5 was in subprog 1, now 0",
 		.result = REJECT,
 	},
 	{
@@ -9787,7 +9787,7 @@ static struct bpf_test tests[] = {
 			BPF_EXIT_INSN(),
 		},
 		.prog_type = BPF_PROG_TYPE_SCHED_CLS,
-		.errstr = "jump out of range from insn 1 to 4",
+		.errstr = "insn 5 was in subprog 1, now 0",
 		.result = REJECT,
 	},
 	{
@@ -9803,13 +9803,12 @@ static struct bpf_test tests[] = {
 			BPF_ALU64_REG(BPF_ADD, BPF_REG_7, BPF_REG_0),
 			BPF_MOV64_REG(BPF_REG_0, BPF_REG_7),
 			BPF_EXIT_INSN(),
-			BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_1,
-				    offsetof(struct __sk_buff, len)),
+			BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1, 8),
 			BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, -3),
 			BPF_EXIT_INSN(),
 		},
 		.prog_type = BPF_PROG_TYPE_TRACEPOINT,
-		.errstr = "jump out of range from insn 11 to 9",
+		.errstr = "insn 9 was in subprog 1, now 2",
 		.result = REJECT,
 	},
 	{
@@ -9861,7 +9860,7 @@ static struct bpf_test tests[] = {
 			BPF_EXIT_INSN(),
 		},
 		.prog_type = BPF_PROG_TYPE_TRACEPOINT,
-		.errstr = "invalid destination",
+		.errstr = "jump out of range from insn 2 to -1",
 		.result = REJECT,
 	},
 	{
@@ -9873,7 +9872,7 @@ static struct bpf_test tests[] = {
 			BPF_EXIT_INSN(),
 		},
 		.prog_type = BPF_PROG_TYPE_TRACEPOINT,
-		.errstr = "invalid destination",
+		.errstr = "jump out of range from insn 2 to -2147483646",
 		.result = REJECT,
 	},
 	{
@@ -9886,7 +9885,7 @@ static struct bpf_test tests[] = {
 			BPF_EXIT_INSN(),
 		},
 		.prog_type = BPF_PROG_TYPE_TRACEPOINT,
-		.errstr = "jump out of range",
+		.errstr = "insn 1 was in subprog 0, now 1",
 		.result = REJECT,
 	},
 	{
@@ -9899,7 +9898,7 @@ static struct bpf_test tests[] = {
 			BPF_EXIT_INSN(),
 		},
 		.prog_type = BPF_PROG_TYPE_TRACEPOINT,
-		.errstr = "jump out of range",
+		.errstr = "insn 4 was in subprog 1, now 0",
 		.result = REJECT,
 	},
 	{
@@ -9913,7 +9912,7 @@ static struct bpf_test tests[] = {
 			BPF_JMP_IMM(BPF_JEQ, BPF_REG_1, 0, -2),
 		},
 		.prog_type = BPF_PROG_TYPE_TRACEPOINT,
-		.errstr = "not an exit",
+		.errstr = "jump out of range from insn 5 to 6",
 		.result = REJECT,
 	},
 	{
@@ -9927,7 +9926,7 @@ static struct bpf_test tests[] = {
 			BPF_EXIT_INSN(),
 		},
 		.prog_type = BPF_PROG_TYPE_TRACEPOINT,
-		.errstr = "last insn",
+		.errstr = "insn_idx 5 is in subprog 1 but that starts at 4",
 		.result = REJECT,
 	},
 	{
@@ -9942,7 +9941,7 @@ static struct bpf_test tests[] = {
 			BPF_EXIT_INSN(),
 		},
 		.prog_type = BPF_PROG_TYPE_TRACEPOINT,
-		.errstr = "last insn",
+		.errstr = "insn_idx 5 is in subprog 1 but that starts at 4",
 		.result = REJECT,
 	},
 	{
@@ -9982,12 +9981,11 @@ static struct bpf_test tests[] = {
 			BPF_ALU64_REG(BPF_ADD, BPF_REG_7, BPF_REG_0),
 			BPF_MOV64_REG(BPF_REG_0, BPF_REG_7),
 			BPF_MOV64_REG(BPF_REG_0, BPF_REG_0),
-			BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_1,
-				    offsetof(struct __sk_buff, len)),
+			BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1, 8),
 			BPF_EXIT_INSN(),
 		},
 		.prog_type = BPF_PROG_TYPE_TRACEPOINT,
-		.errstr = "not an exit",
+		.errstr = "insn 10 was in subprog 2, now 1",
 		.result = REJECT,
 	},
 	{
@@ -11423,6 +11421,21 @@ static struct bpf_test tests[] = {
 		.errstr = "BPF_XADD stores into R2 packet",
 		.prog_type = BPF_PROG_TYPE_XDP,
 	},
+	{
+		"calls: interleaved functions",
+		.insns = {
+			BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 1, 0, 2),
+			BPF_MOV64_IMM(BPF_REG_0, 1),
+			BPF_JMP_IMM(BPF_JA, 0, 0, 2),
+			BPF_MOV64_IMM(BPF_REG_0, 2),
+			BPF_JMP_IMM(BPF_JA, 0, 0, 1),
+			BPF_EXIT_INSN(),
+			BPF_EXIT_INSN(),
+		},
+		.prog_type = BPF_PROG_TYPE_TRACEPOINT,
+		.errstr = "subprog 0 is non-contiguous at 5",
+		.result = REJECT,
+	},
 };
 
 static int probe_filter_length(const struct bpf_insn *fp)

* [PATCH v2 bpf-next 3/3] bpf/verifier: per-register parent pointers
From: Edward Cree @ 2018-03-29 22:46 UTC
  To: Alexei Starovoitov, Daniel Borkmann; +Cc: netdev

By giving each register its own liveness chain, we elide the skip_callee()
 logic.  Instead, each register's parent is the state it inherits from;
 both check_func_call() and prepare_func_exit() automatically connect
 reg states to the correct chain since when they copy the reg state across
 (r1-r5 into the callee as args, and r0 out as the return value) they also
 copy the parent pointer.
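
Concretely, the copy in check_func_call() is just

	/* copy r1 - r5 args that callee can access */
	for (i = BPF_REG_1; i <= BPF_REG_5; i++)
		callee->regs[i] = caller->regs[i];

 and since the struct assignment copies the embedded parent pointer, a later
 mark_reg_read() on the callee's copy walks straight into the caller's
 parentage chain, with no frame-skipping needed.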

Signed-off-by: Edward Cree <ecree@solarflare.com>
---
 include/linux/bpf_verifier.h |   8 +-
 kernel/bpf/verifier.c        | 180 ++++++++++---------------------------------
 2 files changed, 45 insertions(+), 143 deletions(-)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 3af3f9cceede..2ec31b388dd6 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -41,6 +41,7 @@ enum bpf_reg_liveness {
 };
 
 struct bpf_reg_state {
+	/* Ordering of fields matters.  See states_equal() */
 	enum bpf_reg_type type;
 	union {
 		/* valid when type == PTR_TO_PACKET */
@@ -59,7 +60,6 @@ struct bpf_reg_state {
 	 * came from, when one is tested for != NULL.
 	 */
 	u32 id;
-	/* Ordering of fields matters.  See states_equal() */
 	/* For scalar types (SCALAR_VALUE), this represents our knowledge of
 	 * the actual value.
 	 * For pointer types, this represents the variable part of the offset
@@ -76,15 +76,15 @@ struct bpf_reg_state {
 	s64 smax_value; /* maximum possible (s64)value */
 	u64 umin_value; /* minimum possible (u64)value */
 	u64 umax_value; /* maximum possible (u64)value */
+	/* parentage chain for liveness checking */
+	struct bpf_reg_state *parent;
 	/* Inside the callee two registers can be both PTR_TO_STACK like
 	 * R1=fp-8 and R2=fp-8, but one of them points to this function stack
 	 * while another to the caller's stack. To differentiate them 'frameno'
 	 * is used which is an index in bpf_verifier_state->frame[] array
 	 * pointing to bpf_func_state.
-	 * This field must be second to last, for states_equal() reasons.
 	 */
 	u32 frameno;
-	/* This field must be last, for states_equal() reasons. */
 	enum bpf_reg_liveness live;
 };
 
@@ -107,7 +107,6 @@ struct bpf_stack_state {
  */
 struct bpf_func_state {
 	struct bpf_reg_state regs[MAX_BPF_REG];
-	struct bpf_verifier_state *parent;
 	/* index of call instruction that called into this func */
 	int callsite;
 	/* stack frame number of this function state from pov of
@@ -129,7 +128,6 @@ struct bpf_func_state {
 struct bpf_verifier_state {
 	/* call stack tracking */
 	struct bpf_func_state *frame[MAX_CALL_FRAMES];
-	struct bpf_verifier_state *parent;
 	u32 curframe;
 };
 
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 33963357a7ef..edb2ec0da95c 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -355,9 +355,9 @@ static int copy_stack_state(struct bpf_func_state *dst,
 /* do_check() starts with zero-sized stack in struct bpf_verifier_state to
  * make it consume minimal amount of memory. check_stack_write() access from
  * the program calls into realloc_func_state() to grow the stack size.
- * Note there is a non-zero 'parent' pointer inside bpf_verifier_state
- * which this function copies over. It points to previous bpf_verifier_state
- * which is never reallocated
+ * Note there is a non-zero parent pointer inside each reg of bpf_verifier_state
+ * which this function copies over. It points to corresponding reg in previous
+ * bpf_verifier_state which is never reallocated
  */
 static int realloc_func_state(struct bpf_func_state *state, int size,
 			      bool copy_old)
@@ -441,7 +441,6 @@ static int copy_verifier_state(struct bpf_verifier_state *dst_state,
 		dst_state->frame[i] = NULL;
 	}
 	dst_state->curframe = src->curframe;
-	dst_state->parent = src->parent;
 	for (i = 0; i <= src->curframe; i++) {
 		dst = dst_state->frame[i];
 		if (!dst) {
@@ -707,6 +706,7 @@ static void init_reg_state(struct bpf_verifier_env *env,
 	for (i = 0; i < MAX_BPF_REG; i++) {
 		mark_reg_not_init(env, regs, i);
 		regs[i].live = REG_LIVE_NONE;
+		regs[i].parent = NULL;
 	}
 
 	/* frame pointer */
@@ -781,74 +781,21 @@ static int add_subprog(struct bpf_verifier_env *env, int off)
 	return ret;
 }
 
-static
-struct bpf_verifier_state *skip_callee(struct bpf_verifier_env *env,
-				       const struct bpf_verifier_state *state,
-				       struct bpf_verifier_state *parent,
-				       u32 regno)
-{
-	struct bpf_verifier_state *tmp = NULL;
-
-	/* 'parent' could be a state of caller and
-	 * 'state' could be a state of callee. In such case
-	 * parent->curframe < state->curframe
-	 * and it's ok for r1 - r5 registers
-	 *
-	 * 'parent' could be a callee's state after it bpf_exit-ed.
-	 * In such case parent->curframe > state->curframe
-	 * and it's ok for r0 only
-	 */
-	if (parent->curframe == state->curframe ||
-	    (parent->curframe < state->curframe &&
-	     regno >= BPF_REG_1 && regno <= BPF_REG_5) ||
-	    (parent->curframe > state->curframe &&
-	       regno == BPF_REG_0))
-		return parent;
-
-	if (parent->curframe > state->curframe &&
-	    regno >= BPF_REG_6) {
-		/* for callee saved regs we have to skip the whole chain
-		 * of states that belong to callee and mark as LIVE_READ
-		 * the registers before the call
-		 */
-		tmp = parent;
-		while (tmp && tmp->curframe != state->curframe) {
-			tmp = tmp->parent;
-		}
-		if (!tmp)
-			goto bug;
-		parent = tmp;
-	} else {
-		goto bug;
-	}
-	return parent;
-bug:
-	verbose(env, "verifier bug regno %d tmp %p\n", regno, tmp);
-	verbose(env, "regno %d parent frame %d current frame %d\n",
-		regno, parent->curframe, state->curframe);
-	return NULL;
-}
-
+/* Parentage chain of this register (or stack slot) should take care of all
+ * issues like callee-saved registers, stack slot allocation time, etc.
+ */
 static int mark_reg_read(struct bpf_verifier_env *env,
-			 const struct bpf_verifier_state *state,
-			 struct bpf_verifier_state *parent,
-			 u32 regno)
+			 const struct bpf_reg_state *state,
+			 struct bpf_reg_state *parent)
 {
 	bool writes = parent == state->parent; /* Observe write marks */
 
-	if (regno == BPF_REG_FP)
-		/* We don't need to worry about FP liveness because it's read-only */
-		return 0;
-
 	while (parent) {
 		/* if read wasn't screened by an earlier write ... */
-		if (writes && state->frame[state->curframe]->regs[regno].live & REG_LIVE_WRITTEN)
+		if (writes && state->live & REG_LIVE_WRITTEN)
 			break;
-		parent = skip_callee(env, state, parent, regno);
-		if (!parent)
-			return -EFAULT;
 		/* ... then we depend on parent's value */
-		parent->frame[parent->curframe]->regs[regno].live |= REG_LIVE_READ;
+		parent->live |= REG_LIVE_READ;
 		state = parent;
 		parent = state->parent;
 		writes = true;
@@ -874,7 +821,10 @@ static int check_reg_arg(struct bpf_verifier_env *env, u32 regno,
 			verbose(env, "R%d !read_ok\n", regno);
 			return -EACCES;
 		}
-		return mark_reg_read(env, vstate, vstate->parent, regno);
+		/* We don't need to worry about FP liveness because it's read-only */
+		if (regno != BPF_REG_FP)
+			return mark_reg_read(env, &regs[regno],
+					     regs[regno].parent);
 	} else {
 		/* check whether register used as dest operand can be written to */
 		if (regno == BPF_REG_FP) {
@@ -986,61 +936,6 @@ static int check_stack_write(struct bpf_verifier_env *env,
 	return 0;
 }
 
-/* registers of every function are unique and mark_reg_read() propagates
- * the liveness in the following cases:
- * - from callee into caller for R1 - R5 that were used as arguments
- * - from caller into callee for R0 that used as result of the call
- * - from caller to the same caller skipping states of the callee for R6 - R9,
- *   since R6 - R9 are callee saved by implicit function prologue and
- *   caller's R6 != callee's R6, so when we propagate liveness up to
- *   parent states we need to skip callee states for R6 - R9.
- *
- * stack slot marking is different, since stacks of caller and callee are
- * accessible in both (since caller can pass a pointer to caller's stack to
- * callee which can pass it to another function), hence mark_stack_slot_read()
- * has to propagate the stack liveness to all parent states at given frame number.
- * Consider code:
- * f1() {
- *   ptr = fp - 8;
- *   *ptr = ctx;
- *   call f2 {
- *      .. = *ptr;
- *   }
- *   .. = *ptr;
- * }
- * First *ptr is reading from f1's stack and mark_stack_slot_read() has
- * to mark liveness at the f1's frame and not f2's frame.
- * Second *ptr is also reading from f1's stack and mark_stack_slot_read() has
- * to propagate liveness to f2 states at f1's frame level and further into
- * f1 states at f1's frame level until write into that stack slot
- */
-static void mark_stack_slot_read(struct bpf_verifier_env *env,
-				 const struct bpf_verifier_state *state,
-				 struct bpf_verifier_state *parent,
-				 int slot, int frameno)
-{
-	bool writes = parent == state->parent; /* Observe write marks */
-
-	while (parent) {
-		if (parent->frame[frameno]->allocated_stack <= slot * BPF_REG_SIZE)
-			/* since LIVE_WRITTEN mark is only done for full 8-byte
-			 * write the read marks are conservative and parent
-			 * state may not even have the stack allocated. In such case
-			 * end the propagation, since the loop reached beginning
-			 * of the function
-			 */
-			break;
-		/* if read wasn't screened by an earlier write ... */
-		if (writes && state->frame[frameno]->stack[slot].spilled_ptr.live & REG_LIVE_WRITTEN)
-			break;
-		/* ... then we depend on parent's value */
-		parent->frame[frameno]->stack[slot].spilled_ptr.live |= REG_LIVE_READ;
-		state = parent;
-		parent = state->parent;
-		writes = true;
-	}
-}
-
 static int check_stack_read(struct bpf_verifier_env *env,
 			    struct bpf_func_state *reg_state /* func where register points to */,
 			    int off, int size, int value_regno)
@@ -1078,8 +973,8 @@ static int check_stack_read(struct bpf_verifier_env *env,
 			 */
 			state->regs[value_regno].live |= REG_LIVE_WRITTEN;
 		}
-		mark_stack_slot_read(env, vstate, vstate->parent, spi,
-				     reg_state->frameno);
+		mark_reg_read(env, &reg_state->stack[spi].spilled_ptr,
+			      reg_state->stack[spi].spilled_ptr.parent);
 		return 0;
 	} else {
 		int zeros = 0;
@@ -1095,8 +990,8 @@ static int check_stack_read(struct bpf_verifier_env *env,
 				off, i, size);
 			return -EACCES;
 		}
-		mark_stack_slot_read(env, vstate, vstate->parent, spi,
-				     reg_state->frameno);
+		mark_reg_read(env, &reg_state->stack[spi].spilled_ptr,
+			      reg_state->stack[spi].spilled_ptr.parent);
 		if (value_regno >= 0) {
 			if (zeros == size) {
 				/* any size read into register is zero extended,
@@ -1783,8 +1678,8 @@ static int check_stack_boundary(struct bpf_verifier_env *env, int regno,
 		/* reading any byte out of 8-byte 'spill_slot' will cause
 		 * the whole slot to be marked as 'read'
 		 */
-		mark_stack_slot_read(env, env->cur_state, env->cur_state->parent,
-				     spi, state->frameno);
+		mark_reg_read(env, &state->stack[spi].spilled_ptr,
+			      state->stack[spi].spilled_ptr.parent);
 	}
 	return update_stack_depth(env, state, off);
 }
@@ -2226,11 +2121,13 @@ static int check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 			state->curframe + 1 /* frameno within this callchain */,
 			subprog /* subprog number within this prog */);
 
-	/* copy r1 - r5 args that callee can access */
+	/* copy r1 - r5 args that callee can access.  The copy includes parent
+	 * pointers, which connects us up to the liveness chain
+	 */
 	for (i = BPF_REG_1; i <= BPF_REG_5; i++)
 		callee->regs[i] = caller->regs[i];
 
-	/* after the call regsiters r0 - r5 were scratched */
+	/* after the call registers r0 - r5 were scratched */
 	for (i = 0; i < CALLER_SAVED_REGS; i++) {
 		mark_reg_not_init(env, caller->regs, caller_saved[i]);
 		check_reg_arg(env, caller_saved[i], DST_OP_NO_MARK);
@@ -4136,7 +4033,7 @@ static bool regsafe(struct bpf_reg_state *rold, struct bpf_reg_state *rcur,
 		/* explored state didn't use this */
 		return true;
 
-	equal = memcmp(rold, rcur, offsetof(struct bpf_reg_state, frameno)) == 0;
+	equal = memcmp(rold, rcur, offsetof(struct bpf_reg_state, parent)) == 0;
 
 	if (rold->type == PTR_TO_STACK)
 		/* two stack pointers are equal only if they're pointing to
@@ -4369,7 +4266,7 @@ static bool states_equal(struct bpf_verifier_env *env,
  * equivalent state (jump target or such) we didn't arrive by the straight-line
  * code, so read marks in the state must propagate to the parent regardless
  * of the state's write marks. That's what 'parent == state->parent' comparison
- * in mark_reg_read() and mark_stack_slot_read() is for.
+ * in mark_reg_read() is for.
  */
 static int propagate_liveness(struct bpf_verifier_env *env,
 			      const struct bpf_verifier_state *vstate,
@@ -4390,7 +4287,8 @@ static int propagate_liveness(struct bpf_verifier_env *env,
 		if (vparent->frame[vparent->curframe]->regs[i].live & REG_LIVE_READ)
 			continue;
 		if (vstate->frame[vstate->curframe]->regs[i].live & REG_LIVE_READ) {
-			err = mark_reg_read(env, vstate, vparent, i);
+			err = mark_reg_read(env, &vstate->frame[vstate->curframe]->regs[i],
+					    &vparent->frame[vstate->curframe]->regs[i]);
 			if (err)
 				return err;
 		}
@@ -4405,7 +4303,8 @@ static int propagate_liveness(struct bpf_verifier_env *env,
 			if (parent->stack[i].spilled_ptr.live & REG_LIVE_READ)
 				continue;
 			if (state->stack[i].spilled_ptr.live & REG_LIVE_READ)
-				mark_stack_slot_read(env, vstate, vparent, i, frame);
+				mark_reg_read(env, &state->stack[i].spilled_ptr,
+					      &parent->stack[i].spilled_ptr);
 		}
 	}
 	return err;
@@ -4415,7 +4314,7 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
 {
 	struct bpf_verifier_state_list *new_sl;
 	struct bpf_verifier_state_list *sl;
-	struct bpf_verifier_state *cur = env->cur_state;
+	struct bpf_verifier_state *cur = env->cur_state, *new;
 	int i, j, err;
 
 	sl = env->explored_states[insn_idx];
@@ -4457,16 +4356,18 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
 		return -ENOMEM;
 
 	/* add new state to the head of linked list */
-	err = copy_verifier_state(&new_sl->state, cur);
+	new = &new_sl->state;
+	err = copy_verifier_state(new, cur);
 	if (err) {
-		free_verifier_state(&new_sl->state, false);
+		free_verifier_state(new, false);
 		kfree(new_sl);
 		return err;
 	}
 	new_sl->next = env->explored_states[insn_idx];
 	env->explored_states[insn_idx] = new_sl;
 	/* connect new state to parentage chain */
-	cur->parent = &new_sl->state;
+	for (i = 0; i < BPF_REG_FP; i++)
+		cur_regs(env)[i].parent = &new->frame[new->curframe]->regs[i];
 	/* clear write marks in current state: the writes we did are not writes
 	 * our child did, so they don't screen off its reads from us.
 	 * (There are no read marks in current state, because reads always mark
@@ -4479,9 +4380,13 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
 	/* all stack frames are accessible from callee, clear them all */
 	for (j = 0; j <= cur->curframe; j++) {
 		struct bpf_func_state *frame = cur->frame[j];
+		struct bpf_func_state *newframe = new->frame[j];
 
-		for (i = 0; i < frame->allocated_stack / BPF_REG_SIZE; i++)
+		for (i = 0; i < frame->allocated_stack / BPF_REG_SIZE; i++) {
 			frame->stack[i].spilled_ptr.live = REG_LIVE_NONE;
+			frame->stack[i].spilled_ptr.parent =
+						&newframe->stack[i].spilled_ptr;
+		}
 	}
 	return 0;
 }
@@ -4547,7 +4452,6 @@ static int do_check(struct bpf_verifier_env *env)
 	if (!state)
 		return -ENOMEM;
 	state->curframe = 0;
-	state->parent = NULL;
 	state->frame[0] = kzalloc(sizeof(struct bpf_func_state), GFP_KERNEL);
 	if (!state->frame[0]) {
 		kfree(state);

* Re: [PATCH v2 bpf-next 0/3] bpf/verifier: subprog/func_call simplifications
From: Edward Cree @ 2018-03-29 22:50 UTC
  To: Alexei Starovoitov, Daniel Borkmann; +Cc: netdev

On 29/03/18 23:44, Edward Cree wrote:
> By storing subprog boundaries as a subprogno mark on each insn, rather than
>  a start (and implicit end) for each subprog, we collect a number of gains:
> * More efficient determination of which subprog contains a given insn, and
>   thus of find_subprog (which subprog begins at a given insn).
> * Number of verifier "full recursive walk" passes is reduced, since most of
>   the work is done in the main insn walk (do_check()).  Leftover work in
>   other passes is mostly linear scans (O(insn_cnt)) or, in the case of
>   check_max_stack_depth(), a topological sort (O(subprog_cnt)).
>
> Some other changes were also included to support this:
> * Per-subprog info is stored in env->subprog_info, an array of structs,
>   rather than several arrays with a common index.
> * Call graph is now stored in the new bpf_subprog_info struct; used here
>   for check_max_stack_depth() but may have other uses too.
>
> Along with this, patch #3 puts parent pointers (used by liveness analysis)
>  in the registers instead of the func_state or verifier_state, so that we
>  don't need skip_callee() machinery.  This also does the right thing for
>  stack slots, so they don't need their own special handling for liveness
>  marking either.
Whoops, forgot to add:
Changes from v1:
* No longer allows non-contiguous subprogs.
* No longer allows LD_ABS|IND and pseudo-calls in the same prog.

> Edward Cree (3):
>   bpf/verifier: validate func_calls by marking at do_check() time
>   bpf/verifier: update selftests
>   bpf/verifier: per-register parent pointers
>
>  include/linux/bpf_verifier.h                |  32 +-
>  kernel/bpf/verifier.c                       | 631 +++++++++++++---------------
>  tools/testing/selftests/bpf/test_verifier.c |  51 ++-
>  3 files changed, 344 insertions(+), 370 deletions(-)
>

* Re: [PATCH v2 bpf-next 0/3] bpf/verifier: subprog/func_call simplifications
From: Alexei Starovoitov @ 2018-04-03  1:08 UTC
  To: Edward Cree; +Cc: Daniel Borkmann, netdev

On Thu, Mar 29, 2018 at 11:44:17PM +0100, Edward Cree wrote:
> By storing subprog boundaries as a subprogno mark on each insn, rather than
>  a start (and implicit end) for each subprog, we collect a number of gains:
> * More efficient determination of which subprog contains a given insn, and
>   thus of find_subprog (which subprog begins at a given insn).
> * Number of verifier "full recursive walk" passes is reduced, since most of
>   the work is done in the main insn walk (do_check()).  Leftover work in
>   other passes is mostly linear scans (O(insn_cnt)) or, in the case of
>   check_max_stack_depth(), a topological sort (O(subprog_cnt)).
> 
> Some other changes were also included to support this:
> * Per-subprog info is stored in env->subprog_info, an array of structs,
>   rather than several arrays with a common index.
> * Call graph is now stored in the new bpf_subprog_info struct; used here
>   for check_max_stack_depth() but may have other uses too.
> 
> Along with this, patch #3 puts parent pointers (used by liveness analysis)
>  in the registers instead of the func_state or verifier_state, so that we
>  don't need skip_callee() machinery.  This also does the right thing for
>  stack slots, so they don't need their own special handling for liveness
>  marking either.

I like patch 3 and am going to play with it.
How did you test it? Do you have processed_insn numbers for
cilium or selftests programs before and after?
There should be no difference, right?
Essentially it's trading the extra run-time cost of the skip_callee logic
for higher memory overhead due to parent pointers in every register
state. I guess that's ok, since it does clean things up nicely.

As far as patch 1 goes, it was very difficult to review, since several logically
different things are clamped together. So, breaking it apart:
- converting the two arrays of subprog_starts and subprog_stack_depth into
  a single array of struct bpf_subprog_info is a good thing
- tsort is interesting, but I'm not sure it's correct. more below
- but the main change of combining subprog logic with do_check is no good.
The verifier has to move towards compiler style code analysis instead of
the other way around. It has to analyze the program in simple and
hopefully easy-to-understand passes instead of combining everything
into one loop. We _have_ to get rid of the do_check() approach and
the corresponding insn_processed counter. That was and still is
the biggest and most pressing issue we need to solve in bpf verification.
The verifier has to switch to compiler style code analysis to scale.
The algorithms should be such that they scale with thousands of instructions
and thousands of functions. The only way I see of reaching that goal
is to do hierarchical program analysis in passes:
- identify subprograms
- identify basic blocks, build control flow graph
- build dom graph, ssa and so on
- and do per-basic block liveness and data flow analysis that propagates
  the combined result further into other BBs along cfg edges.
There will be no 'do_check() across whole program' walk.
Combining the subprog pass with do_check is going in the opposite direction
of this long term work. Divide and conquer. Combining more things into
do_check is the opposite of this programming principle.
My short term plan is to split basic instruction correctness checks
out of the do_check() loop into a separate pass and run it early on.
It's necessary for bpf libraries to work, since the verifier cannot do
full data flow analysis at this point, but it should build the cfg and move
as much as possible out of the instruction-by-instruction walk, to scale
with the number of bpf programs/libraries and their sizes.
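
That early pass would be a plain linear scan over the insns, roughly
(sketch only; the helper name here is made up):

    for (i = 0; i < insn_cnt; i++)
        if (!insn_is_well_formed(&insns[i]))    /* opcode/reg sanity */
            return -EINVAL;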

As far as the tsort approach for determining max stack depth goes...
It's an interesting idea, but the implementation suffers from the same
'combine everything into one loop' coding issue.
I think updating the total_stack_depth math should be separate from the sorting,
since the same function can be part of different stacks with different maxima.
I don't see how updating the global subprog_info[i].total_stack_depth can
be correct. It has to be different for every stack, and the longest
stack is not necessarily the deepest. Maybe I'm missing something,
but the combined sort and stack_depth math didn't make it easy to review.
I find the existing check_max_stack_depth() algo much easier to understand.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 bpf-next 0/3] bpf/verifier: subprog/func_call simplifications
  2018-04-03  1:08 ` Alexei Starovoitov
@ 2018-04-03 13:39   ` Edward Cree
  2018-04-03 23:37     ` Alexei Starovoitov
  2018-04-05 15:50   ` Jiong Wang
  1 sibling, 1 reply; 14+ messages in thread
From: Edward Cree @ 2018-04-03 13:39 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: Daniel Borkmann, netdev

On 03/04/18 02:08, Alexei Starovoitov wrote:
> I like patch 3 and am going to play with it.
> How did you test it?
Just test_verifier and test_progs (the latter has a failure
 "test_bpf_obj_id:FAIL:get-prog-info(fd) err 0 errno 2 i 0 type 1(1) info_len 80(40) jit_enabled 0 jited_prog_len 0 xlated_prog_len 72"
 but that was already there before my patch).

> Do you have processed_insn numbers for
> cilium or selftests programs before and after?
> There should be no difference, right?
That's a good check, I'll do that.

> As far as patch 1 goes, it was very difficult to review, since several logically
> different things are clamped together. So, breaking it apart:
> - converting the two arrays of subprog_starts and subprog_stack_depth into
>   a single array of struct bpf_subprog_info is a good thing
> - tsort is interesting, but I'm not sure it's correct. more below
> - but the main change of combining subprog logic with do_check is no good.
<snip>
> There will be no 'do_check() across whole program' walk.
> Combining the subprog pass with do_check is going in the opposite direction
> of this long term work. Divide and conquer. Combining more things into
> do_check is the opposite of this programming principle.
The main object of my change here was to change the data structures, to
 store a subprogno in each insn aux rather than using bsearch() on the
 subprog_starts.  I have now figured out the algorithm to do this in its
 own pass (previously I thought this needed a recursive walk, which is why
 I wanted to roll it into do_check() - doing more than one whole-program
 recursive walk seems like a bad idea).

> My short term plan is to split basic instruction correctness checks
> out of the do_check() loop into a separate pass and run it early on.
I agree with that short term plan, sounds like a good idea.
I'm still not sure I understand the long-term plan, though; since most
 insns' input registers will still need to be checked (I'm assuming
 the majority of most real ebpf programs consists of computing and
 dereferencing pointers), the data flow analysis will have to end up
 doing all the same register updates current do_check() does (though
 potentially in a different order), e.g. if a function is called three
 times it will have to analyse it with three sets of input registers.
Unless you have some way of specifying function preconditions, I don't
 see how it works.  In particular something like
    char *f(char *p)
    {
        *p++ = 0;
        return p;
    }
    int main(void)
    {
        char *p = "abc"; /* represents getting a ptr from ctx or map */
        p = f(p);
        p = f(p);
        p = f(p);
        return 0;
    }
 seems as though it would be difficult to analyse in any way more
 scalable than the current full recursive walk.  Unless you somehow
 propagate register state _backwards_ as constraints when analysing a
 function...?  In any case it seems like there are likely to be things
 which the current verifier accepts but which require 'whole-program' analysis
 to determine the dataflow (e.g. what if there were some conditional
 branches in f(), and the preconditions on p depended on the value of
 some other arg in r2?)

> As far as the tsort approach for determining max stack depth goes...
> It's an interesting idea, but the implementation suffers from the same
> 'combine everything into one loop' coding issue.
> I think updating the total_stack_depth math should be separate from the sorting,
> since the same function can be part of different stacks with different maxima.
> I don't see how updating the global subprog_info[i].total_stack_depth can
> be correct. It has to be different for every stack, and the longest
> stack is not necessarily the deepest. Maybe I'm missing something,
> but the combined sort and stack_depth math didn't make it easy to review.
> I find the existing check_max_stack_depth() algo much easier to understand.
The sort _is_ the way to compute total stack depth.  The sort order isn't
 being stored anywhere; it's being done just so that each subprog gets
 looked at after all its callers have been considered.  So when it gets
 selected as a 'frontier node', its maximum stack depth is known, and can
 thus be used to update its callees (note that we do a max_t() with each
 callee's existing total_stack_depth, thus getting the deepest stack of
 all call chains to the function).
It may help to imagine drawing the call graph and labelling each node with
 a stack depth as it is visited; sadly that's difficult to show in an email
 (or a code comment).  But I can try to explain it a bit better than
 "/* Update callee total stack depth */".

I will also try to split up patch #1 into more pieces.  I mistakenly thought
 that the existing check_max_stack_depth() depended on some invariants that I was
 removing, but I guess that was only true while I had non-contiguous subprogs.

-Ed

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 bpf-next 0/3] bpf/verifier: subprog/func_call simplifications
  2018-04-03 13:39   ` Edward Cree
@ 2018-04-03 23:37     ` Alexei Starovoitov
  2018-04-04 23:58       ` Edward Cree
  0 siblings, 1 reply; 14+ messages in thread
From: Alexei Starovoitov @ 2018-04-03 23:37 UTC (permalink / raw)
  To: Edward Cree; +Cc: Daniel Borkmann, netdev

On Tue, Apr 03, 2018 at 02:39:11PM +0100, Edward Cree wrote:
> On 03/04/18 02:08, Alexei Starovoitov wrote:
> > I like patch 3 and am going to play with it.
> > How did you test it?
> Just test_verifier and test_progs (the latter has a failure
>  "test_bpf_obj_id:FAIL:get-prog-info(fd) err 0 errno 2 i 0 type 1(1) info_len 80(40) jit_enabled 0 jited_prog_len 0 xlated_prog_len 72"
>  but that was already there before my patch).

Hmm, that doesn't fail for me, and none of the bots complained.
Are you sure you're running the latest kernel and tests?

You may see the issue with the buildid test:
test_stacktrace_build_id:FAIL:get build_id with readelf err -1 errno 2
that's due to the buildid actually being missing from the binary.

> > Do you have processed_insn numbers for
> > cilium or selftests programs before and after?
> > There should be no difference, right?
> That's a good check, I'll do that.
> 
> > As far as patch 1 goes, it was very difficult to review, since several logically
> > different things are clamped together. So, breaking it apart:
> > - converting the two arrays of subprog_starts and subprog_stack_depth into
> >   a single array of struct bpf_subprog_info is a good thing
> > - tsort is interesting, but I'm not sure it's correct. more below
> > - but the main change of combining subprog logic with do_check is no good.
> <snip>
> > There will be no 'do_check() across whole program' walk.
> > Combining the subprog pass with do_check is going in the opposite direction
> > of this long term work. Divide and conquer. Combining more things into
> > do_check is the opposite of this programming principle.
> The main object of my change here was to change the data structures, to
>  store a subprogno in each insn aux rather than using bsearch() on the
>  subprog_starts.  I have now figured out the algorithm to do this in its
>  own pass (previously I thought this needed a recursive walk, which is why
>  I wanted to roll it into do_check() - doing more than one whole-program
>  recursive walk seems like a bad idea).

hmm. what's wrong with bsearch? It's trivial and fast.

> > My short term plan is to split basic instruction correctness checks
> > out of the do_check() loop into a separate pass and run it early on.
> I agree with that short term plan, sounds like a good idea.
> I'm still not sure I understand the long-term plan, though; since most
>  insns' input registers will still need to be checked (I'm assuming
>  the majority of most real ebpf programs consists of computing and
>  dereferencing pointers), the data flow analysis will have to end up
>  doing all the same register updates current do_check() does (though
>  potentially in a different order), e.g. if a function is called three
>  times it will have to analyse it with three sets of input registers.
> Unless you have some way of specifying function preconditions, I don't
>  see how it works.  In particular something like
>     char *f(char *p)
>     {
>         *p++ = 0;
>         return p;
>     }
>     int main(void)
>     {
>         char *p = "abc"; /* represents getting a ptr from ctx or map */
>         p = f(p);
>         p = f(p);
>         p = f(p);
>         return 0;
>     }
>  seems as though it would be difficult to analyse in any way more
>  scalable than the current full recursive walk.  Unless you somehow
>  propagate register state _backwards_ as constraints when analysing a
>  function...?  In any case it seems like there are likely to be things
>  which the current verifier accepts but which require 'whole-program' analysis
>  to determine the dataflow (e.g. what if there were some conditional
>  branches in f(), and the preconditions on p depended on the value of
>  some other arg in r2?)

The main point is that we have to make the verifier scale. That's my #1 priority.
Even if we don't see the solution today we have to work towards it.
I believe adopting compiler style analysis is the right way forward.
The independent passes to determine and _keep_ info about subprograms,
basic blocks, cfg, dom graph, liveness, dataflow are necessary steps
not only for the main purpose of the verifier (which is the safety check),
but also for the additional analysis that JITs, like NFP, have to do.
"walk all instructions and apply a single transformation" is a _good thing_.
llvm has tens, if not hundreds, of loops like:
for_each_bb_in_func(bb, func)
  for_each_insn_in_bb(insn, bb)
    do stuff with insn
Compiler designers could have combined multiple such passes into
fewer ones, but it's not done, because it increases complexity and
causes tough bugs where a pass is partially complete. cfg and/or dataflow
is sometimes recomputed independently from the transformation,
instead of being done during the pass, for the same reasons.

As far as scaling the verifier goes, the number to shoot for is 1M bpf instructions.
For certain program types, like XDP, we will probably always keep the 4k insn limit,
but in some cases, where the program runs in user context, we can run it longer.
There, 64k insns would be fine. The verifier is already capable of injecting code
into the program. It could easily inject cond_resched after every 4k insns or so.
We'd need to make progs preemptible and be smarter with rcu-based map lookups.
It's all solvable and easy to do as long as the core of the verifier scales.
The prime example where more than 4k instructions and loops are mandatory
is user space stack analysis inside the program: walking the python stack
requires non-trivial pointer chasing. With 'pragma unroll' the stack depth
limit today is ~18. That's not really usable. Looping through 100 python
frames would require about 16k bpf assembler instructions. That's 16k JITed
cpu instructions. It's plenty fast and safe in user context, but if we increase
the bpf max_insn limit to 16k, I'm afraid the verifier will start falling apart
due to the insn_processed counter.
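
To make it concrete, the kind of chasing I mean looks roughly like this
(illustrative only: the frame layout is made up, real CPython is messier):

    struct py_frame { void *f_back; /* + fields we'd actually read */ };
    struct py_frame frame;
    void *fp = start_frame;     /* assume already read from the task */
    int i, depth = 0;

    #pragma unroll
    for (i = 0; i < 100; i++) { /* with real per-frame work: ~16k insns */
        if (bpf_probe_read(&frame, sizeof(frame), fp))
            break;
        depth++;
        fp = frame.f_back;      /* chase to the caller's frame */
        if (!fp)
            break;
    }
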
Hence the do_check approach must go. The rough idea is to compute per basic
block a set of INs (registers and stack) that the basic block needs
to see to be safe, and a corresponding set of OUTs.
Then propagate this knowledge across cfg edges.
Once we have such a set per bpf function, it will essentially answer the question
'what arguments does this function need to see to be safe, and what does it return?'
To make bpf libraries scale we'd need to keep such information
around after verification, so that dynamic linking and indirect calls
are fast to verify.
It's all very high level, obviously. There are many gotchas to resolve.
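
Very roughly, the shape is the classic bitset fixpoint (all names are
invented here, and the real INs/OUTs would carry full types rather than
one bit per register):

    /* Sketch only: forward must-analysis iterated to a fixpoint. */
    struct bb_info {
        u64 out_valid;          /* regs proven valid on exit of this bb */
        u64 gen;                /* regs this bb itself validates */
        int pred[MAX_PREDS];    /* cfg predecessors */
        int n_preds;
    };

    static void dataflow_fixpoint(struct bb_info *bb, int nr_bbs)
    {
        bool changed = true;
        int i, p;

        while (changed) {               /* iterate until nothing moves */
            changed = false;
            for (i = 0; i < nr_bbs; i++) {
                u64 in = ~0ULL;         /* identity for intersection */

                /* valid on entry = intersection over all preds */
                for (p = 0; p < bb[i].n_preds; p++)
                    in &= bb[bb[i].pred[p]].out_valid;
                if (!bb[i].n_preds)
                    in = 0;             /* entry bb: nothing proven */
                if ((in | bb[i].gen) != bb[i].out_valid) {
                    bb[i].out_valid = in | bb[i].gen;
                    changed = true;
                }
            }
        }
    }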

> > As far as the tsort approach for determining max stack depth goes...
> > It's an interesting idea, but the implementation suffers from the same
> > 'combine everything into one loop' coding issue.
> > I think updating the total_stack_depth math should be separate from the sorting,
> > since the same function can be part of different stacks with different maxima.
> > I don't see how updating the global subprog_info[i].total_stack_depth can
> > be correct. It has to be different for every stack, and the longest
> > stack is not necessarily the deepest. Maybe I'm missing something,
> > but the combined sort and stack_depth math didn't make it easy to review.
> > I find the existing check_max_stack_depth() algo much easier to understand.
> The sort _is_ the way to compute total stack depth.  The sort order isn't
>  being stored anywhere; it's being done just so that each subprog gets
>  looked at after all its callers have been considered.  So when it gets
>  selected as a 'frontier node', its maximum stack depth is known, and can
>  thus be used to update its callees (note that we do a max_t() with each
>  callee's existing total_stack_depth, thus getting the deepest stack of
>  all call chains to the function).
> It may help to imagine drawing the call graph and labelling each node with
>  a stack depth as it is visited; sadly that's difficult to show in an email
>  (or a code comment).  But I can try to explain it a bit better than
>  "/* Update callee total stack depth */".

Please do, since that's my concern with tsort.
The verifier is the key piece of bpf infra and to be effective maintainers
we need to thoroughly understand the verifier code.
We cannot just take the patch based on the cover letter. The author may
disappear tomorrow, and then what are we going to do with the code?
Hence today, with the information I have, I prefer to keep the current
check_max_stack_depth() as-is.
In other words, it looks to me that the tsort complexity is not justified.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 bpf-next 0/3] bpf/verifier: subprog/func_call simplifications
  2018-04-03 23:37     ` Alexei Starovoitov
@ 2018-04-04 23:58       ` Edward Cree
  2018-04-05  5:28         ` Alexei Starovoitov
  0 siblings, 1 reply; 14+ messages in thread
From: Edward Cree @ 2018-04-04 23:58 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: Daniel Borkmann, netdev

On 04/04/18 00:37, Alexei Starovoitov wrote:
> Hmm, that doesn't fail for me, and none of the bots complained.
> Are you sure you're running the latest kernel and tests?
Ah, test_progs isn't actually rebuilding because __NR_bpf is undeclared;
 something must be going wrong with header files.
Never mind.

> hmm. what's wrong with bsearch? It's trivial and fast. 
bsearch is O(log n), and the sort() call on the subprog_starts (which happens
 every time add_subprog() is called) is O(n log n).
Whereas reading aux->subprogno is O(1).
As far as I'm concerned, that's a sign that the latter data structure is the
 appropriate one.
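
i.e., roughly (sketch; 'subprogno' being the new per-insn aux field):

    static int find_subprog(struct bpf_verifier_env *env, int insn_idx)
    {
        return env->insn_aux_data[insn_idx].subprogno;  /* O(1) read */
    }

That's instead of a sort() on every add_subprog() plus a bsearch() on
 every lookup.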

> Even if we don't see the solution today we have to work towards it.
I guess I'm just not confident "towards" is the direction you think it is.

> Compiler designers could have combined multiple such passes into
> fewer ones, but it's not done, because it increases complexity and
> causes tough bugs where a pass is partially complete.
I'm not trying to combine multiple 'for bb in prog/for insn in bb'-
 type passes.  The combining I was doing was more about 'for all possible
 execution paths'-type passes, because it's those that explode combinatorially.
Happily I think we can go a long way towards getting rid of them; but while I
 think we can get down to only having 1, I don't think we can reach 0.

> The prime example where more than 4k instructions and loops are mandatory
> is user space stack analysis inside the program: walking the python stack
> requires non-trivial pointer chasing. With 'pragma unroll' the stack depth
> limit today is ~18. That's not really usable. Looping through 100 python
> frames would require about 16k bpf assembler instructions.
But this would be solved by having support for bounded loops, and I think I've
 successfully shown that this is not inherently incompatible with a do_check()
 style walk.

> Hence the do_check approach must go. The rough idea is to compute per basic
> block a set of INs (registers and stack) that the basic block needs
> to see to be safe, and a corresponding set of OUTs.
> Then propagate this knowledge across cfg edges.
> Once we have such a set per bpf function, it will essentially answer the question
> 'what arguments does this function need to see to be safe, and what does it return?'
> To make bpf libraries scale we'd need to keep such information
> around after verification, so that dynamic linking and indirect calls
> are fast to verify.
> It's all very high level, obviously. There are many gotchas to resolve.
I agree that if we can do this it'll be ideal.  But that's a big 'if'; my
 example code was intended to demonstrate that the "set of INs bb/func needs to
 see to be safe" can be an arbitrarily complicated disjunction, and that instead
 of a combinatorially exploding number of paths to walk (the do_check() approach)
 you now have combinatorially exploding IN-constraints to propagate backwards.

> Please do, since that's my concern with tsort.
> The verifier is the key piece of bpf infra and to be effective maintainers
> we need to thoroughly understand the verifier code.
> We cannot just take the patch based on the cover letter. The author may
> disappear tomorrow, and then what are we going to do with the code?
I entirely accept this argument.  Unfortunately, when writing explanations,
 it's difficult to know when one has reached something that will be
 understood, so inevitably there will have to be a few iterations to get
 there ;-)

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 bpf-next 0/3] bpf/verifier: subprog/func_call simplifications
  2018-04-04 23:58       ` Edward Cree
@ 2018-04-05  5:28         ` Alexei Starovoitov
  2018-04-05  8:49           ` Edward Cree
  0 siblings, 1 reply; 14+ messages in thread
From: Alexei Starovoitov @ 2018-04-05  5:28 UTC (permalink / raw)
  To: Edward Cree; +Cc: Daniel Borkmann, netdev

On Thu, Apr 05, 2018 at 12:58:46AM +0100, Edward Cree wrote:
> On 04/04/18 00:37, Alexei Starovoitov wrote:
> > hmm. that doesn't fail for me and any other bots didn't complain.
> > Are you sure you're running the latest kernel and tests?
> Ah, test_progs isn't actually rebuilding because __NR_bpf is undeclared;
>  something must be going wrong with header files.
> Never mind.
> 
> > hmm. what's wrong with bsearch? It's trivial and fast. 
> bsearch is O(log n), and the sort() call on the subprog_starts (which happens
>  every time add_subprog() is called) is O(n log n).
> Whereas reading aux->subprogno is O(1).
> As far as I'm concerned, that's a sign that the latter data structure is the
>  appropriate one.

Only if it can be done as a separate _first_ pass before the cfg.

> > Even if we don't see the solution today we have to work towards it.
> I guess I'm just not confident "towards" is the direction you think it is.
> 
> > Compiler designers could have combined multiple such passes into
> > fewer ones, but it's not done, because it increases complexity and
> > causes tough bugs where a pass is partially complete.
> I'm not trying to combine multiple 'for bb in prog/for insn in bb'-
>  type passes.  The combining I was doing was more about 'for all possible
>  execution paths'-type passes, because it's those that explode combinatorially.
> Happily I think we can go a long way towards getting rid of them; but while I
>  think we can get down to only having 1, I don't think we can reach 0.

We have to. That's the point. 'explore all possible paths' must be removed.
New code should not rely on it in a way that makes it difficult to remove
later. Subprog boundaries are a typical example where doing it as
part of 'explore all paths' is harmful long term, regardless of the extra
O(num_of_subprogs) complexity it brings short term.

> > The prime example where more than 4k instructions and loops are mandatory
> > is user space stack analysis inside the program: walking the python stack
> > requires non-trivial pointer chasing. With 'pragma unroll' the stack depth
> > limit today is ~18. That's not really usable. Looping through 100 python
> > frames would require about 16k bpf assembler instructions.
> But this would be solved by having support for bounded loops, and I think I've
>  successfully shown that this is not inherently incompatible with a do_check()
>  style walk.

No. It's worse. Your proposed approach to bounded loops relies completely
on 'explore all paths', whereas John's does not.
Can 'explore all paths' work with 1M bpf insns? No.
Whereas an approach that builds the dom graph, detects natural loops
and loop invariants - can.
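
('natural loop' in the standard compiler sense: a back edge tail -> head
whose head dominates the tail. As a sketch, with invented names:

    /* dom[b] = bitmask of the blocks that dominate block b */
    static bool is_loop_back_edge(int tail, int head, const u64 *dom)
    {
        return dom[tail] & (1ULL << head);
    }

every such back edge identifies one natural loop to analyze.)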

> > Hence the do_check approach must go. The rough idea is to compute per basic
> > block a set of INs (registers and stack) that the basic block needs
> > to see to be safe, and a corresponding set of OUTs.
> > Then propagate this knowledge across cfg edges.
> > Once we have such a set per bpf function, it will essentially answer the question
> > 'what arguments does this function need to see to be safe, and what does it return?'
> > To make bpf libraries scale we'd need to keep such information
> > around after verification, so that dynamic linking and indirect calls
> > are fast to verify.
> > It's all very high level, obviously. There are many gotchas to resolve.
> I agree that if we can do this it'll be ideal.  But that's a big 'if'; my
>  example code was intended to demonstrate that the "set of INs bb/func needs to
>  see to be safe" can be an arbitrarily complicated disjunction, and that instead
>  of a combinatorially exploding number of paths to walk (the do_check() approach)
>  you now have combinatorially exploding IN-constraints to propagate backwards.

This sounds like arbitrary assumptions about what this new approach would do.
Instead of rejecting it outright, try to solve this very hard problem.

Before this discussion gets carried away too far, let's get back to this patch set.
Having seen all the arguments, I still think that only patch 3 is worth pursuing further.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 bpf-next 0/3] bpf/verifier: subprog/func_call simplifications
  2018-04-05  5:28         ` Alexei Starovoitov
@ 2018-04-05  8:49           ` Edward Cree
  0 siblings, 0 replies; 14+ messages in thread
From: Edward Cree @ 2018-04-05  8:49 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: Daniel Borkmann, netdev

On 05/04/18 06:28, Alexei Starovoitov wrote:
> On Thu, Apr 05, 2018 at 12:58:46AM +0100, Edward Cree wrote:
>> On 04/04/18 00:37, Alexei Starovoitov wrote:
>>> Hmm, that doesn't fail for me, and none of the bots complained.
>>> Are you sure you're running the latest kernel and tests?
>> Ah, test_progs isn't actually rebuilding because __NR_bpf is undeclared;
>>  something must be going wrong with header files.
>> Never mind.
>>
>>> hmm. what's wrong with bsearch? It's trivial and fast. 
>> bsearch is O(log n), and the sort() call on the subprog_starts (which happens
>>  every time add_subprog() is called) is O(n log n).
>> Whereas reading aux->subprogno is O(1).
>> As far as I'm concerned, that's a sign that the latter data structure is the
>>  appropriate one.
> Only if it can be done as a separate _first_ pass before the cfg.
As I think I already said, I'm working on a next version of the patch that
 does just that.

> No. It's worse. Your proposed approach to bounded loops relies completely
> on 'explore all paths', whereas John's does not.
> Can 'explore all paths' work with 1M bpf insns? No.
> Whereas an approach that builds the dom graph, detects natural loops
> and loop invariants - can.
... IF it's possible at all, which I'm still doubtful about.

> This sounds like arbitrary assumptions about what this new approach would do.
> Instead of rejecting it outright, try to solve this very hard problem.
I'm not rejecting it outright; I'm just saying, be prepared for the possibility
 that it can't be solved, and try to leave a line of retreat / have a plan B in
 the case that it proves to be subject to a no-free-lunch theorem.  Because my
 intuition is that an entropy argument may require all algos to have the same
 superexponential worst-case running time as 'explore all paths' does.

> Before this discussion gets carried away too far, let's get back to this patch set.
> Having seen all the arguments, I still think that only patch 3 is worth pursuing further.
I think the next version of the patch series (coming real soon now) answers at
 least most of your objections (and indeed the above debate isn't relevant to it).

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 bpf-next 0/3] bpf/verifier: subprog/func_call simplifications
  2018-04-03  1:08 ` Alexei Starovoitov
  2018-04-03 13:39   ` Edward Cree
@ 2018-04-05 15:50   ` Jiong Wang
  2018-04-05 16:29     ` Edward Cree
  2018-04-06  1:20     ` Alexei Starovoitov
  1 sibling, 2 replies; 14+ messages in thread
From: Jiong Wang @ 2018-04-05 15:50 UTC (permalink / raw)
  To: Alexei Starovoitov, Edward Cree; +Cc: Daniel Borkmann, netdev

On 03/04/2018 02:08, Alexei Starovoitov wrote:
> Combining the subprog pass with do_check is going in the opposite direction
> of this long term work. Divide and conquer. Combining more things into
> do_check is the opposite of this programming principle.

Agree. And for the redundant insn traversal issue in check_subprogs that
Edward is trying to fix, I am thinking we could do it by utilizing the
existing DFS traversal in check_cfg.

The current DFS could probably be improved into a generic instruction
information collection pass.

This won't make the existing DFS more complex, as it only does information
collection as a side job during the traversal. The collected information
could then be used to build any other information to be consumed later,
for example subprogs, basic blocks, etc.

For the redundant insn traversal issue during subprog detection:
like how we mark STATE_LIST_MARK in the DFS, we could just call add_subprog
for BPF_PSEUDO_CALL insns during the DFS.

i.e. we change the code logic of check_cfg into:

check_cfg
{
   * DFS traversal:
     - detect back-edge.
     - collect STATE_LIST_MARK.
     - collect subprog destination.

   * check all insns are reachable.
   * check_subprogs (insn traversal removed).
}
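
For the subprog side job, the DFS insn visit would just do something like
this (sketch, not my actual prototype; it mirrors the test that
check_subprogs already does today):

    if (insns[t].code == (BPF_JMP | BPF_CALL) &&
        insns[t].src_reg == BPF_PSEUDO_CALL) {
        ret = add_subprog(env, t + insns[t].imm + 1);
        if (ret < 0)
            return ret;
    }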

I prototyped a patch for the above change; the change looks easier to
review to me. However, there are 8 regressions on selftests after this
change, due to check_subprogs moving after some checks in the DFS: some
errors are now detected before the expected errors. That needs cleaning up.

Does this direction sound good?

And if we want to build basic blocks later, we could just call a new add_bb
(similar to add_subprog) for jump targets etc. (some of those places are
actually STATE_LIST_MARK_JMPTARGET if we separate STATE_LIST_MARK into
STATE_LIST_MARK_JMPTARGET and STATE_LIST_MARK_HEURISTIC).

Regards,
Jiong

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 bpf-next 0/3] bpf/verifier: subprog/func_call simplifications
  2018-04-05 15:50   ` Jiong Wang
@ 2018-04-05 16:29     ` Edward Cree
  2018-04-06  1:20     ` Alexei Starovoitov
  1 sibling, 0 replies; 14+ messages in thread
From: Edward Cree @ 2018-04-05 16:29 UTC (permalink / raw)
  To: Jiong Wang, Alexei Starovoitov; +Cc: Daniel Borkmann, netdev

On 05/04/18 16:50, Jiong Wang wrote:
> On 03/04/2018 02:08, Alexei Starovoitov wrote:
>> Combining the subprog pass with do_check is going in the opposite direction
>> of this long term work. Divide and conquer. Combining more things into
>> do_check is the opposite of this programming principle.
>
> Agree. And for the redundant insn traversal issue in check_subprogs that
> Edward is trying to fix, I am thinking we could do it by utilizing the
> existing DFS traversal in check_cfg.
>
> The current DFS could probably be improved into a generic instruction
> information collection pass.
<snip>
> And if we want to build basic blocks later, we could just call a new add_bb
> (similar to add_subprog) for jump targets etc. (some of those places are
> actually STATE_LIST_MARK_JMPTARGET if we separate STATE_LIST_MARK into
> STATE_LIST_MARK_JMPTARGET and STATE_LIST_MARK_HEURISTIC).
>
> Regards,
> Jiong
>
This is broadly similar to my medium-term plan, except that I would use
 a Kahn's-algorithm-style tsort rather than the depth-first tsort
 algorithm used by the current check_cfg.
* chop up the prog into functions and bbs, incidentally placing S_L_Ms
* create a control flow graph similar to my function call graph
* tsort it to detect loops, similar to how my check_max_stack_depth()
  implementation detects recursion.

-Ed

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 bpf-next 0/3] bpf/verifier: subprog/func_call simplifications
  2018-04-05 15:50   ` Jiong Wang
  2018-04-05 16:29     ` Edward Cree
@ 2018-04-06  1:20     ` Alexei Starovoitov
  1 sibling, 0 replies; 14+ messages in thread
From: Alexei Starovoitov @ 2018-04-06  1:20 UTC (permalink / raw)
  To: Jiong Wang; +Cc: Edward Cree, Daniel Borkmann, netdev

On Thu, Apr 05, 2018 at 04:50:03PM +0100, Jiong Wang wrote:
> On 03/04/2018 02:08, Alexei Starovoitov wrote:
> > Combining the subprog pass with do_check is going in the opposite direction
> > of this long term work. Divide and conquer. Combining more things into
> > do_check is the opposite of this programming principle.
> 
> Agree. And for the redundant insn traversal issue in check_subprogs that
> Edward is trying to fix, I am thinking we could do it by utilizing the
> existing DFS traversal in check_cfg.
>
> The current DFS could probably be improved into a generic instruction
> information collection pass.
>
> This won't make the existing DFS more complex, as it only does information
> collection as a side job during the traversal. The collected information
> could then be used to build any other information to be consumed later,
> for example subprogs, basic blocks, etc.
> 
> For the redundant insn traversal issue during subprog detection:
> like how we mark STATE_LIST_MARK in the DFS, we could just call add_subprog
> for BPF_PSEUDO_CALL insns during the DFS.
> 
> i.e. we change the code logic of check_cfg into:
> 
> check_cfg
> {
>   * DFS traversal:
>     - detect back-edge.
>     - collect STATE_LIST_MARK.
>     - collect subprog destination.
> 
>   * check all insns are reachable.
>   * check_subprogs (insn traversal removed).
> }

I don't think that will work.
Functions are independent of the CFG.
With bpf libraries we will have disjoint functions sitting in the kernel
and check_cfg would need to be done separately from function boundaries.

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2018-04-06  1:20 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-03-29 22:44 [PATCH v2 bpf-next 0/3] bpf/verifier: subprog/func_call simplifications Edward Cree
2018-03-29 22:45 ` [PATCH v2 bpf-next 1/3] bpf/verifier: validate func_calls by marking at do_check() time Edward Cree
2018-03-29 22:46 ` [PATCH v2 bpf-next 2/3] bpf/verifier: update selftests Edward Cree
2018-03-29 22:46 ` [PATCH v2 bpf-next 3/3] bpf/verifier: per-register parent pointers Edward Cree
2018-03-29 22:50 ` [PATCH v2 bpf-next 0/3] bpf/verifier: subprog/func_call simplifications Edward Cree
2018-04-03  1:08 ` Alexei Starovoitov
2018-04-03 13:39   ` Edward Cree
2018-04-03 23:37     ` Alexei Starovoitov
2018-04-04 23:58       ` Edward Cree
2018-04-05  5:28         ` Alexei Starovoitov
2018-04-05  8:49           ` Edward Cree
2018-04-05 15:50   ` Jiong Wang
2018-04-05 16:29     ` Edward Cree
2018-04-06  1:20     ` Alexei Starovoitov
