* [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification @ 2018-08-22 19:00 Edward Cree 2018-08-22 19:02 ` [RFC PATCH v2 bpf-next 1/2] bpf/verifier: per-register parent pointers Edward Cree ` (3 more replies) 0 siblings, 4 replies; 12+ messages in thread From: Edward Cree @ 2018-08-22 19:00 UTC (permalink / raw) To: ast, daniel; +Cc: netdev The first patch is a simplification of register liveness tracking by using a separate parentage chain for each register and stack slot, thus avoiding the need for logic to handle callee-saved registers when applying read marks. In the future this idea may be extended to form use-def chains. The second patch adds information about misc/zero data on the stack to the state dumps emitted to the log at various points; this information was found essential in debugging the first patch, and may be useful elsewhere. Edward Cree (2): bpf/verifier: per-register parent pointers bpf/verifier: display non-spill stack slot types in print_verifier_state include/linux/bpf_verifier.h | 8 +- kernel/bpf/verifier.c | 216 ++++++++++++++----------------------------- 2 files changed, 72 insertions(+), 152 deletions(-) ^ permalink raw reply [flat|nested] 12+ messages in thread
* [RFC PATCH v2 bpf-next 1/2] bpf/verifier: per-register parent pointers 2018-08-22 19:00 [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification Edward Cree @ 2018-08-22 19:02 ` Edward Cree 2018-08-22 19:02 ` [RFC PATCH v2 bpf-next 2/2] bpf/verifier: display non-spill stack slot types in print_verifier_state Edward Cree ` (2 subsequent siblings) 3 siblings, 0 replies; 12+ messages in thread From: Edward Cree @ 2018-08-22 19:02 UTC (permalink / raw) To: ast, daniel; +Cc: netdev By giving each register its own liveness chain, we elide the skip_callee() logic. Instead, each register's parent is the state it inherits from; both check_func_call() and prepare_func_exit() automatically connect reg states to the correct chain since when they copy the reg state across (r1-r5 into the callee as args, and r0 out as the return value) they also copy the parent pointer. Signed-off-by: Edward Cree <ecree@solarflare.com> --- include/linux/bpf_verifier.h | 8 +- kernel/bpf/verifier.c | 184 +++++++++++-------------------------------- 2 files changed, 47 insertions(+), 145 deletions(-) diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h index 38b04f559ad3..b42b60a83e19 100644 --- a/include/linux/bpf_verifier.h +++ b/include/linux/bpf_verifier.h @@ -41,6 +41,7 @@ enum bpf_reg_liveness { }; struct bpf_reg_state { + /* Ordering of fields matters. See states_equal() */ enum bpf_reg_type type; union { /* valid when type == PTR_TO_PACKET */ @@ -59,7 +60,6 @@ struct bpf_reg_state { * came from, when one is tested for != NULL. */ u32 id; - /* Ordering of fields matters. See states_equal() */ /* For scalar types (SCALAR_VALUE), this represents our knowledge of * the actual value. 
* For pointer types, this represents the variable part of the offset @@ -76,15 +76,15 @@ struct bpf_reg_state { s64 smax_value; /* maximum possible (s64)value */ u64 umin_value; /* minimum possible (u64)value */ u64 umax_value; /* maximum possible (u64)value */ + /* parentage chain for liveness checking */ + struct bpf_reg_state *parent; /* Inside the callee two registers can be both PTR_TO_STACK like * R1=fp-8 and R2=fp-8, but one of them points to this function stack * while another to the caller's stack. To differentiate them 'frameno' * is used which is an index in bpf_verifier_state->frame[] array * pointing to bpf_func_state. - * This field must be second to last, for states_equal() reasons. */ u32 frameno; - /* This field must be last, for states_equal() reasons. */ enum bpf_reg_liveness live; }; @@ -107,7 +107,6 @@ struct bpf_stack_state { */ struct bpf_func_state { struct bpf_reg_state regs[MAX_BPF_REG]; - struct bpf_verifier_state *parent; /* index of call instruction that called into this func */ int callsite; /* stack frame number of this function state from pov of @@ -129,7 +128,6 @@ struct bpf_func_state { struct bpf_verifier_state { /* call stack tracking */ struct bpf_func_state *frame[MAX_CALL_FRAMES]; - struct bpf_verifier_state *parent; u32 curframe; }; diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index ca90679a7fe5..b11d45916fff 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -380,9 +380,9 @@ static int copy_stack_state(struct bpf_func_state *dst, /* do_check() starts with zero-sized stack in struct bpf_verifier_state to * make it consume minimal amount of memory. check_stack_write() access from * the program calls into realloc_func_state() to grow the stack size. - * Note there is a non-zero 'parent' pointer inside bpf_verifier_state - * which this function copies over. 
It points to previous bpf_verifier_state - * which is never reallocated + * Note there is a non-zero parent pointer inside each reg of bpf_verifier_state + * which this function copies over. It points to corresponding reg in previous + * bpf_verifier_state which is never reallocated */ static int realloc_func_state(struct bpf_func_state *state, int size, bool copy_old) @@ -466,7 +466,6 @@ static int copy_verifier_state(struct bpf_verifier_state *dst_state, dst_state->frame[i] = NULL; } dst_state->curframe = src->curframe; - dst_state->parent = src->parent; for (i = 0; i <= src->curframe; i++) { dst = dst_state->frame[i]; if (!dst) { @@ -732,6 +731,7 @@ static void init_reg_state(struct bpf_verifier_env *env, for (i = 0; i < MAX_BPF_REG; i++) { mark_reg_not_init(env, regs, i); regs[i].live = REG_LIVE_NONE; + regs[i].parent = NULL; } /* frame pointer */ @@ -876,74 +876,21 @@ static int check_subprogs(struct bpf_verifier_env *env) return 0; } -static -struct bpf_verifier_state *skip_callee(struct bpf_verifier_env *env, - const struct bpf_verifier_state *state, - struct bpf_verifier_state *parent, - u32 regno) -{ - struct bpf_verifier_state *tmp = NULL; - - /* 'parent' could be a state of caller and - * 'state' could be a state of callee. In such case - * parent->curframe < state->curframe - * and it's ok for r1 - r5 registers - * - * 'parent' could be a callee's state after it bpf_exit-ed. 
- * In such case parent->curframe > state->curframe - * and it's ok for r0 only - */ - if (parent->curframe == state->curframe || - (parent->curframe < state->curframe && - regno >= BPF_REG_1 && regno <= BPF_REG_5) || - (parent->curframe > state->curframe && - regno == BPF_REG_0)) - return parent; - - if (parent->curframe > state->curframe && - regno >= BPF_REG_6) { - /* for callee saved regs we have to skip the whole chain - * of states that belong to callee and mark as LIVE_READ - * the registers before the call - */ - tmp = parent; - while (tmp && tmp->curframe != state->curframe) { - tmp = tmp->parent; - } - if (!tmp) - goto bug; - parent = tmp; - } else { - goto bug; - } - return parent; -bug: - verbose(env, "verifier bug regno %d tmp %p\n", regno, tmp); - verbose(env, "regno %d parent frame %d current frame %d\n", - regno, parent->curframe, state->curframe); - return NULL; -} - +/* Parentage chain of this register (or stack slot) should take care of all + * issues like callee-saved registers, stack slot allocation time, etc. + */ static int mark_reg_read(struct bpf_verifier_env *env, - const struct bpf_verifier_state *state, - struct bpf_verifier_state *parent, - u32 regno) + const struct bpf_reg_state *state, + struct bpf_reg_state *parent) { bool writes = parent == state->parent; /* Observe write marks */ - if (regno == BPF_REG_FP) - /* We don't need to worry about FP liveness because it's read-only */ - return 0; - while (parent) { /* if read wasn't screened by an earlier write ... */ - if (writes && state->frame[state->curframe]->regs[regno].live & REG_LIVE_WRITTEN) + if (writes && state->live & REG_LIVE_WRITTEN) break; - parent = skip_callee(env, state, parent, regno); - if (!parent) - return -EFAULT; /* ... 
then we depend on parent's value */ - parent->frame[parent->curframe]->regs[regno].live |= REG_LIVE_READ; + parent->live |= REG_LIVE_READ; state = parent; parent = state->parent; writes = true; @@ -969,7 +916,10 @@ static int check_reg_arg(struct bpf_verifier_env *env, u32 regno, verbose(env, "R%d !read_ok\n", regno); return -EACCES; } - return mark_reg_read(env, vstate, vstate->parent, regno); + /* We don't need to worry about FP liveness because it's read-only */ + if (regno != BPF_REG_FP) + return mark_reg_read(env, ®s[regno], + regs[regno].parent); } else { /* check whether register used as dest operand can be written to */ if (regno == BPF_REG_FP) { @@ -1080,8 +1030,8 @@ static int check_stack_write(struct bpf_verifier_env *env, } else { u8 type = STACK_MISC; - /* regular write of data into stack */ - state->stack[spi].spilled_ptr = (struct bpf_reg_state) {}; + /* regular write of data into stack destroys any spilled ptr */ + state->stack[spi].spilled_ptr.type = NOT_INIT; /* only mark the slot as written if all 8 bytes were written * otherwise read propagation may incorrectly stop too soon @@ -1106,61 +1056,6 @@ static int check_stack_write(struct bpf_verifier_env *env, return 0; } -/* registers of every function are unique and mark_reg_read() propagates - * the liveness in the following cases: - * - from callee into caller for R1 - R5 that were used as arguments - * - from caller into callee for R0 that used as result of the call - * - from caller to the same caller skipping states of the callee for R6 - R9, - * since R6 - R9 are callee saved by implicit function prologue and - * caller's R6 != callee's R6, so when we propagate liveness up to - * parent states we need to skip callee states for R6 - R9. 
- * - * stack slot marking is different, since stacks of caller and callee are - * accessible in both (since caller can pass a pointer to caller's stack to - * callee which can pass it to another function), hence mark_stack_slot_read() - * has to propagate the stack liveness to all parent states at given frame number. - * Consider code: - * f1() { - * ptr = fp - 8; - * *ptr = ctx; - * call f2 { - * .. = *ptr; - * } - * .. = *ptr; - * } - * First *ptr is reading from f1's stack and mark_stack_slot_read() has - * to mark liveness at the f1's frame and not f2's frame. - * Second *ptr is also reading from f1's stack and mark_stack_slot_read() has - * to propagate liveness to f2 states at f1's frame level and further into - * f1 states at f1's frame level until write into that stack slot - */ -static void mark_stack_slot_read(struct bpf_verifier_env *env, - const struct bpf_verifier_state *state, - struct bpf_verifier_state *parent, - int slot, int frameno) -{ - bool writes = parent == state->parent; /* Observe write marks */ - - while (parent) { - if (parent->frame[frameno]->allocated_stack <= slot * BPF_REG_SIZE) - /* since LIVE_WRITTEN mark is only done for full 8-byte - * write the read marks are conservative and parent - * state may not even have the stack allocated. In such case - * end the propagation, since the loop reached beginning - * of the function - */ - break; - /* if read wasn't screened by an earlier write ... */ - if (writes && state->frame[frameno]->stack[slot].spilled_ptr.live & REG_LIVE_WRITTEN) - break; - /* ... 
then we depend on parent's value */ - parent->frame[frameno]->stack[slot].spilled_ptr.live |= REG_LIVE_READ; - state = parent; - parent = state->parent; - writes = true; - } -} - static int check_stack_read(struct bpf_verifier_env *env, struct bpf_func_state *reg_state /* func where register points to */, int off, int size, int value_regno) @@ -1198,8 +1093,8 @@ static int check_stack_read(struct bpf_verifier_env *env, */ state->regs[value_regno].live |= REG_LIVE_WRITTEN; } - mark_stack_slot_read(env, vstate, vstate->parent, spi, - reg_state->frameno); + mark_reg_read(env, ®_state->stack[spi].spilled_ptr, + reg_state->stack[spi].spilled_ptr.parent); return 0; } else { int zeros = 0; @@ -1215,8 +1110,8 @@ static int check_stack_read(struct bpf_verifier_env *env, off, i, size); return -EACCES; } - mark_stack_slot_read(env, vstate, vstate->parent, spi, - reg_state->frameno); + mark_reg_read(env, ®_state->stack[spi].spilled_ptr, + reg_state->stack[spi].spilled_ptr.parent); if (value_regno >= 0) { if (zeros == size) { /* any size read into register is zero extended, @@ -1908,8 +1803,8 @@ static int check_stack_boundary(struct bpf_verifier_env *env, int regno, /* reading any byte out of 8-byte 'spill_slot' will cause * the whole slot to be marked as 'read' */ - mark_stack_slot_read(env, env->cur_state, env->cur_state->parent, - spi, state->frameno); + mark_reg_read(env, &state->stack[spi].spilled_ptr, + state->stack[spi].spilled_ptr.parent); } return update_stack_depth(env, state, off); } @@ -2366,11 +2261,13 @@ static int check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn, state->curframe + 1 /* frameno within this callchain */, subprog /* subprog number within this prog */); - /* copy r1 - r5 args that callee can access */ + /* copy r1 - r5 args that callee can access. 
The copy includes parent + * pointers, which connects us up to the liveness chain + */ for (i = BPF_REG_1; i <= BPF_REG_5; i++) callee->regs[i] = caller->regs[i]; - /* after the call regsiters r0 - r5 were scratched */ + /* after the call registers r0 - r5 were scratched */ for (i = 0; i < CALLER_SAVED_REGS; i++) { mark_reg_not_init(env, caller->regs, caller_saved[i]); check_reg_arg(env, caller_saved[i], DST_OP_NO_MARK); @@ -4370,7 +4267,7 @@ static bool regsafe(struct bpf_reg_state *rold, struct bpf_reg_state *rcur, /* explored state didn't use this */ return true; - equal = memcmp(rold, rcur, offsetof(struct bpf_reg_state, frameno)) == 0; + equal = memcmp(rold, rcur, offsetof(struct bpf_reg_state, parent)) == 0; if (rold->type == PTR_TO_STACK) /* two stack pointers are equal only if they're pointing to @@ -4603,7 +4500,7 @@ static bool states_equal(struct bpf_verifier_env *env, * equivalent state (jump target or such) we didn't arrive by the straight-line * code, so read marks in the state must propagate to the parent regardless * of the state's write marks. That's what 'parent == state->parent' comparison - * in mark_reg_read() and mark_stack_slot_read() is for. + * in mark_reg_read() is for. 
*/ static int propagate_liveness(struct bpf_verifier_env *env, const struct bpf_verifier_state *vstate, @@ -4624,7 +4521,8 @@ static int propagate_liveness(struct bpf_verifier_env *env, if (vparent->frame[vparent->curframe]->regs[i].live & REG_LIVE_READ) continue; if (vstate->frame[vstate->curframe]->regs[i].live & REG_LIVE_READ) { - err = mark_reg_read(env, vstate, vparent, i); + err = mark_reg_read(env, &vstate->frame[vstate->curframe]->regs[i], + &vparent->frame[vstate->curframe]->regs[i]); if (err) return err; } @@ -4639,7 +4537,8 @@ static int propagate_liveness(struct bpf_verifier_env *env, if (parent->stack[i].spilled_ptr.live & REG_LIVE_READ) continue; if (state->stack[i].spilled_ptr.live & REG_LIVE_READ) - mark_stack_slot_read(env, vstate, vparent, i, frame); + mark_reg_read(env, &state->stack[i].spilled_ptr, + &parent->stack[i].spilled_ptr); } } return err; @@ -4649,7 +4548,7 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx) { struct bpf_verifier_state_list *new_sl; struct bpf_verifier_state_list *sl; - struct bpf_verifier_state *cur = env->cur_state; + struct bpf_verifier_state *cur = env->cur_state, *new; int i, j, err; sl = env->explored_states[insn_idx]; @@ -4691,16 +4590,18 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx) return -ENOMEM; /* add new state to the head of linked list */ - err = copy_verifier_state(&new_sl->state, cur); + new = &new_sl->state; + err = copy_verifier_state(new, cur); if (err) { - free_verifier_state(&new_sl->state, false); + free_verifier_state(new, false); kfree(new_sl); return err; } new_sl->next = env->explored_states[insn_idx]; env->explored_states[insn_idx] = new_sl; /* connect new state to parentage chain */ - cur->parent = &new_sl->state; + for (i = 0; i < BPF_REG_FP; i++) + cur_regs(env)[i].parent = &new->frame[new->curframe]->regs[i]; /* clear write marks in current state: the writes we did are not writes * our child did, so they don't screen off its reads from 
us. * (There are no read marks in current state, because reads always mark @@ -4713,9 +4614,13 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx) /* all stack frames are accessible from callee, clear them all */ for (j = 0; j <= cur->curframe; j++) { struct bpf_func_state *frame = cur->frame[j]; + struct bpf_func_state *newframe = new->frame[j]; - for (i = 0; i < frame->allocated_stack / BPF_REG_SIZE; i++) + for (i = 0; i < frame->allocated_stack / BPF_REG_SIZE; i++) { frame->stack[i].spilled_ptr.live = REG_LIVE_NONE; + frame->stack[i].spilled_ptr.parent = + &newframe->stack[i].spilled_ptr; + } } return 0; } @@ -4734,7 +4639,6 @@ static int do_check(struct bpf_verifier_env *env) if (!state) return -ENOMEM; state->curframe = 0; - state->parent = NULL; state->frame[0] = kzalloc(sizeof(struct bpf_func_state), GFP_KERNEL); if (!state->frame[0]) { kfree(state); ^ permalink raw reply related [flat|nested] 12+ messages in thread
* [RFC PATCH v2 bpf-next 2/2] bpf/verifier: display non-spill stack slot types in print_verifier_state 2018-08-22 19:00 [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification Edward Cree 2018-08-22 19:02 ` [RFC PATCH v2 bpf-next 1/2] bpf/verifier: per-register parent pointers Edward Cree @ 2018-08-22 19:02 ` Edward Cree 2018-08-30 2:18 ` [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification Alexei Starovoitov 2018-09-26 22:16 ` Jiong Wang 3 siblings, 0 replies; 12+ messages in thread From: Edward Cree @ 2018-08-22 19:02 UTC (permalink / raw) To: ast, daniel; +Cc: netdev If a stack slot does not hold a spilled register (STACK_SPILL), then each of its eight bytes could potentially have a different slot_type. This information can be important for debugging, and previously we either did not print anything for the stack slot, or just printed fp-X=0 in the case where its first byte was STACK_ZERO. Instead, print eight characters with either 0 (STACK_ZERO), m (STACK_MISC) or ? (STACK_INVALID) for any stack slot which is neither STACK_SPILL nor entirely STACK_INVALID. 
Signed-off-by: Edward Cree <ecree@solarflare.com> --- kernel/bpf/verifier.c | 32 +++++++++++++++++++++++++------- 1 file changed, 25 insertions(+), 7 deletions(-) diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index b11d45916fff..2f4b52cf864c 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -263,6 +263,13 @@ static const char * const reg_type_str[] = { [PTR_TO_PACKET_END] = "pkt_end", }; +static char slot_type_char[] = { + [STACK_INVALID] = '?', + [STACK_SPILL] = 'r', + [STACK_MISC] = 'm', + [STACK_ZERO] = '0', +}; + static void print_liveness(struct bpf_verifier_env *env, enum bpf_reg_liveness live) { @@ -349,15 +356,26 @@ static void print_verifier_state(struct bpf_verifier_env *env, } } for (i = 0; i < state->allocated_stack / BPF_REG_SIZE; i++) { - if (state->stack[i].slot_type[0] == STACK_SPILL) { - verbose(env, " fp%d", - (-i - 1) * BPF_REG_SIZE); - print_liveness(env, state->stack[i].spilled_ptr.live); + char types_buf[BPF_REG_SIZE + 1]; + bool valid = false; + int j; + + for (j = 0; j < BPF_REG_SIZE; j++) { + if (state->stack[i].slot_type[j] != STACK_INVALID) + valid = true; + types_buf[j] = slot_type_char[ + state->stack[i].slot_type[j]]; + } + types_buf[BPF_REG_SIZE] = 0; + if (!valid) + continue; + verbose(env, " fp%d", (-i - 1) * BPF_REG_SIZE); + print_liveness(env, state->stack[i].spilled_ptr.live); + if (state->stack[i].slot_type[0] == STACK_SPILL) verbose(env, "=%s", reg_type_str[state->stack[i].spilled_ptr.type]); - } - if (state->stack[i].slot_type[0] == STACK_ZERO) - verbose(env, " fp%d=0", (-i - 1) * BPF_REG_SIZE); + else + verbose(env, "=%s", types_buf); } verbose(env, "\n"); } ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification 2018-08-22 19:00 [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification Edward Cree 2018-08-22 19:02 ` [RFC PATCH v2 bpf-next 1/2] bpf/verifier: per-register parent pointers Edward Cree 2018-08-22 19:02 ` [RFC PATCH v2 bpf-next 2/2] bpf/verifier: display non-spill stack slot types in print_verifier_state Edward Cree @ 2018-08-30 2:18 ` Alexei Starovoitov 2018-08-31 15:50 ` Edward Cree 2018-09-26 22:16 ` Jiong Wang 3 siblings, 1 reply; 12+ messages in thread From: Alexei Starovoitov @ 2018-08-30 2:18 UTC (permalink / raw) To: Edward Cree; +Cc: ast, daniel, netdev On Wed, Aug 22, 2018 at 08:00:46PM +0100, Edward Cree wrote: > The first patch is a simplification of register liveness tracking by using > a separate parentage chain for each register and stack slot, thus avoiding > the need for logic to handle callee-saved registers when applying read > marks. In the future this idea may be extended to form use-def chains. > The second patch adds information about misc/zero data on the stack to the > state dumps emitted to the log at various points; this information was > found essential in debugging the first patch, and may be useful elsewhere. I think this set is a great improvement in liveness tracking, so despite seeing the issues I applied it to bpf-next. I think it's a better base to continue debugging. In particular:
1. we have an instability issue in the verifier: from time to time the verifier goes on to process an extra 7 instructions on one of the cilium tests. This was happening both before and after this set.
2. there is a nice improvement in the number of processed insns with this set, but the difference I cannot explain, hence it has to be debugged. In theory the liveness rewrite shouldn't cause a difference in processed insns.
If not for the issue 1 I would argue that the issue 2 means that the set has to be debugged before going in, but since the verifier is unstable it's better to debug from this base with this patch set applied (because it greatly simplifies liveness and adds additional debug in patch 2) and once we figure out the issue 1, I hope, the issue 2 will be resolved automatically. The numbers on cilium bpf programs:

                     before1  before2  after1  after2
bpf_lb-DLB_L3.o         2003     2003    2003    2003
bpf_lb-DLB_L4.o         3173     3173    3173    3173
bpf_lb-DUNKNOWN.o       1080     1080    1080    1080
bpf_lxc-DDROP_ALL.o    29587    29587   29587   29587
bpf_lxc-DUNKNOWN.o     37204    37211   36926   36933
bpf_netdev.o           11283    11283   11188   11188
bpf_overlay.o           6679     6679    6679    6679
bpf_lcx_jit.o          39657    39657   39561   39561

notice how bpf_lxc-DUNKNOWN.o fluctuates with +7 before and after. That is issue 1. bpf_lxc-DUNKNOWN.o, bpf_netdev.o, and bpf_lcx_jit.o have small improvements. That is issue 2. To reproduce above numbers clone this repo: https://github.com/4ast/bpf_cilium_test and run .sh. The .o files in there are pretty old cilium bpf programs. I kept them frozen and didn't recompile for more than a year to keep a stable base line and track the progress of the verifier in 'processed insns'. Thanks ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification 2018-08-30 2:18 ` [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification Alexei Starovoitov @ 2018-08-31 15:50 ` Edward Cree 0 siblings, 0 replies; 12+ messages in thread From: Edward Cree @ 2018-08-31 15:50 UTC (permalink / raw) To: Alexei Starovoitov; +Cc: ast, daniel, netdev On 30/08/18 03:18, Alexei Starovoitov wrote: > I think it's a better base to continue debugging. > In particular: > 1. we have instability issue in the verifier. > from time to time the verifier goes to process extra 7 instructions on one > of the cilium tests. This was happening before and after this set. I can't reproduce this; I always get 36926. Can you try recording the verifier log and diff the output between the two cases? > 2. there is a nice improvement in number of processed insns with this set, > but the difference I cannot explain, hence it has to debugged. > In theory the liveness rewrite shouldn't cause the difference in processed insns. I shall attack this with the same methods I used for the other delta. Since that one turned out to be a real bug in the patch, I'm not so sanguine as to dismiss this one as probably connected to issue 1. -Ed ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification 2018-08-22 19:00 [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification Edward Cree ` (2 preceding siblings ...) 2018-08-30 2:18 ` [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification Alexei Starovoitov @ 2018-09-26 22:16 ` Jiong Wang 2018-09-28 13:36 ` Edward Cree 3 siblings, 1 reply; 12+ messages in thread From: Jiong Wang @ 2018-09-26 22:16 UTC (permalink / raw) To: Edward Cree, ast, daniel; +Cc: netdev On 22/08/2018 20:00, Edward Cree wrote: > The first patch is a simplification of register liveness tracking by using > a separate parentage chain for each register and stack slot, thus avoiding > the need for logic to handle callee-saved registers when applying read > marks. In the future this idea may be extended to form use-def chains. Interesting. This could potentially help efficient code-gen for 32-bit architectures; I have been thinking about an implementation along these lines and would like to have a discussion.

Program description
===
It would be good if we could know that an instruction is safe to operate on the low 32-bit sub-register only. If this is true:
- 32-bit arches could save some code-gen.
- 64-bit arches could utilize 32-bit sub-register instructions.

Algorithm
===
Based on the current verifier code-path-walker, it looks to me we could get 32-bit safety information using the following algorithm:
1. assume all instructions operate on the 32-bit sub-register only, initially.
2. link each insn to the insns which define its uses.
3. for a register use, if it is a 64-bit write back to memory, or a comparison-then-jump based on a 64-bit value, or a right shift, then consider it a 64-bit use; all its defining insns, and their parent defining insns, should then be marked as needing the full 64-bit operation.
4. currently, a register read (REG_LIVE_READ) is propagated to the parent state when a path prune happens, and REG_LIVE_WRITTEN screens off such propagation.
We need to propagate the 64-bit mark in a similar way, but REG_LIVE_WRITTEN shouldn't screen off backward 64-bit mark propagation if the written reg is also used as a source reg in the same instruction (that raises REG_LIVE_READ propagation, but does not necessarily indicate a 64-bit read).

Implementation
===
And it seems there could be an implementation based on the current liveness tracking infrastructure without dramatic change.

1. instruction level use->def chain

- a new active_defs array for each call frame, to record which insn the active define of each register comes from. The callee copies active_defs from the caller for argument registers only, and copies the active_def of R0 back to the caller on exit.

    struct bpf_func_state {
            ...
            s16 active_defs[__MAX_BPF_REG];

- new use->def chains for each instruction. One eBPF insn could have two uses at maximum. Also one new boolean "full_ref_def" is added to keep the final 32-bit safety information; it will be true if this instruction needs to operate on the full 64 bits.

    bpf_insn_aux_data {
            ...
            struct {
                    s16 def[2];
                    bool full_ref_def;
            };

2. Inside mark_reg_read, split SRC_OP into SRC0_OP/SRC1_OP/SRC_OP_64/SRC1_OP_64 to indicate whether a register read comes from the first or second use slot of the instruction, and whether the read is of the full 64 bits, which will only be true if the read is a 64-bit write back to memory, a 64-bit comparison-then-jump, or a right shift. Build the use->def chain for any read, but do the 64-bit backward propagation for 64-bit reads only. The propagation sets full_ref_def to true for the def, and parent defs, of the 64-bit read.

3. Need new bpf_reg_liveness enum values: REG_LIVE_READ64, to indicate that there is a 64-bit access to a register in the pruned path, and REG_LIVE_WRITTEN64, to indicate a write that is REG_LIVE_WRITTEN but should not screen off backward propagation of 64-bit read info.
    #define REG_LIVE_NONE      0
    #define REG_LIVE_READ      BIT(0)
    #define REG_LIVE_READ64    BIT(1)
    #define REG_LIVE_WRITTEN   BIT(2)
    #define REG_LIVE_WRITTEN64 BIT(3)

I might have missed something in the above thoughts; I would appreciate reviews and suggestions. I will also send out some concrete patches later. Regards, Jiong ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification 2018-09-26 22:16 ` Jiong Wang @ 2018-09-28 13:36 ` Edward Cree 2018-10-03 15:36 ` Jiong Wang 0 siblings, 1 reply; 12+ messages in thread From: Edward Cree @ 2018-09-28 13:36 UTC (permalink / raw) To: Jiong Wang, ast, daniel; +Cc: netdev On 26/09/18 23:16, Jiong Wang wrote: > On 22/08/2018 20:00, Edward Cree wrote: >> In the future this idea may be extended to form use-def chains. > > 1. instruction level use->def chain > > - new use->def chains for each instruction. one eBPF insn could have two > uses at maximum. I was thinking of something a lot weaker/simpler, just making ld rX, rY copy rY.parent into rX.parent and not read-mark rY (whereas actual arithmetic, pointer deref etc. would still create read marks). But what you've described sounds interesting; perhaps it would also help later with loop-variable handling? -Ed ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification 2018-09-28 13:36 ` Edward Cree @ 2018-10-03 15:36 ` Jiong Wang 2018-10-03 15:59 ` Alexei Starovoitov 2018-10-04 17:35 ` Edward Cree 1 sibling, 2 replies; 12+ messages in thread From: Jiong Wang @ 2018-10-03 15:36 UTC (permalink / raw) To: Edward Cree, ast, daniel; +Cc: netdev On 28/09/2018 14:36, Edward Cree wrote: > On 26/09/18 23:16, Jiong Wang wrote: >> On 22/08/2018 20:00, Edward Cree wrote: >>> In the future this idea may be extended to form use-def chains. >> >> 1. instruction level use->def chain >> >> - new use->def chains for each instruction. one eBPF insn could have two >> uses at maximum. > I was thinking of something a lot weaker/simpler, just making > ld rX, rY > copy rY.parent into rX.parent and not read-mark rY (whereas actual > arithmetic, pointer deref etc. would still create read marks). Thanks for the feedback Edward. > But what you've described sounds interesting; perhaps it would also > help later with loop-variable handling? I haven't considered how to use this for loop-variable handling; I guess you mean applying what I have described to your previous loop detection RFC? I will look into that RFC later. At the moment the design of the use->def chain is mainly to optimize 32-bit code-gen. I was almost satisfied with a local implementation and about to share it to the ML for further discussion. However, when manually checking the optimization result on a testcase of medium size (~1000 eBPF insns) and proper complexity (making sure path prunes etc. are triggered inside the verifier), I found the code-gen doesn't meet my expectation. For example, in the following sequence, the insn at 25 should operate on the full 64 bits, but I found it is marked as 32-bit safe.

25: r7 = 1
26: if r4 > r8 goto +1200 <L>
27: r1 = *(u8 *)(r1 + 0)
28: r1 &= 15
29: r7 = 1
...
L:
1227: r0 = r7
1228: exit

As described in the previous email, the algorithm first assumes all insns are 32-bit safe, then marks insns back as "64-bit" when a 64-bit use is found for an insn. Insn 25 defines r7, which is used at insn 1227, where its value propagates to r0; r0 is then implicitly used at insn 1228, as that is an exit from the main function to external code. For the above example, since we don't know the external use of r0 at 1228 (exit from main to external), r0 is treated as an implicit 64-bit use. Its define is at 1227, so insn 1227 is marked as "64-bit". The "64-bit" attribute should propagate to source register operands through register moves and arithmetic, so r7 at insn 1227 is a "64-bit" use and should make its defining instruction, insn 25, marked as "64-bit". That is my thinking of how insn 25 should be marked, but this hasn't happened. I am still debugging the root cause, but I rather feel the "64-bit" attribute propagation is the issue: it seems to me it can't be nicely integrated into the existing register read/write propagation infrastructure. For example, consider a slightly more complex sequence composed of three states:

State A
...
10: r6 = *(u32 *)(r10 - 4)
11: r7 = *(u32 *)(r10 - 8)
12: *(u64 *)(r10 - 16) = r6
13: *(u64 *)(r10 - 24) = r7

State B
14: r6 += 1
15: r7 += r6
16: *(u32 *)(r10 - 28) = r7

State C
...
17: r3 += r7
18: r4 = 1
19: *(u64 *)(r10 - 32) = r3
20: *(u64 *)(r10 - 40) = r4

State A is the parent of state B, which is the parent of state C. Inside state C, at insn 20, r4 is a 64-bit read/use, so its define at 18 is marked as "64-bit". There is no register source at 18, so the "64-bit" attribute propagation stops there. Then at insn 19, r3 is a 64-bit read/use, so its define at 17 is marked as "64-bit". Insn 17 has two register sources, r3 and r7; they become "64-bit" now, and their definitions should be marked as "64-bit".
Now if the definition of r3 or r7 comes from a parent state, then the parent state should receive a "REG_LIVE_READ64". This is necessary if later another path reaches state C and triggers path prune, in which case that path should know there is "64-bit" use inside state C on some registers, and should use this information to mark "64-bit" insns. If the definition of r3 or r7 is still inside state C, we need to keep walking up the instruction sequence and propagate the "64-bit" attribute upward until it goes beyond state C. The above propagation logic is quite different from the existing register read/write propagation. For the latter, a write just screens off all following reads, and a read propagates directly to its parent if there is no previous write; no instruction analysis is required. I am just describing what I have run into and am trying to resolve; any thoughts and suggestions are appreciated. Regards, Jiong ^ permalink raw reply [flat|nested] 12+ messages in thread
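[Editorial note: the backward "64-bit" marking over use-def chains described above can be pictured as a plain worklist. This is a rough, hypothetical sketch, not verifier code; the struct layout and 1-based indexing are simplifying assumptions loosely modeled on the `def[2]` bookkeeping discussed later in the thread.]

```c
#include <stdbool.h>

#define MAX_INSNS 32

/* Hypothetical, simplified per-insn bookkeeping: each insn records the
 * (1-based) indices of the insns defining its up-to-two register
 * sources, with 0 meaning "no source".
 */
struct insn_aux {
	int def[2];
	bool full_ref;	/* insn must operate on the full 64 bits */
};

/* Walk use-def chains backward from one 64-bit use, marking every
 * contributing insn as 64-bit. No path-prune handling; that is the
 * part the mail says does not fit the read/write infrastructure.
 */
static void mark_64bit_backward(struct insn_aux *aux, int use_def_idx)
{
	int stack[MAX_INSNS];
	int top = 0;

	stack[top++] = use_def_idx;
	while (top) {
		int idx = stack[--top];

		if (!idx || aux[idx - 1].full_ref)
			continue;
		aux[idx - 1].full_ref = true;
		stack[top++] = aux[idx - 1].def[0];
		stack[top++] = aux[idx - 1].def[1];
	}
}
```

On the insn 17-20 example, a 64-bit use of insn 17's result would mark insn 17 and then both of its sources' defining insns, while an unrelated insn stays 32-bit safe.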
* Re: [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification 2018-10-03 15:36 ` Jiong Wang @ 2018-10-03 15:59 ` Alexei Starovoitov 2018-10-03 16:53 ` Jiong Wang 2018-10-04 17:35 ` Edward Cree 1 sibling, 1 reply; 12+ messages in thread From: Alexei Starovoitov @ 2018-10-03 15:59 UTC (permalink / raw) To: Jiong Wang; +Cc: Edward Cree, ast, daniel, netdev On Wed, Oct 03, 2018 at 04:36:31PM +0100, Jiong Wang wrote: > On 28/09/2018 14:36, Edward Cree wrote: > > On 26/09/18 23:16, Jiong Wang wrote: > >> On 22/08/2018 20:00, Edward Cree wrote: > >>> In the future this idea may be extended to form use-def chains. > >> > >> 1. instruction level use->def chain > >> > >> - new use->def chains for each instruction. one eBPF insn could have > two > >> uses at maximum. > > I was thinking of something a lot weaker/simpler, just making > > ld rX, rY > > copy rY.parent into rX.parent and not read-mark rY (whereas actual > > arithmetic, pointer deref etc. would still create read marks). > > Thanks for the feedback Edward. > > > But what you've described sounds interesting; perhaps it would also > > help later with loop-variable handling? > > Haven't considered how to use this for loop-variable handling, guess you > mean > applying what I have described to your previous loop detection RFC? I will > look > into your RFC later. > > At the moment the design of the use->def chain is mainly to optimize 32-bit > code-gen. I was about to satisfied with a local implementation and to share > it > to ML for further discussion. However, when manually check the optimization > result on testcase with medium size (~1000 eBPF insns) and proper complexity > (make sure path prunes etc are triggered inside verifier), I found the > code-gen > doesn't meet my expectation. > > For example, for the following sequence, insn at 25 should operate on > full-64 > bit but I found it is marked as 32-bit safe. 
> > 25: r7 = 1 > 26: if r4 > r8 goto +1200 <L> > 27: r1 = *(u8 *)(r1 + 0) > 28: r1 &= 15 > 29: r7 = 1 > ... > > L: > 1227: r0 = r7 > 1228: exit > > As described at previous email, the algorithm assume all insns are 32-bit > safe > first, then start to insns back to "64-bit" if there is any 64-bit use found > for a insn. > > Insn 25 is defining r7 which is used at the 1227 where its value propagated > to > r0 and then r0 is implicitly used at insn 1228 as it is a exit from main > function to external. > > For above example, as we don't know the external use of r0 at 1228 (exit > from > main to external), so r0 is treated as 64-bit implicit use. The define is at > 1227, so insn 1227 is marked as "64-bit". The "64-bit" attribute should > propagate to source register operand through register move and arithmetic, > so > r7 at insn 1227 is a "64-bit" use and should make its definition > instruction, > insn 25, marked as "64-bit". This is my thinking of how insn 25 should be > marked. all makes sense to me. > Now this hasn't happened. I am still debugging the root cause, but kind of > feel > "64-bit" attribute propagation is the issue, it seems to me it can't be > nicely > integrated into the existing register read/write propagation infrastructure. may be share your patches that modify the liveness propagation? > For > example, for a slightly more complex sequence which is composed of three > states: > > State A > ... > 10: r6 = *(u32 *)(r10 - 4) > 11: r7 = *(u32 *)(r10 - 8) > 12: *(u64 *)(r10 - 16) = r6 > 13: *(u64 *)(r10 - 24) = r7 > > State B > 14: r6 += 1 > 15: r7 += r6 > 16: *(u32 *)(r10 - 28) = r7 > > State C > ... > 17: r3 += r7 > 18: r4 = 1 > 19: *(u64 *)(r10 - 32) = r3 > 20: *(u64 *)(r10 - 40) = r4 > > State A is parent of state B which is parent of state C. > > Inside state C, at insn 20, r4 is a 64-bit read/use, so its define at 18 is > marked as "64-bit". There is no register source at 18, so "64-bit" attribute > propagation is stopped. 
> > Then at insn 19, r3 is a 64-bit read/use, so its define at 17 is marked as > "64-bit" read/use. Insn 17 has two register sources, r3 and r7, they become > "64-bit" now, and their definition should be marked as "64-bit". > > Now if the definition of r3 or r7 comes from parent state, then the parent ... the definition of r3 _and_ r7 ... both need to propagate up with your algo, right? > state > should receive a "REG_LIVE_READ64", this is necessary if later another path > reaches state C and triggers prune path, for which case that path should > know > there is "64-bit" use inside state C on some registers, and should use this > information to mark "64-bit" insn. > > If the definition of r3 or r7 is still inside state C, we need to keep > walking > up the instruction sequences, and propagate "64-bit" attribute upward until > it > goes beyond the state C. > > The above propagation logic is quite different from existing register > read/write > propagation. > For the latter, a write just screen up all following read, and > a > read would propagate directly to its parent is there is not previous write, > no > instruction analysis is required. correct. with such algo REG_LIVE_WRITTEN shouldn't be screening the propagation. I think the patches will discuss the algo. Also I think the initial state of 'everything is 32-bit safe' and making marks to enforce 64-bit-ness is a dangerous algorithmic choice. Why not start at a safer state where everything is 64-bit and work backward to find out which ones can be 32-bit? That would be a safer algo in case there are issues like the one you described above. ^ permalink raw reply [flat|nested] 12+ messages in thread
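[Editorial note: the safer direction Alexei suggests, where everything starts 64-bit and an insn is downgraded only once all of its uses are proven 32-bit, can be pictured as a downward fixed-point iteration. A hypothetical sketch under simplifying assumptions: the enum, structs, and `compute_needs64` name are illustrative, not verifier code, and each insn has at most a few recorded uses.]

```c
#include <stdbool.h>

#define MAX_USES 4

/* Kinds of uses of an insn's result: an inherently 32-bit use, an
 * inherently 64-bit use (e.g. r0 at exit), or a use whose width
 * follows the consuming insn (e.g. a register move).
 */
enum use_kind { USE_NONE, USE_32, USE_64, USE_SAME };

struct use {
	enum use_kind kind;
	int consumer;	/* only meaningful for USE_SAME */
};

struct insn {
	struct use uses[MAX_USES];
};

/* Start with every insn needing 64 bits and only downgrade an insn
 * once none of its uses demands 64 bits; iterate to a fixed point.
 * Anything unproven (including cycles) conservatively stays 64-bit.
 */
static void compute_needs64(const struct insn *insns, int n, bool *needs64)
{
	bool changed = true;
	int i, u;

	for (i = 0; i < n; i++)
		needs64[i] = true;

	while (changed) {
		changed = false;
		for (i = 0; i < n; i++) {
			bool need = false;

			for (u = 0; u < MAX_USES; u++) {
				const struct use *use = &insns[i].uses[u];

				if (use->kind == USE_64 ||
				    (use->kind == USE_SAME &&
				     needs64[use->consumer]))
					need = true;
			}
			if (needs64[i] && !need) {
				needs64[i] = false;
				changed = true;
			}
		}
	}
}
```

The design choice is the one Alexei names: a missed use leaves an insn 64-bit (correct but less optimized), rather than 32-bit (a miscompile).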
* Re: [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification 2018-10-03 15:59 ` Alexei Starovoitov @ 2018-10-03 16:53 ` Jiong Wang 2018-10-08 20:18 ` Jiong Wang 0 siblings, 1 reply; 12+ messages in thread From: Jiong Wang @ 2018-10-03 16:53 UTC (permalink / raw) To: Alexei Starovoitov; +Cc: Edward Cree, ast, daniel, netdev On 03/10/2018 16:59, Alexei Starovoitov wrote: > On Wed, Oct 03, 2018 at 04:36:31PM +0100, Jiong Wang wrote: <snip...> >> Now this hasn't happened. I am still debugging the root cause, but kind of >> feel >> "64-bit" attribute propagation is the issue, it seems to me it can't be >> nicely >> integrated into the existing register read/write propagation infrastructure. > > may be share your patches that modify the liveness propagation? OK, I will share it after some clean up. >> For >> example, for a slightly more complex sequence which is composed of three >> states: >> >> State A >> ... >> 10: r6 = *(u32 *)(r10 - 4) >> 11: r7 = *(u32 *)(r10 - 8) >> 12: *(u64 *)(r10 - 16) = r6 >> 13: *(u64 *)(r10 - 24) = r7 >> >> State B >> 14: r6 += 1 >> 15: r7 += r6 >> 16: *(u32 *)(r10 - 28) = r7 >> >> State C >> ... >> 17: r3 += r7 >> 18: r4 = 1 >> 19: *(u64 *)(r10 - 32) = r3 >> 20: *(u64 *)(r10 - 40) = r4 >> >> State A is parent of state B which is parent of state C. >> >> Inside state C, at insn 20, r4 is a 64-bit read/use, so its define at 18 is >> marked as "64-bit". There is no register source at 18, so "64-bit" attribute >> propagation is stopped. >> >> Then at insn 19, r3 is a 64-bit read/use, so its define at 17 is marked as >> "64-bit" read/use. Insn 17 has two register sources, r3 and r7, they become >> "64-bit" now, and their definition should be marked as "64-bit". >> >> Now if the definition of r3 or r7 comes from parent state, then the parent > > ... the definition of r3 _and_ r7 ... > both need to propagate up with your algo, right? Yes, all sources of insn 17, both r3 and r7. 
>> state >> should receive a "REG_LIVE_READ64", this is necessary if later another path >> reaches state C and triggers prune path, for which case that path should >> know >> there is "64-bit" use inside state C on some registers, and should use this >> information to mark "64-bit" insn. >> >> If the definition of r3 or r7 is still inside state C, we need to keep >> walking >> up the instruction sequences, and propagate "64-bit" attribute upward until >> it >> goes beyond the state C. >> >> The above propagation logic is quite different from existing register >> read/write >> propagation. >> For the latter, a write just screen up all following read, and a >> read would propagate directly to its parent is there is not previous write, >> no instruction analysis is required. > > correct. > with such algo REG_LIVE_WRITTEN shouldn't be screening the propagation. > > I think the patches will discuss the algo. > Also I think the initial state of 'everything is 32-bit safe' > and make marks to enforce 64-bit-ness is a dangerous algorithmic choice. Indeed, and I am actually thinking the same thing ... > Why not to start at a safer state where everything is 64-bit > and work backward to find out which ones can be 32-bit? > That will be safer algo in case there are issues with bit like > you described in above. ... but I failed to find an algo that works with an initial state of "everything is 64-bit". I haven't figured out a way to check that all uses of one definition are 32-bit on all possible code paths, which looks to me like a must for marking one insn as 32-bit safe. ^ permalink raw reply [flat|nested] 12+ messages in thread
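[Editorial note: for contrast with the new 64-bit propagation, the existing read-mark rule both mails refer to, "a write screens off all following reads, a read propagates to its parent", can be sketched standalone. This is a rough simplification of the kernel's mark_reg_read; the two-flag `live` field is illustrative and frame/write-observation details are omitted.]

```c
#include <stddef.h>

#define REG_LIVE_NONE    0
#define REG_LIVE_READ    0x1
#define REG_LIVE_WRITTEN 0x2

/* Minimal model of one register's liveness link to its parent state. */
struct reg_state {
	int live;
	struct reg_state *parent;
};

/* A read climbs the parentage chain marking REG_LIVE_READ, and stops
 * as soon as it reaches a state that wrote the register: the write
 * screens off all older states. No instruction analysis is needed,
 * which is exactly why this machinery alone cannot carry the
 * "64-bit" attribute discussed above.
 */
static void mark_read(struct reg_state *state)
{
	struct reg_state *parent = state->parent;

	while (parent) {
		if (state->live & REG_LIVE_WRITTEN)
			break;
		parent->live |= REG_LIVE_READ;
		state = parent;
		parent = state->parent;
	}
}
```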
* Re: [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification 2018-10-03 16:53 ` Jiong Wang @ 2018-10-08 20:18 ` Jiong Wang 0 siblings, 0 replies; 12+ messages in thread From: Jiong Wang @ 2018-10-08 20:18 UTC (permalink / raw) To: Alexei Starovoitov; +Cc: Edward Cree, ast, daniel, netdev On 03/10/2018 17:53, Jiong Wang wrote: > On 03/10/2018 16:59, Alexei Starovoitov wrote: >> On Wed, Oct 03, 2018 at 04:36:31PM +0100, Jiong Wang wrote: > <snip...> >>> Now this hasn't happened. I am still debugging the root cause, but kind of >>> feel >>> "64-bit" attribute propagation is the issue, it seems to me it can't be >>> nicely >>> integrated into the existing register read/write propagation infrastructure. >> >> may be share your patches that modify the liveness propagation? > > OK, I will share it after some clean up. I have done more experiments on the algo and finally got something that passed my local tests. The bpf selftests passed as well, except several test_verifier failures due to some corner cases I guess; will have a look later. In these tests, x86-64 is using the 32-bit information. In the insn loop inside sanitize_dead_code, if an insn is not marked as 64-bit and it is ALU64, it is changed to ALU32. When enabling tracking using printk, I could see quite a few ALU64 instructions really are rewritten into ALU32, so tests like test_l4lb running OK looks like a positive signal of correctness. The algo separates the problem into two steps. - First, assume there is no code path prune and all code paths will be walked through. In this case, if we could propagate 64-bit info backward along the use-def chains for all paths walked, each insn would be marked as 64-bit correctly. Finishing this step requires building the use-def chain, and it is done in the following way: 1. each insn could have two explicit uses, so add two chain fields in bpf_insn_aux_data. 2. 
we need a finer enum to describe register use, so split SRC_OP into SRC_OP_0, SRC_OP64_0, SRC_OP_1, SRC_OP64_1 to indicate whether the source is the first/second source and whether it is a 64-bit source. 3. create the chain at check_reg_arg, which covers exactly all register use sites. The function to create the chain is link_reg_to_def. 4. when creating the chain, if a source is a 64-bit source, also propagate the information backward along the use-def chain straight away. This is done in mark_reg_read, which further calls the new function "mark_insn_64bit" which does the real job. "mark_insn_64bit" fetches the def insn for the 64-bit source, and further marks the def insns of its sources as 64-bit. This is repeated until the whole use-def chain is consumed. 5. with the use-def chain described above, if there is no code path prune, an insn must have been marked as 64-bit when its result has a 64-bit use. 6. a helper call causes implicit reg use and must conservatively be treated as 64-bit use; a bpf-to-bpf function call has its insns connected by use-def chains, so it doesn't need that conservative assumption. - Second, to handle code path prune, define new liveness enums REG_LIVE_READ64 and REG_LIVE_UNIQUE_WRITTEN. The latter will only be set if reg_arg_type is the new U_DST_OP or U_DST_OP_NO_MARK, and REG_LIVE_READ64 will be set if a 64-bit read is not screened off by REG_LIVE_UNIQUE_WRITTEN. The thought is that 64-bit use info will only be screened off if the dst register is unique among all register operands, meaning not the same as any source. For example, insn 18 below will screen off r4's 64-bit propagation. 17: r3 += r7 18: r4 = 1 19: *(u64 *)(r10 - 32) = r3 20: *(u64 *)(r10 - 40) = r4 So, U_DST_OP/U_DST_OP_NO_MARK have been introduced to differentiate from DST_OP/DST_OP_NO_MARK. Inside check_reg_arg, checks are done on dst_reg, and U_DST_* is passed as reg_arg_type when it is unique. U_DST_* then sets liveness to REG_LIVE_UNIQUE_WRITTEN. 
Inside mark_reg_read, if a 64-bit read is not screened off by REG_LIVE_UNIQUE_WRITTEN, then REG_LIVE_READ64 will be set in the reg_state. REG_LIVE_READ64 further triggers propagating downstream 64-bit uses from the pruned paths into the current path inside propagate_liveness when path prune has happened. Compared with propagating REG_LIVE_READ, propagating REG_LIVE_READ64 needs more work, because one 64-bit read could indicate that more than one register is 64-bit. For example, insn 19 above shows r3 is a 64-bit source, so its definition at insn 17 is 64-bit; then all sources of insn 17 must be 64-bit, meaning both r3 and r7 are 64-bit. Therefore, REG_LIVE_READ64 needs to be propagated on both r3 and r7 upward along the register parentage chain. While walking the def-use chain, we record all such affected reg_states and propagate REG_LIVE_READ64 for all of them. This logic is done inside mark_insn_64bit as well. - In short, the implementation treats the new 64-bit info (REG_LIVE_READ64) propagation the same as REG_LIVE_READ propagation. REG_LIVE_READ triggers more path prune during state equal comparison, while REG_LIVE_READ64 triggers 64-bit insn marking during def-use chain walking. The use-def chain is separate from the reg state parentage chain; the former helps the latter, and reg states that need REG_LIVE_READ64 propagation are collected during use-def chain walking. Please review the following implementation. Thanks. Regards, Jiong diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h index b42b60a..ea22f43 100644 --- a/include/linux/bpf_verifier.h +++ b/include/linux/bpf_verifier.h @@ -34,11 +34,15 @@ * but of the link between it and its parent. See mark_reg_read() and * mark_stack_slot_read() in kernel/bpf/verifier.c. 
*/ -enum bpf_reg_liveness { - REG_LIVE_NONE = 0, /* reg hasn't been read or written this branch */ - REG_LIVE_READ, /* reg was read, so we're sensitive to initial value */ - REG_LIVE_WRITTEN, /* reg was written first, screening off later reads */ -}; +/* Reg hasn't been read or written this branch. */ +#define REG_LIVE_NONE 0 +/* Reg was read, so we're sensitive to initial value. */ +#define REG_LIVE_READ 0x1 +/* The read is also 64-bit. */ +#define REG_LIVE_READ64 0x2 +#define REG_LIVE_WRITTEN 0x4 +/* The write also should screen off 64-bit backward propagation. */ +#define REG_LIVE_UNIQUE_WRITTEN 0x8 struct bpf_reg_state { /* Ordering of fields matters. See states_equal() */ @@ -85,7 +89,11 @@ struct bpf_reg_state { * pointing to bpf_func_state. */ u32 frameno; - enum bpf_reg_liveness live; + u32 live; + struct { + s16 active_def; + bool full_ref; + }; }; enum bpf_stack_slot_type { @@ -145,6 +153,11 @@ struct bpf_insn_aux_data { }; int ctx_field_size; /* the ctx field size for load insn, maybe 0 */ int sanitize_stack_off; /* stack slot to be cleared */ + struct { + s16 def[2]; /* The insns defining the uses of this insn. */ + bool full_ref; /* This insn should operate on full 64-bit. 
*/ + }; + struct bpf_reg_state *reg_state; bool seen; /* this insn was processed by the verifier */ }; diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 8ccbff4..061345d 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -271,8 +271,7 @@ static char slot_type_char[] = { [STACK_ZERO] = '0', }; -static void print_liveness(struct bpf_verifier_env *env, - enum bpf_reg_liveness live) +static void print_liveness(struct bpf_verifier_env *env, u32 live) { if (live & (REG_LIVE_READ | REG_LIVE_WRITTEN)) verbose(env, "_"); @@ -755,6 +754,7 @@ static void init_reg_state(struct bpf_verifier_env *env, mark_reg_not_init(env, regs, i); regs[i].live = REG_LIVE_NONE; regs[i].parent = NULL; + regs[i].active_def = -1; } /* frame pointer */ @@ -779,9 +779,16 @@ static void init_func_state(struct bpf_verifier_env *env, } enum reg_arg_type { - SRC_OP, /* register is used as source operand */ + SRC_OP_0, /* register is used as source operand */ + SRC_OP64_0, /* likewise, and all 64 bits of the source matter */ + SRC_OP_1, /* the second source */ + SRC_OP64_1, /* likewise */ + SRC_OP64_IMP, /* implicit source, always 64-bit */ + SRC_OP_SPILL, /* stack spill */ DST_OP, /* register is used as destination operand */ - DST_OP_NO_MARK /* same as above, check only, don't mark */ + U_DST_OP, /* likewise, and no overlap with any source */ + DST_OP_NO_MARK, /* same as above, check only, don't mark */ + U_DST_OP_NO_MARK/* likewise */ }; static int cmp_subprogs(const void *a, const void *b) @@ -899,14 +906,101 @@ static int check_subprogs(struct bpf_verifier_env *env) return 0; } +static void +link_reg_to_def(struct bpf_verifier_env *env, const struct bpf_reg_state *rstate, + enum reg_arg_type read_t, s16 insn_idx) +{ + s16 slot_idx, def_idx; + + if (insn_idx == -1) + return; + + slot_idx = (read_t - SRC_OP_0) % 2; + def_idx = rstate->active_def; + env->insn_aux_data[insn_idx].def[slot_idx] = def_idx + 1; +} + +/* Stack of insns to process, this is also used by 
check_cfg. */ +static int *insn_stack; + +struct prop_pair { + struct bpf_reg_state *reg_state; + u8 regno; +}; + +/* Mark insn upward along the chain, also propagate READ64 liveness. */ +static int mark_insn_64bit(struct bpf_verifier_env *env, int insn_idx) +{ + struct bpf_insn_aux_data *aux = env->insn_aux_data; + u16 u2d0, u2d1, stack_idx = 0, prop_idx = 0; + struct prop_pair *prop_stack; + + prop_stack = kcalloc(1024, sizeof(struct prop_pair), GFP_KERNEL); + if (!prop_stack) + return -ENOMEM; + + insn_stack[stack_idx++] = insn_idx; + + while (stack_idx) { + u16 def_idx = insn_stack[--stack_idx]; + + if (!def_idx) + continue; + + def_idx--; + + aux[def_idx].full_ref = true; + u2d0 = aux[def_idx].def[0]; + if (u2d0) { + insn_stack[stack_idx++] = u2d0; + prop_stack[prop_idx].reg_state = aux[def_idx].reg_state; + prop_stack[prop_idx++].regno = + env->prog->insnsi[u2d0 - 1].dst_reg; + } + u2d1 = aux[def_idx].def[1]; + if (u2d1) { + insn_stack[stack_idx++] = u2d1; + prop_stack[prop_idx].reg_state = aux[def_idx].reg_state; + prop_stack[prop_idx++].regno = + env->prog->insnsi[u2d1 - 1].dst_reg; + } + } + + while (prop_idx) { + struct prop_pair pair = prop_stack[--prop_idx]; + struct bpf_reg_state *state, *parent; + u8 regno = pair.regno; + + state = pair.reg_state + regno; + parent = state->parent; + + while (parent) { + if (state->live & REG_LIVE_UNIQUE_WRITTEN) + break; + + parent->live |= REG_LIVE_READ64; + state = parent; + parent = state->parent; + } + } + + kfree(prop_stack); + + return 0; +} + /* Parentage chain of this register (or stack slot) should take care of all * issues like callee-saved registers, stack slot allocation time, etc. 
*/ static int mark_reg_read(struct bpf_verifier_env *env, const struct bpf_reg_state *state, - struct bpf_reg_state *parent) + struct bpf_reg_state *parent, bool full_reg_read) { bool writes = parent == state->parent; /* Observe write marks */ + const struct bpf_reg_state *orig_state = state; + struct bpf_reg_state *orig_parent = parent; + bool orig_writes = writes; + int def_idx; while (parent) { /* if read wasn't screened by an earlier write ... */ @@ -918,11 +1012,20 @@ static int mark_reg_read(struct bpf_verifier_env *env, parent = state->parent; writes = true; } - return 0; + + writes = orig_writes; + parent = orig_parent; + state = orig_state; + + def_idx = writes ? state->active_def : parent->active_def; + if (!full_reg_read || def_idx == -1) + return 0; + + return mark_insn_64bit(env, def_idx + 1); } static int check_reg_arg(struct bpf_verifier_env *env, u32 regno, - enum reg_arg_type t) + enum reg_arg_type t, s16 insn_idx) { struct bpf_verifier_state *vstate = env->cur_state; struct bpf_func_state *state = vstate->frame[vstate->curframe]; @@ -933,16 +1036,24 @@ static int check_reg_arg(struct bpf_verifier_env *env, u32 regno, return -EINVAL; } - if (t == SRC_OP) { + if (t == SRC_OP_0 || t == SRC_OP_1 || + t == SRC_OP64_0 || t == SRC_OP64_1 || t == SRC_OP64_IMP) { + struct bpf_reg_state *rstate = regs + regno; + /* check whether register used as source operand can be read */ - if (regs[regno].type == NOT_INIT) { + if (rstate->type == NOT_INIT) { verbose(env, "R%d !read_ok\n", regno); return -EACCES; } + + link_reg_to_def(env, rstate, t, insn_idx); + /* We don't need to worry about FP liveness because it's read-only */ if (regno != BPF_REG_FP) - return mark_reg_read(env, ®s[regno], - regs[regno].parent); + return mark_reg_read(env, rstate, rstate->parent, + t == SRC_OP64_0 || + t == SRC_OP64_1 || + t == SRC_OP64_IMP); } else { /* check whether register used as dest operand can be written to */ if (regno == BPF_REG_FP) { @@ -950,8 +1061,14 @@ static int 
check_reg_arg(struct bpf_verifier_env *env, u32 regno, return -EACCES; } regs[regno].live |= REG_LIVE_WRITTEN; - if (t == DST_OP) + if (t == U_DST_OP || t == U_DST_OP_NO_MARK) + regs[regno].live |= REG_LIVE_UNIQUE_WRITTEN; + if (t == DST_OP || t == U_DST_OP) mark_reg_unknown(env, regs, regno); + regs[regno].active_def = insn_idx; + if (insn_idx != -1) { + env->insn_aux_data[insn_idx].reg_state = regs; + } } return 0; } @@ -1118,7 +1235,8 @@ static int check_stack_read(struct bpf_verifier_env *env, state->regs[value_regno].live |= REG_LIVE_WRITTEN; } mark_reg_read(env, ®_state->stack[spi].spilled_ptr, - reg_state->stack[spi].spilled_ptr.parent); + reg_state->stack[spi].spilled_ptr.parent, + SRC_OP_SPILL); return 0; } else { int zeros = 0; @@ -1135,7 +1253,8 @@ static int check_stack_read(struct bpf_verifier_env *env, return -EACCES; } mark_reg_read(env, ®_state->stack[spi].spilled_ptr, - reg_state->stack[spi].spilled_ptr.parent); + reg_state->stack[spi].spilled_ptr.parent, + SRC_OP_SPILL); if (value_regno >= 0) { if (zeros == size) { /* any size read into register is zero extended, @@ -1735,23 +1854,26 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn return err; } -static int check_xadd(struct bpf_verifier_env *env, int insn_idx, struct bpf_insn *insn) +static int +check_xadd(struct bpf_verifier_env *env, int insn_idx, struct bpf_insn *insn) { + bool is_xadd_64 = BPF_SIZE(insn->code) == BPF_DW; + bool is_xadd_32 = BPF_SIZE(insn->code) == BPF_W; int err; - if ((BPF_SIZE(insn->code) != BPF_W && BPF_SIZE(insn->code) != BPF_DW) || - insn->imm != 0) { + if (!(is_xadd_32 || is_xadd_64) || insn->imm != 0) { verbose(env, "BPF_XADD uses reserved fields\n"); return -EINVAL; } /* check src1 operand */ - err = check_reg_arg(env, insn->src_reg, SRC_OP); + err = check_reg_arg(env, insn->src_reg, + is_xadd_32 ? 
SRC_OP_0 : SRC_OP64_0, insn_idx); if (err) return err; /* check src2 operand */ - err = check_reg_arg(env, insn->dst_reg, SRC_OP); + err = check_reg_arg(env, insn->dst_reg, SRC_OP64_1, insn_idx); if (err) return err; @@ -1852,7 +1974,8 @@ static int check_stack_boundary(struct bpf_verifier_env *env, int regno, * the whole slot to be marked as 'read' */ mark_reg_read(env, &state->stack[spi].spilled_ptr, - state->stack[spi].spilled_ptr.parent); + state->stack[spi].spilled_ptr.parent, + SRC_OP_SPILL); } return update_stack_depth(env, state, off); } @@ -1903,7 +2026,7 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 regno, if (arg_type == ARG_DONTCARE) return 0; - err = check_reg_arg(env, regno, SRC_OP); + err = check_reg_arg(env, regno, SRC_OP64_IMP, -1); if (err) return err; @@ -2320,7 +2443,7 @@ static int check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn, /* after the call registers r0 - r5 were scratched */ for (i = 0; i < CALLER_SAVED_REGS; i++) { mark_reg_not_init(env, caller->regs, caller_saved[i]); - check_reg_arg(env, caller_saved[i], DST_OP_NO_MARK); + check_reg_arg(env, caller_saved[i], DST_OP_NO_MARK, -1); } /* only increment it after check_reg_arg() finished */ @@ -2510,7 +2633,7 @@ static int check_helper_call(struct bpf_verifier_env *env, int func_id, int insn /* reset caller saved regs */ for (i = 0; i < CALLER_SAVED_REGS; i++) { mark_reg_not_init(env, regs, caller_saved[i]); - check_reg_arg(env, caller_saved[i], DST_OP_NO_MARK); + check_reg_arg(env, caller_saved[i], DST_OP_NO_MARK, -1); } /* update return register (already marked as written above) */ @@ -3161,7 +3284,10 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn) { struct bpf_reg_state *regs = cur_regs(env); u8 opcode = BPF_OP(insn->code); - int err; + enum reg_arg_type dst_type; + int err, insn_idx; + + insn_idx = insn - env->prog->insnsi; if (opcode == BPF_END || opcode == BPF_NEG) { if (opcode == BPF_NEG) { @@ -3181,7 +3307,7 @@ 
static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn) } /* check src operand */ - err = check_reg_arg(env, insn->dst_reg, SRC_OP); + err = check_reg_arg(env, insn->dst_reg, SRC_OP_0, insn_idx); if (err) return err; @@ -3191,8 +3317,9 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn) return -EACCES; } + dst_type = insn->dst_reg != insn->src_reg ? U_DST_OP : DST_OP; /* check dest operand */ - err = check_reg_arg(env, insn->dst_reg, DST_OP); + err = check_reg_arg(env, insn->dst_reg, dst_type, insn_idx); if (err) return err; @@ -3205,18 +3332,24 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn) } /* check src operand */ - err = check_reg_arg(env, insn->src_reg, SRC_OP); + err = check_reg_arg(env, insn->src_reg, SRC_OP_0, + insn_idx); if (err) return err; + + dst_type = insn->dst_reg != insn->src_reg ? + U_DST_OP_NO_MARK : DST_OP_NO_MARK; } else { if (insn->src_reg != BPF_REG_0 || insn->off != 0) { verbose(env, "BPF_MOV uses reserved fields\n"); return -EINVAL; } + + dst_type = U_DST_OP_NO_MARK; } /* check dest operand, mark as required later */ - err = check_reg_arg(env, insn->dst_reg, DST_OP_NO_MARK); + err = check_reg_arg(env, insn->dst_reg, dst_type, insn_idx); if (err) return err; @@ -3225,8 +3358,12 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn) /* case: R1 = R2 * copy register state to dest reg */ + u32 live_old = regs[insn->dst_reg].live; + s16 def_old = regs[insn->dst_reg].active_def; + regs[insn->dst_reg] = regs[insn->src_reg]; - regs[insn->dst_reg].live |= REG_LIVE_WRITTEN; + regs[insn->dst_reg].live |= live_old; + regs[insn->dst_reg].active_def = def_old; } else { /* R1 = (u32) R2 */ if (is_pointer_value(env, insn->src_reg)) { @@ -3266,18 +3403,23 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn) return -EINVAL; } /* check src1 operand */ - err = check_reg_arg(env, insn->src_reg, SRC_OP); + err = check_reg_arg(env, 
insn->src_reg, SRC_OP_0, + insn_idx); if (err) return err; + dst_type = insn->dst_reg != insn->src_reg ? + U_DST_OP_NO_MARK : DST_OP_NO_MARK; } else { if (insn->src_reg != BPF_REG_0 || insn->off != 0) { verbose(env, "BPF_ALU uses reserved fields\n"); return -EINVAL; } + + dst_type = DST_OP_NO_MARK; } /* check src2 operand */ - err = check_reg_arg(env, insn->dst_reg, SRC_OP); + err = check_reg_arg(env, insn->dst_reg, SRC_OP_1, insn_idx); if (err) return err; @@ -3303,7 +3445,7 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn) } /* check dest operand */ - err = check_reg_arg(env, insn->dst_reg, DST_OP_NO_MARK); + err = check_reg_arg(env, insn->dst_reg, dst_type, insn_idx); if (err) return err; @@ -3759,6 +3901,7 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env, struct bpf_reg_state *regs = this_branch->frame[this_branch->curframe]->regs; struct bpf_reg_state *dst_reg, *other_branch_regs; u8 opcode = BPF_OP(insn->code); + int old_insn_idx = *insn_idx; int err; if (opcode > BPF_JSLE) { @@ -3773,7 +3916,8 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env, } /* check src1 operand */ - err = check_reg_arg(env, insn->src_reg, SRC_OP); + err = check_reg_arg(env, insn->src_reg, SRC_OP64_0, + old_insn_idx); if (err) return err; @@ -3790,7 +3934,7 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env, } /* check src2 operand */ - err = check_reg_arg(env, insn->dst_reg, SRC_OP); + err = check_reg_arg(env, insn->dst_reg, SRC_OP64_1, old_insn_idx); if (err) return err; @@ -3896,7 +4040,8 @@ static int check_ld_imm(struct bpf_verifier_env *env, struct bpf_insn *insn) return -EINVAL; } - err = check_reg_arg(env, insn->dst_reg, DST_OP); + err = check_reg_arg(env, insn->dst_reg, U_DST_OP, + insn - env->prog->insnsi); if (err) return err; @@ -3945,9 +4090,9 @@ static bool may_access_skb(enum bpf_prog_type type) */ static int check_ld_abs(struct bpf_verifier_env *env, struct bpf_insn *insn) { + int i, err, insn_idx = insn - 
 	env->prog->insnsi;
 	struct bpf_reg_state *regs = cur_regs(env);
 	u8 mode = BPF_MODE(insn->code);
-	int i, err;

 	if (!may_access_skb(env->prog->type)) {
 		verbose(env, "BPF_LD_[ABS|IND] instructions not allowed for this program type\n");
@@ -3979,7 +4124,7 @@ static int check_ld_abs(struct bpf_verifier_env *env, struct bpf_insn *insn)
 	}

 	/* check whether implicit source operand (register R6) is readable */
-	err = check_reg_arg(env, BPF_REG_6, SRC_OP);
+	err = check_reg_arg(env, BPF_REG_6, SRC_OP64_1, insn_idx);
 	if (err)
 		return err;
@@ -3991,7 +4136,7 @@ static int check_ld_abs(struct bpf_verifier_env *env, struct bpf_insn *insn)

 	if (mode == BPF_IND) {
 		/* check explicit source operand */
-		err = check_reg_arg(env, insn->src_reg, SRC_OP);
+		err = check_reg_arg(env, insn->src_reg, SRC_OP_0, insn_idx);
 		if (err)
 			return err;
 	}
@@ -3999,7 +4144,7 @@ static int check_ld_abs(struct bpf_verifier_env *env, struct bpf_insn *insn)
 	/* reset caller saved regs to unreadable */
 	for (i = 0; i < CALLER_SAVED_REGS; i++) {
 		mark_reg_not_init(env, regs, caller_saved[i]);
-		check_reg_arg(env, caller_saved[i], DST_OP_NO_MARK);
+		check_reg_arg(env, caller_saved[i], DST_OP_NO_MARK, -1);
 	}

 	/* mark destination R0 register as readable, since it contains
@@ -4091,7 +4236,6 @@ enum {

 #define STATE_LIST_MARK ((struct bpf_verifier_state_list *) -1L)

-static int *insn_stack;	/* stack of insns to process */
 static int cur_stack;	/* current stack index */
 static int *insn_state;
@@ -4554,10 +4698,11 @@ static bool states_equal(struct bpf_verifier_env *env,
  */
 static int propagate_liveness(struct bpf_verifier_env *env,
 			      const struct bpf_verifier_state *vstate,
-			      struct bpf_verifier_state *vparent)
+			      struct bpf_verifier_state *vparent, int insn_idx)
 {
-	int i, frame, err = 0;
+	struct bpf_reg_state *regs, *parent_regs;
 	struct bpf_func_state *state, *parent;
+	int i, frame, err = 0;

 	if (vparent->curframe != vstate->curframe) {
 		WARN(1, "propagate_live: parent frame %d current frame %d\n",
@@ -4566,13 +4711,24 @@ static int propagate_liveness(struct bpf_verifier_env *env,
 	}
 	/* Propagate read liveness of registers... */
 	BUILD_BUG_ON(BPF_REG_FP + 1 != MAX_BPF_REG);
+	parent_regs = vparent->frame[vparent->curframe]->regs;
+	regs = vstate->frame[vparent->curframe]->regs;
 	/* We don't need to worry about FP liveness because it's read-only */
 	for (i = 0; i < BPF_REG_FP; i++) {
-		if (vparent->frame[vparent->curframe]->regs[i].live & REG_LIVE_READ)
+		if (!(parent_regs[i].live & REG_LIVE_READ64) &&
+		    regs[i].live & REG_LIVE_READ64) {
+			err = mark_reg_read(env, &regs[i], &parent_regs[i],
+					    true);
+			if (err)
+				return err;
 			continue;
-		if (vstate->frame[vstate->curframe]->regs[i].live & REG_LIVE_READ) {
-			err = mark_reg_read(env, &vstate->frame[vstate->curframe]->regs[i],
-					    &vparent->frame[vstate->curframe]->regs[i]);
+		}
+
+		if (parent_regs[i].live & REG_LIVE_READ)
+			continue;
+		if (regs[i].live & REG_LIVE_READ) {
+			err = mark_reg_read(env, &regs[i], &parent_regs[i],
+					    false);
 			if (err)
 				return err;
 		}
@@ -4588,7 +4744,8 @@ static int propagate_liveness(struct bpf_verifier_env *env,
 			continue;
 		if (state->stack[i].spilled_ptr.live & REG_LIVE_READ)
 			mark_reg_read(env, &state->stack[i].spilled_ptr,
-				      &parent->stack[i].spilled_ptr);
+				      &parent->stack[i].spilled_ptr,
+				      false);
 		}
 	}
 	return err;
@@ -4620,7 +4777,7 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
 			 * they'll be immediately forgotten as we're pruning
 			 * this state and will pop a new one.
 			 */
-			err = propagate_liveness(env, &sl->state, cur);
+			err = propagate_liveness(env, &sl->state, cur, insn_idx);
 			if (err)
 				return err;
 			return 1;
@@ -4699,6 +4856,12 @@ static int do_check(struct bpf_verifier_env *env)
 			BPF_MAIN_FUNC /* callsite */,
 			0 /* frameno */,
 			0 /* subprogno, zero == main subprog */);
+
+	/* insn_stack will be used by propagate_64bit_usage. */
+	insn_stack = kcalloc(insn_cnt * 2, sizeof(int), GFP_KERNEL);
+	if (!insn_stack)
+		return -ENOMEM;
+
 	insn_idx = 0;
 	for (;;) {
 		struct bpf_insn *insn;
@@ -4779,11 +4942,13 @@ static int do_check(struct bpf_verifier_env *env)
 			/* check for reserved fields is already done */

 			/* check src operand */
-			err = check_reg_arg(env, insn->src_reg, SRC_OP);
+			err = check_reg_arg(env, insn->src_reg, SRC_OP64_0,
+					    insn_idx);
 			if (err)
 				return err;

-			err = check_reg_arg(env, insn->dst_reg, DST_OP_NO_MARK);
+			err = check_reg_arg(env, insn->dst_reg,
+					    U_DST_OP_NO_MARK, insn_idx);
 			if (err)
 				return err;

@@ -4823,6 +4988,12 @@ static int do_check(struct bpf_verifier_env *env)
 		} else if (class == BPF_STX) {
 			enum bpf_reg_type *prev_dst_type, dst_reg_type;
+			enum reg_arg_type source_type;
+
+			/* Let the back-end decide which insn to use according
+			 * to store width.
+			 */
+			env->insn_aux_data[insn_idx].full_ref = true;

 			if (BPF_MODE(insn->code) == BPF_XADD) {
 				err = check_xadd(env, insn_idx, insn);
@@ -4832,12 +5003,17 @@ static int do_check(struct bpf_verifier_env *env)
 				continue;
 			}

+			source_type = BPF_SIZE(insn->code) == BPF_DW ?
+				      SRC_OP64_0 : SRC_OP_0;
+
 			/* check src1 operand */
-			err = check_reg_arg(env, insn->src_reg, SRC_OP);
+			err = check_reg_arg(env, insn->src_reg, source_type,
+					    insn_idx);
 			if (err)
 				return err;
 			/* check src2 operand */
-			err = check_reg_arg(env, insn->dst_reg, SRC_OP);
+			err = check_reg_arg(env, insn->dst_reg, SRC_OP64_1,
+					    insn_idx);
 			if (err)
 				return err;

@@ -4867,8 +5043,12 @@ static int do_check(struct bpf_verifier_env *env)
 				verbose(env, "BPF_ST uses reserved fields\n");
 				return -EINVAL;
 			}
+
+			env->insn_aux_data[insn_idx].full_ref = true;
+
 			/* check src operand */
-			err = check_reg_arg(env, insn->dst_reg, SRC_OP);
+			err = check_reg_arg(env, insn->dst_reg, SRC_OP64_0,
+					    insn_idx);
 			if (err)
 				return err;

@@ -4888,6 +5068,12 @@ static int do_check(struct bpf_verifier_env *env)
 		} else if (class == BPF_JMP) {
 			u8 opcode = BPF_OP(insn->code);

+			/* TODO: could potentially use range info to check if
+			 * the comparison setting condition could be done on
+			 * 32-bit sub-register.
+			 */
+			env->insn_aux_data[insn_idx].full_ref = true;
+
 			if (opcode == BPF_CALL) {
 				if (BPF_SRC(insn->code) != BPF_K ||
 				    insn->off != 0 ||
@@ -4918,6 +5104,9 @@ static int do_check(struct bpf_verifier_env *env)
 				continue;

 			} else if (opcode == BPF_EXIT) {
+				bool is_main_exit;
+				u32 frame_idx;
+
 				if (BPF_SRC(insn->code) != BPF_K ||
 				    insn->imm != 0 ||
 				    insn->src_reg != BPF_REG_0 ||
@@ -4926,7 +5115,8 @@ static int do_check(struct bpf_verifier_env *env)
 					return -EINVAL;
 				}

-				if (state->curframe) {
+				frame_idx = state->curframe;
+				if (frame_idx) {
 					/* exit from nested function */
 					prev_insn_idx = insn_idx;
 					err = prepare_func_exit(env, &insn_idx);
@@ -4942,7 +5132,12 @@ static int do_check(struct bpf_verifier_env *env)
 				 * of bpf_exit, which means that program wrote
 				 * something into it earlier
 				 */
-				err = check_reg_arg(env, BPF_REG_0, SRC_OP);
+				is_main_exit =
+					!state->frame[frame_idx]->subprogno;
+				err = check_reg_arg(env, BPF_REG_0,
+						    is_main_exit ?
+						    SRC_OP64_0 : SRC_OP_0,
+						    insn_idx);
 				if (err)
 					return err;

@@ -5267,6 +5462,13 @@ static void sanitize_dead_code(struct bpf_verifier_env *env)
 	int i;

 	for (i = 0; i < insn_cnt; i++) {
+		/* Demonstration the usage of "full_ref" info. We could rewrite
+		 * BPF_ALU64 into BPF_ALU. 64-bit architecture could
+		 * then automatically get more 32-bit sub-register.
+		 */
+		if (!aux_data[i].full_ref &&
+		    BPF_CLASS(insn[i].code) == BPF_ALU64)
+			insn[i].code = (insn[i].code & ~0x7) | BPF_ALU;
 		if (aux_data[i].seen)
 			continue;
 		memcpy(insn + i, &trap, sizeof(trap));
@@ -5910,11 +6112,13 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr)
 	if (ret < 0)
 		goto skip_full_check;

+	insn_stack = NULL;
 	ret = do_check(env);
 	if (env->cur_state) {
 		free_verifier_state(env->cur_state, true);
 		env->cur_state = NULL;
 	}
+	kfree(insn_stack);

skip_full_check:
 	while (!pop_stack(env, NULL, NULL));

^ permalink raw reply related	[flat|nested] 12+ messages in thread
* Re: [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification
  2018-10-03 15:36         ` Jiong Wang
  2018-10-03 15:59           ` Alexei Starovoitov
@ 2018-10-04 17:35         ` Edward Cree
  1 sibling, 0 replies; 12+ messages in thread
From: Edward Cree @ 2018-10-04 17:35 UTC (permalink / raw)
  To: Jiong Wang, ast, daniel; +Cc: netdev

On 03/10/18 16:36, Jiong Wang wrote:
> On 28/09/2018 14:36, Edward Cree wrote:
> > But what you've described sounds interesting; perhaps it would also
> > help later with loop-variable handling?
>
> Haven't considered how to use this for loop-variable handling, guess you mean
> applying what I have described to your previous loop detection RFC? I will look
> into your RFC later.
Tbh I was thinking more of John Fastabend's version (I'm not sure if he ever
got round to posting patches, but he discussed the design towards the end of
https://www.mail-archive.com/netdev@vger.kernel.org/msg216285.html ) which is
building 'proper compiler data structures' and thus might be interested in
proper use-def chains.
(Or it might not; I'm not really a compiler-guru so it's not immediately
obvious to me.)
My approach was much less interested in the 'provenance' of the induction
variable, just that it was increasing appropriately, so use-def chains are
not really relevant to it.

-Ed

^ permalink raw reply	[flat|nested] 12+ messages in thread
end of thread, other threads:[~2018-10-09 3:32 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-08-22 19:00 [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification Edward Cree
2018-08-22 19:02 ` [RFC PATCH v2 bpf-next 1/2] bpf/verifier: per-register parent pointers Edward Cree
2018-08-22 19:02 ` [RFC PATCH v2 bpf-next 2/2] bpf/verifier: display non-spill stack slot types in print_verifier_state Edward Cree
2018-08-30  2:18 ` [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification Alexei Starovoitov
2018-08-31 15:50   ` Edward Cree
2018-09-26 22:16     ` Jiong Wang
2018-09-28 13:36       ` Edward Cree
2018-10-03 15:36         ` Jiong Wang
2018-10-03 15:59           ` Alexei Starovoitov
2018-10-03 16:53             ` Jiong Wang
2018-10-08 20:18               ` Jiong Wang
2018-10-04 17:35           ` Edward Cree