* [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification @ 2018-08-22 19:00 Edward Cree 2018-08-22 19:02 ` [RFC PATCH v2 bpf-next 1/2] bpf/verifier: per-register parent pointers Edward Cree ` (3 more replies) 0 siblings, 4 replies; 12+ messages in thread From: Edward Cree @ 2018-08-22 19:00 UTC (permalink / raw) To: ast, daniel; +Cc: netdev The first patch is a simplification of register liveness tracking by using a separate parentage chain for each register and stack slot, thus avoiding the need for logic to handle callee-saved registers when applying read marks. In the future this idea may be extended to form use-def chains. The second patch adds information about misc/zero data on the stack to the state dumps emitted to the log at various points; this information was found essential in debugging the first patch, and may be useful elsewhere. Edward Cree (2): bpf/verifier: per-register parent pointers bpf/verifier: display non-spill stack slot types in print_verifier_state include/linux/bpf_verifier.h | 8 +- kernel/bpf/verifier.c | 216 ++++++++++++++----------------------------- 2 files changed, 72 insertions(+), 152 deletions(-) ^ permalink raw reply [flat|nested] 12+ messages in thread
* [RFC PATCH v2 bpf-next 1/2] bpf/verifier: per-register parent pointers 2018-08-22 19:00 [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification Edward Cree @ 2018-08-22 19:02 ` Edward Cree 2018-08-22 19:02 ` [RFC PATCH v2 bpf-next 2/2] bpf/verifier: display non-spill stack slot types in print_verifier_state Edward Cree ` (2 subsequent siblings) 3 siblings, 0 replies; 12+ messages in thread From: Edward Cree @ 2018-08-22 19:02 UTC (permalink / raw) To: ast, daniel; +Cc: netdev By giving each register its own liveness chain, we elide the skip_callee() logic. Instead, each register's parent is the state it inherits from; both check_func_call() and prepare_func_exit() automatically connect reg states to the correct chain since when they copy the reg state across (r1-r5 into the callee as args, and r0 out as the return value) they also copy the parent pointer. Signed-off-by: Edward Cree <ecree@solarflare.com> --- include/linux/bpf_verifier.h | 8 +- kernel/bpf/verifier.c | 184 +++++++++++-------------------------------- 2 files changed, 47 insertions(+), 145 deletions(-) diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h index 38b04f559ad3..b42b60a83e19 100644 --- a/include/linux/bpf_verifier.h +++ b/include/linux/bpf_verifier.h @@ -41,6 +41,7 @@ enum bpf_reg_liveness { }; struct bpf_reg_state { + /* Ordering of fields matters. See states_equal() */ enum bpf_reg_type type; union { /* valid when type == PTR_TO_PACKET */ @@ -59,7 +60,6 @@ struct bpf_reg_state { * came from, when one is tested for != NULL. */ u32 id; - /* Ordering of fields matters. See states_equal() */ /* For scalar types (SCALAR_VALUE), this represents our knowledge of * the actual value. 
* For pointer types, this represents the variable part of the offset @@ -76,15 +76,15 @@ struct bpf_reg_state { s64 smax_value; /* maximum possible (s64)value */ u64 umin_value; /* minimum possible (u64)value */ u64 umax_value; /* maximum possible (u64)value */ + /* parentage chain for liveness checking */ + struct bpf_reg_state *parent; /* Inside the callee two registers can be both PTR_TO_STACK like * R1=fp-8 and R2=fp-8, but one of them points to this function stack * while another to the caller's stack. To differentiate them 'frameno' * is used which is an index in bpf_verifier_state->frame[] array * pointing to bpf_func_state. - * This field must be second to last, for states_equal() reasons. */ u32 frameno; - /* This field must be last, for states_equal() reasons. */ enum bpf_reg_liveness live; }; @@ -107,7 +107,6 @@ struct bpf_stack_state { */ struct bpf_func_state { struct bpf_reg_state regs[MAX_BPF_REG]; - struct bpf_verifier_state *parent; /* index of call instruction that called into this func */ int callsite; /* stack frame number of this function state from pov of @@ -129,7 +128,6 @@ struct bpf_func_state { struct bpf_verifier_state { /* call stack tracking */ struct bpf_func_state *frame[MAX_CALL_FRAMES]; - struct bpf_verifier_state *parent; u32 curframe; }; diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index ca90679a7fe5..b11d45916fff 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -380,9 +380,9 @@ static int copy_stack_state(struct bpf_func_state *dst, /* do_check() starts with zero-sized stack in struct bpf_verifier_state to * make it consume minimal amount of memory. check_stack_write() access from * the program calls into realloc_func_state() to grow the stack size. - * Note there is a non-zero 'parent' pointer inside bpf_verifier_state - * which this function copies over. 
It points to previous bpf_verifier_state - * which is never reallocated + * Note there is a non-zero parent pointer inside each reg of bpf_verifier_state + * which this function copies over. It points to corresponding reg in previous + * bpf_verifier_state which is never reallocated */ static int realloc_func_state(struct bpf_func_state *state, int size, bool copy_old) @@ -466,7 +466,6 @@ static int copy_verifier_state(struct bpf_verifier_state *dst_state, dst_state->frame[i] = NULL; } dst_state->curframe = src->curframe; - dst_state->parent = src->parent; for (i = 0; i <= src->curframe; i++) { dst = dst_state->frame[i]; if (!dst) { @@ -732,6 +731,7 @@ static void init_reg_state(struct bpf_verifier_env *env, for (i = 0; i < MAX_BPF_REG; i++) { mark_reg_not_init(env, regs, i); regs[i].live = REG_LIVE_NONE; + regs[i].parent = NULL; } /* frame pointer */ @@ -876,74 +876,21 @@ static int check_subprogs(struct bpf_verifier_env *env) return 0; } -static -struct bpf_verifier_state *skip_callee(struct bpf_verifier_env *env, - const struct bpf_verifier_state *state, - struct bpf_verifier_state *parent, - u32 regno) -{ - struct bpf_verifier_state *tmp = NULL; - - /* 'parent' could be a state of caller and - * 'state' could be a state of callee. In such case - * parent->curframe < state->curframe - * and it's ok for r1 - r5 registers - * - * 'parent' could be a callee's state after it bpf_exit-ed. 
- * In such case parent->curframe > state->curframe - * and it's ok for r0 only - */ - if (parent->curframe == state->curframe || - (parent->curframe < state->curframe && - regno >= BPF_REG_1 && regno <= BPF_REG_5) || - (parent->curframe > state->curframe && - regno == BPF_REG_0)) - return parent; - - if (parent->curframe > state->curframe && - regno >= BPF_REG_6) { - /* for callee saved regs we have to skip the whole chain - * of states that belong to callee and mark as LIVE_READ - * the registers before the call - */ - tmp = parent; - while (tmp && tmp->curframe != state->curframe) { - tmp = tmp->parent; - } - if (!tmp) - goto bug; - parent = tmp; - } else { - goto bug; - } - return parent; -bug: - verbose(env, "verifier bug regno %d tmp %p\n", regno, tmp); - verbose(env, "regno %d parent frame %d current frame %d\n", - regno, parent->curframe, state->curframe); - return NULL; -} - +/* Parentage chain of this register (or stack slot) should take care of all + * issues like callee-saved registers, stack slot allocation time, etc. + */ static int mark_reg_read(struct bpf_verifier_env *env, - const struct bpf_verifier_state *state, - struct bpf_verifier_state *parent, - u32 regno) + const struct bpf_reg_state *state, + struct bpf_reg_state *parent) { bool writes = parent == state->parent; /* Observe write marks */ - if (regno == BPF_REG_FP) - /* We don't need to worry about FP liveness because it's read-only */ - return 0; - while (parent) { /* if read wasn't screened by an earlier write ... */ - if (writes && state->frame[state->curframe]->regs[regno].live & REG_LIVE_WRITTEN) + if (writes && state->live & REG_LIVE_WRITTEN) break; - parent = skip_callee(env, state, parent, regno); - if (!parent) - return -EFAULT; /* ... 
then we depend on parent's value */ - parent->frame[parent->curframe]->regs[regno].live |= REG_LIVE_READ; + parent->live |= REG_LIVE_READ; state = parent; parent = state->parent; writes = true; @@ -969,7 +916,10 @@ static int check_reg_arg(struct bpf_verifier_env *env, u32 regno, verbose(env, "R%d !read_ok\n", regno); return -EACCES; } - return mark_reg_read(env, vstate, vstate->parent, regno); + /* We don't need to worry about FP liveness because it's read-only */ + if (regno != BPF_REG_FP) + return mark_reg_read(env, ®s[regno], + regs[regno].parent); } else { /* check whether register used as dest operand can be written to */ if (regno == BPF_REG_FP) { @@ -1080,8 +1030,8 @@ static int check_stack_write(struct bpf_verifier_env *env, } else { u8 type = STACK_MISC; - /* regular write of data into stack */ - state->stack[spi].spilled_ptr = (struct bpf_reg_state) {}; + /* regular write of data into stack destroys any spilled ptr */ + state->stack[spi].spilled_ptr.type = NOT_INIT; /* only mark the slot as written if all 8 bytes were written * otherwise read propagation may incorrectly stop too soon @@ -1106,61 +1056,6 @@ static int check_stack_write(struct bpf_verifier_env *env, return 0; } -/* registers of every function are unique and mark_reg_read() propagates - * the liveness in the following cases: - * - from callee into caller for R1 - R5 that were used as arguments - * - from caller into callee for R0 that used as result of the call - * - from caller to the same caller skipping states of the callee for R6 - R9, - * since R6 - R9 are callee saved by implicit function prologue and - * caller's R6 != callee's R6, so when we propagate liveness up to - * parent states we need to skip callee states for R6 - R9. 
- * - * stack slot marking is different, since stacks of caller and callee are - * accessible in both (since caller can pass a pointer to caller's stack to - * callee which can pass it to another function), hence mark_stack_slot_read() - * has to propagate the stack liveness to all parent states at given frame number. - * Consider code: - * f1() { - * ptr = fp - 8; - * *ptr = ctx; - * call f2 { - * .. = *ptr; - * } - * .. = *ptr; - * } - * First *ptr is reading from f1's stack and mark_stack_slot_read() has - * to mark liveness at the f1's frame and not f2's frame. - * Second *ptr is also reading from f1's stack and mark_stack_slot_read() has - * to propagate liveness to f2 states at f1's frame level and further into - * f1 states at f1's frame level until write into that stack slot - */ -static void mark_stack_slot_read(struct bpf_verifier_env *env, - const struct bpf_verifier_state *state, - struct bpf_verifier_state *parent, - int slot, int frameno) -{ - bool writes = parent == state->parent; /* Observe write marks */ - - while (parent) { - if (parent->frame[frameno]->allocated_stack <= slot * BPF_REG_SIZE) - /* since LIVE_WRITTEN mark is only done for full 8-byte - * write the read marks are conservative and parent - * state may not even have the stack allocated. In such case - * end the propagation, since the loop reached beginning - * of the function - */ - break; - /* if read wasn't screened by an earlier write ... */ - if (writes && state->frame[frameno]->stack[slot].spilled_ptr.live & REG_LIVE_WRITTEN) - break; - /* ... 
then we depend on parent's value */ - parent->frame[frameno]->stack[slot].spilled_ptr.live |= REG_LIVE_READ; - state = parent; - parent = state->parent; - writes = true; - } -} - static int check_stack_read(struct bpf_verifier_env *env, struct bpf_func_state *reg_state /* func where register points to */, int off, int size, int value_regno) @@ -1198,8 +1093,8 @@ static int check_stack_read(struct bpf_verifier_env *env, */ state->regs[value_regno].live |= REG_LIVE_WRITTEN; } - mark_stack_slot_read(env, vstate, vstate->parent, spi, - reg_state->frameno); + mark_reg_read(env, ®_state->stack[spi].spilled_ptr, + reg_state->stack[spi].spilled_ptr.parent); return 0; } else { int zeros = 0; @@ -1215,8 +1110,8 @@ static int check_stack_read(struct bpf_verifier_env *env, off, i, size); return -EACCES; } - mark_stack_slot_read(env, vstate, vstate->parent, spi, - reg_state->frameno); + mark_reg_read(env, ®_state->stack[spi].spilled_ptr, + reg_state->stack[spi].spilled_ptr.parent); if (value_regno >= 0) { if (zeros == size) { /* any size read into register is zero extended, @@ -1908,8 +1803,8 @@ static int check_stack_boundary(struct bpf_verifier_env *env, int regno, /* reading any byte out of 8-byte 'spill_slot' will cause * the whole slot to be marked as 'read' */ - mark_stack_slot_read(env, env->cur_state, env->cur_state->parent, - spi, state->frameno); + mark_reg_read(env, &state->stack[spi].spilled_ptr, + state->stack[spi].spilled_ptr.parent); } return update_stack_depth(env, state, off); } @@ -2366,11 +2261,13 @@ static int check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn, state->curframe + 1 /* frameno within this callchain */, subprog /* subprog number within this prog */); - /* copy r1 - r5 args that callee can access */ + /* copy r1 - r5 args that callee can access. 
The copy includes parent + * pointers, which connects us up to the liveness chain + */ for (i = BPF_REG_1; i <= BPF_REG_5; i++) callee->regs[i] = caller->regs[i]; - /* after the call regsiters r0 - r5 were scratched */ + /* after the call registers r0 - r5 were scratched */ for (i = 0; i < CALLER_SAVED_REGS; i++) { mark_reg_not_init(env, caller->regs, caller_saved[i]); check_reg_arg(env, caller_saved[i], DST_OP_NO_MARK); @@ -4370,7 +4267,7 @@ static bool regsafe(struct bpf_reg_state *rold, struct bpf_reg_state *rcur, /* explored state didn't use this */ return true; - equal = memcmp(rold, rcur, offsetof(struct bpf_reg_state, frameno)) == 0; + equal = memcmp(rold, rcur, offsetof(struct bpf_reg_state, parent)) == 0; if (rold->type == PTR_TO_STACK) /* two stack pointers are equal only if they're pointing to @@ -4603,7 +4500,7 @@ static bool states_equal(struct bpf_verifier_env *env, * equivalent state (jump target or such) we didn't arrive by the straight-line * code, so read marks in the state must propagate to the parent regardless * of the state's write marks. That's what 'parent == state->parent' comparison - * in mark_reg_read() and mark_stack_slot_read() is for. + * in mark_reg_read() is for. 
*/ static int propagate_liveness(struct bpf_verifier_env *env, const struct bpf_verifier_state *vstate, @@ -4624,7 +4521,8 @@ static int propagate_liveness(struct bpf_verifier_env *env, if (vparent->frame[vparent->curframe]->regs[i].live & REG_LIVE_READ) continue; if (vstate->frame[vstate->curframe]->regs[i].live & REG_LIVE_READ) { - err = mark_reg_read(env, vstate, vparent, i); + err = mark_reg_read(env, &vstate->frame[vstate->curframe]->regs[i], + &vparent->frame[vstate->curframe]->regs[i]); if (err) return err; } @@ -4639,7 +4537,8 @@ static int propagate_liveness(struct bpf_verifier_env *env, if (parent->stack[i].spilled_ptr.live & REG_LIVE_READ) continue; if (state->stack[i].spilled_ptr.live & REG_LIVE_READ) - mark_stack_slot_read(env, vstate, vparent, i, frame); + mark_reg_read(env, &state->stack[i].spilled_ptr, + &parent->stack[i].spilled_ptr); } } return err; @@ -4649,7 +4548,7 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx) { struct bpf_verifier_state_list *new_sl; struct bpf_verifier_state_list *sl; - struct bpf_verifier_state *cur = env->cur_state; + struct bpf_verifier_state *cur = env->cur_state, *new; int i, j, err; sl = env->explored_states[insn_idx]; @@ -4691,16 +4590,18 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx) return -ENOMEM; /* add new state to the head of linked list */ - err = copy_verifier_state(&new_sl->state, cur); + new = &new_sl->state; + err = copy_verifier_state(new, cur); if (err) { - free_verifier_state(&new_sl->state, false); + free_verifier_state(new, false); kfree(new_sl); return err; } new_sl->next = env->explored_states[insn_idx]; env->explored_states[insn_idx] = new_sl; /* connect new state to parentage chain */ - cur->parent = &new_sl->state; + for (i = 0; i < BPF_REG_FP; i++) + cur_regs(env)[i].parent = &new->frame[new->curframe]->regs[i]; /* clear write marks in current state: the writes we did are not writes * our child did, so they don't screen off its reads from 
us. * (There are no read marks in current state, because reads always mark @@ -4713,9 +4614,13 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx) /* all stack frames are accessible from callee, clear them all */ for (j = 0; j <= cur->curframe; j++) { struct bpf_func_state *frame = cur->frame[j]; + struct bpf_func_state *newframe = new->frame[j]; - for (i = 0; i < frame->allocated_stack / BPF_REG_SIZE; i++) + for (i = 0; i < frame->allocated_stack / BPF_REG_SIZE; i++) { frame->stack[i].spilled_ptr.live = REG_LIVE_NONE; + frame->stack[i].spilled_ptr.parent = + &newframe->stack[i].spilled_ptr; + } } return 0; } @@ -4734,7 +4639,6 @@ static int do_check(struct bpf_verifier_env *env) if (!state) return -ENOMEM; state->curframe = 0; - state->parent = NULL; state->frame[0] = kzalloc(sizeof(struct bpf_func_state), GFP_KERNEL); if (!state->frame[0]) { kfree(state); ^ permalink raw reply related [flat|nested] 12+ messages in thread
* [RFC PATCH v2 bpf-next 2/2] bpf/verifier: display non-spill stack slot types in print_verifier_state 2018-08-22 19:00 [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification Edward Cree 2018-08-22 19:02 ` [RFC PATCH v2 bpf-next 1/2] bpf/verifier: per-register parent pointers Edward Cree @ 2018-08-22 19:02 ` Edward Cree 2018-08-30 2:18 ` [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification Alexei Starovoitov 2018-09-26 22:16 ` Jiong Wang 3 siblings, 0 replies; 12+ messages in thread From: Edward Cree @ 2018-08-22 19:02 UTC (permalink / raw) To: ast, daniel; +Cc: netdev If a stack slot does not hold a spilled register (STACK_SPILL), then each of its eight bytes could potentially have a different slot_type. This information can be important for debugging, and previously we either did not print anything for the stack slot, or just printed fp-X=0 in the case where its first byte was STACK_ZERO. Instead, print eight characters with either 0 (STACK_ZERO), m (STACK_MISC) or ? (STACK_INVALID) for any stack slot which is neither STACK_SPILL nor entirely STACK_INVALID. 
Signed-off-by: Edward Cree <ecree@solarflare.com> --- kernel/bpf/verifier.c | 32 +++++++++++++++++++++++++------- 1 file changed, 25 insertions(+), 7 deletions(-) diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index b11d45916fff..2f4b52cf864c 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -263,6 +263,13 @@ static const char * const reg_type_str[] = { [PTR_TO_PACKET_END] = "pkt_end", }; +static char slot_type_char[] = { + [STACK_INVALID] = '?', + [STACK_SPILL] = 'r', + [STACK_MISC] = 'm', + [STACK_ZERO] = '0', +}; + static void print_liveness(struct bpf_verifier_env *env, enum bpf_reg_liveness live) { @@ -349,15 +356,26 @@ static void print_verifier_state(struct bpf_verifier_env *env, } } for (i = 0; i < state->allocated_stack / BPF_REG_SIZE; i++) { - if (state->stack[i].slot_type[0] == STACK_SPILL) { - verbose(env, " fp%d", - (-i - 1) * BPF_REG_SIZE); - print_liveness(env, state->stack[i].spilled_ptr.live); + char types_buf[BPF_REG_SIZE + 1]; + bool valid = false; + int j; + + for (j = 0; j < BPF_REG_SIZE; j++) { + if (state->stack[i].slot_type[j] != STACK_INVALID) + valid = true; + types_buf[j] = slot_type_char[ + state->stack[i].slot_type[j]]; + } + types_buf[BPF_REG_SIZE] = 0; + if (!valid) + continue; + verbose(env, " fp%d", (-i - 1) * BPF_REG_SIZE); + print_liveness(env, state->stack[i].spilled_ptr.live); + if (state->stack[i].slot_type[0] == STACK_SPILL) verbose(env, "=%s", reg_type_str[state->stack[i].spilled_ptr.type]); - } - if (state->stack[i].slot_type[0] == STACK_ZERO) - verbose(env, " fp%d=0", (-i - 1) * BPF_REG_SIZE); + else + verbose(env, "=%s", types_buf); } verbose(env, "\n"); } ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification 2018-08-22 19:00 [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification Edward Cree 2018-08-22 19:02 ` [RFC PATCH v2 bpf-next 1/2] bpf/verifier: per-register parent pointers Edward Cree 2018-08-22 19:02 ` [RFC PATCH v2 bpf-next 2/2] bpf/verifier: display non-spill stack slot types in print_verifier_state Edward Cree @ 2018-08-30 2:18 ` Alexei Starovoitov 2018-08-31 15:50 ` Edward Cree 2018-09-26 22:16 ` Jiong Wang 3 siblings, 1 reply; 12+ messages in thread From: Alexei Starovoitov @ 2018-08-30 2:18 UTC (permalink / raw) To: Edward Cree; +Cc: ast, daniel, netdev On Wed, Aug 22, 2018 at 08:00:46PM +0100, Edward Cree wrote: > The first patch is a simplification of register liveness tracking by using > a separate parentage chain for each register and stack slot, thus avoiding > the need for logic to handle callee-saved registers when applying read > marks. In the future this idea may be extended to form use-def chains. > The second patch adds information about misc/zero data on the stack to the > state dumps emitted to the log at various points; this information was > found essential in debugging the first patch, and may be useful elsewhere. I think this set is a great improvement in liveness tracking, so despite seeing the issues I applied it to bpf-next. I think it's a better base to continue debugging. In particular:
1. we have an instability issue in the verifier: from time to time the verifier goes on to process an extra 7 instructions on one of the cilium tests. This was happening both before and after this set.
2. there is a nice improvement in the number of processed insns with this set, but the difference I cannot explain, hence it has to be debugged. In theory the liveness rewrite shouldn't cause a difference in processed insns.
If not for the issue 1 I would argue that the issue 2 means that the set has to be debugged before going in, but since the verifier is unstable it's better to debug from this base with this patch set applied (because it greatly simplifies liveness and adds additional debug in patch 2) and once we figure out the issue 1, I hope, the issue 2 will be resolved automatically. The numbers on cilium bpf programs:

                     before1  before2  after1  after2
bpf_lb-DLB_L3.o         2003     2003    2003    2003
bpf_lb-DLB_L4.o         3173     3173    3173    3173
bpf_lb-DUNKNOWN.o       1080     1080    1080    1080
bpf_lxc-DDROP_ALL.o    29587    29587   29587   29587
bpf_lxc-DUNKNOWN.o     37204    37211   36926   36933
bpf_netdev.o           11283    11283   11188   11188
bpf_overlay.o           6679     6679    6679    6679
bpf_lcx_jit.o          39657    39657   39561   39561

notice how bpf_lxc-DUNKNOWN.o fluctuates with +7 before and after. That is issue 1. bpf_lxc-DUNKNOWN.o, bpf_netdev.o, and bpf_lcx_jit.o have small improvements. That is issue 2. To reproduce above numbers clone this repo: https://github.com/4ast/bpf_cilium_test and run .sh. The .o files in there are pretty old cilium bpf programs. I kept them frozen and didn't recompile for more than a year to keep a stable base line and track the progress of the verifier in 'processed insns'. Thanks ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification 2018-08-30 2:18 ` [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification Alexei Starovoitov @ 2018-08-31 15:50 ` Edward Cree 0 siblings, 0 replies; 12+ messages in thread From: Edward Cree @ 2018-08-31 15:50 UTC (permalink / raw) To: Alexei Starovoitov; +Cc: ast, daniel, netdev On 30/08/18 03:18, Alexei Starovoitov wrote: > I think it's a better base to continue debugging. > In particular: > 1. we have instability issue in the verifier. > from time to time the verifier goes to process extra 7 instructions on one > of the cilium tests. This was happening before and after this set. I can't reproduce this; I always get 36926. Can you try recording the verifier log and diff the output between the two cases? > 2. there is a nice improvement in number of processed insns with this set, > but the difference I cannot explain, hence it has to debugged. > In theory the liveness rewrite shouldn't cause the difference in processed insns. I shall attack this with the same methods I used for the other delta. Since that one turned out to be a real bug in the patch, I'm not so sanguine as to dismiss this one as probably connected to issue 1. -Ed ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification 2018-08-22 19:00 [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification Edward Cree ` (2 preceding siblings ...) 2018-08-30 2:18 ` [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification Alexei Starovoitov @ 2018-09-26 22:16 ` Jiong Wang 2018-09-28 13:36 ` Edward Cree 3 siblings, 1 reply; 12+ messages in thread From: Jiong Wang @ 2018-09-26 22:16 UTC (permalink / raw) To: Edward Cree, ast, daniel; +Cc: netdev On 22/08/2018 20:00, Edward Cree wrote: > The first patch is a simplification of register liveness tracking by using > a separate parentage chain for each register and stack slot, thus avoiding > the need for logic to handle callee-saved registers when applying read > marks. In the future this idea may be extended to form use-def chains. Interesting. This could potentially help efficient code-gen for 32-bit architectures; I have been thinking about an implementation along these lines and would like to have a discussion.

Program description
===
It would be good if we could know that an instruction is safe to operate on the low 32-bit sub-register only. If this is true:
- 32-bit arches could save some code-gen.
- 64-bit arches could utilize 32-bit sub-register instructions.

Algorithm
===
Based on the current verifier code-path-walker, it looks to me we could get 32-bit safety information using the following algorithm:
1. assume all instructions operate on the 32-bit sub-register only, initially.
2. link each insn to the insns which define its uses.
3. for a register use, if it is a 64-bit write back to memory, or a comparison-then-jump based on a 64-bit value, or a right shift, then consider it a 64-bit use; all its defining insns, and their parent defining insns, should then be marked as needing the full 64-bit operation.
4. currently, a register read (REG_LIVE_READ) is propagated to the parent state when a path prune happens, and REG_LIVE_WRITTEN screens off such propagation.
We need to propagate the 64-bit mark in a similar way, but REG_LIVE_WRITTEN shouldn't screen off backward 64-bit mark propagation if the written reg is also used as a source reg in the same instruction (that raises REG_LIVE_READ propagation, but does not necessarily indicate a 64-bit read).

Implementation
===
And it seems there could be an implementation based on the current liveness tracking infrastructure without dramatic change.

1. instruction level use->def chain

- a new active_defs array for each call frame, to record which insn the active define of each register comes from. The callee copies active_defs from the caller for argument registers only, and copies the active_def of R0 back to the caller on exit.

    struct bpf_func_state {
            ...
            s16 active_defs[__MAX_BPF_REG];

- new use->def chains for each instruction. One eBPF insn could have two uses at maximum. Also one new boolean "full_ref_def" is added to keep the final 32-bit safety information; it will be true if this instruction needs to operate on the full 64 bits.

    bpf_insn_aux_data {
            ...
            struct {
                    s16 def[2];
                    bool full_ref_def;
            };

2. Inside mark_reg_read, split SRC_OP into SRC0_OP/SRC1_OP/SRC_OP_64/SRC1_OP_64 to indicate whether a register read comes from the first or second use slot of the instruction, and whether the read is of the full 64 bits, which will only be true if the read is a 64-bit write back to memory, a 64-bit comparison-then-jump, or a right shift. Build the use->def chain for any read, but do the 64-bit backward propagation for 64-bit reads only. The propagation sets full_ref_def to true for the def, and parent defs, of the 64-bit read.

3. Need new bpf_reg_liveness enum values: REG_LIVE_READ64, to indicate that there is a 64-bit access to a register in the pruned path, and REG_LIVE_WRITTEN64, to indicate a write that is REG_LIVE_WRITTEN but should not screen off backward propagation of 64-bit read info.
    #define REG_LIVE_NONE      0
    #define REG_LIVE_READ      BIT(0)
    #define REG_LIVE_READ64    BIT(1)
    #define REG_LIVE_WRITTEN   BIT(2)
    #define REG_LIVE_WRITTEN64 BIT(3)

I might have missed something in the above thoughts; I would appreciate reviews and suggestions. I will also send out some concrete patches later. Regards, Jiong ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification 2018-09-26 22:16 ` Jiong Wang @ 2018-09-28 13:36 ` Edward Cree 2018-10-03 15:36 ` Jiong Wang 0 siblings, 1 reply; 12+ messages in thread From: Edward Cree @ 2018-09-28 13:36 UTC (permalink / raw) To: Jiong Wang, ast, daniel; +Cc: netdev On 26/09/18 23:16, Jiong Wang wrote: > On 22/08/2018 20:00, Edward Cree wrote: >> In the future this idea may be extended to form use-def chains. > > 1. instruction level use->def chain > > - new use->def chains for each instruction. one eBPF insn could have two > uses at maximum. I was thinking of something a lot weaker/simpler, just making ld rX, rY copy rY.parent into rX.parent and not read-mark rY (whereas actual arithmetic, pointer deref etc. would still create read marks). But what you've described sounds interesting; perhaps it would also help later with loop-variable handling? -Ed ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification 2018-09-28 13:36 ` Edward Cree @ 2018-10-03 15:36 ` Jiong Wang 2018-10-03 15:59 ` Alexei Starovoitov 2018-10-04 17:35 ` Edward Cree 1 sibling, 2 replies; 12+ messages in thread From: Jiong Wang @ 2018-10-03 15:36 UTC (permalink / raw) To: Edward Cree, ast, daniel; +Cc: netdev On 28/09/2018 14:36, Edward Cree wrote: > On 26/09/18 23:16, Jiong Wang wrote: >> On 22/08/2018 20:00, Edward Cree wrote: >>> In the future this idea may be extended to form use-def chains. >> >> 1. instruction level use->def chain >> >> - new use->def chains for each instruction. one eBPF insn could have two >> uses at maximum. > I was thinking of something a lot weaker/simpler, just making > ld rX, rY > copy rY.parent into rX.parent and not read-mark rY (whereas actual > arithmetic, pointer deref etc. would still create read marks). Thanks for the feedback Edward. > But what you've described sounds interesting; perhaps it would also > help later with loop-variable handling? I haven't considered how to use this for loop-variable handling; I guess you mean applying what I have described to your previous loop detection RFC? I will look into that RFC later. At the moment the design of the use->def chain is mainly to optimize 32-bit code-gen. I was almost satisfied with a local implementation and about to share it to the ML for further discussion. However, when manually checking the optimization result on a testcase of medium size (~1000 eBPF insns) and proper complexity (making sure path prunes etc. are triggered inside the verifier), I found the code-gen doesn't meet my expectation. For example, in the following sequence, the insn at 25 should operate on the full 64 bits, but I found it is marked as 32-bit safe.

25: r7 = 1
26: if r4 > r8 goto +1200 <L>
27: r1 = *(u8 *)(r1 + 0)
28: r1 &= 15
29: r7 = 1
...
L:
1227: r0 = r7
1228: exit

As described in the previous email, the algorithm first assumes all insns are 32-bit safe, then marks insns back as "64-bit" when a 64-bit use is found for an insn. Insn 25 defines r7, which is used at insn 1227, where its value propagates to r0; r0 is then implicitly used at insn 1228, as that is an exit from the main function to external code. For the above example, since we don't know the external use of r0 at 1228 (exit from main to external), r0 is treated as an implicit 64-bit use. Its define is at 1227, so insn 1227 is marked as "64-bit". The "64-bit" attribute should propagate to source register operands through register moves and arithmetic, so r7 at insn 1227 is a "64-bit" use and should make its defining instruction, insn 25, marked as "64-bit". That is my thinking of how insn 25 should be marked, but this hasn't happened. I am still debugging the root cause, but I rather feel the "64-bit" attribute propagation is the issue: it seems to me it can't be nicely integrated into the existing register read/write propagation infrastructure. For example, consider a slightly more complex sequence composed of three states:

State A
...
10: r6 = *(u32 *)(r10 - 4)
11: r7 = *(u32 *)(r10 - 8)
12: *(u64 *)(r10 - 16) = r6
13: *(u64 *)(r10 - 24) = r7

State B
14: r6 += 1
15: r7 += r6
16: *(u32 *)(r10 - 28) = r7

State C
...
17: r3 += r7
18: r4 = 1
19: *(u64 *)(r10 - 32) = r3
20: *(u64 *)(r10 - 40) = r4

State A is the parent of state B, which is the parent of state C. Inside state C, at insn 20, r4 is a 64-bit read/use, so its define at 18 is marked as "64-bit". There is no register source at 18, so the "64-bit" attribute propagation stops there. Then at insn 19, r3 is a 64-bit read/use, so its define at 17 is marked as "64-bit". Insn 17 has two register sources, r3 and r7; they become "64-bit" now, and their definitions should be marked as "64-bit".
Now if the definition of r3 or r7 comes from a parent state, then the parent state should receive a "REG_LIVE_READ64". This is necessary if later another path reaches state C and triggers path prune, in which case that path should know there is "64-bit" use inside state C on some registers, and should use this information to mark "64-bit" insns. If the definition of r3 or r7 is still inside state C, we need to keep walking up the instruction sequence and propagate the "64-bit" attribute upward until it goes beyond state C. The above propagation logic is quite different from the existing register read/write propagation. For the latter, a write just screens off all following reads, and a read propagates directly to its parent if there is no previous write; no instruction analysis is required. I am just describing what I have run into and am trying to resolve; any thoughts and suggestions are appreciated. Regards, Jiong ^ permalink raw reply [flat|nested] 12+ messages in thread
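[Editorial note: the backward "64-bit" marking over use-def chains described above can be pictured as a plain worklist. This is a rough, hypothetical sketch, not verifier code; the struct layout and 1-based indexing are simplifying assumptions loosely modeled on the `def[2]` bookkeeping discussed later in the thread.]

```c
#include <stdbool.h>

#define MAX_INSNS 32

/* Hypothetical, simplified per-insn bookkeeping: each insn records the
 * (1-based) indices of the insns defining its up-to-two register
 * sources, with 0 meaning "no source".
 */
struct insn_aux {
	int def[2];
	bool full_ref;	/* insn must operate on the full 64 bits */
};

/* Walk use-def chains backward from one 64-bit use, marking every
 * contributing insn as 64-bit. No path-prune handling; that is the
 * part the mail says does not fit the read/write infrastructure.
 */
static void mark_64bit_backward(struct insn_aux *aux, int use_def_idx)
{
	int stack[MAX_INSNS];
	int top = 0;

	stack[top++] = use_def_idx;
	while (top) {
		int idx = stack[--top];

		if (!idx || aux[idx - 1].full_ref)
			continue;
		aux[idx - 1].full_ref = true;
		stack[top++] = aux[idx - 1].def[0];
		stack[top++] = aux[idx - 1].def[1];
	}
}
```

On the insn 17-20 example, a 64-bit use of insn 17's result would mark insn 17 and then both of its sources' defining insns, while an unrelated insn stays 32-bit safe.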
* Re: [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification 2018-10-03 15:36 ` Jiong Wang @ 2018-10-03 15:59 ` Alexei Starovoitov 2018-10-03 16:53 ` Jiong Wang 2018-10-04 17:35 ` Edward Cree 1 sibling, 1 reply; 12+ messages in thread From: Alexei Starovoitov @ 2018-10-03 15:59 UTC (permalink / raw) To: Jiong Wang; +Cc: Edward Cree, ast, daniel, netdev On Wed, Oct 03, 2018 at 04:36:31PM +0100, Jiong Wang wrote: > On 28/09/2018 14:36, Edward Cree wrote: > > On 26/09/18 23:16, Jiong Wang wrote: > >> On 22/08/2018 20:00, Edward Cree wrote: > >>> In the future this idea may be extended to form use-def chains. > >> > >> 1. instruction level use->def chain > >> > >> - new use->def chains for each instruction. one eBPF insn could have > two > >> uses at maximum. > > I was thinking of something a lot weaker/simpler, just making > > ld rX, rY > > copy rY.parent into rX.parent and not read-mark rY (whereas actual > > arithmetic, pointer deref etc. would still create read marks). > > Thanks for the feedback Edward. > > > But what you've described sounds interesting; perhaps it would also > > help later with loop-variable handling? > > Haven't considered how to use this for loop-variable handling, guess you > mean > applying what I have described to your previous loop detection RFC? I will > look > into your RFC later. > > At the moment the design of the use->def chain is mainly to optimize 32-bit > code-gen. I was about to satisfied with a local implementation and to share > it > to ML for further discussion. However, when manually check the optimization > result on testcase with medium size (~1000 eBPF insns) and proper complexity > (make sure path prunes etc are triggered inside verifier), I found the > code-gen > doesn't meet my expectation. > > For example, for the following sequence, insn at 25 should operate on > full-64 > bit but I found it is marked as 32-bit safe. 
> > 25: r7 = 1 > 26: if r4 > r8 goto +1200 <L> > 27: r1 = *(u8 *)(r1 + 0) > 28: r1 &= 15 > 29: r7 = 1 > ... > > L: > 1227: r0 = r7 > 1228: exit > > As described at previous email, the algorithm assume all insns are 32-bit > safe > first, then start to insns back to "64-bit" if there is any 64-bit use found > for a insn. > > Insn 25 is defining r7 which is used at the 1227 where its value propagated > to > r0 and then r0 is implicitly used at insn 1228 as it is a exit from main > function to external. > > For above example, as we don't know the external use of r0 at 1228 (exit > from > main to external), so r0 is treated as 64-bit implicit use. The define is at > 1227, so insn 1227 is marked as "64-bit". The "64-bit" attribute should > propagate to source register operand through register move and arithmetic, > so > r7 at insn 1227 is a "64-bit" use and should make its definition > instruction, > insn 25, marked as "64-bit". This is my thinking of how insn 25 should be > marked. all makes sense to me. > Now this hasn't happened. I am still debugging the root cause, but kind of > feel > "64-bit" attribute propagation is the issue, it seems to me it can't be > nicely > integrated into the existing register read/write propagation infrastructure. may be share your patches that modify the liveness propagation? > For > example, for a slightly more complex sequence which is composed of three > states: > > State A > ... > 10: r6 = *(u32 *)(r10 - 4) > 11: r7 = *(u32 *)(r10 - 8) > 12: *(u64 *)(r10 - 16) = r6 > 13: *(u64 *)(r10 - 24) = r7 > > State B > 14: r6 += 1 > 15: r7 += r6 > 16: *(u32 *)(r10 - 28) = r7 > > State C > ... > 17: r3 += r7 > 18: r4 = 1 > 19: *(u64 *)(r10 - 32) = r3 > 20: *(u64 *)(r10 - 40) = r4 > > State A is parent of state B which is parent of state C. > > Inside state C, at insn 20, r4 is a 64-bit read/use, so its define at 18 is > marked as "64-bit". There is no register source at 18, so "64-bit" attribute > propagation is stopped. 
> > Then at insn 19, r3 is a 64-bit read/use, so its define at 17 is marked as > "64-bit" read/use. Insn 17 has two register sources, r3 and r7, they become > "64-bit" now, and their definition should be marked as "64-bit". > > Now if the definition of r3 or r7 comes from parent state, then the parent ... the definition of r3 _and_ r7 ... both need to propagate up with your algo, right? > state > should receive a "REG_LIVE_READ64", this is necessary if later another path > reaches state C and triggers prune path, for which case that path should > know > there is "64-bit" use inside state C on some registers, and should use this > information to mark "64-bit" insn. > > If the definition of r3 or r7 is still inside state C, we need to keep > walking > up the instruction sequences, and propagate "64-bit" attribute upward until > it > goes beyond the state C. > > The above propagation logic is quite different from existing register > read/write > propagation. > For the latter, a write just screen up all following read, and > a > read would propagate directly to its parent is there is not previous write, > no > instruction analysis is required. correct. with such algo REG_LIVE_WRITTEN shouldn't be screening the propagation. I think the patches will discuss the algo. Also I think the initial state of 'everything is 32-bit safe' and making marks to enforce 64-bit-ness is a dangerous algorithmic choice. Why not start at a safer state where everything is 64-bit and work backward to find out which ones can be 32-bit? That would be a safer algo in case there are issues like the one you described above. ^ permalink raw reply [flat|nested] 12+ messages in thread
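[Editorial note: the safer direction Alexei suggests, where everything starts 64-bit and an insn is downgraded only once all of its uses are proven 32-bit, can be pictured as a downward fixed-point iteration. A hypothetical sketch under simplifying assumptions: the enum, structs, and `compute_needs64` name are illustrative, not verifier code, and each insn has at most a few recorded uses.]

```c
#include <stdbool.h>

#define MAX_USES 4

/* Kinds of uses of an insn's result: an inherently 32-bit use, an
 * inherently 64-bit use (e.g. r0 at exit), or a use whose width
 * follows the consuming insn (e.g. a register move).
 */
enum use_kind { USE_NONE, USE_32, USE_64, USE_SAME };

struct use {
	enum use_kind kind;
	int consumer;	/* only meaningful for USE_SAME */
};

struct insn {
	struct use uses[MAX_USES];
};

/* Start with every insn needing 64 bits and only downgrade an insn
 * once none of its uses demands 64 bits; iterate to a fixed point.
 * Anything unproven (including cycles) conservatively stays 64-bit.
 */
static void compute_needs64(const struct insn *insns, int n, bool *needs64)
{
	bool changed = true;
	int i, u;

	for (i = 0; i < n; i++)
		needs64[i] = true;

	while (changed) {
		changed = false;
		for (i = 0; i < n; i++) {
			bool need = false;

			for (u = 0; u < MAX_USES; u++) {
				const struct use *use = &insns[i].uses[u];

				if (use->kind == USE_64 ||
				    (use->kind == USE_SAME &&
				     needs64[use->consumer]))
					need = true;
			}
			if (needs64[i] && !need) {
				needs64[i] = false;
				changed = true;
			}
		}
	}
}
```

The design choice is the one Alexei names: a missed use leaves an insn 64-bit (correct but less optimized), rather than 32-bit (a miscompile).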
* Re: [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification 2018-10-03 15:59 ` Alexei Starovoitov @ 2018-10-03 16:53 ` Jiong Wang 2018-10-08 20:18 ` Jiong Wang 0 siblings, 1 reply; 12+ messages in thread From: Jiong Wang @ 2018-10-03 16:53 UTC (permalink / raw) To: Alexei Starovoitov; +Cc: Edward Cree, ast, daniel, netdev On 03/10/2018 16:59, Alexei Starovoitov wrote: > On Wed, Oct 03, 2018 at 04:36:31PM +0100, Jiong Wang wrote: <snip...> >> Now this hasn't happened. I am still debugging the root cause, but kind of >> feel >> "64-bit" attribute propagation is the issue, it seems to me it can't be >> nicely >> integrated into the existing register read/write propagation infrastructure. > > may be share your patches that modify the liveness propagation? OK, I will share it after some clean up. >> For >> example, for a slightly more complex sequence which is composed of three >> states: >> >> State A >> ... >> 10: r6 = *(u32 *)(r10 - 4) >> 11: r7 = *(u32 *)(r10 - 8) >> 12: *(u64 *)(r10 - 16) = r6 >> 13: *(u64 *)(r10 - 24) = r7 >> >> State B >> 14: r6 += 1 >> 15: r7 += r6 >> 16: *(u32 *)(r10 - 28) = r7 >> >> State C >> ... >> 17: r3 += r7 >> 18: r4 = 1 >> 19: *(u64 *)(r10 - 32) = r3 >> 20: *(u64 *)(r10 - 40) = r4 >> >> State A is parent of state B which is parent of state C. >> >> Inside state C, at insn 20, r4 is a 64-bit read/use, so its define at 18 is >> marked as "64-bit". There is no register source at 18, so "64-bit" attribute >> propagation is stopped. >> >> Then at insn 19, r3 is a 64-bit read/use, so its define at 17 is marked as >> "64-bit" read/use. Insn 17 has two register sources, r3 and r7, they become >> "64-bit" now, and their definition should be marked as "64-bit". >> >> Now if the definition of r3 or r7 comes from parent state, then the parent > > ... the definition of r3 _and_ r7 ... > both need to propagate up with your algo, right? Yes, all sources of insn 17, both r3 and r7. 
>> state >> should receive a "REG_LIVE_READ64", this is necessary if later another path >> reaches state C and triggers prune path, for which case that path should >> know >> there is "64-bit" use inside state C on some registers, and should use this >> information to mark "64-bit" insn. >> >> If the definition of r3 or r7 is still inside state C, we need to keep >> walking >> up the instruction sequences, and propagate "64-bit" attribute upward until >> it >> goes beyond the state C. >> >> The above propagation logic is quite different from existing register >> read/write >> propagation. >> For the latter, a write just screen up all following read, and a >> read would propagate directly to its parent is there is not previous write, >> no instruction analysis is required. > > correct. > with such algo REG_LIVE_WRITTEN shouldn't be screening the propagation. > > I think the patches will discuss the algo. > Also I think the initial state of 'everything is 32-bit safe' > and make marks to enforce 64-bit-ness is a dangerous algorithmic choice. Indeed, and I am actually thinking the same thing ... > Why not to start at a safer state where everything is 64-bit > and work backward to find out which ones can be 32-bit? > That will be safer algo in case there are issues with bit like > you described in above. ... but I failed to find an algo that works with an initial state of "everything is 64-bit". I haven't figured out a way to check that all uses of one definition are 32-bit on all possible code paths, which looks to me like a must for marking one insn as 32-bit safe. ^ permalink raw reply [flat|nested] 12+ messages in thread
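[Editorial note: for contrast with the new 64-bit propagation, the existing read-mark rule both mails refer to, "a write screens off all following reads, a read propagates to its parent", can be sketched standalone. This is a rough simplification of the kernel's mark_reg_read; the two-flag `live` field is illustrative and frame/write-observation details are omitted.]

```c
#include <stddef.h>

#define REG_LIVE_NONE    0
#define REG_LIVE_READ    0x1
#define REG_LIVE_WRITTEN 0x2

/* Minimal model of one register's liveness link to its parent state. */
struct reg_state {
	int live;
	struct reg_state *parent;
};

/* A read climbs the parentage chain marking REG_LIVE_READ, and stops
 * as soon as it reaches a state that wrote the register: the write
 * screens off all older states. No instruction analysis is needed,
 * which is exactly why this machinery alone cannot carry the
 * "64-bit" attribute discussed above.
 */
static void mark_read(struct reg_state *state)
{
	struct reg_state *parent = state->parent;

	while (parent) {
		if (state->live & REG_LIVE_WRITTEN)
			break;
		parent->live |= REG_LIVE_READ;
		state = parent;
		parent = state->parent;
	}
}
```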
* Re: [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification 2018-10-03 16:53 ` Jiong Wang @ 2018-10-08 20:18 ` Jiong Wang 0 siblings, 0 replies; 12+ messages in thread From: Jiong Wang @ 2018-10-08 20:18 UTC (permalink / raw) To: Alexei Starovoitov; +Cc: Edward Cree, ast, daniel, netdev On 03/10/2018 17:53, Jiong Wang wrote: > On 03/10/2018 16:59, Alexei Starovoitov wrote: >> On Wed, Oct 03, 2018 at 04:36:31PM +0100, Jiong Wang wrote: > <snip...> >>> Now this hasn't happened. I am still debugging the root cause, but kind of >>> feel >>> "64-bit" attribute propagation is the issue, it seems to me it can't be >>> nicely >>> integrated into the existing register read/write propagation infrastructure. >> >> may be share your patches that modify the liveness propagation? > > OK, I will share it after some clean up. I have done more experiments on the algo and finally got something that passed my local tests. The bpf selftests passed as well, except several test_verifier failures due to some corner cases I guess; will have a look later. In these tests, x86-64 is using the 32-bit information. In the insn loop inside sanitize_dead_code, if an insn is not marked as 64-bit and it is ALU64, it is changed to ALU32. When enabling tracking using printk, I could see quite a few ALU64 instructions really are rewritten into ALU32, so tests like test_l4lb running OK looks like a positive signal of correctness. The algo separates the problem into two steps. - First, assume there is no code path prune and all code paths will be walked through. In this case, if we could propagate 64-bit info backward along the use-def chains for all paths walked, each insn would be marked as 64-bit correctly. Finishing this step requires building the use-def chain, and it is done in the following way: 1. each insn could have two explicit uses, so add two chain fields in bpf_insn_aux_data. 2. 
we need a finer enum to describe register use, so split SRC_OP into SRC_OP_0, SRC_OP64_0, SRC_OP_1, SRC_OP64_1 to indicate whether the source is the first/second source and whether it is a 64-bit source. 3. create the chain at check_reg_arg, which covers exactly all register use sites. The function to create the chain is link_reg_to_def. 4. when creating the chain, if a source is a 64-bit source, also propagate the information backward along the use-def chain straight away. This is done in mark_reg_read, which further calls the new function "mark_insn_64bit" which does the real job. "mark_insn_64bit" fetches the def insn for the 64-bit source, and further marks the def insns of its sources as 64-bit. This is repeated until the whole use-def chain is consumed. 5. with the use-def chain described above, if there is no code path prune, an insn must have been marked as 64-bit when its result has a 64-bit use. 6. a helper call causes implicit reg use and must conservatively be treated as 64-bit use; a bpf-to-bpf function call has its insns connected by use-def chains, so it doesn't need that conservative assumption. - Second, to handle code path prune, define new liveness enums REG_LIVE_READ64 and REG_LIVE_UNIQUE_WRITTEN. The latter will only be set if reg_arg_type is the new U_DST_OP or U_DST_OP_NO_MARK, and REG_LIVE_READ64 will be set if a 64-bit read is not screened off by REG_LIVE_UNIQUE_WRITTEN. The thought is that 64-bit use info will only be screened off if the dst register is unique among all register operands, meaning not the same as any source. For example, insn 18 below will screen off r4's 64-bit propagation. 17: r3 += r7 18: r4 = 1 19: *(u64 *)(r10 - 32) = r3 20: *(u64 *)(r10 - 40) = r4 So, U_DST_OP/U_DST_OP_NO_MARK have been introduced to differentiate from DST_OP/DST_OP_NO_MARK. Inside check_reg_arg, checks are done on dst_reg, and U_DST_* is passed as reg_arg_type when it is unique. U_DST_* then sets liveness to REG_LIVE_UNIQUE_WRITTEN. 
Inside mark_reg_read, if a 64-bit read is not screened off by REG_LIVE_UNIQUE_WRITTEN, then REG_LIVE_READ64 will be set in the reg_state. REG_LIVE_READ64 further triggers propagating downstream 64-bit uses from the pruned paths into the current path inside propagate_liveness when path prune has happened. Compared with propagating REG_LIVE_READ, propagating REG_LIVE_READ64 needs more work, because one 64-bit read could indicate that more than one register is 64-bit. For example, insn 19 above shows r3 is a 64-bit source, so its definition at insn 17 is 64-bit; then all sources of insn 17 must be 64-bit, meaning both r3 and r7 are 64-bit. Therefore, REG_LIVE_READ64 needs to be propagated on both r3 and r7 upward along the register parentage chain. While walking the def-use chain, we record all such affected reg_states and propagate REG_LIVE_READ64 for all of them. This logic is done inside mark_insn_64bit as well. - In short, the implementation treats the new 64-bit info (REG_LIVE_READ64) propagation the same as REG_LIVE_READ propagation. REG_LIVE_READ triggers more path prune during state equal comparison, while REG_LIVE_READ64 triggers 64-bit insn marking during def-use chain walking. The use-def chain is separate from the reg state parentage chain; the former helps the latter, and reg states that need REG_LIVE_READ64 propagation are collected during use-def chain walking. Please review the following implementation. Thanks. Regards, Jiong diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h index b42b60a..ea22f43 100644 --- a/include/linux/bpf_verifier.h +++ b/include/linux/bpf_verifier.h @@ -34,11 +34,15 @@ * but of the link between it and its parent. See mark_reg_read() and * mark_stack_slot_read() in kernel/bpf/verifier.c. 
*/ -enum bpf_reg_liveness { - REG_LIVE_NONE = 0, /* reg hasn't been read or written this branch */ - REG_LIVE_READ, /* reg was read, so we're sensitive to initial value */ - REG_LIVE_WRITTEN, /* reg was written first, screening off later reads */ -}; +/* Reg hasn't been read or written this branch. */ +#define REG_LIVE_NONE 0 +/* Reg was read, so we're sensitive to initial value. */ +#define REG_LIVE_READ 0x1 +/* The read is also 64-bit. */ +#define REG_LIVE_READ64 0x2 +#define REG_LIVE_WRITTEN 0x4 +/* The write also should screen off 64-bit backward propagation. */ +#define REG_LIVE_UNIQUE_WRITTEN 0x8 struct bpf_reg_state { /* Ordering of fields matters. See states_equal() */ @@ -85,7 +89,11 @@ struct bpf_reg_state { * pointing to bpf_func_state. */ u32 frameno; - enum bpf_reg_liveness live; + u32 live; + struct { + s16 active_def; + bool full_ref; + }; }; enum bpf_stack_slot_type { @@ -145,6 +153,11 @@ struct bpf_insn_aux_data { }; int ctx_field_size; /* the ctx field size for load insn, maybe 0 */ int sanitize_stack_off; /* stack slot to be cleared */ + struct { + s16 def[2]; /* The insns defining the uses of this insn. */ + bool full_ref; /* This insn should operate on full 64-bit. 
*/ + }; + struct bpf_reg_state *reg_state; bool seen; /* this insn was processed by the verifier */ }; diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 8ccbff4..061345d 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -271,8 +271,7 @@ static char slot_type_char[] = { [STACK_ZERO] = '0', }; -static void print_liveness(struct bpf_verifier_env *env, - enum bpf_reg_liveness live) +static void print_liveness(struct bpf_verifier_env *env, u32 live) { if (live & (REG_LIVE_READ | REG_LIVE_WRITTEN)) verbose(env, "_"); @@ -755,6 +754,7 @@ static void init_reg_state(struct bpf_verifier_env *env, mark_reg_not_init(env, regs, i); regs[i].live = REG_LIVE_NONE; regs[i].parent = NULL; + regs[i].active_def = -1; } /* frame pointer */ @@ -779,9 +779,16 @@ static void init_func_state(struct bpf_verifier_env *env, } enum reg_arg_type { - SRC_OP, /* register is used as source operand */ + SRC_OP_0, /* register is used as source operand */ + SRC_OP64_0, /* likewise, and all 64 bits of the source matter */ + SRC_OP_1, /* the second source */ + SRC_OP64_1, /* likewise */ + SRC_OP64_IMP, /* implicit source, always 64-bit */ + SRC_OP_SPILL, /* stack spill */ DST_OP, /* register is used as destination operand */ - DST_OP_NO_MARK /* same as above, check only, don't mark */ + U_DST_OP, /* likewise, and no overlap with any source */ + DST_OP_NO_MARK, /* same as above, check only, don't mark */ + U_DST_OP_NO_MARK/* likewise */ }; static int cmp_subprogs(const void *a, const void *b) @@ -899,14 +906,101 @@ static int check_subprogs(struct bpf_verifier_env *env) return 0; } +static void +link_reg_to_def(struct bpf_verifier_env *env, const struct bpf_reg_state *rstate, + enum reg_arg_type read_t, s16 insn_idx) +{ + s16 slot_idx, def_idx; + + if (insn_idx == -1) + return; + + slot_idx = (read_t - SRC_OP_0) % 2; + def_idx = rstate->active_def; + env->insn_aux_data[insn_idx].def[slot_idx] = def_idx + 1; +} + +/* Stack of insns to process, this is also used by 
check_cfg. */ +static int *insn_stack; + +struct prop_pair { + struct bpf_reg_state *reg_state; + u8 regno; +}; + +/* Mark insn upward along the chain, also propagate READ64 liveness. */ +static int mark_insn_64bit(struct bpf_verifier_env *env, int insn_idx) +{ + struct bpf_insn_aux_data *aux = env->insn_aux_data; + u16 u2d0, u2d1, stack_idx = 0, prop_idx = 0; + struct prop_pair *prop_stack; + + prop_stack = kcalloc(1024, sizeof(struct prop_pair), GFP_KERNEL); + if (!prop_stack) + return -ENOMEM; + + insn_stack[stack_idx++] = insn_idx; + + while (stack_idx) { + u16 def_idx = insn_stack[--stack_idx]; + + if (!def_idx) + continue; + + def_idx--; + + aux[def_idx].full_ref = true; + u2d0 = aux[def_idx].def[0]; + if (u2d0) { + insn_stack[stack_idx++] = u2d0; + prop_stack[prop_idx].reg_state = aux[def_idx].reg_state; + prop_stack[prop_idx++].regno = + env->prog->insnsi[u2d0 - 1].dst_reg; + } + u2d1 = aux[def_idx].def[1]; + if (u2d1) { + insn_stack[stack_idx++] = u2d1; + prop_stack[prop_idx].reg_state = aux[def_idx].reg_state; + prop_stack[prop_idx++].regno = + env->prog->insnsi[u2d1 - 1].dst_reg; + } + } + + while (prop_idx) { + struct prop_pair pair = prop_stack[--prop_idx]; + struct bpf_reg_state *state, *parent; + u8 regno = pair.regno; + + state = pair.reg_state + regno; + parent = state->parent; + + while (parent) { + if (state->live & REG_LIVE_UNIQUE_WRITTEN) + break; + + parent->live |= REG_LIVE_READ64; + state = parent; + parent = state->parent; + } + } + + kfree(prop_stack); + + return 0; +} + /* Parentage chain of this register (or stack slot) should take care of all * issues like callee-saved registers, stack slot allocation time, etc. 
*/ static int mark_reg_read(struct bpf_verifier_env *env, const struct bpf_reg_state *state, - struct bpf_reg_state *parent) + struct bpf_reg_state *parent, bool full_reg_read) { bool writes = parent == state->parent; /* Observe write marks */ + const struct bpf_reg_state *orig_state = state; + struct bpf_reg_state *orig_parent = parent; + bool orig_writes = writes; + int def_idx; while (parent) { /* if read wasn't screened by an earlier write ... */ @@ -918,11 +1012,20 @@ static int mark_reg_read(struct bpf_verifier_env *env, parent = state->parent; writes = true; } - return 0; + + writes = orig_writes; + parent = orig_parent; + state = orig_state; + + def_idx = writes ? state->active_def : parent->active_def; + if (!full_reg_read || def_idx == -1) + return 0; + + return mark_insn_64bit(env, def_idx + 1); } static int check_reg_arg(struct bpf_verifier_env *env, u32 regno, - enum reg_arg_type t) + enum reg_arg_type t, s16 insn_idx) { struct bpf_verifier_state *vstate = env->cur_state; struct bpf_func_state *state = vstate->frame[vstate->curframe]; @@ -933,16 +1036,24 @@ static int check_reg_arg(struct bpf_verifier_env *env, u32 regno, return -EINVAL; } - if (t == SRC_OP) { + if (t == SRC_OP_0 || t == SRC_OP_1 || + t == SRC_OP64_0 || t == SRC_OP64_1 || t == SRC_OP64_IMP) { + struct bpf_reg_state *rstate = regs + regno; + /* check whether register used as source operand can be read */ - if (regs[regno].type == NOT_INIT) { + if (rstate->type == NOT_INIT) { verbose(env, "R%d !read_ok\n", regno); return -EACCES; } + + link_reg_to_def(env, rstate, t, insn_idx); + /* We don't need to worry about FP liveness because it's read-only */ if (regno != BPF_REG_FP) - return mark_reg_read(env, ®s[regno], - regs[regno].parent); + return mark_reg_read(env, rstate, rstate->parent, + t == SRC_OP64_0 || + t == SRC_OP64_1 || + t == SRC_OP64_IMP); } else { /* check whether register used as dest operand can be written to */ if (regno == BPF_REG_FP) { @@ -950,8 +1061,14 @@ static int 
check_reg_arg(struct bpf_verifier_env *env, u32 regno, return -EACCES; } regs[regno].live |= REG_LIVE_WRITTEN; - if (t == DST_OP) + if (t == U_DST_OP || t == U_DST_OP_NO_MARK) + regs[regno].live |= REG_LIVE_UNIQUE_WRITTEN; + if (t == DST_OP || t == U_DST_OP) mark_reg_unknown(env, regs, regno); + regs[regno].active_def = insn_idx; + if (insn_idx != -1) { + env->insn_aux_data[insn_idx].reg_state = regs; + } } return 0; } @@ -1118,7 +1235,8 @@ static int check_stack_read(struct bpf_verifier_env *env, state->regs[value_regno].live |= REG_LIVE_WRITTEN; } mark_reg_read(env, ®_state->stack[spi].spilled_ptr, - reg_state->stack[spi].spilled_ptr.parent); + reg_state->stack[spi].spilled_ptr.parent, + SRC_OP_SPILL); return 0; } else { int zeros = 0; @@ -1135,7 +1253,8 @@ static int check_stack_read(struct bpf_verifier_env *env, return -EACCES; } mark_reg_read(env, ®_state->stack[spi].spilled_ptr, - reg_state->stack[spi].spilled_ptr.parent); + reg_state->stack[spi].spilled_ptr.parent, + SRC_OP_SPILL); if (value_regno >= 0) { if (zeros == size) { /* any size read into register is zero extended, @@ -1735,23 +1854,26 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn return err; } -static int check_xadd(struct bpf_verifier_env *env, int insn_idx, struct bpf_insn *insn) +static int +check_xadd(struct bpf_verifier_env *env, int insn_idx, struct bpf_insn *insn) { + bool is_xadd_64 = BPF_SIZE(insn->code) == BPF_DW; + bool is_xadd_32 = BPF_SIZE(insn->code) == BPF_W; int err; - if ((BPF_SIZE(insn->code) != BPF_W && BPF_SIZE(insn->code) != BPF_DW) || - insn->imm != 0) { + if (!(is_xadd_32 || is_xadd_64) || insn->imm != 0) { verbose(env, "BPF_XADD uses reserved fields\n"); return -EINVAL; } /* check src1 operand */ - err = check_reg_arg(env, insn->src_reg, SRC_OP); + err = check_reg_arg(env, insn->src_reg, + is_xadd_32 ? 
SRC_OP_0 : SRC_OP64_0, insn_idx); if (err) return err; /* check src2 operand */ - err = check_reg_arg(env, insn->dst_reg, SRC_OP); + err = check_reg_arg(env, insn->dst_reg, SRC_OP64_1, insn_idx); if (err) return err; @@ -1852,7 +1974,8 @@ static int check_stack_boundary(struct bpf_verifier_env *env, int regno, * the whole slot to be marked as 'read' */ mark_reg_read(env, &state->stack[spi].spilled_ptr, - state->stack[spi].spilled_ptr.parent); + state->stack[spi].spilled_ptr.parent, + SRC_OP_SPILL); } return update_stack_depth(env, state, off); } @@ -1903,7 +2026,7 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 regno, if (arg_type == ARG_DONTCARE) return 0; - err = check_reg_arg(env, regno, SRC_OP); + err = check_reg_arg(env, regno, SRC_OP64_IMP, -1); if (err) return err; @@ -2320,7 +2443,7 @@ static int check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn, /* after the call registers r0 - r5 were scratched */ for (i = 0; i < CALLER_SAVED_REGS; i++) { mark_reg_not_init(env, caller->regs, caller_saved[i]); - check_reg_arg(env, caller_saved[i], DST_OP_NO_MARK); + check_reg_arg(env, caller_saved[i], DST_OP_NO_MARK, -1); } /* only increment it after check_reg_arg() finished */ @@ -2510,7 +2633,7 @@ static int check_helper_call(struct bpf_verifier_env *env, int func_id, int insn /* reset caller saved regs */ for (i = 0; i < CALLER_SAVED_REGS; i++) { mark_reg_not_init(env, regs, caller_saved[i]); - check_reg_arg(env, caller_saved[i], DST_OP_NO_MARK); + check_reg_arg(env, caller_saved[i], DST_OP_NO_MARK, -1); } /* update return register (already marked as written above) */ @@ -3161,7 +3284,10 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn) { struct bpf_reg_state *regs = cur_regs(env); u8 opcode = BPF_OP(insn->code); - int err; + enum reg_arg_type dst_type; + int err, insn_idx; + + insn_idx = insn - env->prog->insnsi; if (opcode == BPF_END || opcode == BPF_NEG) { if (opcode == BPF_NEG) { @@ -3181,7 +3307,7 @@ 
static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn) } /* check src operand */ - err = check_reg_arg(env, insn->dst_reg, SRC_OP); + err = check_reg_arg(env, insn->dst_reg, SRC_OP_0, insn_idx); if (err) return err; @@ -3191,8 +3317,9 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn) return -EACCES; } + dst_type = insn->dst_reg != insn->src_reg ? U_DST_OP : DST_OP; /* check dest operand */ - err = check_reg_arg(env, insn->dst_reg, DST_OP); + err = check_reg_arg(env, insn->dst_reg, dst_type, insn_idx); if (err) return err; @@ -3205,18 +3332,24 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn) } /* check src operand */ - err = check_reg_arg(env, insn->src_reg, SRC_OP); + err = check_reg_arg(env, insn->src_reg, SRC_OP_0, + insn_idx); if (err) return err; + + dst_type = insn->dst_reg != insn->src_reg ? + U_DST_OP_NO_MARK : DST_OP_NO_MARK; } else { if (insn->src_reg != BPF_REG_0 || insn->off != 0) { verbose(env, "BPF_MOV uses reserved fields\n"); return -EINVAL; } + + dst_type = U_DST_OP_NO_MARK; } /* check dest operand, mark as required later */ - err = check_reg_arg(env, insn->dst_reg, DST_OP_NO_MARK); + err = check_reg_arg(env, insn->dst_reg, dst_type, insn_idx); if (err) return err; @@ -3225,8 +3358,12 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn) /* case: R1 = R2 * copy register state to dest reg */ + u32 live_old = regs[insn->dst_reg].live; + s16 def_old = regs[insn->dst_reg].active_def; + regs[insn->dst_reg] = regs[insn->src_reg]; - regs[insn->dst_reg].live |= REG_LIVE_WRITTEN; + regs[insn->dst_reg].live |= live_old; + regs[insn->dst_reg].active_def = def_old; } else { /* R1 = (u32) R2 */ if (is_pointer_value(env, insn->src_reg)) { @@ -3266,18 +3403,23 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn) return -EINVAL; } /* check src1 operand */ - err = check_reg_arg(env, insn->src_reg, SRC_OP); + err = check_reg_arg(env, 
insn->src_reg, SRC_OP_0, + insn_idx); if (err) return err; + dst_type = insn->dst_reg != insn->src_reg ? + U_DST_OP_NO_MARK : DST_OP_NO_MARK; } else { if (insn->src_reg != BPF_REG_0 || insn->off != 0) { verbose(env, "BPF_ALU uses reserved fields\n"); return -EINVAL; } + + dst_type = DST_OP_NO_MARK; } /* check src2 operand */ - err = check_reg_arg(env, insn->dst_reg, SRC_OP); + err = check_reg_arg(env, insn->dst_reg, SRC_OP_1, insn_idx); if (err) return err; @@ -3303,7 +3445,7 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn) } /* check dest operand */ - err = check_reg_arg(env, insn->dst_reg, DST_OP_NO_MARK); + err = check_reg_arg(env, insn->dst_reg, dst_type, insn_idx); if (err) return err; @@ -3759,6 +3901,7 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env, struct bpf_reg_state *regs = this_branch->frame[this_branch->curframe]->regs; struct bpf_reg_state *dst_reg, *other_branch_regs; u8 opcode = BPF_OP(insn->code); + int old_insn_idx = *insn_idx; int err; if (opcode > BPF_JSLE) { @@ -3773,7 +3916,8 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env, } /* check src1 operand */ - err = check_reg_arg(env, insn->src_reg, SRC_OP); + err = check_reg_arg(env, insn->src_reg, SRC_OP64_0, + old_insn_idx); if (err) return err; @@ -3790,7 +3934,7 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env, } /* check src2 operand */ - err = check_reg_arg(env, insn->dst_reg, SRC_OP); + err = check_reg_arg(env, insn->dst_reg, SRC_OP64_1, old_insn_idx); if (err) return err; @@ -3896,7 +4040,8 @@ static int check_ld_imm(struct bpf_verifier_env *env, struct bpf_insn *insn) return -EINVAL; } - err = check_reg_arg(env, insn->dst_reg, DST_OP); + err = check_reg_arg(env, insn->dst_reg, U_DST_OP, + insn - env->prog->insnsi); if (err) return err; @@ -3945,9 +4090,9 @@ static bool may_access_skb(enum bpf_prog_type type) */ static int check_ld_abs(struct bpf_verifier_env *env, struct bpf_insn *insn) { + int i, err, insn_idx = insn - 
 	env->prog->insnsi;
 	struct bpf_reg_state *regs = cur_regs(env);
 	u8 mode = BPF_MODE(insn->code);
-	int i, err;

 	if (!may_access_skb(env->prog->type)) {
 		verbose(env, "BPF_LD_[ABS|IND] instructions not allowed for this program type\n");
@@ -3979,7 +4124,7 @@ static int check_ld_abs(struct bpf_verifier_env *env, struct bpf_insn *insn)
 	}

 	/* check whether implicit source operand (register R6) is readable */
-	err = check_reg_arg(env, BPF_REG_6, SRC_OP);
+	err = check_reg_arg(env, BPF_REG_6, SRC_OP64_1, insn_idx);
 	if (err)
 		return err;
@@ -3991,7 +4136,7 @@ static int check_ld_abs(struct bpf_verifier_env *env, struct bpf_insn *insn)

 	if (mode == BPF_IND) {
 		/* check explicit source operand */
-		err = check_reg_arg(env, insn->src_reg, SRC_OP);
+		err = check_reg_arg(env, insn->src_reg, SRC_OP_0, insn_idx);
 		if (err)
 			return err;
 	}
@@ -3999,7 +4144,7 @@ static int check_ld_abs(struct bpf_verifier_env *env, struct bpf_insn *insn)
 	/* reset caller saved regs to unreadable */
 	for (i = 0; i < CALLER_SAVED_REGS; i++) {
 		mark_reg_not_init(env, regs, caller_saved[i]);
-		check_reg_arg(env, caller_saved[i], DST_OP_NO_MARK);
+		check_reg_arg(env, caller_saved[i], DST_OP_NO_MARK, -1);
 	}

 	/* mark destination R0 register as readable, since it contains
@@ -4091,7 +4236,6 @@ enum {

 #define STATE_LIST_MARK ((struct bpf_verifier_state_list *) -1L)

-static int *insn_stack;	/* stack of insns to process */
 static int cur_stack;	/* current stack index */
 static int *insn_state;
@@ -4554,10 +4698,11 @@ static bool states_equal(struct bpf_verifier_env *env,
  */
 static int propagate_liveness(struct bpf_verifier_env *env,
 			      const struct bpf_verifier_state *vstate,
-			      struct bpf_verifier_state *vparent)
+			      struct bpf_verifier_state *vparent, int insn_idx)
 {
-	int i, frame, err = 0;
+	struct bpf_reg_state *regs, *parent_regs;
 	struct bpf_func_state *state, *parent;
+	int i, frame, err = 0;

 	if (vparent->curframe != vstate->curframe) {
 		WARN(1, "propagate_live: parent frame %d current frame %d\n",
@@ -4566,13 +4711,24 @@ static int propagate_liveness(struct bpf_verifier_env *env,
 	}
 	/* Propagate read liveness of registers... */
 	BUILD_BUG_ON(BPF_REG_FP + 1 != MAX_BPF_REG);
+	parent_regs = vparent->frame[vparent->curframe]->regs;
+	regs = vstate->frame[vparent->curframe]->regs;
 	/* We don't need to worry about FP liveness because it's read-only */
 	for (i = 0; i < BPF_REG_FP; i++) {
-		if (vparent->frame[vparent->curframe]->regs[i].live & REG_LIVE_READ)
+		if (!(parent_regs[i].live & REG_LIVE_READ64) &&
+		    regs[i].live & REG_LIVE_READ64) {
+			err = mark_reg_read(env, &regs[i], &parent_regs[i],
+					    true);
+			if (err)
+				return err;
 			continue;
-		if (vstate->frame[vstate->curframe]->regs[i].live & REG_LIVE_READ) {
-			err = mark_reg_read(env, &vstate->frame[vstate->curframe]->regs[i],
-					    &vparent->frame[vstate->curframe]->regs[i]);
+		}
+
+		if (parent_regs[i].live & REG_LIVE_READ)
+			continue;
+		if (regs[i].live & REG_LIVE_READ) {
+			err = mark_reg_read(env, &regs[i], &parent_regs[i],
+					    false);
 			if (err)
 				return err;
 		}
@@ -4588,7 +4744,8 @@ static int propagate_liveness(struct bpf_verifier_env *env,
 			continue;
 		if (state->stack[i].spilled_ptr.live & REG_LIVE_READ)
 			mark_reg_read(env, &state->stack[i].spilled_ptr,
-				      &parent->stack[i].spilled_ptr);
+				      &parent->stack[i].spilled_ptr,
+				      false);
 		}
 	}
 	return err;
@@ -4620,7 +4777,7 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
 			 * they'll be immediately forgotten as we're pruning
 			 * this state and will pop a new one.
 			 */
-			err = propagate_liveness(env, &sl->state, cur);
+			err = propagate_liveness(env, &sl->state, cur, insn_idx);
 			if (err)
 				return err;
 			return 1;
@@ -4699,6 +4856,12 @@ static int do_check(struct bpf_verifier_env *env)
 			BPF_MAIN_FUNC /* callsite */,
 			0 /* frameno */,
 			0 /* subprogno, zero == main subprog */);
+
+	/* insn_stack will be used by propagate_64bit_usage. */
+	insn_stack = kcalloc(insn_cnt * 2, sizeof(int), GFP_KERNEL);
+	if (!insn_stack)
+		return -ENOMEM;
+
 	insn_idx = 0;
 	for (;;) {
 		struct bpf_insn *insn;
@@ -4779,11 +4942,13 @@ static int do_check(struct bpf_verifier_env *env)
 			/* check for reserved fields is already done */

 			/* check src operand */
-			err = check_reg_arg(env, insn->src_reg, SRC_OP);
+			err = check_reg_arg(env, insn->src_reg, SRC_OP64_0,
+					    insn_idx);
 			if (err)
 				return err;

-			err = check_reg_arg(env, insn->dst_reg, DST_OP_NO_MARK);
+			err = check_reg_arg(env, insn->dst_reg,
+					    U_DST_OP_NO_MARK, insn_idx);
 			if (err)
 				return err;

@@ -4823,6 +4988,12 @@ static int do_check(struct bpf_verifier_env *env)
 		} else if (class == BPF_STX) {
 			enum bpf_reg_type *prev_dst_type, dst_reg_type;
+			enum reg_arg_type source_type;
+
+			/* Let the back-end decide which insn to use according
+			 * to store width.
+			 */
+			env->insn_aux_data[insn_idx].full_ref = true;

 			if (BPF_MODE(insn->code) == BPF_XADD) {
 				err = check_xadd(env, insn_idx, insn);
@@ -4832,12 +5003,17 @@ static int do_check(struct bpf_verifier_env *env)
 				continue;
 			}

+			source_type = BPF_SIZE(insn->code) == BPF_DW ?
+				      SRC_OP64_0 : SRC_OP_0;
+
 			/* check src1 operand */
-			err = check_reg_arg(env, insn->src_reg, SRC_OP);
+			err = check_reg_arg(env, insn->src_reg, source_type,
+					    insn_idx);
 			if (err)
 				return err;
 			/* check src2 operand */
-			err = check_reg_arg(env, insn->dst_reg, SRC_OP);
+			err = check_reg_arg(env, insn->dst_reg, SRC_OP64_1,
+					    insn_idx);
 			if (err)
 				return err;

@@ -4867,8 +5043,12 @@ static int do_check(struct bpf_verifier_env *env)
 				verbose(env, "BPF_ST uses reserved fields\n");
 				return -EINVAL;
 			}
+
+			env->insn_aux_data[insn_idx].full_ref = true;
+
 			/* check src operand */
-			err = check_reg_arg(env, insn->dst_reg, SRC_OP);
+			err = check_reg_arg(env, insn->dst_reg, SRC_OP64_0,
+					    insn_idx);
 			if (err)
 				return err;

@@ -4888,6 +5068,12 @@ static int do_check(struct bpf_verifier_env *env)
 		} else if (class == BPF_JMP) {
 			u8 opcode = BPF_OP(insn->code);

+			/* TODO: could potentially use range info to check if
+			 * the comparison setting condition could be done on
+			 * 32-bit sub-register.
+			 */
+			env->insn_aux_data[insn_idx].full_ref = true;
+
 			if (opcode == BPF_CALL) {
 				if (BPF_SRC(insn->code) != BPF_K ||
 				    insn->off != 0 ||
@@ -4918,6 +5104,9 @@ static int do_check(struct bpf_verifier_env *env)
 				continue;

 			} else if (opcode == BPF_EXIT) {
+				bool is_main_exit;
+				u32 frame_idx;
+
 				if (BPF_SRC(insn->code) != BPF_K ||
 				    insn->imm != 0 ||
 				    insn->src_reg != BPF_REG_0 ||
@@ -4926,7 +5115,8 @@ static int do_check(struct bpf_verifier_env *env)
 					return -EINVAL;
 				}

-				if (state->curframe) {
+				frame_idx = state->curframe;
+				if (frame_idx) {
 					/* exit from nested function */
 					prev_insn_idx = insn_idx;
 					err = prepare_func_exit(env, &insn_idx);
@@ -4942,7 +5132,12 @@ static int do_check(struct bpf_verifier_env *env)
 				 * of bpf_exit, which means that program wrote
 				 * something into it earlier
 				 */
-				err = check_reg_arg(env, BPF_REG_0, SRC_OP);
+				is_main_exit =
+					!state->frame[frame_idx]->subprogno;
+				err = check_reg_arg(env, BPF_REG_0,
+						    is_main_exit ?
+						    SRC_OP64_0 : SRC_OP_0,
+						    insn_idx);
 				if (err)
 					return err;

@@ -5267,6 +5462,13 @@ static void sanitize_dead_code(struct bpf_verifier_env *env)
 	int i;

 	for (i = 0; i < insn_cnt; i++) {
+		/* Demonstration the usage of "full_ref" info. We could rewrite
+		 * BPF_ALU64 into BPF_ALU. 64-bit architecture could
+		 * then automatically get more 32-bit sub-register.
+		 */
+		if (!aux_data[i].full_ref &&
+		    BPF_CLASS(insn[i].code) == BPF_ALU64)
+			insn[i].code = (insn[i].code & ~0x7) | BPF_ALU;
 		if (aux_data[i].seen)
 			continue;
 		memcpy(insn + i, &trap, sizeof(trap));
@@ -5910,11 +6112,13 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr)
 	if (ret < 0)
 		goto skip_full_check;

+	insn_stack = NULL;
 	ret = do_check(env);
 	if (env->cur_state) {
 		free_verifier_state(env->cur_state, true);
 		env->cur_state = NULL;
 	}
+	kfree(insn_stack);

skip_full_check:
 	while (!pop_stack(env, NULL, NULL));

^ permalink raw reply related	[flat|nested] 12+ messages in thread
* Re: [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification
  2018-10-03 15:36         ` Jiong Wang
  2018-10-03 15:59           ` Alexei Starovoitov
@ 2018-10-04 17:35         ` Edward Cree
  1 sibling, 0 replies; 12+ messages in thread
From: Edward Cree @ 2018-10-04 17:35 UTC (permalink / raw)
  To: Jiong Wang, ast, daniel; +Cc: netdev

On 03/10/18 16:36, Jiong Wang wrote:
> On 28/09/2018 14:36, Edward Cree wrote:
> > But what you've described sounds interesting; perhaps it would also
> > help later with loop-variable handling?
>
> Haven't considered how to use this for loop-variable handling, guess you mean
> applying what I have described to your previous loop detection RFC? I will look
> into your RFC later.
Tbh I was thinking more of John Fastabend's version (I'm not sure if he ever
got round to posting patches, but he discussed the design towards the end of
https://www.mail-archive.com/netdev@vger.kernel.org/msg216285.html ) which is
building 'proper compiler data structures' and thus might be interested in
proper use-def chains.
(Or it might not; I'm not really a compiler-guru so it's not immediately
obvious to me.)
My approach was much less interested in the 'provenance' of the induction
variable, just that it was increasing appropriately, so use-def chains are
not really relevant to it.

-Ed

^ permalink raw reply	[flat|nested] 12+ messages in thread
end of thread, other threads:[~2018-10-09 3:32 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-08-22 19:00 [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification Edward Cree
2018-08-22 19:02 ` [RFC PATCH v2 bpf-next 1/2] bpf/verifier: per-register parent pointers Edward Cree
2018-08-22 19:02 ` [RFC PATCH v2 bpf-next 2/2] bpf/verifier: display non-spill stack slot types in print_verifier_state Edward Cree
2018-08-30  2:18 ` [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification Alexei Starovoitov
2018-08-31 15:50   ` Edward Cree
2018-09-26 22:16     ` Jiong Wang
2018-09-28 13:36       ` Edward Cree
2018-10-03 15:36         ` Jiong Wang
2018-10-03 15:59           ` Alexei Starovoitov
2018-10-03 16:53             ` Jiong Wang
2018-10-08 20:18               ` Jiong Wang
2018-10-04 17:35           ` Edward Cree