All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 bpf-next 00/15] bpf: eliminate zero extensions for sub-register writes
@ 2019-04-15 17:26 Jiong Wang
  2019-04-15 17:26 ` [PATCH v4 bpf-next 01/15] bpf: split read liveness into REG_LIVE_READ64 and REG_LIVE_READ32 Jiong Wang
                   ` (14 more replies)
  0 siblings, 15 replies; 30+ messages in thread
From: Jiong Wang @ 2019-04-15 17:26 UTC (permalink / raw)
  To: alexei.starovoitov, daniel
  Cc: bpf, netdev, oss-drivers, Jiong Wang, David S . Miller,
	Paul Burton, Wang YanQing, Zi Shen Lim, Shubham Bansal,
	Naveen N . Rao, Sandipan Das, Martin Schwidefsky, Heiko Carstens,
	Jakub Kicinski

v4:
  - added the two missing fixes which addresses two Jakub's reviewes in v3.
  - rebase on top of bpf-next.

v3:
  - remove redundant check in "propagate_liveness_reg". (Jakub)
  - add extra check in "mark_reg_read" to prune more search. (Jakub)
  - re-implemented "prog_flags" passing mechanism, removed use of
    global switch inside libbpf.
  - enabled high 32-bit randomization beyond "test_verifier" and
    "test_progs". Now it should have been enabled for all possible
    tests. Re-run all tests, haven't noticed regression.
  - remove RFC tag.

v2:
  - rebased on top of bpf-next master.
  - added comments for what is sub-register def index. (Edward, Alexei)
  - removed patch 1 which turns bit mask from enum to macro. (Alexei)
  - removed sysctl/bpf_jit_32bit_opt. (Alexei)
  - merged sub-register def insn index into reg state. (Alexei)
  - change test methodology (Alexei):
      + instead of simple unit tests on x86_64 for which this optimization
        doesn't enabled due to there is hardware support, poison high
        32-bit for whose def identified as safe to do so. this could let
        the correctness of this patch set checked when daily bpf selftest
        ran which delivers very stressful test on host machine like x86_64.
      + hi32 poisoning is gated by a new BPF_F_TEST_RND_HI32 prog flags.
      + BPF_F_TEST_RND_HI32 is enabled for all tests of "test_progs" and
        "test_verifier", the latter needs minor tweak on two unit tests,
        please see the patch for the change.
      + introduced a new global variable "libbpf_test_mode" into libbpf.
        once it is set to true, it will set BPF_F_TEST_RND_HI32 for all the
        later PROG_LOAD syscall, the goal is to easy the enable of hi32
        poison on exsiting testsuite.
        we could also introduce new APIs, for example "bpf_prog_test_load",
        then use -Dbpf_prog_load=bpf_prog_test_load to migrate tests under
        test_progs, but there are several load APIs, and such new API need
        some change on struture like "struct bpf_prog_load_attr".
      + removed old unit tests. it is based on insn scan and requires quite
        a few test_verifier generic code change. given hi32 randomization
        could offer good test coverage, the unit tests doesn't add much
        extra test value.
  - enhanced register width check ("is_reg64") when record sub-register
    write, now, it returns more accurate width.
  - Re-run all tests under "test_progs" and "test_verifier" on x86_64, no
    regression. Fixed a couple of bugs exposed:
      1. ctx field size transformation was not taken into account.
      2. insn patch could cause lost of original aux data which is
         important for ctx field conversion.
      3. return value for propagate_liveness was wrong and caused
         regression on processed insn number.
      4. helper call arg wasn't handled properly that path prune may cause
         64-bit read info in pruned path lost.
  - Re-run Cilium bpf prog for processed-insn-number benchmarking, no
    regression.

eBPF ISA specification requires high 32-bit cleared when low 32-bit
sub-register is written. This applies to destination register of
ALU32/LD_H/B/W etc. JIT back-ends must guarantee this semantic when doing
code-gen.

x86-64 and arm64 ISA has the same semantics, so the corresponding JIT
back-end doesn't need to do extra work. However, 32-bit arches (arm, nfp
etc.) and some other 64-bit arches (powerpc, sparc etc), need explicitly
zero extension sequence to meet such semantic.

This is important, because for C code like the following:

  u64_value = (u64) u32_value
  ... other uses of u64_value

compiler could exploit the semantic described above and save those zero
extensions for extending u32_value to u64_value. Hardware, runtime, or BPF
JIT back-ends, are responsible for guaranteeing this. Some benchmarks shows
~40% sub-register writes out of total insns, meaning ~40% extra code-gen
and could go up for arches requiring two shifts for zero extension. All
these are because JIT back-end needs to do extra code-gen for all such
instructions, always.

However this is not always necessary in case u32 value is never cast into a
u64, which is quite normal in real life program. So, it would be really
good if we could identify those places where such type cast happened, and
only do zero extensions for them, not for the others. This could save a lot
of BPF code-gen.

Algo
====
We could use insn scan based static analysis to tell whether one
sub-register def doesn't need zero extension. However, using such static
analysis, we must do conservative assumption at branching point where
multiple uses could be introduced. So, for any sub-register def that is
active at branching point, we need to mark it as needing zero extension.
This could introducing quite a few false alarms, for example ~25% on
Cilium bpf_lxc.

It will be far better to use dynamic data-flow tracing which verifier
fortunately already has and could be easily extend to serve the purpose of
this patch set.

 - Record indices of instructions that do sub-register def (write). And
   these indices need to stay with function state so path pruning and bpf
   to bpf function call could be handled properly.

   These indices are kept up to date while doing insn walk.

 - A full register read on an active sub-register def marks the def insn as
   needing zero extension on dst register.

 - A new sub-register write overrides the old one.

   A new full register write makes the register free of zero extension on
   dst register.

 - When propagating register read64 during path pruning, it also marks def
   insns whose defs are hanging active sub-register, if there is any read64
   from shown from the equal state.

 The core patch in this set is patch 4.

Benchmark
=========
 - I estimate the JITed image could be 25% smaller on average on all these
   affected arches (nfp, arm, x32, risv, ppc, sparc, s390).

 - The implementation is based on existing register read liveness tracking
   infrastructure, so it is dynamic tracking and would trace all possible
   code paths, therefore, it shouldn't be any false alarm.

   For Cilium bpf_lxc, there is ~11500 insns in the compiled binary (use
   latest LLVM snapshot, and with -mcpu=v3 -mattr=+alu32 enabled), 4460 of
   them has sub-register writes (~40%). Calculated by:

    cat dump | grep -P "\tw" | wc -l       (ALU32)
    cat dump | grep -P "r.*=.*u32" | wc -l (READ_W)
    cat dump | grep -P "r.*=.*u16" | wc -l (READ_H)
    cat dump | grep -P "r.*=.*u8" | wc -l  (READ_B)

   After this patch set enabled, 647 out of those 4460 are identified as
   really needing zero extension on the destination, then it is safe for
   JIT back-ends to eliminate zero extension for all the other instructions
   which is ~85% of all those sub-register write insns or 33% of total
   insns. It is a significant save.

   For those 647 insns marked as needing zero extension, part of them are
   setting up u64 parameters for help calls, remaining ones are those whose
   sub-register defs really have 64-bit reads.

ToDo
====
 - eBPF doesn't have zero extension instruction, so lshift and rshift are
   used to do it, meaning two native insns while JIT back-ends could just
   use one truncate insn if they understand the lshift + rshift is just
   doing zero extension.

   There are three approaches to fix this:

     - Minor change on back-end JIT hook, also pass aux_insn information to
       back-ends so they could have per insn information and they could do
       zero extension for the marked insn themselves using the most
       efficient native insn.

    - Introduce zero extension insn for eBPF. Then verifier could insert
      the new zext insn instead of lshift + rshift. zext could be JITed
      more efficiently.

      NOTE: existing MOV64 can't be used as zero extension insn, because
            after zext optimization enabled, MOV64 doesn't necessarily
            clear high 32-bit.

    - Otherwise JIT back-ends need to do peephole to catch lshift + rshift
      and turn them into native zext.

 - In this set, for JIT back-ends except NFP, I have only enabled the
   optimization for ALU32 instructions, while it could be easily enabled
   for load instruction.

Reviews
=======
 - Fixed the missing handling on callee-saved for bpf-to-bpf call,
   sub-register defs therefore moved to frame state. (Jakub Kicinski)
 - Removed redundant "cross_reg". (Jakub Kicinski)
 - Various coding styles & grammar fixes. (Jakub Kicinski, Quentin Monnet)

Cc: David S. Miller <davem@davemloft.net>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Wang YanQing <udknight@gmail.com>
Cc: Zi Shen Lim <zlim.lnx@gmail.com>
Cc: Shubham Bansal <illusionist.neo@gmail.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.ibm.com>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Jakub Kicinski <jakub.kicinski@netronome.com>

Jiong Wang (15):
  bpf: split read liveness into REG_LIVE_READ64 and REG_LIVE_READ32
  bpf: mark lo32 writes that should be zero extended into hi32
  bpf: reduce false alarm by refining helper call arg types
  bpf: insert explicit zero extension insn when hardware doesn't do it
    implicitly
  bpf: introduce new bpf prog load flags "BPF_F_TEST_RND_HI32"
  bpf: randomize high 32-bit when BPF_F_TEST_RND_HI32 is set
  libbpf: add "prog_flags" to
    bpf_program/bpf_prog_load_attr/bpf_load_program_attr
  selftests: enable hi32 randomization for all tests
  arm: bpf: eliminate zero extension code-gen
  powerpc: bpf: eliminate zero extension code-gen
  s390: bpf: eliminate zero extension code-gen
  sparc: bpf: eliminate zero extension code-gen
  x32: bpf: eliminate zero extension code-gen
  riscv: bpf: eliminate zero extension code-gen
  nfp: bpf: eliminate zero extension code-gen

 arch/arm/net/bpf_jit_32.c                          |  22 +-
 arch/powerpc/net/bpf_jit_comp64.c                  |   7 +-
 arch/riscv/net/bpf_jit_comp.c                      |  32 +-
 arch/s390/net/bpf_jit_comp.c                       |  13 +-
 arch/sparc/net/bpf_jit_comp_64.c                   |   8 +-
 arch/x86/net/bpf_jit_comp32.c                      |  32 +-
 drivers/net/ethernet/netronome/nfp/bpf/jit.c       | 119 ++++---
 drivers/net/ethernet/netronome/nfp/bpf/main.h      |   2 +
 drivers/net/ethernet/netronome/nfp/bpf/verifier.c  |  12 +
 include/linux/bpf.h                                |   4 +
 include/linux/bpf_verifier.h                       |  14 +-
 include/linux/filter.h                             |   1 +
 include/uapi/linux/bpf.h                           |  18 ++
 kernel/bpf/core.c                                  |  10 +-
 kernel/bpf/helpers.c                               |   2 +-
 kernel/bpf/syscall.c                               |   4 +-
 kernel/bpf/verifier.c                              | 353 +++++++++++++++++++--
 net/core/filter.c                                  |  28 +-
 tools/include/uapi/linux/bpf.h                     |  18 ++
 tools/lib/bpf/bpf.c                                |   1 +
 tools/lib/bpf/bpf.h                                |   1 +
 tools/lib/bpf/libbpf.c                             |   3 +
 tools/lib/bpf/libbpf.h                             |   1 +
 tools/testing/selftests/bpf/Makefile               |  10 +-
 .../selftests/bpf/prog_tests/bpf_verif_scale.c     |   1 +
 tools/testing/selftests/bpf/test_sock_addr.c       |   1 +
 tools/testing/selftests/bpf/test_sock_fields.c     |   1 +
 tools/testing/selftests/bpf/test_socket_cookie.c   |   1 +
 tools/testing/selftests/bpf/test_stub.c            |  40 +++
 tools/testing/selftests/bpf/test_verifier.c        |   6 +-
 30 files changed, 636 insertions(+), 129 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/test_stub.c

-- 
2.7.4


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v4 bpf-next 01/15] bpf: split read liveness into REG_LIVE_READ64 and REG_LIVE_READ32
  2019-04-15 17:26 [PATCH v4 bpf-next 00/15] bpf: eliminate zero extensions for sub-register writes Jiong Wang
@ 2019-04-15 17:26 ` Jiong Wang
  2019-04-15 23:03   ` Jakub Kicinski
  2019-04-16  1:26   ` Alexei Starovoitov
  2019-04-15 17:26 ` [PATCH v4 bpf-next 02/15] bpf: mark lo32 writes that should be zero extended into hi32 Jiong Wang
                   ` (13 subsequent siblings)
  14 siblings, 2 replies; 30+ messages in thread
From: Jiong Wang @ 2019-04-15 17:26 UTC (permalink / raw)
  To: alexei.starovoitov, daniel; +Cc: bpf, netdev, oss-drivers, Jiong Wang

Register liveness infrastructure doesn't track register read width at the
moment, while the width information will be needed for the later 32-bit
safety analysis pass.

This patch take the first step to split read liveness into REG_LIVE_READ64
and REG_LIVE_READ32.

Liveness propagation code are updated accordingly. They are taught to
understand how to propagate REG_LIVE_READ64 and REG_LIVE_READ32 at the same
propagation iteration. For example, "mark_reg_read" now propagate "flags"
which could be multiple read bits instead of the single REG_LIVE_READ64.

A write still screen off all width of reads.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
---
 include/linux/bpf_verifier.h |   8 +--
 kernel/bpf/verifier.c        | 119 +++++++++++++++++++++++++++++++++++++++----
 2 files changed, 115 insertions(+), 12 deletions(-)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index b3ab61f..fba0ebb 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -36,9 +36,11 @@
  */
 enum bpf_reg_liveness {
 	REG_LIVE_NONE = 0, /* reg hasn't been read or written this branch */
-	REG_LIVE_READ, /* reg was read, so we're sensitive to initial value */
-	REG_LIVE_WRITTEN, /* reg was written first, screening off later reads */
-	REG_LIVE_DONE = 4, /* liveness won't be updating this register anymore */
+	REG_LIVE_READ32 = 0x1, /* reg was read, so we're sensitive to initial value */
+	REG_LIVE_READ64 = 0x2, /* likewise, but full 64-bit content matters */
+	REG_LIVE_READ = REG_LIVE_READ32 | REG_LIVE_READ64,
+	REG_LIVE_WRITTEN = 0x4, /* reg was written first, screening off later reads */
+	REG_LIVE_DONE = 0x8, /* liveness won't be updating this register anymore */
 };
 
 struct bpf_reg_state {
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index c722015..5784b279 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -1135,7 +1135,7 @@ static int check_subprogs(struct bpf_verifier_env *env)
  */
 static int mark_reg_read(struct bpf_verifier_env *env,
 			 const struct bpf_reg_state *state,
-			 struct bpf_reg_state *parent)
+			 struct bpf_reg_state *parent, u8 flags)
 {
 	bool writes = parent == state->parent; /* Observe write marks */
 	int cnt = 0;
@@ -1150,17 +1150,23 @@ static int mark_reg_read(struct bpf_verifier_env *env,
 				parent->var_off.value, parent->off);
 			return -EFAULT;
 		}
-		if (parent->live & REG_LIVE_READ)
+		/* The first condition is much more likely to be true than the
+		 * second, make it checked first.
+		 */
+		if ((parent->live & REG_LIVE_READ) == flags ||
+		    parent->live & REG_LIVE_READ64)
 			/* The parentage chain never changes and
 			 * this parent was already marked as LIVE_READ.
 			 * There is no need to keep walking the chain again and
 			 * keep re-marking all parents as LIVE_READ.
 			 * This case happens when the same register is read
 			 * multiple times without writes into it in-between.
+			 * Also, if parent has REG_LIVE_READ64 set, then no need
+			 * to set the weak REG_LIVE_READ32.
 			 */
 			break;
 		/* ... then we depend on parent's value */
-		parent->live |= REG_LIVE_READ;
+		parent->live |= flags;
 		state = parent;
 		parent = state->parent;
 		writes = true;
@@ -1172,12 +1178,95 @@ static int mark_reg_read(struct bpf_verifier_env *env,
 	return 0;
 }
 
+/* This function is supposed to be used by the following 32-bit optimization
+ * code only. It returns TRUE if the source or destination register operates
+ * on 64-bit, otherwise return FALSE.
+ */
+static bool is_reg64(struct bpf_insn *insn, u32 regno,
+		     struct bpf_reg_state *reg, enum reg_arg_type t)
+{
+	u8 code, class, op;
+
+	code = insn->code;
+	class = BPF_CLASS(code);
+	op = BPF_OP(code);
+	if (class == BPF_JMP) {
+		/* BPF_EXIT will reach here because of return value readability
+		 * test for "main" which has s32 return value.
+		 */
+		if (op == BPF_EXIT)
+			return false;
+		if (op == BPF_CALL) {
+			/* BPF to BPF call will reach here because of marking
+			 * caller saved clobber with DST_OP_NO_MARK for which we
+			 * don't care the register def because they are anyway
+			 * marked as NOT_INIT already.
+			 */
+			if (insn->src_reg == BPF_PSEUDO_CALL)
+				return false;
+			/* Helper call will reach here because of arg type
+			 * check. Conservatively marking all args as 64-bit.
+			 */
+			return true;
+		}
+	}
+
+	if (class == BPF_ALU64 || class == BPF_JMP ||
+	    /* BPF_END always use BPF_ALU class. */
+	    (class == BPF_ALU && op == BPF_END && insn->imm == 64))
+		return true;
+
+	if (class == BPF_ALU || class == BPF_JMP32)
+		return false;
+
+	if (class == BPF_LDX) {
+		if (t != SRC_OP)
+			return BPF_SIZE(code) == BPF_DW;
+		/* LDX source must be ptr. */
+		return true;
+	}
+
+	if (class == BPF_STX) {
+		if (reg->type != SCALAR_VALUE)
+			return true;
+		return BPF_SIZE(code) == BPF_DW;
+	}
+
+	if (class == BPF_LD) {
+		u8 mode = BPF_MODE(code);
+
+		/* LD_IMM64 */
+		if (mode == BPF_IMM)
+			return true;
+
+		/* Both LD_IND and LD_ABS return 32-bit data. */
+		if (t != SRC_OP)
+			return  false;
+
+		/* Implicit ctx ptr. */
+		if (regno == BPF_REG_6)
+			return true;
+
+		/* Explicit source could be any width. */
+		return true;
+	}
+
+	if (class == BPF_ST)
+		/* The only source register for BPF_ST is a ptr. */
+		return true;
+
+	/* Conservatively return true at default. */
+	return true;
+}
+
 static int check_reg_arg(struct bpf_verifier_env *env, u32 regno,
 			 enum reg_arg_type t)
 {
 	struct bpf_verifier_state *vstate = env->cur_state;
 	struct bpf_func_state *state = vstate->frame[vstate->curframe];
+	struct bpf_insn *insn = env->prog->insnsi + env->insn_idx;
 	struct bpf_reg_state *reg, *regs = state->regs;
+	bool rw64;
 
 	if (regno >= MAX_BPF_REG) {
 		verbose(env, "R%d is invalid\n", regno);
@@ -1185,6 +1274,7 @@ static int check_reg_arg(struct bpf_verifier_env *env, u32 regno,
 	}
 
 	reg = &regs[regno];
+	rw64 = is_reg64(insn, regno, reg, t);
 	if (t == SRC_OP) {
 		/* check whether register used as source operand can be read */
 		if (reg->type == NOT_INIT) {
@@ -1195,7 +1285,8 @@ static int check_reg_arg(struct bpf_verifier_env *env, u32 regno,
 		if (regno == BPF_REG_FP)
 			return 0;
 
-		return mark_reg_read(env, reg, reg->parent);
+		return mark_reg_read(env, reg, reg->parent,
+				     rw64 ? REG_LIVE_READ64 : REG_LIVE_READ32);
 	} else {
 		/* check whether register used as dest operand can be written to */
 		if (regno == BPF_REG_FP) {
@@ -1382,7 +1473,8 @@ static int check_stack_read(struct bpf_verifier_env *env,
 			state->regs[value_regno].live |= REG_LIVE_WRITTEN;
 		}
 		mark_reg_read(env, &reg_state->stack[spi].spilled_ptr,
-			      reg_state->stack[spi].spilled_ptr.parent);
+			      reg_state->stack[spi].spilled_ptr.parent,
+			      REG_LIVE_READ64);
 		return 0;
 	} else {
 		int zeros = 0;
@@ -1399,7 +1491,9 @@ static int check_stack_read(struct bpf_verifier_env *env,
 			return -EACCES;
 		}
 		mark_reg_read(env, &reg_state->stack[spi].spilled_ptr,
-			      reg_state->stack[spi].spilled_ptr.parent);
+			      reg_state->stack[spi].spilled_ptr.parent,
+			      size == BPF_REG_SIZE
+			      ? REG_LIVE_READ64 : REG_LIVE_READ32);
 		if (value_regno >= 0) {
 			if (zeros == size) {
 				/* any size read into register is zero extended,
@@ -2337,7 +2431,9 @@ static int check_stack_boundary(struct bpf_verifier_env *env, int regno,
 		 * the whole slot to be marked as 'read'
 		 */
 		mark_reg_read(env, &state->stack[spi].spilled_ptr,
-			      state->stack[spi].spilled_ptr.parent);
+			      state->stack[spi].spilled_ptr.parent,
+			      access_size == BPF_REG_SIZE
+			      ? REG_LIVE_READ64 : REG_LIVE_READ32);
 	}
 	return update_stack_depth(env, state, min_off);
 }
@@ -6227,12 +6323,17 @@ static int propagate_liveness_reg(struct bpf_verifier_env *env,
 				  struct bpf_reg_state *reg,
 				  struct bpf_reg_state *parent_reg)
 {
+	u8 parent_bits = parent_reg->live & REG_LIVE_READ;
+	u8 bits = reg->live & REG_LIVE_READ;
+	u8 bits_diff = parent_bits ^ bits;
+	u8 bits_prop = bits_diff & bits;
 	int err;
 
-	if (parent_reg->live & REG_LIVE_READ || !(reg->live & REG_LIVE_READ))
+	/* No diff bit comes from "reg". */
+	if (!bits_prop)
 		return 0;
 
-	err = mark_reg_read(env, reg, parent_reg);
+	err = mark_reg_read(env, reg, parent_reg, bits_prop);
 	if (err)
 		return err;
 
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v4 bpf-next 02/15] bpf: mark lo32 writes that should be zero extended into hi32
  2019-04-15 17:26 [PATCH v4 bpf-next 00/15] bpf: eliminate zero extensions for sub-register writes Jiong Wang
  2019-04-15 17:26 ` [PATCH v4 bpf-next 01/15] bpf: split read liveness into REG_LIVE_READ64 and REG_LIVE_READ32 Jiong Wang
@ 2019-04-15 17:26 ` Jiong Wang
  2019-04-18 23:57   ` Alexei Starovoitov
  2019-04-15 17:26 ` [PATCH v4 bpf-next 03/15] bpf: reduce false alarm by refining helper call arg types Jiong Wang
                   ` (12 subsequent siblings)
  14 siblings, 1 reply; 30+ messages in thread
From: Jiong Wang @ 2019-04-15 17:26 UTC (permalink / raw)
  To: alexei.starovoitov, daniel; +Cc: bpf, netdev, oss-drivers, Jiong Wang

eBPF ISA specification requires high 32-bit cleared when low 32-bit
sub-register is written. This applies to destination register of ALU32 etc.
JIT back-ends must guarantee this semantic when doing code-gen.

x86-64 and arm64 ISA has the same semantic, so the corresponding JIT
back-end doesn't need to do extra work. However, 32-bit arches (arm, nfp
etc.) and some other 64-bit arches (powerpc, sparc etc), need explicit zero
extension sequence to meet such semantic.

This is important, because for code the following:

  u64_value = (u64) u32_value
  ... other uses of u64_value

compiler could exploit the semantic described above and save those zero
extensions for extending u32_value to u64_value. Hardware, runtime, or BPF
JIT back-ends, are responsible for guaranteeing this. Some benchmarks show
~40% sub-register writes out of total insns, meaning ~40% extra code-gen (
could go up to more for some arches which requires two shifts for zero
extension) because JIT back-end needs to do extra code-gen for all such
instructions.

However this is not always necessary in case u32_value is never cast into
a u64, which is quite normal in real life program. So, it would be really
good if we could identify those places where such type cast happened, and
only do zero extensions for them, not for the others. This could save a lot
of BPF code-gen.

Algo:
 - Record indices of instructions that do sub-register def (write). And
   these indices need to stay with reg state so path pruning and bpf
   to bpf function call could be handled properly.

   These indices are kept up to date while doing insn walk.

 - A full register read on an active sub-register def marks the def insn as
   needing zero extension on dst register.

 - A new sub-register write overrides the old one.

   A new full register write makes the register free of zero extension on
   dst register.

 - When propagating read64 during path pruning, also marks def insns whose
   defs are hanging active sub-register.

Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
---
 include/linux/bpf_verifier.h |  6 ++++++
 kernel/bpf/verifier.c        | 45 ++++++++++++++++++++++++++++++++++++++++----
 2 files changed, 47 insertions(+), 4 deletions(-)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index fba0ebb..c1923a5 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -133,6 +133,11 @@ struct bpf_reg_state {
 	 * pointing to bpf_func_state.
 	 */
 	u32 frameno;
+	/* Tracks subreg definition. The stored value is the insn_idx of the
+	 * writing insn. This is safe because subreg_def is used before any insn
+	 * patching which only happens after main verification finished.
+	 */
+	s32 subreg_def;
 	enum bpf_reg_liveness live;
 };
 
@@ -234,6 +239,7 @@ struct bpf_insn_aux_data {
 	int ctx_field_size; /* the ctx field size for load insn, maybe 0 */
 	int sanitize_stack_off; /* stack slot to be cleared */
 	bool seen; /* this insn was processed by the verifier */
+	bool zext_dst; /* this insn zero extend dst reg */
 	u8 alu_state; /* used in combination with alu_limit */
 	unsigned int orig_idx; /* original instruction index */
 };
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 5784b279..d5cc167 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -980,6 +980,7 @@ static void mark_reg_not_init(struct bpf_verifier_env *env,
 	__mark_reg_not_init(regs + regno);
 }
 
+#define DEF_NOT_SUBREG	(-1)
 static void init_reg_state(struct bpf_verifier_env *env,
 			   struct bpf_func_state *state)
 {
@@ -990,6 +991,7 @@ static void init_reg_state(struct bpf_verifier_env *env,
 		mark_reg_not_init(env, regs, i);
 		regs[i].live = REG_LIVE_NONE;
 		regs[i].parent = NULL;
+		regs[i].subreg_def = DEF_NOT_SUBREG;
 	}
 
 	/* frame pointer */
@@ -1259,6 +1261,19 @@ static bool is_reg64(struct bpf_insn *insn, u32 regno,
 	return true;
 }
 
+static void mark_insn_zext(struct bpf_verifier_env *env,
+			   struct bpf_reg_state *reg)
+{
+	s32 def_idx = reg->subreg_def;
+
+	if (def_idx == DEF_NOT_SUBREG)
+		return;
+
+	env->insn_aux_data[def_idx].zext_dst = true;
+	/* The dst will be zero extended, so won't be sub-register anymore. */
+	reg->subreg_def = DEF_NOT_SUBREG;
+}
+
 static int check_reg_arg(struct bpf_verifier_env *env, u32 regno,
 			 enum reg_arg_type t)
 {
@@ -1285,6 +1300,9 @@ static int check_reg_arg(struct bpf_verifier_env *env, u32 regno,
 		if (regno == BPF_REG_FP)
 			return 0;
 
+		if (rw64)
+			mark_insn_zext(env, reg);
+
 		return mark_reg_read(env, reg, reg->parent,
 				     rw64 ? REG_LIVE_READ64 : REG_LIVE_READ32);
 	} else {
@@ -1294,6 +1312,7 @@ static int check_reg_arg(struct bpf_verifier_env *env, u32 regno,
 			return -EACCES;
 		}
 		reg->live |= REG_LIVE_WRITTEN;
+		reg->subreg_def = rw64 ? DEF_NOT_SUBREG : env->insn_idx;
 		if (t == DST_OP)
 			mark_reg_unknown(env, regs, regno);
 	}
@@ -2176,6 +2195,12 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
 						    value_regno);
 				if (reg_type_may_be_null(reg_type))
 					regs[value_regno].id = ++env->id_gen;
+				/* A load of ctx field could have different
+				 * actual load size with the one encoded in the
+				 * insn. When the dst is PTR, it is for sure not
+				 * a sub-register.
+				 */
+				regs[value_regno].subreg_def = DEF_NOT_SUBREG;
 			}
 			regs[value_regno].type = reg_type;
 		}
@@ -3376,6 +3401,9 @@ static int check_helper_call(struct bpf_verifier_env *env, int func_id, int insn
 		check_reg_arg(env, caller_saved[i], DST_OP_NO_MARK);
 	}
 
+	/* helper call must return full 64-bit R0. */
+	regs[BPF_REG_0].subreg_def = DEF_NOT_SUBREG;
+
 	/* update return register (already marked as written above) */
 	if (fn->ret_type == RET_INTEGER) {
 		/* sets type to SCALAR_VALUE */
@@ -4307,6 +4335,7 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
 				 */
 				*dst_reg = *src_reg;
 				dst_reg->live |= REG_LIVE_WRITTEN;
+				dst_reg->subreg_def = DEF_NOT_SUBREG;
 			} else {
 				/* R1 = (u32) R2 */
 				if (is_pointer_value(env, insn->src_reg)) {
@@ -4317,6 +4346,7 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
 				} else if (src_reg->type == SCALAR_VALUE) {
 					*dst_reg = *src_reg;
 					dst_reg->live |= REG_LIVE_WRITTEN;
+					dst_reg->subreg_def = env->insn_idx;
 				} else {
 					mark_reg_unknown(env, regs,
 							 insn->dst_reg);
@@ -5380,6 +5410,8 @@ static int check_ld_abs(struct bpf_verifier_env *env, struct bpf_insn *insn)
 	 * Already marked as written above.
 	 */
 	mark_reg_unknown(env, regs, BPF_REG_0);
+	/* ld_abs load up to 32-bit skb data. */
+	regs[BPF_REG_0].subreg_def = env->insn_idx;
 	return 0;
 }
 
@@ -6319,6 +6351,9 @@ static bool states_equal(struct bpf_verifier_env *env,
 	return true;
 }
 
+/* Return 0 if no propagation happened. Return negative error code if error
+ * happened. Otherwise, return the propagated bits.
+ */
 static int propagate_liveness_reg(struct bpf_verifier_env *env,
 				  struct bpf_reg_state *reg,
 				  struct bpf_reg_state *parent_reg)
@@ -6337,7 +6372,7 @@ static int propagate_liveness_reg(struct bpf_verifier_env *env,
 	if (err)
 		return err;
 
-	return 0;
+	return bits_prop;
 }
 
 /* A write screens off any subsequent reads; but write marks come from the
@@ -6371,8 +6406,10 @@ static int propagate_liveness(struct bpf_verifier_env *env,
 		for (i = frame < vstate->curframe ? BPF_REG_6 : 0; i < BPF_REG_FP; i++) {
 			err = propagate_liveness_reg(env, &state_reg[i],
 						     &parent_reg[i]);
-			if (err)
+			if (err < 0)
 				return err;
+			if (err & REG_LIVE_READ64)
+				mark_insn_zext(env, &parent_reg[i]);
 		}
 
 		/* Propagate stack slots. */
@@ -6382,11 +6419,11 @@ static int propagate_liveness(struct bpf_verifier_env *env,
 			state_reg = &state->stack[i].spilled_ptr;
 			err = propagate_liveness_reg(env, state_reg,
 						     parent_reg);
-			if (err)
+			if (err < 0)
 				return err;
 		}
 	}
-	return err;
+	return 0;
 }
 
 static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v4 bpf-next 03/15] bpf: reduce false alarm by refining helper call arg types
  2019-04-15 17:26 [PATCH v4 bpf-next 00/15] bpf: eliminate zero extensions for sub-register writes Jiong Wang
  2019-04-15 17:26 ` [PATCH v4 bpf-next 01/15] bpf: split read liveness into REG_LIVE_READ64 and REG_LIVE_READ32 Jiong Wang
  2019-04-15 17:26 ` [PATCH v4 bpf-next 02/15] bpf: mark lo32 writes that should be zero extended into hi32 Jiong Wang
@ 2019-04-15 17:26 ` Jiong Wang
  2019-04-15 17:26 ` [PATCH v4 bpf-next 04/15] bpf: insert explicit zero extension insn when hardware doesn't do it implicitly Jiong Wang
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 30+ messages in thread
From: Jiong Wang @ 2019-04-15 17:26 UTC (permalink / raw)
  To: alexei.starovoitov, daniel; +Cc: bpf, netdev, oss-drivers, Jiong Wang

Unlike BPF to BPF function call, BPF helper call is calls to native insns
that verifier can't walk. So, verifier needs helper proto type descriptions
for data-flow purpose.

There is such information already, but it is not differentiate sub-register
read with full register read.

This patch split "enum bpf_arg_type" for sub-register read, and updated
descriptions for several functions that shown frequent usage in one Cilium
benchmark.

"is_reg64" then taught about these new arg types.

Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
---
 include/linux/bpf.h   |  3 +++
 kernel/bpf/core.c     |  2 +-
 kernel/bpf/helpers.c  |  2 +-
 kernel/bpf/verifier.c | 61 +++++++++++++++++++++++++++++++++++++++++++--------
 net/core/filter.c     | 28 +++++++++++------------
 5 files changed, 71 insertions(+), 25 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index f15432d..884b8e1 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -197,9 +197,12 @@ enum bpf_arg_type {
 
 	ARG_CONST_SIZE,		/* number of bytes accessed from memory */
 	ARG_CONST_SIZE_OR_ZERO,	/* number of bytes accessed from memory or 0 */
+	ARG_CONST_SIZE32,	/* Likewise, but size fits into 32-bit */
+	ARG_CONST_SIZE32_OR_ZERO,	/* Ditto */
 
 	ARG_PTR_TO_CTX,		/* pointer to context */
 	ARG_ANYTHING,		/* any (initialized) argument is ok */
+	ARG_ANYTHING32,		/* Likewise, but it is a 32-bit argument */
 	ARG_PTR_TO_SPIN_LOCK,	/* pointer to bpf_spin_lock */
 	ARG_PTR_TO_SOCK_COMMON,	/* pointer to sock_common */
 	ARG_PTR_TO_INT,		/* pointer to int */
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index ace8c22..2792eda 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -2067,7 +2067,7 @@ const struct bpf_func_proto bpf_tail_call_proto = {
 	.ret_type	= RET_VOID,
 	.arg1_type	= ARG_PTR_TO_CTX,
 	.arg2_type	= ARG_CONST_MAP_PTR,
-	.arg3_type	= ARG_ANYTHING,
+	.arg3_type	= ARG_ANYTHING32,
 };
 
 /* Stub for JITs that only support cBPF. eBPF programs are interpreted.
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 4266ffd..039ec8e 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -221,7 +221,7 @@ const struct bpf_func_proto bpf_get_current_comm_proto = {
 	.gpl_only	= false,
 	.ret_type	= RET_INTEGER,
 	.arg1_type	= ARG_PTR_TO_UNINIT_MEM,
-	.arg2_type	= ARG_CONST_SIZE,
+	.arg2_type	= ARG_CONST_SIZE32,
 };
 
 #if defined(CONFIG_QUEUED_SPINLOCKS) || defined(CONFIG_BPF_ARCH_SPINLOCK)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index d5cc167..388a583 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -1180,12 +1180,47 @@ static int mark_reg_read(struct bpf_verifier_env *env,
 	return 0;
 }
 
+static bool helper_call_arg64(struct bpf_verifier_env *env, int func_id,
+			      u32 regno)
+{
+	/* get_func_proto must succeed, other it should have been rejected
+	 * early inside check_helper_call.
+	 */
+	const struct bpf_func_proto *fn =
+		env->ops->get_func_proto(func_id, env->prog);
+	enum bpf_arg_type arg_type;
+
+	switch (regno) {
+	case BPF_REG_1:
+		arg_type = fn->arg1_type;
+		break;
+	case BPF_REG_2:
+		arg_type = fn->arg2_type;
+		break;
+	case BPF_REG_3:
+		arg_type = fn->arg3_type;
+		break;
+	case BPF_REG_4:
+		arg_type = fn->arg4_type;
+		break;
+	case BPF_REG_5:
+		arg_type = fn->arg5_type;
+		break;
+	default:
+		arg_type = ARG_DONTCARE;
+	}
+
+	return arg_type != ARG_CONST_SIZE32 &&
+	       arg_type != ARG_CONST_SIZE32_OR_ZERO &&
+	       arg_type != ARG_ANYTHING32;
+}
+
 /* This function is supposed to be used by the following 32-bit optimization
  * code only. It returns TRUE if the source or destination register operates
  * on 64-bit, otherwise return FALSE.
  */
-static bool is_reg64(struct bpf_insn *insn, u32 regno,
-		     struct bpf_reg_state *reg, enum reg_arg_type t)
+static bool is_reg64(struct bpf_verifier_env *env, struct bpf_insn *insn,
+		     u32 regno, struct bpf_reg_state *reg, enum reg_arg_type t)
 {
 	u8 code, class, op;
 
@@ -1207,9 +1242,12 @@ static bool is_reg64(struct bpf_insn *insn, u32 regno,
 			if (insn->src_reg == BPF_PSEUDO_CALL)
 				return false;
 			/* Helper call will reach here because of arg type
-			 * check. Conservatively marking all args as 64-bit.
+			 * check.
 			 */
-			return true;
+			if (t == SRC_OP)
+				return helper_call_arg64(env, insn->imm, regno);
+
+			return false;
 		}
 	}
 
@@ -1289,7 +1327,7 @@ static int check_reg_arg(struct bpf_verifier_env *env, u32 regno,
 	}
 
 	reg = &regs[regno];
-	rw64 = is_reg64(insn, regno, reg, t);
+	rw64 = is_reg64(env, insn, regno, reg, t);
 	if (t == SRC_OP) {
 		/* check whether register used as source operand can be read */
 		if (reg->type == NOT_INIT) {
@@ -2582,7 +2620,9 @@ static bool arg_type_is_mem_ptr(enum bpf_arg_type type)
 static bool arg_type_is_mem_size(enum bpf_arg_type type)
 {
 	return type == ARG_CONST_SIZE ||
-	       type == ARG_CONST_SIZE_OR_ZERO;
+	       type == ARG_CONST_SIZE_OR_ZERO ||
+	       type == ARG_CONST_SIZE32 ||
+	       type == ARG_CONST_SIZE32_OR_ZERO;
 }
 
 static bool arg_type_is_int_ptr(enum bpf_arg_type type)
@@ -2616,7 +2656,7 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 regno,
 	if (err)
 		return err;
 
-	if (arg_type == ARG_ANYTHING) {
+	if (arg_type == ARG_ANYTHING || arg_type == ARG_ANYTHING32) {
 		if (is_pointer_value(env, regno)) {
 			verbose(env, "R%d leaks addr into helper function\n",
 				regno);
@@ -2639,7 +2679,9 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 regno,
 		    type != expected_type)
 			goto err_type;
 	} else if (arg_type == ARG_CONST_SIZE ||
-		   arg_type == ARG_CONST_SIZE_OR_ZERO) {
+		   arg_type == ARG_CONST_SIZE_OR_ZERO ||
+		   arg_type == ARG_CONST_SIZE32 ||
+		   arg_type == ARG_CONST_SIZE32_OR_ZERO) {
 		expected_type = SCALAR_VALUE;
 		if (type != expected_type)
 			goto err_type;
@@ -2739,7 +2781,8 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 regno,
 					      meta->map_ptr->value_size, false,
 					      meta);
 	} else if (arg_type_is_mem_size(arg_type)) {
-		bool zero_size_allowed = (arg_type == ARG_CONST_SIZE_OR_ZERO);
+		bool zero_size_allowed = (arg_type == ARG_CONST_SIZE_OR_ZERO ||
+					  arg_type == ARG_CONST_SIZE32_OR_ZERO);
 
 		/* remember the mem_size which may be used later
 		 * to refine return values.
diff --git a/net/core/filter.c b/net/core/filter.c
index 95a27fd..d3c7200 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -1694,9 +1694,9 @@ static const struct bpf_func_proto bpf_skb_store_bytes_proto = {
 	.gpl_only	= false,
 	.ret_type	= RET_INTEGER,
 	.arg1_type	= ARG_PTR_TO_CTX,
-	.arg2_type	= ARG_ANYTHING,
+	.arg2_type	= ARG_ANYTHING32,
 	.arg3_type	= ARG_PTR_TO_MEM,
-	.arg4_type	= ARG_CONST_SIZE,
+	.arg4_type	= ARG_CONST_SIZE32,
 	.arg5_type	= ARG_ANYTHING,
 };
 
@@ -1725,9 +1725,9 @@ static const struct bpf_func_proto bpf_skb_load_bytes_proto = {
 	.gpl_only	= false,
 	.ret_type	= RET_INTEGER,
 	.arg1_type	= ARG_PTR_TO_CTX,
-	.arg2_type	= ARG_ANYTHING,
+	.arg2_type	= ARG_ANYTHING32,
 	.arg3_type	= ARG_PTR_TO_UNINIT_MEM,
-	.arg4_type	= ARG_CONST_SIZE,
+	.arg4_type	= ARG_CONST_SIZE32,
 };
 
 BPF_CALL_5(bpf_skb_load_bytes_relative, const struct sk_buff *, skb,
@@ -1876,7 +1876,7 @@ static const struct bpf_func_proto bpf_l3_csum_replace_proto = {
 	.gpl_only	= false,
 	.ret_type	= RET_INTEGER,
 	.arg1_type	= ARG_PTR_TO_CTX,
-	.arg2_type	= ARG_ANYTHING,
+	.arg2_type	= ARG_ANYTHING32,
 	.arg3_type	= ARG_ANYTHING,
 	.arg4_type	= ARG_ANYTHING,
 	.arg5_type	= ARG_ANYTHING,
@@ -1929,7 +1929,7 @@ static const struct bpf_func_proto bpf_l4_csum_replace_proto = {
 	.gpl_only	= false,
 	.ret_type	= RET_INTEGER,
 	.arg1_type	= ARG_PTR_TO_CTX,
-	.arg2_type	= ARG_ANYTHING,
+	.arg2_type	= ARG_ANYTHING32,
 	.arg3_type	= ARG_ANYTHING,
 	.arg4_type	= ARG_ANYTHING,
 	.arg5_type	= ARG_ANYTHING,
@@ -1968,9 +1968,9 @@ static const struct bpf_func_proto bpf_csum_diff_proto = {
 	.pkt_access	= true,
 	.ret_type	= RET_INTEGER,
 	.arg1_type	= ARG_PTR_TO_MEM_OR_NULL,
-	.arg2_type	= ARG_CONST_SIZE_OR_ZERO,
+	.arg2_type	= ARG_CONST_SIZE32_OR_ZERO,
 	.arg3_type	= ARG_PTR_TO_MEM_OR_NULL,
-	.arg4_type	= ARG_CONST_SIZE_OR_ZERO,
+	.arg4_type	= ARG_CONST_SIZE32_OR_ZERO,
 	.arg5_type	= ARG_ANYTHING,
 };
 
@@ -2151,7 +2151,7 @@ static const struct bpf_func_proto bpf_redirect_proto = {
 	.func           = bpf_redirect,
 	.gpl_only       = false,
 	.ret_type       = RET_INTEGER,
-	.arg1_type      = ARG_ANYTHING,
+	.arg1_type      = ARG_ANYTHING32,
 	.arg2_type      = ARG_ANYTHING,
 };
 
@@ -2929,7 +2929,7 @@ static const struct bpf_func_proto bpf_skb_change_proto_proto = {
 	.gpl_only	= false,
 	.ret_type	= RET_INTEGER,
 	.arg1_type	= ARG_PTR_TO_CTX,
-	.arg2_type	= ARG_ANYTHING,
+	.arg2_type	= ARG_ANYTHING32,
 	.arg3_type	= ARG_ANYTHING,
 };
 
@@ -2949,7 +2949,7 @@ static const struct bpf_func_proto bpf_skb_change_type_proto = {
 	.gpl_only	= false,
 	.ret_type	= RET_INTEGER,
 	.arg1_type	= ARG_PTR_TO_CTX,
-	.arg2_type	= ARG_ANYTHING,
+	.arg2_type	= ARG_ANYTHING32,
 };
 
 static u32 bpf_skb_net_base_len(const struct sk_buff *skb)
@@ -3241,7 +3241,7 @@ static const struct bpf_func_proto bpf_skb_change_tail_proto = {
 	.gpl_only	= false,
 	.ret_type	= RET_INTEGER,
 	.arg1_type	= ARG_PTR_TO_CTX,
-	.arg2_type	= ARG_ANYTHING,
+	.arg2_type	= ARG_ANYTHING32,
 	.arg3_type	= ARG_ANYTHING,
 };
 
@@ -3837,7 +3837,7 @@ static const struct bpf_func_proto bpf_skb_get_tunnel_key_proto = {
 	.ret_type	= RET_INTEGER,
 	.arg1_type	= ARG_PTR_TO_CTX,
 	.arg2_type	= ARG_PTR_TO_UNINIT_MEM,
-	.arg3_type	= ARG_CONST_SIZE,
+	.arg3_type	= ARG_CONST_SIZE32,
 	.arg4_type	= ARG_ANYTHING,
 };
 
@@ -3946,7 +3946,7 @@ static const struct bpf_func_proto bpf_skb_set_tunnel_key_proto = {
 	.ret_type	= RET_INTEGER,
 	.arg1_type	= ARG_PTR_TO_CTX,
 	.arg2_type	= ARG_PTR_TO_MEM,
-	.arg3_type	= ARG_CONST_SIZE,
+	.arg3_type	= ARG_CONST_SIZE32,
 	.arg4_type	= ARG_ANYTHING,
 };
 
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v4 bpf-next 04/15] bpf: insert explicit zero extension insn when hardware doesn't do it implicitly
  2019-04-15 17:26 [PATCH v4 bpf-next 00/15] bpf: eliminate zero extensions for sub-register writes Jiong Wang
                   ` (2 preceding siblings ...)
  2019-04-15 17:26 ` [PATCH v4 bpf-next 03/15] bpf: reduce false alarm by refining helper call arg types Jiong Wang
@ 2019-04-15 17:26 ` Jiong Wang
  2019-04-15 17:26 ` [PATCH v4 bpf-next 05/15] bpf: introduce new bpf prog load flags "BPF_F_TEST_RND_HI32" Jiong Wang
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 30+ messages in thread
From: Jiong Wang @ 2019-04-15 17:26 UTC (permalink / raw)
  To: alexei.starovoitov, daniel; +Cc: bpf, netdev, oss-drivers, Jiong Wang

After previous patches, verifier has marked those instructions that really
need zero extension on dst_reg.

It is then for all back-ends to decide how to use such information to
eliminate unnecessary zero extension code-gen during JIT compilation.

One approach is:
  1. Verifier insert explicit zero extension for those instructions that
     need zero extension.
  2. All JIT back-ends do NOT generate zero extension for sub-register
     write any more.

The good thing for this approach is no major change on JIT back-end
interface, all back-ends could get this optimization.

However, only those back-ends that do not have hardware zero extension
want this optimization. For back-ends like x86_64 and AArch64, there is
hardware support, so zext insertion should be disabled.

This patch introduces new target hook "bpf_jit_hardware_zext" which is
default true, meaning the underlying hardware will do zero extension
implicitly, therefore zext insertion by verifier will be disabled. Once a
back-end overrides this hook to false, then verifier will insert zext
sequence to clear high 32-bit of definitions when necessary.

Offload targets do not use this native target hook, instead, they could
get the optimization results using bpf_prog_offload_ops.finalize.

Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
---
 include/linux/bpf.h    |  1 +
 include/linux/filter.h |  1 +
 kernel/bpf/core.c      |  8 +++++
 kernel/bpf/verifier.c  | 87 +++++++++++++++++++++++++++++++++++++++++++++++++-
 4 files changed, 96 insertions(+), 1 deletion(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 884b8e1..bdab6e7 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -368,6 +368,7 @@ struct bpf_prog_aux {
 	u32 id;
 	u32 func_cnt; /* used by non-func prog as the number of func progs */
 	u32 func_idx; /* 0 for non-func prog, the index in func array for func prog */
+	bool no_verifier_zext; /* No zero extension insertion by verifier. */
 	bool offload_requested;
 	struct bpf_prog **func;
 	void *jit_data; /* JIT specific data. arch dependent */
diff --git a/include/linux/filter.h b/include/linux/filter.h
index fb0edad..8750657 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -821,6 +821,7 @@ u64 __bpf_call_base(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5);
 
 struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog);
 void bpf_jit_compile(struct bpf_prog *prog);
+bool bpf_jit_hardware_zext(void);
 bool bpf_helper_changes_pkt_data(void *func);
 
 static inline bool bpf_dump_raw_ok(void)
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 2792eda..1c54274 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -2091,6 +2091,14 @@ bool __weak bpf_helper_changes_pkt_data(void *func)
 	return false;
 }
 
+/* Return TRUE is the target hardware of JIT will do zero extension to high bits
+ * when writing to low 32-bit of one register. Otherwise, return FALSE.
+ */
+bool __weak bpf_jit_hardware_zext(void)
+{
+	return true;
+}
+
 /* To execute LD_ABS/LD_IND instructions __bpf_prog_run() may call
  * skb_copy_bits(), so provide a weak definition of it for NET-less config.
  */
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 388a583..33d7e54 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -7555,6 +7555,80 @@ static int opt_remove_nops(struct bpf_verifier_env *env)
 	return 0;
 }
 
+static int opt_subreg_zext_lo32(struct bpf_verifier_env *env)
+{
+	struct bpf_insn_aux_data orig_aux, *aux = env->insn_aux_data;
+	struct bpf_insn *insns = env->prog->insnsi;
+	int i, delta = 0, len = env->prog->len;
+	struct bpf_insn zext_patch[3];
+	struct bpf_prog *new_prog;
+
+	zext_patch[1] = BPF_ALU64_IMM(BPF_LSH, 0, 32);
+	zext_patch[2] = BPF_ALU64_IMM(BPF_RSH, 0, 32);
+	for (i = 0; i < len; i++) {
+		int adj_idx = i + delta;
+		struct bpf_insn insn;
+
+		if (!aux[adj_idx].zext_dst)
+			continue;
+
+		insn = insns[adj_idx];
+		/* "adjust_insn_aux_data" only retains the original insn aux
+		 * data if insn at patched offset is at the end of the patch
+		 * buffer. That is to say, given the following insn sequence:
+		 *
+		 *   insn 1
+		 *   insn 2
+		 *   insn 3
+		 *
+		 * if the patch offset is at insn 2, then the patch buffer must
+		 * be the following that original insn aux data can be retained.
+		 *
+		 *   {lshift, rshift, insn2}
+		 *
+		 * However, zero extension needs to be inserted after insn2, so
+		 * insn patch buffer needs to be the following:
+		 *
+		 *   {insn2, lshift, rshift}
+		 *
+		 * which would cause insn aux data of insn2 lost and that data
+		 * is critical for ctx field load instruction transformed
+		 * correctly later inside "convert_ctx_accesses".
+		 *
+		 * The simplest way to fix this to build the following patch
+		 * buffer:
+		 *
+		 *   {lshift, rshift, insn-next-to-insn2}
+		 *
+		 * Given insn2 defines a value, it can't be a JMP, hence there
+		 * must be a next insn for it otherwise CFG check should have
+		 * rejected this program. However, insn-next-to-insn2 could
+		 * be a JMP and verifier insn patch infrastructure doesn't
+		 * support adjust offset for JMP inside patch buffer. We would
+		 * end up with a few insn check and offset adj code outside of
+		 * the generic insn patch helpers if we go with this approach.
+		 *
+		 * Therefore, we still use {insn2, lshift, rshift} as the patch
+		 * buffer, we copy and restore insn aux data for insn2
+		 * explicitly. The change looks simpler and smaller.
+		 */
+		zext_patch[0] = insns[adj_idx];
+		zext_patch[1].dst_reg = insn.dst_reg;
+		zext_patch[2].dst_reg = insn.dst_reg;
+		memcpy(&orig_aux, &aux[adj_idx], sizeof(orig_aux));
+		new_prog = bpf_patch_insn_data(env, adj_idx, zext_patch, 3);
+		if (!new_prog)
+			return -ENOMEM;
+		env->prog = new_prog;
+		insns = new_prog->insnsi;
+		aux = env->insn_aux_data;
+		memcpy(&aux[adj_idx], &orig_aux, sizeof(orig_aux));
+		delta += 2;
+	}
+
+	return 0;
+}
+
 /* convert load instructions that access fields of a context type into a
  * sequence of instructions that access fields of the underlying structure:
  *     struct __sk_buff    -> struct sk_buff
@@ -8386,7 +8460,18 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr,
 	if (ret == 0)
 		ret = check_max_stack_depth(env);
 
-	/* instruction rewrites happen after this point */
+	/* Instruction rewrites happen after this point.
+	 * For offload target, finalize hook has all aux insn info, do any
+	 * customized work there.
+	 */
+	if (ret == 0 && !bpf_jit_hardware_zext() &&
+	    !bpf_prog_is_dev_bound(env->prog->aux)) {
+		ret = opt_subreg_zext_lo32(env);
+		env->prog->aux->no_verifier_zext = !!ret;
+	} else {
+		env->prog->aux->no_verifier_zext = true;
+	}
+
 	if (is_priv) {
 		if (ret == 0)
 			opt_hard_wire_dead_code_branches(env);
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v4 bpf-next 05/15] bpf: introduce new bpf prog load flags "BPF_F_TEST_RND_HI32"
  2019-04-15 17:26 [PATCH v4 bpf-next 00/15] bpf: eliminate zero extensions for sub-register writes Jiong Wang
                   ` (3 preceding siblings ...)
  2019-04-15 17:26 ` [PATCH v4 bpf-next 04/15] bpf: insert explicit zero extension insn when hardware doesn't do it implicitly Jiong Wang
@ 2019-04-15 17:26 ` Jiong Wang
  2019-04-15 17:26 ` [PATCH v4 bpf-next 06/15] bpf: randomize high 32-bit when BPF_F_TEST_RND_HI32 is set Jiong Wang
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 30+ messages in thread
From: Jiong Wang @ 2019-04-15 17:26 UTC (permalink / raw)
  To: alexei.starovoitov, daniel; +Cc: bpf, netdev, oss-drivers, Jiong Wang

x86_64 and AArch64 perhaps are two arches that running bpf testsuite
frequently, however the zero extension insertion pass is not enabled for
them because of their hardware support.

It is critical to guarantee the pass correction as it is supposed to be
enabled at default for a couple of other arches, for example PowerPC,
SPARC, arm, NFP etc. Therefore, it would be very useful if there is a way
to test this pass on for example x86_64.

The test methodology employed by this set is "poisoning" useless bits. High
32-bit of a definition is randomized if it is identified as not used by any
later instructions. Such randomization is only enabled under testing mode
which is gated by the new bpf prog load flags "BPF_F_TEST_RND_HI32".

Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
---
 include/uapi/linux/bpf.h       | 18 ++++++++++++++++++
 kernel/bpf/syscall.c           |  4 +++-
 tools/include/uapi/linux/bpf.h | 18 ++++++++++++++++++
 3 files changed, 39 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index c26be24..89c85a3 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -258,6 +258,24 @@ enum bpf_attach_type {
  */
 #define BPF_F_ANY_ALIGNMENT	(1U << 1)
 
+/* BPF_F_TEST_RND_HI32 is used in BPF_PROG_LOAD command for testing purpose.
+ * Verifier does sub-register def/use analysis and identifies instructions whose
+ * def only matters for low 32-bit, high 32-bit is never referenced later
+ * through implicit zero extension. Therefore verifier notifies JIT back-ends
+ * that it is safe to ignore clearing high 32-bit for these instructions. This
+ * saves some back-ends a lot of code-gen. However such optimization is not
+ * necessary on some arches, for example x86_64, arm64 etc, whose JIT back-ends
+ * hence hasn't used verifier's analysis result. But, we really want to have a
+ * way to be able to verify the correctness of the described optimization on
+ * x86_64 on which testsuites are frequently exercised.
+ *
+ * So, this flag is introduced. Once it is set, verifier will randomize high
+ * 32-bit for those instructions who has been identified as safe to ignore them.
+ * Then, if verifier is not doing correct analysis, such randomization will
+ * regress tests to expose bugs.
+ */
+#define BPF_F_TEST_RND_HI32	(1U << 2)
+
 /* When BPF ldimm64's insn[0].src_reg != 0 then this can have
  * two extensions:
  *
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 92c9b8a..abe2804 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1600,7 +1600,9 @@ static int bpf_prog_load(union bpf_attr *attr, union bpf_attr __user *uattr)
 	if (CHECK_ATTR(BPF_PROG_LOAD))
 		return -EINVAL;
 
-	if (attr->prog_flags & ~(BPF_F_STRICT_ALIGNMENT | BPF_F_ANY_ALIGNMENT))
+	if (attr->prog_flags & ~(BPF_F_STRICT_ALIGNMENT |
+				 BPF_F_ANY_ALIGNMENT |
+				 BPF_F_TEST_RND_HI32))
 		return -EINVAL;
 
 	if (!IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) &&
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index c26be24..89c85a3 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -258,6 +258,24 @@ enum bpf_attach_type {
  */
 #define BPF_F_ANY_ALIGNMENT	(1U << 1)
 
+/* BPF_F_TEST_RND_HI32 is used in BPF_PROG_LOAD command for testing purpose.
+ * Verifier does sub-register def/use analysis and identifies instructions whose
+ * def only matters for low 32-bit, high 32-bit is never referenced later
+ * through implicit zero extension. Therefore verifier notifies JIT back-ends
+ * that it is safe to ignore clearing high 32-bit for these instructions. This
+ * saves some back-ends a lot of code-gen. However such optimization is not
+ * necessary on some arches, for example x86_64, arm64 etc, whose JIT back-ends
+ * hence hasn't used verifier's analysis result. But, we really want to have a
+ * way to be able to verify the correctness of the described optimization on
+ * x86_64 on which testsuites are frequently exercised.
+ *
+ * So, this flag is introduced. Once it is set, verifier will randomize high
+ * 32-bit for those instructions who has been identified as safe to ignore them.
+ * Then, if verifier is not doing correct analysis, such randomization will
+ * regress tests to expose bugs.
+ */
+#define BPF_F_TEST_RND_HI32	(1U << 2)
+
 /* When BPF ldimm64's insn[0].src_reg != 0 then this can have
  * two extensions:
  *
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v4 bpf-next 06/15] bpf: randomize high 32-bit when BPF_F_TEST_RND_HI32 is set
  2019-04-15 17:26 [PATCH v4 bpf-next 00/15] bpf: eliminate zero extensions for sub-register writes Jiong Wang
                   ` (4 preceding siblings ...)
  2019-04-15 17:26 ` [PATCH v4 bpf-next 05/15] bpf: introduce new bpf prog load flags "BPF_F_TEST_RND_HI32" Jiong Wang
@ 2019-04-15 17:26 ` Jiong Wang
  2019-04-15 17:26 ` [PATCH v4 bpf-next 07/15] libbpf: add "prog_flags" to bpf_program/bpf_prog_load_attr/bpf_load_program_attr Jiong Wang
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 30+ messages in thread
From: Jiong Wang @ 2019-04-15 17:26 UTC (permalink / raw)
  To: alexei.starovoitov, daniel; +Cc: bpf, netdev, oss-drivers, Jiong Wang

This patch randomizes high 32-bit of a definition when BPF_F_TEST_RND_HI32
is set.

It does this once the flag set no matter there is hardware zero extension
support or not. Because this is a test feature and we want to deliver the
most stressful test.

Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
---
 kernel/bpf/verifier.c | 85 ++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 68 insertions(+), 17 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 33d7e54..03c4443 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -7555,24 +7555,70 @@ static int opt_remove_nops(struct bpf_verifier_env *env)
 	return 0;
 }
 
-static int opt_subreg_zext_lo32(struct bpf_verifier_env *env)
+static int opt_subreg_zext_lo32_rnd_hi32(struct bpf_verifier_env *env,
+					 const union bpf_attr *attr)
 {
 	struct bpf_insn_aux_data orig_aux, *aux = env->insn_aux_data;
+	struct bpf_insn *patch, zext_patch[3], rnd_hi32_patch[4];
+	int i, patch_len, delta = 0, len = env->prog->len;
 	struct bpf_insn *insns = env->prog->insnsi;
-	int i, delta = 0, len = env->prog->len;
-	struct bpf_insn zext_patch[3];
 	struct bpf_prog *new_prog;
+	bool rnd_hi32;
+
+	rnd_hi32 = attr->prog_flags & BPF_F_TEST_RND_HI32;
 
 	zext_patch[1] = BPF_ALU64_IMM(BPF_LSH, 0, 32);
 	zext_patch[2] = BPF_ALU64_IMM(BPF_RSH, 0, 32);
+	rnd_hi32_patch[1] = BPF_ALU64_IMM(BPF_MOV, BPF_REG_AX, 0);
+	rnd_hi32_patch[2] = BPF_ALU64_IMM(BPF_LSH, BPF_REG_AX, 32);
+	rnd_hi32_patch[3] = BPF_ALU64_REG(BPF_OR, 0, BPF_REG_AX);
 	for (i = 0; i < len; i++) {
 		int adj_idx = i + delta;
 		struct bpf_insn insn;
 
-		if (!aux[adj_idx].zext_dst)
+		insn = insns[adj_idx];
+		if (!aux[adj_idx].zext_dst) {
+			u8 code, class;
+			u32 imm_rnd;
+
+			if (!rnd_hi32)
+				continue;
+
+			code = insn.code;
+			class = BPF_CLASS(code);
+			/* Insns doesn't define any value. */
+			if (class == BPF_JMP || class == BPF_JMP32 ||
+			    class == BPF_STX || class == BPF_ST)
+				continue;
+
+			/* NOTE: arg "reg" is only used for BPF_STX, as it has
+			 *       been ruled out in above check, it is safe to
+			 *       pass NULL here.
+			 */
+			if (is_reg64(env, &insn, insn.dst_reg, NULL, DST_OP)) {
+				if (class == BPF_LD &&
+				    BPF_MODE(code) == BPF_IMM)
+					i++;
+				continue;
+			}
+
+			/* ctx load could be transformed into wider load. */
+			if (class == BPF_LDX &&
+			    aux[adj_idx].ptr_type == PTR_TO_CTX)
+				continue;
+
+			imm_rnd = get_random_int();
+			rnd_hi32_patch[0] = insns[adj_idx];
+			rnd_hi32_patch[1].imm = imm_rnd;
+			rnd_hi32_patch[3].dst_reg = insn.dst_reg;
+			patch = rnd_hi32_patch;
+			patch_len = 4;
+			goto apply_patch_buffer;
+		}
+
+		if (bpf_jit_hardware_zext())
 			continue;
 
-		insn = insns[adj_idx];
 		/* "adjust_insn_aux_data" only retains the original insn aux
 		 * data if insn at patched offset is at the end of the patch
 		 * buffer. That is to say, given the following insn sequence:
@@ -7615,15 +7661,18 @@ static int opt_subreg_zext_lo32(struct bpf_verifier_env *env)
 		zext_patch[0] = insns[adj_idx];
 		zext_patch[1].dst_reg = insn.dst_reg;
 		zext_patch[2].dst_reg = insn.dst_reg;
+		patch = zext_patch;
+		patch_len = 3;
+apply_patch_buffer:
 		memcpy(&orig_aux, &aux[adj_idx], sizeof(orig_aux));
-		new_prog = bpf_patch_insn_data(env, adj_idx, zext_patch, 3);
+		new_prog = bpf_patch_insn_data(env, adj_idx, patch, patch_len);
 		if (!new_prog)
 			return -ENOMEM;
 		env->prog = new_prog;
 		insns = new_prog->insnsi;
 		aux = env->insn_aux_data;
 		memcpy(&aux[adj_idx], &orig_aux, sizeof(orig_aux));
-		delta += 2;
+		delta += patch_len - 1;
 	}
 
 	return 0;
@@ -8460,16 +8509,18 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr,
 	if (ret == 0)
 		ret = check_max_stack_depth(env);
 
-	/* Instruction rewrites happen after this point.
-	 * For offload target, finalize hook has all aux insn info, do any
-	 * customized work there.
-	 */
-	if (ret == 0 && !bpf_jit_hardware_zext() &&
-	    !bpf_prog_is_dev_bound(env->prog->aux)) {
-		ret = opt_subreg_zext_lo32(env);
-		env->prog->aux->no_verifier_zext = !!ret;
-	} else {
-		env->prog->aux->no_verifier_zext = true;
+	/* Instruction rewrites happen after this point. */
+	if (ret == 0) {
+		if (bpf_prog_is_dev_bound(env->prog->aux)) {
+			/* For offload target, finalize hook has all aux insn
+			 * info, copy the analysis result at there.
+			 */
+			env->prog->aux->no_verifier_zext = true;
+		} else {
+			ret = opt_subreg_zext_lo32_rnd_hi32(env, attr);
+			env->prog->aux->no_verifier_zext =
+				bpf_jit_hardware_zext() ? true : !!ret;
+		}
 	}
 
 	if (is_priv) {
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v4 bpf-next 07/15] libbpf: add "prog_flags" to bpf_program/bpf_prog_load_attr/bpf_load_program_attr
  2019-04-15 17:26 [PATCH v4 bpf-next 00/15] bpf: eliminate zero extensions for sub-register writes Jiong Wang
                   ` (5 preceding siblings ...)
  2019-04-15 17:26 ` [PATCH v4 bpf-next 06/15] bpf: randomize high 32-bit when BPF_F_TEST_RND_HI32 is set Jiong Wang
@ 2019-04-15 17:26 ` Jiong Wang
  2019-04-15 17:26 ` [PATCH v4 bpf-next 08/15] selftests: enable hi32 randomization for all tests Jiong Wang
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 30+ messages in thread
From: Jiong Wang @ 2019-04-15 17:26 UTC (permalink / raw)
  To: alexei.starovoitov, daniel; +Cc: bpf, netdev, oss-drivers, Jiong Wang

libbpf doesn't allow passing "prog_flags" during bpf program load in a
couple of load related APIs, "bpf_load_program_xattr", "load_program" and
"bpf_prog_load_xattr".

It makes sense to allow passing "prog_flags" which is useful for
customizing program loading.

Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
---
 tools/lib/bpf/bpf.c    | 1 +
 tools/lib/bpf/bpf.h    | 1 +
 tools/lib/bpf/libbpf.c | 3 +++
 tools/lib/bpf/libbpf.h | 1 +
 4 files changed, 6 insertions(+)

diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index 955191c..f79ec49 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -254,6 +254,7 @@ int bpf_load_program_xattr(const struct bpf_load_program_attr *load_attr,
 	if (load_attr->name)
 		memcpy(attr.prog_name, load_attr->name,
 		       min(strlen(load_attr->name), BPF_OBJ_NAME_LEN - 1));
+	attr.prog_flags = load_attr->prog_flags;
 
 	fd = sys_bpf_prog_load(&attr, sizeof(attr));
 	if (fd >= 0)
diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
index bc30783..a983442 100644
--- a/tools/lib/bpf/bpf.h
+++ b/tools/lib/bpf/bpf.h
@@ -86,6 +86,7 @@ struct bpf_load_program_attr {
 	const void *line_info;
 	__u32 line_info_cnt;
 	__u32 log_level;
+	__u32 prog_flags;
 };
 
 /* Flags to direct loading requirements */
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index e5b77ad..e0affd0 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -182,6 +182,7 @@ struct bpf_program {
 	void *line_info;
 	__u32 line_info_rec_size;
 	__u32 line_info_cnt;
+	__u32 prog_flags;
 };
 
 enum libbpf_map_type {
@@ -1876,6 +1877,7 @@ load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt,
 	load_attr.line_info_rec_size = prog->line_info_rec_size;
 	load_attr.line_info_cnt = prog->line_info_cnt;
 	load_attr.log_level = prog->log_level;
+	load_attr.prog_flags = prog->prog_flags;
 	if (!load_attr.insns || !load_attr.insns_cnt)
 		return -EINVAL;
 
@@ -3320,6 +3322,7 @@ int bpf_prog_load_xattr(const struct bpf_prog_load_attr *attr,
 						      expected_attach_type);
 
 		prog->log_level = attr->log_level;
+		prog->prog_flags = attr->prog_flags;
 		if (!first_prog)
 			first_prog = prog;
 	}
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index c5ff005..5abc237 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -320,6 +320,7 @@ struct bpf_prog_load_attr {
 	enum bpf_attach_type expected_attach_type;
 	int ifindex;
 	int log_level;
+	int prog_flags;
 };
 
 LIBBPF_API int bpf_prog_load_xattr(const struct bpf_prog_load_attr *attr,
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v4 bpf-next 08/15] selftests: enable hi32 randomization for all tests
  2019-04-15 17:26 [PATCH v4 bpf-next 00/15] bpf: eliminate zero extensions for sub-register writes Jiong Wang
                   ` (6 preceding siblings ...)
  2019-04-15 17:26 ` [PATCH v4 bpf-next 07/15] libbpf: add "prog_flags" to bpf_program/bpf_prog_load_attr/bpf_load_program_attr Jiong Wang
@ 2019-04-15 17:26 ` Jiong Wang
  2019-04-15 17:26 ` [PATCH v4 bpf-next 09/15] arm: bpf: eliminate zero extension code-gen Jiong Wang
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 30+ messages in thread
From: Jiong Wang @ 2019-04-15 17:26 UTC (permalink / raw)
  To: alexei.starovoitov, daniel; +Cc: bpf, netdev, oss-drivers, Jiong Wang

The previous libbpf patch allows user to specify "prog_flags" to bpf
program load APIs. To enable high 32-bit randomization for a test, we need
to set BPF_F_TEST_RND_HI32 in "prog_flags".

To enable such randomization for all tests, we need to make sure all places
are passing BPF_F_TEST_RND_HI32. Changing them one by one is not
convenient, also, it would be better if a test could be switched to
"normal" running mode without code change.

Given the program load APIs used across bpf selftests are mostly:
  bpf_prog_load:      load from file
  bpf_load_program:   load from raw insns

A test_stub.c is implemented for bpf seltests, it offers two functions for
testing purpose:

  bpf_prog_test_load
  bpf_test_load_program

The are the same as "bpf_prog_load" and "bpf_load_program", except they
also set BPF_F_TEST_RND_HI32. Given *_xattr functions are the APIs to
customize any "prog_flags", it makes little sense to put these two
functions into libbpf.

Then, the following CFLAGS are passed to compilations for host programs:
  -Dbpf_prog_load=bpf_prog_test_load
  -Dbpf_load_program=bpf_test_load_program

They migrate the used load APIs to the test version, hence enable high
32-bit randomization for these tests without changing source code.

"test_verifier" is using bpf_verify_program which supports passing
"prog_flags" already, so we are fine. But, two unit tests needs to be
adjusted due to there will be 16-bit jump distance overflow:
  - "ld_abs: vlan + abs, test1"
    The function "bpf_fill_ld_abs_vlan_push_pop" inside test_verifier.c
    needs to use ALU64 to avoid insertion of hi32 randomization sequence
    that would overflow the sequence.
  - bpf_fill_jump_around_ld_abs needs to consider hi32 randomization to the
    load dst of ld_abs.

Besides all these, there are several testcases are using
"bpf_prog_load_attr" directly, their call sites are updated to pass
BPF_F_TEST_RND_HI32.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
---
 tools/testing/selftests/bpf/Makefile               | 10 +++---
 .../selftests/bpf/prog_tests/bpf_verif_scale.c     |  1 +
 tools/testing/selftests/bpf/test_sock_addr.c       |  1 +
 tools/testing/selftests/bpf/test_sock_fields.c     |  1 +
 tools/testing/selftests/bpf/test_socket_cookie.c   |  1 +
 tools/testing/selftests/bpf/test_stub.c            | 40 ++++++++++++++++++++++
 tools/testing/selftests/bpf/test_verifier.c        |  6 ++--
 7 files changed, 53 insertions(+), 7 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/test_stub.c

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index f9d83ba..f2accf6 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -15,7 +15,9 @@ LLC		?= llc
 LLVM_OBJCOPY	?= llvm-objcopy
 LLVM_READELF	?= llvm-readelf
 BTF_PAHOLE	?= pahole
-CFLAGS += -Wall -O2 -I$(APIDIR) -I$(LIBDIR) -I$(BPFDIR) -I$(GENDIR) $(GENFLAGS) -I../../../include
+CFLAGS += -Wall -O2 -I$(APIDIR) -I$(LIBDIR) -I$(BPFDIR) -I$(GENDIR) $(GENFLAGS) -I../../../include \
+	  -Dbpf_prog_load=bpf_prog_test_load \
+	  -Dbpf_load_program=bpf_test_load_program
 LDLIBS += -lcap -lelf -lrt -lpthread
 
 # Order correspond to 'make run_tests' order
@@ -76,9 +78,9 @@ $(OUTPUT)/urandom_read: $(OUTPUT)/%: %.c
 
 BPFOBJ := $(OUTPUT)/libbpf.a
 
-$(TEST_GEN_PROGS): $(BPFOBJ)
+$(TEST_GEN_PROGS): test_stub.o $(BPFOBJ)
 
-$(TEST_GEN_PROGS_EXTENDED): $(OUTPUT)/libbpf.a
+$(TEST_GEN_PROGS_EXTENDED): test_stub.o $(OUTPUT)/libbpf.a
 
 $(OUTPUT)/test_dev_cgroup: cgroup_helpers.c
 $(OUTPUT)/test_skb_cgroup_id_user: cgroup_helpers.c
@@ -174,7 +176,7 @@ $(ALU32_BUILD_DIR)/test_progs_32: test_progs.c $(OUTPUT)/libbpf.a\
 						$(ALU32_BUILD_DIR)/urandom_read
 	$(CC) $(TEST_PROGS_CFLAGS) $(CFLAGS) \
 		-o $(ALU32_BUILD_DIR)/test_progs_32 \
-		test_progs.c trace_helpers.c prog_tests/*.c \
+		test_progs.c test_stub.c trace_helpers.c prog_tests/*.c \
 		$(OUTPUT)/libbpf.a $(LDLIBS)
 
 $(ALU32_BUILD_DIR)/test_progs_32: $(PROG_TESTS_H)
diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c b/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c
index 23b159d..2623d15 100644
--- a/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c
+++ b/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c
@@ -22,6 +22,7 @@ static int check_load(const char *file)
 	attr.file = file;
 	attr.prog_type = BPF_PROG_TYPE_SCHED_CLS;
 	attr.log_level = 4;
+	attr.prog_flags = BPF_F_TEST_RND_HI32;
 	err = bpf_prog_load_xattr(&attr, &obj, &prog_fd);
 	bpf_object__close(obj);
 	if (err)
diff --git a/tools/testing/selftests/bpf/test_sock_addr.c b/tools/testing/selftests/bpf/test_sock_addr.c
index 3f110ea..5d0c4f0 100644
--- a/tools/testing/selftests/bpf/test_sock_addr.c
+++ b/tools/testing/selftests/bpf/test_sock_addr.c
@@ -745,6 +745,7 @@ static int load_path(const struct sock_addr_test *test, const char *path)
 	attr.file = path;
 	attr.prog_type = BPF_PROG_TYPE_CGROUP_SOCK_ADDR;
 	attr.expected_attach_type = test->expected_attach_type;
+	attr.prog_flags = BPF_F_TEST_RND_HI32;
 
 	if (bpf_prog_load_xattr(&attr, &obj, &prog_fd)) {
 		if (test->expected_result != LOAD_REJECT)
diff --git a/tools/testing/selftests/bpf/test_sock_fields.c b/tools/testing/selftests/bpf/test_sock_fields.c
index dcae7f6..f08c8ee 100644
--- a/tools/testing/selftests/bpf/test_sock_fields.c
+++ b/tools/testing/selftests/bpf/test_sock_fields.c
@@ -339,6 +339,7 @@ int main(int argc, char **argv)
 	struct bpf_prog_load_attr attr = {
 		.file = "test_sock_fields_kern.o",
 		.prog_type = BPF_PROG_TYPE_CGROUP_SKB,
+		.prog_flags = BPF_F_TEST_RND_HI32,
 	};
 	int cgroup_fd, egress_fd, ingress_fd, err;
 	struct bpf_program *ingress_prog;
diff --git a/tools/testing/selftests/bpf/test_socket_cookie.c b/tools/testing/selftests/bpf/test_socket_cookie.c
index e51d637..cac8ee5 100644
--- a/tools/testing/selftests/bpf/test_socket_cookie.c
+++ b/tools/testing/selftests/bpf/test_socket_cookie.c
@@ -148,6 +148,7 @@ static int run_test(int cgfd)
 	memset(&attr, 0, sizeof(attr));
 	attr.file = SOCKET_COOKIE_PROG;
 	attr.prog_type = BPF_PROG_TYPE_UNSPEC;
+	attr.prog_flags = BPF_F_TEST_RND_HI32;
 
 	err = bpf_prog_load_xattr(&attr, &pobj, &prog_fd);
 	if (err) {
diff --git a/tools/testing/selftests/bpf/test_stub.c b/tools/testing/selftests/bpf/test_stub.c
new file mode 100644
index 0000000..84e81a8
--- /dev/null
+++ b/tools/testing/selftests/bpf/test_stub.c
@@ -0,0 +1,40 @@
+// SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+/* Copyright (C) 2019 Netronome Systems, Inc. */
+
+#include <bpf/bpf.h>
+#include <bpf/libbpf.h>
+#include <string.h>
+
+int bpf_prog_test_load(const char *file, enum bpf_prog_type type,
+		       struct bpf_object **pobj, int *prog_fd)
+{
+	struct bpf_prog_load_attr attr;
+
+	memset(&attr, 0, sizeof(struct bpf_prog_load_attr));
+	attr.file = file;
+	attr.prog_type = type;
+	attr.expected_attach_type = 0;
+	attr.prog_flags = BPF_F_TEST_RND_HI32;
+
+	return bpf_prog_load_xattr(&attr, pobj, prog_fd);
+}
+
+int bpf_test_load_program(enum bpf_prog_type type, const struct bpf_insn *insns,
+			  size_t insns_cnt, const char *license,
+			  __u32 kern_version, char *log_buf,
+		     size_t log_buf_sz)
+{
+	struct bpf_load_program_attr load_attr;
+
+	memset(&load_attr, 0, sizeof(struct bpf_load_program_attr));
+	load_attr.prog_type = type;
+	load_attr.expected_attach_type = 0;
+	load_attr.name = NULL;
+	load_attr.insns = insns;
+	load_attr.insns_cnt = insns_cnt;
+	load_attr.license = license;
+	load_attr.kern_version = kern_version;
+	load_attr.prog_flags = BPF_F_TEST_RND_HI32;
+
+	return bpf_load_program_xattr(&load_attr, log_buf, log_buf_sz);
+}
diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
index e2ebcad..a5eacc8 100644
--- a/tools/testing/selftests/bpf/test_verifier.c
+++ b/tools/testing/selftests/bpf/test_verifier.c
@@ -161,7 +161,7 @@ static void bpf_fill_ld_abs_vlan_push_pop(struct bpf_test *self)
 		goto loop;
 
 	for (; i < len - 1; i++)
-		insn[i] = BPF_ALU32_IMM(BPF_MOV, BPF_REG_0, 0xbef);
+		insn[i] = BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 0xbef);
 	insn[len - 1] = BPF_EXIT_INSN();
 	self->prog_len = len;
 }
@@ -170,7 +170,7 @@ static void bpf_fill_jump_around_ld_abs(struct bpf_test *self)
 {
 	struct bpf_insn *insn = self->fill_insns;
 	/* jump range is limited to 16 bit. every ld_abs is replaced by 6 insns */
-	unsigned int len = (1 << 15) / 6;
+	unsigned int len = (1 << 15) / 9;
 	int i = 0;
 
 	insn[i++] = BPF_MOV64_REG(BPF_REG_6, BPF_REG_1);
@@ -783,7 +783,7 @@ static void do_test_single(struct bpf_test *test, bool unpriv,
 	if (fixup_skips != skips)
 		return;
 
-	pflags = 0;
+	pflags = BPF_F_TEST_RND_HI32;
 	if (test->flags & F_LOAD_WITH_STRICT_ALIGNMENT)
 		pflags |= BPF_F_STRICT_ALIGNMENT;
 	if (test->flags & F_NEEDS_EFFICIENT_UNALIGNED_ACCESS)
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v4 bpf-next 09/15] arm: bpf: eliminate zero extension code-gen
  2019-04-15 17:26 [PATCH v4 bpf-next 00/15] bpf: eliminate zero extensions for sub-register writes Jiong Wang
                   ` (7 preceding siblings ...)
  2019-04-15 17:26 ` [PATCH v4 bpf-next 08/15] selftests: enable hi32 randomization for all tests Jiong Wang
@ 2019-04-15 17:26 ` Jiong Wang
  2019-04-15 17:26 ` [PATCH v4 bpf-next 10/15] powerpc: " Jiong Wang
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 30+ messages in thread
From: Jiong Wang @ 2019-04-15 17:26 UTC (permalink / raw)
  To: alexei.starovoitov, daniel
  Cc: bpf, netdev, oss-drivers, Jiong Wang, Shubham Bansal

Cc: Shubham Bansal <illusionist.neo@gmail.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
---
 arch/arm/net/bpf_jit_32.c | 22 ++++++++++++++++------
 1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/arch/arm/net/bpf_jit_32.c b/arch/arm/net/bpf_jit_32.c
index c8bfbbf..8cecd06 100644
--- a/arch/arm/net/bpf_jit_32.c
+++ b/arch/arm/net/bpf_jit_32.c
@@ -736,7 +736,8 @@ static inline void emit_a32_alu_r64(const bool is64, const s8 dst[],
 
 		/* ALU operation */
 		emit_alu_r(rd[1], rs, true, false, op, ctx);
-		emit_a32_mov_i(rd[0], 0, ctx);
+		if (ctx->prog->aux->no_verifier_zext)
+			emit_a32_mov_i(rd[0], 0, ctx);
 	}
 
 	arm_bpf_put_reg64(dst, rd, ctx);
@@ -758,8 +759,9 @@ static inline void emit_a32_mov_r64(const bool is64, const s8 dst[],
 				  struct jit_ctx *ctx) {
 	if (!is64) {
 		emit_a32_mov_r(dst_lo, src_lo, ctx);
-		/* Zero out high 4 bytes */
-		emit_a32_mov_i(dst_hi, 0, ctx);
+		if (ctx->prog->aux->no_verifier_zext)
+			/* Zero out high 4 bytes */
+			emit_a32_mov_i(dst_hi, 0, ctx);
 	} else if (__LINUX_ARM_ARCH__ < 6 &&
 		   ctx->cpu_architecture < CPU_ARCH_ARMv5TE) {
 		/* complete 8 byte move */
@@ -1438,7 +1440,8 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx)
 		}
 		emit_udivmod(rd_lo, rd_lo, rt, ctx, BPF_OP(code));
 		arm_bpf_put_reg32(dst_lo, rd_lo, ctx);
-		emit_a32_mov_i(dst_hi, 0, ctx);
+		if (ctx->prog->aux->no_verifier_zext)
+			emit_a32_mov_i(dst_hi, 0, ctx);
 		break;
 	case BPF_ALU64 | BPF_DIV | BPF_K:
 	case BPF_ALU64 | BPF_DIV | BPF_X:
@@ -1453,7 +1456,8 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx)
 			return -EINVAL;
 		if (imm)
 			emit_a32_alu_i(dst_lo, imm, ctx, BPF_OP(code));
-		emit_a32_mov_i(dst_hi, 0, ctx);
+		if (ctx->prog->aux->no_verifier_zext)
+			emit_a32_mov_i(dst_hi, 0, ctx);
 		break;
 	/* dst = dst << imm */
 	case BPF_ALU64 | BPF_LSH | BPF_K:
@@ -1488,7 +1492,8 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx)
 	/* dst = ~dst */
 	case BPF_ALU | BPF_NEG:
 		emit_a32_alu_i(dst_lo, 0, ctx, BPF_OP(code));
-		emit_a32_mov_i(dst_hi, 0, ctx);
+		if (ctx->prog->aux->no_verifier_zext)
+			emit_a32_mov_i(dst_hi, 0, ctx);
 		break;
 	/* dst = ~dst (64 bit) */
 	case BPF_ALU64 | BPF_NEG:
@@ -1838,6 +1843,11 @@ void bpf_jit_compile(struct bpf_prog *prog)
 	/* Nothing to do here. We support Internal BPF. */
 }
 
+bool bpf_jit_hardware_zext(void)
+{
+	return false;
+}
+
 struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 {
 	struct bpf_prog *tmp, *orig_prog = prog;
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v4 bpf-next 10/15] powerpc: bpf: eliminate zero extension code-gen
  2019-04-15 17:26 [PATCH v4 bpf-next 00/15] bpf: eliminate zero extensions for sub-register writes Jiong Wang
                   ` (8 preceding siblings ...)
  2019-04-15 17:26 ` [PATCH v4 bpf-next 09/15] arm: bpf: eliminate zero extension code-gen Jiong Wang
@ 2019-04-15 17:26 ` Jiong Wang
  2019-04-15 17:26 ` [PATCH v4 bpf-next 11/15] s390: " Jiong Wang
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 30+ messages in thread
From: Jiong Wang @ 2019-04-15 17:26 UTC (permalink / raw)
  To: alexei.starovoitov, daniel
  Cc: bpf, netdev, oss-drivers, Jiong Wang, Naveen N . Rao, Sandipan Das

Cc: Naveen N. Rao <naveen.n.rao@linux.ibm.com>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
---
 arch/powerpc/net/bpf_jit_comp64.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
index 21a1dcd..d10621b 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -559,7 +559,7 @@ static int bpf_jit_build_body(struct bpf_prog *fp, u32 *image,
 
 bpf_alu32_trunc:
 		/* Truncate to 32-bits */
-		if (BPF_CLASS(code) == BPF_ALU)
+		if (BPF_CLASS(code) == BPF_ALU && fp->aux->no_verifier_zext)
 			PPC_RLWINM(dst_reg, dst_reg, 0, 0, 31);
 		break;
 
@@ -1046,6 +1046,11 @@ struct powerpc64_jit_data {
 	struct codegen_context ctx;
 };
 
+bool bpf_jit_hardware_zext(void)
+{
+	return false;
+}
+
 struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
 {
 	u32 proglen;
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v4 bpf-next 11/15] s390: bpf: eliminate zero extension code-gen
  2019-04-15 17:26 [PATCH v4 bpf-next 00/15] bpf: eliminate zero extensions for sub-register writes Jiong Wang
                   ` (9 preceding siblings ...)
  2019-04-15 17:26 ` [PATCH v4 bpf-next 10/15] powerpc: " Jiong Wang
@ 2019-04-15 17:26 ` Jiong Wang
  2019-04-15 17:26 ` [PATCH v4 bpf-next 12/15] sparc: " Jiong Wang
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 30+ messages in thread
From: Jiong Wang @ 2019-04-15 17:26 UTC (permalink / raw)
  To: alexei.starovoitov, daniel
  Cc: bpf, netdev, oss-drivers, Jiong Wang, Martin Schwidefsky, Heiko Carstens

Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
---
 arch/s390/net/bpf_jit_comp.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
index 51dd026..59592d7 100644
--- a/arch/s390/net/bpf_jit_comp.c
+++ b/arch/s390/net/bpf_jit_comp.c
@@ -299,9 +299,11 @@ static inline void reg_set_seen(struct bpf_jit *jit, u32 b1)
 
 #define EMIT_ZERO(b1)						\
 ({								\
-	/* llgfr %dst,%dst (zero extend to 64 bit) */		\
-	EMIT4(0xb9160000, b1, b1);				\
-	REG_SET_SEEN(b1);					\
+	if (fp->aux->no_verifier_zext) {			\
+		/* llgfr %dst,%dst (zero extend to 64 bit) */	\
+		EMIT4(0xb9160000, b1, b1);			\
+		REG_SET_SEEN(b1);				\
+	}							\
 })
 
 /*
@@ -1282,6 +1284,11 @@ static int bpf_jit_prog(struct bpf_jit *jit, struct bpf_prog *fp)
 	return 0;
 }
 
+bool bpf_jit_hardware_zext(void)
+{
+	return false;
+}
+
 /*
  * Compile eBPF program "fp"
  */
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v4 bpf-next 12/15] sparc: bpf: eliminate zero extension code-gen
  2019-04-15 17:26 [PATCH v4 bpf-next 00/15] bpf: eliminate zero extensions for sub-register writes Jiong Wang
                   ` (10 preceding siblings ...)
  2019-04-15 17:26 ` [PATCH v4 bpf-next 11/15] s390: " Jiong Wang
@ 2019-04-15 17:26 ` Jiong Wang
  2019-04-15 17:26 ` [PATCH v4 bpf-next 13/15] x32: " Jiong Wang
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 30+ messages in thread
From: Jiong Wang @ 2019-04-15 17:26 UTC (permalink / raw)
  To: alexei.starovoitov, daniel
  Cc: bpf, netdev, oss-drivers, Jiong Wang, David S . Miller

Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
---
 arch/sparc/net/bpf_jit_comp_64.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/sparc/net/bpf_jit_comp_64.c b/arch/sparc/net/bpf_jit_comp_64.c
index 65428e7..e58f84e 100644
--- a/arch/sparc/net/bpf_jit_comp_64.c
+++ b/arch/sparc/net/bpf_jit_comp_64.c
@@ -1144,7 +1144,8 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx)
 		break;
 
 	do_alu32_trunc:
-		if (BPF_CLASS(code) == BPF_ALU)
+		if (BPF_CLASS(code) == BPF_ALU &&
+		    ctx->prog->aux->no_verifier_zext)
 			emit_alu_K(SRL, dst, 0, ctx);
 		break;
 
@@ -1432,6 +1433,11 @@ static void jit_fill_hole(void *area, unsigned int size)
 		*ptr++ = 0x91d02005; /* ta 5 */
 }
 
+bool bpf_jit_hardware_zext(void)
+{
+	return false;
+}
+
 struct sparc64_jit_data {
 	struct bpf_binary_header *header;
 	u8 *image;
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v4 bpf-next 13/15] x32: bpf: eliminate zero extension code-gen
  2019-04-15 17:26 [PATCH v4 bpf-next 00/15] bpf: eliminate zero extensions for sub-register writes Jiong Wang
                   ` (11 preceding siblings ...)
  2019-04-15 17:26 ` [PATCH v4 bpf-next 12/15] sparc: " Jiong Wang
@ 2019-04-15 17:26 ` Jiong Wang
  2019-04-15 17:26 ` [PATCH v4 bpf-next 14/15] riscv: " Jiong Wang
  2019-04-15 17:26 ` [PATCH v4 bpf-next 15/15] nfp: " Jiong Wang
  14 siblings, 0 replies; 30+ messages in thread
From: Jiong Wang @ 2019-04-15 17:26 UTC (permalink / raw)
  To: alexei.starovoitov, daniel
  Cc: bpf, netdev, oss-drivers, Jiong Wang, Wang YanQing

Cc: Wang YanQing <udknight@gmail.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
---
 arch/x86/net/bpf_jit_comp32.c | 32 ++++++++++++++++++++++----------
 1 file changed, 22 insertions(+), 10 deletions(-)

diff --git a/arch/x86/net/bpf_jit_comp32.c b/arch/x86/net/bpf_jit_comp32.c
index 0d9cdff..8c6cf22 100644
--- a/arch/x86/net/bpf_jit_comp32.c
+++ b/arch/x86/net/bpf_jit_comp32.c
@@ -567,7 +567,7 @@ static inline void emit_ia32_alu_r(const bool is64, const bool hi, const u8 op,
 static inline void emit_ia32_alu_r64(const bool is64, const u8 op,
 				     const u8 dst[], const u8 src[],
 				     bool dstk,  bool sstk,
-				     u8 **pprog)
+				     u8 **pprog, const struct bpf_prog_aux *aux)
 {
 	u8 *prog = *pprog;
 
@@ -575,7 +575,7 @@ static inline void emit_ia32_alu_r64(const bool is64, const u8 op,
 	if (is64)
 		emit_ia32_alu_r(is64, true, op, dst_hi, src_hi, dstk, sstk,
 				&prog);
-	else
+	else if (aux->no_verifier_zext)
 		emit_ia32_mov_i(dst_hi, 0, dstk, &prog);
 	*pprog = prog;
 }
@@ -666,7 +666,8 @@ static inline void emit_ia32_alu_i(const bool is64, const bool hi, const u8 op,
 /* ALU operation (64 bit) */
 static inline void emit_ia32_alu_i64(const bool is64, const u8 op,
 				     const u8 dst[], const u32 val,
-				     bool dstk, u8 **pprog)
+				     bool dstk, u8 **pprog,
+				     const struct bpf_prog_aux *aux)
 {
 	u8 *prog = *pprog;
 	u32 hi = 0;
@@ -677,7 +678,7 @@ static inline void emit_ia32_alu_i64(const bool is64, const u8 op,
 	emit_ia32_alu_i(is64, false, op, dst_lo, val, dstk, &prog);
 	if (is64)
 		emit_ia32_alu_i(is64, true, op, dst_hi, hi, dstk, &prog);
-	else
+	else if (aux->no_verifier_zext)
 		emit_ia32_mov_i(dst_hi, 0, dstk, &prog);
 
 	*pprog = prog;
@@ -1690,11 +1691,13 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
 			switch (BPF_SRC(code)) {
 			case BPF_X:
 				emit_ia32_alu_r64(is64, BPF_OP(code), dst,
-						  src, dstk, sstk, &prog);
+						  src, dstk, sstk, &prog,
+						  bpf_prog->aux);
 				break;
 			case BPF_K:
 				emit_ia32_alu_i64(is64, BPF_OP(code), dst,
-						  imm32, dstk, &prog);
+						  imm32, dstk, &prog,
+						  bpf_prog->aux);
 				break;
 			}
 			break;
@@ -1713,7 +1716,8 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
 						false, &prog);
 				break;
 			}
-			emit_ia32_mov_i(dst_hi, 0, dstk, &prog);
+			if (bpf_prog->aux->no_verifier_zext)
+				emit_ia32_mov_i(dst_hi, 0, dstk, &prog);
 			break;
 		case BPF_ALU | BPF_LSH | BPF_X:
 		case BPF_ALU | BPF_RSH | BPF_X:
@@ -1733,7 +1737,8 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
 						  &prog);
 				break;
 			}
-			emit_ia32_mov_i(dst_hi, 0, dstk, &prog);
+			if (bpf_prog->aux->no_verifier_zext)
+				emit_ia32_mov_i(dst_hi, 0, dstk, &prog);
 			break;
 		/* dst = dst / src(imm) */
 		/* dst = dst % src(imm) */
@@ -1755,7 +1760,8 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
 						    &prog);
 				break;
 			}
-			emit_ia32_mov_i(dst_hi, 0, dstk, &prog);
+			if (bpf_prog->aux->no_verifier_zext)
+				emit_ia32_mov_i(dst_hi, 0, dstk, &prog);
 			break;
 		case BPF_ALU64 | BPF_DIV | BPF_K:
 		case BPF_ALU64 | BPF_DIV | BPF_X:
@@ -1772,7 +1778,8 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
 			EMIT2_off32(0xC7, add_1reg(0xC0, IA32_ECX), imm32);
 			emit_ia32_shift_r(BPF_OP(code), dst_lo, IA32_ECX, dstk,
 					  false, &prog);
-			emit_ia32_mov_i(dst_hi, 0, dstk, &prog);
+			if (bpf_prog->aux->no_verifier_zext)
+				emit_ia32_mov_i(dst_hi, 0, dstk, &prog);
 			break;
 		/* dst = dst << imm */
 		case BPF_ALU64 | BPF_LSH | BPF_K:
@@ -2367,6 +2374,11 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
 	return proglen;
 }
 
+bool bpf_jit_hardware_zext(void)
+{
+	return false;
+}
+
 struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 {
 	struct bpf_binary_header *header = NULL;
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v4 bpf-next 14/15] riscv: bpf: eliminate zero extension code-gen
  2019-04-15 17:26 [PATCH v4 bpf-next 00/15] bpf: eliminate zero extensions for sub-register writes Jiong Wang
                   ` (12 preceding siblings ...)
  2019-04-15 17:26 ` [PATCH v4 bpf-next 13/15] x32: " Jiong Wang
@ 2019-04-15 17:26 ` Jiong Wang
  2019-04-17  7:55   ` Björn Töpel
  2019-04-15 17:26 ` [PATCH v4 bpf-next 15/15] nfp: " Jiong Wang
  14 siblings, 1 reply; 30+ messages in thread
From: Jiong Wang @ 2019-04-15 17:26 UTC (permalink / raw)
  To: alexei.starovoitov, daniel
  Cc: bpf, netdev, oss-drivers, Jiong Wang, Björn Töpel

CC: Björn Töpel <bjorn.topel@gmail.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
---
 arch/riscv/net/bpf_jit_comp.c | 32 +++++++++++++++++++-------------
 1 file changed, 19 insertions(+), 13 deletions(-)

diff --git a/arch/riscv/net/bpf_jit_comp.c b/arch/riscv/net/bpf_jit_comp.c
index 80b12aa..9cba262 100644
--- a/arch/riscv/net/bpf_jit_comp.c
+++ b/arch/riscv/net/bpf_jit_comp.c
@@ -731,6 +731,7 @@ static int emit_insn(const struct bpf_insn *insn, struct rv_jit_context *ctx,
 {
 	bool is64 = BPF_CLASS(insn->code) == BPF_ALU64 ||
 		    BPF_CLASS(insn->code) == BPF_JMP;
+	struct bpf_prog_aux *aux = ctx->prog->aux;
 	int rvoff, i = insn - ctx->prog->insnsi;
 	u8 rd = -1, rs = -1, code = insn->code;
 	s16 off = insn->off;
@@ -743,7 +744,7 @@ static int emit_insn(const struct bpf_insn *insn, struct rv_jit_context *ctx,
 	case BPF_ALU | BPF_MOV | BPF_X:
 	case BPF_ALU64 | BPF_MOV | BPF_X:
 		emit(is64 ? rv_addi(rd, rs, 0) : rv_addiw(rd, rs, 0), ctx);
-		if (!is64)
+		if (!is64 && aux->no_verifier_zext)
 			emit_zext_32(rd, ctx);
 		break;
 
@@ -771,19 +772,19 @@ static int emit_insn(const struct bpf_insn *insn, struct rv_jit_context *ctx,
 	case BPF_ALU | BPF_MUL | BPF_X:
 	case BPF_ALU64 | BPF_MUL | BPF_X:
 		emit(is64 ? rv_mul(rd, rd, rs) : rv_mulw(rd, rd, rs), ctx);
-		if (!is64)
+		if (!is64 && aux->no_verifier_zext)
 			emit_zext_32(rd, ctx);
 		break;
 	case BPF_ALU | BPF_DIV | BPF_X:
 	case BPF_ALU64 | BPF_DIV | BPF_X:
 		emit(is64 ? rv_divu(rd, rd, rs) : rv_divuw(rd, rd, rs), ctx);
-		if (!is64)
+		if (!is64 && aux->no_verifier_zext)
 			emit_zext_32(rd, ctx);
 		break;
 	case BPF_ALU | BPF_MOD | BPF_X:
 	case BPF_ALU64 | BPF_MOD | BPF_X:
 		emit(is64 ? rv_remu(rd, rd, rs) : rv_remuw(rd, rd, rs), ctx);
-		if (!is64)
+		if (!is64 && aux->no_verifier_zext)
 			emit_zext_32(rd, ctx);
 		break;
 	case BPF_ALU | BPF_LSH | BPF_X:
@@ -867,7 +868,7 @@ static int emit_insn(const struct bpf_insn *insn, struct rv_jit_context *ctx,
 	case BPF_ALU | BPF_MOV | BPF_K:
 	case BPF_ALU64 | BPF_MOV | BPF_K:
 		emit_imm(rd, imm, ctx);
-		if (!is64)
+		if (!is64 && aux->no_verifier_zext)
 			emit_zext_32(rd, ctx);
 		break;
 
@@ -882,7 +883,7 @@ static int emit_insn(const struct bpf_insn *insn, struct rv_jit_context *ctx,
 			emit(is64 ? rv_add(rd, rd, RV_REG_T1) :
 			     rv_addw(rd, rd, RV_REG_T1), ctx);
 		}
-		if (!is64)
+		if (!is64 && aux->no_verifier_zext)
 			emit_zext_32(rd, ctx);
 		break;
 	case BPF_ALU | BPF_SUB | BPF_K:
@@ -895,7 +896,7 @@ static int emit_insn(const struct bpf_insn *insn, struct rv_jit_context *ctx,
 			emit(is64 ? rv_sub(rd, rd, RV_REG_T1) :
 			     rv_subw(rd, rd, RV_REG_T1), ctx);
 		}
-		if (!is64)
+		if (!is64 && aux->no_verifier_zext)
 			emit_zext_32(rd, ctx);
 		break;
 	case BPF_ALU | BPF_AND | BPF_K:
@@ -906,7 +907,7 @@ static int emit_insn(const struct bpf_insn *insn, struct rv_jit_context *ctx,
 			emit_imm(RV_REG_T1, imm, ctx);
 			emit(rv_and(rd, rd, RV_REG_T1), ctx);
 		}
-		if (!is64)
+		if (!is64 && aux->no_verifier_zext)
 			emit_zext_32(rd, ctx);
 		break;
 	case BPF_ALU | BPF_OR | BPF_K:
@@ -917,7 +918,7 @@ static int emit_insn(const struct bpf_insn *insn, struct rv_jit_context *ctx,
 			emit_imm(RV_REG_T1, imm, ctx);
 			emit(rv_or(rd, rd, RV_REG_T1), ctx);
 		}
-		if (!is64)
+		if (!is64 && aux->no_verifier_zext)
 			emit_zext_32(rd, ctx);
 		break;
 	case BPF_ALU | BPF_XOR | BPF_K:
@@ -928,7 +929,7 @@ static int emit_insn(const struct bpf_insn *insn, struct rv_jit_context *ctx,
 			emit_imm(RV_REG_T1, imm, ctx);
 			emit(rv_xor(rd, rd, RV_REG_T1), ctx);
 		}
-		if (!is64)
+		if (!is64 && aux->no_verifier_zext)
 			emit_zext_32(rd, ctx);
 		break;
 	case BPF_ALU | BPF_MUL | BPF_K:
@@ -936,7 +937,7 @@ static int emit_insn(const struct bpf_insn *insn, struct rv_jit_context *ctx,
 		emit_imm(RV_REG_T1, imm, ctx);
 		emit(is64 ? rv_mul(rd, rd, RV_REG_T1) :
 		     rv_mulw(rd, rd, RV_REG_T1), ctx);
-		if (!is64)
+		if (!is64 && aux->no_verifier_zext)
 			emit_zext_32(rd, ctx);
 		break;
 	case BPF_ALU | BPF_DIV | BPF_K:
@@ -944,7 +945,7 @@ static int emit_insn(const struct bpf_insn *insn, struct rv_jit_context *ctx,
 		emit_imm(RV_REG_T1, imm, ctx);
 		emit(is64 ? rv_divu(rd, rd, RV_REG_T1) :
 		     rv_divuw(rd, rd, RV_REG_T1), ctx);
-		if (!is64)
+		if (!is64 && aux->no_verifier_zext)
 			emit_zext_32(rd, ctx);
 		break;
 	case BPF_ALU | BPF_MOD | BPF_K:
@@ -952,7 +953,7 @@ static int emit_insn(const struct bpf_insn *insn, struct rv_jit_context *ctx,
 		emit_imm(RV_REG_T1, imm, ctx);
 		emit(is64 ? rv_remu(rd, rd, RV_REG_T1) :
 		     rv_remuw(rd, rd, RV_REG_T1), ctx);
-		if (!is64)
+		if (!is64 && aux->no_verifier_zext)
 			emit_zext_32(rd, ctx);
 		break;
 	case BPF_ALU | BPF_LSH | BPF_K:
@@ -1503,6 +1504,11 @@ static void bpf_flush_icache(void *start, void *end)
 	flush_icache_range((unsigned long)start, (unsigned long)end);
 }
 
+bool bpf_jit_hardware_zext(void)
+{
+	return false;
+}
+
 struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 {
 	bool tmp_blinded = false, extra_pass = false;
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v4 bpf-next 15/15] nfp: bpf: eliminate zero extension code-gen
  2019-04-15 17:26 [PATCH v4 bpf-next 00/15] bpf: eliminate zero extensions for sub-register writes Jiong Wang
                   ` (13 preceding siblings ...)
  2019-04-15 17:26 ` [PATCH v4 bpf-next 14/15] riscv: " Jiong Wang
@ 2019-04-15 17:26 ` Jiong Wang
  2019-04-24 16:31   ` kbuild test robot
  14 siblings, 1 reply; 30+ messages in thread
From: Jiong Wang @ 2019-04-15 17:26 UTC (permalink / raw)
  To: alexei.starovoitov, daniel; +Cc: bpf, netdev, oss-drivers, Jiong Wang

This patch eliminate zero extension code-gen for instructions except
load/store when possible.

Elimination for load/store will be supported in up coming patches.

Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
---
 drivers/net/ethernet/netronome/nfp/bpf/jit.c      | 119 +++++++++++++---------
 drivers/net/ethernet/netronome/nfp/bpf/main.h     |   2 +
 drivers/net/ethernet/netronome/nfp/bpf/verifier.c |  12 +++
 3 files changed, 83 insertions(+), 50 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/bpf/jit.c b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
index f272247..eb30c52 100644
--- a/drivers/net/ethernet/netronome/nfp/bpf/jit.c
+++ b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
@@ -612,6 +612,13 @@ static void wrp_immed(struct nfp_prog *nfp_prog, swreg dst, u32 imm)
 }
 
 static void
+wrp_zext(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta, u8 dst)
+{
+	if (meta->flags & FLAG_INSN_DO_ZEXT)
+		wrp_immed(nfp_prog, reg_both(dst + 1), 0);
+}
+
+static void
 wrp_immed_relo(struct nfp_prog *nfp_prog, swreg dst, u32 imm,
 	       enum nfp_relo_type relo)
 {
@@ -847,7 +854,8 @@ static int nfp_cpp_memcpy(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 }
 
 static int
-data_ld(struct nfp_prog *nfp_prog, swreg offset, u8 dst_gpr, int size)
+data_ld(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta, swreg offset,
+	u8 dst_gpr, int size)
 {
 	unsigned int i;
 	u16 shift, sz;
@@ -870,14 +878,15 @@ data_ld(struct nfp_prog *nfp_prog, swreg offset, u8 dst_gpr, int size)
 			wrp_mov(nfp_prog, reg_both(dst_gpr + i), reg_xfer(i));
 
 	if (i < 2)
-		wrp_immed(nfp_prog, reg_both(dst_gpr + 1), 0);
+		wrp_zext(nfp_prog, meta, dst_gpr);
 
 	return 0;
 }
 
 static int
-data_ld_host_order(struct nfp_prog *nfp_prog, u8 dst_gpr,
-		   swreg lreg, swreg rreg, int size, enum cmd_mode mode)
+data_ld_host_order(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
+		   u8 dst_gpr, swreg lreg, swreg rreg, int size,
+		   enum cmd_mode mode)
 {
 	unsigned int i;
 	u8 mask, sz;
@@ -900,33 +909,34 @@ data_ld_host_order(struct nfp_prog *nfp_prog, u8 dst_gpr,
 			wrp_mov(nfp_prog, reg_both(dst_gpr + i), reg_xfer(i));
 
 	if (i < 2)
-		wrp_immed(nfp_prog, reg_both(dst_gpr + 1), 0);
+		wrp_zext(nfp_prog, meta, dst_gpr);
 
 	return 0;
 }
 
 static int
-data_ld_host_order_addr32(struct nfp_prog *nfp_prog, u8 src_gpr, swreg offset,
-			  u8 dst_gpr, u8 size)
+data_ld_host_order_addr32(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
+			  u8 src_gpr, swreg offset, u8 dst_gpr, u8 size)
 {
-	return data_ld_host_order(nfp_prog, dst_gpr, reg_a(src_gpr), offset,
-				  size, CMD_MODE_32b);
+	return data_ld_host_order(nfp_prog, meta, dst_gpr, reg_a(src_gpr),
+				  offset, size, CMD_MODE_32b);
 }
 
 static int
-data_ld_host_order_addr40(struct nfp_prog *nfp_prog, u8 src_gpr, swreg offset,
-			  u8 dst_gpr, u8 size)
+data_ld_host_order_addr40(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
+			  u8 src_gpr, swreg offset, u8 dst_gpr, u8 size)
 {
 	swreg rega, regb;
 
 	addr40_offset(nfp_prog, src_gpr, offset, &rega, &regb);
 
-	return data_ld_host_order(nfp_prog, dst_gpr, rega, regb,
+	return data_ld_host_order(nfp_prog, meta, dst_gpr, rega, regb,
 				  size, CMD_MODE_40b_BA);
 }
 
 static int
-construct_data_ind_ld(struct nfp_prog *nfp_prog, u16 offset, u16 src, u8 size)
+construct_data_ind_ld(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
+		      u16 offset, u16 src, u8 size)
 {
 	swreg tmp_reg;
 
@@ -942,10 +952,12 @@ construct_data_ind_ld(struct nfp_prog *nfp_prog, u16 offset, u16 src, u8 size)
 	emit_br_relo(nfp_prog, BR_BLO, BR_OFF_RELO, 0, RELO_BR_GO_ABORT);
 
 	/* Load data */
-	return data_ld(nfp_prog, imm_b(nfp_prog), 0, size);
+	return data_ld(nfp_prog, meta, imm_b(nfp_prog), 0, size);
 }
 
-static int construct_data_ld(struct nfp_prog *nfp_prog, u16 offset, u8 size)
+static int
+construct_data_ld(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
+		  u16 offset, u8 size)
 {
 	swreg tmp_reg;
 
@@ -956,7 +968,7 @@ static int construct_data_ld(struct nfp_prog *nfp_prog, u16 offset, u8 size)
 
 	/* Load data */
 	tmp_reg = re_load_imm_any(nfp_prog, offset, imm_b(nfp_prog));
-	return data_ld(nfp_prog, tmp_reg, 0, size);
+	return data_ld(nfp_prog, meta, tmp_reg, 0, size);
 }
 
 static int
@@ -1193,7 +1205,7 @@ mem_op_stack(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
 	}
 
 	if (clr_gpr && size < 8)
-		wrp_immed(nfp_prog, reg_both(gpr + 1), 0);
+		wrp_zext(nfp_prog, meta, gpr);
 
 	while (size) {
 		u32 slice_end;
@@ -1294,9 +1306,10 @@ wrp_alu32_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
 	      enum alu_op alu_op)
 {
 	const struct bpf_insn *insn = &meta->insn;
+	u8 dst = insn->dst_reg * 2;
 
-	wrp_alu_imm(nfp_prog, insn->dst_reg * 2, alu_op, insn->imm);
-	wrp_immed(nfp_prog, reg_both(insn->dst_reg * 2 + 1), 0);
+	wrp_alu_imm(nfp_prog, dst, alu_op, insn->imm);
+	wrp_zext(nfp_prog, meta, dst);
 
 	return 0;
 }
@@ -1308,7 +1321,7 @@ wrp_alu32_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
 	u8 dst = meta->insn.dst_reg * 2, src = meta->insn.src_reg * 2;
 
 	emit_alu(nfp_prog, reg_both(dst), reg_a(dst), alu_op, reg_b(src));
-	wrp_immed(nfp_prog, reg_both(meta->insn.dst_reg * 2 + 1), 0);
+	wrp_zext(nfp_prog, meta, dst);
 
 	return 0;
 }
@@ -2385,12 +2398,14 @@ static int neg_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 	u8 dst = meta->insn.dst_reg * 2;
 
 	emit_alu(nfp_prog, reg_both(dst), reg_imm(0), ALU_OP_SUB, reg_b(dst));
-	wrp_immed(nfp_prog, reg_both(meta->insn.dst_reg * 2 + 1), 0);
+	wrp_zext(nfp_prog, meta, dst);
 
 	return 0;
 }
 
-static int __ashr_imm(struct nfp_prog *nfp_prog, u8 dst, u8 shift_amt)
+static int
+__ashr_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta, u8 dst,
+	   u8 shift_amt)
 {
 	if (shift_amt) {
 		/* Set signedness bit (MSB of result). */
@@ -2399,7 +2414,7 @@ static int __ashr_imm(struct nfp_prog *nfp_prog, u8 dst, u8 shift_amt)
 		emit_shf(nfp_prog, reg_both(dst), reg_none(), SHF_OP_ASHR,
 			 reg_b(dst), SHF_SC_R_SHF, shift_amt);
 	}
-	wrp_immed(nfp_prog, reg_both(dst + 1), 0);
+	wrp_zext(nfp_prog, meta, dst);
 
 	return 0;
 }
@@ -2414,7 +2429,7 @@ static int ashr_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 	umin = meta->umin_src;
 	umax = meta->umax_src;
 	if (umin == umax)
-		return __ashr_imm(nfp_prog, dst, umin);
+		return __ashr_imm(nfp_prog, meta, dst, umin);
 
 	src = insn->src_reg * 2;
 	/* NOTE: the first insn will set both indirect shift amount (source A)
@@ -2423,7 +2438,7 @@ static int ashr_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 	emit_alu(nfp_prog, reg_none(), reg_a(src), ALU_OP_OR, reg_b(dst));
 	emit_shf_indir(nfp_prog, reg_both(dst), reg_none(), SHF_OP_ASHR,
 		       reg_b(dst), SHF_SC_R_SHF);
-	wrp_immed(nfp_prog, reg_both(dst + 1), 0);
+	wrp_zext(nfp_prog, meta, dst);
 
 	return 0;
 }
@@ -2433,15 +2448,17 @@ static int ashr_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 	const struct bpf_insn *insn = &meta->insn;
 	u8 dst = insn->dst_reg * 2;
 
-	return __ashr_imm(nfp_prog, dst, insn->imm);
+	return __ashr_imm(nfp_prog, meta, dst, insn->imm);
 }
 
-static int __shr_imm(struct nfp_prog *nfp_prog, u8 dst, u8 shift_amt)
+static int
+__shr_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta, u8 dst,
+	  u8 shift_amt)
 {
 	if (shift_amt)
 		emit_shf(nfp_prog, reg_both(dst), reg_none(), SHF_OP_NONE,
 			 reg_b(dst), SHF_SC_R_SHF, shift_amt);
-	wrp_immed(nfp_prog, reg_both(dst + 1), 0);
+	wrp_zext(nfp_prog, meta, dst);
 	return 0;
 }
 
@@ -2450,7 +2467,7 @@ static int shr_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 	const struct bpf_insn *insn = &meta->insn;
 	u8 dst = insn->dst_reg * 2;
 
-	return __shr_imm(nfp_prog, dst, insn->imm);
+	return __shr_imm(nfp_prog, meta, dst, insn->imm);
 }
 
 static int shr_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
@@ -2463,22 +2480,24 @@ static int shr_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 	umin = meta->umin_src;
 	umax = meta->umax_src;
 	if (umin == umax)
-		return __shr_imm(nfp_prog, dst, umin);
+		return __shr_imm(nfp_prog, meta, dst, umin);
 
 	src = insn->src_reg * 2;
 	emit_alu(nfp_prog, reg_none(), reg_a(src), ALU_OP_OR, reg_imm(0));
 	emit_shf_indir(nfp_prog, reg_both(dst), reg_none(), SHF_OP_NONE,
 		       reg_b(dst), SHF_SC_R_SHF);
-	wrp_immed(nfp_prog, reg_both(dst + 1), 0);
+	wrp_zext(nfp_prog, meta, dst);
 	return 0;
 }
 
-static int __shl_imm(struct nfp_prog *nfp_prog, u8 dst, u8 shift_amt)
+static int
+__shl_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta, u8 dst,
+	  u8 shift_amt)
 {
 	if (shift_amt)
 		emit_shf(nfp_prog, reg_both(dst), reg_none(), SHF_OP_NONE,
 			 reg_b(dst), SHF_SC_L_SHF, shift_amt);
-	wrp_immed(nfp_prog, reg_both(dst + 1), 0);
+	wrp_zext(nfp_prog, meta, dst);
 	return 0;
 }
 
@@ -2487,7 +2506,7 @@ static int shl_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 	const struct bpf_insn *insn = &meta->insn;
 	u8 dst = insn->dst_reg * 2;
 
-	return __shl_imm(nfp_prog, dst, insn->imm);
+	return __shl_imm(nfp_prog, meta, dst, insn->imm);
 }
 
 static int shl_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
@@ -2500,11 +2519,11 @@ static int shl_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 	umin = meta->umin_src;
 	umax = meta->umax_src;
 	if (umin == umax)
-		return __shl_imm(nfp_prog, dst, umin);
+		return __shl_imm(nfp_prog, meta, dst, umin);
 
 	src = insn->src_reg * 2;
 	shl_reg64_lt32_low(nfp_prog, dst, src);
-	wrp_immed(nfp_prog, reg_both(dst + 1), 0);
+	wrp_zext(nfp_prog, meta, dst);
 	return 0;
 }
 
@@ -2566,34 +2585,34 @@ static int imm_ld8(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 
 static int data_ld1(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 {
-	return construct_data_ld(nfp_prog, meta->insn.imm, 1);
+	return construct_data_ld(nfp_prog, meta, meta->insn.imm, 1);
 }
 
 static int data_ld2(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 {
-	return construct_data_ld(nfp_prog, meta->insn.imm, 2);
+	return construct_data_ld(nfp_prog, meta, meta->insn.imm, 2);
 }
 
 static int data_ld4(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 {
-	return construct_data_ld(nfp_prog, meta->insn.imm, 4);
+	return construct_data_ld(nfp_prog, meta, meta->insn.imm, 4);
 }
 
 static int data_ind_ld1(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 {
-	return construct_data_ind_ld(nfp_prog, meta->insn.imm,
+	return construct_data_ind_ld(nfp_prog, meta, meta->insn.imm,
 				     meta->insn.src_reg * 2, 1);
 }
 
 static int data_ind_ld2(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 {
-	return construct_data_ind_ld(nfp_prog, meta->insn.imm,
+	return construct_data_ind_ld(nfp_prog, meta, meta->insn.imm,
 				     meta->insn.src_reg * 2, 2);
 }
 
 static int data_ind_ld4(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 {
-	return construct_data_ind_ld(nfp_prog, meta->insn.imm,
+	return construct_data_ind_ld(nfp_prog, meta, meta->insn.imm,
 				     meta->insn.src_reg * 2, 4);
 }
 
@@ -2632,7 +2651,7 @@ static int mem_ldx_skb(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
 		return -EOPNOTSUPP;
 	}
 
-	wrp_immed(nfp_prog, reg_both(meta->insn.dst_reg * 2 + 1), 0);
+	wrp_zext(nfp_prog, meta, dst);
 
 	return 0;
 }
@@ -2658,7 +2677,7 @@ static int mem_ldx_xdp(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
 		return -EOPNOTSUPP;
 	}
 
-	wrp_immed(nfp_prog, reg_both(meta->insn.dst_reg * 2 + 1), 0);
+	wrp_zext(nfp_prog, meta, dst);
 
 	return 0;
 }
@@ -2671,7 +2690,7 @@ mem_ldx_data(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
 
 	tmp_reg = re_load_imm_any(nfp_prog, meta->insn.off, imm_b(nfp_prog));
 
-	return data_ld_host_order_addr32(nfp_prog, meta->insn.src_reg * 2,
+	return data_ld_host_order_addr32(nfp_prog, meta, meta->insn.src_reg * 2,
 					 tmp_reg, meta->insn.dst_reg * 2, size);
 }
 
@@ -2683,7 +2702,7 @@ mem_ldx_emem(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
 
 	tmp_reg = re_load_imm_any(nfp_prog, meta->insn.off, imm_b(nfp_prog));
 
-	return data_ld_host_order_addr40(nfp_prog, meta->insn.src_reg * 2,
+	return data_ld_host_order_addr40(nfp_prog, meta, meta->insn.src_reg * 2,
 					 tmp_reg, meta->insn.dst_reg * 2, size);
 }
 
@@ -2744,7 +2763,7 @@ mem_ldx_data_from_pktcache_unaligned(struct nfp_prog *nfp_prog,
 	wrp_reg_subpart(nfp_prog, dst_lo, src_lo, len_lo, off);
 
 	if (!len_mid) {
-		wrp_immed(nfp_prog, dst_hi, 0);
+		wrp_zext(nfp_prog, meta, dst_gpr);
 		return 0;
 	}
 
@@ -2752,7 +2771,7 @@ mem_ldx_data_from_pktcache_unaligned(struct nfp_prog *nfp_prog,
 
 	if (size <= REG_WIDTH) {
 		wrp_reg_or_subpart(nfp_prog, dst_lo, src_mid, len_mid, len_lo);
-		wrp_immed(nfp_prog, dst_hi, 0);
+		wrp_zext(nfp_prog, meta, dst_gpr);
 	} else {
 		swreg src_hi = reg_xfer(idx + 2);
 
@@ -2783,10 +2802,10 @@ mem_ldx_data_from_pktcache_aligned(struct nfp_prog *nfp_prog,
 
 	if (size < REG_WIDTH) {
 		wrp_reg_subpart(nfp_prog, dst_lo, src_lo, size, 0);
-		wrp_immed(nfp_prog, dst_hi, 0);
+		wrp_zext(nfp_prog, meta, dst_gpr);
 	} else if (size == REG_WIDTH) {
 		wrp_mov(nfp_prog, dst_lo, src_lo);
-		wrp_immed(nfp_prog, dst_hi, 0);
+		wrp_zext(nfp_prog, meta, dst_gpr);
 	} else {
 		swreg src_hi = reg_xfer(idx + 1);
 
diff --git a/drivers/net/ethernet/netronome/nfp/bpf/main.h b/drivers/net/ethernet/netronome/nfp/bpf/main.h
index b25a482..7369bdf 100644
--- a/drivers/net/ethernet/netronome/nfp/bpf/main.h
+++ b/drivers/net/ethernet/netronome/nfp/bpf/main.h
@@ -249,6 +249,8 @@ struct nfp_bpf_reg_state {
 #define FLAG_INSN_SKIP_PREC_DEPENDENT		BIT(4)
 /* Instruction is optimized by the verifier */
 #define FLAG_INSN_SKIP_VERIFIER_OPT		BIT(5)
+/* Instruction needs to zero extend to high 32-bit */
+#define FLAG_INSN_DO_ZEXT			BIT(6)
 
 #define FLAG_INSN_SKIP_MASK		(FLAG_INSN_SKIP_NOOP | \
 					 FLAG_INSN_SKIP_PREC_DEPENDENT | \
diff --git a/drivers/net/ethernet/netronome/nfp/bpf/verifier.c b/drivers/net/ethernet/netronome/nfp/bpf/verifier.c
index 36f56eb..e92ee51 100644
--- a/drivers/net/ethernet/netronome/nfp/bpf/verifier.c
+++ b/drivers/net/ethernet/netronome/nfp/bpf/verifier.c
@@ -744,6 +744,17 @@ static unsigned int nfp_bpf_get_stack_usage(struct nfp_prog *nfp_prog)
 	goto continue_subprog;
 }
 
+static void nfp_bpf_insn_flag_zext(struct nfp_prog *nfp_prog,
+				   struct bpf_insn_aux_data *aux)
+{
+	struct nfp_insn_meta *meta;
+
+	list_for_each_entry(meta, &nfp_prog->insns, l) {
+		if (aux[meta->n].zext_dst)
+			meta->flags |= FLAG_INSN_DO_ZEXT;
+	}
+}
+
 int nfp_bpf_finalize(struct bpf_verifier_env *env)
 {
 	struct bpf_subprog_info *info;
@@ -784,6 +795,7 @@ int nfp_bpf_finalize(struct bpf_verifier_env *env)
 		return -EOPNOTSUPP;
 	}
 
+	nfp_bpf_insn_flag_zext(nfp_prog, env->insn_aux_data);
 	return 0;
 }
 
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 bpf-next 01/15] bpf: split read liveness into REG_LIVE_READ64 and REG_LIVE_READ32
  2019-04-15 17:26 ` [PATCH v4 bpf-next 01/15] bpf: split read liveness into REG_LIVE_READ64 and REG_LIVE_READ32 Jiong Wang
@ 2019-04-15 23:03   ` Jakub Kicinski
  2019-04-16  1:26   ` Alexei Starovoitov
  1 sibling, 0 replies; 30+ messages in thread
From: Jakub Kicinski @ 2019-04-15 23:03 UTC (permalink / raw)
  To: Jiong Wang; +Cc: alexei.starovoitov, daniel, bpf, netdev, oss-drivers

On Mon, 15 Apr 2019 18:26:11 +0100, Jiong Wang wrote:
> Register liveness infrastructure doesn't track register read width at the
> moment, while the width information will be needed for the later 32-bit
> safety analysis pass.
> 
> This patch take the first step to split read liveness into REG_LIVE_READ64
> and REG_LIVE_READ32.
> 
> Liveness propagation code are updated accordingly. They are taught to
> understand how to propagate REG_LIVE_READ64 and REG_LIVE_READ32 at the same
> propagation iteration. For example, "mark_reg_read" now propagate "flags"
> which could be multiple read bits instead of the single REG_LIVE_READ64.
> 
> A write still screen off all width of reads.
> 
> Signed-off-by: Jiong Wang <jiong.wang@netronome.com>

Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>

Thanks!

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 bpf-next 01/15] bpf: split read liveness into REG_LIVE_READ64 and REG_LIVE_READ32
  2019-04-15 17:26 ` [PATCH v4 bpf-next 01/15] bpf: split read liveness into REG_LIVE_READ64 and REG_LIVE_READ32 Jiong Wang
  2019-04-15 23:03   ` Jakub Kicinski
@ 2019-04-16  1:26   ` Alexei Starovoitov
  2019-04-16  7:39     ` Jiong Wang
  1 sibling, 1 reply; 30+ messages in thread
From: Alexei Starovoitov @ 2019-04-16  1:26 UTC (permalink / raw)
  To: Jiong Wang; +Cc: daniel, bpf, netdev, oss-drivers

On Mon, Apr 15, 2019 at 06:26:11PM +0100, Jiong Wang wrote:
> Register liveness infrastructure doesn't track register read width at the
> moment, while the width information will be needed for the later 32-bit
> safety analysis pass.
> 
> This patch take the first step to split read liveness into REG_LIVE_READ64
> and REG_LIVE_READ32.
> 
> Liveness propagation code are updated accordingly. They are taught to
> understand how to propagate REG_LIVE_READ64 and REG_LIVE_READ32 at the same
> propagation iteration. For example, "mark_reg_read" now propagate "flags"
> which could be multiple read bits instead of the single REG_LIVE_READ64.
> 
> A write still screen off all width of reads.
> 
> Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
> ---
>  include/linux/bpf_verifier.h |   8 +--
>  kernel/bpf/verifier.c        | 119 +++++++++++++++++++++++++++++++++++++++----
>  2 files changed, 115 insertions(+), 12 deletions(-)
> 
> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> index b3ab61f..fba0ebb 100644
> --- a/include/linux/bpf_verifier.h
> +++ b/include/linux/bpf_verifier.h
> @@ -36,9 +36,11 @@
>   */
>  enum bpf_reg_liveness {
>  	REG_LIVE_NONE = 0, /* reg hasn't been read or written this branch */
> -	REG_LIVE_READ, /* reg was read, so we're sensitive to initial value */
> -	REG_LIVE_WRITTEN, /* reg was written first, screening off later reads */
> -	REG_LIVE_DONE = 4, /* liveness won't be updating this register anymore */
> +	REG_LIVE_READ32 = 0x1, /* reg was read, so we're sensitive to initial value */
> +	REG_LIVE_READ64 = 0x2, /* likewise, but full 64-bit content matters */
> +	REG_LIVE_READ = REG_LIVE_READ32 | REG_LIVE_READ64,
> +	REG_LIVE_WRITTEN = 0x4, /* reg was written first, screening off later reads */
> +	REG_LIVE_DONE = 0x8, /* liveness won't be updating this register anymore */
>  };
>  
>  struct bpf_reg_state {
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index c722015..5784b279 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -1135,7 +1135,7 @@ static int check_subprogs(struct bpf_verifier_env *env)
>   */
>  static int mark_reg_read(struct bpf_verifier_env *env,
>  			 const struct bpf_reg_state *state,
> -			 struct bpf_reg_state *parent)
> +			 struct bpf_reg_state *parent, u8 flags)
>  {
>  	bool writes = parent == state->parent; /* Observe write marks */
>  	int cnt = 0;
> @@ -1150,17 +1150,23 @@ static int mark_reg_read(struct bpf_verifier_env *env,
>  				parent->var_off.value, parent->off);
>  			return -EFAULT;
>  		}
> -		if (parent->live & REG_LIVE_READ)
> +		/* The first condition is much more likely to be true than the
> +		 * second, make it checked first.
> +		 */
> +		if ((parent->live & REG_LIVE_READ) == flags ||
> +		    parent->live & REG_LIVE_READ64)
>  			/* The parentage chain never changes and
>  			 * this parent was already marked as LIVE_READ.
>  			 * There is no need to keep walking the chain again and
>  			 * keep re-marking all parents as LIVE_READ.
>  			 * This case happens when the same register is read
>  			 * multiple times without writes into it in-between.
> +			 * Also, if parent has REG_LIVE_READ64 set, then no need
> +			 * to set the weak REG_LIVE_READ32.
>  			 */
>  			break;
>  		/* ... then we depend on parent's value */
> -		parent->live |= REG_LIVE_READ;
> +		parent->live |= flags;
>  		state = parent;
>  		parent = state->parent;
>  		writes = true;
> @@ -1172,12 +1178,95 @@ static int mark_reg_read(struct bpf_verifier_env *env,
>  	return 0;
>  }
>  
> +/* This function is supposed to be used by the following 32-bit optimization
> + * code only. It returns TRUE if the source or destination register operates
> + * on 64-bit, otherwise return FALSE.
> + */
> +static bool is_reg64(struct bpf_insn *insn, u32 regno,
> +		     struct bpf_reg_state *reg, enum reg_arg_type t)
> +{
> +	u8 code, class, op;
> +

why is it called for case when t != SRC_OP ?
this patch is using the return value only in t == SRC_OP case
and other patches don't use is_reg64() at all.

> +	code = insn->code;
> +	class = BPF_CLASS(code);
> +	op = BPF_OP(code);
> +	if (class == BPF_JMP) {
> +		/* BPF_EXIT will reach here because of return value readability
> +		 * test for "main" which has s32 return value.
> +		 */
> +		if (op == BPF_EXIT)
> +			return false;

That's not incorrect. bpf2bpf calls return 64-bit values.
bpf abi is such that all bpf progs return 64-bit values.
Historically we truncate to 32-bit in BPF_PROG_RUN,
but some future bpf hook might use all 64 bits of return value.

> +		if (op == BPF_CALL) {
> +			/* BPF to BPF call will reach here because of marking
> +			 * caller saved clobber with DST_OP_NO_MARK for which we
> +			 * don't care the register def because they are anyway
> +			 * marked as NOT_INIT already.
> +			 */

the comment doesn't seem to match the code. why return anything here?
The return value won't be used anyway.

If is_reg64() is inside check_reg_arg() under if (t == SRC_OP)
all these corner cases wouldn't cause review headaches and can be converted
to WARN_ONCE.

What am I missing?


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 bpf-next 01/15] bpf: split read liveness into REG_LIVE_READ64 and REG_LIVE_READ32
  2019-04-16  1:26   ` Alexei Starovoitov
@ 2019-04-16  7:39     ` Jiong Wang
  2019-04-16 16:20       ` Alexei Starovoitov
  0 siblings, 1 reply; 30+ messages in thread
From: Jiong Wang @ 2019-04-16  7:39 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: Jiong Wang, daniel, bpf, netdev, oss-drivers


Alexei Starovoitov writes:

> On Mon, Apr 15, 2019 at 06:26:11PM +0100, Jiong Wang wrote:
>> Register liveness infrastructure doesn't track register read width at the
>> moment, while the width information will be needed for the later 32-bit
>> safety analysis pass.
>> 
>> This patch take the first step to split read liveness into REG_LIVE_READ64
>> and REG_LIVE_READ32.
>> 
>> Liveness propagation code are updated accordingly. They are taught to
>> understand how to propagate REG_LIVE_READ64 and REG_LIVE_READ32 at the same
>> propagation iteration. For example, "mark_reg_read" now propagate "flags"
>> which could be multiple read bits instead of the single REG_LIVE_READ64.
>> 
>> A write still screen off all width of reads.
>> 
>> Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
>> ---
>>  include/linux/bpf_verifier.h |   8 +--
>>  kernel/bpf/verifier.c        | 119 +++++++++++++++++++++++++++++++++++++++----
>>  2 files changed, 115 insertions(+), 12 deletions(-)
>> 
>> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
>> index b3ab61f..fba0ebb 100644
>> --- a/include/linux/bpf_verifier.h
>> +++ b/include/linux/bpf_verifier.h
>> @@ -36,9 +36,11 @@
>>   */
>>  enum bpf_reg_liveness {
>>  	REG_LIVE_NONE = 0, /* reg hasn't been read or written this branch */
>> -	REG_LIVE_READ, /* reg was read, so we're sensitive to initial value */
>> -	REG_LIVE_WRITTEN, /* reg was written first, screening off later reads */
>> -	REG_LIVE_DONE = 4, /* liveness won't be updating this register anymore */
>> +	REG_LIVE_READ32 = 0x1, /* reg was read, so we're sensitive to initial value */
>> +	REG_LIVE_READ64 = 0x2, /* likewise, but full 64-bit content matters */
>> +	REG_LIVE_READ = REG_LIVE_READ32 | REG_LIVE_READ64,
>> +	REG_LIVE_WRITTEN = 0x4, /* reg was written first, screening off later reads */
>> +	REG_LIVE_DONE = 0x8, /* liveness won't be updating this register anymore */
>>  };
>>  
>>  struct bpf_reg_state {
>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>> index c722015..5784b279 100644
>> --- a/kernel/bpf/verifier.c
>> +++ b/kernel/bpf/verifier.c
>> @@ -1135,7 +1135,7 @@ static int check_subprogs(struct bpf_verifier_env *env)
>>   */
>>  static int mark_reg_read(struct bpf_verifier_env *env,
>>  			 const struct bpf_reg_state *state,
>> -			 struct bpf_reg_state *parent)
>> +			 struct bpf_reg_state *parent, u8 flags)
>>  {
>>  	bool writes = parent == state->parent; /* Observe write marks */
>>  	int cnt = 0;
>> @@ -1150,17 +1150,23 @@ static int mark_reg_read(struct bpf_verifier_env *env,
>>  				parent->var_off.value, parent->off);
>>  			return -EFAULT;
>>  		}
>> -		if (parent->live & REG_LIVE_READ)
>> +		/* The first condition is much more likely to be true than the
>> +		 * second, make it checked first.
>> +		 */
>> +		if ((parent->live & REG_LIVE_READ) == flags ||
>> +		    parent->live & REG_LIVE_READ64)
>>  			/* The parentage chain never changes and
>>  			 * this parent was already marked as LIVE_READ.
>>  			 * There is no need to keep walking the chain again and
>>  			 * keep re-marking all parents as LIVE_READ.
>>  			 * This case happens when the same register is read
>>  			 * multiple times without writes into it in-between.
>> +			 * Also, if parent has REG_LIVE_READ64 set, then no need
>> +			 * to set the weak REG_LIVE_READ32.
>>  			 */
>>  			break;
>>  		/* ... then we depend on parent's value */
>> -		parent->live |= REG_LIVE_READ;
>> +		parent->live |= flags;
>>  		state = parent;
>>  		parent = state->parent;
>>  		writes = true;
>> @@ -1172,12 +1178,95 @@ static int mark_reg_read(struct bpf_verifier_env *env,
>>  	return 0;
>>  }
>>  
>> +/* This function is supposed to be used by the following 32-bit optimization
>> + * code only. It returns TRUE if the source or destination register operates
>> + * on 64-bit, otherwise return FALSE.
>> + */
>> +static bool is_reg64(struct bpf_insn *insn, u32 regno,
>> +		     struct bpf_reg_state *reg, enum reg_arg_type t)
>> +{
>> +	u8 code, class, op;
>> +
>
> why is it called for case when t != SRC_OP ?
> this patch is using the return value only in t == SRC_OP case
> and other patches don't use is_reg64() at all.

It is used for case when t == DST*, in patch 2/15, please search "rw64" in
that patch.

And "is_reg64" aims to return TRUE if it is 64-bit read when T == SRC_OP,
and return TRUE if it is 64-bit write when T = DST*. 

>> +	code = insn->code;
>> +	class = BPF_CLASS(code);
>> +	op = BPF_OP(code);
>> +	if (class == BPF_JMP) {
>> +		/* BPF_EXIT will reach here because of return value readability
>> +		 * test for "main" which has s32 return value.
>> +		 */
>> +		if (op == BPF_EXIT)
>> +			return false;
>
> That's not incorrect. bpf2bpf calls return 64-bit values.

bpf2bpf calls has all instructions exposed to insn walker, so the data-flow
is naturally tracked. For example:
  
callee:
  w0 = w2
  exit

caller:
  call callee2
  r2 = r0
  
insn walker should have marked REG_0 is a sub-register define in callee's
frame, and such marker is naturally propagetd back to caller's frame inside
"prepare_func_exit" which is doing:

  /* return to the caller whatever r0 had in the callee */                 
  caller->regs[BPF_REG_0] = *r0;

This copies parentage chain, and also copies the sub-register definition
information as we have merged it into reg state.

> bpf abi is such that all bpf progs return 64-bit values.
> Historically we truncate to 32-bit in BPF_PROG_RUN,
> but some future bpf hook might use all 64 bits of return value.

hmm, was thinking bpf main always return 32-bit, so safe to return FALSE
here. If we could have 64-bit main return, then perhaps for main, always do
zero-extension on r0 if it comes from sub-register def, this seems a little
bit unnecessary, given there is prog_type available during the check, using
which we can known whether whether return value is 64-bit or not.

thoughts?

>> +		if (op == BPF_CALL) {
>> +			/* BPF to BPF call will reach here because of marking
>> +			 * caller saved clobber with DST_OP_NO_MARK for which we
>> +			 * don't care the register def because they are anyway
>> +			 * marked as NOT_INIT already.
>> +			 */
>
> the comment doesn't seem to match the code. why return anything here?

It is to handle the case like:

  check_reg_arg(env, caller_saved[i], DST_OP_NO_MARK);

which is called inside "check_func_call" for call insn. I think for caller
saved register, return anything is fine. They must have new def inside
callee before used. And when returned to caller, they must be restored
before used otherwise the prog will be rejected due to their status is
marked as NOT_INIT.

So, for all cases, they are tracked insn walker accurately.

> The return value won't be used anyway.
>
> If is_reg64() is inside check_reg_arg() under if (t == SRC_OP)
> all these corner cases wouldn't cause review headaches and can be converted
> to WARN_ONCE.
>
> What am I missing?

As explained, is_reg64 could be used by both t == SRC_OP and DST*. And is
will be called for instructions with implicit register usage for example
CALL, and we need to return access width for those implicitly used register
here as well.

Make sense?

Regards,
Jiong

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 bpf-next 01/15] bpf: split read liveness into REG_LIVE_READ64 and REG_LIVE_READ32
  2019-04-16  7:39     ` Jiong Wang
@ 2019-04-16 16:20       ` Alexei Starovoitov
  2019-04-16 20:19         ` Jiong Wang
  0 siblings, 1 reply; 30+ messages in thread
From: Alexei Starovoitov @ 2019-04-16 16:20 UTC (permalink / raw)
  To: Jiong Wang; +Cc: daniel, bpf, netdev, oss-drivers

On Tue, Apr 16, 2019 at 08:39:30AM +0100, Jiong Wang wrote:
> 
> Alexei Starovoitov writes:
> 
> > On Mon, Apr 15, 2019 at 06:26:11PM +0100, Jiong Wang wrote:
> >> Register liveness infrastructure doesn't track register read width at the
> >> moment, while the width information will be needed for the later 32-bit
> >> safety analysis pass.
> >> 
> >> This patch take the first step to split read liveness into REG_LIVE_READ64
> >> and REG_LIVE_READ32.
> >> 
> >> Liveness propagation code are updated accordingly. They are taught to
> >> understand how to propagate REG_LIVE_READ64 and REG_LIVE_READ32 at the same
> >> propagation iteration. For example, "mark_reg_read" now propagate "flags"
> >> which could be multiple read bits instead of the single REG_LIVE_READ64.
> >> 
> >> A write still screen off all width of reads.
> >> 
> >> Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
> >> ---
> >>  include/linux/bpf_verifier.h |   8 +--
> >>  kernel/bpf/verifier.c        | 119 +++++++++++++++++++++++++++++++++++++++----
> >>  2 files changed, 115 insertions(+), 12 deletions(-)
> >> 
> >> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> >> index b3ab61f..fba0ebb 100644
> >> --- a/include/linux/bpf_verifier.h
> >> +++ b/include/linux/bpf_verifier.h
> >> @@ -36,9 +36,11 @@
> >>   */
> >>  enum bpf_reg_liveness {
> >>  	REG_LIVE_NONE = 0, /* reg hasn't been read or written this branch */
> >> -	REG_LIVE_READ, /* reg was read, so we're sensitive to initial value */
> >> -	REG_LIVE_WRITTEN, /* reg was written first, screening off later reads */
> >> -	REG_LIVE_DONE = 4, /* liveness won't be updating this register anymore */
> >> +	REG_LIVE_READ32 = 0x1, /* reg was read, so we're sensitive to initial value */
> >> +	REG_LIVE_READ64 = 0x2, /* likewise, but full 64-bit content matters */
> >> +	REG_LIVE_READ = REG_LIVE_READ32 | REG_LIVE_READ64,
> >> +	REG_LIVE_WRITTEN = 0x4, /* reg was written first, screening off later reads */
> >> +	REG_LIVE_DONE = 0x8, /* liveness won't be updating this register anymore */
> >>  };
> >>  
> >>  struct bpf_reg_state {
> >> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> >> index c722015..5784b279 100644
> >> --- a/kernel/bpf/verifier.c
> >> +++ b/kernel/bpf/verifier.c
> >> @@ -1135,7 +1135,7 @@ static int check_subprogs(struct bpf_verifier_env *env)
> >>   */
> >>  static int mark_reg_read(struct bpf_verifier_env *env,
> >>  			 const struct bpf_reg_state *state,
> >> -			 struct bpf_reg_state *parent)
> >> +			 struct bpf_reg_state *parent, u8 flags)
> >>  {
> >>  	bool writes = parent == state->parent; /* Observe write marks */
> >>  	int cnt = 0;
> >> @@ -1150,17 +1150,23 @@ static int mark_reg_read(struct bpf_verifier_env *env,
> >>  				parent->var_off.value, parent->off);
> >>  			return -EFAULT;
> >>  		}
> >> -		if (parent->live & REG_LIVE_READ)
> >> +		/* The first condition is much more likely to be true than the
> >> +		 * second, make it checked first.
> >> +		 */
> >> +		if ((parent->live & REG_LIVE_READ) == flags ||
> >> +		    parent->live & REG_LIVE_READ64)
> >>  			/* The parentage chain never changes and
> >>  			 * this parent was already marked as LIVE_READ.
> >>  			 * There is no need to keep walking the chain again and
> >>  			 * keep re-marking all parents as LIVE_READ.
> >>  			 * This case happens when the same register is read
> >>  			 * multiple times without writes into it in-between.
> >> +			 * Also, if parent has REG_LIVE_READ64 set, then no need
> >> +			 * to set the weak REG_LIVE_READ32.
> >>  			 */
> >>  			break;
> >>  		/* ... then we depend on parent's value */
> >> -		parent->live |= REG_LIVE_READ;
> >> +		parent->live |= flags;
> >>  		state = parent;
> >>  		parent = state->parent;
> >>  		writes = true;
> >> @@ -1172,12 +1178,95 @@ static int mark_reg_read(struct bpf_verifier_env *env,
> >>  	return 0;
> >>  }
> >>  
> >> +/* This function is supposed to be used by the following 32-bit optimization
> >> + * code only. It returns TRUE if the source or destination register operates
> >> + * on 64-bit, otherwise return FALSE.
> >> + */
> >> +static bool is_reg64(struct bpf_insn *insn, u32 regno,
> >> +		     struct bpf_reg_state *reg, enum reg_arg_type t)
> >> +{
> >> +	u8 code, class, op;
> >> +
> >
> > why is it called for case when t != SRC_OP ?
> > this patch is using the return value only in t == SRC_OP case
> > and other patches don't use is_reg64() at all.
> 
> It is used for case when t == DST*, in patch 2/15, please search "rw64" in
> that patch.

argh. such patch split didn't make it easy to review.

> And "is_reg64" aims to return TRUE if it is 64-bit read when T == SRC_OP,
> and return TRUE if it is 64-bit write when T = DST*. 
> 
> >> +	code = insn->code;
> >> +	class = BPF_CLASS(code);
> >> +	op = BPF_OP(code);
> >> +	if (class == BPF_JMP) {
> >> +		/* BPF_EXIT will reach here because of return value readability
> >> +		 * test for "main" which has s32 return value.
> >> +		 */
> >> +		if (op == BPF_EXIT)
> >> +			return false;
> >
> > That's not incorrect. bpf2bpf calls return 64-bit values.
> 
> bpf2bpf calls has all instructions exposed to insn walker, so the data-flow
> is naturally tracked. For example:
>   
> callee:
>   w0 = w2
>   exit

So then it needs to be different code? Like read32_maybe ?

> caller:
>   call callee2
>   r2 = r0
>   
> insn walker should have marked REG_0 is a sub-register define in callee's
> frame, and such marker is naturally propagetd back to caller's frame inside
> "prepare_func_exit" which is doing:
> 
>   /* return to the caller whatever r0 had in the callee */                 
>   caller->regs[BPF_REG_0] = *r0;
> 
> This copies parentage chain, and also copies the sub-register definition
> information as we have merged it into reg state.
> 
> > bpf abi is such that all bpf progs return 64-bit values.
> > Historically we truncate to 32-bit in BPF_PROG_RUN,
> > but some future bpf hook might use all 64 bits of return value.
> 
> hmm, was thinking bpf main always return 32-bit, so safe to return FALSE
> here. If we could have 64-bit main return, then perhaps for main, always do
> zero-extension on r0 if it comes from sub-register def, this seems a little
> bit unnecessary, given there is prog_type available during the check, using
> which we can known whether whether return value is 64-bit or not.
> 
> thoughts?

I think it's safer to go with read64 for now and optimize later
if really necessary.

> >> +		if (op == BPF_CALL) {
> >> +			/* BPF to BPF call will reach here because of marking
> >> +			 * caller saved clobber with DST_OP_NO_MARK for which we
> >> +			 * don't care the register def because they are anyway
> >> +			 * marked as NOT_INIT already.
> >> +			 */
> >
> > the comment doesn't seem to match the code. why return anything here?
> 
> It is to handle the case like:
> 
>   check_reg_arg(env, caller_saved[i], DST_OP_NO_MARK);
> 
> which is called inside "check_func_call" for call insn. I think for caller
> saved register, return anything is fine. They must have new def inside
> callee before used. And when returned to caller, they must be restored
> before used otherwise the prog will be rejected due to their status is
> marked as NOT_INIT.

but neither this patch nor patch 2 is using the return value.

> 
> So, for all cases, they are tracked insn walker accurately.
> 
> > The return value won't be used anyway.
> >
> > If is_reg64() is inside check_reg_arg() under if (t == SRC_OP)
> > all these corner cases wouldn't cause review headaches and can be converted
> > to WARN_ONCE.
> >
> > What am I missing?
> 
> As explained, is_reg64 could be used by both t == SRC_OP and DST*. And is
> will be called for instructions with implicit register usage for example
> CALL, and we need to return access width for those implicitly used register
> here as well.
> 
> Make sense?

Kinda. I'll take a look again. My first reaction is that patch 1 and 2
should be one patch or split differently.


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 bpf-next 01/15] bpf: split read liveness into REG_LIVE_READ64 and REG_LIVE_READ32
  2019-04-16 16:20       ` Alexei Starovoitov
@ 2019-04-16 20:19         ` Jiong Wang
  0 siblings, 0 replies; 30+ messages in thread
From: Jiong Wang @ 2019-04-16 20:19 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: Jiong Wang, daniel, bpf, netdev, oss-drivers


Alexei Starovoitov writes:

> On Tue, Apr 16, 2019 at 08:39:30AM +0100, Jiong Wang wrote:
>> 
>> Alexei Starovoitov writes:
>> 
>> > On Mon, Apr 15, 2019 at 06:26:11PM +0100, Jiong Wang wrote:
>> >> Register liveness infrastructure doesn't track register read width at the
>> >> moment, while the width information will be needed for the later 32-bit
>> >> safety analysis pass.
>> >> 
>> >> This patch take the first step to split read liveness into REG_LIVE_READ64
>> >> and REG_LIVE_READ32.
>> >> 
>> >> Liveness propagation code are updated accordingly. They are taught to
>> >> understand how to propagate REG_LIVE_READ64 and REG_LIVE_READ32 at the same
>> >> propagation iteration. For example, "mark_reg_read" now propagate "flags"
>> >> which could be multiple read bits instead of the single REG_LIVE_READ64.
>> >> 
>> >> A write still screen off all width of reads.
>> >> 
>> >> Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
>> >> ---
>> >>  include/linux/bpf_verifier.h |   8 +--
>> >>  kernel/bpf/verifier.c        | 119 +++++++++++++++++++++++++++++++++++++++----
>> >>  2 files changed, 115 insertions(+), 12 deletions(-)
>> >> 
>> >> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
>> >> index b3ab61f..fba0ebb 100644
>> >> --- a/include/linux/bpf_verifier.h
>> >> +++ b/include/linux/bpf_verifier.h
>> >> @@ -36,9 +36,11 @@
>> >>   */
>> >>  enum bpf_reg_liveness {
>> >>  	REG_LIVE_NONE = 0, /* reg hasn't been read or written this branch */
>> >> -	REG_LIVE_READ, /* reg was read, so we're sensitive to initial value */
>> >> -	REG_LIVE_WRITTEN, /* reg was written first, screening off later reads */
>> >> -	REG_LIVE_DONE = 4, /* liveness won't be updating this register anymore */
>> >> +	REG_LIVE_READ32 = 0x1, /* reg was read, so we're sensitive to initial value */
>> >> +	REG_LIVE_READ64 = 0x2, /* likewise, but full 64-bit content matters */
>> >> +	REG_LIVE_READ = REG_LIVE_READ32 | REG_LIVE_READ64,
>> >> +	REG_LIVE_WRITTEN = 0x4, /* reg was written first, screening off later reads */
>> >> +	REG_LIVE_DONE = 0x8, /* liveness won't be updating this register anymore */
>> >>  };
>> >>  
>> >>  struct bpf_reg_state {
>> >> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>> >> index c722015..5784b279 100644
>> >> --- a/kernel/bpf/verifier.c
>> >> +++ b/kernel/bpf/verifier.c
>> >> @@ -1135,7 +1135,7 @@ static int check_subprogs(struct bpf_verifier_env *env)
>> >>   */
>> >>  static int mark_reg_read(struct bpf_verifier_env *env,
>> >>  			 const struct bpf_reg_state *state,
>> >> -			 struct bpf_reg_state *parent)
>> >> +			 struct bpf_reg_state *parent, u8 flags)
>> >>  {
>> >>  	bool writes = parent == state->parent; /* Observe write marks */
>> >>  	int cnt = 0;
>> >> @@ -1150,17 +1150,23 @@ static int mark_reg_read(struct bpf_verifier_env *env,
>> >>  				parent->var_off.value, parent->off);
>> >>  			return -EFAULT;
>> >>  		}
>> >> -		if (parent->live & REG_LIVE_READ)
>> >> +		/* The first condition is much more likely to be true than the
>> >> +		 * second, make it checked first.
>> >> +		 */
>> >> +		if ((parent->live & REG_LIVE_READ) == flags ||
>> >> +		    parent->live & REG_LIVE_READ64)
>> >>  			/* The parentage chain never changes and
>> >>  			 * this parent was already marked as LIVE_READ.
>> >>  			 * There is no need to keep walking the chain again and
>> >>  			 * keep re-marking all parents as LIVE_READ.
>> >>  			 * This case happens when the same register is read
>> >>  			 * multiple times without writes into it in-between.
>> >> +			 * Also, if parent has REG_LIVE_READ64 set, then no need
>> >> +			 * to set the weak REG_LIVE_READ32.
>> >>  			 */
>> >>  			break;
>> >>  		/* ... then we depend on parent's value */
>> >> -		parent->live |= REG_LIVE_READ;
>> >> +		parent->live |= flags;
>> >>  		state = parent;
>> >>  		parent = state->parent;
>> >>  		writes = true;
>> >> @@ -1172,12 +1178,95 @@ static int mark_reg_read(struct bpf_verifier_env *env,
>> >>  	return 0;
>> >>  }
>> >>  
>> >> +/* This function is supposed to be used by the following 32-bit optimization
>> >> + * code only. It returns TRUE if the source or destination register operates
>> >> + * on 64-bit, otherwise return FALSE.
>> >> + */
>> >> +static bool is_reg64(struct bpf_insn *insn, u32 regno,
>> >> +		     struct bpf_reg_state *reg, enum reg_arg_type t)
>> >> +{
>> >> +	u8 code, class, op;
>> >> +
>> >
>> > why is it called for case when t != SRC_OP ?
>> > this patch is using the return value only in t == SRC_OP case
>> > and other patches don't use is_reg64() at all.
>> 
>> It is used for case when t == DST*, in patch 2/15, please search "rw64" in
>> that patch.
>
> argh. such patch split didn't make it easy to review.

Was trying to make each patch smaller, and was thinking split READ into
READ64 and READ32 doesn't affect currect read check, so could be treated as
an independent first step.

On the other hand, zero-extension candidates are identified through from
read and write info, so perhaps merge them into one patch is more logically
sane, especially when reviews are done.

Will go ahead to merge patch 1 and 2 into a single patch in next version.

>> And "is_reg64" aims to return TRUE if it is 64-bit read when T == SRC_OP,
>> and return TRUE if it is 64-bit write when T = DST*. 
>> 
>> >> +	code = insn->code;
>> >> +	class = BPF_CLASS(code);
>> >> +	op = BPF_OP(code);
>> >> +	if (class == BPF_JMP) {
>> >> +		/* BPF_EXIT will reach here because of return value readability
>> >> +		 * test for "main" which has s32 return value.
>> >> +		 */
>> >> +		if (op == BPF_EXIT)
>> >> +			return false;
>> >
>> > That's not incorrect. bpf2bpf calls return 64-bit values.
>> 
>> bpf2bpf calls has all instructions exposed to insn walker, so the data-flow
>> is naturally tracked. For example:
>>   
>> callee:
>>   w0 = w2
>>   exit
>
> So then it needs to be different code? Like read32_maybe ?

Read this question several times, still can't get the meaning, could you
please be more clear on this?

>> caller:
>>   call callee2
>>   r2 = r0
>>   
>> insn walker should have marked REG_0 is a sub-register define in callee's
>> frame, and such marker is naturally propagetd back to caller's frame inside
>> "prepare_func_exit" which is doing:
>> 
>>   /* return to the caller whatever r0 had in the callee */                 
>>   caller->regs[BPF_REG_0] = *r0;
>> 
>> This copies parentage chain, and also copies the sub-register definition
>> information as we have merged it into reg state.
>> 
>> > bpf abi is such that all bpf progs return 64-bit values.
>> > Historically we truncate to 32-bit in BPF_PROG_RUN,
>> > but some future bpf hook might use all 64 bits of return value.
>> 
>> hmm, was thinking bpf main always return 32-bit, so safe to return FALSE
>> here. If we could have 64-bit main return, then perhaps for main, always do
>> zero-extension on r0 if it comes from sub-register def, this seems a little
>> bit unnecessary, given there is prog_type available during the check, using
>> which we can known whether whether return value is 64-bit or not.
>> 
>> thoughts?
>
> I think it's safer to go with read64 for now and optimize later
> if really necessary.

OK, will do in next version.

>> >> +		if (op == BPF_CALL) {
>> >> +			/* BPF to BPF call will reach here because of marking
>> >> +			 * caller saved clobber with DST_OP_NO_MARK for which we
>> >> +			 * don't care the register def because they are anyway
>> >> +			 * marked as NOT_INIT already.
>> >> +			 */
>> >
>> > the comment doesn't seem to match the code. why return anything here?
>> 
>> It is to handle the case like:
>> 
>>   check_reg_arg(env, caller_saved[i], DST_OP_NO_MARK);
>> 
>> which is called inside "check_func_call" for call insn. I think for caller
>> saved register, return anything is fine. They must have new def inside
>> callee before used. And when returned to caller, they must be restored
>> before used otherwise the prog will be rejected due to their status is
>> marked as NOT_INIT.
>
> but neither this patch nor patch 2 is using the return value.

You mean "neither this patch nor patch 2 is using the return value of
"is_reg64" when we reached it from BPF_CALL insn through above caller saved
registr check, right?

patch 2 is using it inside check_reg_arg:

  reg->subreg_def = rw64 ? DEF_NOT_SUBREG : env->insn_idx;

I think DST_OP_NO_MARK was mostly for customized value range calculation,
so when reg_arg_type is it, subreg_def will still be set.

And as explained, here for caller saved registers, return either TRUE or
FALSE from is_reg64 is fine. TRUE or FALSE will cause different marking for
sub-register def, but because caller saved registers have also been marked
as NOT_INIT, so there must be later restore instruction to override the
sub-register def to correct one.

Let me know if I am answering wrong question.

>> 
>> So, for all cases, they are tracked insn walker accurately.
>> 
>> > The return value won't be used anyway.
>> >
>> > If is_reg64() is inside check_reg_arg() under if (t == SRC_OP)
>> > all these corner cases wouldn't cause review headaches and can be converted
>> > to WARN_ONCE.
>> >
>> > What am I missing?
>> 
>> As explained, is_reg64 could be used by both t == SRC_OP and DST*. And is
>> will be called for instructions with implicit register usage for example
>> CALL, and we need to return access width for those implicitly used register
>> here as well.
>> 
>> Make sense?
>
> Kinda. I'll take a look again.

Thanks.

> My first reaction is that patch 1 and 2 should be one patch or split
> differently.

As replied, will merge patch 1 and patch 2.

Regards,
Jiong


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 bpf-next 14/15] riscv: bpf: eliminate zero extension code-gen
  2019-04-15 17:26 ` [PATCH v4 bpf-next 14/15] riscv: " Jiong Wang
@ 2019-04-17  7:55   ` Björn Töpel
  0 siblings, 0 replies; 30+ messages in thread
From: Björn Töpel @ 2019-04-17  7:55 UTC (permalink / raw)
  To: Jiong Wang; +Cc: Alexei Starovoitov, Daniel Borkmann, bpf, Netdev, oss-drivers

On Mon, 15 Apr 2019 at 19:26, Jiong Wang <jiong.wang@netronome.com> wrote:
>
> CC: Björn Töpel <bjorn.topel@gmail.com>
> Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
> ---
>  arch/riscv/net/bpf_jit_comp.c | 32 +++++++++++++++++++-------------
>  1 file changed, 19 insertions(+), 13 deletions(-)
>
> diff --git a/arch/riscv/net/bpf_jit_comp.c b/arch/riscv/net/bpf_jit_comp.c
> index 80b12aa..9cba262 100644
> --- a/arch/riscv/net/bpf_jit_comp.c
> +++ b/arch/riscv/net/bpf_jit_comp.c
> @@ -731,6 +731,7 @@ static int emit_insn(const struct bpf_insn *insn, struct rv_jit_context *ctx,
>  {
>         bool is64 = BPF_CLASS(insn->code) == BPF_ALU64 ||
>                     BPF_CLASS(insn->code) == BPF_JMP;
> +       struct bpf_prog_aux *aux = ctx->prog->aux;
>         int rvoff, i = insn - ctx->prog->insnsi;
>         u8 rd = -1, rs = -1, code = insn->code;
>         s16 off = insn->off;
> @@ -743,7 +744,7 @@ static int emit_insn(const struct bpf_insn *insn, struct rv_jit_context *ctx,
>         case BPF_ALU | BPF_MOV | BPF_X:
>         case BPF_ALU64 | BPF_MOV | BPF_X:
>                 emit(is64 ? rv_addi(rd, rs, 0) : rv_addiw(rd, rs, 0), ctx);
> -               if (!is64)
> +               if (!is64 && aux->no_verifier_zext)
>                         emit_zext_32(rd, ctx);
>                 break;
>
> @@ -771,19 +772,19 @@ static int emit_insn(const struct bpf_insn *insn, struct rv_jit_context *ctx,
>         case BPF_ALU | BPF_MUL | BPF_X:
>         case BPF_ALU64 | BPF_MUL | BPF_X:
>                 emit(is64 ? rv_mul(rd, rd, rs) : rv_mulw(rd, rd, rs), ctx);
> -               if (!is64)
> +               if (!is64 && aux->no_verifier_zext)
>                         emit_zext_32(rd, ctx);
>                 break;
>         case BPF_ALU | BPF_DIV | BPF_X:
>         case BPF_ALU64 | BPF_DIV | BPF_X:
>                 emit(is64 ? rv_divu(rd, rd, rs) : rv_divuw(rd, rd, rs), ctx);
> -               if (!is64)
> +               if (!is64 && aux->no_verifier_zext)
>                         emit_zext_32(rd, ctx);
>                 break;
>         case BPF_ALU | BPF_MOD | BPF_X:
>         case BPF_ALU64 | BPF_MOD | BPF_X:
>                 emit(is64 ? rv_remu(rd, rd, rs) : rv_remuw(rd, rd, rs), ctx);
> -               if (!is64)
> +               if (!is64 && aux->no_verifier_zext)
>                         emit_zext_32(rd, ctx);
>                 break;
>         case BPF_ALU | BPF_LSH | BPF_X:
> @@ -867,7 +868,7 @@ static int emit_insn(const struct bpf_insn *insn, struct rv_jit_context *ctx,
>         case BPF_ALU | BPF_MOV | BPF_K:
>         case BPF_ALU64 | BPF_MOV | BPF_K:
>                 emit_imm(rd, imm, ctx);
> -               if (!is64)
> +               if (!is64 && aux->no_verifier_zext)
>                         emit_zext_32(rd, ctx);
>                 break;
>
> @@ -882,7 +883,7 @@ static int emit_insn(const struct bpf_insn *insn, struct rv_jit_context *ctx,
>                         emit(is64 ? rv_add(rd, rd, RV_REG_T1) :
>                              rv_addw(rd, rd, RV_REG_T1), ctx);
>                 }
> -               if (!is64)
> +               if (!is64 && aux->no_verifier_zext)
>                         emit_zext_32(rd, ctx);
>                 break;
>         case BPF_ALU | BPF_SUB | BPF_K:
> @@ -895,7 +896,7 @@ static int emit_insn(const struct bpf_insn *insn, struct rv_jit_context *ctx,
>                         emit(is64 ? rv_sub(rd, rd, RV_REG_T1) :
>                              rv_subw(rd, rd, RV_REG_T1), ctx);
>                 }
> -               if (!is64)
> +               if (!is64 && aux->no_verifier_zext)
>                         emit_zext_32(rd, ctx);
>                 break;
>         case BPF_ALU | BPF_AND | BPF_K:
> @@ -906,7 +907,7 @@ static int emit_insn(const struct bpf_insn *insn, struct rv_jit_context *ctx,
>                         emit_imm(RV_REG_T1, imm, ctx);
>                         emit(rv_and(rd, rd, RV_REG_T1), ctx);
>                 }
> -               if (!is64)
> +               if (!is64 && aux->no_verifier_zext)
>                         emit_zext_32(rd, ctx);
>                 break;
>         case BPF_ALU | BPF_OR | BPF_K:
> @@ -917,7 +918,7 @@ static int emit_insn(const struct bpf_insn *insn, struct rv_jit_context *ctx,
>                         emit_imm(RV_REG_T1, imm, ctx);
>                         emit(rv_or(rd, rd, RV_REG_T1), ctx);
>                 }
> -               if (!is64)
> +               if (!is64 && aux->no_verifier_zext)
>                         emit_zext_32(rd, ctx);
>                 break;
>         case BPF_ALU | BPF_XOR | BPF_K:
> @@ -928,7 +929,7 @@ static int emit_insn(const struct bpf_insn *insn, struct rv_jit_context *ctx,
>                         emit_imm(RV_REG_T1, imm, ctx);
>                         emit(rv_xor(rd, rd, RV_REG_T1), ctx);
>                 }
> -               if (!is64)
> +               if (!is64 && aux->no_verifier_zext)
>                         emit_zext_32(rd, ctx);
>                 break;
>         case BPF_ALU | BPF_MUL | BPF_K:
> @@ -936,7 +937,7 @@ static int emit_insn(const struct bpf_insn *insn, struct rv_jit_context *ctx,
>                 emit_imm(RV_REG_T1, imm, ctx);
>                 emit(is64 ? rv_mul(rd, rd, RV_REG_T1) :
>                      rv_mulw(rd, rd, RV_REG_T1), ctx);
> -               if (!is64)
> +               if (!is64 && aux->no_verifier_zext)
>                         emit_zext_32(rd, ctx);
>                 break;
>         case BPF_ALU | BPF_DIV | BPF_K:
> @@ -944,7 +945,7 @@ static int emit_insn(const struct bpf_insn *insn, struct rv_jit_context *ctx,
>                 emit_imm(RV_REG_T1, imm, ctx);
>                 emit(is64 ? rv_divu(rd, rd, RV_REG_T1) :
>                      rv_divuw(rd, rd, RV_REG_T1), ctx);
> -               if (!is64)
> +               if (!is64 && aux->no_verifier_zext)
>                         emit_zext_32(rd, ctx);
>                 break;
>         case BPF_ALU | BPF_MOD | BPF_K:
> @@ -952,7 +953,7 @@ static int emit_insn(const struct bpf_insn *insn, struct rv_jit_context *ctx,
>                 emit_imm(RV_REG_T1, imm, ctx);
>                 emit(is64 ? rv_remu(rd, rd, RV_REG_T1) :
>                      rv_remuw(rd, rd, RV_REG_T1), ctx);
> -               if (!is64)
> +               if (!is64 && aux->no_verifier_zext)
>                         emit_zext_32(rd, ctx);
>                 break;
>         case BPF_ALU | BPF_LSH | BPF_K:
> @@ -1503,6 +1504,11 @@ static void bpf_flush_icache(void *start, void *end)
>         flush_icache_range((unsigned long)start, (unsigned long)end);
>  }
>
> +bool bpf_jit_hardware_zext(void)
> +{
> +       return false;
> +}
> +
>  struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
>  {
>         bool tmp_blinded = false, extra_pass = false;
> --
> 2.7.4
>

Really interesting set!

For the RV-JIT patch:

Acked-by: Björn Töpel <bjorn.topel@gmail.com>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 bpf-next 02/15] bpf: mark lo32 writes that should be zero extended into hi32
  2019-04-15 17:26 ` [PATCH v4 bpf-next 02/15] bpf: mark lo32 writes that should be zero extended into hi32 Jiong Wang
@ 2019-04-18 23:57   ` Alexei Starovoitov
  2019-04-19 20:40     ` Jakub Kicinski
  0 siblings, 1 reply; 30+ messages in thread
From: Alexei Starovoitov @ 2019-04-18 23:57 UTC (permalink / raw)
  To: Jiong Wang; +Cc: daniel, bpf, netdev, oss-drivers

On Mon, Apr 15, 2019 at 06:26:12PM +0100, Jiong Wang wrote:
> eBPF ISA specification requires high 32-bit cleared when low 32-bit
> sub-register is written. This applies to destination register of ALU32 etc.
> JIT back-ends must guarantee this semantic when doing code-gen.
> 
> x86-64 and arm64 ISA has the same semantic, so the corresponding JIT
> back-end doesn't need to do extra work. However, 32-bit arches (arm, nfp
> etc.) and some other 64-bit arches (powerpc, sparc etc), need explicit zero
> extension sequence to meet such semantic.
> 
> This is important, because for code the following:
> 
>   u64_value = (u64) u32_value
>   ... other uses of u64_value
> 
> compiler could exploit the semantic described above and save those zero
> extensions for extending u32_value to u64_value. Hardware, runtime, or BPF
> JIT back-ends, are responsible for guaranteeing this. Some benchmarks show
> ~40% sub-register writes out of total insns, meaning ~40% extra code-gen (
> could go up to more for some arches which requires two shifts for zero
> extension) because JIT back-end needs to do extra code-gen for all such
> instructions.
> 
> However this is not always necessary in case u32_value is never cast into
> a u64, which is quite normal in real life program. So, it would be really
> good if we could identify those places where such type cast happened, and
> only do zero extensions for them, not for the others. This could save a lot
> of BPF code-gen.
> 
> Algo:
>  - Record indices of instructions that do sub-register def (write). And
>    these indices need to stay with reg state so path pruning and bpf
>    to bpf function call could be handled properly.
> 
>    These indices are kept up to date while doing insn walk.
> 
>  - A full register read on an active sub-register def marks the def insn as
>    needing zero extension on dst register.
> 
>  - A new sub-register write overrides the old one.
> 
>    A new full register write makes the register free of zero extension on
>    dst register.
> 
>  - When propagating read64 during path pruning, also marks def insns whose
>    defs are hanging active sub-register.
> 
> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
> ---
>  include/linux/bpf_verifier.h |  6 ++++++
>  kernel/bpf/verifier.c        | 45 ++++++++++++++++++++++++++++++++++++++++----
>  2 files changed, 47 insertions(+), 4 deletions(-)
> 
> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> index fba0ebb..c1923a5 100644
> --- a/include/linux/bpf_verifier.h
> +++ b/include/linux/bpf_verifier.h
> @@ -133,6 +133,11 @@ struct bpf_reg_state {
>  	 * pointing to bpf_func_state.
>  	 */
>  	u32 frameno;
> +	/* Tracks subreg definition. The stored value is the insn_idx of the
> +	 * writing insn. This is safe because subreg_def is used before any insn
> +	 * patching which only happens after main verification finished.
> +	 */
> +	s32 subreg_def;

could you use u32 instead ?
DEF_NOT_SUBREG==0 and valid subreg_def = insn_idx + 1 ?

Then if we miss regs[i].subreg_def = DEF_NOT_SUBREG; somewhere
it will still be conservative.

>  	enum bpf_reg_liveness live;
>  };
>  
> @@ -234,6 +239,7 @@ struct bpf_insn_aux_data {
>  	int ctx_field_size; /* the ctx field size for load insn, maybe 0 */
>  	int sanitize_stack_off; /* stack slot to be cleared */
>  	bool seen; /* this insn was processed by the verifier */
> +	bool zext_dst; /* this insn zero extend dst reg */
>  	u8 alu_state; /* used in combination with alu_limit */
>  	unsigned int orig_idx; /* original instruction index */
>  };
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 5784b279..d5cc167 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -980,6 +980,7 @@ static void mark_reg_not_init(struct bpf_verifier_env *env,
>  	__mark_reg_not_init(regs + regno);
>  }
>  
> +#define DEF_NOT_SUBREG	(-1)
>  static void init_reg_state(struct bpf_verifier_env *env,
>  			   struct bpf_func_state *state)
>  {
> @@ -990,6 +991,7 @@ static void init_reg_state(struct bpf_verifier_env *env,
>  		mark_reg_not_init(env, regs, i);
>  		regs[i].live = REG_LIVE_NONE;
>  		regs[i].parent = NULL;
> +		regs[i].subreg_def = DEF_NOT_SUBREG;
>  	}
>  
>  	/* frame pointer */
> @@ -1259,6 +1261,19 @@ static bool is_reg64(struct bpf_insn *insn, u32 regno,
>  	return true;
>  }
>  
> +static void mark_insn_zext(struct bpf_verifier_env *env,
> +			   struct bpf_reg_state *reg)
> +{
> +	s32 def_idx = reg->subreg_def;
> +
> +	if (def_idx == DEF_NOT_SUBREG)
> +		return;
> +
> +	env->insn_aux_data[def_idx].zext_dst = true;
> +	/* The dst will be zero extended, so won't be sub-register anymore. */
> +	reg->subreg_def = DEF_NOT_SUBREG;
> +}
> +
>  static int check_reg_arg(struct bpf_verifier_env *env, u32 regno,
>  			 enum reg_arg_type t)
>  {
> @@ -1285,6 +1300,9 @@ static int check_reg_arg(struct bpf_verifier_env *env, u32 regno,
>  		if (regno == BPF_REG_FP)
>  			return 0;
>  
> +		if (rw64)
> +			mark_insn_zext(env, reg);
> +
>  		return mark_reg_read(env, reg, reg->parent,
>  				     rw64 ? REG_LIVE_READ64 : REG_LIVE_READ32);
>  	} else {
> @@ -1294,6 +1312,7 @@ static int check_reg_arg(struct bpf_verifier_env *env, u32 regno,
>  			return -EACCES;
>  		}
>  		reg->live |= REG_LIVE_WRITTEN;
> +		reg->subreg_def = rw64 ? DEF_NOT_SUBREG : env->insn_idx;
>  		if (t == DST_OP)
>  			mark_reg_unknown(env, regs, regno);
>  	}
> @@ -2176,6 +2195,12 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
>  						    value_regno);
>  				if (reg_type_may_be_null(reg_type))
>  					regs[value_regno].id = ++env->id_gen;
> +				/* A load of ctx field could have different
> +				 * actual load size with the one encoded in the
> +				 * insn. When the dst is PTR, it is for sure not
> +				 * a sub-register.
> +				 */
> +				regs[value_regno].subreg_def = DEF_NOT_SUBREG;
>  			}
>  			regs[value_regno].type = reg_type;
>  		}
> @@ -3376,6 +3401,9 @@ static int check_helper_call(struct bpf_verifier_env *env, int func_id, int insn
>  		check_reg_arg(env, caller_saved[i], DST_OP_NO_MARK);
>  	}
>  
> +	/* helper call must return full 64-bit R0. */
> +	regs[BPF_REG_0].subreg_def = DEF_NOT_SUBREG;
> +
>  	/* update return register (already marked as written above) */
>  	if (fn->ret_type == RET_INTEGER) {
>  		/* sets type to SCALAR_VALUE */
> @@ -4307,6 +4335,7 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
>  				 */
>  				*dst_reg = *src_reg;
>  				dst_reg->live |= REG_LIVE_WRITTEN;
> +				dst_reg->subreg_def = DEF_NOT_SUBREG;
>  			} else {
>  				/* R1 = (u32) R2 */
>  				if (is_pointer_value(env, insn->src_reg)) {
> @@ -4317,6 +4346,7 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
>  				} else if (src_reg->type == SCALAR_VALUE) {
>  					*dst_reg = *src_reg;
>  					dst_reg->live |= REG_LIVE_WRITTEN;
> +					dst_reg->subreg_def = env->insn_idx;
>  				} else {
>  					mark_reg_unknown(env, regs,
>  							 insn->dst_reg);
> @@ -5380,6 +5410,8 @@ static int check_ld_abs(struct bpf_verifier_env *env, struct bpf_insn *insn)
>  	 * Already marked as written above.
>  	 */
>  	mark_reg_unknown(env, regs, BPF_REG_0);
> +	/* ld_abs load up to 32-bit skb data. */
> +	regs[BPF_REG_0].subreg_def = env->insn_idx;
>  	return 0;
>  }
>  
> @@ -6319,6 +6351,9 @@ static bool states_equal(struct bpf_verifier_env *env,
>  	return true;
>  }
>  
> +/* Return 0 if no propagation happened. Return negative error code if error
> + * happened. Otherwise, return the propagated bits.
> + */
>  static int propagate_liveness_reg(struct bpf_verifier_env *env,
>  				  struct bpf_reg_state *reg,
>  				  struct bpf_reg_state *parent_reg)
> @@ -6337,7 +6372,7 @@ static int propagate_liveness_reg(struct bpf_verifier_env *env,
>  	if (err)
>  		return err;
>  
> -	return 0;
> +	return bits_prop;
>  }
>  
>  /* A write screens off any subsequent reads; but write marks come from the
> @@ -6371,8 +6406,10 @@ static int propagate_liveness(struct bpf_verifier_env *env,
>  		for (i = frame < vstate->curframe ? BPF_REG_6 : 0; i < BPF_REG_FP; i++) {
>  			err = propagate_liveness_reg(env, &state_reg[i],
>  						     &parent_reg[i]);
> -			if (err)
> +			if (err < 0)
>  				return err;
> +			if (err & REG_LIVE_READ64)
> +				mark_insn_zext(env, &parent_reg[i]);

I'm not quite following why it's parent_reg here instead of state_reg.

If I understood the code the liveness can have all three states:
REG_LIVE_READ64 | REG_LIVE_READ32
REG_LIVE_READ64
REG_LIVE_READ32
whereas 2 is a superset of 3, so 1 should never be seen.

If so, why in propagate_liveness we have this dance:
+       u8 parent_bits = parent_reg->live & REG_LIVE_READ;
+       u8 bits = reg->live & REG_LIVE_READ;
+       u8 bits_diff = parent_bits ^ bits;
+       u8 bits_prop = bits_diff & bits;
        int err;

-       if (parent_reg->live & REG_LIVE_READ || !(reg->live & REG_LIVE_READ))
+       /* No diff bit comes from "reg". */
+       if (!bits_prop)

I'm struggling to see through all 3 combinations in respect to above diff.
Shouldn't propagation of REG_LIVE_READ64 remove REG_LIVE_READ32 bit
and clear subreg_def during mark_reg_read() instead of
once in propagate_liveness() ?


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 bpf-next 02/15] bpf: mark lo32 writes that should be zero extended into hi32
  2019-04-18 23:57   ` Alexei Starovoitov
@ 2019-04-19 20:40     ` Jakub Kicinski
  2019-04-19 21:14       ` Alexei Starovoitov
  0 siblings, 1 reply; 30+ messages in thread
From: Jakub Kicinski @ 2019-04-19 20:40 UTC (permalink / raw)
  To: Alexei Starovoitov, Jiong Wang; +Cc: daniel, bpf, netdev, oss-drivers

On Thu, 18 Apr 2019 16:57:50 -0700, Alexei Starovoitov wrote:
> > @@ -6371,8 +6406,10 @@ static int propagate_liveness(struct bpf_verifier_env *env,
> >  		for (i = frame < vstate->curframe ? BPF_REG_6 : 0; i < BPF_REG_FP; i++) {
> >  			err = propagate_liveness_reg(env, &state_reg[i],
> >  						     &parent_reg[i]);
> > -			if (err)
> > +			if (err < 0)
> >  				return err;
> > +			if (err & REG_LIVE_READ64)
> > +				mark_insn_zext(env, &parent_reg[i]);  
> 
> I'm not quite following why it's parent_reg here instead of state_reg.

Perhaps we should rename the parameters to something else than parent
here?  I always have to do some mental gymnastics looking at this code..
"explored" and "current"?

The current state is parent, the "next" state that pruned the search
is "state".  So we check if the reads under state X need 64bit, if so
have to propagate back to writes on current (which is called parent
here, even though it won't become state's parent, ugh.)

> If I understood the code the liveness can have all three states:
> REG_LIVE_READ64 | REG_LIVE_READ32
> REG_LIVE_READ64
> REG_LIVE_READ32
> whereas 2 is a superset of 3, so 1 should never be seen.
> 
> If so, why in propagate_liveness we have this dance:
> +       u8 parent_bits = parent_reg->live & REG_LIVE_READ;
> +       u8 bits = reg->live & REG_LIVE_READ;
> +       u8 bits_diff = parent_bits ^ bits;
> +       u8 bits_prop = bits_diff & bits;
>         int err;
> 
> -       if (parent_reg->live & REG_LIVE_READ || !(reg->live & REG_LIVE_READ))
> +       /* No diff bit comes from "reg". */
> +       if (!bits_prop)
> 
> I'm struggling to see through all 3 combinations in respect to above diff.
> Shouldn't propagation of REG_LIVE_READ64 remove REG_LIVE_READ32 bit
> and clear subreg_def during mark_reg_read() instead of
> once in propagate_liveness() ?

This reminds me, I'm not entirely clear on the need to propagate the
zext through stack slots...  Pointers are guaranteed to be 64bit, we
don't save parentage on scalars (AFAICT), why not pass REG_LIVE_READ
or READ64 to mark_reg_read() from stack_read?

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 bpf-next 02/15] bpf: mark lo32 writes that should be zero extended into hi32
  2019-04-19 20:40     ` Jakub Kicinski
@ 2019-04-19 21:14       ` Alexei Starovoitov
  2019-04-19 21:33         ` Jakub Kicinski
  0 siblings, 1 reply; 30+ messages in thread
From: Alexei Starovoitov @ 2019-04-19 21:14 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: Jiong Wang, daniel, bpf, netdev, oss-drivers

On Fri, Apr 19, 2019 at 01:40:51PM -0700, Jakub Kicinski wrote:
> On Thu, 18 Apr 2019 16:57:50 -0700, Alexei Starovoitov wrote:
> > > @@ -6371,8 +6406,10 @@ static int propagate_liveness(struct bpf_verifier_env *env,
> > >  		for (i = frame < vstate->curframe ? BPF_REG_6 : 0; i < BPF_REG_FP; i++) {
> > >  			err = propagate_liveness_reg(env, &state_reg[i],
> > >  						     &parent_reg[i]);
> > > -			if (err)
> > > +			if (err < 0)
> > >  				return err;
> > > +			if (err & REG_LIVE_READ64)
> > > +				mark_insn_zext(env, &parent_reg[i]);  
> > 
> > I'm not quite following why it's parent_reg here instead of state_reg.
> 
> Perhaps we should rename the parameters to something else than parent
> here?  I always have to do some mental gymnastics looking at this code..
> "explored" and "current"?
> 
> The current state is parent, the "next" state that pruned the search
> is "state".  So we check if the reads under state X need 64bit, if so
> have to propagate back to writes on current (which is called parent
> here, even though it won't become state's parent, ugh.)

agree :)

> > If I understood the code the liveness can have all three states:
> > REG_LIVE_READ64 | REG_LIVE_READ32
> > REG_LIVE_READ64
> > REG_LIVE_READ32
> > whereas 2 is a superset of 3, so 1 should never be seen.
> > 
> > If so, why in propagate_liveness we have this dance:
> > +       u8 parent_bits = parent_reg->live & REG_LIVE_READ;
> > +       u8 bits = reg->live & REG_LIVE_READ;
> > +       u8 bits_diff = parent_bits ^ bits;
> > +       u8 bits_prop = bits_diff & bits;
> >         int err;
> > 
> > -       if (parent_reg->live & REG_LIVE_READ || !(reg->live & REG_LIVE_READ))
> > +       /* No diff bit comes from "reg". */
> > +       if (!bits_prop)
> > 
> > I'm struggling to see through all 3 combinations in respect to above diff.
> > Shouldn't propagation of REG_LIVE_READ64 remove REG_LIVE_READ32 bit
> > and clear subreg_def during mark_reg_read() instead of
> > once in propagate_liveness() ?
> 
> This reminds me, I'm not entirely clear on the need to propagate the
> zext through stack slots...  Pointers are guaranteed to be 64bit, we
> don't save parentage on scalars (AFAICT),

scalars have parentage chain too.
we don't track them precisely when they're spilled to stack.
That actually caused an issue recently when valid program was rejected,
so we might add a feature to track full contents of scalars in the stack.

> why not pass REG_LIVE_READ
> or READ64 to mark_reg_read() from stack_read?

can we agree on only two states first ? ;)


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 bpf-next 02/15] bpf: mark lo32 writes that should be zero extended into hi32
  2019-04-19 21:14       ` Alexei Starovoitov
@ 2019-04-19 21:33         ` Jakub Kicinski
  2019-04-19 21:41           ` Alexei Starovoitov
  0 siblings, 1 reply; 30+ messages in thread
From: Jakub Kicinski @ 2019-04-19 21:33 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: Jiong Wang, daniel, bpf, netdev, oss-drivers

On Fri, 19 Apr 2019 14:14:05 -0700, Alexei Starovoitov wrote:
> > This reminds me, I'm not entirely clear on the need to propagate the
> > zext through stack slots...  Pointers are guaranteed to be 64bit, we
> > don't save parentage on scalars (AFAICT),  
> 
> scalars have parentage chain too.
> we don't track them precisely when they're spilled to stack.
> That actually caused an issue recently when valid program was rejected,
> so we might add a feature to track full contents of scalars in the stack.

Interesting..

> > why not pass REG_LIVE_READ
> > or READ64 to mark_reg_read() from stack_read?  
> 
> can we agree on only two states first ? ;)

Yess, the LIVE_READ was thought to be more of a mask for those accesses
that only care about "any read" being set, to be honest.  As you said
read64 is a strict superset of read32.  Keeping the name REG_LIVE_READ,
rather than REG_LIVE_READ_ANY or _MASK let us leave some of the
existing code untouched.

Jiong's original idea was to add a read32, and have read mean read64.

I think you said we should have read32 and read64 flags, but clear
read32 once read64 gets set?  SGTM!

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 bpf-next 02/15] bpf: mark lo32 writes that should be zero extended into hi32
  2019-04-19 21:33         ` Jakub Kicinski
@ 2019-04-19 21:41           ` Alexei Starovoitov
  2019-04-19 23:27             ` Jiong Wang
  0 siblings, 1 reply; 30+ messages in thread
From: Alexei Starovoitov @ 2019-04-19 21:41 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: Jiong Wang, daniel, bpf, netdev, oss-drivers

On Fri, Apr 19, 2019 at 02:33:10PM -0700, Jakub Kicinski wrote:
> On Fri, 19 Apr 2019 14:14:05 -0700, Alexei Starovoitov wrote:
> > > This reminds me, I'm not entirely clear on the need to propagate the
> > > zext through stack slots...  Pointers are guaranteed to be 64bit, we
> > > don't save parentage on scalars (AFAICT),  
> > 
> > scalars have parentage chain too.
> > we don't track them precisely when they're spilled to stack.
> > That actually caused an issue recently when valid program was rejected,
> > so we might add a feature to track full contents of scalars in the stack.
> 
> Interesting..
> 
> > > why not pass REG_LIVE_READ
> > > or READ64 to mark_reg_read() from stack_read?  
> > 
> > can we agree on only two states first ? ;)
> 
> Yess, the LIVE_READ was thought to be more of a mask for those accesses
> that only care about "any read" being set, to be honest.  As you said
> read64 is a strict superset of read32.  Keeping the name REG_LIVE_READ,
> rather than REG_LIVE_READ_ANY or _MASK let us leave some of the
> existing code untouched.
> 
> Jiong's original idea was to add a read32, and have read mean read64.
> 
> I think you said we should have read32 and read64 flags, but clear
> read32 once read64 gets set?  SGTM!

yep.

any subsequent read64 means that earlier read32 marks are irrelevant
from zext optimization pov.


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 bpf-next 02/15] bpf: mark lo32 writes that should be zero extended into hi32
  2019-04-19 21:41           ` Alexei Starovoitov
@ 2019-04-19 23:27             ` Jiong Wang
  2019-04-19 23:28               ` Alexei Starovoitov
  0 siblings, 1 reply; 30+ messages in thread
From: Jiong Wang @ 2019-04-19 23:27 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Jakub Kicinski, Jiong Wang, daniel, bpf, netdev, oss-drivers


Alexei Starovoitov writes:

> On Fri, Apr 19, 2019 at 02:33:10PM -0700, Jakub Kicinski wrote:
>> On Fri, 19 Apr 2019 14:14:05 -0700, Alexei Starovoitov wrote:
>> > > This reminds me, I'm not entirely clear on the need to propagate the
>> > > zext through stack slots...  Pointers are guaranteed to be 64bit, we
>> > > don't save parentage on scalars (AFAICT),  
>> > 
>> > scalars have parentage chain too.
>> > we don't track them precisely when they're spilled to stack.
>> > That actually caused an issue recently when valid program was rejected,
>> > so we might add a feature to track full contents of scalars in the stack.
>> 
>> Interesting..
>> 
>> > > why not pass REG_LIVE_READ
>> > > or READ64 to mark_reg_read() from stack_read?  
>> > 
>> > can we agree on only two states first ? ;)
>> 
>> Yess, the LIVE_READ was thought to be more of a mask for those accesses
>> that only care about "any read" being set, to be honest.  As you said
>> read64 is a strict superset of read32.  Keeping the name REG_LIVE_READ,
>> rather than REG_LIVE_READ_ANY or _MASK let us leave some of the
>> existing code untouched.
>> 
>> Jiong's original idea was to add a read32, and have read mean read64.
>> 
>> I think you said we should have read32 and read64 flags, but clear
>> read32 once read64 gets set?  SGTM!
>
> yep.
>
> any subsequent read64 means that earlier read32 marks are irrelevant
> from zext optimization pov.

OK, will split REG_LIVE_READ into REG_LIVE_READ64 and REG_LIVE_READ32, and
will let the prior override the latter early inside mark_reg_read. I feel
renaming parameter for propagate_liveness (the "parent" etc) could be a
following up patch? Let me know if you want it included in this set.

(I am travelling, will try to send out updated version around next Tuesday)

Regards,
Jiong

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 bpf-next 02/15] bpf: mark lo32 writes that should be zero extended into hi32
  2019-04-19 23:27             ` Jiong Wang
@ 2019-04-19 23:28               ` Alexei Starovoitov
  0 siblings, 0 replies; 30+ messages in thread
From: Alexei Starovoitov @ 2019-04-19 23:28 UTC (permalink / raw)
  To: Jiong Wang
  Cc: Jakub Kicinski, Daniel Borkmann, bpf, Network Development, oss-drivers

On Fri, Apr 19, 2019 at 4:27 PM Jiong Wang <jiong.wang@netronome.com> wrote:
>
> OK, will split REG_LIVE_READ into REG_LIVE_READ64 and REG_LIVE_READ32, and
> will let the prior override the latter early inside mark_reg_read. I feel
> renaming parameter for propagate_liveness (the "parent" etc) could be a
> following up patch? Let me know if you want it included in this set.

of course. renaming should be separate.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 bpf-next 15/15] nfp: bpf: eliminate zero extension code-gen
  2019-04-15 17:26 ` [PATCH v4 bpf-next 15/15] nfp: " Jiong Wang
@ 2019-04-24 16:31   ` kbuild test robot
  0 siblings, 0 replies; 30+ messages in thread
From: kbuild test robot @ 2019-04-24 16:31 UTC (permalink / raw)
  To: Jiong Wang
  Cc: kbuild-all, alexei.starovoitov, daniel, bpf, netdev, oss-drivers,
	Jiong Wang

Hi Jiong,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on bpf-next/master]

url:    https://github.com/0day-ci/linux/commits/Jiong-Wang/bpf-eliminate-zero-extensions-for-sub-register-writes/20190416-191711
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
reproduce:
        # apt-get install sparse
        make ARCH=x86_64 allmodconfig
        make C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__'

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>

sparse warnings: (new ones prefixed by >>)

>> drivers/net/ethernet/netronome/nfp/bpf/jit.c:2654:34: sparse: sparse: incorrect type in argument 3 (different base types) @@    expected unsigned char [usertype] dst @@    got resunsigned char [usertype] dst @@
>> drivers/net/ethernet/netronome/nfp/bpf/jit.c:2654:34: sparse:    expected unsigned char [usertype] dst
>> drivers/net/ethernet/netronome/nfp/bpf/jit.c:2654:34: sparse:    got restricted swreg [usertype] dst
   drivers/net/ethernet/netronome/nfp/bpf/jit.c:2680:34: sparse: sparse: incorrect type in argument 3 (different base types) @@    expected unsigned char [usertype] dst @@    got resunsigned char [usertype] dst @@
   drivers/net/ethernet/netronome/nfp/bpf/jit.c:2680:34: sparse:    expected unsigned char [usertype] dst
   drivers/net/ethernet/netronome/nfp/bpf/jit.c:2680:34: sparse:    got restricted swreg [usertype] dst

vim +2654 drivers/net/ethernet/netronome/nfp/bpf/jit.c

  2627	
  2628	static int mem_ldx_skb(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
  2629			       u8 size)
  2630	{
  2631		swreg dst = reg_both(meta->insn.dst_reg * 2);
  2632	
  2633		switch (meta->insn.off) {
  2634		case offsetof(struct __sk_buff, len):
  2635			if (size != FIELD_SIZEOF(struct __sk_buff, len))
  2636				return -EOPNOTSUPP;
  2637			wrp_mov(nfp_prog, dst, plen_reg(nfp_prog));
  2638			break;
  2639		case offsetof(struct __sk_buff, data):
  2640			if (size != FIELD_SIZEOF(struct __sk_buff, data))
  2641				return -EOPNOTSUPP;
  2642			wrp_mov(nfp_prog, dst, pptr_reg(nfp_prog));
  2643			break;
  2644		case offsetof(struct __sk_buff, data_end):
  2645			if (size != FIELD_SIZEOF(struct __sk_buff, data_end))
  2646				return -EOPNOTSUPP;
  2647			emit_alu(nfp_prog, dst,
  2648				 plen_reg(nfp_prog), ALU_OP_ADD, pptr_reg(nfp_prog));
  2649			break;
  2650		default:
  2651			return -EOPNOTSUPP;
  2652		}
  2653	
> 2654		wrp_zext(nfp_prog, meta, dst);
  2655	
  2656		return 0;
  2657	}
  2658	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2019-04-24 16:31 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-04-15 17:26 [PATCH v4 bpf-next 00/15] bpf: eliminate zero extensions for sub-register writes Jiong Wang
2019-04-15 17:26 ` [PATCH v4 bpf-next 01/15] bpf: split read liveness into REG_LIVE_READ64 and REG_LIVE_READ32 Jiong Wang
2019-04-15 23:03   ` Jakub Kicinski
2019-04-16  1:26   ` Alexei Starovoitov
2019-04-16  7:39     ` Jiong Wang
2019-04-16 16:20       ` Alexei Starovoitov
2019-04-16 20:19         ` Jiong Wang
2019-04-15 17:26 ` [PATCH v4 bpf-next 02/15] bpf: mark lo32 writes that should be zero extended into hi32 Jiong Wang
2019-04-18 23:57   ` Alexei Starovoitov
2019-04-19 20:40     ` Jakub Kicinski
2019-04-19 21:14       ` Alexei Starovoitov
2019-04-19 21:33         ` Jakub Kicinski
2019-04-19 21:41           ` Alexei Starovoitov
2019-04-19 23:27             ` Jiong Wang
2019-04-19 23:28               ` Alexei Starovoitov
2019-04-15 17:26 ` [PATCH v4 bpf-next 03/15] bpf: reduce false alarm by refining helper call arg types Jiong Wang
2019-04-15 17:26 ` [PATCH v4 bpf-next 04/15] bpf: insert explicit zero extension insn when hardware doesn't do it implicitly Jiong Wang
2019-04-15 17:26 ` [PATCH v4 bpf-next 05/15] bpf: introduce new bpf prog load flags "BPF_F_TEST_RND_HI32" Jiong Wang
2019-04-15 17:26 ` [PATCH v4 bpf-next 06/15] bpf: randomize high 32-bit when BPF_F_TEST_RND_HI32 is set Jiong Wang
2019-04-15 17:26 ` [PATCH v4 bpf-next 07/15] libbpf: add "prog_flags" to bpf_program/bpf_prog_load_attr/bpf_load_program_attr Jiong Wang
2019-04-15 17:26 ` [PATCH v4 bpf-next 08/15] selftests: enable hi32 randomization for all tests Jiong Wang
2019-04-15 17:26 ` [PATCH v4 bpf-next 09/15] arm: bpf: eliminate zero extension code-gen Jiong Wang
2019-04-15 17:26 ` [PATCH v4 bpf-next 10/15] powerpc: " Jiong Wang
2019-04-15 17:26 ` [PATCH v4 bpf-next 11/15] s390: " Jiong Wang
2019-04-15 17:26 ` [PATCH v4 bpf-next 12/15] sparc: " Jiong Wang
2019-04-15 17:26 ` [PATCH v4 bpf-next 13/15] x32: " Jiong Wang
2019-04-15 17:26 ` [PATCH v4 bpf-next 14/15] riscv: " Jiong Wang
2019-04-17  7:55   ` Björn Töpel
2019-04-15 17:26 ` [PATCH v4 bpf-next 15/15] nfp: " Jiong Wang
2019-04-24 16:31   ` kbuild test robot

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.