netdev.vger.kernel.org archive mirror
* [PATCH bpf-next 00/13] bpf: propose new jmp32 instructions
@ 2018-12-19 22:44 Jiong Wang
From: Jiong Wang @ 2018-12-19 22:44 UTC (permalink / raw)
  To: ast, daniel
  Cc: netdev, oss-drivers, Jiong Wang, David S . Miller, Paul Burton,
	Wang YanQing, Zi Shen Lim, Shubham Bansal, Naveen N . Rao,
	Sandipan Das, Martin Schwidefsky, Heiko Carstens

The current eBPF ISA has 32-bit sub-registers and defines a set of ALU32
instructions.

However, there are no JMP32 instructions, so code-gen for 32-bit
sub-registers is inefficient. For example, an explicit sign extension from
32-bit to 64-bit is needed for signed comparisons.

Adding JMP32 instructions would therefore complete the eBPF ISA's 32-bit
sub-register support. It also matches the JMP32 instructions in most JIT
backends, for example x86-64 and AArch64, so the new eBPF JMP32 instructions
can map one-to-one onto them.

A few verifier ALU32-related bugs have been fixed recently, and the JMP32
instructions introduced by this set further improve the BPF sub-register
ecosystem. Once this lands, BPF programs using the 32-bit sub-register ISA
should get reasonably good support from both the verifier and the JIT
compilers. Users can then compare the runtime efficiency of a BPF program
under both modes and use whichever benchmarks better. One nice property is
that JMP32 makes 32-bit JIT more efficient: a conditional jump only has
32-bit uses and no defs, so unlike ALU32 there is no need to clear the high
bits. Hence, even without data-flow analysis, JMP32 generates better code
than JMP64. Benchmark results are listed below in this cover letter.

 - Encoding

   Ideally, JMP32 could use a new class, BPF_JMP32, just like BPF_ALU
   and BPF_ALU32. But we only have one unused class number, 0x06, and
   I am not sure whether we want to keep it for other extension
   purposes, for example restoring it as BPF_MISC, which could then
   redefine the interpretation of all the remaining bits in bits[7:1].

   So I am following the approach used by BPF_PSEUDO_CALL, that is,
   using reserved bits under BPF_JMP. When BPF_SRC(code) == BPF_X, the
   encoding is 0x1 at insn->imm. When BPF_SRC(code) == BPF_K, the
   encoding is 0x1 at insn->src_reg. All other bits in imm and src_reg
   remain reserved and should be zeroed.

 - Testing

   A couple of unit tests have been added and are included in this
   set. LLVM code-gen for JMP32 has also been added, so you can
   compile any BPF C program with both -mcpu=probe and -mattr=+alu32
   specified. If you are compiling on a machine whose kernel has been
   patched with this set, LLVM will select the ISA automatically based
   on the host probe results. Otherwise, specify -mcpu=v3 and
   -mattr=+alu32 to force the JMP32 ISA and enable sub-register
   code-gen.

   LLVM support could be found at:

     https://github.com/Netronome/llvm/commit/607f088b92ebfb09f026a84a9443a59237cf6628

   (I will send out a merge request once this kernel set reaches
    consensus. Hopefully it can get into LLVM 8.0, which will be
    branched on 16-Jan-2019.)

   I have compiled the BPF selftests with JMP32 enabled. The BPF
   selftest Makefile introduces a new variable, "BPF_SELFTEST_32BIT",
   which lets the BPF C programs in the testsuite be compiled in
   sub-register mode, in which ALU32 and JMP32 instructions are
   generated once the kernel installed on the compilation machine
   supports them. In my tests there is no regression in this
   sub-register test mode, except when loading bpf_flow.o, for which
   the verifier somehow fails to reason about the packet range
   accurately. test_progs, which contains quite a few BPF C tests,
   passes cleanly.

   Using an environment variable to control the test mode seems to
   bring the smallest change to the Makefile; it requires running
   "make check" with BPF_SELFTEST_32BIT defined in your test driver
   script for this new test mode.

   I would appreciate any better ideas on how to enable an extra test
   mode for the BPF selftests.

 - JIT backends support

   Most JIT backends are covered by this set, except SPARC and MIPS,
   for which I need the maintainers' help with the implementations.
   @David, @Paul, I would appreciate your help on this.

   Also, the backends implemented in this set need the port
   maintainers' review and testing; I have only tested x86_64 and NFP.

 - Benchmarking

   Below are some benchmark results from Cilium BPF programs. With
   JMP32 enabled, we see a consistent code size reduction, and the
   number of processed instructions is reduced in general as well.

   Text size in bytes (generated by "size")
   ===
   LLVM code-gen option     default   alu32   alu32 + jmp32  change
                                                             (Vs. alu32)
   bpf_lb-DLB_L3.o:         6456      6280    6160           -1.91%
   bpf_lb-DLB_L4.o:         7848      7664    7136           -6.89%
   bpf_lb-DUNKNOWN.o:       2680      2664    2568           -3.60%
   bpf_lxc.o:               104824    104744  97360          -7.05%
   bpf_netdev.o:            23456     23576   21632          -8.25%
   bpf_overlay.o:           16184     16304   14648          -10.16%
   
   Processed insn number
   ===
   LLVM code-gen option     default   alu32   alu32 + jmp32  change
   
   bpf_lb-DLB_L3.o:         1579      1281    1304           +1.79%
   bpf_lb-DLB_L4.o:         2045      1663    1554           -6.55%
   bpf_lb-DUNKNOWN.o:       606       513     505            -1.56%
   bpf_lxc.o:               85381     103218  102666         -0.53%
   bpf_netdev.o:            5246      5809    5376           -7.45%
   bpf_overlay.o:           2443      2705    2460           -9.05%
   
   JITed insn number (on NFP; other 32-bit arches should be similar)
   ===
   LLVM code-gen option     default   alu32   alu32 + jmp32  change
                                                             (Vs. alu32)
   one ~300 line C program  632       612     597            -2.45%
   (NFP emits some fixed sequences, so the real improvement is higher)

Thanks.

Cc: David S. Miller <davem@davemloft.net>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Wang YanQing <udknight@gmail.com>
Cc: Zi Shen Lim <zlim.lnx@gmail.com>
Cc: Shubham Bansal <illusionist.neo@gmail.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.ibm.com>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>

Jiong Wang (13):
  bpf: encoding description and macros for JMP32
  bpf: interpreter support for JMP32
  bpf: JIT blinds support JMP32
  x86_64: bpf: implement jitting of JMP32
  x32: bpf: implement jitting of JMP32
  arm64: bpf: implement jitting of JMP32
  arm: bpf: implement jitting of JMP32
  ppc: bpf: implement jitting of JMP32
  s390: bpf: implement jitting of JMP32
  nfp: bpf: implement jitting of JMP32
  bpf: verifier support JMP32
  bpf: unit tests for JMP32
  selftests: bpf: makefile support sub-register code-gen test mode

 Documentation/networking/filter.txt          |  10 +
 arch/arm/net/bpf_jit_32.c                    |  23 +-
 arch/arm64/net/bpf_jit_comp.c                |  10 +-
 arch/powerpc/net/bpf_jit_comp64.c            |  50 ++++-
 arch/s390/net/bpf_jit_comp.c                 |  12 +-
 arch/x86/net/bpf_jit_comp.c                  |  13 +-
 arch/x86/net/bpf_jit_comp32.c                |  46 ++--
 drivers/net/ethernet/netronome/nfp/bpf/jit.c |  69 ++++--
 include/linux/filter.h                       |  19 ++
 include/uapi/linux/bpf.h                     |   4 +
 kernel/bpf/core.c                            |  60 +++--
 kernel/bpf/verifier.c                        | 178 +++++++++++----
 lib/test_bpf.c                               | 321 ++++++++++++++++++++++++++-
 tools/include/uapi/linux/bpf.h               |   4 +
 tools/testing/selftests/bpf/Makefile         |   4 +
 15 files changed, 696 insertions(+), 127 deletions(-)

-- 
2.7.4


* [PATCH bpf-next 01/13] bpf: encoding description and macros for JMP32
@ 2018-12-19 22:44 ` Jiong Wang
From: Jiong Wang @ 2018-12-19 22:44 UTC (permalink / raw)
  To: ast, daniel; +Cc: netdev, oss-drivers, Jiong Wang

This patch updates the kernel BPF documentation for the JMP32 encoding.

Two macros are added to make programming with JMP32 easier.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
---
 Documentation/networking/filter.txt | 10 ++++++++++
 include/uapi/linux/bpf.h            |  4 ++++
 tools/include/uapi/linux/bpf.h      |  4 ++++
 3 files changed, 18 insertions(+)

diff --git a/Documentation/networking/filter.txt b/Documentation/networking/filter.txt
index 2196b82..5f6fafa 100644
--- a/Documentation/networking/filter.txt
+++ b/Documentation/networking/filter.txt
@@ -939,6 +939,16 @@ in eBPF means function exit only. The eBPF program needs to store return
 value into register R0 before doing a BPF_EXIT. Class 6 in eBPF is currently
 unused and reserved for future use.
 
+BPF_JMP also has a sub-opcode. When BPF_SRC(code) == BPF_X, the encoding is
+at insn->imm, starting from the LSB. When BPF_SRC(code) == BPF_K, the
+encoding is at insn->src_reg, also starting from the LSB. Only one bit is
+used for the sub-opcode at the moment; all other bits in imm and src_reg
+remain reserved and should be zeroed. The BPF_JMP sub-opcode is:
+
+  BPF_JMP_SUBOP_32BIT	0x1
+
+It means the jump insn uses 32-bit sub-registers when doing the comparison.
+
 For load and store instructions the 8-bit 'code' field is divided as:
 
   +--------+--------+-------------------+
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index e7d57e89..f30d646 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -42,6 +42,10 @@
 #define BPF_CALL	0x80	/* function call */
 #define BPF_EXIT	0x90	/* function return */
 
+/* jmp has sub-opcode */
+#define BPF_JMP_SUBOP_MASK	0x1
+#define BPF_JMP_SUBOP_32BIT	0x1
+
 /* Register numbers */
 enum {
 	BPF_REG_0 = 0,
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index e7d57e89..f30d646 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -42,6 +42,10 @@
 #define BPF_CALL	0x80	/* function call */
 #define BPF_EXIT	0x90	/* function return */
 
+/* jmp has sub-opcode */
+#define BPF_JMP_SUBOP_MASK	0x1
+#define BPF_JMP_SUBOP_32BIT	0x1
+
 /* Register numbers */
 enum {
 	BPF_REG_0 = 0,
-- 
2.7.4


* [PATCH bpf-next 02/13] bpf: interpreter support for JMP32
@ 2018-12-19 22:44 ` Jiong Wang
From: Jiong Wang @ 2018-12-19 22:44 UTC (permalink / raw)
  To: ast, daniel; +Cc: netdev, oss-drivers, Jiong Wang

This patch implements interpretation of the new JMP32 instructions.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
---
 kernel/bpf/core.c | 52 ++++++++++++++++++++++++++++++++--------------------
 1 file changed, 32 insertions(+), 20 deletions(-)

diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 5cdd8da..33609f3 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -57,6 +57,7 @@
 #define ARG1	regs[BPF_REG_ARG1]
 #define CTX	regs[BPF_REG_CTX]
 #define IMM	insn->imm
+#define SRCNO	insn->src_reg
 
 /* No hurry in this branch
  *
@@ -1366,121 +1367,132 @@ static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn, u64 *stack)
 		insn += insn->off;
 		CONT;
 	JMP_JEQ_X:
-		if (DST == SRC) {
+		if (DST == SRC || (IMM && (u32) DST == (u32) SRC)) {
 			insn += insn->off;
 			CONT_JMP;
 		}
 		CONT;
 	JMP_JEQ_K:
-		if (DST == IMM) {
+		if (DST == IMM || (SRCNO && (u32) DST == (u32) IMM)) {
 			insn += insn->off;
 			CONT_JMP;
 		}
 		CONT;
 	JMP_JNE_X:
-		if (DST != SRC) {
+		if ((!IMM && DST != SRC) || (IMM && (u32) DST != (u32) SRC)) {
 			insn += insn->off;
 			CONT_JMP;
 		}
 		CONT;
 	JMP_JNE_K:
-		if (DST != IMM) {
+		if ((!SRCNO && DST != IMM) ||
+		    (SRCNO && (u32) DST != (u32) IMM)) {
 			insn += insn->off;
 			CONT_JMP;
 		}
 		CONT;
 	JMP_JGT_X:
-		if (DST > SRC) {
+		if ((!IMM && DST > SRC) || (IMM && (u32) DST > (u32) SRC)) {
 			insn += insn->off;
 			CONT_JMP;
 		}
 		CONT;
 	JMP_JGT_K:
-		if (DST > IMM) {
+		if ((!SRCNO && DST > IMM) || (SRCNO && (u32) DST > (u32) IMM)) {
 			insn += insn->off;
 			CONT_JMP;
 		}
 		CONT;
 	JMP_JLT_X:
-		if (DST < SRC) {
+		if ((!IMM && DST < SRC) || (IMM && (u32) DST < (u32) SRC)) {
 			insn += insn->off;
 			CONT_JMP;
 		}
 		CONT;
 	JMP_JLT_K:
-		if (DST < IMM) {
+		if ((!SRCNO && DST < IMM) || (SRCNO && (u32) DST < (u32) IMM)) {
 			insn += insn->off;
 			CONT_JMP;
 		}
 		CONT;
 	JMP_JGE_X:
-		if (DST >= SRC) {
+		if ((!IMM && DST >= SRC) || (IMM && (u32) DST >= (u32) SRC)) {
 			insn += insn->off;
 			CONT_JMP;
 		}
 		CONT;
 	JMP_JGE_K:
-		if (DST >= IMM) {
+		if ((!SRCNO && DST >= IMM) ||
+		    (SRCNO && (u32) DST >= (u32) IMM)) {
 			insn += insn->off;
 			CONT_JMP;
 		}
 		CONT;
 	JMP_JLE_X:
-		if (DST <= SRC) {
+		if ((!IMM && DST <= SRC) || (IMM && (u32) DST <= (u32) SRC)) {
 			insn += insn->off;
 			CONT_JMP;
 		}
 		CONT;
 	JMP_JLE_K:
-		if (DST <= IMM) {
+		if ((!SRCNO && DST <= IMM) ||
+		    (SRCNO && (u32) DST <= (u32) IMM)) {
 			insn += insn->off;
 			CONT_JMP;
 		}
 		CONT;
 	JMP_JSGT_X:
-		if (((s64) DST) > ((s64) SRC)) {
+		if ((!IMM && ((s64) DST) > (s64) SRC) ||
+		    (IMM && (s32) (u32) DST > (s32) (u32) SRC)) {
 			insn += insn->off;
 			CONT_JMP;
 		}
 		CONT;
 	JMP_JSGT_K:
-		if (((s64) DST) > ((s64) IMM)) {
+		if ((!SRCNO && (s64) DST > (s64) IMM) ||
+		    (SRCNO && (s32) (u32) DST > (s32) (u32) IMM)) {
 			insn += insn->off;
 			CONT_JMP;
 		}
 		CONT;
 	JMP_JSLT_X:
-		if (((s64) DST) < ((s64) SRC)) {
+		if ((!IMM && (s64) DST < (s64) SRC) ||
+		    (IMM && (s32) (u32) DST < (s32) (u32) SRC)) {
 			insn += insn->off;
 			CONT_JMP;
 		}
 		CONT;
 	JMP_JSLT_K:
-		if (((s64) DST) < ((s64) IMM)) {
+		if ((!SRCNO && (s64) DST < (s64) IMM) ||
+		    (SRCNO && (s32) (u32) DST < (s32) (u32) IMM)) {
 			insn += insn->off;
 			CONT_JMP;
 		}
 		CONT;
 	JMP_JSGE_X:
-		if (((s64) DST) >= ((s64) SRC)) {
+		if ((!IMM && (s64) DST >= (s64) SRC) ||
+		    (IMM && (s32) (u32) DST >= (s32) (u32) SRC)) {
 			insn += insn->off;
 			CONT_JMP;
 		}
 		CONT;
 	JMP_JSGE_K:
-		if (((s64) DST) >= ((s64) IMM)) {
+		if ((!SRCNO && (s64) DST >= (s64) IMM) ||
+		    (SRCNO && (s32) (u32) DST >= (s32) (u32) IMM)) {
 			insn += insn->off;
 			CONT_JMP;
 		}
 		CONT;
 	JMP_JSLE_X:
-		if (((s64) DST) <= ((s64) SRC)) {
+		if ((!IMM && (s64) DST <= (s64) SRC) ||
+		    (IMM && (s32) (u32) DST <= (s32) (u32) SRC)) {
 			insn += insn->off;
 			CONT_JMP;
 		}
 		CONT;
 	JMP_JSLE_K:
-		if (((s64) DST) <= ((s64) IMM)) {
+		if ((!SRCNO && (s64) DST <= (s64) IMM) ||
+		    (SRCNO && (s32) (u32) DST <= (s32) (u32) IMM)) {
 			insn += insn->off;
 			CONT_JMP;
 		}
-- 
2.7.4


* [PATCH bpf-next 03/13] bpf: JIT blinds support JMP32
@ 2018-12-19 22:44 ` Jiong Wang
From: Jiong Wang @ 2018-12-19 22:44 UTC (permalink / raw)
  To: ast, daniel; +Cc: netdev, oss-drivers, Jiong Wang

JIT blinding builds eBPF JMP insns directly, and it transforms BPF_K
into BPF_X.

Update the code to be aware of JMP32.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
---
 kernel/bpf/core.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 33609f3..0252e28 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -902,7 +902,13 @@ static int bpf_jit_blind_insn(const struct bpf_insn *from,
 			off -= 2;
 		*to++ = BPF_ALU64_IMM(BPF_MOV, BPF_REG_AX, imm_rnd ^ from->imm);
 		*to++ = BPF_ALU64_IMM(BPF_XOR, BPF_REG_AX, imm_rnd);
-		*to++ = BPF_JMP_REG(from->code, from->dst_reg, BPF_REG_AX, off);
+		*to = BPF_JMP_REG(from->code, from->dst_reg, BPF_REG_AX, off);
+		if (from->src_reg)
+			/* NOTE: BPF_K has been transformed into BPF_X,
+			 * mark imm instead of src_reg
+			 */
+			to->imm |= BPF_JMP_SUBOP_32BIT;
+		to++;
 		break;
 
 	case BPF_LD | BPF_IMM | BPF_DW:
-- 
2.7.4


* [PATCH bpf-next 04/13] x86_64: bpf: implement jitting of JMP32
@ 2018-12-19 22:44 ` Jiong Wang
From: Jiong Wang @ 2018-12-19 22:44 UTC (permalink / raw)
  To: ast, daniel; +Cc: netdev, oss-drivers, Jiong Wang

This patch implements code-gen for new JMP32 instructions on x86_64.

Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
---
 arch/x86/net/bpf_jit_comp.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 5542303..499f1c1 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -882,8 +882,12 @@ xadd:			if (is_imm8(insn->off))
 		case BPF_JMP | BPF_JSGE | BPF_X:
 		case BPF_JMP | BPF_JSLE | BPF_X:
 			/* cmp dst_reg, src_reg */
-			EMIT3(add_2mod(0x48, dst_reg, src_reg), 0x39,
-			      add_2reg(0xC0, dst_reg, src_reg));
+			if (!imm32) /* not JMP32, i.e. 64-bit cmp */
+				EMIT1(add_2mod(0x48, dst_reg, src_reg));
+			else if (is_ereg(dst_reg) || is_ereg(src_reg))
+				EMIT1(add_2mod(0x40, dst_reg, src_reg));
+
+			EMIT2(0x39, add_2reg(0xC0, dst_reg, src_reg));
 			goto emit_cond_jmp;
 
 		case BPF_JMP | BPF_JSET | BPF_X:
@@ -909,7 +913,10 @@ xadd:			if (is_imm8(insn->off))
 		case BPF_JMP | BPF_JSGE | BPF_K:
 		case BPF_JMP | BPF_JSLE | BPF_K:
 			/* cmp dst_reg, imm8/32 */
-			EMIT1(add_1mod(0x48, dst_reg));
+			if (!src_reg) /* not JMP32, i.e. 64-bit cmp */
+				EMIT1(add_1mod(0x48, dst_reg));
+			else if (is_ereg(dst_reg))
+				EMIT1(add_1mod(0x40, dst_reg));
 
 			if (is_imm8(imm32))
 				EMIT3(0x83, add_1reg(0xF8, dst_reg), imm32);
-- 
2.7.4


* [PATCH bpf-next 05/13] x32: bpf: implement jitting of JMP32
@ 2018-12-19 22:44 ` Jiong Wang
From: Jiong Wang @ 2018-12-19 22:44 UTC (permalink / raw)
  To: ast, daniel; +Cc: netdev, oss-drivers, Jiong Wang, Wang YanQing

This patch implements code-gen for new JMP32 instructions on x32.

Cc: Wang YanQing <udknight@gmail.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
---
 arch/x86/net/bpf_jit_comp32.c | 46 ++++++++++++++++++++++++++++---------------
 1 file changed, 30 insertions(+), 16 deletions(-)

diff --git a/arch/x86/net/bpf_jit_comp32.c b/arch/x86/net/bpf_jit_comp32.c
index 8f6cc71..9f08918 100644
--- a/arch/x86/net/bpf_jit_comp32.c
+++ b/arch/x86/net/bpf_jit_comp32.c
@@ -2077,24 +2077,33 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
 			u8 dreg_hi = dstk ? IA32_EDX : dst_hi;
 			u8 sreg_lo = sstk ? IA32_ECX : src_lo;
 			u8 sreg_hi = sstk ? IA32_EBX : src_hi;
+			bool is_jmp64 = !imm32;
 
 			if (dstk) {
 				EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
 				      STACK_VAR(dst_lo));
-				EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EDX),
-				      STACK_VAR(dst_hi));
+				if (is_jmp64)
+					EMIT3(0x8B,
+					      add_2reg(0x40, IA32_EBP,
+						       IA32_EDX),
+					      STACK_VAR(dst_hi));
 			}
 
 			if (sstk) {
 				EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_ECX),
 				      STACK_VAR(src_lo));
-				EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EBX),
-				      STACK_VAR(src_hi));
+				if (is_jmp64)
+					EMIT3(0x8B,
+					      add_2reg(0x40, IA32_EBP,
+						       IA32_EBX),
+					      STACK_VAR(src_hi));
 			}
 
-			/* cmp dreg_hi,sreg_hi */
-			EMIT2(0x39, add_2reg(0xC0, dreg_hi, sreg_hi));
-			EMIT2(IA32_JNE, 2);
+			if (is_jmp64) {
+				/* cmp dreg_hi,sreg_hi */
+				EMIT2(0x39, add_2reg(0xC0, dreg_hi, sreg_hi));
+				EMIT2(IA32_JNE, 2);
+			}
 			/* cmp dreg_lo,sreg_lo */
 			EMIT2(0x39, add_2reg(0xC0, dreg_lo, sreg_lo));
 			goto emit_cond_jmp;
@@ -2169,23 +2178,28 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
 			u8 dreg_hi = dstk ? IA32_EDX : dst_hi;
 			u8 sreg_lo = IA32_ECX;
 			u8 sreg_hi = IA32_EBX;
+			bool is_jmp64 = !insn->src_reg;
 
 			if (dstk) {
 				EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
 				      STACK_VAR(dst_lo));
-				EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EDX),
-				      STACK_VAR(dst_hi));
+				if (is_jmp64)
+					EMIT3(0x8B,
+					      add_2reg(0x40, IA32_EBP,
+						       IA32_EDX),
+					      STACK_VAR(dst_hi));
 			}
 
-			hi = imm32 & (1<<31) ? (u32)~0 : 0;
 			/* mov ecx,imm32 */
 			EMIT2_off32(0xC7, add_1reg(0xC0, IA32_ECX), imm32);
-			/* mov ebx,imm32 */
-			EMIT2_off32(0xC7, add_1reg(0xC0, IA32_EBX), hi);
-
-			/* cmp dreg_hi,sreg_hi */
-			EMIT2(0x39, add_2reg(0xC0, dreg_hi, sreg_hi));
-			EMIT2(IA32_JNE, 2);
+			if (is_jmp64) {
+				hi = imm32 & (1 << 31) ? (u32)~0 : 0;
+				/* mov ebx,imm32 */
+				EMIT2_off32(0xC7, add_1reg(0xC0, IA32_EBX), hi);
+				/* cmp dreg_hi,sreg_hi */
+				EMIT2(0x39, add_2reg(0xC0, dreg_hi, sreg_hi));
+				EMIT2(IA32_JNE, 2);
+			}
 			/* cmp dreg_lo,sreg_lo */
 			EMIT2(0x39, add_2reg(0xC0, dreg_lo, sreg_lo));
 
-- 
2.7.4


* [PATCH bpf-next 06/13] arm64: bpf: implement jitting of JMP32
@ 2018-12-19 22:44 ` Jiong Wang
From: Jiong Wang @ 2018-12-19 22:44 UTC (permalink / raw)
  To: ast, daniel; +Cc: netdev, oss-drivers, Jiong Wang, Zi Shen Lim

This patch implements code-gen for new JMP32 instructions on arm64.

Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Zi Shen Lim <zlim.lnx@gmail.com>

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
---
 arch/arm64/net/bpf_jit_comp.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index 1542df0..9ed73af 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -559,7 +559,7 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx,
 	case BPF_JMP | BPF_JSLT | BPF_X:
 	case BPF_JMP | BPF_JSGE | BPF_X:
 	case BPF_JMP | BPF_JSLE | BPF_X:
-		emit(A64_CMP(1, dst, src), ctx);
+		emit(A64_CMP(!!imm, dst, src), ctx);
 emit_cond_jmp:
 		jmp_offset = bpf2a64_offset(i + off, i, ctx);
 		check_imm19(jmp_offset);
@@ -614,9 +614,13 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx,
 	case BPF_JMP | BPF_JSLT | BPF_K:
 	case BPF_JMP | BPF_JSGE | BPF_K:
 	case BPF_JMP | BPF_JSLE | BPF_K:
-		emit_a64_mov_i(1, tmp, imm, ctx);
-		emit(A64_CMP(1, dst, tmp), ctx);
+	{
+		bool is_jmp32 = !!insn->src_reg;
+
+		emit_a64_mov_i(is_jmp32, tmp, imm, ctx);
+		emit(A64_CMP(is_jmp32, dst, tmp), ctx);
 		goto emit_cond_jmp;
+	}
 	case BPF_JMP | BPF_JSET | BPF_K:
 		emit_a64_mov_i(1, tmp, imm, ctx);
 		emit(A64_TST(1, dst, tmp), ctx);
-- 
2.7.4


* [PATCH bpf-next 07/13] arm: bpf: implement jitting of JMP32
@ 2018-12-19 22:44 ` Jiong Wang
From: Jiong Wang @ 2018-12-19 22:44 UTC (permalink / raw)
  To: ast, daniel; +Cc: netdev, oss-drivers, Jiong Wang, Shubham Bansal

This patch implements code-gen for new JMP32 instructions on arm.

Cc: Shubham Bansal <illusionist.neo@gmail.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
---
 arch/arm/net/bpf_jit_32.c | 23 +++++++++++++++++------
 1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/arch/arm/net/bpf_jit_32.c b/arch/arm/net/bpf_jit_32.c
index 25b3ee8..e9830a7 100644
--- a/arch/arm/net/bpf_jit_32.c
+++ b/arch/arm/net/bpf_jit_32.c
@@ -1083,7 +1083,8 @@ static inline void emit_ldx_r(const s8 dst[], const s8 src,
 
 /* Arithmatic Operation */
 static inline void emit_ar_r(const u8 rd, const u8 rt, const u8 rm,
-			     const u8 rn, struct jit_ctx *ctx, u8 op) {
+			     const u8 rn, struct jit_ctx *ctx, u8 op,
+			     bool is_jmp64) {
 	switch (op) {
 	case BPF_JSET:
 		emit(ARM_AND_R(ARM_IP, rt, rn), ctx);
@@ -1096,18 +1097,25 @@ static inline void emit_ar_r(const u8 rd, const u8 rt, const u8 rm,
 	case BPF_JGE:
 	case BPF_JLE:
 	case BPF_JLT:
-		emit(ARM_CMP_R(rd, rm), ctx);
-		_emit(ARM_COND_EQ, ARM_CMP_R(rt, rn), ctx);
+		if (is_jmp64) {
+			emit(ARM_CMP_R(rd, rm), ctx);
+			/* Only compare low halves if the high halves are equal. */
+			_emit(ARM_COND_EQ, ARM_CMP_R(rt, rn), ctx);
+		} else {
+			emit(ARM_CMP_R(rt, rn), ctx);
+		}
 		break;
 	case BPF_JSLE:
 	case BPF_JSGT:
 		emit(ARM_CMP_R(rn, rt), ctx);
-		emit(ARM_SBCS_R(ARM_IP, rm, rd), ctx);
+		if (is_jmp64)
+			emit(ARM_SBCS_R(ARM_IP, rm, rd), ctx);
 		break;
 	case BPF_JSLT:
 	case BPF_JSGE:
 		emit(ARM_CMP_R(rt, rn), ctx);
-		emit(ARM_SBCS_R(ARM_IP, rd, rm), ctx);
+		if (is_jmp64)
+			emit(ARM_SBCS_R(ARM_IP, rd, rm), ctx);
 		break;
 	}
 }
@@ -1326,6 +1334,7 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx)
 	const s8 *rd, *rs;
 	s8 rd_lo, rt, rm, rn;
 	s32 jmp_offset;
+	bool is_jmp64;
 
 #define check_imm(bits, imm) do {				\
 	if ((imm) >= (1 << ((bits) - 1)) ||			\
@@ -1615,6 +1624,7 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx)
 	case BPF_JMP | BPF_JLT | BPF_X:
 	case BPF_JMP | BPF_JSLT | BPF_X:
 	case BPF_JMP | BPF_JSLE | BPF_X:
+		is_jmp64 = !imm;
 		/* Setup source registers */
 		rm = arm_bpf_get_reg32(src_hi, tmp2[0], ctx);
 		rn = arm_bpf_get_reg32(src_lo, tmp2[1], ctx);
@@ -1641,6 +1651,7 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx)
 	case BPF_JMP | BPF_JLE | BPF_K:
 	case BPF_JMP | BPF_JSLT | BPF_K:
 	case BPF_JMP | BPF_JSLE | BPF_K:
+		is_jmp64 = !insn->src_reg;
 		if (off == 0)
 			break;
 		rm = tmp2[0];
@@ -1652,7 +1663,7 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx)
 		rd = arm_bpf_get_reg64(dst, tmp, ctx);
 
 		/* Check for the condition */
-		emit_ar_r(rd[0], rd[1], rm, rn, ctx, BPF_OP(code));
+		emit_ar_r(rd[0], rd[1], rm, rn, ctx, BPF_OP(code), is_jmp64);
 
 		/* Setup JUMP instruction */
 		jmp_offset = bpf2a32_offset(i+off, i, ctx);
-- 
2.7.4


* [PATCH bpf-next 08/13] ppc: bpf: implement jitting of JMP32
@ 2018-12-19 22:44 ` Jiong Wang
From: Jiong Wang @ 2018-12-19 22:44 UTC (permalink / raw)
  To: ast, daniel; +Cc: netdev, oss-drivers, Jiong Wang, Naveen N . Rao, Sandipan Das

This patch implements code-gen for new JMP32 instructions on ppc.

Cc: Naveen N. Rao <naveen.n.rao@linux.ibm.com>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
---
 arch/powerpc/net/bpf_jit_comp64.c | 50 +++++++++++++++++++++++++++++++--------
 1 file changed, 40 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
index 7ce57657..4a9d759 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -810,14 +810,20 @@ static int bpf_jit_build_body(struct bpf_prog *fp, u32 *image,
 			case BPF_JMP | BPF_JEQ | BPF_X:
 			case BPF_JMP | BPF_JNE | BPF_X:
 				/* unsigned comparison */
-				PPC_CMPLD(dst_reg, src_reg);
+				if (!!imm) /* jmp32 */
+					PPC_CMPLW(dst_reg, src_reg);
+				else
+					PPC_CMPLD(dst_reg, src_reg);
 				break;
 			case BPF_JMP | BPF_JSGT | BPF_X:
 			case BPF_JMP | BPF_JSLT | BPF_X:
 			case BPF_JMP | BPF_JSGE | BPF_X:
 			case BPF_JMP | BPF_JSLE | BPF_X:
 				/* signed comparison */
-				PPC_CMPD(dst_reg, src_reg);
+				if (!!imm) /* jmp32 */
+					PPC_CMPW(dst_reg, src_reg);
+				else
+					PPC_CMPD(dst_reg, src_reg);
 				break;
 			case BPF_JMP | BPF_JSET | BPF_X:
 				PPC_AND_DOT(b2p[TMP_REG_1], dst_reg, src_reg);
@@ -828,34 +834,58 @@ static int bpf_jit_build_body(struct bpf_prog *fp, u32 *image,
 			case BPF_JMP | BPF_JLT | BPF_K:
 			case BPF_JMP | BPF_JGE | BPF_K:
 			case BPF_JMP | BPF_JLE | BPF_K:
+			{
+				bool is_jmp32 = !!insn[i].src_reg;
+
 				/*
 				 * Need sign-extended load, so only positive
 				 * values can be used as imm in cmpldi
 				 */
-				if (imm >= 0 && imm < 32768)
-					PPC_CMPLDI(dst_reg, imm);
-				else {
+				if (imm >= 0 && imm < 32768) {
+					if (is_jmp32)
+						PPC_CMPLWI(dst_reg, imm);
+					else
+						PPC_CMPLDI(dst_reg, imm);
+				} else {
 					/* sign-extending load */
 					PPC_LI32(b2p[TMP_REG_1], imm);
 					/* ... but unsigned comparison */
-					PPC_CMPLD(dst_reg, b2p[TMP_REG_1]);
+					if (is_jmp32)
+						PPC_CMPLW(dst_reg,
+							  b2p[TMP_REG_1]);
+					else
+						PPC_CMPLD(dst_reg,
+							  b2p[TMP_REG_1]);
 				}
 				break;
+			}
 			case BPF_JMP | BPF_JSGT | BPF_K:
 			case BPF_JMP | BPF_JSLT | BPF_K:
 			case BPF_JMP | BPF_JSGE | BPF_K:
 			case BPF_JMP | BPF_JSLE | BPF_K:
+			{
+				bool is_jmp32 = !!insn[i].src_reg;
+
 				/*
 				 * signed comparison, so any 16-bit value
 				 * can be used in cmpdi
 				 */
-				if (imm >= -32768 && imm < 32768)
-					PPC_CMPDI(dst_reg, imm);
-				else {
+				if (imm >= -32768 && imm < 32768) {
+					if (is_jmp32)
+						PPC_CMPWI(dst_reg, imm);
+					else
+						PPC_CMPDI(dst_reg, imm);
+				} else {
 					PPC_LI32(b2p[TMP_REG_1], imm);
-					PPC_CMPD(dst_reg, b2p[TMP_REG_1]);
+					if (is_jmp32)
+						PPC_CMPW(dst_reg,
+							 b2p[TMP_REG_1]);
+					else
+						PPC_CMPD(dst_reg,
+							 b2p[TMP_REG_1]);
 				}
 				break;
+			}
 			case BPF_JMP | BPF_JSET | BPF_K:
 				/* andi does not sign-extend the immediate */
 				if (imm >= 0 && imm < 32768)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 18+ messages in thread
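Across the JIT patches in this series, the jmp32 flag rides in an otherwise-reserved field: the ppc hunks above test insn->src_reg for BPF_K compares and imm for BPF_X ones. A minimal decode sketch (the simplified struct and helper name are assumptions for illustration, not the kernel's uapi):

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of the encoding convention this series appears to use (a
 * simplified stand-in, not the kernel's struct bpf_insn): a JMP
 * instruction is the 32-bit variant when the otherwise-reserved field
 * is non-zero -- imm for BPF_X (reg-reg) compares, src_reg for BPF_K
 * (reg-imm) compares. */
struct bpf_insn_sk {
	uint8_t code;
	uint8_t dst_reg;
	uint8_t src_reg;
	int16_t off;
	int32_t imm;
};

static int is_jmp32(const struct bpf_insn_sk *insn)
{
	/* BPF_SRC(code): 0x08 means BPF_X (register src), 0x00 means BPF_K */
	if (insn->code & 0x08)
		return insn->imm != 0;
	return insn->src_reg != 0;
}
```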

* [PATH bpf-next 09/13] s390: bpf: implement jitting of JMP32
  2018-12-19 22:44 [PATH bpf-next 00/13] bpf: propose new jmp32 instructions Jiong Wang
                   ` (7 preceding siblings ...)
  2018-12-19 22:44 ` [PATH bpf-next 08/13] ppc: " Jiong Wang
@ 2018-12-19 22:44 ` Jiong Wang
  2018-12-20  6:47   ` Martin Schwidefsky
  2018-12-19 22:44 ` [PATH bpf-next 10/13] nfp: " Jiong Wang
                   ` (4 subsequent siblings)
  13 siblings, 1 reply; 18+ messages in thread
From: Jiong Wang @ 2018-12-19 22:44 UTC (permalink / raw)
  To: ast, daniel
  Cc: netdev, oss-drivers, Jiong Wang, Martin Schwidefsky, Heiko Carstens

This patch implements code-gen for the new JMP32 instructions on s390.

Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
---
 arch/s390/net/bpf_jit_comp.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
index 3ff758e..c7101e5 100644
--- a/arch/s390/net/bpf_jit_comp.c
+++ b/arch/s390/net/bpf_jit_comp.c
@@ -1186,21 +1186,25 @@ static noinline int bpf_jit_insn(struct bpf_jit *jit, struct bpf_prog *fp, int i
 		/* lgfi %w1,imm (load sign extend imm) */
 		EMIT6_IMM(0xc0010000, REG_W1, imm);
 		/* cgrj %dst,%w1,mask,off */
-		EMIT6_PCREL(0xec000000, 0x0064, dst_reg, REG_W1, i, off, mask);
+		EMIT6_PCREL(0xec000000, src_reg ? 0x0076 : 0x0064,
+			    dst_reg, REG_W1, i, off, mask);
 		break;
 branch_ku:
 		/* lgfi %w1,imm (load sign extend imm) */
 		EMIT6_IMM(0xc0010000, REG_W1, imm);
 		/* clgrj %dst,%w1,mask,off */
-		EMIT6_PCREL(0xec000000, 0x0065, dst_reg, REG_W1, i, off, mask);
+		EMIT6_PCREL(0xec000000, src_reg ? 0x0077 : 0x0065,
+			    dst_reg, REG_W1, i, off, mask);
 		break;
 branch_xs:
 		/* cgrj %dst,%src,mask,off */
-		EMIT6_PCREL(0xec000000, 0x0064, dst_reg, src_reg, i, off, mask);
+		EMIT6_PCREL(0xec000000, imm ? 0x0076 : 0x0064,
+			    dst_reg, src_reg, i, off, mask);
 		break;
 branch_xu:
 		/* clgrj %dst,%src,mask,off */
-		EMIT6_PCREL(0xec000000, 0x0065, dst_reg, src_reg, i, off, mask);
+		EMIT6_PCREL(0xec000000, imm ? 0x0077 : 0x0065,
+			    dst_reg, src_reg, i, off, mask);
 		break;
 branch_oc:
 		/* brc mask,jmp_off (branch instruction needs 4 bytes) */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 18+ messages in thread
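The s390 change above is pure opcode selection. A sketch using the constants from the patch (pairing each with its likely mnemonic is my reading of the opcodes, not stated in the patch): signed compare-and-branch is cgrj (0x0064) for 64 bit vs crj (0x0076) for 32 bit; unsigned is clgrj (0x0065) vs clrj (0x0077).

```c
#include <assert.h>
#include <stdint.h>

/* Opcode selection mirroring the branch_xs/branch_xu hunks above:
 * pick the 32-bit compare-and-branch opcode when the jmp32 flag is
 * set, the 64-bit one otherwise. */
static uint16_t cmp_branch_op(int is_jmp32, int is_unsigned)
{
	if (is_unsigned)
		return is_jmp32 ? 0x0077 : 0x0065;	/* clrj : clgrj */
	return is_jmp32 ? 0x0076 : 0x0064;		/* crj  : cgrj  */
}
```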

* [PATH bpf-next 10/13] nfp: bpf: implement jitting of JMP32
  2018-12-19 22:44 [PATH bpf-next 00/13] bpf: propose new jmp32 instructions Jiong Wang
                   ` (8 preceding siblings ...)
  2018-12-19 22:44 ` [PATH bpf-next 09/13] s390: " Jiong Wang
@ 2018-12-19 22:44 ` Jiong Wang
  2018-12-19 22:44 ` [PATH bpf-next 11/13] bpf: verifier support JMP32 Jiong Wang
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 18+ messages in thread
From: Jiong Wang @ 2018-12-19 22:44 UTC (permalink / raw)
  To: ast, daniel; +Cc: netdev, oss-drivers, Jiong Wang, Jakub Kicinski

This patch implements code-gen for the new JMP32 instructions on NFP.

Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
---
 drivers/net/ethernet/netronome/nfp/bpf/jit.c | 69 ++++++++++++++++++++--------
 1 file changed, 51 insertions(+), 18 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/bpf/jit.c b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
index 662cbc2..6afb11e 100644
--- a/drivers/net/ethernet/netronome/nfp/bpf/jit.c
+++ b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
@@ -1334,8 +1334,11 @@ wrp_test_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
 
 	wrp_test_reg_one(nfp_prog, insn->dst_reg * 2, alu_op,
 			 insn->src_reg * 2, br_mask, insn->off);
-	wrp_test_reg_one(nfp_prog, insn->dst_reg * 2 + 1, alu_op,
-			 insn->src_reg * 2 + 1, br_mask, insn->off);
+	/* JMP64 */
+	if (!insn->imm) {
+		wrp_test_reg_one(nfp_prog, insn->dst_reg * 2 + 1, alu_op,
+				 insn->src_reg * 2 + 1, br_mask, insn->off);
+	}
 
 	return 0;
 }
@@ -1390,13 +1393,16 @@ static int cmp_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 	else
 		emit_alu(nfp_prog, reg_none(), tmp_reg, alu_op, reg_a(reg));
 
-	tmp_reg = ur_load_imm_any(nfp_prog, imm >> 32, imm_b(nfp_prog));
-	if (!code->swap)
-		emit_alu(nfp_prog, reg_none(),
-			 reg_a(reg + 1), carry_op, tmp_reg);
-	else
-		emit_alu(nfp_prog, reg_none(),
-			 tmp_reg, carry_op, reg_a(reg + 1));
+	/* JMP64 */
+	if (!insn->src_reg) {
+		tmp_reg = ur_load_imm_any(nfp_prog, imm >> 32, imm_b(nfp_prog));
+		if (!code->swap)
+			emit_alu(nfp_prog, reg_none(),
+				 reg_a(reg + 1), carry_op, tmp_reg);
+		else
+			emit_alu(nfp_prog, reg_none(),
+				 tmp_reg, carry_op, reg_a(reg + 1));
+	}
 
 	emit_br(nfp_prog, code->br_mask, insn->off, 0);
 
@@ -1423,8 +1429,10 @@ static int cmp_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 	}
 
 	emit_alu(nfp_prog, reg_none(), reg_a(areg), ALU_OP_SUB, reg_b(breg));
-	emit_alu(nfp_prog, reg_none(),
-		 reg_a(areg + 1), ALU_OP_SUB_C, reg_b(breg + 1));
+	/* JMP64 */
+	if (!insn->imm)
+		emit_alu(nfp_prog, reg_none(),
+			 reg_a(areg + 1), ALU_OP_SUB_C, reg_b(breg + 1));
 	emit_br(nfp_prog, code->br_mask, insn->off, 0);
 
 	return 0;
@@ -3023,7 +3031,8 @@ static int jeq_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 {
 	const struct bpf_insn *insn = &meta->insn;
 	u64 imm = insn->imm; /* sign extend */
-	swreg or1, or2, tmp_reg;
+	swreg or1, or2, or1b, tmp_reg;
+	bool alu_flag_set = false;
 
 	or1 = reg_a(insn->dst_reg * 2);
 	or2 = reg_b(insn->dst_reg * 2 + 1);
@@ -3033,6 +3042,17 @@ static int jeq_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 		emit_alu(nfp_prog, imm_a(nfp_prog),
 			 reg_a(insn->dst_reg * 2), ALU_OP_XOR, tmp_reg);
 		or1 = imm_a(nfp_prog);
+		alu_flag_set = true;
+	}
+
+	/* JMP32 */
+	if (insn->src_reg) {
+		if (!alu_flag_set) {
+			or1b = reg_b(insn->dst_reg * 2);
+			emit_alu(nfp_prog, reg_none(), reg_none(), ALU_OP_NONE,
+				 or1b);
+		}
+		goto jeq_imm_br_out;
 	}
 
 	if (imm >> 32) {
@@ -3043,6 +3063,7 @@ static int jeq_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 	}
 
 	emit_alu(nfp_prog, reg_none(), or1, ALU_OP_OR, or2);
+jeq_imm_br_out:
 	emit_br(nfp_prog, BR_BEQ, insn->off, 0);
 
 	return 0;
@@ -3080,11 +3101,16 @@ static int jne_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 {
 	const struct bpf_insn *insn = &meta->insn;
 	u64 imm = insn->imm; /* sign extend */
+	bool is_jmp32 = !!insn->src_reg;
 	swreg tmp_reg;
 
 	if (!imm) {
-		emit_alu(nfp_prog, reg_none(), reg_a(insn->dst_reg * 2),
-			 ALU_OP_OR, reg_b(insn->dst_reg * 2 + 1));
+		if (is_jmp32)
+			emit_alu(nfp_prog, reg_none(), reg_none(), ALU_OP_NONE,
+				 reg_b(insn->dst_reg * 2));
+		else
+			emit_alu(nfp_prog, reg_none(), reg_a(insn->dst_reg * 2),
+				 ALU_OP_OR, reg_b(insn->dst_reg * 2 + 1));
 		emit_br(nfp_prog, BR_BNE, insn->off, 0);
 		return 0;
 	}
@@ -3094,6 +3120,9 @@ static int jne_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 		 reg_a(insn->dst_reg * 2), ALU_OP_XOR, tmp_reg);
 	emit_br(nfp_prog, BR_BNE, insn->off, 0);
 
+	if (is_jmp32)
+		return 0;
+
 	tmp_reg = ur_load_imm_any(nfp_prog, imm >> 32, imm_b(nfp_prog));
 	emit_alu(nfp_prog, reg_none(),
 		 reg_a(insn->dst_reg * 2 + 1), ALU_OP_XOR, tmp_reg);
@@ -3108,10 +3137,14 @@ static int jeq_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 
 	emit_alu(nfp_prog, imm_a(nfp_prog), reg_a(insn->dst_reg * 2),
 		 ALU_OP_XOR, reg_b(insn->src_reg * 2));
-	emit_alu(nfp_prog, imm_b(nfp_prog), reg_a(insn->dst_reg * 2 + 1),
-		 ALU_OP_XOR, reg_b(insn->src_reg * 2 + 1));
-	emit_alu(nfp_prog, reg_none(),
-		 imm_a(nfp_prog), ALU_OP_OR, imm_b(nfp_prog));
+	/* JMP64 */
+	if (!insn->imm) {
+		emit_alu(nfp_prog, imm_b(nfp_prog),
+			 reg_a(insn->dst_reg * 2 + 1), ALU_OP_XOR,
+			 reg_b(insn->src_reg * 2 + 1));
+		emit_alu(nfp_prog, reg_none(), imm_a(nfp_prog), ALU_OP_OR,
+			 imm_b(nfp_prog));
+	}
 	emit_br(nfp_prog, BR_BEQ, insn->off, 0);
 
 	return 0;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 18+ messages in thread
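The NFP hunks above all follow one pattern: each 64-bit eBPF register is held in a pair of 32-bit machine words, and a JMP32 compare lets the JIT drop the high-word ALU op entirely, as in jeq_reg where the XOR of the high words is emitted only for JMP64. A plain-C sketch of that idea (not NFP code):

```c
#include <assert.h>
#include <stdint.h>

/* Each 64-bit eBPF register is modeled as two 32-bit words. For
 * equality, JMP64 must combine both words; JMP32 only tests the low
 * word, saving one ALU operation. */
struct word_pair { uint32_t lo, hi; };

static int jeq_pair(struct word_pair a, struct word_pair b, int is_jmp32)
{
	uint32_t diff = a.lo ^ b.lo;

	if (!is_jmp32)
		diff |= a.hi ^ b.hi;	/* JMP64: high words matter too */
	return diff == 0;
}
```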

* [PATH bpf-next 11/13] bpf: verifier support JMP32
  2018-12-19 22:44 [PATH bpf-next 00/13] bpf: propose new jmp32 instructions Jiong Wang
                   ` (9 preceding siblings ...)
  2018-12-19 22:44 ` [PATH bpf-next 10/13] nfp: " Jiong Wang
@ 2018-12-19 22:44 ` Jiong Wang
  2018-12-19 23:54   ` Jakub Kicinski
  2018-12-19 22:44 ` [PATH bpf-next 12/13] bpf: unit tests for JMP32 Jiong Wang
                   ` (2 subsequent siblings)
  13 siblings, 1 reply; 18+ messages in thread
From: Jiong Wang @ 2018-12-19 22:44 UTC (permalink / raw)
  To: ast, daniel; +Cc: netdev, oss-drivers, Jiong Wang

The verifier performs some optimizations based on the extra information a
conditional jump instruction can offer, especially when the comparison is
between a constant and a register, in which case the value range of the
register can be refined.

  is_branch_taken/reg_set_min_max/reg_set_min_max_inv

are the three functions that need updating.

There are some other conditional-jump-related optimizations, but they
involve pointer type comparisons, for which JMP32 is never generated.
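The range-update change can be sketched in plain C for one arm, BPF_JGT, mirroring the reg_set_min_max() diff below (simplified struct and names, not the kernel's bpf_reg_state): on "if dst > val", the true branch learns umin >= val + 1 either way, but only a full 64-bit compare lets the false branch tighten umax, since a jmp32 compare says nothing about the high 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned 64-bit range tracked for a register. */
struct urange { uint64_t umin, umax; };

/* Update both branch states for "if dst > val goto ...". */
static void set_min_max_jgt(struct urange *true_r, struct urange *false_r,
			    uint64_t val, int is_jmp32)
{
	/* false branch: dst <= val holds for the full register only
	 * when the whole 64 bits were compared */
	if (!is_jmp32 && false_r->umax > val)
		false_r->umax = val;
	/* true branch: dst > val, so umin is at least val + 1 */
	if (true_r->umin < val + 1)
		true_r->umin = val + 1;
}
```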

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
---
 kernel/bpf/verifier.c | 178 ++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 137 insertions(+), 41 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index e0e77ff..3123c91 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3919,7 +3919,7 @@ static int is_branch_taken(struct bpf_reg_state *reg, u64 val, u8 opcode)
  */
 static void reg_set_min_max(struct bpf_reg_state *true_reg,
 			    struct bpf_reg_state *false_reg, u64 val,
-			    u8 opcode)
+			    u8 opcode, bool is_jmp32)
 {
 	/* If the dst_reg is a pointer, we can't learn anything about its
 	 * variable offset from the compare (unless src_reg were a pointer into
@@ -3935,45 +3935,69 @@ static void reg_set_min_max(struct bpf_reg_state *true_reg,
 		/* If this is false then we know nothing Jon Snow, but if it is
 		 * true then we know for sure.
 		 */
-		__mark_reg_known(true_reg, val);
+		if (is_jmp32)
+			true_reg->var_off = tnum_or(true_reg->var_off,
+						    tnum_const(val));
+		else
+			__mark_reg_known(true_reg, val);
 		break;
 	case BPF_JNE:
 		/* If this is true we know nothing Jon Snow, but if it is false
-		 * we know the value for sure;
+		 * we know the value for sure.
 		 */
-		__mark_reg_known(false_reg, val);
+		if (is_jmp32)
+			false_reg->var_off = tnum_or(false_reg->var_off,
+						     tnum_const(val));
+		else
+			__mark_reg_known(false_reg, val);
 		break;
 	case BPF_JGT:
-		false_reg->umax_value = min(false_reg->umax_value, val);
+		if (!is_jmp32)
+			false_reg->umax_value = min(false_reg->umax_value, val);
 		true_reg->umin_value = max(true_reg->umin_value, val + 1);
 		break;
 	case BPF_JSGT:
-		false_reg->smax_value = min_t(s64, false_reg->smax_value, val);
+		if (!is_jmp32)
+			false_reg->smax_value = min_t(s64,
+						      false_reg->smax_value,
+						      val);
 		true_reg->smin_value = max_t(s64, true_reg->smin_value, val + 1);
 		break;
 	case BPF_JLT:
 		false_reg->umin_value = max(false_reg->umin_value, val);
-		true_reg->umax_value = min(true_reg->umax_value, val - 1);
+		if (!is_jmp32)
+			true_reg->umax_value = min(true_reg->umax_value,
+						   val - 1);
 		break;
 	case BPF_JSLT:
 		false_reg->smin_value = max_t(s64, false_reg->smin_value, val);
-		true_reg->smax_value = min_t(s64, true_reg->smax_value, val - 1);
+		if (!is_jmp32)
+			true_reg->smax_value = min_t(s64, true_reg->smax_value,
+						     val - 1);
 		break;
 	case BPF_JGE:
-		false_reg->umax_value = min(false_reg->umax_value, val - 1);
+		if (!is_jmp32)
+			false_reg->umax_value = min(false_reg->umax_value,
+						    val - 1);
 		true_reg->umin_value = max(true_reg->umin_value, val);
 		break;
 	case BPF_JSGE:
-		false_reg->smax_value = min_t(s64, false_reg->smax_value, val - 1);
+		if (!is_jmp32)
+			false_reg->smax_value = min_t(s64,
+						      false_reg->smax_value,
+						      val - 1);
 		true_reg->smin_value = max_t(s64, true_reg->smin_value, val);
 		break;
 	case BPF_JLE:
 		false_reg->umin_value = max(false_reg->umin_value, val + 1);
-		true_reg->umax_value = min(true_reg->umax_value, val);
+		if (!is_jmp32)
+			true_reg->umax_value = min(true_reg->umax_value, val);
 		break;
 	case BPF_JSLE:
 		false_reg->smin_value = max_t(s64, false_reg->smin_value, val + 1);
-		true_reg->smax_value = min_t(s64, true_reg->smax_value, val);
+		if (!is_jmp32)
+			true_reg->smax_value = min_t(s64, true_reg->smax_value,
+						     val);
 		break;
 	default:
 		break;
@@ -3997,7 +4021,7 @@ static void reg_set_min_max(struct bpf_reg_state *true_reg,
  */
 static void reg_set_min_max_inv(struct bpf_reg_state *true_reg,
 				struct bpf_reg_state *false_reg, u64 val,
-				u8 opcode)
+				u8 opcode, bool is_jmp32)
 {
 	if (__is_pointer_value(false, false_reg))
 		return;
@@ -4007,45 +4031,69 @@ static void reg_set_min_max_inv(struct bpf_reg_state *true_reg,
 		/* If this is false then we know nothing Jon Snow, but if it is
 		 * true then we know for sure.
 		 */
-		__mark_reg_known(true_reg, val);
+		if (is_jmp32)
+			true_reg->var_off = tnum_or(true_reg->var_off,
+						    tnum_const(val));
+		else
+			__mark_reg_known(true_reg, val);
 		break;
 	case BPF_JNE:
 		/* If this is true we know nothing Jon Snow, but if it is false
-		 * we know the value for sure;
+		 * we know the value for sure.
 		 */
-		__mark_reg_known(false_reg, val);
+		if (is_jmp32)
+			false_reg->var_off = tnum_or(false_reg->var_off,
+						     tnum_const(val));
+		else
+			__mark_reg_known(false_reg, val);
 		break;
 	case BPF_JGT:
-		true_reg->umax_value = min(true_reg->umax_value, val - 1);
+		if (!is_jmp32)
+			true_reg->umax_value = min(true_reg->umax_value,
+						   val - 1);
 		false_reg->umin_value = max(false_reg->umin_value, val);
 		break;
 	case BPF_JSGT:
-		true_reg->smax_value = min_t(s64, true_reg->smax_value, val - 1);
+		if (!is_jmp32)
+			true_reg->smax_value = min_t(s64, true_reg->smax_value,
+						     val - 1);
 		false_reg->smin_value = max_t(s64, false_reg->smin_value, val);
 		break;
 	case BPF_JLT:
 		true_reg->umin_value = max(true_reg->umin_value, val + 1);
-		false_reg->umax_value = min(false_reg->umax_value, val);
+		if (!is_jmp32)
+			false_reg->umax_value = min(false_reg->umax_value, val);
 		break;
 	case BPF_JSLT:
 		true_reg->smin_value = max_t(s64, true_reg->smin_value, val + 1);
-		false_reg->smax_value = min_t(s64, false_reg->smax_value, val);
+		if (!is_jmp32)
+			false_reg->smax_value = min_t(s64,
+						      false_reg->smax_value,
+						      val);
 		break;
 	case BPF_JGE:
-		true_reg->umax_value = min(true_reg->umax_value, val);
+		if (!is_jmp32)
+			true_reg->umax_value = min(true_reg->umax_value, val);
 		false_reg->umin_value = max(false_reg->umin_value, val + 1);
 		break;
 	case BPF_JSGE:
-		true_reg->smax_value = min_t(s64, true_reg->smax_value, val);
+		if (!is_jmp32)
+			true_reg->smax_value = min_t(s64, true_reg->smax_value,
+						     val);
 		false_reg->smin_value = max_t(s64, false_reg->smin_value, val + 1);
 		break;
 	case BPF_JLE:
 		true_reg->umin_value = max(true_reg->umin_value, val);
-		false_reg->umax_value = min(false_reg->umax_value, val - 1);
+		if (!is_jmp32)
+			false_reg->umax_value = min(false_reg->umax_value,
+						    val - 1);
 		break;
 	case BPF_JSLE:
 		true_reg->smin_value = max_t(s64, true_reg->smin_value, val);
-		false_reg->smax_value = min_t(s64, false_reg->smax_value, val - 1);
+		if (!is_jmp32)
+			false_reg->smax_value = min_t(s64,
+						      false_reg->smax_value,
+						      val - 1);
 		break;
 	default:
 		break;
@@ -4276,6 +4324,7 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
 	struct bpf_reg_state *regs = this_branch->frame[this_branch->curframe]->regs;
 	struct bpf_reg_state *dst_reg, *other_branch_regs;
 	u8 opcode = BPF_OP(insn->code);
+	bool is_jmp32;
 	int err;
 
 	if (opcode > BPF_JSLE) {
@@ -4284,11 +4333,13 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
 	}
 
 	if (BPF_SRC(insn->code) == BPF_X) {
-		if (insn->imm != 0) {
+		if (insn->imm & ~BPF_JMP_SUBOP_MASK) {
 			verbose(env, "BPF_JMP uses reserved fields\n");
 			return -EINVAL;
 		}
 
+		is_jmp32 = insn->imm & BPF_JMP_SUBOP_32BIT;
+
 		/* check src1 operand */
 		err = check_reg_arg(env, insn->src_reg, SRC_OP);
 		if (err)
@@ -4300,10 +4351,12 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
 			return -EACCES;
 		}
 	} else {
-		if (insn->src_reg != BPF_REG_0) {
+		if (insn->src_reg & ~BPF_JMP_SUBOP_MASK) {
 			verbose(env, "BPF_JMP uses reserved fields\n");
 			return -EINVAL;
 		}
+
+		is_jmp32 = insn->src_reg & BPF_JMP_SUBOP_32BIT;
 	}
 
 	/* check src2 operand */
@@ -4314,8 +4367,29 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
 	dst_reg = &regs[insn->dst_reg];
 
 	if (BPF_SRC(insn->code) == BPF_K) {
-		int pred = is_branch_taken(dst_reg, insn->imm, opcode);
+		struct bpf_reg_state *reg = dst_reg;
+		struct bpf_reg_state reg_lo;
+		int pred;
+
+		if (is_jmp32) {
+			reg_lo = *dst_reg;
+			reg = &reg_lo;
+			coerce_reg_to_size(reg, 4);
+			/* If the s32 min/max have the same sign bit, the
+			 * min/max relationship still holds after copying the
+			 * value from the unsigned range in coerce_reg_to_size.
+			 */
+
+			if (reg->umax_value > (u32)S32_MAX &&
+			    reg->umin_value <= (u32)S32_MAX) {
+				reg->smin_value = S32_MIN;
+				reg->smax_value = S32_MAX;
+			}
+			reg->smin_value = (s64)(s32)reg->smin_value;
+			reg->smax_value = (s64)(s32)reg->smax_value;
+		}
 
+		pred = is_branch_taken(reg, insn->imm, opcode);
 		if (pred == 1) {
 			 /* only follow the goto, ignore fall-through */
 			*insn_idx += insn->off;
@@ -4341,30 +4415,51 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
 	 * comparable.
 	 */
 	if (BPF_SRC(insn->code) == BPF_X) {
+		struct bpf_reg_state *src_reg = &regs[insn->src_reg];
+		struct bpf_reg_state lo_reg0 = *dst_reg;
+		struct bpf_reg_state lo_reg1 = *src_reg;
+		struct bpf_reg_state *src_lo, *dst_lo;
+
+		dst_lo = &lo_reg0;
+		src_lo = &lo_reg1;
+		coerce_reg_to_size(dst_lo, 4);
+		coerce_reg_to_size(src_lo, 4);
+
 		if (dst_reg->type == SCALAR_VALUE &&
-		    regs[insn->src_reg].type == SCALAR_VALUE) {
-			if (tnum_is_const(regs[insn->src_reg].var_off))
+		    src_reg->type == SCALAR_VALUE) {
+			if (tnum_is_const(src_reg->var_off) ||
+			    (is_jmp32 && tnum_is_const(src_lo->var_off)))
 				reg_set_min_max(&other_branch_regs[insn->dst_reg],
-						dst_reg, regs[insn->src_reg].var_off.value,
-						opcode);
-			else if (tnum_is_const(dst_reg->var_off))
+						dst_reg,
+						is_jmp32
+						? src_lo->var_off.value
+						: src_reg->var_off.value,
+						opcode, is_jmp32);
+			else if (tnum_is_const(dst_reg->var_off) ||
+				 (is_jmp32 && tnum_is_const(dst_lo->var_off)))
 				reg_set_min_max_inv(&other_branch_regs[insn->src_reg],
-						    &regs[insn->src_reg],
-						    dst_reg->var_off.value, opcode);
-			else if (opcode == BPF_JEQ || opcode == BPF_JNE)
+						    src_reg,
+						    is_jmp32
+						    ? dst_lo->var_off.value
+						    : dst_reg->var_off.value,
+						    opcode, is_jmp32);
+			else if (!is_jmp32 &&
+				 (opcode == BPF_JEQ || opcode == BPF_JNE))
 				/* Comparing for equality, we can combine knowledge */
 				reg_combine_min_max(&other_branch_regs[insn->src_reg],
 						    &other_branch_regs[insn->dst_reg],
-						    &regs[insn->src_reg],
-						    &regs[insn->dst_reg], opcode);
+						    src_reg, dst_reg, opcode);
 		}
 	} else if (dst_reg->type == SCALAR_VALUE) {
 		reg_set_min_max(&other_branch_regs[insn->dst_reg],
-					dst_reg, insn->imm, opcode);
+					dst_reg, insn->imm, opcode, is_jmp32);
 	}
 
-	/* detect if R == 0 where R is returned from bpf_map_lookup_elem() */
-	if (BPF_SRC(insn->code) == BPF_K &&
+	/* detect if R == 0 where R is returned from bpf_map_lookup_elem().
+	 * NOTE: the optimizations below involve pointer comparisons,
+	 *       for which JMP32 is never generated.
+	 */
+	if (!is_jmp32 && BPF_SRC(insn->code) == BPF_K &&
 	    insn->imm == 0 && (opcode == BPF_JEQ || opcode == BPF_JNE) &&
 	    reg_type_may_be_null(dst_reg->type)) {
 		/* Mark all identical registers in each branch as either
@@ -4374,7 +4469,8 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
 				      opcode == BPF_JNE);
 		mark_ptr_or_null_regs(other_branch, insn->dst_reg,
 				      opcode == BPF_JEQ);
-	} else if (!try_match_pkt_pointers(insn, dst_reg, &regs[insn->src_reg],
+	} else if (!is_jmp32 &&
+		   !try_match_pkt_pointers(insn, dst_reg, &regs[insn->src_reg],
 					   this_branch, other_branch) &&
 		   is_pointer_value(env, insn->dst_reg)) {
 		verbose(env, "R%d pointer comparison prohibited\n",
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATH bpf-next 12/13] bpf: unit tests for JMP32
  2018-12-19 22:44 [PATH bpf-next 00/13] bpf: propose new jmp32 instructions Jiong Wang
                   ` (10 preceding siblings ...)
  2018-12-19 22:44 ` [PATH bpf-next 11/13] bpf: verifier support JMP32 Jiong Wang
@ 2018-12-19 22:44 ` Jiong Wang
  2018-12-19 22:44 ` [PATH bpf-next 13/13] selftests: bpf: makefile support sub-register code-gen test mode Jiong Wang
  2018-12-20  3:08 ` [PATH bpf-next 00/13] bpf: propose new jmp32 instructions Alexei Starovoitov
  13 siblings, 0 replies; 18+ messages in thread
From: Jiong Wang @ 2018-12-19 22:44 UTC (permalink / raw)
  To: ast, daniel; +Cc: netdev, oss-drivers, Jiong Wang

This patch adds unit tests for the new JMP32 instructions.
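The semantics these tests pin down can be sketched as plain-C helpers (hypothetical names, for illustration only): a JMP32 compare considers only the low 32 bits of each operand, treated as unsigned or signed as the opcode demands.

```c
#include <assert.h>
#include <stdint.h>

/* JMP32 BPF_JGT: unsigned compare on the low 32 bits of dst. */
static int jmp32_jgt_imm(uint64_t dst, int32_t imm)
{
	return (uint32_t)dst > (uint32_t)imm;
}

/* JMP32 BPF_JSLT: signed compare on the low 32 bits of dst. */
static int jmp32_jslt_imm(uint64_t dst, int32_t imm)
{
	return (int32_t)dst < imm;
}
```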

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
---
 include/linux/filter.h |  19 +++
 lib/test_bpf.c         | 321 ++++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 335 insertions(+), 5 deletions(-)

diff --git a/include/linux/filter.h b/include/linux/filter.h
index 537e9e7..94e1000 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -271,6 +271,15 @@ struct sock_reuseport;
 		.off   = OFF,					\
 		.imm   = 0 })
 
+/* Likewise, but the 32-bit variant. */
+#define BPF_JMP32_REG(OP, DST, SRC, OFF)			\
+	((struct bpf_insn) {					\
+		.code  = BPF_JMP | BPF_OP(OP) | BPF_X,		\
+		.dst_reg = DST,					\
+		.src_reg = SRC,					\
+		.off   = OFF,					\
+		.imm   = 1 })
+
 /* Conditional jumps against immediates, if (dst_reg 'op' imm32) goto pc + off16 */
 
 #define BPF_JMP_IMM(OP, DST, IMM, OFF)				\
@@ -281,6 +290,16 @@ struct sock_reuseport;
 		.off   = OFF,					\
 		.imm   = IMM })
 
+/* Likewise, but the 32-bit variant. */
+
+#define BPF_JMP32_IMM(OP, DST, IMM, OFF)			\
+	((struct bpf_insn) {					\
+		.code  = BPF_JMP | BPF_OP(OP) | BPF_K,		\
+		.dst_reg = DST,					\
+		.src_reg = 1,					\
+		.off   = OFF,					\
+		.imm   = IMM })
+
 /* Unconditional jumps, goto pc + off16 */
 
 #define BPF_JMP_A(OFF)						\
diff --git a/lib/test_bpf.c b/lib/test_bpf.c
index f3e5707..c17f08b 100644
--- a/lib/test_bpf.c
+++ b/lib/test_bpf.c
@@ -4447,6 +4447,21 @@ static struct bpf_test tests[] = {
 		{ },
 		{ { 0, 1 } },
 	},
+	/* BPF_JMP | BPF_JSLT | BPF_K (32-bit variant) */
+	{
+		"JMP32_JSLT_K: Signed jump: if (-2 < -1) return 1",
+		.u.insns_int = {
+			BPF_ALU32_IMM(BPF_MOV, R0, 0),
+			BPF_LD_IMM64(R1, 0x1fffffffeLL),
+			BPF_JMP32_IMM(BPF_JSLT, R1, -1, 1),
+			BPF_EXIT_INSN(),
+			BPF_ALU32_IMM(BPF_MOV, R0, 1),
+			BPF_EXIT_INSN(),
+		},
+		INTERNAL,
+		{ },
+		{ { 0, 1 } },
+	},
 	{
 		"JMP_JSLT_K: Signed jump: if (-1 < -1) return 0",
 		.u.insns_int = {
@@ -4476,6 +4491,21 @@ static struct bpf_test tests[] = {
 		{ },
 		{ { 0, 1 } },
 	},
+	/* BPF_JMP | BPF_JSGT | BPF_K (32-bit variant) */
+	{
+		"JMP32_JSGT_K: Signed jump: if (-1 > -2) return 1",
+		.u.insns_int = {
+			BPF_ALU32_IMM(BPF_MOV, R0, 0),
+			BPF_LD_IMM64(R1, 0xfffffff7ffffffffLL),
+			BPF_JMP32_IMM(BPF_JSGT, R1, -2, 1),
+			BPF_EXIT_INSN(),
+			BPF_ALU32_IMM(BPF_MOV, R0, 1),
+			BPF_EXIT_INSN(),
+		},
+		INTERNAL,
+		{ },
+		{ { 0, 1 } },
+	},
 	{
 		"JMP_JSGT_K: Signed jump: if (-1 > -1) return 0",
 		.u.insns_int = {
@@ -4505,6 +4535,21 @@ static struct bpf_test tests[] = {
 		{ },
 		{ { 0, 1 } },
 	},
+	/* BPF_JMP | BPF_JSLE | BPF_K (32-bit variant) */
+	{
+		"JMP32_JSLE_K: Signed jump: if (-2 <= -1) return 1",
+		.u.insns_int = {
+			BPF_ALU32_IMM(BPF_MOV, R0, 0),
+			BPF_LD_IMM64(R1, 0x7ffffffffffffffeLL),
+			BPF_JMP32_IMM(BPF_JSLE, R1, -1, 1),
+			BPF_EXIT_INSN(),
+			BPF_ALU32_IMM(BPF_MOV, R0, 1),
+			BPF_EXIT_INSN(),
+		},
+		INTERNAL,
+		{ },
+		{ { 0, 1 } },
+	},
 	{
 		"JMP_JSLE_K: Signed jump: if (-1 <= -1) return 1",
 		.u.insns_int = {
@@ -4572,6 +4617,21 @@ static struct bpf_test tests[] = {
 		{ },
 		{ { 0, 1 } },
 	},
+	/* BPF_JMP | BPF_JSGE | BPF_K (32-bit variant) */
+	{
+		"JMP32_JSGE_K: Signed jump: if (-1 >= -2) return 1",
+		.u.insns_int = {
+			BPF_ALU32_IMM(BPF_MOV, R0, 0),
+			BPF_LD_IMM64(R1, 0xfffffff7ffffffffLL),
+			BPF_JMP32_IMM(BPF_JSGE, R1, -2, 1),
+			BPF_EXIT_INSN(),
+			BPF_ALU32_IMM(BPF_MOV, R0, 1),
+			BPF_EXIT_INSN(),
+		},
+		INTERNAL,
+		{ },
+		{ { 0, 1 } },
+	},
 	{
 		"JMP_JSGE_K: Signed jump: if (-1 >= -1) return 1",
 		.u.insns_int = {
@@ -4639,6 +4699,21 @@ static struct bpf_test tests[] = {
 		{ },
 		{ { 0, 1 } },
 	},
+	/* BPF_JMP | BPF_JGT | BPF_K (32-bit variant) */
+	{
+		"JMP32_JGT_K: if (2 > 3) return 1",
+		.u.insns_int = {
+			BPF_ALU32_IMM(BPF_MOV, R0, 0),
+			BPF_LD_IMM64(R1, 0x100000002),
+			BPF_JMP32_IMM(BPF_JGT, R1, 3, 1),
+			BPF_EXIT_INSN(),
+			BPF_ALU32_IMM(BPF_MOV, R0, 1),
+			BPF_EXIT_INSN(),
+		},
+		INTERNAL,
+		{ },
+		{ { 0, 0 } },
+	},
 	{
 		"JMP_JGT_K: Unsigned jump: if (-1 > 1) return 1",
 		.u.insns_int = {
@@ -4668,8 +4743,23 @@ static struct bpf_test tests[] = {
 		{ },
 		{ { 0, 1 } },
 	},
+	/* BPF_JMP | BPF_JLT | BPF_K (32-bit variant) */
 	{
-		"JMP_JGT_K: Unsigned jump: if (1 < -1) return 1",
+		"JMP32_JLT_K: if (2 < 3) return 1",
+		.u.insns_int = {
+			BPF_ALU32_IMM(BPF_MOV, R0, 0),
+			BPF_LD_IMM64(R1, 0x100000002),
+			BPF_JMP32_IMM(BPF_JLT, R1, 3, 1),
+			BPF_EXIT_INSN(),
+			BPF_ALU32_IMM(BPF_MOV, R0, 1),
+			BPF_EXIT_INSN(),
+		},
+		INTERNAL,
+		{ },
+		{ { 0, 1 } },
+	},
+	{
+		"JMP_JLT_K: Unsigned jump: if (1 < -1) return 1",
 		.u.insns_int = {
 			BPF_ALU32_IMM(BPF_MOV, R0, 0),
 			BPF_LD_IMM64(R1, 1),
@@ -4697,6 +4787,21 @@ static struct bpf_test tests[] = {
 		{ },
 		{ { 0, 1 } },
 	},
+	/* BPF_JMP | BPF_JGE | BPF_K (32-bit variant) */
+	{
+		"JMP32_JGE_K: if (2 >= 3) return 1",
+		.u.insns_int = {
+			BPF_ALU32_IMM(BPF_MOV, R0, 0),
+			BPF_LD_IMM64(R1, 0x100000002),
+			BPF_JMP32_IMM(BPF_JGE, R1, 3, 1),
+			BPF_EXIT_INSN(),
+			BPF_ALU32_IMM(BPF_MOV, R0, 1),
+			BPF_EXIT_INSN(),
+		},
+		INTERNAL,
+		{ },
+		{ { 0, 0 } },
+	},
 	/* BPF_JMP | BPF_JLE | BPF_K */
 	{
 		"JMP_JLE_K: if (2 <= 3) return 1",
@@ -4712,6 +4817,21 @@ static struct bpf_test tests[] = {
 		{ },
 		{ { 0, 1 } },
 	},
+	/* BPF_JMP | BPF_JLE | BPF_K (32-bit variant) */
+	{
+		"JMP32_JLE_K: if (2 <= 3) return 1",
+		.u.insns_int = {
+			BPF_ALU32_IMM(BPF_MOV, R0, 0),
+			BPF_LD_IMM64(R1, 0x100000002),
+			BPF_JMP32_IMM(BPF_JLE, R1, 3, 1),
+			BPF_EXIT_INSN(),
+			BPF_ALU32_IMM(BPF_MOV, R0, 1),
+			BPF_EXIT_INSN(),
+		},
+		INTERNAL,
+		{ },
+		{ { 0, 1 } },
+	},
 	/* BPF_JMP | BPF_JGT | BPF_K jump backwards */
 	{
 		"JMP_JGT_K: if (3 > 2) return 1 (jump backwards)",
@@ -4787,6 +4907,21 @@ static struct bpf_test tests[] = {
 		{ },
 		{ { 0, 1 } },
 	},
+	/* BPF_JMP | BPF_JNE | BPF_K (32-bit variant) */
+	{
+		"JMP32_JNE_K: if (3 != 2) return 1",
+		.u.insns_int = {
+			BPF_ALU32_IMM(BPF_MOV, R0, 0),
+			BPF_LD_IMM64(R1, 0x100000003ULL),
+			BPF_JMP32_IMM(BPF_JNE, R1, 2, 1),
+			BPF_EXIT_INSN(),
+			BPF_ALU32_IMM(BPF_MOV, R0, 1),
+			BPF_EXIT_INSN(),
+		},
+		INTERNAL,
+		{ },
+		{ { 0, 1 } },
+	},
 	/* BPF_JMP | BPF_JEQ | BPF_K */
 	{
 		"JMP_JEQ_K: if (3 == 3) return 1",
@@ -4802,6 +4937,21 @@ static struct bpf_test tests[] = {
 		{ },
 		{ { 0, 1 } },
 	},
+	/* BPF_JMP | BPF_JEQ | BPF_K (32-bit variant) */
+	{
+		"JMP32_JEQ_K: if (3 == 3) return 1",
+		.u.insns_int = {
+			BPF_ALU32_IMM(BPF_MOV, R0, 0),
+			BPF_LD_IMM64(R1, 0x100000003ULL),
+			BPF_JMP32_IMM(BPF_JEQ, R1, 3, 1),
+			BPF_EXIT_INSN(),
+			BPF_ALU32_IMM(BPF_MOV, R0, 1),
+			BPF_EXIT_INSN(),
+		},
+		INTERNAL,
+		{ },
+		{ { 0, 1 } },
+	},
 	/* BPF_JMP | BPF_JSET | BPF_K */
 	{
 		"JMP_JSET_K: if (0x3 & 0x2) return 1",
@@ -4847,6 +4997,22 @@ static struct bpf_test tests[] = {
 		{ },
 		{ { 0, 1 } },
 	},
+	/* BPF_JMP | BPF_JSGT | BPF_X (32-bit variant) */
+	{
+		"JMP32_JSGT_X: Signed jump: if (-1 > -2) return 1",
+		.u.insns_int = {
+			BPF_ALU32_IMM(BPF_MOV, R0, 0),
+			BPF_LD_IMM64(R1, 0x7fffffffffffffffULL),
+			BPF_LD_IMM64(R2, -2),
+			BPF_JMP32_REG(BPF_JSGT, R1, R2, 1),
+			BPF_EXIT_INSN(),
+			BPF_ALU32_IMM(BPF_MOV, R0, 1),
+			BPF_EXIT_INSN(),
+		},
+		INTERNAL,
+		{ },
+		{ { 0, 1 } },
+	},
 	{
 		"JMP_JSGT_X: Signed jump: if (-1 > -1) return 0",
 		.u.insns_int = {
@@ -4878,13 +5044,30 @@ static struct bpf_test tests[] = {
 		{ },
 		{ { 0, 1 } },
 	},
+	/* BPF_JMP | BPF_JSLT | BPF_X */
 	{
-		"JMP_JSLT_X: Signed jump: if (-1 < -1) return 0",
+		"JMP_JSLT_X: Signed jump: if (-2 < -1) return 1",
 		.u.insns_int = {
-			BPF_ALU32_IMM(BPF_MOV, R0, 1),
+			BPF_ALU32_IMM(BPF_MOV, R0, 0),
 			BPF_LD_IMM64(R1, -1),
-			BPF_LD_IMM64(R2, -1),
-			BPF_JMP_REG(BPF_JSLT, R1, R2, 1),
+			BPF_LD_IMM64(R2, -2),
+			BPF_JMP_REG(BPF_JSLT, R2, R1, 1),
+			BPF_EXIT_INSN(),
+			BPF_ALU32_IMM(BPF_MOV, R0, 1),
+			BPF_EXIT_INSN(),
+		},
+		INTERNAL,
+		{ },
+		{ { 0, 1 } },
+	},
+	/* BPF_JMP | BPF_JSLT | BPF_X (32-bit variant) */
+	{
+		"JMP32_JSLT_X: Signed jump: if (-1 < -1) return 0",
+		.u.insns_int = {
+			BPF_ALU32_IMM(BPF_MOV, R0, 1),
+			BPF_LD_IMM64(R1, 0x1ffffffffULL),
+			BPF_LD_IMM64(R2, 0x2ffffffffULL),
+			BPF_JMP32_REG(BPF_JSLT, R1, R2, 1),
 			BPF_EXIT_INSN(),
 			BPF_ALU32_IMM(BPF_MOV, R0, 0),
 			BPF_EXIT_INSN(),
@@ -4909,6 +5092,22 @@ static struct bpf_test tests[] = {
 		{ },
 		{ { 0, 1 } },
 	},
+	/* BPF_JMP | BPF_JSGE | BPF_X (32-bit variant) */
+	{
+		"JMP32_JSGE_X: Signed jump: if (-1 >= -2) return 1",
+		.u.insns_int = {
+			BPF_ALU32_IMM(BPF_MOV, R0, 0),
+			BPF_LD_IMM64(R1, 0xfffffff7ffffffffULL),
+			BPF_LD_IMM64(R2, -2),
+			BPF_JMP32_REG(BPF_JSGE, R1, R2, 1),
+			BPF_EXIT_INSN(),
+			BPF_ALU32_IMM(BPF_MOV, R0, 1),
+			BPF_EXIT_INSN(),
+		},
+		INTERNAL,
+		{ },
+		{ { 0, 1 } },
+	},
 	{
 		"JMP_JSGE_X: Signed jump: if (-1 >= -1) return 1",
 		.u.insns_int = {
@@ -4940,6 +5139,22 @@ static struct bpf_test tests[] = {
 		{ },
 		{ { 0, 1 } },
 	},
+	/* BPF_JMP | BPF_JSLE | BPF_X (32-bit variant) */
+	{
+		"JMP32_JSLE_X: Signed jump: if (-2 <= -1) return 1",
+		.u.insns_int = {
+			BPF_ALU32_IMM(BPF_MOV, R0, 0),
+			BPF_LD_IMM64(R1, 0x1ffffffffULL),
+			BPF_LD_IMM64(R2, -2),
+			BPF_JMP32_REG(BPF_JSLE, R2, R1, 1),
+			BPF_EXIT_INSN(),
+			BPF_ALU32_IMM(BPF_MOV, R0, 1),
+			BPF_EXIT_INSN(),
+		},
+		INTERNAL,
+		{ },
+		{ { 0, 1 } },
+	},
 	{
 		"JMP_JSLE_X: Signed jump: if (-1 <= -1) return 1",
 		.u.insns_int = {
@@ -4971,6 +5186,22 @@ static struct bpf_test tests[] = {
 		{ },
 		{ { 0, 1 } },
 	},
+	/* BPF_JMP | BPF_JGT | BPF_X (32-bit variant) */
+	{
+		"JMP32_JGT_X: if (3 > 2) return 1",
+		.u.insns_int = {
+			BPF_ALU32_IMM(BPF_MOV, R0, 0),
+			BPF_LD_IMM64(R1, 0x7fffffff3ULL),
+			BPF_LD_IMM64(R2, 0xffffffff2ULL),
+			BPF_JMP32_REG(BPF_JGT, R1, R2, 1),
+			BPF_EXIT_INSN(),
+			BPF_ALU32_IMM(BPF_MOV, R0, 1),
+			BPF_EXIT_INSN(),
+		},
+		INTERNAL,
+		{ },
+		{ { 0, 1 } },
+	},
 	{
 		"JMP_JGT_X: Unsigned jump: if (-1 > 1) return 1",
 		.u.insns_int = {
@@ -5002,6 +5233,22 @@ static struct bpf_test tests[] = {
 		{ },
 		{ { 0, 1 } },
 	},
+	/* BPF_JMP | BPF_JLT | BPF_X (32-bit variant) */
+	{
+		"JMP32_JLT_X: if (2 < 3) return 1",
+		.u.insns_int = {
+			BPF_ALU32_IMM(BPF_MOV, R0, 0),
+			BPF_LD_IMM64(R1, 0x100000003ULL),
+			BPF_LD_IMM64(R2, 0x200000002ULL),
+			BPF_JMP32_REG(BPF_JLT, R2, R1, 1),
+			BPF_EXIT_INSN(),
+			BPF_ALU32_IMM(BPF_MOV, R0, 1),
+			BPF_EXIT_INSN(),
+		},
+		INTERNAL,
+		{ },
+		{ { 0, 1 } },
+	},
 	{
 		"JMP_JLT_X: Unsigned jump: if (1 < -1) return 1",
 		.u.insns_int = {
@@ -5033,6 +5280,22 @@ static struct bpf_test tests[] = {
 		{ },
 		{ { 0, 1 } },
 	},
+	/* BPF_JMP | BPF_JGE | BPF_X (32-bit variant) */
+	{
+		"JMP32_JGE_X: if (3 >= 2) return 1",
+		.u.insns_int = {
+			BPF_ALU32_IMM(BPF_MOV, R0, 0),
+			BPF_LD_IMM64(R1, 0x123456783ULL),
+			BPF_LD_IMM64(R2, 0xf23456782ULL),
+			BPF_JMP32_REG(BPF_JGE, R1, R2, 1),
+			BPF_EXIT_INSN(),
+			BPF_ALU32_IMM(BPF_MOV, R0, 1),
+			BPF_EXIT_INSN(),
+		},
+		INTERNAL,
+		{ },
+		{ { 0, 1 } },
+	},
 	{
 		"JMP_JGE_X: if (3 >= 3) return 1",
 		.u.insns_int = {
@@ -5064,6 +5327,22 @@ static struct bpf_test tests[] = {
 		{ },
 		{ { 0, 1 } },
 	},
+	/* BPF_JMP | BPF_JLE | BPF_X (32-bit variant) */
+	{
+		"JMP32_JLE_X: if (2 <= 3) return 1",
+		.u.insns_int = {
+			BPF_ALU32_IMM(BPF_MOV, R0, 0),
+			BPF_LD_IMM64(R1, 0x7fffffff3ULL),
+			BPF_LD_IMM64(R2, 0x8fffffff2ULL),
+			BPF_JMP32_REG(BPF_JLE, R2, R1, 1),
+			BPF_EXIT_INSN(),
+			BPF_ALU32_IMM(BPF_MOV, R0, 1),
+			BPF_EXIT_INSN(),
+		},
+		INTERNAL,
+		{ },
+		{ { 0, 1 } },
+	},
 	{
 		"JMP_JLE_X: if (3 <= 3) return 1",
 		.u.insns_int = {
@@ -5184,6 +5463,22 @@ static struct bpf_test tests[] = {
 		{ },
 		{ { 0, 1 } },
 	},
+	/* BPF_JMP | BPF_JNE | BPF_X (32-bit variant) */
+	{
+		"JMP32_JNE_X: if (3 != 2) return 1",
+		.u.insns_int = {
+			BPF_ALU32_IMM(BPF_MOV, R0, 0),
+			BPF_LD_IMM64(R1, 0x10000003ULL),
+			BPF_LD_IMM64(R2, 0x10000002ULL),
+			BPF_JMP32_REG(BPF_JNE, R1, R2, 1),
+			BPF_EXIT_INSN(),
+			BPF_ALU32_IMM(BPF_MOV, R0, 1),
+			BPF_EXIT_INSN(),
+		},
+		INTERNAL,
+		{ },
+		{ { 0, 1 } },
+	},
 	/* BPF_JMP | BPF_JEQ | BPF_X */
 	{
 		"JMP_JEQ_X: if (3 == 3) return 1",
@@ -5200,6 +5495,22 @@ static struct bpf_test tests[] = {
 		{ },
 		{ { 0, 1 } },
 	},
+	/* BPF_JMP | BPF_JEQ | BPF_X (32-bit variant) */
+	{
+		"JMP32_JEQ_X: if (3 == 3) return 1",
+		.u.insns_int = {
+			BPF_ALU32_IMM(BPF_MOV, R0, 0),
+			BPF_LD_IMM64(R1, 0x100000003ULL),
+			BPF_LD_IMM64(R2, 0x200000003ULL),
+			BPF_JMP32_REG(BPF_JEQ, R1, R2, 1),
+			BPF_EXIT_INSN(),
+			BPF_ALU32_IMM(BPF_MOV, R0, 1),
+			BPF_EXIT_INSN(),
+		},
+		INTERNAL,
+		{ },
+		{ { 0, 1 } },
+	},
 	/* BPF_JMP | BPF_JSET | BPF_X */
 	{
 		"JMP_JSET_X: if (0x3 & 0x2) return 1",
-- 
2.7.4
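[Editorial note: every test in the hunk above follows the same pattern — load 64-bit immediates whose upper halves differ, then check that the JMP32 variant looks only at the low 32 bits. That intended semantics can be sketched in plain C; the helpers below are illustrative models, not kernel code.]

```c
#include <assert.h>
#include <stdint.h>

/* Model of JMP32 comparison semantics: only the low 32 bits of each
 * operand participate, with signed ops reinterpreting those 32 bits. */

static int jmp32_jeq(uint64_t dst, uint64_t src)
{
	return (uint32_t)dst == (uint32_t)src;
}

static int jmp32_jgt(uint64_t dst, uint64_t src)
{
	return (uint32_t)dst > (uint32_t)src;	/* unsigned 32-bit view */
}

static int jmp32_jsge(uint64_t dst, uint64_t src)
{
	return (int32_t)(uint32_t)dst >= (int32_t)(uint32_t)src;	/* signed */
}
```

For instance, the JMP32_JEQ_X test loads 0x100000003 and 0x200000003: a 64-bit JEQ would see them as unequal, while the 32-bit view compares 3 against 3.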

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATH bpf-next 13/13] selftests: bpf: makefile support sub-register code-gen test mode
  2018-12-19 22:44 [PATH bpf-next 00/13] bpf: propose new jmp32 instructions Jiong Wang
                   ` (11 preceding siblings ...)
  2018-12-19 22:44 ` [PATH bpf-next 12/13] bpf: unit tests for JMP32 Jiong Wang
@ 2018-12-19 22:44 ` Jiong Wang
  2018-12-20  3:08 ` [PATH bpf-next 00/13] bpf: propose new jmp32 instructions Alexei Starovoitov
  13 siblings, 0 replies; 18+ messages in thread
From: Jiong Wang @ 2018-12-19 22:44 UTC (permalink / raw)
  To: ast, daniel; +Cc: netdev, oss-drivers, Jiong Wang

Add a new BPF_SELFTEST_32BIT variable to enable sub-register code-gen test
mode. For example:

  make BPF_SELFTEST_32BIT=1 -C tools/testing/selftests/bpf run_tests

will compile all BPF C programs with sub-register code-gen enabled, so
ALU32 and JMP32 instructions will be used.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
---
 tools/testing/selftests/bpf/Makefile | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 73aa6d8..70bccf6 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -146,6 +146,10 @@ endif
 endif
 endif
 
+ifneq ($(BPF_SELFTEST_32BIT),)
+	LLC_FLAGS += -mattr=alu32
+endif
+
 # Have one program compiled without "-target bpf" to test whether libbpf loads
 # it successfully
 $(OUTPUT)/test_xdp.o: test_xdp.c
-- 
2.7.4


* Re: [PATH bpf-next 11/13] bpf: verifier support JMP32
  2018-12-19 22:44 ` [PATH bpf-next 11/13] bpf: verifier support JMP32 Jiong Wang
@ 2018-12-19 23:54   ` Jakub Kicinski
  2019-01-08 15:23     ` Jiong Wang
  0 siblings, 1 reply; 18+ messages in thread
From: Jakub Kicinski @ 2018-12-19 23:54 UTC (permalink / raw)
  To: Jiong Wang; +Cc: ast, daniel, netdev, oss-drivers

On Wed, 19 Dec 2018 17:44:18 -0500, Jiong Wang wrote:
> Verifier is doing some runtime optimizations based on the extra info
> conditional jump instruction could offer, especially when the comparison
> is between constant and register for which case the value range of the
> register could be improved.
> 
>   is_branch_taken/reg_set_min_max/reg_set_min_max_inv
> 
> are the three functions that needs updating.
> 
> There are some other conditional jump related optimizations but they
> are with pointer types comparison which JMP32 won't be generated for.
> 
> Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
> ---
>  kernel/bpf/verifier.c | 178 ++++++++++++++++++++++++++++++++++++++------------
>  1 file changed, 137 insertions(+), 41 deletions(-)
> 
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index e0e77ff..3123c91 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -3919,7 +3919,7 @@ static int is_branch_taken(struct bpf_reg_state *reg, u64 val, u8 opcode)
>   */
>  static void reg_set_min_max(struct bpf_reg_state *true_reg,
>  			    struct bpf_reg_state *false_reg, u64 val,
> -			    u8 opcode)
> +			    u8 opcode, bool is_jmp32)
>  {
>  	/* If the dst_reg is a pointer, we can't learn anything about its
>  	 * variable offset from the compare (unless src_reg were a pointer into
> @@ -3935,45 +3935,69 @@ static void reg_set_min_max(struct bpf_reg_state *true_reg,
>  		/* If this is false then we know nothing Jon Snow, but if it is
>  		 * true then we know for sure.
>  		 */
> -		__mark_reg_known(true_reg, val);
> +		if (is_jmp32)
> +			true_reg->var_off = tnum_or(true_reg->var_off,
> +						    tnum_const(val));

These tnum updates look strange, if the jump is 32bit we know the
bottom bits.  So:

	tnum.m &= GENMASK(63, 32);
	tnum.v = upper_32_bits(tnum.v) | lower_32_bits(val);

> +		else
> +			__mark_reg_known(true_reg, val);
>  		break;
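[Editorial note: Jakub's suggested update can be written out as a self-contained sketch. The tnum layout follows kernel/bpf/tnum.c, but `tnum_set_low32` is an illustrative name, not an existing kernel helper.]

```c
#include <assert.h>
#include <stdint.h>

/* A tnum tracks partial knowledge of a register value: a bit set in
 * .mask is unknown; where the mask bit is clear, .value holds the
 * known bit. */
struct tnum {
	uint64_t value;
	uint64_t mask;
};

/* After a taken 32-bit JEQ against constant 'val', the low 32 bits
 * become fully known while the upper 32 bits keep whatever state they
 * had before -- exactly the GENMASK(63, 32) / lower_32_bits() update
 * suggested above. */
static struct tnum tnum_set_low32(struct tnum t, uint32_t val)
{
	t.mask &= 0xffffffff00000000ULL;	/* low half: no unknown bits */
	t.value = (t.value & 0xffffffff00000000ULL) | val;
	return t;
}
```

Starting from a completely unknown register (mask = ~0), a taken 32-bit `jeq r1, 5` pins the low half to 5 but leaves the upper half unknown, whereas the `tnum_or` in the patch would incorrectly claim knowledge it doesn't have.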


* Re: [PATH bpf-next 00/13] bpf: propose new jmp32 instructions
  2018-12-19 22:44 [PATH bpf-next 00/13] bpf: propose new jmp32 instructions Jiong Wang
                   ` (12 preceding siblings ...)
  2018-12-19 22:44 ` [PATH bpf-next 13/13] selftests: bpf: makefile support sub-register code-gen test mode Jiong Wang
@ 2018-12-20  3:08 ` Alexei Starovoitov
  13 siblings, 0 replies; 18+ messages in thread
From: Alexei Starovoitov @ 2018-12-20  3:08 UTC (permalink / raw)
  To: Jiong Wang
  Cc: ast, daniel, netdev, oss-drivers, David S . Miller, Paul Burton,
	Wang YanQing, Zi Shen Lim, Shubham Bansal, Naveen N . Rao,
	Sandipan Das, Martin Schwidefsky, Heiko Carstens

On Wed, Dec 19, 2018 at 05:44:07PM -0500, Jiong Wang wrote:
> Current eBPF ISA has 32-bit sub-register and has defined a set of ALU32
> instructions.
> 
> However, there is no JMP32 instructions, the consequence is code-gen for
> 32-bit sub-registers is not efficient. For example, explicit sign-extension
> from 32-bit to 64-bit is needed for signed comparison.
> 
> Adding JMP32 instruction therefore could complete eBPF ISA on 32-bit
> sub-register support. This also match those JMP32 instructions in most JIT
> backends, for example x64-64 and AArch64. These new eBPF JMP32 instructions
> could have one-to-one map on them.
> 
> A few verifier ALU32 related bugs has been fixed recently, and JMP32
> introduced by this set further improves BPF sub-register ecosystem. Once
> this is landed, BPF programs using 32-bit sub-register ISA could get
> reasonably good support from verifier and JIT compilers. Users then could
> compare the runtime efficiency of one BPF program under both modes, and
> could use the one benchmarked as better. One good thing is JMP32 is making 
> 32-bit JIT more efficient, because it only has 32-bit use, no def, so
> unlike ALU32, no need to clear high bits. Hence, even without data-flow
> analysis, JMP32 is making better code-gen then JMP64. More benchmark
> results are listed below in this cover letter.
> 
>  - Encoding
> 
>    Ideally, JMP32 could use new CLASS BPF_JMP32, just like BPF_ALU and
>    BPF_ALU32. But we only has one class number 0x06 unused. I am not sure
>    if we want to keep it for other extension purpose. For example restore
>    it as BPF_MISC which could then redefine the interpretation of all the 
>    remaining bits in bis[7:1];
> 
>    So, I am following the coding style used by BPF_PSEUDO_CALL, that is to
>    use reserved bits under BPF_JMP. When BPF_SRC(code) == BPF_X, the
>    encoding is 0x1 at insn->imm. When BPF_SRC(code) == BPF_K, the encoding
>    is 0x1 at insn->src_reg. All other bits in imm and src_reg are still
>    reserved and should be zeroed.

this choice of encoding penalizes the interpreter a lot, since every jmp
(both 64 and 32-bit) becomes multiple conditional branches.
I suspect interpreter performance suffers a lot.

We can still use such an encoding for uapi and recode to 256 opcodes,
but why jump through the hoops when class 6 is still unused?
Just use it for BPF_JMP32.
It will also help avoid issues with JITs that rely on the opcode
to do conversion.
For example, if we don't convert all JITs to the proposed encoding,
the unconverted JITs will generate a 64-bit jmp for a 32-bit one,
since they don't check insn->imm or src_reg.
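[Editorial note: the encoding trade-off can be illustrated with a small C sketch. This is a simplification, not the kernel's actual decoder; `BPF_JMP32` here stands for the hypothetical class value 0x06 being discussed.]

```c
#include <assert.h>
#include <stdint.h>

#define BPF_CLASS(code)	((code) & 0x07)
#define BPF_JMP		0x05
#define BPF_JMP32	0x06	/* the one remaining class number */

struct bpf_insn {
	uint8_t code;
	uint8_t dst_reg:4, src_reg:4;
	int16_t off;
	int32_t imm;
};

/* Reserved-bit encoding: every BPF_JMP must also inspect imm/src_reg
 * before the operand width is known, i.e. extra conditional branches
 * on the interpreter's hot path. */
static int is_jmp32_reserved_bit(const struct bpf_insn *insn)
{
	return BPF_CLASS(insn->code) == BPF_JMP &&
	       ((insn->imm & 0x1) || (insn->src_reg & 0x1));
}

/* Dedicated-class encoding: the width is part of the opcode itself,
 * so a dispatch table indexed by insn->code needs no extra test. */
static int is_jmp32_class(const struct bpf_insn *insn)
{
	return BPF_CLASS(insn->code) == BPF_JMP32;
}
```

The class-based form also makes unconverted JITs fail loudly on an unknown opcode, instead of silently emitting a 64-bit jump for a 32-bit one.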


* Re: [PATH bpf-next 09/13] s390: bpf: implement jitting of JMP32
  2018-12-19 22:44 ` [PATH bpf-next 09/13] s390: " Jiong Wang
@ 2018-12-20  6:47   ` Martin Schwidefsky
  0 siblings, 0 replies; 18+ messages in thread
From: Martin Schwidefsky @ 2018-12-20  6:47 UTC (permalink / raw)
  To: Jiong Wang; +Cc: ast, daniel, netdev, oss-drivers, Heiko Carstens

On Wed, 19 Dec 2018 17:44:16 -0500
Jiong Wang <jiong.wang@netronome.com> wrote:

> This patch implements code-gen for new JMP32 instructions on s390.
> 
> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
> Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
> Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
> ---
>  arch/s390/net/bpf_jit_comp.c | 12 ++++++++----
>  1 file changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
> index 3ff758e..c7101e5 100644
> --- a/arch/s390/net/bpf_jit_comp.c
> +++ b/arch/s390/net/bpf_jit_comp.c
> @@ -1186,21 +1186,25 @@ static noinline int bpf_jit_insn(struct bpf_jit *jit, struct bpf_prog *fp, int i
>  		/* lgfi %w1,imm (load sign extend imm) */
>  		EMIT6_IMM(0xc0010000, REG_W1, imm);
>  		/* cgrj %dst,%w1,mask,off */

Please correct the comment about the instruction as well. Depending on the
BPF_JMP_SUBOP_32BIT bit in src_reg, the instruction is either cgrj or crj.

> -		EMIT6_PCREL(0xec000000, 0x0064, dst_reg, REG_W1, i, off, mask);
> +		EMIT6_PCREL(0xec000000, src_reg ? 0x0076 : 0x0064,
> +			    dst_reg, REG_W1, i, off, mask);
>  		break;
>  branch_ku:
>  		/* lgfi %w1,imm (load sign extend imm) */
>  		EMIT6_IMM(0xc0010000, REG_W1, imm);
>  		/* clgrj %dst,%w1,mask,off */

Same here, dependent on src_reg either clgrj or clrj.

> -		EMIT6_PCREL(0xec000000, 0x0065, dst_reg, REG_W1, i, off, mask);
> +		EMIT6_PCREL(0xec000000, src_reg ? 0x0077 : 0x0065,
> +			    dst_reg, REG_W1, i, off, mask);
>  		break;
>  branch_xs:
>  		/* cgrj %dst,%src,mask,off */

Same same, dependent on imm either cgrj or crj.

> -		EMIT6_PCREL(0xec000000, 0x0064, dst_reg, src_reg, i, off, mask);
> +		EMIT6_PCREL(0xec000000, imm ? 0x0076 : 0x0064,
> +			    dst_reg, src_reg, i, off, mask);
>  		break;
>  branch_xu:
>  		/* clgrj %dst,%src,mask,off */

And again, dependent on imm either clgrj or clrj.

> -		EMIT6_PCREL(0xec000000, 0x0065, dst_reg, src_reg, i, off, mask);
> +		EMIT6_PCREL(0xec000000, imm ? 0x0077 : 0x0065,
> +			    dst_reg, src_reg, i, off, mask);
>  		break;
>  branch_oc:
>  		/* brc mask,jmp_off (branch instruction needs 4 bytes) */


-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


* Re: [PATH bpf-next 11/13] bpf: verifier support JMP32
  2018-12-19 23:54   ` Jakub Kicinski
@ 2019-01-08 15:23     ` Jiong Wang
  0 siblings, 0 replies; 18+ messages in thread
From: Jiong Wang @ 2019-01-08 15:23 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: Jiong Wang, ast, daniel, netdev, oss-drivers


Jakub Kicinski writes:

> On Wed, 19 Dec 2018 17:44:18 -0500, Jiong Wang wrote:
>> Verifier is doing some runtime optimizations based on the extra info
>> conditional jump instruction could offer, especially when the comparison
>> is between constant and register for which case the value range of the
>> register could be improved.
>> 
>>   is_branch_taken/reg_set_min_max/reg_set_min_max_inv
>> 
>> are the three functions that needs updating.
>> 
>> There are some other conditional jump related optimizations but they
>> are with pointer types comparison which JMP32 won't be generated for.
>> 
>> Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
>> ---
>>  kernel/bpf/verifier.c | 178 ++++++++++++++++++++++++++++++++++++++------------
>>  1 file changed, 137 insertions(+), 41 deletions(-)
>> 
>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>> index e0e77ff..3123c91 100644
>> --- a/kernel/bpf/verifier.c
>> +++ b/kernel/bpf/verifier.c
>> @@ -3919,7 +3919,7 @@ static int is_branch_taken(struct bpf_reg_state *reg, u64 val, u8 opcode)
>>   */
>>  static void reg_set_min_max(struct bpf_reg_state *true_reg,
>>  			    struct bpf_reg_state *false_reg, u64 val,
>> -			    u8 opcode)
>> +			    u8 opcode, bool is_jmp32)
>>  {
>>  	/* If the dst_reg is a pointer, we can't learn anything about its
>>  	 * variable offset from the compare (unless src_reg were a pointer into
>> @@ -3935,45 +3935,69 @@ static void reg_set_min_max(struct bpf_reg_state *true_reg,
>>  		/* If this is false then we know nothing Jon Snow, but if it is
>>  		 * true then we know for sure.
>>  		 */
>> -		__mark_reg_known(true_reg, val);
>> +		if (is_jmp32)
>> +			true_reg->var_off = tnum_or(true_reg->var_off,
>> +						    tnum_const(val));
>
> These tnum updates look strange, if the jump is 32bit we know the
> bottom bits.  So:
>
> 	tnum.m &= GENMASK(63, 32);
> 	tnum.v = upper_32_bits(tnum.v) | lower_32_bits(val);

Ack.

By the way, I also fixed range deduction for some other operations, which
eventually fixed the only regression on bpf_flow.o mentioned in the cover
letter. Now the processed insn number is in general a consistent win
against either alu32 or default.

Processed insn number
===
LLVM code-gen option   default  alu32  alu32/jmp32  change Vs.  change Vs.
                                                    alu32       default
bpf_lb-DLB_L3.o:       1579     1281   1295         +1.09%      -17.99%
bpf_lb-DLB_L4.o:       2045     1663   1556         -6.43%      -23.91%
bpf_lb-DUNKNOWN.o:     606      513    501          -2.34%      -17.33%
bpf_lxc.o:             85381    103218 84236        -18.39%     -1.34%
bpf_netdev.o:          5246     5809   5200         -10.48%     -0.08%
bpf_overlay.o:         2443     2705   2456         -9.02%      -0.53%

Will include all fixes in v2.

Regards,
Jiong


end of thread, other threads:[~2019-01-08 15:23 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-12-19 22:44 [PATH bpf-next 00/13] bpf: propose new jmp32 instructions Jiong Wang
2018-12-19 22:44 ` [PATH bpf-next 01/13] bpf: encoding description and macros for JMP32 Jiong Wang
2018-12-19 22:44 ` [PATH bpf-next 02/13] bpf: interpreter support " Jiong Wang
2018-12-19 22:44 ` [PATH bpf-next 03/13] bpf: JIT blinds support JMP32 Jiong Wang
2018-12-19 22:44 ` [PATH bpf-next 04/13] x86_64: bpf: implement jitting of JMP32 Jiong Wang
2018-12-19 22:44 ` [PATH bpf-next 05/13] x32: " Jiong Wang
2018-12-19 22:44 ` [PATH bpf-next 06/13] arm64: " Jiong Wang
2018-12-19 22:44 ` [PATH bpf-next 07/13] arm: " Jiong Wang
2018-12-19 22:44 ` [PATH bpf-next 08/13] ppc: " Jiong Wang
2018-12-19 22:44 ` [PATH bpf-next 09/13] s390: " Jiong Wang
2018-12-20  6:47   ` Martin Schwidefsky
2018-12-19 22:44 ` [PATH bpf-next 10/13] nfp: " Jiong Wang
2018-12-19 22:44 ` [PATH bpf-next 11/13] bpf: verifier support JMP32 Jiong Wang
2018-12-19 23:54   ` Jakub Kicinski
2019-01-08 15:23     ` Jiong Wang
2018-12-19 22:44 ` [PATH bpf-next 12/13] bpf: unit tests for JMP32 Jiong Wang
2018-12-19 22:44 ` [PATH bpf-next 13/13] selftests: bpf: makefile support sub-register code-gen test mode Jiong Wang
2018-12-20  3:08 ` [PATH bpf-next 00/13] bpf: propose new jmp32 instructions Alexei Starovoitov
