* [PATCH net-next v2 0/5] BPF updates
@ 2014-04-24  6:45 Daniel Borkmann
  2014-04-24  6:45 ` [PATCH net-next v2 1/5] net: filter: simplify label names from jump-table Daniel Borkmann
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Daniel Borkmann @ 2014-04-24  6:45 UTC (permalink / raw)
  To: davem; +Cc: ast, netdev

I still had these cleanups in my queue from before the merge window.
The set is against the net-next tree, but with 83d5b7ef99 ("net: filter:
initialize A and X registers") applied on top of it, so a merge
of net into net-next would be required *before* applying this set.

The main objective of these updates is to make the code a bit more
readable and comprehensible, and to avoid one additional instruction
in the interpreter's fast path.

Tested with Alexei's BPF test suite and seccomp test suite, no
issues found.

Thanks!

v1->v2:
 - Only changed patch 5 as per Alexei's suggestion
 - Rest is the same

Daniel Borkmann (5):
  net: filter: simplify label names from jump-table
  net: filter: misc/various cleanups
  net: filter: get rid of sock_fprog_kern
  net: filter: make register namings more comprehensible
  net: filter: optimize BPF migration for ARG1/CTX handling

 include/linux/filter.h |  60 ++++--
 net/core/filter.c      | 568 ++++++++++++++++++++++++-------------------------
 net/core/sock_diag.c   |   4 +-
 3 files changed, 329 insertions(+), 303 deletions(-)

-- 
1.7.11.7


* [PATCH net-next v2 1/5] net: filter: simplify label names from jump-table
  2014-04-24  6:45 [PATCH net-next v2 0/5] BPF updates Daniel Borkmann
@ 2014-04-24  6:45 ` Daniel Borkmann
  2014-04-24  6:45 ` [PATCH net-next v2 2/5] net: filter: misc/various cleanups Daniel Borkmann
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 13+ messages in thread
From: Daniel Borkmann @ 2014-04-24  6:45 UTC (permalink / raw)
  To: davem; +Cc: ast, netdev

This patch simplifies label naming for the BPF jump-table.
When we define labels via DL(), we just concatenate/textify
the combination of instruction opcodes, which consist of the
class, subclass, word size, target register and so on. So far,
we have left the BPF_ prefix intact, so that e.g. the preprocessor
generates the label BPF_ALU_BPF_ADD_BPF_X for DL(BPF_ALU, BPF_ADD,
BPF_X), whereas a label name of ALU_ADD_X is much easier
to grasp.
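
As a minimal sketch, condensed from the diff below, the change to
the macro boils down to:

	/* Before: call sites pass fully prefixed opcode parts, and the
	 * generated label keeps every BPF_ prefix:
	 *
	 *   #define DL(A, B, C)	[A|B|C] = &&A##_##B##_##C
	 *   DL(BPF_ALU, BPF_ADD, BPF_X) -> label BPF_ALU_BPF_ADD_BPF_X
	 */

	/* After: the macro glues the BPF_ prefix onto the table index
	 * itself, so call sites and generated labels stay short.
	 */
	#define DL(A, B, C)	[BPF_##A|BPF_##B|BPF_##C] = &&A##_##B##_##C
	/* DL(ALU, ADD, X) -> label ALU_ADD_X */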

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
---
 include/linux/filter.h |   3 +
 net/core/filter.c      | 308 ++++++++++++++++++++++++-------------------------
 2 files changed, 157 insertions(+), 154 deletions(-)

diff --git a/include/linux/filter.h b/include/linux/filter.h
index 759abf7..b042d1d 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -37,6 +37,9 @@
 #define BPF_CALL	0x80	/* function call */
 #define BPF_EXIT	0x90	/* function return */
 
+/* Placeholder/dummy for 0 */
+#define BPF_0		0
+
 /* BPF has 10 general purpose 64-bit registers and stack frame. */
 #define MAX_BPF_REG	11
 
diff --git a/net/core/filter.c b/net/core/filter.c
index 7c4db3d..a1784e9 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -156,94 +156,94 @@ unsigned int __sk_run_filter(void *ctx, const struct sock_filter_int *insn)
 	static const void *jumptable[256] = {
 		[0 ... 255] = &&default_label,
 		/* Now overwrite non-defaults ... */
-#define DL(A, B, C)	[A|B|C] = &&A##_##B##_##C
-		DL(BPF_ALU, BPF_ADD, BPF_X),
-		DL(BPF_ALU, BPF_ADD, BPF_K),
-		DL(BPF_ALU, BPF_SUB, BPF_X),
-		DL(BPF_ALU, BPF_SUB, BPF_K),
-		DL(BPF_ALU, BPF_AND, BPF_X),
-		DL(BPF_ALU, BPF_AND, BPF_K),
-		DL(BPF_ALU, BPF_OR, BPF_X),
-		DL(BPF_ALU, BPF_OR, BPF_K),
-		DL(BPF_ALU, BPF_LSH, BPF_X),
-		DL(BPF_ALU, BPF_LSH, BPF_K),
-		DL(BPF_ALU, BPF_RSH, BPF_X),
-		DL(BPF_ALU, BPF_RSH, BPF_K),
-		DL(BPF_ALU, BPF_XOR, BPF_X),
-		DL(BPF_ALU, BPF_XOR, BPF_K),
-		DL(BPF_ALU, BPF_MUL, BPF_X),
-		DL(BPF_ALU, BPF_MUL, BPF_K),
-		DL(BPF_ALU, BPF_MOV, BPF_X),
-		DL(BPF_ALU, BPF_MOV, BPF_K),
-		DL(BPF_ALU, BPF_DIV, BPF_X),
-		DL(BPF_ALU, BPF_DIV, BPF_K),
-		DL(BPF_ALU, BPF_MOD, BPF_X),
-		DL(BPF_ALU, BPF_MOD, BPF_K),
-		DL(BPF_ALU, BPF_NEG, 0),
-		DL(BPF_ALU, BPF_END, BPF_TO_BE),
-		DL(BPF_ALU, BPF_END, BPF_TO_LE),
-		DL(BPF_ALU64, BPF_ADD, BPF_X),
-		DL(BPF_ALU64, BPF_ADD, BPF_K),
-		DL(BPF_ALU64, BPF_SUB, BPF_X),
-		DL(BPF_ALU64, BPF_SUB, BPF_K),
-		DL(BPF_ALU64, BPF_AND, BPF_X),
-		DL(BPF_ALU64, BPF_AND, BPF_K),
-		DL(BPF_ALU64, BPF_OR, BPF_X),
-		DL(BPF_ALU64, BPF_OR, BPF_K),
-		DL(BPF_ALU64, BPF_LSH, BPF_X),
-		DL(BPF_ALU64, BPF_LSH, BPF_K),
-		DL(BPF_ALU64, BPF_RSH, BPF_X),
-		DL(BPF_ALU64, BPF_RSH, BPF_K),
-		DL(BPF_ALU64, BPF_XOR, BPF_X),
-		DL(BPF_ALU64, BPF_XOR, BPF_K),
-		DL(BPF_ALU64, BPF_MUL, BPF_X),
-		DL(BPF_ALU64, BPF_MUL, BPF_K),
-		DL(BPF_ALU64, BPF_MOV, BPF_X),
-		DL(BPF_ALU64, BPF_MOV, BPF_K),
-		DL(BPF_ALU64, BPF_ARSH, BPF_X),
-		DL(BPF_ALU64, BPF_ARSH, BPF_K),
-		DL(BPF_ALU64, BPF_DIV, BPF_X),
-		DL(BPF_ALU64, BPF_DIV, BPF_K),
-		DL(BPF_ALU64, BPF_MOD, BPF_X),
-		DL(BPF_ALU64, BPF_MOD, BPF_K),
-		DL(BPF_ALU64, BPF_NEG, 0),
-		DL(BPF_JMP, BPF_CALL, 0),
-		DL(BPF_JMP, BPF_JA, 0),
-		DL(BPF_JMP, BPF_JEQ, BPF_X),
-		DL(BPF_JMP, BPF_JEQ, BPF_K),
-		DL(BPF_JMP, BPF_JNE, BPF_X),
-		DL(BPF_JMP, BPF_JNE, BPF_K),
-		DL(BPF_JMP, BPF_JGT, BPF_X),
-		DL(BPF_JMP, BPF_JGT, BPF_K),
-		DL(BPF_JMP, BPF_JGE, BPF_X),
-		DL(BPF_JMP, BPF_JGE, BPF_K),
-		DL(BPF_JMP, BPF_JSGT, BPF_X),
-		DL(BPF_JMP, BPF_JSGT, BPF_K),
-		DL(BPF_JMP, BPF_JSGE, BPF_X),
-		DL(BPF_JMP, BPF_JSGE, BPF_K),
-		DL(BPF_JMP, BPF_JSET, BPF_X),
-		DL(BPF_JMP, BPF_JSET, BPF_K),
-		DL(BPF_JMP, BPF_EXIT, 0),
-		DL(BPF_STX, BPF_MEM, BPF_B),
-		DL(BPF_STX, BPF_MEM, BPF_H),
-		DL(BPF_STX, BPF_MEM, BPF_W),
-		DL(BPF_STX, BPF_MEM, BPF_DW),
-		DL(BPF_STX, BPF_XADD, BPF_W),
-		DL(BPF_STX, BPF_XADD, BPF_DW),
-		DL(BPF_ST, BPF_MEM, BPF_B),
-		DL(BPF_ST, BPF_MEM, BPF_H),
-		DL(BPF_ST, BPF_MEM, BPF_W),
-		DL(BPF_ST, BPF_MEM, BPF_DW),
-		DL(BPF_LDX, BPF_MEM, BPF_B),
-		DL(BPF_LDX, BPF_MEM, BPF_H),
-		DL(BPF_LDX, BPF_MEM, BPF_W),
-		DL(BPF_LDX, BPF_MEM, BPF_DW),
-		DL(BPF_LD, BPF_ABS, BPF_W),
-		DL(BPF_LD, BPF_ABS, BPF_H),
-		DL(BPF_LD, BPF_ABS, BPF_B),
-		DL(BPF_LD, BPF_IND, BPF_W),
-		DL(BPF_LD, BPF_IND, BPF_H),
-		DL(BPF_LD, BPF_IND, BPF_B),
+#define DL(A, B, C)	[BPF_##A|BPF_##B|BPF_##C] = &&A##_##B##_##C
+		DL(ALU, ADD, X),
+		DL(ALU, ADD, K),
+		DL(ALU, SUB, X),
+		DL(ALU, SUB, K),
+		DL(ALU, AND, X),
+		DL(ALU, AND, K),
+		DL(ALU, OR, X),
+		DL(ALU, OR, K),
+		DL(ALU, LSH, X),
+		DL(ALU, LSH, K),
+		DL(ALU, RSH, X),
+		DL(ALU, RSH, K),
+		DL(ALU, XOR, X),
+		DL(ALU, XOR, K),
+		DL(ALU, MUL, X),
+		DL(ALU, MUL, K),
+		DL(ALU, MOV, X),
+		DL(ALU, MOV, K),
+		DL(ALU, DIV, X),
+		DL(ALU, DIV, K),
+		DL(ALU, MOD, X),
+		DL(ALU, MOD, K),
+		DL(ALU, NEG, 0),
+		DL(ALU, END, TO_BE),
+		DL(ALU, END, TO_LE),
+		DL(ALU64, ADD, X),
+		DL(ALU64, ADD, K),
+		DL(ALU64, SUB, X),
+		DL(ALU64, SUB, K),
+		DL(ALU64, AND, X),
+		DL(ALU64, AND, K),
+		DL(ALU64, OR, X),
+		DL(ALU64, OR, K),
+		DL(ALU64, LSH, X),
+		DL(ALU64, LSH, K),
+		DL(ALU64, RSH, X),
+		DL(ALU64, RSH, K),
+		DL(ALU64, XOR, X),
+		DL(ALU64, XOR, K),
+		DL(ALU64, MUL, X),
+		DL(ALU64, MUL, K),
+		DL(ALU64, MOV, X),
+		DL(ALU64, MOV, K),
+		DL(ALU64, ARSH, X),
+		DL(ALU64, ARSH, K),
+		DL(ALU64, DIV, X),
+		DL(ALU64, DIV, K),
+		DL(ALU64, MOD, X),
+		DL(ALU64, MOD, K),
+		DL(ALU64, NEG, 0),
+		DL(JMP, CALL, 0),
+		DL(JMP, JA, 0),
+		DL(JMP, JEQ, X),
+		DL(JMP, JEQ, K),
+		DL(JMP, JNE, X),
+		DL(JMP, JNE, K),
+		DL(JMP, JGT, X),
+		DL(JMP, JGT, K),
+		DL(JMP, JGE, X),
+		DL(JMP, JGE, K),
+		DL(JMP, JSGT, X),
+		DL(JMP, JSGT, K),
+		DL(JMP, JSGE, X),
+		DL(JMP, JSGE, K),
+		DL(JMP, JSET, X),
+		DL(JMP, JSET, K),
+		DL(JMP, EXIT, 0),
+		DL(STX, MEM, B),
+		DL(STX, MEM, H),
+		DL(STX, MEM, W),
+		DL(STX, MEM, DW),
+		DL(STX, XADD, W),
+		DL(STX, XADD, DW),
+		DL(ST, MEM, B),
+		DL(ST, MEM, H),
+		DL(ST, MEM, W),
+		DL(ST, MEM, DW),
+		DL(LDX, MEM, B),
+		DL(LDX, MEM, H),
+		DL(LDX, MEM, W),
+		DL(LDX, MEM, DW),
+		DL(LD, ABS, W),
+		DL(LD, ABS, H),
+		DL(LD, ABS, B),
+		DL(LD, IND, W),
+		DL(LD, IND, H),
+		DL(LD, IND, B),
 #undef DL
 	};
 
@@ -257,93 +257,93 @@ select_insn:
 
 	/* ALU */
 #define ALU(OPCODE, OP)			\
-	BPF_ALU64_##OPCODE##_BPF_X:	\
+	ALU64_##OPCODE##_X:		\
 		A = A OP X;		\
 		CONT;			\
-	BPF_ALU_##OPCODE##_BPF_X:	\
+	ALU_##OPCODE##_X:		\
 		A = (u32) A OP (u32) X;	\
 		CONT;			\
-	BPF_ALU64_##OPCODE##_BPF_K:	\
+	ALU64_##OPCODE##_K:		\
 		A = A OP K;		\
 		CONT;			\
-	BPF_ALU_##OPCODE##_BPF_K:	\
+	ALU_##OPCODE##_K:		\
 		A = (u32) A OP (u32) K;	\
 		CONT;
 
-	ALU(BPF_ADD,  +)
-	ALU(BPF_SUB,  -)
-	ALU(BPF_AND,  &)
-	ALU(BPF_OR,   |)
-	ALU(BPF_LSH, <<)
-	ALU(BPF_RSH, >>)
-	ALU(BPF_XOR,  ^)
-	ALU(BPF_MUL,  *)
+	ALU(ADD,  +)
+	ALU(SUB,  -)
+	ALU(AND,  &)
+	ALU(OR,   |)
+	ALU(LSH, <<)
+	ALU(RSH, >>)
+	ALU(XOR,  ^)
+	ALU(MUL,  *)
 #undef ALU
-	BPF_ALU_BPF_NEG_0:
+	ALU_NEG_0:
 		A = (u32) -A;
 		CONT;
-	BPF_ALU64_BPF_NEG_0:
+	ALU64_NEG_0:
 		A = -A;
 		CONT;
-	BPF_ALU_BPF_MOV_BPF_X:
+	ALU_MOV_X:
 		A = (u32) X;
 		CONT;
-	BPF_ALU_BPF_MOV_BPF_K:
+	ALU_MOV_K:
 		A = (u32) K;
 		CONT;
-	BPF_ALU64_BPF_MOV_BPF_X:
+	ALU64_MOV_X:
 		A = X;
 		CONT;
-	BPF_ALU64_BPF_MOV_BPF_K:
+	ALU64_MOV_K:
 		A = K;
 		CONT;
-	BPF_ALU64_BPF_ARSH_BPF_X:
+	ALU64_ARSH_X:
 		(*(s64 *) &A) >>= X;
 		CONT;
-	BPF_ALU64_BPF_ARSH_BPF_K:
+	ALU64_ARSH_K:
 		(*(s64 *) &A) >>= K;
 		CONT;
-	BPF_ALU64_BPF_MOD_BPF_X:
+	ALU64_MOD_X:
 		if (unlikely(X == 0))
 			return 0;
 		tmp = A;
 		A = do_div(tmp, X);
 		CONT;
-	BPF_ALU_BPF_MOD_BPF_X:
+	ALU_MOD_X:
 		if (unlikely(X == 0))
 			return 0;
 		tmp = (u32) A;
 		A = do_div(tmp, (u32) X);
 		CONT;
-	BPF_ALU64_BPF_MOD_BPF_K:
+	ALU64_MOD_K:
 		tmp = A;
 		A = do_div(tmp, K);
 		CONT;
-	BPF_ALU_BPF_MOD_BPF_K:
+	ALU_MOD_K:
 		tmp = (u32) A;
 		A = do_div(tmp, (u32) K);
 		CONT;
-	BPF_ALU64_BPF_DIV_BPF_X:
+	ALU64_DIV_X:
 		if (unlikely(X == 0))
 			return 0;
 		do_div(A, X);
 		CONT;
-	BPF_ALU_BPF_DIV_BPF_X:
+	ALU_DIV_X:
 		if (unlikely(X == 0))
 			return 0;
 		tmp = (u32) A;
 		do_div(tmp, (u32) X);
 		A = (u32) tmp;
 		CONT;
-	BPF_ALU64_BPF_DIV_BPF_K:
+	ALU64_DIV_K:
 		do_div(A, K);
 		CONT;
-	BPF_ALU_BPF_DIV_BPF_K:
+	ALU_DIV_K:
 		tmp = (u32) A;
 		do_div(tmp, (u32) K);
 		A = (u32) tmp;
 		CONT;
-	BPF_ALU_BPF_END_BPF_TO_BE:
+	ALU_END_TO_BE:
 		switch (K) {
 		case 16:
 			A = (__force u16) cpu_to_be16(A);
@@ -356,7 +356,7 @@ select_insn:
 			break;
 		}
 		CONT;
-	BPF_ALU_BPF_END_BPF_TO_LE:
+	ALU_END_TO_LE:
 		switch (K) {
 		case 16:
 			A = (__force u16) cpu_to_le16(A);
@@ -371,7 +371,7 @@ select_insn:
 		CONT;
 
 	/* CALL */
-	BPF_JMP_BPF_CALL_0:
+	JMP_CALL_0:
 		/* Function call scratches R1-R5 registers, preserves R6-R9,
 		 * and stores return value into R0.
 		 */
@@ -380,122 +380,122 @@ select_insn:
 		CONT;
 
 	/* JMP */
-	BPF_JMP_BPF_JA_0:
+	JMP_JA_0:
 		insn += insn->off;
 		CONT;
-	BPF_JMP_BPF_JEQ_BPF_X:
+	JMP_JEQ_X:
 		if (A == X) {
 			insn += insn->off;
 			CONT_JMP;
 		}
 		CONT;
-	BPF_JMP_BPF_JEQ_BPF_K:
+	JMP_JEQ_K:
 		if (A == K) {
 			insn += insn->off;
 			CONT_JMP;
 		}
 		CONT;
-	BPF_JMP_BPF_JNE_BPF_X:
+	JMP_JNE_X:
 		if (A != X) {
 			insn += insn->off;
 			CONT_JMP;
 		}
 		CONT;
-	BPF_JMP_BPF_JNE_BPF_K:
+	JMP_JNE_K:
 		if (A != K) {
 			insn += insn->off;
 			CONT_JMP;
 		}
 		CONT;
-	BPF_JMP_BPF_JGT_BPF_X:
+	JMP_JGT_X:
 		if (A > X) {
 			insn += insn->off;
 			CONT_JMP;
 		}
 		CONT;
-	BPF_JMP_BPF_JGT_BPF_K:
+	JMP_JGT_K:
 		if (A > K) {
 			insn += insn->off;
 			CONT_JMP;
 		}
 		CONT;
-	BPF_JMP_BPF_JGE_BPF_X:
+	JMP_JGE_X:
 		if (A >= X) {
 			insn += insn->off;
 			CONT_JMP;
 		}
 		CONT;
-	BPF_JMP_BPF_JGE_BPF_K:
+	JMP_JGE_K:
 		if (A >= K) {
 			insn += insn->off;
 			CONT_JMP;
 		}
 		CONT;
-	BPF_JMP_BPF_JSGT_BPF_X:
-		if (((s64)A) > ((s64)X)) {
+	JMP_JSGT_X:
+		if (((s64) A) > ((s64) X)) {
 			insn += insn->off;
 			CONT_JMP;
 		}
 		CONT;
-	BPF_JMP_BPF_JSGT_BPF_K:
-		if (((s64)A) > ((s64)K)) {
+	JMP_JSGT_K:
+		if (((s64) A) > ((s64) K)) {
 			insn += insn->off;
 			CONT_JMP;
 		}
 		CONT;
-	BPF_JMP_BPF_JSGE_BPF_X:
-		if (((s64)A) >= ((s64)X)) {
+	JMP_JSGE_X:
+		if (((s64) A) >= ((s64) X)) {
 			insn += insn->off;
 			CONT_JMP;
 		}
 		CONT;
-	BPF_JMP_BPF_JSGE_BPF_K:
-		if (((s64)A) >= ((s64)K)) {
+	JMP_JSGE_K:
+		if (((s64) A) >= ((s64) K)) {
 			insn += insn->off;
 			CONT_JMP;
 		}
 		CONT;
-	BPF_JMP_BPF_JSET_BPF_X:
+	JMP_JSET_X:
 		if (A & X) {
 			insn += insn->off;
 			CONT_JMP;
 		}
 		CONT;
-	BPF_JMP_BPF_JSET_BPF_K:
+	JMP_JSET_K:
 		if (A & K) {
 			insn += insn->off;
 			CONT_JMP;
 		}
 		CONT;
-	BPF_JMP_BPF_EXIT_0:
+	JMP_EXIT_0:
 		return R0;
 
 	/* STX and ST and LDX*/
 #define LDST(SIZEOP, SIZE)					\
-	BPF_STX_BPF_MEM_##SIZEOP:				\
+	STX_MEM_##SIZEOP:					\
 		*(SIZE *)(unsigned long) (A + insn->off) = X;	\
 		CONT;						\
-	BPF_ST_BPF_MEM_##SIZEOP:				\
+	ST_MEM_##SIZEOP:					\
 		*(SIZE *)(unsigned long) (A + insn->off) = K;	\
 		CONT;						\
-	BPF_LDX_BPF_MEM_##SIZEOP:				\
+	LDX_MEM_##SIZEOP:					\
 		A = *(SIZE *)(unsigned long) (X + insn->off);	\
 		CONT;
 
-	LDST(BPF_B,   u8)
-	LDST(BPF_H,  u16)
-	LDST(BPF_W,  u32)
-	LDST(BPF_DW, u64)
+	LDST(B,   u8)
+	LDST(H,  u16)
+	LDST(W,  u32)
+	LDST(DW, u64)
 #undef LDST
-	BPF_STX_BPF_XADD_BPF_W: /* lock xadd *(u32 *)(A + insn->off) += X */
+	STX_XADD_W: /* lock xadd *(u32 *)(A + insn->off) += X */
 		atomic_add((u32) X, (atomic_t *)(unsigned long)
 			   (A + insn->off));
 		CONT;
-	BPF_STX_BPF_XADD_BPF_DW: /* lock xadd *(u64 *)(A + insn->off) += X */
+	STX_XADD_DW: /* lock xadd *(u64 *)(A + insn->off) += X */
 		atomic64_add((u64) X, (atomic64_t *)(unsigned long)
 			     (A + insn->off));
 		CONT;
-	BPF_LD_BPF_ABS_BPF_W: /* R0 = ntohl(*(u32 *) (skb->data + K)) */
+	LD_ABS_W: /* R0 = ntohl(*(u32 *) (skb->data + K)) */
 		off = K;
 load_word:
 		/* BPF_LD + BPD_ABS and BPF_LD + BPF_IND insns are only
@@ -524,7 +524,7 @@ load_word:
 			CONT;
 		}
 		return 0;
-	BPF_LD_BPF_ABS_BPF_H: /* R0 = ntohs(*(u16 *) (skb->data + K)) */
+	LD_ABS_H: /* R0 = ntohs(*(u16 *) (skb->data + K)) */
 		off = K;
 load_half:
 		ptr = load_pointer((struct sk_buff *) ctx, off, 2, &tmp);
@@ -533,7 +533,7 @@ load_half:
 			CONT;
 		}
 		return 0;
-	BPF_LD_BPF_ABS_BPF_B: /* R0 = *(u8 *) (ctx + K) */
+	LD_ABS_B: /* R0 = *(u8 *) (ctx + K) */
 		off = K;
 load_byte:
 		ptr = load_pointer((struct sk_buff *) ctx, off, 1, &tmp);
@@ -542,13 +542,13 @@ load_byte:
 			CONT;
 		}
 		return 0;
-	BPF_LD_BPF_IND_BPF_W: /* R0 = ntohl(*(u32 *) (skb->data + X + K)) */
+	LD_IND_W: /* R0 = ntohl(*(u32 *) (skb->data + X + K)) */
 		off = K + X;
 		goto load_word;
-	BPF_LD_BPF_IND_BPF_H: /* R0 = ntohs(*(u16 *) (skb->data + X + K)) */
+	LD_IND_H: /* R0 = ntohs(*(u16 *) (skb->data + X + K)) */
 		off = K + X;
 		goto load_half;
-	BPF_LD_BPF_IND_BPF_B: /* R0 = *(u8 *) (skb->data + X + K) */
+	LD_IND_B: /* R0 = *(u8 *) (skb->data + X + K) */
 		off = K + X;
 		goto load_byte;
 
-- 
1.7.11.7


* [PATCH net-next v2 2/5] net: filter: misc/various cleanups
  2014-04-24  6:45 [PATCH net-next v2 0/5] BPF updates Daniel Borkmann
  2014-04-24  6:45 ` [PATCH net-next v2 1/5] net: filter: simplify label names from jump-table Daniel Borkmann
@ 2014-04-24  6:45 ` Daniel Borkmann
  2014-04-24 20:00   ` David Miller
  2014-04-24  6:45 ` [PATCH net-next v2 3/5] net: filter: get rid of sock_fprog_kern Daniel Borkmann
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Daniel Borkmann @ 2014-04-24  6:45 UTC (permalink / raw)
  To: davem; +Cc: ast, netdev

This contains only some minor misc cleanups: we can spare the
extra variable declaration in __skb_get_pay_offset(), we can mark
some unexpected conditions in the fast path as unlikely(), the
cast in __get_random_u32() is rather unnecessary, and in
__sk_migrate_realloc() we can remove the memcpy() and do a direct
assignment of the structs instead. The latter was suggested by
Fengguang Wu, found with coccinelle.
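
For illustration only, the memcpy()-to-assignment change is of this
generic shape (a standalone sketch, not the kernel code):

	struct point { int x, y; };

	static void copy_point(struct point *dst, const struct point *src)
	{
		/* memcpy(dst, src, sizeof(*dst));   -- before */
		*dst = *src;	/* after: same copy, but type-checked */
	}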

Suggested-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
---
 net/core/filter.c | 21 ++++++++++-----------
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/net/core/filter.c b/net/core/filter.c
index a1784e9..2fd2293 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -57,9 +57,9 @@ void *bpf_internal_load_pointer_neg_helper(const struct sk_buff *skb, int k, uns
 		ptr = skb_network_header(skb) + k - SKF_NET_OFF;
 	else if (k >= SKF_LL_OFF)
 		ptr = skb_mac_header(skb) + k - SKF_LL_OFF;
-
 	if (ptr >= skb->head && ptr + size <= skb_tail_pointer(skb))
 		return ptr;
+
 	return NULL;
 }
 
@@ -68,6 +68,7 @@ static inline void *load_pointer(const struct sk_buff *skb, int k,
 {
 	if (k >= 0)
 		return skb_header_pointer(skb, k, size, buffer);
+
 	return bpf_internal_load_pointer_neg_helper(skb, k, size);
 }
 
@@ -596,9 +597,7 @@ static unsigned int pkt_type_offset(void)
 
 static u64 __skb_get_pay_offset(u64 ctx, u64 A, u64 X, u64 r4, u64 r5)
 {
-	struct sk_buff *skb = (struct sk_buff *)(long) ctx;
-
-	return __skb_get_poff(skb);
+	return __skb_get_poff((struct sk_buff *)(long) ctx);
 }
 
 static u64 __skb_get_nlattr(u64 ctx, u64 A, u64 X, u64 r4, u64 r5)
@@ -609,10 +608,10 @@ static u64 __skb_get_nlattr(u64 ctx, u64 A, u64 X, u64 r4, u64 r5)
 	if (skb_is_nonlinear(skb))
 		return 0;
 
-	if (skb->len < sizeof(struct nlattr))
+	if (unlikely(skb->len < sizeof(struct nlattr)))
 		return 0;
 
-	if (A > skb->len - sizeof(struct nlattr))
+	if (unlikely(A > skb->len - sizeof(struct nlattr)))
 		return 0;
 
 	nla = nla_find((struct nlattr *) &skb->data[A], skb->len - A, X);
@@ -630,14 +629,14 @@ static u64 __skb_get_nlattr_nest(u64 ctx, u64 A, u64 X, u64 r4, u64 r5)
 	if (skb_is_nonlinear(skb))
 		return 0;
 
-	if (skb->len < sizeof(struct nlattr))
+	if (unlikely(skb->len < sizeof(struct nlattr)))
 		return 0;
 
-	if (A > skb->len - sizeof(struct nlattr))
+	if (unlikely(A > skb->len - sizeof(struct nlattr)))
 		return 0;
 
 	nla = (struct nlattr *) &skb->data[A];
-	if (nla->nla_len > skb->len - A)
+	if (unlikely(nla->nla_len > skb->len - A))
 		return 0;
 
 	nla = nla_find_nested(nla, X);
@@ -655,7 +654,7 @@ static u64 __get_raw_cpu_id(u64 ctx, u64 A, u64 X, u64 r4, u64 r5)
 /* note that this only generates 32-bit random numbers */
 static u64 __get_random_u32(u64 ctx, u64 A, u64 X, u64 r4, u64 r5)
 {
-	return (u64)prandom_u32();
+	return prandom_u32();
 }
 
 static bool convert_bpf_extensions(struct sock_filter *fp,
@@ -1472,7 +1471,7 @@ static struct sk_filter *__sk_migrate_realloc(struct sk_filter *fp,
 
 	fp_new = sock_kmalloc(sk, len, GFP_KERNEL);
 	if (fp_new) {
-		memcpy(fp_new, fp, sizeof(struct sk_filter));
+		*fp_new = *fp;
 		/* As we're kepping orig_prog in fp_new along,
 		 * we need to make sure we're not evicting it
 		 * from the old fp.
-- 
1.7.11.7


* [PATCH net-next v2 3/5] net: filter: get rid of sock_fprog_kern
  2014-04-24  6:45 [PATCH net-next v2 0/5] BPF updates Daniel Borkmann
  2014-04-24  6:45 ` [PATCH net-next v2 1/5] net: filter: simplify label names from jump-table Daniel Borkmann
  2014-04-24  6:45 ` [PATCH net-next v2 2/5] net: filter: misc/various cleanups Daniel Borkmann
@ 2014-04-24  6:45 ` Daniel Borkmann
  2014-04-24 20:02   ` David Miller
  2014-04-24  6:45 ` [PATCH net-next v2 4/5] net: filter: make register namings more comprehensible Daniel Borkmann
  2014-04-24  6:45 ` [PATCH net-next v2 5/5] net: filter: optimize BPF migration for ARG1/CTX handling Daniel Borkmann
  4 siblings, 1 reply; 13+ messages in thread
From: Daniel Borkmann @ 2014-04-24  6:45 UTC (permalink / raw)
  To: davem; +Cc: ast, netdev

It is actually cleaner to just get rid of the sock_fprog_kern
structure. It's not really useful, as we can simply use the sock_fprog
structure as we do elsewhere in the kernel. This could throw some
sparse false positives, but getting rid of the structure duplication
is probably the better way to go.
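
For context, the duplication in question looks roughly like this (the
sock_fprog layout is paraphrased from the uapi header; the __user
annotation on its filter pointer is what sparse keys on):

	struct sock_fprog {			/* uapi structure */
		unsigned short		len;
		struct sock_filter __user *filter;
	};

	struct sock_fprog_kern {		/* kernel-only twin, removed here */
		u16			len;
		struct sock_filter	*filter;
	};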

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
---
 include/linux/filter.h | 13 +++++--------
 net/core/filter.c      | 24 ++++++++++++------------
 net/core/sock_diag.c   |  4 ++--
 3 files changed, 19 insertions(+), 22 deletions(-)

diff --git a/include/linux/filter.h b/include/linux/filter.h
index b042d1d..e58b687 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -67,11 +67,6 @@ struct compat_sock_fprog {
 };
 #endif
 
-struct sock_fprog_kern {
-	u16			len;
-	struct sock_filter	*filter;
-};
-
 struct sk_buff;
 struct sock;
 struct seccomp_data;
@@ -80,7 +75,7 @@ struct sk_filter {
 	atomic_t		refcnt;
 	u32			jited:1,	/* Is our filter JIT'ed? */
 				len:31;		/* Number of filter blocks */
-	struct sock_fprog_kern	*orig_prog;	/* Original BPF program */
+	struct sock_fprog	*orig_prog;	/* Original BPF program */
 	struct rcu_head		rcu;
 	unsigned int		(*bpf_func)(const struct sk_buff *skb,
 					    const struct sock_filter_int *filter);
@@ -97,8 +92,10 @@ static inline unsigned int sk_filter_size(unsigned int proglen)
 		   offsetof(struct sk_filter, insns[proglen]));
 }
 
-#define sk_filter_proglen(fprog)			\
-		(fprog->len * sizeof(fprog->filter[0]))
+static inline unsigned int sk_filter_prog_size(const struct sock_fprog *fprog)
+{
+	return fprog->len * sizeof(fprog->filter[0]);
+}
 
 #define SK_RUN_FILTER(filter, ctx)			\
 		(*filter->bpf_func)(ctx, filter->insnsi)
diff --git a/net/core/filter.c b/net/core/filter.c
index 2fd2293..e5e7717 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -1396,17 +1396,17 @@ EXPORT_SYMBOL(sk_chk_filter);
 static int sk_store_orig_filter(struct sk_filter *fp,
 				const struct sock_fprog *fprog)
 {
-	unsigned int fsize = sk_filter_proglen(fprog);
-	struct sock_fprog_kern *fkprog;
+	unsigned int fsize = sk_filter_prog_size(fprog);
+	struct sock_fprog *op;
 
-	fp->orig_prog = kmalloc(sizeof(*fkprog), GFP_KERNEL);
+	fp->orig_prog = kmalloc(sizeof(*op), GFP_KERNEL);
 	if (!fp->orig_prog)
 		return -ENOMEM;
 
-	fkprog = fp->orig_prog;
-	fkprog->len = fprog->len;
-	fkprog->filter = kmemdup(fp->insns, fsize, GFP_KERNEL);
-	if (!fkprog->filter) {
+	op = fp->orig_prog;
+	op->len = fprog->len;
+	op->filter = kmemdup(fp->insns, fsize, GFP_KERNEL);
+	if (!op->filter) {
 		kfree(fp->orig_prog);
 		return -ENOMEM;
 	}
@@ -1416,7 +1416,7 @@ static int sk_store_orig_filter(struct sk_filter *fp,
 
 static void sk_release_orig_filter(struct sk_filter *fp)
 {
-	struct sock_fprog_kern *fprog = fp->orig_prog;
+	struct sock_fprog *fprog = fp->orig_prog;
 
 	if (fprog) {
 		kfree(fprog->filter);
@@ -1599,7 +1599,7 @@ static struct sk_filter *__sk_prepare_filter(struct sk_filter *fp,
 int sk_unattached_filter_create(struct sk_filter **pfp,
 				struct sock_fprog *fprog)
 {
-	unsigned int fsize = sk_filter_proglen(fprog);
+	unsigned int fsize = sk_filter_prog_size(fprog);
 	struct sk_filter *fp;
 
 	/* Make sure new filter is there and in the right amounts. */
@@ -1651,7 +1651,7 @@ EXPORT_SYMBOL_GPL(sk_unattached_filter_destroy);
 int sk_attach_filter(struct sock_fprog *fprog, struct sock *sk)
 {
 	struct sk_filter *fp, *old_fp;
-	unsigned int fsize = sk_filter_proglen(fprog);
+	unsigned int fsize = sk_filter_prog_size(fprog);
 	unsigned int sk_fsize = sk_filter_size(fprog->len);
 	int err;
 
@@ -1799,7 +1799,7 @@ void sk_decode_filter(struct sock_filter *filt, struct sock_filter *to)
 int sk_get_filter(struct sock *sk, struct sock_filter __user *ubuf,
 		  unsigned int len)
 {
-	struct sock_fprog_kern *fprog;
+	struct sock_fprog *fprog;
 	struct sk_filter *filter;
 	int ret = 0;
 
@@ -1824,7 +1824,7 @@ int sk_get_filter(struct sock *sk, struct sock_filter __user *ubuf,
 		goto out;
 
 	ret = -EFAULT;
-	if (copy_to_user(ubuf, fprog->filter, sk_filter_proglen(fprog)))
+	if (copy_to_user(ubuf, fprog->filter, sk_filter_prog_size(fprog)))
 		goto out;
 
 	/* Instead of bytes, the API requests to return the number
diff --git a/net/core/sock_diag.c b/net/core/sock_diag.c
index d7af188..6669077 100644
--- a/net/core/sock_diag.c
+++ b/net/core/sock_diag.c
@@ -52,7 +52,7 @@ EXPORT_SYMBOL_GPL(sock_diag_put_meminfo);
 int sock_diag_put_filterinfo(struct user_namespace *user_ns, struct sock *sk,
 			     struct sk_buff *skb, int attrtype)
 {
-	struct sock_fprog_kern *fprog;
+	struct sock_fprog *fprog;
 	struct sk_filter *filter;
 	struct nlattr *attr;
 	unsigned int flen;
@@ -69,7 +69,7 @@ int sock_diag_put_filterinfo(struct user_namespace *user_ns, struct sock *sk,
 		goto out;
 
 	fprog = filter->orig_prog;
-	flen = sk_filter_proglen(fprog);
+	flen = sk_filter_prog_size(fprog);
 
 	attr = nla_reserve(skb, attrtype, flen);
 	if (attr == NULL) {
-- 
1.7.11.7


* [PATCH net-next v2 4/5] net: filter: make register namings more comprehensible
  2014-04-24  6:45 [PATCH net-next v2 0/5] BPF updates Daniel Borkmann
                   ` (2 preceding siblings ...)
  2014-04-24  6:45 ` [PATCH net-next v2 3/5] net: filter: get rid of sock_fprog_kern Daniel Borkmann
@ 2014-04-24  6:45 ` Daniel Borkmann
  2014-04-24  6:45 ` [PATCH net-next v2 5/5] net: filter: optimize BPF migration for ARG1/CTX handling Daniel Borkmann
  4 siblings, 0 replies; 13+ messages in thread
From: Daniel Borkmann @ 2014-04-24  6:45 UTC (permalink / raw)
  To: davem; +Cc: ast, netdev

The current code is a bit hard to parse regarding which registers can
be used, how they are mapped and how it all plays together. It makes
much more sense to define this a bit more clearly so that the code is
more intuitive. This patch cleans this up and makes the naming more
consistent throughout the code. It also allows for moving some of the
defines into the header file. The clearing of the A and X registers in
__sk_run_filter() does not get particular register names assigned, as
they do not have an 'official' function, but rather just result from
the concrete initial mapping. Since we already use small letters for
the BPF helper functions of BPF_CALL, be consistent here as well.
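
Condensed from the diff below, the old ad-hoc register numbers become
named positions in the eBPF register file:

	/* BPF_REG_0..BPF_REG_10 are a plain enum, 0..10 (see the diff).
	 * The classic BPF mapping then reads:
	 */
	#define BPF_REG_A	BPF_REG_0	/* was A_REG    (0)  */
	#define BPF_REG_X	BPF_REG_7	/* was X_REG    (7)  */
	#define BPF_REG_TMP	BPF_REG_8	/* was TMP_REG  (8)  */
	#define BPF_REG_ARG1	BPF_REG_1	/* was ARG1_REG (1)  */
	#define BPF_REG_CTX	BPF_REG_6	/* was CTX_REG  (6)  */
	#define BPF_REG_FP	BPF_REG_10	/* was FP_REG   (10) */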

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
---
 include/linux/filter.h |  44 ++++++++--
 net/core/filter.c      | 215 +++++++++++++++++++++++++------------------------
 2 files changed, 145 insertions(+), 114 deletions(-)

diff --git a/include/linux/filter.h b/include/linux/filter.h
index e58b687..947dc33 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -40,16 +40,47 @@
 /* Placeholder/dummy for 0 */
 #define BPF_0		0
 
+/* Register numbers */
+enum {
+	BPF_REG_0 = 0,
+	BPF_REG_1,
+	BPF_REG_2,
+	BPF_REG_3,
+	BPF_REG_4,
+	BPF_REG_5,
+	BPF_REG_6,
+	BPF_REG_7,
+	BPF_REG_8,
+	BPF_REG_9,
+	BPF_REG_10,
+	__MAX_BPF_REG,
+};
+
 /* BPF has 10 general purpose 64-bit registers and stack frame. */
-#define MAX_BPF_REG	11
+#define MAX_BPF_REG	__MAX_BPF_REG
+
+/* ArgX, context and stack frame pointer register positions. Note,
+ * Arg1, Arg2, Arg3, etc are used as argument mappings of function
+ * calls in BPF_CALL instruction.
+ */
+#define BPF_REG_ARG1	BPF_REG_1
+#define BPF_REG_ARG2	BPF_REG_2
+#define BPF_REG_ARG3	BPF_REG_3
+#define BPF_REG_ARG4	BPF_REG_4
+#define BPF_REG_ARG5	BPF_REG_5
+#define BPF_REG_CTX	BPF_REG_6
+#define BPF_REG_FP	BPF_REG_10
+
+/* Additional register mappings for converted user programs. */
+#define BPF_REG_A	BPF_REG_0
+#define BPF_REG_X	BPF_REG_7
+#define BPF_REG_TMP	BPF_REG_8
 
 /* BPF program can access up to 512 bytes of stack space. */
 #define MAX_BPF_STACK	512
 
-/* Arg1, context and stack frame pointer register positions. */
-#define ARG1_REG	1
-#define CTX_REG		6
-#define FP_REG		10
+/* Macro to invoke filter function. */
+#define SK_RUN_FILTER(filter, ctx)  (*filter->bpf_func)(ctx, filter->insnsi)
 
 struct sock_filter_int {
 	__u8	code;		/* opcode */
@@ -97,9 +128,6 @@ static inline unsigned int sk_filter_prog_size(const struct sock_fprog *fprog)
 	return fprog->len * sizeof(fprog->filter[0]);
 }
 
-#define SK_RUN_FILTER(filter, ctx)			\
-		(*filter->bpf_func)(ctx, filter->insnsi)
-
 int sk_filter(struct sock *sk, struct sk_buff *skb);
 
 u32 sk_run_filter_int_seccomp(const struct seccomp_data *ctx,
diff --git a/net/core/filter.c b/net/core/filter.c
index e5e7717..eada3d5 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -45,6 +45,27 @@
 #include <linux/seccomp.h>
 #include <linux/if_vlan.h>
 
+/* Registers */
+#define R0	regs[BPF_REG_0]
+#define R1	regs[BPF_REG_1]
+#define R2	regs[BPF_REG_2]
+#define R3	regs[BPF_REG_3]
+#define R4	regs[BPF_REG_4]
+#define R5	regs[BPF_REG_5]
+#define R6	regs[BPF_REG_6]
+#define R7	regs[BPF_REG_7]
+#define R8	regs[BPF_REG_8]
+#define R9	regs[BPF_REG_9]
+#define R10	regs[BPF_REG_10]
+
+/* Named registers */
+#define A	regs[insn->a_reg]
+#define X	regs[insn->x_reg]
+#define FP	regs[BPF_REG_FP]
+#define ARG1	regs[BPF_REG_ARG1]
+#define CTX	regs[BPF_REG_CTX]
+#define K	insn->imm
+
 /* No hurry in this branch
  *
  * Exported for the bpf jit load helper.
@@ -123,13 +144,6 @@ noinline u64 __bpf_call_base(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5)
 	return 0;
 }
 
-/* Register mappings for user programs. */
-#define A_REG		0
-#define X_REG		7
-#define TMP_REG		8
-#define ARG2_REG	2
-#define ARG3_REG	3
-
 /**
  *	__sk_run_filter - run a filter on a given context
  *	@ctx: buffer to run the filter on
@@ -143,17 +157,6 @@ unsigned int __sk_run_filter(void *ctx, const struct sock_filter_int *insn)
 {
 	u64 stack[MAX_BPF_STACK / sizeof(u64)];
 	u64 regs[MAX_BPF_REG], tmp;
-	void *ptr;
-	int off;
-
-#define K  insn->imm
-#define A  regs[insn->a_reg]
-#define X  regs[insn->x_reg]
-#define R0 regs[0]
-
-#define CONT	 ({insn++; goto select_insn; })
-#define CONT_JMP ({insn++; goto select_insn; })
-
 	static const void *jumptable[256] = {
 		[0 ... 255] = &&default_label,
 		/* Now overwrite non-defaults ... */
@@ -247,11 +250,18 @@ unsigned int __sk_run_filter(void *ctx, const struct sock_filter_int *insn)
 		DL(LD, IND, B),
 #undef DL
 	};
+	void *ptr;
+	int off;
 
-	regs[FP_REG]  = (u64) (unsigned long) &stack[ARRAY_SIZE(stack)];
-	regs[ARG1_REG] = (u64) (unsigned long) ctx;
-	regs[A_REG] = 0;
-	regs[X_REG] = 0;
+#define CONT	 ({ insn++; goto select_insn; })
+#define CONT_JMP ({ insn++; goto select_insn; })
+
+	FP = (u64) (unsigned long) &stack[ARRAY_SIZE(stack)];
+	ARG1 = (u64) (unsigned long) ctx;
+
+	/* Register for user BPF programs need to be reset first. */
+	regs[BPF_REG_A] = 0;
+	regs[BPF_REG_X] = 0;
 
 select_insn:
 	goto *jumptable[insn->code];
@@ -376,8 +386,7 @@ select_insn:
 		/* Function call scratches R1-R5 registers, preserves R6-R9,
 		 * and stores return value into R0.
 		 */
-		R0 = (__bpf_call_base + insn->imm)(regs[1], regs[2], regs[3],
-						   regs[4], regs[5]);
+		R0 = (__bpf_call_base + insn->imm)(R1, R2, R3, R4, R5);
 		CONT;
 
 	/* JMP */
@@ -501,7 +510,7 @@ select_insn:
 load_word:
 		/* BPF_LD + BPD_ABS and BPF_LD + BPF_IND insns are only
 		 * appearing in the programs where ctx == skb. All programs
-		 * keep 'ctx' in regs[CTX_REG] == R6, sk_convert_filter()
+		 * keep 'ctx' in regs[BPF_REG_CTX] == R6, sk_convert_filter()
 		 * saves it in R6, internal BPF verifier will check that
 		 * R6 == ctx.
 		 *
@@ -557,13 +566,6 @@ load_byte:
 		/* If we ever reach this, we have a bug somewhere. */
 		WARN_RATELIMIT(1, "unknown opcode %02x\n", insn->code);
 		return 0;
-#undef CONT_JMP
-#undef CONT
-
-#undef R0
-#undef X
-#undef A
-#undef K
 }
 
 u32 sk_run_filter_int_seccomp(const struct seccomp_data *ctx,
@@ -595,12 +597,12 @@ static unsigned int pkt_type_offset(void)
 	return -1;
 }
 
-static u64 __skb_get_pay_offset(u64 ctx, u64 A, u64 X, u64 r4, u64 r5)
+static u64 __skb_get_pay_offset(u64 ctx, u64 a, u64 x, u64 r4, u64 r5)
 {
 	return __skb_get_poff((struct sk_buff *)(long) ctx);
 }
 
-static u64 __skb_get_nlattr(u64 ctx, u64 A, u64 X, u64 r4, u64 r5)
+static u64 __skb_get_nlattr(u64 ctx, u64 a, u64 x, u64 r4, u64 r5)
 {
 	struct sk_buff *skb = (struct sk_buff *)(long) ctx;
 	struct nlattr *nla;
@@ -611,17 +613,17 @@ static u64 __skb_get_nlattr(u64 ctx, u64 A, u64 X, u64 r4, u64 r5)
 	if (unlikely(skb->len < sizeof(struct nlattr)))
 		return 0;
 
-	if (unlikely(A > skb->len - sizeof(struct nlattr)))
+	if (unlikely(a > skb->len - sizeof(struct nlattr)))
 		return 0;
 
-	nla = nla_find((struct nlattr *) &skb->data[A], skb->len - A, X);
+	nla = nla_find((struct nlattr *) &skb->data[a], skb->len - a, x);
 	if (nla)
 		return (void *) nla - (void *) skb->data;
 
 	return 0;
 }
 
-static u64 __skb_get_nlattr_nest(u64 ctx, u64 A, u64 X, u64 r4, u64 r5)
+static u64 __skb_get_nlattr_nest(u64 ctx, u64 a, u64 x, u64 r4, u64 r5)
 {
 	struct sk_buff *skb = (struct sk_buff *)(long) ctx;
 	struct nlattr *nla;
@@ -632,27 +634,27 @@ static u64 __skb_get_nlattr_nest(u64 ctx, u64 A, u64 X, u64 r4, u64 r5)
 	if (unlikely(skb->len < sizeof(struct nlattr)))
 		return 0;
 
-	if (unlikely(A > skb->len - sizeof(struct nlattr)))
+	if (unlikely(a > skb->len - sizeof(struct nlattr)))
 		return 0;
 
-	nla = (struct nlattr *) &skb->data[A];
-	if (unlikely(nla->nla_len > skb->len - A))
+	nla = (struct nlattr *) &skb->data[a];
+	if (unlikely(nla->nla_len > skb->len - a))
 		return 0;
 
-	nla = nla_find_nested(nla, X);
+	nla = nla_find_nested(nla, x);
 	if (nla)
 		return (void *) nla - (void *) skb->data;
 
 	return 0;
 }
 
-static u64 __get_raw_cpu_id(u64 ctx, u64 A, u64 X, u64 r4, u64 r5)
+static u64 __get_raw_cpu_id(u64 ctx, u64 a, u64 x, u64 r4, u64 r5)
 {
 	return raw_smp_processor_id();
 }
 
 /* note that this only generates 32-bit random numbers */
-static u64 __get_random_u32(u64 ctx, u64 A, u64 X, u64 r4, u64 r5)
+static u64 __get_random_u32(u64 ctx, u64 a, u64 x, u64 r4, u64 r5)
 {
 	return prandom_u32();
 }
@@ -667,28 +669,28 @@ static bool convert_bpf_extensions(struct sock_filter *fp,
 		BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, protocol) != 2);
 
 		insn->code = BPF_LDX | BPF_MEM | BPF_H;
-		insn->a_reg = A_REG;
-		insn->x_reg = CTX_REG;
+		insn->a_reg = BPF_REG_A;
+		insn->x_reg = BPF_REG_CTX;
 		insn->off = offsetof(struct sk_buff, protocol);
 		insn++;
 
 		/* A = ntohs(A) [emitting a nop or swap16] */
 		insn->code = BPF_ALU | BPF_END | BPF_FROM_BE;
-		insn->a_reg = A_REG;
+		insn->a_reg = BPF_REG_A;
 		insn->imm = 16;
 		break;
 
 	case SKF_AD_OFF + SKF_AD_PKTTYPE:
 		insn->code = BPF_LDX | BPF_MEM | BPF_B;
-		insn->a_reg = A_REG;
-		insn->x_reg = CTX_REG;
+		insn->a_reg = BPF_REG_A;
+		insn->x_reg = BPF_REG_CTX;
 		insn->off = pkt_type_offset();
 		if (insn->off < 0)
 			return false;
 		insn++;
 
 		insn->code = BPF_ALU | BPF_AND | BPF_K;
-		insn->a_reg = A_REG;
+		insn->a_reg = BPF_REG_A;
 		insn->imm = PKT_TYPE_MAX;
 		break;
 
@@ -698,13 +700,13 @@ static bool convert_bpf_extensions(struct sock_filter *fp,
 			insn->code = BPF_LDX | BPF_MEM | BPF_DW;
 		else
 			insn->code = BPF_LDX | BPF_MEM | BPF_W;
-		insn->a_reg = TMP_REG;
-		insn->x_reg = CTX_REG;
+		insn->a_reg = BPF_REG_TMP;
+		insn->x_reg = BPF_REG_CTX;
 		insn->off = offsetof(struct sk_buff, dev);
 		insn++;
 
 		insn->code = BPF_JMP | BPF_JNE | BPF_K;
-		insn->a_reg = TMP_REG;
+		insn->a_reg = BPF_REG_TMP;
 		insn->imm = 0;
 		insn->off = 1;
 		insn++;
@@ -715,8 +717,8 @@ static bool convert_bpf_extensions(struct sock_filter *fp,
 		BUILD_BUG_ON(FIELD_SIZEOF(struct net_device, ifindex) != 4);
 		BUILD_BUG_ON(FIELD_SIZEOF(struct net_device, type) != 2);
 
-		insn->a_reg = A_REG;
-		insn->x_reg = TMP_REG;
+		insn->a_reg = BPF_REG_A;
+		insn->x_reg = BPF_REG_TMP;
 
 		if (fp->k == SKF_AD_OFF + SKF_AD_IFINDEX) {
 			insn->code = BPF_LDX | BPF_MEM | BPF_W;
@@ -731,8 +733,8 @@ static bool convert_bpf_extensions(struct sock_filter *fp,
 		BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, mark) != 4);
 
 		insn->code = BPF_LDX | BPF_MEM | BPF_W;
-		insn->a_reg = A_REG;
-		insn->x_reg = CTX_REG;
+		insn->a_reg = BPF_REG_A;
+		insn->x_reg = BPF_REG_CTX;
 		insn->off = offsetof(struct sk_buff, mark);
 		break;
 
@@ -740,8 +742,8 @@ static bool convert_bpf_extensions(struct sock_filter *fp,
 		BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, hash) != 4);
 
 		insn->code = BPF_LDX | BPF_MEM | BPF_W;
-		insn->a_reg = A_REG;
-		insn->x_reg = CTX_REG;
+		insn->a_reg = BPF_REG_A;
+		insn->x_reg = BPF_REG_CTX;
 		insn->off = offsetof(struct sk_buff, hash);
 		break;
 
@@ -749,8 +751,8 @@ static bool convert_bpf_extensions(struct sock_filter *fp,
 		BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, queue_mapping) != 2);
 
 		insn->code = BPF_LDX | BPF_MEM | BPF_H;
-		insn->a_reg = A_REG;
-		insn->x_reg = CTX_REG;
+		insn->a_reg = BPF_REG_A;
+		insn->x_reg = BPF_REG_CTX;
 		insn->off = offsetof(struct sk_buff, queue_mapping);
 		break;
 
@@ -759,8 +761,8 @@ static bool convert_bpf_extensions(struct sock_filter *fp,
 		BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, vlan_tci) != 2);
 
 		insn->code = BPF_LDX | BPF_MEM | BPF_H;
-		insn->a_reg = A_REG;
-		insn->x_reg = CTX_REG;
+		insn->a_reg = BPF_REG_A;
+		insn->x_reg = BPF_REG_CTX;
 		insn->off = offsetof(struct sk_buff, vlan_tci);
 		insn++;
 
@@ -768,16 +770,16 @@ static bool convert_bpf_extensions(struct sock_filter *fp,
 
 		if (fp->k == SKF_AD_OFF + SKF_AD_VLAN_TAG) {
 			insn->code = BPF_ALU | BPF_AND | BPF_K;
-			insn->a_reg = A_REG;
+			insn->a_reg = BPF_REG_A;
 			insn->imm = ~VLAN_TAG_PRESENT;
 		} else {
 			insn->code = BPF_ALU | BPF_RSH | BPF_K;
-			insn->a_reg = A_REG;
+			insn->a_reg = BPF_REG_A;
 			insn->imm = 12;
 			insn++;
 
 			insn->code = BPF_ALU | BPF_AND | BPF_K;
-			insn->a_reg = A_REG;
+			insn->a_reg = BPF_REG_A;
 			insn->imm = 1;
 		}
 		break;
@@ -789,20 +791,20 @@ static bool convert_bpf_extensions(struct sock_filter *fp,
 	case SKF_AD_OFF + SKF_AD_RANDOM:
 		/* arg1 = ctx */
 		insn->code = BPF_ALU64 | BPF_MOV | BPF_X;
-		insn->a_reg = ARG1_REG;
-		insn->x_reg = CTX_REG;
+		insn->a_reg = BPF_REG_ARG1;
+		insn->x_reg = BPF_REG_CTX;
 		insn++;
 
 		/* arg2 = A */
 		insn->code = BPF_ALU64 | BPF_MOV | BPF_X;
-		insn->a_reg = ARG2_REG;
-		insn->x_reg = A_REG;
+		insn->a_reg = BPF_REG_ARG2;
+		insn->x_reg = BPF_REG_A;
 		insn++;
 
 		/* arg3 = X */
 		insn->code = BPF_ALU64 | BPF_MOV | BPF_X;
-		insn->a_reg = ARG3_REG;
-		insn->x_reg = X_REG;
+		insn->a_reg = BPF_REG_ARG3;
+		insn->x_reg = BPF_REG_X;
 		insn++;
 
 		/* Emit call(ctx, arg2=A, arg3=X) */
@@ -828,8 +830,8 @@ static bool convert_bpf_extensions(struct sock_filter *fp,
 
 	case SKF_AD_OFF + SKF_AD_ALU_XOR_X:
 		insn->code = BPF_ALU | BPF_XOR | BPF_X;
-		insn->a_reg = A_REG;
-		insn->x_reg = X_REG;
+		insn->a_reg = BPF_REG_A;
+		insn->x_reg = BPF_REG_X;
 		break;
 
 	default:
@@ -879,7 +881,7 @@ int sk_convert_filter(struct sock_filter *prog, int len,
 	u8 bpf_src;
 
 	BUILD_BUG_ON(BPF_MEMWORDS * sizeof(u32) > MAX_BPF_STACK);
-	BUILD_BUG_ON(FP_REG + 1 != MAX_BPF_REG);
+	BUILD_BUG_ON(BPF_REG_FP + 1 != MAX_BPF_REG);
 
 	if (len <= 0 || len >= BPF_MAXINSNS)
 		return -EINVAL;
@@ -896,8 +898,8 @@ do_pass:
 
 	if (new_insn) {
 		new_insn->code = BPF_ALU64 | BPF_MOV | BPF_X;
-		new_insn->a_reg = CTX_REG;
-		new_insn->x_reg = ARG1_REG;
+		new_insn->a_reg = BPF_REG_CTX;
+		new_insn->x_reg = BPF_REG_ARG1;
 	}
 	new_insn++;
 
@@ -947,8 +949,8 @@ do_pass:
 				break;
 
 			insn->code = fp->code;
-			insn->a_reg = A_REG;
-			insn->x_reg = X_REG;
+			insn->a_reg = BPF_REG_A;
+			insn->x_reg = BPF_REG_X;
 			insn->imm = fp->k;
 			break;
 
@@ -982,16 +984,16 @@ do_pass:
 				 * in compare insn.
 				 */
 				insn->code = BPF_ALU | BPF_MOV | BPF_K;
-				insn->a_reg = TMP_REG;
+				insn->a_reg = BPF_REG_TMP;
 				insn->imm = fp->k;
 				insn++;
 
-				insn->a_reg = A_REG;
-				insn->x_reg = TMP_REG;
+				insn->a_reg = BPF_REG_A;
+				insn->x_reg = BPF_REG_TMP;
 				bpf_src = BPF_X;
 			} else {
-				insn->a_reg = A_REG;
-				insn->x_reg = X_REG;
+				insn->a_reg = BPF_REG_A;
+				insn->x_reg = BPF_REG_X;
 				insn->imm = fp->k;
 				bpf_src = BPF_SRC(fp->code);
 			}
@@ -1026,33 +1028,33 @@ do_pass:
 		/* ldxb 4 * ([14] & 0xf) is remaped into 6 insns. */
 		case BPF_LDX | BPF_MSH | BPF_B:
 			insn->code = BPF_ALU64 | BPF_MOV | BPF_X;
-			insn->a_reg = TMP_REG;
-			insn->x_reg = A_REG;
+			insn->a_reg = BPF_REG_TMP;
+			insn->x_reg = BPF_REG_A;
 			insn++;
 
 			insn->code = BPF_LD | BPF_ABS | BPF_B;
-			insn->a_reg = A_REG;
+			insn->a_reg = BPF_REG_A;
 			insn->imm = fp->k;
 			insn++;
 
 			insn->code = BPF_ALU | BPF_AND | BPF_K;
-			insn->a_reg = A_REG;
+			insn->a_reg = BPF_REG_A;
 			insn->imm = 0xf;
 			insn++;
 
 			insn->code = BPF_ALU | BPF_LSH | BPF_K;
-			insn->a_reg = A_REG;
+			insn->a_reg = BPF_REG_A;
 			insn->imm = 2;
 			insn++;
 
 			insn->code = BPF_ALU64 | BPF_MOV | BPF_X;
-			insn->a_reg = X_REG;
-			insn->x_reg = A_REG;
+			insn->a_reg = BPF_REG_X;
+			insn->x_reg = BPF_REG_A;
 			insn++;
 
 			insn->code = BPF_ALU64 | BPF_MOV | BPF_X;
-			insn->a_reg = A_REG;
-			insn->x_reg = TMP_REG;
+			insn->a_reg = BPF_REG_A;
+			insn->x_reg = BPF_REG_TMP;
 			break;
 
 		/* RET_K, RET_A are remaped into 2 insns. */
@@ -1062,7 +1064,7 @@ do_pass:
 				     (BPF_RVAL(fp->code) == BPF_K ?
 				      BPF_K : BPF_X);
 			insn->a_reg = 0;
-			insn->x_reg = A_REG;
+			insn->x_reg = BPF_REG_A;
 			insn->imm = fp->k;
 			insn++;
 
@@ -1073,8 +1075,9 @@ do_pass:
 		case BPF_ST:
 		case BPF_STX:
 			insn->code = BPF_STX | BPF_MEM | BPF_W;
-			insn->a_reg = FP_REG;
-			insn->x_reg = fp->code == BPF_ST ? A_REG : X_REG;
+			insn->a_reg = BPF_REG_FP;
+			insn->x_reg = fp->code == BPF_ST ?
+				      BPF_REG_A : BPF_REG_X;
 			insn->off = -(BPF_MEMWORDS - fp->k) * 4;
 			break;
 
@@ -1083,8 +1086,8 @@ do_pass:
 		case BPF_LDX | BPF_MEM:
 			insn->code = BPF_LDX | BPF_MEM | BPF_W;
 			insn->a_reg = BPF_CLASS(fp->code) == BPF_LD ?
-				      A_REG : X_REG;
-			insn->x_reg = FP_REG;
+				      BPF_REG_A : BPF_REG_X;
+			insn->x_reg = BPF_REG_FP;
 			insn->off = -(BPF_MEMWORDS - fp->k) * 4;
 			break;
 
@@ -1093,22 +1096,22 @@ do_pass:
 		case BPF_LDX | BPF_IMM:
 			insn->code = BPF_ALU | BPF_MOV | BPF_K;
 			insn->a_reg = BPF_CLASS(fp->code) == BPF_LD ?
-				      A_REG : X_REG;
+				      BPF_REG_A : BPF_REG_X;
 			insn->imm = fp->k;
 			break;
 
 		/* X = A */
 		case BPF_MISC | BPF_TAX:
 			insn->code = BPF_ALU64 | BPF_MOV | BPF_X;
-			insn->a_reg = X_REG;
-			insn->x_reg = A_REG;
+			insn->a_reg = BPF_REG_X;
+			insn->x_reg = BPF_REG_A;
 			break;
 
 		/* A = X */
 		case BPF_MISC | BPF_TXA:
 			insn->code = BPF_ALU64 | BPF_MOV | BPF_X;
-			insn->a_reg = A_REG;
-			insn->x_reg = X_REG;
+			insn->a_reg = BPF_REG_A;
+			insn->x_reg = BPF_REG_X;
 			break;
 
 		/* A = skb->len or X = skb->len */
@@ -1116,16 +1119,16 @@ do_pass:
 		case BPF_LDX | BPF_W | BPF_LEN:
 			insn->code = BPF_LDX | BPF_MEM | BPF_W;
 			insn->a_reg = BPF_CLASS(fp->code) == BPF_LD ?
-				      A_REG : X_REG;
-			insn->x_reg = CTX_REG;
+				      BPF_REG_A : BPF_REG_X;
+			insn->x_reg = BPF_REG_CTX;
 			insn->off = offsetof(struct sk_buff, len);
 			break;
 
 		/* access seccomp_data fields */
 		case BPF_LDX | BPF_ABS | BPF_W:
 			insn->code = BPF_LDX | BPF_MEM | BPF_W;
-			insn->a_reg = A_REG;
-			insn->x_reg = CTX_REG;
+			insn->a_reg = BPF_REG_A;
+			insn->x_reg = BPF_REG_CTX;
 			insn->off = fp->k;
 			break;
 
-- 
1.7.11.7


* [PATCH net-next v2 5/5] net: filter: optimize BPF migration for ARG1/CTX handling
  2014-04-24  6:45 [PATCH net-next v2 0/5] BPF updates Daniel Borkmann
                   ` (3 preceding siblings ...)
  2014-04-24  6:45 ` [PATCH net-next v2 4/5] net: filter: make register namings more comprehensible Daniel Borkmann
@ 2014-04-24  6:45 ` Daniel Borkmann
  2014-04-24 15:43   ` Alexei Starovoitov
  2014-04-24 20:04   ` David Miller
  4 siblings, 2 replies; 13+ messages in thread
From: Daniel Borkmann @ 2014-04-24  6:45 UTC (permalink / raw)
  To: davem; +Cc: ast, netdev

Currently, at initial setup in __sk_run_filter() we initialize the
BPF stack's frame pointer and the context. However, instead of
placing the context into the CTX register, we load it into ARG1, and
during user filter migration we *always* emit an instruction that
copies the content from ARG1 over to CTX. ARG1 is needed in BPF_CALL
instructions to set up ctx; for user BPF filters, ARG2 carries A and
ARG3 carries X for call emission. However, for user-migrated filters
we copy CTX over to ARG1 in these cases anyway. We can thus spare
this extra interpreter instruction and do the assignment once at
initial setup time.
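
In instruction terms, the always-emitted prologue that goes away is a
single eBPF move per converted program (sketch, matching the hunk
removed below):

	/* CTX = ARG1, i.e. copy R1 into R6 */
	insn->code  = BPF_ALU64 | BPF_MOV | BPF_X;
	insn->a_reg = BPF_REG_CTX;	/* dst: R6 */
	insn->x_reg = BPF_REG_ARG1;	/* src: R1 */

With CTX now filled directly in __sk_run_filter(), this per-program
instruction becomes unnecessary.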

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
---
 net/core/filter.c | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/net/core/filter.c b/net/core/filter.c
index eada3d5..9744365 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -257,7 +257,10 @@ unsigned int __sk_run_filter(void *ctx, const struct sock_filter_int *insn)
 #define CONT_JMP ({ insn++; goto select_insn; })
 
 	FP = (u64) (unsigned long) &stack[ARRAY_SIZE(stack)];
-	ARG1 = (u64) (unsigned long) ctx;
+	CTX = (u64) (unsigned long) ctx;
+
+	/* Direct users expect arg1 to be filled already with ctx. */
+	ARG1 = CTX;
 
 	/* Register for user BPF programs need to be reset first. */
 	regs[BPF_REG_A] = 0;
@@ -896,13 +899,6 @@ do_pass:
 	new_insn = new_prog;
 	fp = prog;
 
-	if (new_insn) {
-		new_insn->code = BPF_ALU64 | BPF_MOV | BPF_X;
-		new_insn->a_reg = BPF_REG_CTX;
-		new_insn->x_reg = BPF_REG_ARG1;
-	}
-	new_insn++;
-
 	for (i = 0; i < len; fp++, i++) {
 		struct sock_filter_int tmp_insns[6] = { };
 		struct sock_filter_int *insn = tmp_insns;
-- 
1.7.11.7


* Re: [PATCH net-next v2 5/5] net: filter: optimize BPF migration for ARG1/CTX handling
  2014-04-24  6:45 ` [PATCH net-next v2 5/5] net: filter: optimize BPF migration for ARG1/CTX handling Daniel Borkmann
@ 2014-04-24 15:43   ` Alexei Starovoitov
  2014-04-24 20:04   ` David Miller
  1 sibling, 0 replies; 13+ messages in thread
From: Alexei Starovoitov @ 2014-04-24 15:43 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: David S. Miller, Network Development

On Wed, Apr 23, 2014 at 11:45 PM, Daniel Borkmann <dborkman@redhat.com> wrote:
> Currently, at initial setup in __sk_run_filter() we initialize the
> BPF stack's frame pointer and the context. However, instead of
> placing the context into the CTX register, we load it into ARG1, and
> during user filter migration we *always* emit an instruction that
> copies the content from ARG1 over to CTX. ARG1 is needed in BPF_CALL
> instructions to set up ctx; for user BPF filters, ARG2 carries A and
> ARG3 carries X for call emission. However, for user-migrated filters
> we copy CTX over to ARG1 in these cases anyway. We can thus spare
> this extra interpreter instruction and do the assignment once at
> initial setup time.
>
> Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
> Cc: Alexei Starovoitov <ast@plumgrid.com>

Acked-by: Alexei Starovoitov <ast@plumgrid.com>

> opinion on this particular patch, just wanted to have clarified what we
> have with ABI here as there's no particular freeze from the code we have

yes. Of course, we do not have a freeze on the ABI.
I was arguing about the merits of not changing it for this particular
case, especially when the difference is a single line of code.
If we want to add another register to the ISA, that would be a fine
reason to change it.

Thanks!

> ---
>  net/core/filter.c | 12 ++++--------
>  1 file changed, 4 insertions(+), 8 deletions(-)
>
> diff --git a/net/core/filter.c b/net/core/filter.c
> index eada3d5..9744365 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -257,7 +257,10 @@ unsigned int __sk_run_filter(void *ctx, const struct sock_filter_int *insn)
>  #define CONT_JMP ({ insn++; goto select_insn; })
>
>         FP = (u64) (unsigned long) &stack[ARRAY_SIZE(stack)];
> -       ARG1 = (u64) (unsigned long) ctx;
> +       CTX = (u64) (unsigned long) ctx;
> +
> +       /* Direct users expect arg1 to be filled already with ctx. */
> +       ARG1 = CTX;
>
>         /* Register for user BPF programs need to be reset first. */
>         regs[BPF_REG_A] = 0;
> @@ -896,13 +899,6 @@ do_pass:
>         new_insn = new_prog;
>         fp = prog;
>
> -       if (new_insn) {
> -               new_insn->code = BPF_ALU64 | BPF_MOV | BPF_X;
> -               new_insn->a_reg = BPF_REG_CTX;
> -               new_insn->x_reg = BPF_REG_ARG1;
> -       }
> -       new_insn++;
> -
>         for (i = 0; i < len; fp++, i++) {
>                 struct sock_filter_int tmp_insns[6] = { };
>                 struct sock_filter_int *insn = tmp_insns;
> --
> 1.7.11.7
>


* Re: [PATCH net-next v2 2/5] net: filter: misc/various cleanups
  2014-04-24  6:45 ` [PATCH net-next v2 2/5] net: filter: misc/various cleanups Daniel Borkmann
@ 2014-04-24 20:00   ` David Miller
  2014-04-25  7:52     ` Daniel Borkmann
  0 siblings, 1 reply; 13+ messages in thread
From: David Miller @ 2014-04-24 20:00 UTC (permalink / raw)
  To: dborkman; +Cc: ast, netdev

From: Daniel Borkmann <dborkman@redhat.com>
Date: Thu, 24 Apr 2014 08:45:24 +0200

> +	return __skb_get_poff((struct sk_buff *)(long) ctx);

Please use "unsigned long" when casting pointers to/from integers.
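
That is, something of this shape (hypothetical helper, purely to
illustrate the cast):

	static struct sk_buff *bpf_ctx_to_skb(u64 ctx)
	{
		/* unsigned long matches the pointer width, so the
		 * round-trip through an integer stays well-defined.
		 */
		return (struct sk_buff *)(unsigned long) ctx;
	}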


* Re: [PATCH net-next v2 3/5] net: filter: get rid of sock_fprog_kern
  2014-04-24  6:45 ` [PATCH net-next v2 3/5] net: filter: get rid of sock_fprog_kern Daniel Borkmann
@ 2014-04-24 20:02   ` David Miller
  2014-04-25  7:58     ` Daniel Borkmann
  0 siblings, 1 reply; 13+ messages in thread
From: David Miller @ 2014-04-24 20:02 UTC (permalink / raw)
  To: dborkman; +Cc: ast, netdev

From: Daniel Borkmann <dborkman@redhat.com>
Date: Thu, 24 Apr 2014 08:45:25 +0200

> It is actually cleaner to just get rid of the sock_fprog_kern
> structure. It's not really useful, as we can simply use the sock_fprog
> structure as we do elsewhere in the kernel. This could throw some
> sparse false positives, but getting rid of the structure duplication
> is probably the better way to go.
> 
> Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
> Acked-by: Alexei Starovoitov <ast@plumgrid.com>

But this is the whole point of sock_fprog_kern.

It's so that we can always have the correct type for the pointers,
and we can show cleanly (without lots of ugly casts) to sparse
whether we expect a pointer to be a user or a kernel one.

I really don't like this change, sorry.


* Re: [PATCH net-next v2 5/5] net: filter: optimize BPF migration for ARG1/CTX handling
  2014-04-24  6:45 ` [PATCH net-next v2 5/5] net: filter: optimize BPF migration for ARG1/CTX handling Daniel Borkmann
  2014-04-24 15:43   ` Alexei Starovoitov
@ 2014-04-24 20:04   ` David Miller
  2014-04-26 18:06     ` Alexei Starovoitov
  1 sibling, 1 reply; 13+ messages in thread
From: David Miller @ 2014-04-24 20:04 UTC (permalink / raw)
  To: dborkman; +Cc: ast, netdev

From: Daniel Borkmann <dborkman@redhat.com>
Date: Thu, 24 Apr 2014 08:45:27 +0200

> Currently, at initial setup in __sk_run_filter() we initialize the
> BPF stack's frame pointer and the context. However, instead of
> placing the context into the CTX register, we load it into ARG1, and
> during user filter migration we *always* emit an instruction that
> copies the content from ARG1 over to CTX. ARG1 is needed in BPF_CALL
> instructions to set up ctx; for user BPF filters, ARG2 carries A and
> ARG3 carries X for call emission. However, for user-migrated filters
> we copy CTX over to ARG1 in these cases anyway. We can thus spare
> this extra interpreter instruction and do the assignment once at
> initial setup time.
> 
> Signed-off-by: Daniel Borkmann <dborkman@redhat.com>

We're adjusting code for facilities that aren't even used.

I want someone to explain to me exactly how calls in and out of
the EBPF context are expected to behave.

There are so many cpu calling conventions.  Some use registers up
to a certain number for passing arguments, then any overflow args
go on the stack.  Some pass all args on the stack.

How will all such schemes be accommodated by the BPF_CALL facilities?

Until I personally can even begin to understand this, I'm not applying
any more patches to these areas of the code, sorry.


* Re: [PATCH net-next v2 2/5] net: filter: misc/various cleanups
  2014-04-24 20:00   ` David Miller
@ 2014-04-25  7:52     ` Daniel Borkmann
  0 siblings, 0 replies; 13+ messages in thread
From: Daniel Borkmann @ 2014-04-25  7:52 UTC (permalink / raw)
  To: David Miller; +Cc: ast, netdev

On 04/24/2014 10:00 PM, David Miller wrote:
> From: Daniel Borkmann <dborkman@redhat.com>
> Date: Thu, 24 Apr 2014 08:45:24 +0200
>
>> +	return __skb_get_poff((struct sk_buff *)(long) ctx);
>
> Please use "unsigned long" when casting pointers to/from integers.

I'll fix it up, thanks.

There's ongoing work to improve the documentation file to address
your concerns.

Thanks,

Daniel


* Re: [PATCH net-next v2 3/5] net: filter: get rid of sock_fprog_kern
  2014-04-24 20:02   ` David Miller
@ 2014-04-25  7:58     ` Daniel Borkmann
  0 siblings, 0 replies; 13+ messages in thread
From: Daniel Borkmann @ 2014-04-25  7:58 UTC (permalink / raw)
  To: David Miller; +Cc: ast, netdev

On 04/24/2014 10:02 PM, David Miller wrote:
...
> It's so that we can always have the correct type for the pointers,
> and we can show cleanly (without lots of ugly casts) to sparse
> whether we expect a pointer to be a user or a kernel one.

Understood. I'll take the time to go the other way around and try to
convert the sk_unattached_filter_create() API to use struct
sock_fprog_kern instead, as this is only for in-kernel users.


* Re: [PATCH net-next v2 5/5] net: filter: optimize BPF migration for ARG1/CTX handling
  2014-04-24 20:04   ` David Miller
@ 2014-04-26 18:06     ` Alexei Starovoitov
  0 siblings, 0 replies; 13+ messages in thread
From: Alexei Starovoitov @ 2014-04-26 18:06 UTC (permalink / raw)
  To: David Miller; +Cc: Daniel Borkmann, Network Development

On Thu, Apr 24, 2014 at 1:04 PM, David Miller <davem@davemloft.net> wrote:
> From: Daniel Borkmann <dborkman@redhat.com>
> Date: Thu, 24 Apr 2014 08:45:27 +0200
>
>> Currently, at initial setup in __sk_run_filter() we initialize the
>> BPF stack's frame pointer and the context. However, instead of
>> placing the context into the CTX register, we load it into ARG1, and
>> during user filter migration we *always* emit an instruction that
>> copies the content from ARG1 over to CTX. ARG1 is needed in BPF_CALL
>> instructions to set up ctx; for user BPF filters, ARG2 carries A and
>> ARG3 carries X for call emission. However, for user-migrated filters
>> we copy CTX over to ARG1 in these cases anyway. We can thus spare
>> this extra interpreter instruction and do the assignment once at
>> initial setup time.
>>
>> Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
>
> We're adjusting code for facilities that aren't even used.
>
> I want someone to explain to me exactly how calls in and out of
> the EBPF context are expected to behave.
>
> There are so many cpu calling conventions.  Some use registers up
> to a certain number for passing arguments, then any overflow args
> go on the stack.  Some pass all args on the stack.
>
> How will all such schemes be accommodated by the BPF_CALL facilities?

The calling convention was picked to cover common call situations
without performance penalty:
"R1-R5 - arguments to in-kernel function or to bpf program"

Calls from the kernel into bpf are limited to one argument (in R1),
since that's what socket filters and seccomp use, and tracing
filters will be fine as well.
(This can be extended in the future if needed.)

Calls from bpf into the kernel are limited to 5 arguments, and all
in-kernel helper functions look like:
u64 __bpf_helper_func(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5);
These functions may ignore some or all arguments
(since on most architectures there is no penalty for ignoring args,
and the cost of passing them is negligible compared to the rest of
the program).
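
A hypothetical helper following this shape (the name and body are
made up purely for illustration):

	static u64 __example_get_len(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5)
	{
		struct sk_buff *skb = (struct sk_buff *)(unsigned long) r1;

		/* r2-r5 are simply ignored; all five slots exist so
		 * that every helper shares one uniform signature.
		 */
		return skb->len;
	}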

Calls from bpf into the kernel with 6 or more arguments are not
supported, since they are rare in normal programs and would
unnecessarily complicate the code and JITs.
x86_64 passes the first 6 arguments via registers and the rest on
the stack; other 64-bit architectures pass 7+ in registers.
So from a JIT's point of view, passing 5 is straightforward and no
messy stack manipulation is necessary.

JITing to 32-bit CPUs is not easy: all ebpf registers are 64-bit
while the underlying CPU is 32-bit, so the JIT needs to do 64-bit
arithmetic with 32-bit hw registers. It can be an interesting project
for someone with time on their hands. In any case, the ebpf
interpreter is quite fast and well optimized.

32-bit JITs can also use a middle-ground approach: if a program
contains only 32-bit arithmetic, JIT it; otherwise let the
interpreter run it.

btw the JIT of BPF_CALL to x86_64 looks like:
               case BPF_JMP | BPF_CALL:
                        func = (u8 *)__bpf_call_base + K;
                        jmp_offset = func - (image + addrs[i]);
                        if (!K || !is_simm32(jmp_offset)) {
                                pr_err(...);
                                return -EINVAL;
                        }
                        EMIT1_off32(0xE8, jmp_offset);
                        break;
it's really one-to-one.

> Until I personally can even begin to understand this, I'm not applying
> any more patches to these areas of the code, sorry.

Point taken.
We realize that the documentation in filter.txt is not sufficient.
For the last few days we've been working on an extensive
documentation update. It should be ready by Monday or Tuesday.

Thank you
Alexei

