netdev.vger.kernel.org archive mirror
* [PATCH v2 bpf-next 0/3] bpf: Program extensions or dynamic re-linking
@ 2020-01-21  0:53 Alexei Starovoitov
  2020-01-21  0:53 ` [PATCH v2 bpf-next 1/3] bpf: Introduce dynamic program extensions Alexei Starovoitov
                   ` (5 more replies)
  0 siblings, 6 replies; 17+ messages in thread
From: Alexei Starovoitov @ 2020-01-21  0:53 UTC (permalink / raw)
  To: davem; +Cc: daniel, netdev, bpf, kernel-team

For the last few months the BPF community has been discussing an approach to
call chaining, since the existing bpf_tail_call() mechanism used in production
XDP programs has plenty of downsides. The outcome of these discussions was a
decision to implement dynamic re-linking of BPF programs, where a rootlet XDP
program attached to a netdevice can programmatically define a policy of
execution of other XDP programs. Such a rootlet would be compiled as a normal
XDP program and provide a number of placeholder global functions which can
later be replaced with future XDP programs. BPF trampoline and
function-by-function verification were building blocks towards that goal.
Patch 1 is the final building block. It introduces dynamic program extensions.
A number of improvements, like more flexible function-by-function verification
and a better libbpf API, will be implemented in future patches.
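
To make the rootlet idea concrete, a dispatcher along these lines could be
compiled as a regular XDP program (illustrative sketch only; the function
names are made up and not part of this series):

    #include <linux/bpf.h>
    #include <bpf/bpf_helpers.h>

    /* Placeholder global functions. Each one is a slot that can later be
     * replaced by a BPF_PROG_TYPE_EXT program. In practice extra care may
     * be needed to keep clang from inlining the calls (discussed later in
     * the thread).
     */
    __attribute__ ((noinline))
    int xdp_slot_1(struct xdp_md *ctx)
    {
            return XDP_PASS;
    }

    __attribute__ ((noinline))
    int xdp_slot_2(struct xdp_md *ctx)
    {
            return XDP_PASS;
    }

    SEC("xdp")
    int rootlet(struct xdp_md *ctx)
    {
            int ret;

            ret = xdp_slot_1(ctx);
            if (ret != XDP_PASS)
                    return ret;
            return xdp_slot_2(ctx);
    }

    char _license[] SEC("license") = "GPL";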

v1->v2:
- addressed Andrii's comments
- rebase

Alexei Starovoitov (3):
  bpf: Introduce dynamic program extensions
  libbpf: Add support for program extensions
  selftests/bpf: Add tests for program extensions

 include/linux/bpf.h                           |  10 +-
 include/linux/bpf_types.h                     |   2 +
 include/linux/btf.h                           |   5 +
 include/uapi/linux/bpf.h                      |   1 +
 kernel/bpf/btf.c                              | 152 +++++++++++++++++-
 kernel/bpf/syscall.c                          |  15 +-
 kernel/bpf/trampoline.c                       |  41 ++++-
 kernel/bpf/verifier.c                         |  85 +++++++---
 tools/include/uapi/linux/bpf.h                |   1 +
 tools/lib/bpf/bpf.c                           |   3 +-
 tools/lib/bpf/libbpf.c                        |  13 +-
 tools/lib/bpf/libbpf.h                        |   2 +
 tools/lib/bpf/libbpf.map                      |   2 +
 tools/lib/bpf/libbpf_probes.c                 |   1 +
 .../selftests/bpf/prog_tests/fexit_bpf2bpf.c  |  20 ++-
 .../selftests/bpf/progs/fexit_bpf2bpf.c       |  57 +++++++
 .../selftests/bpf/progs/test_pkt_access.c     |   8 +-
 17 files changed, 384 insertions(+), 34 deletions(-)

-- 
2.23.0



* [PATCH v2 bpf-next 1/3] bpf: Introduce dynamic program extensions
  2020-01-21  0:53 [PATCH v2 bpf-next 0/3] bpf: Program extensions or dynamic re-linking Alexei Starovoitov
@ 2020-01-21  0:53 ` Alexei Starovoitov
  2020-01-21  7:36   ` John Fastabend
  2020-01-21  0:53 ` [PATCH v2 bpf-next 2/3] libbpf: Add support for " Alexei Starovoitov
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 17+ messages in thread
From: Alexei Starovoitov @ 2020-01-21  0:53 UTC (permalink / raw)
  To: davem; +Cc: daniel, netdev, bpf, kernel-team

Introduce dynamic program extensions. Users can load additional BPF functions
and replace global functions in previously loaded BPF programs while these
programs are executing.

Global functions are verified individually by the verifier based on their
types only. Hence a global function in the new program whose types match an
older function can safely replace that corresponding function.

This new function/program is called 'an extension' of the old program. At load
time the verifier uses the (attach_prog_fd, attach_btf_id) pair to identify the
function to be replaced. The extension program's BPF program type is derived
from the target program; technically bpf_verifier_ops is copied from the target
program. The BPF_PROG_TYPE_EXT program type is a placeholder: it has empty
verifier_ops. The extension program can call the same bpf helper functions as
the target program. A single BPF_PROG_TYPE_EXT type is used to extend XDP, SKB
and all other program types. The verifier allows only one level of replacement,
meaning that an extension program cannot recursively extend an extension. That
also means that the maximum stack size increases from 512 to 1024 bytes and the
maximum function nesting level from 8 to 16. The programs don't always consume
that much. The stack usage is determined by the number of on-stack variables
used by the program. The verifier could have enforced the 512 limit for the
combined original plus extension program, but that would make for a difficult
user experience. The main use case for extensions is to provide a generic
mechanism to plug external programs into a policy program or function call
chaining.

A BPF trampoline is used to track both fentry/fexit and program extensions
because both use the same nop slot at the beginning of every BPF function.
Attaching fentry/fexit to a function that was replaced is not allowed. The
opposite is true as well: replacing a function that is currently being analyzed
with fentry/fexit is not allowed. The executable page allocated by the BPF
trampoline is not used by program extensions. This inefficiency will be
optimized in future patches.

Function-by-function verification of global functions supports scalars and
pointers to context only. Hence program extensions are supported only for that
class of global functions. In the future the verifier will be extended with
support for pointers to structures, arrays with sizes, etc.
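
For example (an illustrative condensation of the selftest added in patch 3;
the SEC("freplace/...") convention comes from the libbpf patch, and the usual
<bpf/bpf_helpers.h> include for SEC() is assumed), a replaceable global
function and a matching extension could look like:

    /* In the target object: a global placeholder whose BTF types the
     * extension must match (scalar argument only).
     */
    __attribute__ ((noinline))
    int get_constant(long val)
    {
            return val - 122;
    }

    /* In the extension object, loaded as BPF_PROG_TYPE_EXT with
     * (attach_prog_fd, attach_btf_id) resolving to get_constant():
     */
    SEC("freplace/get_constant")
    int new_get_constant(long val)
    {
            return val == 123;
    }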

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 include/linux/bpf.h       |  10 ++-
 include/linux/bpf_types.h |   2 +
 include/linux/btf.h       |   5 ++
 include/uapi/linux/bpf.h  |   1 +
 kernel/bpf/btf.c          | 152 +++++++++++++++++++++++++++++++++++++-
 kernel/bpf/syscall.c      |  15 +++-
 kernel/bpf/trampoline.c   |  41 +++++++++-
 kernel/bpf/verifier.c     |  85 ++++++++++++++++-----
 8 files changed, 283 insertions(+), 28 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 8e3b8f4ad183..05d16615054c 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -465,7 +465,8 @@ void notrace __bpf_prog_exit(struct bpf_prog *prog, u64 start);
 enum bpf_tramp_prog_type {
 	BPF_TRAMP_FENTRY,
 	BPF_TRAMP_FEXIT,
-	BPF_TRAMP_MAX
+	BPF_TRAMP_MAX,
+	BPF_TRAMP_REPLACE, /* more than MAX */
 };
 
 struct bpf_trampoline {
@@ -480,6 +481,11 @@ struct bpf_trampoline {
 		void *addr;
 		bool ftrace_managed;
 	} func;
+	/* if !NULL this is BPF_PROG_TYPE_EXT program that extends another BPF
+	 * program by replacing one of its functions. func.addr is the address
+	 * of the function it replaced.
+	 */
+	struct bpf_prog *extension_prog;
 	/* list of BPF programs using this trampoline */
 	struct hlist_head progs_hlist[BPF_TRAMP_MAX];
 	/* Number of attached programs. A counter per kind. */
@@ -1107,6 +1113,8 @@ int btf_check_func_arg_match(struct bpf_verifier_env *env, int subprog,
 			     struct bpf_reg_state *regs);
 int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog,
 			  struct bpf_reg_state *reg);
+int btf_check_type_match(struct bpf_verifier_env *env, struct bpf_prog *prog,
+			 struct btf *btf, const struct btf_type *t);
 
 struct bpf_prog *bpf_prog_by_id(u32 id);
 
diff --git a/include/linux/bpf_types.h b/include/linux/bpf_types.h
index 9f326e6ef885..c81d4ece79a4 100644
--- a/include/linux/bpf_types.h
+++ b/include/linux/bpf_types.h
@@ -68,6 +68,8 @@ BPF_PROG_TYPE(BPF_PROG_TYPE_SK_REUSEPORT, sk_reuseport,
 #if defined(CONFIG_BPF_JIT)
 BPF_PROG_TYPE(BPF_PROG_TYPE_STRUCT_OPS, bpf_struct_ops,
 	      void *, void *)
+BPF_PROG_TYPE(BPF_PROG_TYPE_EXT, bpf_extension,
+	      void *, void *)
 #endif
 
 BPF_MAP_TYPE(BPF_MAP_TYPE_ARRAY, array_map_ops)
diff --git a/include/linux/btf.h b/include/linux/btf.h
index 881e9b76ef49..5c1ea99b480f 100644
--- a/include/linux/btf.h
+++ b/include/linux/btf.h
@@ -107,6 +107,11 @@ static inline u16 btf_type_vlen(const struct btf_type *t)
 	return BTF_INFO_VLEN(t->info);
 }
 
+static inline u16 btf_func_linkage(const struct btf_type *t)
+{
+	return BTF_INFO_VLEN(t->info);
+}
+
 static inline bool btf_type_kflag(const struct btf_type *t)
 {
 	return BTF_INFO_KFLAG(t->info);
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 033d90a2282d..e81628eb059c 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -180,6 +180,7 @@ enum bpf_prog_type {
 	BPF_PROG_TYPE_CGROUP_SOCKOPT,
 	BPF_PROG_TYPE_TRACING,
 	BPF_PROG_TYPE_STRUCT_OPS,
+	BPF_PROG_TYPE_EXT,
 };
 
 enum bpf_attach_type {
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 832b5d7fd892..32963b6d5a9c 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -276,6 +276,11 @@ static const char * const btf_kind_str[NR_BTF_KINDS] = {
 	[BTF_KIND_DATASEC]	= "DATASEC",
 };
 
+static const char *btf_type_str(const struct btf_type *t)
+{
+	return btf_kind_str[BTF_INFO_KIND(t->info)];
+}
+
 struct btf_kind_operations {
 	s32 (*check_meta)(struct btf_verifier_env *env,
 			  const struct btf_type *t,
@@ -4115,6 +4120,148 @@ int btf_distill_func_proto(struct bpf_verifier_log *log,
 	return 0;
 }
 
+/* Compare BTFs of two functions assuming only scalars and pointers to context.
+ * t1 points to BTF_KIND_FUNC in btf1
+ * t2 points to BTF_KIND_FUNC in btf2
+ * Returns:
+ * EINVAL - function prototype mismatch
+ * EFAULT - verifier bug
+ * 0 - 99% match. The last 1% is validated by the verifier.
+ */
+int btf_check_func_type_match(struct bpf_verifier_log *log,
+			      struct btf *btf1, const struct btf_type *t1,
+			      struct btf *btf2, const struct btf_type *t2)
+{
+	const struct btf_param *args1, *args2;
+	const char *fn1, *fn2, *s1, *s2;
+	u32 nargs1, nargs2, i;
+
+	fn1 = btf_name_by_offset(btf1, t1->name_off);
+	fn2 = btf_name_by_offset(btf2, t2->name_off);
+
+	if (btf_func_linkage(t1) != BTF_FUNC_GLOBAL) {
+		bpf_log(log, "%s() is not a global function\n", fn1);
+		return -EINVAL;
+	}
+	if (btf_func_linkage(t2) != BTF_FUNC_GLOBAL) {
+		bpf_log(log, "%s() is not a global function\n", fn2);
+		return -EINVAL;
+	}
+
+	t1 = btf_type_by_id(btf1, t1->type);
+	if (!t1 || !btf_type_is_func_proto(t1))
+		return -EFAULT;
+	t2 = btf_type_by_id(btf2, t2->type);
+	if (!t2 || !btf_type_is_func_proto(t2))
+		return -EFAULT;
+
+	args1 = (const struct btf_param *)(t1 + 1);
+	nargs1 = btf_type_vlen(t1);
+	args2 = (const struct btf_param *)(t2 + 1);
+	nargs2 = btf_type_vlen(t2);
+
+	if (nargs1 != nargs2) {
+		bpf_log(log, "%s() has %d args while %s() has %d args\n",
+			fn1, nargs1, fn2, nargs2);
+		return -EINVAL;
+	}
+
+	t1 = btf_type_skip_modifiers(btf1, t1->type, NULL);
+	t2 = btf_type_skip_modifiers(btf2, t2->type, NULL);
+	if (t1->info != t2->info) {
+		bpf_log(log,
+			"Return type %s of %s() doesn't match type %s of %s()\n",
+			btf_type_str(t1), fn1,
+			btf_type_str(t2), fn2);
+		return -EINVAL;
+	}
+
+	for (i = 0; i < nargs1; i++) {
+		t1 = btf_type_skip_modifiers(btf1, args1[i].type, NULL);
+		t2 = btf_type_skip_modifiers(btf2, args2[i].type, NULL);
+
+		if (t1->info != t2->info) {
+			bpf_log(log, "arg%d in %s() is %s while %s() has %s\n",
+				i, fn1, btf_type_str(t1),
+				fn2, btf_type_str(t2));
+			return -EINVAL;
+		}
+		if (btf_type_has_size(t1) && t1->size != t2->size) {
+			bpf_log(log,
+				"arg%d in %s() has size %d while %s() has %d\n",
+				i, fn1, t1->size,
+				fn2, t2->size);
+			return -EINVAL;
+		}
+
+		/* global functions are validated with scalars and pointers
+		 * to context only. And only global functions can be replaced.
+		 * Hence type check only those types.
+		 */
+		if (btf_type_is_int(t1) || btf_type_is_enum(t1))
+			continue;
+		if (!btf_type_is_ptr(t1)) {
+			bpf_log(log,
+				"arg%d in %s() has unrecognized type\n",
+				i, fn1);
+			return -EINVAL;
+		}
+		t1 = btf_type_skip_modifiers(btf1, t1->type, NULL);
+		t2 = btf_type_skip_modifiers(btf2, t2->type, NULL);
+		if (!btf_type_is_struct(t1)) {
+			bpf_log(log,
+				"arg%d in %s() is not a pointer to context\n",
+				i, fn1);
+			return -EINVAL;
+		}
+		if (!btf_type_is_struct(t2)) {
+			bpf_log(log,
+				"arg%d in %s() is not a pointer to context\n",
+				i, fn2);
+			return -EINVAL;
+		}
+		/* This is an optional check to make program writing easier.
+		 * Compare names of structs and report an error to the user.
+		 * btf_prepare_func_args() already checked that t2 struct
+		 * is a context type. btf_prepare_func_args() will check
+		 * later that t1 struct is a context type as well.
+		 */
+		s1 = btf_name_by_offset(btf1, t1->name_off);
+		s2 = btf_name_by_offset(btf2, t2->name_off);
+		if (strcmp(s1, s2)) {
+			bpf_log(log,
+				"arg%d %s(struct %s *) doesn't match %s(struct %s *)\n",
+				i, fn1, s1, fn2, s2);
+			return -EINVAL;
+		}
+	}
+	return 0;
+}
+
+/* Compare BTFs of given program with BTF of target program */
+int btf_check_type_match(struct bpf_verifier_env *env, struct bpf_prog *prog,
+			 struct btf *btf2, const struct btf_type *t2)
+{
+	struct btf *btf1 = prog->aux->btf;
+	const struct btf_type *t1;
+	u32 btf_id = 0;
+
+	if (!prog->aux->func_info) {
+		bpf_log(&env->log, "Program extension requires BTF\n");
+		return -EINVAL;
+	}
+
+	btf_id = prog->aux->func_info[0].type_id;
+	if (!btf_id)
+		return -EFAULT;
+
+	t1 = btf_type_by_id(btf1, btf_id);
+	if (!t1 || !btf_type_is_func(t1))
+		return -EFAULT;
+
+	return btf_check_func_type_match(&env->log, btf1, t1, btf2, t2);
+}
+
 /* Compare BTF of a function with given bpf_reg_state.
  * Returns:
  * EFAULT - there is a verifier bug. Abort verification.
@@ -4224,6 +4371,7 @@ int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog,
 {
 	struct bpf_verifier_log *log = &env->log;
 	struct bpf_prog *prog = env->prog;
+	enum bpf_prog_type prog_type = prog->type;
 	struct btf *btf = prog->aux->btf;
 	const struct btf_param *args;
 	const struct btf_type *t;
@@ -4261,6 +4409,8 @@ int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog,
 		bpf_log(log, "Verifier bug in function %s()\n", tname);
 		return -EFAULT;
 	}
+	if (prog_type == BPF_PROG_TYPE_EXT)
+		prog_type = prog->aux->linked_prog->type;
 
 	t = btf_type_by_id(btf, t->type);
 	if (!t || !btf_type_is_func_proto(t)) {
@@ -4296,7 +4446,7 @@ int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog,
 			continue;
 		}
 		if (btf_type_is_ptr(t) &&
-		    btf_get_prog_ctx_type(log, btf, t, prog->type, i)) {
+		    btf_get_prog_ctx_type(log, btf, t, prog_type, i)) {
 			reg[i + 1].type = PTR_TO_CTX;
 			continue;
 		}
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 9a840c57f6df..a91ad518c050 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1932,13 +1932,15 @@ bpf_prog_load_check_attach(enum bpf_prog_type prog_type,
 		switch (prog_type) {
 		case BPF_PROG_TYPE_TRACING:
 		case BPF_PROG_TYPE_STRUCT_OPS:
+		case BPF_PROG_TYPE_EXT:
 			break;
 		default:
 			return -EINVAL;
 		}
 	}
 
-	if (prog_fd && prog_type != BPF_PROG_TYPE_TRACING)
+	if (prog_fd && prog_type != BPF_PROG_TYPE_TRACING &&
+	    prog_type != BPF_PROG_TYPE_EXT)
 		return -EINVAL;
 
 	switch (prog_type) {
@@ -1981,6 +1983,10 @@ bpf_prog_load_check_attach(enum bpf_prog_type prog_type,
 		default:
 			return -EINVAL;
 		}
+	case BPF_PROG_TYPE_EXT:
+		if (expected_attach_type)
+			return -EINVAL;
+		/* fallthrough */
 	default:
 		return 0;
 	}
@@ -2183,7 +2189,8 @@ static int bpf_tracing_prog_attach(struct bpf_prog *prog)
 	int tr_fd, err;
 
 	if (prog->expected_attach_type != BPF_TRACE_FENTRY &&
-	    prog->expected_attach_type != BPF_TRACE_FEXIT) {
+	    prog->expected_attach_type != BPF_TRACE_FEXIT &&
+	    prog->type != BPF_PROG_TYPE_EXT) {
 		err = -EINVAL;
 		goto out_put_prog;
 	}
@@ -2250,12 +2257,14 @@ static int bpf_raw_tracepoint_open(const union bpf_attr *attr)
 
 	if (prog->type != BPF_PROG_TYPE_RAW_TRACEPOINT &&
 	    prog->type != BPF_PROG_TYPE_TRACING &&
+	    prog->type != BPF_PROG_TYPE_EXT &&
 	    prog->type != BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE) {
 		err = -EINVAL;
 		goto out_put_prog;
 	}
 
-	if (prog->type == BPF_PROG_TYPE_TRACING) {
+	if (prog->type == BPF_PROG_TYPE_TRACING ||
+	    prog->type == BPF_PROG_TYPE_EXT) {
 		if (attr->raw_tracepoint.name) {
 			/* The attach point for this category of programs
 			 * should be specified via btf_id during program load.
diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
index 79a04417050d..cbd72cb87c3d 100644
--- a/kernel/bpf/trampoline.c
+++ b/kernel/bpf/trampoline.c
@@ -5,6 +5,12 @@
 #include <linux/filter.h>
 #include <linux/ftrace.h>
 
+/* dummy _ops. The verifier will operate on target program's ops. */
+const struct bpf_verifier_ops bpf_extension_verifier_ops = {
+};
+const struct bpf_prog_ops bpf_extension_prog_ops = {
+};
+
 /* btf_vmlinux has ~22k attachable functions. 1k htab is enough. */
 #define TRAMPOLINE_HASH_BITS 10
 #define TRAMPOLINE_TABLE_SIZE (1 << TRAMPOLINE_HASH_BITS)
@@ -186,8 +192,10 @@ static enum bpf_tramp_prog_type bpf_attach_type_to_tramp(enum bpf_attach_type t)
 	switch (t) {
 	case BPF_TRACE_FENTRY:
 		return BPF_TRAMP_FENTRY;
-	default:
+	case BPF_TRACE_FEXIT:
 		return BPF_TRAMP_FEXIT;
+	default:
+		return BPF_TRAMP_REPLACE;
 	}
 }
 
@@ -196,12 +204,31 @@ int bpf_trampoline_link_prog(struct bpf_prog *prog)
 	enum bpf_tramp_prog_type kind;
 	struct bpf_trampoline *tr;
 	int err = 0;
+	int cnt;
 
 	tr = prog->aux->trampoline;
 	kind = bpf_attach_type_to_tramp(prog->expected_attach_type);
 	mutex_lock(&tr->mutex);
-	if (tr->progs_cnt[BPF_TRAMP_FENTRY] + tr->progs_cnt[BPF_TRAMP_FEXIT]
-	    >= BPF_MAX_TRAMP_PROGS) {
+	if (tr->extension_prog) {
+		/* cannot attach fentry/fexit if extension prog is attached.
+		 * cannot overwrite extension prog either.
+		 */
+		err = -EBUSY;
+		goto out;
+	}
+	cnt = tr->progs_cnt[BPF_TRAMP_FENTRY] + tr->progs_cnt[BPF_TRAMP_FEXIT];
+	if (kind == BPF_TRAMP_REPLACE) {
+		/* Cannot attach extension if fentry/fexit are in use. */
+		if (cnt) {
+			err = -EBUSY;
+			goto out;
+		}
+		tr->extension_prog = prog;
+		err = bpf_arch_text_poke(tr->func.addr, BPF_MOD_JUMP, NULL,
+					 prog->bpf_func);
+		goto out;
+	}
+	if (cnt >= BPF_MAX_TRAMP_PROGS) {
 		err = -E2BIG;
 		goto out;
 	}
@@ -232,9 +259,17 @@ int bpf_trampoline_unlink_prog(struct bpf_prog *prog)
 	tr = prog->aux->trampoline;
 	kind = bpf_attach_type_to_tramp(prog->expected_attach_type);
 	mutex_lock(&tr->mutex);
+	if (kind == BPF_TRAMP_REPLACE) {
+		WARN_ON_ONCE(!tr->extension_prog);
+		err = bpf_arch_text_poke(tr->func.addr, BPF_MOD_JUMP,
+					 tr->extension_prog->bpf_func, NULL);
+		tr->extension_prog = NULL;
+		goto out;
+	}
 	hlist_del(&prog->aux->tramp_hlist);
 	tr->progs_cnt[kind]--;
 	err = bpf_trampoline_update(prog->aux->trampoline);
+out:
 	mutex_unlock(&tr->mutex);
 	return err;
 }
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index ca17dccc17ba..62f2d432f74c 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -9564,7 +9564,7 @@ static int do_check_common(struct bpf_verifier_env *env, int subprog)
 			subprog);
 
 	regs = state->frame[state->curframe]->regs;
-	if (subprog) {
+	if (subprog || env->prog->type == BPF_PROG_TYPE_EXT) {
 		ret = btf_prepare_func_args(env, subprog, regs);
 		if (ret)
 			goto out;
@@ -9737,6 +9737,7 @@ static int check_struct_ops_btf_id(struct bpf_verifier_env *env)
 static int check_attach_btf_id(struct bpf_verifier_env *env)
 {
 	struct bpf_prog *prog = env->prog;
+	bool prog_extension = prog->type == BPF_PROG_TYPE_EXT;
 	struct bpf_prog *tgt_prog = prog->aux->linked_prog;
 	u32 btf_id = prog->aux->attach_btf_id;
 	const char prefix[] = "btf_trace_";
@@ -9752,7 +9753,7 @@ static int check_attach_btf_id(struct bpf_verifier_env *env)
 	if (prog->type == BPF_PROG_TYPE_STRUCT_OPS)
 		return check_struct_ops_btf_id(env);
 
-	if (prog->type != BPF_PROG_TYPE_TRACING)
+	if (prog->type != BPF_PROG_TYPE_TRACING && !prog_extension)
 		return 0;
 
 	if (!btf_id) {
@@ -9788,8 +9789,59 @@ static int check_attach_btf_id(struct bpf_verifier_env *env)
 			return -EINVAL;
 		}
 		conservative = aux->func_info_aux[subprog].unreliable;
+		if (prog_extension) {
+			if (conservative) {
+				verbose(env,
+					"Cannot replace static functions\n");
+				return -EINVAL;
+			}
+			if (!prog->jit_requested) {
+				verbose(env,
+					"Extension programs should be JITed\n");
+				return -EINVAL;
+			}
+			env->ops = bpf_verifier_ops[tgt_prog->type];
+		}
+		if (!tgt_prog->jited) {
+			verbose(env, "Can attach to only JITed progs\n");
+			return -EINVAL;
+		}
+		if (tgt_prog->type == prog->type) {
+			/* Cannot fentry/fexit another fentry/fexit program.
+			 * Cannot attach program extension to another extension.
+			 * It's ok to attach fentry/fexit to extension program.
+			 */
+			verbose(env, "Cannot recursively attach\n");
+			return -EINVAL;
+		}
+		if (tgt_prog->type == BPF_PROG_TYPE_TRACING &&
+		    prog_extension &&
+		    (tgt_prog->expected_attach_type == BPF_TRACE_FENTRY ||
+		     tgt_prog->expected_attach_type == BPF_TRACE_FEXIT)) {
+			/* Program extensions can extend all program types
+			 * except fentry/fexit. The reason is the following.
+			 * The fentry/fexit programs are used for performance
+			 * analysis, stats and can be attached to any program
+			 * type except themselves. When extension program is
+			 * replacing XDP function it is necessary to allow
+			 * performance analysis of all functions. Both original
+			 * XDP program and its program extension. Hence
+			 * attaching fentry/fexit to BPF_PROG_TYPE_EXT is
+			 * allowed. If extending of fentry/fexit was allowed it
+			 * would be possible to create long call chain
+			 * fentry->extension->fentry->extension beyond
+			 * reasonable stack size. Hence extending fentry is not
+			 * allowed.
+			 */
+			verbose(env, "Cannot extend fentry/fexit\n");
+			return -EINVAL;
+		}
 		key = ((u64)aux->id) << 32 | btf_id;
 	} else {
+		if (prog_extension) {
+			verbose(env, "Cannot replace kernel functions\n");
+			return -EINVAL;
+		}
 		key = btf_id;
 	}
 
@@ -9827,6 +9879,10 @@ static int check_attach_btf_id(struct bpf_verifier_env *env)
 		prog->aux->attach_func_proto = t;
 		prog->aux->attach_btf_trace = true;
 		return 0;
+	default:
+		if (!prog_extension)
+			return -EINVAL;
+		/* fallthrough */
 	case BPF_TRACE_FENTRY:
 	case BPF_TRACE_FEXIT:
 		if (!btf_type_is_func(t)) {
@@ -9834,6 +9890,9 @@ static int check_attach_btf_id(struct bpf_verifier_env *env)
 				btf_id);
 			return -EINVAL;
 		}
+		if (prog_extension &&
+		    btf_check_type_match(env, prog, btf, t))
+			return -EINVAL;
 		t = btf_type_by_id(btf, t->type);
 		if (!btf_type_is_func_proto(t))
 			return -EINVAL;
@@ -9857,18 +9916,6 @@ static int check_attach_btf_id(struct bpf_verifier_env *env)
 		if (ret < 0)
 			goto out;
 		if (tgt_prog) {
-			if (!tgt_prog->jited) {
-				/* for now */
-				verbose(env, "Can trace only JITed BPF progs\n");
-				ret = -EINVAL;
-				goto out;
-			}
-			if (tgt_prog->type == BPF_PROG_TYPE_TRACING) {
-				/* prevent cycles */
-				verbose(env, "Cannot recursively attach\n");
-				ret = -EINVAL;
-				goto out;
-			}
 			if (subprog == 0)
 				addr = (long) tgt_prog->bpf_func;
 			else
@@ -9890,8 +9937,6 @@ static int check_attach_btf_id(struct bpf_verifier_env *env)
 		if (ret)
 			bpf_trampoline_put(tr);
 		return ret;
-	default:
-		return -EINVAL;
 	}
 }
 
@@ -9961,10 +10006,6 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr,
 		goto skip_full_check;
 	}
 
-	ret = check_attach_btf_id(env);
-	if (ret)
-		goto skip_full_check;
-
 	env->strict_alignment = !!(attr->prog_flags & BPF_F_STRICT_ALIGNMENT);
 	if (!IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS))
 		env->strict_alignment = true;
@@ -10001,6 +10042,10 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr,
 	if (ret < 0)
 		goto skip_full_check;
 
+	ret = check_attach_btf_id(env);
+	if (ret)
+		goto skip_full_check;
+
 	ret = check_cfg(env);
 	if (ret < 0)
 		goto skip_full_check;
-- 
2.23.0



* [PATCH v2 bpf-next 2/3] libbpf: Add support for program extensions
  2020-01-21  0:53 [PATCH v2 bpf-next 0/3] bpf: Program extensions or dynamic re-linking Alexei Starovoitov
  2020-01-21  0:53 ` [PATCH v2 bpf-next 1/3] bpf: Introduce dynamic program extensions Alexei Starovoitov
@ 2020-01-21  0:53 ` Alexei Starovoitov
  2020-01-21 18:38   ` John Fastabend
  2020-01-21  0:53 ` [PATCH v2 bpf-next 3/3] selftests/bpf: Add tests " Alexei Starovoitov
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 17+ messages in thread
From: Alexei Starovoitov @ 2020-01-21  0:53 UTC (permalink / raw)
  To: davem; +Cc: daniel, netdev, bpf, kernel-team

Add minimal support for program extensions. bpf_object_open_opts needs to be
passed with attach_prog_fd = target_prog_fd, and the BPF program extension
needs to have a section definition in its .c file like
SEC("freplace/func_to_be_replaced"). libbpf will search for
"func_to_be_replaced" in the target_prog_fd's BTF and will pass it as
attach_btf_id to the kernel. This approach works for tests, but a more complex
use case may need the function name (and the attach_btf_id that the kernel
sees) to be more dynamic. Such an API will be added in future patches.
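
A minimal user-space sketch of that flow (the object path is a placeholder and
error handling is trimmed; target_prog_fd is an fd of the already loaded
program whose global function will be replaced):

    #include <bpf/libbpf.h>

    static int load_freplace(int target_prog_fd)
    {
            DECLARE_LIBBPF_OPTS(bpf_object_open_opts, opts,
                                .attach_prog_fd = target_prog_fd);
            struct bpf_object *obj;
            struct bpf_program *prog;
            struct bpf_link *link;

            obj = bpf_object__open_file("freplace_prog.o", &opts);
            if (libbpf_get_error(obj))
                    return -1;
            if (bpf_object__load(obj))
                    return -1;
            prog = bpf_object__find_program_by_title(obj,
                            "freplace/func_to_be_replaced");
            if (!prog)
                    return -1;
            /* freplace reuses the fentry/fexit attach path */
            link = bpf_program__attach_trace(prog);
            return libbpf_get_error(link) ? -1 : 0;
    }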

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 tools/include/uapi/linux/bpf.h |  1 +
 tools/lib/bpf/bpf.c            |  3 ++-
 tools/lib/bpf/libbpf.c         | 13 ++++++++++---
 tools/lib/bpf/libbpf.h         |  2 ++
 tools/lib/bpf/libbpf.map       |  2 ++
 tools/lib/bpf/libbpf_probes.c  |  1 +
 6 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 033d90a2282d..e81628eb059c 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -180,6 +180,7 @@ enum bpf_prog_type {
 	BPF_PROG_TYPE_CGROUP_SOCKOPT,
 	BPF_PROG_TYPE_TRACING,
 	BPF_PROG_TYPE_STRUCT_OPS,
+	BPF_PROG_TYPE_EXT,
 };
 
 enum bpf_attach_type {
diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index ed42b006533c..c6dafe563176 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -237,7 +237,8 @@ int bpf_load_program_xattr(const struct bpf_load_program_attr *load_attr,
 	attr.expected_attach_type = load_attr->expected_attach_type;
 	if (attr.prog_type == BPF_PROG_TYPE_STRUCT_OPS) {
 		attr.attach_btf_id = load_attr->attach_btf_id;
-	} else if (attr.prog_type == BPF_PROG_TYPE_TRACING) {
+	} else if (attr.prog_type == BPF_PROG_TYPE_TRACING ||
+		   attr.prog_type == BPF_PROG_TYPE_EXT) {
 		attr.attach_btf_id = load_attr->attach_btf_id;
 		attr.attach_prog_fd = load_attr->attach_prog_fd;
 	} else {
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index faab96a42141..ae34b681ae82 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -4837,7 +4837,8 @@ load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt,
 	load_attr.license = license;
 	if (prog->type == BPF_PROG_TYPE_STRUCT_OPS) {
 		load_attr.attach_btf_id = prog->attach_btf_id;
-	} else if (prog->type == BPF_PROG_TYPE_TRACING) {
+	} else if (prog->type == BPF_PROG_TYPE_TRACING ||
+		   prog->type == BPF_PROG_TYPE_EXT) {
 		load_attr.attach_prog_fd = prog->attach_prog_fd;
 		load_attr.attach_btf_id = prog->attach_btf_id;
 	} else {
@@ -4918,7 +4919,8 @@ int bpf_program__load(struct bpf_program *prog, char *license, __u32 kern_ver)
 {
 	int err = 0, fd, i, btf_id;
 
-	if (prog->type == BPF_PROG_TYPE_TRACING) {
+	if (prog->type == BPF_PROG_TYPE_TRACING ||
+	    prog->type == BPF_PROG_TYPE_EXT) {
 		btf_id = libbpf_find_attach_btf_id(prog);
 		if (btf_id <= 0)
 			return btf_id;
@@ -5092,7 +5094,8 @@ __bpf_object__open(const char *path, const void *obj_buf, size_t obj_buf_sz,
 
 		bpf_program__set_type(prog, prog_type);
 		bpf_program__set_expected_attach_type(prog, attach_type);
-		if (prog_type == BPF_PROG_TYPE_TRACING)
+		if (prog_type == BPF_PROG_TYPE_TRACING ||
+		    prog_type == BPF_PROG_TYPE_EXT)
 			prog->attach_prog_fd = OPTS_GET(opts, attach_prog_fd, 0);
 	}
 
@@ -6166,6 +6169,7 @@ BPF_PROG_TYPE_FNS(xdp, BPF_PROG_TYPE_XDP);
 BPF_PROG_TYPE_FNS(perf_event, BPF_PROG_TYPE_PERF_EVENT);
 BPF_PROG_TYPE_FNS(tracing, BPF_PROG_TYPE_TRACING);
 BPF_PROG_TYPE_FNS(struct_ops, BPF_PROG_TYPE_STRUCT_OPS);
+BPF_PROG_TYPE_FNS(extension, BPF_PROG_TYPE_EXT);
 
 enum bpf_attach_type
 bpf_program__get_expected_attach_type(struct bpf_program *prog)
@@ -6265,6 +6269,9 @@ static const struct bpf_sec_def section_defs[] = {
 		.expected_attach_type = BPF_TRACE_FEXIT,
 		.is_attach_btf = true,
 		.attach_fn = attach_trace),
+	SEC_DEF("freplace/", EXT,
+		.is_attach_btf = true,
+		.attach_fn = attach_trace),
 	BPF_PROG_SEC("xdp",			BPF_PROG_TYPE_XDP),
 	BPF_PROG_SEC("perf_event",		BPF_PROG_TYPE_PERF_EVENT),
 	BPF_PROG_SEC("lwt_in",			BPF_PROG_TYPE_LWT_IN),
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index 01639f9a1062..2a5e3b087002 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -318,6 +318,7 @@ LIBBPF_API int bpf_program__set_xdp(struct bpf_program *prog);
 LIBBPF_API int bpf_program__set_perf_event(struct bpf_program *prog);
 LIBBPF_API int bpf_program__set_tracing(struct bpf_program *prog);
 LIBBPF_API int bpf_program__set_struct_ops(struct bpf_program *prog);
+LIBBPF_API int bpf_program__set_extension(struct bpf_program *prog);
 
 LIBBPF_API enum bpf_prog_type bpf_program__get_type(struct bpf_program *prog);
 LIBBPF_API void bpf_program__set_type(struct bpf_program *prog,
@@ -339,6 +340,7 @@ LIBBPF_API bool bpf_program__is_xdp(const struct bpf_program *prog);
 LIBBPF_API bool bpf_program__is_perf_event(const struct bpf_program *prog);
 LIBBPF_API bool bpf_program__is_tracing(const struct bpf_program *prog);
 LIBBPF_API bool bpf_program__is_struct_ops(const struct bpf_program *prog);
+LIBBPF_API bool bpf_program__is_extension(const struct bpf_program *prog);
 
 /*
  * No need for __attribute__((packed)), all members of 'bpf_map_def'
diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
index 64ec71ba41f1..b035122142bb 100644
--- a/tools/lib/bpf/libbpf.map
+++ b/tools/lib/bpf/libbpf.map
@@ -228,7 +228,9 @@ LIBBPF_0.0.7 {
 		bpf_prog_attach_xattr;
 		bpf_program__attach;
 		bpf_program__name;
+		bpf_program__is_extension;
 		bpf_program__is_struct_ops;
+		bpf_program__set_extension;
 		bpf_program__set_struct_ops;
 		btf__align_of;
 		libbpf_find_kernel_btf;
diff --git a/tools/lib/bpf/libbpf_probes.c b/tools/lib/bpf/libbpf_probes.c
index 8cc992bc532a..b782ebef6ac9 100644
--- a/tools/lib/bpf/libbpf_probes.c
+++ b/tools/lib/bpf/libbpf_probes.c
@@ -107,6 +107,7 @@ probe_load(enum bpf_prog_type prog_type, const struct bpf_insn *insns,
 	case BPF_PROG_TYPE_CGROUP_SOCKOPT:
 	case BPF_PROG_TYPE_TRACING:
 	case BPF_PROG_TYPE_STRUCT_OPS:
+	case BPF_PROG_TYPE_EXT:
 	default:
 		break;
 	}
-- 
2.23.0



* [PATCH v2 bpf-next 3/3] selftests/bpf: Add tests for program extensions
  2020-01-21  0:53 [PATCH v2 bpf-next 0/3] bpf: Program extensions or dynamic re-linking Alexei Starovoitov
  2020-01-21  0:53 ` [PATCH v2 bpf-next 1/3] bpf: Introduce dynamic program extensions Alexei Starovoitov
  2020-01-21  0:53 ` [PATCH v2 bpf-next 2/3] libbpf: Add support for " Alexei Starovoitov
@ 2020-01-21  0:53 ` Alexei Starovoitov
  2020-01-21 18:41   ` John Fastabend
  2020-01-21 15:37 ` [PATCH v2 bpf-next 0/3] bpf: Program extensions or dynamic re-linking Toke Høiland-Jørgensen
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 17+ messages in thread
From: Alexei Starovoitov @ 2020-01-21  0:53 UTC (permalink / raw)
  To: davem; +Cc: daniel, netdev, bpf, kernel-team

Add program extension tests that build on top of the fexit_bpf2bpf tests.
Replace three global functions in the previously loaded test_pkt_access.c
program with three new implementations:
int get_skb_len(struct __sk_buff *skb);
int get_constant(long val);
int get_skb_ifindex(int val, struct __sk_buff *skb, int var);
The new functions return the same results as the originals only if the
arguments match.

new_get_skb_ifindex() demonstrates that the 'skb' argument doesn't have to be
the first and only argument of a BPF program. All normal skb-based accesses are
available.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 .../selftests/bpf/prog_tests/fexit_bpf2bpf.c  | 20 ++++++-
 .../selftests/bpf/progs/fexit_bpf2bpf.c       | 57 +++++++++++++++++++
 .../selftests/bpf/progs/test_pkt_access.c     |  8 ++-
 3 files changed, 83 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/fexit_bpf2bpf.c b/tools/testing/selftests/bpf/prog_tests/fexit_bpf2bpf.c
index 7d3740d38965..db5c74d2ce6d 100644
--- a/tools/testing/selftests/bpf/prog_tests/fexit_bpf2bpf.c
+++ b/tools/testing/selftests/bpf/prog_tests/fexit_bpf2bpf.c
@@ -26,7 +26,7 @@ static void test_fexit_bpf2bpf_common(const char *obj_file,
 
 	link = calloc(sizeof(struct bpf_link *), prog_cnt);
 	prog = calloc(sizeof(struct bpf_program *), prog_cnt);
-	result = malloc(prog_cnt * sizeof(u64));
+	result = malloc((prog_cnt + 32 /* spare */) * sizeof(u64));
 	if (CHECK(!link || !prog || !result, "alloc_memory",
 		  "failed to alloc memory"))
 		goto close_prog;
@@ -106,8 +106,26 @@ static void test_target_yes_callees(void)
 				  prog_name);
 }
 
+static void test_func_replace(void)
+{
+	const char *prog_name[] = {
+		"fexit/test_pkt_access",
+		"fexit/test_pkt_access_subprog1",
+		"fexit/test_pkt_access_subprog2",
+		"fexit/test_pkt_access_subprog3",
+		"freplace/get_skb_len",
+		"freplace/get_skb_ifindex",
+		"freplace/get_constant",
+	};
+	test_fexit_bpf2bpf_common("./fexit_bpf2bpf.o",
+				  "./test_pkt_access.o",
+				  ARRAY_SIZE(prog_name),
+				  prog_name);
+}
+
 void test_fexit_bpf2bpf(void)
 {
 	test_target_no_callees();
 	test_target_yes_callees();
+	test_func_replace();
 }
diff --git a/tools/testing/selftests/bpf/progs/fexit_bpf2bpf.c b/tools/testing/selftests/bpf/progs/fexit_bpf2bpf.c
index ec5710767d13..c329fccf9842 100644
--- a/tools/testing/selftests/bpf/progs/fexit_bpf2bpf.c
+++ b/tools/testing/selftests/bpf/progs/fexit_bpf2bpf.c
@@ -1,7 +1,10 @@
 // SPDX-License-Identifier: GPL-2.0
 /* Copyright (c) 2019 Facebook */
+#include <linux/stddef.h>
+#include <linux/ipv6.h>
 #include <linux/bpf.h>
 #include <bpf/bpf_helpers.h>
+#include <bpf/bpf_endian.h>
 #include "bpf_trace_helpers.h"
 
 struct sk_buff {
@@ -94,4 +97,58 @@ int BPF_PROG(test_subprog3, int val, struct sk_buff *skb, int ret)
 	test_result_subprog3 = 1;
 	return 0;
 }
+
+__u64 test_get_skb_len = 0;
+SEC("freplace/get_skb_len")
+int new_get_skb_len(struct __sk_buff *skb)
+{
+	int len = skb->len;
+
+	if (len != 74)
+		return 0;
+	test_get_skb_len = 1;
+	return 74; /* original get_skb_len() returns skb->len */
+}
+
+__u64 test_get_skb_ifindex = 0;
+SEC("freplace/get_skb_ifindex")
+int new_get_skb_ifindex(int val, struct __sk_buff *skb, int var)
+{
+	void *data_end = (void *)(long)skb->data_end;
+	void *data = (void *)(long)skb->data;
+	struct ipv6hdr ip6, *ip6p;
+	int ifindex = skb->ifindex;
+	__u32 eth_proto;
+	__u32 nh_off;
+
+	/* check that BPF extension can read packet via direct packet access */
+	if (data + 14 + sizeof(ip6) > data_end)
+		return 0;
+	ip6p = data + 14;
+
+	if (ip6p->nexthdr != 6 || ip6p->payload_len != __bpf_constant_htons(123))
+		return 0;
+
+	/* check that legacy packet access helper works too */
+	if (bpf_skb_load_bytes(skb, 14, &ip6, sizeof(ip6)) < 0)
+		return 0;
+	ip6p = &ip6;
+	if (ip6p->nexthdr != 6 || ip6p->payload_len != __bpf_constant_htons(123))
+		return 0;
+
+	if (ifindex != 1 || val != 3 || var != 1)
+		return 0;
+	test_get_skb_ifindex = 1;
+	return 3; /* original get_skb_ifindex() returns val * ifindex * var */
+}
+
+volatile __u64 test_get_constant = 0;
+SEC("freplace/get_constant")
+int new_get_constant(long val)
+{
+	if (val != 123)
+		return 0;
+	test_get_constant = 1;
+	return test_get_constant; /* original get_constant() returns val - 122 */
+}
 char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/progs/test_pkt_access.c b/tools/testing/selftests/bpf/progs/test_pkt_access.c
index 7c9fcfd2b463..e72eba4a93d2 100644
--- a/tools/testing/selftests/bpf/progs/test_pkt_access.c
+++ b/tools/testing/selftests/bpf/progs/test_pkt_access.c
@@ -57,12 +57,18 @@ int get_skb_len(struct __sk_buff *skb)
 	return skb->len;
 }
 
+__attribute__ ((noinline))
+int get_constant(long val)
+{
+	return val - 122;
+}
+
 int get_skb_ifindex(int, struct __sk_buff *skb, int);
 
 __attribute__ ((noinline))
 int test_pkt_access_subprog3(int val, struct __sk_buff *skb)
 {
-	return get_skb_len(skb) * get_skb_ifindex(val, skb, 1);
+	return get_skb_len(skb) * get_skb_ifindex(val, skb, get_constant(123));
 }
 
 __attribute__ ((noinline))
-- 
2.23.0



* RE: [PATCH v2 bpf-next 1/3] bpf: Introduce dynamic program extensions
  2020-01-21  0:53 ` [PATCH v2 bpf-next 1/3] bpf: Introduce dynamic program extensions Alexei Starovoitov
@ 2020-01-21  7:36   ` John Fastabend
  2020-01-21 16:00     ` Alexei Starovoitov
  0 siblings, 1 reply; 17+ messages in thread
From: John Fastabend @ 2020-01-21  7:36 UTC (permalink / raw)
  To: Alexei Starovoitov, davem; +Cc: daniel, netdev, bpf, kernel-team

Alexei Starovoitov wrote:
> Introduce dynamic program extensions. Users can load additional BPF functions
> and replace global functions in previously loaded BPF programs while these
> programs are executing.
> 
> Global functions are verified individually by the verifier based on their
> types only. Hence a global function in the new program whose types match an
> older function can safely replace that corresponding function.
> 
> This new function/program is called 'an extension' of the old program. At load
> time the verifier uses the (attach_prog_fd, attach_btf_id) pair to identify the
> function to be replaced. The extension program's BPF program type is derived
> from the target program; technically bpf_verifier_ops is copied from the target
> program. The BPF_PROG_TYPE_EXT program type is a placeholder: it has empty
> verifier_ops. The extension program can call the same bpf helper functions as
> the target program. A single BPF_PROG_TYPE_EXT type is used to extend XDP, SKB
> and all other program types. The verifier allows only one level of replacement,
> meaning that an extension program cannot recursively extend an extension. That
> also means that the maximum stack size increases from 512 to 1024 bytes and the
> maximum function nesting level from 8 to 16. The programs don't always consume
> that much. The stack usage is determined by the number of on-stack variables
> used by the program. The verifier could have enforced the 512 limit for the
> combined original plus extension program, but that would make for a difficult
> user experience. The main use case for extensions is to provide a generic
> mechanism to plug external programs into a policy program or function call
> chaining.
> 
> A BPF trampoline is used to track both fentry/fexit and program extensions
> because both use the same nop slot at the beginning of every BPF function.
> Attaching fentry/fexit to a function that was replaced is not allowed. The
> opposite is true as well: replacing a function that is currently being analyzed
> with fentry/fexit is not allowed. The executable page allocated by the BPF
> trampoline is not used by program extensions. This inefficiency will be
> optimized in future patches.
> 
> Function-by-function verification of global functions supports scalars and
> pointers to context only. Hence program extensions are supported only for that
> class of global functions. In the future the verifier will be extended with
> support for pointers to structures, arrays with sizes, etc.
> 
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> ---

[...]

> +
> +	t1 = btf_type_skip_modifiers(btf1, t1->type, NULL);
> +	t2 = btf_type_skip_modifiers(btf2, t2->type, NULL);

Is it really best to skip modifiers? I would expect that if the
signature is different including modifiers then we should just reject it.
OTOH it's not really C code here either so modifiers may not have the same
meaning. With just integers and struct it may be ok but if we add pointers
to ints then what would we expect from a const int*?

So what's the reasoning for skipping modifiers? Is it purely an argument
that it's not required for safety, so solve it elsewhere? In that case then
checking names of functions is also equally not required.

Otherwise LGTM.


> +	if (t1->info != t2->info) {
> +		bpf_log(log,
> +			"Return type %s of %s() doesn't match type %s of %s()\n",
> +			btf_type_str(t1), fn1,
> +			btf_type_str(t2), fn2);
> +		return -EINVAL;
> +	}
> +
> +	for (i = 0; i < nargs1; i++) {
> +		t1 = btf_type_skip_modifiers(btf1, args1[i].type, NULL);
> +		t2 = btf_type_skip_modifiers(btf2, args2[i].type, NULL);
> +
> +		if (t1->info != t2->info) {
> +			bpf_log(log, "arg%d in %s() is %s while %s() has %s\n",
> +				i, fn1, btf_type_str(t1),
> +				fn2, btf_type_str(t2));
> +			return -EINVAL;
> +		}
> +		if (btf_type_has_size(t1) && t1->size != t2->size) {
> +			bpf_log(log,
> +				"arg%d in %s() has size %d while %s() has %d\n",
> +				i, fn1, t1->size,
> +				fn2, t2->size);
> +			return -EINVAL;
> +		}
> +
> +		/* global functions are validated with scalars and pointers
> +		 * to context only. And only global functions can be replaced.
> +		 * Hence type check only those types.
> +		 */
> +		if (btf_type_is_int(t1) || btf_type_is_enum(t1))
> +			continue;
> +		if (!btf_type_is_ptr(t1)) {
> +			bpf_log(log,
> +				"arg%d in %s() has unrecognized type\n",
> +				i, fn1);
> +			return -EINVAL;
> +		}
> +		t1 = btf_type_skip_modifiers(btf1, t1->type, NULL);
> +		t2 = btf_type_skip_modifiers(btf2, t2->type, NULL);
> +		if (!btf_type_is_struct(t1)) {
> +			bpf_log(log,
> +				"arg%d in %s() is not a pointer to context\n",
> +				i, fn1);
> +			return -EINVAL;
> +		}
> +		if (!btf_type_is_struct(t2)) {
> +			bpf_log(log,
> +				"arg%d in %s() is not a pointer to context\n",
> +				i, fn2);
> +			return -EINVAL;
> +		}
> +		/* This is an optional check to make program writing easier.
> +		 * Compare names of structs and report an error to the user.
> +		 * btf_prepare_func_args() already checked that t2 struct
> +		 * is a context type. btf_prepare_func_args() will check
> +		 * later that t1 struct is a context type as well.
> +		 */
> +		s1 = btf_name_by_offset(btf1, t1->name_off);
> +		s2 = btf_name_by_offset(btf2, t2->name_off);
> +		if (strcmp(s1, s2)) {
> +			bpf_log(log,
> +				"arg%d %s(struct %s *) doesn't match %s(struct %s *)\n",
> +				i, fn1, s1, fn2, s2);
> +			return -EINVAL;
> +		}
> +	}
> +	return 0;
> +}


* Re: [PATCH v2 bpf-next 0/3] bpf: Program extensions or dynamic re-linking
  2020-01-21  0:53 [PATCH v2 bpf-next 0/3] bpf: Program extensions or dynamic re-linking Alexei Starovoitov
                   ` (2 preceding siblings ...)
  2020-01-21  0:53 ` [PATCH v2 bpf-next 3/3] selftests/bpf: Add tests " Alexei Starovoitov
@ 2020-01-21 15:37 ` Toke Høiland-Jørgensen
  2020-01-21 17:00   ` Alexei Starovoitov
  2020-01-21 18:21 ` Andrii Nakryiko
  2020-01-22 22:12 ` Daniel Borkmann
  5 siblings, 1 reply; 17+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-01-21 15:37 UTC (permalink / raw)
  To: Alexei Starovoitov, davem; +Cc: daniel, netdev, bpf, kernel-team

Alexei Starovoitov <ast@kernel.org> writes:

> For the last few months the BPF community has been discussing an approach to
> call chaining, since the existing bpf_tail_call() mechanism used in production
> XDP programs has plenty of downsides. The outcome of these discussions was a
> decision to implement dynamic re-linking of BPF programs, where a rootlet XDP
> program attached to a netdevice can programmatically define a policy of
> execution of other XDP programs. Such a rootlet would be compiled as a normal
> XDP program and provide a number of placeholder global functions which can
> later be replaced with future XDP programs. BPF trampoline and
> function-by-function verification were building blocks towards that goal.
> Patch 1 is the final building block. It introduces dynamic program extensions.
> A number of improvements, like more flexible function-by-function verification
> and a better libbpf API, will be implemented in future patches.

This is great, thank you! I'll go play around with it; couldn't spot
anything obvious from eye-balling the code, except that yeah, it does
need a more flexible libbpf api :)

One thing that's not obvious to me: How can userspace tell which
programs replace which functions after they are loaded? Is this put into
prog_tags in struct bpf_prog_info, or?


For the series:
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>


-Toke



* Re: [PATCH v2 bpf-next 1/3] bpf: Introduce dynamic program extensions
  2020-01-21  7:36   ` John Fastabend
@ 2020-01-21 16:00     ` Alexei Starovoitov
  2020-01-21 18:15       ` John Fastabend
  0 siblings, 1 reply; 17+ messages in thread
From: Alexei Starovoitov @ 2020-01-21 16:00 UTC (permalink / raw)
  To: John Fastabend
  Cc: Alexei Starovoitov, davem, daniel, netdev, bpf, kernel-team

On Mon, Jan 20, 2020 at 11:36:43PM -0800, John Fastabend wrote:
> 
> > +
> > +	t1 = btf_type_skip_modifiers(btf1, t1->type, NULL);
> > +	t2 = btf_type_skip_modifiers(btf2, t2->type, NULL);
> 
> Is it really best to skip modifiers? I would expect that if the
> signature is different including modifiers then we should just reject it.
> > OTOH it's not really C code here either so modifiers may not have the same
> meaning. With just integers and struct it may be ok but if we add pointers
> to ints then what would we expect from a const int*?
> 
> > So what's the reasoning for skipping modifiers? Is it purely an argument
> > that it's not required for safety, so solve it elsewhere? In that case then
> checking names of functions is also equally not required.

Function names are not checked by the kernel. It's purely libbpf and bpf_prog.c
convention. The kernel operates on prog_fd+btf_id only. The names of function
arguments are not compared either.

The code has to skip modifiers. Otherwise the type comparison algorithm will be
quite complex, since typedef is such a modifier. For example, 'u32' in the
original program and 'u32' in the extension program would have to be
recursively checked.

Another reason to skip modifiers is the 'volatile' modifier. I suspect we would
have to use it from time to time in the original placeholder functions, yet the
new replacement function will be written without volatile. The placeholder may
need volatile to make sure the compiler doesn't optimize things away. I found
cases where 'noinline' in the placeholder was not enough: clang would still
inline the body of the function and remove the call instruction. So far I've
been using volatile as a workaround. Maybe we will introduce a new function
attribute to clang.
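
To illustrate (a made-up placeholder, not from the patches; the usual XDP
includes are assumed), the current workaround looks roughly like:

    /* noinline alone is sometimes not enough; the volatile keeps clang
     * from folding the placeholder body into the caller and dropping the
     * call instruction that the extension needs to hook.
     */
    __attribute__ ((noinline))
    int placeholder(struct xdp_md *ctx)
    {
            volatile int ret = XDP_PASS;

            return ret;
    }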

Having said that, I share your concern regarding skipping 'const'. For 'const
int arg' it's totally ok to skip it, since it's meaningless from a safety pov,
but for 'const int *arg' and 'const struct foo *arg' I'm planning to preserve
it. It will be preserved at the verifier bpf_reg_state level though. Just
checking that 'const' is present in the extension prog's BTF doesn't help
safety. I'm planning to make the verifier enforce that a bpf prog cannot write
into an argument whose type is a pointer to const struct. That part is still
wip. It will be implemented for global functions first and then for extension
programs. Currently the verifier rejects any pointer to struct (other than
context), so there are no backward compatibility issues.


* Re: [PATCH v2 bpf-next 0/3] bpf: Program extensions or dynamic re-linking
  2020-01-21 15:37 ` [PATCH v2 bpf-next 0/3] bpf: Program extensions or dynamic re-linking Toke Høiland-Jørgensen
@ 2020-01-21 17:00   ` Alexei Starovoitov
  2020-01-22 10:45     ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 17+ messages in thread
From: Alexei Starovoitov @ 2020-01-21 17:00 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: Alexei Starovoitov, davem, daniel, netdev, bpf, kernel-team

On Tue, Jan 21, 2020 at 04:37:31PM +0100, Toke Høiland-Jørgensen wrote:
> Alexei Starovoitov <ast@kernel.org> writes:
> 
> > For the last few months the BPF community has been discussing an approach to
> > call chaining, since the existing bpf_tail_call() mechanism used in production
> > XDP programs has plenty of downsides. The outcome of these discussions was a
> > decision to implement dynamic re-linking of BPF programs, where a rootlet XDP
> > program attached to a netdevice can programmatically define a policy of
> > execution of other XDP programs. Such a rootlet would be compiled as a normal
> > XDP program and provide a number of placeholder global functions which can
> > later be replaced with future XDP programs. BPF trampoline and
> > function-by-function verification were building blocks towards that goal.
> > Patch 1 is the final building block. It introduces dynamic program extensions.
> > A number of improvements, like more flexible function-by-function verification
> > and a better libbpf API, will be implemented in future patches.
> 
> This is great, thank you! I'll go play around with it; couldn't spot
> anything obvious from eye-balling the code, except that yeah, it does
> need a more flexible libbpf api :)
> 
> One thing that's not obvious to me: How can userspace tell which
> programs replace which functions after they are loaded? Is this put into
> prog_tags in struct bpf_prog_info, or?

Good point. It would be good to extend bpf_prog_info. Since the prog-to-prog
connection is unidirectional, the bpf_prog_info of the extension prog will be
able to say which original program it's replacing. bpftool prog show will be
able to print all this data. I think fentry/fexit progs would need the same
bpf_prog_info extension. attach_prog_id + attach_btf_id would be enough.
In the meantime I can try to hack up a drgn script to do the same.


* Re: [PATCH v2 bpf-next 1/3] bpf: Introduce dynamic program extensions
  2020-01-21 16:00     ` Alexei Starovoitov
@ 2020-01-21 18:15       ` John Fastabend
  2020-01-21 19:08         ` Alexei Starovoitov
  2020-01-21 21:13         ` Yonghong Song
  0 siblings, 2 replies; 17+ messages in thread
From: John Fastabend @ 2020-01-21 18:15 UTC (permalink / raw)
  To: Alexei Starovoitov, John Fastabend
  Cc: Alexei Starovoitov, davem, daniel, netdev, bpf, kernel-team

Alexei Starovoitov wrote:
> On Mon, Jan 20, 2020 at 11:36:43PM -0800, John Fastabend wrote:
> > 
> > > +
> > > +	t1 = btf_type_skip_modifiers(btf1, t1->type, NULL);
> > > +	t2 = btf_type_skip_modifiers(btf2, t2->type, NULL);
> > 
> > Is it really best to skip modifiers? I would expect that if the
> > signature is different including modifiers then we should just reject it.
> > OTOH its not really C code here either so modifiers may not have the same
> > meaning. With just integers and struct it may be ok but if we add pointers
> > to ints then what would we expect from a const int*?
> > 
> > So whats the reasoning for skipping modifiers? Is it purely an argument
> > that its not required for safety so solve it elsewhere? In that case then
> > checking names of functions is also equally not required.
> 
> Function names are not checked by the kernel. It's purely libbpf and bpf_prog.c
> convention. The kernel operates on prog_fd+btf_id only. The names of function
> arguments are not compared either.

Sorry, I mistyped; names of structs is what I meant, but that is probably nice
to have per the comment.

> 
> The code has to skip modifiers. Otherwise the type comparison algorithm will be
> quite complex, since typedef is such a modifier. For example, 'u32' in the
> original program and 'u32' in the extension program would have to be
> recursively checked.
> 
> Another reason to skip modifiers is the 'volatile' modifier. I suspect we would
> have to use it from time to time in the original placeholder functions, yet the
> new replacement function will be written without volatile. The placeholder may
> need volatile to make sure the compiler doesn't optimize things away. I found
> cases where 'noinline' in the placeholder was not enough: clang would still
> inline the body of the function and remove the call instruction. So far I've
> been using volatile as a workaround. Maybe we will introduce a new function
> attribute to clang.

Yes, we have various similar issues and have in the past used volatile to work
around them, but volatiles inside loops tend to break loop optimizations and
cause clang warnings/errors. Another common one is the verifier failing to
track when scalars move around in registers. As an example, the following is
valid C for a bounded addition to an array pointer but is not tractable for the
verifier at the moment. (Made-up example; at some point I'll dig up a
collection of real-world examples.)

    r1 = *(u64 *)(r10 - 8)
    r6 = r1
    if r6 < %[const] goto %l[err]
    r3 += r1
    r2 = %[copy_size]
    r1 = r7
    call 4

Compiler barriers help, but not always, and they also break loop optimization
passes. But that's a different discussion; I only mention it because either the
verifier has to track the above logic better or new attributes in clang could
be used for these things. The new attributes don't usually work well, though,
when mixed with optimization passes that we would actually like to keep.

> 
> Having said that, I share your concern regarding skipping 'const'. For 'const
> int arg' it's totally ok to skip it, since it's meaningless from a safety pov,
> but for 'const int *arg' and 'const struct foo *arg' I'm planning to preserve
> it. It will be preserved at the verifier bpf_reg_state level though. Just
> checking that 'const' is present in the extension prog's BTF doesn't help
> safety. I'm planning to make the verifier enforce that a bpf prog cannot write
> into an argument whose type is a pointer to const struct. That part is still
> wip. It will be implemented for global functions first and then for extension
> programs. Currently the verifier rejects any pointer to struct (other than
> context), so there are no backward compatibility issues.

Ah OK, this will be great. In that case const will be more general than merely
functions and should be applicable generally, at least as an end goal IMO.
There will be a slight annoyance where old extensions may not run on new
kernels, though I would argue such extensions are broken anyway.

For this patch then,

Acked-by: John Fastabend <john.fastabend@gmail.com>


* Re: [PATCH v2 bpf-next 0/3] bpf: Program extensions or dynamic re-linking
  2020-01-21  0:53 [PATCH v2 bpf-next 0/3] bpf: Program extensions or dynamic re-linking Alexei Starovoitov
                   ` (3 preceding siblings ...)
  2020-01-21 15:37 ` [PATCH v2 bpf-next 0/3] bpf: Program extensions or dynamic re-linking Toke Høiland-Jørgensen
@ 2020-01-21 18:21 ` Andrii Nakryiko
  2020-01-22 22:12 ` Daniel Borkmann
  5 siblings, 0 replies; 17+ messages in thread
From: Andrii Nakryiko @ 2020-01-21 18:21 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S. Miller, Daniel Borkmann, Networking, bpf, Kernel Team

On Mon, Jan 20, 2020 at 4:54 PM Alexei Starovoitov <ast@kernel.org> wrote:
>
> For the last few months the BPF community has been discussing an approach to
> call chaining, since the existing bpf_tail_call() mechanism used in production
> XDP programs has plenty of downsides. The outcome of these discussions was a
> decision to implement dynamic re-linking of BPF programs, where a rootlet XDP
> program attached to a netdevice can programmatically define a policy of
> execution of other XDP programs. Such a rootlet would be compiled as a normal
> XDP program and provide a number of placeholder global functions which can
> later be replaced with future XDP programs. BPF trampoline and
> function-by-function verification were building blocks towards that goal.
> Patch 1 is the final building block. It introduces dynamic program extensions.
> A number of improvements, like more flexible function-by-function verification
> and a better libbpf API, will be implemented in future patches.
>
> v1->v2:
> - addressed Andrii's comments
> - rebase
>
> Alexei Starovoitov (3):
>   bpf: Introduce dynamic program extensions
>   libbpf: Add support for program extensions
>   selftests/bpf: Add tests for program extensions
>
>  include/linux/bpf.h                           |  10 +-
>  include/linux/bpf_types.h                     |   2 +
>  include/linux/btf.h                           |   5 +
>  include/uapi/linux/bpf.h                      |   1 +
>  kernel/bpf/btf.c                              | 152 +++++++++++++++++-
>  kernel/bpf/syscall.c                          |  15 +-
>  kernel/bpf/trampoline.c                       |  41 ++++-
>  kernel/bpf/verifier.c                         |  85 +++++++---
>  tools/include/uapi/linux/bpf.h                |   1 +
>  tools/lib/bpf/bpf.c                           |   3 +-
>  tools/lib/bpf/libbpf.c                        |  13 +-
>  tools/lib/bpf/libbpf.h                        |   2 +
>  tools/lib/bpf/libbpf.map                      |   2 +
>  tools/lib/bpf/libbpf_probes.c                 |   1 +
>  .../selftests/bpf/prog_tests/fexit_bpf2bpf.c  |  20 ++-
>  .../selftests/bpf/progs/fexit_bpf2bpf.c       |  57 +++++++
>  .../selftests/bpf/progs/test_pkt_access.c     |   8 +-
>  17 files changed, 384 insertions(+), 34 deletions(-)
>
> --
> 2.23.0
>

LGTM.

For the series:

Acked-by: Andrii Nakryiko <andriin@fb.com>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH v2 bpf-next 2/3] libbpf: Add support for program extensions
  2020-01-21  0:53 ` [PATCH v2 bpf-next 2/3] libbpf: Add support for " Alexei Starovoitov
@ 2020-01-21 18:38   ` John Fastabend
  0 siblings, 0 replies; 17+ messages in thread
From: John Fastabend @ 2020-01-21 18:38 UTC (permalink / raw)
  To: Alexei Starovoitov, davem; +Cc: daniel, netdev, bpf, kernel-team

Alexei Starovoitov wrote:
> Add minimal support for program extensions. bpf_object_open_opts() needs to be
> called with attach_prog_fd = target_prog_fd and BPF program extension needs to
> have in .c file section definition like SEC("freplace/func_to_be_replaced").
> libbpf will search for "func_to_be_replaced" in the target_prog_fd's BTF and
> will pass it in attach_btf_id to the kernel. This approach works for tests, but
> more compex use case may need to request function name (and attach_btf_id that
> kernel sees) to be more dynamic. Such API will be added in future patches.
> 
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> ---
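
A minimal sketch of the flow described above (function and file names are
made up, error handling omitted; the extension's signature must match the
replaced function's BTF):

    /* extension program, in its own .c file */
    SEC("freplace/func_to_be_replaced")
    int new_func(struct __sk_buff *skb)
    {
            return skb->len;
    }

    /* loader side: point libbpf at the already loaded target program */
    DECLARE_LIBBPF_OPTS(bpf_object_open_opts, opts,
                        .attach_prog_fd = target_prog_fd);
    struct bpf_object *ext_obj = bpf_object__open_file("ext_prog.o", &opts);
    /* then bpf_object__load() and attach, e.g. via bpf_program__attach_trace(),
     * as the selftest in patch 3 does
     */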

Acked-by: John Fastabend <john.fastabend@gmail.com>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH v2 bpf-next 3/3] selftests/bpf: Add tests for program extensions
  2020-01-21  0:53 ` [PATCH v2 bpf-next 3/3] selftests/bpf: Add tests " Alexei Starovoitov
@ 2020-01-21 18:41   ` John Fastabend
  0 siblings, 0 replies; 17+ messages in thread
From: John Fastabend @ 2020-01-21 18:41 UTC (permalink / raw)
  To: Alexei Starovoitov, davem; +Cc: daniel, netdev, bpf, kernel-team

Alexei Starovoitov wrote:
> Add program extension tests that build on top of fexit_bpf2bpf tests.
> Replace three global functions in previously loaded test_pkt_access.c program
> with three new implementations:
> int get_skb_len(struct __sk_buff *skb);
> int get_constant(long val);
> int get_skb_ifindex(int val, struct __sk_buff *skb, int var);
> New function return the same results as original only if arguments match.
> 
> new_get_skb_ifindex() demonstrates that 'skb' argument doesn't have to be first
> and only argument of BPF program. All normal skb based accesses are available.
> 
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> ---
>  .../selftests/bpf/prog_tests/fexit_bpf2bpf.c  | 20 ++++++-
>  .../selftests/bpf/progs/fexit_bpf2bpf.c       | 57 +++++++++++++++++++
>  .../selftests/bpf/progs/test_pkt_access.c     |  8 ++-
>  3 files changed, 83 insertions(+), 2 deletions(-)
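
To illustrate the 'skb does not have to be the first argument' point, a
replacement of roughly this shape (hypothetical body, not the actual
selftest code):

    SEC("freplace/get_skb_ifindex")
    int new_get_skb_ifindex(int val, struct __sk_buff *skb, int var)
    {
            /* skb is the second argument, yet normal skb accesses still work */
            return skb->ifindex * val * var;
    }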

Acked-by: John Fastabend <john.fastabend@gmail.com>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 bpf-next 1/3] bpf: Introduce dynamic program extensions
  2020-01-21 18:15       ` John Fastabend
@ 2020-01-21 19:08         ` Alexei Starovoitov
  2020-01-21 19:54           ` John Fastabend
  2020-01-21 21:13         ` Yonghong Song
  1 sibling, 1 reply; 17+ messages in thread
From: Alexei Starovoitov @ 2020-01-21 19:08 UTC (permalink / raw)
  To: John Fastabend
  Cc: Alexei Starovoitov, davem, daniel, netdev, bpf, kernel-team

On Tue, Jan 21, 2020 at 10:15:42AM -0800, John Fastabend wrote:
> Alexei Starovoitov wrote:
> > On Mon, Jan 20, 2020 at 11:36:43PM -0800, John Fastabend wrote:
> > > 
> > > > +
> > > > +	t1 = btf_type_skip_modifiers(btf1, t1->type, NULL);
> > > > +	t2 = btf_type_skip_modifiers(btf2, t2->type, NULL);
> > > 
> > > Is it really best to skip modifiers? I would expect that if the
> > > signature is different including modifiers then we should just reject it.
> > > OTOH its not really C code here either so modifiers may not have the same
> > > meaning. With just integers and struct it may be ok but if we add pointers
> > > to ints then what would we expect from a const int*?
> > > 
> > > So whats the reasoning for skipping modifiers? Is it purely an argument
> > > that its not required for safety so solve it elsewhere? In that case then
> > > checking names of functions is also equally not required.
> > 
> > Function names are not checked by the kernel. It's purely libbpf and bpf_prog.c
> > convention. The kernel operates on prog_fd+btf_id only. The names of function
> > arguments are not compared either.
> 
> Sorry mistyped names of struct is what I meant, but that is probably nice to
> have per comment.

I think comments in the kernel code won't be sufficient :) I'm planning to
write a proper doc describing global functions and program extensions, since
these concepts are coupled.

> > 
> > The code has to skip modifiers. Otherwise the type comparison algorithm will be
> > quite complex, since typedef is such modifier. Like 'u32' in original program
> > and 'u32' in extension program would have to be recursively checked.
> > 
> > Another reason to skip modifiers is 'volatile' modifier. I suspect we would
> > have to use it from time to time in original placeholder functions. Yet new
> > replacement function will be written without volatile. The placeholder may need
> > volatile to make sure compiler doesn't optimize things away. I found cases
> > where 'noinline' in placeholder was not enough. clang would still inline the
> > body of the function and remove call instruction. So far I've been using
> > volatile as a workaround. May be we will introduce new function attribute to
> > clang.
> 
> Yes, we have various similar issue and have in the past used volatile to work
> around them but volatile's inside loops tends to break loop optimizations and
> cause clang warnings/errors. Another common one is verifier failing to track
> when scalars move around in registers. As an example the following is valid
> C for a bounded additon to array pointer but not tractable for the verifier
> at the moment. (made example at some point I'll dig up a collection of
> real-world examples)
> 
>     r1 = *(u64 *)(r10 - 8)
>     r6 = r1
>     if r6 < %[const] goto %l[err]
>     r3 += r1
>     r2 = %[copy_size]
>     r1 = r7
>     call 4

Is it a signed compare in the above? The verifier currently struggles with
signed compares. The issue you're describing could also be related to the
instruction combining optimization. Yonghong is working on undoing parts of
instcombine here:
https://reviews.llvm.org/D72787
Dealing with 32-bit signed compares will probably be done as a verifier
improvement. It's too hacky to do it in LLVM.
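
For reference, a made-up C fragment of the general shape that tends to
produce the quoted asm (names and constants are illustrative only, not the
actual code being discussed):

    __u64 off = md->offset;                  /* loaded into r1 */

    if (off >= MAX_OFF)                      /* clang checks the bound on a copy (r6) */
            return 0;
    bpf_probe_read(dst, COPY_SZ, buf + off); /* the pointer math uses the original r1,
                                              * which the verifier has not bounded
                                              */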

> compiler barriers help but not always and also breaks loop optimization
> passes. But, thats a different discussion I only mention it because
> either verifier has to track above logic better or new attributes in clang
> could be used for these things. But the new attributes don't usually work
> well when mixed with optimization passes that we would actually like to
> keep.

Yep. Unfortunately the LLVM folks didn't like the idea of disabling individual
optimization passes or parts of passes based on a backend flag, hence the BPF
backend has to normalize IR before generating asm. Since all of the issues
related to "llvm optimizing too much while the verifier is too dumb" come down
to missed tracking of registers, I'm thinking that we may try a virtual
register approach. Instead of doing register allocation and spill/fill in LLVM,
the backend may generate as many virtual registers as necessary, and the final
BPF assembly will look like SSA, except that physical registers will still be
used to receive and pass arguments while all other math and branches operate on
virtual registers. Then the kernel will do register allocation after
verification. I hope that should help the verifier get smarter, at the expense
of less efficient code and more spill/fill. But that is the last resort.
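
To make the idea a bit more concrete, the earlier quoted sequence might come
out of such a backend roughly like this (entirely hypothetical syntax; v* are
virtual registers, and the kernel would map them onto r0-r9, adding spill/fill
as needed, only after verification):

    v1 = *(u64 *)(r10 - 8)        ; each value gets a fresh virtual register
    if v1 < %[const] goto %l[err] ; branches and math operate on virtual registers
    v2 = v_buf + v1
    r1 = v_dst                    ; physical r1-r5 used only to pass arguments
    r2 = %[copy_size]
    r3 = v2
    call 4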

> > 
> > Having said that I share your concern regarding skipping 'const'. For 'const
> > int arg' it's totally ok to skip it, since it's meaningless from safety pov,
> > but for 'const int *arg' and 'const struct foo *arg' I'm planning to preserve
> > it. It will be preserved at the verifier bpf_reg_state level though. Just
> > checking that 'const' is present in extension prog's BTF doesn't help safety.
> > I'm planing to make the verifier enforce that bpf prog cannot write into
> > argument which type is pointer to const struct. That part is still wip. It will
> > be implemented for global functions first and then for extension programs.
> > Currently the verifier rejects any pointer to struct (other than context), so
> > no backward compatibility issues.
> 
> Ah ok this will be great. In that case const will be more general then
> merely functions and should be applicable generally at least as an end
> goal IMO. There will be a slight annoyance where old extensions may not
> run on new kernels though. I will argue such extensions are broken though.

'Old extensions' won't run? I don't see it. Strict 'const' enforcement for
pointers to struct won't break things. Currently pointers to struct are not
supported at all; the only exception is a pointer to one specific struct, the
context, which is different. I don't think there will be any compatibility
issues.

> For this patch then,
> 
> Acked-by: John Fastabend <john.fastabend@gmail.com>

Thanks for the review!

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 bpf-next 1/3] bpf: Introduce dynamic program extensions
  2020-01-21 19:08         ` Alexei Starovoitov
@ 2020-01-21 19:54           ` John Fastabend
  0 siblings, 0 replies; 17+ messages in thread
From: John Fastabend @ 2020-01-21 19:54 UTC (permalink / raw)
  To: Alexei Starovoitov, John Fastabend
  Cc: Alexei Starovoitov, davem, daniel, netdev, bpf, kernel-team

Alexei Starovoitov wrote:
> On Tue, Jan 21, 2020 at 10:15:42AM -0800, John Fastabend wrote:
> > Alexei Starovoitov wrote:
> > > On Mon, Jan 20, 2020 at 11:36:43PM -0800, John Fastabend wrote:
> > > > 
> > > > > +
> > > > > +	t1 = btf_type_skip_modifiers(btf1, t1->type, NULL);
> > > > > +	t2 = btf_type_skip_modifiers(btf2, t2->type, NULL);
> > > > 
> > > > Is it really best to skip modifiers? I would expect that if the
> > > > signature is different including modifiers then we should just reject it.
> > > > OTOH its not really C code here either so modifiers may not have the same
> > > > meaning. With just integers and struct it may be ok but if we add pointers
> > > > to ints then what would we expect from a const int*?
> > > > 
> > > > So whats the reasoning for skipping modifiers? Is it purely an argument
> > > > that its not required for safety so solve it elsewhere? In that case then
> > > > checking names of functions is also equally not required.
> > > 
> > > Function names are not checked by the kernel. It's purely libbpf and bpf_prog.c
> > > convention. The kernel operates on prog_fd+btf_id only. The names of function
> > > arguments are not compared either.
> > 
> > Sorry mistyped names of struct is what I meant, but that is probably nice to
> > have per comment.
> 
> I think comments in the kernel code won't be sufficient :) I'm planing to write
> a proper doc describing global functions and program extensions, since these
> concepts are coupled.
> 
> > > 
> > > The code has to skip modifiers. Otherwise the type comparison algorithm will be
> > > quite complex, since typedef is such modifier. Like 'u32' in original program
> > > and 'u32' in extension program would have to be recursively checked.
> > > 
> > > Another reason to skip modifiers is 'volatile' modifier. I suspect we would
> > > have to use it from time to time in original placeholder functions. Yet new
> > > replacement function will be written without volatile. The placeholder may need
> > > volatile to make sure compiler doesn't optimize things away. I found cases
> > > where 'noinline' in placeholder was not enough. clang would still inline the
> > > body of the function and remove call instruction. So far I've been using
> > > volatile as a workaround. May be we will introduce new function attribute to
> > > clang.
> > 
> > Yes, we have various similar issue and have in the past used volatile to work
> > around them but volatile's inside loops tends to break loop optimizations and
> > cause clang warnings/errors. Another common one is verifier failing to track
> > when scalars move around in registers. As an example the following is valid
> > C for a bounded additon to array pointer but not tractable for the verifier
> > at the moment. (made example at some point I'll dig up a collection of
> > real-world examples)
> > 
> >     r1 = *(u64 *)(r10 - 8)
> >     r6 = r1
> >     if r6 < %[const] goto %l[err]
> >     r3 += r1
> >     r2 = %[copy_size]
> >     r1 = r7
> >     call 4
> 
> Is it a sign compare in the above? The verifier currently struggles with sign
> compares. The issue you're describing could also be related to instruction
> combining optimization. Yonghong is working on undoing parts of instcombine here:
> https://reviews.llvm.org/D72787
> Dealing with 32-bit sign compares probably will be done as a verifier
> improvement. It's too hacky to do it in the llvm.

Hmm, I think it was an unsigned compare, but I'll try to come up with some
simple examples of real code this week and post them, to illustrate some of
the verifier complaints we've been working around.

> 
> > compiler barriers help but not always and also breaks loop optimization
> > passes. But, thats a different discussion I only mention it because
> > either verifier has to track above logic better or new attributes in clang
> > could be used for these things. But the new attributes don't usually work
> > well when mixed with optimization passes that we would actually like to
> > keep.
> 
> yep. unfortunately llvm folks didn't like the idea of disabling individual
> optimization passes or parts of passes based on the backend flag. Hence BPF
> backend has to normalize IR before generating asm. Since all of the issues
> related to "llvm optimizing too much and verifier is too dumb" are related to
> missed tracking of registers, I'm thinking that we may try virtual register
> approach. Instead of doing register allocation and spill/fill in the llvm the
> backend may generate as many virtual registers as necessary and final BPF
> assembly will look like SSA except phisical registers will still be used to
> receive and pass arguments while all other math, branches will operate on
> virtual registers. Then the kernel will do register alloc after verification. I
> hope that should help the verifier get smarter at the expense of less efficient
> code and more spill/fill. But that is the last resort.

Interesting. Or maybe follow the parent chain and push information down the
chain when we see register moves and further bounds refinement.

> 
> > > 
> > > Having said that I share your concern regarding skipping 'const'. For 'const
> > > int arg' it's totally ok to skip it, since it's meaningless from safety pov,
> > > but for 'const int *arg' and 'const struct foo *arg' I'm planning to preserve
> > > it. It will be preserved at the verifier bpf_reg_state level though. Just
> > > checking that 'const' is present in extension prog's BTF doesn't help safety.
> > > I'm planing to make the verifier enforce that bpf prog cannot write into
> > > argument which type is pointer to const struct. That part is still wip. It will
> > > be implemented for global functions first and then for extension programs.
> > > Currently the verifier rejects any pointer to struct (other than context), so
> > > no backward compatibility issues.
> > 
> > Ah ok this will be great. In that case const will be more general then
> > merely functions and should be applicable generally at least as an end
> > goal IMO. There will be a slight annoyance where old extensions may not
> > run on new kernels though. I will argue such extensions are broken though.
> 
> 'old extensions' won't run? I don't see it. Strict 'const' enforcement
> for pointers to struct won't break things. Currently pointers to struct
> are not supported. Only a pointer to one specific struct which is context.
> That's different. I don't think there will be any compatibility issues.

Right, it's not an issue.

> 
> > For this patch then,
> > 
> > Acked-by: John Fastabend <john.fastabend@gmail.com>
> 
> Thanks for the review!

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 bpf-next 1/3] bpf: Introduce dynamic program extensions
  2020-01-21 18:15       ` John Fastabend
  2020-01-21 19:08         ` Alexei Starovoitov
@ 2020-01-21 21:13         ` Yonghong Song
  1 sibling, 0 replies; 17+ messages in thread
From: Yonghong Song @ 2020-01-21 21:13 UTC (permalink / raw)
  To: John Fastabend, Alexei Starovoitov
  Cc: Alexei Starovoitov, davem, daniel, netdev, bpf, Kernel Team



On 1/21/20 10:15 AM, John Fastabend wrote:
> Alexei Starovoitov wrote:
>> On Mon, Jan 20, 2020 at 11:36:43PM -0800, John Fastabend wrote:
>>>
>>>> +
>>>> +	t1 = btf_type_skip_modifiers(btf1, t1->type, NULL);
>>>> +	t2 = btf_type_skip_modifiers(btf2, t2->type, NULL);
>>>
>>> Is it really best to skip modifiers? I would expect that if the
>>> signature is different including modifiers then we should just reject it.
>>> OTOH its not really C code here either so modifiers may not have the same
>>> meaning. With just integers and struct it may be ok but if we add pointers
>>> to ints then what would we expect from a const int*?
>>>
>>> So whats the reasoning for skipping modifiers? Is it purely an argument
>>> that its not required for safety so solve it elsewhere? In that case then
>>> checking names of functions is also equally not required.
>>
>> Function names are not checked by the kernel. It's purely libbpf and bpf_prog.c
>> convention. The kernel operates on prog_fd+btf_id only. The names of function
>> arguments are not compared either.
> 
> Sorry mistyped names of struct is what I meant, but that is probably nice to
> have per comment.
> 
>>
>> The code has to skip modifiers. Otherwise the type comparison algorithm will be
>> quite complex, since typedef is such modifier. Like 'u32' in original program
>> and 'u32' in extension program would have to be recursively checked.
>>
>> Another reason to skip modifiers is 'volatile' modifier. I suspect we would
>> have to use it from time to time in original placeholder functions. Yet new
>> replacement function will be written without volatile. The placeholder may need
>> volatile to make sure compiler doesn't optimize things away. I found cases
>> where 'noinline' in placeholder was not enough. clang would still inline the
>> body of the function and remove call instruction. So far I've been using
>> volatile as a workaround. May be we will introduce new function attribute to
>> clang.
> 
> Yes, we have various similar issue and have in the past used volatile to work
> around them but volatile's inside loops tends to break loop optimizations and
> cause clang warnings/errors. Another common one is verifier failing to track
> when scalars move around in registers. As an example the following is valid
> C for a bounded additon to array pointer but not tractable for the verifier
> at the moment. (made example at some point I'll dig up a collection of
> real-world examples)
> 
>      r1 = *(u64 *)(r10 - 8)
>      r6 = r1
>      if r6 < %[const] goto %l[err]
>      r3 += r1
>      r2 = %[copy_size]
>      r1 = r7
>      call 4
> 
> compiler barriers help but not always and also breaks loop optimization
> passes. But, thats a different discussion I only mention it because
> either verifier has to track above logic better or new attributes in clang
> could be used for these things. But the new attributes don't usually work
> well when mixed with optimization passes that we would actually like to
> keep.

John, could you send your original C code showing how to reproduce this? I am
working on the llvm side to avoid such optimizations, i.e., moving scalars
around so that later on you end up with two scalars holding essentially the
same value but with different verified state.

> 
>>
>> Having said that I share your concern regarding skipping 'const'. For 'const
>> int arg' it's totally ok to skip it, since it's meaningless from safety pov,
>> but for 'const int *arg' and 'const struct foo *arg' I'm planning to preserve
>> it. It will be preserved at the verifier bpf_reg_state level though. Just
>> checking that 'const' is present in extension prog's BTF doesn't help safety.
>> I'm planing to make the verifier enforce that bpf prog cannot write into
>> argument which type is pointer to const struct. That part is still wip. It will
>> be implemented for global functions first and then for extension programs.
>> Currently the verifier rejects any pointer to struct (other than context), so
>> no backward compatibility issues.
> 
> Ah ok this will be great. In that case const will be more general then
> merely functions and should be applicable generally at least as an end
> goal IMO. There will be a slight annoyance where old extensions may not
> run on new kernels though. I will argue such extensions are broken though.
> 
> For this patch then,
> 
> Acked-by: John Fastabend <john.fastabend@gmail.com>
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 bpf-next 0/3] bpf: Program extensions or dynamic re-linking
  2020-01-21 17:00   ` Alexei Starovoitov
@ 2020-01-22 10:45     ` Toke Høiland-Jørgensen
  0 siblings, 0 replies; 17+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-01-22 10:45 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Alexei Starovoitov, davem, daniel, netdev, bpf, kernel-team

Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:

> On Tue, Jan 21, 2020 at 04:37:31PM +0100, Toke Høiland-Jørgensen wrote:
>> Alexei Starovoitov <ast@kernel.org> writes:
>> 
>> > The last few month BPF community has been discussing an approach to call
>> > chaining, since exiting bpt_tail_call() mechanism used in production XDP
>> > programs has plenty of downsides. The outcome of these discussion was a
>> > conclusion to implement dynamic re-linking of BPF programs. Where rootlet XDP
>> > program attached to a netdevice can programmatically define a policy of
>> > execution of other XDP programs. Such rootlet would be compiled as normal XDP
>> > program and provide a number of placeholder global functions which later can be
>> > replaced with future XDP programs. BPF trampoline, function by function
>> > verification were building blocks towards that goal. The patch 1 is a final
>> > building block. It introduces dynamic program extensions. A number of
>> > improvements like more flexible function by function verification and better
>> > libbpf api will be implemented in future patches.
>> 
>> This is great, thank you! I'll go play around with it; couldn't spot
>> anything obvious from eye-balling the code, except that yeah, it does
>> need a more flexible libbpf api :)
>> 
>> One thing that's not obvious to me: How can userspace tell which
>> programs replace which functions after they are loaded? Is this put into
>> prog_tags in struct bpf_prog_info, or?
>
> good point. Would be good to extend bpf_prog_info. Since prog-to-prog
> connection is unidirectional the bpf_prog_info of extension prog will be able
> to say which original program it's replacing.

Yeah, that'll do. I can live with having to enumerate all programs and
backtrack to the attached XDP program to figure out its component parts.

> bpftool prog show will be able to print all this data. I think
> fenry/fexit progs would need the same bpf_prog_info extension.
> attach_prog_id + attach_btf_id would be enough.

Yes, please. I actually assumed this was already there for fentry/fexit,
which is why I was puzzled I couldn't find where this series hooked into
that. I'll just wait for such an extension to show up, then :)
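
A rough sketch of what that query could look like from user space, assuming
the proposed attach_prog_id/attach_btf_id fields were added to bpf_prog_info
(both fields are hypothetical at this point, per the discussion above):

    struct bpf_prog_info info = {};
    __u32 len = sizeof(info);

    if (!bpf_obj_get_info_by_fd(ext_prog_fd, &info, &len))
            /* attach_prog_id/attach_btf_id do not exist in the UAPI yet */
            printf("extension replaces btf id %u in prog id %u\n",
                   info.attach_btf_id, info.attach_prog_id);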

> In the mean time I can try to hack drgn script to do the same.

That would be great, thanks!

-Toke


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 bpf-next 0/3] bpf: Program extensions or dynamic re-linking
  2020-01-21  0:53 [PATCH v2 bpf-next 0/3] bpf: Program extensions or dynamic re-linking Alexei Starovoitov
                   ` (4 preceding siblings ...)
  2020-01-21 18:21 ` Andrii Nakryiko
@ 2020-01-22 22:12 ` Daniel Borkmann
  5 siblings, 0 replies; 17+ messages in thread
From: Daniel Borkmann @ 2020-01-22 22:12 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: davem, netdev, bpf, kernel-team

On Mon, Jan 20, 2020 at 04:53:45PM -0800, Alexei Starovoitov wrote:
> The last few month BPF community has been discussing an approach to call
> chaining, since exiting bpt_tail_call() mechanism used in production XDP
> programs has plenty of downsides. The outcome of these discussion was a
> conclusion to implement dynamic re-linking of BPF programs. Where rootlet XDP
> program attached to a netdevice can programmatically define a policy of
> execution of other XDP programs. Such rootlet would be compiled as normal XDP
> program and provide a number of placeholder global functions which later can be
> replaced with future XDP programs. BPF trampoline, function by function
> verification were building blocks towards that goal. The patch 1 is a final
> building block. It introduces dynamic program extensions. A number of
> improvements like more flexible function by function verification and better
> libbpf api will be implemented in future patches.
> 
> v1->v2:
> - addressed Andrii's comments
> - rebase

Applied, thanks!

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2020-01-22 22:12 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-21  0:53 [PATCH v2 bpf-next 0/3] bpf: Program extensions or dynamic re-linking Alexei Starovoitov
2020-01-21  0:53 ` [PATCH v2 bpf-next 1/3] bpf: Introduce dynamic program extensions Alexei Starovoitov
2020-01-21  7:36   ` John Fastabend
2020-01-21 16:00     ` Alexei Starovoitov
2020-01-21 18:15       ` John Fastabend
2020-01-21 19:08         ` Alexei Starovoitov
2020-01-21 19:54           ` John Fastabend
2020-01-21 21:13         ` Yonghong Song
2020-01-21  0:53 ` [PATCH v2 bpf-next 2/3] libbpf: Add support for " Alexei Starovoitov
2020-01-21 18:38   ` John Fastabend
2020-01-21  0:53 ` [PATCH v2 bpf-next 3/3] selftests/bpf: Add tests " Alexei Starovoitov
2020-01-21 18:41   ` John Fastabend
2020-01-21 15:37 ` [PATCH v2 bpf-next 0/3] bpf: Program extensions or dynamic re-linking Toke Høiland-Jørgensen
2020-01-21 17:00   ` Alexei Starovoitov
2020-01-22 10:45     ` Toke Høiland-Jørgensen
2020-01-21 18:21 ` Andrii Nakryiko
2020-01-22 22:12 ` Daniel Borkmann
