* [PATCH bpf-next v4 0/4] bpf: Implement two type cast kfuncs
@ 2022-11-20 19:54 Yonghong Song
  2022-11-20 19:54 ` [PATCH bpf-next v4 1/4] bpf: Add support for kfunc set with common btf_ids Yonghong Song
                   ` (5 more replies)
  0 siblings, 6 replies; 18+ messages in thread
From: Yonghong Song @ 2022-11-20 19:54 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	John Fastabend, kernel-team, Kumar Kartikeya Dwivedi,
	Martin KaFai Lau

Currently, a non-tracing bpf program typically has a single 'context' argument
with a predefined uapi struct type. Based on this uapi struct, the user is able
to access the fields defined in the uapi header. Inside the kernel, the
user-visible 'context' argument is replaced with a 'kernel context' (or 'kctx'
for short) which can access more information than the uapi header provides.
To access other info not in the uapi header, people typically do one of two things:
  (1). extend the uapi to expose more fields rooted at 'context'.
  (2). use the bpf_probe_read_kernel() helper to read a particular field based on
    the kctx.
Approach (1) needs a uapi change, and approach (2) makes the code more complex
since direct memory access is not allowed.
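
As a sketch of approach (2), reading a kctx-only field today takes one
bpf_probe_read_kernel() call per pointer dereference. The snippet below is
illustrative only: struct layouts are assumed to come from vmlinux.h, the
program/field names are made up for the example, and error handling is omitted.

```c
/* Approach (2) sketch: an xdp program chasing kctx pointers with
 * bpf_probe_read_kernel(). Every dereference is an explicit helper call.
 */
SEC("xdp")
int read_ifindex(struct xdp_md *ctx)
{
	struct xdp_buff *kctx = (struct xdp_buff *)ctx;
	struct xdp_rxq_info *rxq;
	struct net_device *dev;
	int ifindex = 0;

	bpf_probe_read_kernel(&rxq, sizeof(rxq), &kctx->rxq);
	bpf_probe_read_kernel(&dev, sizeof(dev), &rxq->dev);
	bpf_probe_read_kernel(&ifindex, sizeof(ifindex), &dev->ifindex);
	return XDP_PASS;
}
```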

There are already a few instances trying to access more information from
kctx:
  . trying to access some fields from perf_event kctx ([1]).
  . trying to access some fields from xdp kctx ([2]).

This patch set allows direct memory access to kctx fields
by introducing the bpf_cast_to_kern_ctx() kfunc.

Martin mentioned a type-casting use case like below:
  #define skb_shinfo(SKB) ((struct skb_shared_info *)(skb_end_pointer(SKB)))
basically an 'unsigned char *' cast to 'struct skb_shared_info *'. This patch
set supports such use cases as well with bpf_rdonly_cast().

In this patch series, Patch 1 adds support for kfuncs available to all
prog types. Patch 2 adds the bpf_cast_to_kern_ctx() kfunc. Patch 3 adds the
bpf_rdonly_cast() kfunc. Patch 4 adds a few positive and negative tests.

  [1] https://lore.kernel.org/bpf/ad15b398-9069-4a0e-48cb-4bb651ec3088@meta.com/
  [2] https://lore.kernel.org/bpf/20221109215242.1279993-1-john.fastabend@gmail.com/

Changelog:
  v3 -> v4:
    - remove unnecessary bpf_ctx_convert.t error checking
    - add and use meta.ret_btf_id instead of meta.arg_constant.value for
      bpf_cast_to_kern_ctx().
    - add PTR_TRUSTED to the return PTR_TO_BTF_ID type for bpf_cast_to_kern_ctx().
  v2 -> v3:
    - rebase on top of bpf-next (for merging conflicts)
    - add the selftest to s390x deny list
  rfcv1 -> v2:
    - break original one kfunc into two.
    - add missing error checks and error logs.
    - adapt to the new conventions in
      https://lore.kernel.org/all/20221118015614.2013203-1-memxor@gmail.com/
      for example, with __ign and __k suffix.
    - added support in fixup_kfunc_call() to replace kfunc calls with a single mov.

Yonghong Song (4):
  bpf: Add support for kfunc set with common btf_ids
  bpf: Add a kfunc to type cast from bpf uapi ctx to kernel ctx
  bpf: Add a kfunc for generic type cast
  bpf: Add type cast unit tests

 include/linux/btf.h                           |   5 +
 kernel/bpf/btf.c                              |  28 +++++
 kernel/bpf/helpers.c                          |  24 +++-
 kernel/bpf/verifier.c                         |  46 ++++++-
 tools/testing/selftests/bpf/DENYLIST.s390x    |   1 +
 .../selftests/bpf/prog_tests/type_cast.c      | 114 ++++++++++++++++++
 tools/testing/selftests/bpf/progs/type_cast.c |  83 +++++++++++++
 7 files changed, 299 insertions(+), 2 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/type_cast.c
 create mode 100644 tools/testing/selftests/bpf/progs/type_cast.c

-- 
2.30.2


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH bpf-next v4 1/4] bpf: Add support for kfunc set with common btf_ids
  2022-11-20 19:54 [PATCH bpf-next v4 0/4] bpf: Implement two type cast kfuncs Yonghong Song
@ 2022-11-20 19:54 ` Yonghong Song
  2022-11-20 19:54 ` [PATCH bpf-next v4 2/4] bpf: Add a kfunc to type cast from bpf uapi ctx to kernel ctx Yonghong Song
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 18+ messages in thread
From: Yonghong Song @ 2022-11-20 19:54 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	John Fastabend, kernel-team, Kumar Kartikeya Dwivedi,
	Martin KaFai Lau

Later on, we will introduce the kfuncs bpf_cast_to_kern_ctx() and
bpf_rdonly_cast() which apply to all program types. Currently, a kfunc set
can only be registered for individual prog types. This patch adds support
for kfuncs that apply to all program types.

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 kernel/bpf/btf.c     |  8 ++++++++
 kernel/bpf/helpers.c | 12 +++++++++++-
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index d52054ec69c9..1c78d4df9e18 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -199,6 +199,7 @@ DEFINE_IDR(btf_idr);
 DEFINE_SPINLOCK(btf_idr_lock);
 
 enum btf_kfunc_hook {
+	BTF_KFUNC_HOOK_COMMON,
 	BTF_KFUNC_HOOK_XDP,
 	BTF_KFUNC_HOOK_TC,
 	BTF_KFUNC_HOOK_STRUCT_OPS,
@@ -7531,6 +7532,8 @@ static u32 *__btf_kfunc_id_set_contains(const struct btf *btf,
 static int bpf_prog_type_to_kfunc_hook(enum bpf_prog_type prog_type)
 {
 	switch (prog_type) {
+	case BPF_PROG_TYPE_UNSPEC:
+		return BTF_KFUNC_HOOK_COMMON;
 	case BPF_PROG_TYPE_XDP:
 		return BTF_KFUNC_HOOK_XDP;
 	case BPF_PROG_TYPE_SCHED_CLS:
@@ -7559,6 +7562,11 @@ u32 *btf_kfunc_id_set_contains(const struct btf *btf,
 			       u32 kfunc_btf_id)
 {
 	enum btf_kfunc_hook hook;
+	u32 *kfunc_flags;
+
+	kfunc_flags = __btf_kfunc_id_set_contains(btf, BTF_KFUNC_HOOK_COMMON, kfunc_btf_id);
+	if (kfunc_flags)
+		return kfunc_flags;
 
 	hook = bpf_prog_type_to_kfunc_hook(prog_type);
 	return __btf_kfunc_id_set_contains(btf, hook, kfunc_btf_id);
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 89a95f3d854c..6b2de097b950 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -1903,10 +1903,19 @@ static const struct btf_kfunc_id_set generic_kfunc_set = {
 	.set   = &generic_btf_ids,
 };
 
+
 BTF_ID_LIST(generic_dtor_ids)
 BTF_ID(struct, task_struct)
 BTF_ID(func, bpf_task_release)
 
+BTF_SET8_START(common_btf_ids)
+BTF_SET8_END(common_btf_ids)
+
+static const struct btf_kfunc_id_set common_kfunc_set = {
+	.owner = THIS_MODULE,
+	.set   = &common_btf_ids,
+};
+
 static int __init kfunc_init(void)
 {
 	int ret;
@@ -1920,9 +1929,10 @@ static int __init kfunc_init(void)
 	ret = register_btf_kfunc_id_set(BPF_PROG_TYPE_TRACING, &generic_kfunc_set);
 	ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, &generic_kfunc_set);
 	ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_STRUCT_OPS, &generic_kfunc_set);
-	return ret ?: register_btf_id_dtor_kfuncs(generic_dtors,
+	ret = ret ?: register_btf_id_dtor_kfuncs(generic_dtors,
 						  ARRAY_SIZE(generic_dtors),
 						  THIS_MODULE);
+	return ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_UNSPEC, &common_kfunc_set);
 }
 
 late_initcall(kfunc_init);
-- 
2.30.2



* [PATCH bpf-next v4 2/4] bpf: Add a kfunc to type cast from bpf uapi ctx to kernel ctx
  2022-11-20 19:54 [PATCH bpf-next v4 0/4] bpf: Implement two type cast kfuncs Yonghong Song
  2022-11-20 19:54 ` [PATCH bpf-next v4 1/4] bpf: Add support for kfunc set with common btf_ids Yonghong Song
@ 2022-11-20 19:54 ` Yonghong Song
  2022-11-20 19:54 ` [PATCH bpf-next v4 3/4] bpf: Add a kfunc for generic type cast Yonghong Song
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 18+ messages in thread
From: Yonghong Song @ 2022-11-20 19:54 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	John Fastabend, kernel-team, Kumar Kartikeya Dwivedi,
	Martin KaFai Lau

Implement the bpf_cast_to_kern_ctx() kfunc, which type casts
a uapi ctx object to the corresponding kernel ctx. Previously,
if users wanted to access data available in the kctx but not
in the uapi ctx, the bpf_probe_read_kernel() helper was needed.
The introduction of bpf_cast_to_kern_ctx() allows direct
memory access, which makes the code simpler and easier to understand.
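
A minimal usage sketch, adapted from the selftest added in patch 4; it assumes
vmlinux.h and a kernel that has the kfunc registered.

```c
void *bpf_cast_to_kern_ctx(void *) __ksym;

int ifindex;

SEC("xdp")
int md_xdp(struct xdp_md *ctx)
{
	/* The verifier retypes the return value as a trusted PTR_TO_BTF_ID
	 * for struct xdp_buff, so the loads below are plain BTF-checked
	 * memory accesses, no bpf_probe_read_kernel() needed.
	 */
	struct xdp_buff *kctx = bpf_cast_to_kern_ctx(ctx);

	ifindex = kctx->rxq->dev->ifindex;
	return XDP_PASS;
}
```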

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 include/linux/btf.h   |  5 +++++
 kernel/bpf/btf.c      | 20 ++++++++++++++++++++
 kernel/bpf/helpers.c  |  6 ++++++
 kernel/bpf/verifier.c | 22 ++++++++++++++++++++++
 4 files changed, 53 insertions(+)

diff --git a/include/linux/btf.h b/include/linux/btf.h
index d38aa4251c28..9ed00077db6e 100644
--- a/include/linux/btf.h
+++ b/include/linux/btf.h
@@ -487,6 +487,7 @@ const struct btf_member *
 btf_get_prog_ctx_type(struct bpf_verifier_log *log, const struct btf *btf,
 		      const struct btf_type *t, enum bpf_prog_type prog_type,
 		      int arg);
+int get_kern_ctx_btf_id(struct bpf_verifier_log *log, enum bpf_prog_type prog_type);
 bool btf_types_are_same(const struct btf *btf1, u32 id1,
 			const struct btf *btf2, u32 id2);
 #else
@@ -531,6 +532,10 @@ btf_get_prog_ctx_type(struct bpf_verifier_log *log, const struct btf *btf,
 {
 	return NULL;
 }
+static inline int get_kern_ctx_btf_id(struct bpf_verifier_log *log,
+				      enum bpf_prog_type prog_type) {
+	return -EINVAL;
+}
 static inline bool btf_types_are_same(const struct btf *btf1, u32 id1,
 				      const struct btf *btf2, u32 id2)
 {
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 1c78d4df9e18..1a59cc7ad730 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -5603,6 +5603,26 @@ static int btf_translate_to_vmlinux(struct bpf_verifier_log *log,
 	return kern_ctx_type->type;
 }
 
+int get_kern_ctx_btf_id(struct bpf_verifier_log *log, enum bpf_prog_type prog_type)
+{
+	const struct btf_member *kctx_member;
+	const struct btf_type *conv_struct;
+	const struct btf_type *kctx_type;
+	u32 kctx_type_id;
+
+	conv_struct = bpf_ctx_convert.t;
+	/* get member for kernel ctx type */
+	kctx_member = btf_type_member(conv_struct) + bpf_ctx_convert_map[prog_type] * 2 + 1;
+	kctx_type_id = kctx_member->type;
+	kctx_type = btf_type_by_id(btf_vmlinux, kctx_type_id);
+	if (!btf_type_is_struct(kctx_type)) {
+		bpf_log(log, "kern ctx type id %u is not a struct\n", kctx_type_id);
+		return -EINVAL;
+	}
+
+	return kctx_type_id;
+}
+
 BTF_ID_LIST(bpf_ctx_convert_btf_id)
 BTF_ID(struct, bpf_ctx_convert)
 
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 6b2de097b950..a4b9cfcecb00 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -1881,6 +1881,11 @@ void bpf_task_release(struct task_struct *p)
 	put_task_struct_rcu_user(p);
 }
 
+void *bpf_cast_to_kern_ctx(void *obj)
+{
+	return obj;
+}
+
 __diag_pop();
 
 BTF_SET8_START(generic_btf_ids)
@@ -1909,6 +1914,7 @@ BTF_ID(struct, task_struct)
 BTF_ID(func, bpf_task_release)
 
 BTF_SET8_START(common_btf_ids)
+BTF_ID_FLAGS(func, bpf_cast_to_kern_ctx)
 BTF_SET8_END(common_btf_ids)
 
 static const struct btf_kfunc_id_set common_kfunc_set = {
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 5bc9d84d7924..a035356ed5df 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -7907,6 +7907,7 @@ struct bpf_kfunc_call_arg_meta {
 	u32 ref_obj_id;
 	u8 release_regno;
 	bool r0_rdonly;
+	u32 ret_btf_id;
 	u64 r0_size;
 	struct {
 		u64 value;
@@ -8151,6 +8152,7 @@ enum special_kfunc_type {
 	KF_bpf_list_push_back,
 	KF_bpf_list_pop_front,
 	KF_bpf_list_pop_back,
+	KF_bpf_cast_to_kern_ctx,
 };
 
 BTF_SET_START(special_kfunc_set)
@@ -8160,6 +8162,7 @@ BTF_ID(func, bpf_list_push_front)
 BTF_ID(func, bpf_list_push_back)
 BTF_ID(func, bpf_list_pop_front)
 BTF_ID(func, bpf_list_pop_back)
+BTF_ID(func, bpf_cast_to_kern_ctx)
 BTF_SET_END(special_kfunc_set)
 
 BTF_ID_LIST(special_kfunc_list)
@@ -8169,6 +8172,7 @@ BTF_ID(func, bpf_list_push_front)
 BTF_ID(func, bpf_list_push_back)
 BTF_ID(func, bpf_list_pop_front)
 BTF_ID(func, bpf_list_pop_back)
+BTF_ID(func, bpf_cast_to_kern_ctx)
 
 static enum kfunc_ptr_arg_type
 get_kfunc_ptr_arg_type(struct bpf_verifier_env *env,
@@ -8182,6 +8186,9 @@ get_kfunc_ptr_arg_type(struct bpf_verifier_env *env,
 	struct bpf_reg_state *reg = &regs[regno];
 	bool arg_mem_size = false;
 
+	if (meta->func_id == special_kfunc_list[KF_bpf_cast_to_kern_ctx])
+		return KF_ARG_PTR_TO_CTX;
+
 	/* In this function, we verify the kfunc's BTF as per the argument type,
 	 * leaving the rest of the verification with respect to the register
 	 * type to our caller. When a set of conditions hold in the BTF type of
@@ -8668,6 +8675,13 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
 				verbose(env, "arg#%d expected pointer to ctx, but got %s\n", i, btf_type_str(t));
 				return -EINVAL;
 			}
+
+			if (meta->func_id == special_kfunc_list[KF_bpf_cast_to_kern_ctx]) {
+				ret = get_kern_ctx_btf_id(&env->log, resolve_prog_type(env->prog));
+				if (ret < 0)
+					return -EINVAL;
+				meta->ret_btf_id  = ret;
+			}
 			break;
 		case KF_ARG_PTR_TO_ALLOC_BTF_ID:
 			if (reg->type != (PTR_TO_BTF_ID | MEM_ALLOC)) {
@@ -8919,6 +8933,11 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 				regs[BPF_REG_0].btf = field->list_head.btf;
 				regs[BPF_REG_0].btf_id = field->list_head.value_btf_id;
 				regs[BPF_REG_0].off = field->list_head.node_offset;
+			} else if (meta.func_id == special_kfunc_list[KF_bpf_cast_to_kern_ctx]) {
+				mark_reg_known_zero(env, regs, BPF_REG_0);
+				regs[BPF_REG_0].type = PTR_TO_BTF_ID | PTR_TRUSTED;
+				regs[BPF_REG_0].btf = desc_btf;
+				regs[BPF_REG_0].btf_id = meta.ret_btf_id;
 			} else {
 				verbose(env, "kernel function %s unhandled dynamic return type\n",
 					meta.func_name);
@@ -15172,6 +15191,9 @@ static int fixup_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 		insn_buf[1] = addr[1];
 		insn_buf[2] = *insn;
 		*cnt = 3;
+	} else if (desc->func_id == special_kfunc_list[KF_bpf_cast_to_kern_ctx]) {
+		insn_buf[0] = BPF_MOV64_REG(BPF_REG_0, BPF_REG_1);
+		*cnt = 1;
 	}
 	return 0;
 }
-- 
2.30.2



* [PATCH bpf-next v4 3/4] bpf: Add a kfunc for generic type cast
  2022-11-20 19:54 [PATCH bpf-next v4 0/4] bpf: Implement two type cast kfuncs Yonghong Song
  2022-11-20 19:54 ` [PATCH bpf-next v4 1/4] bpf: Add support for kfunc set with common btf_ids Yonghong Song
  2022-11-20 19:54 ` [PATCH bpf-next v4 2/4] bpf: Add a kfunc to type cast from bpf uapi ctx to kernel ctx Yonghong Song
@ 2022-11-20 19:54 ` Yonghong Song
  2022-11-20 20:16   ` Alexei Starovoitov
  2022-11-20 19:54 ` [PATCH bpf-next v4 4/4] bpf: Add type cast unit tests Yonghong Song
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 18+ messages in thread
From: Yonghong Song @ 2022-11-20 19:54 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	John Fastabend, kernel-team, Kumar Kartikeya Dwivedi,
	Martin KaFai Lau

Implement bpf_rdonly_cast(), which casts an object
to a specified type. This supports use cases like the following:
  #define skb_shinfo(SKB) ((struct skb_shared_info *)(skb_end_pointer(SKB)))
where skb_end_pointer(SKB) is an 'unsigned char *' and needs to
be cast to 'struct skb_shared_info *'.

The signature of bpf_rdonly_cast() looks like
   void *bpf_rdonly_cast(void *obj, __u32 btf_id)
The function returns the same 'obj' but as a PTR_TO_BTF_ID with the
given btf_id. The verifier ensures that btf_id refers to a struct type.

Since the requested type cast may not reflect what the 'obj'
actually represents, the returned pointer is marked as PTR_UNTRUSTED, so
neither the return value nor pointers chased from it can be
used as helper/kfunc arguments.
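
A minimal usage sketch, adapted from the tc selftest in patch 4; it assumes
vmlinux.h, bpf_core_read.h, and a kernel providing both kfuncs.

```c
void *bpf_cast_to_kern_ctx(void *) __ksym;
void *bpf_rdonly_cast(void *, __u32) __ksym;

SEC("tc")
int md_skb(struct __sk_buff *skb)
{
	struct sk_buff *kskb = bpf_cast_to_kern_ctx(skb);
	struct skb_shared_info *shinfo;

	/* Emulate the kernel's skb_shinfo() macro: the result is
	 * PTR_UNTRUSTED, readable but not usable as a kfunc/helper arg.
	 */
	shinfo = bpf_rdonly_cast(kskb->head + kskb->end,
				 bpf_core_type_id_kernel(struct skb_shared_info));
	return shinfo->meta_len;
}
```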

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 kernel/bpf/helpers.c  |  6 ++++++
 kernel/bpf/verifier.c | 26 ++++++++++++++++++++++++--
 2 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index a4b9cfcecb00..89310c2ca03e 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -1886,6 +1886,11 @@ void *bpf_cast_to_kern_ctx(void *obj)
 	return obj;
 }
 
+void *bpf_rdonly_cast(void *obj__ign, u32 btf_id__k)
+{
+	return obj__ign;
+}
+
 __diag_pop();
 
 BTF_SET8_START(generic_btf_ids)
@@ -1915,6 +1920,7 @@ BTF_ID(func, bpf_task_release)
 
 BTF_SET8_START(common_btf_ids)
 BTF_ID_FLAGS(func, bpf_cast_to_kern_ctx)
+BTF_ID_FLAGS(func, bpf_rdonly_cast)
 BTF_SET8_END(common_btf_ids)
 
 static const struct btf_kfunc_id_set common_kfunc_set = {
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index a035356ed5df..8e01a7eebf7f 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -8153,6 +8153,7 @@ enum special_kfunc_type {
 	KF_bpf_list_pop_front,
 	KF_bpf_list_pop_back,
 	KF_bpf_cast_to_kern_ctx,
+	KF_bpf_rdonly_cast,
 };
 
 BTF_SET_START(special_kfunc_set)
@@ -8163,6 +8164,7 @@ BTF_ID(func, bpf_list_push_back)
 BTF_ID(func, bpf_list_pop_front)
 BTF_ID(func, bpf_list_pop_back)
 BTF_ID(func, bpf_cast_to_kern_ctx)
+BTF_ID(func, bpf_rdonly_cast)
 BTF_SET_END(special_kfunc_set)
 
 BTF_ID_LIST(special_kfunc_list)
@@ -8173,6 +8175,7 @@ BTF_ID(func, bpf_list_push_back)
 BTF_ID(func, bpf_list_pop_front)
 BTF_ID(func, bpf_list_pop_back)
 BTF_ID(func, bpf_cast_to_kern_ctx)
+BTF_ID(func, bpf_rdonly_cast)
 
 static enum kfunc_ptr_arg_type
 get_kfunc_ptr_arg_type(struct bpf_verifier_env *env,
@@ -8809,6 +8812,7 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 	u32 i, nargs, func_id, ptr_type_id;
 	int err, insn_idx = *insn_idx_p;
 	const struct btf_param *args;
+	const struct btf_type *ret_t;
 	struct btf *desc_btf;
 	u32 *kfunc_flags;
 
@@ -8888,7 +8892,6 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 
 		if (meta.btf == btf_vmlinux && btf_id_set_contains(&special_kfunc_set, meta.func_id)) {
 			if (meta.func_id == special_kfunc_list[KF_bpf_obj_new_impl]) {
-				const struct btf_type *ret_t;
 				struct btf *ret_btf;
 				u32 ret_btf_id;
 
@@ -8938,6 +8941,24 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 				regs[BPF_REG_0].type = PTR_TO_BTF_ID | PTR_TRUSTED;
 				regs[BPF_REG_0].btf = desc_btf;
 				regs[BPF_REG_0].btf_id = meta.ret_btf_id;
+			} else if (meta.func_id == special_kfunc_list[KF_bpf_rdonly_cast]) {
+				if (!capable(CAP_PERFMON)) {
+					verbose(env,
+						"kfunc bpf_rdonly_cast requires CAP_PERFMON capability\n");
+					return -EACCES;
+				}
+
+				ret_t = btf_type_by_id(desc_btf, meta.arg_constant.value);
+				if (!ret_t || !btf_type_is_struct(ret_t)) {
+					verbose(env,
+						"kfunc bpf_rdonly_cast type ID argument must be of a struct\n");
+					return -EINVAL;
+				}
+
+				mark_reg_known_zero(env, regs, BPF_REG_0);
+				regs[BPF_REG_0].type = PTR_TO_BTF_ID | PTR_UNTRUSTED;
+				regs[BPF_REG_0].btf = desc_btf;
+				regs[BPF_REG_0].btf_id = meta.arg_constant.value;
 			} else {
 				verbose(env, "kernel function %s unhandled dynamic return type\n",
 					meta.func_name);
@@ -15191,7 +15212,8 @@ static int fixup_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 		insn_buf[1] = addr[1];
 		insn_buf[2] = *insn;
 		*cnt = 3;
-	} else if (desc->func_id == special_kfunc_list[KF_bpf_cast_to_kern_ctx]) {
+	} else if (desc->func_id == special_kfunc_list[KF_bpf_cast_to_kern_ctx] ||
+		   desc->func_id == special_kfunc_list[KF_bpf_rdonly_cast]) {
 		insn_buf[0] = BPF_MOV64_REG(BPF_REG_0, BPF_REG_1);
 		*cnt = 1;
 	}
-- 
2.30.2



* [PATCH bpf-next v4 4/4] bpf: Add type cast unit tests
  2022-11-20 19:54 [PATCH bpf-next v4 0/4] bpf: Implement two type cast kfuncs Yonghong Song
                   ` (2 preceding siblings ...)
  2022-11-20 19:54 ` [PATCH bpf-next v4 3/4] bpf: Add a kfunc for generic type cast Yonghong Song
@ 2022-11-20 19:54 ` Yonghong Song
  2022-11-21  0:00 ` [PATCH bpf-next v4 0/4] bpf: Implement two type cast kfuncs patchwork-bot+netdevbpf
  2022-11-21  2:10 ` John Fastabend
  5 siblings, 0 replies; 18+ messages in thread
From: Yonghong Song @ 2022-11-20 19:54 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	John Fastabend, kernel-team, Kumar Kartikeya Dwivedi,
	Martin KaFai Lau

Three tests are added. One is from John Fastabend ([1]) which tests
tracing-style access to the kernel ctx from an xdp program.
Another is a tc test exercising both tracing-style kernel ctx access
and an explicit non-ctx type cast. The third covers two negative
cases: a tp_btf test where bpf_rdonly_cast()
returns an untrusted pointer which cannot be used as a helper argument,
and a tracepoint test where the kernel ctx is a u64.

The test is also added to DENYLIST.s390x since the s390 JIT does not
currently support calling kernel functions.

  [1] https://lore.kernel.org/bpf/20221109215242.1279993-1-john.fastabend@gmail.com/

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 tools/testing/selftests/bpf/DENYLIST.s390x    |   1 +
 .../selftests/bpf/prog_tests/type_cast.c      | 114 ++++++++++++++++++
 tools/testing/selftests/bpf/progs/type_cast.c |  83 +++++++++++++
 3 files changed, 198 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/type_cast.c
 create mode 100644 tools/testing/selftests/bpf/progs/type_cast.c

diff --git a/tools/testing/selftests/bpf/DENYLIST.s390x b/tools/testing/selftests/bpf/DENYLIST.s390x
index f70a677b38e5..12cf2159975e 100644
--- a/tools/testing/selftests/bpf/DENYLIST.s390x
+++ b/tools/testing/selftests/bpf/DENYLIST.s390x
@@ -71,6 +71,7 @@ trace_printk                             # trace_printk__load unexpected error:
 trace_vprintk                            # trace_vprintk__open_and_load unexpected error: -9                           (?)
 tracing_struct                           # failed to auto-attach: -524                                                 (trampoline)
 trampoline_count                         # prog 'prog1': failed to attach: ERROR: strerror_r(-524)=22                  (trampoline)
+type_cast                                # JIT does not support calling kernel function
 unpriv_bpf_disabled                      # fentry
 user_ringbuf                             # failed to find kernel BTF type ID of '__s390x_sys_prctl': -3                (?)
 verif_stats                              # trace_vprintk__open_and_load unexpected error: -9                           (?)
diff --git a/tools/testing/selftests/bpf/prog_tests/type_cast.c b/tools/testing/selftests/bpf/prog_tests/type_cast.c
new file mode 100644
index 000000000000..9317d5fa2635
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/type_cast.c
@@ -0,0 +1,114 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */
+#include <test_progs.h>
+#include <network_helpers.h>
+#include "type_cast.skel.h"
+
+static void test_xdp(void)
+{
+	struct type_cast *skel;
+	int err, prog_fd;
+	char buf[128];
+
+	LIBBPF_OPTS(bpf_test_run_opts, topts,
+		.data_in = &pkt_v4,
+		.data_size_in = sizeof(pkt_v4),
+		.data_out = buf,
+		.data_size_out = sizeof(buf),
+		.repeat = 1,
+	);
+
+	skel = type_cast__open();
+	if (!ASSERT_OK_PTR(skel, "skel_open"))
+		return;
+
+	bpf_program__set_autoload(skel->progs.md_xdp, true);
+	err = type_cast__load(skel);
+	if (!ASSERT_OK(err, "skel_load"))
+		goto out;
+
+	prog_fd = bpf_program__fd(skel->progs.md_xdp);
+	err = bpf_prog_test_run_opts(prog_fd, &topts);
+	ASSERT_OK(err, "test_run");
+	ASSERT_EQ(topts.retval, XDP_PASS, "xdp test_run retval");
+
+	ASSERT_EQ(skel->bss->ifindex, 1, "xdp_md ifindex");
+	ASSERT_EQ(skel->bss->ifindex, skel->bss->ingress_ifindex, "xdp_md ingress_ifindex");
+	ASSERT_STREQ(skel->bss->name, "lo", "xdp_md name");
+	ASSERT_NEQ(skel->bss->inum, 0, "xdp_md inum");
+
+out:
+	type_cast__destroy(skel);
+}
+
+static void test_tc(void)
+{
+	struct type_cast *skel;
+	int err, prog_fd;
+
+	LIBBPF_OPTS(bpf_test_run_opts, topts,
+		.data_in = &pkt_v4,
+		.data_size_in = sizeof(pkt_v4),
+		.repeat = 1,
+	);
+
+	skel = type_cast__open();
+	if (!ASSERT_OK_PTR(skel, "skel_open"))
+		return;
+
+	bpf_program__set_autoload(skel->progs.md_skb, true);
+	err = type_cast__load(skel);
+	if (!ASSERT_OK(err, "skel_load"))
+		goto out;
+
+	prog_fd = bpf_program__fd(skel->progs.md_skb);
+	err = bpf_prog_test_run_opts(prog_fd, &topts);
+	ASSERT_OK(err, "test_run");
+	ASSERT_EQ(topts.retval, 0, "tc test_run retval");
+
+	ASSERT_EQ(skel->bss->meta_len, 0, "skb meta_len");
+	ASSERT_EQ(skel->bss->frag0_len, 0, "skb frag0_len");
+	ASSERT_NEQ(skel->bss->kskb_len, 0, "skb len");
+	ASSERT_NEQ(skel->bss->kskb2_len, 0, "skb2 len");
+	ASSERT_EQ(skel->bss->kskb_len, skel->bss->kskb2_len, "skb len compare");
+
+out:
+	type_cast__destroy(skel);
+}
+
+static const char * const negative_tests[] = {
+	"untrusted_ptr",
+	"kctx_u64",
+};
+
+static void test_negative(void)
+{
+	struct bpf_program *prog;
+	struct type_cast *skel;
+	int i, err;
+
+	for (i = 0; i < ARRAY_SIZE(negative_tests); i++) {
+		skel = type_cast__open();
+		if (!ASSERT_OK_PTR(skel, "skel_open"))
+			return;
+
+		prog = bpf_object__find_program_by_name(skel->obj, negative_tests[i]);
+		if (!ASSERT_OK_PTR(prog, "bpf_object__find_program_by_name"))
+			goto out;
+		bpf_program__set_autoload(prog, true);
+		err = type_cast__load(skel);
+		ASSERT_ERR(err, "skel_load");
+out:
+		type_cast__destroy(skel);
+	}
+}
+
+void test_type_cast(void)
+{
+	if (test__start_subtest("xdp"))
+		test_xdp();
+	if (test__start_subtest("tc"))
+		test_tc();
+	if (test__start_subtest("negative"))
+		test_negative();
+}
diff --git a/tools/testing/selftests/bpf/progs/type_cast.c b/tools/testing/selftests/bpf/progs/type_cast.c
new file mode 100644
index 000000000000..eb78e6f03129
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/type_cast.c
@@ -0,0 +1,83 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */
+#include "vmlinux.h"
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+#include <bpf/bpf_core_read.h>
+
+struct {
+	__uint(type, BPF_MAP_TYPE_TASK_STORAGE);
+	__uint(map_flags, BPF_F_NO_PREALLOC);
+	__type(key, int);
+	__type(value, long);
+} enter_id SEC(".maps");
+
+#define	IFNAMSIZ 16
+
+int ifindex, ingress_ifindex;
+char name[IFNAMSIZ];
+unsigned int inum;
+unsigned int meta_len, frag0_len, kskb_len, kskb2_len;
+
+void *bpf_cast_to_kern_ctx(void *) __ksym;
+void *bpf_rdonly_cast(void *, __u32) __ksym;
+
+SEC("?xdp")
+int md_xdp(struct xdp_md *ctx)
+{
+	struct xdp_buff *kctx = bpf_cast_to_kern_ctx(ctx);
+	struct net_device *dev;
+
+	dev = kctx->rxq->dev;
+	ifindex = dev->ifindex;
+	inum = dev->nd_net.net->ns.inum;
+	__builtin_memcpy(name, dev->name, IFNAMSIZ);
+	ingress_ifindex = ctx->ingress_ifindex;
+	return XDP_PASS;
+}
+
+SEC("?tc")
+int md_skb(struct __sk_buff *skb)
+{
+	struct sk_buff *kskb = bpf_cast_to_kern_ctx(skb);
+	struct skb_shared_info *shared_info;
+	struct sk_buff *kskb2;
+
+	kskb_len = kskb->len;
+
+	/* Simulate the following kernel macro:
+	 *   #define skb_shinfo(SKB) ((struct skb_shared_info *)(skb_end_pointer(SKB)))
+	 */
+	shared_info = bpf_rdonly_cast(kskb->head + kskb->end,
+		bpf_core_type_id_kernel(struct skb_shared_info));
+	meta_len = shared_info->meta_len;
+	frag0_len = shared_info->frag_list->len;
+
+	/* kskb2 should be equal to kskb */
+	kskb2 = bpf_rdonly_cast(kskb, bpf_core_type_id_kernel(struct sk_buff));
+	kskb2_len = kskb2->len;
+	return 0;
+}
+
+SEC("?tp_btf/sys_enter")
+int BPF_PROG(untrusted_ptr, struct pt_regs *regs, long id)
+{
+	struct task_struct *task, *task_dup;
+	long *ptr;
+
+	task = bpf_get_current_task_btf();
+	task_dup = bpf_rdonly_cast(task, bpf_core_type_id_kernel(struct task_struct));
+	(void)bpf_task_storage_get(&enter_id, task_dup, 0, 0);
+	return 0;
+}
+
+SEC("?tracepoint/syscalls/sys_enter_nanosleep")
+int kctx_u64(void *ctx)
+{
+	u64 *kctx = bpf_rdonly_cast(ctx, bpf_core_type_id_kernel(u64));
+
+	(void)kctx;
+	return 0;
+}
+
+char _license[] SEC("license") = "GPL";
-- 
2.30.2



* Re: [PATCH bpf-next v4 3/4] bpf: Add a kfunc for generic type cast
  2022-11-20 19:54 ` [PATCH bpf-next v4 3/4] bpf: Add a kfunc for generic type cast Yonghong Song
@ 2022-11-20 20:16   ` Alexei Starovoitov
  2022-11-20 20:49     ` Kumar Kartikeya Dwivedi
  0 siblings, 1 reply; 18+ messages in thread
From: Alexei Starovoitov @ 2022-11-20 20:16 UTC (permalink / raw)
  To: Yonghong Song
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	John Fastabend, Kernel Team, Kumar Kartikeya Dwivedi,
	Martin KaFai Lau

On Sun, Nov 20, 2022 at 11:57 AM Yonghong Song <yhs@fb.com> wrote:
>
> @@ -8938,6 +8941,24 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
>                                 regs[BPF_REG_0].type = PTR_TO_BTF_ID | PTR_TRUSTED;
>                                 regs[BPF_REG_0].btf = desc_btf;
>                                 regs[BPF_REG_0].btf_id = meta.ret_btf_id;
> +                       } else if (meta.func_id == special_kfunc_list[KF_bpf_rdonly_cast]) {
> +                               if (!capable(CAP_PERFMON)) {
> +                                       verbose(env,
> +                                               "kfunc bpf_rdonly_cast requires CAP_PERFMON capability\n");
> +                                       return -EACCES;
> +                               }

Just realized that bpf_cast_to_kern_ctx() has to be
gated by cap_perfmon as well.

Also the direct capable(CAP_PERFMON) is not quite correct.
It should at least be perfmon_capable().
But even better to use env->allow_ptr_leaks here.


* Re: [PATCH bpf-next v4 3/4] bpf: Add a kfunc for generic type cast
  2022-11-20 20:16   ` Alexei Starovoitov
@ 2022-11-20 20:49     ` Kumar Kartikeya Dwivedi
  2022-11-20 22:34       ` Alexei Starovoitov
  0 siblings, 1 reply; 18+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-11-20 20:49 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Yonghong Song, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, John Fastabend, Kernel Team, Martin KaFai Lau

On Mon, Nov 21, 2022 at 01:46:04AM IST, Alexei Starovoitov wrote:
> On Sun, Nov 20, 2022 at 11:57 AM Yonghong Song <yhs@fb.com> wrote:
> >
> > @@ -8938,6 +8941,24 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
> >                                 regs[BPF_REG_0].type = PTR_TO_BTF_ID | PTR_TRUSTED;
> >                                 regs[BPF_REG_0].btf = desc_btf;
> >                                 regs[BPF_REG_0].btf_id = meta.ret_btf_id;
> > +                       } else if (meta.func_id == special_kfunc_list[KF_bpf_rdonly_cast]) {
> > +                               if (!capable(CAP_PERFMON)) {
> > +                                       verbose(env,
> > +                                               "kfunc bpf_rdonly_cast requires CAP_PERFMON capability\n");
> > +                                       return -EACCES;
> > +                               }
>
> Just realized that bpf_cast_to_kern_ctx() has to be
> gated by cap_perfmon as well.
>
> Also the direct capable(CAP_PERFMON) is not quite correct.
> It should at least be perfmon_capable().
> But even better to use env->allow_ptr_leaks here.

Based on this, I wonder if this needs to be done for bpf_obj_new as well? It
doesn't zero initialize the memory it returns (except special fields, which is
required for correctness), so technically it allows leaking kernel addresses
with just CAP_BPF (apart from capabilities needed for the specific program types
it is available to).

Should that also have a env->allow_ptr_leaks check?


* Re: [PATCH bpf-next v4 3/4] bpf: Add a kfunc for generic type cast
  2022-11-20 20:49     ` Kumar Kartikeya Dwivedi
@ 2022-11-20 22:34       ` Alexei Starovoitov
  2022-11-20 23:32         ` Alexei Starovoitov
  0 siblings, 1 reply; 18+ messages in thread
From: Alexei Starovoitov @ 2022-11-20 22:34 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: Yonghong Song, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, John Fastabend, Kernel Team, Martin KaFai Lau

On Mon, Nov 21, 2022 at 02:19:30AM +0530, Kumar Kartikeya Dwivedi wrote:
> On Mon, Nov 21, 2022 at 01:46:04AM IST, Alexei Starovoitov wrote:
> > On Sun, Nov 20, 2022 at 11:57 AM Yonghong Song <yhs@fb.com> wrote:
> > >
> > > @@ -8938,6 +8941,24 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
> > >                                 regs[BPF_REG_0].type = PTR_TO_BTF_ID | PTR_TRUSTED;
> > >                                 regs[BPF_REG_0].btf = desc_btf;
> > >                                 regs[BPF_REG_0].btf_id = meta.ret_btf_id;
> > > +                       } else if (meta.func_id == special_kfunc_list[KF_bpf_rdonly_cast]) {
> > > +                               if (!capable(CAP_PERFMON)) {
> > > +                                       verbose(env,
> > > +                                               "kfunc bpf_rdonly_cast requires CAP_PERFMON capability\n");
> > > +                                       return -EACCES;
> > > +                               }
> >
> > Just realized that bpf_cast_to_kern_ctx() has to be
> > gated by cap_perfmon as well.
> >
> > Also the direct capable(CAP_PERFMON) is not quite correct.
> > It should at least be perfmon_capable().
> > But even better to use env->allow_ptr_leaks here.
> 
> Based on this, I wonder if this needs to be done for bpf_obj_new as well? It
> doesn't zero initialize the memory it returns (except special fields, which is
> required for correctness), so technically it allows leaking kernel addresses
> with just CAP_BPF (apart from capabilities needed for the specific program types
> it is available to).
> 
> Should that also have a env->allow_ptr_leaks check?

Yeah. Good point.
My first reaction was to audit everything where the verifier produces
PTR_TO_BTF_ID and gate it with allow_ptr_leaks.
But then it looks simpler to gate it once in check_ptr_to_btf_access().
Then bpf_rdonly_cast() and everything else wouldn't need those checks.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH bpf-next v4 3/4] bpf: Add a kfunc for generic type cast
  2022-11-20 22:34       ` Alexei Starovoitov
@ 2022-11-20 23:32         ` Alexei Starovoitov
  2022-11-20 23:47           ` Alexei Starovoitov
  0 siblings, 1 reply; 18+ messages in thread
From: Alexei Starovoitov @ 2022-11-20 23:32 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: Yonghong Song, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, John Fastabend, Kernel Team, Martin KaFai Lau

On Sun, Nov 20, 2022 at 2:34 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Mon, Nov 21, 2022 at 02:19:30AM +0530, Kumar Kartikeya Dwivedi wrote:
> > On Mon, Nov 21, 2022 at 01:46:04AM IST, Alexei Starovoitov wrote:
> > > On Sun, Nov 20, 2022 at 11:57 AM Yonghong Song <yhs@fb.com> wrote:
> > > >
> > > > @@ -8938,6 +8941,24 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
> > > >                                 regs[BPF_REG_0].type = PTR_TO_BTF_ID | PTR_TRUSTED;
> > > >                                 regs[BPF_REG_0].btf = desc_btf;
> > > >                                 regs[BPF_REG_0].btf_id = meta.ret_btf_id;
> > > > +                       } else if (meta.func_id == special_kfunc_list[KF_bpf_rdonly_cast]) {
> > > > +                               if (!capable(CAP_PERFMON)) {
> > > > +                                       verbose(env,
> > > > +                                               "kfunc bpf_rdonly_cast requires CAP_PERFMON capability\n");
> > > > +                                       return -EACCES;
> > > > +                               }
> > >
> > > Just realized that bpf_cast_to_kern_ctx() has to be
> > > gated by cap_perfmon as well.
> > >
> > > Also the direct capable(CAP_PERFMON) is not quite correct.
> > > It should at least be perfmon_capable().
> > > But even better to use env->allow_ptr_leaks here.
> >
> > Based on this, I wonder if this needs to be done for bpf_obj_new as well? It
> > doesn't zero initialize the memory it returns (except special fields, which is
> > required for correctness), so technically it allows leaking kernel addresses
> > with just CAP_BPF (apart from capabilities needed for the specific program types
> > it is available to).
> >
> > Should that also have a env->allow_ptr_leaks check?
>
> Yeah. Good point.
> My first reaction was to audit everything where the verifier produces
> PTR_TO_BTF_ID and gate it with allow_ptr_leaks.
> But then it looks simpler to gate it once in check_ptr_to_btf_access().
> Then bpf_rdonly_cast and everything wouldn't need those checks.

Noticed that check_ptr_to_map_access() is doing
"Simulate access to a PTR_TO_BTF_ID"
and has a weird allow_ptr_to_map_access bool
which is the same as allow_ptr_leaks.
So I'm thinking we can remove allow_ptr_to_map_access
and add an allow_ptr_leaks check to btf_struct_access(),
which will cover all these cases.

Also, since bpf_cast_to_kern_ctx() is expected to be used from
networking progs, and those progs are not always GPL, we should add
an env->prog->gpl_compatible check to btf_struct_access() too.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH bpf-next v4 3/4] bpf: Add a kfunc for generic type cast
  2022-11-20 23:32         ` Alexei Starovoitov
@ 2022-11-20 23:47           ` Alexei Starovoitov
  0 siblings, 0 replies; 18+ messages in thread
From: Alexei Starovoitov @ 2022-11-20 23:47 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: Yonghong Song, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, John Fastabend, Kernel Team, Martin KaFai Lau

On Sun, Nov 20, 2022 at 3:32 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Sun, Nov 20, 2022 at 2:34 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Mon, Nov 21, 2022 at 02:19:30AM +0530, Kumar Kartikeya Dwivedi wrote:
> > > On Mon, Nov 21, 2022 at 01:46:04AM IST, Alexei Starovoitov wrote:
> > > > On Sun, Nov 20, 2022 at 11:57 AM Yonghong Song <yhs@fb.com> wrote:
> > > > >
> > > > > @@ -8938,6 +8941,24 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
> > > > >                                 regs[BPF_REG_0].type = PTR_TO_BTF_ID | PTR_TRUSTED;
> > > > >                                 regs[BPF_REG_0].btf = desc_btf;
> > > > >                                 regs[BPF_REG_0].btf_id = meta.ret_btf_id;
> > > > > +                       } else if (meta.func_id == special_kfunc_list[KF_bpf_rdonly_cast]) {
> > > > > +                               if (!capable(CAP_PERFMON)) {
> > > > > +                                       verbose(env,
> > > > > +                                               "kfunc bpf_rdonly_cast requires CAP_PERFMON capability\n");
> > > > > +                                       return -EACCES;
> > > > > +                               }
> > > >
> > > > Just realized that bpf_cast_to_kern_ctx() has to be
> > > > gated by cap_perfmon as well.
> > > >
> > > > Also the direct capable(CAP_PERFMON) is not quite correct.
> > > > It should at least be perfmon_capable().
> > > > But even better to use env->allow_ptr_leaks here.
> > >
> > > Based on this, I wonder if this needs to be done for bpf_obj_new as well? It
> > > doesn't zero initialize the memory it returns (except special fields, which is
> > > required for correctness), so technically it allows leaking kernel addresses
> > > with just CAP_BPF (apart from capabilities needed for the specific program types
> > > it is available to).
> > >
> > > Should that also have a env->allow_ptr_leaks check?
> >
> > Yeah. Good point.
> > My first reaction was to audit everything where the verifier produces
> > PTR_TO_BTF_ID and gate it with allow_ptr_leaks.
> > But then it looks simpler to gate it once in check_ptr_to_btf_access().
> > Then bpf_rdonly_cast and everything wouldn't need those checks.
>
> Noticed that check_ptr_to_map_access is doing
> "Simulate access to a PTR_TO_BTF_ID"
> and has weird allow_ptr_to_map_access bool
> which is the same as allow_ptr_leaks.
> So I'm thinking we can remove allow_ptr_to_map_access
> and add allow_ptr_leaks check to btf_struct_access()
> which will cover all these cases.
>
> Also since bpf_cast_to_kern_ctx() is expected to be used out of
> networking progs and those progs are not always GPL we should add
> env->prog->gpl_compatible to btf_struct_access() too.

Since that follow-up will cover bpf_rdonly_cast() too,
I've removed the cap_perfmon check from this commit and will push
this series shortly.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH bpf-next v4 0/4] bpf: Implement two type cast kfuncs
  2022-11-20 19:54 [PATCH bpf-next v4 0/4] bpf: Implement two type cast kfuncs Yonghong Song
                   ` (3 preceding siblings ...)
  2022-11-20 19:54 ` [PATCH bpf-next v4 4/4] bpf: Add type cast unit tests Yonghong Song
@ 2022-11-21  0:00 ` patchwork-bot+netdevbpf
  2022-11-21  2:10 ` John Fastabend
  5 siblings, 0 replies; 18+ messages in thread
From: patchwork-bot+netdevbpf @ 2022-11-21  0:00 UTC (permalink / raw)
  To: Yonghong Song
  Cc: bpf, ast, andrii, daniel, john.fastabend, kernel-team, memxor,
	martin.lau

Hello:

This series was applied to bpf/bpf-next.git (master)
by Alexei Starovoitov <ast@kernel.org>:

On Sun, 20 Nov 2022 11:54:21 -0800 you wrote:
> Currenty, a non-tracing bpf program typically has a single 'context' argument
> with predefined uapi struct type. Following these uapi struct, user is able
> to access other fields defined in uapi header. Inside the kernel, the
> user-seen 'context' argument is replaced with 'kernel context' (or 'kctx'
> in short) which can access more information than what uapi header provides.
> To access other info not in uapi header, people typically do two things:
>   (1). extend uapi to access more fields rooted from 'context'.
>   (2). use bpf_probe_read_kernl() helper to read particular field based on
>     kctx.
> Using (1) needs uapi change and using (2) makes code more complex since
> direct memory access is not allowed.
> 
> [...]

Here is the summary with links:
  - [bpf-next,v4,1/4] bpf: Add support for kfunc set with common btf_ids
    https://git.kernel.org/bpf/bpf-next/c/cfe1456440c8
  - [bpf-next,v4,2/4] bpf: Add a kfunc to type cast from bpf uapi ctx to kernel ctx
    https://git.kernel.org/bpf/bpf-next/c/fd264ca02094
  - [bpf-next,v4,3/4] bpf: Add a kfunc for generic type cast
    https://git.kernel.org/bpf/bpf-next/c/a35b9af4ec2c
  - [bpf-next,v4,4/4] bpf: Add type cast unit tests
    https://git.kernel.org/bpf/bpf-next/c/58d84bee5846

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [PATCH bpf-next v4 0/4] bpf: Implement two type cast kfuncs
  2022-11-20 19:54 [PATCH bpf-next v4 0/4] bpf: Implement two type cast kfuncs Yonghong Song
                   ` (4 preceding siblings ...)
  2022-11-21  0:00 ` [PATCH bpf-next v4 0/4] bpf: Implement two type cast kfuncs patchwork-bot+netdevbpf
@ 2022-11-21  2:10 ` John Fastabend
  2022-11-21 17:29   ` Yonghong Song
  5 siblings, 1 reply; 18+ messages in thread
From: John Fastabend @ 2022-11-21  2:10 UTC (permalink / raw)
  To: Yonghong Song, bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	John Fastabend, kernel-team, Kumar Kartikeya Dwivedi,
	Martin KaFai Lau

Yonghong Song wrote:
> Currenty, a non-tracing bpf program typically has a single 'context' argument
> with predefined uapi struct type. Following these uapi struct, user is able
> to access other fields defined in uapi header. Inside the kernel, the
> user-seen 'context' argument is replaced with 'kernel context' (or 'kctx'
> in short) which can access more information than what uapi header provides.
> To access other info not in uapi header, people typically do two things:
>   (1). extend uapi to access more fields rooted from 'context'.
>   (2). use bpf_probe_read_kernl() helper to read particular field based on
>     kctx.
> Using (1) needs uapi change and using (2) makes code more complex since
> direct memory access is not allowed.
> 
> There are already a few instances trying to access more information from
> kctx:
>   . trying to access some fields from perf_event kctx ([1]).
>   . trying to access some fields from xdp kctx ([2]).
> 
> This patch set tried to allow direct memory access for kctx fields
> by introducing bpf_cast_to_kern_ctx() kfunc.
> 
> Martin mentioned a use case like type casting below:
>   #define skb_shinfo(SKB) ((struct skb_shared_info *)(skb_end_pointer(SKB)))
> basically a 'unsigned char *" casted to 'struct skb_shared_info *'. This patch
> set tries to support such a use case as well with bpf_rdonly_cast().
> 
> For the patch series, Patch 1 added support for a kfunc available to all
> prog types. Patch 2 added bpf_cast_to_kern_ctx() kfunc. Patch 3 added
> bpf_rdonly_cast() kfunc. Patch 4 added a few positive and negative tests.
> 
>   [1] https://lore.kernel.org/bpf/ad15b398-9069-4a0e-48cb-4bb651ec3088@meta.com/
>   [2] https://lore.kernel.org/bpf/20221109215242.1279993-1-john.fastabend@gmail.com/
> 
> Changelog:
>   v3 -> v4:
>     - remove unnecessary bpf_ctx_convert.t error checking
>     - add and use meta.ret_btf_id instead of meta.arg_constant.value for
>       bpf_cast_to_kern_ctx().
>     - add PTR_TRUSTED to the return PTR_TO_BTF_ID type for bpf_cast_to_kern_ctx().
>   v2 -> v3:
>     - rebase on top of bpf-next (for merging conflicts)
>     - add the selftest to s390x deny list
>   rfcv1 -> v2:
>     - break original one kfunc into two.
>     - add missing error checks and error logs.
>     - adapt to the new conventions in
>       https://lore.kernel.org/all/20221118015614.2013203-1-memxor@gmail.com/
>       for example, with __ign and __k suffix.
>     - added support in fixup_kfunc_call() to replace kfunc calls with a single mov.
> 
> Yonghong Song (4):
>   bpf: Add support for kfunc set with common btf_ids
>   bpf: Add a kfunc to type cast from bpf uapi ctx to kernel ctx
>   bpf: Add a kfunc for generic type cast
>   bpf: Add type cast unit tests

Thanks Yonghong! Ack for the series from me, but looks like Alexei was
quick.

From my side this allows us to pull in the dev info and from that get
the netns, so it fixes a gap we previously had to split across a kprobe + xdp.

If we can get a pointer to the recv queue then with a few reads we
get the hash, vlan, etc. (see timestamp thread).

And the last bit: if we can get a ptr to the net ns list, plus
the rcu patch, we can build the net ns iterator directly in BPF,
which seems stronger than a regular iterator IMO because we can kick it
off on events anywhere in the kernel, or use an event to kick off
some specific iterator (e.g. walk net_devs in netns X with SR-IOV
interfaces). Ideally we would also wire it up to timers so we
can call it every N seconds without any user space intervention.
Eventually, it's nice if user space can crash, restart, and
so on without impacting the logic in the kernel.

Thanks again.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH bpf-next v4 0/4] bpf: Implement two type cast kfuncs
  2022-11-21  2:10 ` John Fastabend
@ 2022-11-21 17:29   ` Yonghong Song
  2022-11-22  1:48     ` John Fastabend
  0 siblings, 1 reply; 18+ messages in thread
From: Yonghong Song @ 2022-11-21 17:29 UTC (permalink / raw)
  To: John Fastabend, Yonghong Song, bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	kernel-team, Kumar Kartikeya Dwivedi, Martin KaFai Lau



On 11/20/22 6:10 PM, John Fastabend wrote:
> Yonghong Song wrote:
>> Currenty, a non-tracing bpf program typically has a single 'context' argument
>> with predefined uapi struct type. Following these uapi struct, user is able
>> to access other fields defined in uapi header. Inside the kernel, the
>> user-seen 'context' argument is replaced with 'kernel context' (or 'kctx'
>> in short) which can access more information than what uapi header provides.
>> To access other info not in uapi header, people typically do two things:
>>    (1). extend uapi to access more fields rooted from 'context'.
>>    (2). use bpf_probe_read_kernl() helper to read particular field based on
>>      kctx.
>> Using (1) needs uapi change and using (2) makes code more complex since
>> direct memory access is not allowed.
>>
>> There are already a few instances trying to access more information from
>> kctx:
>>    . trying to access some fields from perf_event kctx ([1]).
>>    . trying to access some fields from xdp kctx ([2]).
>>
>> This patch set tried to allow direct memory access for kctx fields
>> by introducing bpf_cast_to_kern_ctx() kfunc.
>>
>> Martin mentioned a use case like type casting below:
>>    #define skb_shinfo(SKB) ((struct skb_shared_info *)(skb_end_pointer(SKB)))
>> basically a 'unsigned char *" casted to 'struct skb_shared_info *'. This patch
>> set tries to support such a use case as well with bpf_rdonly_cast().
>>
>> For the patch series, Patch 1 added support for a kfunc available to all
>> prog types. Patch 2 added bpf_cast_to_kern_ctx() kfunc. Patch 3 added
>> bpf_rdonly_cast() kfunc. Patch 4 added a few positive and negative tests.
>>
>>    [1] https://lore.kernel.org/bpf/ad15b398-9069-4a0e-48cb-4bb651ec3088@meta.com/
>>    [2] https://lore.kernel.org/bpf/20221109215242.1279993-1-john.fastabend@gmail.com/
>>
>> Changelog:
>>    v3 -> v4:
>>      - remove unnecessary bpf_ctx_convert.t error checking
>>      - add and use meta.ret_btf_id instead of meta.arg_constant.value for
>>        bpf_cast_to_kern_ctx().
>>      - add PTR_TRUSTED to the return PTR_TO_BTF_ID type for bpf_cast_to_kern_ctx().
>>    v2 -> v3:
>>      - rebase on top of bpf-next (for merging conflicts)
>>      - add the selftest to s390x deny list
>>    rfcv1 -> v2:
>>      - break original one kfunc into two.
>>      - add missing error checks and error logs.
>>      - adapt to the new conventions in
>>        https://lore.kernel.org/all/20221118015614.2013203-1-memxor@gmail.com/
>>        for example, with __ign and __k suffix.
>>      - added support in fixup_kfunc_call() to replace kfunc calls with a single mov.
>>
>> Yonghong Song (4):
>>    bpf: Add support for kfunc set with common btf_ids
>>    bpf: Add a kfunc to type cast from bpf uapi ctx to kernel ctx
>>    bpf: Add a kfunc for generic type cast
>>    bpf: Add type cast unit tests
> 
> Thanks Yonghong! Ack for the series for me, but looks like Alexei is
> quick.
> 
>  From myside this allows us to pull in the dev info and from that get
> netns so fixes a gap we had to split into a kprobe + xdp.
> 
> If we can get a pointer to the recv queue then with a few reads we
> get the hash, vlan, etc. (see timestapm thread)

Thanks, John. Glad to see it is useful.

> 
> And then last bit is if we can get a ptr to the net ns list, plus

Unfortunately, vmlinux BTF currently does not have non-percpu global
variables, so net_namespace_list is not available to bpf programs.
But I think we could do the following with a little initial user-space
involvement as a workaround.

In the bpf program, we could have a global variable
   __u64 net_namespace_list;
and user space can look up net_namespace_list in /proc/kallsyms
and assign it to the bpf program's 'net_namespace_list' before prog load.

After that, you could implement an in-bpf-prog iterator with a bounded
loop to ensure it eventually terminates. You can use
   struct list_head *lh = bpf_rdonly_cast(net_namespace_list,
                                          struct_list_head_btf_id);
to cast the value to a struct list_head pointer. From there you can walk
down the list, with bpf_rdonly_cast() as needed for casting to the
element type.

> the rcu patch we can build the net ns iterator directly in BPF

I just posted the rcu patch:
https://lore.kernel.org/bpf/20221121170515.1193967-1-yhs@fb.com/
Please take a look and see whether it can serve your need.

> which seems stronger than an iterator IMO because we can kick it
> off on events anywhere in the kernel. Or based on event kick of
> some specific iterator e.g. walk net_devs in netns X with SR-IOV
> interfaces). Ideally we would also wire it up to timers so we
> can call it every N seconds without any user space intervention.
> Eventually, its nice if the user space can crash, restart, and
> so on without impacting the logic in kernel.
> 
> Thanks again.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH bpf-next v4 0/4] bpf: Implement two type cast kfuncs
  2022-11-21 17:29   ` Yonghong Song
@ 2022-11-22  1:48     ` John Fastabend
  2022-11-22  4:52       ` Alexei Starovoitov
  0 siblings, 1 reply; 18+ messages in thread
From: John Fastabend @ 2022-11-22  1:48 UTC (permalink / raw)
  To: Yonghong Song, John Fastabend, Yonghong Song, bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	kernel-team, Kumar Kartikeya Dwivedi, Martin KaFai Lau

Yonghong Song wrote:
> 
> 
> On 11/20/22 6:10 PM, John Fastabend wrote:
> > Yonghong Song wrote:
> >> Currenty, a non-tracing bpf program typically has a single 'context' argument
> >> with predefined uapi struct type. Following these uapi struct, user is able
> >> to access other fields defined in uapi header. Inside the kernel, the
> >> user-seen 'context' argument is replaced with 'kernel context' (or 'kctx'
> >> in short) which can access more information than what uapi header provides.
> >> To access other info not in uapi header, people typically do two things:
> >>    (1). extend uapi to access more fields rooted from 'context'.
> >>    (2). use bpf_probe_read_kernl() helper to read particular field based on
> >>      kctx.

[...]

> >  From myside this allows us to pull in the dev info and from that get
> > netns so fixes a gap we had to split into a kprobe + xdp.
> > 
> > If we can get a pointer to the recv queue then with a few reads we
> > get the hash, vlan, etc. (see timestapm thread)
> 
> Thanks, John. Glad to see it is useful.
> 
> > 
> > And then last bit is if we can get a ptr to the net ns list, plus
> 
> Unfortunately, currently vmlinux btf does not have non-percpu global
> variables, so net_namespace_list is not available to bpf programs.
> But I think we could do the following with a little bit user space
> initial involvement as a workaround.

What would you think of another kfunc, bpf_get_global_var(), to fetch
a global reference and cast it to a type? I think even if you
had it in BTF you would still need some sort of helper, otherwise
how would you know what the scope of the var should be and get it
correct in the type checker as a TRUSTED arg? I think for my use case
UNTRUSTED is fine, seeing we do it with probe_reads already, but
getting a TRUSTED arg seems nicer given it can be known correct
from the kernel side.

I was thinking something like,

  struct net *head = bpf_get_global_var(net_namespace_list,
				bpf_core_type_id_kernel(struct net));

> 
> In bpf program, we could have global variable
>    __u64 net_namespace_list;
> and user space can lookup /proc/kallsyms for net_namespace_list
> and assign it to bpf program 'net_namespace_list' before prog load.
> 
> After that, you could implement an in-bpf-prog iterator with bounded
> loop to ensure eventual ending. You can use
>    struct list_head *lh = bpf_rdonly_cast(net_namespace_list, 
> struct_list_head_btf_id)
> cast to struct list_head pointer. From there you can tracing down
> the list with needed bpf_rdonly_cast() for casting to element type.
> 
> > the rcu patch we can build the net ns iterator directly in BPF
> 
> I just posted rcu patch 
> https://lore.kernel.org/bpf/20221121170515.1193967-1-yhs@fb.com/
> Please help take a look whether it can serve your need.

Will look.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH bpf-next v4 0/4] bpf: Implement two type cast kfuncs
  2022-11-22  1:48     ` John Fastabend
@ 2022-11-22  4:52       ` Alexei Starovoitov
  2022-11-23  3:18         ` John Fastabend
  0 siblings, 1 reply; 18+ messages in thread
From: Alexei Starovoitov @ 2022-11-22  4:52 UTC (permalink / raw)
  To: John Fastabend, Yonghong Song, Yonghong Song, bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	kernel-team, Kumar Kartikeya Dwivedi, Martin KaFai Lau

On 11/21/22 5:48 PM, John Fastabend wrote:
> Yonghong Song wrote:
>>
>>
>> On 11/20/22 6:10 PM, John Fastabend wrote:
>>> Yonghong Song wrote:
>>>> Currenty, a non-tracing bpf program typically has a single 'context' argument
>>>> with predefined uapi struct type. Following these uapi struct, user is able
>>>> to access other fields defined in uapi header. Inside the kernel, the
>>>> user-seen 'context' argument is replaced with 'kernel context' (or 'kctx'
>>>> in short) which can access more information than what uapi header provides.
>>>> To access other info not in uapi header, people typically do two things:
>>>>     (1). extend uapi to access more fields rooted from 'context'.
>>>>     (2). use bpf_probe_read_kernl() helper to read particular field based on
>>>>       kctx.
> 
> [...]
> 
>>>   From myside this allows us to pull in the dev info and from that get
>>> netns so fixes a gap we had to split into a kprobe + xdp.
>>>
>>> If we can get a pointer to the recv queue then with a few reads we
>>> get the hash, vlan, etc. (see timestapm thread)
>>
>> Thanks, John. Glad to see it is useful.
>>
>>>
>>> And then last bit is if we can get a ptr to the net ns list, plus
>>
>> Unfortunately, currently vmlinux btf does not have non-percpu global
>> variables, so net_namespace_list is not available to bpf programs.
>> But I think we could do the following with a little bit user space
>> initial involvement as a workaround.
> 
> What would you think of another kfunc, bpf_get_global_var() to fetch
> the global reference and cast it with a type? I think even if you
> had it in BTF you would still need some sort of helper otherwise
> how would you know what scope of the var should be and get it
> correct in type checker as a TRUSTED arg? I think for my use case
> UNTRUSTED is find, seeing we do it with probe_reads already, but
> getting a TRUSTED arg seems nicer given it can be known correct
> from kernel side.
> 
> I was thinking something like,
> 
>    struct net *head = bpf_get_global_var(net_namespace_list,
> 				bpf_core_type_id_kernel(struct *net));

We cannot do this as ptr_trusted, since it's an unknown cast.
The verifier cannot trust the bpf prog to do the right thing.
But we can enable this by adding EXPORT_SYMBOL_GPL global vars to BTF.
Then they will be trusted and their types correct.
Pretty much like per-cpu variables.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH bpf-next v4 0/4] bpf: Implement two type cast kfuncs
  2022-11-22  4:52       ` Alexei Starovoitov
@ 2022-11-23  3:18         ` John Fastabend
  2022-11-23 20:46           ` Alexei Starovoitov
  0 siblings, 1 reply; 18+ messages in thread
From: John Fastabend @ 2022-11-23  3:18 UTC (permalink / raw)
  To: Alexei Starovoitov, John Fastabend, Yonghong Song, Yonghong Song, bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	kernel-team, Kumar Kartikeya Dwivedi, Martin KaFai Lau

Alexei Starovoitov wrote:
> On 11/21/22 5:48 PM, John Fastabend wrote:
> > Yonghong Song wrote:
> >>
> >>
> >> On 11/20/22 6:10 PM, John Fastabend wrote:
> >>> Yonghong Song wrote:
> >>>> Currenty, a non-tracing bpf program typically has a single 'context' argument
> >>>> with predefined uapi struct type. Following these uapi struct, user is able
> >>>> to access other fields defined in uapi header. Inside the kernel, the
> >>>> user-seen 'context' argument is replaced with 'kernel context' (or 'kctx'
> >>>> in short) which can access more information than what uapi header provides.
> >>>> To access other info not in uapi header, people typically do two things:
> >>>>     (1). extend uapi to access more fields rooted from 'context'.
> >>>>     (2). use bpf_probe_read_kernl() helper to read particular field based on
> >>>>       kctx.
> > 
> > [...]
> > 
> >>>   From myside this allows us to pull in the dev info and from that get
> >>> netns so fixes a gap we had to split into a kprobe + xdp.
> >>>
> >>> If we can get a pointer to the recv queue then with a few reads we
> >>> get the hash, vlan, etc. (see timestapm thread)
> >>
> >> Thanks, John. Glad to see it is useful.
> >>
> >>>
> >>> And then last bit is if we can get a ptr to the net ns list, plus
> >>
> >> Unfortunately, currently vmlinux btf does not have non-percpu global
> >> variables, so net_namespace_list is not available to bpf programs.
> >> But I think we could do the following with a little bit user space
> >> initial involvement as a workaround.
> > 
> > What would you think of another kfunc, bpf_get_global_var() to fetch
> > the global reference and cast it with a type? I think even if you
> > had it in BTF you would still need some sort of helper otherwise
> > how would you know what scope of the var should be and get it
> > correct in type checker as a TRUSTED arg? I think for my use case
> > UNTRUSTED is find, seeing we do it with probe_reads already, but
> > getting a TRUSTED arg seems nicer given it can be known correct
> > from kernel side.
> > 
> > I was thinking something like,
> > 
> >    struct net *head = bpf_get_global_var(net_namespace_list,
> > 				bpf_core_type_id_kernel(struct *net));
> 
> We cannot do this as ptr_trusted, since it's an unknown cast.

I think you _could_ do it if the kfunc knew to check the cast type
and knew that net_namespace_list should return that specific global.
The verifier would special-case that var and type.

> The verifier cannot trust bpf prog to do the right thing.
> But we can enable this buy adding export_symbol_gpl global vars to BTF.
> Then they will be trusted and their types correct.
> Pretty much like per-cpu variables.
> 

Yep, this is the more generic way and sounds better to me. Anyone
working on adding the global vars to BTF now?

Thanks,
John

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH bpf-next v4 0/4] bpf: Implement two type cast kfuncs
  2022-11-23  3:18         ` John Fastabend
@ 2022-11-23 20:46           ` Alexei Starovoitov
  2022-11-29 16:30             ` Alan Maguire
  0 siblings, 1 reply; 18+ messages in thread
From: Alexei Starovoitov @ 2022-11-23 20:46 UTC (permalink / raw)
  To: John Fastabend, Yonghong Song, Yonghong Song, bpf, alan.maguire
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	kernel-team, Kumar Kartikeya Dwivedi, Martin KaFai Lau

On 11/22/22 7:18 PM, John Fastabend wrote:
> Alexei Starovoitov wrote:
>> On 11/21/22 5:48 PM, John Fastabend wrote:
>>> Yonghong Song wrote:
>>>>
>>>>
>>>> On 11/20/22 6:10 PM, John Fastabend wrote:
>>>>> Yonghong Song wrote:
>>>>>> Currenty, a non-tracing bpf program typically has a single 'context' argument
>>>>>> with predefined uapi struct type. Following these uapi struct, user is able
>>>>>> to access other fields defined in uapi header. Inside the kernel, the
>>>>>> user-seen 'context' argument is replaced with 'kernel context' (or 'kctx'
>>>>>> in short) which can access more information than what uapi header provides.
>>>>>> To access other info not in uapi header, people typically do two things:
>>>>>>      (1). extend uapi to access more fields rooted from 'context'.
>>>>>>      (2). use bpf_probe_read_kernl() helper to read particular field based on
>>>>>>        kctx.
>>>
>>> [...]
>>>
>>>>> From my side this allows us to pull in the dev info and from that
>>>>> get the netns, so it fixes a gap where we had to split into a kprobe
>>>>> + xdp.
>>>>>
>>>>> If we can get a pointer to the recv queue then with a few reads we
>>>>> get the hash, vlan, etc. (see the timestamp thread)
>>>>
>>>> Thanks, John. Glad to see it is useful.
>>>>
>>>>>
>>>>> And then the last bit is if we can get a ptr to the net ns list, plus
>>>>
>>>> Unfortunately, vmlinux BTF currently does not have non-percpu global
>>>> variables, so net_namespace_list is not available to bpf programs.
>>>> But I think we could do the following, with a little bit of initial
>>>> user-space involvement, as a workaround.
>>>
>>> What would you think of another kfunc, bpf_get_global_var(), to fetch
>>> the global reference and cast it to a type? I think even if you
>>> had it in BTF you would still need some sort of helper, otherwise
>>> how would you know what the scope of the var should be and get it
>>> correct in the type checker as a TRUSTED arg? I think for my use case
>>> UNTRUSTED is fine, seeing as we do it with probe_reads already, but
>>> getting a TRUSTED arg seems nicer given it can be known correct
>>> from the kernel side.
>>>
>>> I was thinking something like,
>>>
>>>     struct net *head = bpf_get_global_var(net_namespace_list,
>>> 				bpf_core_type_id_kernel(struct net));
>>
>> We cannot do this as ptr_trusted, since it's an unknown cast.
> 
> I think you _could_ do it if the kfunc knew to check the cast type
> and knew that net_namespace_list should return that specific global.
> The verifier would special-case that var and type.

Hard-code it in the verifier just for one or two variables? Ouch.
Let's see whether all export_symbol_gpl vars can work.

>> The verifier cannot trust bpf prog to do the right thing.
>> But we can enable this by adding export_symbol_gpl global vars to BTF.
>> Then they will be trusted and their types correct.
>> Pretty much like per-cpu variables.
>>
> 
> Yep this is the more generic way and sounds better to me. Anyone
> working on adding the global var to BTF now?

Alan Maguire looked at it. cc-ing.



* Re: [PATCH bpf-next v4 0/4] bpf: Implement two type cast kfuncs
  2022-11-23 20:46           ` Alexei Starovoitov
@ 2022-11-29 16:30             ` Alan Maguire
  0 siblings, 0 replies; 18+ messages in thread
From: Alan Maguire @ 2022-11-29 16:30 UTC (permalink / raw)
  To: Alexei Starovoitov, John Fastabend, Yonghong Song, Yonghong Song, bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	kernel-team, Kumar Kartikeya Dwivedi, Martin KaFai Lau

On 23/11/2022 20:46, Alexei Starovoitov wrote:
> On 11/22/22 7:18 PM, John Fastabend wrote:
>> Alexei Starovoitov wrote:
>>> On 11/21/22 5:48 PM, John Fastabend wrote:
>>>> Yonghong Song wrote:
>>>>>
>>>>>
>>>>> On 11/20/22 6:10 PM, John Fastabend wrote:
>>>>>> Yonghong Song wrote:
>>>>>>> Currently, a non-tracing bpf program typically has a single 'context'
>>>>>>> argument with a predefined uapi struct type. Through this uapi struct,
>>>>>>> the user is able to access the fields defined in the uapi header.
>>>>>>> Inside the kernel, the user-seen 'context' argument is replaced with a
>>>>>>> 'kernel context' (or 'kctx' in short) which can access more information
>>>>>>> than what the uapi header provides. To access other info not in the
>>>>>>> uapi header, people typically do one of two things:
>>>>>>>      (1). extend the uapi to access more fields rooted at 'context'.
>>>>>>>      (2). use the bpf_probe_read_kernel() helper to read a particular
>>>>>>>        field based on kctx.
>>>>
>>>> [...]
>>>>
>>>>>> From my side this allows us to pull in the dev info and from that
>>>>>> get the netns, so it fixes a gap where we had to split into a kprobe
>>>>>> + xdp.
>>>>>>
>>>>>> If we can get a pointer to the recv queue then with a few reads we
>>>>>> get the hash, vlan, etc. (see the timestamp thread)
>>>>>
>>>>> Thanks, John. Glad to see it is useful.
>>>>>
>>>>>>
>>>>>> And then the last bit is if we can get a ptr to the net ns list, plus
>>>>>
>>>>> Unfortunately, vmlinux BTF currently does not have non-percpu global
>>>>> variables, so net_namespace_list is not available to bpf programs.
>>>>> But I think we could do the following, with a little bit of initial
>>>>> user-space involvement, as a workaround.
>>>>
>>>> What would you think of another kfunc, bpf_get_global_var(), to fetch
>>>> the global reference and cast it to a type? I think even if you
>>>> had it in BTF you would still need some sort of helper, otherwise
>>>> how would you know what the scope of the var should be and get it
>>>> correct in the type checker as a TRUSTED arg? I think for my use case
>>>> UNTRUSTED is fine, seeing as we do it with probe_reads already, but
>>>> getting a TRUSTED arg seems nicer given it can be known correct
>>>> from the kernel side.
>>>>
>>>> I was thinking something like,
>>>>
>>>>     struct net *head = bpf_get_global_var(net_namespace_list,
>>>>                 bpf_core_type_id_kernel(struct net));
>>>
>>> We cannot do this as ptr_trusted, since it's an unknown cast.
>>
>> I think you _could_ do it if the kfunc knew to check the cast type
>> and knew that net_namespace_list should return that specific global.
>> The verifier would special-case that var and type.
> 
> Hard-code it in the verifier just for one or two variables? Ouch.
> Let's see whether all export_symbol_gpl vars can work.
> 
>>> The verifier cannot trust bpf prog to do the right thing.
>>> But we can enable this by adding export_symbol_gpl global vars to BTF.
>>> Then they will be trusted and their types correct.
>>> Pretty much like per-cpu variables.
>>>
>>
>> Yep this is the more generic way and sounds better to me. Anyone
>> working on adding the global var to BTF now?
> 
> Alan Maguire looked at it. cc-ing.
> 

Yep, latest version of Stephen's series is at [1].

As mentioned in [2], I think we want to figure out the right
way to handle tristate support for BTF data. That discussion
suggests some folks would like fully module-delivered vmlinux
BTF, so I think we need to update the naming slightly in that
series to accommodate the case where all vmlinux BTF is
delivered via a module, rather than just a subset. If there's
any other feedback on the var series from your perspective it'd
be great to get it soon so we can roll it into the next version.
Thanks!
Thanks!

[1] https://lore.kernel.org/bpf/20221104231103.752040-1-stephen.s.brennan@oracle.com/
[2] https://lore.kernel.org/bpf/43fd3775-e796-6802-17f0-5c9fdbf368f5@oracle.com/


Thread overview: 18+ messages
2022-11-20 19:54 [PATCH bpf-next v4 0/4] bpf: Implement two type cast kfuncs Yonghong Song
2022-11-20 19:54 ` [PATCH bpf-next v4 1/4] bpf: Add support for kfunc set with common btf_ids Yonghong Song
2022-11-20 19:54 ` [PATCH bpf-next v4 2/4] bpf: Add a kfunc to type cast from bpf uapi ctx to kernel ctx Yonghong Song
2022-11-20 19:54 ` [PATCH bpf-next v4 3/4] bpf: Add a kfunc for generic type cast Yonghong Song
2022-11-20 20:16   ` Alexei Starovoitov
2022-11-20 20:49     ` Kumar Kartikeya Dwivedi
2022-11-20 22:34       ` Alexei Starovoitov
2022-11-20 23:32         ` Alexei Starovoitov
2022-11-20 23:47           ` Alexei Starovoitov
2022-11-20 19:54 ` [PATCH bpf-next v4 4/4] bpf: Add type cast unit tests Yonghong Song
2022-11-21  0:00 ` [PATCH bpf-next v4 0/4] bpf: Implement two type cast kfuncs patchwork-bot+netdevbpf
2022-11-21  2:10 ` John Fastabend
2022-11-21 17:29   ` Yonghong Song
2022-11-22  1:48     ` John Fastabend
2022-11-22  4:52       ` Alexei Starovoitov
2022-11-23  3:18         ` John Fastabend
2022-11-23 20:46           ` Alexei Starovoitov
2022-11-29 16:30             ` Alan Maguire
