BPF Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH v4 bpf-next 0/2] Add array support to btf_struct_access
@ 2019-11-07 18:09 Martin KaFai Lau
  2019-11-07 18:09 ` [PATCH v4 bpf-next 1/2] bpf: " Martin KaFai Lau
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Martin KaFai Lau @ 2019-11-07 18:09 UTC (permalink / raw)
  To: bpf, netdev
  Cc: Alexei Starovoitov, Daniel Borkmann, David Miller, kernel-team

This series adds array support to btf_struct_access().
Please see individual patch for details.

v4:
- Avoid removing a useful comment from btf_struct_access()
- Add comments to clarify the "mtrue_end" naming and
  how it may not always correspond to mtype/msize/moff.

v3:
- Fixed an interpreter issue that missed accounting
  for insn->off

v2:
- Fix a divide-by-zero when there is empty array in
  a struct (e.g. "__u8 __cloned_offset[0];" in skbuff)
- Add 'static' to a global var in prog_tests/kfree_skb.c

Martin KaFai Lau (2):
  bpf: Add array support to btf_struct_access
  bpf: Add cb access in kfree_skb test

 kernel/bpf/btf.c                              | 195 +++++++++++++++---
 .../selftests/bpf/prog_tests/kfree_skb.c      |  54 ++++-
 tools/testing/selftests/bpf/progs/kfree_skb.c |  25 ++-
 3 files changed, 229 insertions(+), 45 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH v4 bpf-next 1/2] bpf: Add array support to btf_struct_access
  2019-11-07 18:09 [PATCH v4 bpf-next 0/2] Add array support to btf_struct_access Martin KaFai Lau
@ 2019-11-07 18:09 ` " Martin KaFai Lau
  2019-11-07 18:09 ` [PATCH v4 bpf-next 2/2] bpf: Add cb access in kfree_skb test Martin KaFai Lau
  2019-11-07 22:15 ` [PATCH v4 bpf-next 0/2] Add array support to btf_struct_access Alexei Starovoitov
  2 siblings, 0 replies; 4+ messages in thread
From: Martin KaFai Lau @ 2019-11-07 18:09 UTC (permalink / raw)
  To: bpf, netdev
  Cc: Alexei Starovoitov, Daniel Borkmann, David Miller, kernel-team

This patch adds array support to btf_struct_access().
It supports array of int, array of struct and multidimensional
array.

It also allows using u8[] as a scratch space.  For example,
it allows access the "char cb[48]" with size larger than
the array's element "char".  Another potential use case is
"u64 icsk_ca_priv[]" in the tcp congestion control.

btf_resolve_size() is added to resolve the size of any type.
It will follow the modifier if there is any.  Please
see the function comment for details.

This patch also adds the "off < moff" check at the beginning
of the for loop.  It is to reject cases when "off" is pointing
to a "hole" in a struct.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
 kernel/bpf/btf.c | 195 ++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 166 insertions(+), 29 deletions(-)

diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 128d89601d73..4639c4ba9a9b 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -1036,6 +1036,82 @@ static const struct resolve_vertex *env_stack_peak(struct btf_verifier_env *env)
 	return env->top_stack ? &env->stack[env->top_stack - 1] : NULL;
 }
 
+/* Resolve the size of a passed-in "type"
+ *
+ * type: is an array (e.g. u32 array[x][y])
+ * return type: type "u32[x][y]", i.e. BTF_KIND_ARRAY,
+ * *type_size: (x * y * sizeof(u32)).  Hence, *type_size always
+ *             corresponds to the return type.
+ * *elem_type: u32
+ * *total_nelems: (x * y).  Hence, individual elem size is
+ *                (*type_size / *total_nelems)
+ *
+ * type: is not an array (e.g. const struct X)
+ * return type: type "struct X"
+ * *type_size: sizeof(struct X)
+ * *elem_type: same as return type ("struct X")
+ * *total_nelems: 1
+ */
+static const struct btf_type *
+btf_resolve_size(const struct btf *btf, const struct btf_type *type,
+		 u32 *type_size, const struct btf_type **elem_type,
+		 u32 *total_nelems)
+{
+	const struct btf_type *array_type = NULL;
+	const struct btf_array *array;
+	u32 i, size, nelems = 1;
+
+	for (i = 0; i < MAX_RESOLVE_DEPTH; i++) {
+		switch (BTF_INFO_KIND(type->info)) {
+		/* type->size can be used */
+		case BTF_KIND_INT:
+		case BTF_KIND_STRUCT:
+		case BTF_KIND_UNION:
+		case BTF_KIND_ENUM:
+			size = type->size;
+			goto resolved;
+
+		case BTF_KIND_PTR:
+			size = sizeof(void *);
+			goto resolved;
+
+		/* Modifiers */
+		case BTF_KIND_TYPEDEF:
+		case BTF_KIND_VOLATILE:
+		case BTF_KIND_CONST:
+		case BTF_KIND_RESTRICT:
+			type = btf_type_by_id(btf, type->type);
+			break;
+
+		case BTF_KIND_ARRAY:
+			if (!array_type)
+				array_type = type;
+			array = btf_type_array(type);
+			if (nelems && array->nelems > U32_MAX / nelems)
+				return ERR_PTR(-EINVAL);
+			nelems *= array->nelems;
+			type = btf_type_by_id(btf, array->type);
+			break;
+
+		/* type without size */
+		default:
+			return ERR_PTR(-EINVAL);
+		}
+	}
+
+	return ERR_PTR(-EINVAL);
+
+resolved:
+	if (nelems && size > U32_MAX / nelems)
+		return ERR_PTR(-EINVAL);
+
+	*type_size = nelems * size;
+	*total_nelems = nelems;
+	*elem_type = type;
+
+	return array_type ? : type;
+}
+
 /* The input param "type_id" must point to a needs_resolve type */
 static const struct btf_type *btf_type_id_resolve(const struct btf *btf,
 						  u32 *type_id)
@@ -3494,10 +3570,10 @@ int btf_struct_access(struct bpf_verifier_log *log,
 		      enum bpf_access_type atype,
 		      u32 *next_btf_id)
 {
+	u32 i, moff, mtrue_end, msize = 0, total_nelems = 0;
+	const struct btf_type *mtype, *elem_type = NULL;
 	const struct btf_member *member;
-	const struct btf_type *mtype;
 	const char *tname, *mname;
-	int i, moff = 0, msize;
 
 again:
 	tname = __btf_name_by_offset(btf_vmlinux, t->name_off);
@@ -3507,40 +3583,88 @@ int btf_struct_access(struct bpf_verifier_log *log,
 	}
 
 	for_each_member(i, t, member) {
-		/* offset of the field in bits */
-		moff = btf_member_bit_offset(t, member);
-
 		if (btf_member_bitfield_size(t, member))
 			/* bitfields are not supported yet */
 			continue;
 
-		if (off + size <= moff / 8)
+		/* offset of the field in bytes */
+		moff = btf_member_bit_offset(t, member) / 8;
+		if (off + size <= moff)
 			/* won't find anything, field is already too far */
 			break;
+		/* In case of "off" is pointing to holes of a struct */
+		if (off < moff)
+			continue;
 
 		/* type of the field */
 		mtype = btf_type_by_id(btf_vmlinux, member->type);
 		mname = __btf_name_by_offset(btf_vmlinux, member->name_off);
 
-		/* skip modifiers */
-		while (btf_type_is_modifier(mtype))
-			mtype = btf_type_by_id(btf_vmlinux, mtype->type);
-
-		if (btf_type_is_array(mtype))
-			/* array deref is not supported yet */
-			continue;
-
-		if (!btf_type_has_size(mtype) && !btf_type_is_ptr(mtype)) {
+		mtype = btf_resolve_size(btf_vmlinux, mtype, &msize,
+					 &elem_type, &total_nelems);
+		if (IS_ERR(mtype)) {
 			bpf_log(log, "field %s doesn't have size\n", mname);
 			return -EFAULT;
 		}
-		if (btf_type_is_ptr(mtype))
-			msize = 8;
-		else
-			msize = mtype->size;
-		if (off >= moff / 8 + msize)
+
+		mtrue_end = moff + msize;
+		if (off >= mtrue_end)
 			/* no overlap with member, keep iterating */
 			continue;
+
+		if (btf_type_is_array(mtype)) {
+			u32 elem_idx;
+
+			/* btf_resolve_size() above helps to
+			 * linearize a multi-dimensional array.
+			 *
+			 * The logic here is treating an array
+			 * in a struct as the following way:
+			 *
+			 * struct outer {
+			 *	struct inner array[2][2];
+			 * };
+			 *
+			 * looks like:
+			 *
+			 * struct outer {
+			 *	struct inner array_elem0;
+			 *	struct inner array_elem1;
+			 *	struct inner array_elem2;
+			 *	struct inner array_elem3;
+			 * };
+			 *
+			 * When accessing outer->array[1][0], it moves
+			 * moff to "array_elem2", set mtype to
+			 * "struct inner", and msize also becomes
+			 * sizeof(struct inner).  Then most of the
+			 * remaining logic will fall through without
+			 * caring the current member is an array or
+			 * not.
+			 *
+			 * Unlike mtype/msize/moff, mtrue_end does not
+			 * change.  The naming difference ("_true") tells
+			 * that it is not always corresponding to
+			 * the current mtype/msize/moff.
+			 * It is the true end of the current
+			 * member (i.e. array in this case).  That
+			 * will allow an int array to be accessed like
+			 * a scratch space,
+			 * i.e. allow access beyond the size of
+			 *      the array's element as long as it is
+			 *      within the mtrue_end boundary.
+			 */
+
+			/* skip empty array */
+			if (moff == mtrue_end)
+				continue;
+
+			msize /= total_nelems;
+			elem_idx = (off - moff) / msize;
+			moff += elem_idx * msize;
+			mtype = elem_type;
+		}
+
 		/* the 'off' we're looking for is either equal to start
 		 * of this field or inside of this struct
 		 */
@@ -3549,20 +3673,20 @@ int btf_struct_access(struct bpf_verifier_log *log,
 			t = mtype;
 
 			/* adjust offset we're looking for */
-			off -= moff / 8;
+			off -= moff;
 			goto again;
 		}
-		if (msize != size) {
-			/* field access size doesn't match */
-			bpf_log(log,
-				"cannot access %d bytes in struct %s field %s that has size %d\n",
-				size, tname, mname, msize);
-			return -EACCES;
-		}
 
 		if (btf_type_is_ptr(mtype)) {
 			const struct btf_type *stype;
 
+			if (msize != size || off != moff) {
+				bpf_log(log,
+					"cannot access ptr member %s with moff %u in struct %s with off %u size %u\n",
+					mname, moff, tname, off, size);
+				return -EACCES;
+			}
+
 			stype = btf_type_by_id(btf_vmlinux, mtype->type);
 			/* skip modifiers */
 			while (btf_type_is_modifier(stype))
@@ -3572,7 +3696,20 @@ int btf_struct_access(struct bpf_verifier_log *log,
 				return PTR_TO_BTF_ID;
 			}
 		}
-		/* all other fields are treated as scalars */
+
+		/* Allow more flexible access within an int as long as
+		 * it is within mtrue_end.
+		 * Since mtrue_end could be the end of an array,
+		 * that also allows using an array of int as a scratch
+		 * space. e.g. skb->cb[].
+		 */
+		if (off + size > mtrue_end) {
+			bpf_log(log,
+				"access beyond the end of member %s (mend:%u) in struct %s with off %u size %u\n",
+				mname, mtrue_end, tname, off, size);
+			return -EACCES;
+		}
+
 		return SCALAR_VALUE;
 	}
 	bpf_log(log, "struct %s doesn't have field at offset %d\n", tname, off);
-- 
2.17.1


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH v4 bpf-next 2/2] bpf: Add cb access in kfree_skb test
  2019-11-07 18:09 [PATCH v4 bpf-next 0/2] Add array support to btf_struct_access Martin KaFai Lau
  2019-11-07 18:09 ` [PATCH v4 bpf-next 1/2] bpf: " Martin KaFai Lau
@ 2019-11-07 18:09 ` Martin KaFai Lau
  2019-11-07 22:15 ` [PATCH v4 bpf-next 0/2] Add array support to btf_struct_access Alexei Starovoitov
  2 siblings, 0 replies; 4+ messages in thread
From: Martin KaFai Lau @ 2019-11-07 18:09 UTC (permalink / raw)
  To: bpf, netdev
  Cc: Alexei Starovoitov, Daniel Borkmann, David Miller, kernel-team

Access the skb->cb[] in the kfree_skb test.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
 .../selftests/bpf/prog_tests/kfree_skb.c      | 54 +++++++++++++++----
 tools/testing/selftests/bpf/progs/kfree_skb.c | 25 +++++++--
 2 files changed, 63 insertions(+), 16 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/kfree_skb.c b/tools/testing/selftests/bpf/prog_tests/kfree_skb.c
index 430b50de1583..55d36856e621 100644
--- a/tools/testing/selftests/bpf/prog_tests/kfree_skb.c
+++ b/tools/testing/selftests/bpf/prog_tests/kfree_skb.c
@@ -1,15 +1,38 @@
 // SPDX-License-Identifier: GPL-2.0
 #include <test_progs.h>
 
+struct meta {
+	int ifindex;
+	__u32 cb32_0;
+	__u8 cb8_0;
+};
+
+static union {
+	__u32 cb32[5];
+	__u8 cb8[20];
+} cb = {
+	.cb32[0] = 0x81828384,
+};
+
 static void on_sample(void *ctx, int cpu, void *data, __u32 size)
 {
-	int ifindex = *(int *)data, duration = 0;
-	struct ipv6_packet *pkt_v6 = data + 4;
+	struct meta *meta = (struct meta *)data;
+	struct ipv6_packet *pkt_v6 = data + sizeof(*meta);
+	int duration = 0;
 
-	if (ifindex != 1)
+	if (CHECK(size != 72 + sizeof(*meta), "check_size", "size %u != %zu\n",
+		  size, 72 + sizeof(*meta)))
+		return;
+	if (CHECK(meta->ifindex != 1, "check_meta_ifindex",
+		  "meta->ifindex = %d\n", meta->ifindex))
 		/* spurious kfree_skb not on loopback device */
 		return;
-	if (CHECK(size != 76, "check_size", "size %u != 76\n", size))
+	if (CHECK(meta->cb8_0 != cb.cb8[0], "check_cb8_0", "cb8_0 %x != %x\n",
+		  meta->cb8_0, cb.cb8[0]))
+		return;
+	if (CHECK(meta->cb32_0 != cb.cb32[0], "check_cb32_0",
+		  "cb32_0 %x != %x\n",
+		  meta->cb32_0, cb.cb32[0]))
 		return;
 	if (CHECK(pkt_v6->eth.h_proto != 0xdd86, "check_eth",
 		  "h_proto %x\n", pkt_v6->eth.h_proto))
@@ -26,6 +49,13 @@ static void on_sample(void *ctx, int cpu, void *data, __u32 size)
 
 void test_kfree_skb(void)
 {
+	struct __sk_buff skb = {};
+	struct bpf_prog_test_run_attr tattr = {
+		.data_in = &pkt_v6,
+		.data_size_in = sizeof(pkt_v6),
+		.ctx_in = &skb,
+		.ctx_size_in = sizeof(skb),
+	};
 	struct bpf_prog_load_attr attr = {
 		.file = "./kfree_skb.o",
 	};
@@ -36,11 +66,12 @@ void test_kfree_skb(void)
 	struct bpf_link *link = NULL;
 	struct bpf_map *perf_buf_map;
 	struct bpf_program *prog;
-	__u32 duration, retval;
-	int err, pkt_fd, kfree_skb_fd;
+	int err, kfree_skb_fd;
 	bool passed = false;
+	__u32 duration = 0;
 
-	err = bpf_prog_load("./test_pkt_access.o", BPF_PROG_TYPE_SCHED_CLS, &obj, &pkt_fd);
+	err = bpf_prog_load("./test_pkt_access.o", BPF_PROG_TYPE_SCHED_CLS,
+			    &obj, &tattr.prog_fd);
 	if (CHECK(err, "prog_load sched cls", "err %d errno %d\n", err, errno))
 		return;
 
@@ -66,11 +97,12 @@ void test_kfree_skb(void)
 	if (CHECK(IS_ERR(pb), "perf_buf__new", "err %ld\n", PTR_ERR(pb)))
 		goto close_prog;
 
-	err = bpf_prog_test_run(pkt_fd, 1, &pkt_v6, sizeof(pkt_v6),
-				NULL, NULL, &retval, &duration);
-	CHECK(err || retval, "ipv6",
+	memcpy(skb.cb, &cb, sizeof(cb));
+	err = bpf_prog_test_run_xattr(&tattr);
+	duration = tattr.duration;
+	CHECK(err || tattr.retval, "ipv6",
 	      "err %d errno %d retval %d duration %d\n",
-	      err, errno, retval, duration);
+	      err, errno, tattr.retval, duration);
 
 	/* read perf buffer */
 	err = perf_buffer__poll(pb, 100);
diff --git a/tools/testing/selftests/bpf/progs/kfree_skb.c b/tools/testing/selftests/bpf/progs/kfree_skb.c
index 489319ea1d6a..f769fdbf6725 100644
--- a/tools/testing/selftests/bpf/progs/kfree_skb.c
+++ b/tools/testing/selftests/bpf/progs/kfree_skb.c
@@ -43,6 +43,7 @@ struct sk_buff {
 	refcount_t users;
 	unsigned char *data;
 	char __pkt_type_offset[0];
+	char cb[48];
 };
 
 /* copy arguments from
@@ -57,28 +58,41 @@ struct trace_kfree_skb {
 	void *location;
 };
 
+struct meta {
+	int ifindex;
+	__u32 cb32_0;
+	__u8 cb8_0;
+};
+
 SEC("tp_btf/kfree_skb")
 int trace_kfree_skb(struct trace_kfree_skb *ctx)
 {
 	struct sk_buff *skb = ctx->skb;
 	struct net_device *dev;
-	int ifindex;
 	struct callback_head *ptr;
 	void *func;
 	int users;
 	unsigned char *data;
 	unsigned short pkt_data;
+	struct meta meta = {};
 	char pkt_type;
+	__u32 *cb32;
+	__u8 *cb8;
 
 	__builtin_preserve_access_index(({
 		users = skb->users.refs.counter;
 		data = skb->data;
 		dev = skb->dev;
-		ifindex = dev->ifindex;
 		ptr = dev->ifalias->rcuhead.next;
 		func = ptr->func;
+		cb8 = (__u8 *)&skb->cb;
+		cb32 = (__u32 *)&skb->cb;
 	}));
 
+	meta.ifindex = _(dev->ifindex);
+	meta.cb8_0 = cb8[8];
+	meta.cb32_0 = cb32[2];
+
 	bpf_probe_read_kernel(&pkt_type, sizeof(pkt_type), _(&skb->__pkt_type_offset));
 	pkt_type &= 7;
 
@@ -90,14 +104,15 @@ int trace_kfree_skb(struct trace_kfree_skb *ctx)
 		   _(skb->len), users, pkt_type);
 	bpf_printk("skb->queue_mapping %d\n", _(skb->queue_mapping));
 	bpf_printk("dev->ifindex %d data %llx pkt_data %x\n",
-		   ifindex, data, pkt_data);
+		   meta.ifindex, data, pkt_data);
+	bpf_printk("cb8_0:%x cb32_0:%x\n", meta.cb8_0, meta.cb32_0);
 
-	if (users != 1 || pkt_data != bpf_htons(0x86dd) || ifindex != 1)
+	if (users != 1 || pkt_data != bpf_htons(0x86dd) || meta.ifindex != 1)
 		/* raw tp ignores return value */
 		return 0;
 
 	/* send first 72 byte of the packet to user space */
 	bpf_skb_output(skb, &perf_buf_map, (72ull << 32) | BPF_F_CURRENT_CPU,
-		       &ifindex, sizeof(ifindex));
+		       &meta, sizeof(meta));
 	return 0;
 }
-- 
2.17.1


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH v4 bpf-next 0/2] Add array support to btf_struct_access
  2019-11-07 18:09 [PATCH v4 bpf-next 0/2] Add array support to btf_struct_access Martin KaFai Lau
  2019-11-07 18:09 ` [PATCH v4 bpf-next 1/2] bpf: " Martin KaFai Lau
  2019-11-07 18:09 ` [PATCH v4 bpf-next 2/2] bpf: Add cb access in kfree_skb test Martin KaFai Lau
@ 2019-11-07 22:15 ` Alexei Starovoitov
  2 siblings, 0 replies; 4+ messages in thread
From: Alexei Starovoitov @ 2019-11-07 22:15 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: bpf, Network Development, Alexei Starovoitov, Daniel Borkmann,
	David Miller, Kernel Team

On Thu, Nov 7, 2019 at 10:09 AM Martin KaFai Lau <kafai@fb.com> wrote:
>
> This series adds array support to btf_struct_access().
> Please see individual patch for details.
>
> v4:
> - Avoid removing a useful comment from btf_struct_access()
> - Add comments to clarify the "mtrue_end" naming and
>   how it may not always correspond to mtype/msize/moff.

Applied. Thanks

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, back to index

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-07 18:09 [PATCH v4 bpf-next 0/2] Add array support to btf_struct_access Martin KaFai Lau
2019-11-07 18:09 ` [PATCH v4 bpf-next 1/2] bpf: " Martin KaFai Lau
2019-11-07 18:09 ` [PATCH v4 bpf-next 2/2] bpf: Add cb access in kfree_skb test Martin KaFai Lau
2019-11-07 22:15 ` [PATCH v4 bpf-next 0/2] Add array support to btf_struct_access Alexei Starovoitov

BPF Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/bpf/0 bpf/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 bpf bpf/ https://lore.kernel.org/bpf \
		bpf@vger.kernel.org
	public-inbox-index bpf

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.bpf


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git