bpf.vger.kernel.org archive mirror
* [PATCH bpf-next v4 0/3] Allow struct_ops maps with a large number of programs
@ 2024-02-24 22:34 Kui-Feng Lee
  2024-02-24 22:34 ` [PATCH bpf-next v4 1/3] bpf, net: validate struct_ops when updating value Kui-Feng Lee
                   ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: Kui-Feng Lee @ 2024-02-24 22:34 UTC (permalink / raw)
  To: bpf, ast, martin.lau, song, kernel-team, andrii
  Cc: sinquersw, kuifeng, Kui-Feng Lee

The BPF struct_ops previously only allowed for one page to be used for
the trampolines of all links in a map. However, we have recently run
out of space due to the large number of BPF program links. By
allocating additional pages when we exhaust an existing page, we can
accommodate more links in a single map.

The variable st_map->image has been changed to st_map->image_pages,
and its type has been changed to an array of pointers to buffers of
PAGE_SIZE. Additional pages are allocated when all existing pages are
exhausted.

The test case loads a struct_ops map having 40 programs. Their
trampolines take about 6.6k bytes, over 1.5 pages, on x86.

---
Major differences from v3:

 - Refactor buffer allocations to bpf_struct_ops_tramp_buf_alloc() and
   bpf_struct_ops_tramp_buf_free().

Major differences from v2:

 - Move image buffer allocation to bpf_struct_ops_prepare_trampoline().

Major differences from v1:

 - Always free pages if failing to update.

 - Allocate 8 pages at most.

v3: https://lore.kernel.org/all/20240224030302.1500343-1-thinker.li@gmail.com/
v2: https://lore.kernel.org/all/20240221225911.757861-1-thinker.li@gmail.com/
v1: https://lore.kernel.org/all/20240216182828.201727-1-thinker.li@gmail.com/

Kui-Feng Lee (3):
  bpf, net: validate struct_ops when updating value.
  bpf: struct_ops supports more than one page for trampolines.
  selftests/bpf: Test struct_ops maps with a large number of program
    links.

 include/linux/bpf.h                           |   4 +-
 kernel/bpf/bpf_struct_ops.c                   | 139 ++++++++++++------
 net/bpf/bpf_dummy_struct_ops.c                |  12 +-
 net/ipv4/tcp_cong.c                           |   6 +-
 .../selftests/bpf/bpf_testmod/bpf_testmod.h   |  44 ++++++
 .../prog_tests/test_struct_ops_multi_pages.c  |  30 ++++
 .../bpf/progs/struct_ops_multi_pages.c        | 102 +++++++++++++
 7 files changed, 280 insertions(+), 57 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/test_struct_ops_multi_pages.c
 create mode 100644 tools/testing/selftests/bpf/progs/struct_ops_multi_pages.c

-- 
2.34.1



* [PATCH bpf-next v4 1/3] bpf, net: validate struct_ops when updating value.
  2024-02-24 22:34 [PATCH bpf-next v4 0/3] Allow struct_ops maps with a large number of programs Kui-Feng Lee
@ 2024-02-24 22:34 ` Kui-Feng Lee
  2024-02-24 22:34 ` [PATCH bpf-next v4 2/3] bpf: struct_ops supports more than one page for trampolines Kui-Feng Lee
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 6+ messages in thread
From: Kui-Feng Lee @ 2024-02-24 22:34 UTC (permalink / raw)
  To: bpf, ast, martin.lau, song, kernel-team, andrii
  Cc: sinquersw, kuifeng, Kui-Feng Lee, netdev

Perform all validations when updating values of struct_ops maps. Doing
validation in st_ops->reg() and st_ops->update() is no longer necessary.
However, tcp_register_congestion_control() is called from various other
places and still needs to do its own validation.

Cc: netdev@vger.kernel.org
Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
---
 kernel/bpf/bpf_struct_ops.c | 11 ++++++-----
 net/ipv4/tcp_cong.c         |  6 +-----
 2 files changed, 7 insertions(+), 10 deletions(-)

diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c
index a6019087b467..07e554c191d1 100644
--- a/kernel/bpf/bpf_struct_ops.c
+++ b/kernel/bpf/bpf_struct_ops.c
@@ -672,13 +672,14 @@ static long bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
 		*(unsigned long *)(udata + moff) = prog->aux->id;
 	}
 
+	if (st_ops->validate) {
+		err = st_ops->validate(kdata);
+		if (err)
+			goto reset_unlock;
+	}
+
 	if (st_map->map.map_flags & BPF_F_LINK) {
 		err = 0;
-		if (st_ops->validate) {
-			err = st_ops->validate(kdata);
-			if (err)
-				goto reset_unlock;
-		}
 		arch_protect_bpf_trampoline(st_map->image, PAGE_SIZE);
 		/* Let bpf_link handle registration & unregistration.
 		 *
diff --git a/net/ipv4/tcp_cong.c b/net/ipv4/tcp_cong.c
index 1b34050a7538..28ffcfbeef14 100644
--- a/net/ipv4/tcp_cong.c
+++ b/net/ipv4/tcp_cong.c
@@ -146,11 +146,7 @@ EXPORT_SYMBOL_GPL(tcp_unregister_congestion_control);
 int tcp_update_congestion_control(struct tcp_congestion_ops *ca, struct tcp_congestion_ops *old_ca)
 {
 	struct tcp_congestion_ops *existing;
-	int ret;
-
-	ret = tcp_validate_congestion_control(ca);
-	if (ret)
-		return ret;
+	int ret = 0;
 
 	ca->key = jhash(ca->name, sizeof(ca->name), strlen(ca->name));
 
-- 
2.34.1



* [PATCH bpf-next v4 2/3] bpf: struct_ops supports more than one page for trampolines.
  2024-02-24 22:34 [PATCH bpf-next v4 0/3] Allow struct_ops maps with a large number of programs Kui-Feng Lee
  2024-02-24 22:34 ` [PATCH bpf-next v4 1/3] bpf, net: validate struct_ops when updating value Kui-Feng Lee
@ 2024-02-24 22:34 ` Kui-Feng Lee
  2024-03-04 22:34   ` Martin KaFai Lau
  2024-02-24 22:34 ` [PATCH bpf-next v4 3/3] selftests/bpf: Test struct_ops maps with a large number of program links Kui-Feng Lee
  2024-03-04 22:30 ` [PATCH bpf-next v4 0/3] Allow struct_ops maps with a large number of programs patchwork-bot+netdevbpf
  3 siblings, 1 reply; 6+ messages in thread
From: Kui-Feng Lee @ 2024-02-24 22:34 UTC (permalink / raw)
  To: bpf, ast, martin.lau, song, kernel-team, andrii
  Cc: sinquersw, kuifeng, Kui-Feng Lee

The BPF struct_ops previously only allowed for one page to be used for the
trampolines of all links in a map. However, we have recently run out of
space due to the large number of BPF program links. By allocating
additional pages when we exhaust an existing page, we can accommodate more
links in a single map.

The variable st_map->image has been changed to st_map->image_pages, and its
type has been changed to an array of pointers to buffers of
PAGE_SIZE. Every struct_ops map can have at most MAX_TRAMP_IMAGE_PAGES (8)
pages for trampolines.

Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
---
 include/linux/bpf.h            |   4 +-
 kernel/bpf/bpf_struct_ops.c    | 128 +++++++++++++++++++++++----------
 net/bpf/bpf_dummy_struct_ops.c |  12 ++--
 3 files changed, 97 insertions(+), 47 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 814dc913a968..f8d9ff56057c 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1763,7 +1763,9 @@ int bpf_struct_ops_prepare_trampoline(struct bpf_tramp_links *tlinks,
 				      struct bpf_tramp_link *link,
 				      const struct btf_func_model *model,
 				      void *stub_func,
-				      void *image, void *image_end);
+				      void **image, u32 *image_off,
+				      bool allow_alloc);
+void bpf_struct_ops_tramp_buf_free(void *image);
 static inline bool bpf_try_module_get(const void *data, struct module *owner)
 {
 	if (owner == BPF_MODULE_OWNER)
diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c
index 07e554c191d1..7aabc78e9b5b 100644
--- a/kernel/bpf/bpf_struct_ops.c
+++ b/kernel/bpf/bpf_struct_ops.c
@@ -18,6 +18,8 @@ struct bpf_struct_ops_value {
 	char data[] ____cacheline_aligned_in_smp;
 };
 
+#define MAX_TRAMP_IMAGE_PAGES 8
+
 struct bpf_struct_ops_map {
 	struct bpf_map map;
 	struct rcu_head rcu;
@@ -30,12 +32,11 @@ struct bpf_struct_ops_map {
 	 */
 	struct bpf_link **links;
 	u32 links_cnt;
-	/* image is a page that has all the trampolines
+	u32 image_pages_cnt;
+	/* image_pages is an array of pages that has all the trampolines
 	 * that stores the func args before calling the bpf_prog.
-	 * A PAGE_SIZE "image" is enough to store all trampoline for
-	 * "links[]".
 	 */
-	void *image;
+	void *image_pages[MAX_TRAMP_IMAGE_PAGES];
 	/* The owner moduler's btf. */
 	struct btf *btf;
 	/* uvalue->data stores the kernel struct
@@ -116,6 +117,30 @@ static bool is_valid_value_type(struct btf *btf, s32 value_id,
 	return true;
 }
 
+static void *bpf_struct_ops_tramp_buf_alloc(void)
+{
+	void *image;
+	int err;
+
+	err = bpf_jit_charge_modmem(PAGE_SIZE);
+	if (err)
+		return ERR_PTR(err);
+	image = arch_alloc_bpf_trampoline(PAGE_SIZE);
+	if (!image) {
+		bpf_jit_uncharge_modmem(PAGE_SIZE);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	return image;
+}
+
+void bpf_struct_ops_tramp_buf_free(void *image)
+{
+	arch_free_bpf_trampoline(image, PAGE_SIZE);
+	if (image)
+		bpf_jit_uncharge_modmem(PAGE_SIZE);
+}
+
 #define MAYBE_NULL_SUFFIX "__nullable"
 #define MAX_STUB_NAME 128
 
@@ -461,6 +486,15 @@ static void bpf_struct_ops_map_put_progs(struct bpf_struct_ops_map *st_map)
 	}
 }
 
+static void bpf_struct_ops_map_free_image(struct bpf_struct_ops_map *st_map)
+{
+	int i;
+
+	for (i = 0; i < st_map->image_pages_cnt; i++)
+		bpf_struct_ops_tramp_buf_free(st_map->image_pages[i]);
+	st_map->image_pages_cnt = 0;
+}
+
 static int check_zero_holes(const struct btf *btf, const struct btf_type *t, void *data)
 {
 	const struct btf_member *member;
@@ -503,12 +537,21 @@ const struct bpf_link_ops bpf_struct_ops_link_lops = {
 	.dealloc = bpf_struct_ops_link_dealloc,
 };
 
+/* *image should be NULL and allow_alloc should be true if a caller wants
+ * this function to allocate an image buffer for it. Otherwise, this
+ * function allocates a new image buffer only if allow_alloc is true and
+ * the size of the trampoline is larger than the space left in the
+ * current image buffer.
+ */
 int bpf_struct_ops_prepare_trampoline(struct bpf_tramp_links *tlinks,
 				      struct bpf_tramp_link *link,
 				      const struct btf_func_model *model,
-				      void *stub_func, void *image, void *image_end)
+				      void *stub_func,
+				      void **_image, u32 *image_off,
+				      bool allow_alloc)
 {
 	u32 flags = BPF_TRAMP_F_INDIRECT;
+	void *image = *_image;
 	int size;
 
 	tlinks[BPF_TRAMP_FENTRY].links[0] = link;
@@ -518,14 +561,30 @@ int bpf_struct_ops_prepare_trampoline(struct bpf_tramp_links *tlinks,
 		flags |= BPF_TRAMP_F_RET_FENTRY_RET;
 
 	size = arch_bpf_trampoline_size(model, flags, tlinks, NULL);
-	if (size < 0)
+	if (size <= 0)
 		return size;
-	if (size > (unsigned long)image_end - (unsigned long)image)
-		return -E2BIG;
-	return arch_prepare_bpf_trampoline(NULL, image, image_end,
+
+	/* Allocate image buffer if necessary */
+	if (!image || size > PAGE_SIZE - *image_off) {
+		if (!allow_alloc)
+			return -E2BIG;
+
+		image = bpf_struct_ops_tramp_buf_alloc();
+		if (IS_ERR(image))
+			return PTR_ERR(image);
+		*_image = image;
+		*image_off = 0;
+	}
+
+	size = arch_prepare_bpf_trampoline(NULL, image + *image_off,
+					   image + PAGE_SIZE,
 					   model, flags, tlinks, stub_func);
-}
+	if (size > 0)
+		*image_off += size;
+	/* The caller should free the allocated memory even if size < 0 */
 
+	return size;
+}
 static long bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
 					   void *value, u64 flags)
 {
@@ -539,8 +598,8 @@ static long bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
 	struct bpf_tramp_links *tlinks;
 	void *udata, *kdata;
 	int prog_fd, err;
-	void *image, *image_end;
-	u32 i;
+	u32 i, image_off = 0;
+	void *image = NULL;
 
 	if (flags)
 		return -EINVAL;
@@ -578,14 +637,15 @@ static long bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
 
 	udata = &uvalue->data;
 	kdata = &kvalue->data;
-	image = st_map->image;
-	image_end = st_map->image + PAGE_SIZE;
 
 	module_type = btf_type_by_id(btf_vmlinux, st_ops_ids[IDX_MODULE_ID]);
 	for_each_member(i, t, member) {
 		const struct btf_type *mtype, *ptype;
 		struct bpf_prog *prog;
 		struct bpf_tramp_link *link;
+		void *saved_image = image;
+		u32 init_off = image_off;
+		bool allow_alloc;
 		u32 moff;
 
 		moff = __btf_member_bit_offset(t, member) / 8;
@@ -658,15 +718,24 @@ static long bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
 			      &bpf_struct_ops_link_lops, prog);
 		st_map->links[i] = &link->link;
 
+		allow_alloc = st_map->image_pages_cnt < MAX_TRAMP_IMAGE_PAGES;
 		err = bpf_struct_ops_prepare_trampoline(tlinks, link,
 							&st_ops->func_models[i],
 							*(void **)(st_ops->cfi_stubs + moff),
-							image, image_end);
+							&image, &image_off,
+							allow_alloc);
+		if (saved_image != image) {
+			/* Add to image_pages[] to ensure the page will be
+			 * freed later even if the above call fails
+			 */
+			st_map->image_pages[st_map->image_pages_cnt++] = image;
+			init_off = 0;
+		}
 		if (err < 0)
 			goto reset_unlock;
 
-		*(void **)(kdata + moff) = image + cfi_get_offset();
-		image += err;
+		*(void **)(kdata + moff) =
+			image + init_off + cfi_get_offset();
 
 		/* put prog_id to udata */
 		*(unsigned long *)(udata + moff) = prog->aux->id;
@@ -677,10 +746,11 @@ static long bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
 		if (err)
 			goto reset_unlock;
 	}
+	for (i = 0; i < st_map->image_pages_cnt; i++)
+		arch_protect_bpf_trampoline(st_map->image_pages[i], PAGE_SIZE);
 
 	if (st_map->map.map_flags & BPF_F_LINK) {
 		err = 0;
-		arch_protect_bpf_trampoline(st_map->image, PAGE_SIZE);
 		/* Let bpf_link handle registration & unregistration.
 		 *
 		 * Pair with smp_load_acquire() during lookup_elem().
@@ -689,7 +759,6 @@ static long bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
 		goto unlock;
 	}
 
-	arch_protect_bpf_trampoline(st_map->image, PAGE_SIZE);
 	err = st_ops->reg(kdata);
 	if (likely(!err)) {
 		/* This refcnt increment on the map here after
@@ -712,9 +781,9 @@ static long bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
 	 * there was a race in registering the struct_ops (under the same name) to
 	 * a sub-system through different struct_ops's maps.
 	 */
-	arch_unprotect_bpf_trampoline(st_map->image, PAGE_SIZE);
 
 reset_unlock:
+	bpf_struct_ops_map_free_image(st_map);
 	bpf_struct_ops_map_put_progs(st_map);
 	memset(uvalue, 0, map->value_size);
 	memset(kvalue, 0, map->value_size);
@@ -781,10 +850,7 @@ static void __bpf_struct_ops_map_free(struct bpf_map *map)
 	if (st_map->links)
 		bpf_struct_ops_map_put_progs(st_map);
 	bpf_map_area_free(st_map->links);
-	if (st_map->image) {
-		arch_free_bpf_trampoline(st_map->image, PAGE_SIZE);
-		bpf_jit_uncharge_modmem(PAGE_SIZE);
-	}
+	bpf_struct_ops_map_free_image(st_map);
 	bpf_map_area_free(st_map->uvalue);
 	bpf_map_area_free(st_map);
 }
@@ -894,20 +960,6 @@ static struct bpf_map *bpf_struct_ops_map_alloc(union bpf_attr *attr)
 	st_map->st_ops_desc = st_ops_desc;
 	map = &st_map->map;
 
-	ret = bpf_jit_charge_modmem(PAGE_SIZE);
-	if (ret)
-		goto errout_free;
-
-	st_map->image = arch_alloc_bpf_trampoline(PAGE_SIZE);
-	if (!st_map->image) {
-		/* __bpf_struct_ops_map_free() uses st_map->image as flag
-		 * for "charged or not". In this case, we need to unchange
-		 * here.
-		 */
-		bpf_jit_uncharge_modmem(PAGE_SIZE);
-		ret = -ENOMEM;
-		goto errout_free;
-	}
 	st_map->uvalue = bpf_map_area_alloc(vt->size, NUMA_NO_NODE);
 	st_map->links_cnt = btf_type_vlen(t);
 	st_map->links =
diff --git a/net/bpf/bpf_dummy_struct_ops.c b/net/bpf/bpf_dummy_struct_ops.c
index 02de71719aed..da73905eff4a 100644
--- a/net/bpf/bpf_dummy_struct_ops.c
+++ b/net/bpf/bpf_dummy_struct_ops.c
@@ -91,6 +91,7 @@ int bpf_struct_ops_test_run(struct bpf_prog *prog, const union bpf_attr *kattr,
 	struct bpf_tramp_link *link = NULL;
 	void *image = NULL;
 	unsigned int op_idx;
+	u32 image_off = 0;
 	int prog_ret;
 	s32 type_id;
 	int err;
@@ -114,12 +115,6 @@ int bpf_struct_ops_test_run(struct bpf_prog *prog, const union bpf_attr *kattr,
 		goto out;
 	}
 
-	image = arch_alloc_bpf_trampoline(PAGE_SIZE);
-	if (!image) {
-		err = -ENOMEM;
-		goto out;
-	}
-
 	link = kzalloc(sizeof(*link), GFP_USER);
 	if (!link) {
 		err = -ENOMEM;
@@ -133,7 +128,8 @@ int bpf_struct_ops_test_run(struct bpf_prog *prog, const union bpf_attr *kattr,
 	err = bpf_struct_ops_prepare_trampoline(tlinks, link,
 						&st_ops->func_models[op_idx],
 						&dummy_ops_test_ret_function,
-						image, image + PAGE_SIZE);
+						&image, &image_off,
+						true);
 	if (err < 0)
 		goto out;
 
@@ -147,7 +143,7 @@ int bpf_struct_ops_test_run(struct bpf_prog *prog, const union bpf_attr *kattr,
 		err = -EFAULT;
 out:
 	kfree(args);
-	arch_free_bpf_trampoline(image, PAGE_SIZE);
+	bpf_struct_ops_tramp_buf_free(image);
 	if (link)
 		bpf_link_put(&link->link);
 	kfree(tlinks);
-- 
2.34.1



* [PATCH bpf-next v4 3/3] selftests/bpf: Test struct_ops maps with a large number of program links.
  2024-02-24 22:34 [PATCH bpf-next v4 0/3] Allow struct_ops maps with a large number of programs Kui-Feng Lee
  2024-02-24 22:34 ` [PATCH bpf-next v4 1/3] bpf, net: validate struct_ops when updating value Kui-Feng Lee
  2024-02-24 22:34 ` [PATCH bpf-next v4 2/3] bpf: struct_ops supports more than one page for trampolines Kui-Feng Lee
@ 2024-02-24 22:34 ` Kui-Feng Lee
  2024-03-04 22:30 ` [PATCH bpf-next v4 0/3] Allow struct_ops maps with a large number of programs patchwork-bot+netdevbpf
  3 siblings, 0 replies; 6+ messages in thread
From: Kui-Feng Lee @ 2024-02-24 22:34 UTC (permalink / raw)
  To: bpf, ast, martin.lau, song, kernel-team, andrii
  Cc: sinquersw, kuifeng, Kui-Feng Lee

Create and load a struct_ops map with a large number of programs so that
the generated trampolines span multiple pages. The map includes 40
programs, whose trampolines take 6.6k+ bytes, more than 1.5 pages, on x86.

Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
---
 .../selftests/bpf/bpf_testmod/bpf_testmod.h   |  44 ++++++++
 .../prog_tests/test_struct_ops_multi_pages.c  |  30 ++++++
 .../bpf/progs/struct_ops_multi_pages.c        | 102 ++++++++++++++++++
 3 files changed, 176 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/test_struct_ops_multi_pages.c
 create mode 100644 tools/testing/selftests/bpf/progs/struct_ops_multi_pages.c

diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.h b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.h
index c3b0cf788f9f..3e6481b2a50a 100644
--- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.h
+++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.h
@@ -35,6 +35,50 @@ struct bpf_testmod_ops {
 	void (*test_2)(int a, int b);
 	/* Used to test nullable arguments. */
 	int (*test_maybe_null)(int dummy, struct task_struct *task);
+
+	/* The following pointers are used to test the maps having multiple
+	 * pages of trampolines.
+	 */
+	int (*tramp_1)(int value);
+	int (*tramp_2)(int value);
+	int (*tramp_3)(int value);
+	int (*tramp_4)(int value);
+	int (*tramp_5)(int value);
+	int (*tramp_6)(int value);
+	int (*tramp_7)(int value);
+	int (*tramp_8)(int value);
+	int (*tramp_9)(int value);
+	int (*tramp_10)(int value);
+	int (*tramp_11)(int value);
+	int (*tramp_12)(int value);
+	int (*tramp_13)(int value);
+	int (*tramp_14)(int value);
+	int (*tramp_15)(int value);
+	int (*tramp_16)(int value);
+	int (*tramp_17)(int value);
+	int (*tramp_18)(int value);
+	int (*tramp_19)(int value);
+	int (*tramp_20)(int value);
+	int (*tramp_21)(int value);
+	int (*tramp_22)(int value);
+	int (*tramp_23)(int value);
+	int (*tramp_24)(int value);
+	int (*tramp_25)(int value);
+	int (*tramp_26)(int value);
+	int (*tramp_27)(int value);
+	int (*tramp_28)(int value);
+	int (*tramp_29)(int value);
+	int (*tramp_30)(int value);
+	int (*tramp_31)(int value);
+	int (*tramp_32)(int value);
+	int (*tramp_33)(int value);
+	int (*tramp_34)(int value);
+	int (*tramp_35)(int value);
+	int (*tramp_36)(int value);
+	int (*tramp_37)(int value);
+	int (*tramp_38)(int value);
+	int (*tramp_39)(int value);
+	int (*tramp_40)(int value);
 };
 
 #endif /* _BPF_TESTMOD_H */
diff --git a/tools/testing/selftests/bpf/prog_tests/test_struct_ops_multi_pages.c b/tools/testing/selftests/bpf/prog_tests/test_struct_ops_multi_pages.c
new file mode 100644
index 000000000000..645d32b5160c
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/test_struct_ops_multi_pages.c
@@ -0,0 +1,30 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
+#include <test_progs.h>
+
+#include "struct_ops_multi_pages.skel.h"
+
+static void do_struct_ops_multi_pages(void)
+{
+	struct struct_ops_multi_pages *skel;
+	struct bpf_link *link;
+
+	/* The size of all trampolines of skel->maps.multi_pages should be
+	 * over 1 page (at least for x86).
+	 */
+	skel = struct_ops_multi_pages__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "struct_ops_multi_pages_open_and_load"))
+		return;
+
+	link = bpf_map__attach_struct_ops(skel->maps.multi_pages);
+	ASSERT_OK_PTR(link, "attach_multi_pages");
+
+	bpf_link__destroy(link);
+	struct_ops_multi_pages__destroy(skel);
+}
+
+void test_struct_ops_multi_pages(void)
+{
+	if (test__start_subtest("multi_pages"))
+		do_struct_ops_multi_pages();
+}
diff --git a/tools/testing/selftests/bpf/progs/struct_ops_multi_pages.c b/tools/testing/selftests/bpf/progs/struct_ops_multi_pages.c
new file mode 100644
index 000000000000..9efcc6e4d356
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/struct_ops_multi_pages.c
@@ -0,0 +1,102 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
+#include <vmlinux.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+#include "../bpf_testmod/bpf_testmod.h"
+
+char _license[] SEC("license") = "GPL";
+
+#define TRAMP(x) \
+	SEC("struct_ops/tramp_" #x)		\
+	int BPF_PROG(tramp_ ## x, int a)	\
+	{					\
+		return a;			\
+	}
+
+TRAMP(1)
+TRAMP(2)
+TRAMP(3)
+TRAMP(4)
+TRAMP(5)
+TRAMP(6)
+TRAMP(7)
+TRAMP(8)
+TRAMP(9)
+TRAMP(10)
+TRAMP(11)
+TRAMP(12)
+TRAMP(13)
+TRAMP(14)
+TRAMP(15)
+TRAMP(16)
+TRAMP(17)
+TRAMP(18)
+TRAMP(19)
+TRAMP(20)
+TRAMP(21)
+TRAMP(22)
+TRAMP(23)
+TRAMP(24)
+TRAMP(25)
+TRAMP(26)
+TRAMP(27)
+TRAMP(28)
+TRAMP(29)
+TRAMP(30)
+TRAMP(31)
+TRAMP(32)
+TRAMP(33)
+TRAMP(34)
+TRAMP(35)
+TRAMP(36)
+TRAMP(37)
+TRAMP(38)
+TRAMP(39)
+TRAMP(40)
+
+#define F_TRAMP(x) .tramp_ ## x = (void *)tramp_ ## x
+
+SEC(".struct_ops.link")
+struct bpf_testmod_ops multi_pages = {
+	F_TRAMP(1),
+	F_TRAMP(2),
+	F_TRAMP(3),
+	F_TRAMP(4),
+	F_TRAMP(5),
+	F_TRAMP(6),
+	F_TRAMP(7),
+	F_TRAMP(8),
+	F_TRAMP(9),
+	F_TRAMP(10),
+	F_TRAMP(11),
+	F_TRAMP(12),
+	F_TRAMP(13),
+	F_TRAMP(14),
+	F_TRAMP(15),
+	F_TRAMP(16),
+	F_TRAMP(17),
+	F_TRAMP(18),
+	F_TRAMP(19),
+	F_TRAMP(20),
+	F_TRAMP(21),
+	F_TRAMP(22),
+	F_TRAMP(23),
+	F_TRAMP(24),
+	F_TRAMP(25),
+	F_TRAMP(26),
+	F_TRAMP(27),
+	F_TRAMP(28),
+	F_TRAMP(29),
+	F_TRAMP(30),
+	F_TRAMP(31),
+	F_TRAMP(32),
+	F_TRAMP(33),
+	F_TRAMP(34),
+	F_TRAMP(35),
+	F_TRAMP(36),
+	F_TRAMP(37),
+	F_TRAMP(38),
+	F_TRAMP(39),
+	F_TRAMP(40),
+};
-- 
2.34.1



* Re: [PATCH bpf-next v4 0/3] Allow struct_ops maps with a large number of programs
  2024-02-24 22:34 [PATCH bpf-next v4 0/3] Allow struct_ops maps with a large number of programs Kui-Feng Lee
                   ` (2 preceding siblings ...)
  2024-02-24 22:34 ` [PATCH bpf-next v4 3/3] selftests/bpf: Test struct_ops maps with a large number of program links Kui-Feng Lee
@ 2024-03-04 22:30 ` patchwork-bot+netdevbpf
  3 siblings, 0 replies; 6+ messages in thread
From: patchwork-bot+netdevbpf @ 2024-03-04 22:30 UTC (permalink / raw)
  To: Kui-Feng Lee
  Cc: bpf, ast, martin.lau, song, kernel-team, andrii, sinquersw, kuifeng

Hello:

This series was applied to bpf/bpf-next.git (master)
by Martin KaFai Lau <martin.lau@kernel.org>:

On Sat, 24 Feb 2024 14:34:15 -0800 you wrote:
> The BPF struct_ops previously only allowed for one page to be used for
> the trampolines of all links in a map. However, we have recently run
> out of space due to the large number of BPF program links. By
> allocating additional pages when we exhaust an existing page, we can
> accommodate more links in a single map.
> 
> The variable st_map->image has been changed to st_map->image_pages,
> and its type has been changed to an array of pointers to buffers of
> PAGE_SIZE. Additional pages are allocated when all existing pages are
> exhausted.
> 
> [...]

Here is the summary with links:
  - [bpf-next,v4,1/3] bpf, net: validate struct_ops when updating value.
    https://git.kernel.org/bpf/bpf-next/c/73e4f9e615d7
  - [bpf-next,v4,2/3] bpf: struct_ops supports more than one page for trampolines.
    https://git.kernel.org/bpf/bpf-next/c/187e2af05abe
  - [bpf-next,v4,3/3] selftests/bpf: Test struct_ops maps with a large number of program links.
    https://git.kernel.org/bpf/bpf-next/c/93bc28d859e5

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




* Re: [PATCH bpf-next v4 2/3] bpf: struct_ops supports more than one page for trampolines.
  2024-02-24 22:34 ` [PATCH bpf-next v4 2/3] bpf: struct_ops supports more than one page for trampolines Kui-Feng Lee
@ 2024-03-04 22:34   ` Martin KaFai Lau
  0 siblings, 0 replies; 6+ messages in thread
From: Martin KaFai Lau @ 2024-03-04 22:34 UTC (permalink / raw)
  To: Kui-Feng Lee; +Cc: sinquersw, kuifeng, bpf, ast, song, kernel-team, andrii

On 2/24/24 2:34 PM, Kui-Feng Lee wrote:
> +void bpf_struct_ops_tramp_buf_free(void *image)
> +{
> +	arch_free_bpf_trampoline(image, PAGE_SIZE);
> +	if (image)

Moved the "arch_free_bpf_trampoline(image, PAGE_SIZE);" after the image NULL
test. I think it was an oversight in bpf_dummy_struct_ops.c. The existing
__bpf_struct_ops_map_free(image, PAGE_SIZE) had been testing for NULL before freeing as well.

> +		bpf_jit_uncharge_modmem(PAGE_SIZE);
> +}
> +
>   #define MAYBE_NULL_SUFFIX "__nullable"
>   #define MAX_STUB_NAME 128
>   
> @@ -461,6 +486,15 @@ static void bpf_struct_ops_map_put_progs(struct bpf_struct_ops_map *st_map)
>   	}
>   }
>   
> +static void bpf_struct_ops_map_free_image(struct bpf_struct_ops_map *st_map)
> +{
> +	int i;
> +
> +	for (i = 0; i < st_map->image_pages_cnt; i++)
> +		bpf_struct_ops_tramp_buf_free(st_map->image_pages[i]);
> +	st_map->image_pages_cnt = 0;
> +}
> +
>   static int check_zero_holes(const struct btf *btf, const struct btf_type *t, void *data)
>   {
>   	const struct btf_member *member;
> @@ -503,12 +537,21 @@ const struct bpf_link_ops bpf_struct_ops_link_lops = {
>   	.dealloc = bpf_struct_ops_link_dealloc,
>   };
>   
> +/* *image should be NULL and allow_alloc should be true if a caller wants
> + * this function to allocate a image buffer for it. Otherwise, this
> + * function allocate a new image buffer only if allow_alloc is true and the
> + * size of the trampoline is larger than the space left in the current
> + * image buffer.
> + */
>   int bpf_struct_ops_prepare_trampoline(struct bpf_tramp_links *tlinks,
>   				      struct bpf_tramp_link *link,
>   				      const struct btf_func_model *model,
> -				      void *stub_func, void *image, void *image_end)
> +				      void *stub_func,
> +				      void **_image, u32 *image_off,
> +				      bool allow_alloc)
>   {
>   	u32 flags = BPF_TRAMP_F_INDIRECT;
> +	void *image = *_image;
>   	int size;
>   
>   	tlinks[BPF_TRAMP_FENTRY].links[0] = link;
> @@ -518,14 +561,30 @@ int bpf_struct_ops_prepare_trampoline(struct bpf_tramp_links *tlinks,
>   		flags |= BPF_TRAMP_F_RET_FENTRY_RET;
>   
>   	size = arch_bpf_trampoline_size(model, flags, tlinks, NULL);
> -	if (size < 0)
> +	if (size <= 0)
>   		return size;
> -	if (size > (unsigned long)image_end - (unsigned long)image)
> -		return -E2BIG;
> -	return arch_prepare_bpf_trampoline(NULL, image, image_end,
> +
> +	/* Allocate image buffer if necessary */
> +	if (!image || size > PAGE_SIZE - *image_off) {
> +		if (!allow_alloc)
> +			return -E2BIG;
> +
> +		image = bpf_struct_ops_tramp_buf_alloc();
> +		if (IS_ERR(image))
> +			return PTR_ERR(image);
> +		*_image = image;
> +		*image_off = 0;
> +	}
> +
> +	size = arch_prepare_bpf_trampoline(NULL, image + *image_off,
> +					   image + PAGE_SIZE,
>   					   model, flags, tlinks, stub_func);
> -}
> +	if (size > 0)
> +		*image_off += size;
> +	/* The caller should free the allocated memory even if size < 0 */

I massaged the logic a little so that the caller does not need to free the
newly allocated, unused page when this function errors out. I kept both the
"*_image" and "*image_off" params unchanged on the error case.

Applied. Thanks.

[ ... ]

> @@ -781,10 +850,7 @@ static void __bpf_struct_ops_map_free(struct bpf_map *map)
>   	if (st_map->links)
>   		bpf_struct_ops_map_put_progs(st_map);
>   	bpf_map_area_free(st_map->links);
> -	if (st_map->image) {
> -		arch_free_bpf_trampoline(st_map->image, PAGE_SIZE);
> -		bpf_jit_uncharge_modmem(PAGE_SIZE);
> -	}
> +	bpf_struct_ops_map_free_image(st_map);
>   	bpf_map_area_free(st_map->uvalue);
>   	bpf_map_area_free(st_map);
>   }

[ ... ]

> diff --git a/net/bpf/bpf_dummy_struct_ops.c b/net/bpf/bpf_dummy_struct_ops.c
> index 02de71719aed..da73905eff4a 100644
> --- a/net/bpf/bpf_dummy_struct_ops.c
> +++ b/net/bpf/bpf_dummy_struct_ops.c
> @@ -91,6 +91,7 @@ int bpf_struct_ops_test_run(struct bpf_prog *prog, const union bpf_attr *kattr,
>   	struct bpf_tramp_link *link = NULL;
>   	void *image = NULL;
>   	unsigned int op_idx;
> +	u32 image_off = 0;
>   	int prog_ret;
>   	s32 type_id;
>   	int err;
> @@ -114,12 +115,6 @@ int bpf_struct_ops_test_run(struct bpf_prog *prog, const union bpf_attr *kattr,
>   		goto out;
>   	}
>   
> -	image = arch_alloc_bpf_trampoline(PAGE_SIZE);
> -	if (!image) {
> -		err = -ENOMEM;
> -		goto out;
> -	}
> -
>   	link = kzalloc(sizeof(*link), GFP_USER);
>   	if (!link) {
>   		err = -ENOMEM;
> @@ -133,7 +128,8 @@ int bpf_struct_ops_test_run(struct bpf_prog *prog, const union bpf_attr *kattr,
>   	err = bpf_struct_ops_prepare_trampoline(tlinks, link,
>   						&st_ops->func_models[op_idx],
>   						&dummy_ops_test_ret_function,
> -						image, image + PAGE_SIZE);
> +						&image, &image_off,
> +						true);
>   	if (err < 0)
>   		goto out;
>   
> @@ -147,7 +143,7 @@ int bpf_struct_ops_test_run(struct bpf_prog *prog, const union bpf_attr *kattr,
>   		err = -EFAULT;
>   out:
>   	kfree(args);
> -	arch_free_bpf_trampoline(image, PAGE_SIZE);
> +	bpf_struct_ops_tramp_buf_free(image);
>   	if (link)
>   		bpf_link_put(&link->link);
>   	kfree(tlinks);


