All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrii Nakryiko <andrii@kernel.org>
To: <bpf@vger.kernel.org>, <ast@kernel.org>, <daniel@iogearbox.net>
Cc: <andrii@kernel.org>, <kernel-team@fb.com>, Tejun Heo <tj@kernel.org>
Subject: [PATCH bpf-next 14/17] bpf: implement number iterator
Date: Thu, 2 Mar 2023 15:50:12 -0800	[thread overview]
Message-ID: <20230302235015.2044271-15-andrii@kernel.org> (raw)
In-Reply-To: <20230302235015.2044271-1-andrii@kernel.org>

Implement first open-coded iterator type over a range of integers.

It's public API consists of:
  - bpf_iter_num_new() constructor, which accepts [start, end) range
    (that is, start is inclusive, while end is exclusive).
  - bpf_iter_num_next() which will keep returning 4-byte read-only
    pointer to int until the range is exhausted, at which point NULL will
    be returned. If bpf_iter_num_next() is kept calling after this, NULL
    will be persistently returned.
  - bpf_iter_num_destroy() destructor, that needs to be called at some
    point to clean up iterator state. BPF verifier enforces that iterator
    destructor is called at some point before BPF program exits.

Note that `start = end = X` is a valid combination to setup empty
iterator. bpf_iter_num_new() will return 0 (success) for any such
combination.

If bpf_iter_num_new() detects invalid combination of input arguments, it
returns error, resets iterator state to, effectively, empty iterator, so
any subsequent call to bpf_iter_num_next() will keep returning NULL.

BPF verifier has no knowledge that returned integers are in the [start,
end) value range, as both `start` and `end` are not statically
known/enforced, they are runtime values only.

While implementation is pretty trivial, some care needs to be taken to
avoid overflows and underflows. Subsequent selftests will validate
correctness of [start, end) semantics, especially around extremes
(INT_MIN and INT_MAX).

Similarly to bpf_loop(), we enforce that no more than BPF_MAX_LOOPS can
be specified.

bpf_iter_num_{new,next,destroy}() is a logical evolution from bounded
BPF loops and bpf_loop() helper and is the basis for implementing
ergonomic BPF loops with no statically known and verified bounds.
Subsequent patches implement bpf_for() macro, demonstrating how this can
be wrapped into something that works and feels like a normal for() loop
in C language.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 include/linux/bpf.h   | 14 +++++++--
 kernel/bpf/bpf_iter.c | 71 +++++++++++++++++++++++++++++++++++++++++++
 kernel/bpf/helpers.c  |  3 ++
 kernel/bpf/verifier.c | 24 +++++++++++++--
 4 files changed, 107 insertions(+), 5 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index a968282ba324..2a730759a471 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -613,6 +613,9 @@ enum bpf_type_flag {
 	/* DYNPTR points to xdp_buff */
 	DYNPTR_TYPE_XDP		= BIT(16 + BPF_BASE_TYPE_BITS),
 
+	/* ITER of integers */
+	ITER_TYPE_NUM		= BIT(17 + BPF_BASE_TYPE_BITS),
+
 	__BPF_TYPE_FLAG_MAX,
 	__BPF_TYPE_LAST_FLAG	= __BPF_TYPE_FLAG_MAX - 1,
 };
@@ -620,7 +623,7 @@ enum bpf_type_flag {
 #define DYNPTR_TYPE_FLAG_MASK	(DYNPTR_TYPE_LOCAL | DYNPTR_TYPE_RINGBUF | DYNPTR_TYPE_SKB \
 				 | DYNPTR_TYPE_XDP)
 
-#define ITER_TYPE_FLAG_MASK	(0)
+#define ITER_TYPE_FLAG_MASK	(ITER_TYPE_NUM)
 
 /* Max number of base types. */
 #define BPF_BASE_TYPE_LIMIT	(1UL << BPF_BASE_TYPE_BITS)
@@ -1167,6 +1170,7 @@ u32 bpf_dynptr_get_size(const struct bpf_dynptr_kern *ptr);
 
 enum bpf_iter_type {
 	BPF_ITER_TYPE_INVALID,
+	BPF_ITER_TYPE_NUM,
 };
 
 #ifdef CONFIG_BPF_JIT
@@ -1622,8 +1626,12 @@ struct bpf_array {
 #define BPF_COMPLEXITY_LIMIT_INSNS      1000000 /* yes. 1M insns */
 #define MAX_TAIL_CALL_CNT 33
 
-/* Maximum number of loops for bpf_loop */
-#define BPF_MAX_LOOPS	BIT(23)
+/* Maximum number of loops for bpf_loop and bpf_iter_num.
+ * It's enum to expose it (and thus make it discoverable) through BTF.
+ */
+enum {
+	BPF_MAX_LOOPS = 8 * 1024 * 1024,
+};
 
 #define BPF_F_ACCESS_MASK	(BPF_F_RDONLY |		\
 				 BPF_F_RDONLY_PROG |	\
diff --git a/kernel/bpf/bpf_iter.c b/kernel/bpf/bpf_iter.c
index 5dc307bdeaeb..504189a3b474 100644
--- a/kernel/bpf/bpf_iter.c
+++ b/kernel/bpf/bpf_iter.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0-only
 /* Copyright (c) 2020 Facebook */
 
+#include "linux/build_bug.h"
 #include <linux/fs.h>
 #include <linux/anon_inodes.h>
 #include <linux/filter.h>
@@ -776,3 +777,73 @@ const struct bpf_func_proto bpf_loop_proto = {
 	.arg3_type	= ARG_PTR_TO_STACK_OR_NULL,
 	.arg4_type	= ARG_ANYTHING,
 };
+
+struct bpf_iter_num_kern {
+	int cur; /* current value, inclusive */
+	int end; /* final value, exclusive */
+	__u64 :64;
+	__u64 :64;
+} __aligned(8);
+
+__diag_push();
+__diag_ignore_all("-Wmissing-prototypes",
+		  "Global functions as their definitions will be in vmlinux BTF");
+
+__bpf_kfunc int bpf_iter_num_new(struct bpf_iter *it__uninit, int start, int end)
+{
+	struct bpf_iter_num_kern *s = (void *)it__uninit;
+
+	BUILD_BUG_ON(sizeof(struct bpf_iter_num_kern) != sizeof(struct bpf_iter));
+	BUILD_BUG_ON(__alignof__(struct bpf_iter_num_kern) != __alignof__(struct bpf_iter));
+
+	/* start == end is legit, it's an empty range and we'll just get NULL
+	 * on first (and any subsequent) bpf_iter_num_next() call
+	 */
+	if (start > end) {
+		s->cur = s->end = 0;
+		return -EINVAL;
+	}
+
+	/* avoid overflows, e.g., if start == INT_MIN and end == INT_MAX */
+	if ((s64)end - (s64)start > BPF_MAX_LOOPS) {
+		s->cur = s->end = 0;
+		return -E2BIG;
+	}
+
+	/* user will call bpf_iter_num_next() first,
+	 * which will set s->cur to exactly start value;
+	 * underflow shouldn't matter
+	 */
+	s->cur = start - 1;
+	s->end = end;
+
+	return 0;
+}
+
+__bpf_kfunc int *bpf_iter_num_next(struct bpf_iter* it)
+{
+	struct bpf_iter_num_kern *s = (void *)it;
+
+	/* check failed initialization or if we are done (same behavior);
+	 * need to be careful about overflow, so convert to s64 for checks,
+	 * e.g., if s->cur == s->end == INT_MAX, we can't just do
+	 * s->cur + 1 >= s->end
+	 */
+	if ((s64)(s->cur + 1) >= s->end) {
+		s->cur = s->end = 0;
+		return NULL;
+	}
+
+	s->cur++;
+
+	return &s->cur;
+}
+
+__bpf_kfunc void bpf_iter_num_destroy(struct bpf_iter *it)
+{
+	struct bpf_iter_num_kern *s = (void *)it;
+
+	s->cur = s->end = 0;
+}
+
+__diag_pop();
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index de9ef8476e29..23c8f2313d5a 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -2398,6 +2398,9 @@ BTF_ID_FLAGS(func, bpf_rcu_read_lock)
 BTF_ID_FLAGS(func, bpf_rcu_read_unlock)
 BTF_ID_FLAGS(func, bpf_dynptr_slice, KF_RET_NULL)
 BTF_ID_FLAGS(func, bpf_dynptr_slice_rdwr, KF_RET_NULL)
+BTF_ID_FLAGS(func, bpf_iter_num_new)
+BTF_ID_FLAGS(func, bpf_iter_num_next, KF_RET_NULL)
+BTF_ID_FLAGS(func, bpf_iter_num_destroy)
 BTF_SET8_END(common_btf_ids)
 
 static const struct btf_kfunc_id_set common_kfunc_set = {
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 58754929ee33..9671b4f354e9 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -779,6 +779,8 @@ static const char *dynptr_type_str(enum bpf_dynptr_type type)
 static const char *iter_type_str(enum bpf_iter_type type)
 {
 	switch (type) {
+	case BPF_ITER_TYPE_NUM:
+		return "num";
 	case BPF_ITER_TYPE_INVALID:
 		return "<invalid>";
 	default:
@@ -1157,6 +1159,8 @@ static bool is_dynptr_type_expected(struct bpf_verifier_env *env, struct bpf_reg
 static enum bpf_dynptr_type arg_to_iter_type(enum bpf_arg_type arg_type)
 {
 	switch (arg_type & ITER_TYPE_FLAG_MASK) {
+	case ITER_TYPE_NUM:
+		return BPF_ITER_TYPE_NUM;
 	default:
 		return BPF_ITER_TYPE_INVALID;
 	}
@@ -9445,6 +9449,9 @@ enum special_kfunc_type {
 	KF_bpf_dynptr_from_xdp,
 	KF_bpf_dynptr_slice,
 	KF_bpf_dynptr_slice_rdwr,
+	KF_bpf_iter_num_new,
+	KF_bpf_iter_num_next,
+	KF_bpf_iter_num_destroy,
 };
 
 BTF_SET_START(special_kfunc_set)
@@ -9483,6 +9490,9 @@ BTF_ID(func, bpf_dynptr_from_skb)
 BTF_ID(func, bpf_dynptr_from_xdp)
 BTF_ID(func, bpf_dynptr_slice)
 BTF_ID(func, bpf_dynptr_slice_rdwr)
+BTF_ID(func, bpf_iter_num_new)
+BTF_ID(func, bpf_iter_num_next)
+BTF_ID(func, bpf_iter_num_destroy)
 
 static bool is_kfunc_bpf_rcu_read_lock(struct bpf_kfunc_call_arg_meta *meta)
 {
@@ -9496,7 +9506,7 @@ static bool is_kfunc_bpf_rcu_read_unlock(struct bpf_kfunc_call_arg_meta *meta)
 
 static bool is_iter_next_kfunc(int btf_id)
 {
-	return false;
+	return btf_id == special_kfunc_list[KF_bpf_iter_num_next];
 }
 
 static enum kfunc_ptr_arg_type
@@ -10278,7 +10288,17 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
 			if (is_kfunc_arg_uninit(btf, &args[i]))
 				iter_arg_type |= MEM_UNINIT;
 
-			ret = process_iter_arg(env, regno, insn_idx, iter_arg_type,  meta);
+			if (meta->func_id == special_kfunc_list[KF_bpf_iter_num_new] ||
+			    meta->func_id == special_kfunc_list[KF_bpf_iter_num_next]) {
+				iter_arg_type |= ITER_TYPE_NUM;
+			} else if (meta->func_id == special_kfunc_list[KF_bpf_iter_num_destroy]) {
+				iter_arg_type |= ITER_TYPE_NUM | OBJ_RELEASE;
+			} else {
+				verbose(env, "verifier internal error: unrecognized iterator kfunc\n");
+				return -EFAULT;
+			}
+
+			ret = process_iter_arg(env, regno, insn_idx, iter_arg_type, meta);
 			if (ret < 0)
 				return ret;
 			break;
-- 
2.30.2


  parent reply	other threads:[~2023-03-02 23:50 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-02 23:49 [PATCH bpf-next 00/17] BPF open-coded iterators Andrii Nakryiko
2023-03-02 23:49 ` [PATCH bpf-next 01/17] bpf: improve stack slot state printing Andrii Nakryiko
2023-03-02 23:50 ` [PATCH bpf-next 02/17] bpf: improve regsafe() checks for PTR_TO_{MEM,BUF,TP_BUFFER} Andrii Nakryiko
2023-03-02 23:50 ` [PATCH bpf-next 03/17] selftests/bpf: enhance align selftest's expected log matching Andrii Nakryiko
2023-03-02 23:50 ` [PATCH bpf-next 04/17] bpf: honor env->test_state_freq flag in is_state_visited() Andrii Nakryiko
2023-03-02 23:50 ` [PATCH bpf-next 05/17] selftests/bpf: adjust log_fixup's buffer size for proper truncation Andrii Nakryiko
2023-03-02 23:50 ` [PATCH bpf-next 06/17] bpf: clean up visit_insn()'s instruction processing Andrii Nakryiko
2023-03-02 23:50 ` [PATCH bpf-next 07/17] bpf: fix visit_insn()'s detection of BPF_FUNC_timer_set_callback helper Andrii Nakryiko
2023-03-02 23:50 ` [PATCH bpf-next 08/17] bpf: ensure that r0 is marked scratched after any function call Andrii Nakryiko
2023-03-02 23:50 ` [PATCH bpf-next 09/17] bpf: move kfunc_call_arg_meta higher in the file Andrii Nakryiko
2023-03-02 23:50 ` [PATCH bpf-next 10/17] bpf: mark PTR_TO_MEM as non-null register type Andrii Nakryiko
2023-03-02 23:50 ` [PATCH bpf-next 11/17] bpf: generalize dynptr_get_spi to be usable for iters Andrii Nakryiko
2023-03-02 23:50 ` [PATCH bpf-next 12/17] bpf: add support for fixed-size memory pointer returns for kfuncs Andrii Nakryiko
2023-03-02 23:50 ` [PATCH bpf-next 13/17] bpf: add support for open-coded iterator loops Andrii Nakryiko
2023-03-04 20:02   ` Alexei Starovoitov
2023-03-04 23:27     ` Andrii Nakryiko
2023-03-05 23:46       ` Alexei Starovoitov
2023-03-07 21:54         ` Andrii Nakryiko
2023-03-02 23:50 ` Andrii Nakryiko [this message]
2023-03-04 20:21   ` [PATCH bpf-next 14/17] bpf: implement number iterator Alexei Starovoitov
2023-03-04 23:27     ` Andrii Nakryiko
2023-03-05 23:49       ` Alexei Starovoitov
2023-03-02 23:50 ` [PATCH bpf-next 15/17] selftests/bpf: add bpf_for_each(), bpf_for(), and bpf_repeat() macros Andrii Nakryiko
2023-03-04 20:34   ` Alexei Starovoitov
2023-03-04 23:28     ` Andrii Nakryiko
2023-03-06  0:12       ` Alexei Starovoitov
2023-03-07 21:54         ` Andrii Nakryiko
2023-03-02 23:50 ` [PATCH bpf-next 16/17] selftests/bpf: add iterators tests Andrii Nakryiko
2023-03-04 20:39   ` Alexei Starovoitov
2023-03-04 23:29     ` Andrii Nakryiko
2023-03-06  0:14       ` Alexei Starovoitov
2023-03-04 21:09   ` Jiri Olsa
2023-03-04 23:29     ` Andrii Nakryiko
2023-03-02 23:50 ` [PATCH bpf-next 17/17] selftests/bpf: add number iterator tests Andrii Nakryiko
2023-03-04 19:30 ` [PATCH bpf-next 00/17] BPF open-coded iterators patchwork-bot+netdevbpf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230302235015.2044271-15-andrii@kernel.org \
    --to=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=kernel-team@fb.com \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.