* [PATCH bpf-next v3 0/5] bpf_loop inlining
@ 2022-06-03 14:10 Eduard Zingerman
  2022-06-03 14:10 ` [PATCH bpf-next v3 1/5] selftests/bpf: specify expected instructions in test_verifier tests Eduard Zingerman
                   ` (4 more replies)
  0 siblings, 5 replies; 20+ messages in thread
From: Eduard Zingerman @ 2022-06-03 14:10 UTC (permalink / raw)
  To: bpf, ast, andrii, daniel, kernel-team, song; +Cc: eddyz87

Hi Everyone,

This is the next iteration of the patch set. Please find the updated
intro message and change log below.

This patch set implements inlining of calls to the bpf_loop helper
function when bpf_loop's callback is statically known. E.g. the
rewrite does the following transformation during BPF program processing:

  bpf_loop(10, foo, NULL, 0);

 ->

  for (int i = 0; i < 10; ++i)
    foo(i, NULL);

The transformation leads to a measurable latency reduction for simple
loops. Measurements using `benchs/run_bench_bpf_loop.sh` inside QEMU /
KVM on i7-4710HQ CPU show a drop in latency from 14 ns/op to 2 ns/op.
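
The "simple loop" measured by that benchmark is essentially an empty
callback driven by `bpf_loop`, roughly as in the following sketch
(illustrative only, not the exact benchmark source; the program and
callback names are made up and the usual selftests includes are
assumed):

  u32 nr_loops; /* set by the user-space part of the benchmark */

  static int empty_callback(__u32 index, void *data)
  {
  	return 0;
  }

  SEC("fentry/" SYS_PREFIX "sys_nanosleep")
  int benchmark_prog(void *ctx)
  {
  	bpf_loop(nr_loops, empty_callback, NULL, 0);
  	return 0;
  }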

The change is split into five parts:

* Update to test_verifier.c to specify expected and unexpected
  instruction sequences. This allows checking BPF program rewrites
  applied by the do_misc_fixups function.

* Update to test_verifier.c to specify BTF function infos and types
  per test case. This is necessary for tests that load sub-program
  addresses into a variable because of the checks applied by the
  check_ld_imm function.

* An update to verifier.c that tracks the state of the parameters for
  each bpf_loop call in a program and decides whether the call could
  be replaced by a loop.

* A set of test cases for `test_verifier` that use the capabilities
  added by the first two patches to verify the instructions produced
  by the inlining logic.

* Two test cases for `test_prog` to check that possible corner cases
  behave as expected.

Additional details are available in commit messages for each patch.

Changes since v2:
 - fix for `stack_check` test case in `test_progs-no_alu32`, all tests
   are passing now;
 - the v2 3/3 patch is split into three parts:
   - kernel changes
   - test_verifier changes
   - test_prog changes
 - updated `inline_bpf_loop` in `verifier.c` to calculate each offset
   used in instructions to avoid "magic" numbers;
 - removed the newline handling logic in the `fail_log` branch of
   `do_test_single` in `test_verifier.c` to simplify the patch set;
 - styling fixes suggested in review for v2 of this patch set.

Changes since v1:
 - allow the use of SKIP_INSNS in instruction pattern specifications in
   test_verifier tests;
 - fix for a bug in spill offset assignment for loop vars when
   bpf_loop is located in a non-main function.

Eduard Zingerman (5):
  selftests/bpf: specify expected instructions in test_verifier tests
  selftests/bpf: allow BTF specs and func infos in test_verifier tests
  bpf: Inline calls to bpf_loop when callback is known
  selftests/bpf: BPF test_verifier selftests for bpf_loop inlining
  selftests/bpf: BPF test_prog selftests for bpf_loop inlining

 include/linux/bpf_verifier.h                  |  16 +
 kernel/bpf/bpf_iter.c                         |   9 +-
 kernel/bpf/verifier.c                         | 199 ++++++++++-
 .../selftests/bpf/prog_tests/bpf_loop.c       |  61 ++++
 tools/testing/selftests/bpf/prog_tests/btf.c  |   1 -
 tools/testing/selftests/bpf/progs/bpf_loop.c  | 114 +++++++
 tools/testing/selftests/bpf/test_btf.h        |   2 +
 tools/testing/selftests/bpf/test_verifier.c   | 316 +++++++++++++++++-
 .../selftests/bpf/verifier/bpf_loop_inline.c  | 244 ++++++++++++++
 9 files changed, 935 insertions(+), 27 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/verifier/bpf_loop_inline.c

-- 
2.25.1


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH bpf-next v3 1/5] selftests/bpf: specify expected instructions in test_verifier tests
  2022-06-03 14:10 [PATCH bpf-next v3 0/5] bpf_loop inlining Eduard Zingerman
@ 2022-06-03 14:10 ` Eduard Zingerman
  2022-06-03 21:31   ` Song Liu
  2022-06-03 14:10 ` [PATCH bpf-next v3 2/5] selftests/bpf: allow BTF specs and func infos " Eduard Zingerman
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 20+ messages in thread
From: Eduard Zingerman @ 2022-06-03 14:10 UTC (permalink / raw)
  To: bpf, ast, andrii, daniel, kernel-team, song; +Cc: eddyz87

Allow specifying expected and unexpected instruction sequences in
test_verifier test cases. The instructions are requested from the
kernel after BPF program loading, thus allowing verification of some
of the transformations applied by the BPF verifier.

- the `expected_insns` field specifies a sequence of instructions
  expected to be found in the program;
- the `unexpected_insns` field specifies a sequence of instructions
  that is not expected to be found in the program;
- the `INSN_OFF_MASK` and `INSN_IMM_MASK` values can be used to mask
  the `off` and `imm` fields;
- `SKIP_INSNS` can be used to specify that some instructions in the
  (un)expected pattern are not important (behavior similar to the use
  of `\t` in the `errstr` field).

The intended usage is as follows:

  {
	"inline simple bpf_loop call",
	.insns = {
	/* main */
	BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 1),
	BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2,
			BPF_PSEUDO_FUNC, 0, 6),
    ...
	BPF_EXIT_INSN(),
	/* callback */
	BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 1),
	BPF_EXIT_INSN(),
	},
	.expected_insns = {
	BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 1),
	SKIP_INSNS(),
	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, BPF_PSEUDO_CALL, 8, 1)
	},
	.unexpected_insns = {
	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0,
			INSN_OFF_MASK, INSN_IMM_MASK),
	},
	.prog_type = BPF_PROG_TYPE_TRACEPOINT,
	.result = ACCEPT,
	.runs = 0,
  },

Here it is expected that the move of 1 to register 1 remains in place
and that the helper function call instruction is replaced by a relative
call instruction.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 tools/testing/selftests/bpf/test_verifier.c | 222 ++++++++++++++++++++
 1 file changed, 222 insertions(+)

diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
index 372579c9f45e..373f7661f4d0 100644
--- a/tools/testing/selftests/bpf/test_verifier.c
+++ b/tools/testing/selftests/bpf/test_verifier.c
@@ -51,6 +51,8 @@
 #endif
 
 #define MAX_INSNS	BPF_MAXINSNS
+#define MAX_EXPECTED_INSNS	32
+#define MAX_UNEXPECTED_INSNS	32
 #define MAX_TEST_INSNS	1000000
 #define MAX_FIXUPS	8
 #define MAX_NR_MAPS	23
@@ -58,6 +60,10 @@
 #define POINTER_VALUE	0xcafe4all
 #define TEST_DATA_LEN	64
 
+#define INSN_OFF_MASK	((s16)0xFFFF)
+#define INSN_IMM_MASK	((s32)0xFFFFFFFF)
+#define SKIP_INSNS()	BPF_RAW_INSN(0xde, 0xa, 0xd, 0xbeef, 0xdeadbeef)
+
 #define F_NEEDS_EFFICIENT_UNALIGNED_ACCESS	(1 << 0)
 #define F_LOAD_WITH_STRICT_ALIGNMENT		(1 << 1)
 
@@ -79,6 +85,19 @@ struct bpf_test {
 	const char *descr;
 	struct bpf_insn	insns[MAX_INSNS];
 	struct bpf_insn	*fill_insns;
+	/* If specified, test engine looks for this sequence of
+	 * instructions in the BPF program after loading. Allows to
+	 * test rewrites applied by verifier.  Use values
+	 * INSN_OFF_MASK and INSN_IMM_MASK to mask `off` and `imm`
+	 * fields if content does not matter.  The test case fails if
+	 * specified instructions are not found.
+	 *
+	 * The sequence could be split into sub-sequences by adding
+	 * SKIP_INSNS instruction at the end of each sub-sequence. In
+	 * such case sub-sequences are searched for one after another.
+	 */
+	struct bpf_insn expected_insns[MAX_EXPECTED_INSNS];
+	struct bpf_insn unexpected_insns[MAX_UNEXPECTED_INSNS];
 	int fixup_map_hash_8b[MAX_FIXUPS];
 	int fixup_map_hash_48b[MAX_FIXUPS];
 	int fixup_map_hash_16b[MAX_FIXUPS];
@@ -1126,6 +1145,206 @@ static bool cmp_str_seq(const char *log, const char *exp)
 	return true;
 }
 
+static int get_xlated_program(int fd_prog, struct bpf_insn **buf, int *cnt)
+{
+	struct bpf_prog_info info = {};
+	__u32 info_len = sizeof(info);
+	__u32 xlated_prog_len;
+	__u32 buf_elt_size = sizeof(**buf);
+
+	if (bpf_obj_get_info_by_fd(fd_prog, &info, &info_len)) {
+		perror("bpf_obj_get_info_by_fd failed");
+		return -1;
+	}
+
+	xlated_prog_len = info.xlated_prog_len;
+	if (xlated_prog_len % buf_elt_size) {
+		printf("Program length %d is not multiple of %d\n",
+		       xlated_prog_len, buf_elt_size);
+		return -1;
+	}
+
+	*cnt = xlated_prog_len / buf_elt_size;
+	*buf = calloc(*cnt, buf_elt_size);
+	if (!*buf) {
+		perror("can't allocate xlated program buffer");
+		return -ENOMEM;
+	}
+
+	bzero(&info, sizeof(info));
+	info.xlated_prog_len = xlated_prog_len;
+	info.xlated_prog_insns = (__u64)*buf;
+	if (bpf_obj_get_info_by_fd(fd_prog, &info, &info_len)) {
+		perror("second bpf_obj_get_info_by_fd failed");
+		goto out_free_buf;
+	}
+
+	return 0;
+
+out_free_buf:
+	free(*buf);
+	return -1;
+}
+
+static bool is_null_insn(struct bpf_insn *insn)
+{
+	struct bpf_insn null_insn = {};
+
+	return memcmp(insn, &null_insn, sizeof(null_insn)) == 0;
+}
+
+static bool is_skip_insn(struct bpf_insn *insn)
+{
+	struct bpf_insn skip_insn = SKIP_INSNS();
+
+	return memcmp(insn, &skip_insn, sizeof(skip_insn)) == 0;
+}
+
+static int null_terminated_insn_len(struct bpf_insn *seq, int max_len)
+{
+	for (int i = 0; i < max_len; ++i) {
+		if (is_null_insn(&seq[i]))
+			return i;
+	}
+	return max_len;
+}
+
+static bool compare_masked_insn(struct bpf_insn *orig, struct bpf_insn *masked)
+{
+	struct bpf_insn orig_masked;
+
+	memcpy(&orig_masked, orig, sizeof(orig_masked));
+	if (masked->imm == INSN_IMM_MASK)
+		orig_masked.imm = INSN_IMM_MASK;
+	if (masked->off == INSN_OFF_MASK)
+		orig_masked.off = INSN_OFF_MASK;
+
+	return memcmp(&orig_masked, masked, sizeof(orig_masked)) == 0;
+}
+
+static int find_insn_subseq(struct bpf_insn *seq, struct bpf_insn *subseq,
+			    int seq_len, int subseq_len)
+{
+	if (subseq_len > seq_len)
+		return -1;
+
+	for (int i = 0; i < seq_len - subseq_len + 1; ++i) {
+		bool found = true;
+
+		for (int j = 0; j < subseq_len; ++j) {
+			if (!compare_masked_insn(&seq[i + j], &subseq[j])) {
+				found = false;
+				break;
+			}
+		}
+		if (found)
+			return i;
+	}
+
+	return -1;
+}
+
+static int find_skip_insn_marker(struct bpf_insn *seq, int len)
+{
+	for (int i = 0; i < len; ++i)
+		if (is_skip_insn(&seq[i]))
+			return i;
+
+	return -1;
+}
+
+/* Return true if all sub-sequences in `subseqs` could be found in
+ * `seq` one after another. Sub-sequences are separated by a single
+ * nil instruction.
+ */
+static bool find_all_insn_subseqs(struct bpf_insn *seq, struct bpf_insn *subseqs,
+				  int seq_len, int max_subseqs_len)
+{
+	int subseqs_len = null_terminated_insn_len(subseqs, max_subseqs_len);
+
+	while (subseqs_len > 0) {
+		int skip_idx = find_skip_insn_marker(subseqs, subseqs_len);
+		int cur_subseq_len = skip_idx < 0 ? subseqs_len : skip_idx;
+		int subseq_idx = find_insn_subseq(seq, subseqs,
+						  seq_len, cur_subseq_len);
+
+		if (subseq_idx < 0)
+			return false;
+		seq += subseq_idx + cur_subseq_len;
+		seq_len -= subseq_idx + cur_subseq_len;
+		subseqs += cur_subseq_len + 1;
+		subseqs_len -= cur_subseq_len + 1;
+	}
+
+	return true;
+}
+
+static void print_insn(struct bpf_insn *buf, int cnt)
+{
+	printf("  addr  op d s off  imm\n");
+	for (int i = 0; i < cnt; ++i) {
+		struct bpf_insn *insn = &buf[i];
+
+		if (is_null_insn(insn))
+			break;
+
+		if (is_skip_insn(insn))
+			printf("  ...\n");
+		else
+			printf("  %04x: %02x %1x %x %04hx %08x\n",
+			       i, insn->code, insn->dst_reg,
+			       insn->src_reg, insn->off, insn->imm);
+	}
+}
+
+static bool check_xlated_program(struct bpf_test *test, int fd_prog)
+{
+	struct bpf_insn *buf;
+	int cnt;
+	bool result = true;
+	bool check_expected = !is_null_insn(test->expected_insns);
+	bool check_unexpected = !is_null_insn(test->unexpected_insns);
+
+	if (!check_expected && !check_unexpected)
+		goto out;
+
+	if (get_xlated_program(fd_prog, &buf, &cnt)) {
+		printf("FAIL: can't get xlated program\n");
+		result = false;
+		goto out;
+	}
+
+	if (check_expected &&
+	    !find_all_insn_subseqs(buf, test->expected_insns,
+				   cnt, MAX_EXPECTED_INSNS)) {
+		printf("FAIL: can't find expected subsequence of instructions\n");
+		result = false;
+		if (verbose) {
+			printf("Program:\n");
+			print_insn(buf, cnt);
+			printf("Expected subsequence:\n");
+			print_insn(test->expected_insns, MAX_EXPECTED_INSNS);
+		}
+	}
+
+	if (check_unexpected &&
+	    find_all_insn_subseqs(buf, test->unexpected_insns,
+				  cnt, MAX_UNEXPECTED_INSNS)) {
+		printf("FAIL: found unexpected subsequence of instructions\n");
+		result = false;
+		if (verbose) {
+			printf("Program:\n");
+			print_insn(buf, cnt);
+			printf("Un-expected subsequence:\n");
+			print_insn(test->unexpected_insns, MAX_UNEXPECTED_INSNS);
+		}
+	}
+
+	free(buf);
+ out:
+	return result;
+}
+
 static void do_test_single(struct bpf_test *test, bool unpriv,
 			   int *passes, int *errors)
 {
@@ -1262,6 +1481,9 @@ static void do_test_single(struct bpf_test *test, bool unpriv,
 	if (verbose)
 		printf(", verifier log:\n%s", bpf_vlog);
 
+	if (!check_xlated_program(test, fd_prog))
+		goto fail_log;
+
 	run_errs = 0;
 	run_successes = 0;
 	if (!alignment_prevented_execution && fd_prog >= 0 && test->runs >= 0) {
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH bpf-next v3 2/5] selftests/bpf: allow BTF specs and func infos in test_verifier tests
  2022-06-03 14:10 [PATCH bpf-next v3 0/5] bpf_loop inlining Eduard Zingerman
  2022-06-03 14:10 ` [PATCH bpf-next v3 1/5] selftests/bpf: specify expected instructions in test_verifier tests Eduard Zingerman
@ 2022-06-03 14:10 ` Eduard Zingerman
  2022-06-03 21:41   ` Song Liu
  2022-06-03 14:10 ` [PATCH bpf-next v3 3/5] bpf: Inline calls to bpf_loop when callback is known Eduard Zingerman
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 20+ messages in thread
From: Eduard Zingerman @ 2022-06-03 14:10 UTC (permalink / raw)
  To: bpf, ast, andrii, daniel, kernel-team, song; +Cc: eddyz87

The BTF and func_info specification for test_verifier tests follows
the same notation as in prog_tests/btf.c tests. E.g.:

  ...
  .func_info = { { 0, 6 }, { 8, 7 } },
  .func_info_cnt = 2,
  .btf_strings = "\0int\0",
  .btf_types = {
    BTF_TYPE_INT_ENC(1, BTF_INT_SIGNED, 0, 32, 4),
    BTF_PTR_ENC(1),
  },
  ...

The BTF is loaded only when the specification is present in the test case.
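
Each `func_info` entry is a `struct bpf_func_info` pair of instruction
offset and BTF type id, so the example above can be read as follows
(annotations added here for illustration):

  .func_info = {
  	{ 0, 6 },	/* sub-program starting at insn #0 has BTF type id 6 */
  	{ 8, 7 },	/* sub-program starting at insn #8 has BTF type id 7 */
  },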

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 tools/testing/selftests/bpf/prog_tests/btf.c |  1 -
 tools/testing/selftests/bpf/test_btf.h       |  2 +
 tools/testing/selftests/bpf/test_verifier.c  | 94 ++++++++++++++++----
 3 files changed, 79 insertions(+), 18 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/btf.c b/tools/testing/selftests/bpf/prog_tests/btf.c
index ba5bde53d418..bebc98bad615 100644
--- a/tools/testing/selftests/bpf/prog_tests/btf.c
+++ b/tools/testing/selftests/bpf/prog_tests/btf.c
@@ -34,7 +34,6 @@ static bool always_log;
 #undef CHECK
 #define CHECK(condition, format...) _CHECK(condition, "check", duration, format)
 
-#define BTF_END_RAW 0xdeadbeef
 #define NAME_TBD 0xdeadb33f
 
 #define NAME_NTH(N) (0xfffe0000 | N)
diff --git a/tools/testing/selftests/bpf/test_btf.h b/tools/testing/selftests/bpf/test_btf.h
index 128989bed8b7..044663c45a57 100644
--- a/tools/testing/selftests/bpf/test_btf.h
+++ b/tools/testing/selftests/bpf/test_btf.h
@@ -4,6 +4,8 @@
 #ifndef _TEST_BTF_H
 #define _TEST_BTF_H
 
+#define BTF_END_RAW 0xdeadbeef
+
 #define BTF_INFO_ENC(kind, kind_flag, vlen)			\
 	((!!(kind_flag) << 31) | ((kind) << 24) | ((vlen) & BTF_MAX_VLEN))
 
diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
index 373f7661f4d0..a3a16e2ba6d9 100644
--- a/tools/testing/selftests/bpf/test_verifier.c
+++ b/tools/testing/selftests/bpf/test_verifier.c
@@ -59,11 +59,17 @@
 #define MAX_TEST_RUNS	8
 #define POINTER_VALUE	0xcafe4all
 #define TEST_DATA_LEN	64
+#define MAX_FUNC_INFOS	8
+#define MAX_BTF_STRINGS	256
+#define MAX_BTF_TYPES	256
 
 #define INSN_OFF_MASK	((s16)0xFFFF)
 #define INSN_IMM_MASK	((s32)0xFFFFFFFF)
 #define SKIP_INSNS()	BPF_RAW_INSN(0xde, 0xa, 0xd, 0xbeef, 0xdeadbeef)
 
+#define DEFAULT_LIBBPF_LOG_LEVEL	4
+#define VERBOSE_LIBBPF_LOG_LEVEL	1
+
 #define F_NEEDS_EFFICIENT_UNALIGNED_ACCESS	(1 << 0)
 #define F_LOAD_WITH_STRICT_ALIGNMENT		(1 << 1)
 
@@ -154,6 +160,14 @@ struct bpf_test {
 	};
 	enum bpf_attach_type expected_attach_type;
 	const char *kfunc;
+	struct bpf_func_info func_info[MAX_FUNC_INFOS];
+	int func_info_cnt;
+	char btf_strings[MAX_BTF_STRINGS];
+	/* A set of BTF types to load when specified,
+	 * use macro definitions from test_btf.h,
+	 * must end with BTF_END_RAW
+	 */
+	__u32 btf_types[MAX_BTF_TYPES];
 };
 
 /* Note we want this to be 64 bit aligned so that the end of our array is
@@ -683,34 +697,66 @@ static __u32 btf_raw_types[] = {
 	BTF_MEMBER_ENC(71, 13, 128), /* struct prog_test_member __kptr_ref *ptr; */
 };
 
-static int load_btf(void)
+static char bpf_vlog[UINT_MAX >> 8];
+
+static int load_btf_spec(__u32 *types, int types_len,
+			 const char *strings, int strings_len)
 {
 	struct btf_header hdr = {
 		.magic = BTF_MAGIC,
 		.version = BTF_VERSION,
 		.hdr_len = sizeof(struct btf_header),
-		.type_len = sizeof(btf_raw_types),
-		.str_off = sizeof(btf_raw_types),
-		.str_len = sizeof(btf_str_sec),
+		.type_len = types_len,
+		.str_off = types_len,
+		.str_len = strings_len,
 	};
 	void *ptr, *raw_btf;
 	int btf_fd;
+	LIBBPF_OPTS(bpf_btf_load_opts, opts,
+		    .log_buf = bpf_vlog,
+		    .log_size = sizeof(bpf_vlog),
+		    .log_level = (verbose
+				  ? VERBOSE_LIBBPF_LOG_LEVEL
+				  : DEFAULT_LIBBPF_LOG_LEVEL),
+	);
 
-	ptr = raw_btf = malloc(sizeof(hdr) + sizeof(btf_raw_types) +
-			       sizeof(btf_str_sec));
+	raw_btf = malloc(sizeof(hdr) + types_len + strings_len);
 
+	ptr = raw_btf;
 	memcpy(ptr, &hdr, sizeof(hdr));
 	ptr += sizeof(hdr);
-	memcpy(ptr, btf_raw_types, hdr.type_len);
+	memcpy(ptr, types, hdr.type_len);
 	ptr += hdr.type_len;
-	memcpy(ptr, btf_str_sec, hdr.str_len);
+	memcpy(ptr, strings, hdr.str_len);
 	ptr += hdr.str_len;
 
-	btf_fd = bpf_btf_load(raw_btf, ptr - raw_btf, NULL);
-	free(raw_btf);
+	btf_fd = bpf_btf_load(raw_btf, ptr - raw_btf, &opts);
 	if (btf_fd < 0)
-		return -1;
-	return btf_fd;
+		printf("Failed to load BTF spec: '%s'\n", strerror(errno));
+
+	free(raw_btf);
+
+	return btf_fd < 0 ? -1 : btf_fd;
+}
+
+static int load_btf(void)
+{
+	return load_btf_spec(btf_raw_types, sizeof(btf_raw_types),
+			     btf_str_sec, sizeof(btf_str_sec));
+}
+
+static int load_btf_for_test(struct bpf_test *test)
+{
+	int types_num = 0;
+
+	while (types_num < MAX_BTF_TYPES &&
+	       test->btf_types[types_num] != BTF_END_RAW)
+		++types_num;
+
+	int types_len = types_num * sizeof(test->btf_types[0]);
+
+	return load_btf_spec(test->btf_types, types_len,
+			     test->btf_strings, sizeof(test->btf_strings));
 }
 
 static int create_map_spin_lock(void)
@@ -789,8 +835,6 @@ static int create_map_kptr(void)
 	return fd;
 }
 
-static char bpf_vlog[UINT_MAX >> 8];
-
 static void do_test_fixup(struct bpf_test *test, enum bpf_prog_type prog_type,
 			  struct bpf_insn *prog, int *map_fds)
 {
@@ -1348,7 +1392,7 @@ static bool check_xlated_program(struct bpf_test *test, int fd_prog)
 static void do_test_single(struct bpf_test *test, bool unpriv,
 			   int *passes, int *errors)
 {
-	int fd_prog, expected_ret, alignment_prevented_execution;
+	int fd_prog, btf_fd, expected_ret, alignment_prevented_execution;
 	int prog_len, prog_type = test->prog_type;
 	struct bpf_insn *prog = test->insns;
 	LIBBPF_OPTS(bpf_prog_load_opts, opts);
@@ -1360,8 +1404,10 @@ static void do_test_single(struct bpf_test *test, bool unpriv,
 	__u32 pflags;
 	int i, err;
 
+	fd_prog = -1;
 	for (i = 0; i < MAX_NR_MAPS; i++)
 		map_fds[i] = -1;
+	btf_fd = -1;
 
 	if (!prog_type)
 		prog_type = BPF_PROG_TYPE_SOCKET_FILTER;
@@ -1394,11 +1440,11 @@ static void do_test_single(struct bpf_test *test, bool unpriv,
 
 	opts.expected_attach_type = test->expected_attach_type;
 	if (verbose)
-		opts.log_level = 1;
+		opts.log_level = VERBOSE_LIBBPF_LOG_LEVEL;
 	else if (expected_ret == VERBOSE_ACCEPT)
 		opts.log_level = 2;
 	else
-		opts.log_level = 4;
+		opts.log_level = DEFAULT_LIBBPF_LOG_LEVEL;
 	opts.prog_flags = pflags;
 
 	if (prog_type == BPF_PROG_TYPE_TRACING && test->kfunc) {
@@ -1416,6 +1462,19 @@ static void do_test_single(struct bpf_test *test, bool unpriv,
 		opts.attach_btf_id = attach_btf_id;
 	}
 
+	if (test->btf_types[0] != 0) {
+		btf_fd = load_btf_for_test(test);
+		if (btf_fd < 0)
+			goto fail_log;
+		opts.prog_btf_fd = btf_fd;
+	}
+
+	if (test->func_info_cnt != 0) {
+		opts.func_info = test->func_info;
+		opts.func_info_cnt = test->func_info_cnt;
+		opts.func_info_rec_size = sizeof(test->func_info[0]);
+	}
+
 	opts.log_buf = bpf_vlog;
 	opts.log_size = sizeof(bpf_vlog);
 	fd_prog = bpf_prog_load(prog_type, NULL, "GPL", prog, prog_len, &opts);
@@ -1527,6 +1586,7 @@ static void do_test_single(struct bpf_test *test, bool unpriv,
 	if (test->fill_insns)
 		free(test->fill_insns);
 	close(fd_prog);
+	close(btf_fd);
 	for (i = 0; i < MAX_NR_MAPS; i++)
 		close(map_fds[i]);
 	sched_yield();
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH bpf-next v3 3/5] bpf: Inline calls to bpf_loop when callback is known
  2022-06-03 14:10 [PATCH bpf-next v3 0/5] bpf_loop inlining Eduard Zingerman
  2022-06-03 14:10 ` [PATCH bpf-next v3 1/5] selftests/bpf: specify expected instructions in test_verifier tests Eduard Zingerman
  2022-06-03 14:10 ` [PATCH bpf-next v3 2/5] selftests/bpf: allow BTF specs and func infos " Eduard Zingerman
@ 2022-06-03 14:10 ` Eduard Zingerman
  2022-06-03 22:36   ` Song Liu
                     ` (2 more replies)
  2022-06-03 14:10 ` [PATCH bpf-next v3 4/5] selftests/bpf: BPF test_verifier selftests for bpf_loop inlining Eduard Zingerman
  2022-06-03 14:10 ` [PATCH bpf-next v3 5/5] selftests/bpf: BPF test_prog " Eduard Zingerman
  4 siblings, 3 replies; 20+ messages in thread
From: Eduard Zingerman @ 2022-06-03 14:10 UTC (permalink / raw)
  To: bpf, ast, andrii, daniel, kernel-team, song; +Cc: eddyz87

Calls to `bpf_loop` are replaced with direct loops to avoid
indirection. E.g. the following:

  bpf_loop(10, foo, NULL, 0);

is replaced by the equivalent of the following:

  for (int i = 0; i < 10; ++i)
    foo(i, NULL);

This transformation can be applied when:
- the callback is known and does not change during program execution;
- the flags passed to `bpf_loop` are always zero.
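
For illustration, a minimal sketch of calls that do and do not qualify
(the callback and `selector` names below are made up):

  static int cb_a(int i, void *ctx) { return 0; }
  static int cb_b(int i, void *ctx) { return 0; }

  int selector; /* e.g. set from user space */

  ...
  int (*cb)(int i, void *ctx) = selector ? cb_a : cb_b;

  bpf_loop(10, cb_a, NULL, 0); /* callback constant, flags zero: inlined */
  bpf_loop(10, cb,   NULL, 0); /* callback is not constant: not inlined  */
  bpf_loop(10, cb_a, NULL, 1); /* flags are not zero: not inlined        */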

The inlining logic works as follows:

- During execution simulation, the function `update_loop_inline_state`
  tracks the following information for each `bpf_loop` call
  instruction:
  - is the callback known and constant?
  - are the flags constant and zero?
- The function `adjust_stack_depth_for_loop_inlining` increases the
  stack depth for functions where `bpf_loop` calls could be inlined.
  This is needed to spill registers R6, R7 and R8, which are used as
  the loop counter, the loop upper bound and the callback context
  parameter;
- The function `inline_bpf_loop`, called from `do_misc_fixups`,
  replaces `bpf_loop` calls fit for inlining with the corresponding
  loop instructions.
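
The semantics of the generated instruction sequence roughly correspond
to the following C sketch (illustration only: the actual rewrite emits
BPF instructions, additionally spills/restores R6, R7 and R8 around
this logic, and the function name below is made up):

  static long bpf_loop_inlined_model(u32 nr_loops,
  				     long (*callback)(u64 i, void *ctx),
  				     void *ctx)
  {
  	u64 i = 0;

  	/* repeats the check done in the bpf_loop helper */
  	if (nr_loops > BPF_MAX_LOOPS)
  		return -E2BIG;

  	while (i < nr_loops) {
  		long ret = callback(i, ctx);

  		i++;
  		/* a non-zero return value from the callback stops the loop */
  		if (ret)
  			break;
  	}

  	/* R0 receives the number of iterations performed */
  	return i;
  }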

Measurements using `benchs/run_bench_bpf_loop.sh` inside QEMU / KVM on
i7-4710HQ CPU show a drop in latency from 14 ns/op to 2 ns/op.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 include/linux/bpf_verifier.h |  16 +++
 kernel/bpf/bpf_iter.c        |   9 +-
 kernel/bpf/verifier.c        | 199 ++++++++++++++++++++++++++++++++++-
 3 files changed, 215 insertions(+), 9 deletions(-)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index e8439f6cbe57..80279616a76b 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -20,6 +20,8 @@
 #define BPF_MAX_VAR_SIZ	(1 << 29)
 /* size of type_str_buf in bpf_verifier. */
 #define TYPE_STR_BUF_LEN 64
+/* Maximum number of loops for bpf_loop */
+#define BPF_MAX_LOOPS	BIT(23)
 
 /* Liveness marks, used for registers and spilled-regs (in stack slots).
  * Read marks propagate upwards until they find a write mark; they record that
@@ -344,6 +346,16 @@ struct bpf_verifier_state_list {
 	int miss_cnt, hit_cnt;
 };
 
+struct bpf_loop_inline_state {
+	u32 initialized:1; /* set to true upon first entry */
+	u32 callback_is_constant:1; /* true if callback function
+				     * is the same at each call
+				     */
+	u32 flags_is_zero:1; /* true if flags register is zero at each call */
+	u32 callback_subprogno; /* valid when callback_is_constant == 1 */
+	s32 stack_base; /* stack offset for loop vars */
+};
+
 /* Possible states for alu_state member. */
 #define BPF_ALU_SANITIZE_SRC		(1U << 0)
 #define BPF_ALU_SANITIZE_DST		(1U << 1)
@@ -380,6 +392,10 @@ struct bpf_insn_aux_data {
 	bool sanitize_stack_spill; /* subject to Spectre v4 sanitation */
 	bool zext_dst; /* this insn zero extends dst reg */
 	u8 alu_state; /* used in combination with alu_limit */
+	/* if instruction is a call to bpf_loop this field tracks
+	 * the state of the relevant registers to take decision about inlining
+	 */
+	struct bpf_loop_inline_state loop_inline_state;
 
 	/* below fields are initialized once */
 	unsigned int orig_idx; /* original instruction index */
diff --git a/kernel/bpf/bpf_iter.c b/kernel/bpf/bpf_iter.c
index d5d96ceca105..cdb898fce118 100644
--- a/kernel/bpf/bpf_iter.c
+++ b/kernel/bpf/bpf_iter.c
@@ -6,6 +6,7 @@
 #include <linux/filter.h>
 #include <linux/bpf.h>
 #include <linux/rcupdate_trace.h>
+#include <linux/bpf_verifier.h>
 
 struct bpf_iter_target_info {
 	struct list_head list;
@@ -723,9 +724,6 @@ const struct bpf_func_proto bpf_for_each_map_elem_proto = {
 	.arg4_type	= ARG_ANYTHING,
 };
 
-/* maximum number of loops */
-#define MAX_LOOPS	BIT(23)
-
 BPF_CALL_4(bpf_loop, u32, nr_loops, void *, callback_fn, void *, callback_ctx,
 	   u64, flags)
 {
@@ -733,9 +731,12 @@ BPF_CALL_4(bpf_loop, u32, nr_loops, void *, callback_fn, void *, callback_ctx,
 	u64 ret;
 	u32 i;
 
+	/* note: these safety checks are also verified when bpf_loop is inlined,
+	 * be careful to modify this code in sync
+	 */
 	if (flags)
 		return -EINVAL;
-	if (nr_loops > MAX_LOOPS)
+	if (nr_loops > BPF_MAX_LOOPS)
 		return -E2BIG;
 
 	for (i = 0; i < nr_loops; i++) {
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index aedac2ac02b9..27d78fe6c3f9 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -7103,6 +7103,78 @@ static int check_get_func_ip(struct bpf_verifier_env *env)
 	return -ENOTSUPP;
 }
 
+static struct bpf_insn_aux_data *cur_aux(struct bpf_verifier_env *env)
+{
+	return &env->insn_aux_data[env->insn_idx];
+}
+
+static bool fit_for_bpf_loop_inline(struct bpf_insn_aux_data *insn_aux)
+{
+	return insn_aux->loop_inline_state.initialized &&
+		insn_aux->loop_inline_state.flags_is_zero &&
+		insn_aux->loop_inline_state.callback_is_constant;
+}
+
+/* For all sub-programs in the program (including main) checks
+ * insn_aux_data to see if there are bpf_loop calls that require
+ * inlining. If such calls are found subprog stack_depth is increased
+ * by the size of 3 registers. Reserved space would be used in the
+ * do_misc_fixups to spill values of the R6, R7, R8 to use these
+ * registers for loop iteration.
+ */
+static void adjust_stack_depth_for_loop_inlining(struct bpf_verifier_env *env)
+{
+	int i, subprog_end, cur_subprog = 0;
+	struct bpf_subprog_info *subprogs = env->subprog_info;
+	int insn_cnt = env->prog->len;
+	bool subprog_updated = false;
+	s32 stack_base;
+
+	subprog_end = (env->subprog_cnt > 1
+		       ? subprogs[cur_subprog + 1].start
+		       : insn_cnt);
+	for (i = 0; i < insn_cnt; i++) {
+		struct bpf_insn_aux_data *aux = &env->insn_aux_data[i];
+
+		if (fit_for_bpf_loop_inline(aux)) {
+			if (!subprog_updated) {
+				subprog_updated = true;
+				subprogs[cur_subprog].stack_depth += BPF_REG_SIZE * 3;
+				stack_base = -subprogs[cur_subprog].stack_depth;
+			}
+			aux->loop_inline_state.stack_base = stack_base;
+		}
+		if (i == subprog_end - 1) {
+			subprog_updated = false;
+			cur_subprog++;
+			if (cur_subprog < env->subprog_cnt)
+				subprog_end = subprogs[cur_subprog + 1].start;
+		}
+	}
+
+	env->prog->aux->stack_depth = env->subprog_info[0].stack_depth;
+}
+
+static void update_loop_inline_state(struct bpf_verifier_env *env, u32 subprogno)
+{
+	struct bpf_loop_inline_state *state = &cur_aux(env)->loop_inline_state;
+	struct bpf_reg_state *regs = cur_regs(env);
+	struct bpf_reg_state *flags_reg = &regs[BPF_REG_4];
+
+	int flags_is_zero =
+		register_is_const(flags_reg) && flags_reg->var_off.value == 0;
+
+	if (state->initialized) {
+		state->flags_is_zero &= flags_is_zero;
+		state->callback_is_constant &= state->callback_subprogno == subprogno;
+	} else {
+		state->initialized = 1;
+		state->callback_is_constant = 1;
+		state->flags_is_zero = flags_is_zero;
+		state->callback_subprogno = subprogno;
+	}
+}
+
 static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 			     int *insn_idx_p)
 {
@@ -7255,6 +7327,7 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
 		err = check_bpf_snprintf_call(env, regs);
 		break;
 	case BPF_FUNC_loop:
+		update_loop_inline_state(env, meta.subprogno);
 		err = __check_func_call(env, insn, insn_idx_p, meta.subprogno,
 					set_loop_callback_state);
 		break;
@@ -7661,11 +7734,6 @@ static bool check_reg_sane_offset(struct bpf_verifier_env *env,
 	return true;
 }
 
-static struct bpf_insn_aux_data *cur_aux(struct bpf_verifier_env *env)
-{
-	return &env->insn_aux_data[env->insn_idx];
-}
-
 enum {
 	REASON_BOUNDS	= -1,
 	REASON_TYPE	= -2,
@@ -12920,6 +12988,22 @@ static struct bpf_prog *bpf_patch_insn_data(struct bpf_verifier_env *env, u32 of
 	return new_prog;
 }
 
+static void adjust_loop_inline_subprogno(struct bpf_verifier_env *env,
+					 u32 first_removed,
+					 u32 first_remaining)
+{
+	int delta = first_remaining - first_removed;
+
+	for (int i = 0; i < env->prog->len; ++i) {
+		struct bpf_loop_inline_state *state =
+			&env->insn_aux_data[i].loop_inline_state;
+
+		if (state->initialized &&
+		    state->callback_subprogno >= first_remaining)
+			state->callback_subprogno -= delta;
+	}
+}
+
 static int adjust_subprog_starts_after_remove(struct bpf_verifier_env *env,
 					      u32 off, u32 cnt)
 {
@@ -12963,6 +13047,8 @@ static int adjust_subprog_starts_after_remove(struct bpf_verifier_env *env,
 			 * in adjust_btf_func() - no need to adjust
 			 */
 		}
+
+		adjust_loop_inline_subprogno(env, i, j);
 	} else {
 		/* convert i from "first prog to remove" to "first to adjust" */
 		if (env->subprog_info[i].start == off)
@@ -13773,6 +13859,94 @@ static int fixup_kfunc_call(struct bpf_verifier_env *env,
 	return 0;
 }
 
+static struct bpf_prog *inline_bpf_loop(struct bpf_verifier_env *env,
+					int position, u32 *cnt)
+{
+	struct bpf_insn_aux_data *aux = &env->insn_aux_data[position];
+	s32 stack_base = aux->loop_inline_state.stack_base;
+	s32 r6_offset = stack_base + 0 * BPF_REG_SIZE;
+	s32 r7_offset = stack_base + 1 * BPF_REG_SIZE;
+	s32 r8_offset = stack_base + 2 * BPF_REG_SIZE;
+	int reg_loop_max = BPF_REG_6;
+	int reg_loop_cnt = BPF_REG_7;
+	int reg_loop_ctx = BPF_REG_8;
+
+	struct bpf_prog *new_prog;
+	u32 callback_subprogno = aux->loop_inline_state.callback_subprogno;
+	u32 callback_start;
+	u32 call_insn_offset;
+	s32 callback_offset;
+	struct bpf_insn insn_buf[19];
+	struct bpf_insn *next = insn_buf;
+	struct bpf_insn *call, *jump_to_end, *loop_header;
+	struct bpf_insn *jump_to_header, *loop_exit;
+
+	/* Return error and jump to the end of the patch if
+	 * expected number of iterations is too big.  This
+	 * repeats the check done in bpf_loop helper function,
+	 * be careful to modify this code in sync.
+	 */
+	(*next++) = BPF_JMP_IMM(BPF_JLE, BPF_REG_1, BPF_MAX_LOOPS, 2);
+	(*next++) = BPF_MOV32_IMM(BPF_REG_0, -E2BIG);
+	jump_to_end = next;
+	(*next++) = BPF_JMP_IMM(BPF_JA, 0, 0, 0 /* set below */);
+	/* spill R6, R7, R8 to use these as loop vars */
+	(*next++) = BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_6, r6_offset);
+	(*next++) = BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_7, r7_offset);
+	(*next++) = BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_8, r8_offset);
+	/* initialize loop vars */
+	(*next++) = BPF_MOV64_REG(reg_loop_max, BPF_REG_1);
+	(*next++) = BPF_MOV32_IMM(reg_loop_cnt, 0);
+	(*next++) = BPF_MOV64_REG(reg_loop_ctx, BPF_REG_3);
+	/* loop header;
+	 * if reg_loop_cnt >= reg_loop_max skip the loop body
+	 */
+	loop_header = next;
+	(*next++) = BPF_JMP_REG(BPF_JGE, reg_loop_cnt, reg_loop_max,
+				0 /* set below */);
+	/* callback call */
+	(*next++) = BPF_MOV64_REG(BPF_REG_1, reg_loop_cnt);
+	(*next++) = BPF_MOV64_REG(BPF_REG_2, reg_loop_ctx);
+	call = next;
+	(*next++) = BPF_CALL_REL(0 /* set below after patching */);
+	/* increment loop counter */
+	(*next++) = BPF_ALU64_IMM(BPF_ADD, reg_loop_cnt, 1);
+	/* jump to loop header if callback returned 0 */
+	jump_to_header = next;
+	(*next++) = BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 0 /* set below */);
+	/* return value of bpf_loop;
+	 * set R0 to the number of iterations
+	 */
+	loop_exit = next;
+	(*next++) = BPF_MOV64_REG(BPF_REG_0, reg_loop_cnt);
+	/* restore original values of R6, R7, R8 */
+	(*next++) = BPF_LDX_MEM(BPF_DW, BPF_REG_6, BPF_REG_10, r6_offset);
+	(*next++) = BPF_LDX_MEM(BPF_DW, BPF_REG_7, BPF_REG_10, r7_offset);
+	(*next++) = BPF_LDX_MEM(BPF_DW, BPF_REG_8, BPF_REG_10, r8_offset);
+
+	*cnt = next - insn_buf;
+	if (*cnt > ARRAY_SIZE(insn_buf)) {
+		WARN_ONCE(1, "BUG %s: 'next' exceeds bounds for 'insn_buf'\n",
+			  __func__);
+		return NULL;
+	}
+	jump_to_end->off = next - jump_to_end - 1;
+	loop_header->off = loop_exit - loop_header - 1;
+	jump_to_header->off = loop_header - jump_to_header - 1;
+
+	new_prog = bpf_patch_insn_data(env, position, insn_buf, *cnt);
+	if (!new_prog)
+		return new_prog;
+
+	/* callback start is known only after patching */
+	callback_start = env->subprog_info[callback_subprogno].start;
+	call_insn_offset = position + (call - insn_buf);
+	callback_offset = callback_start - call_insn_offset - 1;
+	env->prog->insnsi[call_insn_offset].imm = callback_offset;
+
+	return new_prog;
+}
+
 /* Do various post-verification rewrites in a single program pass.
  * These rewrites simplify JIT and interpreter implementations.
  */
@@ -14258,6 +14432,18 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
 			continue;
 		}
 
+		if (insn->imm == BPF_FUNC_loop &&
+		    fit_for_bpf_loop_inline(&env->insn_aux_data[i + delta])) {
+			new_prog = inline_bpf_loop(env, i + delta, &cnt);
+			if (!new_prog)
+				return -ENOMEM;
+
+			delta    += cnt - 1;
+			env->prog = prog = new_prog;
+			insn      = new_prog->insnsi + i + delta;
+			continue;
+		}
+
 patch_call_imm:
 		fn = env->ops->get_func_proto(insn->imm, env->prog);
 		/* all functions that have prototype and verifier allowed
@@ -15030,6 +15216,9 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr)
 	if (ret == 0)
 		ret = check_max_stack_depth(env);
 
+	if (ret == 0)
+		adjust_stack_depth_for_loop_inlining(env);
+
 	/* instruction rewrites happen after this point */
 	if (is_priv) {
 		if (ret == 0)
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH bpf-next v3 4/5] selftests/bpf: BPF test_verifier selftests for bpf_loop inlining
  2022-06-03 14:10 [PATCH bpf-next v3 0/5] bpf_loop inlining Eduard Zingerman
                   ` (2 preceding siblings ...)
  2022-06-03 14:10 ` [PATCH bpf-next v3 3/5] bpf: Inline calls to bpf_loop when callback is known Eduard Zingerman
@ 2022-06-03 14:10 ` Eduard Zingerman
  2022-06-03 22:38   ` Song Liu
  2022-06-03 14:10 ` [PATCH bpf-next v3 5/5] selftests/bpf: BPF test_prog " Eduard Zingerman
  4 siblings, 1 reply; 20+ messages in thread
From: Eduard Zingerman @ 2022-06-03 14:10 UTC (permalink / raw)
  To: bpf, ast, andrii, daniel, kernel-team, song; +Cc: eddyz87

A number of test cases for the BPF test_verifier selftest to check how
the bpf_loop inlining transformation rewrites the BPF program. The
following cases are covered:
 - happy path
 - no rewrite when flags is non-zero
 - no rewrite when the callback is non-constant
 - subprogno in insn_aux is updated correctly when dead sub-programs
   are removed
 - correct stack offsets are assigned for spilling of the R6-R8
   registers

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 .../selftests/bpf/verifier/bpf_loop_inline.c  | 244 ++++++++++++++++++
 1 file changed, 244 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/verifier/bpf_loop_inline.c

diff --git a/tools/testing/selftests/bpf/verifier/bpf_loop_inline.c b/tools/testing/selftests/bpf/verifier/bpf_loop_inline.c
new file mode 100644
index 000000000000..d1fbcfef69f2
--- /dev/null
+++ b/tools/testing/selftests/bpf/verifier/bpf_loop_inline.c
@@ -0,0 +1,244 @@
+#define BTF_TYPES \
+	.btf_strings = "\0int\0i\0ctx\0callback\0main\0", \
+	.btf_types = { \
+	/* 1: int   */ BTF_TYPE_INT_ENC(1, BTF_INT_SIGNED, 0, 32, 4), \
+	/* 2: int*  */ BTF_PTR_ENC(1), \
+	/* 3: void* */ BTF_PTR_ENC(0), \
+	/* 4: int __(void*) */ BTF_FUNC_PROTO_ENC(1, 1), \
+		BTF_FUNC_PROTO_ARG_ENC(7, 3), \
+	/* 5: int __(int, int*) */ BTF_FUNC_PROTO_ENC(1, 2), \
+		BTF_FUNC_PROTO_ARG_ENC(5, 1), \
+		BTF_FUNC_PROTO_ARG_ENC(7, 2), \
+	/* 6: main      */ BTF_FUNC_ENC(20, 4), \
+	/* 7: callback  */ BTF_FUNC_ENC(11, 5), \
+	BTF_END_RAW \
+	}
+
+#define MAIN_TYPE	6
+#define CALLBACK_TYPE	7
+
+/* can't use BPF_CALL_REL, jit_subprogs adjusts IMM & OFF
+ * fields for pseudo calls
+ */
+#define PSEUDO_CALL_INSN() \
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, BPF_PSEUDO_CALL, \
+		     INSN_OFF_MASK, INSN_IMM_MASK)
+
+/* can't use BPF_FUNC_loop constant,
+ * do_misc_fixups adjusts the IMM field
+ */
+#define HELPER_CALL_INSN() \
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, INSN_OFF_MASK, INSN_IMM_MASK)
+
+{
+	"inline simple bpf_loop call",
+	.insns = {
+	/* main */
+	/* force verifier state branching to verify logic on first and
+	 * subsequent bpf_loop insn processing steps
+	 */
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_jiffies64),
+	BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 777, 2),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 1),
+	BPF_JMP_IMM(BPF_JA, 0, 0, 1),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 2),
+
+	BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 6),
+	BPF_RAW_INSN(0, 0, 0, 0, 0),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_3, 0),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_4, 0),
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_loop),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 0),
+	BPF_EXIT_INSN(),
+	/* callback */
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 1),
+	BPF_EXIT_INSN(),
+	},
+	.expected_insns = { PSEUDO_CALL_INSN() },
+	.unexpected_insns = { HELPER_CALL_INSN() },
+	.prog_type = BPF_PROG_TYPE_TRACEPOINT,
+	.result = ACCEPT,
+	.runs = 0,
+	.func_info = { { 0, MAIN_TYPE }, { 12, CALLBACK_TYPE } },
+	.func_info_cnt = 2,
+	BTF_TYPES
+},
+{
+	"don't inline bpf_loop call, flags non-zero",
+	.insns = {
+	/* main */
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 1),
+	BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 6),
+	BPF_RAW_INSN(0, 0, 0, 0, 0),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_3, 0),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_4, 1),
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_loop),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 0),
+	BPF_EXIT_INSN(),
+	/* callback */
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 1),
+	BPF_EXIT_INSN(),
+	},
+	.expected_insns = { HELPER_CALL_INSN() },
+	.unexpected_insns = { PSEUDO_CALL_INSN() },
+	.prog_type = BPF_PROG_TYPE_TRACEPOINT,
+	.result = ACCEPT,
+	.runs = 0,
+	.func_info = { { 0, MAIN_TYPE }, { 8, CALLBACK_TYPE } },
+	.func_info_cnt = 2,
+	BTF_TYPES
+},
+{
+	"don't inline bpf_loop call, callback non-constant",
+	.insns = {
+	/* main */
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_jiffies64),
+	BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 777, 4), /* pick a random callback */
+
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 1),
+	BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 10),
+	BPF_RAW_INSN(0, 0, 0, 0, 0),
+	BPF_JMP_IMM(BPF_JA, 0, 0, 3),
+
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 1),
+	BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 8),
+	BPF_RAW_INSN(0, 0, 0, 0, 0),
+
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_3, 0),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_4, 0),
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_loop),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 0),
+	BPF_EXIT_INSN(),
+	/* callback */
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 1),
+	BPF_EXIT_INSN(),
+	/* callback #2 */
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 1),
+	BPF_EXIT_INSN(),
+	},
+	.expected_insns = { HELPER_CALL_INSN() },
+	.unexpected_insns = { PSEUDO_CALL_INSN() },
+	.prog_type = BPF_PROG_TYPE_TRACEPOINT,
+	.result = ACCEPT,
+	.runs = 0,
+	.func_info = {
+		{ 0, MAIN_TYPE },
+		{ 14, CALLBACK_TYPE },
+		{ 16, CALLBACK_TYPE }
+	},
+	.func_info_cnt = 3,
+	BTF_TYPES
+},
+{
+	"bpf_loop_inline and a dead func",
+	.insns = {
+	/* main */
+
+	/* A reference to callback #1 to make verifier count it as a func.
+	 * This reference is overwritten below and callback #1 is dead.
+	 */
+	BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 9),
+	BPF_RAW_INSN(0, 0, 0, 0, 0),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 1),
+	BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 8),
+	BPF_RAW_INSN(0, 0, 0, 0, 0),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_3, 0),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_4, 0),
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_loop),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 0),
+	BPF_EXIT_INSN(),
+	/* callback */
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 1),
+	BPF_EXIT_INSN(),
+	/* callback #2 */
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 1),
+	BPF_EXIT_INSN(),
+	},
+	.expected_insns = { PSEUDO_CALL_INSN() },
+	.unexpected_insns = { HELPER_CALL_INSN() },
+	.prog_type = BPF_PROG_TYPE_TRACEPOINT,
+	.result = ACCEPT,
+	.runs = 0,
+	.func_info = {
+		{ 0, MAIN_TYPE },
+		{ 10, CALLBACK_TYPE },
+		{ 12, CALLBACK_TYPE }
+	},
+	.func_info_cnt = 3,
+	BTF_TYPES
+},
+{
+	"bpf_loop_inline stack locations for loop vars",
+	.insns = {
+	/* main */
+	BPF_ST_MEM(BPF_DW, BPF_REG_10, -16, 0x77),
+	/* bpf_loop call #1 */
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 1),
+	BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 22),
+	BPF_RAW_INSN(0, 0, 0, 0, 0),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_3, 0),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_4, 0),
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_loop),
+	/* bpf_loop call #2 */
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 2),
+	BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 16),
+	BPF_RAW_INSN(0, 0, 0, 0, 0),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_3, 0),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_4, 0),
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_loop),
+	/* call func and exit */
+	BPF_CALL_REL(2),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 0),
+	BPF_EXIT_INSN(),
+	/* func */
+	BPF_ST_MEM(BPF_DW, BPF_REG_10, -32, 0x55),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 2),
+	BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 6),
+	BPF_RAW_INSN(0, 0, 0, 0, 0),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_3, 0),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_4, 0),
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_loop),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 0),
+	BPF_EXIT_INSN(),
+	/* callback */
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 1),
+	BPF_EXIT_INSN(),
+	},
+	.expected_insns = {
+	BPF_ST_MEM(BPF_DW, BPF_REG_10, -16, 0x77),
+	SKIP_INSNS(),
+	BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_6, -40),
+	BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_7, -32),
+	BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_8, -24),
+	SKIP_INSNS(),
+	/* offsets are the same as in the first call */
+	BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_6, -40),
+	BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_7, -32),
+	BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_8, -24),
+	SKIP_INSNS(),
+	BPF_ST_MEM(BPF_DW, BPF_REG_10, -32, 0x55),
+	SKIP_INSNS(),
+	/* offsets differ from main because of different offset
+	 * in BPF_ST_MEM instruction
+	 */
+	BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_6, -56),
+	BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_7, -48),
+	BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_8, -40),
+	},
+	.unexpected_insns = { HELPER_CALL_INSN() },
+	.prog_type = BPF_PROG_TYPE_TRACEPOINT,
+	.result = ACCEPT,
+	.func_info = {
+		{ 0, MAIN_TYPE },
+		{ 16, MAIN_TYPE },
+		{ 25, CALLBACK_TYPE },
+	},
+	.func_info_cnt = 3,
+	BTF_TYPES
+},
+
+#undef HELPER_CALL_INSN
+#undef PSEUDO_CALL_INSN
+#undef CALLBACK_TYPE
+#undef MAIN_TYPE
+#undef BTF_TYPES
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH bpf-next v3 5/5] selftests/bpf: BPF test_prog selftests for bpf_loop inlining
  2022-06-03 14:10 [PATCH bpf-next v3 0/5] bpf_loop inlining Eduard Zingerman
                   ` (3 preceding siblings ...)
  2022-06-03 14:10 ` [PATCH bpf-next v3 4/5] selftests/bpf: BPF test_verifier selftests for bpf_loop inlining Eduard Zingerman
@ 2022-06-03 14:10 ` Eduard Zingerman
  2022-06-03 22:52   ` Song Liu
  4 siblings, 1 reply; 20+ messages in thread
From: Eduard Zingerman @ 2022-06-03 14:10 UTC (permalink / raw)
  To: bpf, ast, andrii, daniel, kernel-team, song; +Cc: eddyz87

Two new BPF test programs for the test_prog selftests checking bpf_loop
behavior. Both are corner cases for the bpf_loop inlining transformation:
 - check that bpf_loop behaves correctly when the callback function is
   not a compile-time constant
 - check that local function variables are not affected by allocating
   additional stack storage for registers spilled by loop inlining

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 .../selftests/bpf/prog_tests/bpf_loop.c       |  61 ++++++++++
 tools/testing/selftests/bpf/progs/bpf_loop.c  | 114 ++++++++++++++++++
 2 files changed, 175 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_loop.c b/tools/testing/selftests/bpf/prog_tests/bpf_loop.c
index 380d7a2072e3..82be7c37e39f 100644
--- a/tools/testing/selftests/bpf/prog_tests/bpf_loop.c
+++ b/tools/testing/selftests/bpf/prog_tests/bpf_loop.c
@@ -120,6 +120,63 @@ static void check_nested_calls(struct bpf_loop *skel)
 	bpf_link__destroy(link);
 }
 
+static void check_non_constant_callback(struct bpf_loop *skel)
+{
+	struct bpf_link *link =
+		bpf_program__attach(skel->progs.prog_non_constant_callback);
+
+	if (!ASSERT_OK_PTR(link, "link"))
+		return;
+
+	skel->bss->callback_selector = 0x0F;
+	usleep(1);
+	ASSERT_EQ(skel->bss->g_output, 0x0F, "g_output #1");
+
+	skel->bss->callback_selector = 0xF0;
+	usleep(1);
+	ASSERT_EQ(skel->bss->g_output, 0xF0, "g_output #2");
+
+	bpf_link__destroy(link);
+}
+
+static void check_stack(struct bpf_loop *skel)
+{
+	const int max_key = 12;
+	struct bpf_link *link = bpf_program__attach(skel->progs.stack_check);
+	int map_fd;
+
+	if (!ASSERT_OK_PTR(link, "link"))
+		return;
+
+	map_fd = bpf_map__fd(skel->maps.map1);
+
+	if (!ASSERT_GE(map_fd, 0, "bpf_map__fd"))
+		goto out;
+
+	for (int key = 1; key <= max_key; ++key) {
+		int val = key;
+		int err = bpf_map_update_elem(map_fd, &key, &val, BPF_NOEXIST);
+
+		if (!ASSERT_OK(err, "bpf_map_update_elem"))
+			goto out;
+	}
+
+	usleep(1);
+
+	for (int key = 1; key <= max_key; ++key) {
+		int val;
+		int err = bpf_map_lookup_elem(map_fd, &key, &val);
+
+		if (!ASSERT_OK(err, "bpf_map_lookup_elem"))
+			goto out;
+		if (!ASSERT_EQ(val, key + 1, "bad value in the map"))
+			goto out;
+	}
+
+out:
+	bpf_link__destroy(link);
+}
+
 void test_bpf_loop(void)
 {
 	struct bpf_loop *skel;
@@ -140,6 +197,10 @@ void test_bpf_loop(void)
 		check_invalid_flags(skel);
 	if (test__start_subtest("check_nested_calls"))
 		check_nested_calls(skel);
+	if (test__start_subtest("check_non_constant_callback"))
+		check_non_constant_callback(skel);
+	if (test__start_subtest("check_stack"))
+		check_stack(skel);
 
 	bpf_loop__destroy(skel);
 }
diff --git a/tools/testing/selftests/bpf/progs/bpf_loop.c b/tools/testing/selftests/bpf/progs/bpf_loop.c
index e08565282759..de1fc82d2710 100644
--- a/tools/testing/selftests/bpf/progs/bpf_loop.c
+++ b/tools/testing/selftests/bpf/progs/bpf_loop.c
@@ -11,11 +11,19 @@ struct callback_ctx {
 	int output;
 };
 
+struct {
+	__uint(type, BPF_MAP_TYPE_HASH);
+	__uint(max_entries, 32);
+	__type(key, int);
+	__type(value, int);
+} map1 SEC(".maps");
+
 /* These should be set by the user program */
 u32 nested_callback_nr_loops;
 u32 stop_index = -1;
 u32 nr_loops;
 int pid;
+int callback_selector;
 
 /* Making these global variables so that the userspace program
  * can verify the output through the skeleton
@@ -111,3 +119,109 @@ int prog_nested_calls(void *ctx)
 
 	return 0;
 }
+
+static int callback_set_f0(int i, void *ctx)
+{
+	g_output = 0xF0;
+	return 0;
+}
+
+static int callback_set_0f(int i, void *ctx)
+{
+	g_output = 0x0F;
+	return 0;
+}
+
+/*
+ * non-constant callback is a corner case for bpf_loop inline logic
+ */
+SEC("fentry/" SYS_PREFIX "sys_nanosleep")
+int prog_non_constant_callback(void *ctx)
+{
+	struct callback_ctx data = {};
+
+	if (bpf_get_current_pid_tgid() >> 32 != pid)
+		return 0;
+
+	int (*callback)(int i, void *ctx);
+
+	g_output = 0;
+
+	if (callback_selector == 0x0F)
+		callback = callback_set_0f;
+	else
+		callback = callback_set_f0;
+
+	bpf_loop(1, callback, NULL, 0);
+
+	return 0;
+}
+
+static int stack_check_inner_callback(void *ctx)
+{
+	return 0;
+}
+
+static int map1_lookup_elem(int key)
+{
+	int *val = bpf_map_lookup_elem(&map1, &key);
+
+	return val ? *val : -1;
+}
+
+static void map1_update_elem(int key, int val)
+{
+	bpf_map_update_elem(&map1, &key, &val, BPF_ANY);
+}
+
+static int stack_check_outer_callback(void *ctx)
+{
+	int a = map1_lookup_elem(1);
+	int b = map1_lookup_elem(2);
+	int c = map1_lookup_elem(3);
+	int d = map1_lookup_elem(4);
+	int e = map1_lookup_elem(5);
+	int f = map1_lookup_elem(6);
+
+	bpf_loop(1, stack_check_inner_callback, NULL, 0);
+
+	map1_update_elem(1, a + 1);
+	map1_update_elem(2, b + 1);
+	map1_update_elem(3, c + 1);
+	map1_update_elem(4, d + 1);
+	map1_update_elem(5, e + 1);
+	map1_update_elem(6, f + 1);
+
+	return 0;
+}
+
+/* Some of the local variables in stack_check and
+ * stack_check_outer_callback would be allocated on stack by
+ * compiler. This test should verify that stack content for these
+ * variables is preserved between calls to bpf_loop (might be an issue
+ * if loop inlining allocates stack slots incorrectly).
+ */
+SEC("fentry/" SYS_PREFIX "sys_nanosleep")
+int stack_check(void *ctx)
+{
+	if (bpf_get_current_pid_tgid() >> 32 != pid)
+		return 0;
+
+	int a = map1_lookup_elem(7);
+	int b = map1_lookup_elem(8);
+	int c = map1_lookup_elem(9);
+	int d = map1_lookup_elem(10);
+	int e = map1_lookup_elem(11);
+	int f = map1_lookup_elem(12);
+
+	bpf_loop(1, stack_check_outer_callback, NULL, 0);
+
+	map1_update_elem(7,  a + 1);
+	map1_update_elem(8, b + 1);
+	map1_update_elem(9, c + 1);
+	map1_update_elem(10, d + 1);
+	map1_update_elem(11, e + 1);
+	map1_update_elem(12, f + 1);
+
+	return 0;
+}
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next v3 1/5] selftests/bpf: specify expected instructions in test_verifier tests
  2022-06-03 14:10 ` [PATCH bpf-next v3 1/5] selftests/bpf: specify expected instructions in test_verifier tests Eduard Zingerman
@ 2022-06-03 21:31   ` Song Liu
  2022-06-03 22:08     ` Eduard Zingerman
  0 siblings, 1 reply; 20+ messages in thread
From: Song Liu @ 2022-06-03 21:31 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Kernel Team

On Fri, Jun 3, 2022 at 7:11 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
[...]

> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> ---
>  tools/testing/selftests/bpf/test_verifier.c | 222 ++++++++++++++++++++
>  1 file changed, 222 insertions(+)
>
> diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
> index 372579c9f45e..373f7661f4d0 100644
> --- a/tools/testing/selftests/bpf/test_verifier.c
> +++ b/tools/testing/selftests/bpf/test_verifier.c
> @@ -51,6 +51,8 @@
>  #endif
>
>  #define MAX_INSNS      BPF_MAXINSNS
> +#define MAX_EXPECTED_INSNS     32
> +#define MAX_UNEXPECTED_INSNS   32
>  #define MAX_TEST_INSNS 1000000
>  #define MAX_FIXUPS     8
>  #define MAX_NR_MAPS    23
> @@ -58,6 +60,10 @@
>  #define POINTER_VALUE  0xcafe4all
>  #define TEST_DATA_LEN  64
>
> +#define INSN_OFF_MASK  ((s16)0xFFFF)
> +#define INSN_IMM_MASK  ((s32)0xFFFFFFFF)

Shall we use __s16 and __s32 to match struct bpf_insn exactly.

> +#define SKIP_INSNS()   BPF_RAW_INSN(0xde, 0xa, 0xd, 0xbeef, 0xdeadbeef)
> +
>  #define F_NEEDS_EFFICIENT_UNALIGNED_ACCESS     (1 << 0)
>  #define F_LOAD_WITH_STRICT_ALIGNMENT           (1 << 1)
>
> @@ -79,6 +85,19 @@ struct bpf_test {
>         const char *descr;
>         struct bpf_insn insns[MAX_INSNS];
>         struct bpf_insn *fill_insns;
> +       /* If specified, test engine looks for this sequence of
> +        * instructions in the BPF program after loading. Allows to
> +        * test rewrites applied by verifier.  Use values
> +        * INSN_OFF_MASK and INSN_IMM_MASK to mask `off` and `imm`
> +        * fields if content does not matter.  The test case fails if
> +        * specified instructions are not found.
> +        *
> +        * The sequence could be split into sub-sequences by adding
> +        * SKIP_INSNS instruction at the end of each sub-sequence. In
> +        * such case sub-sequences are searched for one after another.
> +        */
> +       struct bpf_insn expected_insns[MAX_EXPECTED_INSNS];
> +       struct bpf_insn unexpected_insns[MAX_UNEXPECTED_INSNS];
>         int fixup_map_hash_8b[MAX_FIXUPS];
>         int fixup_map_hash_48b[MAX_FIXUPS];
>         int fixup_map_hash_16b[MAX_FIXUPS];
> @@ -1126,6 +1145,206 @@ static bool cmp_str_seq(const char *log, const char *exp)
>         return true;
>  }
>
> +static int get_xlated_program(int fd_prog, struct bpf_insn **buf, int *cnt)
> +{
> +       struct bpf_prog_info info = {};
> +       __u32 info_len = sizeof(info);
> +       __u32 xlated_prog_len;
> +       __u32 buf_elt_size = sizeof(**buf);

I guess elt means "element"? I would recommend using sizeof(struct bpf_insn)
directly.

[...]

> +static int null_terminated_insn_len(struct bpf_insn *seq, int max_len)
> +{
> +       for (int i = 0; i < max_len; ++i) {

Sorry for missing this in v1. We should really pull variable
declaration out, like

int i;

for (i = 0; ...)

> +               if (is_null_insn(&seq[i]))
> +                       return i;
> +       }
> +       return max_len;
> +}
> +
[...]

> +
> +static int find_insn_subseq(struct bpf_insn *seq, struct bpf_insn *subseq,
> +                           int seq_len, int subseq_len)
> +{
> +       if (subseq_len > seq_len)
> +               return -1;
> +
> +       for (int i = 0; i < seq_len - subseq_len + 1; ++i) {
> +               bool found = true;
> +
> +               for (int j = 0; j < subseq_len; ++j) {
> +                       if (!compare_masked_insn(&seq[i + j], &subseq[j])) {
> +                               found = false;
> +                               break;
> +                       }
> +               }
> +               if (found)
> +                       return i;
> +       }
> +
> +       return -1;
> +}
> +

[...]

> +
> +static bool check_xlated_program(struct bpf_test *test, int fd_prog)
> +{
> +       struct bpf_insn *buf;
> +       int cnt;
> +       bool result = true;
> +       bool check_expected = !is_null_insn(test->expected_insns);
> +       bool check_unexpected = !is_null_insn(test->unexpected_insns);
> +
> +       if (!check_expected && !check_unexpected)
> +               goto out;
> +
> +       if (get_xlated_program(fd_prog, &buf, &cnt)) {
> +               printf("FAIL: can't get xlated program\n");
> +               result = false;
> +               goto out;
> +       }
> +
> +       if (check_expected &&
> +           !find_all_insn_subseqs(buf, test->expected_insns,
> +                                  cnt, MAX_EXPECTED_INSNS)) {
> +               printf("FAIL: can't find expected subsequence of instructions\n");
> +               result = false;
> +               if (verbose) {
> +                       printf("Program:\n");
> +                       print_insn(buf, cnt);
> +                       printf("Expected subsequence:\n");
> +                       print_insn(test->expected_insns, MAX_EXPECTED_INSNS);
> +               }
> +       }
> +
> +       if (check_unexpected &&
> +           find_all_insn_subseqs(buf, test->unexpected_insns,
> +                                 cnt, MAX_UNEXPECTED_INSNS)) {

I wonder whether we want different logic for unexpected_insns. With multiple
sub-sequences, say seq-A and seq-B, it is more natural to reject any result
that contains either seq-A or seq-B. However, the current logic will reject
seq-A => seq-B, but will accept seq-B => seq-A. Does this make sense?

> +               printf("FAIL: found unexpected subsequence of instructions\n");
> +               result = false;
> +               if (verbose) {
> +                       printf("Program:\n");
> +                       print_insn(buf, cnt);
> +                       printf("Un-expected subsequence:\n");
> +                       print_insn(test->unexpected_insns, MAX_UNEXPECTED_INSNS);
> +               }
> +       }
> +
> +       free(buf);
> + out:
> +       return result;
> +}
> +
>  static void do_test_single(struct bpf_test *test, bool unpriv,
>                            int *passes, int *errors)
>  {
> @@ -1262,6 +1481,9 @@ static void do_test_single(struct bpf_test *test, bool unpriv,
>         if (verbose)
>                 printf(", verifier log:\n%s", bpf_vlog);
>
> +       if (!check_xlated_program(test, fd_prog))
> +               goto fail_log;
> +
>         run_errs = 0;
>         run_successes = 0;
>         if (!alignment_prevented_execution && fd_prog >= 0 && test->runs >= 0) {
> --
> 2.25.1
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next v3 2/5] selftests/bpf: allow BTF specs and func infos in test_verifier tests
  2022-06-03 14:10 ` [PATCH bpf-next v3 2/5] selftests/bpf: allow BTF specs and func infos " Eduard Zingerman
@ 2022-06-03 21:41   ` Song Liu
  0 siblings, 0 replies; 20+ messages in thread
From: Song Liu @ 2022-06-03 21:41 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Kernel Team

On Fri, Jun 3, 2022 at 7:11 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> The BTF and func_info specification for test_verifier tests follows
> the same notation as in prog_tests/btf.c tests. E.g.:
>
>   ...
>   .func_info = { { 0, 6 }, { 8, 7 } },
>   .func_info_cnt = 2,
>   .btf_strings = "\0int\0",
>   .btf_types = {
>     BTF_TYPE_INT_ENC(1, BTF_INT_SIGNED, 0, 32, 4),
>     BTF_PTR_ENC(1),
>   },
>   ...
>
> The BTF specification is loaded only when specified.
>
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>

Acked-by: Song Liu <songliubraving@fb.com>

[...]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next v3 1/5] selftests/bpf: specify expected instructions in test_verifier tests
  2022-06-03 21:31   ` Song Liu
@ 2022-06-03 22:08     ` Eduard Zingerman
  2022-06-03 23:01       ` Song Liu
  0 siblings, 1 reply; 20+ messages in thread
From: Eduard Zingerman @ 2022-06-03 22:08 UTC (permalink / raw)
  To: Song Liu
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Kernel Team

> On Fri, 2022-06-03 at 14:31 -0700, Song Liu wrote:
> > On Fri, Jun 3, 2022 at 7:11 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
> 
> > +#define INSN_OFF_MASK  ((s16)0xFFFF)
> > +#define INSN_IMM_MASK  ((s32)0xFFFFFFFF)
> 
Shall we use __s16 and __s32 to match struct bpf_insn exactly?

Will do.

[...]

> > +       __u32 buf_elt_size = sizeof(**buf);
> 
> I guess "elt" means "element"? I would recommend using sizeof(struct bpf_insn)
> directly.

Will do.

[...]

> > +static int null_terminated_insn_len(struct bpf_insn *seq, int max_len)
> > +{
> > +       for (int i = 0; i < max_len; ++i) {
> 
> Sorry for missing this in v1. We should really pull the variable
> declaration out, like
> 
> int i;
> 
> for (i = 0; ...)

Sorry, just to clarify, you want me to pull all loop counter
declarations to the top of the relevant functions, right? (Affecting 4
functions in this patch).

[...]

> > +static int find_insn_subseq(struct bpf_insn *seq, struct bpf_insn *subseq,
> > +       if (check_unexpected &&
> > +           find_all_insn_subseqs(buf, test->unexpected_insns,
> > +                                 cnt, MAX_UNEXPECTED_INSNS)) {
> 
> I wonder whether we want different logic for unexpected_insns. With multiple
> sub-sequences, say seq-A and seq-B, it is more natural to reject any result
> that contains either seq-A or seq-B. However, the current logic will only
> reject seq-A followed by seq-B, and will accept seq-B followed by seq-A. Does
> this make sense?

I have no strong opinion on this topic. In theory, different variants
might be useful in different cases.

In the test cases for bpf_loop inlining I only had to match a single
unexpected instruction, so I opted to use the same match function in both
the expected and unexpected cases to keep things simple.

One thought that I had was that struct bpf_test might be extended in
the future as follows:

#define MAX_PATTERNS 4
...
struct bpf_test {
	...
	struct bpf_insn unexpected_insns[MAX_PATTERNS][MAX_UNEXPECTED_INSNS];
	...
}

Each pattern would still follow the "seq-A => seq-B" logic internally, but
the patterns would be matched independently. So if the goal is to reject
either "seq-A" or "seq-B", these should be specified as separate patterns.
However, this seems to be overkill for the problem at hand.
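If that ever becomes necessary, the matching loop in check_xlated_program
could look roughly like this (purely hypothetical, not part of this series):

	for (i = 0; i < MAX_PATTERNS; ++i) {
		/* unexpected_insns[i] is the i-th independently matched pattern */
		struct bpf_insn *pattern = test->unexpected_insns[i];

		if (is_null_insn(&pattern[0]))
			break;
		if (find_all_insn_subseqs(buf, pattern, cnt,
					  MAX_UNEXPECTED_INSNS)) {
			printf("FAIL: found unexpected subsequence of instructions\n");
			result = false;
		}
	}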



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next v3 3/5] bpf: Inline calls to bpf_loop when callback is known
  2022-06-03 14:10 ` [PATCH bpf-next v3 3/5] bpf: Inline calls to bpf_loop when callback is known Eduard Zingerman
@ 2022-06-03 22:36   ` Song Liu
  2022-06-04  0:06   ` Joanne Koong
  2022-06-04 14:47   ` Alexei Starovoitov
  2 siblings, 0 replies; 20+ messages in thread
From: Song Liu @ 2022-06-03 22:36 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Kernel Team

On Fri, Jun 3, 2022 at 7:11 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>

[...]

>
> Measurements using `benchs/run_bench_bpf_loop.sh` inside QEMU / KVM on
> i7-4710HQ CPU show a drop in latency from 14 ns/op to 2 ns/op.
>
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> ---
[...]
>
> -/* maximum number of loops */
> -#define MAX_LOOPS      BIT(23)
> -
>  BPF_CALL_4(bpf_loop, u32, nr_loops, void *, callback_fn, void *, callback_ctx,
>            u64, flags)
>  {
> @@ -733,9 +731,12 @@ BPF_CALL_4(bpf_loop, u32, nr_loops, void *, callback_fn, void *, callback_ctx,
>         u64 ret;
>         u32 i;
>
> +       /* note: these safety checks are also verified when bpf_loop is inlined,
> +        * be careful to modify this code in sync

Let's call out inline_bpf_loop() here to be more clear.
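For example, something like:

	/* note: these safety checks are also verified when bpf_loop is
	 * inlined in verifier.c:inline_bpf_loop(), be careful to keep
	 * the two in sync
	 */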

[...]

>

Let's also call out here that this is the inlined version of bpf_iter.c:bpf_loop.

> +static struct bpf_prog *inline_bpf_loop(struct bpf_verifier_env *env,
> +                                       int position, u32 *cnt)
> +{
> +       struct bpf_insn_aux_data *aux = &env->insn_aux_data[position];
> +       s32 stack_base = aux->loop_inline_state.stack_base;
> +       s32 r6_offset = stack_base + 0 * BPF_REG_SIZE;
> +       s32 r7_offset = stack_base + 1 * BPF_REG_SIZE;
> +       s32 r8_offset = stack_base + 2 * BPF_REG_SIZE;
> +       int reg_loop_max = BPF_REG_6;
> +       int reg_loop_cnt = BPF_REG_7;
> +       int reg_loop_ctx = BPF_REG_8;
> +
> +       struct bpf_prog *new_prog;
> +       u32 callback_subprogno = aux->loop_inline_state.callback_subprogno;
> +       u32 callback_start;
> +       u32 call_insn_offset;
> +       s32 callback_offset;
> +       struct bpf_insn insn_buf[19];
> +       struct bpf_insn *next = insn_buf;
> +       struct bpf_insn *call, *jump_to_end, *loop_header;
> +       struct bpf_insn *jump_to_header, *loop_exit;
> +
> +       /* Return error and jump to the end of the patch if
> +        * expected number of iterations is too big.  This
> +        * repeats the check done in bpf_loop helper function,
> +        * be careful to modify this code in sync.
> +        */
> +       (*next++) = BPF_JMP_IMM(BPF_JLE, BPF_REG_1, BPF_MAX_LOOPS, 2);
> +       (*next++) = BPF_MOV32_IMM(BPF_REG_0, -E2BIG);
> +       jump_to_end = next;
> +       (*next++) = BPF_JMP_IMM(BPF_JA, 0, 0, 0 /* set below */);
> +       /* spill R6, R7, R8 to use these as loop vars */
> +       (*next++) = BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_6, r6_offset);
> +       (*next++) = BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_7, r7_offset);
> +       (*next++) = BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_8, r8_offset);
> +       /* initialize loop vars */
> +       (*next++) = BPF_MOV64_REG(reg_loop_max, BPF_REG_1);
> +       (*next++) = BPF_MOV32_IMM(reg_loop_cnt, 0);
> +       (*next++) = BPF_MOV64_REG(reg_loop_ctx, BPF_REG_3);
> +       /* loop header;
> +        * if reg_loop_cnt >= reg_loop_max skip the loop body
> +        */
> +       loop_header = next;
> +       (*next++) = BPF_JMP_REG(BPF_JGE, reg_loop_cnt, reg_loop_max,
> +                               0 /* set below */);
> +       /* callback call */
> +       (*next++) = BPF_MOV64_REG(BPF_REG_1, reg_loop_cnt);
> +       (*next++) = BPF_MOV64_REG(BPF_REG_2, reg_loop_ctx);
> +       call = next;
> +       (*next++) = BPF_CALL_REL(0 /* set below after patching */);
> +       /* increment loop counter */
> +       (*next++) = BPF_ALU64_IMM(BPF_ADD, reg_loop_cnt, 1);
> +       /* jump to loop header if callback returned 0 */
> +       jump_to_header = next;
> +       (*next++) = BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 0 /* set below */);
> +       /* return value of bpf_loop;
> +        * set R0 to the number of iterations
> +        */
> +       loop_exit = next;
> +       (*next++) = BPF_MOV64_REG(BPF_REG_0, reg_loop_cnt);
> +       /* restore original values of R6, R7, R8 */
> +       (*next++) = BPF_LDX_MEM(BPF_DW, BPF_REG_6, BPF_REG_10, r6_offset);
> +       (*next++) = BPF_LDX_MEM(BPF_DW, BPF_REG_7, BPF_REG_10, r7_offset);
> +       (*next++) = BPF_LDX_MEM(BPF_DW, BPF_REG_8, BPF_REG_10, r8_offset);
> +
> +       *cnt = next - insn_buf;
> +       if (*cnt > ARRAY_SIZE(insn_buf)) {
> +               WARN_ONCE(1, "BUG %s: 'next' exceeds bounds for 'insn_buf'\n",
> +                         __func__);
> +               return NULL;
> +       }
> +       jump_to_end->off = next - jump_to_end - 1;
> +       loop_header->off = loop_exit - loop_header - 1;
> +       jump_to_header->off = loop_header - jump_to_header - 1;

I know these changes were made based on my feedback on v1. But it seems
v1 is actually cleaner. How about we keep the v1 approach, but add comments
on the offsets that need to be redone when we make changes to insn_buf[]?
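Something along these lines, perhaps (only a sketch; the indices follow the
19-insn layout quoted above and only the three branch instructions are shown):

	/* jump_to_end is insn 2; it skips the remaining 16 insns of the patch */
	insn_buf[2] = BPF_JMP_IMM(BPF_JA, 0, 0, 16);
	/* loop_header is insn 9; +5 jumps forward to loop_exit at insn 15 */
	insn_buf[9] = BPF_JMP_REG(BPF_JGE, reg_loop_cnt, reg_loop_max, 5);
	/* jump_to_header is insn 14; -6 jumps back to loop_header at insn 9 */
	insn_buf[14] = BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, -6);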



> +
> +       new_prog = bpf_patch_insn_data(env, position, insn_buf, *cnt);
> +       if (!new_prog)
> +               return new_prog;
> +
> +       /* callback start is known only after patching */
> +       callback_start = env->subprog_info[callback_subprogno].start;
> +       call_insn_offset = position + (call - insn_buf);
> +       callback_offset = callback_start - call_insn_offset - 1;
> +       env->prog->insnsi[call_insn_offset].imm = callback_offset;
> +
> +       return new_prog;
> +}
> +
>  /* Do various post-verification rewrites in a single program pass.
>   * These rewrites simplify JIT and interpreter implementations.
>   */
> @@ -14258,6 +14432,18 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
>                         continue;
>                 }
>
> +               if (insn->imm == BPF_FUNC_loop &&
> +                   fit_for_bpf_loop_inline(&env->insn_aux_data[i + delta])) {
> +                       new_prog = inline_bpf_loop(env, i + delta, &cnt);
> +                       if (!new_prog)
> +                               return -ENOMEM;
> +
> +                       delta    += cnt - 1;
> +                       env->prog = prog = new_prog;
> +                       insn      = new_prog->insnsi + i + delta;
> +                       continue;
> +               }
> +
>  patch_call_imm:
>                 fn = env->ops->get_func_proto(insn->imm, env->prog);
>                 /* all functions that have prototype and verifier allowed
> @@ -15030,6 +15216,9 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr)
>         if (ret == 0)
>                 ret = check_max_stack_depth(env);
>
> +       if (ret == 0)
> +               adjust_stack_depth_for_loop_inlining(env);
> +
>         /* instruction rewrites happen after this point */
>         if (is_priv) {
>                 if (ret == 0)
> --
> 2.25.1
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next v3 4/5] selftests/bpf: BPF test_verifier selftests for bpf_loop inlining
  2022-06-03 14:10 ` [PATCH bpf-next v3 4/5] selftests/bpf: BPF test_verifier selftests for bpf_loop inlining Eduard Zingerman
@ 2022-06-03 22:38   ` Song Liu
  0 siblings, 0 replies; 20+ messages in thread
From: Song Liu @ 2022-06-03 22:38 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Kernel Team

On Fri, Jun 3, 2022 at 7:11 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> A number of test cases for BPF selftests test_verifier to check how
> bpf_loop inline transformation rewrites the BPF program. The following
> cases are covered:
>  - happy path
>  - no-rewrite when flags is non-zero
>  - no-rewrite when callback is non-constant
>  - subprogno in insn_aux is updated correctly when dead sub-programs
>    are removed
>  - check that correct stack offsets are assigned for spilling of R6-R8
>    registers
>
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>

Acked-by: Song Liu <songliubraving@fb.com>

> ---
>  .../selftests/bpf/verifier/bpf_loop_inline.c  | 244 ++++++++++++++++++
>  1 file changed, 244 insertions(+)
>  create mode 100644 tools/testing/selftests/bpf/verifier/bpf_loop_inline.c
>
> diff --git a/tools/testing/selftests/bpf/verifier/bpf_loop_inline.c b/tools/testing/selftests/bpf/verifier/bpf_loop_inline.c
> new file mode 100644
> index 000000000000..d1fbcfef69f2
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/verifier/bpf_loop_inline.c
> @@ -0,0 +1,244 @@
> +#define BTF_TYPES \
> +       .btf_strings = "\0int\0i\0ctx\0callback\0main\0", \
> +       .btf_types = { \
> +       /* 1: int   */ BTF_TYPE_INT_ENC(1, BTF_INT_SIGNED, 0, 32, 4), \
> +       /* 2: int*  */ BTF_PTR_ENC(1), \
> +       /* 3: void* */ BTF_PTR_ENC(0), \
> +       /* 4: int __(void*) */ BTF_FUNC_PROTO_ENC(1, 1), \
> +               BTF_FUNC_PROTO_ARG_ENC(7, 3), \
> +       /* 5: int __(int, int*) */ BTF_FUNC_PROTO_ENC(1, 2), \
> +               BTF_FUNC_PROTO_ARG_ENC(5, 1), \
> +               BTF_FUNC_PROTO_ARG_ENC(7, 2), \
> +       /* 6: main      */ BTF_FUNC_ENC(20, 4), \
> +       /* 7: callback  */ BTF_FUNC_ENC(11, 5), \
> +       BTF_END_RAW \
> +       }
> +
> +#define MAIN_TYPE      6
> +#define CALLBACK_TYPE  7
> +
> +/* can't use BPF_CALL_REL, jit_subprogs adjusts IMM & OFF
> + * fields for pseudo calls
> + */
> +#define PSEUDO_CALL_INSN() \
> +       BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, BPF_PSEUDO_CALL, \
> +                    INSN_OFF_MASK, INSN_IMM_MASK)
> +
> +/* can't use BPF_FUNC_loop constant,
> + * do_misc_fixups adjusts the IMM field
> + */
> +#define HELPER_CALL_INSN() \
> +       BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, INSN_OFF_MASK, INSN_IMM_MASK)
> +
> +{
> +       "inline simple bpf_loop call",
> +       .insns = {
> +       /* main */
> +       /* force verifier state branching to verify logic on first and
> +        * subsequent bpf_loop insn processing steps
> +        */
> +       BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_jiffies64),
> +       BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 777, 2),
> +       BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 1),
> +       BPF_JMP_IMM(BPF_JA, 0, 0, 1),
> +       BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 2),
> +
> +       BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 6),
> +       BPF_RAW_INSN(0, 0, 0, 0, 0),
> +       BPF_ALU64_IMM(BPF_MOV, BPF_REG_3, 0),
> +       BPF_ALU64_IMM(BPF_MOV, BPF_REG_4, 0),
> +       BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_loop),
> +       BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 0),
> +       BPF_EXIT_INSN(),
> +       /* callback */
> +       BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 1),
> +       BPF_EXIT_INSN(),
> +       },
> +       .expected_insns = { PSEUDO_CALL_INSN() },
> +       .unexpected_insns = { HELPER_CALL_INSN() },
> +       .prog_type = BPF_PROG_TYPE_TRACEPOINT,
> +       .result = ACCEPT,
> +       .runs = 0,
> +       .func_info = { { 0, MAIN_TYPE }, { 12, CALLBACK_TYPE } },
> +       .func_info_cnt = 2,
> +       BTF_TYPES
> +},
> +{
> +       "don't inline bpf_loop call, flags non-zero",
> +       .insns = {
> +       /* main */
> +       BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 1),
> +       BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 6),
> +       BPF_RAW_INSN(0, 0, 0, 0, 0),
> +       BPF_ALU64_IMM(BPF_MOV, BPF_REG_3, 0),
> +       BPF_ALU64_IMM(BPF_MOV, BPF_REG_4, 1),
> +       BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_loop),
> +       BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 0),
> +       BPF_EXIT_INSN(),
> +       /* callback */
> +       BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 1),
> +       BPF_EXIT_INSN(),
> +       },
> +       .expected_insns = { HELPER_CALL_INSN() },
> +       .unexpected_insns = { PSEUDO_CALL_INSN() },
> +       .prog_type = BPF_PROG_TYPE_TRACEPOINT,
> +       .result = ACCEPT,
> +       .runs = 0,
> +       .func_info = { { 0, MAIN_TYPE }, { 8, CALLBACK_TYPE } },
> +       .func_info_cnt = 2,
> +       BTF_TYPES
> +},
> +{
> +       "don't inline bpf_loop call, callback non-constant",
> +       .insns = {
> +       /* main */
> +       BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_jiffies64),
> +       BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 777, 4), /* pick a random callback */
> +
> +       BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 1),
> +       BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 10),
> +       BPF_RAW_INSN(0, 0, 0, 0, 0),
> +       BPF_JMP_IMM(BPF_JA, 0, 0, 3),
> +
> +       BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 1),
> +       BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 8),
> +       BPF_RAW_INSN(0, 0, 0, 0, 0),
> +
> +       BPF_ALU64_IMM(BPF_MOV, BPF_REG_3, 0),
> +       BPF_ALU64_IMM(BPF_MOV, BPF_REG_4, 0),
> +       BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_loop),
> +       BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 0),
> +       BPF_EXIT_INSN(),
> +       /* callback */
> +       BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 1),
> +       BPF_EXIT_INSN(),
> +       /* callback #2 */
> +       BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 1),
> +       BPF_EXIT_INSN(),
> +       },
> +       .expected_insns = { HELPER_CALL_INSN() },
> +       .unexpected_insns = { PSEUDO_CALL_INSN() },
> +       .prog_type = BPF_PROG_TYPE_TRACEPOINT,
> +       .result = ACCEPT,
> +       .runs = 0,
> +       .func_info = {
> +               { 0, MAIN_TYPE },
> +               { 14, CALLBACK_TYPE },
> +               { 16, CALLBACK_TYPE }
> +       },
> +       .func_info_cnt = 3,
> +       BTF_TYPES
> +},
> +{
> +       "bpf_loop_inline and a dead func",
> +       .insns = {
> +       /* main */
> +
> +       /* A reference to callback #1 to make verifier count it as a func.
> +        * This reference is overwritten below and callback #1 is dead.
> +        */
> +       BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 9),
> +       BPF_RAW_INSN(0, 0, 0, 0, 0),
> +       BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 1),
> +       BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 8),
> +       BPF_RAW_INSN(0, 0, 0, 0, 0),
> +       BPF_ALU64_IMM(BPF_MOV, BPF_REG_3, 0),
> +       BPF_ALU64_IMM(BPF_MOV, BPF_REG_4, 0),
> +       BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_loop),
> +       BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 0),
> +       BPF_EXIT_INSN(),
> +       /* callback */
> +       BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 1),
> +       BPF_EXIT_INSN(),
> +       /* callback #2 */
> +       BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 1),
> +       BPF_EXIT_INSN(),
> +       },
> +       .expected_insns = { PSEUDO_CALL_INSN() },
> +       .unexpected_insns = { HELPER_CALL_INSN() },
> +       .prog_type = BPF_PROG_TYPE_TRACEPOINT,
> +       .result = ACCEPT,
> +       .runs = 0,
> +       .func_info = {
> +               { 0, MAIN_TYPE },
> +               { 10, CALLBACK_TYPE },
> +               { 12, CALLBACK_TYPE }
> +       },
> +       .func_info_cnt = 3,
> +       BTF_TYPES
> +},
> +{
> +       "bpf_loop_inline stack locations for loop vars",
> +       .insns = {
> +       /* main */
> +       BPF_ST_MEM(BPF_DW, BPF_REG_10, -16, 0x77),
> +       /* bpf_loop call #1 */
> +       BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 1),
> +       BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 22),
> +       BPF_RAW_INSN(0, 0, 0, 0, 0),
> +       BPF_ALU64_IMM(BPF_MOV, BPF_REG_3, 0),
> +       BPF_ALU64_IMM(BPF_MOV, BPF_REG_4, 0),
> +       BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_loop),
> +       /* bpf_loop call #2 */
> +       BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 2),
> +       BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 16),
> +       BPF_RAW_INSN(0, 0, 0, 0, 0),
> +       BPF_ALU64_IMM(BPF_MOV, BPF_REG_3, 0),
> +       BPF_ALU64_IMM(BPF_MOV, BPF_REG_4, 0),
> +       BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_loop),
> +       /* call func and exit */
> +       BPF_CALL_REL(2),
> +       BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 0),
> +       BPF_EXIT_INSN(),
> +       /* func */
> +       BPF_ST_MEM(BPF_DW, BPF_REG_10, -32, 0x55),
> +       BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 2),
> +       BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 6),
> +       BPF_RAW_INSN(0, 0, 0, 0, 0),
> +       BPF_ALU64_IMM(BPF_MOV, BPF_REG_3, 0),
> +       BPF_ALU64_IMM(BPF_MOV, BPF_REG_4, 0),
> +       BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_loop),
> +       BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 0),
> +       BPF_EXIT_INSN(),
> +       /* callback */
> +       BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 1),
> +       BPF_EXIT_INSN(),
> +       },
> +       .expected_insns = {
> +       BPF_ST_MEM(BPF_DW, BPF_REG_10, -16, 0x77),
> +       SKIP_INSNS(),
> +       BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_6, -40),
> +       BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_7, -32),
> +       BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_8, -24),
> +       SKIP_INSNS(),
> +       /* offsets are the same as in the first call */
> +       BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_6, -40),
> +       BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_7, -32),
> +       BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_8, -24),
> +       SKIP_INSNS(),
> +       BPF_ST_MEM(BPF_DW, BPF_REG_10, -32, 0x55),
> +       SKIP_INSNS(),
> +       /* offsets differ from main because of different offset
> +        * in BPF_ST_MEM instruction
> +        */
> +       BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_6, -56),
> +       BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_7, -48),
> +       BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_8, -40),
> +       },
> +       .unexpected_insns = { HELPER_CALL_INSN() },
> +       .prog_type = BPF_PROG_TYPE_TRACEPOINT,
> +       .result = ACCEPT,
> +       .func_info = {
> +               { 0, MAIN_TYPE },
> +               { 16, MAIN_TYPE },
> +               { 25, CALLBACK_TYPE },
> +       },
> +       .func_info_cnt = 3,
> +       BTF_TYPES
> +},
> +
> +#undef HELPER_CALL_INSN
> +#undef PSEUDO_CALL_INSN
> +#undef CALLBACK_TYPE
> +#undef MAIN_TYPE
> +#undef BTF_TYPES
> --
> 2.25.1
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next v3 5/5] selftests/bpf: BPF test_prog selftests for bpf_loop inlining
  2022-06-03 14:10 ` [PATCH bpf-next v3 5/5] selftests/bpf: BPF test_prog " Eduard Zingerman
@ 2022-06-03 22:52   ` Song Liu
  0 siblings, 0 replies; 20+ messages in thread
From: Song Liu @ 2022-06-03 22:52 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Kernel Team

On Fri, Jun 3, 2022 at 7:11 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> Two new test BPF programs for test_prog selftests checking bpf_loop
> behavior. Both are corner cases for the bpf_loop inlining transformation:
>  - check that bpf_loop behaves correctly when callback function is not
>    a compile time constant
>  - check that local function variables are not affected by allocating
>    additional stack storage for registers spilled by loop inlining
>
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>

Acked-by: Song Liu <songliubraving@fb.com>

With some nitpicks below.

[...]

> +static void check_stack(struct bpf_loop *skel)
> +{
> +       const int max_key = 12;
> +       struct bpf_link *link = bpf_program__attach(skel->progs.stack_check);
> +       int map_fd;
> +
> +       if (!ASSERT_OK_PTR(link, "link"))
> +               return;
> +
> +       map_fd = bpf_map__fd(skel->maps.map1);
> +
> +       if (!ASSERT_GE(map_fd, 0, "bpf_map__fd"))
> +               goto out;
> +
> +       for (int key = 1; key <= max_key; ++key) {

Let's move the definition of key to the beginning of the function.

> +               int val = key;
> +               int err = bpf_map_update_elem(map_fd, &key, &val, BPF_NOEXIST);
> +
> +               if (!ASSERT_OK(err, "bpf_map_update_elem"))
> +                       goto out;
> +       }
> +
> +       usleep(1);
> +
> +       for (int key = 1; key <= max_key; ++key) {
ditto.

> +               int val;
> +               int err = bpf_map_lookup_elem(map_fd, &key, &val);
> +
> +               if (!ASSERT_OK(err, "bpf_map_lookup_elem"))
> +                       goto out;
> +               if (!ASSERT_EQ(val, key + 1, "bad value in the map"))
> +                       goto out;
> +       }
> +
> +out:
> +       bpf_link__destroy(link);
> +}
> +

[...]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next v3 1/5] selftests/bpf: specify expected instructions in test_verifier tests
  2022-06-03 22:08     ` Eduard Zingerman
@ 2022-06-03 23:01       ` Song Liu
  2022-06-04 12:53         ` Eduard Zingerman
  0 siblings, 1 reply; 20+ messages in thread
From: Song Liu @ 2022-06-03 23:01 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Kernel Team

On Fri, Jun 3, 2022 at 3:08 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
[...]

> >
> > for (int i = 0; ...)
>
> Sorry, just to clarify, you want me to pull all loop counter
> declarations to the top of the relevant functions, right? (Affecting 4
> functions in this patch).

Right. There are a few more in later patches.

>
> [...]
>
> > > +static int find_insn_subseq(struct bpf_insn *seq, struct bpf_insn *subseq,
> > > +       if (check_unexpected &&
> > > +           find_all_insn_subseqs(buf, test->unexpected_insns,
> > > +                                 cnt, MAX_UNEXPECTED_INSNS)) {
> >
> > I wonder whether we want different logic for unexpected_insns. With multiple
> > sub sequences, say seq-A and seq-B, it is more natural to reject any results
> > with either seq-A or seq-B. However, current logic will reject seq-A => seq-B,
> > but will accept seq-B => seq-A. Does this make sense?
>
> Have no strong opinion on this topic. In theory different variants
> might be useful in different cases.
>
> In the test cases for bpf_loop inlining I only had to match a single
> unexpected instruction, so I opted to use same match function in both
> expected and unexpected cases to keep things simple.

I think we can wait until we have an actual use case for multiple seq.
For now, let's keep current logic and document this clearly in struct
bpf_test.

Thanks,
Song
[...]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next v3 3/5] bpf: Inline calls to bpf_loop when callback is known
  2022-06-03 14:10 ` [PATCH bpf-next v3 3/5] bpf: Inline calls to bpf_loop when callback is known Eduard Zingerman
  2022-06-03 22:36   ` Song Liu
@ 2022-06-04  0:06   ` Joanne Koong
  2022-06-04 12:51     ` Eduard Zingerman
  2022-06-04 14:47   ` Alexei Starovoitov
  2 siblings, 1 reply; 20+ messages in thread
From: Joanne Koong @ 2022-06-04  0:06 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	kernel-team, song

On Fri, Jun 3, 2022 at 11:51 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> Calls to `bpf_loop` are replaced with direct loops to avoid
> indirection. E.g. the following:
>
>   bpf_loop(10, foo, NULL, 0);
>
> Is replaced by equivalent of the following:
>
>   for (int i = 0; i < 10; ++i)
>     foo(i, NULL);
>
> This transformation could be applied when:
> - callback is known and does not change during program execution;
> - flags passed to `bpf_loop` are always zero.
>
> Inlining logic works as follows:
>
> - During execution simulation function `update_loop_inline_state`
>   tracks the following information for each `bpf_loop` call
>   instruction:
>   - is callback known and constant?
>   - are flags constant and zero?
> - Function `adjust_stack_depth_for_loop_inlining` increases stack
>   depth for functions where `bpf_loop` calls could be inlined. This is
>   needed to spill registers R6, R7 and R8. These registers are used as
>   loop counter, loop maximal bound and callback context parameter;
> - Function `inline_bpf_loop` called from `do_misc_fixups` replaces
>   `bpf_loop` calls fit for inlining with corresponding loop
>   instructions.
>
> Measurements using `benchs/run_bench_bpf_loop.sh` inside QEMU / KVM on
> i7-4710HQ CPU show a drop in latency from 14 ns/op to 2 ns/op.
>
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> ---
>  include/linux/bpf_verifier.h |  16 +++
>  kernel/bpf/bpf_iter.c        |   9 +-
>  kernel/bpf/verifier.c        | 199 ++++++++++++++++++++++++++++++++++-
>  3 files changed, 215 insertions(+), 9 deletions(-)
>
> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> index e8439f6cbe57..80279616a76b 100644
> --- a/include/linux/bpf_verifier.h
> +++ b/include/linux/bpf_verifier.h
> @@ -20,6 +20,8 @@
>  #define BPF_MAX_VAR_SIZ        (1 << 29)
>  /* size of type_str_buf in bpf_verifier. */
>  #define TYPE_STR_BUF_LEN 64
> +/* Maximum number of loops for bpf_loop */
> +#define BPF_MAX_LOOPS  BIT(23)

Should this be moved to include/linux/bpf.h since
kernel/bpf/bpf_iter.c also uses this? Then we don't need the "#include
<linux/bpf_verifier.h>" in bpf_iter.c

>
>  /* Liveness marks, used for registers and spilled-regs (in stack slots).
>   * Read marks propagate upwards until they find a write mark; they record that
> @@ -344,6 +346,16 @@ struct bpf_verifier_state_list {
>         int miss_cnt, hit_cnt;
>  };
>
> +struct bpf_loop_inline_state {
> +       u32 initialized:1; /* set to true upon first entry */
> +       u32 callback_is_constant:1; /* true if callback function
> +                                    * is the same at each call
> +                                    */
> +       u32 flags_is_zero:1; /* true if flags register is zero at each call */

I think the more straightforward

bool initialized;
bool callback_is_constant;
bool flags_is_zero;

ends up being the same size as using bit fields.

If your intention was to use bit fields to try to minimize the size of
the struct, I think we could do something like:

bool initialized:1;
bool callback_is_constant:1;
bool flags_is_zero:1;
 /* u16 since bpf_subprog_info.stack_depth is u16; we take the
negative of it whenever we use it since the stack grows downwards */
u16 stack_depth;
u32 callback_subprogno;

to make the struct more compact.

Also, as a side-note, I think if we have an "is_valid" field, then we
don't need both the "callback_is_constant" and "flags_is_zero" fields.
If the flags is not zero or the callback subprogno is not the same on
each invocation of the instruction, then we could represent that by
just setting is_valid to false.

> +       u32 callback_subprogno; /* valid when callback_is_constant == 1 */
> +       s32 stack_base; /* stack offset for loop vars */
> +};
> +
>  /* Possible states for alu_state member. */
>  #define BPF_ALU_SANITIZE_SRC           (1U << 0)
>  #define BPF_ALU_SANITIZE_DST           (1U << 1)
> @@ -380,6 +392,10 @@ struct bpf_insn_aux_data {
>         bool sanitize_stack_spill; /* subject to Spectre v4 sanitation */
>         bool zext_dst; /* this insn zero extends dst reg */
>         u8 alu_state; /* used in combination with alu_limit */
> +       /* if instruction is a call to bpf_loop this field tracks
> +        * the state of the relevant registers to take decision about inlining
> +        */
> +       struct bpf_loop_inline_state loop_inline_state;

Is placing this inside the union in "struct bpf_insn_aux_data" an option?

>
>         /* below fields are initialized once */
>         unsigned int orig_idx; /* original instruction index */
> diff --git a/kernel/bpf/bpf_iter.c b/kernel/bpf/bpf_iter.c
> index d5d96ceca105..cdb898fce118 100644
> --- a/kernel/bpf/bpf_iter.c
> +++ b/kernel/bpf/bpf_iter.c
> @@ -6,6 +6,7 @@
>  #include <linux/filter.h>
>  #include <linux/bpf.h>
>  #include <linux/rcupdate_trace.h>
> +#include <linux/bpf_verifier.h>
>
>  struct bpf_iter_target_info {
>         struct list_head list;
> @@ -723,9 +724,6 @@ const struct bpf_func_proto bpf_for_each_map_elem_proto = {
>         .arg4_type      = ARG_ANYTHING,
>  };
>
> -/* maximum number of loops */
> -#define MAX_LOOPS      BIT(23)
> -
>  BPF_CALL_4(bpf_loop, u32, nr_loops, void *, callback_fn, void *, callback_ctx,
>            u64, flags)
>  {
> @@ -733,9 +731,12 @@ BPF_CALL_4(bpf_loop, u32, nr_loops, void *, callback_fn, void *, callback_ctx,
>         u64 ret;
>         u32 i;
>
> +       /* note: these safety checks are also verified when bpf_loop is inlined,
> +        * be careful to modify this code in sync
> +        */
>         if (flags)
>                 return -EINVAL;
> -       if (nr_loops > MAX_LOOPS)
> +       if (nr_loops > BPF_MAX_LOOPS)
>                 return -E2BIG;
>
>         for (i = 0; i < nr_loops; i++) {
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index aedac2ac02b9..27d78fe6c3f9 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -7103,6 +7103,78 @@ static int check_get_func_ip(struct bpf_verifier_env *env)
>         return -ENOTSUPP;
>  }
>
> +static struct bpf_insn_aux_data *cur_aux(struct bpf_verifier_env *env)
> +{
> +       return &env->insn_aux_data[env->insn_idx];
> +}
> +
> +static bool fit_for_bpf_loop_inline(struct bpf_insn_aux_data *insn_aux)
> +{
> +       return insn_aux->loop_inline_state.initialized &&
> +               insn_aux->loop_inline_state.flags_is_zero &&
> +               insn_aux->loop_inline_state.callback_is_constant;
> +}
> +
> +/* For all sub-programs in the program (including main) checks

nit: "check" without the extra s :)

> + * insn_aux_data to see if there are bpf_loop calls that require
> + * inlining. If such calls are found subprog stack_depth is increased
> + * by the size of 3 registers. Reserved space would be used in the
> + * do_misc_fixups to spill values of the R6, R7, R8 to use these
> + * registers for loop iteration.
> + */
> +static void adjust_stack_depth_for_loop_inlining(struct bpf_verifier_env *env)
> +{
> +       int i, subprog_end, cur_subprog = 0;
> +       struct bpf_subprog_info *subprogs = env->subprog_info;
nit: I think this needs to be above the "int i, subprog_end..." line
since it's longer

> +       int insn_cnt = env->prog->len;
> +       bool subprog_updated = false;
> +       s32 stack_base;
> +
> +       subprog_end = (env->subprog_cnt > 1
> +                      ? subprogs[cur_subprog + 1].start
> +                      : insn_cnt);
> +       for (i = 0; i < insn_cnt; i++) {
> +               struct bpf_insn_aux_data *aux = &env->insn_aux_data[i];

Maybe this should be "struct bpf_loop_inline_state *inline_state =
&env->insn_aux_data[i].loop_inline_state" since we only use aux for
aux->loop_inline_state

> +
> +               if (fit_for_bpf_loop_inline(aux)) {
> +                       if (!subprog_updated) {
> +                               subprog_updated = true;
> +                               subprogs[cur_subprog].stack_depth += BPF_REG_SIZE * 3;
> +                               stack_base = -subprogs[cur_subprog].stack_depth;
> +                       }
> +                       aux->loop_inline_state.stack_base = stack_base;
> +               }
> +               if (i == subprog_end - 1) {
> +                       subprog_updated = false;
> +                       cur_subprog++;
> +                       if (cur_subprog < env->subprog_cnt)
> +                               subprog_end = subprogs[cur_subprog + 1].start;
> +               }
> +       }
> +
> +       env->prog->aux->stack_depth = env->subprog_info[0].stack_depth;

In the case where a subprogram that is not subprogram 0 is a fit for
the bpf loop inline and thus increases its stack depth, won't
env->prog->aux->stack_depth need to also be updated?

> +}
> +
> +static void update_loop_inline_state(struct bpf_verifier_env *env, u32 subprogno)
> +{
> +       struct bpf_loop_inline_state *state = &cur_aux(env)->loop_inline_state;
> +       struct bpf_reg_state *regs = cur_regs(env);
> +       struct bpf_reg_state *flags_reg = &regs[BPF_REG_4];
> +
> +       int flags_is_zero =
> +               register_is_const(flags_reg) && flags_reg->var_off.value == 0;
> +
> +       if (state->initialized) {
> +               state->flags_is_zero &= flags_is_zero;
> +               state->callback_is_constant &= state->callback_subprogno == subprogno;
> +       } else {
> +               state->initialized = 1;
> +               state->callback_is_constant = 1;
> +               state->flags_is_zero = flags_is_zero;
> +               state->callback_subprogno = subprogno;
> +       }
> +}
> +
>  static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
>                              int *insn_idx_p)
>  {
> @@ -7255,6 +7327,7 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
>                 err = check_bpf_snprintf_call(env, regs);
>                 break;
>         case BPF_FUNC_loop:
> +               update_loop_inline_state(env, meta.subprogno);
>                 err = __check_func_call(env, insn, insn_idx_p, meta.subprogno,
>                                         set_loop_callback_state);
>                 break;
> @@ -7661,11 +7734,6 @@ static bool check_reg_sane_offset(struct bpf_verifier_env *env,
>         return true;
>  }
>
> -static struct bpf_insn_aux_data *cur_aux(struct bpf_verifier_env *env)
> -{
> -       return &env->insn_aux_data[env->insn_idx];
> -}
> -
>  enum {
>         REASON_BOUNDS   = -1,
>         REASON_TYPE     = -2,
> @@ -12920,6 +12988,22 @@ static struct bpf_prog *bpf_patch_insn_data(struct bpf_verifier_env *env, u32 of
>         return new_prog;
>  }
>
> +static void adjust_loop_inline_subprogno(struct bpf_verifier_env *env,
> +                                        u32 first_removed,
> +                                        u32 first_remaining)
> +{
> +       int delta = first_remaining - first_removed;
> +
> +       for (int i = 0; i < env->prog->len; ++i) {
> +               struct bpf_loop_inline_state *state =
> +                       &env->insn_aux_data[i].loop_inline_state;
> +
> +               if (state->initialized &&
> +                   state->callback_subprogno >= first_remaining)
> +                       state->callback_subprogno -= delta;
> +       }
> +}
> +
>  static int adjust_subprog_starts_after_remove(struct bpf_verifier_env *env,
>                                               u32 off, u32 cnt)
>  {
> @@ -12963,6 +13047,8 @@ static int adjust_subprog_starts_after_remove(struct bpf_verifier_env *env,
>                          * in adjust_btf_func() - no need to adjust
>                          */
>                 }
> +
> +               adjust_loop_inline_subprogno(env, i, j);
>         } else {
>                 /* convert i from "first prog to remove" to "first to adjust" */
>                 if (env->subprog_info[i].start == off)
> @@ -13773,6 +13859,94 @@ static int fixup_kfunc_call(struct bpf_verifier_env *env,
>         return 0;
>  }
>
> +static struct bpf_prog *inline_bpf_loop(struct bpf_verifier_env *env,
> +                                       int position, u32 *cnt)
> +{
> +       struct bpf_insn_aux_data *aux = &env->insn_aux_data[position];
> +       s32 stack_base = aux->loop_inline_state.stack_base;
> +       s32 r6_offset = stack_base + 0 * BPF_REG_SIZE;
> +       s32 r7_offset = stack_base + 1 * BPF_REG_SIZE;
> +       s32 r8_offset = stack_base + 2 * BPF_REG_SIZE;
> +       int reg_loop_max = BPF_REG_6;
> +       int reg_loop_cnt = BPF_REG_7;
> +       int reg_loop_ctx = BPF_REG_8;
> +
> +       struct bpf_prog *new_prog;
> +       u32 callback_subprogno = aux->loop_inline_state.callback_subprogno;
> +       u32 callback_start;
> +       u32 call_insn_offset;
> +       s32 callback_offset;
> +       struct bpf_insn insn_buf[19];
> +       struct bpf_insn *next = insn_buf;
> +       struct bpf_insn *call, *jump_to_end, *loop_header;
> +       struct bpf_insn *jump_to_header, *loop_exit;
> +
> +       /* Return error and jump to the end of the patch if
> +        * expected number of iterations is too big.  This
> +        * repeats the check done in bpf_loop helper function,
> +        * be careful to modify this code in sync.
> +        */
> +       (*next++) = BPF_JMP_IMM(BPF_JLE, BPF_REG_1, BPF_MAX_LOOPS, 2);
> +       (*next++) = BPF_MOV32_IMM(BPF_REG_0, -E2BIG);
> +       jump_to_end = next;
> +       (*next++) = BPF_JMP_IMM(BPF_JA, 0, 0, 0 /* set below */);
> +       /* spill R6, R7, R8 to use these as loop vars */
> +       (*next++) = BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_6, r6_offset);
> +       (*next++) = BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_7, r7_offset);
> +       (*next++) = BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_8, r8_offset);
> +       /* initialize loop vars */
> +       (*next++) = BPF_MOV64_REG(reg_loop_max, BPF_REG_1);
> +       (*next++) = BPF_MOV32_IMM(reg_loop_cnt, 0);
> +       (*next++) = BPF_MOV64_REG(reg_loop_ctx, BPF_REG_3);
> +       /* loop header;
> +        * if reg_loop_cnt >= reg_loop_max skip the loop body
> +        */
> +       loop_header = next;
> +       (*next++) = BPF_JMP_REG(BPF_JGE, reg_loop_cnt, reg_loop_max,
> +                               0 /* set below */);
> +       /* callback call */
> +       (*next++) = BPF_MOV64_REG(BPF_REG_1, reg_loop_cnt);
> +       (*next++) = BPF_MOV64_REG(BPF_REG_2, reg_loop_ctx);
> +       call = next;
> +       (*next++) = BPF_CALL_REL(0 /* set below after patching */);
> +       /* increment loop counter */
> +       (*next++) = BPF_ALU64_IMM(BPF_ADD, reg_loop_cnt, 1);
> +       /* jump to loop header if callback returned 0 */
> +       jump_to_header = next;
> +       (*next++) = BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 0 /* set below */);
> +       /* return value of bpf_loop;
> +        * set R0 to the number of iterations
> +        */
> +       loop_exit = next;
> +       (*next++) = BPF_MOV64_REG(BPF_REG_0, reg_loop_cnt);
> +       /* restore original values of R6, R7, R8 */
> +       (*next++) = BPF_LDX_MEM(BPF_DW, BPF_REG_6, BPF_REG_10, r6_offset);
> +       (*next++) = BPF_LDX_MEM(BPF_DW, BPF_REG_7, BPF_REG_10, r7_offset);
> +       (*next++) = BPF_LDX_MEM(BPF_DW, BPF_REG_8, BPF_REG_10, r8_offset);
> +
> +       *cnt = next - insn_buf;
> +       if (*cnt > ARRAY_SIZE(insn_buf)) {
> +               WARN_ONCE(1, "BUG %s: 'next' exceeds bounds for 'insn_buf'\n",
> +                         __func__);
> +               return NULL;
> +       }
> +       jump_to_end->off = next - jump_to_end - 1;
> +       loop_header->off = loop_exit - loop_header - 1;
> +       jump_to_header->off = loop_header - jump_to_header - 1;
> +
> +       new_prog = bpf_patch_insn_data(env, position, insn_buf, *cnt);
> +       if (!new_prog)
> +               return new_prog;
> +
> +       /* callback start is known only after patching */
> +       callback_start = env->subprog_info[callback_subprogno].start;
> +       call_insn_offset = position + (call - insn_buf);
> +       callback_offset = callback_start - call_insn_offset - 1;
> +       env->prog->insnsi[call_insn_offset].imm = callback_offset;
> +
> +       return new_prog;
> +}
> +
>  /* Do various post-verification rewrites in a single program pass.
>   * These rewrites simplify JIT and interpreter implementations.
>   */
> @@ -14258,6 +14432,18 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
>                         continue;
>                 }
>
> +               if (insn->imm == BPF_FUNC_loop &&
> +                   fit_for_bpf_loop_inline(&env->insn_aux_data[i + delta])) {
> +                       new_prog = inline_bpf_loop(env, i + delta, &cnt);
> +                       if (!new_prog)
> +                               return -ENOMEM;
> +
> +                       delta    += cnt - 1;
> +                       env->prog = prog = new_prog;
> +                       insn      = new_prog->insnsi + i + delta;
> +                       continue;
> +               }
> +
>  patch_call_imm:
>                 fn = env->ops->get_func_proto(insn->imm, env->prog);
>                 /* all functions that have prototype and verifier allowed
> @@ -15030,6 +15216,9 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr)
>         if (ret == 0)
>                 ret = check_max_stack_depth(env);
>
> +       if (ret == 0)
> +               adjust_stack_depth_for_loop_inlining(env);

Do we need to do this before the check_max_stack_depth() call above
since adjust_stack_depth_for_loop_inlining() adjusts the stack depth?

> +
>         /* instruction rewrites happen after this point */
>         if (is_priv) {
>                 if (ret == 0)
> --
> 2.25.1
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next v3 3/5] bpf: Inline calls to bpf_loop when callback is known
  2022-06-04  0:06   ` Joanne Koong
@ 2022-06-04 12:51     ` Eduard Zingerman
  2022-06-06 18:08       ` Joanne Koong
  0 siblings, 1 reply; 20+ messages in thread
From: Eduard Zingerman @ 2022-06-04 12:51 UTC (permalink / raw)
  To: Joanne Koong
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	kernel-team, song

Hi Joanne, thanks for the review!

> On Fri, 2022-06-03 at 17:06 -0700, Joanne Koong wrote:
> Should this be moved to include/linux/bpf.h since
> kernel/bpf/bpf_iter.c also uses this? Then we don't need the "#include
> <linux/bpf_verifier.h>" in bpf_iter.c

Will do.

> Also, as a side-note, I think if we have an "is_valid" field, then we
> don't need both the "callback_is_constant" and "flags_is_zero" fields.
> If the flags is not zero or the callback subprogno is not the same on
> each invocation of the instruction, then we could represent that by
> just setting is_valid to false.

Agree, I'll update the declaration as follows:

struct bpf_loop_inline_state {
	bool initialized; /* set to true upon first entry */
	bool can_inline; /* true if callback function
			  * is the same at each call
			  * and flags are always zero */
	u32 callback_subprogno; /* valid when can_inline is true */
	/* stack offset for loop vars;
	 * u16 since bpf_subprog_info.stack_depth is u16;
	 * we take the negative of it whenever we use it
	 * since the stack grows downwards
	 */
	u16 stack_base;
};

> Is placing this inside the union in "struct bpf_insn_aux_data" an option?

This is an option indeed, will do.

> > +
> > +               if (fit_for_bpf_loop_inline(aux)) {
> > +                       if (!subprog_updated) {
> > +                               subprog_updated = true;
> > +                               subprogs[cur_subprog].stack_depth += BPF_REG_SIZE * 3;
> > +                               stack_base = -subprogs[cur_subprog].stack_depth;
> > +                       }
> > +                       aux->loop_inline_state.stack_base = stack_base;
> > +               }
> > +               if (i == subprog_end - 1) {
> > +                       subprog_updated = false;
> > +                       cur_subprog++;
> > +                       if (cur_subprog < env->subprog_cnt)
> > +                               subprog_end = subprogs[cur_subprog + 1].start;
> > +               }
> > +       }
> > +
> > +       env->prog->aux->stack_depth = env->subprog_info[0].stack_depth;
> 
> In the case where a subprogram that is not subprogram 0 is a fit for
> the bpf loop inline and thus increases its stack depth, won't
> env->prog->aux->stack_depth need to also be updated?

As far as I understand the logic in `do_check_main` and `jit_subprogs`,
`env->prog->aux->stack_depth` always reflects the stack depth of the
first sub-program (aka `env->subprog_info[0].stack_depth`). So the
last line of `adjust_stack_depth_for_loop_inlining` merely ensures
this invariant. The stack depth for each relevant subprogram is
updated earlier in the function:

static void adjust_stack_depth_for_loop_inlining(struct bpf_verifier_env *env)
{
	...
	for (i = 0; i < insn_cnt; i++) {
		...
		if (fit_for_bpf_loop_inline(aux)) {
			if (!subprog_updated) {
				subprog_updated = true;
here  ---->			subprogs[cur_subprog].stack_depth += BPF_REG_SIZE * 3;
				...
			}
			...
		}
		if (i == subprog_end - 1) {
			subprog_updated = false;
			cur_subprog++;
			...
		}
	}
	...
}

Also, the patch v3 4/5 in a series has a test case "bpf_loop_inline
stack locations for loop vars" which checks that stack offsets for
spilled registers are assigned correctly for subprogram that is not a
first subprogram.

> > @@ -15030,6 +15216,9 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr)
> >         if (ret == 0)
> >                 ret = check_max_stack_depth(env);
> > 
> > +       if (ret == 0)
> > +               adjust_stack_depth_for_loop_inlining(env);
> 
> Do we need to do this before the check_max_stack_depth() call above
> since adjust_stack_depth_for_loop_inlining() adjusts the stack depth?

This is an interesting question. I used the following reasoning when
placing `adjust_stack_depth_for_loop_inlining` after
`check_max_stack_depth`:

1. If `adjust_stack_depth_for_loop_inlining` is placed before
   `check_max_stack_depth`, some existing BPF programs might stop
   loading because of the increased stack usage.
2. To avoid (1) it is possible to do `check_max_stack_depth` first,
   remember the actual max depth and apply
   `adjust_stack_depth_for_loop_inlining` only when there is enough
   stack space. However, there are two downsides:
   - the optimization becomes flaky: seemingly simple changes to the
     code of the BPF program might cause loop inlining to stop
     working;
   - the precise verification itself is non-trivial, as each possible
     stack trace has to be analyzed in terms of stack size with loop
     inlining / stack size without loop inlining. If necessary, I will
     probably use a simpler heuristic where the stack budget for
     register spilling is computed as
     `MAX_BPF_STACK - actual_max_stack_depth` (a sketch follows below).
3. Things are simpler if MAX_BPF_STACK is a soft limit that ensures
   that BPF programs consume a limited amount of stack. In that case
   the current implementation is sufficient.

So this boils down to the question of what `MAX_BPF_STACK` is:
- a hard limit for the BPF program stack size?
- a soft limit used to verify that programs consume a limited amount
  of stack while executing?

If the answer is "hard limit", I'll proceed with the implementation of
option (2).
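For reference, the heuristic from option (2) could boil down to a check along
these lines (a rough sketch, the helper name is made up):

/* hypothetical helper, not part of this series */
static bool can_spill_regs_for_loop_inlining(u32 actual_max_stack_depth)
{
	/* spilling R6, R7 and R8 takes 3 * BPF_REG_SIZE bytes of stack */
	return MAX_BPF_STACK - actual_max_stack_depth >= 3 * BPF_REG_SIZE;
}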

Thanks,
Eduard


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next v3 1/5] selftests/bpf: specify expected instructions in test_verifier tests
  2022-06-03 23:01       ` Song Liu
@ 2022-06-04 12:53         ` Eduard Zingerman
  0 siblings, 0 replies; 20+ messages in thread
From: Eduard Zingerman @ 2022-06-04 12:53 UTC (permalink / raw)
  To: Song Liu
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Kernel Team

Hi Song,

> Right. There are a few more in later patches.

Understood, thanks for the review, I'll apply the suggested fixes.

Thanks,
Eduard


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next v3 3/5] bpf: Inline calls to bpf_loop when callback is known
  2022-06-03 14:10 ` [PATCH bpf-next v3 3/5] bpf: Inline calls to bpf_loop when callback is known Eduard Zingerman
  2022-06-03 22:36   ` Song Liu
  2022-06-04  0:06   ` Joanne Koong
@ 2022-06-04 14:47   ` Alexei Starovoitov
  2022-06-04 15:36     ` Eduard Zingerman
  2 siblings, 1 reply; 20+ messages in thread
From: Alexei Starovoitov @ 2022-06-04 14:47 UTC (permalink / raw)
  To: Eduard Zingerman; +Cc: bpf, ast, andrii, daniel, kernel-team, song

On Fri, Jun 03, 2022 at 05:10:45PM +0300, Eduard Zingerman wrote:
> +
> +		if (fit_for_bpf_loop_inline(aux)) {
> +			if (!subprog_updated) {
> +				subprog_updated = true;
> +				subprogs[cur_subprog].stack_depth += BPF_REG_SIZE * 3;

Instead of doing += and keeping track that the stack was already increased
(in case multiple bpf_loop calls happen to be in a function),
maybe add a subprogs[..].stack_depth_extra variable?
And unconditionally do subprogs[cur_subprog].stack_depth_extra = BPF_REG_SIZE * 3;
every time bpf_loop is seen?
Then we can do that during inline_bpf_loop().
Also, is there a test for multiple bpf_loop calls in a func?
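In adjust_stack_depth_for_loop_inlining that would roughly mean (untested
sketch; how stack_depth_extra is later folded into stack_depth is omitted):

		if (fit_for_bpf_loop_inline(aux)) {
			/* stack_depth_extra is the new field suggested above */
			subprogs[cur_subprog].stack_depth_extra = BPF_REG_SIZE * 3;
			aux->loop_inline_state.stack_base =
				-(subprogs[cur_subprog].stack_depth +
				  subprogs[cur_subprog].stack_depth_extra);
		}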

> +				stack_base = -subprogs[cur_subprog].stack_depth;
> +			}
> +			aux->loop_inline_state.stack_base = stack_base;
> +		}
> +		if (i == subprog_end - 1) {
> +			subprog_updated = false;
> +			cur_subprog++;
> +			if (cur_subprog < env->subprog_cnt)
> +				subprog_end = subprogs[cur_subprog + 1].start;
> +		}
> +	}
> +
> +	env->prog->aux->stack_depth = env->subprog_info[0].stack_depth;
> +}
> +
> +static void update_loop_inline_state(struct bpf_verifier_env *env, u32 subprogno)
> +{
> +	struct bpf_loop_inline_state *state = &cur_aux(env)->loop_inline_state;
> +	struct bpf_reg_state *regs = cur_regs(env);
> +	struct bpf_reg_state *flags_reg = &regs[BPF_REG_4];
> +
> +	int flags_is_zero =
> +		register_is_const(flags_reg) && flags_reg->var_off.value == 0;
> +
> +	if (state->initialized) {
> +		state->flags_is_zero &= flags_is_zero;
> +		state->callback_is_constant &= state->callback_subprogno == subprogno;
> +	} else {
> +		state->initialized = 1;
> +		state->callback_is_constant = 1;
> +		state->flags_is_zero = flags_is_zero;
> +		state->callback_subprogno = subprogno;
> +	}
> +}
> +
>  static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
>  			     int *insn_idx_p)
>  {
> @@ -7255,6 +7327,7 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
>  		err = check_bpf_snprintf_call(env, regs);
>  		break;
>  	case BPF_FUNC_loop:
> +		update_loop_inline_state(env, meta.subprogno);
>  		err = __check_func_call(env, insn, insn_idx_p, meta.subprogno,
>  					set_loop_callback_state);
>  		break;
> @@ -7661,11 +7734,6 @@ static bool check_reg_sane_offset(struct bpf_verifier_env *env,
>  	return true;
>  }
>  
> -static struct bpf_insn_aux_data *cur_aux(struct bpf_verifier_env *env)
> -{
> -	return &env->insn_aux_data[env->insn_idx];
> -}
> -
>  enum {
>  	REASON_BOUNDS	= -1,
>  	REASON_TYPE	= -2,
> @@ -12920,6 +12988,22 @@ static struct bpf_prog *bpf_patch_insn_data(struct bpf_verifier_env *env, u32 of
>  	return new_prog;
>  }
>  
> +static void adjust_loop_inline_subprogno(struct bpf_verifier_env *env,
> +					 u32 first_removed,
> +					 u32 first_remaining)
> +{
> +	int delta = first_remaining - first_removed;
> +
> +	for (int i = 0; i < env->prog->len; ++i) {
> +		struct bpf_loop_inline_state *state =
> +			&env->insn_aux_data[i].loop_inline_state;
> +
> +		if (state->initialized &&
> +		    state->callback_subprogno >= first_remaining)
> +			state->callback_subprogno -= delta;
> +	}
> +}
> +
>  static int adjust_subprog_starts_after_remove(struct bpf_verifier_env *env,
>  					      u32 off, u32 cnt)
>  {
> @@ -12963,6 +13047,8 @@ static int adjust_subprog_starts_after_remove(struct bpf_verifier_env *env,
>  			 * in adjust_btf_func() - no need to adjust
>  			 */
>  		}
> +
> +		adjust_loop_inline_subprogno(env, i, j);

This special case isn't great.
Maybe let's do inline_bpf_loop() earlier?
Instead of doing it from do_misc_fixups(), how about doing it as a separate
pass right after check_max_stack_depth()?
Then opt_remove_dead_code() will adjust BPF_CALL_REL automatically
as part of bpf_adj_branches().
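Roughly, such a separate pass could look like this (untested sketch; the
function name and its exact call site are only an illustration, the body
mirrors the BPF_FUNC_loop handling this patch adds to do_misc_fixups()):

static int inline_bpf_loop_calls(struct bpf_verifier_env *env)
{
	struct bpf_insn *insn = env->prog->insnsi;
	const int insn_cnt = env->prog->len;
	struct bpf_prog *new_prog;
	int i, delta = 0;
	u32 cnt;

	for (i = 0; i < insn_cnt; i++, insn++) {
		/* only plain helper calls to bpf_loop that fit for inlining */
		if (insn->code != (BPF_JMP | BPF_CALL) ||
		    insn->src_reg != 0 ||
		    insn->imm != BPF_FUNC_loop ||
		    !fit_for_bpf_loop_inline(&env->insn_aux_data[i + delta]))
			continue;

		new_prog = inline_bpf_loop(env, i + delta, &cnt);
		if (!new_prog)
			return -ENOMEM;

		delta += cnt - 1;
		env->prog = new_prog;
		insn = new_prog->insnsi + i + delta;
	}

	return 0;
}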

>  	} else {
>  		/* convert i from "first prog to remove" to "first to adjust" */
>  		if (env->subprog_info[i].start == off)
> @@ -13773,6 +13859,94 @@ static int fixup_kfunc_call(struct bpf_verifier_env *env,
>  	return 0;
>  }
>  
> +static struct bpf_prog *inline_bpf_loop(struct bpf_verifier_env *env,
> +					int position, u32 *cnt)
> +{
> +	struct bpf_insn_aux_data *aux = &env->insn_aux_data[position];
> +	s32 stack_base = aux->loop_inline_state.stack_base;
> +	s32 r6_offset = stack_base + 0 * BPF_REG_SIZE;
> +	s32 r7_offset = stack_base + 1 * BPF_REG_SIZE;
> +	s32 r8_offset = stack_base + 2 * BPF_REG_SIZE;
> +	int reg_loop_max = BPF_REG_6;
> +	int reg_loop_cnt = BPF_REG_7;
> +	int reg_loop_ctx = BPF_REG_8;
> +
> +	struct bpf_prog *new_prog;
> +	u32 callback_subprogno = aux->loop_inline_state.callback_subprogno;
> +	u32 callback_start;
> +	u32 call_insn_offset;
> +	s32 callback_offset;
> +	struct bpf_insn insn_buf[19];
> +	struct bpf_insn *next = insn_buf;
> +	struct bpf_insn *call, *jump_to_end, *loop_header;
> +	struct bpf_insn *jump_to_header, *loop_exit;
> +
> +	/* Return error and jump to the end of the patch if
> +	 * expected number of iterations is too big.  This
> +	 * repeats the check done in bpf_loop helper function,
> +	 * be careful to modify this code in sync.
> +	 */
> +	(*next++) = BPF_JMP_IMM(BPF_JLE, BPF_REG_1, BPF_MAX_LOOPS, 2);
> +	(*next++) = BPF_MOV32_IMM(BPF_REG_0, -E2BIG);
> +	jump_to_end = next;
> +	(*next++) = BPF_JMP_IMM(BPF_JA, 0, 0, 0 /* set below */);
> +	/* spill R6, R7, R8 to use these as loop vars */
> +	(*next++) = BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_6, r6_offset);
> +	(*next++) = BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_7, r7_offset);
> +	(*next++) = BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_8, r8_offset);
> +	/* initialize loop vars */
> +	(*next++) = BPF_MOV64_REG(reg_loop_max, BPF_REG_1);
> +	(*next++) = BPF_MOV32_IMM(reg_loop_cnt, 0);
> +	(*next++) = BPF_MOV64_REG(reg_loop_ctx, BPF_REG_3);
> +	/* loop header;
> +	 * if reg_loop_cnt >= reg_loop_max skip the loop body
> +	 */
> +	loop_header = next;
> +	(*next++) = BPF_JMP_REG(BPF_JGE, reg_loop_cnt, reg_loop_max,
> +				0 /* set below */);
> +	/* callback call */
> +	(*next++) = BPF_MOV64_REG(BPF_REG_1, reg_loop_cnt);
> +	(*next++) = BPF_MOV64_REG(BPF_REG_2, reg_loop_ctx);
> +	call = next;
> +	(*next++) = BPF_CALL_REL(0 /* set below after patching */);
> +	/* increment loop counter */
> +	(*next++) = BPF_ALU64_IMM(BPF_ADD, reg_loop_cnt, 1);
> +	/* jump to loop header if callback returned 0 */
> +	jump_to_header = next;
> +	(*next++) = BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 0 /* set below */);
> +	/* return value of bpf_loop;
> +	 * set R0 to the number of iterations
> +	 */
> +	loop_exit = next;
> +	(*next++) = BPF_MOV64_REG(BPF_REG_0, reg_loop_cnt);
> +	/* restore original values of R6, R7, R8 */
> +	(*next++) = BPF_LDX_MEM(BPF_DW, BPF_REG_6, BPF_REG_10, r6_offset);
> +	(*next++) = BPF_LDX_MEM(BPF_DW, BPF_REG_7, BPF_REG_10, r7_offset);
> +	(*next++) = BPF_LDX_MEM(BPF_DW, BPF_REG_8, BPF_REG_10, r8_offset);
> +
> +	*cnt = next - insn_buf;
> +	if (*cnt > ARRAY_SIZE(insn_buf)) {
> +		WARN_ONCE(1, "BUG %s: 'next' exceeds bounds for 'insn_buf'\n",
> +			  __func__);
> +		return NULL;
> +	}
> +	jump_to_end->off = next - jump_to_end - 1;
> +	loop_header->off = loop_exit - loop_header - 1;
> +	jump_to_header->off = loop_header - jump_to_header - 1;
> +
> +	new_prog = bpf_patch_insn_data(env, position, insn_buf, *cnt);
> +	if (!new_prog)
> +		return new_prog;
> +
> +	/* callback start is known only after patching */
> +	callback_start = env->subprog_info[callback_subprogno].start;
> +	call_insn_offset = position + (call - insn_buf);
> +	callback_offset = callback_start - call_insn_offset - 1;
> +	env->prog->insnsi[call_insn_offset].imm = callback_offset;
> +
> +	return new_prog;
> +}
> +
>  /* Do various post-verification rewrites in a single program pass.
>   * These rewrites simplify JIT and interpreter implementations.
>   */
> @@ -14258,6 +14432,18 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
>  			continue;
>  		}
>  
> +		if (insn->imm == BPF_FUNC_loop &&
> +		    fit_for_bpf_loop_inline(&env->insn_aux_data[i + delta])) {
> +			new_prog = inline_bpf_loop(env, i + delta, &cnt);
> +			if (!new_prog)
> +				return -ENOMEM;
> +
> +			delta    += cnt - 1;
> +			env->prog = prog = new_prog;
> +			insn      = new_prog->insnsi + i + delta;
> +			continue;
> +		}
> +
>  patch_call_imm:
>  		fn = env->ops->get_func_proto(insn->imm, env->prog);
>  		/* all functions that have prototype and verifier allowed
> @@ -15030,6 +15216,9 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr)
>  	if (ret == 0)
>  		ret = check_max_stack_depth(env);
>  
> +	if (ret == 0)
> +		adjust_stack_depth_for_loop_inlining(env);

With the above two suggestions this will become
if (ret == 0)
    optimize_bpf_loop(env);

where it will call inline_bpf_loop() and will assign max_depth_extra.
And then one extra loop for_each_subprog() max_depth += max_depth_extra.
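
Rough sketch of what I have in mind (untested, just to illustrate the
idea; it assumes inline_bpf_loop() is changed to take the spill offset
as an argument, and is_bpf_loop_call() is a small helper that checks
for a BPF_JMP|BPF_CALL insn with imm == BPF_FUNC_loop):

static int optimize_bpf_loop(struct bpf_verifier_env *env)
{
	struct bpf_subprog_info *subprogs = env->subprog_info;
	int i, cur_subprog = 0, cnt, delta = 0;
	struct bpf_insn *insn = env->prog->insnsi;
	int insn_cnt = env->prog->len;
	u16 stack_depth = subprogs[cur_subprog].stack_depth;
	u16 stack_depth_extra = 0;

	for (i = 0; i < insn_cnt; i++, insn++) {
		struct bpf_insn_aux_data *aux = &env->insn_aux_data[i + delta];

		if (is_bpf_loop_call(insn) && fit_for_bpf_loop_inline(aux)) {
			struct bpf_prog *new_prog;

			/* spill area for R6-R8 sits right below the stack
			 * already used by the current subprog
			 */
			stack_depth_extra = BPF_REG_SIZE * 3;
			new_prog = inline_bpf_loop(env, i + delta,
						   -(stack_depth + stack_depth_extra),
						   &cnt);
			if (!new_prog)
				return -ENOMEM;

			delta    += cnt - 1;
			env->prog = new_prog;
			insn      = new_prog->insnsi + i + delta;
		}

		/* last insn of the current subprog? */
		if (subprogs[cur_subprog + 1].start == i + delta + 1) {
			subprogs[cur_subprog].stack_depth += stack_depth_extra;
			cur_subprog++;
			stack_depth = subprogs[cur_subprog].stack_depth;
			stack_depth_extra = 0;
		}
	}

	env->prog->aux->stack_depth = env->subprog_info[0].stack_depth;

	return 0;
}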

wdyt?


* Re: [PATCH bpf-next v3 3/5] bpf: Inline calls to bpf_loop when callback is known
  2022-06-04 14:47   ` Alexei Starovoitov
@ 2022-06-04 15:36     ` Eduard Zingerman
  0 siblings, 0 replies; 20+ messages in thread
From: Eduard Zingerman @ 2022-06-04 15:36 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: bpf, ast, andrii, daniel, kernel-team, song

Hi Alexei,

> On Sat, 2022-06-04 at 16:47 +0200, Alexei Starovoitov wrote:
> > On Fri, Jun 03, 2022 at 05:10:45PM +0300, Eduard Zingerman wrote:
[...]
> > +		if (fit_for_bpf_loop_inline(aux)) {
> > +			if (!subprog_updated) {
> > +				subprog_updated = true;
> > +				subprogs[cur_subprog].stack_depth += BPF_REG_SIZE * 3;
> 
> Instead of doing += and keeping track that the stack was increased
> (if multiple bpf_loop happened to be in a function)
> maybe add a subprogs[..].stack_depth_extra variable?
> And unconditionally do subprogs[cur_subprog].stack_depth_extra = BPF_REG_SIZE * 3;
> every time bpf_loop is seen?
> Then we can do that during inline_bpf_loop().
[...]
> > @@ -12963,6 +13047,8 @@ static int adjust_subprog_starts_after_remove(struct bpf_verifier_env *env,
> >  			 * in adjust_btf_func() - no need to adjust
> >  			 */
> >  		}
> > +
> > +		adjust_loop_inline_subprogno(env, i, j);
> 
> This special case isn't great.
> Maybe let's do inline_bpf_loop() earlier?
> Instead of doing it from do_misc_fixups(), how about doing it as a separate
> pass right after check_max_stack_depth()?
> Then opt_remove_dead_code() will adjust BPF_CALL_REL automatically
> as part of bpf_adj_branches().
[...]
> >  	if (ret == 0)
> >  		ret = check_max_stack_depth(env);
> >  
> > +	if (ret == 0)
> > +		adjust_stack_depth_for_loop_inlining(env);
> 
> With the above two suggestions this will become
> if (ret == 0)
>     optimize_bpf_loop(env);
> 
> where it will call inline_bpf_loop() and will assign max_depth_extra.
> And then one extra loop for_each_subprog() max_depth += max_depth_extra.
> 
> wdyt?

I will proceed with the suggested rewrite; it simplifies a few things,
thank you!

> Also is there a test for multiple bpf_loop in a func?

Yes, please see the test case "bpf_loop_inline stack locations for
loop vars" in patch v3 4/5 of this series.

Also, please note Joanne's comment from a sibling thread:

> > +       if (ret == 0)
> > +               adjust_stack_depth_for_loop_inlining(env);
> 
> Do we need to do this before the check_max_stack_depth() call above
> since adjust_stack_depth_for_loop_inlining() adjusts the stack depth?

In short, what is the meaning of the `MAX_BPF_STACK` constant?

- Is it a hard limit on BPF program stack size?

  If so, `optimize_bpf_loop` should skip the optimization when there is
  not enough stack space available, but this makes the optimization a
  bit 'flaky'. This could be achieved by passing the maximal stack depth
  computed by `check_max_stack_depth` to `optimize_bpf_loop` (see the
  sketch at the end of this email).

- Or is it a soft limit used by the verifier to guarantee that the
  stack space needed by the BPF program is bounded?

  If so, `optimize_bpf_loop` might be executed after
  `check_max_stack_depth` w/o any data passed from one to the other.
  But this would mean that for some programs the actual stack usage
  would be `MAX_BPF_STACK + 24*N`, where N is the number of functions
  with inlined loops (which is easier to compute than the maximal
  number of functions with inlined loops that can occur on the same
  stack trace).
  
  This might affect the following portion of `do_misc_fixups`:
  
 static int do_misc_fixups(struct bpf_verifier_env *env) {
	...
		if (insn->imm == BPF_FUNC_tail_call) {
			/* If we tail call into other programs, we
			 * cannot make any assumptions since they can
			 * be replaced dynamically during runtime in
			 * the program array.
			 */
			prog->cb_access = 1;
			if (!allow_tail_call_in_subprogs(env))
here ---->     			prog->aux->stack_depth = MAX_BPF_STACK;
			...
		}
	...
 }
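
If `MAX_BPF_STACK` turns out to be a hard limit, the check mentioned in
the first bullet above could be as simple as the following (an untested
sketch; it assumes `check_max_stack_depth` is extended to report the
computed maximal depth via a hypothetical `env->max_stack_depth` field):

		/* inside optimize_bpf_loop, before rewriting a bpf_loop call:
		 * skip the rewrite if spilling R6-R8 would push the program
		 * past MAX_BPF_STACK
		 */
		if (env->max_stack_depth + BPF_REG_SIZE * 3 > MAX_BPF_STACK)
			continue;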

Thanks,
Eduard



* Re: [PATCH bpf-next v3 3/5] bpf: Inline calls to bpf_loop when callback is known
  2022-06-04 12:51     ` Eduard Zingerman
@ 2022-06-06 18:08       ` Joanne Koong
  2022-06-08  9:06         ` Eduard Zingerman
  0 siblings, 1 reply; 20+ messages in thread
From: Joanne Koong @ 2022-06-06 18:08 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	kernel-team, song

On Sat, Jun 4, 2022 at 5:51 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> Hi Joanne, thanks for the review!
>
> > On Fri, 2022-06-03 at 17:06 -0700, Joanne Koong wrote:
[...]
>
> > > +
> > > +               if (fit_for_bpf_loop_inline(aux)) {
> > > +                       if (!subprog_updated) {
> > > +                               subprog_updated = true;
> > > +                               subprogs[cur_subprog].stack_depth += BPF_REG_SIZE * 3;
> > > +                               stack_base = -subprogs[cur_subprog].stack_depth;
> > > +                       }
> > > +                       aux->loop_inline_state.stack_base = stack_base;
> > > +               }
> > > +               if (i == subprog_end - 1) {
> > > +                       subprog_updated = false;
> > > +                       cur_subprog++;
> > > +                       if (cur_subprog < env->subprog_cnt)
> > > +                               subprog_end = subprogs[cur_subprog + 1].start;
> > > +               }
> > > +       }
> > > +
> > > +       env->prog->aux->stack_depth = env->subprog_info[0].stack_depth;
> >
> > In the case where a subprogram that is not subprogram 0 is a fit for
> > the bpf loop inline and thus increases its stack depth, won't
> > env->prog->aux->stack_depth need to also be updated?
>
> As far as I understand the logic in `do_check_main` and `jit_subprogs`,
> `env->prog->aux->stack_depth` always reflects the stack depth of the
> first sub-program (aka `env->subprog_info[0].stack_depth`). So the
> last line of `adjust_stack_depth_for_loop_inlining` merely ensures
> this invariant. The stack depth for each relevant subprogram is
> updated earlier in the function:
>
> static void adjust_stack_depth_for_loop_inlining(struct bpf_verifier_env *env)
> {
>         ...
>         for (i = 0; i < insn_cnt; i++) {
>                 ...
>                 if (fit_for_bpf_loop_inline(aux)) {
>                         if (!subprog_updated) {
>                                 subprog_updated = true;
> here  ---->                     subprogs[cur_subprog].stack_depth += BPF_REG_SIZE * 3;
>                                 ...
>                         }
>                         ...
>                 }
>                 if (i == subprog_end - 1) {
>                         subprog_updated = false;
>                         cur_subprog++;
>                         ...
>                 }
>         }
>         ...
> }
>
> Also, patch v3 4/5 in this series has a test case "bpf_loop_inline
> stack locations for loop vars" which checks that stack offsets for
> spilled registers are assigned correctly for a subprogram that is not
> the first one.
>
Thanks for the explanation :) I misunderstood what
prog->aux->stack_depth is used for

Looking at arch/arm64/net/bpf_jit_comp.c, I see this diagram

        /*
         * BPF prog stack layout
         *
         *                         high
         * original A64_SP =>   0:+-----+ BPF prologue
         *                        |FP/LR|
         * current A64_FP =>  -16:+-----+
         *                        | ... | callee saved registers
         * BPF fp register => -64:+-----+ <= (BPF_FP)
         *                        |     |
         *                        | ... | BPF prog stack
         *                        |     |
         *                        +-----+ <= (BPF_FP - prog->aux->stack_depth)
         *                        |RSVD | padding
         * current A64_SP =>      +-----+ <= (BPF_FP - ctx->stack_size)
         *                        |     |
         *                        | ... | Function call stack
         *                        |     |
         *                        +-----+
         *                          low
         *
         */
It looks like prog->aux->stack_depth is used for the "BPF prog stack",
which is the stack for the main BPF program (subprog 0).
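
If I'm reading build_prologue() in that file correctly, it reserves the
area based on stack_depth with something roughly like this (paraphrased
from memory, not an exact quote):

	ctx->stack_size = round_up(prog->aux->stack_depth, 16);
	/* carve out the BPF prog stack below BPF_FP */
	emit(A64_SUB_I(1, A64_SP, A64_SP, ctx->stack_size), ctx);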

> > > @@ -15030,6 +15216,9 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr)
> > >         if (ret == 0)
> > >                 ret = check_max_stack_depth(env);
> > >
> > > +       if (ret == 0)
> > > +               adjust_stack_depth_for_loop_inlining(env);
> >
> > Do we need to do this before the check_max_stack_depth() call above
> > since adjust_stack_depth_for_loop_inlining() adjusts the stack depth?
>
> This is an interesting question. I used the following reasoning when
> placing `adjust_stack_depth_for_loop_inlining` after
> `check_max_stack_depth`:
>
> 1. If `adjust_stack_depth_for_loop_inlining` is placed before
>    `check_max_stack_depth`, some existing BPF programs might stop
>    loading because of the increased stack usage.
> 2. To avoid (1) it is possible to run `check_max_stack_depth` first,
>    remember the actual max depth and apply
>    `adjust_stack_depth_for_loop_inlining` only when there is enough
>    stack space. However, there are two downsides:
>    - the optimization becomes flaky: seemingly simple changes to the
>      code of the BPF program might cause loop inlining to stop
>      working;
>    - the precise verification itself is non-trivial, as each possible
>      stack trace has to be analyzed in terms of stack size with and
>      without loop inlining. If necessary, I will probably use a
>      simpler heuristic where the stack budget for register spilling
>      would be computed as `MAX_BPF_STACK - actual_max_stack_depth`.
> 3. Things are simpler if MAX_BPF_STACK is a soft limit that ensures
>    that BPF programs consume a limited amount of stack. In that case
>    the current implementation is sufficient.
>
> So this boils down to the question of what `MAX_BPF_STACK` is:
> - a hard limit for the BPF program stack size?
> - a soft limit used to verify that programs consume a limited amount
>   of stack while executing?
>
> If the answer is "hard limit", I'll proceed with the implementation of
> option (2).
I'm not sure either whether MAX_BPF_STACK is a hard limit or a soft
limit. I'm curious to know as well.
>
> Thanks,
> Eduard
>


* Re: [PATCH bpf-next v3 3/5] bpf: Inline calls to bpf_loop when callback is known
  2022-06-06 18:08       ` Joanne Koong
@ 2022-06-08  9:06         ` Eduard Zingerman
  0 siblings, 0 replies; 20+ messages in thread
From: Eduard Zingerman @ 2022-06-08  9:06 UTC (permalink / raw)
  To: Joanne Koong
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	kernel-team, song

Hi Joanne,

> Looking at arch/arm64/net/bpf_jit_comp.c, I see this diagram
> 
[...]
> It looks like prog->aux->stack_depth is used for the "BPF prog stack",
> which is the stack for the main bpf program (subprog 0)

Thanks for looking into this. Also note the verifier.c:jit_subprogs
function, which essentially "promotes" every sub-program to sub-program #0
before calling bpf_jit_comp.c:bpf_int_jit_compile for that sub-program.
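
If I remember correctly, for each sub-program it does roughly the
following before JITing it (paraphrased, details from memory):

	func[i]->aux->stack_depth = env->subprog_info[i].stack_depth;
	...
	func[i] = bpf_int_jit_compile(func[i]);

so each sub-program is JITed with the stack depth computed for it,
not with prog->aux->stack_depth of the main program.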

> I'm not sure either whether MAX_BPF_STACK is a hard limit or a soft
> limit. I'm curious to know as well.

Alexei had commented in a sibling thread that MAX_BPF_STACK should be
considered a soft limit.

Thanks,
Eduard


