* [PATCH bpf-next v2 0/3] bpf_loop inlining
@ 2022-05-29 22:36 Eduard Zingerman
  2022-05-29 22:36 ` [PATCH bpf-next v2 1/3] selftests/bpf: specify expected instructions in test_verifier tests Eduard Zingerman
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Eduard Zingerman @ 2022-05-29 22:36 UTC (permalink / raw)
  To: bpf, ast, andrii, daniel, kernel-team; +Cc: eddyz87

Hi Everyone,

This patch series implements inlining of calls to the bpf_loop helper
function when bpf_loop's callback is statically known. E.g., the
rewrite performs the following transformation during BPF program
processing:

  bpf_loop(10, foo, NULL, 0);

 ->

  for (int i = 0; i < 10; ++i)
    foo(i, NULL);

The transformation leads to a measurable latency improvement for
simple loops. Measurements using `benchs/run_bench_bpf_loop.sh` inside
QEMU / KVM on an i7-4710HQ CPU show a drop in latency from 14 ns/op to
2 ns/op.

The change is split into three parts:

* An update to test_verifier.c that allows specifying expected and
  unexpected instruction sequences. This makes it possible to check
  BPF program rewrites applied by the do_misc_fixups function.

* An update to test_verifier.c that allows specifying BTF function
  infos and types per test case. This is necessary for tests that load
  sub-program addresses into a variable, because of the checks applied
  by the check_ld_imm function.

* An update to verifier.c that tracks the state of the parameters for
  each bpf_loop call in a program and decides whether the call can be
  replaced by a loop.

Additional details are available in commit messages for each patch.
Hope you will find this useful.

Note: the newly added `bpf_loop/check_stack` test case in `test_progs`
currently fails in no_alu32 mode with the error "BPF program is too
large". I'm still investigating this issue; nevertheless, feedback on
the overall patch set is welcome.

Best regards,
Eduard Zingerman

Changes since v1:
 - allow the use of SKIP_INSNS in instruction pattern specifications in
   test_verifier tests;
 - fix a bug in spill offset assignment for loop vars when bpf_loop is
   located in a non-main function.

Eduard Zingerman (3):
  selftests/bpf: specify expected instructions in test_verifier tests
  selftests/bpf: allow BTF specs and func infos in test_verifier tests
  bpf: Inline calls to bpf_loop when callback is known

 include/linux/bpf_verifier.h                  |  16 +
 kernel/bpf/bpf_iter.c                         |   9 +-
 kernel/bpf/verifier.c                         | 184 +++++++++-
 .../selftests/bpf/prog_tests/bpf_loop.c       |  62 ++++
 tools/testing/selftests/bpf/prog_tests/btf.c  |   1 -
 tools/testing/selftests/bpf/progs/bpf_loop.c  | 122 +++++++
 tools/testing/selftests/bpf/test_btf.h        |   2 +
 tools/testing/selftests/bpf/test_verifier.c   | 338 +++++++++++++++++-
 .../selftests/bpf/verifier/bpf_loop_inline.c  | 244 +++++++++++++
 9 files changed, 951 insertions(+), 27 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/verifier/bpf_loop_inline.c

-- 
2.25.1



* [PATCH bpf-next v2 1/3] selftests/bpf: specify expected instructions in test_verifier tests
  2022-05-29 22:36 [PATCH bpf-next v2 0/3] bpf_loop inlining Eduard Zingerman
@ 2022-05-29 22:36 ` Eduard Zingerman
  2022-05-31 18:26   ` Song Liu
  2022-05-29 22:36 ` [PATCH bpf-next v2 2/3] selftests/bpf: allow BTF specs and func infos " Eduard Zingerman
  2022-05-29 22:36 ` [PATCH bpf-next v2 3/3] bpf: Inline calls to bpf_loop when callback is known Eduard Zingerman
  2 siblings, 1 reply; 9+ messages in thread
From: Eduard Zingerman @ 2022-05-29 22:36 UTC (permalink / raw)
  To: bpf, ast, andrii, daniel, kernel-team; +Cc: eddyz87

Allow specifying expected and unexpected instruction sequences in
test_verifier test cases. The instructions are requested from the
kernel after BPF program loading, thus making it possible to check
some of the transformations applied by the BPF verifier.

- the `expected_insns` field specifies a sequence of instructions
  expected to be found in the program;
- the `unexpected_insns` field specifies a sequence of instructions
  that are not expected to be found in the program;
- the `INSN_OFF_MASK` and `INSN_IMM_MASK` values can be used to mask
  `off` and `imm` fields;
- `SKIP_INSNS` can be used to specify that some instructions in the
  (un)expected pattern are not important (behavior similar to the
  usage of `\t` in the `errstr` field).

The intended usage is as follows:

  {
	"inline simple bpf_loop call",
	.insns = {
	/* main */
	BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 1),
	BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2,
			BPF_PSEUDO_FUNC, 0, 6),
    ...
	BPF_EXIT_INSN(),
	/* callback */
	BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 1),
	BPF_EXIT_INSN(),
	},
	.expected_insns = {
		BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 1),
		SKIP_INSN(),
		BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, BPF_PSEUDO_CALL, 8, 1)
	},
	.unexpected_insns = {
	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0,
			INSN_OFF_MASK, INSN_IMM_MASK),
	},
	.prog_type = BPF_PROG_TYPE_TRACEPOINT,
	.result = ACCEPT,
	.runs = 0,
  },

Here it is expected that the move of 1 to register 1 remains in place
and that the helper function call instruction is replaced by a
relative call instruction.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 tools/testing/selftests/bpf/test_verifier.c | 228 ++++++++++++++++++++
 1 file changed, 228 insertions(+)

diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
index 372579c9f45e..13468221d227 100644
--- a/tools/testing/selftests/bpf/test_verifier.c
+++ b/tools/testing/selftests/bpf/test_verifier.c
@@ -51,6 +51,8 @@
 #endif
 
 #define MAX_INSNS	BPF_MAXINSNS
+#define MAX_EXPECTED_INSNS	32
+#define MAX_UNEXPECTED_INSNS	32
 #define MAX_TEST_INSNS	1000000
 #define MAX_FIXUPS	8
 #define MAX_NR_MAPS	23
@@ -58,6 +60,10 @@
 #define POINTER_VALUE	0xcafe4all
 #define TEST_DATA_LEN	64
 
+#define INSN_OFF_MASK	((s16)0xFFFF)
+#define INSN_IMM_MASK	((s32)0xFFFFFFFF)
+#define SKIP_INSNS()	BPF_RAW_INSN(0xde, 0xa, 0xd, 0xbeef, 0xdeadbeef)
+
 #define F_NEEDS_EFFICIENT_UNALIGNED_ACCESS	(1 << 0)
 #define F_LOAD_WITH_STRICT_ALIGNMENT		(1 << 1)
 
@@ -79,6 +85,19 @@ struct bpf_test {
 	const char *descr;
 	struct bpf_insn	insns[MAX_INSNS];
 	struct bpf_insn	*fill_insns;
+	/* If specified, the test engine looks for this sequence of
+	 * instructions in the BPF program after loading, which allows
+	 * testing rewrites applied by the verifier.  Use the values
+	 * INSN_OFF_MASK and INSN_IMM_MASK to mask `off` and `imm`
+	 * fields if their content does not matter.  The test case
+	 * fails if the specified instructions are not found.
+	 *
+	 * The sequence can be split into sub-sequences by adding a
+	 * SKIP_INSNS instruction at the end of each sub-sequence. In
+	 * that case the sub-sequences are searched for one after another.
+	 */
+	struct bpf_insn expected_insns[MAX_EXPECTED_INSNS];
+	struct bpf_insn unexpected_insns[MAX_UNEXPECTED_INSNS];
 	int fixup_map_hash_8b[MAX_FIXUPS];
 	int fixup_map_hash_48b[MAX_FIXUPS];
 	int fixup_map_hash_16b[MAX_FIXUPS];
@@ -1126,6 +1145,212 @@ static bool cmp_str_seq(const char *log, const char *exp)
 	return true;
 }
 
+static __u32 roundup_u32(__u32 number, __u32 divisor)
+{
+	if (number % divisor == 0)
+		return number / divisor;
+	else
+		return number / divisor + 1;
+}
+
+static int get_xlated_program(int fd_prog, struct bpf_insn **buf, int *cnt)
+{
+	struct bpf_prog_info info = {};
+	__u32 info_len = sizeof(info);
+	int err = 0;
+
+	if (bpf_obj_get_info_by_fd(fd_prog, &info, &info_len)) {
+		err = errno;
+		perror("bpf_obj_get_info_by_fd failed");
+		goto out;
+	}
+
+	__u32 xlated_prog_len = info.xlated_prog_len;
+	*cnt = roundup_u32(xlated_prog_len, sizeof(**buf));
+	*buf = calloc(*cnt, sizeof(**buf));
+	if (!*buf) {
+		err = -ENOMEM;
+		perror("can't allocate xlated program buffer");
+		goto out;
+	}
+
+	bzero(&info, sizeof(info));
+	info.xlated_prog_len = xlated_prog_len;
+	info.xlated_prog_insns = (__u64)*buf;
+
+	if (bpf_obj_get_info_by_fd(fd_prog, &info, &info_len)) {
+		err = errno;
+		perror("second bpf_obj_get_info_by_fd failed");
+		goto out_free_buf;
+	}
+
+	goto out;
+
+ out_free_buf:
+	free(*buf);
+ out:
+	return err;
+}
+
+static bool is_null_insn(struct bpf_insn *insn)
+{
+	struct bpf_insn null_insn = {};
+
+	return memcmp(insn, &null_insn, sizeof(null_insn)) == 0;
+}
+
+static bool is_skip_insn(struct bpf_insn *insn)
+{
+	struct bpf_insn skip_insn = SKIP_INSNS();
+
+	return memcmp(insn, &skip_insn, sizeof(skip_insn)) == 0;
+}
+
+static int null_terminated_insn_len(struct bpf_insn *seq, int max_len)
+{
+	for (int i = 0; i < max_len; ++i) {
+		if (is_null_insn(&seq[i]))
+			return i;
+	}
+	return max_len;
+}
+
+static bool compare_masked_insn(struct bpf_insn *orig, struct bpf_insn *masked)
+{
+	struct bpf_insn orig_masked;
+
+	memcpy(&orig_masked, orig, sizeof(orig_masked));
+	if (masked->imm == INSN_IMM_MASK)
+		orig_masked.imm = INSN_IMM_MASK;
+	if (masked->off == INSN_OFF_MASK)
+		orig_masked.off = INSN_OFF_MASK;
+
+	return memcmp(&orig_masked, masked, sizeof(orig_masked)) == 0;
+}
+
+static int find_insn_subseq(struct bpf_insn *seq, struct bpf_insn *subseq,
+			    int seq_len, int subseq_len)
+{
+	if (subseq_len > seq_len)
+		return -1;
+
+	for (int i = 0; i < seq_len - subseq_len + 1; ++i) {
+		bool found = true;
+
+		for (int j = 0; j < subseq_len; ++j) {
+			if (!compare_masked_insn(&seq[i + j], &subseq[j])) {
+				found = false;
+				break;
+			}
+		}
+		if (found)
+			return i;
+	}
+
+	return -1;
+}
+
+static int find_skip_insn_marker(struct bpf_insn *seq, int len)
+{
+	for (int i = 0; i < len; ++i)
+		if (is_skip_insn(&seq[i]))
+			return i;
+
+	return -1;
+}
+
+/* Return true if all sub-sequences in `subseqs` can be found in
+ * `seq` one after another. Sub-sequences are separated by a single
+ * SKIP_INSNS instruction.
+ */
+static bool find_all_insn_subseqs(struct bpf_insn *seq, struct bpf_insn *subseqs,
+				  int seq_len, int max_subseqs_len)
+{
+	int subseqs_len = null_terminated_insn_len(subseqs, max_subseqs_len);
+
+	while (subseqs_len > 0) {
+		int skip_idx = find_skip_insn_marker(subseqs, subseqs_len);
+		int cur_subseq_len = skip_idx < 0 ? subseqs_len : skip_idx;
+		int subseq_idx = find_insn_subseq(seq, subseqs,
+						  seq_len, cur_subseq_len);
+
+		if (subseq_idx < 0)
+			return false;
+		seq += subseq_idx + cur_subseq_len;
+		seq_len -= subseq_idx + cur_subseq_len;
+		subseqs += cur_subseq_len + 1;
+		subseqs_len -= cur_subseq_len + 1;
+	}
+
+	return true;
+}
+
+static void print_insn(struct bpf_insn *buf, int cnt)
+{
+	printf("  addr  op d s off  imm\n");
+	for (int i = 0; i < cnt; ++i) {
+		struct bpf_insn *insn = &buf[i];
+
+		if (is_null_insn(insn))
+			break;
+
+		if (is_skip_insn(insn))
+			printf("  ...\n");
+		else
+			printf("  %04x: %02x %1x %x %04hx %08x\n",
+			       i, insn->code, insn->dst_reg,
+			       insn->src_reg, insn->off, insn->imm);
+	}
+}
+
+static bool check_xlated_program(struct bpf_test *test, int fd_prog)
+{
+	struct bpf_insn *buf;
+	int cnt;
+	bool result = true;
+	bool check_expected = !is_null_insn(test->expected_insns);
+	bool check_unexpected = !is_null_insn(test->unexpected_insns);
+
+	if (!check_expected && !check_unexpected)
+		goto out;
+
+	if (get_xlated_program(fd_prog, &buf, &cnt)) {
+		printf("FAIL: can't get xlated program\n");
+		result = false;
+		goto out;
+	}
+
+	if (check_expected &&
+	    !find_all_insn_subseqs(buf, test->expected_insns,
+				   cnt, MAX_EXPECTED_INSNS)) {
+		printf("FAIL: can't find expected subsequence of instructions\n");
+		result = false;
+		if (verbose) {
+			printf("Program:\n");
+			print_insn(buf, cnt);
+			printf("Expected subsequence:\n");
+			print_insn(test->expected_insns, MAX_EXPECTED_INSNS);
+		}
+	}
+
+	if (check_unexpected &&
+	    find_all_insn_subseqs(buf, test->unexpected_insns,
+				  cnt, MAX_UNEXPECTED_INSNS)) {
+		printf("FAIL: found unexpected subsequence of instructions\n");
+		result = false;
+		if (verbose) {
+			printf("Program:\n");
+			print_insn(buf, cnt);
+			printf("Un-expected subsequence:\n");
+			print_insn(test->unexpected_insns, MAX_UNEXPECTED_INSNS);
+		}
+	}
+
+	free(buf);
+ out:
+	return result;
+}
+
 static void do_test_single(struct bpf_test *test, bool unpriv,
 			   int *passes, int *errors)
 {
@@ -1262,6 +1487,9 @@ static void do_test_single(struct bpf_test *test, bool unpriv,
 	if (verbose)
 		printf(", verifier log:\n%s", bpf_vlog);
 
+	if (!check_xlated_program(test, fd_prog))
+		goto fail_log;
+
 	run_errs = 0;
 	run_successes = 0;
 	if (!alignment_prevented_execution && fd_prog >= 0 && test->runs >= 0) {
-- 
2.25.1



* [PATCH bpf-next v2 2/3] selftests/bpf: allow BTF specs and func infos in test_verifier tests
  2022-05-29 22:36 [PATCH bpf-next v2 0/3] bpf_loop inlining Eduard Zingerman
  2022-05-29 22:36 ` [PATCH bpf-next v2 1/3] selftests/bpf: specify expected instructions in test_verifier tests Eduard Zingerman
@ 2022-05-29 22:36 ` Eduard Zingerman
  2022-05-31 20:52   ` Song Liu
  2022-05-29 22:36 ` [PATCH bpf-next v2 3/3] bpf: Inline calls to bpf_loop when callback is known Eduard Zingerman
  2 siblings, 1 reply; 9+ messages in thread
From: Eduard Zingerman @ 2022-05-29 22:36 UTC (permalink / raw)
  To: bpf, ast, andrii, daniel, kernel-team; +Cc: eddyz87

The BTF and func_info specification for test_verifier tests follows
the same notation as in prog_tests/btf.c tests. E.g.:

  ...
  .func_info = { { 0, 6 }, { 8, 7 } },
  .func_info_cnt = 2,
  .btf_strings = "\0int\0",
  .btf_types = {
    BTF_TYPE_INT_ENC(1, BTF_INT_SIGNED, 0, 32, 4),
    BTF_PTR_ENC(1),
  },
  ...

The BTF specification is loaded only when present in the test case.
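
For a fuller, self-contained illustration, a minimal spec could look
as follows (a sketch, not part of the patch; names and string offsets
are illustrative, the macros come from test_btf.h):

  /* String table: offset 1 is "int", offset 5 is "main". */
  .btf_strings = "\0int\0main\0",
  .btf_types = {
    /* 1: int           */ BTF_TYPE_INT_ENC(1, BTF_INT_SIGNED, 0, 32, 4),
    /* 2: int (*)(void) */ BTF_FUNC_PROTO_ENC(1, 0),
    /* 3: main          */ BTF_FUNC_ENC(5, 2),
    BTF_END_RAW
  },
  /* Instruction #0 starts a function described by type #3 ("main"). */
  .func_info = { { 0, 3 } },
  .func_info_cnt = 1,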

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 tools/testing/selftests/bpf/prog_tests/btf.c |   1 -
 tools/testing/selftests/bpf/test_btf.h       |   2 +
 tools/testing/selftests/bpf/test_verifier.c  | 110 ++++++++++++++++---
 3 files changed, 95 insertions(+), 18 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/btf.c b/tools/testing/selftests/bpf/prog_tests/btf.c
index ba5bde53d418..bebc98bad615 100644
--- a/tools/testing/selftests/bpf/prog_tests/btf.c
+++ b/tools/testing/selftests/bpf/prog_tests/btf.c
@@ -34,7 +34,6 @@ static bool always_log;
 #undef CHECK
 #define CHECK(condition, format...) _CHECK(condition, "check", duration, format)
 
-#define BTF_END_RAW 0xdeadbeef
 #define NAME_TBD 0xdeadb33f
 
 #define NAME_NTH(N) (0xfffe0000 | N)
diff --git a/tools/testing/selftests/bpf/test_btf.h b/tools/testing/selftests/bpf/test_btf.h
index 128989bed8b7..044663c45a57 100644
--- a/tools/testing/selftests/bpf/test_btf.h
+++ b/tools/testing/selftests/bpf/test_btf.h
@@ -4,6 +4,8 @@
 #ifndef _TEST_BTF_H
 #define _TEST_BTF_H
 
+#define BTF_END_RAW 0xdeadbeef
+
 #define BTF_INFO_ENC(kind, kind_flag, vlen)			\
 	((!!(kind_flag) << 31) | ((kind) << 24) | ((vlen) & BTF_MAX_VLEN))
 
diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
index 13468221d227..b6dfa633bffa 100644
--- a/tools/testing/selftests/bpf/test_verifier.c
+++ b/tools/testing/selftests/bpf/test_verifier.c
@@ -59,11 +59,17 @@
 #define MAX_TEST_RUNS	8
 #define POINTER_VALUE	0xcafe4all
 #define TEST_DATA_LEN	64
+#define MAX_FUNC_INFOS	8
+#define MAX_BTF_STRINGS	256
+#define MAX_BTF_TYPES	256
 
 #define INSN_OFF_MASK	((s16)0xFFFF)
 #define INSN_IMM_MASK	((s32)0xFFFFFFFF)
 #define SKIP_INSNS()	BPF_RAW_INSN(0xde, 0xa, 0xd, 0xbeef, 0xdeadbeef)
 
+#define DEFAULT_LIBBPF_LOG_LEVEL	4
+#define VERBOSE_LIBBPF_LOG_LEVEL	1
+
 #define F_NEEDS_EFFICIENT_UNALIGNED_ACCESS	(1 << 0)
 #define F_LOAD_WITH_STRICT_ALIGNMENT		(1 << 1)
 
@@ -154,6 +160,14 @@ struct bpf_test {
 	};
 	enum bpf_attach_type expected_attach_type;
 	const char *kfunc;
+	struct bpf_func_info func_info[MAX_FUNC_INFOS];
+	int func_info_cnt;
+	char btf_strings[MAX_BTF_STRINGS];
+	/* A set of BTF types to load when specified,
+	 * use macro definitions from test_btf.h,
+	 * must end with BTF_END_RAW
+	 */
+	__u32 btf_types[MAX_BTF_TYPES];
 };
 
 /* Note we want this to be 64 bit aligned so that the end of our array is
@@ -683,34 +697,67 @@ static __u32 btf_raw_types[] = {
 	BTF_MEMBER_ENC(71, 13, 128), /* struct prog_test_member __kptr_ref *ptr; */
 };
 
-static int load_btf(void)
+static char bpf_vlog[UINT_MAX >> 8];
+
+static int load_btf_spec(__u32 *types, int types_len,
+			 const char *strings, int strings_len)
 {
 	struct btf_header hdr = {
 		.magic = BTF_MAGIC,
 		.version = BTF_VERSION,
 		.hdr_len = sizeof(struct btf_header),
-		.type_len = sizeof(btf_raw_types),
-		.str_off = sizeof(btf_raw_types),
-		.str_len = sizeof(btf_str_sec),
+		.type_len = types_len,
+		.str_off = types_len,
+		.str_len = strings_len,
 	};
 	void *ptr, *raw_btf;
 	int btf_fd;
 
-	ptr = raw_btf = malloc(sizeof(hdr) + sizeof(btf_raw_types) +
-			       sizeof(btf_str_sec));
+	raw_btf = malloc(sizeof(hdr) + types_len + strings_len);
 
+	ptr = raw_btf;
 	memcpy(ptr, &hdr, sizeof(hdr));
 	ptr += sizeof(hdr);
-	memcpy(ptr, btf_raw_types, hdr.type_len);
+	memcpy(ptr, types, hdr.type_len);
 	ptr += hdr.type_len;
-	memcpy(ptr, btf_str_sec, hdr.str_len);
+	memcpy(ptr, strings, hdr.str_len);
 	ptr += hdr.str_len;
 
-	btf_fd = bpf_btf_load(raw_btf, ptr - raw_btf, NULL);
-	free(raw_btf);
+	LIBBPF_OPTS(bpf_btf_load_opts, opts,
+		    .log_buf = bpf_vlog,
+		    .log_size = sizeof(bpf_vlog),
+		    .log_level = (verbose
+				  ? VERBOSE_LIBBPF_LOG_LEVEL
+				  : DEFAULT_LIBBPF_LOG_LEVEL),
+	);
+
+	btf_fd = bpf_btf_load(raw_btf, ptr - raw_btf, &opts);
 	if (btf_fd < 0)
-		return -1;
-	return btf_fd;
+		printf("Failed to load BTF spec: '%s'\n", strerror(errno));
+
+	free(raw_btf);
+
+	return btf_fd < 0 ? -1 : btf_fd;
+}
+
+static int load_btf(void)
+{
+	return load_btf_spec(btf_raw_types, sizeof(btf_raw_types),
+			     btf_str_sec, sizeof(btf_str_sec));
+}
+
+static int load_btf_for_test(struct bpf_test *test)
+{
+	int types_num = 0;
+
+	while (types_num < MAX_BTF_TYPES &&
+	       test->btf_types[types_num] != BTF_END_RAW)
+		++types_num;
+
+	int types_len = types_num * sizeof(test->btf_types[0]);
+
+	return load_btf_spec(test->btf_types, types_len,
+			     test->btf_strings, sizeof(test->btf_strings));
 }
 
 static int create_map_spin_lock(void)
@@ -789,8 +836,6 @@ static int create_map_kptr(void)
 	return fd;
 }
 
-static char bpf_vlog[UINT_MAX >> 8];
-
 static void do_test_fixup(struct bpf_test *test, enum bpf_prog_type prog_type,
 			  struct bpf_insn *prog, int *map_fds)
 {
@@ -1351,10 +1396,23 @@ static bool check_xlated_program(struct bpf_test *test, int fd_prog)
 	return result;
 }
 
+static bool string_ends_with_nl(char *str, int str_len)
+{
+	bool ends_with_nl = false;
+
+	for (int i = 0; i < str_len; ++i) {
+		if (str[i] == 0)
+			break;
+		ends_with_nl = str[i] == '\n';
+	}
+
+	return ends_with_nl;
+}
+
 static void do_test_single(struct bpf_test *test, bool unpriv,
 			   int *passes, int *errors)
 {
-	int fd_prog, expected_ret, alignment_prevented_execution;
+	int fd_prog, btf_fd, expected_ret, alignment_prevented_execution;
 	int prog_len, prog_type = test->prog_type;
 	struct bpf_insn *prog = test->insns;
 	LIBBPF_OPTS(bpf_prog_load_opts, opts);
@@ -1366,8 +1424,10 @@ static void do_test_single(struct bpf_test *test, bool unpriv,
 	__u32 pflags;
 	int i, err;
 
+	fd_prog = -1;
 	for (i = 0; i < MAX_NR_MAPS; i++)
 		map_fds[i] = -1;
+	btf_fd = -1;
 
 	if (!prog_type)
 		prog_type = BPF_PROG_TYPE_SOCKET_FILTER;
@@ -1400,11 +1460,11 @@ static void do_test_single(struct bpf_test *test, bool unpriv,
 
 	opts.expected_attach_type = test->expected_attach_type;
 	if (verbose)
-		opts.log_level = 1;
+		opts.log_level = VERBOSE_LIBBPF_LOG_LEVEL;
 	else if (expected_ret == VERBOSE_ACCEPT)
 		opts.log_level = 2;
 	else
-		opts.log_level = 4;
+		opts.log_level = DEFAULT_LIBBPF_LOG_LEVEL;
 	opts.prog_flags = pflags;
 
 	if (prog_type == BPF_PROG_TYPE_TRACING && test->kfunc) {
@@ -1422,6 +1482,19 @@ static void do_test_single(struct bpf_test *test, bool unpriv,
 		opts.attach_btf_id = attach_btf_id;
 	}
 
+	if (test->btf_types[0] != 0) {
+		btf_fd = load_btf_for_test(test);
+		if (btf_fd < 0)
+			goto fail_log;
+		opts.prog_btf_fd = btf_fd;
+	}
+
+	if (test->func_info_cnt != 0) {
+		opts.func_info = test->func_info;
+		opts.func_info_cnt = test->func_info_cnt;
+		opts.func_info_rec_size = sizeof(test->func_info[0]);
+	}
+
 	opts.log_buf = bpf_vlog;
 	opts.log_size = sizeof(bpf_vlog);
 	fd_prog = bpf_prog_load(prog_type, NULL, "GPL", prog, prog_len, &opts);
@@ -1533,6 +1606,7 @@ static void do_test_single(struct bpf_test *test, bool unpriv,
 	if (test->fill_insns)
 		free(test->fill_insns);
 	close(fd_prog);
+	close(btf_fd);
 	for (i = 0; i < MAX_NR_MAPS; i++)
 		close(map_fds[i]);
 	sched_yield();
@@ -1540,6 +1614,8 @@ static void do_test_single(struct bpf_test *test, bool unpriv,
 fail_log:
 	(*errors)++;
 	printf("%s", bpf_vlog);
+	if (!string_ends_with_nl(bpf_vlog, sizeof(bpf_vlog)))
+		printf("\n");
 	goto close_fds;
 }
 
-- 
2.25.1



* [PATCH bpf-next v2 3/3] bpf: Inline calls to bpf_loop when callback is known
  2022-05-29 22:36 [PATCH bpf-next v2 0/3] bpf_loop inlining Eduard Zingerman
  2022-05-29 22:36 ` [PATCH bpf-next v2 1/3] selftests/bpf: specify expected instructions in test_verifier tests Eduard Zingerman
  2022-05-29 22:36 ` [PATCH bpf-next v2 2/3] selftests/bpf: allow BTF specs and func infos " Eduard Zingerman
@ 2022-05-29 22:36 ` Eduard Zingerman
  2022-05-31 22:10   ` Song Liu
  2 siblings, 1 reply; 9+ messages in thread
From: Eduard Zingerman @ 2022-05-29 22:36 UTC (permalink / raw)
  To: bpf, ast, andrii, daniel, kernel-team; +Cc: eddyz87

Calls to `bpf_loop` are replaced with direct loops to avoid
indirection. E.g. the following:

  bpf_loop(10, foo, NULL, 0);

is replaced by the equivalent of the following:

  for (int i = 0; i < 10; ++i)
    foo(i, NULL);

This transformation can be applied when (see the sketch below):
- the callback is known and does not change during program execution;
- the flags passed to `bpf_loop` are always zero.
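
To illustrate, in the following sketch (a hypothetical program, not
part of the patch) only the first call is eligible for inlining:

  static int cb(u32 idx, void *ctx)
  {
  	return 0;
  }

  ...
  bpf_loop(10, cb, NULL, 0); /* constant callback, zero flags: inlined */
  bpf_loop(10, cb, NULL, 1); /* non-zero flags: helper call is kept */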

The inlining logic works as follows:

- During execution simulation, the function `update_loop_inline_state`
  tracks the following information for each `bpf_loop` call
  instruction:
  - is the callback known and constant?
  - are the flags constant and zero?
- The function `adjust_stack_depth_for_loop_inlining` increases the
  stack depth for functions where `bpf_loop` calls can be inlined.
  This is needed to spill registers R6, R7 and R8, which are used as
  the loop counter, the maximal loop bound and the callback context
  parameter;
- The function `inline_bpf_loop`, called from `do_misc_fixups`,
  replaces `bpf_loop` calls fit for inlining with the corresponding
  loop instructions. A C-level sketch of the generated code is given
  after this list.
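
At the C level, the code emitted by `inline_bpf_loop` is roughly
equivalent to the following sketch (the actual patch emits BPF
instructions and spills/restores R6, R7 and R8 around it):

  if (nr_loops > BPF_MAX_LOOPS)
  	return -E2BIG;
  max = nr_loops;     /* kept in R6 */
  cnt = 0;            /* kept in R7 */
  ctx = callback_ctx; /* kept in R8 */
  while (cnt < max) {
  	ret = callback(cnt, ctx);
  	cnt++;
  	if (ret)
  		break;
  }
  return cnt; /* R0: the number of iterations performed */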

Measurements using `benchs/run_bench_bpf_loop.sh` inside QEMU / KVM on
an i7-4710HQ CPU show a drop in latency from 14 ns/op to 2 ns/op.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 include/linux/bpf_verifier.h                  |  16 ++
 kernel/bpf/bpf_iter.c                         |   9 +-
 kernel/bpf/verifier.c                         | 184 ++++++++++++-
 .../selftests/bpf/prog_tests/bpf_loop.c       |  62 +++++
 tools/testing/selftests/bpf/progs/bpf_loop.c  | 122 +++++++++
 .../selftests/bpf/verifier/bpf_loop_inline.c  | 244 ++++++++++++++++++
 6 files changed, 628 insertions(+), 9 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/verifier/bpf_loop_inline.c

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index e8439f6cbe57..80279616a76b 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -20,6 +20,8 @@
 #define BPF_MAX_VAR_SIZ	(1 << 29)
 /* size of type_str_buf in bpf_verifier. */
 #define TYPE_STR_BUF_LEN 64
+/* Maximum number of loops for bpf_loop */
+#define BPF_MAX_LOOPS	BIT(23)
 
 /* Liveness marks, used for registers and spilled-regs (in stack slots).
  * Read marks propagate upwards until they find a write mark; they record that
@@ -344,6 +346,16 @@ struct bpf_verifier_state_list {
 	int miss_cnt, hit_cnt;
 };
 
+struct bpf_loop_inline_state {
+	u32 initialized:1; /* set to true upon first entry */
+	u32 callback_is_constant:1; /* true if callback function
+				     * is the same at each call
+				     */
+	u32 flags_is_zero:1; /* true if flags register is zero at each call */
+	u32 callback_subprogno; /* valid when callback_is_constant == 1 */
+	s32 stack_base; /* stack offset for loop vars */
+};
+
 /* Possible states for alu_state member. */
 #define BPF_ALU_SANITIZE_SRC		(1U << 0)
 #define BPF_ALU_SANITIZE_DST		(1U << 1)
@@ -380,6 +392,10 @@ struct bpf_insn_aux_data {
 	bool sanitize_stack_spill; /* subject to Spectre v4 sanitation */
 	bool zext_dst; /* this insn zero extends dst reg */
 	u8 alu_state; /* used in combination with alu_limit */
+	/* if the instruction is a call to bpf_loop, this field tracks the
+	 * state of the relevant registers to decide whether it can be inlined
+	 */
+	struct bpf_loop_inline_state loop_inline_state;
 
 	/* below fields are initialized once */
 	unsigned int orig_idx; /* original instruction index */
diff --git a/kernel/bpf/bpf_iter.c b/kernel/bpf/bpf_iter.c
index d5d96ceca105..cdb898fce118 100644
--- a/kernel/bpf/bpf_iter.c
+++ b/kernel/bpf/bpf_iter.c
@@ -6,6 +6,7 @@
 #include <linux/filter.h>
 #include <linux/bpf.h>
 #include <linux/rcupdate_trace.h>
+#include <linux/bpf_verifier.h>
 
 struct bpf_iter_target_info {
 	struct list_head list;
@@ -723,9 +724,6 @@ const struct bpf_func_proto bpf_for_each_map_elem_proto = {
 	.arg4_type	= ARG_ANYTHING,
 };
 
-/* maximum number of loops */
-#define MAX_LOOPS	BIT(23)
-
 BPF_CALL_4(bpf_loop, u32, nr_loops, void *, callback_fn, void *, callback_ctx,
 	   u64, flags)
 {
@@ -733,9 +731,12 @@ BPF_CALL_4(bpf_loop, u32, nr_loops, void *, callback_fn, void *, callback_ctx,
 	u64 ret;
 	u32 i;
 
+	/* note: these safety checks are also performed when bpf_loop is inlined;
+	 * take care to keep this code in sync with the inlined version
+	 */
 	if (flags)
 		return -EINVAL;
-	if (nr_loops > MAX_LOOPS)
+	if (nr_loops > BPF_MAX_LOOPS)
 		return -E2BIG;
 
 	for (i = 0; i < nr_loops; i++) {
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index aedac2ac02b9..9bae1f301f77 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -7103,6 +7103,78 @@ static int check_get_func_ip(struct bpf_verifier_env *env)
 	return -ENOTSUPP;
 }
 
+static struct bpf_insn_aux_data *cur_aux(struct bpf_verifier_env *env)
+{
+	return &env->insn_aux_data[env->insn_idx];
+}
+
+static bool fit_for_bpf_loop_inline(struct bpf_insn_aux_data *insn_aux)
+{
+	return insn_aux->loop_inline_state.initialized &&
+		insn_aux->loop_inline_state.flags_is_zero &&
+		insn_aux->loop_inline_state.callback_is_constant;
+}
+
+/* For each sub-program in the program (including main), check
+ * insn_aux_data to see if there are bpf_loop calls that require
+ * inlining. If such calls are found, the subprog stack_depth is
+ * increased by the size of 3 registers. The reserved space is used
+ * by do_misc_fixups to spill the values of R6, R7 and R8, so that
+ * these registers can be used for loop iteration.
+ */
+static void adjust_stack_depth_for_loop_inlining(struct bpf_verifier_env *env)
+{
+	int i, subprog_end, cur_subprog = 0;
+	struct bpf_subprog_info *subprogs = env->subprog_info;
+	int insn_cnt = env->prog->len;
+	bool proc_updated = false;
+	s32 stack_base;
+
+	subprog_end = (env->subprog_cnt > 1
+		       ? subprogs[cur_subprog + 1].start
+		       : insn_cnt);
+	for (i = 0; i < insn_cnt; i++) {
+		struct bpf_insn_aux_data *aux = &env->insn_aux_data[i];
+
+		if (fit_for_bpf_loop_inline(aux)) {
+			if (!proc_updated) {
+				proc_updated = true;
+				subprogs[cur_subprog].stack_depth += BPF_REG_SIZE * 3;
+				stack_base = -subprogs[cur_subprog].stack_depth;
+			}
+			aux->loop_inline_state.stack_base = stack_base;
+		}
+		if (i == subprog_end - 1) {
+			proc_updated = false;
+			cur_subprog++;
+			if (cur_subprog < env->subprog_cnt)
+				subprog_end = subprogs[cur_subprog + 1].start;
+		}
+	}
+
+	env->prog->aux->stack_depth = env->subprog_info[0].stack_depth;
+}
+
+static void update_loop_inline_state(struct bpf_verifier_env *env, u32 subprogno)
+{
+	struct bpf_loop_inline_state *state = &cur_aux(env)->loop_inline_state;
+	struct bpf_reg_state *regs = cur_regs(env);
+	struct bpf_reg_state *flags_reg = &regs[BPF_REG_4];
+
+	int flags_is_zero =
+		register_is_const(flags_reg) && flags_reg->var_off.value == 0;
+
+	if (state->initialized) {
+		state->flags_is_zero &= flags_is_zero;
+		state->callback_is_constant &= state->callback_subprogno == subprogno;
+	} else {
+		state->initialized = 1;
+		state->callback_is_constant = 1;
+		state->flags_is_zero = flags_is_zero;
+		state->callback_subprogno = subprogno;
+	}
+}
+
 static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 			     int *insn_idx_p)
 {
@@ -7255,6 +7327,7 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
 		err = check_bpf_snprintf_call(env, regs);
 		break;
 	case BPF_FUNC_loop:
+		update_loop_inline_state(env, meta.subprogno);
 		err = __check_func_call(env, insn, insn_idx_p, meta.subprogno,
 					set_loop_callback_state);
 		break;
@@ -7661,11 +7734,6 @@ static bool check_reg_sane_offset(struct bpf_verifier_env *env,
 	return true;
 }
 
-static struct bpf_insn_aux_data *cur_aux(struct bpf_verifier_env *env)
-{
-	return &env->insn_aux_data[env->insn_idx];
-}
-
 enum {
 	REASON_BOUNDS	= -1,
 	REASON_TYPE	= -2,
@@ -12920,6 +12988,22 @@ static struct bpf_prog *bpf_patch_insn_data(struct bpf_verifier_env *env, u32 of
 	return new_prog;
 }
 
+static void adjust_loop_inline_subprogno(struct bpf_verifier_env *env,
+					 u32 first_removed,
+					 u32 first_remaining)
+{
+	int delta = first_remaining - first_removed;
+
+	for (int i = 0; i < env->prog->len; ++i) {
+		struct bpf_loop_inline_state *state =
+			&env->insn_aux_data[i].loop_inline_state;
+
+		if (state->initialized &&
+		    state->callback_subprogno >= first_remaining)
+			state->callback_subprogno -= delta;
+	}
+}
+
 static int adjust_subprog_starts_after_remove(struct bpf_verifier_env *env,
 					      u32 off, u32 cnt)
 {
@@ -12963,6 +13047,8 @@ static int adjust_subprog_starts_after_remove(struct bpf_verifier_env *env,
 			 * in adjust_btf_func() - no need to adjust
 			 */
 		}
+
+		adjust_loop_inline_subprogno(env, i, j);
 	} else {
 		/* convert i from "first prog to remove" to "first to adjust" */
 		if (env->subprog_info[i].start == off)
@@ -13773,6 +13859,79 @@ static int fixup_kfunc_call(struct bpf_verifier_env *env,
 	return 0;
 }
 
+static struct bpf_prog *inline_bpf_loop(struct bpf_verifier_env *env,
+					int position, u32 *cnt)
+{
+	struct bpf_insn_aux_data *aux = &env->insn_aux_data[position];
+	s32 stack_base = aux->loop_inline_state.stack_base;
+	s32 r6_offset = stack_base + 0 * BPF_REG_SIZE;
+	s32 r7_offset = stack_base + 1 * BPF_REG_SIZE;
+	s32 r8_offset = stack_base + 2 * BPF_REG_SIZE;
+	int reg_loop_max = BPF_REG_6;
+	int reg_loop_cnt = BPF_REG_7;
+	int reg_loop_ctx = BPF_REG_8;
+
+	struct bpf_prog *new_prog;
+	u32 callback_subprogno = aux->loop_inline_state.callback_subprogno;
+	u32 callback_start;
+	u32 call_insn_offset;
+	s32 callback_offset;
+
+	struct bpf_insn insn_buf[] = {
+		/* Return an error and jump to the end of the patch if
+		 * the expected number of iterations is too big.  This
+		 * repeats the check done in the bpf_loop helper function;
+		 * take care to keep this code in sync with it.
+		 */
+		BPF_JMP_IMM(BPF_JLE, BPF_REG_1, BPF_MAX_LOOPS, 2),
+		BPF_MOV32_IMM(BPF_REG_0, -E2BIG),
+		BPF_JMP_IMM(BPF_JA, 0, 0, 16),
+		/* spill R6, R7, R8 to use these as loop vars */
+		BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_6, r6_offset),
+		BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_7, r7_offset),
+		BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_8, r8_offset),
+		/* initialize loop vars */
+		BPF_MOV64_REG(reg_loop_max, BPF_REG_1),
+		BPF_MOV32_IMM(reg_loop_cnt, 0),
+		BPF_MOV64_REG(reg_loop_ctx, BPF_REG_3),
+		/* loop header,
+		 * if reg_loop_cnt >= reg_loop_max skip the loop body
+		 */
+		BPF_JMP_REG(BPF_JGE, reg_loop_cnt, reg_loop_max, 5),
+		/* callback call,
+		 * correct callback offset would be set after patching
+		 */
+		BPF_MOV64_REG(BPF_REG_1, reg_loop_cnt),
+		BPF_MOV64_REG(BPF_REG_2, reg_loop_ctx),
+		BPF_CALL_REL(0),
+		/* increment loop counter */
+		BPF_ALU64_IMM(BPF_ADD, reg_loop_cnt, 1),
+		/* jump to loop header if callback returned 0 */
+		BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, -6),
+		/* return value of bpf_loop,
+		 * set R0 to the number of iterations
+		 */
+		BPF_MOV64_REG(BPF_REG_0, reg_loop_cnt),
+		/* restore original values of R6, R7, R8 */
+		BPF_LDX_MEM(BPF_DW, BPF_REG_6, BPF_REG_10, r6_offset),
+		BPF_LDX_MEM(BPF_DW, BPF_REG_7, BPF_REG_10, r7_offset),
+		BPF_LDX_MEM(BPF_DW, BPF_REG_8, BPF_REG_10, r8_offset),
+	};
+
+	*cnt = ARRAY_SIZE(insn_buf);
+	new_prog = bpf_patch_insn_data(env, position, insn_buf, *cnt);
+	if (!new_prog)
+		return new_prog;
+
+	/* callback start is known only after patching */
+	callback_start = env->subprog_info[callback_subprogno].start;
+	call_insn_offset = position + 12;
+	callback_offset = callback_start - call_insn_offset - 1;
+	env->prog->insnsi[call_insn_offset].imm = callback_offset;
+
+	return new_prog;
+}
+
 /* Do various post-verification rewrites in a single program pass.
  * These rewrites simplify JIT and interpreter implementations.
  */
@@ -14258,6 +14417,18 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
 			continue;
 		}
 
+		if (insn->imm == BPF_FUNC_loop &&
+		    fit_for_bpf_loop_inline(&env->insn_aux_data[i + delta])) {
+			new_prog = inline_bpf_loop(env, i + delta, &cnt);
+			if (!new_prog)
+				return -ENOMEM;
+
+			delta    += cnt - 1;
+			env->prog = prog = new_prog;
+			insn      = new_prog->insnsi + i + delta;
+			continue;
+		}
+
 patch_call_imm:
 		fn = env->ops->get_func_proto(insn->imm, env->prog);
 		/* all functions that have prototype and verifier allowed
@@ -15030,6 +15201,9 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr)
 	if (ret == 0)
 		ret = check_max_stack_depth(env);
 
+	if (ret == 0)
+		adjust_stack_depth_for_loop_inlining(env);
+
 	/* instruction rewrites happen after this point */
 	if (is_priv) {
 		if (ret == 0)
diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_loop.c b/tools/testing/selftests/bpf/prog_tests/bpf_loop.c
index 380d7a2072e3..1caa495be48c 100644
--- a/tools/testing/selftests/bpf/prog_tests/bpf_loop.c
+++ b/tools/testing/selftests/bpf/prog_tests/bpf_loop.c
@@ -120,6 +120,64 @@ static void check_nested_calls(struct bpf_loop *skel)
 	bpf_link__destroy(link);
 }
 
+static void check_non_constant_callback(struct bpf_loop *skel)
+{
+	struct bpf_link *link =
+		bpf_program__attach(skel->progs.prog_non_constant_callback);
+
+	if (!ASSERT_OK_PTR(link, "link"))
+		return;
+
+	skel->bss->callback_selector = 0x0F;
+	usleep(1);
+	ASSERT_EQ(skel->bss->g_output, 0x0F, "g_output #1");
+
+	skel->bss->callback_selector = 0xF0;
+	usleep(1);
+	ASSERT_EQ(skel->bss->g_output, 0xF0, "g_output #2");
+
+	bpf_link__destroy(link);
+}
+
+static void check_stack(struct bpf_loop *skel)
+{
+	struct bpf_link *link =
+		bpf_program__attach(skel->progs.stack_check);
+
+	if (!ASSERT_OK_PTR(link, "link"))
+		goto out;
+
+	int map_fd = bpf_map__fd(skel->maps.map1);
+
+	if (!ASSERT_GE(map_fd, 0, "bpf_map__fd"))
+		goto out;
+
+	for (int key = 1; key <= 16; ++key) {
+		int val = key;
+		int err = bpf_map_update_elem(map_fd, &key, &val, BPF_NOEXIST);
+
+		if (!ASSERT_OK(err, "bpf_map_update_elem"))
+			goto out;
+	}
+
+	usleep(1);
+
+	for (int key = 1; key <= 16; ++key) {
+		int val;
+		int err = bpf_map_lookup_elem(map_fd, &key, &val);
+
+		if (!ASSERT_OK(err, "bpf_map_lookup_elem"))
+			goto out;
+		ASSERT_EQ(val, key + 1, "bad value in the map");
+	}
+
+out:
+	if (map_fd >= 0)
+		close(map_fd);
+	if (link)
+		bpf_link__destroy(link);
+}
+
 void test_bpf_loop(void)
 {
 	struct bpf_loop *skel;
@@ -140,6 +198,10 @@ void test_bpf_loop(void)
 		check_invalid_flags(skel);
 	if (test__start_subtest("check_nested_calls"))
 		check_nested_calls(skel);
+	if (test__start_subtest("check_non_constant_callback"))
+		check_non_constant_callback(skel);
+	if (test__start_subtest("check_stack"))
+		check_stack(skel);
 
 	bpf_loop__destroy(skel);
 }
diff --git a/tools/testing/selftests/bpf/progs/bpf_loop.c b/tools/testing/selftests/bpf/progs/bpf_loop.c
index e08565282759..9344428a833b 100644
--- a/tools/testing/selftests/bpf/progs/bpf_loop.c
+++ b/tools/testing/selftests/bpf/progs/bpf_loop.c
@@ -11,11 +11,19 @@ struct callback_ctx {
 	int output;
 };
 
+struct {
+	__uint(type, BPF_MAP_TYPE_HASH);
+	__uint(max_entries, 32);
+	__type(key, int);
+	__type(value, int);
+} map1 SEC(".maps");
+
 /* These should be set by the user program */
 u32 nested_callback_nr_loops;
 u32 stop_index = -1;
 u32 nr_loops;
 int pid;
+int callback_selector;
 
 /* Making these global variables so that the userspace program
  * can verify the output through the skeleton
@@ -111,3 +119,117 @@ int prog_nested_calls(void *ctx)
 
 	return 0;
 }
+
+static int callback_set_f0(int i, void *ctx)
+{
+	g_output = 0xF0;
+	return 0;
+}
+
+static int callback_set_0f(int i, void *ctx)
+{
+	g_output = 0x0F;
+	return 0;
+}
+
+/*
+ * A non-constant callback is a corner case for the bpf_loop inlining logic.
+ */
+SEC("fentry/" SYS_PREFIX "sys_nanosleep")
+int prog_non_constant_callback(void *ctx)
+{
+	struct callback_ctx data = {};
+
+	if (bpf_get_current_pid_tgid() >> 32 != pid)
+		return 0;
+
+	int (*callback)(int i, void *ctx);
+
+	g_output = 0;
+
+	if (callback_selector == 0x0F)
+		callback = callback_set_0f;
+	else
+		callback = callback_set_f0;
+
+	bpf_loop(1, callback, NULL, 0);
+
+	return 0;
+}
+
+static int stack_check_inner_callback(void *ctx)
+{
+	return 0;
+}
+
+static int map1_lookup_elem(int key)
+{
+	int *val = bpf_map_lookup_elem(&map1, &key);
+
+	return val ? *val : -1;
+}
+
+static void map1_update_elem(int key, int val)
+{
+	bpf_map_update_elem(&map1, &key, &val, BPF_ANY);
+}
+
+static int stack_check_outer_callback(void *ctx)
+{
+	int a = map1_lookup_elem(1);
+	int b = map1_lookup_elem(2);
+	int c = map1_lookup_elem(3);
+	int d = map1_lookup_elem(4);
+	int e = map1_lookup_elem(5);
+	int f = map1_lookup_elem(6);
+	int g = map1_lookup_elem(7);
+	int h = map1_lookup_elem(8);
+
+	bpf_loop(1, stack_check_inner_callback, NULL, 0);
+
+	map1_update_elem(1, a + 1);
+	map1_update_elem(2, b + 1);
+	map1_update_elem(3, c + 1);
+	map1_update_elem(4, d + 1);
+	map1_update_elem(5, e + 1);
+	map1_update_elem(6, f + 1);
+	map1_update_elem(7, g + 1);
+	map1_update_elem(8, h + 1);
+
+	return 0;
+}
+
+/* Some of the local variables in stack_check and
+ * stack_check_outer_callback are allocated on the stack by the
+ * compiler. This test verifies that the stack content for these
+ * variables is preserved between calls to bpf_loop (it might break
+ * if loop inlining allocates stack slots incorrectly).
+ */
+SEC("fentry/" SYS_PREFIX "sys_nanosleep")
+int stack_check(void *ctx)
+{
+	if (bpf_get_current_pid_tgid() >> 32 != pid)
+		return 0;
+
+	int a = map1_lookup_elem(9);
+	int b = map1_lookup_elem(10);
+	int c = map1_lookup_elem(11);
+	int d = map1_lookup_elem(12);
+	int e = map1_lookup_elem(13);
+	int f = map1_lookup_elem(14);
+	int g = map1_lookup_elem(15);
+	int h = map1_lookup_elem(16);
+
+	bpf_loop(1, stack_check_outer_callback, NULL, 0);
+
+	map1_update_elem(9,  a + 1);
+	map1_update_elem(10, b + 1);
+	map1_update_elem(11, c + 1);
+	map1_update_elem(12, d + 1);
+	map1_update_elem(13, e + 1);
+	map1_update_elem(14, f + 1);
+	map1_update_elem(15, g + 1);
+	map1_update_elem(16, h + 1);
+
+	return 0;
+}
diff --git a/tools/testing/selftests/bpf/verifier/bpf_loop_inline.c b/tools/testing/selftests/bpf/verifier/bpf_loop_inline.c
new file mode 100644
index 000000000000..d1fbcfef69f2
--- /dev/null
+++ b/tools/testing/selftests/bpf/verifier/bpf_loop_inline.c
@@ -0,0 +1,244 @@
+#define BTF_TYPES \
+	.btf_strings = "\0int\0i\0ctx\0callback\0main\0", \
+	.btf_types = { \
+	/* 1: int   */ BTF_TYPE_INT_ENC(1, BTF_INT_SIGNED, 0, 32, 4), \
+	/* 2: int*  */ BTF_PTR_ENC(1), \
+	/* 3: void* */ BTF_PTR_ENC(0), \
+	/* 4: int __(void*) */ BTF_FUNC_PROTO_ENC(1, 1), \
+		BTF_FUNC_PROTO_ARG_ENC(7, 3), \
+	/* 5: int __(int, int*) */ BTF_FUNC_PROTO_ENC(1, 2), \
+		BTF_FUNC_PROTO_ARG_ENC(5, 1), \
+		BTF_FUNC_PROTO_ARG_ENC(7, 2), \
+	/* 6: main      */ BTF_FUNC_ENC(20, 4), \
+	/* 7: callback  */ BTF_FUNC_ENC(11, 5), \
+	BTF_END_RAW \
+	}
+
+#define MAIN_TYPE	6
+#define CALLBACK_TYPE	7
+
+/* can't use BPF_CALL_REL, jit_subprogs adjusts IMM & OFF
+ * fields for pseudo calls
+ */
+#define PSEUDO_CALL_INSN() \
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, BPF_PSEUDO_CALL, \
+		     INSN_OFF_MASK, INSN_IMM_MASK)
+
+/* can't use BPF_FUNC_loop constant,
+ * do_misc_fixups adjusts the IMM field
+ */
+#define HELPER_CALL_INSN() \
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, INSN_OFF_MASK, INSN_IMM_MASK)
+
+{
+	"inline simple bpf_loop call",
+	.insns = {
+	/* main */
+	/* force verifier state branching to verify logic on first and
+	 * subsequent bpf_loop insn processing steps
+	 */
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_jiffies64),
+	BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 777, 2),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 1),
+	BPF_JMP_IMM(BPF_JA, 0, 0, 1),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 2),
+
+	BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 6),
+	BPF_RAW_INSN(0, 0, 0, 0, 0),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_3, 0),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_4, 0),
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_loop),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 0),
+	BPF_EXIT_INSN(),
+	/* callback */
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 1),
+	BPF_EXIT_INSN(),
+	},
+	.expected_insns = { PSEUDO_CALL_INSN() },
+	.unexpected_insns = { HELPER_CALL_INSN() },
+	.prog_type = BPF_PROG_TYPE_TRACEPOINT,
+	.result = ACCEPT,
+	.runs = 0,
+	.func_info = { { 0, MAIN_TYPE }, { 12, CALLBACK_TYPE } },
+	.func_info_cnt = 2,
+	BTF_TYPES
+},
+{
+	"don't inline bpf_loop call, flags non-zero",
+	.insns = {
+	/* main */
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 1),
+	BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 6),
+	BPF_RAW_INSN(0, 0, 0, 0, 0),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_3, 0),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_4, 1),
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_loop),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 0),
+	BPF_EXIT_INSN(),
+	/* callback */
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 1),
+	BPF_EXIT_INSN(),
+	},
+	.expected_insns = { HELPER_CALL_INSN() },
+	.unexpected_insns = { PSEUDO_CALL_INSN() },
+	.prog_type = BPF_PROG_TYPE_TRACEPOINT,
+	.result = ACCEPT,
+	.runs = 0,
+	.func_info = { { 0, MAIN_TYPE }, { 8, CALLBACK_TYPE } },
+	.func_info_cnt = 2,
+	BTF_TYPES
+},
+{
+	"don't inline bpf_loop call, callback non-constant",
+	.insns = {
+	/* main */
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_jiffies64),
+	BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 777, 4), /* pick a random callback */
+
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 1),
+	BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 10),
+	BPF_RAW_INSN(0, 0, 0, 0, 0),
+	BPF_JMP_IMM(BPF_JA, 0, 0, 3),
+
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 1),
+	BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 8),
+	BPF_RAW_INSN(0, 0, 0, 0, 0),
+
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_3, 0),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_4, 0),
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_loop),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 0),
+	BPF_EXIT_INSN(),
+	/* callback */
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 1),
+	BPF_EXIT_INSN(),
+	/* callback #2 */
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 1),
+	BPF_EXIT_INSN(),
+	},
+	.expected_insns = { HELPER_CALL_INSN() },
+	.unexpected_insns = { PSEUDO_CALL_INSN() },
+	.prog_type = BPF_PROG_TYPE_TRACEPOINT,
+	.result = ACCEPT,
+	.runs = 0,
+	.func_info = {
+		{ 0, MAIN_TYPE },
+		{ 14, CALLBACK_TYPE },
+		{ 16, CALLBACK_TYPE }
+	},
+	.func_info_cnt = 3,
+	BTF_TYPES
+},
+{
+	"bpf_loop_inline and a dead func",
+	.insns = {
+	/* main */
+
+	/* A reference to callback #1 to make verifier count it as a func.
+	 * This reference is overwritten below and callback #1 is dead.
+	 */
+	BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 9),
+	BPF_RAW_INSN(0, 0, 0, 0, 0),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 1),
+	BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 8),
+	BPF_RAW_INSN(0, 0, 0, 0, 0),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_3, 0),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_4, 0),
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_loop),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 0),
+	BPF_EXIT_INSN(),
+	/* callback */
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 1),
+	BPF_EXIT_INSN(),
+	/* callback #2 */
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 1),
+	BPF_EXIT_INSN(),
+	},
+	.expected_insns = { PSEUDO_CALL_INSN() },
+	.unexpected_insns = { HELPER_CALL_INSN() },
+	.prog_type = BPF_PROG_TYPE_TRACEPOINT,
+	.result = ACCEPT,
+	.runs = 0,
+	.func_info = {
+		{ 0, MAIN_TYPE },
+		{ 10, CALLBACK_TYPE },
+		{ 12, CALLBACK_TYPE }
+	},
+	.func_info_cnt = 3,
+	BTF_TYPES
+},
+{
+	"bpf_loop_inline stack locations for loop vars",
+	.insns = {
+	/* main */
+	BPF_ST_MEM(BPF_DW, BPF_REG_10, -16, 0x77),
+	/* bpf_loop call #1 */
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 1),
+	BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 22),
+	BPF_RAW_INSN(0, 0, 0, 0, 0),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_3, 0),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_4, 0),
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_loop),
+	/* bpf_loop call #2 */
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 2),
+	BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 16),
+	BPF_RAW_INSN(0, 0, 0, 0, 0),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_3, 0),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_4, 0),
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_loop),
+	/* call func and exit */
+	BPF_CALL_REL(2),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 0),
+	BPF_EXIT_INSN(),
+	/* func */
+	BPF_ST_MEM(BPF_DW, BPF_REG_10, -32, 0x55),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 2),
+	BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 6),
+	BPF_RAW_INSN(0, 0, 0, 0, 0),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_3, 0),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_4, 0),
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_loop),
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 0),
+	BPF_EXIT_INSN(),
+	/* callback */
+	BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 1),
+	BPF_EXIT_INSN(),
+	},
+	.expected_insns = {
+	BPF_ST_MEM(BPF_DW, BPF_REG_10, -16, 0x77),
+	SKIP_INSNS(),
+	BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_6, -40),
+	BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_7, -32),
+	BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_8, -24),
+	SKIP_INSNS(),
+	/* offsets are the same as in the first call */
+	BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_6, -40),
+	BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_7, -32),
+	BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_8, -24),
+	SKIP_INSNS(),
+	BPF_ST_MEM(BPF_DW, BPF_REG_10, -32, 0x55),
+	SKIP_INSNS(),
+	/* offsets differ from main because of different offset
+	 * in BPF_ST_MEM instruction
+	 */
+	BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_6, -56),
+	BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_7, -48),
+	BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_8, -40),
+	},
+	.unexpected_insns = { HELPER_CALL_INSN() },
+	.prog_type = BPF_PROG_TYPE_TRACEPOINT,
+	.result = ACCEPT,
+	.func_info = {
+		{ 0, MAIN_TYPE },
+		{ 16, MAIN_TYPE },
+		{ 25, CALLBACK_TYPE },
+	},
+	.func_info_cnt = 3,
+	BTF_TYPES
+},
+
+#undef HELPER_CALL_INSN
+#undef PSEUDO_CALL_INSN
+#undef CALLBACK_TYPE
+#undef MAIN_TYPE
+#undef BTF_TYPES
-- 
2.25.1



* Re: [PATCH bpf-next v2 1/3] selftests/bpf: specify expected instructions in test_verifier tests
  2022-05-29 22:36 ` [PATCH bpf-next v2 1/3] selftests/bpf: specify expected instructions in test_verifier tests Eduard Zingerman
@ 2022-05-31 18:26   ` Song Liu
  0 siblings, 0 replies; 9+ messages in thread
From: Song Liu @ 2022-05-31 18:26 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Kernel Team

On Sun, May 29, 2022 at 3:37 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> Allow specifying expected and unexpected instruction sequences in
> test_verifier test cases. The instructions are requested from the
> kernel after BPF program loading, thus making it possible to check
> some of the transformations applied by the BPF verifier.
>
> - the `expected_insns` field specifies a sequence of instructions
>   expected to be found in the program;
> - the `unexpected_insns` field specifies a sequence of instructions
>   that are not expected to be found in the program;
> - the `INSN_OFF_MASK` and `INSN_IMM_MASK` values can be used to mask
>   `off` and `imm` fields;
> - `SKIP_INSNS` can be used to specify that some instructions in the
>   (un)expected pattern are not important (behavior similar to the
>   usage of `\t` in the `errstr` field).
>
> The intended usage is as follows:
>
>   {
>         "inline simple bpf_loop call",
>         .insns = {
>         /* main */
>         BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 1),
>         BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2,
>                         BPF_PSEUDO_FUNC, 0, 6),
>     ...
>         BPF_EXIT_INSN(),
>         /* callback */
>         BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 1),
>         BPF_EXIT_INSN(),
>         },
>         .expected_insns = {
>                 BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 1),
>                 SKIP_INSN(),

nit: SKIP_INSNS(),

>                 BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, BPF_PSEUDO_CALL, 8, 1)
>         },
>         .unexpected_insns = {
>         BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0,
>                         INSN_OFF_MASK, INSN_IMM_MASK),
>         },
>         .prog_type = BPF_PROG_TYPE_TRACEPOINT,
>         .result = ACCEPT,
>         .runs = 0,
>   },
>
> Here it is expected that the move of 1 to register 1 remains in place
> and that the helper function call instruction is replaced by a
> relative call instruction.
>
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> ---

[...]

>
> +static __u32 roundup_u32(__u32 number, __u32 divisor)
> +{
> +       if (number % divisor == 0)
> +               return number / divisor;
> +       else
> +               return number / divisor + 1;
> +}

Do we really need this roundup? If so, maybe give it a different name?
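
E.g. an open-coded ceiling division would avoid the confusion
(untested):

	*cnt = (xlated_prog_len + sizeof(**buf) - 1) / sizeof(**buf);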

> +
> +static int get_xlated_program(int fd_prog, struct bpf_insn **buf, int *cnt)
> +{
> +       struct bpf_prog_info info = {};
> +       __u32 info_len = sizeof(info);
> +       int err = 0;
> +
> +       if (bpf_obj_get_info_by_fd(fd_prog, &info, &info_len)) {
> +               err = errno;
> +               perror("bpf_obj_get_info_by_fd failed");
> +               goto out;
> +       }
> +
> +       __u32 xlated_prog_len = info.xlated_prog_len;
> +       *cnt = roundup_u32(xlated_prog_len, sizeof(**buf));
> +       *buf = calloc(*cnt, sizeof(**buf));
> +       if (!*buf) {
> +               err = -ENOMEM;
> +               perror("can't allocate xlated program buffer");
> +               goto out;
> +       }
> +
> +       bzero(&info, sizeof(info));
> +       info.xlated_prog_len = xlated_prog_len;
> +       info.xlated_prog_insns = (__u64)*buf;
> +
> +       if (bpf_obj_get_info_by_fd(fd_prog, &info, &info_len)) {
> +               err = errno;
> +               perror("second bpf_obj_get_info_by_fd failed");
> +               goto out_free_buf;
> +       }
> +
> +       goto out;

Maybe just return 0 here, and ...

> +
> + out_free_buf:
> +       free(*buf);
> + out:

... remove label "out:".
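
i.e. something like (untested):

	return 0;

 out_free_buf:
	free(*buf);
	return err;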

> +       return err;
> +}
> +

Other than these, looks good to me.

Acked-by: Song Liu <songliubraving@fb.com>


* Re: [PATCH bpf-next v2 2/3] selftests/bpf: allow BTF specs and func infos in test_verifier tests
  2022-05-29 22:36 ` [PATCH bpf-next v2 2/3] selftests/bpf: allow BTF specs and func infos " Eduard Zingerman
@ 2022-05-31 20:52   ` Song Liu
  2022-05-31 23:20     ` Eduard Zingerman
  0 siblings, 1 reply; 9+ messages in thread
From: Song Liu @ 2022-05-31 20:52 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Kernel Team

On Sun, May 29, 2022 at 3:37 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> The BTF and func_info specification for test_verifier tests follows
> the same notation as in prog_tests/btf.c tests. E.g.:
>
>   ...
>   .func_info = { { 0, 6 }, { 8, 7 } },
>   .func_info_cnt = 2,
>   .btf_strings = "\0int\0",
>   .btf_types = {
>     BTF_TYPE_INT_ENC(1, BTF_INT_SIGNED, 0, 32, 4),
>     BTF_PTR_ENC(1),
>   },
>   ...
>
> The BTF specification is loaded only when present in the test case.
>
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>

This can be very useful! Thanks!

> ---
>  tools/testing/selftests/bpf/prog_tests/btf.c |   1 -
>  tools/testing/selftests/bpf/test_btf.h       |   2 +
>  tools/testing/selftests/bpf/test_verifier.c  | 110 ++++++++++++++++---
>  3 files changed, 95 insertions(+), 18 deletions(-)

[...]

>
> -static int load_btf(void)
> +static char bpf_vlog[UINT_MAX >> 8];
> +
> +static int load_btf_spec(__u32 *types, int types_len,
> +                        const char *strings, int strings_len)
>  {
>         struct btf_header hdr = {
>                 .magic = BTF_MAGIC,
>                 .version = BTF_VERSION,
>                 .hdr_len = sizeof(struct btf_header),
> -               .type_len = sizeof(btf_raw_types),
> -               .str_off = sizeof(btf_raw_types),
> -               .str_len = sizeof(btf_str_sec),
> +               .type_len = types_len,
> +               .str_off = types_len,
> +               .str_len = strings_len,
>         };
>         void *ptr, *raw_btf;
>         int btf_fd;
>
> -       ptr = raw_btf = malloc(sizeof(hdr) + sizeof(btf_raw_types) +
> -                              sizeof(btf_str_sec));
> +       raw_btf = malloc(sizeof(hdr) + types_len + strings_len);
>
> +       ptr = raw_btf;
>         memcpy(ptr, &hdr, sizeof(hdr));
>         ptr += sizeof(hdr);
> -       memcpy(ptr, btf_raw_types, hdr.type_len);
> +       memcpy(ptr, types, hdr.type_len);
>         ptr += hdr.type_len;
> -       memcpy(ptr, btf_str_sec, hdr.str_len);
> +       memcpy(ptr, strings, hdr.str_len);
>         ptr += hdr.str_len;
>
> -       btf_fd = bpf_btf_load(raw_btf, ptr - raw_btf, NULL);
> -       free(raw_btf);
> +       LIBBPF_OPTS(bpf_btf_load_opts, opts,
> +                   .log_buf = bpf_vlog,
> +                   .log_size = sizeof(bpf_vlog),
> +                   .log_level = (verbose
> +                                 ? VERBOSE_LIBBPF_LOG_LEVEL
> +                                 : DEFAULT_LIBBPF_LOG_LEVEL),
> +       );

Please move the local variable definition to the beginning of the function.

> +
> +       btf_fd = bpf_btf_load(raw_btf, ptr - raw_btf, &opts);
>         if (btf_fd < 0)
> -               return -1;
> -       return btf_fd;
> +               printf("Failed to load BTF spec: '%s'\n", strerror(errno));
> +
> +       free(raw_btf);
> +
> +       return btf_fd < 0 ? -1 : btf_fd;
> +}
> +

[...]

> +
>  static void do_test_single(struct bpf_test *test, bool unpriv,
>                            int *passes, int *errors)
>  {
> -       int fd_prog, expected_ret, alignment_prevented_execution;
> +       int fd_prog, btf_fd, expected_ret, alignment_prevented_execution;
>         int prog_len, prog_type = test->prog_type;
>         struct bpf_insn *prog = test->insns;
>         LIBBPF_OPTS(bpf_prog_load_opts, opts);
> @@ -1366,8 +1424,10 @@ static void do_test_single(struct bpf_test *test, bool unpriv,
>         __u32 pflags;
>         int i, err;
>
> +       fd_prog = -1;

This is not really necessary.

>         for (i = 0; i < MAX_NR_MAPS; i++)
>                 map_fds[i] = -1;
> +       btf_fd = -1;
>
>         if (!prog_type)
>                 prog_type = BPF_PROG_TYPE_SOCKET_FILTER;

[...]

> @@ -1540,6 +1614,8 @@ static void do_test_single(struct bpf_test *test, bool unpriv,
>  fail_log:
>         (*errors)++;
>         printf("%s", bpf_vlog);
> +       if (!string_ends_with_nl(bpf_vlog, sizeof(bpf_vlog)))
> +               printf("\n");

This seems like overkill. Can we just print "\n" in all cases?

>         goto close_fds;
>  }
>
> --
> 2.25.1
>


* Re: [PATCH bpf-next v2 3/3] bpf: Inline calls to bpf_loop when callback is known
  2022-05-29 22:36 ` [PATCH bpf-next v2 3/3] bpf: Inline calls to bpf_loop when callback is known Eduard Zingerman
@ 2022-05-31 22:10   ` Song Liu
  0 siblings, 0 replies; 9+ messages in thread
From: Song Liu @ 2022-05-31 22:10 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Kernel Team

On Sun, May 29, 2022 at 3:37 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
[...]
>
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>

Please put the kernel changes, test_progs changes, and test_verifier
changes into 3 separate patches.

> ---
>  include/linux/bpf_verifier.h                  |  16 ++
>  kernel/bpf/bpf_iter.c                         |   9 +-
>  kernel/bpf/verifier.c                         | 184 ++++++++++++-
>  .../selftests/bpf/prog_tests/bpf_loop.c       |  62 +++++
>  tools/testing/selftests/bpf/progs/bpf_loop.c  | 122 +++++++++
>  .../selftests/bpf/verifier/bpf_loop_inline.c  | 244 ++++++++++++++++++
>  6 files changed, 628 insertions(+), 9 deletions(-)
>  create mode 100644 tools/testing/selftests/bpf/verifier/bpf_loop_inline.c
[...]
> +/* For all sub-programs in the program (including main) checks
> + * insn_aux_data to see if there are bpf_loop calls that require
> + * inlining. If such calls are found subprog stack_depth is increased
> + * by the size of 3 registers. Reserved space would be used in the
> + * do_misc_fixups to spill values of the R6, R7, R8 to use these
> + * registers for loop iteration.
> + */
> +static void adjust_stack_depth_for_loop_inlining(struct bpf_verifier_env *env)
> +{
> +       int i, subprog_end, cur_subprog = 0;
> +       struct bpf_subprog_info *subprogs = env->subprog_info;
> +       int insn_cnt = env->prog->len;
> +       bool proc_updated = false;

nit: I guess this should be called subprog_updated?

> +       s32 stack_base;
> +
> +       subprog_end = (env->subprog_cnt > 1
> +                      ? subprogs[cur_subprog + 1].start
> +                      : insn_cnt);
> +       for (i = 0; i < insn_cnt; i++) {
> +               struct bpf_insn_aux_data *aux = &env->insn_aux_data[i];
> +

[...]

> +static struct bpf_prog *inline_bpf_loop(struct bpf_verifier_env *env,
> +                                       int position, u32 *cnt)
> +{
> +       struct bpf_insn_aux_data *aux = &env->insn_aux_data[position];
> +       s32 stack_base = aux->loop_inline_state.stack_base;
> +       s32 r6_offset = stack_base + 0 * BPF_REG_SIZE;
> +       s32 r7_offset = stack_base + 1 * BPF_REG_SIZE;
> +       s32 r8_offset = stack_base + 2 * BPF_REG_SIZE;
> +       int reg_loop_max = BPF_REG_6;
> +       int reg_loop_cnt = BPF_REG_7;
> +       int reg_loop_ctx = BPF_REG_8;
> +
> +       struct bpf_prog *new_prog;
> +       u32 callback_subprogno = aux->loop_inline_state.callback_subprogno;
> +       u32 callback_start;
> +       u32 call_insn_offset;
> +       s32 callback_offset;
> +
> +       struct bpf_insn insn_buf[] = {
> +               /* Return error and jump to the end of the patch if
> +                * expected number of iterations is too big.  This
> +                * repeats the check done in bpf_loop helper function,
> +                * be careful to modify this code in sync.
> +                */
> +               BPF_JMP_IMM(BPF_JLE, BPF_REG_1, BPF_MAX_LOOPS, 2),
> +               BPF_MOV32_IMM(BPF_REG_0, -E2BIG),
> +               BPF_JMP_IMM(BPF_JA, 0, 0, 16),
> +               /* spill R6, R7, R8 to use these as loop vars */
> +               BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_6, r6_offset),
> +               BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_7, r7_offset),
> +               BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_8, r8_offset),
> +               /* initialize loop vars */
> +               BPF_MOV64_REG(reg_loop_max, BPF_REG_1),
> +               BPF_MOV32_IMM(reg_loop_cnt, 0),
> +               BPF_MOV64_REG(reg_loop_ctx, BPF_REG_3),
> +               /* loop header,
> +                * if reg_loop_cnt >= reg_loop_max skip the loop body
> +                */
> +               BPF_JMP_REG(BPF_JGE, reg_loop_cnt, reg_loop_max, 5),
> +               /* callback call,
> +                * correct callback offset would be set after patching
> +                */
> +               BPF_MOV64_REG(BPF_REG_1, reg_loop_cnt),
> +               BPF_MOV64_REG(BPF_REG_2, reg_loop_ctx),
> +               BPF_CALL_REL(0),
> +               /* increment loop counter */
> +               BPF_ALU64_IMM(BPF_ADD, reg_loop_cnt, 1),
> +               /* jump to loop header if callback returned 0 */
> +               BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, -6),
> +               /* return value of bpf_loop,
> +                * set R0 to the number of iterations
> +                */
> +               BPF_MOV64_REG(BPF_REG_0, reg_loop_cnt),
> +               /* restore original values of R6, R7, R8 */
> +               BPF_LDX_MEM(BPF_DW, BPF_REG_6, BPF_REG_10, r6_offset),
> +               BPF_LDX_MEM(BPF_DW, BPF_REG_7, BPF_REG_10, r7_offset),
> +               BPF_LDX_MEM(BPF_DW, BPF_REG_8, BPF_REG_10, r8_offset),
> +       };
> +
> +       *cnt = ARRAY_SIZE(insn_buf);
> +       new_prog = bpf_patch_insn_data(env, position, insn_buf, *cnt);
> +       if (!new_prog)
> +               return new_prog;
> +
> +       /* callback start is known only after patching */
> +       callback_start = env->subprog_info[callback_subprogno].start;
> +       call_insn_offset = position + 12;

IIUC, the magic number 12 is the index of the BPF_CALL_REL(0)
instruction in insn_buf. Can we make this more robust for future
changes? We should at least add a comment here.
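
One way (an untested sketch) is to locate the single BPF_CALL_REL(0)
placeholder in insn_buf instead of hard-coding its index:

        u32 call_rel_idx = 0, j;

        /* find the callback call placeholder in insn_buf */
        for (j = 0; j < ARRAY_SIZE(insn_buf); j++) {
                if (insn_buf[j].code == (BPF_JMP | BPF_CALL) &&
                    insn_buf[j].src_reg == BPF_PSEUDO_CALL) {
                        call_rel_idx = j;
                        break;
                }
        }
        call_insn_offset = position + call_rel_idx;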

> +       callback_offset = callback_start - call_insn_offset - 1;
> +       env->prog->insnsi[call_insn_offset].imm = callback_offset;
> +
> +       return new_prog;
> +}
> +
>  /* Do various post-verification rewrites in a single program pass.
>   * These rewrites simplify JIT and interpreter implementations.
>   */
> @@ -14258,6 +14417,18 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
>                         continue;
>                 }
>
> +               if (insn->imm == BPF_FUNC_loop &&
> +                   fit_for_bpf_loop_inline(&env->insn_aux_data[i + delta])) {
> +                       new_prog = inline_bpf_loop(env, i + delta, &cnt);
> +                       if (!new_prog)
> +                               return -ENOMEM;
> +
> +                       delta    += cnt - 1;
> +                       env->prog = prog = new_prog;
> +                       insn      = new_prog->insnsi + i + delta;
> +                       continue;
> +               }
> +
>  patch_call_imm:
>                 fn = env->ops->get_func_proto(insn->imm, env->prog);
>                 /* all functions that have prototype and verifier allowed
> @@ -15030,6 +15201,9 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr)
>         if (ret == 0)
>                 ret = check_max_stack_depth(env);
>
> +       if (ret == 0)
> +               adjust_stack_depth_for_loop_inlining(env);
> +
>         /* instruction rewrites happen after this point */
>         if (is_priv) {
>                 if (ret == 0)
> diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_loop.c b/tools/testing/selftests/bpf/prog_tests/bpf_loop.c
> index 380d7a2072e3..1caa495be48c 100644
> --- a/tools/testing/selftests/bpf/prog_tests/bpf_loop.c
> +++ b/tools/testing/selftests/bpf/prog_tests/bpf_loop.c
> @@ -120,6 +120,64 @@ static void check_nested_calls(struct bpf_loop *skel)
>         bpf_link__destroy(link);
>  }
>
> +static void check_non_constant_callback(struct bpf_loop *skel)
> +{
> +       struct bpf_link *link =
> +               bpf_program__attach(skel->progs.prog_non_constant_callback);
> +
> +       if (!ASSERT_OK_PTR(link, "link"))
> +               return;
> +
> +       skel->bss->callback_selector = 0x0F;
> +       usleep(1);
> +       ASSERT_EQ(skel->bss->g_output, 0x0F, "g_output #1");
> +
> +       skel->bss->callback_selector = 0xF0;
> +       usleep(1);
> +       ASSERT_EQ(skel->bss->g_output, 0xF0, "g_output #2");
> +
> +       bpf_link__destroy(link);
> +}
> +
> +static void check_stack(struct bpf_loop *skel)
> +{
> +       struct bpf_link *link =
> +               bpf_program__attach(skel->progs.stack_check);
> +
> +       if (!ASSERT_OK_PTR(link, "link"))
> +               goto out;

We can just return here.

> +
> +       int map_fd = bpf_map__fd(skel->maps.map1);
Please move variable definition to the beginning of the function.

> +
> +       if (!ASSERT_GE(map_fd, 0, "bpf_map__fd"))
> +               goto out;
> +
> +       for (int key = 1; key <= 16; ++key) {
> +               int val = key;
> +               int err = bpf_map_update_elem(map_fd, &key, &val, BPF_NOEXIST);
> +
> +               if (!ASSERT_OK(err, "bpf_map_update_elem"))
> +                       goto out;
> +       }
> +
> +       usleep(1);
> +
> +       for (int key = 1; key <= 16; ++key) {
> +               int val;
> +               int err = bpf_map_lookup_elem(map_fd, &key, &val);
> +
> +               if (!ASSERT_OK(err, "bpf_map_lookup_elem"))
> +                       goto out;
> +               ASSERT_EQ(val, key + 1, "bad value in the map");
> +       }
> +
> +out:
> +       if (map_fd >= 0)
> +               close(map_fd);

map1 is part of the skeleton, we should not close it here.

> +       if (link)
> +               bpf_link__destroy(link);

No need to check "if (link)"
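
(bpf_link__destroy() is a no-op for a NULL link.) Folding all of the
comments above together, check_stack could look like this untested
sketch:

        static void check_stack(struct bpf_loop *skel)
        {
                struct bpf_link *link =
                        bpf_program__attach(skel->progs.stack_check);
                int map_fd;

                if (!ASSERT_OK_PTR(link, "link"))
                        return;

                map_fd = bpf_map__fd(skel->maps.map1);
                if (!ASSERT_GE(map_fd, 0, "bpf_map__fd"))
                        goto out;

                /* ... map update / usleep / map lookup loops as before ... */

        out:
                /* map1's fd belongs to the skeleton, don't close it here */
                bpf_link__destroy(link);
        }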

> +}
> +
>  void test_bpf_loop(void)
>  {
>         struct bpf_loop *skel;
> @@ -140,6 +198,10 @@ void test_bpf_loop(void)
>                 check_invalid_flags(skel);
>         if (test__start_subtest("check_nested_calls"))
>                 check_nested_calls(skel);
> +       if (test__start_subtest("check_non_constant_callback"))
> +               check_non_constant_callback(skel);
> +       if (test__start_subtest("check_stack"))
> +               check_stack(skel);
>
>         bpf_loop__destroy(skel);
>  }

[...]

* Re: [PATCH bpf-next v2 2/3] selftests/bpf: allow BTF specs and func infos in test_verifier tests
  2022-05-31 20:52   ` Song Liu
@ 2022-05-31 23:20     ` Eduard Zingerman
  2022-06-01  4:10       ` Song Liu
  0 siblings, 1 reply; 9+ messages in thread
From: Eduard Zingerman @ 2022-05-31 23:20 UTC (permalink / raw)
  To: Song Liu
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Kernel Team

On Tue, 2022-05-31 at 13:52 -0700, Song Liu wrote:

Hi Song,

Thanks a lot for the review, I'll apply the suggested changes and
provide the v3 in one or two days. My only objection is below.

> >  {
> > -       int fd_prog, expected_ret, alignment_prevented_execution;
> > +       int fd_prog, btf_fd, expected_ret, alignment_prevented_execution;
> >         int prog_len, prog_type = test->prog_type;
> >         struct bpf_insn *prog = test->insns;
> >         LIBBPF_OPTS(bpf_prog_load_opts, opts);
> >         __u32 pflags;
> >         int i, err;
> > 
> > +       fd_prog = -1;
> 
> This is not really necessary.

Actually this one is necessary to avoid a compiler warning; note the
following fragment of the do_test_single function:

static void do_test_single(...)
{
        ...
        if (...) {
                btf_fd = load_btf_for_test(...);
                if (btf_fd < 0)
                        goto fail_log;
                opts.prog_btf_fd = btf_fd;
        }
        ...
        fd_prog = bpf_prog_load(..., &opts);
        ...
close_fds:
        ...
        close(fd_prog);
        close(btf_fd);
        ...
        return;
fail_log:
        ...
        goto close_fds;
}

If load_btf_for_test fails, the goto fail_log eventually jumps to
close_fds, where close(fd_prog) would be called on an uninitialised
value.

Best regards,
Eduard


* Re: [PATCH bpf-next v2 2/3] selftests/bpf: allow BTF specs and func infos in test_verifier tests
  2022-05-31 23:20     ` Eduard Zingerman
@ 2022-06-01  4:10       ` Song Liu
  0 siblings, 0 replies; 9+ messages in thread
From: Song Liu @ 2022-06-01  4:10 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Kernel Team

On Tue, May 31, 2022 at 4:20 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Tue, 2022-05-31 at 13:52 -0700, Song Liu wrote:
>
> Hi Song,
>
> Thanks a lot for the review, I'll apply the suggested changes and
> provide the v3 in one or two days. My only objection is below.
>
> > >  {
> > > -       int fd_prog, expected_ret, alignment_prevented_execution;
> > > +       int fd_prog, btf_fd, expected_ret, alignment_prevented_execution;
> > >         int prog_len, prog_type = test->prog_type;
> > >         struct bpf_insn *prog = test->insns;
> > >         LIBBPF_OPTS(bpf_prog_load_opts, opts);
> > >         __u32 pflags;
> > >         int i, err;
> > >
> > > +       fd_prog = -1;
> >
> > This is not really necessary.
>
> Actually this one is necessary to avoid a compiler warning; note the
> following fragment of the do_test_single function:
>
> static void do_test_single(...)
> {
>         ...
>         if (...) {
>                 btf_fd = load_btf_for_test(...);
>                 if (btf_fd < 0)
>                         goto fail_log;
>                 opts.prog_btf_fd = btf_fd;
>         }
>         ...
>         fd_prog = bpf_prog_load(..., &opts);
>         ...
> close_fds:
>         ...
>         close(fd_prog);
>         close(btf_fd);
>         ...
>         return;
> fail_log:
>         ...
>         goto close_fds;
> }
>
> If load_btf_for_test fails, the goto fail_log eventually jumps to
> close_fds, where close(fd_prog) would be called on an uninitialised
> value.

Got it. Thanks for the explanation!

Song
