* [RFC PATCH 00/16] bpf, bounded loop support work in progress
@ 2018-06-01  9:32 John Fastabend
  2018-06-01  9:32 ` [RFC PATCH 01/16] bpf: cfg: partition basic blocks for each subprog John Fastabend
                   ` (15 more replies)
  0 siblings, 16 replies; 17+ messages in thread
From: John Fastabend @ 2018-06-01  9:32 UTC (permalink / raw)
  To: alexei.starovoitov, daniel, davem; +Cc: netdev

This series is an early preview of ongoing work to support loops
inside the BPF verifier. The code is rough in spots (rougher the
further you get into the series) but we have some signs of life!

The following code runs and is verified,

 SEC("classifier_tc_loop1")
 int _tc_loop(struct __sk_buff *ctx)
 {
	void *data      = (void *)(unsigned long)ctx->data;
	void *data_end  = (void *)(unsigned long)ctx->data_end;
	__u8 i = 0, j = 0, k = 0, *p;

	if (data + 2 > data_end)
		return TC_ACT_OK;

	p = data;
	j = p[0];

	if (data + 101 > data_end)
		return TC_ACT_OK;

	if (j < 100)
		return TC_ACT_OK;

 #pragma nounroll
	while (i < j) {
		k += p[i];
		i++;
	}
	ctx->mark = k;

	return TC_ACT_OK;
 }

Perhaps even more importantly, many invalid loops are caught and
rejected. There are still a few cases that need to be handled,
but largely things are working. Lots of code cleanup, polishing
and improvements are still needed, but worth sharing I think.
Even if no one reads the code line by line, the commit messages
have the descriptions.

The series is broken into three parts.

Part1: Build the basic infrastructure. Basic blocks, CFG and DOM are
built (work done by Jiong).

Part2: Verify loops are bounded and encourage the state pruning
logic to prune as many states as possible.

Part3: Tests. Largely TBD at this point, although I've been doing
lots of manual poking around with good/bad/ugly C programs.

Some high level details on the design decisions for each part follow,
but please see the patch descriptions for more.

First, for Part1 Jiong chose to use the Tarjan algorithm to build DOM.
(We also have an iterative solution if that is preferred, but in
theory at least Tarjan has a better running time.) We have also
calculated the pdom, but it is currently unused.

For all Part2 gory details please see patch,

 "bpf: verifier, add initial support to allow bounded loops"

The high level design is as follows,

  i. Use DOM to find loops.
 ii. Find induction variables in the loop.
iii. Verify termination by comparing the induction variable (inc/dec)
     against the JMP op. Propagate min/max values into insn aux
     data.
 iv. At do_check time, ensure the induction variable is bounded as
     required for loop termination. These are a form of restriction
     on valid values for the registers.
  v. Encourage state pruning by using known bounds from the
     previous induction step, e.g. if a scalar increments from
     0 to 100 we know min/max values that are valid for every
     iteration.
 vi. Loop state is usually pruned early, but if not we can simply
     run the loop, costing verifier complexity. (A sketch of a loop
     this scheme rejects follows below.)
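
As an illustration only (not taken from the series or its selftests,
and assuming the same headers and setup as the example above), a loop
shaped like the one below would be rejected under this scheme: the
induction variable steps by two while the exit test is an inequality,
so termination cannot be established from the (inc/dec, JMP op) pair
in step iii,

 SEC("classifier_tc_bad_loop")
 int _tc_bad_loop(struct __sk_buff *ctx)
 {
	void *data      = (void *)(unsigned long)ctx->data;
	void *data_end  = (void *)(unsigned long)ctx->data_end;
	__u8 i = 0, j = 0, k = 0, *p;

	if (data + 2 > data_end)
		return TC_ACT_OK;

	p = data;
	j = p[0];

 #pragma nounroll
	while (i != j) {
		/* step of 2 with a '!=' exit test: i may never equal j,
		 * so the loop is not provably bounded
		 */
		k += i;
		i += 2;
	}
	ctx->mark = k;

	return TC_ACT_OK;
 }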

There is still lots of work to finish up all the pieces but we wanted
to share the current design.

Thanks,
John

---

Jiong Wang (11):
      bpf: cfg: partition basic blocks for each subprog
      bpf: cfg: add edges between basic blocks to form CFG
      bpf: cfg: build domination tree using Tarjan algorithm
      bpf: cfg: detect loop use domination information
      bpf: cfg: detect unreachable basic blocks
      bpf: cfg: move find_subprog/add_subprog to cfg.c
      bpf: cfg: build call graph and detect unreachable/recursive call
      bpf: cfg: remove push_insn and check_cfg
      bpf: cfg: reduce k*alloc/free call by using memory pool for allocating nodes
      bpf: cfg: reduce memory usage by using singly list + compat pointer
      bpf: cfg: detect irreducible loop using Eric Stoltz algorithm

John Fastabend (5):
      bpf: cfg: pretty print CFG and DOM
      bpf: verifier, can ptr range be calculated with scalar ALU op
      bpf: verifier, add initial support to allow bounded loops
      bpf: verifier, simple loop examples
      bpf: tools: dbg patch to turn on debugging and add primitive examples


 include/linux/bpf_verifier.h                   |   28 
 kernel/bpf/Makefile                            |    2 
 kernel/bpf/cfg.c                               | 2060 ++++++++++++++++++++++++
 kernel/bpf/cfg.h                               |   56 +
 kernel/bpf/verifier.c                          |  499 +++---
 samples/bpf/xdp2skb_meta_kern.c                |   57 +
 tools/lib/bpf/bpf.c                            |    2 
 tools/lib/bpf/libbpf.c                         |    7 
 tools/testing/selftests/bpf/Makefile           |    5 
 tools/testing/selftests/bpf/test_bad_loop1.c   |   47 +
 tools/testing/selftests/bpf/test_bad_loop2.c   |   45 +
 tools/testing/selftests/bpf/test_bad_loop3.c   |   47 +
 tools/testing/selftests/bpf/test_basic_loop1.c |   44 +
 tools/testing/selftests/bpf/test_basic_loop2.c |   48 +
 tools/testing/selftests/bpf/test_basic_loop3.c |   51 +
 15 files changed, 2711 insertions(+), 287 deletions(-)
 create mode 100644 kernel/bpf/cfg.c
 create mode 100644 kernel/bpf/cfg.h
 create mode 100644 tools/testing/selftests/bpf/test_bad_loop1.c
 create mode 100644 tools/testing/selftests/bpf/test_bad_loop2.c
 create mode 100644 tools/testing/selftests/bpf/test_bad_loop3.c
 create mode 100644 tools/testing/selftests/bpf/test_basic_loop1.c
 create mode 100644 tools/testing/selftests/bpf/test_basic_loop2.c
 create mode 100644 tools/testing/selftests/bpf/test_basic_loop3.c

* [RFC PATCH 01/16] bpf: cfg: partition basic blocks for each subprog
  2018-06-01  9:32 [RFC PATCH 00/16] bpf, bounded loop support work in progress John Fastabend
@ 2018-06-01  9:32 ` John Fastabend
  2018-06-01  9:32 ` [RFC PATCH 02/16] bpf: cfg: add edges between basic blocks to form CFG John Fastabend
                   ` (14 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: John Fastabend @ 2018-06-01  9:32 UTC (permalink / raw)
  To: alexei.starovoitov, daniel, davem; +Cc: netdev

From: Jiong Wang <jiong.wang@netronome.com>

"check_subprogs" has partitioned subprogs and is doing insn scan for each
subprog already; this patch extends it to also partition basic blocks
(BBs) for each subprog.

  - The first insn of each subprog starts a BB.
  - A branch (both cond and absolute) target starts a BB.
  - The insn immediately following a branch insn starts a BB.
    The insn immediately following an exit insn, and still within the
    subprog, starts a BB.

BBs for each subprog are organized as a list sorted by ascending head
insn index.

Two special BBs, entry and exit, are added as well.
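
For illustration only (this sequence is not part of the patch; it uses
the kernel's BPF_* insn macros from include/linux/filter.h), the rules
above would partition a small subprog like this:

	struct bpf_insn example[] = {
		BPF_MOV64_IMM(BPF_REG_0, 0),		/* insn 0: subprog start  -> BB head */
		BPF_JMP_IMM(BPF_JEQ, BPF_REG_1, 0, 2),	/* insn 1: cond branch to insn 4     */
		BPF_ALU64_IMM(BPF_ADD, BPF_REG_0, 1),	/* insn 2: follows branch -> BB head */
		BPF_EXIT_INSN(),			/* insn 3                            */
		BPF_MOV64_IMM(BPF_REG_0, 1),		/* insn 4: branch target and insn
							 * after exit             -> BB head */
		BPF_EXIT_INSN(),			/* insn 5                            */
	};

giving BB heads 0, 2 and 4, plus the special entry BB (head -1) and the
exit BB (head 6, i.e. subprog_end).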

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 include/linux/bpf_verifier.h |    1 
 kernel/bpf/Makefile          |    2 -
 kernel/bpf/cfg.c             |   93 ++++++++++++++++++++++++++++++++++++++++++
 kernel/bpf/cfg.h             |   16 +++++++
 kernel/bpf/verifier.c        |   57 +++++++++++++++++++++++---
 5 files changed, 162 insertions(+), 7 deletions(-)
 create mode 100644 kernel/bpf/cfg.c
 create mode 100644 kernel/bpf/cfg.h

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 38b04f5..c70faf5 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -177,6 +177,7 @@ static inline bool bpf_verifier_log_needed(const struct bpf_verifier_log *log)
 struct bpf_subprog_info {
 	u32 start; /* insn idx of function entry point */
 	u16 stack_depth; /* max. stack depth used by this function */
+	struct list_head bbs; /* basic blocks list. */
 };
 
 /* single container for all structs
diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile
index f27f549..3ee9036 100644
--- a/kernel/bpf/Makefile
+++ b/kernel/bpf/Makefile
@@ -1,7 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-y := core.o
 
-obj-$(CONFIG_BPF_SYSCALL) += syscall.o verifier.o inode.o helpers.o tnum.o
+obj-$(CONFIG_BPF_SYSCALL) += syscall.o verifier.o inode.o helpers.o tnum.o cfg.o
 obj-$(CONFIG_BPF_SYSCALL) += hashtab.o arraymap.o percpu_freelist.o bpf_lru_list.o lpm_trie.o map_in_map.o
 obj-$(CONFIG_BPF_SYSCALL) += disasm.o
 obj-$(CONFIG_BPF_SYSCALL) += btf.o
diff --git a/kernel/bpf/cfg.c b/kernel/bpf/cfg.c
new file mode 100644
index 0000000..b1af714
--- /dev/null
+++ b/kernel/bpf/cfg.c
@@ -0,0 +1,93 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright (C) 2018 Netronome Systems, Inc. */
+/* This program is free software; you can redistribute it and/or
+ * modify it under the terms of version 2 of the GNU General Public
+ * License as published by the Free Software Foundation.
+ */
+
+#include <linux/bpf_verifier.h>
+#include <linux/list.h>
+#include <linux/slab.h>
+
+#include "cfg.h"
+
+struct bb_node {
+	struct list_head l;
+	u16 head;
+};
+
+#define bb_prev(bb)		list_prev_entry(bb, l)
+#define bb_next(bb)		list_next_entry(bb, l)
+#define bb_first(bb_list)	list_first_entry(bb_list, struct bb_node, l)
+#define bb_last(bb_list)	list_last_entry(bb_list, struct bb_node, l)
+#define entry_bb(bb_list)	bb_first(bb_list)
+#define exit_bb(bb_list)	bb_last(bb_list)
+
+int subprog_append_bb(struct list_head *bb_list, int head)
+{
+	struct bb_node *new_bb, *bb;
+
+	list_for_each_entry(bb, bb_list, l) {
+		if (bb->head == head)
+			return 0;
+		else if (bb->head > head)
+			break;
+	}
+
+	bb = bb_prev(bb);
+	new_bb = kzalloc(sizeof(*new_bb), GFP_KERNEL);
+	if (!new_bb)
+		return -ENOMEM;
+
+	new_bb->head = head;
+	list_add(&new_bb->l, &bb->l);
+
+	return 0;
+}
+
+int subprog_fini_bb(struct list_head *bb_list, int subprog_end)
+{
+	struct bb_node *bb = kzalloc(sizeof(*bb), GFP_KERNEL);
+
+	if (!bb)
+		return -ENOMEM;
+	/* entry bb. */
+	bb->head = -1;
+	list_add(&bb->l, bb_list);
+
+	bb = kzalloc(sizeof(*bb), GFP_KERNEL);
+	if (!bb)
+		return -ENOMEM;
+	/* exit bb. */
+	bb->head = subprog_end;
+	list_add_tail(&bb->l, bb_list);
+
+	return 0;
+}
+
+int subprog_init_bb(struct list_head *bb_list, int subprog_start)
+{
+	int ret;
+
+	INIT_LIST_HEAD(bb_list);
+	ret = subprog_append_bb(bb_list, subprog_start);
+	if (ret < 0)
+		return ret;
+
+	return 0;
+}
+
+void subprog_free_bb(struct bpf_subprog_info *subprog, int end_idx)
+{
+	int i = 0;
+
+	for (; i <= end_idx; i++) {
+		struct list_head *bbs = &subprog[i].bbs;
+		struct bb_node *bb, *tmp;
+
+		list_for_each_entry_safe(bb, tmp, bbs, l) {
+			list_del(&bb->l);
+			kfree(bb);
+		}
+	}
+}
diff --git a/kernel/bpf/cfg.h b/kernel/bpf/cfg.h
new file mode 100644
index 0000000..4092145
--- /dev/null
+++ b/kernel/bpf/cfg.h
@@ -0,0 +1,16 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright (C) 2018 Netronome Systems, Inc. */
+/* This program is free software; you can redistribute it and/or
+ * modify it under the terms of version 2 of the GNU General Public
+ * License as published by the Free Software Foundation.
+ */
+
+#ifndef __BPF_CFG_H__
+#define __BPF_CFG_H__
+
+int subprog_append_bb(struct list_head *bb_list, int head);
+int subprog_fini_bb(struct list_head *bb_list, int subprog_end);
+int subprog_init_bb(struct list_head *bb_list, int subprog_start);
+void subprog_free_bb(struct bpf_subprog_info *subprog, int end_idx);
+
+#endif
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 1fd9667b..523e440 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -24,6 +24,7 @@
 #include <linux/sort.h>
 #include <linux/perf_event.h>
 
+#include "cfg.h"
 #include "disasm.h"
 
 static const struct bpf_verifier_ops * const bpf_verifier_ops[] = {
@@ -807,6 +808,7 @@ static int check_subprogs(struct bpf_verifier_env *env)
 	struct bpf_subprog_info *subprog = env->subprog_info;
 	struct bpf_insn *insn = env->prog->insnsi;
 	int insn_cnt = env->prog->len;
+	struct list_head *cur_bb_list;
 
 	/* Add entry function. */
 	ret = add_subprog(env, 0);
@@ -841,20 +843,46 @@ static int check_subprogs(struct bpf_verifier_env *env)
 		for (i = 0; i < env->subprog_cnt; i++)
 			verbose(env, "func#%d @%d\n", i, subprog[i].start);
 
-	/* now check that all jumps are within the same subprog */
 	subprog_start = subprog[cur_subprog].start;
 	subprog_end = subprog[cur_subprog + 1].start;
+	cur_bb_list = &subprog[cur_subprog].bbs;
+	ret = subprog_init_bb(cur_bb_list, subprog_start);
+	if (ret < 0)
+		goto free_nodes;
+	/* now check that all jumps are within the same subprog */
 	for (i = 0; i < insn_cnt; i++) {
 		u8 code = insn[i].code;
 
 		if (BPF_CLASS(code) != BPF_JMP)
 			goto next;
-		if (BPF_OP(code) == BPF_EXIT || BPF_OP(code) == BPF_CALL)
+
+		if (BPF_OP(code) == BPF_EXIT) {
+			if (i + 1 < subprog_end) {
+				ret = subprog_append_bb(cur_bb_list, i + 1);
+				if (ret < 0)
+					goto free_nodes;
+			}
+			goto next;
+		}
+
+		if (BPF_OP(code) == BPF_CALL)
 			goto next;
+
 		off = i + insn[i].off + 1;
 		if (off < subprog_start || off >= subprog_end) {
 			verbose(env, "jump out of range from insn %d to %d\n", i, off);
-			return -EINVAL;
+			ret = -EINVAL;
+			goto free_nodes;
+		}
+
+		ret = subprog_append_bb(cur_bb_list, off);
+		if (ret < 0)
+			goto free_nodes;
+
+		if (i + 1 < subprog_end) {
+			ret = subprog_append_bb(cur_bb_list, i + 1);
+			if (ret < 0)
+				goto free_nodes;
 		}
 next:
 		if (i == subprog_end - 1) {
@@ -865,15 +893,32 @@ static int check_subprogs(struct bpf_verifier_env *env)
 			if (code != (BPF_JMP | BPF_EXIT) &&
 			    code != (BPF_JMP | BPF_JA)) {
 				verbose(env, "last insn is not an exit or jmp\n");
-				return -EINVAL;
+				ret = -EINVAL;
+				goto free_nodes;
 			}
+
+			ret = subprog_fini_bb(cur_bb_list, subprog_end);
+			if (ret < 0)
+				goto free_nodes;
 			subprog_start = subprog_end;
 			cur_subprog++;
-			if (cur_subprog < env->subprog_cnt)
+			if (cur_subprog < env->subprog_cnt) {
 				subprog_end = subprog[cur_subprog + 1].start;
+				cur_bb_list = &subprog[cur_subprog].bbs;
+				ret = subprog_init_bb(cur_bb_list,
+						      subprog_start);
+				if (ret < 0)
+					goto free_nodes;
+			}
 		}
 	}
-	return 0;
+
+	ret = 0;
+
+free_nodes:
+	subprog_free_bb(subprog, cur_subprog == env->subprog_cnt ?
+					cur_subprog - 1 : cur_subprog);
+	return ret;
 }
 
 static

* [RFC PATCH 02/16] bpf: cfg: add edges between basic blocks to form CFG
  2018-06-01  9:32 [RFC PATCH 00/16] bpf, bounded loop support work in progress John Fastabend
  2018-06-01  9:32 ` [RFC PATCH 01/16] bpf: cfg: partition basic blocks for each subprog John Fastabend
@ 2018-06-01  9:32 ` John Fastabend
  2018-06-01  9:32 ` [RFC PATCH 03/16] bpf: cfg: build domination tree using Tarjan algorithm John Fastabend
                   ` (13 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: John Fastabend @ 2018-06-01  9:32 UTC (permalink / raw)
  To: alexei.starovoitov, daniel, davem; +Cc: netdev

From: Jiong Wang <jiong.wang@netronome.com>

This patch adds edges between basic blocks. Edges are added for both
predecessors and successors.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 kernel/bpf/cfg.c      |  129 ++++++++++++++++++++++++++++++++++++++++++++++++-
 kernel/bpf/cfg.h      |    1 
 kernel/bpf/verifier.c |    3 +
 3 files changed, 131 insertions(+), 2 deletions(-)

diff --git a/kernel/bpf/cfg.c b/kernel/bpf/cfg.c
index b1af714..208aac7 100644
--- a/kernel/bpf/cfg.c
+++ b/kernel/bpf/cfg.c
@@ -11,8 +11,16 @@
 
 #include "cfg.h"
 
+struct edge_node {
+	struct list_head l;
+	struct bb_node *src;
+	struct bb_node *dst;
+};
+
 struct bb_node {
 	struct list_head l;
+	struct list_head e_prevs;
+	struct list_head e_succs;
 	u16 head;
 };
 
@@ -39,6 +47,8 @@ int subprog_append_bb(struct list_head *bb_list, int head)
 	if (!new_bb)
 		return -ENOMEM;
 
+	INIT_LIST_HEAD(&new_bb->e_prevs);
+	INIT_LIST_HEAD(&new_bb->e_succs);
 	new_bb->head = head;
 	list_add(&new_bb->l, &bb->l);
 
@@ -53,6 +63,8 @@ int subprog_fini_bb(struct list_head *bb_list, int subprog_end)
 		return -ENOMEM;
 	/* entry bb. */
 	bb->head = -1;
+	INIT_LIST_HEAD(&bb->e_prevs);
+	INIT_LIST_HEAD(&bb->e_succs);
 	list_add(&bb->l, bb_list);
 
 	bb = kzalloc(sizeof(*bb), GFP_KERNEL);
@@ -60,6 +72,8 @@ int subprog_fini_bb(struct list_head *bb_list, int subprog_end)
 		return -ENOMEM;
 	/* exit bb. */
 	bb->head = subprog_end;
+	INIT_LIST_HEAD(&bb->e_prevs);
+	INIT_LIST_HEAD(&bb->e_succs);
 	list_add_tail(&bb->l, bb_list);
 
 	return 0;
@@ -77,15 +91,126 @@ int subprog_init_bb(struct list_head *bb_list, int subprog_start)
 	return 0;
 }
 
+static struct bb_node *search_bb_with_head(struct list_head *bb_list, int head)
+{
+	struct bb_node *bb;
+
+	list_for_each_entry(bb, bb_list, l) {
+		if (bb->head == head)
+			return bb;
+	}
+
+	return NULL;
+}
+
+int subprog_add_bb_edges(struct bpf_insn *insns, struct list_head *bb_list)
+{
+	struct bb_node *bb, *exit_bb;
+	struct edge_node *edge;
+
+	bb = entry_bb(bb_list);
+	edge = kcalloc(2, sizeof(*edge), GFP_KERNEL);
+	if (!edge)
+		return -ENOMEM;
+	edge->src = bb;
+	edge->dst = bb_next(bb);
+	list_add_tail(&edge->l, &bb->e_succs);
+	edge[1].src = edge->src;
+	edge[1].dst = edge->dst;
+	list_add_tail(&edge[1].l, &edge[1].dst->e_prevs);
+
+	exit_bb = exit_bb(bb_list);
+	bb = bb_next(bb);
+	list_for_each_entry_from(bb, &exit_bb->l, l) {
+		bool has_fallthrough, only_has_fallthrough;
+		bool has_branch, only_has_branch;
+		struct bb_node *next_bb = bb_next(bb);
+		int tail = next_bb->head - 1;
+		struct bpf_insn insn;
+		u8 code;
+
+		edge = kcalloc(2, sizeof(*edge), GFP_KERNEL);
+		if (!edge)
+			return -ENOMEM;
+		edge->src = bb;
+		edge[1].src = bb;
+
+		insn = insns[tail];
+		code = insn.code;
+		only_has_fallthrough = BPF_CLASS(code) != BPF_JMP ||
+				       BPF_OP(code) == BPF_EXIT;
+		has_fallthrough = only_has_fallthrough ||
+				  (BPF_CLASS(code) == BPF_JMP &&
+				   BPF_OP(code) != BPF_CALL &&
+				   BPF_OP(code) != BPF_JA);
+		only_has_branch = BPF_CLASS(code) == BPF_JMP &&
+				  BPF_OP(code) == BPF_JA;
+		has_branch = only_has_branch ||
+			     (BPF_CLASS(code) == BPF_JMP &&
+			      BPF_OP(code) != BPF_EXIT &&
+			      BPF_OP(code) != BPF_CALL);
+
+		if (has_fallthrough) {
+			if (BPF_CLASS(code) == BPF_JMP &&
+			    BPF_OP(code) == BPF_EXIT)
+				next_bb = exit_bb;
+			edge->dst = next_bb;
+			edge[1].dst = next_bb;
+			list_add_tail(&edge->l, &bb->e_succs);
+			list_add_tail(&edge[1].l, &edge[1].dst->e_prevs);
+			edge = NULL;
+		}
+
+		if (has_branch) {
+			struct bb_node *tgt;
+
+			if (!edge) {
+				edge = kcalloc(2, sizeof(*edge), GFP_KERNEL);
+				if (!edge)
+					return -ENOMEM;
+				edge->src = bb;
+				edge[1].src = bb;
+			}
+
+			tgt = search_bb_with_head(bb_list,
+						  tail + insn.off + 1);
+			if (!tgt)
+				return -EINVAL;
+
+			edge->dst = tgt;
+			edge[1].dst = tgt;
+			list_add_tail(&edge->l, &bb->e_succs);
+			list_add_tail(&edge[1].l, &tgt->e_prevs);
+		}
+	}
+
+	return 0;
+}
+
+static void subprog_free_edge(struct bb_node *bb)
+{
+	struct list_head *succs = &bb->e_succs;
+	struct edge_node *e, *tmp;
+
+	/* prevs and succs are allocated as pair, succs is the start addr. */
+	list_for_each_entry_safe(e, tmp, succs, l) {
+		list_del(&e->l);
+		kfree(e);
+	}
+}
+
 void subprog_free_bb(struct bpf_subprog_info *subprog, int end_idx)
 {
 	int i = 0;
 
 	for (; i <= end_idx; i++) {
 		struct list_head *bbs = &subprog[i].bbs;
-		struct bb_node *bb, *tmp;
+		struct bb_node *bb, *tmp, *exit;
 
-		list_for_each_entry_safe(bb, tmp, bbs, l) {
+		bb = entry_bb(bbs);
+		exit = exit_bb(bbs);
+		list_for_each_entry_safe_from(bb, tmp, &exit->l, l) {
+			subprog_free_edge(bb);
 			list_del(&bb->l);
 			kfree(bb);
 		}
diff --git a/kernel/bpf/cfg.h b/kernel/bpf/cfg.h
index 4092145..7a9d5dd 100644
--- a/kernel/bpf/cfg.h
+++ b/kernel/bpf/cfg.h
@@ -8,6 +8,7 @@
 #ifndef __BPF_CFG_H__
 #define __BPF_CFG_H__
 
+int subprog_add_bb_edges(struct bpf_insn *insns, struct list_head *bb_list);
 int subprog_append_bb(struct list_head *bb_list, int head);
 int subprog_fini_bb(struct list_head *bb_list, int subprog_end);
 int subprog_init_bb(struct list_head *bb_list, int subprog_start);
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 523e440..bcac7c8 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -900,6 +900,9 @@ static int check_subprogs(struct bpf_verifier_env *env)
 			ret = subprog_fini_bb(cur_bb_list, subprog_end);
 			if (ret < 0)
 				goto free_nodes;
+			ret = subprog_add_bb_edges(insn, cur_bb_list);
+			if (ret < 0)
+				goto free_nodes;
 			subprog_start = subprog_end;
 			cur_subprog++;
 			if (cur_subprog < env->subprog_cnt) {

* [RFC PATCH 03/16] bpf: cfg: build domination tree using Tarjan algorithm
  2018-06-01  9:32 [RFC PATCH 00/16] bpf, bounded loop support work in progress John Fastabend
  2018-06-01  9:32 ` [RFC PATCH 01/16] bpf: cfg: partition basic blocks for each subprog John Fastabend
  2018-06-01  9:32 ` [RFC PATCH 02/16] bpf: cfg: add edges between basic blocks to form CFG John Fastabend
@ 2018-06-01  9:32 ` John Fastabend
  2018-06-01  9:32 ` [RFC PATCH 04/16] bpf: cfg: detect loop use domination information John Fastabend
                   ` (12 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: John Fastabend @ 2018-06-01  9:32 UTC (permalink / raw)
  To: alexei.starovoitov, daniel, davem; +Cc: netdev

From: Jiong Wang <jiong.wang@netronome.com>

  - When building domination information we need some array data
    structures, and need to index into them using the basic block
    index, so calculate the basic block index while adding edges.

  - Step 1 of the Tarjan algorithm assigns a dfs order number to each
    bb in the cfg. We need to iterate over all bbs in reverse dfs order.

  - Steps 2 and 3 of the Tarjan algorithm calculate semi-dominators
    and immediate dominators, implicitly and explicitly.

  - For step 4, we add a new bitmap field inside subprog_info to keep
    the dominator tree there.

  - Post-domination information is built but not tested.
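
To make the step 4 bitmap layout concrete, here is a small query sketch
(an illustration/assumption, not code from the patch; the loop-detection
patch later in the series queries the bitmap the same way): bb "a"
dominates bb "b" iff bit a is set in b's lane of the dtree bitmap.

	/* Sketch only: one lane of lane_len longs per bb, the fake
	 * entry/exit bbs are excluded from the bitmap.
	 */
	static bool bb_dominates(struct bpf_subprog_info *subprog, int a, int b)
	{
		int lane_len = BITS_TO_LONGS(subprog->bb_num - 2);

		return test_bit(a, subprog->dtree + b * lane_len);
	}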

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 include/linux/bpf_verifier.h |    3 
 kernel/bpf/cfg.c             |  386 ++++++++++++++++++++++++++++++++++++++++++
 kernel/bpf/cfg.h             |    3 
 kernel/bpf/verifier.c        |    5 -
 4 files changed, 393 insertions(+), 4 deletions(-)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index c70faf5..0eb838d 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -178,6 +178,9 @@ struct bpf_subprog_info {
 	u32 start; /* insn idx of function entry point */
 	u16 stack_depth; /* max. stack depth used by this function */
 	struct list_head bbs; /* basic blocks list. */
+	u32 bb_num; /* total basic block num. */
+	unsigned long *dtree;
+	bool dtree_avail;
 };
 
 /* single container for all structs
diff --git a/kernel/bpf/cfg.c b/kernel/bpf/cfg.c
index 208aac7..b50937a 100644
--- a/kernel/bpf/cfg.c
+++ b/kernel/bpf/cfg.c
@@ -17,11 +17,33 @@ struct edge_node {
 	struct bb_node *dst;
 };
 
+struct edge_iter {
+	struct list_head *list_head;
+	struct edge_node *edge;
+};
+
+#define first_edge(e_list)	list_first_entry(e_list, struct edge_node, l)
+#define last_edge(e_list)	list_last_entry(e_list, struct edge_node, l)
+#define next_edge(e)		list_next_entry(e, l)
+
+static bool ei_end_p(struct edge_iter *ei)
+{
+	return &ei->edge->l == ei->list_head;
+}
+
+static void ei_next(struct edge_iter *ei)
+{
+	struct edge_node *e = ei->edge;
+
+	ei->edge = next_edge(e);
+}
+
 struct bb_node {
 	struct list_head l;
 	struct list_head e_prevs;
 	struct list_head e_succs;
 	u16 head;
+	u16 idx;
 };
 
 #define bb_prev(bb)		list_prev_entry(bb, l)
@@ -31,6 +53,27 @@ struct bb_node {
 #define entry_bb(bb_list)	bb_first(bb_list)
 #define exit_bb(bb_list)	bb_last(bb_list)
 
+struct dom_info {
+	u16 *dfs_parent;
+	u16 *dfs_order;
+	struct bb_node **dfs_to_bb;
+	/* immediate-dominator */
+	u16 *idom;
+	/* semi-dominator */
+	u16 *sdom;
+	u16 *bucket;
+	u16 *next_bucket;
+	/* best node during path compression. */
+	u16 *best;
+	/* ancestor along tree edge. */
+	u16 *ancestor;
+	/* size and child are used for tree balancing. */
+	u16 *size;
+	u16 *child;
+
+	u16 dfsnum;
+};
+
 int subprog_append_bb(struct list_head *bb_list, int head)
 {
 	struct bb_node *new_bb, *bb;
@@ -107,6 +150,7 @@ int subprog_add_bb_edges(struct bpf_insn *insns, struct list_head *bb_list)
 {
 	struct bb_node *bb, *exit_bb;
 	struct edge_node *edge;
+	int bb_num;
 
 	bb = entry_bb(bb_list);
 	edge = kcalloc(2, sizeof(*edge), GFP_KERNEL);
@@ -118,9 +162,12 @@ int subprog_add_bb_edges(struct bpf_insn *insns, struct list_head *bb_list)
 	edge[1].src = edge->src;
 	edge[1].dst = edge->dst;
 	list_add_tail(&edge[1].l, &edge[1].dst->e_prevs);
+	bb->idx = -1;
 
 	exit_bb = exit_bb(bb_list);
+	exit_bb->idx = -2;
 	bb = bb_next(bb);
+	bb_num = 0;
 	list_for_each_entry_from(bb, &exit_bb->l, l) {
 		bool has_fallthrough, only_has_fallthrough;
 		bool has_branch, only_has_branch;
@@ -129,6 +176,8 @@ int subprog_add_bb_edges(struct bpf_insn *insns, struct list_head *bb_list)
 		struct bpf_insn insn;
 		u8 code;
 
+		bb->idx = bb_num++;
+
 		edge = kcalloc(2, sizeof(*edge), GFP_KERNEL);
 		if (!edge)
 			return -ENOMEM;
@@ -184,9 +233,341 @@ int subprog_add_bb_edges(struct bpf_insn *insns, struct list_head *bb_list)
 		}
 	}
 
+	return bb_num + 2;
+}
+
+static int init_dom_info(struct bpf_subprog_info *subprog, struct dom_info *di)
+{
+	u16 *p, bb_num, i;
+
+	di->dfs_parent = NULL;
+	di->dfs_to_bb = NULL;
+
+	bb_num = subprog->bb_num;
+	p = kcalloc(10 * bb_num, sizeof(*p), GFP_KERNEL);
+	if (!p)
+		return -ENOMEM;
+
+	di->dfs_parent = p;
+	di->dfs_order = di->dfs_parent + bb_num;
+	di->idom = di->dfs_order + bb_num;
+	di->sdom = di->idom + bb_num;
+	di->bucket = di->sdom + bb_num;
+	di->next_bucket = di->bucket + bb_num;
+	di->best = di->next_bucket + bb_num;
+	di->ancestor = di->best + bb_num;
+	di->size = di->ancestor + bb_num;
+	di->child = di->size + bb_num;
+	di->dfs_to_bb = kcalloc(bb_num, sizeof(struct bb_node *), GFP_KERNEL);
+	di->dfsnum = 1;
+
+	for (i = 0; i < bb_num; i++) {
+		di->size[i] = 1;
+		di->best[i] = i;
+		di->sdom[i] = i;
+	}
+
 	return 0;
 }
 
+static void compress_path(struct dom_info *di, unsigned int v)
+{
+	unsigned int parent = di->ancestor[v];
+
+	if (di->ancestor[parent]) {
+		compress_path(di, parent);
+
+		if (di->sdom[di->best[parent]] < di->sdom[di->best[v]])
+			di->best[v] = di->best[parent];
+
+		di->ancestor[v] = di->ancestor[parent];
+	}
+}
+
+static unsigned int eval(struct dom_info *di, unsigned int v)
+{
+	unsigned int ancestor = di->ancestor[v];
+
+	/* v is root. */
+	if (!ancestor)
+		return di->best[v];
+
+	/* compress path */
+	compress_path(di, v);
+	ancestor = di->ancestor[v];
+
+	if (di->sdom[di->best[ancestor]] >= di->sdom[di->best[v]])
+		return di->best[v];
+	else
+		return di->best[ancestor];
+}
+
+/* Re-balancing the tree before linking. */
+static void link(struct dom_info *di, unsigned int v, unsigned int w)
+{
+	unsigned int s = w;
+
+	while (di->sdom[di->best[w]] < di->sdom[di->best[di->child[s]]]) {
+		if (di->size[s] + di->size[di->child[di->child[s]]] >=
+			2 * di->size[di->child[s]]) {
+			di->ancestor[di->child[s]] = s;
+			di->child[s] = di->child[di->child[s]];
+		} else {
+			di->size[di->child[s]] = di->size[s];
+			di->ancestor[s] = di->child[s];
+			s = di->child[s];
+		}
+	}
+
+	di->best[s] = di->best[w];
+	di->size[v] += di->size[w];
+	if (di->size[v] < 2 * di->size[w]) {
+		unsigned int t = s;
+
+		s = di->child[v];
+		di->child[v] = t;
+	}
+
+	while (s) {
+		di->ancestor[s] = v;
+		s = di->child[s];
+	}
+}
+
+static void calc_idoms(struct bpf_subprog_info *subprog, struct dom_info *di,
+		       bool reverse)
+{
+	u16 entry_bb_fake_idx = subprog->bb_num - 2, idx, w, k, par;
+	struct list_head *bb_list = &subprog->bbs;
+	struct bb_node *entry_bb;
+	struct edge_iter ei;
+
+	if (reverse)
+		entry_bb = exit_bb(bb_list);
+	else
+		entry_bb = entry_bb(bb_list);
+	idx = di->dfsnum - 1;
+
+	while (idx > 1) {
+		struct bb_node *bb = di->dfs_to_bb[idx];
+		struct edge_node *e;
+
+		par = di->dfs_parent[idx];
+		k = idx;
+
+		if (reverse) {
+			ei.edge = first_edge(&bb->e_succs);
+			ei.list_head = &bb->e_succs;
+		} else {
+			ei.edge = first_edge(&bb->e_prevs);
+			ei.list_head = &bb->e_prevs;
+		}
+
+		while (!ei_end_p(&ei)) {
+			struct bb_node *b;
+			u16 k1;
+
+			e = ei.edge;
+			if (reverse)
+				b = e->dst;
+			else
+				b = e->src;
+			ei_next(&ei);
+
+			if (b == entry_bb)
+				k1 = di->dfs_order[entry_bb_fake_idx];
+			else
+				k1 = di->dfs_order[b->idx];
+
+			if (k1 > idx)
+				k1 = di->sdom[eval(di, k1)];
+			if (k1 < k)
+				k = k1;
+		}
+
+		di->sdom[idx] = k;
+		link(di, par, idx);
+		di->next_bucket[idx] = di->bucket[k];
+		di->bucket[k] = idx;
+
+		for (w = di->bucket[par]; w; w = di->next_bucket[w]) {
+			k = eval(di, w);
+			if (di->sdom[k] < di->sdom[w])
+				di->idom[w] = k;
+			else
+				di->idom[w] = par;
+		}
+		di->bucket[par] = 0;
+		idx--;
+	}
+
+	di->idom[1] = 0;
+	for (idx = 2; idx <= di->dfsnum - 1; idx++)
+		if (di->idom[idx] != di->sdom[idx])
+			di->idom[idx] = di->idom[di->idom[idx]];
+}
+
+static int calc_dfs_tree(struct bpf_subprog_info *subprog, struct dom_info *di,
+			 bool reverse)
+{
+	u16 bb_num = subprog->bb_num, sp = 0, idx, parent_idx;
+	struct list_head *bb_list = &subprog->bbs;
+	u16 entry_bb_fake_idx = bb_num - 2;
+	struct bb_node *entry_bb, *exit_bb;
+	struct edge_iter ei, *stack;
+	struct edge_node *e;
+
+	di->dfs_order[entry_bb_fake_idx] = di->dfsnum;
+
+	stack = kmalloc_array(bb_num - 1, sizeof(struct edge_iter), GFP_KERNEL);
+	if (!stack)
+		return -ENOMEM;
+
+	if (reverse) {
+		entry_bb = exit_bb(bb_list);
+		exit_bb = entry_bb(bb_list);
+		di->dfs_to_bb[di->dfsnum++] = exit_bb;
+		ei.edge = first_edge(&entry_bb->e_prevs);
+		ei.list_head = &entry_bb->e_prevs;
+	} else {
+		entry_bb = entry_bb(bb_list);
+		exit_bb = exit_bb(bb_list);
+		di->dfs_to_bb[di->dfsnum++] = entry_bb;
+		ei.edge = first_edge(&entry_bb->e_succs);
+		ei.list_head = &entry_bb->e_succs;
+	}
+
+	while (1) {
+		struct bb_node *bb_dst, *bb_src;
+
+		while (!ei_end_p(&ei)) {
+			e = ei.edge;
+
+			if (reverse) {
+				bb_dst = e->src;
+				if (bb_dst == exit_bb ||
+				    di->dfs_order[bb_dst->idx]) {
+					ei_next(&ei);
+					continue;
+				}
+				bb_src = e->dst;
+			} else {
+				bb_dst = e->dst;
+				if (bb_dst == exit_bb ||
+				    di->dfs_order[bb_dst->idx]) {
+					ei_next(&ei);
+					continue;
+				}
+				bb_src = e->src;
+			}
+
+			if (bb_src != entry_bb)
+				parent_idx = di->dfs_order[bb_src->idx];
+			else
+				parent_idx = di->dfs_order[entry_bb_fake_idx];
+
+			idx = di->dfsnum++;
+			di->dfs_order[bb_dst->idx] = idx;
+			di->dfs_to_bb[idx] = bb_dst;
+			di->dfs_parent[idx] = parent_idx;
+
+			stack[sp++] = ei;
+			if (reverse) {
+				ei.edge = first_edge(&bb_dst->e_prevs);
+				ei.list_head = &bb_dst->e_prevs;
+			} else {
+				ei.edge = first_edge(&bb_dst->e_succs);
+				ei.list_head = &bb_dst->e_succs;
+			}
+		}
+
+		if (!sp)
+			break;
+
+		ei = stack[--sp];
+		ei_next(&ei);
+	}
+
+	kfree(stack);
+
+	return 0;
+}
+
+static int idoms_to_doms(struct bpf_subprog_info *subprog, struct dom_info *di)
+{
+	int bb_num, i, end_index, bb, bb_idom, lane_len;
+	unsigned long *bitmap;
+
+	bb_num = subprog->bb_num - 2;
+	lane_len = BITS_TO_LONGS(bb_num);
+	bitmap = kcalloc(bb_num, lane_len * sizeof(long), GFP_KERNEL);
+	subprog->dtree = bitmap;
+	if (!subprog->dtree)
+		return -ENOMEM;
+	subprog->dtree_avail = true;
+	end_index = di->dfs_order[bb_num];
+
+	for (i = 1; i <= di->dfsnum - 1; i++) {
+		if (i == end_index)
+			continue;
+
+		bb = di->dfs_to_bb[i]->idx;
+		if (di->idom[i] && di->idom[i] != end_index) {
+			bb_idom = di->dfs_to_bb[di->idom[i]]->idx;
+			bitmap_copy(bitmap + bb * lane_len,
+				    bitmap + bb_idom * lane_len, bb_num);
+		}
+
+		bitmap_set(bitmap + bb * lane_len, bb, 1);
+	}
+
+	return 0;
+}
+
+/* Build domination information using Lengauer and Tarjan algorithm.
+ *
+ *   1. dfs on cfg to assign dfs-num for each node(bb).
+ *   2. calculate semi-dominator and calculate immediate dominator when
+ *      possible.
+ *   3. calculate immediate dominator not finished during step 2.
+ *   4. build domination bitmap using immediate dominator information.
+ *
+ * See:
+ *   A fast algorithm for finding dominators in a flowgraph.
+ *     - 1979, by T Lengauer and R Tarjan
+ *
+ *   Especially, Appendix B: The Complete Dominators Algorithms.
+ *
+ * The implementation also referenced GNU GCC 3.0.
+ */
+
+int subprog_build_dom_info(struct bpf_subprog_info *subprog)
+{
+	struct dom_info di;
+	int ret;
+
+	ret = init_dom_info(subprog, &di);
+	if (ret < 0)
+		goto free_dominfo;
+
+	ret = calc_dfs_tree(subprog, &di, false);
+	if (ret < 0)
+		goto free_dominfo;
+
+	calc_idoms(subprog, &di, false);
+	ret = idoms_to_doms(subprog, &di);
+	if (ret < 0)
+		goto free_dominfo;
+
+	ret = 0;
+
+free_dominfo:
+	if (di.dfs_parent)
+		kfree(di.dfs_parent);
+
+	return ret;
+}
+
 static void subprog_free_edge(struct bb_node *bb)
 {
 	struct list_head *succs = &bb->e_succs;
@@ -199,7 +580,7 @@ static void subprog_free_edge(struct bb_node *bb)
 	}
 }
 
-void subprog_free_bb(struct bpf_subprog_info *subprog, int end_idx)
+void subprog_free(struct bpf_subprog_info *subprog, int end_idx)
 {
 	int i = 0;
 
@@ -214,5 +595,8 @@ void subprog_free_bb(struct bpf_subprog_info *subprog, int end_idx)
 			list_del(&bb->l);
 			kfree(bb);
 		}
+
+		if (subprog[i].dtree_avail)
+			kfree(subprog[i].dtree);
 	}
 }
diff --git a/kernel/bpf/cfg.h b/kernel/bpf/cfg.h
index 7a9d5dd..cbb44f2 100644
--- a/kernel/bpf/cfg.h
+++ b/kernel/bpf/cfg.h
@@ -10,8 +10,9 @@
 
 int subprog_add_bb_edges(struct bpf_insn *insns, struct list_head *bb_list);
 int subprog_append_bb(struct list_head *bb_list, int head);
+int subprog_build_dom_info(struct bpf_subprog_info *subprog);
 int subprog_fini_bb(struct list_head *bb_list, int subprog_end);
 int subprog_init_bb(struct list_head *bb_list, int subprog_start);
-void subprog_free_bb(struct bpf_subprog_info *subprog, int end_idx);
+void subprog_free(struct bpf_subprog_info *subprog, int end_idx);
 
 #endif
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index bcac7c8..eccaee4 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -903,6 +903,7 @@ static int check_subprogs(struct bpf_verifier_env *env)
 			ret = subprog_add_bb_edges(insn, cur_bb_list);
 			if (ret < 0)
 				goto free_nodes;
+			subprog[cur_subprog].bb_num = ret;
 			subprog_start = subprog_end;
 			cur_subprog++;
 			if (cur_subprog < env->subprog_cnt) {
@@ -919,8 +920,8 @@ static int check_subprogs(struct bpf_verifier_env *env)
 	ret = 0;
 
 free_nodes:
-	subprog_free_bb(subprog, cur_subprog == env->subprog_cnt ?
-					cur_subprog - 1 : cur_subprog);
+	subprog_free(subprog, cur_subprog == env->subprog_cnt ?
+				cur_subprog - 1 : cur_subprog);
 	return ret;
 }
 

* [RFC PATCH 04/16] bpf: cfg: detect loop use domination information
  2018-06-01  9:32 [RFC PATCH 00/16] bpf, bounded loop support work in progress John Fastabend
                   ` (2 preceding siblings ...)
  2018-06-01  9:32 ` [RFC PATCH 03/16] bpf: cfg: build domination tree using Tarjan algorithm John Fastabend
@ 2018-06-01  9:32 ` John Fastabend
  2018-06-01  9:32 ` [RFC PATCH 05/16] bpf: cfg: detect unreachable basic blocks John Fastabend
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: John Fastabend @ 2018-06-01  9:32 UTC (permalink / raw)
  To: alexei.starovoitov, daniel, davem; +Cc: netdev

From: Jiong Wang <jiong.wang@netronome.com>

If a basic block dominates one of its predecessors, then there is a
loop: the dominating block is the loop header and the dominated
predecessor is the loop latch.
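
For illustration (not part of the patch), consider a subprog whose CFG is:

	entry -> BB0 -> BB1 -> BB2 -> exit
	                 ^      |
	                 +------+

Every path from entry to BB2 goes through BB1, so BB1 dominates its
predecessor BB2. The BB2 -> BB1 edge is therefore a back edge, and the
walk below reports a loop with header BB1 and latch BB2.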

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 kernel/bpf/cfg.c      |   22 ++++++++++++++++++++++
 kernel/bpf/cfg.h      |    1 +
 kernel/bpf/verifier.c |    8 ++++++++
 3 files changed, 31 insertions(+)

diff --git a/kernel/bpf/cfg.c b/kernel/bpf/cfg.c
index b50937a..90692e4 100644
--- a/kernel/bpf/cfg.c
+++ b/kernel/bpf/cfg.c
@@ -568,6 +568,28 @@ int subprog_build_dom_info(struct bpf_subprog_info *subprog)
 	return ret;
 }
 
+bool subprog_has_loop(struct bpf_subprog_info *subprog)
+{
+	int lane_len = BITS_TO_LONGS(subprog->bb_num - 2);
+	struct list_head *bb_list = &subprog->bbs;
+	struct bb_node *bb, *entry_bb;
+	struct edge_node *e;
+
+	entry_bb = entry_bb(bb_list);
+	bb = bb_next(entry_bb);
+	list_for_each_entry_from(bb, &exit_bb(bb_list)->l, l)
+		list_for_each_entry(e, &bb->e_prevs, l) {
+			struct bb_node *latch = e->src;
+
+			if (latch != entry_bb &&
+			    test_bit(bb->idx,
+				     subprog->dtree + latch->idx * lane_len))
+				return true;
+		}
+
+	return false;
+}
+
 static void subprog_free_edge(struct bb_node *bb)
 {
 	struct list_head *succs = &bb->e_succs;
diff --git a/kernel/bpf/cfg.h b/kernel/bpf/cfg.h
index cbb44f2..c02c4cf 100644
--- a/kernel/bpf/cfg.h
+++ b/kernel/bpf/cfg.h
@@ -12,6 +12,7 @@
 int subprog_append_bb(struct list_head *bb_list, int head);
 int subprog_build_dom_info(struct bpf_subprog_info *subprog);
 int subprog_fini_bb(struct list_head *bb_list, int subprog_end);
+bool subprog_has_loop(struct bpf_subprog_info *subprog);
 int subprog_init_bb(struct list_head *bb_list, int subprog_start);
 void subprog_free(struct bpf_subprog_info *subprog, int end_idx);
 
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index eccaee4..c349c45 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -904,6 +904,14 @@ static int check_subprogs(struct bpf_verifier_env *env)
 			if (ret < 0)
 				goto free_nodes;
 			subprog[cur_subprog].bb_num = ret;
+			ret = subprog_build_dom_info(&subprog[cur_subprog]);
+			if (ret < 0)
+				goto free_nodes;
+			if (subprog_has_loop(&subprog[cur_subprog])) {
+				verbose(env, "cfg - loop detected");
+				ret = -EINVAL;
+				goto free_nodes;
+			}
 			subprog_start = subprog_end;
 			cur_subprog++;
 			if (cur_subprog < env->subprog_cnt) {

* [RFC PATCH 05/16] bpf: cfg: detect unreachable basic blocks
  2018-06-01  9:32 [RFC PATCH 00/16] bpf, bounded loop support work in progress John Fastabend
                   ` (3 preceding siblings ...)
  2018-06-01  9:32 ` [RFC PATCH 04/16] bpf: cfg: detect loop use domination information John Fastabend
@ 2018-06-01  9:32 ` John Fastabend
  2018-06-01  9:32 ` [RFC PATCH 06/16] bpf: cfg: move find_subprog/add_subprog to cfg.c John Fastabend
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: John Fastabend @ 2018-06-01  9:32 UTC (permalink / raw)
  To: alexei.starovoitov, daniel, davem; +Cc: netdev

From: Jiong Wang <jiong.wang@netronome.com>

Detect unreachable basic blocks as a side product of the dfs walk done
when building domination information.
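
For example (illustration only, using the kernel's BPF_* insn macros),
a subprog with dead code after an exit forms a basic block that no edge
reaches, its dfs order number stays 0, and the check added below rejects
the program with "cfg - unreachable insn":

	BPF_MOV64_IMM(BPF_REG_0, 0),	/* insn 0: BB0                         */
	BPF_EXIT_INSN(),		/* insn 1: BB0 ends, successor is exit */
	BPF_MOV64_IMM(BPF_REG_0, 1),	/* insn 2: starts BB1, no predecessors */
	BPF_EXIT_INSN(),		/* insn 3                              */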

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 kernel/bpf/cfg.c      |   19 ++++++++++++++-----
 kernel/bpf/cfg.h      |    3 ++-
 kernel/bpf/verifier.c |    3 ++-
 3 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/kernel/bpf/cfg.c b/kernel/bpf/cfg.c
index 90692e4..2f0ac00 100644
--- a/kernel/bpf/cfg.c
+++ b/kernel/bpf/cfg.c
@@ -407,10 +407,11 @@ static void calc_idoms(struct bpf_subprog_info *subprog, struct dom_info *di,
 			di->idom[idx] = di->idom[di->idom[idx]];
 }
 
-static int calc_dfs_tree(struct bpf_subprog_info *subprog, struct dom_info *di,
-			 bool reverse)
+static int
+calc_dfs_tree(struct bpf_verifier_env *env, struct bpf_subprog_info *subprog,
+	      struct dom_info *di, bool reverse)
 {
-	u16 bb_num = subprog->bb_num, sp = 0, idx, parent_idx;
+	u16 bb_num = subprog->bb_num, sp = 0, idx, parent_idx, i;
 	struct list_head *bb_list = &subprog->bbs;
 	u16 entry_bb_fake_idx = bb_num - 2;
 	struct bb_node *entry_bb, *exit_bb;
@@ -490,6 +491,13 @@ static int calc_dfs_tree(struct bpf_subprog_info *subprog, struct dom_info *di,
 
 	kfree(stack);
 
+	for (i = 0; i < bb_num - 2; i++) {
+		if (!di->dfs_order[i]) {
+			bpf_verifier_log_write(env, "cfg - unreachable insn\n");
+			return -EINVAL;
+		}
+	}
+
 	return 0;
 }
 
@@ -541,7 +549,8 @@ static int idoms_to_doms(struct bpf_subprog_info *subprog, struct dom_info *di)
  * The implementation also referenced GNU GCC 3.0.
  */
 
-int subprog_build_dom_info(struct bpf_subprog_info *subprog)
+int subprog_build_dom_info(struct bpf_verifier_env *env,
+			   struct bpf_subprog_info *subprog)
 {
 	struct dom_info di;
 	int ret;
@@ -550,7 +559,7 @@ int subprog_build_dom_info(struct bpf_subprog_info *subprog)
 	if (ret < 0)
 		goto free_dominfo;
 
-	ret = calc_dfs_tree(subprog, &di, false);
+	ret = calc_dfs_tree(env, subprog, &di, false);
 	if (ret < 0)
 		goto free_dominfo;
 
diff --git a/kernel/bpf/cfg.h b/kernel/bpf/cfg.h
index c02c4cf..02729a9 100644
--- a/kernel/bpf/cfg.h
+++ b/kernel/bpf/cfg.h
@@ -10,7 +10,8 @@
 
 int subprog_add_bb_edges(struct bpf_insn *insns, struct list_head *bb_list);
 int subprog_append_bb(struct list_head *bb_list, int head);
-int subprog_build_dom_info(struct bpf_subprog_info *subprog);
+int subprog_build_dom_info(struct bpf_verifier_env *env,
+			   struct bpf_subprog_info *subprog);
 int subprog_fini_bb(struct list_head *bb_list, int subprog_end);
 bool subprog_has_loop(struct bpf_subprog_info *subprog);
 int subprog_init_bb(struct list_head *bb_list, int subprog_start);
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index c349c45..29797d2 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -904,7 +904,8 @@ static int check_subprogs(struct bpf_verifier_env *env)
 			if (ret < 0)
 				goto free_nodes;
 			subprog[cur_subprog].bb_num = ret;
-			ret = subprog_build_dom_info(&subprog[cur_subprog]);
+			ret = subprog_build_dom_info(env,
+						     &subprog[cur_subprog]);
 			if (ret < 0)
 				goto free_nodes;
 			if (subprog_has_loop(&subprog[cur_subprog])) {

* [RFC PATCH 06/16] bpf: cfg: move find_subprog/add_subprog to cfg.c
  2018-06-01  9:32 [RFC PATCH 00/16] bpf, bounded loop support work in progress John Fastabend
                   ` (4 preceding siblings ...)
  2018-06-01  9:32 ` [RFC PATCH 05/16] bpf: cfg: detect unreachable basic blocks John Fastabend
@ 2018-06-01  9:32 ` John Fastabend
  2018-06-01  9:32 ` [RFC PATCH 07/16] bpf: cfg: build call graph and detect unreachable/recursive call John Fastabend
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: John Fastabend @ 2018-06-01  9:32 UTC (permalink / raw)
  To: alexei.starovoitov, daniel, davem; +Cc: netdev

From: Jiong Wang <jiong.wang@netronome.com>

This patch moves find_subprog and add_subprog into cfg.c.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 kernel/bpf/cfg.c      |   41 +++++++++++++++++++++++++++++++++++++++++
 kernel/bpf/cfg.h      |    2 ++
 kernel/bpf/verifier.c |   42 ------------------------------------------
 3 files changed, 43 insertions(+), 42 deletions(-)

diff --git a/kernel/bpf/cfg.c b/kernel/bpf/cfg.c
index 2f0ac00..5175aa7 100644
--- a/kernel/bpf/cfg.c
+++ b/kernel/bpf/cfg.c
@@ -6,8 +6,10 @@
  */
 
 #include <linux/bpf_verifier.h>
+#include <linux/bsearch.h>
 #include <linux/list.h>
 #include <linux/slab.h>
+#include <linux/sort.h>
 
 #include "cfg.h"
 
@@ -599,6 +601,45 @@ bool subprog_has_loop(struct bpf_subprog_info *subprog)
 	return false;
 }
 
+static int cmp_subprogs(const void *a, const void *b)
+{
+	return ((struct bpf_subprog_info *)a)->start -
+	       ((struct bpf_subprog_info *)b)->start;
+}
+
+int find_subprog(struct bpf_verifier_env *env, int off)
+{
+	struct bpf_subprog_info *p;
+
+	p = bsearch(&off, env->subprog_info, env->subprog_cnt,
+		    sizeof(env->subprog_info[0]), cmp_subprogs);
+	if (!p)
+		return -ENOENT;
+	return p - env->subprog_info;
+}
+
+int add_subprog(struct bpf_verifier_env *env, int off)
+{
+	int insn_cnt = env->prog->len;
+	int ret;
+
+	if (off >= insn_cnt || off < 0) {
+		bpf_verifier_log_write(env, "call to invalid destination\n");
+		return -EINVAL;
+	}
+	ret = find_subprog(env, off);
+	if (ret >= 0)
+		return 0;
+	if (env->subprog_cnt >= BPF_MAX_SUBPROGS) {
+		bpf_verifier_log_write(env, "too many subprograms\n");
+		return -E2BIG;
+	}
+	env->subprog_info[env->subprog_cnt++].start = off;
+	sort(env->subprog_info, env->subprog_cnt,
+	     sizeof(env->subprog_info[0]), cmp_subprogs, NULL);
+	return 0;
+}
+
 static void subprog_free_edge(struct bb_node *bb)
 {
 	struct list_head *succs = &bb->e_succs;
diff --git a/kernel/bpf/cfg.h b/kernel/bpf/cfg.h
index 02729a9..57eab9b 100644
--- a/kernel/bpf/cfg.h
+++ b/kernel/bpf/cfg.h
@@ -8,6 +8,8 @@
 #ifndef __BPF_CFG_H__
 #define __BPF_CFG_H__
 
+int add_subprog(struct bpf_verifier_env *env, int off);
+int find_subprog(struct bpf_verifier_env *env, int off);
 int subprog_add_bb_edges(struct bpf_insn *insns, struct list_head *bb_list);
 int subprog_append_bb(struct list_head *bb_list, int head);
 int subprog_build_dom_info(struct bpf_verifier_env *env,
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 29797d2..782dd17 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -20,8 +20,6 @@
 #include <linux/file.h>
 #include <linux/vmalloc.h>
 #include <linux/stringify.h>
-#include <linux/bsearch.h>
-#include <linux/sort.h>
 #include <linux/perf_event.h>
 
 #include "cfg.h"
@@ -762,46 +760,6 @@ enum reg_arg_type {
 	DST_OP_NO_MARK	/* same as above, check only, don't mark */
 };
 
-static int cmp_subprogs(const void *a, const void *b)
-{
-	return ((struct bpf_subprog_info *)a)->start -
-	       ((struct bpf_subprog_info *)b)->start;
-}
-
-static int find_subprog(struct bpf_verifier_env *env, int off)
-{
-	struct bpf_subprog_info *p;
-
-	p = bsearch(&off, env->subprog_info, env->subprog_cnt,
-		    sizeof(env->subprog_info[0]), cmp_subprogs);
-	if (!p)
-		return -ENOENT;
-	return p - env->subprog_info;
-
-}
-
-static int add_subprog(struct bpf_verifier_env *env, int off)
-{
-	int insn_cnt = env->prog->len;
-	int ret;
-
-	if (off >= insn_cnt || off < 0) {
-		verbose(env, "call to invalid destination\n");
-		return -EINVAL;
-	}
-	ret = find_subprog(env, off);
-	if (ret >= 0)
-		return 0;
-	if (env->subprog_cnt >= BPF_MAX_SUBPROGS) {
-		verbose(env, "too many subprograms\n");
-		return -E2BIG;
-	}
-	env->subprog_info[env->subprog_cnt++].start = off;
-	sort(env->subprog_info, env->subprog_cnt,
-	     sizeof(env->subprog_info[0]), cmp_subprogs, NULL);
-	return 0;
-}
-
 static int check_subprogs(struct bpf_verifier_env *env)
 {
 	int i, ret, subprog_start, subprog_end, off, cur_subprog = 0;

* [RFC PATCH 07/16] bpf: cfg: build call graph and detect unreachable/recursive call
  2018-06-01  9:32 [RFC PATCH 00/16] bpf, bounded loop support work in progress John Fastabend
                   ` (5 preceding siblings ...)
  2018-06-01  9:32 ` [RFC PATCH 06/16] bpf: cfg: move find_subprog/add_subprog to cfg.c John Fastabend
@ 2018-06-01  9:32 ` John Fastabend
  2018-06-01  9:32 ` [RFC PATCH 08/16] bpf: cfg: remove push_insn and check_cfg John Fastabend
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: John Fastabend @ 2018-06-01  9:32 UTC (permalink / raw)
  To: alexei.starovoitov, daniel, davem; +Cc: netdev

From: Jiong Wang <jiong.wang@netronome.com>

This patch builds the call graph during the insn scan inside
check_subprogs, then does recursive and unreachable subprog detection
using the call graph.
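
For example (an illustrative C fragment, not from the series; it assumes
the compiler emits a bpf-to-bpf call for the noinline function), a
self-recursive subprog shows up as a cycle in the call graph and is
rejected with "cgraph - recursive call":

	static __attribute__((noinline)) int sum(int n)
	{
		/* self call creates a cycle in the call graph */
		return n <= 0 ? 0 : n + sum(n - 1);
	}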

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 include/linux/bpf_verifier.h |    1 
 kernel/bpf/cfg.c             |  133 ++++++++++++++++++++++++++++++++++++++++++
 kernel/bpf/cfg.h             |    4 +
 kernel/bpf/verifier.c        |   32 +++++++++-
 4 files changed, 166 insertions(+), 4 deletions(-)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 0eb838d..83ccbc4 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -177,6 +177,7 @@ static inline bool bpf_verifier_log_needed(const struct bpf_verifier_log *log)
 struct bpf_subprog_info {
 	u32 start; /* insn idx of function entry point */
 	u16 stack_depth; /* max. stack depth used by this function */
+	struct list_head callees; /* callees list. */
 	struct list_head bbs; /* basic blocks list. */
 	u32 bb_num; /* total basic block num. */
 	unsigned long *dtree;
diff --git a/kernel/bpf/cfg.c b/kernel/bpf/cfg.c
index 5175aa7..56c08e8 100644
--- a/kernel/bpf/cfg.c
+++ b/kernel/bpf/cfg.c
@@ -13,6 +13,12 @@
 
 #include "cfg.h"
 
+struct cedge_node {
+	struct list_head l;
+	u8 caller_idx;
+	u8 callee_idx;
+};
+
 struct edge_node {
 	struct list_head l;
 	struct bb_node *src;
@@ -640,6 +646,126 @@ int add_subprog(struct bpf_verifier_env *env, int off)
 	return 0;
 }
 
+struct callee_iter {
+	struct list_head *list_head;
+	struct cedge_node *callee;
+};
+
+#define first_callee(c_list)	list_first_entry(c_list, struct cedge_node, l)
+#define next_callee(c)		list_next_entry(c, l)
+
+static bool ci_end_p(struct callee_iter *ci)
+{
+	return &ci->callee->l == ci->list_head;
+}
+
+static void ci_next(struct callee_iter *ci)
+{
+	struct cedge_node *c = ci->callee;
+
+	ci->callee = next_callee(c);
+}
+
+#define EXPLORING	1
+#define EXPLORED	2
+int cgraph_check_recursive_unreachable(struct bpf_verifier_env *env,
+				       struct bpf_subprog_info *subprog)
+{
+	struct callee_iter *stack;
+	struct cedge_node *callee;
+	int sp = 0, idx = 0, ret;
+	struct callee_iter ci;
+	int *status;
+
+	stack = kmalloc_array(128, sizeof(struct callee_iter), GFP_KERNEL);
+	if (!stack)
+		return -ENOMEM;
+	status = kcalloc(env->subprog_cnt, sizeof(int), GFP_KERNEL);
+	if (!status) {
+		kfree(stack);
+		return -ENOMEM;
+	}
+	ci.callee = first_callee(&subprog->callees);
+	ci.list_head = &subprog->callees;
+	status[0] = EXPLORING;
+
+	while (1) {
+		while (!ci_end_p(&ci)) {
+			callee = ci.callee;
+			idx = callee->callee_idx;
+			if (status[idx] == EXPLORING) {
+				bpf_verifier_log_write(env, "cgraph - recursive call\n");
+				ret = -EINVAL;
+				goto err_free;
+			}
+
+			status[idx] = EXPLORING;
+
+			if (sp == 127) {
+				bpf_verifier_log_write(env, "cgraph - call frame too deep\n");
+				ret = -EINVAL;
+				goto err_free;
+			}
+
+			stack[sp++] = ci;
+			ci.callee = first_callee(&subprog[idx].callees);
+			ci.list_head = &subprog[idx].callees;
+		}
+
+		if (!list_empty(ci.list_head))
+			status[first_callee(ci.list_head)->caller_idx] =
+				EXPLORED;
+		else
+			/* leaf func. */
+			status[idx] = EXPLORED;
+
+		if (!sp)
+			break;
+
+		ci = stack[--sp];
+		ci_next(&ci);
+	}
+
+	for (idx = 0; idx < env->subprog_cnt; idx++)
+		if (status[idx] != EXPLORED) {
+			bpf_verifier_log_write(env, "cgraph - unreachable subprog\n");
+			ret = -EINVAL;
+			goto err_free;
+		}
+
+	ret = 0;
+err_free:
+	kfree(status);
+	kfree(stack);
+	return ret;
+}
+
+int subprog_append_callee(struct bpf_verifier_env *env,
+			  struct list_head *callees_list,
+			  int caller_idx, int off)
+{
+	int callee_idx = find_subprog(env, off);
+	struct cedge_node *new_callee, *callee;
+
+	if (callee_idx < 0)
+		return callee_idx;
+
+	list_for_each_entry(callee, callees_list, l) {
+		if (callee->callee_idx == callee_idx)
+			return 0;
+	}
+
+	new_callee = kzalloc(sizeof(*new_callee), GFP_KERNEL);
+	if (!new_callee)
+		return -ENOMEM;
+
+	new_callee->caller_idx = caller_idx;
+	new_callee->callee_idx = callee_idx;
+	list_add_tail(&new_callee->l, callees_list);
+
+	return 0;
+}
+
 static void subprog_free_edge(struct bb_node *bb)
 {
 	struct list_head *succs = &bb->e_succs;
@@ -657,7 +783,9 @@ void subprog_free(struct bpf_subprog_info *subprog, int end_idx)
 	int i = 0;
 
 	for (; i <= end_idx; i++) {
+		struct list_head *callees = &subprog[i].callees;
 		struct list_head *bbs = &subprog[i].bbs;
+		struct cedge_node *callee, *tmp_callee;
 		struct bb_node *bb, *tmp, *exit;
 
 		bb = entry_bb(bbs);
@@ -668,6 +796,11 @@ void subprog_free(struct bpf_subprog_info *subprog, int end_idx)
 			kfree(bb);
 		}
 
+		list_for_each_entry_safe(callee, tmp_callee, callees, l) {
+			list_del(&callee->l);
+			kfree(callee);
+		}
+
 		if (subprog[i].dtree_avail)
 			kfree(subprog[i].dtree);
 	}
diff --git a/kernel/bpf/cfg.h b/kernel/bpf/cfg.h
index 57eab9b..577c22c 100644
--- a/kernel/bpf/cfg.h
+++ b/kernel/bpf/cfg.h
@@ -9,9 +9,13 @@
 #define __BPF_CFG_H__
 
 int add_subprog(struct bpf_verifier_env *env, int off);
+int cgraph_check_recursive_unreachable(struct bpf_verifier_env *env,
+				       struct bpf_subprog_info *subprog);
 int find_subprog(struct bpf_verifier_env *env, int off);
 int subprog_add_bb_edges(struct bpf_insn *insns, struct list_head *bb_list);
 int subprog_append_bb(struct list_head *bb_list, int head);
+int subprog_append_callee(struct bpf_verifier_env *env,
+			  struct list_head *bb_list, int caller_idx, int off);
 int subprog_build_dom_info(struct bpf_verifier_env *env,
 			   struct bpf_subprog_info *subprog);
 int subprog_fini_bb(struct list_head *bb_list, int subprog_end);
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 782dd17..ddba8ad 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -764,9 +764,9 @@ static int check_subprogs(struct bpf_verifier_env *env)
 {
 	int i, ret, subprog_start, subprog_end, off, cur_subprog = 0;
 	struct bpf_subprog_info *subprog = env->subprog_info;
+	struct list_head *cur_bb_list, *cur_callee_list;
 	struct bpf_insn *insn = env->prog->insnsi;
 	int insn_cnt = env->prog->len;
-	struct list_head *cur_bb_list;
 
 	/* Add entry function. */
 	ret = add_subprog(env, 0);
@@ -797,16 +797,23 @@ static int check_subprogs(struct bpf_verifier_env *env)
 	 */
 	subprog[env->subprog_cnt].start = insn_cnt;
 
-	if (env->log.level > 1)
-		for (i = 0; i < env->subprog_cnt; i++)
+	for (i = 0; i < env->subprog_cnt; i++) {
+		if (env->log.level > 1)
 			verbose(env, "func#%d @%d\n", i, subprog[i].start);
 
+		/* Don't init callees inside add_subprog which will sort the
+		 * array which breaks list link.
+		 */
+		INIT_LIST_HEAD(&subprog[i].callees);
+	}
+
 	subprog_start = subprog[cur_subprog].start;
 	subprog_end = subprog[cur_subprog + 1].start;
 	cur_bb_list = &subprog[cur_subprog].bbs;
 	ret = subprog_init_bb(cur_bb_list, subprog_start);
 	if (ret < 0)
 		goto free_nodes;
+	cur_callee_list = &env->subprog_info[cur_subprog].callees;
 	/* now check that all jumps are within the same subprog */
 	for (i = 0; i < insn_cnt; i++) {
 		u8 code = insn[i].code;
@@ -823,8 +830,20 @@ static int check_subprogs(struct bpf_verifier_env *env)
 			goto next;
 		}
 
-		if (BPF_OP(code) == BPF_CALL)
+		if (BPF_OP(code) == BPF_CALL) {
+			if (insn[i].src_reg == BPF_PSEUDO_CALL) {
+				int target = i + insn[i].imm + 1;
+
+				ret = subprog_append_callee(env,
+							    cur_callee_list,
+							    cur_subprog,
+							    target);
+				if (ret < 0)
+					return ret;
+			}
+
 			goto next;
+		}
 
 		off = i + insn[i].off + 1;
 		if (off < subprog_start || off >= subprog_end) {
@@ -880,10 +899,15 @@ static int check_subprogs(struct bpf_verifier_env *env)
 						      subprog_start);
 				if (ret < 0)
 					goto free_nodes;
+				cur_callee_list = &subprog[cur_subprog].callees;
 			}
 		}
 	}
 
+	ret = cgraph_check_recursive_unreachable(env, env->subprog_info);
+	if (ret < 0)
+		goto free_nodes;
+
 	ret = 0;
 
 free_nodes:

* [RFC PATCH 08/16] bpf: cfg: remove push_insn and check_cfg
  2018-06-01  9:32 [RFC PATCH 00/16] bpf, bounded loop support work in progress John Fastabend
                   ` (6 preceding siblings ...)
  2018-06-01  9:32 ` [RFC PATCH 07/16] bpf: cfg: build call graph and detect unreachable/recursive call John Fastabend
@ 2018-06-01  9:32 ` John Fastabend
  2018-06-01  9:33 ` [RFC PATCH 09/16] bpf: cfg: reduce k*alloc/free call by using memory pool for allocating nodes John Fastabend
                   ` (7 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: John Fastabend @ 2018-06-01  9:32 UTC (permalink / raw)
  To: alexei.starovoitov, daniel, davem; +Cc: netdev

From: Jiong Wang <jiong.wang@netronome.com>

Now that loops and unreachable insns are detected based on domination
information and the call graph, there is no need for check_cfg.

This patch removes check_cfg and its associated push_insn.

The state prune heuristic marking is moved to check_subprogs.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
---
 kernel/bpf/verifier.c |  228 +++----------------------------------------------
 1 file changed, 16 insertions(+), 212 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index ddba8ad..338ebc5 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -760,6 +760,8 @@ enum reg_arg_type {
 	DST_OP_NO_MARK	/* same as above, check only, don't mark */
 };
 
+#define STATE_LIST_MARK ((struct bpf_verifier_state_list *)-1L)
+
 static int check_subprogs(struct bpf_verifier_env *env)
 {
 	int i, ret, subprog_start, subprog_end, off, cur_subprog = 0;
@@ -840,8 +842,14 @@ static int check_subprogs(struct bpf_verifier_env *env)
 							    target);
 				if (ret < 0)
 					return ret;
+
+				env->explored_states[target] = STATE_LIST_MARK;
+				env->explored_states[i] = STATE_LIST_MARK;
 			}
 
+			if (i + 1 < insn_cnt)
+				env->explored_states[i + 1] = STATE_LIST_MARK;
+
 			goto next;
 		}
 
@@ -861,6 +869,13 @@ static int check_subprogs(struct bpf_verifier_env *env)
 			if (ret < 0)
 				goto free_nodes;
 		}
+
+		if (BPF_OP(code) != BPF_JA) {
+			env->explored_states[i] = STATE_LIST_MARK;
+			env->explored_states[off] = STATE_LIST_MARK;
+		} else if (i + 1 < insn_cnt) {
+			env->explored_states[i + 1] = STATE_LIST_MARK;
+		}
 next:
 		if (i == subprog_end - 1) {
 			/* to avoid fall-through from one subprog into another
@@ -4093,217 +4108,6 @@ static int check_return_code(struct bpf_verifier_env *env)
 	return 0;
 }
 
-/* non-recursive DFS pseudo code
- * 1  procedure DFS-iterative(G,v):
- * 2      label v as discovered
- * 3      let S be a stack
- * 4      S.push(v)
- * 5      while S is not empty
- * 6            t <- S.pop()
- * 7            if t is what we're looking for:
- * 8                return t
- * 9            for all edges e in G.adjacentEdges(t) do
- * 10               if edge e is already labelled
- * 11                   continue with the next edge
- * 12               w <- G.adjacentVertex(t,e)
- * 13               if vertex w is not discovered and not explored
- * 14                   label e as tree-edge
- * 15                   label w as discovered
- * 16                   S.push(w)
- * 17                   continue at 5
- * 18               else if vertex w is discovered
- * 19                   label e as back-edge
- * 20               else
- * 21                   // vertex w is explored
- * 22                   label e as forward- or cross-edge
- * 23           label t as explored
- * 24           S.pop()
- *
- * convention:
- * 0x10 - discovered
- * 0x11 - discovered and fall-through edge labelled
- * 0x12 - discovered and fall-through and branch edges labelled
- * 0x20 - explored
- */
-
-enum {
-	DISCOVERED = 0x10,
-	EXPLORED = 0x20,
-	FALLTHROUGH = 1,
-	BRANCH = 2,
-};
-
-#define STATE_LIST_MARK ((struct bpf_verifier_state_list *) -1L)
-
-static int *insn_stack;	/* stack of insns to process */
-static int cur_stack;	/* current stack index */
-static int *insn_state;
-
-/* t, w, e - match pseudo-code above:
- * t - index of current instruction
- * w - next instruction
- * e - edge
- */
-static int push_insn(int t, int w, int e, struct bpf_verifier_env *env)
-{
-	if (e == FALLTHROUGH && insn_state[t] >= (DISCOVERED | FALLTHROUGH))
-		return 0;
-
-	if (e == BRANCH && insn_state[t] >= (DISCOVERED | BRANCH))
-		return 0;
-
-	if (w < 0 || w >= env->prog->len) {
-		verbose(env, "jump out of range from insn %d to %d\n", t, w);
-		return -EINVAL;
-	}
-
-	if (e == BRANCH)
-		/* mark branch target for state pruning */
-		env->explored_states[w] = STATE_LIST_MARK;
-
-	if (insn_state[w] == 0) {
-		/* tree-edge */
-		insn_state[t] = DISCOVERED | e;
-		insn_state[w] = DISCOVERED;
-		if (cur_stack >= env->prog->len)
-			return -E2BIG;
-		insn_stack[cur_stack++] = w;
-		return 1;
-	} else if ((insn_state[w] & 0xF0) == DISCOVERED) {
-		verbose(env, "back-edge from insn %d to %d\n", t, w);
-		return -EINVAL;
-	} else if (insn_state[w] == EXPLORED) {
-		/* forward- or cross-edge */
-		insn_state[t] = DISCOVERED | e;
-	} else {
-		verbose(env, "insn state internal bug\n");
-		return -EFAULT;
-	}
-	return 0;
-}
-
-/* non-recursive depth-first-search to detect loops in BPF program
- * loop == back-edge in directed graph
- */
-static int check_cfg(struct bpf_verifier_env *env)
-{
-	struct bpf_insn *insns = env->prog->insnsi;
-	int insn_cnt = env->prog->len;
-	int ret = 0;
-	int i, t;
-
-	ret = check_subprogs(env);
-	if (ret < 0)
-		return ret;
-
-	insn_state = kcalloc(insn_cnt, sizeof(int), GFP_KERNEL);
-	if (!insn_state)
-		return -ENOMEM;
-
-	insn_stack = kcalloc(insn_cnt, sizeof(int), GFP_KERNEL);
-	if (!insn_stack) {
-		kfree(insn_state);
-		return -ENOMEM;
-	}
-
-	insn_state[0] = DISCOVERED; /* mark 1st insn as discovered */
-	insn_stack[0] = 0; /* 0 is the first instruction */
-	cur_stack = 1;
-
-peek_stack:
-	if (cur_stack == 0)
-		goto check_state;
-	t = insn_stack[cur_stack - 1];
-
-	if (BPF_CLASS(insns[t].code) == BPF_JMP) {
-		u8 opcode = BPF_OP(insns[t].code);
-
-		if (opcode == BPF_EXIT) {
-			goto mark_explored;
-		} else if (opcode == BPF_CALL) {
-			ret = push_insn(t, t + 1, FALLTHROUGH, env);
-			if (ret == 1)
-				goto peek_stack;
-			else if (ret < 0)
-				goto err_free;
-			if (t + 1 < insn_cnt)
-				env->explored_states[t + 1] = STATE_LIST_MARK;
-			if (insns[t].src_reg == BPF_PSEUDO_CALL) {
-				env->explored_states[t] = STATE_LIST_MARK;
-				ret = push_insn(t, t + insns[t].imm + 1, BRANCH, env);
-				if (ret == 1)
-					goto peek_stack;
-				else if (ret < 0)
-					goto err_free;
-			}
-		} else if (opcode == BPF_JA) {
-			if (BPF_SRC(insns[t].code) != BPF_K) {
-				ret = -EINVAL;
-				goto err_free;
-			}
-			/* unconditional jump with single edge */
-			ret = push_insn(t, t + insns[t].off + 1,
-					FALLTHROUGH, env);
-			if (ret == 1)
-				goto peek_stack;
-			else if (ret < 0)
-				goto err_free;
-			/* tell verifier to check for equivalent states
-			 * after every call and jump
-			 */
-			if (t + 1 < insn_cnt)
-				env->explored_states[t + 1] = STATE_LIST_MARK;
-		} else {
-			/* conditional jump with two edges */
-			env->explored_states[t] = STATE_LIST_MARK;
-			ret = push_insn(t, t + 1, FALLTHROUGH, env);
-			if (ret == 1)
-				goto peek_stack;
-			else if (ret < 0)
-				goto err_free;
-
-			ret = push_insn(t, t + insns[t].off + 1, BRANCH, env);
-			if (ret == 1)
-				goto peek_stack;
-			else if (ret < 0)
-				goto err_free;
-		}
-	} else {
-		/* all other non-branch instructions with single
-		 * fall-through edge
-		 */
-		ret = push_insn(t, t + 1, FALLTHROUGH, env);
-		if (ret == 1)
-			goto peek_stack;
-		else if (ret < 0)
-			goto err_free;
-	}
-
-mark_explored:
-	insn_state[t] = EXPLORED;
-	if (cur_stack-- <= 0) {
-		verbose(env, "pop stack internal bug\n");
-		ret = -EFAULT;
-		goto err_free;
-	}
-	goto peek_stack;
-
-check_state:
-	for (i = 0; i < insn_cnt; i++) {
-		if (insn_state[i] != EXPLORED) {
-			verbose(env, "unreachable insn %d\n", i);
-			ret = -EINVAL;
-			goto err_free;
-		}
-	}
-	ret = 0; /* cfg looks good */
-
-err_free:
-	kfree(insn_state);
-	kfree(insn_stack);
-	return ret;
-}
-
 /* check %cur's range satisfies %old's */
 static bool range_within(struct bpf_reg_state *old,
 			 struct bpf_reg_state *cur)
@@ -5914,7 +5718,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr)
 
 	env->allow_ptr_leaks = capable(CAP_SYS_ADMIN);
 
-	ret = check_cfg(env);
+	ret = check_subprogs(env);
 	if (ret < 0)
 		goto skip_full_check;
 

* [RFC PATCH 09/16] bpf: cfg: reduce k*alloc/free call by using memory pool for allocating nodes
  2018-06-01  9:32 [RFC PATCH 00/16] bpf, bounded loop support work in progress John Fastabend
                   ` (7 preceding siblings ...)
  2018-06-01  9:32 ` [RFC PATCH 08/16] bpf: cfg: remove push_insn and check_cfg John Fastabend
@ 2018-06-01  9:33 ` John Fastabend
  2018-06-01  9:33 ` [RFC PATCH 10/16] bpf: cfg: reduce memory usage by using singly list + compat pointer John Fastabend
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: John Fastabend @ 2018-06-01  9:33 UTC (permalink / raw)
  To: alexei.starovoitov, daniel, davem; +Cc: netdev

From: Jiong Wang <jiong.wang@netronome.com>

While building the control flow graph we need to create basic block nodes,
edge nodes etc. These nodes need to be allocated dynamically as we don't
have a pre-scan pass to know accurately how many nodes are needed. It is
better to manage their allocation with a memory pool, which reduces the
number of k*alloc/kfree calls and also simplifies freeing the nodes.

This patch:
  - implements a simple memory-pool-based node allocator; nodes that need
    dynamic allocation are migrated to this allocator (a minimal sketch of
    the pool idea follows below).
  - estimates node numbers inside the subprog detection pass, asking the
    allocator to do an initial allocation of the estimated size (aligned to
    2K). The pool will grow later if the space is not enough.
  - There is no support for returning memory back to the pool.
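
For illustration, a minimal standalone sketch of the bump-style pool idea
described above (userspace C with invented toy_* names; the real allocator
lives in kernel/bpf/cfg.c and grows additional pools on demand):

    #include <stdlib.h>

    struct toy_pool {
            size_t size;            /* bytes available in data[] */
            size_t used;            /* bytes handed out so far   */
            unsigned char data[];
    };

    static struct toy_pool *toy_pool_new(size_t size)
    {
            struct toy_pool *p = calloc(1, sizeof(*p) + size);

            if (p)
                    p->size = size;
            return p;
    }

    /* Bump allocation: nodes are never freed individually, the whole
     * pool is released with a single free() when the CFG is torn down.
     */
    static void *toy_pool_alloc(struct toy_pool *p, size_t n)
    {
            void *ret;

            if (p->used + n > p->size)
                    return NULL;    /* the real code grows a new pool here */
            ret = p->data + p->used;
            p->used += n;
            return ret;
    }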

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 kernel/bpf/cfg.c      |  164 ++++++++++++++++++++++++++++++++++++-------------
 kernel/bpf/cfg.h      |   21 +++++-
 kernel/bpf/verifier.c |   39 +++++++++---
 3 files changed, 170 insertions(+), 54 deletions(-)

diff --git a/kernel/bpf/cfg.c b/kernel/bpf/cfg.c
index 56c08e8..d213289 100644
--- a/kernel/bpf/cfg.c
+++ b/kernel/bpf/cfg.c
@@ -82,7 +82,113 @@ struct dom_info {
 	u16 dfsnum;
 };
 
-int subprog_append_bb(struct list_head *bb_list, int head)
+struct node_pool {
+	struct list_head l;
+	void *data;
+	u32 size;
+	u32 used;
+};
+
+#define first_node_pool(pool_list)	\
+	list_first_entry(pool_list, struct node_pool, l)
+
+#define MEM_CHUNK_SIZE	(1024)
+
+static int cfg_node_allocator_grow(struct cfg_node_allocator *allocator,
+				   int min_grow_size)
+{
+	int s = min_grow_size;
+	struct node_pool *pool;
+	void *data;
+
+	s += sizeof(struct node_pool);
+	s = ALIGN(s, MEM_CHUNK_SIZE);
+	data = kzalloc(s, GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+
+	pool = (struct node_pool *)data;
+	pool->data = pool + 1;
+	pool->size = s - sizeof(struct node_pool);
+	pool->used = 0;
+	allocator->cur_free_pool = pool;
+	list_add_tail(&pool->l, &allocator->pools);
+
+	return 0;
+}
+
+static void *cfg_node_alloc(struct cfg_node_allocator *allocator, int size)
+{
+	struct node_pool *pool = allocator->cur_free_pool;
+	void *p;
+
+	if (pool->used + size > pool->size) {
+		int ret = cfg_node_allocator_grow(allocator, size);
+
+		if (ret < 0)
+			return NULL;
+
+		pool = allocator->cur_free_pool;
+	}
+
+	p = pool->data + pool->used;
+	pool->used += size;
+
+	return p;
+}
+
+static struct bb_node *get_single_bb_nodes(struct cfg_node_allocator *allocator)
+{
+	int size = sizeof(struct bb_node);
+
+	return (struct bb_node *)cfg_node_alloc(allocator, size);
+}
+
+static struct edge_node *get_edge_nodes(struct cfg_node_allocator *allocator,
+					int num)
+{
+	int size = num * sizeof(struct edge_node);
+
+	return (struct edge_node *)cfg_node_alloc(allocator, size);
+}
+
+static struct cedge_node *
+get_single_cedge_node(struct cfg_node_allocator *allocator)
+{
+	int size = sizeof(struct cedge_node);
+
+	return (struct cedge_node *)cfg_node_alloc(allocator, size);
+}
+
+int cfg_node_allocator_init(struct cfg_node_allocator *allocator,
+			    int bb_num_esti, int cedge_num_esti)
+{
+	int s = bb_num_esti * sizeof(struct bb_node), ret;
+
+	s += 2 * bb_num_esti * sizeof(struct edge_node);
+	s += cedge_num_esti * sizeof(struct cedge_node);
+	INIT_LIST_HEAD(&allocator->pools);
+	ret = cfg_node_allocator_grow(allocator, s);
+	if (ret < 0)
+		return ret;
+
+	return 0;
+}
+
+void cfg_node_allocator_free(struct cfg_node_allocator *allocator)
+{
+	struct list_head *pools = &allocator->pools;
+	struct node_pool *pool, *tmp;
+
+	pool = first_node_pool(pools);
+	list_for_each_entry_safe_from(pool, tmp, pools, l) {
+		list_del(&pool->l);
+		kfree(pool);
+	}
+}
+
+int subprog_append_bb(struct cfg_node_allocator *allocator,
+		      struct list_head *bb_list, int head)
 {
 	struct bb_node *new_bb, *bb;
 
@@ -94,7 +200,7 @@ int subprog_append_bb(struct list_head *bb_list, int head)
 	}
 
 	bb = bb_prev(bb);
-	new_bb = kzalloc(sizeof(*new_bb), GFP_KERNEL);
+	new_bb = get_single_bb_nodes(allocator);
 	if (!new_bb)
 		return -ENOMEM;
 
@@ -106,9 +212,10 @@ int subprog_append_bb(struct list_head *bb_list, int head)
 	return 0;
 }
 
-int subprog_fini_bb(struct list_head *bb_list, int subprog_end)
+int subprog_fini_bb(struct cfg_node_allocator *allocator,
+		    struct list_head *bb_list, int subprog_end)
 {
-	struct bb_node *bb = kzalloc(sizeof(*bb), GFP_KERNEL);
+	struct bb_node *bb = get_single_bb_nodes(allocator);
 
 	if (!bb)
 		return -ENOMEM;
@@ -118,7 +225,7 @@ int subprog_fini_bb(struct list_head *bb_list, int subprog_end)
 	INIT_LIST_HEAD(&bb->e_succs);
 	list_add(&bb->l, bb_list);
 
-	bb = kzalloc(sizeof(*bb), GFP_KERNEL);
+	bb = get_single_bb_nodes(allocator);
 	if (!bb)
 		return -ENOMEM;
 	/* exit bb. */
@@ -130,12 +237,13 @@ int subprog_fini_bb(struct list_head *bb_list, int subprog_end)
 	return 0;
 }
 
-int subprog_init_bb(struct list_head *bb_list, int subprog_start)
+int subprog_init_bb(struct cfg_node_allocator *allocator,
+		    struct list_head *bb_list, int subprog_start)
 {
 	int ret;
 
 	INIT_LIST_HEAD(bb_list);
-	ret = subprog_append_bb(bb_list, subprog_start);
+	ret = subprog_append_bb(allocator, bb_list, subprog_start);
 	if (ret < 0)
 		return ret;
 
@@ -154,14 +262,15 @@ static struct bb_node *search_bb_with_head(struct list_head *bb_list, int head)
 	return NULL;
 }
 
-int subprog_add_bb_edges(struct bpf_insn *insns, struct list_head *bb_list)
+int subprog_add_bb_edges(struct cfg_node_allocator *allocator,
+			 struct bpf_insn *insns, struct list_head *bb_list)
 {
 	struct bb_node *bb, *exit_bb;
 	struct edge_node *edge;
 	int bb_num;
 
 	bb = entry_bb(bb_list);
-	edge = kcalloc(2, sizeof(*edge), GFP_KERNEL);
+	edge = get_edge_nodes(allocator, 2);
 	if (!edge)
 		return -ENOMEM;
 	edge->src = bb;
@@ -186,7 +295,7 @@ int subprog_add_bb_edges(struct bpf_insn *insns, struct list_head *bb_list)
 
 		bb->idx = bb_num++;
 
-		edge = kcalloc(2, sizeof(*edge), GFP_KERNEL);
+		edge = get_edge_nodes(allocator, 2);
 		if (!edge)
 			return -ENOMEM;
 		edge->src = bb;
@@ -222,7 +331,7 @@ int subprog_add_bb_edges(struct bpf_insn *insns, struct list_head *bb_list)
 			struct bb_node *tgt;
 
 			if (!edge) {
-				edge = kcalloc(2, sizeof(*edge), GFP_KERNEL);
+				edge = get_edge_nodes(allocator, 2);
 				if (!edge)
 					return -ENOMEM;
 				edge->src = bb;
@@ -741,6 +850,7 @@ int cgraph_check_recursive_unreachable(struct bpf_verifier_env *env,
 }
 
 int subprog_append_callee(struct bpf_verifier_env *env,
+			  struct cfg_node_allocator *allocator,
 			  struct list_head *callees_list,
 			  int caller_idx, int off)
 {
@@ -755,7 +865,7 @@ int subprog_append_callee(struct bpf_verifier_env *env,
 			return 0;
 	}
 
-	new_callee = kzalloc(sizeof(*new_callee), GFP_KERNEL);
+	new_callee = get_single_cedge_node(allocator);
 	if (!new_callee)
 		return -ENOMEM;
 
@@ -766,41 +876,11 @@ int subprog_append_callee(struct bpf_verifier_env *env,
 	return 0;
 }
 
-static void subprog_free_edge(struct bb_node *bb)
-{
-	struct list_head *succs = &bb->e_succs;
-	struct edge_node *e, *tmp;
-
-	/* prevs and succs are allocated as pair, succs is the start addr. */
-	list_for_each_entry_safe(e, tmp, succs, l) {
-		list_del(&e->l);
-		kfree(e);
-	}
-}
-
 void subprog_free(struct bpf_subprog_info *subprog, int end_idx)
 {
 	int i = 0;
 
 	for (; i <= end_idx; i++) {
-		struct list_head *callees = &subprog[i].callees;
-		struct list_head *bbs = &subprog[i].bbs;
-		struct cedge_node *callee, *tmp_callee;
-		struct bb_node *bb, *tmp, *exit;
-
-		bb = entry_bb(bbs);
-		exit = exit_bb(bbs);
-		list_for_each_entry_safe_from(bb, tmp, &exit->l, l) {
-			subprog_free_edge(bb);
-			list_del(&bb->l);
-			kfree(bb);
-		}
-
-		list_for_each_entry_safe(callee, tmp_callee, callees, l) {
-			list_del(&callee->l);
-			kfree(callee);
-		}
-
 		if (subprog[i].dtree_avail)
 			kfree(subprog[i].dtree);
 	}
diff --git a/kernel/bpf/cfg.h b/kernel/bpf/cfg.h
index 577c22c..1868587 100644
--- a/kernel/bpf/cfg.h
+++ b/kernel/bpf/cfg.h
@@ -8,19 +8,32 @@
 #ifndef __BPF_CFG_H__
 #define __BPF_CFG_H__
 
+struct cfg_node_allocator {
+	struct list_head pools;
+	struct node_pool *cur_free_pool;
+};
+
 int add_subprog(struct bpf_verifier_env *env, int off);
+void cfg_node_allocator_free(struct cfg_node_allocator *allocator);
+int cfg_node_allocator_init(struct cfg_node_allocator *allocator,
+			    int bb_num_esti, int cedge_num_esti);
 int cgraph_check_recursive_unreachable(struct bpf_verifier_env *env,
 				       struct bpf_subprog_info *subprog);
 int find_subprog(struct bpf_verifier_env *env, int off);
-int subprog_add_bb_edges(struct bpf_insn *insns, struct list_head *bb_list);
-int subprog_append_bb(struct list_head *bb_list, int head);
+int subprog_add_bb_edges(struct cfg_node_allocator *allocator,
+			 struct bpf_insn *insns, struct list_head *bb_list);
+int subprog_append_bb(struct cfg_node_allocator *allocator,
+		      struct list_head *bb_list, int head);
 int subprog_append_callee(struct bpf_verifier_env *env,
+			  struct cfg_node_allocator *allocator,
 			  struct list_head *bb_list, int caller_idx, int off);
 int subprog_build_dom_info(struct bpf_verifier_env *env,
 			   struct bpf_subprog_info *subprog);
-int subprog_fini_bb(struct list_head *bb_list, int subprog_end);
+int subprog_fini_bb(struct cfg_node_allocator *allocator,
+		    struct list_head *bb_list, int subprog_end);
 bool subprog_has_loop(struct bpf_subprog_info *subprog);
-int subprog_init_bb(struct list_head *bb_list, int subprog_start);
+int subprog_init_bb(struct cfg_node_allocator *allocator,
+		    struct list_head *bb_list, int subprog_start);
 void subprog_free(struct bpf_subprog_info *subprog, int end_idx);
 
 #endif
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 338ebc5..5a5016d 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -768,7 +768,10 @@ static int check_subprogs(struct bpf_verifier_env *env)
 	struct bpf_subprog_info *subprog = env->subprog_info;
 	struct list_head *cur_bb_list, *cur_callee_list;
 	struct bpf_insn *insn = env->prog->insnsi;
+	int cedge_num_esti = 0, bb_num_esti = 3;
+	struct cfg_node_allocator allocator;
 	int insn_cnt = env->prog->len;
+	u8 code;
 
 	/* Add entry function. */
 	ret = add_subprog(env, 0);
@@ -777,8 +780,18 @@ static int check_subprogs(struct bpf_verifier_env *env)
 
 	/* determine subprog starts. The end is one before the next starts */
 	for (i = 0; i < insn_cnt; i++) {
-		if (insn[i].code != (BPF_JMP | BPF_CALL))
+		code = insn[i].code;
+		if (BPF_CLASS(code) != BPF_JMP)
+			continue;
+		if (BPF_OP(code) == BPF_EXIT) {
+			if (i + 1 < insn_cnt)
+				bb_num_esti++;
 			continue;
+		}
+		if (BPF_OP(code) != BPF_CALL) {
+			bb_num_esti += 2;
+			continue;
+		}
 		if (insn[i].src_reg != BPF_PSEUDO_CALL)
 			continue;
 		if (!env->allow_ptr_leaks) {
@@ -789,6 +802,7 @@ static int check_subprogs(struct bpf_verifier_env *env)
 			verbose(env, "function calls in offloaded programs are not supported yet\n");
 			return -EINVAL;
 		}
+		cedge_num_esti++;
 		ret = add_subprog(env, i + insn[i].imm + 1);
 		if (ret < 0)
 			return ret;
@@ -809,10 +823,14 @@ static int check_subprogs(struct bpf_verifier_env *env)
 		INIT_LIST_HEAD(&subprog[i].callees);
 	}
 
+	ret = cfg_node_allocator_init(&allocator, bb_num_esti,
+				      cedge_num_esti);
+	if (ret < 0)
+		return ret;
 	subprog_start = subprog[cur_subprog].start;
 	subprog_end = subprog[cur_subprog + 1].start;
 	cur_bb_list = &subprog[cur_subprog].bbs;
-	ret = subprog_init_bb(cur_bb_list, subprog_start);
+	ret = subprog_init_bb(&allocator, cur_bb_list, subprog_start);
 	if (ret < 0)
 		goto free_nodes;
 	cur_callee_list = &env->subprog_info[cur_subprog].callees;
@@ -825,7 +843,8 @@ static int check_subprogs(struct bpf_verifier_env *env)
 
 		if (BPF_OP(code) == BPF_EXIT) {
 			if (i + 1 < subprog_end) {
-				ret = subprog_append_bb(cur_bb_list, i + 1);
+				ret = subprog_append_bb(&allocator, cur_bb_list,
+							i + 1);
 				if (ret < 0)
 					goto free_nodes;
 			}
@@ -837,6 +856,7 @@ static int check_subprogs(struct bpf_verifier_env *env)
 				int target = i + insn[i].imm + 1;
 
 				ret = subprog_append_callee(env,
+							    &allocator,
 							    cur_callee_list,
 							    cur_subprog,
 							    target);
@@ -860,12 +880,12 @@ static int check_subprogs(struct bpf_verifier_env *env)
 			goto free_nodes;
 		}
 
-		ret = subprog_append_bb(cur_bb_list, off);
+		ret = subprog_append_bb(&allocator, cur_bb_list, off);
 		if (ret < 0)
 			goto free_nodes;
 
 		if (i + 1 < subprog_end) {
-			ret = subprog_append_bb(cur_bb_list, i + 1);
+			ret = subprog_append_bb(&allocator, cur_bb_list, i + 1);
 			if (ret < 0)
 				goto free_nodes;
 		}
@@ -889,10 +909,12 @@ static int check_subprogs(struct bpf_verifier_env *env)
 				goto free_nodes;
 			}
 
-			ret = subprog_fini_bb(cur_bb_list, subprog_end);
+			ret = subprog_fini_bb(&allocator, cur_bb_list,
+					      subprog_end);
 			if (ret < 0)
 				goto free_nodes;
-			ret = subprog_add_bb_edges(insn, cur_bb_list);
+			ret = subprog_add_bb_edges(&allocator, insn,
+						   cur_bb_list);
 			if (ret < 0)
 				goto free_nodes;
 			subprog[cur_subprog].bb_num = ret;
@@ -910,7 +932,7 @@ static int check_subprogs(struct bpf_verifier_env *env)
 			if (cur_subprog < env->subprog_cnt) {
 				subprog_end = subprog[cur_subprog + 1].start;
 				cur_bb_list = &subprog[cur_subprog].bbs;
-				ret = subprog_init_bb(cur_bb_list,
+				ret = subprog_init_bb(&allocator, cur_bb_list,
 						      subprog_start);
 				if (ret < 0)
 					goto free_nodes;
@@ -926,6 +948,7 @@ static int check_subprogs(struct bpf_verifier_env *env)
 	ret = 0;
 
 free_nodes:
+	cfg_node_allocator_free(&allocator);
 	subprog_free(subprog, cur_subprog == env->subprog_cnt ?
 				cur_subprog - 1 : cur_subprog);
 	return ret;

* [RFC PATCH 10/16] bpf: cfg: reduce memory usage by using singly list + compat pointer
  2018-06-01  9:32 [RFC PATCH 00/16] bpf, bounded loop support work in progress John Fastabend
                   ` (8 preceding siblings ...)
  2018-06-01  9:33 ` [RFC PATCH 09/16] bpf: cfg: reduce k*alloc/free call by using memory pool for allocating nodes John Fastabend
@ 2018-06-01  9:33 ` John Fastabend
  2018-06-01  9:33 ` [RFC PATCH 11/16] bpf: cfg: detect irreducible loop using Eric Stoltz algorithm John Fastabend
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: John Fastabend @ 2018-06-01  9:33 UTC (permalink / raw)
  To: alexei.starovoitov, daniel, davem; +Cc: netdev

From: Jiong Wang <jiong.wang@netronome.com>

Because there are going to be quite a few nodes (bb, edge etc.) for a
reasonably sized program, the size of cfg nodes matters.

The cfg traversals used at the moment are mostly via edges, which contain
pointers to both the src and dst basic blocks. The traversal is not via links
between basic blocks and edge nodes themselves, so linking them with a doubly
linked list_head is unnecessary, and is too heavy on 64-bit hosts where each
pointer takes 8 bytes.

This patch reduces memory usage in two ways:
  1. use a singly linked list to link basic block and edge nodes.
  2. introduce a compact pointer representation for pointers generated from
     the memory pool. If a pointer is from a pool, we can represent it as
     pool_base + pool_offset,

       struct cfg_node_link {
         u16 offset;
         s8 idx;
       };

     the compact pointer "cfg_node_link" only takes 3 bytes.
     Dereferencing ("delinking") such a pointer then becomes:

     static void *cfg_node_delink(struct cfg_node_allocator *allocator,
                            struct cfg_node_link *link)
     {
       s8 idx = link->idx;

       if (idx == -1)
               return NULL;

       return allocator->base[idx] + link->offset;
     }

For example, the basic block node changed from:

    struct bb_node {
       struct list_head l;
       struct list_head e_prevs;
       struct list_head e_succs;
       s16 head;
       u16 idx;
    };

into:

    struct bb_node {
	struct cfg_node_link link;
	struct cfg_node_link e_prevs;
	struct cfg_node_link e_succs;
	s16 head;
	u16 idx;
    };

We reduced the node size from 52 to 16 bytes. We could further reduce the
node size to 14 bytes by reshuffling the link/e_prevs/e_succs fields to avoid
alignment padding, but that would come at a cost to code readability.
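
As a rough sketch of how such a compact {pool index, offset} link can be
formed from a pool pointer and followed again (illustration only, with
invented toy_* names; the in-kernel code stores the link inside the nodes
themselves):

    #include <stddef.h>
    #include <stdint.h>

    #define TOY_MAX_POOL    32

    struct toy_link {
            uint16_t offset;
            int8_t idx;             /* -1 encodes a NULL link */
    };

    struct toy_allocator {
            void *base[TOY_MAX_POOL];
    };

    static struct toy_link toy_link_make(struct toy_allocator *a,
                                         int pool_idx, void *p)
    {
            struct toy_link l = {
                    .offset = (uint16_t)((char *)p - (char *)a->base[pool_idx]),
                    .idx    = (int8_t)pool_idx,
            };

            return l;
    }

    static void *toy_delink(struct toy_allocator *a, struct toy_link *l)
    {
            if (l->idx == -1)
                    return NULL;

            return (char *)a->base[l->idx] + l->offset;
    }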

From benchmarks like test_xdp_noinline, this patch reduces peak memory usage
of the new cfg infrastructure by more than 50%.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 include/linux/bpf_verifier.h |    7 -
 kernel/bpf/cfg.c             |  503 +++++++++++++++++++++++-------------------
 kernel/bpf/cfg.h             |   23 +-
 kernel/bpf/verifier.c        |   33 +--
 4 files changed, 312 insertions(+), 254 deletions(-)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 83ccbc4..07844ff 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -177,8 +177,11 @@ static inline bool bpf_verifier_log_needed(const struct bpf_verifier_log *log)
 struct bpf_subprog_info {
 	u32 start; /* insn idx of function entry point */
 	u16 stack_depth; /* max. stack depth used by this function */
-	struct list_head callees; /* callees list. */
-	struct list_head bbs; /* basic blocks list. */
+	void *callees; /* callees list. */
+	struct {
+		void *entry_bb; /* basic blocks list head. */
+		void *exit_bb; /* basic blocks list tail. */
+	} bbs;
 	u32 bb_num; /* total basic block num. */
 	unsigned long *dtree;
 	bool dtree_avail;
diff --git a/kernel/bpf/cfg.c b/kernel/bpf/cfg.c
index d213289..2a9b513 100644
--- a/kernel/bpf/cfg.c
+++ b/kernel/bpf/cfg.c
@@ -13,54 +13,55 @@
 
 #include "cfg.h"
 
+/* The size of cfg nodes matters, therefore we try to avoid using doubly linked
+ * list and we try to use base + offset to address node, by this we only need
+ * to keep offset.
+ */
+struct cfg_node_link {
+	u16 offset;
+	s8 idx;
+};
+
+static void *cfg_node_delink(struct cfg_node_allocator *allocator,
+			     struct cfg_node_link *link)
+{
+	s8 idx = link->idx;
+
+	if (idx == -1)
+		return NULL;
+
+	return allocator->base[idx] + link->offset;
+}
+
 struct cedge_node {
-	struct list_head l;
+	struct cfg_node_link link;
 	u8 caller_idx;
 	u8 callee_idx;
 };
 
 struct edge_node {
-	struct list_head l;
+	struct cfg_node_link link;
 	struct bb_node *src;
 	struct bb_node *dst;
 };
 
-struct edge_iter {
-	struct list_head *list_head;
-	struct edge_node *edge;
+struct bb_node {
+	struct cfg_node_link link;
+	struct cfg_node_link e_prevs;
+	struct cfg_node_link e_succs;
+	s16 head;
+	u16 idx;
 };
 
-#define first_edge(e_list)	list_first_entry(e_list, struct edge_node, l)
-#define last_edge(e_list)	list_last_entry(e_list, struct edge_node, l)
-#define next_edge(e)		list_next_entry(e, l)
+#define entry_bb(bb_list)		(struct bb_node *)(*bb_list)
+#define exit_bb(bb_list)		(struct bb_node *)(*(bb_list + 1))
 
-static bool ei_end_p(struct edge_iter *ei)
+static struct bb_node *bb_next(struct cfg_node_allocator *allocator,
+			       struct bb_node *bb)
 {
-	return &ei->edge->l == ei->list_head;
+	return (struct bb_node *)cfg_node_delink(allocator, &bb->link);
 }
 
-static void ei_next(struct edge_iter *ei)
-{
-	struct edge_node *e = ei->edge;
-
-	ei->edge = next_edge(e);
-}
-
-struct bb_node {
-	struct list_head l;
-	struct list_head e_prevs;
-	struct list_head e_succs;
-	u16 head;
-	u16 idx;
-};
-
-#define bb_prev(bb)		list_prev_entry(bb, l)
-#define bb_next(bb)		list_next_entry(bb, l)
-#define bb_first(bb_list)	list_first_entry(bb_list, struct bb_node, l)
-#define bb_last(bb_list)	list_last_entry(bb_list, struct bb_node, l)
-#define entry_bb(bb_list)	bb_first(bb_list)
-#define exit_bb(bb_list)	bb_last(bb_list)
-
 struct dom_info {
 	u16 *dfs_parent;
 	u16 *dfs_order;
@@ -82,9 +83,12 @@ struct dom_info {
 	u16 dfsnum;
 };
 
-struct node_pool {
-	struct list_head l;
-	void *data;
+struct mem_frag {
+	struct cfg_node_link link;
+	void *p;
+};
+
+struct pool_head {
 	u32 size;
 	u32 used;
 };
@@ -97,67 +101,103 @@ struct node_pool {
 static int cfg_node_allocator_grow(struct cfg_node_allocator *allocator,
 				   int min_grow_size)
 {
-	int s = min_grow_size;
-	struct node_pool *pool;
-	void *data;
+	int s = min_grow_size, pool_cnt = allocator->pool_cnt;
+	struct pool_head *pool;
 
-	s += sizeof(struct node_pool);
+	if (pool_cnt >= MAX_POOL_NUM)
+		return -E2BIG;
+
+	s += sizeof(struct pool_head);
 	s = ALIGN(s, MEM_CHUNK_SIZE);
-	data = kzalloc(s, GFP_KERNEL);
-	if (!data)
+	if (s > U16_MAX)
+		return -E2BIG;
+
+	pool = kzalloc(s, GFP_KERNEL);
+	if (!pool)
 		return -ENOMEM;
 
-	pool = (struct node_pool *)data;
-	pool->data = pool + 1;
-	pool->size = s - sizeof(struct node_pool);
-	pool->used = 0;
-	allocator->cur_free_pool = pool;
-	list_add_tail(&pool->l, &allocator->pools);
+	allocator->base[pool_cnt] = pool;
+	pool->size = s;
+	pool->used = sizeof(struct pool_head);
+	allocator->pool_cnt++;
 
 	return 0;
 }
 
-static void *cfg_node_alloc(struct cfg_node_allocator *allocator, int size)
+static int cfg_node_alloc(struct cfg_node_allocator *allocator,
+			  struct mem_frag *frag, int size)
 {
-	struct node_pool *pool = allocator->cur_free_pool;
+	int pool_idx = allocator->pool_cnt - 1;
+	struct pool_head *pool;
 	void *p;
 
+	pool = allocator->base[pool_idx];
 	if (pool->used + size > pool->size) {
 		int ret = cfg_node_allocator_grow(allocator, size);
 
 		if (ret < 0)
-			return NULL;
+			return ret;
 
-		pool = allocator->cur_free_pool;
+		pool_idx++;
+		pool = allocator->base[pool_idx];
 	}
 
-	p = pool->data + pool->used;
+	p = (void *)pool + pool->used;
+	frag->p = p;
+	frag->link.idx = pool_idx;
+	frag->link.offset = pool->used;
 	pool->used += size;
 
-	return p;
+	return 0;
 }
 
-static struct bb_node *get_single_bb_nodes(struct cfg_node_allocator *allocator)
+static int get_link_nodes(struct cfg_node_allocator *allocator,
+			  struct mem_frag *frag, int num, int elem_size)
 {
-	int size = sizeof(struct bb_node);
+	int i, ret;
+	struct cfg_node_link *link;
 
-	return (struct bb_node *)cfg_node_alloc(allocator, size);
+	ret = cfg_node_alloc(allocator, frag, num * elem_size);
+	if (ret < 0)
+		return ret;
+
+	for (i = 0; i < num; i++) {
+		link = frag->p + i * elem_size;
+		link->idx = -1;
+	}
+
+	return 0;
 }
 
-static struct edge_node *get_edge_nodes(struct cfg_node_allocator *allocator,
-					int num)
+static int get_bb_nodes(struct cfg_node_allocator *allocator,
+			struct mem_frag *frag, int num)
 {
-	int size = num * sizeof(struct edge_node);
+	struct bb_node *bb;
+	int i, ret;
+
+	ret = get_link_nodes(allocator, frag, num, sizeof(struct bb_node));
+	if (ret < 0)
+		return ret;
+
+	bb = frag->p;
+	for (i = 0; i < num; i++) {
+		bb[i].e_prevs.idx = -1;
+		bb[i].e_succs.idx = -1;
+	}
 
-	return (struct edge_node *)cfg_node_alloc(allocator, size);
+	return 0;
 }
 
-static struct cedge_node *
-get_single_cedge_node(struct cfg_node_allocator *allocator)
+static int get_edge_nodes(struct cfg_node_allocator *allocator,
+			  struct mem_frag *frag, int num)
 {
-	int size = sizeof(struct cedge_node);
+	return get_link_nodes(allocator, frag, num, sizeof(struct edge_node));
+}
 
-	return (struct cedge_node *)cfg_node_alloc(allocator, size);
+static int get_single_cedge_node(struct cfg_node_allocator *allocator,
+				 struct mem_frag *frag)
+{
+	return get_link_nodes(allocator, frag, 1, sizeof(struct cedge_node));
 }
 
 int cfg_node_allocator_init(struct cfg_node_allocator *allocator,
@@ -167,7 +207,8 @@ int cfg_node_allocator_init(struct cfg_node_allocator *allocator,
 
 	s += 2 * bb_num_esti * sizeof(struct edge_node);
 	s += cedge_num_esti * sizeof(struct cedge_node);
-	INIT_LIST_HEAD(&allocator->pools);
+
+	allocator->pool_cnt = 0;
 	ret = cfg_node_allocator_grow(allocator, s);
 	if (ret < 0)
 		return ret;
@@ -177,127 +218,123 @@ int cfg_node_allocator_init(struct cfg_node_allocator *allocator,
 
 void cfg_node_allocator_free(struct cfg_node_allocator *allocator)
 {
-	struct list_head *pools = &allocator->pools;
-	struct node_pool *pool, *tmp;
+	int i, cnt = allocator->pool_cnt;
 
-	pool = first_node_pool(pools);
-	list_for_each_entry_safe_from(pool, tmp, pools, l) {
-		list_del(&pool->l);
-		kfree(pool);
-	}
+	for (i = 0; i < cnt; i++)
+		kfree(allocator->base[i]);
 }
 
-int subprog_append_bb(struct cfg_node_allocator *allocator,
-		      struct list_head *bb_list, int head)
+int subprog_append_bb(struct cfg_node_allocator *allocator, void **bb_list,
+		      int head)
 {
-	struct bb_node *new_bb, *bb;
+	struct bb_node *cur = entry_bb(bb_list);
+	struct bb_node *prev = cur;
+	struct mem_frag frag;
+	int ret;
 
-	list_for_each_entry(bb, bb_list, l) {
-		if (bb->head == head)
+	while (cur) {
+		if (cur->head == head)
 			return 0;
-		else if (bb->head > head)
+		else if (cur->head > head)
 			break;
-	}
+		prev = cur;
+		cur = cfg_node_delink(allocator, &cur->link);
+	};
 
-	bb = bb_prev(bb);
-	new_bb = get_single_bb_nodes(allocator);
-	if (!new_bb)
-		return -ENOMEM;
+	ret = get_bb_nodes(allocator, &frag, 1);
+	if (ret < 0)
+		return ret;
 
-	INIT_LIST_HEAD(&new_bb->e_prevs);
-	INIT_LIST_HEAD(&new_bb->e_succs);
-	new_bb->head = head;
-	list_add(&new_bb->l, &bb->l);
+	cur = frag.p;
+	cur->head = head;
+	cur->link = prev->link;
+	prev->link = frag.link;
 
 	return 0;
 }
 
-int subprog_fini_bb(struct cfg_node_allocator *allocator,
-		    struct list_head *bb_list, int subprog_end)
+int subprog_init_bb(struct cfg_node_allocator *allocator, void **bb_list,
+		    int subprog_start, int subprog_end)
 {
-	struct bb_node *bb = get_single_bb_nodes(allocator);
-
-	if (!bb)
-		return -ENOMEM;
-	/* entry bb. */
-	bb->head = -1;
-	INIT_LIST_HEAD(&bb->e_prevs);
-	INIT_LIST_HEAD(&bb->e_succs);
-	list_add(&bb->l, bb_list);
-
-	bb = get_single_bb_nodes(allocator);
-	if (!bb)
-		return -ENOMEM;
-	/* exit bb. */
-	bb->head = subprog_end;
-	INIT_LIST_HEAD(&bb->e_prevs);
-	INIT_LIST_HEAD(&bb->e_succs);
-	list_add_tail(&bb->l, bb_list);
-
-	return 0;
-}
+	struct bb_node **list_head = (struct bb_node **)bb_list;
+	struct bb_node **list_tail = (struct bb_node **)(bb_list + 1);
+	struct bb_node *entry_bb, *first_bb, *exit_bb;
+	int ret, s = sizeof(struct bb_node);
+	struct mem_frag frag;
 
-int subprog_init_bb(struct cfg_node_allocator *allocator,
-		    struct list_head *bb_list, int subprog_start)
-{
-	int ret;
-
-	INIT_LIST_HEAD(bb_list);
-	ret = subprog_append_bb(allocator, bb_list, subprog_start);
+	ret = get_bb_nodes(allocator, &frag, 3);
 	if (ret < 0)
 		return ret;
 
+	entry_bb = frag.p;
+	*list_head = entry_bb;
+	entry_bb->head = -1;
+	first_bb = frag.p + s;
+	first_bb->head = subprog_start;
+	exit_bb = frag.p + 2 * s;
+	exit_bb->head = subprog_end;
+	entry_bb->link.idx = frag.link.idx;
+	entry_bb->link.offset = frag.link.offset + s;
+	first_bb->link.idx = frag.link.idx;
+	first_bb->link.offset = frag.link.offset + 2 * s;
+	*list_tail = exit_bb;
+
 	return 0;
 }
 
-static struct bb_node *search_bb_with_head(struct list_head *bb_list, int head)
+static struct bb_node *search_bb_with_head(struct cfg_node_allocator *allocator,
+					   void **bb_list, int head)
 {
-	struct bb_node *bb;
+	struct bb_node *cur = entry_bb(bb_list);
 
-	list_for_each_entry(bb, bb_list, l) {
-		if (bb->head == head)
-			return bb;
-	}
+	while (cur) {
+		if (cur->head == head)
+			return cur;
+
+		cur = cfg_node_delink(allocator, &cur->link);
+	};
 
 	return NULL;
 }
 
 int subprog_add_bb_edges(struct cfg_node_allocator *allocator,
-			 struct bpf_insn *insns, struct list_head *bb_list)
+			 struct bpf_insn *insns, void **bb_list)
 {
-	struct bb_node *bb, *exit_bb;
+	struct bb_node *bb = entry_bb(bb_list), *exit_bb;
 	struct edge_node *edge;
-	int bb_num;
+	struct mem_frag frag;
+	int ret, bb_num;
 
-	bb = entry_bb(bb_list);
-	edge = get_edge_nodes(allocator, 2);
-	if (!edge)
-		return -ENOMEM;
+	ret = get_edge_nodes(allocator, &frag, 2);
+	if (ret < 0)
+		return ret;
+	edge = frag.p;
 	edge->src = bb;
-	edge->dst = bb_next(bb);
-	list_add_tail(&edge->l, &bb->e_succs);
+	edge->dst = bb_next(allocator, bb);
+	bb->e_succs = frag.link;
 	edge[1].src = edge->src;
 	edge[1].dst = edge->dst;
-	list_add_tail(&edge[1].l, &edge[1].dst->e_prevs);
+	edge->dst->e_prevs = frag.link;
 	bb->idx = -1;
 
 	exit_bb = exit_bb(bb_list);
 	exit_bb->idx = -2;
-	bb = bb_next(bb);
+	bb = edge->dst;
 	bb_num = 0;
-	list_for_each_entry_from(bb, &exit_bb->l, l) {
+	while (bb && bb != exit_bb) {
+		struct bb_node *next_bb = bb_next(allocator, bb);
 		bool has_fallthrough, only_has_fallthrough;
 		bool has_branch, only_has_branch;
-		struct bb_node *next_bb = bb_next(bb);
 		int tail = next_bb->head - 1;
 		struct bpf_insn insn;
 		u8 code;
 
 		bb->idx = bb_num++;
 
-		edge = get_edge_nodes(allocator, 2);
-		if (!edge)
-			return -ENOMEM;
+		ret = get_edge_nodes(allocator, &frag, 2);
+		if (ret < 0)
+			return ret;
+		edge = frag.p;
 		edge->src = bb;
 		edge[1].src = bb;
 
@@ -322,8 +359,11 @@ int subprog_add_bb_edges(struct cfg_node_allocator *allocator,
 				next_bb = exit_bb;
 			edge->dst = next_bb;
 			edge[1].dst = next_bb;
-			list_add_tail(&edge->l, &bb->e_succs);
-			list_add_tail(&edge[1].l, &edge[1].dst->e_prevs);
+			edge->link = bb->e_succs;
+			bb->e_succs = frag.link;
+			frag.link.offset += sizeof(struct edge_node);
+			edge[1].link = next_bb->e_prevs;
+			next_bb->e_prevs = frag.link;
 			edge = NULL;
 		}
 
@@ -331,23 +371,29 @@ int subprog_add_bb_edges(struct cfg_node_allocator *allocator,
 			struct bb_node *tgt;
 
 			if (!edge) {
-				edge = get_edge_nodes(allocator, 2);
-				if (!edge)
-					return -ENOMEM;
+				ret = get_edge_nodes(allocator, &frag, 2);
+				if (ret < 0)
+					return ret;
+				edge = frag.p;
 				edge->src = bb;
 				edge[1].src = bb;
 			}
 
-			tgt = search_bb_with_head(bb_list,
+			tgt = search_bb_with_head(allocator, bb_list,
 						  tail + insn.off + 1);
 			if (!tgt)
 				return -EINVAL;
 
 			edge->dst = tgt;
 			edge[1].dst = tgt;
-			list_add_tail(&edge->l, &bb->e_succs);
-			list_add_tail(&edge[1].l, &tgt->e_prevs);
+			edge->link = bb->e_succs;
+			bb->e_succs = frag.link;
+			frag.link.offset += sizeof(struct edge_node);
+			edge[1].link = tgt->e_prevs;
+			tgt->e_prevs = frag.link;
 		}
+
+		bb = bb_next(allocator, bb);
 	}
 
 	return bb_num + 2;
@@ -451,13 +497,13 @@ static void link(struct dom_info *di, unsigned int v, unsigned int w)
 	}
 }
 
-static void calc_idoms(struct bpf_subprog_info *subprog, struct dom_info *di,
-		       bool reverse)
+static void
+calc_idoms(struct cfg_node_allocator *allocator,
+	   struct bpf_subprog_info *subprog, struct dom_info *di, bool reverse)
 {
 	u16 entry_bb_fake_idx = subprog->bb_num - 2, idx, w, k, par;
-	struct list_head *bb_list = &subprog->bbs;
+	void **bb_list = (void **)&subprog->bbs;
 	struct bb_node *entry_bb;
-	struct edge_iter ei;
 
 	if (reverse)
 		entry_bb = exit_bb(bb_list);
@@ -472,24 +518,21 @@ static void calc_idoms(struct bpf_subprog_info *subprog, struct dom_info *di,
 		par = di->dfs_parent[idx];
 		k = idx;
 
-		if (reverse) {
-			ei.edge = first_edge(&bb->e_succs);
-			ei.list_head = &bb->e_succs;
-		} else {
-			ei.edge = first_edge(&bb->e_prevs);
-			ei.list_head = &bb->e_prevs;
-		}
+		if (reverse)
+			e = cfg_node_delink(allocator, &bb->e_succs);
+		else
+			e = cfg_node_delink(allocator, &bb->e_prevs);
 
-		while (!ei_end_p(&ei)) {
+		while (e) {
 			struct bb_node *b;
 			u16 k1;
 
-			e = ei.edge;
 			if (reverse)
 				b = e->dst;
 			else
 				b = e->src;
-			ei_next(&ei);
+
+			e = cfg_node_delink(allocator, &e->link);
 
 			if (b == entry_bb)
 				k1 = di->dfs_order[entry_bb_fake_idx];
@@ -525,19 +568,21 @@ static void calc_idoms(struct bpf_subprog_info *subprog, struct dom_info *di,
 }
 
 static int
-calc_dfs_tree(struct bpf_verifier_env *env, struct bpf_subprog_info *subprog,
-	      struct dom_info *di, bool reverse)
+calc_dfs_tree(struct bpf_verifier_env *env,
+	      struct cfg_node_allocator *allocator,
+	      struct bpf_subprog_info *subprog, struct dom_info *di,
+	      bool reverse)
 {
 	u16 bb_num = subprog->bb_num, sp = 0, idx, parent_idx, i;
-	struct list_head *bb_list = &subprog->bbs;
+	void **bb_list = (void **)&subprog->bbs;
 	u16 entry_bb_fake_idx = bb_num - 2;
 	struct bb_node *entry_bb, *exit_bb;
-	struct edge_iter ei, *stack;
-	struct edge_node *e;
+	struct edge_node **stack, *e;
 
 	di->dfs_order[entry_bb_fake_idx] = di->dfsnum;
 
-	stack = kmalloc_array(bb_num - 1, sizeof(struct edge_iter), GFP_KERNEL);
+	stack = kmalloc_array(bb_num - 1, sizeof(struct edge_node *),
+			      GFP_KERNEL);
 	if (!stack)
 		return -ENOMEM;
 
@@ -545,27 +590,24 @@ static void calc_idoms(struct bpf_subprog_info *subprog, struct dom_info *di,
 		entry_bb = exit_bb(bb_list);
 		exit_bb = entry_bb(bb_list);
 		di->dfs_to_bb[di->dfsnum++] = exit_bb;
-		ei.edge = first_edge(&entry_bb->e_prevs);
-		ei.list_head = &entry_bb->e_prevs;
+		e = cfg_node_delink(allocator, &entry_bb->e_prevs);
 	} else {
 		entry_bb = entry_bb(bb_list);
 		exit_bb = exit_bb(bb_list);
 		di->dfs_to_bb[di->dfsnum++] = entry_bb;
-		ei.edge = first_edge(&entry_bb->e_succs);
-		ei.list_head = &entry_bb->e_succs;
+		e = cfg_node_delink(allocator, &entry_bb->e_succs);
 	}
 
 	while (1) {
 		struct bb_node *bb_dst, *bb_src;
 
-		while (!ei_end_p(&ei)) {
-			e = ei.edge;
-
+		while (e) {
 			if (reverse) {
 				bb_dst = e->src;
 				if (bb_dst == exit_bb ||
 				    di->dfs_order[bb_dst->idx]) {
-					ei_next(&ei);
+					e = cfg_node_delink(allocator,
+							    &e->link);
 					continue;
 				}
 				bb_src = e->dst;
@@ -573,7 +615,8 @@ static void calc_idoms(struct bpf_subprog_info *subprog, struct dom_info *di,
 				bb_dst = e->dst;
 				if (bb_dst == exit_bb ||
 				    di->dfs_order[bb_dst->idx]) {
-					ei_next(&ei);
+					e = cfg_node_delink(allocator,
+							    &e->link);
 					continue;
 				}
 				bb_src = e->src;
@@ -589,21 +632,20 @@ static void calc_idoms(struct bpf_subprog_info *subprog, struct dom_info *di,
 			di->dfs_to_bb[idx] = bb_dst;
 			di->dfs_parent[idx] = parent_idx;
 
-			stack[sp++] = ei;
-			if (reverse) {
-				ei.edge = first_edge(&bb_dst->e_prevs);
-				ei.list_head = &bb_dst->e_prevs;
-			} else {
-				ei.edge = first_edge(&bb_dst->e_succs);
-				ei.list_head = &bb_dst->e_succs;
-			}
+			stack[sp++] = e;
+			if (reverse)
+				e = cfg_node_delink(allocator,
+						    &bb_dst->e_prevs);
+			else
+				e = cfg_node_delink(allocator,
+						    &bb_dst->e_succs);
 		}
 
 		if (!sp)
 			break;
 
-		ei = stack[--sp];
-		ei_next(&ei);
+		e = stack[--sp];
+		e = cfg_node_delink(allocator, &e->link);
 	}
 
 	kfree(stack);
@@ -667,6 +709,7 @@ static int idoms_to_doms(struct bpf_subprog_info *subprog, struct dom_info *di)
  */
 
 int subprog_build_dom_info(struct bpf_verifier_env *env,
+			   struct cfg_node_allocator *allocator,
 			   struct bpf_subprog_info *subprog)
 {
 	struct dom_info di;
@@ -676,11 +719,11 @@ int subprog_build_dom_info(struct bpf_verifier_env *env,
 	if (ret < 0)
 		goto free_dominfo;
 
-	ret = calc_dfs_tree(env, subprog, &di, false);
+	ret = calc_dfs_tree(env, allocator, subprog, &di, false);
 	if (ret < 0)
 		goto free_dominfo;
 
-	calc_idoms(subprog, &di, false);
+	calc_idoms(allocator, subprog, &di, false);
 	ret = idoms_to_doms(subprog, &di);
 	if (ret < 0)
 		goto free_dominfo;
@@ -694,25 +737,33 @@ int subprog_build_dom_info(struct bpf_verifier_env *env,
 	return ret;
 }
 
-bool subprog_has_loop(struct bpf_subprog_info *subprog)
+bool subprog_has_loop(struct cfg_node_allocator *allocator,
+		      struct bpf_subprog_info *subprog)
 {
 	int lane_len = BITS_TO_LONGS(subprog->bb_num - 2);
-	struct list_head *bb_list = &subprog->bbs;
-	struct bb_node *bb, *entry_bb;
+	struct bb_node *bb, *entry_bb, *exit_bb;
+	void **bb_list = (void **)&subprog->bbs;
 	struct edge_node *e;
 
 	entry_bb = entry_bb(bb_list);
-	bb = bb_next(entry_bb);
-	list_for_each_entry_from(bb, &exit_bb(bb_list)->l, l)
-		list_for_each_entry(e, &bb->e_prevs, l) {
+	exit_bb = exit_bb(bb_list);
+	bb = bb_next(allocator, entry_bb);
+	while (bb && bb != exit_bb) {
+		e = cfg_node_delink(allocator, &bb->e_prevs);
+		while (e) {
 			struct bb_node *latch = e->src;
 
 			if (latch != entry_bb &&
 			    test_bit(bb->idx,
 				     subprog->dtree + latch->idx * lane_len))
 				return true;
+
+			e = cfg_node_delink(allocator, &e->link);
 		}
 
+		bb = bb_next(allocator, bb);
+	}
+
 	return false;
 }
 
@@ -756,37 +807,34 @@ int add_subprog(struct bpf_verifier_env *env, int off)
 }
 
 struct callee_iter {
-	struct list_head *list_head;
+	struct cedge_node *head;
 	struct cedge_node *callee;
 };
 
-#define first_callee(c_list)	list_first_entry(c_list, struct cedge_node, l)
-#define next_callee(c)		list_next_entry(c, l)
-
 static bool ci_end_p(struct callee_iter *ci)
 {
-	return &ci->callee->l == ci->list_head;
+	return !ci->callee;
 }
 
-static void ci_next(struct callee_iter *ci)
+static void ci_next(struct cfg_node_allocator *allocator,
+		    struct callee_iter *ci)
 {
 	struct cedge_node *c = ci->callee;
 
-	ci->callee = next_callee(c);
+	ci->callee = cfg_node_delink(allocator, &c->link);
 }
 
 #define EXPLORING	1
 #define EXPLORED	2
 int cgraph_check_recursive_unreachable(struct bpf_verifier_env *env,
+				       struct cfg_node_allocator *allocator,
 				       struct bpf_subprog_info *subprog)
 {
-	struct callee_iter *stack;
+	int sp = 0, idx = 0, ret, *status;
+	struct callee_iter *stack, ci;
 	struct cedge_node *callee;
-	int sp = 0, idx = 0, ret;
-	struct callee_iter ci;
-	int *status;
 
-	stack = kmalloc_array(128, sizeof(struct callee_iter), GFP_KERNEL);
+	stack = kmalloc_array(64, sizeof(struct callee_iter), GFP_KERNEL);
 	if (!stack)
 		return -ENOMEM;
 	status = kcalloc(env->subprog_cnt, sizeof(int), GFP_KERNEL);
@@ -794,8 +842,8 @@ int cgraph_check_recursive_unreachable(struct bpf_verifier_env *env,
 		kfree(stack);
 		return -ENOMEM;
 	}
-	ci.callee = first_callee(&subprog->callees);
-	ci.list_head = &subprog->callees;
+	ci.head = subprog->callees;
+	ci.callee = subprog->callees;
 	status[0] = EXPLORING;
 
 	while (1) {
@@ -810,20 +858,19 @@ int cgraph_check_recursive_unreachable(struct bpf_verifier_env *env,
 
 			status[idx] = EXPLORING;
 
-			if (sp == 127) {
+			if (sp == 64) {
 				bpf_verifier_log_write(env, "cgraph - call frame too deep\n");
 				ret = -EINVAL;
 				goto err_free;
 			}
 
 			stack[sp++] = ci;
-			ci.callee = first_callee(&subprog[idx].callees);
-			ci.list_head = &subprog[idx].callees;
+			ci.head = subprog[idx].callees;
+			ci.callee = subprog[idx].callees;
 		}
 
-		if (!list_empty(ci.list_head))
-			status[first_callee(ci.list_head)->caller_idx] =
-				EXPLORED;
+		if (ci.head)
+			status[ci.head->caller_idx] = EXPLORED;
 		else
 			/* leaf func. */
 			status[idx] = EXPLORED;
@@ -832,7 +879,7 @@ int cgraph_check_recursive_unreachable(struct bpf_verifier_env *env,
 			break;
 
 		ci = stack[--sp];
-		ci_next(&ci);
+		ci_next(allocator, &ci);
 	}
 
 	for (idx = 0; idx < env->subprog_cnt; idx++)
@@ -851,27 +898,37 @@ int cgraph_check_recursive_unreachable(struct bpf_verifier_env *env,
 
 int subprog_append_callee(struct bpf_verifier_env *env,
 			  struct cfg_node_allocator *allocator,
-			  struct list_head *callees_list,
-			  int caller_idx, int off)
+			  void **callees_list, int caller_idx, int off)
 {
-	int callee_idx = find_subprog(env, off);
+	int callee_idx = find_subprog(env, off), ret;
 	struct cedge_node *new_callee, *callee;
+	struct mem_frag frag;
 
 	if (callee_idx < 0)
 		return callee_idx;
 
-	list_for_each_entry(callee, callees_list, l) {
+	callee = (struct cedge_node *)*callees_list;
+	while (callee) {
 		if (callee->callee_idx == callee_idx)
 			return 0;
+
+		callee = cfg_node_delink(allocator, &callee->link);
 	}
 
-	new_callee = get_single_cedge_node(allocator);
-	if (!new_callee)
-		return -ENOMEM;
+	ret = get_single_cedge_node(allocator, &frag);
+	if (ret < 0)
+		return ret;
 
+	new_callee = frag.p;
 	new_callee->caller_idx = caller_idx;
 	new_callee->callee_idx = callee_idx;
-	list_add_tail(&new_callee->l, callees_list);
+	callee = (struct cedge_node *)*callees_list;
+	if (!callee) {
+		*callees_list = new_callee;
+	} else {
+		new_callee->link = callee->link;
+		callee->link = frag.link;
+	}
 
 	return 0;
 }
diff --git a/kernel/bpf/cfg.h b/kernel/bpf/cfg.h
index 1868587..cb833bd 100644
--- a/kernel/bpf/cfg.h
+++ b/kernel/bpf/cfg.h
@@ -8,9 +8,11 @@
 #ifndef __BPF_CFG_H__
 #define __BPF_CFG_H__
 
+#define MAX_POOL_NUM	32
+
 struct cfg_node_allocator {
-	struct list_head pools;
-	struct node_pool *cur_free_pool;
+	void *base[MAX_POOL_NUM];
+	u8 pool_cnt;
 };
 
 int add_subprog(struct bpf_verifier_env *env, int off);
@@ -18,22 +20,23 @@ struct cfg_node_allocator {
 int cfg_node_allocator_init(struct cfg_node_allocator *allocator,
 			    int bb_num_esti, int cedge_num_esti);
 int cgraph_check_recursive_unreachable(struct bpf_verifier_env *env,
+				       struct cfg_node_allocator *allocator,
 				       struct bpf_subprog_info *subprog);
 int find_subprog(struct bpf_verifier_env *env, int off);
 int subprog_add_bb_edges(struct cfg_node_allocator *allocator,
-			 struct bpf_insn *insns, struct list_head *bb_list);
+			 struct bpf_insn *insns, void **bb_list);
 int subprog_append_bb(struct cfg_node_allocator *allocator,
-		      struct list_head *bb_list, int head);
+		      void **bb_list, int head);
 int subprog_append_callee(struct bpf_verifier_env *env,
 			  struct cfg_node_allocator *allocator,
-			  struct list_head *bb_list, int caller_idx, int off);
+			  void **callees, int caller_idx, int off);
 int subprog_build_dom_info(struct bpf_verifier_env *env,
+			   struct cfg_node_allocator *allocator,
 			   struct bpf_subprog_info *subprog);
-int subprog_fini_bb(struct cfg_node_allocator *allocator,
-		    struct list_head *bb_list, int subprog_end);
-bool subprog_has_loop(struct bpf_subprog_info *subprog);
-int subprog_init_bb(struct cfg_node_allocator *allocator,
-		    struct list_head *bb_list, int subprog_start);
+bool subprog_has_loop(struct cfg_node_allocator *allocator,
+		      struct bpf_subprog_info *subprog);
+int subprog_init_bb(struct cfg_node_allocator *allocator, void **bb_list,
+		    int subprog_start, int subprog_end);
 void subprog_free(struct bpf_subprog_info *subprog, int end_idx);
 
 #endif
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 5a5016d..90619da 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -766,9 +766,9 @@ static int check_subprogs(struct bpf_verifier_env *env)
 {
 	int i, ret, subprog_start, subprog_end, off, cur_subprog = 0;
 	struct bpf_subprog_info *subprog = env->subprog_info;
-	struct list_head *cur_bb_list, *cur_callee_list;
 	struct bpf_insn *insn = env->prog->insnsi;
 	int cedge_num_esti = 0, bb_num_esti = 3;
+	void **cur_bb_list, **cur_callee_list;
 	struct cfg_node_allocator allocator;
 	int insn_cnt = env->prog->len;
 	u8 code;
@@ -813,24 +813,19 @@ static int check_subprogs(struct bpf_verifier_env *env)
 	 */
 	subprog[env->subprog_cnt].start = insn_cnt;
 
-	for (i = 0; i < env->subprog_cnt; i++) {
+	for (i = 0; i < env->subprog_cnt; i++)
 		if (env->log.level > 1)
 			verbose(env, "func#%d @%d\n", i, subprog[i].start);
 
-		/* Don't init callees inside add_subprog which will sort the
-		 * array which breaks list link.
-		 */
-		INIT_LIST_HEAD(&subprog[i].callees);
-	}
-
 	ret = cfg_node_allocator_init(&allocator, bb_num_esti,
 				      cedge_num_esti);
 	if (ret < 0)
 		return ret;
 	subprog_start = subprog[cur_subprog].start;
 	subprog_end = subprog[cur_subprog + 1].start;
-	cur_bb_list = &subprog[cur_subprog].bbs;
-	ret = subprog_init_bb(&allocator, cur_bb_list, subprog_start);
+	cur_bb_list = (void **)&subprog[cur_subprog].bbs;
+	ret = subprog_init_bb(&allocator, cur_bb_list, subprog_start,
+			      subprog_end);
 	if (ret < 0)
 		goto free_nodes;
 	cur_callee_list = &env->subprog_info[cur_subprog].callees;
@@ -909,20 +904,17 @@ static int check_subprogs(struct bpf_verifier_env *env)
 				goto free_nodes;
 			}
 
-			ret = subprog_fini_bb(&allocator, cur_bb_list,
-					      subprog_end);
-			if (ret < 0)
-				goto free_nodes;
 			ret = subprog_add_bb_edges(&allocator, insn,
 						   cur_bb_list);
 			if (ret < 0)
 				goto free_nodes;
 			subprog[cur_subprog].bb_num = ret;
-			ret = subprog_build_dom_info(env,
+			ret = subprog_build_dom_info(env, &allocator,
 						     &subprog[cur_subprog]);
 			if (ret < 0)
 				goto free_nodes;
-			if (subprog_has_loop(&subprog[cur_subprog])) {
+			if (subprog_has_loop(&allocator,
+					     &subprog[cur_subprog])) {
 				verbose(env, "cfg - loop detected");
 				ret = -EINVAL;
 				goto free_nodes;
@@ -931,9 +923,11 @@ static int check_subprogs(struct bpf_verifier_env *env)
 			cur_subprog++;
 			if (cur_subprog < env->subprog_cnt) {
 				subprog_end = subprog[cur_subprog + 1].start;
-				cur_bb_list = &subprog[cur_subprog].bbs;
+				cur_bb_list =
+					(void **)&subprog[cur_subprog].bbs;
 				ret = subprog_init_bb(&allocator, cur_bb_list,
-						      subprog_start);
+						      subprog_start,
+						      subprog_end);
 				if (ret < 0)
 					goto free_nodes;
 				cur_callee_list = &subprog[cur_subprog].callees;
@@ -941,7 +935,8 @@ static int check_subprogs(struct bpf_verifier_env *env)
 		}
 	}
 
-	ret = cgraph_check_recursive_unreachable(env, env->subprog_info);
+	ret = cgraph_check_recursive_unreachable(env, &allocator,
+						 env->subprog_info);
 	if (ret < 0)
 		goto free_nodes;
 

* [RFC PATCH 11/16] bpf: cfg: detect irreducible loop using Eric Stoltz algorithm
  2018-06-01  9:32 [RFC PATCH 00/16] bpf, bounded loop support work in progress John Fastabend
                   ` (9 preceding siblings ...)
  2018-06-01  9:33 ` [RFC PATCH 10/16] bpf: cfg: reduce memory usage by using singly list + compat pointer John Fastabend
@ 2018-06-01  9:33 ` John Fastabend
  2018-06-01  9:33 ` [RFC PATCH 12/16] bpf: cfg: pretty print CFG and DOM John Fastabend
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: John Fastabend @ 2018-06-01  9:33 UTC (permalink / raw)
  To: alexei.starovoitov, daniel, davem; +Cc: netdev

From: Jiong Wang <jiong.wang@netronome.com>

We don't want to do any further loop bounds analysis for irreducible loops,
so just reject programs containing them.

The current DOM based loop detection can't detect irreducible loops. We use
the algorithm given by Eric Stoltz to detect them. The algorithm requires
DOM info.

Algorithm pseudo code:

	test_dom(a,b) returns TRUE if a dominates b
	push( v ) pushes v onto a reverse topologically-sorted stack

	top_sort( entry node )

	top_sort( node v ) {
		mark_visited( v );
		Visit all successors s of v {
			if (mark_visited(s) &&
			    !pushed(s) &&
			    !test_dom(s, v)) {
				Irreducible_Graph = TRUE;
				Exit -- no need to continue now!
			}
			if(!mark_visited(s))
				top_sort( s );
		}
		push( v );
	}
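
A rough recursive C rendering of the pseudo code above (illustration only;
test_dom() and toy_succs() are assumed helpers, and the in-kernel version in
cfg.c is iterative and works directly on the CFG/DOM structures):

    #include <stdbool.h>

    #define TOY_MAX_BB      64

    extern bool test_dom(int a, int b);             /* does a dominate b?  */
    extern int toy_succs(int v, int succs[]);       /* successors of bb v  */

    static bool visited[TOY_MAX_BB], pushed[TOY_MAX_BB];
    static bool irreducible;

    static void top_sort(int v)
    {
            int succs[TOY_MAX_BB], n, i;

            visited[v] = true;
            n = toy_succs(v, succs);
            for (i = 0; i < n && !irreducible; i++) {
                    int s = succs[i];

                    /* Back edge whose head does not dominate the tail:
                     * the loop is irreducible.
                     */
                    if (visited[s] && !pushed[s] && !test_dom(s, v)) {
                            irreducible = true;
                            return;
                    }
                    if (!visited[s])
                            top_sort(s);
            }
            pushed[v] = true;       /* v placed in reverse topological order */
    }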

Tested by:

int xdp_prog1(struct xdp_md *ctx) {
	char *data      = (char *)(unsigned long)(ctx->data);
	char *data_end  = (char *)(unsigned long)(ctx->data_end);
	if (data + 60 > (unsigned char *)(unsigned long)ctx->data_end)
		return XDP_ABORTED;

	char *p = data + 14;
	char *mark = *(char **)p;
	unsigned long mark2 = *(unsigned long *)(p + 24);

	mark += 1;

	if (p + 1 > data) {
B:
		mark += 2;

		if (mark < mark2 * 3)
			goto C;
	} else if (mark < data) {
C:
		mark += 1;
		if (mark < 1000)
			goto  B;
	}

	return XDP_PASS;
}

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 kernel/bpf/cfg.c      |  117 ++++++++++++++++++++++++++++++++++++++++++++++---
 kernel/bpf/cfg.h      |    5 ++
 kernel/bpf/verifier.c |    9 ++++
 3 files changed, 123 insertions(+), 8 deletions(-)

diff --git a/kernel/bpf/cfg.c b/kernel/bpf/cfg.c
index 2a9b513..c67541c 100644
--- a/kernel/bpf/cfg.c
+++ b/kernel/bpf/cfg.c
@@ -767,6 +767,109 @@ bool subprog_has_loop(struct cfg_node_allocator *allocator,
 	return false;
 }
 
+/* We don't want to do any further loop bounds analysis for irreducible loop,
+ * so just reject programs containing it.
+ *
+ * The current DOM based loop detection can't detect irreducible loop. We'd
+ * use the algorithm given by Eric Stoltz to detect it. The algorithm requires
+ * DOM info.
+ *
+ * Algorithm pseudo code:
+ *
+ *   test_dom(a,b) returns TRUE if a dominates b
+ *   push( v ) pushes v onto a reverse topologically-sorted stack
+ *
+ *   top_sort( entry node )
+ *
+ *   top_sort( node v ) {
+ *	mark_visited( v );
+ *	Visit all successors s of v {
+ *		if (mark_visited(s) && !pushed(s) && !test_dom(s, v)) {
+ *			Irreducible_Graph = TRUE;
+ *			Exit -- no need to continue now!
+ *		}
+ *		if(!mark_visited(s))
+ *			top_sort( s );
+ *	}
+ *	push( v );
+ *   }
+ */
+int subprog_has_irreduciable_loop(struct cfg_node_allocator *allocator,
+				  struct bpf_subprog_info *subprog)
+{
+	u16 bb_num = subprog->bb_num - 2, sp = 0, *status;
+	void **bb_list = (void **)&subprog->bbs;
+	struct edge_node **stack, *e, *prev_e;
+	int lane_len = BITS_TO_LONGS(bb_num);
+	struct bb_node *entry_bb, *exit_bb;
+	int found = 0;
+
+	stack = kmalloc_array(bb_num, sizeof(struct edge_node *), GFP_KERNEL);
+	if (!stack)
+		return -ENOMEM;
+
+	status = kcalloc(bb_num, sizeof(u16), GFP_KERNEL);
+	if (!status) {
+		kfree(stack);
+		return -ENOMEM;
+	}
+
+	entry_bb = entry_bb(bb_list);
+	exit_bb = exit_bb(bb_list);
+	e = cfg_node_delink(allocator, &entry_bb->e_succs);
+	prev_e = e;
+
+	while (1) {
+		struct bb_node *bb_dst;
+
+		while (e) {
+			bb_dst = e->dst;
+
+			if (bb_dst == exit_bb ||
+			    status[bb_dst->idx] == DFS_NODE_EXPLORED) {
+				prev_e = e;
+				e = cfg_node_delink(allocator, &e->link);
+				continue;
+			}
+
+			if (status[bb_dst->idx] == DFS_NODE_EXPLORING) {
+				u16 src_idx = e->src->idx;
+				unsigned long *bb_map;
+
+				bb_map = subprog->dtree + src_idx * lane_len;
+				if (!test_bit(bb_dst->idx, bb_map)) {
+					found = 1;
+					goto free_and_ret;
+				} else {
+					prev_e = e;
+					e = cfg_node_delink(allocator,
+							    &e->link);
+					continue;
+				}
+			}
+
+			status[bb_dst->idx] = DFS_NODE_EXPLORING;
+			stack[sp++] = e;
+			/* e should never be NULL as it couldn't be exit_bb. */
+			e = cfg_node_delink(allocator, &bb_dst->e_succs);
+		}
+
+		if (prev_e->src != entry_bb)
+			status[prev_e->src->idx] = DFS_NODE_EXPLORED;
+
+		if (!sp)
+			break;
+
+		e = stack[--sp];
+		prev_e = e;
+		e = cfg_node_delink(allocator, &e->link);
+	}
+
+free_and_ret:
+	kfree(stack);
+	kfree(status);
+
+	return found;
+}
+
 static int cmp_subprogs(const void *a, const void *b)
 {
 	return ((struct bpf_subprog_info *)a)->start -
@@ -824,8 +927,6 @@ static void ci_next(struct cfg_node_allocator *allocator,
 	ci->callee = cfg_node_delink(allocator, &c->link);
 }
 
-#define EXPLORING	1
-#define EXPLORED	2
 int cgraph_check_recursive_unreachable(struct bpf_verifier_env *env,
 				       struct cfg_node_allocator *allocator,
 				       struct bpf_subprog_info *subprog)
@@ -844,19 +945,19 @@ int cgraph_check_recursive_unreachable(struct bpf_verifier_env *env,
 	}
 	ci.head = subprog->callees;
 	ci.callee = subprog->callees;
-	status[0] = EXPLORING;
+	status[0] = DFS_NODE_EXPLORING;
 
 	while (1) {
 		while (!ci_end_p(&ci)) {
 			callee = ci.callee;
 			idx = callee->callee_idx;
-			if (status[idx] == EXPLORING) {
+			if (status[idx] == DFS_NODE_EXPLORING) {
 				bpf_verifier_log_write(env, "cgraph - recursive call\n");
 				ret = -EINVAL;
 				goto err_free;
 			}
 
-			status[idx] = EXPLORING;
+			status[idx] = DFS_NODE_EXPLORING;
 
 			if (sp == 64) {
 				bpf_verifier_log_write(env, "cgraph - call frame too deep\n");
@@ -870,10 +971,10 @@ int cgraph_check_recursive_unreachable(struct bpf_verifier_env *env,
 		}
 
 		if (ci.head)
-			status[ci.head->caller_idx] = EXPLORED;
+			status[ci.head->caller_idx] = DFS_NODE_EXPLORED;
 		else
 			/* leaf func. */
-			status[idx] = EXPLORED;
+			status[idx] = DFS_NODE_EXPLORED;
 
 		if (!sp)
 			break;
@@ -883,7 +984,7 @@ int cgraph_check_recursive_unreachable(struct bpf_verifier_env *env,
 	}
 
 	for (idx = 0; idx < env->subprog_cnt; idx++)
-		if (status[idx] != EXPLORED) {
+		if (status[idx] != DFS_NODE_EXPLORED) {
 			bpf_verifier_log_write(env, "cgraph - unreachable subprog\n");
 			ret = -EINVAL;
 			goto err_free;
diff --git a/kernel/bpf/cfg.h b/kernel/bpf/cfg.h
index cb833bd..8363406 100644
--- a/kernel/bpf/cfg.h
+++ b/kernel/bpf/cfg.h
@@ -35,8 +35,13 @@ int subprog_build_dom_info(struct bpf_verifier_env *env,
 			   struct bpf_subprog_info *subprog);
 bool subprog_has_loop(struct cfg_node_allocator *allocator,
 		      struct bpf_subprog_info *subprog);
+int subprog_has_irreduciable_loop(struct cfg_node_allocator *allocator,
+				  struct bpf_subprog_info *subprog);
 int subprog_init_bb(struct cfg_node_allocator *allocator, void **bb_list,
 		    int subprog_start, int subprog_end);
 void subprog_free(struct bpf_subprog_info *subprog, int end_idx);
 
+#define DFS_NODE_EXPLORING	1
+#define DFS_NODE_EXPLORED	2
+
 #endif
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 90619da..49895f3 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -913,6 +913,15 @@ static int check_subprogs(struct bpf_verifier_env *env)
 						     &subprog[cur_subprog]);
 			if (ret < 0)
 				goto free_nodes;
+			ret = subprog_has_irreduciable_loop(&allocator,
+							&subprog[cur_subprog]);
+			if (ret < 0)
+				goto free_nodes;
+			if (ret > 0) {
+				verbose(env, "cfg - irreducible loop detected");
+				ret = -EINVAL;
+				goto free_nodes;
+			}
 			if (subprog_has_loop(&allocator,
 					     &subprog[cur_subprog])) {
 				verbose(env, "cfg - loop detected");


* [RFC PATCH 12/16] bpf: cfg: pretty print CFG and DOM
From: John Fastabend @ 2018-06-01  9:33 UTC (permalink / raw)
  To: alexei.starovoitov, daniel, davem; +Cc: netdev

Add functions to pretty print the CFG and DOM table. We can also
add contrib scripts to convert this output to dot format so tools
can visualize it. I at least found this helpful. Will add the
scripts in a tools/bpf follow-up patch.

For development we always print these, but the noisy routines should
be put behind a verbose bit and/or only print when an error has
occurred in the bounded-loop logic.
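
As an illustration of the kind of contrib script meant above, a tiny
standalone sketch (hypothetical helper, not part of this patch) that turns
an edge list into Graphviz dot input could look like:

	#include <stdio.h>

	struct cfg_edge { int src, dst; };

	static void print_dot(const struct cfg_edge *edges, int n)
	{
		int i;

		printf("digraph cfg {\n");
		for (i = 0; i < n; i++)
			printf("\tbb%d -> bb%d;\n", edges[i].src, edges[i].dst);
		printf("}\n");
	}

	int main(void)
	{
		/* example CFG: 0->1, 1->2, 2->1, 1->3 */
		const struct cfg_edge edges[] = {
			{ 0, 1 }, { 1, 2 }, { 2, 1 }, { 1, 3 },
		};

		print_dot(edges, 4);
		return 0;
	}

The output can be fed straight to "dot -Tpng" to render the CFG.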

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 kernel/bpf/cfg.c      |   54 +++++++++++++++++++++++++++++++++++++++++++++++++
 kernel/bpf/cfg.h      |    5 +++++
 kernel/bpf/verifier.c |    2 ++
 3 files changed, 61 insertions(+)

diff --git a/kernel/bpf/cfg.c b/kernel/bpf/cfg.c
index c67541c..0d50646 100644
--- a/kernel/bpf/cfg.c
+++ b/kernel/bpf/cfg.c
@@ -62,6 +62,60 @@ static struct bb_node *bb_next(struct cfg_node_allocator *allocator,
 	return (struct bb_node *)cfg_node_delink(allocator, &bb->link);
 }
 
+void cfg_pretty_print(struct bpf_verifier_env *env,
+		      struct cfg_node_allocator *allocator,
+		      struct bpf_subprog_info *subprog)
+{
+	void **bb_list = (void **)&subprog->bbs;
+	struct bb_node *bb, *exit_bb;
+
+	bb = entry_bb(bb_list);
+	exit_bb = exit_bb(bb_list);
+
+	bpf_verifier_log_write(env, "CFG: ");
+	while (bb && bb != exit_bb) {
+		struct bb_node *next_bb = bb_next(allocator, bb);
+		struct edge_node *e;
+
+		e = cfg_node_delink(allocator, &bb->e_succs);
+		while (e) {
+			struct bb_node *dst = e->dst;
+			int tail = next_bb->head - 1;
+			struct bb_node *dst_next;
+			int dst_tail;
+
+			dst_next = bb_next(allocator, dst);
+			dst_tail = dst_next ? dst_next->head - 1 : 65534;
+
+			bpf_verifier_log_write(env, " %i[%i,%i] -> %i[%i,%i] ",
+					       bb->idx, bb->head, tail, dst->idx, dst->head, dst_tail);
+			e = cfg_node_delink(allocator, &e->link);
+		}
+		bb = bb_next(allocator, bb);
+	}
+	bpf_verifier_log_write(env, "\n");
+}
+
+void dom_pretty_print(struct bpf_verifier_env *env,
+		      struct bpf_subprog_info *subprog)
+{
+	int lane_len, bb_num = subprog->bb_num - 2;
+	int i, j;
+
+	lane_len = BITS_TO_LONGS(bb_num);
+
+	bpf_verifier_log_write(env, "DOM:\n");
+	for (i = 0; i < bb_num; i++) {
+		for (j = 0; j < bb_num; j++) {
+			bpf_verifier_log_write(env, " %i ",
+			    test_bit(j,
+				     subprog->dtree + i * lane_len) ? 1 : 0);
+		}
+		bpf_verifier_log_write(env, "\n");
+	}
+	bpf_verifier_log_write(env, "\n");
+}
+
 struct dom_info {
 	u16 *dfs_parent;
 	u16 *dfs_order;
diff --git a/kernel/bpf/cfg.h b/kernel/bpf/cfg.h
index 8363406..44dcabb 100644
--- a/kernel/bpf/cfg.h
+++ b/kernel/bpf/cfg.h
@@ -37,6 +37,11 @@ bool subprog_has_loop(struct cfg_node_allocator *allocator,
 		      struct bpf_subprog_info *subprog);
 int subprog_has_irreduciable_loop(struct cfg_node_allocator *allocator,
 				  struct bpf_subprog_info *subprog);
+void cfg_pretty_print(struct bpf_verifier_env *env,
+		      struct cfg_node_allocator *allocator,
+		      struct bpf_subprog_info *subprog);
+void dom_pretty_print(struct bpf_verifier_env *env,
+		      struct bpf_subprog_info *subprog);
 int subprog_init_bb(struct cfg_node_allocator *allocator, void **bb_list,
 		    int subprog_start, int subprog_end);
 void subprog_free(struct bpf_subprog_info *subprog, int end_idx);
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 49895f3..610559a 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -913,6 +913,8 @@ static int check_subprogs(struct bpf_verifier_env *env)
 						     &subprog[cur_subprog]);
 			if (ret < 0)
 				goto free_nodes;
+			cfg_pretty_print(env, &allocator, &subprog[cur_subprog]);
+			dom_pretty_print(env, &subprog[cur_subprog]);
 			ret = subprog_has_irreduciable_loop(&allocator,
 							&subprog[cur_subprog]);
 			if (ret < 0)


* [RFC PATCH 13/16] bpf: verifier, can ptr range be calculated with scalar ALU op
From: John Fastabend @ 2018-06-01  9:33 UTC (permalink / raw)
  To: alexei.starovoitov, daniel, davem; +Cc: netdev

At the moment any BPF_ADD or BPF_NEG with a pointer type will create
a new pointer and destroy the register range. Then any memory access
after this will fail because it looks like no bounds checking has
been done, even if it was previously done on the other pointer. So
patterns like this fail,

   ptrA = pkt_data;

   if (ptrA + expected_payload > data_end)
          return 0;

   ptrA  += 1
   accum += *ptrA
   ptrA  += 1
   accum += *ptrA

I took a quick look over the code and it seems that if the ALU op
has a umax_val that is less than the previous range, we can simply
reduce the range by that amount and carry it into the new ptr.

This patch does the above.
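
As a sketch of the rule this introduces (illustrative helper only; the
real change lives in adjust_ptr_min_max_vals):

	/* new range after adding a bounded scalar to a packet pointer, where
	 * umax_val is the largest possible value of the scalar being added
	 */
	static u64 pkt_range_after_add(u64 old_range, u64 umax_val)
	{
		return umax_val > old_range ? 0 : old_range - umax_val;
	}

So when a bounded scalar is added to ptrA after the check against data_end,
the verified range shrinks by the scalar's worst case instead of being wiped
out, keeping the following loads verifiable.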
---
 kernel/bpf/verifier.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 610559a..c41f587 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -2830,8 +2830,10 @@ static int adjust_ptr_min_max_vals(struct bpf_verifier_env *env,
 		dst_reg->off = ptr_reg->off;
 		if (reg_is_pkt_pointer(ptr_reg)) {
 			dst_reg->id = ++env->id_gen;
-			/* something was added to pkt_ptr, set range to zero */
-			dst_reg->range = 0;
+			if (umax_val > dst_reg->range)
+				dst_reg->range = 0;
+			else
+				dst_reg->range -= umax_val;
 		}
 		break;
 	case BPF_SUB:


* [RFC PATCH 14/16] bpf: verifier, add initial support to allow bounded loops
From: John Fastabend @ 2018-06-01  9:33 UTC (permalink / raw)
  To: alexei.starovoitov, daniel, davem; +Cc: netdev

Allow bounded loops if we can become convinced they will in fact
terminate.

At a high level this is done in steps; the first two steps are done
outside of the do_check routine which "runs" the instructions.

 1. Use dom tree to find loops.
 2. Find the loop induction variable in the loop. (I use loop
    induction variable here to refer to the variable used
    in the loop guard. For example in `do { .... i++; ...} while(i)`
    we refer to 'i' as the loop induction variable and to any other
    variables as just induction variables. I need to check
    what the standard terminology is for this.)

The final steps are done 'inline' with do_check when the first
instruction of a loop is reached.

 3. Verify that the bounds of the loop induction variable do in
    fact ensure the loop will terminate at some point, e.g.
    if it is an increasing induction variable it must have a lower bound.
 4. Deduce the min/max values of the induction variable so
    that when the loop is run all possible worst case values are
    tested on the first iteration through the loop, giving the
    state pruning logic some chance of pruning future trips
    through the loop.
 5. When exiting the loop clean up loop markers (tbd)

So that is the high level sketch; now the details:

1. Finding the loops using standard dom queries is shown in
   previous patches, so it will not be discussed further.

2. Finding the loop induction variable. This is the variable
   easily identified in C constructs by any of the following,

      for (i = 0; i < j; i++) {...}
      while (i) { ... i++; ...}
      do {... i++; ... } while(i)

   and so on. The first problem with finding the loop induction
   variable in BPF is locating it. LLVM clang/bpf has a set of
   patterns it tends to generate for various C loop constructs.
   This patch matches the following simple one (in pseudocode),

       hdr:
            <do stuff>
            if (i != x) goto hdr

   where the comparison at the end can match any of the JMP
   ops and can also be of the form '(i op imm)'. The key is that the
   jump is the last insn in the loop. For example, another
   common pattern generated by LLVM, in my code at least, is
   the following (again in pseudocode),

       hdr:
           <do stuff>
           if (i != x) goto out
           <do more stuff>
           goto hdr
       out:
           <outside loop>

  This patch is still fairly naive on this front and does not
  handle the second case yet. It seems, though, that scanning the
  loop for conditionals that exit the loop and testing whether
  either the src or dst reg is an induction variable should be
  enough to resolve this.

  For each potential loop induction variable we walk all paths
  back to the header of the loop and test (a) that it is always
  increasing or decreasing (but not both!) and (b) that all
  paths agree, as sketched below.
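
  A standalone sketch of that per-path test (userspace code; names are
  illustrative and not the verifier's):

	#include <stdio.h>

	enum indvar { INDVAR_UNKNOWN, INDVAR_INVALID, INDVAR_INC, INDVAR_DEC };

	/* fold one "i += imm" step on a path into the running classification */
	static enum indvar classify_step(enum indvar cur, int imm_added)
	{
		enum indvar sign;

		if (imm_added == 0)		/* adding zero changes nothing */
			return cur;
		sign = imm_added > 0 ? INDVAR_INC : INDVAR_DEC;
		if (cur == INDVAR_UNKNOWN || cur == sign)
			return sign;
		return INDVAR_INVALID;		/* mixed directions on one path */
	}

	int main(void)
	{
		/* i += 1; i += 2;  -> increasing */
		enum indvar a = classify_step(classify_step(INDVAR_UNKNOWN, 1), 2);
		/* i += 1; i -= 1;  -> mixed, rejected */
		enum indvar b = classify_step(classify_step(INDVAR_UNKNOWN, 1), -1);

		printf("a=%d b=%d\n", a, b);	/* a == INDVAR_INC, b == INDVAR_INVALID */
		return 0;
	}

  Every path through the loop body is folded like this, and on top of that
  all paths must end up in the same state.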

  Currently, we only search for the loop induction variable.
  Future patches will expand this for additional induction
  variables. This will help the state pruning logic later.
  Also we only allow for induction variables generated
  from constants at the moment. This will be solved by
  tracking multiple induction variables.

  Even with all these limitations I've been able to use this
  to get some basic loops running. See subsequent patch.

  Once we have found a loop induction variable we can test
  that it is compatible with the comparison operation. For
  example 'while (i > 0) { ... i++ }' is clearly not a valid
  loop.

3. Now that we believe we have a valid loop induction variable we
   need to confirm we have good bounds when do_check hits the
   first insn in the loop. This ensures that the loop has a
   known starting point. If the loop induction variable is
   increasing we ensure it has a min value and vice versa.
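
   A sketch of that check (illustrative helper, not the verifier code;
   only the signed side is shown):

	#include <limits.h>
	#include <stdbool.h>
	#include <stdio.h>

	/* an increasing induction variable needs a known floor to start from,
	 * a decreasing one needs a known ceiling
	 */
	static bool indvar_has_known_start(bool increasing,
					   long long smin, long long smax)
	{
		return increasing ? smin != LLONG_MIN : smax != LLONG_MAX;
	}

	int main(void)
	{
		/* increasing variable whose minimum is unknown: rejected */
		printf("%d\n", indvar_has_known_start(true, LLONG_MIN, 100));
		/* increasing variable known to start at 0 or above: accepted */
		printf("%d\n", indvar_has_known_start(true, 0, 100));
		return 0;
	}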

4. If we stopped here the loop would terminate but would
   not likely prune any states so each iteration would
   contribute to the verifier complexity and many loops
   would hit the complexity limit.

   To help the state pruning logic we set the induction variables
   to their min/max values, which we can calculate because we
   know the final exit condition and how the induction variable
   increases/decreases for each trip in the loop.

   There is one noteworthy problem though. When the ALU operation
   is run over these new ranges it would normally increment both
   the min and the max. However, this is not correct anymore
   because one of the bounds is a result of the conditional that
   exits the loop and so cannot be passed, and the other is a
   result of the initial condition entering the loop. So to
   resolve this we only increment the bound from the initial
   condition and hold fixed the upper/lower bound that is
   attributed to the conditional. This works, except tnum needs
   to be improved to handle this. For right now I hacked this by
   forcing tnum values using tnum_range. This might be OK (it
   seems to work on scalars at least) but seems a bit like an
   ugly hack. More thinking needed.

   With the above I've had luck getting the pruning logic
   to work for many actual C code loops.
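
   To make the asymmetric update concrete, here is a sketch of what step 4
   asks of the ALU handling (illustrative only; the real change is in
   adjust_scalar_min_max_vals):

	/* advance only the bound that came from the loop entry; the bound
	 * that came from the exit conditional stays pinned at the deduced
	 * worst case, so the first pass keeps covering later iterations
	 */
	static void indvar_step_signed(long long *smin, long long *smax,
				       long long step, int increasing)
	{
		if (increasing)
			*smin += step;	/* smax stays at the guard bound */
		else
			*smax += step;	/* smin stays at the guard bound */
	}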

5. Finally, when we exit the loop we have to clear the flag on
   the register indicating it's an induction variable. That flag
   is what kicks the logic to only update one of min/max and hold
   the other fixed. (tbd: at the moment, if these variables are used
   after the loop we end up spilling these vars in and out of the
   stack while tracking bounds, which may make the state pruning
   logic less effective. In practice the flag is cleared fairly
   quickly, usually by a mark_set_reg_unknown or the likes.)

All right that is it.

The big todos are the following:

- nested loops are broken at the moment; we could just reject
  nested loops in the first iteration of this work.
- only doing simple pattern matching for one loop type generated
  by LLVM; need to generalize to find most types (or at least the
  common ones).
- I need to check overflow cases in a few places
- Bounds checks need to account for the induction var step to ensure
  we account for the possibility of walking past the guard in the
  last iteration
- Follow on patches will track multiple induction variables
- In general the code needs some love: error paths are suspect;
  originally I was using lists for things that can now be dropped
  in favor of more standard stacks; bb_node_stack(!?) was part of a
  bigger object but then got simplified and now there is nothing
  left; etc.
- There are some problematic code snippets as well that we can
  illustrate in example code in the next patches. For example LLVM
  likes to test unsigned values for '!= 0' to avoid decrementing
  an unsigned 0 and causing an overflow state. However, we don't
  learn this bound from the ALU op and we mark the reg as possibly
  overflowing, which ruins our bounds and prevents loop verification
  in some cases.

Interesting note: we nearly have scalar evolution infrastructure
in place, which would be interesting, but I'm not sure if it's
needed yet. We will probably get there with the follow-on patch
for multiple induction variables.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 include/linux/bpf_verifier.h    |   20 +
 kernel/bpf/cfg.c                |  971 +++++++++++++++++++++++++++++++++++++++
 kernel/bpf/cfg.h                |    8 
 kernel/bpf/verifier.c           |   98 +++-
 samples/bpf/xdp2skb_meta_kern.c |   57 ++
 5 files changed, 1126 insertions(+), 28 deletions(-)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 07844ff..b8c9853 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -40,6 +40,14 @@ enum bpf_reg_liveness {
 	REG_LIVE_WRITTEN, /* reg was written first, screening off later reads */
 };
 
+enum bpf_indvar_state {
+	BPF_LOOP_UNKNOWN,
+	BPF_LOOP_INVALID,
+	BPF_LOOP_IMM,
+	BPF_LOOP_INC,
+	BPF_LOOP_DEC,
+};
+
 struct bpf_reg_state {
 	enum bpf_reg_type type;
 	union {
@@ -81,9 +89,17 @@ struct bpf_reg_state {
 	 * while another to the caller's stack. To differentiate them 'frameno'
 	 * is used which is an index in bpf_verifier_state->frame[] array
 	 * pointing to bpf_func_state.
-	 * This field must be second to last, for states_equal() reasons.
+	 * This field marks the offset for comparisons in state pruning, review
+	 * states_equal() for specifics.
 	 */
 	u32 frameno;
+
+	/* Used to determine if this reg state is tracking an induction variable
+	 * through a loop. Induction variables have worst case values for all
+	 * iterations of the loop.
+	 */
+	enum bpf_indvar_state indvar;
+
 	/* This field must be last, for states_equal() reasons. */
 	enum bpf_reg_liveness live;
 };
@@ -139,6 +155,7 @@ struct bpf_verifier_state_list {
 	struct bpf_verifier_state_list *next;
 };
 
+struct bpf_loop_info;
 struct bpf_insn_aux_data {
 	union {
 		enum bpf_reg_type ptr_type;	/* pointer type for load/store insns */
@@ -148,6 +165,7 @@ struct bpf_insn_aux_data {
 	int ctx_field_size; /* the ctx field size for load insn, maybe 0 */
 	int sanitize_stack_off; /* stack slot to be cleared */
 	bool seen; /* this insn was processed by the verifier */
+	struct bpf_loop_info *loop; /* when set this insn marks loop header */
 };
 
 #define MAX_USED_MAPS 64 /* max number of maps accessed by one eBPF program */
diff --git a/kernel/bpf/cfg.c b/kernel/bpf/cfg.c
index 0d50646..4622365 100644
--- a/kernel/bpf/cfg.c
+++ b/kernel/bpf/cfg.c
@@ -12,6 +12,7 @@
 #include <linux/sort.h>
 
 #include "cfg.h"
+#include "disasm.h"
 
 /* The size of cfg nodes matters, therefore we try to avoid using doubly linked
  * list and we try to use base + offset to address node, by this we only need
@@ -53,6 +54,64 @@ struct bb_node {
 	u16 idx;
 };
 
+struct bb_node_stack {
+	struct bb_node *bb;
+	struct list_head list;
+};
+
+static const char * const bpf_loop_state_str[] = {
+	[BPF_LOOP_UNKNOWN]	= "unknown",
+	[BPF_LOOP_INVALID]	= "invalid",
+	[BPF_LOOP_IMM]		= "imm",
+	[BPF_LOOP_INC]		= "increasing",
+	[BPF_LOOP_DEC]		= "decreasing",
+};
+
+struct bb_state_reg {
+	int reg;	/* register number */
+	int off;	/* offset in stack or unused */
+	int size;	/* size in stack or unused */
+	s64 smin_value; /* minimum possible (s64)value */
+	s64 smax_value; /* maximum possible (s64)value */
+	u64 umin_value; /* minimum possible (u64)value */
+	u64 umax_value; /* maximum possible (u64)value */
+	enum bpf_indvar_state state;
+};
+
+struct bb_state {
+	struct bb_node *bb;
+	struct list_head list;
+	int insn;
+	int insn_cnt;
+	struct bb_state_reg src;
+	struct bb_state_reg dst;
+};
+
+struct bpf_loop_info {
+	struct list_head bb;
+	int bb_num;
+	int insn_cnt;
+	int insn_entry;
+	int insn_exit;
+	struct bb_state_reg src;
+	struct bb_state_reg dst;
+	u16 idx;
+	struct bpf_loop_info *next;
+};
+
+static u16 loop_idx;
+
+static void bpf_clear_state(struct bb_state *s)
+{
+	s->insn = s->insn_cnt = 0;
+
+	s->src.state = BPF_LOOP_UNKNOWN;
+	s->src.reg = s->src.off = s->src.size = 0;
+
+	s->dst.state = BPF_LOOP_UNKNOWN;
+	s->dst.reg = s->dst.off = s->dst.size = 0;
+}
+
 #define entry_bb(bb_list)		(struct bb_node *)(*bb_list)
 #define exit_bb(bb_list)		(struct bb_node *)(*(bb_list + 1))
 
@@ -791,13 +850,912 @@ int subprog_build_dom_info(struct bpf_verifier_env *env,
 	return ret;
 }
 
-bool subprog_has_loop(struct cfg_node_allocator *allocator,
-		      struct bpf_subprog_info *subprog)
+static bool bb_search(struct bb_node *bb, struct bpf_loop_info *loop)
+{
+	struct bb_node_stack *i;
+
+	list_for_each_entry(i, &loop->bb, list) {
+		if (i->bb->idx == bb->idx)
+			return true;
+	}
+	return false;
+}
+
+void bpf_state_set_invalid(struct bb_state *state)
+{
+	state->src.state = BPF_LOOP_INVALID;
+	state->dst.state = BPF_LOOP_INVALID;
+}
+
+void bpf_state_set_stx(struct bb_state *state, const struct bpf_insn *insn)
+{
+	int size = bpf_size_to_bytes(BPF_SIZE(insn->code));
+
+	/* BPF_MEM | <size> | BPF_STX: *(size *) (dst_reg + off) = src_reg */
+	if (state->dst.reg == insn->dst_reg) {
+		state->dst.reg = insn->src_reg;
+		state->dst.off = insn->off;
+		state->dst.size = size;
+	} else if (state->src.reg == insn->dst_reg) {
+		state->src.reg = insn->src_reg;
+		state->src.off = insn->off;
+		state->src.size = size;
+	}
+}
+
+static void bpf_state_set_ldx(struct bpf_verifier_env *env,
+			      struct bb_state *state, const struct bpf_insn *insn)
+{
+	int off = insn->off;
+	int size = bpf_size_to_bytes(BPF_SIZE(insn->code));
+
+	/* BPF_MEM | <size> | BPF_LDX:  dst_reg = *(size *) (src_reg + off) */
+	if (state->dst.reg == insn->src_reg && state->dst.off == off) {
+		if (state->dst.size != size) {
+			bpf_verifier_log_write(env,
+					       "Loop tracing (dst) through BPF_LDX with mismatch sizes unsupported (%i != %i)\n",
+					       state->dst.size, size);
+			bpf_state_set_invalid(state);
+			return;
+		}
+		state->dst.reg = insn->dst_reg;
+	} else if (state->src.reg == insn->src_reg && state->src.off == off) {
+		if (state->src.size != size) {
+			bpf_verifier_log_write(env,
+					       "Loop tracing (src) through BPF_LDX with mismatch sizes unsupported (%i != %i)\n",
+					       state->src.size, size);
+			bpf_state_set_invalid(state);
+			return;
+		}
+		state->src.reg = insn->dst_reg;
+	}
+}
+
+void bpf_state_set_xadd(struct bb_state *state, const struct bpf_insn *insn)
+{
+	/* Bail out on XADD programs for the moment */
+	bpf_state_set_invalid(state);
+}
+
+static void _bpf_state_set_add(struct bpf_verifier_env *env,
+			       struct bb_state_reg *reg,
+			       const struct bpf_insn *insn, bool add)
+{
+	int sign;
+
+	/* Currently, only basic induction variables are supported, so we
+	 * require "reg += const". This limitation is artificial and we can support
+	 * more complex linear statements 'x += y' and 'x = x + a*y' with additional
+	 * verifier effort. However, let's see if this is actually needed before
+	 * we add recursive path search for n-order induction variables.
+	 */
+	if (BPF_SRC(insn->code) == BPF_K) {
+		/* BPF_ADD/BPF_SUB by zero is a NOP, just return */
+		if (insn->imm == 0)
+			return;
+
+		if (add)
+			sign = insn->imm < 0 ? BPF_LOOP_DEC : BPF_LOOP_INC;
+		else
+			sign = insn->imm < 0 ? BPF_LOOP_INC : BPF_LOOP_DEC;
+	} else {
+		reg->state = BPF_LOOP_INVALID;
+		return;
+	}
+
+	if (reg->state == BPF_LOOP_UNKNOWN)
+		reg->state = sign;
+	else if (reg->state == BPF_LOOP_INC && sign == BPF_LOOP_INC)
+		reg->state = BPF_LOOP_INC;
+	else if (reg->state == BPF_LOOP_DEC && sign == BPF_LOOP_DEC)
+		reg->state = BPF_LOOP_DEC;
+	else
+		reg->state = BPF_LOOP_INVALID;
+}
+
+static void bpf_state_set_add(struct bpf_verifier_env *env,
+			      struct bb_state *state,
+			      const struct bpf_insn *insn)
+{
+	if (state->dst.reg == insn->dst_reg) {
+		_bpf_state_set_add(env, &state->dst, insn, true);
+	} else if (state->src.reg == insn->dst_reg) {
+		_bpf_state_set_add(env, &state->src, insn, true);
+	} else {
+		bpf_state_set_invalid(state);
+		WARN_ON_ONCE(1);
+	}
+}
+
+static void bpf_state_set_sub(struct bpf_verifier_env *env,
+			      struct bb_state *state,
+			      const struct bpf_insn *insn)
+{
+	if (state->dst.reg == insn->dst_reg) {
+		_bpf_state_set_add(env, &state->dst, insn, false);
+	} else if (state->src.reg == insn->dst_reg) {
+		_bpf_state_set_add(env, &state->src, insn, false);
+	} else {
+		bpf_state_set_invalid(state);
+		WARN_ON_ONCE(1);
+	}
+}
+
+static void _bpf_state_set_move(struct bb_state_reg *reg, const struct bpf_insn *insn)
+{
+	if (BPF_SRC(insn->code) == BPF_K) {
+		u64 uimm = insn->imm;
+		s64 simm = (s64)insn->imm;
+
+		reg->state = BPF_LOOP_IMM;
+		reg->reg = -1;
+
+		reg->smin_value = simm;
+		reg->smax_value = simm;
+		reg->umin_value = uimm;
+		reg->umax_value = uimm;
+	} else {
+		reg->reg  = insn->src_reg;
+	}
+}
+
+void bpf_state_set_mov(struct bb_state *state, const struct bpf_insn *insn)
+{
+	if (state->dst.reg == insn->dst_reg) {
+		_bpf_state_set_move(&state->dst, insn);
+	} else if (state->src.reg == insn->dst_reg) {
+		_bpf_state_set_move(&state->src, insn);
+	} else {
+		bpf_state_set_invalid(state);
+		WARN_ON_ONCE(1);
+	}
+}
+
+void bpf_loop_state(struct bpf_verifier_env *env,
+		    int i, const struct bpf_insn *insn, struct bb_state *state)
+{
+	u8 class = BPF_CLASS(insn->code);
+
+	if (class == BPF_ALU || class == BPF_ALU64) {
+		u8 opcode;
+
+		if (state->src.reg != insn->dst_reg &&
+		    state->dst.reg != insn->dst_reg)
+			return;
+
+		opcode = BPF_OP(insn->code);
+		switch (opcode) {
+		case BPF_ADD:
+			bpf_state_set_add(env, state, insn);
+			break;
+		case BPF_SUB:
+			bpf_state_set_sub(env, state, insn);
+			break;
+		case BPF_MOV:
+			bpf_state_set_mov(state, insn);
+			break;
+		case BPF_DIV:
+		case BPF_OR:
+		case BPF_AND:
+		case BPF_LSH:
+		case BPF_RSH:
+		case BPF_MOD:
+		case BPF_XOR:
+		case BPF_ARSH:
+		case BPF_END:
+		default:
+			bpf_verifier_log_write(env,
+					      "%i: BPF_ALU%s: unsupported opcode (%u) invalidate state\n",
+					       i, class == BPF_ALU ? "" : "64",
+					       opcode);
+			bpf_state_set_invalid(state);
+			break;
+		}
+	} else if (class == BPF_STX) {
+		u8 mode = BPF_MODE(insn->code);
+
+		switch (mode) {
+		case BPF_MEM:
+			/* BPF_MEM | <size> | BPF_STX */
+			bpf_state_set_stx(state, insn);
+			break;
+		case BPF_XADD:
+			/* BPF_XADD | BPF_W | BPF_STX */
+			/* BPF_XADD | BPF_DW | BPF_STX */
+			bpf_state_set_xadd(state, insn);
+			break;
+		default:
+			bpf_verifier_log_write(env,
+					       "%i: BPF_STX: unsupported mode (%u) invalidate state\n",
+					       i, mode);
+			bpf_state_set_invalid(state);
+		}
+	} else if (class == BPF_ST) {
+		/* Unsupported at the moment */
+		bpf_verifier_log_write(env, "%i: BPF_ST: unsupported class invalidate state\n", i);
+		bpf_state_set_invalid(state);
+	} else if (class == BPF_LDX) {
+		u8 mode = BPF_MODE(insn->code);
+
+		if (mode != BPF_MEM) {
+			bpf_verifier_log_write(env,
+					      "%i: BPF_LDX: unsupported mode (%u) invalidate state\n",
+					      i, mode);
+			bpf_state_set_invalid(state);
+		} else {
+			/* BPF_MEM | <size> | BPF_LDX */
+			bpf_state_set_ldx(env, state, insn);
+		}
+	} else if (class == BPF_LD) {
+		/* Unsupported at the moment */
+		bpf_verifier_log_write(env, "%i: BPF_LD: unsupported class invalidate state\n", i);
+		bpf_state_set_invalid(state);
+	} else if (class == BPF_JMP) {
+		; // Jumps are verified by CFG
+	} else {
+		/* If we do not understand instruction invalidate state */
+		bpf_verifier_log_write(env, "%i: %u: unknown class invalidate state\n", i, class);
+		bpf_state_set_invalid(state);
+	}
+}
+
+/* bit noisy at the moment duplicates with print_regs */
+static void bpf_print_loop_info(struct bpf_verifier_env *env,
+				struct bpf_loop_info *loop)
+{
+	struct bpf_reg_state *regs, *src_reg = NULL, *dst_reg = NULL;
+
+	regs = cur_regs(env);
+	if (loop->src.reg >= 0)
+		src_reg = &regs[loop->src.reg];
+	if (loop->dst.reg >= 0)
+		dst_reg = &regs[loop->dst.reg];
+
+	bpf_verifier_log_write(env,
+			       "Loop %i: (%i:ty(%i))src.state(%s) (%i:ty(%i))dst.state(%s): R%d(%llu:%llu,%lld:%lld) R%d(%llu:%llu,%lld:%lld)\n",
+			       loop->idx,
+			       loop->src.reg, src_reg ? src_reg->type : -1,
+			       bpf_loop_state_str[loop->src.state],
+			       loop->dst.reg, dst_reg ? dst_reg->type : -1,
+			       bpf_loop_state_str[loop->dst.state],
+			       loop->src.reg,
+			       src_reg ? src_reg->umin_value : loop->src.umin_value,
+			       src_reg ? src_reg->umax_value : loop->src.umax_value,
+			       src_reg ? src_reg->smin_value : loop->src.smin_value,
+			       src_reg ? src_reg->smax_value : loop->src.smax_value,
+			       loop->dst.reg,
+			       dst_reg ? dst_reg->umin_value : loop->dst.umin_value,
+			       dst_reg ? dst_reg->umax_value : loop->dst.umax_value,
+			       dst_reg ? dst_reg->smin_value : loop->dst.smin_value,
+			       dst_reg ? dst_reg->smax_value : loop->dst.smax_value);
+}
+
+static bool bpf_op_sign(u8 op)
+{
+	switch (op) {
+	case BPF_JNE:
+	case BPF_JGT:
+	case BPF_JGE:
+		return false;
+	case BPF_JSGT:
+	case BPF_JSGE:
+		return true;
+	default:
+		return false;
+	}
+}
+
+/* Verify conditions necessary to ensure increasing/decreasing loop induction
+ * variables will in fact terminate.
+ *
+ * 1. Increasing/decreasing variables _must_ be paired with a bounded variable
+ *    in this case BPF_LOOP_IMM type.
+ * 2. Increasing/decreasing variables _must_ have a "known" worst case starting
+ *    bound. For example if an increasing variable has no min value we can not
+ *    say it will actually terminate. So test increasing variables have mins
+ *    and decreasing variables have maxs.
+ * 3. The known min/max bound must match the comparison sign
+ *
+ * After this we know that a loop will increase or decrease and eventually
+ * terminate.
+ */
+static int bpf_cfg_valid_bounds(struct bpf_verifier_env *env, u8 op,
+				struct bpf_reg_state *src_reg,
+				struct bpf_reg_state *dst_reg,
+				struct bpf_loop_info *loop)
+{
+	bool sign = bpf_op_sign(op);
+
+	switch (loop->src.state) {
+	/*
+	 * dev note: improve verbose messaging, and maybe refactor the
+	 * switch stmt
+	 */
+	case BPF_LOOP_IMM:
+		if (!dst_reg) {
+			bpf_verifier_log_write(env, "internal cfg error: missing dst_reg LOOP_IMM!\n");
+			return -1;
+		}
+
+		if (loop->dst.state == BPF_LOOP_INC) {
+			if (sign && dst_reg->smin_value == S64_MIN) {
+				bpf_verifier_log_write(env,
+						       "increasing loop induction variable (towards imm) unbounded min value\n");
+				return -1;
+			}
+		} else if (loop->dst.state == BPF_LOOP_DEC) {
+			if ((sign && dst_reg->smax_value == S64_MAX) ||
+			    (!sign && dst_reg->umax_value == U64_MAX)) {
+				bpf_verifier_log_write(env,
+						       "decreasing loop induction variable (towards imm) unbounded max value\n");
+				return -1;
+			}
+		} else {
+			return -1;
+		}
+		break;
+	case BPF_LOOP_INC:
+		if (loop->dst.state != BPF_LOOP_IMM) {
+			bpf_verifier_log_write(env,
+					       "increasing loop induction variable not towards imm\n");
+			return -1;
+		}
+
+		if (!src_reg) {
+			bpf_verifier_log_write(env, "internal cfg error: missing src_reg LOOP_INC!\n");
+			return -1;
+		}
+
+		if (sign && src_reg->smin_value == S64_MIN) {
+			bpf_verifier_log_write(env,
+					       "increasing loop induction variable unbounded min value\n");
+			return -1;
+		}
+		break;
+	case BPF_LOOP_DEC:
+		if (loop->dst.state != BPF_LOOP_IMM) {
+			bpf_verifier_log_write(env,
+					       "decreasing loop induction variable not towards imm\n");
+			return -1;
+		}
+
+		if (!src_reg) {
+			bpf_verifier_log_write(env, "internal cfg error: missing src_reg LOOP_DEC!\n");
+			return -1;
+		}
+
+		if ((sign && src_reg->smax_value == S64_MAX) ||
+		    (!sign && src_reg->umax_value == U64_MAX)) {
+			bpf_verifier_log_write(env,
+					       "decreasing loop induction variable unbounded max value\n");
+			return -1;
+		}
+		break;
+	default:
+		bpf_verifier_log_write(env, "loop state unknown/invalid\n");
+		return -1;
+	}
+	return 0;
+}
+
+/* Before calling bpf_cfg_deduce_bounds we ensured the loop does in fact
+ * terminate. (Because increasing/decreasing towards a constant) But, if
+ * left as is each iteration of the loop will be a new state. This is a
+ * result of the loop induction variable, by definition, being incremented
+ * or decremented by a constant each iteration of the loop.
+ *
+ * To resolve this we know the worst case iteration count and the step
+ * of each iteration so we know the expected range of the indvar. Here
+ * we calculate and set the min/max to the worst case range.
+ */
+static int bpf_cfg_deduce_bounds(struct bpf_verifier_env *env, u8 op,
+				 struct bpf_reg_state *src_reg,
+				 struct bpf_reg_state *dst_reg,
+				 struct bpf_loop_info *loop)
+{
+	int err = 0;
+
+	/* Need to consider overflow cases? */
+	/* Need to consider step > 1 */
+	if (loop->src.state == BPF_LOOP_INC) {
+		switch (op) {
+		case BPF_JNE:
+		case BPF_JGT:
+		case BPF_JGE:
+		case BPF_JSGT:
+		case BPF_JSGE:
+			src_reg->umax_value = loop->dst.umax_value;
+			src_reg->smax_value = loop->dst.smax_value;
+			break;
+		default:
+			bpf_verifier_log_write(env, "src.state INC, invalid opcode %u", op);
+			err = -1;
+			break;
+		}
+		src_reg->var_off = tnum_range(src_reg->umin_value,
+					      src_reg->umax_value);
+	} else if (loop->src.state == BPF_LOOP_DEC) {
+		switch (op) {
+		case BPF_JNE:
+		case BPF_JGT:
+		case BPF_JGE:
+		case BPF_JSGT:
+		case BPF_JSGE:
+			src_reg->umin_value = loop->dst.umin_value;
+			src_reg->smin_value = loop->dst.smin_value;
+			break;
+		default:
+			bpf_verifier_log_write(env, "src.state DEC, invalid opcode %u", op);
+			err = -1;
+			break;
+		}
+		src_reg->var_off = tnum_range(src_reg->umin_value,
+					      src_reg->umax_value);
+	} else if (loop->dst.state == BPF_LOOP_INC) {
+		switch (op) {
+		case BPF_JNE:
+		case BPF_JGT:
+		case BPF_JGE:
+		case BPF_JSGT:
+		case BPF_JSGE:
+			dst_reg->umax_value = loop->src.umax_value;
+			dst_reg->smax_value = loop->src.smax_value;
+			break;
+		default:
+			bpf_verifier_log_write(env, "dst.state INC, invalid opcode %u", op);
+			err = -1;
+			break;
+		}
+		dst_reg->var_off = tnum_range(dst_reg->umin_value,
+					      dst_reg->umax_value);
+	} else if (loop->dst.state == BPF_LOOP_DEC) {
+		switch (op) {
+		case BPF_JNE:
+		case BPF_JLT:
+		case BPF_JLE:
+		case BPF_JSLT:
+		case BPF_JSLE:
+			dst_reg->umin_value = loop->src.umin_value;
+			dst_reg->smin_value = loop->src.smin_value;
+			break;
+		default:
+			bpf_verifier_log_write(env, "dst.state DEC, invalid opcode %u", op);
+			err = -1;
+			break;
+		}
+		dst_reg->var_off = tnum_range(dst_reg->umin_value,
+					      dst_reg->umax_value);
+	} else {
+		bpf_verifier_log_write(env, "internal cfg error: unknown src|dst state\n");
+		err = -1;
+	}
+
+	return err;
+}
+
+static int bb_stack_push(struct bb_node *node, struct bpf_loop_info *loop)
+{
+	struct bb_node_stack *n;
+
+	n = kzalloc(sizeof(struct bb_node_stack), GFP_KERNEL);
+	if (!n)
+		return -ENOMEM;
+
+	n->bb = node;
+	list_add(&n->list, &loop->bb);
+	return 0;
+}
+
+static struct bb_node *bb_stack_pop(struct bpf_loop_info *loop)
+{
+	struct bb_node *bb = NULL;
+	struct bb_node_stack *n;
+
+	n = list_first_entry_or_null(&loop->bb, struct bb_node_stack, list);
+	if (n) {
+		list_del(&n->list);
+		bb = n->bb;
+	}
+	kfree(n);
+	return bb;
+}
+
+static void bpf_free_loop(struct bpf_loop_info *loop)
+{
+	struct bb_node *bb = bb_stack_pop(loop);
+
+	while (bb)
+		bb = bb_stack_pop(loop);
+	kfree(loop);
+}
+
+/* CFG verified the loop is normal and that either src or dst is a
+ * basic loop induction variable but we still need to ensure min/max
+ * values will guarantee termination and that the trip count is reasonable.
+ *
+ * There are a few cases that need to be handled. First the easy case, if the
+ * CFG loop scan found that one of the registers is an IMM value then we need
+ * only verify that the other register has a min or max value depending on
+ * increasing/decreasing induction variable. The more complicated case is when
+ * the loop scan has yet to resolve one of the registers bounds. (Note having
+ * two unknowns is not valid because we would have no guarantees the loop
+ * induction variable is increasing or decreasing) At this point we need to
+ * lookup the register and verify it does have a known min/max value without
+ * these bounds we can not make any statement about termination.
+ *
+ * Then we make a worst case instruction count analysis and when the loop
+ * latch is completed this will be added to the total instruction count.
+ *
+ * Finally, with worst case insn count completed and valid bounds go ahead
+ * and mark the register with the induction type indicator.
+ */
+int bpf_check_loop_header(struct bpf_verifier_env *env, int insn_idx)
+{
+	struct bpf_reg_state *regs, *src_reg = NULL, *dst_reg = NULL;
+	struct bpf_insn *insn = env->prog->insnsi;
+	struct bpf_loop_info *loop;
+	int err = 0;
+	u8 op;
+
+	loop = env->insn_aux_data[insn_idx].loop;
+	if (!loop)
+		return 0;
+
+	regs = cur_regs(env);
+	if (loop->src.reg >= 0)
+		src_reg = &regs[loop->src.reg];
+	if (loop->dst.reg >= 0)
+		dst_reg = &regs[loop->dst.reg];
+
+	op = BPF_OP(insn[loop->insn_exit].code);
+
+	/* If one of the states is unknown we have some options. We could
+	 * try to do a path walk backwards and see if the value is valid.
+	 * Or we could look at the bounds here and see if we can infer
+	 * anything from that.
+	 *
+	 * What I've found from experimentation is that, usually due to stack
+	 * spilling and operations that leave the variable in an unknown
+	 * state, we don't learn much here.
+	 *
+	 * At one point I tried some code to walk back in the tree but
+	 * I didn't like that very much either. Right now my working
+	 * assumption is we can make the bounds tracking good enough
+	 * to avoid walking back through the tree.
+	 */
+	if (loop->dst.state == BPF_LOOP_UNKNOWN ||
+	    loop->src.state == BPF_LOOP_UNKNOWN) {
+		bpf_verifier_log_write(env, "bpf loop unknown state!\n");
+		return -1;
+	}
+
+	err = bpf_cfg_valid_bounds(env, op, src_reg, dst_reg, loop);
+	if (err) {
+		bpf_verifier_log_write(env, "bpf cfg loop has invalid bounds!\n");
+		return -1;
+	}
+
+	/* At this point the bounds are valid */
+
+	/* We are going to push worst case bounds onto the loop induction
+	 * variable; this way, when we run symbolic execution we check validity
+	 * of min and max values. This allows the state pruning logic to have
+	 * a chance at pruning the state.
+	 *
+	 * At the moment we only track the loop induction variables any other
+	 * induction variables in the loop (linear or otherwise) will force
+	 * the loop to iterate through every case presumably exploding the
+	 * state complexity. E.g. simple example is the following,
+	 *
+	 *   for (i = 0, j = 0; i < const; i++, j++) { ... }
+	 *
+	 * Improving the induction variable logic can catch the above case
+	 * and many more.
+	 */
+	err = bpf_cfg_deduce_bounds(env, op, src_reg, dst_reg, loop);
+	if (err) {
+		bpf_verifier_log_write(env, "bpf cfg loop could not deduce bounds!\n");
+		return -1;
+	}
+
+	bpf_print_loop_info(env, loop);
+
+	/* Instruction count is being used as an approximation for runtime.
+	 * Loops have the potential to iterate many times over a single set of
+	 * instructions. To account for this, charge the worst case path times
+	 * the worst case number of iterations against the instruction limit.
+	 */
+	// tbd
+
+	/* Mark insn for indvar tracking informs the execution engine when
+	 * it can avoid updating bounds on an insn. This is required to
+	 * allow the pruning logic to eventually prune the loop state.
+	 */
+	if (loop->dst.state == BPF_LOOP_DEC ||
+	    loop->dst.state == BPF_LOOP_INC)
+		dst_reg->indvar = loop->dst.state;
+	else if (loop->src.state == BPF_LOOP_DEC ||
+		 loop->src.state == BPF_LOOP_INC)
+		src_reg->indvar = loop->src.state;
+
+	/* Loop _will_ terminate remove reference and let the state pruning
+	 * do its job.
+	 */
+	env->insn_aux_data[insn_idx].loop = NULL;
+	bpf_free_loop(loop);
+	return 1;
+}
+
+static int bpf_is_valid_loop_state(struct bpf_loop_info *loop)
+{
+	if (loop->src.state == BPF_LOOP_INVALID ||
+	    loop->dst.state == BPF_LOOP_INVALID)
+		return -1;
+
+	switch (loop->src.state) {
+	case BPF_LOOP_UNKNOWN:
+	case BPF_LOOP_IMM:
+		if (loop->dst.state == BPF_LOOP_INC ||
+		    loop->dst.state == BPF_LOOP_DEC)
+			return 0;
+		break;
+	case BPF_LOOP_DEC:
+	case BPF_LOOP_INC:
+		if (loop->dst.state == BPF_LOOP_UNKNOWN ||
+		    loop->dst.state == BPF_LOOP_IMM)
+			return 0;
+		break;
+	}
+
+	return -1;
+}
+
+static void bb_state_stack_push(struct bb_state *state, struct list_head *stack)
+{
+	list_add(&state->list, stack);
+}
+
+static struct bb_state *bb_state_stack_pop(struct list_head *stack)
+{
+	struct bb_state *s;
+
+	s = list_first_entry_or_null(stack, struct bb_state, list);
+	if (s)
+		list_del(&s->list);
+	return s;
+}
+
+static int build_loop_info(struct bpf_verifier_env *env,
+			   struct cfg_node_allocator *allocator,
+			   struct bpf_subprog_info *subprog,
+			   struct bb_node *head,
+			   struct bb_node *tail)
+{
+	struct bpf_insn *insns = env->prog->insnsi;
+	bool has_branch, only_has_branch;
+	struct list_head bb_state_stack;
+	struct bb_state *state = NULL;
+	struct bpf_loop_info *loop;
+	struct bb_node *next_bb;
+	struct bpf_insn *insn;
+	int err = 0;
+	u8 code;
+
+	loop = kzalloc(sizeof(struct bpf_loop_info), GFP_KERNEL);
+	if (!loop)
+		return -ENOMEM;
+
+	loop->src.state = BPF_LOOP_UNKNOWN;
+	loop->dst.state = BPF_LOOP_UNKNOWN;
+	loop->idx = loop_idx++;
+	INIT_LIST_HEAD(&loop->bb);
+	INIT_LIST_HEAD(&bb_state_stack);
+
+	state = kzalloc(sizeof(struct bb_state), GFP_KERNEL);
+	if (!state) {
+		kfree(loop);
+		return -ENOMEM;
+	}
+
+	/* Initialize stack for path walk. To track the loop induction
+	 * variable we will walk all paths back from the last instruction
+	 * to the first instruction in the loop.
+	 */
+	next_bb = bb_next(allocator, head);
+	state->bb = head;
+	state->insn = next_bb->head - 1;
+	insn = &insns[state->insn];
+	code = insn->code;
+	if (BPF_SRC(insn->code) == BPF_K) {
+		_bpf_state_set_move(&state->src, insn);
+	} else {
+		state->src.state = BPF_LOOP_UNKNOWN;
+		state->src.reg = insn->src_reg;
+	}
+	state->dst.state = BPF_LOOP_UNKNOWN;
+	state->dst.reg = insn->dst_reg;
+	state->insn_cnt = 0;
+
+	bb_state_stack_push(state, &bb_state_stack);
+	err = bb_stack_push(tail, loop);
+	if (err)
+		goto out;
+
+	loop->insn_entry = tail->head;
+	loop->insn_exit = next_bb->head - 1;
+
+	/* This is a pattern match on the loop type; we expect loops
+	 * of the form,
+	 *
+	 *   header
+	 *   ...
+	 *   if (r1 > r2) goto header
+	 *   ...
+	 *
+	 * Where the jump is not a BPF_JA instruction. However sometimes
+	 * we get loops like the following,
+	 *
+	 *  header
+	 *  ...
+	 *  if (r1 > r2) goto out_of_loop
+	 *  ...
+	 *  goto header
+	 *
+	 *  Here the bounding condition is inside the loop and not in the
+	 *  last BB. Presumably we can handle these as well with additional
+	 *  searching to find the latch element. However it makes the scanning
+	 *  a bit more painful. For simplicity test if tail is valid latch and
+	 *  throw out other constructs.
+	 *
+	 *  TBD handle nested loops.
+	 */
+	only_has_branch = BPF_CLASS(code) == BPF_JMP &&
+			  BPF_OP(code) == BPF_JA;
+	if (only_has_branch) {
+		bpf_verifier_log_write(env,
+				       "non-terminating loop detected e(%i->%i)\n",
+				       head->idx, tail->idx);
+		return -EINVAL;
+	}
+
+	has_branch = only_has_branch ||
+		     (BPF_CLASS(code) == BPF_JMP &&
+		      BPF_OP(code) != BPF_EXIT &&
+		      BPF_OP(code) != BPF_CALL);
+	if (!has_branch) {
+		bpf_verifier_log_write(env,
+				       "loop without branches (class %i op %i), must be a verifier bug? e(%i->%i)\n",
+				       BPF_CLASS(code), BPF_OP(code), head->idx, tail->idx);
+		return -EINVAL;
+	}
+
+
+	/* With a valid branch, either the src or dst register must be monotonic for
+	 * the loop to terminate. To detect this, do a path walk through the loop and
+	 * ensure that the monotonic property holds on each path.
+	 */
+	state = bb_state_stack_pop(&bb_state_stack);
+	while (state) {
+		int bb_tail, bb_head;
+		struct edge_node *e;
+		struct bb_node *bb;
+		bool found;
+
+		bb = state->bb;
+		found = bb_search(bb, loop);
+		if (!found)
+			bb_stack_push(bb, loop);
+		next_bb = bb_next(allocator, bb);
+		bb_tail = next_bb->head - 1;
+		bb_head = bb->head;
+
+		while (bb_tail >= bb_head) {
+			bpf_loop_state(env, bb_tail, &insns[bb_tail], state);
+			bb_tail--;
+			state->insn_cnt++;
+		}
+
+		if (state->src.state == BPF_LOOP_INVALID ||
+		    state->dst.state == BPF_LOOP_INVALID) {
+			bpf_verifier_log_write(env,
+					       "Detected BPF_LOOP_INVALID state\n");
+			goto out;
+		}
+
+		/* If this is the last node in the loop ensure the loop states
+		 * have not changed with paths. For example, it would be invalid
+		 * to have two paths one where the induction variable increases
+		 * and another where it decreases. If the state is invalid abort
+		 * now because if any single path fails the loop is invalid.
+		 *
+		 * Finally, assuming state is valid continue processing stack
+		 * giving the next path trace.
+		 */
+		if (bb == tail) {
+			if (state->src.state != loop->src.state &&
+			    loop->src.state != BPF_LOOP_UNKNOWN) {
+				bpf_verifier_log_write(env,
+						       "Paths (src) do not align %i != %i\n",
+						       state->src.state,
+						       loop->src.state);
+
+				goto out;
+			}
+			if (state->dst.state != loop->dst.state &&
+			    loop->dst.state != BPF_LOOP_UNKNOWN) {
+				bpf_verifier_log_write(env,
+						       "Paths (dst) do not align %i != %i\n",
+						       state->dst.state,
+						       loop->dst.state);
+				goto out;
+			}
+
+			if (loop->insn_cnt < state->insn_cnt)
+				loop->insn_cnt = state->insn_cnt;
+
+			loop->dst = state->dst;
+			loop->src = state->src;
+
+			bpf_clear_state(state);
+			kfree(state);
+			state = bb_state_stack_pop(&bb_state_stack);
+			continue;
+		}
+
+		e = cfg_node_delink(allocator, &bb->e_prevs);
+		while (e) {
+			struct bb_node *src = e->src;
+			struct bb_state *old = state;
+			struct bb_state *new;
+
+			new = kzalloc(sizeof(struct bb_state), GFP_KERNEL);
+			if (!new)
+				goto out;
+
+			next_bb = bb_next(allocator, src);
+
+			*new = *old;
+			new->bb = src;
+			new->insn = next_bb->head - 1;
+			bb_state_stack_push(new, &bb_state_stack);
+
+			e = cfg_node_delink(allocator, &e->link);
+		}
+		kfree(state);
+		state = bb_state_stack_pop(&bb_state_stack);
+	}
+	if (bpf_is_valid_loop_state(loop))
+		goto out;
+
+	bpf_verifier_log_write(env,
+			      "Loop detected e(%i->%i) insn(%i) src_state(R%i:%s) dst_state(R%i:%s)\n",
+			       head->idx, tail->idx, loop->insn_cnt,
+			       loop->src.reg,
+			       bpf_loop_state_str[loop->src.state],
+			       loop->dst.reg,
+			       bpf_loop_state_str[loop->dst.state]);
+	env->insn_aux_data[loop->insn_entry].loop = loop;
+	return 0;
+out:
+	while (state) {
+		kfree(state);
+		state = bb_state_stack_pop(&bb_state_stack);
+	}
+	bpf_free_loop(loop);
+	return -1;
+}
+
+int subprog_has_loop(struct bpf_verifier_env *env,
+		     struct cfg_node_allocator *allocator,
+		     struct bpf_subprog_info *subprog)
 {
 	int lane_len = BITS_TO_LONGS(subprog->bb_num - 2);
 	struct bb_node *bb, *entry_bb, *exit_bb;
 	void **bb_list = (void **)&subprog->bbs;
 	struct edge_node *e;
+	int err = 0;
 
 	entry_bb = entry_bb(bb_list);
 	exit_bb = exit_bb(bb_list);
@@ -809,8 +1767,11 @@ bool subprog_has_loop(struct cfg_node_allocator *allocator,
 
 			if (latch != entry_bb &&
 			    test_bit(bb->idx,
-				     subprog->dtree + latch->idx * lane_len))
-				return true;
+				     subprog->dtree + latch->idx * lane_len)) {
+				err = build_loop_info(env, allocator, subprog, latch, bb);
+				if (err)
+					return err;
+			}
 
 			e = cfg_node_delink(allocator, &e->link);
 		}
@@ -818,7 +1779,7 @@ bool subprog_has_loop(struct cfg_node_allocator *allocator,
 		bb = bb_next(allocator, bb);
 	}
 
-	return false;
+	return 0;
 }
 
 /* We don't want to do any further loop bounds analysis for irreducible loop,
diff --git a/kernel/bpf/cfg.h b/kernel/bpf/cfg.h
index 44dcabb..8fa64b5 100644
--- a/kernel/bpf/cfg.h
+++ b/kernel/bpf/cfg.h
@@ -15,6 +15,8 @@ struct cfg_node_allocator {
 	u8 pool_cnt;
 };
 
+struct bpf_loop_info;
+
 int add_subprog(struct bpf_verifier_env *env, int off);
 void cfg_node_allocator_free(struct cfg_node_allocator *allocator);
 int cfg_node_allocator_init(struct cfg_node_allocator *allocator,
@@ -33,8 +35,9 @@ int subprog_append_callee(struct bpf_verifier_env *env,
 int subprog_build_dom_info(struct bpf_verifier_env *env,
 			   struct cfg_node_allocator *allocator,
 			   struct bpf_subprog_info *subprog);
-bool subprog_has_loop(struct cfg_node_allocator *allocator,
-		      struct bpf_subprog_info *subprog);
+int subprog_has_loop(struct bpf_verifier_env *env,
+		     struct cfg_node_allocator *allocator,
+		     struct bpf_subprog_info *subprog);
 int subprog_has_irreduciable_loop(struct cfg_node_allocator *allocator,
 				  struct bpf_subprog_info *subprog);
 void cfg_pretty_print(struct bpf_verifier_env *env,
@@ -45,6 +48,7 @@ void dom_pretty_print(struct bpf_verifier_env *env,
 int subprog_init_bb(struct cfg_node_allocator *allocator, void **bb_list,
 		    int subprog_start, int subprog_end);
 void subprog_free(struct bpf_subprog_info *subprog, int end_idx);
+int bpf_check_loop_header(struct bpf_verifier_env *env, int insn_idx);
 
 #define DFS_NODE_EXPLORING	1
 #define DFS_NODE_EXPLORED	2
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index c41f587..c9ed3d7 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -687,6 +687,7 @@ static void __mark_reg_unknown(struct bpf_reg_state *reg)
 	reg->off = 0;
 	reg->var_off = tnum_unknown;
 	reg->frameno = 0;
+	reg->indvar = BPF_LOOP_UNKNOWN;
 	__mark_reg_unbounded(reg);
 }
 
@@ -913,6 +914,7 @@ static int check_subprogs(struct bpf_verifier_env *env)
 						     &subprog[cur_subprog]);
 			if (ret < 0)
 				goto free_nodes;
+
 			cfg_pretty_print(env, &allocator, &subprog[cur_subprog]);
 			dom_pretty_print(env, &subprog[cur_subprog]);
 			ret = subprog_has_irreduciable_loop(&allocator,
@@ -924,7 +926,8 @@ static int check_subprogs(struct bpf_verifier_env *env)
 				ret = -EINVAL;
 				goto free_nodes;
 			}
-			if (subprog_has_loop(&allocator,
+			if (subprog_has_loop(env,
+					     &allocator,
 					     &subprog[cur_subprog])) {
 				verbose(env, "cfg - loop detected");
 				ret = -EINVAL;
@@ -1117,7 +1120,8 @@ static int check_stack_write(struct bpf_verifier_env *env,
 
 	cur = env->cur_state->frame[env->cur_state->curframe];
 	if (value_regno >= 0 &&
-	    is_spillable_regtype((type = cur->regs[value_regno].type))) {
+	    (is_spillable_regtype((type = cur->regs[value_regno].type)) ||
+	    cur->regs[value_regno].indvar)) {
 
 		/* register containing pointer is being spilled into stack */
 		if (size != BPF_REG_SIZE) {
@@ -2784,6 +2788,13 @@ static int adjust_ptr_min_max_vals(struct bpf_verifier_env *env,
 	    !check_reg_sane_offset(env, ptr_reg, ptr_reg->type))
 		return -EINVAL;
 
+	/* For now if indvar is being used with a pointer and scalar drop the
+	 * indvar. This will force the loop to run until termination without
+	 * the aid of pruning. With extra complexity this can be added in the
+	 * future if needed.
+	 */
+	dst_reg->indvar = BPF_LOOP_UNKNOWN;
+
 	switch (opcode) {
 	case BPF_ADD:
 		/* We can take a fixed offset as long as it doesn't overflow
@@ -2809,6 +2820,8 @@ static int adjust_ptr_min_max_vals(struct bpf_verifier_env *env,
 		 * this creates a new 'base' pointer, off_reg (variable) gets
 		 * added into the variable offset, and we copy the fixed offset
 		 * from ptr_reg.
+		 * For PTR_TO_PACKET ptrs we propagate range through when
+		 * possible by taking worst case upper limit.
 		 */
 		if (signed_add_overflows(smin_ptr, smin_val) ||
 		    signed_add_overflows(smax_ptr, smax_val)) {
@@ -2956,6 +2969,13 @@ static int adjust_scalar_min_max_vals(struct bpf_verifier_env *env,
 		return 0;
 	}
 
+	/* Verifier check to ensure indvar state is good */
+	if (dst_reg->indvar && opcode != BPF_ADD && opcode != BPF_SUB) {
+		WARN_ON_ONCE(1);
+		__mark_reg_unknown(dst_reg);
+		return 0;
+	}
+
 	switch (opcode) {
 	case BPF_ADD:
 		if (signed_add_overflows(dst_reg->smin_value, smin_val) ||
@@ -2963,18 +2983,37 @@ static int adjust_scalar_min_max_vals(struct bpf_verifier_env *env,
 			dst_reg->smin_value = S64_MIN;
 			dst_reg->smax_value = S64_MAX;
 		} else {
-			dst_reg->smin_value += smin_val;
-			dst_reg->smax_value += smax_val;
+			if (dst_reg->indvar == BPF_LOOP_INC) {
+				dst_reg->smin_value += smin_val;
+			} else if (dst_reg->indvar == BPF_LOOP_DEC) {
+				dst_reg->smax_value += smax_val;
+			} else {
+				dst_reg->smin_value += smin_val;
+				dst_reg->smax_value += smax_val;
+			}
 		}
 		if (dst_reg->umin_value + umin_val < umin_val ||
 		    dst_reg->umax_value + umax_val < umax_val) {
 			dst_reg->umin_value = 0;
 			dst_reg->umax_value = U64_MAX;
 		} else {
-			dst_reg->umin_value += umin_val;
-			dst_reg->umax_value += umax_val;
+			if (dst_reg->indvar == BPF_LOOP_INC) {
+				dst_reg->umin_value += umin_val;
+			} else if (dst_reg->indvar == BPF_LOOP_DEC) {
+				dst_reg->umax_value += umax_val;
+			} else {
+				dst_reg->umin_value += umin_val;
+				dst_reg->umax_value += umax_val;
+			}
 		}
-		dst_reg->var_off = tnum_add(dst_reg->var_off, src_reg.var_off);
+
+		/* Hack for now; we still need to sort out tnum handling when
+		 * manipulating the min/max bounds directly.
+		 */
+		if (dst_reg->indvar)
+			dst_reg->var_off = tnum_range(dst_reg->umin_value, dst_reg->umax_value);
+		else
+			dst_reg->var_off = tnum_add(dst_reg->var_off, src_reg.var_off);
 		break;
 	case BPF_SUB:
 		if (signed_sub_overflows(dst_reg->smin_value, smax_val) ||
@@ -2983,8 +3022,14 @@ static int adjust_scalar_min_max_vals(struct bpf_verifier_env *env,
 			dst_reg->smin_value = S64_MIN;
 			dst_reg->smax_value = S64_MAX;
 		} else {
-			dst_reg->smin_value -= smax_val;
-			dst_reg->smax_value -= smin_val;
+			if (dst_reg->indvar == BPF_LOOP_INC) {
+				dst_reg->smin_value -= smax_val;
+			} else if (dst_reg->indvar == BPF_LOOP_DEC) {
+				dst_reg->smax_value -= smin_val;
+			} else {
+				dst_reg->smin_value -= smax_val;
+				dst_reg->smax_value -= smin_val;
+			}
 		}
 		if (dst_reg->umin_value < umax_val) {
 			/* Overflow possible, we know nothing */
@@ -2992,10 +3037,23 @@ static int adjust_scalar_min_max_vals(struct bpf_verifier_env *env,
 			dst_reg->umax_value = U64_MAX;
 		} else {
 			/* Cannot overflow (as long as bounds are consistent) */
-			dst_reg->umin_value -= umax_val;
-			dst_reg->umax_value -= umin_val;
+			if (dst_reg->indvar == BPF_LOOP_INC) {
+				dst_reg->umin_value -= umax_val;
+			} else if (dst_reg->indvar == BPF_LOOP_DEC) {
+				dst_reg->umax_value -= umin_val;
+			} else {
+				dst_reg->umin_value -= umax_val;
+				dst_reg->umax_value -= umin_val;
+			}
 		}
-		dst_reg->var_off = tnum_sub(dst_reg->var_off, src_reg.var_off);
+
+	/* Hack for now; we still need to sort out tnum handling when
+	 * manipulating the min/max bounds directly.
+	 */
+		if (dst_reg->indvar)
+			dst_reg->var_off = tnum_range(dst_reg->umin_value, dst_reg->umax_value);
+		else
+			dst_reg->var_off = tnum_sub(dst_reg->var_off, src_reg.var_off);
 		break;
 	case BPF_MUL:
 		dst_reg->var_off = tnum_mul(dst_reg->var_off, src_reg.var_off);
@@ -3167,6 +3225,10 @@ static int adjust_scalar_min_max_vals(struct bpf_verifier_env *env,
 	}
 
 	if (BPF_CLASS(insn->code) != BPF_ALU64) {
+		/* Not tracking 32-bit ALU ops for now */
+		dst_reg->indvar = BPF_LOOP_UNKNOWN;
+		src_reg.indvar = BPF_LOOP_UNKNOWN;
+
 		/* 32-bit ALU ops are (32,32)->32 */
 		coerce_reg_to_size(dst_reg, 4);
 		coerce_reg_to_size(&src_reg, 4);
@@ -3199,7 +3261,9 @@ static int adjust_reg_min_max_vals(struct bpf_verifier_env *env,
 			if (dst_reg->type != SCALAR_VALUE) {
 				/* Combining two pointers by any ALU op yields
 				 * an arbitrary scalar. Disallow all math except
-				 * pointer subtraction
+				 * pointer subtraction. Further, we do not track
+				 * the indvar through pointer ALU, so loops using
+				 * a pointer indvar may hit the complexity limit.
 				 */
 				if (opcode == BPF_SUB){
 					mark_reg_unknown(env, regs, insn->dst_reg);
@@ -4864,6 +4928,14 @@ static int do_check(struct bpf_verifier_env *env)
 			return -EINVAL;
 		}
 
+		err = bpf_check_loop_header(env, insn_idx);
+		if (err < 0) {
+			print_verifier_state(env, state->frame[state->curframe]);
+			return err;
+		} else if (err > 0) {
+			print_verifier_state(env, state->frame[state->curframe]);
+		}
+
 		insn_idx++;
 	}
 
diff --git a/samples/bpf/xdp2skb_meta_kern.c b/samples/bpf/xdp2skb_meta_kern.c
index 0c12048..b8a1d98 100644
--- a/samples/bpf/xdp2skb_meta_kern.c
+++ b/samples/bpf/xdp2skb_meta_kern.c
@@ -11,6 +11,7 @@
  */
 #include <uapi/linux/bpf.h>
 #include <uapi/linux/pkt_cls.h>
+#include <linux/version.h>
 
 #include "bpf_helpers.h"
 
@@ -34,6 +35,13 @@ int _xdp_mark(struct xdp_md *ctx)
 	struct meta_info *meta;
 	void *data, *data_end;
 	int ret;
+	volatile __u64 v[4];
+
+	v[0] = 0;
+	v[1] = 1;
+	v[2] = 2;
+	v[3] = 3;
+
 
 	/* Reserve space in-front of data pointer for our meta info.
 	 * (Notice drivers not supporting data_meta will fail here!)
@@ -59,27 +67,62 @@ int _xdp_mark(struct xdp_md *ctx)
 	return XDP_PASS;
 }
 
-SEC("tc_mark")
+SEC("classifier_tc_mark")
 int _tc_mark(struct __sk_buff *ctx)
 {
 	void *data      = (void *)(unsigned long)ctx->data;
 	void *data_end  = (void *)(unsigned long)ctx->data_end;
+	char a = 'a';
+#if 0
 	void *data_meta = (void *)(unsigned long)ctx->data_meta;
 	struct meta_info *meta = data_meta;
+	volatile __u64 mark = ctx->mark;
+#endif
 
 	/* Check XDP gave us some data_meta */
-	if (meta + 1 > data) {
-		ctx->mark = 41;
-		 /* Skip "accept" if no data_meta is avail */
-		return TC_ACT_OK;
+	 u8 j = 0;
+	 s32 i = 0;
+	 u8 k = 0;
+	 u8 *p;
+	
+	 if (data + 2 > data_end)
+		 return TC_ACT_OK;
+
+	 p = data;
+	 j = p[0]; 
+
+	 if (data + 111 > data_end)
+		 return TC_ACT_OK;
+
+	 if (j > 100)
+		 return TC_ACT_OK;
+#if 0
+	 if (j < 10)
+		 return TC_ACT_OK;
+#endif
+
+#pragma nounroll
+	while (i < j) {
+		k += p[i];
+		i++;
 	}
-
 	/* Hint: See func tc_cls_act_is_valid_access() for BPF_WRITE access */
-	ctx->mark = meta->mark; /* Transfer XDP-mark to SKB-mark */
+	//ctx->mark = meta->mark; /* Transfer XDP-mark to SKB-mark */
+	ctx->mark = k;
 
 	return TC_ACT_OK;
 }
 
+u32 _version SEC("version") = LINUX_VERSION_CODE;
+
 /* Manually attaching these programs:
 export DEV=ixgbe2
 export FILE=xdp2skb_meta_kern.o

* [RFC PATCH 15/16] bpf: verifier, simple loop examples
  2018-06-01  9:32 [RFC PATCH 00/16] bpf, bounded loop support work in progress John Fastabend
                   ` (13 preceding siblings ...)
  2018-06-01  9:33 ` [RFC PATCH 14/16] bpf: verifier, add initial support to allow bounded loops John Fastabend
@ 2018-06-01  9:33 ` John Fastabend
  2018-06-01  9:33 ` [RFC PATCH 16/16] bpf: tools: dbg patch to turn on debugging and add primitive examples John Fastabend
  15 siblings, 0 replies; 17+ messages in thread
From: John Fastabend @ 2018-06-01  9:33 UTC (permalink / raw)
  To: alexei.starovoitov, daniel, davem; +Cc: netdev

Add some simple loop examples. For now I just load these
using bpftool, but eventually they should have a loader
similar to test_verifier, except driven by C snippets.

We want to use C files here rather than hand-written BPF
because most users will be writing C and compiling with
clang/llvm, and we want to exercise the verifier against
the code the actual tools generate instead of hand-crafted
BPF instructions.
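
A rough sketch of what such a loader could look like (illustration
only; the object names and pass/fail expectations below are
assumptions, and the real harness would likely hook into the existing
selftest infrastructure instead):

 /* Hypothetical loader sketch, not part of this series. */
 #include <stdio.h>
 #include <stdbool.h>
 #include <linux/bpf.h>
 #include <bpf/bpf.h>
 #include <bpf/libbpf.h>

 static const struct {
	const char *file;
	bool expect_load;	/* "bad" loops must be rejected */
 } tests[] = {
	{ "test_basic_loop1.o", true  },
	{ "test_basic_loop2.o", true  },
	{ "test_basic_loop3.o", true  },
	{ "test_bad_loop1.o",   false },
	{ "test_bad_loop2.o",   false },
	{ "test_bad_loop3.o",   false },
 };

 int main(void)
 {
	int i, prog_fd, err, failures = 0;
	struct bpf_object *obj;

	for (i = 0; i < (int)(sizeof(tests) / sizeof(tests[0])); i++) {
		bool loaded;

		err = bpf_prog_load(tests[i].file, BPF_PROG_TYPE_SCHED_CLS,
				    &obj, &prog_fd);
		loaded = !err;
		if (loaded != tests[i].expect_load) {
			printf("FAIL %s (err=%d)\n", tests[i].file, err);
			failures++;
		} else {
			printf("OK   %s\n", tests[i].file);
		}
		if (loaded)
			bpf_object__close(obj);
	}
	return failures ? 1 : 0;
 }
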
---
 tools/testing/selftests/bpf/Makefile           |    5 ++
 tools/testing/selftests/bpf/test_bad_loop1.c   |   47 ++++++++++++++++++++++
 tools/testing/selftests/bpf/test_bad_loop2.c   |   45 +++++++++++++++++++++
 tools/testing/selftests/bpf/test_bad_loop3.c   |   47 ++++++++++++++++++++++
 tools/testing/selftests/bpf/test_basic_loop1.c |   44 +++++++++++++++++++++
 tools/testing/selftests/bpf/test_basic_loop2.c |   48 +++++++++++++++++++++++
 tools/testing/selftests/bpf/test_basic_loop3.c |   51 ++++++++++++++++++++++++
 7 files changed, 286 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/bpf/test_bad_loop1.c
 create mode 100644 tools/testing/selftests/bpf/test_bad_loop2.c
 create mode 100644 tools/testing/selftests/bpf/test_bad_loop3.c
 create mode 100644 tools/testing/selftests/bpf/test_basic_loop1.c
 create mode 100644 tools/testing/selftests/bpf/test_basic_loop2.c
 create mode 100644 tools/testing/selftests/bpf/test_basic_loop3.c

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index a1b66da..0cf7bd1 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -34,7 +34,10 @@ TEST_GEN_FILES = test_pkt_access.o test_xdp.o test_l4lb.o test_tcp_estats.o test
 	sockmap_tcp_msg_prog.o connect4_prog.o connect6_prog.o test_adjust_tail.o \
 	test_btf_haskv.o test_btf_nokv.o test_sockmap_kern.o test_tunnel_kern.o \
 	test_get_stack_rawtp.o test_sockmap_kern.o test_sockhash_kern.o \
-	test_lwt_seg6local.o sendmsg4_prog.o sendmsg6_prog.o
+	test_lwt_seg6local.o sendmsg4_prog.o sendmsg6_prog.o \
+	test_basic_loop1.o test_basic_loop2.o test_basic_loop3.o \
+	test_bad_loop1.o test_bad_loop2.o test_bad_loop3.o
+
 
 # Order correspond to 'make run_tests' order
 TEST_PROGS := test_kmod.sh \
diff --git a/tools/testing/selftests/bpf/test_bad_loop1.c b/tools/testing/selftests/bpf/test_bad_loop1.c
new file mode 100644
index 0000000..88e8cfb
--- /dev/null
+++ b/tools/testing/selftests/bpf/test_bad_loop1.c
@@ -0,0 +1,47 @@
+/* SPDX-License-Identifier: GPL-2.0
+ * Copyright (c) 2018 Covalent IO
+ */
+#include <stddef.h>
+#include <stdbool.h>
+#include <string.h>
+#include <linux/pkt_cls.h>
+#include <linux/bpf.h>
+#include <linux/in.h>
+#include <linux/if_ether.h>
+#include <linux/ip.h>
+#include <linux/ipv6.h>
+#include <linux/icmp.h>
+#include <linux/icmpv6.h>
+#include <linux/tcp.h>
+#include <linux/udp.h>
+#include "bpf_helpers.h"
+#include "test_iptunnel_common.h"
+#include "bpf_endian.h"
+
+
+/* Test packet access out of bounds */
+
+int _version SEC("version") = 1;
+SEC("classifier_tc_loop1")
+int _tc_loop(struct __sk_buff *ctx)
+{
+	void *data      = (void *)(unsigned long)ctx->data;
+	void *data_end  = (void *)(unsigned long)ctx->data_end;
+	 __u8 i = 0, j = 0, k = 0, *p;
+
+	 p = data;
+	 if (data + 50 > data_end)
+		 return TC_ACT_OK;
+
+#pragma nounroll
+	while (i < 100) {
+		k += p[i];
+		p[i] = k;
+		i++;
+	}
+	ctx->mark = k;
+
+	return TC_ACT_OK;
+}
+
+char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/test_bad_loop2.c b/tools/testing/selftests/bpf/test_bad_loop2.c
new file mode 100644
index 0000000..fff005e
--- /dev/null
+++ b/tools/testing/selftests/bpf/test_bad_loop2.c
@@ -0,0 +1,45 @@
+/* SPDX-License-Identifier: GPL-2.0
+ * Copyright (c) 2018 Covalent IO
+ */
+#include <stddef.h>
+#include <stdbool.h>
+#include <string.h>
+#include <linux/pkt_cls.h>
+#include <linux/bpf.h>
+#include <linux/in.h>
+#include <linux/if_ether.h>
+#include <linux/ip.h>
+#include <linux/ipv6.h>
+#include <linux/icmp.h>
+#include <linux/icmpv6.h>
+#include <linux/tcp.h>
+#include <linux/udp.h>
+#include "bpf_helpers.h"
+#include "test_iptunnel_common.h"
+#include "bpf_endian.h"
+
+/* Test non-terminating loops */
+int _version SEC("version") = 1;
+SEC("classifier_tc_loop1")
+int _tc_loop(struct __sk_buff *ctx)
+{
+	void *data      = (void *)(unsigned long)ctx->data;
+	void *data_end  = (void *)(unsigned long)ctx->data_end;
+	int i = 0, j = 0, k = 0, *p;
+
+	 p = data;
+	 if (data + 101 > data_end)
+		 return TC_ACT_OK;
+
+#pragma nounroll
+	while (i < 100) {
+		k += p[i];
+		p[i] = k;
+		i--;
+	}
+	ctx->mark = k;
+
+	return TC_ACT_OK;
+}
+
+char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/test_bad_loop3.c b/tools/testing/selftests/bpf/test_bad_loop3.c
new file mode 100644
index 0000000..4015366
--- /dev/null
+++ b/tools/testing/selftests/bpf/test_bad_loop3.c
@@ -0,0 +1,47 @@
+/* SPDX-License-Identifier: GPL-2.0
+ * Copyright (c) 2018 Covalent IO
+ */
+#include <stddef.h>
+#include <stdbool.h>
+#include <string.h>
+#include <linux/pkt_cls.h>
+#include <linux/bpf.h>
+#include <linux/in.h>
+#include <linux/if_ether.h>
+#include <linux/ip.h>
+#include <linux/ipv6.h>
+#include <linux/icmp.h>
+#include <linux/icmpv6.h>
+#include <linux/tcp.h>
+#include <linux/udp.h>
+#include "bpf_helpers.h"
+#include "test_iptunnel_common.h"
+#include "bpf_endian.h"
+
+int _version SEC("version") = 1;
+SEC("classifier_tc_loop1")
+int _tc_loop(struct __sk_buff *ctx)
+{
+	void *data      = (void *)(unsigned long)ctx->data;
+	void *data_end  = (void *)(unsigned long)ctx->data_end;
+	 __u8 i = 0, j = 0, k = 0, *p;
+
+	 p = data;
+	 if (data + 101 > data_end)
+		 return TC_ACT_OK;
+
+head:
+#pragma nounroll
+	while (i < 100) {
+		k += p[i];
+		p[i] = k;
+		i++;
+		if (k < 100)
+			goto head;
+	}
+	ctx->mark = k;
+
+	return TC_ACT_OK;
+}
+
+char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/test_basic_loop1.c b/tools/testing/selftests/bpf/test_basic_loop1.c
new file mode 100644
index 0000000..63dc4a4
--- /dev/null
+++ b/tools/testing/selftests/bpf/test_basic_loop1.c
@@ -0,0 +1,44 @@
+/* SPDX-License-Identifier: GPL-2.0
+ * Copyright (c) 2018 Covalent IO
+ */
+#include <stddef.h>
+#include <stdbool.h>
+#include <string.h>
+#include <linux/pkt_cls.h>
+#include <linux/bpf.h>
+#include <linux/in.h>
+#include <linux/if_ether.h>
+#include <linux/ip.h>
+#include <linux/ipv6.h>
+#include <linux/icmp.h>
+#include <linux/icmpv6.h>
+#include <linux/tcp.h>
+#include <linux/udp.h>
+#include "bpf_helpers.h"
+#include "test_iptunnel_common.h"
+#include "bpf_endian.h"
+
+int _version SEC("version") = 1;
+SEC("classifier_tc_loop1")
+int _tc_loop(struct __sk_buff *ctx)
+{
+	void *data      = (void *)(unsigned long)ctx->data;
+	void *data_end  = (void *)(unsigned long)ctx->data_end;
+	 __u8 i = 0, j = 0, k = 0, *p;
+
+	 p = data;
+	 if (data + 101 > data_end)
+		 return TC_ACT_OK;
+
+#pragma nounroll
+	while (i < 100) {
+		k += p[i];
+		p[i] = k;
+		i++;
+	}
+	ctx->mark = k;
+
+	return TC_ACT_OK;
+}
+
+char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/test_basic_loop2.c b/tools/testing/selftests/bpf/test_basic_loop2.c
new file mode 100644
index 0000000..fb774dc
--- /dev/null
+++ b/tools/testing/selftests/bpf/test_basic_loop2.c
@@ -0,0 +1,48 @@
+/* SPDX-License-Identifier: GPL-2.0
+ * Copyright (c) 2018 Covalent IO
+ */
+#include <stddef.h>
+#include <stdbool.h>
+#include <string.h>
+#include <linux/pkt_cls.h>
+#include <linux/bpf.h>
+#include <linux/in.h>
+#include <linux/if_ether.h>
+#include <linux/ip.h>
+#include <linux/ipv6.h>
+#include <linux/icmp.h>
+#include <linux/icmpv6.h>
+#include <linux/tcp.h>
+#include <linux/udp.h>
+#include "bpf_helpers.h"
+#include "test_iptunnel_common.h"
+#include "bpf_endian.h"
+
+int _version SEC("version") = 1;
+SEC("classifier_tc_loop1")
+int _tc_loop(struct __sk_buff *ctx)
+{
+	void *data      = (void *)(unsigned long)ctx->data;
+	void *data_end  = (void *)(unsigned long)ctx->data_end;
+	 __u8 i = 0, j = 0, k = 0, *p;
+
+	 p = data;
+	 if (data + 101 > data_end)
+		 return TC_ACT_OK;
+
+#pragma nounroll
+	while (i < 100) {
+		k += p[i];
+		if (k < 100) {
+			p[i] = 'a';
+		} else {
+			p[i] = 'b';
+		}
+		i++;
+	}
+	ctx->mark = k;
+
+	return TC_ACT_OK;
+}
+
+char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/test_basic_loop3.c b/tools/testing/selftests/bpf/test_basic_loop3.c
new file mode 100644
index 0000000..dd7dd1b
--- /dev/null
+++ b/tools/testing/selftests/bpf/test_basic_loop3.c
@@ -0,0 +1,51 @@
+/* SPDX-License-Identifier: GPL-2.0
+ * Copyright (c) 2018 Covalent IO
+ */
+#include <stddef.h>
+#include <stdbool.h>
+#include <string.h>
+#include <linux/pkt_cls.h>
+#include <linux/bpf.h>
+#include <linux/in.h>
+#include <linux/if_ether.h>
+#include <linux/ip.h>
+#include <linux/ipv6.h>
+#include <linux/icmp.h>
+#include <linux/icmpv6.h>
+#include <linux/tcp.h>
+#include <linux/udp.h>
+#include "bpf_helpers.h"
+#include "test_iptunnel_common.h"
+#include "bpf_endian.h"
+
+int _version SEC("version") = 1;
+SEC("classifier_tc_loop1")
+int _tc_loop(struct __sk_buff *ctx)
+{
+	void *data      = (void *)(unsigned long)ctx->data;
+	void *data_end  = (void *)(unsigned long)ctx->data_end;
+	 __u8 i = 0, j = 0, k = 0, *p;
+
+	 if (data + 2 > data_end)
+		 return TC_ACT_OK;
+
+	 p = data;
+	 j = p[0];
+
+	 if (data + 101 > data_end)
+		 return TC_ACT_OK;
+
+	 if (j < 100)
+		 return TC_ACT_OK;
+
+#pragma nounroll
+	while (i < j) {
+		k += p[i];
+		i++;
+	}
+	ctx->mark = k;
+
+	return TC_ACT_OK;
+}
+
+char _license[] SEC("license") = "GPL";

* [RFC PATCH 16/16] bpf: tools: dbg patch to turn on debugging and add primitive examples
  2018-06-01  9:32 [RFC PATCH 00/16] bpf, bounded loop support work in progress John Fastabend
                   ` (14 preceding siblings ...)
  2018-06-01  9:33 ` [RFC PATCH 15/16] bpf: verifier, simple loop examples John Fastabend
@ 2018-06-01  9:33 ` John Fastabend
  15 siblings, 0 replies; 17+ messages in thread
From: John Fastabend @ 2018-06-01  9:33 UTC (permalink / raw)
  To: alexei.starovoitov, daniel, davem; +Cc: netdev

While developing this I found it useful to always have the verifier
log output available when loading programs with bpftool. This is a
quick hack to enable it, but we should add a proper '-v' option
shortly.
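
If I read bpf_load_program_xattr() right, the unpatched code only
requests the log on the retry after a failed load, which is why this
hack forces log_level unconditionally. For comparison, a minimal
sketch of asking for the log through the existing API (assuming the
current bpf_load_program() signature in tools/lib/bpf/bpf.h; the
trivial program below is just for illustration):

 /* Sketch only: request the verifier log via bpf_load_program(). */
 #include <stdio.h>
 #include <unistd.h>
 #include <linux/bpf.h>
 #include <bpf/bpf.h>

 int main(void)
 {
	/* trivial "r0 = 0; exit" program, just to exercise the log path */
	struct bpf_insn insns[] = {
		{ .code = BPF_ALU64 | BPF_MOV | BPF_K, .dst_reg = BPF_REG_0 },
		{ .code = BPF_JMP | BPF_EXIT },
	};
	char log_buf[65536] = { 0 };
	int fd;

	fd = bpf_load_program(BPF_PROG_TYPE_SOCKET_FILTER, insns,
			      sizeof(insns) / sizeof(insns[0]), "GPL", 0,
			      log_buf, sizeof(log_buf));
	if (fd < 0)
		fprintf(stderr, "load failed, verifier log:\n%s\n", log_buf);
	else
		close(fd);
	return 0;
 }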

Then add a set of good and bad loop examples to test with. These
are all very basic at the moment but will get better soon.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 tools/lib/bpf/bpf.c    |    2 +-
 tools/lib/bpf/libbpf.c |    7 +++++++
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index 9ddc89d..b1d2199 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -200,7 +200,7 @@ int bpf_load_program_xattr(const struct bpf_load_program_attr *load_attr,
 	attr.license = ptr_to_u64(load_attr->license);
 	attr.log_buf = ptr_to_u64(NULL);
 	attr.log_size = 0;
-	attr.log_level = 0;
+	attr.log_level = 1;
 	attr.kern_version = load_attr->kern_version;
 	attr.prog_ifindex = load_attr->prog_ifindex;
 	memcpy(attr.prog_name, load_attr->name,
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index b1a60ac..34ff173 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -1304,6 +1304,13 @@ static int bpf_object__collect_reloc(struct bpf_object *obj)
 	if (ret >= 0) {
 		*pfd = ret;
 		ret = 0;
+
+		pr_warning("load bpf program succesful\n");
+		if (log_buf && log_buf[0] != '\0') {
+			pr_warning("-- BEGIN DUMP LOG ---\n");
+			pr_warning("\n%s\n", log_buf);
+			pr_warning("-- END LOG --\n");
+		}
 		goto out;
 	}
 

Thread overview: 17+ messages
2018-06-01  9:32 [RFC PATCH 00/16] bpf, bounded loop support work in progress John Fastabend
2018-06-01  9:32 ` [RFC PATCH 01/16] bpf: cfg: partition basic blocks for each subprog John Fastabend
2018-06-01  9:32 ` [RFC PATCH 02/16] bpf: cfg: add edges between basic blocks to form CFG John Fastabend
2018-06-01  9:32 ` [RFC PATCH 03/16] bpf: cfg: build domination tree using Tarjan algorithm John Fastabend
2018-06-01  9:32 ` [RFC PATCH 04/16] bpf: cfg: detect loop use domination information John Fastabend
2018-06-01  9:32 ` [RFC PATCH 05/16] bpf: cfg: detect unreachable basic blocks John Fastabend
2018-06-01  9:32 ` [RFC PATCH 06/16] bpf: cfg: move find_subprog/add_subprog to cfg.c John Fastabend
2018-06-01  9:32 ` [RFC PATCH 07/16] bpf: cfg: build call graph and detect unreachable/recursive call John Fastabend
2018-06-01  9:32 ` [RFC PATCH 08/16] bpf: cfg: remove push_insn and check_cfg John Fastabend
2018-06-01  9:33 ` [RFC PATCH 09/16] bpf: cfg: reduce k*alloc/free call by using memory pool for allocating nodes John Fastabend
2018-06-01  9:33 ` [RFC PATCH 10/16] bpf: cfg: reduce memory usage by using singly list + compat pointer John Fastabend
2018-06-01  9:33 ` [RFC PATCH 11/16] bpf: cfg: detect irreducible loop using Eric Stoltz algorithm John Fastabend
2018-06-01  9:33 ` [RFC PATCH 12/16] bpf: cfg: pretty print CFG and DOM John Fastabend
2018-06-01  9:33 ` [RFC PATCH 13/16] bpf: verifier, can ptr range be calculated with scalar ALU op John Fastabend
2018-06-01  9:33 ` [RFC PATCH 14/16] bpf: verifier, add initial support to allow bounded loops John Fastabend
2018-06-01  9:33 ` [RFC PATCH 15/16] bpf: verifier, simple loop examples John Fastabend
2018-06-01  9:33 ` [RFC PATCH 16/16] bpf: tools: dbg patch to turn on debugging and add primitive examples John Fastabend
