* [RFCv2 00/16] BPF hardware offload (cls_bpf for now)
@ 2016-08-26 18:05 Jakub Kicinski
  2016-08-26 18:06 ` [RFCv2 01/16] add basic register-field manipulation macros Jakub Kicinski
                   ` (15 more replies)
  0 siblings, 16 replies; 40+ messages in thread
From: Jakub Kicinski @ 2016-08-26 18:05 UTC (permalink / raw)
  To: netdev
  Cc: ast, daniel, dinan.gunawardena, jiri, john.fastabend, kubakici,
	Jakub Kicinski

Hi!

This is an updated version of the BPF offload set.

The biggest change from the previous version is reusing the
verifier to check the exit codes and pointer types.  The change
to the verifier is not very invasive and my gut feeling is that
adding simple hooks to the core verifier is cleaner than
reimplementing parsing in every advanced translator.  It may
also open a path to some clever re-interpretation of programs
for XDP to adapt to hardware metadata details.

Item number two on the feature list is redirect to the port from
which the packet came (which will be needed for XDP soon).

Last but not least, direct action support.  The set of return
codes is limited.  One thing to note there is that we can't
trivially support TC_ACT_OK now because there is no way to tell
TC whether the packet passed the filter because of OK or because
of UNSPEC.  Perhaps there are ways of implementing this; it's
definitely a topic for discussion at netdev 1.2.

I decided to keep legacy mode just because it's easy and I find
it useful for testing things :)

Another item on the todo list is to think about the interface
stats.  Should dropped/redirected packets appear there?  Today
I provide TC stats only; interface stats are not incremented.
I'll rework the stats once Jiri's SW/HW stat set lands.

I'm still posting as an RFC because I'm waiting for patch 1 to
be merged via wireless-drivers-next, which will also make this
set 15 patches long :)


Jakub Kicinski (16):
  add basic register-field manipulation macros
  net: cls_bpf: add hardware offload
  net: cls_bpf: limit hardware offload by software-only flag
  net: cls_bpf: add support for marking filters as hardware-only
  bpf: recognize 64bit immediate loads as consts
  bpf: verifier: recognize rN ^ rN as load of 0
  bpf: enable non-core use of the verifier
  bpf: export bpf_prog_clone functions
  nfp: add BPF to NFP code translator
  nfp: bpf: add hardware bpf offload
  net: cls_bpf: allow offloaded filters to update stats
  net: bpf: allow offloaded filters to update stats
  nfp: bpf: add packet marking support
  net: act_mirred: allow statistic updates from offloaded actions
  nfp: bpf: add support for legacy redirect action
  nfp: bpf: add offload of TC direct action mode

 drivers/net/ethernet/netronome/nfp/Makefile        |    7 +
 drivers/net/ethernet/netronome/nfp/nfp_asm.h       |  233 +++
 drivers/net/ethernet/netronome/nfp/nfp_bpf.h       |  212 +++
 drivers/net/ethernet/netronome/nfp/nfp_bpf_jit.c   | 1816 ++++++++++++++++++++
 .../net/ethernet/netronome/nfp/nfp_bpf_verifier.c  |  166 ++
 drivers/net/ethernet/netronome/nfp/nfp_net.h       |   49 +-
 .../net/ethernet/netronome/nfp/nfp_net_common.c    |   81 +-
 drivers/net/ethernet/netronome/nfp/nfp_net_ctrl.h  |   53 +-
 .../net/ethernet/netronome/nfp/nfp_net_offload.c   |  291 ++++
 include/linux/bitfield.h                           |   93 +
 include/linux/bpf.h                                |    4 +
 include/linux/bpf_parser.h                         |   84 +
 include/linux/bug.h                                |    3 +
 include/linux/netdevice.h                          |    2 +
 include/net/pkt_cls.h                              |   16 +
 include/uapi/linux/pkt_cls.h                       |    1 +
 kernel/bpf/core.c                                  |    8 +-
 kernel/bpf/verifier.c                              |  135 +-
 net/sched/act_mirred.c                             |    8 +
 net/sched/cls_bpf.c                                |  116 +-
 20 files changed, 3299 insertions(+), 79 deletions(-)
 create mode 100644 drivers/net/ethernet/netronome/nfp/nfp_asm.h
 create mode 100644 drivers/net/ethernet/netronome/nfp/nfp_bpf.h
 create mode 100644 drivers/net/ethernet/netronome/nfp/nfp_bpf_jit.c
 create mode 100644 drivers/net/ethernet/netronome/nfp/nfp_bpf_verifier.c
 create mode 100644 drivers/net/ethernet/netronome/nfp/nfp_net_offload.c
 create mode 100644 include/linux/bitfield.h
 create mode 100644 include/linux/bpf_parser.h

-- 
1.9.1


* [RFCv2 01/16] add basic register-field manipulation macros
  2016-08-26 18:05 [RFCv2 00/16] BPF hardware offload (cls_bpf for now) Jakub Kicinski
@ 2016-08-26 18:06 ` Jakub Kicinski
  2016-08-29 14:34   ` Daniel Borkmann
  2016-08-26 18:06 ` [RFCv2 02/16] net: cls_bpf: add hardware offload Jakub Kicinski
                   ` (14 subsequent siblings)
  15 siblings, 1 reply; 40+ messages in thread
From: Jakub Kicinski @ 2016-08-26 18:06 UTC (permalink / raw)
  To: netdev
  Cc: ast, daniel, dinan.gunawardena, jiri, john.fastabend, kubakici,
	Jakub Kicinski

A common approach to accessing register fields is to define
structures or sets of macros containing a mask and shift pair.
Operations on the register are then performed as follows:

 field = (reg >> shift) & mask;

 reg &= ~(mask << shift);
 reg |= (field & mask) << shift;

Defining shift and mask separately is tedious.  Ivo van Doorn
came up with the idea of computing them at compile time based
on a single shifted mask (later refined by Felix), which can be
used like this:

 #define REG_FIELD 0x000ff000

 field = FIELD_GET(REG_FIELD, reg);

 reg &= ~REG_FIELD;
 reg |= FIELD_PREP(REG_FIELD, field);

The FIELD_{GET,PREP} macros take care of finding the appropriate
shift based on a compile-time ffs operation.

GENMASK can be used to define registers (which is usually
less error-prone and easier to match with datasheets).
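
For example, the field from the snippet above could equivalently be
defined with GENMASK() (from linux/bitops.h) and used the same way
(illustration only):

 #define REG_FIELD	GENMASK(19, 12)	/* same bits as 0x000ff000 */

 field = FIELD_GET(REG_FIELD, reg);

 reg &= ~REG_FIELD;
 reg |= FIELD_PREP(REG_FIELD, field);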

This approach is the most convenient I've seen, so to limit code
duplication let's move the macros to a global header file.
Attempts to use static inlines instead of macros failed due
to false-positive triggering of BUILD_BUG_ON()s, especially with
GCC < 6.0.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
 include/linux/bitfield.h | 93 ++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/bug.h      |  3 ++
 2 files changed, 96 insertions(+)
 create mode 100644 include/linux/bitfield.h

diff --git a/include/linux/bitfield.h b/include/linux/bitfield.h
new file mode 100644
index 000000000000..32ca8863e66d
--- /dev/null
+++ b/include/linux/bitfield.h
@@ -0,0 +1,93 @@
+/*
+ * Copyright (C) 2014 Felix Fietkau <nbd@nbd.name>
+ * Copyright (C) 2004 - 2009 Ivo van Doorn <IvDoorn@gmail.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2
+ * as published by the Free Software Foundation
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef _LINUX_BITFIELD_H
+#define _LINUX_BITFIELD_H
+
+#include <linux/bug.h>
+
+/*
+ * Bitfield access macros
+ *
+ * FIELD_{GET,PREP} macros take as first parameter shifted mask
+ * from which they extract the base mask and shift amount.
+ * Mask must be a compilation time constant.
+ *
+ * Example:
+ *
+ *  #define REG_FIELD_A  GENMASK(6, 0)
+ *  #define REG_FIELD_B  BIT(7)
+ *  #define REG_FIELD_C  GENMASK(15, 8)
+ *  #define REG_FIELD_D  GENMASK(31, 16)
+ *
+ * Get:
+ *  a = FIELD_GET(REG_FIELD_A, reg);
+ *  b = FIELD_GET(REG_FIELD_B, reg);
+ *
+ * Set:
+ *  reg = FIELD_PREP(REG_FIELD_A, 1) |
+ *	  FIELD_PREP(REG_FIELD_B, 0) |
+ *	  FIELD_PREP(REG_FIELD_C, c) |
+ *	  FIELD_PREP(REG_FIELD_D, 0x40);
+ *
+ * Modify:
+ *  reg &= ~REG_FIELD_C;
+ *  reg |= FIELD_PREP(REG_FIELD_C, c);
+ */
+
+#define _bf_shf(x) (__builtin_ffsll(x) - 1)
+
+#define _BF_FIELD_CHECK(_mask, _reg, _val, _pfx)			\
+	({								\
+		BUILD_BUG_ON_MSG(!__builtin_constant_p(_mask),		\
+				 _pfx "mask is not constant");		\
+		BUILD_BUG_ON_MSG(!(_mask), _pfx "mask is zero");	\
+		BUILD_BUG_ON_MSG(__builtin_constant_p(_val) ?		\
+				 ~((_mask) >> _bf_shf(_mask)) & (_val) : 0, \
+				 _pfx "value too large for the field"); \
+		BUILD_BUG_ON_MSG((_mask) > (typeof(_reg))~0ull,		\
+				 _pfx "type of reg too small for mask"); \
+		__BUILD_BUG_ON_NOT_POWER_OF_2((_mask) +			\
+					      (1ULL << _bf_shf(_mask))); \
+	})
+
+/**
+ * FIELD_PREP() - prepare a bitfield element
+ * @_mask: shifted mask defining the field's length and position
+ * @_val:  value to put in the field
+ *
+ * FIELD_PREP() masks and shifts up the value.  The result should
+ * be combined with other fields of the bitfield using logical OR.
+ */
+#define FIELD_PREP(_mask, _val)						\
+	({								\
+		_BF_FIELD_CHECK(_mask, 0ULL, _val, "FIELD_PREP: ");	\
+		((typeof(_mask))(_val) << _bf_shf(_mask)) & (_mask);	\
+	})
+
+/**
+ * FIELD_GET() - extract a bitfield element
+ * @_mask: shifted mask defining the field's length and position
+ * @_reg:  32bit value of entire bitfield
+ *
+ * FIELD_GET() extracts the field specified by @_mask from the
+ * bitfield passed in as @_reg by masking and shifting it down.
+ */
+#define FIELD_GET(_mask, _reg)						\
+	({								\
+		_BF_FIELD_CHECK(_mask, _reg, 0U, "FIELD_GET: ");	\
+		(typeof(_mask))(((_reg) & (_mask)) >> _bf_shf(_mask));	\
+	})
+
+#endif
diff --git a/include/linux/bug.h b/include/linux/bug.h
index e51b0709e78d..292d6a10b0c2 100644
--- a/include/linux/bug.h
+++ b/include/linux/bug.h
@@ -13,6 +13,7 @@ enum bug_trap_type {
 struct pt_regs;
 
 #ifdef __CHECKER__
+#define __BUILD_BUG_ON_NOT_POWER_OF_2(n) (0)
 #define BUILD_BUG_ON_NOT_POWER_OF_2(n) (0)
 #define BUILD_BUG_ON_ZERO(e) (0)
 #define BUILD_BUG_ON_NULL(e) ((void*)0)
@@ -24,6 +25,8 @@ struct pt_regs;
 #else /* __CHECKER__ */
 
 /* Force a compilation error if a constant expression is not a power of 2 */
+#define __BUILD_BUG_ON_NOT_POWER_OF_2(n)	\
+	BUILD_BUG_ON(((n) & ((n) - 1)) != 0)
 #define BUILD_BUG_ON_NOT_POWER_OF_2(n)			\
 	BUILD_BUG_ON((n) == 0 || (((n) & ((n) - 1)) != 0))
 
-- 
1.9.1


* [RFCv2 02/16] net: cls_bpf: add hardware offload
  2016-08-26 18:05 [RFCv2 00/16] BPF hardware offload (cls_bpf for now) Jakub Kicinski
  2016-08-26 18:06 ` [RFCv2 01/16] add basic register-field manipulation macros Jakub Kicinski
@ 2016-08-26 18:06 ` Jakub Kicinski
  2016-08-29 14:51   ` Daniel Borkmann
  2016-08-26 18:06 ` [RFCv2 03/16] net: cls_bpf: limit hardware offload by software-only flag Jakub Kicinski
                   ` (13 subsequent siblings)
  15 siblings, 1 reply; 40+ messages in thread
From: Jakub Kicinski @ 2016-08-26 18:06 UTC (permalink / raw)
  To: netdev
  Cc: ast, daniel, dinan.gunawardena, jiri, john.fastabend, kubakici,
	Jakub Kicinski

This patch adds hardware offload capability to the cls_bpf
classifier, similar to what has been done with U32 and flower.
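
A minimal sketch of the driver side of the new TC_SETUP_CLSBPF hook
(the foo_* names are hypothetical; the real NFP wiring comes later
in this series):

 static int foo_setup_tc(struct net_device *dev, u32 handle,
			 __be16 proto, struct tc_to_netdev *tc)
 {
	struct tc_cls_bpf_offload *cls_bpf;

	if (tc->type != TC_SETUP_CLSBPF)
		return -EOPNOTSUPP;

	cls_bpf = tc->cls_bpf;
	switch (cls_bpf->command) {
	case TC_CLSBPF_ADD:
	case TC_CLSBPF_REPLACE:
		return foo_install_prog(dev, cls_bpf->filter);
	case TC_CLSBPF_DESTROY:
		return foo_remove_prog(dev);
	}
	return -EOPNOTSUPP;
 }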

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
v2:
 - drop unnecessary WARN_ON;
 - reformat error handling a bit.
---
 include/linux/netdevice.h |  2 ++
 include/net/pkt_cls.h     | 14 ++++++++++
 net/sched/cls_bpf.c       | 70 +++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 86 insertions(+)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 794bb0733799..32b9ceb9d237 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -789,6 +789,7 @@ enum {
 	TC_SETUP_CLSU32,
 	TC_SETUP_CLSFLOWER,
 	TC_SETUP_MATCHALL,
+	TC_SETUP_CLSBPF,
 };
 
 struct tc_cls_u32_offload;
@@ -800,6 +801,7 @@ struct tc_to_netdev {
 		struct tc_cls_u32_offload *cls_u32;
 		struct tc_cls_flower_offload *cls_flower;
 		struct tc_cls_matchall_offload *cls_mall;
+		struct tc_cls_bpf_offload *cls_bpf;
 	};
 };
 
diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index a459be5fe1c2..a86262f0d93a 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -486,4 +486,18 @@ struct tc_cls_matchall_offload {
 	unsigned long cookie;
 };
 
+enum tc_clsbpf_command {
+	TC_CLSBPF_ADD,
+	TC_CLSBPF_REPLACE,
+	TC_CLSBPF_DESTROY,
+};
+
+struct tc_cls_bpf_offload {
+	enum tc_clsbpf_command command;
+	struct tcf_exts *exts;
+	struct bpf_prog *filter;
+	const char *name;
+	bool exts_integrated;
+};
+
 #endif
diff --git a/net/sched/cls_bpf.c b/net/sched/cls_bpf.c
index 4742f415ee5b..ea87595c49ad 100644
--- a/net/sched/cls_bpf.c
+++ b/net/sched/cls_bpf.c
@@ -39,6 +39,7 @@ struct cls_bpf_prog {
 	struct list_head link;
 	struct tcf_result res;
 	bool exts_integrated;
+	bool offloaded;
 	struct tcf_exts exts;
 	u32 handle;
 	union {
@@ -140,6 +141,71 @@ static bool cls_bpf_is_ebpf(const struct cls_bpf_prog *prog)
 	return !prog->bpf_ops;
 }
 
+static int cls_bpf_offload_cmd(struct tcf_proto *tp, struct cls_bpf_prog *prog,
+			       enum tc_clsbpf_command cmd)
+{
+	struct net_device *dev = tp->q->dev_queue->dev;
+	struct tc_cls_bpf_offload bpf_offload = {};
+	struct tc_to_netdev offload;
+
+	offload.type = TC_SETUP_CLSBPF;
+	offload.cls_bpf = &bpf_offload;
+
+	bpf_offload.command = cmd;
+	bpf_offload.exts = &prog->exts;
+	bpf_offload.filter = prog->filter;
+	bpf_offload.name = prog->bpf_name;
+	bpf_offload.exts_integrated = prog->exts_integrated;
+
+	return dev->netdev_ops->ndo_setup_tc(dev, tp->q->handle,
+					     tp->protocol, &offload);
+}
+
+static void cls_bpf_offload(struct tcf_proto *tp, struct cls_bpf_prog *prog,
+			    struct cls_bpf_prog *oldprog)
+{
+	struct net_device *dev = tp->q->dev_queue->dev;
+	struct cls_bpf_prog *obj = prog;
+	enum tc_clsbpf_command cmd;
+
+	if (oldprog && oldprog->offloaded) {
+		if (tc_should_offload(dev, tp, 0)) {
+			cmd = TC_CLSBPF_REPLACE;
+		} else {
+			obj = oldprog;
+			cmd = TC_CLSBPF_DESTROY;
+		}
+	} else {
+		if (!tc_should_offload(dev, tp, 0))
+			return;
+		cmd = TC_CLSBPF_ADD;
+	}
+
+	if (cls_bpf_offload_cmd(tp, obj, cmd))
+		return;
+
+	obj->offloaded = true;
+	if (oldprog)
+		oldprog->offloaded = false;
+}
+
+static void cls_bpf_stop_offload(struct tcf_proto *tp,
+				 struct cls_bpf_prog *prog)
+{
+	int err;
+
+	if (!prog->offloaded)
+		return;
+
+	err = cls_bpf_offload_cmd(tp, prog, TC_CLSBPF_DESTROY);
+	if (err) {
+		pr_err("Stopping hardware offload failed: %d\n", err);
+		return;
+	}
+
+	prog->offloaded = false;
+}
+
 static int cls_bpf_init(struct tcf_proto *tp)
 {
 	struct cls_bpf_head *head;
@@ -179,6 +245,7 @@ static int cls_bpf_delete(struct tcf_proto *tp, unsigned long arg)
 {
 	struct cls_bpf_prog *prog = (struct cls_bpf_prog *) arg;
 
+	cls_bpf_stop_offload(tp, prog);
 	list_del_rcu(&prog->link);
 	tcf_unbind_filter(tp, &prog->res);
 	call_rcu(&prog->rcu, __cls_bpf_delete_prog);
@@ -195,6 +262,7 @@ static bool cls_bpf_destroy(struct tcf_proto *tp, bool force)
 		return false;
 
 	list_for_each_entry_safe(prog, tmp, &head->plist, link) {
+		cls_bpf_stop_offload(tp, prog);
 		list_del_rcu(&prog->link);
 		tcf_unbind_filter(tp, &prog->res);
 		call_rcu(&prog->rcu, __cls_bpf_delete_prog);
@@ -416,6 +484,8 @@ static int cls_bpf_change(struct net *net, struct sk_buff *in_skb,
 	if (ret < 0)
 		goto errout;
 
+	cls_bpf_offload(tp, prog, oldprog);
+
 	if (oldprog) {
 		list_replace_rcu(&oldprog->link, &prog->link);
 		tcf_unbind_filter(tp, &oldprog->res);
-- 
1.9.1


* [RFCv2 03/16] net: cls_bpf: limit hardware offload by software-only flag
  2016-08-26 18:05 [RFCv2 00/16] BPF hardware offload (cls_bpf for now) Jakub Kicinski
  2016-08-26 18:06 ` [RFCv2 01/16] add basic register-field manipulation macros Jakub Kicinski
  2016-08-26 18:06 ` [RFCv2 02/16] net: cls_bpf: add hardware offload Jakub Kicinski
@ 2016-08-26 18:06 ` Jakub Kicinski
  2016-08-29 15:06   ` Daniel Borkmann
  2016-08-26 18:06 ` [RFCv2 04/16] net: cls_bpf: add support for marking filters as hardware-only Jakub Kicinski
                   ` (12 subsequent siblings)
  15 siblings, 1 reply; 40+ messages in thread
From: Jakub Kicinski @ 2016-08-26 18:06 UTC (permalink / raw)
  To: netdev
  Cc: ast, daniel, dinan.gunawardena, jiri, john.fastabend, kubakici,
	Jakub Kicinski

Add cls_bpf support for the TCA_CLS_FLAGS_SKIP_HW flag.
Unlike U32 and flower, cls_bpf already has some netlink
flags defined.  I chose to create a new attribute to be
able to use the same flag values as the above.

Unknown flags are ignored and not reported upon dump.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
v2:
 - rename TCA_BPF_GEN_TCA_FLAGS -> TCA_BPF_FLAGS_GEN;
 - add comment about clearing unsupported flags;
 - validate flags after clearing unsupported.
---
 include/net/pkt_cls.h        |  1 +
 include/uapi/linux/pkt_cls.h |  1 +
 net/sched/cls_bpf.c          | 21 +++++++++++++++++++--
 3 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index a86262f0d93a..0a4a51f339b4 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -498,6 +498,7 @@ struct tc_cls_bpf_offload {
 	struct bpf_prog *filter;
 	const char *name;
 	bool exts_integrated;
+	u32 gen_flags;
 };
 
 #endif
diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h
index 51b5b247fb5a..5cb7ba5efe57 100644
--- a/include/uapi/linux/pkt_cls.h
+++ b/include/uapi/linux/pkt_cls.h
@@ -396,6 +396,7 @@ enum {
 	TCA_BPF_FD,
 	TCA_BPF_NAME,
 	TCA_BPF_FLAGS,
+	TCA_BPF_FLAGS_GEN,
 	__TCA_BPF_MAX,
 };
 
diff --git a/net/sched/cls_bpf.c b/net/sched/cls_bpf.c
index ea87595c49ad..1999f44075f0 100644
--- a/net/sched/cls_bpf.c
+++ b/net/sched/cls_bpf.c
@@ -27,6 +27,8 @@ MODULE_AUTHOR("Daniel Borkmann <dborkman@redhat.com>");
 MODULE_DESCRIPTION("TC BPF based classifier");
 
 #define CLS_BPF_NAME_LEN	256
+#define CLS_BPF_SUPPORTED_GEN_FLAGS		\
+	TCA_CLS_FLAGS_SKIP_HW
 
 struct cls_bpf_head {
 	struct list_head plist;
@@ -40,6 +42,7 @@ struct cls_bpf_prog {
 	struct tcf_result res;
 	bool exts_integrated;
 	bool offloaded;
+	u32 gen_flags;
 	struct tcf_exts exts;
 	u32 handle;
 	union {
@@ -55,6 +58,7 @@ struct cls_bpf_prog {
 static const struct nla_policy bpf_policy[TCA_BPF_MAX + 1] = {
 	[TCA_BPF_CLASSID]	= { .type = NLA_U32 },
 	[TCA_BPF_FLAGS]		= { .type = NLA_U32 },
+	[TCA_BPF_FLAGS_GEN]	= { .type = NLA_U32 },
 	[TCA_BPF_FD]		= { .type = NLA_U32 },
 	[TCA_BPF_NAME]		= { .type = NLA_NUL_STRING, .len = CLS_BPF_NAME_LEN },
 	[TCA_BPF_OPS_LEN]	= { .type = NLA_U16 },
@@ -156,6 +160,7 @@ static int cls_bpf_offload_cmd(struct tcf_proto *tp, struct cls_bpf_prog *prog,
 	bpf_offload.filter = prog->filter;
 	bpf_offload.name = prog->bpf_name;
 	bpf_offload.exts_integrated = prog->exts_integrated;
+	bpf_offload.gen_flags = prog->gen_flags;
 
 	return dev->netdev_ops->ndo_setup_tc(dev, tp->q->handle,
 					     tp->protocol, &offload);
@@ -169,14 +174,14 @@ static void cls_bpf_offload(struct tcf_proto *tp, struct cls_bpf_prog *prog,
 	enum tc_clsbpf_command cmd;
 
 	if (oldprog && oldprog->offloaded) {
-		if (tc_should_offload(dev, tp, 0)) {
+		if (tc_should_offload(dev, tp, prog->gen_flags)) {
 			cmd = TC_CLSBPF_REPLACE;
 		} else {
 			obj = oldprog;
 			cmd = TC_CLSBPF_DESTROY;
 		}
 	} else {
-		if (!tc_should_offload(dev, tp, 0))
+		if (!tc_should_offload(dev, tp, prog->gen_flags))
 			return;
 		cmd = TC_CLSBPF_ADD;
 	}
@@ -372,6 +377,7 @@ static int cls_bpf_modify_existing(struct net *net, struct tcf_proto *tp,
 {
 	bool is_bpf, is_ebpf, have_exts = false;
 	struct tcf_exts exts;
+	u32 gen_flags = 0;
 	int ret;
 
 	is_bpf = tb[TCA_BPF_OPS_LEN] && tb[TCA_BPF_OPS];
@@ -396,8 +402,16 @@ static int cls_bpf_modify_existing(struct net *net, struct tcf_proto *tp,
 
 		have_exts = bpf_flags & TCA_BPF_FLAG_ACT_DIRECT;
 	}
+	if (tb[TCA_BPF_FLAGS_GEN]) {
+		gen_flags = nla_get_u32(tb[TCA_BPF_FLAGS_GEN]);
+		/* Make sure dump doesn't report back flags we don't handle */
+		gen_flags &= CLS_BPF_SUPPORTED_GEN_FLAGS;
+		if (!tc_flags_valid(gen_flags))
+			return -EINVAL;
+	}
 
 	prog->exts_integrated = have_exts;
+	prog->gen_flags = gen_flags;
 
 	ret = is_bpf ? cls_bpf_prog_from_ops(tb, prog) :
 		       cls_bpf_prog_from_efd(tb, prog, tp);
@@ -569,6 +583,9 @@ static int cls_bpf_dump(struct net *net, struct tcf_proto *tp, unsigned long fh,
 		bpf_flags |= TCA_BPF_FLAG_ACT_DIRECT;
 	if (bpf_flags && nla_put_u32(skb, TCA_BPF_FLAGS, bpf_flags))
 		goto nla_put_failure;
+	if (prog->gen_flags &&
+	    nla_put_u32(skb, TCA_BPF_FLAGS_GEN, prog->gen_flags))
+		goto nla_put_failure;
 
 	nla_nest_end(skb, nest);
 
-- 
1.9.1


* [RFCv2 04/16] net: cls_bpf: add support for marking filters as hardware-only
  2016-08-26 18:05 [RFCv2 00/16] BPF hardware offload (cls_bpf for now) Jakub Kicinski
                   ` (2 preceding siblings ...)
  2016-08-26 18:06 ` [RFCv2 03/16] net: cls_bpf: limit hardware offload by software-only flag Jakub Kicinski
@ 2016-08-26 18:06 ` Jakub Kicinski
  2016-08-29 15:28   ` Daniel Borkmann
  2016-08-26 18:06 ` [RFCv2 05/16] bpf: recognize 64bit immediate loads as consts Jakub Kicinski
                   ` (11 subsequent siblings)
  15 siblings, 1 reply; 40+ messages in thread
From: Jakub Kicinski @ 2016-08-26 18:06 UTC (permalink / raw)
  To: netdev
  Cc: ast, daniel, dinan.gunawardena, jiri, john.fastabend, kubakici,
	Jakub Kicinski

Add cls_bpf support for the TCA_CLS_FLAGS_SKIP_SW flag.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
 net/sched/cls_bpf.c | 34 +++++++++++++++++++++++++---------
 1 file changed, 25 insertions(+), 9 deletions(-)

diff --git a/net/sched/cls_bpf.c b/net/sched/cls_bpf.c
index 1999f44075f0..630f296f5a90 100644
--- a/net/sched/cls_bpf.c
+++ b/net/sched/cls_bpf.c
@@ -28,7 +28,7 @@ MODULE_DESCRIPTION("TC BPF based classifier");
 
 #define CLS_BPF_NAME_LEN	256
 #define CLS_BPF_SUPPORTED_GEN_FLAGS		\
-	TCA_CLS_FLAGS_SKIP_HW
+	(TCA_CLS_FLAGS_SKIP_HW | TCA_CLS_FLAGS_SKIP_SW)
 
 struct cls_bpf_head {
 	struct list_head plist;
@@ -98,7 +98,9 @@ static int cls_bpf_classify(struct sk_buff *skb, const struct tcf_proto *tp,
 
 		qdisc_skb_cb(skb)->tc_classid = prog->res.classid;
 
-		if (at_ingress) {
+		if (tc_skip_sw(prog->gen_flags)) {
+			filter_res = prog->exts_integrated ? TC_ACT_UNSPEC : 0;
+		} else if (at_ingress) {
 			/* It is safe to push/pull even if skb_shared() */
 			__skb_push(skb, skb->mac_len);
 			bpf_compute_data_end(skb);
@@ -166,32 +168,42 @@ static int cls_bpf_offload_cmd(struct tcf_proto *tp, struct cls_bpf_prog *prog,
 					     tp->protocol, &offload);
 }
 
-static void cls_bpf_offload(struct tcf_proto *tp, struct cls_bpf_prog *prog,
-			    struct cls_bpf_prog *oldprog)
+static int cls_bpf_offload(struct tcf_proto *tp, struct cls_bpf_prog *prog,
+			   struct cls_bpf_prog *oldprog)
 {
 	struct net_device *dev = tp->q->dev_queue->dev;
 	struct cls_bpf_prog *obj = prog;
 	enum tc_clsbpf_command cmd;
+	bool skip_sw;
+	int ret;
+
+	skip_sw = tc_skip_sw(prog->gen_flags) ||
+		(oldprog && tc_skip_sw(oldprog->gen_flags));
 
 	if (oldprog && oldprog->offloaded) {
 		if (tc_should_offload(dev, tp, prog->gen_flags)) {
 			cmd = TC_CLSBPF_REPLACE;
-		} else {
+		} else if (!tc_skip_sw(prog->gen_flags)) {
 			obj = oldprog;
 			cmd = TC_CLSBPF_DESTROY;
+		} else {
+			return -EINVAL;
 		}
 	} else {
 		if (!tc_should_offload(dev, tp, prog->gen_flags))
-			return;
+			return skip_sw ? -EINVAL : 0;
 		cmd = TC_CLSBPF_ADD;
 	}
 
-	if (cls_bpf_offload_cmd(tp, obj, cmd))
-		return;
+	ret = cls_bpf_offload_cmd(tp, obj, cmd);
+	if (ret)
+		return skip_sw ? ret : 0;
 
 	obj->offloaded = true;
 	if (oldprog)
 		oldprog->offloaded = false;
+
+	return 0;
 }
 
 static void cls_bpf_stop_offload(struct tcf_proto *tp,
@@ -498,7 +510,11 @@ static int cls_bpf_change(struct net *net, struct sk_buff *in_skb,
 	if (ret < 0)
 		goto errout;
 
-	cls_bpf_offload(tp, prog, oldprog);
+	ret = cls_bpf_offload(tp, prog, oldprog);
+	if (ret) {
+		cls_bpf_delete_prog(tp, prog);
+		return ret;
+	}
 
 	if (oldprog) {
 		list_replace_rcu(&oldprog->link, &prog->link);
-- 
1.9.1


* [RFCv2 05/16] bpf: recognize 64bit immediate loads as consts
  2016-08-26 18:05 [RFCv2 00/16] BPF hardware offload (cls_bpf for now) Jakub Kicinski
                   ` (3 preceding siblings ...)
  2016-08-26 18:06 ` [RFCv2 04/16] net: cls_bpf: add support for marking filters as hardware-only Jakub Kicinski
@ 2016-08-26 18:06 ` Jakub Kicinski
  2016-08-26 18:06 ` [RFCv2 06/16] bpf: verifier: recognize rN ^ rN as load of 0 Jakub Kicinski
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 40+ messages in thread
From: Jakub Kicinski @ 2016-08-26 18:06 UTC (permalink / raw)
  To: netdev
  Cc: ast, daniel, dinan.gunawardena, jiri, john.fastabend, kubakici,
	Jakub Kicinski

When the verifier sees a generic BPF_LD | BPF_IMM | BPF_DW
instruction it should mark the dst register as CONST_IMM with
the loaded value stored in imm.
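
For reference, such a load is emitted as a two-instruction pair.
A minimal sketch using the BPF_LD_IMM64() helper from linux/filter.h
(the array is purely illustrative):

 struct bpf_insn insns[] = {
	/* low 32 bits land in insn->imm, high 32 bits in (insn + 1)->imm */
	BPF_LD_IMM64(BPF_REG_0, 0x1234567890abcdefULL),
	BPF_EXIT_INSN(),
 };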

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
 kernel/bpf/verifier.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index abb61f3f6900..db68a0e5db1e 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -1787,9 +1787,14 @@ static int check_ld_imm(struct verifier_env *env, struct bpf_insn *insn)
 	if (err)
 		return err;
 
-	if (insn->src_reg == 0)
+	if (insn->src_reg == 0) {
 		/* generic move 64-bit immediate into a register */
+		u64 imm = ((u64)(insn + 1)->imm << 32) | (u32)insn->imm;
+
+		regs[insn->dst_reg].type = CONST_IMM;
+		regs[insn->dst_reg].imm = imm;
 		return 0;
+	}
 
 	/* replace_map_fd_with_map_ptr() should have caught bad ld_imm64 */
 	BUG_ON(insn->src_reg != BPF_PSEUDO_MAP_FD);
-- 
1.9.1


* [RFCv2 06/16] bpf: verifier: recognize rN ^ rN as load of 0
  2016-08-26 18:05 [RFCv2 00/16] BPF hardware offload (cls_bpf for now) Jakub Kicinski
                   ` (4 preceding siblings ...)
  2016-08-26 18:06 ` [RFCv2 05/16] bpf: recognize 64bit immediate loads as consts Jakub Kicinski
@ 2016-08-26 18:06 ` Jakub Kicinski
  2016-08-26 18:06 ` [RFCv2 07/16] bpf: enable non-core use of the verifier Jakub Kicinski
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 40+ messages in thread
From: Jakub Kicinski @ 2016-08-26 18:06 UTC (permalink / raw)
  To: netdev
  Cc: ast, daniel, dinan.gunawardena, jiri, john.fastabend, kubakici,
	Jakub Kicinski

Teach the verifier to recognize that xoring a register with
itself makes it a constant (0).
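
For example, after this change an instruction like the following
(sketch using the BPF_ALU64_REG() helper from linux/filter.h) leaves
the destination register tracked as CONST_IMM with imm == 0:

 /* r2 ^= r2 -- now recognized as r2 = 0 */
 struct bpf_insn insn = BPF_ALU64_REG(BPF_XOR, BPF_REG_2, BPF_REG_2);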

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
 kernel/bpf/verifier.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index db68a0e5db1e..0f4494c194f9 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -1550,6 +1550,12 @@ static int check_alu_op(struct verifier_env *env, struct bpf_insn *insn)
 		verbose("invalid BPF_ALU opcode %x\n", opcode);
 		return -EINVAL;
 
+	} else if (opcode == BPF_XOR && BPF_SRC(insn->code) == BPF_X &&
+		   insn->src_reg == insn->dst_reg) {
+
+		regs[insn->dst_reg].type = CONST_IMM;
+		regs[insn->dst_reg].imm = 0;
+
 	} else {	/* all other ALU ops: and, sub, xor, add, ... */
 
 		if (BPF_SRC(insn->code) == BPF_X) {
-- 
1.9.1


* [RFCv2 07/16] bpf: enable non-core use of the verifier
  2016-08-26 18:05 [RFCv2 00/16] BPF hardware offload (cls_bpf for now) Jakub Kicinski
                   ` (5 preceding siblings ...)
  2016-08-26 18:06 ` [RFCv2 06/16] bpf: verifier: recognize rN ^ rN as load of 0 Jakub Kicinski
@ 2016-08-26 18:06 ` Jakub Kicinski
  2016-08-26 23:29   ` Alexei Starovoitov
  2016-08-26 18:06 ` [RFCv2 08/16] bpf: export bpf_prog_clone functions Jakub Kicinski
                   ` (8 subsequent siblings)
  15 siblings, 1 reply; 40+ messages in thread
From: Jakub Kicinski @ 2016-08-26 18:06 UTC (permalink / raw)
  To: netdev
  Cc: ast, daniel, dinan.gunawardena, jiri, john.fastabend, kubakici,
	Jakub Kicinski

Advanced JIT compilers and translators may want to use the
eBPF verifier as a base for parsers or to perform custom
checks and validations.

Add the ability for external users to invoke the verifier
and provide callbacks to be invoked for every instruction
checked.  For now only the most basic callback for
per-instruction pre-interpretation checks is added.  More
advanced users may also like to have a per-instruction post
callback and a state comparison callback.
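
A minimal sketch of how an external translator might plug into the
interface added below (the my_drv_* names are hypothetical):

 static int my_drv_insn_hook(struct verifier_env *env,
			     int insn_idx, int prev_insn_idx)
 {
	/* inspect env->cur_state and env->prog->insnsi[insn_idx] here;
	 * returning an error aborts the parse
	 */
	return 0;
 }

 static const struct bpf_ext_parser_ops my_drv_pops = {
	.insn_hook	= my_drv_insn_hook,
 };

 /* from the translator's entry point: */
 err = bpf_parse(prog, &my_drv_pops, priv_state);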

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
 include/linux/bpf_parser.h |  84 +++++++++++++++++++++++++++++++
 kernel/bpf/verifier.c      | 122 +++++++++++++++++++++++----------------------
 2 files changed, 146 insertions(+), 60 deletions(-)
 create mode 100644 include/linux/bpf_parser.h

diff --git a/include/linux/bpf_parser.h b/include/linux/bpf_parser.h
new file mode 100644
index 000000000000..1b73cc464914
--- /dev/null
+++ b/include/linux/bpf_parser.h
@@ -0,0 +1,84 @@
+/* Copyright (c) 2011-2014 PLUMgrid, http://plumgrid.com
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of version 2 of the GNU General Public
+ * License as published by the Free Software Foundation.
+ */
+#ifndef _LINUX_BPF_PARSER_H
+#define _LINUX_BPF_PARSER_H 1
+
+#include <linux/bpf.h> /* for enum bpf_reg_type */
+#include <linux/filter.h> /* for MAX_BPF_STACK */
+
+struct reg_state {
+	enum bpf_reg_type type;
+	union {
+		/* valid when type == CONST_IMM | PTR_TO_STACK | UNKNOWN_VALUE */
+		s64 imm;
+
+		/* valid when type == PTR_TO_PACKET* */
+		struct {
+			u32 id;
+			u16 off;
+			u16 range;
+		};
+
+		/* valid when type == CONST_PTR_TO_MAP | PTR_TO_MAP_VALUE |
+		 *   PTR_TO_MAP_VALUE_OR_NULL
+		 */
+		struct bpf_map *map_ptr;
+	};
+};
+
+enum bpf_stack_slot_type {
+	STACK_INVALID,    /* nothing was stored in this stack slot */
+	STACK_SPILL,      /* register spilled into stack */
+	STACK_MISC	  /* BPF program wrote some data into this slot */
+};
+
+#define BPF_REG_SIZE 8	/* size of eBPF register in bytes */
+
+/* state of the program:
+ * type of all registers and stack info
+ */
+struct verifier_state {
+	struct reg_state regs[MAX_BPF_REG];
+	u8 stack_slot_type[MAX_BPF_STACK];
+	struct reg_state spilled_regs[MAX_BPF_STACK / BPF_REG_SIZE];
+};
+
+/* linked list of verifier states used to prune search */
+struct verifier_state_list {
+	struct verifier_state state;
+	struct verifier_state_list *next;
+};
+
+#define MAX_USED_MAPS 64 /* max number of maps accessed by one eBPF program */
+
+struct verifier_env;
+struct bpf_ext_parser_ops {
+	int (*insn_hook)(struct verifier_env *env,
+			 int insn_idx, int prev_insn_idx);
+};
+
+/* single container for all structs
+ * one verifier_env per bpf_check() call
+ */
+struct verifier_env {
+	struct bpf_prog *prog;		/* eBPF program being verified */
+	struct verifier_stack_elem *head; /* stack of verifier states to be processed */
+	int stack_size;			/* number of states to be processed */
+	struct verifier_state cur_state; /* current verifier state */
+	struct verifier_state_list **explored_states; /* search pruning optimization */
+	const struct bpf_ext_parser_ops *pops; /* external parser ops */
+	void *ppriv; /* pointer to external parser's private data */
+	struct bpf_map *used_maps[MAX_USED_MAPS]; /* array of map's used by eBPF program */
+	u32 used_map_cnt;		/* number of used maps */
+	u32 id_gen;			/* used to generate unique reg IDs */
+	bool allow_ptr_leaks;
+};
+
+int bpf_parse(struct bpf_prog *prog, const struct bpf_ext_parser_ops *pops,
+	      void *ppriv);
+
+#endif /* _LINUX_BPF_PARSER_H */
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 0f4494c194f9..e91faad7d2b2 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -14,6 +14,7 @@
 #include <linux/types.h>
 #include <linux/slab.h>
 #include <linux/bpf.h>
+#include <linux/bpf_parser.h>
 #include <linux/filter.h>
 #include <net/netlink.h>
 #include <linux/file.h>
@@ -126,49 +127,6 @@
  * are set to NOT_INIT to indicate that they are no longer readable.
  */
 
-struct reg_state {
-	enum bpf_reg_type type;
-	union {
-		/* valid when type == CONST_IMM | PTR_TO_STACK | UNKNOWN_VALUE */
-		s64 imm;
-
-		/* valid when type == PTR_TO_PACKET* */
-		struct {
-			u32 id;
-			u16 off;
-			u16 range;
-		};
-
-		/* valid when type == CONST_PTR_TO_MAP | PTR_TO_MAP_VALUE |
-		 *   PTR_TO_MAP_VALUE_OR_NULL
-		 */
-		struct bpf_map *map_ptr;
-	};
-};
-
-enum bpf_stack_slot_type {
-	STACK_INVALID,    /* nothing was stored in this stack slot */
-	STACK_SPILL,      /* register spilled into stack */
-	STACK_MISC	  /* BPF program wrote some data into this slot */
-};
-
-#define BPF_REG_SIZE 8	/* size of eBPF register in bytes */
-
-/* state of the program:
- * type of all registers and stack info
- */
-struct verifier_state {
-	struct reg_state regs[MAX_BPF_REG];
-	u8 stack_slot_type[MAX_BPF_STACK];
-	struct reg_state spilled_regs[MAX_BPF_STACK / BPF_REG_SIZE];
-};
-
-/* linked list of verifier states used to prune search */
-struct verifier_state_list {
-	struct verifier_state state;
-	struct verifier_state_list *next;
-};
-
 /* verifier_state + insn_idx are pushed to stack when branch is encountered */
 struct verifier_stack_elem {
 	/* verifer state is 'st'
@@ -181,23 +139,6 @@ struct verifier_stack_elem {
 	struct verifier_stack_elem *next;
 };
 
-#define MAX_USED_MAPS 64 /* max number of maps accessed by one eBPF program */
-
-/* single container for all structs
- * one verifier_env per bpf_check() call
- */
-struct verifier_env {
-	struct bpf_prog *prog;		/* eBPF program being verified */
-	struct verifier_stack_elem *head; /* stack of verifier states to be processed */
-	int stack_size;			/* number of states to be processed */
-	struct verifier_state cur_state; /* current verifier state */
-	struct verifier_state_list **explored_states; /* search pruning optimization */
-	struct bpf_map *used_maps[MAX_USED_MAPS]; /* array of map's used by eBPF program */
-	u32 used_map_cnt;		/* number of used maps */
-	u32 id_gen;			/* used to generate unique reg IDs */
-	bool allow_ptr_leaks;
-};
-
 #define BPF_COMPLEXITY_LIMIT_INSNS	65536
 #define BPF_COMPLEXITY_LIMIT_STACK	1024
 
@@ -683,6 +624,10 @@ static int check_packet_access(struct verifier_env *env, u32 regno, int off,
 static int check_ctx_access(struct verifier_env *env, int off, int size,
 			    enum bpf_access_type t, enum bpf_reg_type *reg_type)
 {
+	/* for parser ctx accesses are already validated and converted */
+	if (env->pops)
+		return 0;
+
 	if (env->prog->aux->ops->is_valid_access &&
 	    env->prog->aux->ops->is_valid_access(off, size, t, reg_type)) {
 		/* remember the offset of last byte accessed in ctx */
@@ -2256,6 +2201,15 @@ static int is_state_visited(struct verifier_env *env, int insn_idx)
 	return 0;
 }
 
+static int ext_parser_hook(struct verifier_env *env,
+			   int insn_idx, int prev_insn_idx)
+{
+	if (!env->pops || !env->pops->insn_hook)
+		return 0;
+
+	return env->pops->insn_hook(env, insn_idx, prev_insn_idx);
+}
+
 static int do_check(struct verifier_env *env)
 {
 	struct verifier_state *state = &env->cur_state;
@@ -2314,6 +2268,10 @@ static int do_check(struct verifier_env *env)
 			print_bpf_insn(insn);
 		}
 
+		err = ext_parser_hook(env, insn_idx, prev_insn_idx);
+		if (err)
+			return err;
+
 		if (class == BPF_ALU || class == BPF_ALU64) {
 			err = check_alu_op(env, insn);
 			if (err)
@@ -2832,3 +2790,47 @@ free_env:
 	mutex_unlock(&bpf_verifier_lock);
 	return ret;
 }
+
+int bpf_parse(struct bpf_prog *prog, const struct bpf_ext_parser_ops *pops,
+	      void *ppriv)
+{
+	struct verifier_env *env;
+	int ret;
+
+	env = kzalloc(sizeof(struct verifier_env), GFP_KERNEL);
+	if (!env)
+		return -ENOMEM;
+
+	env->prog = prog;
+	env->pops = pops;
+	env->ppriv = ppriv;
+
+	/* grab the mutex to protect few globals used by verifier */
+	mutex_lock(&bpf_verifier_lock);
+
+	log_level = 0;
+
+	env->explored_states = kcalloc(env->prog->len,
+				       sizeof(struct verifier_state_list *),
+				       GFP_KERNEL);
+	ret = -ENOMEM;
+	if (!env->explored_states)
+		goto skip_full_check;
+
+	ret = check_cfg(env);
+	if (ret < 0)
+		goto skip_full_check;
+
+	env->allow_ptr_leaks = capable(CAP_SYS_ADMIN);
+
+	ret = do_check(env);
+
+skip_full_check:
+	while (pop_stack(env, NULL) >= 0);
+	free_states(env);
+
+	kfree(env);
+	mutex_unlock(&bpf_verifier_lock);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(bpf_parse);
-- 
1.9.1


* [RFCv2 08/16] bpf: export bpf_prog_clone functions
  2016-08-26 18:05 [RFCv2 00/16] BPF hardware offload (cls_bpf for now) Jakub Kicinski
                   ` (6 preceding siblings ...)
  2016-08-26 18:06 ` [RFCv2 07/16] bpf: enable non-core use of the verifier Jakub Kicinski
@ 2016-08-26 18:06 ` Jakub Kicinski
  2016-08-26 18:06 ` [RFCv2 09/16] nfp: add BPF to NFP code translator Jakub Kicinski
                   ` (7 subsequent siblings)
  15 siblings, 0 replies; 40+ messages in thread
From: Jakub Kicinski @ 2016-08-26 18:06 UTC (permalink / raw)
  To: netdev
  Cc: ast, daniel, dinan.gunawardena, jiri, john.fastabend, kubakici,
	Jakub Kicinski

Export bpf_prog_clone_create() and bpf_prog_clone_free().

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
 include/linux/bpf.h | 4 ++++
 kernel/bpf/core.c   | 8 +++++---
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 11134238417d..39f32c5ad445 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -319,4 +319,8 @@ extern const struct bpf_func_proto bpf_get_stackid_proto;
 void bpf_user_rnd_init_once(void);
 u64 bpf_user_rnd_u32(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5);
 
+struct bpf_prog *bpf_prog_clone_create(struct bpf_prog *fp_other,
+				       gfp_t gfp_extra_flags);
+void bpf_prog_clone_free(struct bpf_prog *fp);
+
 #endif /* _LINUX_BPF_H */
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 03fd23d4d587..c6e7ed9b6a24 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -351,8 +351,8 @@ out:
 	return to - to_buff;
 }
 
-static struct bpf_prog *bpf_prog_clone_create(struct bpf_prog *fp_other,
-					      gfp_t gfp_extra_flags)
+struct bpf_prog *bpf_prog_clone_create(struct bpf_prog *fp_other,
+				       gfp_t gfp_extra_flags)
 {
 	gfp_t gfp_flags = GFP_KERNEL | __GFP_HIGHMEM | __GFP_ZERO |
 			  gfp_extra_flags;
@@ -371,8 +371,9 @@ static struct bpf_prog *bpf_prog_clone_create(struct bpf_prog *fp_other,
 
 	return fp;
 }
+EXPORT_SYMBOL_GPL(bpf_prog_clone_create);
 
-static void bpf_prog_clone_free(struct bpf_prog *fp)
+void bpf_prog_clone_free(struct bpf_prog *fp)
 {
 	/* aux was stolen by the other clone, so we cannot free
 	 * it from this path! It will be freed eventually by the
@@ -384,6 +385,7 @@ static void bpf_prog_clone_free(struct bpf_prog *fp)
 	fp->aux = NULL;
 	__bpf_prog_free(fp);
 }
+EXPORT_SYMBOL_GPL(bpf_prog_clone_free);
 
 void bpf_jit_prog_release_other(struct bpf_prog *fp, struct bpf_prog *fp_other)
 {
-- 
1.9.1


* [RFCv2 09/16] nfp: add BPF to NFP code translator
  2016-08-26 18:05 [RFCv2 00/16] BPF hardware offload (cls_bpf for now) Jakub Kicinski
                   ` (7 preceding siblings ...)
  2016-08-26 18:06 ` [RFCv2 08/16] bpf: export bpf_prog_clone functions Jakub Kicinski
@ 2016-08-26 18:06 ` Jakub Kicinski
  2016-08-26 18:06 ` [RFCv2 10/16] nfp: bpf: add hardware bpf offload Jakub Kicinski
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 40+ messages in thread
From: Jakub Kicinski @ 2016-08-26 18:06 UTC (permalink / raw)
  To: netdev
  Cc: ast, daniel, dinan.gunawardena, jiri, john.fastabend, kubakici,
	Jakub Kicinski

Add a translator for JITing eBPF to operations which
can be executed on NFP's programmable engines.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
 drivers/net/ethernet/netronome/nfp/Makefile        |    6 +
 drivers/net/ethernet/netronome/nfp/nfp_asm.h       |  233 +++
 drivers/net/ethernet/netronome/nfp/nfp_bpf.h       |  208 +++
 drivers/net/ethernet/netronome/nfp/nfp_bpf_jit.c   | 1729 ++++++++++++++++++++
 .../net/ethernet/netronome/nfp/nfp_bpf_verifier.c  |  157 ++
 5 files changed, 2333 insertions(+)
 create mode 100644 drivers/net/ethernet/netronome/nfp/nfp_asm.h
 create mode 100644 drivers/net/ethernet/netronome/nfp/nfp_bpf.h
 create mode 100644 drivers/net/ethernet/netronome/nfp/nfp_bpf_jit.c
 create mode 100644 drivers/net/ethernet/netronome/nfp/nfp_bpf_verifier.c

diff --git a/drivers/net/ethernet/netronome/nfp/Makefile b/drivers/net/ethernet/netronome/nfp/Makefile
index 68178819ff12..5f12689bf523 100644
--- a/drivers/net/ethernet/netronome/nfp/Makefile
+++ b/drivers/net/ethernet/netronome/nfp/Makefile
@@ -5,4 +5,10 @@ nfp_netvf-objs := \
 	    nfp_net_ethtool.o \
 	    nfp_netvf_main.o
 
+ifeq ($(CONFIG_BPF_SYSCALL),y)
+nfp_netvf-objs += \
+	    nfp_bpf_verifier.o \
+	    nfp_bpf_jit.o
+endif
+
 nfp_netvf-$(CONFIG_NFP_NET_DEBUG) += nfp_net_debugfs.o
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_asm.h b/drivers/net/ethernet/netronome/nfp/nfp_asm.h
new file mode 100644
index 000000000000..22484b6fd3e8
--- /dev/null
+++ b/drivers/net/ethernet/netronome/nfp/nfp_asm.h
@@ -0,0 +1,233 @@
+/*
+ * Copyright (C) 2016 Netronome Systems, Inc.
+ *
+ * This software is dual licensed under the GNU General License Version 2,
+ * June 1991 as shown in the file COPYING in the top-level directory of this
+ * source tree or the BSD 2-Clause License provided below.  You have the
+ * option to license this software under the complete terms of either license.
+ *
+ * The BSD 2-Clause License:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      1. Redistributions of source code must retain the above
+ *         copyright notice, this list of conditions and the following
+ *         disclaimer.
+ *
+ *      2. Redistributions in binary form must reproduce the above
+ *         copyright notice, this list of conditions and the following
+ *         disclaimer in the documentation and/or other materials
+ *         provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef __NFP_ASM_H__
+#define __NFP_ASM_H__ 1
+
+#include "nfp_bpf.h"
+
+#define REG_NONE	0
+
+#define RE_REG_NO_DST	0x020
+#define RE_REG_IMM	0x020
+#define RE_REG_IMM_encode(x)					\
+	(RE_REG_IMM | ((x) & 0x1f) | (((x) & 0x60) << 1))
+#define RE_REG_IMM_MAX	 0x07fULL
+#define RE_REG_XFR	0x080
+
+#define UR_REG_XFR	0x180
+#define UR_REG_NN	0x280
+#define UR_REG_NO_DST	0x300
+#define UR_REG_IMM	UR_REG_NO_DST
+#define UR_REG_IMM_encode(x) (UR_REG_IMM | (x))
+#define UR_REG_IMM_MAX	 0x0ffULL
+
+#define OP_BR_BASE	0x0d800000020ULL
+#define OP_BR_BASE_MASK	0x0f8000c3ce0ULL
+#define OP_BR_MASK	0x0000000001fULL
+#define OP_BR_EV_PIP	0x00000000300ULL
+#define OP_BR_CSS	0x0000003c000ULL
+#define OP_BR_DEFBR	0x00000300000ULL
+#define OP_BR_ADDR_LO	0x007ffc00000ULL
+#define OP_BR_ADDR_HI	0x10000000000ULL
+
+#define nfp_is_br(_insn)				\
+	(((_insn) & OP_BR_BASE_MASK) == OP_BR_BASE)
+
+enum br_mask {
+	BR_BEQ = 0x00,
+	BR_BNE = 0x01,
+	BR_BHS = 0x04,
+	BR_BLO = 0x05,
+	BR_BGE = 0x08,
+	BR_UNC = 0x18,
+};
+
+enum br_ev_pip {
+	BR_EV_PIP_UNCOND = 0,
+	BR_EV_PIP_COND = 1,
+};
+
+enum br_ctx_signal_state {
+	BR_CSS_NONE = 2,
+};
+
+#define OP_BBYTE_BASE	0x0c800000000ULL
+#define OP_BB_A_SRC	0x000000000ffULL
+#define OP_BB_BYTE	0x00000000300ULL
+#define OP_BB_B_SRC	0x0000003fc00ULL
+#define OP_BB_I8	0x00000040000ULL
+#define OP_BB_EQ	0x00000080000ULL
+#define OP_BB_DEFBR	0x00000300000ULL
+#define OP_BB_ADDR_LO	0x007ffc00000ULL
+#define OP_BB_ADDR_HI	0x10000000000ULL
+
+#define OP_BALU_BASE	0x0e800000000ULL
+#define OP_BA_A_SRC	0x000000003ffULL
+#define OP_BA_B_SRC	0x000000ffc00ULL
+#define OP_BA_DEFBR	0x00000300000ULL
+#define OP_BA_ADDR_HI	0x0007fc00000ULL
+
+#define OP_IMMED_A_SRC	0x000000003ffULL
+#define OP_IMMED_B_SRC	0x000000ffc00ULL
+#define OP_IMMED_IMM	0x0000ff00000ULL
+#define OP_IMMED_WIDTH	0x00060000000ULL
+#define OP_IMMED_INV	0x00080000000ULL
+#define OP_IMMED_SHIFT	0x00600000000ULL
+#define OP_IMMED_BASE	0x0f000000000ULL
+#define OP_IMMED_WR_AB	0x20000000000ULL
+
+enum immed_width {
+	IMMED_WIDTH_ALL = 0,
+	IMMED_WIDTH_BYTE = 1,
+	IMMED_WIDTH_WORD = 2,
+};
+
+enum immed_shift {
+	IMMED_SHIFT_0B = 0,
+	IMMED_SHIFT_1B = 1,
+	IMMED_SHIFT_2B = 2,
+};
+
+#define OP_SHF_BASE	0x08000000000ULL
+#define OP_SHF_A_SRC	0x000000000ffULL
+#define OP_SHF_SC	0x00000000300ULL
+#define OP_SHF_B_SRC	0x0000003fc00ULL
+#define OP_SHF_I8	0x00000040000ULL
+#define OP_SHF_SW	0x00000080000ULL
+#define OP_SHF_DST	0x0000ff00000ULL
+#define OP_SHF_SHIFT	0x001f0000000ULL
+#define OP_SHF_OP	0x00e00000000ULL
+#define OP_SHF_DST_AB	0x01000000000ULL
+#define OP_SHF_WR_AB	0x20000000000ULL
+
+enum shf_op {
+	SHF_OP_NONE = 0,
+	SHF_OP_AND = 2,
+	SHF_OP_OR = 5,
+};
+
+enum shf_sc {
+	SHF_SC_R_ROT = 0,
+	SHF_SC_R_SHF = 1,
+	SHF_SC_L_SHF = 2,
+	SHF_SC_R_DSHF = 3,
+};
+
+#define OP_ALU_A_SRC	0x000000003ffULL
+#define OP_ALU_B_SRC	0x000000ffc00ULL
+#define OP_ALU_DST	0x0003ff00000ULL
+#define OP_ALU_SW	0x00040000000ULL
+#define OP_ALU_OP	0x00f80000000ULL
+#define OP_ALU_DST_AB	0x01000000000ULL
+#define OP_ALU_BASE	0x0a000000000ULL
+#define OP_ALU_WR_AB	0x20000000000ULL
+
+enum alu_op {
+	ALU_OP_NONE	= 0x00,
+	ALU_OP_ADD	= 0x01,
+	ALU_OP_NEG	= 0x04,
+	ALU_OP_AND	= 0x08,
+	ALU_OP_SUB_C	= 0x0d,
+	ALU_OP_ADD_C	= 0x11,
+	ALU_OP_OR	= 0x14,
+	ALU_OP_SUB	= 0x15,
+	ALU_OP_XOR	= 0x18,
+};
+
+enum alu_dst_ab {
+	ALU_DST_A = 0,
+	ALU_DST_B = 1,
+};
+
+#define OP_LDF_BASE	0x0c000000000ULL
+#define OP_LDF_A_SRC	0x000000000ffULL
+#define OP_LDF_SC	0x00000000300ULL
+#define OP_LDF_B_SRC	0x0000003fc00ULL
+#define OP_LDF_I8	0x00000040000ULL
+#define OP_LDF_SW	0x00000080000ULL
+#define OP_LDF_ZF	0x00000100000ULL
+#define OP_LDF_BMASK	0x0000f000000ULL
+#define OP_LDF_SHF	0x001f0000000ULL
+#define OP_LDF_WR_AB	0x20000000000ULL
+
+#define OP_CMD_A_SRC	 0x000000000ffULL
+#define OP_CMD_CTX	 0x00000000300ULL
+#define OP_CMD_B_SRC	 0x0000003fc00ULL
+#define OP_CMD_TOKEN	 0x000000c0000ULL
+#define OP_CMD_XFER	 0x00001f00000ULL
+#define OP_CMD_CNT	 0x0000e000000ULL
+#define OP_CMD_SIG	 0x000f0000000ULL
+#define OP_CMD_TGT_CMD	 0x07f00000000ULL
+#define OP_CMD_MODE	0x1c0000000000ULL
+
+struct cmd_tgt_act {
+	u8 token;
+	u8 tgt_cmd;
+};
+
+enum cmd_tgt_map {
+	CMD_TGT_READ8,
+	CMD_TGT_WRITE8,
+	CMD_TGT_READ_LE,
+	CMD_TGT_READ_SWAP_LE,
+	__CMD_TGT_MAP_SIZE,
+};
+
+enum cmd_mode {
+	CMD_MODE_40b_AB	= 0,
+	CMD_MODE_40b_BA	= 1,
+	CMD_MODE_32b	= 4,
+};
+
+enum cmd_ctx_swap {
+	CMD_CTX_SWAP = 0,
+	CMD_CTX_NO_SWAP = 3,
+};
+
+#define OP_LCSR_BASE	0x0fc00000000ULL
+#define OP_LCSR_A_SRC	0x000000003ffULL
+#define OP_LCSR_B_SRC	0x000000ffc00ULL
+#define OP_LCSR_WRITE	0x00000200000ULL
+#define OP_LCSR_ADDR	0x001ffc00000ULL
+
+enum lcsr_wr_src {
+	LCSR_WR_AREG,
+	LCSR_WR_BREG,
+	LCSR_WR_IMM,
+};
+
+#define OP_CARB_BASE	0x0e000000000ULL
+#define OP_CARB_OR	0x00000010000ULL
+
+#endif
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_bpf.h b/drivers/net/ethernet/netronome/nfp/nfp_bpf.h
new file mode 100644
index 000000000000..f4265f88db23
--- /dev/null
+++ b/drivers/net/ethernet/netronome/nfp/nfp_bpf.h
@@ -0,0 +1,208 @@
+/*
+ * Copyright (C) 2016 Netronome Systems, Inc.
+ *
+ * This software is dual licensed under the GNU General License Version 2,
+ * June 1991 as shown in the file COPYING in the top-level directory of this
+ * source tree or the BSD 2-Clause License provided below.  You have the
+ * option to license this software under the complete terms of either license.
+ *
+ * The BSD 2-Clause License:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      1. Redistributions of source code must retain the above
+ *         copyright notice, this list of conditions and the following
+ *         disclaimer.
+ *
+ *      2. Redistributions in binary form must reproduce the above
+ *         copyright notice, this list of conditions and the following
+ *         disclaimer in the documentation and/or other materials
+ *         provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef __NFP_BPF_H__
+#define __NFP_BPF_H__ 1
+
+#include <linux/bitfield.h>
+#include <linux/bpf.h>
+#include <linux/list.h>
+#include <linux/types.h>
+
+#define FIELD_FIT(mask, val)  (!((((u64)val) << _bf_shf(mask)) & ~(mask)))
+
+/* For branch fixup logic use up-most byte of branch instruction as scratch
+ * area.  Remember to clear this before sending instructions to HW!
+ */
+#define OP_BR_SPECIAL	0xff00000000000000ULL
+
+enum br_special {
+	OP_BR_NORMAL = 0,
+	OP_BR_GO_OUT,
+	OP_BR_GO_ABORT,
+};
+
+enum static_regs {
+	STATIC_REG_PKT		= 1,
+#define REG_PKT_BANK	ALU_DST_A
+	STATIC_REG_IMM		= 2, /* Bank AB */
+};
+
+enum nfp_bpf_action_type {
+	NN_ACT_TC_DROP,
+};
+
+/* Software register representation, hardware encoding in asm.h */
+#define NN_REG_TYPE	GENMASK(31, 24)
+#define NN_REG_VAL	GENMASK(7, 0)
+
+enum nfp_bpf_reg_type {
+	NN_REG_GPR_A =	BIT(0),
+	NN_REG_GPR_B =	BIT(1),
+	NN_REG_NNR =	BIT(2),
+	NN_REG_XFER =	BIT(3),
+	NN_REG_IMM =	BIT(4),
+	NN_REG_NONE =	BIT(5),
+};
+
+#define NN_REG_GPR_BOTH	(NN_REG_GPR_A | NN_REG_GPR_B)
+
+#define reg_both(x)	((x) | FIELD_PREP(NN_REG_TYPE, NN_REG_GPR_BOTH))
+#define reg_a(x)	((x) | FIELD_PREP(NN_REG_TYPE, NN_REG_GPR_A))
+#define reg_b(x)	((x) | FIELD_PREP(NN_REG_TYPE, NN_REG_GPR_B))
+#define reg_nnr(x)	((x) | FIELD_PREP(NN_REG_TYPE, NN_REG_NNR))
+#define reg_xfer(x)	((x) | FIELD_PREP(NN_REG_TYPE, NN_REG_XFER))
+#define reg_imm(x)	((x) | FIELD_PREP(NN_REG_TYPE, NN_REG_IMM))
+#define reg_none()	(FIELD_PREP(NN_REG_TYPE, NN_REG_NONE))
+
+#define pkt_reg(np)	reg_a((np)->regs_per_thread - STATIC_REG_PKT)
+#define imm_a(np)	reg_a((np)->regs_per_thread - STATIC_REG_IMM)
+#define imm_b(np)	reg_b((np)->regs_per_thread - STATIC_REG_IMM)
+#define imm_both(np)	reg_both((np)->regs_per_thread - STATIC_REG_IMM)
+
+#define NFP_BPF_ABI_FLAGS	reg_nnr(0)
+#define NFP_BPF_ABI_PKT		reg_nnr(2)
+#define NFP_BPF_ABI_LEN		reg_nnr(3)
+
+struct nfp_prog;
+struct nfp_insn_meta;
+typedef int (*instr_cb_t)(struct nfp_prog *, struct nfp_insn_meta *);
+
+#define nfp_prog_first_meta(nfp_prog)					\
+	list_first_entry(&(nfp_prog)->insns, struct nfp_insn_meta, l)
+#define nfp_prog_last_meta(nfp_prog)					\
+	list_last_entry(&(nfp_prog)->insns, struct nfp_insn_meta, l)
+#define nfp_meta_next(meta)	list_next_entry(meta, l)
+#define nfp_meta_prev(meta)	list_prev_entry(meta, l)
+
+/**
+ * struct nfp_insn_meta - BPF instruction wrapper
+ * @insn: BPF instruction
+ * @off: index of first generated machine instruction (in nfp_prog.prog)
+ * @n: eBPF instruction number
+ * @skip: skip this instruction (optimized out)
+ * @double_cb: callback for second part of the instruction
+ * @l: link on nfp_prog->insns list
+ */
+struct nfp_insn_meta {
+	struct bpf_insn insn;
+	unsigned int off;
+	unsigned short n;
+	bool skip;
+	instr_cb_t double_cb;
+
+	struct list_head l;
+};
+
+#define BPF_SIZE_MASK	0x18
+
+static inline u8 mbpf_class(const struct nfp_insn_meta *meta)
+{
+	return BPF_CLASS(meta->insn.code);
+}
+
+static inline u8 mbpf_src(const struct nfp_insn_meta *meta)
+{
+	return BPF_SRC(meta->insn.code);
+}
+
+static inline u8 mbpf_op(const struct nfp_insn_meta *meta)
+{
+	return BPF_OP(meta->insn.code);
+}
+
+static inline u8 mbpf_mode(const struct nfp_insn_meta *meta)
+{
+	return BPF_MODE(meta->insn.code);
+}
+
+/**
+ * struct nfp_prog - nfp BPF program
+ * @prog: machine code
+ * @prog_len: number of valid instructions in @prog array
+ * @__prog_alloc_len: alloc size of @prog array
+ * @act: BPF program/action type (TC DA, TC with action, XDP etc.)
+ * @num_regs: number of registers used by this program
+ * @regs_per_thread: number of basic registers allocated per thread
+ * @start_off: address of the first instruction in the memory
+ * @tgt_out: jump target for normal exit
+ * @tgt_abort: jump target for abort (e.g. access outside of packet buffer)
+ * @tgt_done: jump target to get the next packet
+ * @n_translated: number of successfully translated instructions (for errors)
+ * @error: error code if something went wrong
+ * @insns: list of BPF instruction wrappers (struct nfp_insn_meta)
+ */
+struct nfp_prog {
+	u64 *prog;
+	unsigned int prog_len;
+	unsigned int __prog_alloc_len;
+
+	enum nfp_bpf_action_type act;
+
+	unsigned int num_regs;
+	unsigned int regs_per_thread;
+
+	unsigned int start_off;
+	unsigned int tgt_out;
+	unsigned int tgt_abort;
+	unsigned int tgt_done;
+
+	unsigned int n_translated;
+	int error;
+
+	struct list_head insns;
+};
+
+struct nfp_bpf_result {
+	unsigned int n_instr;
+	bool dense_mode;
+};
+
+#ifdef CONFIG_BPF_SYSCALL
+int
+nfp_bpf_jit(struct bpf_prog *filter, void *prog, enum nfp_bpf_action_type act,
+	    unsigned int prog_start, unsigned int prog_done,
+	    unsigned int prog_sz, struct nfp_bpf_result *res);
+#else
+int
+nfp_bpf_jit(struct bpf_prog *filter, void *prog, enum nfp_bpf_action_type act,
+	    unsigned int prog_start, unsigned int prog_done,
+	    unsigned int prog_sz, struct nfp_bpf_result *res)
+{
+	return -ENOTSUPP;
+}
+#endif
+
+int nfp_prog_verify(struct nfp_prog *nfp_prog, struct bpf_prog *prog);
+
+#endif
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_bpf_jit.c b/drivers/net/ethernet/netronome/nfp/nfp_bpf_jit.c
new file mode 100644
index 000000000000..09ed1627ae20
--- /dev/null
+++ b/drivers/net/ethernet/netronome/nfp/nfp_bpf_jit.c
@@ -0,0 +1,1729 @@
+/*
+ * Copyright (C) 2016 Netronome Systems, Inc.
+ *
+ * This software is dual licensed under the GNU General License Version 2,
+ * June 1991 as shown in the file COPYING in the top-level directory of this
+ * source tree or the BSD 2-Clause License provided below.  You have the
+ * option to license this software under the complete terms of either license.
+ *
+ * The BSD 2-Clause License:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      1. Redistributions of source code must retain the above
+ *         copyright notice, this list of conditions and the following
+ *         disclaimer.
+ *
+ *      2. Redistributions in binary form must reproduce the above
+ *         copyright notice, this list of conditions and the following
+ *         disclaimer in the documentation and/or other materials
+ *         provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#define pr_fmt(fmt)	"NFP net bpf: " fmt
+
+#include <linux/kernel.h>
+#include <linux/bpf.h>
+#include <linux/filter.h>
+#include <linux/pkt_cls.h>
+#include <linux/unistd.h>
+
+#include "nfp_asm.h"
+#include "nfp_bpf.h"
+
+/* --- NFP prog --- */
+/* Foreach "multiple" entries macros provide pos and next<n> pointers.
+ * It's safe to modify the next pointers (but not pos).
+ */
+#define nfp_for_each_insn_walk2(nfp_prog, pos, next)			\
+	for (pos = list_first_entry(&(nfp_prog)->insns, typeof(*pos), l), \
+	     next = list_next_entry(pos, l);			\
+	     &(nfp_prog)->insns != &pos->l &&			\
+	     &(nfp_prog)->insns != &next->l;			\
+	     pos = nfp_meta_next(pos),				\
+	     next = nfp_meta_next(pos))
+
+#define nfp_for_each_insn_walk3(nfp_prog, pos, next, next2)		\
+	for (pos = list_first_entry(&(nfp_prog)->insns, typeof(*pos), l), \
+	     next = list_next_entry(pos, l),			\
+	     next2 = list_next_entry(next, l);			\
+	     &(nfp_prog)->insns != &pos->l &&			\
+	     &(nfp_prog)->insns != &next->l &&			\
+	     &(nfp_prog)->insns != &next2->l;			\
+	     pos = nfp_meta_next(pos),				\
+	     next = nfp_meta_next(pos),				\
+	     next2 = nfp_meta_next(next))
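+
+/* Usage sketch (for illustration only): the optimization passes later in
+ * this file walk adjacent instruction pairs and mark redundant ones for
+ * removal, along the lines of:
+ *
+ *	struct nfp_insn_meta *meta1, *meta2;
+ *
+ *	nfp_for_each_insn_walk2(nfp_prog, meta1, meta2)
+ *		if (pair_is_redundant(meta1, meta2))
+ *			meta2->skip = true;
+ *
+ * pair_is_redundant() is a made-up helper; see nfp_bpf_opt_ld_mask() for
+ * a real user of the walk2 variant.
+ */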
+
+static bool
+nfp_meta_has_next(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	return meta->l.next != &nfp_prog->insns;
+}
+
+static bool
+nfp_meta_has_prev(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	return meta->l.prev != &nfp_prog->insns;
+}
+
+static void nfp_prog_free(struct nfp_prog *nfp_prog)
+{
+	struct nfp_insn_meta *meta, *tmp;
+
+	list_for_each_entry_safe(meta, tmp, &nfp_prog->insns, l) {
+		list_del(&meta->l);
+		kfree(meta);
+	}
+	kfree(nfp_prog);
+}
+
+static void nfp_prog_push(struct nfp_prog *nfp_prog, u64 insn)
+{
+	if (nfp_prog->__prog_alloc_len == nfp_prog->prog_len) {
+		nfp_prog->error = -ENOSPC;
+		return;
+	}
+
+	nfp_prog->prog[nfp_prog->prog_len] = insn;
+	nfp_prog->prog_len++;
+}
+
+static unsigned int nfp_prog_current_offset(struct nfp_prog *nfp_prog)
+{
+	return nfp_prog->start_off + nfp_prog->prog_len;
+}
+
+static unsigned int
+nfp_prog_offset_to_index(struct nfp_prog *nfp_prog, unsigned int offset)
+{
+	return offset - nfp_prog->start_off;
+}
+
+/* --- SW reg --- */
+struct nfp_insn_ur_regs {
+	enum alu_dst_ab dst_ab;
+	u16 dst;
+	u16 areg, breg;
+	bool swap;
+	bool wr_both;
+};
+
+struct nfp_insn_re_regs {
+	enum alu_dst_ab dst_ab;
+	u8 dst;
+	u8 areg, breg;
+	bool swap;
+	bool wr_both;
+	bool i8;
+};
+
+static u16 nfp_swreg_to_unreg(u32 swreg, bool is_dst)
+{
+	u16 val = FIELD_GET(NN_REG_VAL, swreg);
+
+	switch (FIELD_GET(NN_REG_TYPE, swreg)) {
+	case NN_REG_GPR_A:
+	case NN_REG_GPR_B:
+	case NN_REG_GPR_BOTH:
+		return val;
+	case NN_REG_NNR:
+		return UR_REG_NN | val;
+	case NN_REG_XFER:
+		return UR_REG_XFR | val;
+	case NN_REG_IMM:
+		if (val & ~0xff) {
+			pr_err("immediate too large\n");
+			return 0;
+		}
+		return UR_REG_IMM_encode(val);
+	case NN_REG_NONE:
+		return is_dst ? UR_REG_NO_DST : REG_NONE;
+	default:
+		pr_err("unrecognized reg encoding %08x\n", swreg);
+		return 0;
+	}
+}
+
+static int
+swreg_to_unrestricted(u32 dst, u32 lreg, u32 rreg, struct nfp_insn_ur_regs *reg)
+{
+	memset(reg, 0, sizeof(*reg));
+
+	/* Decode destination */
+	if (FIELD_GET(NN_REG_TYPE, dst) == NN_REG_IMM)
+		return -EFAULT;
+
+	if (FIELD_GET(NN_REG_TYPE, dst) == NN_REG_GPR_B)
+		reg->dst_ab = ALU_DST_B;
+	if (FIELD_GET(NN_REG_TYPE, dst) == NN_REG_GPR_BOTH)
+		reg->wr_both = true;
+	reg->dst = nfp_swreg_to_unreg(dst, true);
+
+	/* Decode source operands */
+	if (FIELD_GET(NN_REG_TYPE, lreg) == FIELD_GET(NN_REG_TYPE, rreg))
+		return -EFAULT;
+
+	if (FIELD_GET(NN_REG_TYPE, lreg) == NN_REG_GPR_B ||
+	    FIELD_GET(NN_REG_TYPE, rreg) == NN_REG_GPR_A) {
+		reg->areg = nfp_swreg_to_unreg(rreg, false);
+		reg->breg = nfp_swreg_to_unreg(lreg, false);
+		reg->swap = true;
+	} else {
+		reg->areg = nfp_swreg_to_unreg(lreg, false);
+		reg->breg = nfp_swreg_to_unreg(rreg, false);
+	}
+
+	return 0;
+}
+
+static u16 nfp_swreg_to_rereg(u32 swreg, bool is_dst, bool has_imm8, bool *i8)
+{
+	u16 val = FIELD_GET(NN_REG_VAL, swreg);
+
+	switch (FIELD_GET(NN_REG_TYPE, swreg)) {
+	case NN_REG_GPR_A:
+	case NN_REG_GPR_B:
+	case NN_REG_GPR_BOTH:
+		return val;
+	case NN_REG_XFER:
+		return RE_REG_XFR | val;
+	case NN_REG_IMM:
+		if (val & ~(0x7f | has_imm8 << 7)) {
+			pr_err("immediate too large\n");
+			return 0;
+		}
+		*i8 = val & 0x80;
+		return RE_REG_IMM_encode(val & 0x7f);
+	case NN_REG_NONE:
+		return is_dst ? RE_REG_NO_DST : REG_NONE;
+	default:
+		pr_err("unrecognized reg encoding\n");
+		return 0;
+	}
+}
+
+static int
+swreg_to_restricted(u32 dst, u32 lreg, u32 rreg, struct nfp_insn_re_regs *reg,
+		    bool has_imm8)
+{
+	memset(reg, 0, sizeof(*reg));
+
+	/* Decode destination */
+	if (FIELD_GET(NN_REG_TYPE, dst) == NN_REG_IMM)
+		return -EFAULT;
+
+	if (FIELD_GET(NN_REG_TYPE, dst) == NN_REG_GPR_B)
+		reg->dst_ab = ALU_DST_B;
+	if (FIELD_GET(NN_REG_TYPE, dst) == NN_REG_GPR_BOTH)
+		reg->wr_both = true;
+	reg->dst = nfp_swreg_to_rereg(dst, true, false, NULL);
+
+	/* Decode source operands */
+	if (FIELD_GET(NN_REG_TYPE, lreg) == FIELD_GET(NN_REG_TYPE, rreg))
+		return -EFAULT;
+
+	if (FIELD_GET(NN_REG_TYPE, lreg) == NN_REG_GPR_B ||
+	    FIELD_GET(NN_REG_TYPE, rreg) == NN_REG_GPR_A) {
+		reg->areg = nfp_swreg_to_rereg(rreg, false, has_imm8, &reg->i8);
+		reg->breg = nfp_swreg_to_rereg(lreg, false, has_imm8, &reg->i8);
+		reg->swap = true;
+	} else {
+		reg->areg = nfp_swreg_to_rereg(lreg, false, has_imm8, &reg->i8);
+		reg->breg = nfp_swreg_to_rereg(rreg, false, has_imm8, &reg->i8);
+	}
+
+	return 0;
+}
+
+/* --- Emitters --- */
+static const struct cmd_tgt_act cmd_tgt_act[__CMD_TGT_MAP_SIZE] = {
+	[CMD_TGT_WRITE8] =		{ 0x00, 0x42 },
+	[CMD_TGT_READ8] =		{ 0x01, 0x43 },
+	[CMD_TGT_READ_LE] =		{ 0x01, 0x40 },
+	[CMD_TGT_READ_SWAP_LE] =	{ 0x03, 0x40 },
+};
+
+static void
+__emit_cmd(struct nfp_prog *nfp_prog, enum cmd_tgt_map op,
+	   u8 mode, u8 xfer, u8 areg, u8 breg, u8 size, bool sync)
+{
+	enum cmd_ctx_swap ctx;
+	u64 insn;
+
+	if (sync)
+		ctx = CMD_CTX_SWAP;
+	else
+		ctx = CMD_CTX_NO_SWAP;
+
+	insn =	FIELD_PREP(OP_CMD_A_SRC, areg) |
+		FIELD_PREP(OP_CMD_CTX, ctx) |
+		FIELD_PREP(OP_CMD_B_SRC, breg) |
+		FIELD_PREP(OP_CMD_TOKEN, cmd_tgt_act[op].token) |
+		FIELD_PREP(OP_CMD_XFER, xfer) |
+		FIELD_PREP(OP_CMD_CNT, size) |
+		FIELD_PREP(OP_CMD_SIG, sync) |
+		FIELD_PREP(OP_CMD_TGT_CMD, cmd_tgt_act[op].tgt_cmd) |
+		FIELD_PREP(OP_CMD_MODE, mode);
+
+	nfp_prog_push(nfp_prog, insn);
+}
+
+static void
+emit_cmd(struct nfp_prog *nfp_prog, enum cmd_tgt_map op,
+	 u8 mode, u8 xfer, u32 lreg, u32 rreg, u8 size, bool sync)
+{
+	struct nfp_insn_re_regs reg;
+	int err;
+
+	err = swreg_to_restricted(reg_none(), lreg, rreg, &reg, false);
+	if (err) {
+		nfp_prog->error = err;
+		return;
+	}
+	if (reg.swap) {
+		pr_err("cmd can't swap arguments\n");
+		nfp_prog->error = -EFAULT;
+		return;
+	}
+
+	__emit_cmd(nfp_prog, op, mode, xfer, reg.areg, reg.breg, size, sync);
+}
+
+static void
+__emit_br(struct nfp_prog *nfp_prog, enum br_mask mask, enum br_ev_pip ev_pip,
+	  enum br_ctx_signal_state css, u16 addr, u8 defer)
+{
+	u16 addr_lo, addr_hi;
+	u64 insn;
+
+	addr_lo = addr & (OP_BR_ADDR_LO >> _bf_shf(OP_BR_ADDR_LO));
+	addr_hi = addr != addr_lo;
+
+	insn = OP_BR_BASE |
+		FIELD_PREP(OP_BR_MASK, mask) |
+		FIELD_PREP(OP_BR_EV_PIP, ev_pip) |
+		FIELD_PREP(OP_BR_CSS, css) |
+		FIELD_PREP(OP_BR_DEFBR, defer) |
+		FIELD_PREP(OP_BR_ADDR_LO, addr_lo) |
+		FIELD_PREP(OP_BR_ADDR_HI, addr_hi);
+
+	nfp_prog_push(nfp_prog, insn);
+}
+
+static void
+emit_br(struct nfp_prog *nfp_prog, enum br_mask mask, u16 addr, u8 defer)
+{
+	__emit_br(nfp_prog, mask,
+		  mask != BR_UNC ? BR_EV_PIP_COND : BR_EV_PIP_UNCOND,
+		  BR_CSS_NONE, addr, defer);
+}
+
+static void
+__emit_br_byte(struct nfp_prog *nfp_prog, u8 areg, u8 breg, bool imm8,
+	       u8 byte, bool equal, u16 addr, u8 defer)
+{
+	u16 addr_lo, addr_hi;
+	u64 insn;
+
+	addr_lo = addr & (OP_BB_ADDR_LO >> _bf_shf(OP_BB_ADDR_LO));
+	addr_hi = addr != addr_lo;
+
+	insn = OP_BBYTE_BASE |
+		FIELD_PREP(OP_BB_A_SRC, areg) |
+		FIELD_PREP(OP_BB_BYTE, byte) |
+		FIELD_PREP(OP_BB_B_SRC, breg) |
+		FIELD_PREP(OP_BB_I8, imm8) |
+		FIELD_PREP(OP_BB_EQ, equal) |
+		FIELD_PREP(OP_BB_DEFBR, defer) |
+		FIELD_PREP(OP_BB_ADDR_LO, addr_lo) |
+		FIELD_PREP(OP_BB_ADDR_HI, addr_hi);
+
+	nfp_prog_push(nfp_prog, insn);
+}
+
+static void
+emit_br_byte_neq(struct nfp_prog *nfp_prog,
+		 u32 dst, u8 imm, u8 byte, u16 addr, u8 defer)
+{
+	struct nfp_insn_re_regs reg;
+	int err;
+
+	err = swreg_to_restricted(reg_none(), dst, reg_imm(imm), &reg, true);
+	if (err) {
+		nfp_prog->error = err;
+		return;
+	}
+
+	__emit_br_byte(nfp_prog, reg.areg, reg.breg, reg.i8, byte, false, addr,
+		       defer);
+}
+
+static void
+__emit_immed(struct nfp_prog *nfp_prog, u16 areg, u16 breg, u16 imm_hi,
+	     enum immed_width width, bool invert,
+	     enum immed_shift shift, bool wr_both)
+{
+	u64 insn;
+
+	insn = OP_IMMED_BASE |
+		FIELD_PREP(OP_IMMED_A_SRC, areg) |
+		FIELD_PREP(OP_IMMED_B_SRC, breg) |
+		FIELD_PREP(OP_IMMED_IMM, imm_hi) |
+		FIELD_PREP(OP_IMMED_WIDTH, width) |
+		FIELD_PREP(OP_IMMED_INV, invert) |
+		FIELD_PREP(OP_IMMED_SHIFT, shift) |
+		FIELD_PREP(OP_IMMED_WR_AB, wr_both);
+
+	nfp_prog_push(nfp_prog, insn);
+}
+
+static void
+emit_immed(struct nfp_prog *nfp_prog, u32 dst, u16 imm,
+	   enum immed_width width, bool invert, enum immed_shift shift)
+{
+	struct nfp_insn_ur_regs reg;
+	int err;
+
+	if (FIELD_GET(NN_REG_TYPE, dst) == NN_REG_IMM) {
+		nfp_prog->error = -EFAULT;
+		return;
+	}
+
+	err = swreg_to_unrestricted(dst, dst, reg_imm(imm & 0xff), &reg);
+	if (err) {
+		nfp_prog->error = err;
+		return;
+	}
+
+	__emit_immed(nfp_prog, reg.areg, reg.breg, imm >> 8, width,
+		     invert, shift, reg.wr_both);
+}
+
+static void
+__emit_shf(struct nfp_prog *nfp_prog, u16 dst, enum alu_dst_ab dst_ab,
+	   enum shf_sc sc, u8 shift,
+	   u16 areg, enum shf_op op, u16 breg, bool i8, bool sw, bool wr_both)
+{
+	u64 insn;
+
+	if (!FIELD_FIT(OP_SHF_SHIFT, shift)) {
+		nfp_prog->error = -EFAULT;
+		return;
+	}
+
+	if (sc == SHF_SC_L_SHF)
+		shift = 32 - shift;
+
+	insn = OP_SHF_BASE |
+		FIELD_PREP(OP_SHF_A_SRC, areg) |
+		FIELD_PREP(OP_SHF_SC, sc) |
+		FIELD_PREP(OP_SHF_B_SRC, breg) |
+		FIELD_PREP(OP_SHF_I8, i8) |
+		FIELD_PREP(OP_SHF_SW, sw) |
+		FIELD_PREP(OP_SHF_DST, dst) |
+		FIELD_PREP(OP_SHF_SHIFT, shift) |
+		FIELD_PREP(OP_SHF_OP, op) |
+		FIELD_PREP(OP_SHF_DST_AB, dst_ab) |
+		FIELD_PREP(OP_SHF_WR_AB, wr_both);
+
+	nfp_prog_push(nfp_prog, insn);
+}
+
+static void
+emit_shf(struct nfp_prog *nfp_prog, u32 dst, u32 lreg, enum shf_op op, u32 rreg,
+	 enum shf_sc sc, u8 shift)
+{
+	struct nfp_insn_re_regs reg;
+	int err;
+
+	err = swreg_to_restricted(dst, lreg, rreg, &reg, true);
+	if (err) {
+		nfp_prog->error = err;
+		return;
+	}
+
+	__emit_shf(nfp_prog, reg.dst, reg.dst_ab, sc, shift,
+		   reg.areg, op, reg.breg, reg.i8, reg.swap, reg.wr_both);
+}
+
+static void
+__emit_alu(struct nfp_prog *nfp_prog, u16 dst, enum alu_dst_ab dst_ab,
+	   u16 areg, enum alu_op op, u16 breg, bool swap, bool wr_both)
+{
+	u64 insn;
+
+	insn = OP_ALU_BASE |
+		FIELD_PREP(OP_ALU_A_SRC, areg) |
+		FIELD_PREP(OP_ALU_B_SRC, breg) |
+		FIELD_PREP(OP_ALU_DST, dst) |
+		FIELD_PREP(OP_ALU_SW, swap) |
+		FIELD_PREP(OP_ALU_OP, op) |
+		FIELD_PREP(OP_ALU_DST_AB, dst_ab) |
+		FIELD_PREP(OP_ALU_WR_AB, wr_both);
+
+	nfp_prog_push(nfp_prog, insn);
+}
+
+static void
+emit_alu(struct nfp_prog *nfp_prog, u32 dst, u32 lreg, enum alu_op op, u32 rreg)
+{
+	struct nfp_insn_ur_regs reg;
+	int err;
+
+	err = swreg_to_unrestricted(dst, lreg, rreg, &reg);
+	if (err) {
+		nfp_prog->error = err;
+		return;
+	}
+
+	__emit_alu(nfp_prog, reg.dst, reg.dst_ab,
+		   reg.areg, op, reg.breg, reg.swap, reg.wr_both);
+}
+
+static void
+__emit_ld_field(struct nfp_prog *nfp_prog, enum shf_sc sc,
+		u8 areg, u8 bmask, u8 breg, u8 shift, bool imm8,
+		bool zero, bool swap, bool wr_both)
+{
+	u64 insn;
+
+	insn = OP_LDF_BASE |
+		FIELD_PREP(OP_LDF_A_SRC, areg) |
+		FIELD_PREP(OP_LDF_SC, sc) |
+		FIELD_PREP(OP_LDF_B_SRC, breg) |
+		FIELD_PREP(OP_LDF_I8, imm8) |
+		FIELD_PREP(OP_LDF_SW, swap) |
+		FIELD_PREP(OP_LDF_ZF, zero) |
+		FIELD_PREP(OP_LDF_BMASK, bmask) |
+		FIELD_PREP(OP_LDF_SHF, shift) |
+		FIELD_PREP(OP_LDF_WR_AB, wr_both);
+
+	nfp_prog_push(nfp_prog, insn);
+}
+
+static void
+emit_ld_field_any(struct nfp_prog *nfp_prog, enum shf_sc sc, u8 shift,
+		  u32 dst, u8 bmask, u32 src, bool zero)
+{
+	struct nfp_insn_re_regs reg;
+	int err;
+
+	err = swreg_to_restricted(reg_none(), dst, src, &reg, true);
+	if (err) {
+		nfp_prog->error = err;
+		return;
+	}
+
+	__emit_ld_field(nfp_prog, sc, reg.areg, bmask, reg.breg, shift,
+			reg.i8, zero, reg.swap, reg.wr_both);
+}
+
+static void
+emit_ld_field(struct nfp_prog *nfp_prog, u32 dst, u8 bmask, u32 src,
+	      enum shf_sc sc, u8 shift)
+{
+	emit_ld_field_any(nfp_prog, sc, shift, dst, bmask, src, false);
+}
+
+/* --- Wrappers --- */
+static bool pack_immed(u32 imm, u16 *val, enum immed_shift *shift)
+{
+	if (!(imm & 0xffff0000)) {
+		*val = imm;
+		*shift = IMMED_SHIFT_0B;
+	} else if (!(imm & 0xff0000ff)) {
+		*val = imm >> 8;
+		*shift = IMMED_SHIFT_1B;
+	} else if (!(imm & 0x0000ffff)) {
+		*val = imm >> 16;
+		*shift = IMMED_SHIFT_2B;
+	} else {
+		return false;
+	}
+
+	return true;
+}
+
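+/* wrp_immed() - load a 32-bit immediate into a register.
+ * If the value (or its bitwise inverse) fits in 16 bits at a byte-aligned
+ * position it is emitted as a single immed instruction, otherwise the low
+ * and high half-words are loaded with two instructions.
+ */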
+static void wrp_immed(struct nfp_prog *nfp_prog, u32 dst, u32 imm)
+{
+	enum immed_shift shift;
+	u16 val;
+
+	if (pack_immed(imm, &val, &shift)) {
+		emit_immed(nfp_prog, dst, val, IMMED_WIDTH_ALL, false, shift);
+	} else if (pack_immed(~imm, &val, &shift)) {
+		emit_immed(nfp_prog, dst, val, IMMED_WIDTH_ALL, true, shift);
+	} else {
+		emit_immed(nfp_prog, dst, imm & 0xffff, IMMED_WIDTH_ALL,
+			   false, IMMED_SHIFT_0B);
+		emit_immed(nfp_prog, dst, imm >> 16, IMMED_WIDTH_WORD,
+			   false, IMMED_SHIFT_2B);
+	}
+}
+
+/* ur_load_imm_any() - encode immediate or use tmp register (unrestricted)
+ * If @imm is small enough, encode it directly in the operand and return it;
+ * otherwise load @imm into a spare register and return its encoding.
+ */
+static u32 ur_load_imm_any(struct nfp_prog *nfp_prog, u32 imm, u32 tmp_reg)
+{
+	if (FIELD_FIT(UR_REG_IMM_MAX, imm))
+		return reg_imm(imm);
+
+	wrp_immed(nfp_prog, tmp_reg, imm);
+	return tmp_reg;
+}
+
+/* re_load_imm_any() - encode immediate or use tmp register (restricted)
+ * If @imm is small enough, encode it directly in the operand and return it;
+ * otherwise load @imm into a spare register and return its encoding.
+ */
+static u32 re_load_imm_any(struct nfp_prog *nfp_prog, u32 imm, u32 tmp_reg)
+{
+	if (FIELD_FIT(RE_REG_IMM_MAX, imm))
+		return reg_imm(imm);
+
+	wrp_immed(nfp_prog, tmp_reg, imm);
+	return tmp_reg;
+}
+
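+/* wrp_br_special() - emit a branch whose final target is not known yet.
+ * The branch is emitted with a zero address and the requested "special"
+ * target (out/abort) stashed in the OP_BR_SPECIAL field; the real address
+ * is filled in later by nfp_fixup_branches().
+ */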
+static void
+wrp_br_special(struct nfp_prog *nfp_prog, enum br_mask mask,
+	       enum br_special special)
+{
+	emit_br(nfp_prog, mask, 0, 0);
+
+	nfp_prog->prog[nfp_prog->prog_len - 1] |=
+		FIELD_PREP(OP_BR_SPECIAL, special);
+}
+
+static void wrp_reg_mov(struct nfp_prog *nfp_prog, u16 dst, u16 src)
+{
+	emit_alu(nfp_prog, reg_both(dst), reg_none(), ALU_OP_NONE, reg_b(src));
+}
+
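+/* construct_data_ind_ld() - implement BPF_LD [ABS|IND] packet data loads.
+ * Checks the access against the packet length (branching to the abort
+ * target on overflow), reads the requested bytes from packet memory and
+ * places the result, zero-extended to 64 bits, in R0.
+ */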
+static int
+construct_data_ind_ld(struct nfp_prog *nfp_prog, u16 offset,
+		      u16 src, bool src_valid, u8 size)
+{
+	unsigned int i;
+	u16 shift, sz;
+	u32 tmp_reg;
+
+	/* We load the value from the address indicated in @offset and then
+	 * shift out the data we don't need.  Note: this is big endian!
+	 */
+	sz = size < 4 ? 4 : size;
+	shift = size < 4 ? 4 - size : 0;
+
+	if (src_valid) {
+		/* Calculate the true offset (src_reg + imm) */
+		tmp_reg = ur_load_imm_any(nfp_prog, offset, imm_b(nfp_prog));
+		emit_alu(nfp_prog, imm_both(nfp_prog),
+			 reg_a(src), ALU_OP_ADD, tmp_reg);
+		/* Check packet length (size guaranteed to fit b/c it's u8) */
+		emit_alu(nfp_prog, imm_a(nfp_prog),
+			 imm_a(nfp_prog), ALU_OP_ADD, reg_imm(size));
+		emit_alu(nfp_prog, reg_none(),
+			 NFP_BPF_ABI_LEN, ALU_OP_SUB, imm_a(nfp_prog));
+		wrp_br_special(nfp_prog, BR_BLO, OP_BR_GO_ABORT);
+		/* Load data */
+		emit_cmd(nfp_prog, CMD_TGT_READ8, CMD_MODE_32b, 0,
+			 pkt_reg(nfp_prog), imm_b(nfp_prog), sz - 1, true);
+	} else {
+		/* Check packet length */
+		tmp_reg = ur_load_imm_any(nfp_prog, offset + size,
+					  imm_a(nfp_prog));
+		emit_alu(nfp_prog, reg_none(),
+			 NFP_BPF_ABI_LEN, ALU_OP_SUB, tmp_reg);
+		wrp_br_special(nfp_prog, BR_BLO, OP_BR_GO_ABORT);
+		/* Load data */
+		tmp_reg = re_load_imm_any(nfp_prog, offset, imm_b(nfp_prog));
+		emit_cmd(nfp_prog, CMD_TGT_READ8, CMD_MODE_32b, 0,
+			 pkt_reg(nfp_prog), tmp_reg, sz - 1, true);
+	}
+
+	i = 0;
+	if (shift)
+		emit_shf(nfp_prog, reg_both(0), reg_none(), SHF_OP_NONE,
+			 reg_xfer(0), SHF_SC_R_SHF, shift * 8);
+	else
+		for (; i * 4 < size; i++)
+			emit_alu(nfp_prog, reg_both(i),
+				 reg_none(), ALU_OP_NONE, reg_xfer(i));
+
+	if (i < 2)
+		wrp_immed(nfp_prog, reg_both(1), 0);
+
+	return 0;
+}
+
+static int construct_data_ld(struct nfp_prog *nfp_prog, u16 offset, u8 size)
+{
+	return construct_data_ind_ld(nfp_prog, offset, 0, false, size);
+}
+
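+/* wrp_alu_imm() - 32-bit ALU operation with an immediate operand.
+ * Identity operations (AND with all-ones, OR/XOR with 0) emit nothing,
+ * constant results (AND with 0, OR with all-ones) become immediate loads,
+ * XOR with all-ones becomes a negation; any other immediate is either
+ * encoded directly, if small enough, or loaded into a temporary register.
+ */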
+static void
+wrp_alu_imm(struct nfp_prog *nfp_prog, u8 dst, enum alu_op alu_op, u32 imm)
+{
+	u32 tmp_reg;
+
+	if (alu_op == ALU_OP_AND) {
+		if (!imm)
+			wrp_immed(nfp_prog, reg_both(dst), 0);
+		if (!imm || !~imm)
+			return;
+	}
+	if (alu_op == ALU_OP_OR) {
+		if (!~imm)
+			wrp_immed(nfp_prog, reg_both(dst), ~0U);
+		if (!imm || !~imm)
+			return;
+	}
+	if (alu_op == ALU_OP_XOR) {
+		if (!~imm)
+			emit_alu(nfp_prog, reg_both(dst), reg_none(),
+				 ALU_OP_NEG, reg_b(dst));
+		if (!imm || !~imm)
+			return;
+	}
+
+	tmp_reg = ur_load_imm_any(nfp_prog, imm, imm_b(nfp_prog));
+	emit_alu(nfp_prog, reg_both(dst), reg_a(dst), alu_op, tmp_reg);
+}
+
+static int
+wrp_alu64_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
+	      enum alu_op alu_op, bool skip)
+{
+	const struct bpf_insn *insn = &meta->insn;
+	u64 imm = insn->imm; /* sign extend */
+
+	if (skip) {
+		meta->skip = true;
+		return 0;
+	}
+
+	wrp_alu_imm(nfp_prog, insn->dst_reg * 2, alu_op, imm & ~0U);
+	wrp_alu_imm(nfp_prog, insn->dst_reg * 2 + 1, alu_op, imm >> 32);
+
+	return 0;
+}
+
+static int
+wrp_alu64_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
+	      enum alu_op alu_op)
+{
+	u8 dst = meta->insn.dst_reg * 2, src = meta->insn.src_reg * 2;
+
+	emit_alu(nfp_prog, reg_both(dst), reg_a(dst), alu_op, reg_b(src));
+	emit_alu(nfp_prog, reg_both(dst + 1),
+		 reg_a(dst + 1), alu_op, reg_b(src + 1));
+
+	return 0;
+}
+
+static int
+wrp_alu32_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
+	      enum alu_op alu_op, bool skip)
+{
+	const struct bpf_insn *insn = &meta->insn;
+
+	if (skip) {
+		meta->skip = true;
+		return 0;
+	}
+
+	wrp_alu_imm(nfp_prog, insn->dst_reg * 2, alu_op, insn->imm);
+	wrp_immed(nfp_prog, reg_both(insn->dst_reg * 2 + 1), 0);
+
+	return 0;
+}
+
+static int
+wrp_alu32_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
+	      enum alu_op alu_op)
+{
+	u8 dst = meta->insn.dst_reg * 2, src = meta->insn.src_reg * 2;
+
+	emit_alu(nfp_prog, reg_both(dst), reg_a(dst), alu_op, reg_b(src));
+	wrp_immed(nfp_prog, reg_both(meta->insn.dst_reg * 2 + 1), 0);
+
+	return 0;
+}
+
+static void
+wrp_test_reg_one(struct nfp_prog *nfp_prog, u8 dst, enum alu_op alu_op, u8 src,
+		 enum br_mask br_mask, u16 off)
+{
+	emit_alu(nfp_prog, reg_none(), reg_a(dst), alu_op, reg_b(src));
+	emit_br(nfp_prog, br_mask, off, 0);
+}
+
+static int
+wrp_test_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
+	     enum alu_op alu_op, enum br_mask br_mask)
+{
+	const struct bpf_insn *insn = &meta->insn;
+
+	if (insn->off < 0) /* TODO */
+		return -ENOTSUPP;
+
+	wrp_test_reg_one(nfp_prog, insn->dst_reg * 2, alu_op,
+			 insn->src_reg * 2, br_mask, insn->off);
+	wrp_test_reg_one(nfp_prog, insn->dst_reg * 2 + 1, alu_op,
+			 insn->src_reg * 2 + 1, br_mask, insn->off);
+
+	return 0;
+}
+
+static int
+wrp_cmp_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
+	    enum br_mask br_mask, bool swap)
+{
+	const struct bpf_insn *insn = &meta->insn;
+	u64 imm = insn->imm; /* sign extend */
+	u8 reg = insn->dst_reg * 2;
+	u32 tmp_reg;
+
+	if (insn->off < 0) /* TODO */
+		return -ENOTSUPP;
+
+	tmp_reg = ur_load_imm_any(nfp_prog, imm & ~0U, imm_b(nfp_prog));
+	if (!swap)
+		emit_alu(nfp_prog, reg_none(), reg_a(reg), ALU_OP_SUB, tmp_reg);
+	else
+		emit_alu(nfp_prog, reg_none(), tmp_reg, ALU_OP_SUB, reg_a(reg));
+
+	tmp_reg = ur_load_imm_any(nfp_prog, imm >> 32, imm_b(nfp_prog));
+	if (!swap)
+		emit_alu(nfp_prog, reg_none(),
+			 reg_a(reg + 1), ALU_OP_SUB_C, tmp_reg);
+	else
+		emit_alu(nfp_prog, reg_none(),
+			 tmp_reg, ALU_OP_SUB_C, reg_a(reg + 1));
+
+	emit_br(nfp_prog, br_mask, insn->off, 0);
+
+	return 0;
+}
+
+static int
+wrp_cmp_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
+	    enum br_mask br_mask, bool swap)
+{
+	const struct bpf_insn *insn = &meta->insn;
+	u8 areg = insn->src_reg * 2, breg = insn->dst_reg * 2;
+
+	if (insn->off < 0) /* TODO */
+		return -ENOTSUPP;
+
+	if (swap) {
+		areg ^= breg;
+		breg ^= areg;
+		areg ^= breg;
+	}
+
+	emit_alu(nfp_prog, reg_none(), reg_a(areg), ALU_OP_SUB, reg_b(breg));
+	emit_alu(nfp_prog, reg_none(),
+		 reg_a(areg + 1), ALU_OP_SUB_C, reg_b(breg + 1));
+	emit_br(nfp_prog, br_mask, insn->off, 0);
+
+	return 0;
+}
+
+/* --- Callbacks --- */
+static int mov_reg64(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	const struct bpf_insn *insn = &meta->insn;
+
+	wrp_reg_mov(nfp_prog, insn->dst_reg * 2, insn->src_reg * 2);
+	wrp_reg_mov(nfp_prog, insn->dst_reg * 2 + 1, insn->src_reg * 2 + 1);
+
+	return 0;
+}
+
+static int mov_imm64(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	u64 imm = meta->insn.imm; /* sign extend */
+
+	wrp_immed(nfp_prog, reg_both(meta->insn.dst_reg * 2), imm & ~0U);
+	wrp_immed(nfp_prog, reg_both(meta->insn.dst_reg * 2 + 1), imm >> 32);
+
+	return 0;
+}
+
+static int xor_reg64(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	return wrp_alu64_reg(nfp_prog, meta, ALU_OP_XOR);
+}
+
+static int xor_imm64(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	return wrp_alu64_imm(nfp_prog, meta, ALU_OP_XOR, !meta->insn.imm);
+}
+
+static int and_reg64(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	return wrp_alu64_reg(nfp_prog, meta, ALU_OP_AND);
+}
+
+static int and_imm64(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	return wrp_alu64_imm(nfp_prog, meta, ALU_OP_AND, !~meta->insn.imm);
+}
+
+static int or_reg64(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	return wrp_alu64_reg(nfp_prog, meta, ALU_OP_OR);
+}
+
+static int or_imm64(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	return wrp_alu64_imm(nfp_prog, meta, ALU_OP_OR, !meta->insn.imm);
+}
+
+static int add_reg64(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	const struct bpf_insn *insn = &meta->insn;
+
+	emit_alu(nfp_prog, reg_both(insn->dst_reg * 2),
+		 reg_a(insn->dst_reg * 2), ALU_OP_ADD,
+		 reg_b(insn->src_reg * 2));
+	emit_alu(nfp_prog, reg_both(insn->dst_reg * 2 + 1),
+		 reg_a(insn->dst_reg * 2 + 1), ALU_OP_ADD_C,
+		 reg_b(insn->src_reg * 2 + 1));
+
+	return 0;
+}
+
+static int add_imm64(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	const struct bpf_insn *insn = &meta->insn;
+	u64 imm = insn->imm; /* sign extend */
+
+	wrp_alu_imm(nfp_prog, insn->dst_reg * 2, ALU_OP_ADD, imm & ~0U);
+	wrp_alu_imm(nfp_prog, insn->dst_reg * 2 + 1, ALU_OP_ADD_C, imm >> 32);
+
+	return 0;
+}
+
+static int sub_reg64(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	const struct bpf_insn *insn = &meta->insn;
+
+	emit_alu(nfp_prog, reg_both(insn->dst_reg * 2),
+		 reg_a(insn->dst_reg * 2), ALU_OP_SUB,
+		 reg_b(insn->src_reg * 2));
+	emit_alu(nfp_prog, reg_both(insn->dst_reg * 2 + 1),
+		 reg_a(insn->dst_reg * 2 + 1), ALU_OP_SUB_C,
+		 reg_b(insn->src_reg * 2 + 1));
+
+	return 0;
+}
+
+static int sub_imm64(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	const struct bpf_insn *insn = &meta->insn;
+	u64 imm = insn->imm; /* sign extend */
+
+	wrp_alu_imm(nfp_prog, insn->dst_reg * 2, ALU_OP_SUB, imm & ~0U);
+	wrp_alu_imm(nfp_prog, insn->dst_reg * 2 + 1, ALU_OP_SUB_C, imm >> 32);
+
+	return 0;
+}
+
+static int shl_imm64(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	const struct bpf_insn *insn = &meta->insn;
+
+	if (insn->imm != 32)
+		return 1; /* TODO */
+
+	wrp_reg_mov(nfp_prog, insn->dst_reg * 2 + 1, insn->dst_reg * 2);
+	wrp_immed(nfp_prog, reg_both(insn->dst_reg * 2), 0);
+
+	return 0;
+}
+
+static int shr_imm64(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	const struct bpf_insn *insn = &meta->insn;
+
+	if (insn->imm != 32)
+		return 1; /* TODO */
+
+	wrp_reg_mov(nfp_prog, insn->dst_reg * 2, insn->dst_reg * 2 + 1);
+	wrp_immed(nfp_prog, reg_both(insn->dst_reg * 2 + 1), 0);
+
+	return 0;
+}
+
+static int mov_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	const struct bpf_insn *insn = &meta->insn;
+
+	wrp_reg_mov(nfp_prog, insn->dst_reg * 2,  insn->src_reg * 2);
+	wrp_immed(nfp_prog, reg_both(insn->dst_reg * 2 + 1), 0);
+
+	return 0;
+}
+
+static int mov_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	const struct bpf_insn *insn = &meta->insn;
+
+	wrp_immed(nfp_prog, reg_both(insn->dst_reg * 2), insn->imm);
+	wrp_immed(nfp_prog, reg_both(insn->dst_reg * 2 + 1), 0);
+
+	return 0;
+}
+
+static int xor_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	return wrp_alu32_reg(nfp_prog, meta, ALU_OP_XOR);
+}
+
+static int xor_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	return wrp_alu32_imm(nfp_prog, meta, ALU_OP_XOR, !~meta->insn.imm);
+}
+
+static int and_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	return wrp_alu32_reg(nfp_prog, meta, ALU_OP_AND);
+}
+
+static int and_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	return wrp_alu32_imm(nfp_prog, meta, ALU_OP_AND, !~meta->insn.imm);
+}
+
+static int or_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	return wrp_alu32_reg(nfp_prog, meta, ALU_OP_OR);
+}
+
+static int or_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	return wrp_alu32_imm(nfp_prog, meta, ALU_OP_OR, !meta->insn.imm);
+}
+
+static int add_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	return wrp_alu32_reg(nfp_prog, meta, ALU_OP_ADD);
+}
+
+static int add_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	return wrp_alu32_imm(nfp_prog, meta, ALU_OP_ADD, !meta->insn.imm);
+}
+
+static int sub_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	return wrp_alu32_reg(nfp_prog, meta, ALU_OP_SUB);
+}
+
+static int sub_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	return wrp_alu32_imm(nfp_prog, meta, ALU_OP_SUB, !meta->insn.imm);
+}
+
+static int shl_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	const struct bpf_insn *insn = &meta->insn;
+
+	if (!insn->imm)
+		return 1; /* TODO: zero shift means indirect */
+
+	emit_shf(nfp_prog, reg_both(insn->dst_reg * 2),
+		 reg_none(), SHF_OP_NONE, reg_b(insn->dst_reg * 2),
+		 SHF_SC_L_SHF, insn->imm);
+	wrp_immed(nfp_prog, reg_both(insn->dst_reg * 2 + 1), 0);
+
+	return 0;
+}
+
+static int imm_ld8_part2(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	wrp_immed(nfp_prog, reg_both(nfp_meta_prev(meta)->insn.dst_reg * 2 + 1),
+		  meta->insn.imm);
+
+	return 0;
+}
+
+static int imm_ld8(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	const struct bpf_insn *insn = &meta->insn;
+
+	meta->double_cb = imm_ld8_part2;
+	wrp_immed(nfp_prog, reg_both(insn->dst_reg * 2), insn->imm);
+
+	return 0;
+}
+
+static int data_ld1(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	return construct_data_ld(nfp_prog, meta->insn.imm, 1);
+}
+
+static int data_ld2(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	return construct_data_ld(nfp_prog, meta->insn.imm, 2);
+}
+
+static int data_ld4(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	return construct_data_ld(nfp_prog, meta->insn.imm, 4);
+}
+
+static int data_ind_ld1(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	return construct_data_ind_ld(nfp_prog, meta->insn.imm,
+				     meta->insn.src_reg * 2, true, 1);
+}
+
+static int data_ind_ld2(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	return construct_data_ind_ld(nfp_prog, meta->insn.imm,
+				     meta->insn.src_reg * 2, true, 2);
+}
+
+static int data_ind_ld4(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	return construct_data_ind_ld(nfp_prog, meta->insn.imm,
+				     meta->insn.src_reg * 2, true, 4);
+}
+
+static int mem_ldx4(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	if (meta->insn.off == offsetof(struct sk_buff, len))
+		emit_alu(nfp_prog, reg_both(meta->insn.dst_reg * 2),
+			 reg_none(), ALU_OP_NONE, NFP_BPF_ABI_LEN);
+	else
+		return -ENOTSUPP;
+
+	return 0;
+}
+
+static int jump(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	if (meta->insn.off < 0) /* TODO */
+		return -ENOTSUPP;
+	emit_br(nfp_prog, BR_UNC, meta->insn.off, 0);
+
+	return 0;
+}
+
+static int jeq_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	const struct bpf_insn *insn = &meta->insn;
+	u64 imm = insn->imm; /* sign extend */
+	u32 or1 = reg_a(insn->dst_reg * 2), or2 = reg_b(insn->dst_reg * 2 + 1);
+	u32 tmp_reg;
+
+	if (insn->off < 0) /* TODO */
+		return -ENOTSUPP;
+
+	if (imm & ~0U) {
+		tmp_reg = ur_load_imm_any(nfp_prog, imm & ~0U, imm_b(nfp_prog));
+		emit_alu(nfp_prog, imm_a(nfp_prog),
+			 reg_a(insn->dst_reg * 2), ALU_OP_XOR, tmp_reg);
+		or1 = imm_a(nfp_prog);
+	}
+
+	if (imm >> 32) {
+		tmp_reg = ur_load_imm_any(nfp_prog, imm >> 32, imm_b(nfp_prog));
+		emit_alu(nfp_prog, imm_b(nfp_prog),
+			 reg_a(insn->dst_reg * 2 + 1), ALU_OP_XOR, tmp_reg);
+		or2 = imm_b(nfp_prog);
+	}
+
+	emit_alu(nfp_prog, reg_none(), or1, ALU_OP_OR, or2);
+	emit_br(nfp_prog, BR_BEQ, insn->off, 0);
+
+	return 0;
+}
+
+static int jgt_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	return wrp_cmp_imm(nfp_prog, meta, BR_BLO, false);
+}
+
+static int jge_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	return wrp_cmp_imm(nfp_prog, meta, BR_BHS, true);
+}
+
+static int jset_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	const struct bpf_insn *insn = &meta->insn;
+	u64 imm = insn->imm; /* sign extend */
+	u32 tmp_reg;
+
+	if (insn->off < 0) /* TODO */
+		return -ENOTSUPP;
+
+	if (!imm) {
+		meta->skip = true;
+		return 0;
+	}
+
+	if (imm & ~0U) {
+		tmp_reg = ur_load_imm_any(nfp_prog, imm & ~0U, imm_b(nfp_prog));
+		emit_alu(nfp_prog, reg_none(),
+			 reg_a(insn->dst_reg * 2), ALU_OP_AND, tmp_reg);
+		emit_br(nfp_prog, BR_BNE, insn->off, 0);
+	}
+
+	if (imm >> 32) {
+		tmp_reg = ur_load_imm_any(nfp_prog, imm >> 32, imm_b(nfp_prog));
+		emit_alu(nfp_prog, reg_none(),
+			 reg_a(insn->dst_reg * 2 + 1), ALU_OP_AND, tmp_reg);
+		emit_br(nfp_prog, BR_BNE, insn->off, 0);
+	}
+
+	return 0;
+}
+
+static int jne_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	const struct bpf_insn *insn = &meta->insn;
+	u64 imm = insn->imm; /* sign extend */
+	u32 tmp_reg;
+
+	if (insn->off < 0) /* TODO */
+		return -ENOTSUPP;
+
+	if (!imm) {
+		emit_alu(nfp_prog, reg_none(), reg_a(insn->dst_reg * 2),
+			 ALU_OP_OR, reg_b(insn->dst_reg * 2 + 1));
+		emit_br(nfp_prog, BR_BNE, insn->off, 0);
+	}
+
+	tmp_reg = ur_load_imm_any(nfp_prog, imm & ~0U, imm_b(nfp_prog));
+	emit_alu(nfp_prog, reg_none(),
+		 reg_a(insn->dst_reg * 2), ALU_OP_XOR, tmp_reg);
+	emit_br(nfp_prog, BR_BNE, insn->off, 0);
+
+	tmp_reg = ur_load_imm_any(nfp_prog, imm >> 32, imm_b(nfp_prog));
+	emit_alu(nfp_prog, reg_none(),
+		 reg_a(insn->dst_reg * 2 + 1), ALU_OP_XOR, tmp_reg);
+	emit_br(nfp_prog, BR_BNE, insn->off, 0);
+
+	return 0;
+}
+
+static int jeq_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	const struct bpf_insn *insn = &meta->insn;
+
+	if (insn->off < 0) /* TODO */
+		return -ENOTSUPP;
+
+	emit_alu(nfp_prog, imm_a(nfp_prog), reg_a(insn->dst_reg * 2),
+		 ALU_OP_XOR, reg_b(insn->src_reg * 2));
+	emit_alu(nfp_prog, imm_b(nfp_prog), reg_a(insn->dst_reg * 2 + 1),
+		 ALU_OP_XOR, reg_b(insn->src_reg * 2 + 1));
+	emit_alu(nfp_prog, reg_none(),
+		 imm_a(nfp_prog), ALU_OP_OR, imm_b(nfp_prog));
+	emit_br(nfp_prog, BR_BEQ, insn->off, 0);
+
+	return 0;
+}
+
+static int jgt_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	return wrp_cmp_reg(nfp_prog, meta, BR_BLO, false);
+}
+
+static int jge_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	return wrp_cmp_reg(nfp_prog, meta, BR_BHS, true);
+}
+
+static int jset_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	return wrp_test_reg(nfp_prog, meta, ALU_OP_AND, BR_BNE);
+}
+
+static int jne_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	return wrp_test_reg(nfp_prog, meta, ALU_OP_XOR, BR_BNE);
+}
+
+static int goto_out(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	wrp_br_special(nfp_prog, BR_UNC, OP_BR_GO_OUT);
+
+	return 0;
+}
+
+static const instr_cb_t instr_cb[256] = {
+	[BPF_ALU64 | BPF_MOV | BPF_X] =	mov_reg64,
+	[BPF_ALU64 | BPF_MOV | BPF_K] =	mov_imm64,
+	[BPF_ALU64 | BPF_XOR | BPF_X] =	xor_reg64,
+	[BPF_ALU64 | BPF_XOR | BPF_K] =	xor_imm64,
+	[BPF_ALU64 | BPF_AND | BPF_X] =	and_reg64,
+	[BPF_ALU64 | BPF_AND | BPF_K] =	and_imm64,
+	[BPF_ALU64 | BPF_OR | BPF_X] =	or_reg64,
+	[BPF_ALU64 | BPF_OR | BPF_K] =	or_imm64,
+	[BPF_ALU64 | BPF_ADD | BPF_X] =	add_reg64,
+	[BPF_ALU64 | BPF_ADD | BPF_K] =	add_imm64,
+	[BPF_ALU64 | BPF_SUB | BPF_X] =	sub_reg64,
+	[BPF_ALU64 | BPF_SUB | BPF_K] =	sub_imm64,
+	[BPF_ALU64 | BPF_LSH | BPF_K] =	shl_imm64,
+	[BPF_ALU64 | BPF_RSH | BPF_K] =	shr_imm64,
+	[BPF_ALU | BPF_MOV | BPF_X] =	mov_reg,
+	[BPF_ALU | BPF_MOV | BPF_K] =	mov_imm,
+	[BPF_ALU | BPF_XOR | BPF_X] =	xor_reg,
+	[BPF_ALU | BPF_XOR | BPF_K] =	xor_imm,
+	[BPF_ALU | BPF_AND | BPF_X] =	and_reg,
+	[BPF_ALU | BPF_AND | BPF_K] =	and_imm,
+	[BPF_ALU | BPF_OR | BPF_X] =	or_reg,
+	[BPF_ALU | BPF_OR | BPF_K] =	or_imm,
+	[BPF_ALU | BPF_ADD | BPF_X] =	add_reg,
+	[BPF_ALU | BPF_ADD | BPF_K] =	add_imm,
+	[BPF_ALU | BPF_SUB | BPF_X] =	sub_reg,
+	[BPF_ALU | BPF_SUB | BPF_K] =	sub_imm,
+	[BPF_ALU | BPF_LSH | BPF_K] =	shl_imm,
+	[BPF_LD | BPF_IMM | BPF_DW] =	imm_ld8,
+	[BPF_LD | BPF_ABS | BPF_B] =	data_ld1,
+	[BPF_LD | BPF_ABS | BPF_H] =	data_ld2,
+	[BPF_LD | BPF_ABS | BPF_W] =	data_ld4,
+	[BPF_LD | BPF_IND | BPF_B] =	data_ind_ld1,
+	[BPF_LD | BPF_IND | BPF_H] =	data_ind_ld2,
+	[BPF_LD | BPF_IND | BPF_W] =	data_ind_ld4,
+	[BPF_LDX | BPF_MEM | BPF_W] =	mem_ldx4,
+	[BPF_JMP | BPF_JA | BPF_K] =	jump,
+	[BPF_JMP | BPF_JEQ | BPF_K] =	jeq_imm,
+	[BPF_JMP | BPF_JGT | BPF_K] =	jgt_imm,
+	[BPF_JMP | BPF_JGE | BPF_K] =	jge_imm,
+	[BPF_JMP | BPF_JSET | BPF_K] =	jset_imm,
+	[BPF_JMP | BPF_JNE | BPF_K] =	jne_imm,
+	[BPF_JMP | BPF_JEQ | BPF_X] =	jeq_reg,
+	[BPF_JMP | BPF_JGT | BPF_X] =	jgt_reg,
+	[BPF_JMP | BPF_JGE | BPF_X] =	jge_reg,
+	[BPF_JMP | BPF_JSET | BPF_X] =	jset_reg,
+	[BPF_JMP | BPF_JNE | BPF_X] =	jne_reg,
+	[BPF_JMP | BPF_EXIT] =		goto_out,
+};
+
+/* --- Misc code --- */
+static void br_set_offset(u64 *instr, u16 offset)
+{
+	u16 addr_lo, addr_hi;
+
+	addr_lo = offset & (OP_BR_ADDR_LO >> _bf_shf(OP_BR_ADDR_LO));
+	addr_hi = offset != addr_lo;
+	*instr &= ~(OP_BR_ADDR_HI | OP_BR_ADDR_LO);
+	*instr |= FIELD_PREP(OP_BR_ADDR_HI, addr_hi);
+	*instr |= FIELD_PREP(OP_BR_ADDR_LO, addr_lo);
+}
+
+/* --- Assembler logic --- */
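+/* nfp_fixup_branches() - resolve branch targets after translation.
+ * BPF jump offsets are expressed in BPF instructions while the generated
+ * code uses NFP instruction addresses, so every branch emitted for a BPF
+ * JMP is patched to the NFP offset of its target.  "Special" branches
+ * emitted by wrp_br_special() are pointed at the out/abort targets of the
+ * epilogue.
+ */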
+static int nfp_fixup_branches(struct nfp_prog *nfp_prog)
+{
+	struct nfp_insn_meta *meta, *next;
+	u32 off, br_idx;
+	u32 idx;
+
+	nfp_for_each_insn_walk2(nfp_prog, meta, next) {
+		if (meta->skip)
+			continue;
+		if (BPF_CLASS(meta->insn.code) != BPF_JMP)
+			continue;
+
+		br_idx = nfp_prog_offset_to_index(nfp_prog, next->off) - 1;
+		if (!nfp_is_br(nfp_prog->prog[br_idx])) {
+			pr_err("Fixup found block not ending in branch %d %02x %016llx!!\n",
+			       br_idx, meta->insn.code, nfp_prog->prog[br_idx]);
+			return -ELOOP;
+		}
+		/* Leave special branches for later */
+		if (FIELD_GET(OP_BR_SPECIAL, nfp_prog->prog[br_idx]))
+			continue;
+
+		/* Find the target offset in assembler realm */
+		off = meta->insn.off;
+		if (!off) {
+			pr_err("Fixup found zero offset!!\n");
+			return -ELOOP;
+		}
+
+		while (off && nfp_meta_has_next(nfp_prog, next)) {
+			next = nfp_meta_next(next);
+			off--;
+		}
+		if (off) {
+			pr_err("Fixup found too large jump!! %d\n", off);
+			return -ELOOP;
+		}
+
+		if (next->skip) {
+			pr_err("Branch landing on removed instruction!!\n");
+			return -ELOOP;
+		}
+
+		for (idx = nfp_prog_offset_to_index(nfp_prog, meta->off);
+		     idx <= br_idx; idx++) {
+			if (!nfp_is_br(nfp_prog->prog[idx]))
+				continue;
+			br_set_offset(&nfp_prog->prog[idx], next->off);
+		}
+	}
+
+	/* Fixup 'goto out's separately, they can be scattered around */
+	for (br_idx = 0; br_idx < nfp_prog->prog_len; br_idx++) {
+		enum br_special special;
+
+		if ((nfp_prog->prog[br_idx] & OP_BR_BASE_MASK) != OP_BR_BASE)
+			continue;
+
+		special = FIELD_GET(OP_BR_SPECIAL, nfp_prog->prog[br_idx]);
+		switch (special) {
+		case OP_BR_NORMAL:
+			break;
+		case OP_BR_GO_OUT:
+			br_set_offset(&nfp_prog->prog[br_idx],
+				      nfp_prog->tgt_out);
+			break;
+		case OP_BR_GO_ABORT:
+			br_set_offset(&nfp_prog->prog[br_idx],
+				      nfp_prog->tgt_abort);
+			break;
+		}
+
+		nfp_prog->prog[br_idx] &= ~OP_BR_SPECIAL;
+	}
+
+	return 0;
+}
+
+static void nfp_intro(struct nfp_prog *nfp_prog)
+{
+	emit_alu(nfp_prog, pkt_reg(nfp_prog),
+		 reg_none(), ALU_OP_NONE, NFP_BPF_ABI_PKT);
+}
+
+static void nfp_outro_tc_legacy(struct nfp_prog *nfp_prog)
+{
+	const u8 act2code[] = {
+		[NN_ACT_TC_DROP]  = 0x22,
+	};
+	/* Target for aborts */
+	nfp_prog->tgt_abort = nfp_prog_current_offset(nfp_prog);
+	wrp_immed(nfp_prog, reg_both(0), 0);
+
+	/* Target for normal exits */
+	nfp_prog->tgt_out = nfp_prog_current_offset(nfp_prog);
+	/* Legacy TC mode:
+	 *   0        0x11 -> pass,  count as stat0
+	 *  -1  drop  0x22 -> drop,  count as stat1
+	 *     redir  0x24 -> redir, count as stat1
+	 *  ife mark  0x21 -> pass,  count as stat1
+	 *  ife + tx  0x24 -> redir, count as stat1
+	 */
+	emit_br_byte_neq(nfp_prog, reg_b(0), 0xff, 0, nfp_prog->tgt_done, 2);
+	emit_alu(nfp_prog, reg_a(0),
+		 reg_none(), ALU_OP_NONE, NFP_BPF_ABI_FLAGS);
+	emit_ld_field(nfp_prog, reg_a(0), 0xc, reg_imm(0x11), SHF_SC_L_SHF, 16);
+
+	emit_br(nfp_prog, BR_UNC, nfp_prog->tgt_done, 1);
+	emit_ld_field(nfp_prog, reg_a(0), 0xc, reg_imm(act2code[nfp_prog->act]),
+		      SHF_SC_L_SHF, 16);
+}
+
+static void nfp_outro(struct nfp_prog *nfp_prog)
+{
+	switch (nfp_prog->act) {
+	case NN_ACT_TC_DROP:
+		nfp_outro_tc_legacy(nfp_prog);
+		break;
+	}
+}
+
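+/* nfp_translate() - main translation loop.
+ * Emits the prologue, walks the instruction list invoking the per-opcode
+ * callback for every BPF instruction (or the double_cb installed by the
+ * previous instruction for the second half of a 64-bit immediate load),
+ * then emits the epilogue and resolves branch targets.
+ */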
+static int nfp_translate(struct nfp_prog *nfp_prog)
+{
+	struct nfp_insn_meta *meta;
+	int err;
+
+	nfp_intro(nfp_prog);
+	if (nfp_prog->error)
+		return nfp_prog->error;
+
+	list_for_each_entry(meta, &nfp_prog->insns, l) {
+		instr_cb_t cb = instr_cb[meta->insn.code];
+
+		meta->off = nfp_prog_current_offset(nfp_prog);
+
+		if (meta->skip) {
+			nfp_prog->n_translated++;
+			continue;
+		}
+
+		if (nfp_meta_has_prev(nfp_prog, meta) &&
+		    nfp_meta_prev(meta)->double_cb)
+			cb = nfp_meta_prev(meta)->double_cb;
+		if (!cb)
+			return -ENOENT;
+		err = cb(nfp_prog, meta);
+		if (err)
+			return err;
+
+		nfp_prog->n_translated++;
+	}
+
+	nfp_outro(nfp_prog);
+	if (nfp_prog->error)
+		return nfp_prog->error;
+
+	return nfp_fixup_branches(nfp_prog);
+}
+
+static int
+nfp_prog_prepare(struct nfp_prog *nfp_prog, const struct bpf_insn *prog,
+		 unsigned int cnt)
+{
+	unsigned int i;
+
+	for (i = 0; i < cnt; i++) {
+		struct nfp_insn_meta *meta;
+
+		meta = kzalloc(sizeof(*meta), GFP_KERNEL);
+		if (!meta)
+			return -ENOMEM;
+
+		meta->insn = prog[i];
+		meta->n = i;
+
+		list_add_tail(&meta->l, &nfp_prog->insns);
+	}
+
+	return 0;
+}
+
+/* --- Optimizations --- */
+static void nfp_bpf_opt_reg_init(struct nfp_prog *nfp_prog)
+{
+	struct nfp_insn_meta *meta;
+
+	list_for_each_entry(meta, &nfp_prog->insns, l) {
+		struct bpf_insn insn = meta->insn;
+
+		/* Programs converted from cBPF start with register xoring */
+		if (insn.code == (BPF_ALU64 | BPF_XOR | BPF_X) &&
+		    insn.src_reg == insn.dst_reg)
+			continue;
+
+		/* Programs start with R6 = R1 but we ignore the skb pointer */
+		if (insn.code == (BPF_ALU64 | BPF_MOV | BPF_X) &&
+		    insn.src_reg == 1 && insn.dst_reg == 6)
+			meta->skip = true;
+
+		/* Return as soon as something doesn't match */
+		if (!meta->skip)
+			return;
+	}
+}
+
+/* Try to rename registers so that program uses only low ones */
+static int nfp_bpf_opt_reg_rename(struct nfp_prog *nfp_prog)
+{
+	bool reg_used[MAX_BPF_REG] = {};
+	u8 tgt_reg[MAX_BPF_REG] = {};
+	struct nfp_insn_meta *meta;
+	unsigned int i, j;
+
+	list_for_each_entry(meta, &nfp_prog->insns, l) {
+		if (meta->skip)
+			continue;
+
+		reg_used[meta->insn.src_reg] = true;
+		reg_used[meta->insn.dst_reg] = true;
+	}
+
+	if (reg_used[BPF_REG_10]) {
+		pr_err("Detected use of stack ptr\n");
+		return -EINVAL;
+	}
+
+	for (i = 0, j = 0; i < ARRAY_SIZE(tgt_reg); i++) {
+		if (!reg_used[i])
+			continue;
+
+		tgt_reg[i] = j++;
+	}
+	nfp_prog->num_regs = j;
+
+	list_for_each_entry(meta, &nfp_prog->insns, l) {
+		meta->insn.src_reg = tgt_reg[meta->insn.src_reg];
+		meta->insn.dst_reg = tgt_reg[meta->insn.dst_reg];
+	}
+
+	return 0;
+}
+
+/* Remove masking after load since our load guarantees this is not needed */
+static void nfp_bpf_opt_ld_mask(struct nfp_prog *nfp_prog)
+{
+	struct nfp_insn_meta *meta1, *meta2;
+	const s32 exp_mask[] = {
+		[BPF_B] = 0x000000ffU,
+		[BPF_H] = 0x0000ffffU,
+		[BPF_W] = 0xffffffffU,
+	};
+
+	nfp_for_each_insn_walk2(nfp_prog, meta1, meta2) {
+		struct bpf_insn insn, next;
+
+		insn = meta1->insn;
+		next = meta2->insn;
+
+		if (BPF_CLASS(insn.code) != BPF_LD)
+			continue;
+		if (BPF_MODE(insn.code) != BPF_ABS &&
+		    BPF_MODE(insn.code) != BPF_IND)
+			continue;
+
+		if (next.code != (BPF_ALU64 | BPF_AND | BPF_K))
+			continue;
+
+		if (!exp_mask[BPF_SIZE(insn.code)])
+			continue;
+		if (exp_mask[BPF_SIZE(insn.code)] != next.imm)
+			continue;
+
+		if (next.src_reg || next.dst_reg)
+			continue;
+
+		meta2->skip = true;
+	}
+}
+
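+/* Remove the pair of shifts by 32 which may follow a 32-bit packet load to
+ * zero-extend the loaded value; our load already zeroes the upper word.
+ */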
+static void nfp_bpf_opt_ld_shift(struct nfp_prog *nfp_prog)
+{
+	struct nfp_insn_meta *meta1, *meta2, *meta3;
+
+	nfp_for_each_insn_walk3(nfp_prog, meta1, meta2, meta3) {
+		struct bpf_insn insn, next1, next2;
+
+		insn = meta1->insn;
+		next1 = meta2->insn;
+		next2 = meta3->insn;
+
+		if (BPF_CLASS(insn.code) != BPF_LD)
+			continue;
+		if (BPF_MODE(insn.code) != BPF_ABS &&
+		    BPF_MODE(insn.code) != BPF_IND)
+			continue;
+		if (BPF_SIZE(insn.code) != BPF_W)
+			continue;
+
+		if (!(next1.code == (BPF_LSH | BPF_K | BPF_ALU64) &&
+		      next2.code == (BPF_RSH | BPF_K | BPF_ALU64)) &&
+		    !(next1.code == (BPF_RSH | BPF_K | BPF_ALU64) &&
+		      next2.code == (BPF_LSH | BPF_K | BPF_ALU64)))
+			continue;
+
+		if (next1.src_reg || next1.dst_reg ||
+		    next2.src_reg || next2.dst_reg)
+			continue;
+
+		if (next1.imm != 0x20 || next2.imm != 0x20)
+			continue;
+
+		meta2->skip = true;
+		meta3->skip = true;
+	}
+}
+
+static int nfp_bpf_optimize(struct nfp_prog *nfp_prog)
+{
+	int ret;
+
+	nfp_bpf_opt_reg_init(nfp_prog);
+
+	ret = nfp_bpf_opt_reg_rename(nfp_prog);
+	if (ret)
+		return ret;
+
+	nfp_bpf_opt_ld_mask(nfp_prog);
+	nfp_bpf_opt_ld_shift(nfp_prog);
+
+	return 0;
+}
+
+/**
+ * nfp_bpf_jit() - translate BPF code into NFP assembly
+ * @filter:	kernel BPF filter struct
+ * @prog_mem:	memory to store assembler instructions
+ * @act:	action attached to this eBPF program
+ * @prog_start:	offset of the first instruction when loaded
+ * @prog_done:	where to jump on exit
+ * @prog_sz:	size of @prog_mem in instructions
+ * @res:	results of the translation (instruction count, dense mode)
+ */
+int
+nfp_bpf_jit(struct bpf_prog *filter, void *prog_mem,
+	    enum nfp_bpf_action_type act,
+	    unsigned int prog_start, unsigned int prog_done,
+	    unsigned int prog_sz, struct nfp_bpf_result *res)
+{
+	struct nfp_prog *nfp_prog;
+	int ret;
+
+	nfp_prog = kzalloc(sizeof(*nfp_prog), GFP_KERNEL);
+	if (!nfp_prog)
+		return -ENOMEM;
+
+	INIT_LIST_HEAD(&nfp_prog->insns);
+	nfp_prog->act = act;
+	nfp_prog->start_off = prog_start;
+	nfp_prog->tgt_done = prog_done;
+
+	ret = nfp_prog_prepare(nfp_prog, filter->insnsi, filter->len);
+	if (ret)
+		goto out;
+
+	ret = nfp_prog_verify(nfp_prog, filter);
+	if (ret)
+		goto out;
+
+	ret = nfp_bpf_optimize(nfp_prog);
+	if (ret)
+		goto out;
+
+	if (nfp_prog->num_regs <= 7)
+		nfp_prog->regs_per_thread = 16;
+	else
+		nfp_prog->regs_per_thread = 32;
+
+	nfp_prog->prog = prog_mem;
+	nfp_prog->__prog_alloc_len = prog_sz;
+
+	ret = nfp_translate(nfp_prog);
+	if (ret) {
+		pr_err("Translation failed with error %d (translated: %u)\n",
+		       ret, nfp_prog->n_translated);
+		ret = -EINVAL;
+	}
+
+	res->n_instr = nfp_prog->prog_len;
+	res->dense_mode = nfp_prog->num_regs <= 7;
+out:
+	nfp_prog_free(nfp_prog);
+
+	return ret;
+}
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_bpf_verifier.c b/drivers/net/ethernet/netronome/nfp/nfp_bpf_verifier.c
new file mode 100644
index 000000000000..8b66d98e37eb
--- /dev/null
+++ b/drivers/net/ethernet/netronome/nfp/nfp_bpf_verifier.c
@@ -0,0 +1,157 @@
+/*
+ * Copyright (C) 2016 Netronome Systems, Inc.
+ *
+ * This software is dual licensed under the GNU General Public License Version 2,
+ * June 1991 as shown in the file COPYING in the top-level directory of this
+ * source tree or the BSD 2-Clause License provided below.  You have the
+ * option to license this software under the complete terms of either license.
+ *
+ * The BSD 2-Clause License:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      1. Redistributions of source code must retain the above
+ *         copyright notice, this list of conditions and the following
+ *         disclaimer.
+ *
+ *      2. Redistributions in binary form must reproduce the above
+ *         copyright notice, this list of conditions and the following
+ *         disclaimer in the documentation and/or other materials
+ *         provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#define pr_fmt(fmt)	"NFP net bpf: " fmt
+
+#include <linux/bpf.h>
+#include <linux/bpf_parser.h>
+#include <linux/kernel.h>
+#include <linux/pkt_cls.h>
+
+#include "nfp_bpf.h"
+
+/* Parser/verifier definitions */
+struct nfp_bpf_parser_priv {
+	struct nfp_prog *prog;
+	struct nfp_insn_meta *meta;
+};
+
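+/* nfp_bpf_goto_meta() - position the instruction cursor at @insn_idx.
+ * The verifier addresses instructions by index while the translator keeps
+ * them on a list; walk from the cached position, the list head or the
+ * list tail, whichever is closest.
+ */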
+static struct nfp_insn_meta *
+nfp_bpf_goto_meta(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
+		  unsigned int insn_idx, unsigned int n_insns)
+{
+	unsigned int forward, backward, i;
+
+	backward = meta->n - insn_idx;
+	forward = insn_idx - meta->n;
+
+	if (min(forward, backward) > n_insns - insn_idx - 1) {
+		backward = n_insns - insn_idx - 1;
+		meta = nfp_prog_last_meta(nfp_prog);
+	}
+	if (min(forward, backward) > insn_idx && backward > insn_idx) {
+		forward = insn_idx;
+		meta = nfp_prog_first_meta(nfp_prog);
+	}
+
+	if (forward < backward)
+		for (i = 0; i < forward; i++)
+			meta = nfp_meta_next(meta);
+	else
+		for (i = 0; i < backward; i++)
+			meta = nfp_meta_prev(meta);
+
+	return meta;
+}
+
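+/* nfp_bpf_check_exit() - validate the program's return value at this exit.
+ * Only constant returns of 0 or -1 (all-ones in the lower 32 bits) can be
+ * mapped onto the hardware return codes; anything else is rejected.
+ */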
+static int
+nfp_bpf_check_exit(struct nfp_prog *nfp_prog, const struct verifier_env *env)
+{
+	const struct reg_state *reg0 = &env->cur_state.regs[0];
+
+	if (reg0->type != CONST_IMM) {
+		pr_info("unsupported exit state: %d, imm: %llx\n",
+			reg0->type, reg0->imm);
+		return -EINVAL;
+	}
+
+	if (reg0->imm != 0 && (reg0->imm & ~0U) != ~0U) {
+		pr_info("unsupported exit state: %d, imm: %llx\n",
+			reg0->type, reg0->imm);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int
+nfp_bpf_check_ctx_ptr(struct nfp_prog *nfp_prog, const struct verifier_env *env,
+		      u8 reg)
+{
+	if (env->cur_state.regs[reg].type != PTR_TO_CTX)
+		return -EINVAL;
+
+	return 0;
+}
+
+static int
+nfp_verify_insn(struct verifier_env *env, int insn_idx, int prev_insn_idx)
+{
+	struct nfp_bpf_parser_priv *priv = env->ppriv;
+	struct nfp_insn_meta *meta = priv->meta;
+
+	meta = nfp_bpf_goto_meta(priv->prog, meta, insn_idx, env->prog->len);
+	priv->meta = meta;
+
+	if (meta->insn.code == (BPF_JMP | BPF_EXIT))
+		return nfp_bpf_check_exit(priv->prog, env);
+
+	if ((meta->insn.code & ~BPF_SIZE_MASK) == (BPF_LDX | BPF_MEM))
+		return nfp_bpf_check_ctx_ptr(priv->prog, env,
+					     meta->insn.src_reg);
+	if ((meta->insn.code & ~BPF_SIZE_MASK) == (BPF_STX | BPF_MEM))
+		return nfp_bpf_check_ctx_ptr(priv->prog, env,
+					     meta->insn.dst_reg);
+
+	return 0;
+}
+
+static const struct bpf_ext_parser_ops nfp_bpf_pops = {
+	.insn_hook = nfp_verify_insn,
+};
+
+int nfp_prog_verify(struct nfp_prog *nfp_prog, struct bpf_prog *prog)
+{
+	struct nfp_bpf_parser_priv *priv;
+	struct bpf_prog *prog_rw;
+	int ret;
+
+	priv = kzalloc(sizeof(*priv), GFP_KERNEL);
+	if (!priv)
+		return -ENOMEM;
+	prog_rw = bpf_prog_clone_create(prog, 0);
+	if (!prog_rw) {
+		kfree(priv);
+		return -ENOMEM;
+	}
+
+	priv->prog = nfp_prog;
+	priv->meta = nfp_prog_first_meta(nfp_prog);
+
+	ret = bpf_parse(prog_rw, &nfp_bpf_pops, priv);
+
+	bpf_prog_clone_free(prog_rw);
+	kfree(priv);
+
+	return ret;
+}
-- 
1.9.1


* [RFCv2 10/16] nfp: bpf: add hardware bpf offload
  2016-08-26 18:05 [RFCv2 00/16] BPF hardware offload (cls_bpf for now) Jakub Kicinski
                   ` (8 preceding siblings ...)
  2016-08-26 18:06 ` [RFCv2 09/16] nfp: add BPF to NFP code translator Jakub Kicinski
@ 2016-08-26 18:06 ` Jakub Kicinski
  2016-08-26 18:06 ` [RFCv2 11/16] net: cls_bpf: allow offloaded filters to update stats Jakub Kicinski
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 40+ messages in thread
From: Jakub Kicinski @ 2016-08-26 18:06 UTC (permalink / raw)
  To: netdev
  Cc: ast, daniel, dinan.gunawardena, jiri, john.fastabend, kubakici,
	Jakub Kicinski

Add hardware bpf offload on our smart NICs.  Detect whether
capable firmware is loaded and, if so, use the just-added
translator to JIT the BPF code and load it onto the
programmable engines.

This commit only supports offloading cls_bpf in legacy mode
(non-direct action).

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
 drivers/net/ethernet/netronome/nfp/Makefile        |   1 +
 drivers/net/ethernet/netronome/nfp/nfp_net.h       |  26 ++-
 .../net/ethernet/netronome/nfp/nfp_net_common.c    |  40 +++-
 drivers/net/ethernet/netronome/nfp/nfp_net_ctrl.h  |  45 ++++-
 .../net/ethernet/netronome/nfp/nfp_net_offload.c   | 220 +++++++++++++++++++++
 5 files changed, 325 insertions(+), 7 deletions(-)
 create mode 100644 drivers/net/ethernet/netronome/nfp/nfp_net_offload.c

diff --git a/drivers/net/ethernet/netronome/nfp/Makefile b/drivers/net/ethernet/netronome/nfp/Makefile
index 5f12689bf523..0efb2ba9a558 100644
--- a/drivers/net/ethernet/netronome/nfp/Makefile
+++ b/drivers/net/ethernet/netronome/nfp/Makefile
@@ -3,6 +3,7 @@ obj-$(CONFIG_NFP_NETVF)	+= nfp_netvf.o
 nfp_netvf-objs := \
 	    nfp_net_common.o \
 	    nfp_net_ethtool.o \
+	    nfp_net_offload.o \
 	    nfp_netvf_main.o
 
 ifeq ($(CONFIG_BPF_SYSCALL),y)
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net.h b/drivers/net/ethernet/netronome/nfp/nfp_net.h
index 690635660195..ea6f5e667f27 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net.h
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net.h
@@ -220,7 +220,7 @@ struct nfp_net_tx_ring {
 #define PCIE_DESC_RX_I_TCP_CSUM_OK	cpu_to_le16(BIT(11))
 #define PCIE_DESC_RX_I_UDP_CSUM		cpu_to_le16(BIT(10))
 #define PCIE_DESC_RX_I_UDP_CSUM_OK	cpu_to_le16(BIT(9))
-#define PCIE_DESC_RX_SPARE		cpu_to_le16(BIT(8))
+#define PCIE_DESC_RX_BPF		cpu_to_le16(BIT(8))
 #define PCIE_DESC_RX_EOP		cpu_to_le16(BIT(7))
 #define PCIE_DESC_RX_IP4_CSUM		cpu_to_le16(BIT(6))
 #define PCIE_DESC_RX_IP4_CSUM_OK	cpu_to_le16(BIT(5))
@@ -413,6 +413,7 @@ static inline bool nfp_net_fw_ver_eq(struct nfp_net_fw_version *fw_ver,
  * @is_vf:              Is the driver attached to a VF?
  * @is_nfp3200:         Is the driver for a NFP-3200 card?
  * @fw_loaded:          Is the firmware loaded?
+ * @bpf_offload_skip_sw:  Offloaded BPF program will not be rerun by cls_bpf
  * @ctrl:               Local copy of the control register/word.
  * @fl_bufsz:           Currently configured size of the freelist buffers
  * @rx_offset:		Offset in the RX buffers where packet data starts
@@ -473,6 +474,7 @@ struct nfp_net {
 	unsigned is_vf:1;
 	unsigned is_nfp3200:1;
 	unsigned fw_loaded:1;
+	unsigned bpf_offload_skip_sw:1;
 
 	u32 ctrl;
 	u32 fl_bufsz;
@@ -561,12 +563,28 @@ struct nfp_net {
 /* Functions to read/write from/to a BAR
  * Performs any endian conversion necessary.
  */
+static inline u16 nn_readb(struct nfp_net *nn, int off)
+{
+	return readb(nn->ctrl_bar + off);
+}
+
 static inline void nn_writeb(struct nfp_net *nn, int off, u8 val)
 {
 	writeb(val, nn->ctrl_bar + off);
 }
 
-/* NFP-3200 can't handle 16-bit accesses too well - hence no readw/writew */
+/* NFP-3200 can't handle 16-bit accesses too well */
+static inline u16 nn_readw(struct nfp_net *nn, int off)
+{
+	WARN_ON_ONCE(nn->is_nfp3200);
+	return readw(nn->ctrl_bar + off);
+}
+
+static inline void nn_writew(struct nfp_net *nn, int off, u16 val)
+{
+	WARN_ON_ONCE(nn->is_nfp3200);
+	writew(val, nn->ctrl_bar + off);
+}
 
 static inline u32 nn_readl(struct nfp_net *nn, int off)
 {
@@ -757,4 +775,8 @@ static inline void nfp_net_debugfs_adapter_del(struct nfp_net *nn)
 }
 #endif /* CONFIG_NFP_NET_DEBUG */
 
+int
+nfp_net_bpf_offload(struct nfp_net *nn, u32 handle, __be16 proto,
+		    struct tc_cls_bpf_offload *cls_bpf);
+
 #endif /* _NFP_NET_H_ */
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index 88678c172b19..ff0e6b02934c 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -61,6 +61,7 @@
 
 #include <linux/ktime.h>
 
+#include <net/pkt_cls.h>
 #include <net/vxlan.h>
 
 #include "nfp_net_ctrl.h"
@@ -2387,6 +2388,31 @@ static struct rtnl_link_stats64 *nfp_net_stat64(struct net_device *netdev,
 	return stats;
 }
 
+static bool nfp_net_ebpf_capable(struct nfp_net *nn)
+{
+	if (nn->cap & NFP_NET_CFG_CTRL_BPF &&
+	    nn_readb(nn, NFP_NET_CFG_BPF_ABI) == NFP_NET_BPF_ABI)
+		return true;
+	return false;
+}
+
+static int
+nfp_net_setup_tc(struct net_device *netdev, u32 handle, __be16 proto,
+		 struct tc_to_netdev *tc)
+{
+	struct nfp_net *nn = netdev_priv(netdev);
+
+	if (TC_H_MAJ(handle) != TC_H_MAJ(TC_H_INGRESS))
+		return -ENOTSUPP;
+	if (proto != htons(ETH_P_ALL))
+		return -ENOTSUPP;
+
+	if (tc->type == TC_SETUP_CLSBPF && nfp_net_ebpf_capable(nn))
+		return nfp_net_bpf_offload(nn, handle, proto, tc->cls_bpf);
+
+	return -EINVAL;
+}
+
 static int nfp_net_set_features(struct net_device *netdev,
 				netdev_features_t features)
 {
@@ -2441,6 +2467,11 @@ static int nfp_net_set_features(struct net_device *netdev,
 			new_ctrl &= ~NFP_NET_CFG_CTRL_GATHER;
 	}
 
+	if (changed & NETIF_F_HW_TC && nn->ctrl & NFP_NET_CFG_CTRL_BPF) {
+		nn_err(nn, "Cannot disable HW TC offload while in use\n");
+		return -EBUSY;
+	}
+
 	nn_dbg(nn, "Feature change 0x%llx -> 0x%llx (changed=0x%llx)\n",
 	       netdev->features, features, changed);
 
@@ -2590,6 +2621,7 @@ static const struct net_device_ops nfp_net_netdev_ops = {
 	.ndo_stop		= nfp_net_netdev_close,
 	.ndo_start_xmit		= nfp_net_tx,
 	.ndo_get_stats64	= nfp_net_stat64,
+	.ndo_setup_tc		= nfp_net_setup_tc,
 	.ndo_tx_timeout		= nfp_net_tx_timeout,
 	.ndo_set_rx_mode	= nfp_net_set_rx_mode,
 	.ndo_change_mtu		= nfp_net_change_mtu,
@@ -2615,7 +2647,7 @@ void nfp_net_info(struct nfp_net *nn)
 		nn->fw_ver.resv, nn->fw_ver.class,
 		nn->fw_ver.major, nn->fw_ver.minor,
 		nn->max_mtu);
-	nn_info(nn, "CAP: %#x %s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s\n",
+	nn_info(nn, "CAP: %#x %s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s\n",
 		nn->cap,
 		nn->cap & NFP_NET_CFG_CTRL_PROMISC  ? "PROMISC "  : "",
 		nn->cap & NFP_NET_CFG_CTRL_L2BC     ? "L2BCFILT " : "",
@@ -2632,7 +2664,8 @@ void nfp_net_info(struct nfp_net *nn)
 		nn->cap & NFP_NET_CFG_CTRL_MSIXAUTO ? "AUTOMASK " : "",
 		nn->cap & NFP_NET_CFG_CTRL_IRQMOD   ? "IRQMOD "   : "",
 		nn->cap & NFP_NET_CFG_CTRL_VXLAN    ? "VXLAN "    : "",
-		nn->cap & NFP_NET_CFG_CTRL_NVGRE    ? "NVGRE "	  : "");
+		nn->cap & NFP_NET_CFG_CTRL_NVGRE    ? "NVGRE "	  : "",
+		nfp_net_ebpf_capable(nn)            ? "BPF "	  : "");
 }
 
 /**
@@ -2800,6 +2833,9 @@ int nfp_net_netdev_init(struct net_device *netdev)
 
 	netdev->features = netdev->hw_features;
 
+	if (nfp_net_ebpf_capable(nn))
+		netdev->hw_features |= NETIF_F_HW_TC;
+
 	/* Advertise but disable TSO by default. */
 	netdev->features &= ~(NETIF_F_TSO | NETIF_F_TSO6);
 
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_ctrl.h b/drivers/net/ethernet/netronome/nfp/nfp_net_ctrl.h
index ad6c4e31cedd..a4b0ef11a09c 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_ctrl.h
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_ctrl.h
@@ -123,6 +123,8 @@
 #define   NFP_NET_CFG_CTRL_L2SWITCH_LOCAL (0x1 << 23) /* Switch to local */
 #define   NFP_NET_CFG_CTRL_VXLAN	  (0x1 << 24) /* VXLAN tunnel support */
 #define   NFP_NET_CFG_CTRL_NVGRE	  (0x1 << 25) /* NVGRE tunnel support */
+/* bit 26 taken by the no-TX-intr thing */
+#define   NFP_NET_CFG_CTRL_BPF		  (0x1 << 27) /* BPF offload */
 #define NFP_NET_CFG_UPDATE              0x0004
 #define   NFP_NET_CFG_UPDATE_GEN          (0x1 <<  0) /* General update */
 #define   NFP_NET_CFG_UPDATE_RING         (0x1 <<  1) /* Ring config change */
@@ -134,6 +136,7 @@
 #define   NFP_NET_CFG_UPDATE_RESET        (0x1 <<  7) /* Update due to FLR */
 #define   NFP_NET_CFG_UPDATE_IRQMOD       (0x1 <<  8) /* IRQ mod change */
 #define   NFP_NET_CFG_UPDATE_VXLAN	  (0x1 <<  9) /* VXLAN port change */
+#define   NFP_NET_CFG_UPDATE_BPF	  (0x1 << 10) /* BPF program load */
 #define   NFP_NET_CFG_UPDATE_ERR          (0x1 << 31) /* A error occurred */
 #define NFP_NET_CFG_TXRS_ENABLE         0x0008
 #define NFP_NET_CFG_RXRS_ENABLE         0x0010
@@ -196,10 +199,37 @@
 #define NFP_NET_CFG_VXLAN_SZ		  0x0008
 
 /**
- * 64B reserved for future use (0x0080 - 0x00c0)
+ * NFP6000 - BPF section
+ * @NFP_NET_CFG_BPF_ABI:	BPF ABI version
+ * @NFP_NET_CFG_BPF_CAP:	BPF capabilities
+ * @NFP_NET_CFG_BPF_MAX_LEN:	Maximum size of JITed BPF code in bytes
+ * @NFP_NET_CFG_BPF_START:	Offset at which BPF will be loaded
+ * @NFP_NET_CFG_BPF_DONE:	Offset to jump to on exit
+ * @NFP_NET_CFG_BPF_STACK_SZ:	Total size of stack area in 64B chunks
+ * @NFP_NET_CFG_BPF_INL_MTU:	Packet data split offset in 64B chunks
+ * @NFP_NET_CFG_BPF_SIZE:	Size of the JITed BPF code in instructions
+ * @NFP_NET_CFG_BPF_ADDR:	DMA address of the buffer with JITed BPF code
  */
-#define NFP_NET_CFG_RESERVED            0x0080
-#define NFP_NET_CFG_RESERVED_SZ         0x0040
+#define NFP_NET_CFG_BPF_ABI		0x0080
+#define   NFP_NET_BPF_ABI		1
+#define NFP_NET_CFG_BPF_CAP		0x0081
+#define   NFP_NET_BPF_CAP_RELO		(1 << 0) /* seamless reload */
+#define NFP_NET_CFG_BPF_MAX_LEN		0x0082
+#define NFP_NET_CFG_BPF_START		0x0084
+#define NFP_NET_CFG_BPF_DONE		0x0086
+#define NFP_NET_CFG_BPF_STACK_SZ	0x0088
+#define NFP_NET_CFG_BPF_INL_MTU		0x0089
+#define NFP_NET_CFG_BPF_SIZE		0x008e
+#define NFP_NET_CFG_BPF_ADDR		0x0090
+#define   NFP_NET_CFG_BPF_CFG_8CTX	(1 << 0) /* 8ctx mode */
+#define   NFP_NET_CFG_BPF_CFG_MASK	7ULL
+#define   NFP_NET_CFG_BPF_ADDR_MASK	(~NFP_NET_CFG_BPF_CFG_MASK)
+
+/**
+ * 40B reserved for future use (0x0098 - 0x00c0)
+ */
+#define NFP_NET_CFG_RESERVED            0x0098
+#define NFP_NET_CFG_RESERVED_SZ         0x0028
 
 /**
  * RSS configuration (0x0100 - 0x01ac):
@@ -303,6 +333,15 @@
 #define NFP_NET_CFG_STATS_TX_MC_FRAMES  (NFP_NET_CFG_STATS_BASE + 0x80)
 #define NFP_NET_CFG_STATS_TX_BC_FRAMES  (NFP_NET_CFG_STATS_BASE + 0x88)
 
+#define NFP_NET_CFG_STATS_APP0_FRAMES	(NFP_NET_CFG_STATS_BASE + 0x90)
+#define NFP_NET_CFG_STATS_APP0_BYTES	(NFP_NET_CFG_STATS_BASE + 0x98)
+#define NFP_NET_CFG_STATS_APP1_FRAMES	(NFP_NET_CFG_STATS_BASE + 0xa0)
+#define NFP_NET_CFG_STATS_APP1_BYTES	(NFP_NET_CFG_STATS_BASE + 0xa8)
+#define NFP_NET_CFG_STATS_APP2_FRAMES	(NFP_NET_CFG_STATS_BASE + 0xb0)
+#define NFP_NET_CFG_STATS_APP2_BYTES	(NFP_NET_CFG_STATS_BASE + 0xb8)
+#define NFP_NET_CFG_STATS_APP3_FRAMES	(NFP_NET_CFG_STATS_BASE + 0xc0)
+#define NFP_NET_CFG_STATS_APP3_BYTES	(NFP_NET_CFG_STATS_BASE + 0xc8)
+
 /**
  * Per ring stats (0x1000 - 0x1800)
  * options, 64bit per entry
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_offload.c b/drivers/net/ethernet/netronome/nfp/nfp_net_offload.c
new file mode 100644
index 000000000000..3a3dba1a03f0
--- /dev/null
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_offload.c
@@ -0,0 +1,220 @@
+/*
+ * Copyright (C) 2016 Netronome Systems, Inc.
+ *
+ * This software is dual licensed under the GNU General Public License Version 2,
+ * June 1991 as shown in the file COPYING in the top-level directory of this
+ * source tree or the BSD 2-Clause License provided below.  You have the
+ * option to license this software under the complete terms of either license.
+ *
+ * The BSD 2-Clause License:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      1. Redistributions of source code must retain the above
+ *         copyright notice, this list of conditions and the following
+ *         disclaimer.
+ *
+ *      2. Redistributions in binary form must reproduce the above
+ *         copyright notice, this list of conditions and the following
+ *         disclaimer in the documentation and/or other materials
+ *         provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+/*
+ * nfp_net_offload.c
+ * Netronome network device driver: TC offload functions for PF and VF
+ */
+
+#include <linux/kernel.h>
+#include <linux/netdevice.h>
+#include <linux/pci.h>
+#include <linux/jiffies.h>
+#include <linux/timer.h>
+#include <linux/list.h>
+
+#include <net/pkt_cls.h>
+#include <net/tc_act/tc_gact.h>
+#include <net/tc_act/tc_mirred.h>
+
+#include "nfp_bpf.h"
+#include "nfp_net_ctrl.h"
+#include "nfp_net.h"
+
+static int
+nfp_net_bpf_get_act(struct nfp_net *nn, struct tc_cls_bpf_offload *cls_bpf)
+{
+	const struct tc_action *a;
+	LIST_HEAD(actions);
+
+	/* TC direct action */
+	if (cls_bpf->exts_integrated)
+		return -ENOTSUPP;
+
+	/* TC legacy mode */
+	if (!tc_single_action(cls_bpf->exts))
+		return -ENOTSUPP;
+
+	tcf_exts_to_list(cls_bpf->exts, &actions);
+	list_for_each_entry(a, &actions, list) {
+		if (is_tcf_gact_shot(a))
+			return NN_ACT_TC_DROP;
+	}
+
+	return -ENOTSUPP;
+}
+
+static int
+nfp_net_bpf_offload_prepare(struct nfp_net *nn,
+			    struct tc_cls_bpf_offload *cls_bpf,
+			    struct nfp_bpf_result *res,
+			    void **code, dma_addr_t *dma_addr, u16 max_instr)
+{
+	unsigned int code_sz = max_instr * sizeof(u64);
+	enum nfp_bpf_action_type act;
+	u16 start_off, done_off;
+	unsigned int max_mtu;
+	int ret;
+
+	ret = nfp_net_bpf_get_act(nn, cls_bpf);
+	if (ret < 0)
+		return ret;
+	act = ret;
+
+	max_mtu = nn_readb(nn, NFP_NET_CFG_BPF_INL_MTU) * 64 - 32;
+	if (max_mtu < nn->netdev->mtu) {
+		nn_info(nn, "BPF offload not supported with MTU larger than HW packet split boundary\n");
+		return -ENOTSUPP;
+	}
+
+	start_off = nn_readw(nn, NFP_NET_CFG_BPF_START);
+	done_off = nn_readw(nn, NFP_NET_CFG_BPF_DONE);
+
+	*code = dma_zalloc_coherent(&nn->pdev->dev, code_sz, dma_addr,
+				    GFP_KERNEL);
+	if (!*code)
+		return -ENOMEM;
+
+	ret = nfp_bpf_jit(cls_bpf->filter, *code, act, start_off, done_off,
+			  max_instr, res);
+	if (ret)
+		goto out;
+
+	return 0;
+
+out:
+	dma_free_coherent(&nn->pdev->dev, code_sz, *code, *dma_addr);
+	return ret;
+}
+
+static void
+nfp_net_bpf_load_and_start(struct nfp_net *nn, u32 tc_flags,
+			   void *code, dma_addr_t dma_addr,
+			   unsigned int code_sz, unsigned int n_instr,
+			   bool dense_mode)
+{
+	u64 bpf_addr = dma_addr;
+	int err;
+
+	nn->bpf_offload_skip_sw = !!(tc_flags & TCA_CLS_FLAGS_SKIP_SW);
+
+	if (dense_mode)
+		bpf_addr |= NFP_NET_CFG_BPF_CFG_8CTX;
+
+	nn_writew(nn, NFP_NET_CFG_BPF_SIZE, n_instr);
+	nn_writeq(nn, NFP_NET_CFG_BPF_ADDR, bpf_addr);
+
+	/* Load up the JITed code */
+	err = nfp_net_reconfig(nn, NFP_NET_CFG_UPDATE_BPF);
+	if (err)
+		nn_err(nn, "FW command error while loading BPF: %d\n", err);
+
+	/* Enable passing packets through BPF function */
+	nn->ctrl |= NFP_NET_CFG_CTRL_BPF;
+	nn_writel(nn, NFP_NET_CFG_CTRL, nn->ctrl);
+	err = nfp_net_reconfig(nn, NFP_NET_CFG_UPDATE_GEN);
+	if (err)
+		nn_err(nn, "FW command error while enabling BPF: %d\n", err);
+
+	dma_free_coherent(&nn->pdev->dev, code_sz, code, dma_addr);
+}
+
+static int nfp_net_bpf_stop(struct nfp_net *nn)
+{
+	if (!(nn->ctrl & NFP_NET_CFG_CTRL_BPF))
+		return 0;
+
+	nn->ctrl &= ~NFP_NET_CFG_CTRL_BPF;
+	nn_writel(nn, NFP_NET_CFG_CTRL, nn->ctrl);
+
+	nn->bpf_offload_skip_sw = 0;
+
+	return nfp_net_reconfig(nn, NFP_NET_CFG_UPDATE_GEN);
+}
+
+int
+nfp_net_bpf_offload(struct nfp_net *nn, u32 handle, __be16 proto,
+		    struct tc_cls_bpf_offload *cls_bpf)
+{
+	struct nfp_bpf_result res;
+	dma_addr_t dma_addr;
+	u16 max_instr;
+	void *code;
+	int err;
+
+	max_instr = nn_readw(nn, NFP_NET_CFG_BPF_MAX_LEN);
+
+	switch (cls_bpf->command) {
+	case TC_CLSBPF_REPLACE:
+		/* There is nothing stopping us from implementing seamless
+		 * replace but the simple method of loading I adopted in
+		 * the firmware does not handle atomic replace (i.e. we have to
+		 * stop the BPF offload and re-enable it).  Leaking-in a few
+		 * frames which didn't have BPF applied in the hardware should
+		 * be fine if software fallback is available, though.
+		 */
+		if (nn->bpf_offload_skip_sw)
+			return -EBUSY;
+
+		err = nfp_net_bpf_offload_prepare(nn, cls_bpf, &res, &code,
+						  &dma_addr, max_instr);
+		if (err)
+			return err;
+
+		nfp_net_bpf_stop(nn);
+		nfp_net_bpf_load_and_start(nn, cls_bpf->gen_flags, code,
+					   dma_addr, max_instr * sizeof(u64),
+					   res.n_instr, res.dense_mode);
+		return 0;
+
+	case TC_CLSBPF_ADD:
+		if (nn->ctrl & NFP_NET_CFG_CTRL_BPF)
+			return -EBUSY;
+
+		err = nfp_net_bpf_offload_prepare(nn, cls_bpf, &res, &code,
+						  &dma_addr, max_instr);
+		if (err)
+			return err;
+
+		nfp_net_bpf_load_and_start(nn, cls_bpf->gen_flags, code,
+					   dma_addr, max_instr * sizeof(u64),
+					   res.n_instr, res.dense_mode);
+		return 0;
+
+	case TC_CLSBPF_DESTROY:
+		return nfp_net_bpf_stop(nn);
+
+	default:
+		return -ENOTSUPP;
+	}
+}
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [RFCv2 11/16] net: cls_bpf: allow offloaded filters to update stats
  2016-08-26 18:05 [RFCv2 00/16] BPF hardware offload (cls_bpf for now) Jakub Kicinski
                   ` (9 preceding siblings ...)
  2016-08-26 18:06 ` [RFCv2 10/16] nfp: bpf: add hardware bpf offload Jakub Kicinski
@ 2016-08-26 18:06 ` Jakub Kicinski
  2016-08-29 20:43   ` Daniel Borkmann
  2016-08-26 18:06 ` [RFCv2 12/16] net: bpf: " Jakub Kicinski
                   ` (4 subsequent siblings)
  15 siblings, 1 reply; 40+ messages in thread
From: Jakub Kicinski @ 2016-08-26 18:06 UTC (permalink / raw)
  To: netdev
  Cc: ast, daniel, dinan.gunawardena, jiri, john.fastabend, kubakici,
	Jakub Kicinski

Call into offloaded filters to update stats.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
 include/net/pkt_cls.h |  1 +
 net/sched/cls_bpf.c   | 11 +++++++++++
 2 files changed, 12 insertions(+)

diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index 0a4a51f339b4..462dc53dd8cc 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -490,6 +490,7 @@ enum tc_clsbpf_command {
 	TC_CLSBPF_ADD,
 	TC_CLSBPF_REPLACE,
 	TC_CLSBPF_DESTROY,
+	TC_CLSBPF_STATS,
 };
 
 struct tc_cls_bpf_offload {
diff --git a/net/sched/cls_bpf.c b/net/sched/cls_bpf.c
index 630f296f5a90..f1095c6211c9 100644
--- a/net/sched/cls_bpf.c
+++ b/net/sched/cls_bpf.c
@@ -223,6 +223,15 @@ static void cls_bpf_stop_offload(struct tcf_proto *tp,
 	prog->offloaded = false;
 }
 
+static void cls_bpf_offload_update_stats(struct tcf_proto *tp,
+					 struct cls_bpf_prog *prog)
+{
+	if (!prog->offloaded)
+		return;
+
+	cls_bpf_offload_cmd(tp, prog, TC_CLSBPF_STATS);
+}
+
 static int cls_bpf_init(struct tcf_proto *tp)
 {
 	struct cls_bpf_head *head;
@@ -577,6 +586,8 @@ static int cls_bpf_dump(struct net *net, struct tcf_proto *tp, unsigned long fh,
 
 	tm->tcm_handle = prog->handle;
 
+	cls_bpf_offload_update_stats(tp, prog);
+
 	nest = nla_nest_start(skb, TCA_OPTIONS);
 	if (nest == NULL)
 		goto nla_put_failure;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [RFCv2 12/16] net: bpf: allow offloaded filters to update stats
  2016-08-26 18:05 [RFCv2 00/16] BPF hardware offload (cls_bpf for now) Jakub Kicinski
                   ` (10 preceding siblings ...)
  2016-08-26 18:06 ` [RFCv2 11/16] net: cls_bpf: allow offloaded filters to update stats Jakub Kicinski
@ 2016-08-26 18:06 ` Jakub Kicinski
  2016-08-26 18:06 ` [RFCv2 13/16] nfp: bpf: add packet marking support Jakub Kicinski
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 40+ messages in thread
From: Jakub Kicinski @ 2016-08-26 18:06 UTC (permalink / raw)
  To: netdev
  Cc: ast, daniel, dinan.gunawardena, jiri, john.fastabend, kubakici,
	Jakub Kicinski

Periodically poll stats and call into offloaded actions
to update them.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
 drivers/net/ethernet/netronome/nfp/nfp_net.h       | 19 +++++++
 .../net/ethernet/netronome/nfp/nfp_net_common.c    |  3 ++
 .../net/ethernet/netronome/nfp/nfp_net_offload.c   | 63 ++++++++++++++++++++++
 3 files changed, 85 insertions(+)

diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net.h b/drivers/net/ethernet/netronome/nfp/nfp_net.h
index ea6f5e667f27..13c6a9001b4d 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net.h
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net.h
@@ -62,6 +62,9 @@
 /* Max time to wait for NFP to respond on updates (in seconds) */
 #define NFP_NET_POLL_TIMEOUT	5
 
+/* Interval for reading offloaded filter stats */
+#define NFP_NET_STAT_POLL_IVL	msecs_to_jiffies(100)
+
 /* Bar allocation */
 #define NFP_NET_CTRL_BAR	0
 #define NFP_NET_Q0_BAR		2
@@ -405,6 +408,11 @@ static inline bool nfp_net_fw_ver_eq(struct nfp_net_fw_version *fw_ver,
 	       fw_ver->minor == minor;
 }
 
+struct nfp_stat_pair {
+	u64 pkts;
+	u64 bytes;
+};
+
 /**
  * struct nfp_net - NFP network device structure
  * @pdev:               Backpointer to PCI device
@@ -428,6 +436,11 @@ static inline bool nfp_net_fw_ver_eq(struct nfp_net_fw_version *fw_ver,
  * @rss_cfg:            RSS configuration
  * @rss_key:            RSS secret key
  * @rss_itbl:           RSS indirection table
+ * @rx_filter:		Filter offload statistics - dropped packets/bytes
+ * @rx_filter_prev:	Filter offload statistics - values from previous update
+ * @rx_filter_change:	Jiffies when statistics last changed
+ * @rx_filter_stats_timer:  Timer for polling filter offload statistics
+ * @rx_filter_lock:	Lock protecting timer state changes (teardown)
  * @max_tx_rings:       Maximum number of TX rings supported by the Firmware
  * @max_rx_rings:       Maximum number of RX rings supported by the Firmware
  * @num_tx_rings:       Currently configured number of TX rings
@@ -504,6 +517,11 @@ struct nfp_net {
 	u8 rss_key[NFP_NET_CFG_RSS_KEY_SZ];
 	u8 rss_itbl[NFP_NET_CFG_RSS_ITBL_SZ];
 
+	struct nfp_stat_pair rx_filter, rx_filter_prev;
+	unsigned long rx_filter_change;
+	struct timer_list rx_filter_stats_timer;
+	spinlock_t rx_filter_lock;
+
 	int max_tx_rings;
 	int max_rx_rings;
 
@@ -775,6 +793,7 @@ static inline void nfp_net_debugfs_adapter_del(struct nfp_net *nn)
 }
 #endif /* CONFIG_NFP_NET_DEBUG */
 
+void nfp_net_filter_stats_timer(unsigned long data);
 int
 nfp_net_bpf_offload(struct nfp_net *nn, u32 handle, __be16 proto,
 		    struct tc_cls_bpf_offload *cls_bpf);
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index ff0e6b02934c..053bda8a0fbd 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -2708,10 +2708,13 @@ struct nfp_net *nfp_net_netdev_alloc(struct pci_dev *pdev,
 	nn->rxd_cnt = NFP_NET_RX_DESCS_DEFAULT;
 
 	spin_lock_init(&nn->reconfig_lock);
+	spin_lock_init(&nn->rx_filter_lock);
 	spin_lock_init(&nn->link_status_lock);
 
 	setup_timer(&nn->reconfig_timer,
 		    nfp_net_reconfig_timer, (unsigned long)nn);
+	setup_timer(&nn->rx_filter_stats_timer,
+		    nfp_net_filter_stats_timer, (unsigned long)nn);
 
 	return nn;
 }
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_offload.c b/drivers/net/ethernet/netronome/nfp/nfp_net_offload.c
index 3a3dba1a03f0..4b8ce01a8dd2 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_offload.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_offload.c
@@ -51,6 +51,60 @@
 #include "nfp_net_ctrl.h"
 #include "nfp_net.h"
 
+void nfp_net_filter_stats_timer(unsigned long data)
+{
+	struct nfp_net *nn = (void *)data;
+	struct nfp_stat_pair latest;
+
+	spin_lock_bh(&nn->rx_filter_lock);
+
+	if (nn->ctrl & NFP_NET_CFG_CTRL_BPF)
+		mod_timer(&nn->rx_filter_stats_timer,
+			  jiffies + NFP_NET_STAT_POLL_IVL);
+
+	spin_unlock_bh(&nn->rx_filter_lock);
+
+	latest.pkts = nn_readq(nn, NFP_NET_CFG_STATS_APP1_FRAMES);
+	latest.bytes = nn_readq(nn, NFP_NET_CFG_STATS_APP1_BYTES);
+
+	if (latest.pkts != nn->rx_filter.pkts)
+		nn->rx_filter_change = jiffies;
+
+	nn->rx_filter = latest;
+}
+
+static void nfp_net_bpf_stats_reset(struct nfp_net *nn)
+{
+	nn->rx_filter.pkts = nn_readq(nn, NFP_NET_CFG_STATS_APP1_FRAMES);
+	nn->rx_filter.bytes = nn_readq(nn, NFP_NET_CFG_STATS_APP1_BYTES);
+	nn->rx_filter_prev = nn->rx_filter;
+	nn->rx_filter_change = jiffies;
+}
+
+static int
+nfp_net_bpf_stats_update(struct nfp_net *nn, struct tc_cls_bpf_offload *cls_bpf)
+{
+	struct tc_action *a;
+	LIST_HEAD(actions);
+	u64 bytes, pkts;
+
+	pkts = nn->rx_filter.pkts - nn->rx_filter_prev.pkts;
+	bytes = nn->rx_filter.bytes - nn->rx_filter_prev.bytes;
+	bytes -= pkts * ETH_HLEN;
+
+	nn->rx_filter_prev = nn->rx_filter;
+
+	preempt_disable();
+
+	tcf_exts_to_list(cls_bpf->exts, &actions);
+	list_for_each_entry(a, &actions, list)
+		tcf_action_stats_update(a, bytes, pkts, nn->rx_filter_change);
+
+	preempt_enable();
+
+	return 0;
+}
+
 static int
 nfp_net_bpf_get_act(struct nfp_net *nn, struct tc_cls_bpf_offload *cls_bpf)
 {
@@ -147,6 +201,9 @@ nfp_net_bpf_load_and_start(struct nfp_net *nn, u32 tc_flags,
 		nn_err(nn, "FW command error while enabling BPF: %d\n", err);
 
 	dma_free_coherent(&nn->pdev->dev, code_sz, code, dma_addr);
+
+	nfp_net_bpf_stats_reset(nn);
+	mod_timer(&nn->rx_filter_stats_timer, jiffies + NFP_NET_STAT_POLL_IVL);
 }
 
 static int nfp_net_bpf_stop(struct nfp_net *nn)
@@ -154,9 +211,12 @@ static int nfp_net_bpf_stop(struct nfp_net *nn)
 	if (!(nn->ctrl & NFP_NET_CFG_CTRL_BPF))
 		return 0;
 
+	spin_lock_bh(&nn->rx_filter_lock);
 	nn->ctrl &= ~NFP_NET_CFG_CTRL_BPF;
+	spin_unlock_bh(&nn->rx_filter_lock);
 	nn_writel(nn, NFP_NET_CFG_CTRL, nn->ctrl);
 
+	del_timer_sync(&nn->rx_filter_stats_timer);
 	nn->bpf_offload_skip_sw = 0;
 
 	return nfp_net_reconfig(nn, NFP_NET_CFG_UPDATE_GEN);
@@ -214,6 +274,9 @@ nfp_net_bpf_offload(struct nfp_net *nn, u32 handle, __be16 proto,
 	case TC_CLSBPF_DESTROY:
 		return nfp_net_bpf_stop(nn);
 
+	case TC_CLSBPF_STATS:
+		return nfp_net_bpf_stats_update(nn, cls_bpf);
+
 	default:
 		return -ENOTSUPP;
 	}
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [RFCv2 13/16] nfp: bpf: add packet marking support
  2016-08-26 18:05 [RFCv2 00/16] BPF hardware offload (cls_bpf for now) Jakub Kicinski
                   ` (11 preceding siblings ...)
  2016-08-26 18:06 ` [RFCv2 12/16] net: bpf: " Jakub Kicinski
@ 2016-08-26 18:06 ` Jakub Kicinski
  2016-08-26 18:06 ` [RFCv2 14/16] net: act_mirred: allow statistic updates from offloaded actions Jakub Kicinski
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 40+ messages in thread
From: Jakub Kicinski @ 2016-08-26 18:06 UTC (permalink / raw)
  To: netdev
  Cc: ast, daniel, dinan.gunawardena, jiri, john.fastabend, kubakici,
	Jakub Kicinski

Add the missing ABI defines and eBPF instructions to allow
the mark to be passed on, and extend prepend parsing on the
RX path to pick it up from packet metadata.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
 drivers/net/ethernet/netronome/nfp/nfp_bpf.h       |  2 ++
 drivers/net/ethernet/netronome/nfp/nfp_bpf_jit.c   | 19 +++++++++++
 .../net/ethernet/netronome/nfp/nfp_net_common.c    | 38 ++++++++++++++++++----
 drivers/net/ethernet/netronome/nfp/nfp_net_ctrl.h  |  8 +++++
 4 files changed, 60 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/nfp_bpf.h b/drivers/net/ethernet/netronome/nfp/nfp_bpf.h
index f4265f88db23..85b258a70b18 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_bpf.h
+++ b/drivers/net/ethernet/netronome/nfp/nfp_bpf.h
@@ -91,6 +91,8 @@ enum nfp_bpf_reg_type {
 #define imm_both(np)	reg_both((np)->regs_per_thread - STATIC_REG_IMM)
 
 #define NFP_BPF_ABI_FLAGS	reg_nnr(0)
+#define   NFP_BPF_ABI_FLAG_MARK	1
+#define NFP_BPF_ABI_MARK	reg_nnr(1)
 #define NFP_BPF_ABI_PKT		reg_nnr(2)
 #define NFP_BPF_ABI_LEN		reg_nnr(3)
 
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_bpf_jit.c b/drivers/net/ethernet/netronome/nfp/nfp_bpf_jit.c
index 09ed1627ae20..ca73be6fcc3d 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_bpf_jit.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_bpf_jit.c
@@ -674,6 +674,16 @@ static int construct_data_ld(struct nfp_prog *nfp_prog, u16 offset, u8 size)
 	return construct_data_ind_ld(nfp_prog, offset, 0, false, size);
 }
 
+static int wrp_set_mark(struct nfp_prog *nfp_prog, u8 src)
+{
+	emit_alu(nfp_prog, NFP_BPF_ABI_MARK,
+		 reg_none(), ALU_OP_NONE, reg_b(src));
+	emit_alu(nfp_prog, NFP_BPF_ABI_FLAGS,
+		 NFP_BPF_ABI_FLAGS, ALU_OP_OR, reg_imm(NFP_BPF_ABI_FLAG_MARK));
+
+	return 0;
+}
+
 static void
 wrp_alu_imm(struct nfp_prog *nfp_prog, u8 dst, enum alu_op alu_op, u32 imm)
 {
@@ -1117,6 +1127,14 @@ static int mem_ldx4(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 	return 0;
 }
 
+static int mem_stx4(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	if (meta->insn.off == offsetof(struct sk_buff, mark))
+		return wrp_set_mark(nfp_prog, meta->insn.src_reg * 2);
+
+	return -ENOTSUPP;
+}
+
 static int jump(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 {
 	if (meta->insn.off < 0) /* TODO */
@@ -1306,6 +1324,7 @@ static const instr_cb_t instr_cb[256] = {
 	[BPF_LD | BPF_IND | BPF_H] =	data_ind_ld2,
 	[BPF_LD | BPF_IND | BPF_W] =	data_ind_ld4,
 	[BPF_LDX | BPF_MEM | BPF_W] =	mem_ldx4,
+	[BPF_STX | BPF_MEM | BPF_W] =	mem_stx4,
 	[BPF_JMP | BPF_JA | BPF_K] =	jump,
 	[BPF_JMP | BPF_JEQ | BPF_K] =	jeq_imm,
 	[BPF_JMP | BPF_JGT | BPF_K] =	jgt_imm,
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index 053bda8a0fbd..739dd13dc18e 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -1298,23 +1298,20 @@ static void nfp_net_rx_csum(struct nfp_net *nn, struct nfp_net_r_vector *r_vec,
  * nfp_net_set_hash() - Set SKB hash data
  * @netdev: adapter's net_device structure
  * @skb:   SKB to set the hash data on
- * @rxd:   RX descriptor
  *
  * The RSS hash and hash-type are pre-pended to the packet data.
  * Extract and decode it and set the skb fields.
  */
-static void nfp_net_set_hash(struct net_device *netdev, struct sk_buff *skb,
-			     struct nfp_net_rx_desc *rxd)
+static void nfp_net_set_hash(struct net_device *netdev, struct sk_buff *skb)
 {
 	struct nfp_net_rx_hash *rx_hash;
 
-	if (!(rxd->rxd.flags & PCIE_DESC_RX_RSS) ||
-	    !(netdev->features & NETIF_F_RXHASH))
+	if (!(netdev->features & NETIF_F_RXHASH))
 		return;
 
 	rx_hash = (struct nfp_net_rx_hash *)(skb->data - sizeof(*rx_hash));
 
-	switch (be32_to_cpu(rx_hash->hash_type)) {
+	switch (be32_to_cpu(rx_hash->hash_type) & NFP_NET_META_FIELD_HASH) {
 	case NFP_NET_RSS_IPV4:
 	case NFP_NET_RSS_IPV6:
 	case NFP_NET_RSS_IPV6_EX:
@@ -1326,6 +1323,33 @@ static void nfp_net_set_hash(struct net_device *netdev, struct sk_buff *skb,
 	}
 }
 
+static void
+nfp_net_parse_meta(struct net_device *netdev, struct sk_buff *skb,
+		   struct nfp_net_rx_desc *rxd, int meta_len)
+{
+	u32 meta_info;
+	u8 *data = skb->data - 4;
+
+	if (rxd->rxd.flags & PCIE_DESC_RX_RSS) {
+		data -= 4;
+		nfp_net_set_hash(netdev, skb);
+	}
+
+	meta_info = get_unaligned_be32(data) >> 8;
+	data -= 4;
+
+	while (meta_info) {
+		switch (meta_info & GENMASK(NFP_NET_META_FIELD_SIZE - 1, 0)) {
+		case NFP_NET_META_MARK:
+			skb->mark = get_unaligned_be32(data);
+			data -= 4;
+			break;
+		}
+
+		meta_info >>= NFP_NET_META_FIELD_SIZE;
+	}
+}
+
 /**
  * nfp_net_rx() - receive up to @budget packets on @rx_ring
  * @rx_ring:   RX ring to receive from
@@ -1440,7 +1464,7 @@ static int nfp_net_rx(struct nfp_net_rx_ring *rx_ring, int budget)
 			skb_reserve(skb, nn->rx_offset);
 		skb_put(skb, data_len - meta_len);
 
-		nfp_net_set_hash(nn->netdev, skb, rxd);
+		nfp_net_parse_meta(nn->netdev, skb, rxd, meta_len);
 
 		/* Pad small frames to minimum */
 		if (skb_put_padto(skb, 60))
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_ctrl.h b/drivers/net/ethernet/netronome/nfp/nfp_net_ctrl.h
index a4b0ef11a09c..4c30439d3fcb 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_ctrl.h
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_ctrl.h
@@ -80,6 +80,14 @@
 #define NFP_NET_RSS_IPV6_EX_UDP         9
 
 /**
+ * Prepend field types
+ */
+#define NFP_NET_META_FIELD_HASH		0xff
+
+#define NFP_NET_META_FIELD_SIZE		4
+#define NFP_NET_META_MARK		1
+
+/**
  * @NFP_NET_TXR_MAX:         Maximum number of TX rings
  * @NFP_NET_RXR_MAX:         Maximum number of RX rings
  */
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [RFCv2 14/16] net: act_mirred: allow statistic updates from offloaded actions
  2016-08-26 18:05 [RFCv2 00/16] BPF hardware offload (cls_bpf for now) Jakub Kicinski
                   ` (12 preceding siblings ...)
  2016-08-26 18:06 ` [RFCv2 13/16] nfp: bpf: add packet marking support Jakub Kicinski
@ 2016-08-26 18:06 ` Jakub Kicinski
  2016-08-26 18:06 ` [RFCv2 15/16] nfp: bpf: add support for legacy redirect action Jakub Kicinski
  2016-08-26 18:06 ` [RFCv2 16/16] nfp: bpf: add offload of TC direct action mode Jakub Kicinski
  15 siblings, 0 replies; 40+ messages in thread
From: Jakub Kicinski @ 2016-08-26 18:06 UTC (permalink / raw)
  To: netdev
  Cc: ast, daniel, dinan.gunawardena, jiri, john.fastabend, kubakici,
	Jakub Kicinski

Implement .stats_update() callback.  The implementation
is generic and can be reused by other simple actions if
needed.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
 net/sched/act_mirred.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
index 6038c85d92f5..f9862d89cb93 100644
--- a/net/sched/act_mirred.c
+++ b/net/sched/act_mirred.c
@@ -204,6 +204,13 @@ out:
 	return retval;
 }
 
+static void tcf_stats_update(struct tc_action *a, u64 bytes, u32 packets,
+			     u64 lastuse)
+{
+	tcf_lastuse_update(&a->tcfa_tm);
+	_bstats_cpu_update(this_cpu_ptr(a->cpu_bstats), bytes, packets);
+}
+
 static int tcf_mirred_dump(struct sk_buff *skb, struct tc_action *a, int bind, int ref)
 {
 	unsigned char *b = skb_tail_pointer(skb);
@@ -280,6 +287,7 @@ static struct tc_action_ops act_mirred_ops = {
 	.type		=	TCA_ACT_MIRRED,
 	.owner		=	THIS_MODULE,
 	.act		=	tcf_mirred,
+	.stats_update	=	tcf_stats_update,
 	.dump		=	tcf_mirred_dump,
 	.cleanup	=	tcf_mirred_release,
 	.init		=	tcf_mirred_init,
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [RFCv2 15/16] nfp: bpf: add support for legacy redirect action
  2016-08-26 18:05 [RFCv2 00/16] BPF hardware offload (cls_bpf for now) Jakub Kicinski
                   ` (13 preceding siblings ...)
  2016-08-26 18:06 ` [RFCv2 14/16] net: act_mirred: allow statistic updates from offloaded actions Jakub Kicinski
@ 2016-08-26 18:06 ` Jakub Kicinski
  2016-08-26 18:06 ` [RFCv2 16/16] nfp: bpf: add offload of TC direct action mode Jakub Kicinski
  15 siblings, 0 replies; 40+ messages in thread
From: Jakub Kicinski @ 2016-08-26 18:06 UTC (permalink / raw)
  To: netdev
  Cc: ast, daniel, dinan.gunawardena, jiri, john.fastabend, kubakici,
	Jakub Kicinski

The data path has redirect support, so expressing a redirect
to the port the frame came from is a trivial matter of
setting the right result code.
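
In legacy mode the program itself only classifies; the redirect comes
from the mirred action attached to the filter.  A minimal classifier of
that shape could look roughly like the sketch below (the length check is
purely illustrative, and the redirect action is configured from tc, not
in the program):

#include <linux/bpf.h>

/* Non-DA cls_bpf: return -1 to report a match so the attached action
 * (mirred redirect back to the ingress port) runs, or 0 to let the
 * packet continue down the filter chain.
 */
int match_small(struct __sk_buff *skb)
{
	if (skb->len < 256)
		return -1;

	return 0;
}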

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
 drivers/net/ethernet/netronome/nfp/nfp_bpf.h         | 1 +
 drivers/net/ethernet/netronome/nfp/nfp_bpf_jit.c     | 2 ++
 drivers/net/ethernet/netronome/nfp/nfp_net_offload.c | 4 ++++
 3 files changed, 7 insertions(+)

diff --git a/drivers/net/ethernet/netronome/nfp/nfp_bpf.h b/drivers/net/ethernet/netronome/nfp/nfp_bpf.h
index 85b258a70b18..ccb8c2fa20d5 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_bpf.h
+++ b/drivers/net/ethernet/netronome/nfp/nfp_bpf.h
@@ -60,6 +60,7 @@ enum static_regs {
 
 enum nfp_bpf_action_type {
 	NN_ACT_TC_DROP,
+	NN_ACT_TC_REDIR,
 };
 
 /* Software register representation, hardware encoding in asm.h */
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_bpf_jit.c b/drivers/net/ethernet/netronome/nfp/nfp_bpf_jit.c
index ca73be6fcc3d..e784ed827afe 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_bpf_jit.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_bpf_jit.c
@@ -1440,6 +1440,7 @@ static void nfp_outro_tc_legacy(struct nfp_prog *nfp_prog)
 {
 	const u8 act2code[] = {
 		[NN_ACT_TC_DROP]  = 0x22,
+		[NN_ACT_TC_REDIR] = 0x24
 	};
 	/* Target for aborts */
 	nfp_prog->tgt_abort = nfp_prog_current_offset(nfp_prog);
@@ -1468,6 +1469,7 @@ static void nfp_outro(struct nfp_prog *nfp_prog)
 {
 	switch (nfp_prog->act) {
 	case NN_ACT_TC_DROP:
+	case NN_ACT_TC_REDIR:
 		nfp_outro_tc_legacy(nfp_prog);
 		break;
 	}
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_offload.c b/drivers/net/ethernet/netronome/nfp/nfp_net_offload.c
index 4b8ce01a8dd2..6399801c4196 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_offload.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_offload.c
@@ -123,6 +123,10 @@ nfp_net_bpf_get_act(struct nfp_net *nn, struct tc_cls_bpf_offload *cls_bpf)
 	list_for_each_entry(a, &actions, list) {
 		if (is_tcf_gact_shot(a))
 			return NN_ACT_TC_DROP;
+
+		if (is_tcf_mirred_redirect(a) &&
+		    tcf_mirred_ifindex(a) == nn->netdev->ifindex)
+			return NN_ACT_TC_REDIR;
 	}
 
 	return -ENOTSUPP;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [RFCv2 16/16] nfp: bpf: add offload of TC direct action mode
  2016-08-26 18:05 [RFCv2 00/16] BPF hardware offload (cls_bpf for now) Jakub Kicinski
                   ` (14 preceding siblings ...)
  2016-08-26 18:06 ` [RFCv2 15/16] nfp: bpf: add support for legacy redirect action Jakub Kicinski
@ 2016-08-26 18:06 ` Jakub Kicinski
  2016-08-29 21:09   ` Daniel Borkmann
  15 siblings, 1 reply; 40+ messages in thread
From: Jakub Kicinski @ 2016-08-26 18:06 UTC (permalink / raw)
  To: netdev
  Cc: ast, daniel, dinan.gunawardena, jiri, john.fastabend, kubakici,
	Jakub Kicinski

Add offload of TC in direct action mode.  We just need
to provide appropriate checks in the verifier and
a new outro block to translate the exit codes to what
the data path expects.
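
For illustration, a direct-action program this offload can handle has to
return one of the translatable codes as a constant the verifier can
prove; a minimal sketch (the length check, section name and attribute
are just an example, not part of this patch):

#include <linux/bpf.h>
#include <linux/pkt_cls.h>

/* Drop oversized frames, pass everything else up the stack.
 * TC_ACT_SHOT maps to the drop result code; TC_ACT_UNSPEC is the
 * only "pass" value the new outro block can express.
 */
__attribute__((section("classifier"), used))
int drop_big(struct __sk_buff *skb)
{
	if (skb->len > 1000)
		return TC_ACT_SHOT;

	return TC_ACT_UNSPEC;
}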

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
 drivers/net/ethernet/netronome/nfp/nfp_bpf.h       |  1 +
 drivers/net/ethernet/netronome/nfp/nfp_bpf_jit.c   | 66 ++++++++++++++++++++++
 .../net/ethernet/netronome/nfp/nfp_bpf_verifier.c  | 11 +++-
 .../net/ethernet/netronome/nfp/nfp_net_offload.c   |  6 +-
 4 files changed, 82 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/nfp_bpf.h b/drivers/net/ethernet/netronome/nfp/nfp_bpf.h
index ccb8c2fa20d5..3fd4cf6fa47b 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_bpf.h
+++ b/drivers/net/ethernet/netronome/nfp/nfp_bpf.h
@@ -61,6 +61,7 @@ enum static_regs {
 enum nfp_bpf_action_type {
 	NN_ACT_TC_DROP,
 	NN_ACT_TC_REDIR,
+	NN_ACT_DIRECT,
 };
 
 /* Software register representation, hardware encoding in asm.h */
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_bpf_jit.c b/drivers/net/ethernet/netronome/nfp/nfp_bpf_jit.c
index e784ed827afe..1bed2ae05da5 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_bpf_jit.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_bpf_jit.c
@@ -321,6 +321,16 @@ __emit_br(struct nfp_prog *nfp_prog, enum br_mask mask, enum br_ev_pip ev_pip,
 	nfp_prog_push(nfp_prog, insn);
 }
 
+static void emit_br_def(struct nfp_prog *nfp_prog, u16 addr, u8 defer)
+{
+	if (defer > 2) {
+		pr_err("BUG: branch defer out of bounds %d\n", defer);
+		nfp_prog->error = -EFAULT;
+		return;
+	}
+	__emit_br(nfp_prog, BR_UNC, BR_EV_PIP_UNCOND, BR_CSS_NONE, addr, defer);
+}
+
 static void
 emit_br(struct nfp_prog *nfp_prog, enum br_mask mask, u16 addr, u8 defer)
 {
@@ -1465,9 +1475,65 @@ static void nfp_outro_tc_legacy(struct nfp_prog *nfp_prog)
 		      SHF_SC_L_SHF, 16);
 }
 
+static void nfp_outro_tc_da(struct nfp_prog *nfp_prog)
+{
+	/* TC direct-action mode:
+	 *   0,1   ok        NOT SUPPORTED[1]
+	 *   2   drop  0x22 -> drop,  count as stat1
+	 *   4,5 nuke  0x02 -> drop
+	 *   7  redir  0x44 -> redir, count as stat2
+	 *   * unspec  0x11 -> pass,  count as stat0
+	 *
+	 * [1] We can't support OK and RECLASSIFY because we can't tell TC
+	 *     the exact decision made.  We are forced to support UNSPEC
+	 *     to handle aborts so that's the only one we handle for passing
+	 *     packets up the stack.
+	 */
+	/* Target for aborts */
+	nfp_prog->tgt_abort = nfp_prog_current_offset(nfp_prog);
+
+	emit_br_def(nfp_prog, nfp_prog->tgt_done, 2);
+
+	emit_alu(nfp_prog, reg_a(0),
+		 reg_none(), ALU_OP_NONE, NFP_BPF_ABI_FLAGS);
+	emit_ld_field(nfp_prog, reg_a(0), 0xc, reg_imm(0x11), SHF_SC_L_SHF, 16);
+
+	/* Target for normal exits */
+	nfp_prog->tgt_out = nfp_prog_current_offset(nfp_prog);
+
+	/* if R0 > 7 jump to abort */
+	emit_alu(nfp_prog, reg_none(), reg_imm(7), ALU_OP_SUB, reg_b(0));
+	emit_br(nfp_prog, BR_BLO, nfp_prog->tgt_abort, 0);
+	emit_alu(nfp_prog, reg_a(0),
+		 reg_none(), ALU_OP_NONE, NFP_BPF_ABI_FLAGS);
+
+	wrp_immed(nfp_prog, reg_b(2), 0x41221211);
+	wrp_immed(nfp_prog, reg_b(3), 0x41001211);
+
+	emit_shf(nfp_prog, reg_a(1),
+		 reg_none(), SHF_OP_NONE, reg_b(0), SHF_SC_L_SHF, 2);
+
+	emit_alu(nfp_prog, reg_none(), reg_a(1), ALU_OP_OR, reg_imm(0));
+	emit_shf(nfp_prog, reg_a(2),
+		 reg_imm(0xf), SHF_OP_AND, reg_b(2), SHF_SC_R_SHF, 0);
+
+	emit_alu(nfp_prog, reg_none(), reg_a(1), ALU_OP_OR, reg_imm(0));
+	emit_shf(nfp_prog, reg_b(2),
+		 reg_imm(0xf), SHF_OP_AND, reg_b(3), SHF_SC_R_SHF, 0);
+
+	emit_br_def(nfp_prog, nfp_prog->tgt_done, 2);
+
+	emit_shf(nfp_prog, reg_b(2),
+		 reg_a(2), SHF_OP_OR, reg_b(2), SHF_SC_L_SHF, 4);
+	emit_ld_field(nfp_prog, reg_a(0), 0xc, reg_b(2), SHF_SC_L_SHF, 16);
+}
+
 static void nfp_outro(struct nfp_prog *nfp_prog)
 {
 	switch (nfp_prog->act) {
+	case NN_ACT_DIRECT:
+		nfp_outro_tc_da(nfp_prog);
+		break;
 	case NN_ACT_TC_DROP:
 	case NN_ACT_TC_REDIR:
 		nfp_outro_tc_legacy(nfp_prog);
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_bpf_verifier.c b/drivers/net/ethernet/netronome/nfp/nfp_bpf_verifier.c
index 8b66d98e37eb..b628836896fa 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_bpf_verifier.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_bpf_verifier.c
@@ -85,7 +85,16 @@ nfp_bpf_check_exit(struct nfp_prog *nfp_prog, const struct verifier_env *env)
 		return -EINVAL;
 	}
 
-	if (reg0->imm != 0 && (reg0->imm & ~0U) != ~0U) {
+	if (nfp_prog->act != NN_ACT_DIRECT &&
+	    reg0->imm != 0 && (reg0->imm & ~0U) != ~0U) {
+		pr_info("unsupported exit state: %d, imm: %llx\n",
+			reg0->type, reg0->imm);
+		return -EINVAL;
+	}
+
+	if (nfp_prog->act == NN_ACT_DIRECT && reg0->imm <= TC_ACT_REDIRECT &&
+	    reg0->imm != TC_ACT_SHOT && reg0->imm != TC_ACT_STOLEN &&
+	    reg0->imm != TC_ACT_QUEUED) {
 		pr_info("unsupported exit state: %d, imm: %llx\n",
 			reg0->type, reg0->imm);
 		return -EINVAL;
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_offload.c b/drivers/net/ethernet/netronome/nfp/nfp_net_offload.c
index 6399801c4196..a826fa390d4e 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_offload.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_offload.c
@@ -112,8 +112,12 @@ nfp_net_bpf_get_act(struct nfp_net *nn, struct tc_cls_bpf_offload *cls_bpf)
 	LIST_HEAD(actions);
 
 	/* TC direct action */
-	if (cls_bpf->exts_integrated)
+	if (cls_bpf->exts_integrated) {
+		if (tc_no_actions(cls_bpf->exts))
+			return NN_ACT_DIRECT;
+
 		return -ENOTSUPP;
+	}
 
 	/* TC legacy mode */
 	if (!tc_single_action(cls_bpf->exts))
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [RFCv2 07/16] bpf: enable non-core use of the verfier
  2016-08-26 18:06 ` [RFCv2 07/16] bpf: enable non-core use of the verfier Jakub Kicinski
@ 2016-08-26 23:29   ` Alexei Starovoitov
  2016-08-27 11:40     ` Jakub Kicinski
  0 siblings, 1 reply; 40+ messages in thread
From: Alexei Starovoitov @ 2016-08-26 23:29 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, ast, daniel, dinan.gunawardena, jiri, john.fastabend, kubakici

On Fri, Aug 26, 2016 at 07:06:06PM +0100, Jakub Kicinski wrote:
> Advanced JIT compilers and translators may want to use
> eBPF verifier as a base for parsers or to perform custom
> checks and validations.
> 
> Add ability for external users to invoke the verifier
> and provide callbacks to be invoked for every instruction
> checked.  For now only the most basic callback for
> per-instruction pre-interpretation checks is added.  More
> advanced users may also like to have per-instruction post
> callback and state comparison callback.
> 
> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>

I like the approach. Making verifier into 'bytecode parser'
that JITs can reuse is a good design choice.
The only thing I would suggest is to tweak the verifier to
avoid in-place state recording. Then I think patch 8 for
clone/unclone of the program won't be needed, since verifier
will be read-only from bytecode point of view and patch 9
also will be slightly cleaner.
I think there are very few places in verifier that do this
state keeping inside insn. It was bugging me for some time.
Good time to clean that up.
Unless I misunderstand the patches 7,8,9...

There is also small concern for patches 5 and 6 that add
more register state information. Potentially that extra
state can prevent states_equal() to recognize equivalent
states. Only patch 9 uses that info, right?
Another question is do you need all state walking that
verifier does or single linear pass through insns
would have worked?
Looks like you're only using CONST_IMM and PTR_TO_CTX
state, right?

The rest looks very good. Thanks a lot!

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFCv2 07/16] bpf: enable non-core use of the verfier
  2016-08-26 23:29   ` Alexei Starovoitov
@ 2016-08-27 11:40     ` Jakub Kicinski
  2016-08-27 17:32       ` Alexei Starovoitov
  0 siblings, 1 reply; 40+ messages in thread
From: Jakub Kicinski @ 2016-08-27 11:40 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: netdev, ast, daniel, dinan.gunawardena, jiri, john.fastabend, kubakici

On Fri, 26 Aug 2016 16:29:05 -0700, Alexei Starovoitov wrote:
> On Fri, Aug 26, 2016 at 07:06:06PM +0100, Jakub Kicinski wrote:
> > Advanced JIT compilers and translators may want to use
> > eBPF verifier as a base for parsers or to perform custom
> > checks and validations.
> > 
> > Add ability for external users to invoke the verifier
> > and provide callbacks to be invoked for every instruction
> > checked.  For now only the most basic callback for
> > per-instruction pre-interpretation checks is added.  More
> > advanced users may also like to have per-instruction post
> > callback and state comparison callback.
> > 
> > Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>  
> 
> I like the approach. Making verifier into 'bytecode parser'
> that JITs can reuse is a good design choice.
> The only thing I would suggest is to tweak the verifier to
> avoid in-place state recording. Then I think patch 8 for
> clone/unclone of the program won't be needed, since verifier
> will be read-only from bytecode point of view and patch 9
> also will be slightly cleaner.
> I think there are very few places in verifier that do this
> state keeping inside insn. It was bugging me for some time.
> Good time to clean that up.
> Unless I misunderstand the patches 7,8,9...

Agreed, I think the verifier only modifies the program to
store pointer types in the imm field.  I will try to come up
with a way around this, any suggestions?  Perhaps states_equal()
logic could be modified to downgrade pointers to UNKNOWNs
when it detects the other state had an incompatible pointer type.

> There is also small concern for patches 5 and 6 that add
> more register state information. Potentially that extra
> state can prevent states_equal() to recognize equivalent
> states. Only patch 9 uses that info, right?

5 and 6 recognize more constant loads, those can only
upgrade some UNKNOWN_VALUEs to CONST_IMMs.  So yes, if the
verifier hits the CONST first and then tries with UNKNOWN
it will have to reverify the path.  

> Another question is do you need all state walking that
> verifier does or single linear pass through insns
> would have worked?
> Looks like you're only using CONST_IMM and PTR_TO_CTX
> state, right?

I think I need all the parsing.  Right now I mostly need
the verification to check that exit codes are specific
CONST_IMMs.  Clang quite happily does this:

r0 <- 0
if (...)
	r0 <- 1
exit
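
which is what it emits for even the simplest conditional return;
roughly (field and values purely illustrative):

#include <linux/bpf.h>

/* Both return paths funnel into a single BPF_EXIT instruction, so only
 * a walk of all paths (not one linear pass) can prove r0 holds a known
 * constant at every exit.
 */
int cls(struct __sk_buff *skb)
{
	if (skb->len > 60)
		return 1;	/* r0 <- 1 on this path */

	return 0;		/* r0 <- 0 on the other */
}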

> The rest looks very good. Thanks a lot!

Thanks for the review!  FWIW my use of parsing is isolated
to the nfp_bpf_verifier.c file, at the very end of patch 9.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFCv2 07/16] bpf: enable non-core use of the verfier
  2016-08-27 11:40     ` Jakub Kicinski
@ 2016-08-27 17:32       ` Alexei Starovoitov
  2016-08-29 20:13         ` Daniel Borkmann
  0 siblings, 1 reply; 40+ messages in thread
From: Alexei Starovoitov @ 2016-08-27 17:32 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, ast, daniel, dinan.gunawardena, jiri, john.fastabend, kubakici

On Sat, Aug 27, 2016 at 12:40:04PM +0100, Jakub Kicinski wrote:
> On Fri, 26 Aug 2016 16:29:05 -0700, Alexei Starovoitov wrote:
> > On Fri, Aug 26, 2016 at 07:06:06PM +0100, Jakub Kicinski wrote:
> > > Advanced JIT compilers and translators may want to use
> > > eBPF verifier as a base for parsers or to perform custom
> > > checks and validations.
> > > 
> > > Add ability for external users to invoke the verifier
> > > and provide callbacks to be invoked for every instruction
> > > checked.  For now only the most basic callback for
> > > per-instruction pre-interpretation checks is added.  More
> > > advanced users may also like to have per-instruction post
> > > callback and state comparison callback.
> > > 
> > > Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>  
> > 
> > I like the approach. Making verifier into 'bytecode parser'
> > that JITs can reuse is a good design choice.
> > The only thing I would suggest is to tweak the verifier to
> > avoid in-place state recording. Then I think patch 8 for
> > clone/unclone of the program won't be needed, since verifier
> > will be read-only from bytecode point of view and patch 9
> > also will be slightly cleaner.
> > I think there are very few places in verifier that do this
> > state keeping inside insn. It was bugging me for some time.
> > Good time to clean that up.
> > Unless I misunderstand the patches 7,8,9...
> 
> Agreed, I think the verifier only modifies the program to
> store pointer types in the imm field.  I will try to come up
> with a way around this, any suggestions?  Perhaps states_equal()

probably array_of_insn_aux_data[num_insns] should do it.
Unlike reg_state that is forked on branches, this array
is only one.
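Roughly something like this (purely a sketch, the names are just
for illustration rather than anything existing):

#include <linux/filter.h>
#include <linux/slab.h>

/* One side array indexed by instruction number: analysis results get
 * recorded here instead of being patched into insn->imm, so the
 * program image stays read-only for later JIT/offload passes.
 */
struct bpf_insn_aux_data {
	int ptr_type;	/* e.g. pointer type of a LDX/STX operand */
};

static struct bpf_insn_aux_data *alloc_insn_aux(const struct bpf_prog *prog)
{
	return kcalloc(prog->len, sizeof(struct bpf_insn_aux_data),
		       GFP_KERNEL);
}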

> logic could be modified to downgrade pointers to UNKNOWNs
> when it detects the other state had an incompatible pointer type.
> 
> > There is also small concern for patches 5 and 6 that add
> > more register state information. Potentially that extra
> > state can prevent states_equal() to recognize equivalent
> > states. Only patch 9 uses that info, right?
> 
> 5 and 6 recognize more constant loads, those can only
> upgrade some UNKNOWN_VALUEs to CONST_IMMs.  So yes, if the
> verifier hits the CONST first and then tries with UNKNOWN
> it will have to reverify the path.  
> 
> > Another question is do you need all state walking that
> > verifier does or single linear pass through insns
> > would have worked?
> > Looks like you're only using CONST_IMM and PTR_TO_CTX
> > state, right?
> 
> I think I need all the parsing.  Right now I mostly need
> the verification to check that exit codes are specific
> CONST_IMMs.  Clang quite happily does this:
> 
> r0 <- 0
> if (...)
> 	r0 <- 1
> exit

I see. Indeed then you'd need the verifier to walk all paths
to make sure the return values are constant.
If you only need a yes/no check then such info can probably be
collected unconditionally during initial program load,
like the prog->cb_access flag.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFCv2 01/16] add basic register-field manipulation macros
  2016-08-26 18:06 ` [RFCv2 01/16] add basic register-field manipulation macros Jakub Kicinski
@ 2016-08-29 14:34   ` Daniel Borkmann
  2016-08-29 15:07     ` Jakub Kicinski
  0 siblings, 1 reply; 40+ messages in thread
From: Daniel Borkmann @ 2016-08-29 14:34 UTC (permalink / raw)
  To: Jakub Kicinski, netdev
  Cc: ast, dinan.gunawardena, jiri, john.fastabend, kubakici

On 08/26/2016 08:06 PM, Jakub Kicinski wrote:
> Common approach to accessing register fields is to define
> structures or sets of macros containing mask and shift pair.
> Operations on the register are then performed as follows:
>
>   field = (reg >> shift) & mask;
>
>   reg &= ~(mask << shift);
>   reg |= (field & mask) << shift;
>
> Defining shift and mask separately is tedious.  Ivo van Doorn
> came up with an idea of computing them at compilation time
> based on a single shifted mask (later refined by Felix) which
> can be used like this:
>
>   #define REG_FIELD 0x000ff000
>
>   field = FIELD_GET(REG_FIELD, reg);
>
>   reg &= ~REG_FIELD;
>   reg |= FIELD_PREP(REG_FIELD, field);
>
> FIELD_{GET,PREP} macros take care of finding out what the
> appropriate shift is based on compilation time ffs operation.
>
> GENMASK can be used to define registers (which is usually
> less error-prone and easier to match with datasheets).
>
> This approach is the most convenient I've seen so to limit code
> multiplication let's move the macros to a global header file.
> Attempts to use static inlines instead of macros failed due
> to false positive triggering of BUILD_BUG_ON()s, especially with
> GCC < 6.0.
>
> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
[...]
> + * Bitfield access macros
> + *
> + * FIELD_{GET,PREP} macros take as first parameter shifted mask
> + * from which they extract the base mask and shift amount.
> + * Mask must be a compilation time constant.
> + *
> + * Example:
> + *
> + *  #define REG_FIELD_A  GENMASK(6, 0)
> + *  #define REG_FIELD_B  BIT(7)
> + *  #define REG_FIELD_C  GENMASK(15, 8)
> + *  #define REG_FIELD_D  GENMASK(31, 16)
> + *
> + * Get:
> + *  a = FIELD_GET(REG_FIELD_A, reg);
> + *  b = FIELD_GET(REG_FIELD_B, reg);
> + *
> + * Set:
> + *  reg = FIELD_PREP(REG_FIELD_A, 1) |
> + *	  FIELD_PREP(REG_FIELD_B, 0) |
> + *	  FIELD_PREP(REG_FIELD_C, c) |
> + *	  FIELD_PREP(REG_FIELD_D, 0x40);
> + *
> + * Modify:
> + *  reg &= ~REG_FIELD_C;
> + *  reg |= FIELD_PREP(REG_FIELD_C, c);
> + */
> +
> +#define _bf_shf(x) (__builtin_ffsll(x) - 1)
> +
> +#define _BF_FIELD_CHECK(_mask, _reg, _val, _pfx)			\

Nit: if possible, please always use "__" instead of "_" as prefix, which is
more common coding style in the kernel.

> +	({								\
> +		BUILD_BUG_ON_MSG(!__builtin_constant_p(_mask),		\
> +				 _pfx "mask is not constant");		\
> +		BUILD_BUG_ON_MSG(!(_mask), _pfx "mask is zero");	\
> +		BUILD_BUG_ON_MSG(__builtin_constant_p(_val) ?		\
> +				 ~((_mask) >> _bf_shf(_mask)) & (_val) : 0, \
> +				 _pfx "value too large for the field"); \
> +		BUILD_BUG_ON_MSG((_mask) > (typeof(_reg))~0ull,		\
> +				 _pfx "type of reg too small for mask"); \
> +		__BUILD_BUG_ON_NOT_POWER_OF_2((_mask) +			\
> +					      (1ULL << _bf_shf(_mask))); \
> +	})
> +
> +/**
> + * FIELD_PREP() - prepare a bitfield element
> + * @_mask: shifted mask defining the field's length and position
> + * @_val:  value to put in the field
> + *
> + * FIELD_PREP() masks and shifts up the value.  The result should
> + * be combined with other fields of the bitfield using logical OR.
> + */
> +#define FIELD_PREP(_mask, _val)						\
> +	({								\
> +		_BF_FIELD_CHECK(_mask, 0ULL, _val, "FIELD_PREP: ");	\
> +		((typeof(_mask))(_val) << _bf_shf(_mask)) & (_mask);	\
> +	})
> +
> +/**
> + * FIELD_GET() - extract a bitfield element
> + * @_mask: shifted mask defining the field's length and position
> + * @_reg:  32bit value of entire bitfield
> + *
> + * FIELD_GET() extracts the field specified by @_mask from the
> + * bitfield passed in as @_reg by masking and shifting it down.
> + */
> +#define FIELD_GET(_mask, _reg)						\
> +	({								\
> +		_BF_FIELD_CHECK(_mask, _reg, 0U, "FIELD_GET: ");	\
> +		(typeof(_mask))(((_reg) & (_mask)) >> _bf_shf(_mask));	\
> +	})

No strong opinion, but FIELD_PREP() sounds a bit weird. Maybe rather a
FIELD_GEN() (aka "generate") and FIELD_GET() pair?

> +#endif
> diff --git a/include/linux/bug.h b/include/linux/bug.h
> index e51b0709e78d..292d6a10b0c2 100644
> --- a/include/linux/bug.h
> +++ b/include/linux/bug.h
> @@ -13,6 +13,7 @@ enum bug_trap_type {
>   struct pt_regs;
>
>   #ifdef __CHECKER__
> +#define __BUILD_BUG_ON_NOT_POWER_OF_2(n) (0)
>   #define BUILD_BUG_ON_NOT_POWER_OF_2(n) (0)
>   #define BUILD_BUG_ON_ZERO(e) (0)
>   #define BUILD_BUG_ON_NULL(e) ((void*)0)
> @@ -24,6 +25,8 @@ struct pt_regs;
>   #else /* __CHECKER__ */
>
>   /* Force a compilation error if a constant expression is not a power of 2 */
> +#define __BUILD_BUG_ON_NOT_POWER_OF_2(n)	\
> +	BUILD_BUG_ON(((n) & ((n) - 1)) != 0)

Is there a reason BUILD_BUG_ON_NOT_POWER_OF_2(n) cannot be reused?

Because the (n) == 0 check would trigger (although it shouldn't ...)?

>   #define BUILD_BUG_ON_NOT_POWER_OF_2(n)			\
>   	BUILD_BUG_ON((n) == 0 || (((n) & ((n) - 1)) != 0))
>
>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFCv2 02/16] net: cls_bpf: add hardware offload
  2016-08-26 18:06 ` [RFCv2 02/16] net: cls_bpf: add hardware offload Jakub Kicinski
@ 2016-08-29 14:51   ` Daniel Borkmann
  0 siblings, 0 replies; 40+ messages in thread
From: Daniel Borkmann @ 2016-08-29 14:51 UTC (permalink / raw)
  To: Jakub Kicinski, netdev
  Cc: ast, dinan.gunawardena, jiri, john.fastabend, kubakici

On 08/26/2016 08:06 PM, Jakub Kicinski wrote:
> This patch adds hardware offload capability to cls_bpf classifier,
> similar to what have been done with U32 and flower.
>
> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>

Acked-by: Daniel Borkmann <daniel@iogearbox.net>

[...]
> diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
> index a459be5fe1c2..a86262f0d93a 100644
> --- a/include/net/pkt_cls.h
> +++ b/include/net/pkt_cls.h
> @@ -486,4 +486,18 @@ struct tc_cls_matchall_offload {
>   	unsigned long cookie;
>   };
>
> +enum tc_clsbpf_command {
> +	TC_CLSBPF_ADD,
> +	TC_CLSBPF_REPLACE,
> +	TC_CLSBPF_DESTROY,
> +};
> +
> +struct tc_cls_bpf_offload {
> +	enum tc_clsbpf_command command;
> +	struct tcf_exts *exts;
> +	struct bpf_prog *filter;

Small nit: s/filter/prog/, I think prog is a more appropriate name since
it's about more than just filtering. (I will rename it at some point for
cls_bpf as well.)

Rest looks good to me.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFCv2 03/16] net: cls_bpf: limit hardware offload by software-only flag
  2016-08-26 18:06 ` [RFCv2 03/16] net: cls_bpf: limit hardware offload by software-only flag Jakub Kicinski
@ 2016-08-29 15:06   ` Daniel Borkmann
  2016-08-29 15:15     ` Jakub Kicinski
  0 siblings, 1 reply; 40+ messages in thread
From: Daniel Borkmann @ 2016-08-29 15:06 UTC (permalink / raw)
  To: Jakub Kicinski, netdev
  Cc: ast, dinan.gunawardena, jiri, john.fastabend, kubakici

On 08/26/2016 08:06 PM, Jakub Kicinski wrote:
> Add cls_bpf support for the TCA_CLS_FLAGS_SKIP_HW flag.
> Unlike U32 and flower cls_bpf already has some netlink
> flags defined.  I chose to create a new attribute to be
> able to use the same flag values as the above.
>
> Unknown flags are ignored and not reported upon dump.
>
> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
[...]
> @@ -55,6 +58,7 @@ struct cls_bpf_prog {
>   static const struct nla_policy bpf_policy[TCA_BPF_MAX + 1] = {
>   	[TCA_BPF_CLASSID]	= { .type = NLA_U32 },
>   	[TCA_BPF_FLAGS]		= { .type = NLA_U32 },
> +	[TCA_BPF_FLAGS_GEN]	= { .type = NLA_U32 },
>   	[TCA_BPF_FD]		= { .type = NLA_U32 },
>   	[TCA_BPF_NAME]		= { .type = NLA_NUL_STRING, .len = CLS_BPF_NAME_LEN },
>   	[TCA_BPF_OPS_LEN]	= { .type = NLA_U16 },
> @@ -156,6 +160,7 @@ static int cls_bpf_offload_cmd(struct tcf_proto *tp, struct cls_bpf_prog *prog,
>   	bpf_offload.filter = prog->filter;
>   	bpf_offload.name = prog->bpf_name;
>   	bpf_offload.exts_integrated = prog->exts_integrated;
> +	bpf_offload.gen_flags = prog->gen_flags;
>
>   	return dev->netdev_ops->ndo_setup_tc(dev, tp->q->handle,
>   					     tp->protocol, &offload);
> @@ -169,14 +174,14 @@ static void cls_bpf_offload(struct tcf_proto *tp, struct cls_bpf_prog *prog,
>   	enum tc_clsbpf_command cmd;
>
>   	if (oldprog && oldprog->offloaded) {
> -		if (tc_should_offload(dev, tp, 0)) {
> +		if (tc_should_offload(dev, tp, prog->gen_flags)) {
>   			cmd = TC_CLSBPF_REPLACE;
>   		} else {
>   			obj = oldprog;
>   			cmd = TC_CLSBPF_DESTROY;
>   		}
>   	} else {
> -		if (!tc_should_offload(dev, tp, 0))
> +		if (!tc_should_offload(dev, tp, prog->gen_flags))
>   			return;
>   		cmd = TC_CLSBPF_ADD;
>   	}
> @@ -372,6 +377,7 @@ static int cls_bpf_modify_existing(struct net *net, struct tcf_proto *tp,
>   {
>   	bool is_bpf, is_ebpf, have_exts = false;
>   	struct tcf_exts exts;
> +	u32 gen_flags = 0;
>   	int ret;
>
>   	is_bpf = tb[TCA_BPF_OPS_LEN] && tb[TCA_BPF_OPS];
> @@ -396,8 +402,16 @@ static int cls_bpf_modify_existing(struct net *net, struct tcf_proto *tp,
>
>   		have_exts = bpf_flags & TCA_BPF_FLAG_ACT_DIRECT;
>   	}
> +	if (tb[TCA_BPF_FLAGS_GEN]) {
> +		gen_flags = nla_get_u32(tb[TCA_BPF_FLAGS_GEN]);
> +		/* Make sure dump doesn't report back flags we don't handle */
> +		gen_flags &= CLS_BPF_SUPPORTED_GEN_FLAGS;

Instead of above rather ...

	if (gen_flags & ~CLS_BPF_SUPPORTED_GEN_FLAGS) {
		ret = -EINVAL;
		goto errout;
	}

... so that we can handle further additions properly like we do with
tb[TCA_BPF_FLAGS]?

> +		if (!tc_flags_valid(gen_flags))
> +			return -EINVAL;

Shouldn't we: goto errout?

> +	}
>
>   	prog->exts_integrated = have_exts;
> +	prog->gen_flags = gen_flags;
>
>   	ret = is_bpf ? cls_bpf_prog_from_ops(tb, prog) :
>   		       cls_bpf_prog_from_efd(tb, prog, tp);
> @@ -569,6 +583,9 @@ static int cls_bpf_dump(struct net *net, struct tcf_proto *tp, unsigned long fh,
>   		bpf_flags |= TCA_BPF_FLAG_ACT_DIRECT;
>   	if (bpf_flags && nla_put_u32(skb, TCA_BPF_FLAGS, bpf_flags))
>   		goto nla_put_failure;
> +	if (prog->gen_flags &&
> +	    nla_put_u32(skb, TCA_BPF_FLAGS_GEN, prog->gen_flags))
> +		goto nla_put_failure;
>
>   	nla_nest_end(skb, nest);

Rest looks good:

Acked-by: Daniel Borkmann <daniel@iogearbox.net>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFCv2 01/16] add basic register-field manipulation macros
  2016-08-29 14:34   ` Daniel Borkmann
@ 2016-08-29 15:07     ` Jakub Kicinski
  2016-08-29 15:40       ` Daniel Borkmann
  0 siblings, 1 reply; 40+ messages in thread
From: Jakub Kicinski @ 2016-08-29 15:07 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Jakub Kicinski, netdev, ast, dinan.gunawardena, jiri, john.fastabend

On Mon, 29 Aug 2016 16:34:25 +0200, Daniel Borkmann wrote:
> On 08/26/2016 08:06 PM, Jakub Kicinski wrote:
> > Common approach to accessing register fields is to define
> > structures or sets of macros containing mask and shift pair.
> > Operations on the register are then performed as follows:
> >
> >   field = (reg >> shift) & mask;
> >
> >   reg &= ~(mask << shift);
> >   reg |= (field & mask) << shift;
> >
> > Defining shift and mask separately is tedious.  Ivo van Doorn
> > came up with an idea of computing them at compilation time
> > based on a single shifted mask (later refined by Felix) which
> > can be used like this:
> >
> >   #define REG_FIELD 0x000ff000
> >
> >   field = FIELD_GET(REG_FIELD, reg);
> >
> >   reg &= ~REG_FIELD;
> >   reg |= FIELD_PREP(REG_FIELD, field);
> >
> > FIELD_{GET,PREP} macros take care of finding out what the
> > appropriate shift is based on compilation time ffs operation.
> >
> > GENMASK can be used to define registers (which is usually
> > less error-prone and easier to match with datasheets).
> >
> > This approach is the most convenient I've seen so to limit code
> > multiplication let's move the macros to a global header file.
> > Attempts to use static inlines instead of macros failed due
> > to false positive triggering of BUILD_BUG_ON()s, especially with
> > GCC < 6.0.
> >
> > Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>  
> [...]
> > + * Bitfield access macros
> > + *
> > + * FIELD_{GET,PREP} macros take as first parameter shifted mask
> > + * from which they extract the base mask and shift amount.
> > + * Mask must be a compilation time constant.
> > + *
> > + * Example:
> > + *
> > + *  #define REG_FIELD_A  GENMASK(6, 0)
> > + *  #define REG_FIELD_B  BIT(7)
> > + *  #define REG_FIELD_C  GENMASK(15, 8)
> > + *  #define REG_FIELD_D  GENMASK(31, 16)
> > + *
> > + * Get:
> > + *  a = FIELD_GET(REG_FIELD_A, reg);
> > + *  b = FIELD_GET(REG_FIELD_B, reg);
> > + *
> > + * Set:
> > + *  reg = FIELD_PREP(REG_FIELD_A, 1) |
> > + *	  FIELD_PREP(REG_FIELD_B, 0) |
> > + *	  FIELD_PREP(REG_FIELD_C, c) |
> > + *	  FIELD_PREP(REG_FIELD_D, 0x40);
> > + *
> > + * Modify:
> > + *  reg &= ~REG_FIELD_C;
> > + *  reg |= FIELD_PREP(REG_FIELD_C, c);
> > + */
> > +
> > +#define _bf_shf(x) (__builtin_ffsll(x) - 1)
> > +
> > +#define _BF_FIELD_CHECK(_mask, _reg, _val, _pfx)			\  
> 
> Nit: if possible, please always use "__" instead of "_" as prefix, which is
> more common coding style in the kernel.

I went with single underscore, because my understanding was:
 - no underscore - safe, "user-facing" API;
 - two underscores - internal, make sure you know how to use it;
 - single underscore - library internals, shouldn't be touched.

I don't expect anyone to invoke those macros, the underscore is
there to avoid collisions. 

> > +	({								\
> > +		BUILD_BUG_ON_MSG(!__builtin_constant_p(_mask),		\
> > +				 _pfx "mask is not constant");		\
> > +		BUILD_BUG_ON_MSG(!(_mask), _pfx "mask is zero");	\
> > +		BUILD_BUG_ON_MSG(__builtin_constant_p(_val) ?		\
> > +				 ~((_mask) >> _bf_shf(_mask)) & (_val) : 0, \
> > +				 _pfx "value too large for the field"); \
> > +		BUILD_BUG_ON_MSG((_mask) > (typeof(_reg))~0ull,		\
> > +				 _pfx "type of reg too small for mask"); \
> > +		__BUILD_BUG_ON_NOT_POWER_OF_2((_mask) +			\
> > +					      (1ULL << _bf_shf(_mask))); \
> > +	})
> > +
> > +/**
> > + * FIELD_PREP() - prepare a bitfield element
> > + * @_mask: shifted mask defining the field's length and position
> > + * @_val:  value to put in the field
> > + *
> > + * FIELD_PREP() masks and shifts up the value.  The result should
> > + * be combined with other fields of the bitfield using logical OR.
> > + */
> > +#define FIELD_PREP(_mask, _val)						\
> > +	({								\
> > +		_BF_FIELD_CHECK(_mask, 0ULL, _val, "FIELD_PREP: ");	\
> > +		((typeof(_mask))(_val) << _bf_shf(_mask)) & (_mask);	\
> > +	})
> > +
> > +/**
> > + * FIELD_GET() - extract a bitfield element
> > + * @_mask: shifted mask defining the field's length and position
> > + * @_reg:  32bit value of entire bitfield
> > + *
> > + * FIELD_GET() extracts the field specified by @_mask from the
> > + * bitfield passed in as @_reg by masking and shifting it down.
> > + */
> > +#define FIELD_GET(_mask, _reg)						\
> > +	({								\
> > +		_BF_FIELD_CHECK(_mask, _reg, 0U, "FIELD_GET: ");	\
> > +		(typeof(_mask))(((_reg) & (_mask)) >> _bf_shf(_mask));	\
> > +	})  
> 
> No strong opinion, but FIELD_PREP() sounds a bit weird. Maybe rather a
> FIELD_GEN() (aka "generate") and FIELD_GET() pair?

FWIW PREP was suggested by Linus:

https://lkml.org/lkml/2016/8/17/384

> > +#endif
> > diff --git a/include/linux/bug.h b/include/linux/bug.h
> > index e51b0709e78d..292d6a10b0c2 100644
> > --- a/include/linux/bug.h
> > +++ b/include/linux/bug.h
> > @@ -13,6 +13,7 @@ enum bug_trap_type {
> >   struct pt_regs;
> >
> >   #ifdef __CHECKER__
> > +#define __BUILD_BUG_ON_NOT_POWER_OF_2(n) (0)
> >   #define BUILD_BUG_ON_NOT_POWER_OF_2(n) (0)
> >   #define BUILD_BUG_ON_ZERO(e) (0)
> >   #define BUILD_BUG_ON_NULL(e) ((void*)0)
> > @@ -24,6 +25,8 @@ struct pt_regs;
> >   #else /* __CHECKER__ */
> >
> >   /* Force a compilation error if a constant expression is not a power of 2 */
> > +#define __BUILD_BUG_ON_NOT_POWER_OF_2(n)	\
> > +	BUILD_BUG_ON(((n) & ((n) - 1)) != 0)  
> 
> Is there a reason BUILD_BUG_ON_NOT_POWER_OF_2(n) cannot be reused?
> 
> Because the (n) == 0 check would trigger (although it shouldn't ...)?

It would, I'm doing:
  mask + lowest bit of mask
which will result in:
  highest bit of mask << 1
which in turn will overflow for masks with highest bit set.
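
A worked example of that overflow, with an illustrative field (made-up
name, real macros):

	/* For a contiguous mask such as GENMASK_ULL(63, 32) the check does:
	 *
	 *   mask + (1ULL << _bf_shf(mask))
	 *     == 0xffffffff00000000ULL + 0x0000000100000000ULL
	 *     == 1ULL << 64, which wraps to 0 in 64-bit arithmetic.
	 *
	 * The stock BUILD_BUG_ON_NOT_POWER_OF_2() also rejects 0, so it
	 * would fire on this perfectly valid mask; the __ variant drops the
	 * zero check and only verifies the power-of-two property.
	 */
	#define REG_FIELD_HI	GENMASK_ULL(63, 32)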

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFCv2 03/16] net: cls_bpf: limit hardware offload by software-only flag
  2016-08-29 15:06   ` Daniel Borkmann
@ 2016-08-29 15:15     ` Jakub Kicinski
  0 siblings, 0 replies; 40+ messages in thread
From: Jakub Kicinski @ 2016-08-29 15:15 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Jakub Kicinski, netdev, ast, dinan.gunawardena, jiri, john.fastabend

On Mon, 29 Aug 2016 17:06:34 +0200, Daniel Borkmann wrote:
> On 08/26/2016 08:06 PM, Jakub Kicinski wrote:
> > [...]
> > @@ -372,6 +377,7 @@ static int cls_bpf_modify_existing(struct net *net, struct tcf_proto *tp,
> >   {
> >   	bool is_bpf, is_ebpf, have_exts = false;
> >   	struct tcf_exts exts;
> > +	u32 gen_flags = 0;
> >   	int ret;
> >
> >   	is_bpf = tb[TCA_BPF_OPS_LEN] && tb[TCA_BPF_OPS];
> > @@ -396,8 +402,16 @@ static int cls_bpf_modify_existing(struct net *net, struct tcf_proto *tp,
> >
> >   		have_exts = bpf_flags & TCA_BPF_FLAG_ACT_DIRECT;
> >   	}
> > +	if (tb[TCA_BPF_FLAGS_GEN]) {
> > +		gen_flags = nla_get_u32(tb[TCA_BPF_FLAGS_GEN]);
> > +		/* Make sure dump doesn't report back flags we don't handle */
> > +		gen_flags &= CLS_BPF_SUPPORTED_GEN_FLAGS;  
> 
> Instead of above rather ...
> 
> 	if (gen_flags & ~CLS_BPF_SUPPORTED_GEN_FLAGS) {
> 		ret = -EINVAL;
> 		goto errout;
> 	}
> 
> ... so that we can handle further additions properly like we do with
> tb[TCA_BPF_FLAGS]?

Sure!

> > +		if (!tc_flags_valid(gen_flags))
> > +			return -EINVAL;  
> 
> Shouldn't we: goto errout?

Ugh, right!  I'm missing:

	tcf_exts_destroy(&exts);

Thanks!
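
Putting both comments together, a minimal sketch of how that block could
end up looking (assuming an errout label, or equivalent, which calls
tcf_exts_destroy(&exts)):

	if (tb[TCA_BPF_FLAGS_GEN]) {
		gen_flags = nla_get_u32(tb[TCA_BPF_FLAGS_GEN]);
		if ((gen_flags & ~CLS_BPF_SUPPORTED_GEN_FLAGS) ||
		    !tc_flags_valid(gen_flags)) {
			ret = -EINVAL;
			goto errout;	/* releases exts */
		}
	}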

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFCv2 04/16] net: cls_bpf: add support for marking filters as hardware-only
  2016-08-26 18:06 ` [RFCv2 04/16] net: cls_bpf: add support for marking filters as hardware-only Jakub Kicinski
@ 2016-08-29 15:28   ` Daniel Borkmann
  0 siblings, 0 replies; 40+ messages in thread
From: Daniel Borkmann @ 2016-08-29 15:28 UTC (permalink / raw)
  To: Jakub Kicinski, netdev
  Cc: ast, dinan.gunawardena, jiri, john.fastabend, kubakici

On 08/26/2016 08:06 PM, Jakub Kicinski wrote:
> Add cls_bpf support for the TCA_CLS_FLAGS_SKIP_SW flag.
>
> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>

Acked-by: Daniel Borkmann <daniel@iogearbox.net>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFCv2 01/16] add basic register-field manipulation macros
  2016-08-29 15:07     ` Jakub Kicinski
@ 2016-08-29 15:40       ` Daniel Borkmann
  0 siblings, 0 replies; 40+ messages in thread
From: Daniel Borkmann @ 2016-08-29 15:40 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Jakub Kicinski, netdev, ast, dinan.gunawardena, jiri, john.fastabend

On 08/29/2016 05:07 PM, Jakub Kicinski wrote:
> On Mon, 29 Aug 2016 16:34:25 +0200, Daniel Borkmann wrote:
>> On 08/26/2016 08:06 PM, Jakub Kicinski wrote:
>>> Common approach to accessing register fields is to define
>>> structures or sets of macros containing mask and shift pair.
>>> Operations on the register are then performed as follows:
>>>
>>>    field = (reg >> shift) & mask;
>>>
>>>    reg &= ~(mask << shift);
>>>    reg |= (field & mask) << shift;
>>>
>>> Defining shift and mask separately is tedious.  Ivo van Doorn
>>> came up with an idea of computing them at compilation time
>>> based on a single shifted mask (later refined by Felix) which
>>> can be used like this:
>>>
>>>    #define REG_FIELD 0x000ff000
>>>
>>>    field = FIELD_GET(REG_FIELD, reg);
>>>
>>>    reg &= ~REG_FIELD;
>>>    reg |= FIELD_PREP(REG_FIELD, field);
>>>
>>> FIELD_{GET,PREP} macros take care of finding out what the
>>> appropriate shift is based on compilation time ffs operation.
>>>
>>> GENMASK can be used to define registers (which is usually
>>> less error-prone and easier to match with datasheets).
>>>
>>> This approach is the most convenient I've seen so to limit code
>>> multiplication let's move the macros to a global header file.
>>> Attempts to use static inlines instead of macros failed due
>>> to false positive triggering of BUILD_BUG_ON()s, especially with
>>> GCC < 6.0.
>>>
>>> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
>> [...]
>>> + * Bitfield access macros
>>> + *
>>> + * FIELD_{GET,PREP} macros take as first parameter shifted mask
>>> + * from which they extract the base mask and shift amount.
>>> + * Mask must be a compilation time constant.
>>> + *
>>> + * Example:
>>> + *
>>> + *  #define REG_FIELD_A  GENMASK(6, 0)
>>> + *  #define REG_FIELD_B  BIT(7)
>>> + *  #define REG_FIELD_C  GENMASK(15, 8)
>>> + *  #define REG_FIELD_D  GENMASK(31, 16)
>>> + *
>>> + * Get:
>>> + *  a = FIELD_GET(REG_FIELD_A, reg);
>>> + *  b = FIELD_GET(REG_FIELD_B, reg);
>>> + *
>>> + * Set:
>>> + *  reg = FIELD_PREP(REG_FIELD_A, 1) |
>>> + *	  FIELD_PREP(REG_FIELD_B, 0) |
>>> + *	  FIELD_PREP(REG_FIELD_C, c) |
>>> + *	  FIELD_PREP(REG_FIELD_D, 0x40);
>>> + *
>>> + * Modify:
>>> + *  reg &= ~REG_FIELD_C;
>>> + *  reg |= FIELD_PREP(REG_FIELD_C, c);
>>> + */
>>> +
>>> +#define _bf_shf(x) (__builtin_ffsll(x) - 1)
>>> +
>>> +#define _BF_FIELD_CHECK(_mask, _reg, _val, _pfx)			\
>>
>> Nit: if possible, please always use "__" instead of "_" as prefix, which is
>> more common coding style in the kernel.
>
> I went with single underscore, because my understanding was:
>   - no underscore - safe, "user-facing" API;
>   - two underscores - internal, make sure you know how to use it;
>   - single underscore - library internals, shouldn't be touched.

That convention would be new to me, at least I haven't seen it much (see
also recent comment on the act_tunnel set). Still think two underscores
is generally preferred (unless this is somewhere documented otherwise).

> I don't expect anyone to invoke those macros, the underscore is
> there to avoid collisions.
>
>>> +	({								\
>>> +		BUILD_BUG_ON_MSG(!__builtin_constant_p(_mask),		\
>>> +				 _pfx "mask is not constant");		\
>>> +		BUILD_BUG_ON_MSG(!(_mask), _pfx "mask is zero");	\
>>> +		BUILD_BUG_ON_MSG(__builtin_constant_p(_val) ?		\
>>> +				 ~((_mask) >> _bf_shf(_mask)) & (_val) : 0, \
>>> +				 _pfx "value too large for the field"); \
>>> +		BUILD_BUG_ON_MSG((_mask) > (typeof(_reg))~0ull,		\
>>> +				 _pfx "type of reg too small for mask"); \
>>> +		__BUILD_BUG_ON_NOT_POWER_OF_2((_mask) +			\
>>> +					      (1ULL << _bf_shf(_mask))); \
>>> +	})
>>> +
>>> +/**
>>> + * FIELD_PREP() - prepare a bitfield element
>>> + * @_mask: shifted mask defining the field's length and position
>>> + * @_val:  value to put in the field
>>> + *
>>> + * FIELD_PREP() masks and shifts up the value.  The result should
>>> + * be combined with other fields of the bitfield using logical OR.
>>> + */
>>> +#define FIELD_PREP(_mask, _val)						\
>>> +	({								\
>>> +		_BF_FIELD_CHECK(_mask, 0ULL, _val, "FIELD_PREP: ");	\
>>> +		((typeof(_mask))(_val) << _bf_shf(_mask)) & (_mask);	\
>>> +	})
>>> +
>>> +/**
>>> + * FIELD_GET() - extract a bitfield element
>>> + * @_mask: shifted mask defining the field's length and position
>>> + * @_reg:  32bit value of entire bitfield
>>> + *
>>> + * FIELD_GET() extracts the field specified by @_mask from the
>>> + * bitfield passed in as @_reg by masking and shifting it down.
>>> + */
>>> +#define FIELD_GET(_mask, _reg)						\
>>> +	({								\
>>> +		_BF_FIELD_CHECK(_mask, _reg, 0U, "FIELD_GET: ");	\
>>> +		(typeof(_mask))(((_reg) & (_mask)) >> _bf_shf(_mask));	\
>>> +	})
>>
>> No strong opinion, but FIELD_PREP() sounds a bit weird. Maybe rather a
>> FIELD_GEN() (aka "generate") and FIELD_GET() pair?
>
> FWIW PREP was suggested by Linus:
>
> https://lkml.org/lkml/2016/8/17/384

Hmm, ok, fair enough.

>>> +#endif
>>> diff --git a/include/linux/bug.h b/include/linux/bug.h
>>> index e51b0709e78d..292d6a10b0c2 100644
>>> --- a/include/linux/bug.h
>>> +++ b/include/linux/bug.h
>>> @@ -13,6 +13,7 @@ enum bug_trap_type {
>>>    struct pt_regs;
>>>
>>>    #ifdef __CHECKER__
>>> +#define __BUILD_BUG_ON_NOT_POWER_OF_2(n) (0)
>>>    #define BUILD_BUG_ON_NOT_POWER_OF_2(n) (0)
>>>    #define BUILD_BUG_ON_ZERO(e) (0)
>>>    #define BUILD_BUG_ON_NULL(e) ((void*)0)
>>> @@ -24,6 +25,8 @@ struct pt_regs;
>>>    #else /* __CHECKER__ */
>>>
>>>    /* Force a compilation error if a constant expression is not a power of 2 */
>>> +#define __BUILD_BUG_ON_NOT_POWER_OF_2(n)	\
>>> +	BUILD_BUG_ON(((n) & ((n) - 1)) != 0)
>>
>> Is there a reason BUILD_BUG_ON_NOT_POWER_OF_2(n) cannot be reused?
>>
>> Because the (n) == 0 check would trigger (although it shouldn't ...)?
>
> It would, I'm doing:
>    mask + lowest bit of mask
> which will result in:
>    highest bit of mask << 1
> which in turn will overflow for masks with highest bit set.

Ahh, right.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFCv2 07/16] bpf: enable non-core use of the verfier
  2016-08-27 17:32       ` Alexei Starovoitov
@ 2016-08-29 20:13         ` Daniel Borkmann
  2016-08-29 20:17           ` Daniel Borkmann
  0 siblings, 1 reply; 40+ messages in thread
From: Daniel Borkmann @ 2016-08-29 20:13 UTC (permalink / raw)
  To: Alexei Starovoitov, Jakub Kicinski
  Cc: netdev, ast, dinan.gunawardena, jiri, john.fastabend, kubakici

On 08/27/2016 07:32 PM, Alexei Starovoitov wrote:
> On Sat, Aug 27, 2016 at 12:40:04PM +0100, Jakub Kicinski wrote:
>> On Fri, 26 Aug 2016 16:29:05 -0700, Alexei Starovoitov wrote:
>>> On Fri, Aug 26, 2016 at 07:06:06PM +0100, Jakub Kicinski wrote:
>>>> Advanced JIT compilers and translators may want to use
>>>> eBPF verifier as a base for parsers or to perform custom
>>>> checks and validations.
>>>>
>>>> Add ability for external users to invoke the verifier
>>>> and provide callbacks to be invoked for every instruction
>>>> checked.  For now only the most basic callback for
>>>> per-instruction pre-interpretation checks is added.  More
>>>> advanced users may also like to have per-instruction post
>>>> callback and state comparison callback.
>>>>
>>>> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
>>>
>>> I like the approach. Making verifier into 'bytecode parser'
>>> that JITs can reuse is a good design choice.

+1

>>> The only thing I would suggest is to tweak the verifier to
>>> avoid in-place state recording. Then I think patch 8 for
>>> clone/unclone of the program won't be needed, since verifier
>>> will be read-only from bytecode point of view and patch 9
>>> also will be slightly cleaner.
>>> I think there are very few places in verifier that do this
>>> state keeping inside insn. It was bugging me for some time.
>>> Good time to clean that up.
>>> Unless I misunderstand the patches 7,8,9...
>>
>> Agreed, I think the verifier only modifies the program to
>> store pointer types in imm field.  I will try to come up
>> with a way around this, any suggestions?  Perhaps state_equal()
>
> probably array_of_insn_aux_data[num_insns] should do it.
> Unlike reg_state that is forked on branches, this array
> is only one.

This would be for struct nfp_insn_meta, right? So, struct
bpf_ext_parser_ops could become:

static const struct bpf_ext_parser_ops nfp_bpf_pops = {
	.insn_hook = nfp_verify_insn,
	.insn_size = sizeof(struct nfp_insn_meta),
};

... where bpf_parse() would prealloc that f.e. in env->insn_meta[].

>> logic could be modified to downgrade pointers to UNKNOWNs
>> when it detects other state had incompatible pointer type.
>>
>>> There is also small concern for patches 5 and 6 that add
>>> more register state information. Potentially that extra
>>> state can prevent states_equal() to recognize equivalent
>>> states. Only patch 9 uses that info, right?
>>
>> 5 and 6 recognize more constant loads, those can only
>> upgrade some UNKNOWN_VALUEs to CONST_IMMs.  So yes, if the
>> verifier hits the CONST first and then tries with UNKNOWN
>> it will have to reverify the path.

Agree, was also my concern when I read patch 5 and 6. It would
not only be related to types, but also different imm values,
where the memcmp() could fail on. Potentially the latter can be
avoided by only checking types which should be sufficient. Hmm,
maybe only bpf_parse() should go through this stricter mode since
only relevant for drivers (otoh downside would be that bugs
would end up less likely to be found).

>>> Another question is do you need all state walking that
>>> verifier does or single linear pass through insns
>>> would have worked?
>>> Looks like you're only using CONST_IMM and PTR_TO_CTX
>>> state, right?
>>
>> I think I need all the parsing.  Right now I mostly need
>> the verification to check that exit codes are specific
>> CONST_IMMs.  Clang quite happily does this:
>>
>> r0 <- 0
>> if (...)
>> 	r0 <- 1
>> exit
>
> I see. Indeed then you'd need the verifier to walk all paths
> to make sure constant return values.

I think this would still not cover the cases where you'd fetch
a return value/verdict from a map, but this should be ignored/
rejected for now, also since majority of programs are not written
in such a way.
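
For illustration, a made-up minimal classifier like the one below is
enough to produce the quoted pattern: the return register is written on
two different paths, so proving that every exit code is a known constant
needs the verifier's path walk rather than a single linear scan.

	#include <linux/bpf.h>
	#include <linux/pkt_cls.h>

	int classify(struct __sk_buff *skb)
	{
		if (skb->mark)
			return TC_ACT_SHOT;	/* one path: r0 <- 2 */

		return TC_ACT_UNSPEC;		/* other path: r0 <- -1 */
	}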

> If you only need yes/no check then such info can probably be
> collected unconditionally during initial program load.
> Like prog->cb_access flag.

One other comment wrt the header, when you move these things
there, would be good to prefix with bpf_* so that this doesn't
clash in future with other header files.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFCv2 07/16] bpf: enable non-core use of the verfier
  2016-08-29 20:13         ` Daniel Borkmann
@ 2016-08-29 20:17           ` Daniel Borkmann
  2016-08-30 10:48             ` Jakub Kicinski
  0 siblings, 1 reply; 40+ messages in thread
From: Daniel Borkmann @ 2016-08-29 20:17 UTC (permalink / raw)
  To: Alexei Starovoitov, Jakub Kicinski
  Cc: netdev, ast, dinan.gunawardena, jiri, john.fastabend, kubakici

On 08/29/2016 10:13 PM, Daniel Borkmann wrote:
> On 08/27/2016 07:32 PM, Alexei Starovoitov wrote:
>> On Sat, Aug 27, 2016 at 12:40:04PM +0100, Jakub Kicinski wrote:
>>> On Fri, 26 Aug 2016 16:29:05 -0700, Alexei Starovoitov wrote:
>>>> On Fri, Aug 26, 2016 at 07:06:06PM +0100, Jakub Kicinski wrote:
>>>>> Advanced JIT compilers and translators may want to use
>>>>> eBPF verifier as a base for parsers or to perform custom
>>>>> checks and validations.
>>>>>
>>>>> Add ability for external users to invoke the verifier
>>>>> and provide callbacks to be invoked for every instruction
>>>>> checked.  For now only the most basic callback for
>>>>> per-instruction pre-interpretation checks is added.  More
>>>>> advanced users may also like to have per-instruction post
>>>>> callback and state comparison callback.
>>>>>
>>>>> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
>>>>
>>>> I like the approach. Making verifier into 'bytecode parser'
>>>> that JITs can reuse is a good design choice.
>
> +1
>
>>>> The only thing I would suggest is to tweak the verifier to
>>>> avoid in-place state recording. Then I think patch 8 for
>>>> clone/unclone of the program won't be needed, since verifier
>>>> will be read-only from bytecode point of view and patch 9
>>>> also will be slightly cleaner.
>>>> I think there are very few places in verifier that do this
>>>> state keeping inside insn. It was bugging me for some time.
>>>> Good time to clean that up.
>>>> Unless I misunderstand the patches 7,8,9...
>>>
>>> Agreed, I think the verifier only modifies the program to
>>> store pointer types in imm field.  I will try to come up
>>> with a way around this, any suggestions?  Perhaps state_equal()
>>
>> probably array_of_insn_aux_data[num_insns] should do it.
>> Unlike reg_state that is forked on branches, this array
>> is only one.
>
> This would be for struct nfp_insn_meta, right? So, struct
> bpf_ext_parser_ops could become:
>
> static const struct bpf_ext_parser_ops nfp_bpf_pops = {
>      .insn_hook = nfp_verify_insn,
>      .insn_size = sizeof(struct nfp_insn_meta),
> };
>
> ... where bpf_parse() would prealloc that f.e. in env->insn_meta[].

(Well, actually everything can live in env->private_data.)

>>> logic could be modified to downgrade pointers to UNKNOWNs
>>> when it detects other state had incompatible pointer type.
>>>
>>>> There is also small concern for patches 5 and 6 that add
>>>> more register state information. Potentially that extra
>>>> state can prevent states_equal() to recognize equivalent
>>>> states. Only patch 9 uses that info, right?
>>>
>>> 5 and 6 recognize more constant loads, those can only
>>> upgrade some UNKNOWN_VALUEs to CONST_IMMs.  So yes, if the
>>> verifier hits the CONST first and then tries with UNKNOWN
>>> it will have to reverify the path.
>
> Agree, was also my concern when I read patch 5 and 6. It would
> not only be related to types, but also different imm values,
> where the memcmp() could fail on. Potentially the latter can be
> avoided by only checking types which should be sufficient. Hmm,
> maybe only bpf_parse() should go through this stricter mode since
> only relevant for drivers (otoh downside would be that bugs
> would end up less likely to be found).
>
>>>> Another question is do you need all state walking that
>>>> verifier does or single linear pass through insns
>>>> would have worked?
>>>> Looks like you're only using CONST_IMM and PTR_TO_CTX
>>>> state, right?
>>>
>>> I think I need all the parsing.  Right now I mostly need
>>> the verification to check that exit codes are specific
>>> CONST_IMMs.  Clang quite happily does this:
>>>
>>> r0 <- 0
>>> if (...)
>>>     r0 <- 1
>>> exit
>>
>> I see. Indeed then you'd need the verifier to walk all paths
>> to make sure constant return values.
>
> I think this would still not cover the cases where you'd fetch
> a return value/verdict from a map, but this should be ignored/
> rejected for now, also since majority of programs are not written
> in such a way.
>
>> If you only need yes/no check then such info can probably be
>> collected unconditionally during initial program load.
>> Like prog->cb_access flag.
>
> One other comment wrt the header, when you move these things
> there, would be good to prefix with bpf_* so that this doesn't
> clash in future with other header files.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFCv2 11/16] net: cls_bpf: allow offloaded filters to update stats
  2016-08-26 18:06 ` [RFCv2 11/16] net: cls_bpf: allow offloaded filters to update stats Jakub Kicinski
@ 2016-08-29 20:43   ` Daniel Borkmann
  0 siblings, 0 replies; 40+ messages in thread
From: Daniel Borkmann @ 2016-08-29 20:43 UTC (permalink / raw)
  To: Jakub Kicinski, netdev
  Cc: ast, dinan.gunawardena, jiri, john.fastabend, kubakici

On 08/26/2016 08:06 PM, Jakub Kicinski wrote:
> Call into offloaded filters to update stats.
>
> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>

Acked-by: Daniel Borkmann <daniel@iogearbox.net>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFCv2 16/16] nfp: bpf: add offload of TC direct action mode
  2016-08-26 18:06 ` [RFCv2 16/16] nfp: bpf: add offload of TC direct action mode Jakub Kicinski
@ 2016-08-29 21:09   ` Daniel Borkmann
  2016-08-30 10:52     ` Jakub Kicinski
  0 siblings, 1 reply; 40+ messages in thread
From: Daniel Borkmann @ 2016-08-29 21:09 UTC (permalink / raw)
  To: Jakub Kicinski, netdev
  Cc: ast, dinan.gunawardena, jiri, john.fastabend, kubakici

On 08/26/2016 08:06 PM, Jakub Kicinski wrote:
> Add offload of TC in direct action mode.  We just need
> to provide appropriate checks in the verifier and
> a new outro block to translate the exit codes to what
> data path expects
>
> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
[...]
> +static void nfp_outro_tc_da(struct nfp_prog *nfp_prog)
> +{
> +	/* TC direct-action mode:

Would have made this the only supported mode, but I understand you
want to have the legacy drop/redir actions, fair enough.

> +	 *   0,1   ok        NOT SUPPORTED[1]
> +	 *   2   drop  0x22 -> drop,  count as stat1
> +	 *   4,5 nuke  0x02 -> drop
> +	 *   7  redir  0x44 -> redir, count as stat2
> +	 *   * unspec  0x11 -> pass,  count as stat0
> +	 *
> +	 * [1] We can't support OK and RECLASSIFY because we can't tell TC
> +	 *     the exact decision made.  We are forced to support UNSPEC
> +	 *     to handle aborts so that's the only one we handle for passing
> +	 *     packets up the stack.

In da mode, RECLASSIFY is not supported, so this one could be scratched.
For the OK and UNSPEC part, couldn't both be treated the same (as in: OK /
pass to stack roughly equivalent as in sch_handle_ingress())? Or is the
issue that you cannot populate skb->tc_index when passing to stack (maybe
just fine to leave it at 0 for now)?
Just curious, does TC_ACT_REDIRECT work in this scenario?

> +	 */
> +	/* Target for aborts */
> +	nfp_prog->tgt_abort = nfp_prog_current_offset(nfp_prog);
> +
> +	emit_br_def(nfp_prog, nfp_prog->tgt_done, 2);
> +
> +	emit_alu(nfp_prog, reg_a(0),
> +		 reg_none(), ALU_OP_NONE, NFP_BPF_ABI_FLAGS);
> +	emit_ld_field(nfp_prog, reg_a(0), 0xc, reg_imm(0x11), SHF_SC_L_SHF, 16);
> +
> +	/* Target for normal exits */
> +	nfp_prog->tgt_out = nfp_prog_current_offset(nfp_prog);
> +
> +	/* if R0 > 7 jump to abort */
> +	emit_alu(nfp_prog, reg_none(), reg_imm(7), ALU_OP_SUB, reg_b(0));
> +	emit_br(nfp_prog, BR_BLO, nfp_prog->tgt_abort, 0);
> +	emit_alu(nfp_prog, reg_a(0),
> +		 reg_none(), ALU_OP_NONE, NFP_BPF_ABI_FLAGS);
> +
> +	wrp_immed(nfp_prog, reg_b(2), 0x41221211);
> +	wrp_immed(nfp_prog, reg_b(3), 0x41001211);
> +
> +	emit_shf(nfp_prog, reg_a(1),
> +		 reg_none(), SHF_OP_NONE, reg_b(0), SHF_SC_L_SHF, 2);
> +
> +	emit_alu(nfp_prog, reg_none(), reg_a(1), ALU_OP_OR, reg_imm(0));
> +	emit_shf(nfp_prog, reg_a(2),
> +		 reg_imm(0xf), SHF_OP_AND, reg_b(2), SHF_SC_R_SHF, 0);
> +
> +	emit_alu(nfp_prog, reg_none(), reg_a(1), ALU_OP_OR, reg_imm(0));
> +	emit_shf(nfp_prog, reg_b(2),
> +		 reg_imm(0xf), SHF_OP_AND, reg_b(3), SHF_SC_R_SHF, 0);
> +
> +	emit_br_def(nfp_prog, nfp_prog->tgt_done, 2);
> +
> +	emit_shf(nfp_prog, reg_b(2),
> +		 reg_a(2), SHF_OP_OR, reg_b(2), SHF_SC_L_SHF, 4);
> +	emit_ld_field(nfp_prog, reg_a(0), 0xc, reg_b(2), SHF_SC_L_SHF, 16);
> +}

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFCv2 07/16] bpf: enable non-core use of the verfier
  2016-08-29 20:17           ` Daniel Borkmann
@ 2016-08-30 10:48             ` Jakub Kicinski
  2016-08-30 19:07               ` Daniel Borkmann
  0 siblings, 1 reply; 40+ messages in thread
From: Jakub Kicinski @ 2016-08-30 10:48 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Alexei Starovoitov, Jakub Kicinski, netdev, ast,
	dinan.gunawardena, jiri, john.fastabend

On Mon, 29 Aug 2016 22:17:10 +0200, Daniel Borkmann wrote:
> On 08/29/2016 10:13 PM, Daniel Borkmann wrote:
> > On 08/27/2016 07:32 PM, Alexei Starovoitov wrote:  
> >> On Sat, Aug 27, 2016 at 12:40:04PM +0100, Jakub Kicinski wrote:  
> >> probably array_of_insn_aux_data[num_insns] should do it.
> >> Unlike reg_state that is forked on branches, this array
> >> is only one.  
> >
> > This would be for struct nfp_insn_meta, right? So, struct
> > bpf_ext_parser_ops could become:
> >
> > static const struct bpf_ext_parser_ops nfp_bpf_pops = {
> >      .insn_hook = nfp_verify_insn,
> >      .insn_size = sizeof(struct nfp_insn_meta),
> > };
> >
> > ... where bpf_parse() would prealloc that f.e. in env->insn_meta[].  

Hm.. this is tempting, I will have to store the pointer type in
nfp_insn_meta soon, anyway.

> (Well, actually everything can live in env->private_data.)

We are discussing changing the place the verifier keeps its pointer type
annotation, I don't think we could put that in the private_data.

> > Agree, was also my concern when I read patch 5 and 6. It would
> > not only be related to types, but also different imm values,
> > where the memcmp() could fail on. Potentially the latter can be
> > avoided by only checking types which should be sufficient. Hmm,
> > maybe only bpf_parse() should go through this stricter mode since
> > only relevant for drivers (otoh downside would be that bugs
> > would end up less likely to be found).

I don't want to check only types because it would defeat my exit code
validation :)  I was thinking about doing a lazy evaluation -
registering branches to explored_states with UNKNOWN and only upgrading
to CONST when someone actually needed the imm value.  I'm not sure the
complexity would be justified, though.

Having two modes seems more straight forward and I think we would only
need to pay attention in the LD_IMM64 case, I don't think I've seen
LLVM generating XORs, it's just the cBPF -> eBPF conversion.
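
For reference, the two forms under discussion, written with the insn
macros from linux/filter.h (illustrative snippet; whether the converter
uses the 32- or 64-bit ALU form doesn't matter for the point here):

	#include <linux/filter.h>

	static const struct bpf_insn zero_forms[] = {
		BPF_ALU64_REG(BPF_XOR, BPF_REG_0, BPF_REG_0),	/* rN ^ rN, cBPF conversion */
		BPF_MOV64_IMM(BPF_REG_0, 0),			/* explicit load of 0, LLVM */
	};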

> >> I see. Indeed then you'd need the verifier to walk all paths
> >> to make sure constant return values.  
> >
> > I think this would still not cover the cases where you'd fetch
> > a return value/verdict from a map, but this should be ignored/
> > rejected for now, also since majority of programs are not written
> > in such a way.
> >  
> >> If you only need yes/no check then such info can probably be
> >> collected unconditionally during initial program load.
> >> Like prog->cb_access flag.  
> >
> > One other comment wrt the header, when you move these things
> > there, would be good to prefix with bpf_* so that this doesn't
> > clash in future with other header files.  

Good point!

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFCv2 16/16] nfp: bpf: add offload of TC direct action mode
  2016-08-29 21:09   ` Daniel Borkmann
@ 2016-08-30 10:52     ` Jakub Kicinski
  2016-08-30 20:02       ` Daniel Borkmann
  0 siblings, 1 reply; 40+ messages in thread
From: Jakub Kicinski @ 2016-08-30 10:52 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Jakub Kicinski, netdev, ast, dinan.gunawardena, jiri, john.fastabend

On Mon, 29 Aug 2016 23:09:35 +0200, Daniel Borkmann wrote:
> > +	 *   0,1   ok        NOT SUPPORTED[1]
> > +	 *   2   drop  0x22 -> drop,  count as stat1
> > +	 *   4,5 nuke  0x02 -> drop
> > +	 *   7  redir  0x44 -> redir, count as stat2
> > +	 *   * unspec  0x11 -> pass,  count as stat0
> > +	 *
> > +	 * [1] We can't support OK and RECLASSIFY because we can't tell TC
> > +	 *     the exact decision made.  We are forced to support UNSPEC
> > +	 *     to handle aborts so that's the only one we handle for passing
> > +	 *     packets up the stack.  
> 
> In da mode, RECLASSIFY is not supported, so this one could be scratched.
> For the OK and UNSPEC part, couldn't both be treated the same (as in: OK /
> pass to stack roughly equivalent as in sch_handle_ingress())? Or is the
> issue that you cannot populate skb->tc_index when passing to stack (maybe
> just fine to leave it at 0 for now)?

The comment is a bit confus(ed|ing).  The problem is:

tc filter add <filter1> skip_sw
tc filter add <filter2> skip_hw

If packet appears in the stack - was it because of OK or UNSPEC (or
RECLASSIFY) in filter1?  Do we need to run filter2 or not?  Passing
tc_index can be implemented the same way I do mark today.

> Just curious, does TC_ACT_REDIRECT work in this scenario?

I do the redirects in the card, all the problems stem from the
difficulty of passing full ret code in the skb from the driver
to tc_classify()/cls_bpf_classify().

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFCv2 07/16] bpf: enable non-core use of the verfier
  2016-08-30 10:48             ` Jakub Kicinski
@ 2016-08-30 19:07               ` Daniel Borkmann
  2016-08-30 20:22                 ` Jakub Kicinski
  0 siblings, 1 reply; 40+ messages in thread
From: Daniel Borkmann @ 2016-08-30 19:07 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Alexei Starovoitov, Jakub Kicinski, netdev, ast,
	dinan.gunawardena, jiri, john.fastabend

On 08/30/2016 12:48 PM, Jakub Kicinski wrote:
> On Mon, 29 Aug 2016 22:17:10 +0200, Daniel Borkmann wrote:
>> On 08/29/2016 10:13 PM, Daniel Borkmann wrote:
>>> On 08/27/2016 07:32 PM, Alexei Starovoitov wrote:
>>>> On Sat, Aug 27, 2016 at 12:40:04PM +0100, Jakub Kicinski wrote:
>>>> probably array_of_insn_aux_data[num_insns] should do it.
>>>> Unlike reg_state that is forked on branches, this array
>>>> is only one.
>>>
>>> This would be for struct nfp_insn_meta, right? So, struct
>>> bpf_ext_parser_ops could become:
>>>
>>> static const struct bpf_ext_parser_ops nfp_bpf_pops = {
>>>       .insn_hook = nfp_verify_insn,
>>>       .insn_size = sizeof(struct nfp_insn_meta),
>>> };
>>>
>>> ... where bpf_parse() would prealloc that f.e. in env->insn_meta[].
>
> Hm.. this is tempting, I will have to store the pointer type in
> nfp_insn_meta soon, anyway.
>
>> (Well, actually everything can live in env->private_data.)
>
> We are discussing changing the place the verifier keeps its pointer type
> annotation, I don't think we could put that in the private_data.
>
>>> Agree, was also my concern when I read patch 5 and 6. It would
>>> not only be related to types, but also different imm values,
>>> where the memcmp() could fail on. Potentially the latter can be
>>> avoided by only checking types which should be sufficient. Hmm,
>>> maybe only bpf_parse() should go through this stricter mode since
>>> only relevant for drivers (otoh downside would be that bugs
>>> would end up less likely to be found).
>
> I don't want to check only types because it would defeat my exit code
> validation :)  I was thinking about doing a lazy evaluation -
> registering branches to explored_states with UNKNOWN and only upgrading
> to CONST when someone actually needed the imm value.  I'm not sure the
> complexity would be justified, though.
>
> Having two modes seems more straight forward and I think we would only
> need to pay attention in the LD_IMM64 case, I don't think I've seen
> LLVM generating XORs, it's just the cBPF -> eBPF conversion.

Okay, though, I think that the cBPF to eBPF migration wouldn't even
pass through the bpf_parse() handling, since verifier is not aware on
some of their aspects such as emitting calls directly (w/o *proto) or
arg mappings. Probably make sense to reject these (bpf_prog_was_classic())
if they cannot be handled anyway?
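
A minimal sketch of that rejection, as a hypothetical early check in
bpf_parse() (the function name below is made up; bpf_prog_was_classic()
is the existing helper from linux/filter.h):

	/* Programs migrated from classic BPF call helpers without a *proto
	 * and use their own arg mapping, so an external parser can't reason
	 * about them; refuse to analyse such programs up front.
	 */
	static int bpf_parse_reject_classic(const struct bpf_prog *prog)
	{
		if (bpf_prog_was_classic(prog))
			return -EINVAL;
		return 0;
	}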

>>>> I see. Indeed then you'd need the verifier to walk all paths
>>>> to make sure constant return values.
>>>
>>> I think this would still not cover the cases where you'd fetch
>>> a return value/verdict from a map, but this should be ignored/
>>> rejected for now, also since majority of programs are not written
>>> in such a way.
>>>
>>>> If you only need yes/no check then such info can probably be
>>>> collected unconditionally during initial program load.
>>>> Like prog->cb_access flag.
>>>
>>> One other comment wrt the header, when you move these things
>>> there, would be good to prefix with bpf_* so that this doesn't
>>> clash in future with other header files.
>
> Good point!

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFCv2 16/16] nfp: bpf: add offload of TC direct action mode
  2016-08-30 10:52     ` Jakub Kicinski
@ 2016-08-30 20:02       ` Daniel Borkmann
  2016-08-30 20:50         ` Jakub Kicinski
  0 siblings, 1 reply; 40+ messages in thread
From: Daniel Borkmann @ 2016-08-30 20:02 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Jakub Kicinski, netdev, ast, dinan.gunawardena, jiri, john.fastabend

On 08/30/2016 12:52 PM, Jakub Kicinski wrote:
> On Mon, 29 Aug 2016 23:09:35 +0200, Daniel Borkmann wrote:
>>> +	 *   0,1   ok        NOT SUPPORTED[1]
>>> +	 *   2   drop  0x22 -> drop,  count as stat1
>>> +	 *   4,5 nuke  0x02 -> drop
>>> +	 *   7  redir  0x44 -> redir, count as stat2
>>> +	 *   * unspec  0x11 -> pass,  count as stat0
>>> +	 *
>>> +	 * [1] We can't support OK and RECLASSIFY because we can't tell TC
>>> +	 *     the exact decision made.  We are forced to support UNSPEC
>>> +	 *     to handle aborts so that's the only one we handle for passing
>>> +	 *     packets up the stack.
>>
>> In da mode, RECLASSIFY is not supported, so this one could be scratched.
>> For the OK and UNSPEC part, couldn't both be treated the same (as in: OK /
>> pass to stack roughly equivalent as in sch_handle_ingress())? Or is the
>> issue that you cannot populate skb->tc_index when passing to stack (maybe
>> just fine to leave it at 0 for now)?
>
> The comment is a bit confus(ed|ing).  The problem is:
>
> tc filter add <filter1> skip_sw
> tc filter add <filter2> skip_hw
>
> If packet appears in the stack - was it because of OK or UNSPEC (or
> RECLASSIFY) in filter1?  Do we need to run filter2 or not?  Passing
> tc_index can be implemented the same way I do mark today.

Okay, I see, thanks for explaining. So, if passing tc_index (or any other
meta data) can be implemented the same way as we do with mark already,
could we store such verdict, say, in some unused skb->tc_verd bits (the
skb->tc_index could be filled by the program already) and pass that up the
stack to differentiate between them? There should be no prior user before
ingress, so that patch 4 could become something like:

   if (tc_skip_sw(prog->gen_flags)) {
      filter_res = tc_map_hw_verd_to_act(skb);
   } else if (at_ingress) {
      ...
   } ...

And I assume it wouldn't make any sense anyway to have a skip_sw filter
being chained /after/ some skip_hw and the like, right?

>> Just curious, does TC_ACT_REDIRECT work in this scenario?
>
> I do the redirects in the card, all the problems stem from the

Ok, cool.

> difficulty of passing full ret code in the skb from the driver
> to tc_classify()/cls_bpf_classify().

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFCv2 07/16] bpf: enable non-core use of the verfier
  2016-08-30 19:07               ` Daniel Borkmann
@ 2016-08-30 20:22                 ` Jakub Kicinski
  2016-08-30 20:48                   ` Alexei Starovoitov
  0 siblings, 1 reply; 40+ messages in thread
From: Jakub Kicinski @ 2016-08-30 20:22 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Alexei Starovoitov, Jakub Kicinski, netdev, ast,
	dinan.gunawardena, jiri, john.fastabend

On Tue, 30 Aug 2016 21:07:50 +0200, Daniel Borkmann wrote:
> > Having two modes seems more straight forward and I think we would only
> > need to pay attention in the LD_IMM64 case, I don't think I've seen
> > LLVM generating XORs, it's just the cBPF -> eBPF conversion.  
> 
> Okay, though, I think that the cBPF to eBPF migration wouldn't even
> pass through the bpf_parse() handling, since verifier is not aware on
> some of their aspects such as emitting calls directly (w/o *proto) or
> arg mappings. Probably make sense to reject these (bpf_prog_was_classic())
> if they cannot be handled anyway?

TBH again I only use cBPF for testing.  It's a convenient way of
generating certain instruction sequences.  I can probably just drop
it completely but the XOR patch is just 3 lines of code so not a huge
cost either...  I'll keep patch 6 in my tree for now.  

Alternatively - is there any eBPF assembler out there?  Something
converting verifier output back into ELF would be quite cool.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFCv2 07/16] bpf: enable non-core use of the verfier
  2016-08-30 20:22                 ` Jakub Kicinski
@ 2016-08-30 20:48                   ` Alexei Starovoitov
  2016-08-30 21:00                     ` Daniel Borkmann
  0 siblings, 1 reply; 40+ messages in thread
From: Alexei Starovoitov @ 2016-08-30 20:48 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Daniel Borkmann, Jakub Kicinski, netdev, ast, dinan.gunawardena,
	jiri, john.fastabend

On Tue, Aug 30, 2016 at 10:22:46PM +0200, Jakub Kicinski wrote:
> On Tue, 30 Aug 2016 21:07:50 +0200, Daniel Borkmann wrote:
> > > Having two modes seems more straight forward and I think we would only
> > > need to pay attention in the LD_IMM64 case, I don't think I've seen
> > > LLVM generating XORs, it's just the cBPF -> eBPF conversion.  
> > 
> > Okay, though, I think that the cBPF to eBPF migration wouldn't even
> > pass through the bpf_parse() handling, since verifier is not aware on
> > some of their aspects such as emitting calls directly (w/o *proto) or
> > arg mappings. Probably make sense to reject these (bpf_prog_was_classic())
> > if they cannot be handled anyway?
> 
> TBH again I only use cBPF for testing.  It's a convenient way of
> generating certain instruction sequences.  I can probably just drop
> it completely but the XOR patch is just 3 lines of code so not a huge
> cost either...  I'll keep patch 6 in my tree for now.  

if xor matching is only needed for classic, I would drop that patch
just to avoid unnecessary state collection. The number of lines
is not a concern, but extra state for state pruning is.

> Alternatively - is there any eBPF assembler out there?  Something
> converting verifier output back into ELF would be quite cool.

would certainly be nice. I don't think there is anything standalone.
btw llvm can be made to work as assembler only, but simple flex/bison
is probably better.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFCv2 16/16] nfp: bpf: add offload of TC direct action mode
  2016-08-30 20:02       ` Daniel Borkmann
@ 2016-08-30 20:50         ` Jakub Kicinski
  0 siblings, 0 replies; 40+ messages in thread
From: Jakub Kicinski @ 2016-08-30 20:50 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Jakub Kicinski, netdev, ast, dinan.gunawardena, jiri, john.fastabend

On Tue, 30 Aug 2016 22:02:10 +0200, Daniel Borkmann wrote:
> On 08/30/2016 12:52 PM, Jakub Kicinski wrote:
> > On Mon, 29 Aug 2016 23:09:35 +0200, Daniel Borkmann wrote:  
>  [...]  
> >>
> >> In da mode, RECLASSIFY is not supported, so this one could be scratched.
> >> For the OK and UNSPEC part, couldn't both be treated the same (as in: OK /
> >> pass to stack roughly equivalent as in sch_handle_ingress())? Or is the
> >> issue that you cannot populate skb->tc_index when passing to stack (maybe
> >> just fine to leave it at 0 for now)?  
> >
> > The comment is a bit confus(ed|ing).  The problem is:
> >
> > tc filter add <filter1> skip_sw
> > tc filter add <filter2> skip_hw
> >
> > If packet appears in the stack - was it because of OK or UNSPEC (or
> > RECLASSIFY) in filter1?  Do we need to run filter2 or not?  Passing
> > tc_index can be implemented the same way I do mark today.  
> 
> Okay, I see, thanks for explaining. So, if passing tc_index (or any other
> meta data) can be implemented the same way as we do with mark already,
> could we store such verdict, say, in some unused skb->tc_verd bits (the
> skb->tc_index could be filled by the program already) and pass that up the
> stack to differentiate between them? There should be no prior user before
> ingress, so that patch 4 could become something like:
> 
>    if (tc_skip_sw(prog->gen_flags)) {
>       filter_res = tc_map_hw_verd_to_act(skb);
>    } else if (at_ingress) {
>       ...
>    } ...

This looks promising!

> And I assume it wouldn't make any sense anyway to have a skip_sw filter
> being chained /after/ some skip_hw and the like, right?

Right.  I think it should be enforced by TC core or at least some shared
code similar to tc_flags_valid() to reject offload attempts of filters
which are not first in line from the wire.  Right now AFAICT enabling
transparent offload with ethtool may result in things going down to HW
completely out of order and user doesn't even have to specify the
skip_* flags...
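
A rough sketch of the kind of shared check being suggested here, where
tc_filter_is_first() is a made-up helper; the point is only that skip_sw
offload would be refused unless the filter is the first one packets hit
from the wire:

	static bool tc_offload_order_ok(const struct tcf_proto *tp, u32 flags)
	{
		if (!tc_flags_valid(flags))
			return false;
		/* hypothetical: is this filter first in line from the wire? */
		if (tc_skip_sw(flags) && !tc_filter_is_first(tp))
			return false;
		return true;
	}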

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFCv2 07/16] bpf: enable non-core use of the verfier
  2016-08-30 20:48                   ` Alexei Starovoitov
@ 2016-08-30 21:00                     ` Daniel Borkmann
  2016-08-31  1:18                       ` Alexei Starovoitov
  0 siblings, 1 reply; 40+ messages in thread
From: Daniel Borkmann @ 2016-08-30 21:00 UTC (permalink / raw)
  To: Alexei Starovoitov, Jakub Kicinski
  Cc: Jakub Kicinski, netdev, ast, dinan.gunawardena, jiri, john.fastabend

On 08/30/2016 10:48 PM, Alexei Starovoitov wrote:
> On Tue, Aug 30, 2016 at 10:22:46PM +0200, Jakub Kicinski wrote:
>> On Tue, 30 Aug 2016 21:07:50 +0200, Daniel Borkmann wrote:
>>>> Having two modes seems more straight forward and I think we would only
>>>> need to pay attention in the LD_IMM64 case, I don't think I've seen
>>>> LLVM generating XORs, it's just the cBPF -> eBPF conversion.
>>>
>>> Okay, though, I think that the cBPF to eBPF migration wouldn't even
>>> pass through the bpf_parse() handling, since verifier is not aware on
>>> some of their aspects such as emitting calls directly (w/o *proto) or
>>> arg mappings. Probably make sense to reject these (bpf_prog_was_classic())
>>> if they cannot be handled anyway?
>>
>> TBH again I only use cBPF for testing.  It's a convenient way of
>> generating certain instruction sequences.  I can probably just drop
>> it completely but the XOR patch is just 3 lines of code so not a huge
>> cost either...  I'll keep patch 6 in my tree for now.
>
> if xor matching is only needed for classic, I would drop that patch
> just to avoid unnecessary state collection. The number of lines
> is not a concern, but extra state for state pruning is.
>
>> Alternatively - is there any eBPF assembler out there?  Something
>> converting verifier output back into ELF would be quite cool.
>
> would certainly be nice. I don't think there is anything standalone.
> btw llvm can be made to work as assembler only, but simple flex/bison
> is probably better.

Never tried it out, but seems llvm backend doesn't have asm parser
implemented?

   $ clang -target bpf -O2 -c foo.c -S -o foo.S
   $ llvm-mc -arch bpf foo.S -filetype=obj -o foo.o
   llvm-mc: error: this target does not support assembly parsing.

LLVM IR might work, but maybe too high level(?); alternatively, we could
make bpf_asm from tools/net/ eBPF aware for debugging purposes. If you
have a toolchain supporting libbfd et al, you could probably make use
of bpf_jit_dump() (like JITs do) and then bpf_jit_disasm tool (from
same dir as bpf_asm).

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFCv2 07/16] bpf: enable non-core use of the verfier
  2016-08-30 21:00                     ` Daniel Borkmann
@ 2016-08-31  1:18                       ` Alexei Starovoitov
  0 siblings, 0 replies; 40+ messages in thread
From: Alexei Starovoitov @ 2016-08-31  1:18 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Jakub Kicinski, Jakub Kicinski, netdev, ast, dinan.gunawardena,
	jiri, john.fastabend

On Tue, Aug 30, 2016 at 11:00:38PM +0200, Daniel Borkmann wrote:
> On 08/30/2016 10:48 PM, Alexei Starovoitov wrote:
> >On Tue, Aug 30, 2016 at 10:22:46PM +0200, Jakub Kicinski wrote:
> >>On Tue, 30 Aug 2016 21:07:50 +0200, Daniel Borkmann wrote:
> >>>>Having two modes seems more straight forward and I think we would only
> >>>>need to pay attention in the LD_IMM64 case, I don't think I've seen
> >>>>LLVM generating XORs, it's just the cBPF -> eBPF conversion.
> >>>
> >>>Okay, though, I think that the cBPF to eBPF migration wouldn't even
> >>>pass through the bpf_parse() handling, since verifier is not aware on
> >>>some of their aspects such as emitting calls directly (w/o *proto) or
> >>>arg mappings. Probably make sense to reject these (bpf_prog_was_classic())
> >>>if they cannot be handled anyway?
> >>
> >>TBH again I only use cBPF for testing.  It's a convenient way of
> >>generating certain instruction sequences.  I can probably just drop
> >>it completely but the XOR patch is just 3 lines of code so not a huge
> >>cost either...  I'll keep patch 6 in my tree for now.
> >
> >if xor matching is only needed for classic, I would drop that patch
> >just to avoid unnecessary state collection. The number of lines
> >is not a concern, but extra state for state pruning is.
> >
> >>Alternatively - is there any eBPF assembler out there?  Something
> >>converting verifier output back into ELF would be quite cool.
> >
> >would certainly be nice. I don't think there is anything standalone.
> >btw llvm can be made to work as assembler only, but simple flex/bison
> >is probably better.
> 
> Never tried it out, but seems llvm backend doesn't have asm parser
> implemented?
> 
>   $ clang -target bpf -O2 -c foo.c -S -o foo.S
>   $ llvm-mc -arch bpf foo.S -filetype=obj -o foo.o
>   llvm-mc: error: this target does not support assembly parsing.
> 
> LLVM IR might work, but maybe too high level(?); alternatively, we could
> make bpf_asm from tools/net/ eBPF aware for debugging purposes. If you
> have a toolchain supporting libbfd et al, you could probably make use
> of bpf_jit_dump() (like JITs do) and then bpf_jit_disasm tool (from
> same dir as bpf_asm).

yes. llvm-based bpf asm is not complete. It's straightforward to add though.
It won't be going through IR. Only 'mc' (machine instruction) layer.

^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread

Thread overview: 40+ messages
2016-08-26 18:05 [RFCv2 00/16] BPF hardware offload (cls_bpf for now) Jakub Kicinski
2016-08-26 18:06 ` [RFCv2 01/16] add basic register-field manipulation macros Jakub Kicinski
2016-08-29 14:34   ` Daniel Borkmann
2016-08-29 15:07     ` Jakub Kicinski
2016-08-29 15:40       ` Daniel Borkmann
2016-08-26 18:06 ` [RFCv2 02/16] net: cls_bpf: add hardware offload Jakub Kicinski
2016-08-29 14:51   ` Daniel Borkmann
2016-08-26 18:06 ` [RFCv2 03/16] net: cls_bpf: limit hardware offload by software-only flag Jakub Kicinski
2016-08-29 15:06   ` Daniel Borkmann
2016-08-29 15:15     ` Jakub Kicinski
2016-08-26 18:06 ` [RFCv2 04/16] net: cls_bpf: add support for marking filters as hardware-only Jakub Kicinski
2016-08-29 15:28   ` Daniel Borkmann
2016-08-26 18:06 ` [RFCv2 05/16] bpf: recognize 64bit immediate loads as consts Jakub Kicinski
2016-08-26 18:06 ` [RFCv2 06/16] bpf: verifier: recognize rN ^ rN as load of 0 Jakub Kicinski
2016-08-26 18:06 ` [RFCv2 07/16] bpf: enable non-core use of the verfier Jakub Kicinski
2016-08-26 23:29   ` Alexei Starovoitov
2016-08-27 11:40     ` Jakub Kicinski
2016-08-27 17:32       ` Alexei Starovoitov
2016-08-29 20:13         ` Daniel Borkmann
2016-08-29 20:17           ` Daniel Borkmann
2016-08-30 10:48             ` Jakub Kicinski
2016-08-30 19:07               ` Daniel Borkmann
2016-08-30 20:22                 ` Jakub Kicinski
2016-08-30 20:48                   ` Alexei Starovoitov
2016-08-30 21:00                     ` Daniel Borkmann
2016-08-31  1:18                       ` Alexei Starovoitov
2016-08-26 18:06 ` [RFCv2 08/16] bpf: export bpf_prog_clone functions Jakub Kicinski
2016-08-26 18:06 ` [RFCv2 09/16] nfp: add BPF to NFP code translator Jakub Kicinski
2016-08-26 18:06 ` [RFCv2 10/16] nfp: bpf: add hardware bpf offload Jakub Kicinski
2016-08-26 18:06 ` [RFCv2 11/16] net: cls_bpf: allow offloaded filters to update stats Jakub Kicinski
2016-08-29 20:43   ` Daniel Borkmann
2016-08-26 18:06 ` [RFCv2 12/16] net: bpf: " Jakub Kicinski
2016-08-26 18:06 ` [RFCv2 13/16] nfp: bpf: add packet marking support Jakub Kicinski
2016-08-26 18:06 ` [RFCv2 14/16] net: act_mirred: allow statistic updates from offloaded actions Jakub Kicinski
2016-08-26 18:06 ` [RFCv2 15/16] nfp: bpf: add support for legacy redirect action Jakub Kicinski
2016-08-26 18:06 ` [RFCv2 16/16] nfp: bpf: add offload of TC direct action mode Jakub Kicinski
2016-08-29 21:09   ` Daniel Borkmann
2016-08-30 10:52     ` Jakub Kicinski
2016-08-30 20:02       ` Daniel Borkmann
2016-08-30 20:50         ` Jakub Kicinski
