All of lore.kernel.org
 help / color / mirror / Atom feed
* [net-next PATCH 0/5] net_sched: Add support for IFE action
@ 2016-02-22 13:21 Jamal Hadi Salim
  2016-02-22 13:21 ` [net-next PATCH 1/5] introduce " Jamal Hadi Salim
                   ` (6 more replies)
  0 siblings, 7 replies; 27+ messages in thread
From: Jamal Hadi Salim @ 2016-02-22 13:21 UTC (permalink / raw)
  To: davem
  Cc: netdev, daniel, xiyou.wangcong, alexei.starovoitov, Jamal Hadi Salim

From: Jamal Hadi Salim <jhs@mojatatu.com>


As agreed at netconf in Seville, here's the patch finally (1 year
was just too long to wait).
Described in netdev01 paper:
            "Distributing Linux Traffic Control Classifier-Action Subsystem"
             Authors: Jamal Hadi Salim and Damascene M. Joachimpillai

Allows for incremental updates for new metadatum support.
This patch set includes support for basic skb metadatum
Followup patches will have more examples of metadata

Jamal Hadi Salim (5):
  introduce IFE action
  Support to encoding decoding skb mark on IFE action
  Support to encoding decoding skb prio on IFE action
  Support to encoding decoding skb hashid on IFE action
  Support to encoding decoding skb queue map on IFE action

 include/net/tc_act/tc_ife.h        |  60 +++
 include/uapi/linux/tc_act/tc_ife.h |  38 ++
 net/sched/Kconfig                  |  32 ++
 net/sched/Makefile                 |   5 +
 net/sched/act_ife.c                | 865 +++++++++++++++++++++++++++++++++++++
 net/sched/act_meta_mark.c          |  81 ++++
 net/sched/act_meta_qmap.c          | 100 +++++
 net/sched/act_meta_skbhash.c       |  87 ++++
 net/sched/act_meta_skbprio.c       |  80 ++++
 9 files changed, 1348 insertions(+)
 create mode 100644 include/net/tc_act/tc_ife.h
 create mode 100644 include/uapi/linux/tc_act/tc_ife.h
 create mode 100644 net/sched/act_ife.c
 create mode 100644 net/sched/act_meta_mark.c
 create mode 100644 net/sched/act_meta_qmap.c
 create mode 100644 net/sched/act_meta_skbhash.c
 create mode 100644 net/sched/act_meta_skbprio.c

-- 
1.9.1

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [net-next PATCH 1/5] introduce IFE action
  2016-02-22 13:21 [net-next PATCH 0/5] net_sched: Add support for IFE action Jamal Hadi Salim
@ 2016-02-22 13:21 ` Jamal Hadi Salim
  2016-02-22 13:21 ` [net-next PATCH 2/5] Support to encoding decoding skb mark on " Jamal Hadi Salim
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 27+ messages in thread
From: Jamal Hadi Salim @ 2016-02-22 13:21 UTC (permalink / raw)
  To: davem
  Cc: netdev, daniel, xiyou.wangcong, alexei.starovoitov, Jamal Hadi Salim

From: Jamal Hadi Salim <jhs@mojatatu.com>

This action allows for a sending side to encapsulate arbitrary metadata
which is decapsulated by the receiving end.
The sender runs in encoding mode and the receiver in decode mode.
Both sender and receiver must specify the same ethertype.
At some point we hope to have a registered ethertype and we'll
then provide a default so the user doesnt have to specify it.
For now we enforce the user specify it.

Lets show example usage where we encode icmp from a sender towards
a receiver with an skbmark of 17; both sender and receiver use
ethertype of 0xdead to interop.

YYYY: Lets start with Receiver-side policy config:
xxx: add an ingress qdisc
sudo tc qdisc add dev $ETH ingress

xxx: any packets with ethertype 0xdead will be subjected to ife decoding
xxx: we then restart the classification so we can match on icmp at prio 3
sudo $TC filter add dev $ETH parent ffff: prio 2 protocol 0xdead \
u32 match u32 0 0 flowid 1:1 \
action ife decode reclassify

xxx: on restarting the classification from above if it was an icmp
xxx: packet, then match it here and continue to the next rule at prio 4
xxx: which will match based on skb mark of 17
sudo tc filter add dev $ETH parent ffff: prio 3 protocol ip \
u32 match ip protocol 1 0xff flowid 1:1 \
action continue

xxx: match on skbmark of 0x11 (decimal 17) and accept
sudo tc filter add dev $ETH parent ffff: prio 4 protocol ip \
handle 0x11 fw flowid 1:1 \
action ok

xxx: Lets show the decoding policy
sudo tc -s filter ls dev $ETH parent ffff: protocol 0xdead
xxx:
filter pref 2 u32
filter pref 2 u32 fh 800: ht divisor 1
filter pref 2 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:1  (rule hit 0 success 0)
  match 00000000/00000000 at 0 (success 0 )
        action order 1: ife decode action reclassify
         index 1 ref 1 bind 1 installed 14 sec used 14 sec
         type: 0x0
         Metadata: allow mark allow hash allow prio allow qmap
        Action statistics:
        Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0
xxx:
Observe that above lists all metadatum it can decode. Typically these
submodules will already be compiled into a monolithic kernel or
loaded as modules

YYYY: Lets show the sender side now ..

xxx: Add an egress qdisc on the sender netdev
sudo tc qdisc add dev $ETH root handle 1: prio
xxx:
xxx: Match all icmp packets to 192.168.122.237/24, then
xxx: tag the packet with skb mark of decimal 17, then
xxx: Encode it with:
xxx:	ethertype 0xdead
xxx:	add skb->mark to whitelist of metadatum to send
xxx:	rewrite target dst MAC address to 02:15:15:15:15:15
xxx:
sudo $TC filter add dev $ETH parent 1: protocol ip prio 10  u32 \
match ip dst 192.168.122.237/24 \
match ip protocol 1 0xff \
flowid 1:2 \
action skbedit mark 17 \
action ife encode \
type 0xDEAD \
allow mark \
dst 02:15:15:15:15:15

xxx: Lets show the encoding policy
sudo tc -s filter ls dev $ETH parent 1: protocol ip
xxx:
filter pref 10 u32
filter pref 10 u32 fh 800: ht divisor 1
filter pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:2  (rule hit 0 success 0)
  match c0a87aed/ffffffff at 16 (success 0 )
  match 00010000/00ff0000 at 8 (success 0 )

	action order 1:  skbedit mark 17
	 index 6 ref 1 bind 1
 	Action statistics:
	Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
	backlog 0b 0p requeues 0

	action order 2: ife encode action pipe
	 index 3 ref 1 bind 1
	 dst MAC: 02:15:15:15:15:15 type: 0xDEAD
 	 Metadata: allow mark
 	Action statistics:
	Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
	backlog 0b 0p requeues 0
xxx:

test by sending ping from sender to destination

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
---
 include/net/tc_act/tc_ife.h        |  60 +++
 include/uapi/linux/tc_act/tc_ife.h |  38 ++
 net/sched/Kconfig                  |  12 +
 net/sched/Makefile                 |   1 +
 net/sched/act_ife.c                | 865 +++++++++++++++++++++++++++++++++++++
 5 files changed, 976 insertions(+)
 create mode 100644 include/net/tc_act/tc_ife.h
 create mode 100644 include/uapi/linux/tc_act/tc_ife.h
 create mode 100644 net/sched/act_ife.c

diff --git a/include/net/tc_act/tc_ife.h b/include/net/tc_act/tc_ife.h
new file mode 100644
index 0000000..fbbc23c
--- /dev/null
+++ b/include/net/tc_act/tc_ife.h
@@ -0,0 +1,60 @@
+#ifndef __NET_TC_IFE_H
+#define __NET_TC_IFE_H
+
+#include <net/act_api.h>
+#include <linux/etherdevice.h>
+#include <linux/rtnetlink.h>
+#include <linux/module.h>
+
+#define IFE_METAHDRLEN 2
+struct tcf_ife_info {
+	struct tcf_common common;
+	u8 eth_dst[ETH_ALEN];
+	u8 eth_src[ETH_ALEN];
+	u16 eth_type;
+	u16 flags;
+	/* list of metaids allowed */
+	struct list_head metalist;
+};
+#define to_ife(a) \
+	container_of(a->priv, struct tcf_ife_info, common)
+
+struct tcf_meta_info {
+	const struct tcf_meta_ops *ops;
+	void *metaval;
+	u16 metaid;
+	struct list_head metalist;
+};
+
+struct tcf_meta_ops {
+	u16 metaid; /*Maintainer provided ID */
+	u16 metatype; /*netlink attribute type (look at net/netlink.h) */
+	const char *name;
+	const char *synopsis;
+	struct list_head list;
+	int	(*check_presence)(struct sk_buff *, struct tcf_meta_info *);
+	int	(*encode)(struct sk_buff *, void *, struct tcf_meta_info *);
+	int	(*decode)(struct sk_buff *, void *, u16 len);
+	int	(*get)(struct sk_buff *skb, struct tcf_meta_info *mi);
+	int	(*alloc)(struct tcf_meta_info *, void *);
+	void	(*release)(struct tcf_meta_info *);
+	int	(*validate)(void *val, int len);
+	struct module	*owner;
+};
+
+#define MODULE_ALIAS_IFE_META(metan)   MODULE_ALIAS("ifemeta" __stringify_1(metan))
+
+int get_meta_u32(struct sk_buff *skb, struct tcf_meta_info *mi);
+int get_meta_u16(struct sk_buff *skb, struct tcf_meta_info *mi);
+int tlv_meta_encode(void *skbdata, u16 attrtype, u16 dlen, const void *dval);
+int alloc_meta_u32(struct tcf_meta_info *mi, void *metaval);
+int alloc_meta_u16(struct tcf_meta_info *mi, void *metaval);
+int check_meta_u32(u32 metaval, struct tcf_meta_info *mi);
+int encode_meta_u32(u32 metaval, void *skbdata, struct tcf_meta_info *mi);
+int validate_meta_u32(void *val, int len);
+int validate_meta_u16(void *val, int len);
+void release_meta_gen(struct tcf_meta_info *mi);
+int register_ife_op(struct tcf_meta_ops *mops);
+int unregister_ife_op(struct tcf_meta_ops *mops);
+
+#endif /* __NET_TC_IFE_H */
diff --git a/include/uapi/linux/tc_act/tc_ife.h b/include/uapi/linux/tc_act/tc_ife.h
new file mode 100644
index 0000000..f11bcda
--- /dev/null
+++ b/include/uapi/linux/tc_act/tc_ife.h
@@ -0,0 +1,38 @@
+#ifndef __UAPI_TC_IFE_H
+#define __UAPI_TC_IFE_H
+
+#include <linux/types.h>
+#include <linux/pkt_cls.h>
+
+#define TCA_ACT_IFE 25
+/* Flag bits for now just encoding/decoding */
+#define IFE_ENCODE 1
+#define IFE_DECODE 2
+
+struct tc_ife {
+	tc_gen;
+	__u16 flags;
+};
+
+/*XXX: We need to encode the total number of bytes consumed */
+enum {
+	TCA_IFE_UNSPEC,
+	TCA_IFE_PARMS,
+	TCA_IFE_TM,
+	TCA_IFE_DMAC,
+	TCA_IFE_SMAC,
+	TCA_IFE_TYPE,
+	TCA_IFE_METALST,
+	__TCA_IFE_MAX
+};
+#define TCA_IFE_MAX (__TCA_IFE_MAX - 1)
+
+#define IFE_META_SKBMARK 1
+#define IFE_META_HASHID 2
+#define	IFE_META_PRIO 3
+#define	IFE_META_QMAP 4
+/*Can be overriden at runtime by module option*/
+#define	__IFE_META_MAX 5
+#define IFE_META_MAX (__IFE_META_MAX - 1)
+
+#endif
diff --git a/net/sched/Kconfig b/net/sched/Kconfig
index 8283082..4d48ef5 100644
--- a/net/sched/Kconfig
+++ b/net/sched/Kconfig
@@ -739,6 +739,18 @@ config NET_ACT_CONNMARK
 	  To compile this code as a module, choose M here: the
 	  module will be called act_connmark.
 
+config NET_ACT_IFE
+        tristate "Inter-FE action based on IETF ForCES InterFE LFB"
+        depends on NET_CLS_ACT
+        ---help---
+	  Say Y here to allow for sourcing and terminating metadata
+	  For details refer to netdev01 paper:
+	  "Distributing Linux Traffic Control Classifier-Action Subsystem"
+	   Authors: Jamal Hadi Salim and Damascene M. Joachimpillai
+
+	  To compile this code as a module, choose M here: the
+	  module will be called act_ife.
+
 config NET_CLS_IND
 	bool "Incoming device classification"
 	depends on NET_CLS_U32 || NET_CLS_FW
diff --git a/net/sched/Makefile b/net/sched/Makefile
index 690c1689..3d17667 100644
--- a/net/sched/Makefile
+++ b/net/sched/Makefile
@@ -19,6 +19,7 @@ obj-$(CONFIG_NET_ACT_CSUM)	+= act_csum.o
 obj-$(CONFIG_NET_ACT_VLAN)	+= act_vlan.o
 obj-$(CONFIG_NET_ACT_BPF)	+= act_bpf.o
 obj-$(CONFIG_NET_ACT_CONNMARK)	+= act_connmark.o
+obj-$(CONFIG_NET_ACT_IFE)	+= act_ife.o
 obj-$(CONFIG_NET_SCH_FIFO)	+= sch_fifo.o
 obj-$(CONFIG_NET_SCH_CBQ)	+= sch_cbq.o
 obj-$(CONFIG_NET_SCH_HTB)	+= sch_htb.o
diff --git a/net/sched/act_ife.c b/net/sched/act_ife.c
new file mode 100644
index 0000000..153a202
--- /dev/null
+++ b/net/sched/act_ife.c
@@ -0,0 +1,865 @@
+/*
+ * net/sched/ife.c	Inter-FE action based on ForCES WG InterFE LFB
+ *
+ *		Refer to:
+ *		draft-ietf-forces-interfelfb-03
+ *		and
+ *		netdev01 paper:
+ *		"Distributing Linux Traffic Control Classifier-Action
+ *		Subsystem"
+ *		Authors: Jamal Hadi Salim and Damascene M. Joachimpillai
+ *
+ *		This program is free software; you can redistribute it and/or
+ *		modify it under the terms of the GNU General Public License
+ *		as published by the Free Software Foundation; either version
+ *		2 of the License, or (at your option) any later version.
+ *
+ * copyright 	Jamal Hadi Salim (2015)
+ *
+*/
+
+#include <linux/types.h>
+#include <linux/kernel.h>
+#include <linux/string.h>
+#include <linux/errno.h>
+#include <linux/skbuff.h>
+#include <linux/rtnetlink.h>
+#include <linux/module.h>
+#include <linux/init.h>
+#include <net/netlink.h>
+#include <net/pkt_sched.h>
+#include <uapi/linux/tc_act/tc_ife.h>
+#include <net/tc_act/tc_ife.h>
+#include <linux/etherdevice.h>
+
+#define IFE_TAB_MASK	15
+
+static int max_metacnt = IFE_META_MAX + 1;
+static const struct nla_policy ife_policy[TCA_IFE_MAX + 1] = {
+	[TCA_IFE_PARMS] = {.len = sizeof(struct tc_ife)},
+	[TCA_IFE_DMAC] = {.type = NLA_BINARY,.len = ETH_ALEN},
+	[TCA_IFE_SMAC] = {.type = NLA_BINARY,.len = ETH_ALEN},
+	[TCA_IFE_TYPE] = {.type = NLA_U16},
+};
+
+/*
+ * Caller takes care of presenting data in network order
+*/
+int tlv_meta_encode(void *skbdata, u16 attrtype, u16 dlen, const void *dval)
+{
+	u32 *tlv = (u32 *) (skbdata);
+	u16 totlen = nla_total_size(dlen);	/*alignment + hdr */
+	char *dptr = (char *)tlv + NLA_HDRLEN;
+	u32 htlv = attrtype << 16 | totlen;
+
+	*tlv = htonl(htlv);
+	memset(dptr, 0, totlen - NLA_HDRLEN);
+	memcpy(dptr, dval, dlen);
+
+	return totlen;
+}
+EXPORT_SYMBOL_GPL(tlv_meta_encode);
+
+int get_meta_u32(struct sk_buff *skb, struct tcf_meta_info *mi)
+{
+	if (mi->metaval)
+		return nla_put_u32(skb, mi->metaid, *(u32 *) mi->metaval);
+	else
+		return nla_put(skb, mi->metaid, 0, NULL);
+}
+EXPORT_SYMBOL_GPL(get_meta_u32);
+
+int check_meta_u32(u32 metaval, struct tcf_meta_info *mi)
+{
+	if (metaval || mi->metaval)
+		return 8;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(check_meta_u32);
+
+int encode_meta_u32(u32 metaval, void *skbdata, struct tcf_meta_info *mi)
+{
+	u32 edata = metaval;
+
+	if (mi->metaval)
+		edata = *(u32 *)mi->metaval;
+	else if (metaval)
+		edata = metaval;
+
+	if (!edata) /* will not encode */
+		return 0;
+
+	edata = htonl(edata);
+	return tlv_meta_encode(skbdata, mi->metaid, 4, &edata);
+}
+EXPORT_SYMBOL_GPL(encode_meta_u32);
+
+int get_meta_u16(struct sk_buff *skb, struct tcf_meta_info *mi)
+{
+	if (mi->metaval)
+		return nla_put_u16(skb, mi->metaid, *(u16 *) mi->metaval);
+	else
+		return nla_put(skb, mi->metaid, 0, NULL);
+}
+EXPORT_SYMBOL_GPL(get_meta_u16);
+
+int alloc_meta_u32(struct tcf_meta_info *mi, void *metaval)
+{
+	mi->metaval = kzalloc(sizeof(u32), GFP_KERNEL);
+	if (!mi->metaval)
+		return -ENOMEM;
+
+	memcpy(mi->metaval, metaval, 4);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(alloc_meta_u32);
+
+int alloc_meta_u16(struct tcf_meta_info *mi, void *metaval)
+{
+	mi->metaval = kzalloc(sizeof(u16), GFP_KERNEL);
+	if (!mi->metaval)
+		return -ENOMEM;
+
+	memcpy(mi->metaval, metaval, 2);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(alloc_meta_u16);
+
+void release_meta_gen(struct tcf_meta_info *mi)
+{
+	kfree(mi->metaval);
+}
+
+EXPORT_SYMBOL_GPL(release_meta_gen);
+
+int validate_meta_u32(void *val, int len)
+{
+	if (len == 4)
+		return 0;
+
+	return -EINVAL;
+}
+EXPORT_SYMBOL_GPL(validate_meta_u32);
+
+int validate_meta_u16(void *val, int len)
+{
+	/* length will include padding */
+	if (len == NLA_ALIGN(2))
+		return 0;
+
+	return -EINVAL;
+}
+EXPORT_SYMBOL_GPL(validate_meta_u16);
+
+static LIST_HEAD(ifeoplist);
+static DEFINE_RWLOCK(ife_mod_lock);
+
+struct tcf_meta_ops *find_ife_oplist(u16 metaid)
+{
+	struct tcf_meta_ops *o;
+
+	read_lock(&ife_mod_lock);
+	list_for_each_entry(o, &ifeoplist, list) {
+		if (o->metaid == metaid) {
+			if (!try_module_get(o->owner))
+				o = NULL;
+			read_unlock(&ife_mod_lock);
+			return o;
+		}
+	}
+	read_unlock(&ife_mod_lock);
+
+	return NULL;
+}
+
+int register_ife_op(struct tcf_meta_ops *mops)
+{
+	struct tcf_meta_ops *m;
+
+	if (!mops->metaid || !mops->metatype || !mops->name ||
+	    !mops->check_presence || !mops->encode || !mops->decode ||
+	    !mops->get || !mops->alloc)
+		return -EINVAL;
+
+	write_lock(&ife_mod_lock);
+
+	list_for_each_entry(m, &ifeoplist, list) {
+		if (m->metaid == mops->metaid ||
+		    (strcmp(mops->name, m->name) == 0)) {
+			write_unlock(&ife_mod_lock);
+			return -EEXIST;
+		}
+	}
+
+	if (!mops->release)
+		mops->release = release_meta_gen;
+
+	INIT_LIST_HEAD(&mops->list);
+	list_add_tail(&mops->list, &ifeoplist);
+	write_unlock(&ife_mod_lock);
+	return 0;
+}
+
+int unregister_ife_op(struct tcf_meta_ops *mops)
+{
+	struct tcf_meta_ops *m;
+	int err = -ENOENT;
+
+	write_lock(&ife_mod_lock);
+	list_for_each_entry(m, &ifeoplist, list) {
+		if (m->metaid == mops->metaid) {
+			list_del(&mops->list);
+			err = 0;
+			break;
+		}
+	}
+	write_unlock(&ife_mod_lock);
+
+	return err;
+}
+
+EXPORT_SYMBOL_GPL(register_ife_op);
+EXPORT_SYMBOL_GPL(unregister_ife_op);
+
+void print_ife_oplist(void)
+{
+	struct tcf_meta_ops *o;
+	int i = 0;
+
+	read_lock(&ife_mod_lock);
+	list_for_each_entry(o, &ifeoplist, list) {
+		pr_debug("%d: %s metaid %d\n", ++i, o->name, o->metaid);
+	}
+	read_unlock(&ife_mod_lock);
+}
+
+/* called when adding new meta information
+ * under ife->tcf_lock
+*/
+int load_metaops_and_vet(struct tcf_ife_info *ife, u32 metaid,
+			 void *val, int len)
+{
+	struct tcf_meta_ops *ops = find_ife_oplist(metaid);
+	int ret = 0;
+
+	if (!ops) {
+		ret = -ENOENT;
+#ifdef CONFIG_MODULES
+		spin_unlock_bh(&ife->tcf_lock);
+		rtnl_unlock();
+		request_module("ifemeta%u", metaid);
+		rtnl_lock();
+		spin_lock_bh(&ife->tcf_lock);
+		ops = find_ife_oplist(metaid);
+#endif
+	}
+
+	if (ops) {
+		ret = 0;
+
+		/* XXX: unfortunately cant use nla_policy at this point
+		 * because a length of 0 is valid in the case of 
+		 * "allow". "use" semantics do enforce for proper
+		 * length and i couldve use nla_policy but it makes it hard
+		 * to use it just for that..
+		 */
+		if (len) {
+			if (ops->validate) {
+				ret = ops->validate(val, len);
+			} else {
+				if (ops->metatype == NLA_U32) {
+					ret = validate_meta_u32(val, len);
+				} else if (ops->metatype == NLA_U16) {
+					ret = validate_meta_u16(val, len);
+				}
+			}
+		}
+
+		module_put(ops->owner);
+	}
+
+	return ret;
+}
+
+/* called when adding new meta information
+ * under ife->tcf_lock
+*/
+int add_metainfo(struct tcf_ife_info *ife, u32 metaid, void *metaval, int len)
+{
+	struct tcf_meta_info *mi = NULL;
+	struct tcf_meta_ops *ops = find_ife_oplist(metaid);
+	int ret = 0;
+
+	if (!ops) {
+		return -ENOENT;
+	}
+
+	mi = kzalloc(sizeof(*mi), GFP_KERNEL);
+	if (!mi) {
+		/*put back what find_ife_oplist took */
+		module_put(ops->owner);
+		return -ENOMEM;
+	}
+
+	mi->metaid = metaid;
+	mi->ops = ops;
+	if (len > 0) {
+		ret = ops->alloc(mi, metaval);
+		if (ret != 0) {
+			kfree(mi);
+			module_put(ops->owner);
+			return ret;
+		}
+	}
+
+	/*XXX: if it becomes necessary add per tcf_meta_info lock;
+	 * for now use Thor's hammer */
+	write_lock(&ife_mod_lock);
+	list_add_tail(&mi->metalist, &ife->metalist);
+	write_unlock(&ife_mod_lock);
+
+	return ret;
+}
+
+int use_all_metadata(struct tcf_ife_info *ife)
+{
+	struct tcf_meta_ops *o;
+	int rc = 0;
+	int installed = 0;
+
+	list_for_each_entry(o, &ifeoplist, list) {
+		rc = add_metainfo(ife, o->metaid, NULL, 0);
+		if (rc == 0)
+			installed +=1;
+	}
+
+	if (installed)
+		return 0;
+	else
+		return -EINVAL;
+}
+
+int dump_metalist(struct sk_buff *skb, struct tcf_ife_info *ife)
+{
+	struct tcf_meta_info *e;
+	struct nlattr *nest;
+	unsigned char *b = skb_tail_pointer(skb);
+	int total_encoded = 0;
+
+	/*can only happen on decode */
+	if (list_empty(&ife->metalist))
+		return 0;
+
+	nest = nla_nest_start(skb, TCA_IFE_METALST);
+	if (nest == NULL)
+		goto out_nlmsg_trim;
+
+	list_for_each_entry(e, &ife->metalist, metalist) {
+		if (!e->ops->get(skb, e))
+			total_encoded += 1;
+	}
+
+	if (!total_encoded)
+		goto out_nlmsg_trim;
+
+	nla_nest_end(skb, nest);
+
+	return 0;
+
+out_nlmsg_trim:
+	nlmsg_trim(skb, b);
+	return -1;
+}
+
+/* under ife->tcf_lock */
+static void _tcf_ife_cleanup(struct tc_action *a, int bind)
+{
+	struct tcf_ife_info *ife = a->priv;
+	struct tcf_meta_info *e, *n;
+
+	list_for_each_entry_safe(e, n, &ife->metalist, metalist) {
+		module_put(e->ops->owner);
+		list_del(&e->metalist);
+		if (e->metaval) {
+			if (e->ops->release)
+				e->ops->release(e);
+			else
+				kfree(e->metaval);
+		}
+		kfree(e);
+	}
+}
+
+static void tcf_ife_cleanup(struct tc_action *a, int bind)
+{
+	struct tcf_ife_info *ife = a->priv;
+
+	spin_lock_bh(&ife->tcf_lock);
+	_tcf_ife_cleanup(a, bind);
+	spin_unlock_bh(&ife->tcf_lock);
+}
+
+/* under ife->tcf_lock */
+int populate_metalist(struct tcf_ife_info *ife, struct nlattr **tb)
+{
+	int len = 0;
+	int rc = 0;
+	int i = 0;
+	void *val;
+
+	for (i = 1; i < max_metacnt; i++) {
+		if (tb[i]) {
+			val = nla_data(tb[i]);
+			len = nla_len(tb[i]);
+
+			rc = load_metaops_and_vet(ife, i, val, len);
+			if (rc != 0)
+				return rc;
+
+			rc = add_metainfo(ife, i, val, len);
+			if (rc)
+				return rc;
+		}
+	}
+
+	return rc;
+}
+
+static int tcf_ife_init(struct net *net, struct nlattr *nla,
+			struct nlattr *est, struct tc_action *a,
+			int ovr, int bind)
+{
+	struct nlattr *tb[TCA_IFE_MAX + 1];
+	struct nlattr *tb2[IFE_META_MAX + 1];
+	struct tcf_ife_info *ife;
+	struct tc_ife *parm;
+	u16 ife_type = 0;
+	u8 *daddr = NULL;
+	u8 *saddr = NULL;
+	int ret = 0;
+	int err;
+
+	err = nla_parse_nested(tb, TCA_IFE_MAX, nla, ife_policy);
+	if (err < 0)
+		return err;
+
+	if (tb[TCA_IFE_PARMS] == NULL)
+		return -EINVAL;
+
+	parm = nla_data(tb[TCA_IFE_PARMS]);
+
+	if ((parm->flags & IFE_ENCODE) && (parm->flags & IFE_DECODE)) {
+		pr_info("Ambigous: Cant have both encode and decode\n");
+		return -EINVAL;
+	}
+	if (!(parm->flags & IFE_ENCODE) && !(parm->flags & IFE_DECODE)) {
+		pr_info("MUST have either encode or decode\n");
+		return -EINVAL;
+	}
+
+	if (parm->flags & IFE_ENCODE) {
+		/* Until we get issued the ethertype, we cant have
+		 * a default..
+		**/
+		if (tb[TCA_IFE_TYPE] == NULL) {
+			pr_info("You MUST pass etherype for encoding\n");
+			return -EINVAL;
+		}
+	}
+
+	if (!tcf_hash_check(parm->index, a, bind)) {
+		ret = tcf_hash_create(parm->index, est, a, sizeof(*ife),
+				      bind, false);
+		if (ret)
+			return ret;
+		ret = ACT_P_CREATED;
+	} else {
+		if (bind)	/* dont override defaults */
+			return 0;
+		tcf_hash_release(a, bind);
+		if (!ovr)
+			return -EEXIST;
+	}
+
+	ife = to_ife(a);
+	ife->flags = parm->flags;
+
+	if (parm->flags & IFE_ENCODE) {
+		ife_type = *(u16 *) nla_data(tb[TCA_IFE_TYPE]);
+		if (tb[TCA_IFE_DMAC] != NULL)
+			daddr = nla_data(tb[TCA_IFE_DMAC]);
+		if (tb[TCA_IFE_SMAC] != NULL)
+			saddr = nla_data(tb[TCA_IFE_SMAC]);
+	}
+
+	spin_lock_bh(&ife->tcf_lock);
+	ife->tcf_action = parm->action;
+
+	if (parm->flags & IFE_ENCODE) {
+		if (daddr)
+			ether_addr_copy(ife->eth_dst, daddr);
+		else
+			eth_zero_addr(ife->eth_dst);
+
+		if (saddr)
+			ether_addr_copy(ife->eth_src, saddr);
+		else
+			eth_zero_addr(ife->eth_src);
+
+		ife->eth_type = ife_type;
+	}
+
+	if (ret == ACT_P_CREATED)
+		INIT_LIST_HEAD(&ife->metalist);
+
+	if (tb[TCA_IFE_METALST] != NULL) {
+		err = nla_parse_nested(tb2, IFE_META_MAX, tb[TCA_IFE_METALST],
+				       NULL);
+		if (err) {
+metadata_parse_err:
+			if (ret == ACT_P_CREATED)
+				_tcf_ife_cleanup(a, bind);
+
+			spin_unlock_bh(&ife->tcf_lock);
+			return err;
+		}
+
+		err = populate_metalist(ife, tb2);
+		if (err)
+			goto metadata_parse_err;
+
+	} else {
+		/* if no passed metadata allow list or passed allow-all 
+		 * then here we process by adding as many supported metadatum
+		 * as we can. You better have at least one else we are
+		 * going to bail out
+		 */
+		err = use_all_metadata(ife);
+		if (err) {
+			if (ret == ACT_P_CREATED)
+				_tcf_ife_cleanup(a, bind);
+
+			spin_unlock_bh(&ife->tcf_lock);
+			return err;
+		}
+	}
+
+	spin_unlock_bh(&ife->tcf_lock);
+
+	if (ret == ACT_P_CREATED)
+		tcf_hash_insert(a);
+
+	print_ife_oplist();
+	return ret;
+}
+
+static int tcf_ife_dump(struct sk_buff *skb, struct tc_action *a, int bind,
+			int ref)
+{
+	unsigned char *b = skb_tail_pointer(skb);
+	struct tcf_ife_info *ife = a->priv;
+	struct tc_ife opt = {
+		.index = ife->tcf_index,
+		.refcnt = ife->tcf_refcnt - ref,
+		.bindcnt = ife->tcf_bindcnt - bind,
+		.action = ife->tcf_action,
+		.flags = ife->flags,
+	};
+	struct tcf_t t;
+
+	if (nla_put(skb, TCA_IFE_PARMS, sizeof(opt), &opt))
+		goto nla_put_failure;
+
+	t.install = jiffies_to_clock_t(jiffies - ife->tcf_tm.install);
+	t.lastuse = jiffies_to_clock_t(jiffies - ife->tcf_tm.lastuse);
+	t.expires = jiffies_to_clock_t(ife->tcf_tm.expires);
+	if (nla_put(skb, TCA_IFE_TM, sizeof(t), &t))
+		goto nla_put_failure;
+
+	if (!is_zero_ether_addr(ife->eth_dst)) {
+		if (nla_put(skb, TCA_IFE_DMAC, ETH_ALEN, ife->eth_dst))
+			goto nla_put_failure;
+	}
+
+	if (!is_zero_ether_addr(ife->eth_src)) {
+		if (nla_put(skb, TCA_IFE_SMAC, ETH_ALEN, ife->eth_src))
+			goto nla_put_failure;
+	}
+
+	if (nla_put(skb, TCA_IFE_TYPE, 2, &ife->eth_type))
+		goto nla_put_failure;
+
+	if (dump_metalist(skb, ife)) {
+		/*ignore failure to dump metalist */
+		pr_info("Failed to dump metalist\n");
+	}
+
+	return skb->len;
+
+nla_put_failure:
+	nlmsg_trim(skb, b);
+	return -1;
+}
+
+int find_decode_metaid(struct sk_buff *skb, struct tcf_ife_info *ife,
+		       u16 metaid, u16 mlen, void *mdata)
+{
+	struct tcf_meta_info *e;
+
+	/* XXX: use hash to speed up */
+	list_for_each_entry(e, &ife->metalist, metalist) {
+		if (metaid == e->metaid) {
+			if (e->ops) {
+				/* We check for decode presence already */
+				return e->ops->decode(skb, mdata, mlen);
+			}
+		}
+	}
+
+	return 0;
+}
+
+struct ifeheadr {
+	__be16 metalen;
+	u8 tlv_data[];
+};
+
+struct meta_tlvhdr {
+	__be16 type;
+	__be16 len;
+};
+
+static int tcf_ife_decode(struct sk_buff *skb, const struct tc_action *a,
+			  struct tcf_result *res)
+{
+	struct tcf_ife_info *ife = a->priv;
+	int action = ife->tcf_action;
+	struct ifeheadr *ifehdr = (struct ifeheadr *)skb->data;
+	u16 ifehdrln = ifehdr->metalen;
+	struct meta_tlvhdr *tlv = (struct meta_tlvhdr *)(ifehdr->tlv_data);
+
+	spin_lock(&ife->tcf_lock);
+	bstats_update(&ife->tcf_bstats, skb);
+	ife->tcf_tm.lastuse = jiffies;
+	spin_unlock(&ife->tcf_lock);
+
+	ifehdrln = ntohs(ifehdrln);
+	if (unlikely(!pskb_may_pull(skb, ifehdrln))) {
+		spin_lock(&ife->tcf_lock);
+		ife->tcf_qstats.drops++;
+		spin_unlock(&ife->tcf_lock);
+		return TC_ACT_SHOT;
+	}
+
+	skb_set_mac_header(skb, ifehdrln);
+	__skb_pull(skb, ifehdrln);
+	skb->protocol = eth_type_trans(skb, skb->dev);
+	ifehdrln -= IFE_METAHDRLEN;
+
+	while (ifehdrln > 0) {
+		u8 *tlvdata = (u8 *) tlv;
+		u16 mtype = tlv->type;
+		u16 mlen = tlv->len;
+		mtype = ntohs(mtype);
+		mlen = ntohs(mlen);
+
+		if (find_decode_metaid(skb, ife, mtype, (mlen - 4),
+					(void *)(tlvdata + 4))) {
+			/* abuse overlimits to count when we receive metadata
+			 * but dont have an ops for it
+			 */
+			if (net_ratelimit())
+				printk("Unknown incoming metaid %d alnlen %d\n",
+				       mtype, mlen);
+			ife->tcf_qstats.overlimits++;
+		}
+
+		tlvdata += mlen;
+		ifehdrln -= mlen;
+		tlv = (struct meta_tlvhdr *)tlvdata;
+	}
+
+	skb_reset_network_header(skb);
+	return action;
+}
+
+/*XXX: check if we can do this at install time instead of current
+ * send data path
+**/
+static int ife_get_sz(struct sk_buff *skb, struct tcf_ife_info *ife)
+{
+	struct tcf_meta_info *e, *n;
+	int tot_run_sz = 0, run_sz = 0;
+
+	list_for_each_entry_safe(e, n, &ife->metalist, metalist) {
+		if (e->ops->check_presence) {
+			run_sz = e->ops->check_presence(skb, e);
+			tot_run_sz += run_sz;
+		}
+	}
+
+	return tot_run_sz;
+}
+
+static int tcf_ife_encode(struct sk_buff *skb, const struct tc_action *a,
+			  struct tcf_result *res)
+{
+	struct tcf_ife_info *ife = a->priv;
+	int action = ife->tcf_action;
+	struct ethhdr *oethh;	/* outer ether header */
+	struct ethhdr *iethh;	/* inner eth header */
+	struct tcf_meta_info *e;
+	/*
+	   OUTERHDR:TOTMETALEN:{TLVHDR:Metadatum:TLVHDR..}:ORIGDATA
+	   where ORIGDATA = original ethernet header ...
+	 */
+	u16 metalen = ife_get_sz(skb, ife);
+	int hdrm = metalen + skb->dev->hard_header_len + IFE_METAHDRLEN;
+	unsigned int skboff = skb->dev->hard_header_len;
+	u32 at = G_TC_AT(skb->tc_verd);
+	int new_len = skb->len + hdrm;
+	int exceed_mtu = 0;
+	int err;
+
+	if (at & AT_EGRESS) {
+		if (new_len > skb->dev->mtu) {
+			exceed_mtu = 1;
+		}
+	}
+
+	spin_lock(&ife->tcf_lock);
+	bstats_update(&ife->tcf_bstats, skb);
+	ife->tcf_tm.lastuse = jiffies;
+
+	if (!metalen) {		/* no metadata to send */
+		spin_unlock(&ife->tcf_lock);
+		/* abuse overlimits to count when we allow packet
+		 * with no metadata
+		 */
+		ife->tcf_qstats.overlimits++;
+		return action;
+	}
+	/* could be stupid policy setup or mtu config
+	 * so lets be conservative.. */
+	if ((action == TC_ACT_SHOT) || exceed_mtu) {
+		ife->tcf_qstats.drops++;
+		spin_unlock(&ife->tcf_lock);
+		return TC_ACT_SHOT;
+	}
+
+	iethh = eth_hdr(skb);
+
+	err = skb_cow_head(skb, hdrm);
+	if (unlikely(err)) {
+		ife->tcf_qstats.drops++;
+		spin_unlock(&ife->tcf_lock);
+		return TC_ACT_SHOT;
+	}
+
+	if (!(at & AT_EGRESS)) {
+		skb_push(skb, skb->dev->hard_header_len);
+	}
+
+	__skb_push(skb, hdrm);
+	memcpy(skb->data, iethh, skb->mac_len);
+	skb_reset_mac_header(skb);
+	oethh = eth_hdr(skb);
+
+	/*total metadata length */
+	metalen += IFE_METAHDRLEN;
+	metalen = htons(metalen);
+	memcpy((void *)(skb->data + skboff), &metalen, IFE_METAHDRLEN);
+	skboff += IFE_METAHDRLEN;
+
+	/*XXX: we dont have a clever way of telling encode to 
+	 * not repeat some of the computations that are done by 
+	 * ops->presence_check...
+	 */
+	list_for_each_entry(e, &ife->metalist, metalist) {
+		if (e->ops->encode) {
+			err = e->ops->encode(skb, (void *)(skb->data + skboff),
+					     e);
+		}
+		if (err < 0) {
+			/* too corrupt to keep around if overwritten */
+			ife->tcf_qstats.drops++;
+			spin_unlock(&ife->tcf_lock);
+			return TC_ACT_SHOT;
+		}
+		skboff += err;
+	}
+
+	if (!is_zero_ether_addr(ife->eth_src))
+		ether_addr_copy(oethh->h_source, ife->eth_src);
+	else
+		ether_addr_copy(oethh->h_source, iethh->h_source);
+	if (!is_zero_ether_addr(ife->eth_dst))
+		ether_addr_copy(oethh->h_dest, ife->eth_dst);
+	else
+		ether_addr_copy(oethh->h_dest, iethh->h_dest);
+	oethh->h_proto = htons(ife->eth_type);
+
+	if (!(at & AT_EGRESS)) {
+		skb_pull(skb, skb->dev->hard_header_len);
+	}
+
+	spin_unlock(&ife->tcf_lock);
+
+	return action;
+}
+
+static int tcf_ife(struct sk_buff *skb, const struct tc_action *a,
+		   struct tcf_result *res)
+{
+	struct tcf_ife_info *ife = a->priv;
+
+	if (ife->flags & IFE_ENCODE) {
+		return tcf_ife_encode(skb, a, res);
+	}
+
+	if (ife->flags & IFE_DECODE) {
+		return tcf_ife_decode(skb, a, res);
+	}
+
+	pr_info("unknown failure(policy neither de/encode\n");
+	spin_lock(&ife->tcf_lock);
+	bstats_update(&ife->tcf_bstats, skb);
+	ife->tcf_tm.lastuse = jiffies;
+	ife->tcf_qstats.drops++;
+	spin_unlock(&ife->tcf_lock);
+
+	return TC_ACT_SHOT;
+}
+
+static struct tc_action_ops act_ife_ops = {
+	.kind = "ife",
+	.type = TCA_ACT_IFE,
+	.owner = THIS_MODULE,
+	.act = tcf_ife,
+	.dump = tcf_ife_dump,
+	.cleanup = tcf_ife_cleanup,
+	.init = tcf_ife_init,
+};
+
+static int __init ife_init_module(void)
+{
+	pr_info("Loaded IFE maximum metaid %d\n", max_metacnt);
+	INIT_LIST_HEAD(&ifeoplist);
+	return tcf_register_action(&act_ife_ops, IFE_TAB_MASK);
+}
+
+static void __exit ife_cleanup_module(void)
+{
+	pr_info("Unloaded IFE\n");
+	tcf_unregister_action(&act_ife_ops);
+}
+
+module_init(ife_init_module);
+module_exit(ife_cleanup_module);
+
+module_param(max_metacnt, int, 0);
+MODULE_AUTHOR("Jamal Hadi Salim(2015)");
+MODULE_DESCRIPTION("Inter-FE LFB action");
+MODULE_LICENSE("GPL");
+MODULE_PARM_DESC(max_metacnt, "Upper bound of metadata id");
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [net-next PATCH 2/5] Support to encoding decoding skb mark on IFE action
  2016-02-22 13:21 [net-next PATCH 0/5] net_sched: Add support for IFE action Jamal Hadi Salim
  2016-02-22 13:21 ` [net-next PATCH 1/5] introduce " Jamal Hadi Salim
@ 2016-02-22 13:21 ` Jamal Hadi Salim
  2016-02-22 13:21 ` [net-next PATCH 3/5] Support to encoding decoding skb prio " Jamal Hadi Salim
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 27+ messages in thread
From: Jamal Hadi Salim @ 2016-02-22 13:21 UTC (permalink / raw)
  To: davem
  Cc: netdev, daniel, xiyou.wangcong, alexei.starovoitov, Jamal Hadi Salim

From: Jamal Hadi Salim <jhs@mojatatu.com>

Example usage:
Set the skb using skbedit then allow it to be encoded

sudo tc qdisc add dev $ETH root handle 1: prio
sudo tc filter add dev $ETH parent 1: protocol ip prio 10 \
u32 match ip protocol 1 0xff flowid 1:2 \
action skbedit mark 17 \
action ife encode \
allow mark \
dst 02:15:15:15:15:15

Note: You dont need the skbedit action if you are already encoding the
skb mark earlier. A zero skb mark will not be allowed

Alternative hard code static mark of 0x12 every time the filter matches

sudo $TC filter add dev $ETH parent 1: protocol ip prio 10 \
u32 match ip protocol 1 0xff flowid 1:2 \
action ife encode \
type 0xDEAD \
use mark 0x12 \
dst 02:15:15:15:15:15

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
---
 net/sched/Kconfig         |  5 +++
 net/sched/Makefile        |  1 +
 net/sched/act_meta_mark.c | 81 +++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 87 insertions(+)
 create mode 100644 net/sched/act_meta_mark.c

diff --git a/net/sched/Kconfig b/net/sched/Kconfig
index 4d48ef5..85854c0 100644
--- a/net/sched/Kconfig
+++ b/net/sched/Kconfig
@@ -751,6 +751,11 @@ config NET_ACT_IFE
 	  To compile this code as a module, choose M here: the
 	  module will be called act_ife.
 
+config NET_IFE_SKBMARK
+        tristate "Support to encoding decoding skb mark on IFE action"
+        depends on NET_ACT_IFE
+        ---help---
+
 config NET_CLS_IND
 	bool "Incoming device classification"
 	depends on NET_CLS_U32 || NET_CLS_FW
diff --git a/net/sched/Makefile b/net/sched/Makefile
index 3d17667..3f7a182 100644
--- a/net/sched/Makefile
+++ b/net/sched/Makefile
@@ -20,6 +20,7 @@ obj-$(CONFIG_NET_ACT_VLAN)	+= act_vlan.o
 obj-$(CONFIG_NET_ACT_BPF)	+= act_bpf.o
 obj-$(CONFIG_NET_ACT_CONNMARK)	+= act_connmark.o
 obj-$(CONFIG_NET_ACT_IFE)	+= act_ife.o
+obj-$(CONFIG_NET_IFE_SKBMARK)	+= act_meta_mark.o
 obj-$(CONFIG_NET_SCH_FIFO)	+= sch_fifo.o
 obj-$(CONFIG_NET_SCH_CBQ)	+= sch_cbq.o
 obj-$(CONFIG_NET_SCH_HTB)	+= sch_htb.o
diff --git a/net/sched/act_meta_mark.c b/net/sched/act_meta_mark.c
new file mode 100644
index 0000000..f3e1a44
--- /dev/null
+++ b/net/sched/act_meta_mark.c
@@ -0,0 +1,81 @@
+/*
+ * net/sched/act_meta_mark.c IFE skb->mark metadata module
+ *
+ *		This program is free software; you can redistribute it and/or
+ *		modify it under the terms of the GNU General Public License
+ *		as published by the Free Software Foundation; either version
+ *		2 of the License, or (at your option) any later version.
+ *
+ * copyright 	Jamal Hadi Salim (2015)
+ *
+*/
+
+#include <linux/types.h>
+#include <linux/kernel.h>
+#include <linux/string.h>
+#include <linux/errno.h>
+#include <linux/skbuff.h>
+#include <linux/rtnetlink.h>
+#include <linux/module.h>
+#include <linux/init.h>
+#include <net/netlink.h>
+#include <net/pkt_sched.h>
+#include <uapi/linux/tc_act/tc_ife.h>
+#include <net/tc_act/tc_ife.h>
+#include <linux/rtnetlink.h>
+
+static int skbmark_encode(struct sk_buff *skb, void *skbdata,
+			  struct tcf_meta_info *e)
+{
+	u32 ifemark = skb->mark;
+
+	return encode_meta_u32(ifemark, skbdata, e);
+}
+
+static int skbmark_decode(struct sk_buff *skb, void *data, u16 len)
+{
+	u32 ifemark = *(u32 *) data;
+
+	skb->mark = ntohl(ifemark);
+	return 0;
+}
+
+static int skbmark_check(struct sk_buff *skb, struct tcf_meta_info *e)
+{
+	return check_meta_u32(skb->mark, e);
+}
+
+static struct tcf_meta_ops ife_skbmark_ops = {
+	.metaid = IFE_META_SKBMARK,
+	.metatype = NLA_U32,
+	.name = "skbmark",
+	.synopsis = "skb mark 32 bit metadata",
+	.check_presence = skbmark_check,
+	.encode = skbmark_encode,
+	.decode = skbmark_decode,
+	.get = get_meta_u32,
+	.alloc = alloc_meta_u32,
+	.release = release_meta_gen,
+	.validate = validate_meta_u32,
+	.owner = THIS_MODULE,
+};
+
+static int __init ifemark_init_module(void)
+{
+	pr_info("Loaded IFE skb MARK\n");
+	return register_ife_op(&ife_skbmark_ops);
+}
+
+static void __exit ifemark_cleanup_module(void)
+{
+	pr_info("Unloaded IFE MARK\n");
+	unregister_ife_op(&ife_skbmark_ops);
+}
+
+module_init(ifemark_init_module);
+module_exit(ifemark_cleanup_module);
+
+MODULE_AUTHOR("Jamal Hadi Salim(2015)");
+MODULE_DESCRIPTION("Inter-FE skb mark metadata module");
+MODULE_LICENSE("GPL");
+MODULE_ALIAS_IFE_META(IFE_META_SKBMARK);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [net-next PATCH 3/5] Support to encoding decoding skb prio on IFE action
  2016-02-22 13:21 [net-next PATCH 0/5] net_sched: Add support for IFE action Jamal Hadi Salim
  2016-02-22 13:21 ` [net-next PATCH 1/5] introduce " Jamal Hadi Salim
  2016-02-22 13:21 ` [net-next PATCH 2/5] Support to encoding decoding skb mark on " Jamal Hadi Salim
@ 2016-02-22 13:21 ` Jamal Hadi Salim
  2016-02-22 17:01   ` Daniel Borkmann
  2016-02-22 13:21 ` [net-next PATCH 4/5] Support to encoding decoding skb hashid " Jamal Hadi Salim
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 27+ messages in thread
From: Jamal Hadi Salim @ 2016-02-22 13:21 UTC (permalink / raw)
  To: davem
  Cc: netdev, daniel, xiyou.wangcong, alexei.starovoitov, Jamal Hadi Salim

From: Jamal Hadi Salim <jhs@mojatatu.com>

    Example usage:
    Set the skb priority using skbedit then allow it to be encoded

    sudo tc qdisc add dev $ETH root handle 1: prio
    sudo tc filter add dev $ETH parent 1: protocol ip prio 10 \
    u32 match ip protocol 1 0xff flowid 1:2 \
    action skbedit prio 17 \
    action ife encode \
    allow prio \
    dst 02:15:15:15:15:15

    Note: You dont need the skbedit action if you are already encoding the
    skb priority earlier. A zero skb priority will not be sent

    Alternative hard code static priority of decimal 33 (unlike skbedit)
    then mark of 0x12 every time the filter matches

    sudo $TC filter add dev $ETH parent 1: protocol ip prio 10 \
    u32 match ip protocol 1 0xff flowid 1:2 \
    action ife encode \
    type 0xDEAD \
    use prio 33 \
    use mark 0x12 \
    dst 02:15:15:15:15:15

    Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
---
 net/sched/Kconfig            |  5 +++
 net/sched/Makefile           |  1 +
 net/sched/act_meta_skbprio.c | 80 ++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 86 insertions(+)
 create mode 100644 net/sched/act_meta_skbprio.c

diff --git a/net/sched/Kconfig b/net/sched/Kconfig
index 85854c0..b148302 100644
--- a/net/sched/Kconfig
+++ b/net/sched/Kconfig
@@ -756,6 +756,11 @@ config NET_IFE_SKBMARK
         depends on NET_ACT_IFE
         ---help---
 
+config NET_IFE_SKBPRIO
+        tristate "Support to encoding decoding skb prio on IFE action"
+        depends on NET_ACT_IFE
+        ---help---
+
 config NET_CLS_IND
 	bool "Incoming device classification"
 	depends on NET_CLS_U32 || NET_CLS_FW
diff --git a/net/sched/Makefile b/net/sched/Makefile
index 3f7a182..84bddb3 100644
--- a/net/sched/Makefile
+++ b/net/sched/Makefile
@@ -21,6 +21,7 @@ obj-$(CONFIG_NET_ACT_BPF)	+= act_bpf.o
 obj-$(CONFIG_NET_ACT_CONNMARK)	+= act_connmark.o
 obj-$(CONFIG_NET_ACT_IFE)	+= act_ife.o
 obj-$(CONFIG_NET_IFE_SKBMARK)	+= act_meta_mark.o
+obj-$(CONFIG_NET_IFE_SKBPRIO)	+= act_meta_skbprio.o
 obj-$(CONFIG_NET_SCH_FIFO)	+= sch_fifo.o
 obj-$(CONFIG_NET_SCH_CBQ)	+= sch_cbq.o
 obj-$(CONFIG_NET_SCH_HTB)	+= sch_htb.o
diff --git a/net/sched/act_meta_skbprio.c b/net/sched/act_meta_skbprio.c
new file mode 100644
index 0000000..ceaab8c
--- /dev/null
+++ b/net/sched/act_meta_skbprio.c
@@ -0,0 +1,80 @@
+/*
+ * net/sched/act_meta_prio.c IFE skb->priority metadata module
+ *
+ *		This program is free software; you can redistribute it and/or
+ *		modify it under the terms of the GNU General Public License
+ *		as published by the Free Software Foundation; either version
+ *		2 of the License, or (at your option) any later version.
+ *
+ * copyright 	Jamal Hadi Salim (2015)
+ *
+*/
+
+#include <linux/types.h>
+#include <linux/kernel.h>
+#include <linux/string.h>
+#include <linux/errno.h>
+#include <linux/skbuff.h>
+#include <linux/rtnetlink.h>
+#include <linux/module.h>
+#include <linux/init.h>
+#include <net/netlink.h>
+#include <net/pkt_sched.h>
+#include <uapi/linux/tc_act/tc_ife.h>
+#include <net/tc_act/tc_ife.h>
+
+static int skbprio_check(struct sk_buff *skb, struct tcf_meta_info *e)
+{
+	return check_meta_u32(skb->priority, e);
+}
+
+static int skbprio_encode(struct sk_buff *skb, void *skbdata,
+			  struct tcf_meta_info *e)
+{
+	u32 ifeprio = skb->priority; /* avoid having to cast skb->priority*/
+
+	pr_emerg("encode skbprio %d\n", ifeprio);
+	return encode_meta_u32(ifeprio, skbdata, e);
+}
+
+static int skbprio_decode(struct sk_buff *skb, void *data, u16 len)
+{
+	u32 ifeprio = *(u32 *) data;
+
+	skb->priority = ntohl(ifeprio);
+	pr_emerg("decode skbprio %d\n", skb->priority);
+	return 0;
+}
+
+static struct tcf_meta_ops ife_prio_ops = {
+	.metaid = IFE_META_PRIO,
+	.metatype = NLA_U32,
+	.name = "skbprio",
+	.synopsis = "skb prio metadata",
+	.check_presence = skbprio_check,
+	.encode = skbprio_encode,
+	.decode = skbprio_decode,
+	.get = get_meta_u32,
+	.alloc = alloc_meta_u32,
+	.owner = THIS_MODULE,
+};
+
+static int __init ifeprio_init_module(void)
+{
+	pr_info("Loaded IFE skb prio\n");
+	return register_ife_op(&ife_prio_ops);
+}
+
+static void __exit ifeprio_cleanup_module(void)
+{
+	unregister_ife_op(&ife_prio_ops);
+	pr_info("Unloaded IFE skb prio\n");
+}
+
+module_init(ifeprio_init_module);
+module_exit(ifeprio_cleanup_module);
+
+MODULE_AUTHOR("Jamal Hadi Salim(2015)");
+MODULE_DESCRIPTION("Inter-FE skb prio metadata action");
+MODULE_LICENSE("GPL");
+MODULE_ALIAS_IFE_META(IFE_META_PRIO);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [net-next PATCH 4/5] Support to encoding decoding skb hashid on IFE action
  2016-02-22 13:21 [net-next PATCH 0/5] net_sched: Add support for IFE action Jamal Hadi Salim
                   ` (2 preceding siblings ...)
  2016-02-22 13:21 ` [net-next PATCH 3/5] Support to encoding decoding skb prio " Jamal Hadi Salim
@ 2016-02-22 13:21 ` Jamal Hadi Salim
  2016-02-22 16:56   ` Daniel Borkmann
  2016-02-22 13:21 ` [net-next PATCH 5/5] Support to encoding decoding skb queue map " Jamal Hadi Salim
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 27+ messages in thread
From: Jamal Hadi Salim @ 2016-02-22 13:21 UTC (permalink / raw)
  To: davem
  Cc: netdev, daniel, xiyou.wangcong, alexei.starovoitov, Jamal Hadi Salim

From: Jamal Hadi Salim <jhs@mojatatu.com>

        Example usage:
        allow skb hash to be encoded as metadata

        sudo tc qdisc add dev $ETH root handle 1: prio
        sudo tc filter add dev $ETH parent 1: protocol ip prio 10 \
        u32 match ip protocol 1 0xff flowid 1:2 \
        action ife encode \
        allow hashid \
        dst 02:15:15:15:15:15

        Note: A zero skb hash will not be sent

        Alternative hard code static hashid of 11
        priority 16
        mark 33

        sudo $TC filter add dev $ETH parent 1: protocol ip prio 10 \
        u32 match ip protocol 1 0xff flowid 1:2 \
        action ife encode \
        type 0xDEAD \
	use hashid 11 \
        use prio 16 \
        use mark 33 \
        dst 02:15:15:15:15:15

        Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
---
 net/sched/Kconfig            |  5 +++
 net/sched/Makefile           |  1 +
 net/sched/act_meta_skbhash.c | 87 ++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 93 insertions(+)
 create mode 100644 net/sched/act_meta_skbhash.c

diff --git a/net/sched/Kconfig b/net/sched/Kconfig
index b148302..4c0c694 100644
--- a/net/sched/Kconfig
+++ b/net/sched/Kconfig
@@ -761,6 +761,11 @@ config NET_IFE_SKBPRIO
         depends on NET_ACT_IFE
         ---help---
 
+config NET_IFE_SKBHASH
+        tristate "Support to encoding decoding skb hashid on IFE action"
+        depends on NET_ACT_IFE
+        ---help---
+
 config NET_CLS_IND
 	bool "Incoming device classification"
 	depends on NET_CLS_U32 || NET_CLS_FW
diff --git a/net/sched/Makefile b/net/sched/Makefile
index 84bddb3..321a9bc 100644
--- a/net/sched/Makefile
+++ b/net/sched/Makefile
@@ -22,6 +22,7 @@ obj-$(CONFIG_NET_ACT_CONNMARK)	+= act_connmark.o
 obj-$(CONFIG_NET_ACT_IFE)	+= act_ife.o
 obj-$(CONFIG_NET_IFE_SKBMARK)	+= act_meta_mark.o
 obj-$(CONFIG_NET_IFE_SKBPRIO)	+= act_meta_skbprio.o
+obj-$(CONFIG_NET_IFE_SKBHASH)	+= act_meta_skbhash.o
 obj-$(CONFIG_NET_SCH_FIFO)	+= sch_fifo.o
 obj-$(CONFIG_NET_SCH_CBQ)	+= sch_cbq.o
 obj-$(CONFIG_NET_SCH_HTB)	+= sch_htb.o
diff --git a/net/sched/act_meta_skbhash.c b/net/sched/act_meta_skbhash.c
new file mode 100644
index 0000000..c3140ea
--- /dev/null
+++ b/net/sched/act_meta_skbhash.c
@@ -0,0 +1,87 @@
+/*
+ * net/sched/act_meta_skbhash.c IFE skb->hash metadata module
+ *
+ *		This program is free software; you can redistribute it and/or
+ *		modify it under the terms of the GNU General Public License
+ *		as published by the Free Software Foundation; either version
+ *		2 of the License, or (at your option) any later version.
+ *
+ * copyright 	Jamal Hadi Salim (2015)
+ *
+*/
+
+#include <linux/types.h>
+#include <linux/kernel.h>
+#include <linux/string.h>
+#include <linux/errno.h>
+#include <linux/skbuff.h>
+#include <linux/rtnetlink.h>
+#include <linux/module.h>
+#include <linux/init.h>
+#include <net/netlink.h>
+#include <net/pkt_sched.h>
+#include <uapi/linux/tc_act/tc_ife.h>
+#include <net/tc_act/tc_ife.h>
+
+static int skbhash_check(struct sk_buff *skb, struct tcf_meta_info *e)
+{
+	u32 skbhash = skb->hash;
+
+	if (e->metaval) {
+		skbhash = *(u32 *)e->metaval;
+	}
+	if (!skbhash)
+		return 0;
+
+	return 8;
+}
+
+static int skbhash_encode(struct sk_buff *skb, void *skbdata,
+			  struct tcf_meta_info *e)
+{
+	u32 skbhash = skb->hash;
+
+	return encode_meta_u32(skbhash, skbdata, e);
+}
+
+static int skbhash_decode(struct sk_buff *skb, void *data, u16 len)
+{
+	u32 skbhash = *(u32 *) data;
+
+	skb->hash = ntohl(skbhash);
+	return 0;
+}
+
+static struct tcf_meta_ops ife_skbhash_ops = {
+	.metaid = IFE_META_HASHID,
+	.metatype = NLA_U32,
+	.name = "skbhash",
+	.synopsis = "skb hash metadata",
+	.check_presence = skbhash_check,
+	.encode = skbhash_encode,
+	.decode = skbhash_decode,
+	.get = get_meta_u32,
+	.alloc = alloc_meta_u32,
+	.owner = THIS_MODULE,
+};
+
+static int __init ifeskbhash_init_module(void)
+{
+	pr_info("Loaded IFE skbhash\n");
+
+	return register_ife_op(&ife_skbhash_ops);
+}
+
+static void __exit ifeskbhash_cleanup_module(void)
+{
+	pr_info("Unloaded IFE skb hash\n");
+	unregister_ife_op(&ife_skbhash_ops);
+}
+
+module_init(ifeskbhash_init_module);
+module_exit(ifeskbhash_cleanup_module);
+
+MODULE_AUTHOR("Jamal Hadi Salim(2015)");
+MODULE_DESCRIPTION("Inter-FE skb hash meta action");
+MODULE_LICENSE("GPL");
+MODULE_ALIAS_IFE_META(IFE_META_HASHID);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [net-next PATCH 5/5] Support to encoding decoding skb queue map on IFE action
  2016-02-22 13:21 [net-next PATCH 0/5] net_sched: Add support for IFE action Jamal Hadi Salim
                   ` (3 preceding siblings ...)
  2016-02-22 13:21 ` [net-next PATCH 4/5] Support to encoding decoding skb hashid " Jamal Hadi Salim
@ 2016-02-22 13:21 ` Jamal Hadi Salim
  2016-02-22 16:59   ` Daniel Borkmann
  2016-02-22 21:03   ` John Fastabend
  2016-02-22 16:47 ` [net-next PATCH 0/5] net_sched: Add support for " Daniel Borkmann
  2016-02-23  7:00 ` Cong Wang
  6 siblings, 2 replies; 27+ messages in thread
From: Jamal Hadi Salim @ 2016-02-22 13:21 UTC (permalink / raw)
  To: davem
  Cc: netdev, daniel, xiyou.wangcong, alexei.starovoitov, Jamal Hadi Salim

From: Jamal Hadi Salim <jhs@mojatatu.com>

hard code static value of 10 for qmap
mark of 12
prio of 13
and hashid of 11

sudo tc filter add dev $ETH parent 1: protocol ip prio 10 \
u32 match ip protocol 1 0xff flowid 1:2 \
action ife encode \
type 0xDEAD \
use mark 12 \
use prio 13 \
use hashid 11 \
use qmap 10 \
dst 02:15:15:15:15:15

Note: as of the time of submission skbedit of queue map doesnt work
(just in case you try to use it)

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
---
 net/sched/Kconfig         |   5 +++
 net/sched/Makefile        |   1 +
 net/sched/act_meta_qmap.c | 100 ++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 106 insertions(+)
 create mode 100644 net/sched/act_meta_qmap.c

diff --git a/net/sched/Kconfig b/net/sched/Kconfig
index 4c0c694..c25b192 100644
--- a/net/sched/Kconfig
+++ b/net/sched/Kconfig
@@ -766,6 +766,11 @@ config NET_IFE_SKBHASH
         depends on NET_ACT_IFE
         ---help---
 
+config NET_IFE_SKBQMAP
+        tristate "Support to encoding decoding skb queue_map on IFE action"
+        depends on NET_ACT_IFE
+        ---help---
+
 config NET_CLS_IND
 	bool "Incoming device classification"
 	depends on NET_CLS_U32 || NET_CLS_FW
diff --git a/net/sched/Makefile b/net/sched/Makefile
index 321a9bc..fa97501 100644
--- a/net/sched/Makefile
+++ b/net/sched/Makefile
@@ -23,6 +23,7 @@ obj-$(CONFIG_NET_ACT_IFE)	+= act_ife.o
 obj-$(CONFIG_NET_IFE_SKBMARK)	+= act_meta_mark.o
 obj-$(CONFIG_NET_IFE_SKBPRIO)	+= act_meta_skbprio.o
 obj-$(CONFIG_NET_IFE_SKBHASH)	+= act_meta_skbhash.o
+obj-$(CONFIG_NET_IFE_SKBQMAP)	+= act_meta_qmap.o
 obj-$(CONFIG_NET_SCH_FIFO)	+= sch_fifo.o
 obj-$(CONFIG_NET_SCH_CBQ)	+= sch_cbq.o
 obj-$(CONFIG_NET_SCH_HTB)	+= sch_htb.o
diff --git a/net/sched/act_meta_qmap.c b/net/sched/act_meta_qmap.c
new file mode 100644
index 0000000..e5e8dbe
--- /dev/null
+++ b/net/sched/act_meta_qmap.c
@@ -0,0 +1,100 @@
+/*
+ * net/sched/act_meta_qmap.c skb queue map encoder/decoder
+ *
+ *
+ *		This program is free software; you can redistribute it and/or
+ *		modify it under the terms of the GNU General Public License
+ *		as published by the Free Software Foundation; either version
+ *		2 of the License, or (at your option) any later version.
+ *
+ * copyright 	Jamal Hadi Salim (2015)
+ *
+*/
+
+#include <linux/types.h>
+#include <linux/kernel.h>
+#include <linux/string.h>
+#include <linux/errno.h>
+#include <linux/skbuff.h>
+#include <linux/rtnetlink.h>
+#include <linux/module.h>
+#include <linux/init.h>
+#include <net/netlink.h>
+#include <net/pkt_sched.h>
+#include <uapi/linux/tc_act/tc_ife.h>
+#include <net/tc_act/tc_ife.h>
+#include <linux/ip.h> /*XXX*/
+
+int skbqmap_check(struct sk_buff *skb, struct tcf_meta_info *e)
+{
+	//XXX: skb_get_queue_mapping()?
+	u32 ifeqmap = skb->queue_mapping;
+
+	if (e->metaval) {
+		ifeqmap = *(u32 *)e->metaval;
+	}
+
+	if (!ifeqmap)
+		return 0;
+	/* data + pad + LV = 2+2+4 */
+	return 8;
+}
+
+int skbqmap_encode(struct sk_buff *skb, void *skbdata, struct tcf_meta_info *e)
+{
+	/*(XXX: skb_get_queue_mapping()? */
+	u16 ifeqmap = skb->queue_mapping;
+
+	if (e->metaval) {
+		ifeqmap = *(u16 *)e->metaval;
+	}
+
+	if (!ifeqmap)
+		return 0;
+
+	ifeqmap = htons(ifeqmap);
+
+	return tlv_meta_encode(skbdata, e->metaid, 2, &ifeqmap);
+}
+
+int qmap_decode(struct sk_buff *skb, void *data, u16 len)
+{
+	u16 qm = *(u16 *) data;
+
+	skb->queue_mapping = ntohs(qm);
+	return 0;
+}
+
+static struct tcf_meta_ops ife_qmap_ops = {
+	.metaid = IFE_META_QMAP,
+	.metatype = NLA_U16,
+	.name = "skbqmap",
+	.synopsis = "skb queue map 16 bit metadata",
+	.check_presence = skbqmap_check,
+	.encode = skbqmap_encode,
+	.decode = qmap_decode,
+	.get = get_meta_u16,
+	.alloc = alloc_meta_u16,
+	.owner = THIS_MODULE,
+};
+
+static int __init ifeqmap_init_module(void)
+{
+	pr_emerg("Loaded IFE qmap\n");
+
+	return register_ife_op(&ife_qmap_ops);
+}
+
+static void __exit ifeqmap_cleanup_module(void)
+{
+	pr_emerg("Unloaded IFE qmap\n");
+	unregister_ife_op(&ife_qmap_ops);
+}
+
+module_init(ifeqmap_init_module);
+module_exit(ifeqmap_cleanup_module);
+
+MODULE_AUTHOR("Jamal Hadi Salim(2015)");
+MODULE_DESCRIPTION("Inter-FE skb qmap metadata action");
+MODULE_LICENSE("GPL");
+MODULE_ALIAS_IFE_META(IFE_META_QMAP);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [net-next PATCH 0/5] net_sched: Add support for IFE action
  2016-02-22 13:21 [net-next PATCH 0/5] net_sched: Add support for IFE action Jamal Hadi Salim
                   ` (4 preceding siblings ...)
  2016-02-22 13:21 ` [net-next PATCH 5/5] Support to encoding decoding skb queue map " Jamal Hadi Salim
@ 2016-02-22 16:47 ` Daniel Borkmann
  2016-02-23 12:09   ` Jamal Hadi Salim
  2016-02-23  7:00 ` Cong Wang
  6 siblings, 1 reply; 27+ messages in thread
From: Daniel Borkmann @ 2016-02-22 16:47 UTC (permalink / raw)
  To: Jamal Hadi Salim, davem; +Cc: netdev, xiyou.wangcong, alexei.starovoitov

Hi Jamal,

On 02/22/2016 02:21 PM, Jamal Hadi Salim wrote:
> From: Jamal Hadi Salim <jhs@mojatatu.com>
>
>
> As agreed at netconf in Seville, here's the patch finally (1 year
> was just too long to wait).
> Described in netdev01 paper:
>              "Distributing Linux Traffic Control Classifier-Action Subsystem"
>               Authors: Jamal Hadi Salim and Damascene M. Joachimpillai
>
> Allows for incremental updates for new metadatum support.
> This patch set includes support for basic skb metadatum
> Followup patches will have more examples of metadata

So, basically this is a L2 encap with TLVs, right?

And as TLVs you have skb->mark, skb->priority, skb->hash, skb->queue_mapping
that you transfer from one machine to another, where on the destination, you
are applying the above meta data to the skb itself. And, configuration is via
tc.

I couldn't parse from the commit log what the real world use case is, resp.
who is going to use this infrastructure?

Do you have some typical setup, where the above needs to be transferred in the
encap and restored?

> Jamal Hadi Salim (5):
>    introduce IFE action
>    Support to encoding decoding skb mark on IFE action
>    Support to encoding decoding skb prio on IFE action
>    Support to encoding decoding skb hashid on IFE action
>    Support to encoding decoding skb queue map on IFE action
>
>   include/net/tc_act/tc_ife.h        |  60 +++
>   include/uapi/linux/tc_act/tc_ife.h |  38 ++
>   net/sched/Kconfig                  |  32 ++
>   net/sched/Makefile                 |   5 +
>   net/sched/act_ife.c                | 865 +++++++++++++++++++++++++++++++++++++
>   net/sched/act_meta_mark.c          |  81 ++++
>   net/sched/act_meta_qmap.c          | 100 +++++
>   net/sched/act_meta_skbhash.c       |  87 ++++
>   net/sched/act_meta_skbprio.c       |  80 ++++

Splitting these set/get functions into individual modules where you only
set/get a single skb member seems overkill to me. Could be done with a
simple switch statement inside ife?

>   9 files changed, 1348 insertions(+)
>   create mode 100644 include/net/tc_act/tc_ife.h
>   create mode 100644 include/uapi/linux/tc_act/tc_ife.h
>   create mode 100644 net/sched/act_ife.c
>   create mode 100644 net/sched/act_meta_mark.c
>   create mode 100644 net/sched/act_meta_qmap.c
>   create mode 100644 net/sched/act_meta_skbhash.c
>   create mode 100644 net/sched/act_meta_skbprio.c

Thanks,
Daniel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [net-next PATCH 4/5] Support to encoding decoding skb hashid on IFE action
  2016-02-22 13:21 ` [net-next PATCH 4/5] Support to encoding decoding skb hashid " Jamal Hadi Salim
@ 2016-02-22 16:56   ` Daniel Borkmann
  0 siblings, 0 replies; 27+ messages in thread
From: Daniel Borkmann @ 2016-02-22 16:56 UTC (permalink / raw)
  To: Jamal Hadi Salim, davem; +Cc: netdev, xiyou.wangcong, alexei.starovoitov

On 02/22/2016 02:21 PM, Jamal Hadi Salim wrote:
[...]
> diff --git a/net/sched/act_meta_skbhash.c b/net/sched/act_meta_skbhash.c
> new file mode 100644
> index 0000000..c3140ea
> --- /dev/null
> +++ b/net/sched/act_meta_skbhash.c
> @@ -0,0 +1,87 @@
> +/*
> + * net/sched/act_meta_skbhash.c IFE skb->hash metadata module
> + *
> + *		This program is free software; you can redistribute it and/or
> + *		modify it under the terms of the GNU General Public License
> + *		as published by the Free Software Foundation; either version
> + *		2 of the License, or (at your option) any later version.
> + *
> + * copyright 	Jamal Hadi Salim (2015)
> + *
> +*/
> +
> +#include <linux/types.h>
> +#include <linux/kernel.h>
> +#include <linux/string.h>
> +#include <linux/errno.h>
> +#include <linux/skbuff.h>
> +#include <linux/rtnetlink.h>
> +#include <linux/module.h>
> +#include <linux/init.h>
> +#include <net/netlink.h>
> +#include <net/pkt_sched.h>
> +#include <uapi/linux/tc_act/tc_ife.h>
> +#include <net/tc_act/tc_ife.h>
> +
> +static int skbhash_check(struct sk_buff *skb, struct tcf_meta_info *e)
> +{
> +	u32 skbhash = skb->hash;
> +
> +	if (e->metaval) {
> +		skbhash = *(u32 *)e->metaval;
> +	}
> +	if (!skbhash)
> +		return 0;
> +
> +	return 8;

What's magic number 8?

> +}
> +
> +static int skbhash_encode(struct sk_buff *skb, void *skbdata,
> +			  struct tcf_meta_info *e)
> +{
> +	u32 skbhash = skb->hash;
> +
> +	return encode_meta_u32(skbhash, skbdata, e);
> +}
> +
> +static int skbhash_decode(struct sk_buff *skb, void *data, u16 len)
> +{
> +	u32 skbhash = *(u32 *) data;
> +
> +	skb->hash = ntohl(skbhash);

Depending on your scenario/use case, the next skb_get_hash() call
could overwrite what you've transferred and set here.

> +	return 0;
> +}
> +
> +static struct tcf_meta_ops ife_skbhash_ops = {
> +	.metaid = IFE_META_HASHID,
> +	.metatype = NLA_U32,
> +	.name = "skbhash",
> +	.synopsis = "skb hash metadata",

MODULE_DESCRIPTION()?

> +	.check_presence = skbhash_check,
> +	.encode = skbhash_encode,
> +	.decode = skbhash_decode,
> +	.get = get_meta_u32,
> +	.alloc = alloc_meta_u32,
> +	.owner = THIS_MODULE,
> +};
> +
> +static int __init ifeskbhash_init_module(void)
> +{
> +	pr_info("Loaded IFE skbhash\n");

Leftover pr_info()?

> +	return register_ife_op(&ife_skbhash_ops);
> +}
> +
> +static void __exit ifeskbhash_cleanup_module(void)
> +{
> +	pr_info("Unloaded IFE skb hash\n");

Ditto?

> +	unregister_ife_op(&ife_skbhash_ops);
> +}
> +
> +module_init(ifeskbhash_init_module);
> +module_exit(ifeskbhash_cleanup_module);
> +
> +MODULE_AUTHOR("Jamal Hadi Salim(2015)");
> +MODULE_DESCRIPTION("Inter-FE skb hash meta action");
> +MODULE_LICENSE("GPL");
> +MODULE_ALIAS_IFE_META(IFE_META_HASHID);
>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [net-next PATCH 5/5] Support to encoding decoding skb queue map on IFE action
  2016-02-22 13:21 ` [net-next PATCH 5/5] Support to encoding decoding skb queue map " Jamal Hadi Salim
@ 2016-02-22 16:59   ` Daniel Borkmann
  2016-02-22 21:03   ` John Fastabend
  1 sibling, 0 replies; 27+ messages in thread
From: Daniel Borkmann @ 2016-02-22 16:59 UTC (permalink / raw)
  To: Jamal Hadi Salim, davem; +Cc: netdev, xiyou.wangcong, alexei.starovoitov

On 02/22/2016 02:21 PM, Jamal Hadi Salim wrote:
[...]
> Note: as of the time of submission skbedit of queue map doesnt work
> (just in case you try to use it)

Yep, setting queue mapping doesn't work. ;)

> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
> ---
>   net/sched/Kconfig         |   5 +++
>   net/sched/Makefile        |   1 +
>   net/sched/act_meta_qmap.c | 100 ++++++++++++++++++++++++++++++++++++++++++++++
[...]
> +static int __init ifeqmap_init_module(void)
> +{
> +	pr_emerg("Loaded IFE qmap\n");

pr_emerg() means "system is unusable" ... ;)

> +	return register_ife_op(&ife_qmap_ops);
> +}
> +
> +static void __exit ifeqmap_cleanup_module(void)
> +{
> +	pr_emerg("Unloaded IFE qmap\n");
> +	unregister_ife_op(&ife_qmap_ops);
> +}
> +
> +module_init(ifeqmap_init_module);
> +module_exit(ifeqmap_cleanup_module);
> +
> +MODULE_AUTHOR("Jamal Hadi Salim(2015)");
> +MODULE_DESCRIPTION("Inter-FE skb qmap metadata action");
> +MODULE_LICENSE("GPL");
> +MODULE_ALIAS_IFE_META(IFE_META_QMAP);
>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [net-next PATCH 3/5] Support to encoding decoding skb prio on IFE action
  2016-02-22 13:21 ` [net-next PATCH 3/5] Support to encoding decoding skb prio " Jamal Hadi Salim
@ 2016-02-22 17:01   ` Daniel Borkmann
  0 siblings, 0 replies; 27+ messages in thread
From: Daniel Borkmann @ 2016-02-22 17:01 UTC (permalink / raw)
  To: Jamal Hadi Salim, davem; +Cc: netdev, xiyou.wangcong, alexei.starovoitov

On 02/22/2016 02:21 PM, Jamal Hadi Salim wrote:
> From: Jamal Hadi Salim <jhs@mojatatu.com>
[...]
> +static int skbprio_check(struct sk_buff *skb, struct tcf_meta_info *e)
> +{
> +	return check_meta_u32(skb->priority, e);
> +}
> +
> +static int skbprio_encode(struct sk_buff *skb, void *skbdata,
> +			  struct tcf_meta_info *e)
> +{
> +	u32 ifeprio = skb->priority; /* avoid having to cast skb->priority*/
> +
> +	pr_emerg("encode skbprio %d\n", ifeprio);

Ditto ... and this for every packet on encap and decap ...

> +	return encode_meta_u32(ifeprio, skbdata, e);
> +}
> +
> +static int skbprio_decode(struct sk_buff *skb, void *data, u16 len)
> +{
> +	u32 ifeprio = *(u32 *) data;
> +
> +	skb->priority = ntohl(ifeprio);
> +	pr_emerg("decode skbprio %d\n", skb->priority);
> +	return 0;
> +}
> +
> +static struct tcf_meta_ops ife_prio_ops = {
> +	.metaid = IFE_META_PRIO,
> +	.metatype = NLA_U32,
> +	.name = "skbprio",
> +	.synopsis = "skb prio metadata",
> +	.check_presence = skbprio_check,
> +	.encode = skbprio_encode,
> +	.decode = skbprio_decode,
> +	.get = get_meta_u32,
> +	.alloc = alloc_meta_u32,
> +	.owner = THIS_MODULE,
> +};
> +
> +static int __init ifeprio_init_module(void)
> +{
> +	pr_info("Loaded IFE skb prio\n");
> +	return register_ife_op(&ife_prio_ops);
> +}
> +
> +static void __exit ifeprio_cleanup_module(void)
> +{
> +	unregister_ife_op(&ife_prio_ops);
> +	pr_info("Unloaded IFE skb prio\n");
> +}
> +
> +module_init(ifeprio_init_module);
> +module_exit(ifeprio_cleanup_module);
> +
> +MODULE_AUTHOR("Jamal Hadi Salim(2015)");
> +MODULE_DESCRIPTION("Inter-FE skb prio metadata action");
> +MODULE_LICENSE("GPL");
> +MODULE_ALIAS_IFE_META(IFE_META_PRIO);
>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [net-next PATCH 5/5] Support to encoding decoding skb queue map on IFE action
  2016-02-22 13:21 ` [net-next PATCH 5/5] Support to encoding decoding skb queue map " Jamal Hadi Salim
  2016-02-22 16:59   ` Daniel Borkmann
@ 2016-02-22 21:03   ` John Fastabend
  2016-02-23 12:17     ` Jamal Hadi Salim
  1 sibling, 1 reply; 27+ messages in thread
From: John Fastabend @ 2016-02-22 21:03 UTC (permalink / raw)
  To: Jamal Hadi Salim, davem
  Cc: netdev, daniel, xiyou.wangcong, alexei.starovoitov

On 16-02-22 05:21 AM, Jamal Hadi Salim wrote:
> From: Jamal Hadi Salim <jhs@mojatatu.com>
> 
> hard code static value of 10 for qmap
> mark of 12
> prio of 13
> and hashid of 11
> 
> sudo tc filter add dev $ETH parent 1: protocol ip prio 10 \
> u32 match ip protocol 1 0xff flowid 1:2 \
> action ife encode \
> type 0xDEAD \
> use mark 12 \
> use prio 13 \
> use hashid 11 \
> use qmap 10 \
> dst 02:15:15:15:15:15
> 
> Note: as of the time of submission skbedit of queue map doesnt work
> (just in case you try to use it)
> 
> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
> ---

Well the skbedit queue_mapping action does work I'm just guessing it
is not working as you expect? We probably haven't done a good job
explaining how to set it up.

If you set it on the clsact egress filter chain for example it will
map traffic to a queue if you disable XPS and get sk_tx_queue() to
return -1. This is because XPS and socket mappings have a higher
precedence in queue selection.

Anyways just tested on net-next and it works, the following
puts ip traffic with src 15.0.0.1 on hardware queue 6,

./tc/tc qdisc add dev eth4 clsact
./tc/tc filter add dev eth4 egress protocol ip \
         u32 ht 800: order 1 \
         match ip src 15.0.0.1/32 \
        action skbedit queue_mapping 6

Thanks,
.John

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [net-next PATCH 0/5] net_sched: Add support for IFE action
  2016-02-22 13:21 [net-next PATCH 0/5] net_sched: Add support for IFE action Jamal Hadi Salim
                   ` (5 preceding siblings ...)
  2016-02-22 16:47 ` [net-next PATCH 0/5] net_sched: Add support for " Daniel Borkmann
@ 2016-02-23  7:00 ` Cong Wang
  2016-02-23 12:18   ` Jamal Hadi Salim
  6 siblings, 1 reply; 27+ messages in thread
From: Cong Wang @ 2016-02-23  7:00 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: David Miller, Linux Kernel Network Developers, Daniel Borkmann,
	Alexei Starovoitov

On Mon, Feb 22, 2016 at 5:21 AM, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> From: Jamal Hadi Salim <jhs@mojatatu.com>
>
>
> As agreed at netconf in Seville, here's the patch finally (1 year
> was just too long to wait).
> Described in netdev01 paper:
>             "Distributing Linux Traffic Control Classifier-Action Subsystem"
>              Authors: Jamal Hadi Salim and Damascene M. Joachimpillai

It would be nice if you can put a paragraph summary here, I am too lazy
to read through the whole paper and the IETF drafts. ;)

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [net-next PATCH 0/5] net_sched: Add support for IFE action
  2016-02-22 16:47 ` [net-next PATCH 0/5] net_sched: Add support for " Daniel Borkmann
@ 2016-02-23 12:09   ` Jamal Hadi Salim
  2016-02-23 13:20     ` Daniel Borkmann
  0 siblings, 1 reply; 27+ messages in thread
From: Jamal Hadi Salim @ 2016-02-23 12:09 UTC (permalink / raw)
  To: Daniel Borkmann, davem; +Cc: netdev, xiyou.wangcong, alexei.starovoitov

Hi Daniel,

On 16-02-22 11:47 AM, Daniel Borkmann wrote:
> Hi Jamal,
>
> On 02/22/2016 02:21 PM, Jamal Hadi Salim wrote:
>> From: Jamal Hadi Salim <jhs@mojatatu.com>
>>
>>
>> As agreed at netconf in Seville, here's the patch finally (1 year
>> was just too long to wait).
>> Described in netdev01 paper:
>>              "Distributing Linux Traffic Control Classifier-Action
>> Subsystem"
>>               Authors: Jamal Hadi Salim and Damascene M. Joachimpillai
>>
>> Allows for incremental updates for new metadatum support.
>> This patch set includes support for basic skb metadatum
>> Followup patches will have more examples of metadata
>
> So, basically this is a L2 encap with TLVs, right?
>
> And as TLVs you have skb->mark, skb->priority, skb->hash,
> skb->queue_mapping
> that you transfer from one machine to another, where on the destination,
> you
> are applying the above meta data to the skb itself. And, configuration
> is via
> tc.
>
> I couldn't parse from the commit log what the real world use case is, resp.
> who is going to use this infrastructure?
>
> Do you have some typical setup, where the above needs to be transferred
> in the
> encap and restored?
>

I am assuming you are asking this for the sake of people who dont
have context (and not for yourself)?
I added a pointer to the paper. It is 6 pages and simple to read.
Isnt that sufficient? I dont want to write a novel here. Some could
argue that in fact i am already writing a novel in commit 1/5.

>> Jamal Hadi Salim (5):
>>    introduce IFE action
>>    Support to encoding decoding skb mark on IFE action
>>    Support to encoding decoding skb prio on IFE action
>>    Support to encoding decoding skb hashid on IFE action
>>    Support to encoding decoding skb queue map on IFE action
>>
>>   include/net/tc_act/tc_ife.h        |  60 +++
>>   include/uapi/linux/tc_act/tc_ife.h |  38 ++
>>   net/sched/Kconfig                  |  32 ++
>>   net/sched/Makefile                 |   5 +
>>   net/sched/act_ife.c                | 865
>> +++++++++++++++++++++++++++++++++++++
>>   net/sched/act_meta_mark.c          |  81 ++++
>>   net/sched/act_meta_qmap.c          | 100 +++++
>>   net/sched/act_meta_skbhash.c       |  87 ++++
>>   net/sched/act_meta_skbprio.c       |  80 ++++
>
> Splitting these set/get functions into individual modules where you only
> set/get a single skb member seems overkill to me. Could be done with a
> simple switch statement inside ife?
>

They need to be separated to make them unique. These are basic
metadatum; I have a few others lined up - but i just wanted to start
with these because they are obvious to see. What i mulled over is
to send one big patch or several. In the end it seemed cleaner to
send separate patches.

Thanks for your other input - I will redo, test and submit v2.

cheers,
jamal

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [net-next PATCH 5/5] Support to encoding decoding skb queue map on IFE action
  2016-02-22 21:03   ` John Fastabend
@ 2016-02-23 12:17     ` Jamal Hadi Salim
  2016-02-23 19:33       ` John Fastabend
  0 siblings, 1 reply; 27+ messages in thread
From: Jamal Hadi Salim @ 2016-02-23 12:17 UTC (permalink / raw)
  To: John Fastabend, davem; +Cc: netdev, daniel, xiyou.wangcong, alexei.starovoitov

On 16-02-22 04:03 PM, John Fastabend wrote:
> On 16-02-22 05:21 AM, Jamal Hadi Salim wrote:
>> From: Jamal Hadi Salim <jhs@mojatatu.com>
>>
>> hard code static value of 10 for qmap
>> mark of 12
>> prio of 13
>> and hashid of 11
>>
>> sudo tc filter add dev $ETH parent 1: protocol ip prio 10 \
>> u32 match ip protocol 1 0xff flowid 1:2 \
>> action ife encode \
>> type 0xDEAD \
>> use mark 12 \
>> use prio 13 \
>> use hashid 11 \
>> use qmap 10 \
>> dst 02:15:15:15:15:15
>>
>> Note: as of the time of submission skbedit of queue map doesnt work
>> (just in case you try to use it)
>>
>> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
>> ---
>
> Well the skbedit queue_mapping action does work I'm just guessing it
> is not working as you expect? We probably haven't done a good job
> explaining how to set it up.
>
> If you set it on the clsact egress filter chain for example it will
> map traffic to a queue if you disable XPS and get sk_tx_queue() to
> return -1. This is because XPS and socket mappings have a higher
> precedence in queue selection.
>

Ok, I am going to use your comment in the commit log.
When i have time i will test it.

> Anyways just tested on net-next and it works, the following
> puts ip traffic with src 15.0.0.1 on hardware queue 6,
>
> ./tc/tc qdisc add dev eth4 clsact
> ./tc/tc filter add dev eth4 egress protocol ip \
>           u32 ht 800: order 1 \
>           match ip src 15.0.0.1/32 \
>          action skbedit queue_mapping 6
>



I used a asimilar example:

sudo tc filter add dev $ETH parent 1: protocol ip prio 10 \
u32 match ip protocol 1 0xff flowid 1:2 \
action skbedit queue_mapping 49

BTW: You didnt have flow/classid in your example rule;
weird things could happen when you dont have one (unfortunately
we dont check at user space).

cheers,
jamal

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [net-next PATCH 0/5] net_sched: Add support for IFE action
  2016-02-23  7:00 ` Cong Wang
@ 2016-02-23 12:18   ` Jamal Hadi Salim
  0 siblings, 0 replies; 27+ messages in thread
From: Jamal Hadi Salim @ 2016-02-23 12:18 UTC (permalink / raw)
  To: Cong Wang
  Cc: David Miller, Linux Kernel Network Developers, Daniel Borkmann,
	Alexei Starovoitov

On 16-02-23 02:00 AM, Cong Wang wrote:
> On Mon, Feb 22, 2016 at 5:21 AM, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
>> From: Jamal Hadi Salim <jhs@mojatatu.com>
>>
>>
>> As agreed at netconf in Seville, here's the patch finally (1 year
>> was just too long to wait).
>> Described in netdev01 paper:
>>              "Distributing Linux Traffic Control Classifier-Action Subsystem"
>>               Authors: Jamal Hadi Salim and Damascene M. Joachimpillai
>
> It would be nice if you can put a paragraph summary here, I am too lazy
> to read through the whole paper and the IETF drafts. ;)
>

It is an easy paper Cong. Print it and read it as a bed time story ;->
Dont bother with the IETF material.
Ok, I give up - based on your and Daniel's comments i will add a
paragraph; likely cutnpasted from the paper.

cheers,
jamal

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [net-next PATCH 0/5] net_sched: Add support for IFE action
  2016-02-23 12:09   ` Jamal Hadi Salim
@ 2016-02-23 13:20     ` Daniel Borkmann
  2016-02-23 14:28       ` Jamal Hadi Salim
  0 siblings, 1 reply; 27+ messages in thread
From: Daniel Borkmann @ 2016-02-23 13:20 UTC (permalink / raw)
  To: Jamal Hadi Salim, davem; +Cc: netdev, xiyou.wangcong, alexei.starovoitov

On 02/23/2016 01:09 PM, Jamal Hadi Salim wrote:
> On 16-02-22 11:47 AM, Daniel Borkmann wrote:
[...]
>> So, basically this is a L2 encap with TLVs, right?
>>
>> And as TLVs you have skb->mark, skb->priority, skb->hash,
>> skb->queue_mapping
>> that you transfer from one machine to another, where on the destination,
>> you
>> are applying the above meta data to the skb itself. And, configuration
>> is via
>> tc.
>>
>> I couldn't parse from the commit log what the real world use case is, resp.
>> who is going to use this infrastructure?
>>
>> Do you have some typical setup, where the above needs to be transferred
>> in the
>> encap and restored?
>
> I am assuming you are asking this for the sake of people who dont
> have context (and not for yourself)?
> I added a pointer to the paper. It is 6 pages and simple to read.
> Isnt that sufficient? I dont want to write a novel here. Some could
> argue that in fact i am already writing a novel in commit 1/5.

Ok, the paper talks about, quote:

  - Pipeline-stage Indexing.
  - OAM information - example turn on some packet debug information on a need basis.
  - Exception handling information - example VXLAN service handling.
  - Authentication and authorization information.
  - Versioning information.
  - Compliance information.
  - Service Identifiers.

As your primary examples, you provide skb->mark, skb->priority, skb->hash,
skb->queue_mapping for encapsulating, f.e. how do you use skb>hash in this
scenario? What's the use-case?

>>> Jamal Hadi Salim (5):
>>>    introduce IFE action
>>>    Support to encoding decoding skb mark on IFE action
>>>    Support to encoding decoding skb prio on IFE action
>>>    Support to encoding decoding skb hashid on IFE action
>>>    Support to encoding decoding skb queue map on IFE action
>>>
>>>   include/net/tc_act/tc_ife.h        |  60 +++
>>>   include/uapi/linux/tc_act/tc_ife.h |  38 ++
>>>   net/sched/Kconfig                  |  32 ++
>>>   net/sched/Makefile                 |   5 +
>>>   net/sched/act_ife.c                | 865
>>> +++++++++++++++++++++++++++++++++++++
>>>   net/sched/act_meta_mark.c          |  81 ++++
>>>   net/sched/act_meta_qmap.c          | 100 +++++
>>>   net/sched/act_meta_skbhash.c       |  87 ++++
>>>   net/sched/act_meta_skbprio.c       |  80 ++++
>>
>> Splitting these set/get functions into individual modules where you only
>> set/get a single skb member seems overkill to me. Could be done with a
>> simple switch statement inside ife?
>
> They need to be separated to make them unique. These are basic
> metadatum; I have a few others lined up - but i just wanted to start
> with these because they are obvious to see. What i mulled over is
> to send one big patch or several. In the end it seemed cleaner to
> send separate patches.

But just to make them unique, you don't need to add extra modules for
this ... just having a module for encoding one skb member seems like
total overdesign to me. If you really want them to be separate ops,
you can still include them into act_ife.c itself.

Thanks,
Daniel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [net-next PATCH 0/5] net_sched: Add support for IFE action
  2016-02-23 13:20     ` Daniel Borkmann
@ 2016-02-23 14:28       ` Jamal Hadi Salim
  2016-02-23 15:34         ` Daniel Borkmann
  0 siblings, 1 reply; 27+ messages in thread
From: Jamal Hadi Salim @ 2016-02-23 14:28 UTC (permalink / raw)
  To: Daniel Borkmann, davem; +Cc: netdev, xiyou.wangcong, alexei.starovoitov

On 16-02-23 08:20 AM, Daniel Borkmann wrote:
> On 02/23/2016 01:09 PM, Jamal Hadi Salim wrote:
>> On 16-02-22 11:47 AM, Daniel Borkmann wrote:
> [...]
>>> So, basically this is a L2 encap with TLVs, right?
>>>
>>> And as TLVs you have skb->mark, skb->priority, skb->hash,
>>> skb->queue_mapping
>>> that you transfer from one machine to another, where on the destination,
>>> you
>>> are applying the above meta data to the skb itself. And, configuration
>>> is via
>>> tc.
>>>
>>> I couldn't parse from the commit log what the real world use case is,
>>> resp.
>>> who is going to use this infrastructure?
>>>
>>> Do you have some typical setup, where the above needs to be transferred
>>> in the
>>> encap and restored?
>>
>> I am assuming you are asking this for the sake of people who dont
>> have context (and not for yourself)?
>> I added a pointer to the paper. It is 6 pages and simple to read.
>> Isnt that sufficient? I dont want to write a novel here. Some could
>> argue that in fact i am already writing a novel in commit 1/5.
>
> Ok, the paper talks about, quote:
>
>   - Pipeline-stage Indexing.
>   - OAM information - example turn on some packet debug information on a
> need basis.
>   - Exception handling information - example VXLAN service handling.
>   - Authentication and authorization information.
>   - Versioning information.
>   - Compliance information.
>   - Service Identifiers.
>

Which is now added to the cover letter.

> As your primary examples, you provide skb->mark, skb->priority, skb->hash,
> skb->queue_mapping for encapsulating, f.e. how do you use skb>hash in this
> scenario? What's the use-case?

These are basic metadata. The question to ask is what could one use
skb->hash for. Today it is used to select a cpu to balance to.
I dont understand what your concern is: I could pass a string
as metadata if i wanted to. I chose to pass skb metadata;
some of which i actually use. Only one i dont use is queue
map - but because it is still an skb metadata so i wrote code
to pass it.

>> They need to be separated to make them unique. These are basic
>> metadatum; I have a few others lined up - but i just wanted to start
>> with these because they are obvious to see. What i mulled over is
>> to send one big patch or several. In the end it seemed cleaner to
>> send separate patches.
>
> But just to make them unique, you don't need to add extra modules for
> this ... just having a module for encoding one skb member seems like
> total overdesign to me. If you really want them to be separate ops,
> you can still include them into act_ife.c itself.
>

These things will evolve. I would rather leave them the way they
are for that reason and they serve as good examples of how one
would new metadatum.
Example: skb->mark may end up evolving to include masks.

cheers,
jamal

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [net-next PATCH 0/5] net_sched: Add support for IFE action
  2016-02-23 14:28       ` Jamal Hadi Salim
@ 2016-02-23 15:34         ` Daniel Borkmann
  2016-02-24 12:49           ` Jamal Hadi Salim
  0 siblings, 1 reply; 27+ messages in thread
From: Daniel Borkmann @ 2016-02-23 15:34 UTC (permalink / raw)
  To: Jamal Hadi Salim, davem; +Cc: netdev, xiyou.wangcong, alexei.starovoitov

On 02/23/2016 03:28 PM, Jamal Hadi Salim wrote:
> On 16-02-23 08:20 AM, Daniel Borkmann wrote:
>> On 02/23/2016 01:09 PM, Jamal Hadi Salim wrote:
>>> On 16-02-22 11:47 AM, Daniel Borkmann wrote:
>> [...]
>>>> So, basically this is a L2 encap with TLVs, right?
>>>>
>>>> And as TLVs you have skb->mark, skb->priority, skb->hash,
>>>> skb->queue_mapping
>>>> that you transfer from one machine to another, where on the destination,
>>>> you
>>>> are applying the above meta data to the skb itself. And, configuration
>>>> is via
>>>> tc.
>>>>
>>>> I couldn't parse from the commit log what the real world use case is,
>>>> resp.
>>>> who is going to use this infrastructure?
>>>>
>>>> Do you have some typical setup, where the above needs to be transferred
>>>> in the
>>>> encap and restored?
>>>
>>> I am assuming you are asking this for the sake of people who dont
>>> have context (and not for yourself)?
>>> I added a pointer to the paper. It is 6 pages and simple to read.
>>> Isnt that sufficient? I dont want to write a novel here. Some could
>>> argue that in fact i am already writing a novel in commit 1/5.
>>
>> Ok, the paper talks about, quote:
>>
>>   - Pipeline-stage Indexing.
>>   - OAM information - example turn on some packet debug information on a
>> need basis.
>>   - Exception handling information - example VXLAN service handling.
>>   - Authentication and authorization information.
>>   - Versioning information.
>>   - Compliance information.
>>   - Service Identifiers.
>
> Which is now added to the cover letter.
>
>> As your primary examples, you provide skb->mark, skb->priority, skb->hash,
>> skb->queue_mapping for encapsulating, f.e. how do you use skb>hash in this
>> scenario? What's the use-case?
>
> These are basic metadata. The question to ask is what could one use
> skb->hash for. Today it is used to select a cpu to balance to.

Right, but that happens before you decode that information from your TLV
on ingress qdisc. And any subsequent skb_get_hash() to read out skb->hash
will effectively overwrite what you set there and call into flow dissector.

> I dont understand what your concern is: I could pass a string
> as metadata if i wanted to. I chose to pass skb metadata;
> some of which i actually use. Only one i dont use is queue
> map - but because it is still an skb metadata so i wrote code
> to pass it.

But if it's not used by anyone currently, why add it? ;) Better to only
add if there's a real demand for usage, otherwise it's code that no-one
uses and distros needlessly ship it to everyone.

>>> They need to be separated to make them unique. These are basic
>>> metadatum; I have a few others lined up - but i just wanted to start
>>> with these because they are obvious to see. What i mulled over is
>>> to send one big patch or several. In the end it seemed cleaner to
>>> send separate patches.
>>
>> But just to make them unique, you don't need to add extra modules for
>> this ... just having a module for encoding one skb member seems like
>> total overdesign to me. If you really want them to be separate ops,
>> you can still include them into act_ife.c itself.
>
> These things will evolve. I would rather leave them the way they
> are for that reason and they serve as good examples of how one
> would new metadatum.

My concern is we add 20 new modules like this that only do trivial things,
where instead they could have been consolidated and reduce maintenance. Or
is this hard module requirement related to the IFE_META_* module parameter?

> Example: skb->mark may end up evolving to include masks.

Thanks,
Daniel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [net-next PATCH 5/5] Support to encoding decoding skb queue map on IFE action
  2016-02-23 12:17     ` Jamal Hadi Salim
@ 2016-02-23 19:33       ` John Fastabend
  0 siblings, 0 replies; 27+ messages in thread
From: John Fastabend @ 2016-02-23 19:33 UTC (permalink / raw)
  To: Jamal Hadi Salim, davem
  Cc: netdev, daniel, xiyou.wangcong, alexei.starovoitov

On 16-02-23 04:17 AM, Jamal Hadi Salim wrote:
> On 16-02-22 04:03 PM, John Fastabend wrote:
>> On 16-02-22 05:21 AM, Jamal Hadi Salim wrote:
>>> From: Jamal Hadi Salim <jhs@mojatatu.com>
>>>
>>> hard code static value of 10 for qmap
>>> mark of 12
>>> prio of 13
>>> and hashid of 11
>>>
>>> sudo tc filter add dev $ETH parent 1: protocol ip prio 10 \
>>> u32 match ip protocol 1 0xff flowid 1:2 \
>>> action ife encode \
>>> type 0xDEAD \
>>> use mark 12 \
>>> use prio 13 \
>>> use hashid 11 \
>>> use qmap 10 \
>>> dst 02:15:15:15:15:15
>>>
>>> Note: as of the time of submission skbedit of queue map doesnt work
>>> (just in case you try to use it)
>>>
>>> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
>>> ---
>>
>> Well the skbedit queue_mapping action does work I'm just guessing it
>> is not working as you expect? We probably haven't done a good job
>> explaining how to set it up.
>>
>> If you set it on the clsact egress filter chain for example it will
>> map traffic to a queue if you disable XPS and get sk_tx_queue() to
>> return -1. This is because XPS and socket mappings have a higher
>> precedence in queue selection.
>>
> 
> Ok, I am going to use your comment in the commit log.
> When i have time i will test it.
> 
>> Anyways just tested on net-next and it works, the following
>> puts ip traffic with src 15.0.0.1 on hardware queue 6,
>>
>> ./tc/tc qdisc add dev eth4 clsact
>> ./tc/tc filter add dev eth4 egress protocol ip \
>>           u32 ht 800: order 1 \
>>           match ip src 15.0.0.1/32 \
>>          action skbedit queue_mapping 6
>>
> 
> 
> 
> I used a asimilar example:
> 
> sudo tc filter add dev $ETH parent 1: protocol ip prio 10 \
> u32 match ip protocol 1 0xff flowid 1:2 \
> action skbedit queue_mapping 49

ah if its not obvious setting queue_mapping from inside a qdisc
has no effect except on multiq because the tx queue has already
been selected. Maybe I need to do a man page update.

Also please don't use multiq I'm tempted to remove it now that
we have mq and mqprio with clsact there is no reason to use it.

> 
> BTW: You didnt have flow/classid in your example rule;
> weird things could happen when you dont have one (unfortunately
> we dont check at user space).

Its safe as far as I can tell on many qdiscs including clsact and
ingress. fq_codel is an example of a qdisc where it would break things
however.

> 
> cheers,
> jamal

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [net-next PATCH 0/5] net_sched: Add support for IFE action
  2016-02-23 15:34         ` Daniel Borkmann
@ 2016-02-24 12:49           ` Jamal Hadi Salim
  2016-02-24 17:48             ` Daniel Borkmann
  2016-02-24 17:58             ` Daniel Borkmann
  0 siblings, 2 replies; 27+ messages in thread
From: Jamal Hadi Salim @ 2016-02-24 12:49 UTC (permalink / raw)
  To: Daniel Borkmann, davem; +Cc: netdev, xiyou.wangcong, alexei.starovoitov

On 16-02-23 10:34 AM, Daniel Borkmann wrote:
> On 02/23/2016 03:28 PM, Jamal Hadi Salim wrote:
[..]

>> These are basic metadata. The question to ask is what could one use
>> skb->hash for. Today it is used to select a cpu to balance to.
>
> Right, but that happens before you decode that information from your TLV
> on ingress qdisc. And any subsequent skb_get_hash() to read out skb->hash
> will effectively overwrite what you set there and call into flow dissector.
>

Drivers do set the hash. My use case is slightly different.
I have a NIC which has an embedded cavium processor. This thing
strips off the TLV and uses the hash to select the host MSI.
Only thing we dont use at the moment is queue_mapping.


> My concern is we add 20 new modules like this that only do trivial things,
> where instead they could have been consolidated and reduce maintenance. Or
> is this hard module requirement related to the IFE_META_* module parameter?
>

Yes, a bit of that ++.
I am between two worlds: There are people who do user space packet
processing that claim they do so because they can quickly prototype
without compiling the kernel. My goal is to make it easy for people
adding new metadata without having to deal with kernel recompile.
I do expect for there to be many variations of what that metadata
will be. For that reason I have them as standalone modules and they
serve the purpose to illustrate how someone would write such a module.
The IFE_META_XXX is part of saying i dont need to have people
changing the header file either. But i want them to use static
META_IDS. So the IFE module parameter is supposed to allow them to
change the upper bound of modules when insmoding ife_act so that
proper validation can happen. I cant make it as large as 32-bit
or not check if it is correct. If i take it out - then i would have to
do that or introduce some complex mechanism for registration.

cheers,
jamal

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [net-next PATCH 0/5] net_sched: Add support for IFE action
  2016-02-24 12:49           ` Jamal Hadi Salim
@ 2016-02-24 17:48             ` Daniel Borkmann
  2016-02-25 12:23               ` Jamal Hadi Salim
  2016-02-24 17:58             ` Daniel Borkmann
  1 sibling, 1 reply; 27+ messages in thread
From: Daniel Borkmann @ 2016-02-24 17:48 UTC (permalink / raw)
  To: Jamal Hadi Salim, davem; +Cc: netdev, xiyou.wangcong, alexei.starovoitov

On 02/24/2016 01:49 PM, Jamal Hadi Salim wrote:
> On 16-02-23 10:34 AM, Daniel Borkmann wrote:
>> On 02/23/2016 03:28 PM, Jamal Hadi Salim wrote:
> [..]
>
>>> These are basic metadata. The question to ask is what could one use
>>> skb->hash for. Today it is used to select a cpu to balance to.
>>
>> Right, but that happens before you decode that information from your TLV
>> on ingress qdisc. And any subsequent skb_get_hash() to read out skb->hash
>> will effectively overwrite what you set there and call into flow dissector.
>
> Drivers do set the hash. My use case is slightly different.
> I have a NIC which has an embedded cavium processor. This thing
> strips off the TLV and uses the hash to select the host MSI.
> Only thing we dont use at the moment is queue_mapping.

Ok, but the example says ingress qdisc. ;) I presume the driver for the
NIC and the offloading parts are non-public? :/ So, without them, placing
this on ingress qdisc doesn't seem much useful wrt the skb hash example,
and most people only have the software part (for ingress I mean) available.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [net-next PATCH 0/5] net_sched: Add support for IFE action
  2016-02-24 12:49           ` Jamal Hadi Salim
  2016-02-24 17:48             ` Daniel Borkmann
@ 2016-02-24 17:58             ` Daniel Borkmann
  2016-02-25 12:35               ` Jamal Hadi Salim
  1 sibling, 1 reply; 27+ messages in thread
From: Daniel Borkmann @ 2016-02-24 17:58 UTC (permalink / raw)
  To: Jamal Hadi Salim, davem; +Cc: netdev, xiyou.wangcong, alexei.starovoitov

On 02/24/2016 01:49 PM, Jamal Hadi Salim wrote:
> On 16-02-23 10:34 AM, Daniel Borkmann wrote:
[...]
>> My concern is we add 20 new modules like this that only do trivial things,
>> where instead they could have been consolidated and reduce maintenance. Or
>> is this hard module requirement related to the IFE_META_* module parameter?
>
> Yes, a bit of that ++.
> I am between two worlds: There are people who do user space packet
> processing that claim they do so because they can quickly prototype
> without compiling the kernel. My goal is to make it easy for people
> adding new metadata without having to deal with kernel recompile.

Seems like a case for cls_bpf? ;)

> I do expect for there to be many variations of what that metadata
> will be. For that reason I have them as standalone modules and they
> serve the purpose to illustrate how someone would write such a module.
> The IFE_META_XXX is part of saying i dont need to have people
> changing the header file either. But i want them to use static
> META_IDS. So the IFE module parameter is supposed to allow them to
> change the upper bound of modules when insmoding ife_act so that
> proper validation can happen. I cant make it as large as 32-bit
> or not check if it is correct. If i take it out - then i would have to
> do that or introduce some complex mechanism for registration.

Ok, sure, given the assumption that this is only to be used in your own
fully _controlled_ environment anyway. But in that case, you don't even
need to define any fixed IDs. Currently it seems like you could have
different kernel versions with different IFE_META_MAX from the kernel
headers and external modules define part of the ID space differently?

Thanks,
Daniel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [net-next PATCH 0/5] net_sched: Add support for IFE action
  2016-02-24 17:48             ` Daniel Borkmann
@ 2016-02-25 12:23               ` Jamal Hadi Salim
  2016-02-25 21:34                 ` Daniel Borkmann
  0 siblings, 1 reply; 27+ messages in thread
From: Jamal Hadi Salim @ 2016-02-25 12:23 UTC (permalink / raw)
  To: Daniel Borkmann, davem; +Cc: netdev, xiyou.wangcong, alexei.starovoitov

On 16-02-24 12:48 PM, Daniel Borkmann wrote:
> On 02/24/2016 01:49 PM, Jamal Hadi Salim wrote:

>> Drivers do set the hash. My use case is slightly different.
>> I have a NIC which has an embedded cavium processor. This thing
>> strips off the TLV and uses the hash to select the host MSI.
>> Only thing we dont use at the moment is queue_mapping.
>
> Ok, but the example says ingress qdisc. ;) I presume the driver for the
> NIC and the offloading parts are non-public? :/

Well, it is not exactly mellanox or intel - but I am sure someone would
be willing to sell that NIC.

> So, without them, placing
> this on ingress qdisc doesn't seem much useful wrt the skb hash example,
> and most people only have the software part (for ingress I mean) available.

Do you want me to take skbbhash out? I just want to get this set in then
worry about adding and modifying.

cheers,
jamal

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [net-next PATCH 0/5] net_sched: Add support for IFE action
  2016-02-24 17:58             ` Daniel Borkmann
@ 2016-02-25 12:35               ` Jamal Hadi Salim
  0 siblings, 0 replies; 27+ messages in thread
From: Jamal Hadi Salim @ 2016-02-25 12:35 UTC (permalink / raw)
  To: Daniel Borkmann, davem; +Cc: netdev, xiyou.wangcong, alexei.starovoitov

On 16-02-24 12:58 PM, Daniel Borkmann wrote:
> On 02/24/2016 01:49 PM, Jamal Hadi Salim wrote:

>> Yes, a bit of that ++.
>> I am between two worlds: There are people who do user space packet
>> processing that claim they do so because they can quickly prototype
>> without compiling the kernel. My goal is to make it easy for people
>> adding new metadata without having to deal with kernel recompile.
>
> Seems like a case for cls_bpf? ;)
>

Yes, and ebpf is a good answer in a few such discussions these days.

>
> Ok, sure, given the assumption that this is only to be used in your own
> fully _controlled_ environment anyway.

It is - typically in the same rack; but could be across to another
room.


> But in that case, you don't even
> need to define any fixed IDs. Currently it seems like you could have
> different kernel versions with different IFE_META_MAX from the kernel
> headers and external modules define part of the ID space differently?
>

That is part of the problem.
Dynamic registration is not going to work well without some
human supervision. We are talking about large number of endpoints.
One machine cannot guarantee to be interested in the same metadata
as others and as you say they could be different kernels etc.
Even within same organization across different groups would require
some  coordination.

I will take that module param out and maybe describe a set of IDs that
are for "private use" and let whoever is deploying worry about
coordination. For the rest i will just have some LANANA or IANA
or kernel maintainance issues them. I should declare all the obvious
kernel ones.

cheers,
jamal

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [net-next PATCH 0/5] net_sched: Add support for IFE action
  2016-02-25 12:23               ` Jamal Hadi Salim
@ 2016-02-25 21:34                 ` Daniel Borkmann
  2016-02-25 22:40                   ` Jamal Hadi Salim
  0 siblings, 1 reply; 27+ messages in thread
From: Daniel Borkmann @ 2016-02-25 21:34 UTC (permalink / raw)
  To: Jamal Hadi Salim, davem; +Cc: netdev, xiyou.wangcong, alexei.starovoitov

On 02/25/2016 01:23 PM, Jamal Hadi Salim wrote:
> On 16-02-24 12:48 PM, Daniel Borkmann wrote:
>> On 02/24/2016 01:49 PM, Jamal Hadi Salim wrote:
[...]
>>> Drivers do set the hash. My use case is slightly different.
>>> I have a NIC which has an embedded cavium processor. This thing
>>> strips off the TLV and uses the hash to select the host MSI.
>>> Only thing we dont use at the moment is queue_mapping.
>>
>> Ok, but the example says ingress qdisc. ;) I presume the driver for the
>> NIC and the offloading parts are non-public? :/
>
> Well, it is not exactly mellanox or intel - but I am sure someone would
> be willing to sell that NIC.
>
>> So, without them, placing
>> this on ingress qdisc doesn't seem much useful wrt the skb hash example,
>> and most people only have the software part (for ingress I mean) available.
>
> Do you want me to take skbbhash out? I just want to get this set in then
> worry about adding and modifying.

Well, if the NIC driver and offloading parts for ife are not available
(or cannot be made available) in the upstream kernel tree, then you have
to assume that everyone else can _only_ use this with the existing tc
facilities in the kernel. And as such, the whole set needs to be evaluated,
I think this is nothing new. ;-)

That is why I asked for a "typical setup" and use-case earlier. And if
it doesn't have any obvious ones in upstream context without the missing
pieces, then the code might linger around unused by anyone. So taking out
these parts would be highly preferred (unless there's a _good_ reason not
to).

Thanks,
Daniel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [net-next PATCH 0/5] net_sched: Add support for IFE action
  2016-02-25 21:34                 ` Daniel Borkmann
@ 2016-02-25 22:40                   ` Jamal Hadi Salim
  2016-02-26  0:03                     ` Daniel Borkmann
  0 siblings, 1 reply; 27+ messages in thread
From: Jamal Hadi Salim @ 2016-02-25 22:40 UTC (permalink / raw)
  To: Daniel Borkmann, davem; +Cc: netdev, xiyou.wangcong, alexei.starovoitov

On 16-02-25 04:34 PM, Daniel Borkmann wrote:
> On 02/25/2016 01:23 PM, Jamal Hadi Salim wrote:
>> On 16-02-24 12:48 PM, Daniel Borkmann wrote:
>>> On 02/24/2016 01:49 PM, Jamal Hadi Salim wrote:
> [...]
>>>> Drivers do set the hash. My use case is slightly different.
>>>> I have a NIC which has an embedded cavium processor. This thing
>>>> strips off the TLV and uses the hash to select the host MSI.
>>>> Only thing we dont use at the moment is queue_mapping.
>>>
>>> Ok, but the example says ingress qdisc. ;) I presume the driver for the
>>> NIC and the offloading parts are non-public? :/
>>
>> Well, it is not exactly mellanox or intel - but I am sure someone would
>> be willing to sell that NIC.
>>
>>> So, without them, placing
>>> this on ingress qdisc doesn't seem much useful wrt the skb hash example,
>>> and most people only have the software part (for ingress I mean)
>>> available.
>>
>> Do you want me to take skbbhash out? I just want to get this set in then
>> worry about adding and modifying.
>
> Well, if the NIC driver and offloading parts for ife are not available
> (or cannot be made available)

If you are willing to pay for one i can have one sold to you.
Note: There is no dependency on such a NIC. You asked for an example
of how one would use skbhash and i pointed it out to you.
Infact nothing in the commit notes pointed to any NIC.

>in the upstream kernel tree, then you have
> to assume that everyone else can _only_ use this with the existing tc
> facilities in the kernel. And as such, the whole set needs to be evaluated,
> I think this is nothing new. ;-)
>

Seriously this is getting to a ridiculous level now.
I gave a talk - which you attended. I wrote a paper which you may have
at minimal glanced at.
I have had discussions with you on this very subject.
And as soon as i posted the patches your statements led from you
needing a good use case to why not use a netdev and why i have
all these things that are not very useful. None of which came up before.
For someone who works on ebpf - where there is plenty of code that
noone gets to see you are making rather bold statements.

> That is why I asked for a "typical setup" and use-case earlier. And if
> it doesn't have any obvious ones in upstream context without the missing
> pieces, then the code might linger around unused by anyone. So taking out
> these parts would be highly preferred (unless there's a _good_ reason not
> to).
>

So is there anything that is useful in your view?

cheers,
jamal

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [net-next PATCH 0/5] net_sched: Add support for IFE action
  2016-02-25 22:40                   ` Jamal Hadi Salim
@ 2016-02-26  0:03                     ` Daniel Borkmann
  0 siblings, 0 replies; 27+ messages in thread
From: Daniel Borkmann @ 2016-02-26  0:03 UTC (permalink / raw)
  To: Jamal Hadi Salim, davem; +Cc: netdev, xiyou.wangcong, alexei.starovoitov

On 02/25/2016 11:40 PM, Jamal Hadi Salim wrote:
> On 16-02-25 04:34 PM, Daniel Borkmann wrote:
>> On 02/25/2016 01:23 PM, Jamal Hadi Salim wrote:
>>> On 16-02-24 12:48 PM, Daniel Borkmann wrote:
>>>> On 02/24/2016 01:49 PM, Jamal Hadi Salim wrote:
>> [...]
>>>>> Drivers do set the hash. My use case is slightly different.
>>>>> I have a NIC which has an embedded cavium processor. This thing
>>>>> strips off the TLV and uses the hash to select the host MSI.
>>>>> Only thing we dont use at the moment is queue_mapping.
>>>>
>>>> Ok, but the example says ingress qdisc. ;) I presume the driver for the
>>>> NIC and the offloading parts are non-public? :/
>>>
>>> Well, it is not exactly mellanox or intel - but I am sure someone would
>>> be willing to sell that NIC.
>>>
>>>> So, without them, placing
>>>> this on ingress qdisc doesn't seem much useful wrt the skb hash example,
>>>> and most people only have the software part (for ingress I mean)
>>>> available.
>>>
>>> Do you want me to take skbbhash out? I just want to get this set in then
>>> worry about adding and modifying.
>>
>> Well, if the NIC driver and offloading parts for ife are not available
>> (or cannot be made available)
>
> If you are willing to pay for one i can have one sold to you.
> Note: There is no dependency on such a NIC. You asked for an example
> of how one would use skbhash and i pointed it out to you.
> Infact nothing in the commit notes pointed to any NIC.

Yes, but that was not the point, I think we're talking past each other. ;)
The example (or better, use-case) you provided with your NIC seems useful.
But when you want to decode this with your kernel module, f.e. attached on
ingress qdisc, it's an entirely different situation. Anyway, point is subsequent
calls to skb_get_hash() will overwrite what you decoded there, that's why I
asked and tried to point this out earlier. You may want to look at sth
like __skb_set_sw_hash() ...

>> in the upstream kernel tree, then you have
>> to assume that everyone else can _only_ use this with the existing tc
>> facilities in the kernel. And as such, the whole set needs to be evaluated,
>> I think this is nothing new. ;-)
>
> Seriously this is getting to a ridiculous level now.
> I gave a talk - which you attended. I wrote a paper which you may have
> at minimal glanced at.
> I have had discussions with you on this very subject.

I think I only suggested that in case you still don't have an ethertype
assigned to this to require the user to specify one instead of some default.

> And as soon as i posted the patches your statements led from you
> needing a good use case to why not use a netdev and why i have
> all these things that are not very useful. None of which came up before.

Fair enough, didn't see that code before and just tried to review it, probably
shouldn't have, so I'm not looking further into this anymore. ;)

Thanks anyway,
Daniel

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2016-02-26  0:03 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-02-22 13:21 [net-next PATCH 0/5] net_sched: Add support for IFE action Jamal Hadi Salim
2016-02-22 13:21 ` [net-next PATCH 1/5] introduce " Jamal Hadi Salim
2016-02-22 13:21 ` [net-next PATCH 2/5] Support to encoding decoding skb mark on " Jamal Hadi Salim
2016-02-22 13:21 ` [net-next PATCH 3/5] Support to encoding decoding skb prio " Jamal Hadi Salim
2016-02-22 17:01   ` Daniel Borkmann
2016-02-22 13:21 ` [net-next PATCH 4/5] Support to encoding decoding skb hashid " Jamal Hadi Salim
2016-02-22 16:56   ` Daniel Borkmann
2016-02-22 13:21 ` [net-next PATCH 5/5] Support to encoding decoding skb queue map " Jamal Hadi Salim
2016-02-22 16:59   ` Daniel Borkmann
2016-02-22 21:03   ` John Fastabend
2016-02-23 12:17     ` Jamal Hadi Salim
2016-02-23 19:33       ` John Fastabend
2016-02-22 16:47 ` [net-next PATCH 0/5] net_sched: Add support for " Daniel Borkmann
2016-02-23 12:09   ` Jamal Hadi Salim
2016-02-23 13:20     ` Daniel Borkmann
2016-02-23 14:28       ` Jamal Hadi Salim
2016-02-23 15:34         ` Daniel Borkmann
2016-02-24 12:49           ` Jamal Hadi Salim
2016-02-24 17:48             ` Daniel Borkmann
2016-02-25 12:23               ` Jamal Hadi Salim
2016-02-25 21:34                 ` Daniel Borkmann
2016-02-25 22:40                   ` Jamal Hadi Salim
2016-02-26  0:03                     ` Daniel Borkmann
2016-02-24 17:58             ` Daniel Borkmann
2016-02-25 12:35               ` Jamal Hadi Salim
2016-02-23  7:00 ` Cong Wang
2016-02-23 12:18   ` Jamal Hadi Salim

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.