All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next 0/8] cls_flower hardware offload support
@ 2016-03-01 14:24 Amir Vadai
  2016-03-01 14:24 ` [PATCH net-next 1/8] net/flower: Introduce " Amir Vadai
                   ` (8 more replies)
  0 siblings, 9 replies; 31+ messages in thread
From: Amir Vadai @ 2016-03-01 14:24 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Or Gerlitz, John Fastabend, Saeed Mahameed,
	Hadar Har-Zion, Jiri Pirko, Amir Vadai

Hi,

This patchset introduces cls_flower hardware offload support over ConnectX-4
driver, more hardware vendors are welcome to use it too.

This patchset is based on John's infrastructure for tc offloading [2] to add
hardware offload support to the flower filter. It also extends the support to
an additional tc action - skbedit mark operation.
NIC driver that was used is ConnectX-4. Feature is off by default and could be
turned on using ethtool.

Some commands to use this code:

export TC=../iproute2/tc/tc
export ETH=ens9

ethtool  -K ens9 hw-tc-offload on

# add an ingress qdisc
$TC qdisc add dev $ETH ingress

# Drop ICMP (ip_proto 1) packets
$TC filter add dev $ETH protocol ip prio 20 parent ffff: \
	flower ip_proto 1 \
	dst_mac 7c:fe:90:69:81:62 \
	src_mac 7c:fe:90:69:81:56 \
	dst_ip 11.11.11.11 \
	src_ip 11.11.11.12 \
	indev $ETH \
	action drop

# Mark (with 0x1234) TCP (ip_proto 6) packets
$TC filter add dev $ETH protocol ip prio 30 parent ffff: \
	flower ip_proto 6 \
	indev $ETH \
	action skbedit mark 0x1234

# A NOP software filter used to count marked packets using "tc show -s"
$TC filter add dev $ETH protocol ip prio 10 parent ffff: \
	handle 0x1234 fw action pass

The code was tested and applied on top of commit f12d33f
("3c59x: Ensure to apply the expires time") + John's pending patches [3]

Main changes from the RFC [1]:
- API
  - Using ndo_setup_tc() instead of switchdev
- act_skbedit, act_gact
  - Actions are not serialized to NIC driver, instead using access functions.
- cls_flower
  - prevent double classification by software by not adding
    successfuly offloaded filters to the hashtable
  - Fixed some bugs in original RFC with rule delete  
- mlx5
  - Adding flow table to kernel namespace instead of a new namespace
  - s/offload/tc/ in many places
  - no need for a special kconfig since switchdev is not used

Thanks,
Amir

[1] - http://permalink.gmane.org/gmane.linux.network/397064
[2] - http://permalink.gmane.org/gmane.linux.network/397045 
[3] - http://permalink.gmane.org/gmane.linux.network/401226

Amir Vadai (8):
  net/flower: Introduce hardware offload support
  net/flow_dissector: Make dissector_uses_key() and
    skb_flow_dissector_target() public
  net/act_skbedit: Utility functions for mark action
  net/mlx5_core: Set flow steering dest only for forward rules
  net/mlx5e: Add a new priority for kernel flow tables
  net/mlx5e: Introduce tc offload support
  net/mlx5e: Support offload cls_flower with drop action
  net/mlx5e: Support offload cls_flower with sskbedit mark action

 drivers/net/ethernet/mellanox/mlx5/core/Makefile  |   2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en.h      |   9 +
 drivers/net/ethernet/mellanox/mlx5/core/en_fs.c   |   4 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c |  40 ++
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c   |   3 +
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c   | 434 ++++++++++++++++++++++
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.h   |  51 +++
 drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c  |  29 +-
 drivers/net/ethernet/mellanox/mlx5/core/fs_core.c |  22 +-
 include/linux/netdevice.h                         |   2 +
 include/net/flow_dissector.h                      |  13 +
 include/net/pkt_cls.h                             |  14 +
 include/net/tc_act/tc_skbedit.h                   |  15 +
 include/uapi/linux/pkt_cls.h                      |   2 +
 net/core/flow_dissector.c                         |  13 -
 net/sched/cls_flower.c                            |  75 +++-
 16 files changed, 686 insertions(+), 42 deletions(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en_tc.h

-- 
2.7.0

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH net-next 1/8] net/flower: Introduce hardware offload support
  2016-03-01 14:24 [PATCH net-next 0/8] cls_flower hardware offload support Amir Vadai
@ 2016-03-01 14:24 ` Amir Vadai
  2016-03-01 14:47   ` Jiri Pirko
  2016-03-01 14:53   ` Jiri Pirko
  2016-03-01 14:24 ` [PATCH net-next 2/8] net/flow_dissector: Make dissector_uses_key() and skb_flow_dissector_target() public Amir Vadai
                   ` (7 subsequent siblings)
  8 siblings, 2 replies; 31+ messages in thread
From: Amir Vadai @ 2016-03-01 14:24 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Or Gerlitz, John Fastabend, Saeed Mahameed,
	Hadar Har-Zion, Jiri Pirko, Amir Vadai

This patch is based on a patch made by John Fastabend.
It adds support for offloading cls_flower.
A filter that is offloaded successfuly by hardware, will not be added to
the hashtable and won't be processed by software.

Suggested-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Amir Vadai <amir@vadai.me>
---
 include/linux/netdevice.h    |  2 ++
 include/net/pkt_cls.h        | 14 +++++++++
 include/uapi/linux/pkt_cls.h |  2 ++
 net/sched/cls_flower.c       | 75 +++++++++++++++++++++++++++++++++++++++++---
 4 files changed, 88 insertions(+), 5 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index e52077f..0fd329a 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -785,6 +785,7 @@ typedef u16 (*select_queue_fallback_t)(struct net_device *dev,
 enum {
 	TC_SETUP_MQPRIO,
 	TC_SETUP_CLSU32,
+	TC_SETUP_CLSFLOWER,
 };
 
 struct tc_cls_u32_offload;
@@ -794,6 +795,7 @@ struct tc_to_netdev {
 	union {
 		u8 tc;
 		struct tc_cls_u32_offload *cls_u32;
+		struct tc_cls_flower_offload *cls_flower;
 	};
 };
 
diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index bea14ee..beb2ee1 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -409,4 +409,18 @@ static inline bool tc_should_offload(struct net_device *dev, u32 flags)
 	return true;
 }
 
+enum {
+	TC_CLSFLOWER_REPLACE,
+	TC_CLSFLOWER_DESTROY,
+};
+
+struct tc_cls_flower_offload {
+	int command;
+	u64 cookie;
+	struct flow_dissector *dissector;
+	struct fl_flow_key *mask;
+	struct fl_flow_key *key;
+	struct tcf_exts *exts;
+};
+
 #endif
diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h
index 9874f568..c43c5f7 100644
--- a/include/uapi/linux/pkt_cls.h
+++ b/include/uapi/linux/pkt_cls.h
@@ -417,6 +417,8 @@ enum {
 	TCA_FLOWER_KEY_TCP_DST,		/* be16 */
 	TCA_FLOWER_KEY_UDP_SRC,		/* be16 */
 	TCA_FLOWER_KEY_UDP_DST,		/* be16 */
+
+	TCA_FLOWER_FLAGS,
 	__TCA_FLOWER_MAX,
 };
 
diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index 95b0212..e599bea 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -165,6 +165,53 @@ static void fl_destroy_filter(struct rcu_head *head)
 	kfree(f);
 }
 
+static int fl_hw_destroy_filter(struct tcf_proto *tp, u64 cookie)
+{
+	struct net_device *dev = tp->q->dev_queue->dev;
+	struct tc_cls_flower_offload offload = {0};
+	struct tc_to_netdev tc;
+
+	if (!tc_should_offload(dev, 0))
+		return -ENOTSUPP;
+
+	offload.command = TC_CLSFLOWER_DESTROY;
+	offload.cookie = cookie;
+
+	tc.type = TC_SETUP_CLSFLOWER;
+	tc.cls_flower = &offload;
+
+	return dev->netdev_ops->ndo_setup_tc(dev, tp->q->handle, tp->protocol,
+					     &tc);
+}
+
+static int fl_hw_replace_filter(struct tcf_proto *tp,
+				struct flow_dissector *dissector,
+				struct fl_flow_key *mask,
+				struct fl_flow_key *key,
+				struct tcf_exts *actions,
+				u64 cookie, u32 flags)
+{
+	struct net_device *dev = tp->q->dev_queue->dev;
+	struct tc_cls_flower_offload offload = {0};
+	struct tc_to_netdev tc;
+
+	if (!tc_should_offload(dev, flags))
+		return -ENOTSUPP;
+
+	offload.command = TC_CLSFLOWER_REPLACE;
+	offload.cookie = cookie;
+	offload.dissector = dissector;
+	offload.mask = mask;
+	offload.key = key;
+	offload.exts = actions;
+
+	tc.type = TC_SETUP_CLSFLOWER;
+	tc.cls_flower = &offload;
+
+	return dev->netdev_ops->ndo_setup_tc(dev, tp->q->handle, tp->protocol,
+					     &tc);
+}
+
 static bool fl_destroy(struct tcf_proto *tp, bool force)
 {
 	struct cls_fl_head *head = rtnl_dereference(tp->root);
@@ -174,6 +221,7 @@ static bool fl_destroy(struct tcf_proto *tp, bool force)
 		return false;
 
 	list_for_each_entry_safe(f, next, &head->filters, list) {
+		fl_hw_destroy_filter(tp, (u64)f);
 		list_del_rcu(&f->list);
 		call_rcu(&f->rcu, fl_destroy_filter);
 	}
@@ -459,6 +507,7 @@ static int fl_change(struct net *net, struct sk_buff *in_skb,
 	struct cls_fl_filter *fnew;
 	struct nlattr *tb[TCA_FLOWER_MAX + 1];
 	struct fl_flow_mask mask = {};
+	u32 flags = 0;
 	int err;
 
 	if (!tca[TCA_OPTIONS])
@@ -494,13 +543,28 @@ static int fl_change(struct net *net, struct sk_buff *in_skb,
 	if (err)
 		goto errout;
 
-	err = rhashtable_insert_fast(&head->ht, &fnew->ht_node,
-				     head->ht_params);
-	if (err)
-		goto errout;
-	if (fold)
+	if (tb[TCA_FLOWER_FLAGS])
+		flags = nla_get_u32(tb[TCA_FLOWER_FLAGS]);
+
+	err = fl_hw_replace_filter(tp,
+				   &head->dissector,
+				   &mask.key,
+				   &fnew->key,
+				   &fnew->exts,
+				   (u64)fnew,
+				   flags);
+	if (err) {
+		err = rhashtable_insert_fast(&head->ht, &fnew->ht_node,
+					     head->ht_params);
+		if (err)
+			goto errout;
+	}
+
+	if (fold) {
 		rhashtable_remove_fast(&head->ht, &fold->ht_node,
 				       head->ht_params);
+		fl_hw_destroy_filter(tp, (u64)fold);
+	}
 
 	*arg = (unsigned long) fnew;
 
@@ -527,6 +591,7 @@ static int fl_delete(struct tcf_proto *tp, unsigned long arg)
 	rhashtable_remove_fast(&head->ht, &f->ht_node,
 			       head->ht_params);
 	list_del_rcu(&f->list);
+	fl_hw_destroy_filter(tp, (u64)f);
 	tcf_unbind_filter(tp, &f->res);
 	call_rcu(&f->rcu, fl_destroy_filter);
 	return 0;
-- 
2.7.0

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH net-next 2/8] net/flow_dissector: Make dissector_uses_key() and skb_flow_dissector_target() public
  2016-03-01 14:24 [PATCH net-next 0/8] cls_flower hardware offload support Amir Vadai
  2016-03-01 14:24 ` [PATCH net-next 1/8] net/flower: Introduce " Amir Vadai
@ 2016-03-01 14:24 ` Amir Vadai
  2016-03-01 14:24 ` [PATCH net-next 3/8] net/act_skbedit: Utility functions for mark action Amir Vadai
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 31+ messages in thread
From: Amir Vadai @ 2016-03-01 14:24 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Or Gerlitz, John Fastabend, Saeed Mahameed,
	Hadar Har-Zion, Jiri Pirko, Amir Vadai

Will be used in a following patch to query if a key is being used, and
what it's value in the target object.

Signed-off-by: Amir Vadai <amir@vadai.me>
---
 include/net/flow_dissector.h | 13 +++++++++++++
 net/core/flow_dissector.c    | 13 -------------
 2 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h
index 8c8548c..d3d60dc 100644
--- a/include/net/flow_dissector.h
+++ b/include/net/flow_dissector.h
@@ -184,4 +184,17 @@ static inline bool flow_keys_have_l4(struct flow_keys *keys)
 
 u32 flow_hash_from_keys(struct flow_keys *keys);
 
+static inline bool dissector_uses_key(const struct flow_dissector *flow_dissector,
+				      enum flow_dissector_key_id key_id)
+{
+	return flow_dissector->used_keys & (1 << key_id);
+}
+
+static inline void *skb_flow_dissector_target(struct flow_dissector *flow_dissector,
+					      enum flow_dissector_key_id key_id,
+					      void *target_container)
+{
+	return ((char *)target_container) + flow_dissector->offset[key_id];
+}
+
 #endif
diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index 7c7b873..a669dea 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -19,25 +19,12 @@
 #include <net/flow_dissector.h>
 #include <scsi/fc/fc_fcoe.h>
 
-static bool dissector_uses_key(const struct flow_dissector *flow_dissector,
-			       enum flow_dissector_key_id key_id)
-{
-	return flow_dissector->used_keys & (1 << key_id);
-}
-
 static void dissector_set_key(struct flow_dissector *flow_dissector,
 			      enum flow_dissector_key_id key_id)
 {
 	flow_dissector->used_keys |= (1 << key_id);
 }
 
-static void *skb_flow_dissector_target(struct flow_dissector *flow_dissector,
-				       enum flow_dissector_key_id key_id,
-				       void *target_container)
-{
-	return ((char *) target_container) + flow_dissector->offset[key_id];
-}
-
 void skb_flow_dissector_init(struct flow_dissector *flow_dissector,
 			     const struct flow_dissector_key *key,
 			     unsigned int key_count)
-- 
2.7.0

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH net-next 3/8] net/act_skbedit: Utility functions for mark action
  2016-03-01 14:24 [PATCH net-next 0/8] cls_flower hardware offload support Amir Vadai
  2016-03-01 14:24 ` [PATCH net-next 1/8] net/flower: Introduce " Amir Vadai
  2016-03-01 14:24 ` [PATCH net-next 2/8] net/flow_dissector: Make dissector_uses_key() and skb_flow_dissector_target() public Amir Vadai
@ 2016-03-01 14:24 ` Amir Vadai
  2016-03-01 14:24 ` [PATCH net-next 4/8] net/mlx5_core: Set flow steering dest only for forward rules Amir Vadai
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 31+ messages in thread
From: Amir Vadai @ 2016-03-01 14:24 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Or Gerlitz, John Fastabend, Saeed Mahameed,
	Hadar Har-Zion, Jiri Pirko, Amir Vadai

Enable device drivers to query the action if is a mark action and what
value to use for marking.

Signed-off-by: Amir Vadai <amir@vadai.me>
---
 include/net/tc_act/tc_skbedit.h | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/include/net/tc_act/tc_skbedit.h b/include/net/tc_act/tc_skbedit.h
index 0df9a0d..ad27d69 100644
--- a/include/net/tc_act/tc_skbedit.h
+++ b/include/net/tc_act/tc_skbedit.h
@@ -20,6 +20,7 @@
 #define __NET_TC_SKBEDIT_H
 
 #include <net/act_api.h>
+#include <linux/tc_act/tc_skbedit.h>
 
 struct tcf_skbedit {
 	struct tcf_common	common;
@@ -32,4 +33,18 @@ struct tcf_skbedit {
 #define to_skbedit(a) \
 	container_of(a->priv, struct tcf_skbedit, common)
 
+#ifdef CONFIG_NET_CLS_ACT
+static inline bool is_tcf_skbedit_mark(const struct tc_action *a)
+{
+	if (!a->ops || a->ops->type != TCA_ACT_SKBEDIT)
+		return false;
+
+	return to_skbedit(a)->flags == SKBEDIT_F_MARK;
+}
+
+static inline u32 tcf_skbedit_mark(const struct tc_action *a)
+{
+	return to_skbedit(a)->mark;
+}
+#endif
 #endif /* __NET_TC_SKBEDIT_H */
-- 
2.7.0

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH net-next 4/8] net/mlx5_core: Set flow steering dest only for forward rules
  2016-03-01 14:24 [PATCH net-next 0/8] cls_flower hardware offload support Amir Vadai
                   ` (2 preceding siblings ...)
  2016-03-01 14:24 ` [PATCH net-next 3/8] net/act_skbedit: Utility functions for mark action Amir Vadai
@ 2016-03-01 14:24 ` Amir Vadai
  2016-03-01 14:24 ` [PATCH net-next 5/8] net/mlx5e: Add a new priority for kernel flow tables Amir Vadai
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 31+ messages in thread
From: Amir Vadai @ 2016-03-01 14:24 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Or Gerlitz, John Fastabend, Saeed Mahameed,
	Hadar Har-Zion, Jiri Pirko, Amir Vadai, Maor Gottlieb

We need to handle flow table entry destinations only if the action
associated with the rule is forwarding (MLX5_FLOW_CONTEXT_ACTION_FWD_DEST).

Fixes: 26a8145390b3 ('net/mlx5_core: Introduce flow steering firmware commands')
Signed-off-by: Amir Vadai <amir@vadai.me>
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c  | 29 +++++++++++++----------
 drivers/net/ethernet/mellanox/mlx5/core/fs_core.c | 18 +++++++++-----
 2 files changed, 28 insertions(+), 19 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c
index a9894d2..f46f1db 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c
@@ -218,19 +218,22 @@ static int mlx5_cmd_set_fte(struct mlx5_core_dev *dev,
 				      match_value);
 	memcpy(in_match_value, &fte->val, MLX5_ST_SZ_BYTES(fte_match_param));
 
-	in_dests = MLX5_ADDR_OF(flow_context, in_flow_context, destination);
-	list_for_each_entry(dst, &fte->node.children, node.list) {
-		unsigned int id;
-
-		MLX5_SET(dest_format_struct, in_dests, destination_type,
-			 dst->dest_attr.type);
-		if (dst->dest_attr.type ==
-		    MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE)
-			id = dst->dest_attr.ft->id;
-		else
-			id = dst->dest_attr.tir_num;
-		MLX5_SET(dest_format_struct, in_dests, destination_id, id);
-		in_dests += MLX5_ST_SZ_BYTES(dest_format_struct);
+	if (fte->action & MLX5_FLOW_CONTEXT_ACTION_FWD_DEST) {
+		in_dests = MLX5_ADDR_OF(flow_context, in_flow_context, destination);
+		list_for_each_entry(dst, &fte->node.children, node.list) {
+			unsigned int id;
+
+			MLX5_SET(dest_format_struct, in_dests, destination_type,
+				 dst->dest_attr.type);
+			if (dst->dest_attr.type ==
+			    MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE) {
+				id = dst->dest_attr.ft->id;
+			} else {
+				id = dst->dest_attr.tir_num;
+			}
+			MLX5_SET(dest_format_struct, in_dests, destination_id, id);
+			in_dests += MLX5_ST_SZ_BYTES(dest_format_struct);
+		}
 	}
 	memset(out, 0, sizeof(out));
 	err = mlx5_cmd_exec_check_status(dev, in, inlen, out,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
index 6f68dba..f0e67d2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
@@ -360,8 +360,8 @@ static void del_rule(struct fs_node *node)
 	memcpy(match_value, fte->val, sizeof(fte->val));
 	fs_get_obj(ft, fg->node.parent);
 	list_del(&rule->node.list);
-	fte->dests_size--;
-	if (fte->dests_size) {
+	if ((fte->action & MLX5_FLOW_CONTEXT_ACTION_FWD_DEST) &&
+	    --fte->dests_size) {
 		err = mlx5_cmd_update_fte(dev, ft,
 					  fg->id, fte);
 		if (err)
@@ -763,7 +763,8 @@ static struct mlx5_flow_rule *alloc_rule(struct mlx5_flow_destination *dest)
 		return NULL;
 
 	rule->node.type = FS_TYPE_FLOW_DEST;
-	memcpy(&rule->dest_attr, dest, sizeof(*dest));
+	if (dest)
+		memcpy(&rule->dest_attr, dest, sizeof(*dest));
 
 	return rule;
 }
@@ -785,8 +786,9 @@ static struct mlx5_flow_rule *add_rule_fte(struct fs_fte *fte,
 	/* Add dest to dests list- added as first element after the head */
 	tree_init_node(&rule->node, 1, del_rule);
 	list_add_tail(&rule->node.list, &fte->node.children);
-	fte->dests_size++;
-	if (fte->dests_size == 1)
+	if (dest)
+		fte->dests_size++;
+	if (fte->dests_size == 1 || !dest)
 		err = mlx5_cmd_create_fte(get_dev(&ft->node),
 					  ft, fg->id, fte);
 	else
@@ -802,7 +804,8 @@ static struct mlx5_flow_rule *add_rule_fte(struct fs_fte *fte,
 free_rule:
 	list_del(&rule->node.list);
 	kfree(rule);
-	fte->dests_size--;
+	if (dest)
+		fte->dests_size--;
 	return ERR_PTR(err);
 }
 
@@ -996,6 +999,9 @@ mlx5_add_flow_rule(struct mlx5_flow_table *ft,
 	struct mlx5_flow_group *g;
 	struct mlx5_flow_rule *rule;
 
+	if ((action & MLX5_FLOW_CONTEXT_ACTION_FWD_DEST) && !dest)
+		return ERR_PTR(-EINVAL);
+
 	nested_lock_ref_node(&ft->node, FS_MUTEX_GRANDPARENT);
 	fs_for_each_fg(g, ft)
 		if (compare_match_criteria(g->mask.match_criteria_enable,
-- 
2.7.0

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH net-next 5/8] net/mlx5e: Add a new priority for kernel flow tables
  2016-03-01 14:24 [PATCH net-next 0/8] cls_flower hardware offload support Amir Vadai
                   ` (3 preceding siblings ...)
  2016-03-01 14:24 ` [PATCH net-next 4/8] net/mlx5_core: Set flow steering dest only for forward rules Amir Vadai
@ 2016-03-01 14:24 ` Amir Vadai
  2016-03-01 14:24 ` [PATCH net-next 6/8] net/mlx5e: Introduce tc offload support Amir Vadai
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 31+ messages in thread
From: Amir Vadai @ 2016-03-01 14:24 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Or Gerlitz, John Fastabend, Saeed Mahameed,
	Hadar Har-Zion, Jiri Pirko, Amir Vadai

Move the vlan and main flow tables to use priority 1. This will allow
the upcoming TC offload logic to use a higher priority (0) for the
offload steering table.

Signed-off-by: Amir Vadai <amir@vadai.me>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_fs.c   | 4 ++--
 drivers/net/ethernet/mellanox/mlx5/core/fs_core.c | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_fs.c b/drivers/net/ethernet/mellanox/mlx5/core/en_fs.c
index 80d81ab..d00a242 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_fs.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_fs.c
@@ -1041,7 +1041,7 @@ static int mlx5e_create_main_flow_table(struct mlx5e_priv *priv)
 	int err;
 
 	ft->num_groups = 0;
-	ft->t = mlx5_create_flow_table(priv->fts.ns, 0, MLX5E_MAIN_TABLE_SIZE);
+	ft->t = mlx5_create_flow_table(priv->fts.ns, 1, MLX5E_MAIN_TABLE_SIZE);
 
 	if (IS_ERR(ft->t)) {
 		err = PTR_ERR(ft->t);
@@ -1150,7 +1150,7 @@ static int mlx5e_create_vlan_flow_table(struct mlx5e_priv *priv)
 	int err;
 
 	ft->num_groups = 0;
-	ft->t = mlx5_create_flow_table(priv->fts.ns, 0, MLX5E_VLAN_TABLE_SIZE);
+	ft->t = mlx5_create_flow_table(priv->fts.ns, 1, MLX5E_VLAN_TABLE_SIZE);
 
 	if (IS_ERR(ft->t)) {
 		err = PTR_ERR(ft->t);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
index f0e67d2..e848d70 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
@@ -73,8 +73,8 @@
 #define BY_PASS_MIN_LEVEL (KENREL_MIN_LEVEL + MLX5_BY_PASS_NUM_PRIOS +\
 			   LEFTOVERS_MAX_FT)
 
-#define KERNEL_MAX_FT 2
-#define KERNEL_NUM_PRIOS 1
+#define KERNEL_MAX_FT 3
+#define KERNEL_NUM_PRIOS 2
 #define KENREL_MIN_LEVEL 2
 
 struct node_caps {
-- 
2.7.0

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH net-next 6/8] net/mlx5e: Introduce tc offload support
  2016-03-01 14:24 [PATCH net-next 0/8] cls_flower hardware offload support Amir Vadai
                   ` (4 preceding siblings ...)
  2016-03-01 14:24 ` [PATCH net-next 5/8] net/mlx5e: Add a new priority for kernel flow tables Amir Vadai
@ 2016-03-01 14:24 ` Amir Vadai
  2016-03-01 14:52   ` Jiri Pirko
  2016-03-01 15:59   ` Saeed Mahameed
  2016-03-01 14:24 ` [PATCH net-next 7/8] net/mlx5e: Support offload cls_flower with drop action Amir Vadai
                   ` (2 subsequent siblings)
  8 siblings, 2 replies; 31+ messages in thread
From: Amir Vadai @ 2016-03-01 14:24 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Or Gerlitz, John Fastabend, Saeed Mahameed,
	Hadar Har-Zion, Jiri Pirko, Amir Vadai

Extend ndo_setup_tc() to support ingress tc offloading. Will be used by
later patches to offload tc flower filter.

Feature is off by default and could be enabled by issuing:
 # ethtool  -K eth0 hw-tc-offload on

Offloads flow table is dynamically created when first filter is
added.
Rules are saved in a hash table that is maintained by the consumer (for
example - the flower offload in the next patch).
When last filter is removed and no filters exist in the hash table, the
offload flow table is destroyed.

Signed-off-by: Amir Vadai <amir@vadai.me>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/Makefile  |   2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en.h      |   9 ++
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c |  31 +++++
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c   | 131 ++++++++++++++++++++++
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.h   |  44 ++++++++
 5 files changed, 216 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en_tc.h

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
index 11b592d..4fc45ee 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile
+++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
@@ -6,6 +6,6 @@ mlx5_core-y :=	main.o cmd.o debugfs.o fw.o eq.o uar.o pagealloc.o \
 
 mlx5_core-$(CONFIG_MLX5_CORE_EN) += wq.o eswitch.o \
 		en_main.o en_fs.o en_ethtool.o en_tx.o en_rx.o \
-		en_txrx.o en_clock.o vxlan.o
+		en_txrx.o en_clock.o vxlan.o en_tc.o
 
 mlx5_core-$(CONFIG_MLX5_CORE_EN_DCB) +=  en_dcbnl.o
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 1dca3dc..6571a25 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -43,6 +43,7 @@
 #include <linux/mlx5/port.h>
 #include <linux/mlx5/vport.h>
 #include <linux/mlx5/transobj.h>
+#include <linux/rhashtable.h>
 #include "wq.h"
 #include "mlx5_core.h"
 
@@ -524,8 +525,16 @@ struct mlx5e_flow_table {
 	struct mlx5_flow_group		**g;
 };
 
+struct mlx5e_tc_flow_table {
+	struct mlx5_flow_table		*t;
+
+	struct rhashtable_params        ht_params;
+	struct rhashtable               ht;
+};
+
 struct mlx5e_flow_tables {
 	struct mlx5_flow_namespace	*ns;
+	struct mlx5e_tc_flow_table	tc;
 	struct mlx5e_flow_table		vlan;
 	struct mlx5e_flow_table		main;
 };
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 0d45f35..cb02b4c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -30,9 +30,12 @@
  * SOFTWARE.
  */
 
+#include <net/tc_act/tc_gact.h>
+#include <net/pkt_cls.h>
 #include <linux/mlx5/fs.h>
 #include <net/vxlan.h>
 #include "en.h"
+#include "en_tc.h"
 #include "eswitch.h"
 #include "vxlan.h"
 
@@ -1880,6 +1883,17 @@ static int mlx5e_setup_tc(struct net_device *netdev, u8 tc)
 static int mlx5e_ndo_setup_tc(struct net_device *dev, u32 handle,
 			      __be16 proto, struct tc_to_netdev *tc)
 {
+	struct mlx5e_priv *priv = netdev_priv(dev);
+
+	if (TC_H_MAJ(handle) != TC_H_MAJ(TC_H_INGRESS))
+		goto mqprio;
+
+	switch (tc->type) {
+	default:
+		return -EINVAL;
+	}
+
+mqprio:
 	if (handle != TC_H_ROOT || tc->type != TC_SETUP_MQPRIO)
 		return -EINVAL;
 
@@ -1963,6 +1977,13 @@ static int mlx5e_set_features(struct net_device *netdev,
 			mlx5e_disable_vlan_filter(priv);
 	}
 
+	if ((changes & NETIF_F_HW_TC) && !(features & NETIF_F_HW_TC) &&
+	    mlx5e_tc_num_filters(priv)) {
+		netdev_err(netdev,
+			   "Active offloaded tc filters, can't turn hw_tc_offload off\n");
+		return -EINVAL;
+	}
+
 	return err;
 }
 
@@ -2361,10 +2382,18 @@ static void mlx5e_build_netdev(struct net_device *netdev)
 	if (!priv->params.lro_en)
 		netdev->features  &= ~NETIF_F_LRO;
 
+#define FT_CAP(f) MLX5_CAP_FLOWTABLE(mdev, flow_table_properties_nic_receive.f)
+	if (FT_CAP(flow_modify_en) &&
+	    FT_CAP(modify_root) &&
+	    FT_CAP(identified_miss_table_mode) &&
+	    FT_CAP(flow_table_modify))
+		priv->netdev->hw_features      |= NETIF_F_HW_TC;
+
 	netdev->features         |= NETIF_F_HIGHDMA;
 
 	netdev->priv_flags       |= IFF_UNICAST_FLT;
 
+	mlx5e_tc_init(priv);
 	mlx5e_set_netdev_dev_addr(netdev);
 }
 
@@ -2531,6 +2560,7 @@ err_unmap_free_uar:
 	mlx5_unmap_free_uar(mdev, &priv->cq_uar);
 
 err_free_netdev:
+	mlx5e_tc_cleanup(priv);
 	free_netdev(netdev);
 
 	return NULL;
@@ -2558,6 +2588,7 @@ static void mlx5e_destroy_netdev(struct mlx5_core_dev *mdev, void *vpriv)
 	mlx5_core_dealloc_transport_domain(priv->mdev, priv->tdn);
 	mlx5_core_dealloc_pd(priv->mdev, priv->pdn);
 	mlx5_unmap_free_uar(priv->mdev, &priv->cq_uar);
+	mlx5e_tc_cleanup(priv);
 	free_netdev(netdev);
 }
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
new file mode 100644
index 0000000..7d1c0a3
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -0,0 +1,131 @@
+/*
+ * Copyright (c) 2016, Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <linux/mlx5/fs.h>
+#include <linux/mlx5/device.h>
+#include <linux/rhashtable.h>
+#include "en.h"
+#include "en_tc.h"
+
+struct mlx5e_tc_flow {
+	struct rhash_head	node;
+	u64			cookie;
+	struct mlx5_flow_rule	*rule;
+};
+
+#define MLX5E_TC_FLOW_TABLE_NUM_ENTRIES 1024
+#define MLX5E_TC_FLOW_TABLE_NUM_GROUPS 4
+
+struct mlx5_flow_rule *mlx5e_tc_add_flow(struct mlx5e_priv *priv,
+					 u32 *match_c, u32 *match_v,
+					 u32 action, u32 flow_tag)
+{
+	struct mlx5_flow_destination dest = {
+		.type = MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE,
+		{.ft = priv->fts.vlan.t},
+	};
+	struct mlx5_flow_rule *rule;
+	bool table_created = false;
+
+	if (IS_ERR_OR_NULL(priv->fts.tc.t)) {
+		priv->fts.tc.t =
+			mlx5_create_auto_grouped_flow_table(priv->fts.ns, 0,
+							    MLX5E_TC_FLOW_TABLE_NUM_ENTRIES,
+							    MLX5E_TC_FLOW_TABLE_NUM_GROUPS);
+		if (IS_ERR(priv->fts.tc.t)) {
+			netdev_err(priv->netdev,
+				   "Failed to create tc offload table\n");
+			return ERR_CAST(priv->fts.tc.t);
+		}
+
+		table_created = true;
+	}
+
+	rule = mlx5_add_flow_rule(priv->fts.tc.t, MLX5_MATCH_OUTER_HEADERS,
+				  match_c, match_v,
+				  action, flow_tag,
+				  action & MLX5_FLOW_CONTEXT_ACTION_FWD_DEST ? &dest : NULL);
+
+	if (IS_ERR(rule) && table_created) {
+		mlx5_destroy_flow_table(priv->fts.tc.t);
+		priv->fts.tc.t = NULL;
+	}
+
+	return rule;
+}
+
+static void mlx5e_tc_del_flow(struct mlx5e_priv *priv,
+			      struct mlx5_flow_rule *rule)
+{
+	mlx5_del_flow_rule(rule);
+
+	if (!mlx5e_tc_num_filters(priv)) {
+		mlx5_destroy_flow_table(priv->fts.tc.t);
+		priv->fts.tc.t = NULL;
+	}
+}
+
+static const struct rhashtable_params mlx5e_tc_flow_ht_params = {
+	.head_offset = offsetof(struct mlx5e_tc_flow, node),
+	.key_offset = offsetof(struct mlx5e_tc_flow, cookie),
+	.key_len = sizeof(((struct mlx5e_tc_flow *)0)->cookie),
+	.automatic_shrinking = true,
+};
+
+void mlx5e_tc_init(struct mlx5e_priv *priv)
+{
+	struct mlx5e_tc_flow_table *tc = &priv->fts.tc;
+
+	tc->ht_params = mlx5e_tc_flow_ht_params;
+	rhashtable_init(&tc->ht, &tc->ht_params);
+}
+
+static void _mlx5e_tc_del_flow(void *ptr, void *arg)
+{
+	struct mlx5e_tc_flow *flow = ptr;
+	struct mlx5e_priv *priv = arg;
+
+	mlx5e_tc_del_flow(priv, flow->rule);
+	kfree(flow);
+}
+
+void mlx5e_tc_cleanup(struct mlx5e_priv *priv)
+{
+	struct mlx5e_tc_flow_table *tc = &priv->fts.tc;
+
+	rhashtable_free_and_destroy(&tc->ht, _mlx5e_tc_del_flow, priv);
+
+	if (priv->fts.tc.t) {
+		mlx5_destroy_flow_table(priv->fts.tc.t);
+		priv->fts.tc.t = NULL;
+	}
+}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.h b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.h
new file mode 100644
index 0000000..8a0dc0d
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.h
@@ -0,0 +1,44 @@
+/*
+ * Copyright (c) 2016, Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef __MLX5_EN_TC_H__
+#define __MLX5_EN_TC_H__
+
+void mlx5e_tc_init(struct mlx5e_priv *priv);
+void mlx5e_tc_cleanup(struct mlx5e_priv *priv);
+
+static inline int mlx5e_tc_num_filters(struct mlx5e_priv *priv)
+{
+	return atomic_read(&priv->fts.tc.ht.nelems);
+}
+
+#endif /* __MLX5_EN_TC_H__ */
-- 
2.7.0

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH net-next 7/8] net/mlx5e: Support offload cls_flower with drop action
  2016-03-01 14:24 [PATCH net-next 0/8] cls_flower hardware offload support Amir Vadai
                   ` (5 preceding siblings ...)
  2016-03-01 14:24 ` [PATCH net-next 6/8] net/mlx5e: Introduce tc offload support Amir Vadai
@ 2016-03-01 14:24 ` Amir Vadai
  2016-03-01 14:55   ` Jiri Pirko
  2016-03-01 15:03   ` Jiri Pirko
  2016-03-01 14:24 ` [PATCH net-next 8/8] net/mlx5e: Support offload cls_flower with sskbedit mark action Amir Vadai
  2016-03-01 17:18 ` [PATCH net-next 0/8] cls_flower hardware offload support John Fastabend
  8 siblings, 2 replies; 31+ messages in thread
From: Amir Vadai @ 2016-03-01 14:24 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Or Gerlitz, John Fastabend, Saeed Mahameed,
	Hadar Har-Zion, Jiri Pirko, Amir Vadai

Parse tc_cls_flower_offload into device specific commands and program
the hardware to classify and act accordingly.

For example, to drop ICMP (ip_proto 1) packets from specific smac, dmac,
src_ip, src_ip, arriving to interface ens9:

 # tc qdisc add dev ens9 ingress

 # tc filter add dev ens9 protocol ip parent ffff: \
     flower ip_proto 1 \
     dst_mac 7c:fe:90:69:81:62 src_mac 7c:fe:90:69:81:56 \
     dst_ip 11.11.11.11 src_ip 11.11.11.12 indev ens9 \
     action drop

Signed-off-by: Amir Vadai <amir@vadai.me>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c |   9 +
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c   | 287 ++++++++++++++++++++++
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.h   |   5 +
 3 files changed, 301 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index cb02b4c..1d1da94 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -1889,6 +1889,15 @@ static int mlx5e_ndo_setup_tc(struct net_device *dev, u32 handle,
 		goto mqprio;
 
 	switch (tc->type) {
+	case TC_SETUP_CLSFLOWER:
+		switch (tc->cls_flower->command) {
+		case TC_CLSFLOWER_REPLACE:
+			return mlx5e_configure_flower(priv, proto, tc->cls_flower);
+		case TC_CLSFLOWER_DESTROY:
+			return mlx5e_delete_flower(priv, tc->cls_flower);
+		default:
+			return -EINVAL;
+		}
 	default:
 		return -EINVAL;
 	}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index 7d1c0a3..8fee983 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -30,6 +30,9 @@
  * SOFTWARE.
  */
 
+#include <net/flow_dissector.h>
+#include <net/pkt_cls.h>
+#include <net/tc_act/tc_gact.h>
 #include <linux/mlx5/fs.h>
 #include <linux/mlx5/device.h>
 #include <linux/rhashtable.h>
@@ -94,6 +97,290 @@ static void mlx5e_tc_del_flow(struct mlx5e_priv *priv,
 	}
 }
 
+static int parse_cls_flower(struct mlx5e_priv *priv,
+			    u32 *match_c, u32 *match_v,
+			    struct tc_cls_flower_offload *f)
+{
+	void *headers_c = MLX5_ADDR_OF(fte_match_param, match_c, outer_headers);
+	void *headers_v = MLX5_ADDR_OF(fte_match_param, match_v, outer_headers);
+	u16 addr_type = 0;
+	u8 ip_proto = 0;
+
+	if (f->dissector->used_keys &
+	    ~(BIT(FLOW_DISSECTOR_KEY_CONTROL) |
+	      BIT(FLOW_DISSECTOR_KEY_BASIC) |
+	      BIT(FLOW_DISSECTOR_KEY_ETH_ADDRS) |
+	      BIT(FLOW_DISSECTOR_KEY_IPV4_ADDRS) |
+	      BIT(FLOW_DISSECTOR_KEY_IPV6_ADDRS) |
+	      BIT(FLOW_DISSECTOR_KEY_PORTS))) {
+		netdev_warn(priv->netdev, "Unsupported key used: 0x%x\n",
+			    f->dissector->used_keys);
+		return -ENOTSUPP;
+	}
+
+	if (dissector_uses_key(f->dissector, FLOW_DISSECTOR_KEY_CONTROL)) {
+		struct flow_dissector_key_control *key =
+			skb_flow_dissector_target(f->dissector,
+						  FLOW_DISSECTOR_KEY_BASIC,
+						  f->key);
+		addr_type = key->addr_type;
+	}
+
+	if (dissector_uses_key(f->dissector, FLOW_DISSECTOR_KEY_BASIC)) {
+		struct flow_dissector_key_basic *key =
+			skb_flow_dissector_target(f->dissector,
+						  FLOW_DISSECTOR_KEY_BASIC,
+						  f->key);
+		struct flow_dissector_key_basic *mask =
+			skb_flow_dissector_target(f->dissector,
+						  FLOW_DISSECTOR_KEY_BASIC,
+						  f->mask);
+		ip_proto = key->ip_proto;
+
+		MLX5_SET(fte_match_set_lyr_2_4, headers_c, ethertype,
+			 ntohs(mask->n_proto));
+		MLX5_SET(fte_match_set_lyr_2_4, headers_v, ethertype,
+			 ntohs(key->n_proto));
+
+		MLX5_SET(fte_match_set_lyr_2_4, headers_c, ip_protocol,
+			 mask->ip_proto);
+		MLX5_SET(fte_match_set_lyr_2_4, headers_v, ip_protocol,
+			 key->ip_proto);
+	}
+
+	if (dissector_uses_key(f->dissector, FLOW_DISSECTOR_KEY_ETH_ADDRS)) {
+		struct flow_dissector_key_eth_addrs *key =
+			skb_flow_dissector_target(f->dissector,
+						  FLOW_DISSECTOR_KEY_ETH_ADDRS,
+						  f->key);
+		struct flow_dissector_key_eth_addrs *mask =
+			skb_flow_dissector_target(f->dissector,
+						  FLOW_DISSECTOR_KEY_ETH_ADDRS,
+						  f->mask);
+
+		ether_addr_copy(MLX5_ADDR_OF(fte_match_set_lyr_2_4, headers_c,
+					     dmac_47_16),
+				mask->dst);
+		ether_addr_copy(MLX5_ADDR_OF(fte_match_set_lyr_2_4, headers_v,
+					     dmac_47_16),
+				key->dst);
+
+		ether_addr_copy(MLX5_ADDR_OF(fte_match_set_lyr_2_4, headers_c,
+					     smac_47_16),
+				mask->src);
+		ether_addr_copy(MLX5_ADDR_OF(fte_match_set_lyr_2_4, headers_v,
+					     smac_47_16),
+				key->src);
+	}
+
+	if (addr_type == FLOW_DISSECTOR_KEY_IPV4_ADDRS) {
+		struct flow_dissector_key_ipv4_addrs *key =
+			skb_flow_dissector_target(f->dissector,
+						  FLOW_DISSECTOR_KEY_IPV4_ADDRS,
+						  f->key);
+		struct flow_dissector_key_ipv4_addrs *mask =
+			skb_flow_dissector_target(f->dissector,
+						  FLOW_DISSECTOR_KEY_IPV4_ADDRS,
+						  f->mask);
+
+		memcpy(MLX5_ADDR_OF(fte_match_set_lyr_2_4, headers_c,
+				    src_ipv4_src_ipv6.ipv4_layout.ipv4),
+		       &mask->src, sizeof(mask->src));
+		memcpy(MLX5_ADDR_OF(fte_match_set_lyr_2_4, headers_v,
+				    src_ipv4_src_ipv6.ipv4_layout.ipv4),
+		       &key->src, sizeof(key->src));
+		memcpy(MLX5_ADDR_OF(fte_match_set_lyr_2_4, headers_c,
+				    dst_ipv4_dst_ipv6.ipv4_layout.ipv4),
+		       &mask->dst, sizeof(mask->dst));
+		memcpy(MLX5_ADDR_OF(fte_match_set_lyr_2_4, headers_v,
+				    dst_ipv4_dst_ipv6.ipv4_layout.ipv4),
+		       &key->dst, sizeof(key->dst));
+	}
+
+	if (addr_type == FLOW_DISSECTOR_KEY_IPV6_ADDRS) {
+		struct flow_dissector_key_ipv6_addrs *key =
+			skb_flow_dissector_target(f->dissector,
+						  FLOW_DISSECTOR_KEY_IPV6_ADDRS,
+						  f->key);
+		struct flow_dissector_key_ipv6_addrs *mask =
+			skb_flow_dissector_target(f->dissector,
+						  FLOW_DISSECTOR_KEY_IPV6_ADDRS,
+						  f->mask);
+
+		memcpy(MLX5_ADDR_OF(fte_match_set_lyr_2_4, headers_c,
+				    src_ipv4_src_ipv6.ipv6_layout.ipv6),
+		       &mask->src, sizeof(mask->src));
+		memcpy(MLX5_ADDR_OF(fte_match_set_lyr_2_4, headers_v,
+				    src_ipv4_src_ipv6.ipv6_layout.ipv6),
+		       &key->src, sizeof(key->src));
+
+		memcpy(MLX5_ADDR_OF(fte_match_set_lyr_2_4, headers_c,
+				    dst_ipv4_dst_ipv6.ipv6_layout.ipv6),
+		       &mask->dst, sizeof(mask->dst));
+		memcpy(MLX5_ADDR_OF(fte_match_set_lyr_2_4, headers_v,
+				    dst_ipv4_dst_ipv6.ipv6_layout.ipv6),
+		       &key->dst, sizeof(key->dst));
+	}
+
+	if (dissector_uses_key(f->dissector, FLOW_DISSECTOR_KEY_PORTS)) {
+		struct flow_dissector_key_ports *key =
+			skb_flow_dissector_target(f->dissector,
+						  FLOW_DISSECTOR_KEY_PORTS,
+						  f->key);
+		struct flow_dissector_key_ports *mask =
+			skb_flow_dissector_target(f->dissector,
+						  FLOW_DISSECTOR_KEY_PORTS,
+						  f->mask);
+		switch (ip_proto) {
+		case IPPROTO_TCP:
+			MLX5_SET(fte_match_set_lyr_2_4, headers_c,
+				 tcp_sport, ntohs(mask->src));
+			MLX5_SET(fte_match_set_lyr_2_4, headers_v,
+				 tcp_sport, ntohs(key->src));
+
+			MLX5_SET(fte_match_set_lyr_2_4, headers_c,
+				 tcp_dport, ntohs(mask->dst));
+			MLX5_SET(fte_match_set_lyr_2_4, headers_v,
+				 tcp_dport, ntohs(key->dst));
+			break;
+
+		case IPPROTO_UDP:
+			MLX5_SET(fte_match_set_lyr_2_4, headers_c,
+				 udp_sport, ntohs(mask->src));
+			MLX5_SET(fte_match_set_lyr_2_4, headers_v,
+				 udp_sport, ntohs(key->src));
+
+			MLX5_SET(fte_match_set_lyr_2_4, headers_c,
+				 udp_dport, ntohs(mask->dst));
+			MLX5_SET(fte_match_set_lyr_2_4, headers_v,
+				 udp_dport, ntohs(key->dst));
+			break;
+		default:
+			netdev_err(priv->netdev,
+				   "Only UDP and TCP transport are supported\n");
+			return -EINVAL;
+		}
+	}
+
+	return 0;
+}
+
+static int parse_tc_actions(struct mlx5e_priv *priv, struct tcf_exts *exts,
+			    u32 *action, u32 *flow_tag)
+{
+#ifdef CONFIG_NET_CLS_ACT
+	const struct tc_action *a;
+
+	*flow_tag = MLX5_FS_DEFAULT_FLOW_TAG;
+	*action = 0;
+
+	if (list_empty(&exts->actions))
+		return -EINVAL;
+
+	list_for_each_entry(a, &exts->actions, list) {
+		/* Only support a single action per rule */
+		if (*action)
+			return -EINVAL;
+
+		if (is_tcf_gact_shot(a)) {
+			*action |= MLX5_FLOW_CONTEXT_ACTION_DROP;
+			continue;
+		}
+
+		return -EINVAL;
+	}
+
+	return 0;
+#else
+	return -EINVAL;
+#endif
+}
+
+int mlx5e_configure_flower(struct mlx5e_priv *priv, __be16 protocol,
+			   struct tc_cls_flower_offload *f)
+{
+	struct mlx5e_tc_flow_table *tc = &priv->fts.tc;
+	u32 *match_c;
+	u32 *match_v;
+	int err = 0;
+	u32 flow_tag;
+	u32 action;
+	struct mlx5e_tc_flow *flow;
+	struct mlx5_flow_rule *old = NULL;
+
+	flow = rhashtable_lookup_fast(&tc->ht, &f->cookie,
+				      tc->ht_params);
+	if (flow)
+		old = flow->rule;
+	else
+		flow = kzalloc(sizeof(*flow), GFP_KERNEL);
+
+	match_c = kzalloc(MLX5_ST_SZ_BYTES(fte_match_param), GFP_KERNEL);
+	match_v = kzalloc(MLX5_ST_SZ_BYTES(fte_match_param), GFP_KERNEL);
+	if (!match_c || !match_v || !flow) {
+		err = -ENOMEM;
+		goto err_free;
+	}
+
+	flow->cookie = f->cookie;
+
+	err = parse_cls_flower(priv, match_c, match_v, f);
+	if (err < 0)
+		goto err_free;
+
+	err = parse_tc_actions(priv, f->exts, &action, &flow_tag);
+	if (err < 0)
+		goto err_free;
+
+	err = rhashtable_insert_fast(&tc->ht, &flow->node,
+				     tc->ht_params);
+	if (err)
+		goto err_free;
+
+	flow->rule = mlx5e_tc_add_flow(priv, match_c, match_v, action,
+				       flow_tag);
+	if (IS_ERR(flow->rule)) {
+		err = PTR_ERR(flow->rule);
+		goto err_hash_del;
+	}
+
+	if (old)
+		mlx5e_tc_del_flow(priv, old);
+
+	goto out;
+
+err_hash_del:
+	rhashtable_remove_fast(&tc->ht, &flow->node, tc->ht_params);
+
+err_free:
+	if (!old)
+		kfree(flow);
+out:
+	kfree(match_c);
+	kfree(match_v);
+	return err;
+}
+
+int mlx5e_delete_flower(struct mlx5e_priv *priv,
+			struct tc_cls_flower_offload *f)
+{
+	struct mlx5e_tc_flow *flow;
+	struct mlx5e_tc_flow_table *tc = &priv->fts.tc;
+
+	flow = rhashtable_lookup_fast(&tc->ht, &f->cookie,
+				      tc->ht_params);
+	if (!flow)
+		return -EINVAL;
+
+	rhashtable_remove_fast(&tc->ht, &flow->node, tc->ht_params);
+
+	mlx5e_tc_del_flow(priv, flow->rule);
+
+	kfree(flow);
+
+	return 0;
+}
+
 static const struct rhashtable_params mlx5e_tc_flow_ht_params = {
 	.head_offset = offsetof(struct mlx5e_tc_flow, node),
 	.key_offset = offsetof(struct mlx5e_tc_flow, cookie),
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.h b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.h
index 8a0dc0d..f1e7180 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.h
@@ -36,6 +36,11 @@
 void mlx5e_tc_init(struct mlx5e_priv *priv);
 void mlx5e_tc_cleanup(struct mlx5e_priv *priv);
 
+int mlx5e_configure_flower(struct mlx5e_priv *priv, __be16 protocol,
+			   struct tc_cls_flower_offload *f);
+int mlx5e_delete_flower(struct mlx5e_priv *priv,
+			struct tc_cls_flower_offload *f);
+
 static inline int mlx5e_tc_num_filters(struct mlx5e_priv *priv)
 {
 	return atomic_read(&priv->fts.tc.ht.nelems);
-- 
2.7.0

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH net-next 8/8] net/mlx5e: Support offload cls_flower with sskbedit mark action
  2016-03-01 14:24 [PATCH net-next 0/8] cls_flower hardware offload support Amir Vadai
                   ` (6 preceding siblings ...)
  2016-03-01 14:24 ` [PATCH net-next 7/8] net/mlx5e: Support offload cls_flower with drop action Amir Vadai
@ 2016-03-01 14:24 ` Amir Vadai
  2016-03-01 14:58   ` Jiri Pirko
  2016-03-01 17:18 ` [PATCH net-next 0/8] cls_flower hardware offload support John Fastabend
  8 siblings, 1 reply; 31+ messages in thread
From: Amir Vadai @ 2016-03-01 14:24 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Or Gerlitz, John Fastabend, Saeed Mahameed,
	Hadar Har-Zion, Jiri Pirko, Amir Vadai

Introduce offloading of skbedit mark action.

For example, to mark with 0x1234, all TCP (ip_proto 6) packets arriving
to interface ens9:

 # tc qdisc add dev ens9 ingress
 # tc filter add dev ens9 protocol ip parent ffff: \
     flower ip_proto 6 \
     indev ens9 \
     action skbedit mark 0x1234

Signed-off-by: Amir Vadai <amir@vadai.me>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c |  3 +++
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 16 ++++++++++++++++
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.h |  2 ++
 3 files changed, 21 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index 519a07f..f293afe 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -35,6 +35,7 @@
 #include <linux/tcp.h>
 #include <net/busy_poll.h>
 #include "en.h"
+#include "en_tc.h"
 
 static inline bool mlx5e_rx_hw_stamp(struct mlx5e_tstamp *tstamp)
 {
@@ -224,6 +225,8 @@ static inline void mlx5e_build_rx_skb(struct mlx5_cqe64 *cqe,
 	if (cqe_has_vlan(cqe))
 		__vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q),
 				       be16_to_cpu(cqe->vlan_info));
+
+	skb->mark = be32_to_cpu(cqe->sop_drop_qpn) & MLX5E_TC_FLOW_ID_MASK;
 }
 
 int mlx5e_poll_rx_cq(struct mlx5e_cq *cq, int budget)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index 8fee983..22ab439 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -33,6 +33,7 @@
 #include <net/flow_dissector.h>
 #include <net/pkt_cls.h>
 #include <net/tc_act/tc_gact.h>
+#include <net/tc_act/tc_skbedit.h>
 #include <linux/mlx5/fs.h>
 #include <linux/mlx5/device.h>
 #include <linux/rhashtable.h>
@@ -287,6 +288,21 @@ static int parse_tc_actions(struct mlx5e_priv *priv, struct tcf_exts *exts,
 			continue;
 		}
 
+		if (is_tcf_skbedit_mark(a)) {
+			u32 mark = tcf_skbedit_mark(a);
+
+			if (mark & ~MLX5E_TC_FLOW_ID_MASK) {
+				netdev_warn(priv->netdev,
+					    "Bad flow mark - only 16 bit is supported: 0x%x\n",
+					    mark);
+				return -EINVAL;
+			}
+
+			*flow_tag = mark;
+			*action |= MLX5_FLOW_CONTEXT_ACTION_FWD_DEST;
+			continue;
+		}
+
 		return -EINVAL;
 	}
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.h b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.h
index f1e7180..155e9bd 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.h
@@ -33,6 +33,8 @@
 #ifndef __MLX5_EN_TC_H__
 #define __MLX5_EN_TC_H__
 
+#define MLX5E_TC_FLOW_ID_MASK 0x0000ffff
+
 void mlx5e_tc_init(struct mlx5e_priv *priv);
 void mlx5e_tc_cleanup(struct mlx5e_priv *priv);
 
-- 
2.7.0

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next 1/8] net/flower: Introduce hardware offload support
  2016-03-01 14:24 ` [PATCH net-next 1/8] net/flower: Introduce " Amir Vadai
@ 2016-03-01 14:47   ` Jiri Pirko
  2016-03-01 16:49     ` Amir Vadai
  2016-03-01 14:53   ` Jiri Pirko
  1 sibling, 1 reply; 31+ messages in thread
From: Jiri Pirko @ 2016-03-01 14:47 UTC (permalink / raw)
  To: Amir Vadai
  Cc: David S. Miller, netdev, Or Gerlitz, John Fastabend,
	Saeed Mahameed, Hadar Har-Zion, Jiri Pirko

Tue, Mar 01, 2016 at 03:24:43PM CET, amir@vadai.me wrote:
>This patch is based on a patch made by John Fastabend.
>It adds support for offloading cls_flower.
>A filter that is offloaded successfuly by hardware, will not be added to
>the hashtable and won't be processed by software.

That is wrong. User should explitly specify to not include rule into sw
by SKIP_KERNEL flag (does not exist now, with John's recent patch we'll
have only SKIP_HW). Please add that in this patchset.


>
>Suggested-by: John Fastabend <john.r.fastabend@intel.com>
>Signed-off-by: Amir Vadai <amir@vadai.me>
>---
> include/linux/netdevice.h    |  2 ++
> include/net/pkt_cls.h        | 14 +++++++++
> include/uapi/linux/pkt_cls.h |  2 ++
> net/sched/cls_flower.c       | 75 +++++++++++++++++++++++++++++++++++++++++---
> 4 files changed, 88 insertions(+), 5 deletions(-)
>
>diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>index e52077f..0fd329a 100644
>--- a/include/linux/netdevice.h
>+++ b/include/linux/netdevice.h
>@@ -785,6 +785,7 @@ typedef u16 (*select_queue_fallback_t)(struct net_device *dev,
> enum {
> 	TC_SETUP_MQPRIO,
> 	TC_SETUP_CLSU32,
>+	TC_SETUP_CLSFLOWER,
> };
> 
> struct tc_cls_u32_offload;
>@@ -794,6 +795,7 @@ struct tc_to_netdev {
> 	union {
> 		u8 tc;
> 		struct tc_cls_u32_offload *cls_u32;
>+		struct tc_cls_flower_offload *cls_flower;
> 	};
> };
> 
>diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
>index bea14ee..beb2ee1 100644
>--- a/include/net/pkt_cls.h
>+++ b/include/net/pkt_cls.h
>@@ -409,4 +409,18 @@ static inline bool tc_should_offload(struct net_device *dev, u32 flags)
> 	return true;
> }
> 
>+enum {
>+	TC_CLSFLOWER_REPLACE,
>+	TC_CLSFLOWER_DESTROY,
>+};
>+
>+struct tc_cls_flower_offload {
>+	int command;
>+	u64 cookie;
>+	struct flow_dissector *dissector;
>+	struct fl_flow_key *mask;
>+	struct fl_flow_key *key;
>+	struct tcf_exts *exts;
>+};
>+
> #endif
>diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h
>index 9874f568..c43c5f7 100644
>--- a/include/uapi/linux/pkt_cls.h
>+++ b/include/uapi/linux/pkt_cls.h
>@@ -417,6 +417,8 @@ enum {
> 	TCA_FLOWER_KEY_TCP_DST,		/* be16 */
> 	TCA_FLOWER_KEY_UDP_SRC,		/* be16 */
> 	TCA_FLOWER_KEY_UDP_DST,		/* be16 */
>+
>+	TCA_FLOWER_FLAGS,
> 	__TCA_FLOWER_MAX,
> };
> 
>diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
>index 95b0212..e599bea 100644
>--- a/net/sched/cls_flower.c
>+++ b/net/sched/cls_flower.c
>@@ -165,6 +165,53 @@ static void fl_destroy_filter(struct rcu_head *head)
> 	kfree(f);
> }
> 
>+static int fl_hw_destroy_filter(struct tcf_proto *tp, u64 cookie)
>+{
>+	struct net_device *dev = tp->q->dev_queue->dev;
>+	struct tc_cls_flower_offload offload = {0};
>+	struct tc_to_netdev tc;
>+
>+	if (!tc_should_offload(dev, 0))
>+		return -ENOTSUPP;
>+
>+	offload.command = TC_CLSFLOWER_DESTROY;
>+	offload.cookie = cookie;
>+
>+	tc.type = TC_SETUP_CLSFLOWER;
>+	tc.cls_flower = &offload;
>+
>+	return dev->netdev_ops->ndo_setup_tc(dev, tp->q->handle, tp->protocol,
>+					     &tc);
>+}
>+
>+static int fl_hw_replace_filter(struct tcf_proto *tp,
>+				struct flow_dissector *dissector,
>+				struct fl_flow_key *mask,
>+				struct fl_flow_key *key,
>+				struct tcf_exts *actions,
>+				u64 cookie, u32 flags)
>+{
>+	struct net_device *dev = tp->q->dev_queue->dev;
>+	struct tc_cls_flower_offload offload = {0};
>+	struct tc_to_netdev tc;
>+
>+	if (!tc_should_offload(dev, flags))
>+		return -ENOTSUPP;
>+
>+	offload.command = TC_CLSFLOWER_REPLACE;
>+	offload.cookie = cookie;
>+	offload.dissector = dissector;
>+	offload.mask = mask;
>+	offload.key = key;
>+	offload.exts = actions;
>+
>+	tc.type = TC_SETUP_CLSFLOWER;
>+	tc.cls_flower = &offload;
>+
>+	return dev->netdev_ops->ndo_setup_tc(dev, tp->q->handle, tp->protocol,
>+					     &tc);
>+}
>+
> static bool fl_destroy(struct tcf_proto *tp, bool force)
> {
> 	struct cls_fl_head *head = rtnl_dereference(tp->root);
>@@ -174,6 +221,7 @@ static bool fl_destroy(struct tcf_proto *tp, bool force)
> 		return false;
> 
> 	list_for_each_entry_safe(f, next, &head->filters, list) {
>+		fl_hw_destroy_filter(tp, (u64)f);
> 		list_del_rcu(&f->list);
> 		call_rcu(&f->rcu, fl_destroy_filter);
> 	}
>@@ -459,6 +507,7 @@ static int fl_change(struct net *net, struct sk_buff *in_skb,
> 	struct cls_fl_filter *fnew;
> 	struct nlattr *tb[TCA_FLOWER_MAX + 1];
> 	struct fl_flow_mask mask = {};
>+	u32 flags = 0;
> 	int err;
> 
> 	if (!tca[TCA_OPTIONS])
>@@ -494,13 +543,28 @@ static int fl_change(struct net *net, struct sk_buff *in_skb,
> 	if (err)
> 		goto errout;
> 
>-	err = rhashtable_insert_fast(&head->ht, &fnew->ht_node,
>-				     head->ht_params);
>-	if (err)
>-		goto errout;
>-	if (fold)
>+	if (tb[TCA_FLOWER_FLAGS])
>+		flags = nla_get_u32(tb[TCA_FLOWER_FLAGS]);
>+
>+	err = fl_hw_replace_filter(tp,
>+				   &head->dissector,
>+				   &mask.key,
>+				   &fnew->key,
>+				   &fnew->exts,
>+				   (u64)fnew,
>+				   flags);
>+	if (err) {
>+		err = rhashtable_insert_fast(&head->ht, &fnew->ht_node,
>+					     head->ht_params);
>+		if (err)
>+			goto errout;
>+	}
>+
>+	if (fold) {
> 		rhashtable_remove_fast(&head->ht, &fold->ht_node,
> 				       head->ht_params);
>+		fl_hw_destroy_filter(tp, (u64)fold);
>+	}
> 
> 	*arg = (unsigned long) fnew;
> 
>@@ -527,6 +591,7 @@ static int fl_delete(struct tcf_proto *tp, unsigned long arg)
> 	rhashtable_remove_fast(&head->ht, &f->ht_node,
> 			       head->ht_params);
> 	list_del_rcu(&f->list);
>+	fl_hw_destroy_filter(tp, (u64)f);
> 	tcf_unbind_filter(tp, &f->res);
> 	call_rcu(&f->rcu, fl_destroy_filter);
> 	return 0;
>-- 
>2.7.0
>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next 6/8] net/mlx5e: Introduce tc offload support
  2016-03-01 14:24 ` [PATCH net-next 6/8] net/mlx5e: Introduce tc offload support Amir Vadai
@ 2016-03-01 14:52   ` Jiri Pirko
  2016-03-01 17:00     ` Amir Vadai
  2016-03-01 15:59   ` Saeed Mahameed
  1 sibling, 1 reply; 31+ messages in thread
From: Jiri Pirko @ 2016-03-01 14:52 UTC (permalink / raw)
  To: Amir Vadai
  Cc: David S. Miller, netdev, Or Gerlitz, John Fastabend,
	Saeed Mahameed, Hadar Har-Zion, Jiri Pirko

Tue, Mar 01, 2016 at 03:24:48PM CET, amir@vadai.me wrote:
>Extend ndo_setup_tc() to support ingress tc offloading. Will be used by
>later patches to offload tc flower filter.
>
>Feature is off by default and could be enabled by issuing:
> # ethtool  -K eth0 hw-tc-offload on
>
>Offloads flow table is dynamically created when first filter is
>added.
>Rules are saved in a hash table that is maintained by the consumer (for
>example - the flower offload in the next patch).
>When last filter is removed and no filters exist in the hash table, the
>offload flow table is destroyed.

<snip>	
	
>@@ -1880,6 +1883,17 @@ static int mlx5e_setup_tc(struct net_device *netdev, u8 tc)
> static int mlx5e_ndo_setup_tc(struct net_device *dev, u32 handle,
> 			      __be16 proto, struct tc_to_netdev *tc)
> {
>+	struct mlx5e_priv *priv = netdev_priv(dev);
>+
>+	if (TC_H_MAJ(handle) != TC_H_MAJ(TC_H_INGRESS))
>+		goto mqprio;
>+
>+	switch (tc->type) {
>+	default:
>+		return -EINVAL;

-EOPNOTSUPP would be better here perhaps?


>+	}
>+
>+mqprio:
> 	if (handle != TC_H_ROOT || tc->type != TC_SETUP_MQPRIO)
> 		return -EINVAL;
> 
>@@ -1963,6 +1977,13 @@ static int mlx5e_set_features(struct net_device *netdev,
> 			mlx5e_disable_vlan_filter(priv);
> 	}
> 
>+	if ((changes & NETIF_F_HW_TC) && !(features & NETIF_F_HW_TC) &&
>+	    mlx5e_tc_num_filters(priv)) {
>+		netdev_err(netdev,
>+			   "Active offloaded tc filters, can't turn hw_tc_offload off\n");
>+		return -EINVAL;

This should not fail I believe. Just disable it in hw. I would even toss
away the rules if necessary.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next 1/8] net/flower: Introduce hardware offload support
  2016-03-01 14:24 ` [PATCH net-next 1/8] net/flower: Introduce " Amir Vadai
  2016-03-01 14:47   ` Jiri Pirko
@ 2016-03-01 14:53   ` Jiri Pirko
  2016-03-01 16:50     ` Amir Vadai
  1 sibling, 1 reply; 31+ messages in thread
From: Jiri Pirko @ 2016-03-01 14:53 UTC (permalink / raw)
  To: Amir Vadai
  Cc: David S. Miller, netdev, Or Gerlitz, John Fastabend,
	Saeed Mahameed, Hadar Har-Zion, Jiri Pirko

Tue, Mar 01, 2016 at 03:24:43PM CET, amir@vadai.me wrote:
>This patch is based on a patch made by John Fastabend.
>It adds support for offloading cls_flower.
>A filter that is offloaded successfuly by hardware, will not be added to
>the hashtable and won't be processed by software.
>

<snip>

>+enum {
>+	TC_CLSFLOWER_REPLACE,
>+	TC_CLSFLOWER_DESTROY,
>+};

Name this enum

>+
>+struct tc_cls_flower_offload {
>+	int command;

        ^^^ and use it here

>+	u64 cookie;
>+	struct flow_dissector *dissector;
>+	struct fl_flow_key *mask;
>+	struct fl_flow_key *key;
>+	struct tcf_exts *exts;
>+};

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next 7/8] net/mlx5e: Support offload cls_flower with drop action
  2016-03-01 14:24 ` [PATCH net-next 7/8] net/mlx5e: Support offload cls_flower with drop action Amir Vadai
@ 2016-03-01 14:55   ` Jiri Pirko
  2016-03-01 16:50     ` Amir Vadai
  2016-03-01 15:03   ` Jiri Pirko
  1 sibling, 1 reply; 31+ messages in thread
From: Jiri Pirko @ 2016-03-01 14:55 UTC (permalink / raw)
  To: Amir Vadai
  Cc: David S. Miller, netdev, Or Gerlitz, John Fastabend,
	Saeed Mahameed, Hadar Har-Zion, Jiri Pirko

Tue, Mar 01, 2016 at 03:24:49PM CET, amir@vadai.me wrote:
>Parse tc_cls_flower_offload into device specific commands and program
>the hardware to classify and act accordingly.
>
>For example, to drop ICMP (ip_proto 1) packets from specific smac, dmac,
>src_ip, src_ip, arriving to interface ens9:
>
> # tc qdisc add dev ens9 ingress
>
> # tc filter add dev ens9 protocol ip parent ffff: \
>     flower ip_proto 1 \
>     dst_mac 7c:fe:90:69:81:62 src_mac 7c:fe:90:69:81:56 \
>     dst_ip 11.11.11.11 src_ip 11.11.11.12 indev ens9 \
>     action drop
>
>Signed-off-by: Amir Vadai <amir@vadai.me>
>Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
>---
> drivers/net/ethernet/mellanox/mlx5/core/en_main.c |   9 +
> drivers/net/ethernet/mellanox/mlx5/core/en_tc.c   | 287 ++++++++++++++++++++++
> drivers/net/ethernet/mellanox/mlx5/core/en_tc.h   |   5 +
> 3 files changed, 301 insertions(+)
>
>diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
>index cb02b4c..1d1da94 100644
>--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
>+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
>@@ -1889,6 +1889,15 @@ static int mlx5e_ndo_setup_tc(struct net_device *dev, u32 handle,
> 		goto mqprio;
> 
> 	switch (tc->type) {
>+	case TC_SETUP_CLSFLOWER:
>+		switch (tc->cls_flower->command) {
>+		case TC_CLSFLOWER_REPLACE:
>+			return mlx5e_configure_flower(priv, proto, tc->cls_flower);
>+		case TC_CLSFLOWER_DESTROY:
>+			return mlx5e_delete_flower(priv, tc->cls_flower);
>+		default:
>+			return -EINVAL;

No need to default case here is you change "command" to enum as I
suggested it the reply to patch 1.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next 8/8] net/mlx5e: Support offload cls_flower with sskbedit mark action
  2016-03-01 14:24 ` [PATCH net-next 8/8] net/mlx5e: Support offload cls_flower with sskbedit mark action Amir Vadai
@ 2016-03-01 14:58   ` Jiri Pirko
  2016-03-01 16:53     ` Amir Vadai
  0 siblings, 1 reply; 31+ messages in thread
From: Jiri Pirko @ 2016-03-01 14:58 UTC (permalink / raw)
  To: Amir Vadai
  Cc: David S. Miller, netdev, Or Gerlitz, John Fastabend,
	Saeed Mahameed, Hadar Har-Zion, Jiri Pirko

Tue, Mar 01, 2016 at 03:24:50PM CET, amir@vadai.me wrote:
>Introduce offloading of skbedit mark action.
>
>For example, to mark with 0x1234, all TCP (ip_proto 6) packets arriving
>to interface ens9:
>
> # tc qdisc add dev ens9 ingress
> # tc filter add dev ens9 protocol ip parent ffff: \
>     flower ip_proto 6 \
>     indev ens9 \
>     action skbedit mark 0x1234
>

<snip>

>@@ -287,6 +288,21 @@ static int parse_tc_actions(struct mlx5e_priv *priv, struct tcf_exts *exts,
> 			continue;
> 		}
> 
>+		if (is_tcf_skbedit_mark(a)) {
>+			u32 mark = tcf_skbedit_mark(a);
>+
>+			if (mark & ~MLX5E_TC_FLOW_ID_MASK) {
>+				netdev_warn(priv->netdev,
>+					    "Bad flow mark - only 16 bit is supported: 0x%x\n",

You can start printk string on the first line.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next 7/8] net/mlx5e: Support offload cls_flower with drop action
  2016-03-01 14:24 ` [PATCH net-next 7/8] net/mlx5e: Support offload cls_flower with drop action Amir Vadai
  2016-03-01 14:55   ` Jiri Pirko
@ 2016-03-01 15:03   ` Jiri Pirko
  1 sibling, 0 replies; 31+ messages in thread
From: Jiri Pirko @ 2016-03-01 15:03 UTC (permalink / raw)
  To: Amir Vadai
  Cc: David S. Miller, netdev, Or Gerlitz, John Fastabend,
	Saeed Mahameed, Hadar Har-Zion, Jiri Pirko

Tue, Mar 01, 2016 at 03:24:49PM CET, amir@vadai.me wrote:
>Parse tc_cls_flower_offload into device specific commands and program
>the hardware to classify and act accordingly.
>
>For example, to drop ICMP (ip_proto 1) packets from specific smac, dmac,
>src_ip, src_ip, arriving to interface ens9:
>
> # tc qdisc add dev ens9 ingress
>
> # tc filter add dev ens9 protocol ip parent ffff: \
>     flower ip_proto 1 \
>     dst_mac 7c:fe:90:69:81:62 src_mac 7c:fe:90:69:81:56 \
>     dst_ip 11.11.11.11 src_ip 11.11.11.12 indev ens9 \
>     action drop
>

<snip>

>+static int parse_tc_actions(struct mlx5e_priv *priv, struct tcf_exts *exts,
>+			    u32 *action, u32 *flow_tag)
>+{
>+#ifdef CONFIG_NET_CLS_ACT
>+	const struct tc_action *a;
>+
>+	*flow_tag = MLX5_FS_DEFAULT_FLOW_TAG;
>+	*action = 0;
>+
>+	if (list_empty(&exts->actions))
>+		return -EINVAL;
>+
>+	list_for_each_entry(a, &exts->actions, list) {

I think it would make sense to make this a macro "tc_for_each_action"
and "tc_no_action" for empty list and you can loose ugly
CONFIG_NET_CLS_ACT ifdef

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next 6/8] net/mlx5e: Introduce tc offload support
  2016-03-01 14:24 ` [PATCH net-next 6/8] net/mlx5e: Introduce tc offload support Amir Vadai
  2016-03-01 14:52   ` Jiri Pirko
@ 2016-03-01 15:59   ` Saeed Mahameed
  2016-03-01 17:07     ` Amir Vadai
  1 sibling, 1 reply; 31+ messages in thread
From: Saeed Mahameed @ 2016-03-01 15:59 UTC (permalink / raw)
  To: Amir Vadai
  Cc: David S. Miller, Linux Netdev List, Or Gerlitz, John Fastabend,
	Saeed Mahameed, Hadar Har-Zion, Jiri Pirko

On Tue, Mar 1, 2016 at 4:24 PM, Amir Vadai <amir@vadai.me> wrote:
> +#define FT_CAP(f) MLX5_CAP_FLOWTABLE(mdev, flow_table_properties_nic_receive.f)
> +       if (FT_CAP(flow_modify_en) &&
> +           FT_CAP(modify_root) &&
> +           FT_CAP(identified_miss_table_mode) &&
> +           FT_CAP(flow_table_modify))
> +               priv->netdev->hw_features      |= NETIF_F_HW_TC;
> +
>         netdev->features         |= NETIF_F_HIGHDMA;
>
>         netdev->priv_flags       |= IFF_UNICAST_FLT;
>
> +       mlx5e_tc_init(priv);

This is not the place for this, We usually do internal data structure
initialization  after we create all HW resources in
mlx5e_create_netdev
Please see mlx5e_vxlan_init as example, and you already call
mlx5e_tc_cleanup inside mlx5e_destroy_netdev, please move the
mlx5e_tc_init
to mlx5e_create_netdev after HW resources creation,


> @@ -2558,6 +2588,7 @@ static void mlx5e_destroy_netdev(struct mlx5_core_dev *mdev, void *vpriv)
>         mlx5_core_dealloc_transport_domain(priv->mdev, priv->tdn);
>         mlx5_core_dealloc_pd(priv->mdev, priv->pdn);
>         mlx5_unmap_free_uar(priv->mdev, &priv->cq_uar);
> +       mlx5e_tc_cleanup(priv);

I would suggest to move  mlx5e_tc_init to be right after
mlx5e_vxlan_init and mlx5e_tc_cleanup before mlx5e_vxlan_cleanup.

> +struct mlx5_flow_rule *mlx5e_tc_add_flow(struct mlx5e_priv *priv,
> +                                        u32 *match_c, u32 *match_v,
> +                                        u32 action, u32 flow_tag)
> +{
> +       struct mlx5_flow_destination dest = {
> +               .type = MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE,
> +               {.ft = priv->fts.vlan.t},
> +       };
> +       struct mlx5_flow_rule *rule;
> +       bool table_created = false;
> +
> +       if (IS_ERR_OR_NULL(priv->fts.tc.t)) {
> +               priv->fts.tc.t =
> +                       mlx5_create_auto_grouped_flow_table(priv->fts.ns, 0,
> +                                                           MLX5E_TC_FLOW_TABLE_NUM_ENTRIES,
> +                                                           MLX5E_TC_FLOW_TABLE_NUM_GROUPS);
> +               if (IS_ERR(priv->fts.tc.t)) {
> +                       netdev_err(priv->netdev,
> +                                  "Failed to create tc offload table\n");
> +                       return ERR_CAST(priv->fts.tc.t);

Here priv->fts.tc.t will be invalid pointer and in your code you treat
it as NULL in case of failure.

> +               }
> +
> +               table_created = true;
> +       }
> +
> +       rule = mlx5_add_flow_rule(priv->fts.tc.t, MLX5_MATCH_OUTER_HEADERS,
> +                                 match_c, match_v,
> +                                 action, flow_tag,
> +                                 action & MLX5_FLOW_CONTEXT_ACTION_FWD_DEST ? &dest : NULL);
> +
> +       if (IS_ERR(rule) && table_created) {
> +               mlx5_destroy_flow_table(priv->fts.tc.t);
> +               priv->fts.tc.t = NULL;
> +       }
> +
> +       return rule;
> +}
> +

> +void mlx5e_tc_cleanup(struct mlx5e_priv *priv)
> +{
> +       struct mlx5e_tc_flow_table *tc = &priv->fts.tc;
> +
> +       rhashtable_free_and_destroy(&tc->ht, _mlx5e_tc_del_flow, priv);
> +
> +       if (priv->fts.tc.t) {

priv->fts.tc.t will be invalid pointer and this test will pass in case
 mlx5_create_auto_grouped_flow_table had failed

> +               mlx5_destroy_flow_table(priv->fts.tc.t);
> +               priv->fts.tc.t = NULL;
> +       }
> +}

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next 1/8] net/flower: Introduce hardware offload support
  2016-03-01 14:47   ` Jiri Pirko
@ 2016-03-01 16:49     ` Amir Vadai
  2016-03-01 17:01       ` Jiri Pirko
  0 siblings, 1 reply; 31+ messages in thread
From: Amir Vadai @ 2016-03-01 16:49 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: David S. Miller, netdev, Or Gerlitz, John Fastabend,
	Saeed Mahameed, Hadar Har-Zion, Jiri Pirko

On Tue, Mar 01, 2016 at 03:47:19PM +0100, Jiri Pirko wrote:
> Tue, Mar 01, 2016 at 03:24:43PM CET, amir@vadai.me wrote:
> >This patch is based on a patch made by John Fastabend.
> >It adds support for offloading cls_flower.
> >A filter that is offloaded successfuly by hardware, will not be added to
> >the hashtable and won't be processed by software.
> 
> That is wrong. User should explitly specify to not include rule into sw
> by SKIP_KERNEL flag (does not exist now, with John's recent patch we'll
> have only SKIP_HW). Please add that in this patchset.
Why? If a rule is offloaded, why would the user want to reprocess it by
software?
If the user use SKIP_HW, it will be processed by SW. Else, the user
would want it to be processed by HW or fallback to SW. I don't
understand in which case the user would like to have it done twice.

> 

[..]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next 1/8] net/flower: Introduce hardware offload support
  2016-03-01 14:53   ` Jiri Pirko
@ 2016-03-01 16:50     ` Amir Vadai
  0 siblings, 0 replies; 31+ messages in thread
From: Amir Vadai @ 2016-03-01 16:50 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: David S. Miller, netdev, Or Gerlitz, John Fastabend,
	Saeed Mahameed, Hadar Har-Zion, Jiri Pirko

On Tue, Mar 01, 2016 at 03:53:44PM +0100, Jiri Pirko wrote:
> Tue, Mar 01, 2016 at 03:24:43PM CET, amir@vadai.me wrote:
> >This patch is based on a patch made by John Fastabend.
> >It adds support for offloading cls_flower.
> >A filter that is offloaded successfuly by hardware, will not be added to
> >the hashtable and won't be processed by software.
> >
> 
> <snip>
> 
> >+enum {
> >+	TC_CLSFLOWER_REPLACE,
> >+	TC_CLSFLOWER_DESTROY,
> >+};
> 
> Name this enum
Right,
Thanks

> 
> >+
> >+struct tc_cls_flower_offload {
> >+	int command;
> 
>         ^^^ and use it here
> 
> >+	u64 cookie;
> >+	struct flow_dissector *dissector;
> >+	struct fl_flow_key *mask;
> >+	struct fl_flow_key *key;
> >+	struct tcf_exts *exts;
> >+};

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next 7/8] net/mlx5e: Support offload cls_flower with drop action
  2016-03-01 14:55   ` Jiri Pirko
@ 2016-03-01 16:50     ` Amir Vadai
  0 siblings, 0 replies; 31+ messages in thread
From: Amir Vadai @ 2016-03-01 16:50 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: David S. Miller, netdev, Or Gerlitz, John Fastabend,
	Saeed Mahameed, Hadar Har-Zion, Jiri Pirko

On Tue, Mar 01, 2016 at 03:55:48PM +0100, Jiri Pirko wrote:
> Tue, Mar 01, 2016 at 03:24:49PM CET, amir@vadai.me wrote:
> >Parse tc_cls_flower_offload into device specific commands and program
> >the hardware to classify and act accordingly.
> >
> >For example, to drop ICMP (ip_proto 1) packets from specific smac, dmac,
> >src_ip, src_ip, arriving to interface ens9:
> >
> > # tc qdisc add dev ens9 ingress
> >
> > # tc filter add dev ens9 protocol ip parent ffff: \
> >     flower ip_proto 1 \
> >     dst_mac 7c:fe:90:69:81:62 src_mac 7c:fe:90:69:81:56 \
> >     dst_ip 11.11.11.11 src_ip 11.11.11.12 indev ens9 \
> >     action drop
> >
> >Signed-off-by: Amir Vadai <amir@vadai.me>
> >Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
> >---
> > drivers/net/ethernet/mellanox/mlx5/core/en_main.c |   9 +
> > drivers/net/ethernet/mellanox/mlx5/core/en_tc.c   | 287 ++++++++++++++++++++++
> > drivers/net/ethernet/mellanox/mlx5/core/en_tc.h   |   5 +
> > 3 files changed, 301 insertions(+)
> >
> >diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> >index cb02b4c..1d1da94 100644
> >--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> >+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> >@@ -1889,6 +1889,15 @@ static int mlx5e_ndo_setup_tc(struct net_device *dev, u32 handle,
> > 		goto mqprio;
> > 
> > 	switch (tc->type) {
> >+	case TC_SETUP_CLSFLOWER:
> >+		switch (tc->cls_flower->command) {
> >+		case TC_CLSFLOWER_REPLACE:
> >+			return mlx5e_configure_flower(priv, proto, tc->cls_flower);
> >+		case TC_CLSFLOWER_DESTROY:
> >+			return mlx5e_delete_flower(priv, tc->cls_flower);
> >+		default:
> >+			return -EINVAL;
> 
> No need to default case here is you change "command" to enum as I
> suggested it the reply to patch 1.
ack

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next 8/8] net/mlx5e: Support offload cls_flower with sskbedit mark action
  2016-03-01 14:58   ` Jiri Pirko
@ 2016-03-01 16:53     ` Amir Vadai
  0 siblings, 0 replies; 31+ messages in thread
From: Amir Vadai @ 2016-03-01 16:53 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: David S. Miller, netdev, Or Gerlitz, John Fastabend,
	Saeed Mahameed, Hadar Har-Zion, Jiri Pirko

On Tue, Mar 01, 2016 at 03:58:10PM +0100, Jiri Pirko wrote:
> Tue, Mar 01, 2016 at 03:24:50PM CET, amir@vadai.me wrote:
> >Introduce offloading of skbedit mark action.
> >
> >For example, to mark with 0x1234, all TCP (ip_proto 6) packets arriving
> >to interface ens9:
> >
> > # tc qdisc add dev ens9 ingress
> > # tc filter add dev ens9 protocol ip parent ffff: \
> >     flower ip_proto 6 \
> >     indev ens9 \
> >     action skbedit mark 0x1234
> >
> 
> <snip>
> 
> >@@ -287,6 +288,21 @@ static int parse_tc_actions(struct mlx5e_priv *priv, struct tcf_exts *exts,
> > 			continue;
> > 		}
> > 
> >+		if (is_tcf_skbedit_mark(a)) {
> >+			u32 mark = tcf_skbedit_mark(a);
> >+
> >+			if (mark & ~MLX5E_TC_FLOW_ID_MASK) {
> >+				netdev_warn(priv->netdev,
> >+					    "Bad flow mark - only 16 bit is supported: 0x%x\n",
> 
> You can start printk string on the first line.
ack

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next 6/8] net/mlx5e: Introduce tc offload support
  2016-03-01 14:52   ` Jiri Pirko
@ 2016-03-01 17:00     ` Amir Vadai
  2016-03-01 17:13       ` John Fastabend
  0 siblings, 1 reply; 31+ messages in thread
From: Amir Vadai @ 2016-03-01 17:00 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: David S. Miller, netdev, Or Gerlitz, John Fastabend,
	Saeed Mahameed, Hadar Har-Zion, Jiri Pirko

On Tue, Mar 01, 2016 at 03:52:08PM +0100, Jiri Pirko wrote:
> Tue, Mar 01, 2016 at 03:24:48PM CET, amir@vadai.me wrote:
> >Extend ndo_setup_tc() to support ingress tc offloading. Will be used by
> >later patches to offload tc flower filter.
> >
> >Feature is off by default and could be enabled by issuing:
> > # ethtool  -K eth0 hw-tc-offload on
> >
> >Offloads flow table is dynamically created when first filter is
> >added.
> >Rules are saved in a hash table that is maintained by the consumer (for
> >example - the flower offload in the next patch).
> >When last filter is removed and no filters exist in the hash table, the
> >offload flow table is destroyed.
> 
> <snip>	
> 	
> >@@ -1880,6 +1883,17 @@ static int mlx5e_setup_tc(struct net_device *netdev, u8 tc)
> > static int mlx5e_ndo_setup_tc(struct net_device *dev, u32 handle,
> > 			      __be16 proto, struct tc_to_netdev *tc)
> > {
> >+	struct mlx5e_priv *priv = netdev_priv(dev);
> >+
> >+	if (TC_H_MAJ(handle) != TC_H_MAJ(TC_H_INGRESS))
> >+		goto mqprio;
> >+
> >+	switch (tc->type) {
> >+	default:
> >+		return -EINVAL;
> 
> -EOPNOTSUPP would be better here perhaps?
> 
> 
> >+	}
> >+
> >+mqprio:
> > 	if (handle != TC_H_ROOT || tc->type != TC_SETUP_MQPRIO)
> > 		return -EINVAL;
> > 
> >@@ -1963,6 +1977,13 @@ static int mlx5e_set_features(struct net_device *netdev,
> > 			mlx5e_disable_vlan_filter(priv);
> > 	}
> > 
> >+	if ((changes & NETIF_F_HW_TC) && !(features & NETIF_F_HW_TC) &&
> >+	    mlx5e_tc_num_filters(priv)) {
> >+		netdev_err(netdev,
> >+			   "Active offloaded tc filters, can't turn hw_tc_offload off\n");
> >+		return -EINVAL;
> 
> This should not fail I believe. Just disable it in hw. I would even toss
> away the rules if necessary.
It depends on the answer regarding your comment on the previous patch.
If we have the rule in both SW and HW, and remove it from the HW it is
ok (although, currently I don't understand why would anyone want in both
places).
If the rule is processed by HW only - turning off this flag, will
disable the offloaded rules - it might be misleading, so I prefered not
to allow it and print a message.


> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next 1/8] net/flower: Introduce hardware offload support
  2016-03-01 16:49     ` Amir Vadai
@ 2016-03-01 17:01       ` Jiri Pirko
  2016-03-02 11:14         ` Or Gerlitz
  0 siblings, 1 reply; 31+ messages in thread
From: Jiri Pirko @ 2016-03-01 17:01 UTC (permalink / raw)
  To: Amir Vadai
  Cc: David S. Miller, netdev, Or Gerlitz, John Fastabend,
	Saeed Mahameed, Hadar Har-Zion, Jiri Pirko

Tue, Mar 01, 2016 at 05:49:27PM CET, amir@vadai.me wrote:
>On Tue, Mar 01, 2016 at 03:47:19PM +0100, Jiri Pirko wrote:
>> Tue, Mar 01, 2016 at 03:24:43PM CET, amir@vadai.me wrote:
>> >This patch is based on a patch made by John Fastabend.
>> >It adds support for offloading cls_flower.
>> >A filter that is offloaded successfuly by hardware, will not be added to
>> >the hashtable and won't be processed by software.
>> 
>> That is wrong. User should explitly specify to not include rule into sw
>> by SKIP_KERNEL flag (does not exist now, with John's recent patch we'll
>> have only SKIP_HW). Please add that in this patchset.
>Why? If a rule is offloaded, why would the user want to reprocess it by
>software?
>If the user use SKIP_HW, it will be processed by SW. Else, the user
>would want it to be processed by HW or fallback to SW. I don't
>understand in which case the user would like to have it done twice.

For example if you turn on the offloading by unsetting NETIF_F_HW_TC.
Or if someone inserts skbs into rx path directly, for example pktgen.
We need SKIP_KERNEL to be set by user, not implicit.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next 6/8] net/mlx5e: Introduce tc offload support
  2016-03-01 15:59   ` Saeed Mahameed
@ 2016-03-01 17:07     ` Amir Vadai
  0 siblings, 0 replies; 31+ messages in thread
From: Amir Vadai @ 2016-03-01 17:07 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: David S. Miller, Linux Netdev List, Or Gerlitz, John Fastabend,
	Saeed Mahameed, Hadar Har-Zion, Jiri Pirko

On Tue, Mar 01, 2016 at 05:59:59PM +0200, Saeed Mahameed wrote:
> On Tue, Mar 1, 2016 at 4:24 PM, Amir Vadai <amir@vadai.me> wrote:
> > +#define FT_CAP(f) MLX5_CAP_FLOWTABLE(mdev, flow_table_properties_nic_receive.f)
> > +       if (FT_CAP(flow_modify_en) &&
> > +           FT_CAP(modify_root) &&
> > +           FT_CAP(identified_miss_table_mode) &&
> > +           FT_CAP(flow_table_modify))
> > +               priv->netdev->hw_features      |= NETIF_F_HW_TC;
> > +
> >         netdev->features         |= NETIF_F_HIGHDMA;
> >
> >         netdev->priv_flags       |= IFF_UNICAST_FLT;
> >
> > +       mlx5e_tc_init(priv);
> 
> This is not the place for this, We usually do internal data structure
> initialization  after we create all HW resources in
> mlx5e_create_netdev
> Please see mlx5e_vxlan_init as example, and you already call
> mlx5e_tc_cleanup inside mlx5e_destroy_netdev, please move the
> mlx5e_tc_init
> to mlx5e_create_netdev after HW resources creation,
ack

> 
> 
> > @@ -2558,6 +2588,7 @@ static void mlx5e_destroy_netdev(struct mlx5_core_dev *mdev, void *vpriv)
> >         mlx5_core_dealloc_transport_domain(priv->mdev, priv->tdn);
> >         mlx5_core_dealloc_pd(priv->mdev, priv->pdn);
> >         mlx5_unmap_free_uar(priv->mdev, &priv->cq_uar);
> > +       mlx5e_tc_cleanup(priv);
> 
> I would suggest to move  mlx5e_tc_init to be right after
> mlx5e_vxlan_init and mlx5e_tc_cleanup before mlx5e_vxlan_cleanup.
ok

> 
> > +struct mlx5_flow_rule *mlx5e_tc_add_flow(struct mlx5e_priv *priv,
> > +                                        u32 *match_c, u32 *match_v,
> > +                                        u32 action, u32 flow_tag)
> > +{
> > +       struct mlx5_flow_destination dest = {
> > +               .type = MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE,
> > +               {.ft = priv->fts.vlan.t},
> > +       };
> > +       struct mlx5_flow_rule *rule;
> > +       bool table_created = false;
> > +
> > +       if (IS_ERR_OR_NULL(priv->fts.tc.t)) {
> > +               priv->fts.tc.t =
> > +                       mlx5_create_auto_grouped_flow_table(priv->fts.ns, 0,
> > +                                                           MLX5E_TC_FLOW_TABLE_NUM_ENTRIES,
> > +                                                           MLX5E_TC_FLOW_TABLE_NUM_GROUPS);
> > +               if (IS_ERR(priv->fts.tc.t)) {
> > +                       netdev_err(priv->netdev,
> > +                                  "Failed to create tc offload table\n");
> > +                       return ERR_CAST(priv->fts.tc.t);
> 
> Here priv->fts.tc.t will be invalid pointer and in your code you treat
> it as NULL in case of failure.
> 
> > +               }
> > +
> > +               table_created = true;
> > +       }
> > +
> > +       rule = mlx5_add_flow_rule(priv->fts.tc.t, MLX5_MATCH_OUTER_HEADERS,
> > +                                 match_c, match_v,
> > +                                 action, flow_tag,
> > +                                 action & MLX5_FLOW_CONTEXT_ACTION_FWD_DEST ? &dest : NULL);
> > +
> > +       if (IS_ERR(rule) && table_created) {
> > +               mlx5_destroy_flow_table(priv->fts.tc.t);
> > +               priv->fts.tc.t = NULL;
> > +       }
> > +
> > +       return rule;
> > +}
> > +
> 
> > +void mlx5e_tc_cleanup(struct mlx5e_priv *priv)
> > +{
> > +       struct mlx5e_tc_flow_table *tc = &priv->fts.tc;
> > +
> > +       rhashtable_free_and_destroy(&tc->ht, _mlx5e_tc_del_flow, priv);
> > +
> > +       if (priv->fts.tc.t) {
> 
> priv->fts.tc.t will be invalid pointer and this test will pass in case
>  mlx5_create_auto_grouped_flow_table had failed
Yeh - should have used !IS_ERR_OR_NULL() or set it to NULL on failure
above.

Thanks,
Amir


> 
> > +               mlx5_destroy_flow_table(priv->fts.tc.t);
> > +               priv->fts.tc.t = NULL;
> > +       }
> > +}

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next 6/8] net/mlx5e: Introduce tc offload support
  2016-03-01 17:00     ` Amir Vadai
@ 2016-03-01 17:13       ` John Fastabend
  2016-03-02 15:53         ` Amir Vadai
  0 siblings, 1 reply; 31+ messages in thread
From: John Fastabend @ 2016-03-01 17:13 UTC (permalink / raw)
  To: Amir Vadai, Jiri Pirko
  Cc: David S. Miller, netdev, Or Gerlitz, John Fastabend,
	Saeed Mahameed, Hadar Har-Zion, Jiri Pirko

On 16-03-01 09:00 AM, Amir Vadai wrote:
> On Tue, Mar 01, 2016 at 03:52:08PM +0100, Jiri Pirko wrote:
>> Tue, Mar 01, 2016 at 03:24:48PM CET, amir@vadai.me wrote:
>>> Extend ndo_setup_tc() to support ingress tc offloading. Will be used by
>>> later patches to offload tc flower filter.
>>>
>>> Feature is off by default and could be enabled by issuing:
>>> # ethtool  -K eth0 hw-tc-offload on
>>>
>>> Offloads flow table is dynamically created when first filter is
>>> added.
>>> Rules are saved in a hash table that is maintained by the consumer (for
>>> example - the flower offload in the next patch).
>>> When last filter is removed and no filters exist in the hash table, the
>>> offload flow table is destroyed.
>>
>> <snip>	
>> 	
>>> @@ -1880,6 +1883,17 @@ static int mlx5e_setup_tc(struct net_device *netdev, u8 tc)
>>> static int mlx5e_ndo_setup_tc(struct net_device *dev, u32 handle,
>>> 			      __be16 proto, struct tc_to_netdev *tc)
>>> {
>>> +	struct mlx5e_priv *priv = netdev_priv(dev);
>>> +
>>> +	if (TC_H_MAJ(handle) != TC_H_MAJ(TC_H_INGRESS))
>>> +		goto mqprio;
>>> +
>>> +	switch (tc->type) {
>>> +	default:
>>> +		return -EINVAL;
>>
>> -EOPNOTSUPP would be better here perhaps?
>>
>>
>>> +	}
>>> +
>>> +mqprio:
>>> 	if (handle != TC_H_ROOT || tc->type != TC_SETUP_MQPRIO)
>>> 		return -EINVAL;
>>>
>>> @@ -1963,6 +1977,13 @@ static int mlx5e_set_features(struct net_device *netdev,
>>> 			mlx5e_disable_vlan_filter(priv);
>>> 	}
>>>
>>> +	if ((changes & NETIF_F_HW_TC) && !(features & NETIF_F_HW_TC) &&
>>> +	    mlx5e_tc_num_filters(priv)) {
>>> +		netdev_err(netdev,
>>> +			   "Active offloaded tc filters, can't turn hw_tc_offload off\n");
>>> +		return -EINVAL;
>>
>> This should not fail I believe. Just disable it in hw. I would even toss
>> away the rules if necessary.
> It depends on the answer regarding your comment on the previous patch.
> If we have the rule in both SW and HW, and remove it from the HW it is
> ok (although, currently I don't understand why would anyone want in both
> places).
> If the rule is processed by HW only - turning off this flag, will
> disable the offloaded rules - it might be misleading, so I prefered not
> to allow it and print a message.

When we get the HW only mode we will need to also flush the hardware
representation in software as well as the hardware state.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next 0/8] cls_flower hardware offload support
  2016-03-01 14:24 [PATCH net-next 0/8] cls_flower hardware offload support Amir Vadai
                   ` (7 preceding siblings ...)
  2016-03-01 14:24 ` [PATCH net-next 8/8] net/mlx5e: Support offload cls_flower with sskbedit mark action Amir Vadai
@ 2016-03-01 17:18 ` John Fastabend
  8 siblings, 0 replies; 31+ messages in thread
From: John Fastabend @ 2016-03-01 17:18 UTC (permalink / raw)
  To: Amir Vadai, David S. Miller
  Cc: netdev, Or Gerlitz, John Fastabend, Saeed Mahameed,
	Hadar Har-Zion, Jiri Pirko

On 16-03-01 06:24 AM, Amir Vadai wrote:
> Hi,
> 
> This patchset introduces cls_flower hardware offload support over ConnectX-4
> driver, more hardware vendors are welcome to use it too.
> 
> This patchset is based on John's infrastructure for tc offloading [2] to add
> hardware offload support to the flower filter. It also extends the support to
> an additional tc action - skbedit mark operation.
> NIC driver that was used is ConnectX-4. Feature is off by default and could be
> turned on using ethtool.
> 
> Some commands to use this code:
> 

This series depends on these patches still under review,

https://patchwork.ozlabs.org/patch/589150/
https://patchwork.ozlabs.org/patch/589151/
https://patchwork.ozlabs.org/patch/589152/

Just a heads up for DaveM there is an ordering requirement here.

Thanks,
John

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next 1/8] net/flower: Introduce hardware offload support
  2016-03-01 17:01       ` Jiri Pirko
@ 2016-03-02 11:14         ` Or Gerlitz
  2016-03-02 13:22           ` Jiri Pirko
  0 siblings, 1 reply; 31+ messages in thread
From: Or Gerlitz @ 2016-03-02 11:14 UTC (permalink / raw)
  To: Jiri Pirko, John Fastabend, Jamal Hadi Salim
  Cc: Amir Vadai, David S. Miller, Linux Netdev List, Or Gerlitz,
	Saeed Mahameed, Hadar Har-Zion, Jiri Pirko

On Tue, Mar 1, 2016 at 7:01 PM, Jiri Pirko <jiri@resnulli.us> wrote:
> Tue, Mar 01, 2016 at 05:49:27PM CET, amir@vadai.me wrote:
>>On Tue, Mar 01, 2016 at 03:47:19PM +0100, Jiri Pirko wrote:
>>> Tue, Mar 01, 2016 at 03:24:43PM CET, amir@vadai.me wrote:
>>> >This patch is based on a patch made by John Fastabend.

>>> >It adds support for offloading cls_flower.
>>> >A filter that is offloaded successfully by hardware, will not be added to
>>> >the hashtable and won't be processed by software.

>>> That is wrong. User should explicitly specify to not include rule into sw
>>> by SKIP_KERNEL flag (does not exist now, with John's recent patch we'll
>>> have only SKIP_HW). Please add that in this patchset.

>> Why? If a rule is offloaded, why would the user want to reprocess it by software?

>> If the user use SKIP_HW, it will be processed by SW. Else, the user
>> would want it to be processed by HW or fallback to SW. I don't
>> understand in which case the user would like to have it done twice.

> For example if you turn on the offloading by unsetting NETIF_F_HW_TC.
> Or if someone inserts skbs into rx path directly, for example pktgen.
> We need SKIP_KERNEL to be set by user, not implicit.

As discussed in netdev, we want to have three modes for TC offloads

1. SW only
2. HW only (and err if can't)
3. HW and if not supported fallback to SW

Now, from your reply, I understand we want a fourth mode

4. Both (HW and SW)

Do we agree that these four policies/modes make sense?

So you want #4 to be the default? can you elaborate why? note that for
the HW marking
case a "both" mode will be very inefficient, also for actions like
vlan push/pop, encap/decap, etc
the result will be just wrong... I don't think this should be the default.

Or.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next 1/8] net/flower: Introduce hardware offload support
  2016-03-02 11:14         ` Or Gerlitz
@ 2016-03-02 13:22           ` Jiri Pirko
  2016-03-02 16:22             ` John Fastabend
  0 siblings, 1 reply; 31+ messages in thread
From: Jiri Pirko @ 2016-03-02 13:22 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: John Fastabend, Jamal Hadi Salim, Amir Vadai, David S. Miller,
	Linux Netdev List, Or Gerlitz, Saeed Mahameed, Hadar Har-Zion,
	Jiri Pirko

Wed, Mar 02, 2016 at 12:14:39PM CET, gerlitz.or@gmail.com wrote:
>On Tue, Mar 1, 2016 at 7:01 PM, Jiri Pirko <jiri@resnulli.us> wrote:
>> Tue, Mar 01, 2016 at 05:49:27PM CET, amir@vadai.me wrote:
>>>On Tue, Mar 01, 2016 at 03:47:19PM +0100, Jiri Pirko wrote:
>>>> Tue, Mar 01, 2016 at 03:24:43PM CET, amir@vadai.me wrote:
>>>> >This patch is based on a patch made by John Fastabend.
>
>>>> >It adds support for offloading cls_flower.
>>>> >A filter that is offloaded successfully by hardware, will not be added to
>>>> >the hashtable and won't be processed by software.
>
>>>> That is wrong. User should explicitly specify to not include rule into sw
>>>> by SKIP_KERNEL flag (does not exist now, with John's recent patch we'll
>>>> have only SKIP_HW). Please add that in this patchset.
>
>>> Why? If a rule is offloaded, why would the user want to reprocess it by software?
>
>>> If the user use SKIP_HW, it will be processed by SW. Else, the user
>>> would want it to be processed by HW or fallback to SW. I don't
>>> understand in which case the user would like to have it done twice.
>
>> For example if you turn on the offloading by unsetting NETIF_F_HW_TC.
>> Or if someone inserts skbs into rx path directly, for example pktgen.
>> We need SKIP_KERNEL to be set by user, not implicit.
>
>As discussed in netdev, we want to have three modes for TC offloads
>
>1. SW only
>2. HW only (and err if can't)
>3. HW and if not supported fallback to SW
>
>Now, from your reply, I understand we want a fourth mode
>
>4. Both (HW and SW)

I would perhaps do it a litte bit differently:
NO FLAG (default)- insert into kernel and HW now:
		if NETIF_F_HW_TC is off (default)
			-> push to kernel only (current behaviour)
		if NETIF_F_HW_TC is on AND push to HW fails
			-> return error
SKIP_HW - flag to tell kernel not to insert into HW
SKIP_SW - flag to tell kernel not to insert into kernel

to achieve hw only, user has to turn on the NETIF_F_HW_TC and
pass SKIP_SW flag.


>
>Do we agree that these four policies/modes make sense?
>
>So you want #4 to be the default? can you elaborate why? note that for
>the HW marking
>case a "both" mode will be very inefficient, also for actions like
>vlan push/pop, encap/decap, etc

Well when you push/pop vlan of encap/decap tunnel header in hw, you won't
match the packet in kernel again. Most likely. But anyway, if you turn
on NETIF_F_HW_TC you know what you are doing and you adjust the
included rules (flags) accordingly.


>the result will be just wrong... I don't think this should be the default.
>
>Or.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next 6/8] net/mlx5e: Introduce tc offload support
  2016-03-01 17:13       ` John Fastabend
@ 2016-03-02 15:53         ` Amir Vadai
  2016-03-02 15:58           ` Jiri Pirko
  0 siblings, 1 reply; 31+ messages in thread
From: Amir Vadai @ 2016-03-02 15:53 UTC (permalink / raw)
  To: John Fastabend
  Cc: Jiri Pirko, David S. Miller, netdev, Or Gerlitz, John Fastabend,
	Saeed Mahameed, Hadar Har-Zion, Jiri Pirko

On Tue, Mar 01, 2016 at 09:13:25AM -0800, John Fastabend wrote:
> On 16-03-01 09:00 AM, Amir Vadai wrote:
> > On Tue, Mar 01, 2016 at 03:52:08PM +0100, Jiri Pirko wrote:
> >> Tue, Mar 01, 2016 at 03:24:48PM CET, amir@vadai.me wrote:
> >>> Extend ndo_setup_tc() to support ingress tc offloading. Will be used by
> >>> later patches to offload tc flower filter.
> >>>
> >>> Feature is off by default and could be enabled by issuing:
> >>> # ethtool  -K eth0 hw-tc-offload on
> >>>
> >>> Offloads flow table is dynamically created when first filter is
> >>> added.
> >>> Rules are saved in a hash table that is maintained by the consumer (for
> >>> example - the flower offload in the next patch).
> >>> When last filter is removed and no filters exist in the hash table, the
> >>> offload flow table is destroyed.
> >>
> >> <snip>	
> >> 	
> >>> @@ -1880,6 +1883,17 @@ static int mlx5e_setup_tc(struct net_device *netdev, u8 tc)
> >>> static int mlx5e_ndo_setup_tc(struct net_device *dev, u32 handle,
> >>> 			      __be16 proto, struct tc_to_netdev *tc)
> >>> {
> >>> +	struct mlx5e_priv *priv = netdev_priv(dev);
> >>> +
> >>> +	if (TC_H_MAJ(handle) != TC_H_MAJ(TC_H_INGRESS))
> >>> +		goto mqprio;
> >>> +
> >>> +	switch (tc->type) {
> >>> +	default:
> >>> +		return -EINVAL;
> >>
> >> -EOPNOTSUPP would be better here perhaps?
> >>
> >>
> >>> +	}
> >>> +
> >>> +mqprio:
> >>> 	if (handle != TC_H_ROOT || tc->type != TC_SETUP_MQPRIO)
> >>> 		return -EINVAL;
> >>>
> >>> @@ -1963,6 +1977,13 @@ static int mlx5e_set_features(struct net_device *netdev,
> >>> 			mlx5e_disable_vlan_filter(priv);
> >>> 	}
> >>>
> >>> +	if ((changes & NETIF_F_HW_TC) && !(features & NETIF_F_HW_TC) &&
> >>> +	    mlx5e_tc_num_filters(priv)) {
> >>> +		netdev_err(netdev,
> >>> +			   "Active offloaded tc filters, can't turn hw_tc_offload off\n");
> >>> +		return -EINVAL;
> >>
> >> This should not fail I believe. Just disable it in hw. I would even toss
> >> away the rules if necessary.
> > It depends on the answer regarding your comment on the previous patch.
> > If we have the rule in both SW and HW, and remove it from the HW it is
> > ok (although, currently I don't understand why would anyone want in both
> > places).
> > If the rule is processed by HW only - turning off this flag, will
> > disable the offloaded rules - it might be misleading, so I prefered not
> > to allow it and print a message.
> 
> When we get the HW only mode we will need to also flush the hardware
> representation in software as well as the hardware state.

Yep, I do think that just failing the operation is the best appraoch.
Will make the design simpler, and from the user point of view, less
surprises.

Jiri?

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next 6/8] net/mlx5e: Introduce tc offload support
  2016-03-02 15:53         ` Amir Vadai
@ 2016-03-02 15:58           ` Jiri Pirko
  0 siblings, 0 replies; 31+ messages in thread
From: Jiri Pirko @ 2016-03-02 15:58 UTC (permalink / raw)
  To: Amir Vadai
  Cc: John Fastabend, David S. Miller, netdev, Or Gerlitz,
	John Fastabend, Saeed Mahameed, Hadar Har-Zion, Jiri Pirko

Wed, Mar 02, 2016 at 04:53:37PM CET, amir@vadai.me wrote:
>On Tue, Mar 01, 2016 at 09:13:25AM -0800, John Fastabend wrote:
>> On 16-03-01 09:00 AM, Amir Vadai wrote:
>> > On Tue, Mar 01, 2016 at 03:52:08PM +0100, Jiri Pirko wrote:
>> >> Tue, Mar 01, 2016 at 03:24:48PM CET, amir@vadai.me wrote:
>> >>> Extend ndo_setup_tc() to support ingress tc offloading. Will be used by
>> >>> later patches to offload tc flower filter.
>> >>>
>> >>> Feature is off by default and could be enabled by issuing:
>> >>> # ethtool  -K eth0 hw-tc-offload on
>> >>>
>> >>> Offloads flow table is dynamically created when first filter is
>> >>> added.
>> >>> Rules are saved in a hash table that is maintained by the consumer (for
>> >>> example - the flower offload in the next patch).
>> >>> When last filter is removed and no filters exist in the hash table, the
>> >>> offload flow table is destroyed.
>> >>
>> >> <snip>	
>> >> 	
>> >>> @@ -1880,6 +1883,17 @@ static int mlx5e_setup_tc(struct net_device *netdev, u8 tc)
>> >>> static int mlx5e_ndo_setup_tc(struct net_device *dev, u32 handle,
>> >>> 			      __be16 proto, struct tc_to_netdev *tc)
>> >>> {
>> >>> +	struct mlx5e_priv *priv = netdev_priv(dev);
>> >>> +
>> >>> +	if (TC_H_MAJ(handle) != TC_H_MAJ(TC_H_INGRESS))
>> >>> +		goto mqprio;
>> >>> +
>> >>> +	switch (tc->type) {
>> >>> +	default:
>> >>> +		return -EINVAL;
>> >>
>> >> -EOPNOTSUPP would be better here perhaps?
>> >>
>> >>
>> >>> +	}
>> >>> +
>> >>> +mqprio:
>> >>> 	if (handle != TC_H_ROOT || tc->type != TC_SETUP_MQPRIO)
>> >>> 		return -EINVAL;
>> >>>
>> >>> @@ -1963,6 +1977,13 @@ static int mlx5e_set_features(struct net_device *netdev,
>> >>> 			mlx5e_disable_vlan_filter(priv);
>> >>> 	}
>> >>>
>> >>> +	if ((changes & NETIF_F_HW_TC) && !(features & NETIF_F_HW_TC) &&
>> >>> +	    mlx5e_tc_num_filters(priv)) {
>> >>> +		netdev_err(netdev,
>> >>> +			   "Active offloaded tc filters, can't turn hw_tc_offload off\n");
>> >>> +		return -EINVAL;
>> >>
>> >> This should not fail I believe. Just disable it in hw. I would even toss
>> >> away the rules if necessary.
>> > It depends on the answer regarding your comment on the previous patch.
>> > If we have the rule in both SW and HW, and remove it from the HW it is
>> > ok (although, currently I don't understand why would anyone want in both
>> > places).
>> > If the rule is processed by HW only - turning off this flag, will
>> > disable the offloaded rules - it might be misleading, so I prefered not
>> > to allow it and print a message.
>> 
>> When we get the HW only mode we will need to also flush the hardware
>> representation in software as well as the hardware state.
>
>Yep, I do think that just failing the operation is the best appraoch.
>Will make the design simpler, and from the user point of view, less
>surprises.
>
>Jiri?

I don't feel it is ok, but at the same time, it is probably the best
solution for now. Other solutions would be too complicated.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next 1/8] net/flower: Introduce hardware offload support
  2016-03-02 13:22           ` Jiri Pirko
@ 2016-03-02 16:22             ` John Fastabend
  2016-03-02 16:30               ` Jiri Pirko
  0 siblings, 1 reply; 31+ messages in thread
From: John Fastabend @ 2016-03-02 16:22 UTC (permalink / raw)
  To: Jiri Pirko, Or Gerlitz
  Cc: John Fastabend, Jamal Hadi Salim, Amir Vadai, David S. Miller,
	Linux Netdev List, Or Gerlitz, Saeed Mahameed, Hadar Har-Zion,
	Jiri Pirko

On 16-03-02 05:22 AM, Jiri Pirko wrote:
> Wed, Mar 02, 2016 at 12:14:39PM CET, gerlitz.or@gmail.com wrote:
>> On Tue, Mar 1, 2016 at 7:01 PM, Jiri Pirko <jiri@resnulli.us> wrote:
>>> Tue, Mar 01, 2016 at 05:49:27PM CET, amir@vadai.me wrote:
>>>> On Tue, Mar 01, 2016 at 03:47:19PM +0100, Jiri Pirko wrote:
>>>>> Tue, Mar 01, 2016 at 03:24:43PM CET, amir@vadai.me wrote:
>>>>>> This patch is based on a patch made by John Fastabend.
>>
>>>>>> It adds support for offloading cls_flower.
>>>>>> A filter that is offloaded successfully by hardware, will not be added to
>>>>>> the hashtable and won't be processed by software.
>>
>>>>> That is wrong. User should explicitly specify to not include rule into sw
>>>>> by SKIP_KERNEL flag (does not exist now, with John's recent patch we'll
>>>>> have only SKIP_HW). Please add that in this patchset.
>>
>>>> Why? If a rule is offloaded, why would the user want to reprocess it by software?
>>
>>>> If the user use SKIP_HW, it will be processed by SW. Else, the user
>>>> would want it to be processed by HW or fallback to SW. I don't
>>>> understand in which case the user would like to have it done twice.
>>
>>> For example if you turn on the offloading by unsetting NETIF_F_HW_TC.
>>> Or if someone inserts skbs into rx path directly, for example pktgen.
>>> We need SKIP_KERNEL to be set by user, not implicit.
>>
>> As discussed in netdev, we want to have three modes for TC offloads
>>
>> 1. SW only
>> 2. HW only (and err if can't)
>> 3. HW and if not supported fallback to SW
>>
>> Now, from your reply, I understand we want a fourth mode
>>
>> 4. Both (HW and SW)
> 
> I would perhaps do it a litte bit differently:
> NO FLAG (default)- insert into kernel and HW now:
> 		if NETIF_F_HW_TC is off (default)
> 			-> push to kernel only (current behaviour)
> 		if NETIF_F_HW_TC is on AND push to HW fails
> 			-> return error
> SKIP_HW - flag to tell kernel not to insert into HW
> SKIP_SW - flag to tell kernel not to insert into kernel
> 
> to achieve hw only, user has to turn on the NETIF_F_HW_TC and
> pass SKIP_SW flag.
> 

The modes Jiri describes here is exactly how I planned to build
this. And at the moment the only one we are missing is SKIP_HW
which I'm reworking now and should have in a few days.

To resolve the error handling if the rule is SKIP_HW or NO_FLAG
an error will be thrown if it can not be applied to software.
Notice if an error happens on the software insert with NO_FLAG then
the hardware insert is not attempted either. With SKIP_SW I will
throw an error if the hardware insert fails because there is
no software fallback in this mode.

The only mode I haven't looked at doing is

	3. HW and if not supported fallback to SW

I'm not sure I have a use case for it at the moment. It is sufficient
for me to just do a SKIP_SW command followed by a SKIP_HW command
if needed. I guess someone else could implement it if they really need
it.

> 
>>
>> Do we agree that these four policies/modes make sense?
>>
>> So you want #4 to be the default? can you elaborate why? note that for
>> the HW marking
>> case a "both" mode will be very inefficient, also for actions like
>> vlan push/pop, encap/decap, etc
> 
> Well when you push/pop vlan of encap/decap tunnel header in hw, you won't
> match the packet in kernel again. Most likely. But anyway, if you turn
> on NETIF_F_HW_TC you know what you are doing and you adjust the
> included rules (flags) accordingly.
> 
> 
>> the result will be just wrong... I don't think this should be the default.
>>
>> Or.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next 1/8] net/flower: Introduce hardware offload support
  2016-03-02 16:22             ` John Fastabend
@ 2016-03-02 16:30               ` Jiri Pirko
  0 siblings, 0 replies; 31+ messages in thread
From: Jiri Pirko @ 2016-03-02 16:30 UTC (permalink / raw)
  To: John Fastabend
  Cc: Or Gerlitz, John Fastabend, Jamal Hadi Salim, Amir Vadai,
	David S. Miller, Linux Netdev List, Or Gerlitz, Saeed Mahameed,
	Hadar Har-Zion, Jiri Pirko

Wed, Mar 02, 2016 at 05:22:48PM CET, john.fastabend@gmail.com wrote:
>On 16-03-02 05:22 AM, Jiri Pirko wrote:
>> Wed, Mar 02, 2016 at 12:14:39PM CET, gerlitz.or@gmail.com wrote:
>>> On Tue, Mar 1, 2016 at 7:01 PM, Jiri Pirko <jiri@resnulli.us> wrote:
>>>> Tue, Mar 01, 2016 at 05:49:27PM CET, amir@vadai.me wrote:
>>>>> On Tue, Mar 01, 2016 at 03:47:19PM +0100, Jiri Pirko wrote:
>>>>>> Tue, Mar 01, 2016 at 03:24:43PM CET, amir@vadai.me wrote:
>>>>>>> This patch is based on a patch made by John Fastabend.
>>>
>>>>>>> It adds support for offloading cls_flower.
>>>>>>> A filter that is offloaded successfully by hardware, will not be added to
>>>>>>> the hashtable and won't be processed by software.
>>>
>>>>>> That is wrong. User should explicitly specify to not include rule into sw
>>>>>> by SKIP_KERNEL flag (does not exist now, with John's recent patch we'll
>>>>>> have only SKIP_HW). Please add that in this patchset.
>>>
>>>>> Why? If a rule is offloaded, why would the user want to reprocess it by software?
>>>
>>>>> If the user use SKIP_HW, it will be processed by SW. Else, the user
>>>>> would want it to be processed by HW or fallback to SW. I don't
>>>>> understand in which case the user would like to have it done twice.
>>>
>>>> For example if you turn on the offloading by unsetting NETIF_F_HW_TC.
>>>> Or if someone inserts skbs into rx path directly, for example pktgen.
>>>> We need SKIP_KERNEL to be set by user, not implicit.
>>>
>>> As discussed in netdev, we want to have three modes for TC offloads
>>>
>>> 1. SW only
>>> 2. HW only (and err if can't)
>>> 3. HW and if not supported fallback to SW
>>>
>>> Now, from your reply, I understand we want a fourth mode
>>>
>>> 4. Both (HW and SW)
>> 
>> I would perhaps do it a litte bit differently:
>> NO FLAG (default)- insert into kernel and HW now:
>> 		if NETIF_F_HW_TC is off (default)
>> 			-> push to kernel only (current behaviour)
>> 		if NETIF_F_HW_TC is on AND push to HW fails
>> 			-> return error
>> SKIP_HW - flag to tell kernel not to insert into HW
>> SKIP_SW - flag to tell kernel not to insert into kernel
>> 
>> to achieve hw only, user has to turn on the NETIF_F_HW_TC and
>> pass SKIP_SW flag.
>> 
>
>The modes Jiri describes here is exactly how I planned to build
>this. And at the moment the only one we are missing is SKIP_HW
>which I'm reworking now and should have in a few days.
>
>To resolve the error handling if the rule is SKIP_HW or NO_FLAG
>an error will be thrown if it can not be applied to software.
>Notice if an error happens on the software insert with NO_FLAG then
>the hardware insert is not attempted either. With SKIP_SW I will
>throw an error if the hardware insert fails because there is
>no software fallback in this mode.
>
>The only mode I haven't looked at doing is
>
>	3. HW and if not supported fallback to SW
>
>I'm not sure I have a use case for it at the moment. It is sufficient
>for me to just do a SKIP_SW command followed by a SKIP_HW command
>if needed. I guess someone else could implement it if they really need
>it.

I'd let user to resolve this as you described.

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2016-03-02 16:30 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-03-01 14:24 [PATCH net-next 0/8] cls_flower hardware offload support Amir Vadai
2016-03-01 14:24 ` [PATCH net-next 1/8] net/flower: Introduce " Amir Vadai
2016-03-01 14:47   ` Jiri Pirko
2016-03-01 16:49     ` Amir Vadai
2016-03-01 17:01       ` Jiri Pirko
2016-03-02 11:14         ` Or Gerlitz
2016-03-02 13:22           ` Jiri Pirko
2016-03-02 16:22             ` John Fastabend
2016-03-02 16:30               ` Jiri Pirko
2016-03-01 14:53   ` Jiri Pirko
2016-03-01 16:50     ` Amir Vadai
2016-03-01 14:24 ` [PATCH net-next 2/8] net/flow_dissector: Make dissector_uses_key() and skb_flow_dissector_target() public Amir Vadai
2016-03-01 14:24 ` [PATCH net-next 3/8] net/act_skbedit: Utility functions for mark action Amir Vadai
2016-03-01 14:24 ` [PATCH net-next 4/8] net/mlx5_core: Set flow steering dest only for forward rules Amir Vadai
2016-03-01 14:24 ` [PATCH net-next 5/8] net/mlx5e: Add a new priority for kernel flow tables Amir Vadai
2016-03-01 14:24 ` [PATCH net-next 6/8] net/mlx5e: Introduce tc offload support Amir Vadai
2016-03-01 14:52   ` Jiri Pirko
2016-03-01 17:00     ` Amir Vadai
2016-03-01 17:13       ` John Fastabend
2016-03-02 15:53         ` Amir Vadai
2016-03-02 15:58           ` Jiri Pirko
2016-03-01 15:59   ` Saeed Mahameed
2016-03-01 17:07     ` Amir Vadai
2016-03-01 14:24 ` [PATCH net-next 7/8] net/mlx5e: Support offload cls_flower with drop action Amir Vadai
2016-03-01 14:55   ` Jiri Pirko
2016-03-01 16:50     ` Amir Vadai
2016-03-01 15:03   ` Jiri Pirko
2016-03-01 14:24 ` [PATCH net-next 8/8] net/mlx5e: Support offload cls_flower with sskbedit mark action Amir Vadai
2016-03-01 14:58   ` Jiri Pirko
2016-03-01 16:53     ` Amir Vadai
2016-03-01 17:18 ` [PATCH net-next 0/8] cls_flower hardware offload support John Fastabend

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.