netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [patch net-next v7 00/13] net: sched: allow qdiscs to share filter block instances
@ 2018-01-09 14:07 Jiri Pirko
  2018-01-09 14:07 ` [patch net-next v7 01/13] net: sched: introduce support for multiple filter chain pointers registration Jiri Pirko
                   ` (16 more replies)
  0 siblings, 17 replies; 42+ messages in thread
From: Jiri Pirko @ 2018-01-09 14:07 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel,
	dsahern

From: Jiri Pirko <jiri@mellanox.com>

Currently the filters added to qdiscs are independent. So for example if you
have 2 netdevices and you create ingress qdisc on both and you want to add
identical filter rules both, you need to add them twice. This patchset
makes this easier and mainly saves resources allowing to share all filters
within a qdisc - I call it a "filter block". Also this helps to save
resources when we do offload to hw for example to expensive TCAM.

So back to the example. First, we create 2 qdiscs. Both will share
block number 22. "22" is just an identification:
$ tc qdisc add dev ens7 ingress block 22
                                ^^^^^^^^
$ tc qdisc add dev ens8 ingress block 22
                                ^^^^^^^^

If we don't specify "block" command line option, no shared block would
be created:
$ tc qdisc add dev ens9 ingress

Now if we list the qdiscs, we will see the block index in the output:

$ tc qdisc
qdisc ingress ffff: dev ens7 parent ffff:fff1 block 22
qdisc ingress ffff: dev ens8 parent ffff:fff1 block 22
qdisc ingress ffff: dev ens9 parent ffff:fff1


To make is more visual, the situation looks like this:

   ens7 ingress qdisc                 ens7 ingress qdisc
          |                                  |
          |                                  |
          +---------->  block 22  <----------+

Unlimited number of qdiscs may share the same block.

Now we can add filter using the block index:

$ tc filter add block 22 protocol ip pref 25 flower dst_ip 192.168.0.0/16 action drop


Note we cannot use the qdisc for filter manipulations of shared blocks:

$ tc filter add dev ens8 ingress protocol ip pref 1 flower dst_ip 192.168.100.2 action drop
Error: This filter block is shared. Please use the block index to manipulate the filters.


We will see the same output if we list filters for ingress qdisc of
ens7 and ens8, also for the block 22:

$ tc filter show block 22
filter block 22 protocol ip pref 25 flower chain 0
filter block 22 protocol ip pref 25 flower chain 0 handle 0x1
...

$ tc filter show dev ens7 ingress
filter block 22 protocol ip pref 25 flower chain 0
filter block 22 protocol ip pref 25 flower chain 0 handle 0x1
...

$ tc filter show dev ens8 ingress
filter block 22 protocol ip pref 25 flower chain 0
filter block 22 protocol ip pref 25 flower chain 0 handle 0x1
...

---
v6->v7:
- patch 1:
 - unsquashed shared block patch that was previously squashed by mistake
 - fixed error path in block create - freeing chain 0
- patch 2:
 - new patch - splitted from the previous one as it got accidentaly
   squashed in the rebasing process in the past
 - converted to idr extended
 - removed auto-generating of block indexes. Callers have to explicily
   tell that the block is shared by passing non-zero block index
 - fixed error path in block get ext - freeing chain 0
- patch 7:
 - changed extack message for block index handle as suggested by DaveA
 - added extack message when block index does not exist
 - the block ifindex magic is in define and change to 0xffffffff
   as suggested by Jamal
- patch 8:
 - new patch implementing RTM_GETBLOCK in order to query if the block
   with some index exists
- patch 9:
 - adjust to the core changes and check block index attributes for being 0

v5->v6:
- added patch 6 that introduces block handle

v4->v5:
- patch 5:
 - add tracking of binding of devs that are unable to offload and check
   that before block cbs call.

v3->v4:
- patch 1:
 - rebased on top of the current net-next
 - added some extack strings
- patch 3:
 - rebased on top of the current net-next
- patch 5:
 - propagate netdev_ops->ndo_setup_tc error up to tcf_block_offload_bind
   caller
- patch 7:
 - rebased on top of the current net-next

v2->v3:
- removed original patch 1, removing tp->q cls_bpf dependency. Fixed by
  Jakub in the meantime.
- patch 1:
 - rebased on top of the current net-next
- patch 5:
 - new patch
- patch 8:
 - removed "p_" prefix from block index function args
- patch 10:
 - add tc offload feature handling


Jiri Pirko (13):
  net: sched: introduce support for multiple filter chain pointers
    registration
  net: sched: introduce shared filter blocks infrastructure
  net: sched: avoid usage of tp->q in tcf_classify
  net: sched: introduce block mechanism to handle netif_keep_dst calls
  net: sched: remove classid and q fields from tcf_proto
  net: sched: keep track of offloaded filters and check tc offload
    feature
  net: sched: use block index as a handle instead of qdisc when block is
    shared
  net: sched: add rt netlink message type for block get
  net: sched: allow ingress and clsact qdiscs to share filter blocks
  mlxsw: spectrum_acl: Reshuffle code around
    mlxsw_sp_acl_ruleset_create/destroy
  mlxsw: spectrum_acl: Don't store netdev and ingress for ruleset unbind
  mlxsw: spectrum_acl: Implement TC block sharing
  mlxsw: spectrum_acl: Pass mlxsw_sp_port down to ruleset bind/unbind
    ops

 drivers/net/ethernet/mellanox/mlxsw/spectrum.c     | 182 +++++-
 drivers/net/ethernet/mellanox/mlxsw/spectrum.h     |  44 +-
 drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c | 302 +++++++---
 .../ethernet/mellanox/mlxsw/spectrum_acl_tcam.c    |  44 +-
 .../net/ethernet/mellanox/mlxsw/spectrum_flower.c  |  41 +-
 include/net/pkt_cls.h                              |   8 +
 include/net/sch_generic.h                          |  27 +-
 include/uapi/linux/pkt_sched.h                     |  11 +
 include/uapi/linux/rtnetlink.h                     |  12 +
 net/sched/cls_api.c                                | 655 ++++++++++++++++-----
 net/sched/cls_bpf.c                                |   9 +-
 net/sched/cls_flow.c                               |   2 +-
 net/sched/cls_flower.c                             |   3 +-
 net/sched/cls_matchall.c                           |   3 +-
 net/sched/cls_route.c                              |   2 +-
 net/sched/cls_u32.c                                |  13 +-
 net/sched/sch_ingress.c                            | 101 +++-
 security/selinux/nlmsgtab.c                        |   5 +-
 18 files changed, 1155 insertions(+), 309 deletions(-)

-- 
2.9.5

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [patch net-next v7 01/13] net: sched: introduce support for multiple filter chain pointers registration
  2018-01-09 14:07 [patch net-next v7 00/13] net: sched: allow qdiscs to share filter block instances Jiri Pirko
@ 2018-01-09 14:07 ` Jiri Pirko
  2018-01-09 14:07 ` [patch net-next v7 02/13] net: sched: introduce shared filter blocks infrastructure Jiri Pirko
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 42+ messages in thread
From: Jiri Pirko @ 2018-01-09 14:07 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel,
	dsahern

From: Jiri Pirko <jiri@mellanox.com>

So far, there was possible only to register a single filter chain
pointer to block->chain[0]. However, when the blocks will get shareable,
we need to allow multiple filter chain pointers registration.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
v6->v7:
- unsquashed shared block patch that was previously squashed by mistake
- fixed error path in block create - freeing chain 0
v3->v4:
- rebased on top of the current net-next
- added some extack strings
v2->v3:
- rebased on top of the current net-next
---
 include/net/sch_generic.h |  3 +-
 net/sched/cls_api.c       | 77 ++++++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 70 insertions(+), 10 deletions(-)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index ac029d5..ddb96bb 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -275,8 +275,7 @@ typedef void tcf_chain_head_change_t(struct tcf_proto *tp_head, void *priv);
 
 struct tcf_chain {
 	struct tcf_proto __rcu *filter_chain;
-	tcf_chain_head_change_t *chain_head_change;
-	void *chain_head_change_priv;
+	struct list_head filter_chain_list;
 	struct list_head list;
 	struct tcf_block *block;
 	u32 index; /* chain index */
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 6708b69..e6b16b3 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -179,6 +179,12 @@ static void tcf_proto_destroy(struct tcf_proto *tp)
 	kfree_rcu(tp, rcu);
 }
 
+struct tcf_filter_chain_list_item {
+	struct list_head list;
+	tcf_chain_head_change_t *chain_head_change;
+	void *chain_head_change_priv;
+};
+
 static struct tcf_chain *tcf_chain_create(struct tcf_block *block,
 					  u32 chain_index)
 {
@@ -187,6 +193,7 @@ static struct tcf_chain *tcf_chain_create(struct tcf_block *block,
 	chain = kzalloc(sizeof(*chain), GFP_KERNEL);
 	if (!chain)
 		return NULL;
+	INIT_LIST_HEAD(&chain->filter_chain_list);
 	list_add_tail(&chain->list, &block->chain_list);
 	chain->block = block;
 	chain->index = chain_index;
@@ -194,12 +201,19 @@ static struct tcf_chain *tcf_chain_create(struct tcf_block *block,
 	return chain;
 }
 
+static void tcf_chain_head_change_item(struct tcf_filter_chain_list_item *item,
+				       struct tcf_proto *tp_head)
+{
+	if (item->chain_head_change)
+		item->chain_head_change(tp_head, item->chain_head_change_priv);
+}
 static void tcf_chain_head_change(struct tcf_chain *chain,
 				  struct tcf_proto *tp_head)
 {
-	if (chain->chain_head_change)
-		chain->chain_head_change(tp_head,
-					 chain->chain_head_change_priv);
+	struct tcf_filter_chain_list_item *item;
+
+	list_for_each_entry(item, &chain->filter_chain_list, list)
+		tcf_chain_head_change_item(item, tp_head);
 }
 
 static void tcf_chain_flush(struct tcf_chain *chain)
@@ -280,6 +294,50 @@ static void tcf_block_offload_unbind(struct tcf_block *block, struct Qdisc *q,
 	tcf_block_offload_cmd(block, q, ei, TC_BLOCK_UNBIND);
 }
 
+static int
+tcf_chain_head_change_cb_add(struct tcf_chain *chain,
+			     struct tcf_block_ext_info *ei,
+			     struct netlink_ext_ack *extack)
+{
+	struct tcf_filter_chain_list_item *item;
+
+	item = kmalloc(sizeof(*item), GFP_KERNEL);
+	if (!item) {
+		NL_SET_ERR_MSG(extack, "Memory allocation for head change callback item failed");
+		return -ENOMEM;
+	}
+	item->chain_head_change = ei->chain_head_change;
+	item->chain_head_change_priv = ei->chain_head_change_priv;
+	if (chain->filter_chain)
+		tcf_chain_head_change_item(item, chain->filter_chain);
+	list_add(&item->list, &chain->filter_chain_list);
+	return 0;
+}
+
+static void
+tcf_chain_head_change_cb_del(struct tcf_chain *chain,
+			     struct tcf_block_ext_info *ei)
+{
+	struct tcf_filter_chain_list_item *item;
+
+	list_for_each_entry(item, &chain->filter_chain_list, list) {
+		if ((!ei->chain_head_change && !ei->chain_head_change_priv) ||
+		    (item->chain_head_change == ei->chain_head_change &&
+		     item->chain_head_change_priv == ei->chain_head_change_priv)) {
+			tcf_chain_head_change_item(item, NULL);
+			list_del(&item->list);
+			kfree(item);
+			return;
+		}
+	}
+	WARN_ON(1);
+}
+
+static struct tcf_chain *tcf_block_chain_zero(struct tcf_block *block)
+{
+	return list_first_entry(&block->chain_list, struct tcf_chain, list);
+}
+
 int tcf_block_get_ext(struct tcf_block **p_block, struct Qdisc *q,
 		      struct tcf_block_ext_info *ei,
 		      struct netlink_ext_ack *extack)
@@ -302,9 +360,10 @@ int tcf_block_get_ext(struct tcf_block **p_block, struct Qdisc *q,
 		err = -ENOMEM;
 		goto err_chain_create;
 	}
-	WARN_ON(!ei->chain_head_change);
-	chain->chain_head_change = ei->chain_head_change;
-	chain->chain_head_change_priv = ei->chain_head_change_priv;
+	err = tcf_chain_head_change_cb_add(tcf_block_chain_zero(block),
+					   ei, extack);
+	if (err)
+		goto err_chain_head_change_cb_add;
 	block->net = qdisc_net(q);
 	block->q = q;
 	tcf_block_offload_bind(block, q, ei);
@@ -313,6 +372,8 @@ int tcf_block_get_ext(struct tcf_block **p_block, struct Qdisc *q,
 
 err_chain_create:
 	kfree(block);
+err_chain_head_change_cb_add:
+	kfree(chain);
 	return err;
 }
 EXPORT_SYMBOL(tcf_block_get_ext);
@@ -351,6 +412,7 @@ void tcf_block_put_ext(struct tcf_block *block, struct Qdisc *q,
 	 */
 	if (!block)
 		return;
+	tcf_chain_head_change_cb_del(tcf_block_chain_zero(block), ei);
 	list_for_each_entry(chain, &block->chain_list, list)
 		tcf_chain_hold(chain);
 
@@ -364,8 +426,7 @@ void tcf_block_put_ext(struct tcf_block *block, struct Qdisc *q,
 		tcf_chain_put(chain);
 
 	/* Finally, put chain 0 and allow block to be freed. */
-	chain = list_first_entry(&block->chain_list, struct tcf_chain, list);
-	tcf_chain_put(chain);
+	tcf_chain_put(tcf_block_chain_zero(block));
 }
 EXPORT_SYMBOL(tcf_block_put_ext);
 
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [patch net-next v7 02/13] net: sched: introduce shared filter blocks infrastructure
  2018-01-09 14:07 [patch net-next v7 00/13] net: sched: allow qdiscs to share filter block instances Jiri Pirko
  2018-01-09 14:07 ` [patch net-next v7 01/13] net: sched: introduce support for multiple filter chain pointers registration Jiri Pirko
@ 2018-01-09 14:07 ` Jiri Pirko
  2018-01-09 14:07 ` [patch net-next v7 03/13] net: sched: avoid usage of tp->q in tcf_classify Jiri Pirko
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 42+ messages in thread
From: Jiri Pirko @ 2018-01-09 14:07 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel,
	dsahern

From: Jiri Pirko <jiri@mellanox.com>

Allow qdiscs to share filter blocks among them. Each qdisc type has to
use block get/put extended modifications that enable sharing.
Shared blocks are tracked within each net namespace and identified
by u32 index. This index is passed from user during the qdisc creation.
If user passes index that is not used by any other qdisc, new block
is created. If user passes index that is already used, the existing
block will be re-used.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
v6->v7:
- new patch - splitted from the previous one as it got accidentaly
  squashed in the rebasing process in the past
- converted to idr extended
- removed auto-generating of block indexes. Callers have to explicily
  tell that the block is shared by passing non-zero block index
- fixed error path in block get ext - freeing chain 0
---
 include/net/pkt_cls.h     |   7 ++
 include/net/sch_generic.h |   2 +
 net/sched/cls_api.c       | 163 +++++++++++++++++++++++++++++++++++++++-------
 3 files changed, 148 insertions(+), 24 deletions(-)

diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index c4f4e46..3f53d12 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -29,6 +29,7 @@ struct tcf_block_ext_info {
 	enum tcf_block_binder_type binder_type;
 	tcf_chain_head_change_t *chain_head_change;
 	void *chain_head_change_priv;
+	u32 block_index;
 };
 
 struct tcf_block_cb;
@@ -48,8 +49,14 @@ void tcf_block_put(struct tcf_block *block);
 void tcf_block_put_ext(struct tcf_block *block, struct Qdisc *q,
 		       struct tcf_block_ext_info *ei);
 
+static inline bool tcf_block_shared(struct tcf_block *block)
+{
+	return block->index;
+}
+
 static inline struct Qdisc *tcf_block_q(struct tcf_block *block)
 {
+	WARN_ON(tcf_block_shared(block));
 	return block->q;
 }
 
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index ddb96bb..5cc4d71 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -284,6 +284,8 @@ struct tcf_chain {
 
 struct tcf_block {
 	struct list_head chain_list;
+	u32 index; /* block index for shared blocks */
+	unsigned int refcnt;
 	struct net *net;
 	struct Qdisc *q;
 	struct list_head cb_list;
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index e6b16b3..9b45950 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -24,6 +24,7 @@
 #include <linux/init.h>
 #include <linux/kmod.h>
 #include <linux/slab.h>
+#include <linux/idr.h>
 #include <net/net_namespace.h>
 #include <net/sock.h>
 #include <net/netlink.h>
@@ -333,22 +334,44 @@ tcf_chain_head_change_cb_del(struct tcf_chain *chain,
 	WARN_ON(1);
 }
 
-static struct tcf_chain *tcf_block_chain_zero(struct tcf_block *block)
+struct tcf_net {
+	struct idr idr;
+};
+
+static unsigned int tcf_net_id;
+
+static int tcf_block_insert(struct tcf_block *block, struct net *net,
+			    u32 block_index, struct netlink_ext_ack *extack)
 {
-	return list_first_entry(&block->chain_list, struct tcf_chain, list);
+	struct tcf_net *tn = net_generic(net, tcf_net_id);
+	int err;
+
+	err = idr_alloc_ext(&tn->idr, block, NULL, block_index,
+			    block_index + 1, GFP_KERNEL);
+	if (err)
+		return err;
+	block->index = block_index;
+	return 0;
 }
 
-int tcf_block_get_ext(struct tcf_block **p_block, struct Qdisc *q,
-		      struct tcf_block_ext_info *ei,
-		      struct netlink_ext_ack *extack)
+static void tcf_block_remove(struct tcf_block *block, struct net *net)
+{
+	struct tcf_net *tn = net_generic(net, tcf_net_id);
+
+	idr_remove_ext(&tn->idr, block->index);
+}
+
+static struct tcf_block *tcf_block_create(struct net *net, struct Qdisc *q,
+					  struct netlink_ext_ack *extack)
 {
-	struct tcf_block *block = kzalloc(sizeof(*block), GFP_KERNEL);
+	struct tcf_block *block;
 	struct tcf_chain *chain;
 	int err;
 
+	block = kzalloc(sizeof(*block), GFP_KERNEL);
 	if (!block) {
 		NL_SET_ERR_MSG(extack, "Memory allocation for block failed");
-		return -ENOMEM;
+		return ERR_PTR(-ENOMEM);
 	}
 	INIT_LIST_HEAD(&block->chain_list);
 	INIT_LIST_HEAD(&block->cb_list);
@@ -360,20 +383,76 @@ int tcf_block_get_ext(struct tcf_block **p_block, struct Qdisc *q,
 		err = -ENOMEM;
 		goto err_chain_create;
 	}
+	block->net = qdisc_net(q);
+	block->refcnt = 1;
+	block->net = net;
+	block->q = q;
+	return block;
+
+err_chain_create:
+	kfree(block);
+	return ERR_PTR(err);
+}
+
+static struct tcf_block *tcf_block_lookup(struct net *net, u32 block_index)
+{
+	struct tcf_net *tn = net_generic(net, tcf_net_id);
+
+	return idr_find_ext(&tn->idr, block_index);
+}
+
+static struct tcf_chain *tcf_block_chain_zero(struct tcf_block *block)
+{
+	return list_first_entry(&block->chain_list, struct tcf_chain, list);
+}
+
+int tcf_block_get_ext(struct tcf_block **p_block, struct Qdisc *q,
+		      struct tcf_block_ext_info *ei,
+		      struct netlink_ext_ack *extack)
+{
+	struct net *net = qdisc_net(q);
+	struct tcf_block *block = NULL;
+	bool created = false;
+	int err;
+
+	if (ei->block_index) {
+		/* block_index not 0 means the shared block is requested */
+		block = tcf_block_lookup(net, ei->block_index);
+		if (block)
+			block->refcnt++;
+	}
+
+	if (!block) {
+		block = tcf_block_create(net, q, extack);
+		if (IS_ERR(block))
+			return PTR_ERR(block);
+		created = true;
+		if (ei->block_index) {
+			err = tcf_block_insert(block, net,
+					       ei->block_index, extack);
+			if (err)
+				goto err_block_insert;
+		}
+	}
+
 	err = tcf_chain_head_change_cb_add(tcf_block_chain_zero(block),
 					   ei, extack);
 	if (err)
 		goto err_chain_head_change_cb_add;
-	block->net = qdisc_net(q);
-	block->q = q;
 	tcf_block_offload_bind(block, q, ei);
 	*p_block = block;
 	return 0;
 
-err_chain_create:
-	kfree(block);
 err_chain_head_change_cb_add:
-	kfree(chain);
+	if (created) {
+		if (tcf_block_shared(block))
+			tcf_block_remove(block, net);
+err_block_insert:
+		kfree(tcf_block_chain_zero(block));
+		kfree(block);
+	} else {
+		block->refcnt--;
+	}
 	return err;
 }
 EXPORT_SYMBOL(tcf_block_get_ext);
@@ -407,26 +486,34 @@ void tcf_block_put_ext(struct tcf_block *block, struct Qdisc *q,
 {
 	struct tcf_chain *chain, *tmp;
 
-	/* Hold a refcnt for all chains, so that they don't disappear
-	 * while we are iterating.
-	 */
 	if (!block)
 		return;
 	tcf_chain_head_change_cb_del(tcf_block_chain_zero(block), ei);
-	list_for_each_entry(chain, &block->chain_list, list)
-		tcf_chain_hold(chain);
 
-	list_for_each_entry(chain, &block->chain_list, list)
-		tcf_chain_flush(chain);
+	if (--block->refcnt == 0) {
+		if (tcf_block_shared(block))
+			tcf_block_remove(block, block->net);
+
+		/* Hold a refcnt for all chains, so that they don't disappear
+		 * while we are iterating.
+		 */
+		list_for_each_entry(chain, &block->chain_list, list)
+			tcf_chain_hold(chain);
+
+		list_for_each_entry(chain, &block->chain_list, list)
+			tcf_chain_flush(chain);
+	}
 
 	tcf_block_offload_unbind(block, q, ei);
 
-	/* At this point, all the chains should have refcnt >= 1. */
-	list_for_each_entry_safe(chain, tmp, &block->chain_list, list)
-		tcf_chain_put(chain);
+	if (block->refcnt == 0) {
+		/* At this point, all the chains should have refcnt >= 1. */
+		list_for_each_entry_safe(chain, tmp, &block->chain_list, list)
+			tcf_chain_put(chain);
 
-	/* Finally, put chain 0 and allow block to be freed. */
-	tcf_chain_put(tcf_block_chain_zero(block));
+		/* Finally, put chain 0 and allow block to be freed. */
+		tcf_chain_put(tcf_block_chain_zero(block));
+	}
 }
 EXPORT_SYMBOL(tcf_block_put_ext);
 
@@ -1313,12 +1400,40 @@ int tc_setup_cb_call(struct tcf_block *block, struct tcf_exts *exts,
 }
 EXPORT_SYMBOL(tc_setup_cb_call);
 
+static __net_init int tcf_net_init(struct net *net)
+{
+	struct tcf_net *tn = net_generic(net, tcf_net_id);
+
+	idr_init(&tn->idr);
+	return 0;
+}
+
+static void __net_exit tcf_net_exit(struct net *net)
+{
+	struct tcf_net *tn = net_generic(net, tcf_net_id);
+
+	idr_destroy(&tn->idr);
+}
+
+static struct pernet_operations tcf_net_ops = {
+	.init = tcf_net_init,
+	.exit = tcf_net_exit,
+	.id   = &tcf_net_id,
+	.size = sizeof(struct tcf_net),
+};
+
 static int __init tc_filter_init(void)
 {
+	int err;
+
 	tc_filter_wq = alloc_ordered_workqueue("tc_filter_workqueue", 0);
 	if (!tc_filter_wq)
 		return -ENOMEM;
 
+	err = register_pernet_subsys(&tcf_net_ops);
+	if (err)
+		return err;
+
 	rtnl_register(PF_UNSPEC, RTM_NEWTFILTER, tc_ctl_tfilter, NULL, 0);
 	rtnl_register(PF_UNSPEC, RTM_DELTFILTER, tc_ctl_tfilter, NULL, 0);
 	rtnl_register(PF_UNSPEC, RTM_GETTFILTER, tc_ctl_tfilter,
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [patch net-next v7 03/13] net: sched: avoid usage of tp->q in tcf_classify
  2018-01-09 14:07 [patch net-next v7 00/13] net: sched: allow qdiscs to share filter block instances Jiri Pirko
  2018-01-09 14:07 ` [patch net-next v7 01/13] net: sched: introduce support for multiple filter chain pointers registration Jiri Pirko
  2018-01-09 14:07 ` [patch net-next v7 02/13] net: sched: introduce shared filter blocks infrastructure Jiri Pirko
@ 2018-01-09 14:07 ` Jiri Pirko
  2018-01-10 16:17   ` David Ahern
  2018-01-09 14:07 ` [patch net-next v7 04/13] net: sched: introduce block mechanism to handle netif_keep_dst calls Jiri Pirko
                   ` (13 subsequent siblings)
  16 siblings, 1 reply; 42+ messages in thread
From: Jiri Pirko @ 2018-01-09 14:07 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel,
	dsahern

From: Jiri Pirko <jiri@mellanox.com>

Use block index in the messages instead.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 net/sched/cls_api.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 9b45950..31e91dc 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -672,8 +672,9 @@ int tcf_classify(struct sk_buff *skb, const struct tcf_proto *tp,
 #ifdef CONFIG_NET_CLS_ACT
 reset:
 	if (unlikely(limit++ >= max_reclassify_loop)) {
-		net_notice_ratelimited("%s: reclassify loop, rule prio %u, protocol %02x\n",
-				       tp->q->ops->id, tp->prio & 0xffff,
+		net_notice_ratelimited("%u: reclassify loop, rule prio %u, protocol %02x\n",
+				       tp->chain->block->index,
+				       tp->prio & 0xffff,
 				       ntohs(tp->protocol));
 		return TC_ACT_SHOT;
 	}
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [patch net-next v7 04/13] net: sched: introduce block mechanism to handle netif_keep_dst calls
  2018-01-09 14:07 [patch net-next v7 00/13] net: sched: allow qdiscs to share filter block instances Jiri Pirko
                   ` (2 preceding siblings ...)
  2018-01-09 14:07 ` [patch net-next v7 03/13] net: sched: avoid usage of tp->q in tcf_classify Jiri Pirko
@ 2018-01-09 14:07 ` Jiri Pirko
  2018-01-09 14:07 ` [patch net-next v7 05/13] net: sched: remove classid and q fields from tcf_proto Jiri Pirko
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 42+ messages in thread
From: Jiri Pirko @ 2018-01-09 14:07 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel,
	dsahern

From: Jiri Pirko <jiri@mellanox.com>

Couple of classifiers call netif_keep_dst directly on q->dev. That is
not possible to do directly for shared blocke where multiple qdiscs are
owning the block. So introduce a infrastructure to keep track of the
block owners in list and use this list to implement block variant of
netif_keep_dst.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
v3->v4:
- rebased on top of the current net-next
v1->v2:
- fix binder_type to check "egress" as well as pointed out by Daniel
---
 include/net/pkt_cls.h     |  1 +
 include/net/sch_generic.h |  2 ++
 net/sched/cls_api.c       | 69 +++++++++++++++++++++++++++++++++++++++++++++++
 net/sched/cls_bpf.c       |  4 +--
 net/sched/cls_flow.c      |  2 +-
 net/sched/cls_route.c     |  2 +-
 6 files changed, 76 insertions(+), 4 deletions(-)

diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index 3f53d12..49b3f13 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -39,6 +39,7 @@ bool tcf_queue_work(struct work_struct *work);
 struct tcf_chain *tcf_chain_get(struct tcf_block *block, u32 chain_index,
 				bool create);
 void tcf_chain_put(struct tcf_chain *chain);
+void tcf_block_netif_keep_dst(struct tcf_block *block);
 int tcf_block_get(struct tcf_block **p_block,
 		  struct tcf_proto __rcu **p_filter_chain, struct Qdisc *q,
 		  struct netlink_ext_ack *extack);
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 5cc4d71..df97c3e 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -289,6 +289,8 @@ struct tcf_block {
 	struct net *net;
 	struct Qdisc *q;
 	struct list_head cb_list;
+	struct list_head owner_list;
+	bool keep_dst;
 };
 
 static inline void qdisc_cb_private_validate(const struct sk_buff *skb, int sz)
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 31e91dc..11bdb89 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -375,6 +375,7 @@ static struct tcf_block *tcf_block_create(struct net *net, struct Qdisc *q,
 	}
 	INIT_LIST_HEAD(&block->chain_list);
 	INIT_LIST_HEAD(&block->cb_list);
+	INIT_LIST_HEAD(&block->owner_list);
 
 	/* Create chain 0 by default, it has to be always present. */
 	chain = tcf_chain_create(block, 0);
@@ -406,6 +407,65 @@ static struct tcf_chain *tcf_block_chain_zero(struct tcf_block *block)
 	return list_first_entry(&block->chain_list, struct tcf_chain, list);
 }
 
+struct tcf_block_owner_item {
+	struct list_head list;
+	struct Qdisc *q;
+	enum tcf_block_binder_type binder_type;
+};
+
+static void
+tcf_block_owner_netif_keep_dst(struct tcf_block *block,
+			       struct Qdisc *q,
+			       enum tcf_block_binder_type binder_type)
+{
+	if (block->keep_dst &&
+	    binder_type != TCF_BLOCK_BINDER_TYPE_CLSACT_INGRESS &&
+	    binder_type != TCF_BLOCK_BINDER_TYPE_CLSACT_EGRESS)
+		netif_keep_dst(qdisc_dev(q));
+}
+
+void tcf_block_netif_keep_dst(struct tcf_block *block)
+{
+	struct tcf_block_owner_item *item;
+
+	block->keep_dst = true;
+	list_for_each_entry(item, &block->owner_list, list)
+		tcf_block_owner_netif_keep_dst(block, item->q,
+					       item->binder_type);
+}
+EXPORT_SYMBOL(tcf_block_netif_keep_dst);
+
+static int tcf_block_owner_add(struct tcf_block *block,
+			       struct Qdisc *q,
+			       enum tcf_block_binder_type binder_type)
+{
+	struct tcf_block_owner_item *item;
+
+	item = kmalloc(sizeof(*item), GFP_KERNEL);
+	if (!item)
+		return -ENOMEM;
+	item->q = q;
+	item->binder_type = binder_type;
+	list_add(&item->list, &block->owner_list);
+	return 0;
+}
+
+static void tcf_block_owner_del(struct tcf_block *block,
+				struct Qdisc *q,
+				enum tcf_block_binder_type binder_type)
+{
+	struct tcf_block_owner_item *item;
+
+	list_for_each_entry(item, &block->owner_list, list) {
+		if (item->q == q && item->binder_type == binder_type) {
+			list_del(&item->list);
+			kfree(item);
+			return;
+		}
+	}
+	WARN_ON(1);
+}
+
 int tcf_block_get_ext(struct tcf_block **p_block, struct Qdisc *q,
 		      struct tcf_block_ext_info *ei,
 		      struct netlink_ext_ack *extack)
@@ -435,6 +495,12 @@ int tcf_block_get_ext(struct tcf_block **p_block, struct Qdisc *q,
 		}
 	}
 
+	err = tcf_block_owner_add(block, q, ei->binder_type);
+	if (err)
+		goto err_block_owner_add;
+
+	tcf_block_owner_netif_keep_dst(block, q, ei->binder_type);
+
 	err = tcf_chain_head_change_cb_add(tcf_block_chain_zero(block),
 					   ei, extack);
 	if (err)
@@ -444,6 +510,8 @@ int tcf_block_get_ext(struct tcf_block **p_block, struct Qdisc *q,
 	return 0;
 
 err_chain_head_change_cb_add:
+	tcf_block_owner_del(block, q, ei->binder_type);
+err_block_owner_add:
 	if (created) {
 		if (tcf_block_shared(block))
 			tcf_block_remove(block, net);
@@ -489,6 +557,7 @@ void tcf_block_put_ext(struct tcf_block *block, struct Qdisc *q,
 	if (!block)
 		return;
 	tcf_chain_head_change_cb_del(tcf_block_chain_zero(block), ei);
+	tcf_block_owner_del(block, q, ei->binder_type);
 
 	if (--block->refcnt == 0) {
 		if (tcf_block_shared(block))
diff --git a/net/sched/cls_bpf.c b/net/sched/cls_bpf.c
index 8d78e7f..d79cc50 100644
--- a/net/sched/cls_bpf.c
+++ b/net/sched/cls_bpf.c
@@ -392,8 +392,8 @@ static int cls_bpf_prog_from_efd(struct nlattr **tb, struct cls_bpf_prog *prog,
 	prog->bpf_name = name;
 	prog->filter = fp;
 
-	if (fp->dst_needed && !(tp->q->flags & TCQ_F_INGRESS))
-		netif_keep_dst(qdisc_dev(tp->q));
+	if (fp->dst_needed)
+		tcf_block_netif_keep_dst(tp->chain->block);
 
 	return 0;
 }
diff --git a/net/sched/cls_flow.c b/net/sched/cls_flow.c
index 25c2a88..28cd6fb 100644
--- a/net/sched/cls_flow.c
+++ b/net/sched/cls_flow.c
@@ -526,7 +526,7 @@ static int flow_change(struct net *net, struct sk_buff *in_skb,
 
 	timer_setup(&fnew->perturb_timer, flow_perturbation, TIMER_DEFERRABLE);
 
-	netif_keep_dst(qdisc_dev(tp->q));
+	tcf_block_netif_keep_dst(tp->chain->block);
 
 	if (tb[TCA_FLOW_KEYS]) {
 		fnew->keymask = keymask;
diff --git a/net/sched/cls_route.c b/net/sched/cls_route.c
index ac9a5b8..a1f2b1b 100644
--- a/net/sched/cls_route.c
+++ b/net/sched/cls_route.c
@@ -527,7 +527,7 @@ static int route4_change(struct net *net, struct sk_buff *in_skb,
 		if (f->handle < f1->handle)
 			break;
 
-	netif_keep_dst(qdisc_dev(tp->q));
+	tcf_block_netif_keep_dst(tp->chain->block);
 	rcu_assign_pointer(f->next, f1);
 	rcu_assign_pointer(*fp, f);
 
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [patch net-next v7 05/13] net: sched: remove classid and q fields from tcf_proto
  2018-01-09 14:07 [patch net-next v7 00/13] net: sched: allow qdiscs to share filter block instances Jiri Pirko
                   ` (3 preceding siblings ...)
  2018-01-09 14:07 ` [patch net-next v7 04/13] net: sched: introduce block mechanism to handle netif_keep_dst calls Jiri Pirko
@ 2018-01-09 14:07 ` Jiri Pirko
  2018-01-09 14:07 ` [patch net-next v7 06/13] net: sched: keep track of offloaded filters and check tc offload feature Jiri Pirko
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 42+ messages in thread
From: Jiri Pirko @ 2018-01-09 14:07 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel,
	dsahern

From: Jiri Pirko <jiri@mellanox.com>

Both are no longer used, so remove them.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 include/net/sch_generic.h | 2 --
 net/sched/cls_api.c       | 7 ++-----
 2 files changed, 2 insertions(+), 7 deletions(-)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index df97c3e..dba2214 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -255,8 +255,6 @@ struct tcf_proto {
 
 	/* All the rest */
 	u32			prio;
-	u32			classid;
-	struct Qdisc		*q;
 	void			*data;
 	const struct tcf_proto_ops	*ops;
 	struct tcf_chain	*chain;
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 11bdb89..8a130e2 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -122,8 +122,7 @@ static inline u32 tcf_auto_prio(struct tcf_proto *tp)
 }
 
 static struct tcf_proto *tcf_proto_create(const char *kind, u32 protocol,
-					  u32 prio, u32 parent, struct Qdisc *q,
-					  struct tcf_chain *chain)
+					  u32 prio, struct tcf_chain *chain)
 {
 	struct tcf_proto *tp;
 	int err;
@@ -157,8 +156,6 @@ static struct tcf_proto *tcf_proto_create(const char *kind, u32 protocol,
 	tp->classify = tp->ops->classify;
 	tp->protocol = protocol;
 	tp->prio = prio;
-	tp->classid = parent;
-	tp->q = q;
 	tp->chain = chain;
 
 	err = tp->ops->init(tp);
@@ -1069,7 +1066,7 @@ static int tc_ctl_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
 			prio = tcf_auto_prio(tcf_chain_tp_prev(&chain_info));
 
 		tp = tcf_proto_create(nla_data(tca[TCA_KIND]),
-				      protocol, prio, parent, q, chain);
+				      protocol, prio, chain);
 		if (IS_ERR(tp)) {
 			err = PTR_ERR(tp);
 			goto errout;
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [patch net-next v7 06/13] net: sched: keep track of offloaded filters and check tc offload feature
  2018-01-09 14:07 [patch net-next v7 00/13] net: sched: allow qdiscs to share filter block instances Jiri Pirko
                   ` (4 preceding siblings ...)
  2018-01-09 14:07 ` [patch net-next v7 05/13] net: sched: remove classid and q fields from tcf_proto Jiri Pirko
@ 2018-01-09 14:07 ` Jiri Pirko
  2018-01-09 14:07 ` [patch net-next v7 07/13] net: sched: use block index as a handle instead of qdisc when block is shared Jiri Pirko
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 42+ messages in thread
From: Jiri Pirko @ 2018-01-09 14:07 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel,
	dsahern

From: Jiri Pirko <jiri@mellanox.com>

During block bind, we need to check tc offload feature. If it is
disabled yet still the block contains offloaded filters, forbid the
bind. Also forbid to register callback for a block that already
contains offloaded filters, as the play back is not supported now.
For keeping track of offloaded filters there is a new counter
introduced, alongside with couple of helpers called from cls_* code.
These helpers set and clear TCA_CLS_FLAGS_IN_HW flag.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
v4->v5:
- add tracking of binding of devs that are unable to offload and check
  that before block cbs call.
v3->v4:
- propagate netdev_ops->ndo_setup_tc error up to tcf_block_offload_bind
  caller
v2->v3:
- new patch
---
 include/net/sch_generic.h | 18 +++++++++++
 net/sched/cls_api.c       | 80 ++++++++++++++++++++++++++++++++++++++---------
 net/sched/cls_bpf.c       |  5 ++-
 net/sched/cls_flower.c    |  3 +-
 net/sched/cls_matchall.c  |  3 +-
 net/sched/cls_u32.c       | 13 ++++----
 6 files changed, 99 insertions(+), 23 deletions(-)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index dba2214..ab86b64 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -289,8 +289,26 @@ struct tcf_block {
 	struct list_head cb_list;
 	struct list_head owner_list;
 	bool keep_dst;
+	unsigned int offloadcnt; /* Number of oddloaded filters */
+	unsigned int nooffloaddevcnt; /* Number of devs unable to do offload */
 };
 
+static inline void tcf_block_offload_inc(struct tcf_block *block, u32 *flags)
+{
+	if (*flags & TCA_CLS_FLAGS_IN_HW)
+		return;
+	*flags |= TCA_CLS_FLAGS_IN_HW;
+	block->offloadcnt++;
+}
+
+static inline void tcf_block_offload_dec(struct tcf_block *block, u32 *flags)
+{
+	if (!(*flags & TCA_CLS_FLAGS_IN_HW))
+		return;
+	*flags &= ~TCA_CLS_FLAGS_IN_HW;
+	block->offloadcnt--;
+}
+
 static inline void qdisc_cb_private_validate(const struct sk_buff *skb, int sz)
 {
 	struct qdisc_skb_cb *qcb;
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 8a130e2..0ff8ae9 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -265,31 +265,66 @@ void tcf_chain_put(struct tcf_chain *chain)
 }
 EXPORT_SYMBOL(tcf_chain_put);
 
-static void tcf_block_offload_cmd(struct tcf_block *block, struct Qdisc *q,
-				  struct tcf_block_ext_info *ei,
-				  enum tc_block_command command)
+static bool tcf_block_offload_in_use(struct tcf_block *block)
+{
+	return block->offloadcnt;
+}
+
+static int tcf_block_offload_cmd(struct tcf_block *block,
+				 struct net_device *dev,
+				 struct tcf_block_ext_info *ei,
+				 enum tc_block_command command)
 {
-	struct net_device *dev = q->dev_queue->dev;
 	struct tc_block_offload bo = {};
 
-	if (!dev->netdev_ops->ndo_setup_tc)
-		return;
 	bo.command = command;
 	bo.binder_type = ei->binder_type;
 	bo.block = block;
-	dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_BLOCK, &bo);
+	return dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_BLOCK, &bo);
 }
 
-static void tcf_block_offload_bind(struct tcf_block *block, struct Qdisc *q,
-				   struct tcf_block_ext_info *ei)
+static int tcf_block_offload_bind(struct tcf_block *block, struct Qdisc *q,
+				  struct tcf_block_ext_info *ei)
 {
-	tcf_block_offload_cmd(block, q, ei, TC_BLOCK_BIND);
+	struct net_device *dev = q->dev_queue->dev;
+	int err;
+
+	if (!dev->netdev_ops->ndo_setup_tc)
+		goto no_offload_dev_inc;
+
+	/* If tc offload feature is disabled and the block we try to bind
+	 * to already has some offloaded filters, forbid to bind.
+	 */
+	if (!tc_can_offload(dev) && tcf_block_offload_in_use(block))
+		return -EOPNOTSUPP;
+
+	err = tcf_block_offload_cmd(block, dev, ei, TC_BLOCK_BIND);
+	if (err == -EOPNOTSUPP)
+		goto no_offload_dev_inc;
+	return err;
+
+no_offload_dev_inc:
+	if (tcf_block_offload_in_use(block))
+		return -EOPNOTSUPP;
+	block->nooffloaddevcnt++;
+	return 0;
 }
 
 static void tcf_block_offload_unbind(struct tcf_block *block, struct Qdisc *q,
 				     struct tcf_block_ext_info *ei)
 {
-	tcf_block_offload_cmd(block, q, ei, TC_BLOCK_UNBIND);
+	struct net_device *dev = q->dev_queue->dev;
+	int err;
+
+	if (!dev->netdev_ops->ndo_setup_tc)
+		goto no_offload_dev_dec;
+	err = tcf_block_offload_cmd(block, dev, ei, TC_BLOCK_UNBIND);
+	if (err == -EOPNOTSUPP)
+		goto no_offload_dev_dec;
+	return;
+
+no_offload_dev_dec:
+	WARN_ON(block->nooffloaddevcnt-- == 0);
 }
 
 static int
@@ -502,10 +537,16 @@ int tcf_block_get_ext(struct tcf_block **p_block, struct Qdisc *q,
 					   ei, extack);
 	if (err)
 		goto err_chain_head_change_cb_add;
-	tcf_block_offload_bind(block, q, ei);
+
+	err = tcf_block_offload_bind(block, q, ei);
+	if (err)
+		goto err_block_offload_bind;
+
 	*p_block = block;
 	return 0;
 
+err_block_offload_bind:
+	tcf_chain_head_change_cb_del(tcf_block_chain_zero(block), ei);
 err_chain_head_change_cb_add:
 	tcf_block_owner_del(block, q, ei->binder_type);
 err_block_owner_add:
@@ -637,9 +678,16 @@ struct tcf_block_cb *__tcf_block_cb_register(struct tcf_block *block,
 {
 	struct tcf_block_cb *block_cb;
 
+	/* At this point, playback of previous block cb calls is not supported,
+	 * so forbid to register to block which already has some offloaded
+	 * filters present.
+	 */
+	if (tcf_block_offload_in_use(block))
+		return ERR_PTR(-EOPNOTSUPP);
+
 	block_cb = kzalloc(sizeof(*block_cb), GFP_KERNEL);
 	if (!block_cb)
-		return NULL;
+		return ERR_PTR(-ENOMEM);
 	block_cb->cb = cb;
 	block_cb->cb_ident = cb_ident;
 	block_cb->cb_priv = cb_priv;
@@ -655,7 +703,7 @@ int tcf_block_cb_register(struct tcf_block *block,
 	struct tcf_block_cb *block_cb;
 
 	block_cb = __tcf_block_cb_register(block, cb, cb_ident, cb_priv);
-	return block_cb ? 0 : -ENOMEM;
+	return IS_ERR(block_cb) ? PTR_ERR(block_cb) : 0;
 }
 EXPORT_SYMBOL(tcf_block_cb_register);
 
@@ -685,6 +733,10 @@ static int tcf_block_cb_call(struct tcf_block *block, enum tc_setup_type type,
 	int ok_count = 0;
 	int err;
 
+	/* Make sure all netdevs sharing this block are offload-capable. */
+	if (block->nooffloaddevcnt && err_stop)
+		return -EOPNOTSUPP;
+
 	list_for_each_entry(block_cb, &block->cb_list, list) {
 		err = block_cb->cb(type, type_data, block_cb->cb_priv);
 		if (err) {
diff --git a/net/sched/cls_bpf.c b/net/sched/cls_bpf.c
index d79cc50..cf72aef 100644
--- a/net/sched/cls_bpf.c
+++ b/net/sched/cls_bpf.c
@@ -167,13 +167,16 @@ static int cls_bpf_offload_cmd(struct tcf_proto *tp, struct cls_bpf_prog *prog,
 	cls_bpf.exts_integrated = obj->exts_integrated;
 	cls_bpf.gen_flags = obj->gen_flags;
 
+	if (oldprog)
+		tcf_block_offload_dec(block, &oldprog->gen_flags);
+
 	err = tc_setup_cb_call(block, NULL, TC_SETUP_CLSBPF, &cls_bpf, skip_sw);
 	if (prog) {
 		if (err < 0) {
 			cls_bpf_offload_cmd(tp, oldprog, prog);
 			return err;
 		} else if (err > 0) {
-			prog->gen_flags |= TCA_CLS_FLAGS_IN_HW;
+			tcf_block_offload_inc(block, &prog->gen_flags);
 		}
 	}
 
diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index 6132a73..f61df19 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -229,6 +229,7 @@ static void fl_hw_destroy_filter(struct tcf_proto *tp, struct cls_fl_filter *f)
 
 	tc_setup_cb_call(block, &f->exts, TC_SETUP_CLSFLOWER,
 			 &cls_flower, false);
+	tcf_block_offload_dec(block, &f->flags);
 }
 
 static int fl_hw_replace_filter(struct tcf_proto *tp,
@@ -256,7 +257,7 @@ static int fl_hw_replace_filter(struct tcf_proto *tp,
 		fl_hw_destroy_filter(tp, f);
 		return err;
 	} else if (err > 0) {
-		f->flags |= TCA_CLS_FLAGS_IN_HW;
+		tcf_block_offload_inc(block, &f->flags);
 	}
 
 	if (skip_sw && !(f->flags & TCA_CLS_FLAGS_IN_HW))
diff --git a/net/sched/cls_matchall.c b/net/sched/cls_matchall.c
index 66d4e00..d0e57c8 100644
--- a/net/sched/cls_matchall.c
+++ b/net/sched/cls_matchall.c
@@ -81,6 +81,7 @@ static void mall_destroy_hw_filter(struct tcf_proto *tp,
 	cls_mall.cookie = cookie;
 
 	tc_setup_cb_call(block, NULL, TC_SETUP_CLSMATCHALL, &cls_mall, false);
+	tcf_block_offload_dec(block, &head->flags);
 }
 
 static int mall_replace_hw_filter(struct tcf_proto *tp,
@@ -103,7 +104,7 @@ static int mall_replace_hw_filter(struct tcf_proto *tp,
 		mall_destroy_hw_filter(tp, head, cookie);
 		return err;
 	} else if (err > 0) {
-		head->flags |= TCA_CLS_FLAGS_IN_HW;
+		tcf_block_offload_inc(block, &head->flags);
 	}
 
 	if (skip_sw && !(head->flags & TCA_CLS_FLAGS_IN_HW))
diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c
index 507859c..020d328 100644
--- a/net/sched/cls_u32.c
+++ b/net/sched/cls_u32.c
@@ -529,16 +529,17 @@ static int u32_replace_hw_hnode(struct tcf_proto *tp, struct tc_u_hnode *h,
 	return 0;
 }
 
-static void u32_remove_hw_knode(struct tcf_proto *tp, u32 handle)
+static void u32_remove_hw_knode(struct tcf_proto *tp, struct tc_u_knode *n)
 {
 	struct tcf_block *block = tp->chain->block;
 	struct tc_cls_u32_offload cls_u32 = {};
 
 	tc_cls_common_offload_init(&cls_u32.common, tp);
 	cls_u32.command = TC_CLSU32_DELETE_KNODE;
-	cls_u32.knode.handle = handle;
+	cls_u32.knode.handle = n->handle;
 
 	tc_setup_cb_call(block, NULL, TC_SETUP_CLSU32, &cls_u32, false);
+	tcf_block_offload_dec(block, &n->flags);
 }
 
 static int u32_replace_hw_knode(struct tcf_proto *tp, struct tc_u_knode *n,
@@ -567,10 +568,10 @@ static int u32_replace_hw_knode(struct tcf_proto *tp, struct tc_u_knode *n,
 
 	err = tc_setup_cb_call(block, NULL, TC_SETUP_CLSU32, &cls_u32, skip_sw);
 	if (err < 0) {
-		u32_remove_hw_knode(tp, n->handle);
+		u32_remove_hw_knode(tp, n);
 		return err;
 	} else if (err > 0) {
-		n->flags |= TCA_CLS_FLAGS_IN_HW;
+		tcf_block_offload_inc(block, &n->flags);
 	}
 
 	if (skip_sw && !(n->flags & TCA_CLS_FLAGS_IN_HW))
@@ -589,7 +590,7 @@ static void u32_clear_hnode(struct tcf_proto *tp, struct tc_u_hnode *ht)
 			RCU_INIT_POINTER(ht->ht[h],
 					 rtnl_dereference(n->next));
 			tcf_unbind_filter(tp, &n->res);
-			u32_remove_hw_knode(tp, n->handle);
+			u32_remove_hw_knode(tp, n);
 			idr_remove_ext(&ht->handle_idr, n->handle);
 			if (tcf_exts_get_net(&n->exts))
 				call_rcu(&n->rcu, u32_delete_key_freepf_rcu);
@@ -682,7 +683,7 @@ static int u32_delete(struct tcf_proto *tp, void *arg, bool *last)
 		goto out;
 
 	if (TC_U32_KEY(ht->handle)) {
-		u32_remove_hw_knode(tp, ht->handle);
+		u32_remove_hw_knode(tp, (struct tc_u_knode *)ht);
 		ret = u32_delete_key(tp, (struct tc_u_knode *)ht);
 		goto out;
 	}
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [patch net-next v7 07/13] net: sched: use block index as a handle instead of qdisc when block is shared
  2018-01-09 14:07 [patch net-next v7 00/13] net: sched: allow qdiscs to share filter block instances Jiri Pirko
                   ` (5 preceding siblings ...)
  2018-01-09 14:07 ` [patch net-next v7 06/13] net: sched: keep track of offloaded filters and check tc offload feature Jiri Pirko
@ 2018-01-09 14:07 ` Jiri Pirko
  2018-01-10 18:12   ` David Ahern
  2018-01-11 13:25   ` Jamal Hadi Salim
  2018-01-09 14:07 ` [patch net-next v7 08/13] net: sched: add rt netlink message type for block get Jiri Pirko
                   ` (9 subsequent siblings)
  16 siblings, 2 replies; 42+ messages in thread
From: Jiri Pirko @ 2018-01-09 14:07 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel,
	dsahern

From: Jiri Pirko <jiri@mellanox.com>

As the tcm_ifindex 0 is invalid ifindex, reuse it to indicate that we
work with block, instead of qdisc. So if tcm_ifindex is 0, tcm_parent is
used to carry block_index.

If the block is set to be shared between at least 2 qdiscs, it is
forbidden to use the qdisc handle to add/delete filters. In that case,
userspace has to pass block_index.

Also, for dump of the filters, in case the block is shared in between at
least 2 qdiscs, the each filter is dumped with tcm_ifindex 0 and
tcm_parent set to block_index. That gives the user clear indication,
that the filter belongs to a shared block and not only to one qdisc
under which it is dumped.

Suggested-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
v6->v7:
- changed extack message for block index handle as suggested by DaveA
- added extack message when block index does not exist
- the block ifindex magic is in define and change to 0xffffffff
  as suggested by Jamal
v5->v6:
- new patch
---
 include/uapi/linux/rtnetlink.h |   6 ++
 net/sched/cls_api.c            | 202 ++++++++++++++++++++++++-----------------
 2 files changed, 124 insertions(+), 84 deletions(-)

diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
index 843e29a..9c026d9 100644
--- a/include/uapi/linux/rtnetlink.h
+++ b/include/uapi/linux/rtnetlink.h
@@ -541,9 +541,15 @@ struct tcmsg {
 	int		tcm_ifindex;
 	__u32		tcm_handle;
 	__u32		tcm_parent;
+/* tcm_block_index is used instead of tcm_parent
+ * in case tcm_ifindex == TCM_IFINDEX_MAGIC_BLOCK
+ */
+#define tcm_block_index tcm_parent
 	__u32		tcm_info;
 };
 
+#define TCM_IFINDEX_MAGIC_BLOCK (0xFFFFFFFFU)
+
 enum {
 	TCA_UNSPEC,
 	TCA_KIND,
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 0ff8ae9..d687e58 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -865,8 +865,9 @@ static struct tcf_proto *tcf_chain_tp_find(struct tcf_chain *chain,
 }
 
 static int tcf_fill_node(struct net *net, struct sk_buff *skb,
-			 struct tcf_proto *tp, struct Qdisc *q, u32 parent,
-			 void *fh, u32 portid, u32 seq, u16 flags, int event)
+			 struct tcf_proto *tp, struct tcf_block *block,
+			 struct Qdisc *q, u32 parent, void *fh,
+			 u32 portid, u32 seq, u16 flags, int event)
 {
 	struct tcmsg *tcm;
 	struct nlmsghdr  *nlh;
@@ -879,8 +880,13 @@ static int tcf_fill_node(struct net *net, struct sk_buff *skb,
 	tcm->tcm_family = AF_UNSPEC;
 	tcm->tcm__pad1 = 0;
 	tcm->tcm__pad2 = 0;
-	tcm->tcm_ifindex = qdisc_dev(q)->ifindex;
-	tcm->tcm_parent = parent;
+	if (q) {
+		tcm->tcm_ifindex = qdisc_dev(q)->ifindex;
+		tcm->tcm_parent = parent;
+	} else {
+		tcm->tcm_ifindex = TCM_IFINDEX_MAGIC_BLOCK;
+		tcm->tcm_block_index = block->index;
+	}
 	tcm->tcm_info = TC_H_MAKE(tp->prio, tp->protocol);
 	if (nla_put_string(skb, TCA_KIND, tp->ops->kind))
 		goto nla_put_failure;
@@ -903,8 +909,8 @@ static int tcf_fill_node(struct net *net, struct sk_buff *skb,
 
 static int tfilter_notify(struct net *net, struct sk_buff *oskb,
 			  struct nlmsghdr *n, struct tcf_proto *tp,
-			  struct Qdisc *q, u32 parent,
-			  void *fh, int event, bool unicast)
+			  struct tcf_block *block, struct Qdisc *q,
+			  u32 parent, void *fh, int event, bool unicast)
 {
 	struct sk_buff *skb;
 	u32 portid = oskb ? NETLINK_CB(oskb).portid : 0;
@@ -913,8 +919,8 @@ static int tfilter_notify(struct net *net, struct sk_buff *oskb,
 	if (!skb)
 		return -ENOBUFS;
 
-	if (tcf_fill_node(net, skb, tp, q, parent, fh, portid, n->nlmsg_seq,
-			  n->nlmsg_flags, event) <= 0) {
+	if (tcf_fill_node(net, skb, tp, block, q, parent, fh, portid,
+			  n->nlmsg_seq, n->nlmsg_flags, event) <= 0) {
 		kfree_skb(skb);
 		return -EINVAL;
 	}
@@ -928,8 +934,8 @@ static int tfilter_notify(struct net *net, struct sk_buff *oskb,
 
 static int tfilter_del_notify(struct net *net, struct sk_buff *oskb,
 			      struct nlmsghdr *n, struct tcf_proto *tp,
-			      struct Qdisc *q, u32 parent,
-			      void *fh, bool unicast, bool *last)
+			      struct tcf_block *block, struct Qdisc *q,
+			      u32 parent, void *fh, bool unicast, bool *last)
 {
 	struct sk_buff *skb;
 	u32 portid = oskb ? NETLINK_CB(oskb).portid : 0;
@@ -939,8 +945,8 @@ static int tfilter_del_notify(struct net *net, struct sk_buff *oskb,
 	if (!skb)
 		return -ENOBUFS;
 
-	if (tcf_fill_node(net, skb, tp, q, parent, fh, portid, n->nlmsg_seq,
-			  n->nlmsg_flags, RTM_DELTFILTER) <= 0) {
+	if (tcf_fill_node(net, skb, tp, block, q, parent, fh, portid,
+			  n->nlmsg_seq, n->nlmsg_flags, RTM_DELTFILTER) <= 0) {
 		kfree_skb(skb);
 		return -EINVAL;
 	}
@@ -959,15 +965,16 @@ static int tfilter_del_notify(struct net *net, struct sk_buff *oskb,
 }
 
 static void tfilter_notify_chain(struct net *net, struct sk_buff *oskb,
-				 struct Qdisc *q, u32 parent,
-				 struct nlmsghdr *n,
+				 struct tcf_block *block, struct Qdisc *q,
+				 u32 parent, struct nlmsghdr *n,
 				 struct tcf_chain *chain, int event)
 {
 	struct tcf_proto *tp;
 
 	for (tp = rtnl_dereference(chain->filter_chain);
 	     tp; tp = rtnl_dereference(tp->next))
-		tfilter_notify(net, oskb, n, tp, q, parent, 0, event, false);
+		tfilter_notify(net, oskb, n, tp, block,
+			       q, parent, 0, event, false);
 }
 
 /* Add/change/delete/get a filter node */
@@ -983,13 +990,11 @@ static int tc_ctl_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
 	bool prio_allocate;
 	u32 parent;
 	u32 chain_index;
-	struct net_device *dev;
-	struct Qdisc  *q;
+	struct Qdisc *q = NULL;
 	struct tcf_chain_info chain_info;
 	struct tcf_chain *chain = NULL;
 	struct tcf_block *block;
 	struct tcf_proto *tp;
-	const struct Qdisc_class_ops *cops;
 	unsigned long cl;
 	void *fh;
 	int err;
@@ -1036,41 +1041,58 @@ static int tc_ctl_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
 
 	/* Find head of filter chain. */
 
-	/* Find link */
-	dev = __dev_get_by_index(net, t->tcm_ifindex);
-	if (dev == NULL)
-		return -ENODEV;
-
-	/* Find qdisc */
-	if (!parent) {
-		q = dev->qdisc;
-		parent = q->handle;
+	if (t->tcm_ifindex == TCM_IFINDEX_MAGIC_BLOCK) {
+		block = tcf_block_lookup(net, t->tcm_block_index);
+		if (!block) {
+			NL_SET_ERR_MSG(extack, "Block of given index was not found");
+			err = -EINVAL;
+			goto errout;
+		}
 	} else {
-		q = qdisc_lookup(dev, TC_H_MAJ(t->tcm_parent));
-		if (q == NULL)
-			return -EINVAL;
-	}
+		const struct Qdisc_class_ops *cops;
+		struct net_device *dev;
 
-	/* Is it classful? */
-	cops = q->ops->cl_ops;
-	if (!cops)
-		return -EINVAL;
+		/* Find link */
+		dev = __dev_get_by_index(net, t->tcm_ifindex);
+		if (!dev)
+			return -ENODEV;
 
-	if (!cops->tcf_block)
-		return -EOPNOTSUPP;
+		/* Find qdisc */
+		if (!parent) {
+			q = dev->qdisc;
+			parent = q->handle;
+		} else {
+			q = qdisc_lookup(dev, TC_H_MAJ(t->tcm_parent));
+			if (!q)
+				return -EINVAL;
+		}
 
-	/* Do we search for filter, attached to class? */
-	if (TC_H_MIN(parent)) {
-		cl = cops->find(q, parent);
-		if (cl == 0)
-			return -ENOENT;
-	}
+		/* Is it classful? */
+		cops = q->ops->cl_ops;
+		if (!cops)
+			return -EINVAL;
 
-	/* And the last stroke */
-	block = cops->tcf_block(q, cl, extack);
-	if (!block) {
-		err = -EINVAL;
-		goto errout;
+		if (!cops->tcf_block)
+			return -EOPNOTSUPP;
+
+		/* Do we search for filter, attached to class? */
+		if (TC_H_MIN(parent)) {
+			cl = cops->find(q, parent);
+			if (cl == 0)
+				return -ENOENT;
+		}
+
+		/* And the last stroke */
+		block = cops->tcf_block(q, cl, extack);
+		if (!block) {
+			err = -EINVAL;
+			goto errout;
+		}
+		if (tcf_block_shared(block)) {
+			NL_SET_ERR_MSG(extack, "This filter block is shared. Please use the block index to manipulate the filters");
+			err = -EOPNOTSUPP;
+			goto errout;
+		}
 	}
 
 	chain_index = tca[TCA_CHAIN] ? nla_get_u32(tca[TCA_CHAIN]) : 0;
@@ -1086,7 +1108,7 @@ static int tc_ctl_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
 	}
 
 	if (n->nlmsg_type == RTM_DELTFILTER && prio == 0) {
-		tfilter_notify_chain(net, skb, q, parent, n,
+		tfilter_notify_chain(net, skb, block, q, parent, n,
 				     chain, RTM_DELTFILTER);
 		tcf_chain_flush(chain);
 		err = 0;
@@ -1134,7 +1156,7 @@ static int tc_ctl_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
 	if (!fh) {
 		if (n->nlmsg_type == RTM_DELTFILTER && t->tcm_handle == 0) {
 			tcf_chain_tp_remove(chain, &chain_info, tp);
-			tfilter_notify(net, skb, n, tp, q, parent, fh,
+			tfilter_notify(net, skb, n, tp, block, q, parent, fh,
 				       RTM_DELTFILTER, false);
 			tcf_proto_destroy(tp);
 			err = 0;
@@ -1159,8 +1181,8 @@ static int tc_ctl_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
 			}
 			break;
 		case RTM_DELTFILTER:
-			err = tfilter_del_notify(net, skb, n, tp, q, parent,
-						 fh, false, &last);
+			err = tfilter_del_notify(net, skb, n, tp, block,
+						 q, parent, fh, false, &last);
 			if (err)
 				goto errout;
 			if (last) {
@@ -1169,8 +1191,8 @@ static int tc_ctl_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
 			}
 			goto errout;
 		case RTM_GETTFILTER:
-			err = tfilter_notify(net, skb, n, tp, q, parent, fh,
-					     RTM_NEWTFILTER, true);
+			err = tfilter_notify(net, skb, n, tp, block, q, parent,
+					     fh, RTM_NEWTFILTER, true);
 			goto errout;
 		default:
 			err = -EINVAL;
@@ -1183,7 +1205,7 @@ static int tc_ctl_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
 	if (err == 0) {
 		if (tp_created)
 			tcf_chain_tp_insert(chain, &chain_info, tp);
-		tfilter_notify(net, skb, n, tp, q, parent, fh,
+		tfilter_notify(net, skb, n, tp, block, q, parent, fh,
 			       RTM_NEWTFILTER, false);
 	} else {
 		if (tp_created)
@@ -1203,6 +1225,7 @@ struct tcf_dump_args {
 	struct tcf_walker w;
 	struct sk_buff *skb;
 	struct netlink_callback *cb;
+	struct tcf_block *block;
 	struct Qdisc *q;
 	u32 parent;
 };
@@ -1212,7 +1235,7 @@ static int tcf_node_dump(struct tcf_proto *tp, void *n, struct tcf_walker *arg)
 	struct tcf_dump_args *a = (void *)arg;
 	struct net *net = sock_net(a->skb->sk);
 
-	return tcf_fill_node(net, a->skb, tp, a->q, a->parent,
+	return tcf_fill_node(net, a->skb, tp, a->block, a->q, a->parent,
 			     n, NETLINK_CB(a->cb->skb).portid,
 			     a->cb->nlh->nlmsg_seq, NLM_F_MULTI,
 			     RTM_NEWTFILTER);
@@ -1223,6 +1246,7 @@ static bool tcf_chain_dump(struct tcf_chain *chain, struct Qdisc *q, u32 parent,
 			   long index_start, long *p_index)
 {
 	struct net *net = sock_net(skb->sk);
+	struct tcf_block *block = chain->block;
 	struct tcmsg *tcm = nlmsg_data(cb->nlh);
 	struct tcf_dump_args arg;
 	struct tcf_proto *tp;
@@ -1241,7 +1265,7 @@ static bool tcf_chain_dump(struct tcf_chain *chain, struct Qdisc *q, u32 parent,
 			memset(&cb->args[1], 0,
 			       sizeof(cb->args) - sizeof(cb->args[0]));
 		if (cb->args[1] == 0) {
-			if (tcf_fill_node(net, skb, tp, q, parent, 0,
+			if (tcf_fill_node(net, skb, tp, block, q, parent, 0,
 					  NETLINK_CB(cb->skb).portid,
 					  cb->nlh->nlmsg_seq, NLM_F_MULTI,
 					  RTM_NEWTFILTER) <= 0)
@@ -1254,6 +1278,7 @@ static bool tcf_chain_dump(struct tcf_chain *chain, struct Qdisc *q, u32 parent,
 		arg.w.fn = tcf_node_dump;
 		arg.skb = skb;
 		arg.cb = cb;
+		arg.block = block;
 		arg.q = q;
 		arg.parent = parent;
 		arg.w.stop = 0;
@@ -1272,13 +1297,10 @@ static int tc_dump_tfilter(struct sk_buff *skb, struct netlink_callback *cb)
 {
 	struct net *net = sock_net(skb->sk);
 	struct nlattr *tca[TCA_MAX + 1];
-	struct net_device *dev;
-	struct Qdisc *q;
+	struct Qdisc *q = NULL;
 	struct tcf_block *block;
 	struct tcf_chain *chain;
 	struct tcmsg *tcm = nlmsg_data(cb->nlh);
-	unsigned long cl = 0;
-	const struct Qdisc_class_ops *cops;
 	long index_start;
 	long index;
 	u32 parent;
@@ -1291,32 +1313,44 @@ static int tc_dump_tfilter(struct sk_buff *skb, struct netlink_callback *cb)
 	if (err)
 		return err;
 
-	dev = __dev_get_by_index(net, tcm->tcm_ifindex);
-	if (!dev)
-		return skb->len;
-
-	parent = tcm->tcm_parent;
-	if (!parent) {
-		q = dev->qdisc;
-		parent = q->handle;
+	if (tcm->tcm_ifindex == TCM_IFINDEX_MAGIC_BLOCK) {
+		block = tcf_block_lookup(net, tcm->tcm_block_index);
+		if (!block)
+			goto out;
 	} else {
-		q = qdisc_lookup(dev, TC_H_MAJ(tcm->tcm_parent));
-	}
-	if (!q)
-		goto out;
-	cops = q->ops->cl_ops;
-	if (!cops)
-		goto out;
-	if (!cops->tcf_block)
-		goto out;
-	if (TC_H_MIN(tcm->tcm_parent)) {
-		cl = cops->find(q, tcm->tcm_parent);
-		if (cl == 0)
+		const struct Qdisc_class_ops *cops;
+		struct net_device *dev;
+		unsigned long cl = 0;
+
+		dev = __dev_get_by_index(net, tcm->tcm_ifindex);
+		if (!dev)
+			return skb->len;
+
+		parent = tcm->tcm_parent;
+		if (!parent) {
+			q = dev->qdisc;
+			parent = q->handle;
+		} else {
+			q = qdisc_lookup(dev, TC_H_MAJ(tcm->tcm_parent));
+		}
+		if (!q)
 			goto out;
+		cops = q->ops->cl_ops;
+		if (!cops)
+			goto out;
+		if (!cops->tcf_block)
+			goto out;
+		if (TC_H_MIN(tcm->tcm_parent)) {
+			cl = cops->find(q, tcm->tcm_parent);
+			if (cl == 0)
+				goto out;
+		}
+		block = cops->tcf_block(q, cl, NULL);
+		if (!block)
+			goto out;
+		if (tcf_block_shared(block))
+			q = NULL;
 	}
-	block = cops->tcf_block(q, cl, NULL);
-	if (!block)
-		goto out;
 
 	index_start = cb->args[0];
 	index = 0;
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [patch net-next v7 08/13] net: sched: add rt netlink message type for block get
  2018-01-09 14:07 [patch net-next v7 00/13] net: sched: allow qdiscs to share filter block instances Jiri Pirko
                   ` (6 preceding siblings ...)
  2018-01-09 14:07 ` [patch net-next v7 07/13] net: sched: use block index as a handle instead of qdisc when block is shared Jiri Pirko
@ 2018-01-09 14:07 ` Jiri Pirko
  2018-01-10 16:48   ` David Ahern
  2018-01-11 13:27   ` Jamal Hadi Salim
  2018-01-09 14:07 ` [patch net-next v7 09/13] net: sched: allow ingress and clsact qdiscs to share filter blocks Jiri Pirko
                   ` (8 subsequent siblings)
  16 siblings, 2 replies; 42+ messages in thread
From: Jiri Pirko @ 2018-01-09 14:07 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel,
	dsahern

From: Jiri Pirko <jiri@mellanox.com>

Add simple block get operation which primary purpose is to check the
block existence by block index.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
v6->v7:
- new patch
---
 include/uapi/linux/rtnetlink.h |  6 ++++
 net/sched/cls_api.c            | 64 ++++++++++++++++++++++++++++++++++++++++++
 security/selinux/nlmsgtab.c    |  5 +++-
 3 files changed, 74 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
index 9c026d9..038cde7 100644
--- a/include/uapi/linux/rtnetlink.h
+++ b/include/uapi/linux/rtnetlink.h
@@ -150,6 +150,12 @@ enum {
 	RTM_NEWCACHEREPORT = 96,
 #define RTM_NEWCACHEREPORT RTM_NEWCACHEREPORT
 
+	RTM_NEWBLOCK = 100,
+#define RTM_NEWBLOCK RTM_NEWBLOCK
+	RTM_DELBLOCK,
+#define RTM_DELBLOCK RTM_DELBLOCK
+	RTM_GETBLOCK,
+#define RTM_GETBLOCK RTM_GETBLOCK
 	__RTM_MAX,
 #define RTM_MAX		(((__RTM_MAX + 3) & ~3) - 1)
 };
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index d687e58..14e4f20 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -1553,6 +1553,69 @@ int tc_setup_cb_call(struct tcf_block *block, struct tcf_exts *exts,
 }
 EXPORT_SYMBOL(tc_setup_cb_call);
 
+static int block_notify_fill(struct net *net, struct sk_buff *skb,
+			     struct tcf_block *block, u32 portid, u32 seq,
+			     u16 flags, int event)
+{
+	struct nlmsghdr *nlh;
+	struct tcmsg *tcm;
+
+	nlh = nlmsg_put(skb, portid, seq, event, sizeof(*tcm), flags);
+	if (!nlh)
+		return -EMSGSIZE;
+	tcm = nlmsg_data(nlh);
+	memset(tcm, 0, sizeof(*tcm));
+	tcm->tcm_family = AF_UNSPEC;
+	tcm->tcm_ifindex = TCM_IFINDEX_MAGIC_BLOCK;
+	tcm->tcm_block_index = block->index;
+	return 0;
+}
+
+static int block_notify(struct net *net, struct sk_buff *oskb,
+			struct nlmsghdr *n, struct tcf_block *block, int event)
+{
+	u32 portid = NETLINK_CB(oskb).portid;
+	struct sk_buff *skb;
+	int err;
+
+	skb = alloc_skb(NLMSG_GOODSIZE, GFP_KERNEL);
+	if (!skb)
+		return -ENOBUFS;
+
+	err = block_notify_fill(net, skb, block, portid,
+				n->nlmsg_seq, n->nlmsg_flags, event);
+	if (err) {
+		kfree_skb(skb);
+		return err;
+	}
+
+	return netlink_unicast(net->rtnl, skb, portid, MSG_DONTWAIT);
+}
+
+static int tc_ctl_block(struct sk_buff *skb, struct nlmsghdr *n,
+			struct netlink_ext_ack *extack)
+{
+	struct net *net = sock_net(skb->sk);
+	struct tcf_block *block;
+	struct tcmsg *tcm;
+
+	if (n->nlmsg_len < nlmsg_msg_size(sizeof(*tcm)))
+		return -EINVAL;
+
+	tcm = nlmsg_data(n);
+
+	if (tcm->tcm_ifindex != TCM_IFINDEX_MAGIC_BLOCK)
+		return -EINVAL;
+
+	block = tcf_block_lookup(net, tcm->tcm_block_index);
+	if (!block) {
+		NL_SET_ERR_MSG(extack, "Block with the given index does not exist");
+		return -EINVAL;
+	}
+
+	return block_notify(net, skb, n, block, RTM_NEWBLOCK);
+}
+
 static __net_init int tcf_net_init(struct net *net)
 {
 	struct tcf_net *tn = net_generic(net, tcf_net_id);
@@ -1591,6 +1654,7 @@ static int __init tc_filter_init(void)
 	rtnl_register(PF_UNSPEC, RTM_DELTFILTER, tc_ctl_tfilter, NULL, 0);
 	rtnl_register(PF_UNSPEC, RTM_GETTFILTER, tc_ctl_tfilter,
 		      tc_dump_tfilter, 0);
+	rtnl_register(PF_UNSPEC, RTM_GETBLOCK, tc_ctl_block, NULL, 0);
 
 	return 0;
 }
diff --git a/security/selinux/nlmsgtab.c b/security/selinux/nlmsgtab.c
index 7b7433a..4e95a46 100644
--- a/security/selinux/nlmsgtab.c
+++ b/security/selinux/nlmsgtab.c
@@ -80,6 +80,9 @@ static const struct nlmsg_perm nlmsg_route_perms[] =
 	{ RTM_NEWSTATS,		NETLINK_ROUTE_SOCKET__NLMSG_READ },
 	{ RTM_GETSTATS,		NETLINK_ROUTE_SOCKET__NLMSG_READ  },
 	{ RTM_NEWCACHEREPORT,	NETLINK_ROUTE_SOCKET__NLMSG_READ },
+	{ RTM_NEWBLOCK,		NETLINK_ROUTE_SOCKET__NLMSG_READ },
+	{ RTM_DELBLOCK,		NETLINK_ROUTE_SOCKET__NLMSG_READ },
+	{ RTM_GETBLOCK,		NETLINK_ROUTE_SOCKET__NLMSG_READ },
 };
 
 static const struct nlmsg_perm nlmsg_tcpdiag_perms[] =
@@ -159,7 +162,7 @@ int selinux_nlmsg_lookup(u16 sclass, u16 nlmsg_type, u32 *perm)
 	switch (sclass) {
 	case SECCLASS_NETLINK_ROUTE_SOCKET:
 		/* RTM_MAX always point to RTM_SETxxxx, ie RTM_NEWxxx + 3 */
-		BUILD_BUG_ON(RTM_MAX != (RTM_NEWCACHEREPORT + 3));
+		BUILD_BUG_ON(RTM_MAX != (RTM_NEWBLOCK + 3));
 		err = nlmsg_perm(nlmsg_type, perm, nlmsg_route_perms,
 				 sizeof(nlmsg_route_perms));
 		break;
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [patch net-next v7 09/13] net: sched: allow ingress and clsact qdiscs to share filter blocks
  2018-01-09 14:07 [patch net-next v7 00/13] net: sched: allow qdiscs to share filter block instances Jiri Pirko
                   ` (7 preceding siblings ...)
  2018-01-09 14:07 ` [patch net-next v7 08/13] net: sched: add rt netlink message type for block get Jiri Pirko
@ 2018-01-09 14:07 ` Jiri Pirko
  2018-01-11 13:36   ` Jamal Hadi Salim
  2018-01-09 14:07 ` [patch net-next v7 10/13] mlxsw: spectrum_acl: Reshuffle code around mlxsw_sp_acl_ruleset_create/destroy Jiri Pirko
                   ` (7 subsequent siblings)
  16 siblings, 1 reply; 42+ messages in thread
From: Jiri Pirko @ 2018-01-09 14:07 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel,
	dsahern

From: Jiri Pirko <jiri@mellanox.com>

Benefit from the previously introduced shared filter blocks
infrastructure and allow ingress and clsact qdisc instances to share
filter blocks. The block index is coming from userspace as qdisc option.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
v6->v7:
- adjust to the core changes and check block index attributes for being 0
v3->v4:
- rebased on top of the current net-next
v2->v3:
- removed "p_" prefix from block index function args
---
 include/uapi/linux/pkt_sched.h |  11 +++++
 net/sched/sch_ingress.c        | 101 ++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 111 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/pkt_sched.h b/include/uapi/linux/pkt_sched.h
index 37b5096..8cc554a 100644
--- a/include/uapi/linux/pkt_sched.h
+++ b/include/uapi/linux/pkt_sched.h
@@ -934,4 +934,15 @@ enum {
 
 #define TCA_CBS_MAX (__TCA_CBS_MAX - 1)
 
+/* Ingress/clsact */
+
+enum {
+	TCA_CLSACT_UNSPEC,
+	TCA_CLSACT_INGRESS_BLOCK,
+	TCA_CLSACT_EGRESS_BLOCK,
+	__TCA_CLSACT_MAX
+};
+
+#define TCA_CLSACT_MAX	(__TCA_CLSACT_MAX - 1)
+
 #endif
diff --git a/net/sched/sch_ingress.c b/net/sched/sch_ingress.c
index 7ca2be2..1bef8d4 100644
--- a/net/sched/sch_ingress.c
+++ b/net/sched/sch_ingress.c
@@ -61,6 +61,32 @@ static void clsact_chain_head_change(struct tcf_proto *tp_head, void *priv)
 	struct mini_Qdisc_pair *miniqp = priv;
 
 	mini_qdisc_pair_swap(miniqp, tp_head);
+};
+
+static const struct nla_policy ingress_policy[TCA_CLSACT_MAX + 1] = {
+	[TCA_CLSACT_INGRESS_BLOCK]	= { .type = NLA_U32 },
+};
+
+static int ingress_parse_opt(struct nlattr *opt, struct tcf_block_ext_info *ei,
+			     struct netlink_ext_ack *extack)
+{
+	struct nlattr *tb[TCA_CLSACT_MAX + 1];
+	int err;
+
+	if (!opt)
+		return 0;
+	err = nla_parse_nested(tb, TCA_CLSACT_MAX, opt, ingress_policy, NULL);
+	if (err)
+		return err;
+
+	if (tb[TCA_CLSACT_INGRESS_BLOCK]) {
+		ei->block_index = nla_get_u32(tb[TCA_CLSACT_INGRESS_BLOCK]);
+		if (!ei->block_index) {
+			NL_SET_ERR_MSG(extack, "Block index cannot be 0");
+			return -EINVAL;
+		}
+	}
+	return 0;
 }
 
 static int ingress_init(struct Qdisc *sch, struct nlattr *opt,
@@ -74,6 +100,10 @@ static int ingress_init(struct Qdisc *sch, struct nlattr *opt,
 
 	mini_qdisc_pair_init(&q->miniqp, sch, &dev->miniq_ingress);
 
+	err = ingress_parse_opt(opt, &q->block_info, extack);
+	if (err)
+		return err;
+
 	q->block_info.binder_type = TCF_BLOCK_BINDER_TYPE_CLSACT_INGRESS;
 	q->block_info.chain_head_change = clsact_chain_head_change;
 	q->block_info.chain_head_change_priv = &q->miniqp;
@@ -97,11 +127,15 @@ static void ingress_destroy(struct Qdisc *sch)
 
 static int ingress_dump(struct Qdisc *sch, struct sk_buff *skb)
 {
+	struct ingress_sched_data *q = qdisc_priv(sch);
 	struct nlattr *nest;
 
 	nest = nla_nest_start(skb, TCA_OPTIONS);
 	if (nest == NULL)
 		goto nla_put_failure;
+	if (q->block->index &&
+	    nla_put_u32(skb, TCA_CLSACT_INGRESS_BLOCK, q->block->index))
+		goto nla_put_failure;
 
 	return nla_nest_end(skb, nest);
 
@@ -170,6 +204,44 @@ static struct tcf_block *clsact_tcf_block(struct Qdisc *sch, unsigned long cl,
 	}
 }
 
+static const struct nla_policy clsact_policy[TCA_CLSACT_MAX + 1] = {
+	[TCA_CLSACT_INGRESS_BLOCK]	= { .type = NLA_U32 },
+	[TCA_CLSACT_EGRESS_BLOCK]	= { .type = NLA_U32 },
+};
+
+static int clsact_parse_opt(struct nlattr *opt,
+			    struct tcf_block_ext_info *ei_ingress,
+			    struct tcf_block_ext_info *ei_egress,
+			    struct netlink_ext_ack *extack)
+{
+	struct nlattr *tb[TCA_CLSACT_MAX + 1];
+	int err;
+
+	if (!opt)
+		return 0;
+	err = nla_parse_nested(tb, TCA_CLSACT_MAX, opt, clsact_policy, NULL);
+	if (err)
+		return err;
+
+	if (tb[TCA_CLSACT_INGRESS_BLOCK]) {
+		ei_ingress->block_index =
+			nla_get_u32(tb[TCA_CLSACT_INGRESS_BLOCK]);
+		if (!ei_ingress->block_index) {
+			NL_SET_ERR_MSG(extack, "Block index cannot be 0");
+			return -EINVAL;
+		}
+	}
+	if (tb[TCA_CLSACT_EGRESS_BLOCK]) {
+		ei_egress->block_index =
+			nla_get_u32(tb[TCA_CLSACT_EGRESS_BLOCK]);
+		if (!ei_egress->block_index) {
+			NL_SET_ERR_MSG(extack, "Block index cannot be 0");
+			return -EINVAL;
+		}
+	}
+	return 0;
+}
+
 static int clsact_init(struct Qdisc *sch, struct nlattr *opt,
 		       struct netlink_ext_ack *extack)
 {
@@ -182,6 +254,11 @@ static int clsact_init(struct Qdisc *sch, struct nlattr *opt,
 
 	mini_qdisc_pair_init(&q->miniqp_ingress, sch, &dev->miniq_ingress);
 
+	err = clsact_parse_opt(opt, &q->ingress_block_info,
+			       &q->egress_block_info, extack);
+	if (err)
+		return err;
+
 	q->ingress_block_info.binder_type = TCF_BLOCK_BINDER_TYPE_CLSACT_INGRESS;
 	q->ingress_block_info.chain_head_change = clsact_chain_head_change;
 	q->ingress_block_info.chain_head_change_priv = &q->miniqp_ingress;
@@ -218,6 +295,28 @@ static void clsact_destroy(struct Qdisc *sch)
 	net_dec_egress_queue();
 }
 
+static int clsact_dump(struct Qdisc *sch, struct sk_buff *skb)
+{
+	struct clsact_sched_data *q = qdisc_priv(sch);
+	struct nlattr *nest;
+
+	nest = nla_nest_start(skb, TCA_OPTIONS);
+	if (!nest)
+		goto nla_put_failure;
+	if (q->ingress_block->index &&
+	    nla_put_u32(skb, TCA_CLSACT_INGRESS_BLOCK, q->ingress_block->index))
+		goto nla_put_failure;
+	if (q->egress_block->index &&
+	    nla_put_u32(skb, TCA_CLSACT_EGRESS_BLOCK, q->egress_block->index))
+		goto nla_put_failure;
+
+	return nla_nest_end(skb, nest);
+
+nla_put_failure:
+	nla_nest_cancel(skb, nest);
+	return -1;
+}
+
 static const struct Qdisc_class_ops clsact_class_ops = {
 	.leaf		=	ingress_leaf,
 	.find		=	clsact_find,
@@ -233,7 +332,7 @@ static struct Qdisc_ops clsact_qdisc_ops __read_mostly = {
 	.priv_size	=	sizeof(struct clsact_sched_data),
 	.init		=	clsact_init,
 	.destroy	=	clsact_destroy,
-	.dump		=	ingress_dump,
+	.dump		=	clsact_dump,
 	.owner		=	THIS_MODULE,
 };
 
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [patch net-next v7 10/13] mlxsw: spectrum_acl: Reshuffle code around mlxsw_sp_acl_ruleset_create/destroy
  2018-01-09 14:07 [patch net-next v7 00/13] net: sched: allow qdiscs to share filter block instances Jiri Pirko
                   ` (8 preceding siblings ...)
  2018-01-09 14:07 ` [patch net-next v7 09/13] net: sched: allow ingress and clsact qdiscs to share filter blocks Jiri Pirko
@ 2018-01-09 14:07 ` Jiri Pirko
  2018-01-09 14:07 ` [patch net-next v7 11/13] mlxsw: spectrum_acl: Don't store netdev and ingress for ruleset unbind Jiri Pirko
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 42+ messages in thread
From: Jiri Pirko @ 2018-01-09 14:07 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel,
	dsahern

From: Jiri Pirko <jiri@mellanox.com>

In order to prepare for follow-up changes, make the bind/unbind helpers
very simple. That required move of ht insertion/removal and bind/unbind
calls into mlxsw_sp_acl_ruleset_create/destroy.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c | 102 ++++++++++-----------
 1 file changed, 46 insertions(+), 56 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
index 93dcd31..ead4cb8 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
@@ -118,8 +118,26 @@ struct mlxsw_sp_fid *mlxsw_sp_acl_dummy_fid(struct mlxsw_sp *mlxsw_sp)
 	return mlxsw_sp->acl->dummy_fid;
 }
 
+static int mlxsw_sp_acl_ruleset_bind(struct mlxsw_sp *mlxsw_sp,
+				     struct mlxsw_sp_acl_ruleset *ruleset,
+				     struct net_device *dev, bool ingress)
+{
+	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
+
+	return ops->ruleset_bind(mlxsw_sp, ruleset->priv, dev, ingress);
+}
+
+static void mlxsw_sp_acl_ruleset_unbind(struct mlxsw_sp *mlxsw_sp,
+					struct mlxsw_sp_acl_ruleset *ruleset)
+{
+	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
+
+	ops->ruleset_unbind(mlxsw_sp, ruleset->priv);
+}
+
 static struct mlxsw_sp_acl_ruleset *
-mlxsw_sp_acl_ruleset_create(struct mlxsw_sp *mlxsw_sp,
+mlxsw_sp_acl_ruleset_create(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
+			    bool ingress, u32 chain_index,
 			    const struct mlxsw_sp_acl_profile_ops *ops)
 {
 	struct mlxsw_sp_acl *acl = mlxsw_sp->acl;
@@ -132,6 +150,9 @@ mlxsw_sp_acl_ruleset_create(struct mlxsw_sp *mlxsw_sp,
 	if (!ruleset)
 		return ERR_PTR(-ENOMEM);
 	ruleset->ref_count = 1;
+	ruleset->ht_key.dev = dev;
+	ruleset->ht_key.ingress = ingress;
+	ruleset->ht_key.chain_index = chain_index;
 	ruleset->ht_key.ops = ops;
 
 	err = rhashtable_init(&ruleset->rule_ht, &mlxsw_sp_acl_rule_ht_params);
@@ -142,68 +163,49 @@ mlxsw_sp_acl_ruleset_create(struct mlxsw_sp *mlxsw_sp,
 	if (err)
 		goto err_ops_ruleset_add;
 
-	return ruleset;
-
-err_ops_ruleset_add:
-	rhashtable_destroy(&ruleset->rule_ht);
-err_rhashtable_init:
-	kfree(ruleset);
-	return ERR_PTR(err);
-}
-
-static void mlxsw_sp_acl_ruleset_destroy(struct mlxsw_sp *mlxsw_sp,
-					 struct mlxsw_sp_acl_ruleset *ruleset)
-{
-	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
-
-	ops->ruleset_del(mlxsw_sp, ruleset->priv);
-	rhashtable_destroy(&ruleset->rule_ht);
-	kfree(ruleset);
-}
-
-static int mlxsw_sp_acl_ruleset_bind(struct mlxsw_sp *mlxsw_sp,
-				     struct mlxsw_sp_acl_ruleset *ruleset,
-				     struct net_device *dev, bool ingress,
-				     u32 chain_index)
-{
-	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
-	struct mlxsw_sp_acl *acl = mlxsw_sp->acl;
-	int err;
-
-	ruleset->ht_key.dev = dev;
-	ruleset->ht_key.ingress = ingress;
-	ruleset->ht_key.chain_index = chain_index;
 	err = rhashtable_insert_fast(&acl->ruleset_ht, &ruleset->ht_node,
 				     mlxsw_sp_acl_ruleset_ht_params);
 	if (err)
-		return err;
-	if (!ruleset->ht_key.chain_index) {
+		goto err_ht_insert;
+
+	if (!chain_index) {
 		/* We only need ruleset with chain index 0, the implicit one,
 		 * to be directly bound to device. The rest of the rulesets
 		 * are bound by "Goto action set".
 		 */
-		err = ops->ruleset_bind(mlxsw_sp, ruleset->priv, dev, ingress);
+		err = mlxsw_sp_acl_ruleset_bind(mlxsw_sp, ruleset,
+						dev, ingress);
 		if (err)
-			goto err_ops_ruleset_bind;
+			goto err_ruleset_bind;
 	}
-	return 0;
 
-err_ops_ruleset_bind:
+	return ruleset;
+
+err_ruleset_bind:
 	rhashtable_remove_fast(&acl->ruleset_ht, &ruleset->ht_node,
 			       mlxsw_sp_acl_ruleset_ht_params);
-	return err;
+err_ht_insert:
+	ops->ruleset_del(mlxsw_sp, ruleset->priv);
+err_ops_ruleset_add:
+	rhashtable_destroy(&ruleset->rule_ht);
+err_rhashtable_init:
+	kfree(ruleset);
+	return ERR_PTR(err);
 }
 
-static void mlxsw_sp_acl_ruleset_unbind(struct mlxsw_sp *mlxsw_sp,
-					struct mlxsw_sp_acl_ruleset *ruleset)
+static void mlxsw_sp_acl_ruleset_destroy(struct mlxsw_sp *mlxsw_sp,
+					 struct mlxsw_sp_acl_ruleset *ruleset)
 {
 	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
 	struct mlxsw_sp_acl *acl = mlxsw_sp->acl;
 
 	if (!ruleset->ht_key.chain_index)
-		ops->ruleset_unbind(mlxsw_sp, ruleset->priv);
+		mlxsw_sp_acl_ruleset_unbind(mlxsw_sp, ruleset);
 	rhashtable_remove_fast(&acl->ruleset_ht, &ruleset->ht_node,
 			       mlxsw_sp_acl_ruleset_ht_params);
+	ops->ruleset_del(mlxsw_sp, ruleset->priv);
+	rhashtable_destroy(&ruleset->rule_ht);
+	kfree(ruleset);
 }
 
 static void mlxsw_sp_acl_ruleset_ref_inc(struct mlxsw_sp_acl_ruleset *ruleset)
@@ -216,7 +218,6 @@ static void mlxsw_sp_acl_ruleset_ref_dec(struct mlxsw_sp *mlxsw_sp,
 {
 	if (--ruleset->ref_count)
 		return;
-	mlxsw_sp_acl_ruleset_unbind(mlxsw_sp, ruleset);
 	mlxsw_sp_acl_ruleset_destroy(mlxsw_sp, ruleset);
 }
 
@@ -263,7 +264,6 @@ mlxsw_sp_acl_ruleset_get(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
 	const struct mlxsw_sp_acl_profile_ops *ops;
 	struct mlxsw_sp_acl *acl = mlxsw_sp->acl;
 	struct mlxsw_sp_acl_ruleset *ruleset;
-	int err;
 
 	ops = acl->ops->profile_ops(mlxsw_sp, profile);
 	if (!ops)
@@ -275,18 +275,8 @@ mlxsw_sp_acl_ruleset_get(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
 		mlxsw_sp_acl_ruleset_ref_inc(ruleset);
 		return ruleset;
 	}
-	ruleset = mlxsw_sp_acl_ruleset_create(mlxsw_sp, ops);
-	if (IS_ERR(ruleset))
-		return ruleset;
-	err = mlxsw_sp_acl_ruleset_bind(mlxsw_sp, ruleset, dev,
-					ingress, chain_index);
-	if (err)
-		goto err_ruleset_bind;
-	return ruleset;
-
-err_ruleset_bind:
-	mlxsw_sp_acl_ruleset_destroy(mlxsw_sp, ruleset);
-	return ERR_PTR(err);
+	return mlxsw_sp_acl_ruleset_create(mlxsw_sp, dev, ingress,
+					   chain_index, ops);
 }
 
 void mlxsw_sp_acl_ruleset_put(struct mlxsw_sp *mlxsw_sp,
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [patch net-next v7 11/13] mlxsw: spectrum_acl: Don't store netdev and ingress for ruleset unbind
  2018-01-09 14:07 [patch net-next v7 00/13] net: sched: allow qdiscs to share filter block instances Jiri Pirko
                   ` (9 preceding siblings ...)
  2018-01-09 14:07 ` [patch net-next v7 10/13] mlxsw: spectrum_acl: Reshuffle code around mlxsw_sp_acl_ruleset_create/destroy Jiri Pirko
@ 2018-01-09 14:07 ` Jiri Pirko
  2018-01-09 14:07 ` [patch net-next v7 12/13] mlxsw: spectrum_acl: Implement TC block sharing Jiri Pirko
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 42+ messages in thread
From: Jiri Pirko @ 2018-01-09 14:07 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel,
	dsahern

From: Jiri Pirko <jiri@mellanox.com>

Instead, pass netdev and ingress flag to ruleset unbind op.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum.h     |  3 +-
 drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c |  9 ++++--
 .../ethernet/mellanox/mlxsw/spectrum_acl_tcam.c    | 33 +++++++++++-----------
 3 files changed, 24 insertions(+), 21 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
index a0adcd8..523e64e 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
@@ -477,7 +477,8 @@ struct mlxsw_sp_acl_profile_ops {
 	void (*ruleset_del)(struct mlxsw_sp *mlxsw_sp, void *ruleset_priv);
 	int (*ruleset_bind)(struct mlxsw_sp *mlxsw_sp, void *ruleset_priv,
 			    struct net_device *dev, bool ingress);
-	void (*ruleset_unbind)(struct mlxsw_sp *mlxsw_sp, void *ruleset_priv);
+	void (*ruleset_unbind)(struct mlxsw_sp *mlxsw_sp, void *ruleset_priv,
+			       struct net_device *dev, bool ingress);
 	u16 (*ruleset_group_id)(void *ruleset_priv);
 	size_t rule_priv_size;
 	int (*rule_add)(struct mlxsw_sp *mlxsw_sp,
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
index ead4cb8..7fb41a4 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
@@ -128,11 +128,12 @@ static int mlxsw_sp_acl_ruleset_bind(struct mlxsw_sp *mlxsw_sp,
 }
 
 static void mlxsw_sp_acl_ruleset_unbind(struct mlxsw_sp *mlxsw_sp,
-					struct mlxsw_sp_acl_ruleset *ruleset)
+					struct mlxsw_sp_acl_ruleset *ruleset,
+					struct net_device *dev, bool ingress)
 {
 	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
 
-	ops->ruleset_unbind(mlxsw_sp, ruleset->priv);
+	ops->ruleset_unbind(mlxsw_sp, ruleset->priv, dev, ingress);
 }
 
 static struct mlxsw_sp_acl_ruleset *
@@ -200,7 +201,9 @@ static void mlxsw_sp_acl_ruleset_destroy(struct mlxsw_sp *mlxsw_sp,
 	struct mlxsw_sp_acl *acl = mlxsw_sp->acl;
 
 	if (!ruleset->ht_key.chain_index)
-		mlxsw_sp_acl_ruleset_unbind(mlxsw_sp, ruleset);
+		mlxsw_sp_acl_ruleset_unbind(mlxsw_sp, ruleset,
+					    ruleset->ht_key.dev,
+					    ruleset->ht_key.ingress);
 	rhashtable_remove_fast(&acl->ruleset_ht, &ruleset->ht_node,
 			       mlxsw_sp_acl_ruleset_ht_params);
 	ops->ruleset_del(mlxsw_sp, ruleset->priv);
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c
index 7e8284b..50b2f9a 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c
@@ -154,10 +154,6 @@ struct mlxsw_sp_acl_tcam_group {
 	struct list_head region_list;
 	unsigned int region_count;
 	struct rhashtable chunk_ht;
-	struct {
-		u16 local_port;
-		bool ingress;
-	} bound;
 	struct mlxsw_sp_acl_tcam_group_ops *ops;
 	const struct mlxsw_sp_acl_tcam_pattern *patterns;
 	unsigned int patterns_count;
@@ -271,26 +267,28 @@ mlxsw_sp_acl_tcam_group_bind(struct mlxsw_sp *mlxsw_sp,
 		return -EINVAL;
 
 	mlxsw_sp_port = netdev_priv(dev);
-	group->bound.local_port = mlxsw_sp_port->local_port;
-	group->bound.ingress = ingress;
-	mlxsw_reg_ppbt_pack(ppbt_pl,
-			    group->bound.ingress ? MLXSW_REG_PXBT_E_IACL :
-						   MLXSW_REG_PXBT_E_EACL,
-			    MLXSW_REG_PXBT_OP_BIND, group->bound.local_port,
+	mlxsw_reg_ppbt_pack(ppbt_pl, ingress ? MLXSW_REG_PXBT_E_IACL :
+					       MLXSW_REG_PXBT_E_EACL,
+			    MLXSW_REG_PXBT_OP_BIND, mlxsw_sp_port->local_port,
 			    group->id);
 	return mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(ppbt), ppbt_pl);
 }
 
 static void
 mlxsw_sp_acl_tcam_group_unbind(struct mlxsw_sp *mlxsw_sp,
-			       struct mlxsw_sp_acl_tcam_group *group)
+			       struct mlxsw_sp_acl_tcam_group *group,
+			       struct net_device *dev, bool ingress)
 {
+	struct mlxsw_sp_port *mlxsw_sp_port;
 	char ppbt_pl[MLXSW_REG_PPBT_LEN];
 
-	mlxsw_reg_ppbt_pack(ppbt_pl,
-			    group->bound.ingress ? MLXSW_REG_PXBT_E_IACL :
-						   MLXSW_REG_PXBT_E_EACL,
-			    MLXSW_REG_PXBT_OP_UNBIND, group->bound.local_port,
+	if (WARN_ON(!mlxsw_sp_port_dev_check(dev)))
+		return;
+
+	mlxsw_sp_port = netdev_priv(dev);
+	mlxsw_reg_ppbt_pack(ppbt_pl, ingress ? MLXSW_REG_PXBT_E_IACL :
+					       MLXSW_REG_PXBT_E_EACL,
+			    MLXSW_REG_PXBT_OP_UNBIND, mlxsw_sp_port->local_port,
 			    group->id);
 	mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(ppbt), ppbt_pl);
 }
@@ -1066,11 +1064,12 @@ mlxsw_sp_acl_tcam_flower_ruleset_bind(struct mlxsw_sp *mlxsw_sp,
 
 static void
 mlxsw_sp_acl_tcam_flower_ruleset_unbind(struct mlxsw_sp *mlxsw_sp,
-					void *ruleset_priv)
+					void *ruleset_priv,
+					struct net_device *dev, bool ingress)
 {
 	struct mlxsw_sp_acl_tcam_flower_ruleset *ruleset = ruleset_priv;
 
-	mlxsw_sp_acl_tcam_group_unbind(mlxsw_sp, &ruleset->group);
+	mlxsw_sp_acl_tcam_group_unbind(mlxsw_sp, &ruleset->group, dev, ingress);
 }
 
 static u16
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [patch net-next v7 12/13] mlxsw: spectrum_acl: Implement TC block sharing
  2018-01-09 14:07 [patch net-next v7 00/13] net: sched: allow qdiscs to share filter block instances Jiri Pirko
                   ` (10 preceding siblings ...)
  2018-01-09 14:07 ` [patch net-next v7 11/13] mlxsw: spectrum_acl: Don't store netdev and ingress for ruleset unbind Jiri Pirko
@ 2018-01-09 14:07 ` Jiri Pirko
  2018-01-09 14:07 ` [patch net-next v7 13/13] mlxsw: spectrum_acl: Pass mlxsw_sp_port down to ruleset bind/unbind ops Jiri Pirko
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 42+ messages in thread
From: Jiri Pirko @ 2018-01-09 14:07 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel,
	dsahern

From: Jiri Pirko <jiri@mellanox.com>

Benefit from the prepared TC and in-driver ACL infrastructure and
introduce block sharing offload. For that, a new struct "block" is
introduced in spectrum_acl in order to hold a list of specific
block-port bindings.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
v2->v3:
- add tc offload feature handling
v1->v2:
- new patch
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum.c     | 182 +++++++++++++---
 drivers/net/ethernet/mellanox/mlxsw/spectrum.h     |  37 +++-
 drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c | 237 ++++++++++++++++++---
 .../net/ethernet/mellanox/mlxsw/spectrum_flower.c  |  41 ++--
 4 files changed, 401 insertions(+), 96 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
index d373df7..fa2896f 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
@@ -1747,72 +1747,186 @@ static int mlxsw_sp_setup_tc_cls_matchall(struct mlxsw_sp_port *mlxsw_sp_port,
 }
 
 static int
-mlxsw_sp_setup_tc_cls_flower(struct mlxsw_sp_port *mlxsw_sp_port,
-			     struct tc_cls_flower_offload *f,
-			     bool ingress)
+mlxsw_sp_setup_tc_cls_flower(struct mlxsw_sp_acl_block *acl_block,
+			     struct tc_cls_flower_offload *f)
 {
+	struct mlxsw_sp *mlxsw_sp = mlxsw_sp_acl_block_mlxsw_sp(acl_block);
+
 	switch (f->command) {
 	case TC_CLSFLOWER_REPLACE:
-		return mlxsw_sp_flower_replace(mlxsw_sp_port, ingress, f);
+		return mlxsw_sp_flower_replace(mlxsw_sp, acl_block, f);
 	case TC_CLSFLOWER_DESTROY:
-		mlxsw_sp_flower_destroy(mlxsw_sp_port, ingress, f);
+		mlxsw_sp_flower_destroy(mlxsw_sp, acl_block, f);
 		return 0;
 	case TC_CLSFLOWER_STATS:
-		return mlxsw_sp_flower_stats(mlxsw_sp_port, ingress, f);
+		return mlxsw_sp_flower_stats(mlxsw_sp, acl_block, f);
 	default:
 		return -EOPNOTSUPP;
 	}
 }
 
-static int mlxsw_sp_setup_tc_block_cb(enum tc_setup_type type, void *type_data,
-				      void *cb_priv, bool ingress)
+static int mlxsw_sp_setup_tc_block_cb_matchall(enum tc_setup_type type,
+					       void *type_data,
+					       void *cb_priv, bool ingress)
 {
 	struct mlxsw_sp_port *mlxsw_sp_port = cb_priv;
 
-	if (!tc_can_offload(mlxsw_sp_port->dev))
-		return -EOPNOTSUPP;
-
 	switch (type) {
 	case TC_SETUP_CLSMATCHALL:
+		if (!tc_can_offload(mlxsw_sp_port->dev))
+			return -EOPNOTSUPP;
+
 		return mlxsw_sp_setup_tc_cls_matchall(mlxsw_sp_port, type_data,
 						      ingress);
 	case TC_SETUP_CLSFLOWER:
-		return mlxsw_sp_setup_tc_cls_flower(mlxsw_sp_port, type_data,
-						    ingress);
+		return 0;
 	default:
 		return -EOPNOTSUPP;
 	}
 }
 
-static int mlxsw_sp_setup_tc_block_cb_ig(enum tc_setup_type type,
-					 void *type_data, void *cb_priv)
+static int mlxsw_sp_setup_tc_block_cb_matchall_ig(enum tc_setup_type type,
+						  void *type_data,
+						  void *cb_priv)
 {
-	return mlxsw_sp_setup_tc_block_cb(type, type_data, cb_priv, true);
+	return mlxsw_sp_setup_tc_block_cb_matchall(type, type_data,
+						   cb_priv, true);
 }
 
-static int mlxsw_sp_setup_tc_block_cb_eg(enum tc_setup_type type,
-					 void *type_data, void *cb_priv)
+static int mlxsw_sp_setup_tc_block_cb_matchall_eg(enum tc_setup_type type,
+						  void *type_data,
+						  void *cb_priv)
 {
-	return mlxsw_sp_setup_tc_block_cb(type, type_data, cb_priv, false);
+	return mlxsw_sp_setup_tc_block_cb_matchall(type, type_data,
+						   cb_priv, false);
+}
+
+static int mlxsw_sp_setup_tc_block_cb_flower(enum tc_setup_type type,
+					     void *type_data, void *cb_priv)
+{
+	struct mlxsw_sp_acl_block *acl_block = cb_priv;
+
+	switch (type) {
+	case TC_SETUP_CLSMATCHALL:
+		return 0;
+	case TC_SETUP_CLSFLOWER:
+		if (mlxsw_sp_acl_block_disabled(acl_block))
+			return -EOPNOTSUPP;
+
+		return mlxsw_sp_setup_tc_cls_flower(acl_block, type_data);
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static int
+mlxsw_sp_setup_tc_block_flower_bind(struct mlxsw_sp_port *mlxsw_sp_port,
+				    struct tcf_block *block, bool ingress)
+{
+	struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
+	struct mlxsw_sp_acl_block *acl_block;
+	struct tcf_block_cb *block_cb;
+	int err;
+
+	block_cb = tcf_block_cb_lookup(block, mlxsw_sp_setup_tc_block_cb_flower,
+				       mlxsw_sp);
+	if (!block_cb) {
+		acl_block = mlxsw_sp_acl_block_create(mlxsw_sp, block->net);
+		if (!acl_block)
+			return -ENOMEM;
+		block_cb = __tcf_block_cb_register(block,
+						   mlxsw_sp_setup_tc_block_cb_flower,
+						   mlxsw_sp, acl_block);
+		if (IS_ERR(block_cb)) {
+			err = PTR_ERR(block_cb);
+			goto err_cb_register;
+		}
+	} else {
+		acl_block = tcf_block_cb_priv(block_cb);
+	}
+	tcf_block_cb_incref(block_cb);
+	err = mlxsw_sp_acl_block_bind(mlxsw_sp, acl_block,
+				      mlxsw_sp_port, ingress);
+	if (err)
+		goto err_block_bind;
+
+	if (ingress)
+		mlxsw_sp_port->ing_acl_block = acl_block;
+	else
+		mlxsw_sp_port->eg_acl_block = acl_block;
+
+	return 0;
+
+err_block_bind:
+	if (!tcf_block_cb_decref(block_cb)) {
+		__tcf_block_cb_unregister(block_cb);
+err_cb_register:
+		mlxsw_sp_acl_block_destroy(acl_block);
+	}
+	return err;
+}
+
+static void
+mlxsw_sp_setup_tc_block_flower_unbind(struct mlxsw_sp_port *mlxsw_sp_port,
+				      struct tcf_block *block, bool ingress)
+{
+	struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
+	struct mlxsw_sp_acl_block *acl_block;
+	struct tcf_block_cb *block_cb;
+	int err;
+
+	block_cb = tcf_block_cb_lookup(block, mlxsw_sp_setup_tc_block_cb_flower,
+				       mlxsw_sp);
+	if (!block_cb)
+		return;
+
+	if (ingress)
+		mlxsw_sp_port->ing_acl_block = NULL;
+	else
+		mlxsw_sp_port->eg_acl_block = NULL;
+
+	acl_block = tcf_block_cb_priv(block_cb);
+	err = mlxsw_sp_acl_block_unbind(mlxsw_sp, acl_block,
+					mlxsw_sp_port, ingress);
+	if (!err && !tcf_block_cb_decref(block_cb)) {
+		__tcf_block_cb_unregister(block_cb);
+		mlxsw_sp_acl_block_destroy(acl_block);
+	}
 }
 
 static int mlxsw_sp_setup_tc_block(struct mlxsw_sp_port *mlxsw_sp_port,
 				   struct tc_block_offload *f)
 {
 	tc_setup_cb_t *cb;
+	bool ingress;
+	int err;
 
-	if (f->binder_type == TCF_BLOCK_BINDER_TYPE_CLSACT_INGRESS)
-		cb = mlxsw_sp_setup_tc_block_cb_ig;
-	else if (f->binder_type == TCF_BLOCK_BINDER_TYPE_CLSACT_EGRESS)
-		cb = mlxsw_sp_setup_tc_block_cb_eg;
-	else
+	if (f->binder_type == TCF_BLOCK_BINDER_TYPE_CLSACT_INGRESS) {
+		cb = mlxsw_sp_setup_tc_block_cb_matchall_ig;
+		ingress = true;
+	} else if (f->binder_type == TCF_BLOCK_BINDER_TYPE_CLSACT_EGRESS) {
+		cb = mlxsw_sp_setup_tc_block_cb_matchall_eg;
+		ingress = false;
+	} else {
 		return -EOPNOTSUPP;
+	}
 
 	switch (f->command) {
 	case TC_BLOCK_BIND:
-		return tcf_block_cb_register(f->block, cb, mlxsw_sp_port,
-					     mlxsw_sp_port);
+		err = tcf_block_cb_register(f->block, cb, mlxsw_sp_port,
+					    mlxsw_sp_port);
+		if (err)
+			return err;
+		err = mlxsw_sp_setup_tc_block_flower_bind(mlxsw_sp_port,
+							  f->block, ingress);
+		if (err) {
+			tcf_block_cb_unregister(f->block, cb, mlxsw_sp_port);
+			return err;
+		}
+		return 0;
 	case TC_BLOCK_UNBIND:
+		mlxsw_sp_setup_tc_block_flower_unbind(mlxsw_sp_port,
+						      f->block, ingress);
 		tcf_block_cb_unregister(f->block, cb, mlxsw_sp_port);
 		return 0;
 	default:
@@ -1840,10 +1954,18 @@ static int mlxsw_sp_feature_hw_tc(struct net_device *dev, bool enable)
 {
 	struct mlxsw_sp_port *mlxsw_sp_port = netdev_priv(dev);
 
-	if (!enable && (mlxsw_sp_port->acl_rule_count ||
-			!list_empty(&mlxsw_sp_port->mall_tc_list))) {
-		netdev_err(dev, "Active offloaded tc filters, can't turn hw_tc_offload off\n");
-		return -EINVAL;
+	if (!enable) {
+		if (mlxsw_sp_acl_block_rule_count(mlxsw_sp_port->ing_acl_block) ||
+		    mlxsw_sp_acl_block_rule_count(mlxsw_sp_port->eg_acl_block) ||
+		    !list_empty(&mlxsw_sp_port->mall_tc_list)) {
+			netdev_err(dev, "Active offloaded tc filters, can't turn hw_tc_offload off\n");
+			return -EINVAL;
+		}
+		mlxsw_sp_acl_block_disable_inc(mlxsw_sp_port->ing_acl_block);
+		mlxsw_sp_acl_block_disable_inc(mlxsw_sp_port->eg_acl_block);
+	} else {
+		mlxsw_sp_acl_block_disable_dec(mlxsw_sp_port->ing_acl_block);
+		mlxsw_sp_acl_block_disable_dec(mlxsw_sp_port->eg_acl_block);
 	}
 	return 0;
 }
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
index 523e64e..ab6ada7 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
@@ -270,7 +270,8 @@ struct mlxsw_sp_port {
 	struct mlxsw_sp_port_sample *sample;
 	struct list_head vlans_list;
 	struct mlxsw_sp_qdisc root_qdisc;
-	unsigned acl_rule_count;
+	struct mlxsw_sp_acl_block *ing_acl_block;
+	struct mlxsw_sp_acl_block *eg_acl_block;
 };
 
 static inline bool
@@ -498,17 +499,34 @@ struct mlxsw_sp_acl_ops {
 				       enum mlxsw_sp_acl_profile profile);
 };
 
+struct mlxsw_sp_acl_block;
 struct mlxsw_sp_acl_ruleset;
 
 /* spectrum_acl.c */
 struct mlxsw_afk *mlxsw_sp_acl_afk(struct mlxsw_sp_acl *acl);
+struct mlxsw_sp *mlxsw_sp_acl_block_mlxsw_sp(struct mlxsw_sp_acl_block *block);
+unsigned int mlxsw_sp_acl_block_rule_count(struct mlxsw_sp_acl_block *block);
+void mlxsw_sp_acl_block_disable_inc(struct mlxsw_sp_acl_block *block);
+void mlxsw_sp_acl_block_disable_dec(struct mlxsw_sp_acl_block *block);
+bool mlxsw_sp_acl_block_disabled(struct mlxsw_sp_acl_block *block);
+struct mlxsw_sp_acl_block *mlxsw_sp_acl_block_create(struct mlxsw_sp *mlxsw_sp,
+						     struct net *net);
+void mlxsw_sp_acl_block_destroy(struct mlxsw_sp_acl_block *block);
+int mlxsw_sp_acl_block_bind(struct mlxsw_sp *mlxsw_sp,
+			    struct mlxsw_sp_acl_block *block,
+			    struct mlxsw_sp_port *mlxsw_sp_port,
+			    bool ingress);
+int mlxsw_sp_acl_block_unbind(struct mlxsw_sp *mlxsw_sp,
+			      struct mlxsw_sp_acl_block *block,
+			      struct mlxsw_sp_port *mlxsw_sp_port,
+			      bool ingress);
 struct mlxsw_sp_acl_ruleset *
-mlxsw_sp_acl_ruleset_lookup(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
-			    bool ingress, u32 chain_index,
+mlxsw_sp_acl_ruleset_lookup(struct mlxsw_sp *mlxsw_sp,
+			    struct mlxsw_sp_acl_block *block, u32 chain_index,
 			    enum mlxsw_sp_acl_profile profile);
 struct mlxsw_sp_acl_ruleset *
-mlxsw_sp_acl_ruleset_get(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
-			 bool ingress, u32 chain_index,
+mlxsw_sp_acl_ruleset_get(struct mlxsw_sp *mlxsw_sp,
+			 struct mlxsw_sp_acl_block *block, u32 chain_index,
 			 enum mlxsw_sp_acl_profile profile);
 void mlxsw_sp_acl_ruleset_put(struct mlxsw_sp *mlxsw_sp,
 			      struct mlxsw_sp_acl_ruleset *ruleset);
@@ -575,11 +593,14 @@ void mlxsw_sp_acl_fini(struct mlxsw_sp *mlxsw_sp);
 extern const struct mlxsw_sp_acl_ops mlxsw_sp_acl_tcam_ops;
 
 /* spectrum_flower.c */
-int mlxsw_sp_flower_replace(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
+int mlxsw_sp_flower_replace(struct mlxsw_sp *mlxsw_sp,
+			    struct mlxsw_sp_acl_block *block,
 			    struct tc_cls_flower_offload *f);
-void mlxsw_sp_flower_destroy(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
+void mlxsw_sp_flower_destroy(struct mlxsw_sp *mlxsw_sp,
+			     struct mlxsw_sp_acl_block *block,
 			     struct tc_cls_flower_offload *f);
-int mlxsw_sp_flower_stats(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
+int mlxsw_sp_flower_stats(struct mlxsw_sp *mlxsw_sp,
+			  struct mlxsw_sp_acl_block *block,
 			  struct tc_cls_flower_offload *f);
 
 /* spectrum_qdisc.c */
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
index 7fb41a4..f98bca9 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
@@ -39,6 +39,7 @@
 #include <linux/string.h>
 #include <linux/rhashtable.h>
 #include <linux/netdevice.h>
+#include <net/net_namespace.h>
 #include <net/tc_act/tc_vlan.h>
 
 #include "reg.h"
@@ -70,9 +71,23 @@ struct mlxsw_afk *mlxsw_sp_acl_afk(struct mlxsw_sp_acl *acl)
 	return acl->afk;
 }
 
-struct mlxsw_sp_acl_ruleset_ht_key {
-	struct net_device *dev; /* dev this ruleset is bound to */
+struct mlxsw_sp_acl_block_binding {
+	struct list_head list;
+	struct net_device *dev;
+	struct mlxsw_sp_port *mlxsw_sp_port;
 	bool ingress;
+};
+
+struct mlxsw_sp_acl_block {
+	struct list_head binding_list;
+	struct mlxsw_sp_acl_ruleset *ruleset_zero;
+	struct mlxsw_sp *mlxsw_sp;
+	unsigned int rule_count;
+	unsigned int disable_count;
+};
+
+struct mlxsw_sp_acl_ruleset_ht_key {
+	struct mlxsw_sp_acl_block *block;
 	u32 chain_index;
 	const struct mlxsw_sp_acl_profile_ops *ops;
 };
@@ -118,27 +133,185 @@ struct mlxsw_sp_fid *mlxsw_sp_acl_dummy_fid(struct mlxsw_sp *mlxsw_sp)
 	return mlxsw_sp->acl->dummy_fid;
 }
 
-static int mlxsw_sp_acl_ruleset_bind(struct mlxsw_sp *mlxsw_sp,
-				     struct mlxsw_sp_acl_ruleset *ruleset,
-				     struct net_device *dev, bool ingress)
+struct mlxsw_sp *mlxsw_sp_acl_block_mlxsw_sp(struct mlxsw_sp_acl_block *block)
+{
+	return block->mlxsw_sp;
+}
+
+unsigned int mlxsw_sp_acl_block_rule_count(struct mlxsw_sp_acl_block *block)
+{
+	return block ? block->rule_count : 0;
+}
+
+void mlxsw_sp_acl_block_disable_inc(struct mlxsw_sp_acl_block *block)
+{
+	if (block)
+		block->disable_count++;
+}
+
+void mlxsw_sp_acl_block_disable_dec(struct mlxsw_sp_acl_block *block)
+{
+	if (block)
+		block->disable_count--;
+}
+
+bool mlxsw_sp_acl_block_disabled(struct mlxsw_sp_acl_block *block)
 {
+	return block->disable_count;
+}
+
+static int
+mlxsw_sp_acl_ruleset_bind(struct mlxsw_sp *mlxsw_sp,
+			  struct mlxsw_sp_acl_block *block,
+			  struct mlxsw_sp_acl_block_binding *binding)
+{
+	struct mlxsw_sp_acl_ruleset *ruleset = block->ruleset_zero;
 	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
 
-	return ops->ruleset_bind(mlxsw_sp, ruleset->priv, dev, ingress);
+	return ops->ruleset_bind(mlxsw_sp, ruleset->priv,
+				 binding->mlxsw_sp_port->dev, binding->ingress);
 }
 
-static void mlxsw_sp_acl_ruleset_unbind(struct mlxsw_sp *mlxsw_sp,
-					struct mlxsw_sp_acl_ruleset *ruleset,
-					struct net_device *dev, bool ingress)
+static void
+mlxsw_sp_acl_ruleset_unbind(struct mlxsw_sp *mlxsw_sp,
+			    struct mlxsw_sp_acl_block *block,
+			    struct mlxsw_sp_acl_block_binding *binding)
 {
+	struct mlxsw_sp_acl_ruleset *ruleset = block->ruleset_zero;
 	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
 
-	ops->ruleset_unbind(mlxsw_sp, ruleset->priv, dev, ingress);
+	ops->ruleset_unbind(mlxsw_sp, ruleset->priv,
+			    binding->mlxsw_sp_port->dev, binding->ingress);
+}
+
+static bool mlxsw_sp_acl_ruleset_block_bound(struct mlxsw_sp_acl_block *block)
+{
+	return block->ruleset_zero;
+}
+
+static int
+mlxsw_sp_acl_ruleset_block_bind(struct mlxsw_sp *mlxsw_sp,
+				struct mlxsw_sp_acl_ruleset *ruleset,
+				struct mlxsw_sp_acl_block *block)
+{
+	struct mlxsw_sp_acl_block_binding *binding;
+	int err;
+
+	block->ruleset_zero = ruleset;
+	list_for_each_entry(binding, &block->binding_list, list) {
+		err = mlxsw_sp_acl_ruleset_bind(mlxsw_sp, block, binding);
+		if (err)
+			goto rollback;
+	}
+	return 0;
+
+rollback:
+	list_for_each_entry_continue_reverse(binding, &block->binding_list,
+					     list)
+		mlxsw_sp_acl_ruleset_unbind(mlxsw_sp, block, binding);
+	block->ruleset_zero = NULL;
+
+	return err;
+}
+
+static void
+mlxsw_sp_acl_ruleset_block_unbind(struct mlxsw_sp *mlxsw_sp,
+				  struct mlxsw_sp_acl_ruleset *ruleset,
+				  struct mlxsw_sp_acl_block *block)
+{
+	struct mlxsw_sp_acl_block_binding *binding;
+
+	list_for_each_entry(binding, &block->binding_list, list)
+		mlxsw_sp_acl_ruleset_unbind(mlxsw_sp, block, binding);
+	block->ruleset_zero = NULL;
+}
+
+struct mlxsw_sp_acl_block *mlxsw_sp_acl_block_create(struct mlxsw_sp *mlxsw_sp,
+						     struct net *net)
+{
+	struct mlxsw_sp_acl_block *block;
+
+	block = kzalloc(sizeof(*block), GFP_KERNEL);
+	if (!block)
+		return NULL;
+	INIT_LIST_HEAD(&block->binding_list);
+	block->mlxsw_sp = mlxsw_sp;
+	return block;
+}
+
+void mlxsw_sp_acl_block_destroy(struct mlxsw_sp_acl_block *block)
+{
+	WARN_ON(!list_empty(&block->binding_list));
+	kfree(block);
+}
+
+static struct mlxsw_sp_acl_block_binding *
+mlxsw_sp_acl_block_lookup(struct mlxsw_sp_acl_block *block,
+			  struct mlxsw_sp_port *mlxsw_sp_port, bool ingress)
+{
+	struct mlxsw_sp_acl_block_binding *binding;
+
+	list_for_each_entry(binding, &block->binding_list, list)
+		if (binding->mlxsw_sp_port == mlxsw_sp_port &&
+		    binding->ingress == ingress)
+			return binding;
+	return NULL;
+}
+
+int mlxsw_sp_acl_block_bind(struct mlxsw_sp *mlxsw_sp,
+			    struct mlxsw_sp_acl_block *block,
+			    struct mlxsw_sp_port *mlxsw_sp_port,
+			    bool ingress)
+{
+	struct mlxsw_sp_acl_block_binding *binding;
+	int err;
+
+	if (WARN_ON(mlxsw_sp_acl_block_lookup(block, mlxsw_sp_port, ingress)))
+		return -EEXIST;
+
+	binding = kzalloc(sizeof(*binding), GFP_KERNEL);
+	if (!binding)
+		return -ENOMEM;
+	binding->mlxsw_sp_port = mlxsw_sp_port;
+	binding->ingress = ingress;
+
+	if (mlxsw_sp_acl_ruleset_block_bound(block)) {
+		err = mlxsw_sp_acl_ruleset_bind(mlxsw_sp, block, binding);
+		if (err)
+			goto err_ruleset_bind;
+	}
+
+	list_add(&binding->list, &block->binding_list);
+	return 0;
+
+err_ruleset_bind:
+	kfree(binding);
+	return err;
+}
+
+int mlxsw_sp_acl_block_unbind(struct mlxsw_sp *mlxsw_sp,
+			      struct mlxsw_sp_acl_block *block,
+			      struct mlxsw_sp_port *mlxsw_sp_port,
+			      bool ingress)
+{
+	struct mlxsw_sp_acl_block_binding *binding;
+
+	binding = mlxsw_sp_acl_block_lookup(block, mlxsw_sp_port, ingress);
+	if (!binding)
+		return -ENOENT;
+
+	list_del(&binding->list);
+
+	if (mlxsw_sp_acl_ruleset_block_bound(block))
+		mlxsw_sp_acl_ruleset_unbind(mlxsw_sp, block, binding);
+
+	kfree(binding);
+	return 0;
 }
 
 static struct mlxsw_sp_acl_ruleset *
-mlxsw_sp_acl_ruleset_create(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
-			    bool ingress, u32 chain_index,
+mlxsw_sp_acl_ruleset_create(struct mlxsw_sp *mlxsw_sp,
+			    struct mlxsw_sp_acl_block *block, u32 chain_index,
 			    const struct mlxsw_sp_acl_profile_ops *ops)
 {
 	struct mlxsw_sp_acl *acl = mlxsw_sp->acl;
@@ -151,8 +324,7 @@ mlxsw_sp_acl_ruleset_create(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
 	if (!ruleset)
 		return ERR_PTR(-ENOMEM);
 	ruleset->ref_count = 1;
-	ruleset->ht_key.dev = dev;
-	ruleset->ht_key.ingress = ingress;
+	ruleset->ht_key.block = block;
 	ruleset->ht_key.chain_index = chain_index;
 	ruleset->ht_key.ops = ops;
 
@@ -174,8 +346,7 @@ mlxsw_sp_acl_ruleset_create(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
 		 * to be directly bound to device. The rest of the rulesets
 		 * are bound by "Goto action set".
 		 */
-		err = mlxsw_sp_acl_ruleset_bind(mlxsw_sp, ruleset,
-						dev, ingress);
+		err = mlxsw_sp_acl_ruleset_block_bind(mlxsw_sp, ruleset, block);
 		if (err)
 			goto err_ruleset_bind;
 	}
@@ -198,12 +369,12 @@ static void mlxsw_sp_acl_ruleset_destroy(struct mlxsw_sp *mlxsw_sp,
 					 struct mlxsw_sp_acl_ruleset *ruleset)
 {
 	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
+	struct mlxsw_sp_acl_block *block = ruleset->ht_key.block;
+	u32 chain_index = ruleset->ht_key.chain_index;
 	struct mlxsw_sp_acl *acl = mlxsw_sp->acl;
 
-	if (!ruleset->ht_key.chain_index)
-		mlxsw_sp_acl_ruleset_unbind(mlxsw_sp, ruleset,
-					    ruleset->ht_key.dev,
-					    ruleset->ht_key.ingress);
+	if (!chain_index)
+		mlxsw_sp_acl_ruleset_block_unbind(mlxsw_sp, ruleset, block);
 	rhashtable_remove_fast(&acl->ruleset_ht, &ruleset->ht_node,
 			       mlxsw_sp_acl_ruleset_ht_params);
 	ops->ruleset_del(mlxsw_sp, ruleset->priv);
@@ -225,15 +396,14 @@ static void mlxsw_sp_acl_ruleset_ref_dec(struct mlxsw_sp *mlxsw_sp,
 }
 
 static struct mlxsw_sp_acl_ruleset *
-__mlxsw_sp_acl_ruleset_lookup(struct mlxsw_sp_acl *acl, struct net_device *dev,
-			      bool ingress, u32 chain_index,
+__mlxsw_sp_acl_ruleset_lookup(struct mlxsw_sp_acl *acl,
+			      struct mlxsw_sp_acl_block *block, u32 chain_index,
 			      const struct mlxsw_sp_acl_profile_ops *ops)
 {
 	struct mlxsw_sp_acl_ruleset_ht_key ht_key;
 
 	memset(&ht_key, 0, sizeof(ht_key));
-	ht_key.dev = dev;
-	ht_key.ingress = ingress;
+	ht_key.block = block;
 	ht_key.chain_index = chain_index;
 	ht_key.ops = ops;
 	return rhashtable_lookup_fast(&acl->ruleset_ht, &ht_key,
@@ -241,8 +411,8 @@ __mlxsw_sp_acl_ruleset_lookup(struct mlxsw_sp_acl *acl, struct net_device *dev,
 }
 
 struct mlxsw_sp_acl_ruleset *
-mlxsw_sp_acl_ruleset_lookup(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
-			    bool ingress, u32 chain_index,
+mlxsw_sp_acl_ruleset_lookup(struct mlxsw_sp *mlxsw_sp,
+			    struct mlxsw_sp_acl_block *block, u32 chain_index,
 			    enum mlxsw_sp_acl_profile profile)
 {
 	const struct mlxsw_sp_acl_profile_ops *ops;
@@ -252,16 +422,15 @@ mlxsw_sp_acl_ruleset_lookup(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
 	ops = acl->ops->profile_ops(mlxsw_sp, profile);
 	if (!ops)
 		return ERR_PTR(-EINVAL);
-	ruleset = __mlxsw_sp_acl_ruleset_lookup(acl, dev, ingress,
-						chain_index, ops);
+	ruleset = __mlxsw_sp_acl_ruleset_lookup(acl, block, chain_index, ops);
 	if (!ruleset)
 		return ERR_PTR(-ENOENT);
 	return ruleset;
 }
 
 struct mlxsw_sp_acl_ruleset *
-mlxsw_sp_acl_ruleset_get(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
-			 bool ingress, u32 chain_index,
+mlxsw_sp_acl_ruleset_get(struct mlxsw_sp *mlxsw_sp,
+			 struct mlxsw_sp_acl_block *block, u32 chain_index,
 			 enum mlxsw_sp_acl_profile profile)
 {
 	const struct mlxsw_sp_acl_profile_ops *ops;
@@ -272,14 +441,12 @@ mlxsw_sp_acl_ruleset_get(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
 	if (!ops)
 		return ERR_PTR(-EINVAL);
 
-	ruleset = __mlxsw_sp_acl_ruleset_lookup(acl, dev, ingress,
-						chain_index, ops);
+	ruleset = __mlxsw_sp_acl_ruleset_lookup(acl, block, chain_index, ops);
 	if (ruleset) {
 		mlxsw_sp_acl_ruleset_ref_inc(ruleset);
 		return ruleset;
 	}
-	return mlxsw_sp_acl_ruleset_create(mlxsw_sp, dev, ingress,
-					   chain_index, ops);
+	return mlxsw_sp_acl_ruleset_create(mlxsw_sp, block, chain_index, ops);
 }
 
 void mlxsw_sp_acl_ruleset_put(struct mlxsw_sp *mlxsw_sp,
@@ -528,6 +695,7 @@ int mlxsw_sp_acl_rule_add(struct mlxsw_sp *mlxsw_sp,
 		goto err_rhashtable_insert;
 
 	list_add_tail(&rule->list, &mlxsw_sp->acl->rules);
+	ruleset->ht_key.block->rule_count++;
 	return 0;
 
 err_rhashtable_insert:
@@ -541,6 +709,7 @@ void mlxsw_sp_acl_rule_del(struct mlxsw_sp *mlxsw_sp,
 	struct mlxsw_sp_acl_ruleset *ruleset = rule->ruleset;
 	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
 
+	ruleset->ht_key.block->rule_count--;
 	list_del(&rule->list);
 	rhashtable_remove_fast(&ruleset->rule_ht, &rule->ht_node,
 			       mlxsw_sp_acl_rule_ht_params);
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c
index 42e8a36..cf7b97d4 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c
@@ -35,6 +35,7 @@
 #include <linux/kernel.h>
 #include <linux/errno.h>
 #include <linux/netdevice.h>
+#include <net/net_namespace.h>
 #include <net/flow_dissector.h>
 #include <net/pkt_cls.h>
 #include <net/tc_act/tc_gact.h>
@@ -45,7 +46,7 @@
 #include "core_acl_flex_keys.h"
 
 static int mlxsw_sp_flower_parse_actions(struct mlxsw_sp *mlxsw_sp,
-					 struct net_device *dev, bool ingress,
+					 struct mlxsw_sp_acl_block *block,
 					 struct mlxsw_sp_acl_rule_info *rulei,
 					 struct tcf_exts *exts)
 {
@@ -80,8 +81,7 @@ static int mlxsw_sp_flower_parse_actions(struct mlxsw_sp *mlxsw_sp,
 			struct mlxsw_sp_acl_ruleset *ruleset;
 			u16 group_id;
 
-			ruleset = mlxsw_sp_acl_ruleset_lookup(mlxsw_sp, dev,
-							      ingress,
+			ruleset = mlxsw_sp_acl_ruleset_lookup(mlxsw_sp, block,
 							      chain_index,
 							      MLXSW_SP_ACL_PROFILE_FLOWER);
 			if (IS_ERR(ruleset))
@@ -104,9 +104,6 @@ static int mlxsw_sp_flower_parse_actions(struct mlxsw_sp *mlxsw_sp,
 				return err;
 
 			out_dev = tcf_mirred_dev(a);
-			if (out_dev == dev)
-				out_dev = NULL;
-
 			err = mlxsw_sp_acl_rulei_act_fwd(mlxsw_sp, rulei,
 							 out_dev);
 			if (err)
@@ -265,7 +262,7 @@ static int mlxsw_sp_flower_parse_ip(struct mlxsw_sp *mlxsw_sp,
 }
 
 static int mlxsw_sp_flower_parse(struct mlxsw_sp *mlxsw_sp,
-				 struct net_device *dev, bool ingress,
+				 struct mlxsw_sp_acl_block *block,
 				 struct mlxsw_sp_acl_rule_info *rulei,
 				 struct tc_cls_flower_offload *f)
 {
@@ -383,21 +380,19 @@ static int mlxsw_sp_flower_parse(struct mlxsw_sp *mlxsw_sp,
 	if (err)
 		return err;
 
-	return mlxsw_sp_flower_parse_actions(mlxsw_sp, dev, ingress,
-					     rulei, f->exts);
+	return mlxsw_sp_flower_parse_actions(mlxsw_sp, block, rulei, f->exts);
 }
 
-int mlxsw_sp_flower_replace(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
+int mlxsw_sp_flower_replace(struct mlxsw_sp *mlxsw_sp,
+			    struct mlxsw_sp_acl_block *block,
 			    struct tc_cls_flower_offload *f)
 {
-	struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
-	struct net_device *dev = mlxsw_sp_port->dev;
 	struct mlxsw_sp_acl_rule_info *rulei;
 	struct mlxsw_sp_acl_ruleset *ruleset;
 	struct mlxsw_sp_acl_rule *rule;
 	int err;
 
-	ruleset = mlxsw_sp_acl_ruleset_get(mlxsw_sp, dev, ingress,
+	ruleset = mlxsw_sp_acl_ruleset_get(mlxsw_sp, block,
 					   f->common.chain_index,
 					   MLXSW_SP_ACL_PROFILE_FLOWER);
 	if (IS_ERR(ruleset))
@@ -410,7 +405,7 @@ int mlxsw_sp_flower_replace(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
 	}
 
 	rulei = mlxsw_sp_acl_rule_rulei(rule);
-	err = mlxsw_sp_flower_parse(mlxsw_sp, dev, ingress, rulei, f);
+	err = mlxsw_sp_flower_parse(mlxsw_sp, block, rulei, f);
 	if (err)
 		goto err_flower_parse;
 
@@ -423,7 +418,6 @@ int mlxsw_sp_flower_replace(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
 		goto err_rule_add;
 
 	mlxsw_sp_acl_ruleset_put(mlxsw_sp, ruleset);
-	mlxsw_sp_port->acl_rule_count++;
 	return 0;
 
 err_rule_add:
@@ -435,15 +429,15 @@ int mlxsw_sp_flower_replace(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
 	return err;
 }
 
-void mlxsw_sp_flower_destroy(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
+void mlxsw_sp_flower_destroy(struct mlxsw_sp *mlxsw_sp,
+			     struct mlxsw_sp_acl_block *block,
 			     struct tc_cls_flower_offload *f)
 {
-	struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
 	struct mlxsw_sp_acl_ruleset *ruleset;
 	struct mlxsw_sp_acl_rule *rule;
 
-	ruleset = mlxsw_sp_acl_ruleset_get(mlxsw_sp, mlxsw_sp_port->dev,
-					   ingress, f->common.chain_index,
+	ruleset = mlxsw_sp_acl_ruleset_get(mlxsw_sp, block,
+					   f->common.chain_index,
 					   MLXSW_SP_ACL_PROFILE_FLOWER);
 	if (IS_ERR(ruleset))
 		return;
@@ -455,13 +449,12 @@ void mlxsw_sp_flower_destroy(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
 	}
 
 	mlxsw_sp_acl_ruleset_put(mlxsw_sp, ruleset);
-	mlxsw_sp_port->acl_rule_count--;
 }
 
-int mlxsw_sp_flower_stats(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
+int mlxsw_sp_flower_stats(struct mlxsw_sp *mlxsw_sp,
+			  struct mlxsw_sp_acl_block *block,
 			  struct tc_cls_flower_offload *f)
 {
-	struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
 	struct mlxsw_sp_acl_ruleset *ruleset;
 	struct mlxsw_sp_acl_rule *rule;
 	u64 packets;
@@ -469,8 +462,8 @@ int mlxsw_sp_flower_stats(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
 	u64 bytes;
 	int err;
 
-	ruleset = mlxsw_sp_acl_ruleset_get(mlxsw_sp, mlxsw_sp_port->dev,
-					   ingress, f->common.chain_index,
+	ruleset = mlxsw_sp_acl_ruleset_get(mlxsw_sp, block,
+					   f->common.chain_index,
 					   MLXSW_SP_ACL_PROFILE_FLOWER);
 	if (WARN_ON(IS_ERR(ruleset)))
 		return -EINVAL;
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [patch net-next v7 13/13] mlxsw: spectrum_acl: Pass mlxsw_sp_port down to ruleset bind/unbind ops
  2018-01-09 14:07 [patch net-next v7 00/13] net: sched: allow qdiscs to share filter block instances Jiri Pirko
                   ` (11 preceding siblings ...)
  2018-01-09 14:07 ` [patch net-next v7 12/13] mlxsw: spectrum_acl: Implement TC block sharing Jiri Pirko
@ 2018-01-09 14:07 ` Jiri Pirko
  2018-01-09 14:19 ` [patch iproute2 net-next v7 1/3] include: update rtnetlink header according to kernel Jiri Pirko
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 42+ messages in thread
From: Jiri Pirko @ 2018-01-09 14:07 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel,
	dsahern

From: Jiri Pirko <jiri@mellanox.com>

No need to convert from mlxsw_sp_port to net_device and back again.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum.h     |  6 +++--
 drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c |  4 ++--
 .../ethernet/mellanox/mlxsw/spectrum_acl_tcam.c    | 27 +++++++++-------------
 3 files changed, 17 insertions(+), 20 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
index ab6ada7..525552d 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
@@ -477,9 +477,11 @@ struct mlxsw_sp_acl_profile_ops {
 			   void *priv, void *ruleset_priv);
 	void (*ruleset_del)(struct mlxsw_sp *mlxsw_sp, void *ruleset_priv);
 	int (*ruleset_bind)(struct mlxsw_sp *mlxsw_sp, void *ruleset_priv,
-			    struct net_device *dev, bool ingress);
+			    struct mlxsw_sp_port *mlxsw_sp_port,
+			    bool ingress);
 	void (*ruleset_unbind)(struct mlxsw_sp *mlxsw_sp, void *ruleset_priv,
-			       struct net_device *dev, bool ingress);
+			       struct mlxsw_sp_port *mlxsw_sp_port,
+			       bool ingress);
 	u16 (*ruleset_group_id)(void *ruleset_priv);
 	size_t rule_priv_size;
 	int (*rule_add)(struct mlxsw_sp *mlxsw_sp,
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
index f98bca9..9439bfa 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
@@ -169,7 +169,7 @@ mlxsw_sp_acl_ruleset_bind(struct mlxsw_sp *mlxsw_sp,
 	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
 
 	return ops->ruleset_bind(mlxsw_sp, ruleset->priv,
-				 binding->mlxsw_sp_port->dev, binding->ingress);
+				 binding->mlxsw_sp_port, binding->ingress);
 }
 
 static void
@@ -181,7 +181,7 @@ mlxsw_sp_acl_ruleset_unbind(struct mlxsw_sp *mlxsw_sp,
 	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
 
 	ops->ruleset_unbind(mlxsw_sp, ruleset->priv,
-			    binding->mlxsw_sp_port->dev, binding->ingress);
+			    binding->mlxsw_sp_port, binding->ingress);
 }
 
 static bool mlxsw_sp_acl_ruleset_block_bound(struct mlxsw_sp_acl_block *block)
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c
index 50b2f9a..c6e180c 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c
@@ -258,15 +258,11 @@ static void mlxsw_sp_acl_tcam_group_del(struct mlxsw_sp *mlxsw_sp,
 static int
 mlxsw_sp_acl_tcam_group_bind(struct mlxsw_sp *mlxsw_sp,
 			     struct mlxsw_sp_acl_tcam_group *group,
-			     struct net_device *dev, bool ingress)
+			     struct mlxsw_sp_port *mlxsw_sp_port,
+			     bool ingress)
 {
-	struct mlxsw_sp_port *mlxsw_sp_port;
 	char ppbt_pl[MLXSW_REG_PPBT_LEN];
 
-	if (!mlxsw_sp_port_dev_check(dev))
-		return -EINVAL;
-
-	mlxsw_sp_port = netdev_priv(dev);
 	mlxsw_reg_ppbt_pack(ppbt_pl, ingress ? MLXSW_REG_PXBT_E_IACL :
 					       MLXSW_REG_PXBT_E_EACL,
 			    MLXSW_REG_PXBT_OP_BIND, mlxsw_sp_port->local_port,
@@ -277,15 +273,11 @@ mlxsw_sp_acl_tcam_group_bind(struct mlxsw_sp *mlxsw_sp,
 static void
 mlxsw_sp_acl_tcam_group_unbind(struct mlxsw_sp *mlxsw_sp,
 			       struct mlxsw_sp_acl_tcam_group *group,
-			       struct net_device *dev, bool ingress)
+			       struct mlxsw_sp_port *mlxsw_sp_port,
+			       bool ingress)
 {
-	struct mlxsw_sp_port *mlxsw_sp_port;
 	char ppbt_pl[MLXSW_REG_PPBT_LEN];
 
-	if (WARN_ON(!mlxsw_sp_port_dev_check(dev)))
-		return;
-
-	mlxsw_sp_port = netdev_priv(dev);
 	mlxsw_reg_ppbt_pack(ppbt_pl, ingress ? MLXSW_REG_PXBT_E_IACL :
 					       MLXSW_REG_PXBT_E_EACL,
 			    MLXSW_REG_PXBT_OP_UNBIND, mlxsw_sp_port->local_port,
@@ -1054,22 +1046,25 @@ mlxsw_sp_acl_tcam_flower_ruleset_del(struct mlxsw_sp *mlxsw_sp,
 static int
 mlxsw_sp_acl_tcam_flower_ruleset_bind(struct mlxsw_sp *mlxsw_sp,
 				      void *ruleset_priv,
-				      struct net_device *dev, bool ingress)
+				      struct mlxsw_sp_port *mlxsw_sp_port,
+				      bool ingress)
 {
 	struct mlxsw_sp_acl_tcam_flower_ruleset *ruleset = ruleset_priv;
 
 	return mlxsw_sp_acl_tcam_group_bind(mlxsw_sp, &ruleset->group,
-					    dev, ingress);
+					    mlxsw_sp_port, ingress);
 }
 
 static void
 mlxsw_sp_acl_tcam_flower_ruleset_unbind(struct mlxsw_sp *mlxsw_sp,
 					void *ruleset_priv,
-					struct net_device *dev, bool ingress)
+					struct mlxsw_sp_port *mlxsw_sp_port,
+					bool ingress)
 {
 	struct mlxsw_sp_acl_tcam_flower_ruleset *ruleset = ruleset_priv;
 
-	mlxsw_sp_acl_tcam_group_unbind(mlxsw_sp, &ruleset->group, dev, ingress);
+	mlxsw_sp_acl_tcam_group_unbind(mlxsw_sp, &ruleset->group,
+				       mlxsw_sp_port, ingress);
 }
 
 static u16
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [patch iproute2 net-next v7 1/3] include: update rtnetlink header according to kernel
  2018-01-09 14:07 [patch net-next v7 00/13] net: sched: allow qdiscs to share filter block instances Jiri Pirko
                   ` (12 preceding siblings ...)
  2018-01-09 14:07 ` [patch net-next v7 13/13] mlxsw: spectrum_acl: Pass mlxsw_sp_port down to ruleset bind/unbind ops Jiri Pirko
@ 2018-01-09 14:19 ` Jiri Pirko
  2018-01-09 14:19 ` [patch iproute2 net-next v7 2/3] tc: introduce support for block-handle for filter operations Jiri Pirko
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 42+ messages in thread
From: Jiri Pirko @ 2018-01-09 14:19 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel,
	dsahern

From: Jiri Pirko <jiri@mellanox.com>

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 include/uapi/linux/rtnetlink.h | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
index 13bf56f..34ed866 100644
--- a/include/uapi/linux/rtnetlink.h
+++ b/include/uapi/linux/rtnetlink.h
@@ -150,6 +150,12 @@ enum {
 	RTM_NEWCACHEREPORT = 96,
 #define RTM_NEWCACHEREPORT RTM_NEWCACHEREPORT
 
+	RTM_NEWBLOCK = 100,
+#define RTM_NEWBLOCK RTM_NEWBLOCK
+	RTM_DELBLOCK,
+#define RTM_DELBLOCK RTM_DELBLOCK
+	RTM_GETBLOCK,
+#define RTM_GETBLOCK RTM_GETBLOCK
 	__RTM_MAX,
 #define RTM_MAX		(((__RTM_MAX + 3) & ~3) - 1)
 };
@@ -541,9 +547,15 @@ struct tcmsg {
 	int		tcm_ifindex;
 	__u32		tcm_handle;
 	__u32		tcm_parent;
+/* tcm_block_index is used instead of tcm_parent
+ * in case tcm_ifindex == TCM_IFINDEX_MAGIC_BLOCK
+ */
+#define tcm_block_index tcm_parent
 	__u32		tcm_info;
 };
 
+#define TCM_IFINDEX_MAGIC_BLOCK (0xFFFFFFFFU)
+
 enum {
 	TCA_UNSPEC,
 	TCA_KIND,
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [patch iproute2 net-next v7 2/3] tc: introduce support for block-handle for filter operations
  2018-01-09 14:07 [patch net-next v7 00/13] net: sched: allow qdiscs to share filter block instances Jiri Pirko
                   ` (13 preceding siblings ...)
  2018-01-09 14:19 ` [patch iproute2 net-next v7 1/3] include: update rtnetlink header according to kernel Jiri Pirko
@ 2018-01-09 14:19 ` Jiri Pirko
  2018-01-09 14:19 ` [patch iproute2 net-next v7 3/3] tc: implement filter block sharing to ingress and clsact qdiscs Jiri Pirko
  2018-01-11 13:19 ` [patch net-next v7 00/13] net: sched: allow qdiscs to share filter block instances Jamal Hadi Salim
  16 siblings, 0 replies; 42+ messages in thread
From: Jiri Pirko @ 2018-01-09 14:19 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel,
	dsahern

From: Jiri Pirko <jiri@mellanox.com>

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 tc/tc_filter.c | 127 +++++++++++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 110 insertions(+), 17 deletions(-)

diff --git a/tc/tc_filter.c b/tc/tc_filter.c
index 545cc3a..67a0577 100644
--- a/tc/tc_filter.c
+++ b/tc/tc_filter.c
@@ -28,14 +28,17 @@
 static void usage(void)
 {
 	fprintf(stderr,
-		"Usage: tc filter [ add | del | change | replace | show ] dev STRING\n"
-		"Usage: tc filter get dev STRING parent CLASSID protocol PROTO handle FILTERID pref PRIO FILTER_TYPE\n"
+		"Usage: tc filter [ add | del | change | replace | show ] [ dev STRING ]\n"
+		"       tc filter [ add | del | change | replace | show ] [ block BLOCK_INDEX ]\n"
+		"       tc filter get dev STRING parent CLASSID protocol PROTO handle FILTERID pref PRIO FILTER_TYPE\n"
+		"       tc filter get block BLOCK_INDEX protocol PROTO handle FILTERID pref PRIO FILTER_TYPE\n"
 		"       [ pref PRIO ] protocol PROTO [ chain CHAIN_INDEX ]\n"
 		"       [ estimator INTERVAL TIME_CONSTANT ]\n"
 		"       [ root | ingress | egress | parent CLASSID ]\n"
 		"       [ handle FILTERID ] [ [ FILTER_TYPE ] [ help | OPTIONS ] ]\n"
 		"\n"
 		"       tc filter show [ dev STRING ] [ root | ingress | egress | parent CLASSID ]\n"
+		"       tc filter show [ block BLOCK_INDEX ]\n"
 		"Where:\n"
 		"FILTER_TYPE := { rsvp | u32 | bpf | fw | route | etc. }\n"
 		"FILTERID := ... format depends on classifier, see there\n"
@@ -60,6 +63,7 @@ static int tc_filter_modify(int cmd, unsigned int flags, int argc, char **argv)
 	int protocol_set = 0;
 	__u32 chain_index;
 	int chain_index_set = 0;
+	__u32 block_index = 0;
 	char *fhandle = NULL;
 	char  d[IFNAMSIZ] = {};
 	char  k[FILTER_NAMESZ] = {};
@@ -73,7 +77,21 @@ static int tc_filter_modify(int cmd, unsigned int flags, int argc, char **argv)
 			NEXT_ARG();
 			if (d[0])
 				duparg("dev", *argv);
+			if (block_index) {
+				fprintf(stderr, "Error: \"dev\" cannot be used in the same time as \"block\"\n");
+				return -1;
+			}
 			strncpy(d, *argv, sizeof(d)-1);
+		} else if (matches(*argv, "block") == 0) {
+			NEXT_ARG();
+			if (block_index)
+				duparg("block", *argv);
+			if (d[0]) {
+				fprintf(stderr, "Error: \"block\" cannot be used in the same time as \"dev\"\n");
+				return -1;
+			}
+			if (get_u32(&block_index, *argv, 0) || !block_index)
+				invarg("invalid block index value", *argv);
 		} else if (strcmp(*argv, "root") == 0) {
 			if (req.t.tcm_parent) {
 				fprintf(stderr,
@@ -168,6 +186,9 @@ static int tc_filter_modify(int cmd, unsigned int flags, int argc, char **argv)
 			fprintf(stderr, "Cannot find device \"%s\"\n", d);
 			return 1;
 		}
+	} else if (block_index) {
+		req.t.tcm_ifindex = TCM_IFINDEX_MAGIC_BLOCK;
+		req.t.tcm_block_index = block_index;
 	}
 
 	if (q) {
@@ -206,6 +227,7 @@ static __u32 filter_prio;
 static __u32 filter_protocol;
 static __u32 filter_chain_index;
 static int filter_chain_index_set;
+static __u32 filter_block_index;
 __u16 f_proto;
 
 int print_filter(const struct sockaddr_nl *who, struct nlmsghdr *n, void *arg)
@@ -252,20 +274,27 @@ int print_filter(const struct sockaddr_nl *who, struct nlmsghdr *n, void *arg)
 		print_bool(PRINT_ANY, "added", "added ", true);
 
 	print_string(PRINT_FP, NULL, "filter ", NULL);
-	if (!filter_ifindex || filter_ifindex != t->tcm_ifindex)
-		print_string(PRINT_ANY, "dev", "dev %s ",
-			     ll_index_to_name(t->tcm_ifindex));
-
-	if (!filter_parent || filter_parent != t->tcm_parent) {
-		if (t->tcm_parent == TC_H_ROOT)
-			print_bool(PRINT_ANY, "root", "root ", true);
-		else if (t->tcm_parent == TC_H_MAKE(TC_H_CLSACT, TC_H_MIN_INGRESS))
-			print_bool(PRINT_ANY, "ingress", "ingress ", true);
-		else if (t->tcm_parent == TC_H_MAKE(TC_H_CLSACT, TC_H_MIN_EGRESS))
-			print_bool(PRINT_ANY, "egress", "egress ", true);
-		else {
-			print_tc_classid(abuf, sizeof(abuf), t->tcm_parent);
-			print_string(PRINT_ANY, "parent", "parent %s ", abuf);
+	if (t->tcm_ifindex == TCM_IFINDEX_MAGIC_BLOCK) {
+		if (!filter_block_index ||
+		    filter_block_index != t->tcm_block_index)
+			print_uint(PRINT_ANY, "block", "block %u ",
+				   t->tcm_block_index);
+	} else {
+		if (!filter_ifindex || filter_ifindex != t->tcm_ifindex)
+			print_string(PRINT_ANY, "dev", "dev %s ",
+				     ll_index_to_name(t->tcm_ifindex));
+
+		if (!filter_parent || filter_parent != t->tcm_parent) {
+			if (t->tcm_parent == TC_H_ROOT)
+				print_bool(PRINT_ANY, "root", "root ", true);
+			else if (t->tcm_parent == TC_H_MAKE(TC_H_CLSACT, TC_H_MIN_INGRESS))
+				print_bool(PRINT_ANY, "ingress", "ingress ", true);
+			else if (t->tcm_parent == TC_H_MAKE(TC_H_CLSACT, TC_H_MIN_EGRESS))
+				print_bool(PRINT_ANY, "egress", "egress ", true);
+			else {
+				print_tc_classid(abuf, sizeof(abuf), t->tcm_parent);
+				print_string(PRINT_ANY, "parent", "parent %s ", abuf);
+			}
 		}
 	}
 
@@ -345,6 +374,7 @@ static int tc_filter_get(int cmd, unsigned int flags, int argc, char **argv)
 	int protocol_set = 0;
 	__u32 chain_index;
 	int chain_index_set = 0;
+	__u32 block_index = 0;
 	__u32 parent_handle = 0;
 	char *fhandle = NULL;
 	char  d[IFNAMSIZ] = {};
@@ -355,7 +385,21 @@ static int tc_filter_get(int cmd, unsigned int flags, int argc, char **argv)
 			NEXT_ARG();
 			if (d[0])
 				duparg("dev", *argv);
+			if (block_index) {
+				fprintf(stderr, "Error: \"dev\" cannot be used in the same time as \"block\"\n");
+				return -1;
+			}
 			strncpy(d, *argv, sizeof(d)-1);
+		} else if (matches(*argv, "block") == 0) {
+			NEXT_ARG();
+			if (block_index)
+				duparg("block", *argv);
+			if (d[0]) {
+				fprintf(stderr, "Error: \"block\" cannot be used in the same time as \"dev\"\n");
+				return -1;
+			}
+			if (get_u32(&block_index, *argv, 0) || !block_index)
+				invarg("invalid block index value", *argv);
 		} else if (strcmp(*argv, "root") == 0) {
 			if (req.t.tcm_parent) {
 				fprintf(stderr,
@@ -469,8 +513,12 @@ static int tc_filter_get(int cmd, unsigned int flags, int argc, char **argv)
 			return 1;
 		}
 		filter_ifindex = req.t.tcm_ifindex;
+	} else if (block_index) {
+		req.t.tcm_ifindex = TCM_IFINDEX_MAGIC_BLOCK;
+		req.t.tcm_block_index = block_index;
+		filter_block_index = block_index;
 	} else {
-		fprintf(stderr, "Must specify netdevice \"dev\"\n");
+		fprintf(stderr, "Must specify netdevice \"dev\" or block index \"block\"\n");
 		return -1;
 	}
 
@@ -504,6 +552,30 @@ static int tc_filter_get(int cmd, unsigned int flags, int argc, char **argv)
 	return 0;
 }
 
+static bool block_index_exists(__u32 block_index)
+{
+	struct {
+		struct nlmsghdr	n;
+		struct tcmsg t;
+	} req = {
+		.n.nlmsg_len = NLMSG_LENGTH(sizeof(struct tcmsg)),
+		.n.nlmsg_flags = NLM_F_REQUEST,
+		.n.nlmsg_type = RTM_GETBLOCK,
+		.t.tcm_parent = TC_H_UNSPEC,
+		.t.tcm_family = AF_UNSPEC,
+	};
+	struct nlmsghdr *answer;
+
+	req.t.tcm_ifindex = TCM_IFINDEX_MAGIC_BLOCK;
+	req.t.tcm_block_index = block_index;
+
+	if (rtnl_talk(&rth, &req.n, &answer) < 0)
+		return false;
+
+	free(answer);
+	return true;
+}
+
 static int tc_filter_list(int argc, char **argv)
 {
 	struct {
@@ -520,6 +592,7 @@ static int tc_filter_list(int argc, char **argv)
 	__u32 prio = 0;
 	__u32 protocol = 0;
 	__u32 chain_index;
+	__u32 block_index = 0;
 	char *fhandle = NULL;
 
 	while (argc > 0) {
@@ -527,7 +600,21 @@ static int tc_filter_list(int argc, char **argv)
 			NEXT_ARG();
 			if (d[0])
 				duparg("dev", *argv);
+			if (block_index) {
+				fprintf(stderr, "Error: \"dev\" cannot be used in the same time as \"block\"\n");
+				return -1;
+			}
 			strncpy(d, *argv, sizeof(d)-1);
+		} else if (matches(*argv, "block") == 0) {
+			NEXT_ARG();
+			if (block_index)
+				duparg("block", *argv);
+			if (d[0]) {
+				fprintf(stderr, "Error: \"block\" cannot be used in the same time as \"dev\"\n");
+				return -1;
+			}
+			if (get_u32(&block_index, *argv, 0) || !block_index)
+				invarg("invalid block index value", *argv);
 		} else if (strcmp(*argv, "root") == 0) {
 			if (req.t.tcm_parent) {
 				fprintf(stderr,
@@ -616,6 +703,12 @@ static int tc_filter_list(int argc, char **argv)
 			return 1;
 		}
 		filter_ifindex = req.t.tcm_ifindex;
+	} else if (block_index) {
+		if (!block_index_exists(block_index))
+			return 1;
+		req.t.tcm_ifindex = TCM_IFINDEX_MAGIC_BLOCK;
+		req.t.tcm_block_index = block_index;
+		filter_block_index = block_index;
 	}
 
 	if (filter_chain_index_set)
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [patch iproute2 net-next v7 3/3] tc: implement filter block sharing to ingress and clsact qdiscs
  2018-01-09 14:07 [patch net-next v7 00/13] net: sched: allow qdiscs to share filter block instances Jiri Pirko
                   ` (14 preceding siblings ...)
  2018-01-09 14:19 ` [patch iproute2 net-next v7 2/3] tc: introduce support for block-handle for filter operations Jiri Pirko
@ 2018-01-09 14:19 ` Jiri Pirko
  2018-01-11 13:19 ` [patch net-next v7 00/13] net: sched: allow qdiscs to share filter block instances Jamal Hadi Salim
  16 siblings, 0 replies; 42+ messages in thread
From: Jiri Pirko @ 2018-01-09 14:19 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel,
	dsahern

From: Jiri Pirko <jiri@mellanox.com>

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 include/uapi/linux/pkt_sched.h | 11 +++++++++
 tc/q_clsact.c                  | 52 ++++++++++++++++++++++++++++++++++++++----
 tc/q_ingress.c                 | 30 +++++++++++++++++++++---
 3 files changed, 85 insertions(+), 8 deletions(-)

diff --git a/include/uapi/linux/pkt_sched.h b/include/uapi/linux/pkt_sched.h
index 37b5096..8cc554a 100644
--- a/include/uapi/linux/pkt_sched.h
+++ b/include/uapi/linux/pkt_sched.h
@@ -934,4 +934,15 @@ enum {
 
 #define TCA_CBS_MAX (__TCA_CBS_MAX - 1)
 
+/* Ingress/clsact */
+
+enum {
+	TCA_CLSACT_UNSPEC,
+	TCA_CLSACT_INGRESS_BLOCK,
+	TCA_CLSACT_EGRESS_BLOCK,
+	__TCA_CLSACT_MAX
+};
+
+#define TCA_CLSACT_MAX	(__TCA_CLSACT_MAX - 1)
+
 #endif
diff --git a/tc/q_clsact.c b/tc/q_clsact.c
index 341f653..c15d01c 100644
--- a/tc/q_clsact.c
+++ b/tc/q_clsact.c
@@ -7,23 +7,65 @@
 
 static void explain(void)
 {
-	fprintf(stderr, "Usage: ... clsact\n");
+	fprintf(stderr, "Usage: ... clsact [ingress_block BLOCK_INDEX] [egress_block BLOCK_INDEX]\n");
 }
 
 static int clsact_parse_opt(struct qdisc_util *qu, int argc, char **argv,
 			    struct nlmsghdr *n, const char *dev)
 {
-	if (argc > 0) {
-		fprintf(stderr, "What is \"%s\"?\n", *argv);
-		explain();
-		return -1;
+	struct rtattr *tail;
+	__u32 ingress_block = 0;
+	__u32 egress_block = 0;
+
+	while (argc > 0) {
+		if (strcmp(*argv, "ingress_block") == 0) {
+			NEXT_ARG();
+			if (get_u32(&ingress_block, *argv, 0) || !ingress_block)
+				invarg("invalid ingress block index value", *argv);
+		} else if (strcmp(*argv, "egress_block") == 0) {
+			NEXT_ARG();
+			if (get_u32(&egress_block, *argv, 0) || !egress_block)
+				invarg("invalid egress block index value", *argv);
+		} else {
+			fprintf(stderr, "What is \"%s\"?\n", *argv);
+			explain();
+			return -1;
+		}
+		NEXT_ARG_FWD();
 	}
 
+	tail = NLMSG_TAIL(n);
+	addattr_l(n, 1024, TCA_OPTIONS, NULL, 0);
+	if (ingress_block)
+		addattr32(n, 1024, TCA_CLSACT_INGRESS_BLOCK, ingress_block);
+	if (egress_block)
+		addattr32(n, 1024, TCA_CLSACT_EGRESS_BLOCK, egress_block);
+	tail->rta_len = (void *) NLMSG_TAIL(n) - (void *) tail;
 	return 0;
 }
 
 static int clsact_print_opt(struct qdisc_util *qu, FILE *f, struct rtattr *opt)
 {
+	struct rtattr *tb[TCA_CLSACT_MAX + 1];
+	__u32 block;
+
+	if (!opt)
+		return 0;
+
+	parse_rtattr_nested(tb, TCA_CLSACT_MAX, opt);
+
+	if (tb[TCA_CLSACT_INGRESS_BLOCK] &&
+	    RTA_PAYLOAD(tb[TCA_CLSACT_INGRESS_BLOCK]) >= sizeof(__u32)) {
+		block = rta_getattr_u32(tb[TCA_CLSACT_INGRESS_BLOCK]);
+		print_uint(PRINT_ANY, "ingress_block",
+			   "ingress_block %u ", block);
+	}
+	if (tb[TCA_CLSACT_EGRESS_BLOCK] &&
+	    RTA_PAYLOAD(tb[TCA_CLSACT_EGRESS_BLOCK]) >= sizeof(__u32)) {
+		block = rta_getattr_u32(tb[TCA_CLSACT_EGRESS_BLOCK]);
+		print_uint(PRINT_ANY, "egress_block",
+			   "egress_block %u ", block);
+	}
 	return 0;
 }
 
diff --git a/tc/q_ingress.c b/tc/q_ingress.c
index 1e42229..c3fcac7 100644
--- a/tc/q_ingress.c
+++ b/tc/q_ingress.c
@@ -17,30 +17,54 @@
 
 static void explain(void)
 {
-	fprintf(stderr, "Usage: ... ingress\n");
+	fprintf(stderr, "Usage: ... ingress [block BLOCK_INDEX]\n");
 }
 
 static int ingress_parse_opt(struct qdisc_util *qu, int argc, char **argv,
 			     struct nlmsghdr *n, const char *dev)
 {
+	struct rtattr *tail;
+	__u32 block_index = 0;
+
 	while (argc > 0) {
 		if (strcmp(*argv, "handle") == 0) {
 			NEXT_ARG();
-			argc--; argv++;
+		} else if (strcmp(*argv, "block") == 0) {
+			NEXT_ARG();
+			if (get_u32(&block_index, *argv, 0) || !block_index)
+				invarg("invalid block index value", *argv);
 		} else {
 			fprintf(stderr, "What is \"%s\"?\n", *argv);
 			explain();
 			return -1;
 		}
+		NEXT_ARG_FWD();
 	}
 
+	tail = NLMSG_TAIL(n);
+	addattr_l(n, 1024, TCA_OPTIONS, NULL, 0);
+	if (block_index)
+		addattr32(n, 1024, TCA_CLSACT_INGRESS_BLOCK, block_index);
+	tail->rta_len = (void *) NLMSG_TAIL(n) - (void *) tail;
 	return 0;
 }
 
 static int ingress_print_opt(struct qdisc_util *qu, FILE *f,
 			     struct rtattr *opt)
 {
-	fprintf(f, "---------------- ");
+	struct rtattr *tb[TCA_CLSACT_MAX + 1];
+	__u32 block_index;
+
+	if (!opt)
+		return 0;
+
+	parse_rtattr_nested(tb, TCA_CLSACT_MAX, opt);
+
+	if (tb[TCA_CLSACT_INGRESS_BLOCK] &&
+	    RTA_PAYLOAD(tb[TCA_CLSACT_INGRESS_BLOCK]) >= sizeof(__u32)) {
+		block_index = rta_getattr_u32(tb[TCA_CLSACT_INGRESS_BLOCK]);
+		print_uint(PRINT_ANY, "block", "block %u ", block_index);
+	}
 	return 0;
 }
 
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: [patch net-next v7 03/13] net: sched: avoid usage of tp->q in tcf_classify
  2018-01-09 14:07 ` [patch net-next v7 03/13] net: sched: avoid usage of tp->q in tcf_classify Jiri Pirko
@ 2018-01-10 16:17   ` David Ahern
  2018-01-11  9:40     ` Jiri Pirko
  0 siblings, 1 reply; 42+ messages in thread
From: David Ahern @ 2018-01-10 16:17 UTC (permalink / raw)
  To: Jiri Pirko, netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel

On 1/9/18 7:07 AM, Jiri Pirko wrote:
> From: Jiri Pirko <jiri@mellanox.com>
> 
> Use block index in the messages instead.
> 
> Signed-off-by: Jiri Pirko <jiri@mellanox.com>
> ---
>  net/sched/cls_api.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
> index 9b45950..31e91dc 100644
> --- a/net/sched/cls_api.c
> +++ b/net/sched/cls_api.c
> @@ -672,8 +672,9 @@ int tcf_classify(struct sk_buff *skb, const struct tcf_proto *tp,
>  #ifdef CONFIG_NET_CLS_ACT
>  reset:
>  	if (unlikely(limit++ >= max_reclassify_loop)) {
> -		net_notice_ratelimited("%s: reclassify loop, rule prio %u, protocol %02x\n",
> -				       tp->q->ops->id, tp->prio & 0xffff,
> +		net_notice_ratelimited("%u: reclassify loop, rule prio %u, protocol %02x\n",

if you are dumping index instead of prio shouldn't the 'rule prio' above
be adjusted?


> +				       tp->chain->block->index,
> +				       tp->prio & 0xffff,
>  				       ntohs(tp->protocol));
>  		return TC_ACT_SHOT;
>  	}
> 

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [patch net-next v7 08/13] net: sched: add rt netlink message type for block get
  2018-01-09 14:07 ` [patch net-next v7 08/13] net: sched: add rt netlink message type for block get Jiri Pirko
@ 2018-01-10 16:48   ` David Ahern
  2018-01-11  9:37     ` Jiri Pirko
  2018-01-11 13:27   ` Jamal Hadi Salim
  1 sibling, 1 reply; 42+ messages in thread
From: David Ahern @ 2018-01-10 16:48 UTC (permalink / raw)
  To: Jiri Pirko, netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel

On 1/9/18 7:07 AM, Jiri Pirko wrote:
> diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
> index 9c026d9..038cde7 100644
> --- a/include/uapi/linux/rtnetlink.h
> +++ b/include/uapi/linux/rtnetlink.h
> @@ -150,6 +150,12 @@ enum {
>  	RTM_NEWCACHEREPORT = 96,
>  #define RTM_NEWCACHEREPORT RTM_NEWCACHEREPORT
>  
> +	RTM_NEWBLOCK = 100,
> +#define RTM_NEWBLOCK RTM_NEWBLOCK
> +	RTM_DELBLOCK,
> +#define RTM_DELBLOCK RTM_DELBLOCK
> +	RTM_GETBLOCK,
> +#define RTM_GETBLOCK RTM_GETBLOCK
>  	__RTM_MAX,
>  #define RTM_MAX		(((__RTM_MAX + 3) & ~3) - 1)
>  };

Seems like this is creating an inconsistency. RTM_GETBLOCK is used to
dump the set of shared blocks, but RTM_NEWBLOCK / RTM_DELBLOCK are not
used to create / delete one.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [patch net-next v7 07/13] net: sched: use block index as a handle instead of qdisc when block is shared
  2018-01-09 14:07 ` [patch net-next v7 07/13] net: sched: use block index as a handle instead of qdisc when block is shared Jiri Pirko
@ 2018-01-10 18:12   ` David Ahern
  2018-01-11  9:38     ` Jiri Pirko
  2018-01-11 13:25   ` Jamal Hadi Salim
  1 sibling, 1 reply; 42+ messages in thread
From: David Ahern @ 2018-01-10 18:12 UTC (permalink / raw)
  To: Jiri Pirko, netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel

On 1/9/18 7:07 AM, Jiri Pirko wrote:
> diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
> index 843e29a..9c026d9 100644
> --- a/include/uapi/linux/rtnetlink.h
> +++ b/include/uapi/linux/rtnetlink.h
> @@ -541,9 +541,15 @@ struct tcmsg {
>  	int		tcm_ifindex;
>  	__u32		tcm_handle;
>  	__u32		tcm_parent;
> +/* tcm_block_index is used instead of tcm_parent
> + * in case tcm_ifindex == TCM_IFINDEX_MAGIC_BLOCK
> + */
> +#define tcm_block_index tcm_parent
>  	__u32		tcm_info;
>  };
>  
> +#define TCM_IFINDEX_MAGIC_BLOCK (0xFFFFFFFFU)
> +
>  enum {
>  	TCA_UNSPEC,
>  	TCA_KIND,


This could be more clearly documented for anyone wanting to write an app
against the API. Something like:

For shared blocks, tcm_ifindex is set to TCM_IFINDEX_MAGIC_BLOCK, and
tcm_parent is aliased to tcm_block_index which is the block index.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [patch net-next v7 08/13] net: sched: add rt netlink message type for block get
  2018-01-10 16:48   ` David Ahern
@ 2018-01-11  9:37     ` Jiri Pirko
  2018-01-11 11:11       ` Jiri Pirko
  0 siblings, 1 reply; 42+ messages in thread
From: Jiri Pirko @ 2018-01-11  9:37 UTC (permalink / raw)
  To: David Ahern
  Cc: netdev, davem, jhs, xiyou.wangcong, mlxsw, andrew,
	vivien.didelot, f.fainelli, michael.chan, ganeshgr, saeedm,
	matanb, leonro, idosch, jakub.kicinski, simon.horman,
	pieter.jansenvanvuuren, john.hurley, alexander.h.duyck, ogerlitz,
	john.fastabend, daniel

Wed, Jan 10, 2018 at 05:48:09PM CET, dsahern@gmail.com wrote:
>On 1/9/18 7:07 AM, Jiri Pirko wrote:
>> diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
>> index 9c026d9..038cde7 100644
>> --- a/include/uapi/linux/rtnetlink.h
>> +++ b/include/uapi/linux/rtnetlink.h
>> @@ -150,6 +150,12 @@ enum {
>>  	RTM_NEWCACHEREPORT = 96,
>>  #define RTM_NEWCACHEREPORT RTM_NEWCACHEREPORT
>>  
>> +	RTM_NEWBLOCK = 100,
>> +#define RTM_NEWBLOCK RTM_NEWBLOCK
>> +	RTM_DELBLOCK,
>> +#define RTM_DELBLOCK RTM_DELBLOCK
>> +	RTM_GETBLOCK,
>> +#define RTM_GETBLOCK RTM_GETBLOCK
>>  	__RTM_MAX,
>>  #define RTM_MAX		(((__RTM_MAX + 3) & ~3) - 1)
>>  };
>
>Seems like this is creating an inconsistency. RTM_GETBLOCK is used to
>dump the set of shared blocks, but RTM_NEWBLOCK / RTM_DELBLOCK are not
>used to create / delete one.

Why is it a problem? RTM_NEWBLOCK is used as a reply for RTM_GETBLOCK.
I plan to have block notifications as a follow-up, there the RTM_GETBLOCK
and RTM_DELBLOCK will be used. The fact the user cannot create and
delete block explicitly is no problem in my opinion. The block creation
and deletion is done according to usage of qdiscs.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [patch net-next v7 07/13] net: sched: use block index as a handle instead of qdisc when block is shared
  2018-01-10 18:12   ` David Ahern
@ 2018-01-11  9:38     ` Jiri Pirko
  0 siblings, 0 replies; 42+ messages in thread
From: Jiri Pirko @ 2018-01-11  9:38 UTC (permalink / raw)
  To: David Ahern
  Cc: netdev, davem, jhs, xiyou.wangcong, mlxsw, andrew,
	vivien.didelot, f.fainelli, michael.chan, ganeshgr, saeedm,
	matanb, leonro, idosch, jakub.kicinski, simon.horman,
	pieter.jansenvanvuuren, john.hurley, alexander.h.duyck, ogerlitz,
	john.fastabend, daniel

Wed, Jan 10, 2018 at 07:12:44PM CET, dsahern@gmail.com wrote:
>On 1/9/18 7:07 AM, Jiri Pirko wrote:
>> diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
>> index 843e29a..9c026d9 100644
>> --- a/include/uapi/linux/rtnetlink.h
>> +++ b/include/uapi/linux/rtnetlink.h
>> @@ -541,9 +541,15 @@ struct tcmsg {
>>  	int		tcm_ifindex;
>>  	__u32		tcm_handle;
>>  	__u32		tcm_parent;
>> +/* tcm_block_index is used instead of tcm_parent
>> + * in case tcm_ifindex == TCM_IFINDEX_MAGIC_BLOCK
>> + */
>> +#define tcm_block_index tcm_parent
>>  	__u32		tcm_info;
>>  };
>>  
>> +#define TCM_IFINDEX_MAGIC_BLOCK (0xFFFFFFFFU)
>> +
>>  enum {
>>  	TCA_UNSPEC,
>>  	TCA_KIND,
>
>
>This could be more clearly documented for anyone wanting to write an app
>against the API. Something like:
>
>For shared blocks, tcm_ifindex is set to TCM_IFINDEX_MAGIC_BLOCK, and
>tcm_parent is aliased to tcm_block_index which is the block index.

Okay, will add this comment here.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [patch net-next v7 03/13] net: sched: avoid usage of tp->q in tcf_classify
  2018-01-10 16:17   ` David Ahern
@ 2018-01-11  9:40     ` Jiri Pirko
  2018-01-11 13:57       ` David Ahern
  0 siblings, 1 reply; 42+ messages in thread
From: Jiri Pirko @ 2018-01-11  9:40 UTC (permalink / raw)
  To: David Ahern
  Cc: netdev, davem, jhs, xiyou.wangcong, mlxsw, andrew,
	vivien.didelot, f.fainelli, michael.chan, ganeshgr, saeedm,
	matanb, leonro, idosch, jakub.kicinski, simon.horman,
	pieter.jansenvanvuuren, john.hurley, alexander.h.duyck, ogerlitz,
	john.fastabend, daniel

Wed, Jan 10, 2018 at 05:17:28PM CET, dsahern@gmail.com wrote:
>On 1/9/18 7:07 AM, Jiri Pirko wrote:
>> From: Jiri Pirko <jiri@mellanox.com>
>> 
>> Use block index in the messages instead.
>> 
>> Signed-off-by: Jiri Pirko <jiri@mellanox.com>
>> ---
>>  net/sched/cls_api.c | 5 +++--
>>  1 file changed, 3 insertions(+), 2 deletions(-)
>> 
>> diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
>> index 9b45950..31e91dc 100644
>> --- a/net/sched/cls_api.c
>> +++ b/net/sched/cls_api.c
>> @@ -672,8 +672,9 @@ int tcf_classify(struct sk_buff *skb, const struct tcf_proto *tp,
>>  #ifdef CONFIG_NET_CLS_ACT
>>  reset:
>>  	if (unlikely(limit++ >= max_reclassify_loop)) {
>> -		net_notice_ratelimited("%s: reclassify loop, rule prio %u, protocol %02x\n",
>> -				       tp->q->ops->id, tp->prio & 0xffff,
>> +		net_notice_ratelimited("%u: reclassify loop, rule prio %u, protocol %02x\n",
>
>if you are dumping index instead of prio shouldn't the 'rule prio' above
>be adjusted?

I'm not! Why do you think so?

"%u:" is tp->chain->block->index
"prio %u" is tp->prio & 0xffff
"%02x" is ntohs(tp->protocol)


>
>
>> +				       tp->chain->block->index,
>> +				       tp->prio & 0xffff,
>>  				       ntohs(tp->protocol));
>>  		return TC_ACT_SHOT;
>>  	}
>> 
>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [patch net-next v7 08/13] net: sched: add rt netlink message type for block get
  2018-01-11  9:37     ` Jiri Pirko
@ 2018-01-11 11:11       ` Jiri Pirko
  0 siblings, 0 replies; 42+ messages in thread
From: Jiri Pirko @ 2018-01-11 11:11 UTC (permalink / raw)
  To: David Ahern
  Cc: netdev, davem, jhs, xiyou.wangcong, mlxsw, andrew,
	vivien.didelot, f.fainelli, michael.chan, ganeshgr, saeedm,
	matanb, leonro, idosch, jakub.kicinski, simon.horman,
	pieter.jansenvanvuuren, john.hurley, alexander.h.duyck, ogerlitz,
	john.fastabend, daniel

Thu, Jan 11, 2018 at 10:37:10AM CET, jiri@resnulli.us wrote:
>Wed, Jan 10, 2018 at 05:48:09PM CET, dsahern@gmail.com wrote:
>>On 1/9/18 7:07 AM, Jiri Pirko wrote:
>>> diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
>>> index 9c026d9..038cde7 100644
>>> --- a/include/uapi/linux/rtnetlink.h
>>> +++ b/include/uapi/linux/rtnetlink.h
>>> @@ -150,6 +150,12 @@ enum {
>>>  	RTM_NEWCACHEREPORT = 96,
>>>  #define RTM_NEWCACHEREPORT RTM_NEWCACHEREPORT
>>>  
>>> +	RTM_NEWBLOCK = 100,
>>> +#define RTM_NEWBLOCK RTM_NEWBLOCK
>>> +	RTM_DELBLOCK,
>>> +#define RTM_DELBLOCK RTM_DELBLOCK
>>> +	RTM_GETBLOCK,
>>> +#define RTM_GETBLOCK RTM_GETBLOCK
>>>  	__RTM_MAX,
>>>  #define RTM_MAX		(((__RTM_MAX + 3) & ~3) - 1)
>>>  };
>>
>>Seems like this is creating an inconsistency. RTM_GETBLOCK is used to
>>dump the set of shared blocks, but RTM_NEWBLOCK / RTM_DELBLOCK are not
>>used to create / delete one.
>
>Why is it a problem? RTM_NEWBLOCK is used as a reply for RTM_GETBLOCK.
>I plan to have block notifications as a follow-up, there the RTM_GETBLOCK

I mean RTM_NEWBLOCK and RTM_DELBLOCK of couse.

>and RTM_DELBLOCK will be used. The fact the user cannot create and
>delete block explicitly is no problem in my opinion. The block creation
>and deletion is done according to usage of qdiscs.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [patch net-next v7 00/13] net: sched: allow qdiscs to share filter block instances
  2018-01-09 14:07 [patch net-next v7 00/13] net: sched: allow qdiscs to share filter block instances Jiri Pirko
                   ` (15 preceding siblings ...)
  2018-01-09 14:19 ` [patch iproute2 net-next v7 3/3] tc: implement filter block sharing to ingress and clsact qdiscs Jiri Pirko
@ 2018-01-11 13:19 ` Jamal Hadi Salim
  2018-01-11 14:27   ` Jiri Pirko
  16 siblings, 1 reply; 42+ messages in thread
From: Jamal Hadi Salim @ 2018-01-11 13:19 UTC (permalink / raw)
  To: Jiri Pirko, netdev
  Cc: davem, xiyou.wangcong, mlxsw, andrew, vivien.didelot, f.fainelli,
	michael.chan, ganeshgr, saeedm, matanb, leonro, idosch,
	jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel,
	dsahern

On 18-01-09 09:07 AM, Jiri Pirko wrote:
> From: Jiri Pirko <jiri@mellanox.com>
> 
> Currently the filters added to qdiscs are independent. So for example if you
> have 2 netdevices and you create ingress qdisc on both and you want to add
> identical filter rules both, you need to add them twice. This patchset
> makes this easier and mainly saves resources allowing to share all filters
> within a qdisc - I call it a "filter block". Also this helps to save
> resources when we do offload to hw for example to expensive TCAM.
> 
> So back to the example. First, we create 2 qdiscs. Both will share
> block number 22. "22" is just an identification:
> $ tc qdisc add dev ens7 ingress block 22
>                                  ^^^^^^^^
> $ tc qdisc add dev ens8 ingress block 22
>                                  ^^^^^^^^
> 
> If we don't specify "block" command line option, no shared block would
> be created:
> $ tc qdisc add dev ens9 ingress
> 
> Now if we list the qdiscs, we will see the block index in the output:
> 
> $ tc qdisc
> qdisc ingress ffff: dev ens7 parent ffff:fff1 block 22
> qdisc ingress ffff: dev ens8 parent ffff:fff1 block 22
> qdisc ingress ffff: dev ens9 parent ffff:fff1
> 
> 
> To make is more visual, the situation looks like this:
> 
>     ens7 ingress qdisc                 ens7 ingress qdisc
>            |                                  |
>            |                                  |
>            +---------->  block 22  <----------+
> 
> Unlimited number of qdiscs may share the same block.
> 
> Now we can add filter using the block index:
> 
> $ tc filter add block 22 protocol ip pref 25 flower dst_ip 192.168.0.0/16 action drop
> 
> 
> Note we cannot use the qdisc for filter manipulations of shared blocks:
> 
> $ tc filter add dev ens8 ingress protocol ip pref 1 flower dst_ip 192.168.100.2 action drop
> Error: This filter block is shared. Please use the block index to manipulate the filters.
> 
> 
> We will see the same output if we list filters for ingress qdisc of
> ens7 and ens8, also for the block 22:
> 
> $ tc filter show block 22
> filter block 22 protocol ip pref 25 flower chain 0
> filter block 22 protocol ip pref 25 flower chain 0 handle 0x1
> ...
> 
> $ tc filter show dev ens7 ingress
> filter block 22 protocol ip pref 25 flower chain 0
> filter block 22 protocol ip pref 25 flower chain 0 handle 0x1
> ...
> 
> $ tc filter show dev ens8 ingress
> filter block 22 protocol ip pref 25 flower chain 0
> filter block 22 protocol ip pref 25 flower chain 0 handle 0x1
> ...
> 

Somewhere here mention the egress issue we talked about, something
like:
----
At the moment on ingress and clsact_xxx are well supported by the
block infrastructure. For this to work well with egress qdisc,
all the ports/qdiscs sharing the block will have to be symmetric.
e.g. if ens8 and ens9 root qdiscs shared a block at their (egress)
root qdiscs, then those qdiscs would both need to have the same
handle id. An example of a symettric shared block setup would like like:

tc qdisc add dev ens8 root block 22 handle 1:0 prio
tc qdisc add dev ens9 root block 22 handle 1:0 prio
----

I am confident the above would work. You said you are thinking of
getting this to always work (I cant think of a simple way to do it),
but for the moment the above is fine.
Most people who want this would probably use clsact egress and not
care about queues (so it may never be "fixed")

cheers,
jamal

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [patch net-next v7 07/13] net: sched: use block index as a handle instead of qdisc when block is shared
  2018-01-09 14:07 ` [patch net-next v7 07/13] net: sched: use block index as a handle instead of qdisc when block is shared Jiri Pirko
  2018-01-10 18:12   ` David Ahern
@ 2018-01-11 13:25   ` Jamal Hadi Salim
  2018-01-11 14:21     ` Jiri Pirko
  1 sibling, 1 reply; 42+ messages in thread
From: Jamal Hadi Salim @ 2018-01-11 13:25 UTC (permalink / raw)
  To: Jiri Pirko, netdev
  Cc: davem, xiyou.wangcong, mlxsw, andrew, vivien.didelot, f.fainelli,
	michael.chan, ganeshgr, saeedm, matanb, leonro, idosch,
	jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel,
	dsahern

On 18-01-09 09:07 AM, Jiri Pirko wrote:
> From: Jiri Pirko <jiri@mellanox.com>
> 
> As the tcm_ifindex 0 is invalid ifindex, reuse it to indicate that we
> work with block, instead of qdisc. So if tcm_ifindex is 0, tcm_parent is
> used to carry block_index.
> 

Commit log still refers to ifindex of 0 instead of TCM_IFINDEX_MAGIC_BLOCK

cheers,
jamal

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [patch net-next v7 08/13] net: sched: add rt netlink message type for block get
  2018-01-09 14:07 ` [patch net-next v7 08/13] net: sched: add rt netlink message type for block get Jiri Pirko
  2018-01-10 16:48   ` David Ahern
@ 2018-01-11 13:27   ` Jamal Hadi Salim
  2018-01-11 14:23     ` Jiri Pirko
  1 sibling, 1 reply; 42+ messages in thread
From: Jamal Hadi Salim @ 2018-01-11 13:27 UTC (permalink / raw)
  To: Jiri Pirko, netdev
  Cc: davem, xiyou.wangcong, mlxsw, andrew, vivien.didelot, f.fainelli,
	michael.chan, ganeshgr, saeedm, matanb, leonro, idosch,
	jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel,
	dsahern

On 18-01-09 09:07 AM, Jiri Pirko wrote:
> From: Jiri Pirko <jiri@mellanox.com>
> 
> Add simple block get operation which primary purpose is to check the
> block existence by block index.
> 

block_dump missing?

cheers,
jamal

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [patch net-next v7 09/13] net: sched: allow ingress and clsact qdiscs to share filter blocks
  2018-01-09 14:07 ` [patch net-next v7 09/13] net: sched: allow ingress and clsact qdiscs to share filter blocks Jiri Pirko
@ 2018-01-11 13:36   ` Jamal Hadi Salim
  2018-01-11 14:24     ` Jiri Pirko
  0 siblings, 1 reply; 42+ messages in thread
From: Jamal Hadi Salim @ 2018-01-11 13:36 UTC (permalink / raw)
  To: Jiri Pirko, netdev
  Cc: davem, xiyou.wangcong, mlxsw, andrew, vivien.didelot, f.fainelli,
	michael.chan, ganeshgr, saeedm, matanb, leonro, idosch,
	jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel,
	dsahern

On 18-01-09 09:07 AM, Jiri Pirko wrote:
> From: Jiri Pirko <jiri@mellanox.com>
> 
> Benefit from the previously introduced shared filter blocks
> infrastructure and allow ingress and clsact qdisc instances to share
> filter blocks. The block index is coming from userspace as qdisc option.

Didnt quiet follow why ingress is special and needs attributes to
set the block but other qdiscs didnt.
Will check again later after some coffee..

cheers,
jamal

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [patch net-next v7 03/13] net: sched: avoid usage of tp->q in tcf_classify
  2018-01-11  9:40     ` Jiri Pirko
@ 2018-01-11 13:57       ` David Ahern
  0 siblings, 0 replies; 42+ messages in thread
From: David Ahern @ 2018-01-11 13:57 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, davem, jhs, xiyou.wangcong, mlxsw, andrew,
	vivien.didelot, f.fainelli, michael.chan, ganeshgr, saeedm,
	matanb, leonro, idosch, jakub.kicinski, simon.horman,
	pieter.jansenvanvuuren, john.hurley, alexander.h.duyck, ogerlitz,
	john.fastabend, daniel

On 1/11/18 2:40 AM, Jiri Pirko wrote:
> Wed, Jan 10, 2018 at 05:17:28PM CET, dsahern@gmail.com wrote:
>> On 1/9/18 7:07 AM, Jiri Pirko wrote:
>>> From: Jiri Pirko <jiri@mellanox.com>
>>>
>>> Use block index in the messages instead.
>>>
>>> Signed-off-by: Jiri Pirko <jiri@mellanox.com>
>>> ---
>>>  net/sched/cls_api.c | 5 +++--
>>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
>>> index 9b45950..31e91dc 100644
>>> --- a/net/sched/cls_api.c
>>> +++ b/net/sched/cls_api.c
>>> @@ -672,8 +672,9 @@ int tcf_classify(struct sk_buff *skb, const struct tcf_proto *tp,
>>>  #ifdef CONFIG_NET_CLS_ACT
>>>  reset:
>>>  	if (unlikely(limit++ >= max_reclassify_loop)) {
>>> -		net_notice_ratelimited("%s: reclassify loop, rule prio %u, protocol %02x\n",
>>> -				       tp->q->ops->id, tp->prio & 0xffff,
>>> +		net_notice_ratelimited("%u: reclassify loop, rule prio %u, protocol %02x\n",
>>
>> if you are dumping index instead of prio shouldn't the 'rule prio' above
>> be adjusted?
> 
> I'm not! Why do you think so?
> 
> "%u:" is tp->chain->block->index
> "prio %u" is tp->prio & 0xffff
> "%02x" is ntohs(tp->protocol)
> 

Never mind. scanned that too quickly.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [patch net-next v7 07/13] net: sched: use block index as a handle instead of qdisc when block is shared
  2018-01-11 13:25   ` Jamal Hadi Salim
@ 2018-01-11 14:21     ` Jiri Pirko
  0 siblings, 0 replies; 42+ messages in thread
From: Jiri Pirko @ 2018-01-11 14:21 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: netdev, davem, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel,
	dsahern

Thu, Jan 11, 2018 at 02:25:36PM CET, jhs@mojatatu.com wrote:
>On 18-01-09 09:07 AM, Jiri Pirko wrote:
>> From: Jiri Pirko <jiri@mellanox.com>
>> 
>> As the tcm_ifindex 0 is invalid ifindex, reuse it to indicate that we
>> work with block, instead of qdisc. So if tcm_ifindex is 0, tcm_parent is
>> used to carry block_index.
>> 
>
>Commit log still refers to ifindex of 0 instead of TCM_IFINDEX_MAGIC_BLOCK

Missed this. Will update, thanks!

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [patch net-next v7 08/13] net: sched: add rt netlink message type for block get
  2018-01-11 13:27   ` Jamal Hadi Salim
@ 2018-01-11 14:23     ` Jiri Pirko
  0 siblings, 0 replies; 42+ messages in thread
From: Jiri Pirko @ 2018-01-11 14:23 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: netdev, davem, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel,
	dsahern

Thu, Jan 11, 2018 at 02:27:11PM CET, jhs@mojatatu.com wrote:
>On 18-01-09 09:07 AM, Jiri Pirko wrote:
>> From: Jiri Pirko <jiri@mellanox.com>
>> 
>> Add simple block get operation which primary purpose is to check the
>> block existence by block index.
>> 
>
>block_dump missing?

It is not needed for anything now. You see all the blocks when you list
qdiscs. Yet, dump could be easily added if needed in the future.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [patch net-next v7 09/13] net: sched: allow ingress and clsact qdiscs to share filter blocks
  2018-01-11 13:36   ` Jamal Hadi Salim
@ 2018-01-11 14:24     ` Jiri Pirko
  2018-01-11 14:37       ` Jamal Hadi Salim
  0 siblings, 1 reply; 42+ messages in thread
From: Jiri Pirko @ 2018-01-11 14:24 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: netdev, davem, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel,
	dsahern

Thu, Jan 11, 2018 at 02:36:01PM CET, jhs@mojatatu.com wrote:
>On 18-01-09 09:07 AM, Jiri Pirko wrote:
>> From: Jiri Pirko <jiri@mellanox.com>
>> 
>> Benefit from the previously introduced shared filter blocks
>> infrastructure and allow ingress and clsact qdisc instances to share
>> filter blocks. The block index is coming from userspace as qdisc option.
>
>Didnt quiet follow why ingress is special and needs attributes to
>set the block but other qdiscs didnt.

Jamal, again, other qdiscs does not support block sharing. This patchset
only adds support for sharing of block for ingress and clsact qdiscs.
Later on, other qdiscs could also support block sharing.


>Will check again later after some coffee..
>
>cheers,
>jamal

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [patch net-next v7 00/13] net: sched: allow qdiscs to share filter block instances
  2018-01-11 13:19 ` [patch net-next v7 00/13] net: sched: allow qdiscs to share filter block instances Jamal Hadi Salim
@ 2018-01-11 14:27   ` Jiri Pirko
  0 siblings, 0 replies; 42+ messages in thread
From: Jiri Pirko @ 2018-01-11 14:27 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: netdev, davem, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel,
	dsahern

Thu, Jan 11, 2018 at 02:19:16PM CET, jhs@mojatatu.com wrote:
>On 18-01-09 09:07 AM, Jiri Pirko wrote:
>> From: Jiri Pirko <jiri@mellanox.com>
>> 
>> Currently the filters added to qdiscs are independent. So for example if you
>> have 2 netdevices and you create ingress qdisc on both and you want to add
>> identical filter rules both, you need to add them twice. This patchset
>> makes this easier and mainly saves resources allowing to share all filters
>> within a qdisc - I call it a "filter block". Also this helps to save
>> resources when we do offload to hw for example to expensive TCAM.
>> 
>> So back to the example. First, we create 2 qdiscs. Both will share
>> block number 22. "22" is just an identification:
>> $ tc qdisc add dev ens7 ingress block 22
>>                                  ^^^^^^^^
>> $ tc qdisc add dev ens8 ingress block 22
>>                                  ^^^^^^^^
>> 
>> If we don't specify "block" command line option, no shared block would
>> be created:
>> $ tc qdisc add dev ens9 ingress
>> 
>> Now if we list the qdiscs, we will see the block index in the output:
>> 
>> $ tc qdisc
>> qdisc ingress ffff: dev ens7 parent ffff:fff1 block 22
>> qdisc ingress ffff: dev ens8 parent ffff:fff1 block 22
>> qdisc ingress ffff: dev ens9 parent ffff:fff1
>> 
>> 
>> To make is more visual, the situation looks like this:
>> 
>>     ens7 ingress qdisc                 ens7 ingress qdisc
>>            |                                  |
>>            |                                  |
>>            +---------->  block 22  <----------+
>> 
>> Unlimited number of qdiscs may share the same block.
>> 
>> Now we can add filter using the block index:
>> 
>> $ tc filter add block 22 protocol ip pref 25 flower dst_ip 192.168.0.0/16 action drop
>> 
>> 
>> Note we cannot use the qdisc for filter manipulations of shared blocks:
>> 
>> $ tc filter add dev ens8 ingress protocol ip pref 1 flower dst_ip 192.168.100.2 action drop
>> Error: This filter block is shared. Please use the block index to manipulate the filters.
>> 
>> 
>> We will see the same output if we list filters for ingress qdisc of
>> ens7 and ens8, also for the block 22:
>> 
>> $ tc filter show block 22
>> filter block 22 protocol ip pref 25 flower chain 0
>> filter block 22 protocol ip pref 25 flower chain 0 handle 0x1
>> ...
>> 
>> $ tc filter show dev ens7 ingress
>> filter block 22 protocol ip pref 25 flower chain 0
>> filter block 22 protocol ip pref 25 flower chain 0 handle 0x1
>> ...
>> 
>> $ tc filter show dev ens8 ingress
>> filter block 22 protocol ip pref 25 flower chain 0
>> filter block 22 protocol ip pref 25 flower chain 0 handle 0x1
>> ...
>> 
>
>Somewhere here mention the egress issue we talked about, something
>like:

I don't understand why to mention something that is not supported and
future thinking and work needs to be done in order to support it. Let's
leave that text for a cover letter of that patchset, could we?


>----
>At the moment on ingress and clsact_xxx are well supported by the
>block infrastructure. For this to work well with egress qdisc,
>all the ports/qdiscs sharing the block will have to be symmetric.
>e.g. if ens8 and ens9 root qdiscs shared a block at their (egress)
>root qdiscs, then those qdiscs would both need to have the same
>handle id. An example of a symettric shared block setup would like like:
>
>tc qdisc add dev ens8 root block 22 handle 1:0 prio
>tc qdisc add dev ens9 root block 22 handle 1:0 prio
>----
>
>I am confident the above would work. You said you are thinking of
>getting this to always work (I cant think of a simple way to do it),
>but for the moment the above is fine.
>Most people who want this would probably use clsact egress and not
>care about queues (so it may never be "fixed")
>
>cheers,
>jamal

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [patch net-next v7 09/13] net: sched: allow ingress and clsact qdiscs to share filter blocks
  2018-01-11 14:24     ` Jiri Pirko
@ 2018-01-11 14:37       ` Jamal Hadi Salim
  2018-01-11 14:41         ` Jiri Pirko
  0 siblings, 1 reply; 42+ messages in thread
From: Jamal Hadi Salim @ 2018-01-11 14:37 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, davem, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel,
	dsahern

On 18-01-11 09:24 AM, Jiri Pirko wrote:
> Thu, Jan 11, 2018 at 02:36:01PM CET, jhs@mojatatu.com wrote:
>> On 18-01-09 09:07 AM, Jiri Pirko wrote:
>>> From: Jiri Pirko <jiri@mellanox.com>
>>>
>>> Benefit from the previously introduced shared filter blocks
>>> infrastructure and allow ingress and clsact qdisc instances to share
>>> filter blocks. The block index is coming from userspace as qdisc option.
>>
>> Didnt quiet follow why ingress is special and needs attributes to
>> set the block but other qdiscs didnt.
> 
> Jamal, again, other qdiscs does not support block sharing. This patchset
> only adds support for sharing of block for ingress and clsact qdiscs.
> Later on, other qdiscs could also support block sharing.
> 

Can you stop a config which says:
tc qdisc add dev ens9 root block 22 handle 1:0 prio ?

cheers,
jamal

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [patch net-next v7 09/13] net: sched: allow ingress and clsact qdiscs to share filter blocks
  2018-01-11 14:37       ` Jamal Hadi Salim
@ 2018-01-11 14:41         ` Jiri Pirko
  2018-01-11 14:46           ` Jamal Hadi Salim
  0 siblings, 1 reply; 42+ messages in thread
From: Jiri Pirko @ 2018-01-11 14:41 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: netdev, davem, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel,
	dsahern

Thu, Jan 11, 2018 at 03:37:08PM CET, jhs@mojatatu.com wrote:
>On 18-01-11 09:24 AM, Jiri Pirko wrote:
>> Thu, Jan 11, 2018 at 02:36:01PM CET, jhs@mojatatu.com wrote:
>> > On 18-01-09 09:07 AM, Jiri Pirko wrote:
>> > > From: Jiri Pirko <jiri@mellanox.com>
>> > > 
>> > > Benefit from the previously introduced shared filter blocks
>> > > infrastructure and allow ingress and clsact qdisc instances to share
>> > > filter blocks. The block index is coming from userspace as qdisc option.
>> > 
>> > Didnt quiet follow why ingress is special and needs attributes to
>> > set the block but other qdiscs didnt.
>> 
>> Jamal, again, other qdiscs does not support block sharing. This patchset
>> only adds support for sharing of block for ingress and clsact qdiscs.
>> Later on, other qdiscs could also support block sharing.
>> 
>
>Can you stop a config which says:
>tc qdisc add dev ens9 root block 22 handle 1:0 prio ?

Please see the iproute2 patches. Parsing of "block" command line option
is done inside q_ingress.c

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [patch net-next v7 09/13] net: sched: allow ingress and clsact qdiscs to share filter blocks
  2018-01-11 14:41         ` Jiri Pirko
@ 2018-01-11 14:46           ` Jamal Hadi Salim
  2018-01-11 15:07             ` Jiri Pirko
  0 siblings, 1 reply; 42+ messages in thread
From: Jamal Hadi Salim @ 2018-01-11 14:46 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, davem, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel,
	dsahern

On 18-01-11 09:41 AM, Jiri Pirko wrote:
> Thu, Jan 11, 2018 at 03:37:08PM CET, jhs@mojatatu.com wrote:
>> On 18-01-11 09:24 AM, Jiri Pirko wrote:
>>> Thu, Jan 11, 2018 at 02:36:01PM CET, jhs@mojatatu.com wrote:
>>>> On 18-01-09 09:07 AM, Jiri Pirko wrote:
>>>>> From: Jiri Pirko <jiri@mellanox.com>
>>>>>
>>>>> Benefit from the previously introduced shared filter blocks
>>>>> infrastructure and allow ingress and clsact qdisc instances to share
>>>>> filter blocks. The block index is coming from userspace as qdisc option.
>>>>
>>>> Didnt quiet follow why ingress is special and needs attributes to
>>>> set the block but other qdiscs didnt.
>>>
>>> Jamal, again, other qdiscs does not support block sharing. This patchset
>>> only adds support for sharing of block for ingress and clsact qdiscs.
>>> Later on, other qdiscs could also support block sharing.
>>>
>>
>> Can you stop a config which says:
>> tc qdisc add dev ens9 root block 22 handle 1:0 prio ?
> 
> Please see the iproute2 patches. Parsing of "block" command line option
> is done inside q_ingress.c
> 

I only looked at the kernel code. Good you can stop it at tc
but the API does not stop it (unless you expect the rest of the
world to only use tc).
Really - there is no reason for this API to be only via ingress qdisc
attributes. You can add a check in cls api to reject any parent that is
not either of the clsacts + ingress (depending on tc doesnt sound
right).

cheers,
jamal

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [patch net-next v7 09/13] net: sched: allow ingress and clsact qdiscs to share filter blocks
  2018-01-11 14:46           ` Jamal Hadi Salim
@ 2018-01-11 15:07             ` Jiri Pirko
  2018-01-11 15:41               ` Roopa Prabhu
  2018-01-11 15:44               ` Jamal Hadi Salim
  0 siblings, 2 replies; 42+ messages in thread
From: Jiri Pirko @ 2018-01-11 15:07 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: netdev, davem, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel,
	dsahern

Thu, Jan 11, 2018 at 03:46:09PM CET, jhs@mojatatu.com wrote:
>On 18-01-11 09:41 AM, Jiri Pirko wrote:
>> Thu, Jan 11, 2018 at 03:37:08PM CET, jhs@mojatatu.com wrote:
>> > On 18-01-11 09:24 AM, Jiri Pirko wrote:
>> > > Thu, Jan 11, 2018 at 02:36:01PM CET, jhs@mojatatu.com wrote:
>> > > > On 18-01-09 09:07 AM, Jiri Pirko wrote:
>> > > > > From: Jiri Pirko <jiri@mellanox.com>
>> > > > > 
>> > > > > Benefit from the previously introduced shared filter blocks
>> > > > > infrastructure and allow ingress and clsact qdisc instances to share
>> > > > > filter blocks. The block index is coming from userspace as qdisc option.
>> > > > 
>> > > > Didnt quiet follow why ingress is special and needs attributes to
>> > > > set the block but other qdiscs didnt.
>> > > 
>> > > Jamal, again, other qdiscs does not support block sharing. This patchset
>> > > only adds support for sharing of block for ingress and clsact qdiscs.
>> > > Later on, other qdiscs could also support block sharing.
>> > > 
>> > 
>> > Can you stop a config which says:
>> > tc qdisc add dev ens9 root block 22 handle 1:0 prio ?
>> 
>> Please see the iproute2 patches. Parsing of "block" command line option
>> is done inside q_ingress.c
>> 
>
>I only looked at the kernel code. Good you can stop it at tc
>but the API does not stop it (unless you expect the rest of the
>world to only use tc).

Jamal, apparently, you did not looked at the kernel code either :)
Look at the changes done in net/sched/sch_ingress.c - there is where the
parsing of block attr takes place.


>Really - there is no reason for this API to be only via ingress qdisc
>attributes. You can add a check in cls api to reject any parent that is
>not either of the clsacts + ingress (depending on tc doesnt sound
>right).

I was thinking to take this direction originally. To have another
generic attr called TCA_BLOCK or something that would be used when qdisc
is created. For ingress, what would work. But for clsact, you need to be
able to specify 2 block during qdisc creation - one for ingress, one for
egress. That's when I realized this has to be per-qdisc-type attr. 

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [patch net-next v7 09/13] net: sched: allow ingress and clsact qdiscs to share filter blocks
  2018-01-11 15:07             ` Jiri Pirko
@ 2018-01-11 15:41               ` Roopa Prabhu
  2018-01-11 16:11                 ` Jiri Pirko
  2018-01-11 15:44               ` Jamal Hadi Salim
  1 sibling, 1 reply; 42+ messages in thread
From: Roopa Prabhu @ 2018-01-11 15:41 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: Jamal Hadi Salim, netdev, David Miller, Cong Wang, mlxsw,
	Andrew Lunn, Vivien Didelot, Florian Fainelli, Michael Chan,
	ganeshgr, Saeed Mahameed, matanb, leonro, Ido Schimmel,
	jakub.kicinski, Simon Horman, pieter.jansenvanvuuren,
	john.hurley, Alexander Duyck, Or Gerlitz, John Fastabend,
	Daniel Borkmann

On Thu, Jan 11, 2018 at 7:07 AM, Jiri Pirko <jiri@resnulli.us> wrote:
> Thu, Jan 11, 2018 at 03:46:09PM CET, jhs@mojatatu.com wrote:
>>On 18-01-11 09:41 AM, Jiri Pirko wrote:
>>> Thu, Jan 11, 2018 at 03:37:08PM CET, jhs@mojatatu.com wrote:
>>> > On 18-01-11 09:24 AM, Jiri Pirko wrote:
>>> > > Thu, Jan 11, 2018 at 02:36:01PM CET, jhs@mojatatu.com wrote:
>>> > > > On 18-01-09 09:07 AM, Jiri Pirko wrote:
>>> > > > > From: Jiri Pirko <jiri@mellanox.com>
>>> > > > >
>>> > > > > Benefit from the previously introduced shared filter blocks
>>> > > > > infrastructure and allow ingress and clsact qdisc instances to share
>>> > > > > filter blocks. The block index is coming from userspace as qdisc option.
>>> > > >
>>> > > > Didnt quiet follow why ingress is special and needs attributes to
>>> > > > set the block but other qdiscs didnt.
>>> > >
>>> > > Jamal, again, other qdiscs does not support block sharing. This patchset
>>> > > only adds support for sharing of block for ingress and clsact qdiscs.
>>> > > Later on, other qdiscs could also support block sharing.
>>> > >
>>> >
>>> > Can you stop a config which says:
>>> > tc qdisc add dev ens9 root block 22 handle 1:0 prio ?
>>>
>>> Please see the iproute2 patches. Parsing of "block" command line option
>>> is done inside q_ingress.c
>>>
>>
>>I only looked at the kernel code. Good you can stop it at tc
>>but the API does not stop it (unless you expect the rest of the
>>world to only use tc).
>
> Jamal, apparently, you did not looked at the kernel code either :)
> Look at the changes done in net/sched/sch_ingress.c - there is where the
> parsing of block attr takes place.
>
>
>>Really - there is no reason for this API to be only via ingress qdisc
>>attributes. You can add a check in cls api to reject any parent that is
>>not either of the clsacts + ingress (depending on tc doesnt sound
>>right).
>
> I was thinking to take this direction originally. To have another
> generic attr called TCA_BLOCK or something that would be used when qdisc
> is created. For ingress, what would work. But for clsact, you need to be
> able to specify 2 block during qdisc creation - one for ingress, one for
> egress. That's when I realized this has to be per-qdisc-type attr.


yeah, see the problem...but.., would it help if we just introduce two
generic attrs TCA_BLOCK_INGRESS and TCA_BLOCK_EGRESS instead of having
to duplicate these attrs at every qdisc ?.
and add proper validation depending on qdisc type..

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [patch net-next v7 09/13] net: sched: allow ingress and clsact qdiscs to share filter blocks
  2018-01-11 15:07             ` Jiri Pirko
  2018-01-11 15:41               ` Roopa Prabhu
@ 2018-01-11 15:44               ` Jamal Hadi Salim
  2018-01-11 16:15                 ` Jiri Pirko
  1 sibling, 1 reply; 42+ messages in thread
From: Jamal Hadi Salim @ 2018-01-11 15:44 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, davem, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel,
	dsahern

On 18-01-11 10:07 AM, Jiri Pirko wrote:
> Thu, Jan 11, 2018 at 03:46:09PM CET, jhs@mojatatu.com wrote:
>> On 18-01-11 09:41 AM, Jiri Pirko wrote:
>>> Thu, Jan 11, 2018 at 03:37:08PM CET, jhs@mojatatu.com wrote:

>>
>> I only looked at the kernel code. Good you can stop it at tc
>> but the API does not stop it (unless you expect the rest of the
>> world to only use tc).
> 
> Jamal, apparently, you did not looked at the kernel code either :)
> Look at the changes done in net/sched/sch_ingress.c - there is where the
> parsing of block attr takes place.
> 

reason i raised it is from looking at tc_ctl_tfilter().
If i specify ifindex != TCM_IFINDEX_MAGIC_BLOCK,
parent = 0XFFFF.... and block = 22 that should work, no?
i.e regardless of whether parent is INGRESS etc.

And so i was confused why you had attributes in sch_ingress.c

> 
>> Really - there is no reason for this API to be only via ingress qdisc
>> attributes. You can add a check in cls api to reject any parent that is
>> not either of the clsacts + ingress (depending on tc doesnt sound
>> right).
> 
> I was thinking to take this direction originally. To have another
> generic attr called TCA_BLOCK or something that would be used when qdisc
> is created. For ingress, what would work. But for clsact, you need to be
> able to specify 2 block during qdisc creation - one for ingress, one for
> egress. That's when I realized this has to be per-qdisc-type attr.
> 

ok for clsact - i can see that we dont have enough fields in the tcm
message.

TCA_BLOCK sounds appealing - could be a speacial tlv with many block ids
maybe? I really would like to use this for egress as well - and what
i described earlier should work for me.

cheers,
jamal

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [patch net-next v7 09/13] net: sched: allow ingress and clsact qdiscs to share filter blocks
  2018-01-11 15:41               ` Roopa Prabhu
@ 2018-01-11 16:11                 ` Jiri Pirko
  0 siblings, 0 replies; 42+ messages in thread
From: Jiri Pirko @ 2018-01-11 16:11 UTC (permalink / raw)
  To: Roopa Prabhu
  Cc: Jamal Hadi Salim, netdev, David Miller, Cong Wang, mlxsw,
	Andrew Lunn, Vivien Didelot, Florian Fainelli, Michael Chan,
	ganeshgr, Saeed Mahameed, matanb, leonro, Ido Schimmel,
	jakub.kicinski, Simon Horman, pieter.jansenvanvuuren,
	john.hurley, Alexander Duyck, Or Gerlitz, John Fastabend,
	Daniel Borkmann

Thu, Jan 11, 2018 at 04:41:27PM CET, roopa@cumulusnetworks.com wrote:
>On Thu, Jan 11, 2018 at 7:07 AM, Jiri Pirko <jiri@resnulli.us> wrote:
>> Thu, Jan 11, 2018 at 03:46:09PM CET, jhs@mojatatu.com wrote:
>>>On 18-01-11 09:41 AM, Jiri Pirko wrote:
>>>> Thu, Jan 11, 2018 at 03:37:08PM CET, jhs@mojatatu.com wrote:
>>>> > On 18-01-11 09:24 AM, Jiri Pirko wrote:
>>>> > > Thu, Jan 11, 2018 at 02:36:01PM CET, jhs@mojatatu.com wrote:
>>>> > > > On 18-01-09 09:07 AM, Jiri Pirko wrote:
>>>> > > > > From: Jiri Pirko <jiri@mellanox.com>
>>>> > > > >
>>>> > > > > Benefit from the previously introduced shared filter blocks
>>>> > > > > infrastructure and allow ingress and clsact qdisc instances to share
>>>> > > > > filter blocks. The block index is coming from userspace as qdisc option.
>>>> > > >
>>>> > > > Didnt quiet follow why ingress is special and needs attributes to
>>>> > > > set the block but other qdiscs didnt.
>>>> > >
>>>> > > Jamal, again, other qdiscs does not support block sharing. This patchset
>>>> > > only adds support for sharing of block for ingress and clsact qdiscs.
>>>> > > Later on, other qdiscs could also support block sharing.
>>>> > >
>>>> >
>>>> > Can you stop a config which says:
>>>> > tc qdisc add dev ens9 root block 22 handle 1:0 prio ?
>>>>
>>>> Please see the iproute2 patches. Parsing of "block" command line option
>>>> is done inside q_ingress.c
>>>>
>>>
>>>I only looked at the kernel code. Good you can stop it at tc
>>>but the API does not stop it (unless you expect the rest of the
>>>world to only use tc).
>>
>> Jamal, apparently, you did not looked at the kernel code either :)
>> Look at the changes done in net/sched/sch_ingress.c - there is where the
>> parsing of block attr takes place.
>>
>>
>>>Really - there is no reason for this API to be only via ingress qdisc
>>>attributes. You can add a check in cls api to reject any parent that is
>>>not either of the clsacts + ingress (depending on tc doesnt sound
>>>right).
>>
>> I was thinking to take this direction originally. To have another
>> generic attr called TCA_BLOCK or something that would be used when qdisc
>> is created. For ingress, what would work. But for clsact, you need to be
>> able to specify 2 block during qdisc creation - one for ingress, one for
>> egress. That's when I realized this has to be per-qdisc-type attr.
>
>
>yeah, see the problem...but.., would it help if we just introduce two
>generic attrs TCA_BLOCK_INGRESS and TCA_BLOCK_EGRESS instead of having
>to duplicate these attrs at every qdisc ?.
>and add proper validation depending on qdisc type..

I guess it would make sense. I just did not want to introduce this
limitation. But allright. Will do this.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [patch net-next v7 09/13] net: sched: allow ingress and clsact qdiscs to share filter blocks
  2018-01-11 15:44               ` Jamal Hadi Salim
@ 2018-01-11 16:15                 ` Jiri Pirko
  2018-01-11 17:02                   ` Jamal Hadi Salim
  0 siblings, 1 reply; 42+ messages in thread
From: Jiri Pirko @ 2018-01-11 16:15 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: netdev, davem, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel,
	dsahern

Thu, Jan 11, 2018 at 04:44:32PM CET, jhs@mojatatu.com wrote:
>On 18-01-11 10:07 AM, Jiri Pirko wrote:
>> Thu, Jan 11, 2018 at 03:46:09PM CET, jhs@mojatatu.com wrote:
>> > On 18-01-11 09:41 AM, Jiri Pirko wrote:
>> > > Thu, Jan 11, 2018 at 03:37:08PM CET, jhs@mojatatu.com wrote:
>
>> > 
>> > I only looked at the kernel code. Good you can stop it at tc
>> > but the API does not stop it (unless you expect the rest of the
>> > world to only use tc).
>> 
>> Jamal, apparently, you did not looked at the kernel code either :)
>> Look at the changes done in net/sched/sch_ingress.c - there is where the
>> parsing of block attr takes place.
>> 
>
>reason i raised it is from looking at tc_ctl_tfilter().
>If i specify ifindex != TCM_IFINDEX_MAGIC_BLOCK,
>parent = 0XFFFF.... and block = 22 that should work, no?
>i.e regardless of whether parent is INGRESS etc.

No, the block needs to be created first by qdisc instance.
Seems to me that you are mixing apples and oranges a bit.


>
>And so i was confused why you had attributes in sch_ingress.c
>
>> 
>> > Really - there is no reason for this API to be only via ingress qdisc
>> > attributes. You can add a check in cls api to reject any parent that is
>> > not either of the clsacts + ingress (depending on tc doesnt sound
>> > right).
>> 
>> I was thinking to take this direction originally. To have another
>> generic attr called TCA_BLOCK or something that would be used when qdisc
>> is created. For ingress, what would work. But for clsact, you need to be
>> able to specify 2 block during qdisc creation - one for ingress, one for
>> egress. That's when I realized this has to be per-qdisc-type attr.
>> 
>
>ok for clsact - i can see that we dont have enough fields in the tcm
>message.

That is not a problem. Again, don't mix filter manipulation with qdisc
manipulation.


>
>TCA_BLOCK sounds appealing - could be a speacial tlv with many block ids
>maybe? I really would like to use this for egress as well - and what
>i described earlier should work for me.

I don't get what you mean by "tlv with many block ids". What is it good
for? :O


>
>cheers,
>jamal

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [patch net-next v7 09/13] net: sched: allow ingress and clsact qdiscs to share filter blocks
  2018-01-11 16:15                 ` Jiri Pirko
@ 2018-01-11 17:02                   ` Jamal Hadi Salim
  0 siblings, 0 replies; 42+ messages in thread
From: Jamal Hadi Salim @ 2018-01-11 17:02 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, davem, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel,
	dsahern

On 18-01-11 11:15 AM, Jiri Pirko wrote:
> Thu, Jan 11, 2018 at 04:44:32PM CET, jhs@mojatatu.com wrote:
>> On 18-01-11 10:07 AM, Jiri Pirko wrote:
>>> Thu, Jan 11, 2018 at 03:46:09PM CET, jhs@mojatatu.com wrote:
>>>> On 18-01-11 09:41 AM, Jiri Pirko wrote:
>>>>> Thu, Jan 11, 2018 at 03:37:08PM CET, jhs@mojatatu.com wrote:
>>
>>>>
>>>> I only looked at the kernel code. Good you can stop it at tc
>>>> but the API does not stop it (unless you expect the rest of the
>>>> world to only use tc).
>>>
>>> Jamal, apparently, you did not looked at the kernel code either :)
>>> Look at the changes done in net/sched/sch_ingress.c - there is where the
>>> parsing of block attr takes place.
>>>
>>
>> reason i raised it is from looking at tc_ctl_tfilter().
>> If i specify ifindex != TCM_IFINDEX_MAGIC_BLOCK,
>> parent = 0XFFFF.... and block = 22 that should work, no?
>> i.e regardless of whether parent is INGRESS etc.
> 
> No, the block needs to be created first by qdisc instance.
> Seems to me that you are mixing apples and oranges a bit.
> 

You are correct, the qdisc attachment must exist first ;->


>> TCA_BLOCK sounds appealing - could be a speacial tlv with many block ids
>> maybe? I really would like to use this for egress as well - and what
>> i described earlier should work for me.
> 
> I don't get what you mean by "tlv with many block ids". What is it good
> for? :O
>

I meant A TLV with a bunch of 32 bit values, so you can have more than
one block id in it. But what Roopa suggested is more explicit (and
better).

cheers,
jamal

^ permalink raw reply	[flat|nested] 42+ messages in thread

end of thread, other threads:[~2018-01-11 17:02 UTC | newest]

Thread overview: 42+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-09 14:07 [patch net-next v7 00/13] net: sched: allow qdiscs to share filter block instances Jiri Pirko
2018-01-09 14:07 ` [patch net-next v7 01/13] net: sched: introduce support for multiple filter chain pointers registration Jiri Pirko
2018-01-09 14:07 ` [patch net-next v7 02/13] net: sched: introduce shared filter blocks infrastructure Jiri Pirko
2018-01-09 14:07 ` [patch net-next v7 03/13] net: sched: avoid usage of tp->q in tcf_classify Jiri Pirko
2018-01-10 16:17   ` David Ahern
2018-01-11  9:40     ` Jiri Pirko
2018-01-11 13:57       ` David Ahern
2018-01-09 14:07 ` [patch net-next v7 04/13] net: sched: introduce block mechanism to handle netif_keep_dst calls Jiri Pirko
2018-01-09 14:07 ` [patch net-next v7 05/13] net: sched: remove classid and q fields from tcf_proto Jiri Pirko
2018-01-09 14:07 ` [patch net-next v7 06/13] net: sched: keep track of offloaded filters and check tc offload feature Jiri Pirko
2018-01-09 14:07 ` [patch net-next v7 07/13] net: sched: use block index as a handle instead of qdisc when block is shared Jiri Pirko
2018-01-10 18:12   ` David Ahern
2018-01-11  9:38     ` Jiri Pirko
2018-01-11 13:25   ` Jamal Hadi Salim
2018-01-11 14:21     ` Jiri Pirko
2018-01-09 14:07 ` [patch net-next v7 08/13] net: sched: add rt netlink message type for block get Jiri Pirko
2018-01-10 16:48   ` David Ahern
2018-01-11  9:37     ` Jiri Pirko
2018-01-11 11:11       ` Jiri Pirko
2018-01-11 13:27   ` Jamal Hadi Salim
2018-01-11 14:23     ` Jiri Pirko
2018-01-09 14:07 ` [patch net-next v7 09/13] net: sched: allow ingress and clsact qdiscs to share filter blocks Jiri Pirko
2018-01-11 13:36   ` Jamal Hadi Salim
2018-01-11 14:24     ` Jiri Pirko
2018-01-11 14:37       ` Jamal Hadi Salim
2018-01-11 14:41         ` Jiri Pirko
2018-01-11 14:46           ` Jamal Hadi Salim
2018-01-11 15:07             ` Jiri Pirko
2018-01-11 15:41               ` Roopa Prabhu
2018-01-11 16:11                 ` Jiri Pirko
2018-01-11 15:44               ` Jamal Hadi Salim
2018-01-11 16:15                 ` Jiri Pirko
2018-01-11 17:02                   ` Jamal Hadi Salim
2018-01-09 14:07 ` [patch net-next v7 10/13] mlxsw: spectrum_acl: Reshuffle code around mlxsw_sp_acl_ruleset_create/destroy Jiri Pirko
2018-01-09 14:07 ` [patch net-next v7 11/13] mlxsw: spectrum_acl: Don't store netdev and ingress for ruleset unbind Jiri Pirko
2018-01-09 14:07 ` [patch net-next v7 12/13] mlxsw: spectrum_acl: Implement TC block sharing Jiri Pirko
2018-01-09 14:07 ` [patch net-next v7 13/13] mlxsw: spectrum_acl: Pass mlxsw_sp_port down to ruleset bind/unbind ops Jiri Pirko
2018-01-09 14:19 ` [patch iproute2 net-next v7 1/3] include: update rtnetlink header according to kernel Jiri Pirko
2018-01-09 14:19 ` [patch iproute2 net-next v7 2/3] tc: introduce support for block-handle for filter operations Jiri Pirko
2018-01-09 14:19 ` [patch iproute2 net-next v7 3/3] tc: implement filter block sharing to ingress and clsact qdiscs Jiri Pirko
2018-01-11 13:19 ` [patch net-next v7 00/13] net: sched: allow qdiscs to share filter block instances Jamal Hadi Salim
2018-01-11 14:27   ` Jiri Pirko

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).