* [patch net-next v2 00/10] net: sched: allow qdiscs to share filter block instances
@ 2017-11-12 15:55 Jiri Pirko
  2017-11-12 15:55 ` [patch net-next v2 01/10] cls_bpf: move prog offload->netdev check into drivers Jiri Pirko
                   ` (10 more replies)
  0 siblings, 11 replies; 31+ messages in thread
From: Jiri Pirko @ 2017-11-12 15:55 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, jakub.kicinski, ast, daniel, simon.horman,
	pieter.jansenvanvuuren, john.hurley

From: Jiri Pirko <jiri@mellanox.com>

Currently, filters added to qdiscs are independent. So, for example, if you
have two netdevices and you create an ingress qdisc on both and you want to
add identical filter rules to both, you need to add them twice. This patchset
makes this easier and mainly saves resources by allowing all filters within
a qdisc to be shared - I call it a "filter block". This also helps to save
resources when we offload to hardware, for example to an expensive TCAM.

So back to the example. First, we create the two qdiscs. Both will share
block number 22. "22" is just an identifier. If we don't pass any block
number, a new one is generated by the kernel:

$ tc qdisc add dev ens7 ingress block 22
                                ^^^^^^^^
$ tc qdisc add dev ens8 ingress block 22
                                ^^^^^^^^
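
For comparison, if the block option were omitted, the kernel would allocate
the lowest free index itself. Illustratively (the generated index below is
hypothetical):

$ tc qdisc add dev ens9 ingress
$ tc qdisc show dev ens9
qdisc ingress ffff: dev ens9 parent ffff:fff1 block 1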

Now if we list the qdiscs, we will see the block index in the output:

$ tc qdisc
qdisc ingress ffff: dev ens7 parent ffff:fff1 block 22
qdisc ingress ffff: dev ens8 parent ffff:fff1 block 22

Now we can add a filter to any of the qdiscs sharing the same block:

$ tc filter add dev ens7 ingress protocol ip pref 25 flower dst_ip 192.168.0.0/16 action drop

We will see the same output if we list filters for ens7 and ens8, including stats:

$ tc -s filter show dev ens7 ingress
filter protocol ip pref 25 flower chain 0
filter protocol ip pref 25 flower chain 0 handle 0x1
  eth_type ipv4
  dst_ip 192.168.0.0/16
  not_in_hw
        action order 1: gact action drop
         random type none pass val 0
         index 1 ref 1 bind 1 installed 39 sec used 2 sec
        Action statistics:
        Sent 3108 bytes 37 pkt (dropped 37, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

$ tc -s filter show dev ens8 ingress
filter protocol ip pref 25 flower chain 0
filter protocol ip pref 25 flower chain 0 handle 0x1
  eth_type ipv4
  dst_ip 192.168.0.0/16
  not_in_hw
        action order 1: gact action drop
         random type none pass val 0
         index 1 ref 1 bind 1 installed 40 sec used 3 sec
        Action statistics:
        Sent 3108 bytes 37 pkt (dropped 37, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0
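
Because the block is reference counted, deleting one of the qdiscs does not
flush the shared filters; they stay in place until the last user is gone.
For example (remaining output elided):

$ tc qdisc del dev ens7 ingress
$ tc -s filter show dev ens8 ingress
filter protocol ip pref 25 flower chain 0
...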

---
v1->v2:
- patch 1:
  - new patch due to rebase (another tp->q usage appeared)
- patch 4:
  - fix binder_type to check "egress" as well, as pointed out by Daniel
- patches 7-10:
  - new patches to offload shared blocks in mlxsw

Jiri Pirko (10):
  cls_bpf: move prog offload->netdev check into drivers
  net: sched: introduce support for multiple filter chain pointers
    registration
  net: sched: avoid usage of tp->q in tcf_classify
  net: sched: introduce block mechanism to handle netif_keep_dst calls
  net: sched: remove classid and q fields from tcf_proto
  net: sched: allow ingress and clsact qdiscs to share filter blocks
  mlxsw: spectrum_acl: Reshuffle code around
    mlxsw_sp_acl_ruleset_create/destroy
  mlxsw: spectrum_acl: Don't store netdev and ingress for ruleset unbind
  mlxsw: spectrum_acl: Implement TC block sharing
  mlxsw: spectrum_acl: Pass mlxsw_sp_port down to ruleset bind/unbind
    ops

 drivers/net/ethernet/mellanox/mlxsw/spectrum.c     | 146 +++++++--
 drivers/net/ethernet/mellanox/mlxsw/spectrum.h     |  38 ++-
 drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c | 283 +++++++++++++-----
 .../ethernet/mellanox/mlxsw/spectrum_acl_tcam.c    |  44 ++-
 .../net/ethernet/mellanox/mlxsw/spectrum_flower.c  |  41 ++-
 drivers/net/ethernet/netronome/nfp/bpf/main.c      |   2 +
 include/linux/bpf.h                                |  16 +
 include/net/pkt_cls.h                              |   4 +
 include/net/sch_generic.h                          |   9 +-
 include/uapi/linux/pkt_sched.h                     |  11 +
 kernel/bpf/syscall.c                               |  39 ++-
 net/sched/cls_api.c                                | 328 ++++++++++++++++++---
 net/sched/cls_bpf.c                                |  10 +-
 net/sched/cls_flow.c                               |   2 +-
 net/sched/cls_route.c                              |   2 +-
 net/sched/sch_ingress.c                            |  89 +++++-
 16 files changed, 851 insertions(+), 213 deletions(-)

-- 
2.9.5


* [patch net-next v2 01/10] cls_bpf: move prog offload->netdev check into drivers
  2017-11-12 15:55 [patch net-next v2 00/10] net: sched: allow qdiscs to share filter block instances Jiri Pirko
@ 2017-11-12 15:55 ` Jiri Pirko
  2017-11-13  2:14   ` Jakub Kicinski
  2017-11-12 15:55 ` [patch net-next v2 02/10] net: sched: introduce support for multiple filter chain pointers registration Jiri Pirko
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Jiri Pirko @ 2017-11-12 15:55 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, jakub.kicinski, ast, daniel, simon.horman,
	pieter.jansenvanvuuren, john.hurley

From: Jiri Pirko <jiri@mellanox.com>

In order to remove the tp->q usage from cls_bpf, the offload->netdev check
needs to be moved into the individual drivers, as only they have access to
the appropriate struct net_device.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
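For reviewers, a sketch of the check drivers are now expected to perform
(the foo_ names are hypothetical; the nfp hunk below is the real conversion):

static int foo_setup_tc_block_cb(enum tc_setup_type type,
				 void *type_data, void *cb_priv)
{
	struct foo_priv *priv = cb_priv;	/* hypothetical */
	struct tc_cls_bpf_offload *cls_bpf = type_data;

	if (type != TC_SETUP_CLSBPF)
		return -EOPNOTSUPP;
	/* Reject programs that were offloaded to a different netdev,
	 * now that cls_bpf itself no longer checks this.
	 */
	if (!bpf_prog_can_attach_netdev(cls_bpf->prog, priv->netdev))
		return -EINVAL;
	/* ... translate and install cls_bpf->prog in hardware ... */
	return 0;
}
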
 drivers/net/ethernet/netronome/nfp/bpf/main.c |  2 ++
 include/linux/bpf.h                           | 16 +++++++++++
 kernel/bpf/syscall.c                          | 39 +++++++++++++++++++++------
 net/sched/cls_bpf.c                           |  6 +----
 4 files changed, 50 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/bpf/main.c b/drivers/net/ethernet/netronome/nfp/bpf/main.c
index e379b78..c303e08 100644
--- a/drivers/net/ethernet/netronome/nfp/bpf/main.c
+++ b/drivers/net/ethernet/netronome/nfp/bpf/main.c
@@ -102,6 +102,8 @@ static int nfp_bpf_setup_tc_block_cb(enum tc_setup_type type,
 		return -EOPNOTSUPP;
 	if (nn->dp.bpf_offload_xdp)
 		return -EBUSY;
+	if (!bpf_prog_can_attach_netdev(cls_bpf->prog, nn->dp.netdev))
+		return -EINVAL;
 
 	/* Only support TC direct action */
 	if (!cls_bpf->exts_integrated ||
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index c397934..df7b7bc 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -337,10 +337,14 @@ struct bpf_prog *bpf_prog_get(u32 ufd);
 struct bpf_prog *bpf_prog_get_type(u32 ufd, enum bpf_prog_type type);
 struct bpf_prog *bpf_prog_get_type_dev(u32 ufd, enum bpf_prog_type type,
 				       struct net_device *netdev);
+struct bpf_prog *bpf_prog_get_type_dev_nocheck(u32 ufd,
+					       enum bpf_prog_type type);
 struct bpf_prog * __must_check bpf_prog_add(struct bpf_prog *prog, int i);
 void bpf_prog_sub(struct bpf_prog *prog, int i);
 struct bpf_prog * __must_check bpf_prog_inc(struct bpf_prog *prog);
 struct bpf_prog * __must_check bpf_prog_inc_not_zero(struct bpf_prog *prog);
+bool bpf_prog_can_attach_netdev(struct bpf_prog *prog,
+				struct net_device *netdev);
 void bpf_prog_put(struct bpf_prog *prog);
 int __bpf_prog_charge(struct user_struct *user, u32 pages);
 void __bpf_prog_uncharge(struct user_struct *user, u32 pages);
@@ -438,6 +442,12 @@ static inline struct bpf_prog *bpf_prog_get_type_dev(u32 ufd,
 	return ERR_PTR(-EOPNOTSUPP);
 }
 
+static inline struct bpf_prog *
+bpf_prog_get_type_dev_nocheck(u32 ufd, enum bpf_prog_type type)
+{
+	return ERR_PTR(-EOPNOTSUPP);
+}
+
 static inline struct bpf_prog * __must_check bpf_prog_add(struct bpf_prog *prog,
 							  int i)
 {
@@ -463,6 +473,12 @@ bpf_prog_inc_not_zero(struct bpf_prog *prog)
 	return ERR_PTR(-EOPNOTSUPP);
 }
 
+static inline bool bpf_prog_can_attach_netdev(struct bpf_prog *prog,
+					      struct net_device *netdev)
+{
+	return true;
+}
+
 static inline int __bpf_prog_charge(struct user_struct *user, u32 pages)
 {
 	return 0;
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 09badc3..d880038 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1057,22 +1057,33 @@ struct bpf_prog *bpf_prog_inc_not_zero(struct bpf_prog *prog)
 }
 EXPORT_SYMBOL_GPL(bpf_prog_inc_not_zero);
 
-static bool bpf_prog_can_attach(struct bpf_prog *prog,
-				enum bpf_prog_type *attach_type,
+bool bpf_prog_can_attach_netdev(struct bpf_prog *prog,
 				struct net_device *netdev)
 {
 	struct bpf_dev_offload *offload = prog->aux->offload;
 
+	if (offload && offload->netdev != netdev)
+		return false;
+
+	return true;
+}
+EXPORT_SYMBOL_GPL(bpf_prog_can_attach_netdev);
+
+static bool bpf_prog_can_attach(struct bpf_prog *prog,
+				enum bpf_prog_type *attach_type,
+				struct net_device *netdev, bool check_netdev)
+{
 	if (prog->type != *attach_type)
 		return false;
-	if (offload && offload->netdev != netdev)
+	if (check_netdev && !bpf_prog_can_attach_netdev(prog, netdev))
 		return false;
 
 	return true;
 }
 
 static struct bpf_prog *__bpf_prog_get(u32 ufd, enum bpf_prog_type *attach_type,
-				       struct net_device *netdev)
+				       struct net_device *netdev,
+				       bool check_netdev)
 {
 	struct fd f = fdget(ufd);
 	struct bpf_prog *prog;
@@ -1080,7 +1091,8 @@ static struct bpf_prog *__bpf_prog_get(u32 ufd, enum bpf_prog_type *attach_type,
 	prog = ____bpf_prog_get(f);
 	if (IS_ERR(prog))
 		return prog;
-	if (attach_type && !bpf_prog_can_attach(prog, attach_type, netdev)) {
+	if (attach_type && !bpf_prog_can_attach(prog, attach_type,
+						netdev, check_netdev)) {
 		prog = ERR_PTR(-EINVAL);
 		goto out;
 	}
@@ -1093,12 +1105,12 @@ static struct bpf_prog *__bpf_prog_get(u32 ufd, enum bpf_prog_type *attach_type,
 
 struct bpf_prog *bpf_prog_get(u32 ufd)
 {
-	return __bpf_prog_get(ufd, NULL, NULL);
+	return __bpf_prog_get(ufd, NULL, NULL, true);
 }
 
 struct bpf_prog *bpf_prog_get_type(u32 ufd, enum bpf_prog_type type)
 {
-	struct bpf_prog *prog = __bpf_prog_get(ufd, &type, NULL);
+	struct bpf_prog *prog = __bpf_prog_get(ufd, &type, NULL, true);
 
 	if (!IS_ERR(prog))
 		trace_bpf_prog_get_type(prog);
@@ -1109,7 +1121,7 @@ EXPORT_SYMBOL_GPL(bpf_prog_get_type);
 struct bpf_prog *bpf_prog_get_type_dev(u32 ufd, enum bpf_prog_type type,
 				       struct net_device *netdev)
 {
-	struct bpf_prog *prog = __bpf_prog_get(ufd, &type, netdev);
+	struct bpf_prog *prog = __bpf_prog_get(ufd, &type, netdev, true);
 
 	if (!IS_ERR(prog))
 		trace_bpf_prog_get_type(prog);
@@ -1117,6 +1129,17 @@ struct bpf_prog *bpf_prog_get_type_dev(u32 ufd, enum bpf_prog_type type,
 }
 EXPORT_SYMBOL_GPL(bpf_prog_get_type_dev);
 
+struct bpf_prog *bpf_prog_get_type_dev_nocheck(u32 ufd,
+					       enum bpf_prog_type type)
+{
+	struct bpf_prog *prog = __bpf_prog_get(ufd, &type, NULL, false);
+
+	if (!IS_ERR(prog))
+		trace_bpf_prog_get_type(prog);
+	return prog;
+}
+EXPORT_SYMBOL_GPL(bpf_prog_get_type_dev_nocheck);
+
 /* last field in 'union bpf_attr' used by this command */
 #define	BPF_PROG_LOAD_LAST_FIELD prog_target_ifindex
 
diff --git a/net/sched/cls_bpf.c b/net/sched/cls_bpf.c
index fb680da..8b3927b 100644
--- a/net/sched/cls_bpf.c
+++ b/net/sched/cls_bpf.c
@@ -386,11 +386,7 @@ static int cls_bpf_prog_from_efd(struct nlattr **tb, struct cls_bpf_prog *prog,
 
 	bpf_fd = nla_get_u32(tb[TCA_BPF_FD]);
 
-	if (gen_flags & TCA_CLS_FLAGS_SKIP_SW)
-		fp = bpf_prog_get_type_dev(bpf_fd, BPF_PROG_TYPE_SCHED_CLS,
-					   qdisc_dev(tp->q));
-	else
-		fp = bpf_prog_get_type(bpf_fd, BPF_PROG_TYPE_SCHED_CLS);
+	fp = bpf_prog_get_type_dev_nocheck(bpf_fd, BPF_PROG_TYPE_SCHED_CLS);
 	if (IS_ERR(fp))
 		return PTR_ERR(fp);
 
-- 
2.9.5


* [patch net-next v2 02/10] net: sched: introduce support for multiple filter chain pointers registration
  2017-11-12 15:55 [patch net-next v2 00/10] net: sched: allow qdiscs to share filter block instances Jiri Pirko
  2017-11-12 15:55 ` [patch net-next v2 01/10] cls_bpf: move prog offload->netdev check into drivers Jiri Pirko
@ 2017-11-12 15:55 ` Jiri Pirko
  2017-11-12 15:55 ` [patch net-next v2 03/10] net: sched: avoid usage of tp->q in tcf_classify Jiri Pirko
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 31+ messages in thread
From: Jiri Pirko @ 2017-11-12 15:55 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, jakub.kicinski, ast, daniel, simon.horman,
	pieter.jansenvanvuuren, john.hurley

From: Jiri Pirko <jiri@mellanox.com>

So far, it was only possible to register a single filter chain pointer
for block->chain[0]. However, once blocks become shareable, we need to
allow registration of multiple filter chain pointers.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
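As a usage sketch, a qdisc opts into sharing like this (patch 6 contains
the real ingress/clsact conversion; the foo_ names are hypothetical):

static int foo_init(struct Qdisc *sch, u32 block_index)
{
	struct foo_sched_data *q = qdisc_priv(sch);

	q->block_info.shareable = true;
	q->block_info.block_index = block_index; /* 0 means "allocate one" */
	q->block_info.binder_type = TCF_BLOCK_BINDER_TYPE_CLSACT_INGRESS;
	q->block_info.chain_head_change = foo_chain_head_change;
	q->block_info.chain_head_change_priv = &q->miniqp;

	/* The first caller for a given index creates the block; later
	 * callers bump block->refcnt and get their head-change callback
	 * added to chain 0's filter_chain_list.
	 */
	return tcf_block_get_ext(&q->block, sch, &q->block_info);
}
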
 include/net/pkt_cls.h     |   3 +
 include/net/sch_generic.h |   5 +-
 net/sched/cls_api.c       | 247 +++++++++++++++++++++++++++++++++++++++-------
 3 files changed, 218 insertions(+), 37 deletions(-)

diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index 0105445..c04154f 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -29,6 +29,8 @@ struct tcf_block_ext_info {
 	enum tcf_block_binder_type binder_type;
 	tcf_chain_head_change_t *chain_head_change;
 	void *chain_head_change_priv;
+	bool shareable;
+	u32 block_index;
 };
 
 struct tcf_block_cb;
@@ -48,6 +50,7 @@ void tcf_block_put_ext(struct tcf_block *block, struct Qdisc *q,
 
 static inline struct Qdisc *tcf_block_q(struct tcf_block *block)
 {
+	WARN_ON(block->refcnt != 1);
 	return block->q;
 }
 
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 65d0d25..9b4ab7e 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -265,8 +265,7 @@ typedef void tcf_chain_head_change_t(struct tcf_proto *tp_head, void *priv);
 
 struct tcf_chain {
 	struct tcf_proto __rcu *filter_chain;
-	tcf_chain_head_change_t *chain_head_change;
-	void *chain_head_change_priv;
+	struct list_head filter_chain_list;
 	struct list_head list;
 	struct tcf_block *block;
 	u32 index; /* chain index */
@@ -275,6 +274,8 @@ struct tcf_chain {
 
 struct tcf_block {
 	struct list_head chain_list;
+	u32 index; /* block index for shared blocks */
+	unsigned int refcnt;
 	struct net *net;
 	struct Qdisc *q;
 	struct list_head cb_list;
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index ab255b4..14e1a66 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -25,6 +25,7 @@
 #include <linux/kmod.h>
 #include <linux/err.h>
 #include <linux/slab.h>
+#include <linux/idr.h>
 #include <net/net_namespace.h>
 #include <net/sock.h>
 #include <net/netlink.h>
@@ -180,6 +181,12 @@ static void tcf_proto_destroy(struct tcf_proto *tp)
 	kfree_rcu(tp, rcu);
 }
 
+struct tcf_filter_chain_list_item {
+	struct list_head list;
+	tcf_chain_head_change_t *chain_head_change;
+	void *chain_head_change_priv;
+};
+
 static struct tcf_chain *tcf_chain_create(struct tcf_block *block,
 					  u32 chain_index)
 {
@@ -188,6 +195,7 @@ static struct tcf_chain *tcf_chain_create(struct tcf_block *block,
 	chain = kzalloc(sizeof(*chain), GFP_KERNEL);
 	if (!chain)
 		return NULL;
+	INIT_LIST_HEAD(&chain->filter_chain_list);
 	list_add_tail(&chain->list, &block->chain_list);
 	chain->block = block;
 	chain->index = chain_index;
@@ -195,12 +203,20 @@ static struct tcf_chain *tcf_chain_create(struct tcf_block *block,
 	return chain;
 }
 
+static void tcf_chain_head_change_item(struct tcf_filter_chain_list_item *item,
+				       struct tcf_proto *tp_head)
+{
+	if (item->chain_head_change)
+		item->chain_head_change(tp_head, item->chain_head_change_priv);
+}
+
 static void tcf_chain_head_change(struct tcf_chain *chain,
 				  struct tcf_proto *tp_head)
 {
-	if (chain->chain_head_change)
-		chain->chain_head_change(tp_head,
-					 chain->chain_head_change_priv);
+	struct tcf_filter_chain_list_item *item;
+
+	list_for_each_entry(item, &chain->filter_chain_list, list)
+		tcf_chain_head_change_item(item, tp_head);
 }
 
 static void tcf_chain_flush(struct tcf_chain *chain)
@@ -276,15 +291,84 @@ static void tcf_block_offload_unbind(struct tcf_block *block, struct Qdisc *q,
 	tcf_block_offload_cmd(block, q, ei, TC_BLOCK_UNBIND);
 }
 
-int tcf_block_get_ext(struct tcf_block **p_block, struct Qdisc *q,
-		      struct tcf_block_ext_info *ei)
+static int
+tcf_chain_head_change_cb_add(struct tcf_chain *chain,
+			     struct tcf_block_ext_info *ei)
+{
+	struct tcf_filter_chain_list_item *item;
+
+	item = kmalloc(sizeof(*item), GFP_KERNEL);
+	if (!item)
+		return -ENOMEM;
+	item->chain_head_change = ei->chain_head_change;
+	item->chain_head_change_priv = ei->chain_head_change_priv;
+	if (chain->filter_chain)
+		tcf_chain_head_change_item(item, chain->filter_chain);
+	list_add(&item->list, &chain->filter_chain_list);
+	return 0;
+}
+
+static void
+tcf_chain_head_change_cb_del(struct tcf_chain *chain,
+			     struct tcf_block_ext_info *ei)
+{
+	struct tcf_filter_chain_list_item *item;
+
+	list_for_each_entry(item, &chain->filter_chain_list, list) {
+		if ((!ei->chain_head_change && !ei->chain_head_change_priv) ||
+		    (item->chain_head_change == ei->chain_head_change &&
+		     item->chain_head_change_priv == ei->chain_head_change_priv)) {
+			tcf_chain_head_change_item(item, NULL);
+			list_del(&item->list);
+			kfree(item);
+			return;
+		}
+	}
+	WARN_ON(1);
+}
+
+struct tcf_net {
+	struct idr idr;
+};
+
+static unsigned int tcf_net_id;
+
+static int tcf_block_insert(struct tcf_block *block, struct net *net,
+			    u32 block_index)
+{
+	struct tcf_net *tn = net_generic(net, tcf_net_id);
+	int idr_start;
+	int idr_end;
+	int index;
+
+	if (block_index >= INT_MAX)
+		return -EINVAL;
+	idr_start = block_index ? block_index : 1;
+	idr_end = block_index ? block_index + 1 : INT_MAX;
+
+	index = idr_alloc(&tn->idr, block, idr_start, idr_end, GFP_KERNEL);
+	if (index < 0)
+		return index;
+	block->index = index;
+	return 0;
+}
+
+static void tcf_block_remove(struct tcf_block *block, struct net *net)
 {
-	struct tcf_block *block = kzalloc(sizeof(*block), GFP_KERNEL);
+	struct tcf_net *tn = net_generic(net, tcf_net_id);
+
+	idr_remove(&tn->idr, block->index);
+}
+
+static struct tcf_block *tcf_block_create(struct net *net, struct Qdisc *q)
+{
+	struct tcf_block *block;
 	struct tcf_chain *chain;
 	int err;
 
+	block = kzalloc(sizeof(*block), GFP_KERNEL);
 	if (!block)
-		return -ENOMEM;
+		return ERR_PTR(-ENOMEM);
 	INIT_LIST_HEAD(&block->chain_list);
 	INIT_LIST_HEAD(&block->cb_list);
 
@@ -294,17 +378,95 @@ int tcf_block_get_ext(struct tcf_block **p_block, struct Qdisc *q,
 		err = -ENOMEM;
 		goto err_chain_create;
 	}
-	WARN_ON(!ei->chain_head_change);
-	chain->chain_head_change = ei->chain_head_change;
-	chain->chain_head_change_priv = ei->chain_head_change_priv;
-	block->net = qdisc_net(q);
+	block->refcnt = 1;
+	block->net = net;
 	block->q = q;
+	return block;
+
+err_chain_create:
+	kfree(block);
+	return ERR_PTR(err);
+}
+
+static void tcf_block_destroy_final(struct work_struct *work)
+{
+	struct tcf_block *block = container_of(work, struct tcf_block, work);
+	struct tcf_chain *chain, *tmp;
+
+	rtnl_lock();
+	/* Only chain 0 should be still here. */
+	list_for_each_entry_safe(chain, tmp, &block->chain_list, list)
+		tcf_chain_put(chain);
+	rtnl_unlock();
+	kfree(block);
+}
+
+static void tcf_block_destroy(struct tcf_block *block)
+{
+	INIT_WORK(&block->work, tcf_block_destroy_final);
+	/* Wait for existing RCU callbacks to cool down, make sure their works
+	 * have been queued before this. We can not flush pending works here
+	 * because we are holding the RTNL lock.
+	 */
+	rcu_barrier();
+	tcf_queue_work(&block->work);
+}
+
+static struct tcf_block *tcf_block_lookup(struct net *net, u32 block_index)
+{
+	struct tcf_net *tn = net_generic(net, tcf_net_id);
+
+	return idr_find(&tn->idr, block_index);
+}
+
+static struct tcf_chain *tcf_block_chain_zero(struct tcf_block *block)
+{
+	return list_first_entry(&block->chain_list, struct tcf_chain, list);
+}
+
+int tcf_block_get_ext(struct tcf_block **p_block, struct Qdisc *q,
+		      struct tcf_block_ext_info *ei)
+{
+	struct net *net = qdisc_net(q);
+	struct tcf_block *block = NULL;
+	bool created = false;
+	int err;
+
+	if (ei->shareable) {
+		block = tcf_block_lookup(net, ei->block_index);
+		if (block)
+			block->refcnt++;
+	}
+
+	if (!block) {
+		block = tcf_block_create(net, q);
+		if (IS_ERR(block))
+			return PTR_ERR(block);
+		created = true;
+		if (ei->shareable) {
+			err = tcf_block_insert(block, net, ei->block_index);
+			if (err)
+				goto err_block_insert;
+		}
+	}
+
+	err = tcf_chain_head_change_cb_add(tcf_block_chain_zero(block), ei);
+	if (err)
+		goto err_chain_head_change_cb_add;
+
 	tcf_block_offload_bind(block, q, ei);
 	*p_block = block;
 	return 0;
 
-err_chain_create:
-	kfree(block);
+err_chain_head_change_cb_add:
+	if (created) {
+		if (ei->shareable)
+			tcf_block_remove(block, net);
+err_block_insert:
+		tcf_block_destroy(block);
+	} else {
+		block->refcnt--;
+	}
 	return err;
 }
 EXPORT_SYMBOL(tcf_block_get_ext);
@@ -329,19 +492,6 @@ int tcf_block_get(struct tcf_block **p_block,
 }
 EXPORT_SYMBOL(tcf_block_get);
 
-static void tcf_block_put_final(struct work_struct *work)
-{
-	struct tcf_block *block = container_of(work, struct tcf_block, work);
-	struct tcf_chain *chain, *tmp;
-
-	rtnl_lock();
-	/* Only chain 0 should be still here. */
-	list_for_each_entry_safe(chain, tmp, &block->chain_list, list)
-		tcf_chain_put(chain);
-	rtnl_unlock();
-	kfree(block);
-}
-
 /* XXX: Standalone actions are not allowed to jump to any chain, and bound
  * actions should be all removed after flushing. However, filters are now
  * destroyed in tc filter workqueue with RTNL lock, they can not race here.
@@ -351,18 +501,17 @@ void tcf_block_put_ext(struct tcf_block *block, struct Qdisc *q,
 {
 	struct tcf_chain *chain, *tmp;
 
-	list_for_each_entry_safe(chain, tmp, &block->chain_list, list)
-		tcf_chain_flush(chain);
+	tcf_chain_head_change_cb_del(tcf_block_chain_zero(block), ei);
 
+	if (--block->refcnt == 0) {
+		if (ei->shareable)
+			tcf_block_remove(block, block->net);
+		list_for_each_entry_safe(chain, tmp, &block->chain_list, list)
+			tcf_chain_flush(chain);
+	}
 	tcf_block_offload_unbind(block, q, ei);
-
-	INIT_WORK(&block->work, tcf_block_put_final);
-	/* Wait for existing RCU callbacks to cool down, make sure their works
-	 * have been queued before this. We can not flush pending works here
-	 * because we are holding the RTNL lock.
-	 */
-	rcu_barrier();
-	tcf_queue_work(&block->work);
+	if (block->refcnt == 0)
+		tcf_block_destroy(block);
 }
 EXPORT_SYMBOL(tcf_block_put_ext);
 
@@ -1249,12 +1398,40 @@ int tc_setup_cb_call(struct tcf_block *block, struct tcf_exts *exts,
 }
 EXPORT_SYMBOL(tc_setup_cb_call);
 
+static __net_init int tcf_net_init(struct net *net)
+{
+	struct tcf_net *tn = net_generic(net, tcf_net_id);
+
+	idr_init(&tn->idr);
+	return 0;
+}
+
+static void __net_exit tcf_net_exit(struct net *net)
+{
+	struct tcf_net *tn = net_generic(net, tcf_net_id);
+
+	idr_destroy(&tn->idr);
+}
+
+static struct pernet_operations tcf_net_ops = {
+	.init = tcf_net_init,
+	.exit = tcf_net_exit,
+	.id   = &tcf_net_id,
+	.size = sizeof(struct tcf_net),
+};
+
 static int __init tc_filter_init(void)
 {
+	int err;
+
 	tc_filter_wq = alloc_ordered_workqueue("tc_filter_workqueue", 0);
 	if (!tc_filter_wq)
 		return -ENOMEM;
 
+	err = register_pernet_subsys(&tcf_net_ops);
+	if (err)
+		return err;
+
 	rtnl_register(PF_UNSPEC, RTM_NEWTFILTER, tc_ctl_tfilter, NULL, 0);
 	rtnl_register(PF_UNSPEC, RTM_DELTFILTER, tc_ctl_tfilter, NULL, 0);
 	rtnl_register(PF_UNSPEC, RTM_GETTFILTER, tc_ctl_tfilter,
-- 
2.9.5


* [patch net-next v2 03/10] net: sched: avoid usage of tp->q in tcf_classify
  2017-11-12 15:55 [patch net-next v2 00/10] net: sched: allow qdiscs to share filter block instances Jiri Pirko
  2017-11-12 15:55 ` [patch net-next v2 01/10] cls_bpf: move prog offload->netdev check into drivers Jiri Pirko
  2017-11-12 15:55 ` [patch net-next v2 02/10] net: sched: introduce support for multiple filter chain pointers registration Jiri Pirko
@ 2017-11-12 15:55 ` Jiri Pirko
  2017-11-12 15:55 ` [patch net-next v2 04/10] net: sched: introduce block mechanism to handle netif_keep_dst calls Jiri Pirko
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 31+ messages in thread
From: Jiri Pirko @ 2017-11-12 15:55 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, jakub.kicinski, ast, daniel, simon.horman,
	pieter.jansenvanvuuren, john.hurley

From: Jiri Pirko <jiri@mellanox.com>

tcf_classify can no longer rely on tp->q, so use the block index in the
messages instead.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
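With the new format string, the notice for the cover-letter example would
read roughly as follows (block 22, prio 25, IPv4; values illustrative):

22: reclassify loop, rule prio 25, protocol 800
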
 net/sched/cls_api.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 14e1a66..a4e28d4 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -670,8 +670,9 @@ int tcf_classify(struct sk_buff *skb, const struct tcf_proto *tp,
 #ifdef CONFIG_NET_CLS_ACT
 reset:
 	if (unlikely(limit++ >= max_reclassify_loop)) {
-		net_notice_ratelimited("%s: reclassify loop, rule prio %u, protocol %02x\n",
-				       tp->q->ops->id, tp->prio & 0xffff,
+		net_notice_ratelimited("%u: reclassify loop, rule prio %u, protocol %02x\n",
+				       tp->chain->block->index,
+				       tp->prio & 0xffff,
 				       ntohs(tp->protocol));
 		return TC_ACT_SHOT;
 	}
-- 
2.9.5


* [patch net-next v2 04/10] net: sched: introduce block mechanism to handle netif_keep_dst calls
  2017-11-12 15:55 [patch net-next v2 00/10] net: sched: allow qdiscs to share filter block instances Jiri Pirko
                   ` (2 preceding siblings ...)
  2017-11-12 15:55 ` [patch net-next v2 03/10] net: sched: avoid usage of tp->q in tcf_classify Jiri Pirko
@ 2017-11-12 15:55 ` Jiri Pirko
  2017-11-13  7:47   ` Jakub Kicinski
  2017-11-12 15:55 ` [patch net-next v2 05/10] net: sched: remove classid and q fields from tcf_proto Jiri Pirko
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Jiri Pirko @ 2017-11-12 15:55 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, jakub.kicinski, ast, daniel, simon.horman,
	pieter.jansenvanvuuren, john.hurley

From: Jiri Pirko <jiri@mellanox.com>

A couple of classifiers call netif_keep_dst directly on q->dev. That is
not possible for a shared block, where multiple qdiscs own the block. So
introduce an infrastructure that keeps track of the block owners in a list
and use this list to implement a block variant of netif_keep_dst.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 include/net/pkt_cls.h     |  1 +
 include/net/sch_generic.h |  2 ++
 net/sched/cls_api.c       | 69 +++++++++++++++++++++++++++++++++++++++++++++++
 net/sched/cls_bpf.c       |  4 +--
 net/sched/cls_flow.c      |  2 +-
 net/sched/cls_route.c     |  2 +-
 6 files changed, 76 insertions(+), 4 deletions(-)

diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index c04154f..80261e7 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -40,6 +40,7 @@ bool tcf_queue_work(struct work_struct *work);
 struct tcf_chain *tcf_chain_get(struct tcf_block *block, u32 chain_index,
 				bool create);
 void tcf_chain_put(struct tcf_chain *chain);
+void tcf_block_netif_keep_dst(struct tcf_block *block);
 int tcf_block_get(struct tcf_block **p_block,
 		  struct tcf_proto __rcu **p_filter_chain, struct Qdisc *q);
 int tcf_block_get_ext(struct tcf_block **p_block, struct Qdisc *q,
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 9b4ab7e..655c788 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -279,6 +279,8 @@ struct tcf_block {
 	struct net *net;
 	struct Qdisc *q;
 	struct list_head cb_list;
+	struct list_head owner_list;
+	bool keep_dst;
 	struct work_struct work;
 };
 
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index a4e28d4..d964f2e 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -371,6 +371,7 @@ static struct tcf_block *tcf_block_create(struct net *net, struct Qdisc *q)
 		return ERR_PTR(-ENOMEM);
 	INIT_LIST_HEAD(&block->chain_list);
 	INIT_LIST_HEAD(&block->cb_list);
+	INIT_LIST_HEAD(&block->owner_list);
 
 	/* Create chain 0 by default, it has to be always present. */
 	chain = tcf_chain_create(block, 0);
@@ -425,6 +426,65 @@ static struct tcf_chain *tcf_block_chain_zero(struct tcf_block *block)
 	return list_first_entry(&block->chain_list, struct tcf_chain, list);
 }
 
+struct tcf_block_owner_item {
+	struct list_head list;
+	struct Qdisc *q;
+	enum tcf_block_binder_type binder_type;
+};
+
+static void
+tcf_block_owner_netif_keep_dst(struct tcf_block *block,
+			       struct Qdisc *q,
+			       enum tcf_block_binder_type binder_type)
+{
+	if (block->keep_dst &&
+	    binder_type != TCF_BLOCK_BINDER_TYPE_CLSACT_INGRESS &&
+	    binder_type != TCF_BLOCK_BINDER_TYPE_CLSACT_EGRESS)
+		netif_keep_dst(qdisc_dev(q));
+}
+
+void tcf_block_netif_keep_dst(struct tcf_block *block)
+{
+	struct tcf_block_owner_item *item;
+
+	block->keep_dst = true;
+	list_for_each_entry(item, &block->owner_list, list)
+		tcf_block_owner_netif_keep_dst(block, item->q,
+					       item->binder_type);
+}
+EXPORT_SYMBOL(tcf_block_netif_keep_dst);
+
+static int tcf_block_owner_add(struct tcf_block *block,
+			       struct Qdisc *q,
+			       enum tcf_block_binder_type binder_type)
+{
+	struct tcf_block_owner_item *item;
+
+	item = kmalloc(sizeof(*item), GFP_KERNEL);
+	if (!item)
+		return -ENOMEM;
+	item->q = q;
+	item->binder_type = binder_type;
+	list_add(&item->list, &block->owner_list);
+	return 0;
+}
+
+static void tcf_block_owner_del(struct tcf_block *block,
+				struct Qdisc *q,
+				enum tcf_block_binder_type binder_type)
+{
+	struct tcf_block_owner_item *item;
+
+	list_for_each_entry(item, &block->owner_list, list) {
+		if (item->q == q && item->binder_type == binder_type) {
+			list_del(&item->list);
+			kfree(item);
+			return;
+		}
+	}
+	WARN_ON(1);
+}
+
 int tcf_block_get_ext(struct tcf_block **p_block, struct Qdisc *q,
 		      struct tcf_block_ext_info *ei)
 {
@@ -451,6 +511,12 @@ int tcf_block_get_ext(struct tcf_block **p_block, struct Qdisc *q,
 		}
 	}
 
+	err = tcf_block_owner_add(block, q, ei->binder_type);
+	if (err)
+		goto err_block_owner_add;
+
+	tcf_block_owner_netif_keep_dst(block, q, ei->binder_type);
+
 	err = tcf_chain_head_change_cb_add(tcf_block_chain_zero(block), ei);
 	if (err)
 		goto err_chain_head_change_cb_add;
@@ -460,6 +526,8 @@ int tcf_block_get_ext(struct tcf_block **p_block, struct Qdisc *q,
 	return 0;
 
 err_chain_head_change_cb_add:
+	tcf_block_owner_del(block, q, ei->binder_type);
+err_block_owner_add:
 	if (created) {
 		if (ei->shareable)
 			tcf_block_remove(block, net);
@@ -502,6 +570,7 @@ void tcf_block_put_ext(struct tcf_block *block, struct Qdisc *q,
 	struct tcf_chain *chain, *tmp;
 
 	tcf_chain_head_change_cb_del(tcf_block_chain_zero(block), ei);
+	tcf_block_owner_del(block, q, ei->binder_type);
 
 	if (--block->refcnt == 0) {
 		if (ei->shareable)
diff --git a/net/sched/cls_bpf.c b/net/sched/cls_bpf.c
index 8b3927b..c7afd4b 100644
--- a/net/sched/cls_bpf.c
+++ b/net/sched/cls_bpf.c
@@ -402,8 +402,8 @@ static int cls_bpf_prog_from_efd(struct nlattr **tb, struct cls_bpf_prog *prog,
 	prog->bpf_name = name;
 	prog->filter = fp;
 
-	if (fp->dst_needed && !(tp->q->flags & TCQ_F_INGRESS))
-		netif_keep_dst(qdisc_dev(tp->q));
+	if (fp->dst_needed)
+		tcf_block_netif_keep_dst(tp->chain->block);
 
 	return 0;
 }
diff --git a/net/sched/cls_flow.c b/net/sched/cls_flow.c
index 25c2a88..28cd6fb 100644
--- a/net/sched/cls_flow.c
+++ b/net/sched/cls_flow.c
@@ -526,7 +526,7 @@ static int flow_change(struct net *net, struct sk_buff *in_skb,
 
 	timer_setup(&fnew->perturb_timer, flow_perturbation, TIMER_DEFERRABLE);
 
-	netif_keep_dst(qdisc_dev(tp->q));
+	tcf_block_netif_keep_dst(tp->chain->block);
 
 	if (tb[TCA_FLOW_KEYS]) {
 		fnew->keymask = keymask;
diff --git a/net/sched/cls_route.c b/net/sched/cls_route.c
index ac9a5b8..a1f2b1b 100644
--- a/net/sched/cls_route.c
+++ b/net/sched/cls_route.c
@@ -527,7 +527,7 @@ static int route4_change(struct net *net, struct sk_buff *in_skb,
 		if (f->handle < f1->handle)
 			break;
 
-	netif_keep_dst(qdisc_dev(tp->q));
+	tcf_block_netif_keep_dst(tp->chain->block);
 	rcu_assign_pointer(f->next, f1);
 	rcu_assign_pointer(*fp, f);
 
-- 
2.9.5


* [patch net-next v2 05/10] net: sched: remove classid and q fields from tcf_proto
  2017-11-12 15:55 [patch net-next v2 00/10] net: sched: allow qdiscs to share filter block instances Jiri Pirko
                   ` (3 preceding siblings ...)
  2017-11-12 15:55 ` [patch net-next v2 04/10] net: sched: introduce block mechanism to handle netif_keep_dst calls Jiri Pirko
@ 2017-11-12 15:55 ` Jiri Pirko
  2017-11-12 15:56 ` [patch net-next v2 06/10] net: sched: allow ingress and clsact qdiscs to share filter blocks Jiri Pirko
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 31+ messages in thread
From: Jiri Pirko @ 2017-11-12 15:55 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, jakub.kicinski, ast, daniel, simon.horman,
	pieter.jansenvanvuuren, john.hurley

From: Jiri Pirko <jiri@mellanox.com>

Both are no longer used, so remove them.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 include/net/sch_generic.h | 2 --
 net/sched/cls_api.c       | 7 ++-----
 2 files changed, 2 insertions(+), 7 deletions(-)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 655c788..132ce94 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -245,8 +245,6 @@ struct tcf_proto {
 
 	/* All the rest */
 	u32			prio;
-	u32			classid;
-	struct Qdisc		*q;
 	void			*data;
 	const struct tcf_proto_ops	*ops;
 	struct tcf_chain	*chain;
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index d964f2e..ee37c62 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -123,8 +123,7 @@ static inline u32 tcf_auto_prio(struct tcf_proto *tp)
 }
 
 static struct tcf_proto *tcf_proto_create(const char *kind, u32 protocol,
-					  u32 prio, u32 parent, struct Qdisc *q,
-					  struct tcf_chain *chain)
+					  u32 prio, struct tcf_chain *chain)
 {
 	struct tcf_proto *tp;
 	int err;
@@ -158,8 +157,6 @@ static struct tcf_proto *tcf_proto_create(const char *kind, u32 protocol,
 	tp->classify = tp->ops->classify;
 	tp->protocol = protocol;
 	tp->prio = prio;
-	tp->classid = parent;
-	tp->q = q;
 	tp->chain = chain;
 
 	err = tp->ops->init(tp);
@@ -1067,7 +1064,7 @@ static int tc_ctl_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
 			prio = tcf_auto_prio(tcf_chain_tp_prev(&chain_info));
 
 		tp = tcf_proto_create(nla_data(tca[TCA_KIND]),
-				      protocol, prio, parent, q, chain);
+				      protocol, prio, chain);
 		if (IS_ERR(tp)) {
 			err = PTR_ERR(tp);
 			goto errout;
-- 
2.9.5


* [patch net-next v2 06/10] net: sched: allow ingress and clsact qdiscs to share filter blocks
  2017-11-12 15:55 [patch net-next v2 00/10] net: sched: allow qdiscs to share filter block instances Jiri Pirko
                   ` (4 preceding siblings ...)
  2017-11-12 15:55 ` [patch net-next v2 05/10] net: sched: remove classid and q fields from tcf_proto Jiri Pirko
@ 2017-11-12 15:56 ` Jiri Pirko
  2017-11-13  7:54   ` Jakub Kicinski
  2017-11-12 15:56 ` [patch net-next v2 07/10] mlxsw: spectrum_acl: Reshuffle code around mlxsw_sp_acl_ruleset_create/destroy Jiri Pirko
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Jiri Pirko @ 2017-11-12 15:56 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, jakub.kicinski, ast, daniel, simon.horman,
	pieter.jansenvanvuuren, john.hurley

From: Jiri Pirko <jiri@mellanox.com>

Benefit from the previously introduced shared filter block infrastructure
and allow ingress and clsact qdisc instances to share filter blocks. The
block index comes from userspace as a qdisc option.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 include/uapi/linux/pkt_sched.h | 11 ++++++
 net/sched/sch_ingress.c        | 89 +++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 99 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/pkt_sched.h b/include/uapi/linux/pkt_sched.h
index 6a2c5ea..0a122a2 100644
--- a/include/uapi/linux/pkt_sched.h
+++ b/include/uapi/linux/pkt_sched.h
@@ -925,4 +925,15 @@ enum {
 
 #define TCA_CBS_MAX (__TCA_CBS_MAX - 1)
 
+/* Ingress/clsact */
+
+enum {
+	TCA_CLSACT_UNSPEC,
+	TCA_CLSACT_INGRESS_BLOCK,
+	TCA_CLSACT_EGRESS_BLOCK,
+	__TCA_CLSACT_MAX
+};
+
+#define TCA_CLSACT_MAX	(__TCA_CLSACT_MAX - 1)
+
 #endif
diff --git a/net/sched/sch_ingress.c b/net/sched/sch_ingress.c
index 5ecc38f..ee89efc 100644
--- a/net/sched/sch_ingress.c
+++ b/net/sched/sch_ingress.c
@@ -60,6 +60,29 @@ static void clsact_chain_head_change(struct tcf_proto *tp_head, void *priv)
 	struct mini_Qdisc_pair *miniqp = priv;
 
 	mini_qdisc_pair_swap(miniqp, tp_head);
+}
+
+static const struct nla_policy ingress_policy[TCA_CLSACT_MAX + 1] = {
+	[TCA_CLSACT_INGRESS_BLOCK]	= { .type = NLA_U32 },
+};
+
+static int ingress_parse_opt(struct nlattr *opt, u32 *p_ingress_block_index)
+{
+	struct nlattr *tb[TCA_CLSACT_MAX + 1];
+	int err;
+
+	*p_ingress_block_index = 0;
+
+	if (!opt)
+		return 0;
+	err = nla_parse_nested(tb, TCA_CLSACT_MAX, opt, ingress_policy, NULL);
+	if (err)
+		return err;
+
+	if (tb[TCA_CLSACT_INGRESS_BLOCK])
+		*p_ingress_block_index =
+			nla_get_u32(tb[TCA_CLSACT_INGRESS_BLOCK]);
+	return 0;
 }
 
 static int ingress_init(struct Qdisc *sch, struct nlattr *opt)
@@ -70,6 +93,11 @@ static int ingress_init(struct Qdisc *sch, struct nlattr *opt)
 
 	mini_qdisc_pair_init(&q->miniqp, sch, &dev->miniq_ingress);
 
+	err = ingress_parse_opt(opt, &q->block_info.block_index);
+	if (err)
+		return err;
+
+	q->block_info.shareable = true;
 	q->block_info.binder_type = TCF_BLOCK_BINDER_TYPE_CLSACT_INGRESS;
 	q->block_info.chain_head_change = clsact_chain_head_change;
 	q->block_info.chain_head_change_priv = &q->miniqp;
@@ -94,11 +122,14 @@ static void ingress_destroy(struct Qdisc *sch)
 
 static int ingress_dump(struct Qdisc *sch, struct sk_buff *skb)
 {
+	struct ingress_sched_data *q = qdisc_priv(sch);
 	struct nlattr *nest;
 
 	nest = nla_nest_start(skb, TCA_OPTIONS);
 	if (nest == NULL)
 		goto nla_put_failure;
+	if (nla_put_u32(skb, TCA_CLSACT_INGRESS_BLOCK, q->block->index))
+		goto nla_put_failure;
 
 	return nla_nest_end(skb, nest);
 
@@ -166,6 +197,35 @@ static struct tcf_block *clsact_tcf_block(struct Qdisc *sch, unsigned long cl)
 	}
 }
 
+static const struct nla_policy clsact_policy[TCA_CLSACT_MAX + 1] = {
+	[TCA_CLSACT_INGRESS_BLOCK]	= { .type = NLA_U32 },
+	[TCA_CLSACT_EGRESS_BLOCK]	= { .type = NLA_U32 },
+};
+
+static int clsact_parse_opt(struct nlattr *opt, u32 *p_ingress_block_index,
+			    u32 *p_egress_block_index)
+{
+	struct nlattr *tb[TCA_CLSACT_MAX + 1];
+	int err;
+
+	*p_ingress_block_index = 0;
+	*p_egress_block_index = 0;
+
+	if (!opt)
+		return 0;
+	err = nla_parse_nested(tb, TCA_CLSACT_MAX, opt, clsact_policy, NULL);
+	if (err)
+		return err;
+
+	if (tb[TCA_CLSACT_INGRESS_BLOCK])
+		*p_ingress_block_index =
+			nla_get_u32(tb[TCA_CLSACT_INGRESS_BLOCK]);
+	if (tb[TCA_CLSACT_EGRESS_BLOCK])
+		*p_egress_block_index =
+			nla_get_u32(tb[TCA_CLSACT_EGRESS_BLOCK]);
+	return 0;
+}
+
 static int clsact_init(struct Qdisc *sch, struct nlattr *opt)
 {
 	struct clsact_sched_data *q = qdisc_priv(sch);
@@ -174,6 +234,12 @@ static int clsact_init(struct Qdisc *sch, struct nlattr *opt)
 
 	mini_qdisc_pair_init(&q->miniqp_ingress, sch, &dev->miniq_ingress);
 
+	err = clsact_parse_opt(opt, &q->ingress_block_info.block_index,
+			       &q->egress_block_info.block_index);
+	if (err)
+		return err;
+
+	q->ingress_block_info.shareable = true;
 	q->ingress_block_info.binder_type = TCF_BLOCK_BINDER_TYPE_CLSACT_INGRESS;
 	q->ingress_block_info.chain_head_change = clsact_chain_head_change;
 	q->ingress_block_info.chain_head_change_priv = &q->miniqp_ingress;
@@ -184,6 +250,7 @@ static int clsact_init(struct Qdisc *sch, struct nlattr *opt)
 
 	mini_qdisc_pair_init(&q->miniqp_egress, sch, &dev->miniq_egress);
 
+	q->egress_block_info.shareable = true;
 	q->egress_block_info.binder_type = TCF_BLOCK_BINDER_TYPE_CLSACT_EGRESS;
 	q->egress_block_info.chain_head_change = clsact_chain_head_change;
 	q->egress_block_info.chain_head_change_priv = &q->miniqp_egress;
@@ -215,6 +282,26 @@ static void clsact_destroy(struct Qdisc *sch)
 	net_dec_egress_queue();
 }
 
+static int clsact_dump(struct Qdisc *sch, struct sk_buff *skb)
+{
+	struct clsact_sched_data *q = qdisc_priv(sch);
+	struct nlattr *nest;
+
+	nest = nla_nest_start(skb, TCA_OPTIONS);
+	if (!nest)
+		goto nla_put_failure;
+	if (nla_put_u32(skb, TCA_CLSACT_INGRESS_BLOCK, q->ingress_block->index))
+		goto nla_put_failure;
+	if (nla_put_u32(skb, TCA_CLSACT_EGRESS_BLOCK, q->egress_block->index))
+		goto nla_put_failure;
+
+	return nla_nest_end(skb, nest);
+
+nla_put_failure:
+	nla_nest_cancel(skb, nest);
+	return -1;
+}
+
 static const struct Qdisc_class_ops clsact_class_ops = {
 	.leaf		=	ingress_leaf,
 	.find		=	clsact_find,
@@ -230,7 +317,7 @@ static struct Qdisc_ops clsact_qdisc_ops __read_mostly = {
 	.priv_size	=	sizeof(struct clsact_sched_data),
 	.init		=	clsact_init,
 	.destroy	=	clsact_destroy,
-	.dump		=	ingress_dump,
+	.dump		=	clsact_dump,
 	.owner		=	THIS_MODULE,
 };
 
-- 
2.9.5


* [patch net-next v2 07/10] mlxsw: spectrum_acl: Reshuffle code around mlxsw_sp_acl_ruleset_create/destroy
  2017-11-12 15:55 [patch net-next v2 00/10] net: sched: allow qdiscs to share filter block instances Jiri Pirko
                   ` (5 preceding siblings ...)
  2017-11-12 15:56 ` [patch net-next v2 06/10] net: sched: allow ingress and clsact qdiscs to share filter blocks Jiri Pirko
@ 2017-11-12 15:56 ` Jiri Pirko
  2017-11-12 15:56 ` [patch net-next v2 08/10] mlxsw: spectrum_acl: Don't store netdev and ingress for ruleset unbind Jiri Pirko
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 31+ messages in thread
From: Jiri Pirko @ 2017-11-12 15:56 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, jakub.kicinski, ast, daniel, simon.horman,
	pieter.jansenvanvuuren, john.hurley

From: Jiri Pirko <jiri@mellanox.com>

In order to prepare for follow-up changes, make the bind/unbind helpers
very simple. That requires moving the ht insertion/removal and the
bind/unbind calls into mlxsw_sp_acl_ruleset_create/destroy.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c | 102 ++++++++++-----------
 1 file changed, 46 insertions(+), 56 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
index 93dcd31..ead4cb8 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
@@ -118,8 +118,26 @@ struct mlxsw_sp_fid *mlxsw_sp_acl_dummy_fid(struct mlxsw_sp *mlxsw_sp)
 	return mlxsw_sp->acl->dummy_fid;
 }
 
+static int mlxsw_sp_acl_ruleset_bind(struct mlxsw_sp *mlxsw_sp,
+				     struct mlxsw_sp_acl_ruleset *ruleset,
+				     struct net_device *dev, bool ingress)
+{
+	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
+
+	return ops->ruleset_bind(mlxsw_sp, ruleset->priv, dev, ingress);
+}
+
+static void mlxsw_sp_acl_ruleset_unbind(struct mlxsw_sp *mlxsw_sp,
+					struct mlxsw_sp_acl_ruleset *ruleset)
+{
+	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
+
+	ops->ruleset_unbind(mlxsw_sp, ruleset->priv);
+}
+
 static struct mlxsw_sp_acl_ruleset *
-mlxsw_sp_acl_ruleset_create(struct mlxsw_sp *mlxsw_sp,
+mlxsw_sp_acl_ruleset_create(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
+			    bool ingress, u32 chain_index,
 			    const struct mlxsw_sp_acl_profile_ops *ops)
 {
 	struct mlxsw_sp_acl *acl = mlxsw_sp->acl;
@@ -132,6 +150,9 @@ mlxsw_sp_acl_ruleset_create(struct mlxsw_sp *mlxsw_sp,
 	if (!ruleset)
 		return ERR_PTR(-ENOMEM);
 	ruleset->ref_count = 1;
+	ruleset->ht_key.dev = dev;
+	ruleset->ht_key.ingress = ingress;
+	ruleset->ht_key.chain_index = chain_index;
 	ruleset->ht_key.ops = ops;
 
 	err = rhashtable_init(&ruleset->rule_ht, &mlxsw_sp_acl_rule_ht_params);
@@ -142,68 +163,49 @@ mlxsw_sp_acl_ruleset_create(struct mlxsw_sp *mlxsw_sp,
 	if (err)
 		goto err_ops_ruleset_add;
 
-	return ruleset;
-
-err_ops_ruleset_add:
-	rhashtable_destroy(&ruleset->rule_ht);
-err_rhashtable_init:
-	kfree(ruleset);
-	return ERR_PTR(err);
-}
-
-static void mlxsw_sp_acl_ruleset_destroy(struct mlxsw_sp *mlxsw_sp,
-					 struct mlxsw_sp_acl_ruleset *ruleset)
-{
-	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
-
-	ops->ruleset_del(mlxsw_sp, ruleset->priv);
-	rhashtable_destroy(&ruleset->rule_ht);
-	kfree(ruleset);
-}
-
-static int mlxsw_sp_acl_ruleset_bind(struct mlxsw_sp *mlxsw_sp,
-				     struct mlxsw_sp_acl_ruleset *ruleset,
-				     struct net_device *dev, bool ingress,
-				     u32 chain_index)
-{
-	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
-	struct mlxsw_sp_acl *acl = mlxsw_sp->acl;
-	int err;
-
-	ruleset->ht_key.dev = dev;
-	ruleset->ht_key.ingress = ingress;
-	ruleset->ht_key.chain_index = chain_index;
 	err = rhashtable_insert_fast(&acl->ruleset_ht, &ruleset->ht_node,
 				     mlxsw_sp_acl_ruleset_ht_params);
 	if (err)
-		return err;
-	if (!ruleset->ht_key.chain_index) {
+		goto err_ht_insert;
+
+	if (!chain_index) {
 		/* We only need ruleset with chain index 0, the implicit one,
 		 * to be directly bound to device. The rest of the rulesets
 		 * are bound by "Goto action set".
 		 */
-		err = ops->ruleset_bind(mlxsw_sp, ruleset->priv, dev, ingress);
+		err = mlxsw_sp_acl_ruleset_bind(mlxsw_sp, ruleset,
+						dev, ingress);
 		if (err)
-			goto err_ops_ruleset_bind;
+			goto err_ruleset_bind;
 	}
-	return 0;
 
-err_ops_ruleset_bind:
+	return ruleset;
+
+err_ruleset_bind:
 	rhashtable_remove_fast(&acl->ruleset_ht, &ruleset->ht_node,
 			       mlxsw_sp_acl_ruleset_ht_params);
-	return err;
+err_ht_insert:
+	ops->ruleset_del(mlxsw_sp, ruleset->priv);
+err_ops_ruleset_add:
+	rhashtable_destroy(&ruleset->rule_ht);
+err_rhashtable_init:
+	kfree(ruleset);
+	return ERR_PTR(err);
 }
 
-static void mlxsw_sp_acl_ruleset_unbind(struct mlxsw_sp *mlxsw_sp,
-					struct mlxsw_sp_acl_ruleset *ruleset)
+static void mlxsw_sp_acl_ruleset_destroy(struct mlxsw_sp *mlxsw_sp,
+					 struct mlxsw_sp_acl_ruleset *ruleset)
 {
 	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
 	struct mlxsw_sp_acl *acl = mlxsw_sp->acl;
 
 	if (!ruleset->ht_key.chain_index)
-		ops->ruleset_unbind(mlxsw_sp, ruleset->priv);
+		mlxsw_sp_acl_ruleset_unbind(mlxsw_sp, ruleset);
 	rhashtable_remove_fast(&acl->ruleset_ht, &ruleset->ht_node,
 			       mlxsw_sp_acl_ruleset_ht_params);
+	ops->ruleset_del(mlxsw_sp, ruleset->priv);
+	rhashtable_destroy(&ruleset->rule_ht);
+	kfree(ruleset);
 }
 
 static void mlxsw_sp_acl_ruleset_ref_inc(struct mlxsw_sp_acl_ruleset *ruleset)
@@ -216,7 +218,6 @@ static void mlxsw_sp_acl_ruleset_ref_dec(struct mlxsw_sp *mlxsw_sp,
 {
 	if (--ruleset->ref_count)
 		return;
-	mlxsw_sp_acl_ruleset_unbind(mlxsw_sp, ruleset);
 	mlxsw_sp_acl_ruleset_destroy(mlxsw_sp, ruleset);
 }
 
@@ -263,7 +264,6 @@ mlxsw_sp_acl_ruleset_get(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
 	const struct mlxsw_sp_acl_profile_ops *ops;
 	struct mlxsw_sp_acl *acl = mlxsw_sp->acl;
 	struct mlxsw_sp_acl_ruleset *ruleset;
-	int err;
 
 	ops = acl->ops->profile_ops(mlxsw_sp, profile);
 	if (!ops)
@@ -275,18 +275,8 @@ mlxsw_sp_acl_ruleset_get(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
 		mlxsw_sp_acl_ruleset_ref_inc(ruleset);
 		return ruleset;
 	}
-	ruleset = mlxsw_sp_acl_ruleset_create(mlxsw_sp, ops);
-	if (IS_ERR(ruleset))
-		return ruleset;
-	err = mlxsw_sp_acl_ruleset_bind(mlxsw_sp, ruleset, dev,
-					ingress, chain_index);
-	if (err)
-		goto err_ruleset_bind;
-	return ruleset;
-
-err_ruleset_bind:
-	mlxsw_sp_acl_ruleset_destroy(mlxsw_sp, ruleset);
-	return ERR_PTR(err);
+	return mlxsw_sp_acl_ruleset_create(mlxsw_sp, dev, ingress,
+					   chain_index, ops);
 }
 
 void mlxsw_sp_acl_ruleset_put(struct mlxsw_sp *mlxsw_sp,
-- 
2.9.5


* [patch net-next v2 08/10] mlxsw: spectrum_acl: Don't store netdev and ingress for ruleset unbind
  2017-11-12 15:55 [patch net-next v2 00/10] net: sched: allow qdiscs to share filter block instances Jiri Pirko
                   ` (6 preceding siblings ...)
  2017-11-12 15:56 ` [patch net-next v2 07/10] mlxsw: spectrum_acl: Reshuffle code around mlxsw_sp_acl_ruleset_create/destroy Jiri Pirko
@ 2017-11-12 15:56 ` Jiri Pirko
  2017-11-12 15:56 ` [patch net-next v2 09/10] mlxsw: spectrum_acl: Implement TC block sharing Jiri Pirko
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 31+ messages in thread
From: Jiri Pirko @ 2017-11-12 15:56 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, jakub.kicinski, ast, daniel, simon.horman,
	pieter.jansenvanvuuren, john.hurley

From: Jiri Pirko <jiri@mellanox.com>

Instead, pass the netdev and the ingress flag to the ruleset unbind op.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum.h     |  3 +-
 drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c |  9 ++++--
 .../ethernet/mellanox/mlxsw/spectrum_acl_tcam.c    | 33 +++++++++++-----------
 3 files changed, 24 insertions(+), 21 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
index 58cf222..f502e2e 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
@@ -476,7 +476,8 @@ struct mlxsw_sp_acl_profile_ops {
 	void (*ruleset_del)(struct mlxsw_sp *mlxsw_sp, void *ruleset_priv);
 	int (*ruleset_bind)(struct mlxsw_sp *mlxsw_sp, void *ruleset_priv,
 			    struct net_device *dev, bool ingress);
-	void (*ruleset_unbind)(struct mlxsw_sp *mlxsw_sp, void *ruleset_priv);
+	void (*ruleset_unbind)(struct mlxsw_sp *mlxsw_sp, void *ruleset_priv,
+			       struct net_device *dev, bool ingress);
 	u16 (*ruleset_group_id)(void *ruleset_priv);
 	size_t rule_priv_size;
 	int (*rule_add)(struct mlxsw_sp *mlxsw_sp,
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
index ead4cb8..7fb41a4 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
@@ -128,11 +128,12 @@ static int mlxsw_sp_acl_ruleset_bind(struct mlxsw_sp *mlxsw_sp,
 }
 
 static void mlxsw_sp_acl_ruleset_unbind(struct mlxsw_sp *mlxsw_sp,
-					struct mlxsw_sp_acl_ruleset *ruleset)
+					struct mlxsw_sp_acl_ruleset *ruleset,
+					struct net_device *dev, bool ingress)
 {
 	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
 
-	ops->ruleset_unbind(mlxsw_sp, ruleset->priv);
+	ops->ruleset_unbind(mlxsw_sp, ruleset->priv, dev, ingress);
 }
 
 static struct mlxsw_sp_acl_ruleset *
@@ -200,7 +201,9 @@ static void mlxsw_sp_acl_ruleset_destroy(struct mlxsw_sp *mlxsw_sp,
 	struct mlxsw_sp_acl *acl = mlxsw_sp->acl;
 
 	if (!ruleset->ht_key.chain_index)
-		mlxsw_sp_acl_ruleset_unbind(mlxsw_sp, ruleset);
+		mlxsw_sp_acl_ruleset_unbind(mlxsw_sp, ruleset,
+					    ruleset->ht_key.dev,
+					    ruleset->ht_key.ingress);
 	rhashtable_remove_fast(&acl->ruleset_ht, &ruleset->ht_node,
 			       mlxsw_sp_acl_ruleset_ht_params);
 	ops->ruleset_del(mlxsw_sp, ruleset->priv);
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c
index 7e8284b..50b2f9a 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c
@@ -154,10 +154,6 @@ struct mlxsw_sp_acl_tcam_group {
 	struct list_head region_list;
 	unsigned int region_count;
 	struct rhashtable chunk_ht;
-	struct {
-		u16 local_port;
-		bool ingress;
-	} bound;
 	struct mlxsw_sp_acl_tcam_group_ops *ops;
 	const struct mlxsw_sp_acl_tcam_pattern *patterns;
 	unsigned int patterns_count;
@@ -271,26 +267,28 @@ mlxsw_sp_acl_tcam_group_bind(struct mlxsw_sp *mlxsw_sp,
 		return -EINVAL;
 
 	mlxsw_sp_port = netdev_priv(dev);
-	group->bound.local_port = mlxsw_sp_port->local_port;
-	group->bound.ingress = ingress;
-	mlxsw_reg_ppbt_pack(ppbt_pl,
-			    group->bound.ingress ? MLXSW_REG_PXBT_E_IACL :
-						   MLXSW_REG_PXBT_E_EACL,
-			    MLXSW_REG_PXBT_OP_BIND, group->bound.local_port,
+	mlxsw_reg_ppbt_pack(ppbt_pl, ingress ? MLXSW_REG_PXBT_E_IACL :
+					       MLXSW_REG_PXBT_E_EACL,
+			    MLXSW_REG_PXBT_OP_BIND, mlxsw_sp_port->local_port,
 			    group->id);
 	return mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(ppbt), ppbt_pl);
 }
 
 static void
 mlxsw_sp_acl_tcam_group_unbind(struct mlxsw_sp *mlxsw_sp,
-			       struct mlxsw_sp_acl_tcam_group *group)
+			       struct mlxsw_sp_acl_tcam_group *group,
+			       struct net_device *dev, bool ingress)
 {
+	struct mlxsw_sp_port *mlxsw_sp_port;
 	char ppbt_pl[MLXSW_REG_PPBT_LEN];
 
-	mlxsw_reg_ppbt_pack(ppbt_pl,
-			    group->bound.ingress ? MLXSW_REG_PXBT_E_IACL :
-						   MLXSW_REG_PXBT_E_EACL,
-			    MLXSW_REG_PXBT_OP_UNBIND, group->bound.local_port,
+	if (WARN_ON(!mlxsw_sp_port_dev_check(dev)))
+		return;
+
+	mlxsw_sp_port = netdev_priv(dev);
+	mlxsw_reg_ppbt_pack(ppbt_pl, ingress ? MLXSW_REG_PXBT_E_IACL :
+					       MLXSW_REG_PXBT_E_EACL,
+			    MLXSW_REG_PXBT_OP_UNBIND, mlxsw_sp_port->local_port,
 			    group->id);
 	mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(ppbt), ppbt_pl);
 }
@@ -1066,11 +1064,12 @@ mlxsw_sp_acl_tcam_flower_ruleset_bind(struct mlxsw_sp *mlxsw_sp,
 
 static void
 mlxsw_sp_acl_tcam_flower_ruleset_unbind(struct mlxsw_sp *mlxsw_sp,
-					void *ruleset_priv)
+					void *ruleset_priv,
+					struct net_device *dev, bool ingress)
 {
 	struct mlxsw_sp_acl_tcam_flower_ruleset *ruleset = ruleset_priv;
 
-	mlxsw_sp_acl_tcam_group_unbind(mlxsw_sp, &ruleset->group);
+	mlxsw_sp_acl_tcam_group_unbind(mlxsw_sp, &ruleset->group, dev, ingress);
 }
 
 static u16
-- 
2.9.5


* [patch net-next v2 09/10] mlxsw: spectrum_acl: Implement TC block sharing
  2017-11-12 15:55 [patch net-next v2 00/10] net: sched: allow qdiscs to share filter block instances Jiri Pirko
                   ` (7 preceding siblings ...)
  2017-11-12 15:56 ` [patch net-next v2 08/10] mlxsw: spectrum_acl: Don't store netdev and ingress for ruleset unbind Jiri Pirko
@ 2017-11-12 15:56 ` Jiri Pirko
  2017-11-12 15:56 ` [patch net-next v2 10/10] mlxsw: spectrum_acl: Pass mlxsw_sp_port down to ruleset bind/unbind ops Jiri Pirko
  2017-11-15 23:12 ` [patch net-next v2 00/10] net: sched: allow qdiscs to share filter block instances Cong Wang
  10 siblings, 0 replies; 31+ messages in thread
From: Jiri Pirko @ 2017-11-12 15:56 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, jakub.kicinski, ast, daniel, simon.horman,
	pieter.jansenvanvuuren, john.hurley

From: Jiri Pirko <jiri@mellanox.com>

Benefit from the prepared TC and in-driver ACL infrastructure and introduce
offloading of shared blocks. For that, a new "block" struct is introduced in
spectrum_acl to hold a list of the block-port bindings.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum.c     | 146 +++++++++++---
 drivers/net/ethernet/mellanox/mlxsw/spectrum.h     |  31 ++-
 drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c | 218 +++++++++++++++++----
 .../net/ethernet/mellanox/mlxsw/spectrum_flower.c  |  41 ++--
 4 files changed, 350 insertions(+), 86 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
index b2cd1eb..e989519 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
@@ -1750,25 +1750,27 @@ static int mlxsw_sp_setup_tc_cls_matchall(struct mlxsw_sp_port *mlxsw_sp_port,
 }
 
 static int
-mlxsw_sp_setup_tc_cls_flower(struct mlxsw_sp_port *mlxsw_sp_port,
-			     struct tc_cls_flower_offload *f,
-			     bool ingress)
+mlxsw_sp_setup_tc_cls_flower(struct mlxsw_sp_acl_block *acl_block,
+			     struct tc_cls_flower_offload *f)
 {
+	struct mlxsw_sp *mlxsw_sp = mlxsw_sp_acl_block_mlxsw_sp(acl_block);
+
 	switch (f->command) {
 	case TC_CLSFLOWER_REPLACE:
-		return mlxsw_sp_flower_replace(mlxsw_sp_port, ingress, f);
+		return mlxsw_sp_flower_replace(mlxsw_sp, acl_block, f);
 	case TC_CLSFLOWER_DESTROY:
-		mlxsw_sp_flower_destroy(mlxsw_sp_port, ingress, f);
+		mlxsw_sp_flower_destroy(mlxsw_sp, acl_block, f);
 		return 0;
 	case TC_CLSFLOWER_STATS:
-		return mlxsw_sp_flower_stats(mlxsw_sp_port, ingress, f);
+		return mlxsw_sp_flower_stats(mlxsw_sp, acl_block, f);
 	default:
 		return -EOPNOTSUPP;
 	}
 }
 
-static int mlxsw_sp_setup_tc_block_cb(enum tc_setup_type type, void *type_data,
-				      void *cb_priv, bool ingress)
+static int mlxsw_sp_setup_tc_block_cb_matchall(enum tc_setup_type type,
+					       void *type_data,
+					       void *cb_priv, bool ingress)
 {
 	struct mlxsw_sp_port *mlxsw_sp_port = cb_priv;
 
@@ -1780,42 +1782,140 @@ static int mlxsw_sp_setup_tc_block_cb(enum tc_setup_type type, void *type_data,
 		return mlxsw_sp_setup_tc_cls_matchall(mlxsw_sp_port, type_data,
 						      ingress);
 	case TC_SETUP_CLSFLOWER:
-		return mlxsw_sp_setup_tc_cls_flower(mlxsw_sp_port, type_data,
-						    ingress);
+		return 0;
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static int mlxsw_sp_setup_tc_block_cb_matchall_ig(enum tc_setup_type type,
+						  void *type_data,
+						  void *cb_priv)
+{
+	return mlxsw_sp_setup_tc_block_cb_matchall(type, type_data,
+						   cb_priv, true);
+}
+
+static int mlxsw_sp_setup_tc_block_cb_matchall_eg(enum tc_setup_type type,
+						  void *type_data,
+						  void *cb_priv)
+{
+	return mlxsw_sp_setup_tc_block_cb_matchall(type, type_data,
+						   cb_priv, false);
+}
+
+static int mlxsw_sp_setup_tc_block_cb_flower(enum tc_setup_type type,
+					     void *type_data, void *cb_priv)
+{
+	struct mlxsw_sp_acl_block *acl_block = cb_priv;
+
+	switch (type) {
+	case TC_SETUP_CLSMATCHALL:
+		return 0;
+	case TC_SETUP_CLSFLOWER:
+		return mlxsw_sp_setup_tc_cls_flower(acl_block, type_data);
 	default:
 		return -EOPNOTSUPP;
 	}
 }
 
-static int mlxsw_sp_setup_tc_block_cb_ig(enum tc_setup_type type,
-					 void *type_data, void *cb_priv)
+static int
+mlxsw_sp_setup_tc_block_flower_bind(struct mlxsw_sp_port *mlxsw_sp_port,
+				    struct tcf_block *block, bool ingress)
 {
-	return mlxsw_sp_setup_tc_block_cb(type, type_data, cb_priv, true);
+	struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
+	struct mlxsw_sp_acl_block *acl_block;
+	struct tcf_block_cb *block_cb;
+	int err;
+
+	block_cb = tcf_block_cb_lookup(block, mlxsw_sp_setup_tc_block_cb_flower,
+				       mlxsw_sp);
+	if (!block_cb) {
+		acl_block = mlxsw_sp_acl_block_create(mlxsw_sp, block->net);
+		if (!acl_block)
+			return -ENOMEM;
+		block_cb = __tcf_block_cb_register(block,
+						   mlxsw_sp_setup_tc_block_cb_flower,
+						   mlxsw_sp, acl_block);
+		if (!block_cb) {
+			err = -ENOMEM;
+			goto err_cb_register;
+		}
+	} else {
+		acl_block = tcf_block_cb_priv(block_cb);
+	}
+	tcf_block_cb_incref(block_cb);
+	err = mlxsw_sp_acl_block_bind(mlxsw_sp, acl_block,
+				      mlxsw_sp_port, ingress);
+	if (err)
+		goto err_block_bind;
+	return 0;
+
+err_block_bind:
+	if (!tcf_block_cb_decref(block_cb)) {
+		__tcf_block_cb_unregister(block_cb);
+err_cb_register:
+		mlxsw_sp_acl_block_destroy(acl_block);
+	}
+	return err;
 }
 
-static int mlxsw_sp_setup_tc_block_cb_eg(enum tc_setup_type type,
-					 void *type_data, void *cb_priv)
+static void
+mlxsw_sp_setup_tc_block_flower_unbind(struct mlxsw_sp_port *mlxsw_sp_port,
+				      struct tcf_block *block, bool ingress)
 {
-	return mlxsw_sp_setup_tc_block_cb(type, type_data, cb_priv, false);
+	struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
+	struct mlxsw_sp_acl_block *acl_block;
+	struct tcf_block_cb *block_cb;
+	int err;
+
+	block_cb = tcf_block_cb_lookup(block, mlxsw_sp_setup_tc_block_cb_flower,
+				       mlxsw_sp);
+	if (!block_cb)
+		return;
+
+	acl_block = tcf_block_cb_priv(block_cb);
+	err = mlxsw_sp_acl_block_unbind(mlxsw_sp, acl_block,
+					mlxsw_sp_port, ingress);
+	if (!err && !tcf_block_cb_decref(block_cb)) {
+		__tcf_block_cb_unregister(block_cb);
+		mlxsw_sp_acl_block_destroy(acl_block);
+	}
 }
 
 static int mlxsw_sp_setup_tc_block(struct mlxsw_sp_port *mlxsw_sp_port,
 				   struct tc_block_offload *f)
 {
 	tc_setup_cb_t *cb;
+	bool ingress;
+	int err;
 
-	if (f->binder_type == TCF_BLOCK_BINDER_TYPE_CLSACT_INGRESS)
-		cb = mlxsw_sp_setup_tc_block_cb_ig;
-	else if (f->binder_type == TCF_BLOCK_BINDER_TYPE_CLSACT_EGRESS)
-		cb = mlxsw_sp_setup_tc_block_cb_eg;
-	else
+	if (f->binder_type == TCF_BLOCK_BINDER_TYPE_CLSACT_INGRESS) {
+		cb = mlxsw_sp_setup_tc_block_cb_matchall_ig;
+		ingress = true;
+	} else if (f->binder_type == TCF_BLOCK_BINDER_TYPE_CLSACT_EGRESS) {
+		cb = mlxsw_sp_setup_tc_block_cb_matchall_eg;
+		ingress = false;
+	} else {
 		return -EOPNOTSUPP;
+	}
 
 	switch (f->command) {
 	case TC_BLOCK_BIND:
-		return tcf_block_cb_register(f->block, cb, mlxsw_sp_port,
-					     mlxsw_sp_port);
+		err = tcf_block_cb_register(f->block, cb, mlxsw_sp_port,
+					    mlxsw_sp_port);
+		if (err)
+			return err;
+		err = mlxsw_sp_setup_tc_block_flower_bind(mlxsw_sp_port,
+							  f->block, ingress);
+		if (err) {
+			tcf_block_cb_unregister(f->block, cb, mlxsw_sp_port);
+			return err;
+		}
+		return 0;
 	case TC_BLOCK_UNBIND:
+		mlxsw_sp_setup_tc_block_flower_unbind(mlxsw_sp_port,
+						      f->block, ingress);
 		tcf_block_cb_unregister(f->block, cb, mlxsw_sp_port);
 		return 0;
 	default:
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
index f502e2e..4d375f3 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
@@ -497,17 +497,31 @@ struct mlxsw_sp_acl_ops {
 				       enum mlxsw_sp_acl_profile profile);
 };
 
+struct mlxsw_sp_acl_block;
 struct mlxsw_sp_acl_ruleset;
 
 /* spectrum_acl.c */
 struct mlxsw_afk *mlxsw_sp_acl_afk(struct mlxsw_sp_acl *acl);
+struct mlxsw_sp *mlxsw_sp_acl_block_mlxsw_sp(struct mlxsw_sp_acl_block *block);
+struct net *mlxsw_sp_acl_block_net(struct mlxsw_sp_acl_block *block);
+struct mlxsw_sp_acl_block *mlxsw_sp_acl_block_create(struct mlxsw_sp *mlxsw_sp,
+						     struct net *net);
+void mlxsw_sp_acl_block_destroy(struct mlxsw_sp_acl_block *block);
+int mlxsw_sp_acl_block_bind(struct mlxsw_sp *mlxsw_sp,
+			    struct mlxsw_sp_acl_block *block,
+			    struct mlxsw_sp_port *mlxsw_sp_port,
+			    bool ingress);
+int mlxsw_sp_acl_block_unbind(struct mlxsw_sp *mlxsw_sp,
+			      struct mlxsw_sp_acl_block *block,
+			      struct mlxsw_sp_port *mlxsw_sp_port,
+			      bool ingress);
 struct mlxsw_sp_acl_ruleset *
-mlxsw_sp_acl_ruleset_lookup(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
-			    bool ingress, u32 chain_index,
+mlxsw_sp_acl_ruleset_lookup(struct mlxsw_sp *mlxsw_sp,
+			    struct mlxsw_sp_acl_block *block, u32 chain_index,
 			    enum mlxsw_sp_acl_profile profile);
 struct mlxsw_sp_acl_ruleset *
-mlxsw_sp_acl_ruleset_get(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
-			 bool ingress, u32 chain_index,
+mlxsw_sp_acl_ruleset_get(struct mlxsw_sp *mlxsw_sp,
+			 struct mlxsw_sp_acl_block *block, u32 chain_index,
 			 enum mlxsw_sp_acl_profile profile);
 void mlxsw_sp_acl_ruleset_put(struct mlxsw_sp *mlxsw_sp,
 			      struct mlxsw_sp_acl_ruleset *ruleset);
@@ -574,11 +588,14 @@ void mlxsw_sp_acl_fini(struct mlxsw_sp *mlxsw_sp);
 extern const struct mlxsw_sp_acl_ops mlxsw_sp_acl_tcam_ops;
 
 /* spectrum_flower.c */
-int mlxsw_sp_flower_replace(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
+int mlxsw_sp_flower_replace(struct mlxsw_sp *mlxsw_sp,
+			    struct mlxsw_sp_acl_block *block,
 			    struct tc_cls_flower_offload *f);
-void mlxsw_sp_flower_destroy(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
+void mlxsw_sp_flower_destroy(struct mlxsw_sp *mlxsw_sp,
+			     struct mlxsw_sp_acl_block *block,
 			     struct tc_cls_flower_offload *f);
-int mlxsw_sp_flower_stats(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
+int mlxsw_sp_flower_stats(struct mlxsw_sp *mlxsw_sp,
+			  struct mlxsw_sp_acl_block *block,
 			  struct tc_cls_flower_offload *f);
 
 /* spectrum_qdisc.c */
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
index 7fb41a4..fda41f1 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
@@ -39,6 +39,7 @@
 #include <linux/string.h>
 #include <linux/rhashtable.h>
 #include <linux/netdevice.h>
+#include <net/net_namespace.h>
 #include <net/tc_act/tc_vlan.h>
 
 #include "reg.h"
@@ -70,9 +71,22 @@ struct mlxsw_afk *mlxsw_sp_acl_afk(struct mlxsw_sp_acl *acl)
 	return acl->afk;
 }
 
-struct mlxsw_sp_acl_ruleset_ht_key {
-	struct net_device *dev; /* dev this ruleset is bound to */
+struct mlxsw_sp_acl_block_binding {
+	struct list_head list;
+	struct net_device *dev;
+	struct mlxsw_sp_port *mlxsw_sp_port;
 	bool ingress;
+};
+
+struct mlxsw_sp_acl_block {
+	struct list_head binding_list;
+	struct mlxsw_sp_acl_ruleset *ruleset_zero;
+	struct mlxsw_sp *mlxsw_sp;
+	struct net *net;
+};
+
+struct mlxsw_sp_acl_ruleset_ht_key {
+	struct mlxsw_sp_acl_block *block;
 	u32 chain_index;
 	const struct mlxsw_sp_acl_profile_ops *ops;
 };
@@ -118,27 +132,169 @@ struct mlxsw_sp_fid *mlxsw_sp_acl_dummy_fid(struct mlxsw_sp *mlxsw_sp)
 	return mlxsw_sp->acl->dummy_fid;
 }
 
-static int mlxsw_sp_acl_ruleset_bind(struct mlxsw_sp *mlxsw_sp,
-				     struct mlxsw_sp_acl_ruleset *ruleset,
-				     struct net_device *dev, bool ingress)
+struct mlxsw_sp *mlxsw_sp_acl_block_mlxsw_sp(struct mlxsw_sp_acl_block *block)
 {
+	return block->mlxsw_sp;
+}
+
+struct net *mlxsw_sp_acl_block_net(struct mlxsw_sp_acl_block *block)
+{
+	return block->net;
+}
+
+static int
+mlxsw_sp_acl_ruleset_bind(struct mlxsw_sp *mlxsw_sp,
+			  struct mlxsw_sp_acl_block *block,
+			  struct mlxsw_sp_acl_block_binding *binding)
+{
+	struct mlxsw_sp_acl_ruleset *ruleset = block->ruleset_zero;
 	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
 
-	return ops->ruleset_bind(mlxsw_sp, ruleset->priv, dev, ingress);
+	return ops->ruleset_bind(mlxsw_sp, ruleset->priv,
+				 binding->mlxsw_sp_port->dev, binding->ingress);
 }
 
-static void mlxsw_sp_acl_ruleset_unbind(struct mlxsw_sp *mlxsw_sp,
-					struct mlxsw_sp_acl_ruleset *ruleset,
-					struct net_device *dev, bool ingress)
+static void
+mlxsw_sp_acl_ruleset_unbind(struct mlxsw_sp *mlxsw_sp,
+			    struct mlxsw_sp_acl_block *block,
+			    struct mlxsw_sp_acl_block_binding *binding)
 {
+	struct mlxsw_sp_acl_ruleset *ruleset = block->ruleset_zero;
 	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
 
-	ops->ruleset_unbind(mlxsw_sp, ruleset->priv, dev, ingress);
+	ops->ruleset_unbind(mlxsw_sp, ruleset->priv,
+			    binding->mlxsw_sp_port->dev, binding->ingress);
+}
+
+static bool mlxsw_sp_acl_ruleset_block_bound(struct mlxsw_sp_acl_block *block)
+{
+	return block->ruleset_zero;
+}
+
+static int
+mlxsw_sp_acl_ruleset_block_bind(struct mlxsw_sp *mlxsw_sp,
+				struct mlxsw_sp_acl_ruleset *ruleset,
+				struct mlxsw_sp_acl_block *block)
+{
+	struct mlxsw_sp_acl_block_binding *binding;
+	int err;
+
+	block->ruleset_zero = ruleset;
+	list_for_each_entry(binding, &block->binding_list, list) {
+		err = mlxsw_sp_acl_ruleset_bind(mlxsw_sp, block, binding);
+		if (err)
+			goto rollback;
+	}
+	return 0;
+
+rollback:
+	list_for_each_entry_continue_reverse(binding, &block->binding_list,
+					     list)
+		mlxsw_sp_acl_ruleset_unbind(mlxsw_sp, block, binding);
+	block->ruleset_zero = NULL;
+
+	return err;
+}
+
+static void
+mlxsw_sp_acl_ruleset_block_unbind(struct mlxsw_sp *mlxsw_sp,
+				  struct mlxsw_sp_acl_ruleset *ruleset,
+				  struct mlxsw_sp_acl_block *block)
+{
+	struct mlxsw_sp_acl_block_binding *binding;
+
+	list_for_each_entry(binding, &block->binding_list, list)
+		mlxsw_sp_acl_ruleset_unbind(mlxsw_sp, block, binding);
+	block->ruleset_zero = NULL;
+}
+
+struct mlxsw_sp_acl_block *mlxsw_sp_acl_block_create(struct mlxsw_sp *mlxsw_sp,
+						     struct net *net)
+{
+	struct mlxsw_sp_acl_block *block;
+
+	block = kzalloc(sizeof(*block), GFP_KERNEL);
+	if (!block)
+		return NULL;
+	INIT_LIST_HEAD(&block->binding_list);
+	block->mlxsw_sp = mlxsw_sp;
+	block->net = net;
+	return block;
+}
+
+void mlxsw_sp_acl_block_destroy(struct mlxsw_sp_acl_block *block)
+{
+	WARN_ON(!list_empty(&block->binding_list));
+	kfree(block);
+}
+
+static struct mlxsw_sp_acl_block_binding *
+mlxsw_sp_acl_block_lookup(struct mlxsw_sp_acl_block *block,
+			  struct mlxsw_sp_port *mlxsw_sp_port, bool ingress)
+{
+	struct mlxsw_sp_acl_block_binding *binding;
+
+	list_for_each_entry(binding, &block->binding_list, list)
+		if (binding->mlxsw_sp_port == mlxsw_sp_port &&
+		    binding->ingress == ingress)
+			return binding;
+	return NULL;
+}
+
+int mlxsw_sp_acl_block_bind(struct mlxsw_sp *mlxsw_sp,
+			    struct mlxsw_sp_acl_block *block,
+			    struct mlxsw_sp_port *mlxsw_sp_port,
+			    bool ingress)
+{
+	struct mlxsw_sp_acl_block_binding *binding;
+	int err;
+
+	if (WARN_ON(mlxsw_sp_acl_block_lookup(block, mlxsw_sp_port, ingress)))
+		return -EEXIST;
+
+	binding = kzalloc(sizeof(*binding), GFP_KERNEL);
+	if (!binding)
+		return -ENOMEM;
+	binding->mlxsw_sp_port = mlxsw_sp_port;
+	binding->ingress = ingress;
+
+	if (mlxsw_sp_acl_ruleset_block_bound(block)) {
+		err = mlxsw_sp_acl_ruleset_bind(mlxsw_sp, block, binding);
+		if (err)
+			goto err_ruleset_bind;
+	}
+
+	list_add(&binding->list, &block->binding_list);
+	return 0;
+
+err_ruleset_bind:
+	kfree(binding);
+	return err;
+}
+
+int mlxsw_sp_acl_block_unbind(struct mlxsw_sp *mlxsw_sp,
+			      struct mlxsw_sp_acl_block *block,
+			      struct mlxsw_sp_port *mlxsw_sp_port,
+			      bool ingress)
+{
+	struct mlxsw_sp_acl_block_binding *binding;
+
+	binding = mlxsw_sp_acl_block_lookup(block, mlxsw_sp_port, ingress);
+	if (!binding)
+		return -ENOENT;
+
+	list_del(&binding->list);
+
+	if (mlxsw_sp_acl_ruleset_block_bound(block))
+		mlxsw_sp_acl_ruleset_unbind(mlxsw_sp, block, binding);
+
+	kfree(binding);
+	return 0;
 }
 
 static struct mlxsw_sp_acl_ruleset *
-mlxsw_sp_acl_ruleset_create(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
-			    bool ingress, u32 chain_index,
+mlxsw_sp_acl_ruleset_create(struct mlxsw_sp *mlxsw_sp,
+			    struct mlxsw_sp_acl_block *block, u32 chain_index,
 			    const struct mlxsw_sp_acl_profile_ops *ops)
 {
 	struct mlxsw_sp_acl *acl = mlxsw_sp->acl;
@@ -151,8 +307,7 @@ mlxsw_sp_acl_ruleset_create(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
 	if (!ruleset)
 		return ERR_PTR(-ENOMEM);
 	ruleset->ref_count = 1;
-	ruleset->ht_key.dev = dev;
-	ruleset->ht_key.ingress = ingress;
+	ruleset->ht_key.block = block;
 	ruleset->ht_key.chain_index = chain_index;
 	ruleset->ht_key.ops = ops;
 
@@ -174,8 +329,7 @@ mlxsw_sp_acl_ruleset_create(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
 		 * to be directly bound to device. The rest of the rulesets
 		 * are bound by "Goto action set".
 		 */
-		err = mlxsw_sp_acl_ruleset_bind(mlxsw_sp, ruleset,
-						dev, ingress);
+		err = mlxsw_sp_acl_ruleset_block_bind(mlxsw_sp, ruleset, block);
 		if (err)
 			goto err_ruleset_bind;
 	}
@@ -198,12 +352,12 @@ static void mlxsw_sp_acl_ruleset_destroy(struct mlxsw_sp *mlxsw_sp,
 					 struct mlxsw_sp_acl_ruleset *ruleset)
 {
 	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
+	struct mlxsw_sp_acl_block *block = ruleset->ht_key.block;
+	u32 chain_index = ruleset->ht_key.chain_index;
 	struct mlxsw_sp_acl *acl = mlxsw_sp->acl;
 
-	if (!ruleset->ht_key.chain_index)
-		mlxsw_sp_acl_ruleset_unbind(mlxsw_sp, ruleset,
-					    ruleset->ht_key.dev,
-					    ruleset->ht_key.ingress);
+	if (!chain_index)
+		mlxsw_sp_acl_ruleset_block_unbind(mlxsw_sp, ruleset, block);
 	rhashtable_remove_fast(&acl->ruleset_ht, &ruleset->ht_node,
 			       mlxsw_sp_acl_ruleset_ht_params);
 	ops->ruleset_del(mlxsw_sp, ruleset->priv);
@@ -225,15 +379,14 @@ static void mlxsw_sp_acl_ruleset_ref_dec(struct mlxsw_sp *mlxsw_sp,
 }
 
 static struct mlxsw_sp_acl_ruleset *
-__mlxsw_sp_acl_ruleset_lookup(struct mlxsw_sp_acl *acl, struct net_device *dev,
-			      bool ingress, u32 chain_index,
+__mlxsw_sp_acl_ruleset_lookup(struct mlxsw_sp_acl *acl,
+			      struct mlxsw_sp_acl_block *block, u32 chain_index,
 			      const struct mlxsw_sp_acl_profile_ops *ops)
 {
 	struct mlxsw_sp_acl_ruleset_ht_key ht_key;
 
 	memset(&ht_key, 0, sizeof(ht_key));
-	ht_key.dev = dev;
-	ht_key.ingress = ingress;
+	ht_key.block = block;
 	ht_key.chain_index = chain_index;
 	ht_key.ops = ops;
 	return rhashtable_lookup_fast(&acl->ruleset_ht, &ht_key,
@@ -241,8 +394,8 @@ __mlxsw_sp_acl_ruleset_lookup(struct mlxsw_sp_acl *acl, struct net_device *dev,
 }
 
 struct mlxsw_sp_acl_ruleset *
-mlxsw_sp_acl_ruleset_lookup(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
-			    bool ingress, u32 chain_index,
+mlxsw_sp_acl_ruleset_lookup(struct mlxsw_sp *mlxsw_sp,
+			    struct mlxsw_sp_acl_block *block, u32 chain_index,
 			    enum mlxsw_sp_acl_profile profile)
 {
 	const struct mlxsw_sp_acl_profile_ops *ops;
@@ -252,16 +405,15 @@ mlxsw_sp_acl_ruleset_lookup(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
 	ops = acl->ops->profile_ops(mlxsw_sp, profile);
 	if (!ops)
 		return ERR_PTR(-EINVAL);
-	ruleset = __mlxsw_sp_acl_ruleset_lookup(acl, dev, ingress,
-						chain_index, ops);
+	ruleset = __mlxsw_sp_acl_ruleset_lookup(acl, block, chain_index, ops);
 	if (!ruleset)
 		return ERR_PTR(-ENOENT);
 	return ruleset;
 }
 
 struct mlxsw_sp_acl_ruleset *
-mlxsw_sp_acl_ruleset_get(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
-			 bool ingress, u32 chain_index,
+mlxsw_sp_acl_ruleset_get(struct mlxsw_sp *mlxsw_sp,
+			 struct mlxsw_sp_acl_block *block, u32 chain_index,
 			 enum mlxsw_sp_acl_profile profile)
 {
 	const struct mlxsw_sp_acl_profile_ops *ops;
@@ -272,14 +424,12 @@ mlxsw_sp_acl_ruleset_get(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
 	if (!ops)
 		return ERR_PTR(-EINVAL);
 
-	ruleset = __mlxsw_sp_acl_ruleset_lookup(acl, dev, ingress,
-						chain_index, ops);
+	ruleset = __mlxsw_sp_acl_ruleset_lookup(acl, block, chain_index, ops);
 	if (ruleset) {
 		mlxsw_sp_acl_ruleset_ref_inc(ruleset);
 		return ruleset;
 	}
-	return mlxsw_sp_acl_ruleset_create(mlxsw_sp, dev, ingress,
-					   chain_index, ops);
+	return mlxsw_sp_acl_ruleset_create(mlxsw_sp, block, chain_index, ops);
 }
 
 void mlxsw_sp_acl_ruleset_put(struct mlxsw_sp *mlxsw_sp,
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c
index 2f0e578..144a2ad 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c
@@ -35,6 +35,7 @@
 #include <linux/kernel.h>
 #include <linux/errno.h>
 #include <linux/netdevice.h>
+#include <net/net_namespace.h>
 #include <net/flow_dissector.h>
 #include <net/pkt_cls.h>
 #include <net/tc_act/tc_gact.h>
@@ -45,7 +46,7 @@
 #include "core_acl_flex_keys.h"
 
 static int mlxsw_sp_flower_parse_actions(struct mlxsw_sp *mlxsw_sp,
-					 struct net_device *dev, bool ingress,
+					 struct mlxsw_sp_acl_block *block,
 					 struct mlxsw_sp_acl_rule_info *rulei,
 					 struct tcf_exts *exts)
 {
@@ -80,8 +81,7 @@ static int mlxsw_sp_flower_parse_actions(struct mlxsw_sp *mlxsw_sp,
 			struct mlxsw_sp_acl_ruleset *ruleset;
 			u16 group_id;
 
-			ruleset = mlxsw_sp_acl_ruleset_lookup(mlxsw_sp, dev,
-							      ingress,
+			ruleset = mlxsw_sp_acl_ruleset_lookup(mlxsw_sp, block,
 							      chain_index,
 							      MLXSW_SP_ACL_PROFILE_FLOWER);
 			if (IS_ERR(ruleset))
@@ -92,6 +92,7 @@ static int mlxsw_sp_flower_parse_actions(struct mlxsw_sp *mlxsw_sp,
 			if (err)
 				return err;
 		} else if (is_tcf_mirred_egress_redirect(a)) {
+			struct net *net = mlxsw_sp_acl_block_net(block);
 			int ifindex = tcf_mirred_ifindex(a);
 			struct net_device *out_dev;
 			struct mlxsw_sp_fid *fid;
@@ -104,9 +105,7 @@ static int mlxsw_sp_flower_parse_actions(struct mlxsw_sp *mlxsw_sp,
 			if (err)
 				return err;
 
-			out_dev = __dev_get_by_index(dev_net(dev), ifindex);
-			if (out_dev == dev)
-				out_dev = NULL;
+			out_dev = __dev_get_by_index(net, ifindex);
 
 			err = mlxsw_sp_acl_rulei_act_fwd(mlxsw_sp, rulei,
 							 out_dev);
@@ -266,7 +265,7 @@ static int mlxsw_sp_flower_parse_ip(struct mlxsw_sp *mlxsw_sp,
 }
 
 static int mlxsw_sp_flower_parse(struct mlxsw_sp *mlxsw_sp,
-				 struct net_device *dev, bool ingress,
+				 struct mlxsw_sp_acl_block *block,
 				 struct mlxsw_sp_acl_rule_info *rulei,
 				 struct tc_cls_flower_offload *f)
 {
@@ -384,21 +383,19 @@ static int mlxsw_sp_flower_parse(struct mlxsw_sp *mlxsw_sp,
 	if (err)
 		return err;
 
-	return mlxsw_sp_flower_parse_actions(mlxsw_sp, dev, ingress,
-					     rulei, f->exts);
+	return mlxsw_sp_flower_parse_actions(mlxsw_sp, block, rulei, f->exts);
 }
 
-int mlxsw_sp_flower_replace(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
+int mlxsw_sp_flower_replace(struct mlxsw_sp *mlxsw_sp,
+			    struct mlxsw_sp_acl_block *block,
 			    struct tc_cls_flower_offload *f)
 {
-	struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
-	struct net_device *dev = mlxsw_sp_port->dev;
 	struct mlxsw_sp_acl_rule_info *rulei;
 	struct mlxsw_sp_acl_ruleset *ruleset;
 	struct mlxsw_sp_acl_rule *rule;
 	int err;
 
-	ruleset = mlxsw_sp_acl_ruleset_get(mlxsw_sp, dev, ingress,
+	ruleset = mlxsw_sp_acl_ruleset_get(mlxsw_sp, block,
 					   f->common.chain_index,
 					   MLXSW_SP_ACL_PROFILE_FLOWER);
 	if (IS_ERR(ruleset))
@@ -411,7 +408,7 @@ int mlxsw_sp_flower_replace(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
 	}
 
 	rulei = mlxsw_sp_acl_rule_rulei(rule);
-	err = mlxsw_sp_flower_parse(mlxsw_sp, dev, ingress, rulei, f);
+	err = mlxsw_sp_flower_parse(mlxsw_sp, block, rulei, f);
 	if (err)
 		goto err_flower_parse;
 
@@ -435,15 +432,15 @@ int mlxsw_sp_flower_replace(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
 	return err;
 }
 
-void mlxsw_sp_flower_destroy(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
+void mlxsw_sp_flower_destroy(struct mlxsw_sp *mlxsw_sp,
+			     struct mlxsw_sp_acl_block *block,
 			     struct tc_cls_flower_offload *f)
 {
-	struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
 	struct mlxsw_sp_acl_ruleset *ruleset;
 	struct mlxsw_sp_acl_rule *rule;
 
-	ruleset = mlxsw_sp_acl_ruleset_get(mlxsw_sp, mlxsw_sp_port->dev,
-					   ingress, f->common.chain_index,
+	ruleset = mlxsw_sp_acl_ruleset_get(mlxsw_sp, block,
+					   f->common.chain_index,
 					   MLXSW_SP_ACL_PROFILE_FLOWER);
 	if (IS_ERR(ruleset))
 		return;
@@ -457,10 +454,10 @@ void mlxsw_sp_flower_destroy(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
 	mlxsw_sp_acl_ruleset_put(mlxsw_sp, ruleset);
 }
 
-int mlxsw_sp_flower_stats(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
+int mlxsw_sp_flower_stats(struct mlxsw_sp *mlxsw_sp,
+			  struct mlxsw_sp_acl_block *block,
 			  struct tc_cls_flower_offload *f)
 {
-	struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
 	struct mlxsw_sp_acl_ruleset *ruleset;
 	struct mlxsw_sp_acl_rule *rule;
 	u64 packets;
@@ -468,8 +465,8 @@ int mlxsw_sp_flower_stats(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
 	u64 bytes;
 	int err;
 
-	ruleset = mlxsw_sp_acl_ruleset_get(mlxsw_sp, mlxsw_sp_port->dev,
-					   ingress, f->common.chain_index,
+	ruleset = mlxsw_sp_acl_ruleset_get(mlxsw_sp, block,
+					   f->common.chain_index,
 					   MLXSW_SP_ACL_PROFILE_FLOWER);
 	if (WARN_ON(IS_ERR(ruleset)))
 		return -EINVAL;
-- 
2.9.5

* [patch net-next v2 10/10] mlxsw: spectrum_acl: Pass mlxsw_sp_port down to ruleset bind/unbind ops
  2017-11-12 15:55 [patch net-next v2 00/10] net: sched: allow qdiscs to share filter block instances Jiri Pirko
                   ` (8 preceding siblings ...)
  2017-11-12 15:56 ` [patch net-next v2 09/10] mlxsw: spectrum_acl: Implement TC block sharing Jiri Pirko
@ 2017-11-12 15:56 ` Jiri Pirko
  2017-11-15 23:12 ` [patch net-next v2 00/10] net: sched: allow qdiscs to share filter block instances Cong Wang
  10 siblings, 0 replies; 31+ messages in thread
From: Jiri Pirko @ 2017-11-12 15:56 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, jakub.kicinski, ast, daniel, simon.horman,
	pieter.jansenvanvuuren, john.hurley

From: Jiri Pirko <jiri@mellanox.com>

No need to convert from mlxsw_sp_port to net_device and back again.
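
For reference, this is the round trip being removed (pieced together
from the code changed below):

	/* the caller converted the port to a net_device ... */
	ops->ruleset_bind(mlxsw_sp, ruleset->priv,
			  binding->mlxsw_sp_port->dev, binding->ingress);

	/* ... only for the op implementation to convert straight back */
	mlxsw_sp_port = netdev_priv(dev);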

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum.h     |  6 +++--
 drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c |  4 ++--
 .../ethernet/mellanox/mlxsw/spectrum_acl_tcam.c    | 27 +++++++++-------------
 3 files changed, 17 insertions(+), 20 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
index 4d375f3..5b48d37 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
@@ -475,9 +475,11 @@ struct mlxsw_sp_acl_profile_ops {
 			   void *priv, void *ruleset_priv);
 	void (*ruleset_del)(struct mlxsw_sp *mlxsw_sp, void *ruleset_priv);
 	int (*ruleset_bind)(struct mlxsw_sp *mlxsw_sp, void *ruleset_priv,
-			    struct net_device *dev, bool ingress);
+			    struct mlxsw_sp_port *mlxsw_sp_port,
+			    bool ingress);
 	void (*ruleset_unbind)(struct mlxsw_sp *mlxsw_sp, void *ruleset_priv,
-			       struct net_device *dev, bool ingress);
+			       struct mlxsw_sp_port *mlxsw_sp_port,
+			       bool ingress);
 	u16 (*ruleset_group_id)(void *ruleset_priv);
 	size_t rule_priv_size;
 	int (*rule_add)(struct mlxsw_sp *mlxsw_sp,
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
index fda41f1..9719791 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
@@ -151,7 +151,7 @@ mlxsw_sp_acl_ruleset_bind(struct mlxsw_sp *mlxsw_sp,
 	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
 
 	return ops->ruleset_bind(mlxsw_sp, ruleset->priv,
-				 binding->mlxsw_sp_port->dev, binding->ingress);
+				 binding->mlxsw_sp_port, binding->ingress);
 }
 
 static void
@@ -163,7 +163,7 @@ mlxsw_sp_acl_ruleset_unbind(struct mlxsw_sp *mlxsw_sp,
 	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
 
 	ops->ruleset_unbind(mlxsw_sp, ruleset->priv,
-			    binding->mlxsw_sp_port->dev, binding->ingress);
+			    binding->mlxsw_sp_port, binding->ingress);
 }
 
 static bool mlxsw_sp_acl_ruleset_block_bound(struct mlxsw_sp_acl_block *block)
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c
index 50b2f9a..c6e180c 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c
@@ -258,15 +258,11 @@ static void mlxsw_sp_acl_tcam_group_del(struct mlxsw_sp *mlxsw_sp,
 static int
 mlxsw_sp_acl_tcam_group_bind(struct mlxsw_sp *mlxsw_sp,
 			     struct mlxsw_sp_acl_tcam_group *group,
-			     struct net_device *dev, bool ingress)
+			     struct mlxsw_sp_port *mlxsw_sp_port,
+			     bool ingress)
 {
-	struct mlxsw_sp_port *mlxsw_sp_port;
 	char ppbt_pl[MLXSW_REG_PPBT_LEN];
 
-	if (!mlxsw_sp_port_dev_check(dev))
-		return -EINVAL;
-
-	mlxsw_sp_port = netdev_priv(dev);
 	mlxsw_reg_ppbt_pack(ppbt_pl, ingress ? MLXSW_REG_PXBT_E_IACL :
 					       MLXSW_REG_PXBT_E_EACL,
 			    MLXSW_REG_PXBT_OP_BIND, mlxsw_sp_port->local_port,
@@ -277,15 +273,11 @@ mlxsw_sp_acl_tcam_group_bind(struct mlxsw_sp *mlxsw_sp,
 static void
 mlxsw_sp_acl_tcam_group_unbind(struct mlxsw_sp *mlxsw_sp,
 			       struct mlxsw_sp_acl_tcam_group *group,
-			       struct net_device *dev, bool ingress)
+			       struct mlxsw_sp_port *mlxsw_sp_port,
+			       bool ingress)
 {
-	struct mlxsw_sp_port *mlxsw_sp_port;
 	char ppbt_pl[MLXSW_REG_PPBT_LEN];
 
-	if (WARN_ON(!mlxsw_sp_port_dev_check(dev)))
-		return;
-
-	mlxsw_sp_port = netdev_priv(dev);
 	mlxsw_reg_ppbt_pack(ppbt_pl, ingress ? MLXSW_REG_PXBT_E_IACL :
 					       MLXSW_REG_PXBT_E_EACL,
 			    MLXSW_REG_PXBT_OP_UNBIND, mlxsw_sp_port->local_port,
@@ -1054,22 +1046,25 @@ mlxsw_sp_acl_tcam_flower_ruleset_del(struct mlxsw_sp *mlxsw_sp,
 static int
 mlxsw_sp_acl_tcam_flower_ruleset_bind(struct mlxsw_sp *mlxsw_sp,
 				      void *ruleset_priv,
-				      struct net_device *dev, bool ingress)
+				      struct mlxsw_sp_port *mlxsw_sp_port,
+				      bool ingress)
 {
 	struct mlxsw_sp_acl_tcam_flower_ruleset *ruleset = ruleset_priv;
 
 	return mlxsw_sp_acl_tcam_group_bind(mlxsw_sp, &ruleset->group,
-					    dev, ingress);
+					    mlxsw_sp_port, ingress);
 }
 
 static void
 mlxsw_sp_acl_tcam_flower_ruleset_unbind(struct mlxsw_sp *mlxsw_sp,
 					void *ruleset_priv,
-					struct net_device *dev, bool ingress)
+					struct mlxsw_sp_port *mlxsw_sp_port,
+					bool ingress)
 {
 	struct mlxsw_sp_acl_tcam_flower_ruleset *ruleset = ruleset_priv;
 
-	mlxsw_sp_acl_tcam_group_unbind(mlxsw_sp, &ruleset->group, dev, ingress);
+	mlxsw_sp_acl_tcam_group_unbind(mlxsw_sp, &ruleset->group,
+				       mlxsw_sp_port, ingress);
 }
 
 static u16
-- 
2.9.5

* Re: [patch net-next v2 01/10] cls_bpf: move prog offload->netdev check into drivers
  2017-11-12 15:55 ` [patch net-next v2 01/10] cls_bpf: move prog offload->netdev check into drivers Jiri Pirko
@ 2017-11-13  2:14   ` Jakub Kicinski
  2017-11-13  6:25     ` Jiri Pirko
  0 siblings, 1 reply; 31+ messages in thread
From: Jakub Kicinski @ 2017-11-13  2:14 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, davem, jhs, xiyou.wangcong, mlxsw, andrew,
	vivien.didelot, f.fainelli, ast, daniel, simon.horman,
	pieter.jansenvanvuuren, john.hurley

On Sun, 12 Nov 2017 16:55:55 +0100, Jiri Pirko wrote:
> From: Jiri Pirko <jiri@mellanox.com>
> 
> In order to remove tp->q usage in cls_bpf, the offload->netdev check
> needs to be moved to individual drivers as only they will have access
> to appropriate struct net_device.
> 
> Signed-off-by: Jiri Pirko <jiri@mellanox.com>

This seems not entirely correct and it adds unnecessary code.  I think
the XDP and cls_bpf handling could be unified, making way for binding
the same program to multiple ports of the same device.  Would you mind
waiting a day for me to send corrections to BPF offload?

* Re: [patch net-next v2 01/10] cls_bpf: move prog offload->netdev check into drivers
  2017-11-13  2:14   ` Jakub Kicinski
@ 2017-11-13  6:25     ` Jiri Pirko
  2017-11-13  7:17       ` Jakub Kicinski
  0 siblings, 1 reply; 31+ messages in thread
From: Jiri Pirko @ 2017-11-13  6:25 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, davem, jhs, xiyou.wangcong, mlxsw, andrew,
	vivien.didelot, f.fainelli, ast, daniel, simon.horman,
	pieter.jansenvanvuuren, john.hurley

Mon, Nov 13, 2017 at 03:14:18AM CET, jakub.kicinski@netronome.com wrote:
>On Sun, 12 Nov 2017 16:55:55 +0100, Jiri Pirko wrote:
>> From: Jiri Pirko <jiri@mellanox.com>
>> 
>> In order to remove tp->q usage in cls_bpf, the offload->netdev check
>> needs to be moved to individual drivers as only they will have access
>> to appropriate struct net_device.
>> 
>> Signed-off-by: Jiri Pirko <jiri@mellanox.com>
>
>This seems not entirely correct and it adds unnecessary code.  I think

What is not correct?


>the XDP and cls_bpf handling could be unified, making way for binding
>the same program to multiple ports of the same device.  Would you mind
>waiting a day for me to send corrections to BPF offload?

Well I'm trying to get this in before net-next closes...

* Re: [patch net-next v2 01/10] cls_bpf: move prog offload->netdev check into drivers
  2017-11-13  6:25     ` Jiri Pirko
@ 2017-11-13  7:17       ` Jakub Kicinski
  2017-11-13  7:55         ` Jiri Pirko
  0 siblings, 1 reply; 31+ messages in thread
From: Jakub Kicinski @ 2017-11-13  7:17 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, davem, jhs, xiyou.wangcong, mlxsw, andrew,
	vivien.didelot, f.fainelli, ast, daniel, simon.horman,
	pieter.jansenvanvuuren, john.hurley

On Mon, 13 Nov 2017 07:25:38 +0100, Jiri Pirko wrote:
> Mon, Nov 13, 2017 at 03:14:18AM CET, jakub.kicinski@netronome.com wrote:
> >On Sun, 12 Nov 2017 16:55:55 +0100, Jiri Pirko wrote:  
> >> From: Jiri Pirko <jiri@mellanox.com>
> >> 
> >> In order to remove tp->q usage in cls_bpf, the offload->netdev check
> >> needs to be moved to individual drivers as only they will have access
> >> to appropriate struct net_device.
> >> 
> >> Signed-off-by: Jiri Pirko <jiri@mellanox.com>  
> >
> >This seems not entirely correct and it adds unnecessary code.  I think  
> 
> What is not correct?

From a quick reading it looks like you will allow installing a
dev-specific filter without the skip_sw flag.  You haven't fixed what
your previous series broke in the cls_bpf offload model, and now you
break it even further.
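
E.g. I'd expect the drivers to reject those, something along these
lines (sketch, untested; tc_skip_sw() is the existing pkt_cls.h
helper):

	if (!tc_skip_sw(cls_bpf->gen_flags))
		return -EOPNOTSUPP; /* dev-bound prog must be HW-only */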

> >the XDP and cls_bpf handling could be unified, making way for binding
> >the same program to multiple ports of the same device.  Would you mind
> >waiting a day for me to send corrections to BPF offload?  
> 
> Well I'm trying to get this in before net-next closes...

Right, and I'm surprised by that.  I'd hope you'll understand my caution
here given recent history.

* Re: [patch net-next v2 04/10] net: sched: introduce block mechanism to handle netif_keep_dst calls
  2017-11-12 15:55 ` [patch net-next v2 04/10] net: sched: introduce block mechanism to handle netif_keep_dst calls Jiri Pirko
@ 2017-11-13  7:47   ` Jakub Kicinski
  2017-11-13  7:58     ` Jiri Pirko
  0 siblings, 1 reply; 31+ messages in thread
From: Jakub Kicinski @ 2017-11-13  7:47 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, davem, jhs, xiyou.wangcong, mlxsw, andrew,
	vivien.didelot, f.fainelli, ast, daniel, simon.horman,
	pieter.jansenvanvuuren, john.hurley

On Sun, 12 Nov 2017 16:55:58 +0100, Jiri Pirko wrote:
> From: Jiri Pirko <jiri@mellanox.com>
> 
> A couple of classifiers call netif_keep_dst directly on q->dev. That
> is not possible for a shared block where multiple qdiscs own the
> block. So introduce an infrastructure to keep track of the block
> owners in a list and use this list to implement a block variant of
> netif_keep_dst.
> 
> Signed-off-by: Jiri Pirko <jiri@mellanox.com>

Could you use the list you add here to check the ethtool tc offload
flag? :)

* Re: [patch net-next v2 06/10] net: sched: allow ingress and clsact qdiscs to share filter blocks
  2017-11-12 15:56 ` [patch net-next v2 06/10] net: sched: allow ingress and clsact qdiscs to share filter blocks Jiri Pirko
@ 2017-11-13  7:54   ` Jakub Kicinski
  2017-11-13  7:56     ` Jiri Pirko
  0 siblings, 1 reply; 31+ messages in thread
From: Jakub Kicinski @ 2017-11-13  7:54 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, davem, jhs, xiyou.wangcong, mlxsw, andrew,
	vivien.didelot, f.fainelli, ast, daniel, simon.horman,
	pieter.jansenvanvuuren, john.hurley

On Sun, 12 Nov 2017 16:56:00 +0100, Jiri Pirko wrote:
> diff --git a/net/sched/sch_ingress.c b/net/sched/sch_ingress.c
> index 5ecc38f..ee89efc 100644
> --- a/net/sched/sch_ingress.c
> +++ b/net/sched/sch_ingress.c
> @@ -60,6 +60,29 @@ static void clsact_chain_head_change(struct tcf_proto *tp_head, void *priv)
>  	struct mini_Qdisc_pair *miniqp = priv;
>  
>  	mini_qdisc_pair_swap(miniqp, tp_head);
> +};
> +
> +static const struct nla_policy ingress_policy[TCA_CLSACT_MAX + 1] = {
> +	[TCA_CLSACT_INGRESS_BLOCK]	= { .type = NLA_U32 },
> +};
> +
> +static int ingress_parse_opt(struct nlattr *opt, u32 *p_ingress_block_index)

nit: why the p_ prefix on all the pointers?

* Re: [patch net-next v2 01/10] cls_bpf: move prog offload->netdev check into drivers
  2017-11-13  7:17       ` Jakub Kicinski
@ 2017-11-13  7:55         ` Jiri Pirko
  2017-11-13  8:12           ` Jakub Kicinski
  0 siblings, 1 reply; 31+ messages in thread
From: Jiri Pirko @ 2017-11-13  7:55 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, davem, jhs, xiyou.wangcong, mlxsw, andrew,
	vivien.didelot, f.fainelli, ast, daniel, simon.horman,
	pieter.jansenvanvuuren, john.hurley

Mon, Nov 13, 2017 at 08:17:34AM CET, jakub.kicinski@netronome.com wrote:
>On Mon, 13 Nov 2017 07:25:38 +0100, Jiri Pirko wrote:
>> Mon, Nov 13, 2017 at 03:14:18AM CET, jakub.kicinski@netronome.com wrote:
>> >On Sun, 12 Nov 2017 16:55:55 +0100, Jiri Pirko wrote:  
>> >> From: Jiri Pirko <jiri@mellanox.com>
>> >> 
>> >> In order to remove tp->q usage in cls_bpf, the offload->netdev check
>> >> needs to be moved to individual drivers as only they will have access
>> >> to appropriate struct net_device.
>> >> 
>> >> Signed-off-by: Jiri Pirko <jiri@mellanox.com>  
>> >
>> >This seems not entirely correct and it adds unnecessary code.  I think  
>> 
>> What is not correct?
>
>From a quick reading it looks like you will allow installing a
>dev-specific filter without the skip_sw flag.  You haven't fixed what

Right. I see it now.


>your previous series broke in the cls_bpf offload model, and now you

What do you mean exactly?


>break it even further.
>
>> >the XDP and cls_bpf handling could be unified, making way for binding
>> >the same program to multiple ports of the same device.  Would you mind
>> >waiting a day for me to send corrections to BPF offload?  
>> 
>> Well I'm trying to get this in before net-next closes...
>
>Right, and I'm surprised by that.  I'd hope you'll understand my caution
>here given recent history.

Sure.

* Re: [patch net-next v2 06/10] net: sched: allow ingress and clsact qdiscs to share filter blocks
  2017-11-13  7:54   ` Jakub Kicinski
@ 2017-11-13  7:56     ` Jiri Pirko
  2017-11-13  8:05       ` Jakub Kicinski
  0 siblings, 1 reply; 31+ messages in thread
From: Jiri Pirko @ 2017-11-13  7:56 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, davem, jhs, xiyou.wangcong, mlxsw, andrew,
	vivien.didelot, f.fainelli, ast, daniel, simon.horman,
	pieter.jansenvanvuuren, john.hurley

Mon, Nov 13, 2017 at 08:54:52AM CET, jakub.kicinski@netronome.com wrote:
>On Sun, 12 Nov 2017 16:56:00 +0100, Jiri Pirko wrote:
>> diff --git a/net/sched/sch_ingress.c b/net/sched/sch_ingress.c
>> index 5ecc38f..ee89efc 100644
>> --- a/net/sched/sch_ingress.c
>> +++ b/net/sched/sch_ingress.c
>> @@ -60,6 +60,29 @@ static void clsact_chain_head_change(struct tcf_proto *tp_head, void *priv)
>>  	struct mini_Qdisc_pair *miniqp = priv;
>>  
>>  	mini_qdisc_pair_swap(miniqp, tp_head);
>> +};
>> +
>> +static const struct nla_policy ingress_policy[TCA_CLSACT_MAX + 1] = {
>> +	[TCA_CLSACT_INGRESS_BLOCK]	= { .type = NLA_U32 },
>> +};
>> +
>> +static int ingress_parse_opt(struct nlattr *opt, u32 *p_ingress_block_index)
>
>nit: why the p_ prefix on all the pointers?

Just to differentiate:
u32 *ingress_block_index
and
u32 ingress_block_index

* Re: [patch net-next v2 04/10] net: sched: introduce block mechanism to handle netif_keep_dst calls
  2017-11-13  7:47   ` Jakub Kicinski
@ 2017-11-13  7:58     ` Jiri Pirko
  2017-11-13  8:03       ` Jakub Kicinski
  0 siblings, 1 reply; 31+ messages in thread
From: Jiri Pirko @ 2017-11-13  7:58 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, davem, jhs, xiyou.wangcong, mlxsw, andrew,
	vivien.didelot, f.fainelli, ast, daniel, simon.horman,
	pieter.jansenvanvuuren, john.hurley

Mon, Nov 13, 2017 at 08:47:26AM CET, jakub.kicinski@netronome.com wrote:
>On Sun, 12 Nov 2017 16:55:58 +0100, Jiri Pirko wrote:
>> From: Jiri Pirko <jiri@mellanox.com>
>> 
>> A couple of classifiers call netif_keep_dst directly on q->dev. That
>> is not possible for a shared block where multiple qdiscs own the
>> block. So introduce an infrastructure to keep track of the block
>> owners in a list and use this list to implement a block variant of
>> netif_keep_dst.
>> 
>> Signed-off-by: Jiri Pirko <jiri@mellanox.com>
>
>Could you use the list you add here to check the ethtool tc offload
>flag? :)

It is a list of qdisc sub parts. Not a list of netdevs

* Re: [patch net-next v2 04/10] net: sched: introduce block mechanism to handle netif_keep_dst calls
  2017-11-13  7:58     ` Jiri Pirko
@ 2017-11-13  8:03       ` Jakub Kicinski
  2017-11-13  8:08         ` Jiri Pirko
  0 siblings, 1 reply; 31+ messages in thread
From: Jakub Kicinski @ 2017-11-13  8:03 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, davem, jhs, xiyou.wangcong, mlxsw, andrew,
	vivien.didelot, f.fainelli, ast, daniel, simon.horman,
	pieter.jansenvanvuuren, john.hurley

On Mon, 13 Nov 2017 08:58:44 +0100, Jiri Pirko wrote:
> Mon, Nov 13, 2017 at 08:47:26AM CET, jakub.kicinski@netronome.com wrote:
> >On Sun, 12 Nov 2017 16:55:58 +0100, Jiri Pirko wrote:  
> >> From: Jiri Pirko <jiri@mellanox.com>
> >> 
> >> >> A couple of classifiers call netif_keep_dst directly on q->dev. That
> >> >> is not possible for a shared block where multiple qdiscs own the
> >> >> block. So introduce an infrastructure to keep track of the block
> >> >> owners in a list and use this list to implement a block variant of
> >> >> netif_keep_dst.
> >> 
> >> Signed-off-by: Jiri Pirko <jiri@mellanox.com>  
> >
> >Could you use the list you add here to check the ethtool tc offload
> >flag? :)  
> 
> It is a list of qdisc sub parts. Not a list of netdevs

Hm.  OK, I won't pretend I understand the TC code in detail, I thought
that would give you all netdevs, but possibly duplicated.

* Re: [patch net-next v2 06/10] net: sched: allow ingress and clsact qdiscs to share filter blocks
  2017-11-13  7:56     ` Jiri Pirko
@ 2017-11-13  8:05       ` Jakub Kicinski
  2017-11-13  8:09         ` Jiri Pirko
  0 siblings, 1 reply; 31+ messages in thread
From: Jakub Kicinski @ 2017-11-13  8:05 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, davem, jhs, xiyou.wangcong, mlxsw, andrew,
	vivien.didelot, f.fainelli, ast, daniel, simon.horman,
	pieter.jansenvanvuuren, john.hurley

On Mon, 13 Nov 2017 08:56:58 +0100, Jiri Pirko wrote:
> Mon, Nov 13, 2017 at 08:54:52AM CET, jakub.kicinski@netronome.com wrote:
> >On Sun, 12 Nov 2017 16:56:00 +0100, Jiri Pirko wrote:  
> >> diff --git a/net/sched/sch_ingress.c b/net/sched/sch_ingress.c
> >> index 5ecc38f..ee89efc 100644
> >> --- a/net/sched/sch_ingress.c
> >> +++ b/net/sched/sch_ingress.c
> >> @@ -60,6 +60,29 @@ static void clsact_chain_head_change(struct tcf_proto *tp_head, void *priv)
> >>  	struct mini_Qdisc_pair *miniqp = priv;
> >>  
> >>  	mini_qdisc_pair_swap(miniqp, tp_head);
> >> +};
> >> +
> >> +static const struct nla_policy ingress_policy[TCA_CLSACT_MAX + 1] = {
> >> +	[TCA_CLSACT_INGRESS_BLOCK]	= { .type = NLA_U32 },
> >> +};
> >> +
> >> +static int ingress_parse_opt(struct nlattr *opt, u32 *p_ingress_block_index)  
> >
> >nit: why the p_ prefix on all the pointers?  
> 
> Just to differentiate:
> u32 *ingress_block_index
> and
> u32 ingress_block_index

But why?  There isn't a single ingress_block_index in this patch.
Looks like Hungarian notation.

* Re: [patch net-next v2 04/10] net: sched: introduce block mechanism to handle netif_keep_dst calls
  2017-11-13  8:03       ` Jakub Kicinski
@ 2017-11-13  8:08         ` Jiri Pirko
  2017-11-13  8:17           ` Jakub Kicinski
  0 siblings, 1 reply; 31+ messages in thread
From: Jiri Pirko @ 2017-11-13  8:08 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, davem, jhs, xiyou.wangcong, mlxsw, andrew,
	vivien.didelot, f.fainelli, ast, daniel, simon.horman,
	pieter.jansenvanvuuren, john.hurley

Mon, Nov 13, 2017 at 09:03:34AM CET, jakub.kicinski@netronome.com wrote:
>On Mon, 13 Nov 2017 08:58:44 +0100, Jiri Pirko wrote:
>> Mon, Nov 13, 2017 at 08:47:26AM CET, jakub.kicinski@netronome.com wrote:
>> >On Sun, 12 Nov 2017 16:55:58 +0100, Jiri Pirko wrote:  
>> >> From: Jiri Pirko <jiri@mellanox.com>
>> >> 
>> >> >> A couple of classifiers call netif_keep_dst directly on q->dev. That
>> >> >> is not possible for a shared block where multiple qdiscs own the
>> >> >> block. So introduce an infrastructure to keep track of the block
>> >> >> owners in a list and use this list to implement a block variant of
>> >> >> netif_keep_dst.
>> >> 
>> >> Signed-off-by: Jiri Pirko <jiri@mellanox.com>  
>> >
>> >Could you use the list you add here to check the ethtool tc offload
>> >flag? :)  
>> 
>> It is a list of qdisc sub parts. Not a list of netdevs
>
>Hm.  OK, I won't pretend I understand the TC code in detail, I thought
>that would give you all netdevs, but possibly duplicated.

Yeah, eventually you can get it. But still, it is unusable for checking
the offload flag because it has no relation to the block cbs.

* Re: [patch net-next v2 06/10] net: sched: allow ingress and clsact qdiscs to share filter blocks
  2017-11-13  8:05       ` Jakub Kicinski
@ 2017-11-13  8:09         ` Jiri Pirko
  0 siblings, 0 replies; 31+ messages in thread
From: Jiri Pirko @ 2017-11-13  8:09 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, davem, jhs, xiyou.wangcong, mlxsw, andrew,
	vivien.didelot, f.fainelli, ast, daniel, simon.horman,
	pieter.jansenvanvuuren, john.hurley

Mon, Nov 13, 2017 at 09:05:26AM CET, jakub.kicinski@netronome.com wrote:
>On Mon, 13 Nov 2017 08:56:58 +0100, Jiri Pirko wrote:
>> Mon, Nov 13, 2017 at 08:54:52AM CET, jakub.kicinski@netronome.com wrote:
>> >On Sun, 12 Nov 2017 16:56:00 +0100, Jiri Pirko wrote:  
>> >> diff --git a/net/sched/sch_ingress.c b/net/sched/sch_ingress.c
>> >> index 5ecc38f..ee89efc 100644
>> >> --- a/net/sched/sch_ingress.c
>> >> +++ b/net/sched/sch_ingress.c
>> >> @@ -60,6 +60,29 @@ static void clsact_chain_head_change(struct tcf_proto *tp_head, void *priv)
>> >>  	struct mini_Qdisc_pair *miniqp = priv;
>> >>  
>> >>  	mini_qdisc_pair_swap(miniqp, tp_head);
>> >> +};
>> >> +
>> >> +static const struct nla_policy ingress_policy[TCA_CLSACT_MAX + 1] = {
>> >> +	[TCA_CLSACT_INGRESS_BLOCK]	= { .type = NLA_U32 },
>> >> +};
>> >> +
>> >> +static int ingress_parse_opt(struct nlattr *opt, u32 *p_ingress_block_index)  
>> >
>> >nit: why the p_ prefix on all the pointers?  
>> 
>> Just to differentiate:
>> u32 *ingress_block_index
>> and
>> u32 ingress_block_index
>
>But why?  There isn't a single ingress_block_index in this patch.
>Looks like Hungarian notation.

Will change that.

* Re: [patch net-next v2 01/10] cls_bpf: move prog offload->netdev check into drivers
  2017-11-13  7:55         ` Jiri Pirko
@ 2017-11-13  8:12           ` Jakub Kicinski
  2017-11-13  8:28             ` Jiri Pirko
  0 siblings, 1 reply; 31+ messages in thread
From: Jakub Kicinski @ 2017-11-13  8:12 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, davem, jhs, xiyou.wangcong, mlxsw, andrew,
	vivien.didelot, f.fainelli, ast, daniel, simon.horman,
	pieter.jansenvanvuuren, john.hurley

On Mon, 13 Nov 2017 08:55:56 +0100, Jiri Pirko wrote:
> Mon, Nov 13, 2017 at 08:17:34AM CET, jakub.kicinski@netronome.com wrote:
> >On Mon, 13 Nov 2017 07:25:38 +0100, Jiri Pirko wrote:  
> >> Mon, Nov 13, 2017 at 03:14:18AM CET, jakub.kicinski@netronome.com wrote:  
> >> >On Sun, 12 Nov 2017 16:55:55 +0100, Jiri Pirko wrote:    
> >> >> From: Jiri Pirko <jiri@mellanox.com>
> >> >> 
> >> >> In order to remove tp->q usage in cls_bpf, the offload->netdev check
> >> >> needs to be moved to individual drivers as only they will have access
> >> >> to appropriate struct net_device.
> >> >> 
> >> >> Signed-off-by: Jiri Pirko <jiri@mellanox.com>    
> >> >
> >> >This seems not entirely correct and it adds unnecessary code.  I think    
> >> 
> >> What is not correct?  
> >
> >From a quick reading it looks like you will allow installing a
> >dev-specific filter without the skip_sw flag.  You haven't fixed what
> 
> Right. I see it now.
> 
> 
> >your previous series broke in the cls_bpf offload model, and now you
> 
> What do you mean exactly?

As explained elsewhere, cls_bpf used to track what's offloaded and
issue ADD/REPLACE/DESTROY accordingly.  Now drivers need to know what
they're offloading, but they still don't.  So if you add a filter that
offloads successfully and then one that doesn't, the spurious DESTROY
will kill the wrong offload.
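
In other words, the driver callback now has to do the tracking itself,
roughly like this (sketch; "offloaded_prog" is a hypothetical driver
field):

	case TC_CLSBPF_DESTROY:
		if (cls_bpf->prog != priv->offloaded_prog)
			return 0; /* spurious destroy, not the prog we offloaded */
		priv->offloaded_prog = NULL;
		break;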

> >break it even further.
> >  
> >> >the XDP and cls_bpf handling could be unified, making way for binding
> >> >the same program to multiple ports of the same device.  Would you mind
> >> >waiting a day for me to send corrections to BPF offload?    
> >> 
> >> Well I'm trying to get this in before net-next closes...  
> >
> >Right, and I'm surprised by that.  I'd hope you'll understand my caution
> >here given recent history.  
> 
> Sure.

I looked through this series and I can't grasp all the details of how
things are supposed to work from the code here :(  Perhaps important
bits went in earlier and I missed them.

Starting from the most fundamental thing - if I have a shared block
full of skip_sw filters and then bind it to a device which doesn't even
have ndo_setup_tc - what prevents that from happening?

AFAICT tcf_block_offload_cmd() returns void.

* Re: [patch net-next v2 04/10] net: sched: introduce block mechanism to handle netif_keep_dst calls
  2017-11-13  8:08         ` Jiri Pirko
@ 2017-11-13  8:17           ` Jakub Kicinski
  2017-11-13  8:35             ` Jiri Pirko
  0 siblings, 1 reply; 31+ messages in thread
From: Jakub Kicinski @ 2017-11-13  8:17 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, davem, jhs, xiyou.wangcong, mlxsw, andrew,
	vivien.didelot, f.fainelli, ast, daniel, simon.horman,
	pieter.jansenvanvuuren, john.hurley

On Mon, 13 Nov 2017 09:08:16 +0100, Jiri Pirko wrote:
> Mon, Nov 13, 2017 at 09:03:34AM CET, jakub.kicinski@netronome.com wrote:
> >On Mon, 13 Nov 2017 08:58:44 +0100, Jiri Pirko wrote:  
> >> Mon, Nov 13, 2017 at 08:47:26AM CET, jakub.kicinski@netronome.com wrote:  
> >> >On Sun, 12 Nov 2017 16:55:58 +0100, Jiri Pirko wrote:    
> >> >> From: Jiri Pirko <jiri@mellanox.com>
> >> >> 
> >> >> >> A couple of classifiers call netif_keep_dst directly on q->dev. That
> >> >> >> is not possible for a shared block where multiple qdiscs own the
> >> >> >> block. So introduce an infrastructure to keep track of the block
> >> >> >> owners in a list and use this list to implement a block variant of
> >> >> >> netif_keep_dst.
> >> >> 
> >> >> Signed-off-by: Jiri Pirko <jiri@mellanox.com>    
> >> >
> >> >Could you use the list you add here to check the ethtool tc offload
> >> >flag? :)    
> >> 
> >> It is a list of qdisc sub parts. Not a list of netdevs  
> >
> >Hm.  OK, I won't pretend I understand the TC code in detail, I thought
> >that would give you all netdevs, but possibly duplicated.  
> 
> Yeah, eventually you can get it. But still, it is unusable to check the
> offload flag cause it has no relation with the block cbs.

OK.  It depends on which flags you intend to check.  I.e. is it OK to
offload filters of the bond because all its slaves have offloads on,
even though the bond itself doesn't?  Is that what you mean?

* Re: [patch net-next v2 01/10] cls_bpf: move prog offload->netdev check into drivers
  2017-11-13  8:12           ` Jakub Kicinski
@ 2017-11-13  8:28             ` Jiri Pirko
  0 siblings, 0 replies; 31+ messages in thread
From: Jiri Pirko @ 2017-11-13  8:28 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, davem, jhs, xiyou.wangcong, mlxsw, andrew,
	vivien.didelot, f.fainelli, ast, daniel, simon.horman,
	pieter.jansenvanvuuren, john.hurley

Mon, Nov 13, 2017 at 09:12:35AM CET, jakub.kicinski@netronome.com wrote:
>On Mon, 13 Nov 2017 08:55:56 +0100, Jiri Pirko wrote:
>> Mon, Nov 13, 2017 at 08:17:34AM CET, jakub.kicinski@netronome.com wrote:
>> >On Mon, 13 Nov 2017 07:25:38 +0100, Jiri Pirko wrote:  
>> >> Mon, Nov 13, 2017 at 03:14:18AM CET, jakub.kicinski@netronome.com wrote:  
>> >> >On Sun, 12 Nov 2017 16:55:55 +0100, Jiri Pirko wrote:    
>> >> >> From: Jiri Pirko <jiri@mellanox.com>
>> >> >> 
>> >> >> In order to remove tp->q usage in cls_bpf, the offload->netdev check
>> >> >> needs to be moved to individual drivers as only they will have access
>> >> >> to appropriate struct net_device.
>> >> >> 
>> >> >> Signed-off-by: Jiri Pirko <jiri@mellanox.com>    
>> >> >
>> >> >This seems not entirely correct and it adds unnecessary code.  I think    
>> >> 
>> >> What is not correct?  
>> >
>> >From a quick reading it looks like you will allow installing a
>> >dev-specific filter without the skip_sw flag.  You haven't fixed what
>> 
>> Right. I see it now.
>> 
>> 
>> >your previous series broke in the cls_bpf offload model, and now you
>> 
>> What do you mean exactly?
>
>As explained elsewhere, cls_bpf used to track what's offloaded and
>issue ADD/REPLACE/DESTROY accordingly.  Now drivers need to know what
>they're offloading, but they still don't.  So if you add a filter that
>offloads successfully and then one that doesn't, the spurious DESTROY
>will kill the wrong offload.

Ah, got it.

>
>> >break it even further.
>> >  
>> >> >the XDP and cls_bpf handling could be unified, making way for binding
>> >> >the same program to multiple ports of the same device.  Would you mind
>> >> >waiting a day for me to send corrections to BPF offload?    
>> >> 
>> >> Well I'm trying to get this in before net-next closes...  
>> >
>> >Right, and I'm surprised by that.  I'd hope you'll understand my caution
>> >here given recent history.  
>> 
>> Sure.
>
>I looked through this series and I can't grasp all the details of how
>things are supposed to work from the code here :(  Perhaps important
>bits went in earlier and I missed them.
>
>Starting from the most fundamental thing - if I have a shared block
>full of skip_sw filters and then bind it to a device which doesn't even
>have ndo_setup_tc - what prevents that from happening?

Nothing atm. I need to add some check there. Thanks.
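
Probably something along these lines in the bind path (sketch; the
helper name is made up, the ndo_setup_tc check is the point):

	if (tcf_block_has_skip_sw_filters(block) &&
	    !dev->netdev_ops->ndo_setup_tc)
		return -EOPNOTSUPP;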


>
>AFAICT tcf_block_offload_cmd() returns void.

* Re: [patch net-next v2 04/10] net: sched: introduce block mechanism to handle netif_keep_dst calls
  2017-11-13  8:17           ` Jakub Kicinski
@ 2017-11-13  8:35             ` Jiri Pirko
  2017-11-13  8:45               ` Jakub Kicinski
  0 siblings, 1 reply; 31+ messages in thread
From: Jiri Pirko @ 2017-11-13  8:35 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, davem, jhs, xiyou.wangcong, mlxsw, andrew,
	vivien.didelot, f.fainelli, ast, daniel, simon.horman,
	pieter.jansenvanvuuren, john.hurley

Mon, Nov 13, 2017 at 09:17:23AM CET, jakub.kicinski@netronome.com wrote:
>On Mon, 13 Nov 2017 09:08:16 +0100, Jiri Pirko wrote:
>> Mon, Nov 13, 2017 at 09:03:34AM CET, jakub.kicinski@netronome.com wrote:
>> >On Mon, 13 Nov 2017 08:58:44 +0100, Jiri Pirko wrote:  
>> >> Mon, Nov 13, 2017 at 08:47:26AM CET, jakub.kicinski@netronome.com wrote:  
>> >> >On Sun, 12 Nov 2017 16:55:58 +0100, Jiri Pirko wrote:    
>> >> >> From: Jiri Pirko <jiri@mellanox.com>
>> >> >> 
>> >> >> >> A couple of classifiers call netif_keep_dst directly on q->dev. That
>> >> >> >> is not possible for a shared block where multiple qdiscs own the
>> >> >> >> block. So introduce an infrastructure to keep track of the block
>> >> >> >> owners in a list and use this list to implement a block variant of
>> >> >> >> netif_keep_dst.
>> >> >> 
>> >> >> Signed-off-by: Jiri Pirko <jiri@mellanox.com>    
>> >> >
>> >> >Could you use the list you add here to check the ethtool tc offload
>> >> >flag? :)    
>> >> 
>> >> It is a list of qdisc sub parts. Not a list of netdevs  
>> >
>> >Hm.  OK, I won't pretend I understand the TC code in detail; I thought
>> >that would give you all netdevs, possibly duplicated.  
>> 
>> Yeah, eventually you can get it. But still, it is unusable to check the
>> offload flag because it has no relation to the block cbs.
>
>OK.  Depends on which flags you intend to check.  I.e., is it OK to
>offload filters of the bond because all its slaves have offloads on,
>but the bond itself doesn't?  Is that what you mean?

No.
What I mean is, there is not always a 1:1 relation between a registered
block cb and a netdev. For example, in the case of mlxsw: when multiple
mlxsw devices share the same block, there is only one block cb call for
all of them.
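
To make the "qdisc sub parts" point concrete, this is roughly what the
owner list from patch 04 provides (field and helper names approximate):

	struct tcf_block_owner_item {
		struct list_head list;
		struct Qdisc *q;	/* a qdisc bound to this block */
	};

	void tcf_block_netif_keep_dst(struct tcf_block *block)
	{
		struct tcf_block_owner_item *item;

		block->keep_dst = true;	/* remember for future owners */
		list_for_each_entry(item, &block->owner_list, list)
			netif_keep_dst(qdisc_dev(item->q));
	}

So the list gives you the owning qdiscs - and via qdisc_dev() their
netdevs, possibly duplicated - but it says nothing about who registered
block cbs.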

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [patch net-next v2 04/10] net: sched: introduce block mechanism to handle netif_keep_dst calls
  2017-11-13  8:35             ` Jiri Pirko
@ 2017-11-13  8:45               ` Jakub Kicinski
  2017-11-13  8:54                 ` Jiri Pirko
  0 siblings, 1 reply; 31+ messages in thread
From: Jakub Kicinski @ 2017-11-13  8:45 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, davem, jhs, xiyou.wangcong, mlxsw, andrew,
	vivien.didelot, f.fainelli, ast, daniel, simon.horman,
	pieter.jansenvanvuuren, john.hurley

On Mon, 13 Nov 2017 09:35:55 +0100, Jiri Pirko wrote:
> Mon, Nov 13, 2017 at 09:17:23AM CET, jakub.kicinski@netronome.com wrote:
> >On Mon, 13 Nov 2017 09:08:16 +0100, Jiri Pirko wrote:  
> >> Mon, Nov 13, 2017 at 09:03:34AM CET, jakub.kicinski@netronome.com wrote:  
> >> >On Mon, 13 Nov 2017 08:58:44 +0100, Jiri Pirko wrote:    
> >> >> Mon, Nov 13, 2017 at 08:47:26AM CET, jakub.kicinski@netronome.com wrote:    
> >> >> >On Sun, 12 Nov 2017 16:55:58 +0100, Jiri Pirko wrote:      
> >> >> >> From: Jiri Pirko <jiri@mellanox.com>
> >> >> >> 
> >> >> >> A couple of classifiers call netif_keep_dst directly on q->dev. That is
> >> >> >> not possible to do for a shared block where multiple qdiscs own the
> >> >> >> block. So introduce an infrastructure to keep track of the block
> >> >> >> owners in a list and use this list to implement a block variant of
> >> >> >> netif_keep_dst.
> >> >> >> 
> >> >> >> Signed-off-by: Jiri Pirko <jiri@mellanox.com>      
> >> >> >
> >> >> >Could you use the list you add here to check the ethtool tc offload
> >> >> >flag? :)      
> >> >> 
> >> >> It is a list of qdisc sub parts. Not a list of netdevs    
> >> >
> >> >Hm.  OK, I won't pretend I understand the TC code in detail; I thought
> >> >that would give you all netdevs, possibly duplicated.    
> >> 
> >> Yeah, eventually you can get it. But still, it is unusable to check the
> >> offload flag because it has no relation to the block cbs.  
> >
> >OK.  Depends on which flags you intend to check.  I.e., is it OK to
> >offload filters of the bond because all its slaves have offloads on,
> >but the bond itself doesn't?  Is that what you mean?  
> 
> No.
> What I mean is, there is not always a 1:1 relation between a registered
> block cb and a netdev. For example, in the case of mlxsw: when multiple
> mlxsw devices share the same block, there is only one block cb call for
> all of them.

OK, I'm clearly missing something.  I would have thought that the case
where the callback is shared for multiple port netdevs is pretty well
covered by the Qdiscs owning the block, given that you said you intend
to offload the rule only if all port netdevs sharing the block have TC
offload on.
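
(For what it's worth, the check I have in mind could walk that owner
list - sketch only, names approximate:

	static bool tcf_block_offload_allowed(struct tcf_block *block)
	{
		struct tcf_block_owner_item *item;

		list_for_each_entry(item, &block->owner_list, list)
			if (!tc_can_offload(qdisc_dev(item->q)))
				return false;
		return true;
	}

i.e. only offload a rule when every netdev bound to the block has the
hw-tc-offload feature on.)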

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [patch net-next v2 04/10] net: sched: introduce block mechanism to handle netif_keep_dst calls
  2017-11-13  8:45               ` Jakub Kicinski
@ 2017-11-13  8:54                 ` Jiri Pirko
  0 siblings, 0 replies; 31+ messages in thread
From: Jiri Pirko @ 2017-11-13  8:54 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, davem, jhs, xiyou.wangcong, mlxsw, andrew,
	vivien.didelot, f.fainelli, ast, daniel, simon.horman,
	pieter.jansenvanvuuren, john.hurley

Mon, Nov 13, 2017 at 09:45:12AM CET, jakub.kicinski@netronome.com wrote:
>On Mon, 13 Nov 2017 09:35:55 +0100, Jiri Pirko wrote:
>> Mon, Nov 13, 2017 at 09:17:23AM CET, jakub.kicinski@netronome.com wrote:
>> >On Mon, 13 Nov 2017 09:08:16 +0100, Jiri Pirko wrote:  
>> >> Mon, Nov 13, 2017 at 09:03:34AM CET, jakub.kicinski@netronome.com wrote:  
>> >> >On Mon, 13 Nov 2017 08:58:44 +0100, Jiri Pirko wrote:    
>> >> >> Mon, Nov 13, 2017 at 08:47:26AM CET, jakub.kicinski@netronome.com wrote:    
>> >> >> >On Sun, 12 Nov 2017 16:55:58 +0100, Jiri Pirko wrote:      
>> >> >> >> From: Jiri Pirko <jiri@mellanox.com>
>> >> >> >> 
>> >> >> >> A couple of classifiers call netif_keep_dst directly on q->dev. That is
>> >> >> >> not possible to do for a shared block where multiple qdiscs own the
>> >> >> >> block. So introduce an infrastructure to keep track of the block
>> >> >> >> owners in a list and use this list to implement a block variant of
>> >> >> >> netif_keep_dst.
>> >> >> >> 
>> >> >> >> Signed-off-by: Jiri Pirko <jiri@mellanox.com>      
>> >> >> >
>> >> >> >Could you use the list you add here to check the ethtool tc offload
>> >> >> >flag? :)      
>> >> >> 
>> >> >> It is a list of qdisc sub parts. Not a list of netdevs    
>> >> >
>> >> >Hm.  OK, I won't pretend I understand the TC code in detail; I thought
>> >> >that would give you all netdevs, possibly duplicated.    
>> >> 
>> >> Yeah, eventually you can get it. But still, it is unusable to check the
>> >> offload flag because it has no relation to the block cbs.  
>> >
>> >OK.  Depends on which flags you intend to check.  I.e., is it OK to
>> >offload filters of the bond because all its slaves have offloads on,
>> >but the bond itself doesn't?  Is that what you mean?  
>> 
>> No.
>> What I mean is, there is not always a 1:1 relation between a registered
>> block cb and a netdev. For example, in the case of mlxsw: when multiple
>> mlxsw devices share the same block, there is only one block cb call for
>> all of them.
>
>OK, I'm clearly missing something.  I would have thought that the case
>where the callback is shared for multiple port netdevs is pretty well
>covered by the Qdiscs owning the block, given that you said you intend
>to offload the rule only if all port netdevs sharing the block have TC
>offload on.

You are right. But still, how do you map the qdiscs owning the block to
whoever registers the cb? That is my point.
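
To illustrate (simplified, hypothetical driver names; the real mlxsw
code also refcounts the ruleset): only the first port binding a block
registers a cb, so the cb carries no per-netdev identity to map back
to:

	static int drv_port_block_bind(struct drv_port *port,
				       struct tcf_block *block)
	{
		struct drv_ruleset *ruleset;

		ruleset = drv_ruleset_lookup(block);
		if (ruleset)	/* block already known - reuse the cb */
			return drv_ruleset_bind_port(ruleset, port);

		ruleset = drv_ruleset_create(block);
		if (IS_ERR(ruleset))
			return PTR_ERR(ruleset);
		/* one cb per block, shared by all bound ports */
		return tcf_block_cb_register(block, drv_setup_tc_block_cb,
					     ruleset, ruleset);
	}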

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [patch net-next v2 00/10] net: sched: allow qdiscs to share filter block instances
  2017-11-12 15:55 [patch net-next v2 00/10] net: sched: allow qdiscs to share filter block instances Jiri Pirko
                   ` (9 preceding siblings ...)
  2017-11-12 15:56 ` [patch net-next v2 10/10] mlxsw: spectrum_acl: Pass mlxsw_sp_port down to ruleset bind/unbind ops Jiri Pirko
@ 2017-11-15 23:12 ` Cong Wang
  2017-11-18 17:18   ` Jiri Pirko
  10 siblings, 1 reply; 31+ messages in thread
From: Cong Wang @ 2017-11-15 23:12 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: Linux Kernel Network Developers, David Miller, Jamal Hadi Salim,
	mlxsw, andrew, Vivien Didelot, Florian Fainelli, Jakub Kicinski,
	Alexei Starovoitov, Daniel Borkmann, Simon Horman,
	Pieter Jansen van Vuuren, john.hurley

On Sun, Nov 12, 2017 at 7:55 AM, Jiri Pirko <jiri@resnulli.us> wrote:
> From: Jiri Pirko <jiri@mellanox.com>
>
> Currently the filters added to qdiscs are independent. So for example if you
> have 2 netdevices and you create ingress qdisc on both and you want to add
> identical filter rules both, you need to add them twice. This patchset
> makes this easier and mainly saves resources allowing to share all filters
> within a qdisc - I call it a "filter block". Also this helps to save
> resources when we do offload to hw for example to expensive TCAM.
>
> So back to the example. First, we create 2 qdiscs. Both will share
> block number 22. "22" is just an identification. If we don't pass any
> block number, a new one will be generated by kernel:

Should not block 0 be used by default if not specified by the user?
Why a new one?


>
> $ tc qdisc add dev ens7 ingress block 22
>                                 ^^^^^^^^
> $ tc qdisc add dev ens8 ingress block 22
>                                 ^^^^^^^^
>
> Now if we list the qdiscs, we will see the block index in the output:
>
> $ tc qdisc
> qdisc ingress ffff: dev ens7 parent ffff:fff1 block 22
> qdisc ingress ffff: dev ens8 parent ffff:fff1 block 22
>
> Now we can add filter to any of qdiscs sharing the same block:
>
> $ tc filter add dev ens7 ingress protocol ip pref 25 flower dst_ip 192.168.0.0/16 action drop
>

So you don't need to specify block 22 for this filter?
Because there is only one block???


>
> We will see the same output if we list filters for ens7 and ens8, including stats:
>
> $ tc -s filter show dev ens7 ingress
> filter protocol ip pref 25 flower chain 0
> filter protocol ip pref 25 flower chain 0 handle 0x1
>   eth_type ipv4
>   dst_ip 192.168.0.0/16
>   not_in_hw
>         action order 1: gact action drop
>          random type none pass val 0
>          index 1 ref 1 bind 1 installed 39 sec used 2 sec
>         Action statistics:
>         Sent 3108 bytes 37 pkt (dropped 37, overlimits 0 requeues 0)
>         backlog 0b 0p requeues 0


>I don't see which block it belongs to here.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [patch net-next v2 00/10] net: sched: allow qdiscs to share filter block instances
  2017-11-15 23:12 ` [patch net-next v2 00/10] net: sched: allow qdiscs to share filter block instances Cong Wang
@ 2017-11-18 17:18   ` Jiri Pirko
  0 siblings, 0 replies; 31+ messages in thread
From: Jiri Pirko @ 2017-11-18 17:18 UTC (permalink / raw)
  To: Cong Wang
  Cc: Linux Kernel Network Developers, David Miller, Jamal Hadi Salim,
	mlxsw, andrew, Vivien Didelot, Florian Fainelli, Jakub Kicinski,
	Alexei Starovoitov, Daniel Borkmann, Simon Horman,
	Pieter Jansen van Vuuren, john.hurley

Thu, Nov 16, 2017 at 12:12:47AM CET, xiyou.wangcong@gmail.com wrote:
>On Sun, Nov 12, 2017 at 7:55 AM, Jiri Pirko <jiri@resnulli.us> wrote:
>> From: Jiri Pirko <jiri@mellanox.com>
>>
>> Currently the filters added to qdiscs are independent. So for example if you
>> have 2 netdevices and you create ingress qdisc on both and you want to add
>> identical filter rules both, you need to add them twice. This patchset
>> makes this easier and mainly saves resources allowing to share all filters
>> within a qdisc - I call it a "filter block". Also this helps to save
>> resources when we do offload to hw for example to expensive TCAM.
>>
>> So back to the example. First, we create 2 qdiscs. Both will share
>> block number 22. "22" is just an identification. If we don't pass any
>> block number, a new one will be generated by kernel:
>
>Should not block 0 be used by default if not specified by the user?
>Why a new one?

That would mean you would share block 0 among all newly created qdiscs.
That is not right. By default, there should be no sharing.
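
In pseudo-code terms, the intended lookup is roughly this (sketch only,
names approximate; the real code keeps the index space per netns):

	static struct tcf_block *tcf_block_find_or_create(struct net *net,
							  u32 block_index)
	{
		struct tcf_block *block;

		if (block_index) {
			/* explicit index - share if it already exists */
			block = tcf_block_lookup(net, block_index);
			if (block)
				return block;
		}
		/* index 0: the kernel picks a fresh, unused index, so a
		 * qdisc created without "block N" never shares anything */
		return tcf_block_create(net, block_index);
	}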

>
>
>>
>> $ tc qdisc add dev ens7 ingress block 22
>>                                 ^^^^^^^^
>> $ tc qdisc add dev ens8 ingress block 22
>>                                 ^^^^^^^^
>>
>> Now if we list the qdiscs, we will see the block index in the output:
>>
>> $ tc qdisc
>> qdisc ingress ffff: dev ens7 parent ffff:fff1 block 22
>> qdisc ingress ffff: dev ens8 parent ffff:fff1 block 22
>>
>> Now we can add filter to any of qdiscs sharing the same block:
>>
>> $ tc filter add dev ens7 ingress protocol ip pref 25 flower dst_ip 192.168.0.0/16 action drop
>>
>
>So you don't need to specify block 22 for this filter?
>Because there is only one block???

ens7's ingress qdisc was assigned block 22 during its creation.
There is always exactly one block assigned to a qdisc. However, there
might be multiple qdiscs sharing one block. I will try to make this
clearer in the cover letter.



>
>
>>
>> We will see the same output if we list filters for ens7 and ens8, including stats:
>>
>> $ tc -s filter show dev ens7 ingress
>> filter protocol ip pref 25 flower chain 0
>> filter protocol ip pref 25 flower chain 0 handle 0x1
>>   eth_type ipv4
>>   dst_ip 192.168.0.0/16
>>   not_in_hw
>>         action order 1: gact action drop
>>          random type none pass val 0
>>          index 1 ref 1 bind 1 installed 39 sec used 2 sec
>>         Action statistics:
>>         Sent 3108 bytes 37 pkt (dropped 37, overlimits 0 requeues 0)
>>         backlog 0b 0p requeues 0
>
>
>I don't see which block it belongs to here.

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2017-11-18 17:18 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-12 15:55 [patch net-next v2 00/10] net: sched: allow qdiscs to share filter block instances Jiri Pirko
2017-11-12 15:55 ` [patch net-next v2 01/10] cls_bpf: move prog offload->netdev check into drivers Jiri Pirko
2017-11-13  2:14   ` Jakub Kicinski
2017-11-13  6:25     ` Jiri Pirko
2017-11-13  7:17       ` Jakub Kicinski
2017-11-13  7:55         ` Jiri Pirko
2017-11-13  8:12           ` Jakub Kicinski
2017-11-13  8:28             ` Jiri Pirko
2017-11-12 15:55 ` [patch net-next v2 02/10] net: sched: introduce support for multiple filter chain pointers registration Jiri Pirko
2017-11-12 15:55 ` [patch net-next v2 03/10] net: sched: avoid usage of tp->q in tcf_classify Jiri Pirko
2017-11-12 15:55 ` [patch net-next v2 04/10] net: sched: introduce block mechanism to handle netif_keep_dst calls Jiri Pirko
2017-11-13  7:47   ` Jakub Kicinski
2017-11-13  7:58     ` Jiri Pirko
2017-11-13  8:03       ` Jakub Kicinski
2017-11-13  8:08         ` Jiri Pirko
2017-11-13  8:17           ` Jakub Kicinski
2017-11-13  8:35             ` Jiri Pirko
2017-11-13  8:45               ` Jakub Kicinski
2017-11-13  8:54                 ` Jiri Pirko
2017-11-12 15:55 ` [patch net-next v2 05/10] net: sched: remove classid and q fields from tcf_proto Jiri Pirko
2017-11-12 15:56 ` [patch net-next v2 06/10] net: sched: allow ingress and clsact qdiscs to share filter blocks Jiri Pirko
2017-11-13  7:54   ` Jakub Kicinski
2017-11-13  7:56     ` Jiri Pirko
2017-11-13  8:05       ` Jakub Kicinski
2017-11-13  8:09         ` Jiri Pirko
2017-11-12 15:56 ` [patch net-next v2 07/10] mlxsw: spectrum_acl: Reshuffle code around mlxsw_sp_acl_ruleset_create/destroy Jiri Pirko
2017-11-12 15:56 ` [patch net-next v2 08/10] mlxsw: spectrum_acl: Don't store netdev and ingress for ruleset unbind Jiri Pirko
2017-11-12 15:56 ` [patch net-next v2 09/10] mlxsw: spectrum_acl: Implement TC block sharing Jiri Pirko
2017-11-12 15:56 ` [patch net-next v2 10/10] mlxsw: spectrum_acl: Pass mlxsw_sp_port down to ruleset bind/unbind ops Jiri Pirko
2017-11-15 23:12 ` [patch net-next v2 00/10] net: sched: allow qdiscs to share filter block instances Cong Wang
2017-11-18 17:18   ` Jiri Pirko
