All of lore.kernel.org
 help / color / mirror / Atom feed
* [patch net-next 00/10] net: sched: introduce multichain support for filters
@ 2017-04-27 11:12 Jiri Pirko
  2017-04-27 11:12 ` [patch net-next 01/10] net: sched: move tc_classify function to cls_api.c Jiri Pirko
                   ` (12 more replies)
  0 siblings, 13 replies; 26+ messages in thread
From: Jiri Pirko @ 2017-04-27 11:12 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, dsa, edumazet, stephen, daniel,
	alexander.h.duyck, mlxsw, simon.horman

From: Jiri Pirko <jiri@mellanox.com>

Currently, each classful qdisc holds one chain of filters.
This chain is traversed and each filter could be matched on, which
may lead to execution of list of actions. One of such action
could be "reclassify", which would "reset" the processing of the
filter chain.

So this filter chain could be looked at as a flat table.
Sometimes it is convenient for user to configure a hierarchy
of tables. Example usecase is encapsulation.

Hierarchy of tables is a common way how it is done in HW pipelines.
So it is much more convenient to offload this.

This patchset contains two major patches:
8/10 - This patch introduces the support for having multiple
       chains of filters. 
10/10 - This patch extends existing gact action to allow
        going to specified chain
The rest of the patches are smaller or bigger depencies of those 2.
Please see individual patch descriptions for details.

Corresponding iproute2 patches are appended as a reply to this cover letter.

Simple example:
$ tc qdisc add dev eth0 ingress
$ tc filter add dev eth0 parent ffff: protocol ip pref 33 flower dst_mac 52:54:00:3d:c7:6d action goto chain 11
$ tc filter add dev eth0 parent ffff: protocol ip pref 22 chain 11 flower dst_ip 192.168.40.1 action drop
$ tc filter show dev eth0 root
filter parent ffff: protocol ip pref 33 flower chain 0 
filter parent ffff: protocol ip pref 33 flower chain 0 handle 0x1 
  dst_mac 52:54:00:3d:c7:6d
  eth_type ipv4
        action order 1: gact action goto chain 11
         random type none pass val 0
         index 2 ref 1 bind 1
 
filter parent ffff: protocol ip pref 22 flower chain 11 
filter parent ffff: protocol ip pref 22 flower chain 11 handle 0x1 
  eth_type ipv4
  dst_ip 192.168.40.1
        action order 1: gact action drop
         random type none pass val 0
         index 3 ref 1 bind 1

Jiri Pirko (10):
  net: sched: move tc_classify function to cls_api.c
  net: sched: introduce tcf block infractructure
  net: sched: rename tcf_destroy_chain helper
  net: sched: replace nprio by a bool to make the function more readable
  net: sched: move TC_H_MAJ macro call into tcf_auto_prio
  net: sched: introduce helpers to work with filter chains
  net: sched: push chain dump to a separate function
  net: sched: introduce multichain support for filters
  net: sched: push tp down to action init callback
  net: sched: extend gact to allow jumping to another filter chain

 include/net/act_api.h               |  18 +-
 include/net/pkt_cls.h               |  24 ++-
 include/net/pkt_sched.h             |   3 -
 include/net/sch_generic.h           |  26 ++-
 include/net/tc_act/tc_gact.h        |   2 +
 include/uapi/linux/pkt_cls.h        |   1 +
 include/uapi/linux/rtnetlink.h      |   1 +
 include/uapi/linux/tc_act/tc_gact.h |   1 +
 net/core/dev.c                      |   5 +-
 net/sched/act_api.c                 |  20 +-
 net/sched/act_bpf.c                 |   6 +-
 net/sched/act_connmark.c            |   6 +-
 net/sched/act_csum.c                |   6 +-
 net/sched/act_gact.c                |  54 ++++-
 net/sched/act_ife.c                 |   6 +-
 net/sched/act_ipt.c                 |  12 +-
 net/sched/act_mirred.c              |   6 +-
 net/sched/act_nat.c                 |   3 +-
 net/sched/act_pedit.c               |   6 +-
 net/sched/act_police.c              |   6 +-
 net/sched/act_sample.c              |   6 +-
 net/sched/act_simple.c              |   6 +-
 net/sched/act_skbedit.c             |   6 +-
 net/sched/act_skbmod.c              |   6 +-
 net/sched/act_tunnel_key.c          |   6 +-
 net/sched/act_vlan.c                |   6 +-
 net/sched/cls_api.c                 | 401 ++++++++++++++++++++++++++++--------
 net/sched/sch_api.c                 |  50 +----
 net/sched/sch_atm.c                 |  29 ++-
 net/sched/sch_cbq.c                 |  21 +-
 net/sched/sch_drr.c                 |  15 +-
 net/sched/sch_dsmark.c              |  19 +-
 net/sched/sch_fq_codel.c            |  17 +-
 net/sched/sch_hfsc.c                |  21 +-
 net/sched/sch_htb.c                 |  28 ++-
 net/sched/sch_ingress.c             |  61 ++++--
 net/sched/sch_multiq.c              |  16 +-
 net/sched/sch_prio.c                |  19 +-
 net/sched/sch_qfq.c                 |  16 +-
 net/sched/sch_sfb.c                 |  17 +-
 net/sched/sch_sfq.c                 |  17 +-
 41 files changed, 680 insertions(+), 315 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [patch net-next 01/10] net: sched: move tc_classify function to cls_api.c
  2017-04-27 11:12 [patch net-next 00/10] net: sched: introduce multichain support for filters Jiri Pirko
@ 2017-04-27 11:12 ` Jiri Pirko
  2017-04-27 11:12 ` [patch net-next 02/10] net: sched: introduce tcf block infractructure Jiri Pirko
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 26+ messages in thread
From: Jiri Pirko @ 2017-04-27 11:12 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, dsa, edumazet, stephen, daniel,
	alexander.h.duyck, mlxsw, simon.horman

From: Jiri Pirko <jiri@mellanox.com>

Move tc_classify function to cls_api.c where it belongs, rename it to
fit the namespace.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 include/net/pkt_cls.h    |  9 +++++++++
 include/net/pkt_sched.h  |  3 ---
 net/core/dev.c           |  5 +++--
 net/sched/cls_api.c      | 42 ++++++++++++++++++++++++++++++++++++++++++
 net/sched/sch_api.c      | 48 ------------------------------------------------
 net/sched/sch_atm.c      |  2 +-
 net/sched/sch_cbq.c      |  2 +-
 net/sched/sch_drr.c      |  2 +-
 net/sched/sch_dsmark.c   |  2 +-
 net/sched/sch_fq_codel.c |  2 +-
 net/sched/sch_hfsc.c     |  2 +-
 net/sched/sch_htb.c      |  2 +-
 net/sched/sch_multiq.c   |  2 +-
 net/sched/sch_prio.c     |  2 +-
 net/sched/sch_qfq.c      |  2 +-
 net/sched/sch_sfb.c      |  2 +-
 net/sched/sch_sfq.c      |  2 +-
 17 files changed, 66 insertions(+), 65 deletions(-)

diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index 269fd78..cb74506 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -19,10 +19,19 @@ int unregister_tcf_proto_ops(struct tcf_proto_ops *ops);
 
 #ifdef CONFIG_NET_CLS
 void tcf_destroy_chain(struct tcf_proto __rcu **fl);
+int tcf_classify(struct sk_buff *skb, const struct tcf_proto *tp,
+		 struct tcf_result *res, bool compat_mode);
+
 #else
 static inline void tcf_destroy_chain(struct tcf_proto __rcu **fl)
 {
 }
+
+static inline int tcf_classify(struct sk_buff *skb, const struct tcf_proto *tp,
+			       struct tcf_result *res, bool compat_mode)
+{
+	return TC_ACT_UNSPEC;
+}
 #endif
 
 static inline unsigned long
diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h
index bec46f6..2579c20 100644
--- a/include/net/pkt_sched.h
+++ b/include/net/pkt_sched.h
@@ -113,9 +113,6 @@ static inline void qdisc_run(struct Qdisc *q)
 		__qdisc_run(q);
 }
 
-int tc_classify(struct sk_buff *skb, const struct tcf_proto *tp,
-		struct tcf_result *res, bool compat_mode);
-
 static inline __be16 tc_skb_protocol(const struct sk_buff *skb)
 {
 	/* We need to take extra care in case the skb came via
diff --git a/net/core/dev.c b/net/core/dev.c
index 3361ee8..51d2861 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -104,6 +104,7 @@
 #include <net/dst.h>
 #include <net/dst_metadata.h>
 #include <net/pkt_sched.h>
+#include <net/pkt_cls.h>
 #include <net/checksum.h>
 #include <net/xfrm.h>
 #include <linux/highmem.h>
@@ -3177,7 +3178,7 @@ sch_handle_egress(struct sk_buff *skb, int *ret, struct net_device *dev)
 	/* qdisc_skb_cb(skb)->pkt_len was already set by the caller. */
 	qdisc_bstats_cpu_update(cl->q, skb);
 
-	switch (tc_classify(skb, cl, &cl_res, false)) {
+	switch (tcf_classify(skb, cl, &cl_res, false)) {
 	case TC_ACT_OK:
 	case TC_ACT_RECLASSIFY:
 		skb->tc_index = TC_H_MIN(cl_res.classid);
@@ -3947,7 +3948,7 @@ sch_handle_ingress(struct sk_buff *skb, struct packet_type **pt_prev, int *ret,
 	skb->tc_at_ingress = 1;
 	qdisc_bstats_cpu_update(cl->q, skb);
 
-	switch (tc_classify(skb, cl, &cl_res, false)) {
+	switch (tcf_classify(skb, cl, &cl_res, false)) {
 	case TC_ACT_OK:
 	case TC_ACT_RECLASSIFY:
 		skb->tc_index = TC_H_MIN(cl_res.classid);
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 22f88b3..a97af61 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -196,6 +196,48 @@ void tcf_destroy_chain(struct tcf_proto __rcu **fl)
 }
 EXPORT_SYMBOL(tcf_destroy_chain);
 
+/* Main classifier routine: scans classifier chain attached
+ * to this qdisc, (optionally) tests for protocol and asks
+ * specific classifiers.
+ */
+int tcf_classify(struct sk_buff *skb, const struct tcf_proto *tp,
+		 struct tcf_result *res, bool compat_mode)
+{
+	__be16 protocol = tc_skb_protocol(skb);
+	const int max_reclassify_loop = 4;
+	const struct tcf_proto *old_tp = tp;
+	int limit = 0;
+
+reclassify:
+	for (; tp; tp = rcu_dereference_bh(tp->next)) {
+		int err;
+
+		if (tp->protocol != protocol &&
+		    tp->protocol != htons(ETH_P_ALL))
+			continue;
+
+		err = tp->classify(skb, tp, res);
+		if (unlikely(err == TC_ACT_RECLASSIFY && !compat_mode))
+			goto reset;
+		if (err >= 0)
+			return err;
+	}
+
+	return TC_ACT_UNSPEC; /* signal: continue lookup */
+reset:
+	if (unlikely(limit++ >= max_reclassify_loop)) {
+		net_notice_ratelimited("%s: reclassify loop, rule prio %u, protocol %02x\n",
+				       tp->q->ops->id, tp->prio & 0xffff,
+				       ntohs(tp->protocol));
+		return TC_ACT_SHOT;
+	}
+
+	tp = old_tp;
+	protocol = tc_skb_protocol(skb);
+	goto reclassify;
+}
+EXPORT_SYMBOL(tcf_classify);
+
 /* Add/change/delete/get a filter node */
 
 static int tc_ctl_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index bbe57d5..2bf20bb 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -1872,54 +1872,6 @@ static int tc_dump_tclass(struct sk_buff *skb, struct netlink_callback *cb)
 	return skb->len;
 }
 
-/* Main classifier routine: scans classifier chain attached
- * to this qdisc, (optionally) tests for protocol and asks
- * specific classifiers.
- */
-int tc_classify(struct sk_buff *skb, const struct tcf_proto *tp,
-		struct tcf_result *res, bool compat_mode)
-{
-	__be16 protocol = tc_skb_protocol(skb);
-#ifdef CONFIG_NET_CLS_ACT
-	const int max_reclassify_loop = 4;
-	const struct tcf_proto *old_tp = tp;
-	int limit = 0;
-
-reclassify:
-#endif
-	for (; tp; tp = rcu_dereference_bh(tp->next)) {
-		int err;
-
-		if (tp->protocol != protocol &&
-		    tp->protocol != htons(ETH_P_ALL))
-			continue;
-
-		err = tp->classify(skb, tp, res);
-#ifdef CONFIG_NET_CLS_ACT
-		if (unlikely(err == TC_ACT_RECLASSIFY && !compat_mode))
-			goto reset;
-#endif
-		if (err >= 0)
-			return err;
-	}
-
-	return TC_ACT_UNSPEC; /* signal: continue lookup */
-#ifdef CONFIG_NET_CLS_ACT
-reset:
-	if (unlikely(limit++ >= max_reclassify_loop)) {
-		net_notice_ratelimited("%s: reclassify loop, rule prio %u, protocol %02x\n",
-				       tp->q->ops->id, tp->prio & 0xffff,
-				       ntohs(tp->protocol));
-		return TC_ACT_SHOT;
-	}
-
-	tp = old_tp;
-	protocol = tc_skb_protocol(skb);
-	goto reclassify;
-#endif
-}
-EXPORT_SYMBOL(tc_classify);
-
 #ifdef CONFIG_PROC_FS
 static int psched_show(struct seq_file *seq, void *v)
 {
diff --git a/net/sched/sch_atm.c b/net/sched/sch_atm.c
index 40cbcee..89d32fa 100644
--- a/net/sched/sch_atm.c
+++ b/net/sched/sch_atm.c
@@ -377,7 +377,7 @@ static int atm_tc_enqueue(struct sk_buff *skb, struct Qdisc *sch,
 		list_for_each_entry(flow, &p->flows, list) {
 			fl = rcu_dereference_bh(flow->filter_list);
 			if (fl) {
-				result = tc_classify(skb, fl, &res, true);
+				result = tcf_classify(skb, fl, &res, true);
 				if (result < 0)
 					continue;
 				flow = (struct atm_flow_data *)res.class;
diff --git a/net/sched/sch_cbq.c b/net/sched/sch_cbq.c
index 7415859..c543ea3 100644
--- a/net/sched/sch_cbq.c
+++ b/net/sched/sch_cbq.c
@@ -233,7 +233,7 @@ cbq_classify(struct sk_buff *skb, struct Qdisc *sch, int *qerr)
 		/*
 		 * Step 2+n. Apply classifier.
 		 */
-		result = tc_classify(skb, fl, &res, true);
+		result = tcf_classify(skb, fl, &res, true);
 		if (!fl || result < 0)
 			goto fallback;
 
diff --git a/net/sched/sch_drr.c b/net/sched/sch_drr.c
index 58a8c32..446d79b 100644
--- a/net/sched/sch_drr.c
+++ b/net/sched/sch_drr.c
@@ -333,7 +333,7 @@ static struct drr_class *drr_classify(struct sk_buff *skb, struct Qdisc *sch,
 
 	*qerr = NET_XMIT_SUCCESS | __NET_XMIT_BYPASS;
 	fl = rcu_dereference_bh(q->filter_list);
-	result = tc_classify(skb, fl, &res, false);
+	result = tcf_classify(skb, fl, &res, false);
 	if (result >= 0) {
 #ifdef CONFIG_NET_CLS_ACT
 		switch (result) {
diff --git a/net/sched/sch_dsmark.c b/net/sched/sch_dsmark.c
index 1c0f877..7bc638d 100644
--- a/net/sched/sch_dsmark.c
+++ b/net/sched/sch_dsmark.c
@@ -234,7 +234,7 @@ static int dsmark_enqueue(struct sk_buff *skb, struct Qdisc *sch,
 	else {
 		struct tcf_result res;
 		struct tcf_proto *fl = rcu_dereference_bh(p->filter_list);
-		int result = tc_classify(skb, fl, &res, false);
+		int result = tcf_classify(skb, fl, &res, false);
 
 		pr_debug("result %d class 0x%04x\n", result, res.classid);
 
diff --git a/net/sched/sch_fq_codel.c b/net/sched/sch_fq_codel.c
index 18bbb54..0db8d68 100644
--- a/net/sched/sch_fq_codel.c
+++ b/net/sched/sch_fq_codel.c
@@ -96,7 +96,7 @@ static unsigned int fq_codel_classify(struct sk_buff *skb, struct Qdisc *sch,
 		return fq_codel_hash(q, skb) + 1;
 
 	*qerr = NET_XMIT_SUCCESS | __NET_XMIT_BYPASS;
-	result = tc_classify(skb, filter, &res, false);
+	result = tcf_classify(skb, filter, &res, false);
 	if (result >= 0) {
 #ifdef CONFIG_NET_CLS_ACT
 		switch (result) {
diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c
index 5cb82f6..b0dcab1 100644
--- a/net/sched/sch_hfsc.c
+++ b/net/sched/sch_hfsc.c
@@ -1142,7 +1142,7 @@ hfsc_classify(struct sk_buff *skb, struct Qdisc *sch, int *qerr)
 	*qerr = NET_XMIT_SUCCESS | __NET_XMIT_BYPASS;
 	head = &q->root;
 	tcf = rcu_dereference_bh(q->root.filter_list);
-	while (tcf && (result = tc_classify(skb, tcf, &res, false)) >= 0) {
+	while (tcf && (result = tcf_classify(skb, tcf, &res, false)) >= 0) {
 #ifdef CONFIG_NET_CLS_ACT
 		switch (result) {
 		case TC_ACT_QUEUED:
diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c
index 570ef3b..640f5f3 100644
--- a/net/sched/sch_htb.c
+++ b/net/sched/sch_htb.c
@@ -231,7 +231,7 @@ static struct htb_class *htb_classify(struct sk_buff *skb, struct Qdisc *sch,
 	}
 
 	*qerr = NET_XMIT_SUCCESS | __NET_XMIT_BYPASS;
-	while (tcf && (result = tc_classify(skb, tcf, &res, false)) >= 0) {
+	while (tcf && (result = tcf_classify(skb, tcf, &res, false)) >= 0) {
 #ifdef CONFIG_NET_CLS_ACT
 		switch (result) {
 		case TC_ACT_QUEUED:
diff --git a/net/sched/sch_multiq.c b/net/sched/sch_multiq.c
index 43a3a10..25bb9ff 100644
--- a/net/sched/sch_multiq.c
+++ b/net/sched/sch_multiq.c
@@ -46,7 +46,7 @@ multiq_classify(struct sk_buff *skb, struct Qdisc *sch, int *qerr)
 	int err;
 
 	*qerr = NET_XMIT_SUCCESS | __NET_XMIT_BYPASS;
-	err = tc_classify(skb, fl, &res, false);
+	err = tcf_classify(skb, fl, &res, false);
 #ifdef CONFIG_NET_CLS_ACT
 	switch (err) {
 	case TC_ACT_STOLEN:
diff --git a/net/sched/sch_prio.c b/net/sched/sch_prio.c
index 92c2e6d..7997363 100644
--- a/net/sched/sch_prio.c
+++ b/net/sched/sch_prio.c
@@ -42,7 +42,7 @@ prio_classify(struct sk_buff *skb, struct Qdisc *sch, int *qerr)
 	*qerr = NET_XMIT_SUCCESS | __NET_XMIT_BYPASS;
 	if (TC_H_MAJ(skb->priority) != sch->handle) {
 		fl = rcu_dereference_bh(q->filter_list);
-		err = tc_classify(skb, fl, &res, false);
+		err = tcf_classify(skb, fl, &res, false);
 #ifdef CONFIG_NET_CLS_ACT
 		switch (err) {
 		case TC_ACT_STOLEN:
diff --git a/net/sched/sch_qfq.c b/net/sched/sch_qfq.c
index 041eba3..73c7ac3 100644
--- a/net/sched/sch_qfq.c
+++ b/net/sched/sch_qfq.c
@@ -720,7 +720,7 @@ static struct qfq_class *qfq_classify(struct sk_buff *skb, struct Qdisc *sch,
 
 	*qerr = NET_XMIT_SUCCESS | __NET_XMIT_BYPASS;
 	fl = rcu_dereference_bh(q->filter_list);
-	result = tc_classify(skb, fl, &res, false);
+	result = tcf_classify(skb, fl, &res, false);
 	if (result >= 0) {
 #ifdef CONFIG_NET_CLS_ACT
 		switch (result) {
diff --git a/net/sched/sch_sfb.c b/net/sched/sch_sfb.c
index 0f77727..b2878808 100644
--- a/net/sched/sch_sfb.c
+++ b/net/sched/sch_sfb.c
@@ -259,7 +259,7 @@ static bool sfb_classify(struct sk_buff *skb, struct tcf_proto *fl,
 	struct tcf_result res;
 	int result;
 
-	result = tc_classify(skb, fl, &res, false);
+	result = tcf_classify(skb, fl, &res, false);
 	if (result >= 0) {
 #ifdef CONFIG_NET_CLS_ACT
 		switch (result) {
diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index b00e02c..012fa3b 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -180,7 +180,7 @@ static unsigned int sfq_classify(struct sk_buff *skb, struct Qdisc *sch,
 		return sfq_hash(q, skb) + 1;
 
 	*qerr = NET_XMIT_SUCCESS | __NET_XMIT_BYPASS;
-	result = tc_classify(skb, fl, &res, false);
+	result = tcf_classify(skb, fl, &res, false);
 	if (result >= 0) {
 #ifdef CONFIG_NET_CLS_ACT
 		switch (result) {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [patch net-next 02/10] net: sched: introduce tcf block infractructure
  2017-04-27 11:12 [patch net-next 00/10] net: sched: introduce multichain support for filters Jiri Pirko
  2017-04-27 11:12 ` [patch net-next 01/10] net: sched: move tc_classify function to cls_api.c Jiri Pirko
@ 2017-04-27 11:12 ` Jiri Pirko
  2017-04-28 17:48   ` Cong Wang
  2017-04-27 11:12 ` [patch net-next 03/10] net: sched: rename tcf_destroy_chain helper Jiri Pirko
                   ` (10 subsequent siblings)
  12 siblings, 1 reply; 26+ messages in thread
From: Jiri Pirko @ 2017-04-27 11:12 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, dsa, edumazet, stephen, daniel,
	alexander.h.duyck, mlxsw, simon.horman

From: Jiri Pirko <jiri@mellanox.com>

Currently, the filter chains are direcly put into the private structures
of qdiscs. In order to be able to have multiple chains per qdisc and to
allow filter chains sharing among qdiscs, there is a need for common
object that would hold the chains. This introduces such object and calls
it "tcf_block".

Helpers to get and put the blocks are provided to be called from
individual qdisc code. Also, the original filter_list pointers are left
in qdisc privs to allow the entry into tcf_block processing without any
added overhead of possible multiple pointer dereference on fast path.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 include/net/pkt_cls.h     | 13 ++++++++--
 include/net/sch_generic.h |  7 +++++-
 net/sched/cls_api.c       | 48 +++++++++++++++++++++++++++++--------
 net/sched/sch_api.c       |  2 +-
 net/sched/sch_atm.c       | 27 ++++++++++++++-------
 net/sched/sch_cbq.c       | 19 ++++++++++-----
 net/sched/sch_drr.c       | 13 ++++++----
 net/sched/sch_dsmark.c    | 17 ++++++++-----
 net/sched/sch_fq_codel.c  | 15 ++++++++----
 net/sched/sch_hfsc.c      | 19 ++++++++++-----
 net/sched/sch_htb.c       | 26 +++++++++++++-------
 net/sched/sch_ingress.c   | 61 ++++++++++++++++++++++++++++++++++-------------
 net/sched/sch_multiq.c    | 14 +++++++----
 net/sched/sch_prio.c      | 17 +++++++++----
 net/sched/sch_qfq.c       | 14 +++++++----
 net/sched/sch_sfb.c       | 15 ++++++++----
 net/sched/sch_sfq.c       | 15 ++++++++----
 17 files changed, 243 insertions(+), 99 deletions(-)

diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index cb74506..e56e715 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -18,12 +18,21 @@ int register_tcf_proto_ops(struct tcf_proto_ops *ops);
 int unregister_tcf_proto_ops(struct tcf_proto_ops *ops);
 
 #ifdef CONFIG_NET_CLS
-void tcf_destroy_chain(struct tcf_proto __rcu **fl);
+int tcf_block_get(struct tcf_block **p_block,
+		  struct tcf_proto __rcu **p_filter_chain);
+void tcf_block_put(struct tcf_block *block);
 int tcf_classify(struct sk_buff *skb, const struct tcf_proto *tp,
 		 struct tcf_result *res, bool compat_mode);
 
 #else
-static inline void tcf_destroy_chain(struct tcf_proto __rcu **fl)
+static inline
+int tcf_block_get(struct tcf_block **p_block,
+		  struct tcf_proto __rcu **p_filter_chain)
+{
+	return 0;
+}
+
+static inline void tcf_block_put(struct tcf_block *block)
 {
 }
 
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 22e5209..98cf2f2 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -153,7 +153,7 @@ struct Qdisc_class_ops {
 	void			(*walk)(struct Qdisc *, struct qdisc_walker * arg);
 
 	/* Filter manipulation */
-	struct tcf_proto __rcu ** (*tcf_chain)(struct Qdisc *, unsigned long);
+	struct tcf_block *	(*tcf_block)(struct Qdisc *, unsigned long);
 	bool			(*tcf_cl_offload)(u32 classid);
 	unsigned long		(*bind_tcf)(struct Qdisc *, unsigned long,
 					u32 classid);
@@ -236,6 +236,7 @@ struct tcf_proto {
 	struct Qdisc		*q;
 	void			*data;
 	const struct tcf_proto_ops	*ops;
+	struct tcf_block	*block;
 	struct rcu_head		rcu;
 };
 
@@ -247,6 +248,10 @@ struct qdisc_skb_cb {
 	unsigned char		data[QDISC_CB_PRIV_LEN];
 };
 
+struct tcf_block {
+	struct tcf_proto __rcu **p_filter_chain;
+};
+
 static inline void qdisc_cb_private_validate(const struct sk_buff *skb, int sz)
 {
 	struct qdisc_skb_cb *qcb;
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index a97af61..42fdc8a 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -129,7 +129,8 @@ static inline u32 tcf_auto_prio(struct tcf_proto *tp)
 }
 
 static struct tcf_proto *tcf_proto_create(const char *kind, u32 protocol,
-					  u32 prio, u32 parent, struct Qdisc *q)
+					  u32 prio, u32 parent, struct Qdisc *q,
+					  struct tcf_block *block)
 {
 	struct tcf_proto *tp;
 	int err;
@@ -165,6 +166,7 @@ static struct tcf_proto *tcf_proto_create(const char *kind, u32 protocol,
 	tp->prio = prio;
 	tp->classid = parent;
 	tp->q = q;
+	tp->block = block;
 
 	err = tp->ops->init(tp);
 	if (err) {
@@ -185,7 +187,7 @@ static void tcf_proto_destroy(struct tcf_proto *tp)
 	kfree_rcu(tp, rcu);
 }
 
-void tcf_destroy_chain(struct tcf_proto __rcu **fl)
+static void tcf_destroy_chain(struct tcf_proto __rcu **fl)
 {
 	struct tcf_proto *tp;
 
@@ -194,7 +196,28 @@ void tcf_destroy_chain(struct tcf_proto __rcu **fl)
 		tcf_proto_destroy(tp);
 	}
 }
-EXPORT_SYMBOL(tcf_destroy_chain);
+
+int tcf_block_get(struct tcf_block **p_block,
+		  struct tcf_proto __rcu **p_filter_chain)
+{
+	struct tcf_block *block = kzalloc(sizeof(*block), GFP_KERNEL);
+
+	if (!block)
+		return -ENOMEM;
+	block->p_filter_chain = p_filter_chain;
+	*p_block = block;
+	return 0;
+}
+EXPORT_SYMBOL(tcf_block_get);
+
+void tcf_block_put(struct tcf_block *block)
+{
+	if (!block)
+		return;
+	tcf_destroy_chain(block->p_filter_chain);
+	kfree(block);
+}
+EXPORT_SYMBOL(tcf_block_put);
 
 /* Main classifier routine: scans classifier chain attached
  * to this qdisc, (optionally) tests for protocol and asks
@@ -254,6 +277,7 @@ static int tc_ctl_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
 	struct Qdisc  *q;
 	struct tcf_proto __rcu **back;
 	struct tcf_proto __rcu **chain;
+	struct tcf_block *block;
 	struct tcf_proto *next;
 	struct tcf_proto *tp;
 	const struct Qdisc_class_ops *cops;
@@ -322,7 +346,7 @@ static int tc_ctl_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
 	if (!cops)
 		return -EINVAL;
 
-	if (cops->tcf_chain == NULL)
+	if (!cops->tcf_block)
 		return -EOPNOTSUPP;
 
 	/* Do we search for filter, attached to class? */
@@ -333,11 +357,13 @@ static int tc_ctl_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
 	}
 
 	/* And the last stroke */
-	chain = cops->tcf_chain(q, cl);
-	if (chain == NULL) {
+	block = cops->tcf_block(q, cl);
+	if (!block) {
 		err = -EINVAL;
 		goto errout;
 	}
+	chain = block->p_filter_chain;
+
 	if (n->nlmsg_type == RTM_DELTFILTER && prio == 0) {
 		tfilter_notify_chain(net, skb, n, chain, RTM_DELTFILTER);
 		tcf_destroy_chain(chain);
@@ -381,7 +407,7 @@ static int tc_ctl_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
 			nprio = TC_H_MAJ(tcf_auto_prio(rtnl_dereference(*back)));
 
 		tp = tcf_proto_create(nla_data(tca[TCA_KIND]),
-				      protocol, nprio, parent, q);
+				      protocol, nprio, parent, q, block);
 		if (IS_ERR(tp)) {
 			err = PTR_ERR(tp);
 			goto errout;
@@ -550,6 +576,7 @@ static int tc_dump_tfilter(struct sk_buff *skb, struct netlink_callback *cb)
 	int s_t;
 	struct net_device *dev;
 	struct Qdisc *q;
+	struct tcf_block *block;
 	struct tcf_proto *tp, __rcu **chain;
 	struct tcmsg *tcm = nlmsg_data(cb->nlh);
 	unsigned long cl = 0;
@@ -571,16 +598,17 @@ static int tc_dump_tfilter(struct sk_buff *skb, struct netlink_callback *cb)
 	cops = q->ops->cl_ops;
 	if (!cops)
 		goto errout;
-	if (cops->tcf_chain == NULL)
+	if (!cops->tcf_block)
 		goto errout;
 	if (TC_H_MIN(tcm->tcm_parent)) {
 		cl = cops->get(q, tcm->tcm_parent);
 		if (cl == 0)
 			goto errout;
 	}
-	chain = cops->tcf_chain(q, cl);
-	if (chain == NULL)
+	block = cops->tcf_block(q, cl);
+	if (!block)
 		goto errout;
+	chain = block->p_filter_chain;
 
 	s_t = cb->args[0];
 
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index 2bf20bb..234322f 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -163,7 +163,7 @@ int register_qdisc(struct Qdisc_ops *qops)
 		if (!(cops->get && cops->put && cops->walk && cops->leaf))
 			goto out_einval;
 
-		if (cops->tcf_chain && !(cops->bind_tcf && cops->unbind_tcf))
+		if (cops->tcf_block && !(cops->bind_tcf && cops->unbind_tcf))
 			goto out_einval;
 	}
 
diff --git a/net/sched/sch_atm.c b/net/sched/sch_atm.c
index 89d32fa..f435546 100644
--- a/net/sched/sch_atm.c
+++ b/net/sched/sch_atm.c
@@ -43,6 +43,7 @@
 struct atm_flow_data {
 	struct Qdisc		*q;	/* FIFO, TBF, etc. */
 	struct tcf_proto __rcu	*filter_list;
+	struct tcf_block	*block;
 	struct atm_vcc		*vcc;	/* VCC; NULL if VCC is closed */
 	void			(*old_pop)(struct atm_vcc *vcc,
 					   struct sk_buff *skb); /* chaining */
@@ -143,7 +144,7 @@ static void atm_tc_put(struct Qdisc *sch, unsigned long cl)
 	list_del_init(&flow->list);
 	pr_debug("atm_tc_put: qdisc %p\n", flow->q);
 	qdisc_destroy(flow->q);
-	tcf_destroy_chain(&flow->filter_list);
+	tcf_block_put(flow->block);
 	if (flow->sock) {
 		pr_debug("atm_tc_put: f_count %ld\n",
 			file_count(flow->sock->file));
@@ -274,7 +275,13 @@ static int atm_tc_change(struct Qdisc *sch, u32 classid, u32 parent,
 		error = -ENOBUFS;
 		goto err_out;
 	}
-	RCU_INIT_POINTER(flow->filter_list, NULL);
+
+	error = tcf_block_get(&flow->block, &flow->filter_list);
+	if (error) {
+		kfree(flow);
+		goto err_out;
+	}
+
 	flow->q = qdisc_create_dflt(sch->dev_queue, &pfifo_qdisc_ops, classid);
 	if (!flow->q)
 		flow->q = &noop_qdisc;
@@ -346,14 +353,13 @@ static void atm_tc_walk(struct Qdisc *sch, struct qdisc_walker *walker)
 	}
 }
 
-static struct tcf_proto __rcu **atm_tc_find_tcf(struct Qdisc *sch,
-						unsigned long cl)
+static struct tcf_block *atm_tc_tcf_block(struct Qdisc *sch, unsigned long cl)
 {
 	struct atm_qdisc_data *p = qdisc_priv(sch);
 	struct atm_flow_data *flow = (struct atm_flow_data *)cl;
 
 	pr_debug("atm_tc_find_tcf(sch %p,[qdisc %p],flow %p)\n", sch, p, flow);
-	return flow ? &flow->filter_list : &p->link.filter_list;
+	return flow ? flow->block : p->link.block;
 }
 
 /* --------------------------- Qdisc operations ---------------------------- */
@@ -524,6 +530,7 @@ static struct sk_buff *atm_tc_peek(struct Qdisc *sch)
 static int atm_tc_init(struct Qdisc *sch, struct nlattr *opt)
 {
 	struct atm_qdisc_data *p = qdisc_priv(sch);
+	int err;
 
 	pr_debug("atm_tc_init(sch %p,[qdisc %p],opt %p)\n", sch, p, opt);
 	INIT_LIST_HEAD(&p->flows);
@@ -534,7 +541,11 @@ static int atm_tc_init(struct Qdisc *sch, struct nlattr *opt)
 	if (!p->link.q)
 		p->link.q = &noop_qdisc;
 	pr_debug("atm_tc_init: link (%p) qdisc %p\n", &p->link, p->link.q);
-	RCU_INIT_POINTER(p->link.filter_list, NULL);
+
+	err = tcf_block_get(&p->link.block, &p->link.filter_list);
+	if (err)
+		return err;
+
 	p->link.vcc = NULL;
 	p->link.sock = NULL;
 	p->link.classid = sch->handle;
@@ -561,7 +572,7 @@ static void atm_tc_destroy(struct Qdisc *sch)
 
 	pr_debug("atm_tc_destroy(sch %p,[qdisc %p])\n", sch, p);
 	list_for_each_entry(flow, &p->flows, list)
-		tcf_destroy_chain(&flow->filter_list);
+		tcf_block_put(flow->block);
 
 	list_for_each_entry_safe(flow, tmp, &p->flows, list) {
 		if (flow->ref > 1)
@@ -646,7 +657,7 @@ static const struct Qdisc_class_ops atm_class_ops = {
 	.change		= atm_tc_change,
 	.delete		= atm_tc_delete,
 	.walk		= atm_tc_walk,
-	.tcf_chain	= atm_tc_find_tcf,
+	.tcf_block	= atm_tc_tcf_block,
 	.bind_tcf	= atm_tc_bind_filter,
 	.unbind_tcf	= atm_tc_put,
 	.dump		= atm_tc_dump_class,
diff --git a/net/sched/sch_cbq.c b/net/sched/sch_cbq.c
index c543ea3..8dd6d0a 100644
--- a/net/sched/sch_cbq.c
+++ b/net/sched/sch_cbq.c
@@ -127,6 +127,7 @@ struct cbq_class {
 	struct tc_cbq_xstats	xstats;
 
 	struct tcf_proto __rcu	*filter_list;
+	struct tcf_block	*block;
 
 	int			refcnt;
 	int			filters;
@@ -1405,7 +1406,7 @@ static void cbq_destroy_class(struct Qdisc *sch, struct cbq_class *cl)
 
 	WARN_ON(cl->filters);
 
-	tcf_destroy_chain(&cl->filter_list);
+	tcf_block_put(cl->block);
 	qdisc_destroy(cl->q);
 	qdisc_put_rtab(cl->R_tab);
 	gen_kill_estimator(&cl->rate_est);
@@ -1430,7 +1431,7 @@ static void cbq_destroy(struct Qdisc *sch)
 	 */
 	for (h = 0; h < q->clhash.hashsize; h++) {
 		hlist_for_each_entry(cl, &q->clhash.hash[h], common.hnode)
-			tcf_destroy_chain(&cl->filter_list);
+			tcf_block_put(cl->block);
 	}
 	for (h = 0; h < q->clhash.hashsize; h++) {
 		hlist_for_each_entry_safe(cl, next, &q->clhash.hash[h],
@@ -1585,12 +1586,19 @@ cbq_change_class(struct Qdisc *sch, u32 classid, u32 parentid, struct nlattr **t
 	if (cl == NULL)
 		goto failure;
 
+	err = tcf_block_get(&cl->block, &cl->filter_list);
+	if (err) {
+		kfree(cl);
+		return err;
+	}
+
 	if (tca[TCA_RATE]) {
 		err = gen_new_estimator(&cl->bstats, NULL, &cl->rate_est,
 					NULL,
 					qdisc_root_sleeping_running(sch),
 					tca[TCA_RATE]);
 		if (err) {
+			tcf_block_put(cl->block);
 			kfree(cl);
 			goto failure;
 		}
@@ -1688,8 +1696,7 @@ static int cbq_delete(struct Qdisc *sch, unsigned long arg)
 	return 0;
 }
 
-static struct tcf_proto __rcu **cbq_find_tcf(struct Qdisc *sch,
-					     unsigned long arg)
+static struct tcf_block *cbq_tcf_block(struct Qdisc *sch, unsigned long arg)
 {
 	struct cbq_sched_data *q = qdisc_priv(sch);
 	struct cbq_class *cl = (struct cbq_class *)arg;
@@ -1697,7 +1704,7 @@ static struct tcf_proto __rcu **cbq_find_tcf(struct Qdisc *sch,
 	if (cl == NULL)
 		cl = &q->link;
 
-	return &cl->filter_list;
+	return cl->block;
 }
 
 static unsigned long cbq_bind_filter(struct Qdisc *sch, unsigned long parent,
@@ -1756,7 +1763,7 @@ static const struct Qdisc_class_ops cbq_class_ops = {
 	.change		=	cbq_change_class,
 	.delete		=	cbq_delete,
 	.walk		=	cbq_walk,
-	.tcf_chain	=	cbq_find_tcf,
+	.tcf_block	=	cbq_tcf_block,
 	.bind_tcf	=	cbq_bind_filter,
 	.unbind_tcf	=	cbq_unbind_filter,
 	.dump		=	cbq_dump_class,
diff --git a/net/sched/sch_drr.c b/net/sched/sch_drr.c
index 446d79b..5db2a28 100644
--- a/net/sched/sch_drr.c
+++ b/net/sched/sch_drr.c
@@ -36,6 +36,7 @@ struct drr_class {
 struct drr_sched {
 	struct list_head		active;
 	struct tcf_proto __rcu		*filter_list;
+	struct tcf_block		*block;
 	struct Qdisc_class_hash		clhash;
 };
 
@@ -190,15 +191,14 @@ static void drr_put_class(struct Qdisc *sch, unsigned long arg)
 		drr_destroy_class(sch, cl);
 }
 
-static struct tcf_proto __rcu **drr_tcf_chain(struct Qdisc *sch,
-					      unsigned long cl)
+static struct tcf_block *drr_tcf_block(struct Qdisc *sch, unsigned long cl)
 {
 	struct drr_sched *q = qdisc_priv(sch);
 
 	if (cl)
 		return NULL;
 
-	return &q->filter_list;
+	return q->block;
 }
 
 static unsigned long drr_bind_tcf(struct Qdisc *sch, unsigned long parent,
@@ -431,6 +431,9 @@ static int drr_init_qdisc(struct Qdisc *sch, struct nlattr *opt)
 	struct drr_sched *q = qdisc_priv(sch);
 	int err;
 
+	err = tcf_block_get(&q->block, &q->filter_list);
+	if (err)
+		return err;
 	err = qdisc_class_hash_init(&q->clhash);
 	if (err < 0)
 		return err;
@@ -462,7 +465,7 @@ static void drr_destroy_qdisc(struct Qdisc *sch)
 	struct hlist_node *next;
 	unsigned int i;
 
-	tcf_destroy_chain(&q->filter_list);
+	tcf_block_put(q->block);
 
 	for (i = 0; i < q->clhash.hashsize; i++) {
 		hlist_for_each_entry_safe(cl, next, &q->clhash.hash[i],
@@ -477,7 +480,7 @@ static const struct Qdisc_class_ops drr_class_ops = {
 	.delete		= drr_delete_class,
 	.get		= drr_get_class,
 	.put		= drr_put_class,
-	.tcf_chain	= drr_tcf_chain,
+	.tcf_block	= drr_tcf_block,
 	.bind_tcf	= drr_bind_tcf,
 	.unbind_tcf	= drr_unbind_tcf,
 	.graft		= drr_graft_class,
diff --git a/net/sched/sch_dsmark.c b/net/sched/sch_dsmark.c
index 7bc638d..ba45102 100644
--- a/net/sched/sch_dsmark.c
+++ b/net/sched/sch_dsmark.c
@@ -44,6 +44,7 @@ struct mask_value {
 struct dsmark_qdisc_data {
 	struct Qdisc		*q;
 	struct tcf_proto __rcu	*filter_list;
+	struct tcf_block	*block;
 	struct mask_value	*mv;
 	u16			indices;
 	u8			set_tc_index;
@@ -183,11 +184,11 @@ static void dsmark_walk(struct Qdisc *sch, struct qdisc_walker *walker)
 	}
 }
 
-static inline struct tcf_proto __rcu **dsmark_find_tcf(struct Qdisc *sch,
-						       unsigned long cl)
+static struct tcf_block *dsmark_tcf_block(struct Qdisc *sch, unsigned long cl)
 {
 	struct dsmark_qdisc_data *p = qdisc_priv(sch);
-	return &p->filter_list;
+
+	return p->block;
 }
 
 /* --------------------------- Qdisc operations ---------------------------- */
@@ -332,7 +333,7 @@ static int dsmark_init(struct Qdisc *sch, struct nlattr *opt)
 {
 	struct dsmark_qdisc_data *p = qdisc_priv(sch);
 	struct nlattr *tb[TCA_DSMARK_MAX + 1];
-	int err = -EINVAL;
+	int err;
 	u32 default_index = NO_DEFAULT_INDEX;
 	u16 indices;
 	int i;
@@ -342,6 +343,10 @@ static int dsmark_init(struct Qdisc *sch, struct nlattr *opt)
 	if (!opt)
 		goto errout;
 
+	err = tcf_block_get(&p->block, &p->filter_list);
+	if (err)
+		return err;
+
 	err = nla_parse_nested(tb, TCA_DSMARK_MAX, opt, dsmark_policy, NULL);
 	if (err < 0)
 		goto errout;
@@ -400,7 +405,7 @@ static void dsmark_destroy(struct Qdisc *sch)
 
 	pr_debug("%s(sch %p,[qdisc %p])\n", __func__, sch, p);
 
-	tcf_destroy_chain(&p->filter_list);
+	tcf_block_put(p->block);
 	qdisc_destroy(p->q);
 	if (p->mv != p->embedded)
 		kfree(p->mv);
@@ -468,7 +473,7 @@ static const struct Qdisc_class_ops dsmark_class_ops = {
 	.change		=	dsmark_change,
 	.delete		=	dsmark_delete,
 	.walk		=	dsmark_walk,
-	.tcf_chain	=	dsmark_find_tcf,
+	.tcf_block	=	dsmark_tcf_block,
 	.bind_tcf	=	dsmark_bind_filter,
 	.unbind_tcf	=	dsmark_put,
 	.dump		=	dsmark_dump_class,
diff --git a/net/sched/sch_fq_codel.c b/net/sched/sch_fq_codel.c
index 0db8d68..3d962eb 100644
--- a/net/sched/sch_fq_codel.c
+++ b/net/sched/sch_fq_codel.c
@@ -55,6 +55,7 @@ struct fq_codel_flow {
 
 struct fq_codel_sched_data {
 	struct tcf_proto __rcu *filter_list; /* optional external classifier */
+	struct tcf_block *block;
 	struct fq_codel_flow *flows;	/* Flows table [flows_cnt] */
 	u32		*backlogs;	/* backlog table [flows_cnt] */
 	u32		flows_cnt;	/* number of flows */
@@ -464,7 +465,7 @@ static void fq_codel_destroy(struct Qdisc *sch)
 {
 	struct fq_codel_sched_data *q = qdisc_priv(sch);
 
-	tcf_destroy_chain(&q->filter_list);
+	tcf_block_put(q->block);
 	fq_codel_free(q->backlogs);
 	fq_codel_free(q->flows);
 }
@@ -473,6 +474,7 @@ static int fq_codel_init(struct Qdisc *sch, struct nlattr *opt)
 {
 	struct fq_codel_sched_data *q = qdisc_priv(sch);
 	int i;
+	int err;
 
 	sch->limit = 10*1024;
 	q->flows_cnt = 1024;
@@ -492,6 +494,10 @@ static int fq_codel_init(struct Qdisc *sch, struct nlattr *opt)
 			return err;
 	}
 
+	err = tcf_block_get(&q->block, &q->filter_list);
+	if (err)
+		return err;
+
 	if (!q->flows) {
 		q->flows = fq_codel_zalloc(q->flows_cnt *
 					   sizeof(struct fq_codel_flow));
@@ -603,14 +609,13 @@ static void fq_codel_put(struct Qdisc *q, unsigned long cl)
 {
 }
 
-static struct tcf_proto __rcu **fq_codel_find_tcf(struct Qdisc *sch,
-						  unsigned long cl)
+static struct tcf_block *fq_codel_tcf_block(struct Qdisc *sch, unsigned long cl)
 {
 	struct fq_codel_sched_data *q = qdisc_priv(sch);
 
 	if (cl)
 		return NULL;
-	return &q->filter_list;
+	return q->block;
 }
 
 static int fq_codel_dump_class(struct Qdisc *sch, unsigned long cl,
@@ -693,7 +698,7 @@ static const struct Qdisc_class_ops fq_codel_class_ops = {
 	.leaf		=	fq_codel_leaf,
 	.get		=	fq_codel_get,
 	.put		=	fq_codel_put,
-	.tcf_chain	=	fq_codel_find_tcf,
+	.tcf_block	=	fq_codel_tcf_block,
 	.bind_tcf	=	fq_codel_bind,
 	.unbind_tcf	=	fq_codel_put,
 	.dump		=	fq_codel_dump_class,
diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c
index b0dcab1..a324f84 100644
--- a/net/sched/sch_hfsc.c
+++ b/net/sched/sch_hfsc.c
@@ -116,6 +116,7 @@ struct hfsc_class {
 	struct gnet_stats_queue qstats;
 	struct net_rate_estimator __rcu *rate_est;
 	struct tcf_proto __rcu *filter_list; /* filter list */
+	struct tcf_block *block;
 	unsigned int	filter_cnt;	/* filter count */
 	unsigned int	level;		/* class level in hierarchy */
 
@@ -1040,12 +1041,19 @@ hfsc_change_class(struct Qdisc *sch, u32 classid, u32 parentid,
 	if (cl == NULL)
 		return -ENOBUFS;
 
+	err = tcf_block_get(&cl->block, &cl->filter_list);
+	if (err) {
+		kfree(cl);
+		return err;
+	}
+
 	if (tca[TCA_RATE]) {
 		err = gen_new_estimator(&cl->bstats, NULL, &cl->rate_est,
 					NULL,
 					qdisc_root_sleeping_running(sch),
 					tca[TCA_RATE]);
 		if (err) {
+			tcf_block_put(cl->block);
 			kfree(cl);
 			return err;
 		}
@@ -1091,7 +1099,7 @@ hfsc_destroy_class(struct Qdisc *sch, struct hfsc_class *cl)
 {
 	struct hfsc_sched *q = qdisc_priv(sch);
 
-	tcf_destroy_chain(&cl->filter_list);
+	tcf_block_put(cl->block);
 	qdisc_destroy(cl->qdisc);
 	gen_kill_estimator(&cl->rate_est);
 	if (cl != &q->root)
@@ -1261,8 +1269,7 @@ hfsc_unbind_tcf(struct Qdisc *sch, unsigned long arg)
 	cl->filter_cnt--;
 }
 
-static struct tcf_proto __rcu **
-hfsc_tcf_chain(struct Qdisc *sch, unsigned long arg)
+static struct tcf_block *hfsc_tcf_block(struct Qdisc *sch, unsigned long arg)
 {
 	struct hfsc_sched *q = qdisc_priv(sch);
 	struct hfsc_class *cl = (struct hfsc_class *)arg;
@@ -1270,7 +1277,7 @@ hfsc_tcf_chain(struct Qdisc *sch, unsigned long arg)
 	if (cl == NULL)
 		cl = &q->root;
 
-	return &cl->filter_list;
+	return cl->block;
 }
 
 static int
@@ -1515,7 +1522,7 @@ hfsc_destroy_qdisc(struct Qdisc *sch)
 
 	for (i = 0; i < q->clhash.hashsize; i++) {
 		hlist_for_each_entry(cl, &q->clhash.hash[i], cl_common.hnode)
-			tcf_destroy_chain(&cl->filter_list);
+			tcf_block_put(cl->block);
 	}
 	for (i = 0; i < q->clhash.hashsize; i++) {
 		hlist_for_each_entry_safe(cl, next, &q->clhash.hash[i],
@@ -1662,7 +1669,7 @@ static const struct Qdisc_class_ops hfsc_class_ops = {
 	.put		= hfsc_put_class,
 	.bind_tcf	= hfsc_bind_tcf,
 	.unbind_tcf	= hfsc_unbind_tcf,
-	.tcf_chain	= hfsc_tcf_chain,
+	.tcf_block	= hfsc_tcf_block,
 	.dump		= hfsc_dump_class,
 	.dump_stats	= hfsc_dump_class_stats,
 	.walk		= hfsc_walk
diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c
index 640f5f3..195bbca9 100644
--- a/net/sched/sch_htb.c
+++ b/net/sched/sch_htb.c
@@ -105,6 +105,7 @@ struct htb_class {
 	int			quantum;	/* but stored for parent-to-leaf return */
 
 	struct tcf_proto __rcu	*filter_list;	/* class attached filters */
+	struct tcf_block	*block;
 	int			filter_cnt;
 	int			refcnt;		/* usage count of this class */
 
@@ -156,6 +157,7 @@ struct htb_sched {
 
 	/* filters for qdisc itself */
 	struct tcf_proto __rcu	*filter_list;
+	struct tcf_block	*block;
 
 #define HTB_WARN_TOOMANYEVENTS	0x1
 	unsigned int		warned;	/* only one warning */
@@ -1017,6 +1019,10 @@ static int htb_init(struct Qdisc *sch, struct nlattr *opt)
 	if (!opt)
 		return -EINVAL;
 
+	err = tcf_block_get(&q->block, &q->filter_list);
+	if (err)
+		return err;
+
 	err = nla_parse_nested(tb, TCA_HTB_MAX, opt, htb_policy, NULL);
 	if (err < 0)
 		return err;
@@ -1230,7 +1236,7 @@ static void htb_destroy_class(struct Qdisc *sch, struct htb_class *cl)
 		qdisc_destroy(cl->un.leaf.q);
 	}
 	gen_kill_estimator(&cl->rate_est);
-	tcf_destroy_chain(&cl->filter_list);
+	tcf_block_put(cl->block);
 	kfree(cl);
 }
 
@@ -1248,11 +1254,11 @@ static void htb_destroy(struct Qdisc *sch)
 	 * because filter need its target class alive to be able to call
 	 * unbind_filter on it (without Oops).
 	 */
-	tcf_destroy_chain(&q->filter_list);
+	tcf_block_put(q->block);
 
 	for (i = 0; i < q->clhash.hashsize; i++) {
 		hlist_for_each_entry(cl, &q->clhash.hash[i], common.hnode)
-			tcf_destroy_chain(&cl->filter_list);
+			tcf_block_put(cl->block);
 	}
 	for (i = 0; i < q->clhash.hashsize; i++) {
 		hlist_for_each_entry_safe(cl, next, &q->clhash.hash[i],
@@ -1396,6 +1402,11 @@ static int htb_change_class(struct Qdisc *sch, u32 classid,
 		if (!cl)
 			goto failure;
 
+		err = tcf_block_get(&cl->block, &cl->filter_list);
+		if (err) {
+			kfree(cl);
+			goto failure;
+		}
 		if (htb_rate_est || tca[TCA_RATE]) {
 			err = gen_new_estimator(&cl->bstats, NULL,
 						&cl->rate_est,
@@ -1403,6 +1414,7 @@ static int htb_change_class(struct Qdisc *sch, u32 classid,
 						qdisc_root_sleeping_running(sch),
 						tca[TCA_RATE] ? : &est.nla);
 			if (err) {
+				tcf_block_put(cl->block);
 				kfree(cl);
 				goto failure;
 			}
@@ -1521,14 +1533,12 @@ static int htb_change_class(struct Qdisc *sch, u32 classid,
 	return err;
 }
 
-static struct tcf_proto __rcu **htb_find_tcf(struct Qdisc *sch,
-					     unsigned long arg)
+static struct tcf_block *htb_tcf_block(struct Qdisc *sch, unsigned long arg)
 {
 	struct htb_sched *q = qdisc_priv(sch);
 	struct htb_class *cl = (struct htb_class *)arg;
-	struct tcf_proto __rcu **fl = cl ? &cl->filter_list : &q->filter_list;
 
-	return fl;
+	return cl ? cl->block : q->block;
 }
 
 static unsigned long htb_bind_filter(struct Qdisc *sch, unsigned long parent,
@@ -1591,7 +1601,7 @@ static const struct Qdisc_class_ops htb_class_ops = {
 	.change		=	htb_change_class,
 	.delete		=	htb_delete,
 	.walk		=	htb_walk,
-	.tcf_chain	=	htb_find_tcf,
+	.tcf_block	=	htb_tcf_block,
 	.bind_tcf	=	htb_bind_filter,
 	.unbind_tcf	=	htb_unbind_filter,
 	.dump		=	htb_dump_class,
diff --git a/net/sched/sch_ingress.c b/net/sched/sch_ingress.c
index 3bab5f6..d8a9beb 100644
--- a/net/sched/sch_ingress.c
+++ b/net/sched/sch_ingress.c
@@ -18,6 +18,10 @@
 #include <net/pkt_sched.h>
 #include <net/pkt_cls.h>
 
+struct ingress_sched_data {
+	struct tcf_block *block;
+};
+
 static struct Qdisc *ingress_leaf(struct Qdisc *sch, unsigned long arg)
 {
 	return NULL;
@@ -47,16 +51,23 @@ static void ingress_walk(struct Qdisc *sch, struct qdisc_walker *walker)
 {
 }
 
-static struct tcf_proto __rcu **ingress_find_tcf(struct Qdisc *sch,
-						 unsigned long cl)
+static struct tcf_block *ingress_tcf_block(struct Qdisc *sch, unsigned long cl)
 {
-	struct net_device *dev = qdisc_dev(sch);
+	struct ingress_sched_data *q = qdisc_priv(sch);
 
-	return &dev->ingress_cl_list;
+	return q->block;
 }
 
 static int ingress_init(struct Qdisc *sch, struct nlattr *opt)
 {
+	struct ingress_sched_data *q = qdisc_priv(sch);
+	struct net_device *dev = qdisc_dev(sch);
+	int err;
+
+	err = tcf_block_get(&q->block, &dev->ingress_cl_list);
+	if (err)
+		return err;
+
 	net_inc_ingress_queue();
 	sch->flags |= TCQ_F_CPUSTATS;
 
@@ -65,9 +76,9 @@ static int ingress_init(struct Qdisc *sch, struct nlattr *opt)
 
 static void ingress_destroy(struct Qdisc *sch)
 {
-	struct net_device *dev = qdisc_dev(sch);
+	struct ingress_sched_data *q = qdisc_priv(sch);
 
-	tcf_destroy_chain(&dev->ingress_cl_list);
+	tcf_block_put(q->block);
 	net_dec_ingress_queue();
 }
 
@@ -91,7 +102,7 @@ static const struct Qdisc_class_ops ingress_class_ops = {
 	.get		=	ingress_get,
 	.put		=	ingress_put,
 	.walk		=	ingress_walk,
-	.tcf_chain	=	ingress_find_tcf,
+	.tcf_block	=	ingress_tcf_block,
 	.tcf_cl_offload	=	ingress_cl_offload,
 	.bind_tcf	=	ingress_bind_filter,
 	.unbind_tcf	=	ingress_put,
@@ -100,12 +111,18 @@ static const struct Qdisc_class_ops ingress_class_ops = {
 static struct Qdisc_ops ingress_qdisc_ops __read_mostly = {
 	.cl_ops		=	&ingress_class_ops,
 	.id		=	"ingress",
+	.priv_size	=	sizeof(struct ingress_sched_data),
 	.init		=	ingress_init,
 	.destroy	=	ingress_destroy,
 	.dump		=	ingress_dump,
 	.owner		=	THIS_MODULE,
 };
 
+struct clsact_sched_data {
+	struct tcf_block *ingress_block;
+	struct tcf_block *egress_block;
+};
+
 static unsigned long clsact_get(struct Qdisc *sch, u32 classid)
 {
 	switch (TC_H_MIN(classid)) {
@@ -128,16 +145,15 @@ static unsigned long clsact_bind_filter(struct Qdisc *sch,
 	return clsact_get(sch, classid);
 }
 
-static struct tcf_proto __rcu **clsact_find_tcf(struct Qdisc *sch,
-						unsigned long cl)
+static struct tcf_block *clsact_tcf_block(struct Qdisc *sch, unsigned long cl)
 {
-	struct net_device *dev = qdisc_dev(sch);
+	struct clsact_sched_data *q = qdisc_priv(sch);
 
 	switch (cl) {
 	case TC_H_MIN(TC_H_MIN_INGRESS):
-		return &dev->ingress_cl_list;
+		return q->ingress_block;
 	case TC_H_MIN(TC_H_MIN_EGRESS):
-		return &dev->egress_cl_list;
+		return q->egress_block;
 	default:
 		return NULL;
 	}
@@ -145,6 +161,18 @@ static struct tcf_proto __rcu **clsact_find_tcf(struct Qdisc *sch,
 
 static int clsact_init(struct Qdisc *sch, struct nlattr *opt)
 {
+	struct clsact_sched_data *q = qdisc_priv(sch);
+	struct net_device *dev = qdisc_dev(sch);
+	int err;
+
+	err = tcf_block_get(&q->ingress_block, &dev->ingress_cl_list);
+	if (err)
+		return err;
+
+	err = tcf_block_get(&q->egress_block, &dev->egress_cl_list);
+	if (err)
+		return err;
+
 	net_inc_ingress_queue();
 	net_inc_egress_queue();
 
@@ -155,10 +183,10 @@ static int clsact_init(struct Qdisc *sch, struct nlattr *opt)
 
 static void clsact_destroy(struct Qdisc *sch)
 {
-	struct net_device *dev = qdisc_dev(sch);
+	struct clsact_sched_data *q = qdisc_priv(sch);
 
-	tcf_destroy_chain(&dev->ingress_cl_list);
-	tcf_destroy_chain(&dev->egress_cl_list);
+	tcf_block_put(q->egress_block);
+	tcf_block_put(q->ingress_block);
 
 	net_dec_ingress_queue();
 	net_dec_egress_queue();
@@ -169,7 +197,7 @@ static const struct Qdisc_class_ops clsact_class_ops = {
 	.get		=	clsact_get,
 	.put		=	ingress_put,
 	.walk		=	ingress_walk,
-	.tcf_chain	=	clsact_find_tcf,
+	.tcf_block	=	clsact_tcf_block,
 	.tcf_cl_offload	=	clsact_cl_offload,
 	.bind_tcf	=	clsact_bind_filter,
 	.unbind_tcf	=	ingress_put,
@@ -178,6 +206,7 @@ static const struct Qdisc_class_ops clsact_class_ops = {
 static struct Qdisc_ops clsact_qdisc_ops __read_mostly = {
 	.cl_ops		=	&clsact_class_ops,
 	.id		=	"clsact",
+	.priv_size	=	sizeof(struct clsact_sched_data),
 	.init		=	clsact_init,
 	.destroy	=	clsact_destroy,
 	.dump		=	ingress_dump,
diff --git a/net/sched/sch_multiq.c b/net/sched/sch_multiq.c
index 25bb9ff..6047674 100644
--- a/net/sched/sch_multiq.c
+++ b/net/sched/sch_multiq.c
@@ -32,6 +32,7 @@ struct multiq_sched_data {
 	u16 max_bands;
 	u16 curband;
 	struct tcf_proto __rcu *filter_list;
+	struct tcf_block *block;
 	struct Qdisc **queues;
 };
 
@@ -170,7 +171,7 @@ multiq_destroy(struct Qdisc *sch)
 	int band;
 	struct multiq_sched_data *q = qdisc_priv(sch);
 
-	tcf_destroy_chain(&q->filter_list);
+	tcf_block_put(q->block);
 	for (band = 0; band < q->bands; band++)
 		qdisc_destroy(q->queues[band]);
 
@@ -243,6 +244,10 @@ static int multiq_init(struct Qdisc *sch, struct nlattr *opt)
 	if (opt == NULL)
 		return -EINVAL;
 
+	err = tcf_block_get(&q->block, &q->filter_list);
+	if (err)
+		return err;
+
 	q->max_bands = qdisc_dev(sch)->num_tx_queues;
 
 	q->queues = kcalloc(q->max_bands, sizeof(struct Qdisc *), GFP_KERNEL);
@@ -367,14 +372,13 @@ static void multiq_walk(struct Qdisc *sch, struct qdisc_walker *arg)
 	}
 }
 
-static struct tcf_proto __rcu **multiq_find_tcf(struct Qdisc *sch,
-						unsigned long cl)
+static struct tcf_block *multiq_tcf_block(struct Qdisc *sch, unsigned long cl)
 {
 	struct multiq_sched_data *q = qdisc_priv(sch);
 
 	if (cl)
 		return NULL;
-	return &q->filter_list;
+	return q->block;
 }
 
 static const struct Qdisc_class_ops multiq_class_ops = {
@@ -383,7 +387,7 @@ static const struct Qdisc_class_ops multiq_class_ops = {
 	.get		=	multiq_get,
 	.put		=	multiq_put,
 	.walk		=	multiq_walk,
-	.tcf_chain	=	multiq_find_tcf,
+	.tcf_block	=	multiq_tcf_block,
 	.bind_tcf	=	multiq_bind,
 	.unbind_tcf	=	multiq_put,
 	.dump		=	multiq_dump_class,
diff --git a/net/sched/sch_prio.c b/net/sched/sch_prio.c
index 7997363..a240468 100644
--- a/net/sched/sch_prio.c
+++ b/net/sched/sch_prio.c
@@ -25,6 +25,7 @@
 struct prio_sched_data {
 	int bands;
 	struct tcf_proto __rcu *filter_list;
+	struct tcf_block *block;
 	u8  prio2band[TC_PRIO_MAX+1];
 	struct Qdisc *queues[TCQ_PRIO_BANDS];
 };
@@ -145,7 +146,7 @@ prio_destroy(struct Qdisc *sch)
 	int prio;
 	struct prio_sched_data *q = qdisc_priv(sch);
 
-	tcf_destroy_chain(&q->filter_list);
+	tcf_block_put(q->block);
 	for (prio = 0; prio < q->bands; prio++)
 		qdisc_destroy(q->queues[prio]);
 }
@@ -204,9 +205,16 @@ static int prio_tune(struct Qdisc *sch, struct nlattr *opt)
 
 static int prio_init(struct Qdisc *sch, struct nlattr *opt)
 {
+	struct prio_sched_data *q = qdisc_priv(sch);
+	int err;
+
 	if (!opt)
 		return -EINVAL;
 
+	err = tcf_block_get(&q->block, &q->filter_list);
+	if (err)
+		return err;
+
 	return prio_tune(sch, opt);
 }
 
@@ -317,14 +325,13 @@ static void prio_walk(struct Qdisc *sch, struct qdisc_walker *arg)
 	}
 }
 
-static struct tcf_proto __rcu **prio_find_tcf(struct Qdisc *sch,
-					      unsigned long cl)
+static struct tcf_block *prio_tcf_block(struct Qdisc *sch, unsigned long cl)
 {
 	struct prio_sched_data *q = qdisc_priv(sch);
 
 	if (cl)
 		return NULL;
-	return &q->filter_list;
+	return q->block;
 }
 
 static const struct Qdisc_class_ops prio_class_ops = {
@@ -333,7 +340,7 @@ static const struct Qdisc_class_ops prio_class_ops = {
 	.get		=	prio_get,
 	.put		=	prio_put,
 	.walk		=	prio_walk,
-	.tcf_chain	=	prio_find_tcf,
+	.tcf_block	=	prio_tcf_block,
 	.bind_tcf	=	prio_bind,
 	.unbind_tcf	=	prio_put,
 	.dump		=	prio_dump_class,
diff --git a/net/sched/sch_qfq.c b/net/sched/sch_qfq.c
index 73c7ac3..076ad03 100644
--- a/net/sched/sch_qfq.c
+++ b/net/sched/sch_qfq.c
@@ -182,6 +182,7 @@ struct qfq_group {
 
 struct qfq_sched {
 	struct tcf_proto __rcu *filter_list;
+	struct tcf_block	*block;
 	struct Qdisc_class_hash clhash;
 
 	u64			oldV, V;	/* Precise virtual times. */
@@ -582,15 +583,14 @@ static void qfq_put_class(struct Qdisc *sch, unsigned long arg)
 		qfq_destroy_class(sch, cl);
 }
 
-static struct tcf_proto __rcu **qfq_tcf_chain(struct Qdisc *sch,
-					      unsigned long cl)
+static struct tcf_block *qfq_tcf_block(struct Qdisc *sch, unsigned long cl)
 {
 	struct qfq_sched *q = qdisc_priv(sch);
 
 	if (cl)
 		return NULL;
 
-	return &q->filter_list;
+	return q->block;
 }
 
 static unsigned long qfq_bind_tcf(struct Qdisc *sch, unsigned long parent,
@@ -1438,6 +1438,10 @@ static int qfq_init_qdisc(struct Qdisc *sch, struct nlattr *opt)
 	int i, j, err;
 	u32 max_cl_shift, maxbudg_shift, max_classes;
 
+	err = tcf_block_get(&q->block, &q->filter_list);
+	if (err)
+		return err;
+
 	err = qdisc_class_hash_init(&q->clhash);
 	if (err < 0)
 		return err;
@@ -1492,7 +1496,7 @@ static void qfq_destroy_qdisc(struct Qdisc *sch)
 	struct hlist_node *next;
 	unsigned int i;
 
-	tcf_destroy_chain(&q->filter_list);
+	tcf_block_put(q->block);
 
 	for (i = 0; i < q->clhash.hashsize; i++) {
 		hlist_for_each_entry_safe(cl, next, &q->clhash.hash[i],
@@ -1508,7 +1512,7 @@ static const struct Qdisc_class_ops qfq_class_ops = {
 	.delete		= qfq_delete_class,
 	.get		= qfq_get_class,
 	.put		= qfq_put_class,
-	.tcf_chain	= qfq_tcf_chain,
+	.tcf_block	= qfq_tcf_block,
 	.bind_tcf	= qfq_bind_tcf,
 	.unbind_tcf	= qfq_unbind_tcf,
 	.graft		= qfq_graft_class,
diff --git a/net/sched/sch_sfb.c b/net/sched/sch_sfb.c
index b2878808..9756b1c 100644
--- a/net/sched/sch_sfb.c
+++ b/net/sched/sch_sfb.c
@@ -56,6 +56,7 @@ struct sfb_bins {
 struct sfb_sched_data {
 	struct Qdisc	*qdisc;
 	struct tcf_proto __rcu *filter_list;
+	struct tcf_block *block;
 	unsigned long	rehash_interval;
 	unsigned long	warmup_time;	/* double buffering warmup time in jiffies */
 	u32		max;
@@ -465,7 +466,7 @@ static void sfb_destroy(struct Qdisc *sch)
 {
 	struct sfb_sched_data *q = qdisc_priv(sch);
 
-	tcf_destroy_chain(&q->filter_list);
+	tcf_block_put(q->block);
 	qdisc_destroy(q->qdisc);
 }
 
@@ -549,6 +550,11 @@ static int sfb_change(struct Qdisc *sch, struct nlattr *opt)
 static int sfb_init(struct Qdisc *sch, struct nlattr *opt)
 {
 	struct sfb_sched_data *q = qdisc_priv(sch);
+	int err;
+
+	err = tcf_block_get(&q->block, &q->filter_list);
+	if (err)
+		return err;
 
 	q->qdisc = &noop_qdisc;
 	return sfb_change(sch, opt);
@@ -657,14 +663,13 @@ static void sfb_walk(struct Qdisc *sch, struct qdisc_walker *walker)
 	}
 }
 
-static struct tcf_proto __rcu **sfb_find_tcf(struct Qdisc *sch,
-					     unsigned long cl)
+static struct tcf_block *sfb_tcf_block(struct Qdisc *sch, unsigned long cl)
 {
 	struct sfb_sched_data *q = qdisc_priv(sch);
 
 	if (cl)
 		return NULL;
-	return &q->filter_list;
+	return q->block;
 }
 
 static unsigned long sfb_bind(struct Qdisc *sch, unsigned long parent,
@@ -682,7 +687,7 @@ static const struct Qdisc_class_ops sfb_class_ops = {
 	.change		=	sfb_change_class,
 	.delete		=	sfb_delete,
 	.walk		=	sfb_walk,
-	.tcf_chain	=	sfb_find_tcf,
+	.tcf_block	=	sfb_tcf_block,
 	.bind_tcf	=	sfb_bind,
 	.unbind_tcf	=	sfb_put,
 	.dump		=	sfb_dump_class,
diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 012fa3b..063281b 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -126,6 +126,7 @@ struct sfq_sched_data {
 	u8		flags;
 	unsigned short  scaled_quantum; /* SFQ_ALLOT_SIZE(quantum) */
 	struct tcf_proto __rcu *filter_list;
+	struct tcf_block *block;
 	sfq_index	*ht;		/* Hash table ('divisor' slots) */
 	struct sfq_slot	*slots;		/* Flows table ('maxflows' entries) */
 
@@ -701,7 +702,7 @@ static void sfq_destroy(struct Qdisc *sch)
 {
 	struct sfq_sched_data *q = qdisc_priv(sch);
 
-	tcf_destroy_chain(&q->filter_list);
+	tcf_block_put(q->block);
 	q->perturb_period = 0;
 	del_timer_sync(&q->perturb_timer);
 	sfq_free(q->ht);
@@ -713,6 +714,11 @@ static int sfq_init(struct Qdisc *sch, struct nlattr *opt)
 {
 	struct sfq_sched_data *q = qdisc_priv(sch);
 	int i;
+	int err;
+
+	err = tcf_block_get(&q->block, &q->filter_list);
+	if (err)
+		return err;
 
 	setup_deferrable_timer(&q->perturb_timer, sfq_perturbation,
 			       (unsigned long)sch);
@@ -819,14 +825,13 @@ static void sfq_put(struct Qdisc *q, unsigned long cl)
 {
 }
 
-static struct tcf_proto __rcu **sfq_find_tcf(struct Qdisc *sch,
-					     unsigned long cl)
+static struct tcf_block *sfq_tcf_block(struct Qdisc *sch, unsigned long cl)
 {
 	struct sfq_sched_data *q = qdisc_priv(sch);
 
 	if (cl)
 		return NULL;
-	return &q->filter_list;
+	return q->block;
 }
 
 static int sfq_dump_class(struct Qdisc *sch, unsigned long cl,
@@ -882,7 +887,7 @@ static const struct Qdisc_class_ops sfq_class_ops = {
 	.leaf		=	sfq_leaf,
 	.get		=	sfq_get,
 	.put		=	sfq_put,
-	.tcf_chain	=	sfq_find_tcf,
+	.tcf_block	=	sfq_tcf_block,
 	.bind_tcf	=	sfq_bind,
 	.unbind_tcf	=	sfq_put,
 	.dump		=	sfq_dump_class,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [patch net-next 03/10] net: sched: rename tcf_destroy_chain helper
  2017-04-27 11:12 [patch net-next 00/10] net: sched: introduce multichain support for filters Jiri Pirko
  2017-04-27 11:12 ` [patch net-next 01/10] net: sched: move tc_classify function to cls_api.c Jiri Pirko
  2017-04-27 11:12 ` [patch net-next 02/10] net: sched: introduce tcf block infractructure Jiri Pirko
@ 2017-04-27 11:12 ` Jiri Pirko
  2017-04-27 11:12 ` [patch net-next 04/10] net: sched: replace nprio by a bool to make the function more readable Jiri Pirko
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 26+ messages in thread
From: Jiri Pirko @ 2017-04-27 11:12 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, dsa, edumazet, stephen, daniel,
	alexander.h.duyck, mlxsw, simon.horman

From: Jiri Pirko <jiri@mellanox.com>

Make the name consistent with the rest of the helpers around.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 net/sched/cls_api.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 42fdc8a..88ec1a1 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -187,7 +187,7 @@ static void tcf_proto_destroy(struct tcf_proto *tp)
 	kfree_rcu(tp, rcu);
 }
 
-static void tcf_destroy_chain(struct tcf_proto __rcu **fl)
+static void tcf_chain_destroy(struct tcf_proto __rcu **fl)
 {
 	struct tcf_proto *tp;
 
@@ -214,7 +214,7 @@ void tcf_block_put(struct tcf_block *block)
 {
 	if (!block)
 		return;
-	tcf_destroy_chain(block->p_filter_chain);
+	tcf_chain_destroy(block->p_filter_chain);
 	kfree(block);
 }
 EXPORT_SYMBOL(tcf_block_put);
@@ -366,7 +366,7 @@ static int tc_ctl_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
 
 	if (n->nlmsg_type == RTM_DELTFILTER && prio == 0) {
 		tfilter_notify_chain(net, skb, n, chain, RTM_DELTFILTER);
-		tcf_destroy_chain(chain);
+		tcf_chain_destroy(chain);
 		err = 0;
 		goto errout;
 	}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [patch net-next 04/10] net: sched: replace nprio by a bool to make the function more readable
  2017-04-27 11:12 [patch net-next 00/10] net: sched: introduce multichain support for filters Jiri Pirko
                   ` (2 preceding siblings ...)
  2017-04-27 11:12 ` [patch net-next 03/10] net: sched: rename tcf_destroy_chain helper Jiri Pirko
@ 2017-04-27 11:12 ` Jiri Pirko
  2017-04-27 11:12 ` [patch net-next 05/10] net: sched: move TC_H_MAJ macro call into tcf_auto_prio Jiri Pirko
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 26+ messages in thread
From: Jiri Pirko @ 2017-04-27 11:12 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, dsa, edumazet, stephen, daniel,
	alexander.h.duyck, mlxsw, simon.horman

From: Jiri Pirko <jiri@mellanox.com>

The use of "nprio" variable in tc_ctl_tfilter is a bit cryptic and makes
a reader wonder what is going on for a while. So help him to understand
this priority allocation dance a litte bit better.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 net/sched/cls_api.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 88ec1a1..0e49e6e 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -271,7 +271,7 @@ static int tc_ctl_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
 	struct tcmsg *t;
 	u32 protocol;
 	u32 prio;
-	u32 nprio;
+	bool prio_allocate;
 	u32 parent;
 	struct net_device *dev;
 	struct Qdisc  *q;
@@ -300,7 +300,7 @@ static int tc_ctl_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
 	t = nlmsg_data(n);
 	protocol = TC_H_MIN(t->tcm_info);
 	prio = TC_H_MAJ(t->tcm_info);
-	nprio = prio;
+	prio_allocate = false;
 	parent = t->tcm_parent;
 	cl = 0;
 
@@ -316,6 +316,7 @@ static int tc_ctl_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
 			 */
 			if (n->nlmsg_flags & NLM_F_CREATE) {
 				prio = TC_H_MAKE(0x80000000U, 0U);
+				prio_allocate = true;
 				break;
 			}
 			/* fall-through */
@@ -377,7 +378,7 @@ static int tc_ctl_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
 	     back = &tp->next) {
 		if (tp->prio >= prio) {
 			if (tp->prio == prio) {
-				if (!nprio ||
+				if (prio_allocate ||
 				    (tp->protocol != protocol && protocol)) {
 					err = -EINVAL;
 					goto errout;
@@ -403,11 +404,11 @@ static int tc_ctl_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
 			goto errout;
 		}
 
-		if (!nprio)
-			nprio = TC_H_MAJ(tcf_auto_prio(rtnl_dereference(*back)));
+		if (prio_allocate)
+			prio = TC_H_MAJ(tcf_auto_prio(rtnl_dereference(*back)));
 
 		tp = tcf_proto_create(nla_data(tca[TCA_KIND]),
-				      protocol, nprio, parent, q, block);
+				      protocol, prio, parent, q, block);
 		if (IS_ERR(tp)) {
 			err = PTR_ERR(tp);
 			goto errout;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [patch net-next 05/10] net: sched: move TC_H_MAJ macro call into tcf_auto_prio
  2017-04-27 11:12 [patch net-next 00/10] net: sched: introduce multichain support for filters Jiri Pirko
                   ` (3 preceding siblings ...)
  2017-04-27 11:12 ` [patch net-next 04/10] net: sched: replace nprio by a bool to make the function more readable Jiri Pirko
@ 2017-04-27 11:12 ` Jiri Pirko
  2017-04-27 11:12 ` [patch net-next 06/10] net: sched: introduce helpers to work with filter chains Jiri Pirko
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 26+ messages in thread
From: Jiri Pirko @ 2017-04-27 11:12 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, dsa, edumazet, stephen, daniel,
	alexander.h.duyck, mlxsw, simon.horman

From: Jiri Pirko <jiri@mellanox.com>

Call the helper from the function rather than to always adjust the
return value of the function.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 net/sched/cls_api.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 0e49e6e..72624fa 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -125,7 +125,7 @@ static inline u32 tcf_auto_prio(struct tcf_proto *tp)
 	if (tp)
 		first = tp->prio - 1;
 
-	return first;
+	return TC_H_MAJ(first);
 }
 
 static struct tcf_proto *tcf_proto_create(const char *kind, u32 protocol,
@@ -405,7 +405,7 @@ static int tc_ctl_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
 		}
 
 		if (prio_allocate)
-			prio = TC_H_MAJ(tcf_auto_prio(rtnl_dereference(*back)));
+			prio = tcf_auto_prio(rtnl_dereference(*back));
 
 		tp = tcf_proto_create(nla_data(tca[TCA_KIND]),
 				      protocol, prio, parent, q, block);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [patch net-next 06/10] net: sched: introduce helpers to work with filter chains
  2017-04-27 11:12 [patch net-next 00/10] net: sched: introduce multichain support for filters Jiri Pirko
                   ` (4 preceding siblings ...)
  2017-04-27 11:12 ` [patch net-next 05/10] net: sched: move TC_H_MAJ macro call into tcf_auto_prio Jiri Pirko
@ 2017-04-27 11:12 ` Jiri Pirko
  2017-04-27 11:12 ` [patch net-next 07/10] net: sched: push chain dump to a separate function Jiri Pirko
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 26+ messages in thread
From: Jiri Pirko @ 2017-04-27 11:12 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, dsa, edumazet, stephen, daniel,
	alexander.h.duyck, mlxsw, simon.horman

From: Jiri Pirko <jiri@mellanox.com>

Introduce struct tcf_chain object and set of helpers around it. Wraps up
insertion, deletion and search in the filter chain.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 include/net/sch_generic.h |   7 ++-
 net/sched/cls_api.c       | 148 +++++++++++++++++++++++++++++++++-------------
 2 files changed, 113 insertions(+), 42 deletions(-)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 98cf2f2..52bceed 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -248,10 +248,15 @@ struct qdisc_skb_cb {
 	unsigned char		data[QDISC_CB_PRIV_LEN];
 };
 
-struct tcf_block {
+struct tcf_chain {
+	struct tcf_proto __rcu *filter_chain;
 	struct tcf_proto __rcu **p_filter_chain;
 };
 
+struct tcf_block {
+	struct tcf_chain *chain;
+};
+
 static inline void qdisc_cb_private_validate(const struct sk_buff *skb, int sz)
 {
 	struct qdisc_skb_cb *qcb;
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 72624fa..6f2ea68 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -106,13 +106,12 @@ static int tfilter_notify(struct net *net, struct sk_buff *oskb,
 
 static void tfilter_notify_chain(struct net *net, struct sk_buff *oskb,
 				 struct nlmsghdr *n,
-				 struct tcf_proto __rcu **chain, int event)
+				 struct tcf_chain *chain, int event)
 {
-	struct tcf_proto __rcu **it_chain;
 	struct tcf_proto *tp;
 
-	for (it_chain = chain; (tp = rtnl_dereference(*it_chain)) != NULL;
-	     it_chain = &tp->next)
+	for (tp = rtnl_dereference(chain->filter_chain);
+	     tp; tp = rtnl_dereference(tp->next))
 		tfilter_notify(net, oskb, n, tp, 0, event, false);
 }
 
@@ -187,26 +186,49 @@ static void tcf_proto_destroy(struct tcf_proto *tp)
 	kfree_rcu(tp, rcu);
 }
 
-static void tcf_chain_destroy(struct tcf_proto __rcu **fl)
+static struct tcf_chain *tcf_chain_create(void)
+{
+	return kzalloc(sizeof(struct tcf_chain), GFP_KERNEL);
+}
+
+static void tcf_chain_destroy(struct tcf_chain *chain)
 {
 	struct tcf_proto *tp;
 
-	while ((tp = rtnl_dereference(*fl)) != NULL) {
-		RCU_INIT_POINTER(*fl, tp->next);
+	while ((tp = rtnl_dereference(chain->filter_chain)) != NULL) {
+		RCU_INIT_POINTER(chain->filter_chain, tp->next);
 		tcf_proto_destroy(tp);
 	}
+	kfree(chain);
+}
+
+static void
+tcf_chain_filter_chain_ptr_set(struct tcf_chain *chain,
+			       struct tcf_proto __rcu **p_filter_chain)
+{
+	chain->p_filter_chain = p_filter_chain;
 }
 
 int tcf_block_get(struct tcf_block **p_block,
 		  struct tcf_proto __rcu **p_filter_chain)
 {
 	struct tcf_block *block = kzalloc(sizeof(*block), GFP_KERNEL);
+	int err;
 
 	if (!block)
 		return -ENOMEM;
-	block->p_filter_chain = p_filter_chain;
+	block->chain = tcf_chain_create();
+	if (!block->chain) {
+		err = -ENOMEM;
+		goto err_chain_create;
+	}
+	tcf_chain_filter_chain_ptr_set(block->chain, p_filter_chain);
 	*p_block = block;
 	return 0;
+
+err_chain_create:
+	kfree(block);
+	return err;
 }
 EXPORT_SYMBOL(tcf_block_get);
 
@@ -214,7 +236,7 @@ void tcf_block_put(struct tcf_block *block)
 {
 	if (!block)
 		return;
-	tcf_chain_destroy(block->p_filter_chain);
+	tcf_chain_destroy(block->chain);
 	kfree(block);
 }
 EXPORT_SYMBOL(tcf_block_put);
@@ -261,6 +283,65 @@ int tcf_classify(struct sk_buff *skb, const struct tcf_proto *tp,
 }
 EXPORT_SYMBOL(tcf_classify);
 
+struct tcf_chain_info {
+	struct tcf_proto __rcu **pprev;
+	struct tcf_proto __rcu *next;
+};
+
+static struct tcf_proto *tcf_chain_tp_prev(struct tcf_chain_info *chain_info)
+{
+	return rtnl_dereference(*chain_info->pprev);
+}
+
+static void tcf_chain_tp_insert(struct tcf_chain *chain,
+				struct tcf_chain_info *chain_info,
+				struct tcf_proto *tp)
+{
+	if (chain->p_filter_chain &&
+	    *chain_info->pprev == chain->filter_chain)
+		*chain->p_filter_chain = tp;
+	RCU_INIT_POINTER(tp->next, rtnl_dereference(*chain_info->pprev));
+	rcu_assign_pointer(*chain_info->pprev, tp);
+}
+
+static void tcf_chain_tp_remove(struct tcf_chain *chain,
+				struct tcf_chain_info *chain_info,
+				struct tcf_proto *tp)
+{
+	struct tcf_proto *next = rtnl_dereference(chain_info->next);
+
+	if (chain->p_filter_chain && tp == chain->filter_chain)
+		*chain->p_filter_chain = next;
+	RCU_INIT_POINTER(*chain_info->pprev, next);
+}
+
+static struct tcf_proto *tcf_chain_tp_find(struct tcf_chain *chain,
+					   struct tcf_chain_info *chain_info,
+					   u32 protocol, u32 prio,
+					   bool prio_allocate)
+{
+	struct tcf_proto **pprev;
+	struct tcf_proto *tp;
+
+	/* Check the chain for existence of proto-tcf with this priority */
+	for (pprev = &chain->filter_chain;
+	     (tp = rtnl_dereference(*pprev)); pprev = &tp->next) {
+		if (tp->prio >= prio) {
+			if (tp->prio == prio) {
+				if (prio_allocate ||
+				    (tp->protocol != protocol && protocol))
+					return ERR_PTR(-EINVAL);
+			} else {
+				tp = NULL;
+			}
+			break;
+		}
+	}
+	chain_info->pprev = pprev;
+	chain_info->next = tp ? tp->next : NULL;
+	return tp;
+}
+
 /* Add/change/delete/get a filter node */
 
 static int tc_ctl_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
@@ -275,10 +356,9 @@ static int tc_ctl_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
 	u32 parent;
 	struct net_device *dev;
 	struct Qdisc  *q;
-	struct tcf_proto __rcu **back;
-	struct tcf_proto __rcu **chain;
+	struct tcf_chain_info chain_info;
+	struct tcf_chain *chain;
 	struct tcf_block *block;
-	struct tcf_proto *next;
 	struct tcf_proto *tp;
 	const struct Qdisc_class_ops *cops;
 	unsigned long cl;
@@ -363,7 +443,7 @@ static int tc_ctl_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
 		err = -EINVAL;
 		goto errout;
 	}
-	chain = block->p_filter_chain;
+	chain = block->chain;
 
 	if (n->nlmsg_type == RTM_DELTFILTER && prio == 0) {
 		tfilter_notify_chain(net, skb, n, chain, RTM_DELTFILTER);
@@ -372,22 +452,11 @@ static int tc_ctl_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
 		goto errout;
 	}
 
-	/* Check the chain for existence of proto-tcf with this priority */
-	for (back = chain;
-	     (tp = rtnl_dereference(*back)) != NULL;
-	     back = &tp->next) {
-		if (tp->prio >= prio) {
-			if (tp->prio == prio) {
-				if (prio_allocate ||
-				    (tp->protocol != protocol && protocol)) {
-					err = -EINVAL;
-					goto errout;
-				}
-			} else {
-				tp = NULL;
-			}
-			break;
-		}
+	tp = tcf_chain_tp_find(chain, &chain_info, protocol,
+			       prio, prio_allocate);
+	if (IS_ERR(tp)) {
+		err = PTR_ERR(tp);
+		goto errout;
 	}
 
 	if (tp == NULL) {
@@ -405,7 +474,7 @@ static int tc_ctl_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
 		}
 
 		if (prio_allocate)
-			prio = tcf_auto_prio(rtnl_dereference(*back));
+			prio = tcf_auto_prio(tcf_chain_tp_prev(&chain_info));
 
 		tp = tcf_proto_create(nla_data(tca[TCA_KIND]),
 				      protocol, prio, parent, q, block);
@@ -423,8 +492,7 @@ static int tc_ctl_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
 
 	if (fh == 0) {
 		if (n->nlmsg_type == RTM_DELTFILTER && t->tcm_handle == 0) {
-			next = rtnl_dereference(tp->next);
-			RCU_INIT_POINTER(*back, next);
+			tcf_chain_tp_remove(chain, &chain_info, tp);
 			tfilter_notify(net, skb, n, tp, fh,
 				       RTM_DELTFILTER, false);
 			tcf_proto_destroy(tp);
@@ -453,11 +521,10 @@ static int tc_ctl_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
 			err = tp->ops->delete(tp, fh, &last);
 			if (err)
 				goto errout;
-			next = rtnl_dereference(tp->next);
 			tfilter_notify(net, skb, n, tp, t->tcm_handle,
 				       RTM_DELTFILTER, false);
 			if (last) {
-				RCU_INIT_POINTER(*back, next);
+				tcf_chain_tp_remove(chain, &chain_info, tp);
 				tcf_proto_destroy(tp);
 			}
 			goto errout;
@@ -474,10 +541,8 @@ static int tc_ctl_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
 	err = tp->ops->change(net, skb, tp, cl, t->tcm_handle, tca, &fh,
 			      n->nlmsg_flags & NLM_F_CREATE ? TCA_ACT_NOREPLACE : TCA_ACT_REPLACE);
 	if (err == 0) {
-		if (tp_created) {
-			RCU_INIT_POINTER(tp->next, rtnl_dereference(*back));
-			rcu_assign_pointer(*back, tp);
-		}
+		if (tp_created)
+			tcf_chain_tp_insert(chain, &chain_info, tp);
 		tfilter_notify(net, skb, n, tp, fh, RTM_NEWTFILTER, false);
 	} else {
 		if (tp_created)
@@ -578,7 +643,8 @@ static int tc_dump_tfilter(struct sk_buff *skb, struct netlink_callback *cb)
 	struct net_device *dev;
 	struct Qdisc *q;
 	struct tcf_block *block;
-	struct tcf_proto *tp, __rcu **chain;
+	struct tcf_proto *tp;
+	struct tcf_chain *chain;
 	struct tcmsg *tcm = nlmsg_data(cb->nlh);
 	unsigned long cl = 0;
 	const struct Qdisc_class_ops *cops;
@@ -609,11 +675,11 @@ static int tc_dump_tfilter(struct sk_buff *skb, struct netlink_callback *cb)
 	block = cops->tcf_block(q, cl);
 	if (!block)
 		goto errout;
-	chain = block->p_filter_chain;
+	chain = block->chain;
 
 	s_t = cb->args[0];
 
-	for (tp = rtnl_dereference(*chain), t = 0;
+	for (tp = rtnl_dereference(chain->filter_chain), t = 0;
 	     tp; tp = rtnl_dereference(tp->next), t++) {
 		if (t < s_t)
 			continue;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [patch net-next 07/10] net: sched: push chain dump to a separate function
  2017-04-27 11:12 [patch net-next 00/10] net: sched: introduce multichain support for filters Jiri Pirko
                   ` (5 preceding siblings ...)
  2017-04-27 11:12 ` [patch net-next 06/10] net: sched: introduce helpers to work with filter chains Jiri Pirko
@ 2017-04-27 11:12 ` Jiri Pirko
  2017-04-27 11:12 ` [patch net-next 08/10] net: sched: introduce multichain support for filters Jiri Pirko
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 26+ messages in thread
From: Jiri Pirko @ 2017-04-27 11:12 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, dsa, edumazet, stephen, daniel,
	alexander.h.duyck, mlxsw, simon.horman

From: Jiri Pirko <jiri@mellanox.com>

Since there will be multiple chains to dump, push chain dumping code to
a separate function.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 net/sched/cls_api.c | 95 +++++++++++++++++++++++++++++------------------------
 1 file changed, 52 insertions(+), 43 deletions(-)

diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 6f2ea68..9f48061 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -634,21 +634,65 @@ static int tcf_node_dump(struct tcf_proto *tp, unsigned long n,
 			     RTM_NEWTFILTER);
 }
 
+static void tcf_chain_dump(struct tcf_chain *chain, struct sk_buff *skb,
+			   struct netlink_callback *cb,
+			   long index_start, long *p_index)
+{
+	struct net *net = sock_net(skb->sk);
+	struct tcmsg *tcm = nlmsg_data(cb->nlh);
+	struct tcf_dump_args arg;
+	struct tcf_proto *tp;
+
+	for (tp = rtnl_dereference(chain->filter_chain);
+	     tp; tp = rtnl_dereference(tp->next), (*p_index)++) {
+		if (*p_index < index_start)
+			continue;
+		if (TC_H_MAJ(tcm->tcm_info) &&
+		    TC_H_MAJ(tcm->tcm_info) != tp->prio)
+			continue;
+		if (TC_H_MIN(tcm->tcm_info) &&
+		    TC_H_MIN(tcm->tcm_info) != tp->protocol)
+			continue;
+		if (*p_index > index_start)
+			memset(&cb->args[1], 0,
+			       sizeof(cb->args) - sizeof(cb->args[0]));
+		if (cb->args[1] == 0) {
+			if (tcf_fill_node(net, skb, tp, 0,
+					  NETLINK_CB(cb->skb).portid,
+					  cb->nlh->nlmsg_seq, NLM_F_MULTI,
+					  RTM_NEWTFILTER) <= 0)
+				break;
+
+			cb->args[1] = 1;
+		}
+		if (!tp->ops->walk)
+			continue;
+		arg.w.fn = tcf_node_dump;
+		arg.skb = skb;
+		arg.cb = cb;
+		arg.w.stop = 0;
+		arg.w.skip = cb->args[1] - 1;
+		arg.w.count = 0;
+		tp->ops->walk(tp, &arg.w);
+		cb->args[1] = arg.w.count + 1;
+		if (arg.w.stop)
+			break;
+	}
+}
+
 /* called with RTNL */
 static int tc_dump_tfilter(struct sk_buff *skb, struct netlink_callback *cb)
 {
 	struct net *net = sock_net(skb->sk);
-	int t;
-	int s_t;
 	struct net_device *dev;
 	struct Qdisc *q;
 	struct tcf_block *block;
-	struct tcf_proto *tp;
 	struct tcf_chain *chain;
 	struct tcmsg *tcm = nlmsg_data(cb->nlh);
 	unsigned long cl = 0;
 	const struct Qdisc_class_ops *cops;
-	struct tcf_dump_args arg;
+	long index_start;
+	long index;
 
 	if (nlmsg_len(cb->nlh) < sizeof(*tcm))
 		return skb->len;
@@ -677,45 +721,10 @@ static int tc_dump_tfilter(struct sk_buff *skb, struct netlink_callback *cb)
 		goto errout;
 	chain = block->chain;
 
-	s_t = cb->args[0];
-
-	for (tp = rtnl_dereference(chain->filter_chain), t = 0;
-	     tp; tp = rtnl_dereference(tp->next), t++) {
-		if (t < s_t)
-			continue;
-		if (TC_H_MAJ(tcm->tcm_info) &&
-		    TC_H_MAJ(tcm->tcm_info) != tp->prio)
-			continue;
-		if (TC_H_MIN(tcm->tcm_info) &&
-		    TC_H_MIN(tcm->tcm_info) != tp->protocol)
-			continue;
-		if (t > s_t)
-			memset(&cb->args[1], 0,
-			       sizeof(cb->args)-sizeof(cb->args[0]));
-		if (cb->args[1] == 0) {
-			if (tcf_fill_node(net, skb, tp, 0,
-					  NETLINK_CB(cb->skb).portid,
-					  cb->nlh->nlmsg_seq, NLM_F_MULTI,
-					  RTM_NEWTFILTER) <= 0)
-				break;
-
-			cb->args[1] = 1;
-		}
-		if (tp->ops->walk == NULL)
-			continue;
-		arg.w.fn = tcf_node_dump;
-		arg.skb = skb;
-		arg.cb = cb;
-		arg.w.stop = 0;
-		arg.w.skip = cb->args[1] - 1;
-		arg.w.count = 0;
-		tp->ops->walk(tp, &arg.w);
-		cb->args[1] = arg.w.count + 1;
-		if (arg.w.stop)
-			break;
-	}
-
-	cb->args[0] = t;
+	index_start = cb->args[0];
+	index = 0;
+	tcf_chain_dump(chain, skb, cb, index_start, &index);
+	cb->args[0] = index;
 
 errout:
 	if (cl)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [patch net-next 08/10] net: sched: introduce multichain support for filters
  2017-04-27 11:12 [patch net-next 00/10] net: sched: introduce multichain support for filters Jiri Pirko
                   ` (6 preceding siblings ...)
  2017-04-27 11:12 ` [patch net-next 07/10] net: sched: push chain dump to a separate function Jiri Pirko
@ 2017-04-27 11:12 ` Jiri Pirko
  2017-04-27 11:12 ` [patch net-next 09/10] net: sched: push tp down to action init callback Jiri Pirko
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 26+ messages in thread
From: Jiri Pirko @ 2017-04-27 11:12 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, dsa, edumazet, stephen, daniel,
	alexander.h.duyck, mlxsw, simon.horman

From: Jiri Pirko <jiri@mellanox.com>

Instead of having only one filter per block, introduce a list of chains
for every block. Create chain 0 by default. UAPI is extended so the user
can specify which chain he wants to change. If the new attribute is not
specified, chain 0 is used. That allows to maintain backward
compatibility. If chain does not exist and user wants to manipulate with
it, new chain is created with specified index. Also, when last filter is
removed from the chain, the chain is destroyed.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 include/net/pkt_cls.h          |   2 +
 include/net/sch_generic.h      |   9 +++-
 include/uapi/linux/rtnetlink.h |   1 +
 net/sched/cls_api.c            | 100 ++++++++++++++++++++++++++++++++++-------
 4 files changed, 94 insertions(+), 18 deletions(-)

diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index e56e715..2c213a6 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -18,6 +18,8 @@ int register_tcf_proto_ops(struct tcf_proto_ops *ops);
 int unregister_tcf_proto_ops(struct tcf_proto_ops *ops);
 
 #ifdef CONFIG_NET_CLS
+struct tcf_chain *tcf_chain_get(struct tcf_block *block, u32 chain_index);
+void tcf_chain_put(struct tcf_chain *chain);
 int tcf_block_get(struct tcf_block **p_block,
 		  struct tcf_proto __rcu **p_filter_chain);
 void tcf_block_put(struct tcf_block *block);
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 52bceed..569b565 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -8,6 +8,7 @@
 #include <linux/pkt_cls.h>
 #include <linux/percpu.h>
 #include <linux/dynamic_queue_limits.h>
+#include <linux/list.h>
 #include <net/gen_stats.h>
 #include <net/rtnetlink.h>
 
@@ -236,7 +237,7 @@ struct tcf_proto {
 	struct Qdisc		*q;
 	void			*data;
 	const struct tcf_proto_ops	*ops;
-	struct tcf_block	*block;
+	struct tcf_chain	*chain;
 	struct rcu_head		rcu;
 };
 
@@ -251,10 +252,14 @@ struct qdisc_skb_cb {
 struct tcf_chain {
 	struct tcf_proto __rcu *filter_chain;
 	struct tcf_proto __rcu **p_filter_chain;
+	struct list_head list;
+	struct tcf_block *block;
+	u32 index; /* chain index */
+	unsigned int refcnt;
 };
 
 struct tcf_block {
-	struct tcf_chain *chain;
+	struct list_head chain_list;
 };
 
 static inline void qdisc_cb_private_validate(const struct sk_buff *skb, int sz)
diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
index cce0613..6487b21 100644
--- a/include/uapi/linux/rtnetlink.h
+++ b/include/uapi/linux/rtnetlink.h
@@ -549,6 +549,7 @@ enum {
 	TCA_STAB,
 	TCA_PAD,
 	TCA_DUMP_INVISIBLE,
+	TCA_CHAIN,
 	__TCA_MAX
 };
 
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 9f48061..a4fab8f 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -129,7 +129,7 @@ static inline u32 tcf_auto_prio(struct tcf_proto *tp)
 
 static struct tcf_proto *tcf_proto_create(const char *kind, u32 protocol,
 					  u32 prio, u32 parent, struct Qdisc *q,
-					  struct tcf_block *block)
+					  struct tcf_chain *chain)
 {
 	struct tcf_proto *tp;
 	int err;
@@ -165,7 +165,7 @@ static struct tcf_proto *tcf_proto_create(const char *kind, u32 protocol,
 	tp->prio = prio;
 	tp->classid = parent;
 	tp->q = q;
-	tp->block = block;
+	tp->chain = chain;
 
 	err = tp->ops->init(tp);
 	if (err) {
@@ -186,15 +186,26 @@ static void tcf_proto_destroy(struct tcf_proto *tp)
 	kfree_rcu(tp, rcu);
 }
 
-static struct tcf_chain *tcf_chain_create(void)
+static struct tcf_chain *tcf_chain_create(struct tcf_block *block,
+					  u32 chain_index)
 {
-	return kzalloc(sizeof(struct tcf_chain), GFP_KERNEL);
+	struct tcf_chain *chain;
+
+	chain = kzalloc(sizeof(*chain), GFP_KERNEL);
+	if (!chain)
+		return NULL;
+	list_add_tail(&chain->list, &block->chain_list);
+	chain->block = block;
+	chain->index = chain_index;
+	chain->refcnt = 1;
+	return chain;
 }
 
 static void tcf_chain_destroy(struct tcf_chain *chain)
 {
 	struct tcf_proto *tp;
 
+	list_del(&chain->list);
 	while ((tp = rtnl_dereference(chain->filter_chain)) != NULL) {
 		RCU_INIT_POINTER(chain->filter_chain, tp->next);
 		tcf_proto_destroy(tp);
@@ -202,6 +213,30 @@ static void tcf_chain_destroy(struct tcf_chain *chain)
 	kfree(chain);
 }
 
+struct tcf_chain *tcf_chain_get(struct tcf_block *block, u32 chain_index)
+{
+	struct tcf_chain *chain;
+
+	list_for_each_entry(chain, &block->chain_list, list) {
+		if (chain->index == chain_index) {
+			chain->refcnt++;
+			return chain;
+		}
+	}
+	return tcf_chain_create(block, chain_index);
+}
+EXPORT_SYMBOL(tcf_chain_get);
+
+void tcf_chain_put(struct tcf_chain *chain)
+{
+	/* Destroy unused chain, with exception of chain 0, which is the
+	 * default one and has to be always present.
+	 */
+	if (--chain->refcnt == 0 && !chain->filter_chain && chain->index != 0)
+		tcf_chain_destroy(chain);
+}
+EXPORT_SYMBOL(tcf_chain_put);
+
 static void
 tcf_chain_filter_chain_ptr_set(struct tcf_chain *chain,
 			       struct tcf_proto __rcu **p_filter_chain)
@@ -213,16 +248,19 @@ int tcf_block_get(struct tcf_block **p_block,
 		  struct tcf_proto __rcu **p_filter_chain)
 {
 	struct tcf_block *block = kzalloc(sizeof(*block), GFP_KERNEL);
+	struct tcf_chain *chain;
 	int err;
 
 	if (!block)
 		return -ENOMEM;
-	block->chain = tcf_chain_create();
-	if (!block->chain) {
+	INIT_LIST_HEAD(&block->chain_list);
+	/* Create chain 0 by default, it has to be always present. */
+	chain = tcf_chain_create(block, 0);
+	if (!chain) {
 		err = -ENOMEM;
 		goto err_chain_create;
 	}
-	tcf_chain_filter_chain_ptr_set(block->chain, p_filter_chain);
+	tcf_chain_filter_chain_ptr_set(chain, p_filter_chain);
 	*p_block = block;
 	return 0;
 
@@ -234,9 +272,13 @@ EXPORT_SYMBOL(tcf_block_get);
 
 void tcf_block_put(struct tcf_block *block)
 {
+	struct tcf_chain *chain, *tmp;
+
 	if (!block)
 		return;
-	tcf_chain_destroy(block->chain);
+
+	list_for_each_entry_safe(chain, tmp, &block->chain_list, list)
+		tcf_chain_destroy(chain);
 	kfree(block);
 }
 EXPORT_SYMBOL(tcf_block_put);
@@ -354,10 +396,11 @@ static int tc_ctl_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
 	u32 prio;
 	bool prio_allocate;
 	u32 parent;
+	u32 chain_index;
 	struct net_device *dev;
 	struct Qdisc  *q;
 	struct tcf_chain_info chain_info;
-	struct tcf_chain *chain;
+	struct tcf_chain *chain = NULL;
 	struct tcf_block *block;
 	struct tcf_proto *tp;
 	const struct Qdisc_class_ops *cops;
@@ -443,7 +486,13 @@ static int tc_ctl_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
 		err = -EINVAL;
 		goto errout;
 	}
-	chain = block->chain;
+
+	chain_index = tca[TCA_CHAIN] ? nla_get_u32(tca[TCA_CHAIN]) : 0;
+	chain = tcf_chain_get(block, chain_index);
+	if (!chain) {
+		err = -ENOMEM;
+		goto errout;
+	}
 
 	if (n->nlmsg_type == RTM_DELTFILTER && prio == 0) {
 		tfilter_notify_chain(net, skb, n, chain, RTM_DELTFILTER);
@@ -477,7 +526,7 @@ static int tc_ctl_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
 			prio = tcf_auto_prio(tcf_chain_tp_prev(&chain_info));
 
 		tp = tcf_proto_create(nla_data(tca[TCA_KIND]),
-				      protocol, prio, parent, q, block);
+				      protocol, prio, parent, q, chain);
 		if (IS_ERR(tp)) {
 			err = PTR_ERR(tp);
 			goto errout;
@@ -550,6 +599,8 @@ static int tc_ctl_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
 	}
 
 errout:
+	if (chain)
+		tcf_chain_put(chain);
 	if (cl)
 		cops->put(q, cl);
 	if (err == -EAGAIN)
@@ -578,6 +629,8 @@ static int tcf_fill_node(struct net *net, struct sk_buff *skb,
 	tcm->tcm_info = TC_H_MAKE(tp->prio, tp->protocol);
 	if (nla_put_string(skb, TCA_KIND, tp->ops->kind))
 		goto nla_put_failure;
+	if (nla_put_u32(skb, TCA_CHAIN, tp->chain->index))
+		goto nla_put_failure;
 	tcm->tcm_handle = fh;
 	if (RTM_DELTFILTER != event) {
 		tcm->tcm_handle = 0;
@@ -634,7 +687,7 @@ static int tcf_node_dump(struct tcf_proto *tp, unsigned long n,
 			     RTM_NEWTFILTER);
 }
 
-static void tcf_chain_dump(struct tcf_chain *chain, struct sk_buff *skb,
+static bool tcf_chain_dump(struct tcf_chain *chain, struct sk_buff *skb,
 			   struct netlink_callback *cb,
 			   long index_start, long *p_index)
 {
@@ -661,7 +714,7 @@ static void tcf_chain_dump(struct tcf_chain *chain, struct sk_buff *skb,
 					  NETLINK_CB(cb->skb).portid,
 					  cb->nlh->nlmsg_seq, NLM_F_MULTI,
 					  RTM_NEWTFILTER) <= 0)
-				break;
+				return false;
 
 			cb->args[1] = 1;
 		}
@@ -676,14 +729,16 @@ static void tcf_chain_dump(struct tcf_chain *chain, struct sk_buff *skb,
 		tp->ops->walk(tp, &arg.w);
 		cb->args[1] = arg.w.count + 1;
 		if (arg.w.stop)
-			break;
+			return false;
 	}
+	return true;
 }
 
 /* called with RTNL */
 static int tc_dump_tfilter(struct sk_buff *skb, struct netlink_callback *cb)
 {
 	struct net *net = sock_net(skb->sk);
+	struct nlattr *tca[TCA_MAX + 1];
 	struct net_device *dev;
 	struct Qdisc *q;
 	struct tcf_block *block;
@@ -693,9 +748,15 @@ static int tc_dump_tfilter(struct sk_buff *skb, struct netlink_callback *cb)
 	const struct Qdisc_class_ops *cops;
 	long index_start;
 	long index;
+	int err;
 
 	if (nlmsg_len(cb->nlh) < sizeof(*tcm))
 		return skb->len;
+
+	err = nlmsg_parse(cb->nlh, sizeof(*tcm), tca, TCA_MAX, NULL, NULL);
+	if (err)
+		return err;
+
 	dev = __dev_get_by_index(net, tcm->tcm_ifindex);
 	if (!dev)
 		return skb->len;
@@ -719,11 +780,18 @@ static int tc_dump_tfilter(struct sk_buff *skb, struct netlink_callback *cb)
 	block = cops->tcf_block(q, cl);
 	if (!block)
 		goto errout;
-	chain = block->chain;
 
 	index_start = cb->args[0];
 	index = 0;
-	tcf_chain_dump(chain, skb, cb, index_start, &index);
+
+	list_for_each_entry(chain, &block->chain_list, list) {
+		if (tca[TCA_CHAIN] &&
+		    nla_get_u32(tca[TCA_CHAIN]) != chain->index)
+			continue;
+		if (!tcf_chain_dump(chain, skb, cb, index_start, &index))
+			break;
+	}
+
 	cb->args[0] = index;
 
 errout:
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [patch net-next 09/10] net: sched: push tp down to action init callback
  2017-04-27 11:12 [patch net-next 00/10] net: sched: introduce multichain support for filters Jiri Pirko
                   ` (7 preceding siblings ...)
  2017-04-27 11:12 ` [patch net-next 08/10] net: sched: introduce multichain support for filters Jiri Pirko
@ 2017-04-27 11:12 ` Jiri Pirko
  2017-04-27 11:12 ` [patch net-next 10/10] net: sched: extend gact to allow jumping to another filter chain Jiri Pirko
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 26+ messages in thread
From: Jiri Pirko @ 2017-04-27 11:12 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, dsa, edumazet, stephen, daniel,
	alexander.h.duyck, mlxsw, simon.horman

From: Jiri Pirko <jiri@mellanox.com>

Tp pointer will be needed by the next patch in order to get the chain.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 include/net/act_api.h      | 18 +++++++++---------
 net/sched/act_api.c        | 20 +++++++++++---------
 net/sched/act_bpf.c        |  6 +++---
 net/sched/act_connmark.c   |  6 +++---
 net/sched/act_csum.c       |  6 +++---
 net/sched/act_gact.c       |  6 +++---
 net/sched/act_ife.c        |  6 +++---
 net/sched/act_ipt.c        | 12 ++++++------
 net/sched/act_mirred.c     |  6 +++---
 net/sched/act_nat.c        |  3 ++-
 net/sched/act_pedit.c      |  6 +++---
 net/sched/act_police.c     |  6 +++---
 net/sched/act_sample.c     |  6 +++---
 net/sched/act_simple.c     |  6 +++---
 net/sched/act_skbedit.c    |  6 +++---
 net/sched/act_skbmod.c     |  6 +++---
 net/sched/act_tunnel_key.c |  6 +++---
 net/sched/act_vlan.c       |  6 +++---
 net/sched/cls_api.c        |  9 +++++----
 19 files changed, 75 insertions(+), 71 deletions(-)

diff --git a/include/net/act_api.h b/include/net/act_api.h
index cfa2ae3..816c5f5 100644
--- a/include/net/act_api.h
+++ b/include/net/act_api.h
@@ -114,9 +114,9 @@ struct tc_action_ops {
 	int     (*dump)(struct sk_buff *, struct tc_action *, int, int);
 	void	(*cleanup)(struct tc_action *, int bind);
 	int     (*lookup)(struct net *, struct tc_action **, u32);
-	int     (*init)(struct net *net, struct nlattr *nla,
-			struct nlattr *est, struct tc_action **act, int ovr,
-			int bind);
+	int     (*init)(struct net *net, struct tcf_proto *tp,
+			struct nlattr *nla, struct nlattr *est,
+			struct tc_action **act, int ovr, int bind);
 	int     (*walk)(struct net *, struct sk_buff *,
 			struct netlink_callback *, int, const struct tc_action_ops *);
 	void	(*stats_update)(struct tc_action *, u64, u32, u64);
@@ -180,12 +180,12 @@ int tcf_unregister_action(struct tc_action_ops *a,
 int tcf_action_destroy(struct list_head *actions, int bind);
 int tcf_action_exec(struct sk_buff *skb, struct tc_action **actions,
 		    int nr_actions, struct tcf_result *res);
-int tcf_action_init(struct net *net, struct nlattr *nla,
-				  struct nlattr *est, char *n, int ovr,
-				  int bind, struct list_head *);
-struct tc_action *tcf_action_init_1(struct net *net, struct nlattr *nla,
-				    struct nlattr *est, char *n, int ovr,
-				    int bind);
+int tcf_action_init(struct net *net, struct tcf_proto *tp, struct nlattr *nla,
+		    struct nlattr *est, char *name, int ovr, int bind,
+		    struct list_head *actions);
+struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp,
+				    struct nlattr *nla, struct nlattr *est,
+				    char *name, int ovr, int bind);
 int tcf_action_dump(struct sk_buff *skb, struct list_head *, int, int);
 int tcf_action_dump_old(struct sk_buff *skb, struct tc_action *a, int, int);
 int tcf_action_dump_1(struct sk_buff *skb, struct tc_action *a, int, int);
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 7f2cd70..5646046 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -570,9 +570,9 @@ static struct tc_cookie *nla_memdup_cookie(struct nlattr **tb)
 	return c;
 }
 
-struct tc_action *tcf_action_init_1(struct net *net, struct nlattr *nla,
-				    struct nlattr *est, char *name, int ovr,
-				    int bind)
+struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp,
+				    struct nlattr *nla, struct nlattr *est,
+				    char *name, int ovr, int bind)
 {
 	struct tc_action *a;
 	struct tc_action_ops *a_o;
@@ -636,9 +636,10 @@ struct tc_action *tcf_action_init_1(struct net *net, struct nlattr *nla,
 
 	/* backward compatibility for policer */
 	if (name == NULL)
-		err = a_o->init(net, tb[TCA_ACT_OPTIONS], est, &a, ovr, bind);
+		err = a_o->init(net, tp, tb[TCA_ACT_OPTIONS], est,
+				&a, ovr, bind);
 	else
-		err = a_o->init(net, nla, est, &a, ovr, bind);
+		err = a_o->init(net, tp, nla, est, &a, ovr, bind);
 	if (err < 0)
 		goto err_mod;
 
@@ -680,8 +681,9 @@ static void cleanup_a(struct list_head *actions, int ovr)
 		a->tcfa_refcnt--;
 }
 
-int tcf_action_init(struct net *net, struct nlattr *nla, struct nlattr *est,
-		    char *name, int ovr, int bind, struct list_head *actions)
+int tcf_action_init(struct net *net, struct tcf_proto *tp, struct nlattr *nla,
+		    struct nlattr *est, char *name, int ovr, int bind,
+		    struct list_head *actions)
 {
 	struct nlattr *tb[TCA_ACT_MAX_PRIO + 1];
 	struct tc_action *act;
@@ -693,7 +695,7 @@ int tcf_action_init(struct net *net, struct nlattr *nla, struct nlattr *est,
 		return err;
 
 	for (i = 1; i <= TCA_ACT_MAX_PRIO && tb[i]; i++) {
-		act = tcf_action_init_1(net, tb[i], est, name, ovr, bind);
+		act = tcf_action_init_1(net, tp, tb[i], est, name, ovr, bind);
 		if (IS_ERR(act)) {
 			err = PTR_ERR(act);
 			goto err;
@@ -1020,7 +1022,7 @@ static int tcf_action_add(struct net *net, struct nlattr *nla,
 	int ret = 0;
 	LIST_HEAD(actions);
 
-	ret = tcf_action_init(net, nla, NULL, NULL, ovr, 0, &actions);
+	ret = tcf_action_init(net, NULL, nla, NULL, NULL, ovr, 0, &actions);
 	if (ret)
 		return ret;
 
diff --git a/net/sched/act_bpf.c b/net/sched/act_bpf.c
index d33947d..64bb273 100644
--- a/net/sched/act_bpf.c
+++ b/net/sched/act_bpf.c
@@ -268,9 +268,9 @@ static void tcf_bpf_prog_fill_cfg(const struct tcf_bpf *prog,
 	cfg->bpf_name = prog->bpf_name;
 }
 
-static int tcf_bpf_init(struct net *net, struct nlattr *nla,
-			struct nlattr *est, struct tc_action **act,
-			int replace, int bind)
+static int tcf_bpf_init(struct net *net, struct tcf_proto *tp,
+			struct nlattr *nla, struct nlattr *est,
+			struct tc_action **act, int replace, int bind)
 {
 	struct tc_action_net *tn = net_generic(net, bpf_net_id);
 	struct nlattr *tb[TCA_ACT_BPF_MAX + 1];
diff --git a/net/sched/act_connmark.c b/net/sched/act_connmark.c
index 2155bc6..2f0f3b5 100644
--- a/net/sched/act_connmark.c
+++ b/net/sched/act_connmark.c
@@ -96,9 +96,9 @@ static const struct nla_policy connmark_policy[TCA_CONNMARK_MAX + 1] = {
 	[TCA_CONNMARK_PARMS] = { .len = sizeof(struct tc_connmark) },
 };
 
-static int tcf_connmark_init(struct net *net, struct nlattr *nla,
-			     struct nlattr *est, struct tc_action **a,
-			     int ovr, int bind)
+static int tcf_connmark_init(struct net *net, struct tcf_proto *tp,
+			     struct nlattr *nla, struct nlattr *est,
+			     struct tc_action **a, int ovr, int bind)
 {
 	struct tc_action_net *tn = net_generic(net, connmark_net_id);
 	struct nlattr *tb[TCA_CONNMARK_MAX + 1];
diff --git a/net/sched/act_csum.c b/net/sched/act_csum.c
index ab6fdbd..ee3f339 100644
--- a/net/sched/act_csum.c
+++ b/net/sched/act_csum.c
@@ -46,9 +46,9 @@ static const struct nla_policy csum_policy[TCA_CSUM_MAX + 1] = {
 static unsigned int csum_net_id;
 static struct tc_action_ops act_csum_ops;
 
-static int tcf_csum_init(struct net *net, struct nlattr *nla,
-			 struct nlattr *est, struct tc_action **a, int ovr,
-			 int bind)
+static int tcf_csum_init(struct net *net, struct tcf_proto *tp,
+			 struct nlattr *nla, struct nlattr *est,
+			 struct tc_action **a, int ovr, int bind)
 {
 	struct tc_action_net *tn = net_generic(net, csum_net_id);
 	struct nlattr *tb[TCA_CSUM_MAX + 1];
diff --git a/net/sched/act_gact.c b/net/sched/act_gact.c
index 99afe8b..c527c11 100644
--- a/net/sched/act_gact.c
+++ b/net/sched/act_gact.c
@@ -56,9 +56,9 @@ static const struct nla_policy gact_policy[TCA_GACT_MAX + 1] = {
 	[TCA_GACT_PROB]		= { .len = sizeof(struct tc_gact_p) },
 };
 
-static int tcf_gact_init(struct net *net, struct nlattr *nla,
-			 struct nlattr *est, struct tc_action **a,
-			 int ovr, int bind)
+static int tcf_gact_init(struct net *net, struct tcf_proto *tp,
+			 struct nlattr *nla, struct nlattr *est,
+			 struct tc_action **a, int ovr, int bind)
 {
 	struct tc_action_net *tn = net_generic(net, gact_net_id);
 	struct nlattr *tb[TCA_GACT_MAX + 1];
diff --git a/net/sched/act_ife.c b/net/sched/act_ife.c
index c5dec30..deea15c 100644
--- a/net/sched/act_ife.c
+++ b/net/sched/act_ife.c
@@ -427,9 +427,9 @@ static int populate_metalist(struct tcf_ife_info *ife, struct nlattr **tb,
 	return rc;
 }
 
-static int tcf_ife_init(struct net *net, struct nlattr *nla,
-			struct nlattr *est, struct tc_action **a,
-			int ovr, int bind)
+static int tcf_ife_init(struct net *net, struct tcf_proto *tp,
+			struct nlattr *nla, struct nlattr *est,
+			struct tc_action **a, int ovr, int bind)
 {
 	struct tc_action_net *tn = net_generic(net, ife_net_id);
 	struct nlattr *tb[TCA_IFE_MAX + 1];
diff --git a/net/sched/act_ipt.c b/net/sched/act_ipt.c
index 36f0ced..011a1c6 100644
--- a/net/sched/act_ipt.c
+++ b/net/sched/act_ipt.c
@@ -189,18 +189,18 @@ static int __tcf_ipt_init(struct tc_action_net *tn, struct nlattr *nla,
 	return err;
 }
 
-static int tcf_ipt_init(struct net *net, struct nlattr *nla,
-			struct nlattr *est, struct tc_action **a, int ovr,
-			int bind)
+static int tcf_ipt_init(struct net *net, struct tcf_proto *tp,
+			struct nlattr *nla, struct nlattr *est,
+			struct tc_action **a, int ovr, int bind)
 {
 	struct tc_action_net *tn = net_generic(net, ipt_net_id);
 
 	return __tcf_ipt_init(tn, nla, est, a, &act_ipt_ops, ovr, bind);
 }
 
-static int tcf_xt_init(struct net *net, struct nlattr *nla,
-		       struct nlattr *est, struct tc_action **a, int ovr,
-		       int bind)
+static int tcf_xt_init(struct net *net, struct tcf_proto *tp,
+		       struct nlattr *nla, struct nlattr *est,
+		       struct tc_action **a, int ovr, int bind)
 {
 	struct tc_action_net *tn = net_generic(net, xt_net_id);
 
diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
index 1b5549a..957711f 100644
--- a/net/sched/act_mirred.c
+++ b/net/sched/act_mirred.c
@@ -72,9 +72,9 @@ static const struct nla_policy mirred_policy[TCA_MIRRED_MAX + 1] = {
 static unsigned int mirred_net_id;
 static struct tc_action_ops act_mirred_ops;
 
-static int tcf_mirred_init(struct net *net, struct nlattr *nla,
-			   struct nlattr *est, struct tc_action **a, int ovr,
-			   int bind)
+static int tcf_mirred_init(struct net *net, struct tcf_proto *tp,
+			   struct nlattr *nla, struct nlattr *est,
+			   struct tc_action **a, int ovr, int bind)
 {
 	struct tc_action_net *tn = net_generic(net, mirred_net_id);
 	struct nlattr *tb[TCA_MIRRED_MAX + 1];
diff --git a/net/sched/act_nat.c b/net/sched/act_nat.c
index 9016ab8..50e2fa6 100644
--- a/net/sched/act_nat.c
+++ b/net/sched/act_nat.c
@@ -38,7 +38,8 @@ static const struct nla_policy nat_policy[TCA_NAT_MAX + 1] = {
 	[TCA_NAT_PARMS]	= { .len = sizeof(struct tc_nat) },
 };
 
-static int tcf_nat_init(struct net *net, struct nlattr *nla, struct nlattr *est,
+static int tcf_nat_init(struct net *net, struct tcf_proto *tp,
+			struct nlattr *nla, struct nlattr *est,
 			struct tc_action **a, int ovr, int bind)
 {
 	struct tc_action_net *tn = net_generic(net, nat_net_id);
diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c
index 164b5ac..3445f96 100644
--- a/net/sched/act_pedit.c
+++ b/net/sched/act_pedit.c
@@ -130,9 +130,9 @@ static int tcf_pedit_key_ex_dump(struct sk_buff *skb,
 	return 0;
 }
 
-static int tcf_pedit_init(struct net *net, struct nlattr *nla,
-			  struct nlattr *est, struct tc_action **a,
-			  int ovr, int bind)
+static int tcf_pedit_init(struct net *net, struct tcf_proto *tp,
+			  struct nlattr *nla, struct nlattr *est,
+			  struct tc_action **a, int ovr, int bind)
 {
 	struct tc_action_net *tn = net_generic(net, pedit_net_id);
 	struct nlattr *tb[TCA_PEDIT_MAX + 1];
diff --git a/net/sched/act_police.c b/net/sched/act_police.c
index f42008b..f5aeba2 100644
--- a/net/sched/act_police.c
+++ b/net/sched/act_police.c
@@ -74,9 +74,9 @@ static const struct nla_policy police_policy[TCA_POLICE_MAX + 1] = {
 	[TCA_POLICE_RESULT]	= { .type = NLA_U32 },
 };
 
-static int tcf_act_police_init(struct net *net, struct nlattr *nla,
-			       struct nlattr *est, struct tc_action **a,
-			       int ovr, int bind)
+static int tcf_act_police_init(struct net *net, struct tcf_proto *tp,
+			       struct nlattr *nla, struct nlattr *est,
+			       struct tc_action **a, int ovr, int bind)
 {
 	int ret = 0, err;
 	struct nlattr *tb[TCA_POLICE_MAX + 1];
diff --git a/net/sched/act_sample.c b/net/sched/act_sample.c
index 59d6645..60080fb 100644
--- a/net/sched/act_sample.c
+++ b/net/sched/act_sample.c
@@ -36,9 +36,9 @@ static const struct nla_policy sample_policy[TCA_SAMPLE_MAX + 1] = {
 	[TCA_SAMPLE_PSAMPLE_GROUP]	= { .type = NLA_U32 },
 };
 
-static int tcf_sample_init(struct net *net, struct nlattr *nla,
-			   struct nlattr *est, struct tc_action **a, int ovr,
-			   int bind)
+static int tcf_sample_init(struct net *net, struct tcf_proto *tp,
+			   struct nlattr *nla, struct nlattr *est,
+			   struct tc_action **a, int ovr, int bind)
 {
 	struct tc_action_net *tn = net_generic(net, sample_net_id);
 	struct nlattr *tb[TCA_SAMPLE_MAX + 1];
diff --git a/net/sched/act_simple.c b/net/sched/act_simple.c
index 43605e7..d4b5da4 100644
--- a/net/sched/act_simple.c
+++ b/net/sched/act_simple.c
@@ -79,9 +79,9 @@ static const struct nla_policy simple_policy[TCA_DEF_MAX + 1] = {
 	[TCA_DEF_DATA]	= { .type = NLA_STRING, .len = SIMP_MAX_DATA },
 };
 
-static int tcf_simp_init(struct net *net, struct nlattr *nla,
-			 struct nlattr *est, struct tc_action **a,
-			 int ovr, int bind)
+static int tcf_simp_init(struct net *net, struct tcf_proto *tp,
+			 struct nlattr *nla, struct nlattr *est,
+			 struct tc_action **a, int ovr, int bind)
 {
 	struct tc_action_net *tn = net_generic(net, simp_net_id);
 	struct nlattr *tb[TCA_DEF_MAX + 1];
diff --git a/net/sched/act_skbedit.c b/net/sched/act_skbedit.c
index 6b3e65d..979d656 100644
--- a/net/sched/act_skbedit.c
+++ b/net/sched/act_skbedit.c
@@ -66,9 +66,9 @@ static const struct nla_policy skbedit_policy[TCA_SKBEDIT_MAX + 1] = {
 	[TCA_SKBEDIT_MASK]		= { .len = sizeof(u32) },
 };
 
-static int tcf_skbedit_init(struct net *net, struct nlattr *nla,
-			    struct nlattr *est, struct tc_action **a,
-			    int ovr, int bind)
+static int tcf_skbedit_init(struct net *net, struct tcf_proto *tp,
+			    struct nlattr *nla, struct nlattr *est,
+			    struct tc_action **a, int ovr, int bind)
 {
 	struct tc_action_net *tn = net_generic(net, skbedit_net_id);
 	struct nlattr *tb[TCA_SKBEDIT_MAX + 1];
diff --git a/net/sched/act_skbmod.c b/net/sched/act_skbmod.c
index a73c4bb..56549b7 100644
--- a/net/sched/act_skbmod.c
+++ b/net/sched/act_skbmod.c
@@ -84,9 +84,9 @@ static const struct nla_policy skbmod_policy[TCA_SKBMOD_MAX + 1] = {
 	[TCA_SKBMOD_ETYPE]		= { .type = NLA_U16 },
 };
 
-static int tcf_skbmod_init(struct net *net, struct nlattr *nla,
-			   struct nlattr *est, struct tc_action **a,
-			   int ovr, int bind)
+static int tcf_skbmod_init(struct net *net, struct tcf_proto *tp,
+			   struct nlattr *nla, struct nlattr *est,
+			   struct tc_action **a, int ovr, int bind)
 {
 	struct tc_action_net *tn = net_generic(net, skbmod_net_id);
 	struct nlattr *tb[TCA_SKBMOD_MAX + 1];
diff --git a/net/sched/act_tunnel_key.c b/net/sched/act_tunnel_key.c
index b9a2f24..9d37909 100644
--- a/net/sched/act_tunnel_key.c
+++ b/net/sched/act_tunnel_key.c
@@ -69,9 +69,9 @@ static const struct nla_policy tunnel_key_policy[TCA_TUNNEL_KEY_MAX + 1] = {
 	[TCA_TUNNEL_KEY_ENC_DST_PORT] = {.type = NLA_U16},
 };
 
-static int tunnel_key_init(struct net *net, struct nlattr *nla,
-			   struct nlattr *est, struct tc_action **a,
-			   int ovr, int bind)
+static int tunnel_key_init(struct net *net, struct tcf_proto *tp,
+			   struct nlattr *nla, struct nlattr *est,
+			   struct tc_action **a, int ovr, int bind)
 {
 	struct tc_action_net *tn = net_generic(net, tunnel_key_net_id);
 	struct nlattr *tb[TCA_TUNNEL_KEY_MAX + 1];
diff --git a/net/sched/act_vlan.c b/net/sched/act_vlan.c
index 13ba3a8..525603f 100644
--- a/net/sched/act_vlan.c
+++ b/net/sched/act_vlan.c
@@ -103,9 +103,9 @@ static const struct nla_policy vlan_policy[TCA_VLAN_MAX + 1] = {
 	[TCA_VLAN_PUSH_VLAN_PRIORITY]	= { .type = NLA_U8 },
 };
 
-static int tcf_vlan_init(struct net *net, struct nlattr *nla,
-			 struct nlattr *est, struct tc_action **a,
-			 int ovr, int bind)
+static int tcf_vlan_init(struct net *net, struct tcf_proto *tp,
+			 struct nlattr *nla, struct nlattr *est,
+			 struct tc_action **a, int ovr, int bind)
 {
 	struct tc_action_net *tn = net_generic(net, vlan_net_id);
 	struct nlattr *tb[TCA_VLAN_MAX + 1];
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index a4fab8f..dbc1348 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -822,8 +822,9 @@ int tcf_exts_validate(struct net *net, struct tcf_proto *tp, struct nlattr **tb,
 		struct tc_action *act;
 
 		if (exts->police && tb[exts->police]) {
-			act = tcf_action_init_1(net, tb[exts->police], rate_tlv,
-						"police", ovr, TCA_ACT_BIND);
+			act = tcf_action_init_1(net, tp, tb[exts->police],
+						rate_tlv, "police", ovr,
+						TCA_ACT_BIND);
 			if (IS_ERR(act))
 				return PTR_ERR(act);
 
@@ -834,8 +835,8 @@ int tcf_exts_validate(struct net *net, struct tcf_proto *tp, struct nlattr **tb,
 			LIST_HEAD(actions);
 			int err, i = 0;
 
-			err = tcf_action_init(net, tb[exts->action], rate_tlv,
-					      NULL, ovr, TCA_ACT_BIND,
+			err = tcf_action_init(net, tp, tb[exts->action],
+					      rate_tlv, NULL, ovr, TCA_ACT_BIND,
 					      &actions);
 			if (err)
 				return err;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [patch net-next 10/10] net: sched: extend gact to allow jumping to another filter chain
  2017-04-27 11:12 [patch net-next 00/10] net: sched: introduce multichain support for filters Jiri Pirko
                   ` (8 preceding siblings ...)
  2017-04-27 11:12 ` [patch net-next 09/10] net: sched: push tp down to action init callback Jiri Pirko
@ 2017-04-27 11:12 ` Jiri Pirko
  2017-04-28  1:41   ` Jamal Hadi Salim
  2017-04-27 11:15 ` [patch iproute2 1/2] tc_filter: add support for chain index Jiri Pirko
                   ` (2 subsequent siblings)
  12 siblings, 1 reply; 26+ messages in thread
From: Jiri Pirko @ 2017-04-27 11:12 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, dsa, edumazet, stephen, daniel,
	alexander.h.duyck, mlxsw, simon.horman

From: Jiri Pirko <jiri@mellanox.com>

Introduce new type of gact action called "goto_chain". This allows
user to specify a chain to be processed. This action type is
then processed as a return value in tcf_classify loop in similar
way as "reclassify" is, only it does not reset to the first filter
in chain but rather reset to the first filter of the desired chain.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 include/net/sch_generic.h           |  9 +++++--
 include/net/tc_act/tc_gact.h        |  2 ++
 include/uapi/linux/pkt_cls.h        |  1 +
 include/uapi/linux/tc_act/tc_gact.h |  1 +
 net/sched/act_gact.c                | 48 ++++++++++++++++++++++++++++++++++++-
 net/sched/cls_api.c                 |  8 +++++--
 6 files changed, 64 insertions(+), 5 deletions(-)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 569b565..3688501 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -193,8 +193,13 @@ struct Qdisc_ops {
 
 
 struct tcf_result {
-	unsigned long	class;
-	u32		classid;
+	union {
+		struct {
+			unsigned long	class;
+			u32		classid;
+		};
+		const struct tcf_proto *goto_tp;
+	};
 };
 
 struct tcf_proto_ops {
diff --git a/include/net/tc_act/tc_gact.h b/include/net/tc_act/tc_gact.h
index b6f1739..58bee54 100644
--- a/include/net/tc_act/tc_gact.h
+++ b/include/net/tc_act/tc_gact.h
@@ -12,6 +12,8 @@ struct tcf_gact {
 	int			tcfg_paction;
 	atomic_t		packets;
 #endif
+	struct tcf_chain	*goto_chain;
+	struct rcu_head		rcu;
 };
 #define to_gact(a) ((struct tcf_gact *)a)
 
diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h
index f1129e3..e03ba27 100644
--- a/include/uapi/linux/pkt_cls.h
+++ b/include/uapi/linux/pkt_cls.h
@@ -37,6 +37,7 @@ enum {
 #define TC_ACT_QUEUED		5
 #define TC_ACT_REPEAT		6
 #define TC_ACT_REDIRECT		7
+#define TC_ACT_GOTO_CHAIN	8
 #define TC_ACT_JUMP		0x10000000
 
 /* Action type identifiers*/
diff --git a/include/uapi/linux/tc_act/tc_gact.h b/include/uapi/linux/tc_act/tc_gact.h
index 70b536a..388733d 100644
--- a/include/uapi/linux/tc_act/tc_gact.h
+++ b/include/uapi/linux/tc_act/tc_gact.h
@@ -26,6 +26,7 @@ enum {
 	TCA_GACT_PARMS,
 	TCA_GACT_PROB,
 	TCA_GACT_PAD,
+	TCA_GACT_CHAIN,
 	__TCA_GACT_MAX
 };
 #define TCA_GACT_MAX (__TCA_GACT_MAX - 1)
diff --git a/net/sched/act_gact.c b/net/sched/act_gact.c
index c527c11..d63aebd 100644
--- a/net/sched/act_gact.c
+++ b/net/sched/act_gact.c
@@ -20,6 +20,7 @@
 #include <linux/init.h>
 #include <net/netlink.h>
 #include <net/pkt_sched.h>
+#include <net/pkt_cls.h>
 #include <linux/tc_act/tc_gact.h>
 #include <net/tc_act/tc_gact.h>
 
@@ -54,6 +55,7 @@ static g_rand gact_rand[MAX_RAND] = { NULL, gact_net_rand, gact_determ };
 static const struct nla_policy gact_policy[TCA_GACT_MAX + 1] = {
 	[TCA_GACT_PARMS]	= { .len = sizeof(struct tc_gact) },
 	[TCA_GACT_PROB]		= { .len = sizeof(struct tc_gact_p) },
+	[TCA_GACT_CHAIN]	= { .type = NLA_U32 },
 };
 
 static int tcf_gact_init(struct net *net, struct tcf_proto *tp,
@@ -92,6 +94,9 @@ static int tcf_gact_init(struct net *net, struct tcf_proto *tp,
 	}
 #endif
 
+	if (parm->action == TC_ACT_GOTO_CHAIN && !tb[TCA_GACT_CHAIN])
+		return -EINVAL;
+
 	if (!tcf_hash_check(tn, parm->index, a, bind)) {
 		ret = tcf_hash_create(tn, parm->index, est, a,
 				      &act_gact_ops, bind, true);
@@ -121,11 +126,43 @@ static int tcf_gact_init(struct net *net, struct tcf_proto *tp,
 		gact->tcfg_ptype   = p_parm->ptype;
 	}
 #endif
+
+	if (gact->tcf_action == TC_ACT_GOTO_CHAIN) {
+		u32 chain_index = nla_get_u32(tb[TCA_GACT_CHAIN]);
+
+		if (!tp) {
+			if (ret == ACT_P_CREATED)
+				tcf_hash_release(*a, bind);
+			return -EINVAL;
+		}
+		gact->goto_chain = tcf_chain_get(tp->chain->block, chain_index);
+		if (!gact->goto_chain) {
+			if (ret == ACT_P_CREATED)
+				tcf_hash_release(*a, bind);
+			return -ENOMEM;
+		}
+	}
+
 	if (ret == ACT_P_CREATED)
 		tcf_hash_insert(tn, *a);
 	return ret;
 }
 
+static void tcf_gact_cleanup_rcu(struct rcu_head *rcu)
+{
+	struct tcf_gact *gact = container_of(rcu, struct tcf_gact, rcu);
+
+	if (gact->tcf_action == TC_ACT_GOTO_CHAIN)
+		tcf_chain_put(gact->goto_chain);
+}
+
+static void tcf_gact_cleanup(struct tc_action *a, int bind)
+{
+	struct tcf_gact *gact = to_gact(a);
+
+	call_rcu(&gact->rcu, tcf_gact_cleanup_rcu);
+}
+
 static int tcf_gact(struct sk_buff *skb, const struct tc_action *a,
 		    struct tcf_result *res)
 {
@@ -141,8 +178,13 @@ static int tcf_gact(struct sk_buff *skb, const struct tc_action *a,
 	}
 #endif
 	bstats_cpu_update(this_cpu_ptr(gact->common.cpu_bstats), skb);
-	if (action == TC_ACT_SHOT)
+	if (action == TC_ACT_SHOT) {
 		qstats_drop_inc(this_cpu_ptr(gact->common.cpu_qstats));
+	} else if (action == TC_ACT_GOTO_CHAIN) {
+		struct tcf_chain *chain = gact->goto_chain;
+
+		res->goto_tp = rcu_dereference_bh(chain->filter_chain);
+	}
 
 	tcf_lastuse_update(&gact->tcf_tm);
 
@@ -194,6 +236,9 @@ static int tcf_gact_dump(struct sk_buff *skb, struct tc_action *a,
 	tcf_tm_dump(&t, &gact->tcf_tm);
 	if (nla_put_64bit(skb, TCA_GACT_TM, sizeof(t), &t, TCA_GACT_PAD))
 		goto nla_put_failure;
+	if (gact->tcf_action == TC_ACT_GOTO_CHAIN &&
+	    nla_put_u32(skb, TCA_GACT_CHAIN, gact->goto_chain->index))
+		goto nla_put_failure;
 	return skb->len;
 
 nla_put_failure:
@@ -225,6 +270,7 @@ static struct tc_action_ops act_gact_ops = {
 	.stats_update	=	tcf_gact_stats_update,
 	.dump		=	tcf_gact_dump,
 	.init		=	tcf_gact_init,
+	.cleanup	=	tcf_gact_cleanup,
 	.walk		=	tcf_gact_walker,
 	.lookup		=	tcf_gact_search,
 	.size		=	sizeof(struct tcf_gact),
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index dbc1348..a2d6bc7 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -304,10 +304,14 @@ int tcf_classify(struct sk_buff *skb, const struct tcf_proto *tp,
 			continue;
 
 		err = tp->classify(skb, tp, res);
-		if (unlikely(err == TC_ACT_RECLASSIFY && !compat_mode))
+		if (err == TC_ACT_RECLASSIFY && !compat_mode) {
 			goto reset;
-		if (err >= 0)
+		} else if (err == TC_ACT_GOTO_CHAIN) {
+			old_tp = res->goto_tp;
+			goto reset;
+		} else if (err >= 0) {
 			return err;
+		}
 	}
 
 	return TC_ACT_UNSPEC; /* signal: continue lookup */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [patch iproute2 1/2] tc_filter: add support for chain index
  2017-04-27 11:12 [patch net-next 00/10] net: sched: introduce multichain support for filters Jiri Pirko
                   ` (9 preceding siblings ...)
  2017-04-27 11:12 ` [patch net-next 10/10] net: sched: extend gact to allow jumping to another filter chain Jiri Pirko
@ 2017-04-27 11:15 ` Jiri Pirko
  2017-04-27 11:15 ` [patch iproute2 2/2] tc_gact: introduce support for goto chain action Jiri Pirko
  2017-04-27 17:46 ` [patch net-next 00/10] net: sched: introduce multichain support for filters Cong Wang
  12 siblings, 0 replies; 26+ messages in thread
From: Jiri Pirko @ 2017-04-27 11:15 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, dsa, edumazet, stephen, daniel,
	alexander.h.duyck, mlxsw, simon.horman

From: Jiri Pirko <jiri@mellanox.com>

Allow user to put filter to a specific chain identified by index.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 include/linux/rtnetlink.h |  1 +
 tc/tc_filter.c            | 87 +++++++++++++++++++++++++++++++++++++++--------
 2 files changed, 73 insertions(+), 15 deletions(-)

diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
index a96db83..70c5750 100644
--- a/include/linux/rtnetlink.h
+++ b/include/linux/rtnetlink.h
@@ -549,6 +549,7 @@ enum {
 	TCA_STAB,
 	TCA_PAD,
 	TCA_DUMP_INVISIBLE,
+	TCA_CHAIN,
 	__TCA_MAX
 };
 
diff --git a/tc/tc_filter.c b/tc/tc_filter.c
index ff8713b..b13fb91 100644
--- a/tc/tc_filter.c
+++ b/tc/tc_filter.c
@@ -31,7 +31,7 @@ static void usage(void)
 	fprintf(stderr,
 		"Usage: tc filter [ add | del | change | replace | show ] dev STRING\n"
 		"Usage: tc filter get dev STRING parent CLASSID protocol PROTO handle FILTERID pref PRIO FILTER_TYPE\n"
-		"       [ pref PRIO ] protocol PROTO\n"
+		"       [ pref PRIO ] protocol PROTO [ chain CHAIN_INDEX ]\n"
 		"       [ estimator INTERVAL TIME_CONSTANT ]\n"
 		"       [ root | ingress | egress | parent CLASSID ]\n"
 		"       [ handle FILTERID ] [ [ FILTER_TYPE ] [ help | OPTIONS ] ]\n"
@@ -59,6 +59,8 @@ static int tc_filter_modify(int cmd, unsigned int flags, int argc, char **argv)
 	__u32 prio = 0;
 	__u32 protocol = 0;
 	int protocol_set = 0;
+	__u32 chain_index;
+	int chain_index_set = 0;
 	char *fhandle = NULL;
 	char  d[16] = {};
 	char  k[16] = {};
@@ -127,6 +129,13 @@ static int tc_filter_modify(int cmd, unsigned int flags, int argc, char **argv)
 				invarg("invalid protocol", *argv);
 			protocol = id;
 			protocol_set = 1;
+		} else if (matches(*argv, "chain") == 0) {
+			NEXT_ARG();
+			if (chain_index_set)
+				duparg("chain", *argv);
+			if (get_u32(&chain_index, *argv, 0))
+				invarg("invalid chain index value", *argv);
+			chain_index_set = 1;
 		} else if (matches(*argv, "estimator") == 0) {
 			if (parse_estimator(&argc, &argv, &est) < 0)
 				return -1;
@@ -146,6 +155,9 @@ static int tc_filter_modify(int cmd, unsigned int flags, int argc, char **argv)
 
 	req.t.tcm_info = TC_H_MAKE(prio<<16, protocol);
 
+	if (chain_index_set)
+		addattr32(&req.n, sizeof(req), TCA_CHAIN, chain_index);
+
 	if (k[0])
 		addattr_l(&req.n, sizeof(req), TCA_KIND, k, strlen(k)+1);
 
@@ -167,6 +179,7 @@ static int tc_filter_modify(int cmd, unsigned int flags, int argc, char **argv)
 			return -1;
 		}
 	}
+
 	if (est.ewma_log)
 		addattr_l(&req.n, sizeof(req), TCA_RATE, &est, sizeof(est));
 
@@ -193,6 +206,8 @@ static __u32 filter_parent;
 static int filter_ifindex;
 static __u32 filter_prio;
 static __u32 filter_protocol;
+static __u32 filter_chain_index;
+static int filter_chain_index_set;
 __u16 f_proto;
 
 int print_filter(const struct sockaddr_nl *who, struct nlmsghdr *n, void *arg)
@@ -270,6 +285,15 @@ int print_filter(const struct sockaddr_nl *who, struct nlmsghdr *n, void *arg)
 		}
 	}
 	fprintf(fp, "%s ", rta_getattr_str(tb[TCA_KIND]));
+
+	if (tb[TCA_CHAIN]) {
+		__u32 chain_index = rta_getattr_u32(tb[TCA_CHAIN]);
+
+		if (!filter_chain_index_set ||
+		    filter_chain_index != chain_index)
+			fprintf(fp, "chain %u ", chain_index);
+	}
+
 	q = get_filter_kind(RTA_DATA(tb[TCA_KIND]));
 	if (tb[TCA_OPTIONS]) {
 		if (q)
@@ -311,6 +335,8 @@ static int tc_filter_get(int cmd, unsigned int flags, int argc, char **argv)
 	__u32 prio = 0;
 	__u32 protocol = 0;
 	int protocol_set = 0;
+	__u32 chain_index;
+	int chain_index_set = 0;
 	__u32 parent_handle = 0;
 	char *fhandle = NULL;
 	char  d[16] = {};
@@ -375,6 +401,13 @@ static int tc_filter_get(int cmd, unsigned int flags, int argc, char **argv)
 				invarg("invalid protocol", *argv);
 			protocol = id;
 			protocol_set = 1;
+		} else if (matches(*argv, "chain") == 0) {
+			NEXT_ARG();
+			if (chain_index_set)
+				duparg("chain", *argv);
+			if (get_u32(&chain_index, *argv, 0))
+				invarg("invalid chain index value", *argv);
+			chain_index_set = 1;
 		} else if (matches(*argv, "help") == 0) {
 			usage();
 			return 0;
@@ -401,6 +434,9 @@ static int tc_filter_get(int cmd, unsigned int flags, int argc, char **argv)
 
 	req.t.tcm_info = TC_H_MAKE(prio<<16, protocol);
 
+	if (chain_index_set)
+		addattr32(&req.n, sizeof(req), TCA_CHAIN, chain_index);
+
 	if (req.t.tcm_parent == TC_H_UNSPEC) {
 		fprintf(stderr, "Must specify filter parent\n");
 		return -1;
@@ -457,10 +493,20 @@ static int tc_filter_get(int cmd, unsigned int flags, int argc, char **argv)
 
 static int tc_filter_list(int argc, char **argv)
 {
-	struct tcmsg t = { .tcm_family = AF_UNSPEC };
+	struct {
+		struct nlmsghdr n;
+		struct tcmsg t;
+		char buf[MAX_MSG];
+	} req = {
+		.n.nlmsg_len = NLMSG_LENGTH(sizeof(struct tcmsg)),
+		.n.nlmsg_type = RTM_GETTFILTER,
+		.t.tcm_parent = TC_H_UNSPEC,
+		.t.tcm_family = AF_UNSPEC,
+	};
 	char d[16] = {};
 	__u32 prio = 0;
 	__u32 protocol = 0;
+	__u32 chain_index;
 	char *fhandle = NULL;
 
 	while (argc > 0) {
@@ -470,39 +516,39 @@ static int tc_filter_list(int argc, char **argv)
 				duparg("dev", *argv);
 			strncpy(d, *argv, sizeof(d)-1);
 		} else if (strcmp(*argv, "root") == 0) {
-			if (t.tcm_parent) {
+			if (req.t.tcm_parent) {
 				fprintf(stderr,
 					"Error: \"root\" is duplicate parent ID\n");
 				return -1;
 			}
-			filter_parent = t.tcm_parent = TC_H_ROOT;
+			filter_parent = req.t.tcm_parent = TC_H_ROOT;
 		} else if (strcmp(*argv, "ingress") == 0) {
-			if (t.tcm_parent) {
+			if (req.t.tcm_parent) {
 				fprintf(stderr,
 					"Error: \"ingress\" is duplicate parent ID\n");
 				return -1;
 			}
 			filter_parent = TC_H_MAKE(TC_H_CLSACT,
 						  TC_H_MIN_INGRESS);
-			t.tcm_parent = filter_parent;
+			req.t.tcm_parent = filter_parent;
 		} else if (strcmp(*argv, "egress") == 0) {
-			if (t.tcm_parent) {
+			if (req.t.tcm_parent) {
 				fprintf(stderr,
 					"Error: \"egress\" is duplicate parent ID\n");
 				return -1;
 			}
 			filter_parent = TC_H_MAKE(TC_H_CLSACT,
 						  TC_H_MIN_EGRESS);
-			t.tcm_parent = filter_parent;
+			req.t.tcm_parent = filter_parent;
 		} else if (strcmp(*argv, "parent") == 0) {
 			__u32 handle;
 
 			NEXT_ARG();
-			if (t.tcm_parent)
+			if (req.t.tcm_parent)
 				duparg("parent", *argv);
 			if (get_tc_classid(&handle, *argv))
 				invarg("invalid parent ID", *argv);
-			filter_parent = t.tcm_parent = handle;
+			filter_parent = req.t.tcm_parent = handle;
 		} else if (strcmp(*argv, "handle") == 0) {
 			NEXT_ARG();
 			if (fhandle)
@@ -526,6 +572,14 @@ static int tc_filter_list(int argc, char **argv)
 				invarg("invalid protocol", *argv);
 			protocol = res;
 			filter_protocol = protocol;
+		} else if (matches(*argv, "chain") == 0) {
+			NEXT_ARG();
+			if (filter_chain_index_set)
+				duparg("chain", *argv);
+			if (get_u32(&chain_index, *argv, 0))
+				invarg("invalid chain index value", *argv);
+			filter_chain_index_set = 1;
+			filter_chain_index = chain_index;
 		} else if (matches(*argv, "help") == 0) {
 			usage();
 		} else {
@@ -538,20 +592,23 @@ static int tc_filter_list(int argc, char **argv)
 		argc--; argv++;
 	}
 
-	t.tcm_info = TC_H_MAKE(prio<<16, protocol);
+	req.t.tcm_info = TC_H_MAKE(prio<<16, protocol);
 
 	ll_init_map(&rth);
 
 	if (d[0]) {
-		t.tcm_ifindex = ll_name_to_index(d);
-		if (t.tcm_ifindex == 0) {
+		req.t.tcm_ifindex = ll_name_to_index(d);
+		if (req.t.tcm_ifindex == 0) {
 			fprintf(stderr, "Cannot find device \"%s\"\n", d);
 			return 1;
 		}
-		filter_ifindex = t.tcm_ifindex;
+		filter_ifindex = req.t.tcm_ifindex;
 	}
 
-	if (rtnl_dump_request(&rth, RTM_GETTFILTER, &t, sizeof(t)) < 0) {
+	if (filter_chain_index_set)
+		addattr32(&req.n, sizeof(req), TCA_CHAIN, chain_index);
+
+	if (rtnl_dump_request_n(&rth, &req.n) < 0) {
 		perror("Cannot send dump request");
 		return 1;
 	}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [patch iproute2 2/2] tc_gact: introduce support for goto chain action
  2017-04-27 11:12 [patch net-next 00/10] net: sched: introduce multichain support for filters Jiri Pirko
                   ` (10 preceding siblings ...)
  2017-04-27 11:15 ` [patch iproute2 1/2] tc_filter: add support for chain index Jiri Pirko
@ 2017-04-27 11:15 ` Jiri Pirko
  2017-04-27 17:46 ` [patch net-next 00/10] net: sched: introduce multichain support for filters Cong Wang
  12 siblings, 0 replies; 26+ messages in thread
From: Jiri Pirko @ 2017-04-27 11:15 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, dsa, edumazet, stephen, daniel,
	alexander.h.duyck, mlxsw, simon.horman

From: Jiri Pirko <jiri@mellanox.com>

Allow user to set action "goto" with filter chain index as a parameter.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 include/linux/pkt_cls.h        |  1 +
 include/linux/tc_act/tc_gact.h |  1 +
 tc/m_gact.c                    | 32 ++++++++++++++++++++++++++++----
 tc/tc_util.c                   |  3 +++
 4 files changed, 33 insertions(+), 4 deletions(-)

diff --git a/include/linux/pkt_cls.h b/include/linux/pkt_cls.h
index 7a69f2a..4f347b0 100644
--- a/include/linux/pkt_cls.h
+++ b/include/linux/pkt_cls.h
@@ -37,6 +37,7 @@ enum {
 #define TC_ACT_QUEUED		5
 #define TC_ACT_REPEAT		6
 #define TC_ACT_REDIRECT		7
+#define TC_ACT_GOTO_CHAIN	8
 #define TC_ACT_JUMP		0x10000000
 
 /* Action type identifiers*/
diff --git a/include/linux/tc_act/tc_gact.h b/include/linux/tc_act/tc_gact.h
index 70b536a..388733d 100644
--- a/include/linux/tc_act/tc_gact.h
+++ b/include/linux/tc_act/tc_gact.h
@@ -26,6 +26,7 @@ enum {
 	TCA_GACT_PARMS,
 	TCA_GACT_PROB,
 	TCA_GACT_PAD,
+	TCA_GACT_CHAIN,
 	__TCA_GACT_MAX
 };
 #define TCA_GACT_MAX (__TCA_GACT_MAX - 1)
diff --git a/tc/m_gact.c b/tc/m_gact.c
index 755a3be..f8d1c1e 100644
--- a/tc/m_gact.c
+++ b/tc/m_gact.c
@@ -43,19 +43,21 @@ static void
 explain(void)
 {
 #ifdef CONFIG_GACT_PROB
-	fprintf(stderr, "Usage: ... gact <ACTION> [RAND] [INDEX]\n");
+	fprintf(stderr, "Usage: ... gact <ACTION> [RAND] [INDEX] [OPTIONS]\n");
 	fprintf(stderr,
-		"Where: \tACTION := reclassify | drop | continue | pass | pipe\n"
+		"Where: \tACTION := reclassify | drop | continue | pass | pipe | goto\n"
 			"\tRAND := random <RANDTYPE> <ACTION> <VAL>\n"
 			"\tRANDTYPE := netrand | determ\n"
 			"\tVAL : = value not exceeding 10000\n"
 			"\tINDEX := index value used\n"
+			"\tOPTIONS: := chain <CHAIN_INDEX>\n"
 			"\n");
 #else
-	fprintf(stderr, "Usage: ... gact <ACTION> [INDEX]\n");
+	fprintf(stderr, "Usage: ... gact <ACTION> [INDEX] [OPTIONS]\n");
 	fprintf(stderr,
-		"Where: \tACTION := reclassify | drop | continue | pass | pipe\n"
+		"Where: \tACTION := reclassify | drop | continue | pass | pipe | goto\n"
 		"\tINDEX := index value used\n"
+		"\tOPTIONS: := chain <CHAIN_INDEX>\n"
 		"\n");
 #endif
 }
@@ -93,6 +95,7 @@ parse_gact(struct action_util *a, int *argc_p, char ***argv_p,
 	int rd = 0;
 	struct tc_gact_p pp;
 #endif
+	__u32 chain_index;
 	struct rtattr *tail;
 
 	if (argc < 0)
@@ -173,6 +176,23 @@ parse_gact(struct action_util *a, int *argc_p, char ***argv_p,
 		}
 	}
 
+	if (ok && action == TC_ACT_GOTO_CHAIN) {
+		if (matches(*argv, "chain") == 0) {
+			NEXT_ARG();
+			if (get_u32(&chain_index, *argv, 10)) {
+				fprintf(stderr, "Illegal \"chain index\"\n");
+				return -1;
+			}
+			argc--;
+			argv++;
+		} else if (matches(*argv, "help") == 0) {
+			usage();
+		} else {
+			fprintf(stderr, "\"chain\" needs to be specified for \"goto\" action\n");
+			return -1;
+		}
+	}
+
 	if (!ok)
 		return -1;
 
@@ -184,6 +204,8 @@ parse_gact(struct action_util *a, int *argc_p, char ***argv_p,
 		addattr_l(n, MAX_MSG, TCA_GACT_PROB, &pp, sizeof(pp));
 	}
 #endif
+	if (action == TC_ACT_GOTO_CHAIN)
+		addattr32(n, MAX_MSG, TCA_GACT_CHAIN, chain_index);
 	tail->rta_len = (void *) NLMSG_TAIL(n) - (void *) tail;
 
 	*argc_p = argc;
@@ -213,6 +235,8 @@ print_gact(struct action_util *au, FILE * f, struct rtattr *arg)
 	p = RTA_DATA(tb[TCA_GACT_PARMS]);
 
 	fprintf(f, "gact action %s", action_n2a(p->action));
+	if (tb[TCA_GACT_CHAIN])
+		fprintf(f, " chain %u", rta_getattr_u32(tb[TCA_GACT_CHAIN]));
 #ifdef CONFIG_GACT_PROB
 	if (tb[TCA_GACT_PROB] != NULL) {
 		pp = RTA_DATA(tb[TCA_GACT_PROB]);
diff --git a/tc/tc_util.c b/tc/tc_util.c
index 24ca1f1..a75367f 100644
--- a/tc/tc_util.c
+++ b/tc/tc_util.c
@@ -428,6 +428,8 @@ const char *action_n2a(int action)
 		return "pipe";
 	case TC_ACT_STOLEN:
 		return "stolen";
+	case TC_ACT_GOTO_CHAIN:
+		return "goto";
 	default:
 		snprintf(buf, 64, "%d", action);
 		buf[63] = '\0';
@@ -459,6 +461,7 @@ int action_a2n(char *arg, int *result, bool allow_num)
 		{"ok", TC_ACT_OK},
 		{"reclassify", TC_ACT_RECLASSIFY},
 		{"pipe", TC_ACT_PIPE},
+		{"goto", TC_ACT_GOTO_CHAIN},
 		{ NULL },
 	}, *iter;
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [patch net-next 00/10] net: sched: introduce multichain support for filters
  2017-04-27 11:12 [patch net-next 00/10] net: sched: introduce multichain support for filters Jiri Pirko
                   ` (11 preceding siblings ...)
  2017-04-27 11:15 ` [patch iproute2 2/2] tc_gact: introduce support for goto chain action Jiri Pirko
@ 2017-04-27 17:46 ` Cong Wang
  2017-04-28  6:53   ` Jiri Pirko
  12 siblings, 1 reply; 26+ messages in thread
From: Cong Wang @ 2017-04-27 17:46 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: Linux Kernel Network Developers, David Miller, Jamal Hadi Salim,
	David Ahern, Eric Dumazet, Stephen Hemminger, Daniel Borkmann,
	Alexander Duyck, mlxsw, Simon Horman

On Thu, Apr 27, 2017 at 4:12 AM, Jiri Pirko <jiri@resnulli.us> wrote:
> Simple example:
> $ tc qdisc add dev eth0 ingress
> $ tc filter add dev eth0 parent ffff: protocol ip pref 33 flower dst_mac 52:54:00:3d:c7:6d action goto chain 11
> $ tc filter add dev eth0 parent ffff: protocol ip pref 22 chain 11 flower dst_ip 192.168.40.1 action drop
> $ tc filter show dev eth0 root

Interesting.

I don't look into the code yet. If I understand the concepts correctly,
so with your patchset we can mark either filter with a chain No. to
choose which chain it belongs to _logically_ even though
_physically_ it is still in the old-fashion chain (prio, proto)?

If so, you have to ensure proto is same since the protocol of
the packet does not change dynamically? And the original
priority becomes pointless with chains since we can just to
any other chain in any order?

By default, without any chain No., you use 0 for all the chains,
so the old-fashion chain still works.

Thanks.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [patch net-next 10/10] net: sched: extend gact to allow jumping to another filter chain
  2017-04-27 11:12 ` [patch net-next 10/10] net: sched: extend gact to allow jumping to another filter chain Jiri Pirko
@ 2017-04-28  1:41   ` Jamal Hadi Salim
  2017-04-28  6:52     ` Jiri Pirko
  0 siblings, 1 reply; 26+ messages in thread
From: Jamal Hadi Salim @ 2017-04-28  1:41 UTC (permalink / raw)
  To: Jiri Pirko, netdev
  Cc: davem, xiyou.wangcong, dsa, edumazet, stephen, daniel,
	alexander.h.duyck, mlxsw, simon.horman


Jiri,

Good stuff!
Thanks for the effort.

I didnt review the details - will do. I wanted to raise one issue.
This should work for all actions, not just gact (refer to the
recent commit i made on the action jumping).

Example policy for policer:

#if packets destined for mac address 52:54:00:3d:c7:6d
#exceed 90kbps with burst of 90K then jump to chain 11
#for further classification, otherwise set their skb mark to 11
# and proceed.

tc filter add dev eth0 parent ffff: protocol ip pref 33 \
flower dst_mac 52:54:00:3d:c7:6d \
action police rate 1kbit burst 90k conform-exceed pipe/goto chain 11 \
action skbedit mark 11

But i should also be able to do this for any other action, etc.

For this to work, you have to be able to encode the action in the
opcode. Something like (for 2^16 chains):

#define TC_ACT_GOTO_CHAIN	0x20000000
#define TCA_ACT_MAX_CHAIN_MASK 0xFFFF

So 0x20000001 is encoding of chain 1 etc.

I will post the iproute2 code i used for jumping of actions.

cheers,
jamal

On 17-04-27 07:12 AM, Jiri Pirko wrote:
> From: Jiri Pirko <jiri@mellanox.com>
>
> Introduce new type of gact action called "goto_chain". This allows
> user to specify a chain to be processed. This action type is
> then processed as a return value in tcf_classify loop in similar
> way as "reclassify" is, only it does not reset to the first filter
> in chain but rather reset to the first filter of the desired chain.
>
> Signed-off-by: Jiri Pirko <jiri@mellanox.com>
> ---
>  include/net/sch_generic.h           |  9 +++++--
>  include/net/tc_act/tc_gact.h        |  2 ++
>  include/uapi/linux/pkt_cls.h        |  1 +
>  include/uapi/linux/tc_act/tc_gact.h |  1 +
>  net/sched/act_gact.c                | 48 ++++++++++++++++++++++++++++++++++++-
>  net/sched/cls_api.c                 |  8 +++++--
>  6 files changed, 64 insertions(+), 5 deletions(-)
>
> diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
> index 569b565..3688501 100644
> --- a/include/net/sch_generic.h
> +++ b/include/net/sch_generic.h
> @@ -193,8 +193,13 @@ struct Qdisc_ops {
>
>
>  struct tcf_result {
> -	unsigned long	class;
> -	u32		classid;
> +	union {
> +		struct {
> +			unsigned long	class;
> +			u32		classid;
> +		};
> +		const struct tcf_proto *goto_tp;
> +	};
>  };
>
>  struct tcf_proto_ops {
> diff --git a/include/net/tc_act/tc_gact.h b/include/net/tc_act/tc_gact.h
> index b6f1739..58bee54 100644
> --- a/include/net/tc_act/tc_gact.h
> +++ b/include/net/tc_act/tc_gact.h
> @@ -12,6 +12,8 @@ struct tcf_gact {
>  	int			tcfg_paction;
>  	atomic_t		packets;
>  #endif
> +	struct tcf_chain	*goto_chain;
> +	struct rcu_head		rcu;
>  };
>  #define to_gact(a) ((struct tcf_gact *)a)
>
> diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h
> index f1129e3..e03ba27 100644
> --- a/include/uapi/linux/pkt_cls.h
> +++ b/include/uapi/linux/pkt_cls.h
> @@ -37,6 +37,7 @@ enum {
>  #define TC_ACT_QUEUED		5
>  #define TC_ACT_REPEAT		6
>  #define TC_ACT_REDIRECT		7
> +#define TC_ACT_GOTO_CHAIN	8
>  #define TC_ACT_JUMP		0x10000000
>
>  /* Action type identifiers*/
> diff --git a/include/uapi/linux/tc_act/tc_gact.h b/include/uapi/linux/tc_act/tc_gact.h
> index 70b536a..388733d 100644
> --- a/include/uapi/linux/tc_act/tc_gact.h
> +++ b/include/uapi/linux/tc_act/tc_gact.h
> @@ -26,6 +26,7 @@ enum {
>  	TCA_GACT_PARMS,
>  	TCA_GACT_PROB,
>  	TCA_GACT_PAD,
> +	TCA_GACT_CHAIN,
>  	__TCA_GACT_MAX
>  };
>  #define TCA_GACT_MAX (__TCA_GACT_MAX - 1)
> diff --git a/net/sched/act_gact.c b/net/sched/act_gact.c
> index c527c11..d63aebd 100644
> --- a/net/sched/act_gact.c
> +++ b/net/sched/act_gact.c
> @@ -20,6 +20,7 @@
>  #include <linux/init.h>
>  #include <net/netlink.h>
>  #include <net/pkt_sched.h>
> +#include <net/pkt_cls.h>
>  #include <linux/tc_act/tc_gact.h>
>  #include <net/tc_act/tc_gact.h>
>
> @@ -54,6 +55,7 @@ static g_rand gact_rand[MAX_RAND] = { NULL, gact_net_rand, gact_determ };
>  static const struct nla_policy gact_policy[TCA_GACT_MAX + 1] = {
>  	[TCA_GACT_PARMS]	= { .len = sizeof(struct tc_gact) },
>  	[TCA_GACT_PROB]		= { .len = sizeof(struct tc_gact_p) },
> +	[TCA_GACT_CHAIN]	= { .type = NLA_U32 },
>  };
>
>  static int tcf_gact_init(struct net *net, struct tcf_proto *tp,
> @@ -92,6 +94,9 @@ static int tcf_gact_init(struct net *net, struct tcf_proto *tp,
>  	}
>  #endif
>
> +	if (parm->action == TC_ACT_GOTO_CHAIN && !tb[TCA_GACT_CHAIN])
> +		return -EINVAL;
> +
>  	if (!tcf_hash_check(tn, parm->index, a, bind)) {
>  		ret = tcf_hash_create(tn, parm->index, est, a,
>  				      &act_gact_ops, bind, true);
> @@ -121,11 +126,43 @@ static int tcf_gact_init(struct net *net, struct tcf_proto *tp,
>  		gact->tcfg_ptype   = p_parm->ptype;
>  	}
>  #endif
> +
> +	if (gact->tcf_action == TC_ACT_GOTO_CHAIN) {
> +		u32 chain_index = nla_get_u32(tb[TCA_GACT_CHAIN]);
> +
> +		if (!tp) {
> +			if (ret == ACT_P_CREATED)
> +				tcf_hash_release(*a, bind);
> +			return -EINVAL;
> +		}
> +		gact->goto_chain = tcf_chain_get(tp->chain->block, chain_index);
> +		if (!gact->goto_chain) {
> +			if (ret == ACT_P_CREATED)
> +				tcf_hash_release(*a, bind);
> +			return -ENOMEM;
> +		}
> +	}
> +
>  	if (ret == ACT_P_CREATED)
>  		tcf_hash_insert(tn, *a);
>  	return ret;
>  }
>
> +static void tcf_gact_cleanup_rcu(struct rcu_head *rcu)
> +{
> +	struct tcf_gact *gact = container_of(rcu, struct tcf_gact, rcu);
> +
> +	if (gact->tcf_action == TC_ACT_GOTO_CHAIN)
> +		tcf_chain_put(gact->goto_chain);
> +}
> +
> +static void tcf_gact_cleanup(struct tc_action *a, int bind)
> +{
> +	struct tcf_gact *gact = to_gact(a);
> +
> +	call_rcu(&gact->rcu, tcf_gact_cleanup_rcu);
> +}
> +
>  static int tcf_gact(struct sk_buff *skb, const struct tc_action *a,
>  		    struct tcf_result *res)
>  {
> @@ -141,8 +178,13 @@ static int tcf_gact(struct sk_buff *skb, const struct tc_action *a,
>  	}
>  #endif
>  	bstats_cpu_update(this_cpu_ptr(gact->common.cpu_bstats), skb);
> -	if (action == TC_ACT_SHOT)
> +	if (action == TC_ACT_SHOT) {
>  		qstats_drop_inc(this_cpu_ptr(gact->common.cpu_qstats));
> +	} else if (action == TC_ACT_GOTO_CHAIN) {
> +		struct tcf_chain *chain = gact->goto_chain;
> +
> +		res->goto_tp = rcu_dereference_bh(chain->filter_chain);
> +	}
>
>  	tcf_lastuse_update(&gact->tcf_tm);
>
> @@ -194,6 +236,9 @@ static int tcf_gact_dump(struct sk_buff *skb, struct tc_action *a,
>  	tcf_tm_dump(&t, &gact->tcf_tm);
>  	if (nla_put_64bit(skb, TCA_GACT_TM, sizeof(t), &t, TCA_GACT_PAD))
>  		goto nla_put_failure;
> +	if (gact->tcf_action == TC_ACT_GOTO_CHAIN &&
> +	    nla_put_u32(skb, TCA_GACT_CHAIN, gact->goto_chain->index))
> +		goto nla_put_failure;
>  	return skb->len;
>
>  nla_put_failure:
> @@ -225,6 +270,7 @@ static struct tc_action_ops act_gact_ops = {
>  	.stats_update	=	tcf_gact_stats_update,
>  	.dump		=	tcf_gact_dump,
>  	.init		=	tcf_gact_init,
> +	.cleanup	=	tcf_gact_cleanup,
>  	.walk		=	tcf_gact_walker,
>  	.lookup		=	tcf_gact_search,
>  	.size		=	sizeof(struct tcf_gact),
> diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
> index dbc1348..a2d6bc7 100644
> --- a/net/sched/cls_api.c
> +++ b/net/sched/cls_api.c
> @@ -304,10 +304,14 @@ int tcf_classify(struct sk_buff *skb, const struct tcf_proto *tp,
>  			continue;
>
>  		err = tp->classify(skb, tp, res);
> -		if (unlikely(err == TC_ACT_RECLASSIFY && !compat_mode))
> +		if (err == TC_ACT_RECLASSIFY && !compat_mode) {
>  			goto reset;
> -		if (err >= 0)
> +		} else if (err == TC_ACT_GOTO_CHAIN) {
> +			old_tp = res->goto_tp;
> +			goto reset;
> +		} else if (err >= 0) {
>  			return err;
> +		}
>  	}
>
>  	return TC_ACT_UNSPEC; /* signal: continue lookup */
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [patch net-next 10/10] net: sched: extend gact to allow jumping to another filter chain
  2017-04-28  1:41   ` Jamal Hadi Salim
@ 2017-04-28  6:52     ` Jiri Pirko
  2017-04-28 12:23       ` Jamal Hadi Salim
  0 siblings, 1 reply; 26+ messages in thread
From: Jiri Pirko @ 2017-04-28  6:52 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: netdev, davem, xiyou.wangcong, dsa, edumazet, stephen, daniel,
	alexander.h.duyck, mlxsw, simon.horman

Fri, Apr 28, 2017 at 03:41:08AM CEST, jhs@mojatatu.com wrote:
>
>Jiri,
>
>Good stuff!
>Thanks for the effort.
>
>I didnt review the details - will do. I wanted to raise one issue.
>This should work for all actions, not just gact (refer to the
>recent commit i made on the action jumping).
>
>Example policy for policer:
>
>#if packets destined for mac address 52:54:00:3d:c7:6d
>#exceed 90kbps with burst of 90K then jump to chain 11
>#for further classification, otherwise set their skb mark to 11
># and proceed.
>
>tc filter add dev eth0 parent ffff: protocol ip pref 33 \
>flower dst_mac 52:54:00:3d:c7:6d \
>action police rate 1kbit burst 90k conform-exceed pipe/goto chain 11 \
>action skbedit mark 11
>
>But i should also be able to do this for any other action, etc.
>
>For this to work, you have to be able to encode the action in the
>opcode. Something like (for 2^16 chains):
>
>#define TC_ACT_GOTO_CHAIN	0x20000000
>#define TCA_ACT_MAX_CHAIN_MASK 0xFFFF
>
>So 0x20000001 is encoding of chain 1 etc.
>
>I will post the iproute2 code i used for jumping of actions.

You can have multiple actions in list and gact goto as the last one. Why
to do this ugliness?


>
>cheers,
>jamal
>
>On 17-04-27 07:12 AM, Jiri Pirko wrote:
>> From: Jiri Pirko <jiri@mellanox.com>
>> 
>> Introduce new type of gact action called "goto_chain". This allows
>> user to specify a chain to be processed. This action type is
>> then processed as a return value in tcf_classify loop in similar
>> way as "reclassify" is, only it does not reset to the first filter
>> in chain but rather reset to the first filter of the desired chain.
>> 
>> Signed-off-by: Jiri Pirko <jiri@mellanox.com>
>> ---
>>  include/net/sch_generic.h           |  9 +++++--
>>  include/net/tc_act/tc_gact.h        |  2 ++
>>  include/uapi/linux/pkt_cls.h        |  1 +
>>  include/uapi/linux/tc_act/tc_gact.h |  1 +
>>  net/sched/act_gact.c                | 48 ++++++++++++++++++++++++++++++++++++-
>>  net/sched/cls_api.c                 |  8 +++++--
>>  6 files changed, 64 insertions(+), 5 deletions(-)
>> 
>> diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
>> index 569b565..3688501 100644
>> --- a/include/net/sch_generic.h
>> +++ b/include/net/sch_generic.h
>> @@ -193,8 +193,13 @@ struct Qdisc_ops {
>> 
>> 
>>  struct tcf_result {
>> -	unsigned long	class;
>> -	u32		classid;
>> +	union {
>> +		struct {
>> +			unsigned long	class;
>> +			u32		classid;
>> +		};
>> +		const struct tcf_proto *goto_tp;
>> +	};
>>  };
>> 
>>  struct tcf_proto_ops {
>> diff --git a/include/net/tc_act/tc_gact.h b/include/net/tc_act/tc_gact.h
>> index b6f1739..58bee54 100644
>> --- a/include/net/tc_act/tc_gact.h
>> +++ b/include/net/tc_act/tc_gact.h
>> @@ -12,6 +12,8 @@ struct tcf_gact {
>>  	int			tcfg_paction;
>>  	atomic_t		packets;
>>  #endif
>> +	struct tcf_chain	*goto_chain;
>> +	struct rcu_head		rcu;
>>  };
>>  #define to_gact(a) ((struct tcf_gact *)a)
>> 
>> diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h
>> index f1129e3..e03ba27 100644
>> --- a/include/uapi/linux/pkt_cls.h
>> +++ b/include/uapi/linux/pkt_cls.h
>> @@ -37,6 +37,7 @@ enum {
>>  #define TC_ACT_QUEUED		5
>>  #define TC_ACT_REPEAT		6
>>  #define TC_ACT_REDIRECT		7
>> +#define TC_ACT_GOTO_CHAIN	8
>>  #define TC_ACT_JUMP		0x10000000
>> 
>>  /* Action type identifiers*/
>> diff --git a/include/uapi/linux/tc_act/tc_gact.h b/include/uapi/linux/tc_act/tc_gact.h
>> index 70b536a..388733d 100644
>> --- a/include/uapi/linux/tc_act/tc_gact.h
>> +++ b/include/uapi/linux/tc_act/tc_gact.h
>> @@ -26,6 +26,7 @@ enum {
>>  	TCA_GACT_PARMS,
>>  	TCA_GACT_PROB,
>>  	TCA_GACT_PAD,
>> +	TCA_GACT_CHAIN,
>>  	__TCA_GACT_MAX
>>  };
>>  #define TCA_GACT_MAX (__TCA_GACT_MAX - 1)
>> diff --git a/net/sched/act_gact.c b/net/sched/act_gact.c
>> index c527c11..d63aebd 100644
>> --- a/net/sched/act_gact.c
>> +++ b/net/sched/act_gact.c
>> @@ -20,6 +20,7 @@
>>  #include <linux/init.h>
>>  #include <net/netlink.h>
>>  #include <net/pkt_sched.h>
>> +#include <net/pkt_cls.h>
>>  #include <linux/tc_act/tc_gact.h>
>>  #include <net/tc_act/tc_gact.h>
>> 
>> @@ -54,6 +55,7 @@ static g_rand gact_rand[MAX_RAND] = { NULL, gact_net_rand, gact_determ };
>>  static const struct nla_policy gact_policy[TCA_GACT_MAX + 1] = {
>>  	[TCA_GACT_PARMS]	= { .len = sizeof(struct tc_gact) },
>>  	[TCA_GACT_PROB]		= { .len = sizeof(struct tc_gact_p) },
>> +	[TCA_GACT_CHAIN]	= { .type = NLA_U32 },
>>  };
>> 
>>  static int tcf_gact_init(struct net *net, struct tcf_proto *tp,
>> @@ -92,6 +94,9 @@ static int tcf_gact_init(struct net *net, struct tcf_proto *tp,
>>  	}
>>  #endif
>> 
>> +	if (parm->action == TC_ACT_GOTO_CHAIN && !tb[TCA_GACT_CHAIN])
>> +		return -EINVAL;
>> +
>>  	if (!tcf_hash_check(tn, parm->index, a, bind)) {
>>  		ret = tcf_hash_create(tn, parm->index, est, a,
>>  				      &act_gact_ops, bind, true);
>> @@ -121,11 +126,43 @@ static int tcf_gact_init(struct net *net, struct tcf_proto *tp,
>>  		gact->tcfg_ptype   = p_parm->ptype;
>>  	}
>>  #endif
>> +
>> +	if (gact->tcf_action == TC_ACT_GOTO_CHAIN) {
>> +		u32 chain_index = nla_get_u32(tb[TCA_GACT_CHAIN]);
>> +
>> +		if (!tp) {
>> +			if (ret == ACT_P_CREATED)
>> +				tcf_hash_release(*a, bind);
>> +			return -EINVAL;
>> +		}
>> +		gact->goto_chain = tcf_chain_get(tp->chain->block, chain_index);
>> +		if (!gact->goto_chain) {
>> +			if (ret == ACT_P_CREATED)
>> +				tcf_hash_release(*a, bind);
>> +			return -ENOMEM;
>> +		}
>> +	}
>> +
>>  	if (ret == ACT_P_CREATED)
>>  		tcf_hash_insert(tn, *a);
>>  	return ret;
>>  }
>> 
>> +static void tcf_gact_cleanup_rcu(struct rcu_head *rcu)
>> +{
>> +	struct tcf_gact *gact = container_of(rcu, struct tcf_gact, rcu);
>> +
>> +	if (gact->tcf_action == TC_ACT_GOTO_CHAIN)
>> +		tcf_chain_put(gact->goto_chain);
>> +}
>> +
>> +static void tcf_gact_cleanup(struct tc_action *a, int bind)
>> +{
>> +	struct tcf_gact *gact = to_gact(a);
>> +
>> +	call_rcu(&gact->rcu, tcf_gact_cleanup_rcu);
>> +}
>> +
>>  static int tcf_gact(struct sk_buff *skb, const struct tc_action *a,
>>  		    struct tcf_result *res)
>>  {
>> @@ -141,8 +178,13 @@ static int tcf_gact(struct sk_buff *skb, const struct tc_action *a,
>>  	}
>>  #endif
>>  	bstats_cpu_update(this_cpu_ptr(gact->common.cpu_bstats), skb);
>> -	if (action == TC_ACT_SHOT)
>> +	if (action == TC_ACT_SHOT) {
>>  		qstats_drop_inc(this_cpu_ptr(gact->common.cpu_qstats));
>> +	} else if (action == TC_ACT_GOTO_CHAIN) {
>> +		struct tcf_chain *chain = gact->goto_chain;
>> +
>> +		res->goto_tp = rcu_dereference_bh(chain->filter_chain);
>> +	}
>> 
>>  	tcf_lastuse_update(&gact->tcf_tm);
>> 
>> @@ -194,6 +236,9 @@ static int tcf_gact_dump(struct sk_buff *skb, struct tc_action *a,
>>  	tcf_tm_dump(&t, &gact->tcf_tm);
>>  	if (nla_put_64bit(skb, TCA_GACT_TM, sizeof(t), &t, TCA_GACT_PAD))
>>  		goto nla_put_failure;
>> +	if (gact->tcf_action == TC_ACT_GOTO_CHAIN &&
>> +	    nla_put_u32(skb, TCA_GACT_CHAIN, gact->goto_chain->index))
>> +		goto nla_put_failure;
>>  	return skb->len;
>> 
>>  nla_put_failure:
>> @@ -225,6 +270,7 @@ static struct tc_action_ops act_gact_ops = {
>>  	.stats_update	=	tcf_gact_stats_update,
>>  	.dump		=	tcf_gact_dump,
>>  	.init		=	tcf_gact_init,
>> +	.cleanup	=	tcf_gact_cleanup,
>>  	.walk		=	tcf_gact_walker,
>>  	.lookup		=	tcf_gact_search,
>>  	.size		=	sizeof(struct tcf_gact),
>> diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
>> index dbc1348..a2d6bc7 100644
>> --- a/net/sched/cls_api.c
>> +++ b/net/sched/cls_api.c
>> @@ -304,10 +304,14 @@ int tcf_classify(struct sk_buff *skb, const struct tcf_proto *tp,
>>  			continue;
>> 
>>  		err = tp->classify(skb, tp, res);
>> -		if (unlikely(err == TC_ACT_RECLASSIFY && !compat_mode))
>> +		if (err == TC_ACT_RECLASSIFY && !compat_mode) {
>>  			goto reset;
>> -		if (err >= 0)
>> +		} else if (err == TC_ACT_GOTO_CHAIN) {
>> +			old_tp = res->goto_tp;
>> +			goto reset;
>> +		} else if (err >= 0) {
>>  			return err;
>> +		}
>>  	}
>> 
>>  	return TC_ACT_UNSPEC; /* signal: continue lookup */
>> 
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [patch net-next 00/10] net: sched: introduce multichain support for filters
  2017-04-27 17:46 ` [patch net-next 00/10] net: sched: introduce multichain support for filters Cong Wang
@ 2017-04-28  6:53   ` Jiri Pirko
  2017-04-28 17:40     ` Cong Wang
  0 siblings, 1 reply; 26+ messages in thread
From: Jiri Pirko @ 2017-04-28  6:53 UTC (permalink / raw)
  To: Cong Wang
  Cc: Linux Kernel Network Developers, David Miller, Jamal Hadi Salim,
	David Ahern, Eric Dumazet, Stephen Hemminger, Daniel Borkmann,
	Alexander Duyck, mlxsw, Simon Horman

Thu, Apr 27, 2017 at 07:46:03PM CEST, xiyou.wangcong@gmail.com wrote:
>On Thu, Apr 27, 2017 at 4:12 AM, Jiri Pirko <jiri@resnulli.us> wrote:
>> Simple example:
>> $ tc qdisc add dev eth0 ingress
>> $ tc filter add dev eth0 parent ffff: protocol ip pref 33 flower dst_mac 52:54:00:3d:c7:6d action goto chain 11
>> $ tc filter add dev eth0 parent ffff: protocol ip pref 22 chain 11 flower dst_ip 192.168.40.1 action drop
>> $ tc filter show dev eth0 root
>
>Interesting.
>
>I don't look into the code yet. If I understand the concepts correctly,
>so with your patchset we can mark either filter with a chain No. to
>choose which chain it belongs to _logically_ even though
>_physically_ it is still in the old-fashion chain (prio, proto)?

You have to see the code :)

There are physically multiple chains


>
>If so, you have to ensure proto is same since the protocol of
>the packet does not change dynamically? And the original
>priority becomes pointless with chains since we can just to
>any other chain in any order?
>
>By default, without any chain No., you use 0 for all the chains,
>so the old-fashion chain still works.

Yes.

>
>Thanks.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [patch net-next 10/10] net: sched: extend gact to allow jumping to another filter chain
  2017-04-28  6:52     ` Jiri Pirko
@ 2017-04-28 12:23       ` Jamal Hadi Salim
  2017-04-28 13:47         ` Jiri Pirko
  0 siblings, 1 reply; 26+ messages in thread
From: Jamal Hadi Salim @ 2017-04-28 12:23 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, davem, xiyou.wangcong, dsa, edumazet, stephen, daniel,
	alexander.h.duyck, mlxsw, simon.horman

On 17-04-28 02:52 AM, Jiri Pirko wrote:
> Fri, Apr 28, 2017 at 03:41:08AM CEST, jhs@mojatatu.com wrote:
>>
>> Jiri,
>>
>> Good stuff!
>> Thanks for the effort.
>>
>> I didnt review the details - will do. I wanted to raise one issue.
>> This should work for all actions, not just gact (refer to the
>> recent commit i made on the action jumping).
>>
>> Example policy for policer:
>>
>> #if packets destined for mac address 52:54:00:3d:c7:6d
>> #exceed 90kbps with burst of 90K then jump to chain 11
>> #for further classification, otherwise set their skb mark to 11
>> # and proceed.
>>
>> tc filter add dev eth0 parent ffff: protocol ip pref 33 \
>> flower dst_mac 52:54:00:3d:c7:6d \
>> action police rate 1kbit burst 90k conform-exceed pipe/goto chain 11 \
>> action skbedit mark 11
>>
>> But i should also be able to do this for any other action, etc.
>

[..]

> You can have multiple actions in list and gact goto as the last one. Why
> to do this ugliness?

To be able to do what I described above ;-> Policer is always a good
example because it can branch depending on whether a rate is exceeded
or not.

Another example:
If you had two tables, one for flows that dont exceed their
rate and another for flows exceed their rate.
"match X
	action police
	    if exceed
                 goto chain 12
             else did not exceed
                tag packet, goto chain 11
"

But it is not just the policer, other actions as well would benefit.

I am not sure you can achieve that with just gact.

cheers,
jamal

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [patch net-next 10/10] net: sched: extend gact to allow jumping to another filter chain
  2017-04-28 12:23       ` Jamal Hadi Salim
@ 2017-04-28 13:47         ` Jiri Pirko
  2017-04-30 11:08           ` Jamal Hadi Salim
  0 siblings, 1 reply; 26+ messages in thread
From: Jiri Pirko @ 2017-04-28 13:47 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: netdev, davem, xiyou.wangcong, dsa, edumazet, stephen, daniel,
	alexander.h.duyck, mlxsw, simon.horman

Fri, Apr 28, 2017 at 02:23:34PM CEST, jhs@mojatatu.com wrote:
>On 17-04-28 02:52 AM, Jiri Pirko wrote:
>> Fri, Apr 28, 2017 at 03:41:08AM CEST, jhs@mojatatu.com wrote:
>> > 
>> > Jiri,
>> > 
>> > Good stuff!
>> > Thanks for the effort.
>> > 
>> > I didnt review the details - will do. I wanted to raise one issue.
>> > This should work for all actions, not just gact (refer to the
>> > recent commit i made on the action jumping).
>> > 
>> > Example policy for policer:
>> > 
>> > #if packets destined for mac address 52:54:00:3d:c7:6d
>> > #exceed 90kbps with burst of 90K then jump to chain 11
>> > #for further classification, otherwise set their skb mark to 11
>> > # and proceed.
>> > 
>> > tc filter add dev eth0 parent ffff: protocol ip pref 33 \
>> > flower dst_mac 52:54:00:3d:c7:6d \
>> > action police rate 1kbit burst 90k conform-exceed pipe/goto chain 11 \
>> > action skbedit mark 11
>> > 
>> > But i should also be able to do this for any other action, etc.
>> 
>
>[..]
>
>> You can have multiple actions in list and gact goto as the last one. Why
>> to do this ugliness?
>
>To be able to do what I described above ;-> Policer is always a good
>example because it can branch depending on whether a rate is exceeded
>or not.
>
>Another example:
>If you had two tables, one for flows that dont exceed their
>rate and another for flows exceed their rate.
>"match X
>	action police
>	    if exceed
>                goto chain 12
>            else did not exceed
>               tag packet, goto chain 11
>"
>
>But it is not just the policer, other actions as well would benefit.
>
>I am not sure you can achieve that with just gact.

Got it. Sigh, every day I find new oddities in net/sched :)

I will try to figure out how to extend GOTO_CHAIN action for other
actions.

So basically, you suggest to encode chain number into the action opcode,
like you do it in jump, right? For example:
#define TC_ACT_GOTO_CHAIN       0x20000000

And then I will have chainlimit 0x10000000-1

Thanks.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [patch net-next 00/10] net: sched: introduce multichain support for filters
  2017-04-28  6:53   ` Jiri Pirko
@ 2017-04-28 17:40     ` Cong Wang
  2017-04-28 22:34       ` Jiri Pirko
  0 siblings, 1 reply; 26+ messages in thread
From: Cong Wang @ 2017-04-28 17:40 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: Linux Kernel Network Developers, David Miller, Jamal Hadi Salim,
	David Ahern, Eric Dumazet, Stephen Hemminger, Daniel Borkmann,
	Alexander Duyck, mlxsw, Simon Horman

On Thu, Apr 27, 2017 at 11:53 PM, Jiri Pirko <jiri@resnulli.us> wrote:
> Thu, Apr 27, 2017 at 07:46:03PM CEST, xiyou.wangcong@gmail.com wrote:
>>On Thu, Apr 27, 2017 at 4:12 AM, Jiri Pirko <jiri@resnulli.us> wrote:
>>> Simple example:
>>> $ tc qdisc add dev eth0 ingress
>>> $ tc filter add dev eth0 parent ffff: protocol ip pref 33 flower dst_mac 52:54:00:3d:c7:6d action goto chain 11
>>> $ tc filter add dev eth0 parent ffff: protocol ip pref 22 chain 11 flower dst_ip 192.168.40.1 action drop
>>> $ tc filter show dev eth0 root
>>
>>Interesting.
>>
>>I don't look into the code yet. If I understand the concepts correctly,
>>so with your patchset we can mark either filter with a chain No. to
>>choose which chain it belongs to _logically_ even though
>>_physically_ it is still in the old-fashion chain (prio, proto)?
>
> You have to see the code :)

I don't understand why I have to, these are high-level concepts
and should be put in your cover letter (aka. design doc). You miss
a lot of information about the ordering here.

Also the terms you use are confusing too, without your patchset
we have chains too, struct tcf_proto is a chain, each kind of filter
defines their own way to store their filters into this chain (tp->root),
and of course tp is chained in a singly-linked list too which turns
into multiple-chains.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [patch net-next 02/10] net: sched: introduce tcf block infractructure
  2017-04-27 11:12 ` [patch net-next 02/10] net: sched: introduce tcf block infractructure Jiri Pirko
@ 2017-04-28 17:48   ` Cong Wang
  2017-04-28 22:29     ` Jiri Pirko
  0 siblings, 1 reply; 26+ messages in thread
From: Cong Wang @ 2017-04-28 17:48 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: Linux Kernel Network Developers, David Miller, Jamal Hadi Salim,
	David Ahern, Eric Dumazet, Stephen Hemminger, Daniel Borkmann,
	Alexander Duyck, mlxsw, Simon Horman

On Thu, Apr 27, 2017 at 4:12 AM, Jiri Pirko <jiri@resnulli.us> wrote:
> From: Jiri Pirko <jiri@mellanox.com>
>
> Currently, the filter chains are direcly put into the private structures
> of qdiscs. In order to be able to have multiple chains per qdisc and to
> allow filter chains sharing among qdiscs, there is a need for common
> object that would hold the chains. This introduces such object and calls
> it "tcf_block".
>

What is filter chains sharing here? Sounds like a new feature you are
trying to hide? How could it be possibly shared among qdisc's when they
are still stored in a per-qdisc pointer?

Look at tc actions, they can be shared because they are physically stored
in a per-netns hashtable, filters can just refer them with indexes.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [patch net-next 02/10] net: sched: introduce tcf block infractructure
  2017-04-28 17:48   ` Cong Wang
@ 2017-04-28 22:29     ` Jiri Pirko
  0 siblings, 0 replies; 26+ messages in thread
From: Jiri Pirko @ 2017-04-28 22:29 UTC (permalink / raw)
  To: Cong Wang
  Cc: Linux Kernel Network Developers, David Miller, Jamal Hadi Salim,
	David Ahern, Eric Dumazet, Stephen Hemminger, Daniel Borkmann,
	Alexander Duyck, mlxsw, Simon Horman

Fri, Apr 28, 2017 at 07:48:34PM CEST, xiyou.wangcong@gmail.com wrote:
>On Thu, Apr 27, 2017 at 4:12 AM, Jiri Pirko <jiri@resnulli.us> wrote:
>> From: Jiri Pirko <jiri@mellanox.com>
>>
>> Currently, the filter chains are direcly put into the private structures
>> of qdiscs. In order to be able to have multiple chains per qdisc and to
>> allow filter chains sharing among qdiscs, there is a need for common
>> object that would hold the chains. This introduces such object and calls
>> it "tcf_block".
>>
>
>What is filter chains sharing here? Sounds like a new feature you are
>trying to hide? How could it be possibly shared among qdisc's when they
>are still stored in a per-qdisc pointer?

I can change the description. I just found out that eventually, later
on, the blocks could be shared among qdiscs. It is not part of this
patchset. Just this change of infra is bringing this one step closer.

I can imagine that the blocks could have indexes and when you create a
qdisc you could pass this index of an existing block.

Not trying to hide anything!


>
>Look at tc actions, they can be shared because they are physically stored
>in a per-netns hashtable, filters can just refer them with indexes.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [patch net-next 00/10] net: sched: introduce multichain support for filters
  2017-04-28 17:40     ` Cong Wang
@ 2017-04-28 22:34       ` Jiri Pirko
  2017-05-02  5:26         ` Cong Wang
  0 siblings, 1 reply; 26+ messages in thread
From: Jiri Pirko @ 2017-04-28 22:34 UTC (permalink / raw)
  To: Cong Wang
  Cc: Linux Kernel Network Developers, David Miller, Jamal Hadi Salim,
	David Ahern, Eric Dumazet, Stephen Hemminger, Daniel Borkmann,
	Alexander Duyck, mlxsw, Simon Horman

Fri, Apr 28, 2017 at 07:40:24PM CEST, xiyou.wangcong@gmail.com wrote:
>On Thu, Apr 27, 2017 at 11:53 PM, Jiri Pirko <jiri@resnulli.us> wrote:
>> Thu, Apr 27, 2017 at 07:46:03PM CEST, xiyou.wangcong@gmail.com wrote:
>>>On Thu, Apr 27, 2017 at 4:12 AM, Jiri Pirko <jiri@resnulli.us> wrote:
>>>> Simple example:
>>>> $ tc qdisc add dev eth0 ingress
>>>> $ tc filter add dev eth0 parent ffff: protocol ip pref 33 flower dst_mac 52:54:00:3d:c7:6d action goto chain 11
>>>> $ tc filter add dev eth0 parent ffff: protocol ip pref 22 chain 11 flower dst_ip 192.168.40.1 action drop
>>>> $ tc filter show dev eth0 root
>>>
>>>Interesting.
>>>
>>>I don't look into the code yet. If I understand the concepts correctly,
>>>so with your patchset we can mark either filter with a chain No. to
>>>choose which chain it belongs to _logically_ even though
>>>_physically_ it is still in the old-fashion chain (prio, proto)?
>>
>> You have to see the code :)
>
>I don't understand why I have to, these are high-level concepts
>and should be put in your cover letter (aka. design doc). You miss
>a lot of information about the ordering here.

Well, the description is one thing, but seeing the actual code should
put the whole view. But if you are missing something, I can add it. What
do you mean by "information about the ordering"?


>
>Also the terms you use are confusing too, without your patchset
>we have chains too, struct tcf_proto is a chain, each kind of filter
>defines their own way to store their filters into this chain (tp->root),
	 
Those are internal structures specific to each filter. Not "chains" per
say.

>and of course tp is chained in a singly-linked list too which turns
>into multiple-chains.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [patch net-next 10/10] net: sched: extend gact to allow jumping to another filter chain
  2017-04-28 13:47         ` Jiri Pirko
@ 2017-04-30 11:08           ` Jamal Hadi Salim
  0 siblings, 0 replies; 26+ messages in thread
From: Jamal Hadi Salim @ 2017-04-30 11:08 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, davem, xiyou.wangcong, dsa, edumazet, stephen, daniel,
	alexander.h.duyck, mlxsw, simon.horman

On 17-04-28 09:47 AM, Jiri Pirko wrote:
[..]
> I will try to figure out how to extend GOTO_CHAIN action for other
> actions.
>
> So basically, you suggest to encode chain number into the action opcode,
> like you do it in jump, right? For example:
> #define TC_ACT_GOTO_CHAIN       0x20000000
>
> And then I will have chainlimit 0x10000000-1
>

If you pick a range to own (one that is not being used)
like 0x20000000 then chain limit starts at 0x20000001
to some upper bound. I dont know if you need all the range,
but if you put upper bound to 0x2FFFFFFF that is a few
million.

cheers,
jamal

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [patch net-next 00/10] net: sched: introduce multichain support for filters
  2017-04-28 22:34       ` Jiri Pirko
@ 2017-05-02  5:26         ` Cong Wang
  2017-05-02  5:50           ` Jiri Pirko
  0 siblings, 1 reply; 26+ messages in thread
From: Cong Wang @ 2017-05-02  5:26 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: Linux Kernel Network Developers, David Miller, Jamal Hadi Salim,
	David Ahern, Eric Dumazet, Stephen Hemminger, Daniel Borkmann,
	Alexander Duyck, mlxsw, Simon Horman

On Fri, Apr 28, 2017 at 3:34 PM, Jiri Pirko <jiri@resnulli.us> wrote:
> Fri, Apr 28, 2017 at 07:40:24PM CEST, xiyou.wangcong@gmail.com wrote:
>>On Thu, Apr 27, 2017 at 11:53 PM, Jiri Pirko <jiri@resnulli.us> wrote:
>>> Thu, Apr 27, 2017 at 07:46:03PM CEST, xiyou.wangcong@gmail.com wrote:
>>>>On Thu, Apr 27, 2017 at 4:12 AM, Jiri Pirko <jiri@resnulli.us> wrote:
>>>>> Simple example:
>>>>> $ tc qdisc add dev eth0 ingress
>>>>> $ tc filter add dev eth0 parent ffff: protocol ip pref 33 flower dst_mac 52:54:00:3d:c7:6d action goto chain 11
>>>>> $ tc filter add dev eth0 parent ffff: protocol ip pref 22 chain 11 flower dst_ip 192.168.40.1 action drop
>>>>> $ tc filter show dev eth0 root
>>>>
>>>>Interesting.
>>>>
>>>>I don't look into the code yet. If I understand the concepts correctly,
>>>>so with your patchset we can mark either filter with a chain No. to
>>>>choose which chain it belongs to _logically_ even though
>>>>_physically_ it is still in the old-fashion chain (prio, proto)?
>>>
>>> You have to see the code :)
>>
>>I don't understand why I have to, these are high-level concepts
>>and should be put in your cover letter (aka. design doc). You miss
>>a lot of information about the ordering here.
>
> Well, the description is one thing, but seeing the actual code should
> put the whole view. But if you are missing something, I can add it. What
> do you mean by "information about the ordering"?
>

By ordering, I mean:

1) before your patch, filters are ordered by prio and categorized by proto

2) after your patch, we can jump from one filter to a specified one, how
does this work or not work with the prio/proto?

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [patch net-next 00/10] net: sched: introduce multichain support for filters
  2017-05-02  5:26         ` Cong Wang
@ 2017-05-02  5:50           ` Jiri Pirko
  0 siblings, 0 replies; 26+ messages in thread
From: Jiri Pirko @ 2017-05-02  5:50 UTC (permalink / raw)
  To: Cong Wang
  Cc: Linux Kernel Network Developers, David Miller, Jamal Hadi Salim,
	David Ahern, Eric Dumazet, Stephen Hemminger, Daniel Borkmann,
	Alexander Duyck, mlxsw, Simon Horman

Tue, May 02, 2017 at 07:26:07AM CEST, xiyou.wangcong@gmail.com wrote:
>On Fri, Apr 28, 2017 at 3:34 PM, Jiri Pirko <jiri@resnulli.us> wrote:
>> Fri, Apr 28, 2017 at 07:40:24PM CEST, xiyou.wangcong@gmail.com wrote:
>>>On Thu, Apr 27, 2017 at 11:53 PM, Jiri Pirko <jiri@resnulli.us> wrote:
>>>> Thu, Apr 27, 2017 at 07:46:03PM CEST, xiyou.wangcong@gmail.com wrote:
>>>>>On Thu, Apr 27, 2017 at 4:12 AM, Jiri Pirko <jiri@resnulli.us> wrote:
>>>>>> Simple example:
>>>>>> $ tc qdisc add dev eth0 ingress
>>>>>> $ tc filter add dev eth0 parent ffff: protocol ip pref 33 flower dst_mac 52:54:00:3d:c7:6d action goto chain 11
>>>>>> $ tc filter add dev eth0 parent ffff: protocol ip pref 22 chain 11 flower dst_ip 192.168.40.1 action drop
>>>>>> $ tc filter show dev eth0 root
>>>>>
>>>>>Interesting.
>>>>>
>>>>>I don't look into the code yet. If I understand the concepts correctly,
>>>>>so with your patchset we can mark either filter with a chain No. to
>>>>>choose which chain it belongs to _logically_ even though
>>>>>_physically_ it is still in the old-fashion chain (prio, proto)?
>>>>
>>>> You have to see the code :)
>>>
>>>I don't understand why I have to, these are high-level concepts
>>>and should be put in your cover letter (aka. design doc). You miss
>>>a lot of information about the ordering here.
>>
>> Well, the description is one thing, but seeing the actual code should
>> put the whole view. But if you are missing something, I can add it. What
>> do you mean by "information about the ordering"?
>>
>
>By ordering, I mean:
>
>1) before your patch, filters are ordered by prio and categorized by proto
>
>2) after your patch, we can jump from one filter to a specified one, how
>does this work or not work with the prio/proto?

No, you can jump to another chain. And that chain is also ordered by
prio/proto.

Just imagine currently you have only chain 0. This patchset just extends
for other chains.

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2017-05-02  5:50 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-27 11:12 [patch net-next 00/10] net: sched: introduce multichain support for filters Jiri Pirko
2017-04-27 11:12 ` [patch net-next 01/10] net: sched: move tc_classify function to cls_api.c Jiri Pirko
2017-04-27 11:12 ` [patch net-next 02/10] net: sched: introduce tcf block infractructure Jiri Pirko
2017-04-28 17:48   ` Cong Wang
2017-04-28 22:29     ` Jiri Pirko
2017-04-27 11:12 ` [patch net-next 03/10] net: sched: rename tcf_destroy_chain helper Jiri Pirko
2017-04-27 11:12 ` [patch net-next 04/10] net: sched: replace nprio by a bool to make the function more readable Jiri Pirko
2017-04-27 11:12 ` [patch net-next 05/10] net: sched: move TC_H_MAJ macro call into tcf_auto_prio Jiri Pirko
2017-04-27 11:12 ` [patch net-next 06/10] net: sched: introduce helpers to work with filter chains Jiri Pirko
2017-04-27 11:12 ` [patch net-next 07/10] net: sched: push chain dump to a separate function Jiri Pirko
2017-04-27 11:12 ` [patch net-next 08/10] net: sched: introduce multichain support for filters Jiri Pirko
2017-04-27 11:12 ` [patch net-next 09/10] net: sched: push tp down to action init callback Jiri Pirko
2017-04-27 11:12 ` [patch net-next 10/10] net: sched: extend gact to allow jumping to another filter chain Jiri Pirko
2017-04-28  1:41   ` Jamal Hadi Salim
2017-04-28  6:52     ` Jiri Pirko
2017-04-28 12:23       ` Jamal Hadi Salim
2017-04-28 13:47         ` Jiri Pirko
2017-04-30 11:08           ` Jamal Hadi Salim
2017-04-27 11:15 ` [patch iproute2 1/2] tc_filter: add support for chain index Jiri Pirko
2017-04-27 11:15 ` [patch iproute2 2/2] tc_gact: introduce support for goto chain action Jiri Pirko
2017-04-27 17:46 ` [patch net-next 00/10] net: sched: introduce multichain support for filters Cong Wang
2017-04-28  6:53   ` Jiri Pirko
2017-04-28 17:40     ` Cong Wang
2017-04-28 22:34       ` Jiri Pirko
2017-05-02  5:26         ` Cong Wang
2017-05-02  5:50           ` Jiri Pirko

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.