All of lore.kernel.org
 help / color / mirror / Atom feed
* [patch net v2 0/4] net/sched: Fix a system panic when deleting filters
@ 2017-10-16 11:18 Chris Mi
  2017-10-16 11:18 ` [patch net v2 1/4] net/sched: Change tc_action refcnt and bindcnt to atomic Chris Mi
                   ` (3 more replies)
  0 siblings, 4 replies; 18+ messages in thread
From: Chris Mi @ 2017-10-16 11:18 UTC (permalink / raw)
  To: netdev; +Cc: jhs, lucasb, xiyou.wangcong, jiri, davem

If some filters share the same action, when deleting these filters,
it is possible to create a system panic. This is because deletions
could be manipulated by many RCU callbacks at the same time.

This patch set fixes these issues. To reproduce the issue run selftests
in patch 3 and 4. To test if the issue was fixed, apply patches 1 and 2
and then repeat the tests.

v2 changelog
============

Revise the comment and add Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>

Chris Mi (4):
  net/sched: Change tc_action refcnt and bindcnt to atomic
  net/sched: Use action array instead of action list as parameter
  selftests: Introduce a new script to generate tc batch file
  selftests: Introduce a new test case to tc testsuite

 include/net/act_api.h                              |  11 +-
 net/sched/act_api.c                                | 124 +++++++++++++--------
 net/sched/act_bpf.c                                |   4 +-
 net/sched/act_connmark.c                           |   4 +-
 net/sched/act_csum.c                               |   4 +-
 net/sched/act_gact.c                               |   4 +-
 net/sched/act_ife.c                                |   4 +-
 net/sched/act_ipt.c                                |   4 +-
 net/sched/act_mirred.c                             |   4 +-
 net/sched/act_nat.c                                |   4 +-
 net/sched/act_pedit.c                              |   4 +-
 net/sched/act_police.c                             |   4 +-
 net/sched/act_sample.c                             |   4 +-
 net/sched/act_simple.c                             |   4 +-
 net/sched/act_skbedit.c                            |   4 +-
 net/sched/act_skbmod.c                             |   4 +-
 net/sched/act_tunnel_key.c                         |   4 +-
 net/sched/act_vlan.c                               |   4 +-
 net/sched/cls_api.c                                |  18 +--
 .../tc-testing/tc-tests/filters/tests.json         |  23 +++-
 tools/testing/selftests/tc-testing/tdc.py          |  20 +++-
 tools/testing/selftests/tc-testing/tdc_batch.py    |  62 +++++++++++
 tools/testing/selftests/tc-testing/tdc_config.py   |   2 +
 23 files changed, 222 insertions(+), 102 deletions(-)
 create mode 100755 tools/testing/selftests/tc-testing/tdc_batch.py

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [patch net v2 1/4] net/sched: Change tc_action refcnt and bindcnt to atomic
  2017-10-16 11:18 [patch net v2 0/4] net/sched: Fix a system panic when deleting filters Chris Mi
@ 2017-10-16 11:18 ` Chris Mi
  2017-10-16 17:06   ` Cong Wang
  2017-10-16 11:18 ` [patch net v2 2/4] net/sched: Use action array instead of action list as parameter Chris Mi
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 18+ messages in thread
From: Chris Mi @ 2017-10-16 11:18 UTC (permalink / raw)
  To: netdev; +Cc: jhs, lucasb, xiyou.wangcong, jiri, davem

If many filters share the same action. That action's refcnt and bindcnt
could be manipulated by many RCU callbacks at the same time. This patch
makes these operations atomic.

Fixes commit in pre-git era.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Chris Mi <chrism@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
---
 include/net/act_api.h      |  4 ++--
 net/sched/act_api.c        | 21 +++++++++++----------
 net/sched/act_bpf.c        |  4 ++--
 net/sched/act_connmark.c   |  4 ++--
 net/sched/act_csum.c       |  4 ++--
 net/sched/act_gact.c       |  4 ++--
 net/sched/act_ife.c        |  4 ++--
 net/sched/act_ipt.c        |  4 ++--
 net/sched/act_mirred.c     |  4 ++--
 net/sched/act_nat.c        |  4 ++--
 net/sched/act_pedit.c      |  4 ++--
 net/sched/act_police.c     |  4 ++--
 net/sched/act_sample.c     |  4 ++--
 net/sched/act_simple.c     |  4 ++--
 net/sched/act_skbedit.c    |  4 ++--
 net/sched/act_skbmod.c     |  4 ++--
 net/sched/act_tunnel_key.c |  4 ++--
 net/sched/act_vlan.c       |  4 ++--
 18 files changed, 45 insertions(+), 44 deletions(-)

diff --git a/include/net/act_api.h b/include/net/act_api.h
index b944e0eb..a469ab6 100644
--- a/include/net/act_api.h
+++ b/include/net/act_api.h
@@ -25,8 +25,8 @@ struct tc_action {
 	struct tcf_idrinfo		*idrinfo;
 
 	u32				tcfa_index;
-	int				tcfa_refcnt;
-	int				tcfa_bindcnt;
+	atomic_t			tcfa_refcnt;
+	atomic_t			tcfa_bindcnt;
 	u32				tcfa_capab;
 	int				tcfa_action;
 	struct tcf_t			tcfa_tm;
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index da6fa82..9c0224d 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -88,12 +88,13 @@ int __tcf_idr_release(struct tc_action *p, bool bind, bool strict)
 
 	if (p) {
 		if (bind)
-			p->tcfa_bindcnt--;
-		else if (strict && p->tcfa_bindcnt > 0)
+			atomic_dec(&p->tcfa_bindcnt);
+		else if (strict && atomic_read(&p->tcfa_bindcnt) > 0)
 			return -EPERM;
 
-		p->tcfa_refcnt--;
-		if (p->tcfa_bindcnt <= 0 && p->tcfa_refcnt <= 0) {
+		atomic_dec(&p->tcfa_refcnt);
+		if (atomic_read(&p->tcfa_bindcnt) == 0 &&
+		    atomic_read(&p->tcfa_refcnt) == 0) {
 			if (p->ops->cleanup)
 				p->ops->cleanup(p, bind);
 			tcf_idr_remove(p->idrinfo, p);
@@ -245,8 +246,8 @@ bool tcf_idr_check(struct tc_action_net *tn, u32 index, struct tc_action **a,
 
 	if (index && p) {
 		if (bind)
-			p->tcfa_bindcnt++;
-		p->tcfa_refcnt++;
+			atomic_inc(&p->tcfa_bindcnt);
+		atomic_inc(&p->tcfa_refcnt);
 		*a = p;
 		return true;
 	}
@@ -274,9 +275,9 @@ int tcf_idr_create(struct tc_action_net *tn, u32 index, struct nlattr *est,
 
 	if (unlikely(!p))
 		return -ENOMEM;
-	p->tcfa_refcnt = 1;
+	atomic_set(&p->tcfa_refcnt, 1);
 	if (bind)
-		p->tcfa_bindcnt = 1;
+		atomic_set(&p->tcfa_bindcnt, 1);
 
 	if (cpustats) {
 		p->cpu_bstats = netdev_alloc_pcpu_stats(struct gnet_stats_basic_cpu);
@@ -727,7 +728,7 @@ static void cleanup_a(struct list_head *actions, int ovr)
 		return;
 
 	list_for_each_entry(a, actions, list)
-		a->tcfa_refcnt--;
+		atomic_dec(&a->tcfa_refcnt);
 }
 
 int tcf_action_init(struct net *net, struct tcf_proto *tp, struct nlattr *nla,
@@ -751,7 +752,7 @@ int tcf_action_init(struct net *net, struct tcf_proto *tp, struct nlattr *nla,
 		}
 		act->order = i;
 		if (ovr)
-			act->tcfa_refcnt++;
+			atomic_inc(&act->tcfa_refcnt);
 		list_add_tail(&act->list, actions);
 	}
 
diff --git a/net/sched/act_bpf.c b/net/sched/act_bpf.c
index c0c707e..4ddf281 100644
--- a/net/sched/act_bpf.c
+++ b/net/sched/act_bpf.c
@@ -141,8 +141,8 @@ static int tcf_bpf_dump(struct sk_buff *skb, struct tc_action *act,
 	struct tcf_bpf *prog = to_bpf(act);
 	struct tc_act_bpf opt = {
 		.index   = prog->tcf_index,
-		.refcnt  = prog->tcf_refcnt - ref,
-		.bindcnt = prog->tcf_bindcnt - bind,
+		.refcnt  = atomic_read(&prog->tcf_refcnt) - ref,
+		.bindcnt = atomic_read(&prog->tcf_bindcnt) - bind,
 		.action  = prog->tcf_action,
 	};
 	struct tcf_t tm;
diff --git a/net/sched/act_connmark.c b/net/sched/act_connmark.c
index 10b7a88..d2cf658 100644
--- a/net/sched/act_connmark.c
+++ b/net/sched/act_connmark.c
@@ -153,8 +153,8 @@ static inline int tcf_connmark_dump(struct sk_buff *skb, struct tc_action *a,
 
 	struct tc_connmark opt = {
 		.index   = ci->tcf_index,
-		.refcnt  = ci->tcf_refcnt - ref,
-		.bindcnt = ci->tcf_bindcnt - bind,
+		.refcnt  = atomic_read(&ci->tcf_refcnt) - ref,
+		.bindcnt = atomic_read(&ci->tcf_bindcnt) - bind,
 		.action  = ci->tcf_action,
 		.zone   = ci->zone,
 	};
diff --git a/net/sched/act_csum.c b/net/sched/act_csum.c
index 1c40caa..b9ca424 100644
--- a/net/sched/act_csum.c
+++ b/net/sched/act_csum.c
@@ -575,8 +575,8 @@ static int tcf_csum_dump(struct sk_buff *skb, struct tc_action *a, int bind,
 		.update_flags = p->update_flags,
 		.index   = p->tcf_index,
 		.action  = p->tcf_action,
-		.refcnt  = p->tcf_refcnt - ref,
-		.bindcnt = p->tcf_bindcnt - bind,
+		.refcnt  = atomic_read(&p->tcf_refcnt) - ref,
+		.bindcnt = atomic_read(&p->tcf_bindcnt) - bind,
 	};
 	struct tcf_t t;
 
diff --git a/net/sched/act_gact.c b/net/sched/act_gact.c
index e29a48e..b1326b7 100644
--- a/net/sched/act_gact.c
+++ b/net/sched/act_gact.c
@@ -169,8 +169,8 @@ static int tcf_gact_dump(struct sk_buff *skb, struct tc_action *a,
 	struct tcf_gact *gact = to_gact(a);
 	struct tc_gact opt = {
 		.index   = gact->tcf_index,
-		.refcnt  = gact->tcf_refcnt - ref,
-		.bindcnt = gact->tcf_bindcnt - bind,
+		.refcnt  = atomic_read(&gact->tcf_refcnt) - ref,
+		.bindcnt = atomic_read(&gact->tcf_bindcnt) - bind,
 		.action  = gact->tcf_action,
 	};
 	struct tcf_t t;
diff --git a/net/sched/act_ife.c b/net/sched/act_ife.c
index 8ccd358..5cdcc87 100644
--- a/net/sched/act_ife.c
+++ b/net/sched/act_ife.c
@@ -551,8 +551,8 @@ static int tcf_ife_dump(struct sk_buff *skb, struct tc_action *a, int bind,
 	struct tcf_ife_info *ife = to_ife(a);
 	struct tc_ife opt = {
 		.index = ife->tcf_index,
-		.refcnt = ife->tcf_refcnt - ref,
-		.bindcnt = ife->tcf_bindcnt - bind,
+		.refcnt = atomic_read(&ife->tcf_refcnt) - ref,
+		.bindcnt = atomic_read(&ife->tcf_bindcnt) - bind,
 		.action = ife->tcf_action,
 		.flags = ife->flags,
 	};
diff --git a/net/sched/act_ipt.c b/net/sched/act_ipt.c
index d9e399a..02c14dc 100644
--- a/net/sched/act_ipt.c
+++ b/net/sched/act_ipt.c
@@ -277,8 +277,8 @@ static int tcf_ipt_dump(struct sk_buff *skb, struct tc_action *a, int bind,
 	if (unlikely(!t))
 		goto nla_put_failure;
 
-	c.bindcnt = ipt->tcf_bindcnt - bind;
-	c.refcnt = ipt->tcf_refcnt - ref;
+	c.bindcnt = atomic_read(&ipt->tcf_bindcnt) - bind;
+	c.refcnt = atomic_read(&ipt->tcf_refcnt) - ref;
 	strcpy(t->u.user.name, ipt->tcfi_t->u.kernel.target->name);
 
 	if (nla_put(skb, TCA_IPT_TARG, ipt->tcfi_t->u.user.target_size, t) ||
diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
index 416627c..aeeeb38 100644
--- a/net/sched/act_mirred.c
+++ b/net/sched/act_mirred.c
@@ -249,8 +249,8 @@ static int tcf_mirred_dump(struct sk_buff *skb, struct tc_action *a, int bind,
 	struct tc_mirred opt = {
 		.index   = m->tcf_index,
 		.action  = m->tcf_action,
-		.refcnt  = m->tcf_refcnt - ref,
-		.bindcnt = m->tcf_bindcnt - bind,
+		.refcnt  = atomic_read(&m->tcf_refcnt) - ref,
+		.bindcnt = atomic_read(&m->tcf_bindcnt) - bind,
 		.eaction = m->tcfm_eaction,
 		.ifindex = m->tcfm_ifindex,
 	};
diff --git a/net/sched/act_nat.c b/net/sched/act_nat.c
index c365d01..58fa1ae 100644
--- a/net/sched/act_nat.c
+++ b/net/sched/act_nat.c
@@ -256,8 +256,8 @@ static int tcf_nat_dump(struct sk_buff *skb, struct tc_action *a,
 
 		.index    = p->tcf_index,
 		.action   = p->tcf_action,
-		.refcnt   = p->tcf_refcnt - ref,
-		.bindcnt  = p->tcf_bindcnt - bind,
+		.refcnt   = atomic_read(&p->tcf_refcnt) - ref,
+		.bindcnt  = atomic_read(&p->tcf_bindcnt) - bind,
 	};
 	struct tcf_t t;
 
diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c
index 491fe5de..27b9fea 100644
--- a/net/sched/act_pedit.c
+++ b/net/sched/act_pedit.c
@@ -391,8 +391,8 @@ static int tcf_pedit_dump(struct sk_buff *skb, struct tc_action *a,
 	opt->nkeys = p->tcfp_nkeys;
 	opt->flags = p->tcfp_flags;
 	opt->action = p->tcf_action;
-	opt->refcnt = p->tcf_refcnt - ref;
-	opt->bindcnt = p->tcf_bindcnt - bind;
+	opt->refcnt = atomic_read(&p->tcf_refcnt) - ref;
+	opt->bindcnt = atomic_read(&p->tcf_bindcnt) - bind;
 
 	if (p->tcfp_keys_ex) {
 		tcf_pedit_key_ex_dump(skb, p->tcfp_keys_ex, p->tcfp_nkeys);
diff --git a/net/sched/act_police.c b/net/sched/act_police.c
index 3bb2ebf..d8a1ac6 100644
--- a/net/sched/act_police.c
+++ b/net/sched/act_police.c
@@ -272,8 +272,8 @@ static int tcf_act_police_dump(struct sk_buff *skb, struct tc_action *a,
 		.action = police->tcf_action,
 		.mtu = police->tcfp_mtu,
 		.burst = PSCHED_NS2TICKS(police->tcfp_burst),
-		.refcnt = police->tcf_refcnt - ref,
-		.bindcnt = police->tcf_bindcnt - bind,
+		.refcnt = atomic_read(&police->tcf_refcnt) - ref,
+		.bindcnt = atomic_read(&police->tcf_bindcnt) - bind,
 	};
 	struct tcf_t t;
 
diff --git a/net/sched/act_sample.c b/net/sched/act_sample.c
index ec986ae..64889a4 100644
--- a/net/sched/act_sample.c
+++ b/net/sched/act_sample.c
@@ -179,8 +179,8 @@ static int tcf_sample_dump(struct sk_buff *skb, struct tc_action *a,
 	struct tc_sample opt = {
 		.index      = s->tcf_index,
 		.action     = s->tcf_action,
-		.refcnt     = s->tcf_refcnt - ref,
-		.bindcnt    = s->tcf_bindcnt - bind,
+		.refcnt     = atomic_read(&s->tcf_refcnt) - ref,
+		.bindcnt    = atomic_read(&s->tcf_bindcnt) - bind,
 	};
 	struct tcf_t t;
 
diff --git a/net/sched/act_simple.c b/net/sched/act_simple.c
index e7b57e5..ec7e397 100644
--- a/net/sched/act_simple.c
+++ b/net/sched/act_simple.c
@@ -148,8 +148,8 @@ static int tcf_simp_dump(struct sk_buff *skb, struct tc_action *a,
 	struct tcf_defact *d = to_defact(a);
 	struct tc_defact opt = {
 		.index   = d->tcf_index,
-		.refcnt  = d->tcf_refcnt - ref,
-		.bindcnt = d->tcf_bindcnt - bind,
+		.refcnt  = atomic_read(&d->tcf_refcnt) - ref,
+		.bindcnt = atomic_read(&d->tcf_bindcnt) - bind,
 		.action  = d->tcf_action,
 	};
 	struct tcf_t t;
diff --git a/net/sched/act_skbedit.c b/net/sched/act_skbedit.c
index 59949d6..2b05e56 100644
--- a/net/sched/act_skbedit.c
+++ b/net/sched/act_skbedit.c
@@ -172,8 +172,8 @@ static int tcf_skbedit_dump(struct sk_buff *skb, struct tc_action *a,
 	struct tcf_skbedit *d = to_skbedit(a);
 	struct tc_skbedit opt = {
 		.index   = d->tcf_index,
-		.refcnt  = d->tcf_refcnt - ref,
-		.bindcnt = d->tcf_bindcnt - bind,
+		.refcnt  = atomic_read(&d->tcf_refcnt) - ref,
+		.bindcnt = atomic_read(&d->tcf_bindcnt) - bind,
 		.action  = d->tcf_action,
 	};
 	struct tcf_t t;
diff --git a/net/sched/act_skbmod.c b/net/sched/act_skbmod.c
index b642ad3..ca1ddf9 100644
--- a/net/sched/act_skbmod.c
+++ b/net/sched/act_skbmod.c
@@ -201,8 +201,8 @@ static int tcf_skbmod_dump(struct sk_buff *skb, struct tc_action *a,
 	struct tcf_skbmod_params  *p = rtnl_dereference(d->skbmod_p);
 	struct tc_skbmod opt = {
 		.index   = d->tcf_index,
-		.refcnt  = d->tcf_refcnt - ref,
-		.bindcnt = d->tcf_bindcnt - bind,
+		.refcnt  = atomic_read(&d->tcf_refcnt) - ref,
+		.bindcnt = atomic_read(&d->tcf_bindcnt) - bind,
 		.action  = d->tcf_action,
 	};
 	struct tcf_t t;
diff --git a/net/sched/act_tunnel_key.c b/net/sched/act_tunnel_key.c
index 30c9627..158d472 100644
--- a/net/sched/act_tunnel_key.c
+++ b/net/sched/act_tunnel_key.c
@@ -250,8 +250,8 @@ static int tunnel_key_dump(struct sk_buff *skb, struct tc_action *a,
 	struct tcf_tunnel_key_params *params;
 	struct tc_tunnel_key opt = {
 		.index    = t->tcf_index,
-		.refcnt   = t->tcf_refcnt - ref,
-		.bindcnt  = t->tcf_bindcnt - bind,
+		.refcnt   = atomic_read(&t->tcf_refcnt) - ref,
+		.bindcnt  = atomic_read(&t->tcf_bindcnt) - bind,
 	};
 	struct tcf_t tm;
 
diff --git a/net/sched/act_vlan.c b/net/sched/act_vlan.c
index 16eb067..e2d7aab 100644
--- a/net/sched/act_vlan.c
+++ b/net/sched/act_vlan.c
@@ -208,8 +208,8 @@ static int tcf_vlan_dump(struct sk_buff *skb, struct tc_action *a,
 	struct tcf_vlan *v = to_vlan(a);
 	struct tc_vlan opt = {
 		.index    = v->tcf_index,
-		.refcnt   = v->tcf_refcnt - ref,
-		.bindcnt  = v->tcf_bindcnt - bind,
+		.refcnt   = atomic_read(&v->tcf_refcnt) - ref,
+		.bindcnt  = atomic_read(&v->tcf_bindcnt) - bind,
 		.action   = v->tcf_action,
 		.v_action = v->tcfv_action,
 	};
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [patch net v2 2/4] net/sched: Use action array instead of action list as parameter
  2017-10-16 11:18 [patch net v2 0/4] net/sched: Fix a system panic when deleting filters Chris Mi
  2017-10-16 11:18 ` [patch net v2 1/4] net/sched: Change tc_action refcnt and bindcnt to atomic Chris Mi
@ 2017-10-16 11:18 ` Chris Mi
  2017-10-16 11:18 ` [patch net v2 3/4] selftests: Introduce a new script to generate tc batch file Chris Mi
  2017-10-16 11:18 ` [patch net v2 4/4] selftests: Introduce a new test case to tc testsuite Chris Mi
  3 siblings, 0 replies; 18+ messages in thread
From: Chris Mi @ 2017-10-16 11:18 UTC (permalink / raw)
  To: netdev; +Cc: jhs, lucasb, xiyou.wangcong, jiri, davem

When destroying filters, actions should be destroyed first.
The pointers of each action are saved in an array. TC doesn't
use the array directly, but put all actions in a doubly linked
list and use that list as parameter.

There is no problem if each filter has its own actions. But if
some filters share the same action, when these filters are
destroyed, RCU callback fl_destroy_filter() may be called at the
same time. That means the same action's 'struct list_head list'
could be manipulated at the same time. It may point to an invalid
address so that system will panic.

This patch uses the action array directly to fix this issue.

Fixes commit in pre-git era.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Chris Mi <chrism@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
---
 include/net/act_api.h |   7 ++--
 net/sched/act_api.c   | 103 +++++++++++++++++++++++++++++++-------------------
 net/sched/cls_api.c   |  18 +++------
 3 files changed, 75 insertions(+), 53 deletions(-)

diff --git a/include/net/act_api.h b/include/net/act_api.h
index a469ab6..081a313 100644
--- a/include/net/act_api.h
+++ b/include/net/act_api.h
@@ -148,16 +148,17 @@ static inline int tcf_idr_release(struct tc_action *a, bool bind)
 int tcf_register_action(struct tc_action_ops *a, struct pernet_operations *ops);
 int tcf_unregister_action(struct tc_action_ops *a,
 			  struct pernet_operations *ops);
-int tcf_action_destroy(struct list_head *actions, int bind);
+int tcf_action_destroy(struct tc_action **actions, int nr, int bind);
 int tcf_action_exec(struct sk_buff *skb, struct tc_action **actions,
 		    int nr_actions, struct tcf_result *res);
 int tcf_action_init(struct net *net, struct tcf_proto *tp, struct nlattr *nla,
 		    struct nlattr *est, char *name, int ovr, int bind,
-		    struct list_head *actions);
+		    struct tc_action **actions, int *nr);
 struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp,
 				    struct nlattr *nla, struct nlattr *est,
 				    char *name, int ovr, int bind);
-int tcf_action_dump(struct sk_buff *skb, struct list_head *, int, int);
+int tcf_action_dump(struct sk_buff *skb, struct tc_action **actions, int nr,
+		    int bind, int ref);
 int tcf_action_dump_old(struct sk_buff *skb, struct tc_action *a, int, int);
 int tcf_action_dump_1(struct sk_buff *skb, struct tc_action *a, int, int);
 int tcf_action_copy_stats(struct sk_buff *, struct tc_action *, int);
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 9c0224d..391d560 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -513,13 +513,15 @@ int tcf_action_exec(struct sk_buff *skb, struct tc_action **actions,
 }
 EXPORT_SYMBOL(tcf_action_exec);
 
-int tcf_action_destroy(struct list_head *actions, int bind)
+int tcf_action_destroy(struct tc_action **actions, int nr, int bind)
 {
 	const struct tc_action_ops *ops;
-	struct tc_action *a, *tmp;
+	struct tc_action *a;
 	int ret = 0;
+	int i;
 
-	list_for_each_entry_safe(a, tmp, actions, list) {
+	for (i = 0; i < nr; i++) {
+		a = actions[i];
 		ops = a->ops;
 		ret = __tcf_idr_release(a, bind, true);
 		if (ret == ACT_P_DELETED)
@@ -568,14 +570,16 @@ int tcf_action_destroy(struct list_head *actions, int bind)
 }
 EXPORT_SYMBOL(tcf_action_dump_1);
 
-int tcf_action_dump(struct sk_buff *skb, struct list_head *actions,
+int tcf_action_dump(struct sk_buff *skb, struct tc_action **actions, int nr,
 		    int bind, int ref)
 {
 	struct tc_action *a;
-	int err = -EINVAL;
 	struct nlattr *nest;
+	int err = -EINVAL;
+	int i;
 
-	list_for_each_entry(a, actions, list) {
+	for (i = 0; i < nr; i++) {
+		a = actions[i];
 		nest = nla_nest_start(skb, a->order);
 		if (nest == NULL)
 			goto nla_put_failure;
@@ -700,10 +704,7 @@ struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp,
 	if (TC_ACT_EXT_CMP(a->tcfa_action, TC_ACT_GOTO_CHAIN)) {
 		err = tcf_action_goto_chain_init(a, tp);
 		if (err) {
-			LIST_HEAD(actions);
-
-			list_add_tail(&a->list, &actions);
-			tcf_action_destroy(&actions, bind);
+			tcf_action_destroy(&a, 1, bind);
 			return ERR_PTR(err);
 		}
 	}
@@ -720,23 +721,27 @@ struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp,
 	return ERR_PTR(err);
 }
 
-static void cleanup_a(struct list_head *actions, int ovr)
+static void cleanup_a(struct tc_action **actions, int nr, int ovr)
 {
 	struct tc_action *a;
+	int i;
 
 	if (!ovr)
 		return;
 
-	list_for_each_entry(a, actions, list)
+	for (i = 0; i < nr; i++) {
+		a = actions[i];
 		atomic_dec(&a->tcfa_refcnt);
+	}
 }
 
 int tcf_action_init(struct net *net, struct tcf_proto *tp, struct nlattr *nla,
 		    struct nlattr *est, char *name, int ovr, int bind,
-		    struct list_head *actions)
+		    struct tc_action **actions, int *nr)
 {
 	struct nlattr *tb[TCA_ACT_MAX_PRIO + 1];
 	struct tc_action *act;
+	int n = 0;
 	int err;
 	int i;
 
@@ -753,17 +758,19 @@ int tcf_action_init(struct net *net, struct tcf_proto *tp, struct nlattr *nla,
 		act->order = i;
 		if (ovr)
 			atomic_inc(&act->tcfa_refcnt);
-		list_add_tail(&act->list, actions);
+		actions[n++] = act;
 	}
+	*nr = n;
 
 	/* Remove the temp refcnt which was necessary to protect against
 	 * destroying an existing action which was being replaced
 	 */
-	cleanup_a(actions, ovr);
+	cleanup_a(actions, n, ovr);
 	return 0;
 
 err:
-	tcf_action_destroy(actions, bind);
+	tcf_action_destroy(actions, n, bind);
+	*nr = 0;
 	return err;
 }
 
@@ -811,9 +818,9 @@ int tcf_action_copy_stats(struct sk_buff *skb, struct tc_action *p,
 	return -1;
 }
 
-static int tca_get_fill(struct sk_buff *skb, struct list_head *actions,
-			u32 portid, u32 seq, u16 flags, int event, int bind,
-			int ref)
+static int tca_get_fill(struct sk_buff *skb, struct tc_action **actions,
+			int nr, u32 portid, u32 seq, u16 flags, int event,
+			int bind, int ref)
 {
 	struct tcamsg *t;
 	struct nlmsghdr *nlh;
@@ -832,7 +839,7 @@ static int tca_get_fill(struct sk_buff *skb, struct list_head *actions,
 	if (nest == NULL)
 		goto out_nlmsg_trim;
 
-	if (tcf_action_dump(skb, actions, bind, ref) < 0)
+	if (tcf_action_dump(skb, actions, nr, bind, ref) < 0)
 		goto out_nlmsg_trim;
 
 	nla_nest_end(skb, nest);
@@ -847,14 +854,14 @@ static int tca_get_fill(struct sk_buff *skb, struct list_head *actions,
 
 static int
 tcf_get_notify(struct net *net, u32 portid, struct nlmsghdr *n,
-	       struct list_head *actions, int event)
+	       struct tc_action **actions, int nr, int event)
 {
 	struct sk_buff *skb;
 
 	skb = alloc_skb(NLMSG_GOODSIZE, GFP_KERNEL);
 	if (!skb)
 		return -ENOBUFS;
-	if (tca_get_fill(skb, actions, portid, n->nlmsg_seq, 0, event,
+	if (tca_get_fill(skb, actions, nr, portid, n->nlmsg_seq, 0, event,
 			 0, 0) <= 0) {
 		kfree_skb(skb);
 		return -EINVAL;
@@ -968,7 +975,8 @@ static int tca_action_flush(struct net *net, struct nlattr *nla,
 }
 
 static int
-tcf_del_notify(struct net *net, struct nlmsghdr *n, struct list_head *actions,
+tcf_del_notify(struct net *net, struct nlmsghdr *n,
+	       struct tc_action **actions, int nr,
 	       u32 portid)
 {
 	int ret;
@@ -978,14 +986,14 @@ static int tca_action_flush(struct net *net, struct nlattr *nla,
 	if (!skb)
 		return -ENOBUFS;
 
-	if (tca_get_fill(skb, actions, portid, n->nlmsg_seq, 0, RTM_DELACTION,
-			 0, 1) <= 0) {
+	if (tca_get_fill(skb, actions, nr, portid, n->nlmsg_seq, 0,
+			 RTM_DELACTION, 0, 1) <= 0) {
 		kfree_skb(skb);
 		return -EINVAL;
 	}
 
 	/* now do the delete */
-	ret = tcf_action_destroy(actions, 0);
+	ret = tcf_action_destroy(actions, nr, 0);
 	if (ret < 0) {
 		kfree_skb(skb);
 		return ret;
@@ -1002,10 +1010,11 @@ static int tca_action_flush(struct net *net, struct nlattr *nla,
 tca_action_gd(struct net *net, struct nlattr *nla, struct nlmsghdr *n,
 	      u32 portid, int event)
 {
-	int i, ret;
 	struct nlattr *tb[TCA_ACT_MAX_PRIO + 1];
+	struct tc_action **actions;
 	struct tc_action *act;
-	LIST_HEAD(actions);
+	int i, ret;
+	int nr = 0;
 
 	ret = nla_parse_nested(tb, TCA_ACT_MAX_PRIO, nla, NULL, NULL);
 	if (ret < 0)
@@ -1018,6 +1027,11 @@ static int tca_action_flush(struct net *net, struct nlattr *nla,
 			return -EINVAL;
 	}
 
+	actions = kcalloc(TCA_ACT_MAX_PRIO, sizeof(struct tc_action *),
+			  GFP_KERNEL);
+	if (!actions)
+		return -ENOMEM;
+
 	for (i = 1; i <= TCA_ACT_MAX_PRIO && tb[i]; i++) {
 		act = tcf_action_get_1(net, tb[i], n, portid);
 		if (IS_ERR(act)) {
@@ -1025,25 +1039,28 @@ static int tca_action_flush(struct net *net, struct nlattr *nla,
 			goto err;
 		}
 		act->order = i;
-		list_add_tail(&act->list, &actions);
+		actions[nr++] = act;
 	}
 
 	if (event == RTM_GETACTION)
-		ret = tcf_get_notify(net, portid, n, &actions, event);
+		ret = tcf_get_notify(net, portid, n, actions, nr, event);
 	else { /* delete */
-		ret = tcf_del_notify(net, n, &actions, portid);
+		ret = tcf_del_notify(net, n, actions, nr, portid);
 		if (ret)
 			goto err;
+		kfree(actions);
 		return ret;
 	}
 err:
 	if (event != RTM_GETACTION)
-		tcf_action_destroy(&actions, 0);
+		tcf_action_destroy(actions, nr, 0);
+	kfree(actions);
 	return ret;
 }
 
 static int
-tcf_add_notify(struct net *net, struct nlmsghdr *n, struct list_head *actions,
+tcf_add_notify(struct net *net, struct nlmsghdr *n,
+	       struct tc_action **actions, int nr,
 	       u32 portid)
 {
 	struct sk_buff *skb;
@@ -1053,7 +1070,7 @@ static int tca_action_flush(struct net *net, struct nlattr *nla,
 	if (!skb)
 		return -ENOBUFS;
 
-	if (tca_get_fill(skb, actions, portid, n->nlmsg_seq, n->nlmsg_flags,
+	if (tca_get_fill(skb, actions, nr, portid, n->nlmsg_seq, n->nlmsg_flags,
 			 RTM_NEWACTION, 0, 0) <= 0) {
 		kfree_skb(skb);
 		return -EINVAL;
@@ -1069,14 +1086,24 @@ static int tca_action_flush(struct net *net, struct nlattr *nla,
 static int tcf_action_add(struct net *net, struct nlattr *nla,
 			  struct nlmsghdr *n, u32 portid, int ovr)
 {
+	struct tc_action **actions;
 	int ret = 0;
-	LIST_HEAD(actions);
+	int nr;
 
-	ret = tcf_action_init(net, NULL, nla, NULL, NULL, ovr, 0, &actions);
+	actions = kcalloc(TCA_ACT_MAX_PRIO, sizeof(struct tc_action *),
+			  GFP_KERNEL);
+	if (!actions)
+		return -ENOMEM;
+
+	ret = tcf_action_init(net, NULL, nla, NULL, NULL, ovr, 0,
+			      actions, &nr);
 	if (ret)
-		return ret;
+		goto out;
 
-	return tcf_add_notify(net, n, &actions, portid);
+	ret = tcf_add_notify(net, n, actions, nr, portid);
+out:
+	kfree(actions);
+	return ret;
 }
 
 static u32 tcaa_root_flags_allowed = TCA_FLAG_LARGE_DUMP_ON;
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 0b2219a..acaa0c6 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -879,8 +879,7 @@ void tcf_exts_destroy(struct tcf_exts *exts)
 #ifdef CONFIG_NET_CLS_ACT
 	LIST_HEAD(actions);
 
-	tcf_exts_to_list(exts, &actions);
-	tcf_action_destroy(&actions, TCA_ACT_UNBIND);
+	tcf_action_destroy(exts->actions, exts->nr_actions, TCA_ACT_UNBIND);
 	kfree(exts->actions);
 	exts->nr_actions = 0;
 #endif
@@ -905,17 +904,14 @@ int tcf_exts_validate(struct net *net, struct tcf_proto *tp, struct nlattr **tb,
 			exts->actions[0] = act;
 			exts->nr_actions = 1;
 		} else if (exts->action && tb[exts->action]) {
-			LIST_HEAD(actions);
-			int err, i = 0;
+			int err;
 
 			err = tcf_action_init(net, tp, tb[exts->action],
 					      rate_tlv, NULL, ovr, TCA_ACT_BIND,
-					      &actions);
+					      exts->actions,
+					      &exts->nr_actions);
 			if (err)
 				return err;
-			list_for_each_entry(act, &actions, list)
-				exts->actions[i++] = act;
-			exts->nr_actions = i;
 		}
 	}
 #else
@@ -961,14 +957,12 @@ int tcf_exts_dump(struct sk_buff *skb, struct tcf_exts *exts)
 		 * tc data even if iproute2  was newer - jhs
 		 */
 		if (exts->type != TCA_OLD_COMPAT) {
-			LIST_HEAD(actions);
-
 			nest = nla_nest_start(skb, exts->action);
 			if (nest == NULL)
 				goto nla_put_failure;
 
-			tcf_exts_to_list(exts, &actions);
-			if (tcf_action_dump(skb, &actions, 0, 0) < 0)
+			if (tcf_action_dump(skb, exts->actions,
+					    exts->nr_actions, 0, 0) < 0)
 				goto nla_put_failure;
 			nla_nest_end(skb, nest);
 		} else if (exts->police) {
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [patch net v2 3/4] selftests: Introduce a new script to generate tc batch file
  2017-10-16 11:18 [patch net v2 0/4] net/sched: Fix a system panic when deleting filters Chris Mi
  2017-10-16 11:18 ` [patch net v2 1/4] net/sched: Change tc_action refcnt and bindcnt to atomic Chris Mi
  2017-10-16 11:18 ` [patch net v2 2/4] net/sched: Use action array instead of action list as parameter Chris Mi
@ 2017-10-16 11:18 ` Chris Mi
  2017-10-16 16:24   ` Lucas Bates
  2017-10-16 11:18 ` [patch net v2 4/4] selftests: Introduce a new test case to tc testsuite Chris Mi
  3 siblings, 1 reply; 18+ messages in thread
From: Chris Mi @ 2017-10-16 11:18 UTC (permalink / raw)
  To: netdev; +Cc: jhs, lucasb, xiyou.wangcong, jiri, davem

  # ./tdc_batch.py -h
  usage: tdc_batch.py [-h] [-n NUMBER] [-o] [-s] [-p] device file

  TC batch file generator

  positional arguments:
    device                device name
    file                  batch file name

  optional arguments:
    -h, --help            show this help message and exit
    -n NUMBER, --number NUMBER
                          how many lines in batch file
    -o, --skip_sw         skip_sw (offload), by default skip_hw
    -s, --share_action    all filters share the same action
    -p, --prio            all filters have different prio

Signed-off-by: Chris Mi <chrism@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
---
 tools/testing/selftests/tc-testing/tdc_batch.py | 62 +++++++++++++++++++++++++
 1 file changed, 62 insertions(+)
 create mode 100755 tools/testing/selftests/tc-testing/tdc_batch.py

diff --git a/tools/testing/selftests/tc-testing/tdc_batch.py b/tools/testing/selftests/tc-testing/tdc_batch.py
new file mode 100755
index 0000000..707c6bf
--- /dev/null
+++ b/tools/testing/selftests/tc-testing/tdc_batch.py
@@ -0,0 +1,62 @@
+#!/usr/bin/python3
+
+"""
+tdc_batch.py - a script to generate TC batch file
+
+Copyright (C) 2017 Chris Mi <chrism@mellanox.com>
+"""
+
+import argparse
+
+parser = argparse.ArgumentParser(description='TC batch file generator')
+parser.add_argument("device", help="device name")
+parser.add_argument("file", help="batch file name")
+parser.add_argument("-n", "--number", type=int,
+                    help="how many lines in batch file")
+parser.add_argument("-o", "--skip_sw",
+                    help="skip_sw (offload), by default skip_hw",
+                    action="store_true")
+parser.add_argument("-s", "--share_action",
+                    help="all filters share the same action",
+                    action="store_true")
+parser.add_argument("-p", "--prio",
+                    help="all filters have different prio",
+                    action="store_true")
+args = parser.parse_args()
+
+device = args.device
+file = open(args.file, 'w')
+
+number = 1
+if args.number:
+    number = args.number
+
+skip = "skip_hw"
+if args.skip_sw:
+    skip = "skip_sw"
+
+share_action = ""
+if args.share_action:
+    share_action = "index 1"
+
+prio = "prio 1"
+if args.prio:
+    prio = ""
+    if number > 0x4000:
+        number = 0x4000
+
+index = 0
+for i in range(0x100):
+    for j in range(0x100):
+        for k in range(0x100):
+            mac = ("%02x:%02x:%02x" % (i, j, k))
+            src_mac = "e4:11:00:" + mac
+            dst_mac = "e4:12:00:" + mac
+            cmd = ("filter add dev %s %s protocol ip parent ffff: flower %s "
+                   "src_mac %s dst_mac %s action drop %s" %
+                   (device, prio, skip, src_mac, dst_mac, share_action))
+            file.write("%s\n" % cmd)
+            index += 1
+            if index >= number:
+                file.close()
+                exit(0)
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [patch net v2 4/4] selftests: Introduce a new test case to tc testsuite
  2017-10-16 11:18 [patch net v2 0/4] net/sched: Fix a system panic when deleting filters Chris Mi
                   ` (2 preceding siblings ...)
  2017-10-16 11:18 ` [patch net v2 3/4] selftests: Introduce a new script to generate tc batch file Chris Mi
@ 2017-10-16 11:18 ` Chris Mi
  2017-10-16 16:25   ` Lucas Bates
  3 siblings, 1 reply; 18+ messages in thread
From: Chris Mi @ 2017-10-16 11:18 UTC (permalink / raw)
  To: netdev; +Cc: jhs, lucasb, xiyou.wangcong, jiri, davem

In this patchset, we fixed a tc bug. This patch adds the test case
that reproduces the bug. To run this test case, user should specify
an existing NIC device:
  # sudo ./tdc.py -d enp4s0f0

This test case belongs to category "flower". If user doesn't specify
a NIC device, the test cases belong to "flower" will not be run.

In this test case, we create 1M filters and all filters share the same
action. When destroying all filters, kernel should not panic. It takes
about 18s to run it.

Signed-off-by: Chris Mi <chrism@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
---
 .../tc-testing/tc-tests/filters/tests.json         | 23 +++++++++++++++++++++-
 tools/testing/selftests/tc-testing/tdc.py          | 20 +++++++++++++++----
 tools/testing/selftests/tc-testing/tdc_config.py   |  2 ++
 3 files changed, 40 insertions(+), 5 deletions(-)

diff --git a/tools/testing/selftests/tc-testing/tc-tests/filters/tests.json b/tools/testing/selftests/tc-testing/tc-tests/filters/tests.json
index c727b96..5fa02d8 100644
--- a/tools/testing/selftests/tc-testing/tc-tests/filters/tests.json
+++ b/tools/testing/selftests/tc-testing/tc-tests/filters/tests.json
@@ -17,5 +17,26 @@
         "teardown": [
             "$TC qdisc del dev $DEV1 ingress"
         ]
+    },
+    {
+        "id": "d052",
+        "name": "Add 1M filters with the same action",
+        "category": [
+            "filter",
+            "flower"
+        ],
+        "setup": [
+            "$TC qdisc add dev $DEV2 ingress",
+            "./tdc_batch.py $DEV2 $BATCH_FILE --share_action -n 1000000"
+        ],
+        "cmdUnderTest": "$TC -b $BATCH_FILE",
+        "expExitCode": "0",
+        "verifyCmd": "$TC actions list action gact",
+        "matchPattern": "action order 0: gact action drop.*index 1 ref 1000000 bind 1000000",
+        "matchCount": "1",
+        "teardown": [
+            "$TC qdisc del dev $DEV2 ingress",
+            "/bin/rm $BATCH_FILE"
+        ]
     }
-]
\ No newline at end of file
+]
diff --git a/tools/testing/selftests/tc-testing/tdc.py b/tools/testing/selftests/tc-testing/tdc.py
index cd61b78..5f11f5d 100755
--- a/tools/testing/selftests/tc-testing/tdc.py
+++ b/tools/testing/selftests/tc-testing/tdc.py
@@ -88,7 +88,7 @@ def prepare_env(cmdlist):
             exit(1)
 
 
-def test_runner(filtered_tests):
+def test_runner(filtered_tests, args):
     """
     Driver function for the unit tests.
 
@@ -105,6 +105,8 @@ def test_runner(filtered_tests):
     for tidx in testlist:
         result = True
         tresult = ""
+        if "flower" in tidx["category"] and args.device == None:
+            continue
         print("Test " + tidx["id"] + ": " + tidx["name"])
         prepare_env(tidx["setup"])
         (p, procout) = exec_cmd(tidx["cmdUnderTest"])
@@ -152,6 +154,10 @@ def ns_create():
         exec_cmd(cmd, False)
         cmd = 'ip -s $NS link set $DEV1 up'
         exec_cmd(cmd, False)
+        cmd = 'ip link set $DEV2 netns $NS'
+        exec_cmd(cmd, False)
+        cmd = 'ip -s $NS link set $DEV2 up'
+        exec_cmd(cmd, False)
 
 
 def ns_destroy():
@@ -211,7 +217,8 @@ def set_args(parser):
                         help='Execute the single test case with specified ID')
     parser.add_argument('-i', '--id', action='store_true', dest='gen_id',
                         help='Generate ID numbers for new test cases')
-    return parser
+    parser.add_argument('-d', '--device',
+                        help='Execute the test case in flower category')
     return parser
 
 
@@ -225,6 +232,8 @@ def check_default_settings(args):
 
     if args.path != None:
          NAMES['TC'] = args.path
+    if args.device != None:
+         NAMES['DEV2'] = args.device
     if not os.path.isfile(NAMES['TC']):
         print("The specified tc path " + NAMES['TC'] + " does not exist.")
         exit(1)
@@ -381,14 +390,17 @@ def set_operation_mode(args):
             if (len(alltests) == 0):
                 print("Cannot find a test case with ID matching " + target_id)
                 exit(1)
-        catresults = test_runner(alltests)
+        catresults = test_runner(alltests, args)
         print("All test results: " + "\n\n" + catresults)
     elif (len(target_category) > 0):
+        if (target_category == "flower") and args.device == None:
+            print("Please specify a NIC device (-d) to run category flower")
+            exit(1)
         if (target_category not in ucat):
             print("Specified category is not present in this file.")
             exit(1)
         else:
-            catresults = test_runner(testcases[target_category])
+            catresults = test_runner(testcases[target_category], args)
             print("Category " + target_category + "\n\n" + catresults)
 
     ns_destroy()
diff --git a/tools/testing/selftests/tc-testing/tdc_config.py b/tools/testing/selftests/tc-testing/tdc_config.py
index 0108737..b635251 100644
--- a/tools/testing/selftests/tc-testing/tdc_config.py
+++ b/tools/testing/selftests/tc-testing/tdc_config.py
@@ -12,6 +12,8 @@ NAMES = {
           # Name of veth devices to be created for the namespace
           'DEV0': 'v0p0',
           'DEV1': 'v0p1',
+          'DEV2': '',
+          'BATCH_FILE': './batch.txt',
           # Name of the namespace to use
           'NS': 'tcut'
         }
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [patch net v2 3/4] selftests: Introduce a new script to generate tc batch file
  2017-10-16 11:18 ` [patch net v2 3/4] selftests: Introduce a new script to generate tc batch file Chris Mi
@ 2017-10-16 16:24   ` Lucas Bates
  0 siblings, 0 replies; 18+ messages in thread
From: Lucas Bates @ 2017-10-16 16:24 UTC (permalink / raw)
  To: Chris Mi; +Cc: netdev, Jamal Hadi Salim, Cong Wang, Jiri Pirko, davem

On Mon, Oct 16, 2017 at 7:18 AM, Chris Mi <chrism@mellanox.com> wrote:
>   # ./tdc_batch.py -h
>   usage: tdc_batch.py [-h] [-n NUMBER] [-o] [-s] [-p] device file
>
>   TC batch file generator
>
>   positional arguments:
>     device                device name
>     file                  batch file name
>
>   optional arguments:
>     -h, --help            show this help message and exit
>     -n NUMBER, --number NUMBER
>                           how many lines in batch file
>     -o, --skip_sw         skip_sw (offload), by default skip_hw
>     -s, --share_action    all filters share the same action
>     -p, --prio            all filters have different prio
>
> Signed-off-by: Chris Mi <chrism@mellanox.com>
> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>

Acked-by: Lucas Bates <lucasb@mojatatu.com>

> ---
>  tools/testing/selftests/tc-testing/tdc_batch.py | 62 +++++++++++++++++++++++++
>  1 file changed, 62 insertions(+)
>  create mode 100755 tools/testing/selftests/tc-testing/tdc_batch.py
>
> diff --git a/tools/testing/selftests/tc-testing/tdc_batch.py b/tools/testing/selftests/tc-testing/tdc_batch.py
> new file mode 100755
> index 0000000..707c6bf
> --- /dev/null
> +++ b/tools/testing/selftests/tc-testing/tdc_batch.py
> @@ -0,0 +1,62 @@
> +#!/usr/bin/python3
> +
> +"""
> +tdc_batch.py - a script to generate TC batch file
> +
> +Copyright (C) 2017 Chris Mi <chrism@mellanox.com>
> +"""
> +
> +import argparse
> +
> +parser = argparse.ArgumentParser(description='TC batch file generator')
> +parser.add_argument("device", help="device name")
> +parser.add_argument("file", help="batch file name")
> +parser.add_argument("-n", "--number", type=int,
> +                    help="how many lines in batch file")
> +parser.add_argument("-o", "--skip_sw",
> +                    help="skip_sw (offload), by default skip_hw",
> +                    action="store_true")
> +parser.add_argument("-s", "--share_action",
> +                    help="all filters share the same action",
> +                    action="store_true")
> +parser.add_argument("-p", "--prio",
> +                    help="all filters have different prio",
> +                    action="store_true")
> +args = parser.parse_args()
> +
> +device = args.device
> +file = open(args.file, 'w')
> +
> +number = 1
> +if args.number:
> +    number = args.number
> +
> +skip = "skip_hw"
> +if args.skip_sw:
> +    skip = "skip_sw"
> +
> +share_action = ""
> +if args.share_action:
> +    share_action = "index 1"
> +
> +prio = "prio 1"
> +if args.prio:
> +    prio = ""
> +    if number > 0x4000:
> +        number = 0x4000
> +
> +index = 0
> +for i in range(0x100):
> +    for j in range(0x100):
> +        for k in range(0x100):
> +            mac = ("%02x:%02x:%02x" % (i, j, k))
> +            src_mac = "e4:11:00:" + mac
> +            dst_mac = "e4:12:00:" + mac
> +            cmd = ("filter add dev %s %s protocol ip parent ffff: flower %s "
> +                   "src_mac %s dst_mac %s action drop %s" %
> +                   (device, prio, skip, src_mac, dst_mac, share_action))
> +            file.write("%s\n" % cmd)
> +            index += 1
> +            if index >= number:
> +                file.close()
> +                exit(0)
> --
> 1.8.3.1
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [patch net v2 4/4] selftests: Introduce a new test case to tc testsuite
  2017-10-16 11:18 ` [patch net v2 4/4] selftests: Introduce a new test case to tc testsuite Chris Mi
@ 2017-10-16 16:25   ` Lucas Bates
  2017-10-17  1:03     ` Chris Mi
  0 siblings, 1 reply; 18+ messages in thread
From: Lucas Bates @ 2017-10-16 16:25 UTC (permalink / raw)
  To: Chris Mi; +Cc: netdev, Jamal Hadi Salim, Cong Wang, Jiri Pirko, davem

On Mon, Oct 16, 2017 at 7:18 AM, Chris Mi <chrism@mellanox.com> wrote:
> In this patchset, we fixed a tc bug. This patch adds the test case
> that reproduces the bug. To run this test case, user should specify
> an existing NIC device:
>   # sudo ./tdc.py -d enp4s0f0
>
> This test case belongs to category "flower". If user doesn't specify
> a NIC device, the test cases belong to "flower" will not be run.
>
> In this test case, we create 1M filters and all filters share the same
> action. When destroying all filters, kernel should not panic. It takes
> about 18s to run it.
>
> Signed-off-by: Chris Mi <chrism@mellanox.com>
> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>

I'm a little wary about adding changes like these into tdc.py
directly; I don't think it's going to be sustainable in the long run.
Even the namespace creation I put in to the original version is too
specific and limiting.

There are some upcoming changes to tdc to help address these
particular issues.  I'll ack this for now, thanks.

Acked-by: Lucas Bates <lucasb@mojatatu.com>


> ---
>  .../tc-testing/tc-tests/filters/tests.json         | 23 +++++++++++++++++++++-
>  tools/testing/selftests/tc-testing/tdc.py          | 20 +++++++++++++++----
>  tools/testing/selftests/tc-testing/tdc_config.py   |  2 ++
>  3 files changed, 40 insertions(+), 5 deletions(-)
>
> diff --git a/tools/testing/selftests/tc-testing/tc-tests/filters/tests.json b/tools/testing/selftests/tc-testing/tc-tests/filters/tests.json
> index c727b96..5fa02d8 100644
> --- a/tools/testing/selftests/tc-testing/tc-tests/filters/tests.json
> +++ b/tools/testing/selftests/tc-testing/tc-tests/filters/tests.json
> @@ -17,5 +17,26 @@
>          "teardown": [
>              "$TC qdisc del dev $DEV1 ingress"
>          ]
> +    },
> +    {
> +        "id": "d052",
> +        "name": "Add 1M filters with the same action",
> +        "category": [
> +            "filter",
> +            "flower"
> +        ],
> +        "setup": [
> +            "$TC qdisc add dev $DEV2 ingress",
> +            "./tdc_batch.py $DEV2 $BATCH_FILE --share_action -n 1000000"
> +        ],
> +        "cmdUnderTest": "$TC -b $BATCH_FILE",
> +        "expExitCode": "0",
> +        "verifyCmd": "$TC actions list action gact",
> +        "matchPattern": "action order 0: gact action drop.*index 1 ref 1000000 bind 1000000",
> +        "matchCount": "1",
> +        "teardown": [
> +            "$TC qdisc del dev $DEV2 ingress",
> +            "/bin/rm $BATCH_FILE"
> +        ]
>      }
> -]
> \ No newline at end of file
> +]
> diff --git a/tools/testing/selftests/tc-testing/tdc.py b/tools/testing/selftests/tc-testing/tdc.py
> index cd61b78..5f11f5d 100755
> --- a/tools/testing/selftests/tc-testing/tdc.py
> +++ b/tools/testing/selftests/tc-testing/tdc.py
> @@ -88,7 +88,7 @@ def prepare_env(cmdlist):
>              exit(1)
>
>
> -def test_runner(filtered_tests):
> +def test_runner(filtered_tests, args):
>      """
>      Driver function for the unit tests.
>
> @@ -105,6 +105,8 @@ def test_runner(filtered_tests):
>      for tidx in testlist:
>          result = True
>          tresult = ""
> +        if "flower" in tidx["category"] and args.device == None:
> +            continue
>          print("Test " + tidx["id"] + ": " + tidx["name"])
>          prepare_env(tidx["setup"])
>          (p, procout) = exec_cmd(tidx["cmdUnderTest"])
> @@ -152,6 +154,10 @@ def ns_create():
>          exec_cmd(cmd, False)
>          cmd = 'ip -s $NS link set $DEV1 up'
>          exec_cmd(cmd, False)
> +        cmd = 'ip link set $DEV2 netns $NS'
> +        exec_cmd(cmd, False)
> +        cmd = 'ip -s $NS link set $DEV2 up'
> +        exec_cmd(cmd, False)
>
>
>  def ns_destroy():
> @@ -211,7 +217,8 @@ def set_args(parser):
>                          help='Execute the single test case with specified ID')
>      parser.add_argument('-i', '--id', action='store_true', dest='gen_id',
>                          help='Generate ID numbers for new test cases')
> -    return parser
> +    parser.add_argument('-d', '--device',
> +                        help='Execute the test case in flower category')
>      return parser
>
>
> @@ -225,6 +232,8 @@ def check_default_settings(args):
>
>      if args.path != None:
>           NAMES['TC'] = args.path
> +    if args.device != None:
> +         NAMES['DEV2'] = args.device
>      if not os.path.isfile(NAMES['TC']):
>          print("The specified tc path " + NAMES['TC'] + " does not exist.")
>          exit(1)
> @@ -381,14 +390,17 @@ def set_operation_mode(args):
>              if (len(alltests) == 0):
>                  print("Cannot find a test case with ID matching " + target_id)
>                  exit(1)
> -        catresults = test_runner(alltests)
> +        catresults = test_runner(alltests, args)
>          print("All test results: " + "\n\n" + catresults)
>      elif (len(target_category) > 0):
> +        if (target_category == "flower") and args.device == None:
> +            print("Please specify a NIC device (-d) to run category flower")
> +            exit(1)
>          if (target_category not in ucat):
>              print("Specified category is not present in this file.")
>              exit(1)
>          else:
> -            catresults = test_runner(testcases[target_category])
> +            catresults = test_runner(testcases[target_category], args)
>              print("Category " + target_category + "\n\n" + catresults)
>
>      ns_destroy()
> diff --git a/tools/testing/selftests/tc-testing/tdc_config.py b/tools/testing/selftests/tc-testing/tdc_config.py
> index 0108737..b635251 100644
> --- a/tools/testing/selftests/tc-testing/tdc_config.py
> +++ b/tools/testing/selftests/tc-testing/tdc_config.py
> @@ -12,6 +12,8 @@ NAMES = {
>            # Name of veth devices to be created for the namespace
>            'DEV0': 'v0p0',
>            'DEV1': 'v0p1',
> +          'DEV2': '',
> +          'BATCH_FILE': './batch.txt',
>            # Name of the namespace to use
>            'NS': 'tcut'
>          }
> --
> 1.8.3.1
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [patch net v2 1/4] net/sched: Change tc_action refcnt and bindcnt to atomic
  2017-10-16 11:18 ` [patch net v2 1/4] net/sched: Change tc_action refcnt and bindcnt to atomic Chris Mi
@ 2017-10-16 17:06   ` Cong Wang
  2017-10-17  1:14     ` Chris Mi
  0 siblings, 1 reply; 18+ messages in thread
From: Cong Wang @ 2017-10-16 17:06 UTC (permalink / raw)
  To: Chris Mi
  Cc: Linux Kernel Network Developers, Jamal Hadi Salim, Lucas Bates,
	Jiri Pirko, David Miller

On Mon, Oct 16, 2017 at 4:18 AM, Chris Mi <chrism@mellanox.com> wrote:
> If many filters share the same action. That action's refcnt and bindcnt
> could be manipulated by many RCU callbacks at the same time. This patch
> makes these operations atomic.

Actually I have been thinking about removing these RCU callbacks,
they are not necessary AFAIK, callers hold RTNL lock so they are
allowed to block. The only drawback is that adding a synchronize_rcu(),
but these are slow paths anyway...

I am not sure, it is arguable anyway, essentially it is:

synchronize_rcu() in slow path vs.  multiple RCU callback races


>
> Fixes commit in pre-git era.
>
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")

This is not true, the action RCU callbacks were introduced
by:

commit c7de2cf053420d63bac85133469c965d4b1083e1
Author: Eric Dumazet <eric.dumazet@gmail.com>
Date:   Wed Jun 9 02:09:23 2010 +0000

    pkt_sched: gen_kill_estimator() rcu fixes


and the filter RCU callbacks were introduced by the
patchset like this one:


commit 1ce87720d456e471de0fbd814dc5d1fe10fc1c44
Author: John Fastabend <john.fastabend@gmail.com>
Date:   Fri Sep 12 20:09:16 2014 -0700

    net: sched: make cls_u32 lockless

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [patch net v2 4/4] selftests: Introduce a new test case to tc testsuite
  2017-10-16 16:25   ` Lucas Bates
@ 2017-10-17  1:03     ` Chris Mi
  0 siblings, 0 replies; 18+ messages in thread
From: Chris Mi @ 2017-10-17  1:03 UTC (permalink / raw)
  To: Lucas Bates; +Cc: netdev, Jamal Hadi Salim, Cong Wang, Jiri Pirko, davem



> -----Original Message-----
> From: Lucas Bates [mailto:lucasb@mojatatu.com]
> Sent: Tuesday, October 17, 2017 12:25 AM
> To: Chris Mi <chrism@mellanox.com>
> Cc: netdev@vger.kernel.org; Jamal Hadi Salim <jhs@mojatatu.com>; Cong
> Wang <xiyou.wangcong@gmail.com>; Jiri Pirko <jiri@resnulli.us>;
> davem@davemloft.net
> Subject: Re: [patch net v2 4/4] selftests: Introduce a new test case to tc
> testsuite
> 
> On Mon, Oct 16, 2017 at 7:18 AM, Chris Mi <chrism@mellanox.com> wrote:
> > In this patchset, we fixed a tc bug. This patch adds the test case
> > that reproduces the bug. To run this test case, user should specify an
> > existing NIC device:
> >   # sudo ./tdc.py -d enp4s0f0
> >
> > This test case belongs to category "flower". If user doesn't specify a
> > NIC device, the test cases belong to "flower" will not be run.
> >
> > In this test case, we create 1M filters and all filters share the same
> > action. When destroying all filters, kernel should not panic. It takes
> > about 18s to run it.
> >
> > Signed-off-by: Chris Mi <chrism@mellanox.com>
> > Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
> 
> I'm a little wary about adding changes like these into tdc.py directly; I don't
> think it's going to be sustainable in the long run.
> Even the namespace creation I put in to the original version is too specific and
> limiting.
> 
> There are some upcoming changes to tdc to help address these particular
> issues.  I'll ack this for now, thanks.
OK. Thanks for your review, Lucas.
> 
> Acked-by: Lucas Bates <lucasb@mojatatu.com>
> 
> 
> > ---
> >  .../tc-testing/tc-tests/filters/tests.json         | 23
> +++++++++++++++++++++-
> >  tools/testing/selftests/tc-testing/tdc.py          | 20 +++++++++++++++----
> >  tools/testing/selftests/tc-testing/tdc_config.py   |  2 ++
> >  3 files changed, 40 insertions(+), 5 deletions(-)
> >
> > diff --git
> > a/tools/testing/selftests/tc-testing/tc-tests/filters/tests.json
> > b/tools/testing/selftests/tc-testing/tc-tests/filters/tests.json
> > index c727b96..5fa02d8 100644
> > --- a/tools/testing/selftests/tc-testing/tc-tests/filters/tests.json
> > +++ b/tools/testing/selftests/tc-testing/tc-tests/filters/tests.json
> > @@ -17,5 +17,26 @@
> >          "teardown": [
> >              "$TC qdisc del dev $DEV1 ingress"
> >          ]
> > +    },
> > +    {
> > +        "id": "d052",
> > +        "name": "Add 1M filters with the same action",
> > +        "category": [
> > +            "filter",
> > +            "flower"
> > +        ],
> > +        "setup": [
> > +            "$TC qdisc add dev $DEV2 ingress",
> > +            "./tdc_batch.py $DEV2 $BATCH_FILE --share_action -n 1000000"
> > +        ],
> > +        "cmdUnderTest": "$TC -b $BATCH_FILE",
> > +        "expExitCode": "0",
> > +        "verifyCmd": "$TC actions list action gact",
> > +        "matchPattern": "action order 0: gact action drop.*index 1 ref 1000000
> bind 1000000",
> > +        "matchCount": "1",
> > +        "teardown": [
> > +            "$TC qdisc del dev $DEV2 ingress",
> > +            "/bin/rm $BATCH_FILE"
> > +        ]
> >      }
> > -]
> > \ No newline at end of file
> > +]
> > diff --git a/tools/testing/selftests/tc-testing/tdc.py
> > b/tools/testing/selftests/tc-testing/tdc.py
> > index cd61b78..5f11f5d 100755
> > --- a/tools/testing/selftests/tc-testing/tdc.py
> > +++ b/tools/testing/selftests/tc-testing/tdc.py
> > @@ -88,7 +88,7 @@ def prepare_env(cmdlist):
> >              exit(1)
> >
> >
> > -def test_runner(filtered_tests):
> > +def test_runner(filtered_tests, args):
> >      """
> >      Driver function for the unit tests.
> >
> > @@ -105,6 +105,8 @@ def test_runner(filtered_tests):
> >      for tidx in testlist:
> >          result = True
> >          tresult = ""
> > +        if "flower" in tidx["category"] and args.device == None:
> > +            continue
> >          print("Test " + tidx["id"] + ": " + tidx["name"])
> >          prepare_env(tidx["setup"])
> >          (p, procout) = exec_cmd(tidx["cmdUnderTest"]) @@ -152,6
> > +154,10 @@ def ns_create():
> >          exec_cmd(cmd, False)
> >          cmd = 'ip -s $NS link set $DEV1 up'
> >          exec_cmd(cmd, False)
> > +        cmd = 'ip link set $DEV2 netns $NS'
> > +        exec_cmd(cmd, False)
> > +        cmd = 'ip -s $NS link set $DEV2 up'
> > +        exec_cmd(cmd, False)
> >
> >
> >  def ns_destroy():
> > @@ -211,7 +217,8 @@ def set_args(parser):
> >                          help='Execute the single test case with specified ID')
> >      parser.add_argument('-i', '--id', action='store_true', dest='gen_id',
> >                          help='Generate ID numbers for new test cases')
> > -    return parser
> > +    parser.add_argument('-d', '--device',
> > +                        help='Execute the test case in flower
> > + category')
> >      return parser
> >
> >
> > @@ -225,6 +232,8 @@ def check_default_settings(args):
> >
> >      if args.path != None:
> >           NAMES['TC'] = args.path
> > +    if args.device != None:
> > +         NAMES['DEV2'] = args.device
> >      if not os.path.isfile(NAMES['TC']):
> >          print("The specified tc path " + NAMES['TC'] + " does not exist.")
> >          exit(1)
> > @@ -381,14 +390,17 @@ def set_operation_mode(args):
> >              if (len(alltests) == 0):
> >                  print("Cannot find a test case with ID matching " + target_id)
> >                  exit(1)
> > -        catresults = test_runner(alltests)
> > +        catresults = test_runner(alltests, args)
> >          print("All test results: " + "\n\n" + catresults)
> >      elif (len(target_category) > 0):
> > +        if (target_category == "flower") and args.device == None:
> > +            print("Please specify a NIC device (-d) to run category flower")
> > +            exit(1)
> >          if (target_category not in ucat):
> >              print("Specified category is not present in this file.")
> >              exit(1)
> >          else:
> > -            catresults = test_runner(testcases[target_category])
> > +            catresults = test_runner(testcases[target_category],
> > + args)
> >              print("Category " + target_category + "\n\n" +
> > catresults)
> >
> >      ns_destroy()
> > diff --git a/tools/testing/selftests/tc-testing/tdc_config.py
> > b/tools/testing/selftests/tc-testing/tdc_config.py
> > index 0108737..b635251 100644
> > --- a/tools/testing/selftests/tc-testing/tdc_config.py
> > +++ b/tools/testing/selftests/tc-testing/tdc_config.py
> > @@ -12,6 +12,8 @@ NAMES = {
> >            # Name of veth devices to be created for the namespace
> >            'DEV0': 'v0p0',
> >            'DEV1': 'v0p1',
> > +          'DEV2': '',
> > +          'BATCH_FILE': './batch.txt',
> >            # Name of the namespace to use
> >            'NS': 'tcut'
> >          }
> > --
> > 1.8.3.1
> >

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [patch net v2 1/4] net/sched: Change tc_action refcnt and bindcnt to atomic
  2017-10-16 17:06   ` Cong Wang
@ 2017-10-17  1:14     ` Chris Mi
  2017-10-17 15:52       ` Cong Wang
  0 siblings, 1 reply; 18+ messages in thread
From: Chris Mi @ 2017-10-17  1:14 UTC (permalink / raw)
  To: Cong Wang
  Cc: Linux Kernel Network Developers, Jamal Hadi Salim, Lucas Bates,
	Jiri Pirko, David Miller



> -----Original Message-----
> From: Cong Wang [mailto:xiyou.wangcong@gmail.com]
> Sent: Tuesday, October 17, 2017 1:06 AM
> To: Chris Mi <chrism@mellanox.com>
> Cc: Linux Kernel Network Developers <netdev@vger.kernel.org>; Jamal Hadi
> Salim <jhs@mojatatu.com>; Lucas Bates <lucasb@mojatatu.com>; Jiri Pirko
> <jiri@resnulli.us>; David Miller <davem@davemloft.net>
> Subject: Re: [patch net v2 1/4] net/sched: Change tc_action refcnt and
> bindcnt to atomic
> 
> On Mon, Oct 16, 2017 at 4:18 AM, Chris Mi <chrism@mellanox.com> wrote:
> > If many filters share the same action. That action's refcnt and
> > bindcnt could be manipulated by many RCU callbacks at the same time.
> > This patch makes these operations atomic.
> 
> Actually I have been thinking about removing these RCU callbacks, they are
> not necessary AFAIK, callers hold RTNL lock so they are allowed to block. The
> only drawback is that adding a synchronize_rcu(), but these are slow paths
> anyway...
> 
> I am not sure, it is arguable anyway, essentially it is:
> 
> synchronize_rcu() in slow path vs.  multiple RCU callback races
> 
> 
> >
> > Fixes commit in pre-git era.
> >
> > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> 
> This is not true, the action RCU callbacks were introduced
> by:
> 
> commit c7de2cf053420d63bac85133469c965d4b1083e1
> Author: Eric Dumazet <eric.dumazet@gmail.com>
> Date:   Wed Jun 9 02:09:23 2010 +0000
> 
>     pkt_sched: gen_kill_estimator() rcu fixes
> 
> 
> and the filter RCU callbacks were introduced by the patchset like this one:
> 
> 
> commit 1ce87720d456e471de0fbd814dc5d1fe10fc1c44
> Author: John Fastabend <john.fastabend@gmail.com>
> Date:   Fri Sep 12 20:09:16 2014 -0700
> 
>     net: sched: make cls_u32 lockless
I don't think this bug were introduced by above two commits only.
Actually, this bug were introduced by several commits, at least the following:
1. refcnt and bindcnt are not atomic
2. passing actions using list instead of arrays (I think initially we are using arrays)
3. using RCU callbacks
So instead of blaming the latest commit, it is better to say it is a pre-git error.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [patch net v2 1/4] net/sched: Change tc_action refcnt and bindcnt to atomic
  2017-10-17  1:14     ` Chris Mi
@ 2017-10-17 15:52       ` Cong Wang
  2017-10-18  1:03         ` Chris Mi
  0 siblings, 1 reply; 18+ messages in thread
From: Cong Wang @ 2017-10-17 15:52 UTC (permalink / raw)
  To: Chris Mi
  Cc: Linux Kernel Network Developers, Jamal Hadi Salim, Lucas Bates,
	Jiri Pirko, David Miller

On Mon, Oct 16, 2017 at 6:14 PM, Chris Mi <chrism@mellanox.com> wrote:
> I don't think this bug were introduced by above two commits only.
> Actually, this bug were introduced by several commits, at least the following:
> 1. refcnt and bindcnt are not atomic

Nope, it is perfectly okay with non-atomic as long as no parallel,
and without RCU callback they are perfectly serialized by RTNL.


> 2. passing actions using list instead of arrays (I think initially we are using arrays)

We are discussing patch 1/4, this is patch 2/4, so irrelevant.


> 3. using RCU callbacks

This introduces problem 1.


> So instead of blaming the latest commit, it is better to say it is a pre-git error.

You are wrong.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [patch net v2 1/4] net/sched: Change tc_action refcnt and bindcnt to atomic
  2017-10-17 15:52       ` Cong Wang
@ 2017-10-18  1:03         ` Chris Mi
  2017-10-18 16:43           ` Cong Wang
  0 siblings, 1 reply; 18+ messages in thread
From: Chris Mi @ 2017-10-18  1:03 UTC (permalink / raw)
  To: Cong Wang
  Cc: Linux Kernel Network Developers, Jamal Hadi Salim, Lucas Bates,
	Jiri Pirko, David Miller

> -----Original Message-----
> From: Cong Wang [mailto:xiyou.wangcong@gmail.com]
> Sent: Tuesday, October 17, 2017 11:53 PM
> To: Chris Mi <chrism@mellanox.com>
> Cc: Linux Kernel Network Developers <netdev@vger.kernel.org>; Jamal Hadi
> Salim <jhs@mojatatu.com>; Lucas Bates <lucasb@mojatatu.com>; Jiri Pirko
> <jiri@resnulli.us>; David Miller <davem@davemloft.net>
> Subject: Re: [patch net v2 1/4] net/sched: Change tc_action refcnt and
> bindcnt to atomic
> 
> On Mon, Oct 16, 2017 at 6:14 PM, Chris Mi <chrism@mellanox.com> wrote:
> > I don't think this bug were introduced by above two commits only.
> > Actually, this bug were introduced by several commits, at least the
> following:
> > 1. refcnt and bindcnt are not atomic
> 
> Nope, it is perfectly okay with non-atomic as long as no parallel, and without
> RCU callback they are perfectly serialized by RTNL.
Agree.
> 
> 
> > 2. passing actions using list instead of arrays (I think initially we
> > are using arrays)
> 
> We are discussing patch 1/4, this is patch 2/4, so irrelevant.
Agree.
> 
> 
> > 3. using RCU callbacks
> 
> This introduces problem 1.
I think this patch set only fixes one problem, that's the race and the panic.
What do you mean by problem 1.
> 
> 
> > So instead of blaming the latest commit, it is better to say it is a pre-git error.
> 
> You are wrong.
OK, you are right. But could I know what's your suggestion for this patch set?
1. reject it?
2. change the "Fixes" as you suggested?
3. something else?

Thanks,
Chris

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [patch net v2 1/4] net/sched: Change tc_action refcnt and bindcnt to atomic
  2017-10-18  1:03         ` Chris Mi
@ 2017-10-18 16:43           ` Cong Wang
  2017-10-19 14:21             ` Jamal Hadi Salim
  0 siblings, 1 reply; 18+ messages in thread
From: Cong Wang @ 2017-10-18 16:43 UTC (permalink / raw)
  To: Chris Mi
  Cc: Linux Kernel Network Developers, Jamal Hadi Salim, Lucas Bates,
	Jiri Pirko, David Miller

On Tue, Oct 17, 2017 at 6:03 PM, Chris Mi <chrism@mellanox.com> wrote:
>> -----Original Message-----
>> From: Cong Wang [mailto:xiyou.wangcong@gmail.com]
>> Sent: Tuesday, October 17, 2017 11:53 PM
>> To: Chris Mi <chrism@mellanox.com>
>> Cc: Linux Kernel Network Developers <netdev@vger.kernel.org>; Jamal Hadi
>> Salim <jhs@mojatatu.com>; Lucas Bates <lucasb@mojatatu.com>; Jiri Pirko
>> <jiri@resnulli.us>; David Miller <davem@davemloft.net>
>> Subject: Re: [patch net v2 1/4] net/sched: Change tc_action refcnt and
>> bindcnt to atomic
>>
>> On Mon, Oct 16, 2017 at 6:14 PM, Chris Mi <chrism@mellanox.com> wrote:
>> > I don't think this bug were introduced by above two commits only.
>> > Actually, this bug were introduced by several commits, at least the
>> following:
>> > 1. refcnt and bindcnt are not atomic
>>
>> Nope, it is perfectly okay with non-atomic as long as no parallel, and without
>> RCU callback they are perfectly serialized by RTNL.
> Agree.
>>
>>
>> > 2. passing actions using list instead of arrays (I think initially we
>> > are using arrays)
>>
>> We are discussing patch 1/4, this is patch 2/4, so irrelevant.
> Agree.
>>
>>
>> > 3. using RCU callbacks
>>
>> This introduces problem 1.
> I think this patch set only fixes one problem, that's the race and the panic.
> What do you mean by problem 1.


You listed 3 problems, and you think they are 3 different ones, here
I argue problem 3 (using RCU callbacks) is the cause of problem 1
(refcnt not atomic). This is why I mentioned I have been thinking about
removing RCU callbacks, because it probably could fix all of them.


>>
>>
>> > So instead of blaming the latest commit, it is better to say it is a pre-git error.
>>
>> You are wrong.
> OK, you are right. But could I know what's your suggestion for this patch set?
> 1. reject it?
> 2. change the "Fixes" as you suggested?
> 3. something else?

In my opinion we need to think about removing RCU callbacks
rather than fixing all bugs they introduce, because it is really hard
to prove we can fix all of them. In your patchset, you fix 2 bugs.
Before, we fixed 2 bugs (I already list them in the other reply to you).
In total, we have 4 bugs... Are we totally race-free even after
your patches? It seems not at all without a lock, but as I said locking
itself is hard...

I will start a new thread to discuss this and keep you Cc'ed. So
please hold your patches until we have a conclusion.

Thanks.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [patch net v2 1/4] net/sched: Change tc_action refcnt and bindcnt to atomic
  2017-10-18 16:43           ` Cong Wang
@ 2017-10-19 14:21             ` Jamal Hadi Salim
  2017-10-20  3:00               ` Cong Wang
  0 siblings, 1 reply; 18+ messages in thread
From: Jamal Hadi Salim @ 2017-10-19 14:21 UTC (permalink / raw)
  To: Cong Wang, Chris Mi
  Cc: Linux Kernel Network Developers, Lucas Bates, Jiri Pirko, David Miller

On 17-10-18 12:43 PM, Cong Wang wrote:
> On Tue, Oct 17, 2017 at 6:03 PM, Chris Mi <chrism@mellanox.com> wrote:
>>> -----Original Message-----

> 
> You listed 3 problems, and you think they are 3 different ones, here
> I argue problem 3 (using RCU callbacks) is the cause of problem 1
> (refcnt not atomic). This is why I mentioned I have been thinking about
> removing RCU callbacks, because it probably could fix all of them.
> 

Cong,
Given this is a known bug (the test case Chris presented crashes the
kernel) - would it make sense to have a patch that goes to -net
to fix this while your approach and discussion outcome goes into
net-next?

cheers,
jamal

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [patch net v2 1/4] net/sched: Change tc_action refcnt and bindcnt to atomic
  2017-10-19 14:21             ` Jamal Hadi Salim
@ 2017-10-20  3:00               ` Cong Wang
  2017-10-23  2:47                 ` Chris Mi
  0 siblings, 1 reply; 18+ messages in thread
From: Cong Wang @ 2017-10-20  3:00 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: Chris Mi, Linux Kernel Network Developers, Lucas Bates,
	Jiri Pirko, David Miller

On Thu, Oct 19, 2017 at 7:21 AM, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> On 17-10-18 12:43 PM, Cong Wang wrote:
>>
>> On Tue, Oct 17, 2017 at 6:03 PM, Chris Mi <chrism@mellanox.com> wrote:
>>>>
>>>> -----Original Message-----
>
>
>>
>> You listed 3 problems, and you think they are 3 different ones, here
>> I argue problem 3 (using RCU callbacks) is the cause of problem 1
>> (refcnt not atomic). This is why I mentioned I have been thinking about
>> removing RCU callbacks, because it probably could fix all of them.
>>
>
> Cong,
> Given this is a known bug (the test case Chris presented crashes the
> kernel) - would it make sense to have a patch that goes to -net
> to fix this while your approach and discussion outcome goes into
> net-next?

I am not sure. Because Chris' patchset is large too and I don't think
it could fix all crashes, so it has little value to just apply them for -net.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [patch net v2 1/4] net/sched: Change tc_action refcnt and bindcnt to atomic
  2017-10-20  3:00               ` Cong Wang
@ 2017-10-23  2:47                 ` Chris Mi
  2017-10-23 15:39                   ` Cong Wang
  0 siblings, 1 reply; 18+ messages in thread
From: Chris Mi @ 2017-10-23  2:47 UTC (permalink / raw)
  To: Cong Wang, Jamal Hadi Salim
  Cc: Linux Kernel Network Developers, Lucas Bates, Jiri Pirko, David Miller

> -----Original Message-----
> From: Cong Wang [mailto:xiyou.wangcong@gmail.com]
> Sent: Friday, October 20, 2017 11:00 AM
> To: Jamal Hadi Salim <jhs@mojatatu.com>
> Cc: Chris Mi <chrism@mellanox.com>; Linux Kernel Network Developers
> <netdev@vger.kernel.org>; Lucas Bates <lucasb@mojatatu.com>; Jiri Pirko
> <jiri@resnulli.us>; David Miller <davem@davemloft.net>
> Subject: Re: [patch net v2 1/4] net/sched: Change tc_action refcnt and
> bindcnt to atomic
> 
> On Thu, Oct 19, 2017 at 7:21 AM, Jamal Hadi Salim <jhs@mojatatu.com>
> wrote:
> > On 17-10-18 12:43 PM, Cong Wang wrote:
> >>
> >> On Tue, Oct 17, 2017 at 6:03 PM, Chris Mi <chrism@mellanox.com> wrote:
> >>>>
> >>>> -----Original Message-----
> >
> >
> >>
> >> You listed 3 problems, and you think they are 3 different ones, here
> >> I argue problem 3 (using RCU callbacks) is the cause of problem 1
> >> (refcnt not atomic). This is why I mentioned I have been thinking
> >> about removing RCU callbacks, because it probably could fix all of them.
> >>
> >
> > Cong,
> > Given this is a known bug (the test case Chris presented crashes the
> > kernel) - would it make sense to have a patch that goes to -net to fix
> > this while your approach and discussion outcome goes into net-next?
> 
> I am not sure. Because Chris' patchset is large too and I don't think it could fix
> all crashes, so it has little value to just apply them for -net.

It seems it is not easy to discard call_rcu().  I'm afraid even if we have a final solution
without call_rcu(), it is not mature at the beginning as well. I mean we also need time
to fix the possible bugs of the new design. And maybe to destroy the filters in parallel
is the right direction. If this bug is the last bug brought by call_rcu(), then changing it
may not be a good idea.
 
Patch 1 is straightforward to use atomic. Patch 2 is to convert the list to array.
I think there is no harm to the new design.  Patch 3 and 4 are useful test case.
We also need it with new design to make sure there is no regression.

So I think my patch set should not be held so long time.

My two cents.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [patch net v2 1/4] net/sched: Change tc_action refcnt and bindcnt to atomic
  2017-10-23  2:47                 ` Chris Mi
@ 2017-10-23 15:39                   ` Cong Wang
  2017-10-24  6:17                     ` Chris Mi
  0 siblings, 1 reply; 18+ messages in thread
From: Cong Wang @ 2017-10-23 15:39 UTC (permalink / raw)
  To: Chris Mi
  Cc: Jamal Hadi Salim, Linux Kernel Network Developers, Lucas Bates,
	Jiri Pirko, David Miller

On Sun, Oct 22, 2017 at 7:47 PM, Chris Mi <chrism@mellanox.com> wrote:
>
> It seems it is not easy to discard call_rcu().  I'm afraid even if we have a final solution
> without call_rcu(), it is not mature at the beginning as well. I mean we also need time

Why do you believe it is not easy? RTNL lock is already there,
list_splice_init_rcu() is there too. I can naturally divide my patches
for each module so that they are much easier to backport than
yours.


> to fix the possible bugs of the new design. And maybe to destroy the filters in parallel
> is the right direction. If this bug is the last bug brought by call_rcu(), then changing it
> may not be a good idea.

Again, you have to prove this is the last bug, I seriously doubt
it is.


>
> Patch 1 is straightforward to use atomic. Patch 2 is to convert the list to array.

Both are big in size.


> I think there is no harm to the new design.  Patch 3 and 4 are useful test case.

It definitely doesn't harm, but why do we waste time on it when we
know there is a better way? It is clearly not easy for backport either,
in fact it is harder w.r.t. size.


> We also need it with new design to make sure there is no regression.
>

Are you saying I can't trust your test cases? ;)


> So I think my patch set should not be held so long time.

I think your patches should be dropped except the last two,
I will take the last two for you.

Thanks!

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [patch net v2 1/4] net/sched: Change tc_action refcnt and bindcnt to atomic
  2017-10-23 15:39                   ` Cong Wang
@ 2017-10-24  6:17                     ` Chris Mi
  0 siblings, 0 replies; 18+ messages in thread
From: Chris Mi @ 2017-10-24  6:17 UTC (permalink / raw)
  To: Cong Wang
  Cc: Jamal Hadi Salim, Linux Kernel Network Developers, Lucas Bates,
	Jiri Pirko, David Miller

> -----Original Message-----
> From: Cong Wang [mailto:xiyou.wangcong@gmail.com]
> Sent: Monday, October 23, 2017 11:40 PM
> To: Chris Mi <chrism@mellanox.com>
> Cc: Jamal Hadi Salim <jhs@mojatatu.com>; Linux Kernel Network Developers
> <netdev@vger.kernel.org>; Lucas Bates <lucasb@mojatatu.com>; Jiri Pirko
> <jiri@resnulli.us>; David Miller <davem@davemloft.net>
> Subject: Re: [patch net v2 1/4] net/sched: Change tc_action refcnt and
> bindcnt to atomic
> 
> On Sun, Oct 22, 2017 at 7:47 PM, Chris Mi <chrism@mellanox.com> wrote:
> >
> > It seems it is not easy to discard call_rcu().  I'm afraid even if we
> > have a final solution without call_rcu(), it is not mature at the
> > beginning as well. I mean we also need time
> 
> Why do you believe it is not easy? RTNL lock is already there,
> list_splice_init_rcu() is there too. I can naturally divide my patches for each
> module so that they are much easier to backport than yours.
I tested your patches. It takes about 17 seconds to run the same test.
I believe your code is simpler and easy to maintain. If it is stable,
we will also get benefit. Thanks!
> 
> 
> > to fix the possible bugs of the new design. And maybe to destroy the
> > filters in parallel is the right direction. If this bug is the last
> > bug brought by call_rcu(), then changing it may not be a good idea.
> 
> Again, you have to prove this is the last bug, I seriously doubt it is.
> 
> 
> >
> > Patch 1 is straightforward to use atomic. Patch 2 is to convert the list to
> array.
> 
> Both are big in size.
> 
> 
> > I think there is no harm to the new design.  Patch 3 and 4 are useful test
> case.
> 
> It definitely doesn't harm, but why do we waste time on it when we know
> there is a better way? It is clearly not easy for backport either, in fact it is
> harder w.r.t. size.
> 
> 
> > We also need it with new design to make sure there is no regression.
> >
> 
> Are you saying I can't trust your test cases? ;)
> 
> 
> > So I think my patch set should not be held so long time.
> 
> I think your patches should be dropped except the last two, I will take the
> last two for you.
> 
> Thanks!

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2017-10-24  6:17 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-10-16 11:18 [patch net v2 0/4] net/sched: Fix a system panic when deleting filters Chris Mi
2017-10-16 11:18 ` [patch net v2 1/4] net/sched: Change tc_action refcnt and bindcnt to atomic Chris Mi
2017-10-16 17:06   ` Cong Wang
2017-10-17  1:14     ` Chris Mi
2017-10-17 15:52       ` Cong Wang
2017-10-18  1:03         ` Chris Mi
2017-10-18 16:43           ` Cong Wang
2017-10-19 14:21             ` Jamal Hadi Salim
2017-10-20  3:00               ` Cong Wang
2017-10-23  2:47                 ` Chris Mi
2017-10-23 15:39                   ` Cong Wang
2017-10-24  6:17                     ` Chris Mi
2017-10-16 11:18 ` [patch net v2 2/4] net/sched: Use action array instead of action list as parameter Chris Mi
2017-10-16 11:18 ` [patch net v2 3/4] selftests: Introduce a new script to generate tc batch file Chris Mi
2017-10-16 16:24   ` Lucas Bates
2017-10-16 11:18 ` [patch net v2 4/4] selftests: Introduce a new test case to tc testsuite Chris Mi
2017-10-16 16:25   ` Lucas Bates
2017-10-17  1:03     ` Chris Mi

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.