All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC Patch net-next 0/6] net_sched: really switch to RCU for tc actions
@ 2016-09-02  5:57 Cong Wang
  2016-09-02  5:57 ` [RFC Patch net-next 1/6] net_sched: use RCU for action hash table Cong Wang
                   ` (7 more replies)
  0 siblings, 8 replies; 37+ messages in thread
From: Cong Wang @ 2016-09-02  5:57 UTC (permalink / raw)
  To: netdev; +Cc: jhs, Cong Wang

Currently there are only two tc actions lockless:
gact and mirred. But they are questionable because
we don't have anything to prevent a parallel update
on an existing tc action in hash table while reading
it on fast path, this could be a problem when a tc
action becomes complex.

This patchset introduces a few new tc action API's
based on RCU so that the fast path could now really
be protected by RCU and we can update existing tc
actions safely and race-freely.

Obviously this is still _not_ complete yet, I only
modified mirred action to show the use case of
the new API's, all the rest actions could switch to
the new API's too. The new API's are a bit ugly too,
any suggestion to improve them is welcome.

I tested mirred action with a few test cases, so far
so good, at least no obvious bugs. ;)


Cong Wang (6):
  net_sched: use RCU for action hash table
  net_sched: introduce tcf_hash_replace()
  net_sched: return NULL in tcf_hash_check()
  net_sched: introduce tcf_hash_copy()
  net_sched: use rcu in fast path
  net_sched: switch to RCU API for act_mirred

 include/net/act_api.h  |  3 +++
 net/sched/act_api.c    | 59 +++++++++++++++++++++++++++++++++++++++++++-------
 net/sched/act_mirred.c | 41 ++++++++++++++++-------------------
 3 files changed, 73 insertions(+), 30 deletions(-)

-- 
2.1.0

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [RFC Patch net-next 1/6] net_sched: use RCU for action hash table
  2016-09-02  5:57 [RFC Patch net-next 0/6] net_sched: really switch to RCU for tc actions Cong Wang
@ 2016-09-02  5:57 ` Cong Wang
  2016-09-06 12:47   ` Jamal Hadi Salim
  2016-09-02  5:57 ` [RFC Patch net-next 2/6] net_sched: introduce tcf_hash_replace() Cong Wang
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 37+ messages in thread
From: Cong Wang @ 2016-09-02  5:57 UTC (permalink / raw)
  To: netdev; +Cc: jhs, Cong Wang

We already free tc actions in a RCU callback, so here
we just need to convert the hash table operations to
RCU API's.

Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
---
 net/sched/act_api.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index d09d068..06111aa 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -39,7 +39,7 @@ static void free_tcf(struct rcu_head *head)
 static void tcf_hash_destroy(struct tcf_hashinfo *hinfo, struct tc_action *p)
 {
 	spin_lock_bh(&hinfo->lock);
-	hlist_del(&p->tcfa_head);
+	hlist_del_rcu(&p->tcfa_head);
 	spin_unlock_bh(&hinfo->lock);
 	gen_kill_estimator(&p->tcfa_bstats,
 			   &p->tcfa_rate_est);
@@ -79,10 +79,10 @@ static int tcf_dump_walker(struct tcf_hashinfo *hinfo, struct sk_buff *skb,
 	int err = 0, index = -1, i = 0, s_i = 0, n_i = 0;
 	struct nlattr *nest;
 
-	spin_lock_bh(&hinfo->lock);
-
 	s_i = cb->args[0];
 
+	rcu_read_lock_bh();
+
 	for (i = 0; i < (hinfo->hmask + 1); i++) {
 		struct hlist_head *head;
 		struct tc_action *p;
@@ -110,7 +110,7 @@ static int tcf_dump_walker(struct tcf_hashinfo *hinfo, struct sk_buff *skb,
 		}
 	}
 done:
-	spin_unlock_bh(&hinfo->lock);
+	rcu_read_unlock_bh();
 	if (n_i)
 		cb->args[0] += n_i;
 	return n_i;
@@ -179,12 +179,12 @@ static struct tc_action *tcf_hash_lookup(u32 index, struct tcf_hashinfo *hinfo)
 	struct tc_action *p = NULL;
 	struct hlist_head *head;
 
-	spin_lock_bh(&hinfo->lock);
+	rcu_read_lock_bh();
 	head = &hinfo->htab[tcf_hash(index, hinfo->hmask)];
 	hlist_for_each_entry_rcu(p, head, tcfa_head)
 		if (p->tcfa_index == index)
 			break;
-	spin_unlock_bh(&hinfo->lock);
+	rcu_read_unlock_bh();
 
 	return p;
 }
@@ -301,7 +301,7 @@ void tcf_hash_insert(struct tc_action_net *tn, struct tc_action *a)
 	unsigned int h = tcf_hash(a->tcfa_index, hinfo->hmask);
 
 	spin_lock_bh(&hinfo->lock);
-	hlist_add_head(&a->tcfa_head, &hinfo->htab[h]);
+	hlist_add_head_rcu(&a->tcfa_head, &hinfo->htab[h]);
 	spin_unlock_bh(&hinfo->lock);
 }
 EXPORT_SYMBOL(tcf_hash_insert);
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [RFC Patch net-next 2/6] net_sched: introduce tcf_hash_replace()
  2016-09-02  5:57 [RFC Patch net-next 0/6] net_sched: really switch to RCU for tc actions Cong Wang
  2016-09-02  5:57 ` [RFC Patch net-next 1/6] net_sched: use RCU for action hash table Cong Wang
@ 2016-09-02  5:57 ` Cong Wang
  2016-09-06 12:52   ` Jamal Hadi Salim
  2016-09-02  5:57 ` [RFC Patch net-next 3/6] net_sched: return NULL in tcf_hash_check() Cong Wang
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 37+ messages in thread
From: Cong Wang @ 2016-09-02  5:57 UTC (permalink / raw)
  To: netdev; +Cc: jhs, Cong Wang

This API is used to replace the old action in hash table
with the new one. It is used by the following patch.

Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
---
 include/net/act_api.h |  2 ++
 net/sched/act_api.c   | 20 ++++++++++++++++++++
 2 files changed, 22 insertions(+)

diff --git a/include/net/act_api.h b/include/net/act_api.h
index 82f3c91..a374bab 100644
--- a/include/net/act_api.h
+++ b/include/net/act_api.h
@@ -164,6 +164,8 @@ int tcf_hash_create(struct tc_action_net *tn, u32 index, struct nlattr *est,
 		    bool cpustats);
 void tcf_hash_cleanup(struct tc_action *a, struct nlattr *est);
 void tcf_hash_insert(struct tc_action_net *tn, struct tc_action *a);
+void tcf_hash_replace(struct tc_action_net *tn, struct tc_action **old,
+		      struct tc_action *new, int bind);
 
 int __tcf_hash_release(struct tc_action *a, bool bind, bool strict);
 
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 06111aa..db907e5 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -306,6 +306,26 @@ void tcf_hash_insert(struct tc_action_net *tn, struct tc_action *a)
 }
 EXPORT_SYMBOL(tcf_hash_insert);
 
+void tcf_hash_replace(struct tc_action_net *tn, struct tc_action **old,
+		      struct tc_action *new, int bind)
+{
+	struct tcf_hashinfo *hinfo = tn->hinfo;
+	unsigned int h = tcf_hash(new->tcfa_index, hinfo->hmask);
+
+	spin_lock_bh(&hinfo->lock);
+	if (*old)
+		hlist_replace_rcu(&(*old)->tcfa_head, &new->tcfa_head);
+	else
+		hlist_add_head_rcu(&new->tcfa_head, &hinfo->htab[h]);
+	spin_unlock_bh(&hinfo->lock);
+
+	if (*old)
+		tcf_hash_release(*old, bind);
+
+	rcu_assign_pointer(*old, new);
+}
+EXPORT_SYMBOL(tcf_hash_replace);
+
 void tcf_hashinfo_destroy(const struct tc_action_ops *ops,
 			  struct tcf_hashinfo *hinfo)
 {
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [RFC Patch net-next 3/6] net_sched: return NULL in tcf_hash_check()
  2016-09-02  5:57 [RFC Patch net-next 0/6] net_sched: really switch to RCU for tc actions Cong Wang
  2016-09-02  5:57 ` [RFC Patch net-next 1/6] net_sched: use RCU for action hash table Cong Wang
  2016-09-02  5:57 ` [RFC Patch net-next 2/6] net_sched: introduce tcf_hash_replace() Cong Wang
@ 2016-09-02  5:57 ` Cong Wang
  2016-09-02  5:57 ` [RFC Patch net-next 4/6] net_sched: introduce tcf_hash_copy() Cong Wang
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 37+ messages in thread
From: Cong Wang @ 2016-09-02  5:57 UTC (permalink / raw)
  To: netdev; +Cc: jhs, Cong Wang

When we don't find an action with a specific index,
we only return false but not touch *a. It is nicer
if we could set *a to NULL here, this would make
callers easier to use these API's.

Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
---
 net/sched/act_api.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index db907e5..d0a7db2 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -230,6 +230,7 @@ bool tcf_hash_check(struct tc_action_net *tn, u32 index, struct tc_action **a,
 		*a = p;
 		return true;
 	}
+	*a = NULL;
 	return false;
 }
 EXPORT_SYMBOL(tcf_hash_check);
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [RFC Patch net-next 4/6] net_sched: introduce tcf_hash_copy()
  2016-09-02  5:57 [RFC Patch net-next 0/6] net_sched: really switch to RCU for tc actions Cong Wang
                   ` (2 preceding siblings ...)
  2016-09-02  5:57 ` [RFC Patch net-next 3/6] net_sched: return NULL in tcf_hash_check() Cong Wang
@ 2016-09-02  5:57 ` Cong Wang
  2016-09-02  5:57 ` [RFC Patch net-next 5/6] net_sched: use rcu in fast path Cong Wang
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 37+ messages in thread
From: Cong Wang @ 2016-09-02  5:57 UTC (permalink / raw)
  To: netdev; +Cc: jhs, Cong Wang

Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
---
 include/net/act_api.h |  1 +
 net/sched/act_api.c   | 18 ++++++++++++++++++
 2 files changed, 19 insertions(+)

diff --git a/include/net/act_api.h b/include/net/act_api.h
index a374bab..17837af 100644
--- a/include/net/act_api.h
+++ b/include/net/act_api.h
@@ -166,6 +166,7 @@ void tcf_hash_cleanup(struct tc_action *a, struct nlattr *est);
 void tcf_hash_insert(struct tc_action_net *tn, struct tc_action *a);
 void tcf_hash_replace(struct tc_action_net *tn, struct tc_action **old,
 		      struct tc_action *new, int bind);
+bool tcf_hash_copy(struct tc_action *dst, const struct tc_action *src);
 
 int __tcf_hash_release(struct tc_action *a, bool bind, bool strict);
 
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index d0a7db2..2f8db3c 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -296,6 +296,24 @@ int tcf_hash_create(struct tc_action_net *tn, u32 index, struct nlattr *est,
 }
 EXPORT_SYMBOL(tcf_hash_create);
 
+bool tcf_hash_copy(struct tc_action *dst, const struct tc_action *src)
+{
+	if (src) {
+		memcpy(dst, src, sizeof(*src));
+		spin_lock_init(&dst->tcfa_lock);
+		INIT_HLIST_NODE(&dst->tcfa_head);
+		INIT_LIST_HEAD(&dst->list);
+
+		/* tcf_hash_check() is called before this */
+		dst->tcfa_refcnt = src->tcfa_refcnt - 1;
+		if (src->tcfa_bindcnt)
+			dst->tcfa_bindcnt = src->tcfa_bindcnt - 1;
+		return true;
+	}
+	return false;
+}
+EXPORT_SYMBOL(tcf_hash_copy);
+
 void tcf_hash_insert(struct tc_action_net *tn, struct tc_action *a)
 {
 	struct tcf_hashinfo *hinfo = tn->hinfo;
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [RFC Patch net-next 5/6] net_sched: use rcu in fast path
  2016-09-02  5:57 [RFC Patch net-next 0/6] net_sched: really switch to RCU for tc actions Cong Wang
                   ` (3 preceding siblings ...)
  2016-09-02  5:57 ` [RFC Patch net-next 4/6] net_sched: introduce tcf_hash_copy() Cong Wang
@ 2016-09-02  5:57 ` Cong Wang
  2016-09-06 14:52   ` Eric Dumazet
  2016-09-02  5:57 ` [RFC Patch net-next 6/6] net_sched: switch to RCU API for act_mirred Cong Wang
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 37+ messages in thread
From: Cong Wang @ 2016-09-02  5:57 UTC (permalink / raw)
  To: netdev; +Cc: jhs, Cong Wang

Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
---
 net/sched/act_api.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 2f8db3c..fb6ff52 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -470,10 +470,14 @@ int tcf_action_exec(struct sk_buff *skb, struct tc_action **actions,
 		goto exec_done;
 	}
 	for (i = 0; i < nr_actions; i++) {
-		const struct tc_action *a = actions[i];
+		const struct tc_action *a;
 
+		rcu_read_lock();
+		a = rcu_dereference(actions[i]);
 repeat:
 		ret = a->ops->act(skb, a, res);
+		rcu_read_unlock();
+
 		if (ret == TC_ACT_REPEAT)
 			goto repeat;	/* we need a ttl - JHS */
 		if (ret != TC_ACT_PIPE)
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [RFC Patch net-next 6/6] net_sched: switch to RCU API for act_mirred
  2016-09-02  5:57 [RFC Patch net-next 0/6] net_sched: really switch to RCU for tc actions Cong Wang
                   ` (4 preceding siblings ...)
  2016-09-02  5:57 ` [RFC Patch net-next 5/6] net_sched: use rcu in fast path Cong Wang
@ 2016-09-02  5:57 ` Cong Wang
  2016-09-02  7:09 ` [RFC Patch net-next 0/6] net_sched: really switch to RCU for tc actions Jiri Pirko
  2016-09-07 16:23 ` John Fastabend
  7 siblings, 0 replies; 37+ messages in thread
From: Cong Wang @ 2016-09-02  5:57 UTC (permalink / raw)
  To: netdev; +Cc: jhs, Cong Wang

Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
---
 net/sched/act_mirred.c | 41 +++++++++++++++++++----------------------
 1 file changed, 19 insertions(+), 22 deletions(-)

diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
index 6038c85..90a024c 100644
--- a/net/sched/act_mirred.c
+++ b/net/sched/act_mirred.c
@@ -60,6 +60,7 @@ static int tcf_mirred_init(struct net *net, struct nlattr *nla,
 {
 	struct tc_action_net *tn = net_generic(net, mirred_net_id);
 	struct nlattr *tb[TCA_MIRRED_MAX + 1];
+	struct tc_action *n;
 	struct tc_mirred *parm;
 	struct tcf_mirred *m;
 	struct net_device *dev;
@@ -108,44 +109,40 @@ static int tcf_mirred_init(struct net *net, struct nlattr *nla,
 			ok_push = 1;
 			break;
 		}
-	} else {
-		dev = NULL;
+	} else if (!exists) {
+		return -EINVAL;
+	}
+
+	if (exists && !ovr) {
+		tcf_hash_release(*a, bind);
+		return -EEXIST;
 	}
 
-	if (!exists) {
-		if (dev == NULL)
-			return -EINVAL;
-		ret = tcf_hash_create(tn, parm->index, est, a,
-				      &act_mirred_ops, bind, true);
-		if (ret)
-			return ret;
-		ret = ACT_P_CREATED;
-	} else {
+	ret = tcf_hash_create(tn, parm->index, est, &n,
+			      &act_mirred_ops, bind, true);
+	if (ret) {
 		tcf_hash_release(*a, bind);
-		if (!ovr)
-			return -EEXIST;
+		return ret;
 	}
-	m = to_mirred(*a);
+
+	tcf_hash_copy(n, *a);
+	m = to_mirred(n);
 
 	ASSERT_RTNL();
 	m->tcf_action = parm->action;
 	m->tcfm_eaction = parm->eaction;
 	if (dev != NULL) {
 		m->tcfm_ifindex = parm->ifindex;
-		if (ret != ACT_P_CREATED)
-			dev_put(rcu_dereference_protected(m->tcfm_dev, 1));
 		dev_hold(dev);
 		rcu_assign_pointer(m->tcfm_dev, dev);
 		m->tcfm_ok_push = ok_push;
 	}
 
-	if (ret == ACT_P_CREATED) {
-		spin_lock_bh(&mirred_list_lock);
-		list_add(&m->tcfm_list, &mirred_list);
-		spin_unlock_bh(&mirred_list_lock);
-		tcf_hash_insert(tn, *a);
-	}
+	spin_lock_bh(&mirred_list_lock);
+	list_add(&m->tcfm_list, &mirred_list);
+	spin_unlock_bh(&mirred_list_lock);
 
+	tcf_hash_replace(tn, a, n, bind);
 	return ret;
 }
 
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [RFC Patch net-next 0/6] net_sched: really switch to RCU for tc actions
  2016-09-02  5:57 [RFC Patch net-next 0/6] net_sched: really switch to RCU for tc actions Cong Wang
                   ` (5 preceding siblings ...)
  2016-09-02  5:57 ` [RFC Patch net-next 6/6] net_sched: switch to RCU API for act_mirred Cong Wang
@ 2016-09-02  7:09 ` Jiri Pirko
  2016-09-02 19:44   ` Cong Wang
  2016-09-07 16:23 ` John Fastabend
  7 siblings, 1 reply; 37+ messages in thread
From: Jiri Pirko @ 2016-09-02  7:09 UTC (permalink / raw)
  To: Cong Wang; +Cc: netdev, jhs

Fri, Sep 02, 2016 at 07:57:14AM CEST, xiyou.wangcong@gmail.com wrote:
>Currently there are only two tc actions lockless:
>gact and mirred. But they are questionable because
>we don't have anything to prevent a parallel update
>on an existing tc action in hash table while reading
>it on fast path, this could be a problem when a tc
>action becomes complex.
>
>This patchset introduces a few new tc action API's
>based on RCU so that the fast path could now really
>be protected by RCU and we can update existing tc
>actions safely and race-freely.
>
>Obviously this is still _not_ complete yet, I only
>modified mirred action to show the use case of
>the new API's, all the rest actions could switch to
>the new API's too. The new API's are a bit ugly too,
>any suggestion to improve them is welcome.
>
>I tested mirred action with a few test cases, so far
>so good, at least no obvious bugs. ;)


I wonder, do you happen to have a very tiny narrow screen?

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC Patch net-next 0/6] net_sched: really switch to RCU for tc actions
  2016-09-02  7:09 ` [RFC Patch net-next 0/6] net_sched: really switch to RCU for tc actions Jiri Pirko
@ 2016-09-02 19:44   ` Cong Wang
  0 siblings, 0 replies; 37+ messages in thread
From: Cong Wang @ 2016-09-02 19:44 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: Linux Kernel Network Developers, Jamal Hadi Salim

On Fri, Sep 2, 2016 at 12:09 AM, Jiri Pirko <jiri@resnulli.us> wrote:
>
> I wonder, do you happen to have a very tiny narrow screen?

LOL, yeah, I should fix the indentation... ;)

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC Patch net-next 1/6] net_sched: use RCU for action hash table
  2016-09-02  5:57 ` [RFC Patch net-next 1/6] net_sched: use RCU for action hash table Cong Wang
@ 2016-09-06 12:47   ` Jamal Hadi Salim
  2016-09-06 22:37     ` Cong Wang
  0 siblings, 1 reply; 37+ messages in thread
From: Jamal Hadi Salim @ 2016-09-06 12:47 UTC (permalink / raw)
  To: Cong Wang, netdev

On 16-09-02 01:57 AM, Cong Wang wrote:
> We already free tc actions in a RCU callback, so here
> we just need to convert the hash table operations to
> RCU API's.
>
> Cc: Jamal Hadi Salim <jhs@mojatatu.com>
> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>

This one stands on its own merit, no?
So:
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>


cheers,
jamal

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC Patch net-next 2/6] net_sched: introduce tcf_hash_replace()
  2016-09-02  5:57 ` [RFC Patch net-next 2/6] net_sched: introduce tcf_hash_replace() Cong Wang
@ 2016-09-06 12:52   ` Jamal Hadi Salim
  2016-09-06 22:34     ` Cong Wang
  0 siblings, 1 reply; 37+ messages in thread
From: Jamal Hadi Salim @ 2016-09-06 12:52 UTC (permalink / raw)
  To: Cong Wang, netdev

On 16-09-02 01:57 AM, Cong Wang wrote:

> Cc: Jamal Hadi Salim <jhs@mojatatu.com>
> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
> ---
>  include/net/act_api.h |  2 ++
>  net/sched/act_api.c   | 20 ++++++++++++++++++++
>  2 files changed, 22 insertions(+)
>
> diff --git a/include/net/act_api.h b/include/net/act_api.h
> index 82f3c91..a374bab 100644
> --- a/include/net/act_api.h

>
> +void tcf_hash_replace(struct tc_action_net *tn, struct tc_action **old,
> +		      struct tc_action *new, int bind)
> +{
> +	struct tcf_hashinfo *hinfo = tn->hinfo;
> +	unsigned int h = tcf_hash(new->tcfa_index, hinfo->hmask);
> +

WWhy do you need to recreate the index?
Old index was fine since this is just a replacement..

The rest of the patches seem fine - will let Eric comment on the
mirred.

Note: I am going to still push forward with skbmod action and I think
so should the new tunnel action code i.e we make them independent.
I'd like to switch to this when we think it is stable.

cheers,
jamal

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC Patch net-next 5/6] net_sched: use rcu in fast path
  2016-09-02  5:57 ` [RFC Patch net-next 5/6] net_sched: use rcu in fast path Cong Wang
@ 2016-09-06 14:52   ` Eric Dumazet
  2016-09-08  6:04     ` John Fastabend
  2016-09-09  5:49     ` Cong Wang
  0 siblings, 2 replies; 37+ messages in thread
From: Eric Dumazet @ 2016-09-06 14:52 UTC (permalink / raw)
  To: Cong Wang; +Cc: netdev, jhs

On Thu, 2016-09-01 at 22:57 -0700, Cong Wang wrote:



Missing changelog ?

Here I have no idea what you want to fix, since John already took care
all this infra.

Adding extra rcu_dereference() and rcu_read_lock() while the critical
RCU dereferences already happen in callers is not needed.

> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
> ---
>  net/sched/act_api.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/net/sched/act_api.c b/net/sched/act_api.c
> index 2f8db3c..fb6ff52 100644
> --- a/net/sched/act_api.c
> +++ b/net/sched/act_api.c
> @@ -470,10 +470,14 @@ int tcf_action_exec(struct sk_buff *skb, struct tc_action **actions,
>  		goto exec_done;
>  	}
>  	for (i = 0; i < nr_actions; i++) {
> -		const struct tc_action *a = actions[i];
> +		const struct tc_action *a;
>  
> +		rcu_read_lock();

But the caller already has rcu_read_lock() or rcu_read_lock_bh()

This was done in commit 25d8c0d55f241ce2 ("net: rcu-ify tcf_proto")

> +		a = rcu_dereference(actions[i]);


Add in your .config :
CONFIG_SPARSE_RCU_POINTER=y
make C=2 M=net/sched

>  repeat:
>  		ret = a->ops->act(skb, a, res);
> +		rcu_read_unlock();
> +
>  		if (ret == TC_ACT_REPEAT)
>  			goto repeat;	/* we need a ttl - JHS */
>  		if (ret != TC_ACT_PIPE)



I do not believe this patch is necessary.

Please add John as reviewer next time.

Thanks.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC Patch net-next 2/6] net_sched: introduce tcf_hash_replace()
  2016-09-06 12:52   ` Jamal Hadi Salim
@ 2016-09-06 22:34     ` Cong Wang
  0 siblings, 0 replies; 37+ messages in thread
From: Cong Wang @ 2016-09-06 22:34 UTC (permalink / raw)
  To: Jamal Hadi Salim; +Cc: Linux Kernel Network Developers

On Tue, Sep 6, 2016 at 5:52 AM, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> On 16-09-02 01:57 AM, Cong Wang wrote:
>
>> Cc: Jamal Hadi Salim <jhs@mojatatu.com>
>> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
>> ---
>>  include/net/act_api.h |  2 ++
>>  net/sched/act_api.c   | 20 ++++++++++++++++++++
>>  2 files changed, 22 insertions(+)
>>
>> diff --git a/include/net/act_api.h b/include/net/act_api.h
>> index 82f3c91..a374bab 100644
>> --- a/include/net/act_api.h
>
>
>>
>> +void tcf_hash_replace(struct tc_action_net *tn, struct tc_action **old,
>> +                     struct tc_action *new, int bind)
>> +{
>> +       struct tcf_hashinfo *hinfo = tn->hinfo;
>> +       unsigned int h = tcf_hash(new->tcfa_index, hinfo->hmask);
>> +
>
>
> WWhy do you need to recreate the index?
> Old index was fine since this is just a replacement..

A new index is possible created by tcf_hash_create(), but later
overwritten by tcf_hash_copy(). I know this is a bit ugly, so
feel free to suggest any better API here.


>
> The rest of the patches seem fine - will let Eric comment on the
> mirred.
>
> Note: I am going to still push forward with skbmod action and I think
> so should the new tunnel action code i.e we make them independent.
> I'd like to switch to this when we think it is stable.

You don't have to rebase or change, I will take care of it because they
will probably be merged before this patchset. At least I can hold on
this for a while to let that happen. ;)

Thanks.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC Patch net-next 1/6] net_sched: use RCU for action hash table
  2016-09-06 12:47   ` Jamal Hadi Salim
@ 2016-09-06 22:37     ` Cong Wang
  0 siblings, 0 replies; 37+ messages in thread
From: Cong Wang @ 2016-09-06 22:37 UTC (permalink / raw)
  To: Jamal Hadi Salim; +Cc: Linux Kernel Network Developers

On Tue, Sep 6, 2016 at 5:47 AM, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> On 16-09-02 01:57 AM, Cong Wang wrote:
>>
>> We already free tc actions in a RCU callback, so here
>> we just need to convert the hash table operations to
>> RCU API's.
>>
>> Cc: Jamal Hadi Salim <jhs@mojatatu.com>
>> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
>
>
> This one stands on its own merit, no?

Yes, I think so. I think it is fine to carry it in this patchset too.


> So:
> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
>
>

Thanks.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC Patch net-next 0/6] net_sched: really switch to RCU for tc actions
  2016-09-02  5:57 [RFC Patch net-next 0/6] net_sched: really switch to RCU for tc actions Cong Wang
                   ` (6 preceding siblings ...)
  2016-09-02  7:09 ` [RFC Patch net-next 0/6] net_sched: really switch to RCU for tc actions Jiri Pirko
@ 2016-09-07 16:23 ` John Fastabend
  2016-09-08  6:05   ` John Fastabend
  2016-09-09  5:22   ` Cong Wang
  7 siblings, 2 replies; 37+ messages in thread
From: John Fastabend @ 2016-09-07 16:23 UTC (permalink / raw)
  To: Cong Wang, netdev; +Cc: jhs

On 16-09-01 10:57 PM, Cong Wang wrote:
> Currently there are only two tc actions lockless:
> gact and mirred. But they are questionable because
> we don't have anything to prevent a parallel update
> on an existing tc action in hash table while reading
> it on fast path, this could be a problem when a tc
> action becomes complex.

hmm I'm trying to see where the questionable part is in the current
code? What is it exactly.

The calls to

	 tcf_lastuse_update(&m->tcf_tm);

for example are possibly wrong as that is a u64 may be set from
multiple cores. So fixing that seems like a good idea.

Actions themselves don't have a path to be "updated" while live do they?
iirc and I think a quick scan this morning of the code shows actions
have a refcnt and a "bind"/"release" action that increments/decrements
this counter. Both bind and release are protected via rtnl lock in the
control path.

I need to follow all the code paths but is there a way to remove an
action that still has a refcnt > 0? In other words does it need to be
removed from all filters before it can be deleted. If yes then by the
time it is removed (after rcu grace period) it should not be in use.
If no then I think there is a problem.

I'm looking at this code path here,

int __tcf_hash_release(struct tc_action *p, bool bind, bool strict)
{
        int ret = 0;

        if (p) {
                if (bind)
                        p->tcfa_bindcnt--;
                else if (strict && p->tcfa_bindcnt > 0)
                        return -EPERM;

                p->tcfa_refcnt--;
                if (p->tcfa_bindcnt <= 0 && p->tcfa_refcnt <= 0) {
                        if (p->ops->cleanup)
                                p->ops->cleanup(p, bind);
                        tcf_hash_destroy(p->hinfo, p);
                        ret = ACT_P_DELETED;
                }
        }

        return ret;
}

It looks to me that every call site that jumps here where its possible
an action is being used by a filter is "strict". And further filters
only release actions after an rcu grace period when being destroyed and
the filter is no longer using the action.

Although the refcnt should be atomic now that its being called from
outside the rtnl lock in rcu call back? At least it looks racy to me
at a glance this morning.

If the refcnt'ing is atomic then do we care/need the hash rcu bits? I'm
not seeing how it helps because in the fast path we don't even touch the
hash table we have a pointer to a refcnt'd action object.

What did I miss?

> 
> This patchset introduces a few new tc action API's
> based on RCU so that the fast path could now really
> be protected by RCU and we can update existing tc
> actions safely and race-freely.
> 
> Obviously this is still _not_ complete yet, I only
> modified mirred action to show the use case of
> the new API's, all the rest actions could switch to
> the new API's too. The new API's are a bit ugly too,
> any suggestion to improve them is welcome.
> 
> I tested mirred action with a few test cases, so far
> so good, at least no obvious bugs. ;)

Take a quick survey of the actions I didn't see any with global state.
But I didn't look at them all.

> 
> 
> Cong Wang (6):
>   net_sched: use RCU for action hash table
>   net_sched: introduce tcf_hash_replace()
>   net_sched: return NULL in tcf_hash_check()
>   net_sched: introduce tcf_hash_copy()
>   net_sched: use rcu in fast path
>   net_sched: switch to RCU API for act_mirred
> 
>  include/net/act_api.h  |  3 +++
>  net/sched/act_api.c    | 59 +++++++++++++++++++++++++++++++++++++++++++-------
>  net/sched/act_mirred.c | 41 ++++++++++++++++-------------------
>  3 files changed, 73 insertions(+), 30 deletions(-)
> 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC Patch net-next 5/6] net_sched: use rcu in fast path
  2016-09-06 14:52   ` Eric Dumazet
@ 2016-09-08  6:04     ` John Fastabend
  2016-09-08 13:28       ` Eric Dumazet
  2016-09-09  5:49     ` Cong Wang
  1 sibling, 1 reply; 37+ messages in thread
From: John Fastabend @ 2016-09-08  6:04 UTC (permalink / raw)
  To: Eric Dumazet, Cong Wang; +Cc: netdev, jhs

On 16-09-06 07:52 AM, Eric Dumazet wrote:
> On Thu, 2016-09-01 at 22:57 -0700, Cong Wang wrote:
> 
> 
> 
> Missing changelog ?
> 

Yeah this stuff is a bit tricky more notes to walk us through
this would be helpful. (btw you can disregard my comment from
earlier this morning I've tracked most of this down now)

> Here I have no idea what you want to fix, since John already took care
> all this infra.
> 
> Adding extra rcu_dereference() and rcu_read_lock() while the critical
> RCU dereferences already happen in callers is not needed.
> 

Agreed drop all the extra rcu_read_lock/unlock here.

>> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
>> ---
>>  net/sched/act_api.c | 6 +++++-
>>  1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/sched/act_api.c b/net/sched/act_api.c
>> index 2f8db3c..fb6ff52 100644
>> --- a/net/sched/act_api.c
>> +++ b/net/sched/act_api.c
>> @@ -470,10 +470,14 @@ int tcf_action_exec(struct sk_buff *skb, struct tc_action **actions,
>>  		goto exec_done;
>>  	}
>>  	for (i = 0; i < nr_actions; i++) {
>> -		const struct tc_action *a = actions[i];
>> +		const struct tc_action *a;
>>  
>> +		rcu_read_lock();
> 
> But the caller already has rcu_read_lock() or rcu_read_lock_bh()
> 
> This was done in commit 25d8c0d55f241ce2 ("net: rcu-ify tcf_proto")
> 
>> +		a = rcu_dereference(actions[i]);
> 
> 
> Add in your .config :
> CONFIG_SPARSE_RCU_POINTER=y
> make C=2 M=net/sched
> 
>>  repeat:
>>  		ret = a->ops->act(skb, a, res);
>> +		rcu_read_unlock();
>> +
>>  		if (ret == TC_ACT_REPEAT)
>>  			goto repeat;	/* we need a ttl - JHS */
>>  		if (ret != TC_ACT_PIPE)
> 
> 
> 
> I do not believe this patch is necessary.
> 
> Please add John as reviewer next time.
> 
> Thanks.
> 
> 

So the actual issue as I see it is with the late binding actions the
ones created with the ill documented 'tc action' syntax.

If you add a rule here and then bind it to a filter (stealing Jamals
example),

  tc actions add action skbedit mark 10 index 1
  tc filter add dev $DEV parent ffff: protocol ip prio 1 u32\
             match ip dst 17.0.0.1/32 flowid 1:10 action skbedit index 1

then you can modify this action later with the following

  tc actions replace action ....

So for an action such as act_mirred we may end up running with a
partially updated state from this,

	tcf_mirred_init(...)
	[...]

		m->tcf_action = parm->action;
	        m->tcfm_eaction = parm->eaction;
	
		[...]
	

and then in the execution of the action,


	tcf_mirred(...)
		[...]
	        retval = READ_ONCE(m->tcf_action);
		[...]
                if (m->tcfm_eaction != TCA_EGRESS_MIRROR)
		[...]


So its at least possible I think these could be interleaved on multiple
cpus.

Notice that some of the actions are fine though and don't have this
issue act_bpf for example is fine.

I think we can either fix it in the hash table create part of the
list as this series does or just let each action handle it on its own.

.John

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC Patch net-next 0/6] net_sched: really switch to RCU for tc actions
  2016-09-07 16:23 ` John Fastabend
@ 2016-09-08  6:05   ` John Fastabend
  2016-09-09  5:22   ` Cong Wang
  1 sibling, 0 replies; 37+ messages in thread
From: John Fastabend @ 2016-09-08  6:05 UTC (permalink / raw)
  To: Cong Wang, netdev; +Cc: jhs

On 16-09-07 09:23 AM, John Fastabend wrote:
> On 16-09-01 10:57 PM, Cong Wang wrote:
>> Currently there are only two tc actions lockless:
>> gact and mirred. But they are questionable because
>> we don't have anything to prevent a parallel update
>> on an existing tc action in hash table while reading
>> it on fast path, this could be a problem when a tc
>> action becomes complex.
> 
> hmm I'm trying to see where the questionable part is in the current
> code? What is it exactly.

[...]

> What did I miss?
> 

OK tracked this down see the other patch 5/6.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC Patch net-next 5/6] net_sched: use rcu in fast path
  2016-09-08  6:04     ` John Fastabend
@ 2016-09-08 13:28       ` Eric Dumazet
  2016-09-08 15:35         ` [PATCH net] net_sched: act_mirred: full rcu conversion Eric Dumazet
  2016-09-08 15:49         ` [RFC Patch net-next 5/6] net_sched: use rcu in fast path John Fastabend
  0 siblings, 2 replies; 37+ messages in thread
From: Eric Dumazet @ 2016-09-08 13:28 UTC (permalink / raw)
  To: John Fastabend; +Cc: Cong Wang, netdev, jhs

On Wed, 2016-09-07 at 23:04 -0700, John Fastabend wrote:

> So the actual issue as I see it is with the late binding actions the
> ones created with the ill documented 'tc action' syntax.
> 
> If you add a rule here and then bind it to a filter (stealing Jamals
> example),
> 
>   tc actions add action skbedit mark 10 index 1
>   tc filter add dev $DEV parent ffff: protocol ip prio 1 u32\
>              match ip dst 17.0.0.1/32 flowid 1:10 action skbedit index 1
> 
> then you can modify this action later with the following
> 
>   tc actions replace action ....
> 
> So for an action such as act_mirred we may end up running with a
> partially updated state from this,
> 
> 	tcf_mirred_init(...)
> 	[...]
> 
> 		m->tcf_action = parm->action;
> 	        m->tcfm_eaction = parm->eaction;
> 	
> 		[...]
> 	
> 
> and then in the execution of the action,
> 
> 
> 	tcf_mirred(...)
> 		[...]
> 	        retval = READ_ONCE(m->tcf_action);
> 		[...]
>                 if (m->tcfm_eaction != TCA_EGRESS_MIRROR)
> 		[...]
> 
> 
> So its at least possible I think these could be interleaved on multiple
> cpus.

Sure, this was very clear when I wrote this code. Otherwise I would have
used an intermediate object and one rcu_dereference() instead of the
READ_ONCE().


> 
> Notice that some of the actions are fine though and don't have this
> issue act_bpf for example is fine.
> 
> I think we can either fix it in the hash table create part of the
> list as this series does or just let each action handle it on its own.

If we want a fix for stable kernels we want a tc_mirred fix on its own,
and I was planning to do so.

Given that a "tc qdisc replace ..." drops all packets sitting in the old
qdisc, I never thought someone would actually depend on atomically
switching tc_mirred parameters. I for sure did not care.

Thanks.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH net] net_sched: act_mirred: full rcu conversion
  2016-09-08 13:28       ` Eric Dumazet
@ 2016-09-08 15:35         ` Eric Dumazet
  2016-09-08 15:47           ` John Fastabend
  2016-09-09  5:24           ` Cong Wang
  2016-09-08 15:49         ` [RFC Patch net-next 5/6] net_sched: use rcu in fast path John Fastabend
  1 sibling, 2 replies; 37+ messages in thread
From: Eric Dumazet @ 2016-09-08 15:35 UTC (permalink / raw)
  To: David Miller
  Cc: Cong Wang, netdev, jhs, John Fastabend, Hadar Hen Zion, Amir Vadai

From: Eric Dumazet <edumazet@google.com>

As reported by Cong Wang, I was lazy when I did initial RCU conversion
of tc_mirred, as I thought I could avoid allocation/freeing of a
parameter block.

Use an intermediate object so that fast path can get a consistent
snapshot of multiple variables (dev, action, eaction, ok_push)

Fixes: 2ee22a90c7af ("net_sched: act_mirred: remove spinlock in fast path")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Cong Wang <xiyou.wangcong@gmail.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Hadar Hen Zion <hadarh@mellanox.com>
Cc: Amir Vadai <amirva@mellanox.com>
---
 include/net/tc_act/tc_mirred.h |   12 +++++-
 net/sched/act_mirred.c         |   54 +++++++++++++++++++++----------
 2 files changed, 47 insertions(+), 19 deletions(-)

diff --git a/include/net/tc_act/tc_mirred.h b/include/net/tc_act/tc_mirred.h
index 62770add15bd..a613b4b9c56e 100644
--- a/include/net/tc_act/tc_mirred.h
+++ b/include/net/tc_act/tc_mirred.h
@@ -4,12 +4,20 @@
 #include <net/act_api.h>
 #include <linux/tc_act/tc_mirred.h>
 
+
+struct tcf_mirred_params {
+	struct net_device	*dev;
+	int			action;
+	int			eaction;
+	int			ok_push;
+	struct rcu_head		rcu;
+};
+
 struct tcf_mirred {
 	struct tc_action	common;
 	int			tcfm_eaction;
 	int			tcfm_ifindex;
-	int			tcfm_ok_push;
-	struct net_device __rcu	*tcfm_dev;
+	struct tcf_mirred_params __rcu *params;
 	struct list_head	tcfm_list;
 };
 #define to_mirred(a) ((struct tcf_mirred *)a)
diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
index 6038c85d92f5..1dc0b6036180 100644
--- a/net/sched/act_mirred.c
+++ b/net/sched/act_mirred.c
@@ -36,15 +36,16 @@ static DEFINE_SPINLOCK(mirred_list_lock);
 static void tcf_mirred_release(struct tc_action *a, int bind)
 {
 	struct tcf_mirred *m = to_mirred(a);
-	struct net_device *dev;
+	struct tcf_mirred_params *params;
 
 	/* We could be called either in a RCU callback or with RTNL lock held. */
 	spin_lock_bh(&mirred_list_lock);
 	list_del(&m->tcfm_list);
-	dev = rcu_dereference_protected(m->tcfm_dev, 1);
-	if (dev)
-		dev_put(dev);
+	params = rcu_dereference_protected(m->params, 1);
+	if (params->dev)
+		dev_put(params->dev);
 	spin_unlock_bh(&mirred_list_lock);
+	kfree_rcu(params, rcu);
 }
 
 static const struct nla_policy mirred_policy[TCA_MIRRED_MAX + 1] = {
@@ -60,6 +61,8 @@ static int tcf_mirred_init(struct net *net, struct nlattr *nla,
 {
 	struct tc_action_net *tn = net_generic(net, mirred_net_id);
 	struct nlattr *tb[TCA_MIRRED_MAX + 1];
+	struct tcf_mirred_params *params_new;
+	struct tcf_mirred_params *params_old;
 	struct tc_mirred *parm;
 	struct tcf_mirred *m;
 	struct net_device *dev;
@@ -124,21 +127,33 @@ static int tcf_mirred_init(struct net *net, struct nlattr *nla,
 		tcf_hash_release(*a, bind);
 		if (!ovr)
 			return -EEXIST;
+		ret = 0;
 	}
 	m = to_mirred(*a);
 
 	ASSERT_RTNL();
-	m->tcf_action = parm->action;
-	m->tcfm_eaction = parm->eaction;
+	params_new = kzalloc(sizeof(*params_new), GFP_KERNEL);
+	if (unlikely(!params_new)) {
+		if (ret == ACT_P_CREATED)
+			tcf_hash_release(*a, bind);
+		return -ENOMEM;
+	}
+	params_old = rtnl_dereference(m->params);
+	params_new->action = m->tcf_action = parm->action;
+	params_new->eaction = m->tcfm_eaction = parm->eaction;
 	if (dev != NULL) {
 		m->tcfm_ifindex = parm->ifindex;
-		if (ret != ACT_P_CREATED)
-			dev_put(rcu_dereference_protected(m->tcfm_dev, 1));
 		dev_hold(dev);
-		rcu_assign_pointer(m->tcfm_dev, dev);
-		m->tcfm_ok_push = ok_push;
+		params_new->dev = dev;
+		params_new->ok_push = ok_push;
 	}
 
+	rcu_assign_pointer(m->params, params_new);
+	if (params_old) {
+		if (params_old->dev)
+			dev_put(params_old->dev);
+		kfree_rcu(params_old, rcu);
+	}
 	if (ret == ACT_P_CREATED) {
 		spin_lock_bh(&mirred_list_lock);
 		list_add(&m->tcfm_list, &mirred_list);
@@ -153,6 +168,7 @@ static int tcf_mirred(struct sk_buff *skb, const struct tc_action *a,
 		      struct tcf_result *res)
 {
 	struct tcf_mirred *m = to_mirred(a);
+	struct tcf_mirred_params *params;
 	struct net_device *dev;
 	struct sk_buff *skb2;
 	int retval, err;
@@ -162,8 +178,9 @@ static int tcf_mirred(struct sk_buff *skb, const struct tc_action *a,
 	bstats_cpu_update(this_cpu_ptr(m->common.cpu_bstats), skb);
 
 	rcu_read_lock();
-	retval = READ_ONCE(m->tcf_action);
-	dev = rcu_dereference(m->tcfm_dev);
+	params = rcu_dereference(m->params);
+	retval = params->action;
+	dev = params->dev;
 	if (unlikely(!dev)) {
 		pr_notice_once("tc mirred: target device is gone\n");
 		goto out;
@@ -181,12 +198,12 @@ static int tcf_mirred(struct sk_buff *skb, const struct tc_action *a,
 		goto out;
 
 	if (!(at & AT_EGRESS)) {
-		if (m->tcfm_ok_push)
+		if (params->ok_push)
 			skb_push_rcsum(skb2, skb->mac_len);
 	}
 
 	/* mirror is always swallowed */
-	if (m->tcfm_eaction != TCA_EGRESS_MIRROR)
+	if (params->eaction != TCA_EGRESS_MIRROR)
 		skb2->tc_verd = SET_TC_FROM(skb2->tc_verd, at);
 
 	skb2->skb_iif = skb->dev->ifindex;
@@ -196,7 +213,7 @@ static int tcf_mirred(struct sk_buff *skb, const struct tc_action *a,
 	if (err) {
 out:
 		qstats_overlimit_inc(this_cpu_ptr(m->common.cpu_qstats));
-		if (m->tcfm_eaction != TCA_EGRESS_MIRROR)
+		if (params->eaction != TCA_EGRESS_MIRROR)
 			retval = TC_ACT_SHOT;
 	}
 	rcu_read_unlock();
@@ -257,12 +274,15 @@ static int mirred_device_event(struct notifier_block *unused,
 	if (event == NETDEV_UNREGISTER) {
 		spin_lock_bh(&mirred_list_lock);
 		list_for_each_entry(m, &mirred_list, tcfm_list) {
-			if (rcu_access_pointer(m->tcfm_dev) == dev) {
+			struct tcf_mirred_params *params;
+
+			params = rtnl_dereference(m->params);
+			if (params->dev == dev) {
 				dev_put(dev);
 				/* Note : no rcu grace period necessary, as
 				 * net_device are already rcu protected.
 				 */
-				RCU_INIT_POINTER(m->tcfm_dev, NULL);
+				params->dev = NULL;
 			}
 		}
 		spin_unlock_bh(&mirred_list_lock);

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH net] net_sched: act_mirred: full rcu conversion
  2016-09-08 15:35         ` [PATCH net] net_sched: act_mirred: full rcu conversion Eric Dumazet
@ 2016-09-08 15:47           ` John Fastabend
  2016-09-08 15:51             ` Eric Dumazet
  2016-09-09  5:24           ` Cong Wang
  1 sibling, 1 reply; 37+ messages in thread
From: John Fastabend @ 2016-09-08 15:47 UTC (permalink / raw)
  To: Eric Dumazet, David Miller
  Cc: Cong Wang, netdev, jhs, Hadar Hen Zion, Amir Vadai

On 16-09-08 08:35 AM, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> As reported by Cong Wang, I was lazy when I did initial RCU conversion
> of tc_mirred, as I thought I could avoid allocation/freeing of a
> parameter block.
> 
> Use an intermediate object so that fast path can get a consistent
> snapshot of multiple variables (dev, action, eaction, ok_push)
> 
> Fixes: 2ee22a90c7af ("net_sched: act_mirred: remove spinlock in fast path")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: Cong Wang <xiyou.wangcong@gmail.com>
> Cc: John Fastabend <john.fastabend@gmail.com>
> Cc: Jamal Hadi Salim <jhs@mojatatu.com>
> Cc: Hadar Hen Zion <hadarh@mellanox.com>
> Cc: Amir Vadai <amirva@mellanox.com>
> ---

Works for me. FWIW I find this plenty straightforward and don't really
see the need to make the hash table itself rcu friendly.

Acked-by: John Fastabend <john.r.fastabend@intel.com>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC Patch net-next 5/6] net_sched: use rcu in fast path
  2016-09-08 13:28       ` Eric Dumazet
  2016-09-08 15:35         ` [PATCH net] net_sched: act_mirred: full rcu conversion Eric Dumazet
@ 2016-09-08 15:49         ` John Fastabend
  2016-09-09  5:54           ` Cong Wang
  1 sibling, 1 reply; 37+ messages in thread
From: John Fastabend @ 2016-09-08 15:49 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Cong Wang, netdev, jhs

[...]

>>
>> So its at least possible I think these could be interleaved on multiple
>> cpus.
> 
> Sure, this was very clear when I wrote this code. Otherwise I would have
> used an intermediate object and one rcu_dereference() instead of the
> READ_ONCE().
> 
> 
>>
>> Notice that some of the actions are fine though and don't have this
>> issue act_bpf for example is fine.
>>
>> I think we can either fix it in the hash table create part of the
>> list as this series does or just let each action handle it on its own.
> 
> If we want a fix for stable kernels we want a tc_mirred fix on its own,
> and I was planning to do so.
> 
> Given that a "tc qdisc replace ..." drops all packets sitting in the old
> qdisc, I never thought someone would actually depend on atomically
> switching tc_mirred parameters. I for sure did not care.
> 
> Thanks.
> 
> 

Agreed not sure why you would ever want to do a late binding and
replace on a tc_mirred actions. But it is supported...

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH net] net_sched: act_mirred: full rcu conversion
  2016-09-08 15:47           ` John Fastabend
@ 2016-09-08 15:51             ` Eric Dumazet
  2016-09-09  5:26               ` Cong Wang
  0 siblings, 1 reply; 37+ messages in thread
From: Eric Dumazet @ 2016-09-08 15:51 UTC (permalink / raw)
  To: John Fastabend
  Cc: David Miller, Cong Wang, netdev, jhs, Hadar Hen Zion, Amir Vadai

On Thu, 2016-09-08 at 08:47 -0700, John Fastabend wrote:

> Works for me. FWIW I find this plenty straightforward and don't really
> see the need to make the hash table itself rcu friendly.
> 
> Acked-by: John Fastabend <john.r.fastabend@intel.com>
> 

Yes, it seems this hash table is used in control path, with RTNL held
anyway.

Thanks for reviewing John.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC Patch net-next 0/6] net_sched: really switch to RCU for tc actions
  2016-09-07 16:23 ` John Fastabend
  2016-09-08  6:05   ` John Fastabend
@ 2016-09-09  5:22   ` Cong Wang
  1 sibling, 0 replies; 37+ messages in thread
From: Cong Wang @ 2016-09-09  5:22 UTC (permalink / raw)
  To: John Fastabend; +Cc: Linux Kernel Network Developers, Jamal Hadi Salim

On Wed, Sep 7, 2016 at 9:23 AM, John Fastabend <john.fastabend@gmail.com> wrote:
>
> hmm I'm trying to see where the questionable part is in the current
> code? What is it exactly.

All you need is a google search in netdev:

https://www.mail-archive.com/netdev@vger.kernel.org/msg115480.html

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH net] net_sched: act_mirred: full rcu conversion
  2016-09-08 15:35         ` [PATCH net] net_sched: act_mirred: full rcu conversion Eric Dumazet
  2016-09-08 15:47           ` John Fastabend
@ 2016-09-09  5:24           ` Cong Wang
  2016-09-09  5:48             ` Alexei Starovoitov
  2016-09-09 12:23             ` Eric Dumazet
  1 sibling, 2 replies; 37+ messages in thread
From: Cong Wang @ 2016-09-09  5:24 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David Miller, Linux Kernel Network Developers, Jamal Hadi Salim,
	John Fastabend, Hadar Hen Zion, Amir Vadai

On Thu, Sep 8, 2016 at 8:35 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> As reported by Cong Wang, I was lazy when I did initial RCU conversion
> of tc_mirred, as I thought I could avoid allocation/freeing of a
> parameter block.

Quote from Eric Dumazet:

https://www.mail-archive.com/netdev@vger.kernel.org/msg115482.html

<Quote>
Well, I added a READ_ONCE() to read tcf_action once.

Adding rcu here would mean adding a pointer and extra cache line, to
deref the values.

IMHO the race here has no effect . You either read the old or new value.
</Quote>

Me with facepalm... ;-)

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH net] net_sched: act_mirred: full rcu conversion
  2016-09-08 15:51             ` Eric Dumazet
@ 2016-09-09  5:26               ` Cong Wang
  2016-09-09 12:23                 ` Eric Dumazet
  2016-09-09 15:52                 ` John Fastabend
  0 siblings, 2 replies; 37+ messages in thread
From: Cong Wang @ 2016-09-09  5:26 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: John Fastabend, David Miller, Linux Kernel Network Developers,
	Jamal Hadi Salim, Hadar Hen Zion, Amir Vadai

On Thu, Sep 8, 2016 at 8:51 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Thu, 2016-09-08 at 08:47 -0700, John Fastabend wrote:
>
>> Works for me. FWIW I find this plenty straightforward and don't really
>> see the need to make the hash table itself rcu friendly.
>>
>> Acked-by: John Fastabend <john.r.fastabend@intel.com>
>>
>
> Yes, it seems this hash table is used in control path, with RTNL held
> anyway.

Seriously? You never read hashtable in fast path?? I think you need
to wake up.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH net] net_sched: act_mirred: full rcu conversion
  2016-09-09  5:24           ` Cong Wang
@ 2016-09-09  5:48             ` Alexei Starovoitov
  2016-09-09  5:59               ` Cong Wang
  2016-09-09 12:23             ` Eric Dumazet
  1 sibling, 1 reply; 37+ messages in thread
From: Alexei Starovoitov @ 2016-09-09  5:48 UTC (permalink / raw)
  To: Cong Wang
  Cc: Eric Dumazet, David Miller, Linux Kernel Network Developers,
	Jamal Hadi Salim, John Fastabend, Hadar Hen Zion, Amir Vadai

On Thu, Sep 08, 2016 at 10:24:32PM -0700, Cong Wang wrote:
> On Thu, Sep 8, 2016 at 8:35 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > From: Eric Dumazet <edumazet@google.com>
> >
> > As reported by Cong Wang, I was lazy when I did initial RCU conversion
> > of tc_mirred, as I thought I could avoid allocation/freeing of a
> > parameter block.
> 
> Quote from Eric Dumazet:
> 
> https://www.mail-archive.com/netdev@vger.kernel.org/msg115482.html
> 
> <Quote>
> Well, I added a READ_ONCE() to read tcf_action once.
> 
> Adding rcu here would mean adding a pointer and extra cache line, to
> deref the values.
> 
> IMHO the race here has no effect . You either read the old or new value.
> </Quote>
> 
> Me with facepalm... ;-)

imo the deliberate small race in Eric's initial lock removal in mirred
was a good design choice. I think he did this patch only to
silence the complains with the code instead of arguing with words.
imo the initial code is good as-is. This patch is also a nice improvement,
but certainly not mandatory. 'face palm' is unnecessary.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC Patch net-next 5/6] net_sched: use rcu in fast path
  2016-09-06 14:52   ` Eric Dumazet
  2016-09-08  6:04     ` John Fastabend
@ 2016-09-09  5:49     ` Cong Wang
  1 sibling, 0 replies; 37+ messages in thread
From: Cong Wang @ 2016-09-09  5:49 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Linux Kernel Network Developers, Jamal Hadi Salim

On Tue, Sep 6, 2016 at 7:52 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Thu, 2016-09-01 at 22:57 -0700, Cong Wang wrote:
>
>
>
> Missing changelog ?

I was too lazy to write a changelog for this RFC patch, surely
it will be added when removing RFC.


>
> Here I have no idea what you want to fix, since John already took care
> all this infra.
>
> Adding extra rcu_dereference() and rcu_read_lock() while the critical
> RCU dereferences already happen in callers is not needed.


Of course it is, TX and RX both have rcu read lock held.


>
>> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
>> ---
>>  net/sched/act_api.c | 6 +++++-
>>  1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/sched/act_api.c b/net/sched/act_api.c
>> index 2f8db3c..fb6ff52 100644
>> --- a/net/sched/act_api.c
>> +++ b/net/sched/act_api.c
>> @@ -470,10 +470,14 @@ int tcf_action_exec(struct sk_buff *skb, struct tc_action **actions,
>>               goto exec_done;
>>       }
>>       for (i = 0; i < nr_actions; i++) {
>> -             const struct tc_action *a = actions[i];
>> +             const struct tc_action *a;
>>
>> +             rcu_read_lock();
>
> But the caller already has rcu_read_lock() or rcu_read_lock_bh()
>
> This was done in commit 25d8c0d55f241ce2 ("net: rcu-ify tcf_proto")

More precisely, it is __dev_queue_xmit() or netif_receive_skb_internal().
Not directly related to John.

The reason why I made it is to make it explicit that I take rcu_read_lock()
for tc action fast path, since it can be called recursively.

>
>> +             a = rcu_dereference(actions[i]);
>
>
> Add in your .config :
> CONFIG_SPARSE_RCU_POINTER=y
> make C=2 M=net/sched
>

Yeah, kbuild-robot already reported this to me, no worries,
fixing it is already in my TODO list.



>>  repeat:
>>               ret = a->ops->act(skb, a, res);
>> +             rcu_read_unlock();
>> +
>>               if (ret == TC_ACT_REPEAT)
>>                       goto repeat;    /* we need a ttl - JHS */
>>               if (ret != TC_ACT_PIPE)
>
>
>
> I do not believe this patch is necessary.
>
> Please add John as reviewer next time.
>

You missed the rcu_dereference() part.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC Patch net-next 5/6] net_sched: use rcu in fast path
  2016-09-08 15:49         ` [RFC Patch net-next 5/6] net_sched: use rcu in fast path John Fastabend
@ 2016-09-09  5:54           ` Cong Wang
  2016-09-09 15:25             ` John Fastabend
  0 siblings, 1 reply; 37+ messages in thread
From: Cong Wang @ 2016-09-09  5:54 UTC (permalink / raw)
  To: John Fastabend
  Cc: Eric Dumazet, Linux Kernel Network Developers, Jamal Hadi Salim

On Thu, Sep 8, 2016 at 8:49 AM, John Fastabend <john.fastabend@gmail.com> wrote:
> Agreed not sure why you would ever want to do a late binding and
> replace on a tc_mirred actions. But it is supported...

I will let Jamal teach you on this, /me is really tired of explaining
things to you John.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH net] net_sched: act_mirred: full rcu conversion
  2016-09-09  5:48             ` Alexei Starovoitov
@ 2016-09-09  5:59               ` Cong Wang
  2016-09-09 12:25                 ` Eric Dumazet
  0 siblings, 1 reply; 37+ messages in thread
From: Cong Wang @ 2016-09-09  5:59 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Eric Dumazet, David Miller, Linux Kernel Network Developers,
	Jamal Hadi Salim, John Fastabend, Hadar Hen Zion, Amir Vadai

On Thu, Sep 8, 2016 at 10:48 PM, Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> imo the deliberate small race in Eric's initial lock removal in mirred
> was a good design choice. I think he did this patch only to
> silence the complains with the code instead of arguing with words.
> imo the initial code is good as-is. This patch is also a nice improvement,
> but certainly not mandatory. 'face palm' is unnecessary.

LOL, to summarize your words:

1) not changing the current code is good
2) changing the current code is good too

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH net] net_sched: act_mirred: full rcu conversion
  2016-09-09  5:26               ` Cong Wang
@ 2016-09-09 12:23                 ` Eric Dumazet
  2016-09-09 15:52                 ` John Fastabend
  1 sibling, 0 replies; 37+ messages in thread
From: Eric Dumazet @ 2016-09-09 12:23 UTC (permalink / raw)
  To: Cong Wang
  Cc: John Fastabend, David Miller, Linux Kernel Network Developers,
	Jamal Hadi Salim, Hadar Hen Zion, Amir Vadai

On Thu, 2016-09-08 at 22:26 -0700, Cong Wang wrote:
> On Thu, Sep 8, 2016 at 8:51 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > On Thu, 2016-09-08 at 08:47 -0700, John Fastabend wrote:
> >
> >> Works for me. FWIW I find this plenty straightforward and don't really
> >> see the need to make the hash table itself rcu friendly.
> >>
> >> Acked-by: John Fastabend <john.r.fastabend@intel.com>
> >>
> >
> > Yes, it seems this hash table is used in control path, with RTNL held
> > anyway.
> 
> Seriously? You never read hashtable in fast path?? I think you need
> to wake up.

I seriously consider to ignore your mails.

Since you write patches with no changelog, expecting us to read your
mind, you definitely don't need us.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH net] net_sched: act_mirred: full rcu conversion
  2016-09-09  5:24           ` Cong Wang
  2016-09-09  5:48             ` Alexei Starovoitov
@ 2016-09-09 12:23             ` Eric Dumazet
  2016-09-12  5:46               ` Cong Wang
  1 sibling, 1 reply; 37+ messages in thread
From: Eric Dumazet @ 2016-09-09 12:23 UTC (permalink / raw)
  To: Cong Wang
  Cc: David Miller, Linux Kernel Network Developers, Jamal Hadi Salim,
	John Fastabend, Hadar Hen Zion, Amir Vadai

On Thu, 2016-09-08 at 22:24 -0700, Cong Wang wrote:
> On Thu, Sep 8, 2016 at 8:35 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > From: Eric Dumazet <edumazet@google.com>
> >
> > As reported by Cong Wang, I was lazy when I did initial RCU conversion
> > of tc_mirred, as I thought I could avoid allocation/freeing of a
> > parameter block.
> 
> Quote from Eric Dumazet:
> 
> https://www.mail-archive.com/netdev@vger.kernel.org/msg115482.html
> 
> <Quote>
> Well, I added a READ_ONCE() to read tcf_action once.
> 
> Adding rcu here would mean adding a pointer and extra cache line, to
> deref the values.
> 
> IMHO the race here has no effect . You either read the old or new value.
> </Quote>
> 
> Me with facepalm... ;-)


Point is still valid. Show me a real case where it was a serious
problem, instead of simply theoretical.

tc_mirred + ifb patches allowed us to reach a milestone, removing the
last contended spinlocks, and you are catching up with this one year
later.

I wont backport this fix in Google prod kernels, because there is
absolutely no way we need it, and the extra memory cache line might hurt
latencies.

Since you did not write a fix on your side since June 17th, I presume
you do not care that much.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH net] net_sched: act_mirred: full rcu conversion
  2016-09-09  5:59               ` Cong Wang
@ 2016-09-09 12:25                 ` Eric Dumazet
  0 siblings, 0 replies; 37+ messages in thread
From: Eric Dumazet @ 2016-09-09 12:25 UTC (permalink / raw)
  To: Cong Wang
  Cc: Alexei Starovoitov, David Miller,
	Linux Kernel Network Developers, Jamal Hadi Salim,
	John Fastabend, Hadar Hen Zion, Amir Vadai

On Thu, 2016-09-08 at 22:59 -0700, Cong Wang wrote:
> On Thu, Sep 8, 2016 at 10:48 PM, Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > imo the deliberate small race in Eric's initial lock removal in mirred
> > was a good design choice. I think he did this patch only to
> > silence the complains with the code instead of arguing with words.
> > imo the initial code is good as-is. This patch is also a nice improvement,
> > but certainly not mandatory. 'face palm' is unnecessary.
> 
> LOL, to summarize your words:
> 
> 1) not changing the current code is good
> 2) changing the current code is good too

Only because I was tired of your rants.

I wont backport this patch to Google prod kernels, I already spent too
much time.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC Patch net-next 5/6] net_sched: use rcu in fast path
  2016-09-09  5:54           ` Cong Wang
@ 2016-09-09 15:25             ` John Fastabend
  0 siblings, 0 replies; 37+ messages in thread
From: John Fastabend @ 2016-09-09 15:25 UTC (permalink / raw)
  To: Cong Wang; +Cc: Eric Dumazet, Linux Kernel Network Developers, Jamal Hadi Salim

On 16-09-08 10:54 PM, Cong Wang wrote:
> On Thu, Sep 8, 2016 at 8:49 AM, John Fastabend <john.fastabend@gmail.com> wrote:
>> Agreed not sure why you would ever want to do a late binding and
>> replace on a tc_mirred actions. But it is supported...
> 
> I will let Jamal teach you on this, /me is really tired of explaining
> things to you John.
> 

This was a meta-comment on the use case for doing this with mirred
action. Not necessarily about the patch itself.

I was actually curious where this happens in practice. The only thing
I can think of is your external logging box moved so you need to send
out another port. Is there any open source software that manages 'tc'
like this. If so I would like to read it.

So do you know of any?

.John

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH net] net_sched: act_mirred: full rcu conversion
  2016-09-09  5:26               ` Cong Wang
  2016-09-09 12:23                 ` Eric Dumazet
@ 2016-09-09 15:52                 ` John Fastabend
  2016-09-12  6:12                   ` Cong Wang
  1 sibling, 1 reply; 37+ messages in thread
From: John Fastabend @ 2016-09-09 15:52 UTC (permalink / raw)
  To: Cong Wang, Eric Dumazet
  Cc: David Miller, Linux Kernel Network Developers, Jamal Hadi Salim,
	Hadar Hen Zion, Amir Vadai

On 16-09-08 10:26 PM, Cong Wang wrote:
> On Thu, Sep 8, 2016 at 8:51 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> On Thu, 2016-09-08 at 08:47 -0700, John Fastabend wrote:
>>
>>> Works for me. FWIW I find this plenty straightforward and don't really
>>> see the need to make the hash table itself rcu friendly.
>>>
>>> Acked-by: John Fastabend <john.r.fastabend@intel.com>
>>>
>>
>> Yes, it seems this hash table is used in control path, with RTNL held
>> anyway.
> 
> Seriously? You never read hashtable in fast path?? I think you need
> to wake up.
> 

But the actions use refcnt'ing and should never be decremented to zero
as long as they can still be referenced by an active filter. If each
action handles its parameters like mirred/gact then I don't see why its
necessary.

I believe though that the refcnt needs to be fixed a bit though most
likely by making it atomic. I original assumed it was protected by
RTNL lock but because its getting decremented from rcu callback this is
not true.

.John

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH net] net_sched: act_mirred: full rcu conversion
  2016-09-09 12:23             ` Eric Dumazet
@ 2016-09-12  5:46               ` Cong Wang
  0 siblings, 0 replies; 37+ messages in thread
From: Cong Wang @ 2016-09-12  5:46 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David Miller, Linux Kernel Network Developers, Jamal Hadi Salim,
	John Fastabend, Hadar Hen Zion, Amir Vadai

On Fri, Sep 9, 2016 at 5:23 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Thu, 2016-09-08 at 22:24 -0700, Cong Wang wrote:
>> On Thu, Sep 8, 2016 at 8:35 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> > From: Eric Dumazet <edumazet@google.com>
>> >
>> > As reported by Cong Wang, I was lazy when I did initial RCU conversion
>> > of tc_mirred, as I thought I could avoid allocation/freeing of a
>> > parameter block.
>>
>> Quote from Eric Dumazet:
>>
>> https://www.mail-archive.com/netdev@vger.kernel.org/msg115482.html
>>
>> <Quote>
>> Well, I added a READ_ONCE() to read tcf_action once.
>>
>> Adding rcu here would mean adding a pointer and extra cache line, to
>> deref the values.
>>
>> IMHO the race here has no effect . You either read the old or new value.
>> </Quote>
>>
>> Me with facepalm... ;-)
>
>
> Point is still valid. Show me a real case where it was a serious
> problem, instead of simply theoretical.
>
> tc_mirred + ifb patches allowed us to reach a milestone, removing the
> last contended spinlocks, and you are catching up with this one year
> later.
>
> I wont backport this fix in Google prod kernels, because there is
> absolutely no way we need it, and the extra memory cache line might hurt
> latencies.
>
> Since you did not write a fix on your side since June 17th, I presume
> you do not care that much.

Sounds like I were the author of this patch.... Why are you questioning
your own patch? Did I ask you to care about it? ;-)

Please drop this patch.

Thanks.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH net] net_sched: act_mirred: full rcu conversion
  2016-09-09 15:52                 ` John Fastabend
@ 2016-09-12  6:12                   ` Cong Wang
  2016-09-12 15:34                     ` John Fastabend
  0 siblings, 1 reply; 37+ messages in thread
From: Cong Wang @ 2016-09-12  6:12 UTC (permalink / raw)
  To: John Fastabend
  Cc: Eric Dumazet, David Miller, Linux Kernel Network Developers,
	Jamal Hadi Salim, Hadar Hen Zion, Amir Vadai

On Fri, Sep 9, 2016 at 8:52 AM, John Fastabend <john.fastabend@gmail.com> wrote:
> On 16-09-08 10:26 PM, Cong Wang wrote:
>> On Thu, Sep 8, 2016 at 8:51 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>>> On Thu, 2016-09-08 at 08:47 -0700, John Fastabend wrote:
>>>
>>>> Works for me. FWIW I find this plenty straightforward and don't really
>>>> see the need to make the hash table itself rcu friendly.
>>>>
>>>> Acked-by: John Fastabend <john.r.fastabend@intel.com>
>>>>
>>>
>>> Yes, it seems this hash table is used in control path, with RTNL held
>>> anyway.
>>
>> Seriously? You never read hashtable in fast path?? I think you need
>> to wake up.
>>
>
> But the actions use refcnt'ing and should never be decremented to zero
> as long as they can still be referenced by an active filter. If each
> action handles its parameters like mirred/gact then I don't see why its
> necessary.

This is correct, by "read" I meant "dereference", the tc actions
are now permanently stored in hashtable directly, so "reading"
a tc action is reading from hashtable.

Sorry if this wasn't clear.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH net] net_sched: act_mirred: full rcu conversion
  2016-09-12  6:12                   ` Cong Wang
@ 2016-09-12 15:34                     ` John Fastabend
  0 siblings, 0 replies; 37+ messages in thread
From: John Fastabend @ 2016-09-12 15:34 UTC (permalink / raw)
  To: Cong Wang
  Cc: Eric Dumazet, David Miller, Linux Kernel Network Developers,
	Jamal Hadi Salim, Hadar Hen Zion, Amir Vadai

On 16-09-11 11:12 PM, Cong Wang wrote:
> On Fri, Sep 9, 2016 at 8:52 AM, John Fastabend <john.fastabend@gmail.com> wrote:
>> On 16-09-08 10:26 PM, Cong Wang wrote:
>>> On Thu, Sep 8, 2016 at 8:51 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>>>> On Thu, 2016-09-08 at 08:47 -0700, John Fastabend wrote:
>>>>
>>>>> Works for me. FWIW I find this plenty straightforward and don't really
>>>>> see the need to make the hash table itself rcu friendly.
>>>>>
>>>>> Acked-by: John Fastabend <john.r.fastabend@intel.com>
>>>>>
>>>>
>>>> Yes, it seems this hash table is used in control path, with RTNL held
>>>> anyway.
>>>
>>> Seriously? You never read hashtable in fast path?? I think you need
>>> to wake up.
>>>
>>
>> But the actions use refcnt'ing and should never be decremented to zero
>> as long as they can still be referenced by an active filter. If each
>> action handles its parameters like mirred/gact then I don't see why its
>> necessary.
> 
> This is correct, by "read" I meant "dereference", the tc actions
> are now permanently stored in hashtable directly, so "reading"
> a tc action is reading from hashtable.
> 
> Sorry if this wasn't clear.
> 

OK, but with the current code there is no need to protect the hash table
with an RCU semantics. The ref counting ensures the hash table entries
are always available and any 'replace' commands on actions are handled
internally in the action itself with rcu_assign_pointer replacing old
params with new params. The fast path readers will always read a
consistent set of parameters in this scheme.

So no need for an rcu hash table. The reference count though should
likely be atomic because not all increments/decrements are protected by
RTNL lock.

.John

^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2016-09-12 15:35 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-09-02  5:57 [RFC Patch net-next 0/6] net_sched: really switch to RCU for tc actions Cong Wang
2016-09-02  5:57 ` [RFC Patch net-next 1/6] net_sched: use RCU for action hash table Cong Wang
2016-09-06 12:47   ` Jamal Hadi Salim
2016-09-06 22:37     ` Cong Wang
2016-09-02  5:57 ` [RFC Patch net-next 2/6] net_sched: introduce tcf_hash_replace() Cong Wang
2016-09-06 12:52   ` Jamal Hadi Salim
2016-09-06 22:34     ` Cong Wang
2016-09-02  5:57 ` [RFC Patch net-next 3/6] net_sched: return NULL in tcf_hash_check() Cong Wang
2016-09-02  5:57 ` [RFC Patch net-next 4/6] net_sched: introduce tcf_hash_copy() Cong Wang
2016-09-02  5:57 ` [RFC Patch net-next 5/6] net_sched: use rcu in fast path Cong Wang
2016-09-06 14:52   ` Eric Dumazet
2016-09-08  6:04     ` John Fastabend
2016-09-08 13:28       ` Eric Dumazet
2016-09-08 15:35         ` [PATCH net] net_sched: act_mirred: full rcu conversion Eric Dumazet
2016-09-08 15:47           ` John Fastabend
2016-09-08 15:51             ` Eric Dumazet
2016-09-09  5:26               ` Cong Wang
2016-09-09 12:23                 ` Eric Dumazet
2016-09-09 15:52                 ` John Fastabend
2016-09-12  6:12                   ` Cong Wang
2016-09-12 15:34                     ` John Fastabend
2016-09-09  5:24           ` Cong Wang
2016-09-09  5:48             ` Alexei Starovoitov
2016-09-09  5:59               ` Cong Wang
2016-09-09 12:25                 ` Eric Dumazet
2016-09-09 12:23             ` Eric Dumazet
2016-09-12  5:46               ` Cong Wang
2016-09-08 15:49         ` [RFC Patch net-next 5/6] net_sched: use rcu in fast path John Fastabend
2016-09-09  5:54           ` Cong Wang
2016-09-09 15:25             ` John Fastabend
2016-09-09  5:49     ` Cong Wang
2016-09-02  5:57 ` [RFC Patch net-next 6/6] net_sched: switch to RCU API for act_mirred Cong Wang
2016-09-02  7:09 ` [RFC Patch net-next 0/6] net_sched: really switch to RCU for tc actions Jiri Pirko
2016-09-02 19:44   ` Cong Wang
2016-09-07 16:23 ` John Fastabend
2016-09-08  6:05   ` John Fastabend
2016-09-09  5:22   ` Cong Wang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.