* [PATCH net-next v3 00/12] Refactor flower classifier to remove dependency on rtnl lock
@ 2019-03-21 13:17 Vlad Buslov
  2019-03-21 13:17 ` [PATCH net-next v3 01/12] net: sched: flower: don't check for rtnl on head dereference Vlad Buslov
                   ` (12 more replies)
  0 siblings, 13 replies; 22+ messages in thread
From: Vlad Buslov @ 2019-03-21 13:17 UTC (permalink / raw)
  To: netdev; +Cc: jhs, xiyou.wangcong, jiri, davem, sbrivio, Vlad Buslov

Currently, all netlink protocol handlers for updating rules, actions and
qdiscs are protected by a single global rtnl lock, which removes any
possibility of parallelism. This patch set is the third step towards
removing the rtnl lock dependency from the TC rules update path.

Recently, a new rtnl registration flag, RTNL_FLAG_DOIT_UNLOCKED, was
added. TC rule update handlers (RTM_NEWTFILTER, RTM_DELTFILTER, etc.) are
already registered with this flag and only take the rtnl lock when the
qdisc or classifier requires it. Classifiers can indicate that their ops
callbacks don't require the caller to hold the rtnl lock by setting the
TCF_PROTO_OPS_DOIT_UNLOCKED flag. The goal of this change is to refactor
the flower classifier to support unlocked execution and register it with
the unlocked flag.

This patch set implements the following changes to make the flower
classifier concurrency-safe:

- Implement reference counting for individual filters. Change fl_get to
  take a reference to the filter. Implement the tp->ops->put callback,
  which was introduced in the cls API patch set, to release the
  reference to a flower filter.

- Use the tp->lock spinlock to protect internal classifier data
  structures from concurrent modification.

- Handle concurrent tcf proto deletion by returning EAGAIN, which
  causes the cls API to retry and either create a new proto instance or
  return an error to the user (depending on message type).

- Handle concurrent insertion of a filter with the same priority and
  handle by returning EAGAIN, which causes the cls API to look up the
  filter again and process it according to the netlink message flags.

- Extend the flower mask with reference counting and protect the masks
  list with the masks_lock spinlock.

- Prevent concurrent mask insertion by inserting a temporary value into
  the masks hash table. This is necessary because mask initialization
  is a sleeping operation and cannot be done while holding tp->lock.

Both chain-level and classifier-level conflicts are resolved by
returning -EAGAIN to the cls API, which results in a restart of the
whole operation. This retry mechanism is a consequence of the
fine-grained locking approach used in this and previous changes in the
series and is necessary to allow concurrent updates on the same chain
instance. An alternative approach would be to lock the whole chain while
updating filters on any of its child tp's and while adding and removing
classifier instances from the chain. However, since the most
CPU-intensive parts of the filter update code are in the classifier
code and its dependencies (extensions and hw offloads), such an
approach would negate most of the gains introduced by this change and
previous changes in the series when updating the same chain instance.
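
As a rough illustration of the retry mechanism described above, the
following userspace sketch restarts an operation for as long as the
callee reports -EAGAIN. It is not the cls API code: struct fake_update,
fake_change() and update_with_retry() are hypothetical names invented
for this example, and the "conflict" is simulated with a counter.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a classifier update that can lose a race. */
struct fake_update {
	int conflicts_left;	/* how many calls will report a conflict */
	int attempts;		/* total number of calls observed */
};

/* Simulated callee: reports -EAGAIN while a concurrent change is pending. */
static int fake_change(struct fake_update *u)
{
	u->attempts++;
	if (u->conflicts_left > 0) {
		u->conflicts_left--;
		return -EAGAIN;
	}
	return 0;
}

/* Caller side: restart the whole operation until it no longer conflicts. */
static int update_with_retry(struct fake_update *u)
{
	int err;

	assert(u != NULL);
	do {
		/* Real code would repeat its lookups here before retrying. */
		err = fake_change(u);
	} while (err == -EAGAIN);

	return err;
}
```

Any error other than -EAGAIN ends the loop and is passed back to the
caller unchanged.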

The tcf hw offloads API is not changed by this patch set and still
requires the caller to hold the rtnl lock. The refactored flower
classifier tracks rtnl lock state by means of the 'rtnl_held' flag
provided by the cls API and obtains the lock before calling hw offloads.
A following patch set will lift this restriction and refactor the cls hw
offloads API to support unlocked execution.

With these changes, the flower classifier is safely registered with the
TCF_PROTO_OPS_DOIT_UNLOCKED flag in the last patch.

Changes from V2 to V3:
- Rebase on latest net-next

Changes from V1 to V2:
- Extend cover letter with explanation about retry mechanism.
- Rebase on current net-next.
- Patch 1:
  - Use rcu_dereference_raw() for tp->root dereference.
  - Update comment in fl_head_dereference().
- Patch 2:
  - Remove redundant check in fl_change error handling code.
  - Add empty line between error check and new handle assignment.
- Patch 3:
  - Refactor loop in fl_get_next_filter() to improve readability.
- Patch 4:
  - Refactor __fl_delete() to improve readability.
- Patch 6:
  - Fix comment in fl_check_assign_mask().
- Patch 9:
  - Extend commit message.
  - Fix error code in comment.
- Patch 11:
  - Fix fl_hw_replace_filter() to always release rtnl lock in error
    handlers.
- Patch 12:
  - Don't take rtnl lock before calling __fl_destroy_filter() in
    workqueue context.
  - Extend commit message with explanation why flower still takes rtnl
    lock before calling hardware offloads API.

Github: <https://github.com/vbuslov/linux/tree/unlocked-flower-cong3>

Vlad Buslov (12):
  net: sched: flower: don't check for rtnl on head dereference
  net: sched: flower: refactor fl_change
  net: sched: flower: introduce reference counting for filters
  net: sched: flower: track filter deletion with flag
  net: sched: flower: add reference counter to flower mask
  net: sched: flower: handle concurrent mask insertion
  net: sched: flower: protect masks list with spinlock
  net: sched: flower: handle concurrent filter insertion in fl_change
  net: sched: flower: handle concurrent tcf proto deletion
  net: sched: flower: protect flower classifier state with spinlock
  net: sched: flower: track rtnl lock state
  net: sched: flower: set unlocked flag for flower proto ops

 net/sched/cls_flower.c | 433 +++++++++++++++++++++++++++++++----------
 1 file changed, 325 insertions(+), 108 deletions(-)

-- 
2.21.0



* [PATCH net-next v3 01/12] net: sched: flower: don't check for rtnl on head dereference
  2019-03-21 13:17 [PATCH net-next v3 00/12] Refactor flower classifier to remove dependency on rtnl lock Vlad Buslov
@ 2019-03-21 13:17 ` Vlad Buslov
  2019-03-21 13:51   ` Jiri Pirko
  2019-03-21 13:17 ` [PATCH net-next v3 02/12] net: sched: flower: refactor fl_change Vlad Buslov
                   ` (11 subsequent siblings)
  12 siblings, 1 reply; 22+ messages in thread
From: Vlad Buslov @ 2019-03-21 13:17 UTC (permalink / raw)
  To: netdev; +Cc: jhs, xiyou.wangcong, jiri, davem, sbrivio, Vlad Buslov

The flower classifier only changes the root pointer during init and
destroy. The cls API implements reference counting for tcf_proto, so there
is no danger of concurrent access to tp while it is being destroyed, even
without the protection provided by the rtnl lock.

Implement a new function, fl_head_dereference(), to dereference tp->root
without checking for the rtnl lock. Use it instead of rtnl_dereference()
in all flower functions that obtain the head pointer.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
---
 net/sched/cls_flower.c | 24 +++++++++++++++++-------
 1 file changed, 17 insertions(+), 7 deletions(-)

diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index c04247b403ed..dcf3aee5697e 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -437,10 +437,20 @@ static void fl_hw_update_stats(struct tcf_proto *tp, struct cls_fl_filter *f)
 			      cls_flower.stats.lastused);
 }
 
+static struct cls_fl_head *fl_head_dereference(struct tcf_proto *tp)
+{
+	/* Flower classifier only changes root pointer during init and destroy.
+	 * Users must obtain reference to tcf_proto instance before calling its
+	 * API, so tp->root pointer is protected from concurrent call to
+	 * fl_destroy() by reference counting.
+	 */
+	return rcu_dereference_raw(tp->root);
+}
+
 static bool __fl_delete(struct tcf_proto *tp, struct cls_fl_filter *f,
 			struct netlink_ext_ack *extack)
 {
-	struct cls_fl_head *head = rtnl_dereference(tp->root);
+	struct cls_fl_head *head = fl_head_dereference(tp);
 	bool async = tcf_exts_get_net(&f->exts);
 	bool last;
 
@@ -472,7 +482,7 @@ static void fl_destroy_sleepable(struct work_struct *work)
 static void fl_destroy(struct tcf_proto *tp, bool rtnl_held,
 		       struct netlink_ext_ack *extack)
 {
-	struct cls_fl_head *head = rtnl_dereference(tp->root);
+	struct cls_fl_head *head = fl_head_dereference(tp);
 	struct fl_flow_mask *mask, *next_mask;
 	struct cls_fl_filter *f, *next;
 
@@ -490,7 +500,7 @@ static void fl_destroy(struct tcf_proto *tp, bool rtnl_held,
 
 static void *fl_get(struct tcf_proto *tp, u32 handle)
 {
-	struct cls_fl_head *head = rtnl_dereference(tp->root);
+	struct cls_fl_head *head = fl_head_dereference(tp);
 
 	return idr_find(&head->handle_idr, handle);
 }
@@ -1308,7 +1318,7 @@ static int fl_change(struct net *net, struct sk_buff *in_skb,
 		     void **arg, bool ovr, bool rtnl_held,
 		     struct netlink_ext_ack *extack)
 {
-	struct cls_fl_head *head = rtnl_dereference(tp->root);
+	struct cls_fl_head *head = fl_head_dereference(tp);
 	struct cls_fl_filter *fold = *arg;
 	struct cls_fl_filter *fnew;
 	struct fl_flow_mask *mask;
@@ -1446,7 +1456,7 @@ static int fl_change(struct net *net, struct sk_buff *in_skb,
 static int fl_delete(struct tcf_proto *tp, void *arg, bool *last,
 		     bool rtnl_held, struct netlink_ext_ack *extack)
 {
-	struct cls_fl_head *head = rtnl_dereference(tp->root);
+	struct cls_fl_head *head = fl_head_dereference(tp);
 	struct cls_fl_filter *f = arg;
 
 	rhashtable_remove_fast(&f->mask->ht, &f->ht_node,
@@ -1459,7 +1469,7 @@ static int fl_delete(struct tcf_proto *tp, void *arg, bool *last,
 static void fl_walk(struct tcf_proto *tp, struct tcf_walker *arg,
 		    bool rtnl_held)
 {
-	struct cls_fl_head *head = rtnl_dereference(tp->root);
+	struct cls_fl_head *head = fl_head_dereference(tp);
 	struct cls_fl_filter *f;
 
 	arg->count = arg->skip;
@@ -1478,7 +1488,7 @@ static void fl_walk(struct tcf_proto *tp, struct tcf_walker *arg,
 static int fl_reoffload(struct tcf_proto *tp, bool add, tc_setup_cb_t *cb,
 			void *cb_priv, struct netlink_ext_ack *extack)
 {
-	struct cls_fl_head *head = rtnl_dereference(tp->root);
+	struct cls_fl_head *head = fl_head_dereference(tp);
 	struct tc_cls_flower_offload cls_flower = {};
 	struct tcf_block *block = tp->chain->block;
 	struct fl_flow_mask *mask;
-- 
2.21.0



* [PATCH net-next v3 02/12] net: sched: flower: refactor fl_change
  2019-03-21 13:17 [PATCH net-next v3 00/12] Refactor flower classifier to remove dependency on rtnl lock Vlad Buslov
  2019-03-21 13:17 ` [PATCH net-next v3 01/12] net: sched: flower: don't check for rtnl on head dereference Vlad Buslov
@ 2019-03-21 13:17 ` Vlad Buslov
  2019-03-21 13:53   ` Jiri Pirko
  2019-03-21 13:17 ` [PATCH net-next v3 03/12] net: sched: flower: introduce reference counting for filters Vlad Buslov
                   ` (10 subsequent siblings)
  12 siblings, 1 reply; 22+ messages in thread
From: Vlad Buslov @ 2019-03-21 13:17 UTC (permalink / raw)
  To: netdev; +Cc: jhs, xiyou.wangcong, jiri, davem, sbrivio, Vlad Buslov

In preparation for using the classifier spinlock instead of relying on
the external rtnl lock, rearrange the code in fl_change. The goal is to
group the code that changes classifier state into a single block so that
following commits in this set can protect it from parallel modification
with tp->lock. The data structures that require tp->lock protection are
the mask hashtable, the filters list and the classifier handle_idr.

fl_hw_replace_filter() is a sleeping function and cannot be called while
holding a spinlock. In order to execute the whole sequence of changes to
shared classifier data structures atomically, call fl_hw_replace_filter()
before modifying them.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
---
 net/sched/cls_flower.c | 80 ++++++++++++++++++++++--------------------
 1 file changed, 41 insertions(+), 39 deletions(-)

diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index dcf3aee5697e..d36ceb5001f9 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -1376,73 +1376,75 @@ static int fl_change(struct net *net, struct sk_buff *in_skb,
 	if (err)
 		goto errout;
 
-	if (!handle) {
-		handle = 1;
-		err = idr_alloc_u32(&head->handle_idr, fnew, &handle,
-				    INT_MAX, GFP_KERNEL);
-	} else if (!fold) {
-		/* user specifies a handle and it doesn't exist */
-		err = idr_alloc_u32(&head->handle_idr, fnew, &handle,
-				    handle, GFP_KERNEL);
-	}
-	if (err)
-		goto errout_mask;
-	fnew->handle = handle;
-
-	if (!fold && __fl_lookup(fnew->mask, &fnew->mkey)) {
-		err = -EEXIST;
-		goto errout_idr;
-	}
-
-	err = rhashtable_insert_fast(&fnew->mask->ht, &fnew->ht_node,
-				     fnew->mask->filter_ht_params);
-	if (err)
-		goto errout_idr;
-
 	if (!tc_skip_hw(fnew->flags)) {
 		err = fl_hw_replace_filter(tp, fnew, extack);
 		if (err)
-			goto errout_mask_ht;
+			goto errout_mask;
 	}
 
 	if (!tc_in_hw(fnew->flags))
 		fnew->flags |= TCA_CLS_FLAGS_NOT_IN_HW;
 
 	if (fold) {
+		fnew->handle = handle;
+
+		err = rhashtable_insert_fast(&fnew->mask->ht, &fnew->ht_node,
+					     fnew->mask->filter_ht_params);
+		if (err)
+			goto errout_hw;
+
 		rhashtable_remove_fast(&fold->mask->ht,
 				       &fold->ht_node,
 				       fold->mask->filter_ht_params);
-		if (!tc_skip_hw(fold->flags))
-			fl_hw_destroy_filter(tp, fold, NULL);
-	}
-
-	*arg = fnew;
-
-	if (fold) {
 		idr_replace(&head->handle_idr, fnew, fnew->handle);
 		list_replace_rcu(&fold->list, &fnew->list);
+
+		if (!tc_skip_hw(fold->flags))
+			fl_hw_destroy_filter(tp, fold, NULL);
 		tcf_unbind_filter(tp, &fold->res);
 		tcf_exts_get_net(&fold->exts);
 		tcf_queue_work(&fold->rwork, fl_destroy_filter_work);
 	} else {
+		if (__fl_lookup(fnew->mask, &fnew->mkey)) {
+			err = -EEXIST;
+			goto errout_hw;
+		}
+
+		if (handle) {
+			/* user specifies a handle and it doesn't exist */
+			err = idr_alloc_u32(&head->handle_idr, fnew, &handle,
+					    handle, GFP_ATOMIC);
+		} else {
+			handle = 1;
+			err = idr_alloc_u32(&head->handle_idr, fnew, &handle,
+					    INT_MAX, GFP_ATOMIC);
+		}
+		if (err)
+			goto errout_hw;
+
+		fnew->handle = handle;
+
+		err = rhashtable_insert_fast(&fnew->mask->ht, &fnew->ht_node,
+					     fnew->mask->filter_ht_params);
+		if (err)
+			goto errout_idr;
+
 		list_add_tail_rcu(&fnew->list, &fnew->mask->filters);
 	}
 
+	*arg = fnew;
+
 	kfree(tb);
 	kfree(mask);
 	return 0;
 
-errout_mask_ht:
-	rhashtable_remove_fast(&fnew->mask->ht, &fnew->ht_node,
-			       fnew->mask->filter_ht_params);
-
 errout_idr:
-	if (!fold)
-		idr_remove(&head->handle_idr, fnew->handle);
-
+	idr_remove(&head->handle_idr, fnew->handle);
+errout_hw:
+	if (!tc_skip_hw(fnew->flags))
+		fl_hw_destroy_filter(tp, fnew, NULL);
 errout_mask:
 	fl_mask_put(head, fnew->mask, false);
-
 errout:
 	tcf_exts_destroy(&fnew->exts);
 	kfree(fnew);
-- 
2.21.0



* [PATCH net-next v3 03/12] net: sched: flower: introduce reference counting for filters
  2019-03-21 13:17 [PATCH net-next v3 00/12] Refactor flower classifier to remove dependency on rtnl lock Vlad Buslov
  2019-03-21 13:17 ` [PATCH net-next v3 01/12] net: sched: flower: don't check for rtnl on head dereference Vlad Buslov
  2019-03-21 13:17 ` [PATCH net-next v3 02/12] net: sched: flower: refactor fl_change Vlad Buslov
@ 2019-03-21 13:17 ` Vlad Buslov
  2019-03-21 14:00   ` Jiri Pirko
  2019-03-21 13:17 ` [PATCH net-next v3 04/12] net: sched: flower: track filter deletion with flag Vlad Buslov
                   ` (9 subsequent siblings)
  12 siblings, 1 reply; 22+ messages in thread
From: Vlad Buslov @ 2019-03-21 13:17 UTC (permalink / raw)
  To: netdev; +Cc: jhs, xiyou.wangcong, jiri, davem, sbrivio, Vlad Buslov

Extend flower filters with reference counting in order to remove the
dependency on rtnl lock in flower ops and allow filters to be modified
concurrently. A reference to a flower filter can be taken/released
concurrently as soon as the classifier is marked as 'unlocked' by the
last patch in this series. Use an atomic reference counter type to make
concurrent modifications safe.

Always take a reference to a flower filter while working with it:
- Modify fl_get() to take a reference to the filter.
- Implement the tp->put() callback as the fl_put() function to allow the
  cls API to release the reference taken by fl_get().
- Modify fl_change() to assume that the caller holds a reference to fold
  and to take a reference to fnew.
- Take a reference to the filter while using it in fl_walk().

Implement helper functions to get/put the filter reference counter.
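
The get/put pattern can be sketched in userspace C with C11 atomics.
This is only an approximation of what the kernel's refcount_t helpers
(refcount_inc_not_zero(), refcount_dec_and_test()) provide; struct obj,
obj_get_not_zero() and obj_put() are hypothetical names, and the
'destroyed' flag stands in for actually freeing the object.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct obj {
	atomic_uint refcnt;
	bool destroyed;		/* stands in for freeing the object */
};

/* Take a reference unless the counter has already dropped to zero. */
static bool obj_get_not_zero(struct obj *o)
{
	unsigned int old = atomic_load(&o->refcnt);

	while (old != 0) {
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&o->refcnt, &old, old + 1))
			return true;
	}
	return false;	/* object is being torn down, do not use it */
}

/* Drop a reference; the caller that drops the last one destroys. */
static void obj_put(struct obj *o)
{
	unsigned int old = atomic_fetch_sub(&o->refcnt, 1);

	assert(old > 0);
	if (old == 1)
		o->destroyed = true;
}
```

A lookup that fails to take a reference treats the object as already
gone, which mirrors how the patch skips filters that are being deleted.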

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
---
 net/sched/cls_flower.c | 96 ++++++++++++++++++++++++++++++++++++------
 1 file changed, 82 insertions(+), 14 deletions(-)

diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index d36ceb5001f9..9ed7c9b804a7 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -14,6 +14,7 @@
 #include <linux/module.h>
 #include <linux/rhashtable.h>
 #include <linux/workqueue.h>
+#include <linux/refcount.h>
 
 #include <linux/if_ether.h>
 #include <linux/in6.h>
@@ -104,6 +105,11 @@ struct cls_fl_filter {
 	u32 in_hw_count;
 	struct rcu_work rwork;
 	struct net_device *hw_dev;
+	/* Flower classifier is unlocked, which means that its reference counter
+	 * can be changed concurrently without any kind of external
+	 * synchronization. Use atomic reference counter to be concurrency-safe.
+	 */
+	refcount_t refcnt;
 };
 
 static const struct rhashtable_params mask_ht_params = {
@@ -447,6 +453,48 @@ static struct cls_fl_head *fl_head_dereference(struct tcf_proto *tp)
 	return rcu_dereference_raw(tp->root);
 }
 
+static void __fl_put(struct cls_fl_filter *f)
+{
+	if (!refcount_dec_and_test(&f->refcnt))
+		return;
+
+	if (tcf_exts_get_net(&f->exts))
+		tcf_queue_work(&f->rwork, fl_destroy_filter_work);
+	else
+		__fl_destroy_filter(f);
+}
+
+static struct cls_fl_filter *__fl_get(struct cls_fl_head *head, u32 handle)
+{
+	struct cls_fl_filter *f;
+
+	rcu_read_lock();
+	f = idr_find(&head->handle_idr, handle);
+	if (f && !refcount_inc_not_zero(&f->refcnt))
+		f = NULL;
+	rcu_read_unlock();
+
+	return f;
+}
+
+static struct cls_fl_filter *fl_get_next_filter(struct tcf_proto *tp,
+						unsigned long *handle)
+{
+	struct cls_fl_head *head = fl_head_dereference(tp);
+	struct cls_fl_filter *f;
+
+	rcu_read_lock();
+	while ((f = idr_get_next_ul(&head->handle_idr, handle))) {
+		/* don't return filters that are being deleted */
+		if (refcount_inc_not_zero(&f->refcnt))
+			break;
+		++(*handle);
+	}
+	rcu_read_unlock();
+
+	return f;
+}
+
 static bool __fl_delete(struct tcf_proto *tp, struct cls_fl_filter *f,
 			struct netlink_ext_ack *extack)
 {
@@ -460,10 +508,7 @@ static bool __fl_delete(struct tcf_proto *tp, struct cls_fl_filter *f,
 	if (!tc_skip_hw(f->flags))
 		fl_hw_destroy_filter(tp, f, extack);
 	tcf_unbind_filter(tp, &f->res);
-	if (async)
-		tcf_queue_work(&f->rwork, fl_destroy_filter_work);
-	else
-		__fl_destroy_filter(f);
+	__fl_put(f);
 
 	return last;
 }
@@ -498,11 +543,18 @@ static void fl_destroy(struct tcf_proto *tp, bool rtnl_held,
 	tcf_queue_work(&head->rwork, fl_destroy_sleepable);
 }
 
+static void fl_put(struct tcf_proto *tp, void *arg)
+{
+	struct cls_fl_filter *f = arg;
+
+	__fl_put(f);
+}
+
 static void *fl_get(struct tcf_proto *tp, u32 handle)
 {
 	struct cls_fl_head *head = fl_head_dereference(tp);
 
-	return idr_find(&head->handle_idr, handle);
+	return __fl_get(head, handle);
 }
 
 static const struct nla_policy fl_policy[TCA_FLOWER_MAX + 1] = {
@@ -1325,12 +1377,16 @@ static int fl_change(struct net *net, struct sk_buff *in_skb,
 	struct nlattr **tb;
 	int err;
 
-	if (!tca[TCA_OPTIONS])
-		return -EINVAL;
+	if (!tca[TCA_OPTIONS]) {
+		err = -EINVAL;
+		goto errout_fold;
+	}
 
 	mask = kzalloc(sizeof(struct fl_flow_mask), GFP_KERNEL);
-	if (!mask)
-		return -ENOBUFS;
+	if (!mask) {
+		err = -ENOBUFS;
+		goto errout_fold;
+	}
 
 	tb = kcalloc(TCA_FLOWER_MAX + 1, sizeof(struct nlattr *), GFP_KERNEL);
 	if (!tb) {
@@ -1353,6 +1409,7 @@ static int fl_change(struct net *net, struct sk_buff *in_skb,
 		err = -ENOBUFS;
 		goto errout_tb;
 	}
+	refcount_set(&fnew->refcnt, 1);
 
 	err = tcf_exts_init(&fnew->exts, net, TCA_FLOWER_ACT, 0);
 	if (err < 0)
@@ -1385,6 +1442,7 @@ static int fl_change(struct net *net, struct sk_buff *in_skb,
 	if (!tc_in_hw(fnew->flags))
 		fnew->flags |= TCA_CLS_FLAGS_NOT_IN_HW;
 
+	refcount_inc(&fnew->refcnt);
 	if (fold) {
 		fnew->handle = handle;
 
@@ -1403,7 +1461,11 @@ static int fl_change(struct net *net, struct sk_buff *in_skb,
 			fl_hw_destroy_filter(tp, fold, NULL);
 		tcf_unbind_filter(tp, &fold->res);
 		tcf_exts_get_net(&fold->exts);
-		tcf_queue_work(&fold->rwork, fl_destroy_filter_work);
+		/* Caller holds reference to fold, so refcnt is always > 0
+		 * after this.
+		 */
+		refcount_dec(&fold->refcnt);
+		__fl_put(fold);
 	} else {
 		if (__fl_lookup(fnew->mask, &fnew->mkey)) {
 			err = -EEXIST;
@@ -1452,6 +1514,9 @@ static int fl_change(struct net *net, struct sk_buff *in_skb,
 	kfree(tb);
 errout_mask_alloc:
 	kfree(mask);
+errout_fold:
+	if (fold)
+		__fl_put(fold);
 	return err;
 }
 
@@ -1465,24 +1530,26 @@ static int fl_delete(struct tcf_proto *tp, void *arg, bool *last,
 			       f->mask->filter_ht_params);
 	__fl_delete(tp, f, extack);
 	*last = list_empty(&head->masks);
+	__fl_put(f);
+
 	return 0;
 }
 
 static void fl_walk(struct tcf_proto *tp, struct tcf_walker *arg,
 		    bool rtnl_held)
 {
-	struct cls_fl_head *head = fl_head_dereference(tp);
 	struct cls_fl_filter *f;
 
 	arg->count = arg->skip;
 
-	while ((f = idr_get_next_ul(&head->handle_idr,
-				    &arg->cookie)) != NULL) {
+	while ((f = fl_get_next_filter(tp, &arg->cookie)) != NULL) {
 		if (arg->fn(tp, f, arg) < 0) {
+			__fl_put(f);
 			arg->stop = 1;
 			break;
 		}
-		arg->cookie = f->handle + 1;
+		__fl_put(f);
+		arg->cookie++;
 		arg->count++;
 	}
 }
@@ -2156,6 +2223,7 @@ static struct tcf_proto_ops cls_fl_ops __read_mostly = {
 	.init		= fl_init,
 	.destroy	= fl_destroy,
 	.get		= fl_get,
+	.put		= fl_put,
 	.change		= fl_change,
 	.delete		= fl_delete,
 	.walk		= fl_walk,
-- 
2.21.0



* [PATCH net-next v3 04/12] net: sched: flower: track filter deletion with flag
  2019-03-21 13:17 [PATCH net-next v3 00/12] Refactor flower classifier to remove dependency on rtnl lock Vlad Buslov
                   ` (2 preceding siblings ...)
  2019-03-21 13:17 ` [PATCH net-next v3 03/12] net: sched: flower: introduce reference counting for filters Vlad Buslov
@ 2019-03-21 13:17 ` Vlad Buslov
  2019-03-21 14:04   ` Jiri Pirko
  2019-03-21 13:17 ` [PATCH net-next v3 05/12] net: sched: flower: add reference counter to flower mask Vlad Buslov
                   ` (8 subsequent siblings)
  12 siblings, 1 reply; 22+ messages in thread
From: Vlad Buslov @ 2019-03-21 13:17 UTC (permalink / raw)
  To: netdev; +Cc: jhs, xiyou.wangcong, jiri, davem, sbrivio, Vlad Buslov

In order to prevent double deletion of a filter by concurrent tasks when
the rtnl lock is not used for synchronization, add a 'deleted' filter
field. Check the value of this field when modifying filters and return an
error if concurrent deletion is detected.

Refactor __fl_delete() to accept a pointer to the 'last' boolean as an
argument and to return an error code as the function return value
instead. This is necessary to signal concurrent filter deletion to the
caller.
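
A minimal userspace sketch of the idea follows. Note the swapped-in
technique: the patch uses a plain bool that a later patch in the series
protects with tp->lock, whereas this sketch uses a C11 atomic exchange
so that it is self-contained. struct item and item_delete() are
hypothetical names, and setting '*last' to true is a placeholder for
the real "last filter on the mask" result.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct item {
	atomic_bool deleted;
};

/* Only the first caller deletes; later callers get -ENOENT. */
static int item_delete(struct item *it, bool *last)
{
	assert(it != NULL && last != NULL);
	*last = false;

	/* atomic_exchange() returns the previous value of the flag. */
	if (atomic_exchange(&it->deleted, true))
		return -ENOENT;

	/* Real code would unlink the item from its containers here. */
	*last = true;	/* placeholder: pretend this was the last item */
	return 0;
}
```

Returning the status as the function result and the 'last' indication
through a pointer matches the shape of the refactored __fl_delete().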

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
---
 net/sched/cls_flower.c | 39 +++++++++++++++++++++++++++++----------
 1 file changed, 29 insertions(+), 10 deletions(-)

diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index 9ed7c9b804a7..dd8a65cef6e1 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -110,6 +110,7 @@ struct cls_fl_filter {
 	 * synchronization. Use atomic reference counter to be concurrency-safe.
 	 */
 	refcount_t refcnt;
+	bool deleted;
 };
 
 static const struct rhashtable_params mask_ht_params = {
@@ -458,6 +459,8 @@ static void __fl_put(struct cls_fl_filter *f)
 	if (!refcount_dec_and_test(&f->refcnt))
 		return;
 
+	WARN_ON(!f->deleted);
+
 	if (tcf_exts_get_net(&f->exts))
 		tcf_queue_work(&f->rwork, fl_destroy_filter_work);
 	else
@@ -495,22 +498,29 @@ static struct cls_fl_filter *fl_get_next_filter(struct tcf_proto *tp,
 	return f;
 }
 
-static bool __fl_delete(struct tcf_proto *tp, struct cls_fl_filter *f,
-			struct netlink_ext_ack *extack)
+static int __fl_delete(struct tcf_proto *tp, struct cls_fl_filter *f,
+		       bool *last, struct netlink_ext_ack *extack)
 {
 	struct cls_fl_head *head = fl_head_dereference(tp);
 	bool async = tcf_exts_get_net(&f->exts);
-	bool last;
 
+	*last = false;
+
+	if (f->deleted)
+		return -ENOENT;
+
+	f->deleted = true;
+	rhashtable_remove_fast(&f->mask->ht, &f->ht_node,
+			       f->mask->filter_ht_params);
 	idr_remove(&head->handle_idr, f->handle);
 	list_del_rcu(&f->list);
-	last = fl_mask_put(head, f->mask, async);
+	*last = fl_mask_put(head, f->mask, async);
 	if (!tc_skip_hw(f->flags))
 		fl_hw_destroy_filter(tp, f, extack);
 	tcf_unbind_filter(tp, &f->res);
 	__fl_put(f);
 
-	return last;
+	return 0;
 }
 
 static void fl_destroy_sleepable(struct work_struct *work)
@@ -530,10 +540,12 @@ static void fl_destroy(struct tcf_proto *tp, bool rtnl_held,
 	struct cls_fl_head *head = fl_head_dereference(tp);
 	struct fl_flow_mask *mask, *next_mask;
 	struct cls_fl_filter *f, *next;
+	bool last;
 
 	list_for_each_entry_safe(mask, next_mask, &head->masks, list) {
 		list_for_each_entry_safe(f, next, &mask->filters, list) {
-			if (__fl_delete(tp, f, extack))
+			__fl_delete(tp, f, &last, extack);
+			if (last)
 				break;
 		}
 	}
@@ -1444,6 +1456,12 @@ static int fl_change(struct net *net, struct sk_buff *in_skb,
 
 	refcount_inc(&fnew->refcnt);
 	if (fold) {
+		/* Fold filter was deleted concurrently. Retry lookup. */
+		if (fold->deleted) {
+			err = -EAGAIN;
+			goto errout_hw;
+		}
+
 		fnew->handle = handle;
 
 		err = rhashtable_insert_fast(&fnew->mask->ht, &fnew->ht_node,
@@ -1456,6 +1474,7 @@ static int fl_change(struct net *net, struct sk_buff *in_skb,
 				       fold->mask->filter_ht_params);
 		idr_replace(&head->handle_idr, fnew, fnew->handle);
 		list_replace_rcu(&fold->list, &fnew->list);
+		fold->deleted = true;
 
 		if (!tc_skip_hw(fold->flags))
 			fl_hw_destroy_filter(tp, fold, NULL);
@@ -1525,14 +1544,14 @@ static int fl_delete(struct tcf_proto *tp, void *arg, bool *last,
 {
 	struct cls_fl_head *head = fl_head_dereference(tp);
 	struct cls_fl_filter *f = arg;
+	bool last_on_mask;
+	int err = 0;
 
-	rhashtable_remove_fast(&f->mask->ht, &f->ht_node,
-			       f->mask->filter_ht_params);
-	__fl_delete(tp, f, extack);
+	err = __fl_delete(tp, f, &last_on_mask, extack);
 	*last = list_empty(&head->masks);
 	__fl_put(f);
 
-	return 0;
+	return err;
 }
 
 static void fl_walk(struct tcf_proto *tp, struct tcf_walker *arg,
-- 
2.21.0



* [PATCH net-next v3 05/12] net: sched: flower: add reference counter to flower mask
  2019-03-21 13:17 [PATCH net-next v3 00/12] Refactor flower classifier to remove dependency on rtnl lock Vlad Buslov
                   ` (3 preceding siblings ...)
  2019-03-21 13:17 ` [PATCH net-next v3 04/12] net: sched: flower: track filter deletion with flag Vlad Buslov
@ 2019-03-21 13:17 ` Vlad Buslov
  2019-03-21 13:17 ` [PATCH net-next v3 06/12] net: sched: flower: handle concurrent mask insertion Vlad Buslov
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Vlad Buslov @ 2019-03-21 13:17 UTC (permalink / raw)
  To: netdev; +Cc: jhs, xiyou.wangcong, jiri, davem, sbrivio, Vlad Buslov, Jiri Pirko

Extend the fl_flow_mask structure with a reference counter to allow
parallel modification without relying on the rtnl lock. Use the rcu read
lock to safely look up the mask and increment its reference counter in
order to accommodate concurrent deletes.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
---
 net/sched/cls_flower.c | 22 +++++++++++++++++-----
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index dd8a65cef6e1..e98313cd710a 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -76,6 +76,7 @@ struct fl_flow_mask {
 	struct list_head filters;
 	struct rcu_work rwork;
 	struct list_head list;
+	refcount_t refcnt;
 };
 
 struct fl_flow_tmplt {
@@ -320,6 +321,7 @@ static int fl_init(struct tcf_proto *tp)
 
 static void fl_mask_free(struct fl_flow_mask *mask)
 {
+	WARN_ON(!list_empty(&mask->filters));
 	rhashtable_destroy(&mask->ht);
 	kfree(mask);
 }
@@ -335,7 +337,7 @@ static void fl_mask_free_work(struct work_struct *work)
 static bool fl_mask_put(struct cls_fl_head *head, struct fl_flow_mask *mask,
 			bool async)
 {
-	if (!list_empty(&mask->filters))
+	if (!refcount_dec_and_test(&mask->refcnt))
 		return false;
 
 	rhashtable_remove_fast(&head->ht, &mask->ht_node, mask_ht_params);
@@ -1301,6 +1303,7 @@ static struct fl_flow_mask *fl_create_new_mask(struct cls_fl_head *head,
 
 	INIT_LIST_HEAD_RCU(&newmask->filters);
 
+	refcount_set(&newmask->refcnt, 1);
 	err = rhashtable_insert_fast(&head->ht, &newmask->ht_node,
 				     mask_ht_params);
 	if (err)
@@ -1324,9 +1327,13 @@ static int fl_check_assign_mask(struct cls_fl_head *head,
 				struct fl_flow_mask *mask)
 {
 	struct fl_flow_mask *newmask;
+	int ret = 0;
 
+	rcu_read_lock();
 	fnew->mask = rhashtable_lookup_fast(&head->ht, mask, mask_ht_params);
 	if (!fnew->mask) {
+		rcu_read_unlock();
+
 		if (fold)
 			return -EINVAL;
 
@@ -1335,11 +1342,15 @@ static int fl_check_assign_mask(struct cls_fl_head *head,
 			return PTR_ERR(newmask);
 
 		fnew->mask = newmask;
+		return 0;
 	} else if (fold && fold->mask != fnew->mask) {
-		return -EINVAL;
+		ret = -EINVAL;
+	} else if (!refcount_inc_not_zero(&fnew->mask->refcnt)) {
+		/* Mask was deleted concurrently, try again */
+		ret = -EAGAIN;
 	}
-
-	return 0;
+	rcu_read_unlock();
+	return ret;
 }
 
 static int fl_set_parms(struct net *net, struct tcf_proto *tp,
@@ -1476,6 +1487,7 @@ static int fl_change(struct net *net, struct sk_buff *in_skb,
 		list_replace_rcu(&fold->list, &fnew->list);
 		fold->deleted = true;
 
+		fl_mask_put(head, fold->mask, true);
 		if (!tc_skip_hw(fold->flags))
 			fl_hw_destroy_filter(tp, fold, NULL);
 		tcf_unbind_filter(tp, &fold->res);
@@ -1525,7 +1537,7 @@ static int fl_change(struct net *net, struct sk_buff *in_skb,
 	if (!tc_skip_hw(fnew->flags))
 		fl_hw_destroy_filter(tp, fnew, NULL);
 errout_mask:
-	fl_mask_put(head, fnew->mask, false);
+	fl_mask_put(head, fnew->mask, true);
 errout:
 	tcf_exts_destroy(&fnew->exts);
 	kfree(fnew);
-- 
2.21.0



* [PATCH net-next v3 06/12] net: sched: flower: handle concurrent mask insertion
  2019-03-21 13:17 [PATCH net-next v3 00/12] Refactor flower classifier to remove dependency on rtnl lock Vlad Buslov
                   ` (4 preceding siblings ...)
  2019-03-21 13:17 ` [PATCH net-next v3 05/12] net: sched: flower: add reference counter to flower mask Vlad Buslov
@ 2019-03-21 13:17 ` Vlad Buslov
  2019-03-21 13:17 ` [PATCH net-next v3 07/12] net: sched: flower: protect masks list with spinlock Vlad Buslov
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Vlad Buslov @ 2019-03-21 13:17 UTC (permalink / raw)
  To: netdev; +Cc: jhs, xiyou.wangcong, jiri, davem, sbrivio, Vlad Buslov, Jiri Pirko

Without rtnl lock protection, masks with the same key can be inserted
concurrently. Insert a temporary mask with a reference count of zero into
the masks hashtable. This will cause any concurrent modifications to
retry.

Wait for an rcu grace period to complete after removing the temporary
mask from the masks hashtable to accommodate concurrent readers.
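
The placeholder scheme can be illustrated with a deliberately simplified,
single-threaded sketch. It is not the rhashtable-based code from the
patch: the direct-mapped mask_table, lookup_get_insert(),
check_assign_mask() and publish_mask() are hypothetical, keys are assumed
to be smaller than NSLOTS so that collisions cannot occur, and no RCU or
locking is modeled.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define NSLOTS 8

struct mask {
	unsigned int key;	/* assumed < NSLOTS, so no collisions */
	unsigned int refcnt;	/* 0 marks a temporary placeholder */
};

struct mask_table {
	struct mask *slot[NSLOTS];
};

/* Return the entry stored under tmp->key, or insert 'tmp' and return NULL. */
static struct mask *lookup_get_insert(struct mask_table *t, struct mask *tmp)
{
	assert(tmp->key < NSLOTS);
	if (t->slot[tmp->key])
		return t->slot[tmp->key];
	t->slot[tmp->key] = tmp;
	return NULL;
}

static int check_assign_mask(struct mask_table *t, struct mask *tmp,
			     struct mask **out)
{
	struct mask *found = lookup_get_insert(t, tmp);

	*out = NULL;
	if (!found)
		return 0;	/* 'tmp' is the placeholder; caller replaces it */
	if (found->refcnt == 0)
		return -EAGAIN;	/* another task's placeholder: retry later */
	found->refcnt++;
	*out = found;
	return 0;
}

/* Swap the placeholder for the fully initialized mask. */
static void publish_mask(struct mask_table *t, struct mask *ready)
{
	assert(ready->key < NSLOTS && ready->refcnt > 0);
	t->slot[ready->key] = ready;
}
```

A second caller with the same key sees -EAGAIN until the placeholder has
been replaced, and takes a normal reference afterwards.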

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Suggested-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
---
 net/sched/cls_flower.c | 41 ++++++++++++++++++++++++++++++++++-------
 1 file changed, 34 insertions(+), 7 deletions(-)

diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index e98313cd710a..92478bb122d3 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -1304,11 +1304,14 @@ static struct fl_flow_mask *fl_create_new_mask(struct cls_fl_head *head,
 	INIT_LIST_HEAD_RCU(&newmask->filters);
 
 	refcount_set(&newmask->refcnt, 1);
-	err = rhashtable_insert_fast(&head->ht, &newmask->ht_node,
-				     mask_ht_params);
+	err = rhashtable_replace_fast(&head->ht, &mask->ht_node,
+				      &newmask->ht_node, mask_ht_params);
 	if (err)
 		goto errout_destroy;
 
+	/* Wait until any potential concurrent users of mask are finished */
+	synchronize_rcu();
+
 	list_add_tail_rcu(&newmask->list, &head->masks);
 
 	return newmask;
@@ -1330,19 +1333,36 @@ static int fl_check_assign_mask(struct cls_fl_head *head,
 	int ret = 0;
 
 	rcu_read_lock();
-	fnew->mask = rhashtable_lookup_fast(&head->ht, mask, mask_ht_params);
+
+	/* Insert mask as temporary node to prevent concurrent creation of mask
+	 * with same key. Any concurrent lookups with same key will return
+	 * -EAGAIN because mask's refcnt is zero. It is safe to insert
+	 * stack-allocated 'mask' to masks hash table because we call
+	 * synchronize_rcu() before returning from this function (either in case
+	 * of error or after replacing it with heap-allocated mask in
+	 * fl_create_new_mask()).
+	 */
+	fnew->mask = rhashtable_lookup_get_insert_fast(&head->ht,
+						       &mask->ht_node,
+						       mask_ht_params);
 	if (!fnew->mask) {
 		rcu_read_unlock();
 
-		if (fold)
-			return -EINVAL;
+		if (fold) {
+			ret = -EINVAL;
+			goto errout_cleanup;
+		}
 
 		newmask = fl_create_new_mask(head, mask);
-		if (IS_ERR(newmask))
-			return PTR_ERR(newmask);
+		if (IS_ERR(newmask)) {
+			ret = PTR_ERR(newmask);
+			goto errout_cleanup;
+		}
 
 		fnew->mask = newmask;
 		return 0;
+	} else if (IS_ERR(fnew->mask)) {
+		ret = PTR_ERR(fnew->mask);
 	} else if (fold && fold->mask != fnew->mask) {
 		ret = -EINVAL;
 	} else if (!refcount_inc_not_zero(&fnew->mask->refcnt)) {
@@ -1351,6 +1371,13 @@ static int fl_check_assign_mask(struct cls_fl_head *head,
 	}
 	rcu_read_unlock();
 	return ret;
+
+errout_cleanup:
+	rhashtable_remove_fast(&head->ht, &mask->ht_node,
+			       mask_ht_params);
+	/* Wait until any potential concurrent users of mask are finished */
+	synchronize_rcu();
+	return ret;
 }
 
 static int fl_set_parms(struct net *net, struct tcf_proto *tp,
-- 
2.21.0



* [PATCH net-next v3 07/12] net: sched: flower: protect masks list with spinlock
  2019-03-21 13:17 [PATCH net-next v3 00/12] Refactor flower classifier to remove dependency on rtnl lock Vlad Buslov
                   ` (5 preceding siblings ...)
  2019-03-21 13:17 ` [PATCH net-next v3 06/12] net: sched: flower: handle concurrent mask insertion Vlad Buslov
@ 2019-03-21 13:17 ` Vlad Buslov
  2019-03-21 13:17 ` [PATCH net-next v3 08/12] net: sched: flower: handle concurrent filter insertion in fl_change Vlad Buslov
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Vlad Buslov @ 2019-03-21 13:17 UTC (permalink / raw)
  To: netdev; +Cc: jhs, xiyou.wangcong, jiri, davem, sbrivio, Vlad Buslov, Jiri Pirko

Protect modifications of the flower masks list with a spinlock to remove the
dependency on rtnl lock and allow concurrent access.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
---
 net/sched/cls_flower.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index 92478bb122d3..db47828ea5e2 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -88,6 +88,7 @@ struct fl_flow_tmplt {
 
 struct cls_fl_head {
 	struct rhashtable ht;
+	spinlock_t masks_lock; /* Protect masks list */
 	struct list_head masks;
 	struct rcu_work rwork;
 	struct idr handle_idr;
@@ -312,6 +313,7 @@ static int fl_init(struct tcf_proto *tp)
 	if (!head)
 		return -ENOBUFS;
 
+	spin_lock_init(&head->masks_lock);
 	INIT_LIST_HEAD_RCU(&head->masks);
 	rcu_assign_pointer(tp->root, head);
 	idr_init(&head->handle_idr);
@@ -341,7 +343,11 @@ static bool fl_mask_put(struct cls_fl_head *head, struct fl_flow_mask *mask,
 		return false;
 
 	rhashtable_remove_fast(&head->ht, &mask->ht_node, mask_ht_params);
+
+	spin_lock(&head->masks_lock);
 	list_del_rcu(&mask->list);
+	spin_unlock(&head->masks_lock);
+
 	if (async)
 		tcf_queue_work(&mask->rwork, fl_mask_free_work);
 	else
@@ -1312,7 +1318,9 @@ static struct fl_flow_mask *fl_create_new_mask(struct cls_fl_head *head,
 	/* Wait until any potential concurrent users of mask are finished */
 	synchronize_rcu();
 
+	spin_lock(&head->masks_lock);
 	list_add_tail_rcu(&newmask->list, &head->masks);
+	spin_unlock(&head->masks_lock);
 
 	return newmask;
 
-- 
2.21.0



* [PATCH net-next v3 08/12] net: sched: flower: handle concurrent filter insertion in fl_change
  2019-03-21 13:17 [PATCH net-next v3 00/12] Refactor flower classifier to remove dependency on rtnl lock Vlad Buslov
                   ` (6 preceding siblings ...)
  2019-03-21 13:17 ` [PATCH net-next v3 07/12] net: sched: flower: protect masks list with spinlock Vlad Buslov
@ 2019-03-21 13:17 ` Vlad Buslov
  2019-03-21 13:17 ` [PATCH net-next v3 09/12] net: sched: flower: handle concurrent tcf proto deletion Vlad Buslov
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Vlad Buslov @ 2019-03-21 13:17 UTC (permalink / raw)
  To: netdev; +Cc: jhs, xiyou.wangcong, jiri, davem, sbrivio, Vlad Buslov, Jiri Pirko

Check whether the user specified a handle and another filter with the same
handle was inserted concurrently. If so, return EAGAIN so that filter
processing is retried (the request may be an overwrite).
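
The error translation can be pictured with a small userspace sketch. It
assumes a fixed-size slot table as a stand-in for head->handle_idr, with
"slot already taken" reported as -ENOSPC the way the patch expects from
idr_alloc_u32() for an exact handle. All sketch_* names are hypothetical.

```c
#include <errno.h>
#include <stddef.h>

#define SKETCH_MAX_HANDLES 8

/* Hypothetical stand-in for head->handle_idr: one slot per handle. */
struct sketch_handles {
	void *slot[SKETCH_MAX_HANDLES];
};

/* Claim exactly 'handle'; -ENOSPC means somebody else already owns it. */
static int sketch_alloc_exact(struct sketch_handles *t, unsigned int handle,
			      void *filter)
{
	if (handle >= SKETCH_MAX_HANDLES)
		return -EINVAL;
	if (t->slot[handle])
		return -ENOSPC;
	t->slot[handle] = filter;
	return 0;
}

/* A taken handle is not reported as a hard failure: the caller is asked
 * to retry, at which point it can find the concurrently inserted filter
 * and decide between updating it and failing the request.
 */
static int sketch_insert(struct sketch_handles *t, unsigned int handle,
			 void *filter)
{
	int err = sketch_alloc_exact(t, handle, filter);

	if (err == -ENOSPC)
		err = -EAGAIN;
	return err;
}
```

Whether the retry ends in an update or in an error is left to the caller,
which knows the message flags; the sketch only shows the translation step.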

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
---
 net/sched/cls_flower.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index db47828ea5e2..70b357f23391 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -1542,6 +1542,15 @@ static int fl_change(struct net *net, struct sk_buff *in_skb,
 			/* user specifies a handle and it doesn't exist */
 			err = idr_alloc_u32(&head->handle_idr, fnew, &handle,
 					    handle, GFP_ATOMIC);
+
+			/* Filter with specified handle was concurrently
+			 * inserted after initial check in cls_api. This is not
+			 * necessarily an error if NLM_F_EXCL is not set in
+			 * message flags. Returning EAGAIN will cause cls_api to
+			 * try to update concurrently inserted rule.
+			 */
+			if (err == -ENOSPC)
+				err = -EAGAIN;
 		} else {
 			handle = 1;
 			err = idr_alloc_u32(&head->handle_idr, fnew, &handle,
-- 
2.21.0



* [PATCH net-next v3 09/12] net: sched: flower: handle concurrent tcf proto deletion
  2019-03-21 13:17 [PATCH net-next v3 00/12] Refactor flower classifier to remove dependency on rtnl lock Vlad Buslov
                   ` (7 preceding siblings ...)
  2019-03-21 13:17 ` [PATCH net-next v3 08/12] net: sched: flower: handle concurrent filter insertion in fl_change Vlad Buslov
@ 2019-03-21 13:17 ` Vlad Buslov
  2019-03-21 14:06   ` Jiri Pirko
  2019-03-21 13:17 ` [PATCH net-next v3 10/12] net: sched: flower: protect flower classifier state with spinlock Vlad Buslov
                   ` (3 subsequent siblings)
  12 siblings, 1 reply; 22+ messages in thread
From: Vlad Buslov @ 2019-03-21 13:17 UTC (permalink / raw)
  To: netdev; +Cc: jhs, xiyou.wangcong, jiri, davem, sbrivio, Vlad Buslov

Without rtnl lock protection, tcf proto can be deleted concurrently. Check
the tcf proto 'deleting' flag after taking the tcf spinlock to verify that no
concurrent deletion is in progress. Return EAGAIN if concurrent deletion is
detected, which causes the caller to retry and possibly create a new
instance of tcf proto.

The retry mechanism is a result of the fine-grained locking approach used in
this and previous changes in the series and is necessary to allow concurrent
updates on the same chain instance. An alternative approach would be to lock
the whole chain while updating filters on any of its child tp's and while
adding and removing classifier instances from the chain. However, since the
most CPU-intensive parts of the filter update code are in the classifier
code and its dependencies (extensions and hw offloads), such an approach
would negate most of the gains introduced by this change and previous
changes in the series when updating the same chain instance.
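
A minimal userspace sketch of the check-and-retry idea follows. It assumes a
C11 atomic_flag spin loop as a stand-in for tp->lock, and the sketch_* names
are hypothetical. The caller-side fallback is a simplification of what cls
API does on -EAGAIN (look the proto up again or create a new one).

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct tcf_proto. */
struct sketch_proto {
	atomic_flag lock;	/* stand-in for tp->lock */
	bool deleting;		/* set once teardown of this instance began */
	int nfilters;
};

static void sketch_proto_init(struct sketch_proto *tp)
{
	atomic_flag_clear(&tp->lock);
	tp->deleting = false;
	tp->nfilters = 0;
}

/* Insert a filter only if the instance is not being torn down. */
static int sketch_add_filter(struct sketch_proto *tp)
{
	int err = 0;

	while (atomic_flag_test_and_set(&tp->lock))
		;	/* spin until the lock is ours */
	if (tp->deleting)
		err = -EAGAIN;
	else
		tp->nfilters++;
	atomic_flag_clear(&tp->lock);
	return err;
}

/* Caller-side retry: on -EAGAIN, fall back to a fresh instance. */
static int sketch_update(struct sketch_proto *cur, struct sketch_proto *fresh)
{
	int err = sketch_add_filter(cur);

	if (err == -EAGAIN)
		err = sketch_add_filter(fresh);
	return err;
}
```

Because the flag is only examined with the lock held, an insertion either
completes before deletion starts or is redirected, never lost on a dying
instance.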

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
---
 net/sched/cls_flower.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index 70b357f23391..25a4d64b82db 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -1500,6 +1500,14 @@ static int fl_change(struct net *net, struct sk_buff *in_skb,
 	if (!tc_in_hw(fnew->flags))
 		fnew->flags |= TCA_CLS_FLAGS_NOT_IN_HW;
 
+	/* tp was deleted concurrently. -EAGAIN will cause caller to lookup
+	 * proto again or create new one, if necessary.
+	 */
+	if (tp->deleting) {
+		err = -EAGAIN;
+		goto errout_hw;
+	}
+
 	refcount_inc(&fnew->refcnt);
 	if (fold) {
 		/* Fold filter was deleted concurrently. Retry lookup. */
-- 
2.21.0



* [PATCH net-next v3 10/12] net: sched: flower: protect flower classifier state with spinlock
  2019-03-21 13:17 [PATCH net-next v3 00/12] Refactor flower classifier to remove dependency on rtnl lock Vlad Buslov
                   ` (8 preceding siblings ...)
  2019-03-21 13:17 ` [PATCH net-next v3 09/12] net: sched: flower: handle concurrent tcf proto deletion Vlad Buslov
@ 2019-03-21 13:17 ` Vlad Buslov
  2019-03-21 14:09   ` Jiri Pirko
  2019-03-21 13:17 ` [PATCH net-next v3 11/12] net: sched: flower: track rtnl lock state Vlad Buslov
                   ` (2 subsequent siblings)
  12 siblings, 1 reply; 22+ messages in thread
From: Vlad Buslov @ 2019-03-21 13:17 UTC (permalink / raw)
  To: netdev; +Cc: jhs, xiyou.wangcong, jiri, davem, sbrivio, Vlad Buslov

struct tcf_proto was extended with a spinlock to be used by classifiers
instead of the global rtnl lock. Use it to protect shared flower classifier
data structures (handle_idr, mask hashtable and list) and fields of
individual filters that can be accessed concurrently. This patch set uses
tcf_proto->lock as a per-instance lock that protects all filters on a
tcf_proto.
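
One effect of the per-instance lock is that the 'deleted' check and the
unlinking of a filter become a single atomic step, so only one of two racing
deleters proceeds. Here is a minimal userspace sketch of that pattern,
assuming a C11 atomic_flag spin loop in place of tp->lock; the sketch_* names
are hypothetical.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for struct cls_fl_filter and struct tcf_proto. */
struct sketch_filter {
	bool deleted;
};

struct sketch_proto {
	atomic_flag lock;	/* stand-in for tp->lock */
	int nfilters;
};

static int sketch_delete(struct sketch_proto *tp, struct sketch_filter *f)
{
	while (atomic_flag_test_and_set(&tp->lock))
		;	/* spin until the lock is ours */
	if (f->deleted) {
		/* Somebody else won the race; nothing left to unlink. */
		atomic_flag_clear(&tp->lock);
		return -ENOENT;
	}
	f->deleted = true;
	tp->nfilters--;
	atomic_flag_clear(&tp->lock);

	/* Slow teardown (hardware offload removal in the patch) would run
	 * here, after the lock is dropped, because it may sleep.
	 */
	return 0;
}
```

The lock is held only around the short bookkeeping section; anything that
can block is deliberately kept outside of it.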

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
---
 net/sched/cls_flower.c | 39 ++++++++++++++++++++++++++++++++-------
 1 file changed, 32 insertions(+), 7 deletions(-)

diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index 25a4d64b82db..04210d645c78 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -384,7 +384,9 @@ static void fl_hw_destroy_filter(struct tcf_proto *tp, struct cls_fl_filter *f,
 	cls_flower.cookie = (unsigned long) f;
 
 	tc_setup_cb_call(block, TC_SETUP_CLSFLOWER, &cls_flower, false);
+	spin_lock(&tp->lock);
 	tcf_block_offload_dec(block, &f->flags);
+	spin_unlock(&tp->lock);
 }
 
 static int fl_hw_replace_filter(struct tcf_proto *tp,
@@ -426,7 +428,9 @@ static int fl_hw_replace_filter(struct tcf_proto *tp,
 		return err;
 	} else if (err > 0) {
 		f->in_hw_count = err;
+		spin_lock(&tp->lock);
 		tcf_block_offload_inc(block, &f->flags);
+		spin_unlock(&tp->lock);
 	}
 
 	if (skip_sw && !(f->flags & TCA_CLS_FLAGS_IN_HW))
@@ -514,14 +518,19 @@ static int __fl_delete(struct tcf_proto *tp, struct cls_fl_filter *f,
 
 	*last = false;
 
-	if (f->deleted)
+	spin_lock(&tp->lock);
+	if (f->deleted) {
+		spin_unlock(&tp->lock);
 		return -ENOENT;
+	}
 
 	f->deleted = true;
 	rhashtable_remove_fast(&f->mask->ht, &f->ht_node,
 			       f->mask->filter_ht_params);
 	idr_remove(&head->handle_idr, f->handle);
 	list_del_rcu(&f->list);
+	spin_unlock(&tp->lock);
+
 	*last = fl_mask_put(head, f->mask, async);
 	if (!tc_skip_hw(f->flags))
 		fl_hw_destroy_filter(tp, f, extack);
@@ -1500,6 +1509,8 @@ static int fl_change(struct net *net, struct sk_buff *in_skb,
 	if (!tc_in_hw(fnew->flags))
 		fnew->flags |= TCA_CLS_FLAGS_NOT_IN_HW;
 
+	spin_lock(&tp->lock);
+
 	/* tp was deleted concurrently. -EAGAIN will cause caller to lookup
 	 * proto again or create new one, if necessary.
 	 */
@@ -1530,6 +1541,8 @@ static int fl_change(struct net *net, struct sk_buff *in_skb,
 		list_replace_rcu(&fold->list, &fnew->list);
 		fold->deleted = true;
 
+		spin_unlock(&tp->lock);
+
 		fl_mask_put(head, fold->mask, true);
 		if (!tc_skip_hw(fold->flags))
 			fl_hw_destroy_filter(tp, fold, NULL);
@@ -1575,6 +1588,7 @@ static int fl_change(struct net *net, struct sk_buff *in_skb,
 			goto errout_idr;
 
 		list_add_tail_rcu(&fnew->list, &fnew->mask->filters);
+		spin_unlock(&tp->lock);
 	}
 
 	*arg = fnew;
@@ -1586,6 +1600,7 @@ static int fl_change(struct net *net, struct sk_buff *in_skb,
 errout_idr:
 	idr_remove(&head->handle_idr, fnew->handle);
 errout_hw:
+	spin_unlock(&tp->lock);
 	if (!tc_skip_hw(fnew->flags))
 		fl_hw_destroy_filter(tp, fnew, NULL);
 errout_mask:
@@ -1688,8 +1703,10 @@ static int fl_reoffload(struct tcf_proto *tp, bool add, tc_setup_cb_t *cb,
 				continue;
 			}
 
+			spin_lock(&tp->lock);
 			tc_cls_offload_cnt_update(block, &f->in_hw_count,
 						  &f->flags, add);
+			spin_unlock(&tp->lock);
 		}
 	}
 
@@ -2223,6 +2240,7 @@ static int fl_dump(struct net *net, struct tcf_proto *tp, void *fh,
 	struct cls_fl_filter *f = fh;
 	struct nlattr *nest;
 	struct fl_flow_key *key, *mask;
+	bool skip_hw;
 
 	if (!f)
 		return skb->len;
@@ -2233,21 +2251,26 @@ static int fl_dump(struct net *net, struct tcf_proto *tp, void *fh,
 	if (!nest)
 		goto nla_put_failure;
 
+	spin_lock(&tp->lock);
+
 	if (f->res.classid &&
 	    nla_put_u32(skb, TCA_FLOWER_CLASSID, f->res.classid))
-		goto nla_put_failure;
+		goto nla_put_failure_locked;
 
 	key = &f->key;
 	mask = &f->mask->key;
+	skip_hw = tc_skip_hw(f->flags);
 
 	if (fl_dump_key(skb, net, key, mask))
-		goto nla_put_failure;
-
-	if (!tc_skip_hw(f->flags))
-		fl_hw_update_stats(tp, f);
+		goto nla_put_failure_locked;
 
 	if (f->flags && nla_put_u32(skb, TCA_FLOWER_FLAGS, f->flags))
-		goto nla_put_failure;
+		goto nla_put_failure_locked;
+
+	spin_unlock(&tp->lock);
+
+	if (!skip_hw)
+		fl_hw_update_stats(tp, f);
 
 	if (nla_put_u32(skb, TCA_FLOWER_IN_HW_COUNT, f->in_hw_count))
 		goto nla_put_failure;
@@ -2262,6 +2285,8 @@ static int fl_dump(struct net *net, struct tcf_proto *tp, void *fh,
 
 	return skb->len;
 
+nla_put_failure_locked:
+	spin_unlock(&tp->lock);
 nla_put_failure:
 	nla_nest_cancel(skb, nest);
 	return -1;
-- 
2.21.0



* [PATCH net-next v3 11/12] net: sched: flower: track rtnl lock state
  2019-03-21 13:17 [PATCH net-next v3 00/12] Refactor flower classifier to remove dependency on rtnl lock Vlad Buslov
                   ` (9 preceding siblings ...)
  2019-03-21 13:17 ` [PATCH net-next v3 10/12] net: sched: flower: protect flower classifier state with spinlock Vlad Buslov
@ 2019-03-21 13:17 ` Vlad Buslov
  2019-03-21 14:11   ` Jiri Pirko
  2019-03-21 13:17 ` [PATCH net-next v3 12/12] net: sched: flower: set unlocked flag for flower proto ops Vlad Buslov
  2019-03-21 21:33 ` [PATCH net-next v3 00/12] Refactor flower classifier to remove dependency on rtnl lock David Miller
  12 siblings, 1 reply; 22+ messages in thread
From: Vlad Buslov @ 2019-03-21 13:17 UTC (permalink / raw)
  To: netdev; +Cc: jhs, xiyou.wangcong, jiri, davem, sbrivio, Vlad Buslov

Use the 'rtnl_held' flag to track whether the caller holds rtnl lock.
Propagate the flag to internal functions that need to know the rtnl lock
state. Take rtnl lock before calling tcf APIs that require it (hw offload,
bind filter, etc.).
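
The conditional locking pattern can be sketched in userspace as follows.
This is a single-threaded model that assumes a plain boolean as a stand-in
for the rtnl mutex; the sketch_* names are hypothetical and only the
"lock if the caller did not" shape mirrors the patch.

```c
#include <stdbool.h>

/* Hypothetical single-threaded model of the global rtnl mutex. */
static bool sketch_rtnl_is_held;
static int sketch_rtnl_acquisitions;

static void sketch_rtnl_lock(void)
{
	sketch_rtnl_is_held = true;
	sketch_rtnl_acquisitions++;
}

static void sketch_rtnl_unlock(void)
{
	sketch_rtnl_is_held = false;
}

/* Stand-in for a helper that calls an API which still requires rtnl.
 * Returns whether the lock was held at the point of the call.
 */
static bool sketch_hw_call(bool rtnl_held)
{
	bool ok;

	if (!rtnl_held)
		sketch_rtnl_lock();

	ok = sketch_rtnl_is_held;	/* the rtnl-dependent call goes here */

	if (!rtnl_held)
		sketch_rtnl_unlock();
	return ok;
}
```

A caller that already holds the lock passes true and the helper leaves the
lock alone; an unlocked caller passes false and the helper takes and drops
it around the call.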

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
---
 net/sched/cls_flower.c | 82 ++++++++++++++++++++++++++++--------------
 1 file changed, 56 insertions(+), 26 deletions(-)

diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index 04210d645c78..68bac808cf35 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -374,11 +374,14 @@ static void fl_destroy_filter_work(struct work_struct *work)
 }
 
 static void fl_hw_destroy_filter(struct tcf_proto *tp, struct cls_fl_filter *f,
-				 struct netlink_ext_ack *extack)
+				 bool rtnl_held, struct netlink_ext_ack *extack)
 {
 	struct tc_cls_flower_offload cls_flower = {};
 	struct tcf_block *block = tp->chain->block;
 
+	if (!rtnl_held)
+		rtnl_lock();
+
 	tc_cls_common_offload_init(&cls_flower.common, tp, f->flags, extack);
 	cls_flower.command = TC_CLSFLOWER_DESTROY;
 	cls_flower.cookie = (unsigned long) f;
@@ -387,20 +390,28 @@ static void fl_hw_destroy_filter(struct tcf_proto *tp, struct cls_fl_filter *f,
 	spin_lock(&tp->lock);
 	tcf_block_offload_dec(block, &f->flags);
 	spin_unlock(&tp->lock);
+
+	if (!rtnl_held)
+		rtnl_unlock();
 }
 
 static int fl_hw_replace_filter(struct tcf_proto *tp,
-				struct cls_fl_filter *f,
+				struct cls_fl_filter *f, bool rtnl_held,
 				struct netlink_ext_ack *extack)
 {
 	struct tc_cls_flower_offload cls_flower = {};
 	struct tcf_block *block = tp->chain->block;
 	bool skip_sw = tc_skip_sw(f->flags);
-	int err;
+	int err = 0;
+
+	if (!rtnl_held)
+		rtnl_lock();
 
 	cls_flower.rule = flow_rule_alloc(tcf_exts_num_actions(&f->exts));
-	if (!cls_flower.rule)
-		return -ENOMEM;
+	if (!cls_flower.rule) {
+		err = -ENOMEM;
+		goto errout;
+	}
 
 	tc_cls_common_offload_init(&cls_flower.common, tp, f->flags, extack);
 	cls_flower.command = TC_CLSFLOWER_REPLACE;
@@ -413,37 +424,48 @@ static int fl_hw_replace_filter(struct tcf_proto *tp,
 	err = tc_setup_flow_action(&cls_flower.rule->action, &f->exts);
 	if (err) {
 		kfree(cls_flower.rule);
-		if (skip_sw) {
+		if (skip_sw)
 			NL_SET_ERR_MSG_MOD(extack, "Failed to setup flow action");
-			return err;
-		}
-		return 0;
+		else
+			err = 0;
+		goto errout;
 	}
 
 	err = tc_setup_cb_call(block, TC_SETUP_CLSFLOWER, &cls_flower, skip_sw);
 	kfree(cls_flower.rule);
 
 	if (err < 0) {
-		fl_hw_destroy_filter(tp, f, NULL);
-		return err;
+		fl_hw_destroy_filter(tp, f, true, NULL);
+		goto errout;
 	} else if (err > 0) {
 		f->in_hw_count = err;
+		err = 0;
 		spin_lock(&tp->lock);
 		tcf_block_offload_inc(block, &f->flags);
 		spin_unlock(&tp->lock);
 	}
 
-	if (skip_sw && !(f->flags & TCA_CLS_FLAGS_IN_HW))
-		return -EINVAL;
+	if (skip_sw && !(f->flags & TCA_CLS_FLAGS_IN_HW)) {
+		err = -EINVAL;
+		goto errout;
+	}
 
-	return 0;
+errout:
+	if (!rtnl_held)
+		rtnl_unlock();
+
+	return err;
 }
 
-static void fl_hw_update_stats(struct tcf_proto *tp, struct cls_fl_filter *f)
+static void fl_hw_update_stats(struct tcf_proto *tp, struct cls_fl_filter *f,
+			       bool rtnl_held)
 {
 	struct tc_cls_flower_offload cls_flower = {};
 	struct tcf_block *block = tp->chain->block;
 
+	if (!rtnl_held)
+		rtnl_lock();
+
 	tc_cls_common_offload_init(&cls_flower.common, tp, f->flags, NULL);
 	cls_flower.command = TC_CLSFLOWER_STATS;
 	cls_flower.cookie = (unsigned long) f;
@@ -454,6 +476,9 @@ static void fl_hw_update_stats(struct tcf_proto *tp, struct cls_fl_filter *f)
 	tcf_exts_stats_update(&f->exts, cls_flower.stats.bytes,
 			      cls_flower.stats.pkts,
 			      cls_flower.stats.lastused);
+
+	if (!rtnl_held)
+		rtnl_unlock();
 }
 
 static struct cls_fl_head *fl_head_dereference(struct tcf_proto *tp)
@@ -511,7 +536,8 @@ static struct cls_fl_filter *fl_get_next_filter(struct tcf_proto *tp,
 }
 
 static int __fl_delete(struct tcf_proto *tp, struct cls_fl_filter *f,
-		       bool *last, struct netlink_ext_ack *extack)
+		       bool *last, bool rtnl_held,
+		       struct netlink_ext_ack *extack)
 {
 	struct cls_fl_head *head = fl_head_dereference(tp);
 	bool async = tcf_exts_get_net(&f->exts);
@@ -533,7 +559,7 @@ static int __fl_delete(struct tcf_proto *tp, struct cls_fl_filter *f,
 
 	*last = fl_mask_put(head, f->mask, async);
 	if (!tc_skip_hw(f->flags))
-		fl_hw_destroy_filter(tp, f, extack);
+		fl_hw_destroy_filter(tp, f, rtnl_held, extack);
 	tcf_unbind_filter(tp, &f->res);
 	__fl_put(f);
 
@@ -561,7 +587,7 @@ static void fl_destroy(struct tcf_proto *tp, bool rtnl_held,
 
 	list_for_each_entry_safe(mask, next_mask, &head->masks, list) {
 		list_for_each_entry_safe(f, next, &mask->filters, list) {
-			__fl_delete(tp, f, &last, extack);
+			__fl_delete(tp, f, &last, rtnl_held, extack);
 			if (last)
 				break;
 		}
@@ -1401,19 +1427,23 @@ static int fl_set_parms(struct net *net, struct tcf_proto *tp,
 			struct cls_fl_filter *f, struct fl_flow_mask *mask,
 			unsigned long base, struct nlattr **tb,
 			struct nlattr *est, bool ovr,
-			struct fl_flow_tmplt *tmplt,
+			struct fl_flow_tmplt *tmplt, bool rtnl_held,
 			struct netlink_ext_ack *extack)
 {
 	int err;
 
-	err = tcf_exts_validate(net, tp, tb, est, &f->exts, ovr, true,
+	err = tcf_exts_validate(net, tp, tb, est, &f->exts, ovr, rtnl_held,
 				extack);
 	if (err < 0)
 		return err;
 
 	if (tb[TCA_FLOWER_CLASSID]) {
 		f->res.classid = nla_get_u32(tb[TCA_FLOWER_CLASSID]);
+		if (!rtnl_held)
+			rtnl_lock();
 		tcf_bind_filter(tp, &f->res, base);
+		if (!rtnl_held)
+			rtnl_unlock();
 	}
 
 	err = fl_set_key(net, tb, &f->key, &mask->key, extack);
@@ -1492,7 +1522,7 @@ static int fl_change(struct net *net, struct sk_buff *in_skb,
 	}
 
 	err = fl_set_parms(net, tp, fnew, mask, base, tb, tca[TCA_RATE], ovr,
-			   tp->chain->tmplt_priv, extack);
+			   tp->chain->tmplt_priv, rtnl_held, extack);
 	if (err)
 		goto errout;
 
@@ -1501,7 +1531,7 @@ static int fl_change(struct net *net, struct sk_buff *in_skb,
 		goto errout;
 
 	if (!tc_skip_hw(fnew->flags)) {
-		err = fl_hw_replace_filter(tp, fnew, extack);
+		err = fl_hw_replace_filter(tp, fnew, rtnl_held, extack);
 		if (err)
 			goto errout_mask;
 	}
@@ -1545,7 +1575,7 @@ static int fl_change(struct net *net, struct sk_buff *in_skb,
 
 		fl_mask_put(head, fold->mask, true);
 		if (!tc_skip_hw(fold->flags))
-			fl_hw_destroy_filter(tp, fold, NULL);
+			fl_hw_destroy_filter(tp, fold, rtnl_held, NULL);
 		tcf_unbind_filter(tp, &fold->res);
 		tcf_exts_get_net(&fold->exts);
 		/* Caller holds reference to fold, so refcnt is always > 0
@@ -1602,7 +1632,7 @@ static int fl_change(struct net *net, struct sk_buff *in_skb,
 errout_hw:
 	spin_unlock(&tp->lock);
 	if (!tc_skip_hw(fnew->flags))
-		fl_hw_destroy_filter(tp, fnew, NULL);
+		fl_hw_destroy_filter(tp, fnew, rtnl_held, NULL);
 errout_mask:
 	fl_mask_put(head, fnew->mask, true);
 errout:
@@ -1626,7 +1656,7 @@ static int fl_delete(struct tcf_proto *tp, void *arg, bool *last,
 	bool last_on_mask;
 	int err = 0;
 
-	err = __fl_delete(tp, f, &last_on_mask, extack);
+	err = __fl_delete(tp, f, &last_on_mask, rtnl_held, extack);
 	*last = list_empty(&head->masks);
 	__fl_put(f);
 
@@ -2270,7 +2300,7 @@ static int fl_dump(struct net *net, struct tcf_proto *tp, void *fh,
 	spin_unlock(&tp->lock);
 
 	if (!skip_hw)
-		fl_hw_update_stats(tp, f);
+		fl_hw_update_stats(tp, f, rtnl_held);
 
 	if (nla_put_u32(skb, TCA_FLOWER_IN_HW_COUNT, f->in_hw_count))
 		goto nla_put_failure;
-- 
2.21.0



* [PATCH net-next v3 12/12] net: sched: flower: set unlocked flag for flower proto ops
  2019-03-21 13:17 [PATCH net-next v3 00/12] Refactor flower classifier to remove dependency on rtnl lock Vlad Buslov
                   ` (10 preceding siblings ...)
  2019-03-21 13:17 ` [PATCH net-next v3 11/12] net: sched: flower: track rtnl lock state Vlad Buslov
@ 2019-03-21 13:17 ` Vlad Buslov
  2019-03-21 14:13   ` Jiri Pirko
  2019-03-21 21:33 ` [PATCH net-next v3 00/12] Refactor flower classifier to remove dependency on rtnl lock David Miller
  12 siblings, 1 reply; 22+ messages in thread
From: Vlad Buslov @ 2019-03-21 13:17 UTC (permalink / raw)
  To: netdev; +Cc: jhs, xiyou.wangcong, jiri, davem, sbrivio, Vlad Buslov

Set TCF_PROTO_OPS_DOIT_UNLOCKED for the flower classifier to indicate that
its ops callbacks don't require the caller to hold rtnl lock. Don't take rtnl
lock in fl_destroy_filter_work(): it is executed on a workqueue rather than
called by cls API, so it is not affected by setting
TCF_PROTO_OPS_DOIT_UNLOCKED. The rtnl mutex is still taken manually by the
flower classifier before calling the hardware offloads API, which has not
been updated for unlocked execution.
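
How a core layer might honour such an opt-in flag can be sketched in
userspace as below. This is a single-threaded model under stated
assumptions: the flag value, the ops structure and the dispatcher are all
hypothetical and do not reproduce the real cls API code.

```c
#include <stdbool.h>

/* Hypothetical flag and ops table, loosely modelled on tcf_proto_ops. */
#define SKETCH_OPS_UNLOCKED 0x1u

struct sketch_ops {
	unsigned int flags;
	int (*change)(void);
};

/* 1 while the modelled global lock is held, 0 otherwise. */
static int sketch_big_lock_depth;

/* Example callback: reports whether it ran under the global lock. */
static int sketch_report_lock(void)
{
	return sketch_big_lock_depth;
}

/* Take the global lock only for classifiers that did not opt out. */
static int sketch_dispatch(const struct sketch_ops *ops)
{
	bool need_lock = !(ops->flags & SKETCH_OPS_UNLOCKED);
	int ret;

	if (need_lock)
		sketch_big_lock_depth++;
	ret = ops->change();
	if (need_lock)
		sketch_big_lock_depth--;
	return ret;
}
```

Classifiers that have not been converted keep the old behaviour without any
change on their side, which is what allows the conversion to proceed one
classifier at a time.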

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
---
 net/sched/cls_flower.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index 68bac808cf35..0638f17ac5ab 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -368,9 +368,7 @@ static void fl_destroy_filter_work(struct work_struct *work)
 	struct cls_fl_filter *f = container_of(to_rcu_work(work),
 					struct cls_fl_filter, rwork);
 
-	rtnl_lock();
 	__fl_destroy_filter(f);
-	rtnl_unlock();
 }
 
 static void fl_hw_destroy_filter(struct tcf_proto *tp, struct cls_fl_filter *f,
@@ -2372,6 +2370,7 @@ static struct tcf_proto_ops cls_fl_ops __read_mostly = {
 	.tmplt_destroy	= fl_tmplt_destroy,
 	.tmplt_dump	= fl_tmplt_dump,
 	.owner		= THIS_MODULE,
+	.flags		= TCF_PROTO_OPS_DOIT_UNLOCKED,
 };
 
 static int __init cls_fl_init(void)
-- 
2.21.0



* Re: [PATCH net-next v3 01/12] net: sched: flower: don't check for rtnl on head dereference
  2019-03-21 13:17 ` [PATCH net-next v3 01/12] net: sched: flower: don't check for rtnl on head dereference Vlad Buslov
@ 2019-03-21 13:51   ` Jiri Pirko
  0 siblings, 0 replies; 22+ messages in thread
From: Jiri Pirko @ 2019-03-21 13:51 UTC (permalink / raw)
  To: Vlad Buslov; +Cc: netdev, jhs, xiyou.wangcong, davem, sbrivio

Thu, Mar 21, 2019 at 02:17:33PM CET, vladbu@mellanox.com wrote:
>Flower classifier only changes root pointer during init and destroy. Cls
>API implements reference counting for tcf_proto, so there is no danger of
>concurrent access to tp when it is being destroyed, even without protection
>provided by rtnl lock.
>
>Implement new function fl_head_dereference() to dereference tp->root
>without checking for rtnl lock. Use it in all flower functions that obtain
>head pointer instead of rtnl_dereference().
>
>Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
>Reviewed-by: Stefano Brivio <sbrivio@redhat.com>

Acked-by: Jiri Pirko <jiri@mellanox.com>


* Re: [PATCH net-next v3 02/12] net: sched: flower: refactor fl_change
  2019-03-21 13:17 ` [PATCH net-next v3 02/12] net: sched: flower: refactor fl_change Vlad Buslov
@ 2019-03-21 13:53   ` Jiri Pirko
  0 siblings, 0 replies; 22+ messages in thread
From: Jiri Pirko @ 2019-03-21 13:53 UTC (permalink / raw)
  To: Vlad Buslov; +Cc: netdev, jhs, xiyou.wangcong, davem, sbrivio

Thu, Mar 21, 2019 at 02:17:34PM CET, vladbu@mellanox.com wrote:
>As a preparation for using classifier spinlock instead of relying on
>external rtnl lock, rearrange code in fl_change. The goal is to group the
>code which changes classifier state in single block in order to allow
>following commits in this set to protect it from parallel modification with
>tp->lock. Data structures that require tp->lock protection are mask
>hashtable and filters list, and classifier handle_idr.
>
>fl_hw_replace_filter() is a sleeping function and cannot be called while
>holding a spinlock. In order to execute all sequence of changes to shared
>classifier data structures atomically, call fl_hw_replace_filter() before
>modifying them.
>
>Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
>Reviewed-by: Stefano Brivio <sbrivio@redhat.com>

Acked-by: Jiri Pirko <jiri@mellanox.com>


* Re: [PATCH net-next v3 03/12] net: sched: flower: introduce reference counting for filters
  2019-03-21 13:17 ` [PATCH net-next v3 03/12] net: sched: flower: introduce reference counting for filters Vlad Buslov
@ 2019-03-21 14:00   ` Jiri Pirko
  0 siblings, 0 replies; 22+ messages in thread
From: Jiri Pirko @ 2019-03-21 14:00 UTC (permalink / raw)
  To: Vlad Buslov; +Cc: netdev, jhs, xiyou.wangcong, davem, sbrivio

Thu, Mar 21, 2019 at 02:17:35PM CET, vladbu@mellanox.com wrote:
>Extend flower filters with reference counting in order to remove dependency
>on rtnl lock in flower ops and allow to modify filters concurrently.
>Reference to flower filter can be taken/released concurrently as soon as it
>is marked as 'unlocked' by last patch in this series. Use atomic reference
>counter type to make concurrent modifications safe.
>
>Always take reference to flower filter while working with it:
>- Modify fl_get() to take reference to filter.
>- Implement tp->put() callback as fl_put() function to allow cls API to
>release reference taken by fl_get().
>- Modify fl_change() to assume that caller holds reference to fold and take
>reference to fnew.
>- Take reference to filter while using it in fl_walk().
>
>Implement helper functions to get/put filter reference counter.
>
>Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
>Reviewed-by: Stefano Brivio <sbrivio@redhat.com>

Acked-by: Jiri Pirko <jiri@mellanox.com>


* Re: [PATCH net-next v3 04/12] net: sched: flower: track filter deletion with flag
  2019-03-21 13:17 ` [PATCH net-next v3 04/12] net: sched: flower: track filter deletion with flag Vlad Buslov
@ 2019-03-21 14:04   ` Jiri Pirko
  0 siblings, 0 replies; 22+ messages in thread
From: Jiri Pirko @ 2019-03-21 14:04 UTC (permalink / raw)
  To: Vlad Buslov; +Cc: netdev, jhs, xiyou.wangcong, davem, sbrivio

Thu, Mar 21, 2019 at 02:17:36PM CET, vladbu@mellanox.com wrote:
>In order to prevent double deletion of filter by concurrent tasks when rtnl
>lock is not used for synchronization, add 'deleted' filter field. Check
>value of this field when modifying filters and return error if concurrent
>deletion is detected.
>
>Refactor __fl_delete() to accept pointer to 'last' boolean as argument,
>and return error code as function return value instead. This is necessary
>to signal concurrent filter delete to caller.
>
>Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
>Reviewed-by: Stefano Brivio <sbrivio@redhat.com>

Acked-by: Jiri Pirko <jiri@mellanox.com>


* Re: [PATCH net-next v3 09/12] net: sched: flower: handle concurrent tcf proto deletion
  2019-03-21 13:17 ` [PATCH net-next v3 09/12] net: sched: flower: handle concurrent tcf proto deletion Vlad Buslov
@ 2019-03-21 14:06   ` Jiri Pirko
  0 siblings, 0 replies; 22+ messages in thread
From: Jiri Pirko @ 2019-03-21 14:06 UTC (permalink / raw)
  To: Vlad Buslov; +Cc: netdev, jhs, xiyou.wangcong, davem, sbrivio

Thu, Mar 21, 2019 at 02:17:41PM CET, vladbu@mellanox.com wrote:
>Without rtnl lock protection tcf proto can be deleted concurrently. Check
>tcf proto 'deleting' flag after taking tcf spinlock to verify that no
>concurrent deletion is in progress. Return EAGAIN error if concurrent
>deletion detected, which will cause caller to retry and possibly create new
>instance of tcf proto.
>
>Retry mechanism is a result of fine-grained locking approach used in this
>and previous changes in series and is necessary to allow concurrent updates
>on same chain instance. Alternative approach would be to lock the whole
>chain while updating filters on any of child tp's, adding and removing
>classifier instances from the chain. However, since most CPU-intensive
>parts of filter update code are specifically in classifier code and its
>dependencies (extensions and hw offloads), such approach would negate most
>of the gains introduced by this change and previous changes in the series
>when updating same chain instance.
>
>Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
>Reviewed-by: Stefano Brivio <sbrivio@redhat.com>

Acked-by: Jiri Pirko <jiri@mellanox.com>
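
The check-then-retry flow from the quoted commit message can be sketched in
userspace as below. All names are hypothetical, the spinlock is a plain C11
atomic_flag, and the chain itself is left unprotected, so this only walks
through the control flow in a single thread and is not the kernel
implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical model of a classifier instance. */
struct model_proto {
	atomic_flag lock;	/* per-instance spinlock */
	bool deleting;		/* set by the task tearing the instance down */
	int nfilters;
};

/* Hypothetical model of a chain holding at most one instance. */
struct model_chain {
	struct model_proto *tp;
};

static void model_lock(struct model_proto *tp)
{
	while (atomic_flag_test_and_set_explicit(&tp->lock,
						 memory_order_acquire))
		;	/* spin */
}

static void model_unlock(struct model_proto *tp)
{
	atomic_flag_clear_explicit(&tp->lock, memory_order_release);
}

/* Check 'deleting' only after the lock is held, as the commit describes. */
static int model_proto_add(struct model_proto *tp)
{
	int err = 0;

	model_lock(tp);
	if (tp->deleting)
		err = -EAGAIN;
	else
		tp->nfilters++;
	model_unlock(tp);
	return err;
}

/* Caller side: on -EAGAIN drop the dying instance and start over. */
static int model_chain_add(struct model_chain *ch)
{
	for (;;) {
		int err;

		if (!ch->tp) {
			ch->tp = calloc(1, sizeof(*ch->tp));
			if (!ch->tp)
				return -ENOMEM;
			atomic_flag_clear(&ch->tp->lock);
		}
		err = model_proto_add(ch->tp);
		if (err != -EAGAIN)
			return err;
		/* Model only: real code would not free here directly. */
		free(ch->tp);
		ch->tp = NULL;
	}
}
```

The point of the retry is that only the instance lock is held while the
filter is being added, so updates to other instances on the same chain are
never blocked by it.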

* Re: [PATCH net-next v3 10/12] net: sched: flower: protect flower classifier state with spinlock
  2019-03-21 13:17 ` [PATCH net-next v3 10/12] net: sched: flower: protect flower classifier state with spinlock Vlad Buslov
@ 2019-03-21 14:09   ` Jiri Pirko
  0 siblings, 0 replies; 22+ messages in thread
From: Jiri Pirko @ 2019-03-21 14:09 UTC (permalink / raw)
  To: Vlad Buslov; +Cc: netdev, jhs, xiyou.wangcong, davem, sbrivio

Thu, Mar 21, 2019 at 02:17:42PM CET, vladbu@mellanox.com wrote:
>struct tcf_proto was extended with spinlock to be used by classifiers
>instead of global rtnl lock. Use it to protect shared flower classifier
>data structures (handle_idr, mask hashtable and list) and fields of
>individual filters that can be accessed concurrently. This patch set uses
>tcf_proto->lock as a per-instance lock that protects all filters on
>tcf_proto.
>
>Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
>Reviewed-by: Stefano Brivio <sbrivio@redhat.com>

Acked-by: Jiri Pirko <jiri@mellanox.com>
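
As a rough illustration of one per-instance lock covering all shared
classifier state, the sketch below guards both a handle allocator and a
filter list with the same lock. The names are hypothetical, a simple counter
stands in for handle_idr, and an atomic_flag spinlock stands in for the
kernel spinlock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical model of a filter linked into its classifier. */
struct model_filter {
	unsigned int handle;
	struct model_filter *next;
};

/* Hypothetical model of a classifier instance. */
struct model_proto {
	atomic_flag lock;		/* protects every field below */
	unsigned int next_handle;	/* stand-in for handle_idr */
	struct model_filter *filters;	/* singly linked filter list */
};

static void model_lock(struct model_proto *tp)
{
	while (atomic_flag_test_and_set_explicit(&tp->lock,
						 memory_order_acquire))
		;	/* spin */
}

static void model_unlock(struct model_proto *tp)
{
	atomic_flag_clear_explicit(&tp->lock, memory_order_release);
}

/* Allocate a handle and link the filter in one critical section. */
static void model_insert(struct model_proto *tp, struct model_filter *f)
{
	model_lock(tp);
	f->handle = tp->next_handle++;
	f->next = tp->filters;
	tp->filters = f;
	model_unlock(tp);
}

/* Unlink and return the filter with the given handle, or NULL. */
static struct model_filter *model_remove(struct model_proto *tp,
					 unsigned int handle)
{
	struct model_filter **pp, *f = NULL;

	model_lock(tp);
	for (pp = &tp->filters; *pp; pp = &(*pp)->next) {
		if ((*pp)->handle == handle) {
			f = *pp;
			*pp = f->next;
			f->next = NULL;
			break;
		}
	}
	model_unlock(tp);
	return f;
}
```

Because handle allocation and list linkage happen under the same lock, no
other task can observe a filter that has a handle but is not yet on the
list.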

* Re: [PATCH net-next v3 11/12] net: sched: flower: track rtnl lock state
  2019-03-21 13:17 ` [PATCH net-next v3 11/12] net: sched: flower: track rtnl lock state Vlad Buslov
@ 2019-03-21 14:11   ` Jiri Pirko
  0 siblings, 0 replies; 22+ messages in thread
From: Jiri Pirko @ 2019-03-21 14:11 UTC (permalink / raw)
  To: Vlad Buslov; +Cc: netdev, jhs, xiyou.wangcong, davem, sbrivio

Thu, Mar 21, 2019 at 02:17:43PM CET, vladbu@mellanox.com wrote:
>Use 'rtnl_held' flag to track if caller holds rtnl lock. Propagate the flag
>to internal functions that need to know rtnl lock state. Take rtnl lock
>before calling tcf APIs that require it (hw offload, bind filter, etc.).
>
>Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
>Reviewed-by: Stefano Brivio <sbrivio@redhat.com>

Acked-by: Jiri Pirko <jiri@mellanox.com>
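
The 'rtnl_held' convention from the quoted commit message amounts to
"take the lock only if the caller has not already done so". A minimal
userspace sketch, with a hypothetical global lock standing in for rtnl and a
hypothetical function standing in for an API that requires it:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the rtnl lock. */
static atomic_bool model_rtnl_locked;
static int model_rtnl_taken;	/* number of acquisitions, for inspection */

static void model_rtnl_lock(void)
{
	bool expected = false;

	/* Spin until the flag flips from false to true. */
	while (!atomic_compare_exchange_weak(&model_rtnl_locked,
					     &expected, true))
		expected = false;
	model_rtnl_taken++;	/* safe: only updated while the lock is held */
}

static void model_rtnl_unlock(void)
{
	atomic_store(&model_rtnl_locked, false);
}

/* Stand-in for an API that must run under the lock, e.g. hw offload. */
static int model_hw_offload(void)
{
	assert(atomic_load(&model_rtnl_locked));
	return 0;
}

/* The flag tells the callee whether the caller already holds the lock. */
static int model_filter_offload(bool rtnl_held)
{
	int err;

	if (!rtnl_held)
		model_rtnl_lock();
	err = model_hw_offload();
	if (!rtnl_held)
		model_rtnl_unlock();
	return err;
}
```

A caller on the unlocked path passes false and the lock is taken just for
the duration of the call; a caller that already holds the lock passes true,
which avoids a self-deadlock on a non-recursive lock.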

* Re: [PATCH net-next v3 12/12] net: sched: flower: set unlocked flag for flower proto ops
  2019-03-21 13:17 ` [PATCH net-next v3 12/12] net: sched: flower: set unlocked flag for flower proto ops Vlad Buslov
@ 2019-03-21 14:13   ` Jiri Pirko
  0 siblings, 0 replies; 22+ messages in thread
From: Jiri Pirko @ 2019-03-21 14:13 UTC (permalink / raw)
  To: Vlad Buslov; +Cc: netdev, jhs, xiyou.wangcong, davem, sbrivio

Thu, Mar 21, 2019 at 02:17:44PM CET, vladbu@mellanox.com wrote:
>Set TCF_PROTO_OPS_DOIT_UNLOCKED for flower classifier to indicate that its
>ops callbacks don't require caller to hold rtnl lock. Don't take rtnl lock
>in fl_destroy_filter_work(), which is executed on workqueue rather than
>called by cls API and so is not affected by setting
>TCF_PROTO_OPS_DOIT_UNLOCKED. Rtnl mutex is still manually taken by flower
>classifier before calling hardware offloads API that has not been updated
>for unlocked execution.
>
>Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
>Reviewed-by: Stefano Brivio <sbrivio@redhat.com>

Acked-by: Jiri Pirko <jiri@mellanox.com>

* Re: [PATCH net-next v3 00/12] Refactor flower classifier to remove dependency on rtnl lock
  2019-03-21 13:17 [PATCH net-next v3 00/12] Refactor flower classifier to remove dependency on rtnl lock Vlad Buslov
                   ` (11 preceding siblings ...)
  2019-03-21 13:17 ` [PATCH net-next v3 12/12] net: sched: flower: set unlocked flag for flower proto ops Vlad Buslov
@ 2019-03-21 21:33 ` David Miller
  12 siblings, 0 replies; 22+ messages in thread
From: David Miller @ 2019-03-21 21:33 UTC (permalink / raw)
  To: vladbu; +Cc: netdev, jhs, xiyou.wangcong, jiri, sbrivio

From: Vlad Buslov <vladbu@mellanox.com>
Date: Thu, 21 Mar 2019 15:17:32 +0200

> Currently, all netlink protocol handlers for updating rules, actions and
> qdiscs are protected with single global rtnl lock which removes any
> possibility for parallelism. This patch set is a third step to remove
> rtnl lock dependency from TC rules update path.
> 
> Recently, new rtnl registration flag RTNL_FLAG_DOIT_UNLOCKED was added.
> TC rule update handlers (RTM_NEWTFILTER, RTM_DELTFILTER, etc.) are
> already registered with this flag and only take rtnl lock when qdisc or
> classifier requires it. Classifiers can indicate that their ops
> callbacks don't require caller to hold rtnl lock by setting the
> TCF_PROTO_OPS_DOIT_UNLOCKED flag. The goal of this change is to refactor
> flower classifier to support unlocked execution and register it with
> unlocked flag.
> 
> This patch set implements following changes to make flower classifier
> concurrency-safe:
 ...

Series applied, thanks Vlad.

Thread overview: 22+ messages
2019-03-21 13:17 [PATCH net-next v3 00/12] Refactor flower classifier to remove dependency on rtnl lock Vlad Buslov
2019-03-21 13:17 ` [PATCH net-next v3 01/12] net: sched: flower: don't check for rtnl on head dereference Vlad Buslov
2019-03-21 13:51   ` Jiri Pirko
2019-03-21 13:17 ` [PATCH net-next v3 02/12] net: sched: flower: refactor fl_change Vlad Buslov
2019-03-21 13:53   ` Jiri Pirko
2019-03-21 13:17 ` [PATCH net-next v3 03/12] net: sched: flower: introduce reference counting for filters Vlad Buslov
2019-03-21 14:00   ` Jiri Pirko
2019-03-21 13:17 ` [PATCH net-next v3 04/12] net: sched: flower: track filter deletion with flag Vlad Buslov
2019-03-21 14:04   ` Jiri Pirko
2019-03-21 13:17 ` [PATCH net-next v3 05/12] net: sched: flower: add reference counter to flower mask Vlad Buslov
2019-03-21 13:17 ` [PATCH net-next v3 06/12] net: sched: flower: handle concurrent mask insertion Vlad Buslov
2019-03-21 13:17 ` [PATCH net-next v3 07/12] net: sched: flower: protect masks list with spinlock Vlad Buslov
2019-03-21 13:17 ` [PATCH net-next v3 08/12] net: sched: flower: handle concurrent filter insertion in fl_change Vlad Buslov
2019-03-21 13:17 ` [PATCH net-next v3 09/12] net: sched: flower: handle concurrent tcf proto deletion Vlad Buslov
2019-03-21 14:06   ` Jiri Pirko
2019-03-21 13:17 ` [PATCH net-next v3 10/12] net: sched: flower: protect flower classifier state with spinlock Vlad Buslov
2019-03-21 14:09   ` Jiri Pirko
2019-03-21 13:17 ` [PATCH net-next v3 11/12] net: sched: flower: track rtnl lock state Vlad Buslov
2019-03-21 14:11   ` Jiri Pirko
2019-03-21 13:17 ` [PATCH net-next v3 12/12] net: sched: flower: set unlocked flag for flower proto ops Vlad Buslov
2019-03-21 14:13   ` Jiri Pirko
2019-03-21 21:33 ` [PATCH net-next v3 00/12] Refactor flower classifier to remove dependency on rtnl lock David Miller
