All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCHES] cls_u32 cleanups and fixes
@ 2018-09-05 19:01 Al Viro
  2018-09-05 19:04 ` [PATCH 1/7] fix hnode refcounting Al Viro
  0 siblings, 1 reply; 30+ messages in thread
From: Al Viro @ 2018-09-05 19:01 UTC (permalink / raw)
  To: netdev; +Cc: Jamal Hadi Salim, Cong Wang, Jiri Pirko

	Several cls_u32 patches: fixing refcounting, preventing
links to and deletion of root hnodes, validating divisor.  Plus
cleanups - removal of some useless fields and saner handling of
tc_u_common hashtable.  The first 3 in series are fixes (and
-stable fodder), the rest - cleanups.  Branch can be found in
vfs.git#misc.cls_u32; individual patches will go in followups.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 1/7] fix hnode refcounting
  2018-09-05 19:01 [PATCHES] cls_u32 cleanups and fixes Al Viro
@ 2018-09-05 19:04 ` Al Viro
  2018-09-05 19:04   ` [PATCH 2/7] mark root hnode explicitly Al Viro
                     ` (6 more replies)
  0 siblings, 7 replies; 30+ messages in thread
From: Al Viro @ 2018-09-05 19:04 UTC (permalink / raw)
  To: netdev; +Cc: Jamal Hadi Salim, Cong Wang, Jiri Pirko, stable

From: Al Viro <viro@zeniv.linux.org.uk>

cls_u32.c misuses refcounts for struct tc_u_hnode - it counts references via
->hlist and via ->tp_root together.  u32_destroy() drops the former and, in
case when there had been links, leaves the sucker on the list.  As the result,
there's nothing to protect it from getting freed once links are dropped.
That also makes the "is it busy" check incapable of catching the root hnode -
it *is* busy (there's a reference from tp), but we don't see it as something
separate.  "Is it our root?" check partially covers that, but the problem
exists for others' roots as well.

AFAICS, the minimal fix preserving the existing behaviour (where it doesn't
include oopsen, that is) would be this:
        * count tp->root and tp_c->hlist as separate references.  I.e.
have u32_init() set refcount to 2, not 1.
	* in u32_destroy() we always drop the former; in u32_destroy_hnode() -
the latter.

	That way we have *all* references contributing to refcount.  List
removal happens in u32_destroy_hnode() (called only when ->refcnt is 1)
an in u32_destroy() in case of tc_u_common going away, along with everything
reachable from it.  IOW, that way we know that u32_destroy_key() won't
free something still on the list (or pointed to by someone's ->root).

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 net/sched/cls_u32.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c
index f218ccf1e2d9..3f985f29ef30 100644
--- a/net/sched/cls_u32.c
+++ b/net/sched/cls_u32.c
@@ -398,6 +398,7 @@ static int u32_init(struct tcf_proto *tp)
 	rcu_assign_pointer(tp_c->hlist, root_ht);
 	root_ht->tp_c = tp_c;
 
+	root_ht->refcnt++;
 	rcu_assign_pointer(tp->root, root_ht);
 	tp->data = tp_c;
 	return 0;
@@ -610,7 +611,7 @@ static int u32_destroy_hnode(struct tcf_proto *tp, struct tc_u_hnode *ht,
 	struct tc_u_hnode __rcu **hn;
 	struct tc_u_hnode *phn;
 
-	WARN_ON(ht->refcnt);
+	WARN_ON(--ht->refcnt);
 
 	u32_clear_hnode(tp, ht, extack);
 
@@ -649,7 +650,7 @@ static void u32_destroy(struct tcf_proto *tp, struct netlink_ext_ack *extack)
 
 	WARN_ON(root_ht == NULL);
 
-	if (root_ht && --root_ht->refcnt == 0)
+	if (root_ht && --root_ht->refcnt == 1)
 		u32_destroy_hnode(tp, root_ht, extack);
 
 	if (--tp_c->refcnt == 0) {
@@ -698,7 +699,6 @@ static int u32_delete(struct tcf_proto *tp, void *arg, bool *last,
 	}
 
 	if (ht->refcnt == 1) {
-		ht->refcnt--;
 		u32_destroy_hnode(tp, ht, extack);
 	} else {
 		NL_SET_ERR_MSG_MOD(extack, "Can not delete in-use filter");
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 2/7] mark root hnode explicitly
  2018-09-05 19:04 ` [PATCH 1/7] fix hnode refcounting Al Viro
@ 2018-09-05 19:04   ` Al Viro
  2018-09-06 10:28     ` Jamal Hadi Salim
  2018-09-05 19:04   ` [PATCH 3/7] make sure that divisor is a power of 2 Al Viro
                     ` (5 subsequent siblings)
  6 siblings, 1 reply; 30+ messages in thread
From: Al Viro @ 2018-09-05 19:04 UTC (permalink / raw)
  To: netdev; +Cc: Jamal Hadi Salim, Cong Wang, Jiri Pirko

From: Al Viro <viro@zeniv.linux.org.uk>

... and disallow deleting or linking to such

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 net/sched/cls_u32.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c
index 3f985f29ef30..9ea5f2be907b 100644
--- a/net/sched/cls_u32.c
+++ b/net/sched/cls_u32.c
@@ -91,6 +91,7 @@ struct tc_u_hnode {
 	 */
 	struct tc_u_knode __rcu	*ht[1];
 };
+#define TCA_CLS_FLAGS_U32_ROOT (1<<8)
 
 struct tc_u_common {
 	struct tc_u_hnode __rcu	*hlist;
@@ -377,6 +378,7 @@ static int u32_init(struct tcf_proto *tp)
 	root_ht->refcnt++;
 	root_ht->handle = tp_c ? gen_new_htid(tp_c, root_ht) : 0x80000000;
 	root_ht->prio = tp->prio;
+	root_ht->flags = TCA_CLS_FLAGS_U32_ROOT;
 	idr_init(&root_ht->handle_idr);
 
 	if (tp_c == NULL) {
@@ -491,7 +493,8 @@ static void u32_clear_hw_hnode(struct tcf_proto *tp, struct tc_u_hnode *h,
 	struct tcf_block *block = tp->chain->block;
 	struct tc_cls_u32_offload cls_u32 = {};
 
-	tc_cls_common_offload_init(&cls_u32.common, tp, h->flags, extack);
+	tc_cls_common_offload_init(&cls_u32.common, tp,
+		h->flags & ~TCA_CLS_FLAGS_U32_ROOT, extack);
 	cls_u32.command = TC_CLSU32_DELETE_HNODE;
 	cls_u32.hnode.divisor = h->divisor;
 	cls_u32.hnode.handle = h->handle;
@@ -693,7 +696,7 @@ static int u32_delete(struct tcf_proto *tp, void *arg, bool *last,
 		goto out;
 	}
 
-	if (root_ht == ht) {
+	if (ht->flags & TCA_CLS_FLAGS_U32_ROOT) {
 		NL_SET_ERR_MSG_MOD(extack, "Not allowed to delete root node");
 		return -EINVAL;
 	}
@@ -795,6 +798,10 @@ static int u32_set_parms(struct net *net, struct tcf_proto *tp,
 				NL_SET_ERR_MSG_MOD(extack, "Link hash table not found");
 				return -EINVAL;
 			}
+			if (ht_down->flags & TCA_CLS_FLAGS_U32_ROOT) {
+				NL_SET_ERR_MSG_MOD(extack, "Not linke to root node");
+				return -EINVAL;
+			}
 			ht_down->refcnt++;
 		}
 
@@ -1214,7 +1221,8 @@ static int u32_reoffload_hnode(struct tcf_proto *tp, struct tc_u_hnode *ht,
 	struct tc_cls_u32_offload cls_u32 = {};
 	int err;
 
-	tc_cls_common_offload_init(&cls_u32.common, tp, ht->flags, extack);
+	tc_cls_common_offload_init(&cls_u32.common, tp,
+		ht->flags & ~TCA_CLS_FLAGS_U32_ROOT, extack);
 	cls_u32.command = add ? TC_CLSU32_NEW_HNODE : TC_CLSU32_DELETE_HNODE;
 	cls_u32.hnode.divisor = ht->divisor;
 	cls_u32.hnode.handle = ht->handle;
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 3/7] make sure that divisor is a power of 2
  2018-09-05 19:04 ` [PATCH 1/7] fix hnode refcounting Al Viro
  2018-09-05 19:04   ` [PATCH 2/7] mark root hnode explicitly Al Viro
@ 2018-09-05 19:04   ` Al Viro
  2018-09-06 10:28     ` Jamal Hadi Salim
  2018-09-05 19:04   ` [PATCH 4/7] get rid of unused argument of u32_destroy_key() Al Viro
                     ` (4 subsequent siblings)
  6 siblings, 1 reply; 30+ messages in thread
From: Al Viro @ 2018-09-05 19:04 UTC (permalink / raw)
  To: netdev; +Cc: Jamal Hadi Salim, Cong Wang, Jiri Pirko

From: Al Viro <viro@zeniv.linux.org.uk>

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 net/sched/cls_u32.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c
index 9ea5f2be907b..5816288810cc 100644
--- a/net/sched/cls_u32.c
+++ b/net/sched/cls_u32.c
@@ -995,7 +995,11 @@ static int u32_change(struct net *net, struct sk_buff *in_skb,
 	if (tb[TCA_U32_DIVISOR]) {
 		unsigned int divisor = nla_get_u32(tb[TCA_U32_DIVISOR]);
 
-		if (--divisor > 0x100) {
+		if (!is_power_of_2(divisor)) {
+			NL_SET_ERR_MSG_MOD(extack, "Divisor is not a power of 2");
+			return -EINVAL;
+		}
+		if (divisor-- > 0x100) {
 			NL_SET_ERR_MSG_MOD(extack, "Exceeded maximum 256 hash buckets");
 			return -EINVAL;
 		}
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 4/7] get rid of unused argument of u32_destroy_key()
  2018-09-05 19:04 ` [PATCH 1/7] fix hnode refcounting Al Viro
  2018-09-05 19:04   ` [PATCH 2/7] mark root hnode explicitly Al Viro
  2018-09-05 19:04   ` [PATCH 3/7] make sure that divisor is a power of 2 Al Viro
@ 2018-09-05 19:04   ` Al Viro
  2018-09-06 10:34     ` Jamal Hadi Salim
  2018-09-05 19:04   ` [PATCH 5/7] get rid of tc_u_knode ->tp Al Viro
                     ` (3 subsequent siblings)
  6 siblings, 1 reply; 30+ messages in thread
From: Al Viro @ 2018-09-05 19:04 UTC (permalink / raw)
  To: netdev; +Cc: Jamal Hadi Salim, Cong Wang, Jiri Pirko

From: Al Viro <viro@zeniv.linux.org.uk>

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 net/sched/cls_u32.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c
index 5816288810cc..3311aacad6c3 100644
--- a/net/sched/cls_u32.c
+++ b/net/sched/cls_u32.c
@@ -406,8 +406,7 @@ static int u32_init(struct tcf_proto *tp)
 	return 0;
 }
 
-static int u32_destroy_key(struct tcf_proto *tp, struct tc_u_knode *n,
-			   bool free_pf)
+static int u32_destroy_key(struct tc_u_knode *n, bool free_pf)
 {
 	struct tc_u_hnode *ht = rtnl_dereference(n->ht_down);
 
@@ -441,7 +440,7 @@ static void u32_delete_key_work(struct work_struct *work)
 					      struct tc_u_knode,
 					      rwork);
 	rtnl_lock();
-	u32_destroy_key(key->tp, key, false);
+	u32_destroy_key(key, false);
 	rtnl_unlock();
 }
 
@@ -458,7 +457,7 @@ static void u32_delete_key_freepf_work(struct work_struct *work)
 					      struct tc_u_knode,
 					      rwork);
 	rtnl_lock();
-	u32_destroy_key(key->tp, key, true);
+	u32_destroy_key(key, true);
 	rtnl_unlock();
 }
 
@@ -602,7 +601,7 @@ static void u32_clear_hnode(struct tcf_proto *tp, struct tc_u_hnode *ht,
 			if (tcf_exts_get_net(&n->exts))
 				tcf_queue_work(&n->rwork, u32_delete_key_freepf_work);
 			else
-				u32_destroy_key(n->tp, n, true);
+				u32_destroy_key(n, true);
 		}
 	}
 }
@@ -972,13 +971,13 @@ static int u32_change(struct net *net, struct sk_buff *in_skb,
 				    tca[TCA_RATE], ovr, extack);
 
 		if (err) {
-			u32_destroy_key(tp, new, false);
+			u32_destroy_key(new, false);
 			return err;
 		}
 
 		err = u32_replace_hw_knode(tp, new, flags, extack);
 		if (err) {
-			u32_destroy_key(tp, new, false);
+			u32_destroy_key(new, false);
 			return err;
 		}
 
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 5/7] get rid of tc_u_knode ->tp
  2018-09-05 19:04 ` [PATCH 1/7] fix hnode refcounting Al Viro
                     ` (2 preceding siblings ...)
  2018-09-05 19:04   ` [PATCH 4/7] get rid of unused argument of u32_destroy_key() Al Viro
@ 2018-09-05 19:04   ` Al Viro
  2018-09-06 10:35     ` Jamal Hadi Salim
  2018-09-05 19:04   ` [PATCH 6/7] get rid of tc_u_common ->rcu Al Viro
                     ` (2 subsequent siblings)
  6 siblings, 1 reply; 30+ messages in thread
From: Al Viro @ 2018-09-05 19:04 UTC (permalink / raw)
  To: netdev; +Cc: Jamal Hadi Salim, Cong Wang, Jiri Pirko

From: Al Viro <viro@zeniv.linux.org.uk>

not used anymore

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 net/sched/cls_u32.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c
index 3311aacad6c3..8a1a573487bd 100644
--- a/net/sched/cls_u32.c
+++ b/net/sched/cls_u32.c
@@ -68,7 +68,6 @@ struct tc_u_knode {
 	u32			mask;
 	u32 __percpu		*pcpu_success;
 #endif
-	struct tcf_proto	*tp;
 	struct rcu_work		rwork;
 	/* The 'sel' field MUST be the last field in structure to allow for
 	 * tc_u32_keys allocated at end of structure.
@@ -897,7 +896,6 @@ static struct tc_u_knode *u32_init_knode(struct tcf_proto *tp,
 	/* Similarly success statistics must be moved as pointers */
 	new->pcpu_success = n->pcpu_success;
 #endif
-	new->tp = tp;
 	memcpy(&new->sel, s, sizeof(*s) + s->nkeys*sizeof(struct tc_u32_key));
 
 	if (tcf_exts_init(&new->exts, TCA_U32_ACT, TCA_U32_POLICE)) {
@@ -1113,7 +1111,6 @@ static int u32_change(struct net *net, struct sk_buff *in_skb,
 	n->handle = handle;
 	n->fshift = s->hmask ? ffs(ntohl(s->hmask)) - 1 : 0;
 	n->flags = flags;
-	n->tp = tp;
 
 	err = tcf_exts_init(&n->exts, TCA_U32_ACT, TCA_U32_POLICE);
 	if (err < 0)
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 6/7] get rid of tc_u_common ->rcu
  2018-09-05 19:04 ` [PATCH 1/7] fix hnode refcounting Al Viro
                     ` (3 preceding siblings ...)
  2018-09-05 19:04   ` [PATCH 5/7] get rid of tc_u_knode ->tp Al Viro
@ 2018-09-05 19:04   ` Al Viro
  2018-09-06 10:36     ` Jamal Hadi Salim
  2018-09-07  4:18     ` Cong Wang
  2018-09-05 19:04   ` [PATCH 7/7] clean tc_u_common hashtable Al Viro
  2018-09-06 10:21   ` [PATCH 1/7] fix hnode refcounting Jamal Hadi Salim
  6 siblings, 2 replies; 30+ messages in thread
From: Al Viro @ 2018-09-05 19:04 UTC (permalink / raw)
  To: netdev; +Cc: Jamal Hadi Salim, Cong Wang, Jiri Pirko

From: Al Viro <viro@zeniv.linux.org.uk>

unused

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 net/sched/cls_u32.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c
index 8a1a573487bd..be9240ae1417 100644
--- a/net/sched/cls_u32.c
+++ b/net/sched/cls_u32.c
@@ -98,7 +98,6 @@ struct tc_u_common {
 	int			refcnt;
 	struct idr		handle_idr;
 	struct hlist_node	hnode;
-	struct rcu_head		rcu;
 };
 
 static inline unsigned int u32_hash_fold(__be32 key,
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 7/7] clean tc_u_common hashtable
  2018-09-05 19:04 ` [PATCH 1/7] fix hnode refcounting Al Viro
                     ` (4 preceding siblings ...)
  2018-09-05 19:04   ` [PATCH 6/7] get rid of tc_u_common ->rcu Al Viro
@ 2018-09-05 19:04   ` Al Viro
  2018-09-06 10:36     ` Jamal Hadi Salim
  2018-09-06 10:21   ` [PATCH 1/7] fix hnode refcounting Jamal Hadi Salim
  6 siblings, 1 reply; 30+ messages in thread
From: Al Viro @ 2018-09-05 19:04 UTC (permalink / raw)
  To: netdev; +Cc: Jamal Hadi Salim, Cong Wang, Jiri Pirko

From: Al Viro <viro@zeniv.linux.org.uk>

* calculate key *once*, not for each hash chain element
* let tc_u_hash() return the pointer to chain head rather than index -
callers are cleaner that way.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 net/sched/cls_u32.c | 22 ++++++++--------------
 1 file changed, 8 insertions(+), 14 deletions(-)

diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c
index be9240ae1417..6d45ec4c218c 100644
--- a/net/sched/cls_u32.c
+++ b/net/sched/cls_u32.c
@@ -343,19 +343,16 @@ static void *tc_u_common_ptr(const struct tcf_proto *tp)
 		return block->q;
 }
 
-static unsigned int tc_u_hash(const struct tcf_proto *tp)
+static struct hlist_head *tc_u_hash(void *key)
 {
-	return hash_ptr(tc_u_common_ptr(tp), U32_HASH_SHIFT);
+	return tc_u_common_hash + hash_ptr(key, U32_HASH_SHIFT);
 }
 
-static struct tc_u_common *tc_u_common_find(const struct tcf_proto *tp)
+static struct tc_u_common *tc_u_common_find(void *key)
 {
 	struct tc_u_common *tc;
-	unsigned int h;
-
-	h = tc_u_hash(tp);
-	hlist_for_each_entry(tc, &tc_u_common_hash[h], hnode) {
-		if (tc->ptr == tc_u_common_ptr(tp))
+	hlist_for_each_entry(tc, tc_u_hash(key), hnode) {
+		if (tc->ptr == key)
 			return tc;
 	}
 	return NULL;
@@ -364,10 +361,8 @@ static struct tc_u_common *tc_u_common_find(const struct tcf_proto *tp)
 static int u32_init(struct tcf_proto *tp)
 {
 	struct tc_u_hnode *root_ht;
-	struct tc_u_common *tp_c;
-	unsigned int h;
-
-	tp_c = tc_u_common_find(tp);
+	void *key = tc_u_common_ptr(tp);
+	struct tc_u_common *tp_c = tc_u_common_find(key);
 
 	root_ht = kzalloc(sizeof(*root_ht), GFP_KERNEL);
 	if (root_ht == NULL)
@@ -389,8 +384,7 @@ static int u32_init(struct tcf_proto *tp)
 		INIT_HLIST_NODE(&tp_c->hnode);
 		idr_init(&tp_c->handle_idr);
 
-		h = tc_u_hash(tp);
-		hlist_add_head(&tp_c->hnode, &tc_u_common_hash[h]);
+		hlist_add_head(&tp_c->hnode, tc_u_hash(key));
 	}
 
 	tp_c->refcnt++;
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/7] fix hnode refcounting
  2018-09-05 19:04 ` [PATCH 1/7] fix hnode refcounting Al Viro
                     ` (5 preceding siblings ...)
  2018-09-05 19:04   ` [PATCH 7/7] clean tc_u_common hashtable Al Viro
@ 2018-09-06 10:21   ` Jamal Hadi Salim
  2018-09-07  2:35     ` Al Viro
  6 siblings, 1 reply; 30+ messages in thread
From: Jamal Hadi Salim @ 2018-09-06 10:21 UTC (permalink / raw)
  To: Al Viro, netdev; +Cc: Cong Wang, Jiri Pirko, stable

On 2018-09-05 3:04 p.m., Al Viro wrote:
> From: Al Viro <viro@zeniv.linux.org.uk>
> 
> cls_u32.c misuses refcounts for struct tc_u_hnode - it counts references via
> ->hlist and via ->tp_root together.  u32_destroy() drops the former and, in
> case when there had been links, leaves the sucker on the list.  As the result,
> there's nothing to protect it from getting freed once links are dropped.
> That also makes the "is it busy" check incapable of catching the root hnode -
> it *is* busy (there's a reference from tp), but we don't see it as something
> separate.  "Is it our root?" check partially covers that, but the problem
> exists for others' roots as well.
> 
> AFAICS, the minimal fix preserving the existing behaviour (where it doesn't
> include oopsen, that is) would be this:
>          * count tp->root and tp_c->hlist as separate references.  I.e.
> have u32_init() set refcount to 2, not 1.
> 	* in u32_destroy() we always drop the former; in u32_destroy_hnode() -
> the latter.
> 
> 	That way we have *all* references contributing to refcount.  List
> removal happens in u32_destroy_hnode() (called only when ->refcnt is 1)
> an in u32_destroy() in case of tc_u_common going away, along with everything
> reachable from it.  IOW, that way we know that u32_destroy_key() won't
> free something still on the list (or pointed to by someone's ->root).
> 
> Cc: stable@vger.kernel.org
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

For networking patches, subject should be reflective of tree and
subsystem. Example for this one:
"[PATCH net 1/7]:net: sched: cls_u32: fix hnode refcounting"
Also useful to have a cover letter summarizing the patchset
in 0/7. Otherwise

Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>

cheers,
jamal

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 2/7] mark root hnode explicitly
  2018-09-05 19:04   ` [PATCH 2/7] mark root hnode explicitly Al Viro
@ 2018-09-06 10:28     ` Jamal Hadi Salim
  2018-09-06 10:34       ` Jamal Hadi Salim
  0 siblings, 1 reply; 30+ messages in thread
From: Jamal Hadi Salim @ 2018-09-06 10:28 UTC (permalink / raw)
  To: Al Viro, netdev; +Cc: Cong Wang, Jiri Pirko

On 2018-09-05 3:04 p.m., Al Viro wrote:
> From: Al Viro <viro@zeniv.linux.org.uk>
> 
> ... and disallow deleting or linking to such
> 
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Same comment as other one in regards to subject

Since the flag space is coming from htnode which is
exposed via uapi it makes sense to keep this one here
because it is for private use; but  a comment in
include/uapi/linux/pkt_cls.h that this flag or
maybe a set of bits is reserved for internal use.
Otherwise:

Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>


cheers,
jamal

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3/7] make sure that divisor is a power of 2
  2018-09-05 19:04   ` [PATCH 3/7] make sure that divisor is a power of 2 Al Viro
@ 2018-09-06 10:28     ` Jamal Hadi Salim
  0 siblings, 0 replies; 30+ messages in thread
From: Jamal Hadi Salim @ 2018-09-06 10:28 UTC (permalink / raw)
  To: Al Viro, netdev; +Cc: Cong Wang, Jiri Pirko

On 2018-09-05 3:04 p.m., Al Viro wrote:
> From: Al Viro <viro@zeniv.linux.org.uk>
> 
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>


cheers,
jamal

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 2/7] mark root hnode explicitly
  2018-09-06 10:28     ` Jamal Hadi Salim
@ 2018-09-06 10:34       ` Jamal Hadi Salim
  2018-09-06 10:42         ` Jamal Hadi Salim
  2018-09-06 10:59         ` Al Viro
  0 siblings, 2 replies; 30+ messages in thread
From: Jamal Hadi Salim @ 2018-09-06 10:34 UTC (permalink / raw)
  To: Al Viro, netdev; +Cc: Cong Wang, Jiri Pirko

On 2018-09-06 6:28 a.m., Jamal Hadi Salim wrote:
> On 2018-09-05 3:04 p.m., Al Viro wrote:
>> From: Al Viro <viro@zeniv.linux.org.uk>
>>
>> ... and disallow deleting or linking to such
>>
>> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> 
> Same comment as other one in regards to subject
> 
> Since the flag space is coming from htnode which is
> exposed via uapi it makes sense to keep this one here
> because it is for private use; but  a comment in
> include/uapi/linux/pkt_cls.h that this flag or
> maybe a set of bits is reserved for internal use.
> Otherwise:
> 
> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>


Sorry, additional comment:
It makes sense to reject user space attempt to
set TCA_CLS_FLAGS_U32_ROOT
So my suggestion is to update tc_flags_valid() to
check for this.

cheers,
jamal

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 4/7] get rid of unused argument of u32_destroy_key()
  2018-09-05 19:04   ` [PATCH 4/7] get rid of unused argument of u32_destroy_key() Al Viro
@ 2018-09-06 10:34     ` Jamal Hadi Salim
  0 siblings, 0 replies; 30+ messages in thread
From: Jamal Hadi Salim @ 2018-09-06 10:34 UTC (permalink / raw)
  To: Al Viro, netdev; +Cc: Cong Wang, Jiri Pirko

On 2018-09-05 3:04 p.m., Al Viro wrote:
> From: Al Viro <viro@zeniv.linux.org.uk>
> 
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>

cheers,
jamal

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 5/7] get rid of tc_u_knode ->tp
  2018-09-05 19:04   ` [PATCH 5/7] get rid of tc_u_knode ->tp Al Viro
@ 2018-09-06 10:35     ` Jamal Hadi Salim
  0 siblings, 0 replies; 30+ messages in thread
From: Jamal Hadi Salim @ 2018-09-06 10:35 UTC (permalink / raw)
  To: Al Viro, netdev; +Cc: Cong Wang, Jiri Pirko

On 2018-09-05 3:04 p.m., Al Viro wrote:
> From: Al Viro <viro@zeniv.linux.org.uk>
> 
> not used anymore
> 
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>

cheers,
jamal

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 6/7] get rid of tc_u_common ->rcu
  2018-09-05 19:04   ` [PATCH 6/7] get rid of tc_u_common ->rcu Al Viro
@ 2018-09-06 10:36     ` Jamal Hadi Salim
  2018-09-07  4:18     ` Cong Wang
  1 sibling, 0 replies; 30+ messages in thread
From: Jamal Hadi Salim @ 2018-09-06 10:36 UTC (permalink / raw)
  To: Al Viro, netdev; +Cc: Cong Wang, Jiri Pirko

On 2018-09-05 3:04 p.m., Al Viro wrote:
> From: Al Viro <viro@zeniv.linux.org.uk>
> 
> unused
> 
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>

cheers,
jamal

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 7/7] clean tc_u_common hashtable
  2018-09-05 19:04   ` [PATCH 7/7] clean tc_u_common hashtable Al Viro
@ 2018-09-06 10:36     ` Jamal Hadi Salim
  0 siblings, 0 replies; 30+ messages in thread
From: Jamal Hadi Salim @ 2018-09-06 10:36 UTC (permalink / raw)
  To: Al Viro, netdev; +Cc: Cong Wang, Jiri Pirko

On 2018-09-05 3:04 p.m., Al Viro wrote:
> From: Al Viro <viro@zeniv.linux.org.uk>
> 
> * calculate key *once*, not for each hash chain element
> * let tc_u_hash() return the pointer to chain head rather than index -
> callers are cleaner that way.
> 
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>

cheers,
jamal

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 2/7] mark root hnode explicitly
  2018-09-06 10:34       ` Jamal Hadi Salim
@ 2018-09-06 10:42         ` Jamal Hadi Salim
  2018-09-06 10:59         ` Al Viro
  1 sibling, 0 replies; 30+ messages in thread
From: Jamal Hadi Salim @ 2018-09-06 10:42 UTC (permalink / raw)
  To: Al Viro, netdev; +Cc: Cong Wang, Jiri Pirko

[-- Attachment #1: Type: text/plain, Size: 47 bytes --]


And a bunch of indentations...

cheers,
jamal

[-- Attachment #2: indent.p --]
[-- Type: text/x-pascal, Size: 975 bytes --]

diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c
index 6d45ec4c218c..cb3bee12af78 100644
--- a/net/sched/cls_u32.c
+++ b/net/sched/cls_u32.c
@@ -485,7 +485,8 @@ static void u32_clear_hw_hnode(struct tcf_proto *tp, struct tc_u_hnode *h,
 	struct tc_cls_u32_offload cls_u32 = {};
 
 	tc_cls_common_offload_init(&cls_u32.common, tp,
-		h->flags & ~TCA_CLS_FLAGS_U32_ROOT, extack);
+				   h->flags & ~TCA_CLS_FLAGS_U32_ROOT,
+				   extack);
 	cls_u32.command = TC_CLSU32_DELETE_HNODE;
 	cls_u32.hnode.divisor = h->divisor;
 	cls_u32.hnode.handle = h->handle;
@@ -1215,7 +1216,7 @@ static int u32_reoffload_hnode(struct tcf_proto *tp, struct tc_u_hnode *ht,
 	int err;
 
 	tc_cls_common_offload_init(&cls_u32.common, tp,
-		ht->flags & ~TCA_CLS_FLAGS_U32_ROOT, extack);
+				   ht->flags & ~TCA_CLS_FLAGS_U32_ROOT, extack);
 	cls_u32.command = add ? TC_CLSU32_NEW_HNODE : TC_CLSU32_DELETE_HNODE;
 	cls_u32.hnode.divisor = ht->divisor;
 	cls_u32.hnode.handle = ht->handle;

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH 2/7] mark root hnode explicitly
  2018-09-06 10:34       ` Jamal Hadi Salim
  2018-09-06 10:42         ` Jamal Hadi Salim
@ 2018-09-06 10:59         ` Al Viro
  2018-09-06 11:04           ` Jamal Hadi Salim
  2018-09-07  2:57           ` Cong Wang
  1 sibling, 2 replies; 30+ messages in thread
From: Al Viro @ 2018-09-06 10:59 UTC (permalink / raw)
  To: Jamal Hadi Salim; +Cc: netdev, Cong Wang, Jiri Pirko

On Thu, Sep 06, 2018 at 06:34:00AM -0400, Jamal Hadi Salim wrote:
> On 2018-09-06 6:28 a.m., Jamal Hadi Salim wrote:
> > On 2018-09-05 3:04 p.m., Al Viro wrote:
> > > From: Al Viro <viro@zeniv.linux.org.uk>
> > > 
> > > ... and disallow deleting or linking to such
> > > 
> > > Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> > 
> > Same comment as other one in regards to subject
> > 
> > Since the flag space is coming from htnode which is
> > exposed via uapi it makes sense to keep this one here
> > because it is for private use; but  a comment in
> > include/uapi/linux/pkt_cls.h that this flag or
> > maybe a set of bits is reserved for internal use.
> > Otherwise:
> > 
> > Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
> 
> 
> Sorry, additional comment:
> It makes sense to reject user space attempt to
> set TCA_CLS_FLAGS_U32_ROOT

Point, and that one is IMO enough to give up on using ->flags for
that.  How about simply

diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c
index 3f985f29ef30..d14048e38b5c 100644
--- a/net/sched/cls_u32.c
+++ b/net/sched/cls_u32.c
@@ -84,6 +84,7 @@ struct tc_u_hnode {
 	int			refcnt;
 	unsigned int		divisor;
 	struct idr		handle_idr;
+	bool			is_root;
 	struct rcu_head		rcu;
 	u32			flags;
 	/* The 'ht' field MUST be the last field in structure to allow for
@@ -377,6 +378,7 @@ static int u32_init(struct tcf_proto *tp)
 	root_ht->refcnt++;
 	root_ht->handle = tp_c ? gen_new_htid(tp_c, root_ht) : 0x80000000;
 	root_ht->prio = tp->prio;
+	root_ht->is_root = true;
 	idr_init(&root_ht->handle_idr);
 
 	if (tp_c == NULL) {
@@ -693,7 +695,7 @@ static int u32_delete(struct tcf_proto *tp, void *arg, bool *last,
 		goto out;
 	}
 
-	if (root_ht == ht) {
+	if (ht->is_root) {
 		NL_SET_ERR_MSG_MOD(extack, "Not allowed to delete root node");
 		return -EINVAL;
 	}
@@ -795,6 +797,10 @@ static int u32_set_parms(struct net *net, struct tcf_proto *tp,
 				NL_SET_ERR_MSG_MOD(extack, "Link hash table not found");
 				return -EINVAL;
 			}
+			if (ht_down->is_root) {
+				NL_SET_ERR_MSG_MOD(extack, "Not linking to root node");
+				return -EINVAL;
+			}
 			ht_down->refcnt++;
 		}
 

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH 2/7] mark root hnode explicitly
  2018-09-06 10:59         ` Al Viro
@ 2018-09-06 11:04           ` Jamal Hadi Salim
  2018-09-07  2:57           ` Cong Wang
  1 sibling, 0 replies; 30+ messages in thread
From: Jamal Hadi Salim @ 2018-09-06 11:04 UTC (permalink / raw)
  To: Al Viro; +Cc: netdev, Cong Wang, Jiri Pirko

On 2018-09-06 6:59 a.m., Al Viro wrote:
> On Thu, Sep 06, 2018 at 06:34:00AM -0400, Jamal Hadi Salim wrote:
>> On 2018-09-06 6:28 a.m., Jamal Hadi Salim wrote:

[..]

> Point, and that one is IMO enough to give up on using ->flags for
> that.  How about simply
> 
> diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c
> index 3f985f29ef30..d14048e38b5c 100644
> --- a/net/sched/cls_u32.c
> +++ b/net/sched/cls_u32.c
> @@ -84,6 +84,7 @@ struct tc_u_hnode {
>   	int			refcnt;
>   	unsigned int		divisor;
>   	struct idr		handle_idr;
> +	bool			is_root;
>   	struct rcu_head		rcu;
>   	u32			flags;
>   	/* The 'ht' field MUST be the last field in structure to allow for
> @@ -377,6 +378,7 @@ static int u32_init(struct tcf_proto *tp)
>   	root_ht->refcnt++;
>   	root_ht->handle = tp_c ? gen_new_htid(tp_c, root_ht) : 0x80000000;
>   	root_ht->prio = tp->prio;
> +	root_ht->is_root = true;
>   	idr_init(&root_ht->handle_idr);
>   
>   	if (tp_c == NULL) {
> @@ -693,7 +695,7 @@ static int u32_delete(struct tcf_proto *tp, void *arg, bool *last,
>   		goto out;
>   	}
>   
> -	if (root_ht == ht) {
> +	if (ht->is_root) {
>   		NL_SET_ERR_MSG_MOD(extack, "Not allowed to delete root node");
>   		return -EINVAL;
>   	}
> @@ -795,6 +797,10 @@ static int u32_set_parms(struct net *net, struct tcf_proto *tp,
>   				NL_SET_ERR_MSG_MOD(extack, "Link hash table not found");
>   				return -EINVAL;
>   			}
> +			if (ht_down->is_root) {
> +				NL_SET_ERR_MSG_MOD(extack, "Not linking to root node");
> +				return -EINVAL;
> +			}
>   			ht_down->refcnt++;


Looks good to me ;->

cheers,
jamal

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/7] fix hnode refcounting
  2018-09-06 10:21   ` [PATCH 1/7] fix hnode refcounting Jamal Hadi Salim
@ 2018-09-07  2:35     ` Al Viro
  2018-09-07 12:13       ` Jamal Hadi Salim
  0 siblings, 1 reply; 30+ messages in thread
From: Al Viro @ 2018-09-07  2:35 UTC (permalink / raw)
  To: Jamal Hadi Salim; +Cc: netdev, Cong Wang, Jiri Pirko, stable

On Thu, Sep 06, 2018 at 06:21:09AM -0400, Jamal Hadi Salim wrote:

> For networking patches, subject should be reflective of tree and
> subsystem. Example for this one:
> "[PATCH net 1/7]:net: sched: cls_u32: fix hnode refcounting"
> Also useful to have a cover letter summarizing the patchset
> in 0/7. Otherwise
> 
> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>

Argh...  Unfortunately, there's this: in u32_delete() we have
        if (root_ht) {
                if (root_ht->refcnt > 1) {
                        *last = false;
                        goto ret;
                }
                if (root_ht->refcnt == 1) {
                        if (!ht_empty(root_ht)) {
                                *last = false;
                                goto ret;
                        }
                }
        }
and that would need to be updated.  However, that logics is bloody odd
to start with.  First of all, root_ht has come from
       struct tc_u_hnode *root_ht = rtnl_dereference(tp->root);
and the only place where it's ever modified is
        rcu_assign_pointer(tp->root, root_ht);
in u32_init(), where we'd bloody well checked that root_ht is non-NULL
(see
        if (root_ht == NULL)
                return -ENOBUFS;
upstream of that place) and where that assignment is inevitable on the
way to returning 0.  No matter what, if tp has passed u32_init() it
will have non-NULL ->root, forever.  And there is no way for tcf_proto
to be seen outside of tcf_proto_create() without ->init() having returned
0 - it gets freed before anyone sees it.

So this 'if (root_ht)' can't be false.  What's more, what the hell is the
whole thing checking?  We are in u32_delete().  It's called (as ->delete())
from tfilter_del_notify(), which is called from tc_del_tfilter().  If we
return 0 with *last true, we follow up calling tcf_proto_destroy().
OK, let's look at the logics in there:
	* if there are links to root hnode => false
	* if there's no links to root hnode and it has knodes => false
(BTW, if we ever get there with root_ht->refcnt < 1, we are obviously screwed)
	* if there is a tcf_proto sharing tp->data => false (i.e. any filters
with different prio - don't bother)
	* if tp is the only one with reference to tp->data and there are *any*
knodes => false.

Any extra links can come only from knodes in a non-empty hnode.  And it's not
a common case.  Shouldn't thIe whole thing be
	* shared tp->data => false
	* any non-empty hnode => false
instead?  Perhaps even with the knode counter in tp->data, avoiding any loops
in there, as well as the entire ht_empty()...

Now, in the very beginning of u32_delete() we have this:
        struct tc_u_hnode *ht = arg;
	
        if (ht == NULL)
                goto out;
OK, but the call of ->delete() is
        err = tp->ops->delete(tp, fh, last, extack);
and arg == NULL seen in u32_delete() means fh == NULL in tfilter_del_notify().
Which is called in
        if (!fh) {
		...
	} else {
                bool last;

                err = tfilter_del_notify(net, skb, n, tp, block,
                                         q, parent, fh, false, &last,
                                         extack);
How can we ever get there with NULL fh?

The whole thing makes very little sense; looks like it used to live in
u32_destroy() prior to commit 763dbf6328e41 ("net_sched: move the empty tp
check from ->destroy() to ->delete()"), but looking at the rationale in
that commit...  I don't see how it fixes anything - sure, now we remove
tcf_proto from the list before calling ->destroy().  Without any RCU delays
in between.  How could it possibly solve any issues with ->classify()
called in parallel with ->destroy()?  cls_u32 (at least these days)
does try to survive u32_destroy() in parallel with u32_classify();
if any other classifiers do not, they are still broken and that commit
has not done anything for them.

Anyway, adjusting 1/7 for that is trivial, but I would really like to
understand what that code is doing...  Comments?

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 2/7] mark root hnode explicitly
  2018-09-06 10:59         ` Al Viro
  2018-09-06 11:04           ` Jamal Hadi Salim
@ 2018-09-07  2:57           ` Cong Wang
  2018-09-07  3:04             ` Al Viro
  1 sibling, 1 reply; 30+ messages in thread
From: Cong Wang @ 2018-09-07  2:57 UTC (permalink / raw)
  To: Al Viro; +Cc: Jamal Hadi Salim, Linux Kernel Network Developers, Jiri Pirko

On Thu, Sep 6, 2018 at 3:59 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c
> index 3f985f29ef30..d14048e38b5c 100644
> --- a/net/sched/cls_u32.c
> +++ b/net/sched/cls_u32.c
> @@ -84,6 +84,7 @@ struct tc_u_hnode {
>         int                     refcnt;
>         unsigned int            divisor;
>         struct idr              handle_idr;
> +       bool                    is_root;
>         struct rcu_head         rcu;
>         u32                     flags;
>         /* The 'ht' field MUST be the last field in structure to allow for
> @@ -377,6 +378,7 @@ static int u32_init(struct tcf_proto *tp)
>         root_ht->refcnt++;
>         root_ht->handle = tp_c ? gen_new_htid(tp_c, root_ht) : 0x80000000;
>         root_ht->prio = tp->prio;
> +       root_ht->is_root = true;
>         idr_init(&root_ht->handle_idr);
>
>         if (tp_c == NULL) {
> @@ -693,7 +695,7 @@ static int u32_delete(struct tcf_proto *tp, void *arg, bool *last,
>                 goto out;
>         }
>
> -       if (root_ht == ht) {
> +       if (ht->is_root) {


What's wrong with comparing pointers with root ht?


>                 NL_SET_ERR_MSG_MOD(extack, "Not allowed to delete root node");
>                 return -EINVAL;
>         }
> @@ -795,6 +797,10 @@ static int u32_set_parms(struct net *net, struct tcf_proto *tp,
>                                 NL_SET_ERR_MSG_MOD(extack, "Link hash table not found");
>                                 return -EINVAL;
>                         }
> +                       if (ht_down->is_root) {

root ht is saved in tp->root, so you can compare ht_down with it too,
if you want.

If this check is all what you need, you don't need an extra flag.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 2/7] mark root hnode explicitly
  2018-09-07  2:57           ` Cong Wang
@ 2018-09-07  3:04             ` Al Viro
  2018-09-07  3:23               ` Cong Wang
  0 siblings, 1 reply; 30+ messages in thread
From: Al Viro @ 2018-09-07  3:04 UTC (permalink / raw)
  To: Cong Wang; +Cc: Jamal Hadi Salim, Linux Kernel Network Developers, Jiri Pirko

On Thu, Sep 06, 2018 at 07:57:25PM -0700, Cong Wang wrote:

> > -       if (root_ht == ht) {
> > +       if (ht->is_root) {
> 
> 
> What's wrong with comparing pointers with root ht?

The fact that there may be more than one tcf_proto sharing tp->data.

> >                 NL_SET_ERR_MSG_MOD(extack, "Not allowed to delete root node");
> >                 return -EINVAL;
> >         }
> > @@ -795,6 +797,10 @@ static int u32_set_parms(struct net *net, struct tcf_proto *tp,
> >                                 NL_SET_ERR_MSG_MOD(extack, "Link hash table not found");
> >                                 return -EINVAL;
> >                         }
> > +                       if (ht_down->is_root) {
> 
> root ht is saved in tp->root, so you can compare ht_down with it too,
> if you want.
> 
> If this check is all what you need, you don't need an extra flag.

Again, *which* tp?  We can trivially check that we are not linking to/deleting
our own root, sure.  But there's nothing to stop doing the same via another
tcf_proto...

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 2/7] mark root hnode explicitly
  2018-09-07  3:04             ` Al Viro
@ 2018-09-07  3:23               ` Cong Wang
  2018-09-07  3:49                 ` Al Viro
  0 siblings, 1 reply; 30+ messages in thread
From: Cong Wang @ 2018-09-07  3:23 UTC (permalink / raw)
  To: Al Viro; +Cc: Jamal Hadi Salim, Linux Kernel Network Developers, Jiri Pirko

On Thu, Sep 6, 2018 at 8:04 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> On Thu, Sep 06, 2018 at 07:57:25PM -0700, Cong Wang wrote:
>
> > > -       if (root_ht == ht) {
> > > +       if (ht->is_root) {
> >
> >
> > What's wrong with comparing pointers with root ht?
>
> The fact that there may be more than one tcf_proto sharing tp->data.

Hmm? root ht is from tp->root, not tp->data.

Also, this very important information is missing in your one-line changelog...



>
> > >                 NL_SET_ERR_MSG_MOD(extack, "Not allowed to delete root node");
> > >                 return -EINVAL;
> > >         }
> > > @@ -795,6 +797,10 @@ static int u32_set_parms(struct net *net, struct tcf_proto *tp,
> > >                                 NL_SET_ERR_MSG_MOD(extack, "Link hash table not found");
> > >                                 return -EINVAL;
> > >                         }
> > > +                       if (ht_down->is_root) {
> >
> > root ht is saved in tp->root, so you can compare ht_down with it too,
> > if you want.
> >
> > If this check is all what you need, you don't need an extra flag.
>
> Again, *which* tp?  We can trivially check that we are not linking to/deleting


Pretty sure there is a 'tp' in u32_set_parms() parameter list.

Are you saying it is not what you want? If so, why?

More importantly, why this information is again missing in your
changelog? This patch is definitely not trivial, it deserves a detailed
changelog.


> our own root, sure.  But there's nothing to stop doing the same via another
> tcf_proto...

To my best knowledge, the place where you set ->is_root=true
is precisely same with where we set tp->root=root_ht, and it doesn't
change after set. What am I missing here?

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 2/7] mark root hnode explicitly
  2018-09-07  3:23               ` Cong Wang
@ 2018-09-07  3:49                 ` Al Viro
  2018-09-07  4:14                   ` Cong Wang
  0 siblings, 1 reply; 30+ messages in thread
From: Al Viro @ 2018-09-07  3:49 UTC (permalink / raw)
  To: Cong Wang; +Cc: Jamal Hadi Salim, Linux Kernel Network Developers, Jiri Pirko

On Thu, Sep 06, 2018 at 08:23:36PM -0700, Cong Wang wrote:

> Pretty sure there is a 'tp' in u32_set_parms() parameter list.
> 
> Are you saying it is not what you want? If so, why?
> 
> More importantly, why this information is again missing in your
> changelog? This patch is definitely not trivial, it deserves a detailed
> changelog.
> 
> 
> > our own root, sure.  But there's nothing to stop doing the same via another
> > tcf_proto...
> 
> To my best knowledge, the place where you set ->is_root=true
> is precisely same with where we set tp->root=root_ht, and it doesn't
> change after set. What am I missing here?

The fact that there can be two (or more) different tcf_proto instances sharing
->data, but not ->root.  And since ->data is shared, u32_get() on one tp
will be able to return you ->root of *ANOTHER* one.  So comparison with
tp->root doesn't protect you.  Try this on mainline:

tc qdisc add dev eth0 ingress
tc filter add dev eth0 parent ffff: protocol ip prio 100 handle 1: u32 divisor 1
tc filter add dev eth0 parent ffff: protocol ip prio 200 handle 2: u32 divisor 1
tc filter delete dev eth0 parent ffff: protocol ip prio 100 handle 801: u32

and watch the fun as soon as you get an incoming packet on eth0.  That panic
is fixed by 1/7, but you get "Not allowed to delete root node" for removing
_your_ root, with "Can not delete in-use filter" for other's root (as in the
last line of the reproducer).

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 2/7] mark root hnode explicitly
  2018-09-07  3:49                 ` Al Viro
@ 2018-09-07  4:14                   ` Cong Wang
  0 siblings, 0 replies; 30+ messages in thread
From: Cong Wang @ 2018-09-07  4:14 UTC (permalink / raw)
  To: Al Viro; +Cc: Jamal Hadi Salim, Linux Kernel Network Developers, Jiri Pirko

On Thu, Sep 6, 2018 at 8:50 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> On Thu, Sep 06, 2018 at 08:23:36PM -0700, Cong Wang wrote:
>
> > Pretty sure there is a 'tp' in u32_set_parms() parameter list.
> >
> > Are you saying it is not what you want? If so, why?
> >
> > More importantly, why this information is again missing in your
> > changelog? This patch is definitely not trivial, it deserves a detailed
> > changelog.
> >
> >
> > > our own root, sure.  But there's nothing to stop doing the same via another
> > > tcf_proto...
> >
> > To my best knowledge, the place where you set ->is_root=true
> > is precisely same with where we set tp->root=root_ht, and it doesn't
> > change after set. What am I missing here?
>
> The fact that there can be two (or more) different tcf_proto instances sharing
> ->data, but not ->root.  And since ->data is shared, u32_get() on one tp

They have to share tp->data, that is how we link hashtables together.


> will be able to return you ->root of *ANOTHER* one.  So comparison with
> tp->root doesn't protect you.  Try this on mainline:

Hmm, it is not u32_get(), it is u32_lookup_ht() which could get another
root ht... I see.


>
> tc qdisc add dev eth0 ingress
> tc filter add dev eth0 parent ffff: protocol ip prio 100 handle 1: u32 divisor 1
> tc filter add dev eth0 parent ffff: protocol ip prio 200 handle 2: u32 divisor 1
> tc filter delete dev eth0 parent ffff: protocol ip prio 100 handle 801: u32
>
> and watch the fun as soon as you get an incoming packet on eth0.  That panic
> is fixed by 1/7, but you get "Not allowed to delete root node" for removing
> _your_ root, with "Can not delete in-use filter" for other's root (as in the
> last line of the reproducer).

Sure, please consider:

1. adding such a test case to tools/testing/selftests/tc-testing/
2. adding it in your changelog

This would save a lot of time for both of us. I don't need to ask you for
this if it is in the changelog, you don't have to explain it again.

This is a win-win.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 6/7] get rid of tc_u_common ->rcu
  2018-09-05 19:04   ` [PATCH 6/7] get rid of tc_u_common ->rcu Al Viro
  2018-09-06 10:36     ` Jamal Hadi Salim
@ 2018-09-07  4:18     ` Cong Wang
  2018-09-07  4:28       ` Al Viro
  1 sibling, 1 reply; 30+ messages in thread
From: Cong Wang @ 2018-09-07  4:18 UTC (permalink / raw)
  To: Al Viro; +Cc: Linux Kernel Network Developers, Jamal Hadi Salim, Jiri Pirko

On Wed, Sep 5, 2018 at 12:04 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> From: Al Viro <viro@zeniv.linux.org.uk>
>
> unused
>
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> ---
>  net/sched/cls_u32.c | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c
> index 8a1a573487bd..be9240ae1417 100644
> --- a/net/sched/cls_u32.c
> +++ b/net/sched/cls_u32.c
> @@ -98,7 +98,6 @@ struct tc_u_common {
>         int                     refcnt;
>         struct idr              handle_idr;
>         struct hlist_node       hnode;
> -       struct rcu_head         rcu;
>  };

Just FYI:

This was added when RCU was introduced to u32 fast path,
it looks like on fast path we never touch tc_u_common, we
only use tp->root, all the rest are slow paths with RTNL lock,
so it is probably fine to just remove it rather than converting
that kfree() to kfree_rcu().

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 6/7] get rid of tc_u_common ->rcu
  2018-09-07  4:18     ` Cong Wang
@ 2018-09-07  4:28       ` Al Viro
  0 siblings, 0 replies; 30+ messages in thread
From: Al Viro @ 2018-09-07  4:28 UTC (permalink / raw)
  To: Cong Wang; +Cc: Linux Kernel Network Developers, Jamal Hadi Salim, Jiri Pirko

On Thu, Sep 06, 2018 at 09:18:47PM -0700, Cong Wang wrote:
> On Wed, Sep 5, 2018 at 12:04 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > From: Al Viro <viro@zeniv.linux.org.uk>
> >
> > unused
> >
> > Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> > ---
> >  net/sched/cls_u32.c | 1 -
> >  1 file changed, 1 deletion(-)
> >
> > diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c
> > index 8a1a573487bd..be9240ae1417 100644
> > --- a/net/sched/cls_u32.c
> > +++ b/net/sched/cls_u32.c
> > @@ -98,7 +98,6 @@ struct tc_u_common {
> >         int                     refcnt;
> >         struct idr              handle_idr;
> >         struct hlist_node       hnode;
> > -       struct rcu_head         rcu;
> >  };
> 
> Just FYI:
> 
> This was added when RCU was introduced to u32 fast path,
> it looks like on fast path we never touch tc_u_common, we
> only use tp->root, all the rest are slow paths with RTNL lock,
> so it is probably fine to just remove it rather than converting
> that kfree() to kfree_rcu().

*nod*

In any case, if u32_classify() grows that dereference, we can always
re-add ->rcu at the same time.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/7] fix hnode refcounting
  2018-09-07  2:35     ` Al Viro
@ 2018-09-07 12:13       ` Jamal Hadi Salim
  2018-09-07 12:33         ` Jamal Hadi Salim
  2018-09-08 15:03         ` Al Viro
  0 siblings, 2 replies; 30+ messages in thread
From: Jamal Hadi Salim @ 2018-09-07 12:13 UTC (permalink / raw)
  To: Al Viro; +Cc: netdev, Cong Wang, Jiri Pirko, stable

On 2018-09-06 10:35 p.m., Al Viro wrote:
> On Thu, Sep 06, 2018 at 06:21:09AM -0400, Jamal Hadi Salim wrote:
[..]
> 
> Argh...  Unfortunately, there's this: in u32_delete() we have
>          if (root_ht) {
>                  if (root_ht->refcnt > 1) {
>                          *last = false;
>                          goto ret;
>                  }
>                  if (root_ht->refcnt == 1) {
>                          if (!ht_empty(root_ht)) {
>                                  *last = false;
>                                  goto ret;
>                          }
>                  }
>          }
> and that would need to be updated.  

It is not detrimental as you have it right now but
you are right an adjustment is needed...

Deleting of a root directly should not be allowed. But
you can flush a whole tp. Consider this:
--
sudo tc qdisc add dev $P ingress
sudo tc filter add dev $P parent ffff: protocol ip prio 10 \
u32 match ip protocol 1 0xff

Which creates root ht 800

You shouldnt be allowed to do this:
--
tc filter delete dev $P parent ffff: protocol ip prio 10 handle 800: u32
---

But you can delete the tp entirely as such:
---
tc filter delete dev $P parent ffff: protocol ip prio 10 u32
--

The later will go via the destroy() path and flush all filters.

You should also be able to delete individual filters. ex:
$tc filter del dev $P parent ffff: prio 10 handle 800:0:800 u32

Where that code you are referring to is important is when
the last filter deleted - we need the caller to know
and it destroys root.

i.e you should return last=true when the last filter is
deleted so root gets auto deleted (just like it was autocreated)

> However, that logics is bloody odd
> to start with.  First of all, root_ht has come from
>         struct tc_u_hnode *root_ht = rtnl_dereference(tp->root);
> and the only place where it's ever modified is
>          rcu_assign_pointer(tp->root, root_ht);
> in u32_init(), where we'd bloody well checked that root_ht is non-NULL
> (see
>          if (root_ht == NULL)
>                  return -ENOBUFS;
> upstream of that place) and where that assignment is inevitable on the
> way to returning 0.  No matter what, if tp has passed u32_init() it
> will have non-NULL ->root, forever.  And there is no way for tcf_proto
> to be seen outside of tcf_proto_create() without ->init() having returned
> 0 - it gets freed before anyone sees it.
> 

Yes, the check for root_ht is not necessary - but the check for the
last filter (and testing for last) is needed.

> So this 'if (root_ht)' can't be false.  What's more, what the hell is the
> whole thing checking?  We are in u32_delete().  It's called (as ->delete())
> from tfilter_del_notify(), which is called from tc_del_tfilter().  If we
> return 0 with *last true, we follow up calling tcf_proto_destroy().
> OK, let's look at the logics in there:
> 	* if there are links to root hnode => false
> 	* if there's no links to root hnode and it has knodes => false
> (BTW, if we ever get there with root_ht->refcnt < 1, we are obviously screwed)
> 	* if there is a tcf_proto sharing tp->data => false (i.e. any filters
> with different prio - don't bother)
> 	* if tp is the only one with reference to tp->data and there are *any*
> knodes => false.
> 
> Any extra links can come only from knodes in a non-empty hnode.  And it's not
> a common case.  Shouldn't thIe whole thing be
> 	* shared tp->data => false
> 	* any non-empty hnode => false
> instead?  Perhaps even with the knode counter in tp->data, avoiding any loops
> in there, as well as the entire ht_empty()...
> 
> Now, in the very beginning of u32_delete() we have this:
>          struct tc_u_hnode *ht = arg;
> 	
>          if (ht == NULL)
>                  goto out;
> OK, but the call of ->delete() is
>          err = tp->ops->delete(tp, fh, last, extack);
> and arg == NULL seen in u32_delete() means fh == NULL in tfilter_del_notify().
> Which is called in
>          if (!fh) {
> 		...
> 	} else {
>                  bool last;
> 
>                  err = tfilter_del_notify(net, skb, n, tp, block,
>                                           q, parent, fh, false, &last,
>                                           extack);
> How can we ever get there with NULL fh?
>

Try:
tc filter delete dev $P parent ffff: protocol ip prio 10 u32
tcm handle is 0, so will hit that code path.

> The whole thing makes very little sense; looks like it used to live in
> u32_destroy() prior to commit 763dbf6328e41 ("net_sched: move the empty tp
> check from ->destroy() to ->delete()"), but looking at the rationale in
> that commit...  I don't see how it fixes anything - sure, now we remove
> tcf_proto from the list before calling ->destroy().  Without any RCU delays
> in between.  How could it possibly solve any issues with ->classify()
> called in parallel with ->destroy()?  cls_u32 (at least these days)
> does try to survive u32_destroy() in parallel with u32_classify();
> if any other classifiers do not, they are still broken and that commit
> has not done anything for them.
> 
> Anyway, adjusting 1/7 for that is trivial, but I would really like to
> understand what that code is doing...  Comments?
> 

Refer to above.

cheers,
jamal

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/7] fix hnode refcounting
  2018-09-07 12:13       ` Jamal Hadi Salim
@ 2018-09-07 12:33         ` Jamal Hadi Salim
  2018-09-08 15:03         ` Al Viro
  1 sibling, 0 replies; 30+ messages in thread
From: Jamal Hadi Salim @ 2018-09-07 12:33 UTC (permalink / raw)
  To: Al Viro; +Cc: netdev, Cong Wang, Jiri Pirko, stable

To clarify with an example i used to test
your patches:

#0 add ingress filter
$TC qdisc add dev $P ingress
#1 add filter
$TC filter add dev $P parent ffff: protocol ip prio 10 \
u32 match ip protocol 1 0xff
#2 display
$TC filter ls dev $P parent ffff:

#3 try to delete root
$TC filter delete dev $P parent ffff: protocol ip prio 10 \
handle 800: u32

#4 nothing changes..
$TC filter ls dev $P parent ffff:
#5 delete filter
$TC filter delete dev $P parent ffff: protocol ip prio 10 \
handle 800:0:800 u32
#6 filter gone but hash table still there..
$TC filter ls dev $P parent ffff:
#7 delete tp
$TC filter delete dev $P parent ffff: protocol ip prio 10 \
u32
#8 now it is gone..
$TC filter ls dev $P parent ffff:

your patches show #6 filter as still active.
We want it to look like #8

Hope this helps.

cheers,
jamal

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/7] fix hnode refcounting
  2018-09-07 12:13       ` Jamal Hadi Salim
  2018-09-07 12:33         ` Jamal Hadi Salim
@ 2018-09-08 15:03         ` Al Viro
  1 sibling, 0 replies; 30+ messages in thread
From: Al Viro @ 2018-09-08 15:03 UTC (permalink / raw)
  To: Jamal Hadi Salim; +Cc: netdev, Cong Wang, Jiri Pirko, stable

On Fri, Sep 07, 2018 at 08:13:56AM -0400, Jamal Hadi Salim wrote:
> > 	} else {
> >                  bool last;
> > 
> >                  err = tfilter_del_notify(net, skb, n, tp, block,
> >                                           q, parent, fh, false, &last,
> >                                           extack);
> > How can we ever get there with NULL fh?
> > 
> 
> Try:
> tc filter delete dev $P parent ffff: protocol ip prio 10 u32
> tcm handle is 0, so will hit that code path.

Huh?  It will hit tcf_proto_destroy() (and thus u32_destroy()), but where will
it hit u32_delete()?  Sure, we have fh == NULL there; what happens next is
                if (t->tcm_handle == 0) {
                        tcf_chain_tp_remove(chain, &chain_info, tp);    
                        tfilter_notify(net, skb, n, tp, block, q, parent, fh,
                                       RTM_DELTFILTER, false);
                        tcf_proto_destroy(tp, extack);
and that's it.  IDGI...  Direct experiment shows that on e.g.
tc qdisc add dev eth0 ingress
tc filter add dev eth0 parent ffff: protocol ip prio 10 u32 match ip protocol 1 0xff
tc filter delete dev eth0 parent ffff: protocol ip prio 10 u32
we get u32_destroy() called, with u32_destroy_hnode() called by it,
but no u32_delete() is called at all, let alone with ht == NULL...

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2018-09-08 19:49 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-05 19:01 [PATCHES] cls_u32 cleanups and fixes Al Viro
2018-09-05 19:04 ` [PATCH 1/7] fix hnode refcounting Al Viro
2018-09-05 19:04   ` [PATCH 2/7] mark root hnode explicitly Al Viro
2018-09-06 10:28     ` Jamal Hadi Salim
2018-09-06 10:34       ` Jamal Hadi Salim
2018-09-06 10:42         ` Jamal Hadi Salim
2018-09-06 10:59         ` Al Viro
2018-09-06 11:04           ` Jamal Hadi Salim
2018-09-07  2:57           ` Cong Wang
2018-09-07  3:04             ` Al Viro
2018-09-07  3:23               ` Cong Wang
2018-09-07  3:49                 ` Al Viro
2018-09-07  4:14                   ` Cong Wang
2018-09-05 19:04   ` [PATCH 3/7] make sure that divisor is a power of 2 Al Viro
2018-09-06 10:28     ` Jamal Hadi Salim
2018-09-05 19:04   ` [PATCH 4/7] get rid of unused argument of u32_destroy_key() Al Viro
2018-09-06 10:34     ` Jamal Hadi Salim
2018-09-05 19:04   ` [PATCH 5/7] get rid of tc_u_knode ->tp Al Viro
2018-09-06 10:35     ` Jamal Hadi Salim
2018-09-05 19:04   ` [PATCH 6/7] get rid of tc_u_common ->rcu Al Viro
2018-09-06 10:36     ` Jamal Hadi Salim
2018-09-07  4:18     ` Cong Wang
2018-09-07  4:28       ` Al Viro
2018-09-05 19:04   ` [PATCH 7/7] clean tc_u_common hashtable Al Viro
2018-09-06 10:36     ` Jamal Hadi Salim
2018-09-06 10:21   ` [PATCH 1/7] fix hnode refcounting Jamal Hadi Salim
2018-09-07  2:35     ` Al Viro
2018-09-07 12:13       ` Jamal Hadi Salim
2018-09-07 12:33         ` Jamal Hadi Salim
2018-09-08 15:03         ` Al Viro

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.