* [PATCH net-next v4 00/10] optimize openvswitch flow looking up
@ 2019-10-15 10:30 xiangxia.m.yue
  2019-10-15 10:30 ` [PATCH net-next v4 01/10] net: openvswitch: add flow-mask cache for performance xiangxia.m.yue
                   ` (11 more replies)
  0 siblings, 12 replies; 37+ messages in thread
From: xiangxia.m.yue @ 2019-10-15 10:30 UTC (permalink / raw)
  To: gvrose8192, pshelar; +Cc: netdev, dev, Tonghao Zhang

From: Tonghao Zhang <xiangxia.m.yue@gmail.com>

This patch series optimizes Open vSwitch flow lookup performance and
simplifies the code.

Patches 1, 2, 4: port Pravin B Shelar's patches to upstream Linux
with minor changes.

Patches 5, 6, 7: optimize flow lookup and simplify the flow hash.

Patches 8, 9: bug fixes.

The performance test was run on an Intel Xeon E5-2630 v4.
The test topology is shown below:

+-----------------------------------+
|   +---------------------------+   |
|   | eth0   ovs-switch    eth1 |   | Host0
|   +---------------------------+   |
+-----------------------------------+
      ^                       |
      |                       |
      |                       |
      |                       |
      |                       v
+-----+----+             +----+-----+
| netperf  | Host1       | netserver| Host2
+----------+             +----------+

We use netperf to send 64-byte packets, and insert 255+ flow masks:
$ ovs-dpctl add-flow ovs-switch "in_port(1),eth(dst=00:01:00:00:00:00/ff:ff:ff:ff:ff:01),eth_type(0x0800),ipv4(frag=no)" 2
...
$ ovs-dpctl add-flow ovs-switch "in_port(1),eth(dst=00:ff:00:00:00:00/ff:ff:ff:ff:ff:ff),eth_type(0x0800),ipv4(frag=no)" 2
$
$ netperf -t UDP_STREAM -H 2.2.2.200 -l 40 -- -m 18

* Without the patch series: throughput 8.28 Mbps
* With the patch series: throughput 46.05 Mbps

v3->v4:
Access ma->count with the READ_ONCE/WRITE_ONCE APIs. For more
information, see the comments in patch 5.

v2->v3:
Update the ma pointer when reallocating the mask_array in patch 5.

v1->v2:
Use kfree_rcu() instead of call_rcu().

Tonghao Zhang (10):
  net: openvswitch: add flow-mask cache for performance
  net: openvswitch: convert mask list in mask array
  net: openvswitch: shrink the mask array if necessary
  net: openvswitch: optimize flow mask cache hash collision
  net: openvswitch: optimize flow-mask looking up
  net: openvswitch: simplify the flow_hash
  net: openvswitch: add likely in flow_lookup
  net: openvswitch: fix possible memleak on destroy flow-table
  net: openvswitch: don't unlock mutex when changing the user_features
    fails
  net: openvswitch: simplify the ovs_dp_cmd_new

 net/openvswitch/datapath.c   |  65 +++++----
 net/openvswitch/flow.h       |   1 -
 net/openvswitch/flow_table.c | 316 +++++++++++++++++++++++++++++++++++++------
 net/openvswitch/flow_table.h |  19 ++-
 4 files changed, 329 insertions(+), 72 deletions(-)

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH net-next v4 01/10] net: openvswitch: add flow-mask cache for performance
  2019-10-15 10:30 [PATCH net-next v4 00/10] optimize openvswitch flow looking up xiangxia.m.yue
@ 2019-10-15 10:30 ` xiangxia.m.yue
  2019-10-18 23:29   ` [ovs-dev] " William Tu
  2019-10-15 10:30 ` [PATCH net-next v4 02/10] net: openvswitch: convert mask list in mask array xiangxia.m.yue
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 37+ messages in thread
From: xiangxia.m.yue @ 2019-10-15 10:30 UTC (permalink / raw)
  To: gvrose8192, pshelar; +Cc: netdev, dev, Tonghao Zhang

From: Tonghao Zhang <xiangxia.m.yue@gmail.com>

The idea for this optimization comes from a patch committed to the
Open vSwitch userspace repository in 2014, authored by Pravin B
Shelar. To get high performance, I reimplement it here; later
patches build on it.

Pravin B Shelar says:
| On every packet OVS needs to lookup flow-table with every
| mask until it finds a match. The packet flow-key is first
| masked with mask in the list and then the masked key is
| looked up in flow-table. Therefore number of masks can
| affect packet processing performance.
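
For illustration, a minimal sketch of the segmented per-CPU cache
probe this patch introduces (the helper name mc_probe() is
hypothetical; the real logic is open-coded in
ovs_flow_tbl_lookup_stats() below, which also tracks a replacement
candidate):

	#include <linux/types.h>

	#define MC_HASH_SHIFT	8
	#define MC_HASH_ENTRIES	(1u << MC_HASH_SHIFT)		/* 256 slots */
	#define MC_HASH_SEGS	((sizeof(u32) * 8) / MC_HASH_SHIFT)	/* 4 probes */

	struct mask_cache_entry {
		u32 skb_hash;
		u32 mask_index;
	};

	/* Each 8-bit segment of the 32-bit skb_hash indexes the
	 * 256-entry per-CPU array, so a colliding flow can fall back
	 * to the next segment instead of evicting the first slot.
	 */
	static struct mask_cache_entry *
	mc_probe(struct mask_cache_entry *entries, u32 skb_hash)
	{
		u32 hash = skb_hash;
		int seg;

		for (seg = 0; seg < MC_HASH_SEGS; seg++) {
			struct mask_cache_entry *ce;

			ce = &entries[hash & (MC_HASH_ENTRIES - 1)];
			if (ce->skb_hash == skb_hash)
				return ce;	/* probable mask cached */
			hash >>= MC_HASH_SHIFT;
		}

		return NULL;	/* miss: fall back to the full mask walk */
	}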

Link: https://github.com/openvswitch/ovs/commit/5604935e4e1cbc16611d2d97f50b717aa31e8ec5
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
---
 net/openvswitch/datapath.c   |   3 +-
 net/openvswitch/flow_table.c | 109 +++++++++++++++++++++++++++++++++++++------
 net/openvswitch/flow_table.h |  11 ++++-
 3 files changed, 107 insertions(+), 16 deletions(-)

diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
index f30e406..9fea7e1 100644
--- a/net/openvswitch/datapath.c
+++ b/net/openvswitch/datapath.c
@@ -227,7 +227,8 @@ void ovs_dp_process_packet(struct sk_buff *skb, struct sw_flow_key *key)
 	stats = this_cpu_ptr(dp->stats_percpu);
 
 	/* Look up flow. */
-	flow = ovs_flow_tbl_lookup_stats(&dp->table, key, &n_mask_hit);
+	flow = ovs_flow_tbl_lookup_stats(&dp->table, key, skb_get_hash(skb),
+					 &n_mask_hit);
 	if (unlikely(!flow)) {
 		struct dp_upcall_info upcall;
 
diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
index cf3582c..3d515c0 100644
--- a/net/openvswitch/flow_table.c
+++ b/net/openvswitch/flow_table.c
@@ -36,6 +36,10 @@
 #define TBL_MIN_BUCKETS		1024
 #define REHASH_INTERVAL		(10 * 60 * HZ)
 
+#define MC_HASH_SHIFT		8
+#define MC_HASH_ENTRIES		(1u << MC_HASH_SHIFT)
+#define MC_HASH_SEGS		((sizeof(uint32_t) * 8) / MC_HASH_SHIFT)
+
 static struct kmem_cache *flow_cache;
 struct kmem_cache *flow_stats_cache __read_mostly;
 
@@ -168,10 +172,15 @@ int ovs_flow_tbl_init(struct flow_table *table)
 {
 	struct table_instance *ti, *ufid_ti;
 
-	ti = table_instance_alloc(TBL_MIN_BUCKETS);
+	table->mask_cache = __alloc_percpu(sizeof(struct mask_cache_entry) *
+					   MC_HASH_ENTRIES,
+					   __alignof__(struct mask_cache_entry));
+	if (!table->mask_cache)
+		return -ENOMEM;
 
+	ti = table_instance_alloc(TBL_MIN_BUCKETS);
 	if (!ti)
-		return -ENOMEM;
+		goto free_mask_cache;
 
 	ufid_ti = table_instance_alloc(TBL_MIN_BUCKETS);
 	if (!ufid_ti)
@@ -187,6 +196,8 @@ int ovs_flow_tbl_init(struct flow_table *table)
 
 free_ti:
 	__table_instance_destroy(ti);
+free_mask_cache:
+	free_percpu(table->mask_cache);
 	return -ENOMEM;
 }
 
@@ -243,6 +254,7 @@ void ovs_flow_tbl_destroy(struct flow_table *table)
 	struct table_instance *ti = rcu_dereference_raw(table->ti);
 	struct table_instance *ufid_ti = rcu_dereference_raw(table->ufid_ti);
 
+	free_percpu(table->mask_cache);
 	table_instance_destroy(ti, ufid_ti, false);
 }
 
@@ -425,7 +437,8 @@ static bool ovs_flow_cmp_unmasked_key(const struct sw_flow *flow,
 
 static struct sw_flow *masked_flow_lookup(struct table_instance *ti,
 					  const struct sw_flow_key *unmasked,
-					  const struct sw_flow_mask *mask)
+					  const struct sw_flow_mask *mask,
+					  u32 *n_mask_hit)
 {
 	struct sw_flow *flow;
 	struct hlist_head *head;
@@ -435,6 +448,8 @@ static struct sw_flow *masked_flow_lookup(struct table_instance *ti,
 	ovs_flow_mask_key(&masked_key, unmasked, false, mask);
 	hash = flow_hash(&masked_key, &mask->range);
 	head = find_bucket(ti, hash);
+	(*n_mask_hit)++;
+
 	hlist_for_each_entry_rcu(flow, head, flow_table.node[ti->node_ver]) {
 		if (flow->mask == mask && flow->flow_table.hash == hash &&
 		    flow_cmp_masked_key(flow, &masked_key, &mask->range))
@@ -443,30 +458,97 @@ static struct sw_flow *masked_flow_lookup(struct table_instance *ti,
 	return NULL;
 }
 
-struct sw_flow *ovs_flow_tbl_lookup_stats(struct flow_table *tbl,
-				    const struct sw_flow_key *key,
-				    u32 *n_mask_hit)
+static struct sw_flow *flow_lookup(struct flow_table *tbl,
+				   struct table_instance *ti,
+				   const struct sw_flow_key *key,
+				   u32 *n_mask_hit)
 {
-	struct table_instance *ti = rcu_dereference_ovsl(tbl->ti);
 	struct sw_flow_mask *mask;
 	struct sw_flow *flow;
 
-	*n_mask_hit = 0;
 	list_for_each_entry_rcu(mask, &tbl->mask_list, list) {
-		(*n_mask_hit)++;
-		flow = masked_flow_lookup(ti, key, mask);
+		flow = masked_flow_lookup(ti, key, mask, n_mask_hit);
 		if (flow)  /* Found */
 			return flow;
 	}
 	return NULL;
 }
 
+/*
+ * mask_cache maps a flow to a probable mask. This cache is not
+ * tightly coupled to the mask list, so updates to the mask list
+ * can leave inconsistent entries in the mask cache.
+ * This is a per-CPU cache, divided into MC_HASH_SEGS segments.
+ * In case of a hash collision, the entry is hashed into the next segment.
+ */
+struct sw_flow *ovs_flow_tbl_lookup_stats(struct flow_table *tbl,
+					  const struct sw_flow_key *key,
+					  u32 skb_hash,
+					  u32 *n_mask_hit)
+{
+	struct table_instance *ti = rcu_dereference_ovsl(tbl->ti);
+	struct mask_cache_entry  *entries, *ce, *del;
+	struct sw_flow *flow;
+	u32 hash = skb_hash;
+	int seg;
+
+	*n_mask_hit = 0;
+	if (unlikely(!skb_hash))
+		return flow_lookup(tbl, ti, key, n_mask_hit);
+
+	del = NULL;
+	entries = this_cpu_ptr(tbl->mask_cache);
+
+	for (seg = 0; seg < MC_HASH_SEGS; seg++) {
+		int index;
+
+		index = hash & (MC_HASH_ENTRIES - 1);
+		ce = &entries[index];
+
+		if (ce->skb_hash == skb_hash) {
+			struct sw_flow_mask *mask;
+			int i;
+
+			i = 0;
+			list_for_each_entry_rcu(mask, &tbl->mask_list, list) {
+				if (ce->mask_index == i++) {
+					flow = masked_flow_lookup(ti, key, mask,
+								  n_mask_hit);
+					if (flow)  /* Found */
+						return flow;
+
+					break;
+				}
+			}
+
+			del = ce;
+			break;
+		}
+
+		if (!del || (del->skb_hash && !ce->skb_hash)) {
+			del = ce;
+		}
+
+		hash >>= MC_HASH_SHIFT;
+	}
+
+	flow = flow_lookup(tbl, ti, key, n_mask_hit);
+
+	if (flow) {
+		del->skb_hash = skb_hash;
+		del->mask_index = (*n_mask_hit - 1);
+	}
+
+	return flow;
+}
+
 struct sw_flow *ovs_flow_tbl_lookup(struct flow_table *tbl,
 				    const struct sw_flow_key *key)
 {
+	struct table_instance *ti = rcu_dereference_ovsl(tbl->ti);
 	u32 __always_unused n_mask_hit;
 
-	return ovs_flow_tbl_lookup_stats(tbl, key, &n_mask_hit);
+	return flow_lookup(tbl, ti, key, &n_mask_hit);
 }
 
 struct sw_flow *ovs_flow_tbl_lookup_exact(struct flow_table *tbl,
@@ -475,10 +557,11 @@ struct sw_flow *ovs_flow_tbl_lookup_exact(struct flow_table *tbl,
 	struct table_instance *ti = rcu_dereference_ovsl(tbl->ti);
 	struct sw_flow_mask *mask;
 	struct sw_flow *flow;
+	u32 __always_unused n_mask_hit;
 
 	/* Always called under ovs-mutex. */
 	list_for_each_entry(mask, &tbl->mask_list, list) {
-		flow = masked_flow_lookup(ti, match->key, mask);
+		flow = masked_flow_lookup(ti, match->key, mask, &n_mask_hit);
 		if (flow && ovs_identifier_is_key(&flow->id) &&
 		    ovs_flow_cmp_unmasked_key(flow, match))
 			return flow;
@@ -631,7 +714,7 @@ static int flow_mask_insert(struct flow_table *tbl, struct sw_flow *flow,
 			return -ENOMEM;
 		mask->key = new->key;
 		mask->range = new->range;
-		list_add_rcu(&mask->list, &tbl->mask_list);
+		list_add_tail_rcu(&mask->list, &tbl->mask_list);
 	} else {
 		BUG_ON(!mask->ref_count);
 		mask->ref_count++;
diff --git a/net/openvswitch/flow_table.h b/net/openvswitch/flow_table.h
index bc52045..04b6b1c 100644
--- a/net/openvswitch/flow_table.h
+++ b/net/openvswitch/flow_table.h
@@ -22,6 +22,11 @@
 
 #include "flow.h"
 
+struct mask_cache_entry {
+	u32 skb_hash;
+	u32 mask_index;
+};
+
 struct table_instance {
 	struct hlist_head *buckets;
 	unsigned int n_buckets;
@@ -34,6 +39,7 @@ struct table_instance {
 struct flow_table {
 	struct table_instance __rcu *ti;
 	struct table_instance __rcu *ufid_ti;
+	struct mask_cache_entry __percpu *mask_cache;
 	struct list_head mask_list;
 	unsigned long last_rehash;
 	unsigned int count;
@@ -60,8 +66,9 @@ int ovs_flow_tbl_insert(struct flow_table *table, struct sw_flow *flow,
 struct sw_flow *ovs_flow_tbl_dump_next(struct table_instance *table,
 				       u32 *bucket, u32 *idx);
 struct sw_flow *ovs_flow_tbl_lookup_stats(struct flow_table *,
-				    const struct sw_flow_key *,
-				    u32 *n_mask_hit);
+					  const struct sw_flow_key *,
+					  u32 skb_hash,
+					  u32 *n_mask_hit);
 struct sw_flow *ovs_flow_tbl_lookup(struct flow_table *,
 				    const struct sw_flow_key *);
 struct sw_flow *ovs_flow_tbl_lookup_exact(struct flow_table *tbl,
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH net-next v4 02/10] net: openvswitch: convert mask list in mask array
  2019-10-15 10:30 [PATCH net-next v4 00/10] optimize openvswitch flow looking up xiangxia.m.yue
  2019-10-15 10:30 ` [PATCH net-next v4 01/10] net: openvswitch: add flow-mask cache for performance xiangxia.m.yue
@ 2019-10-15 10:30 ` xiangxia.m.yue
  2019-10-18 23:30   ` [ovs-dev] " William Tu
  2019-10-15 10:30 ` [PATCH net-next v4 03/10] net: openvswitch: shrink the mask array if necessary xiangxia.m.yue
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 37+ messages in thread
From: xiangxia.m.yue @ 2019-10-15 10:30 UTC (permalink / raw)
  To: gvrose8192, pshelar; +Cc: netdev, dev, Tonghao Zhang

From: Tonghao Zhang <xiangxia.m.yue@gmail.com>

Port the code to upstream Linux with minor changes.

Pravin B Shelar says:
| mask caches index of mask in mask_list. On packet recv OVS
| need to traverse mask-list to get cached mask. Therefore array
| is better for retrieving cached mask. This also allows better
| cache replacement algorithm by directly checking mask's existence.
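
The gain: a cached mask index becomes a single array dereference
instead of an O(index) list walk. A minimal sketch (the helper name
mask_at() is hypothetical; the structures match the diff below):

	static struct sw_flow_mask *mask_at(struct mask_array *ma, u32 index)
	{
		/* One bounds check plus one RCU dereference replaces
		 * walking the mask list to the cached position.
		 */
		if (index >= ma->max)
			return NULL;

		return rcu_dereference_ovsl(ma->masks[index]);
	}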

Link: https://github.com/openvswitch/ovs/commit/d49fc3ff53c65e4eca9cabd52ac63396746a7ef5
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
---
 net/openvswitch/flow.h       |   1 -
 net/openvswitch/flow_table.c | 210 ++++++++++++++++++++++++++++++++-----------
 net/openvswitch/flow_table.h |   8 +-
 3 files changed, 167 insertions(+), 52 deletions(-)

diff --git a/net/openvswitch/flow.h b/net/openvswitch/flow.h
index b830d5f..8080518 100644
--- a/net/openvswitch/flow.h
+++ b/net/openvswitch/flow.h
@@ -166,7 +166,6 @@ struct sw_flow_key_range {
 struct sw_flow_mask {
 	int ref_count;
 	struct rcu_head rcu;
-	struct list_head list;
 	struct sw_flow_key_range range;
 	struct sw_flow_key key;
 };
diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
index 3d515c0..0d1df53 100644
--- a/net/openvswitch/flow_table.c
+++ b/net/openvswitch/flow_table.c
@@ -34,6 +34,7 @@
 #include <net/ndisc.h>
 
 #define TBL_MIN_BUCKETS		1024
+#define MASK_ARRAY_SIZE_MIN	16
 #define REHASH_INTERVAL		(10 * 60 * HZ)
 
 #define MC_HASH_SHIFT		8
@@ -168,9 +169,51 @@ static struct table_instance *table_instance_alloc(int new_size)
 	return ti;
 }
 
+static struct mask_array *tbl_mask_array_alloc(int size)
+{
+	struct mask_array *new;
+
+	size = max(MASK_ARRAY_SIZE_MIN, size);
+	new = kzalloc(sizeof(struct mask_array) +
+		      sizeof(struct sw_flow_mask *) * size, GFP_KERNEL);
+	if (!new)
+		return NULL;
+
+	new->count = 0;
+	new->max = size;
+
+	return new;
+}
+
+static int tbl_mask_array_realloc(struct flow_table *tbl, int size)
+{
+	struct mask_array *old;
+	struct mask_array *new;
+
+	new = tbl_mask_array_alloc(size);
+	if (!new)
+		return -ENOMEM;
+
+	old = ovsl_dereference(tbl->mask_array);
+	if (old) {
+		int i;
+
+		for (i = 0; i < old->max; i++) {
+			if (ovsl_dereference(old->masks[i]))
+				new->masks[new->count++] = old->masks[i];
+		}
+	}
+
+	rcu_assign_pointer(tbl->mask_array, new);
+	kfree_rcu(old, rcu);
+
+	return 0;
+}
+
 int ovs_flow_tbl_init(struct flow_table *table)
 {
 	struct table_instance *ti, *ufid_ti;
+	struct mask_array *ma;
 
 	table->mask_cache = __alloc_percpu(sizeof(struct mask_cache_entry) *
 					   MC_HASH_ENTRIES,
@@ -178,9 +221,13 @@ int ovs_flow_tbl_init(struct flow_table *table)
 	if (!table->mask_cache)
 		return -ENOMEM;
 
+	ma = tbl_mask_array_alloc(MASK_ARRAY_SIZE_MIN);
+	if (!ma)
+		goto free_mask_cache;
+
 	ti = table_instance_alloc(TBL_MIN_BUCKETS);
 	if (!ti)
-		goto free_mask_cache;
+		goto free_mask_array;
 
 	ufid_ti = table_instance_alloc(TBL_MIN_BUCKETS);
 	if (!ufid_ti)
@@ -188,7 +235,7 @@ int ovs_flow_tbl_init(struct flow_table *table)
 
 	rcu_assign_pointer(table->ti, ti);
 	rcu_assign_pointer(table->ufid_ti, ufid_ti);
-	INIT_LIST_HEAD(&table->mask_list);
+	rcu_assign_pointer(table->mask_array, ma);
 	table->last_rehash = jiffies;
 	table->count = 0;
 	table->ufid_count = 0;
@@ -196,6 +243,8 @@ int ovs_flow_tbl_init(struct flow_table *table)
 
 free_ti:
 	__table_instance_destroy(ti);
+free_mask_array:
+	kfree(ma);
 free_mask_cache:
 	free_percpu(table->mask_cache);
 	return -ENOMEM;
@@ -255,6 +304,7 @@ void ovs_flow_tbl_destroy(struct flow_table *table)
 	struct table_instance *ufid_ti = rcu_dereference_raw(table->ufid_ti);
 
 	free_percpu(table->mask_cache);
+	kfree_rcu(rcu_dereference_raw(table->mask_array), rcu);
 	table_instance_destroy(ti, ufid_ti, false);
 }
 
@@ -460,17 +510,27 @@ static struct sw_flow *masked_flow_lookup(struct table_instance *ti,
 
 static struct sw_flow *flow_lookup(struct flow_table *tbl,
 				   struct table_instance *ti,
+				   struct mask_array *ma,
 				   const struct sw_flow_key *key,
-				   u32 *n_mask_hit)
+				   u32 *n_mask_hit,
+				   u32 *index)
 {
-	struct sw_flow_mask *mask;
 	struct sw_flow *flow;
+	int i;
 
-	list_for_each_entry_rcu(mask, &tbl->mask_list, list) {
-		flow = masked_flow_lookup(ti, key, mask, n_mask_hit);
-		if (flow)  /* Found */
-			return flow;
+	for (i = 0; i < ma->max; i++)  {
+		struct sw_flow_mask *mask;
+
+		mask = rcu_dereference_ovsl(ma->masks[i]);
+		if (mask) {
+			flow = masked_flow_lookup(ti, key, mask, n_mask_hit);
+			if (flow) { /* Found */
+				*index = i;
+				return flow;
+			}
+		}
 	}
+
 	return NULL;
 }
 
@@ -486,6 +546,7 @@ struct sw_flow *ovs_flow_tbl_lookup_stats(struct flow_table *tbl,
 					  u32 skb_hash,
 					  u32 *n_mask_hit)
 {
+	struct mask_array *ma = rcu_dereference_ovsl(tbl->mask_array);
 	struct table_instance *ti = rcu_dereference_ovsl(tbl->ti);
 	struct mask_cache_entry  *entries, *ce, *del;
 	struct sw_flow *flow;
@@ -493,8 +554,11 @@ struct sw_flow *ovs_flow_tbl_lookup_stats(struct flow_table *tbl,
 	int seg;
 
 	*n_mask_hit = 0;
-	if (unlikely(!skb_hash))
-		return flow_lookup(tbl, ti, key, n_mask_hit);
+	if (unlikely(!skb_hash)) {
+		u32 __always_unused mask_index;
+
+		return flow_lookup(tbl, ti, ma, key, n_mask_hit, &mask_index);
+	}
 
 	del = NULL;
 	entries = this_cpu_ptr(tbl->mask_cache);
@@ -507,37 +571,33 @@ struct sw_flow *ovs_flow_tbl_lookup_stats(struct flow_table *tbl,
 
 		if (ce->skb_hash == skb_hash) {
 			struct sw_flow_mask *mask;
-			int i;
-
-			i = 0;
-			list_for_each_entry_rcu(mask, &tbl->mask_list, list) {
-				if (ce->mask_index == i++) {
-					flow = masked_flow_lookup(ti, key, mask,
-								  n_mask_hit);
-					if (flow)  /* Found */
-						return flow;
-
-					break;
-				}
+			struct sw_flow *flow;
+
+			mask = rcu_dereference_ovsl(ma->masks[ce->mask_index]);
+			if (mask) {
+				flow = masked_flow_lookup(ti, key, mask,
+							  n_mask_hit);
+				if (flow)  /* Found */
+					return flow;
 			}
 
 			del = ce;
 			break;
 		}
 
-		if (!del || (del->skb_hash && !ce->skb_hash)) {
+		if (!del || (del->skb_hash && !ce->skb_hash) ||
+		    (rcu_dereference_ovsl(ma->masks[del->mask_index]) &&
+		     !rcu_dereference_ovsl(ma->masks[ce->mask_index]))) {
 			del = ce;
 		}
 
 		hash >>= MC_HASH_SHIFT;
 	}
 
-	flow = flow_lookup(tbl, ti, key, n_mask_hit);
+	flow = flow_lookup(tbl, ti, ma, key, n_mask_hit, &del->mask_index);
 
-	if (flow) {
+	if (flow)
 		del->skb_hash = skb_hash;
-		del->mask_index = (*n_mask_hit - 1);
-	}
 
 	return flow;
 }
@@ -546,26 +606,38 @@ struct sw_flow *ovs_flow_tbl_lookup(struct flow_table *tbl,
 				    const struct sw_flow_key *key)
 {
 	struct table_instance *ti = rcu_dereference_ovsl(tbl->ti);
+	struct mask_array *ma = rcu_dereference_ovsl(tbl->mask_array);
+
 	u32 __always_unused n_mask_hit;
+	u32 __always_unused index;
 
-	return flow_lookup(tbl, ti, key, &n_mask_hit);
+	return flow_lookup(tbl, ti, ma, key, &n_mask_hit, &index);
 }
 
 struct sw_flow *ovs_flow_tbl_lookup_exact(struct flow_table *tbl,
 					  const struct sw_flow_match *match)
 {
-	struct table_instance *ti = rcu_dereference_ovsl(tbl->ti);
-	struct sw_flow_mask *mask;
-	struct sw_flow *flow;
-	u32 __always_unused n_mask_hit;
+	struct mask_array *ma = ovsl_dereference(tbl->mask_array);
+	int i;
 
 	/* Always called under ovs-mutex. */
-	list_for_each_entry(mask, &tbl->mask_list, list) {
+	for (i = 0; i < ma->max; i++) {
+		struct table_instance *ti = rcu_dereference_ovsl(tbl->ti);
+		u32 __always_unused n_mask_hit;
+		struct sw_flow_mask *mask;
+		struct sw_flow *flow;
+
+		mask = ovsl_dereference(ma->masks[i]);
+		if (!mask)
+			continue;
+
 		flow = masked_flow_lookup(ti, match->key, mask, &n_mask_hit);
 		if (flow && ovs_identifier_is_key(&flow->id) &&
-		    ovs_flow_cmp_unmasked_key(flow, match))
+		    ovs_flow_cmp_unmasked_key(flow, match)) {
 			return flow;
+		}
 	}
+
 	return NULL;
 }
 
@@ -611,13 +683,8 @@ struct sw_flow *ovs_flow_tbl_lookup_ufid(struct flow_table *tbl,
 
 int ovs_flow_tbl_num_masks(const struct flow_table *table)
 {
-	struct sw_flow_mask *mask;
-	int num = 0;
-
-	list_for_each_entry(mask, &table->mask_list, list)
-		num++;
-
-	return num;
+	struct mask_array *ma = rcu_dereference_ovsl(table->mask_array);
+	return ma->count;
 }
 
 static struct table_instance *table_instance_expand(struct table_instance *ti,
@@ -638,8 +705,19 @@ static void flow_mask_remove(struct flow_table *tbl, struct sw_flow_mask *mask)
 		mask->ref_count--;
 
 		if (!mask->ref_count) {
-			list_del_rcu(&mask->list);
-			kfree_rcu(mask, rcu);
+			struct mask_array *ma;
+			int i;
+
+			ma = ovsl_dereference(tbl->mask_array);
+			for (i = 0; i < ma->max; i++) {
+				if (mask == ovsl_dereference(ma->masks[i])) {
+					RCU_INIT_POINTER(ma->masks[i], NULL);
+					ma->count--;
+					kfree_rcu(mask, rcu);
+					return;
+				}
+			}
+			BUG();
 		}
 	}
 }
@@ -689,13 +767,16 @@ static bool mask_equal(const struct sw_flow_mask *a,
 static struct sw_flow_mask *flow_mask_find(const struct flow_table *tbl,
 					   const struct sw_flow_mask *mask)
 {
-	struct list_head *ml;
+	struct mask_array *ma;
+	int i;
 
-	list_for_each(ml, &tbl->mask_list) {
-		struct sw_flow_mask *m;
-		m = container_of(ml, struct sw_flow_mask, list);
-		if (mask_equal(mask, m))
-			return m;
+	ma = ovsl_dereference(tbl->mask_array);
+	for (i = 0; i < ma->max; i++) {
+		struct sw_flow_mask *t;
+		t = ovsl_dereference(ma->masks[i]);
+
+		if (t && mask_equal(mask, t))
+			return t;
 	}
 
 	return NULL;
@@ -706,15 +787,44 @@ static int flow_mask_insert(struct flow_table *tbl, struct sw_flow *flow,
 			    const struct sw_flow_mask *new)
 {
 	struct sw_flow_mask *mask;
+
 	mask = flow_mask_find(tbl, new);
 	if (!mask) {
+		struct mask_array *ma;
+		int i;
+
 		/* Allocate a new mask if none exsits. */
 		mask = mask_alloc();
 		if (!mask)
 			return -ENOMEM;
 		mask->key = new->key;
 		mask->range = new->range;
-		list_add_tail_rcu(&mask->list, &tbl->mask_list);
+
+		/* Add mask to mask-list. */
+		ma = ovsl_dereference(tbl->mask_array);
+		if (ma->count >= ma->max) {
+			int err;
+
+			err = tbl_mask_array_realloc(tbl, ma->max +
+						     MASK_ARRAY_SIZE_MIN);
+			if (err) {
+				kfree(mask);
+				return err;
+			}
+
+			ma = ovsl_dereference(tbl->mask_array);
+		}
+
+		for (i = 0; i < ma->max; i++) {
+			const struct sw_flow_mask *t;
+
+			t = ovsl_dereference(ma->masks[i]);
+			if (!t) {
+				rcu_assign_pointer(ma->masks[i], mask);
+				ma->count++;
+				break;
+			}
+		}
 	} else {
 		BUG_ON(!mask->ref_count);
 		mask->ref_count++;
diff --git a/net/openvswitch/flow_table.h b/net/openvswitch/flow_table.h
index 04b6b1c..8a5cea6 100644
--- a/net/openvswitch/flow_table.h
+++ b/net/openvswitch/flow_table.h
@@ -27,6 +27,12 @@ struct mask_cache_entry {
 	u32 mask_index;
 };
 
+struct mask_array {
+	struct rcu_head rcu;
+	int count, max;
+	struct sw_flow_mask __rcu *masks[];
+};
+
 struct table_instance {
 	struct hlist_head *buckets;
 	unsigned int n_buckets;
@@ -40,7 +46,7 @@ struct flow_table {
 	struct table_instance __rcu *ti;
 	struct table_instance __rcu *ufid_ti;
 	struct mask_cache_entry __percpu *mask_cache;
-	struct list_head mask_list;
+	struct mask_array __rcu *mask_array;
 	unsigned long last_rehash;
 	unsigned int count;
 	unsigned int ufid_count;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH net-next v4 03/10] net: openvswitch: shrink the mask array if necessary
  2019-10-15 10:30 [PATCH net-next v4 00/10] optimize openvswitch flow looking up xiangxia.m.yue
  2019-10-15 10:30 ` [PATCH net-next v4 01/10] net: openvswitch: add flow-mask cache for performance xiangxia.m.yue
  2019-10-15 10:30 ` [PATCH net-next v4 02/10] net: openvswitch: convert mask list in mask array xiangxia.m.yue
@ 2019-10-15 10:30 ` xiangxia.m.yue
  2019-10-18 23:33   ` [ovs-dev] " William Tu
  2019-10-15 10:30 ` [PATCH net-next v4 04/10] net: openvswitch: optimize flow mask cache hash collision xiangxia.m.yue
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 37+ messages in thread
From: xiangxia.m.yue @ 2019-10-15 10:30 UTC (permalink / raw)
  To: gvrose8192, pshelar; +Cc: netdev, dev, Tonghao Zhang

From: Tonghao Zhang <xiangxia.m.yue@gmail.com>

When creating and inserting a flow mask, if there is no available
slot, we realloc the mask array. When removing a flow mask, we
shrink the mask array if necessary.
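
A minimal sketch of the shrink policy applied below (hypothetical
wrapper name; the checks are the ones in the diff):

	static void tbl_mask_array_maybe_shrink(struct flow_table *tbl)
	{
		struct mask_array *ma = ovsl_dereference(tbl->mask_array);

		/* Halve only when the result stays at or above
		 * MASK_ARRAY_SIZE_MIN and at most a third of the slots
		 * are used, leaving hysteresis against regrowth.
		 */
		if (ma->max >= (MASK_ARRAY_SIZE_MIN * 2) &&
		    ma->count <= (ma->max / 3))
			tbl_mask_array_realloc(tbl, ma->max / 2);
	}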

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
---
 net/openvswitch/flow_table.c | 33 +++++++++++++++++++++++----------
 1 file changed, 23 insertions(+), 10 deletions(-)

diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
index 0d1df53..237cf85 100644
--- a/net/openvswitch/flow_table.c
+++ b/net/openvswitch/flow_table.c
@@ -693,6 +693,23 @@ static struct table_instance *table_instance_expand(struct table_instance *ti,
 	return table_instance_rehash(ti, ti->n_buckets * 2, ufid);
 }
 
+static void tbl_mask_array_delete_mask(struct mask_array *ma,
+				       struct sw_flow_mask *mask)
+{
+	int i;
+
+	/* Remove the deleted mask pointers from the array */
+	for (i = 0; i < ma->max; i++) {
+		if (mask == ovsl_dereference(ma->masks[i])) {
+			RCU_INIT_POINTER(ma->masks[i], NULL);
+			ma->count--;
+			kfree_rcu(mask, rcu);
+			return;
+		}
+	}
+	BUG();
+}
+
 /* Remove 'mask' from the mask list, if it is not needed any more. */
 static void flow_mask_remove(struct flow_table *tbl, struct sw_flow_mask *mask)
 {
@@ -706,18 +723,14 @@ static void flow_mask_remove(struct flow_table *tbl, struct sw_flow_mask *mask)
 
 		if (!mask->ref_count) {
 			struct mask_array *ma;
-			int i;
 
 			ma = ovsl_dereference(tbl->mask_array);
-			for (i = 0; i < ma->max; i++) {
-				if (mask == ovsl_dereference(ma->masks[i])) {
-					RCU_INIT_POINTER(ma->masks[i], NULL);
-					ma->count--;
-					kfree_rcu(mask, rcu);
-					return;
-				}
-			}
-			BUG();
+			tbl_mask_array_delete_mask(ma, mask);
+
+			/* Shrink the mask array if necessary. */
+			if (ma->max >= (MASK_ARRAY_SIZE_MIN * 2) &&
+			    ma->count <= (ma->max / 3))
+				tbl_mask_array_realloc(tbl, ma->max / 2);
 		}
 	}
 }
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH net-next v4 04/10] net: openvswitch: optimize flow mask cache hash collision
  2019-10-15 10:30 [PATCH net-next v4 00/10] optimize openvswitch flow looking up xiangxia.m.yue
                   ` (2 preceding siblings ...)
  2019-10-15 10:30 ` [PATCH net-next v4 03/10] net: openvswitch: shrink the mask array if necessary xiangxia.m.yue
@ 2019-10-15 10:30 ` xiangxia.m.yue
  2019-10-15 10:30 ` [PATCH net-next v4 05/10] net: openvswitch: optimize flow-mask looking up xiangxia.m.yue
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 37+ messages in thread
From: xiangxia.m.yue @ 2019-10-15 10:30 UTC (permalink / raw)
  To: gvrose8192, pshelar; +Cc: netdev, dev, Tonghao Zhang

From: Tonghao Zhang <xiangxia.m.yue@gmail.com>

Port the code to upstream Linux with minor changes.

Pravin B Shelar says:
| In case hash collision on mask cache, OVS does extra flow
| lookup. Following patch avoid it.
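
One collision source is handled explicitly below: a packet keeps the
same skb_hash before and after recirculation, so the cache key is
mixed with the recirculation id first. A minimal sketch of that step
(hypothetical helper name; jhash_1word() is the existing kernel API):

	#include <linux/jhash.h>

	static u32 mc_cache_hash(u32 skb_hash, u32 recirc_id)
	{
		/* Pre- and post-recirculation lookups of the same packet
		 * then land in different cache slots instead of fighting
		 * over one entry.
		 */
		if (recirc_id)
			return jhash_1word(skb_hash, recirc_id);

		return skb_hash;
	}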

Link: https://github.com/openvswitch/ovs/commit/0e6efbe2712da03522532dc5e84806a96f6a0dd1
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
---
 net/openvswitch/flow_table.c | 95 ++++++++++++++++++++++++--------------------
 1 file changed, 53 insertions(+), 42 deletions(-)

diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
index 237cf85..8d4f50d 100644
--- a/net/openvswitch/flow_table.c
+++ b/net/openvswitch/flow_table.c
@@ -508,6 +508,9 @@ static struct sw_flow *masked_flow_lookup(struct table_instance *ti,
 	return NULL;
 }
 
+/* Flow lookup does a full lookup on the flow table. It starts with
+ * the mask at the index passed in via *index.
+ */
 static struct sw_flow *flow_lookup(struct flow_table *tbl,
 				   struct table_instance *ti,
 				   struct mask_array *ma,
@@ -516,18 +519,31 @@ static struct sw_flow *flow_lookup(struct flow_table *tbl,
 				   u32 *index)
 {
 	struct sw_flow *flow;
+	struct sw_flow_mask *mask;
 	int i;
 
-	for (i = 0; i < ma->max; i++)  {
-		struct sw_flow_mask *mask;
-
-		mask = rcu_dereference_ovsl(ma->masks[i]);
+	if (*index < ma->max) {
+		mask = rcu_dereference_ovsl(ma->masks[*index]);
 		if (mask) {
 			flow = masked_flow_lookup(ti, key, mask, n_mask_hit);
-			if (flow) { /* Found */
-				*index = i;
+			if (flow)
 				return flow;
-			}
+		}
+	}
+
+	for (i = 0; i < ma->max; i++)  {
+
+		if (i == *index)
+			continue;
+
+		mask = rcu_dereference_ovsl(ma->masks[i]);
+		if (!mask)
+			continue;
+
+		flow = masked_flow_lookup(ti, key, mask, n_mask_hit);
+		if (flow) { /* Found */
+			*index = i;
+			return flow;
 		}
 	}
 
@@ -546,58 +562,54 @@ struct sw_flow *ovs_flow_tbl_lookup_stats(struct flow_table *tbl,
 					  u32 skb_hash,
 					  u32 *n_mask_hit)
 {
-	struct mask_array *ma = rcu_dereference_ovsl(tbl->mask_array);
-	struct table_instance *ti = rcu_dereference_ovsl(tbl->ti);
-	struct mask_cache_entry  *entries, *ce, *del;
+	struct mask_array *ma = rcu_dereference(tbl->mask_array);
+	struct table_instance *ti = rcu_dereference(tbl->ti);
+	struct mask_cache_entry *entries, *ce;
 	struct sw_flow *flow;
-	u32 hash = skb_hash;
+	u32 hash;
 	int seg;
 
 	*n_mask_hit = 0;
 	if (unlikely(!skb_hash)) {
-		u32 __always_unused mask_index;
+		u32 mask_index = 0;
 
 		return flow_lookup(tbl, ti, ma, key, n_mask_hit, &mask_index);
 	}
 
-	del = NULL;
+	/* Pre and post recirculation flows usually have the same skb_hash
+	 * value. To avoid hash collisions, rehash the 'skb_hash' with
+	 * 'recirc_id'.  */
+	if (key->recirc_id)
+		skb_hash = jhash_1word(skb_hash, key->recirc_id);
+
+	ce = NULL;
+	hash = skb_hash;
 	entries = this_cpu_ptr(tbl->mask_cache);
 
+	/* Find the cache entry 'ce' to operate on. */
 	for (seg = 0; seg < MC_HASH_SEGS; seg++) {
-		int index;
-
-		index = hash & (MC_HASH_ENTRIES - 1);
-		ce = &entries[index];
-
-		if (ce->skb_hash == skb_hash) {
-			struct sw_flow_mask *mask;
-			struct sw_flow *flow;
-
-			mask = rcu_dereference_ovsl(ma->masks[ce->mask_index]);
-			if (mask) {
-				flow = masked_flow_lookup(ti, key, mask,
-							  n_mask_hit);
-				if (flow)  /* Found */
-					return flow;
-			}
-
-			del = ce;
-			break;
+		int index = hash & (MC_HASH_ENTRIES - 1);
+		struct mask_cache_entry *e;
+
+		e = &entries[index];
+		if (e->skb_hash == skb_hash) {
+			flow = flow_lookup(tbl, ti, ma, key, n_mask_hit,
+					   &e->mask_index);
+			if (!flow)
+				e->skb_hash = 0;
+			return flow;
 		}
 
-		if (!del || (del->skb_hash && !ce->skb_hash) ||
-		    (rcu_dereference_ovsl(ma->masks[del->mask_index]) &&
-		     !rcu_dereference_ovsl(ma->masks[ce->mask_index]))) {
-			del = ce;
-		}
+		if (!ce || e->skb_hash < ce->skb_hash)
+			ce = e;  /* A better replacement cache candidate. */
 
 		hash >>= MC_HASH_SHIFT;
 	}
 
-	flow = flow_lookup(tbl, ti, ma, key, n_mask_hit, &del->mask_index);
-
+	/* Cache miss, do full lookup. */
+	flow = flow_lookup(tbl, ti, ma, key, n_mask_hit, &ce->mask_index);
 	if (flow)
-		del->skb_hash = skb_hash;
+		ce->skb_hash = skb_hash;
 
 	return flow;
 }
@@ -607,9 +619,8 @@ struct sw_flow *ovs_flow_tbl_lookup(struct flow_table *tbl,
 {
 	struct table_instance *ti = rcu_dereference_ovsl(tbl->ti);
 	struct mask_array *ma = rcu_dereference_ovsl(tbl->mask_array);
-
 	u32 __always_unused n_mask_hit;
-	u32 __always_unused index;
+	u32 index = 0;
 
 	return flow_lookup(tbl, ti, ma, key, &n_mask_hit, &index);
 }
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH net-next v4 05/10] net: openvswitch: optimize flow-mask looking up
  2019-10-15 10:30 [PATCH net-next v4 00/10] optimize openvswitch flow looking up xiangxia.m.yue
                   ` (3 preceding siblings ...)
  2019-10-15 10:30 ` [PATCH net-next v4 04/10] net: openvswitch: optimize flow mask cache hash collision xiangxia.m.yue
@ 2019-10-15 10:30 ` xiangxia.m.yue
  2019-10-18 23:26   ` [ovs-dev] " William Tu
  2019-10-15 10:30 ` [PATCH net-next v4 06/10] net: openvswitch: simplify the flow_hash xiangxia.m.yue
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 37+ messages in thread
From: xiangxia.m.yue @ 2019-10-15 10:30 UTC (permalink / raw)
  To: gvrose8192, pshelar; +Cc: netdev, dev, Tonghao Zhang

From: Tonghao Zhang <xiangxia.m.yue@gmail.com>

A full lookup on the flow table traverses the whole mask array.
If the mask array is too large, the number of invalid flow masks
increases, and performance drops.

One bad case, for example: M means the flow mask is valid and NULL
means the flow mask has been deleted.

+-------------------------------------------+
| M | NULL | ...                  | NULL | M|
+-------------------------------------------+

In that case, without this patch, Open vSwitch traverses the whole
mask array, because there is one flow mask at the tail. This patch
changes the way flow masks are inserted and deleted, and the mask
array is kept as below: there is no NULL hole. In the fast path, we
can "break" out of the "for" loop (instead of "continue") in
flow_lookup() when we get a NULL flow mask.

         "break"
            v
+-------------------------------------------+
| M | M |  NULL |...           | NULL | NULL|
+-------------------------------------------+

This patch does not optimize the slow/control paths, which still
traverse using ma->max (a sketch of the compacting delete follows
this list). Slow path:
* tbl_mask_array_realloc
* ovs_flow_tbl_lookup_exact
* flow_mask_find
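
A minimal sketch of the compacting delete (hypothetical helper name;
the real code is tbl_mask_array_del_mask() below): the last valid
mask fills the freed slot, so valid entries stay contiguous and the
fast path can stop at the first NULL.

	static void mask_array_compact_del(struct mask_array *ma, int i)
	{
		int last = READ_ONCE(ma->count) - 1;

		/* Move the tail entry into the hole, then clear the
		 * tail. RCU readers may briefly see the moved mask in
		 * both slots, which is harmless; they never see a NULL
		 * hole followed by a valid mask.
		 */
		WRITE_ONCE(ma->count, last);
		rcu_assign_pointer(ma->masks[i], ma->masks[last]);
		RCU_INIT_POINTER(ma->masks[last], NULL);
	}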

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
---
 net/openvswitch/flow_table.c | 101 ++++++++++++++++++++++---------------------
 1 file changed, 52 insertions(+), 49 deletions(-)

diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
index 8d4f50d..a10d421 100644
--- a/net/openvswitch/flow_table.c
+++ b/net/openvswitch/flow_table.c
@@ -538,7 +538,7 @@ static struct sw_flow *flow_lookup(struct flow_table *tbl,
 
 		mask = rcu_dereference_ovsl(ma->masks[i]);
 		if (!mask)
-			continue;
+			break;
 
 		flow = masked_flow_lookup(ti, key, mask, n_mask_hit);
 		if (flow) { /* Found */
@@ -695,7 +695,7 @@ struct sw_flow *ovs_flow_tbl_lookup_ufid(struct flow_table *tbl,
 int ovs_flow_tbl_num_masks(const struct flow_table *table)
 {
 	struct mask_array *ma = rcu_dereference_ovsl(table->mask_array);
-	return ma->count;
+	return READ_ONCE(ma->count);
 }
 
 static struct table_instance *table_instance_expand(struct table_instance *ti,
@@ -704,21 +704,33 @@ static struct table_instance *table_instance_expand(struct table_instance *ti,
 	return table_instance_rehash(ti, ti->n_buckets * 2, ufid);
 }
 
-static void tbl_mask_array_delete_mask(struct mask_array *ma,
-				       struct sw_flow_mask *mask)
+static void tbl_mask_array_del_mask(struct flow_table *tbl,
+				    struct sw_flow_mask *mask)
 {
-	int i;
+	struct mask_array *ma = ovsl_dereference(tbl->mask_array);
+	int i, ma_count = READ_ONCE(ma->count);
 
 	/* Remove the deleted mask pointers from the array */
-	for (i = 0; i < ma->max; i++) {
-		if (mask == ovsl_dereference(ma->masks[i])) {
-			RCU_INIT_POINTER(ma->masks[i], NULL);
-			ma->count--;
-			kfree_rcu(mask, rcu);
-			return;
-		}
+	for (i = 0; i < ma_count; i++) {
+		if (mask == ovsl_dereference(ma->masks[i]))
+			goto found;
 	}
+
 	BUG();
+	return;
+
+found:
+	WRITE_ONCE(ma->count, ma_count - 1);
+
+	rcu_assign_pointer(ma->masks[i], ma->masks[ma_count - 1]);
+	RCU_INIT_POINTER(ma->masks[ma_count - 1], NULL);
+
+	kfree_rcu(mask, rcu);
+
+	/* Shrink the mask array if necessary. */
+	if (ma->max >= (MASK_ARRAY_SIZE_MIN * 2) &&
+	    ma_count <= (ma->max / 3))
+		tbl_mask_array_realloc(tbl, ma->max / 2);
 }
 
 /* Remove 'mask' from the mask list, if it is not needed any more. */
@@ -732,17 +744,8 @@ static void flow_mask_remove(struct flow_table *tbl, struct sw_flow_mask *mask)
 		BUG_ON(!mask->ref_count);
 		mask->ref_count--;
 
-		if (!mask->ref_count) {
-			struct mask_array *ma;
-
-			ma = ovsl_dereference(tbl->mask_array);
-			tbl_mask_array_delete_mask(ma, mask);
-
-			/* Shrink the mask array if necessary. */
-			if (ma->max >= (MASK_ARRAY_SIZE_MIN * 2) &&
-			    ma->count <= (ma->max / 3))
-				tbl_mask_array_realloc(tbl, ma->max / 2);
-		}
+		if (!mask->ref_count)
+			tbl_mask_array_del_mask(tbl, mask);
 	}
 }
 
@@ -806,6 +809,29 @@ static struct sw_flow_mask *flow_mask_find(const struct flow_table *tbl,
 	return NULL;
 }
 
+static int tbl_mask_array_add_mask(struct flow_table *tbl,
+				   struct sw_flow_mask *new)
+{
+	struct mask_array *ma = ovsl_dereference(tbl->mask_array);
+	int err, ma_count = READ_ONCE(ma->count);
+
+	if (ma_count >= ma->max) {
+		err = tbl_mask_array_realloc(tbl, ma->max +
+					      MASK_ARRAY_SIZE_MIN);
+		if (err)
+			return err;
+
+		ma = ovsl_dereference(tbl->mask_array);
+	}
+
+	BUG_ON(ovsl_dereference(ma->masks[ma_count]));
+
+	rcu_assign_pointer(ma->masks[ma_count], new);
+	WRITE_ONCE(ma->count, ma_count + 1);
+
+	return 0;
+}
+
 /* Add 'mask' into the mask list, if it is not already there. */
 static int flow_mask_insert(struct flow_table *tbl, struct sw_flow *flow,
 			    const struct sw_flow_mask *new)
@@ -814,9 +840,6 @@ static int flow_mask_insert(struct flow_table *tbl, struct sw_flow *flow,
 
 	mask = flow_mask_find(tbl, new);
 	if (!mask) {
-		struct mask_array *ma;
-		int i;
-
 		/* Allocate a new mask if none exists. */
 		mask = mask_alloc();
 		if (!mask)
@@ -825,29 +848,9 @@ static int flow_mask_insert(struct flow_table *tbl, struct sw_flow *flow,
 		mask->range = new->range;
 
 		/* Add mask to mask-list. */
-		ma = ovsl_dereference(tbl->mask_array);
-		if (ma->count >= ma->max) {
-			int err;
-
-			err = tbl_mask_array_realloc(tbl, ma->max +
-						     MASK_ARRAY_SIZE_MIN);
-			if (err) {
-				kfree(mask);
-				return err;
-			}
-
-			ma = ovsl_dereference(tbl->mask_array);
-		}
-
-		for (i = 0; i < ma->max; i++) {
-			const struct sw_flow_mask *t;
-
-			t = ovsl_dereference(ma->masks[i]);
-			if (!t) {
-				rcu_assign_pointer(ma->masks[i], mask);
-				ma->count++;
-				break;
-			}
+		if (tbl_mask_array_add_mask(tbl, mask)) {
+			kfree(mask);
+			return -ENOMEM;
 		}
 	} else {
 		BUG_ON(!mask->ref_count);
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH net-next v4 06/10] net: openvswitch: simplify the flow_hash
  2019-10-15 10:30 [PATCH net-next v4 00/10] optimize openvswitch flow looking up xiangxia.m.yue
                   ` (4 preceding siblings ...)
  2019-10-15 10:30 ` [PATCH net-next v4 05/10] net: openvswitch: optimize flow-mask looking up xiangxia.m.yue
@ 2019-10-15 10:30 ` xiangxia.m.yue
  2019-10-18 23:27   ` [ovs-dev] " William Tu
  2019-10-15 10:30 ` [PATCH net-next v4 07/10] net: openvswitch: add likely in flow_lookup xiangxia.m.yue
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 37+ messages in thread
From: xiangxia.m.yue @ 2019-10-15 10:30 UTC (permalink / raw)
  To: gvrose8192, pshelar; +Cc: netdev, dev, Tonghao Zhang

From: Tonghao Zhang <xiangxia.m.yue@gmail.com>

Simplify the code and remove the unnecessary BUILD_BUG_ON.
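
The folded arithmetic hashes exactly the same bytes. A standalone
sketch for illustration (sw_flow_key_range and range_n_bytes() are
redefined locally here; range_n_bytes() is assumed to be the existing
end - start helper):

	#include <linux/jhash.h>

	struct sw_flow_key_range {
		unsigned short start;
		unsigned short end;
	};

	#define range_n_bytes(range) ((range)->end - (range)->start)

	static u32 flow_hash_sketch(const void *key,
				    const struct sw_flow_key_range *range)
	{
		/* e.g. start = 0, end = 16 hashes the first four u32
		 * words of the masked key.
		 */
		const u32 *hash_key = (const u32 *)((const u8 *)key + range->start);
		int hash_u32s = range_n_bytes(range) >> 2;	/* bytes -> u32s */

		return jhash2(hash_key, hash_u32s, 0);
	}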

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
---
 net/openvswitch/flow_table.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
index a10d421..3e3d345 100644
--- a/net/openvswitch/flow_table.c
+++ b/net/openvswitch/flow_table.c
@@ -432,13 +432,9 @@ int ovs_flow_tbl_flush(struct flow_table *flow_table)
 static u32 flow_hash(const struct sw_flow_key *key,
 		     const struct sw_flow_key_range *range)
 {
-	int key_start = range->start;
-	int key_end = range->end;
-	const u32 *hash_key = (const u32 *)((const u8 *)key + key_start);
-	int hash_u32s = (key_end - key_start) >> 2;
-
+	const u32 *hash_key = (const u32 *)((const u8 *)key + range->start);
 	/* Make sure number of hash bytes are multiple of u32. */
-	BUILD_BUG_ON(sizeof(long) % sizeof(u32));
+	int hash_u32s = range_n_bytes(range) >> 2;
 
 	return jhash2(hash_key, hash_u32s, 0);
 }
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH net-next v4 07/10] net: openvswitch: add likely in flow_lookup
  2019-10-15 10:30 [PATCH net-next v4 00/10] optimize openvswitch flow looking up xiangxia.m.yue
                   ` (5 preceding siblings ...)
  2019-10-15 10:30 ` [PATCH net-next v4 06/10] net: openvswitch: simplify the flow_hash xiangxia.m.yue
@ 2019-10-15 10:30 ` xiangxia.m.yue
  2019-10-18 23:27   ` [ovs-dev] " William Tu
  2019-10-15 10:30 ` [PATCH net-next v4 08/10] net: openvswitch: fix possible memleak on destroy flow-table xiangxia.m.yue
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 37+ messages in thread
From: xiangxia.m.yue @ 2019-10-15 10:30 UTC (permalink / raw)
  To: gvrose8192, pshelar; +Cc: netdev, dev, Tonghao Zhang

From: Tonghao Zhang <xiangxia.m.yue@gmail.com>

In most cases *index < ma->max, and the flow mask is not NULL.
Add likely()/unlikely() annotations for performance.
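
likely()/unlikely() expand to __builtin_expect() and only influence
code layout and branch prediction; results are unchanged. A minimal
usage sketch (hypothetical function, mirroring the loop below):

	#include <linux/compiler.h>

	/* Returns the index of the first NULL slot; the compiler lays
	 * out the loop so the non-NULL (hot) case falls through.
	 */
	static int first_null_slot(void **slots, int max)
	{
		int i;

		for (i = 0; i < max; i++) {
			if (unlikely(!slots[i]))
				return i;	/* cold: end of valid entries */
		}

		return max;
	}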

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
---
 net/openvswitch/flow_table.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
index 3e3d345..5df5182 100644
--- a/net/openvswitch/flow_table.c
+++ b/net/openvswitch/flow_table.c
@@ -518,7 +518,7 @@ static struct sw_flow *flow_lookup(struct flow_table *tbl,
 	struct sw_flow_mask *mask;
 	int i;
 
-	if (*index < ma->max) {
+	if (likely(*index < ma->max)) {
 		mask = rcu_dereference_ovsl(ma->masks[*index]);
 		if (mask) {
 			flow = masked_flow_lookup(ti, key, mask, n_mask_hit);
@@ -533,7 +533,7 @@ static struct sw_flow *flow_lookup(struct flow_table *tbl,
 			continue;
 
 		mask = rcu_dereference_ovsl(ma->masks[i]);
-		if (!mask)
+		if (unlikely(!mask))
 			break;
 
 		flow = masked_flow_lookup(ti, key, mask, n_mask_hit);
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH net-next v4 08/10] net: openvswitch: fix possible memleak on destroy flow-table
  2019-10-15 10:30 [PATCH net-next v4 00/10] optimize openvswitch flow looking up xiangxia.m.yue
                   ` (6 preceding siblings ...)
  2019-10-15 10:30 ` [PATCH net-next v4 07/10] net: openvswitch: add likely in flow_lookup xiangxia.m.yue
@ 2019-10-15 10:30 ` xiangxia.m.yue
  2019-10-17 22:38   ` Pravin Shelar
  2019-10-15 10:30 ` [PATCH net-next v4 09/10] net: openvswitch: don't unlock mutex when changing the user_features fails xiangxia.m.yue
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 37+ messages in thread
From: xiangxia.m.yue @ 2019-10-15 10:30 UTC (permalink / raw)
  To: gvrose8192, pshelar; +Cc: netdev, dev, Tonghao Zhang

From: Tonghao Zhang <xiangxia.m.yue@gmail.com>

When we destroy the flow table it may still contain flow masks,
so release the flow mask structs as well.

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
---
 net/openvswitch/flow_table.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
index 5df5182..d5d768e 100644
--- a/net/openvswitch/flow_table.c
+++ b/net/openvswitch/flow_table.c
@@ -295,6 +295,18 @@ static void table_instance_destroy(struct table_instance *ti,
 	}
 }
 
+static void tbl_mask_array_destroy(struct flow_table *tbl)
+{
+	struct mask_array *ma = ovsl_dereference(tbl->mask_array);
+	int i;
+
+	/* Free the flow masks; passing NULL to kfree_rcu() is allowed. */
+	for (i = 0; i < ma->max; i++)
+		kfree_rcu(rcu_dereference_raw(ma->masks[i]), rcu);
+
+	kfree_rcu(rcu_dereference_raw(tbl->mask_array), rcu);
+}
+
 /* No need for locking this function is called from RCU callback or
  * error path.
  */
@@ -304,7 +316,7 @@ void ovs_flow_tbl_destroy(struct flow_table *table)
 	struct table_instance *ufid_ti = rcu_dereference_raw(table->ufid_ti);
 
 	free_percpu(table->mask_cache);
-	kfree_rcu(rcu_dereference_raw(table->mask_array), rcu);
+	tbl_mask_array_destroy(table);
 	table_instance_destroy(ti, ufid_ti, false);
 }
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH net-next v4 09/10] net: openvswitch: don't unlock mutex when changing the user_features fails
  2019-10-15 10:30 [PATCH net-next v4 00/10] optimize openvswitch flow looking up xiangxia.m.yue
                   ` (7 preceding siblings ...)
  2019-10-15 10:30 ` [PATCH net-next v4 08/10] net: openvswitch: fix possible memleak on destroy flow-table xiangxia.m.yue
@ 2019-10-15 10:30 ` xiangxia.m.yue
  2019-10-18 23:27   ` [ovs-dev] " William Tu
  2019-10-15 10:30 ` [PATCH net-next v4 10/10] net: openvswitch: simplify the ovs_dp_cmd_new xiangxia.m.yue
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 37+ messages in thread
From: xiangxia.m.yue @ 2019-10-15 10:30 UTC (permalink / raw)
  To: gvrose8192, pshelar; +Cc: netdev, dev, Tonghao Zhang, Paul Blakey

From: Tonghao Zhang <xiangxia.m.yue@gmail.com>

Unlocking a mutex that is not locked is not allowed.
Another kernel thread may be in its critical section while
we unlock it, because setting user_features failed.
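
The rule the fix restores: every path out of ovs_dp_cmd_new() must
leave ovs_mutex as it found it, and the shared error labels must run
unlocked. A minimal sketch of the corrected shape (do_setup() is a
hypothetical stand-in for the vport registration):

	static int dp_cmd_new_sketch(struct datapath *dp)
	{
		int err;

		ovs_lock();
		err = do_setup(dp);	/* hypothetical setup step */
		if (err) {
			ovs_unlock();	/* drop the lock exactly once */
			goto err_destroy_meters;
		}
		ovs_unlock();

		return 0;

	err_destroy_meters:
		ovs_meters_exit(dp);	/* runs without ovs_mutex held */
		return err;
	}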

Fixes: 95a7233c4 ("net: openvswitch: Set OvS recirc_id from tc chain index")
Cc: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
---
 net/openvswitch/datapath.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
index 9fea7e1..aeb76e4 100644
--- a/net/openvswitch/datapath.c
+++ b/net/openvswitch/datapath.c
@@ -1657,6 +1657,7 @@ static int ovs_dp_cmd_new(struct sk_buff *skb, struct genl_info *info)
 				ovs_dp_reset_user_features(skb, info);
 		}
 
+		ovs_unlock();
 		goto err_destroy_meters;
 	}
 
@@ -1673,7 +1674,6 @@ static int ovs_dp_cmd_new(struct sk_buff *skb, struct genl_info *info)
 	return 0;
 
 err_destroy_meters:
-	ovs_unlock();
 	ovs_meters_exit(dp);
 err_destroy_ports_array:
 	kfree(dp->ports);
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH net-next v4 10/10] net: openvswitch: simplify the ovs_dp_cmd_new
  2019-10-15 10:30 [PATCH net-next v4 00/10] optimize openvswitch flow looking up xiangxia.m.yue
                   ` (8 preceding siblings ...)
  2019-10-15 10:30 ` [PATCH net-next v4 09/10] net: openvswitch: don't unlock mutex when changing the user_features fails xiangxia.m.yue
@ 2019-10-15 10:30 ` xiangxia.m.yue
  2019-10-18 23:29   ` [ovs-dev] " William Tu
  2019-10-17 19:22 ` [PATCH net-next v4 00/10] optimize openvswitch flow looking up David Miller
  2019-10-21 17:14 ` [ovs-dev] " William Tu
  11 siblings, 1 reply; 37+ messages in thread
From: xiangxia.m.yue @ 2019-10-15 10:30 UTC (permalink / raw)
  To: gvrose8192, pshelar; +Cc: netdev, dev, Tonghao Zhang

From: Tonghao Zhang <xiangxia.m.yue@gmail.com>

Use dedicated helper functions to initialize the datapath resources.

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
---
 net/openvswitch/datapath.c | 60 +++++++++++++++++++++++++++++-----------------
 1 file changed, 38 insertions(+), 22 deletions(-)

diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
index aeb76e4..4d48e48 100644
--- a/net/openvswitch/datapath.c
+++ b/net/openvswitch/datapath.c
@@ -1576,6 +1576,31 @@ static int ovs_dp_change(struct datapath *dp, struct nlattr *a[])
 	return 0;
 }
 
+static int ovs_dp_stats_init(struct datapath *dp)
+{
+	dp->stats_percpu = netdev_alloc_pcpu_stats(struct dp_stats_percpu);
+	if (!dp->stats_percpu)
+		return -ENOMEM;
+
+	return 0;
+}
+
+static int ovs_dp_vport_init(struct datapath *dp)
+{
+	int i;
+
+	dp->ports = kmalloc_array(DP_VPORT_HASH_BUCKETS,
+				  sizeof(struct hlist_head),
+				  GFP_KERNEL);
+	if (!dp->ports)
+		return -ENOMEM;
+
+	for (i = 0; i < DP_VPORT_HASH_BUCKETS; i++)
+		INIT_HLIST_HEAD(&dp->ports[i]);
+
+	return 0;
+}
+
 static int ovs_dp_cmd_new(struct sk_buff *skb, struct genl_info *info)
 {
 	struct nlattr **a = info->attrs;
@@ -1584,7 +1609,7 @@ static int ovs_dp_cmd_new(struct sk_buff *skb, struct genl_info *info)
 	struct datapath *dp;
 	struct vport *vport;
 	struct ovs_net *ovs_net;
-	int err, i;
+	int err;
 
 	err = -EINVAL;
 	if (!a[OVS_DP_ATTR_NAME] || !a[OVS_DP_ATTR_UPCALL_PID])
@@ -1597,35 +1622,26 @@ static int ovs_dp_cmd_new(struct sk_buff *skb, struct genl_info *info)
 	err = -ENOMEM;
 	dp = kzalloc(sizeof(*dp), GFP_KERNEL);
 	if (dp == NULL)
-		goto err_free_reply;
+		goto err_destroy_reply;
 
 	ovs_dp_set_net(dp, sock_net(skb->sk));
 
 	/* Allocate table. */
 	err = ovs_flow_tbl_init(&dp->table);
 	if (err)
-		goto err_free_dp;
+		goto err_destroy_dp;
 
-	dp->stats_percpu = netdev_alloc_pcpu_stats(struct dp_stats_percpu);
-	if (!dp->stats_percpu) {
-		err = -ENOMEM;
+	err = ovs_dp_stats_init(dp);
+	if (err)
 		goto err_destroy_table;
-	}
 
-	dp->ports = kmalloc_array(DP_VPORT_HASH_BUCKETS,
-				  sizeof(struct hlist_head),
-				  GFP_KERNEL);
-	if (!dp->ports) {
-		err = -ENOMEM;
-		goto err_destroy_percpu;
-	}
-
-	for (i = 0; i < DP_VPORT_HASH_BUCKETS; i++)
-		INIT_HLIST_HEAD(&dp->ports[i]);
+	err = ovs_dp_vport_init(dp);
+	if (err)
+		goto err_destroy_stats;
 
 	err = ovs_meters_init(dp);
 	if (err)
-		goto err_destroy_ports_array;
+		goto err_destroy_ports;
 
 	/* Set up our datapath device. */
 	parms.name = nla_data(a[OVS_DP_ATTR_NAME]);
@@ -1675,15 +1691,15 @@ static int ovs_dp_cmd_new(struct sk_buff *skb, struct genl_info *info)
 
 err_destroy_meters:
 	ovs_meters_exit(dp);
-err_destroy_ports_array:
+err_destroy_ports:
 	kfree(dp->ports);
-err_destroy_percpu:
+err_destroy_stats:
 	free_percpu(dp->stats_percpu);
 err_destroy_table:
 	ovs_flow_tbl_destroy(&dp->table);
-err_free_dp:
+err_destroy_dp:
 	kfree(dp);
-err_free_reply:
+err_destroy_reply:
 	kfree_skb(reply);
 err:
 	return err;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH net-next v4 00/10] optimize openvswitch flow looking up
  2019-10-15 10:30 [PATCH net-next v4 00/10] optimize openvswitch flow looking up xiangxia.m.yue
                   ` (9 preceding siblings ...)
  2019-10-15 10:30 ` [PATCH net-next v4 10/10] net: openvswitch: simplify the ovs_dp_cmd_new xiangxia.m.yue
@ 2019-10-17 19:22 ` David Miller
  2019-10-17 20:29   ` Gregory Rose
  2019-10-21 17:14 ` [ovs-dev] " William Tu
  11 siblings, 1 reply; 37+ messages in thread
From: David Miller @ 2019-10-17 19:22 UTC (permalink / raw)
  To: xiangxia.m.yue; +Cc: gvrose8192, pshelar, netdev, dev

From: xiangxia.m.yue@gmail.com
Date: Tue, 15 Oct 2019 18:30:30 +0800

> This patch series optimizes Open vSwitch flow lookup performance and
> simplifies the code.
 ...

Pravin and company, please review!

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH net-next v4 00/10] optimize openvswitch flow looking up
  2019-10-17 19:22 ` [PATCH net-next v4 00/10] optimize openvswitch flow looking up David Miller
@ 2019-10-17 20:29   ` Gregory Rose
  0 siblings, 0 replies; 37+ messages in thread
From: Gregory Rose @ 2019-10-17 20:29 UTC (permalink / raw)
  To: David Miller, xiangxia.m.yue; +Cc: pshelar, netdev, dev


On 10/17/2019 12:22 PM, David Miller wrote:
> From: xiangxia.m.yue@gmail.com
> Date: Tue, 15 Oct 2019 18:30:30 +0800
>
>> This patch series optimizes Open vSwitch flow lookup performance and
>> simplifies the code.
>   ...
>
> Pravin and company, please review!

Hi Dave,

I've tested the patches and provided my 'Tested-by' tag.  Pravin has had 
some suggestions for versions 1-3 of these
patches so I'll let him provide the ack on this latest patch set.

Thanks,

- Greg

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH net-next v4 08/10] net: openvswitch: fix possible memleak on destroy flow-table
  2019-10-15 10:30 ` [PATCH net-next v4 08/10] net: openvswitch: fix possible memleak on destroy flow-table xiangxia.m.yue
@ 2019-10-17 22:38   ` Pravin Shelar
  2019-10-18  3:16     ` Tonghao Zhang
  0 siblings, 1 reply; 37+ messages in thread
From: Pravin Shelar @ 2019-10-17 22:38 UTC (permalink / raw)
  To: Tonghao Zhang; +Cc: Greg Rose, Linux Kernel Network Developers, ovs dev

On Wed, Oct 16, 2019 at 5:50 AM <xiangxia.m.yue@gmail.com> wrote:
>
> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>
> When we destroy the flow table it may still contain flow masks,
> so release the flow mask structs as well.
>
> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> Tested-by: Greg Rose <gvrose8192@gmail.com>
> ---
>  net/openvswitch/flow_table.c | 14 +++++++++++++-
>  1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
> index 5df5182..d5d768e 100644
> --- a/net/openvswitch/flow_table.c
> +++ b/net/openvswitch/flow_table.c
> @@ -295,6 +295,18 @@ static void table_instance_destroy(struct table_instance *ti,
>         }
>  }
>
> +static void tbl_mask_array_destroy(struct flow_table *tbl)
> +{
> +       struct mask_array *ma = ovsl_dereference(tbl->mask_array);
> +       int i;
> +
> +       /* Free the flow masks; passing NULL to kfree_rcu() is allowed. */
> +       for (i = 0; i < ma->max; i++)
> +               kfree_rcu(rcu_dereference_raw(ma->masks[i]), rcu);
> +
> +       kfree_rcu(rcu_dereference_raw(tbl->mask_array), rcu);
> +}
> +
>  /* No need for locking this function is called from RCU callback or
>   * error path.
>   */
> @@ -304,7 +316,7 @@ void ovs_flow_tbl_destroy(struct flow_table *table)
>         struct table_instance *ufid_ti = rcu_dereference_raw(table->ufid_ti);
>
>         free_percpu(table->mask_cache);
> -       kfree_rcu(rcu_dereference_raw(table->mask_array), rcu);
> +       tbl_mask_array_destroy(table);
>         table_instance_destroy(ti, ufid_ti, false);
>  }

This should not be required. The mask is linked to a flow and gets
released when the flow is removed.
Does the memory leak occur when the OVS module is abruptly unloaded
and userspace does not clean up the flow table?
In that case a better fix could be calling an ovs_flow_tbl_remove()
equivalent from table_instance_destroy() while it iterates the flow
table.
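
One shape that suggestion might take (a rough sketch only, with a
hypothetical helper; the real bucket iteration lives in
table_instance_destroy()):

	static void table_instance_flush(struct flow_table *tbl,
					 struct table_instance *ti)
	{
		int i;

		for (i = 0; i < ti->n_buckets; i++) {
			struct hlist_head *head = &ti->buckets[i];
			struct hlist_node *n;
			struct sw_flow *flow;

			/* ovs_flow_tbl_remove() drops the flow's mask
			 * reference, so the mask is freed with the last
			 * flow that uses it.
			 */
			hlist_for_each_entry_safe(flow, n, head,
						  flow_table.node[ti->node_ver])
				ovs_flow_tbl_remove(tbl, flow);
		}
	}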

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH net-next v4 08/10] net: openvswitch: fix possible memleak on destroy flow-table
  2019-10-17 22:38   ` Pravin Shelar
@ 2019-10-18  3:16     ` Tonghao Zhang
  2019-10-18 18:12       ` Pravin Shelar
  0 siblings, 1 reply; 37+ messages in thread
From: Tonghao Zhang @ 2019-10-18  3:16 UTC (permalink / raw)
  To: Pravin Shelar; +Cc: Greg Rose, Linux Kernel Network Developers, ovs dev

On Fri, Oct 18, 2019 at 6:38 AM Pravin Shelar <pshelar@ovn.org> wrote:
>
> On Wed, Oct 16, 2019 at 5:50 AM <xiangxia.m.yue@gmail.com> wrote:
> >
> > From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> >
> > When we destroy the flow table it may still contain flow masks,
> > so release the flow mask structs as well.
> >
> > Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > Tested-by: Greg Rose <gvrose8192@gmail.com>
> > ---
> >  net/openvswitch/flow_table.c | 14 +++++++++++++-
> >  1 file changed, 13 insertions(+), 1 deletion(-)
> >
> > diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
> > index 5df5182..d5d768e 100644
> > --- a/net/openvswitch/flow_table.c
> > +++ b/net/openvswitch/flow_table.c
> > @@ -295,6 +295,18 @@ static void table_instance_destroy(struct table_instance *ti,
> >         }
> >  }
> >
> > +static void tbl_mask_array_destroy(struct flow_table *tbl)
> > +{
> > +       struct mask_array *ma = ovsl_dereference(tbl->mask_array);
> > +       int i;
> > +
> > +       /* Free the flow masks; passing NULL to kfree_rcu() is allowed. */
> > +       for (i = 0; i < ma->max; i++)
> > +               kfree_rcu(rcu_dereference_raw(ma->masks[i]), rcu);
> > +
> > +       kfree_rcu(rcu_dereference_raw(tbl->mask_array), rcu);
> > +}
> > +
> >  /* No need for locking this function is called from RCU callback or
> >   * error path.
> >   */
> > @@ -304,7 +316,7 @@ void ovs_flow_tbl_destroy(struct flow_table *table)
> >         struct table_instance *ufid_ti = rcu_dereference_raw(table->ufid_ti);
> >
> >         free_percpu(table->mask_cache);
> > -       kfree_rcu(rcu_dereference_raw(table->mask_array), rcu);
> > +       tbl_mask_array_destroy(table);
> >         table_instance_destroy(ti, ufid_ti, false);
> >  }
>
> This should not be required. A mask is linked to a flow and gets
> released when the flow is removed.
> Does the memory leak occur when the OVS module is abruptly unloaded and
> userspace does not clean up the flow table?
When we destroy the OVS datapath, or the net namespace is destroyed, the
mask memory leak happens. The call tree:
ovs_exit_net/ovs_dp_cmd_del
-->__dp_destroy
-->destroy_dp_rcu
-->ovs_flow_tbl_destroy
-->table_instance_destroy (which doesn't release the mask memory, because
it doesn't call ovs_flow_tbl_remove()/flow_mask_remove() directly or
indirectly).

One more thing: when we flush the flows, we don't flush the flow masks.
(If necessary, a separate patch should be sent.)

> In that case, a better fix could be calling an ovs_flow_tbl_remove()
> equivalent from table_instance_destroy() while it is iterating the
> flow table.
I think operations on the flow mask and on the flow table should use
different APIs, for example:
for the flow mask, we use:
-tbl_mask_array_add_mask
-tbl_mask_array_del_mask
-tbl_mask_array_alloc
-tbl_mask_array_realloc
-tbl_mask_array_destroy(this patch introduce.)

table instance:
-table_instance_alloc
-table_instance_destroy
....

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH net-next v4 08/10] net: openvswitch: fix possible memleak on destroy flow-table
  2019-10-18  3:16     ` Tonghao Zhang
@ 2019-10-18 18:12       ` Pravin Shelar
  2019-10-21  5:01         ` Tonghao Zhang
  0 siblings, 1 reply; 37+ messages in thread
From: Pravin Shelar @ 2019-10-18 18:12 UTC (permalink / raw)
  To: Tonghao Zhang; +Cc: Greg Rose, Linux Kernel Network Developers, ovs dev

On Thu, Oct 17, 2019 at 8:16 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
>
> On Fri, Oct 18, 2019 at 6:38 AM Pravin Shelar <pshelar@ovn.org> wrote:
> >
> > On Wed, Oct 16, 2019 at 5:50 AM <xiangxia.m.yue@gmail.com> wrote:
> > >
> > > From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > >
> > > When we destroy a flow table that may still contain flow masks,
> > > release the flow-mask structs as well.
> > >
> > > Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > > Tested-by: Greg Rose <gvrose8192@gmail.com>
> > > ---
> > >  net/openvswitch/flow_table.c | 14 +++++++++++++-
> > >  1 file changed, 13 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
> > > index 5df5182..d5d768e 100644
> > > --- a/net/openvswitch/flow_table.c
> > > +++ b/net/openvswitch/flow_table.c
> > > @@ -295,6 +295,18 @@ static void table_instance_destroy(struct table_instance *ti,
> > >         }
> > >  }
> > >
> > > +static void tbl_mask_array_destroy(struct flow_table *tbl)
> > > +{
> > > +       struct mask_array *ma = ovsl_dereference(tbl->mask_array);
> > > +       int i;
> > > +
> > > +       /* Free the flow masks; passing NULL to kfree_rcu() is allowed. */
> > > +       for (i = 0; i < ma->max; i++)
> > > +               kfree_rcu(rcu_dereference_raw(ma->masks[i]), rcu);
> > > +
> > > +       kfree_rcu(rcu_dereference_raw(tbl->mask_array), rcu);
> > > +}
> > > +
> > >  /* No need for locking this function is called from RCU callback or
> > >   * error path.
> > >   */
> > > @@ -304,7 +316,7 @@ void ovs_flow_tbl_destroy(struct flow_table *table)
> > >         struct table_instance *ufid_ti = rcu_dereference_raw(table->ufid_ti);
> > >
> > >         free_percpu(table->mask_cache);
> > > -       kfree_rcu(rcu_dereference_raw(table->mask_array), rcu);
> > > +       tbl_mask_array_destroy(table);
> > >         table_instance_destroy(ti, ufid_ti, false);
> > >  }
> >
> > This should not be required. A mask is linked to a flow and gets
> > released when the flow is removed.
> > Does the memory leak occur when the OVS module is abruptly unloaded and
> > userspace does not clean up the flow table?
> When we destroy the OVS datapath, or the net namespace is destroyed, the
> mask memory leak happens. The call tree:
> ovs_exit_net/ovs_dp_cmd_del
> -->__dp_destroy
> -->destroy_dp_rcu
> -->ovs_flow_tbl_destroy
> -->table_instance_destroy (which doesn't release the mask memory, because
> it doesn't call ovs_flow_tbl_remove()/flow_mask_remove() directly or
> indirectly).
>
That's what I suggested earlier: we need to call a function similar to
ovs_flow_tbl_remove(), and we could refactor the code to share it.
This is better, since introducing tbl_mask_array_destroy() creates a
dangling pointer to the mask in the sw_flow object. OVS anyway iterates
the entire flow table to release each sw_flow in
table_instance_destroy(), so it is natural to release the mask at that
point, after releasing the corresponding sw_flow.



> One more thing: when we flush the flows, we don't flush the flow masks.
> (If necessary, a separate patch should be sent.)
>
> > In that case, a better fix could be calling an ovs_flow_tbl_remove()
> > equivalent from table_instance_destroy() while it is iterating the
> > flow table.
> I think operations on the flow mask and on the flow table should use
> different APIs, for example:
> for the flow mask, we use:
> -tbl_mask_array_add_mask
> -tbl_mask_array_del_mask
> -tbl_mask_array_alloc
> -tbl_mask_array_realloc
> -tbl_mask_array_destroy(this patch introduce.)
>
> table instance:
> -table_instance_alloc
> -table_instance_destroy
> ....

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [ovs-dev] [PATCH net-next v4 05/10] net: openvswitch: optimize flow-mask looking up
  2019-10-15 10:30 ` [PATCH net-next v4 05/10] net: openvswitch: optimize flow-mask looking up xiangxia.m.yue
@ 2019-10-18 23:26   ` William Tu
  2019-10-21  4:51     ` Tonghao Zhang
  0 siblings, 1 reply; 37+ messages in thread
From: William Tu @ 2019-10-18 23:26 UTC (permalink / raw)
  To: xiangxia.m.yue
  Cc: Greg Rose, pravin shelar, <dev@openvswitch.org>,
	Linux Kernel Network Developers

On Wed, Oct 16, 2019 at 5:54 AM <xiangxia.m.yue@gmail.com> wrote:
>
> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>
> A full lookup on the flow table traverses the whole mask array.
> If the mask array is too large, the number of invalid flow masks
> increases and performance drops.
>
> One bad case, for example: M means a valid flow mask and NULL
> means a deleted one.
>
> +-------------------------------------------+
> | M | NULL | ...                  | NULL | M|
> +-------------------------------------------+
>
> In that case, without this patch, openvswitch traverses the whole
> mask array, because there is one flow mask at the tail. This
> patch changes the way flow masks are inserted and deleted, so the
> mask array is kept as below: there is no NULL hole. In the
> fast path, we can "break" out of the "for" loop (not "continue") in
> flow_lookup when we hit a NULL flow mask.
>
>          "break"
>             v
> +-------------------------------------------+
> | M | M |  NULL |...           | NULL | NULL|
> +-------------------------------------------+
>
> This patch doesn't optimize the slow or control path, which still
> uses ma->max to traverse. Slow path:
> * tbl_mask_array_realloc
> * ovs_flow_tbl_lookup_exact
> * flow_mask_find


Hi Tonghao,

Does this improve performance? After all, the original code simply
checks whether the mask is NULL, then goes to the next mask.

However, with your patch, doesn't this invalidate the mask cache entries
which point to the "M" you swap to the front? See my comments inline.

>
> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> Tested-by: Greg Rose <gvrose8192@gmail.com>
> ---
>  net/openvswitch/flow_table.c | 101 ++++++++++++++++++++++---------------------
>  1 file changed, 52 insertions(+), 49 deletions(-)
>
> diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
> index 8d4f50d..a10d421 100644
> --- a/net/openvswitch/flow_table.c
> +++ b/net/openvswitch/flow_table.c
> @@ -538,7 +538,7 @@ static struct sw_flow *flow_lookup(struct flow_table *tbl,
>
>                 mask = rcu_dereference_ovsl(ma->masks[i]);
>                 if (!mask)
> -                       continue;
> +                       break;
>
>                 flow = masked_flow_lookup(ti, key, mask, n_mask_hit);
>                 if (flow) { /* Found */
> @@ -695,7 +695,7 @@ struct sw_flow *ovs_flow_tbl_lookup_ufid(struct flow_table *tbl,
>  int ovs_flow_tbl_num_masks(const struct flow_table *table)
>  {
>         struct mask_array *ma = rcu_dereference_ovsl(table->mask_array);
> -       return ma->count;
> +       return READ_ONCE(ma->count);
>  }
>
>  static struct table_instance *table_instance_expand(struct table_instance *ti,
> @@ -704,21 +704,33 @@ static struct table_instance *table_instance_expand(struct table_instance *ti,
>         return table_instance_rehash(ti, ti->n_buckets * 2, ufid);
>  }
>
> -static void tbl_mask_array_delete_mask(struct mask_array *ma,
> -                                      struct sw_flow_mask *mask)
> +static void tbl_mask_array_del_mask(struct flow_table *tbl,
> +                                   struct sw_flow_mask *mask)
>  {
> -       int i;
> +       struct mask_array *ma = ovsl_dereference(tbl->mask_array);
> +       int i, ma_count = READ_ONCE(ma->count);
>
>         /* Remove the deleted mask pointers from the array */
> -       for (i = 0; i < ma->max; i++) {
> -               if (mask == ovsl_dereference(ma->masks[i])) {
> -                       RCU_INIT_POINTER(ma->masks[i], NULL);
> -                       ma->count--;
> -                       kfree_rcu(mask, rcu);
> -                       return;
> -               }
> +       for (i = 0; i < ma_count; i++) {
> +               if (mask == ovsl_dereference(ma->masks[i]))
> +                       goto found;
>         }
> +
>         BUG();
> +       return;
> +
> +found:
> +       WRITE_ONCE(ma->count, ma_count -1);
> +
> +       rcu_assign_pointer(ma->masks[i], ma->masks[ma_count -1]);
> +       RCU_INIT_POINTER(ma->masks[ma_count -1], NULL);

So when you swap in ma->masks[ma_count - 1], don't the mask cache entries
whose 'mask_index == ma_count - 1' all become invalid?
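
(Illustrative, using the per-cpu cache entry struct as added earlier in
this series:

	struct mask_cache_entry {
		u32 skb_hash;
		u32 mask_index;
	};

an entry cached with mask_index pointing at the old last slot now refers
to a slot that has just been NULLed or repurposed, so that cached lookup
misses and falls back to scanning the whole array.)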

Regards,
William

<snip>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [ovs-dev] [PATCH net-next v4 06/10] net: openvswitch: simplify the flow_hash
  2019-10-15 10:30 ` [PATCH net-next v4 06/10] net: openvswitch: simplify the flow_hash xiangxia.m.yue
@ 2019-10-18 23:27   ` William Tu
  0 siblings, 0 replies; 37+ messages in thread
From: William Tu @ 2019-10-18 23:27 UTC (permalink / raw)
  To: xiangxia.m.yue
  Cc: Greg Rose, pravin shelar, <dev@openvswitch.org>,
	Linux Kernel Network Developers

On Wed, Oct 16, 2019 at 5:54 AM <xiangxia.m.yue@gmail.com> wrote:
>
> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>
> Simplify the code and remove the unnecessary BUILD_BUG_ON.
>
> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> Tested-by: Greg Rose <gvrose8192@gmail.com>
> ---

LGTM
Acked-by: William Tu <u9012063@gmail.com>

>  net/openvswitch/flow_table.c | 8 ++------
>  1 file changed, 2 insertions(+), 6 deletions(-)
>
> diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
> index a10d421..3e3d345 100644
> --- a/net/openvswitch/flow_table.c
> +++ b/net/openvswitch/flow_table.c
> @@ -432,13 +432,9 @@ int ovs_flow_tbl_flush(struct flow_table *flow_table)
>  static u32 flow_hash(const struct sw_flow_key *key,
>                      const struct sw_flow_key_range *range)
>  {
> -       int key_start = range->start;
> -       int key_end = range->end;
> -       const u32 *hash_key = (const u32 *)((const u8 *)key + key_start);
> -       int hash_u32s = (key_end - key_start) >> 2;
> -
> +       const u32 *hash_key = (const u32 *)((const u8 *)key + range->start);
>         /* Make sure number of hash bytes are multiple of u32. */
> -       BUILD_BUG_ON(sizeof(long) % sizeof(u32));
> +       int hash_u32s = range_n_bytes(range) >> 2;
>
>         return jhash2(hash_key, hash_u32s, 0);
>  }
> --
> 1.8.3.1
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [ovs-dev] [PATCH net-next v4 07/10] net: openvswitch: add likely in flow_lookup
  2019-10-15 10:30 ` [PATCH net-next v4 07/10] net: openvswitch: add likely in flow_lookup xiangxia.m.yue
@ 2019-10-18 23:27   ` William Tu
  0 siblings, 0 replies; 37+ messages in thread
From: William Tu @ 2019-10-18 23:27 UTC (permalink / raw)
  To: xiangxia.m.yue
  Cc: Greg Rose, pravin shelar, <dev@openvswitch.org>,
	Linux Kernel Network Developers

On Wed, Oct 16, 2019 at 5:55 AM <xiangxia.m.yue@gmail.com> wrote:
>
> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>
> In most cases *index < ma->max and the flow mask is not NULL.
> We add likely/unlikely annotations for performance.
>
> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> Tested-by: Greg Rose <gvrose8192@gmail.com>
> ---
LGTM
Acked-by: William Tu <u9012063@gmail.com>


>  net/openvswitch/flow_table.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
> index 3e3d345..5df5182 100644
> --- a/net/openvswitch/flow_table.c
> +++ b/net/openvswitch/flow_table.c
> @@ -518,7 +518,7 @@ static struct sw_flow *flow_lookup(struct flow_table *tbl,
>         struct sw_flow_mask *mask;
>         int i;
>
> -       if (*index < ma->max) {
> +       if (likely(*index < ma->max)) {
>                 mask = rcu_dereference_ovsl(ma->masks[*index]);
>                 if (mask) {
>                         flow = masked_flow_lookup(ti, key, mask, n_mask_hit);
> @@ -533,7 +533,7 @@ static struct sw_flow *flow_lookup(struct flow_table *tbl,
>                         continue;
>
>                 mask = rcu_dereference_ovsl(ma->masks[i]);
> -               if (!mask)
> +               if (unlikely(!mask))
>                         break;
>
>                 flow = masked_flow_lookup(ti, key, mask, n_mask_hit);
> --
> 1.8.3.1
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [ovs-dev] [PATCH net-next v4 09/10] net: openvswitch: don't unlock mutex when changing the user_features fails
  2019-10-15 10:30 ` [PATCH net-next v4 09/10] net: openvswitch: don't unlock mutex when changing the user_features fails xiangxia.m.yue
@ 2019-10-18 23:27   ` William Tu
  0 siblings, 0 replies; 37+ messages in thread
From: William Tu @ 2019-10-18 23:27 UTC (permalink / raw)
  To: xiangxia.m.yue
  Cc: Greg Rose, pravin shelar, <dev@openvswitch.org>,
	Linux Kernel Network Developers

On Wed, Oct 16, 2019 at 5:56 AM <xiangxia.m.yue@gmail.com> wrote:
>
> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>
> Unlocking a mutex that is not locked is not allowed.
> Another kernel thread may be in its critical section while
> we unlock it because setting user_features failed.
>
> Fixes: 95a7233c4 ("net: openvswitch: Set OvS recirc_id from tc chain index")
> Cc: Paul Blakey <paulb@mellanox.com>
> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> Tested-by: Greg Rose <gvrose8192@gmail.com>
> ---

LGTM
Acked-by: William Tu <u9012063@gmail.com>


>  net/openvswitch/datapath.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
> index 9fea7e1..aeb76e4 100644
> --- a/net/openvswitch/datapath.c
> +++ b/net/openvswitch/datapath.c
> @@ -1657,6 +1657,7 @@ static int ovs_dp_cmd_new(struct sk_buff *skb, struct genl_info *info)
>                                 ovs_dp_reset_user_features(skb, info);
>                 }
>
> +               ovs_unlock();
>                 goto err_destroy_meters;
>         }
>
> @@ -1673,7 +1674,6 @@ static int ovs_dp_cmd_new(struct sk_buff *skb, struct genl_info *info)
>         return 0;
>
>  err_destroy_meters:
> -       ovs_unlock();
>         ovs_meters_exit(dp);
>  err_destroy_ports_array:
>         kfree(dp->ports);
> --
> 1.8.3.1
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [ovs-dev] [PATCH net-next v4 10/10] net: openvswitch: simplify the ovs_dp_cmd_new
  2019-10-15 10:30 ` [PATCH net-next v4 10/10] net: openvswitch: simplify the ovs_dp_cmd_new xiangxia.m.yue
@ 2019-10-18 23:29   ` William Tu
  0 siblings, 0 replies; 37+ messages in thread
From: William Tu @ 2019-10-18 23:29 UTC (permalink / raw)
  To: xiangxia.m.yue
  Cc: Greg Rose, pravin shelar, <dev@openvswitch.org>,
	Linux Kernel Network Developers

On Wed, Oct 16, 2019 at 5:56 AM <xiangxia.m.yue@gmail.com> wrote:
>
> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>
> Use dedicated functions to init the resources.
>
> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> Tested-by: Greg Rose <gvrose8192@gmail.com>
> ---

Looks like this is simply moving code around.
I don't have any opinion.

>  net/openvswitch/datapath.c | 60 +++++++++++++++++++++++++++++-----------------
>  1 file changed, 38 insertions(+), 22 deletions(-)
>
> diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
> index aeb76e4..4d48e48 100644
> --- a/net/openvswitch/datapath.c
> +++ b/net/openvswitch/datapath.c
> @@ -1576,6 +1576,31 @@ static int ovs_dp_change(struct datapath *dp, struct nlattr *a[])
>         return 0;
>  }
>
> +static int ovs_dp_stats_init(struct datapath *dp)
> +{
> +       dp->stats_percpu = netdev_alloc_pcpu_stats(struct dp_stats_percpu);
> +       if (!dp->stats_percpu)
> +               return -ENOMEM;
> +
> +       return 0;
> +}
> +
> +static int ovs_dp_vport_init(struct datapath *dp)
> +{
> +       int i;
> +
> +       dp->ports = kmalloc_array(DP_VPORT_HASH_BUCKETS,
> +                                 sizeof(struct hlist_head),
> +                                 GFP_KERNEL);
> +       if (!dp->ports)
> +               return -ENOMEM;
> +
> +       for (i = 0; i < DP_VPORT_HASH_BUCKETS; i++)
> +               INIT_HLIST_HEAD(&dp->ports[i]);
> +
> +       return 0;
> +}
> +
>  static int ovs_dp_cmd_new(struct sk_buff *skb, struct genl_info *info)
>  {
>         struct nlattr **a = info->attrs;
> @@ -1584,7 +1609,7 @@ static int ovs_dp_cmd_new(struct sk_buff *skb, struct genl_info *info)
>         struct datapath *dp;
>         struct vport *vport;
>         struct ovs_net *ovs_net;
> -       int err, i;
> +       int err;
>
>         err = -EINVAL;
>         if (!a[OVS_DP_ATTR_NAME] || !a[OVS_DP_ATTR_UPCALL_PID])
> @@ -1597,35 +1622,26 @@ static int ovs_dp_cmd_new(struct sk_buff *skb, struct genl_info *info)
>         err = -ENOMEM;
>         dp = kzalloc(sizeof(*dp), GFP_KERNEL);
>         if (dp == NULL)
> -               goto err_free_reply;
> +               goto err_destroy_reply;
>
>         ovs_dp_set_net(dp, sock_net(skb->sk));
>
>         /* Allocate table. */
>         err = ovs_flow_tbl_init(&dp->table);
>         if (err)
> -               goto err_free_dp;
> +               goto err_destroy_dp;
>
> -       dp->stats_percpu = netdev_alloc_pcpu_stats(struct dp_stats_percpu);
> -       if (!dp->stats_percpu) {
> -               err = -ENOMEM;
> +       err = ovs_dp_stats_init(dp);
> +       if (err)
>                 goto err_destroy_table;
> -       }
>
> -       dp->ports = kmalloc_array(DP_VPORT_HASH_BUCKETS,
> -                                 sizeof(struct hlist_head),
> -                                 GFP_KERNEL);
> -       if (!dp->ports) {
> -               err = -ENOMEM;
> -               goto err_destroy_percpu;
> -       }
> -
> -       for (i = 0; i < DP_VPORT_HASH_BUCKETS; i++)
> -               INIT_HLIST_HEAD(&dp->ports[i]);
> +       err = ovs_dp_vport_init(dp);
> +       if (err)
> +               goto err_destroy_stats;
>
>         err = ovs_meters_init(dp);
>         if (err)
> -               goto err_destroy_ports_array;
> +               goto err_destroy_ports;
>
>         /* Set up our datapath device. */
>         parms.name = nla_data(a[OVS_DP_ATTR_NAME]);
> @@ -1675,15 +1691,15 @@ static int ovs_dp_cmd_new(struct sk_buff *skb, struct genl_info *info)
>
>  err_destroy_meters:
>         ovs_meters_exit(dp);
> -err_destroy_ports_array:
> +err_destroy_ports:
>         kfree(dp->ports);
> -err_destroy_percpu:
> +err_destroy_stats:
>         free_percpu(dp->stats_percpu);
>  err_destroy_table:
>         ovs_flow_tbl_destroy(&dp->table);
> -err_free_dp:
> +err_destroy_dp:
>         kfree(dp);
> -err_free_reply:
> +err_destroy_reply:
>         kfree_skb(reply);
>  err:
>         return err;
> --
> 1.8.3.1
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [ovs-dev] [PATCH net-next v4 01/10] net: openvswitch: add flow-mask cache for performance
  2019-10-15 10:30 ` [PATCH net-next v4 01/10] net: openvswitch: add flow-mask cache for performance xiangxia.m.yue
@ 2019-10-18 23:29   ` William Tu
  0 siblings, 0 replies; 37+ messages in thread
From: William Tu @ 2019-10-18 23:29 UTC (permalink / raw)
  To: xiangxia.m.yue
  Cc: Greg Rose, pravin shelar, <dev@openvswitch.org>,
	Linux Kernel Network Developers

On Wed, Oct 16, 2019 at 5:51 AM <xiangxia.m.yue@gmail.com> wrote:
>
> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>
> The idea of this optimization comes from a patch committed in 2014
> in the openvswitch community; the author is Pravin B Shelar. In
> order to get high performance, I implement it again. Later patches
> will use it.
>
> Pravin B Shelar says:
> | On every packet OVS needs to lookup flow-table with every
> | mask until it finds a match. The packet flow-key is first
> | masked with mask in the list and then the masked key is
> | looked up in flow-table. Therefore number of masks can
> | affect packet processing performance.
>
> Link: https://github.com/openvswitch/ovs/commit/5604935e4e1cbc16611d2d97f50b717aa31e8ec5
> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> Tested-by: Greg Rose <gvrose8192@gmail.com>
> ---
LGTM
Acked-by: William Tu <u9012063@gmail.com>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [ovs-dev] [PATCH net-next v4 02/10] net: openvswitch: convert mask list in mask array
  2019-10-15 10:30 ` [PATCH net-next v4 02/10] net: openvswitch: convert mask list in mask array xiangxia.m.yue
@ 2019-10-18 23:30   ` William Tu
  0 siblings, 0 replies; 37+ messages in thread
From: William Tu @ 2019-10-18 23:30 UTC (permalink / raw)
  To: xiangxia.m.yue
  Cc: Greg Rose, pravin shelar, <dev@openvswitch.org>,
	Linux Kernel Network Developers

On Wed, Oct 16, 2019 at 5:52 AM <xiangxia.m.yue@gmail.com> wrote:
>
> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>
> Port the code to Linux upstream with small changes.
>
> Pravin B Shelar says:
> | mask caches index of mask in mask_list. On packet recv OVS
> | need to traverse mask-list to get cached mask. Therefore array
> | is better for retrieving cached mask. This also allows better
> | cache replacement algorithm by directly checking mask's existence.
>
> Link: https://github.com/openvswitch/ovs/commit/d49fc3ff53c65e4eca9cabd52ac63396746a7ef5
> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> Tested-by: Greg Rose <gvrose8192@gmail.com>
> ---

LGTM
Acked-by: William Tu <u9012063@gmail.com>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [ovs-dev] [PATCH net-next v4 03/10] net: openvswitch: shrink the mask array if necessary
  2019-10-15 10:30 ` [PATCH net-next v4 03/10] net: openvswitch: shrink the mask array if necessary xiangxia.m.yue
@ 2019-10-18 23:33   ` William Tu
  0 siblings, 0 replies; 37+ messages in thread
From: William Tu @ 2019-10-18 23:33 UTC (permalink / raw)
  To: xiangxia.m.yue
  Cc: Greg Rose, pravin shelar, <dev@openvswitch.org>,
	Linux Kernel Network Developers

On Wed, Oct 16, 2019 at 5:53 AM <xiangxia.m.yue@gmail.com> wrote:
>
> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>
> When creating and inserting a flow mask, if there is no available
> slot, we realloc the mask array. When removing a flow mask,
> we shrink the mask array if necessary.
>
> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> Tested-by: Greg Rose <gvrose8192@gmail.com>
> ---

LGTM
Acked-by: William Tu <u9012063@gmail.com>

On the other hand, maybe we should have an upper limit on the
mask cache size?
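
If it is the mask array that should be bounded, something like this at
the top of tbl_mask_array_realloc() could do it. Purely hypothetical:
MASK_ARRAY_SIZE_MAX is a made-up name and value, not something in this
series.

#define MASK_ARRAY_SIZE_MAX	256	/* illustrative cap */

	if (size > MASK_ARRAY_SIZE_MAX)
		return -ENOSPC;	/* refuse to grow the mask array further */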

Regards,
William
>  net/openvswitch/flow_table.c | 33 +++++++++++++++++++++++----------
>  1 file changed, 23 insertions(+), 10 deletions(-)
>
> diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
> index 0d1df53..237cf85 100644
> --- a/net/openvswitch/flow_table.c
> +++ b/net/openvswitch/flow_table.c
> @@ -693,6 +693,23 @@ static struct table_instance *table_instance_expand(struct table_instance *ti,
>         return table_instance_rehash(ti, ti->n_buckets * 2, ufid);
>  }
>
> +static void tbl_mask_array_delete_mask(struct mask_array *ma,
> +                                      struct sw_flow_mask *mask)
> +{
> +       int i;
> +
> +       /* Remove the deleted mask pointers from the array */
> +       for (i = 0; i < ma->max; i++) {
> +               if (mask == ovsl_dereference(ma->masks[i])) {
> +                       RCU_INIT_POINTER(ma->masks[i], NULL);
> +                       ma->count--;
> +                       kfree_rcu(mask, rcu);
> +                       return;
> +               }
> +       }
> +       BUG();
> +}
> +
>  /* Remove 'mask' from the mask list, if it is not needed any more. */
>  static void flow_mask_remove(struct flow_table *tbl, struct sw_flow_mask *mask)
>  {
> @@ -706,18 +723,14 @@ static void flow_mask_remove(struct flow_table *tbl, struct sw_flow_mask *mask)
>
>                 if (!mask->ref_count) {
>                         struct mask_array *ma;
> -                       int i;
>
>                         ma = ovsl_dereference(tbl->mask_array);
> -                       for (i = 0; i < ma->max; i++) {
> -                               if (mask == ovsl_dereference(ma->masks[i])) {
> -                                       RCU_INIT_POINTER(ma->masks[i], NULL);
> -                                       ma->count--;
> -                                       kfree_rcu(mask, rcu);
> -                                       return;
> -                               }
> -                       }
> -                       BUG();
> +                       tbl_mask_array_delete_mask(ma, mask);
> +
> +                       /* Shrink the mask array if necessary. */
> +                       if (ma->max >= (MASK_ARRAY_SIZE_MIN * 2) &&
> +                           ma->count <= (ma->max / 3))
> +                               tbl_mask_array_realloc(tbl, ma->max / 2);
>                 }
>         }
>  }
> --
> 1.8.3.1
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [ovs-dev] [PATCH net-next v4 05/10] net: openvswitch: optimize flow-mask looking up
  2019-10-18 23:26   ` [ovs-dev] " William Tu
@ 2019-10-21  4:51     ` Tonghao Zhang
  2019-10-21 17:58       ` William Tu
  0 siblings, 1 reply; 37+ messages in thread
From: Tonghao Zhang @ 2019-10-21  4:51 UTC (permalink / raw)
  To: William Tu
  Cc: Greg Rose, pravin shelar, <dev@openvswitch.org>,
	Linux Kernel Network Developers

On Sat, Oct 19, 2019 at 7:27 AM William Tu <u9012063@gmail.com> wrote:
>
> On Wed, Oct 16, 2019 at 5:54 AM <xiangxia.m.yue@gmail.com> wrote:
> >
> > From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> >
> > A full lookup on the flow table traverses the whole mask array.
> > If the mask array is too large, the number of invalid flow masks
> > increases and performance drops.
> >
> > One bad case, for example: M means a valid flow mask and NULL
> > means a deleted one.
> >
> > +-------------------------------------------+
> > | M | NULL | ...                  | NULL | M|
> > +-------------------------------------------+
> >
> > In that case, without this patch, openvswitch traverses the whole
> > mask array, because there is one flow mask at the tail. This
> > patch changes the way flow masks are inserted and deleted, so the
> > mask array is kept as below: there is no NULL hole. In the
> > fast path, we can "break" out of the "for" loop (not "continue") in
> > flow_lookup when we hit a NULL flow mask.
> >
> >          "break"
> >             v
> > +-------------------------------------------+
> > | M | M |  NULL |...           | NULL | NULL|
> > +-------------------------------------------+
> >
> > This patch doesn't optimize the slow or control path, which still
> > uses ma->max to traverse. Slow path:
> > * tbl_mask_array_realloc
> > * ovs_flow_tbl_lookup_exact
> > * flow_mask_find
>
>
> Hi Tonghao,
>
> Does this improve performance? After all, the original code simply
> checks whether the mask is NULL, then goes to the next mask.
I tested the performance with the mask cache disabled, using
dpdk-pktgen to generate packets. The test OVS flows:
ovs-dpctl add-dp system@ovs-system
ovs-dpctl add-if system@ovs-system eth6
ovs-dpctl add-if system@ovs-system eth7

for m in $(seq 1 100 | xargs printf '%.2x\n'); do
       ovs-dpctl add-flow ovs-system
"in_port(1),eth(dst=00:$m:00:00:00:00/ff:ff:ff:ff:ff:$m),eth_type(0x0800),ipv4(frag=no)"
2
done

ovs-dpctl add-flow ovs-system
"in_port(1),eth(dst=98:03:9b:6e:4a:f5/ff:ff:ff:ff:ff:ff),eth_type(0x0800),ipv4(frag=no)"
2
ovs-dpctl add-flow ovs-system
"in_port(2),eth(dst=98:03:9b:6e:4a:f4/ff:ff:ff:ff:ff:ff),eth_type(0x0800),ipv4(frag=no)"
1

for m in $(seq 101 160 | xargs printf '%.2x\n'); do
        ovs-dpctl add-flow ovs-system
"in_port(1),eth(dst=00:$m:00:00:00:00/ff:ff:ff:ff:ff:$m),eth_type(0x0800),ipv4(frag=no)"
2
done

for m in $(seq 1 100 | xargs printf '%.2x\n'); do
         ovs-dpctl del-flow ovs-system
"in_port(1),eth(dst=00:$m:00:00:00:00/ff:ff:ff:ff:ff:$m),eth_type(0x0800),ipv4(frag=no)"
done

Without this patch: 982481 pps (64B)
With this patch: 1112495 pps (64B), about a 13% improvement

> However, with your patch, doesn't this invalidate the mask cache entries
> which point to the "M" you swap to the front? See my comments inline.
>
> >
> > Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > Tested-by: Greg Rose <gvrose8192@gmail.com>
> > ---
> >  net/openvswitch/flow_table.c | 101 ++++++++++++++++++++++---------------------
> >  1 file changed, 52 insertions(+), 49 deletions(-)
> >
> > diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
> > index 8d4f50d..a10d421 100644
> > --- a/net/openvswitch/flow_table.c
> > +++ b/net/openvswitch/flow_table.c
> > @@ -538,7 +538,7 @@ static struct sw_flow *flow_lookup(struct flow_table *tbl,
> >
> >                 mask = rcu_dereference_ovsl(ma->masks[i]);
> >                 if (!mask)
> > -                       continue;
> > +                       break;
> >
> >                 flow = masked_flow_lookup(ti, key, mask, n_mask_hit);
> >                 if (flow) { /* Found */
> > @@ -695,7 +695,7 @@ struct sw_flow *ovs_flow_tbl_lookup_ufid(struct flow_table *tbl,
> >  int ovs_flow_tbl_num_masks(const struct flow_table *table)
> >  {
> >         struct mask_array *ma = rcu_dereference_ovsl(table->mask_array);
> > -       return ma->count;
> > +       return READ_ONCE(ma->count);
> >  }
> >
> >  static struct table_instance *table_instance_expand(struct table_instance *ti,
> > @@ -704,21 +704,33 @@ static struct table_instance *table_instance_expand(struct table_instance *ti,
> >         return table_instance_rehash(ti, ti->n_buckets * 2, ufid);
> >  }
> >
> > -static void tbl_mask_array_delete_mask(struct mask_array *ma,
> > -                                      struct sw_flow_mask *mask)
> > +static void tbl_mask_array_del_mask(struct flow_table *tbl,
> > +                                   struct sw_flow_mask *mask)
> >  {
> > -       int i;
> > +       struct mask_array *ma = ovsl_dereference(tbl->mask_array);
> > +       int i, ma_count = READ_ONCE(ma->count);
> >
> >         /* Remove the deleted mask pointers from the array */
> > -       for (i = 0; i < ma->max; i++) {
> > -               if (mask == ovsl_dereference(ma->masks[i])) {
> > -                       RCU_INIT_POINTER(ma->masks[i], NULL);
> > -                       ma->count--;
> > -                       kfree_rcu(mask, rcu);
> > -                       return;
> > -               }
> > +       for (i = 0; i < ma_count; i++) {
> > +               if (mask == ovsl_dereference(ma->masks[i]))
> > +                       goto found;
> >         }
> > +
> >         BUG();
> > +       return;
> > +
> > +found:
> > +       WRITE_ONCE(ma->count, ma_count -1);
> > +
> > +       rcu_assign_pointer(ma->masks[i], ma->masks[ma_count -1]);
> > +       RCU_INIT_POINTER(ma->masks[ma_count -1], NULL);
>
> So when you swap in ma->masks[ma_count - 1], don't the mask cache entries
> whose 'mask_index == ma_count - 1' all become invalid?
Yes, a little tricky.
> Regards,
> William
>
> <snip>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH net-next v4 08/10] net: openvswitch: fix possible memleak on destroy flow-table
  2019-10-18 18:12       ` Pravin Shelar
@ 2019-10-21  5:01         ` Tonghao Zhang
  2019-10-22  6:57           ` Pravin Shelar
  0 siblings, 1 reply; 37+ messages in thread
From: Tonghao Zhang @ 2019-10-21  5:01 UTC (permalink / raw)
  To: Pravin Shelar; +Cc: Greg Rose, Linux Kernel Network Developers, ovs dev

[-- Attachment #1: Type: text/plain, Size: 3922 bytes --]

On Sat, Oct 19, 2019 at 2:12 AM Pravin Shelar <pshelar@ovn.org> wrote:
>
> On Thu, Oct 17, 2019 at 8:16 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
> >
> > On Fri, Oct 18, 2019 at 6:38 AM Pravin Shelar <pshelar@ovn.org> wrote:
> > >
> > > On Wed, Oct 16, 2019 at 5:50 AM <xiangxia.m.yue@gmail.com> wrote:
> > > >
> > > > From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > > >
> > > > When we destroy a flow table that may still contain flow masks,
> > > > release the flow-mask structs as well.
> > > >
> > > > Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > > > Tested-by: Greg Rose <gvrose8192@gmail.com>
> > > > ---
> > > >  net/openvswitch/flow_table.c | 14 +++++++++++++-
> > > >  1 file changed, 13 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
> > > > index 5df5182..d5d768e 100644
> > > > --- a/net/openvswitch/flow_table.c
> > > > +++ b/net/openvswitch/flow_table.c
> > > > @@ -295,6 +295,18 @@ static void table_instance_destroy(struct table_instance *ti,
> > > >         }
> > > >  }
> > > >
> > > > +static void tbl_mask_array_destroy(struct flow_table *tbl)
> > > > +{
> > > > +       struct mask_array *ma = ovsl_dereference(tbl->mask_array);
> > > > +       int i;
> > > > +
> > > > +       /* Free the flow masks; passing NULL to kfree_rcu() is allowed. */
> > > > +       for (i = 0; i < ma->max; i++)
> > > > +               kfree_rcu(rcu_dereference_raw(ma->masks[i]), rcu);
> > > > +
> > > > +       kfree_rcu(rcu_dereference_raw(tbl->mask_array), rcu);
> > > > +}
> > > > +
> > > >  /* No need for locking this function is called from RCU callback or
> > > >   * error path.
> > > >   */
> > > > @@ -304,7 +316,7 @@ void ovs_flow_tbl_destroy(struct flow_table *table)
> > > >         struct table_instance *ufid_ti = rcu_dereference_raw(table->ufid_ti);
> > > >
> > > >         free_percpu(table->mask_cache);
> > > > -       kfree_rcu(rcu_dereference_raw(table->mask_array), rcu);
> > > > +       tbl_mask_array_destroy(table);
> > > >         table_instance_destroy(ti, ufid_ti, false);
> > > >  }
> > >
> > > This should not be required. A mask is linked to a flow and gets
> > > released when the flow is removed.
> > > Does the memory leak occur when the OVS module is abruptly unloaded and
> > > userspace does not clean up the flow table?
> > When we destroy the OVS datapath, or the net namespace is destroyed, the
> > mask memory leak happens. The call tree:
> > ovs_exit_net/ovs_dp_cmd_del
> > -->__dp_destroy
> > -->destroy_dp_rcu
> > -->ovs_flow_tbl_destroy
> > -->table_instance_destroy (which doesn't release the mask memory, because
> > it doesn't call ovs_flow_tbl_remove()/flow_mask_remove() directly or
> > indirectly).
> >
> That's what I suggested earlier: we need to call a function similar to
> ovs_flow_tbl_remove(), and we could refactor the code to share it.
> This is better, since introducing tbl_mask_array_destroy() creates a
> dangling pointer to the mask in the sw_flow object. OVS anyway iterates
> the entire flow table to release each sw_flow in
> table_instance_destroy(), so it is natural to release the mask at that
> point, after releasing the corresponding sw_flow.
I got it, thanks. I rewrote the code; can you help me review it?
If it looks fine, I will send it in the next version.
>
>
> > One more thing: when we flush the flows, we don't flush the flow masks.
> > (If necessary, a separate patch should be sent.)
> >
> > > In that case, a better fix could be calling an ovs_flow_tbl_remove()
> > > equivalent from table_instance_destroy() while it is iterating the
> > > flow table.
> > I think operations on the flow mask and on the flow table should use
> > different APIs, for example:
> > for the flow mask, we use:
> > -tbl_mask_array_add_mask
> > -tbl_mask_array_del_mask
> > -tbl_mask_array_alloc
> > -tbl_mask_array_realloc
> > -tbl_mask_array_destroy(this patch introduce.)
> >
> > table instance:
> > -table_instance_alloc
> > -table_instance_destroy
> > ....

[-- Attachment #2: ovs-mem-leak.patch --]
[-- Type: application/octet-stream, Size: 6694 bytes --]

diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
index 5df5182..5b20793 100644
--- a/net/openvswitch/flow_table.c
+++ b/net/openvswitch/flow_table.c
@@ -257,10 +257,75 @@ static void flow_tbl_destroy_rcu_cb(struct rcu_head *rcu)
 	__table_instance_destroy(ti);
 }
 
-static void table_instance_destroy(struct table_instance *ti,
-				   struct table_instance *ufid_ti,
+static void tbl_mask_array_del_mask(struct flow_table *tbl,
+				    struct sw_flow_mask *mask)
+{
+	struct mask_array *ma = ovsl_dereference(tbl->mask_array);
+	int i, ma_count = READ_ONCE(ma->count);
+
+	/* Remove the deleted mask pointers from the array */
+	for (i = 0; i < ma_count; i++) {
+		if (mask == ovsl_dereference(ma->masks[i]))
+			goto found;
+	}
+
+	BUG();
+	return;
+
+found:
+	WRITE_ONCE(ma->count, ma_count -1);
+
+	rcu_assign_pointer(ma->masks[i], ma->masks[ma_count -1]);
+	RCU_INIT_POINTER(ma->masks[ma_count -1], NULL);
+
+	kfree_rcu(mask, rcu);
+
+	/* Shrink the mask array if necessary. */
+	if (ma->max >= (MASK_ARRAY_SIZE_MIN * 2) &&
+	    ma_count <= (ma->max / 3))
+		tbl_mask_array_realloc(tbl, ma->max / 2);
+}
+
+/* Remove 'mask' from the mask list, if it is not needed any more. */
+static void flow_mask_remove(struct flow_table *tbl, struct sw_flow_mask *mask)
+{
+	if (mask) {
+		/* ovs-lock is required to protect mask-refcount and
+		 * mask list.
+		 */
+		ASSERT_OVSL();
+		BUG_ON(!mask->ref_count);
+		mask->ref_count--;
+
+		if (!mask->ref_count)
+			tbl_mask_array_del_mask(tbl, mask);
+	}
+}
+
+static void table_instance_remove(struct flow_table *table, struct sw_flow *flow)
+{
+	struct table_instance *ti = ovsl_dereference(table->ti);
+	struct table_instance *ufid_ti = ovsl_dereference(table->ufid_ti);
+
+	BUG_ON(table->count == 0);
+	hlist_del_rcu(&flow->flow_table.node[ti->node_ver]);
+	table->count--;
+	if (ovs_identifier_is_ufid(&flow->id)) {
+		hlist_del_rcu(&flow->ufid_table.node[ufid_ti->node_ver]);
+		table->ufid_count--;
+	}
+
+	/* RCU delete the mask. 'flow->mask' is not NULLed, as it should be
+	 * accessible as long as the RCU read lock is held.
+	 */
+	flow_mask_remove(table, flow->mask);
+}
+
+static void table_instance_destroy(struct flow_table *table,
 				   bool deferred)
 {
+	struct table_instance *ti = ovsl_dereference(table->ti);
+	struct table_instance *ufid_ti = ovsl_dereference(table->ufid_ti);
 	int i;
 
 	if (!ti)
@@ -274,13 +339,9 @@ static void table_instance_destroy(struct table_instance *ti,
 		struct sw_flow *flow;
 		struct hlist_head *head = &ti->buckets[i];
 		struct hlist_node *n;
-		int ver = ti->node_ver;
-		int ufid_ver = ufid_ti->node_ver;
 
-		hlist_for_each_entry_safe(flow, n, head, flow_table.node[ver]) {
-			hlist_del_rcu(&flow->flow_table.node[ver]);
-			if (ovs_identifier_is_ufid(&flow->id))
-				hlist_del_rcu(&flow->ufid_table.node[ufid_ver]);
+		hlist_for_each_entry_safe(flow, n, head, flow_table.node[ti->node_ver]) {
+			table_instance_remove(table, flow);
 			ovs_flow_free(flow, deferred);
 		}
 	}
@@ -300,12 +361,9 @@ static void table_instance_destroy(struct table_instance *ti,
  */
 void ovs_flow_tbl_destroy(struct flow_table *table)
 {
-	struct table_instance *ti = rcu_dereference_raw(table->ti);
-	struct table_instance *ufid_ti = rcu_dereference_raw(table->ufid_ti);
-
 	free_percpu(table->mask_cache);
 	kfree_rcu(rcu_dereference_raw(table->mask_array), rcu);
-	table_instance_destroy(ti, ufid_ti, false);
+	table_instance_destroy(table, false);
 }
 
 struct sw_flow *ovs_flow_tbl_dump_next(struct table_instance *ti,
@@ -400,10 +458,9 @@ static struct table_instance *table_instance_rehash(struct table_instance *ti,
 	return new_ti;
 }
 
-int ovs_flow_tbl_flush(struct flow_table *flow_table)
+int ovs_flow_tbl_flush(struct flow_table *table)
 {
-	struct table_instance *old_ti, *new_ti;
-	struct table_instance *old_ufid_ti, *new_ufid_ti;
+	struct table_instance *new_ti, *new_ufid_ti;
 
 	new_ti = table_instance_alloc(TBL_MIN_BUCKETS);
 	if (!new_ti)
@@ -412,16 +469,12 @@ int ovs_flow_tbl_flush(struct flow_table *flow_table)
 	if (!new_ufid_ti)
 		goto err_free_ti;
 
-	old_ti = ovsl_dereference(flow_table->ti);
-	old_ufid_ti = ovsl_dereference(flow_table->ufid_ti);
+	table_instance_destroy(table, true);
 
-	rcu_assign_pointer(flow_table->ti, new_ti);
-	rcu_assign_pointer(flow_table->ufid_ti, new_ufid_ti);
-	flow_table->last_rehash = jiffies;
-	flow_table->count = 0;
-	flow_table->ufid_count = 0;
+	rcu_assign_pointer(table->ti, new_ti);
+	rcu_assign_pointer(table->ufid_ti, new_ufid_ti);
+	table->last_rehash = jiffies;
 
-	table_instance_destroy(old_ti, old_ufid_ti, true);
 	return 0;
 
 err_free_ti:
@@ -700,69 +753,10 @@ static struct table_instance *table_instance_expand(struct table_instance *ti,
 	return table_instance_rehash(ti, ti->n_buckets * 2, ufid);
 }
 
-static void tbl_mask_array_del_mask(struct flow_table *tbl,
-				    struct sw_flow_mask *mask)
-{
-	struct mask_array *ma = ovsl_dereference(tbl->mask_array);
-	int i, ma_count = READ_ONCE(ma->count);
-
-	/* Remove the deleted mask pointers from the array */
-	for (i = 0; i < ma_count; i++) {
-		if (mask == ovsl_dereference(ma->masks[i]))
-			goto found;
-	}
-
-	BUG();
-	return;
-
-found:
-	WRITE_ONCE(ma->count, ma_count -1);
-
-	rcu_assign_pointer(ma->masks[i], ma->masks[ma_count -1]);
-	RCU_INIT_POINTER(ma->masks[ma_count -1], NULL);
-
-	kfree_rcu(mask, rcu);
-
-	/* Shrink the mask array if necessary. */
-	if (ma->max >= (MASK_ARRAY_SIZE_MIN * 2) &&
-	    ma_count <= (ma->max / 3))
-		tbl_mask_array_realloc(tbl, ma->max / 2);
-}
-
-/* Remove 'mask' from the mask list, if it is not needed any more. */
-static void flow_mask_remove(struct flow_table *tbl, struct sw_flow_mask *mask)
-{
-	if (mask) {
-		/* ovs-lock is required to protect mask-refcount and
-		 * mask list.
-		 */
-		ASSERT_OVSL();
-		BUG_ON(!mask->ref_count);
-		mask->ref_count--;
-
-		if (!mask->ref_count)
-			tbl_mask_array_del_mask(tbl, mask);
-	}
-}
-
 /* Must be called with OVS mutex held. */
 void ovs_flow_tbl_remove(struct flow_table *table, struct sw_flow *flow)
 {
-	struct table_instance *ti = ovsl_dereference(table->ti);
-	struct table_instance *ufid_ti = ovsl_dereference(table->ufid_ti);
-
-	BUG_ON(table->count == 0);
-	hlist_del_rcu(&flow->flow_table.node[ti->node_ver]);
-	table->count--;
-	if (ovs_identifier_is_ufid(&flow->id)) {
-		hlist_del_rcu(&flow->ufid_table.node[ufid_ti->node_ver]);
-		table->ufid_count--;
-	}
-
-	/* RCU delete the mask. 'flow->mask' is not NULLed, as it should be
-	 * accessible as long as the RCU read lock is held.
-	 */
-	flow_mask_remove(table, flow->mask);
+	table_instance_remove(table, flow);
 }
 
 static struct sw_flow_mask *mask_alloc(void)

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [ovs-dev] [PATCH net-next v4 00/10] optimize openvswitch flow looking up
  2019-10-15 10:30 [PATCH net-next v4 00/10] optimize openvswitch flow looking up xiangxia.m.yue
                   ` (10 preceding siblings ...)
  2019-10-17 19:22 ` [PATCH net-next v4 00/10] optimize openvswitch flow looking up David Miller
@ 2019-10-21 17:14 ` William Tu
  2019-10-22  1:16   ` Tonghao Zhang
  11 siblings, 1 reply; 37+ messages in thread
From: William Tu @ 2019-10-21 17:14 UTC (permalink / raw)
  To: xiangxia.m.yue
  Cc: Greg Rose, pravin shelar, <dev@openvswitch.org>,
	Linux Kernel Network Developers

On Wed, Oct 16, 2019 at 5:50 AM <xiangxia.m.yue@gmail.com> wrote:
>
> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>
> This patch series optimizes openvswitch for performance and simplifies
> the code.
>
> Patches 1, 2, 4: port Pravin B Shelar's patches to
> Linux upstream with small changes.

btw, should we keep Pravin as the author of the above three patches?

Regards,
William

>
> Patch 5, 6, 7: Optimize the flow looking up and
> simplify the flow hash.
>
> Patch 8, 9: are bugfix.
>
> The performance test is on Intel Xeon E5-2630 v4.
> The test topology is show as below:
>
> +-----------------------------------+
> |   +---------------------------+   |
> |   | eth0   ovs-switch    eth1 |   | Host0
> |   +---------------------------+   |
> +-----------------------------------+
>       ^                       |
>       |                       |
>       |                       |
>       |                       |
>       |                       v
> +-----+----+             +----+-----+
> | netperf  | Host1       | netserver| Host2
> +----------+             +----------+
>
> We use netperf send the 64B packets, and insert 255+ flow-mask:
> $ ovs-dpctl add-flow ovs-switch "in_port(1),eth(dst=00:01:00:00:00:00/ff:ff:ff:ff:ff:01),eth_type(0x0800),ipv4(frag=no)" 2
> ...
> $ ovs-dpctl add-flow ovs-switch "in_port(1),eth(dst=00:ff:00:00:00:00/ff:ff:ff:ff:ff:ff),eth_type(0x0800),ipv4(frag=no)" 2
> $
> $ netperf -t UDP_STREAM -H 2.2.2.200 -l 40 -- -m 18
>
> * Without series patch, throughput 8.28Mbps
> * With series patch, throughput 46.05Mbps
>
> v3->v4:
> access ma->count with READ_ONCE/WRITE_ONCE API. More information,
> see patch 5 comments.
>
> v2->v3:
> update ma point when realloc mask_array in patch 5.
>
> v1->v2:
> use kfree_rcu instead of call_rcu
>
> Tonghao Zhang (10):
>   net: openvswitch: add flow-mask cache for performance
>   net: openvswitch: convert mask list in mask array
>   net: openvswitch: shrink the mask array if necessary
>   net: openvswitch: optimize flow mask cache hash collision
>   net: openvswitch: optimize flow-mask looking up
>   net: openvswitch: simplify the flow_hash
>   net: openvswitch: add likely in flow_lookup
>   net: openvswitch: fix possible memleak on destroy flow-table
>   net: openvswitch: don't unlock mutex when changing the user_features
>     fails
>   net: openvswitch: simplify the ovs_dp_cmd_new
>
>  net/openvswitch/datapath.c   |  65 +++++----
>  net/openvswitch/flow.h       |   1 -
>  net/openvswitch/flow_table.c | 316 +++++++++++++++++++++++++++++++++++++------
>  net/openvswitch/flow_table.h |  19 ++-
>  4 files changed, 329 insertions(+), 72 deletions(-)
>
> --
> 1.8.3.1
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [ovs-dev] [PATCH net-next v4 05/10] net: openvswitch: optimize flow-mask looking up
  2019-10-21  4:51     ` Tonghao Zhang
@ 2019-10-21 17:58       ` William Tu
  0 siblings, 0 replies; 37+ messages in thread
From: William Tu @ 2019-10-21 17:58 UTC (permalink / raw)
  To: Tonghao Zhang
  Cc: Greg Rose, pravin shelar, <dev@openvswitch.org>,
	Linux Kernel Network Developers

> > Hi Tonghao,
> >
> > Does this improve performance? After all, the original code simply
> > checks whether the mask is NULL, then goes to the next mask.
> I tested the performance with the mask cache disabled, using
> dpdk-pktgen to generate packets. The test OVS flows:
> ovs-dpctl add-dp system@ovs-system
> ovs-dpctl add-if system@ovs-system eth6
> ovs-dpctl add-if system@ovs-system eth7
>
> for m in $(seq 1 100 | xargs printf '%.2x\n'); do
>        ovs-dpctl add-flow ovs-system
> "in_port(1),eth(dst=00:$m:00:00:00:00/ff:ff:ff:ff:ff:$m),eth_type(0x0800),ipv4(frag=no)"
> 2
> done
>
> ovs-dpctl add-flow ovs-system
> "in_port(1),eth(dst=98:03:9b:6e:4a:f5/ff:ff:ff:ff:ff:ff),eth_type(0x0800),ipv4(frag=no)"
> 2
> ovs-dpctl add-flow ovs-system
> "in_port(2),eth(dst=98:03:9b:6e:4a:f4/ff:ff:ff:ff:ff:ff),eth_type(0x0800),ipv4(frag=no)"
> 1
>
> for m in $(seq 101 160 | xargs printf '%.2x\n'); do
>         ovs-dpctl add-flow ovs-system
> "in_port(1),eth(dst=00:$m:00:00:00:00/ff:ff:ff:ff:ff:$m),eth_type(0x0800),ipv4(frag=no)"
> 2
> done
>
> for m in $(seq 1 100 | xargs printf '%.2x\n'); do
>          ovs-dpctl del-flow ovs-system
> "in_port(1),eth(dst=00:$m:00:00:00:00/ff:ff:ff:ff:ff:$m),eth_type(0x0800),ipv4(frag=no)"
> done
>
> Without this patch: 982481 pps (64B)
> With this patch: 1112495 pps (64B), about a 13% improvement
>

Hi Tonghao,

Thanks for doing the measurement.
Based on the result (skipping 100 NULL mask lookups for a 13%
improvement), and given the additional overhead of mask cache entries
being invalidated while refilling these 100 gaps, I'd argue that this
patch is not necessary. But let's wait for others' comments.

Regards,
William

> > However, with your patch, doesn't this invalidate the mask cache entries
> > which point to the "M" you swap to the front? See my comments inline.
> >
> > >
> > > Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > > Tested-by: Greg Rose <gvrose8192@gmail.com>
> > > ---

<snip>

> > >  static struct table_instance *table_instance_expand(struct table_instance *ti,
> > > @@ -704,21 +704,33 @@ static struct table_instance *table_instance_expand(struct table_instance *ti,
> > >         return table_instance_rehash(ti, ti->n_buckets * 2, ufid);
> > >  }
> > >
> > > -static void tbl_mask_array_delete_mask(struct mask_array *ma,
> > > -                                      struct sw_flow_mask *mask)
> > > +static void tbl_mask_array_del_mask(struct flow_table *tbl,
> > > +                                   struct sw_flow_mask *mask)
> > >  {
> > > -       int i;
> > > +       struct mask_array *ma = ovsl_dereference(tbl->mask_array);
> > > +       int i, ma_count = READ_ONCE(ma->count);
> > >
> > >         /* Remove the deleted mask pointers from the array */
> > > -       for (i = 0; i < ma->max; i++) {
> > > -               if (mask == ovsl_dereference(ma->masks[i])) {
> > > -                       RCU_INIT_POINTER(ma->masks[i], NULL);
> > > -                       ma->count--;
> > > -                       kfree_rcu(mask, rcu);
> > > -                       return;
> > > -               }
> > > +       for (i = 0; i < ma_count; i++) {
> > > +               if (mask == ovsl_dereference(ma->masks[i]))
> > > +                       goto found;
> > >         }
> > > +
> > >         BUG();
> > > +       return;
> > > +
> > > +found:
> > > +       WRITE_ONCE(ma->count, ma_count -1);
> > > +
> > > +       rcu_assign_pointer(ma->masks[i], ma->masks[ma_count -1]);
> > > +       RCU_INIT_POINTER(ma->masks[ma_count -1], NULL);
> >
> > So when you swap in ma->masks[ma_count - 1], don't the mask cache entries
> > whose 'mask_index == ma_count - 1' all become invalid?
> Yes, a little tricky.
> > Regards,
> > William
> >

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [ovs-dev] [PATCH net-next v4 00/10] optimize openvswitch flow looking up
  2019-10-21 17:14 ` [ovs-dev] " William Tu
@ 2019-10-22  1:16   ` Tonghao Zhang
  2019-10-22 15:44     ` William Tu
  0 siblings, 1 reply; 37+ messages in thread
From: Tonghao Zhang @ 2019-10-22  1:16 UTC (permalink / raw)
  To: William Tu
  Cc: Greg Rose, pravin shelar, <dev@openvswitch.org>,
	Linux Kernel Network Developers

On Tue, Oct 22, 2019 at 1:14 AM William Tu <u9012063@gmail.com> wrote:
>
> On Wed, Oct 16, 2019 at 5:50 AM <xiangxia.m.yue@gmail.com> wrote:
> >
> > From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> >
> > This patch series optimizes openvswitch for performance and simplifies
> > the code.
> >
> > Patches 1, 2, 4: port Pravin B Shelar's patches to
> > Linux upstream with small changes.
>
> btw, should we keep Pravin as the author of the above three patches?
Agreed, but how can I do that? Should these patches be sent by Pravin?
> Regards,
> William
>
> >
> > Patch 5, 6, 7: Optimize the flow looking up and
> > simplify the flow hash.
> >
> > Patch 8, 9: are bugfix.
> >
> > The performance test is on Intel Xeon E5-2630 v4.
> > The test topology is show as below:
> >
> > +-----------------------------------+
> > |   +---------------------------+   |
> > |   | eth0   ovs-switch    eth1 |   | Host0
> > |   +---------------------------+   |
> > +-----------------------------------+
> >       ^                       |
> >       |                       |
> >       |                       |
> >       |                       |
> >       |                       v
> > +-----+----+             +----+-----+
> > | netperf  | Host1       | netserver| Host2
> > +----------+             +----------+
> >
> > We use netperf send the 64B packets, and insert 255+ flow-mask:
> > $ ovs-dpctl add-flow ovs-switch "in_port(1),eth(dst=00:01:00:00:00:00/ff:ff:ff:ff:ff:01),eth_type(0x0800),ipv4(frag=no)" 2
> > ...
> > $ ovs-dpctl add-flow ovs-switch "in_port(1),eth(dst=00:ff:00:00:00:00/ff:ff:ff:ff:ff:ff),eth_type(0x0800),ipv4(frag=no)" 2
> > $
> > $ netperf -t UDP_STREAM -H 2.2.2.200 -l 40 -- -m 18
> >
> > * Without series patch, throughput 8.28Mbps
> > * With series patch, throughput 46.05Mbps
> >
> > v3->v4:
> > access ma->count with READ_ONCE/WRITE_ONCE API. More information,
> > see patch 5 comments.
> >
> > v2->v3:
> > update ma point when realloc mask_array in patch 5.
> >
> > v1->v2:
> > use kfree_rcu instead of call_rcu
> >
> > Tonghao Zhang (10):
> >   net: openvswitch: add flow-mask cache for performance
> >   net: openvswitch: convert mask list in mask array
> >   net: openvswitch: shrink the mask array if necessary
> >   net: openvswitch: optimize flow mask cache hash collision
> >   net: openvswitch: optimize flow-mask looking up
> >   net: openvswitch: simplify the flow_hash
> >   net: openvswitch: add likely in flow_lookup
> >   net: openvswitch: fix possible memleak on destroy flow-table
> >   net: openvswitch: don't unlock mutex when changing the user_features
> >     fails
> >   net: openvswitch: simplify the ovs_dp_cmd_new
> >
> >  net/openvswitch/datapath.c   |  65 +++++----
> >  net/openvswitch/flow.h       |   1 -
> >  net/openvswitch/flow_table.c | 316 +++++++++++++++++++++++++++++++++++++------
> >  net/openvswitch/flow_table.h |  19 ++-
> >  4 files changed, 329 insertions(+), 72 deletions(-)
> >
> > --
> > 1.8.3.1
> >
> > _______________________________________________
> > dev mailing list
> > dev@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH net-next v4 08/10] net: openvswitch: fix possible memleak on destroy flow-table
  2019-10-21  5:01         ` Tonghao Zhang
@ 2019-10-22  6:57           ` Pravin Shelar
  2019-10-23  2:35             ` Tonghao Zhang
  0 siblings, 1 reply; 37+ messages in thread
From: Pravin Shelar @ 2019-10-22  6:57 UTC (permalink / raw)
  To: Tonghao Zhang; +Cc: Greg Rose, Linux Kernel Network Developers, ovs dev

On Sun, Oct 20, 2019 at 10:02 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
>
> On Sat, Oct 19, 2019 at 2:12 AM Pravin Shelar <pshelar@ovn.org> wrote:
> >
> > On Thu, Oct 17, 2019 at 8:16 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
> > >
> > > On Fri, Oct 18, 2019 at 6:38 AM Pravin Shelar <pshelar@ovn.org> wrote:
> > > >
> > > > On Wed, Oct 16, 2019 at 5:50 AM <xiangxia.m.yue@gmail.com> wrote:
> > > > >
> > > > > From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > > > >
> > > > > When we destroy the flow tables, they may still contain a flow_mask,
> > > > > so release the flow-mask struct as well.
> > > > >
> > > > > Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > > > > Tested-by: Greg Rose <gvrose8192@gmail.com>
> > > > > ---
> > > > >  net/openvswitch/flow_table.c | 14 +++++++++++++-
> > > > >  1 file changed, 13 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
> > > > > index 5df5182..d5d768e 100644
> > > > > --- a/net/openvswitch/flow_table.c
> > > > > +++ b/net/openvswitch/flow_table.c
> > > > > @@ -295,6 +295,18 @@ static void table_instance_destroy(struct table_instance *ti,
> > > > >         }
> > > > >  }
> > > > >
> > > > > +static void tbl_mask_array_destroy(struct flow_table *tbl)
> > > > > +{
> > > > > +       struct mask_array *ma = ovsl_dereference(tbl->mask_array);
> > > > > +       int i;
> > > > > +
> > > > > +       /* Free the flow masks; passing NULL to kfree_rcu() is allowed. */
> > > > > +       for (i = 0; i < ma->max; i++)
> > > > > +               kfree_rcu(rcu_dereference_raw(ma->masks[i]), rcu);
> > > > > +
> > > > > +       kfree_rcu(rcu_dereference_raw(tbl->mask_array), rcu);
> > > > > +}
> > > > > +
> > > > >  /* No need for locking this function is called from RCU callback or
> > > > >   * error path.
> > > > >   */
> > > > > @@ -304,7 +316,7 @@ void ovs_flow_tbl_destroy(struct flow_table *table)
> > > > >         struct table_instance *ufid_ti = rcu_dereference_raw(table->ufid_ti);
> > > > >
> > > > >         free_percpu(table->mask_cache);
> > > > > -       kfree_rcu(rcu_dereference_raw(table->mask_array), rcu);
> > > > > +       tbl_mask_array_destroy(table);
> > > > >         table_instance_destroy(ti, ufid_ti, false);
> > > > >  }
> > > >
> > > > This should not be required. mask is linked to a flow and gets
> > > > released when flow is removed.
> > > > Does the memory leak occur when OVS module is abruptly unloaded and
> > > > userspace does not cleanup flow table?
> > > When we destroy the ovs datapath or the net namespace is destroyed, the
> > > mask memory is leaked. The call tree:
> > > ovs_exit_net/ovs_dp_cmd_del
> > > -->__dp_destroy
> > > -->destroy_dp_rcu
> > > -->ovs_flow_tbl_destroy
> > > -->table_instance_destroy (which doesn't release the mask memory,
> > > because it doesn't call ovs_flow_tbl_remove/flow_mask_remove directly
> > > or indirectly).
> > >
> > That's what I suggested earlier: we need to call a function similar to
> > ovs_flow_tbl_remove(); we could refactor the code to reuse it. This
> > approach is better because introducing tbl_mask_array_destroy() creates
> > a dangling pointer to the mask in the sw-flow object. OVS anyway
> > iterates the entire flow table to release each sw-flow in
> > table_instance_destroy(), so it is natural to release the mask at that
> > point, after releasing the corresponding sw-flow.
> I got it, thanks. I have rewritten the code; can you help me review it?
> If it looks fine, I will send it in the next version.
> >
> >
Sure, I can review it. Can you send the patch inline in the mail?
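One way to send it inline (just a suggestion; the list address and file
name below are placeholders) is git send-email:

$ git send-email --to=dev@openvswitch.org --in-reply-to='<message-id>' 0008-*.patch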

Thanks.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [ovs-dev] [PATCH net-next v4 00/10] optimize openvswitch flow looking up
  2019-10-22  1:16   ` Tonghao Zhang
@ 2019-10-22 15:44     ` William Tu
  0 siblings, 0 replies; 37+ messages in thread
From: William Tu @ 2019-10-22 15:44 UTC (permalink / raw)
  To: Tonghao Zhang
  Cc: Greg Rose, pravin shelar, <dev@openvswitch.org>,
	Linux Kernel Network Developers

On Mon, Oct 21, 2019 at 6:16 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
>
> On Tue, Oct 22, 2019 at 1:14 AM William Tu <u9012063@gmail.com> wrote:
> >
> > On Wed, Oct 16, 2019 at 5:50 AM <xiangxia.m.yue@gmail.com> wrote:
> > >
> > > From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > >
> > > This series patch optimize openvswitch for performance or simplify
> > > codes.
> > >
> > > Patch 1, 2, 4: Port Pravin B Shelar patches to
> > > linux upstream with little changes.
> >
> > btw, should we keep Pravin as the author of the above three patches?
> Agree, but how can I do that? Should these patches be sent by Pravin?

You can send the patch yourself and use
git commit --amend --author=""
to change the author.
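
For example, with the author taken from this thread (adjust as needed):

$ git commit --amend --author="Pravin B Shelar <pshelar@ovn.org>"
$ git log -1 --format='%an <%ae>'    # verify the author field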

William

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH net-next v4 08/10] net: openvswitch: fix possible memleak on destroy flow-table
  2019-10-22  6:57           ` Pravin Shelar
@ 2019-10-23  2:35             ` Tonghao Zhang
  2019-10-24  7:14               ` Pravin Shelar
  0 siblings, 1 reply; 37+ messages in thread
From: Tonghao Zhang @ 2019-10-23  2:35 UTC (permalink / raw)
  To: Pravin Shelar; +Cc: Greg Rose, Linux Kernel Network Developers, ovs dev

On Tue, Oct 22, 2019 at 2:58 PM Pravin Shelar <pshelar@ovn.org> wrote:
>
> On Sun, Oct 20, 2019 at 10:02 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
> >
> > On Sat, Oct 19, 2019 at 2:12 AM Pravin Shelar <pshelar@ovn.org> wrote:
> > >
> > > On Thu, Oct 17, 2019 at 8:16 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
> > > >
> > > > On Fri, Oct 18, 2019 at 6:38 AM Pravin Shelar <pshelar@ovn.org> wrote:
> > > > >
> > > > > On Wed, Oct 16, 2019 at 5:50 AM <xiangxia.m.yue@gmail.com> wrote:
> > > > > >
> > > > > > From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > > > > >
> > > > > > When we destroy the flow tables, they may still contain a flow_mask,
> > > > > > so release the flow-mask struct as well.
> > > > > >
> > > > > > Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > > > > > Tested-by: Greg Rose <gvrose8192@gmail.com>
> > > > > > ---
> > > > > >  net/openvswitch/flow_table.c | 14 +++++++++++++-
> > > > > >  1 file changed, 13 insertions(+), 1 deletion(-)
> > > > > >
> > > > > > diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
> > > > > > index 5df5182..d5d768e 100644
> > > > > > --- a/net/openvswitch/flow_table.c
> > > > > > +++ b/net/openvswitch/flow_table.c
> > > > > > @@ -295,6 +295,18 @@ static void table_instance_destroy(struct table_instance *ti,
> > > > > >         }
> > > > > >  }
> > > > > >
> > > > > > +static void tbl_mask_array_destroy(struct flow_table *tbl)
> > > > > > +{
> > > > > > +       struct mask_array *ma = ovsl_dereference(tbl->mask_array);
> > > > > > +       int i;
> > > > > > +
> > > > > > +       /* Free the flow masks; passing NULL to kfree_rcu() is allowed. */
> > > > > > +       for (i = 0; i < ma->max; i++)
> > > > > > +               kfree_rcu(rcu_dereference_raw(ma->masks[i]), rcu);
> > > > > > +
> > > > > > +       kfree_rcu(rcu_dereference_raw(tbl->mask_array), rcu);
> > > > > > +}
> > > > > > +
> > > > > >  /* No need for locking this function is called from RCU callback or
> > > > > >   * error path.
> > > > > >   */
> > > > > > @@ -304,7 +316,7 @@ void ovs_flow_tbl_destroy(struct flow_table *table)
> > > > > >         struct table_instance *ufid_ti = rcu_dereference_raw(table->ufid_ti);
> > > > > >
> > > > > >         free_percpu(table->mask_cache);
> > > > > > -       kfree_rcu(rcu_dereference_raw(table->mask_array), rcu);
> > > > > > +       tbl_mask_array_destroy(table);
> > > > > >         table_instance_destroy(ti, ufid_ti, false);
> > > > > >  }
> > > > >
> > > > > This should not be required. mask is linked to a flow and gets
> > > > > released when flow is removed.
> > > > > Does the memory leak occur when OVS module is abruptly unloaded and
> > > > > userspace does not cleanup flow table?
> > > > When we destroy the ovs datapath or the net namespace is destroyed, the
> > > > mask memory is leaked. The call tree:
> > > > ovs_exit_net/ovs_dp_cmd_del
> > > > -->__dp_destroy
> > > > -->destroy_dp_rcu
> > > > -->ovs_flow_tbl_destroy
> > > > -->table_instance_destroy (which doesn't release the mask memory,
> > > > because it doesn't call ovs_flow_tbl_remove/flow_mask_remove directly
> > > > or indirectly).
> > > >
> > > That's what I suggested earlier: we need to call a function similar to
> > > ovs_flow_tbl_remove(); we could refactor the code to reuse it. This
> > > approach is better because introducing tbl_mask_array_destroy() creates
> > > a dangling pointer to the mask in the sw-flow object. OVS anyway
> > > iterates the entire flow table to release each sw-flow in
> > > table_instance_destroy(), so it is natural to release the mask at that
> > > point, after releasing the corresponding sw-flow.
> > I got it, thanks. I have rewritten the code; can you help me review it?
> > If it looks fine, I will send it in the next version.
> > >
> > >
> Sure, I can review it. Can you send the patch inline in the mail?
>
> Thanks.
diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
index 5df5182..5b20793 100644
--- a/net/openvswitch/flow_table.c
+++ b/net/openvswitch/flow_table.c
@@ -257,10 +257,75 @@ static void flow_tbl_destroy_rcu_cb(struct rcu_head *rcu)
        __table_instance_destroy(ti);
 }

-static void table_instance_destroy(struct table_instance *ti,
-                                  struct table_instance *ufid_ti,
+static void tbl_mask_array_del_mask(struct flow_table *tbl,
+                                   struct sw_flow_mask *mask)
+{
+       struct mask_array *ma = ovsl_dereference(tbl->mask_array);
+       int i, ma_count = READ_ONCE(ma->count);
+
+       /* Remove the deleted mask pointers from the array */
+       for (i = 0; i < ma_count; i++) {
+               if (mask == ovsl_dereference(ma->masks[i]))
+                       goto found;
+       }
+
+       BUG();
+       return;
+
+found:
+       WRITE_ONCE(ma->count, ma_count -1);
+
+       rcu_assign_pointer(ma->masks[i], ma->masks[ma_count -1]);
+       RCU_INIT_POINTER(ma->masks[ma_count -1], NULL);
+
+       kfree_rcu(mask, rcu);
+
+       /* Shrink the mask array if necessary. */
+       if (ma->max >= (MASK_ARRAY_SIZE_MIN * 2) &&
+           ma_count <= (ma->max / 3))
+               tbl_mask_array_realloc(tbl, ma->max / 2);
+}
+
+/* Remove 'mask' from the mask list, if it is not needed any more. */
+static void flow_mask_remove(struct flow_table *tbl, struct sw_flow_mask *mask)
+{
+       if (mask) {
+               /* ovs-lock is required to protect mask-refcount and
+                * mask list.
+                */
+               ASSERT_OVSL();
+               BUG_ON(!mask->ref_count);
+               mask->ref_count--;
+
+               if (!mask->ref_count)
+                       tbl_mask_array_del_mask(tbl, mask);
+       }
+}
+
+static void table_instance_remove(struct flow_table *table, struct
sw_flow *flow)
+{
+       struct table_instance *ti = ovsl_dereference(table->ti);
+       struct table_instance *ufid_ti = ovsl_dereference(table->ufid_ti);
+
+       BUG_ON(table->count == 0);
+       hlist_del_rcu(&flow->flow_table.node[ti->node_ver]);
+       table->count--;
+       if (ovs_identifier_is_ufid(&flow->id)) {
+               hlist_del_rcu(&flow->ufid_table.node[ufid_ti->node_ver]);
+               table->ufid_count--;
+       }
+
+       /* RCU delete the mask. 'flow->mask' is not NULLed, as it should be
+        * accessible as long as the RCU read lock is held.
+        */
+       flow_mask_remove(table, flow->mask);
+}
+
+static void table_instance_destroy(struct flow_table *table,
                                   bool deferred)
 {
+       struct table_instance *ti = ovsl_dereference(table->ti);
+       struct table_instance *ufid_ti = ovsl_dereference(table->ufid_ti);
        int i;

        if (!ti)
@@ -274,13 +339,9 @@ static void table_instance_destroy(struct table_instance *ti,
                struct sw_flow *flow;
                struct hlist_head *head = &ti->buckets[i];
                struct hlist_node *n;
-               int ver = ti->node_ver;
-               int ufid_ver = ufid_ti->node_ver;

-               hlist_for_each_entry_safe(flow, n, head, flow_table.node[ver]) {
-                       hlist_del_rcu(&flow->flow_table.node[ver]);
-                       if (ovs_identifier_is_ufid(&flow->id))
-                               hlist_del_rcu(&flow->ufid_table.node[ufid_ver]);
+               hlist_for_each_entry_safe(flow, n, head,
+                                         flow_table.node[ti->node_ver]) {
+                       table_instance_remove(table, flow);
                        ovs_flow_free(flow, deferred);
                }
        }
@@ -300,12 +361,9 @@ static void table_instance_destroy(struct table_instance *ti,
  */
 void ovs_flow_tbl_destroy(struct flow_table *table)
 {
-       struct table_instance *ti = rcu_dereference_raw(table->ti);
-       struct table_instance *ufid_ti = rcu_dereference_raw(table->ufid_ti);
-
        free_percpu(table->mask_cache);
        kfree_rcu(rcu_dereference_raw(table->mask_array), rcu);
-       table_instance_destroy(ti, ufid_ti, false);
+       table_instance_destroy(table, false);
 }

 struct sw_flow *ovs_flow_tbl_dump_next(struct table_instance *ti,
@@ -400,10 +458,9 @@ static struct table_instance *table_instance_rehash(struct table_instance *ti,
        return new_ti;
 }

-int ovs_flow_tbl_flush(struct flow_table *flow_table)
+int ovs_flow_tbl_flush(struct flow_table *table)
 {
-       struct table_instance *old_ti, *new_ti;
-       struct table_instance *old_ufid_ti, *new_ufid_ti;
+       struct table_instance *new_ti, *new_ufid_ti;

        new_ti = table_instance_alloc(TBL_MIN_BUCKETS);
        if (!new_ti)
@@ -412,16 +469,12 @@ int ovs_flow_tbl_flush(struct flow_table *flow_table)
        if (!new_ufid_ti)
                goto err_free_ti;

-       old_ti = ovsl_dereference(flow_table->ti);
-       old_ufid_ti = ovsl_dereference(flow_table->ufid_ti);
+       table_instance_destroy(table, true);

-       rcu_assign_pointer(flow_table->ti, new_ti);
-       rcu_assign_pointer(flow_table->ufid_ti, new_ufid_ti);
-       flow_table->last_rehash = jiffies;
-       flow_table->count = 0;
-       flow_table->ufid_count = 0;
+       rcu_assign_pointer(table->ti, new_ti);
+       rcu_assign_pointer(table->ufid_ti, new_ufid_ti);
+       table->last_rehash = jiffies;

-       table_instance_destroy(old_ti, old_ufid_ti, true);
        return 0;

 err_free_ti:
@@ -700,69 +753,10 @@ static struct table_instance *table_instance_expand(struct table_instance *ti,
        return table_instance_rehash(ti, ti->n_buckets * 2, ufid);
 }

-static void tbl_mask_array_del_mask(struct flow_table *tbl,
-                                   struct sw_flow_mask *mask)
-{
-       struct mask_array *ma = ovsl_dereference(tbl->mask_array);
-       int i, ma_count = READ_ONCE(ma->count);
-
-       /* Remove the deleted mask pointers from the array */
-       for (i = 0; i < ma_count; i++) {
-               if (mask == ovsl_dereference(ma->masks[i]))
-                       goto found;
-       }
-
-       BUG();
-       return;
-
-found:
-       WRITE_ONCE(ma->count, ma_count -1);
-
-       rcu_assign_pointer(ma->masks[i], ma->masks[ma_count -1]);
-       RCU_INIT_POINTER(ma->masks[ma_count -1], NULL);
-
-       kfree_rcu(mask, rcu);
-
-       /* Shrink the mask array if necessary. */
-       if (ma->max >= (MASK_ARRAY_SIZE_MIN * 2) &&
-           ma_count <= (ma->max / 3))
-               tbl_mask_array_realloc(tbl, ma->max / 2);
-}
-
-/* Remove 'mask' from the mask list, if it is not needed any more. */
-static void flow_mask_remove(struct flow_table *tbl, struct sw_flow_mask *mask)
-{
-       if (mask) {
-               /* ovs-lock is required to protect mask-refcount and
-                * mask list.
-                */
-               ASSERT_OVSL();
-               BUG_ON(!mask->ref_count);
-               mask->ref_count--;
-
-               if (!mask->ref_count)
-                       tbl_mask_array_del_mask(tbl, mask);
-       }
-}
-
 /* Must be called with OVS mutex held. */
 void ovs_flow_tbl_remove(struct flow_table *table, struct sw_flow *flow)
 {
-       struct table_instance *ti = ovsl_dereference(table->ti);
-       struct table_instance *ufid_ti = ovsl_dereference(table->ufid_ti);
-
-       BUG_ON(table->count == 0);
-       hlist_del_rcu(&flow->flow_table.node[ti->node_ver]);
-       table->count--;
-       if (ovs_identifier_is_ufid(&flow->id)) {
-               hlist_del_rcu(&flow->ufid_table.node[ufid_ti->node_ver]);
-               table->ufid_count--;
-       }
-
-       /* RCU delete the mask. 'flow->mask' is not NULLed, as it should be
-        * accessible as long as the RCU read lock is held.
-        */
-       flow_mask_remove(table, flow->mask);
+       table_instance_remove(table, flow);
 }

 static struct sw_flow_mask *mask_alloc(void)

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH net-next v4 08/10] net: openvswitch: fix possible memleak on destroy flow-table
  2019-10-23  2:35             ` Tonghao Zhang
@ 2019-10-24  7:14               ` Pravin Shelar
  2019-10-28  6:49                 ` Tonghao Zhang
  0 siblings, 1 reply; 37+ messages in thread
From: Pravin Shelar @ 2019-10-24  7:14 UTC (permalink / raw)
  To: Tonghao Zhang; +Cc: Greg Rose, Linux Kernel Network Developers, ovs dev

On Tue, Oct 22, 2019 at 7:35 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
>
> On Tue, Oct 22, 2019 at 2:58 PM Pravin Shelar <pshelar@ovn.org> wrote:
> >
...

> > > >
> > Sure, I can review it. Can you send the patch inline in the mail?
> >
> > Thanks.
> diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
> index 5df5182..5b20793 100644
> --- a/net/openvswitch/flow_table.c
> +++ b/net/openvswitch/flow_table.c
> @@ -257,10 +257,75 @@ static void flow_tbl_destroy_rcu_cb(struct rcu_head *rcu)
>         __table_instance_destroy(ti);
>  }
>
> -static void table_instance_destroy(struct table_instance *ti,
> -                                  struct table_instance *ufid_ti,
> +static void tbl_mask_array_del_mask(struct flow_table *tbl,
> +                                   struct sw_flow_mask *mask)
> +{
> +       struct mask_array *ma = ovsl_dereference(tbl->mask_array);
> +       int i, ma_count = READ_ONCE(ma->count);
> +
> +       /* Remove the deleted mask pointers from the array */
> +       for (i = 0; i < ma_count; i++) {
> +               if (mask == ovsl_dereference(ma->masks[i]))
> +                       goto found;
> +       }
> +
> +       BUG();
> +       return;
> +
> +found:
> +       WRITE_ONCE(ma->count, ma_count -1);
> +
> +       rcu_assign_pointer(ma->masks[i], ma->masks[ma_count -1]);
> +       RCU_INIT_POINTER(ma->masks[ma_count -1], NULL);
> +
> +       kfree_rcu(mask, rcu);
> +
> +       /* Shrink the mask array if necessary. */
> +       if (ma->max >= (MASK_ARRAY_SIZE_MIN * 2) &&
> +           ma_count <= (ma->max / 3))
> +               tbl_mask_array_realloc(tbl, ma->max / 2);
> +}
> +
> +/* Remove 'mask' from the mask list, if it is not needed any more. */
> +static void flow_mask_remove(struct flow_table *tbl, struct sw_flow_mask *mask)
> +{
> +       if (mask) {
> +               /* ovs-lock is required to protect mask-refcount and
> +                * mask list.
> +                */
> +               ASSERT_OVSL();
> +               BUG_ON(!mask->ref_count);
> +               mask->ref_count--;
> +
> +               if (!mask->ref_count)
> +                       tbl_mask_array_del_mask(tbl, mask);
> +       }
> +}
> +
> +static void table_instance_remove(struct flow_table *table,
> +                                  struct sw_flow *flow)
> +{
> +       struct table_instance *ti = ovsl_dereference(table->ti);
> +       struct table_instance *ufid_ti = ovsl_dereference(table->ufid_ti);
> +
> +       BUG_ON(table->count == 0);
> +       hlist_del_rcu(&flow->flow_table.node[ti->node_ver]);
> +       table->count--;
> +       if (ovs_identifier_is_ufid(&flow->id)) {
> +               hlist_del_rcu(&flow->ufid_table.node[ufid_ti->node_ver]);
> +               table->ufid_count--;
> +       }
> +
> +       /* RCU delete the mask. 'flow->mask' is not NULLed, as it should be
> +        * accessible as long as the RCU read lock is held.
> +        */
> +       flow_mask_remove(table, flow->mask);
> +}
> +
> +static void table_instance_destroy(struct flow_table *table,
>                                    bool deferred)
>  {
> +       struct table_instance *ti = ovsl_dereference(table->ti);
> +       struct table_instance *ufid_ti = ovsl_dereference(table->ufid_ti);
>         int i;
>
>         if (!ti)
> @@ -274,13 +339,9 @@ static void table_instance_destroy(struct table_instance *ti,
>                 struct sw_flow *flow;
>                 struct hlist_head *head = &ti->buckets[i];
>                 struct hlist_node *n;
> -               int ver = ti->node_ver;
> -               int ufid_ver = ufid_ti->node_ver;
>
> -               hlist_for_each_entry_safe(flow, n, head, flow_table.node[ver]) {
> -                       hlist_del_rcu(&flow->flow_table.node[ver]);
> -                       if (ovs_identifier_is_ufid(&flow->id))
> -                               hlist_del_rcu(&flow->ufid_table.node[ufid_ver]);
> +               hlist_for_each_entry_safe(flow, n, head,
> +                                         flow_table.node[ti->node_ver]) {
> +                       table_instance_remove(table, flow);
>                         ovs_flow_free(flow, deferred);
>                 }
>         }
> @@ -300,12 +361,9 @@ static void table_instance_destroy(struct table_instance *ti,
>   */
>  void ovs_flow_tbl_destroy(struct flow_table *table)
>  {
> -       struct table_instance *ti = rcu_dereference_raw(table->ti);
> -       struct table_instance *ufid_ti = rcu_dereference_raw(table->ufid_ti);
> -
>         free_percpu(table->mask_cache);
>         kfree_rcu(rcu_dereference_raw(table->mask_array), rcu);
> -       table_instance_destroy(ti, ufid_ti, false);
> +       table_instance_destroy(table, false);
>  }
>
>  struct sw_flow *ovs_flow_tbl_dump_next(struct table_instance *ti,
> @@ -400,10 +458,9 @@ static struct table_instance *table_instance_rehash(struct table_instance *ti,
>         return new_ti;
>  }
>
> -int ovs_flow_tbl_flush(struct flow_table *flow_table)
> +int ovs_flow_tbl_flush(struct flow_table *table)
>  {
> -       struct table_instance *old_ti, *new_ti;
> -       struct table_instance *old_ufid_ti, *new_ufid_ti;
> +       struct table_instance *new_ti, *new_ufid_ti;
>
>         new_ti = table_instance_alloc(TBL_MIN_BUCKETS);
>         if (!new_ti)
> @@ -412,16 +469,12 @@ int ovs_flow_tbl_flush(struct flow_table *flow_table)
>         if (!new_ufid_ti)
>                 goto err_free_ti;
>
> -       old_ti = ovsl_dereference(flow_table->ti);
> -       old_ufid_ti = ovsl_dereference(flow_table->ufid_ti);
> +       table_instance_destroy(table, true);
>
This would destroy the running table, causing unnecessary flow misses.
Let's keep the current scheme of setting up the new table before
destroying the current one.
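
Something along these lines (an untested sketch; it assumes a
table_instance_destroy() that still takes the old instances explicitly,
as the pre-patch code does) would keep lookups working during the flush:

	old_ti = ovsl_dereference(flow_table->ti);
	old_ufid_ti = ovsl_dereference(flow_table->ufid_ti);

	/* Publish the new (empty) instances first ... */
	rcu_assign_pointer(flow_table->ti, new_ti);
	rcu_assign_pointer(flow_table->ufid_ti, new_ufid_ti);
	flow_table->last_rehash = jiffies;
	flow_table->count = 0;
	flow_table->ufid_count = 0;

	/* ... then free the old ones after a grace period, so concurrent
	 * readers always see a valid table instance.
	 */
	table_instance_destroy(flow_table, old_ti, old_ufid_ti, true);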

> -       rcu_assign_pointer(flow_table->ti, new_ti);
> -       rcu_assign_pointer(flow_table->ufid_ti, new_ufid_ti);
> -       flow_table->last_rehash = jiffies;
> -       flow_table->count = 0;
> -       flow_table->ufid_count = 0;
> +       rcu_assign_pointer(table->ti, new_ti);
> +       rcu_assign_pointer(table->ufid_ti, new_ufid_ti);
> +       table->last_rehash = jiffies;
>
> -       table_instance_destroy(old_ti, old_ufid_ti, true);
>         return 0;
>
>  err_free_ti:
> @@ -700,69 +753,10 @@ static struct table_instance *table_instance_expand(struct table_instance *ti,
>         return table_instance_rehash(ti, ti->n_buckets * 2, ufid);
>  }
>
> -static void tbl_mask_array_del_mask(struct flow_table *tbl,
> -                                   struct sw_flow_mask *mask)
> -{
> -       struct mask_array *ma = ovsl_dereference(tbl->mask_array);
> -       int i, ma_count = READ_ONCE(ma->count);
> -
> -       /* Remove the deleted mask pointers from the array */
> -       for (i = 0; i < ma_count; i++) {
> -               if (mask == ovsl_dereference(ma->masks[i]))
> -                       goto found;
> -       }
> -
> -       BUG();
> -       return;
> -
> -found:
> -       WRITE_ONCE(ma->count, ma_count -1);
> -
> -       rcu_assign_pointer(ma->masks[i], ma->masks[ma_count -1]);
> -       RCU_INIT_POINTER(ma->masks[ma_count -1], NULL);
> -
> -       kfree_rcu(mask, rcu);
> -
> -       /* Shrink the mask array if necessary. */
> -       if (ma->max >= (MASK_ARRAY_SIZE_MIN * 2) &&
> -           ma_count <= (ma->max / 3))
> -               tbl_mask_array_realloc(tbl, ma->max / 2);
> -}
> -
> -/* Remove 'mask' from the mask list, if it is not needed any more. */
> -static void flow_mask_remove(struct flow_table *tbl, struct sw_flow_mask *mask)
> -{
> -       if (mask) {
> -               /* ovs-lock is required to protect mask-refcount and
> -                * mask list.
> -                */
> -               ASSERT_OVSL();
> -               BUG_ON(!mask->ref_count);
> -               mask->ref_count--;
> -
> -               if (!mask->ref_count)
> -                       tbl_mask_array_del_mask(tbl, mask);
> -       }
> -}
> -
>  /* Must be called with OVS mutex held. */
>  void ovs_flow_tbl_remove(struct flow_table *table, struct sw_flow *flow)
>  {
> -       struct table_instance *ti = ovsl_dereference(table->ti);
> -       struct table_instance *ufid_ti = ovsl_dereference(table->ufid_ti);
> -
> -       BUG_ON(table->count == 0);
> -       hlist_del_rcu(&flow->flow_table.node[ti->node_ver]);
> -       table->count--;
> -       if (ovs_identifier_is_ufid(&flow->id)) {
> -               hlist_del_rcu(&flow->ufid_table.node[ufid_ti->node_ver]);
> -               table->ufid_count--;
> -       }
> -
> -       /* RCU delete the mask. 'flow->mask' is not NULLed, as it should be
> -        * accessible as long as the RCU read lock is held.
> -        */
> -       flow_mask_remove(table, flow->mask);
> +       table_instance_remove(table, flow);
Can you just rename table_instance_remove() to ovs_flow_tbl_remove()?

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH net-next v4 08/10] net: openvswitch: fix possible memleak on destroy flow-table
  2019-10-24  7:14               ` Pravin Shelar
@ 2019-10-28  6:49                 ` Tonghao Zhang
  2019-10-29  7:37                   ` Pravin Shelar
  0 siblings, 1 reply; 37+ messages in thread
From: Tonghao Zhang @ 2019-10-28  6:49 UTC (permalink / raw)
  To: Pravin Shelar; +Cc: Greg Rose, Linux Kernel Network Developers, ovs dev

On Thu, Oct 24, 2019 at 3:14 PM Pravin Shelar <pshelar@ovn.org> wrote:
>
> On Tue, Oct 22, 2019 at 7:35 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
> >
> > On Tue, Oct 22, 2019 at 2:58 PM Pravin Shelar <pshelar@ovn.org> wrote:
> > >
> ...
>
> > > > >
> > > Sure, I can review it. Can you send the patch inline in the mail?
> > >
> > > Thanks.
> > diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
> > index 5df5182..5b20793 100644
> > --- a/net/openvswitch/flow_table.c
> > +++ b/net/openvswitch/flow_table.c
> > @@ -257,10 +257,75 @@ static void flow_tbl_destroy_rcu_cb(struct rcu_head *rcu)
> >         __table_instance_destroy(ti);
> >  }
> >
> > -static void table_instance_destroy(struct table_instance *ti,
> > -                                  struct table_instance *ufid_ti,
> > +static void tbl_mask_array_del_mask(struct flow_table *tbl,
> > +                                   struct sw_flow_mask *mask)
> > +{
> > +       struct mask_array *ma = ovsl_dereference(tbl->mask_array);
> > +       int i, ma_count = READ_ONCE(ma->count);
> > +
> > +       /* Remove the deleted mask pointers from the array */
> > +       for (i = 0; i < ma_count; i++) {
> > +               if (mask == ovsl_dereference(ma->masks[i]))
> > +                       goto found;
> > +       }
> > +
> > +       BUG();
> > +       return;
> > +
> > +found:
> > +       WRITE_ONCE(ma->count, ma_count -1);
> > +
> > +       rcu_assign_pointer(ma->masks[i], ma->masks[ma_count -1]);
> > +       RCU_INIT_POINTER(ma->masks[ma_count -1], NULL);
> > +
> > +       kfree_rcu(mask, rcu);
> > +
> > +       /* Shrink the mask array if necessary. */
> > +       if (ma->max >= (MASK_ARRAY_SIZE_MIN * 2) &&
> > +           ma_count <= (ma->max / 3))
> > +               tbl_mask_array_realloc(tbl, ma->max / 2);
> > +}
> > +
> > +/* Remove 'mask' from the mask list, if it is not needed any more. */
> > +static void flow_mask_remove(struct flow_table *tbl, struct sw_flow_mask *mask)
> > +{
> > +       if (mask) {
> > +               /* ovs-lock is required to protect mask-refcount and
> > +                * mask list.
> > +                */
> > +               ASSERT_OVSL();
> > +               BUG_ON(!mask->ref_count);
> > +               mask->ref_count--;
> > +
> > +               if (!mask->ref_count)
> > +                       tbl_mask_array_del_mask(tbl, mask);
> > +       }
> > +}
> > +
> > +static void table_instance_remove(struct flow_table *table,
> > +                                  struct sw_flow *flow)
> > +{
> > +       struct table_instance *ti = ovsl_dereference(table->ti);
> > +       struct table_instance *ufid_ti = ovsl_dereference(table->ufid_ti);
> > +
> > +       BUG_ON(table->count == 0);
> > +       hlist_del_rcu(&flow->flow_table.node[ti->node_ver]);
> > +       table->count--;
> > +       if (ovs_identifier_is_ufid(&flow->id)) {
> > +               hlist_del_rcu(&flow->ufid_table.node[ufid_ti->node_ver]);
> > +               table->ufid_count--;
> > +       }
> > +
> > +       /* RCU delete the mask. 'flow->mask' is not NULLed, as it should be
> > +        * accessible as long as the RCU read lock is held.
> > +        */
> > +       flow_mask_remove(table, flow->mask);
> > +}
> > +
> > +static void table_instance_destroy(struct flow_table *table,
> >                                    bool deferred)
> >  {
> > +       struct table_instance *ti = ovsl_dereference(table->ti);
> > +       struct table_instance *ufid_ti = ovsl_dereference(table->ufid_ti);
> >         int i;
> >
> >         if (!ti)
> > @@ -274,13 +339,9 @@ static void table_instance_destroy(struct table_instance *ti,
> >                 struct sw_flow *flow;
> >                 struct hlist_head *head = &ti->buckets[i];
> >                 struct hlist_node *n;
> > -               int ver = ti->node_ver;
> > -               int ufid_ver = ufid_ti->node_ver;
> >
> > -               hlist_for_each_entry_safe(flow, n, head, flow_table.node[ver]) {
> > -                       hlist_del_rcu(&flow->flow_table.node[ver]);
> > -                       if (ovs_identifier_is_ufid(&flow->id))
> > -                               hlist_del_rcu(&flow->ufid_table.node[ufid_ver]);
> > +               hlist_for_each_entry_safe(flow, n, head,
> > +                                         flow_table.node[ti->node_ver]) {
> > +                       table_instance_remove(table, flow);
> >                         ovs_flow_free(flow, deferred);
> >                 }
> >         }
> > @@ -300,12 +361,9 @@ static void table_instance_destroy(struct table_instance *ti,
> >   */
> >  void ovs_flow_tbl_destroy(struct flow_table *table)
> >  {
> > -       struct table_instance *ti = rcu_dereference_raw(table->ti);
> > -       struct table_instance *ufid_ti = rcu_dereference_raw(table->ufid_ti);
> > -
> >         free_percpu(table->mask_cache);
> >         kfree_rcu(rcu_dereference_raw(table->mask_array), rcu);
> > -       table_instance_destroy(ti, ufid_ti, false);
> > +       table_instance_destroy(table, false);
> >  }
> >
> >  struct sw_flow *ovs_flow_tbl_dump_next(struct table_instance *ti,
> > @@ -400,10 +458,9 @@ static struct table_instance *table_instance_rehash(struct table_instance *ti,
> >         return new_ti;
> >  }
> >
> > -int ovs_flow_tbl_flush(struct flow_table *flow_table)
> > +int ovs_flow_tbl_flush(struct flow_table *table)
> >  {
> > -       struct table_instance *old_ti, *new_ti;
> > -       struct table_instance *old_ufid_ti, *new_ufid_ti;
> > +       struct table_instance *new_ti, *new_ufid_ti;
> >
> >         new_ti = table_instance_alloc(TBL_MIN_BUCKETS);
> >         if (!new_ti)
> > @@ -412,16 +469,12 @@ int ovs_flow_tbl_flush(struct flow_table *flow_table)
> >         if (!new_ufid_ti)
> >                 goto err_free_ti;
> >
> > -       old_ti = ovsl_dereference(flow_table->ti);
> > -       old_ufid_ti = ovsl_dereference(flow_table->ufid_ti);
> > +       table_instance_destroy(table, true);
> >
> This would destroy the running table, causing unnecessary flow misses.
> Let's keep the current scheme of setting up the new table before
> destroying the current one.
>
> > -       rcu_assign_pointer(flow_table->ti, new_ti);
> > -       rcu_assign_pointer(flow_table->ufid_ti, new_ufid_ti);
> > -       flow_table->last_rehash = jiffies;
> > -       flow_table->count = 0;
> > -       flow_table->ufid_count = 0;
> > +       rcu_assign_pointer(table->ti, new_ti);
> > +       rcu_assign_pointer(table->ufid_ti, new_ufid_ti);
> > +       table->last_rehash = jiffies;
> >
> > -       table_instance_destroy(old_ti, old_ufid_ti, true);
> >         return 0;
> >
> >  err_free_ti:
> > @@ -700,69 +753,10 @@ static struct table_instance *table_instance_expand(struct table_instance *ti,
> >         return table_instance_rehash(ti, ti->n_buckets * 2, ufid);
> >  }
> >
> > -static void tbl_mask_array_del_mask(struct flow_table *tbl,
> > -                                   struct sw_flow_mask *mask)
> > -{
> > -       struct mask_array *ma = ovsl_dereference(tbl->mask_array);
> > -       int i, ma_count = READ_ONCE(ma->count);
> > -
> > -       /* Remove the deleted mask pointers from the array */
> > -       for (i = 0; i < ma_count; i++) {
> > -               if (mask == ovsl_dereference(ma->masks[i]))
> > -                       goto found;
> > -       }
> > -
> > -       BUG();
> > -       return;
> > -
> > -found:
> > -       WRITE_ONCE(ma->count, ma_count -1);
> > -
> > -       rcu_assign_pointer(ma->masks[i], ma->masks[ma_count -1]);
> > -       RCU_INIT_POINTER(ma->masks[ma_count -1], NULL);
> > -
> > -       kfree_rcu(mask, rcu);
> > -
> > -       /* Shrink the mask array if necessary. */
> > -       if (ma->max >= (MASK_ARRAY_SIZE_MIN * 2) &&
> > -           ma_count <= (ma->max / 3))
> > -               tbl_mask_array_realloc(tbl, ma->max / 2);
> > -}
> > -
> > -/* Remove 'mask' from the mask list, if it is not needed any more. */
> > -static void flow_mask_remove(struct flow_table *tbl, struct sw_flow_mask *mask)
> > -{
> > -       if (mask) {
> > -               /* ovs-lock is required to protect mask-refcount and
> > -                * mask list.
> > -                */
> > -               ASSERT_OVSL();
> > -               BUG_ON(!mask->ref_count);
> > -               mask->ref_count--;
> > -
> > -               if (!mask->ref_count)
> > -                       tbl_mask_array_del_mask(tbl, mask);
> > -       }
> > -}
> > -
> >  /* Must be called with OVS mutex held. */
> >  void ovs_flow_tbl_remove(struct flow_table *table, struct sw_flow *flow)
> >  {
> > -       struct table_instance *ti = ovsl_dereference(table->ti);
> > -       struct table_instance *ufid_ti = ovsl_dereference(table->ufid_ti);
> > -
> > -       BUG_ON(table->count == 0);
> > -       hlist_del_rcu(&flow->flow_table.node[ti->node_ver]);
> > -       table->count--;
> > -       if (ovs_identifier_is_ufid(&flow->id)) {
> > -               hlist_del_rcu(&flow->ufid_table.node[ufid_ti->node_ver]);
> > -               table->ufid_count--;
> > -       }
> > -
> > -       /* RCU delete the mask. 'flow->mask' is not NULLed, as it should be
> > -        * accessible as long as the RCU read lock is held.
> > -        */
> > -       flow_mask_remove(table, flow->mask);
> > +       table_instance_remove(table, flow);
> Can you just rename table_instance_remove() to ovs_flow_tbl_remove()?

diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
index 5df5182..4871ab8 100644
--- a/net/openvswitch/flow_table.c
+++ b/net/openvswitch/flow_table.c
@@ -210,6 +210,74 @@ static int tbl_mask_array_realloc(struct flow_table *tbl, int size)
        return 0;
 }

+static int tbl_mask_array_add_mask(struct flow_table *tbl,
+                                  struct sw_flow_mask *new)
+{
+       struct mask_array *ma = ovsl_dereference(tbl->mask_array);
+       int err, ma_count = READ_ONCE(ma->count);
+
+       if (ma_count >= ma->max) {
+               err = tbl_mask_array_realloc(tbl, ma->max +
+                                             MASK_ARRAY_SIZE_MIN);
+               if (err)
+                       return err;
+
+               ma = ovsl_dereference(tbl->mask_array);
+       }
+
+       BUG_ON(ovsl_dereference(ma->masks[ma_count]));
+
+       rcu_assign_pointer(ma->masks[ma_count], new);
+       WRITE_ONCE(ma->count, ma_count +1);
+
+       return 0;
+}
+
+static void tbl_mask_array_del_mask(struct flow_table *tbl,
+                                   struct sw_flow_mask *mask)
+{
+       struct mask_array *ma = ovsl_dereference(tbl->mask_array);
+       int i, ma_count = READ_ONCE(ma->count);
+
+       /* Remove the deleted mask pointers from the array */
+       for (i = 0; i < ma_count; i++) {
+               if (mask == ovsl_dereference(ma->masks[i]))
+                       goto found;
+       }
+
+       BUG();
+       return;
+
+found:
+       WRITE_ONCE(ma->count, ma_count -1);
+
+       rcu_assign_pointer(ma->masks[i], ma->masks[ma_count -1]);
+       RCU_INIT_POINTER(ma->masks[ma_count -1], NULL);
+
+       kfree_rcu(mask, rcu);
+
+       /* Shrink the mask array if necessary. */
+       if (ma->max >= (MASK_ARRAY_SIZE_MIN * 2) &&
+           ma_count <= (ma->max / 3))
+               tbl_mask_array_realloc(tbl, ma->max / 2);
+}
+
+/* Remove 'mask' from the mask list, if it is not needed any more. */
+static void flow_mask_remove(struct flow_table *tbl, struct sw_flow_mask *mask)
+{
+       if (mask) {
+               /* ovs-lock is required to protect mask-refcount and
+                * mask list.
+                */
+               ASSERT_OVSL();
+               BUG_ON(!mask->ref_count);
+               mask->ref_count--;
+
+               if (!mask->ref_count)
+                       tbl_mask_array_del_mask(tbl, mask);
+       }
+}
+
 int ovs_flow_tbl_init(struct flow_table *table)
 {
        struct table_instance *ti, *ufid_ti;
@@ -257,7 +325,28 @@ static void flow_tbl_destroy_rcu_cb(struct rcu_head *rcu)
        __table_instance_destroy(ti);
 }

-static void table_instance_destroy(struct table_instance *ti,
+static void table_instance_remove(struct flow_table *table,
+                                 struct table_instance *ti,
+                                 struct table_instance *ufid_ti,
+                                 struct sw_flow *flow,
+                                 bool count)
+{
+       hlist_del_rcu(&flow->flow_table.node[ti->node_ver]);
+       if (count)
+               table->count--;
+
+       if (ovs_identifier_is_ufid(&flow->id)) {
+               hlist_del_rcu(&flow->ufid_table.node[ufid_ti->node_ver]);
+
+               if (count)
+                       table->ufid_count--;
+       }
+
+       flow_mask_remove(table, flow->mask);
+}
+
+static void table_instance_destroy(struct flow_table *table,
+                                  struct table_instance *ti,
                                   struct table_instance *ufid_ti,
                                   bool deferred)
 {
@@ -274,13 +363,11 @@ static void table_instance_destroy(struct table_instance *ti,
                struct sw_flow *flow;
                struct hlist_head *head = &ti->buckets[i];
                struct hlist_node *n;
-               int ver = ti->node_ver;
-               int ufid_ver = ufid_ti->node_ver;

-               hlist_for_each_entry_safe(flow, n, head, flow_table.node[ver]) {
-                       hlist_del_rcu(&flow->flow_table.node[ver]);
-                       if (ovs_identifier_is_ufid(&flow->id))
-                               hlist_del_rcu(&flow->ufid_table.node[ufid_ver]);
+               hlist_for_each_entry_safe(flow, n, head,
+                                         flow_table.node[ti->node_ver]) {
+
+                       table_instance_remove(table, ti, ufid_ti, flow, false);
                        ovs_flow_free(flow, deferred);
                }
        }
@@ -305,7 +392,7 @@ void ovs_flow_tbl_destroy(struct flow_table *table)

        free_percpu(table->mask_cache);
        kfree_rcu(rcu_dereference_raw(table->mask_array), rcu);
-       table_instance_destroy(ti, ufid_ti, false);
+       table_instance_destroy(table, ti, ufid_ti, false);
 }

 struct sw_flow *ovs_flow_tbl_dump_next(struct table_instance *ti,
@@ -421,7 +508,7 @@ int ovs_flow_tbl_flush(struct flow_table *flow_table)
        flow_table->count = 0;
        flow_table->ufid_count = 0;

-       table_instance_destroy(old_ti, old_ufid_ti, true);
+       table_instance_destroy(flow_table, old_ti, old_ufid_ti, true);
        return 0;

 err_free_ti:
@@ -700,51 +787,6 @@ static struct table_instance *table_instance_expand(struct table_instance *ti,
        return table_instance_rehash(ti, ti->n_buckets * 2, ufid);
 }

-static void tbl_mask_array_del_mask(struct flow_table *tbl,
-                                   struct sw_flow_mask *mask)
-{
-       struct mask_array *ma = ovsl_dereference(tbl->mask_array);
-       int i, ma_count = READ_ONCE(ma->count);
-
-       /* Remove the deleted mask pointers from the array */
-       for (i = 0; i < ma_count; i++) {
-               if (mask == ovsl_dereference(ma->masks[i]))
-                       goto found;
-       }
-
-       BUG();
-       return;
-
-found:
-       WRITE_ONCE(ma->count, ma_count -1);
-
-       rcu_assign_pointer(ma->masks[i], ma->masks[ma_count -1]);
-       RCU_INIT_POINTER(ma->masks[ma_count -1], NULL);
-
-       kfree_rcu(mask, rcu);
-
-       /* Shrink the mask array if necessary. */
-       if (ma->max >= (MASK_ARRAY_SIZE_MIN * 2) &&
-           ma_count <= (ma->max / 3))
-               tbl_mask_array_realloc(tbl, ma->max / 2);
-}
-
-/* Remove 'mask' from the mask list, if it is not needed any more. */
-static void flow_mask_remove(struct flow_table *tbl, struct sw_flow_mask *mask)
-{
-       if (mask) {
-               /* ovs-lock is required to protect mask-refcount and
-                * mask list.
-                */
-               ASSERT_OVSL();
-               BUG_ON(!mask->ref_count);
-               mask->ref_count--;
-
-               if (!mask->ref_count)
-                       tbl_mask_array_del_mask(tbl, mask);
-       }
-}
-
 /* Must be called with OVS mutex held. */
 void ovs_flow_tbl_remove(struct flow_table *table, struct sw_flow *flow)
 {
@@ -752,17 +794,7 @@ void ovs_flow_tbl_remove(struct flow_table *table, struct sw_flow *flow)
        struct table_instance *ufid_ti = ovsl_dereference(table->ufid_ti);

        BUG_ON(table->count == 0);
-       hlist_del_rcu(&flow->flow_table.node[ti->node_ver]);
-       table->count--;
-       if (ovs_identifier_is_ufid(&flow->id)) {
-               hlist_del_rcu(&flow->ufid_table.node[ufid_ti->node_ver]);
-               table->ufid_count--;
-       }
-
-       /* RCU delete the mask. 'flow->mask' is not NULLed, as it should be
-        * accessible as long as the RCU read lock is held.
-        */
-       flow_mask_remove(table, flow->mask);
+       table_instance_remove(table, ti, ufid_ti, flow, true);
 }

 static struct sw_flow_mask *mask_alloc(void)
@@ -805,29 +837,6 @@ static struct sw_flow_mask *flow_mask_find(const struct flow_table *tbl,
        return NULL;
 }

-static int tbl_mask_array_add_mask(struct flow_table *tbl,
-                                  struct sw_flow_mask *new)
-{
-       struct mask_array *ma = ovsl_dereference(tbl->mask_array);
-       int err, ma_count = READ_ONCE(ma->count);
-
-       if (ma_count >= ma->max) {
-               err = tbl_mask_array_realloc(tbl, ma->max +
-                                             MASK_ARRAY_SIZE_MIN);
-               if (err)
-                       return err;
-
-               ma = ovsl_dereference(tbl->mask_array);
-       }
-
-       BUG_ON(ovsl_dereference(ma->masks[ma_count]));
-
-       rcu_assign_pointer(ma->masks[ma_count], new);
-       WRITE_ONCE(ma->count, ma_count +1);
-
-       return 0;
-}
-
 /* Add 'mask' into the mask list, if it is not already there. */
 static int flow_mask_insert(struct flow_table *tbl, struct sw_flow *flow,
                            const struct sw_flow_mask *new)

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH net-next v4 08/10] net: openvswitch: fix possible memleak on destroy flow-table
  2019-10-28  6:49                 ` Tonghao Zhang
@ 2019-10-29  7:37                   ` Pravin Shelar
  2019-10-29 11:30                     ` Tonghao Zhang
  0 siblings, 1 reply; 37+ messages in thread
From: Pravin Shelar @ 2019-10-29  7:37 UTC (permalink / raw)
  To: Tonghao Zhang; +Cc: Greg Rose, Linux Kernel Network Developers, ovs dev

On Sun, Oct 27, 2019 at 11:49 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
>
> On Thu, Oct 24, 2019 at 3:14 PM Pravin Shelar <pshelar@ovn.org> wrote:
> >
> > On Tue, Oct 22, 2019 at 7:35 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
> > >
> > > On Tue, Oct 22, 2019 at 2:58 PM Pravin Shelar <pshelar@ovn.org> wrote:
> > > >
> > ...
> >
...
> > >  struct sw_flow *ovs_flow_tbl_dump_next(struct table_instance *ti,
> > > @@ -400,10 +458,9 @@ static struct table_instance *table_instance_rehash(struct table_instance *ti,
> > >         return new_ti;
> > >  }
> > >
> > > -int ovs_flow_tbl_flush(struct flow_table *flow_table)
> > > +int ovs_flow_tbl_flush(struct flow_table *table)
> > >  {
> > > -       struct table_instance *old_ti, *new_ti;
> > > -       struct table_instance *old_ufid_ti, *new_ufid_ti;
> > > +       struct table_instance *new_ti, *new_ufid_ti;
> > >
> > >         new_ti = table_instance_alloc(TBL_MIN_BUCKETS);
> > >         if (!new_ti)
> > > @@ -412,16 +469,12 @@ int ovs_flow_tbl_flush(struct flow_table *flow_table)
> > >         if (!new_ufid_ti)
> > >                 goto err_free_ti;
> > >
> > > -       old_ti = ovsl_dereference(flow_table->ti);
> > > -       old_ufid_ti = ovsl_dereference(flow_table->ufid_ti);
> > > +       table_instance_destroy(table, true);
> > >
> > This would destroy the running table, causing unnecessary flow misses.
> > Let's keep the current scheme of setting up the new table before
> > destroying the current one.
> >
> > > -       rcu_assign_pointer(flow_table->ti, new_ti);
....
...
>  /* Must be called with OVS mutex held. */
>  void ovs_flow_tbl_remove(struct flow_table *table, struct sw_flow *flow)
>  {
> @@ -752,17 +794,7 @@ void ovs_flow_tbl_remove(struct flow_table *table, struct sw_flow *flow)
>         struct table_instance *ufid_ti = ovsl_dereference(table->ufid_ti);
>
>         BUG_ON(table->count == 0);
> -       hlist_del_rcu(&flow->flow_table.node[ti->node_ver]);
> -       table->count--;
> -       if (ovs_identifier_is_ufid(&flow->id)) {
> -               hlist_del_rcu(&flow->ufid_table.node[ufid_ti->node_ver]);
> -               table->ufid_count--;
> -       }
> -
> -       /* RCU delete the mask. 'flow->mask' is not NULLed, as it should be
> -        * accessible as long as the RCU read lock is held.
> -        */
> -       flow_mask_remove(table, flow->mask);
> +       table_instance_remove(table, ti, ufid_ti, flow, true);
>  }
Let's rename table_instance_remove() to imply that it frees a flow.
Otherwise looks good.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH net-next v4 08/10] net: openvswitch: fix possible memleak on destroy flow-table
  2019-10-29  7:37                   ` Pravin Shelar
@ 2019-10-29 11:30                     ` Tonghao Zhang
  2019-10-29 20:27                       ` Pravin Shelar
  0 siblings, 1 reply; 37+ messages in thread
From: Tonghao Zhang @ 2019-10-29 11:30 UTC (permalink / raw)
  To: Pravin Shelar; +Cc: Greg Rose, Linux Kernel Network Developers, ovs dev

On Tue, Oct 29, 2019 at 3:38 PM Pravin Shelar <pshelar@ovn.org> wrote:
>
> On Sun, Oct 27, 2019 at 11:49 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
> >
> > On Thu, Oct 24, 2019 at 3:14 PM Pravin Shelar <pshelar@ovn.org> wrote:
> > >
> > > On Tue, Oct 22, 2019 at 7:35 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
> > > >
> > > > On Tue, Oct 22, 2019 at 2:58 PM Pravin Shelar <pshelar@ovn.org> wrote:
> > > > >
> > > ...
> > >
> ...
> > > >  struct sw_flow *ovs_flow_tbl_dump_next(struct table_instance *ti,
> > > > @@ -400,10 +458,9 @@ static struct table_instance *table_instance_rehash(struct table_instance *ti,
> > > >         return new_ti;
> > > >  }
> > > >
> > > > -int ovs_flow_tbl_flush(struct flow_table *flow_table)
> > > > +int ovs_flow_tbl_flush(struct flow_table *table)
> > > >  {
> > > > -       struct table_instance *old_ti, *new_ti;
> > > > -       struct table_instance *old_ufid_ti, *new_ufid_ti;
> > > > +       struct table_instance *new_ti, *new_ufid_ti;
> > > >
> > > >         new_ti = table_instance_alloc(TBL_MIN_BUCKETS);
> > > >         if (!new_ti)
> > > > @@ -412,16 +469,12 @@ int ovs_flow_tbl_flush(struct flow_table *flow_table)
> > > >         if (!new_ufid_ti)
> > > >                 goto err_free_ti;
> > > >
> > > > -       old_ti = ovsl_dereference(flow_table->ti);
> > > > -       old_ufid_ti = ovsl_dereference(flow_table->ufid_ti);
> > > > +       table_instance_destroy(table, true);
> > > >
> > > This would destroy the running table, causing unnecessary flow misses.
> > > Let's keep the current scheme of setting up the new table before
> > > destroying the current one.
> > >
> > > > -       rcu_assign_pointer(flow_table->ti, new_ti);
> ....
> ...
> >  /* Must be called with OVS mutex held. */
> >  void ovs_flow_tbl_remove(struct flow_table *table, struct sw_flow *flow)
> >  {
> > @@ -752,17 +794,7 @@ void ovs_flow_tbl_remove(struct flow_table *table, struct sw_flow *flow)
> >         struct table_instance *ufid_ti = ovsl_dereference(table->ufid_ti);
> >
> >         BUG_ON(table->count == 0);
> > -       hlist_del_rcu(&flow->flow_table.node[ti->node_ver]);
> > -       table->count--;
> > -       if (ovs_identifier_is_ufid(&flow->id)) {
> > -               hlist_del_rcu(&flow->ufid_table.node[ufid_ti->node_ver]);
> > -               table->ufid_count--;
> > -       }
> > -
> > -       /* RCU delete the mask. 'flow->mask' is not NULLed, as it should be
> > -        * accessible as long as the RCU read lock is held.
> > -        */
> > -       flow_mask_remove(table, flow->mask);
> > +       table_instance_remove(table, ti, ufid_ti, flow, true);
> >  }
> Let's rename table_instance_remove() to imply that it frees a flow.
Hi Pravin, the function that actually frees the flow is ovs_flow_free.
The two call chains are:

ovs_flow_cmd_del
  -> ovs_flow_tbl_remove
  ...
  -> ovs_flow_free

table_instance_destroy
  -> table_instance_remove
  -> ovs_flow_free

So if we rename table_instance_remove, how about table_instance_flow_free?
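
A sketch of what the caller would then look like (hypothetical; only the
rename applied to the code above):

	/* Must be called with OVS mutex held. */
	void ovs_flow_tbl_remove(struct flow_table *table, struct sw_flow *flow)
	{
		struct table_instance *ti = ovsl_dereference(table->ti);
		struct table_instance *ufid_ti = ovsl_dereference(table->ufid_ti);

		BUG_ON(table->count == 0);
		table_instance_flow_free(table, ti, ufid_ti, flow, true);
	}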
> Otherwise looks good.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH net-next v4 08/10] net: openvswitch: fix possible memleak on destroy flow-table
  2019-10-29 11:30                     ` Tonghao Zhang
@ 2019-10-29 20:27                       ` Pravin Shelar
  0 siblings, 0 replies; 37+ messages in thread
From: Pravin Shelar @ 2019-10-29 20:27 UTC (permalink / raw)
  To: Tonghao Zhang; +Cc: Greg Rose, Linux Kernel Network Developers, ovs dev

On Tue, Oct 29, 2019 at 4:31 AM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
>
> On Tue, Oct 29, 2019 at 3:38 PM Pravin Shelar <pshelar@ovn.org> wrote:
> >
> > On Sun, Oct 27, 2019 at 11:49 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
> > >
> > > On Thu, Oct 24, 2019 at 3:14 PM Pravin Shelar <pshelar@ovn.org> wrote:
> > > >
> > > > On Tue, Oct 22, 2019 at 7:35 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
> > > > >
> > > > > On Tue, Oct 22, 2019 at 2:58 PM Pravin Shelar <pshelar@ovn.org> wrote:
> > > > > >
> > > > ...
> > > >
> > ...
> > > > >  struct sw_flow *ovs_flow_tbl_dump_next(struct table_instance *ti,
> > > > > @@ -400,10 +458,9 @@ static struct table_instance *table_instance_rehash(struct table_instance *ti,
> > > > >         return new_ti;
> > > > >  }
> > > > >
> > > > > -int ovs_flow_tbl_flush(struct flow_table *flow_table)
> > > > > +int ovs_flow_tbl_flush(struct flow_table *table)
> > > > >  {
> > > > > -       struct table_instance *old_ti, *new_ti;
> > > > > -       struct table_instance *old_ufid_ti, *new_ufid_ti;
> > > > > +       struct table_instance *new_ti, *new_ufid_ti;
> > > > >
> > > > >         new_ti = table_instance_alloc(TBL_MIN_BUCKETS);
> > > > >         if (!new_ti)
> > > > > @@ -412,16 +469,12 @@ int ovs_flow_tbl_flush(struct flow_table *flow_table)
> > > > >         if (!new_ufid_ti)
> > > > >                 goto err_free_ti;
> > > > >
> > > > > -       old_ti = ovsl_dereference(flow_table->ti);
> > > > > -       old_ufid_ti = ovsl_dereference(flow_table->ufid_ti);
> > > > > +       table_instance_destroy(table, true);
> > > > >
> > > > This would destroy the running table, causing unnecessary flow misses.
> > > > Let's keep the current scheme of setting up the new table before
> > > > destroying the current one.
> > > >
> > > > > -       rcu_assign_pointer(flow_table->ti, new_ti);
> > ....
> > ...
> > >  /* Must be called with OVS mutex held. */
> > >  void ovs_flow_tbl_remove(struct flow_table *table, struct sw_flow *flow)
> > >  {
> > > @@ -752,17 +794,7 @@ void ovs_flow_tbl_remove(struct flow_table *table, struct sw_flow *flow)
> > >         struct table_instance *ufid_ti = ovsl_dereference(table->ufid_ti);
> > >
> > >         BUG_ON(table->count == 0);
> > > -       hlist_del_rcu(&flow->flow_table.node[ti->node_ver]);
> > > -       table->count--;
> > > -       if (ovs_identifier_is_ufid(&flow->id)) {
> > > -               hlist_del_rcu(&flow->ufid_table.node[ufid_ti->node_ver]);
> > > -               table->ufid_count--;
> > > -       }
> > > -
> > > -       /* RCU delete the mask. 'flow->mask' is not NULLed, as it should be
> > > -        * accessible as long as the RCU read lock is held.
> > > -        */
> > > -       flow_mask_remove(table, flow->mask);
> > > +       table_instance_remove(table, ti, ufid_ti, flow, true);
> > >  }
> > Let's rename table_instance_remove() to imply that it frees a flow.
> Hi Pravin, the function that actually frees the flow is ovs_flow_free.
> The two call chains are:
>
> ovs_flow_cmd_del
>   -> ovs_flow_tbl_remove
>   ...
>   -> ovs_flow_free
>
> table_instance_destroy
>   -> table_instance_remove
>   -> ovs_flow_free
>
> So if we rename table_instance_remove, how about table_instance_flow_free?
table_instance_flow_free() looks good.

^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2019-10-29 20:27 UTC | newest]

Thread overview: 37+ messages
2019-10-15 10:30 [PATCH net-next v4 00/10] optimize openvswitch flow looking up xiangxia.m.yue
2019-10-15 10:30 ` [PATCH net-next v4 01/10] net: openvswitch: add flow-mask cache for performance xiangxia.m.yue
2019-10-18 23:29   ` [ovs-dev] " William Tu
2019-10-15 10:30 ` [PATCH net-next v4 02/10] net: openvswitch: convert mask list in mask array xiangxia.m.yue
2019-10-18 23:30   ` [ovs-dev] " William Tu
2019-10-15 10:30 ` [PATCH net-next v4 03/10] net: openvswitch: shrink the mask array if necessary xiangxia.m.yue
2019-10-18 23:33   ` [ovs-dev] " William Tu
2019-10-15 10:30 ` [PATCH net-next v4 04/10] net: openvswitch: optimize flow mask cache hash collision xiangxia.m.yue
2019-10-15 10:30 ` [PATCH net-next v4 05/10] net: openvswitch: optimize flow-mask looking up xiangxia.m.yue
2019-10-18 23:26   ` [ovs-dev] " William Tu
2019-10-21  4:51     ` Tonghao Zhang
2019-10-21 17:58       ` William Tu
2019-10-15 10:30 ` [PATCH net-next v4 06/10] net: openvswitch: simplify the flow_hash xiangxia.m.yue
2019-10-18 23:27   ` [ovs-dev] " William Tu
2019-10-15 10:30 ` [PATCH net-next v4 07/10] net: openvswitch: add likely in flow_lookup xiangxia.m.yue
2019-10-18 23:27   ` [ovs-dev] " William Tu
2019-10-15 10:30 ` [PATCH net-next v4 08/10] net: openvswitch: fix possible memleak on destroy flow-table xiangxia.m.yue
2019-10-17 22:38   ` Pravin Shelar
2019-10-18  3:16     ` Tonghao Zhang
2019-10-18 18:12       ` Pravin Shelar
2019-10-21  5:01         ` Tonghao Zhang
2019-10-22  6:57           ` Pravin Shelar
2019-10-23  2:35             ` Tonghao Zhang
2019-10-24  7:14               ` Pravin Shelar
2019-10-28  6:49                 ` Tonghao Zhang
2019-10-29  7:37                   ` Pravin Shelar
2019-10-29 11:30                     ` Tonghao Zhang
2019-10-29 20:27                       ` Pravin Shelar
2019-10-15 10:30 ` [PATCH net-next v4 09/10] net: openvswitch: don't unlock mutex when changing the user_features fails xiangxia.m.yue
2019-10-18 23:27   ` [ovs-dev] " William Tu
2019-10-15 10:30 ` [PATCH net-next v4 10/10] net: openvswitch: simplify the ovs_dp_cmd_new xiangxia.m.yue
2019-10-18 23:29   ` [ovs-dev] " William Tu
2019-10-17 19:22 ` [PATCH net-next v4 00/10] optimize openvswitch flow looking up David Miller
2019-10-17 20:29   ` Gregory Rose
2019-10-21 17:14 ` [ovs-dev] " William Tu
2019-10-22  1:16   ` Tonghao Zhang
2019-10-22 15:44     ` William Tu
