From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
To: Pravin Shelar <pshelar@ovn.org>
Cc: Greg Rose <gvrose8192@gmail.com>,
	Linux Kernel Network Developers <netdev@vger.kernel.org>,
	ovs dev <dev@openvswitch.org>
Subject: Re: [PATCH net-next v4 08/10] net: openvswitch: fix possible memleak on destroy flow-table
Date: Mon, 21 Oct 2019 13:01:24 +0800
Message-ID: <CAMDZJNXdu3R_GkHEBbwycEpe0wnwNmGzHx-8gUxtwiW1mEy7uw@mail.gmail.com>
In-Reply-To: <CAOrHB_BqGdFmmzTEPxejt0QXmyC_QtAXG=S8kzKi=3w-PacwUw@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 3922 bytes --]

On Sat, Oct 19, 2019 at 2:12 AM Pravin Shelar <pshelar@ovn.org> wrote:
>
> On Thu, Oct 17, 2019 at 8:16 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
> >
> > On Fri, Oct 18, 2019 at 6:38 AM Pravin Shelar <pshelar@ovn.org> wrote:
> > >
> > > On Wed, Oct 16, 2019 at 5:50 AM <xiangxia.m.yue@gmail.com> wrote:
> > > >
> > > > From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > > >
> > > > When we destroy a flow table, it may still contain flow masks,
> > > > so release the flow mask structs as well.
> > > >
> > > > Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > > > Tested-by: Greg Rose <gvrose8192@gmail.com>
> > > > ---
> > > >  net/openvswitch/flow_table.c | 14 +++++++++++++-
> > > >  1 file changed, 13 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
> > > > index 5df5182..d5d768e 100644
> > > > --- a/net/openvswitch/flow_table.c
> > > > +++ b/net/openvswitch/flow_table.c
> > > > @@ -295,6 +295,18 @@ static void table_instance_destroy(struct table_instance *ti,
> > > >         }
> > > >  }
> > > >
> > > > +static void tbl_mask_array_destroy(struct flow_table *tbl)
> > > > +{
> > > > +       struct mask_array *ma = ovsl_dereference(tbl->mask_array);
> > > > +       int i;
> > > > +
> > > > +       /* Free the flow-mask and kfree_rcu the NULL is allowed. */
> > > > +       for (i = 0; i < ma->max; i++)
> > > > +               kfree_rcu(rcu_dereference_raw(ma->masks[i]), rcu);
> > > > +
> > > > +       kfree_rcu(rcu_dereference_raw(tbl->mask_array), rcu);
> > > > +}
> > > > +
> > > >  /* No need for locking this function is called from RCU callback or
> > > >   * error path.
> > > >   */
> > > > @@ -304,7 +316,7 @@ void ovs_flow_tbl_destroy(struct flow_table *table)
> > > >         struct table_instance *ufid_ti = rcu_dereference_raw(table->ufid_ti);
> > > >
> > > >         free_percpu(table->mask_cache);
> > > > -       kfree_rcu(rcu_dereference_raw(table->mask_array), rcu);
> > > > +       tbl_mask_array_destroy(table);
> > > >         table_instance_destroy(ti, ufid_ti, false);
> > > >  }
> > >
> > > This should not be required. A mask is linked to a flow and gets
> > > released when the flow is removed.
> > > Does the memory leak occur when the OVS module is abruptly unloaded and
> > > userspace does not clean up the flow table?
> > When we destroy the ovs datapath, or the net namespace is destroyed, the
> > mask memory is leaked. The call tree:
> > ovs_exit_net/ovs_dp_cmd_del
> > -->__dp_destroy
> > -->destroy_dp_rcu
> > -->ovs_flow_tbl_destroy
> > -->table_instance_destroy (which doesn't release the mask memory because
> > it doesn't call ovs_flow_tbl_remove/flow_mask_remove directly or
> > indirectly).
> >
> That's what I suggested earlier: we need to call a function similar to
> ovs_flow_tbl_remove(); we could refactor the code to reuse it.
> This is better, since introducing tbl_mask_array_destroy()
> creates a dangling pointer to the mask in the sw-flow object. OVS is anyway
> iterating the entire flow table to release sw-flows in
> table_instance_destroy(), so it is natural to release the mask at that
> point, after releasing the corresponding sw-flow.
I got it, thanks. I rewrote the code; could you help review it?
If it looks fine, I will send it in the next version.
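
To make the idea concrete, here is a small userspace model of releasing
each flow's mask reference while the table is being torn down. It is only
a sketch: the model_* types and helpers are hypothetical stand-ins, not
the kernel's sw_flow, sw_flow_mask or flow_table, and it ignores RCU and
ovs-lock entirely.

```c
#include <assert.h>
#include <stdlib.h>

/* Userspace model only: these types mimic, but are not, the kernel's. */
struct model_mask {
	int ref_count;			/* number of flows sharing this mask */
};

struct model_flow {
	struct model_mask *mask;
	struct model_flow *next;
};

struct model_table {
	struct model_flow *flows;	/* a single bucket, for brevity */
	int count;			/* flows currently in the table */
	int live_masks;			/* masks allocated and not yet freed */
};

static struct model_mask *model_mask_new(struct model_table *t)
{
	struct model_mask *m = calloc(1, sizeof(*m));

	if (m)
		t->live_masks++;
	return m;
}

/* Insert a flow that takes one reference on 'm'. Returns 0 or -1. */
static int model_flow_insert(struct model_table *t, struct model_mask *m)
{
	struct model_flow *f = calloc(1, sizeof(*f));

	if (!f)
		return -1;
	f->mask = m;
	m->ref_count++;
	f->next = t->flows;
	t->flows = f;
	t->count++;
	return 0;
}

/* Drop one reference; free the mask when the last flow lets go of it. */
static void model_mask_put(struct model_table *t, struct model_mask *m)
{
	assert(m->ref_count > 0);
	if (--m->ref_count == 0) {
		free(m);
		t->live_masks--;
	}
}

/* Teardown: each flow releases its mask reference, then is freed itself. */
static void model_table_destroy(struct model_table *t)
{
	struct model_flow *f = t->flows, *next;

	for (; f; f = next) {
		next = f->next;
		model_mask_put(t, f->mask);
		free(f);
		t->count--;
	}
	t->flows = NULL;
}
```

With two masks shared by three flows, model_table_destroy() leaves both
count and live_masks at zero, so no mask outlives the table and no flow
is left pointing at a mask that was freed behind its back.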
>
>
> > One more thing: when we flush the flows, we don't flush the flow masks.
> > (If necessary, a separate patch should be sent.)
> >
> > > In that case a better fix could be calling an ovs_flow_tbl_remove()
> > > equivalent from table_instance_destroy() while it is iterating the
> > > flow table.
> > I think operations on the flow mask and on the flow table should use
> > different APIs, for example:
> > for the flow mask, we use:
> > -tbl_mask_array_add_mask
> > -tbl_mask_array_del_mask
> > -tbl_mask_array_alloc
> > -tbl_mask_array_realloc
> > -tbl_mask_array_destroy (introduced by this patch)
> >
> > table instance:
> > -table_instance_alloc
> > -table_instance_destroy
> > ....
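
For reference, the removal step in tbl_mask_array_del_mask() (which the
attached patch only moves) can be modeled in userspace as below. This is
a rough sketch, not the kernel code: model_mask_array, MODEL_ARRAY_MAX
and both helpers are hypothetical names, there is no RCU here, and the
minimum array size is passed in as a parameter rather than taken from
MASK_ARRAY_SIZE_MIN.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_ARRAY_MAX 8	/* hypothetical fixed capacity for the model */

struct model_mask_array {
	int count;				/* slots in use */
	const void *masks[MODEL_ARRAY_MAX];
};

/* Remove 'mask' by moving the last used entry into its slot.
 * Returns 0 on success, -1 if 'mask' is not in the array.
 */
static int model_array_del(struct model_mask_array *ma, const void *mask)
{
	int i;

	for (i = 0; i < ma->count; i++) {
		if (ma->masks[i] == mask) {
			ma->masks[i] = ma->masks[ma->count - 1];
			ma->masks[ma->count - 1] = NULL;
			ma->count--;
			return 0;
		}
	}
	return -1;
}

/* Same shape as the shrink test in the patch: the array is halved when it
 * is at least twice the minimum size and the count sampled before the
 * removal is at most a third of the capacity.
 */
static int model_should_shrink(int max, int count_before_removal, int min_size)
{
	return max >= min_size * 2 && count_before_removal <= max / 3;
}
```

Moving the last entry into the freed slot keeps the used part of the
array dense, so lookups only need to scan the first 'count' entries; the
price is that the order of masks in the array is not preserved.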

[-- Attachment #2: ovs-mem-leak.patch --]
[-- Type: application/octet-stream, Size: 6694 bytes --]

diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
index 5df5182..5b20793 100644
--- a/net/openvswitch/flow_table.c
+++ b/net/openvswitch/flow_table.c
@@ -257,10 +257,75 @@ static void flow_tbl_destroy_rcu_cb(struct rcu_head *rcu)
 	__table_instance_destroy(ti);
 }
 
-static void table_instance_destroy(struct table_instance *ti,
-				   struct table_instance *ufid_ti,
+static void tbl_mask_array_del_mask(struct flow_table *tbl,
+				    struct sw_flow_mask *mask)
+{
+	struct mask_array *ma = ovsl_dereference(tbl->mask_array);
+	int i, ma_count = READ_ONCE(ma->count);
+
+	/* Remove the deleted mask pointers from the array */
+	for (i = 0; i < ma_count; i++) {
+		if (mask == ovsl_dereference(ma->masks[i]))
+			goto found;
+	}
+
+	BUG();
+	return;
+
+found:
+	WRITE_ONCE(ma->count, ma_count - 1);
+
+	rcu_assign_pointer(ma->masks[i], ma->masks[ma_count - 1]);
+	RCU_INIT_POINTER(ma->masks[ma_count - 1], NULL);
+
+	kfree_rcu(mask, rcu);
+
+	/* Shrink the mask array if necessary. */
+	if (ma->max >= (MASK_ARRAY_SIZE_MIN * 2) &&
+	    ma_count <= (ma->max / 3))
+		tbl_mask_array_realloc(tbl, ma->max / 2);
+}
+
+/* Remove 'mask' from the mask list, if it is not needed any more. */
+static void flow_mask_remove(struct flow_table *tbl, struct sw_flow_mask *mask)
+{
+	if (mask) {
+		/* ovs-lock is required to protect mask-refcount and
+		 * mask list.
+		 */
+		ASSERT_OVSL();
+		BUG_ON(!mask->ref_count);
+		mask->ref_count--;
+
+		if (!mask->ref_count)
+			tbl_mask_array_del_mask(tbl, mask);
+	}
+}
+
+static void table_instance_remove(struct flow_table *table, struct sw_flow *flow)
+{
+	struct table_instance *ti = ovsl_dereference(table->ti);
+	struct table_instance *ufid_ti = ovsl_dereference(table->ufid_ti);
+
+	BUG_ON(table->count == 0);
+	hlist_del_rcu(&flow->flow_table.node[ti->node_ver]);
+	table->count--;
+	if (ovs_identifier_is_ufid(&flow->id)) {
+		hlist_del_rcu(&flow->ufid_table.node[ufid_ti->node_ver]);
+		table->ufid_count--;
+	}
+
+	/* RCU delete the mask. 'flow->mask' is not NULLed, as it should be
+	 * accessible as long as the RCU read lock is held.
+	 */
+	flow_mask_remove(table, flow->mask);
+}
+
+static void table_instance_destroy(struct flow_table *table,
 				   bool deferred)
 {
+	struct table_instance *ti = ovsl_dereference(table->ti);
+	struct table_instance *ufid_ti = ovsl_dereference(table->ufid_ti);
 	int i;
 
 	if (!ti)
@@ -274,13 +339,9 @@ static void table_instance_destroy(struct table_instance *ti,
 		struct sw_flow *flow;
 		struct hlist_head *head = &ti->buckets[i];
 		struct hlist_node *n;
-		int ver = ti->node_ver;
-		int ufid_ver = ufid_ti->node_ver;
 
-		hlist_for_each_entry_safe(flow, n, head, flow_table.node[ver]) {
-			hlist_del_rcu(&flow->flow_table.node[ver]);
-			if (ovs_identifier_is_ufid(&flow->id))
-				hlist_del_rcu(&flow->ufid_table.node[ufid_ver]);
+		hlist_for_each_entry_safe(flow, n, head, flow_table.node[ti->node_ver]) {
+			table_instance_remove(table, flow);
 			ovs_flow_free(flow, deferred);
 		}
 	}
@@ -300,12 +361,9 @@ static void table_instance_destroy(struct table_instance *ti,
  */
 void ovs_flow_tbl_destroy(struct flow_table *table)
 {
-	struct table_instance *ti = rcu_dereference_raw(table->ti);
-	struct table_instance *ufid_ti = rcu_dereference_raw(table->ufid_ti);
-
 	free_percpu(table->mask_cache);
 	kfree_rcu(rcu_dereference_raw(table->mask_array), rcu);
-	table_instance_destroy(ti, ufid_ti, false);
+	table_instance_destroy(table, false);
 }
 
 struct sw_flow *ovs_flow_tbl_dump_next(struct table_instance *ti,
@@ -400,10 +458,9 @@ static struct table_instance *table_instance_rehash(struct table_instance *ti,
 	return new_ti;
 }
 
-int ovs_flow_tbl_flush(struct flow_table *flow_table)
+int ovs_flow_tbl_flush(struct flow_table *table)
 {
-	struct table_instance *old_ti, *new_ti;
-	struct table_instance *old_ufid_ti, *new_ufid_ti;
+	struct table_instance *new_ti, *new_ufid_ti;
 
 	new_ti = table_instance_alloc(TBL_MIN_BUCKETS);
 	if (!new_ti)
@@ -412,16 +469,12 @@ int ovs_flow_tbl_flush(struct flow_table *flow_table)
 	if (!new_ufid_ti)
 		goto err_free_ti;
 
-	old_ti = ovsl_dereference(flow_table->ti);
-	old_ufid_ti = ovsl_dereference(flow_table->ufid_ti);
+	table_instance_destroy(table, true);
 
-	rcu_assign_pointer(flow_table->ti, new_ti);
-	rcu_assign_pointer(flow_table->ufid_ti, new_ufid_ti);
-	flow_table->last_rehash = jiffies;
-	flow_table->count = 0;
-	flow_table->ufid_count = 0;
+	rcu_assign_pointer(table->ti, new_ti);
+	rcu_assign_pointer(table->ufid_ti, new_ufid_ti);
+	table->last_rehash = jiffies;
 
-	table_instance_destroy(old_ti, old_ufid_ti, true);
 	return 0;
 
 err_free_ti:
@@ -700,69 +753,10 @@ static struct table_instance *table_instance_expand(struct table_instance *ti,
 	return table_instance_rehash(ti, ti->n_buckets * 2, ufid);
 }
 
-static void tbl_mask_array_del_mask(struct flow_table *tbl,
-				    struct sw_flow_mask *mask)
-{
-	struct mask_array *ma = ovsl_dereference(tbl->mask_array);
-	int i, ma_count = READ_ONCE(ma->count);
-
-	/* Remove the deleted mask pointers from the array */
-	for (i = 0; i < ma_count; i++) {
-		if (mask == ovsl_dereference(ma->masks[i]))
-			goto found;
-	}
-
-	BUG();
-	return;
-
-found:
-	WRITE_ONCE(ma->count, ma_count -1);
-
-	rcu_assign_pointer(ma->masks[i], ma->masks[ma_count -1]);
-	RCU_INIT_POINTER(ma->masks[ma_count -1], NULL);
-
-	kfree_rcu(mask, rcu);
-
-	/* Shrink the mask array if necessary. */
-	if (ma->max >= (MASK_ARRAY_SIZE_MIN * 2) &&
-	    ma_count <= (ma->max / 3))
-		tbl_mask_array_realloc(tbl, ma->max / 2);
-}
-
-/* Remove 'mask' from the mask list, if it is not needed any more. */
-static void flow_mask_remove(struct flow_table *tbl, struct sw_flow_mask *mask)
-{
-	if (mask) {
-		/* ovs-lock is required to protect mask-refcount and
-		 * mask list.
-		 */
-		ASSERT_OVSL();
-		BUG_ON(!mask->ref_count);
-		mask->ref_count--;
-
-		if (!mask->ref_count)
-			tbl_mask_array_del_mask(tbl, mask);
-	}
-}
-
 /* Must be called with OVS mutex held. */
 void ovs_flow_tbl_remove(struct flow_table *table, struct sw_flow *flow)
 {
-	struct table_instance *ti = ovsl_dereference(table->ti);
-	struct table_instance *ufid_ti = ovsl_dereference(table->ufid_ti);
-
-	BUG_ON(table->count == 0);
-	hlist_del_rcu(&flow->flow_table.node[ti->node_ver]);
-	table->count--;
-	if (ovs_identifier_is_ufid(&flow->id)) {
-		hlist_del_rcu(&flow->ufid_table.node[ufid_ti->node_ver]);
-		table->ufid_count--;
-	}
-
-	/* RCU delete the mask. 'flow->mask' is not NULLed, as it should be
-	 * accessible as long as the RCU read lock is held.
-	 */
-	flow_mask_remove(table, flow->mask);
+	table_instance_remove(table, flow);
 }
 
 static struct sw_flow_mask *mask_alloc(void)

