netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Simon Horman <horms@verge.net.au>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>, David Miller <davem@davemloft.net>,
	torvalds@linux-foundation.org, akpm@linux-foundation.org,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Thomas Gleixner <tglx@linutronix.de>,
	Arnd Bergmann <arnd@arndb.de>,
	Pekka Enberg <penberg@cs.helsinki.fi>,
	Julian Anastasov <ja@ssi.bg>,
	Hans Schillstrom <hans@schillstrom.com>
Subject: Re: [slab poison overwritten] Re: [GIT] Networking
Date: Tue, 22 Mar 2011 10:18:01 +0900	[thread overview]
Message-ID: <20110322011801.GE27019@verge.net.au> (raw)
In-Reply-To: <20110322001706.GD27019@verge.net.au>

On Tue, Mar 22, 2011 at 09:17:07AM +0900, Simon Horman wrote:
> On Tue, Mar 22, 2011 at 09:01:33AM +0900, Simon Horman wrote:
> > On Tue, Mar 22, 2011 at 08:29:21AM +0900, Simon Horman wrote:
> > > On Tue, Mar 22, 2011 at 07:13:58AM +0900, Simon Horman wrote:
> > > > On Mon, Mar 21, 2011 at 09:15:40PM +0100, Eric Dumazet wrote:
> > > > > Le lundi 21 mars 2011 à 19:07 +0100, Eric Dumazet a écrit :
> > > > > > Le lundi 21 mars 2011 à 18:39 +0100, Ingo Molnar a écrit :
> > > > > > > here's the same but with kallsyms enabled.
> > > > > > > 
> > > > > > > Thanks,
> > > > > > > 
> > > > > > > 	Ingo
> > > > > > > 
> > > > > > > [    9.585627] initcall 0xffffffff81d5b806 returned 0 after 0 usecs
> > > > > > > [    9.588960] calling  0xffffffff81d5b9da @ 1
> > > > > > > [    9.592303] IPVS: Creating netns size=1272 id=0
> > > > > > > [    9.595646] IPVS: __ip_vs_control_init(): alloc_percpu.
> > > > > > > [    9.602298] IPVS: cannot register namespace.
> > > > > > > [    9.605627] IPVS: can't setup control
> > > > > > 
> > > > > > It seems IPVS is busted in case of memory allocation error in 
> > > > > > __ip_vs_control_init()
> > > > > > 
> > > > > > IPVS deinits its "struct netns_ipvs" space, but something (in IPVS) uses
> > > > > > it after free.
> > > > > > 
> > > > > > __ip_vs_init() seems to be called before ip_vs_init() completes
> > > > > > correctly. We then keep in net->ipvs a pointer to some freed memory.
> > > > > > 
> > > > > > Commit 14e405461e664b7 did some changes in this area
> > > > > > 
> > > > > > Simon, any idea ?
> > > > > > 
> > > > > > 
> > > > > 
> > > > > For the time being, we can avoid the false memory allocation error (and
> > > > > leak)
> > > > 
> > > > Sorry, that typo is my work.
> > > 
> > > With your patch applied I now see the following
> > > 
> > > ffff880003bbf1a0 corresponds to &ipvs->app_key in __ip_vs_app_init().
> > > I'll continue looking into this.
> > > 
> > > [   12.610000] IPVS: Creating netns size=2456 id=0
> > > [   12.630000] IPVS: Registered protocols (TCP, UDP, SCTP, AH, ESP)
> > > [   12.640000] BUG: key ffff880003bbf1a0 not in .data!
> > > [   12.640000] ------------[ cut here ]------------
> > > [   12.640000] WARNING: at kernel/lockdep.c:2701
> > > lockdep_init_map+0x37b/0x570()
> > > [   12.640000] Hardware name: Bochs
> > > [   12.640000] Pid: 1, comm: swapper Tainted: G        W
> > > 2.6.38-kexec-06330-g69b7efe-dirty #122
> > > [   12.650000] Call Trace:
> > > [   12.650000]  [<ffffffff8102e685>] warn_slowpath_common+0x75/0xb0
> > > [   12.650000]  [<ffffffff8102e6d5>] warn_slowpath_null+0x15/0x20
> > > [   12.650000]  [<ffffffff8105967b>] lockdep_init_map+0x37b/0x570
> > > [   12.650000]  [<ffffffff8105829d>] ? trace_hardirqs_on+0xd/0x10
> > > [   12.650000]  [<ffffffff81055ad8>] debug_mutex_init+0x38/0x50
> > > [   12.650000]  [<ffffffff8104bc4c>] __mutex_init+0x5c/0x70
> > > [   12.650000]  [<ffffffff81685ee7>] __ip_vs_app_init+0x64/0x86
> > > [   12.660000]  [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
> > > [   12.660000]  [<ffffffff811b1c33>] T.620+0x43/0x170
> > > [   12.660000]  [<ffffffff811b1e9a>] ? register_pernet_subsys+0x1a/0x40
> > > [   12.660000]  [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
> > > [   12.660000]  [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
> > > [   12.660000]  [<ffffffff811b1db7>] register_pernet_operations+0x57/0xb0
> > > [   12.660000]  [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
> > > [   12.670000]  [<ffffffff811b1ea9>] register_pernet_subsys+0x29/0x40
> > > [   12.670000]  [<ffffffff81685f19>] ip_vs_app_init+0x10/0x12
> > > [   12.670000]  [<ffffffff81685a87>] ip_vs_init+0x4c/0xff
> > > [   12.670000]  [<ffffffff8166562c>] do_one_initcall+0x7a/0x12e
> > > [   12.670000]  [<ffffffff8166583e>] kernel_init+0x13e/0x1c2
> > > [   12.670000]  [<ffffffff8128c134>] kernel_thread_helper+0x4/0x10
> > > [   12.670000]  [<ffffffff8128ad40>] ? restore_args+0x0/0x30
> > > [   12.680000]  [<ffffffff81665700>] ? kernel_init+0x0/0x1c2
> > > [   12.680000]  [<ffffffff8128c130>] ? kernel_thread_helper+0x0/0x10
> > > [   12.680000] ---[ end trace 4eaa2a86a8e2da23 ]---
> > 
> > It seems that the problem above was introduced by
> > ab8a5e8408c3 ("IPVS: netns awareness to ip_vs_app").
> > I assume the hungs are the cause:
> 
> s/hungs/hunks below/
> 
> I am a little unsure of what to do about this.
> 
> The problem seems to be that ipvs->app_key is not in static storage.
> But I'm not sure how to resolve that given that the struct netns_ipvs is
> per-network namespace. So I guess that the locking needs to be re-worked.
> Again, I'm a little unsure of what the best way forward is.

I had an idea for a fix over breakfast.

IPVS: Use global mutex in ip_vs_app.c

As part of the work to make IPVS network namespace aware
__ip_vs_app_mutex was replaced by a per-namespace lock,
ipvs->app_mutex. ipvs->app_key is also supplied for debugging purposes.

Unfortunately this implementation results in ipvs->app_key residing
in non-static storage which at the very least causes a lockdep warning.

This patch takes the rather heavy-handed approach of reinstating
__ip_vs_app_mutex which will cover access to the ipvs->list_head
of all network namespaces.

[   12.610000] IPVS: Creating netns size=2456 id=0
[   12.630000] IPVS: Registered protocols (TCP, UDP, SCTP, AH, ESP)
[   12.640000] BUG: key ffff880003bbf1a0 not in .data!
[   12.640000] ------------[ cut here ]------------
[   12.640000] WARNING: at kernel/lockdep.c:2701 lockdep_init_map+0x37b/0x570()
[   12.640000] Hardware name: Bochs
[   12.640000] Pid: 1, comm: swapper Tainted: G        W 2.6.38-kexec-06330-g69b7efe-dirty #122
[   12.650000] Call Trace:
[   12.650000]  [<ffffffff8102e685>] warn_slowpath_common+0x75/0xb0
[   12.650000]  [<ffffffff8102e6d5>] warn_slowpath_null+0x15/0x20
[   12.650000]  [<ffffffff8105967b>] lockdep_init_map+0x37b/0x570
[   12.650000]  [<ffffffff8105829d>] ? trace_hardirqs_on+0xd/0x10
[   12.650000]  [<ffffffff81055ad8>] debug_mutex_init+0x38/0x50
[   12.650000]  [<ffffffff8104bc4c>] __mutex_init+0x5c/0x70
[   12.650000]  [<ffffffff81685ee7>] __ip_vs_app_init+0x64/0x86
[   12.660000]  [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
[   12.660000]  [<ffffffff811b1c33>] T.620+0x43/0x170
[   12.660000]  [<ffffffff811b1e9a>] ? register_pernet_subsys+0x1a/0x40
[   12.660000]  [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
[   12.660000]  [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
[   12.660000]  [<ffffffff811b1db7>] register_pernet_operations+0x57/0xb0
[   12.660000]  [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
[   12.670000]  [<ffffffff811b1ea9>] register_pernet_subsys+0x29/0x40
[   12.670000]  [<ffffffff81685f19>] ip_vs_app_init+0x10/0x12
[   12.670000]  [<ffffffff81685a87>] ip_vs_init+0x4c/0xff
[   12.670000]  [<ffffffff8166562c>] do_one_initcall+0x7a/0x12e
[   12.670000]  [<ffffffff8166583e>] kernel_init+0x13e/0x1c2
[   12.670000]  [<ffffffff8128c134>] kernel_thread_helper+0x4/0x10
[   12.670000]  [<ffffffff8128ad40>] ? restore_args+0x0/0x30
[   12.680000]  [<ffffffff81665700>] ? kernel_init+0x0/0x1c2
[   12.680000]  [<ffffffff8128c130>] ? kernel_thread_helper+0x0/0x1global0

Signed-off-by: Simon Horman <horms@verge.net.au>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Julian Anastasov <ja@ssi.bg>
Cc: Hans Schillstrom <hans@schillstrom.com>
---
 include/net/ip_vs.h            |    2 --
 net/netfilter/ipvs/ip_vs_app.c |   23 ++++++++++-------------
 2 files changed, 10 insertions(+), 15 deletions(-)

diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index 272f593..30b49ed 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -801,8 +801,6 @@ struct netns_ipvs {
 	struct list_head	rs_table[IP_VS_RTAB_SIZE];
 	/* ip_vs_app */
 	struct list_head	app_list;
-	struct mutex		app_mutex;
-	struct lock_class_key	app_key;	/* mutex debuging */
 
 	/* ip_vs_proto */
 	#define IP_VS_PROTO_TAB_SIZE	32	/* must be power of 2 */
diff --git a/net/netfilter/ipvs/ip_vs_app.c b/net/netfilter/ipvs/ip_vs_app.c
index 5c48ffb..2dc6de1 100644
--- a/net/netfilter/ipvs/ip_vs_app.c
+++ b/net/netfilter/ipvs/ip_vs_app.c
@@ -43,6 +43,8 @@ EXPORT_SYMBOL(register_ip_vs_app);
 EXPORT_SYMBOL(unregister_ip_vs_app);
 EXPORT_SYMBOL(register_ip_vs_app_inc);
 
+static DEFINE_MUTEX(__ip_vs_app_mutex);
+
 /*
  *	Get an ip_vs_app object
  */
@@ -167,14 +169,13 @@ int
 register_ip_vs_app_inc(struct net *net, struct ip_vs_app *app, __u16 proto,
 		       __u16 port)
 {
-	struct netns_ipvs *ipvs = net_ipvs(net);
 	int result;
 
-	mutex_lock(&ipvs->app_mutex);
+	mutex_lock(&__ip_vs_app_mutex);
 
 	result = ip_vs_app_inc_new(net, app, proto, port);
 
-	mutex_unlock(&ipvs->app_mutex);
+	mutex_unlock(&__ip_vs_app_mutex);
 
 	return result;
 }
@@ -189,11 +190,11 @@ int register_ip_vs_app(struct net *net, struct ip_vs_app *app)
 	/* increase the module use count */
 	ip_vs_use_count_inc();
 
-	mutex_lock(&ipvs->app_mutex);
+	mutex_lock(&__ip_vs_app_mutex);
 
 	list_add(&app->a_list, &ipvs->app_list);
 
-	mutex_unlock(&ipvs->app_mutex);
+	mutex_unlock(&__ip_vs_app_mutex);
 
 	return 0;
 }
@@ -205,10 +206,9 @@ int register_ip_vs_app(struct net *net, struct ip_vs_app *app)
  */
 void unregister_ip_vs_app(struct net *net, struct ip_vs_app *app)
 {
-	struct netns_ipvs *ipvs = net_ipvs(net);
 	struct ip_vs_app *inc, *nxt;
 
-	mutex_lock(&ipvs->app_mutex);
+	mutex_lock(&__ip_vs_app_mutex);
 
 	list_for_each_entry_safe(inc, nxt, &app->incs_list, a_list) {
 		ip_vs_app_inc_release(net, inc);
@@ -216,7 +216,7 @@ void unregister_ip_vs_app(struct net *net, struct ip_vs_app *app)
 
 	list_del(&app->a_list);
 
-	mutex_unlock(&ipvs->app_mutex);
+	mutex_unlock(&__ip_vs_app_mutex);
 
 	/* decrease the module use count */
 	ip_vs_use_count_dec();
@@ -501,7 +501,7 @@ static void *ip_vs_app_seq_start(struct seq_file *seq, loff_t *pos)
 	struct net *net = seq_file_net(seq);
 	struct netns_ipvs *ipvs = net_ipvs(net);
 
-	mutex_lock(&ipvs->app_mutex);
+	mutex_lock(&__ip_vs_app_mutex);
 
 	return *pos ? ip_vs_app_idx(ipvs, *pos - 1) : SEQ_START_TOKEN;
 }
@@ -535,9 +535,7 @@ static void *ip_vs_app_seq_next(struct seq_file *seq, void *v, loff_t *pos)
 
 static void ip_vs_app_seq_stop(struct seq_file *seq, void *v)
 {
-	struct netns_ipvs *ipvs = net_ipvs(seq_file_net(seq));
-
-	mutex_unlock(&ipvs->app_mutex);
+	mutex_unlock(&__ip_vs_app_mutex);
 }
 
 static int ip_vs_app_seq_show(struct seq_file *seq, void *v)
@@ -583,7 +581,6 @@ static int __net_init __ip_vs_app_init(struct net *net)
 	struct netns_ipvs *ipvs = net_ipvs(net);
 
 	INIT_LIST_HEAD(&ipvs->app_list);
-	__mutex_init(&ipvs->app_mutex, "ipvs->app_mutex", &ipvs->app_key);
 	proc_net_fops_create(net, "ip_vs_app", 0, &ip_vs_app_fops);
 	return 0;
 }
-- 
1.7.2.3

  reply	other threads:[~2011-03-22  1:18 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-03-21  2:51 [GIT] Networking David Miller
2011-03-21 12:53 ` [held lock freed] " Ingo Molnar
2011-03-21 13:32   ` Eric Dumazet
2011-03-21 14:50     ` Arnd Bergmann
2011-03-21 14:55       ` Eric Dumazet
2011-03-21 15:22         ` Arnd Bergmann
2011-03-21 16:16           ` Ingo Molnar
2011-03-22  1:18           ` David Miller
2011-03-21 16:15     ` Ingo Molnar
2011-03-21 16:42       ` [slab poison overwritten] " Ingo Molnar
2011-03-21 17:37         ` Ingo Molnar
2011-03-21 17:39         ` Ingo Molnar
2011-03-21 18:07           ` Eric Dumazet
2011-03-21 20:15             ` Eric Dumazet
2011-03-21 22:13               ` Simon Horman
2011-03-21 23:29                 ` Simon Horman
2011-03-22  0:01                   ` Simon Horman
2011-03-22  0:17                     ` Simon Horman
2011-03-22  1:18                       ` Simon Horman [this message]
2011-03-22  3:40                         ` David Miller
2011-03-22  3:39                 ` David Miller
2011-03-22  9:56               ` Ingo Molnar
2011-03-22 10:00                 ` Eric Dumazet
2011-03-22 21:52                   ` Simon Horman
2011-03-22  9:07           ` Ingo Molnar
2011-03-22  1:16       ` [held lock freed] " David Miller
2011-03-21 19:24 ` Linus Torvalds
2011-03-21 20:10   ` Eric Dumazet
2011-03-22  4:09   ` David Miller
2011-03-22 10:00   ` Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110322011801.GE27019@verge.net.au \
    --to=horms@verge.net.au \
    --cc=a.p.zijlstra@chello.nl \
    --cc=akpm@linux-foundation.org \
    --cc=arnd@arndb.de \
    --cc=davem@davemloft.net \
    --cc=eric.dumazet@gmail.com \
    --cc=hans@schillstrom.com \
    --cc=ja@ssi.bg \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=netdev@vger.kernel.org \
    --cc=penberg@cs.helsinki.fi \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).