linux-kernel.vger.kernel.org archive mirror
From: Glauber Costa <glommer@parallels.com>
To: Ingo Molnar <mingo@elte.hu>
Cc: <netdev@vger.kernel.org>, "David S. Miller" <davem@davemloft.net>,
	<linux-kernel@vger.kernel.org>
Subject: Re: [v3.3-rc1 regression] TCP: too many of orphaned sockets
Date: Fri, 27 Jan 2012 20:27:06 +0400
Message-ID: <4F22D05A.8030604@parallels.com>
In-Reply-To: <4F22B634.2020007@parallels.com>

[-- Attachment #1: Type: text/plain, Size: 1483 bytes --]

On 01/27/2012 06:35 PM, Glauber Costa wrote:
> On 01/27/2012 06:22 PM, Ingo Molnar wrote:
>>
>> * Ingo Molnar<mingo@elte.hu> wrote:
>>
>>> ok, i've bisected it, and the bad commit is:
>>>
>>> 3dc43e3e4d0b52197d3205214fe8f162f9e0c334 is the first bad commit
>>> commit 3dc43e3e4d0b52197d3205214fe8f162f9e0c334
>>> Author: Glauber Costa<glommer@parallels.com>
>>> Date: Sun Dec 11 21:47:05 2011 +0000
>>>
>>> per-netns ipv4 sysctl_tcp_mem
>>
>> Might be related to this detail in the .config:
>>
>> # CONFIG_PROC_SYSCTL is not set
>>
>> So former tcp_init() code does not get run?
>>
>> Thanks,
>>
>> Ingo
>
> Can you tell me if the following patch fixes your problem?
>
Update on this:

What actually breaks it is CONFIG_SYSCTL being disabled.
CONFIG_PROC_SYSCTL selects CONFIG_SYSCTL, so if you enable the one, you
end up getting the other. (The config mingo provided lacks both.)

Also, I believe there is no harm in initializing this unconditionally,
so instead of cluttering tcp_init() with #ifdef, I am proposing we just
init it in tcp_init(), and then init it again in sysctl initialization.
I don't expect it to hurt any workload, since it is a one-time
initialization.
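
For readers following along, the init-twice idea can be sketched in
user-space C roughly as below. This is only an illustration, not kernel
code: struct demo_net, the demo_* functions and the DEMO_SYSCTL macro are
hypothetical stand-ins (for struct net, init_tcp_mem(), tcp_init(),
ipv4_sysctl_init_net() and CONFIG_SYSCTL), and the page count is passed in
as a parameter instead of coming from nr_free_buffer_pages().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for net->ipv4.sysctl_tcp_mem. */
struct demo_net {
	long tcp_mem[3];
};

/* Same arithmetic as init_tcp_mem() in the patch; free_pages is supplied
 * by the caller here instead of nr_free_buffer_pages(). */
static void demo_init_tcp_mem(struct demo_net *net, unsigned long free_pages)
{
	unsigned long limit = free_pages / 8;

	assert(net != NULL);
	if (limit < 128UL)
		limit = 128UL;
	net->tcp_mem[0] = (long)(limit / 4 * 3);
	net->tcp_mem[1] = (long)limit;
	net->tcp_mem[2] = net->tcp_mem[0] * 2;
}

#ifdef DEMO_SYSCTL
/* Sysctl support compiled in: the per-namespace init sets the values again. */
static void demo_sysctl_init_net(struct demo_net *net, unsigned long free_pages)
{
	demo_init_tcp_mem(net, free_pages);
}
#else
/* Sysctl support compiled out: this init does nothing at all. */
static void demo_sysctl_init_net(struct demo_net *net, unsigned long free_pages)
{
	(void)net;
	(void)free_pages;
}
#endif

/* Stand-in for boot: the unconditional init runs first, so the values are
 * sane whether or not the sysctl init later does any work. */
static void demo_tcp_init(struct demo_net *net, unsigned long free_pages)
{
	demo_init_tcp_mem(net, free_pages);
	demo_sysctl_init_net(net, free_pages);
}
```

Because the second call recomputes the same values from the same input,
running it is harmless, and no #ifdef is needed at the call site.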

Now, I am attaching my proposed final patch for this, but I can't really
generate a config without sysctl that boots okay for me.

Ingo, would you please confirm that this fixes the problem for you? If 
I'm mistaken, let me know and I'll get back to it ASAP.

Dave, once Ingo acks that it fixes the problem he reported, I'll submit the
patch formally.

Thanks.

[-- Attachment #2: 0001-fix-tcp-sysctl-initialization-with-CONFIG_SYSCTL-dis.patch --]
[-- Type: text/x-patch, Size: 3063 bytes --]

From 49318a2c917f970373e66e21d747a38a595eb462 Mon Sep 17 00:00:00 2001
From: Glauber Costa <glommer@parallels.com>
Date: Fri, 27 Jan 2012 19:34:17 +0400
Subject: [PATCH] fix tcp sysctl initialization with CONFIG_SYSCTL disabled.

sysctl_tcp_mem initialization was moved to sysctl_net_ipv4.c
in commit 3dc43e3e4d0b52197d3205214fe8f162f9e0c334, since it
became a per-ns value.

That code, however, will never run when CONFIG_SYSCTL is disabled,
leading to bogus values on those fields.

This patch fixes it by keeping the initialization code in tcp_init().
The values will be overwritten by the first net namespace init if
CONFIG_SYSCTL is compiled in, and are still set correctly if it is
compiled out.

Signed-off-by: Glauber Costa <glommer@parallels.com>
Reported-by: Ingo Molnar <mingo@elte.hu>
---
 include/net/tcp.h          |    2 ++
 net/ipv4/sysctl_net_ipv4.c |    1 +
 net/ipv4/tcp.c             |   16 +++++++++++++---
 3 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 0118ea9..b04a3e9 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -311,6 +311,8 @@ extern struct proto tcp_prot;
 #define TCP_ADD_STATS_USER(net, field, val) SNMP_ADD_STATS_USER((net)->mib.tcp_statistics, field, val)
 #define TCP_ADD_STATS(net, field, val)	SNMP_ADD_STATS((net)->mib.tcp_statistics, field, val)
 
+extern void init_tcp_mem(struct net *net);
+
 extern void tcp_v4_err(struct sk_buff *skb, u32);
 
 extern void tcp_shutdown (struct sock *sk, int how);
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 4aa7e9d..1d67cde 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -814,6 +814,7 @@ static __net_init int ipv4_sysctl_init_net(struct net *net)
 
 	net->ipv4.sysctl_rt_cache_rebuild_count = 4;
 
+	init_tcp_mem(net);
 	limit = nr_free_buffer_pages() / 8;
 	limit = max(limit, 128UL);
 	net->ipv4.sysctl_tcp_mem[0] = limit / 4 * 3;
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 9bcdec3..34e4051 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3216,6 +3216,16 @@ static int __init set_thash_entries(char *str)
 }
 __setup("thash_entries=", set_thash_entries);
 
+void init_tcp_mem(struct net *net)
+{
+	/* Set per-socket limits to no more than 1/128 the pressure threshold */
+	unsigned long limit = nr_free_buffer_pages() / 8;
+	limit = max(limit, 128UL);
+	net->ipv4.sysctl_tcp_mem[0] = limit / 4 * 3;
+	net->ipv4.sysctl_tcp_mem[1] = limit;
+	net->ipv4.sysctl_tcp_mem[2] = net->ipv4.sysctl_tcp_mem[0] * 2;
+}
+
 void __init tcp_init(void)
 {
 	struct sk_buff *skb = NULL;
@@ -3276,9 +3286,9 @@ void __init tcp_init(void)
 	sysctl_tcp_max_orphans = cnt / 2;
 	sysctl_max_syn_backlog = max(128, cnt / 256);
 
-	/* Set per-socket limits to no more than 1/128 the pressure threshold */
-	limit = ((unsigned long)init_net.ipv4.sysctl_tcp_mem[1])
-		<< (PAGE_SHIFT - 7);
+	init_tcp_mem(&init_net);
+	limit = nr_free_buffer_pages() / 8;
+	limit = max(limit, 128UL);
 	max_share = min(4UL*1024*1024, limit);
 
 	sysctl_tcp_wmem[0] = SK_MEM_QUANTUM;
-- 
1.7.7.4

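A note for reviewers of the last hunk: the lines it removes and the lines
it adds do not compute the same `limit`. The removed expression shifts
tcp_mem[1] by (PAGE_SHIFT - 7), while the added one uses the page count
directly. The sketch below is a user-space comparison, not kernel code;
the demo_* names are hypothetical, and DEMO_PAGE_SHIFT = 12 (4 KiB pages)
is an assumption that the patch does not state.

```c
#include <assert.h>

/* Assumption: 4 KiB pages, so the page shift is 12. */
#define DEMO_PAGE_SHIFT 12

static unsigned long demo_min(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/* tcp_mem[1] in pages, following the arithmetic in init_tcp_mem(). */
static unsigned long demo_pressure_pages(unsigned long free_pages)
{
	unsigned long limit = free_pages / 8;

	return limit < 128UL ? 128UL : limit;
}

/* Removed lines: tcp_mem[1] << (PAGE_SHIFT - 7), that is 1/128 of the
 * pressure threshold expressed in bytes. */
static unsigned long demo_limit_removed(unsigned long free_pages)
{
	assert(DEMO_PAGE_SHIFT >= 7);
	return demo_pressure_pages(free_pages) << (DEMO_PAGE_SHIFT - 7);
}

/* Added lines: the same threshold with no shift, so still in pages. */
static unsigned long demo_limit_added(unsigned long free_pages)
{
	return demo_pressure_pages(free_pages);
}

/* max_share as computed immediately after either variant in tcp_init(). */
static unsigned long demo_max_share(unsigned long limit)
{
	return demo_min(4UL * 1024 * 1024, limit);
}
```

With 80000 free buffer pages, the removed form gives 320000 and the added
form gives 10000, so max_share differs by a factor of 32 under the 4 KiB
page assumption. Whether that change is intended is a question for the
formal submission.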

Thread overview: 13+ messages
2012-01-27 12:46 [v3.3-rc1 regression] TCP: too many of orphaned sockets Ingo Molnar
2012-01-27 12:49 ` Glauber Costa
2012-01-27 12:56   ` Ingo Molnar
2012-01-27 14:17     ` Ingo Molnar
2012-01-27 14:22       ` Ingo Molnar
2012-01-27 14:28         ` Glauber Costa
2012-01-27 14:35         ` Glauber Costa
2012-01-27 16:27           ` Glauber Costa [this message]
2012-01-27 21:28             ` David Miller
2012-01-27 21:28               ` Glauber Costa
2012-01-28 11:50                 ` [PATCH] net/tcp: Fix tcp memory limits initialization when !CONFIG_SYSCTL Ingo Molnar
2012-01-30 11:17                   ` Glauber Costa
     [not found] <inTpE-FS-29@gated-at.bofh.it>
2012-01-30 22:13 ` [v3.3-rc1 regression] TCP: too many of orphaned sockets Arun Sharma
