* [PATCH] ipv4: Namespaceify tcp_max_orphans knob
From: Haishuang Yan @ 2017-09-07 3:10 UTC
To: David S. Miller, Alexey Kuznetsov, Hideaki YOSHIFUJI, Eric Dumazet
Cc: netdev, linux-kernel, Haishuang Yan
Applications in different namespaces might require a different maximum
number of orphaned TCP sockets, independently of the host.
Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
---
include/net/netns/ipv4.h | 1 +
include/net/tcp.h | 5 +++--
net/ipv4/sysctl_net_ipv4.c | 14 +++++++-------
net/ipv4/tcp.c | 3 ---
net/ipv4/tcp_input.c | 1 -
net/ipv4/tcp_ipv4.c | 1 +
6 files changed, 12 insertions(+), 13 deletions(-)
diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
index 20d061c..305e031 100644
--- a/include/net/netns/ipv4.h
+++ b/include/net/netns/ipv4.h
@@ -127,6 +127,7 @@ struct netns_ipv4 {
 	int sysctl_tcp_timestamps;
 	struct inet_timewait_death_row tcp_death_row;
 	int sysctl_max_syn_backlog;
+	int sysctl_tcp_max_orphans;
 
 #ifdef CONFIG_NET_L3_MASTER_DEV
 	int sysctl_udp_l3mdev_accept;
diff --git a/include/net/tcp.h b/include/net/tcp.h
index b510f28..ac2d998 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -320,10 +320,11 @@ static inline bool tcp_too_many_orphans(struct sock *sk, int shift)
 {
 	struct percpu_counter *ocp = sk->sk_prot->orphan_count;
 	int orphans = percpu_counter_read_positive(ocp);
+	int tcp_max_orphans = sock_net(sk)->ipv4.sysctl_tcp_max_orphans;
 
-	if (orphans << shift > sysctl_tcp_max_orphans) {
+	if (orphans << shift > tcp_max_orphans) {
 		orphans = percpu_counter_sum_positive(ocp);
-		if (orphans << shift > sysctl_tcp_max_orphans)
+		if (orphans << shift > tcp_max_orphans)
 			return true;
 	}
 	return false;
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 0d3c038..4f26c8d3 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -394,13 +394,6 @@ static int proc_tcp_available_ulp(struct ctl_table *ctl,
 		.proc_handler	= proc_dointvec
 	},
 	{
-		.procname	= "tcp_max_orphans",
-		.data		= &sysctl_tcp_max_orphans,
-		.maxlen		= sizeof(int),
-		.mode		= 0644,
-		.proc_handler	= proc_dointvec
-	},
-	{
 		.procname	= "tcp_fastopen",
 		.data		= &sysctl_tcp_fastopen,
 		.maxlen		= sizeof(int),
@@ -1085,6 +1078,13 @@ static int proc_tcp_available_ulp(struct ctl_table *ctl,
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec
 	},
+	{
+		.procname	= "tcp_max_orphans",
+		.data		= &init_net.ipv4.sysctl_tcp_max_orphans,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec
+	},
 #ifdef CONFIG_IP_ROUTE_MULTIPATH
 	{
 		.procname	= "fib_multipath_use_neigh",
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 5091402..39187ac 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3522,9 +3522,6 @@ void __init tcp_init(void)
 	}
 
 
-	cnt = tcp_hashinfo.ehash_mask + 1;
-	sysctl_tcp_max_orphans = cnt / 2;
-
 	tcp_init_mem();
 	/* Set per-socket limits to no more than 1/128 the pressure threshold */
 	limit = nr_free_buffer_pages() << (PAGE_SHIFT - 7);
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index c5d7656..0230509 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -88,7 +88,6 @@
 
 int sysctl_tcp_stdurg __read_mostly;
 int sysctl_tcp_rfc1337 __read_mostly;
-int sysctl_tcp_max_orphans __read_mostly = NR_FILE;
 int sysctl_tcp_frto __read_mostly = 2;
 int sysctl_tcp_min_rtt_wlen __read_mostly = 300;
 int sysctl_tcp_moderate_rcvbuf __read_mostly = 1;
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index a63486a..4b17a91 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -2468,6 +2468,7 @@ static int __net_init tcp_sk_init(struct net *net)
 	net->ipv4.tcp_death_row.hashinfo = &tcp_hashinfo;
 
 	net->ipv4.sysctl_max_syn_backlog = max(128, cnt / 256);
+	net->ipv4.sysctl_tcp_max_orphans = cnt / 2;
 	net->ipv4.sysctl_tcp_sack = 1;
 	net->ipv4.sysctl_tcp_window_scaling = 1;
 	net->ipv4.sysctl_tcp_timestamps = 1;
--
1.8.3.1
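For readers following the include/net/tcp.h hunk, here is a minimal userspace sketch of the two-stage check in tcp_too_many_orphans(): a cheap, possibly stale counter read first, and an exact sum only when the cheap read already exceeds the limit. As in the diff, the counter is shared while the limit is taken from the socket's namespace. The types orphan_counter and netns_sketch, the constant NR_CPUS_SKETCH, and the helper names are hypothetical stand-ins, not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS_SKETCH 4	/* hypothetical CPU count for this model */

/* Hypothetical stand-in for the kernel's struct percpu_counter. */
struct orphan_counter {
	long approx;			/* cheap global value, may be stale */
	long pcpu[NR_CPUS_SKETCH];	/* per-CPU deltas not yet folded in */
};

/* Hypothetical stand-in for the per-namespace field the patch adds. */
struct netns_sketch {
	int sysctl_tcp_max_orphans;
};

/* Cheap read: looks at the global value only, clamped at zero. */
static long counter_read_positive(const struct orphan_counter *c)
{
	return c->approx > 0 ? c->approx : 0;
}

/* Exact read: folds in every per-CPU delta, clamped at zero. */
static long counter_sum_positive(const struct orphan_counter *c)
{
	long sum = c->approx;

	for (int i = 0; i < NR_CPUS_SKETCH; i++)
		sum += c->pcpu[i];
	return sum > 0 ? sum : 0;
}

/* Same shape as the patched check: shared counter, per-namespace limit. */
static bool too_many_orphans(const struct orphan_counter *c,
			     const struct netns_sketch *net, int shift)
{
	long limit = net->sysctl_tcp_max_orphans;
	long orphans = counter_read_positive(c);

	if (orphans << shift > limit) {
		/* Only pay for the exact sum when the cheap read trips. */
		orphans = counter_sum_positive(c);
		if (orphans << shift > limit)
			return true;
	}
	return false;
}
```

With approx = 120 and per-CPU deltas summing to -30, the exact count is 90, so a namespace whose limit is 80 reports too many orphans while one whose limit is 100 does not, even though the cheap read exceeds both limits.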
* Re: [PATCH] ipv4: Namespaceify tcp_max_orphans knob
From: Cong Wang @ 2017-09-08 22:13 UTC
To: Haishuang Yan
Cc: David S. Miller, Alexey Kuznetsov, Hideaki YOSHIFUJI,
Eric Dumazet, Linux Kernel Network Developers, LKML
On Wed, Sep 6, 2017 at 8:10 PM, Haishuang Yan
<yanhaishuang@cmss.chinamobile.com> wrote:
> Different namespace application might require different maximal number
> of TCP sockets independently of the host.
So after your patch we could have N * net->ipv4.sysctl_tcp_max_orphans
in a whole system, right? This just makes OOM easier to trigger.
* Re: [PATCH] ipv4: Namespaceify tcp_max_orphans knob
From: 严海双 @ 2017-09-09 1:25 UTC
To: Cong Wang
Cc: David S. Miller, Alexey Kuznetsov, Hideaki YOSHIFUJI,
Eric Dumazet, Linux Kernel Network Developers, LKML
> On Sep 9, 2017, at 6:13 AM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>
> On Wed, Sep 6, 2017 at 8:10 PM, Haishuang Yan
> <yanhaishuang@cmss.chinamobile.com> wrote:
>> Different namespace application might require different maximal number
>> of TCP sockets independently of the host.
>
> So after your patch we could have N * net->ipv4.sysctl_tcp_max_orphans
> in a whole system, right? This just makes OOM easier to trigger.
>
From my understanding, before the patch we had N * net->ipv4.sysctl_tcp_max_orphans,
and after the patch we could have ns1.sysctl_tcp_max_orphans + ns2.sysctl_tcp_max_orphans
+ ns3.sysctl_tcp_max_orphans. Is that right? Thanks for reviewing.
* Re: [PATCH] ipv4: Namespaceify tcp_max_orphans knob
From: Cong Wang @ 2017-09-09 4:35 UTC
To: 严海双
Cc: David S. Miller, Alexey Kuznetsov, Hideaki YOSHIFUJI,
Eric Dumazet, Linux Kernel Network Developers, LKML
On Fri, Sep 8, 2017 at 6:25 PM, 严海双 <yanhaishuang@cmss.chinamobile.com> wrote:
>
>
>> On Sep 9, 2017, at 6:13 AM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>>
>> On Wed, Sep 6, 2017 at 8:10 PM, Haishuang Yan
>> <yanhaishuang@cmss.chinamobile.com> wrote:
>>> Different namespace application might require different maximal number
>>> of TCP sockets independently of the host.
>>
>> So after your patch we could have N * net->ipv4.sysctl_tcp_max_orphans
>> in a whole system, right? This just makes OOM easier to trigger.
>>
>
> From my understanding, before the patch, we had N * net->ipv4.sysctl_tcp_max_orphans,
> and after the patch, we could have ns1.sysctl_tcp_max_orphans + ns2.sysctl_tcp_max_orphans
> + ns3.sysctl_tcp_max_orphans, is that right? Thanks for your reviewing.
Nope, by N I mean the number of containers. Before your patch, the limit
is global, after your patch it is per container.
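The arithmetic behind this concern can be sketched as follows, assuming each container's limit is enforced independently, which is how the thread describes the post-patch behaviour. The function names and the example numbers are hypothetical; the only value taken from the patch is the default of cnt / 2 per namespace.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Before the patch: one limit for the whole system. The number of
 * namespaces does not enter the bound at all.
 */
static long global_orphan_bound(long global_limit, size_t nr_namespaces)
{
	(void)nr_namespaces;	/* deliberately unused: there is no N */
	return global_limit;
}

/*
 * After the patch, under the reading above: the system-wide worst case
 * is the sum of every namespace's own limit.
 */
static long per_ns_orphan_bound(const int *limits, size_t nr_namespaces)
{
	long total = 0;

	for (size_t i = 0; i < nr_namespaces; i++)
		total += limits[i];
	return total;
}
```

For example, with a hypothetical cnt of 65536 the default limit is 32768. Three containers left at that default would give a combined bound of 98304 under this reading, whereas the global bound stays at 32768 no matter how many containers exist.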
* Re: [PATCH] ipv4: Namespaceify tcp_max_orphans knob
From: 严海双 @ 2017-09-09 5:09 UTC
To: Cong Wang
Cc: David S. Miller, Alexey Kuznetsov, Hideaki YOSHIFUJI,
Eric Dumazet, Linux Kernel Network Developers, LKML
> On Sep 9, 2017, at 12:35 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>
> On Fri, Sep 8, 2017 at 6:25 PM, 严海双 <yanhaishuang@cmss.chinamobile.com> wrote:
>>
>>
>>> On Sep 9, 2017, at 6:13 AM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>>>
>>> On Wed, Sep 6, 2017 at 8:10 PM, Haishuang Yan
>>> <yanhaishuang@cmss.chinamobile.com> wrote:
>>>> Different namespace application might require different maximal number
>>>> of TCP sockets independently of the host.
>>>
>>> So after your patch we could have N * net->ipv4.sysctl_tcp_max_orphans
>>> in a whole system, right? This just makes OOM easier to trigger.
>>>
>>
>> From my understanding, before the patch, we had N * net->ipv4.sysctl_tcp_max_orphans,
>> and after the patch, we could have ns1.sysctl_tcp_max_orphans + ns2.sysctl_tcp_max_orphans
>> + ns3.sysctl_tcp_max_orphans, is that right? Thanks for your reviewing.
>
> Nope, by N I mean the number of containers. Before your patch, the limit
> is global, after your patch it is per container.
>
Yeah, for example, if there are N containers, what I meant is that before the patch the limit is:
N * net->ipv4.sysctl_tcp_max_orphans
After the patch, the limit is:
ns1.net->ipv4.sysctl_tcp_max_orphans + ns2.net->ipv4.sysctl_tcp_max_orphans + …
* Re: [PATCH] ipv4: Namespaceify tcp_max_orphans knob
From: David Miller @ 2017-09-09 5:16 UTC
To: yanhaishuang
Cc: xiyou.wangcong, kuznet, yoshfuji, edumazet, netdev, linux-kernel
From: 严海双 <yanhaishuang@cmss.chinamobile.com>
Date: Sat, 9 Sep 2017 13:09:57 +0800
>
>
>> On Sep 9, 2017, at 12:35 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>>
>> On Fri, Sep 8, 2017 at 6:25 PM, 严海双 <yanhaishuang@cmss.chinamobile.com> wrote:
>>>
>>>
>>>> On Sep 9, 2017, at 6:13 AM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>>>>
>>>> On Wed, Sep 6, 2017 at 8:10 PM, Haishuang Yan
>>>> <yanhaishuang@cmss.chinamobile.com> wrote:
>>>>> Different namespace application might require different maximal number
>>>>> of TCP sockets independently of the host.
>>>>
>>>> So after your patch we could have N * net->ipv4.sysctl_tcp_max_orphans
>>>> in a whole system, right? This just makes OOM easier to trigger.
>>>>
>>>
>>> From my understanding, before the patch, we had N * net->ipv4.sysctl_tcp_max_orphans,
>>> and after the patch, we could have ns1.sysctl_tcp_max_orphans + ns2.sysctl_tcp_max_orphans
>>> + ns3.sysctl_tcp_max_orphans, is that right? Thanks for your reviewing.
>>
>> Nope, by N I mean the number of containers. Before your patch, the limit
>> is global, after your patch it is per container.
>>
>
> Yeah, for example, if there is N containers, before the patch, I mean the limit is:
>
> N * net->ipv4.sysctl_tcp_max_orphans
>
> After the patch, the limit is:
>
> ns1. net->ipv4.sysctl_tcp_max_orphans + ns2. net->ipv4.sysctl_tcp_max_orphans + …
Not true.
Please remove "N" from your equation of the current situation.
"sysctl_tcp_max_orphans" applies to entire system, it is a global limit,
comparing one limit against all orphans in the system, there is no N.
* Re: [PATCH] ipv4: Namespaceify tcp_max_orphans knob
From: 严海双 @ 2017-09-09 10:21 UTC
To: David Miller
Cc: xiyou.wangcong, kuznet, yoshfuji, edumazet, netdev, linux-kernel
> On Sep 9, 2017, at 1:16 PM, David Miller <davem@davemloft.net> wrote:
>
> From: 严海双 <yanhaishuang@cmss.chinamobile.com>
> Date: Sat, 9 Sep 2017 13:09:57 +0800
>
>>
>>
>>> On Sep 9, 2017, at 12:35 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>>>
>>> On Fri, Sep 8, 2017 at 6:25 PM, 严海双 <yanhaishuang@cmss.chinamobile.com> wrote:
>>>>
>>>>
>>>>> On Sep 9, 2017, at 6:13 AM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>>>>>
>>>>> On Wed, Sep 6, 2017 at 8:10 PM, Haishuang Yan
>>>>> <yanhaishuang@cmss.chinamobile.com> wrote:
>>>>>> Different namespace application might require different maximal number
>>>>>> of TCP sockets independently of the host.
>>>>>
>>>>> So after your patch we could have N * net->ipv4.sysctl_tcp_max_orphans
>>>>> in a whole system, right? This just makes OOM easier to trigger.
>>>>>
>>>>
>>>> From my understanding, before the patch, we had N * net->ipv4.sysctl_tcp_max_orphans,
>>>> and after the patch, we could have ns1.sysctl_tcp_max_orphans + ns2.sysctl_tcp_max_orphans
>>>> + ns3.sysctl_tcp_max_orphans, is that right? Thanks for your reviewing.
>>>
>>> Nope, by N I mean the number of containers. Before your patch, the limit
>>> is global, after your patch it is per container.
>>>
>>
>> Yeah, for example, if there is N containers, before the patch, I mean the limit is:
>>
>> N * net->ipv4.sysctl_tcp_max_orphans
>>
>> After the patch, the limit is:
>>
>> ns1. net->ipv4.sysctl_tcp_max_orphans + ns2. net->ipv4.sysctl_tcp_max_orphans + …
>
> Not true.
>
> Please remove "N" from your equation of the current situation.
>
> "sysctl_tcp_max_orphans" applies to entire system, it is a global limit,
> comparing one limit against all orphans in the system, there is no N.
Yes, that's right. I browsed the source code and found that it's a global limit;
sorry for my mistake.
Thanks, David and Cong.