* [PATCH v3 0/3] Namespaceify two sysctls related with route
@ 2022-08-30 9:14 cgel.zte
2022-08-30 9:16 ` [PATCH v3 1/3] ipv4: Namespaceify route/error_cost knob cgel.zte
` (3 more replies)
0 siblings, 4 replies; 7+ messages in thread
From: cgel.zte @ 2022-08-30 9:14 UTC
To: davem, kuba, yoshfuji, dsahern; +Cc: netdev, linux-kernel, xu.xin16
From: xu xin <xu.xin16@zte.com.cn>
With the rise of cloud-native deployments, more and more applications run
in containers, and the network namespace is one of the foundations of a
container. The error_cost and error_burst sysctls are important knobs that
control how often ICMP_DEST_UNREACH packets are sent for IPv4. When
different containers need different error_cost and error_burst settings,
the sysctls should, for the sake of host security, exist per network
namespace.
Different network namespaces have different requirements for error_cost
and error_burst, which together limit the rate at which ICMP_DEST_UNREACH
packets are sent. Enable them to be configured per netns.
xu xin (3):
ipv4: Namespaceify route/error_cost knob
ipv4: Namespaceify route/error_burst knob
ipv4: add documentation of two sysctls about icmp
Documentation/networking/ip-sysctl.rst | 17 ++++++++++++
include/net/netns/ipv4.h | 2 ++
net/ipv4/route.c | 36 ++++++++++++++------------
3 files changed, 39 insertions(+), 16 deletions(-)
--
2.25.1
* [PATCH v3 1/3] ipv4: Namespaceify route/error_cost knob
2022-08-30 9:14 [PATCH v3 0/3] Namespaceify two sysctls related with route cgel.zte
@ 2022-08-30 9:16 ` cgel.zte
2022-08-30 9:16 ` [PATCH v3 2/3] ipv4: Namespaceify route/error_burst knob cgel.zte
` (2 subsequent siblings)
3 siblings, 0 replies; 7+ messages in thread
From: cgel.zte @ 2022-08-30 9:16 UTC
To: davem, kuba, yoshfuji, dsahern
Cc: netdev, linux-kernel, xu.xin16, Yunkai Zhang
From: xu xin <xu.xin16@zte.com.cn>
Different network namespaces have different requirements for the
error_cost sysctl, which, together with error_burst, limits the maximum
rate at which ICMP_DEST_UNREACH packets are sent. Put simply, it is the
minimum interval between two consecutive ICMP_DEST_UNREACH packets sent to
the same peer in the steady state, as opposed to the burst that is allowed
after a long quiet period.
Enable error_cost to be configured per network namespace.
Signed-off-by: xu xin (CGEL ZTE) <xu.xin16@zte.com.cn>
Reviewed-by: Yunkai Zhang (CGEL ZTE) <zhang.yunkai@zte.com.cn>
---
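For readers unfamiliar with the limiter: the check in ip_error() behaves
like a token bucket, and the user-space sketch below models it to show what
error_cost means in the steady state. It is an illustration only, not the
kernel code; struct peer_model, may_send(), count_sent() and the HZ value
of 1000 are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define HZ 1000 /* assumed tick rate; the real value is a kernel build option */

/* Hypothetical stand-in for the per-peer rate-limit state. */
struct peer_model {
	unsigned long rate_tokens; /* accumulated credit, in jiffies */
	unsigned long rate_last;   /* time of the previous check, in jiffies */
};

/* Returns true when one ICMP error may be sent to this peer at time `now`. */
static bool may_send(struct peer_model *p, unsigned long now,
		     unsigned long error_cost, unsigned long error_burst)
{
	assert(p != NULL);

	/* Credit grows with elapsed time but is capped at error_burst. */
	p->rate_tokens += now - p->rate_last;
	if (p->rate_tokens > error_burst)
		p->rate_tokens = error_burst;
	p->rate_last = now;

	/* Each packet sent consumes error_cost jiffies of credit. */
	if (p->rate_tokens >= error_cost) {
		p->rate_tokens -= error_cost;
		return true;
	}
	return false;
}

/*
 * Offers `n` error events spaced `gap` jiffies apart, the first one `idle`
 * jiffies after the (empty) peer state was last touched, and returns how
 * many of them would be answered.
 */
static unsigned int count_sent(unsigned long idle, unsigned long gap,
			       unsigned int n, unsigned long error_cost,
			       unsigned long error_burst)
{
	struct peer_model p = { .rate_tokens = 0, .rate_last = 0 };
	unsigned long now = idle;
	unsigned int sent = 0;
	unsigned int i;

	for (i = 0; i < n; i++) {
		if (may_send(&p, now, error_cost, error_burst))
			sent++;
		now += gap;
	}
	return sent;
}
```

In this model, with the default values (error_cost = HZ, error_burst =
5 * HZ) and one error event every HZ / 10 jiffies for ten seconds,
count_sent(0, HZ / 10, 100, HZ, 5 * HZ) returns 9: roughly one reply per
error_cost jiffies. Doubling error_cost to 2 * HZ brings that down to 4.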
include/net/netns/ipv4.h | 1 +
net/ipv4/route.c | 18 ++++++++++--------
2 files changed, 11 insertions(+), 8 deletions(-)
diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
index c7320ef356d9..319395bbad3c 100644
--- a/include/net/netns/ipv4.h
+++ b/include/net/netns/ipv4.h
@@ -85,6 +85,7 @@ struct netns_ipv4 {
u32 ip_rt_min_pmtu;
int ip_rt_mtu_expires;
int ip_rt_min_advmss;
+ int ip_rt_error_cost;
struct local_ports ip_local_ports;
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 795cbe1de912..209539c201c2 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -118,7 +118,6 @@ static int ip_rt_max_size;
static int ip_rt_redirect_number __read_mostly = 9;
static int ip_rt_redirect_load __read_mostly = HZ / 50;
static int ip_rt_redirect_silence __read_mostly = ((HZ / 50) << (9 + 1));
-static int ip_rt_error_cost __read_mostly = HZ;
static int ip_rt_error_burst __read_mostly = 5 * HZ;
static int ip_rt_gc_timeout __read_mostly = RT_GC_TIMEOUT;
@@ -1000,6 +999,8 @@ static int ip_error(struct sk_buff *skb)
send = true;
if (peer) {
+ int ip_rt_error_cost = READ_ONCE(net->ipv4.ip_rt_error_cost);
+
now = jiffies;
peer->rate_tokens += now - peer->rate_last;
if (peer->rate_tokens > ip_rt_error_burst)
@@ -3535,13 +3536,6 @@ static struct ctl_table ipv4_route_table[] = {
.mode = 0644,
.proc_handler = proc_dointvec,
},
- {
- .procname = "error_cost",
- .data = &ip_rt_error_cost,
- .maxlen = sizeof(int),
- .mode = 0644,
- .proc_handler = proc_dointvec,
- },
{
.procname = "error_burst",
.data = &ip_rt_error_burst,
@@ -3590,6 +3584,13 @@ static struct ctl_table ipv4_route_netns_table[] = {
.mode = 0644,
.proc_handler = proc_dointvec,
},
+ {
+ .procname = "error_cost",
+ .data = &init_net.ipv4.ip_rt_error_cost,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_dointvec,
+ },
{ },
};
@@ -3653,6 +3654,7 @@ static __net_init int netns_ip_rt_init(struct net *net)
net->ipv4.ip_rt_min_pmtu = DEFAULT_MIN_PMTU;
net->ipv4.ip_rt_mtu_expires = DEFAULT_MTU_EXPIRES;
net->ipv4.ip_rt_min_advmss = DEFAULT_MIN_ADVMSS;
+ net->ipv4.ip_rt_error_cost = HZ;
return 0;
}
--
2.25.1
* [PATCH v3 2/3] ipv4: Namespaceify route/error_burst knob
2022-08-30 9:14 [PATCH v3 0/3] Namespaceify two sysctls related with route cgel.zte
2022-08-30 9:16 ` [PATCH v3 1/3] ipv4: Namespaceify route/error_cost knob cgel.zte
@ 2022-08-30 9:16 ` cgel.zte
2022-08-30 9:17 ` [PATCH v3 3/3] ipv4: add documentation of two sysctls about icmp cgel.zte
2022-08-31 19:58 ` [PATCH v3 0/3] Namespaceify two sysctls related with route Jakub Kicinski
3 siblings, 0 replies; 7+ messages in thread
From: cgel.zte @ 2022-08-30 9:16 UTC
To: davem, kuba, yoshfuji, dsahern
Cc: netdev, linux-kernel, xu.xin16, Yunkai Zhang
From: xu xin <xu.xin16@zte.com.cn>
Different network namespaces have different requirements for the
error_burst sysctl, which, together with error_cost, limits the rate at
which ICMP_DEST_UNREACH packets are sent. Put simply, the larger the ratio
of error_burst to error_cost, the more ICMP_DEST_UNREACH packets may be
sent back to back after a long quiet period (one in which no
destination-unreachable ICMP packets were sent).
Enable error_burst to be configured per network namespace.
Signed-off-by: xu xin (CGEL ZTE) <xu.xin16@zte.com.cn>
Reviewed-by: Yunkai Zhang (CGEL ZTE) <zhang.yunkai@zte.com.cn>
---
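The ratio described above can be checked with a few lines of arithmetic.
The sketch below assumes that the bucket is full (credit equal to
error_burst) after a long quiet period and that the errors arrive at the
same instant, so no new credit accrues between them. burst_after_idle() and
the HZ value of 1000 are made up for the example; this is not the kernel
code.

```c
#include <assert.h>

#define HZ 1000 /* assumed tick rate; the real value is a kernel build option */

/*
 * Number of ICMP errors that can be sent back to back once the credit has
 * reached its cap. Equivalent to error_burst / error_cost with integer
 * division; the loop is spelled out to mirror how the credit is drained.
 */
static unsigned int burst_after_idle(unsigned long error_cost,
				     unsigned long error_burst)
{
	unsigned long tokens = error_burst; /* full bucket after a long calm */
	unsigned int sent = 0;

	assert(error_cost > 0); /* a zero cost would never drain the bucket */

	while (tokens >= error_cost) {
		tokens -= error_cost;
		sent++;
	}
	return sent;
}
```

In this model the defaults allow a burst of 5 packets (5 * HZ / HZ), and a
namespace that sets error_burst below error_cost would never send one,
because the credit can never reach the cost of a single packet.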
include/net/netns/ipv4.h | 1 +
net/ipv4/route.c | 18 ++++++++++--------
2 files changed, 11 insertions(+), 8 deletions(-)
diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
index 319395bbad3c..03d16cf32508 100644
--- a/include/net/netns/ipv4.h
+++ b/include/net/netns/ipv4.h
@@ -86,6 +86,7 @@ struct netns_ipv4 {
int ip_rt_mtu_expires;
int ip_rt_min_advmss;
int ip_rt_error_cost;
+ int ip_rt_error_burst;
struct local_ports ip_local_ports;
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 209539c201c2..4745a4085de5 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -114,11 +114,11 @@
#define DEFAULT_MIN_PMTU (512 + 20 + 20)
#define DEFAULT_MTU_EXPIRES (10 * 60 * HZ)
#define DEFAULT_MIN_ADVMSS 256
+#define DEFAULT_ERROR_BURST (5 * HZ)
static int ip_rt_max_size;
static int ip_rt_redirect_number __read_mostly = 9;
static int ip_rt_redirect_load __read_mostly = HZ / 50;
static int ip_rt_redirect_silence __read_mostly = ((HZ / 50) << (9 + 1));
-static int ip_rt_error_burst __read_mostly = 5 * HZ;
static int ip_rt_gc_timeout __read_mostly = RT_GC_TIMEOUT;
@@ -1000,6 +1000,7 @@ static int ip_error(struct sk_buff *skb)
send = true;
if (peer) {
int ip_rt_error_cost = READ_ONCE(net->ipv4.ip_rt_error_cost);
+ int ip_rt_error_burst = READ_ONCE(net->ipv4.ip_rt_error_burst);
now = jiffies;
peer->rate_tokens += now - peer->rate_last;
@@ -3536,13 +3537,6 @@ static struct ctl_table ipv4_route_table[] = {
.mode = 0644,
.proc_handler = proc_dointvec,
},
- {
- .procname = "error_burst",
- .data = &ip_rt_error_burst,
- .maxlen = sizeof(int),
- .mode = 0644,
- .proc_handler = proc_dointvec,
- },
{
.procname = "gc_elasticity",
.data = &ip_rt_gc_elasticity,
@@ -3591,6 +3585,13 @@ static struct ctl_table ipv4_route_netns_table[] = {
.mode = 0644,
.proc_handler = proc_dointvec,
},
+ {
+ .procname = "error_burst",
+ .data = &init_net.ipv4.ip_rt_error_burst,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_dointvec,
+ },
{ },
};
@@ -3655,6 +3656,7 @@ static __net_init int netns_ip_rt_init(struct net *net)
net->ipv4.ip_rt_mtu_expires = DEFAULT_MTU_EXPIRES;
net->ipv4.ip_rt_min_advmss = DEFAULT_MIN_ADVMSS;
net->ipv4.ip_rt_error_cost = HZ;
+ net->ipv4.ip_rt_error_burst = DEFAULT_ERROR_BURST;
return 0;
}
--
2.25.1
* [PATCH v3 3/3] ipv4: add documentation of two sysctls about icmp
2022-08-30 9:14 [PATCH v3 0/3] Namespaceify two sysctls related with route cgel.zte
2022-08-30 9:16 ` [PATCH v3 1/3] ipv4: Namespaceify route/error_cost knob cgel.zte
2022-08-30 9:16 ` [PATCH v3 2/3] ipv4: Namespaceify route/error_burst knob cgel.zte
@ 2022-08-30 9:17 ` cgel.zte
2022-09-02 10:07 ` Nicolas Dichtel
2022-08-31 19:58 ` [PATCH v3 0/3] Namespaceify two sysctls related with route Jakub Kicinski
3 siblings, 1 reply; 7+ messages in thread
From: cgel.zte @ 2022-08-30 9:17 UTC
To: davem, kuba, yoshfuji, dsahern
Cc: netdev, linux-kernel, xu.xin16, Yunkai Zhang
From: xu xin <xu.xin16@zte.com.cn>
Add the descriptions of the sysctls of error_cost and error_burst in
Documentation/networking/ip-sysctl.rst.
Signed-off-by: xu xin (CGEL ZTE) <xu.xin16@zte.com.cn>
Reviewed-by: Yunkai Zhang (CGEL ZTE) <zhang.yunkai@zte.com.cn>
---
Documentation/networking/ip-sysctl.rst | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
index 56cd4ea059b2..c113a34a4115 100644
--- a/Documentation/networking/ip-sysctl.rst
+++ b/Documentation/networking/ip-sysctl.rst
@@ -156,6 +156,23 @@ route/max_size - INTEGER
From linux kernel 3.6 onwards, this is deprecated for ipv4
as route cache is no longer used.
+route/error_cost - INTEGER
+ The minimum time interval between two consecutive ICMP
+ DEST-UNREACHABLE packets sent to the same peer in the steady
+ state. The higher the value, the lower the overall rate at
+ which ICMP DEST-UNREACHABLE packets are sent.
+
+ Default: HZ (one second)
+
+route/error_burst - INTEGER
+ Together with error_cost, controls the maximum number of ICMP
+ DEST-UNREACHABLE packets that may be sent in a burst after a
+ long quiet period (one with no ICMP DEST-UNREACHABLE sent).
+ The higher the ratio of error_burst to error_cost, the more
+ packets the burst may contain.
+
+ Default: 5 * HZ
+
neigh/default/gc_thresh1 - INTEGER
Minimum number of entries to keep. Garbage collector will not
purge entries if there are fewer than this number.
--
2.25.1
* Re: [PATCH v3 0/3] Namespaceify two sysctls related with route
2022-08-30 9:14 [PATCH v3 0/3] Namespaceify two sysctls related with route cgel.zte
` (2 preceding siblings ...)
2022-08-30 9:17 ` [PATCH v3 3/3] ipv4: add documentation of two sysctls about icmp cgel.zte
@ 2022-08-31 19:58 ` Jakub Kicinski
3 siblings, 0 replies; 7+ messages in thread
From: Jakub Kicinski @ 2022-08-31 19:58 UTC
To: cgel.zte; +Cc: davem, yoshfuji, dsahern, netdev, linux-kernel, xu.xin16
On Tue, 30 Aug 2022 09:14:53 +0000 cgel.zte@gmail.com wrote:
> With the rise of cloud-native deployments, more and more applications run
> in containers, and the network namespace is one of the foundations of a
> container. The error_cost and error_burst sysctls are important knobs that
> control how often ICMP_DEST_UNREACH packets are sent for IPv4. When
> different containers need different error_cost and error_burst settings,
> the sysctls should, for the sake of host security, exist per network
> namespace.
>
> Different network namespaces have different requirements for error_cost
> and error_burst, which together limit the rate at which ICMP_DEST_UNREACH
> packets are sent. Enable them to be configured per netns.
One last time, if v6 doesn't need it, neither should v4.
Seems like you're just trying to check a box.
I'm dropping these patches from patchwork, please don't repost them
again, unless someone from the community voices support for merging
them.
* Re: [PATCH v3 3/3] ipv4: add documentation of two sysctls about icmp
2022-08-30 9:17 ` [PATCH v3 3/3] ipv4: add documentation of two sysctls about icmp cgel.zte
@ 2022-09-02 10:07 ` Nicolas Dichtel
2022-09-11 9:03 ` CGEL
0 siblings, 1 reply; 7+ messages in thread
From: Nicolas Dichtel @ 2022-09-02 10:07 UTC
To: cgel.zte, davem, kuba, yoshfuji, dsahern
Cc: netdev, linux-kernel, xu.xin16, Yunkai Zhang
On 30/08/2022 at 11:17, cgel.zte@gmail.com wrote:
> From: xu xin <xu.xin16@zte.com.cn>
>
> Add the descriptions of the sysctls of error_cost and error_burst in
> Documentation/networking/ip-sysctl.rst.
>
> Signed-off-by: xu xin (CGEL ZTE) <xu.xin16@zte.com.cn>
> Reviewed-by: Yunkai Zhang (CGEL ZTE) <zhang.yunkai@zte.com.cn>
Maybe you could resubmit this one alone?
Thank you,
Nicolas
* Re: [PATCH v3 3/3] ipv4: add documentation of two sysctls about icmp
2022-09-02 10:07 ` Nicolas Dichtel
@ 2022-09-11 9:03 ` CGEL
0 siblings, 0 replies; 7+ messages in thread
From: CGEL @ 2022-09-11 9:03 UTC
To: Nicolas Dichtel
Cc: davem, kuba, yoshfuji, dsahern, netdev, linux-kernel, xu.xin16,
Yunkai Zhang
On Fri, Sep 02, 2022 at 12:07:11PM +0200, Nicolas Dichtel wrote:
>
> On 30/08/2022 at 11:17, cgel.zte@gmail.com wrote:
> > From: xu xin <xu.xin16@zte.com.cn>
> >
> > Add the descriptions of the sysctls of error_cost and error_burst in
> > Documentation/networking/ip-sysctl.rst.
> >
> > Signed-off-by: xu xin (CGEL ZTE) <xu.xin16@zte.com.cn>
> > Reviewed-by: Yunkai Zhang (CGEL ZTE) <zhang.yunkai@zte.com.cn>
> Maybe you could resubmit this one alone?
>
Okay, done.
>
> Thank you,
> Nicolas