netdev.vger.kernel.org archive mirror
* [PATCH v3 0/3] Namespaceify two sysctls related with route
@ 2022-08-30  9:14 cgel.zte
  2022-08-30  9:16 ` [PATCH v3 1/3] ipv4: Namespaceify route/error_cost knob cgel.zte
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: cgel.zte @ 2022-08-30  9:14 UTC
  To: davem, kuba, yoshfuji, dsahern; +Cc: netdev, linux-kernel, xu.xin16

From: xu xin <xu.xin16@zte.com.cn>

With the rise of cloud-native computing, more and more containerized
applications are being deployed, and the network namespace is one of the
foundations of a container. The error_cost and error_burst sysctls are
important knobs that control how often ICMP_DEST_UNREACH packets are sent
for IPv4. When different containers have their own tuning requirements for
error_cost and error_burst, the sysctls should exist per network namespace
for the sake of the host's security.

Different network namespaces have different requirements for error_cost
and error_burst, which together limit how often ICMP_DEST_UNREACH packets
are sent. Make both configurable per netns.
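
For readers unfamiliar with how a per-netns knob is wired up, here is a
minimal userspace sketch of the idea. It assumes (this thread does not show
it) that the per-namespace registration code duplicates the template table
and rebases each .data pointer from init_net onto the new namespace.
struct ctl_entry, template_table and table_for_net() are hypothetical names,
and the struct layouts are heavily trimmed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Trimmed stand-ins for the kernel structures touched by this series. */
struct netns_ipv4 {
	int ip_rt_error_cost;
	int ip_rt_error_burst;
};

struct net {
	struct netns_ipv4 ipv4;
};

/* Hypothetical, simplified stand-in for struct ctl_table. */
struct ctl_entry {
	const char *procname;
	void *data;
};

/* Example values only; the patches initialise these to HZ and 5 * HZ. */
static struct net init_net = { .ipv4 = { 1000, 5000 } };

/* Template entries point into init_net, as in ipv4_route_netns_table. */
static struct ctl_entry template_table[] = {
	{ "error_cost",  &init_net.ipv4.ip_rt_error_cost },
	{ "error_burst", &init_net.ipv4.ip_rt_error_burst },
};

#define N_ENTRIES (sizeof(template_table) / sizeof(template_table[0]))

/*
 * Copy the template and shift every .data pointer by the distance between
 * init_net and the target namespace (assumed mechanism, see lead-in).
 * The caller owns the returned table and must free() it.
 */
static struct ctl_entry *table_for_net(struct net *net)
{
	struct ctl_entry *tbl;
	size_t i;

	assert(net != NULL);
	tbl = malloc(sizeof(template_table));
	if (!tbl)
		return NULL;
	memcpy(tbl, template_table, sizeof(template_table));
	for (i = 0; i < N_ENTRIES; i++) {
		size_t off = (size_t)((char *)template_table[i].data -
				      (char *)&init_net);

		tbl[i].data = (char *)net + off;
	}
	return tbl;
}
```

In this model, a write through the copied table only changes the namespace
it was built for, and init_net keeps its own values.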

xu xin (3):
  ipv4: Namespaceify route/error_cost knob
  ipv4: Namespaceify route/error_burst knob
  ipv4: add documentation of two sysctls about icmp

 Documentation/networking/ip-sysctl.rst | 17 ++++++++++++
 include/net/netns/ipv4.h               |  2 ++
 net/ipv4/route.c                       | 36 ++++++++++++++------------
 3 files changed, 39 insertions(+), 16 deletions(-)

-- 
2.25.1



* [PATCH v3 1/3] ipv4: Namespaceify route/error_cost knob
  2022-08-30  9:14 [PATCH v3 0/3] Namespaceify two sysctls related with route cgel.zte
@ 2022-08-30  9:16 ` cgel.zte
  2022-08-30  9:16 ` [PATCH v3 2/3] ipv4: Namespaceify route/error_burst knob cgel.zte
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: cgel.zte @ 2022-08-30  9:16 UTC
  To: davem, kuba, yoshfuji, dsahern
  Cc: netdev, linux-kernel, xu.xin16, Yunkai Zhang

From: xu xin <xu.xin16@zte.com.cn>

Different network namespaces have different requirements for the
error_cost sysctl, which together with error_burst limits the maximum
frequency at which ICMP_DEST_UNREACH packets are sent. Put simply,
error_cost is the minimum interval between two consecutive
ICMP_DEST_UNREACH packets sent to the same peer in the steady state, as
opposed to the burst that is allowed after a long quiet period.

Enable error_cost to be configured per network namespace.
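
The steady-state behaviour described above can be sketched in userspace C.
This is a simplified model based on the ip_error() lines visible in the
diff below plus the usual token-bucket completion; it is not the kernel
code itself. struct peer_state, may_send_unreach() and count_sent() are
hypothetical names, and HZ = 1000 is an assumed tick rate.

```c
#include <assert.h>
#include <stdbool.h>

#define HZ 1000	/* assumed tick rate, for illustration only */

/* Hypothetical stand-in for the per-peer rate-limit state. */
struct peer_state {
	unsigned long rate_tokens;	/* tokens accumulated, in ticks */
	unsigned long rate_last;	/* time of the previous check */
};

/*
 * Token bucket: one token is earned per tick, the bucket is capped at
 * error_burst, and each ICMP_DEST_UNREACH reply costs error_cost tokens.
 */
static bool may_send_unreach(struct peer_state *p, unsigned long now,
			     unsigned long error_cost,
			     unsigned long error_burst)
{
	p->rate_tokens += now - p->rate_last;
	if (p->rate_tokens > error_burst)
		p->rate_tokens = error_burst;
	p->rate_last = now;
	if (p->rate_tokens >= error_cost) {
		p->rate_tokens -= error_cost;
		return true;
	}
	return false;
}

/*
 * Count the replies sent when a triggering packet arrives every 'step'
 * ticks up to and including 'duration', starting from an empty bucket.
 */
static int count_sent(unsigned long step, unsigned long duration,
		      unsigned long error_cost, unsigned long error_burst)
{
	struct peer_state p = { 0, 0 };
	unsigned long now;
	int sent = 0;

	assert(step > 0);
	for (now = step; now <= duration; now += step)
		if (may_send_unreach(&p, now, error_cost, error_burst))
			sent++;
	return sent;
}
```

With the defaults and one triggering packet every 100 ms for ten seconds,
count_sent(HZ / 10, 10 * HZ, HZ, 5 * HZ) yields 10 in this model, which is
one reply per error_cost interval.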

Signed-off-by: xu xin (CGEL ZTE) <xu.xin16@zte.com.cn>
Reviewed-by: Yunkai Zhang (CGEL ZTE) <zhang.yunkai@zte.com.cn>
---
 include/net/netns/ipv4.h |  1 +
 net/ipv4/route.c         | 18 ++++++++++--------
 2 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
index c7320ef356d9..319395bbad3c 100644
--- a/include/net/netns/ipv4.h
+++ b/include/net/netns/ipv4.h
@@ -85,6 +85,7 @@ struct netns_ipv4 {
 	u32 ip_rt_min_pmtu;
 	int ip_rt_mtu_expires;
 	int ip_rt_min_advmss;
+	int ip_rt_error_cost;
 
 	struct local_ports ip_local_ports;
 
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 795cbe1de912..209539c201c2 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -118,7 +118,6 @@ static int ip_rt_max_size;
 static int ip_rt_redirect_number __read_mostly	= 9;
 static int ip_rt_redirect_load __read_mostly	= HZ / 50;
 static int ip_rt_redirect_silence __read_mostly	= ((HZ / 50) << (9 + 1));
-static int ip_rt_error_cost __read_mostly	= HZ;
 static int ip_rt_error_burst __read_mostly	= 5 * HZ;
 
 static int ip_rt_gc_timeout __read_mostly	= RT_GC_TIMEOUT;
@@ -1000,6 +999,8 @@ static int ip_error(struct sk_buff *skb)
 
 	send = true;
 	if (peer) {
+		int ip_rt_error_cost = READ_ONCE(net->ipv4.ip_rt_error_cost);
+
 		now = jiffies;
 		peer->rate_tokens += now - peer->rate_last;
 		if (peer->rate_tokens > ip_rt_error_burst)
@@ -3535,13 +3536,6 @@ static struct ctl_table ipv4_route_table[] = {
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec,
 	},
-	{
-		.procname	= "error_cost",
-		.data		= &ip_rt_error_cost,
-		.maxlen		= sizeof(int),
-		.mode		= 0644,
-		.proc_handler	= proc_dointvec,
-	},
 	{
 		.procname	= "error_burst",
 		.data		= &ip_rt_error_burst,
@@ -3590,6 +3584,13 @@ static struct ctl_table ipv4_route_netns_table[] = {
 		.mode       = 0644,
 		.proc_handler   = proc_dointvec,
 	},
+	{
+		.procname   = "error_cost",
+		.data       = &init_net.ipv4.ip_rt_error_cost,
+		.maxlen     = sizeof(int),
+		.mode       = 0644,
+		.proc_handler   = proc_dointvec,
+	},
 	{ },
 };
 
@@ -3653,6 +3654,7 @@ static __net_init int netns_ip_rt_init(struct net *net)
 	net->ipv4.ip_rt_min_pmtu = DEFAULT_MIN_PMTU;
 	net->ipv4.ip_rt_mtu_expires = DEFAULT_MTU_EXPIRES;
 	net->ipv4.ip_rt_min_advmss = DEFAULT_MIN_ADVMSS;
+	net->ipv4.ip_rt_error_cost = HZ;
 	return 0;
 }
 
-- 
2.25.1



* [PATCH v3 2/3] ipv4: Namespaceify route/error_burst knob
  2022-08-30  9:14 [PATCH v3 0/3] Namespaceify two sysctls related with route cgel.zte
  2022-08-30  9:16 ` [PATCH v3 1/3] ipv4: Namespaceify route/error_cost knob cgel.zte
@ 2022-08-30  9:16 ` cgel.zte
  2022-08-30  9:17 ` [PATCH v3 3/3] ipv4: add documentation of two sysctls about icmp cgel.zte
  2022-08-31 19:58 ` [PATCH v3 0/3] Namespaceify two sysctls related with route Jakub Kicinski
  3 siblings, 0 replies; 7+ messages in thread
From: cgel.zte @ 2022-08-30  9:16 UTC
  To: davem, kuba, yoshfuji, dsahern
  Cc: netdev, linux-kernel, xu.xin16, Yunkai Zhang

From: xu xin <xu.xin16@zte.com.cn>

Different network namespaces have different requirements for the
error_burst sysctl, which together with error_cost limits the frequency
at which ICMP_DEST_UNREACH packets are sent. Put simply, the larger the
ratio of error_burst to error_cost, the more ICMP_DEST_UNREACH packets
may be sent in a burst after a long quiet period (one in which no
destination-unreachable ICMP packets were sent).

Enable error_burst to be configured per network namespace.
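
As a worked example of that ratio, the following userspace sketch counts
how many replies can go out back to back once a peer has been idle. It
assumes the token-bucket model in which tokens accrue one per tick and are
capped at error_burst; burst_after_quiet() is a hypothetical helper and
HZ = 1000 is an assumed tick rate.

```c
#include <assert.h>

#define HZ 1000	/* assumed tick rate, for illustration only */

/*
 * Number of ICMP_DEST_UNREACH replies that can be sent at the same
 * instant after 'idle' ticks of silence: the bucket fills up to at most
 * error_burst and each reply drains error_cost from it.
 */
static unsigned long burst_after_quiet(unsigned long error_cost,
				       unsigned long error_burst,
				       unsigned long idle)
{
	unsigned long tokens = idle > error_burst ? error_burst : idle;
	unsigned long sent = 0;

	/* A zero cost would never drain the bucket in this sketch. */
	assert(error_cost > 0);
	while (tokens >= error_cost) {
		tokens -= error_cost;
		sent++;
	}
	return sent;
}
```

With the defaults (error_cost = HZ, error_burst = 5 * HZ) and a minute of
silence, this gives a burst of 5 replies; raising error_burst to 10 * HZ
gives 10, and raising error_cost to 2 * HZ with the default burst gives 2.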

Signed-off-by: xu xin (CGEL ZTE) <xu.xin16@zte.com.cn>
Reviewed-by: Yunkai Zhang (CGEL ZTE) <zhang.yunkai@zte.com.cn>
---
 include/net/netns/ipv4.h |  1 +
 net/ipv4/route.c         | 18 ++++++++++--------
 2 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
index 319395bbad3c..03d16cf32508 100644
--- a/include/net/netns/ipv4.h
+++ b/include/net/netns/ipv4.h
@@ -86,6 +86,7 @@ struct netns_ipv4 {
 	int ip_rt_mtu_expires;
 	int ip_rt_min_advmss;
 	int ip_rt_error_cost;
+	int ip_rt_error_burst;
 
 	struct local_ports ip_local_ports;
 
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 209539c201c2..4745a4085de5 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -114,11 +114,11 @@
 #define DEFAULT_MIN_PMTU (512 + 20 + 20)
 #define DEFAULT_MTU_EXPIRES (10 * 60 * HZ)
 #define DEFAULT_MIN_ADVMSS 256
+#define DEFAULT_ERROR_BURST	(5 * HZ)
 static int ip_rt_max_size;
 static int ip_rt_redirect_number __read_mostly	= 9;
 static int ip_rt_redirect_load __read_mostly	= HZ / 50;
 static int ip_rt_redirect_silence __read_mostly	= ((HZ / 50) << (9 + 1));
-static int ip_rt_error_burst __read_mostly	= 5 * HZ;
 
 static int ip_rt_gc_timeout __read_mostly	= RT_GC_TIMEOUT;
 
@@ -1000,6 +1000,7 @@ static int ip_error(struct sk_buff *skb)
 	send = true;
 	if (peer) {
 		int ip_rt_error_cost = READ_ONCE(net->ipv4.ip_rt_error_cost);
+		int ip_rt_error_burst = READ_ONCE(net->ipv4.ip_rt_error_burst);
 
 		now = jiffies;
 		peer->rate_tokens += now - peer->rate_last;
@@ -3536,13 +3537,6 @@ static struct ctl_table ipv4_route_table[] = {
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec,
 	},
-	{
-		.procname	= "error_burst",
-		.data		= &ip_rt_error_burst,
-		.maxlen		= sizeof(int),
-		.mode		= 0644,
-		.proc_handler	= proc_dointvec,
-	},
 	{
 		.procname	= "gc_elasticity",
 		.data		= &ip_rt_gc_elasticity,
@@ -3591,6 +3585,13 @@ static struct ctl_table ipv4_route_netns_table[] = {
 		.mode       = 0644,
 		.proc_handler   = proc_dointvec,
 	},
+	{
+		.procname   = "error_burst",
+		.data       = &init_net.ipv4.ip_rt_error_burst,
+		.maxlen     = sizeof(int),
+		.mode       = 0644,
+		.proc_handler   = proc_dointvec,
+	},
 	{ },
 };
 
@@ -3655,6 +3656,7 @@ static __net_init int netns_ip_rt_init(struct net *net)
 	net->ipv4.ip_rt_mtu_expires = DEFAULT_MTU_EXPIRES;
 	net->ipv4.ip_rt_min_advmss = DEFAULT_MIN_ADVMSS;
 	net->ipv4.ip_rt_error_cost = HZ;
+	net->ipv4.ip_rt_error_burst = DEFAULT_ERROR_BURST;
 	return 0;
 }
 
-- 
2.25.1



* [PATCH v3 3/3] ipv4: add documentation of two sysctls about icmp
  2022-08-30  9:14 [PATCH v3 0/3] Namespaceify two sysctls related with route cgel.zte
  2022-08-30  9:16 ` [PATCH v3 1/3] ipv4: Namespaceify route/error_cost knob cgel.zte
  2022-08-30  9:16 ` [PATCH v3 2/3] ipv4: Namespaceify route/error_burst knob cgel.zte
@ 2022-08-30  9:17 ` cgel.zte
  2022-09-02 10:07   ` Nicolas Dichtel
  2022-08-31 19:58 ` [PATCH v3 0/3] Namespaceify two sysctls related with route Jakub Kicinski
  3 siblings, 1 reply; 7+ messages in thread
From: cgel.zte @ 2022-08-30  9:17 UTC
  To: davem, kuba, yoshfuji, dsahern
  Cc: netdev, linux-kernel, xu.xin16, Yunkai Zhang

From: xu xin <xu.xin16@zte.com.cn>

Add the descriptions of the sysctls of error_cost and error_burst in
Documentation/networking/ip-sysctl.rst.

Signed-off-by: xu xin (CGEL ZTE) <xu.xin16@zte.com.cn>
Reviewed-by: Yunkai Zhang (CGEL ZTE) <zhang.yunkai@zte.com.cn>
---
 Documentation/networking/ip-sysctl.rst | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
index 56cd4ea059b2..c113a34a4115 100644
--- a/Documentation/networking/ip-sysctl.rst
+++ b/Documentation/networking/ip-sysctl.rst
@@ -156,6 +156,23 @@ route/max_size - INTEGER
 	From linux kernel 3.6 onwards, this is deprecated for ipv4
 	as route cache is no longer used.
 
+route/error_cost - INTEGER
+	The minimum time interval between two consecutive ICMP
+	DEST-UNREACHABLE packets sent to the same peer in the steady
+	state. The higher the value, the lower the overall rate at
+	which ICMP DEST-UNREACHABLE packets are sent.
+
+	Default: HZ (one second)
+
+route/error_burst - INTEGER
+	Together with error_cost, controls the maximum number of ICMP
+	DEST-UNREACHABLE packets that can be sent in a burst after a
+	long quiet period (one with no ICMP DEST-UNREACHABLE packets
+	sent). The higher the ratio of error_burst to error_cost, the
+	larger the burst allowed after a long quiet period.
+
+	Default: 5 * HZ
+
 neigh/default/gc_thresh1 - INTEGER
 	Minimum number of entries to keep.  Garbage collector will not
 	purge entries if there are fewer than this number.
-- 
2.25.1



* Re: [PATCH v3 0/3] Namespaceify two sysctls related with route
  2022-08-30  9:14 [PATCH v3 0/3] Namespaceify two sysctls related with route cgel.zte
                   ` (2 preceding siblings ...)
  2022-08-30  9:17 ` [PATCH v3 3/3] ipv4: add documentation of two sysctls about icmp cgel.zte
@ 2022-08-31 19:58 ` Jakub Kicinski
  3 siblings, 0 replies; 7+ messages in thread
From: Jakub Kicinski @ 2022-08-31 19:58 UTC
  To: cgel.zte; +Cc: davem, yoshfuji, dsahern, netdev, linux-kernel, xu.xin16

On Tue, 30 Aug 2022 09:14:53 +0000 cgel.zte@gmail.com wrote:
> With the rise of cloud-native computing, more and more containerized
> applications are being deployed, and the network namespace is one of the
> foundations of a container. The error_cost and error_burst sysctls are
> important knobs that control how often ICMP_DEST_UNREACH packets are sent
> for IPv4. When different containers have their own tuning requirements for
> error_cost and error_burst, the sysctls should exist per network namespace
> for the sake of the host's security.
>
> Different network namespaces have different requirements for error_cost
> and error_burst, which together limit how often ICMP_DEST_UNREACH packets
> are sent. Make both configurable per netns.

One last time, if v6 doesn't need it, neither should v4.

Seems like you're just trying to check a box.

I'm dropping these patches from patchwork, please don't repost them
again, unless someone from the community voices support for merging
them.


* Re: [PATCH v3 3/3] ipv4: add documentation of two sysctls about icmp
  2022-08-30  9:17 ` [PATCH v3 3/3] ipv4: add documentation of two sysctls about icmp cgel.zte
@ 2022-09-02 10:07   ` Nicolas Dichtel
  2022-09-11  9:03     ` CGEL
  0 siblings, 1 reply; 7+ messages in thread
From: Nicolas Dichtel @ 2022-09-02 10:07 UTC
  To: cgel.zte, davem, kuba, yoshfuji, dsahern
  Cc: netdev, linux-kernel, xu.xin16, Yunkai Zhang


On 30/08/2022 at 11:17, cgel.zte@gmail.com wrote:
> From: xu xin <xu.xin16@zte.com.cn>
> 
> Add the descriptions of the sysctls of error_cost and error_burst in
> Documentation/networking/ip-sysctl.rst.
> 
> Signed-off-by: xu xin (CGEL ZTE) <xu.xin16@zte.com.cn>
> Reviewed-by: Yunkai Zhang (CGEL ZTE) <zhang.yunkai@zte.com.cn>
Maybe you could resubmit this one alone?


Thank you,
Nicolas


* Re: [PATCH v3 3/3] ipv4: add documentation of two sysctls about icmp
  2022-09-02 10:07   ` Nicolas Dichtel
@ 2022-09-11  9:03     ` CGEL
  0 siblings, 0 replies; 7+ messages in thread
From: CGEL @ 2022-09-11  9:03 UTC
  To: Nicolas Dichtel
  Cc: davem, kuba, yoshfuji, dsahern, netdev, linux-kernel, xu.xin16,
	Yunkai Zhang

On Fri, Sep 02, 2022 at 12:07:11PM +0200, Nicolas Dichtel wrote:
> 
> On 30/08/2022 at 11:17, cgel.zte@gmail.com wrote:
> > From: xu xin <xu.xin16@zte.com.cn>
> > 
> > Add the descriptions of the sysctls of error_cost and error_burst in
> > Documentation/networking/ip-sysctl.rst.
> > 
> > Signed-off-by: xu xin (CGEL ZTE) <xu.xin16@zte.com.cn>
> > Reviewed-by: Yunkai Zhang (CGEL ZTE) <zhang.yunkai@zte.com.cn>
> Maybe you could resubmit this one alone?
> 

Okay, done.

> 
> Thank you,
> Nicolas

