* [PATCH v3 0/3] Namespaceify two sysctls related with route
@ 2022-08-30  9:14 cgel.zte
  2022-08-30  9:16 ` [PATCH v3 1/3] ipv4: Namespaceify route/error_cost knob cgel.zte
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: cgel.zte @ 2022-08-30  9:14 UTC (permalink / raw)
  To: davem, kuba, yoshfuji, dsahern; +Cc: netdev, linux-kernel, xu.xin16

From: xu xin <xu.xin16@zte.com.cn>

With the rise of cloud-native computing, more and more container
applications are deployed, and the network namespace is one of the
foundations of a container. The error_cost and error_burst sysctls are
the knobs that control how frequently ICMP_DEST_UNREACH packets are sent
for IPv4. When different containers have different tuning requirements
for error_cost and error_burst, the sysctls should, for the host's
security, exist per network namespace.

Enable both sysctls to be configured per netns.
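
As a rough illustration of what "per netns" means here, the following
userspace sketch models each namespace as carrying its own copy of the two
knobs, initialised to the defaults used later in this series. The names
netns_model, netns_model_init and MODEL_HZ are hypothetical and exist only
for this example; they are not kernel identifiers.

```c
/* Assumed tick rate for this model; the kernel's HZ is a build-time constant. */
#define MODEL_HZ 1000

/* Hypothetical stand-in for the two fields the series adds to struct netns_ipv4. */
struct netns_model {
	int ip_rt_error_cost;
	int ip_rt_error_burst;
};

/* Like netns_ip_rt_init() in the patches, every new namespace starts from the defaults. */
static void netns_model_init(struct netns_model *ns)
{
	ns->ip_rt_error_cost = MODEL_HZ;
	ns->ip_rt_error_burst = 5 * MODEL_HZ;
}
```

In this model, changing ip_rt_error_cost in one netns_model instance leaves
every other instance untouched, which is the isolation the series is after.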

xu xin (3):
  ipv4: Namespaceify route/error_cost knob
  ipv4: Namespaceify route/error_burst knob
  ipv4: add documentation of two sysctls about icmp

 Documentation/networking/ip-sysctl.rst | 17 ++++++++++++
 include/net/netns/ipv4.h               |  2 ++
 net/ipv4/route.c                       | 36 ++++++++++++++------------
 3 files changed, 39 insertions(+), 16 deletions(-)

-- 
2.25.1



* [PATCH v3 1/3] ipv4: Namespaceify route/error_cost knob
  2022-08-30  9:14 [PATCH v3 0/3] Namespaceify two sysctls related with route cgel.zte
@ 2022-08-30  9:16 ` cgel.zte
  2022-08-30  9:16 ` [PATCH v3 2/3] ipv4: Namespaceify route/error_burst knob cgel.zte
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: cgel.zte @ 2022-08-30  9:16 UTC (permalink / raw)
  To: davem, kuba, yoshfuji, dsahern
  Cc: netdev, linux-kernel, xu.xin16, Yunkai Zhang

From: xu xin <xu.xin16@zte.com.cn>

Different network namespaces have different requirements for the
error_cost sysctl, which together with error_burst limits the maximum
frequency of sending ICMP_DEST_UNREACH packets. Put simply, error_cost
is the minimum time interval between two consecutive ICMP_DEST_UNREACH
packets sent to the same peer in the steady state, as opposed to the
burst that is allowed after a long quiet period.

Enable error_cost to be configured per network namespace.

Signed-off-by: xu xin (CGEL ZTE) <xu.xin16@zte.com.cn>
Reviewed-by: Yunkai Zhang (CGEL ZTE) <zhang.yunkai@zte.com.cn>
---
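The rate limit described above behaves like a token bucket. Below is a
minimal userspace sketch of that logic, modelled on the rate_tokens and
rate_last handling in ip_error(); peer_model and icmp_error_allowed are
hypothetical names for this example, ticks stand in for jiffies, and both
knobs are assumed to be positive.

```c
#include <stdbool.h>

/* Hypothetical per-peer state, modelled on the fields used in ip_error(). */
struct peer_model {
	unsigned long rate_tokens; /* accumulated credit, in ticks */
	unsigned long rate_last;   /* tick of the previous check */
};

/*
 * Returns true if an ICMP error may be sent to this peer at tick `now`.
 * error_cost and error_burst are assumed to be positive tick counts.
 */
static bool icmp_error_allowed(struct peer_model *peer, unsigned long now,
			       unsigned long error_cost,
			       unsigned long error_burst)
{
	/* Earn one token per elapsed tick, capped at error_burst. */
	peer->rate_tokens += now - peer->rate_last;
	if (peer->rate_tokens > error_burst)
		peer->rate_tokens = error_burst;
	peer->rate_last = now;

	/* Not enough credit: suppress this packet. */
	if (peer->rate_tokens < error_cost)
		return false;

	/* Pay for this packet. */
	peer->rate_tokens -= error_cost;
	return true;
}
```

Once the saved-up credit is spent, a peer earns enough for one more packet
only every error_cost ticks, which is the steady-state interval the commit
message refers to.
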
 include/net/netns/ipv4.h |  1 +
 net/ipv4/route.c         | 18 ++++++++++--------
 2 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
index c7320ef356d9..319395bbad3c 100644
--- a/include/net/netns/ipv4.h
+++ b/include/net/netns/ipv4.h
@@ -85,6 +85,7 @@ struct netns_ipv4 {
 	u32 ip_rt_min_pmtu;
 	int ip_rt_mtu_expires;
 	int ip_rt_min_advmss;
+	int ip_rt_error_cost;
 
 	struct local_ports ip_local_ports;
 
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 795cbe1de912..209539c201c2 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -118,7 +118,6 @@ static int ip_rt_max_size;
 static int ip_rt_redirect_number __read_mostly	= 9;
 static int ip_rt_redirect_load __read_mostly	= HZ / 50;
 static int ip_rt_redirect_silence __read_mostly	= ((HZ / 50) << (9 + 1));
-static int ip_rt_error_cost __read_mostly	= HZ;
 static int ip_rt_error_burst __read_mostly	= 5 * HZ;
 
 static int ip_rt_gc_timeout __read_mostly	= RT_GC_TIMEOUT;
@@ -1000,6 +999,8 @@ static int ip_error(struct sk_buff *skb)
 
 	send = true;
 	if (peer) {
+		int ip_rt_error_cost = READ_ONCE(net->ipv4.ip_rt_error_cost);
+
 		now = jiffies;
 		peer->rate_tokens += now - peer->rate_last;
 		if (peer->rate_tokens > ip_rt_error_burst)
@@ -3535,13 +3536,6 @@ static struct ctl_table ipv4_route_table[] = {
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec,
 	},
-	{
-		.procname	= "error_cost",
-		.data		= &ip_rt_error_cost,
-		.maxlen		= sizeof(int),
-		.mode		= 0644,
-		.proc_handler	= proc_dointvec,
-	},
 	{
 		.procname	= "error_burst",
 		.data		= &ip_rt_error_burst,
@@ -3590,6 +3584,13 @@ static struct ctl_table ipv4_route_netns_table[] = {
 		.mode       = 0644,
 		.proc_handler   = proc_dointvec,
 	},
+	{
+		.procname   = "error_cost",
+		.data       = &init_net.ipv4.ip_rt_error_cost,
+		.maxlen     = sizeof(int),
+		.mode       = 0644,
+		.proc_handler   = proc_dointvec,
+	},
 	{ },
 };
 
@@ -3653,6 +3654,7 @@ static __net_init int netns_ip_rt_init(struct net *net)
 	net->ipv4.ip_rt_min_pmtu = DEFAULT_MIN_PMTU;
 	net->ipv4.ip_rt_mtu_expires = DEFAULT_MTU_EXPIRES;
 	net->ipv4.ip_rt_min_advmss = DEFAULT_MIN_ADVMSS;
+	net->ipv4.ip_rt_error_cost = HZ;
 	return 0;
 }
 
-- 
2.25.1



* [PATCH v3 2/3] ipv4: Namespaceify route/error_burst knob
  2022-08-30  9:14 [PATCH v3 0/3] Namespaceify two sysctls related with route cgel.zte
  2022-08-30  9:16 ` [PATCH v3 1/3] ipv4: Namespaceify route/error_cost knob cgel.zte
@ 2022-08-30  9:16 ` cgel.zte
  2022-08-30  9:17 ` [PATCH v3 3/3] ipv4: add documentation of two sysctls about icmp cgel.zte
  2022-08-31 19:58 ` [PATCH v3 0/3] Namespaceify two sysctls related with route Jakub Kicinski
  3 siblings, 0 replies; 7+ messages in thread
From: cgel.zte @ 2022-08-30  9:16 UTC (permalink / raw)
  To: davem, kuba, yoshfuji, dsahern
  Cc: netdev, linux-kernel, xu.xin16, Yunkai Zhang

From: xu xin <xu.xin16@zte.com.cn>

Different network namespaces have different requirements for the
error_burst sysctl, which together with error_cost limits the frequency
of sending ICMP_DEST_UNREACH packets. Put simply, the larger the ratio
of error_burst to error_cost, the more ICMP_DEST_UNREACH packets may be
sent in a burst after a long quiet period (one in which no
dest-unreachable ICMP packets were sent).

Enable error_burst to be configured per network namespace.

Signed-off-by: xu xin (CGEL ZTE) <xu.xin16@zte.com.cn>
Reviewed-by: Yunkai Zhang (CGEL ZTE) <zhang.yunkai@zte.com.cn>
---
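The ratio mentioned above can be made concrete with a small calculation.
This sketch assumes the token-bucket behaviour of ip_error(), where the
saved-up credit is capped at error_burst and each packet spends error_cost;
max_burst_packets is a hypothetical helper, not a kernel function.

```c
/*
 * Upper bound on back-to-back ICMP errors to one peer after a long quiet
 * period. Non-positive costs and negative bursts are out of scope for this
 * sketch and yield 0.
 */
static long max_burst_packets(long error_burst, long error_cost)
{
	if (error_cost <= 0 || error_burst < 0)
		return 0;
	return error_burst / error_cost;
}
```

With the defaults in this series (error_burst = 5 * HZ, error_cost = HZ) the
result is five packets, whatever value HZ has.
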
 include/net/netns/ipv4.h |  1 +
 net/ipv4/route.c         | 18 ++++++++++--------
 2 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
index 319395bbad3c..03d16cf32508 100644
--- a/include/net/netns/ipv4.h
+++ b/include/net/netns/ipv4.h
@@ -86,6 +86,7 @@ struct netns_ipv4 {
 	int ip_rt_mtu_expires;
 	int ip_rt_min_advmss;
 	int ip_rt_error_cost;
+	int ip_rt_error_burst;
 
 	struct local_ports ip_local_ports;
 
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 209539c201c2..4745a4085de5 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -114,11 +114,11 @@
 #define DEFAULT_MIN_PMTU (512 + 20 + 20)
 #define DEFAULT_MTU_EXPIRES (10 * 60 * HZ)
 #define DEFAULT_MIN_ADVMSS 256
+#define DEFAULT_ERROR_BURST	(5 * HZ)
 static int ip_rt_max_size;
 static int ip_rt_redirect_number __read_mostly	= 9;
 static int ip_rt_redirect_load __read_mostly	= HZ / 50;
 static int ip_rt_redirect_silence __read_mostly	= ((HZ / 50) << (9 + 1));
-static int ip_rt_error_burst __read_mostly	= 5 * HZ;
 
 static int ip_rt_gc_timeout __read_mostly	= RT_GC_TIMEOUT;
 
@@ -1000,6 +1000,7 @@ static int ip_error(struct sk_buff *skb)
 	send = true;
 	if (peer) {
 		int ip_rt_error_cost = READ_ONCE(net->ipv4.ip_rt_error_cost);
+		int ip_rt_error_burst = READ_ONCE(net->ipv4.ip_rt_error_burst);
 
 		now = jiffies;
 		peer->rate_tokens += now - peer->rate_last;
@@ -3536,13 +3537,6 @@ static struct ctl_table ipv4_route_table[] = {
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec,
 	},
-	{
-		.procname	= "error_burst",
-		.data		= &ip_rt_error_burst,
-		.maxlen		= sizeof(int),
-		.mode		= 0644,
-		.proc_handler	= proc_dointvec,
-	},
 	{
 		.procname	= "gc_elasticity",
 		.data		= &ip_rt_gc_elasticity,
@@ -3591,6 +3585,13 @@ static struct ctl_table ipv4_route_netns_table[] = {
 		.mode       = 0644,
 		.proc_handler   = proc_dointvec,
 	},
+	{
+		.procname   = "error_burst",
+		.data       = &init_net.ipv4.ip_rt_error_burst,
+		.maxlen     = sizeof(int),
+		.mode       = 0644,
+		.proc_handler   = proc_dointvec,
+	},
 	{ },
 };
 
@@ -3655,6 +3656,7 @@ static __net_init int netns_ip_rt_init(struct net *net)
 	net->ipv4.ip_rt_mtu_expires = DEFAULT_MTU_EXPIRES;
 	net->ipv4.ip_rt_min_advmss = DEFAULT_MIN_ADVMSS;
 	net->ipv4.ip_rt_error_cost = HZ;
+	net->ipv4.ip_rt_error_burst = DEFAULT_ERROR_BURST;
 	return 0;
 }
 
-- 
2.25.1



* [PATCH v3 3/3] ipv4: add documentation of two sysctls about icmp
  2022-08-30  9:14 [PATCH v3 0/3] Namespaceify two sysctls related with route cgel.zte
  2022-08-30  9:16 ` [PATCH v3 1/3] ipv4: Namespaceify route/error_cost knob cgel.zte
  2022-08-30  9:16 ` [PATCH v3 2/3] ipv4: Namespaceify route/error_burst knob cgel.zte
@ 2022-08-30  9:17 ` cgel.zte
  2022-09-02 10:07   ` Nicolas Dichtel
  2022-08-31 19:58 ` [PATCH v3 0/3] Namespaceify two sysctls related with route Jakub Kicinski
  3 siblings, 1 reply; 7+ messages in thread
From: cgel.zte @ 2022-08-30  9:17 UTC (permalink / raw)
  To: davem, kuba, yoshfuji, dsahern
  Cc: netdev, linux-kernel, xu.xin16, Yunkai Zhang

From: xu xin <xu.xin16@zte.com.cn>

Add the descriptions of the sysctls of error_cost and error_burst in
Documentation/networking/ip-sysctl.rst.

Signed-off-by: xu xin (CGEL ZTE) <xu.xin16@zte.com.cn>
Reviewed-by: Yunkai Zhang (CGEL ZTE) <zhang.yunkai@zte.com.cn>
---
 Documentation/networking/ip-sysctl.rst | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
index 56cd4ea059b2..c113a34a4115 100644
--- a/Documentation/networking/ip-sysctl.rst
+++ b/Documentation/networking/ip-sysctl.rst
@@ -156,6 +156,23 @@ route/max_size - INTEGER
 	From linux kernel 3.6 onwards, this is deprecated for ipv4
 	as route cache is no longer used.
 
+route/error_cost - INTEGER
+	The minimum time interval between two consecutive ICMP
+	DEST-UNREACHABLE packets sent to the same peer in the steady
+	state. The higher the value, the lower the overall rate of
+	ICMP DEST-UNREACHABLE packets.
+
+	Default: HZ (one second)
+
+route/error_burst - INTEGER
+	Together with error_cost, controls the maximum number of ICMP
+	DEST-UNREACHABLE packets that can be sent in a burst after a
+	long quiet period (no ICMP DEST-UNREACHABLE sent). The higher
+	the ratio of error_burst to error_cost, the more packets are
+	allowed in such a burst.
+
+	Default: 5 * HZ
+
 neigh/default/gc_thresh1 - INTEGER
 	Minimum number of entries to keep.  Garbage collector will not
 	purge entries if there are fewer than this number.
-- 
2.25.1



* Re: [PATCH v3 0/3] Namespaceify two sysctls related with route
  2022-08-30  9:14 [PATCH v3 0/3] Namespaceify two sysctls related with route cgel.zte
                   ` (2 preceding siblings ...)
  2022-08-30  9:17 ` [PATCH v3 3/3] ipv4: add documentation of two sysctls about icmp cgel.zte
@ 2022-08-31 19:58 ` Jakub Kicinski
  3 siblings, 0 replies; 7+ messages in thread
From: Jakub Kicinski @ 2022-08-31 19:58 UTC (permalink / raw)
  To: cgel.zte; +Cc: davem, yoshfuji, dsahern, netdev, linux-kernel, xu.xin16

On Tue, 30 Aug 2022 09:14:53 +0000 cgel.zte@gmail.com wrote:
> With the rise of cloud native, more and more container applications are
> deployed. The network namespace is one of the foundations of the container.
> The sysctls of error_cost and error_burst are important knobs to control
> the sending frequency of ICMP_DEST_UNREACH packet for ipv4. When different
> containers has requirements on the tuning of error_cost and error_burst,
> for host's security, the sysctls should exist per network namespace.
> 
> Different netns has different requirements on the setting of error_cost
> and error_burst, which are related with limiting the frequency of sending
> ICMP_DEST_UNREACH packets. Enable them to be configured per netns.

One last time, if v6 doesn't need it, neither should v4.

Seems like you're just trying to check a box.

I'm dropping these patches from patchwork, please don't repost them
again, unless someone from the community voices support for merging
them.


* Re: [PATCH v3 3/3] ipv4: add documentation of two sysctls about icmp
  2022-08-30  9:17 ` [PATCH v3 3/3] ipv4: add documentation of two sysctls about icmp cgel.zte
@ 2022-09-02 10:07   ` Nicolas Dichtel
  2022-09-11  9:03     ` CGEL
  0 siblings, 1 reply; 7+ messages in thread
From: Nicolas Dichtel @ 2022-09-02 10:07 UTC (permalink / raw)
  To: cgel.zte, davem, kuba, yoshfuji, dsahern
  Cc: netdev, linux-kernel, xu.xin16, Yunkai Zhang


On 30/08/2022 at 11:17, cgel.zte@gmail.com wrote:
> From: xu xin <xu.xin16@zte.com.cn>
> 
> Add the descriptions of the sysctls of error_cost and error_burst in
> Documentation/networking/ip-sysctl.rst.
> 
> Signed-off-by: xu xin (CGEL ZTE) <xu.xin16@zte.com.cn>
> Reviewed-by: Yunkai Zhang (CGEL ZTE) <zhang.yunkai@zte.com.cn>
Maybe you could resubmit this one alone?


Thank you,
Nicolas


* Re: [PATCH v3 3/3] ipv4: add documentation of two sysctls about icmp
  2022-09-02 10:07   ` Nicolas Dichtel
@ 2022-09-11  9:03     ` CGEL
  0 siblings, 0 replies; 7+ messages in thread
From: CGEL @ 2022-09-11  9:03 UTC (permalink / raw)
  To: Nicolas Dichtel
  Cc: davem, kuba, yoshfuji, dsahern, netdev, linux-kernel, xu.xin16,
	Yunkai Zhang

On Fri, Sep 02, 2022 at 12:07:11PM +0200, Nicolas Dichtel wrote:
> 
> On 30/08/2022 at 11:17, cgel.zte@gmail.com wrote:
> > From: xu xin <xu.xin16@zte.com.cn>
> > 
> > Add the descriptions of the sysctls of error_cost and error_burst in
> > Documentation/networking/ip-sysctl.rst.
> > 
> > Signed-off-by: xu xin (CGEL ZTE) <xu.xin16@zte.com.cn>
> > Reviewed-by: Yunkai Zhang (CGEL ZTE) <zhang.yunkai@zte.com.cn>
> Maybe you could resubmit this one alone?
> 

Okay, done.

> 
> Thank you,
> Nicolas

