[net-next] net: core: Namespace-ify sysctl_wmem_default and sysctl_rmem_default
diff mbox series

Message ID 20210117102319.193756-1-dong.menglong@zte.com.cn
State New, archived
Headers show
Series
  • [net-next] net: core: Namespace-ify sysctl_wmem_default and sysctl_rmem_default
Related show

Commit Message

Menglong Dong Jan. 17, 2021, 10:23 a.m. UTC
From: Menglong Dong <dong.menglong@zte.com.cn>

For now, sysctl_wmem_default and sysctl_rmem_default are globally
unified. It's not convenient in some case. For example, when we
use docker and try to control the default udp socket receive buffer
for each container.

For that reason, make sysctl_wmem_default and sysctl_rmem_default
per-namespace.

Signed-off-by: Menglong Dong <dong.menglong@zte.com.cn>
---
 include/net/netns/core.h   |  2 ++
 include/net/sock.h         |  3 ---
 net/core/net_namespace.c   |  2 ++
 net/core/sock.c            |  6 ++----
 net/core/sysctl_net_core.c | 32 ++++++++++++++++----------------
 net/ipv4/ip_output.c       |  2 +-
 6 files changed, 23 insertions(+), 24 deletions(-)

Comments

Christian Brauner Jan. 18, 2021, 11:15 a.m. UTC | #1
On Sun, Jan 17, 2021 at 06:23:19PM +0800, menglong8.dong@gmail.com wrote:
> From: Menglong Dong <dong.menglong@zte.com.cn>
> 
> For now, sysctl_wmem_default and sysctl_rmem_default are globally
> unified. It's not convenient in some case. For example, when we
> use docker and try to control the default udp socket receive buffer
> for each container.
> 
> For that reason, make sysctl_wmem_default and sysctl_rmem_default
> per-namespace.
> 
> Signed-off-by: Menglong Dong <dong.menglong@zte.com.cn>
> ---

Hey Menglong,

I was about to review the two patches you sent:

1. [PATCH net-next] net: core: Namespace-ify sysctl_rmem_max and sysctl_wmem_max
   https://lore.kernel.org/lkml/20210117104743.217194-1-dong.menglong@zte.com.cn
2. [PATCH net-next] net: core: Namespace-ify sysctl_wmem_default and sysctl_rmem_default
   https://lore.kernel.org/lkml/20210117102319.193756-1-dong.menglong@zte.com.cn

and I had to spend some time figuring out that 2. is dependent on 1. I
first thought I got the base wrong.

I'd suggest you resend both patches as a part of a single series with a
cover letter mentioning the goal and use-case for these changes and also
pass --base=<base-commit>
when creating the patch series which makes it way easier to figure out
what to apply it to when wanting to review a series in the larger
context of a tree.

Thanks!
Christian
Jakub Kicinski Jan. 18, 2021, 9:15 p.m. UTC | #2
On Mon, 18 Jan 2021 12:15:18 +0100 Christian Brauner wrote:
> On Sun, Jan 17, 2021 at 06:23:19PM +0800, menglong8.dong@gmail.com wrote:
> > From: Menglong Dong <dong.menglong@zte.com.cn>
> > 
> > For now, sysctl_wmem_default and sysctl_rmem_default are globally
> > unified. It's not convenient in some case. For example, when we
> > use docker and try to control the default udp socket receive buffer
> > for each container.
> > 
> > For that reason, make sysctl_wmem_default and sysctl_rmem_default
> > per-namespace.
> > 
> > Signed-off-by: Menglong Dong <dong.menglong@zte.com.cn>
> > ---  
> 
> Hey Menglong,
> 
> I was about to review the two patches you sent:
> 
> 1. [PATCH net-next] net: core: Namespace-ify sysctl_rmem_max and sysctl_wmem_max
>    https://lore.kernel.org/lkml/20210117104743.217194-1-dong.menglong@zte.com.cn
> 2. [PATCH net-next] net: core: Namespace-ify sysctl_wmem_default and sysctl_rmem_default
>    https://lore.kernel.org/lkml/20210117102319.193756-1-dong.menglong@zte.com.cn

And perhaps

  0. [PATCH net-next] net: core: init every ctl_table in netns_core_table

? 

I'm dropping these three from patchwork please follow Christian
suggestions on how to repost properly, thanks!

> and I had to spend some time figuring out that 2. is dependent on 1. I
> first thought I got the base wrong.
> 
> I'd suggest you resend both patches as a part of a single series with a
> cover letter mentioning the goal and use-case for these changes and also
> pass --base=<base-commit>
> when creating the patch series which makes it way easier to figure out
> what to apply it to when wanting to review a series in the larger
> context of a tree.
Jakub Kicinski Jan. 18, 2021, 9:29 p.m. UTC | #3
On Mon, 18 Jan 2021 13:15:28 -0800 Jakub Kicinski wrote:
> On Mon, 18 Jan 2021 12:15:18 +0100 Christian Brauner wrote:
> > On Sun, Jan 17, 2021 at 06:23:19PM +0800, menglong8.dong@gmail.com wrote:  
> > > From: Menglong Dong <dong.menglong@zte.com.cn>
> > > 
> > > For now, sysctl_wmem_default and sysctl_rmem_default are globally
> > > unified. It's not convenient in some case. For example, when we
> > > use docker and try to control the default udp socket receive buffer
> > > for each container.
> > > 
> > > For that reason, make sysctl_wmem_default and sysctl_rmem_default
> > > per-namespace.
> > > 
> > > Signed-off-by: Menglong Dong <dong.menglong@zte.com.cn>
> > > ---    
> > 
> > Hey Menglong,
> > 
> > I was about to review the two patches you sent:
> > 
> > 1. [PATCH net-next] net: core: Namespace-ify sysctl_rmem_max and sysctl_wmem_max
> >    https://lore.kernel.org/lkml/20210117104743.217194-1-dong.menglong@zte.com.cn
> > 2. [PATCH net-next] net: core: Namespace-ify sysctl_wmem_default and sysctl_rmem_default
> >    https://lore.kernel.org/lkml/20210117102319.193756-1-dong.menglong@zte.com.cn  
> 
> And perhaps
> 
>   0. [PATCH net-next] net: core: init every ctl_table in netns_core_table
> 
> ? 
> 
> I'm dropping these three from patchwork please follow Christian
> suggestions on how to repost properly, thanks!

Ah, you already did.

Patch
diff mbox series

diff --git a/include/net/netns/core.h b/include/net/netns/core.h
index 36c2d998a43c..317b47df6d08 100644
--- a/include/net/netns/core.h
+++ b/include/net/netns/core.h
@@ -9,6 +9,8 @@  struct netns_core {
 	/* core sysctls */
 	struct ctl_table_header	*sysctl_hdr;
 
+	int sysctl_wmem_default;
+	int sysctl_rmem_default;
 	int	sysctl_somaxconn;
 
 #ifdef CONFIG_PROC_FS
diff --git a/include/net/sock.h b/include/net/sock.h
index bdc4323ce53c..b846a6d24459 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -2653,9 +2653,6 @@  extern __u32 sysctl_rmem_max;
 extern int sysctl_tstamp_allow_data;
 extern int sysctl_optmem_max;
 
-extern __u32 sysctl_wmem_default;
-extern __u32 sysctl_rmem_default;
-
 DECLARE_STATIC_KEY_FALSE(net_high_order_alloc_disable_key);
 
 static inline int sk_get_wmem0(const struct sock *sk, const struct proto *proto)
diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index 2ef3b4557f40..eb4ea99131d6 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -374,6 +374,8 @@  static __net_init int setup_net(struct net *net, struct user_namespace *user_ns)
 
 static int __net_init net_defaults_init_net(struct net *net)
 {
+	net->core.sysctl_rmem_default = SK_RMEM_MAX;
+	net->core.sysctl_wmem_default = SK_WMEM_MAX;
 	net->core.sysctl_somaxconn = SOMAXCONN;
 	return 0;
 }
diff --git a/net/core/sock.c b/net/core/sock.c
index bbcd4b97eddd..2421e4ea1915 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -270,8 +270,6 @@  __u32 sysctl_wmem_max __read_mostly = SK_WMEM_MAX;
 EXPORT_SYMBOL(sysctl_wmem_max);
 __u32 sysctl_rmem_max __read_mostly = SK_RMEM_MAX;
 EXPORT_SYMBOL(sysctl_rmem_max);
-__u32 sysctl_wmem_default __read_mostly = SK_WMEM_MAX;
-__u32 sysctl_rmem_default __read_mostly = SK_RMEM_MAX;
 
 /* Maximal space eaten by iovec or ancillary data plus some space */
 int sysctl_optmem_max __read_mostly = sizeof(unsigned long)*(2*UIO_MAXIOV+512);
@@ -2970,8 +2968,8 @@  void sock_init_data(struct socket *sock, struct sock *sk)
 	timer_setup(&sk->sk_timer, NULL, 0);
 
 	sk->sk_allocation	=	GFP_KERNEL;
-	sk->sk_rcvbuf		=	sysctl_rmem_default;
-	sk->sk_sndbuf		=	sysctl_wmem_default;
+	sk->sk_rcvbuf		=	sock_net(sk)->core.sysctl_rmem_default;
+	sk->sk_sndbuf		=	sock_net(sk)->core.sysctl_wmem_default;
 	sk->sk_state		=	TCP_CLOSE;
 	sk_set_socket(sk, sock);
 
diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index 966d976dee84..5c1c75e42a09 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -326,22 +326,6 @@  static struct ctl_table net_core_table[] = {
 		.proc_handler	= proc_dointvec_minmax,
 		.extra1		= &min_rcvbuf,
 	},
-	{
-		.procname	= "wmem_default",
-		.data		= &sysctl_wmem_default,
-		.maxlen		= sizeof(int),
-		.mode		= 0644,
-		.proc_handler	= proc_dointvec_minmax,
-		.extra1		= &min_sndbuf,
-	},
-	{
-		.procname	= "rmem_default",
-		.data		= &sysctl_rmem_default,
-		.maxlen		= sizeof(int),
-		.mode		= 0644,
-		.proc_handler	= proc_dointvec_minmax,
-		.extra1		= &min_rcvbuf,
-	},
 	{
 		.procname	= "dev_weight",
 		.data		= &weight_p,
@@ -584,6 +568,22 @@  static struct ctl_table netns_core_table[] = {
 		.extra1		= SYSCTL_ZERO,
 		.proc_handler	= proc_dointvec_minmax
 	},
+	{
+		.procname	= "wmem_default",
+		.data		= &init_net.core.sysctl_wmem_default,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= &min_sndbuf,
+	},
+	{
+		.procname	= "rmem_default",
+		.data		= &init_net.core.sysctl_rmem_default,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= &min_rcvbuf,
+	},
 	{ }
 };
 
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 2ed0b01f72f0..0fbdcda6f314 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1709,7 +1709,7 @@  void ip_send_unicast_reply(struct sock *sk, struct sk_buff *skb,
 
 	sk->sk_protocol = ip_hdr(skb)->protocol;
 	sk->sk_bound_dev_if = arg->bound_dev_if;
-	sk->sk_sndbuf = sysctl_wmem_default;
+	sk->sk_sndbuf = sock_net(sk)->core.sysctl_wmem_default;
 	ipc.sockc.mark = fl4.flowi4_mark;
 	err = ip_append_data(sk, &fl4, ip_reply_glue_bits, arg->iov->iov_base,
 			     len, 0, &ipc, &rt, MSG_DONTWAIT);