All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] net: Make tcp_allowed_congestion_control readonly in non-init netns
@ 2021-04-13  7:08 Jonathon Reinhart
  2021-04-13  9:06 ` Christian Brauner
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Jonathon Reinhart @ 2021-04-13  7:08 UTC (permalink / raw)
  To: netdev
  Cc: Jonathon Reinhart, David S. Miller, Hideaki YOSHIFUJI,
	David Ahern, Jakub Kicinski, Christian Brauner

Currently, tcp_allowed_congestion_control is global and writable;
writing to it in any net namespace will leak into all other net
namespaces.

tcp_available_congestion_control and tcp_allowed_congestion_control are
the only sysctls in ipv4_net_table (the per-netns sysctl table) with a
NULL data pointer; their handlers (proc_tcp_available_congestion_control
and proc_allowed_congestion_control) have no other way of referencing a
struct net. Thus, they operate globally.

Because ipv4_net_table does not use designated initializers, there is no
easy way to fix up this one "bad" table entry. However, the data pointer
updating logic shouldn't be applied to NULL pointers anyway, so we
instead force these entries to be read-only.

These sysctls used to exist in ipv4_table (init-net only), but they were
moved to the per-net ipv4_net_table, presumably without realizing that
tcp_allowed_congestion_control was writable and thus introduced a leak.

Because the intent of that commit was only to know (i.e. read) "which
congestion algorithms are available or allowed", this read-only solution
should be sufficient.

The logic added in recent commit
31c4d2f160eb: ("net: Ensure net namespace isolation of sysctls")
does not and cannot check for NULL data pointers, because
other table entries (e.g. /proc/sys/net/netfilter/nf_log/) have
.data=NULL but use other methods (.extra2) to access the struct net.

Fixes: 9cb8e048e5d9: ("net/ipv4/sysctl: show tcp_{allowed, available}_congestion_control in non-initial netns")
Signed-off-by: Jonathon Reinhart <jonathon.reinhart@gmail.com>
---
 net/ipv4/sysctl_net_ipv4.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index a09e466ce11d..a62934b9f15a 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -1383,9 +1383,19 @@ static __net_init int ipv4_sysctl_init_net(struct net *net)
 		if (!table)
 			goto err_alloc;
 
-		/* Update the variables to point into the current struct net */
-		for (i = 0; i < ARRAY_SIZE(ipv4_net_table) - 1; i++)
-			table[i].data += (void *)net - (void *)&init_net;
+		for (i = 0; i < ARRAY_SIZE(ipv4_net_table) - 1; i++) {
+			if (table[i].data) {
+				/* Update the variables to point into
+				 * the current struct net
+				 */
+				table[i].data += (void *)net - (void *)&init_net;
+			} else {
+				/* Entries without data pointer are global;
+				 * Make them read-only in non-init_net ns
+				 */
+				table[i].mode &= ~0222;
+			}
+		}
 	}
 
 	net->ipv4.ipv4_hdr = register_net_sysctl(net, "net/ipv4", table);
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] net: Make tcp_allowed_congestion_control readonly in non-init netns
  2021-04-13  7:08 [PATCH] net: Make tcp_allowed_congestion_control readonly in non-init netns Jonathon Reinhart
@ 2021-04-13  9:06 ` Christian Brauner
  2021-04-13 18:23 ` Jakub Kicinski
  2021-04-15  3:33 ` [PATCH RESEND net-next] " Jonathon Reinhart
  2 siblings, 0 replies; 7+ messages in thread
From: Christian Brauner @ 2021-04-13  9:06 UTC (permalink / raw)
  To: Jonathon Reinhart
  Cc: netdev, David S. Miller, Hideaki YOSHIFUJI, David Ahern, Jakub Kicinski

On Tue, Apr 13, 2021 at 03:08:48AM -0400, Jonathon Reinhart wrote:
> Currently, tcp_allowed_congestion_control is global and writable;
> writing to it in any net namespace will leak into all other net
> namespaces.
> 
> tcp_available_congestion_control and tcp_allowed_congestion_control are
> the only sysctls in ipv4_net_table (the per-netns sysctl table) with a
> NULL data pointer; their handlers (proc_tcp_available_congestion_control
> and proc_allowed_congestion_control) have no other way of referencing a
> struct net. Thus, they operate globally.
> 
> Because ipv4_net_table does not use designated initializers, there is no
> easy way to fix up this one "bad" table entry. However, the data pointer
> updating logic shouldn't be applied to NULL pointers anyway, so we
> instead force these entries to be read-only.
> 
> These sysctls used to exist in ipv4_table (init-net only), but they were
> moved to the per-net ipv4_net_table, presumably without realizing that
> tcp_allowed_congestion_control was writable and thus introduced a leak.
> 
> Because the intent of that commit was only to know (i.e. read) "which
> congestion algorithms are available or allowed", this read-only solution
> should be sufficient.
> 
> The logic added in recent commit
> 31c4d2f160eb: ("net: Ensure net namespace isolation of sysctls")
> does not and cannot check for NULL data pointers, because
> other table entries (e.g. /proc/sys/net/netfilter/nf_log/) have
> .data=NULL but use other methods (.extra2) to access the struct net.
> 
> Fixes: 9cb8e048e5d9: ("net/ipv4/sysctl: show tcp_{allowed, available}_congestion_control in non-initial netns")
> Signed-off-by: Jonathon Reinhart <jonathon.reinhart@gmail.com>
> ---

Thanks for catching this. I think this works fine as a fix.

What I wonder is whether we have a need to make
tcp_allowed_congestion_control per-netns proper. That would be a bit
more work of course and we don't need to do it right now.

(Fwiw, we should really have a document describing the sysctls which are
namespaced and how.)

>  net/ipv4/sysctl_net_ipv4.c | 16 +++++++++++++---
>  1 file changed, 13 insertions(+), 3 deletions(-)
> 
> diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
> index a09e466ce11d..a62934b9f15a 100644
> --- a/net/ipv4/sysctl_net_ipv4.c
> +++ b/net/ipv4/sysctl_net_ipv4.c
> @@ -1383,9 +1383,19 @@ static __net_init int ipv4_sysctl_init_net(struct net *net)
>  		if (!table)
>  			goto err_alloc;
>  
> -		/* Update the variables to point into the current struct net */
> -		for (i = 0; i < ARRAY_SIZE(ipv4_net_table) - 1; i++)
> -			table[i].data += (void *)net - (void *)&init_net;
> +		for (i = 0; i < ARRAY_SIZE(ipv4_net_table) - 1; i++) {
> +			if (table[i].data) {
> +				/* Update the variables to point into
> +				 * the current struct net
> +				 */
> +				table[i].data += (void *)net - (void *)&init_net;
> +			} else {
> +				/* Entries without data pointer are global;
> +				 * Make them read-only in non-init_net ns
> +				 */
> +				table[i].mode &= ~0222;
> +			}
> +		}

Just for future reference if we would need to distinguish between
entries that do have data set and such that don't here we could also do
the below. This works because the tcp_{available,allowed}_[..] do
allocated into a local ctl_table variable anyway.

diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index f55095d3ed16..a1c8f5fe7501 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -877,12 +877,14 @@ static struct ctl_table ipv4_net_table[] = {
        },
        {
                .procname       = "tcp_available_congestion_control",
+               .data           = ERR_PTR(-EINVAL),
                .maxlen         = TCP_CA_BUF_MAX,
                .mode           = 0444,
                .proc_handler   = proc_tcp_available_congestion_control,
        },
        {
                .procname       = "tcp_allowed_congestion_control",
+               .data           = ERR_PTR(-EINVAL),
                .maxlen         = TCP_CA_BUF_MAX,
                .mode           = 0644,
                .proc_handler   = proc_allowed_congestion_control,
@@ -1379,8 +1381,18 @@ static __net_init int ipv4_sysctl_init_net(struct net *net)
                        goto err_alloc;

                /* Update the variables to point into the current struct net */
-               for (i = 0; i < ARRAY_SIZE(ipv4_net_table) - 1; i++)
-                       table[i].data += (void *)net - (void *)&init_net;
+               for (i = 0; i < ARRAY_SIZE(ipv4_net_table) - 1; i++) {
+                       if (!IS_ERR(table[i].data)) {
+                               /* Update the variables to point into
+                                * the current struct net. */
+                               table[i].data += (void *)net - (void *)&init_net;
+                       } else {
+                               /* Entries without data pointer are global;
+                                * Make them read-only in non-init_net ns.
+                                */
+                               table[i].mode &= ~0222;
+                       }
+               }
        }

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] net: Make tcp_allowed_congestion_control readonly in non-init netns
  2021-04-13  7:08 [PATCH] net: Make tcp_allowed_congestion_control readonly in non-init netns Jonathon Reinhart
  2021-04-13  9:06 ` Christian Brauner
@ 2021-04-13 18:23 ` Jakub Kicinski
  2021-04-14 21:31   ` Jonathon Reinhart
  2021-04-15  3:33 ` [PATCH RESEND net-next] " Jonathon Reinhart
  2 siblings, 1 reply; 7+ messages in thread
From: Jakub Kicinski @ 2021-04-13 18:23 UTC (permalink / raw)
  To: Jonathon Reinhart
  Cc: netdev, David S. Miller, Hideaki YOSHIFUJI, David Ahern,
	Christian Brauner

On Tue, 13 Apr 2021 03:08:48 -0400 Jonathon Reinhart wrote:
> Fixes: 9cb8e048e5d9: ("net/ipv4/sysctl: show tcp_{allowed, available}_congestion_control in non-initial netns")

nit: no semicolon after the hash

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] net: Make tcp_allowed_congestion_control readonly in non-init netns
  2021-04-13 18:23 ` Jakub Kicinski
@ 2021-04-14 21:31   ` Jonathon Reinhart
  2021-04-14 23:09     ` Jakub Kicinski
  0 siblings, 1 reply; 7+ messages in thread
From: Jonathon Reinhart @ 2021-04-14 21:31 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Linux Netdev List, David S. Miller, Hideaki YOSHIFUJI,
	David Ahern, Christian Brauner

On Tue, Apr 13, 2021 at 2:23 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Tue, 13 Apr 2021 03:08:48 -0400 Jonathon Reinhart wrote:
> > Fixes: 9cb8e048e5d9: ("net/ipv4/sysctl: show tcp_{allowed, available}_congestion_control in non-initial netns")
>
> nit: no semicolon after the hash

Oops. scripts/checkpatch.pl didn't catch this, but it looks like patchwork did.

You indicated "nit": Shall I resubmit, or can something trivial like
this be fixed up when committed? I'm fine either way, I just need to
know what's expected.

Also, this patch shows up in patchwork as "Not Applicable". I didn't
see any notification about that state change, nor do I really know
what that means. Was that due to this nit, or something else?

Thanks for guiding me through this.
Jonathon

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] net: Make tcp_allowed_congestion_control readonly in non-init netns
  2021-04-14 21:31   ` Jonathon Reinhart
@ 2021-04-14 23:09     ` Jakub Kicinski
  0 siblings, 0 replies; 7+ messages in thread
From: Jakub Kicinski @ 2021-04-14 23:09 UTC (permalink / raw)
  To: Jonathon Reinhart
  Cc: Linux Netdev List, David S. Miller, Hideaki YOSHIFUJI,
	David Ahern, Christian Brauner

On Wed, 14 Apr 2021 17:31:55 -0400 Jonathon Reinhart wrote:
> On Tue, Apr 13, 2021 at 2:23 PM Jakub Kicinski <kuba@kernel.org> wrote:
> >
> > On Tue, 13 Apr 2021 03:08:48 -0400 Jonathon Reinhart wrote:  
> > > Fixes: 9cb8e048e5d9: ("net/ipv4/sysctl: show tcp_{allowed, available}_congestion_control in non-initial netns")  
> >
> > nit: no semicolon after the hash  
> 
> Oops. scripts/checkpatch.pl didn't catch this, but it looks like patchwork did.
> 
> You indicated "nit": Shall I resubmit, or can something trivial like
> this be fixed up when committed? I'm fine either way, I just need to
> know what's expected.

I was just pointing it out, I wasn't 100% sure if it's important enough
for DaveM to request a repost.

> Also, this patch shows up in patchwork as "Not Applicable". I didn't
> see any notification about that state change, nor do I really know
> what that means. Was that due to this nit, or something else?
> 
> Thanks for guiding me through this.

I think best way forward is to repost with the nit addressed.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH RESEND net-next] net: Make tcp_allowed_congestion_control readonly in non-init netns
  2021-04-13  7:08 [PATCH] net: Make tcp_allowed_congestion_control readonly in non-init netns Jonathon Reinhart
  2021-04-13  9:06 ` Christian Brauner
  2021-04-13 18:23 ` Jakub Kicinski
@ 2021-04-15  3:33 ` Jonathon Reinhart
  2 siblings, 0 replies; 7+ messages in thread
From: Jonathon Reinhart @ 2021-04-15  3:33 UTC (permalink / raw)
  To: netdev
  Cc: Jonathon Reinhart, David S. Miller, Hideaki YOSHIFUJI,
	David Ahern, Jakub Kicinski, Christian Brauner

Currently, tcp_allowed_congestion_control is global and writable;
writing to it in any net namespace will leak into all other net
namespaces.

tcp_available_congestion_control and tcp_allowed_congestion_control are
the only sysctls in ipv4_net_table (the per-netns sysctl table) with a
NULL data pointer; their handlers (proc_tcp_available_congestion_control
and proc_allowed_congestion_control) have no other way of referencing a
struct net. Thus, they operate globally.

Because ipv4_net_table does not use designated initializers, there is no
easy way to fix up this one "bad" table entry. However, the data pointer
updating logic shouldn't be applied to NULL pointers anyway, so we
instead force these entries to be read-only.

These sysctls used to exist in ipv4_table (init-net only), but they were
moved to the per-net ipv4_net_table, presumably without realizing that
tcp_allowed_congestion_control was writable and thus introduced a leak.

Because the intent of that commit was only to know (i.e. read) "which
congestion algorithms are available or allowed", this read-only solution
should be sufficient.

The logic added in recent commit
31c4d2f160eb: ("net: Ensure net namespace isolation of sysctls")
does not and cannot check for NULL data pointers, because
other table entries (e.g. /proc/sys/net/netfilter/nf_log/) have
.data=NULL but use other methods (.extra2) to access the struct net.

Fixes: 9cb8e048e5d9 ("net/ipv4/sysctl: show tcp_{allowed, available}_congestion_control in non-initial netns")
Signed-off-by: Jonathon Reinhart <jonathon.reinhart@gmail.com>
---
 net/ipv4/sysctl_net_ipv4.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index a09e466ce11d..a62934b9f15a 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -1383,9 +1383,19 @@ static __net_init int ipv4_sysctl_init_net(struct net *net)
 		if (!table)
 			goto err_alloc;
 
-		/* Update the variables to point into the current struct net */
-		for (i = 0; i < ARRAY_SIZE(ipv4_net_table) - 1; i++)
-			table[i].data += (void *)net - (void *)&init_net;
+		for (i = 0; i < ARRAY_SIZE(ipv4_net_table) - 1; i++) {
+			if (table[i].data) {
+				/* Update the variables to point into
+				 * the current struct net
+				 */
+				table[i].data += (void *)net - (void *)&init_net;
+			} else {
+				/* Entries without data pointer are global;
+				 * Make them read-only in non-init_net ns
+				 */
+				table[i].mode &= ~0222;
+			}
+		}
 	}
 
 	net->ipv4.ipv4_hdr = register_net_sysctl(net, "net/ipv4", table);
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH RESEND net-next] net: Make tcp_allowed_congestion_control readonly in non-init netns
@ 2021-04-18 14:35 Jonathon Reinhart
  0 siblings, 0 replies; 7+ messages in thread
From: Jonathon Reinhart @ 2021-04-18 14:35 UTC (permalink / raw)
  To: jonathon.reinhart; +Cc: davem, kuba, netdev

Hi Dave,

It looks like this patch is on "net", "net-next", and Linus' tree (as
commit 97684f0970f6). Additionally, gregkh has queued it up for the
5.10 and 5.11 stable trees.

But it still shows up in Patchwork as "Needs ACK". Is there anything I
need to do?

Thanks,
Jonathon

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-04-18 14:36 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-13  7:08 [PATCH] net: Make tcp_allowed_congestion_control readonly in non-init netns Jonathon Reinhart
2021-04-13  9:06 ` Christian Brauner
2021-04-13 18:23 ` Jakub Kicinski
2021-04-14 21:31   ` Jonathon Reinhart
2021-04-14 23:09     ` Jakub Kicinski
2021-04-15  3:33 ` [PATCH RESEND net-next] " Jonathon Reinhart
2021-04-18 14:35 Jonathon Reinhart

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.