netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next] net-loopback: allow lo dev initial state to be controlled
@ 2020-11-11 20:43 Jian Yang
  2020-11-12 16:08 ` Andrew Lunn
                   ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: Jian Yang @ 2020-11-11 20:43 UTC (permalink / raw)
  To: davem, kuba, netdev; +Cc: Mahesh Bandewar, Jian Yang

From: Mahesh Bandewar <maheshb@google.com>

Traditionally loopback devices comes up with initial state as DOWN for
any new network-namespace. This would mean that anyone needing this
device (which is mostly true except sandboxes where networking in not
needed at all), would have to bring this UP by issuing something like
'ip link set lo up' which can be avoided if the initial state can be set
as UP. Also ICMP error propagation needs loopback to be UP.

The default value for this sysctl is set to ZERO which will preserve the
backward compatible behavior for the root-netns while changing the
sysctl will only alter the behavior of the newer network namespaces.

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: Jian Yang <jianyang@google.com>
---
 Documentation/admin-guide/sysctl/net.rst | 11 +++++++++++
 drivers/net/loopback.c                   |  7 +++++++
 include/linux/netdevice.h                |  1 +
 net/core/sysctl_net_core.c               | 14 ++++++++++++++
 4 files changed, 33 insertions(+)

diff --git a/Documentation/admin-guide/sysctl/net.rst b/Documentation/admin-guide/sysctl/net.rst
index f2ab8a5b6a4b..6902232ff57a 100644
--- a/Documentation/admin-guide/sysctl/net.rst
+++ b/Documentation/admin-guide/sysctl/net.rst
@@ -268,6 +268,17 @@ Maximum number of microseconds in one NAPI polling cycle. Polling
 will exit when either netdev_budget_usecs have elapsed during the
 poll cycle or the number of packets processed reaches netdev_budget.
 
+netdev_loopback_state
+---------------------
+
+Controls the loopback device initial state for any new network namespaces. By
+default, we keep the initial state as DOWN.
+
+If set to 1, the loopback device will be brought UP during namespace creation.
+This will only apply to all new network namespaces.
+
+Default : 0  (for compatibility reasons)
+
 netdev_max_backlog
 ------------------
 
diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c
index a1c77cc00416..76dc92ac65a2 100644
--- a/drivers/net/loopback.c
+++ b/drivers/net/loopback.c
@@ -219,6 +219,13 @@ static __net_init int loopback_net_init(struct net *net)
 
 	BUG_ON(dev->ifindex != LOOPBACK_IFINDEX);
 	net->loopback_dev = dev;
+
+	if (sysctl_netdev_loopback_state) {
+		/* Bring loopback device UP */
+		rtnl_lock();
+		dev_open(dev, NULL);
+		rtnl_unlock();
+	}
 	return 0;
 
 out_free_netdev:
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 7ce648a564f7..27c0a7e8a8ea 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -625,6 +625,7 @@ struct netdev_queue {
 
 extern int sysctl_fb_tunnels_only_for_init_net;
 extern int sysctl_devconf_inherit_init_net;
+extern int sysctl_netdev_loopback_state;
 
 /*
  * sysctl_fb_tunnels_only_for_init_net == 0 : For all netns
diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index d86d8d11cfe4..d2cf435f5991 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -35,6 +35,11 @@ static int net_msg_warn;	/* Unused, but still a sysctl */
 int sysctl_fb_tunnels_only_for_init_net __read_mostly = 0;
 EXPORT_SYMBOL(sysctl_fb_tunnels_only_for_init_net);
 
+/* 0 - default (backward compatible) state: DOWN by default
+ * 1 - UP by default (for all new network namespaces)
+ */
+int sysctl_netdev_loopback_state __read_mostly;
+
 /* 0 - Keep current behavior:
  *     IPv4: inherit all current settings from init_net
  *     IPv6: reset all settings to default
@@ -507,6 +512,15 @@ static struct ctl_table net_core_table[] = {
 		.proc_handler	= set_default_qdisc
 	},
 #endif
+	{
+		.procname	= "netdev_loopback_state",
+		.data		= &sysctl_netdev_loopback_state,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= SYSCTL_ZERO,
+		.extra2		= SYSCTL_ONE
+	},
 #endif /* CONFIG_NET */
 	{
 		.procname	= "netdev_budget",
-- 
2.29.2.222.g5d2a92d10f8-goog


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next] net-loopback: allow lo dev initial state to be controlled
  2020-11-11 20:43 [PATCH net-next] net-loopback: allow lo dev initial state to be controlled Jian Yang
@ 2020-11-12 16:08 ` Andrew Lunn
  2020-11-12 19:54   ` Dan Williams
  2020-11-14 18:17 ` Jakub Kicinski
  2020-11-17  4:50 ` kernel test robot
  2 siblings, 1 reply; 25+ messages in thread
From: Andrew Lunn @ 2020-11-12 16:08 UTC (permalink / raw)
  To: Jian Yang; +Cc: davem, kuba, netdev, Mahesh Bandewar, Jian Yang

On Wed, Nov 11, 2020 at 12:43:08PM -0800, Jian Yang wrote:
> From: Mahesh Bandewar <maheshb@google.com>
> 
> Traditionally loopback devices comes up with initial state as DOWN for
> any new network-namespace. This would mean that anyone needing this
> device (which is mostly true except sandboxes where networking in not
> needed at all), would have to bring this UP by issuing something like
> 'ip link set lo up' which can be avoided if the initial state can be set
> as UP.

How useful is lo if it is up, but has no IP address? I don't think
this change adds the IP addresses does it? So you still need something
inside your netns to add the IP addresses? Which seems to make this
change pointless?

       Andrew

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next] net-loopback: allow lo dev initial state to be controlled
  2020-11-12 16:08 ` Andrew Lunn
@ 2020-11-12 19:54   ` Dan Williams
  0 siblings, 0 replies; 25+ messages in thread
From: Dan Williams @ 2020-11-12 19:54 UTC (permalink / raw)
  To: Andrew Lunn, Jian Yang; +Cc: davem, kuba, netdev, Mahesh Bandewar, Jian Yang

On Thu, 2020-11-12 at 17:08 +0100, Andrew Lunn wrote:
> On Wed, Nov 11, 2020 at 12:43:08PM -0800, Jian Yang wrote:
> > From: Mahesh Bandewar <maheshb@google.com>
> > 
> > Traditionally loopback devices comes up with initial state as DOWN
> > for
> > any new network-namespace. This would mean that anyone needing this
> > device (which is mostly true except sandboxes where networking in
> > not
> > needed at all), would have to bring this UP by issuing something
> > like
> > 'ip link set lo up' which can be avoided if the initial state can
> > be set
> > as UP.
> 
> How useful is lo if it is up, but has no IP address? I don't think
> this change adds the IP addresses does it? So you still need
> something
> inside your netns to add the IP addresses? Which seems to make this
> change pointless?

lo gets addresses automatically these days, no?

$ ip netns add blue
$ ip netns exec blue ip addr
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
$ ip netns exec blue ip link set lo up
$ ip netns exec blue ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever

Dan


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next] net-loopback: allow lo dev initial state to be controlled
  2020-11-11 20:43 [PATCH net-next] net-loopback: allow lo dev initial state to be controlled Jian Yang
  2020-11-12 16:08 ` Andrew Lunn
@ 2020-11-14 18:17 ` Jakub Kicinski
  2020-11-16 20:02   ` Mahesh Bandewar (महेश बंडेवार)
  2020-11-17  4:50 ` kernel test robot
  2 siblings, 1 reply; 25+ messages in thread
From: Jakub Kicinski @ 2020-11-14 18:17 UTC (permalink / raw)
  To: Jian Yang; +Cc: davem, netdev, Mahesh Bandewar, Jian Yang

On Wed, 11 Nov 2020 12:43:08 -0800 Jian Yang wrote:
> From: Mahesh Bandewar <maheshb@google.com>
> 
> Traditionally loopback devices comes up with initial state as DOWN for
> any new network-namespace. This would mean that anyone needing this
> device (which is mostly true except sandboxes where networking in not
> needed at all), would have to bring this UP by issuing something like
> 'ip link set lo up' which can be avoided if the initial state can be set
> as UP. Also ICMP error propagation needs loopback to be UP.
> 
> The default value for this sysctl is set to ZERO which will preserve the
> backward compatible behavior for the root-netns while changing the
> sysctl will only alter the behavior of the newer network namespaces.

Any reason why the new sysctl itself is not per netns?

> +netdev_loopback_state
> +---------------------

loopback_init_state ?

> +Controls the loopback device initial state for any new network namespaces. By
> +default, we keep the initial state as DOWN.
> +
> +If set to 1, the loopback device will be brought UP during namespace creation.
> +This will only apply to all new network namespaces.
> +
> +Default : 0  (for compatibility reasons)
> +
>  netdev_max_backlog
>  ------------------
>  
> diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c
> index a1c77cc00416..76dc92ac65a2 100644
> --- a/drivers/net/loopback.c
> +++ b/drivers/net/loopback.c
> @@ -219,6 +219,13 @@ static __net_init int loopback_net_init(struct net *net)
>  
>  	BUG_ON(dev->ifindex != LOOPBACK_IFINDEX);
>  	net->loopback_dev = dev;
> +
> +	if (sysctl_netdev_loopback_state) {
> +		/* Bring loopback device UP */
> +		rtnl_lock();
> +		dev_open(dev, NULL);
> +		rtnl_unlock();
> +	}

The only concern I have here is that it breaks notification ordering.
Is there precedent for NETDEV_UP to be generated before all pernet ops
->init was called?

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next] net-loopback: allow lo dev initial state to be controlled
  2020-11-14 18:17 ` Jakub Kicinski
@ 2020-11-16 20:02   ` Mahesh Bandewar (महेश बंडेवार)
  2020-11-16 20:17     ` Jakub Kicinski
  2020-11-16 20:34     ` Jakub Kicinski
  0 siblings, 2 replies; 25+ messages in thread
From: Mahesh Bandewar (महेश बंडेवार) @ 2020-11-16 20:02 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: Jian Yang, David Miller, linux-netdev, Jian Yang

On Sat, Nov 14, 2020 at 10:17 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Wed, 11 Nov 2020 12:43:08 -0800 Jian Yang wrote:
> > From: Mahesh Bandewar <maheshb@google.com>
> >
> > Traditionally loopback devices comes up with initial state as DOWN for
> > any new network-namespace. This would mean that anyone needing this
> > device (which is mostly true except sandboxes where networking in not
> > needed at all), would have to bring this UP by issuing something like
> > 'ip link set lo up' which can be avoided if the initial state can be set
> > as UP. Also ICMP error propagation needs loopback to be UP.
> >
> > The default value for this sysctl is set to ZERO which will preserve the
> > backward compatible behavior for the root-netns while changing the
> > sysctl will only alter the behavior of the newer network namespaces.
>
> Any reason why the new sysctl itself is not per netns?
>
Making it per netns would not be very useful since its effect is only
during netns creation.

> > +netdev_loopback_state
> > +---------------------
>
> loopback_init_state ?
>
That's fine, thanks for the suggestion.

> > +Controls the loopback device initial state for any new network namespaces. By
> > +default, we keep the initial state as DOWN.
> > +
> > +If set to 1, the loopback device will be brought UP during namespace creation.
> > +This will only apply to all new network namespaces.
> > +
> > +Default : 0  (for compatibility reasons)
> > +
> >  netdev_max_backlog
> >  ------------------
> >
> > diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c
> > index a1c77cc00416..76dc92ac65a2 100644
> > --- a/drivers/net/loopback.c
> > +++ b/drivers/net/loopback.c
> > @@ -219,6 +219,13 @@ static __net_init int loopback_net_init(struct net *net)
> >
> >       BUG_ON(dev->ifindex != LOOPBACK_IFINDEX);
> >       net->loopback_dev = dev;
> > +
> > +     if (sysctl_netdev_loopback_state) {
> > +             /* Bring loopback device UP */
> > +             rtnl_lock();
> > +             dev_open(dev, NULL);
> > +             rtnl_unlock();
> > +     }
>
> The only concern I have here is that it breaks notification ordering.
> Is there precedent for NETDEV_UP to be generated before all pernet ops
> ->init was called?
I'm not sure if any and didn't see any issues in our usage / tests.
I'm not even sure anyone is watching/monitoring for lo status as such.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next] net-loopback: allow lo dev initial state to be controlled
  2020-11-16 20:02   ` Mahesh Bandewar (महेश बंडेवार)
@ 2020-11-16 20:17     ` Jakub Kicinski
  2020-11-16 20:50       ` Mahesh Bandewar (महेश बंडेवार)
  2020-11-16 20:34     ` Jakub Kicinski
  1 sibling, 1 reply; 25+ messages in thread
From: Jakub Kicinski @ 2020-11-16 20:17 UTC (permalink / raw)
  To: Mahesh Bandewar (महेश
	बंडेवार)
  Cc: Jian Yang, David Miller, linux-netdev, Jian Yang

On Mon, 16 Nov 2020 12:02:48 -0800 Mahesh Bandewar (महेश बंडेवार) wrote:
> On Sat, Nov 14, 2020 at 10:17 AM Jakub Kicinski <kuba@kernel.org> wrote:
> > On Wed, 11 Nov 2020 12:43:08 -0800 Jian Yang wrote:  
> > > From: Mahesh Bandewar <maheshb@google.com>
> > >
> > > Traditionally loopback devices comes up with initial state as DOWN for
> > > any new network-namespace. This would mean that anyone needing this
> > > device (which is mostly true except sandboxes where networking in not
> > > needed at all), would have to bring this UP by issuing something like
> > > 'ip link set lo up' which can be avoided if the initial state can be set
> > > as UP. Also ICMP error propagation needs loopback to be UP.
> > >
> > > The default value for this sysctl is set to ZERO which will preserve the
> > > backward compatible behavior for the root-netns while changing the
> > > sysctl will only alter the behavior of the newer network namespaces.  
>
> > Any reason why the new sysctl itself is not per netns?
> >  
> Making it per netns would not be very useful since its effect is only
> during netns creation.

I must be confused. Are all namespaces spawned off init_net, not the
current netns the process is in?

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next] net-loopback: allow lo dev initial state to be controlled
  2020-11-16 20:02   ` Mahesh Bandewar (महेश बंडेवार)
  2020-11-16 20:17     ` Jakub Kicinski
@ 2020-11-16 20:34     ` Jakub Kicinski
  2020-11-16 21:03       ` Mahesh Bandewar (महेश बंडेवार)
  1 sibling, 1 reply; 25+ messages in thread
From: Jakub Kicinski @ 2020-11-16 20:34 UTC (permalink / raw)
  To: Mahesh Bandewar (महेश
	बंडेवार) ,
	Ido Schimmel, David Ahern
  Cc: Jian Yang, David Miller, linux-netdev, Jian Yang

On Mon, 16 Nov 2020 12:02:48 -0800 Mahesh Bandewar (महेश बंडेवार) wrote:
> > > diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c
> > > index a1c77cc00416..76dc92ac65a2 100644
> > > --- a/drivers/net/loopback.c
> > > +++ b/drivers/net/loopback.c
> > > @@ -219,6 +219,13 @@ static __net_init int loopback_net_init(struct net *net)
> > >
> > >       BUG_ON(dev->ifindex != LOOPBACK_IFINDEX);
> > >       net->loopback_dev = dev;
> > > +
> > > +     if (sysctl_netdev_loopback_state) {
> > > +             /* Bring loopback device UP */
> > > +             rtnl_lock();
> > > +             dev_open(dev, NULL);
> > > +             rtnl_unlock();
> > > +     }  
> >
> > The only concern I have here is that it breaks notification ordering.
> > Is there precedent for NETDEV_UP to be generated before all pernet ops  
> > ->init was called?  
> I'm not sure if any and didn't see any issues in our usage / tests.
> I'm not even sure anyone is watching/monitoring for lo status as such.

Ido, David, how does this sound to you?

I can't think of any particular case where bringing the device up (and
populating it's addresses) before per netns init is finished could be
problematic. But if this is going to make kernel coding harder the
minor convenience of the knob is probably not worth it.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next] net-loopback: allow lo dev initial state to be controlled
  2020-11-16 20:17     ` Jakub Kicinski
@ 2020-11-16 20:50       ` Mahesh Bandewar (महेश बंडेवार)
  2020-11-16 21:20         ` Jakub Kicinski
  0 siblings, 1 reply; 25+ messages in thread
From: Mahesh Bandewar (महेश बंडेवार) @ 2020-11-16 20:50 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: Jian Yang, David Miller, linux-netdev, Jian Yang

On Mon, Nov 16, 2020 at 12:17 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Mon, 16 Nov 2020 12:02:48 -0800 Mahesh Bandewar (महेश बंडेवार) wrote:
> > On Sat, Nov 14, 2020 at 10:17 AM Jakub Kicinski <kuba@kernel.org> wrote:
> > > On Wed, 11 Nov 2020 12:43:08 -0800 Jian Yang wrote:
> > > > From: Mahesh Bandewar <maheshb@google.com>
> > > >
> > > > Traditionally loopback devices comes up with initial state as DOWN for
> > > > any new network-namespace. This would mean that anyone needing this
> > > > device (which is mostly true except sandboxes where networking in not
> > > > needed at all), would have to bring this UP by issuing something like
> > > > 'ip link set lo up' which can be avoided if the initial state can be set
> > > > as UP. Also ICMP error propagation needs loopback to be UP.
> > > >
> > > > The default value for this sysctl is set to ZERO which will preserve the
> > > > backward compatible behavior for the root-netns while changing the
> > > > sysctl will only alter the behavior of the newer network namespaces.
> >
> > > Any reason why the new sysctl itself is not per netns?
> > >
> > Making it per netns would not be very useful since its effect is only
> > during netns creation.
>
> I must be confused. Are all namespaces spawned off init_net, not the
> current netns the process is in?
The namespace hierarchy is maintained in user-ns while we have per-ns
sysctls hanging off of a netns object and we don't have parent (netns)
reference when initializing newly created netns values. One can copy
the current value of the settings from root-ns but I don't think
that's a good practice since there is no clear way to affect those
values when the root-ns changes them. Also from the isolation
perspective (I think) it's better to have this behavior (sysctl
values) stand on it's own i.e. have default values and then alter
values on it's own without linking to any other netns values.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next] net-loopback: allow lo dev initial state to be controlled
  2020-11-16 20:34     ` Jakub Kicinski
@ 2020-11-16 21:03       ` Mahesh Bandewar (महेश बंडेवार)
  2020-11-17 17:18         ` Ido Schimmel
  0 siblings, 1 reply; 25+ messages in thread
From: Mahesh Bandewar (महेश बंडेवार) @ 2020-11-16 21:03 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Ido Schimmel, David Ahern, Jian Yang, David Miller, linux-netdev,
	Jian Yang, Eric Dumazet

On Mon, Nov 16, 2020 at 12:34 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Mon, 16 Nov 2020 12:02:48 -0800 Mahesh Bandewar (महेश बंडेवार) wrote:
> > > > diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c
> > > > index a1c77cc00416..76dc92ac65a2 100644
> > > > --- a/drivers/net/loopback.c
> > > > +++ b/drivers/net/loopback.c
> > > > @@ -219,6 +219,13 @@ static __net_init int loopback_net_init(struct net *net)
> > > >
> > > >       BUG_ON(dev->ifindex != LOOPBACK_IFINDEX);
> > > >       net->loopback_dev = dev;
> > > > +
> > > > +     if (sysctl_netdev_loopback_state) {
> > > > +             /* Bring loopback device UP */
> > > > +             rtnl_lock();
> > > > +             dev_open(dev, NULL);
> > > > +             rtnl_unlock();
> > > > +     }
> > >
> > > The only concern I have here is that it breaks notification ordering.
> > > Is there precedent for NETDEV_UP to be generated before all pernet ops
> > > ->init was called?
> > I'm not sure if any and didn't see any issues in our usage / tests.
> > I'm not even sure anyone is watching/monitoring for lo status as such.
>
> Ido, David, how does this sound to you?
>
> I can't think of any particular case where bringing the device up (and
> populating it's addresses) before per netns init is finished could be
> problematic. But if this is going to make kernel coding harder the
> minor convenience of the knob is probably not worth it.

+Eric Dumazet

I'm not sure why kernel coding should get harder, but happy to listen
to the opinions.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next] net-loopback: allow lo dev initial state to be controlled
  2020-11-16 20:50       ` Mahesh Bandewar (महेश बंडेवार)
@ 2020-11-16 21:20         ` Jakub Kicinski
  2020-11-16 21:42           ` Mahesh Bandewar (महेश बंडेवार)
  0 siblings, 1 reply; 25+ messages in thread
From: Jakub Kicinski @ 2020-11-16 21:20 UTC (permalink / raw)
  To: Mahesh Bandewar (महेश
	बंडेवार)
  Cc: Jian Yang, David Miller, linux-netdev, Jian Yang

On Mon, 16 Nov 2020 12:50:22 -0800 Mahesh Bandewar (महेश बंडेवार) wrote:
> On Mon, Nov 16, 2020 at 12:17 PM Jakub Kicinski <kuba@kernel.org> wrote:
> > On Mon, 16 Nov 2020 12:02:48 -0800 Mahesh Bandewar (महेश बंडेवार) wrote:  
> > > On Sat, Nov 14, 2020 at 10:17 AM Jakub Kicinski <kuba@kernel.org> wrote:  
> > > > On Wed, 11 Nov 2020 12:43:08 -0800 Jian Yang wrote:  
> > > > > From: Mahesh Bandewar <maheshb@google.com>
> > > > >
> > > > > Traditionally loopback devices comes up with initial state as DOWN for
> > > > > any new network-namespace. This would mean that anyone needing this
> > > > > device (which is mostly true except sandboxes where networking in not
> > > > > needed at all), would have to bring this UP by issuing something like
> > > > > 'ip link set lo up' which can be avoided if the initial state can be set
> > > > > as UP. Also ICMP error propagation needs loopback to be UP.
> > > > >
> > > > > The default value for this sysctl is set to ZERO which will preserve the
> > > > > backward compatible behavior for the root-netns while changing the
> > > > > sysctl will only alter the behavior of the newer network namespaces.  
> > >  
> > > > Any reason why the new sysctl itself is not per netns?
> > > >  
> > > Making it per netns would not be very useful since its effect is only
> > > during netns creation.  
> >
> > I must be confused. Are all namespaces spawned off init_net, not the
> > current netns the process is in?  
> The namespace hierarchy is maintained in user-ns while we have per-ns
> sysctls hanging off of a netns object and we don't have parent (netns)
> reference when initializing newly created netns values. One can copy
> the current value of the settings from root-ns but I don't think
> that's a good practice since there is no clear way to affect those
> values when the root-ns changes them. Also from the isolation
> perspective (I think) it's better to have this behavior (sysctl
> values) stand on it's own i.e. have default values and then alter
> values on it's own without linking to any other netns values.

To be clear, what I meant was just to make the sysctl per namespace.
That way you can deploy a workload with this sysctl set appropriately
without changing the system-global setting.

Is your expectation that particular application stacks would take
advantage of this, or that people would set this to 1 across the
fleet?

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next] net-loopback: allow lo dev initial state to be controlled
  2020-11-16 21:20         ` Jakub Kicinski
@ 2020-11-16 21:42           ` Mahesh Bandewar (महेश बंडेवार)
  0 siblings, 0 replies; 25+ messages in thread
From: Mahesh Bandewar (महेश बंडेवार) @ 2020-11-16 21:42 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: Jian Yang, David Miller, linux-netdev, Jian Yang

On Mon, Nov 16, 2020 at 1:20 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Mon, 16 Nov 2020 12:50:22 -0800 Mahesh Bandewar (महेश बंडेवार) wrote:
> > On Mon, Nov 16, 2020 at 12:17 PM Jakub Kicinski <kuba@kernel.org> wrote:
> > > On Mon, 16 Nov 2020 12:02:48 -0800 Mahesh Bandewar (महेश बंडेवार) wrote:
> > > > On Sat, Nov 14, 2020 at 10:17 AM Jakub Kicinski <kuba@kernel.org> wrote:
> > > > > On Wed, 11 Nov 2020 12:43:08 -0800 Jian Yang wrote:
> > > > > > From: Mahesh Bandewar <maheshb@google.com>
> > > > > >
> > > > > > Traditionally loopback devices comes up with initial state as DOWN for
> > > > > > any new network-namespace. This would mean that anyone needing this
> > > > > > device (which is mostly true except sandboxes where networking in not
> > > > > > needed at all), would have to bring this UP by issuing something like
> > > > > > 'ip link set lo up' which can be avoided if the initial state can be set
> > > > > > as UP. Also ICMP error propagation needs loopback to be UP.
> > > > > >
> > > > > > The default value for this sysctl is set to ZERO which will preserve the
> > > > > > backward compatible behavior for the root-netns while changing the
> > > > > > sysctl will only alter the behavior of the newer network namespaces.
> > > >
> > > > > Any reason why the new sysctl itself is not per netns?
> > > > >
> > > > Making it per netns would not be very useful since its effect is only
> > > > during netns creation.
> > >
> > > I must be confused. Are all namespaces spawned off init_net, not the
> > > current netns the process is in?
> > The namespace hierarchy is maintained in user-ns while we have per-ns
> > sysctls hanging off of a netns object and we don't have parent (netns)
> > reference when initializing newly created netns values. One can copy
> > the current value of the settings from root-ns but I don't think
> > that's a good practice since there is no clear way to affect those
> > values when the root-ns changes them. Also from the isolation
> > perspective (I think) it's better to have this behavior (sysctl
> > values) stand on it's own i.e. have default values and then alter
> > values on it's own without linking to any other netns values.
>
> To be clear, what I meant was just to make the sysctl per namespace.
> That way you can deploy a workload with this sysctl set appropriately
> without changing the system-global setting.
>
> Is your expectation that particular application stacks would take
> advantage of this, or that people would set this to 1 across the
> fleet?

Having loopback DOWN is useful to only a tiny fraction of netns use
cases where networking is not enabled but since that's how it had been
historically so breaking that (default) behavior is not right. But
apart from those cases, wherever networking is used inside netns (most
of the use cases), loopback is always required to be UP otherwise your
ICMP error delivery is affected. So workloads that always use
networking inside netns would set this value to be 1 always (Google's
use case) while those workloads where there is a mix of non-networking
as well as networking enabled netns-es can use the default behavior
that has been traditionally present (where the value set to 0).

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next] net-loopback: allow lo dev initial state to be controlled
  2020-11-11 20:43 [PATCH net-next] net-loopback: allow lo dev initial state to be controlled Jian Yang
  2020-11-12 16:08 ` Andrew Lunn
  2020-11-14 18:17 ` Jakub Kicinski
@ 2020-11-17  4:50 ` kernel test robot
  2 siblings, 0 replies; 25+ messages in thread
From: kernel test robot @ 2020-11-17  4:50 UTC (permalink / raw)
  To: Jian Yang, davem, kuba, netdev; +Cc: kbuild-all, Mahesh Bandewar, Jian Yang

[-- Attachment #1: Type: text/plain, Size: 1655 bytes --]

Hi Jian,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on net-next/master]

url:    https://github.com/0day-ci/linux/commits/Jian-Yang/net-loopback-allow-lo-dev-initial-state-to-be-controlled/20201112-044539
base:   https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git 70408949a35f1a31c327c69b6a187635cb0305fa
config: powerpc64-randconfig-r006-20201116 (attached as .config)
compiler: powerpc-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/f1167177eeca028a046726f582c332d4c638a0c8
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Jian-Yang/net-loopback-allow-lo-dev-initial-state-to-be-controlled/20201112-044539
        git checkout f1167177eeca028a046726f582c332d4c638a0c8
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=powerpc64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   powerpc-linux-ld: drivers/net/loopback.o: in function `loopback_net_init':
>> loopback.c:(.init.text+0x7a): undefined reference to `sysctl_netdev_loopback_state'
>> powerpc-linux-ld: loopback.c:(.init.text+0x7e): undefined reference to `sysctl_netdev_loopback_state'

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 27672 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next] net-loopback: allow lo dev initial state to be controlled
  2020-11-16 21:03       ` Mahesh Bandewar (महेश बंडेवार)
@ 2020-11-17 17:18         ` Ido Schimmel
  2020-11-17 20:53           ` Mahesh Bandewar (महेश बंडेवार)
  0 siblings, 1 reply; 25+ messages in thread
From: Ido Schimmel @ 2020-11-17 17:18 UTC (permalink / raw)
  To: Mahesh Bandewar (महेश
	बंडेवार)
  Cc: Jakub Kicinski, David Ahern, Jian Yang, David Miller,
	linux-netdev, Jian Yang, Eric Dumazet

On Mon, Nov 16, 2020 at 01:03:32PM -0800, Mahesh Bandewar (महेश बंडेवार) wrote:
> On Mon, Nov 16, 2020 at 12:34 PM Jakub Kicinski <kuba@kernel.org> wrote:
> >
> > On Mon, 16 Nov 2020 12:02:48 -0800 Mahesh Bandewar (महेश बंडेवार) wrote:
> > > > > diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c
> > > > > index a1c77cc00416..76dc92ac65a2 100644
> > > > > --- a/drivers/net/loopback.c
> > > > > +++ b/drivers/net/loopback.c
> > > > > @@ -219,6 +219,13 @@ static __net_init int loopback_net_init(struct net *net)
> > > > >
> > > > >       BUG_ON(dev->ifindex != LOOPBACK_IFINDEX);
> > > > >       net->loopback_dev = dev;
> > > > > +
> > > > > +     if (sysctl_netdev_loopback_state) {
> > > > > +             /* Bring loopback device UP */
> > > > > +             rtnl_lock();
> > > > > +             dev_open(dev, NULL);
> > > > > +             rtnl_unlock();
> > > > > +     }
> > > >
> > > > The only concern I have here is that it breaks notification ordering.
> > > > Is there precedent for NETDEV_UP to be generated before all pernet ops
> > > > ->init was called?
> > > I'm not sure if any and didn't see any issues in our usage / tests.
> > > I'm not even sure anyone is watching/monitoring for lo status as such.
> >
> > Ido, David, how does this sound to you?
> >
> > I can't think of any particular case where bringing the device up (and
> > populating it's addresses) before per netns init is finished could be
> > problematic. But if this is going to make kernel coding harder the
> > minor convenience of the knob is probably not worth it.
> 
> +Eric Dumazet
> 
> I'm not sure why kernel coding should get harder, but happy to listen
> to the opinions.

Hi,

Sorry for the delay. Does not occur to me as a problematic change. I ran
various tests with 'sysctl -qw net.core.netdev_loopback_state=1' and a
debug config. Looks OK.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next] net-loopback: allow lo dev initial state to be controlled
  2020-11-17 17:18         ` Ido Schimmel
@ 2020-11-17 20:53           ` Mahesh Bandewar (महेश बंडेवार)
  2020-11-18  1:12             ` David Ahern
  0 siblings, 1 reply; 25+ messages in thread
From: Mahesh Bandewar (महेश बंडेवार) @ 2020-11-17 20:53 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: Jakub Kicinski, David Ahern, Jian Yang, David Miller,
	linux-netdev, Jian Yang, Eric Dumazet

On Tue, Nov 17, 2020 at 9:18 AM Ido Schimmel <idosch@idosch.org> wrote:
>
> On Mon, Nov 16, 2020 at 01:03:32PM -0800, Mahesh Bandewar (महेश बंडेवार) wrote:
> > On Mon, Nov 16, 2020 at 12:34 PM Jakub Kicinski <kuba@kernel.org> wrote:
> > >
> > > On Mon, 16 Nov 2020 12:02:48 -0800 Mahesh Bandewar (महेश बंडेवार) wrote:
> > > > > > diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c
> > > > > > index a1c77cc00416..76dc92ac65a2 100644
> > > > > > --- a/drivers/net/loopback.c
> > > > > > +++ b/drivers/net/loopback.c
> > > > > > @@ -219,6 +219,13 @@ static __net_init int loopback_net_init(struct net *net)
> > > > > >
> > > > > >       BUG_ON(dev->ifindex != LOOPBACK_IFINDEX);
> > > > > >       net->loopback_dev = dev;
> > > > > > +
> > > > > > +     if (sysctl_netdev_loopback_state) {
> > > > > > +             /* Bring loopback device UP */
> > > > > > +             rtnl_lock();
> > > > > > +             dev_open(dev, NULL);
> > > > > > +             rtnl_unlock();
> > > > > > +     }
> > > > >
> > > > > The only concern I have here is that it breaks notification ordering.
> > > > > Is there precedent for NETDEV_UP to be generated before all pernet ops
> > > > > ->init was called?
> > > > I'm not sure if any and didn't see any issues in our usage / tests.
> > > > I'm not even sure anyone is watching/monitoring for lo status as such.
> > >
> > > Ido, David, how does this sound to you?
> > >
> > > I can't think of any particular case where bringing the device up (and
> > > populating it's addresses) before per netns init is finished could be
> > > problematic. But if this is going to make kernel coding harder the
> > > minor convenience of the knob is probably not worth it.
> >
> > +Eric Dumazet
> >
> > I'm not sure why kernel coding should get harder, but happy to listen
> > to the opinions.
>
> Hi,
>
> Sorry for the delay. Does not occur to me as a problematic change. I ran
> various tests with 'sysctl -qw net.core.netdev_loopback_state=1' and a
> debug config. Looks OK.

Thanks for the confirmation Ido. I think Jian is getting powerpc
config build fixed to address the build-bots findings and then he can
push the updated version.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next] net-loopback: allow lo dev initial state to be controlled
  2020-11-17 20:53           ` Mahesh Bandewar (महेश बंडेवार)
@ 2020-11-18  1:12             ` David Ahern
  2020-11-18 16:58               ` Nicolas Dichtel
  0 siblings, 1 reply; 25+ messages in thread
From: David Ahern @ 2020-11-18  1:12 UTC (permalink / raw)
  To: Mahesh Bandewar (महेश
	बंडेवार),
	Ido Schimmel
  Cc: Jakub Kicinski, Jian Yang, David Miller, linux-netdev, Jian Yang,
	Eric Dumazet

On 11/17/20 1:53 PM, Mahesh Bandewar (महेश बंडेवार) wrote:
> On Tue, Nov 17, 2020 at 9:18 AM Ido Schimmel <idosch@idosch.org> wrote:
>>
>> On Mon, Nov 16, 2020 at 01:03:32PM -0800, Mahesh Bandewar (महेश बंडेवार) wrote:
>>> On Mon, Nov 16, 2020 at 12:34 PM Jakub Kicinski <kuba@kernel.org> wrote:
>>>>
>>>> On Mon, 16 Nov 2020 12:02:48 -0800 Mahesh Bandewar (महेश बंडेवार) wrote:
>>>>>>> diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c
>>>>>>> index a1c77cc00416..76dc92ac65a2 100644
>>>>>>> --- a/drivers/net/loopback.c
>>>>>>> +++ b/drivers/net/loopback.c
>>>>>>> @@ -219,6 +219,13 @@ static __net_init int loopback_net_init(struct net *net)
>>>>>>>
>>>>>>>       BUG_ON(dev->ifindex != LOOPBACK_IFINDEX);
>>>>>>>       net->loopback_dev = dev;
>>>>>>> +
>>>>>>> +     if (sysctl_netdev_loopback_state) {
>>>>>>> +             /* Bring loopback device UP */
>>>>>>> +             rtnl_lock();
>>>>>>> +             dev_open(dev, NULL);
>>>>>>> +             rtnl_unlock();
>>>>>>> +     }
>>>>>>
>>>>>> The only concern I have here is that it breaks notification ordering.
>>>>>> Is there precedent for NETDEV_UP to be generated before all pernet ops
>>>>>> ->init was called?
>>>>> I'm not sure if any and didn't see any issues in our usage / tests.
>>>>> I'm not even sure anyone is watching/monitoring for lo status as such.
>>>>
>>>> Ido, David, how does this sound to you?
>>>>
>>>> I can't think of any particular case where bringing the device up (and
>>>> populating it's addresses) before per netns init is finished could be
>>>> problematic. But if this is going to make kernel coding harder the
>>>> minor convenience of the knob is probably not worth it.
>>>
>>> +Eric Dumazet
>>>
>>> I'm not sure why kernel coding should get harder, but happy to listen
>>> to the opinions.
>>
>> Hi,
>>
>> Sorry for the delay. Does not occur to me as a problematic change. I ran
>> various tests with 'sysctl -qw net.core.netdev_loopback_state=1' and a
>> debug config. Looks OK.
> 
> Thanks for the confirmation Ido. I think Jian is getting powerpc
> config build fixed to address the build-bots findings and then he can
> push the updated version.
> 

If there is no harm in just creating lo in the up state, why not just do
it vs relying on a sysctl? It only affects 'local' networking so no real
impact to containers that do not do networking (ie., packets can't
escape). Linux has a lot of sysctl options; is this one really needed?

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next] net-loopback: allow lo dev initial state to be controlled
  2020-11-18  1:12             ` David Ahern
@ 2020-11-18 16:58               ` Nicolas Dichtel
  2020-11-18 17:39                 ` Mahesh Bandewar (महेश बंडेवार)
  0 siblings, 1 reply; 25+ messages in thread
From: Nicolas Dichtel @ 2020-11-18 16:58 UTC (permalink / raw)
  To: David Ahern,
	Mahesh Bandewar (महेश
	बंडेवार),
	Ido Schimmel
  Cc: Jakub Kicinski, Jian Yang, David Miller, linux-netdev, Jian Yang,
	Eric Dumazet

Le 18/11/2020 à 02:12, David Ahern a écrit :
[snip]
> If there is no harm in just creating lo in the up state, why not just do
> it vs relying on a sysctl? It only affects 'local' networking so no real
> impact to containers that do not do networking (ie., packets can't
> escape). Linux has a lot of sysctl options; is this one really needed?
> 
+1

And thus, it will benefit to everybody.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next] net-loopback: allow lo dev initial state to be controlled
  2020-11-18 16:58               ` Nicolas Dichtel
@ 2020-11-18 17:39                 ` Mahesh Bandewar (महेश बंडेवार)
  2020-11-18 18:04                   ` David Ahern
  2020-11-19  8:03                   ` Nicolas Dichtel
  0 siblings, 2 replies; 25+ messages in thread
From: Mahesh Bandewar (महेश बंडेवार) @ 2020-11-18 17:39 UTC (permalink / raw)
  To: nicolas.dichtel
  Cc: David Ahern, Ido Schimmel, Jakub Kicinski, Jian Yang,
	David Miller, linux-netdev, Jian Yang, Eric Dumazet

On Wed, Nov 18, 2020 at 8:58 AM Nicolas Dichtel
<nicolas.dichtel@6wind.com> wrote:
>
> Le 18/11/2020 à 02:12, David Ahern a écrit :
> [snip]
> > If there is no harm in just creating lo in the up state, why not just do
> > it vs relying on a sysctl? It only affects 'local' networking so no real
> > impact to containers that do not do networking (ie., packets can't
> > escape). Linux has a lot of sysctl options; is this one really needed?
> >
I started with that approach but then I was informed about these
containers that disable networking all together including loopback.
Also bringing up by default would break backward compatibility hence
resorted to sysctl.
> +1
>
> And thus, it will benefit to everybody.

Well, it benefits everyone who uses networking (most of us) inside
netns but would create problems for workloads that create netns to
disable networking. One can always disable it after creating the netns
but that would mean change in the workflow and it could be viewed as
regression.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next] net-loopback: allow lo dev initial state to be controlled
  2020-11-18 17:39                 ` Mahesh Bandewar (महेश बंडेवार)
@ 2020-11-18 18:04                   ` David Ahern
  2020-11-18 19:54                     ` Mahesh Bandewar (महेश बंडेवार)
  2020-11-19  8:03                   ` Nicolas Dichtel
  1 sibling, 1 reply; 25+ messages in thread
From: David Ahern @ 2020-11-18 18:04 UTC (permalink / raw)
  To: Mahesh Bandewar (महेश
	बंडेवार),
	nicolas.dichtel
  Cc: Ido Schimmel, Jakub Kicinski, Jian Yang, David Miller,
	linux-netdev, Jian Yang, Eric Dumazet

On 11/18/20 10:39 AM, Mahesh Bandewar (महेश बंडेवार) wrote:
> On Wed, Nov 18, 2020 at 8:58 AM Nicolas Dichtel
> <nicolas.dichtel@6wind.com> wrote:
>>
>> Le 18/11/2020 à 02:12, David Ahern a écrit :
>> [snip]
>>> If there is no harm in just creating lo in the up state, why not just do
>>> it vs relying on a sysctl? It only affects 'local' networking so no real
>>> impact to containers that do not do networking (ie., packets can't
>>> escape). Linux has a lot of sysctl options; is this one really needed?
>>>
> I started with that approach but then I was informed about these
> containers that disable networking all together including loopback.
> Also bringing up by default would break backward compatibility hence
> resorted to sysctl.
>> +1
>>
>> And thus, it will benefit to everybody.
> 
> Well, it benefits everyone who uses networking (most of us) inside
> netns but would create problems for workloads that create netns to
> disable networking. One can always disable it after creating the netns
> but that would mean change in the workflow and it could be viewed as
> regression.
> 

Then perhaps the relevant sysctl -- or maybe netns attribute -- is
whether to create a loopback device at all.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next] net-loopback: allow lo dev initial state to be controlled
  2020-11-18 18:04                   ` David Ahern
@ 2020-11-18 19:54                     ` Mahesh Bandewar (महेश बंडेवार)
  0 siblings, 0 replies; 25+ messages in thread
From: Mahesh Bandewar (महेश बंडेवार) @ 2020-11-18 19:54 UTC (permalink / raw)
  To: David Ahern
  Cc: nicolas.dichtel, Ido Schimmel, Jakub Kicinski, Jian Yang,
	David Miller, linux-netdev, Jian Yang, Eric Dumazet

On Wed, Nov 18, 2020 at 10:04 AM David Ahern <dsahern@gmail.com> wrote:
>
> On 11/18/20 10:39 AM, Mahesh Bandewar (महेश बंडेवार) wrote:
> > On Wed, Nov 18, 2020 at 8:58 AM Nicolas Dichtel
> > <nicolas.dichtel@6wind.com> wrote:
> >>
> >> Le 18/11/2020 à 02:12, David Ahern a écrit :
> >> [snip]
> >>> If there is no harm in just creating lo in the up state, why not just do
> >>> it vs relying on a sysctl? It only affects 'local' networking so no real
> >>> impact to containers that do not do networking (ie., packets can't
> >>> escape). Linux has a lot of sysctl options; is this one really needed?
> >>>
> > I started with that approach but then I was informed about these
> > containers that disable networking all together including loopback.
> > Also bringing up by default would break backward compatibility hence
> > resorted to sysctl.
> >> +1
> >>
> >> And thus, it will benefit to everybody.
> >
> > Well, it benefits everyone who uses networking (most of us) inside
> > netns but would create problems for workloads that create netns to
> > disable networking. One can always disable it after creating the netns
> > but that would mean change in the workflow and it could be viewed as
> > regression.
> >
>
> Then perhaps the relevant sysctl -- or maybe netns attribute -- is
> whether to create a loopback device at all.

I'm open to ideas within the bounds of maintaining backward
compatibility. However, not able to see how we can pull it off without
creating a 'loopback' device.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next] net-loopback: allow lo dev initial state to be controlled
  2020-11-18 17:39                 ` Mahesh Bandewar (महेश बंडेवार)
  2020-11-18 18:04                   ` David Ahern
@ 2020-11-19  8:03                   ` Nicolas Dichtel
  2020-11-20  3:55                     ` Mahesh Bandewar (महेश बंडेवार)
  1 sibling, 1 reply; 25+ messages in thread
From: Nicolas Dichtel @ 2020-11-19  8:03 UTC (permalink / raw)
  To: Mahesh Bandewar (महेश
	बंडेवार)
  Cc: David Ahern, Ido Schimmel, Jakub Kicinski, Jian Yang,
	David Miller, linux-netdev, Jian Yang, Eric Dumazet

Le 18/11/2020 à 18:39, Mahesh Bandewar (महेश बंडेवार) a écrit :
> On Wed, Nov 18, 2020 at 8:58 AM Nicolas Dichtel
> <nicolas.dichtel@6wind.com> wrote:
>>
>> Le 18/11/2020 à 02:12, David Ahern a écrit :
>> [snip]
>>> If there is no harm in just creating lo in the up state, why not just do
>>> it vs relying on a sysctl? It only affects 'local' networking so no real
>>> impact to containers that do not do networking (ie., packets can't
>>> escape). Linux has a lot of sysctl options; is this one really needed?
>>>
> I started with that approach but then I was informed about these
> containers that disable networking all together including loopback.
> Also bringing up by default would break backward compatibility hence
> resorted to sysctl.
>> +1
>>
>> And thus, it will benefit to everybody.
> 
> Well, it benefits everyone who uses networking (most of us) inside
Sure.

> netns but would create problems for workloads that create netns to
> disable networking. One can always disable it after creating the netns
> but that would mean change in the workflow and it could be viewed as
> regression.
The networking is very limited with only a loopback. Do you have some real use
case in mind?

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next] net-loopback: allow lo dev initial state to be controlled
  2020-11-19  8:03                   ` Nicolas Dichtel
@ 2020-11-20  3:55                     ` Mahesh Bandewar (महेश बंडेवार)
  2020-11-20  4:56                       ` Jakub Kicinski
  0 siblings, 1 reply; 25+ messages in thread
From: Mahesh Bandewar (महेश बंडेवार) @ 2020-11-20  3:55 UTC (permalink / raw)
  To: nicolas.dichtel
  Cc: David Ahern, Ido Schimmel, Jakub Kicinski, Jian Yang,
	David Miller, linux-netdev, Jian Yang, Eric Dumazet

On Thu, Nov 19, 2020 at 12:03 AM Nicolas Dichtel
<nicolas.dichtel@6wind.com> wrote:
>
> Le 18/11/2020 à 18:39, Mahesh Bandewar (महेश बंडेवार) a écrit :
> > On Wed, Nov 18, 2020 at 8:58 AM Nicolas Dichtel
> > <nicolas.dichtel@6wind.com> wrote:
> >>
> >> Le 18/11/2020 à 02:12, David Ahern a écrit :
> >> [snip]
> >>> If there is no harm in just creating lo in the up state, why not just do
> >>> it vs relying on a sysctl? It only affects 'local' networking so no real
> >>> impact to containers that do not do networking (ie., packets can't
> >>> escape). Linux has a lot of sysctl options; is this one really needed?
> >>>
> > I started with that approach but then I was informed about these
> > containers that disable networking all together including loopback.
> > Also bringing up by default would break backward compatibility hence
> > resorted to sysctl.
> >> +1
> >>
> >> And thus, it will benefit to everybody.
> >
> > Well, it benefits everyone who uses networking (most of us) inside
> Sure.
>
> > netns but would create problems for workloads that create netns to
> > disable networking. One can always disable it after creating the netns
> > but that would mean change in the workflow and it could be viewed as
> > regression.
> The networking is very limited with only a loopback. Do you have some real use
> case in mind?

My use cases all use networking but I think principally we cannot
break backward compatibility, right?
Jakub, WDYT?

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next] net-loopback: allow lo dev initial state to be controlled
  2020-11-20  3:55                     ` Mahesh Bandewar (महेश बंडेवार)
@ 2020-11-20  4:56                       ` Jakub Kicinski
  2020-12-01 20:24                         ` Mahesh Bandewar (महेश बंडेवार)
  0 siblings, 1 reply; 25+ messages in thread
From: Jakub Kicinski @ 2020-11-20  4:56 UTC (permalink / raw)
  To: Mahesh Bandewar (महेश
	बंडेवार)
  Cc: nicolas.dichtel, David Ahern, Ido Schimmel, Jian Yang,
	David Miller, linux-netdev, Jian Yang, Eric Dumazet

On Thu, 19 Nov 2020 19:55:08 -0800 Mahesh Bandewar (महेश बंडेवार) wrote:
> On Thu, Nov 19, 2020 at 12:03 AM Nicolas Dichtel
> > Le 18/11/2020 à 18:39, Mahesh Bandewar (महेश बंडेवार) a écrit :  
> > > netns but would create problems for workloads that create netns to
> > > disable networking. One can always disable it after creating the netns
> > > but that would mean change in the workflow and it could be viewed as
> > > regression.  
> > The networking is very limited with only a loopback. Do you have some real use
> > case in mind?  
> 
> My use cases all use networking but I think principally we cannot
> break backward compatibility, right?
> Jakub, WDYT?

Do you have more details on what the use cases are that expect no
networking?

TBH I don't get the utility of this knob. If you want to write vaguely
portable software you have to assume the knob won't be useful, because
either (a) kernel may be old, or (b) you shouldn't assume to own the
sysctls and this is a global one (what if an application spawns that
expects legacy behavior?)

And if you have to check for those two things you're gonna write more
code than just ifuping lo would be.

Maybe you can shed some more light on how it makes life at Google
easier for you? Or someone else can enlighten me?

I don't have much practical experience with namespaces, but the more 
I think about it the more pointless it seems.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next] net-loopback: allow lo dev initial state to be controlled
  2020-11-20  4:56                       ` Jakub Kicinski
@ 2020-12-01 20:24                         ` Mahesh Bandewar (महेश बंडेवार)
  2020-12-02  2:38                           ` Jakub Kicinski
  0 siblings, 1 reply; 25+ messages in thread
From: Mahesh Bandewar (महेश बंडेवार) @ 2020-12-01 20:24 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: nicolas.dichtel, David Ahern, Ido Schimmel, Jian Yang,
	David Miller, linux-netdev, Jian Yang, Eric Dumazet

On Thu, Nov 19, 2020 at 8:56 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Thu, 19 Nov 2020 19:55:08 -0800 Mahesh Bandewar (महेश बंडेवार) wrote:
> > On Thu, Nov 19, 2020 at 12:03 AM Nicolas Dichtel
> > > Le 18/11/2020 à 18:39, Mahesh Bandewar (महेश बंडेवार) a écrit :
> > > > netns but would create problems for workloads that create netns to
> > > > disable networking. One can always disable it after creating the netns
> > > > but that would mean change in the workflow and it could be viewed as
> > > > regression.
> > > The networking is very limited with only a loopback. Do you have some real use
> > > case in mind?
> >
> > My use cases all use networking but I think principally we cannot
> > break backward compatibility, right?
> > Jakub, WDYT?
>
> Do you have more details on what the use cases are that expect no
> networking?
>
> TBH I don't get the utility of this knob. If you want to write vaguely
> portable software you have to assume the knob won't be useful, because
> either (a) kernel may be old, or (b) you shouldn't assume to own the
> sysctls and this is a global one (what if an application spawns that
> expects legacy behavior?)
>
> And if you have to check for those two things you're gonna write more
> code than just ifuping lo would be.
>
> Maybe you can shed some more light on how it makes life at Google
> easier for you? Or someone else can enlighten me?
>
> I don't have much practical experience with namespaces, but the more
> I think about it the more pointless it seems.

Thanks for the input.

Sorry, I was on vacation and couldn't process this response earlier.

There are two things that this patch is providing and let me understand the
objections well.

(a) Bring up lo by default
(b) sysctl to protect the legacy behavior

Frankly we really dont have any legacy-behavior use case and one can
bring it back to life by just doing 'ifdown lo' if necessary. Most of
the use cases involve using networking anyways. My belief was that we
need to protect legacy behavior and hence went lengths to add sysctl
to protect it. If we are OK not to have it, I'm more than happy to
remove the sysctl and just have the 3 line patch to bring loopback up.

If legacy-behavior is a concern (which I thought generally is), then
either we can have the sysctl to have it around to protect it (the
current implementation) but if we prefer to have kernel-command-line
instead of sysctl that defaults to legacy behavior but if provided, we
can set it UP by default during boot (or the other way around)?

My primary motive is (a) while (b) is just a side-effect which we can
get away if deemed unnecessary.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next] net-loopback: allow lo dev initial state to be controlled
  2020-12-01 20:24                         ` Mahesh Bandewar (महेश बंडेवार)
@ 2020-12-02  2:38                           ` Jakub Kicinski
  2020-12-02 20:53                             ` Mahesh Bandewar (महेश बंडेवार)
  0 siblings, 1 reply; 25+ messages in thread
From: Jakub Kicinski @ 2020-12-02  2:38 UTC (permalink / raw)
  To: Mahesh Bandewar (महेश
	बंडेवार)
  Cc: nicolas.dichtel, David Ahern, Ido Schimmel, Jian Yang,
	David Miller, linux-netdev, Jian Yang, Eric Dumazet

On Tue, 1 Dec 2020 12:24:38 -0800 Mahesh Bandewar (महेश बंडेवार) wrote:
> On Thu, Nov 19, 2020 at 8:56 PM Jakub Kicinski <kuba@kernel.org> wrote:
> > Do you have more details on what the use cases are that expect no
> > networking?
> >
> > TBH I don't get the utility of this knob. If you want to write vaguely
> > portable software you have to assume the knob won't be useful, because
> > either (a) kernel may be old, or (b) you shouldn't assume to own the
> > sysctls and this is a global one (what if an application spawns that
> > expects legacy behavior?)
> >
> > And if you have to check for those two things you're gonna write more
> > code than just ifuping lo would be.
> >
> > Maybe you can shed some more light on how it makes life at Google
> > easier for you? Or someone else can enlighten me?
> >
> > I don't have much practical experience with namespaces, but the more
> > I think about it the more pointless it seems.  
> 
> Thanks for the input.
> 
> Sorry, I was on vacation and couldn't process this response earlier.
> 
> There are two things that this patch is providing and let me understand the
> objections well.
> 
> (a) Bring up lo by default
> (b) sysctl to protect the legacy behavior
> 
> Frankly we really dont have any legacy-behavior use case and one can
> bring it back to life by just doing 'ifdown lo' if necessary.

If use cases depending on legacy behavior exist we are just trading the
ifup in one case for an ifdown in another.

Unless we can dispel the notion that sand-boxing by lo down is a
legitimate use case IMO we're just adding complexity and growing
a sysctl for something that can be trivially handled from user space.

> Most of
> the use cases involve using networking anyways. My belief was that we
> need to protect legacy behavior and hence went lengths to add sysctl
> to protect it. If we are OK not to have it, I'm more than happy to
> remove the sysctl and just have the 3 line patch to bring loopback up.
> 
> If legacy-behavior is a concern (which I thought generally is), then
> either we can have the sysctl to have it around to protect it (the
> current implementation) but if we prefer to have kernel-command-line
> instead of sysctl that defaults to legacy behavior but if provided, we
> can set it UP by default during boot (or the other way around)?
> 
> My primary motive is (a) while (b) is just a side-effect which we can
> get away if deemed unnecessary.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next] net-loopback: allow lo dev initial state to be controlled
  2020-12-02  2:38                           ` Jakub Kicinski
@ 2020-12-02 20:53                             ` Mahesh Bandewar (महेश बंडेवार)
  0 siblings, 0 replies; 25+ messages in thread
From: Mahesh Bandewar (महेश बंडेवार) @ 2020-12-02 20:53 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: nicolas.dichtel, David Ahern, Ido Schimmel, Jian Yang,
	David Miller, linux-netdev, Jian Yang, Eric Dumazet

On Tue, Dec 1, 2020 at 6:38 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Tue, 1 Dec 2020 12:24:38 -0800 Mahesh Bandewar (महेश बंडेवार) wrote:
> > On Thu, Nov 19, 2020 at 8:56 PM Jakub Kicinski <kuba@kernel.org> wrote:
> > > Do you have more details on what the use cases are that expect no
> > > networking?
> > >
> > > TBH I don't get the utility of this knob. If you want to write vaguely
> > > portable software you have to assume the knob won't be useful, because
> > > either (a) kernel may be old, or (b) you shouldn't assume to own the
> > > sysctls and this is a global one (what if an application spawns that
> > > expects legacy behavior?)
> > >
> > > And if you have to check for those two things you're gonna write more
> > > code than just ifuping lo would be.
> > >
> > > Maybe you can shed some more light on how it makes life at Google
> > > easier for you? Or someone else can enlighten me?
> > >
> > > I don't have much practical experience with namespaces, but the more
> > > I think about it the more pointless it seems.
> >
> > Thanks for the input.
> >
> > Sorry, I was on vacation and couldn't process this response earlier.
> >
> > There are two things that this patch is providing and let me understand the
> > objections well.
> >
> > (a) Bring up lo by default
> > (b) sysctl to protect the legacy behavior
> >
> > Frankly we really dont have any legacy-behavior use case and one can
> > bring it back to life by just doing 'ifdown lo' if necessary.
>
> If use cases depending on legacy behavior exist we are just trading the
> ifup in one case for an ifdown in another.
>
Yes, I would agree with this if the use-cases are 50/50 but I would
say it's more like 99/1 distribution (99% of the time if not higher
times lo is required to be UP and probably 1% of the time or less it's
 down)

> Unless we can dispel the notion that sand-boxing by lo down is a
> legitimate use case IMO we're just adding complexity and growing
> a sysctl for something that can be trivially handled from user space.
>
OK, I can remove the sysctl and just have the 3 line patch.



> > Most of
> > the use cases involve using networking anyways. My belief was that we
> > need to protect legacy behavior and hence went lengths to add sysctl
> > to protect it. If we are OK not to have it, I'm more than happy to
> > remove the sysctl and just have the 3 line patch to bring loopback up.
> >
> > If legacy-behavior is a concern (which I thought generally is), then
> > either we can have the sysctl to have it around to protect it (the
> > current implementation) but if we prefer to have kernel-command-line
> > instead of sysctl that defaults to legacy behavior but if provided, we
> > can set it UP by default during boot (or the other way around)?
> >
> > My primary motive is (a) while (b) is just a side-effect which we can
> > get away if deemed unnecessary.
>

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2020-12-02 20:55 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-11 20:43 [PATCH net-next] net-loopback: allow lo dev initial state to be controlled Jian Yang
2020-11-12 16:08 ` Andrew Lunn
2020-11-12 19:54   ` Dan Williams
2020-11-14 18:17 ` Jakub Kicinski
2020-11-16 20:02   ` Mahesh Bandewar (महेश बंडेवार)
2020-11-16 20:17     ` Jakub Kicinski
2020-11-16 20:50       ` Mahesh Bandewar (महेश बंडेवार)
2020-11-16 21:20         ` Jakub Kicinski
2020-11-16 21:42           ` Mahesh Bandewar (महेश बंडेवार)
2020-11-16 20:34     ` Jakub Kicinski
2020-11-16 21:03       ` Mahesh Bandewar (महेश बंडेवार)
2020-11-17 17:18         ` Ido Schimmel
2020-11-17 20:53           ` Mahesh Bandewar (महेश बंडेवार)
2020-11-18  1:12             ` David Ahern
2020-11-18 16:58               ` Nicolas Dichtel
2020-11-18 17:39                 ` Mahesh Bandewar (महेश बंडेवार)
2020-11-18 18:04                   ` David Ahern
2020-11-18 19:54                     ` Mahesh Bandewar (महेश बंडेवार)
2020-11-19  8:03                   ` Nicolas Dichtel
2020-11-20  3:55                     ` Mahesh Bandewar (महेश बंडेवार)
2020-11-20  4:56                       ` Jakub Kicinski
2020-12-01 20:24                         ` Mahesh Bandewar (महेश बंडेवार)
2020-12-02  2:38                           ` Jakub Kicinski
2020-12-02 20:53                             ` Mahesh Bandewar (महेश बंडेवार)
2020-11-17  4:50 ` kernel test robot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).