* [PATCH net v6] failover: allow name change on IFF_UP slave interfaces
@ 2019-04-03  4:52 Si-Wei Liu
  2019-04-05 20:40 ` si-wei liu
                   ` (4 more replies)
  0 siblings, 5 replies; 17+ messages in thread
From: Si-Wei Liu @ 2019-04-03  4:52 UTC (permalink / raw)
  To: mst, sridhar.samudrala, stephen, davem, kubakici,
	alexander.duyck, jiri, netdev, virtualization
  Cc: liran.alon, boris.ostrovsky, vijay.balakrishna, si-wei liu

When a netdev appears through hot plug and is then enslaved by a failover
master that is already up and running, the slave is opened right away
after being enslaved. Today there is a race: userspace (udev) may fail
to rename the slave if the kernel (net_failover) opens the slave before
the userspace rename happens. Unlike bond or team, the primary slave of
failover can't be renamed by userspace ahead of time, since the
kernel-initiated auto-enslavement is unable to be, or rather is never
meant to be, synchronized with the rename request from userspace.

Failover slave interfaces are not designed to be operated directly by
userspace apps: IP configuration, filter rules for network traffic,
etc. should all be done on the master interface. In general, userspace
apps only care about the name of the master interface, while slave
names are less important as long as admin users can see reliable names
that may carry other information describing the netdev. For example,
they can infer that "ens3nsby" is a standby slave of "ens3", while for
a name like "eth0" they can't tell which master it belongs to.

Historically the name of an IFF_UP interface can't be changed, because
there might be admin scripts or management software already relying on
this behavior and assuming that the slave name can't be changed once
UP. But failover is special: with the in-kernel auto-enslavement
mechanism, the userspace expectation for device enumeration and
bring-up order is already broken. Previously, initramfs and various
userspace config tools were modified to bypass failover slaves because
of auto-enslavement and duplicate MAC addresses. Similarly, if users
care about seeing reliable slave names, the new type of failover slave
needs to be handled specifically in userspace anyway.

It's less risky to lift the rename restriction on a failover slave
which is already UP. Although this change may break userspace
components (most likely configuration scripts or management software)
that assume the slave name can't be changed while UP, that is a
relatively limited and controllable set among all userspace
components, and it can be fixed specifically to listen for the rename
and/or link down/up events on failover slaves. Userspace components
interacting with slaves are expected to be changed to operate on the
failover master interface instead, as the failover slave is dynamic in
nature and may come and go at any point. The goal is to make the role
of failover slaves less relevant, and userspace components should only
deal with the failover master in the long run.
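
The rename rule argued for above can be sketched as a small userspace
model. This is only an illustration, not kernel code: struct
model_netdev, model_change_name() and the MODEL_* constants are
hypothetical stand-ins. Only the rule itself (a running device is
refused with -EBUSY unless it carries the live-name-change flag) and
the 1<<30 bit position are taken from the patch below.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Userspace model only; these are not the kernel definitions. */
#define MODEL_IFF_UP               (1u << 0)
#define MODEL_IFF_LIVE_NAME_CHANGE (1u << 30)
#define MODEL_IFNAMSIZ             16

struct model_netdev {
	char name[MODEL_IFNAMSIZ];
	unsigned int flags;       /* stands in for dev->flags */
	unsigned int priv_flags;  /* stands in for dev->priv_flags */
};

/* Models the gate in the patched dev_change_name(): a running device
 * may only be renamed if it carries the live-name-change flag.
 */
static int model_change_name(struct model_netdev *dev, const char *newname)
{
	assert(dev != NULL && newname != NULL);

	if ((dev->flags & MODEL_IFF_UP) &&
	    !(dev->priv_flags & MODEL_IFF_LIVE_NAME_CHANGE))
		return -EBUSY;

	/* Simplified name validation: non-empty and fits the buffer. */
	if (newname[0] == '\0' || strlen(newname) >= MODEL_IFNAMSIZ)
		return -EINVAL;

	strcpy(dev->name, newname);
	return 0;
}
```

In this model, a device that is already UP and has no flag set rejects
a rename from "eth0" to "ens3nsby" with -EBUSY, which corresponds to
udev losing the race. Once MODEL_IFF_LIVE_NAME_CHANGE is set, as the
patch does for failover slaves at registration, the same call succeeds.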

Fixes: 30c8bd5aa8b2 ("net: Introduce generic failover module")
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Reviewed-by: Liran Alon <liran.alon@oracle.com>

--
v1 -> v2:
- Drop configurable module parameter (Sridhar)

v2 -> v3:
- Drop additional IFF_SLAVE_RENAME_OK flag (Sridhar)
- Send down and up events around rename (Michael S. Tsirkin)

v3 -> v4:
- Simplify notification to be sent (Stephen Hemminger)

v4 -> v5:
- Sync up code with latest net-next (Sridhar)
- Use proper structure initialization (Stephen, Jiri)

v5 -> v6:
- Make the property of live name change a generic flag (Stephen)
---
 include/linux/netdevice.h |  3 +++
 net/core/dev.c            | 25 ++++++++++++++++++++++++-
 net/core/failover.c       |  6 +++---
 3 files changed, 30 insertions(+), 4 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 78f5ec4e..ea9a63f 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1498,6 +1498,7 @@ struct net_device_ops {
  * @IFF_FAILOVER: device is a failover master device
  * @IFF_FAILOVER_SLAVE: device is lower dev of a failover master device
  * @IFF_L3MDEV_RX_HANDLER: only invoke the rx handler of L3 master device
+ * @IFF_LIVE_NAME_CHANGE: rename is allowed while device is running
  */
 enum netdev_priv_flags {
 	IFF_802_1Q_VLAN			= 1<<0,
@@ -1530,6 +1531,7 @@ enum netdev_priv_flags {
 	IFF_FAILOVER			= 1<<27,
 	IFF_FAILOVER_SLAVE		= 1<<28,
 	IFF_L3MDEV_RX_HANDLER		= 1<<29,
+	IFF_LIVE_NAME_CHANGE		= 1<<30,
 };
 
 #define IFF_802_1Q_VLAN			IFF_802_1Q_VLAN
@@ -1561,6 +1563,7 @@ enum netdev_priv_flags {
 #define IFF_FAILOVER			IFF_FAILOVER
 #define IFF_FAILOVER_SLAVE		IFF_FAILOVER_SLAVE
 #define IFF_L3MDEV_RX_HANDLER		IFF_L3MDEV_RX_HANDLER
+#define IFF_LIVE_NAME_CHANGE		IFF_LIVE_NAME_CHANGE
 
 /**
  *	struct net_device - The DEVICE structure.
diff --git a/net/core/dev.c b/net/core/dev.c
index 9823b77..48341d5 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1185,7 +1185,21 @@ int dev_change_name(struct net_device *dev, const char *newname)
 	BUG_ON(!dev_net(dev));
 
 	net = dev_net(dev);
-	if (dev->flags & IFF_UP)
+
+	/* Some auto-enslaved devices e.g. failover slaves are
+	 * special, as userspace might rename the device after
+	 * the interface had been brought up and running since
+	 * the point kernel initiated auto-enslavement. Allow
+	 * live name change even when these slave devices are
+	 * up and running.
+	 *
+	 * Typically, users of these auto-enslaving devices
+	 * don't actually care about slave name change, as
+	 * they are supposed to operate on master interface
+	 * directly.
+	 */
+	if (dev->flags & IFF_UP &&
+	    likely(!(dev->priv_flags & IFF_LIVE_NAME_CHANGE)))
 		return -EBUSY;
 
 	write_seqcount_begin(&devnet_rename_seq);
@@ -1232,6 +1246,15 @@ int dev_change_name(struct net_device *dev, const char *newname)
 	hlist_add_head_rcu(&dev->name_hlist, dev_name_hash(net, dev->name));
 	write_unlock_bh(&dev_base_lock);
 
+	if (unlikely(dev->flags & IFF_UP)) {
+		struct netdev_notifier_change_info change_info = {
+			.info.dev = dev,
+		};
+
+		call_netdevice_notifiers_info(NETDEV_CHANGE,
+					      &change_info.info);
+	}
+
 	ret = call_netdevice_notifiers(NETDEV_CHANGENAME, dev);
 	ret = notifier_to_errno(ret);
 
diff --git a/net/core/failover.c b/net/core/failover.c
index 4a92a98..b5cd3c7 100644
--- a/net/core/failover.c
+++ b/net/core/failover.c
@@ -80,14 +80,14 @@ static int failover_slave_register(struct net_device *slave_dev)
 		goto err_upper_link;
 	}
 
-	slave_dev->priv_flags |= IFF_FAILOVER_SLAVE;
+	slave_dev->priv_flags |= (IFF_FAILOVER_SLAVE | IFF_LIVE_NAME_CHANGE);
 
 	if (fops && fops->slave_register &&
 	    !fops->slave_register(slave_dev, failover_dev))
 		return NOTIFY_OK;
 
 	netdev_upper_dev_unlink(slave_dev, failover_dev);
-	slave_dev->priv_flags &= ~IFF_FAILOVER_SLAVE;
+	slave_dev->priv_flags &= ~(IFF_FAILOVER_SLAVE | IFF_LIVE_NAME_CHANGE);
 err_upper_link:
 	netdev_rx_handler_unregister(slave_dev);
 done:
@@ -121,7 +121,7 @@ int failover_slave_unregister(struct net_device *slave_dev)
 
 	netdev_rx_handler_unregister(slave_dev);
 	netdev_upper_dev_unlink(slave_dev, failover_dev);
-	slave_dev->priv_flags &= ~IFF_FAILOVER_SLAVE;
+	slave_dev->priv_flags &= ~(IFF_FAILOVER_SLAVE | IFF_LIVE_NAME_CHANGE);
 
 	if (fops && fops->slave_unregister &&
 	    !fops->slave_unregister(slave_dev, failover_dev))
-- 
1.8.3.1


* Re: [PATCH net v6] failover: allow name change on IFF_UP slave interfaces
  2019-04-03  4:52 [PATCH net v6] failover: allow name change on IFF_UP slave interfaces Si-Wei Liu
@ 2019-04-05 20:40 ` si-wei liu
  2019-04-05 21:28 ` Michael S. Tsirkin
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 17+ messages in thread
From: si-wei liu @ 2019-04-05 20:40 UTC (permalink / raw)
  To: mst, sridhar.samudrala, stephen, davem, kubakici,
	alexander.duyck, jiri, netdev, virtualization
  Cc: liran.alon, boris.ostrovsky, vijay.balakrishna

A gentle reminder. This patch still needs a reviewer to acknowledge the
proposed change.

-Siwei

On 4/2/2019 9:52 PM, Si-Wei Liu wrote:
> When a netdev appears through hot plug then gets enslaved by a failover
> master that is already up and running, the slave will be opened
> right away after getting enslaved. Today there's a race that userspace
> (udev) may fail to rename the slave if the kernel (net_failover)
> opens the slave earlier than when the userspace rename happens.
> Unlike bond or team, the primary slave of failover can't be renamed by
> userspace ahead of time, since the kernel initiated auto-enslavement is
> unable to, or rather, is never meant to be synchronized with the rename
> request from userspace.
>
> As the failover slave interfaces are not designed to be operated
> directly by userspace apps: IP configuration, filter rules with
> regard to network traffic passing and etc., should all be done on master
> interface. In general, userspace apps only care about the
> name of master interface, while slave names are less important as long
> as admin users can see reliable names that may carry
> other information describing the netdev. For e.g., they can infer that
> "ens3nsby" is a standby slave of "ens3", while for a
> name like "eth0" they can't tell which master it belongs to.
>
> Historically the name of IFF_UP interface can't be changed because
> there might be admin script or management software that is already
> relying on such behavior and assumes that the slave name can't be
> changed once UP. But failover is special: with the in-kernel
> auto-enslavement mechanism, the userspace expectation for device
> enumeration and bring-up order is already broken. Previously initramfs
> and various userspace config tools were modified to bypass failover
> slaves because of auto-enslavement and duplicate MAC address. Similarly,
> in case that users care about seeing reliable slave name, the new type
> of failover slaves needs to be taken care of specifically in userspace
> anyway.
>
> It's less risky to lift up the rename restriction on failover slave
> which is already UP. Although it's possible this change may potentially
> break userspace component (most likely configuration scripts or
> management software) that assumes slave name can't be changed while
> UP, it's relatively a limited and controllable set among all userspace
> components, which can be fixed specifically to listen for the rename
> and/or link down/up events on failover slaves. Userspace component
> interacting with slaves is expected to be changed to operate on failover
> master interface instead, as the failover slave is dynamic in nature
> which may come and go at any point.  The goal is to make the role of
> failover slaves less relevant, and userspace components should only
> deal with failover master in the long run.
>
> Fixes: 30c8bd5aa8b2 ("net: Introduce generic failover module")
> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
> Reviewed-by: Liran Alon <liran.alon@oracle.com>
>
> --
> v1 -> v2:
> - Drop configurable module parameter (Sridhar)
>
> v2 -> v3:
> - Drop additional IFF_SLAVE_RENAME_OK flag (Sridhar)
> - Send down and up events around rename (Michael S. Tsirkin)
>
> v3 -> v4:
> - Simplify notification to be sent (Stephen Hemminger)
>
> v4 -> v5:
> - Sync up code with latest net-next (Sridhar)
> - Use proper structure initialization (Stephen, Jiri)
>
> v5 -> v6:
> - Make the property of live name change a generic flag (Stephen)
> ---
>   include/linux/netdevice.h |  3 +++
>   net/core/dev.c            | 25 ++++++++++++++++++++++++-
>   net/core/failover.c       |  6 +++---
>   3 files changed, 30 insertions(+), 4 deletions(-)
>
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 78f5ec4e..ea9a63f 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -1498,6 +1498,7 @@ struct net_device_ops {
>    * @IFF_FAILOVER: device is a failover master device
>    * @IFF_FAILOVER_SLAVE: device is lower dev of a failover master device
>    * @IFF_L3MDEV_RX_HANDLER: only invoke the rx handler of L3 master device
> + * @IFF_LIVE_NAME_CHANGE: rename is allowed while device is running
>    */
>   enum netdev_priv_flags {
>   	IFF_802_1Q_VLAN			= 1<<0,
> @@ -1530,6 +1531,7 @@ enum netdev_priv_flags {
>   	IFF_FAILOVER			= 1<<27,
>   	IFF_FAILOVER_SLAVE		= 1<<28,
>   	IFF_L3MDEV_RX_HANDLER		= 1<<29,
> +	IFF_LIVE_NAME_CHANGE		= 1<<30,
>   };
>   
>   #define IFF_802_1Q_VLAN			IFF_802_1Q_VLAN
> @@ -1561,6 +1563,7 @@ enum netdev_priv_flags {
>   #define IFF_FAILOVER			IFF_FAILOVER
>   #define IFF_FAILOVER_SLAVE		IFF_FAILOVER_SLAVE
>   #define IFF_L3MDEV_RX_HANDLER		IFF_L3MDEV_RX_HANDLER
> +#define IFF_LIVE_NAME_CHANGE		IFF_LIVE_NAME_CHANGE
>   
>   /**
>    *	struct net_device - The DEVICE structure.
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 9823b77..48341d5 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -1185,7 +1185,21 @@ int dev_change_name(struct net_device *dev, const char *newname)
>   	BUG_ON(!dev_net(dev));
>   
>   	net = dev_net(dev);
> -	if (dev->flags & IFF_UP)
> +
> +	/* Some auto-enslaved devices e.g. failover slaves are
> +	 * special, as userspace might rename the device after
> +	 * the interface had been brought up and running since
> +	 * the point kernel initiated auto-enslavement. Allow
> +	 * live name change even when these slave devices are
> +	 * up and running.
> +	 *
> +	 * Typically, users of these auto-enslaving devices
> +	 * don't actually care about slave name change, as
> +	 * they are supposed to operate on master interface
> +	 * directly.
> +	 */
> +	if (dev->flags & IFF_UP &&
> +	    likely(!(dev->priv_flags & IFF_LIVE_NAME_CHANGE)))
>   		return -EBUSY;
>   
>   	write_seqcount_begin(&devnet_rename_seq);
> @@ -1232,6 +1246,15 @@ int dev_change_name(struct net_device *dev, const char *newname)
>   	hlist_add_head_rcu(&dev->name_hlist, dev_name_hash(net, dev->name));
>   	write_unlock_bh(&dev_base_lock);
>   
> +	if (unlikely(dev->flags & IFF_UP)) {
> +		struct netdev_notifier_change_info change_info = {
> +			.info.dev = dev,
> +		};
> +
> +		call_netdevice_notifiers_info(NETDEV_CHANGE,
> +					      &change_info.info);
> +	}
> +
>   	ret = call_netdevice_notifiers(NETDEV_CHANGENAME, dev);
>   	ret = notifier_to_errno(ret);
>   
> diff --git a/net/core/failover.c b/net/core/failover.c
> index 4a92a98..b5cd3c7 100644
> --- a/net/core/failover.c
> +++ b/net/core/failover.c
> @@ -80,14 +80,14 @@ static int failover_slave_register(struct net_device *slave_dev)
>   		goto err_upper_link;
>   	}
>   
> -	slave_dev->priv_flags |= IFF_FAILOVER_SLAVE;
> +	slave_dev->priv_flags |= (IFF_FAILOVER_SLAVE | IFF_LIVE_NAME_CHANGE);
>   
>   	if (fops && fops->slave_register &&
>   	    !fops->slave_register(slave_dev, failover_dev))
>   		return NOTIFY_OK;
>   
>   	netdev_upper_dev_unlink(slave_dev, failover_dev);
> -	slave_dev->priv_flags &= ~IFF_FAILOVER_SLAVE;
> +	slave_dev->priv_flags &= ~(IFF_FAILOVER_SLAVE | IFF_LIVE_NAME_CHANGE);
>   err_upper_link:
>   	netdev_rx_handler_unregister(slave_dev);
>   done:
> @@ -121,7 +121,7 @@ int failover_slave_unregister(struct net_device *slave_dev)
>   
>   	netdev_rx_handler_unregister(slave_dev);
>   	netdev_upper_dev_unlink(slave_dev, failover_dev);
> -	slave_dev->priv_flags &= ~IFF_FAILOVER_SLAVE;
> +	slave_dev->priv_flags &= ~(IFF_FAILOVER_SLAVE | IFF_LIVE_NAME_CHANGE);
>   
>   	if (fops && fops->slave_unregister &&
>   	    !fops->slave_unregister(slave_dev, failover_dev))


* Re: [PATCH net v6] failover: allow name change on IFF_UP slave interfaces
  2019-04-03  4:52 [PATCH net v6] failover: allow name change on IFF_UP slave interfaces Si-Wei Liu
  2019-04-05 20:40 ` si-wei liu
  2019-04-05 21:28 ` Michael S. Tsirkin
@ 2019-04-05 21:28 ` Michael S. Tsirkin
  2019-04-05 21:47   ` Stephen Hemminger
  2019-04-05 21:47   ` Stephen Hemminger
  2019-04-05 21:47 ` Stephen Hemminger
  2019-04-05 21:47 ` Stephen Hemminger
  4 siblings, 2 replies; 17+ messages in thread
From: Michael S. Tsirkin @ 2019-04-05 21:28 UTC (permalink / raw)
  To: Si-Wei Liu
  Cc: sridhar.samudrala, stephen, davem, kubakici, alexander.duyck,
	jiri, netdev, virtualization, liran.alon, boris.ostrovsky,
	vijay.balakrishna

On Wed, Apr 03, 2019 at 12:52:47AM -0400, Si-Wei Liu wrote:
> When a netdev appears through hot plug then gets enslaved by a failover
> master that is already up and running, the slave will be opened
> right away after getting enslaved. Today there's a race that userspace
> (udev) may fail to rename the slave if the kernel (net_failover)
> opens the slave earlier than when the userspace rename happens.
> Unlike bond or team, the primary slave of failover can't be renamed by
> userspace ahead of time, since the kernel initiated auto-enslavement is
> unable to, or rather, is never meant to be synchronized with the rename
> request from userspace.
> 
> As the failover slave interfaces are not designed to be operated
> directly by userspace apps: IP configuration, filter rules with
> regard to network traffic passing and etc., should all be done on master
> interface. In general, userspace apps only care about the
> name of master interface, while slave names are less important as long
> as admin users can see reliable names that may carry
> other information describing the netdev. For e.g., they can infer that
> "ens3nsby" is a standby slave of "ens3", while for a
> name like "eth0" they can't tell which master it belongs to.
> 
> Historically the name of IFF_UP interface can't be changed because
> there might be admin script or management software that is already
> relying on such behavior and assumes that the slave name can't be
> changed once UP. But failover is special: with the in-kernel
> auto-enslavement mechanism, the userspace expectation for device
> enumeration and bring-up order is already broken. Previously initramfs
> and various userspace config tools were modified to bypass failover
> slaves because of auto-enslavement and duplicate MAC address. Similarly,
> in case that users care about seeing reliable slave name, the new type
> of failover slaves needs to be taken care of specifically in userspace
> anyway.
> 
> It's less risky to lift up the rename restriction on failover slave
> which is already UP. Although it's possible this change may potentially
> break userspace component (most likely configuration scripts or
> management software) that assumes slave name can't be changed while
> UP, it's relatively a limited and controllable set among all userspace
> components, which can be fixed specifically to listen for the rename
> and/or link down/up events on failover slaves. Userspace component
> interacting with slaves is expected to be changed to operate on failover
> master interface instead, as the failover slave is dynamic in nature
> which may come and go at any point.  The goal is to make the role of
> failover slaves less relevant, and userspace components should only
> deal with failover master in the long run.
> 
> Fixes: 30c8bd5aa8b2 ("net: Introduce generic failover module")
> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
> Reviewed-by: Liran Alon <liran.alon@oracle.com>

Acked-by: Michael S. Tsirkin <mst@redhat.com>

Stephen, are you happy with this approach?

> --
> v1 -> v2:
> - Drop configurable module parameter (Sridhar)
> 
> v2 -> v3:
> - Drop additional IFF_SLAVE_RENAME_OK flag (Sridhar)
> - Send down and up events around rename (Michael S. Tsirkin)
> 
> v3 -> v4:
> - Simplify notification to be sent (Stephen Hemminger)
> 
> v4 -> v5:
> - Sync up code with latest net-next (Sridhar)
> - Use proper structure initialization (Stephen, Jiri)
> 
> v5 -> v6:
> - Make the property of live name change a generic flag (Stephen)
> ---
>  include/linux/netdevice.h |  3 +++
>  net/core/dev.c            | 25 ++++++++++++++++++++++++-
>  net/core/failover.c       |  6 +++---
>  3 files changed, 30 insertions(+), 4 deletions(-)
> 
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 78f5ec4e..ea9a63f 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -1498,6 +1498,7 @@ struct net_device_ops {
>   * @IFF_FAILOVER: device is a failover master device
>   * @IFF_FAILOVER_SLAVE: device is lower dev of a failover master device
>   * @IFF_L3MDEV_RX_HANDLER: only invoke the rx handler of L3 master device
> + * @IFF_LIVE_NAME_CHANGE: rename is allowed while device is running
>   */
>  enum netdev_priv_flags {
>  	IFF_802_1Q_VLAN			= 1<<0,
> @@ -1530,6 +1531,7 @@ enum netdev_priv_flags {
>  	IFF_FAILOVER			= 1<<27,
>  	IFF_FAILOVER_SLAVE		= 1<<28,
>  	IFF_L3MDEV_RX_HANDLER		= 1<<29,
> +	IFF_LIVE_NAME_CHANGE		= 1<<30,
>  };
>  
>  #define IFF_802_1Q_VLAN			IFF_802_1Q_VLAN
> @@ -1561,6 +1563,7 @@ enum netdev_priv_flags {
>  #define IFF_FAILOVER			IFF_FAILOVER
>  #define IFF_FAILOVER_SLAVE		IFF_FAILOVER_SLAVE
>  #define IFF_L3MDEV_RX_HANDLER		IFF_L3MDEV_RX_HANDLER
> +#define IFF_LIVE_NAME_CHANGE		IFF_LIVE_NAME_CHANGE
>  
>  /**
>   *	struct net_device - The DEVICE structure.
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 9823b77..48341d5 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -1185,7 +1185,21 @@ int dev_change_name(struct net_device *dev, const char *newname)
>  	BUG_ON(!dev_net(dev));
>  
>  	net = dev_net(dev);
> -	if (dev->flags & IFF_UP)
> +
> +	/* Some auto-enslaved devices e.g. failover slaves are
> +	 * special, as userspace might rename the device after
> +	 * the interface had been brought up and running since
> +	 * the point kernel initiated auto-enslavement. Allow
> +	 * live name change even when these slave devices are
> +	 * up and running.
> +	 *
> +	 * Typically, users of these auto-enslaving devices
> +	 * don't actually care about slave name change, as
> +	 * they are supposed to operate on master interface
> +	 * directly.
> +	 */
> +	if (dev->flags & IFF_UP &&
> +	    likely(!(dev->priv_flags & IFF_LIVE_NAME_CHANGE)))
>  		return -EBUSY;
>  
>  	write_seqcount_begin(&devnet_rename_seq);
> @@ -1232,6 +1246,15 @@ int dev_change_name(struct net_device *dev, const char *newname)
>  	hlist_add_head_rcu(&dev->name_hlist, dev_name_hash(net, dev->name));
>  	write_unlock_bh(&dev_base_lock);
>  
> +	if (unlikely(dev->flags & IFF_UP)) {
> +		struct netdev_notifier_change_info change_info = {
> +			.info.dev = dev,
> +		};
> +
> +		call_netdevice_notifiers_info(NETDEV_CHANGE,
> +					      &change_info.info);
> +	}
> +
>  	ret = call_netdevice_notifiers(NETDEV_CHANGENAME, dev);
>  	ret = notifier_to_errno(ret);
>  
> diff --git a/net/core/failover.c b/net/core/failover.c
> index 4a92a98..b5cd3c7 100644
> --- a/net/core/failover.c
> +++ b/net/core/failover.c
> @@ -80,14 +80,14 @@ static int failover_slave_register(struct net_device *slave_dev)
>  		goto err_upper_link;
>  	}
>  
> -	slave_dev->priv_flags |= IFF_FAILOVER_SLAVE;
> +	slave_dev->priv_flags |= (IFF_FAILOVER_SLAVE | IFF_LIVE_NAME_CHANGE);
>  
>  	if (fops && fops->slave_register &&
>  	    !fops->slave_register(slave_dev, failover_dev))
>  		return NOTIFY_OK;
>  
>  	netdev_upper_dev_unlink(slave_dev, failover_dev);
> -	slave_dev->priv_flags &= ~IFF_FAILOVER_SLAVE;
> +	slave_dev->priv_flags &= ~(IFF_FAILOVER_SLAVE | IFF_LIVE_NAME_CHANGE);
>  err_upper_link:
>  	netdev_rx_handler_unregister(slave_dev);
>  done:
> @@ -121,7 +121,7 @@ int failover_slave_unregister(struct net_device *slave_dev)
>  
>  	netdev_rx_handler_unregister(slave_dev);
>  	netdev_upper_dev_unlink(slave_dev, failover_dev);
> -	slave_dev->priv_flags &= ~IFF_FAILOVER_SLAVE;
> +	slave_dev->priv_flags &= ~(IFF_FAILOVER_SLAVE | IFF_LIVE_NAME_CHANGE);
>  
>  	if (fops && fops->slave_unregister &&
>  	    !fops->slave_unregister(slave_dev, failover_dev))
> -- 
> 1.8.3.1

* Re: [PATCH net v6] failover: allow name change on IFF_UP slave interfaces
  2019-04-03  4:52 [PATCH net v6] failover: allow name change on IFF_UP slave interfaces Si-Wei Liu
  2019-04-05 20:40 ` si-wei liu
@ 2019-04-05 21:28 ` Michael S. Tsirkin
  2019-04-05 21:28 ` Michael S. Tsirkin
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 17+ messages in thread
From: Michael S. Tsirkin @ 2019-04-05 21:28 UTC (permalink / raw)
  To: Si-Wei Liu
  Cc: jiri, kubakici, sridhar.samudrala, alexander.duyck,
	virtualization, liran.alon, netdev, boris.ostrovsky, davem

On Wed, Apr 03, 2019 at 12:52:47AM -0400, Si-Wei Liu wrote:
> When a netdev appears through hot plug then gets enslaved by a failover
> master that is already up and running, the slave will be opened
> right away after getting enslaved. Today there's a race that userspace
> (udev) may fail to rename the slave if the kernel (net_failover)
> opens the slave earlier than when the userspace rename happens.
> Unlike bond or team, the primary slave of failover can't be renamed by
> userspace ahead of time, since the kernel initiated auto-enslavement is
> unable to, or rather, is never meant to be synchronized with the rename
> request from userspace.
> 
> As the failover slave interfaces are not designed to be operated
> directly by userspace apps: IP configuration, filter rules with
> regard to network traffic passing and etc., should all be done on master
> interface. In general, userspace apps only care about the
> name of master interface, while slave names are less important as long
> as admin users can see reliable names that may carry
> other information describing the netdev. For e.g., they can infer that
> "ens3nsby" is a standby slave of "ens3", while for a
> name like "eth0" they can't tell which master it belongs to.
> 
> Historically the name of IFF_UP interface can't be changed because
> there might be admin script or management software that is already
> relying on such behavior and assumes that the slave name can't be
> changed once UP. But failover is special: with the in-kernel
> auto-enslavement mechanism, the userspace expectation for device
> enumeration and bring-up order is already broken. Previously initramfs
> and various userspace config tools were modified to bypass failover
> slaves because of auto-enslavement and duplicate MAC address. Similarly,
> in case that users care about seeing reliable slave name, the new type
> of failover slaves needs to be taken care of specifically in userspace
> anyway.
> 
> It's less risky to lift up the rename restriction on failover slave
> which is already UP. Although it's possible this change may potentially
> break userspace component (most likely configuration scripts or
> management software) that assumes slave name can't be changed while
> UP, it's relatively a limited and controllable set among all userspace
> components, which can be fixed specifically to listen for the rename
> and/or link down/up events on failover slaves. Userspace component
> interacting with slaves is expected to be changed to operate on failover
> master interface instead, as the failover slave is dynamic in nature
> which may come and go at any point.  The goal is to make the role of
> failover slaves less relevant, and userspace components should only
> deal with failover master in the long run.
> 
> Fixes: 30c8bd5aa8b2 ("net: Introduce generic failover module")
> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
> Reviewed-by: Liran Alon <liran.alon@oracle.com>

Acked-by: Michael S. Tsirkin <mst@redhat.com>

Stephen are you happy with this approach?

> --
> v1 -> v2:
> - Drop configurable module parameter (Sridhar)
> 
> v2 -> v3:
> - Drop additional IFF_SLAVE_RENAME_OK flag (Sridhar)
> - Send down and up events around rename (Michael S. Tsirkin)
> 
> v3 -> v4:
> - Simplify notification to be sent (Stephen Hemminger)
> 
> v4 -> v5:
> - Sync up code with latest net-next (Sridhar)
> - Use proper structure initialization (Stephen, Jiri)
> 
> v5 -> v6:
> - Make the property of live name change a generic flag (Stephen)
> ---
>  include/linux/netdevice.h |  3 +++
>  net/core/dev.c            | 25 ++++++++++++++++++++++++-
>  net/core/failover.c       |  6 +++---
>  3 files changed, 30 insertions(+), 4 deletions(-)
> 
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 78f5ec4e..ea9a63f 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -1498,6 +1498,7 @@ struct net_device_ops {
>   * @IFF_FAILOVER: device is a failover master device
>   * @IFF_FAILOVER_SLAVE: device is lower dev of a failover master device
>   * @IFF_L3MDEV_RX_HANDLER: only invoke the rx handler of L3 master device
> + * @IFF_LIVE_NAME_CHANGE: rename is allowed while device is running
>   */
>  enum netdev_priv_flags {
>  	IFF_802_1Q_VLAN			= 1<<0,
> @@ -1530,6 +1531,7 @@ enum netdev_priv_flags {
>  	IFF_FAILOVER			= 1<<27,
>  	IFF_FAILOVER_SLAVE		= 1<<28,
>  	IFF_L3MDEV_RX_HANDLER		= 1<<29,
> +	IFF_LIVE_NAME_CHANGE		= 1<<30,
>  };
>  
>  #define IFF_802_1Q_VLAN			IFF_802_1Q_VLAN
> @@ -1561,6 +1563,7 @@ enum netdev_priv_flags {
>  #define IFF_FAILOVER			IFF_FAILOVER
>  #define IFF_FAILOVER_SLAVE		IFF_FAILOVER_SLAVE
>  #define IFF_L3MDEV_RX_HANDLER		IFF_L3MDEV_RX_HANDLER
> +#define IFF_LIVE_NAME_CHANGE		IFF_LIVE_NAME_CHANGE
>  
>  /**
>   *	struct net_device - The DEVICE structure.
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 9823b77..48341d5 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -1185,7 +1185,21 @@ int dev_change_name(struct net_device *dev, const char *newname)
>  	BUG_ON(!dev_net(dev));
>  
>  	net = dev_net(dev);
> -	if (dev->flags & IFF_UP)
> +
> +	/* Some auto-enslaved devices e.g. failover slaves are
> +	 * special, as userspace might rename the device after
> +	 * the interface had been brought up and running since
> +	 * the point kernel initiated auto-enslavement. Allow
> +	 * live name change even when these slave devices are
> +	 * up and running.
> +	 *
> +	 * Typically, users of these auto-enslaving devices
> +	 * don't actually care about slave name change, as
> +	 * they are supposed to operate on master interface
> +	 * directly.
> +	 */
> +	if (dev->flags & IFF_UP &&
> +	    likely(!(dev->priv_flags & IFF_LIVE_NAME_CHANGE)))
>  		return -EBUSY;
>  
>  	write_seqcount_begin(&devnet_rename_seq);
> @@ -1232,6 +1246,15 @@ int dev_change_name(struct net_device *dev, const char *newname)
>  	hlist_add_head_rcu(&dev->name_hlist, dev_name_hash(net, dev->name));
>  	write_unlock_bh(&dev_base_lock);
>  
> +	if (unlikely(dev->flags & IFF_UP)) {
> +		struct netdev_notifier_change_info change_info = {
> +			.info.dev = dev,
> +		};
> +
> +		call_netdevice_notifiers_info(NETDEV_CHANGE,
> +					      &change_info.info);
> +	}
> +
>  	ret = call_netdevice_notifiers(NETDEV_CHANGENAME, dev);
>  	ret = notifier_to_errno(ret);
>  
> diff --git a/net/core/failover.c b/net/core/failover.c
> index 4a92a98..b5cd3c7 100644
> --- a/net/core/failover.c
> +++ b/net/core/failover.c
> @@ -80,14 +80,14 @@ static int failover_slave_register(struct net_device *slave_dev)
>  		goto err_upper_link;
>  	}
>  
> -	slave_dev->priv_flags |= IFF_FAILOVER_SLAVE;
> +	slave_dev->priv_flags |= (IFF_FAILOVER_SLAVE | IFF_LIVE_NAME_CHANGE);
>  
>  	if (fops && fops->slave_register &&
>  	    !fops->slave_register(slave_dev, failover_dev))
>  		return NOTIFY_OK;
>  
>  	netdev_upper_dev_unlink(slave_dev, failover_dev);
> -	slave_dev->priv_flags &= ~IFF_FAILOVER_SLAVE;
> +	slave_dev->priv_flags &= ~(IFF_FAILOVER_SLAVE | IFF_LIVE_NAME_CHANGE);
>  err_upper_link:
>  	netdev_rx_handler_unregister(slave_dev);
>  done:
> @@ -121,7 +121,7 @@ int failover_slave_unregister(struct net_device *slave_dev)
>  
>  	netdev_rx_handler_unregister(slave_dev);
>  	netdev_upper_dev_unlink(slave_dev, failover_dev);
> -	slave_dev->priv_flags &= ~IFF_FAILOVER_SLAVE;
> +	slave_dev->priv_flags &= ~(IFF_FAILOVER_SLAVE | IFF_LIVE_NAME_CHANGE);
>  
>  	if (fops && fops->slave_unregister &&
>  	    !fops->slave_unregister(slave_dev, failover_dev))
> -- 
> 1.8.3.1
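
For readers following the quoted hunks, the rename gate and the flag handling in failover.c can be modelled in plain userspace C. This is only a sketch: the SK_* flag values and the sk_* names are made up for illustration and do not match the kernel's real definitions; only the shape of the check and of the set/clear pairing is taken from the patch.

```c
#include <errno.h>

/* Illustrative bit values only; the real flags live in kernel headers. */
#define SK_IFF_UP               (1u << 0)
#define SK_IFF_FAILOVER_SLAVE   (1u << 1)
#define SK_IFF_LIVE_NAME_CHANGE (1u << 2)

/* Hypothetical stand-in for the two flag words of struct net_device. */
struct sk_netdev {
	unsigned int flags;      /* models dev->flags */
	unsigned int priv_flags; /* models dev->priv_flags */
};

/* Same shape as the gate in the hunk: refuse only if UP and not exempt. */
static int sk_rename_allowed(const struct sk_netdev *dev)
{
	if ((dev->flags & SK_IFF_UP) &&
	    !(dev->priv_flags & SK_IFF_LIVE_NAME_CHANGE))
		return -EBUSY;
	return 0;
}

/* Both flags are set together on enslavement ... */
static void sk_slave_register(struct sk_netdev *dev)
{
	dev->priv_flags |= (SK_IFF_FAILOVER_SLAVE | SK_IFF_LIVE_NAME_CHANGE);
}

/* ... and cleared together on release or on the error path. */
static void sk_slave_unregister(struct sk_netdev *dev)
{
	dev->priv_flags &= ~(SK_IFF_FAILOVER_SLAVE | SK_IFF_LIVE_NAME_CHANGE);
}
```

In this model an UP device is refused with -EBUSY until sk_slave_register() has run, and is refused again after sk_slave_unregister(), which is the lifetime the patch appears to intend for the exemption.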

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net v6] failover: allow name change on IFF_UP slave interfaces
  2019-04-03  4:52 [PATCH net v6] failover: allow name change on IFF_UP slave interfaces Si-Wei Liu
                   ` (2 preceding siblings ...)
  2019-04-05 21:28 ` Michael S. Tsirkin
@ 2019-04-05 21:47 ` Stephen Hemminger
  2019-04-05 22:01   ` Michael S. Tsirkin
                     ` (2 more replies)
  2019-04-05 21:47 ` Stephen Hemminger
  4 siblings, 3 replies; 17+ messages in thread
From: Stephen Hemminger @ 2019-04-05 21:47 UTC (permalink / raw)
  To: Si-Wei Liu
  Cc: mst, sridhar.samudrala, davem, kubakici, alexander.duyck, jiri,
	netdev, virtualization, liran.alon, boris.ostrovsky,
	vijay.balakrishna

On Wed,  3 Apr 2019 00:52:47 -0400
Si-Wei Liu <si-wei.liu@oracle.com> wrote:

>  
> +	if (unlikely(dev->flags & IFF_UP)) {
> +		struct netdev_notifier_change_info change_info = {
> +			.info.dev = dev,
> +		};
> +
> +		call_netdevice_notifiers_info(NETDEV_CHANGE,
> +					      &change_info.info);
> +	}

This notifier is not really necessary; a CHANGENAME notification
already gets sent.

NETDEV_CHANGE is used in other cases to mean that the state (flags)
has changed.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net v6] failover: allow name change on IFF_UP slave interfaces
  2019-04-05 21:28 ` Michael S. Tsirkin
  2019-04-05 21:47   ` Stephen Hemminger
@ 2019-04-05 21:47   ` Stephen Hemminger
  2019-04-06  7:21     ` si-wei liu
  1 sibling, 1 reply; 17+ messages in thread
From: Stephen Hemminger @ 2019-04-05 21:47 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Si-Wei Liu, sridhar.samudrala, davem, kubakici, alexander.duyck,
	jiri, netdev, virtualization, liran.alon, boris.ostrovsky,
	vijay.balakrishna

On Fri, 5 Apr 2019 17:28:55 -0400
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Wed, Apr 03, 2019 at 12:52:47AM -0400, Si-Wei Liu wrote:
> > When a netdev appears through hot plug then gets enslaved by a failover
> > master that is already up and running, the slave will be opened
> > right away after getting enslaved. Today there's a race that userspace
> > (udev) may fail to rename the slave if the kernel (net_failover)
> > opens the slave earlier than when the userspace rename happens.
> > Unlike bond or team, the primary slave of failover can't be renamed by
> > userspace ahead of time, since the kernel initiated auto-enslavement is
> > unable to, or rather, is never meant to be synchronized with the rename
> > request from userspace.
> > 
> > As the failover slave interfaces are not designed to be operated
> > directly by userspace apps: IP configuration, filter rules with
> > regard to network traffic passing and etc., should all be done on master
> > interface. In general, userspace apps only care about the
> > name of master interface, while slave names are less important as long
> > as admin users can see reliable names that may carry
> > other information describing the netdev. For e.g., they can infer that
> > "ens3nsby" is a standby slave of "ens3", while for a
> > name like "eth0" they can't tell which master it belongs to.
> > 
> > Historically the name of IFF_UP interface can't be changed because
> > there might be admin script or management software that is already
> > relying on such behavior and assumes that the slave name can't be
> > changed once UP. But failover is special: with the in-kernel
> > auto-enslavement mechanism, the userspace expectation for device
> > enumeration and bring-up order is already broken. Previously initramfs
> > and various userspace config tools were modified to bypass failover
> > slaves because of auto-enslavement and duplicate MAC address. Similarly,
> > in case that users care about seeing reliable slave name, the new type
> > of failover slaves needs to be taken care of specifically in userspace
> > anyway.
> > 
> > It's less risky to lift up the rename restriction on failover slave
> > which is already UP. Although it's possible this change may potentially
> > break userspace component (most likely configuration scripts or
> > management software) that assumes slave name can't be changed while
> > UP, it's relatively a limited and controllable set among all userspace
> > components, which can be fixed specifically to listen for the rename
> > and/or link down/up events on failover slaves. Userspace component
> > interacting with slaves is expected to be changed to operate on failover
> > master interface instead, as the failover slave is dynamic in nature
> > which may come and go at any point.  The goal is to make the role of
> > failover slaves less relevant, and userspace components should only
> > deal with failover master in the long run.
> > 
> > Fixes: 30c8bd5aa8b2 ("net: Introduce generic failover module")
> > Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
> > Reviewed-by: Liran Alon <liran.alon@oracle.com>  
> 
> Acked-by: Michael S. Tsirkin <mst@redhat.com>
> 
> Stephen are you happy with this approach?

I think it is the best solution for what you want to do. 

Did you test with some things like Free Range Routing, VPP or other userspace
control planes that consume netlink?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net v6] failover: allow name change on IFF_UP slave interfaces
  2019-04-05 21:47 ` Stephen Hemminger
@ 2019-04-05 22:01   ` Michael S. Tsirkin
  2019-04-07 15:41       ` Stephen Hemminger
  2019-04-05 22:01   ` Michael S. Tsirkin
  2019-04-05 22:13   ` si-wei liu
  2 siblings, 1 reply; 17+ messages in thread
From: Michael S. Tsirkin @ 2019-04-05 22:01 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Si-Wei Liu, sridhar.samudrala, davem, kubakici, alexander.duyck,
	jiri, netdev, virtualization, liran.alon, boris.ostrovsky,
	vijay.balakrishna

On Fri, Apr 05, 2019 at 02:47:01PM -0700, Stephen Hemminger wrote:
> On Wed,  3 Apr 2019 00:52:47 -0400
> Si-Wei Liu <si-wei.liu@oracle.com> wrote:
> 
> >  
> > +	if (unlikely(dev->flags & IFF_UP)) {
> > +		struct netdev_notifier_change_info change_info = {
> > +			.info.dev = dev,
> > +		};
> > +
> > +		call_netdevice_notifiers_info(NETDEV_CHANGE,
> > +					      &change_info.info);
> > +	}
> 
> This notifier is not really necessary, there already is a CHANGENAME
> that gets sent.
> NETDEV_CHANGE is used in other cases to mean that the state (flags)
> have changed.

The point is that some existing scripts might not expect a name
change to happen without a status change afterwards (since it was
impossible for so long). So this reports a change
to make sure scripts do not miss it.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net v6] failover: allow name change on IFF_UP slave interfaces
  2019-04-05 21:47 ` Stephen Hemminger
  2019-04-05 22:01   ` Michael S. Tsirkin
  2019-04-05 22:01   ` Michael S. Tsirkin
@ 2019-04-05 22:13   ` si-wei liu
  2 siblings, 0 replies; 17+ messages in thread
From: si-wei liu @ 2019-04-05 22:13 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: mst, sridhar.samudrala, davem, kubakici, alexander.duyck, jiri,
	netdev, virtualization, liran.alon, boris.ostrovsky,
	vijay.balakrishna



On 4/5/2019 2:47 PM, Stephen Hemminger wrote:
> On Wed,  3 Apr 2019 00:52:47 -0400
> Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>
>>   
>> +	if (unlikely(dev->flags & IFF_UP)) {
>> +		struct netdev_notifier_change_info change_info = {
>> +			.info.dev = dev,
>> +		};
>> +
>> +		call_netdevice_notifiers_info(NETDEV_CHANGE,
>> +					      &change_info.info);
>> +	}
> This notifier is not really necessary, there already is a CHANGENAME
> that gets sent.
>
> NETDEV_CHANGE is used in other cases to mean that the state (flags)
> have changed.
Honestly, I did not find NETDEV_CHANGE useful myself, but it was your 
call... Anyway, I can remove this notifier and bring the patch back 
close to v2, except for the flag name.

https://patchwork.ozlabs.org/patch/1052633/

What remains open is whether we really need to notify userspace of a 
link state change around the rename, which is what Michael suggested 
but which turned out to be too involved.

Let me know whether you would rather have it removed or left as-is.

-Siwei




^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net v6] failover: allow name change on IFF_UP slave interfaces
  2019-04-05 21:47   ` Stephen Hemminger
@ 2019-04-06  7:21     ` si-wei liu
  2019-04-07  2:45         ` Samudrala, Sridhar
  0 siblings, 1 reply; 17+ messages in thread
From: si-wei liu @ 2019-04-06  7:21 UTC (permalink / raw)
  To: Stephen Hemminger, Michael S. Tsirkin
  Cc: sridhar.samudrala, davem, kubakici, alexander.duyck, jiri,
	netdev, virtualization, liran.alon, boris.ostrovsky,
	vijay.balakrishna



On 4/5/2019 2:47 PM, Stephen Hemminger wrote:
> On Fri, 5 Apr 2019 17:28:55 -0400
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
>
>> On Wed, Apr 03, 2019 at 12:52:47AM -0400, Si-Wei Liu wrote:
>>> When a netdev appears through hot plug then gets enslaved by a failover
>>> master that is already up and running, the slave will be opened
>>> right away after getting enslaved. Today there's a race that userspace
>>> (udev) may fail to rename the slave if the kernel (net_failover)
>>> opens the slave earlier than when the userspace rename happens.
>>> Unlike bond or team, the primary slave of failover can't be renamed by
>>> userspace ahead of time, since the kernel initiated auto-enslavement is
>>> unable to, or rather, is never meant to be synchronized with the rename
>>> request from userspace.
>>>
>>> As the failover slave interfaces are not designed to be operated
>>> directly by userspace apps: IP configuration, filter rules with
>>> regard to network traffic passing and etc., should all be done on master
>>> interface. In general, userspace apps only care about the
>>> name of master interface, while slave names are less important as long
>>> as admin users can see reliable names that may carry
>>> other information describing the netdev. For e.g., they can infer that
>>> "ens3nsby" is a standby slave of "ens3", while for a
>>> name like "eth0" they can't tell which master it belongs to.
>>>
>>> Historically the name of IFF_UP interface can't be changed because
>>> there might be admin script or management software that is already
>>> relying on such behavior and assumes that the slave name can't be
>>> changed once UP. But failover is special: with the in-kernel
>>> auto-enslavement mechanism, the userspace expectation for device
>>> enumeration and bring-up order is already broken. Previously initramfs
>>> and various userspace config tools were modified to bypass failover
>>> slaves because of auto-enslavement and duplicate MAC address. Similarly,
>>> in case that users care about seeing reliable slave name, the new type
>>> of failover slaves needs to be taken care of specifically in userspace
>>> anyway.
>>>
>>> It's less risky to lift up the rename restriction on failover slave
>>> which is already UP. Although it's possible this change may potentially
>>> break userspace component (most likely configuration scripts or
>>> management software) that assumes slave name can't be changed while
>>> UP, it's relatively a limited and controllable set among all userspace
>>> components, which can be fixed specifically to listen for the rename
>>> and/or link down/up events on failover slaves. Userspace component
>>> interacting with slaves is expected to be changed to operate on failover
>>> master interface instead, as the failover slave is dynamic in nature
>>> which may come and go at any point.  The goal is to make the role of
>>> failover slaves less relevant, and userspace components should only
>>> deal with failover master in the long run.
>>>
>>> Fixes: 30c8bd5aa8b2 ("net: Introduce generic failover module")
>>> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
>>> Reviewed-by: Liran Alon <liran.alon@oracle.com>
>> Acked-by: Michael S. Tsirkin <mst@redhat.com>
>>
>> Stephen are you happy with this approach?
> I think it is the best solution for what you want to do.

Since you're asking specifically, I tried what you suggested below.

> Did you test with some things like Free Range Routing,

Although there might be a spurious warning (which is a sanity check 
more than an error) while the slave interface is up, slave rename is 
handled quite well there, no matter which state the slave is in.

https://github.com/FRRouting/frr/blob/master/zebra/if_netlink.c#L97

FRR users are supposed to operate on the failover master interface 
anyway. No one is expected to configure those passive interfaces for 
routing.

> VPP
Nothing in particular was seen for this one. The netlink usage there 
doesn't seem related to my change:
https://github.com/FDio/vpp/blob/master/src/vnet/devices/netlink.c

> or other userspace
> control planes that consume netlink?
dhcpcd (https://github.com/kobolabs/dhcpcd/blob/kobo/if-linux.c#L761) 
was tested OK.

In addition, the patch seems to play quite well with systemd-udev and 
dracut/initramfs-tools. No breakage, no weird error message was seen.
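
As a side note on why such consumers tend to cope with a rename on an UP interface: a link table keyed on ifindex rather than on name simply sees the new name arrive for an index it already knows. Below is a minimal userspace sketch of that idea. All sk_* names and the table size are hypothetical, it is not taken from FRR, VPP or dhcpcd, and it assumes the caller has already extracted the (ifindex, name) pair from a link message, with ifindex always greater than zero.

```c
#include <stdio.h>
#include <string.h>

#define SK_IFNAMSIZ  16 /* same size as IFNAMSIZ on Linux */
#define SK_MAX_LINKS 8  /* arbitrary bound for this sketch */

struct sk_link {
	int ifindex;              /* 0 marks an unused slot */
	char name[SK_IFNAMSIZ];
};

struct sk_link_cache {
	struct sk_link links[SK_MAX_LINKS];
};

/*
 * Record an (ifindex, name) pair.
 * Returns 1 if a known ifindex changed its name, 0 if the entry is new
 * or unchanged, and -1 if the table is full.
 */
static int sk_link_update(struct sk_link_cache *c, int ifindex,
			  const char *name)
{
	struct sk_link *free_slot = NULL;
	size_t i;

	for (i = 0; i < SK_MAX_LINKS; i++) {
		struct sk_link *l = &c->links[i];

		if (l->ifindex == ifindex) {
			if (strncmp(l->name, name, SK_IFNAMSIZ) == 0)
				return 0;
			/* Same device, new name: treat as a rename. */
			snprintf(l->name, sizeof(l->name), "%s", name);
			return 1;
		}
		if (l->ifindex == 0 && !free_slot)
			free_slot = l;
	}
	if (!free_slot)
		return -1;
	free_slot->ifindex = ifindex;
	snprintf(free_slot->name, sizeof(free_slot->name), "%s", name);
	return 0;
}
```

With a zero-initialised cache, feeding in index 3 as "eth0" and later as "ens3nsby" makes the second call report a rename, whatever the link state of the device is at that moment.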

What else do you suggest we should try/test with?

Thanks,
-Siwei

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net v6] failover: allow name change on IFF_UP slave interfaces
  2019-04-06  7:21     ` si-wei liu
@ 2019-04-07  2:45         ` Samudrala, Sridhar
  0 siblings, 0 replies; 17+ messages in thread
From: Samudrala, Sridhar @ 2019-04-07  2:45 UTC (permalink / raw)
  To: si-wei liu, Stephen Hemminger, Michael S. Tsirkin
  Cc: davem, kubakici, alexander.duyck, jiri, netdev, virtualization,
	liran.alon, boris.ostrovsky, vijay.balakrishna


On 4/6/2019 12:21 AM, si-wei liu wrote:

>>>
>>> Stephen are you happy with this approach?
>> I think it is the best solution for what you want to do.
> 
> Since you're asking specifically, I tried what you suggested below.
> 
>> Did you test with some things like Free Range Routing,
> 
> Although there might be spurious warning (which is a check for sanity 
> more than an error) while slave interface is up, slave rename had been 
> handled quite well there, no matter which state slave is at.
> 
> https://github.com/FRRouting/frr/blob/master/zebra/if_netlink.c#L97
> 
> The FRR users are supposed to operate on failover master interface 
> anyway. No one is expected to configure those passive interfaces for 
> routing.
> 
>> VPP
> Nothing particular was seen for this one. The netlink usage there 
> doesn't seem related to my change:
> https://github.com/FDio/vpp/blob/master/src/vnet/devices/netlink.c
> 
>> or other userspace
>> control planes that consume netlink?
> dhcpcd (https://github.com/kobolabs/dhcpcd/blob/kobo/if-linux.c#L761) 
> was tested OK.
> 
> In addition, the patch seems to play quite well with systemd-udev and 
> dracut/initramfs-tools. No breakage, no weird error message was seen.
> 
> What else do you suggest we should try/test with?

Thanks Siwei for all the tests you are trying out. Did you notice 
whether any of these tests required the NETDEV_CHANGE notifier that 
you added?

-Sridhar

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net v6] failover: allow name change on IFF_UP slave interfaces
  2019-04-05 22:01   ` Michael S. Tsirkin
@ 2019-04-07 15:41       ` Stephen Hemminger
  0 siblings, 0 replies; 17+ messages in thread
From: Stephen Hemminger @ 2019-04-07 15:41 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Si-Wei Liu, sridhar.samudrala, davem, kubakici, alexander.duyck,
	jiri, netdev, virtualization, liran.alon, boris.ostrovsky,
	vijay.balakrishna

On Fri, 5 Apr 2019 18:01:43 -0400
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> > 
> > This notifier is not really necessary, there already is a CHANGENAME
> > that gets sent.
> > NETDEV_CHANGE is used in other cases to mean that the state (flags)
> > have changed.  
> 
> The point is some existing scripts might not expect name
> change to happen without a status change afterwards (since it was
> impossible for so long). So this reports a change
> to make sure scripts do not miss it.


I don't think it matters, because if the device name is changed while
it is down (!IFF_UP) then only CHANGENAME is sent. The NETDEV_CHANGE is
just noise to an application.
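
To make the comparison concrete, the events raised by a rename can be modelled as below. This is a toy userspace model, not kernel code: the sk_* names are invented for illustration, and the "extra_change" switch stands for the additional NETDEV_CHANGE notification that the v6 hunk raises when the device is UP.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the two notifier events under discussion. */
enum sk_event { SK_NETDEV_CHANGE, SK_NETDEV_CHANGENAME };

/*
 * Fill 'out' with the events a rename would raise, in order, and return
 * how many there are. 'extra_change' selects the v6 behaviour, where an
 * UP device also gets a NETDEV_CHANGE before the CHANGENAME.
 */
static size_t sk_rename_events(int is_up, int extra_change,
			       enum sk_event out[2])
{
	size_t n = 0;

	if (is_up && extra_change)
		out[n++] = SK_NETDEV_CHANGE;
	out[n++] = SK_NETDEV_CHANGENAME;
	return n;
}
```

In this model a listener that handles CHANGENAME sees every rename in all three cases (down, UP with the extra event, UP without it), so the extra event only matters to a listener that ignores CHANGENAME.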

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net v6] failover: allow name change on IFF_UP slave interfaces
  2019-04-07  2:45         ` Samudrala, Sridhar
  (?)
@ 2019-04-08 20:56         ` si-wei liu
  -1 siblings, 0 replies; 17+ messages in thread
From: si-wei liu @ 2019-04-08 20:56 UTC (permalink / raw)
  To: Samudrala, Sridhar, Stephen Hemminger, Michael S. Tsirkin
  Cc: davem, kubakici, alexander.duyck, jiri, netdev, virtualization,
	liran.alon, boris.ostrovsky, vijay.balakrishna



On 4/6/2019 7:45 PM, Samudrala, Sridhar wrote:
>
> On 4/6/2019 12:21 AM, si-wei liu wrote:
>
>>>>
>>>> Stephen are you happy with this approach?
>>> I think it is the best solution for what you want to do.
>>
>> Since you're asking specifically, I tried what you suggested below.
>>
>>> Did you test with some things like Free Range Routing,
>>
>> Although there might be spurious warning (which is a check for sanity 
>> more than an error) while slave interface is up, slave rename had 
>> been handled quite well there, no matter which state slave is at.
>>
>> https://github.com/FRRouting/frr/blob/master/zebra/if_netlink.c#L97
>>
>> The FRR users are supposed to operate on failover master interface 
>> anyway. No one is expected to configure those passive interfaces for 
>> routing.
>>
>>> VPP
>> Nothing particular was seen for this one. The netlink usage there 
>> doesn't seem related to my change:
>> https://github.com/FDio/vpp/blob/master/src/vnet/devices/netlink.c
>>
>>> or other userspace
>>> control planes that consume netlink?
>> dhcpcd (https://github.com/kobolabs/dhcpcd/blob/kobo/if-linux.c#L761) 
>> was tested OK.
>>
>> In addition, the patch seems to play quite well with systemd-udev and 
>> dracut/initramfs-tools. No breakage, no weird error message was seen.
>>
>> What else do you suggest we should try/test with?
>
> Thanks Siwei for all the tests you are trying out. Did you notice that 
> any of these tests required the NETDEV_CHANGE notifier that you added?

No, that is not actually needed: none of the NETDEV_CHANGE consumers 
check for a name change, although the name change was already 
reflected there. I retested the userspace applications above with the 
NETDEV_CHANGE notifier removed, and the results remain the same.

Thanks,
-Siwei

>
> -Sridhar


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2019-04-08 20:56 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-04-03  4:52 [PATCH net v6] failover: allow name change on IFF_UP slave interfaces Si-Wei Liu
2019-04-05 20:40 ` si-wei liu
2019-04-05 21:28 ` Michael S. Tsirkin
2019-04-05 21:28 ` Michael S. Tsirkin
2019-04-05 21:47   ` Stephen Hemminger
2019-04-05 21:47   ` Stephen Hemminger
2019-04-06  7:21     ` si-wei liu
2019-04-07  2:45       ` Samudrala, Sridhar
2019-04-07  2:45         ` Samudrala, Sridhar
2019-04-08 20:56         ` si-wei liu
2019-04-05 21:47 ` Stephen Hemminger
2019-04-05 22:01   ` Michael S. Tsirkin
2019-04-07 15:41     ` Stephen Hemminger
2019-04-07 15:41       ` Stephen Hemminger
2019-04-05 22:01   ` Michael S. Tsirkin
2019-04-05 22:13   ` si-wei liu
2019-04-05 21:47 ` Stephen Hemminger
