All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH V2,net-next, 0/3] Add vmbus dev_num and enable netvsc async probing
@ 2019-12-30 20:13 Haiyang Zhang
  2019-12-30 20:13 ` [PATCH V2,net-next, 1/3] Drivers: hv: vmbus: Add a dev_num variable based on channel offer sequence Haiyang Zhang
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Haiyang Zhang @ 2019-12-30 20:13 UTC (permalink / raw)
  To: sashal, linux-hyperv, netdev
  Cc: haiyangz, kys, sthemmin, olaf, vkuznets, davem, linux-kernel

Add dev_num for vmbus device based on channel offer sequence.
Use this number for nic naming, and enable async probing.

Haiyang Zhang (3):
  Drivers: hv: vmbus: Add a dev_num variable based on channel offer
    sequence
  Drivers: hv: vmbus: Add dev_num to sysfs
  hv_netvsc: Name NICs based on vmbus offer sequence and use async probe

 Documentation/ABI/stable/sysfs-bus-vmbus |  8 +++++++
 drivers/hv/channel_mgmt.c                | 38 ++++++++++++++++++++++++++++++--
 drivers/hv/vmbus_drv.c                   | 13 +++++++++++
 drivers/net/hyperv/netvsc_drv.c          | 18 ++++++++++++---
 include/linux/hyperv.h                   |  6 +++++
 5 files changed, 78 insertions(+), 5 deletions(-)

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH V2,net-next, 1/3] Drivers: hv: vmbus: Add a dev_num variable based on channel offer sequence
  2019-12-30 20:13 [PATCH V2,net-next, 0/3] Add vmbus dev_num and enable netvsc async probing Haiyang Zhang
@ 2019-12-30 20:13 ` Haiyang Zhang
  2019-12-31  1:34   ` Michael Kelley
  2019-12-30 20:13 ` [PATCH V2,net-next, 2/3] Drivers: hv: vmbus: Add dev_num to sysfs Haiyang Zhang
  2019-12-30 20:13 ` [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on vmbus offer sequence and use async probe Haiyang Zhang
  2 siblings, 1 reply; 14+ messages in thread
From: Haiyang Zhang @ 2019-12-30 20:13 UTC (permalink / raw)
  To: sashal, linux-hyperv, netdev
  Cc: haiyangz, kys, sthemmin, olaf, vkuznets, davem, linux-kernel

This number is set to the first available number, starting from zero,
when a vmbus device’s primary channel is offered.
It will be used for stable naming when Async probing is used.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
---
Changes
V2:
	Use nest loops in hv_set_devnum, instead of goto.

 drivers/hv/channel_mgmt.c | 38 ++++++++++++++++++++++++++++++++++++--
 include/linux/hyperv.h    |  6 ++++++
 2 files changed, 42 insertions(+), 2 deletions(-)

diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index 8eb1675..00fa2db 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -315,6 +315,8 @@ static struct vmbus_channel *alloc_channel(void)
 	if (!channel)
 		return NULL;
 
+	channel->dev_num = HV_DEV_NUM_INVALID;
+
 	spin_lock_init(&channel->lock);
 	init_completion(&channel->rescind_event);
 
@@ -541,6 +543,36 @@ static void vmbus_add_channel_work(struct work_struct *work)
 }
 
 /*
+ * Get the first available device number of its type, then
+ * record it in the channel structure.
+ */
+static void hv_set_devnum(struct vmbus_channel *newchannel)
+{
+	struct vmbus_channel *channel;
+	int i = -1;
+	bool found;
+
+	BUG_ON(!mutex_is_locked(&vmbus_connection.channel_mutex));
+
+	do {
+		i++;
+		found = false;
+
+		list_for_each_entry(channel, &vmbus_connection.chn_list,
+				    listentry) {
+			if (i == channel->dev_num &&
+			    guid_equal(&channel->offermsg.offer.if_type,
+				       &newchannel->offermsg.offer.if_type)) {
+				found = true;
+				break;
+			}
+		}
+	} while (found);
+
+	newchannel->dev_num = i;
+}
+
+/*
  * vmbus_process_offer - Process the offer by creating a channel/device
  * associated with this offer
  */
@@ -573,10 +605,12 @@ static void vmbus_process_offer(struct vmbus_channel *newchannel)
 		}
 	}
 
-	if (fnew)
+	if (fnew) {
+		hv_set_devnum(newchannel);
+
 		list_add_tail(&newchannel->listentry,
 			      &vmbus_connection.chn_list);
-	else {
+	} else {
 		/*
 		 * Check to see if this is a valid sub-channel.
 		 */
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index 26f3aee..4f110c5 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -718,6 +718,8 @@ struct vmbus_device {
 	bool perf_device;
 };
 
+#define HV_DEV_NUM_INVALID (-1)
+
 struct vmbus_channel {
 	struct list_head listentry;
 
@@ -849,6 +851,10 @@ struct vmbus_channel {
 	 */
 	struct vmbus_channel *primary_channel;
 	/*
+	 * Used for device naming based on channel offer sequence.
+	 */
+	int dev_num;
+	/*
 	 * Support per-channel state for use by vmbus drivers.
 	 */
 	void *per_channel_state;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH V2,net-next, 2/3] Drivers: hv: vmbus: Add dev_num to sysfs
  2019-12-30 20:13 [PATCH V2,net-next, 0/3] Add vmbus dev_num and enable netvsc async probing Haiyang Zhang
  2019-12-30 20:13 ` [PATCH V2,net-next, 1/3] Drivers: hv: vmbus: Add a dev_num variable based on channel offer sequence Haiyang Zhang
@ 2019-12-30 20:13 ` Haiyang Zhang
  2019-12-31  1:34   ` Michael Kelley
  2019-12-30 20:13 ` [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on vmbus offer sequence and use async probe Haiyang Zhang
  2 siblings, 1 reply; 14+ messages in thread
From: Haiyang Zhang @ 2019-12-30 20:13 UTC (permalink / raw)
  To: sashal, linux-hyperv, netdev
  Cc: haiyangz, kys, sthemmin, olaf, vkuznets, davem, linux-kernel

It's a number based on the vmbus device offer sequence.
Useful for device naming when using Async probing.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
---
 Documentation/ABI/stable/sysfs-bus-vmbus |  8 ++++++++
 drivers/hv/vmbus_drv.c                   | 13 +++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/Documentation/ABI/stable/sysfs-bus-vmbus b/Documentation/ABI/stable/sysfs-bus-vmbus
index 8e8d167..a42225d 100644
--- a/Documentation/ABI/stable/sysfs-bus-vmbus
+++ b/Documentation/ABI/stable/sysfs-bus-vmbus
@@ -49,6 +49,14 @@ Contact:	Stephen Hemminger <sthemmin@microsoft.com>
 Description:	This NUMA node to which the VMBUS device is
 		attached, or -1 if the node is unknown.
 
+What:           /sys/bus/vmbus/devices/<UUID>/dev_num
+Date:		Dec 2019
+KernelVersion:	5.5
+Contact:	Haiyang Zhang <haiyangz@microsoft.com>
+Description:	A number based on the vmbus device offer sequence.
+		Useful for device naming when using Async probing.
+Users:		Debugging tools and userspace drivers
+
 What:		/sys/bus/vmbus/devices/<UUID>/channels/<N>
 Date:		September. 2017
 KernelVersion:	4.14
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 4ef5a66..fe7aefa 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -214,6 +214,18 @@ static ssize_t numa_node_show(struct device *dev,
 static DEVICE_ATTR_RO(numa_node);
 #endif
 
+static ssize_t dev_num_show(struct device *dev,
+			      struct device_attribute *attr, char *buf)
+{
+	struct hv_device *hv_dev = device_to_hv_device(dev);
+
+	if (!hv_dev->channel)
+		return -ENODEV;
+
+	return sprintf(buf, "%d\n", hv_dev->channel->dev_num);
+}
+static DEVICE_ATTR_RO(dev_num);
+
 static ssize_t server_monitor_pending_show(struct device *dev,
 					   struct device_attribute *dev_attr,
 					   char *buf)
@@ -598,6 +610,7 @@ static ssize_t driver_override_show(struct device *dev,
 #ifdef CONFIG_NUMA
 	&dev_attr_numa_node.attr,
 #endif
+	&dev_attr_dev_num.attr,
 	&dev_attr_server_monitor_pending.attr,
 	&dev_attr_client_monitor_pending.attr,
 	&dev_attr_server_monitor_latency.attr,
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on vmbus offer sequence and use async probe
  2019-12-30 20:13 [PATCH V2,net-next, 0/3] Add vmbus dev_num and enable netvsc async probing Haiyang Zhang
  2019-12-30 20:13 ` [PATCH V2,net-next, 1/3] Drivers: hv: vmbus: Add a dev_num variable based on channel offer sequence Haiyang Zhang
  2019-12-30 20:13 ` [PATCH V2,net-next, 2/3] Drivers: hv: vmbus: Add dev_num to sysfs Haiyang Zhang
@ 2019-12-30 20:13 ` Haiyang Zhang
  2019-12-31  1:35   ` Michael Kelley
  2019-12-31 11:34   ` Roman Kagan
  2 siblings, 2 replies; 14+ messages in thread
From: Haiyang Zhang @ 2019-12-30 20:13 UTC (permalink / raw)
  To: sashal, linux-hyperv, netdev
  Cc: haiyangz, kys, sthemmin, olaf, vkuznets, davem, linux-kernel

The dev_num field in vmbus channel structure is assigned to the first
available number when the channel is offered. So netvsc driver uses it
for NIC naming based on channel offer sequence. Now re-enable the async
probing mode for faster probing.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
---
 drivers/net/hyperv/netvsc_drv.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index f3f9eb8..39c412f 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -2267,10 +2267,14 @@ static int netvsc_probe(struct hv_device *dev,
 	struct net_device_context *net_device_ctx;
 	struct netvsc_device_info *device_info = NULL;
 	struct netvsc_device *nvdev;
+	char name[IFNAMSIZ];
 	int ret = -ENOMEM;
 
-	net = alloc_etherdev_mq(sizeof(struct net_device_context),
-				VRSS_CHANNEL_MAX);
+	snprintf(name, IFNAMSIZ, "eth%d", dev->channel->dev_num);
+	net = alloc_netdev_mqs(sizeof(struct net_device_context), name,
+			       NET_NAME_ENUM, ether_setup,
+			       VRSS_CHANNEL_MAX, VRSS_CHANNEL_MAX);
+
 	if (!net)
 		goto no_net;
 
@@ -2355,6 +2359,14 @@ static int netvsc_probe(struct hv_device *dev,
 		net->max_mtu = ETH_DATA_LEN;
 
 	ret = register_netdevice(net);
+
+	if (ret == -EEXIST) {
+		pr_info("NIC name %s exists, request another name.\n",
+			net->name);
+		strlcpy(net->name, "eth%d", IFNAMSIZ);
+		ret = register_netdevice(net);
+	}
+
 	if (ret != 0) {
 		pr_err("Unable to register netdev.\n");
 		goto register_failed;
@@ -2496,7 +2508,7 @@ static int netvsc_resume(struct hv_device *dev)
 	.suspend = netvsc_suspend,
 	.resume = netvsc_resume,
 	.driver = {
-		.probe_type = PROBE_FORCE_SYNCHRONOUS,
+		.probe_type = PROBE_PREFER_ASYNCHRONOUS,
 	},
 };
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* RE: [PATCH V2,net-next, 1/3] Drivers: hv: vmbus: Add a dev_num variable based on channel offer sequence
  2019-12-30 20:13 ` [PATCH V2,net-next, 1/3] Drivers: hv: vmbus: Add a dev_num variable based on channel offer sequence Haiyang Zhang
@ 2019-12-31  1:34   ` Michael Kelley
  2019-12-31 15:48     ` Haiyang Zhang
  0 siblings, 1 reply; 14+ messages in thread
From: Michael Kelley @ 2019-12-31  1:34 UTC (permalink / raw)
  To: Haiyang Zhang, sashal, linux-hyperv, netdev
  Cc: Haiyang Zhang, KY Srinivasan, Stephen Hemminger, olaf, vkuznets,
	davem, linux-kernel

From: Haiyang Zhang <haiyangz@microsoft.com> Sent: Monday, December 30, 2019 12:14 PM
> 
> This number is set to the first available number, starting from zero,
> when a vmbus device's primary channel is offered.

Let's use "VMBus" as the capitalization in text.

> It will be used for stable naming when Async probing is used.
> 
> Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
> ---
> Changes
> V2:
> 	Use nest loops in hv_set_devnum, instead of goto.
> 
>  drivers/hv/channel_mgmt.c | 38 ++++++++++++++++++++++++++++++++++++--
>  include/linux/hyperv.h    |  6 ++++++
>  2 files changed, 42 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
> index 8eb1675..00fa2db 100644
> --- a/drivers/hv/channel_mgmt.c
> +++ b/drivers/hv/channel_mgmt.c
> @@ -315,6 +315,8 @@ static struct vmbus_channel *alloc_channel(void)
>  	if (!channel)
>  		return NULL;
> 
> +	channel->dev_num = HV_DEV_NUM_INVALID;
> +
>  	spin_lock_init(&channel->lock);
>  	init_completion(&channel->rescind_event);
> 
> @@ -541,6 +543,36 @@ static void vmbus_add_channel_work(struct work_struct *work)
>  }
> 
>  /*
> + * Get the first available device number of its type, then
> + * record it in the channel structure.
> + */
> +static void hv_set_devnum(struct vmbus_channel *newchannel)
> +{
> +	struct vmbus_channel *channel;
> +	int i = -1;
> +	bool found;
> +
> +	BUG_ON(!mutex_is_locked(&vmbus_connection.channel_mutex));
> +
> +	do {
> +		i++;
> +		found = false;
> +
> +		list_for_each_entry(channel, &vmbus_connection.chn_list,
> +				    listentry) {
> +			if (i == channel->dev_num &&
> +			    guid_equal(&channel->offermsg.offer.if_type,
> +				       &newchannel->offermsg.offer.if_type)) {
> +				found = true;
> +				break;
> +			}
> +		}
> +	} while (found);
> +
> +	newchannel->dev_num = i;
> +}
> +

It took me a little while to figure out what the above algorithm is doing.
Perhaps it would help to rename the "found" variable to "in_use", and add
this comment before the start of the "do" loop:

Iterate through each possible device number starting at zero.  If the device
number is already in use for a device of this type, try the next device number
until finding one that is not in use.   This approach selects the smallest
device number that is not in use, and so reuses any numbers that are freed
by devices that have been removed.

> +/*
>   * vmbus_process_offer - Process the offer by creating a channel/device
>   * associated with this offer
>   */
> @@ -573,10 +605,12 @@ static void vmbus_process_offer(struct vmbus_channel
> *newchannel)
>  		}
>  	}
> 
> -	if (fnew)
> +	if (fnew) {
> +		hv_set_devnum(newchannel);
> +
>  		list_add_tail(&newchannel->listentry,
>  			      &vmbus_connection.chn_list);
> -	else {
> +	} else {
>  		/*
>  		 * Check to see if this is a valid sub-channel.
>  		 */
> diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
> index 26f3aee..4f110c5 100644
> --- a/include/linux/hyperv.h
> +++ b/include/linux/hyperv.h
> @@ -718,6 +718,8 @@ struct vmbus_device {
>  	bool perf_device;
>  };
> 
> +#define HV_DEV_NUM_INVALID (-1)
> +
>  struct vmbus_channel {
>  	struct list_head listentry;
> 
> @@ -849,6 +851,10 @@ struct vmbus_channel {
>  	 */
>  	struct vmbus_channel *primary_channel;
>  	/*
> +	 * Used for device naming based on channel offer sequence.
> +	 */
> +	int dev_num;
> +	/*
>  	 * Support per-channel state for use by vmbus drivers.
>  	 */
>  	void *per_channel_state;
> --
> 1.8.3.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH V2,net-next, 2/3] Drivers: hv: vmbus: Add dev_num to sysfs
  2019-12-30 20:13 ` [PATCH V2,net-next, 2/3] Drivers: hv: vmbus: Add dev_num to sysfs Haiyang Zhang
@ 2019-12-31  1:34   ` Michael Kelley
  2019-12-31 15:50     ` Haiyang Zhang
  0 siblings, 1 reply; 14+ messages in thread
From: Michael Kelley @ 2019-12-31  1:34 UTC (permalink / raw)
  To: Haiyang Zhang, sashal, linux-hyperv, netdev
  Cc: Haiyang Zhang, KY Srinivasan, Stephen Hemminger, olaf, vkuznets,
	davem, linux-kernel

From: Haiyang Zhang <haiyangz@microsoft.com> Sent: Monday, December 30, 2019 12:14 PM
> 
> It's a number based on the vmbus device offer sequence.

Let's use "VMBus" as the capitalization.

> Useful for device naming when using Async probing.
> 
> Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
> ---
>  Documentation/ABI/stable/sysfs-bus-vmbus |  8 ++++++++
>  drivers/hv/vmbus_drv.c                   | 13 +++++++++++++
>  2 files changed, 21 insertions(+)
> 
> diff --git a/Documentation/ABI/stable/sysfs-bus-vmbus
> b/Documentation/ABI/stable/sysfs-bus-vmbus
> index 8e8d167..a42225d 100644
> --- a/Documentation/ABI/stable/sysfs-bus-vmbus
> +++ b/Documentation/ABI/stable/sysfs-bus-vmbus
> @@ -49,6 +49,14 @@ Contact:	Stephen Hemminger <sthemmin@microsoft.com>
>  Description:	This NUMA node to which the VMBUS device is
>  		attached, or -1 if the node is unknown.
> 
> +What:           /sys/bus/vmbus/devices/<UUID>/dev_num
> +Date:		Dec 2019
> +KernelVersion:	5.5
> +Contact:	Haiyang Zhang <haiyangz@microsoft.com>
> +Description:	A number based on the vmbus device offer sequence.

The capitalization of "VMBus" is already inconsistent in this file in that 
we have "VMBus" and "VMBUS" in the text.  Let's not introduce another
capitalization -- use "VMBus".

> +		Useful for device naming when using Async probing.
> +Users:		Debugging tools and userspace drivers
> +
>  What:		/sys/bus/vmbus/devices/<UUID>/channels/<N>
>  Date:		September. 2017
>  KernelVersion:	4.14
> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
> index 4ef5a66..fe7aefa 100644
> --- a/drivers/hv/vmbus_drv.c
> +++ b/drivers/hv/vmbus_drv.c
> @@ -214,6 +214,18 @@ static ssize_t numa_node_show(struct device *dev,
>  static DEVICE_ATTR_RO(numa_node);
>  #endif
> 
> +static ssize_t dev_num_show(struct device *dev,
> +			      struct device_attribute *attr, char *buf)
> +{
> +	struct hv_device *hv_dev = device_to_hv_device(dev);
> +
> +	if (!hv_dev->channel)
> +		return -ENODEV;
> +
> +	return sprintf(buf, "%d\n", hv_dev->channel->dev_num);
> +}
> +static DEVICE_ATTR_RO(dev_num);
> +
>  static ssize_t server_monitor_pending_show(struct device *dev,
>  					   struct device_attribute *dev_attr,
>  					   char *buf)
> @@ -598,6 +610,7 @@ static ssize_t driver_override_show(struct device *dev,
>  #ifdef CONFIG_NUMA
>  	&dev_attr_numa_node.attr,
>  #endif
> +	&dev_attr_dev_num.attr,
>  	&dev_attr_server_monitor_pending.attr,
>  	&dev_attr_client_monitor_pending.attr,
>  	&dev_attr_server_monitor_latency.attr,
> --
> 1.8.3.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on vmbus offer sequence and use async probe
  2019-12-30 20:13 ` [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on vmbus offer sequence and use async probe Haiyang Zhang
@ 2019-12-31  1:35   ` Michael Kelley
  2019-12-31 16:01     ` Haiyang Zhang
  2019-12-31 11:34   ` Roman Kagan
  1 sibling, 1 reply; 14+ messages in thread
From: Michael Kelley @ 2019-12-31  1:35 UTC (permalink / raw)
  To: Haiyang Zhang, sashal, linux-hyperv, netdev
  Cc: Haiyang Zhang, KY Srinivasan, Stephen Hemminger, olaf, vkuznets,
	davem, linux-kernel

From: Haiyang Zhang <haiyangz@microsoft.com> Sent: Monday, December 30, 2019 12:14 PM
> 
> The dev_num field in vmbus channel structure is assigned to the first

Let's use "VMBus" in text.

> available number when the channel is offered. So netvsc driver uses it
> for NIC naming based on channel offer sequence. Now re-enable the async
> probing mode for faster probing.
> 
> Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
> ---
>  drivers/net/hyperv/netvsc_drv.c | 18 +++++++++++++++---
>  1 file changed, 15 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
> index f3f9eb8..39c412f 100644
> --- a/drivers/net/hyperv/netvsc_drv.c
> +++ b/drivers/net/hyperv/netvsc_drv.c
> @@ -2267,10 +2267,14 @@ static int netvsc_probe(struct hv_device *dev,
>  	struct net_device_context *net_device_ctx;
>  	struct netvsc_device_info *device_info = NULL;
>  	struct netvsc_device *nvdev;
> +	char name[IFNAMSIZ];
>  	int ret = -ENOMEM;
> 
> -	net = alloc_etherdev_mq(sizeof(struct net_device_context),
> -				VRSS_CHANNEL_MAX);
> +	snprintf(name, IFNAMSIZ, "eth%d", dev->channel->dev_num);
> +	net = alloc_netdev_mqs(sizeof(struct net_device_context), name,
> +			       NET_NAME_ENUM, ether_setup,
> +			       VRSS_CHANNEL_MAX, VRSS_CHANNEL_MAX);
> +
>  	if (!net)
>  		goto no_net;
> 
> @@ -2355,6 +2359,14 @@ static int netvsc_probe(struct hv_device *dev,
>  		net->max_mtu = ETH_DATA_LEN;
> 
>  	ret = register_netdevice(net);
> +
> +	if (ret == -EEXIST) {
> +		pr_info("NIC name %s exists, request another name.\n",
> +			net->name);
> +		strlcpy(net->name, "eth%d", IFNAMSIZ);
> +		ret = register_netdevice(net);

The message above could be confusing.  "request another name" sounds
like a directive to the user/sysadmin, which I don't think it is.  Perhaps
better would be "requesting another name", which says the system is
handling the collision automatically.

Also will this second call to register_netdevice() actually assign a numeric
name?  If so, it would be really nice for the message to be output *after*
the second call to register_netdevice() and to show both the originally
selected name that collided as well as the new name.  We sometimes get
into NIC naming issues with customers in Azure, and when a new name
has to be selected it will help debugging if we can show both the original
selection and the new selection.  With both pieces of data, there's less
guessing about who did what regarding NIC naming.

Finally, having to choose a different name because of a collision does
*not* update the channel->dev_num value.  Subsequent calls to 
hv_set_devnum() will detect "in use" based on the originally assigned
dev_num value.  I don't think that fundamentally breaks anything, but
I wondered if you had thought about that case.   Maybe a comment here
to that effect would help a future reader of this code.

Michael

> +	}
> +
>  	if (ret != 0) {
>  		pr_err("Unable to register netdev.\n");
>  		goto register_failed;
> @@ -2496,7 +2508,7 @@ static int netvsc_resume(struct hv_device *dev)
>  	.suspend = netvsc_suspend,
>  	.resume = netvsc_resume,
>  	.driver = {
> -		.probe_type = PROBE_FORCE_SYNCHRONOUS,
> +		.probe_type = PROBE_PREFER_ASYNCHRONOUS,
>  	},
>  };
> 
> --
> 1.8.3.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on vmbus offer sequence and use async probe
  2019-12-30 20:13 ` [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on vmbus offer sequence and use async probe Haiyang Zhang
  2019-12-31  1:35   ` Michael Kelley
@ 2019-12-31 11:34   ` Roman Kagan
  2019-12-31 16:12     ` Haiyang Zhang
  1 sibling, 1 reply; 14+ messages in thread
From: Roman Kagan @ 2019-12-31 11:34 UTC (permalink / raw)
  To: Haiyang Zhang
  Cc: sashal, linux-hyperv, netdev, kys, sthemmin, olaf, vkuznets,
	davem, linux-kernel

On Mon, Dec 30, 2019 at 12:13:34PM -0800, Haiyang Zhang wrote:
> The dev_num field in vmbus channel structure is assigned to the first
> available number when the channel is offered. So netvsc driver uses it
> for NIC naming based on channel offer sequence. Now re-enable the async
> probing mode for faster probing.
> 
> Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
> ---
>  drivers/net/hyperv/netvsc_drv.c | 18 +++++++++++++++---
>  1 file changed, 15 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
> index f3f9eb8..39c412f 100644
> --- a/drivers/net/hyperv/netvsc_drv.c
> +++ b/drivers/net/hyperv/netvsc_drv.c
> @@ -2267,10 +2267,14 @@ static int netvsc_probe(struct hv_device *dev,
>  	struct net_device_context *net_device_ctx;
>  	struct netvsc_device_info *device_info = NULL;
>  	struct netvsc_device *nvdev;
> +	char name[IFNAMSIZ];
>  	int ret = -ENOMEM;
>  
> -	net = alloc_etherdev_mq(sizeof(struct net_device_context),
> -				VRSS_CHANNEL_MAX);
> +	snprintf(name, IFNAMSIZ, "eth%d", dev->channel->dev_num);

How is this supposed to work when there are other ethernet device types
on the system, which may claim the same device names?

> +	net = alloc_netdev_mqs(sizeof(struct net_device_context), name,
> +			       NET_NAME_ENUM, ether_setup,
> +			       VRSS_CHANNEL_MAX, VRSS_CHANNEL_MAX);
> +
>  	if (!net)
>  		goto no_net;
>  
> @@ -2355,6 +2359,14 @@ static int netvsc_probe(struct hv_device *dev,
>  		net->max_mtu = ETH_DATA_LEN;
>  
>  	ret = register_netdevice(net);
> +
> +	if (ret == -EEXIST) {
> +		pr_info("NIC name %s exists, request another name.\n",
> +			net->name);
> +		strlcpy(net->name, "eth%d", IFNAMSIZ);
> +		ret = register_netdevice(net);
> +	}

IOW you want the device naming to be predictable, but don't guarantee
this?

I think the problem this patchset is trying to solve is much better
solved with a udev rule, similar to how it's done for PCI net devices.
And IMO the primary channel number, being a device's "hardware"
property, is more suited to be used in the device name, than this
completely ephemeral device number.

Thanks,
Roman.

> +
>  	if (ret != 0) {
>  		pr_err("Unable to register netdev.\n");
>  		goto register_failed;
> @@ -2496,7 +2508,7 @@ static int netvsc_resume(struct hv_device *dev)
>  	.suspend = netvsc_suspend,
>  	.resume = netvsc_resume,
>  	.driver = {
> -		.probe_type = PROBE_FORCE_SYNCHRONOUS,
> +		.probe_type = PROBE_PREFER_ASYNCHRONOUS,
>  	},
>  };
>  
> -- 
> 1.8.3.1
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH V2,net-next, 1/3] Drivers: hv: vmbus: Add a dev_num variable based on channel offer sequence
  2019-12-31  1:34   ` Michael Kelley
@ 2019-12-31 15:48     ` Haiyang Zhang
  0 siblings, 0 replies; 14+ messages in thread
From: Haiyang Zhang @ 2019-12-31 15:48 UTC (permalink / raw)
  To: Michael Kelley, sashal, linux-hyperv, netdev
  Cc: KY Srinivasan, Stephen Hemminger, olaf, vkuznets, davem, linux-kernel



> -----Original Message-----
> From: Michael Kelley <mikelley@microsoft.com>
> Sent: Monday, December 30, 2019 8:35 PM
> To: Haiyang Zhang <haiyangz@microsoft.com>; sashal@kernel.org; linux-
> hyperv@vger.kernel.org; netdev@vger.kernel.org
> Cc: Haiyang Zhang <haiyangz@microsoft.com>; KY Srinivasan
> <kys@microsoft.com>; Stephen Hemminger <sthemmin@microsoft.com>;
> olaf@aepfle.de; vkuznets <vkuznets@redhat.com>; davem@davemloft.net;
> linux-kernel@vger.kernel.org
> Subject: RE: [PATCH V2,net-next, 1/3] Drivers: hv: vmbus: Add a dev_num
> variable based on channel offer sequence
> 
> From: Haiyang Zhang <haiyangz@microsoft.com> Sent: Monday, December 30,
> 2019 12:14 PM
> >
> > This number is set to the first available number, starting from zero,
> > when a vmbus device's primary channel is offered.
> 
> Let's use "VMBus" as the capitalization in text.
Sure I will.

> 
> > It will be used for stable naming when Async probing is used.
> >
> > Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
> > ---
> > Changes
> > V2:
> > 	Use nest loops in hv_set_devnum, instead of goto.
> >
> >  drivers/hv/channel_mgmt.c | 38
> ++++++++++++++++++++++++++++++++++++--
> >  include/linux/hyperv.h    |  6 ++++++
> >  2 files changed, 42 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
> > index 8eb1675..00fa2db 100644
> > --- a/drivers/hv/channel_mgmt.c
> > +++ b/drivers/hv/channel_mgmt.c
> > @@ -315,6 +315,8 @@ static struct vmbus_channel *alloc_channel(void)
> >  	if (!channel)
> >  		return NULL;
> >
> > +	channel->dev_num = HV_DEV_NUM_INVALID;
> > +
> >  	spin_lock_init(&channel->lock);
> >  	init_completion(&channel->rescind_event);
> >
> > @@ -541,6 +543,36 @@ static void vmbus_add_channel_work(struct
> work_struct *work)
> >  }
> >
> >  /*
> > + * Get the first available device number of its type, then
> > + * record it in the channel structure.
> > + */
> > +static void hv_set_devnum(struct vmbus_channel *newchannel)
> > +{
> > +	struct vmbus_channel *channel;
> > +	int i = -1;
> > +	bool found;
> > +
> > +	BUG_ON(!mutex_is_locked(&vmbus_connection.channel_mutex));
> > +
> > +	do {
> > +		i++;
> > +		found = false;
> > +
> > +		list_for_each_entry(channel, &vmbus_connection.chn_list,
> > +				    listentry) {
> > +			if (i == channel->dev_num &&
> > +			    guid_equal(&channel->offermsg.offer.if_type,
> > +				       &newchannel->offermsg.offer.if_type)) {
> > +				found = true;
> > +				break;
> > +			}
> > +		}
> > +	} while (found);
> > +
> > +	newchannel->dev_num = i;
> > +}
> > +
> 
> It took me a little while to figure out what the above algorithm is doing.
> Perhaps it would help to rename the "found" variable to "in_use", and add
> this comment before the start of the "do" loop:
> 
> Iterate through each possible device number starting at zero.  If the device
> number is already in use for a device of this type, try the next device number
> until finding one that is not in use.   This approach selects the smallest
> device number that is not in use, and so reuses any numbers that are freed
> by devices that have been removed.
I will rename the variable, and add the code comments.
Thanks,

- Haiyang


^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH V2,net-next, 2/3] Drivers: hv: vmbus: Add dev_num to sysfs
  2019-12-31  1:34   ` Michael Kelley
@ 2019-12-31 15:50     ` Haiyang Zhang
  0 siblings, 0 replies; 14+ messages in thread
From: Haiyang Zhang @ 2019-12-31 15:50 UTC (permalink / raw)
  To: Michael Kelley, sashal, linux-hyperv, netdev
  Cc: KY Srinivasan, Stephen Hemminger, olaf, vkuznets, davem, linux-kernel



> -----Original Message-----
> From: Michael Kelley <mikelley@microsoft.com>
> Sent: Monday, December 30, 2019 8:35 PM
> To: Haiyang Zhang <haiyangz@microsoft.com>; sashal@kernel.org; linux-
> hyperv@vger.kernel.org; netdev@vger.kernel.org
> Cc: Haiyang Zhang <haiyangz@microsoft.com>; KY Srinivasan
> <kys@microsoft.com>; Stephen Hemminger <sthemmin@microsoft.com>;
> olaf@aepfle.de; vkuznets <vkuznets@redhat.com>; davem@davemloft.net;
> linux-kernel@vger.kernel.org
> Subject: RE: [PATCH V2,net-next, 2/3] Drivers: hv: vmbus: Add dev_num to sysfs
> 
> From: Haiyang Zhang <haiyangz@microsoft.com> Sent: Monday, December 30,
> 2019 12:14 PM
> >
> > It's a number based on the vmbus device offer sequence.
> 
> Let's use "VMBus" as the capitalization.
I will.

> 
> > Useful for device naming when using Async probing.
> >
> > Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
> > ---
> >  Documentation/ABI/stable/sysfs-bus-vmbus |  8 ++++++++
> >  drivers/hv/vmbus_drv.c                   | 13 +++++++++++++
> >  2 files changed, 21 insertions(+)
> >
> > diff --git a/Documentation/ABI/stable/sysfs-bus-vmbus
> > b/Documentation/ABI/stable/sysfs-bus-vmbus
> > index 8e8d167..a42225d 100644
> > --- a/Documentation/ABI/stable/sysfs-bus-vmbus
> > +++ b/Documentation/ABI/stable/sysfs-bus-vmbus
> > @@ -49,6 +49,14 @@ Contact:	Stephen Hemminger
> <sthemmin@microsoft.com>
> >  Description:	This NUMA node to which the VMBUS device is
> >  		attached, or -1 if the node is unknown.
> >
> > +What:           /sys/bus/vmbus/devices/<UUID>/dev_num
> > +Date:		Dec 2019
> > +KernelVersion:	5.5
> > +Contact:	Haiyang Zhang <haiyangz@microsoft.com>
> > +Description:	A number based on the vmbus device offer sequence.
> 
> The capitalization of "VMBus" is already inconsistent in this file in that
> we have "VMBus" and "VMBUS" in the text.  Let's not introduce another
> capitalization -- use "VMBus".

I will use "VMBus". 
Thanks,
- Haiyang

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on vmbus offer sequence and use async probe
  2019-12-31  1:35   ` Michael Kelley
@ 2019-12-31 16:01     ` Haiyang Zhang
  0 siblings, 0 replies; 14+ messages in thread
From: Haiyang Zhang @ 2019-12-31 16:01 UTC (permalink / raw)
  To: Michael Kelley, sashal, linux-hyperv, netdev
  Cc: KY Srinivasan, Stephen Hemminger, olaf, vkuznets, davem, linux-kernel



> -----Original Message-----
> From: Michael Kelley <mikelley@microsoft.com>
> Sent: Monday, December 30, 2019 8:35 PM
> To: Haiyang Zhang <haiyangz@microsoft.com>; sashal@kernel.org; linux-
> hyperv@vger.kernel.org; netdev@vger.kernel.org
> Cc: Haiyang Zhang <haiyangz@microsoft.com>; KY Srinivasan
> <kys@microsoft.com>; Stephen Hemminger <sthemmin@microsoft.com>;
> olaf@aepfle.de; vkuznets <vkuznets@redhat.com>; davem@davemloft.net;
> linux-kernel@vger.kernel.org
> Subject: RE: [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on vmbus
> offer sequence and use async probe
> 
> From: Haiyang Zhang <haiyangz@microsoft.com> Sent: Monday, December 30,
> 2019 12:14 PM
> >
> > The dev_num field in vmbus channel structure is assigned to the first
> 
> Let's use "VMBus" in text.
I will.

> 
> > available number when the channel is offered. So netvsc driver uses it
> > for NIC naming based on channel offer sequence. Now re-enable the async
> > probing mode for faster probing.
> >
> > Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
> > ---
> >  drivers/net/hyperv/netvsc_drv.c | 18 +++++++++++++++---
> >  1 file changed, 15 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
> > index f3f9eb8..39c412f 100644
> > --- a/drivers/net/hyperv/netvsc_drv.c
> > +++ b/drivers/net/hyperv/netvsc_drv.c
> > @@ -2267,10 +2267,14 @@ static int netvsc_probe(struct hv_device *dev,
> >  	struct net_device_context *net_device_ctx;
> >  	struct netvsc_device_info *device_info = NULL;
> >  	struct netvsc_device *nvdev;
> > +	char name[IFNAMSIZ];
> >  	int ret = -ENOMEM;
> >
> > -	net = alloc_etherdev_mq(sizeof(struct net_device_context),
> > -				VRSS_CHANNEL_MAX);
> > +	snprintf(name, IFNAMSIZ, "eth%d", dev->channel->dev_num);
> > +	net = alloc_netdev_mqs(sizeof(struct net_device_context), name,
> > +			       NET_NAME_ENUM, ether_setup,
> > +			       VRSS_CHANNEL_MAX, VRSS_CHANNEL_MAX);
> > +
> >  	if (!net)
> >  		goto no_net;
> >
> > @@ -2355,6 +2359,14 @@ static int netvsc_probe(struct hv_device *dev,
> >  		net->max_mtu = ETH_DATA_LEN;
> >
> >  	ret = register_netdevice(net);
> > +
> > +	if (ret == -EEXIST) {
> > +		pr_info("NIC name %s exists, request another name.\n",
> > +			net->name);
> > +		strlcpy(net->name, "eth%d", IFNAMSIZ);
> > +		ret = register_netdevice(net);
> 
> The message above could be confusing.  "request another name" sounds
> like a directive to the user/sysadmin, which I don't think it is.  Perhaps
> better would be "requesting another name", which says the system is
> handling the collision automatically.
> 
> Also will this second call to register_netdevice() actually assign a numeric
> name?  If so, it would be really nice for the message to be output *after*
> the second call to register_netdevice() and to show both the originally
> selected name that collided as well as the new name.  We sometimes get
> into NIC naming issues with customers in Azure, and when a new name
> has to be selected it will help debugging if we can show both the original
> selection and the new selection.  With both pieces of data, there's less
> guessing about who did what regarding NIC naming.
Good idea! I will include both new and old names in the log messages.

> 
> Finally, having to choose a different name because of a collision does
> *not* update the channel->dev_num value.  Subsequent calls to
> hv_set_devnum() will detect "in use" based on the originally assigned
> dev_num value.  I don't think that fundamentally breaks anything, but
> I wondered if you had thought about that case.   Maybe a comment here
> to that effect would help a future reader of this code.

That's correct. And I'm aware of this situation. But, the dev_num shouldn't 
be changed, because we want it to keep track of the sequence of device 
offering.
Avoid single or multiple collisions, we should use udev or other daemons to
set name of VF or other types of NICs to different formats, like "vf*", or "enP*",
etc. For distros have not done so already, we need to work with the distro vendors
to ensure udev or other settings are correctly included in Distros, and Azure images.

Thanks,
- Haiyang


^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on vmbus offer sequence and use async probe
  2019-12-31 11:34   ` Roman Kagan
@ 2019-12-31 16:12     ` Haiyang Zhang
  2019-12-31 17:36       ` Stephen Hemminger
  0 siblings, 1 reply; 14+ messages in thread
From: Haiyang Zhang @ 2019-12-31 16:12 UTC (permalink / raw)
  To: Roman Kagan
  Cc: sashal, linux-hyperv, netdev, KY Srinivasan, Stephen Hemminger,
	olaf, vkuznets, davem, linux-kernel



> -----Original Message-----
> From: Roman Kagan <rkagan@virtuozzo.com>
> Sent: Tuesday, December 31, 2019 6:35 AM
> To: Haiyang Zhang <haiyangz@microsoft.com>
> Cc: sashal@kernel.org; linux-hyperv@vger.kernel.org; netdev@vger.kernel.org;
> KY Srinivasan <kys@microsoft.com>; Stephen Hemminger
> <sthemmin@microsoft.com>; olaf@aepfle.de; vkuznets
> <vkuznets@redhat.com>; davem@davemloft.net; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on vmbus
> offer sequence and use async probe
> 
> On Mon, Dec 30, 2019 at 12:13:34PM -0800, Haiyang Zhang wrote:
> > The dev_num field in vmbus channel structure is assigned to the first
> > available number when the channel is offered. So netvsc driver uses it
> > for NIC naming based on channel offer sequence. Now re-enable the
> > async probing mode for faster probing.
> >
> > Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
> > ---
> >  drivers/net/hyperv/netvsc_drv.c | 18 +++++++++++++++---
> >  1 file changed, 15 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/net/hyperv/netvsc_drv.c
> > b/drivers/net/hyperv/netvsc_drv.c index f3f9eb8..39c412f 100644
> > --- a/drivers/net/hyperv/netvsc_drv.c
> > +++ b/drivers/net/hyperv/netvsc_drv.c
> > @@ -2267,10 +2267,14 @@ static int netvsc_probe(struct hv_device *dev,
> >  	struct net_device_context *net_device_ctx;
> >  	struct netvsc_device_info *device_info = NULL;
> >  	struct netvsc_device *nvdev;
> > +	char name[IFNAMSIZ];
> >  	int ret = -ENOMEM;
> >
> > -	net = alloc_etherdev_mq(sizeof(struct net_device_context),
> > -				VRSS_CHANNEL_MAX);
> > +	snprintf(name, IFNAMSIZ, "eth%d", dev->channel->dev_num);
> 
> How is this supposed to work when there are other ethernet device types on the
> system, which may claim the same device names?
> 
> > +	net = alloc_netdev_mqs(sizeof(struct net_device_context), name,
> > +			       NET_NAME_ENUM, ether_setup,
> > +			       VRSS_CHANNEL_MAX, VRSS_CHANNEL_MAX);
> > +
> >  	if (!net)
> >  		goto no_net;
> >
> > @@ -2355,6 +2359,14 @@ static int netvsc_probe(struct hv_device *dev,
> >  		net->max_mtu = ETH_DATA_LEN;
> >
> >  	ret = register_netdevice(net);
> > +
> > +	if (ret == -EEXIST) {
> > +		pr_info("NIC name %s exists, request another name.\n",
> > +			net->name);
> > +		strlcpy(net->name, "eth%d", IFNAMSIZ);
> > +		ret = register_netdevice(net);
> > +	}
> 
> IOW you want the device naming to be predictable, but don't guarantee this?
> 
> I think the problem this patchset is trying to solve is much better solved with a
> udev rule, similar to how it's done for PCI net devices.
> And IMO the primary channel number, being a device's "hardware"
> property, is more suited to be used in the device name, than this completely
> ephemeral device number.

The vmbus number can be affected by other types of devices and/or subchannel
offerings. They are not stable either. That's why this patch set keeps track of the 
offering sequence within the same device type in a new variable "dev_num".

As in my earlier email, to avoid impact by other types of NICs, we should put them
into different naming formats, like "vf*", "enP*", etc. And yes, these can be done in
udev.

But for netvsc (synthetic) NICs, we still want the default naming format "eth*". And
the variable "dev_num" gives them the basis for stable naming with Async probing.

Thanks,
- Haiyang


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on vmbus offer sequence and use async probe
  2019-12-31 16:12     ` Haiyang Zhang
@ 2019-12-31 17:36       ` Stephen Hemminger
  2019-12-31 17:51         ` Haiyang Zhang
  0 siblings, 1 reply; 14+ messages in thread
From: Stephen Hemminger @ 2019-12-31 17:36 UTC (permalink / raw)
  To: Haiyang Zhang
  Cc: Roman Kagan, sashal, linux-hyperv, netdev, KY Srinivasan,
	Stephen Hemminger, olaf, vkuznets, davem

On Tue, 31 Dec 2019 16:12:36 +0000
Haiyang Zhang <haiyangz@microsoft.com> wrote:

> > -----Original Message-----
> > From: Roman Kagan <rkagan@virtuozzo.com>
> > Sent: Tuesday, December 31, 2019 6:35 AM
> > To: Haiyang Zhang <haiyangz@microsoft.com>
> > Cc: sashal@kernel.org; linux-hyperv@vger.kernel.org; netdev@vger.kernel.org;
> > KY Srinivasan <kys@microsoft.com>; Stephen Hemminger
> > <sthemmin@microsoft.com>; olaf@aepfle.de; vkuznets
> > <vkuznets@redhat.com>; davem@davemloft.net; linux-kernel@vger.kernel.org
> > Subject: Re: [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on vmbus
> > offer sequence and use async probe
> > 
> > On Mon, Dec 30, 2019 at 12:13:34PM -0800, Haiyang Zhang wrote:  
> > > The dev_num field in vmbus channel structure is assigned to the first
> > > available number when the channel is offered. So netvsc driver uses it
> > > for NIC naming based on channel offer sequence. Now re-enable the
> > > async probing mode for faster probing.
> > >
> > > Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
> > > ---
> > >  drivers/net/hyperv/netvsc_drv.c | 18 +++++++++++++++---
> > >  1 file changed, 15 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/net/hyperv/netvsc_drv.c
> > > b/drivers/net/hyperv/netvsc_drv.c index f3f9eb8..39c412f 100644
> > > --- a/drivers/net/hyperv/netvsc_drv.c
> > > +++ b/drivers/net/hyperv/netvsc_drv.c
> > > @@ -2267,10 +2267,14 @@ static int netvsc_probe(struct hv_device *dev,
> > >  	struct net_device_context *net_device_ctx;
> > >  	struct netvsc_device_info *device_info = NULL;
> > >  	struct netvsc_device *nvdev;
> > > +	char name[IFNAMSIZ];
> > >  	int ret = -ENOMEM;
> > >
> > > -	net = alloc_etherdev_mq(sizeof(struct net_device_context),
> > > -				VRSS_CHANNEL_MAX);
> > > +	snprintf(name, IFNAMSIZ, "eth%d", dev->channel->dev_num);  
> > 
> > How is this supposed to work when there are other ethernet device types on the
> > system, which may claim the same device names?
> >   
> > > +	net = alloc_netdev_mqs(sizeof(struct net_device_context), name,
> > > +			       NET_NAME_ENUM, ether_setup,
> > > +			       VRSS_CHANNEL_MAX, VRSS_CHANNEL_MAX);
> > > +
> > >  	if (!net)
> > >  		goto no_net;
> > >
> > > @@ -2355,6 +2359,14 @@ static int netvsc_probe(struct hv_device *dev,
> > >  		net->max_mtu = ETH_DATA_LEN;
> > >
> > >  	ret = register_netdevice(net);
> > > +
> > > +	if (ret == -EEXIST) {
> > > +		pr_info("NIC name %s exists, request another name.\n",
> > > +			net->name);
> > > +		strlcpy(net->name, "eth%d", IFNAMSIZ);
> > > +		ret = register_netdevice(net);
> > > +	}  
> > 
> > IOW you want the device naming to be predictable, but don't guarantee this?
> > 
> > I think the problem this patchset is trying to solve is much better solved with a
> > udev rule, similar to how it's done for PCI net devices.
> > And IMO the primary channel number, being a device's "hardware"
> > property, is more suited to be used in the device name, than this completely
> > ephemeral device number.  
> 
> The vmbus number can be affected by other types of devices and/or subchannel
> offerings. They are not stable either. That's why this patch set keeps track of the 
> offering sequence within the same device type in a new variable "dev_num".
> 
> As in my earlier email, to avoid impact by other types of NICs, we should put them
> into different naming formats, like "vf*", "enP*", etc. And yes, these can be done in
> udev.
> 
> But for netvsc (synthetic) NICs, we still want the default naming format "eth*". And
> the variable "dev_num" gives them the basis for stable naming with Async probing.
> 
> Thanks,
> - Haiyang
> 

The primary requirements for network naming are:
  1. Network names must be repeatable on each boot. This was the original problem
     that PCI devices discovered back years ago when parallel probing was enabled.
  2. Network names must be predictable. If new VM is created, the names should
     match a similar VM config.
  3. Names must be persistent. If a NIC is added or deleted, the existing names
     must not change.

The other things which are important (and this proposal breaks):
  1. Don't break it principle: an existing VM should not suddenly get interfaces
     renamed if kernel is upgraded. A corrallary is that a lot of current userspace
     code expects eth0. It doesn't look like first interface would be guaranteed
     to be eth0.

  2. No snowflakes principle: a device driver should follow the current practice
     of other devices. For netvsc, this means VMBus should act like PCI as much
     as possible. Is there another driver doing this already?

  3. Userspace policy principle: Every distribution has its own policy by now.
     The solution must make netvsc work reliably on Redhat (udev), Ubuntu (netplan), SuSE (Yast)
     doing something in the kernel violates #2.

My recommendation would be to take a multi-phase approach:
  1. Expose persistent value in sysfs now.
  2. Work with udev/netplan/... to use that value. 
  3. Make parallel VMBus probing an option. So that when distributions have picked up
     the udev changes they can enable parallel probe. Some will be quick to adopt
     and the enterprise laggards can get to it when they feel the heat.

Long term wish list (requires host side changes):
   1. The interface index could be a host side property; the host networking
       already has the virtual device table and it is persistent.
   2. The Azure NIC name should be visible as a property in guest. 
      Then userspace could do rename based on that property.
      Having multiple disconnected names is leads to confusion.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on vmbus offer sequence and use async probe
  2019-12-31 17:36       ` Stephen Hemminger
@ 2019-12-31 17:51         ` Haiyang Zhang
  0 siblings, 0 replies; 14+ messages in thread
From: Haiyang Zhang @ 2019-12-31 17:51 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Roman Kagan, sashal, linux-hyperv, netdev, KY Srinivasan,
	Stephen Hemminger, olaf, vkuznets, davem



> -----Original Message-----
> From: linux-hyperv-owner@vger.kernel.org <linux-hyperv-
> owner@vger.kernel.org> On Behalf Of Stephen Hemminger
> Sent: Tuesday, December 31, 2019 12:36 PM
> To: Haiyang Zhang <haiyangz@microsoft.com>
> Cc: Roman Kagan <rkagan@virtuozzo.com>; sashal@kernel.org; linux-
> hyperv@vger.kernel.org; netdev@vger.kernel.org; KY Srinivasan
> <kys@microsoft.com>; Stephen Hemminger <sthemmin@microsoft.com>;
> olaf@aepfle.de; vkuznets <vkuznets@redhat.com>; davem@davemloft.net
> Subject: Re: [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on vmbus
> offer sequence and use async probe
> 
> On Tue, 31 Dec 2019 16:12:36 +0000
> Haiyang Zhang <haiyangz@microsoft.com> wrote:
> 
> > > -----Original Message-----
> > > From: Roman Kagan <rkagan@virtuozzo.com>
> > > Sent: Tuesday, December 31, 2019 6:35 AM
> > > To: Haiyang Zhang <haiyangz@microsoft.com>
> > > Cc: sashal@kernel.org; linux-hyperv@vger.kernel.org;
> netdev@vger.kernel.org;
> > > KY Srinivasan <kys@microsoft.com>; Stephen Hemminger
> > > <sthemmin@microsoft.com>; olaf@aepfle.de; vkuznets
> > > <vkuznets@redhat.com>; davem@davemloft.net; linux-
> kernel@vger.kernel.org
> > > Subject: Re: [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on
> vmbus
> > > offer sequence and use async probe
> > >
> > > On Mon, Dec 30, 2019 at 12:13:34PM -0800, Haiyang Zhang wrote:
> > > > The dev_num field in vmbus channel structure is assigned to the first
> > > > available number when the channel is offered. So netvsc driver uses it
> > > > for NIC naming based on channel offer sequence. Now re-enable the
> > > > async probing mode for faster probing.
> > > >
> > > > Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
> > > > ---
> > > >  drivers/net/hyperv/netvsc_drv.c | 18 +++++++++++++++---
> > > >  1 file changed, 15 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/drivers/net/hyperv/netvsc_drv.c
> > > > b/drivers/net/hyperv/netvsc_drv.c index f3f9eb8..39c412f 100644
> > > > --- a/drivers/net/hyperv/netvsc_drv.c
> > > > +++ b/drivers/net/hyperv/netvsc_drv.c
> > > > @@ -2267,10 +2267,14 @@ static int netvsc_probe(struct hv_device *dev,
> > > >  	struct net_device_context *net_device_ctx;
> > > >  	struct netvsc_device_info *device_info = NULL;
> > > >  	struct netvsc_device *nvdev;
> > > > +	char name[IFNAMSIZ];
> > > >  	int ret = -ENOMEM;
> > > >
> > > > -	net = alloc_etherdev_mq(sizeof(struct net_device_context),
> > > > -				VRSS_CHANNEL_MAX);
> > > > +	snprintf(name, IFNAMSIZ, "eth%d", dev->channel->dev_num);
> > >
> > > How is this supposed to work when there are other ethernet device types on
> the
> > > system, which may claim the same device names?
> > >
> > > > +	net = alloc_netdev_mqs(sizeof(struct net_device_context), name,
> > > > +			       NET_NAME_ENUM, ether_setup,
> > > > +			       VRSS_CHANNEL_MAX, VRSS_CHANNEL_MAX);
> > > > +
> > > >  	if (!net)
> > > >  		goto no_net;
> > > >
> > > > @@ -2355,6 +2359,14 @@ static int netvsc_probe(struct hv_device *dev,
> > > >  		net->max_mtu = ETH_DATA_LEN;
> > > >
> > > >  	ret = register_netdevice(net);
> > > > +
> > > > +	if (ret == -EEXIST) {
> > > > +		pr_info("NIC name %s exists, request another name.\n",
> > > > +			net->name);
> > > > +		strlcpy(net->name, "eth%d", IFNAMSIZ);
> > > > +		ret = register_netdevice(net);
> > > > +	}
> > >
> > > IOW you want the device naming to be predictable, but don't guarantee this?
> > >
> > > I think the problem this patchset is trying to solve is much better solved with
> a
> > > udev rule, similar to how it's done for PCI net devices.
> > > And IMO the primary channel number, being a device's "hardware"
> > > property, is more suited to be used in the device name, than this completely
> > > ephemeral device number.
> >
> > The vmbus number can be affected by other types of devices and/or
> subchannel
> > offerings. They are not stable either. That's why this patch set keeps track of
> the
> > offering sequence within the same device type in a new variable "dev_num".
> >
> > As in my earlier email, to avoid impact by other types of NICs, we should put
> them
> > into different naming formats, like "vf*", "enP*", etc. And yes, these can be
> done in
> > udev.
> >
> > But for netvsc (synthetic) NICs, we still want the default naming format "eth*".
> And
> > the variable "dev_num" gives them the basis for stable naming with Async
> probing.
> >
> > Thanks,
> > - Haiyang
> >
> 
> The primary requirements for network naming are:
>   1. Network names must be repeatable on each boot. This was the original
> problem
>      that PCI devices discovered back years ago when parallel probing was
> enabled.
>   2. Network names must be predictable. If new VM is created, the names
> should
>      match a similar VM config.
>   3. Names must be persistent. If a NIC is added or deleted, the existing names
>      must not change.
> 
> The other things which are important (and this proposal breaks):
>   1. Don't break it principle: an existing VM should not suddenly get interfaces
>      renamed if kernel is upgraded. A corrallary is that a lot of current userspace
>      code expects eth0. It doesn't look like first interface would be guaranteed
>      to be eth0.
> 
>   2. No snowflakes principle: a device driver should follow the current practice
>      of other devices. For netvsc, this means VMBus should act like PCI as much
>      as possible. Is there another driver doing this already?
> 
>   3. Userspace policy principle: Every distribution has its own policy by now.
>      The solution must make netvsc work reliably on Redhat (udev), Ubuntu
> (netplan), SuSE (Yast)
>      doing something in the kernel violates #2.
> 
> My recommendation would be to take a multi-phase approach:
>   1. Expose persistent value in sysfs now.
>   2. Work with udev/netplan/... to use that value.
>   3. Make parallel VMBus probing an option. So that when distributions have
> picked up
>      the udev changes they can enable parallel probe. Some will be quick to
> adopt
>      and the enterprise laggards can get to it when they feel the heat.
> 
> Long term wish list (requires host side changes):
>    1. The interface index could be a host side property; the host networking
>        already has the virtual device table and it is persistent.
>    2. The Azure NIC name should be visible as a property in guest.
>       Then userspace could do rename based on that property.
>       Having multiple disconnected names is leads to confusion.

Thank you for the detailed recommendations!
I will do the step 1. Expose persistent value in sysfs now.
And work on other changes in the future.

Thanks,
- Haiyang

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2019-12-31 17:51 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-30 20:13 [PATCH V2,net-next, 0/3] Add vmbus dev_num and enable netvsc async probing Haiyang Zhang
2019-12-30 20:13 ` [PATCH V2,net-next, 1/3] Drivers: hv: vmbus: Add a dev_num variable based on channel offer sequence Haiyang Zhang
2019-12-31  1:34   ` Michael Kelley
2019-12-31 15:48     ` Haiyang Zhang
2019-12-30 20:13 ` [PATCH V2,net-next, 2/3] Drivers: hv: vmbus: Add dev_num to sysfs Haiyang Zhang
2019-12-31  1:34   ` Michael Kelley
2019-12-31 15:50     ` Haiyang Zhang
2019-12-30 20:13 ` [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on vmbus offer sequence and use async probe Haiyang Zhang
2019-12-31  1:35   ` Michael Kelley
2019-12-31 16:01     ` Haiyang Zhang
2019-12-31 11:34   ` Roman Kagan
2019-12-31 16:12     ` Haiyang Zhang
2019-12-31 17:36       ` Stephen Hemminger
2019-12-31 17:51         ` Haiyang Zhang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.