All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-stable 00/24] hv_netvsc patches for 4.14 stable
@ 2018-05-14 22:31 Stephen Hemminger
  2018-05-14 22:32 ` [PATCH net-stable 01/24] hv_netvsc: Fix the real number of queues of non-vRSS cases Stephen Hemminger
                   ` (23 more replies)
  0 siblings, 24 replies; 25+ messages in thread
From: Stephen Hemminger @ 2018-05-14 22:31 UTC (permalink / raw)
  To: davem; +Cc: netdev, Stephen Hemminger

These patches are backport of latest stability related patches
from upstream.  Although it looks like a lot it encompasses
three main areas:
   1. The set of patches to get rid of races when MTU or number
      of queues is changed while device is up.  And make this
      work on older versions of Windows server.
   2. Make transparent passthrough mode work better by setting
      master/slave correctly.
   3. Do correct queue mapping in NUMA and VF mode.

Haiyang Zhang (6):
  hv_netvsc: Fix the real number of queues of non-vRSS cases
  hv_netvsc: Rename ind_table to rx_table
  hv_netvsc: Rename tx_send_table to tx_table
  hv_netvsc: Add initialization of tx_table in netvsc_device_add()
  hv_netvsc: Set tx_table to equal weight after subchannels open
  hv_netvsc: Use the num_online_cpus() for channel limit

Mohammed Gamal (4):
  hv_netvsc: Use Windows version instead of NVSP version on GPAD
    teardown
  hv_netvsc: Split netvsc_revoke_buf() and netvsc_teardown_gpadl()
  hv_netvsc: Ensure correct teardown message sequence order
  hv_netvsc: Fix net device attach on older Windows hosts

Stephen Hemminger (12):
  hv_netvsc: empty current transmit aggregation if flow blocked
  hv_netvsc: avoid retry on send during shutdown
  hv_netvsc: only wake transmit queue if link is up
  hv_netvsc: fix error unwind handling if vmbus_open fails
  hv_netvsc: cancel subchannel setup before halting device
  hv_netvsc: fix race in napi poll when rescheduling
  hv_netvsc: defer queue selection to VF
  hv_netvsc: disable NAPI before channel close
  hv_netvsc: use RCU to fix concurrent rx and queue changes
  hv_netvsc: change GPAD teardown order on older versions
  hv_netvsc: common detach logic
  hv_netvsc: set master device

Vitaly Kuznetsov (2):
  hv_netvsc: netvsc_teardown_gpadl() split
  hv_netvsc: preserve hw_features on mtu/channels/ringparam changes

 drivers/net/hyperv/hyperv_net.h   |  11 +-
 drivers/net/hyperv/netvsc.c       | 203 +++++++++++--------
 drivers/net/hyperv/netvsc_drv.c   | 313 +++++++++++++++++-------------
 drivers/net/hyperv/rndis_filter.c | 210 ++++++++++----------
 4 files changed, 417 insertions(+), 320 deletions(-)

-- 
2.17.0

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH net-stable 01/24] hv_netvsc: Fix the real number of queues of non-vRSS cases
  2018-05-14 22:31 [PATCH net-stable 00/24] hv_netvsc patches for 4.14 stable Stephen Hemminger
@ 2018-05-14 22:32 ` Stephen Hemminger
  2018-05-14 22:32 ` [PATCH net-stable 02/24] hv_netvsc: Rename ind_table to rx_table Stephen Hemminger
                   ` (22 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: Stephen Hemminger @ 2018-05-14 22:32 UTC (permalink / raw)
  To: davem; +Cc: netdev, Haiyang Zhang

From: Haiyang Zhang <haiyangz@microsoft.com>

commit 6450f8f269a9271985e4a8c13920b7e4cf21c0f3 upstream.

For older hosts without multi-channel (vRSS) support, and some error
cases, we still need to set the real number of queues to one.
This patch adds this missing setting.

Fixes: 8195b1396ec8 ("hv_netvsc: fix deadlock on hotplug")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/hyperv/netvsc_drv.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index c849de3cb046..e40be71df9d6 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -1931,6 +1931,12 @@ static int netvsc_probe(struct hv_device *dev,
 	/* We always need headroom for rndis header */
 	net->needed_headroom = RNDIS_AND_PPI_SIZE;
 
+	/* Initialize the number of queues to be 1, we may change it if more
+	 * channels are offered later.
+	 */
+	netif_set_real_num_tx_queues(net, 1);
+	netif_set_real_num_rx_queues(net, 1);
+
 	/* Notify the netvsc driver of the new device */
 	memset(&device_info, 0, sizeof(device_info));
 	device_info.ring_size = ring_size;
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH net-stable 02/24] hv_netvsc: Rename ind_table to rx_table
  2018-05-14 22:31 [PATCH net-stable 00/24] hv_netvsc patches for 4.14 stable Stephen Hemminger
  2018-05-14 22:32 ` [PATCH net-stable 01/24] hv_netvsc: Fix the real number of queues of non-vRSS cases Stephen Hemminger
@ 2018-05-14 22:32 ` Stephen Hemminger
  2018-05-14 22:32 ` [PATCH net-stable 03/24] hv_netvsc: Rename tx_send_table to tx_table Stephen Hemminger
                   ` (21 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: Stephen Hemminger @ 2018-05-14 22:32 UTC (permalink / raw)
  To: davem; +Cc: netdev, Haiyang Zhang

From: Haiyang Zhang <haiyangz@microsoft.com>

commit 47371300dfc269dd8d150e5b872bdbbda98ba809 upstream.

Rename this variable because it is the Receive indirection
table.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/hyperv/hyperv_net.h   | 2 +-
 drivers/net/hyperv/netvsc_drv.c   | 4 ++--
 drivers/net/hyperv/rndis_filter.c | 6 +++---
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/net/hyperv/hyperv_net.h b/drivers/net/hyperv/hyperv_net.h
index 4f3afcf92a7c..8144b938460f 100644
--- a/drivers/net/hyperv/hyperv_net.h
+++ b/drivers/net/hyperv/hyperv_net.h
@@ -179,7 +179,7 @@ struct rndis_device {
 
 	u8 hw_mac_adr[ETH_ALEN];
 	u8 rss_key[NETVSC_HASH_KEYLEN];
-	u16 ind_table[ITAB_NUM];
+	u16 rx_table[ITAB_NUM];
 };
 
 
diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index e40be71df9d6..8b358d3fc057 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -1378,7 +1378,7 @@ static int netvsc_get_rxfh(struct net_device *dev, u32 *indir, u8 *key,
 	rndis_dev = ndev->extension;
 	if (indir) {
 		for (i = 0; i < ITAB_NUM; i++)
-			indir[i] = rndis_dev->ind_table[i];
+			indir[i] = rndis_dev->rx_table[i];
 	}
 
 	if (key)
@@ -1408,7 +1408,7 @@ static int netvsc_set_rxfh(struct net_device *dev, const u32 *indir,
 				return -EINVAL;
 
 		for (i = 0; i < ITAB_NUM; i++)
-			rndis_dev->ind_table[i] = indir[i];
+			rndis_dev->rx_table[i] = indir[i];
 	}
 
 	if (!key) {
diff --git a/drivers/net/hyperv/rndis_filter.c b/drivers/net/hyperv/rndis_filter.c
index 065b204d8e17..addf9f69c58c 100644
--- a/drivers/net/hyperv/rndis_filter.c
+++ b/drivers/net/hyperv/rndis_filter.c
@@ -759,7 +759,7 @@ int rndis_filter_set_rss_param(struct rndis_device *rdev,
 	/* Set indirection table entries */
 	itab = (u32 *)(rssp + 1);
 	for (i = 0; i < ITAB_NUM; i++)
-		itab[i] = rdev->ind_table[i];
+		itab[i] = rdev->rx_table[i];
 
 	/* Set hask key values */
 	keyp = (u8 *)((unsigned long)rssp + rssp->kashkey_offset);
@@ -1284,8 +1284,8 @@ struct netvsc_device *rndis_filter_device_add(struct hv_device *dev,
 	net_device->num_chn = min(net_device->max_chn, device_info->num_chn);
 
 	for (i = 0; i < ITAB_NUM; i++)
-		rndis_device->ind_table[i] = ethtool_rxfh_indir_default(i,
-							net_device->num_chn);
+		rndis_device->rx_table[i] = ethtool_rxfh_indir_default(
+						i, net_device->num_chn);
 
 	atomic_set(&net_device->open_chn, 1);
 	vmbus_set_sc_create_callback(dev->channel, netvsc_sc_open);
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH net-stable 03/24] hv_netvsc: Rename tx_send_table to tx_table
  2018-05-14 22:31 [PATCH net-stable 00/24] hv_netvsc patches for 4.14 stable Stephen Hemminger
  2018-05-14 22:32 ` [PATCH net-stable 01/24] hv_netvsc: Fix the real number of queues of non-vRSS cases Stephen Hemminger
  2018-05-14 22:32 ` [PATCH net-stable 02/24] hv_netvsc: Rename ind_table to rx_table Stephen Hemminger
@ 2018-05-14 22:32 ` Stephen Hemminger
  2018-05-14 22:32 ` [PATCH net-stable 04/24] hv_netvsc: Add initialization of tx_table in netvsc_device_add() Stephen Hemminger
                   ` (20 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: Stephen Hemminger @ 2018-05-14 22:32 UTC (permalink / raw)
  To: davem; +Cc: netdev, Haiyang Zhang

From: Haiyang Zhang <haiyangz@microsoft.com>

commit 39e91cfbf6f5fb26ba64cc2e8874372baf1671e7 upstream.

Simplify the variable name: tx_send_table

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/hyperv/hyperv_net.h | 2 +-
 drivers/net/hyperv/netvsc.c     | 2 +-
 drivers/net/hyperv/netvsc_drv.c | 4 ++--
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/hyperv/hyperv_net.h b/drivers/net/hyperv/hyperv_net.h
index 8144b938460f..94de7ca1e52e 100644
--- a/drivers/net/hyperv/hyperv_net.h
+++ b/drivers/net/hyperv/hyperv_net.h
@@ -734,7 +734,7 @@ struct net_device_context {
 
 	u32 tx_checksum_mask;
 
-	u32 tx_send_table[VRSS_SEND_TAB_SIZE];
+	u32 tx_table[VRSS_SEND_TAB_SIZE];
 
 	/* Ethtool settings */
 	bool udp4_l4_hash;
diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index a6bafcf55776..03b44ec805db 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -1107,7 +1107,7 @@ static void netvsc_send_table(struct hv_device *hdev,
 		      nvmsg->msg.v5_msg.send_table.offset);
 
 	for (i = 0; i < count; i++)
-		net_device_ctx->tx_send_table[i] = tab[i];
+		net_device_ctx->tx_table[i] = tab[i];
 }
 
 static void netvsc_send_vf(struct net_device_context *net_device_ctx,
diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index 8b358d3fc057..f5d299b82bac 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -234,8 +234,8 @@ static inline int netvsc_get_tx_queue(struct net_device *ndev,
 	struct sock *sk = skb->sk;
 	int q_idx;
 
-	q_idx = ndc->tx_send_table[netvsc_get_hash(skb, ndc) &
-				   (VRSS_SEND_TAB_SIZE - 1)];
+	q_idx = ndc->tx_table[netvsc_get_hash(skb, ndc) &
+			      (VRSS_SEND_TAB_SIZE - 1)];
 
 	/* If queue index changed record the new value */
 	if (q_idx != old_idx &&
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH net-stable 04/24] hv_netvsc: Add initialization of tx_table in netvsc_device_add()
  2018-05-14 22:31 [PATCH net-stable 00/24] hv_netvsc patches for 4.14 stable Stephen Hemminger
                   ` (2 preceding siblings ...)
  2018-05-14 22:32 ` [PATCH net-stable 03/24] hv_netvsc: Rename tx_send_table to tx_table Stephen Hemminger
@ 2018-05-14 22:32 ` Stephen Hemminger
  2018-05-14 22:32 ` [PATCH net-stable 05/24] hv_netvsc: Set tx_table to equal weight after subchannels open Stephen Hemminger
                   ` (19 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: Stephen Hemminger @ 2018-05-14 22:32 UTC (permalink / raw)
  To: davem; +Cc: netdev, Haiyang Zhang

From: Haiyang Zhang <haiyangz@microsoft.com>

commit 6b0cbe315868d613123cf387052ccda5f09d49ea upstream.

tx_table is part of the private data of kernel net_device. It is only
zero-ed out when allocating net_device.

We may recreate netvsc_device w/o recreating net_device, so the private
netdev data, including tx_table, are not zeroed. It may contain channel
numbers for the older netvsc_device.

This patch adds initialization of tx_table each time we recreate
netvsc_device.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/hyperv/netvsc.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index 03b44ec805db..73999214d444 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -1252,6 +1252,9 @@ struct netvsc_device *netvsc_device_add(struct hv_device *device,
 	if (!net_device)
 		return ERR_PTR(-ENOMEM);
 
+	for (i = 0; i < VRSS_SEND_TAB_SIZE; i++)
+		net_device_ctx->tx_table[i] = 0;
+
 	net_device->ring_size = ring_size;
 
 	/* Because the device uses NAPI, all the interrupt batching and
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH net-stable 05/24] hv_netvsc: Set tx_table to equal weight after subchannels open
  2018-05-14 22:31 [PATCH net-stable 00/24] hv_netvsc patches for 4.14 stable Stephen Hemminger
                   ` (3 preceding siblings ...)
  2018-05-14 22:32 ` [PATCH net-stable 04/24] hv_netvsc: Add initialization of tx_table in netvsc_device_add() Stephen Hemminger
@ 2018-05-14 22:32 ` Stephen Hemminger
  2018-05-14 22:32 ` [PATCH net-stable 06/24] hv_netvsc: netvsc_teardown_gpadl() split Stephen Hemminger
                   ` (18 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: Stephen Hemminger @ 2018-05-14 22:32 UTC (permalink / raw)
  To: davem; +Cc: netdev, Haiyang Zhang

From: Haiyang Zhang <haiyangz@microsoft.com>

commit a6fb6aa3cfa9047b62653dbcfc9bcde6e2272b41 upstream.

In some cases, like internal vSwitch, the host doesn't provide
send indirection table updates. This patch sets the table to be
equal weight after subchannels are all open. Otherwise, all workload
will be on one TX channel.

As tested, this patch has largely increased the throughput over
internal vSwitch.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/hyperv/rndis_filter.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/hyperv/rndis_filter.c b/drivers/net/hyperv/rndis_filter.c
index addf9f69c58c..0648eebda829 100644
--- a/drivers/net/hyperv/rndis_filter.c
+++ b/drivers/net/hyperv/rndis_filter.c
@@ -1114,6 +1114,9 @@ void rndis_set_subchannel(struct work_struct *w)
 	netif_set_real_num_tx_queues(ndev, nvdev->num_chn);
 	netif_set_real_num_rx_queues(ndev, nvdev->num_chn);
 
+	for (i = 0; i < VRSS_SEND_TAB_SIZE; i++)
+		ndev_ctx->tx_table[i] = i % nvdev->num_chn;
+
 	rtnl_unlock();
 	return;
 
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH net-stable 06/24] hv_netvsc: netvsc_teardown_gpadl() split
  2018-05-14 22:31 [PATCH net-stable 00/24] hv_netvsc patches for 4.14 stable Stephen Hemminger
                   ` (4 preceding siblings ...)
  2018-05-14 22:32 ` [PATCH net-stable 05/24] hv_netvsc: Set tx_table to equal weight after subchannels open Stephen Hemminger
@ 2018-05-14 22:32 ` Stephen Hemminger
  2018-05-14 22:32 ` [PATCH net-stable 07/24] hv_netvsc: preserve hw_features on mtu/channels/ringparam changes Stephen Hemminger
                   ` (17 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: Stephen Hemminger @ 2018-05-14 22:32 UTC (permalink / raw)
  To: davem; +Cc: netdev, Vitaly Kuznetsov

From: Vitaly Kuznetsov <vkuznets@redhat.com>

commit 0cf737808ae7cb25e952be619db46b9147a92f46 upstream.

It was found that in some cases host refuses to teardown GPADL for send/
receive buffers (probably when some work with these buffere is scheduled or
ongoing). Change the teardown logic to be:
1) Send NVSP_MSG1_TYPE_REVOKE_* messages
2) Close the channel
3) Teardown GPADLs.
This seems to work reliably.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/hyperv/netvsc.c | 69 +++++++++++++++++++------------------
 1 file changed, 36 insertions(+), 33 deletions(-)

diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index 73999214d444..4bc8a1d529d9 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -100,12 +100,11 @@ static void free_netvsc_device_rcu(struct netvsc_device *nvdev)
 	call_rcu(&nvdev->rcu, free_netvsc_device);
 }
 
-static void netvsc_destroy_buf(struct hv_device *device)
+static void netvsc_revoke_buf(struct hv_device *device,
+			      struct netvsc_device *net_device)
 {
 	struct nvsp_message *revoke_packet;
 	struct net_device *ndev = hv_get_drvdata(device);
-	struct net_device_context *ndc = netdev_priv(ndev);
-	struct netvsc_device *net_device = rtnl_dereference(ndc->nvdev);
 	int ret;
 
 	/*
@@ -148,28 +147,6 @@ static void netvsc_destroy_buf(struct hv_device *device)
 		net_device->recv_section_cnt = 0;
 	}
 
-	/* Teardown the gpadl on the vsp end */
-	if (net_device->recv_buf_gpadl_handle) {
-		ret = vmbus_teardown_gpadl(device->channel,
-					   net_device->recv_buf_gpadl_handle);
-
-		/* If we failed here, we might as well return and have a leak
-		 * rather than continue and a bugchk
-		 */
-		if (ret != 0) {
-			netdev_err(ndev,
-				   "unable to teardown receive buffer's gpadl\n");
-			return;
-		}
-		net_device->recv_buf_gpadl_handle = 0;
-	}
-
-	if (net_device->recv_buf) {
-		/* Free up the receive buffer */
-		vfree(net_device->recv_buf);
-		net_device->recv_buf = NULL;
-	}
-
 	/* Deal with the send buffer we may have setup.
 	 * If we got a  send section size, it means we received a
 	 * NVSP_MSG1_TYPE_SEND_SEND_BUF_COMPLETE msg (ie sent
@@ -210,7 +187,35 @@ static void netvsc_destroy_buf(struct hv_device *device)
 		}
 		net_device->send_section_cnt = 0;
 	}
-	/* Teardown the gpadl on the vsp end */
+}
+
+static void netvsc_teardown_gpadl(struct hv_device *device,
+				  struct netvsc_device *net_device)
+{
+	struct net_device *ndev = hv_get_drvdata(device);
+	int ret;
+
+	if (net_device->recv_buf_gpadl_handle) {
+		ret = vmbus_teardown_gpadl(device->channel,
+					   net_device->recv_buf_gpadl_handle);
+
+		/* If we failed here, we might as well return and have a leak
+		 * rather than continue and a bugchk
+		 */
+		if (ret != 0) {
+			netdev_err(ndev,
+				   "unable to teardown receive buffer's gpadl\n");
+			return;
+		}
+		net_device->recv_buf_gpadl_handle = 0;
+	}
+
+	if (net_device->recv_buf) {
+		/* Free up the receive buffer */
+		vfree(net_device->recv_buf);
+		net_device->recv_buf = NULL;
+	}
+
 	if (net_device->send_buf_gpadl_handle) {
 		ret = vmbus_teardown_gpadl(device->channel,
 					   net_device->send_buf_gpadl_handle);
@@ -425,7 +430,8 @@ static int netvsc_init_buf(struct hv_device *device,
 	goto exit;
 
 cleanup:
-	netvsc_destroy_buf(device);
+	netvsc_revoke_buf(device, net_device);
+	netvsc_teardown_gpadl(device, net_device);
 
 exit:
 	return ret;
@@ -544,11 +550,6 @@ static int netvsc_connect_vsp(struct hv_device *device,
 	return ret;
 }
 
-static void netvsc_disconnect_vsp(struct hv_device *device)
-{
-	netvsc_destroy_buf(device);
-}
-
 /*
  * netvsc_device_remove - Callback when the root bus device is removed
  */
@@ -562,7 +563,7 @@ void netvsc_device_remove(struct hv_device *device)
 
 	cancel_work_sync(&net_device->subchan_work);
 
-	netvsc_disconnect_vsp(device);
+	netvsc_revoke_buf(device, net_device);
 
 	RCU_INIT_POINTER(net_device_ctx->nvdev, NULL);
 
@@ -575,6 +576,8 @@ void netvsc_device_remove(struct hv_device *device)
 	/* Now, we can close the channel safely */
 	vmbus_close(device->channel);
 
+	netvsc_teardown_gpadl(device, net_device);
+
 	/* And dissassociate NAPI context from device */
 	for (i = 0; i < net_device->num_chn; i++)
 		netif_napi_del(&net_device->chan_table[i].napi);
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH net-stable 07/24] hv_netvsc: preserve hw_features on mtu/channels/ringparam changes
  2018-05-14 22:31 [PATCH net-stable 00/24] hv_netvsc patches for 4.14 stable Stephen Hemminger
                   ` (5 preceding siblings ...)
  2018-05-14 22:32 ` [PATCH net-stable 06/24] hv_netvsc: netvsc_teardown_gpadl() split Stephen Hemminger
@ 2018-05-14 22:32 ` Stephen Hemminger
  2018-05-14 22:32 ` [PATCH net-stable 08/24] hv_netvsc: empty current transmit aggregation if flow blocked Stephen Hemminger
                   ` (16 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: Stephen Hemminger @ 2018-05-14 22:32 UTC (permalink / raw)
  To: davem; +Cc: netdev, Vitaly Kuznetsov

From: Vitaly Kuznetsov <vkuznets@redhat.com>

commit aefd80e874e98a864915df5b7d90824a4340b450 upstream.

rndis_filter_device_add() is called both from netvsc_probe() when we
initially create the device and from set channels/mtu/ringparam
routines where we basically remove the device and add it back.

hw_features is reset in rndis_filter_device_add() and filled with
host data. However, we lose all additional flags which are set outside
of the driver, e.g. register_netdevice() adds NETIF_F_SOFT_FEATURES and
many others.

Unfortunately, calls to rndis_{query_hwcaps(), _set_offload_params()}
calls cannot be avoided on every RNDIS reset: host expects us to set
required features explicitly. Moreover, in theory hardware capabilities
can change and we need to reflect the change in hw_features.

Reset net->hw_features bits according to host data in
rndis_netdev_set_hwcaps(), clear corresponding feature bits
from net->features in case some features went missing (will never happen
in real life I guess but let's be consistent).

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/hyperv/hyperv_net.h   |   4 +
 drivers/net/hyperv/netvsc_drv.c   |   2 +-
 drivers/net/hyperv/rndis_filter.c | 136 +++++++++++++++++-------------
 3 files changed, 83 insertions(+), 59 deletions(-)

diff --git a/drivers/net/hyperv/hyperv_net.h b/drivers/net/hyperv/hyperv_net.h
index 94de7ca1e52e..a3f628c3c9ed 100644
--- a/drivers/net/hyperv/hyperv_net.h
+++ b/drivers/net/hyperv/hyperv_net.h
@@ -659,6 +659,10 @@ struct nvsp_message {
 #define NETVSC_RECEIVE_BUFFER_ID		0xcafe
 #define NETVSC_SEND_BUFFER_ID			0
 
+#define NETVSC_SUPPORTED_HW_FEATURES (NETIF_F_RXCSUM | NETIF_F_IP_CSUM | \
+				      NETIF_F_TSO | NETIF_F_IPV6_CSUM | \
+				      NETIF_F_TSO6)
+
 #define VRSS_SEND_TAB_SIZE 16  /* must be power of 2 */
 #define VRSS_CHANNEL_MAX 64
 #define VRSS_CHANNEL_DEFAULT 8
diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index f5d299b82bac..cfaf433b0cda 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -1955,7 +1955,7 @@ static int netvsc_probe(struct hv_device *dev,
 
 	memcpy(net->dev_addr, device_info.mac_adr, ETH_ALEN);
 
-	/* hw_features computed in rndis_filter_device_add */
+	/* hw_features computed in rndis_netdev_set_hwcaps() */
 	net->features = net->hw_features |
 		NETIF_F_HIGHDMA | NETIF_F_SG |
 		NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_CTAG_RX;
diff --git a/drivers/net/hyperv/rndis_filter.c b/drivers/net/hyperv/rndis_filter.c
index 0648eebda829..be57639bee29 100644
--- a/drivers/net/hyperv/rndis_filter.c
+++ b/drivers/net/hyperv/rndis_filter.c
@@ -1131,69 +1131,20 @@ void rndis_set_subchannel(struct work_struct *w)
 	rtnl_unlock();
 }
 
-struct netvsc_device *rndis_filter_device_add(struct hv_device *dev,
-				      struct netvsc_device_info *device_info)
+static int rndis_netdev_set_hwcaps(struct rndis_device *rndis_device,
+				   struct netvsc_device *nvdev)
 {
-	struct net_device *net = hv_get_drvdata(dev);
+	struct net_device *net = rndis_device->ndev;
 	struct net_device_context *net_device_ctx = netdev_priv(net);
-	struct netvsc_device *net_device;
-	struct rndis_device *rndis_device;
 	struct ndis_offload hwcaps;
 	struct ndis_offload_params offloads;
-	struct ndis_recv_scale_cap rsscap;
-	u32 rsscap_size = sizeof(struct ndis_recv_scale_cap);
 	unsigned int gso_max_size = GSO_MAX_SIZE;
-	u32 mtu, size;
-	const struct cpumask *node_cpu_mask;
-	u32 num_possible_rss_qs;
-	int i, ret;
-
-	rndis_device = get_rndis_device();
-	if (!rndis_device)
-		return ERR_PTR(-ENODEV);
-
-	/*
-	 * Let the inner driver handle this first to create the netvsc channel
-	 * NOTE! Once the channel is created, we may get a receive callback
-	 * (RndisFilterOnReceive()) before this call is completed
-	 */
-	net_device = netvsc_device_add(dev, device_info);
-	if (IS_ERR(net_device)) {
-		kfree(rndis_device);
-		return net_device;
-	}
-
-	/* Initialize the rndis device */
-	net_device->max_chn = 1;
-	net_device->num_chn = 1;
-
-	net_device->extension = rndis_device;
-	rndis_device->ndev = net;
-
-	/* Send the rndis initialization message */
-	ret = rndis_filter_init_device(rndis_device, net_device);
-	if (ret != 0)
-		goto err_dev_remv;
-
-	/* Get the MTU from the host */
-	size = sizeof(u32);
-	ret = rndis_filter_query_device(rndis_device, net_device,
-					RNDIS_OID_GEN_MAXIMUM_FRAME_SIZE,
-					&mtu, &size);
-	if (ret == 0 && size == sizeof(u32) && mtu < net->mtu)
-		net->mtu = mtu;
-
-	/* Get the mac address */
-	ret = rndis_filter_query_device_mac(rndis_device, net_device);
-	if (ret != 0)
-		goto err_dev_remv;
-
-	memcpy(device_info->mac_adr, rndis_device->hw_mac_adr, ETH_ALEN);
+	int ret;
 
 	/* Find HW offload capabilities */
-	ret = rndis_query_hwcaps(rndis_device, net_device, &hwcaps);
+	ret = rndis_query_hwcaps(rndis_device, nvdev, &hwcaps);
 	if (ret != 0)
-		goto err_dev_remv;
+		return ret;
 
 	/* A value of zero means "no change"; now turn on what we want. */
 	memset(&offloads, 0, sizeof(struct ndis_offload_params));
@@ -1201,8 +1152,12 @@ struct netvsc_device *rndis_filter_device_add(struct hv_device *dev,
 	/* Linux does not care about IP checksum, always does in kernel */
 	offloads.ip_v4_csum = NDIS_OFFLOAD_PARAMETERS_TX_RX_DISABLED;
 
+	/* Reset previously set hw_features flags */
+	net->hw_features &= ~NETVSC_SUPPORTED_HW_FEATURES;
+	net_device_ctx->tx_checksum_mask = 0;
+
 	/* Compute tx offload settings based on hw capabilities */
-	net->hw_features = NETIF_F_RXCSUM;
+	net->hw_features |= NETIF_F_RXCSUM;
 
 	if ((hwcaps.csum.ip4_txcsum & NDIS_TXCSUM_ALL_TCP4) == NDIS_TXCSUM_ALL_TCP4) {
 		/* Can checksum TCP */
@@ -1246,10 +1201,75 @@ struct netvsc_device *rndis_filter_device_add(struct hv_device *dev,
 		}
 	}
 
+	/* In case some hw_features disappeared we need to remove them from
+	 * net->features list as they're no longer supported.
+	 */
+	net->features &= ~NETVSC_SUPPORTED_HW_FEATURES | net->hw_features;
+
 	netif_set_gso_max_size(net, gso_max_size);
 
-	ret = rndis_filter_set_offload_params(net, net_device, &offloads);
-	if (ret)
+	ret = rndis_filter_set_offload_params(net, nvdev, &offloads);
+
+	return ret;
+}
+
+struct netvsc_device *rndis_filter_device_add(struct hv_device *dev,
+				      struct netvsc_device_info *device_info)
+{
+	struct net_device *net = hv_get_drvdata(dev);
+	struct netvsc_device *net_device;
+	struct rndis_device *rndis_device;
+	struct ndis_recv_scale_cap rsscap;
+	u32 rsscap_size = sizeof(struct ndis_recv_scale_cap);
+	u32 mtu, size;
+	const struct cpumask *node_cpu_mask;
+	u32 num_possible_rss_qs;
+	int i, ret;
+
+	rndis_device = get_rndis_device();
+	if (!rndis_device)
+		return ERR_PTR(-ENODEV);
+
+	/* Let the inner driver handle this first to create the netvsc channel
+	 * NOTE! Once the channel is created, we may get a receive callback
+	 * (RndisFilterOnReceive()) before this call is completed
+	 */
+	net_device = netvsc_device_add(dev, device_info);
+	if (IS_ERR(net_device)) {
+		kfree(rndis_device);
+		return net_device;
+	}
+
+	/* Initialize the rndis device */
+	net_device->max_chn = 1;
+	net_device->num_chn = 1;
+
+	net_device->extension = rndis_device;
+	rndis_device->ndev = net;
+
+	/* Send the rndis initialization message */
+	ret = rndis_filter_init_device(rndis_device, net_device);
+	if (ret != 0)
+		goto err_dev_remv;
+
+	/* Get the MTU from the host */
+	size = sizeof(u32);
+	ret = rndis_filter_query_device(rndis_device, net_device,
+					RNDIS_OID_GEN_MAXIMUM_FRAME_SIZE,
+					&mtu, &size);
+	if (ret == 0 && size == sizeof(u32) && mtu < net->mtu)
+		net->mtu = mtu;
+
+	/* Get the mac address */
+	ret = rndis_filter_query_device_mac(rndis_device, net_device);
+	if (ret != 0)
+		goto err_dev_remv;
+
+	memcpy(device_info->mac_adr, rndis_device->hw_mac_adr, ETH_ALEN);
+
+	/* Query and set hardware capabilities */
+	ret = rndis_netdev_set_hwcaps(rndis_device, net_device);
+	if (ret != 0)
 		goto err_dev_remv;
 
 	rndis_filter_query_device_link_status(rndis_device, net_device);
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH net-stable 08/24] hv_netvsc: empty current transmit aggregation if flow blocked
  2018-05-14 22:31 [PATCH net-stable 00/24] hv_netvsc patches for 4.14 stable Stephen Hemminger
                   ` (6 preceding siblings ...)
  2018-05-14 22:32 ` [PATCH net-stable 07/24] hv_netvsc: preserve hw_features on mtu/channels/ringparam changes Stephen Hemminger
@ 2018-05-14 22:32 ` Stephen Hemminger
  2018-05-14 22:32 ` [PATCH net-stable 09/24] hv_netvsc: Use the num_online_cpus() for channel limit Stephen Hemminger
                   ` (15 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: Stephen Hemminger @ 2018-05-14 22:32 UTC (permalink / raw)
  To: davem; +Cc: netdev, Stephen Hemminger, Stephen Hemminger

From: Stephen Hemminger <stephen@networkplumber.org>

commit cfd8afd986cdb59ea9adac873c5082498a1eb7c0 upstream

If the transmit queue is known full, then don't keep aggregating
data. And the cp_partial flag which indicates that the current
aggregation buffer is full can be folded in to avoid more
conditionals.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/hyperv/hyperv_net.h   |  2 +-
 drivers/net/hyperv/netvsc.c       | 36 ++++++++++++++++++-------------
 drivers/net/hyperv/netvsc_drv.c   |  2 +-
 drivers/net/hyperv/rndis_filter.c |  3 +--
 4 files changed, 24 insertions(+), 19 deletions(-)

diff --git a/drivers/net/hyperv/hyperv_net.h b/drivers/net/hyperv/hyperv_net.h
index a3f628c3c9ed..fd51a329e36e 100644
--- a/drivers/net/hyperv/hyperv_net.h
+++ b/drivers/net/hyperv/hyperv_net.h
@@ -192,7 +192,7 @@ struct netvsc_device *netvsc_device_add(struct hv_device *device,
 					const struct netvsc_device_info *info);
 int netvsc_alloc_recv_comp_ring(struct netvsc_device *net_device, u32 q_idx);
 void netvsc_device_remove(struct hv_device *device);
-int netvsc_send(struct net_device_context *ndc,
+int netvsc_send(struct net_device *net,
 		struct hv_netvsc_packet *packet,
 		struct rndis_message *rndis_msg,
 		struct hv_page_buffer *page_buffer,
diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index 4bc8a1d529d9..22da6399b37a 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -700,13 +700,13 @@ static u32 netvsc_get_next_send_section(struct netvsc_device *net_device)
 	return NETVSC_INVALID_INDEX;
 }
 
-static u32 netvsc_copy_to_send_buf(struct netvsc_device *net_device,
-				   unsigned int section_index,
-				   u32 pend_size,
-				   struct hv_netvsc_packet *packet,
-				   struct rndis_message *rndis_msg,
-				   struct hv_page_buffer *pb,
-				   struct sk_buff *skb)
+static void netvsc_copy_to_send_buf(struct netvsc_device *net_device,
+				    unsigned int section_index,
+				    u32 pend_size,
+				    struct hv_netvsc_packet *packet,
+				    struct rndis_message *rndis_msg,
+				    struct hv_page_buffer *pb,
+				    bool xmit_more)
 {
 	char *start = net_device->send_buf;
 	char *dest = start + (section_index * net_device->send_section_size)
@@ -719,7 +719,8 @@ static u32 netvsc_copy_to_send_buf(struct netvsc_device *net_device,
 		packet->page_buf_cnt;
 
 	/* Add padding */
-	if (skb->xmit_more && remain && !packet->cp_partial) {
+	remain = packet->total_data_buflen & (net_device->pkt_align - 1);
+	if (xmit_more && remain) {
 		padding = net_device->pkt_align - remain;
 		rndis_msg->msg_len += padding;
 		packet->total_data_buflen += padding;
@@ -739,8 +740,6 @@ static u32 netvsc_copy_to_send_buf(struct netvsc_device *net_device,
 		memset(dest, 0, padding);
 		msg_size += padding;
 	}
-
-	return msg_size;
 }
 
 static inline int netvsc_send_pkt(
@@ -828,12 +827,13 @@ static inline void move_pkt_msd(struct hv_netvsc_packet **msd_send,
 }
 
 /* RCU already held by caller */
-int netvsc_send(struct net_device_context *ndev_ctx,
+int netvsc_send(struct net_device *ndev,
 		struct hv_netvsc_packet *packet,
 		struct rndis_message *rndis_msg,
 		struct hv_page_buffer *pb,
 		struct sk_buff *skb)
 {
+	struct net_device_context *ndev_ctx = netdev_priv(ndev);
 	struct netvsc_device *net_device
 		= rcu_dereference_bh(ndev_ctx->nvdev);
 	struct hv_device *device = ndev_ctx->device_ctx;
@@ -844,8 +844,7 @@ int netvsc_send(struct net_device_context *ndev_ctx,
 	struct multi_send_data *msdp;
 	struct hv_netvsc_packet *msd_send = NULL, *cur_send = NULL;
 	struct sk_buff *msd_skb = NULL;
-	bool try_batch;
-	bool xmit_more = (skb != NULL) ? skb->xmit_more : false;
+	bool try_batch, xmit_more;
 
 	/* If device is rescinded, return error and packet will get dropped. */
 	if (unlikely(!net_device || net_device->destroy))
@@ -896,10 +895,17 @@ int netvsc_send(struct net_device_context *ndev_ctx,
 		}
 	}
 
+	/* Keep aggregating only if stack says more data is coming
+	 * and not doing mixed modes send and not flow blocked
+	 */
+	xmit_more = skb->xmit_more &&
+		!packet->cp_partial &&
+		!netif_xmit_stopped(netdev_get_tx_queue(ndev, packet->q_idx));
+
 	if (section_index != NETVSC_INVALID_INDEX) {
 		netvsc_copy_to_send_buf(net_device,
 					section_index, msd_len,
-					packet, rndis_msg, pb, skb);
+					packet, rndis_msg, pb, xmit_more);
 
 		packet->send_buf_index = section_index;
 
@@ -919,7 +925,7 @@ int netvsc_send(struct net_device_context *ndev_ctx,
 		if (msdp->skb)
 			dev_consume_skb_any(msdp->skb);
 
-		if (xmit_more && !packet->cp_partial) {
+		if (xmit_more) {
 			msdp->skb = skb;
 			msdp->pkt = packet;
 			msdp->count++;
diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index cfaf433b0cda..b48a673d526d 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -614,7 +614,7 @@ static int netvsc_start_xmit(struct sk_buff *skb, struct net_device *net)
 	/* timestamp packet in software */
 	skb_tx_timestamp(skb);
 
-	ret = netvsc_send(net_device_ctx, packet, rndis_msg, pb, skb);
+	ret = netvsc_send(net, packet, rndis_msg, pb, skb);
 	if (likely(ret == 0))
 		return NETDEV_TX_OK;
 
diff --git a/drivers/net/hyperv/rndis_filter.c b/drivers/net/hyperv/rndis_filter.c
index be57639bee29..0c99d9926085 100644
--- a/drivers/net/hyperv/rndis_filter.c
+++ b/drivers/net/hyperv/rndis_filter.c
@@ -217,7 +217,6 @@ static int rndis_filter_send_request(struct rndis_device *dev,
 	struct hv_netvsc_packet *packet;
 	struct hv_page_buffer page_buf[2];
 	struct hv_page_buffer *pb = page_buf;
-	struct net_device_context *net_device_ctx = netdev_priv(dev->ndev);
 	int ret;
 
 	/* Setup the packet to send it */
@@ -245,7 +244,7 @@ static int rndis_filter_send_request(struct rndis_device *dev,
 	}
 
 	rcu_read_lock_bh();
-	ret = netvsc_send(net_device_ctx, packet, NULL, pb, NULL);
+	ret = netvsc_send(dev->ndev, packet, NULL, pb, NULL);
 	rcu_read_unlock_bh();
 
 	return ret;
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH net-stable 09/24] hv_netvsc: Use the num_online_cpus() for channel limit
  2018-05-14 22:31 [PATCH net-stable 00/24] hv_netvsc patches for 4.14 stable Stephen Hemminger
                   ` (7 preceding siblings ...)
  2018-05-14 22:32 ` [PATCH net-stable 08/24] hv_netvsc: empty current transmit aggregation if flow blocked Stephen Hemminger
@ 2018-05-14 22:32 ` Stephen Hemminger
  2018-05-14 22:32 ` [PATCH net-stable 10/24] hv_netvsc: avoid retry on send during shutdown Stephen Hemminger
                   ` (14 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: Stephen Hemminger @ 2018-05-14 22:32 UTC (permalink / raw)
  To: davem; +Cc: netdev, Haiyang Zhang

From: Haiyang Zhang <haiyangz@microsoft.com>

commit 25a39f7f975c3c26a0052fbf9b59201c06744332 upstream

Since we no longer localize channel/CPU affiliation within one NUMA
node, num_online_cpus() is used as the number of channel cap, instead of
the number of processors in a NUMA node.

This patch allows a bigger range for tuning the number of channels.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/hyperv/rndis_filter.c | 11 ++---------
 1 file changed, 2 insertions(+), 9 deletions(-)

diff --git a/drivers/net/hyperv/rndis_filter.c b/drivers/net/hyperv/rndis_filter.c
index 0c99d9926085..05109ed5377c 100644
--- a/drivers/net/hyperv/rndis_filter.c
+++ b/drivers/net/hyperv/rndis_filter.c
@@ -1221,7 +1221,6 @@ struct netvsc_device *rndis_filter_device_add(struct hv_device *dev,
 	struct ndis_recv_scale_cap rsscap;
 	u32 rsscap_size = sizeof(struct ndis_recv_scale_cap);
 	u32 mtu, size;
-	const struct cpumask *node_cpu_mask;
 	u32 num_possible_rss_qs;
 	int i, ret;
 
@@ -1290,14 +1289,8 @@ struct netvsc_device *rndis_filter_device_add(struct hv_device *dev,
 	if (ret || rsscap.num_recv_que < 2)
 		goto out;
 
-	/*
-	 * We will limit the VRSS channels to the number CPUs in the NUMA node
-	 * the primary channel is currently bound to.
-	 *
-	 * This also guarantees that num_possible_rss_qs <= num_online_cpus
-	 */
-	node_cpu_mask = cpumask_of_node(cpu_to_node(dev->channel->target_cpu));
-	num_possible_rss_qs = min_t(u32, cpumask_weight(node_cpu_mask),
+	/* This guarantees that num_possible_rss_qs <= num_online_cpus */
+	num_possible_rss_qs = min_t(u32, num_online_cpus(),
 				    rsscap.num_recv_que);
 
 	net_device->max_chn = min_t(u32, VRSS_CHANNEL_MAX, num_possible_rss_qs);
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH net-stable 10/24] hv_netvsc: avoid retry on send during shutdown
  2018-05-14 22:31 [PATCH net-stable 00/24] hv_netvsc patches for 4.14 stable Stephen Hemminger
                   ` (8 preceding siblings ...)
  2018-05-14 22:32 ` [PATCH net-stable 09/24] hv_netvsc: Use the num_online_cpus() for channel limit Stephen Hemminger
@ 2018-05-14 22:32 ` Stephen Hemminger
  2018-05-14 22:32 ` [PATCH net-stable 11/24] hv_netvsc: only wake transmit queue if link is up Stephen Hemminger
                   ` (13 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: Stephen Hemminger @ 2018-05-14 22:32 UTC (permalink / raw)
  To: davem; +Cc: netdev, Stephen Hemminger, Stephen Hemminger

From: Stephen Hemminger <stephen@networkplumber.org>

commit 12f69661a49446840d742d8feb593ace022d9f66 upstream

Change the initialization order so that the device is ready to transmit
(ie connect vsp is completed) before setting the internal reference
to the device with RCU.

This avoids any races on initialization and prevents retry issues
on shutdown.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/hyperv/netvsc.c | 24 +++++++-----------------
 1 file changed, 7 insertions(+), 17 deletions(-)

diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index 22da6399b37a..839b9d6ecb41 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -850,13 +850,6 @@ int netvsc_send(struct net_device *ndev,
 	if (unlikely(!net_device || net_device->destroy))
 		return -ENODEV;
 
-	/* We may race with netvsc_connect_vsp()/netvsc_init_buf() and get
-	 * here before the negotiation with the host is finished and
-	 * send_section_map may not be allocated yet.
-	 */
-	if (unlikely(!net_device->send_section_map))
-		return -EAGAIN;
-
 	nvchan = &net_device->chan_table[packet->q_idx];
 	packet->send_buf_index = NETVSC_INVALID_INDEX;
 	packet->cp_partial = false;
@@ -864,10 +857,8 @@ int netvsc_send(struct net_device *ndev,
 	/* Send control message directly without accessing msd (Multi-Send
 	 * Data) field which may be changed during data packet processing.
 	 */
-	if (!skb) {
-		cur_send = packet;
-		goto send_now;
-	}
+	if (!skb)
+		return netvsc_send_pkt(device, packet, net_device, pb, skb);
 
 	/* batch packets in send buffer if possible */
 	msdp = &nvchan->msd;
@@ -951,7 +942,6 @@ int netvsc_send(struct net_device *ndev,
 		}
 	}
 
-send_now:
 	if (cur_send)
 		ret = netvsc_send_pkt(device, cur_send, net_device, pb, skb);
 
@@ -1308,11 +1298,6 @@ struct netvsc_device *netvsc_device_add(struct hv_device *device,
 
 	napi_enable(&net_device->chan_table[0].napi);
 
-	/* Writing nvdev pointer unlocks netvsc_send(), make sure chn_table is
-	 * populated.
-	 */
-	rcu_assign_pointer(net_device_ctx->nvdev, net_device);
-
 	/* Connect with the NetVsp */
 	ret = netvsc_connect_vsp(device, net_device, device_info);
 	if (ret != 0) {
@@ -1321,6 +1306,11 @@ struct netvsc_device *netvsc_device_add(struct hv_device *device,
 		goto close;
 	}
 
+	/* Writing nvdev pointer unlocks netvsc_send(), make sure chn_table is
+	 * populated.
+	 */
+	rcu_assign_pointer(net_device_ctx->nvdev, net_device);
+
 	return net_device;
 
 close:
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH net-stable 11/24] hv_netvsc: only wake transmit queue if link is up
  2018-05-14 22:31 [PATCH net-stable 00/24] hv_netvsc patches for 4.14 stable Stephen Hemminger
                   ` (9 preceding siblings ...)
  2018-05-14 22:32 ` [PATCH net-stable 10/24] hv_netvsc: avoid retry on send during shutdown Stephen Hemminger
@ 2018-05-14 22:32 ` Stephen Hemminger
  2018-05-14 22:32 ` [PATCH net-stable 12/24] hv_netvsc: fix error unwind handling if vmbus_open fails Stephen Hemminger
                   ` (12 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: Stephen Hemminger @ 2018-05-14 22:32 UTC (permalink / raw)
  To: davem; +Cc: netdev, Stephen Hemminger, Stephen Hemminger

From: Stephen Hemminger <stephen@networkplumber.org>

commit f4950e4586dfc957e0a28226eeb992ddc049b5a2 upstream

Don't wake transmit queues if link is not up yet.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/hyperv/netvsc_drv.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index b48a673d526d..bcd74f01f04b 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -88,12 +88,11 @@ static int netvsc_open(struct net_device *net)
 		return ret;
 	}
 
-	netif_tx_wake_all_queues(net);
-
 	rdev = nvdev->extension;
-
-	if (!rdev->link_state)
+	if (!rdev->link_state) {
 		netif_carrier_on(net);
+		netif_tx_wake_all_queues(net);
+	}
 
 	if (vf_netdev) {
 		/* Setting synthetic device up transparently sets
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH net-stable 12/24] hv_netvsc: fix error unwind handling if vmbus_open fails
  2018-05-14 22:31 [PATCH net-stable 00/24] hv_netvsc patches for 4.14 stable Stephen Hemminger
                   ` (10 preceding siblings ...)
  2018-05-14 22:32 ` [PATCH net-stable 11/24] hv_netvsc: only wake transmit queue if link is up Stephen Hemminger
@ 2018-05-14 22:32 ` Stephen Hemminger
  2018-05-14 22:32 ` [PATCH net-stable 13/24] hv_netvsc: cancel subchannel setup before halting device Stephen Hemminger
                   ` (11 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: Stephen Hemminger @ 2018-05-14 22:32 UTC (permalink / raw)
  To: davem; +Cc: netdev, Stephen Hemminger, Stephen Hemminger

From: Stephen Hemminger <stephen@networkplumber.org>

commit fcfb4a00d1e514e8313277a01ef919de1113025b upstream

Need to delete NAPI association if vmbus_open fails.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/hyperv/netvsc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index 839b9d6ecb41..b7720b65f5d6 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -1288,7 +1288,6 @@ struct netvsc_device *netvsc_device_add(struct hv_device *device,
 			 net_device->chan_table);
 
 	if (ret != 0) {
-		netif_napi_del(&net_device->chan_table[0].napi);
 		netdev_err(ndev, "unable to open channel: %d\n", ret);
 		goto cleanup;
 	}
@@ -1321,6 +1320,7 @@ struct netvsc_device *netvsc_device_add(struct hv_device *device,
 	vmbus_close(device->channel);
 
 cleanup:
+	netif_napi_del(&net_device->chan_table[0].napi);
 	free_netvsc_device(&net_device->rcu);
 
 	return ERR_PTR(ret);
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH net-stable 13/24] hv_netvsc: cancel subchannel setup before halting device
  2018-05-14 22:31 [PATCH net-stable 00/24] hv_netvsc patches for 4.14 stable Stephen Hemminger
                   ` (11 preceding siblings ...)
  2018-05-14 22:32 ` [PATCH net-stable 12/24] hv_netvsc: fix error unwind handling if vmbus_open fails Stephen Hemminger
@ 2018-05-14 22:32 ` Stephen Hemminger
  2018-05-14 22:32 ` [PATCH net-stable 14/24] hv_netvsc: fix race in napi poll when rescheduling Stephen Hemminger
                   ` (10 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: Stephen Hemminger @ 2018-05-14 22:32 UTC (permalink / raw)
  To: davem; +Cc: netdev, Stephen Hemminger, Stephen Hemminger

From: Stephen Hemminger <stephen@networkplumber.org>

commit a7483ec0267c69b34e818738da60b392623da94b upstream

Block setup of multiple channels earlier in the teardown
process. This avoids possible races between halt and subchannel
initialization.

Suggested-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/hyperv/rndis_filter.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/hyperv/rndis_filter.c b/drivers/net/hyperv/rndis_filter.c
index 05109ed5377c..5a312b2d5a7b 100644
--- a/drivers/net/hyperv/rndis_filter.c
+++ b/drivers/net/hyperv/rndis_filter.c
@@ -1340,6 +1340,9 @@ void rndis_filter_device_remove(struct hv_device *dev,
 {
 	struct rndis_device *rndis_dev = net_dev->extension;
 
+	/* Don't try and setup sub channels if about to halt */
+	cancel_work_sync(&net_dev->subchan_work);
+
 	/* Halt and release the rndis device */
 	rndis_filter_halt_device(rndis_dev);
 
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH net-stable 14/24] hv_netvsc: fix race in napi poll when rescheduling
  2018-05-14 22:31 [PATCH net-stable 00/24] hv_netvsc patches for 4.14 stable Stephen Hemminger
                   ` (12 preceding siblings ...)
  2018-05-14 22:32 ` [PATCH net-stable 13/24] hv_netvsc: cancel subchannel setup before halting device Stephen Hemminger
@ 2018-05-14 22:32 ` Stephen Hemminger
  2018-05-14 22:32 ` [PATCH net-stable 15/24] hv_netvsc: defer queue selection to VF Stephen Hemminger
                   ` (9 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: Stephen Hemminger @ 2018-05-14 22:32 UTC (permalink / raw)
  To: davem; +Cc: netdev, Stephen Hemminger, Stephen Hemminger

From: Stephen Hemminger <stephen@networkplumber.org>

commit d64e38ae690e3337db0d38d9b149a193a1646c4b upstream

There is a race between napi_reschedule and re-enabling interrupts
which could lead to missed host interrrupts.  This occurs when
interrupts are re-enabled (hv_end_read) and vmbus irq callback
(netvsc_channel_cb) has already scheduled NAPI.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/hyperv/netvsc.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index b7720b65f5d6..7e0eb99571af 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -1205,9 +1205,10 @@ int netvsc_poll(struct napi_struct *napi, int budget)
 	if (send_recv_completions(ndev, net_device, nvchan) == 0 &&
 	    work_done < budget &&
 	    napi_complete_done(napi, work_done) &&
-	    hv_end_read(&channel->inbound)) {
+	    hv_end_read(&channel->inbound) &&
+	    napi_schedule_prep(napi)) {
 		hv_begin_read(&channel->inbound);
-		napi_reschedule(napi);
+		__napi_schedule(napi);
 	}
 
 	/* Driver may overshoot since multiple packets per descriptor */
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH net-stable 15/24] hv_netvsc: defer queue selection to VF
  2018-05-14 22:31 [PATCH net-stable 00/24] hv_netvsc patches for 4.14 stable Stephen Hemminger
                   ` (13 preceding siblings ...)
  2018-05-14 22:32 ` [PATCH net-stable 14/24] hv_netvsc: fix race in napi poll when rescheduling Stephen Hemminger
@ 2018-05-14 22:32 ` Stephen Hemminger
  2018-05-14 22:32 ` [PATCH net-stable 16/24] hv_netvsc: disable NAPI before channel close Stephen Hemminger
                   ` (8 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: Stephen Hemminger @ 2018-05-14 22:32 UTC (permalink / raw)
  To: davem; +Cc: netdev, Stephen Hemminger, Stephen Hemminger

From: Stephen Hemminger <stephen@networkplumber.org>

commit b3bf5666a51068ad5ddd89a76ed877101ef3bc16 upstream

When VF is used for accelerated networking it will likely have
more queues (and different policy) than the synthetic NIC.
This patch defers the queue policy to the VF so that all the
queues can be used. This impacts workloads like local generate UDP.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/hyperv/netvsc_drv.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index bcd74f01f04b..daa450e4e2a4 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -283,8 +283,19 @@ static u16 netvsc_select_queue(struct net_device *ndev, struct sk_buff *skb,
 	rcu_read_lock();
 	vf_netdev = rcu_dereference(ndc->vf_netdev);
 	if (vf_netdev) {
-		txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) : 0;
-		qdisc_skb_cb(skb)->slave_dev_queue_mapping = skb->queue_mapping;
+		const struct net_device_ops *vf_ops = vf_netdev->netdev_ops;
+
+		if (vf_ops->ndo_select_queue)
+			txq = vf_ops->ndo_select_queue(vf_netdev, skb,
+						       accel_priv, fallback);
+		else
+			txq = fallback(vf_netdev, skb);
+
+		/* Record the queue selected by VF so that it can be
+		 * used for common case where VF has more queues than
+		 * the synthetic device.
+		 */
+		qdisc_skb_cb(skb)->slave_dev_queue_mapping = txq;
 	} else {
 		txq = netvsc_pick_tx(ndev, skb);
 	}
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH net-stable 16/24] hv_netvsc: disable NAPI before channel close
  2018-05-14 22:31 [PATCH net-stable 00/24] hv_netvsc patches for 4.14 stable Stephen Hemminger
                   ` (14 preceding siblings ...)
  2018-05-14 22:32 ` [PATCH net-stable 15/24] hv_netvsc: defer queue selection to VF Stephen Hemminger
@ 2018-05-14 22:32 ` Stephen Hemminger
  2018-05-14 22:32 ` [PATCH net-stable 17/24] hv_netvsc: use RCU to fix concurrent rx and queue changes Stephen Hemminger
                   ` (7 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: Stephen Hemminger @ 2018-05-14 22:32 UTC (permalink / raw)
  To: davem; +Cc: netdev, Stephen Hemminger, Stephen Hemminger

From: Stephen Hemminger <stephen@networkplumber.org>

commit 8348e0460ab1473f06c8b824699dd2eed3c1979d upstream

This makes sure that no CPU is still process packets when
the channel is closed.

Fixes: 76bb5db5c749 ("netvsc: fix use after free on module removal")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/hyperv/netvsc.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index 7e0eb99571af..a8789dc4fc9b 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -567,6 +567,10 @@ void netvsc_device_remove(struct hv_device *device)
 
 	RCU_INIT_POINTER(net_device_ctx->nvdev, NULL);
 
+	/* And disassociate NAPI context from device */
+	for (i = 0; i < net_device->num_chn; i++)
+		netif_napi_del(&net_device->chan_table[i].napi);
+
 	/*
 	 * At this point, no one should be accessing net_device
 	 * except in here
@@ -578,10 +582,6 @@ void netvsc_device_remove(struct hv_device *device)
 
 	netvsc_teardown_gpadl(device, net_device);
 
-	/* And dissassociate NAPI context from device */
-	for (i = 0; i < net_device->num_chn; i++)
-		netif_napi_del(&net_device->chan_table[i].napi);
-
 	/* Release all resources */
 	free_netvsc_device_rcu(net_device);
 }
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH net-stable 17/24] hv_netvsc: use RCU to fix concurrent rx and queue changes
  2018-05-14 22:31 [PATCH net-stable 00/24] hv_netvsc patches for 4.14 stable Stephen Hemminger
                   ` (15 preceding siblings ...)
  2018-05-14 22:32 ` [PATCH net-stable 16/24] hv_netvsc: disable NAPI before channel close Stephen Hemminger
@ 2018-05-14 22:32 ` Stephen Hemminger
  2018-05-14 22:32 ` [PATCH net-stable 18/24] hv_netvsc: change GPAD teardown order on older versions Stephen Hemminger
                   ` (6 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: Stephen Hemminger @ 2018-05-14 22:32 UTC (permalink / raw)
  To: davem; +Cc: netdev, Stephen Hemminger, Stephen Hemminger

From: Stephen Hemminger <stephen@networkplumber.org>

commit 02400fcee2542ee334a2394e0d9f6efd969fe782 upstream

The receive processing may continue to happen while the
internal network device state is in RCU grace period.
The internal RNDIS structure is associated with the
internal netvsc_device structure; both have the same
RCU lifetime.

Defer freeing all associated parts until after grace
period.

Fixes: 0cf737808ae7 ("hv_netvsc: netvsc_teardown_gpadl() split")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/hyperv/netvsc.c       | 17 ++++----------
 drivers/net/hyperv/rndis_filter.c | 39 ++++++++++++++-----------------
 2 files changed, 22 insertions(+), 34 deletions(-)

diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index a8789dc4fc9b..03328a60377e 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -89,6 +89,11 @@ static void free_netvsc_device(struct rcu_head *head)
 		= container_of(head, struct netvsc_device, rcu);
 	int i;
 
+	kfree(nvdev->extension);
+	vfree(nvdev->recv_buf);
+	vfree(nvdev->send_buf);
+	kfree(nvdev->send_section_map);
+
 	for (i = 0; i < VRSS_CHANNEL_MAX; i++)
 		vfree(nvdev->chan_table[i].mrc.slots);
 
@@ -210,12 +215,6 @@ static void netvsc_teardown_gpadl(struct hv_device *device,
 		net_device->recv_buf_gpadl_handle = 0;
 	}
 
-	if (net_device->recv_buf) {
-		/* Free up the receive buffer */
-		vfree(net_device->recv_buf);
-		net_device->recv_buf = NULL;
-	}
-
 	if (net_device->send_buf_gpadl_handle) {
 		ret = vmbus_teardown_gpadl(device->channel,
 					   net_device->send_buf_gpadl_handle);
@@ -230,12 +229,6 @@ static void netvsc_teardown_gpadl(struct hv_device *device,
 		}
 		net_device->send_buf_gpadl_handle = 0;
 	}
-	if (net_device->send_buf) {
-		/* Free up the send buffer */
-		vfree(net_device->send_buf);
-		net_device->send_buf = NULL;
-	}
-	kfree(net_device->send_section_map);
 }
 
 int netvsc_alloc_recv_comp_ring(struct netvsc_device *net_device, u32 q_idx)
diff --git a/drivers/net/hyperv/rndis_filter.c b/drivers/net/hyperv/rndis_filter.c
index 5a312b2d5a7b..056499014a1f 100644
--- a/drivers/net/hyperv/rndis_filter.c
+++ b/drivers/net/hyperv/rndis_filter.c
@@ -266,13 +266,23 @@ static void rndis_set_link_state(struct rndis_device *rdev,
 	}
 }
 
-static void rndis_filter_receive_response(struct rndis_device *dev,
-				       struct rndis_message *resp)
+static void rndis_filter_receive_response(struct net_device *ndev,
+					  struct netvsc_device *nvdev,
+					  const struct rndis_message *resp)
 {
+	struct rndis_device *dev = nvdev->extension;
 	struct rndis_request *request = NULL;
 	bool found = false;
 	unsigned long flags;
-	struct net_device *ndev = dev->ndev;
+
+	/* This should never happen, it means control message
+	 * response received after device removed.
+	 */
+	if (dev->state == RNDIS_DEV_UNINITIALIZED) {
+		netdev_err(ndev,
+			   "got rndis message uninitialized\n");
+		return;
+	}
 
 	spin_lock_irqsave(&dev->request_lock, flags);
 	list_for_each_entry(request, &dev->req_list, list_ent) {
@@ -353,7 +363,7 @@ static inline void *rndis_get_ppi(struct rndis_packet *rpkt, u32 type)
 }
 
 static int rndis_filter_receive_data(struct net_device *ndev,
-				     struct rndis_device *dev,
+				     struct netvsc_device *nvdev,
 				     struct rndis_message *msg,
 				     struct vmbus_channel *channel,
 				     void *data, u32 data_buflen)
@@ -373,7 +383,7 @@ static int rndis_filter_receive_data(struct net_device *ndev,
 	 * should be the data packet size plus the trailer padding size
 	 */
 	if (unlikely(data_buflen < rndis_pkt->data_len)) {
-		netdev_err(dev->ndev, "rndis message buffer "
+		netdev_err(ndev, "rndis message buffer "
 			   "overflow detected (got %u, min %u)"
 			   "...dropping this message!\n",
 			   data_buflen, rndis_pkt->data_len);
@@ -401,34 +411,20 @@ int rndis_filter_receive(struct net_device *ndev,
 			 void *data, u32 buflen)
 {
 	struct net_device_context *net_device_ctx = netdev_priv(ndev);
-	struct rndis_device *rndis_dev = net_dev->extension;
 	struct rndis_message *rndis_msg = data;
 
-	/* Make sure the rndis device state is initialized */
-	if (unlikely(!rndis_dev)) {
-		netif_err(net_device_ctx, rx_err, ndev,
-			  "got rndis message but no rndis device!\n");
-		return NVSP_STAT_FAIL;
-	}
-
-	if (unlikely(rndis_dev->state == RNDIS_DEV_UNINITIALIZED)) {
-		netif_err(net_device_ctx, rx_err, ndev,
-			  "got rndis message uninitialized\n");
-		return NVSP_STAT_FAIL;
-	}
-
 	if (netif_msg_rx_status(net_device_ctx))
 		dump_rndis_message(dev, rndis_msg);
 
 	switch (rndis_msg->ndis_msg_type) {
 	case RNDIS_MSG_PACKET:
-		return rndis_filter_receive_data(ndev, rndis_dev, rndis_msg,
+		return rndis_filter_receive_data(ndev, net_dev, rndis_msg,
 						 channel, data, buflen);
 	case RNDIS_MSG_INIT_C:
 	case RNDIS_MSG_QUERY_C:
 	case RNDIS_MSG_SET_C:
 		/* completion msgs */
-		rndis_filter_receive_response(rndis_dev, rndis_msg);
+		rndis_filter_receive_response(ndev, net_dev, rndis_msg);
 		break;
 
 	case RNDIS_MSG_INDICATE:
@@ -1349,7 +1345,6 @@ void rndis_filter_device_remove(struct hv_device *dev,
 	net_dev->extension = NULL;
 
 	netvsc_device_remove(dev);
-	kfree(rndis_dev);
 }
 
 int rndis_filter_open(struct netvsc_device *nvdev)
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH net-stable 18/24] hv_netvsc: change GPAD teardown order on older versions
  2018-05-14 22:31 [PATCH net-stable 00/24] hv_netvsc patches for 4.14 stable Stephen Hemminger
                   ` (16 preceding siblings ...)
  2018-05-14 22:32 ` [PATCH net-stable 17/24] hv_netvsc: use RCU to fix concurrent rx and queue changes Stephen Hemminger
@ 2018-05-14 22:32 ` Stephen Hemminger
  2018-05-14 22:32 ` [PATCH net-stable 19/24] hv_netvsc: common detach logic Stephen Hemminger
                   ` (5 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: Stephen Hemminger @ 2018-05-14 22:32 UTC (permalink / raw)
  To: davem; +Cc: netdev, Stephen Hemminger, Stephen Hemminger

From: Stephen Hemminger <stephen@networkplumber.org>

commit 0ef58b0a05c127762f975c3dfe8b922e4aa87a29 upstream

On older versions of Windows, the host ignores messages after
vmbus channel is closed.

Workaround this by doing what Windows does and send the teardown
before close on older versions of NVSP protocol.

Reported-by: Mohammed Gamal <mgamal@redhat.com>
Fixes: 0cf737808ae7 ("hv_netvsc: netvsc_teardown_gpadl() split")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/hyperv/netvsc.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index 03328a60377e..3f3c6489ade2 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -570,10 +570,15 @@ void netvsc_device_remove(struct hv_device *device)
 	 */
 	netdev_dbg(ndev, "net device safe to remove\n");
 
+	/* older versions require that buffer be revoked before close */
+	if (net_device->nvsp_version < NVSP_PROTOCOL_VERSION_4)
+		netvsc_teardown_gpadl(device, net_device);
+
 	/* Now, we can close the channel safely */
 	vmbus_close(device->channel);
 
-	netvsc_teardown_gpadl(device, net_device);
+	if (net_device->nvsp_version >= NVSP_PROTOCOL_VERSION_4)
+		netvsc_teardown_gpadl(device, net_device);
 
 	/* Release all resources */
 	free_netvsc_device_rcu(net_device);
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH net-stable 19/24] hv_netvsc: common detach logic
  2018-05-14 22:31 [PATCH net-stable 00/24] hv_netvsc patches for 4.14 stable Stephen Hemminger
                   ` (17 preceding siblings ...)
  2018-05-14 22:32 ` [PATCH net-stable 18/24] hv_netvsc: change GPAD teardown order on older versions Stephen Hemminger
@ 2018-05-14 22:32 ` Stephen Hemminger
  2018-05-14 22:32 ` [PATCH net-stable 20/24] hv_netvsc: Use Windows version instead of NVSP version on GPAD teardown Stephen Hemminger
                   ` (4 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: Stephen Hemminger @ 2018-05-14 22:32 UTC (permalink / raw)
  To: davem; +Cc: netdev, Stephen Hemminger, Stephen Hemminger

From: Stephen Hemminger <stephen@networkplumber.org>

commit 7b2ee50c0cd513a176a26a71f2989facdd75bfea upstream

Make common function for detaching internals of device
during changes to MTU and RSS. Make sure no more packets
are transmitted and all packets have been received before
doing device teardown.

Change the wait logic to be common and use usleep_range().

Changes transmit enabling logic so that transmit queues are disabled
during the period when lower device is being changed. And enabled
only after sub channels are setup. This avoids issue where it could
be that a packet was being sent while subchannel was not initialized.

Fixes: 8195b1396ec8 ("hv_netvsc: fix deadlock on hotplug")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/hyperv/hyperv_net.h   |   1 -
 drivers/net/hyperv/netvsc.c       |  19 +-
 drivers/net/hyperv/netvsc_drv.c   | 278 +++++++++++++++++-------------
 drivers/net/hyperv/rndis_filter.c |  15 +-
 4 files changed, 173 insertions(+), 140 deletions(-)

diff --git a/drivers/net/hyperv/hyperv_net.h b/drivers/net/hyperv/hyperv_net.h
index fd51a329e36e..01017dd88802 100644
--- a/drivers/net/hyperv/hyperv_net.h
+++ b/drivers/net/hyperv/hyperv_net.h
@@ -208,7 +208,6 @@ void netvsc_channel_cb(void *context);
 int netvsc_poll(struct napi_struct *napi, int budget);
 
 void rndis_set_subchannel(struct work_struct *w);
-bool rndis_filter_opened(const struct netvsc_device *nvdev);
 int rndis_filter_open(struct netvsc_device *nvdev);
 int rndis_filter_close(struct netvsc_device *nvdev);
 struct netvsc_device *rndis_filter_device_add(struct hv_device *dev,
diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index 3f3c6489ade2..0c548748fba8 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -554,8 +554,6 @@ void netvsc_device_remove(struct hv_device *device)
 		= rtnl_dereference(net_device_ctx->nvdev);
 	int i;
 
-	cancel_work_sync(&net_device->subchan_work);
-
 	netvsc_revoke_buf(device, net_device);
 
 	RCU_INIT_POINTER(net_device_ctx->nvdev, NULL);
@@ -644,13 +642,18 @@ static void netvsc_send_tx_complete(struct netvsc_device *net_device,
 	queue_sends =
 		atomic_dec_return(&net_device->chan_table[q_idx].queue_sends);
 
-	if (net_device->destroy && queue_sends == 0)
-		wake_up(&net_device->wait_drain);
+	if (unlikely(net_device->destroy)) {
+		if (queue_sends == 0)
+			wake_up(&net_device->wait_drain);
+	} else {
+		struct netdev_queue *txq = netdev_get_tx_queue(ndev, q_idx);
 
-	if (netif_tx_queue_stopped(netdev_get_tx_queue(ndev, q_idx)) &&
-	    (hv_ringbuf_avail_percent(&channel->outbound) > RING_AVAIL_PERCENT_HIWATER ||
-	     queue_sends < 1))
-		netif_tx_wake_queue(netdev_get_tx_queue(ndev, q_idx));
+		if (netif_tx_queue_stopped(txq) &&
+		    (hv_ringbuf_avail_percent(&channel->outbound) > RING_AVAIL_PERCENT_HIWATER ||
+		     queue_sends < 1)) {
+			netif_tx_wake_queue(txq);
+		}
+	}
 }
 
 static void netvsc_send_completion(struct netvsc_device *net_device,
diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index daa450e4e2a4..be4b15b63355 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -45,7 +45,10 @@
 
 #include "hyperv_net.h"
 
-#define RING_SIZE_MIN		64
+#define RING_SIZE_MIN	64
+#define RETRY_US_LO	5000
+#define RETRY_US_HI	10000
+#define RETRY_MAX	2000	/* >10 sec */
 
 #define LINKCHANGE_INT (2 * HZ)
 #define VF_TAKEOVER_INT (HZ / 10)
@@ -89,10 +92,8 @@ static int netvsc_open(struct net_device *net)
 	}
 
 	rdev = nvdev->extension;
-	if (!rdev->link_state) {
+	if (!rdev->link_state)
 		netif_carrier_on(net);
-		netif_tx_wake_all_queues(net);
-	}
 
 	if (vf_netdev) {
 		/* Setting synthetic device up transparently sets
@@ -108,36 +109,25 @@ static int netvsc_open(struct net_device *net)
 	return 0;
 }
 
-static int netvsc_close(struct net_device *net)
+static int netvsc_wait_until_empty(struct netvsc_device *nvdev)
 {
-	struct net_device_context *net_device_ctx = netdev_priv(net);
-	struct net_device *vf_netdev
-		= rtnl_dereference(net_device_ctx->vf_netdev);
-	struct netvsc_device *nvdev = rtnl_dereference(net_device_ctx->nvdev);
-	int ret = 0;
-	u32 aread, i, msec = 10, retry = 0, retry_max = 20;
-	struct vmbus_channel *chn;
-
-	netif_tx_disable(net);
-
-	/* No need to close rndis filter if it is removed already */
-	if (!nvdev)
-		goto out;
-
-	ret = rndis_filter_close(nvdev);
-	if (ret != 0) {
-		netdev_err(net, "unable to close device (ret %d).\n", ret);
-		return ret;
-	}
+	unsigned int retry = 0;
+	int i;
 
 	/* Ensure pending bytes in ring are read */
-	while (true) {
-		aread = 0;
+	for (;;) {
+		u32 aread = 0;
+
 		for (i = 0; i < nvdev->num_chn; i++) {
-			chn = nvdev->chan_table[i].channel;
+			struct vmbus_channel *chn
+				= nvdev->chan_table[i].channel;
+
 			if (!chn)
 				continue;
 
+			/* make sure receive not running now */
+			napi_synchronize(&nvdev->chan_table[i].napi);
+
 			aread = hv_get_bytes_to_read(&chn->inbound);
 			if (aread)
 				break;
@@ -147,22 +137,40 @@ static int netvsc_close(struct net_device *net)
 				break;
 		}
 
-		retry++;
-		if (retry > retry_max || aread == 0)
-			break;
+		if (aread == 0)
+			return 0;
 
-		msleep(msec);
+		if (++retry > RETRY_MAX)
+			return -ETIMEDOUT;
 
-		if (msec < 1000)
-			msec *= 2;
+		usleep_range(RETRY_US_LO, RETRY_US_HI);
 	}
+}
 
-	if (aread) {
-		netdev_err(net, "Ring buffer not empty after closing rndis\n");
-		ret = -ETIMEDOUT;
+static int netvsc_close(struct net_device *net)
+{
+	struct net_device_context *net_device_ctx = netdev_priv(net);
+	struct net_device *vf_netdev
+		= rtnl_dereference(net_device_ctx->vf_netdev);
+	struct netvsc_device *nvdev = rtnl_dereference(net_device_ctx->nvdev);
+	int ret;
+
+	netif_tx_disable(net);
+
+	/* No need to close rndis filter if it is removed already */
+	if (!nvdev)
+		return 0;
+
+	ret = rndis_filter_close(nvdev);
+	if (ret != 0) {
+		netdev_err(net, "unable to close device (ret %d).\n", ret);
+		return ret;
 	}
 
-out:
+	ret = netvsc_wait_until_empty(nvdev);
+	if (ret)
+		netdev_err(net, "Ring buffer not empty after closing rndis\n");
+
 	if (vf_netdev)
 		dev_close(vf_netdev);
 
@@ -820,16 +828,81 @@ static void netvsc_get_channels(struct net_device *net,
 	}
 }
 
+static int netvsc_detach(struct net_device *ndev,
+			 struct netvsc_device *nvdev)
+{
+	struct net_device_context *ndev_ctx = netdev_priv(ndev);
+	struct hv_device *hdev = ndev_ctx->device_ctx;
+	int ret;
+
+	/* Don't try continuing to try and setup sub channels */
+	if (cancel_work_sync(&nvdev->subchan_work))
+		nvdev->num_chn = 1;
+
+	/* If device was up (receiving) then shutdown */
+	if (netif_running(ndev)) {
+		netif_tx_disable(ndev);
+
+		ret = rndis_filter_close(nvdev);
+		if (ret) {
+			netdev_err(ndev,
+				   "unable to close device (ret %d).\n", ret);
+			return ret;
+		}
+
+		ret = netvsc_wait_until_empty(nvdev);
+		if (ret) {
+			netdev_err(ndev,
+				   "Ring buffer not empty after closing rndis\n");
+			return ret;
+		}
+	}
+
+	netif_device_detach(ndev);
+
+	rndis_filter_device_remove(hdev, nvdev);
+
+	return 0;
+}
+
+static int netvsc_attach(struct net_device *ndev,
+			 struct netvsc_device_info *dev_info)
+{
+	struct net_device_context *ndev_ctx = netdev_priv(ndev);
+	struct hv_device *hdev = ndev_ctx->device_ctx;
+	struct netvsc_device *nvdev;
+	struct rndis_device *rdev;
+	int ret;
+
+	nvdev = rndis_filter_device_add(hdev, dev_info);
+	if (IS_ERR(nvdev))
+		return PTR_ERR(nvdev);
+
+	/* Note: enable and attach happen when sub-channels setup */
+
+	netif_carrier_off(ndev);
+
+	if (netif_running(ndev)) {
+		ret = rndis_filter_open(nvdev);
+		if (ret)
+			return ret;
+
+		rdev = nvdev->extension;
+		if (!rdev->link_state)
+			netif_carrier_on(ndev);
+	}
+
+	return 0;
+}
+
 static int netvsc_set_channels(struct net_device *net,
 			       struct ethtool_channels *channels)
 {
 	struct net_device_context *net_device_ctx = netdev_priv(net);
-	struct hv_device *dev = net_device_ctx->device_ctx;
 	struct netvsc_device *nvdev = rtnl_dereference(net_device_ctx->nvdev);
 	unsigned int orig, count = channels->combined_count;
 	struct netvsc_device_info device_info;
-	bool was_opened;
-	int ret = 0;
+	int ret;
 
 	/* We do not support separate count for rx, tx, or other */
 	if (count == 0 ||
@@ -846,9 +919,6 @@ static int netvsc_set_channels(struct net_device *net,
 		return -EINVAL;
 
 	orig = nvdev->num_chn;
-	was_opened = rndis_filter_opened(nvdev);
-	if (was_opened)
-		rndis_filter_close(nvdev);
 
 	memset(&device_info, 0, sizeof(device_info));
 	device_info.num_chn = count;
@@ -858,28 +928,17 @@ static int netvsc_set_channels(struct net_device *net,
 	device_info.recv_sections = nvdev->recv_section_cnt;
 	device_info.recv_section_size = nvdev->recv_section_size;
 
-	rndis_filter_device_remove(dev, nvdev);
+	ret = netvsc_detach(net, nvdev);
+	if (ret)
+		return ret;
 
-	nvdev = rndis_filter_device_add(dev, &device_info);
-	if (IS_ERR(nvdev)) {
-		ret = PTR_ERR(nvdev);
+	ret = netvsc_attach(net, &device_info);
+	if (ret) {
 		device_info.num_chn = orig;
-		nvdev = rndis_filter_device_add(dev, &device_info);
-
-		if (IS_ERR(nvdev)) {
-			netdev_err(net, "restoring channel setting failed: %ld\n",
-				   PTR_ERR(nvdev));
-			return ret;
-		}
+		if (netvsc_attach(net, &device_info))
+			netdev_err(net, "restoring channel setting failed\n");
 	}
 
-	if (was_opened)
-		rndis_filter_open(nvdev);
-
-	/* We may have missed link change notifications */
-	net_device_ctx->last_reconfig = 0;
-	schedule_delayed_work(&net_device_ctx->dwork, 0);
-
 	return ret;
 }
 
@@ -946,10 +1005,8 @@ static int netvsc_change_mtu(struct net_device *ndev, int mtu)
 	struct net_device_context *ndevctx = netdev_priv(ndev);
 	struct net_device *vf_netdev = rtnl_dereference(ndevctx->vf_netdev);
 	struct netvsc_device *nvdev = rtnl_dereference(ndevctx->nvdev);
-	struct hv_device *hdev = ndevctx->device_ctx;
 	int orig_mtu = ndev->mtu;
 	struct netvsc_device_info device_info;
-	bool was_opened;
 	int ret = 0;
 
 	if (!nvdev || nvdev->destroy)
@@ -962,11 +1019,6 @@ static int netvsc_change_mtu(struct net_device *ndev, int mtu)
 			return ret;
 	}
 
-	netif_device_detach(ndev);
-	was_opened = rndis_filter_opened(nvdev);
-	if (was_opened)
-		rndis_filter_close(nvdev);
-
 	memset(&device_info, 0, sizeof(device_info));
 	device_info.ring_size = ring_size;
 	device_info.num_chn = nvdev->num_chn;
@@ -975,35 +1027,27 @@ static int netvsc_change_mtu(struct net_device *ndev, int mtu)
 	device_info.recv_sections = nvdev->recv_section_cnt;
 	device_info.recv_section_size = nvdev->recv_section_size;
 
-	rndis_filter_device_remove(hdev, nvdev);
+	ret = netvsc_detach(ndev, nvdev);
+	if (ret)
+		goto rollback_vf;
 
 	ndev->mtu = mtu;
 
-	nvdev = rndis_filter_device_add(hdev, &device_info);
-	if (IS_ERR(nvdev)) {
-		ret = PTR_ERR(nvdev);
-
-		/* Attempt rollback to original MTU */
-		ndev->mtu = orig_mtu;
-		nvdev = rndis_filter_device_add(hdev, &device_info);
-
-		if (vf_netdev)
-			dev_set_mtu(vf_netdev, orig_mtu);
-
-		if (IS_ERR(nvdev)) {
-			netdev_err(ndev, "restoring mtu failed: %ld\n",
-				   PTR_ERR(nvdev));
-			return ret;
-		}
-	}
+	ret = netvsc_attach(ndev, &device_info);
+	if (ret)
+		goto rollback;
 
-	if (was_opened)
-		rndis_filter_open(nvdev);
+	return 0;
 
-	netif_device_attach(ndev);
+rollback:
+	/* Attempt rollback to original MTU */
+	ndev->mtu = orig_mtu;
 
-	/* We may have missed link change notifications */
-	schedule_delayed_work(&ndevctx->dwork, 0);
+	if (netvsc_attach(ndev, &device_info))
+		netdev_err(ndev, "restoring mtu failed\n");
+rollback_vf:
+	if (vf_netdev)
+		dev_set_mtu(vf_netdev, orig_mtu);
 
 	return ret;
 }
@@ -1469,11 +1513,9 @@ static int netvsc_set_ringparam(struct net_device *ndev,
 {
 	struct net_device_context *ndevctx = netdev_priv(ndev);
 	struct netvsc_device *nvdev = rtnl_dereference(ndevctx->nvdev);
-	struct hv_device *hdev = ndevctx->device_ctx;
 	struct netvsc_device_info device_info;
 	struct ethtool_ringparam orig;
 	u32 new_tx, new_rx;
-	bool was_opened;
 	int ret = 0;
 
 	if (!nvdev || nvdev->destroy)
@@ -1499,34 +1541,18 @@ static int netvsc_set_ringparam(struct net_device *ndev,
 	device_info.recv_sections = new_rx;
 	device_info.recv_section_size = nvdev->recv_section_size;
 
-	netif_device_detach(ndev);
-	was_opened = rndis_filter_opened(nvdev);
-	if (was_opened)
-		rndis_filter_close(nvdev);
-
-	rndis_filter_device_remove(hdev, nvdev);
-
-	nvdev = rndis_filter_device_add(hdev, &device_info);
-	if (IS_ERR(nvdev)) {
-		ret = PTR_ERR(nvdev);
+	ret = netvsc_detach(ndev, nvdev);
+	if (ret)
+		return ret;
 
+	ret = netvsc_attach(ndev, &device_info);
+	if (ret) {
 		device_info.send_sections = orig.tx_pending;
 		device_info.recv_sections = orig.rx_pending;
-		nvdev = rndis_filter_device_add(hdev, &device_info);
-		if (IS_ERR(nvdev)) {
-			netdev_err(ndev, "restoring ringparam failed: %ld\n",
-				   PTR_ERR(nvdev));
-			return ret;
-		}
-	}
-
-	if (was_opened)
-		rndis_filter_open(nvdev);
-	netif_device_attach(ndev);
 
-	/* We may have missed link change notifications */
-	ndevctx->last_reconfig = 0;
-	schedule_delayed_work(&ndevctx->dwork, 0);
+		if (netvsc_attach(ndev, &device_info))
+			netdev_err(ndev, "restoring ringparam failed");
+	}
 
 	return ret;
 }
@@ -2002,8 +2028,8 @@ static int netvsc_probe(struct hv_device *dev,
 static int netvsc_remove(struct hv_device *dev)
 {
 	struct net_device_context *ndev_ctx;
-	struct net_device *vf_netdev;
-	struct net_device *net;
+	struct net_device *vf_netdev, *net;
+	struct netvsc_device *nvdev;
 
 	net = hv_get_drvdata(dev);
 	if (net == NULL) {
@@ -2013,10 +2039,14 @@ static int netvsc_remove(struct hv_device *dev)
 
 	ndev_ctx = netdev_priv(net);
 
-	netif_device_detach(net);
-
 	cancel_delayed_work_sync(&ndev_ctx->dwork);
 
+	rcu_read_lock();
+	nvdev = rcu_dereference(ndev_ctx->nvdev);
+
+	if  (nvdev)
+		cancel_work_sync(&nvdev->subchan_work);
+
 	/*
 	 * Call to the vsc driver to let it know that the device is being
 	 * removed. Also blocks mtu and channel changes.
@@ -2026,11 +2056,13 @@ static int netvsc_remove(struct hv_device *dev)
 	if (vf_netdev)
 		netvsc_unregister_vf(vf_netdev);
 
+	if (nvdev)
+		rndis_filter_device_remove(dev, nvdev);
+
 	unregister_netdevice(net);
 
-	rndis_filter_device_remove(dev,
-				   rtnl_dereference(ndev_ctx->nvdev));
 	rtnl_unlock();
+	rcu_read_unlock();
 
 	hv_set_drvdata(dev, NULL);
 
diff --git a/drivers/net/hyperv/rndis_filter.c b/drivers/net/hyperv/rndis_filter.c
index 056499014a1f..3bfa56560286 100644
--- a/drivers/net/hyperv/rndis_filter.c
+++ b/drivers/net/hyperv/rndis_filter.c
@@ -1112,6 +1112,7 @@ void rndis_set_subchannel(struct work_struct *w)
 	for (i = 0; i < VRSS_SEND_TAB_SIZE; i++)
 		ndev_ctx->tx_table[i] = i % nvdev->num_chn;
 
+	netif_device_attach(ndev);
 	rtnl_unlock();
 	return;
 
@@ -1122,6 +1123,8 @@ void rndis_set_subchannel(struct work_struct *w)
 
 	nvdev->max_chn = 1;
 	nvdev->num_chn = 1;
+
+	netif_device_attach(ndev);
 unlock:
 	rtnl_unlock();
 }
@@ -1324,6 +1327,10 @@ struct netvsc_device *rndis_filter_device_add(struct hv_device *dev,
 		net_device->num_chn = 1;
 	}
 
+	/* No sub channels, device is ready */
+	if (net_device->num_chn == 1)
+		netif_device_attach(net);
+
 	return net_device;
 
 err_dev_remv:
@@ -1336,9 +1343,6 @@ void rndis_filter_device_remove(struct hv_device *dev,
 {
 	struct rndis_device *rndis_dev = net_dev->extension;
 
-	/* Don't try and setup sub channels if about to halt */
-	cancel_work_sync(&net_dev->subchan_work);
-
 	/* Halt and release the rndis device */
 	rndis_filter_halt_device(rndis_dev);
 
@@ -1368,8 +1372,3 @@ int rndis_filter_close(struct netvsc_device *nvdev)
 
 	return rndis_filter_close_device(nvdev->extension);
 }
-
-bool rndis_filter_opened(const struct netvsc_device *nvdev)
-{
-	return atomic_read(&nvdev->open_cnt) > 0;
-}
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH net-stable 20/24] hv_netvsc: Use Windows version instead of NVSP version on GPAD teardown
  2018-05-14 22:31 [PATCH net-stable 00/24] hv_netvsc patches for 4.14 stable Stephen Hemminger
                   ` (18 preceding siblings ...)
  2018-05-14 22:32 ` [PATCH net-stable 19/24] hv_netvsc: common detach logic Stephen Hemminger
@ 2018-05-14 22:32 ` Stephen Hemminger
  2018-05-14 22:32 ` [PATCH net-stable 21/24] hv_netvsc: Split netvsc_revoke_buf() and netvsc_teardown_gpadl() Stephen Hemminger
                   ` (3 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: Stephen Hemminger @ 2018-05-14 22:32 UTC (permalink / raw)
  To: davem; +Cc: netdev, Mohammed Gamal

From: Mohammed Gamal <mgamal@redhat.com>

commit 2afc5d61a7197de25a61f54ea4ecfb4cb62b1d42A upstram

When changing network interface settings, Windows guests
older than WS2016 can no longer shutdown. This was addressed
by commit 0ef58b0a05c12 ("hv_netvsc: change GPAD teardown order
on older versions"), however the issue also occurs on WS2012
guests that share NVSP protocol versions with WS2016 guests.
Hence we use Windows version directly to differentiate them.

Fixes: 0ef58b0a05c12 ("hv_netvsc: change GPAD teardown order on older versions")
Signed-off-by: Mohammed Gamal <mgamal@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/hyperv/netvsc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index 0c548748fba8..aa9b7a912c31 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -569,13 +569,13 @@ void netvsc_device_remove(struct hv_device *device)
 	netdev_dbg(ndev, "net device safe to remove\n");
 
 	/* older versions require that buffer be revoked before close */
-	if (net_device->nvsp_version < NVSP_PROTOCOL_VERSION_4)
+	if (vmbus_proto_version < VERSION_WIN10)
 		netvsc_teardown_gpadl(device, net_device);
 
 	/* Now, we can close the channel safely */
 	vmbus_close(device->channel);
 
-	if (net_device->nvsp_version >= NVSP_PROTOCOL_VERSION_4)
+	if (vmbus_proto_version >= VERSION_WIN10)
 		netvsc_teardown_gpadl(device, net_device);
 
 	/* Release all resources */
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH net-stable 21/24] hv_netvsc: Split netvsc_revoke_buf() and netvsc_teardown_gpadl()
  2018-05-14 22:31 [PATCH net-stable 00/24] hv_netvsc patches for 4.14 stable Stephen Hemminger
                   ` (19 preceding siblings ...)
  2018-05-14 22:32 ` [PATCH net-stable 20/24] hv_netvsc: Use Windows version instead of NVSP version on GPAD teardown Stephen Hemminger
@ 2018-05-14 22:32 ` Stephen Hemminger
  2018-05-14 22:32 ` [PATCH net-stable 22/24] hv_netvsc: Ensure correct teardown message sequence order Stephen Hemminger
                   ` (2 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: Stephen Hemminger @ 2018-05-14 22:32 UTC (permalink / raw)
  To: davem; +Cc: netdev, Mohammed Gamal

From: Mohammed Gamal <mgamal@redhat.com>

commit 7992894c305eaf504d005529637ff8283d0a849d upstream

Split each of the functions into two for each of send/recv buffers.
This will be needed in order to implement a fine-grained messaging
sequence to the host so that we accommodate the requirements of
different Windows versions

Fixes: 0ef58b0a05c12 ("hv_netvsc: change GPAD teardown order on older versions")
Signed-off-by: Mohammed Gamal <mgamal@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/hyperv/netvsc.c | 46 +++++++++++++++++++++++++++----------
 1 file changed, 34 insertions(+), 12 deletions(-)

diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index aa9b7a912c31..25fcba506ac5 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -105,11 +105,11 @@ static void free_netvsc_device_rcu(struct netvsc_device *nvdev)
 	call_rcu(&nvdev->rcu, free_netvsc_device);
 }
 
-static void netvsc_revoke_buf(struct hv_device *device,
-			      struct netvsc_device *net_device)
+static void netvsc_revoke_recv_buf(struct hv_device *device,
+				   struct netvsc_device *net_device)
 {
-	struct nvsp_message *revoke_packet;
 	struct net_device *ndev = hv_get_drvdata(device);
+	struct nvsp_message *revoke_packet;
 	int ret;
 
 	/*
@@ -151,6 +151,14 @@ static void netvsc_revoke_buf(struct hv_device *device,
 		}
 		net_device->recv_section_cnt = 0;
 	}
+}
+
+static void netvsc_revoke_send_buf(struct hv_device *device,
+				   struct netvsc_device *net_device)
+{
+	struct net_device *ndev = hv_get_drvdata(device);
+	struct nvsp_message *revoke_packet;
+	int ret;
 
 	/* Deal with the send buffer we may have setup.
 	 * If we got a  send section size, it means we received a
@@ -194,8 +202,8 @@ static void netvsc_revoke_buf(struct hv_device *device,
 	}
 }
 
-static void netvsc_teardown_gpadl(struct hv_device *device,
-				  struct netvsc_device *net_device)
+static void netvsc_teardown_recv_gpadl(struct hv_device *device,
+				       struct netvsc_device *net_device)
 {
 	struct net_device *ndev = hv_get_drvdata(device);
 	int ret;
@@ -214,6 +222,13 @@ static void netvsc_teardown_gpadl(struct hv_device *device,
 		}
 		net_device->recv_buf_gpadl_handle = 0;
 	}
+}
+
+static void netvsc_teardown_send_gpadl(struct hv_device *device,
+				       struct netvsc_device *net_device)
+{
+	struct net_device *ndev = hv_get_drvdata(device);
+	int ret;
 
 	if (net_device->send_buf_gpadl_handle) {
 		ret = vmbus_teardown_gpadl(device->channel,
@@ -423,8 +438,10 @@ static int netvsc_init_buf(struct hv_device *device,
 	goto exit;
 
 cleanup:
-	netvsc_revoke_buf(device, net_device);
-	netvsc_teardown_gpadl(device, net_device);
+	netvsc_revoke_recv_buf(device, net_device);
+	netvsc_revoke_send_buf(device, net_device);
+	netvsc_teardown_recv_gpadl(device, net_device);
+	netvsc_teardown_send_gpadl(device, net_device);
 
 exit:
 	return ret;
@@ -554,7 +571,8 @@ void netvsc_device_remove(struct hv_device *device)
 		= rtnl_dereference(net_device_ctx->nvdev);
 	int i;
 
-	netvsc_revoke_buf(device, net_device);
+	netvsc_revoke_recv_buf(device, net_device);
+	netvsc_revoke_send_buf(device, net_device);
 
 	RCU_INIT_POINTER(net_device_ctx->nvdev, NULL);
 
@@ -569,14 +587,18 @@ void netvsc_device_remove(struct hv_device *device)
 	netdev_dbg(ndev, "net device safe to remove\n");
 
 	/* older versions require that buffer be revoked before close */
-	if (vmbus_proto_version < VERSION_WIN10)
-		netvsc_teardown_gpadl(device, net_device);
+	if (vmbus_proto_version < VERSION_WIN10) {
+		netvsc_teardown_recv_gpadl(device, net_device);
+		netvsc_teardown_send_gpadl(device, net_device);
+	}
 
 	/* Now, we can close the channel safely */
 	vmbus_close(device->channel);
 
-	if (vmbus_proto_version >= VERSION_WIN10)
-		netvsc_teardown_gpadl(device, net_device);
+	if (vmbus_proto_version >= VERSION_WIN10) {
+		netvsc_teardown_recv_gpadl(device, net_device);
+		netvsc_teardown_send_gpadl(device, net_device);
+	}
 
 	/* Release all resources */
 	free_netvsc_device_rcu(net_device);
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH net-stable 22/24] hv_netvsc: Ensure correct teardown message sequence order
  2018-05-14 22:31 [PATCH net-stable 00/24] hv_netvsc patches for 4.14 stable Stephen Hemminger
                   ` (20 preceding siblings ...)
  2018-05-14 22:32 ` [PATCH net-stable 21/24] hv_netvsc: Split netvsc_revoke_buf() and netvsc_teardown_gpadl() Stephen Hemminger
@ 2018-05-14 22:32 ` Stephen Hemminger
  2018-05-14 22:32 ` [PATCH net-stable 23/24] hv_netvsc: Fix net device attach on older Windows hosts Stephen Hemminger
  2018-05-14 22:32 ` [PATCH net-stable 24/24] hv_netvsc: set master device Stephen Hemminger
  23 siblings, 0 replies; 25+ messages in thread
From: Stephen Hemminger @ 2018-05-14 22:32 UTC (permalink / raw)
  To: davem; +Cc: netdev, Mohammed Gamal

From: Mohammed Gamal <mgamal@redhat.com>

commit a56d99d714665591fed8527b90eef21530ea61e0 upstream

Prior to commit 0cf737808ae7 ("hv_netvsc: netvsc_teardown_gpadl() split")
the call sequence in netvsc_device_remove() was as follows (as
implemented in netvsc_destroy_buf()):
1- Send NVSP_MSG1_TYPE_REVOKE_RECV_BUF message
2- Teardown receive buffer GPADL
3- Send NVSP_MSG1_TYPE_REVOKE_SEND_BUF message
4- Teardown send buffer GPADL
5- Close vmbus

This didn't work for WS2016 hosts. Commit 0cf737808ae7
("hv_netvsc: netvsc_teardown_gpadl() split") rearranged the
teardown sequence as follows:
1- Send NVSP_MSG1_TYPE_REVOKE_RECV_BUF message
2- Send NVSP_MSG1_TYPE_REVOKE_SEND_BUF message
3- Close vmbus
4- Teardown receive buffer GPADL
5- Teardown send buffer GPADL

That worked well for WS2016 hosts, but it prevented guests on older hosts from
shutting down after changing network settings. Commit 0ef58b0a05c1
("hv_netvsc: change GPAD teardown order on older versions") ensured the
following message sequence for older hosts
1- Send NVSP_MSG1_TYPE_REVOKE_RECV_BUF message
2- Send NVSP_MSG1_TYPE_REVOKE_SEND_BUF message
3- Teardown receive buffer GPADL
4- Teardown send buffer GPADL
5- Close vmbus

However, with this sequence calling `ip link set eth0 mtu 1000` hangs and the
process becomes uninterruptible. On futher analysis it turns out that on tearing
down the receive buffer GPADL the kernel is waiting indefinitely
in vmbus_teardown_gpadl() for a completion to be signaled.

Here is a snippet of where this occurs:
int vmbus_teardown_gpadl(struct vmbus_channel *channel, u32 gpadl_handle)
{
        struct vmbus_channel_gpadl_teardown *msg;
        struct vmbus_channel_msginfo *info;
        unsigned long flags;
        int ret;

        info = kmalloc(sizeof(*info) +
                       sizeof(struct vmbus_channel_gpadl_teardown), GFP_KERNEL);
        if (!info)
                return -ENOMEM;

        init_completion(&info->waitevent);
        info->waiting_channel = channel;
[....]
        ret = vmbus_post_msg(msg, sizeof(struct vmbus_channel_gpadl_teardown),
                             true);

        if (ret)
                goto post_msg_err;

        wait_for_completion(&info->waitevent);
[....]
}

The completion is signaled from vmbus_ongpadl_torndown(), which gets called when
the corresponding message is received from the host, which apparently never happens
in that case.
This patch works around the issue by restoring the first mentioned message sequence
for older hosts

Fixes: 0ef58b0a05c1 ("hv_netvsc: change GPAD teardown order on older versions")
Signed-off-by: Mohammed Gamal <mgamal@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/hyperv/netvsc.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index 25fcba506ac5..99be63eacaeb 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -571,8 +571,17 @@ void netvsc_device_remove(struct hv_device *device)
 		= rtnl_dereference(net_device_ctx->nvdev);
 	int i;
 
+	/*
+	 * Revoke receive buffer. If host is pre-Win2016 then tear down
+	 * receive buffer GPADL. Do the same for send buffer.
+	 */
 	netvsc_revoke_recv_buf(device, net_device);
+	if (vmbus_proto_version < VERSION_WIN10)
+		netvsc_teardown_recv_gpadl(device, net_device);
+
 	netvsc_revoke_send_buf(device, net_device);
+	if (vmbus_proto_version < VERSION_WIN10)
+		netvsc_teardown_send_gpadl(device, net_device);
 
 	RCU_INIT_POINTER(net_device_ctx->nvdev, NULL);
 
@@ -586,15 +595,13 @@ void netvsc_device_remove(struct hv_device *device)
 	 */
 	netdev_dbg(ndev, "net device safe to remove\n");
 
-	/* older versions require that buffer be revoked before close */
-	if (vmbus_proto_version < VERSION_WIN10) {
-		netvsc_teardown_recv_gpadl(device, net_device);
-		netvsc_teardown_send_gpadl(device, net_device);
-	}
-
 	/* Now, we can close the channel safely */
 	vmbus_close(device->channel);
 
+	/*
+	 * If host is Win2016 or higher then we do the GPADL tear down
+	 * here after VMBus is closed.
+	*/
 	if (vmbus_proto_version >= VERSION_WIN10) {
 		netvsc_teardown_recv_gpadl(device, net_device);
 		netvsc_teardown_send_gpadl(device, net_device);
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH net-stable 23/24] hv_netvsc: Fix net device attach on older Windows hosts
  2018-05-14 22:31 [PATCH net-stable 00/24] hv_netvsc patches for 4.14 stable Stephen Hemminger
                   ` (21 preceding siblings ...)
  2018-05-14 22:32 ` [PATCH net-stable 22/24] hv_netvsc: Ensure correct teardown message sequence order Stephen Hemminger
@ 2018-05-14 22:32 ` Stephen Hemminger
  2018-05-14 22:32 ` [PATCH net-stable 24/24] hv_netvsc: set master device Stephen Hemminger
  23 siblings, 0 replies; 25+ messages in thread
From: Stephen Hemminger @ 2018-05-14 22:32 UTC (permalink / raw)
  To: davem; +Cc: netdev, Mohammed Gamal

From: Mohammed Gamal <mgamal@redhat.com>

commit 55be9f25be1ca5bda75c39808fc77e42691bc07f upstream

On older windows hosts the net_device instance is returned to
the caller of rndis_filter_device_add() without having the presence
bit set first. This would cause any subsequent calls to network device
operations (e.g. MTU change, channel change) to fail after the device
is detached once, returning -ENODEV.

Instead of returning the device instabce, we take the exit path where
we call netif_device_attach()

Fixes: 7b2ee50c0cd5 ("hv_netvsc: common detach logic")
Signed-off-by: Mohammed Gamal <mgamal@redhat.com>
Reviewed-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/hyperv/rndis_filter.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/hyperv/rndis_filter.c b/drivers/net/hyperv/rndis_filter.c
index 3bfa56560286..6dde92c1c113 100644
--- a/drivers/net/hyperv/rndis_filter.c
+++ b/drivers/net/hyperv/rndis_filter.c
@@ -1276,7 +1276,7 @@ struct netvsc_device *rndis_filter_device_add(struct hv_device *dev,
 		   rndis_device->link_state ? "down" : "up");
 
 	if (net_device->nvsp_version < NVSP_PROTOCOL_VERSION_5)
-		return net_device;
+		goto out;
 
 	rndis_filter_query_link_speed(rndis_device, net_device);
 
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH net-stable 24/24] hv_netvsc: set master device
  2018-05-14 22:31 [PATCH net-stable 00/24] hv_netvsc patches for 4.14 stable Stephen Hemminger
                   ` (22 preceding siblings ...)
  2018-05-14 22:32 ` [PATCH net-stable 23/24] hv_netvsc: Fix net device attach on older Windows hosts Stephen Hemminger
@ 2018-05-14 22:32 ` Stephen Hemminger
  23 siblings, 0 replies; 25+ messages in thread
From: Stephen Hemminger @ 2018-05-14 22:32 UTC (permalink / raw)
  To: davem; +Cc: netdev, Stephen Hemminger

From: Stephen Hemminger <stephen@networkplumber.org>

commit 97f3efb64323beb0690576e9d74e94998ad6e82a upstream

The hyper-v transparent bonding should have used master_dev_link.
The netvsc device should look like a master bond device not
like the upper side of a tunnel.

This makes the semantics the same so that userspace applications
looking at network devices see the correct master relationshipship.

Fixes: 0c195567a8f6 ("netvsc: transparent VF management")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/hyperv/netvsc_drv.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index be4b15b63355..11b46c8d2d67 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -1778,7 +1778,8 @@ static int netvsc_vf_join(struct net_device *vf_netdev,
 		goto rx_handler_failed;
 	}
 
-	ret = netdev_upper_dev_link(vf_netdev, ndev);
+	ret = netdev_master_upper_dev_link(vf_netdev, ndev,
+					   NULL, NULL);
 	if (ret != 0) {
 		netdev_err(vf_netdev,
 			   "can not set master device %s (err = %d)\n",
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2018-05-14 22:33 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-14 22:31 [PATCH net-stable 00/24] hv_netvsc patches for 4.14 stable Stephen Hemminger
2018-05-14 22:32 ` [PATCH net-stable 01/24] hv_netvsc: Fix the real number of queues of non-vRSS cases Stephen Hemminger
2018-05-14 22:32 ` [PATCH net-stable 02/24] hv_netvsc: Rename ind_table to rx_table Stephen Hemminger
2018-05-14 22:32 ` [PATCH net-stable 03/24] hv_netvsc: Rename tx_send_table to tx_table Stephen Hemminger
2018-05-14 22:32 ` [PATCH net-stable 04/24] hv_netvsc: Add initialization of tx_table in netvsc_device_add() Stephen Hemminger
2018-05-14 22:32 ` [PATCH net-stable 05/24] hv_netvsc: Set tx_table to equal weight after subchannels open Stephen Hemminger
2018-05-14 22:32 ` [PATCH net-stable 06/24] hv_netvsc: netvsc_teardown_gpadl() split Stephen Hemminger
2018-05-14 22:32 ` [PATCH net-stable 07/24] hv_netvsc: preserve hw_features on mtu/channels/ringparam changes Stephen Hemminger
2018-05-14 22:32 ` [PATCH net-stable 08/24] hv_netvsc: empty current transmit aggregation if flow blocked Stephen Hemminger
2018-05-14 22:32 ` [PATCH net-stable 09/24] hv_netvsc: Use the num_online_cpus() for channel limit Stephen Hemminger
2018-05-14 22:32 ` [PATCH net-stable 10/24] hv_netvsc: avoid retry on send during shutdown Stephen Hemminger
2018-05-14 22:32 ` [PATCH net-stable 11/24] hv_netvsc: only wake transmit queue if link is up Stephen Hemminger
2018-05-14 22:32 ` [PATCH net-stable 12/24] hv_netvsc: fix error unwind handling if vmbus_open fails Stephen Hemminger
2018-05-14 22:32 ` [PATCH net-stable 13/24] hv_netvsc: cancel subchannel setup before halting device Stephen Hemminger
2018-05-14 22:32 ` [PATCH net-stable 14/24] hv_netvsc: fix race in napi poll when rescheduling Stephen Hemminger
2018-05-14 22:32 ` [PATCH net-stable 15/24] hv_netvsc: defer queue selection to VF Stephen Hemminger
2018-05-14 22:32 ` [PATCH net-stable 16/24] hv_netvsc: disable NAPI before channel close Stephen Hemminger
2018-05-14 22:32 ` [PATCH net-stable 17/24] hv_netvsc: use RCU to fix concurrent rx and queue changes Stephen Hemminger
2018-05-14 22:32 ` [PATCH net-stable 18/24] hv_netvsc: change GPAD teardown order on older versions Stephen Hemminger
2018-05-14 22:32 ` [PATCH net-stable 19/24] hv_netvsc: common detach logic Stephen Hemminger
2018-05-14 22:32 ` [PATCH net-stable 20/24] hv_netvsc: Use Windows version instead of NVSP version on GPAD teardown Stephen Hemminger
2018-05-14 22:32 ` [PATCH net-stable 21/24] hv_netvsc: Split netvsc_revoke_buf() and netvsc_teardown_gpadl() Stephen Hemminger
2018-05-14 22:32 ` [PATCH net-stable 22/24] hv_netvsc: Ensure correct teardown message sequence order Stephen Hemminger
2018-05-14 22:32 ` [PATCH net-stable 23/24] hv_netvsc: Fix net device attach on older Windows hosts Stephen Hemminger
2018-05-14 22:32 ` [PATCH net-stable 24/24] hv_netvsc: set master device Stephen Hemminger

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.