linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net v2] hv_netvsc: Fix hibernation for mlx5 VF driver
@ 2020-09-07  7:13 Dexuan Cui
  2020-09-08  4:10 ` Jakub Kicinski
  2020-09-08 20:49 ` Michael Kelley
  0 siblings, 2 replies; 4+ messages in thread
From: Dexuan Cui @ 2020-09-07  7:13 UTC (permalink / raw)
  To: kuba, wei.liu, kys, haiyangz, sthemmin, davem, linux-hyperv,
	netdev, linux-kernel, mikelley
  Cc: Dexuan Cui

mlx5_suspend()/resume() keep the network interface, so during hibernation
netvsc_unregister_vf() and netvsc_register_vf() are not called, and hence
netvsc_resume() should call netvsc_vf_changed() to switch the data path
back to the VF after hibernation. Note: after we close and re-open the
vmbus channel of the netvsc NIC in netvsc_suspend() and netvsc_resume(),
the data path is implicitly switched to the netvsc NIC. Similarly,
netvsc_suspend() should not call netvsc_unregister_vf(), otherwise the VF
can no longer be used after hibernation.

For mlx4, since the VF network interafce is explicitly destroyed and
re-created during hibernation (see mlx4_suspend()/resume()), hv_netvsc
already explicitly switches the data path from and to the VF automatically
via netvsc_register_vf() and netvsc_unregister_vf(), so mlx4 doesn't need
this fix. Note: mlx4 can still work with the fix because in
netvsc_suspend()/resume() ndev_ctx->vf_netdev is NULL for mlx4.

Fixes: 0efeea5fb153 ("hv_netvsc: Add the support of hibernation")
Signed-off-by: Dexuan Cui <decui@microsoft.com>
---

Changes in v2 (Thanks Jakub Kicinski <kuba@kernel.org>):
    Added coments in the changelog and the code about the implicit
data path switching to the netvsc when we close/re-open the vmbus
channels.
    Used reverse xmas order ordering in netvsc_remove().

 drivers/net/hyperv/netvsc_drv.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index 64b0a74c1523..81c5c70b616a 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -2587,8 +2587,8 @@ static int netvsc_remove(struct hv_device *dev)
 static int netvsc_suspend(struct hv_device *dev)
 {
 	struct net_device_context *ndev_ctx;
-	struct net_device *vf_netdev, *net;
 	struct netvsc_device *nvdev;
+	struct net_device *net;
 	int ret;
 
 	net = hv_get_drvdata(dev);
@@ -2604,10 +2604,6 @@ static int netvsc_suspend(struct hv_device *dev)
 		goto out;
 	}
 
-	vf_netdev = rtnl_dereference(ndev_ctx->vf_netdev);
-	if (vf_netdev)
-		netvsc_unregister_vf(vf_netdev);
-
 	/* Save the current config info */
 	ndev_ctx->saved_netvsc_dev_info = netvsc_devinfo_get(nvdev);
 
@@ -2623,6 +2619,7 @@ static int netvsc_resume(struct hv_device *dev)
 	struct net_device *net = hv_get_drvdata(dev);
 	struct net_device_context *net_device_ctx;
 	struct netvsc_device_info *device_info;
+	struct net_device *vf_netdev;
 	int ret;
 
 	rtnl_lock();
@@ -2635,6 +2632,15 @@ static int netvsc_resume(struct hv_device *dev)
 	netvsc_devinfo_put(device_info);
 	net_device_ctx->saved_netvsc_dev_info = NULL;
 
+	/* A NIC driver (e.g. mlx5) may keep the VF network interface across
+	 * hibernation, but here the data path is implicitly switched to the
+	 * netvsc NIC since the vmbus channel is closed and re-opened, so
+	 * netvsc_vf_changed() must be used to switch the data path to the VF.
+	 */
+	vf_netdev = rtnl_dereference(net_device_ctx->vf_netdev);
+	if (vf_netdev && netvsc_vf_changed(vf_netdev) != NOTIFY_OK)
+		ret = -EINVAL;
+
 	rtnl_unlock();
 
 	return ret;
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH net v2] hv_netvsc: Fix hibernation for mlx5 VF driver
  2020-09-07  7:13 [PATCH net v2] hv_netvsc: Fix hibernation for mlx5 VF driver Dexuan Cui
@ 2020-09-08  4:10 ` Jakub Kicinski
  2020-09-08 20:49 ` Michael Kelley
  1 sibling, 0 replies; 4+ messages in thread
From: Jakub Kicinski @ 2020-09-08  4:10 UTC (permalink / raw)
  To: Dexuan Cui
  Cc: wei.liu, kys, haiyangz, sthemmin, davem, linux-hyperv, netdev,
	linux-kernel, mikelley

On Mon,  7 Sep 2020 00:13:39 -0700 Dexuan Cui wrote:
> mlx5_suspend()/resume() keep the network interface, so during hibernation
> netvsc_unregister_vf() and netvsc_register_vf() are not called, and hence
> netvsc_resume() should call netvsc_vf_changed() to switch the data path
> back to the VF after hibernation. Note: after we close and re-open the
> vmbus channel of the netvsc NIC in netvsc_suspend() and netvsc_resume(),
> the data path is implicitly switched to the netvsc NIC. Similarly,
> netvsc_suspend() should not call netvsc_unregister_vf(), otherwise the VF
> can no longer be used after hibernation.
> 
> For mlx4, since the VF network interafce is explicitly destroyed and
> re-created during hibernation (see mlx4_suspend()/resume()), hv_netvsc
> already explicitly switches the data path from and to the VF automatically
> via netvsc_register_vf() and netvsc_unregister_vf(), so mlx4 doesn't need
> this fix. Note: mlx4 can still work with the fix because in
> netvsc_suspend()/resume() ndev_ctx->vf_netdev is NULL for mlx4.
> 
> Fixes: 0efeea5fb153 ("hv_netvsc: Add the support of hibernation")
> Signed-off-by: Dexuan Cui <decui@microsoft.com>

Applied, thanks!

^ permalink raw reply	[flat|nested] 4+ messages in thread

* RE: [PATCH net v2] hv_netvsc: Fix hibernation for mlx5 VF driver
  2020-09-07  7:13 [PATCH net v2] hv_netvsc: Fix hibernation for mlx5 VF driver Dexuan Cui
  2020-09-08  4:10 ` Jakub Kicinski
@ 2020-09-08 20:49 ` Michael Kelley
  2020-09-08 23:00   ` Dexuan Cui
  1 sibling, 1 reply; 4+ messages in thread
From: Michael Kelley @ 2020-09-08 20:49 UTC (permalink / raw)
  To: Dexuan Cui, kuba, wei.liu, KY Srinivasan, Haiyang Zhang,
	Stephen Hemminger, davem, linux-hyperv, netdev, linux-kernel

From: Dexuan Cui <decui@microsoft.com>  Sent: Monday, September 7, 2020 12:14 AM
> 
> mlx5_suspend()/resume() keep the network interface, so during hibernation
> netvsc_unregister_vf() and netvsc_register_vf() are not called, and hence
> netvsc_resume() should call netvsc_vf_changed() to switch the data path
> back to the VF after hibernation. Note: after we close and re-open the
> vmbus channel of the netvsc NIC in netvsc_suspend() and netvsc_resume(),
> the data path is implicitly switched to the netvsc NIC. Similarly,
> netvsc_suspend() should not call netvsc_unregister_vf(), otherwise the VF
> can no longer be used after hibernation.
> 
> For mlx4, since the VF network interafce is explicitly destroyed and
> re-created during hibernation (see mlx4_suspend()/resume()), hv_netvsc
> already explicitly switches the data path from and to the VF automatically
> via netvsc_register_vf() and netvsc_unregister_vf(), so mlx4 doesn't need
> this fix. Note: mlx4 can still work with the fix because in
> netvsc_suspend()/resume() ndev_ctx->vf_netdev is NULL for mlx4.
> 
> Fixes: 0efeea5fb153 ("hv_netvsc: Add the support of hibernation")
> Signed-off-by: Dexuan Cui <decui@microsoft.com>
> ---
> 
> Changes in v2 (Thanks Jakub Kicinski <kuba@kernel.org>):
>     Added coments in the changelog and the code about the implicit
> data path switching to the netvsc when we close/re-open the vmbus
> channels.
>     Used reverse xmas order ordering in netvsc_remove().
> 
>  drivers/net/hyperv/netvsc_drv.c | 16 +++++++++++-----
>  1 file changed, 11 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
> index 64b0a74c1523..81c5c70b616a 100644
> --- a/drivers/net/hyperv/netvsc_drv.c
> +++ b/drivers/net/hyperv/netvsc_drv.c
> @@ -2587,8 +2587,8 @@ static int netvsc_remove(struct hv_device *dev)
>  static int netvsc_suspend(struct hv_device *dev)
>  {
>  	struct net_device_context *ndev_ctx;
> -	struct net_device *vf_netdev, *net;
>  	struct netvsc_device *nvdev;
> +	struct net_device *net;
>  	int ret;
> 
>  	net = hv_get_drvdata(dev);
> @@ -2604,10 +2604,6 @@ static int netvsc_suspend(struct hv_device *dev)
>  		goto out;
>  	}
> 
> -	vf_netdev = rtnl_dereference(ndev_ctx->vf_netdev);
> -	if (vf_netdev)
> -		netvsc_unregister_vf(vf_netdev);
> -
>  	/* Save the current config info */
>  	ndev_ctx->saved_netvsc_dev_info = netvsc_devinfo_get(nvdev);
> 
> @@ -2623,6 +2619,7 @@ static int netvsc_resume(struct hv_device *dev)
>  	struct net_device *net = hv_get_drvdata(dev);
>  	struct net_device_context *net_device_ctx;
>  	struct netvsc_device_info *device_info;
> +	struct net_device *vf_netdev;
>  	int ret;
> 
>  	rtnl_lock();
> @@ -2635,6 +2632,15 @@ static int netvsc_resume(struct hv_device *dev)
>  	netvsc_devinfo_put(device_info);
>  	net_device_ctx->saved_netvsc_dev_info = NULL;
> 
> +	/* A NIC driver (e.g. mlx5) may keep the VF network interface across
> +	 * hibernation, but here the data path is implicitly switched to the
> +	 * netvsc NIC since the vmbus channel is closed and re-opened, so
> +	 * netvsc_vf_changed() must be used to switch the data path to the VF.
> +	 */
> +	vf_netdev = rtnl_dereference(net_device_ctx->vf_netdev);
> +	if (vf_netdev && netvsc_vf_changed(vf_netdev) != NOTIFY_OK)
> +		ret = -EINVAL;
> +

I'm a little late looking at this code.  But a question:  Is it possible for
netvsc_resume() to be called before the VF driver's resume function
is called?  If so, is it possible for netvsc_vf_changed() to find that the VF
is not up, and hence to switch the data path away from the VF instead of
to the VF?

Michael

>  	rtnl_unlock();
> 
>  	return ret;
> --
> 2.19.1


^ permalink raw reply	[flat|nested] 4+ messages in thread

* RE: [PATCH net v2] hv_netvsc: Fix hibernation for mlx5 VF driver
  2020-09-08 20:49 ` Michael Kelley
@ 2020-09-08 23:00   ` Dexuan Cui
  0 siblings, 0 replies; 4+ messages in thread
From: Dexuan Cui @ 2020-09-08 23:00 UTC (permalink / raw)
  To: Michael Kelley, kuba, wei.liu, KY Srinivasan, Haiyang Zhang,
	Stephen Hemminger, davem, linux-hyperv, netdev, linux-kernel

> From: Michael Kelley <mikelley@microsoft.com>
> Sent: Tuesday, September 8, 2020 1:49 PM
> > @@ -2635,6 +2632,15 @@ static int netvsc_resume(struct hv_device *dev)
> >  	netvsc_devinfo_put(device_info);
> >  	net_device_ctx->saved_netvsc_dev_info = NULL;
> >
> > +	/* A NIC driver (e.g. mlx5) may keep the VF network interface across
> > +	 * hibernation, but here the data path is implicitly switched to the
> > +	 * netvsc NIC since the vmbus channel is closed and re-opened, so
> > +	 * netvsc_vf_changed() must be used to switch the data path to the VF.
> > +	 */
> > +	vf_netdev = rtnl_dereference(net_device_ctx->vf_netdev);
> > +	if (vf_netdev && netvsc_vf_changed(vf_netdev) != NOTIFY_OK)
> > +		ret = -EINVAL;
> > +
> 
> I'm a little late looking at this code.  But a question:  Is it possible for
> netvsc_resume() to be called before the VF driver's resume function
> is called?  
Yes, actually this is what happens 100% of the time. :-)

Upon suspend:
1. the mlx5 driver suspends the VF NIC.
2. the pci-hyperv suspends the VF vmbus device, including closing the channel.
3. hv_netvsc suspends the netvsc vmbus device, including closing the channel.

Note: between 1) and 3), the data path is still the mlx5 VF, but obviously the VF
NIC is not working. IMO this is not an issue in practice, since typically we don't
really expect this to work once the suspending starts.

Upon resume:
1. hv_netvsc resumes the netvsc vmbus device, including opening the channel.

At this time, the data path should be the netvsc NIC since we close and re-open 
the netvsc vmbus channel, and I believe the default data path is netvsc.

With this patch, the data path is switched to the VF NIC in netvsc_resume() 
because "netif_running(vf_netdev)" is true for the mlx5 VF NIC (CX-4), though 
the VF NIC device is not resumed back yet. According to my test, I believe this
switching succeeds. Note: when the host receives the VM's 
NVSP_MSG4_TYPE_SWITCH_DATA_PATH request, it looks the host doesn't check
if the VF vmbus device and the VF PCI device are really "activated"[1], and it
looks the host simply programs the FPGA GFT for the newly-requested data path,
and the host doesn't send a response message to the VM, indicating if the
switching is a success or a failure.

So, at this time, any incoming Ethernet packets (except the broadcast/multicast
and TCP SYN packets, which always go through the netvsc NIC on Azure) can not
be received by the VF NIC, which has not been resumed yet. IMO this is not an
issue in practice, since typically we don't really expect this to work before the
resuming is fully finished. BTW, at this time the userspace is not thawed yet, so
no application can transmit packets.

2. pci-hyperv resumes the VF vmbus device, including opening the channel.

3. the mlx5 driver resumes the VF NIC, and now everything is backed to normal.


[1] In the future, if the host starts to check if the VF vmbus/PCI devices are 
activated upon the receipt of the VM's NVSP_MSG4_TYPE_SWITCH_DATA_PATH
request, and refuses to switch the data path if they're not activated, then
we'll be in trouble, but even if that happens, this patch itself doesn't make
the situation worse.

So ideally we need a mechanism to switch the data path to the mlx5 VF NIC
*after* the mlx5 NIC is resumed. Unluckily it looks there is not a standard
notification mechanism for this since the mlx5 driver preserves the VF network
interface. I'll discuss with the Mellanox developer who implemented mlx5
hibernation support, and probably mlx5 should also destroy/re-create the
VF network interface across hibernation, just as the mlx4 driver does.


> If so, is it possible for netvsc_vf_changed() to find that the VF
> is not up, and hence to switch the data path away from the VF instead of
> to the VF?
> 
> Michael

When we are in netvsc_resume(), in my test "netif_running(vf_netdev)" is 
always true for the mlx5 VF NIC (CX-4), so netvsc_vf_changed() should switch
the data path to the VF.

static inline bool netif_running(const struct net_device *dev)
{
        return test_bit(__LINK_STATE_START, &dev->state);
}

The flag __LINK_STATE_START is only cleared when the NIC is brought "DOWN",
and that case is already automatically handled by netvsc_netdev_event().

For mlx4 (CX-3), net_device_ctx->vf_netdev is NULL in netvsc_resume(), so the
CX-3 VF NIC is not affected by this patch.

Thanks,
-- Dexuan

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2020-09-08 23:00 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-07  7:13 [PATCH net v2] hv_netvsc: Fix hibernation for mlx5 VF driver Dexuan Cui
2020-09-08  4:10 ` Jakub Kicinski
2020-09-08 20:49 ` Michael Kelley
2020-09-08 23:00   ` Dexuan Cui

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).