All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sridhar Samudrala <sridhar.samudrala@intel.com>
To: mst@redhat.com, stephen@networkplumber.org, davem@davemloft.net,
	netdev@vger.kernel.org,
	virtualization@lists.linux-foundation.org,
	virtio-dev@lists.oasis-open.org, jesse.brandeburg@intel.com,
	alexander.h.duyck@intel.com, kubakici@wp.pl,
	sridhar.samudrala@intel.com, jasowang@redhat.com,
	loseweigh@gmail.com, jiri@resnulli.us
Subject: [RFC PATCH net-next v6 3/4] virtio_net: Extend virtio to use VF datapath when available
Date: Tue, 10 Apr 2018 11:59:49 -0700	[thread overview]
Message-ID: <1523386790-12396-4-git-send-email-sridhar.samudrala@intel.com> (raw)
In-Reply-To: <1523386790-12396-1-git-send-email-sridhar.samudrala@intel.com>

This patch enables virtio_net to switch over to a VF datapath when a VF
netdev is present with the same MAC address. It allows live migration
of a VM with a direct attached VF without the need to setup a bond/team
between a VF and virtio net device in the guest.

The hypervisor needs to enable only one datapath at any time so that
packets don't get looped back to the VM over the other datapath. When a VF
is plugged, the virtio datapath link state can be marked as down. The
hypervisor needs to unplug the VF device from the guest on the source host
and reset the MAC filter of the VF to initiate failover of datapath to
virtio before starting the migration. After the migration is completed,
the destination hypervisor sets the MAC filter on the VF and plugs it back
to the guest to switch over to VF datapath.

It uses the generic bypass framework that provides 2 functions to create
and destroy a master bypass netdev. When BACKUP feature is enabled, an
additional netdev(bypass netdev) is created that acts as a master device
and tracks the state of the 2 lower netdevs. The original virtio_net netdev
is marked as 'backup' netdev and a passthru device with the same MAC is
registered as 'active' netdev.

This patch is based on the discussion initiated by Jesse on this thread.
https://marc.info/?l=linux-virtualization&m=151189725224231&w=2

Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
---
 drivers/net/Kconfig      |  1 +
 drivers/net/virtio_net.c | 36 +++++++++++++++++++++++++++++++++++-
 2 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index 891846655000..9e2cf61fd1c1 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -331,6 +331,7 @@ config VETH
 config VIRTIO_NET
 	tristate "Virtio network driver"
 	depends on VIRTIO
+	depends on MAY_USE_BYPASS
 	---help---
 	  This is the virtual network driver for virtio.  It can be used with
 	  QEMU based VMMs (like KVM or Xen).  Say Y or M.
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index befb5944f3fd..99aa52d5ac9b 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -30,8 +30,11 @@
 #include <linux/cpu.h>
 #include <linux/average.h>
 #include <linux/filter.h>
+#include <linux/netdevice.h>
+#include <linux/pci.h>
 #include <net/route.h>
 #include <net/xdp.h>
+#include <net/bypass.h>
 
 static int napi_weight = NAPI_POLL_WEIGHT;
 module_param(napi_weight, int, 0444);
@@ -206,6 +209,9 @@ struct virtnet_info {
 	u32 speed;
 
 	unsigned long guest_offloads;
+
+	/* bypass_master created when BACKUP feature enabled */
+	struct bypass_master *bypass_master;
 };
 
 struct padded_vnet_hdr {
@@ -2275,6 +2281,22 @@ static int virtnet_xdp(struct net_device *dev, struct netdev_bpf *xdp)
 	}
 }
 
+static int virtnet_get_phys_port_name(struct net_device *dev, char *buf,
+				      size_t len)
+{
+	struct virtnet_info *vi = netdev_priv(dev);
+	int ret;
+
+	if (!virtio_has_feature(vi->vdev, VIRTIO_NET_F_BACKUP))
+		return -EOPNOTSUPP;
+
+	ret = snprintf(buf, len, "_bkup");
+	if (ret >= len)
+		return -EOPNOTSUPP;
+
+	return 0;
+}
+
 static const struct net_device_ops virtnet_netdev = {
 	.ndo_open            = virtnet_open,
 	.ndo_stop   	     = virtnet_close,
@@ -2292,6 +2314,7 @@ static const struct net_device_ops virtnet_netdev = {
 	.ndo_xdp_xmit		= virtnet_xdp_xmit,
 	.ndo_xdp_flush		= virtnet_xdp_flush,
 	.ndo_features_check	= passthru_features_check,
+	.ndo_get_phys_port_name	= virtnet_get_phys_port_name,
 };
 
 static void virtnet_config_changed_work(struct work_struct *work)
@@ -2839,10 +2862,16 @@ static int virtnet_probe(struct virtio_device *vdev)
 
 	virtnet_init_settings(dev);
 
+	if (virtio_has_feature(vdev, VIRTIO_NET_F_BACKUP)) {
+		err = bypass_master_create(vi->dev, &vi->bypass_master);
+		if (err)
+			goto free_vqs;
+	}
+
 	err = register_netdev(dev);
 	if (err) {
 		pr_debug("virtio_net: registering device failed\n");
-		goto free_vqs;
+		goto free_bypass;
 	}
 
 	virtio_device_ready(vdev);
@@ -2879,6 +2908,8 @@ static int virtnet_probe(struct virtio_device *vdev)
 	vi->vdev->config->reset(vdev);
 
 	unregister_netdev(dev);
+free_bypass:
+	bypass_master_destroy(vi->bypass_master);
 free_vqs:
 	cancel_delayed_work_sync(&vi->refill);
 	free_receive_page_frags(vi);
@@ -2913,6 +2944,8 @@ static void virtnet_remove(struct virtio_device *vdev)
 
 	unregister_netdev(vi->dev);
 
+	bypass_master_destroy(vi->bypass_master);
+
 	remove_vq_common(vi);
 
 	free_netdev(vi->dev);
@@ -3010,6 +3043,7 @@ static __init int virtio_net_driver_init(void)
         ret = register_virtio_driver(&virtio_net_driver);
 	if (ret)
 		goto err_virtio;
+
 	return 0;
 err_virtio:
 	cpuhp_remove_multi_state(CPUHP_VIRT_NET_DEAD);
-- 
2.14.3

WARNING: multiple messages have this Message-ID (diff)
From: Sridhar Samudrala <sridhar.samudrala@intel.com>
To: mst@redhat.com, stephen@networkplumber.org, davem@davemloft.net,
	netdev@vger.kernel.org,
	virtualization@lists.linux-foundation.org,
	virtio-dev@lists.oasis-open.org, jesse.brandeburg@intel.com,
	alexander.h.duyck@intel.com, kubakici@wp.pl,
	sridhar.samudrala@intel.com, jasowang@redhat.com,
	loseweigh@gmail.com, jiri@resnulli.us
Subject: [virtio-dev] [RFC PATCH net-next v6 3/4] virtio_net: Extend virtio to use VF datapath when available
Date: Tue, 10 Apr 2018 11:59:49 -0700	[thread overview]
Message-ID: <1523386790-12396-4-git-send-email-sridhar.samudrala@intel.com> (raw)
In-Reply-To: <1523386790-12396-1-git-send-email-sridhar.samudrala@intel.com>

This patch enables virtio_net to switch over to a VF datapath when a VF
netdev is present with the same MAC address. It allows live migration
of a VM with a direct attached VF without the need to setup a bond/team
between a VF and virtio net device in the guest.

The hypervisor needs to enable only one datapath at any time so that
packets don't get looped back to the VM over the other datapath. When a VF
is plugged, the virtio datapath link state can be marked as down. The
hypervisor needs to unplug the VF device from the guest on the source host
and reset the MAC filter of the VF to initiate failover of datapath to
virtio before starting the migration. After the migration is completed,
the destination hypervisor sets the MAC filter on the VF and plugs it back
to the guest to switch over to VF datapath.

It uses the generic bypass framework that provides 2 functions to create
and destroy a master bypass netdev. When BACKUP feature is enabled, an
additional netdev(bypass netdev) is created that acts as a master device
and tracks the state of the 2 lower netdevs. The original virtio_net netdev
is marked as 'backup' netdev and a passthru device with the same MAC is
registered as 'active' netdev.

This patch is based on the discussion initiated by Jesse on this thread.
https://marc.info/?l=linux-virtualization&m=151189725224231&w=2

Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
---
 drivers/net/Kconfig      |  1 +
 drivers/net/virtio_net.c | 36 +++++++++++++++++++++++++++++++++++-
 2 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index 891846655000..9e2cf61fd1c1 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -331,6 +331,7 @@ config VETH
 config VIRTIO_NET
 	tristate "Virtio network driver"
 	depends on VIRTIO
+	depends on MAY_USE_BYPASS
 	---help---
 	  This is the virtual network driver for virtio.  It can be used with
 	  QEMU based VMMs (like KVM or Xen).  Say Y or M.
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index befb5944f3fd..99aa52d5ac9b 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -30,8 +30,11 @@
 #include <linux/cpu.h>
 #include <linux/average.h>
 #include <linux/filter.h>
+#include <linux/netdevice.h>
+#include <linux/pci.h>
 #include <net/route.h>
 #include <net/xdp.h>
+#include <net/bypass.h>
 
 static int napi_weight = NAPI_POLL_WEIGHT;
 module_param(napi_weight, int, 0444);
@@ -206,6 +209,9 @@ struct virtnet_info {
 	u32 speed;
 
 	unsigned long guest_offloads;
+
+	/* bypass_master created when BACKUP feature enabled */
+	struct bypass_master *bypass_master;
 };
 
 struct padded_vnet_hdr {
@@ -2275,6 +2281,22 @@ static int virtnet_xdp(struct net_device *dev, struct netdev_bpf *xdp)
 	}
 }
 
+static int virtnet_get_phys_port_name(struct net_device *dev, char *buf,
+				      size_t len)
+{
+	struct virtnet_info *vi = netdev_priv(dev);
+	int ret;
+
+	if (!virtio_has_feature(vi->vdev, VIRTIO_NET_F_BACKUP))
+		return -EOPNOTSUPP;
+
+	ret = snprintf(buf, len, "_bkup");
+	if (ret >= len)
+		return -EOPNOTSUPP;
+
+	return 0;
+}
+
 static const struct net_device_ops virtnet_netdev = {
 	.ndo_open            = virtnet_open,
 	.ndo_stop   	     = virtnet_close,
@@ -2292,6 +2314,7 @@ static const struct net_device_ops virtnet_netdev = {
 	.ndo_xdp_xmit		= virtnet_xdp_xmit,
 	.ndo_xdp_flush		= virtnet_xdp_flush,
 	.ndo_features_check	= passthru_features_check,
+	.ndo_get_phys_port_name	= virtnet_get_phys_port_name,
 };
 
 static void virtnet_config_changed_work(struct work_struct *work)
@@ -2839,10 +2862,16 @@ static int virtnet_probe(struct virtio_device *vdev)
 
 	virtnet_init_settings(dev);
 
+	if (virtio_has_feature(vdev, VIRTIO_NET_F_BACKUP)) {
+		err = bypass_master_create(vi->dev, &vi->bypass_master);
+		if (err)
+			goto free_vqs;
+	}
+
 	err = register_netdev(dev);
 	if (err) {
 		pr_debug("virtio_net: registering device failed\n");
-		goto free_vqs;
+		goto free_bypass;
 	}
 
 	virtio_device_ready(vdev);
@@ -2879,6 +2908,8 @@ static int virtnet_probe(struct virtio_device *vdev)
 	vi->vdev->config->reset(vdev);
 
 	unregister_netdev(dev);
+free_bypass:
+	bypass_master_destroy(vi->bypass_master);
 free_vqs:
 	cancel_delayed_work_sync(&vi->refill);
 	free_receive_page_frags(vi);
@@ -2913,6 +2944,8 @@ static void virtnet_remove(struct virtio_device *vdev)
 
 	unregister_netdev(vi->dev);
 
+	bypass_master_destroy(vi->bypass_master);
+
 	remove_vq_common(vi);
 
 	free_netdev(vi->dev);
@@ -3010,6 +3043,7 @@ static __init int virtio_net_driver_init(void)
         ret = register_virtio_driver(&virtio_net_driver);
 	if (ret)
 		goto err_virtio;
+
 	return 0;
 err_virtio:
 	cpuhp_remove_multi_state(CPUHP_VIRT_NET_DEAD);
-- 
2.14.3


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


  parent reply	other threads:[~2018-04-10 18:59 UTC|newest]

Thread overview: 147+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-10 18:59 [RFC PATCH net-next v6 0/4] Enable virtio_net to act as a backup for a passthru device Sridhar Samudrala
2018-04-10 18:59 ` [virtio-dev] " Sridhar Samudrala
2018-04-10 18:59 ` [RFC PATCH net-next v6 1/4] virtio_net: Introduce VIRTIO_NET_F_BACKUP feature bit Sridhar Samudrala
2018-04-10 18:59   ` [virtio-dev] " Sridhar Samudrala
2018-04-10 18:59 ` [RFC PATCH net-next v6 2/4] net: Introduce generic bypass module Sridhar Samudrala
2018-04-10 18:59   ` [virtio-dev] " Sridhar Samudrala
2018-04-11 15:51   ` Jiri Pirko
2018-04-11 19:13     ` Samudrala, Sridhar
2018-04-11 19:13     ` Samudrala, Sridhar
2018-04-11 19:13       ` [virtio-dev] " Samudrala, Sridhar
2018-04-18  9:25       ` Jiri Pirko
2018-04-18  9:25       ` Jiri Pirko
2018-04-18 18:43         ` Samudrala, Sridhar
2018-04-18 18:43         ` Samudrala, Sridhar
2018-04-18 18:43           ` [virtio-dev] " Samudrala, Sridhar
2018-04-18 19:13           ` Jiri Pirko
2018-04-18 19:13           ` Jiri Pirko
2018-04-18 19:46             ` Michael S. Tsirkin
2018-04-18 19:46               ` [virtio-dev] " Michael S. Tsirkin
2018-04-18 20:32               ` Jiri Pirko
2018-04-18 22:46                 ` Samudrala, Sridhar
2018-04-18 22:46                   ` [virtio-dev] " Samudrala, Sridhar
2018-04-19  6:35                   ` Jiri Pirko
2018-04-19  6:35                   ` Jiri Pirko
2018-04-18 22:46                 ` Samudrala, Sridhar
2018-04-19  4:08                 ` Michael S. Tsirkin
2018-04-19  4:08                   ` [virtio-dev] " Michael S. Tsirkin
2018-04-19  7:22                   ` Jiri Pirko
2018-04-19  7:22                   ` Jiri Pirko
2018-04-19  4:08                 ` Michael S. Tsirkin
2018-04-18 20:32               ` Jiri Pirko
2018-04-11 15:51   ` Jiri Pirko
2018-04-10 18:59 ` Sridhar Samudrala [this message]
2018-04-10 18:59   ` [virtio-dev] [RFC PATCH net-next v6 3/4] virtio_net: Extend virtio to use VF datapath when available Sridhar Samudrala
2018-04-10 18:59 ` [RFC PATCH net-next v6 4/4] netvsc: refactor notifier/event handling code to use the bypass framework Sridhar Samudrala
2018-04-10 18:59   ` [virtio-dev] " Sridhar Samudrala
2018-04-10 21:26   ` Stephen Hemminger
2018-04-10 22:56     ` Samudrala, Sridhar
2018-04-10 22:56       ` [virtio-dev] " Samudrala, Sridhar
2018-04-10 23:28     ` Michael S. Tsirkin
2018-04-10 23:28     ` Michael S. Tsirkin
2018-04-10 23:28       ` [virtio-dev] " Michael S. Tsirkin
2018-04-10 23:44       ` Siwei Liu
2018-04-10 23:44         ` [virtio-dev] " Siwei Liu
2018-04-10 23:59         ` Stephen Hemminger
2018-04-10 23:44       ` Siwei Liu
2018-04-11  7:50       ` Jiri Pirko
2018-04-11  7:50       ` Jiri Pirko
2018-04-11  1:21     ` Michael S. Tsirkin
2018-04-11  1:21     ` Michael S. Tsirkin
2018-04-11  7:53     ` Jiri Pirko
2018-04-11  7:53     ` Jiri Pirko
2019-02-22  1:14       ` net_failover slave udev renaming (was Re: [RFC PATCH net-next v6 4/4] netvsc: refactor notifier/event handling code to use the bypass framework) Siwei Liu
2019-02-22  1:14         ` [virtio-dev] " Siwei Liu
2019-02-22  1:39         ` Michael S. Tsirkin
2019-02-22  1:39           ` [virtio-dev] " Michael S. Tsirkin
2019-02-22  3:33           ` si-wei liu
2019-02-22  3:33             ` si-wei liu
2019-02-22  7:00             ` Samudrala, Sridhar
2019-02-22  7:55               ` si-wei liu
2019-02-22  7:55                 ` si-wei liu
2019-02-22 12:58                 ` Rob Miller
2019-02-22 12:58                   ` Rob Miller
2019-02-22 15:14                 ` Michael S. Tsirkin
2019-02-22 15:14                   ` Michael S. Tsirkin
2019-02-26  0:58                   ` si-wei liu
2019-02-26  0:58                     ` si-wei liu
2019-02-26  1:39                     ` Stephen Hemminger
2019-02-26  1:39                     ` Stephen Hemminger
2019-02-26  2:05                       ` Michael S. Tsirkin
2019-02-26  2:05                       ` Michael S. Tsirkin
2019-02-26  2:05                         ` Michael S. Tsirkin
2019-02-27  0:49                         ` si-wei liu
2019-02-27  0:49                           ` si-wei liu
2019-02-26  2:08                     ` Michael S. Tsirkin
2019-02-26  2:08                     ` Michael S. Tsirkin
2019-02-26  2:08                       ` Michael S. Tsirkin
2019-02-27  0:17                       ` si-wei liu
2019-02-27  0:17                         ` si-wei liu
2019-02-27 21:57                         ` Stephen Hemminger
2019-02-27 21:57                         ` Stephen Hemminger
2019-02-27 22:30                           ` si-wei liu
2019-02-27 22:30                             ` si-wei liu
2019-02-27 22:38                         ` Michael S. Tsirkin
2019-02-27 22:38                         ` Michael S. Tsirkin
2019-02-27 22:38                           ` Michael S. Tsirkin
2019-02-27 23:34                           ` si-wei liu
2019-02-27 23:34                             ` si-wei liu
2019-02-27 23:50                             ` Michael S. Tsirkin
2019-02-27 23:50                             ` Michael S. Tsirkin
2019-02-27 23:50                               ` Michael S. Tsirkin
2019-02-28  0:00                               ` Liran Alon
2019-02-28  0:00                               ` Liran Alon
2019-02-28  0:03                               ` Stephen Hemminger
2019-02-28  0:38                                 ` Michael S. Tsirkin
2019-02-28  0:38                                 ` Michael S. Tsirkin
2019-02-28  0:38                                   ` Michael S. Tsirkin
2019-02-28  0:03                               ` Stephen Hemminger
2019-02-28  0:38                               ` si-wei liu
2019-02-28  0:38                                 ` si-wei liu
2019-02-28  0:41                                 ` Michael S. Tsirkin
2019-02-28  0:41                                 ` Michael S. Tsirkin
2019-02-28  0:41                                   ` Michael S. Tsirkin
2019-02-28  0:52                                   ` Jakub Kicinski
2019-02-28  0:52                                   ` Jakub Kicinski
2019-02-28  1:26                                     ` Michael S. Tsirkin
2019-02-28  1:26                                       ` Michael S. Tsirkin
2019-02-28  1:52                                       ` Jakub Kicinski
2019-02-28  1:52                                       ` Jakub Kicinski
2019-02-28  4:47                                         ` Michael S. Tsirkin
2019-02-28  4:47                                         ` Michael S. Tsirkin
2019-02-28  4:47                                           ` Michael S. Tsirkin
2019-02-28 18:13                                           ` Jakub Kicinski
2019-02-28 19:36                                             ` Michael S. Tsirkin
2019-02-28 19:36                                             ` Michael S. Tsirkin
2019-02-28 19:36                                               ` Michael S. Tsirkin
2019-02-28 19:56                                               ` Jakub Kicinski
2019-02-28 19:56                                               ` Jakub Kicinski
2019-02-28 20:14                                                 ` Michael S. Tsirkin
2019-02-28 20:14                                                 ` Michael S. Tsirkin
2019-02-28 23:31                                                   ` Jakub Kicinski
2019-02-28 23:31                                                   ` Jakub Kicinski
2019-03-01  0:20                                                 ` Siwei Liu
2019-03-01  0:20                                                 ` Siwei Liu
2019-03-01  1:05                                                   ` Jakub Kicinski
2019-03-02  0:30                                                     ` Siwei Liu
2019-03-02  0:30                                                     ` Siwei Liu
2019-03-01  1:05                                                   ` Jakub Kicinski
2019-02-28 18:13                                           ` Jakub Kicinski
2019-02-28  1:26                                     ` Michael S. Tsirkin
2019-02-28  9:32                                   ` si-wei liu
2019-02-28  9:32                                     ` si-wei liu
2019-02-28 14:26                                     ` Michael S. Tsirkin
2019-02-28 14:26                                       ` Michael S. Tsirkin
2019-03-01  1:30                                       ` si-wei liu
2019-03-01  1:30                                         ` si-wei liu
2019-03-01 13:27                                         ` Michael S. Tsirkin
2019-03-01 13:27                                         ` Michael S. Tsirkin
2019-03-01 13:27                                           ` Michael S. Tsirkin
2019-03-01 20:55                                           ` si-wei liu
2019-03-01 20:55                                             ` si-wei liu
2019-02-28 14:26                                     ` Michael S. Tsirkin
2019-02-22 15:14                 ` Michael S. Tsirkin
2019-02-22  7:00             ` Samudrala, Sridhar
2019-02-22  1:39         ` Michael S. Tsirkin
2019-02-22  1:14       ` Siwei Liu
2018-04-10 21:26   ` [RFC PATCH net-next v6 4/4] netvsc: refactor notifier/event handling code to use the bypass framework Stephen Hemminger

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1523386790-12396-4-git-send-email-sridhar.samudrala@intel.com \
    --to=sridhar.samudrala@intel.com \
    --cc=alexander.h.duyck@intel.com \
    --cc=davem@davemloft.net \
    --cc=jasowang@redhat.com \
    --cc=jesse.brandeburg@intel.com \
    --cc=jiri@resnulli.us \
    --cc=kubakici@wp.pl \
    --cc=loseweigh@gmail.com \
    --cc=mst@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=stephen@networkplumber.org \
    --cc=virtio-dev@lists.oasis-open.org \
    --cc=virtualization@lists.linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.