From: Sridhar Samudrala <sridhar.samudrala@intel.com>
To: mst@redhat.com, stephen@networkplumber.org, davem@davemloft.net,
	netdev@vger.kernel.org, virtualization@lists.linux-foundation.org,
	virtio-dev@lists.oasis-open.org, jesse.brandeburg@intel.com,
	alexander.h.duyck@intel.com, kubakici@wp.pl,
	sridhar.samudrala@intel.com, jasowang@redhat.com, loseweigh@gmail.com
Subject: [RFC PATCH v3 0/3] Enable virtio_net to act as a backup for a passthru device
Date: Fri, 16 Feb 2018 10:11:19 -0800
Message-ID: <1518804682-16881-1-git-send-email-sridhar.samudrala@intel.com>

Patch 1 introduces a new feature bit, VIRTIO_NET_F_BACKUP, that the
hypervisor can use to indicate that a virtio_net interface should act
as a backup for another device with the same MAC address.

Patch 2 is in response to the community request for a 3 netdev
solution. However, it creates some issues we'll get into in a moment.
It extends virtio_net to use an alternate datapath when one is available
and registered. When the BACKUP feature is enabled, the virtio_net driver
creates an additional 'bypass' netdev that acts as a master device and
controls 2 slave devices. The original virtio_net netdev is registered
as the 'backup' netdev, and a passthru/vf device with the same MAC gets
registered as the 'active' netdev. Both the 'bypass' and 'backup' netdevs
are associated with the same 'pci' device. The user accesses the network
interface via the 'bypass' netdev. The 'bypass' netdev chooses the
'active' netdev as the default for transmits when it is available with
link up and running.

We noticed a couple of issues with this approach during testing.
- As both 'bypass' and 'backup' netdevs are associated with the same
  virtio pci device, udev tries to rename both of them with the same name
  and the 2nd rename will fail. This would be OK as long as the first
  netdev to be renamed is the 'bypass' netdev, but the order in which
  udev gets to rename the 2 netdevs is not reliable.
- When the 'active' netdev is unplugged OR not present on a destination
  system after live migration, the user will see 2 virtio_net netdevs.

Patch 3 refactors many of the changes made in patch 2. Keeping the two
separate was done on purpose, to show the solution we recommend within
one patch set. If we submit a final version of this, we would combine
patches 2 and 3. This patch removes the creation of an additional
netdev. Instead, it uses a new virtnet_bypass_info struct added to the
original 'backup' netdev to track the 'bypass' information, and
introduces an additional set of ndo and ethtool ops that are used when
the BACKUP feature is enabled.

One difference between the 3 netdev model and the 2 netdev model is that
in the 3 netdev model the 'bypass' netdev is created with a 'noqueue'
qdisc and marked as 'NETIF_F_LLTX'. This avoids going through an
additional qdisc and acquiring an additional qdisc and tx lock during
transmits. If we can replace the qdisc of the virtio netdev dynamically,
it should be possible to get these optimizations enabled even with the
2 netdev model when the BACKUP feature is enabled.

As this patch series is initially focusing on usecases where the
hypervisor fully controls the VM networking and the guest is not expected
to directly configure any hardware settings, it doesn't expose all the
ndo/ethtool ops that are supported by virtio_net at this time. To support
additional usecases, it should be possible to enable additional ops later
by caching the state in the virtio netdev and replaying it when the
'active' netdev gets registered.

The hypervisor needs to enable only one datapath at any time so that
packets don't get looped back to the VM over the other datapath. When a
VF is plugged, the virtio datapath link state can be marked as down.
At the time of live migration, the hypervisor needs to unplug the VF
device from the guest on the source host and reset the MAC filter of the
VF to initiate failover of the datapath to virtio before starting the
migration. After the migration is completed, the destination hypervisor
sets the MAC filter on the VF and plugs it back into the guest to switch
over to the VF datapath.

This patch is based on the discussion initiated by Jesse on this thread.
https://marc.info/?l=linux-virtualization&m=151189725224231&w=2

Sridhar Samudrala (3):
  virtio_net: Introduce VIRTIO_NET_F_BACKUP feature bit
  virtio_net: Extend virtio to use VF datapath when available
  virtio_net: Enable alternate datapath without creating an additional
    netdev

 drivers/net/virtio_net.c        | 564 +++++++++++++++++++++++++++++++++++++++-
 include/uapi/linux/virtio_net.h |   3 +
 2 files changed, 563 insertions(+), 4 deletions(-)

-- 
2.14.3
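
The transmit selection described for the 'bypass' netdev (prefer the
'active' netdev when it is available with link up and running) can be
illustrated with a small user-space model. This is only a sketch and not
code from the patches: the struct and function names (model_netdev,
model_bypass, model_pick_tx) are hypothetical, and falling back to the
'backup' netdev when the 'active' one is unusable is an assumption drawn
from the failover description in the cover letter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model of a slave netdev; NOT a kernel struct. */
struct model_netdev {
	const char *name;
	bool present;	/* registered with the master */
	bool link_up;	/* carrier is on */
	bool running;	/* administratively up */
};

/* Hypothetical model of the 'bypass' master and its two slaves. */
struct model_bypass {
	struct model_netdev *active;	/* passthru/VF, may be NULL */
	struct model_netdev *backup;	/* original virtio_net netdev */
};

/* A slave can carry traffic only if present, link up and running. */
static bool model_usable(const struct model_netdev *dev)
{
	return dev && dev->present && dev->link_up && dev->running;
}

/*
 * Pick the slave to transmit on: 'active' first, then 'backup'
 * (assumed fallback). Returns NULL when neither is usable, in which
 * case a real driver would have to drop the packet.
 */
static struct model_netdev *model_pick_tx(const struct model_bypass *b)
{
	assert(b != NULL);

	if (model_usable(b->active))
		return b->active;
	if (model_usable(b->backup))
		return b->backup;
	return NULL;
}
```

In this model, unplugging the VF before live migration corresponds to
clearing `present` on the 'active' entry (or setting the pointer to
NULL), after which model_pick_tx() returns the 'backup' netdev.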