From: Alexander Duyck <alexander.duyck@gmail.com>
To: Jiri Pirko <jiri@resnulli.us>
Cc: Sridhar Samudrala <sridhar.samudrala@intel.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Stephen Hemminger <stephen@networkplumber.org>,
	David Miller <davem@davemloft.net>,
	Netdev <netdev@vger.kernel.org>,
	virtualization@lists.linux-foundation.org,
	virtio-dev@lists.oasis-open.org, "Brandeburg,
	Jesse" <jesse.brandeburg@intel.com>,
	"Duyck, Alexander H" <alexander.h.duyck@intel.com>,
	Jakub Kicinski <kubakici@wp.pl>, Jason Wang <jasowang@redhat.com>,
	Siwei Liu <loseweigh@gmail.com>
Subject: Re: [RFC PATCH v3 0/3] Enable virtio_net to act as a backup for a passthru device
Date: Tue, 20 Feb 2018 08:04:29 -0800
Message-ID: <CAKgT0UdU8PDXduzxp4kKfur-DLeFQSJj7-fhW_eTgVzd+AcViw@mail.gmail.com>
In-Reply-To: <20180220104224.GA2031@nanopsycho>

On Tue, Feb 20, 2018 at 2:42 AM, Jiri Pirko <jiri@resnulli.us> wrote:
> Fri, Feb 16, 2018 at 07:11:19PM CET, sridhar.samudrala@intel.com wrote:
>>Patch 1 introduces a new feature bit VIRTIO_NET_F_BACKUP that can be
>>used by hypervisor to indicate that virtio_net interface should act as
>>a backup for another device with the same MAC address.
>>
>>Patch 2 is in response to the community request for a 3 netdev
>>solution.  However, it creates some issues we'll get into in a moment.
>>It extends virtio_net to use alternate datapath when available and
>>registered. When BACKUP feature is enabled, virtio_net driver creates
>>an additional 'bypass' netdev that acts as a master device and controls
>>2 slave devices.  The original virtio_net netdev is registered as
>>'backup' netdev and a passthru/vf device with the same MAC gets
>>registered as 'active' netdev. Both 'bypass' and 'backup' netdevs are
>>associated with the same 'pci' device.  The user accesses the network
>>interface via 'bypass' netdev. The 'bypass' netdev chooses 'active' netdev
>>as default for transmits when it is available with link up and running.
>
> Sorry, but this is ridiculous. You are apparently re-implementing part
> of bonding driver as a part of NIC driver. Bond and team drivers
> are mature solutions, well tested, broadly used, with lots of issues
> resolved in the past. What you try to introduce is a weird shortcut
> that already has couple of issues as you mentioned and will certainly
> have many more. Also, I'm pretty sure that in future, someone comes up
> with ideas like multiple VFs, LACP and similar bonding things.

The problem with the bond and team drivers is that they are too large
and expose too many configuration interfaces, so it is easy for the
guest to misconfigure this interface.

Essentially this is meant to be a bond that is more-or-less managed by
the host, not the guest. We want the host to be able to configure it
and have it automatically kick in on the guest. For now we want to
avoid adding too much complexity as this is meant to be just the first
step. Trying to go in and implement the whole solution right from the
start based on existing drivers is going to be a massive time sink and
will likely never get completed due to the fact that there is always
going to be some other thing that will interfere.

My personal hope is that we can look at doing a virtio-bond sort of
device that will handle all this as well as providing a communication
channel, but that is much further down the road. For now we only have
a single bit, so the goal is to keep this as simple as possible.
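
To illustrate what "a single bit" buys us, here is a rough userspace
sketch, not code from the patch set. The host either offers the backup
feature bit or it does not, and the guest picks its mode from that
alone. SKETCH_F_BACKUP_BIT, its position, and the helper names are
placeholders assumed for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit position, assumed for illustration; not taken from the patch. */
#define SKETCH_F_BACKUP_BIT 62

enum datapath_mode {
	MODE_PLAIN_VIRTIO,	/* ordinary virtio_net, no VF pairing */
	MODE_BACKUP_FOR_VF,	/* virtio_net acts as backup for a same-MAC VF */
};

/* True if the host offered the given bit in its 64-bit feature word. */
static bool feature_offered(uint64_t host_features, unsigned int bit)
{
	return (host_features >> bit) & 1ULL;
}

/* The guest has no other knob: the host's bit alone selects the mode. */
static enum datapath_mode pick_mode(uint64_t host_features)
{
	return feature_offered(host_features, SKETCH_F_BACKUP_BIT) ?
		MODE_BACKUP_FOR_VF : MODE_PLAIN_VIRTIO;
}
```

There is nothing here for the guest to configure, which is the point:
anything richer would need the communication channel mentioned above.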

> What is the reason for this abomination? According to:
> https://marc.info/?l=linux-virtualization&m=151189725224231&w=2
> The reason is quite weak.
> User in the vm sees 2 (or more) netdevices, he puts them in bond/team
> and that's it. This works now! If the vm lacks some userspace features,
> let's fix it there! For example the MAC changes is something that could
> be easily handled in teamd userspace daemon.

I think you might have missed the point of this. This is meant to be a
simple interface so the guest should not be able to change the MAC
address, and it shouldn't require any userspace daemon to set up or
tear down. Ideally with this solution the virtio bypass will come up
and be assigned the name of the original virtio, and the "backup"
interface will come up and be assigned the name of the original virtio
with an additional "nbackup" tacked on via the phys_port_name, and
then whenever a VF is added it will automatically be enslaved by the
bypass interface, and it will be removed when the VF is hotplugged
out.
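
As a rough model of that behavior, the following userspace sketch
shows the naming, the automatic enslave on hotplug, and the transmit
choice described in the cover letter. It is not the driver code; every
struct and function name here is hypothetical, and matching the VF by
MAC address is taken from the cover letter's description.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical model of a lower device (virtio backup or VF). */
struct slave {
	char name[32];
	unsigned char mac[6];
	bool link_up;
	bool running;
};

/* Hypothetical model of the 'bypass' master netdev. */
struct bypass {
	char name[32];		/* takes the name of the original virtio */
	unsigned char mac[6];
	struct slave backup;	/* virtio datapath, always present */
	struct slave *active;	/* VF datapath, NULL while unplugged */
};

static void bypass_init(struct bypass *b, const char *virtio_name,
			const unsigned char mac[6])
{
	memset(b, 0, sizeof(*b));
	snprintf(b->name, sizeof(b->name), "%s", virtio_name);
	memcpy(b->mac, mac, 6);
	/* Backup gets the original name with "nbackup" tacked on. */
	snprintf(b->backup.name, sizeof(b->backup.name), "%snbackup",
		 virtio_name);
	memcpy(b->backup.mac, mac, 6);
	b->backup.link_up = true;
	b->backup.running = true;
}

/* VF hotplug-in: enslave only a device carrying the same MAC. */
static bool bypass_vf_add(struct bypass *b, struct slave *vf)
{
	if (b->active || memcmp(b->mac, vf->mac, 6) != 0)
		return false;
	b->active = vf;
	return true;
}

/* VF hotplug-out: drop it, traffic falls back to the backup. */
static void bypass_vf_remove(struct bypass *b, struct slave *vf)
{
	if (b->active == vf)
		b->active = NULL;
}

/* Prefer the VF for transmit when it is present, link up and running. */
static const struct slave *bypass_xmit_dev(const struct bypass *b)
{
	if (b->active && b->active->link_up && b->active->running)
		return b->active;
	return &b->backup;
}
```

Note that nothing in this model is driven from guest userspace; the
only events are the VF appearing and disappearing.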

In my mind the difference between this and bond or team is where the
configuration interface lies. In the case of bond it is in the kernel.
If my understanding is correct team is mostly in user space. With this
the configuration interface is really down in the hypervisor and
requests are communicated up to the guest. I would prefer not to make
virtio_net dependent on the bonding or team drivers, or worse yet a
userspace daemon in the guest. For now I would argue we should keep
this as simple as possible just to support basic live migration. There
have already been discussions of refactoring this after it is in so
that we can start to combine the functionality here with what is there
in bonding/team, but the differences in configuration interface and
the size of the code bases will make it challenging to outright merge
this into something like that.
