From: Siwei Liu <loseweigh@gmail.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Sridhar Samudrala <sridhar.samudrala@intel.com>,
    Stephen Hemminger <stephen@networkplumber.org>,
    David Miller <davem@davemloft.net>,
    Netdev <netdev@vger.kernel.org>,
    Jiri Pirko <jiri@resnulli.us>,
    virtio-dev@lists.oasis-open.org,
    "Brandeburg, Jesse" <jesse.brandeburg@intel.com>,
    Alexander Duyck <alexander.h.duyck@intel.com>,
    Jakub Kicinski <kubakici@wp.pl>
Subject: Re: [PATCH v4 2/2] virtio_net: Extend virtio to use VF datapath when available
Date: Fri, 2 Mar 2018 15:56:31 -0800
Message-ID: <CADGSJ22VUgJzi6B=Bh4M6Bado1CQEEJvRR1VJ=oC47G2SJ0DEA@mail.gmail.com>
In-Reply-To: <20180302233443-mutt-send-email-mst@kernel.org>

On Fri, Mar 2, 2018 at 1:36 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> On Fri, Mar 02, 2018 at 01:11:56PM -0800, Siwei Liu wrote:
>> On Thu, Mar 1, 2018 at 12:08 PM, Sridhar Samudrala
>> <sridhar.samudrala@intel.com> wrote:
>> > This patch enables virtio_net to switch over to a VF datapath when a VF
>> > netdev is present with the same MAC address. It allows live migration
>> > of a VM with a direct attached VF without the need to setup a bond/team
>> > between a VF and virtio net device in the guest.
>> >
>> > The hypervisor needs to enable only one datapath at any time so that
>> > packets don't get looped back to the VM over the other datapath. When a VF
>> > is plugged, the virtio datapath link state can be marked as down. The
>> > hypervisor needs to unplug the VF device from the guest on the source host
>> > and reset the MAC filter of the VF to initiate failover of datapath to
>> > virtio before starting the migration. After the migration is completed,
>> > the destination hypervisor sets the MAC filter on the VF and plugs it back
>> > to the guest to switch over to VF datapath.
>> >
>> > When BACKUP feature is enabled, an additional netdev(bypass netdev) is
>> > created that acts as a master device and tracks the state of the 2 lower
>> > netdevs. The original virtio_net netdev is marked as 'backup' netdev and a
>> > passthru device with the same MAC is registered as 'active' netdev.
>> >
>> > This patch is based on the discussion initiated by Jesse on this thread.
>> > https://marc.info/?l=linux-virtualization&m=151189725224231&w=2
>> >
>> > Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
>> > Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
>> > Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
>> > ---
>> >  drivers/net/virtio_net.c | 683 ++++++++++++++++++++++++++++++++++++++++++++++-
>> >  1 file changed, 682 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>> > index bcd13fe906ca..f2860d86c952 100644
>> > --- a/drivers/net/virtio_net.c
>> > +++ b/drivers/net/virtio_net.c
>> > @@ -30,6 +30,8 @@
>> >  #include <linux/cpu.h>
>> >  #include <linux/average.h>
>> >  #include <linux/filter.h>
>> > +#include <linux/netdevice.h>
>> > +#include <linux/pci.h>
>> >  #include <net/route.h>
>> >  #include <net/xdp.h>
>> >
>> > @@ -206,6 +208,9 @@ struct virtnet_info {
>> >         u32 speed;
>> >
>> >         unsigned long guest_offloads;
>> > +
>> > +       /* upper netdev created when BACKUP feature enabled */
>> > +       struct net_device *bypass_netdev;
>> >  };
>> >
>> >  struct padded_vnet_hdr {
>> > @@ -2236,6 +2241,22 @@ static int virtnet_xdp(struct net_device *dev, struct netdev_bpf *xdp)
>> >         }
>> >  }
>> >
>> > +static int virtnet_get_phys_port_name(struct net_device *dev, char *buf,
>> > +                                     size_t len)
>> > +{
>> > +       struct virtnet_info *vi = netdev_priv(dev);
>> > +       int ret;
>> > +
>> > +       if (!virtio_has_feature(vi->vdev, VIRTIO_NET_F_BACKUP))
>> > +               return -EOPNOTSUPP;
>> > +
>> > +       ret = snprintf(buf, len, "_bkup");
>> > +       if (ret >= len)
>> > +               return -EOPNOTSUPP;
>> > +
>> > +       return 0;
>> > +}
>> > +
>>
>> What if the systemd/udevd is not new enough to enforce the
>> n<phys_port_name> naming? Would virtio_bypass get a different name
>> than the original virtio_net?
>
> You mean people using ethX names? Any hardware config change breaks
> these, I don't think that can be helped.

I don't like relying on .ndo_get_phys_port_name - it's fragile and it
does not completely solve the problem it tries to address. Imagine what
we can end up with given an old udevd, or users who already have
existing explicit udev rules around phys_port_name. It gives you
neither an ack saying "yes, I know you're the bypass and you're the
backup, please continue and I will give you both correct names", nor a
negative acknowledgment saying "no, I don't know what these extra
interfaces are, please go back and leave the VF device alone". We need
a new udev API for both feature negotiation and naming, or we may even
hide the lower interfaces completely.

>
>> Should we detect this earlier and fall
>> back to legacy mode without creating the bypass netdev and enslaving
>> the VF?
>
> I don't think we can do this with existing kernel/userspace APIs.

That's why I said earlier that we should make udev aware of this new
type of combined device instead of scattering hacks here and there.

Regards,
-Siwei

>
> --
> MST
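
For reference, a minimal userspace sketch of the return convention used by the quoted virtnet_get_phys_port_name() helper. Only the "_bkup" suffix and the -EOPNOTSUPP returns come from the patch; the function name backup_port_name() and the has_backup_feature flag are hypothetical stand-ins for the kernel-side feature check. The sketch only shows that the helper reports a fixed suffix or an error, which is why the naming outcome depends entirely on whether userspace (udev) chooses to consume that suffix.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical userspace stand-in for the quoted kernel helper.
 * has_backup_feature replaces the virtio_has_feature() check, which is
 * not available outside the kernel.
 */
static int backup_port_name(int has_backup_feature, char *buf, size_t len)
{
	int ret;

	assert(buf != NULL);

	/* Without the BACKUP feature no port name is reported at all. */
	if (!has_backup_feature)
		return -EOPNOTSUPP;

	/*
	 * snprintf() returns the length the full string needs, excluding
	 * the terminating NUL, so ret >= len means the name was truncated.
	 */
	ret = snprintf(buf, len, "_bkup");
	if (ret < 0 || (size_t)ret >= len)
		return -EOPNOTSUPP;

	return 0;
}
```

With a 16-byte buffer the call succeeds and leaves "_bkup" in the buffer; with a 5-byte buffer there is no room for the terminating NUL, so the call fails with -EOPNOTSUPP, the same value a caller sees when the feature is absent.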