netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC Patch 00/12] IXGBE: Add live migration support for SRIOV NIC
@ 2015-10-21 16:37 Lan Tianyu
  2015-10-21 16:37 ` [RFC Patch 01/12] PCI: Add virtfn_index for struct pci_device Lan Tianyu
                   ` (14 more replies)
  0 siblings, 15 replies; 56+ messages in thread
From: Lan Tianyu @ 2015-10-21 16:37 UTC (permalink / raw)
  To: bhelgaas, carolyn.wyborny, donald.c.skidmore, eddie.dong,
	nrupal.jani, yang.z.zhang, agraf, kvm, pbonzini, qemu-devel,
	emil.s.tantilov, intel-wired-lan, jeffrey.t.kirsher,
	jesse.brandeburg, john.ronciak, linux-kernel, linux-pci,
	matthew.vick, mitch.a.williams, netdev, shannon.nelson
  Cc: Lan Tianyu

This patchset is to propose a new solution to add live migration support for 82599
SRIOV network card.

Im our solution, we prefer to put all device specific operation into VF and
PF driver and make code in the Qemu more general.


VF status migration
=================================================================
VF status can be divided into 4 parts
1) PCI configure regs
2) MSIX configure
3) VF status in the PF driver
4) VF MMIO regs 

The first three status are all handled by Qemu. 
The PCI configure space regs and MSIX configure are originally
stored in Qemu. To save and restore "VF status in the PF driver"
by Qemu during migration, adds new sysfs node "state_in_pf" under
VF sysfs directory.

For VF MMIO regs, we introduce self emulation layer in the VF
driver to record MMIO reg values during reading or writing MMIO
and put these data in the guest memory. It will be migrated with
guest memory to new machine.


VF function restoration
================================================================
Restoring VF function operation are done in the VF and PF driver.
 
In order to let VF driver to know migration status, Qemu fakes VF
PCI configure regs to indicate migration status and add new sysfs
node "notify_vf" to trigger VF mailbox irq in order to notify VF 
about migration status change.

Transmit/Receive descriptor head regs are read-only and can't
be restored via writing back recording reg value directly and they
are set to 0 during VF reset. To reuse original tx/rx rings, shift
desc ring in order to move the desc pointed by original head reg to
first entry of the ring and then enable tx/rx rings. VF restarts to
receive and transmit from original head desc.


Tracking DMA accessed memory
=================================================================
Migration relies on tracking dirty page to migrate memory.
Hardware can't automatically mark a page as dirty after DMA
memory access. VF descriptor rings and data buffers are modified
by hardware when receive and transmit data. To track such dirty memory
manually, do dummy writes(read a byte and write it back) when receive
and transmit data.


Service down time test
=================================================================
So far, we tested migration between two laptops with 82599 nic which
are connected to a gigabit switch. Ping VF in the 0.001s interval
during migration on the host of source side. It service down
time is about 180ms.

[983769928.053604] 64 bytes from 10.239.48.100: icmp_seq=4131 ttl=64 time=2.79 ms
[983769928.056422] 64 bytes from 10.239.48.100: icmp_seq=4132 ttl=64 time=2.79 ms
[983769928.059241] 64 bytes from 10.239.48.100: icmp_seq=4133 ttl=64 time=2.79 ms
[983769928.062071] 64 bytes from 10.239.48.100: icmp_seq=4134 ttl=64 time=2.80 ms
[983769928.064890] 64 bytes from 10.239.48.100: icmp_seq=4135 ttl=64 time=2.79 ms
[983769928.067716] 64 bytes from 10.239.48.100: icmp_seq=4136 ttl=64 time=2.79 ms
[983769928.070538] 64 bytes from 10.239.48.100: icmp_seq=4137 ttl=64 time=2.79 ms
[983769928.073360] 64 bytes from 10.239.48.100: icmp_seq=4138 ttl=64 time=2.79 ms
[983769928.083444] no answer yet for icmp_seq=4139
[983769928.093524] no answer yet for icmp_seq=4140
[983769928.103602] no answer yet for icmp_seq=4141
[983769928.113684] no answer yet for icmp_seq=4142
[983769928.123763] no answer yet for icmp_seq=4143
[983769928.133854] no answer yet for icmp_seq=4144
[983769928.143931] no answer yet for icmp_seq=4145
[983769928.154008] no answer yet for icmp_seq=4146
[983769928.164084] no answer yet for icmp_seq=4147
[983769928.174160] no answer yet for icmp_seq=4148
[983769928.184236] no answer yet for icmp_seq=4149
[983769928.194313] no answer yet for icmp_seq=4150
[983769928.204390] no answer yet for icmp_seq=4151
[983769928.214468] no answer yet for icmp_seq=4152
[983769928.224556] no answer yet for icmp_seq=4153
[983769928.234632] no answer yet for icmp_seq=4154
[983769928.244709] no answer yet for icmp_seq=4155
[983769928.254783] no answer yet for icmp_seq=4156
[983769928.256094] 64 bytes from 10.239.48.100: icmp_seq=4139 ttl=64 time=182 ms
[983769928.256107] 64 bytes from 10.239.48.100: icmp_seq=4140 ttl=64 time=172 ms
[983769928.256114] no answer yet for icmp_seq=4157
[983769928.256236] 64 bytes from 10.239.48.100: icmp_seq=4141 ttl=64 time=162 ms
[983769928.256245] 64 bytes from 10.239.48.100: icmp_seq=4142 ttl=64 time=152 ms
[983769928.256272] 64 bytes from 10.239.48.100: icmp_seq=4143 ttl=64 time=142 ms
[983769928.256310] 64 bytes from 10.239.48.100: icmp_seq=4144 ttl=64 time=132 ms
[983769928.256325] 64 bytes from 10.239.48.100: icmp_seq=4145 ttl=64 time=122 ms
[983769928.256332] 64 bytes from 10.239.48.100: icmp_seq=4146 ttl=64 time=112 ms
[983769928.256440] 64 bytes from 10.239.48.100: icmp_seq=4147 ttl=64 time=102 ms
[983769928.256455] 64 bytes from 10.239.48.100: icmp_seq=4148 ttl=64 time=92.3 ms
[983769928.256494] 64 bytes from 10.239.48.100: icmp_seq=4149 ttl=64 time=82.3 ms
[983769928.256503] 64 bytes from 10.239.48.100: icmp_seq=4150 ttl=64 time=72.2 ms
[983769928.256631] 64 bytes from 10.239.48.100: icmp_seq=4158 ttl=64 time=0.500 ms
[983769928.257284] 64 bytes from 10.239.48.100: icmp_seq=4159 ttl=64 time=0.154 ms
[983769928.258297] 64 bytes from 10.239.48.100: icmp_seq=4160 ttl=64 time=0.165 ms

Todo
=======================================================
So far, the patchset isn't perfect. VF net interface can't be open, closed, down 
and up during migration. Will prevent such operation during migration in the
future job.

Very appreciate for your comments.


Lan Tianyu (12):
  PCI: Add virtfn_index for struct pci_device
  IXGBE: Add new mail box event to restore VF status in the PF driver
  IXGBE: Add sysfs interface for Qemu to migrate VF status in the PF
    driver
  IXGBE: Add ixgbe_ping_vf() to notify a specified VF via mailbox msg.
  IXGBE: Add new sysfs interface of "notify_vf"
  IXGBEVF: Add self emulation layer
  IXGBEVF: Add new mail box event for migration
  IXGBEVF: Rework code of finding the end transmit desc of package
  IXGBEVF: Add live migration support for VF driver
  IXGBEVF: Add lock to protect tx/rx ring operation
  IXGBEVF: Migrate VF statistic data
  IXGBEVF: Track dma dirty pages

 drivers/net/ethernet/intel/ixgbe/ixgbe.h           |   1 +
 drivers/net/ethernet/intel/ixgbe/ixgbe_mbx.h       |   1 +
 drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c     | 245 ++++++++++++++++++++-
 drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.h     |   1 +
 drivers/net/ethernet/intel/ixgbe/ixgbe_type.h      |   4 +
 drivers/net/ethernet/intel/ixgbevf/Makefile        |   3 +-
 drivers/net/ethernet/intel/ixgbevf/defines.h       |   6 +
 drivers/net/ethernet/intel/ixgbevf/ixgbevf.h       |  10 +-
 drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c  | 179 ++++++++++++++-
 drivers/net/ethernet/intel/ixgbevf/mbx.h           |   3 +
 .../net/ethernet/intel/ixgbevf/self-emulation.c    | 133 +++++++++++
 drivers/net/ethernet/intel/ixgbevf/vf.c            |  10 +
 drivers/net/ethernet/intel/ixgbevf/vf.h            |   6 +-
 drivers/pci/iov.c                                  |   1 +
 include/linux/pci.h                                |   1 +
 15 files changed, 582 insertions(+), 22 deletions(-)
 create mode 100644 drivers/net/ethernet/intel/ixgbevf/self-emulation.c

-- 
1.8.4.rc0.1.g8f6a3e5.dirty

^ permalink raw reply	[flat|nested] 56+ messages in thread

end of thread, other threads:[~2015-10-30 18:04 UTC | newest]

Thread overview: 56+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-10-21 16:37 [RFC Patch 00/12] IXGBE: Add live migration support for SRIOV NIC Lan Tianyu
2015-10-21 16:37 ` [RFC Patch 01/12] PCI: Add virtfn_index for struct pci_device Lan Tianyu
2015-10-21 18:07   ` Alexander Duyck
2015-10-24 14:46     ` Lan, Tianyu
2015-10-21 16:37 ` [RFC Patch 02/12] IXGBE: Add new mail box event to restore VF status in the PF driver Lan Tianyu
2015-10-21 20:34   ` Alexander Duyck
2015-10-21 16:37 ` [RFC Patch 03/12] IXGBE: Add sysfs interface for Qemu to migrate " Lan Tianyu
2015-10-21 20:45   ` Alexander Duyck
2015-10-25  7:21     ` Lan, Tianyu
2015-10-21 16:37 ` [RFC Patch 04/12] IXGBE: Add ixgbe_ping_vf() to notify a specified VF via mailbox msg Lan Tianyu
2015-10-21 16:37 ` [RFC Patch 05/12] IXGBE: Add new sysfs interface of "notify_vf" Lan Tianyu
2015-10-21 20:52   ` Alexander Duyck
2015-10-22 12:51     ` Michael S. Tsirkin
2015-10-24 15:43     ` Lan, Tianyu
2015-10-25  6:03       ` Alexander Duyck
2015-10-25  6:45         ` Lan, Tianyu
2015-10-21 16:37 ` [RFC Patch 06/12] IXGBEVF: Add self emulation layer Lan Tianyu
2015-10-21 20:58   ` Alexander Duyck
2015-10-22 12:50     ` [Qemu-devel] " Michael S. Tsirkin
2015-10-22 15:50       ` Alexander Duyck
2015-10-21 16:37 ` [RFC Patch 07/12] IXGBEVF: Add new mail box event for migration Lan Tianyu
2015-10-21 16:37 ` [RFC Patch 08/12] IXGBEVF: Rework code of finding the end transmit desc of package Lan Tianyu
2015-10-21 21:14   ` Alexander Duyck
2015-10-24 16:12     ` Lan, Tianyu
2015-10-22 12:58   ` Michael S. Tsirkin
2015-10-24 16:08     ` Lan, Tianyu
2015-10-21 16:37 ` [RFC Patch 09/12] IXGBEVF: Add live migration support for VF driver Lan Tianyu
2015-10-21 21:48   ` Alexander Duyck
2015-10-22 12:46   ` Michael S. Tsirkin
2015-10-21 16:37 ` [RFC Patch 10/12] IXGBEVF: Add lock to protect tx/rx ring operation Lan Tianyu
2015-10-21 21:55   ` Alexander Duyck
2015-10-22 12:40   ` Michael S. Tsirkin
2015-10-21 16:37 ` [RFC Patch 11/12] IXGBEVF: Migrate VF statistic data Lan Tianyu
2015-10-22 12:36   ` Michael S. Tsirkin
2015-10-21 16:37 ` [RFC Patch 12/12] IXGBEVF: Track dma dirty pages Lan Tianyu
2015-10-22 12:30   ` Michael S. Tsirkin
2015-10-21 18:45 ` [RFC Patch 00/12] IXGBE: Add live migration support for SRIOV NIC Or Gerlitz
2015-10-21 19:20   ` Alex Williamson
2015-10-21 23:26     ` Alexander Duyck
2015-10-22 12:32     ` [Qemu-devel] " Michael S. Tsirkin
2015-10-22 13:01       ` Alex Williamson
2015-10-22 13:06         ` Michael S. Tsirkin
2015-10-22 15:58     ` Or Gerlitz
2015-10-22 16:17       ` Alex Williamson
2015-10-22 12:55 ` [Qemu-devel] " Michael S. Tsirkin
2015-10-23 18:36 ` Alexander Duyck
2015-10-23 19:05   ` Alex Williamson
2015-10-23 20:01     ` Alexander Duyck
2015-10-26  5:36   ` Lan Tianyu
2015-10-26 15:03     ` Alexander Duyck
2015-10-29  6:12       ` Lan Tianyu
2015-10-29  6:58         ` Alexander Duyck
2015-10-29  8:33           ` Lan Tianyu
2015-10-29 16:17             ` Alexander Duyck
2015-10-30  2:41               ` Lan Tianyu
2015-10-30 18:04                 ` Alexander Duyck

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).