* [PATCH 00/18] vfio: Add migration pre-copy support and device dirty tracking
@ 2023-01-26 18:49 Avihai Horon
  2023-01-26 18:49 ` [PATCH 01/18] vfio/migration: Add VFIO migration pre-copy support Avihai Horon
                   ` (17 more replies)
  0 siblings, 18 replies; 42+ messages in thread
From: Avihai Horon @ 2023-01-26 18:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Avihai Horon,
	Kirti Wankhede, Tarun Gupta, Joao Martins

Hello,

This series is based on the previous one that added the basic VFIO
migration protocol v2 implementation [1].

The first patch in the series adds pre-copy support for VFIO migration
protocol v2. Pre-copy support allows the VFIO device data to be
transferred while the VM is running. This can improve performance and
reduce migration downtime. A full description of it can be found here [2].

The series then moves on to implement device dirty page tracking.
Device dirty page tracking allows the VFIO device to record its DMAs and
report them back when needed. This is part of VFIO migration and is used
during the pre-copy phase of migration to track the RAM pages that the
device has written to and mark those pages dirty, so they can later be
re-sent to the destination.

Device dirty page tracking uses the DMA logging uAPI to discover device
capabilities, to start and stop tracking, and to retrieve dirty page
bitmap reports. Extra details and the uAPI definition can be found here [3].
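
For reference, below is a minimal sketch of how a single logging range is
started through this uAPI (illustrative only; the function name and the
single-range simplification are mine and not part of this series, and a
kernel providing the DMA logging uAPI is assumed, see [3]):

    #include <stdint.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <linux/vfio.h>

    /*
     * Sketch: start device DMA logging on one IOVA range via the
     * VFIO_DEVICE_FEATURE ioctl. 'device_fd' is an open VFIO device fd.
     */
    static int dma_logging_start_one_range(int device_fd, uint64_t iova,
                                           uint64_t length, uint64_t page_size)
    {
        struct vfio_device_feature_dma_logging_range range = {
            .iova = iova,
            .length = length,
        };
        struct vfio_device_feature_dma_logging_control control = {
            .page_size = page_size,
            .num_ranges = 1,
            .ranges = (uintptr_t)&range,
        };
        /* The control payload follows the vfio_device_feature header. */
        uint64_t buf[(sizeof(struct vfio_device_feature) +
                      sizeof(control) + 7) / 8] = {0};
        struct vfio_device_feature *feature = (struct vfio_device_feature *)buf;

        feature->argsz = sizeof(buf);
        feature->flags = VFIO_DEVICE_FEATURE_SET |
                         VFIO_DEVICE_FEATURE_DMA_LOGGING_START;
        memcpy(feature->data, &control, sizeof(control));

        return ioctl(device_fd, VFIO_DEVICE_FEATURE, feature);
    }

Stopping tracking is the same call with VFIO_DEVICE_FEATURE_DMA_LOGGING_STOP
and no payload, and the dirty bitmap is retrieved with
VFIO_DEVICE_FEATURE_DMA_LOGGING_REPORT.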

Device dirty page tracking operates at VFIOContainer scope: when dirty
tracking is started or stopped, or a dirty page report is queried, all
devices within the VFIOContainer are iterated and the corresponding
operation is performed on each of them.

Device dirty page tracking is used only if all devices within a
VFIOContainer support it. Otherwise, VFIO IOMMU dirty page tracking is
used, and if that is not supported either, memory is perpetually marked
dirty by QEMU. Note that since VFIO IOMMU dirty page tracking has no HW
support, the last two options usually have the same effect of perpetually
marking all pages dirty.
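
In rough terms, the selection described above looks like this (a sketch
only; vfio_devices_all_device_dirty_tracking() and
vfio_set_dirty_page_tracking() are the helpers from this series and the
existing code, while vfio_devices_dma_logging_start() stands in for the
device-tracking start path added later in the series):

    /* Sketch of the dirty tracking selection, not the exact QEMU code. */
    if (vfio_devices_all_device_dirty_tracking(container)) {
        /* All devices support DMA logging: use device dirty tracking. */
        vfio_devices_dma_logging_start(container);
    } else if (container->dirty_pages_supported) {
        /* Fall back to VFIO IOMMU dirty tracking (no HW support today). */
        vfio_set_dirty_page_tracking(container, true);
    } else {
        /* Neither is supported: QEMU reports all memory as perpetually
         * dirty for the duration of the migration. */
    }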

Normally, when asked to start dirty tracking, all the currently DMA
mapped ranges are tracked by device dirty page tracking. However, when a
vIOMMU is enabled, IOVA ranges are DMA mapped/unmapped on the fly as the
vIOMMU maps/unmaps them, and these ranges can potentially be mapped
anywhere in the vIOMMU IOVA space. Due to this dynamic nature of vIOMMU
mapping/unmapping, tracking only the currently DMA mapped IOVA ranges is
not sufficient.

Thus, when a vIOMMU is enabled, we try to track the entire vIOMMU IOVA
space. If that fails (the IOVA space can be rather big and we might hit a
HW limitation), we fall back to tracking a smaller range while marking the
untracked ranges dirty.
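
Conceptually (an illustrative sketch with hypothetical helper names; the
real logic is in the vIOMMU patches of this series and uses the new
IOMMU_ATTR_MAX_IOVA attribute to size the range):

    /* Hypothetical sketch of the vIOMMU strategy described above. */
    max_iova = viommu_get_max_iova(container);     /* IOMMU_ATTR_MAX_IOVA */

    /* First, try to log the whole vIOMMU IOVA space. */
    if (device_dma_logging_start(container, 0, max_iova + 1) == 0) {
        return;
    }

    /* HW could not cover the full space: log a smaller range and treat
     * everything outside it as perpetually dirty. */
    device_dma_logging_start(container, 0, smaller_range(container));
    mark_untracked_ranges_dirty(container);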

Patch breakdown:
- Patch 1 adds VFIO migration pre-copy support.
- Patches 2-8 fix bugs and do some preparatory work required prior to
  adding device dirty page tracking.
- Patches 9-11 implement device dirty page tracking.
- Patches 12-16 add vIOMMU support to device dirty page tracking.
- Patches 17-18 enable device dirty page tracking and document it.

Thanks.

[1]
https://lore.kernel.org/qemu-devel/20230116141135.12021-1-avihaih@nvidia.com/

[2]
https://lore.kernel.org/kvm/20221206083438.37807-3-yishaih@nvidia.com/

[3]
https://lore.kernel.org/netdev/20220908183448.195262-4-yishaih@nvidia.com/

Avihai Horon (12):
  vfio/migration: Add VFIO migration pre-copy support
  vfio/common: Fix error reporting in vfio_get_dirty_bitmap()
  vfio/common: Fix wrong %m usages
  vfio/common: Abort migration if dirty log start/stop/sync fails
  vfio/common: Add VFIOBitmap and (de)alloc functions
  vfio/common: Extract code from vfio_get_dirty_bitmap() to new function
  vfio/common: Extract vIOMMU code from vfio_sync_dirty_bitmap()
  memory/iommu: Add IOMMU_ATTR_MAX_IOVA attribute
  intel-iommu: Implement get_attr() method
  vfio/common: Support device dirty page tracking with vIOMMU
  vfio/common: Optimize device dirty page tracking with vIOMMU
  docs/devel: Document VFIO device dirty page tracking

Joao Martins (6):
  util: Add iova_tree_nnodes()
  util: Extend iova_tree_foreach() to take data argument
  vfio/common: Record DMA mapped IOVA ranges
  vfio/common: Add device dirty page tracking start/stop
  vfio/common: Add device dirty page bitmap sync
  vfio/migration: Query device dirty page tracking support

 docs/devel/vfio-migration.rst |  79 ++--
 include/exec/memory.h         |   3 +-
 include/hw/vfio/vfio-common.h |  10 +
 include/qemu/iova-tree.h      |  19 +-
 hw/i386/intel_iommu.c         |  18 +
 hw/vfio/common.c              | 866 ++++++++++++++++++++++++++++++----
 hw/vfio/migration.c           | 127 ++++-
 util/iova-tree.c              |  23 +-
 hw/vfio/trace-events          |   5 +-
 9 files changed, 1006 insertions(+), 144 deletions(-)

-- 
2.26.3




* [PATCH 01/18] vfio/migration: Add VFIO migration pre-copy support
  2023-01-26 18:49 [PATCH 00/18] vfio: Add migration pre-copy support and device dirty tracking Avihai Horon
@ 2023-01-26 18:49 ` Avihai Horon
  2023-01-26 23:52   ` Alex Williamson
  2023-01-26 18:49 ` [PATCH 02/18] vfio/common: Fix error reporting in vfio_get_dirty_bitmap() Avihai Horon
                   ` (16 subsequent siblings)
  17 siblings, 1 reply; 42+ messages in thread
From: Avihai Horon @ 2023-01-26 18:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Avihai Horon,
	Kirti Wankhede, Tarun Gupta, Joao Martins

Pre-copy support allows the VFIO device data to be transferred while the
VM is running. This helps to accommodate VFIO devices that have a large
amount of data that needs to be transferred, and it can reduce migration
downtime.

Pre-copy support is optional in VFIO migration protocol v2.
Implement pre-copy for VFIO migration protocol v2 and use it for devices
that support it. A full description of it can be found here [1].

[1]
https://lore.kernel.org/kvm/20221206083438.37807-3-yishaih@nvidia.com/
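
As a side note, a userspace sketch of how the capability is discovered
(not code from this patch; it assumes a kernel exposing
VFIO_DEVICE_FEATURE_MIGRATION with the VFIO_MIGRATION_PRE_COPY flag):

    #include <stdint.h>
    #include <sys/ioctl.h>
    #include <linux/vfio.h>

    /* Sketch: return non-zero if the device reports pre-copy support. */
    static int vfio_device_supports_precopy(int device_fd)
    {
        uint64_t buf[(sizeof(struct vfio_device_feature) +
                      sizeof(struct vfio_device_feature_migration) + 7) / 8] = {0};
        struct vfio_device_feature *feature = (struct vfio_device_feature *)buf;
        struct vfio_device_feature_migration *mig =
            (struct vfio_device_feature_migration *)feature->data;

        feature->argsz = sizeof(buf);
        feature->flags = VFIO_DEVICE_FEATURE_GET | VFIO_DEVICE_FEATURE_MIGRATION;

        if (ioctl(device_fd, VFIO_DEVICE_FEATURE, feature)) {
            return 0;   /* migration not supported at all */
        }

        return !!(mig->flags & VFIO_MIGRATION_PRE_COPY);
    }

In QEMU this maps to the new mig_flags field checked in vfio_save_setup()
below.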

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
 docs/devel/vfio-migration.rst |  29 ++++++---
 include/hw/vfio/vfio-common.h |   3 +
 hw/vfio/common.c              |   8 ++-
 hw/vfio/migration.c           | 112 ++++++++++++++++++++++++++++++++--
 hw/vfio/trace-events          |   5 +-
 5 files changed, 140 insertions(+), 17 deletions(-)

diff --git a/docs/devel/vfio-migration.rst b/docs/devel/vfio-migration.rst
index 1d50c2fe5f..51f5e1a537 100644
--- a/docs/devel/vfio-migration.rst
+++ b/docs/devel/vfio-migration.rst
@@ -7,12 +7,14 @@ the guest is running on source host and restoring this saved state on the
 destination host. This document details how saving and restoring of VFIO
 devices is done in QEMU.
 
-Migration of VFIO devices currently consists of a single stop-and-copy phase.
-During the stop-and-copy phase the guest is stopped and the entire VFIO device
-data is transferred to the destination.
-
-The pre-copy phase of migration is currently not supported for VFIO devices.
-Support for VFIO pre-copy will be added later on.
+Migration of VFIO devices consists of two phases: the optional pre-copy phase,
+and the stop-and-copy phase. The pre-copy phase is iterative and allows to
+accommodate VFIO devices that have a large amount of data that needs to be
+transferred. The iterative pre-copy phase of migration allows for the guest to
+continue whilst the VFIO device state is transferred to the destination, this
+helps to reduce the total downtime of the VM. VFIO devices can choose to skip
+the pre-copy phase of migration by not reporting the VFIO_MIGRATION_PRE_COPY
+flag in VFIO_DEVICE_FEATURE_MIGRATION ioctl.
 
 A detailed description of the UAPI for VFIO device migration can be found in
 the comment for the ``vfio_device_mig_state`` structure in the header file
@@ -29,6 +31,12 @@ VFIO implements the device hooks for the iterative approach as follows:
   driver, which indicates the amount of data that the vendor driver has yet to
   save for the VFIO device.
 
+* An ``is_active_iterate`` function that indicates ``save_live_iterate`` is
+  active only if the VFIO device is in pre-copy states.
+
+* A ``save_live_iterate`` function that reads the VFIO device's data from the
+  vendor driver during iterative phase.
+
 * A ``save_state`` function to save the device config space if it is present.
 
 * A ``save_live_complete_precopy`` function that sets the VFIO device in
@@ -91,8 +99,10 @@ Flow of state changes during Live migration
 ===========================================
 
 Below is the flow of state change during live migration.
-The values in the brackets represent the VM state, the migration state, and
+The values in the parentheses represent the VM state, the migration state, and
 the VFIO device state, respectively.
+The text in the square brackets represents the flow if the VFIO device supports
+pre-copy.
 
 Live migration save path
 ------------------------
@@ -104,11 +114,12 @@ Live migration save path
                                   |
                      migrate_init spawns migration_thread
                 Migration thread then calls each device's .save_setup()
-                       (RUNNING, _SETUP, _RUNNING)
+                  (RUNNING, _SETUP, _RUNNING [_PRE_COPY])
                                   |
-                      (RUNNING, _ACTIVE, _RUNNING)
+                  (RUNNING, _ACTIVE, _RUNNING [_PRE_COPY])
              If device is active, get pending_bytes by .save_live_pending()
           If total pending_bytes >= threshold_size, call .save_live_iterate()
+                  [Data of VFIO device for pre-copy phase is copied]
         Iterate till total pending bytes converge and are less than threshold
                                   |
   On migration completion, vCPU stops and calls .save_live_complete_precopy for
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 5f8e7a02fe..88c2194fb9 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -67,7 +67,10 @@ typedef struct VFIOMigration {
     int data_fd;
     void *data_buffer;
     size_t data_buffer_size;
+    uint64_t mig_flags;
     uint64_t stop_copy_size;
+    uint64_t precopy_init_size;
+    uint64_t precopy_dirty_size;
 } VFIOMigration;
 
 typedef struct VFIOAddressSpace {
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 9a0dbee6b4..93b18c5e3d 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -357,7 +357,9 @@ static bool vfio_devices_all_dirty_tracking(VFIOContainer *container)
 
             if ((vbasedev->pre_copy_dirty_page_tracking == ON_OFF_AUTO_OFF) &&
                 (migration->device_state == VFIO_DEVICE_STATE_RUNNING ||
-                 migration->device_state == VFIO_DEVICE_STATE_RUNNING_P2P)) {
+                 migration->device_state == VFIO_DEVICE_STATE_RUNNING_P2P ||
+                 migration->device_state == VFIO_DEVICE_STATE_PRE_COPY ||
+                 migration->device_state == VFIO_DEVICE_STATE_PRE_COPY_P2P)) {
                 return false;
             }
         }
@@ -387,7 +389,9 @@ static bool vfio_devices_all_running_and_mig_active(VFIOContainer *container)
             }
 
             if (migration->device_state == VFIO_DEVICE_STATE_RUNNING ||
-                migration->device_state == VFIO_DEVICE_STATE_RUNNING_P2P) {
+                migration->device_state == VFIO_DEVICE_STATE_RUNNING_P2P ||
+                migration->device_state == VFIO_DEVICE_STATE_PRE_COPY ||
+                migration->device_state == VFIO_DEVICE_STATE_PRE_COPY_P2P) {
                 continue;
             } else {
                 return false;
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 760f667e04..2a0a663023 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -69,6 +69,10 @@ static const char *mig_state_to_str(enum vfio_device_mig_state state)
         return "RESUMING";
     case VFIO_DEVICE_STATE_RUNNING_P2P:
         return "RUNNING_P2P";
+    case VFIO_DEVICE_STATE_PRE_COPY:
+        return "PRE_COPY";
+    case VFIO_DEVICE_STATE_PRE_COPY_P2P:
+        return "PRE_COPY_P2P";
     default:
         return "UNKNOWN STATE";
     }
@@ -237,6 +241,11 @@ static int vfio_save_block(QEMUFile *f, VFIOMigration *migration)
     data_size = read(migration->data_fd, migration->data_buffer,
                      migration->data_buffer_size);
     if (data_size < 0) {
+        /* Pre-copy emptied all the device state for now */
+        if (errno == ENOMSG) {
+            return 1;
+        }
+
         return -errno;
     }
     if (data_size == 0) {
@@ -260,6 +269,7 @@ static int vfio_save_setup(QEMUFile *f, void *opaque)
     VFIODevice *vbasedev = opaque;
     VFIOMigration *migration = vbasedev->migration;
     uint64_t stop_copy_size = VFIO_MIG_DEFAULT_DATA_BUFFER_SIZE;
+    int ret;
 
     qemu_put_be64(f, VFIO_MIG_FLAG_DEV_SETUP_STATE);
 
@@ -273,6 +283,23 @@ static int vfio_save_setup(QEMUFile *f, void *opaque)
         return -ENOMEM;
     }
 
+    if (migration->mig_flags & VFIO_MIGRATION_PRE_COPY) {
+        switch (migration->device_state) {
+        case VFIO_DEVICE_STATE_RUNNING:
+            ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_PRE_COPY,
+                                           VFIO_DEVICE_STATE_RUNNING);
+            if (ret) {
+                return ret;
+            }
+            break;
+        case VFIO_DEVICE_STATE_STOP:
+            /* vfio_save_complete_precopy() will go to STOP_COPY */
+            break;
+        default:
+            return -EINVAL;
+        }
+    }
+
     trace_vfio_save_setup(vbasedev->name, migration->data_buffer_size);
 
     qemu_put_be64(f, VFIO_MIG_FLAG_END_OF_STATE);
@@ -287,6 +314,12 @@ static void vfio_save_cleanup(void *opaque)
 
     g_free(migration->data_buffer);
     migration->data_buffer = NULL;
+
+    if (migration->mig_flags & VFIO_MIGRATION_PRE_COPY) {
+        migration->precopy_init_size = 0;
+        migration->precopy_dirty_size = 0;
+    }
+
     vfio_migration_cleanup(vbasedev);
     trace_vfio_save_cleanup(vbasedev->name);
 }
@@ -301,9 +334,55 @@ static void vfio_save_pending(void *opaque, uint64_t threshold_size,
 
     *res_precopy_only += migration->stop_copy_size;
 
+    if (migration->device_state == VFIO_DEVICE_STATE_PRE_COPY ||
+        migration->device_state == VFIO_DEVICE_STATE_PRE_COPY_P2P) {
+        if (migration->precopy_init_size) {
+            /*
+             * Initial size should be transferred during pre-copy phase so
+             * stop-copy phase will not be slowed down. Report threshold_size
+             * to force another pre-copy iteration.
+             */
+            *res_precopy_only += threshold_size;
+        } else {
+            *res_precopy_only += migration->precopy_dirty_size;
+        }
+    }
+
     trace_vfio_save_pending(vbasedev->name, *res_precopy_only,
                             *res_postcopy_only, *res_compatible,
-                            migration->stop_copy_size);
+                            migration->stop_copy_size,
+                            migration->precopy_init_size,
+                            migration->precopy_dirty_size);
+}
+
+static bool vfio_is_active_iterate(void *opaque)
+{
+    VFIODevice *vbasedev = opaque;
+    VFIOMigration *migration = vbasedev->migration;
+
+    return migration->device_state == VFIO_DEVICE_STATE_PRE_COPY ||
+           migration->device_state == VFIO_DEVICE_STATE_PRE_COPY_P2P;
+}
+
+static int vfio_save_iterate(QEMUFile *f, void *opaque)
+{
+    VFIODevice *vbasedev = opaque;
+    VFIOMigration *migration = vbasedev->migration;
+    int ret;
+
+    ret = vfio_save_block(f, migration);
+    if (ret < 0) {
+        return ret;
+    }
+    qemu_put_be64(f, VFIO_MIG_FLAG_END_OF_STATE);
+
+    trace_vfio_save_iterate(vbasedev->name);
+
+    /*
+     * A VFIO device's pre-copy dirty_bytes is not guaranteed to reach zero.
+     * Return 1 so following handlers will not be potentially blocked.
+     */
+    return 1;
 }
 
 static int vfio_save_complete_precopy(QEMUFile *f, void *opaque)
@@ -312,7 +391,7 @@ static int vfio_save_complete_precopy(QEMUFile *f, void *opaque)
     enum vfio_device_mig_state recover_state;
     int ret;
 
-    /* We reach here with device state STOP only */
+    /* We reach here with device state STOP or STOP_COPY only */
     recover_state = VFIO_DEVICE_STATE_STOP;
     ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_STOP_COPY,
                                    recover_state);
@@ -430,6 +509,8 @@ static const SaveVMHandlers savevm_vfio_handlers = {
     .save_setup = vfio_save_setup,
     .save_cleanup = vfio_save_cleanup,
     .save_live_pending = vfio_save_pending,
+    .is_active_iterate = vfio_is_active_iterate,
+    .save_live_iterate = vfio_save_iterate,
     .save_live_complete_precopy = vfio_save_complete_precopy,
     .save_state = vfio_save_state,
     .load_setup = vfio_load_setup,
@@ -442,13 +523,19 @@ static const SaveVMHandlers savevm_vfio_handlers = {
 static void vfio_vmstate_change(void *opaque, bool running, RunState state)
 {
     VFIODevice *vbasedev = opaque;
+    VFIOMigration *migration = vbasedev->migration;
     enum vfio_device_mig_state new_state;
     int ret;
 
     if (running) {
         new_state = VFIO_DEVICE_STATE_RUNNING;
     } else {
-        new_state = VFIO_DEVICE_STATE_STOP;
+        new_state =
+            ((migration->device_state == VFIO_DEVICE_STATE_PRE_COPY ||
+              migration->device_state == VFIO_DEVICE_STATE_PRE_COPY_P2P) &&
+             (state == RUN_STATE_FINISH_MIGRATE || state == RUN_STATE_PAUSED)) ?
+                VFIO_DEVICE_STATE_STOP_COPY :
+                VFIO_DEVICE_STATE_STOP;
     }
 
     ret = vfio_migration_set_state(vbasedev, new_state,
@@ -496,6 +583,9 @@ static int vfio_migration_data_notifier(NotifierWithReturn *n, void *data)
 {
     VFIOMigration *migration = container_of(n, VFIOMigration, migration_data);
     VFIODevice *vbasedev = migration->vbasedev;
+    struct vfio_precopy_info precopy = {
+        .argsz = sizeof(precopy),
+    };
     PrecopyNotifyData *pnd = data;
 
     if (pnd->reason != PRECOPY_NOTIFY_AFTER_BITMAP_SYNC) {
@@ -515,8 +605,21 @@ static int vfio_migration_data_notifier(NotifierWithReturn *n, void *data)
         migration->stop_copy_size = VFIO_MIG_STOP_COPY_SIZE;
     }
 
+    if ((migration->device_state == VFIO_DEVICE_STATE_PRE_COPY ||
+         migration->device_state == VFIO_DEVICE_STATE_PRE_COPY_P2P)) {
+        if (ioctl(migration->data_fd, VFIO_MIG_GET_PRECOPY_INFO, &precopy)) {
+            migration->precopy_init_size = 0;
+            migration->precopy_dirty_size = 0;
+        } else {
+            migration->precopy_init_size = precopy.initial_bytes;
+            migration->precopy_dirty_size = precopy.dirty_bytes;
+        }
+    }
+
     trace_vfio_migration_data_notifier(vbasedev->name,
-                                       migration->stop_copy_size);
+                                       migration->stop_copy_size,
+                                       migration->precopy_init_size,
+                                       migration->precopy_dirty_size);
 
     return 0;
 }
@@ -588,6 +691,7 @@ static int vfio_migration_init(VFIODevice *vbasedev)
     migration->vbasedev = vbasedev;
     migration->device_state = VFIO_DEVICE_STATE_RUNNING;
     migration->data_fd = -1;
+    migration->mig_flags = mig_flags;
 
     oid = vmstate_if_get_id(VMSTATE_IF(DEVICE(obj)));
     if (oid) {
diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
index db9cb94952..37724579e3 100644
--- a/hw/vfio/trace-events
+++ b/hw/vfio/trace-events
@@ -154,7 +154,7 @@ vfio_load_cleanup(const char *name) " (%s)"
 vfio_load_device_config_state(const char *name) " (%s)"
 vfio_load_state(const char *name, uint64_t data) " (%s) data 0x%"PRIx64
 vfio_load_state_device_data(const char *name, uint64_t data_size, int ret) " (%s) size 0x%"PRIx64" ret %d"
-vfio_migration_data_notifier(const char *name, uint64_t stopcopy_size) " (%s) stopcopy size 0x%"PRIx64
+vfio_migration_data_notifier(const char *name, uint64_t stopcopy_size, uint64_t precopy_init_size, uint64_t precopy_dirty_size) " (%s) stopcopy size 0x%"PRIx64" precopy initial size 0x%"PRIx64" precopy dirty size 0x%"PRIx64
 vfio_migration_probe(const char *name) " (%s)"
 vfio_migration_set_state(const char *name, const char *state) " (%s) state %s"
 vfio_migration_state_notifier(const char *name, const char *state) " (%s) state %s"
@@ -162,6 +162,7 @@ vfio_save_block(const char *name, int data_size) " (%s) data_size %d"
 vfio_save_cleanup(const char *name) " (%s)"
 vfio_save_complete_precopy(const char *name, int ret) " (%s) ret %d"
 vfio_save_device_config_state(const char *name) " (%s)"
-vfio_save_pending(const char *name, uint64_t precopy, uint64_t postcopy, uint64_t compatible, uint64_t stopcopy_size) " (%s) precopy 0x%"PRIx64" postcopy 0x%"PRIx64" compatible 0x%"PRIx64" stopcopy size 0x%"PRIx64
+vfio_save_iterate(const char *name) " (%s)"
+vfio_save_pending(const char *name, uint64_t precopy, uint64_t postcopy, uint64_t compatible, uint64_t stopcopy_size, uint64_t precopy_init_size, uint64_t precopy_dirty_size) " (%s) precopy 0x%"PRIx64" postcopy 0x%"PRIx64" compatible 0x%"PRIx64" stopcopy size 0x%"PRIx64" precopy initial size 0x%"PRIx64" precopy dirty size 0x%"PRIx64
 vfio_save_setup(const char *name, uint64_t data_buffer_size) " (%s) data buffer size 0x%"PRIx64
 vfio_vmstate_change(const char *name, int running, const char *reason, const char *dev_state) " (%s) running %d reason %s device state %s"
-- 
2.26.3




* [PATCH 02/18] vfio/common: Fix error reporting in vfio_get_dirty_bitmap()
  2023-01-26 18:49 [PATCH 00/18] vfio: Add migration pre-copy support and device dirty tracking Avihai Horon
  2023-01-26 18:49 ` [PATCH 01/18] vfio/migration: Add VFIO migration pre-copy support Avihai Horon
@ 2023-01-26 18:49 ` Avihai Horon
  2023-02-15  9:21   ` Cédric Le Goater
  2023-01-26 18:49 ` [PATCH 03/18] vfio/common: Fix wrong %m usages Avihai Horon
                   ` (15 subsequent siblings)
  17 siblings, 1 reply; 42+ messages in thread
From: Avihai Horon @ 2023-01-26 18:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Avihai Horon,
	Kirti Wankhede, Tarun Gupta, Joao Martins

Return -errno instead of -1 if VFIO_IOMMU_DIRTY_PAGES ioctl fails in
vfio_get_dirty_bitmap().

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
 hw/vfio/common.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 93b18c5e3d..d892609cf1 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1288,6 +1288,7 @@ static int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
 
     ret = ioctl(container->fd, VFIO_IOMMU_DIRTY_PAGES, dbitmap);
     if (ret) {
+        ret = -errno;
         error_report("Failed to get dirty bitmap for iova: 0x%"PRIx64
                 " size: 0x%"PRIx64" err: %d", (uint64_t)range->iova,
                 (uint64_t)range->size, errno);
-- 
2.26.3




* [PATCH 03/18] vfio/common: Fix wrong %m usages
  2023-01-26 18:49 [PATCH 00/18] vfio: Add migration pre-copy support and device dirty tracking Avihai Horon
  2023-01-26 18:49 ` [PATCH 01/18] vfio/migration: Add VFIO migration pre-copy support Avihai Horon
  2023-01-26 18:49 ` [PATCH 02/18] vfio/common: Fix error reporting in vfio_get_dirty_bitmap() Avihai Horon
@ 2023-01-26 18:49 ` Avihai Horon
  2023-02-15  9:21   ` Cédric Le Goater
  2023-01-26 18:49 ` [PATCH 04/18] vfio/common: Abort migration if dirty log start/stop/sync fails Avihai Horon
                   ` (14 subsequent siblings)
  17 siblings, 1 reply; 42+ messages in thread
From: Avihai Horon @ 2023-01-26 18:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Avihai Horon,
	Kirti Wankhede, Tarun Gupta, Joao Martins

There are several places where the %m conversion is used if one of
vfio_dma_map(), vfio_dma_unmap() or vfio_get_dirty_bitmap() fails.

The %m usage in these places is wrong, since %m relies on the errno value
while the above functions don't report errors via errno.

Fix it by using strerror() with the returned error code instead.
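
For illustration, the before/after pattern (a fragment mirroring the diff
below, not standalone code):

    /* Wrong: %m expands strerror(errno), but these functions return a
     * negative error code and do not necessarily set errno. */
    error_report("vfio_dma_map(...) = %d (%m)", ret);

    /* Right: decode the returned error code explicitly. */
    error_report("vfio_dma_map(...) = %d (%s)", ret, strerror(-ret));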

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
 hw/vfio/common.c | 29 ++++++++++++++++-------------
 1 file changed, 16 insertions(+), 13 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index d892609cf1..643418f6f1 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -656,17 +656,17 @@ static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
                            read_only);
         if (ret) {
             error_report("vfio_dma_map(%p, 0x%"HWADDR_PRIx", "
-                         "0x%"HWADDR_PRIx", %p) = %d (%m)",
+                         "0x%"HWADDR_PRIx", %p) = %d (%s)",
                          container, iova,
-                         iotlb->addr_mask + 1, vaddr, ret);
+                         iotlb->addr_mask + 1, vaddr, ret, strerror(-ret));
         }
     } else {
         ret = vfio_dma_unmap(container, iova, iotlb->addr_mask + 1, iotlb);
         if (ret) {
             error_report("vfio_dma_unmap(%p, 0x%"HWADDR_PRIx", "
-                         "0x%"HWADDR_PRIx") = %d (%m)",
+                         "0x%"HWADDR_PRIx") = %d (%s)",
                          container, iova,
-                         iotlb->addr_mask + 1, ret);
+                         iotlb->addr_mask + 1, ret, strerror(-ret));
         }
     }
 out:
@@ -1048,8 +1048,9 @@ static void vfio_listener_region_add(MemoryListener *listener,
                        vaddr, section->readonly);
     if (ret) {
         error_setg(&err, "vfio_dma_map(%p, 0x%"HWADDR_PRIx", "
-                   "0x%"HWADDR_PRIx", %p) = %d (%m)",
-                   container, iova, int128_get64(llsize), vaddr, ret);
+                   "0x%"HWADDR_PRIx", %p) = %d (%s)",
+                   container, iova, int128_get64(llsize), vaddr, ret,
+                   strerror(-ret));
         if (memory_region_is_ram_device(section->mr)) {
             /* Allow unexpected mappings not to be fatal for RAM devices */
             error_report_err(err);
@@ -1181,16 +1182,18 @@ static void vfio_listener_region_del(MemoryListener *listener,
             ret = vfio_dma_unmap(container, iova, int128_get64(llsize), NULL);
             if (ret) {
                 error_report("vfio_dma_unmap(%p, 0x%"HWADDR_PRIx", "
-                             "0x%"HWADDR_PRIx") = %d (%m)",
-                             container, iova, int128_get64(llsize), ret);
+                             "0x%"HWADDR_PRIx") = %d (%s)",
+                             container, iova, int128_get64(llsize), ret,
+                             strerror(-ret));
             }
             iova += int128_get64(llsize);
         }
         ret = vfio_dma_unmap(container, iova, int128_get64(llsize), NULL);
         if (ret) {
             error_report("vfio_dma_unmap(%p, 0x%"HWADDR_PRIx", "
-                         "0x%"HWADDR_PRIx") = %d (%m)",
-                         container, iova, int128_get64(llsize), ret);
+                         "0x%"HWADDR_PRIx") = %d (%s)",
+                         container, iova, int128_get64(llsize), ret,
+                         strerror(-ret));
         }
     }
 
@@ -1337,9 +1340,9 @@ static void vfio_iommu_map_dirty_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
                                     translated_addr);
         if (ret) {
             error_report("vfio_iommu_map_dirty_notify(%p, 0x%"HWADDR_PRIx", "
-                         "0x%"HWADDR_PRIx") = %d (%m)",
-                         container, iova,
-                         iotlb->addr_mask + 1, ret);
+                         "0x%"HWADDR_PRIx") = %d (%s)",
+                         container, iova, iotlb->addr_mask + 1, ret,
+                         strerror(-ret));
         }
     }
     rcu_read_unlock();
-- 
2.26.3




* [PATCH 04/18] vfio/common: Abort migration if dirty log start/stop/sync fails
  2023-01-26 18:49 [PATCH 00/18] vfio: Add migration pre-copy support and device dirty tracking Avihai Horon
                   ` (2 preceding siblings ...)
  2023-01-26 18:49 ` [PATCH 03/18] vfio/common: Fix wrong %m usages Avihai Horon
@ 2023-01-26 18:49 ` Avihai Horon
  2023-02-15  9:41   ` Cédric Le Goater
  2023-01-26 18:49 ` [PATCH 05/18] vfio/common: Add VFIOBitmap and (de)alloc functions Avihai Horon
                   ` (13 subsequent siblings)
  17 siblings, 1 reply; 42+ messages in thread
From: Avihai Horon @ 2023-01-26 18:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Avihai Horon,
	Kirti Wankhede, Tarun Gupta, Joao Martins

If VFIO dirty pages log start/stop/sync fails during migration, the
migration should be aborted, as pages dirtied by VFIO devices might not
be reported properly.

This is not the case today: in such a scenario only an error message is
printed.

Fix it by aborting the migration in the above scenario.

Fixes: 758b96b61d5c ("vfio/migrate: Move switch of dirty tracking into vfio_memory_listener")
Fixes: b6dd6504e303 ("vfio: Add vfio_listener_log_sync to mark dirty pages")
Fixes: 9e7b0442f23a ("vfio: Add ioctl to get dirty pages bitmap during dma unmap")
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
 hw/vfio/common.c | 53 ++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 45 insertions(+), 8 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 643418f6f1..8e8ffbc046 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -41,6 +41,7 @@
 #include "qapi/error.h"
 #include "migration/migration.h"
 #include "migration/misc.h"
+#include "migration/qemu-file.h"
 #include "sysemu/tpm.h"
 
 VFIOGroupList vfio_group_list =
@@ -337,6 +338,19 @@ bool vfio_mig_active(void)
     return true;
 }
 
+static void vfio_set_migration_error(int err)
+{
+    MigrationState *ms = migrate_get_current();
+
+    if (migration_is_setup_or_active(ms->state)) {
+        WITH_QEMU_LOCK_GUARD(&ms->qemu_file_lock) {
+            if (ms->to_dst_file) {
+                qemu_file_set_error(ms->to_dst_file, err);
+            }
+        }
+    }
+}
+
 static bool vfio_devices_all_dirty_tracking(VFIOContainer *container)
 {
     VFIOGroup *group;
@@ -633,6 +647,7 @@ static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
     if (iotlb->target_as != &address_space_memory) {
         error_report("Wrong target AS \"%s\", only system memory is allowed",
                      iotlb->target_as->name ? iotlb->target_as->name : "none");
+        vfio_set_migration_error(-EINVAL);
         return;
     }
 
@@ -667,6 +682,7 @@ static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
                          "0x%"HWADDR_PRIx") = %d (%s)",
                          container, iova,
                          iotlb->addr_mask + 1, ret, strerror(-ret));
+            vfio_set_migration_error(ret);
         }
     }
 out:
@@ -1212,7 +1228,7 @@ static void vfio_listener_region_del(MemoryListener *listener,
     }
 }
 
-static void vfio_set_dirty_page_tracking(VFIOContainer *container, bool start)
+static int vfio_set_dirty_page_tracking(VFIOContainer *container, bool start)
 {
     int ret;
     struct vfio_iommu_type1_dirty_bitmap dirty = {
@@ -1220,7 +1236,7 @@ static void vfio_set_dirty_page_tracking(VFIOContainer *container, bool start)
     };
 
     if (!container->dirty_pages_supported) {
-        return;
+        return 0;
     }
 
     if (start) {
@@ -1231,23 +1247,34 @@ static void vfio_set_dirty_page_tracking(VFIOContainer *container, bool start)
 
     ret = ioctl(container->fd, VFIO_IOMMU_DIRTY_PAGES, &dirty);
     if (ret) {
+        ret = -errno;
         error_report("Failed to set dirty tracking flag 0x%x errno: %d",
                      dirty.flags, errno);
     }
+
+    return ret;
 }
 
 static void vfio_listener_log_global_start(MemoryListener *listener)
 {
     VFIOContainer *container = container_of(listener, VFIOContainer, listener);
+    int ret;
 
-    vfio_set_dirty_page_tracking(container, true);
+    ret = vfio_set_dirty_page_tracking(container, true);
+    if (ret) {
+        vfio_set_migration_error(ret);
+    }
 }
 
 static void vfio_listener_log_global_stop(MemoryListener *listener)
 {
     VFIOContainer *container = container_of(listener, VFIOContainer, listener);
+    int ret;
 
-    vfio_set_dirty_page_tracking(container, false);
+    ret = vfio_set_dirty_page_tracking(container, false);
+    if (ret) {
+        vfio_set_migration_error(ret);
+    }
 }
 
 static int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
@@ -1323,19 +1350,18 @@ static void vfio_iommu_map_dirty_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
     VFIOContainer *container = giommu->container;
     hwaddr iova = iotlb->iova + giommu->iommu_offset;
     ram_addr_t translated_addr;
+    int ret = -EINVAL;
 
     trace_vfio_iommu_map_dirty_notify(iova, iova + iotlb->addr_mask);
 
     if (iotlb->target_as != &address_space_memory) {
         error_report("Wrong target AS \"%s\", only system memory is allowed",
                      iotlb->target_as->name ? iotlb->target_as->name : "none");
-        return;
+        goto out;
     }
 
     rcu_read_lock();
     if (vfio_get_xlat_addr(iotlb, NULL, &translated_addr, NULL)) {
-        int ret;
-
         ret = vfio_get_dirty_bitmap(container, iova, iotlb->addr_mask + 1,
                                     translated_addr);
         if (ret) {
@@ -1346,6 +1372,11 @@ static void vfio_iommu_map_dirty_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
         }
     }
     rcu_read_unlock();
+
+out:
+    if (ret) {
+        vfio_set_migration_error(ret);
+    }
 }
 
 static int vfio_ram_discard_get_dirty_bitmap(MemoryRegionSection *section,
@@ -1438,13 +1469,19 @@ static void vfio_listener_log_sync(MemoryListener *listener,
         MemoryRegionSection *section)
 {
     VFIOContainer *container = container_of(listener, VFIOContainer, listener);
+    int ret;
 
     if (vfio_listener_skipped_section(section)) {
         return;
     }
 
     if (vfio_devices_all_dirty_tracking(container)) {
-        vfio_sync_dirty_bitmap(container, section);
+        ret = vfio_sync_dirty_bitmap(container, section);
+        if (ret) {
+            error_report("vfio: Failed to sync dirty bitmap, err: %d (%s)", ret,
+                         strerror(-ret));
+            vfio_set_migration_error(ret);
+        }
     }
 }
 
-- 
2.26.3




* [PATCH 05/18] vfio/common: Add VFIOBitmap and (de)alloc functions
  2023-01-26 18:49 [PATCH 00/18] vfio: Add migration pre-copy support and device dirty tracking Avihai Horon
                   ` (3 preceding siblings ...)
  2023-01-26 18:49 ` [PATCH 04/18] vfio/common: Abort migration if dirty log start/stop/sync fails Avihai Horon
@ 2023-01-26 18:49 ` Avihai Horon
  2023-01-27 21:11   ` Alex Williamson
  2023-01-26 18:49 ` [PATCH 06/18] util: Add iova_tree_nnodes() Avihai Horon
                   ` (12 subsequent siblings)
  17 siblings, 1 reply; 42+ messages in thread
From: Avihai Horon @ 2023-01-26 18:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Avihai Horon,
	Kirti Wankhede, Tarun Gupta, Joao Martins

There are already two places where dirty page bitmap allocation and
calculations are done in open code. With device dirty page tracking
being added in the next patches, there are going to be even more places.

To avoid code duplication, introduce a VFIOBitmap struct and corresponding
alloc and dealloc functions, and use them where applicable.
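
The resulting usage pattern in the converted callers (condensed from the
diff below, not standalone code) is:

    VFIOBitmap *vbmap = vfio_bitmap_alloc(size);
    if (!vbmap) {
        return -errno;      /* vfio_bitmap_alloc() sets errno = ENOMEM */
    }

    /* Hand the kernel the bitmap, sized and page-aligned by the helper. */
    range->bitmap.size = vbmap->size;
    range->bitmap.data = (__u64 *)vbmap->bitmap;
    ...
    cpu_physical_memory_set_dirty_lebitmap(vbmap->bitmap, ram_addr,
                                           vbmap->pages);
    vfio_bitmap_dealloc(vbmap);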

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
 hw/vfio/common.c | 89 ++++++++++++++++++++++++++++++++----------------
 1 file changed, 60 insertions(+), 29 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 8e8ffbc046..e554573eb5 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -319,6 +319,41 @@ const MemoryRegionOps vfio_region_ops = {
  * Device state interfaces
  */
 
+typedef struct {
+    unsigned long *bitmap;
+    hwaddr size;
+    hwaddr pages;
+} VFIOBitmap;
+
+static VFIOBitmap *vfio_bitmap_alloc(hwaddr size)
+{
+    VFIOBitmap *vbmap = g_try_new0(VFIOBitmap, 1);
+    if (!vbmap) {
+        errno = ENOMEM;
+
+        return NULL;
+    }
+
+    vbmap->pages = REAL_HOST_PAGE_ALIGN(size) / qemu_real_host_page_size();
+    vbmap->size =  ROUND_UP(vbmap->pages, sizeof(__u64) * BITS_PER_BYTE) /
+                                          BITS_PER_BYTE;
+    vbmap->bitmap = g_try_malloc0(vbmap->size);
+    if (!vbmap->bitmap) {
+        g_free(vbmap);
+        errno = ENOMEM;
+
+        return NULL;
+    }
+
+    return vbmap;
+}
+
+static void vfio_bitmap_dealloc(VFIOBitmap *vbmap)
+{
+    g_free(vbmap->bitmap);
+    g_free(vbmap);
+}
+
 bool vfio_mig_active(void)
 {
     VFIOGroup *group;
@@ -421,9 +456,14 @@ static int vfio_dma_unmap_bitmap(VFIOContainer *container,
 {
     struct vfio_iommu_type1_dma_unmap *unmap;
     struct vfio_bitmap *bitmap;
-    uint64_t pages = REAL_HOST_PAGE_ALIGN(size) / qemu_real_host_page_size();
+    VFIOBitmap *vbmap;
     int ret;
 
+    vbmap = vfio_bitmap_alloc(size);
+    if (!vbmap) {
+        return -errno;
+    }
+
     unmap = g_malloc0(sizeof(*unmap) + sizeof(*bitmap));
 
     unmap->argsz = sizeof(*unmap) + sizeof(*bitmap);
@@ -437,35 +477,28 @@ static int vfio_dma_unmap_bitmap(VFIOContainer *container,
      * qemu_real_host_page_size to mark those dirty. Hence set bitmap_pgsize
      * to qemu_real_host_page_size.
      */
-
     bitmap->pgsize = qemu_real_host_page_size();
-    bitmap->size = ROUND_UP(pages, sizeof(__u64) * BITS_PER_BYTE) /
-                   BITS_PER_BYTE;
+    bitmap->size = vbmap->size;
+    bitmap->data = (__u64 *)vbmap->bitmap;
 
-    if (bitmap->size > container->max_dirty_bitmap_size) {
-        error_report("UNMAP: Size of bitmap too big 0x%"PRIx64,
-                     (uint64_t)bitmap->size);
+    if (vbmap->size > container->max_dirty_bitmap_size) {
+        error_report("UNMAP: Size of bitmap too big 0x%"PRIx64, vbmap->size);
         ret = -E2BIG;
         goto unmap_exit;
     }
 
-    bitmap->data = g_try_malloc0(bitmap->size);
-    if (!bitmap->data) {
-        ret = -ENOMEM;
-        goto unmap_exit;
-    }
-
     ret = ioctl(container->fd, VFIO_IOMMU_UNMAP_DMA, unmap);
     if (!ret) {
-        cpu_physical_memory_set_dirty_lebitmap((unsigned long *)bitmap->data,
-                iotlb->translated_addr, pages);
+        cpu_physical_memory_set_dirty_lebitmap(vbmap->bitmap,
+                iotlb->translated_addr, vbmap->pages);
     } else {
         error_report("VFIO_UNMAP_DMA with DIRTY_BITMAP : %m");
     }
 
-    g_free(bitmap->data);
 unmap_exit:
     g_free(unmap);
+    vfio_bitmap_dealloc(vbmap);
+
     return ret;
 }
 
@@ -1282,7 +1315,7 @@ static int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
 {
     struct vfio_iommu_type1_dirty_bitmap *dbitmap;
     struct vfio_iommu_type1_dirty_bitmap_get *range;
-    uint64_t pages;
+    VFIOBitmap *vbmap;
     int ret;
 
     if (!container->dirty_pages_supported) {
@@ -1292,6 +1325,11 @@ static int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
         return 0;
     }
 
+    vbmap = vfio_bitmap_alloc(size);
+    if (!vbmap) {
+        return -errno;
+    }
+
     dbitmap = g_malloc0(sizeof(*dbitmap) + sizeof(*range));
 
     dbitmap->argsz = sizeof(*dbitmap) + sizeof(*range);
@@ -1306,15 +1344,8 @@ static int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
      * to qemu_real_host_page_size.
      */
     range->bitmap.pgsize = qemu_real_host_page_size();
-
-    pages = REAL_HOST_PAGE_ALIGN(range->size) / qemu_real_host_page_size();
-    range->bitmap.size = ROUND_UP(pages, sizeof(__u64) * BITS_PER_BYTE) /
-                                         BITS_PER_BYTE;
-    range->bitmap.data = g_try_malloc0(range->bitmap.size);
-    if (!range->bitmap.data) {
-        ret = -ENOMEM;
-        goto err_out;
-    }
+    range->bitmap.size = vbmap->size;
+    range->bitmap.data = (__u64 *)vbmap->bitmap;
 
     ret = ioctl(container->fd, VFIO_IOMMU_DIRTY_PAGES, dbitmap);
     if (ret) {
@@ -1325,14 +1356,14 @@ static int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
         goto err_out;
     }
 
-    cpu_physical_memory_set_dirty_lebitmap((unsigned long *)range->bitmap.data,
-                                            ram_addr, pages);
+    cpu_physical_memory_set_dirty_lebitmap(vbmap->bitmap, ram_addr,
+                                           vbmap->pages);
 
     trace_vfio_get_dirty_bitmap(container->fd, range->iova, range->size,
                                 range->bitmap.size, ram_addr);
 err_out:
-    g_free(range->bitmap.data);
     g_free(dbitmap);
+    vfio_bitmap_dealloc(vbmap);
 
     return ret;
 }
-- 
2.26.3




* [PATCH 06/18] util: Add iova_tree_nnodes()
  2023-01-26 18:49 [PATCH 00/18] vfio: Add migration pre-copy support and device dirty tracking Avihai Horon
                   ` (4 preceding siblings ...)
  2023-01-26 18:49 ` [PATCH 05/18] vfio/common: Add VFIOBitmap and (de)alloc functions Avihai Horon
@ 2023-01-26 18:49 ` Avihai Horon
  2023-02-09 22:21   ` Peter Xu
  2023-01-26 18:49 ` [PATCH 07/18] util: Extend iova_tree_foreach() to take data argument Avihai Horon
                   ` (11 subsequent siblings)
  17 siblings, 1 reply; 42+ messages in thread
From: Avihai Horon @ 2023-01-26 18:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Avihai Horon,
	Kirti Wankhede, Tarun Gupta, Joao Martins

From: Joao Martins <joao.m.martins@oracle.com>

Add iova_tree_nnodes(), which returns the number of nodes in the IOVA
tree.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
---
 include/qemu/iova-tree.h | 11 +++++++++++
 util/iova-tree.c         |  5 +++++
 2 files changed, 16 insertions(+)

diff --git a/include/qemu/iova-tree.h b/include/qemu/iova-tree.h
index 8528e5c98f..7bb80783ce 100644
--- a/include/qemu/iova-tree.h
+++ b/include/qemu/iova-tree.h
@@ -164,4 +164,15 @@ int iova_tree_alloc_map(IOVATree *tree, DMAMap *map, hwaddr iova_begin,
  */
 void iova_tree_destroy(IOVATree *tree);
 
+/**
+ * iova_tree_nnodes:
+ *
+ * @tree: the iova tree to consult
+ *
+ * Returns the number of nodes in the iova tree
+ *
+ * Return: >=0 for the number of nodes.
+ */
+gint iova_tree_nnodes(IOVATree *tree);
+
 #endif
diff --git a/util/iova-tree.c b/util/iova-tree.c
index 536789797e..6141a6229b 100644
--- a/util/iova-tree.c
+++ b/util/iova-tree.c
@@ -280,3 +280,8 @@ void iova_tree_destroy(IOVATree *tree)
     g_tree_destroy(tree->tree);
     g_free(tree);
 }
+
+gint iova_tree_nnodes(IOVATree *tree)
+{
+    return g_tree_nnodes(tree->tree);
+}
-- 
2.26.3




* [PATCH 07/18] util: Extend iova_tree_foreach() to take data argument
  2023-01-26 18:49 [PATCH 00/18] vfio: Add migration pre-copy support and device dirty tracking Avihai Horon
                   ` (5 preceding siblings ...)
  2023-01-26 18:49 ` [PATCH 06/18] util: Add iova_tree_nnodes() Avihai Horon
@ 2023-01-26 18:49 ` Avihai Horon
  2023-02-09 22:21   ` Peter Xu
  2023-01-26 18:49 ` [PATCH 08/18] vfio/common: Record DMA mapped IOVA ranges Avihai Horon
                   ` (10 subsequent siblings)
  17 siblings, 1 reply; 42+ messages in thread
From: Avihai Horon @ 2023-01-26 18:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Avihai Horon,
	Kirti Wankhede, Tarun Gupta, Joao Martins

From: Joao Martins <joao.m.martins@oracle.com>

Extend iova_tree_foreach() to take a data argument that is passed to and
used by the iterator.

While at it, fix a documentation error:
the documentation says iova_tree_foreach() returns a value even though
it is a void function.
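
For example, with the new signature a caller can thread state through the
walk without globals (an illustrative snippet, not code from this series):

    /* Count how many recorded mappings are writable. */
    static gboolean count_rw_mapping(DMAMap *map, gpointer data)
    {
        unsigned int *count = data;

        if (map->perm & IOMMU_WO) {
            (*count)++;
        }

        return false;   /* keep iterating */
    }

    ...
        unsigned int rw_mappings = 0;

        iova_tree_foreach(tree, count_rw_mapping, &rw_mappings);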

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
---
 include/qemu/iova-tree.h |  8 +++++---
 util/iova-tree.c         | 18 ++++++++++++++----
 2 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/include/qemu/iova-tree.h b/include/qemu/iova-tree.h
index 7bb80783ce..1332dce014 100644
--- a/include/qemu/iova-tree.h
+++ b/include/qemu/iova-tree.h
@@ -38,7 +38,7 @@ typedef struct DMAMap {
     hwaddr size;                /* Inclusive */
     IOMMUAccessFlags perm;
 } QEMU_PACKED DMAMap;
-typedef gboolean (*iova_tree_iterator)(DMAMap *map);
+typedef gboolean (*iova_tree_iterator)(DMAMap *map, gpointer data);
 
 /**
  * iova_tree_new:
@@ -129,12 +129,14 @@ const DMAMap *iova_tree_find_address(const IOVATree *tree, hwaddr iova);
  *
  * @tree: the iova tree to iterate on
  * @iterator: the interator for the mappings, return true to stop
+ * @data: data to be passed to the iterator
  *
  * Iterate over the iova tree.
  *
- * Return: 1 if found any overlap, 0 if not, <0 if error.
+ * Return: None.
  */
-void iova_tree_foreach(IOVATree *tree, iova_tree_iterator iterator);
+void iova_tree_foreach(IOVATree *tree, iova_tree_iterator iterator,
+                       gpointer data);
 
 /**
  * iova_tree_alloc_map:
diff --git a/util/iova-tree.c b/util/iova-tree.c
index 6141a6229b..9845427b86 100644
--- a/util/iova-tree.c
+++ b/util/iova-tree.c
@@ -42,6 +42,11 @@ typedef struct IOVATreeFindIOVAArgs {
     const DMAMap *result;
 } IOVATreeFindIOVAArgs;
 
+typedef struct IOVATreeIterator {
+    iova_tree_iterator fn;
+    gpointer data;
+} IOVATreeIterator;
+
 /**
  * Iterate args to the next hole
  *
@@ -151,17 +156,22 @@ int iova_tree_insert(IOVATree *tree, const DMAMap *map)
 static gboolean iova_tree_traverse(gpointer key, gpointer value,
                                 gpointer data)
 {
-    iova_tree_iterator iterator = data;
+    IOVATreeIterator *iterator = data;
     DMAMap *map = key;
 
     g_assert(key == value);
 
-    return iterator(map);
+    return iterator->fn(map, iterator->data);
 }
 
-void iova_tree_foreach(IOVATree *tree, iova_tree_iterator iterator)
+void iova_tree_foreach(IOVATree *tree, iova_tree_iterator iterator,
+                       gpointer data)
 {
-    g_tree_foreach(tree->tree, iova_tree_traverse, iterator);
+    IOVATreeIterator arg = {
+        .fn = iterator,
+        .data = data,
+    };
+    g_tree_foreach(tree->tree, iova_tree_traverse, &arg);
 }
 
 void iova_tree_remove(IOVATree *tree, DMAMap map)
-- 
2.26.3




* [PATCH 08/18] vfio/common: Record DMA mapped IOVA ranges
  2023-01-26 18:49 [PATCH 00/18] vfio: Add migration pre-copy support and device dirty tracking Avihai Horon
                   ` (6 preceding siblings ...)
  2023-01-26 18:49 ` [PATCH 07/18] util: Extend iova_tree_foreach() to take data argument Avihai Horon
@ 2023-01-26 18:49 ` Avihai Horon
  2023-01-27 21:42   ` Alex Williamson
  2023-01-26 18:49 ` [PATCH 09/18] vfio/common: Add device dirty page tracking start/stop Avihai Horon
                   ` (9 subsequent siblings)
  17 siblings, 1 reply; 42+ messages in thread
From: Avihai Horon @ 2023-01-26 18:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Avihai Horon,
	Kirti Wankhede, Tarun Gupta, Joao Martins

From: Joao Martins <joao.m.martins@oracle.com>

According to the device DMA logging uAPI, IOVA ranges to be logged by
the device must be provided all at once upon DMA logging start.

As preparation for the following patches, which will add device dirty
page tracking, keep a record of all DMA mapped IOVA ranges so they can
later be used when DMA logging is started.

Note that when a vIOMMU is enabled, DMA mapped IOVA ranges are not tracked.
This is due to the dynamic nature of vIOMMU DMA mapping/unmapping.
Following patches will address the vIOMMU case specifically.
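
One detail worth noting (also called out in comments in the diff): IOVATree
ranges are inclusive, so a mapping of 'size' bytes at 'iova' is stored as
[iova, iova + size - 1]. A condensed sketch of the record/erase pair:

    DMAMap map = {
        .iova = iova,
        .size = size - 1,   /* IOVATree is inclusive, subtract 1 from size */
        .perm = readonly ? IOMMU_RO : IOMMU_RW,
    };

    iova_tree_insert(container->mappings, &map);    /* on vfio_dma_map() */
    ...
    iova_tree_remove(container->mappings, map);     /* on vfio_dma_unmap() */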

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
 include/hw/vfio/vfio-common.h |  3 ++
 hw/vfio/common.c              | 86 +++++++++++++++++++++++++++++++++--
 2 files changed, 86 insertions(+), 3 deletions(-)

diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 88c2194fb9..d54000d7ae 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -23,6 +23,7 @@
 
 #include "exec/memory.h"
 #include "qemu/queue.h"
+#include "qemu/iova-tree.h"
 #include "qemu/notify.h"
 #include "ui/console.h"
 #include "hw/display/ramfb.h"
@@ -94,6 +95,8 @@ typedef struct VFIOContainer {
     uint64_t max_dirty_bitmap_size;
     unsigned long pgsizes;
     unsigned int dma_max_mappings;
+    IOVATree *mappings;
+    QemuMutex mappings_mutex;
     QLIST_HEAD(, VFIOGuestIOMMU) giommu_list;
     QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list;
     QLIST_HEAD(, VFIOGroup) group_list;
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index e554573eb5..fafc361cea 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -43,6 +43,7 @@
 #include "migration/misc.h"
 #include "migration/qemu-file.h"
 #include "sysemu/tpm.h"
+#include "qemu/iova-tree.h"
 
 VFIOGroupList vfio_group_list =
     QLIST_HEAD_INITIALIZER(vfio_group_list);
@@ -373,6 +374,11 @@ bool vfio_mig_active(void)
     return true;
 }
 
+static bool vfio_have_giommu(VFIOContainer *container)
+{
+    return !QLIST_EMPTY(&container->giommu_list);
+}
+
 static void vfio_set_migration_error(int err)
 {
     MigrationState *ms = migrate_get_current();
@@ -450,6 +456,51 @@ static bool vfio_devices_all_running_and_mig_active(VFIOContainer *container)
     return true;
 }
 
+static int vfio_record_mapping(VFIOContainer *container, hwaddr iova,
+                               hwaddr size, bool readonly)
+{
+    DMAMap map = {
+        .iova = iova,
+        .size = size - 1, /* IOVATree is inclusive, so subtract 1 from size */
+        .perm = readonly ? IOMMU_RO : IOMMU_RW,
+    };
+    int ret;
+
+    if (vfio_have_giommu(container)) {
+        return 0;
+    }
+
+    WITH_QEMU_LOCK_GUARD(&container->mappings_mutex) {
+        ret = iova_tree_insert(container->mappings, &map);
+        if (ret) {
+            if (ret == IOVA_ERR_INVALID) {
+                ret = -EINVAL;
+            } else if (ret == IOVA_ERR_OVERLAP) {
+                ret = -EEXIST;
+            }
+        }
+    }
+
+    return ret;
+}
+
+static void vfio_erase_mapping(VFIOContainer *container, hwaddr iova,
+                                hwaddr size)
+{
+    DMAMap map = {
+        .iova = iova,
+        .size = size - 1, /* IOVATree is inclusive, so subtract 1 from size */
+    };
+
+    if (vfio_have_giommu(container)) {
+        return;
+    }
+
+    WITH_QEMU_LOCK_GUARD(&container->mappings_mutex) {
+        iova_tree_remove(container->mappings, map);
+    }
+}
+
 static int vfio_dma_unmap_bitmap(VFIOContainer *container,
                                  hwaddr iova, ram_addr_t size,
                                  IOMMUTLBEntry *iotlb)
@@ -550,6 +601,8 @@ static int vfio_dma_unmap(VFIOContainer *container,
                                             DIRTY_CLIENTS_NOCODE);
     }
 
+    vfio_erase_mapping(container, iova, size);
+
     return 0;
 }
 
@@ -563,6 +616,16 @@ static int vfio_dma_map(VFIOContainer *container, hwaddr iova,
         .iova = iova,
         .size = size,
     };
+    int ret;
+
+    ret = vfio_record_mapping(container, iova, size, readonly);
+    if (ret) {
+        error_report("vfio: Failed to record mapping, iova: 0x%" HWADDR_PRIx
+                     ", size: 0x" RAM_ADDR_FMT ", ret: %d (%s)",
+                     iova, size, ret, strerror(-ret));
+
+        return ret;
+    }
 
     if (!readonly) {
         map.flags |= VFIO_DMA_MAP_FLAG_WRITE;
@@ -579,8 +642,12 @@ static int vfio_dma_map(VFIOContainer *container, hwaddr iova,
         return 0;
     }
 
+    ret = -errno;
     error_report("VFIO_MAP_DMA failed: %s", strerror(errno));
-    return -errno;
+
+    vfio_erase_mapping(container, iova, size);
+
+    return ret;
 }
 
 static void vfio_host_win_add(VFIOContainer *container,
@@ -2134,16 +2201,23 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
     QLIST_INIT(&container->giommu_list);
     QLIST_INIT(&container->hostwin_list);
     QLIST_INIT(&container->vrdl_list);
+    container->mappings = iova_tree_new();
+    if (!container->mappings) {
+        error_setg(errp, "Cannot allocate DMA mappings tree");
+        ret = -ENOMEM;
+        goto free_container_exit;
+    }
+    qemu_mutex_init(&container->mappings_mutex);
 
     ret = vfio_init_container(container, group->fd, errp);
     if (ret) {
-        goto free_container_exit;
+        goto destroy_mappings_exit;
     }
 
     ret = vfio_ram_block_discard_disable(container, true);
     if (ret) {
         error_setg_errno(errp, -ret, "Cannot set discarding of RAM broken");
-        goto free_container_exit;
+        goto destroy_mappings_exit;
     }
 
     switch (container->iommu_type) {
@@ -2279,6 +2353,10 @@ listener_release_exit:
 enable_discards_exit:
     vfio_ram_block_discard_disable(container, false);
 
+destroy_mappings_exit:
+    qemu_mutex_destroy(&container->mappings_mutex);
+    iova_tree_destroy(container->mappings);
+
 free_container_exit:
     g_free(container);
 
@@ -2333,6 +2411,8 @@ static void vfio_disconnect_container(VFIOGroup *group)
         }
 
         trace_vfio_disconnect_container(container->fd);
+        qemu_mutex_destroy(&container->mappings_mutex);
+        iova_tree_destroy(container->mappings);
         close(container->fd);
         g_free(container);
 
-- 
2.26.3




* [PATCH 09/18] vfio/common: Add device dirty page tracking start/stop
  2023-01-26 18:49 [PATCH 00/18] vfio: Add migration pre-copy support and device dirty tracking Avihai Horon
                   ` (7 preceding siblings ...)
  2023-01-26 18:49 ` [PATCH 08/18] vfio/common: Record DMA mapped IOVA ranges Avihai Horon
@ 2023-01-26 18:49 ` Avihai Horon
  2023-01-26 18:49 ` [PATCH 10/18] vfio/common: Extract code from vfio_get_dirty_bitmap() to new function Avihai Horon
                   ` (8 subsequent siblings)
  17 siblings, 0 replies; 42+ messages in thread
From: Avihai Horon @ 2023-01-26 18:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Avihai Horon,
	Kirti Wankhede, Tarun Gupta, Joao Martins

From: Joao Martins <joao.m.martins@oracle.com>

Add device dirty page tracking start/stop functionality. This uses the
device DMA logging uAPI to start and stop dirty page tracking by the
device.

Device dirty page tracking is used only if all devices within a
container support it.
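
Since the uAPI requires all ranges to be provided in one call, the start
path assembles a single control payload from the mappings recorded in the
previous patches. A condensed sketch (error handling and the vIOMMU case
omitted; see the diff below and its continuation for the real code):

    unsigned int nr = iova_tree_nnodes(container->mappings);
    struct vfio_device_feature_dma_logging_range *ranges =
        g_new0(struct vfio_device_feature_dma_logging_range, nr);
    struct vfio_device_feature_dma_logging_range *cursor = ranges;
    struct vfio_device_feature_dma_logging_control control = {
        .page_size = qemu_real_host_page_size(),
        .num_ranges = nr,
        .ranges = (uintptr_t)ranges,
    };

    /* One entry per recorded mapping. */
    iova_tree_foreach(container->mappings, vfio_device_dma_logging_range_add,
                      &cursor);

    /* Wrap 'control' in a vfio_device_feature with VFIO_DEVICE_FEATURE_SET |
     * VFIO_DEVICE_FEATURE_DMA_LOGGING_START and issue VFIO_DEVICE_FEATURE on
     * every device, as vfio_devices_dma_logging_set() does. */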

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
 include/hw/vfio/vfio-common.h |   2 +
 hw/vfio/common.c              | 211 +++++++++++++++++++++++++++++++++-
 2 files changed, 211 insertions(+), 2 deletions(-)

diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index d54000d7ae..cde6ffb9d6 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -151,6 +151,8 @@ typedef struct VFIODevice {
     VFIOMigration *migration;
     Error *migration_blocker;
     OnOffAuto pre_copy_dirty_page_tracking;
+    bool dirty_pages_supported;
+    bool dirty_tracking;
 } VFIODevice;
 
 struct VFIODeviceOps {
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index fafc361cea..005c060c67 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -422,6 +422,22 @@ static bool vfio_devices_all_dirty_tracking(VFIOContainer *container)
     return true;
 }
 
+static bool vfio_devices_all_device_dirty_tracking(VFIOContainer *container)
+{
+    VFIOGroup *group;
+    VFIODevice *vbasedev;
+
+    QLIST_FOREACH(group, &container->group_list, container_next) {
+        QLIST_FOREACH(vbasedev, &group->device_list, next) {
+            if (!vbasedev->dirty_pages_supported) {
+                return false;
+            }
+        }
+    }
+
+    return true;
+}
+
 /*
  * Check if all VFIO devices are running and migration is active, which is
  * essentially equivalent to the migration being in pre-copy phase.
@@ -1355,13 +1371,192 @@ static int vfio_set_dirty_page_tracking(VFIOContainer *container, bool start)
     return ret;
 }
 
+static int vfio_devices_dma_logging_set(VFIOContainer *container,
+                                        struct vfio_device_feature *feature)
+{
+    bool status = (feature->flags & VFIO_DEVICE_FEATURE_MASK) ==
+                  VFIO_DEVICE_FEATURE_DMA_LOGGING_START;
+    VFIODevice *vbasedev;
+    VFIOGroup *group;
+    int ret = 0;
+
+    QLIST_FOREACH(group, &container->group_list, container_next) {
+        QLIST_FOREACH(vbasedev, &group->device_list, next) {
+            if (vbasedev->dirty_tracking == status) {
+                continue;
+            }
+
+            ret = ioctl(vbasedev->fd, VFIO_DEVICE_FEATURE, feature);
+            if (ret) {
+                ret = -errno;
+                error_report("%s: Failed to set DMA logging %s, err %d (%s)",
+                             vbasedev->name, status ? "start" : "stop", ret,
+                             strerror(errno));
+                goto out;
+            }
+            vbasedev->dirty_tracking = status;
+        }
+    }
+
+out:
+    return ret;
+}
+
+static int vfio_devices_dma_logging_stop(VFIOContainer *container)
+{
+    uint64_t buf[DIV_ROUND_UP(sizeof(struct vfio_device_feature),
+                              sizeof(uint64_t))] = {};
+    struct vfio_device_feature *feature = (struct vfio_device_feature *)buf;
+
+    feature->argsz = sizeof(buf);
+    feature->flags = VFIO_DEVICE_FEATURE_SET;
+    feature->flags |= VFIO_DEVICE_FEATURE_DMA_LOGGING_STOP;
+
+    return vfio_devices_dma_logging_set(container, feature);
+}
+
+static gboolean vfio_device_dma_logging_range_add(DMAMap *map, gpointer data)
+{
+    struct vfio_device_feature_dma_logging_range **out = data;
+    struct vfio_device_feature_dma_logging_range *range = *out;
+
+    range->iova = map->iova;
+    /* IOVATree is inclusive, DMA logging uAPI isn't, so add 1 to length */
+    range->length = map->size + 1;
+
+    *out = ++range;
+
+    return false;
+}
+
+static gboolean vfio_iova_tree_get_first(DMAMap *map, gpointer data)
+{
+    DMAMap *first = data;
+
+    first->iova = map->iova;
+    first->size = map->size;
+
+    return true;
+}
+
+static gboolean vfio_iova_tree_get_last(DMAMap *map, gpointer data)
+{
+    DMAMap *last = data;
+
+    last->iova = map->iova;
+    last->size = map->size;
+
+    return false;
+}
+
+static struct vfio_device_feature *
+vfio_device_feature_dma_logging_start_create(VFIOContainer *container)
+{
+    struct vfio_device_feature *feature;
+    size_t feature_size;
+    struct vfio_device_feature_dma_logging_control *control;
+    struct vfio_device_feature_dma_logging_range *ranges;
+    unsigned int max_ranges;
+    unsigned int cur_ranges;
+
+    feature_size = sizeof(struct vfio_device_feature) +
+                   sizeof(struct vfio_device_feature_dma_logging_control);
+    feature = g_malloc0(feature_size);
+    feature->argsz = feature_size;
+    feature->flags = VFIO_DEVICE_FEATURE_SET;
+    feature->flags |= VFIO_DEVICE_FEATURE_DMA_LOGGING_START;
+
+    control = (struct vfio_device_feature_dma_logging_control *)feature->data;
+    control->page_size = qemu_real_host_page_size();
+
+    QEMU_LOCK_GUARD(&container->mappings_mutex);
+
+    /*
+     * The DMA logging uAPI guarantees to support at least a number of ranges
+     * that fits into a single host kernel page. To be on the safe side, use
+     * this as a limit above which the mappings are merged into a single range.
+     */
+    max_ranges = qemu_real_host_page_size() / sizeof(*ranges);
+    cur_ranges = iova_tree_nnodes(container->mappings);
+    control->num_ranges = (cur_ranges <= max_ranges) ? cur_ranges : 1;
+    ranges = g_try_new0(struct vfio_device_feature_dma_logging_range,
+                        control->num_ranges);
+    if (!ranges) {
+        g_free(feature);
+        errno = ENOMEM;
+
+        return NULL;
+    }
+
+    control->ranges = (uint64_t)ranges;
+    if (cur_ranges <= max_ranges) {
+        iova_tree_foreach(container->mappings,
+                          vfio_device_dma_logging_range_add, &ranges);
+    } else {
+        DMAMap first, last;
+
+        iova_tree_foreach(container->mappings, vfio_iova_tree_get_first,
+                          &first);
+        iova_tree_foreach(container->mappings, vfio_iova_tree_get_last, &last);
+        ranges->iova = first.iova;
+        /* IOVATree is inclusive, DMA logging uAPI isn't, so add 1 to length */
+        ranges->length = (last.iova - first.iova) + last.size + 1;
+    }
+
+    return feature;
+}
+
+static void vfio_device_feature_dma_logging_start_destroy(
+    struct vfio_device_feature *feature)
+{
+    struct vfio_device_feature_dma_logging_control *control =
+        (struct vfio_device_feature_dma_logging_control *)feature->data;
+    struct vfio_device_feature_dma_logging_range *ranges =
+        (struct vfio_device_feature_dma_logging_range *)control->ranges;
+
+    g_free(ranges);
+    g_free(feature);
+}
+
+static int vfio_devices_dma_logging_start(VFIOContainer *container)
+{
+    struct vfio_device_feature *feature;
+    int ret;
+
+    feature = vfio_device_feature_dma_logging_start_create(container);
+    if (!feature) {
+        return -errno;
+    }
+
+    ret = vfio_devices_dma_logging_set(container, feature);
+    if (ret) {
+        vfio_devices_dma_logging_stop(container);
+    }
+
+    vfio_device_feature_dma_logging_start_destroy(feature);
+
+    return ret;
+}
+
 static void vfio_listener_log_global_start(MemoryListener *listener)
 {
     VFIOContainer *container = container_of(listener, VFIOContainer, listener);
     int ret;
 
-    ret = vfio_set_dirty_page_tracking(container, true);
+    if (vfio_devices_all_device_dirty_tracking(container)) {
+        if (vfio_have_giommu(container)) {
+            /* Device dirty page tracking currently doesn't support vIOMMU */
+            return;
+        }
+
+        ret = vfio_devices_dma_logging_start(container);
+    } else {
+        ret = vfio_set_dirty_page_tracking(container, true);
+    }
+
     if (ret) {
+        error_report("vfio: Could not start dirty page tracking, err: %d (%s)",
+                     ret, strerror(-ret));
         vfio_set_migration_error(ret);
     }
 }
@@ -1371,8 +1566,20 @@ static void vfio_listener_log_global_stop(MemoryListener *listener)
     VFIOContainer *container = container_of(listener, VFIOContainer, listener);
     int ret;
 
-    ret = vfio_set_dirty_page_tracking(container, false);
+    if (vfio_devices_all_device_dirty_tracking(container)) {
+        if (vfio_have_giommu(container)) {
+            /* Device dirty page tracking currently doesn't support vIOMMU */
+            return;
+        }
+
+        ret = vfio_devices_dma_logging_stop(container);
+    } else {
+        ret = vfio_set_dirty_page_tracking(container, false);
+    }
+
     if (ret) {
+        error_report("vfio: Could not stop dirty page tracking, err: %d (%s)",
+                     ret, strerror(-ret));
         vfio_set_migration_error(ret);
     }
 }
-- 
2.26.3



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 10/18] vfio/common: Extract code from vfio_get_dirty_bitmap() to new function
  2023-01-26 18:49 [PATCH 00/18] vfio: Add migration pre-copy support and device dirty tracking Avihai Horon
                   ` (8 preceding siblings ...)
  2023-01-26 18:49 ` [PATCH 09/18] vfio/common: Add device dirty page tracking start/stop Avihai Horon
@ 2023-01-26 18:49 ` Avihai Horon
  2023-01-26 18:49 ` [PATCH 11/18] vfio/common: Add device dirty page bitmap sync Avihai Horon
                   ` (7 subsequent siblings)
  17 siblings, 0 replies; 42+ messages in thread
From: Avihai Horon @ 2023-01-26 18:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Avihai Horon,
	Kirti Wankhede, Tarun Gupta, Joao Martins

Extract the VFIO_IOMMU_DIRTY_PAGES ioctl code in vfio_get_dirty_bitmap()
to its own function.

This will help make the code more readable after the next patch adds
device dirty page bitmap sync functionality.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
 hw/vfio/common.c | 53 ++++++++++++++++++++++++++++++------------------
 1 file changed, 33 insertions(+), 20 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 005c060c67..3caa73d6f7 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1584,26 +1584,13 @@ static void vfio_listener_log_global_stop(MemoryListener *listener)
     }
 }
 
-static int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
-                                 uint64_t size, ram_addr_t ram_addr)
+static int vfio_query_dirty_bitmap(VFIOContainer *container, VFIOBitmap *vbmap,
+                                   hwaddr iova, hwaddr size)
 {
     struct vfio_iommu_type1_dirty_bitmap *dbitmap;
     struct vfio_iommu_type1_dirty_bitmap_get *range;
-    VFIOBitmap *vbmap;
     int ret;
 
-    if (!container->dirty_pages_supported) {
-        cpu_physical_memory_set_dirty_range(ram_addr, size,
-                                            tcg_enabled() ? DIRTY_CLIENTS_ALL :
-                                            DIRTY_CLIENTS_NOCODE);
-        return 0;
-    }
-
-    vbmap = vfio_bitmap_alloc(size);
-    if (!vbmap) {
-        return -errno;
-    }
-
     dbitmap = g_malloc0(sizeof(*dbitmap) + sizeof(*range));
 
     dbitmap->argsz = sizeof(*dbitmap) + sizeof(*range);
@@ -1627,16 +1614,42 @@ static int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
         error_report("Failed to get dirty bitmap for iova: 0x%"PRIx64
                 " size: 0x%"PRIx64" err: %d", (uint64_t)range->iova,
                 (uint64_t)range->size, errno);
-        goto err_out;
+    }
+
+    g_free(dbitmap);
+
+    return ret;
+}
+
+static int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
+                                 uint64_t size, ram_addr_t ram_addr)
+{
+    VFIOBitmap *vbmap;
+    int ret;
+
+    if (!container->dirty_pages_supported) {
+        cpu_physical_memory_set_dirty_range(ram_addr, size,
+                                            tcg_enabled() ? DIRTY_CLIENTS_ALL :
+                                            DIRTY_CLIENTS_NOCODE);
+        return 0;
+    }
+
+    vbmap = vfio_bitmap_alloc(size);
+    if (!vbmap) {
+        return -errno;
+    }
+
+    ret = vfio_query_dirty_bitmap(container, vbmap, iova, size);
+    if (ret) {
+        goto out;
     }
 
     cpu_physical_memory_set_dirty_lebitmap(vbmap->bitmap, ram_addr,
                                            vbmap->pages);
 
-    trace_vfio_get_dirty_bitmap(container->fd, range->iova, range->size,
-                                range->bitmap.size, ram_addr);
-err_out:
-    g_free(dbitmap);
+    trace_vfio_get_dirty_bitmap(container->fd, iova, size, vbmap->size,
+                                ram_addr);
+out:
     vfio_bitmap_dealloc(vbmap);
 
     return ret;
-- 
2.26.3



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 11/18] vfio/common: Add device dirty page bitmap sync
  2023-01-26 18:49 [PATCH 00/18] vfio: Add migration pre-copy support and device dirty tracking Avihai Horon
                   ` (9 preceding siblings ...)
  2023-01-26 18:49 ` [PATCH 10/18] vfio/common: Extract code from vfio_get_dirty_bitmap() to new function Avihai Horon
@ 2023-01-26 18:49 ` Avihai Horon
  2023-01-27 23:37   ` Alex Williamson
  2023-01-26 18:49 ` [PATCH 12/18] vfio/common: Extract vIOMMU code from vfio_sync_dirty_bitmap() Avihai Horon
                   ` (6 subsequent siblings)
  17 siblings, 1 reply; 42+ messages in thread
From: Avihai Horon @ 2023-01-26 18:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Avihai Horon,
	Kirti Wankhede, Tarun Gupta, Joao Martins

From: Joao Martins <joao.m.martins@oracle.com>

Add device dirty page bitmap sync functionality. This uses the device
DMA logging uAPI to sync the dirty page bitmap from the device.

Device dirty page bitmap sync is used only if all devices within a
container support device dirty page tracking.
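
For reference, the consumer side of the report uAPI is roughly as sketched
below (illustrative name, single device fd, single range; QEMU's DIV_ROUND_UP
and glib helpers assumed). The caller allocates a bitmap with one bit per
page_size page, rounded up to 64-bit words, and passes its address to the
VFIO_DEVICE_FEATURE_DMA_LOGGING_REPORT feature:

    /*
     * Caller allocates the bitmap, e.g.:
     *   pages  = DIV_ROUND_UP(length, page_size);
     *   bitmap = g_malloc0(DIV_ROUND_UP(pages, 64) * sizeof(uint64_t));
     */
    static int dma_logging_report_one_range(int device_fd, uint64_t iova,
                                            uint64_t length, uint64_t page_size,
                                            uint64_t *bitmap)
    {
        uint64_t buf[DIV_ROUND_UP(sizeof(struct vfio_device_feature) +
                          sizeof(struct vfio_device_feature_dma_logging_report),
                          sizeof(uint64_t))] = {};
        struct vfio_device_feature *feature = (struct vfio_device_feature *)buf;
        struct vfio_device_feature_dma_logging_report *report =
            (struct vfio_device_feature_dma_logging_report *)feature->data;

        report->iova = iova;
        report->length = length;
        report->page_size = page_size;
        report->bitmap = (uintptr_t)bitmap;

        feature->argsz = sizeof(buf);
        feature->flags = VFIO_DEVICE_FEATURE_GET |
                         VFIO_DEVICE_FEATURE_DMA_LOGGING_REPORT;

        /* On success, bitmap has one set bit per page the device wrote to */
        return ioctl(device_fd, VFIO_DEVICE_FEATURE, feature) ? -errno : 0;
    }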

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
 hw/vfio/common.c | 93 ++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 82 insertions(+), 11 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 3caa73d6f7..0003f2421d 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -355,6 +355,9 @@ static void vfio_bitmap_dealloc(VFIOBitmap *vbmap)
     g_free(vbmap);
 }
 
+static int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
+                                 uint64_t size, ram_addr_t ram_addr);
+
 bool vfio_mig_active(void)
 {
     VFIOGroup *group;
@@ -582,10 +585,19 @@ static int vfio_dma_unmap(VFIOContainer *container,
         .iova = iova,
         .size = size,
     };
+    int ret;
 
-    if (iotlb && container->dirty_pages_supported &&
-        vfio_devices_all_running_and_mig_active(container)) {
-        return vfio_dma_unmap_bitmap(container, iova, size, iotlb);
+    if (iotlb && vfio_devices_all_running_and_mig_active(container)) {
+        if (!vfio_devices_all_device_dirty_tracking(container) &&
+            container->dirty_pages_supported) {
+            return vfio_dma_unmap_bitmap(container, iova, size, iotlb);
+        }
+
+        ret = vfio_get_dirty_bitmap(container, iova, size,
+                                    iotlb->translated_addr);
+        if (ret) {
+            return ret;
+        }
     }
 
     while (ioctl(container->fd, VFIO_IOMMU_UNMAP_DMA, &unmap)) {
@@ -611,12 +623,6 @@ static int vfio_dma_unmap(VFIOContainer *container,
         return -errno;
     }
 
-    if (iotlb && vfio_devices_all_running_and_mig_active(container)) {
-        cpu_physical_memory_set_dirty_range(iotlb->translated_addr, size,
-                                            tcg_enabled() ? DIRTY_CLIENTS_ALL :
-                                            DIRTY_CLIENTS_NOCODE);
-    }
-
     vfio_erase_mapping(container, iova, size);
 
     return 0;
@@ -1584,6 +1590,65 @@ static void vfio_listener_log_global_stop(MemoryListener *listener)
     }
 }
 
+static int vfio_device_dma_logging_report(VFIODevice *vbasedev, hwaddr iova,
+                                          hwaddr size, void *bitmap)
+{
+    uint64_t buf[DIV_ROUND_UP(sizeof(struct vfio_device_feature) +
+                        sizeof(struct vfio_device_feature_dma_logging_report),
+                        sizeof(uint64_t))] = {};
+    struct vfio_device_feature *feature = (struct vfio_device_feature *)buf;
+    struct vfio_device_feature_dma_logging_report *report =
+        (struct vfio_device_feature_dma_logging_report *)feature->data;
+
+    report->iova = iova;
+    report->length = size;
+    report->page_size = qemu_real_host_page_size();
+    report->bitmap = (uint64_t)bitmap;
+
+    feature->argsz = sizeof(buf);
+    feature->flags =
+        VFIO_DEVICE_FEATURE_GET | VFIO_DEVICE_FEATURE_DMA_LOGGING_REPORT;
+
+    if (ioctl(vbasedev->fd, VFIO_DEVICE_FEATURE, feature)) {
+        return -errno;
+    }
+
+    return 0;
+}
+
+static int vfio_devices_query_dirty_bitmap(VFIOContainer *container,
+                                           VFIOBitmap *vbmap, hwaddr iova,
+                                           hwaddr size)
+{
+    VFIODevice *vbasedev;
+    VFIOGroup *group;
+    int ret;
+
+    if (vfio_have_giommu(container)) {
+        /* Device dirty page tracking currently doesn't support vIOMMU */
+        bitmap_set(vbmap->bitmap, 0, vbmap->pages);
+
+        return 0;
+    }
+
+    QLIST_FOREACH(group, &container->group_list, container_next) {
+        QLIST_FOREACH(vbasedev, &group->device_list, next) {
+            ret = vfio_device_dma_logging_report(vbasedev, iova, size,
+                                                 vbmap->bitmap);
+            if (ret) {
+                error_report("%s: Failed to get DMA logging report, iova: "
+                             "0x%" HWADDR_PRIx ", size: 0x%" HWADDR_PRIx
+                             ", err: %d (%s)",
+                             vbasedev->name, iova, size, ret, strerror(-ret));
+
+                return ret;
+            }
+        }
+    }
+
+    return 0;
+}
+
 static int vfio_query_dirty_bitmap(VFIOContainer *container, VFIOBitmap *vbmap,
                                    hwaddr iova, hwaddr size)
 {
@@ -1627,7 +1692,8 @@ static int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
     VFIOBitmap *vbmap;
     int ret;
 
-    if (!container->dirty_pages_supported) {
+    if (!container->dirty_pages_supported &&
+        !vfio_devices_all_device_dirty_tracking(container)) {
         cpu_physical_memory_set_dirty_range(ram_addr, size,
                                             tcg_enabled() ? DIRTY_CLIENTS_ALL :
                                             DIRTY_CLIENTS_NOCODE);
@@ -1639,7 +1705,12 @@ static int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
         return -errno;
     }
 
-    ret = vfio_query_dirty_bitmap(container, vbmap, iova, size);
+    if (vfio_devices_all_device_dirty_tracking(container)) {
+        ret = vfio_devices_query_dirty_bitmap(container, vbmap, iova, size);
+    } else {
+        ret = vfio_query_dirty_bitmap(container, vbmap, iova, size);
+    }
+
     if (ret) {
         goto out;
     }
-- 
2.26.3



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 12/18] vfio/common: Extract vIOMMU code from vfio_sync_dirty_bitmap()
  2023-01-26 18:49 [PATCH 00/18] vfio: Add migration pre-copy support and device dirty tracking Avihai Horon
                   ` (10 preceding siblings ...)
  2023-01-26 18:49 ` [PATCH 11/18] vfio/common: Add device dirty page bitmap sync Avihai Horon
@ 2023-01-26 18:49 ` Avihai Horon
  2023-01-26 18:49 ` [PATCH 13/18] memory/iommu: Add IOMMU_ATTR_MAX_IOVA attribute Avihai Horon
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 42+ messages in thread
From: Avihai Horon @ 2023-01-26 18:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Avihai Horon,
	Kirti Wankhede, Tarun Gupta, Joao Martins

Extract vIOMMU code from vfio_sync_dirty_bitmap() to a new function and
restructure the code.

This is done as preparation for the following patches which will add
vIOMMU support to device dirty page tracking. No functional changes
intended.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
 hw/vfio/common.c | 63 +++++++++++++++++++++++++++++-------------------
 1 file changed, 38 insertions(+), 25 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 0003f2421d..9792c2c935 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1811,37 +1811,50 @@ static int vfio_sync_ram_discard_listener_dirty_bitmap(VFIOContainer *container,
                                                 &vrdl);
 }
 
+static int vfio_sync_iommu_dirty_bitmap(VFIOContainer *container,
+                                        MemoryRegionSection *section)
+{
+    VFIOGuestIOMMU *giommu;
+    bool found = false;
+    Int128 llend;
+    vfio_giommu_dirty_notifier gdn;
+    int idx;
+
+    QLIST_FOREACH(giommu, &container->giommu_list, giommu_next) {
+        if (MEMORY_REGION(giommu->iommu_mr) == section->mr &&
+            giommu->n.start == section->offset_within_region) {
+            found = true;
+            break;
+        }
+    }
+
+    if (!found) {
+        return 0;
+    }
+
+    gdn.giommu = giommu;
+    idx = memory_region_iommu_attrs_to_index(giommu->iommu_mr,
+                                             MEMTXATTRS_UNSPECIFIED);
+
+    llend = int128_add(int128_make64(section->offset_within_region),
+                       section->size);
+    llend = int128_sub(llend, int128_one());
+
+    iommu_notifier_init(&gdn.n, vfio_iommu_map_dirty_notify, IOMMU_NOTIFIER_MAP,
+                        section->offset_within_region, int128_get64(llend),
+                        idx);
+    memory_region_iommu_replay(giommu->iommu_mr, &gdn.n);
+
+    return 0;
+}
+
 static int vfio_sync_dirty_bitmap(VFIOContainer *container,
                                   MemoryRegionSection *section)
 {
     ram_addr_t ram_addr;
 
     if (memory_region_is_iommu(section->mr)) {
-        VFIOGuestIOMMU *giommu;
-
-        QLIST_FOREACH(giommu, &container->giommu_list, giommu_next) {
-            if (MEMORY_REGION(giommu->iommu_mr) == section->mr &&
-                giommu->n.start == section->offset_within_region) {
-                Int128 llend;
-                vfio_giommu_dirty_notifier gdn = { .giommu = giommu };
-                int idx = memory_region_iommu_attrs_to_index(giommu->iommu_mr,
-                                                       MEMTXATTRS_UNSPECIFIED);
-
-                llend = int128_add(int128_make64(section->offset_within_region),
-                                   section->size);
-                llend = int128_sub(llend, int128_one());
-
-                iommu_notifier_init(&gdn.n,
-                                    vfio_iommu_map_dirty_notify,
-                                    IOMMU_NOTIFIER_MAP,
-                                    section->offset_within_region,
-                                    int128_get64(llend),
-                                    idx);
-                memory_region_iommu_replay(giommu->iommu_mr, &gdn.n);
-                break;
-            }
-        }
-        return 0;
+        return vfio_sync_iommu_dirty_bitmap(container, section);
     } else if (memory_region_has_ram_discard_manager(section->mr)) {
         return vfio_sync_ram_discard_listener_dirty_bitmap(container, section);
     }
-- 
2.26.3



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 13/18] memory/iommu: Add IOMMU_ATTR_MAX_IOVA attribute
  2023-01-26 18:49 [PATCH 00/18] vfio: Add migration pre-copy support and device dirty tracking Avihai Horon
                   ` (11 preceding siblings ...)
  2023-01-26 18:49 ` [PATCH 12/18] vfio/common: Extract vIOMMU code from vfio_sync_dirty_bitmap() Avihai Horon
@ 2023-01-26 18:49 ` Avihai Horon
  2023-02-09 22:16   ` Peter Xu
  2023-01-26 18:49 ` [PATCH 14/18] intel-iommu: Implement get_attr() method Avihai Horon
                   ` (4 subsequent siblings)
  17 siblings, 1 reply; 42+ messages in thread
From: Avihai Horon @ 2023-01-26 18:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Avihai Horon,
	Kirti Wankhede, Tarun Gupta, Joao Martins

Add a new IOMMU attribute IOMMU_ATTR_MAX_IOVA which indicates the
maximal IOVA that an IOMMU can use.

This attribute will be used by VFIO device dirty page tracking so it can
track the entire IOVA space when needed (i.e. when vIOMMU is enabled).
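
For illustration, a consumer can query the attribute through the existing
memory_region_iommu_get_attr() wrapper, roughly as follows (variable names are
illustrative; the actual VFIO caller is added in a later patch):

    hwaddr max_iova;

    if (!memory_region_iommu_get_attr(iommu_mr, IOMMU_ATTR_MAX_IOVA,
                                      &max_iova)) {
        /* max_iova is the highest IOVA the vIOMMU can map, inclusive */
    }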

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
 include/exec/memory.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index c37ffdbcd1..910067a3a5 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -290,7 +290,8 @@ typedef struct MemoryRegionClass {
 
 
 enum IOMMUMemoryRegionAttr {
-    IOMMU_ATTR_SPAPR_TCE_FD
+    IOMMU_ATTR_SPAPR_TCE_FD,
+    IOMMU_ATTR_MAX_IOVA,
 };
 
 /*
-- 
2.26.3



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 14/18] intel-iommu: Implement get_attr() method
  2023-01-26 18:49 [PATCH 00/18] vfio: Add migration pre-copy support and device dirty tracking Avihai Horon
                   ` (12 preceding siblings ...)
  2023-01-26 18:49 ` [PATCH 13/18] memory/iommu: Add IOMMU_ATTR_MAX_IOVA attribute Avihai Horon
@ 2023-01-26 18:49 ` Avihai Horon
  2023-02-09 22:18   ` Peter Xu
  2023-01-26 18:49 ` [PATCH 15/18] vfio/common: Support device dirty page tracking with vIOMMU Avihai Horon
                   ` (3 subsequent siblings)
  17 siblings, 1 reply; 42+ messages in thread
From: Avihai Horon @ 2023-01-26 18:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Avihai Horon,
	Kirti Wankhede, Tarun Gupta, Joao Martins

Implement the get_attr() method and use the address width property to report
the IOMMU_ATTR_MAX_IOVA attribute.
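
For reference, with the formula used below this works out to, e.g.:

    aw_bits = 39  ->  max_iova = (1ULL << 39) - 1 = 0x7fffffffff    (512 GiB - 1)
    aw_bits = 48  ->  max_iova = (1ULL << 48) - 1 = 0xffffffffffff  (256 TiB - 1)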

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
 hw/i386/intel_iommu.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 98a5c304a7..b0068b0df4 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -3841,6 +3841,23 @@ static void vtd_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n)
     return;
 }
 
+static int vtd_iommu_get_attr(IOMMUMemoryRegion *iommu_mr,
+                              enum IOMMUMemoryRegionAttr attr, void *data)
+{
+    VTDAddressSpace *vtd_as = container_of(iommu_mr, VTDAddressSpace, iommu);
+    IntelIOMMUState *s = vtd_as->iommu_state;
+
+    if (attr == IOMMU_ATTR_MAX_IOVA) {
+        hwaddr *max_iova = data;
+
+        *max_iova = (1ULL << s->aw_bits) - 1;
+
+        return 0;
+    }
+
+    return -EINVAL;
+}
+
 /* Do the initialization. It will also be called when reset, so pay
  * attention when adding new initialization stuff.
  */
@@ -4173,6 +4190,7 @@ static void vtd_iommu_memory_region_class_init(ObjectClass *klass,
     imrc->translate = vtd_iommu_translate;
     imrc->notify_flag_changed = vtd_iommu_notify_flag_changed;
     imrc->replay = vtd_iommu_replay;
+    imrc->get_attr = vtd_iommu_get_attr;
 }
 
 static const TypeInfo vtd_iommu_memory_region_info = {
-- 
2.26.3



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 15/18] vfio/common: Support device dirty page tracking with vIOMMU
  2023-01-26 18:49 [PATCH 00/18] vfio: Add migration pre-copy support and device dirty tracking Avihai Horon
                   ` (13 preceding siblings ...)
  2023-01-26 18:49 ` [PATCH 14/18] intel-iommu: Implement get_attr() method Avihai Horon
@ 2023-01-26 18:49 ` Avihai Horon
  2023-01-26 18:49 ` [PATCH 16/18] vfio/common: Optimize " Avihai Horon
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 42+ messages in thread
From: Avihai Horon @ 2023-01-26 18:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Avihai Horon,
	Kirti Wankhede, Tarun Gupta, Joao Martins

Currently, device dirty page tracking with vIOMMU is not supported - RAM
pages are perpetually marked dirty in this case.

When vIOMMU is used, IOVA ranges are DMA mapped/unmapped on the fly as
the vIOMMU maps/unmaps them. These IOVA ranges can potentially be mapped
anywhere in the vIOMMU IOVA space.

Due to this dynamic nature of vIOMMU mapping/unmapping, tracking only
the currently mapped IOVA ranges, as done in the non-vIOMMU case,
doesn't work very well.

Instead, to support device dirty tracking when vIOMMU is enabled, track
the entire vIOMMU IOVA space. If that fails (IOVA space can be rather
big and we might hit a HW limitation), try tracking a smaller range while
marking untracked ranges dirty.
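
As a worked example of the retry scheme (assuming a 39-bit vIOMMU, i.e. max
IOVA 0x7fffffffff, 16 GiB of guest RAM and 4 KiB host pages):

    attempt 1: entire vIOMMU IOVA space                -> 0x8000000000 (512 GiB)
    attempt 2: min(VFIO_GIOMMU_RETRY_IOVA, space / 2)  -> 0x4000000000 (256 GiB)
    attempt 3: min(RAM size, attempt 2 size / 2)       -> 0x0400000000 (16 GiB)

If all three attempts fail, starting dirty tracking fails and the error is
propagated to the caller.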

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
 include/hw/vfio/vfio-common.h |   2 +
 hw/vfio/common.c              | 153 ++++++++++++++++++++++++++++++----
 2 files changed, 138 insertions(+), 17 deletions(-)

diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index cde6ffb9d6..15109c311d 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -97,6 +97,8 @@ typedef struct VFIOContainer {
     unsigned int dma_max_mappings;
     IOVATree *mappings;
     QemuMutex mappings_mutex;
+    /* The vIOMMU IOVA range tracked by the device: [0, giommu_tracked_range) */
+    hwaddr giommu_tracked_range;
     QLIST_HEAD(, VFIOGuestIOMMU) giommu_list;
     QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list;
     QLIST_HEAD(, VFIOGroup) group_list;
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 9792c2c935..c3a27cbbd5 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -44,6 +44,8 @@
 #include "migration/qemu-file.h"
 #include "sysemu/tpm.h"
 #include "qemu/iova-tree.h"
+#include "hw/boards.h"
+#include "hw/mem/memory-device.h"
 
 VFIOGroupList vfio_group_list =
     QLIST_HEAD_INITIALIZER(vfio_group_list);
@@ -377,6 +379,38 @@ bool vfio_mig_active(void)
     return true;
 }
 
+static uint64_t vfio_get_ram_size(void)
+{
+    MachineState *ms = MACHINE(qdev_get_machine());
+    uint64_t plugged_size;
+
+    plugged_size = get_plugged_memory_size();
+    if (plugged_size == (uint64_t)-1) {
+        plugged_size = 0;
+    }
+
+    return ms->ram_size + plugged_size;
+}
+
+static int vfio_giommu_get_max_iova(VFIOContainer *container, hwaddr *max_iova)
+{
+    VFIOGuestIOMMU *giommu;
+    int ret;
+
+    giommu = QLIST_FIRST(&container->giommu_list);
+    if (!giommu) {
+        return -ENOENT;
+    }
+
+    ret = memory_region_iommu_get_attr(giommu->iommu_mr, IOMMU_ATTR_MAX_IOVA,
+                                       max_iova);
+    if (ret) {
+        return ret;
+    }
+
+    return 0;
+}
+
 static bool vfio_have_giommu(VFIOContainer *container)
 {
     return !QLIST_EMPTY(&container->giommu_list);
@@ -1456,7 +1490,8 @@ static gboolean vfio_iova_tree_get_last(DMAMap *map, gpointer data)
 }
 
 static struct vfio_device_feature *
-vfio_device_feature_dma_logging_start_create(VFIOContainer *container)
+vfio_device_feature_dma_logging_start_create(VFIOContainer *container,
+                                             bool giommu)
 {
     struct vfio_device_feature *feature;
     size_t feature_size;
@@ -1475,6 +1510,16 @@ vfio_device_feature_dma_logging_start_create(VFIOContainer *container)
     control = (struct vfio_device_feature_dma_logging_control *)feature->data;
     control->page_size = qemu_real_host_page_size();
 
+    if (giommu) {
+        ranges = g_malloc0(sizeof(*ranges));
+        ranges->iova = 0;
+        ranges->length = container->giommu_tracked_range;
+        control->num_ranges = 1;
+        control->ranges = (uint64_t)ranges;
+
+        return feature;
+    }
+
     QEMU_LOCK_GUARD(&container->mappings_mutex);
 
     /*
@@ -1524,12 +1569,12 @@ static void vfio_device_feature_dma_logging_start_destroy(
     g_free(feature);
 }
 
-static int vfio_devices_dma_logging_start(VFIOContainer *container)
+static int vfio_devices_dma_logging_start(VFIOContainer *container, bool giommu)
 {
     struct vfio_device_feature *feature;
     int ret;
 
-    feature = vfio_device_feature_dma_logging_start_create(container);
+    feature = vfio_device_feature_dma_logging_start_create(container, giommu);
     if (!feature) {
         return -errno;
     }
@@ -1544,18 +1589,85 @@ static int vfio_devices_dma_logging_start(VFIOContainer *container)
     return ret;
 }
 
+/*
+ * This value is used in the second attempt to start device dirty tracking with
+ * vIOMMU, if the first attempt fails. It should be in the middle, not too big
+ * and not too small, allowing devices with HW limitations to do device dirty
+ * tracking while covering a fair amount of the IOVA space.
+ *
+ * This arbitrary value was chosen because it is the minimum Intel IOMMU max
+ * IOVA and the mlx5 device supports tracking a range of this size.
+ */
+#define VFIO_GIOMMU_RETRY_IOVA ((1ULL << 39) - 1)
+
+#define VFIO_GIOMMU_RETRY_COUNT 3
+static int vfio_devices_start_dirty_page_tracking(VFIOContainer *container)
+{
+    hwaddr giommu_max_iova, iova_size, iova_retry_size, ram_size;
+    hwaddr iova_to_track[VFIO_GIOMMU_RETRY_COUNT] = {};
+    int ret;
+    int i;
+
+    if (!vfio_have_giommu(container)) {
+        return vfio_devices_dma_logging_start(container, false);
+    }
+
+    /*
+     * With vIOMMU we try to track the entire IOVA space. As the IOVA space can
+     * be rather big, devices might not be able to track it due to HW
+     * limitations. Therefore, retry tracking smaller ranges as follows:
+     * (1) Retry tracking a smaller part of the IOVA space.
+     * (2) Retry tracking a range the size of the physical memory.
+     * (3) If all fail, give up.
+     */
+    ret = vfio_giommu_get_max_iova(container, &giommu_max_iova);
+    if (!ret && !REAL_HOST_PAGE_ALIGN(giommu_max_iova)) {
+        giommu_max_iova -= qemu_real_host_page_size();
+    }
+
+    iova_size = ret ? 0 : giommu_max_iova;
+    iova_retry_size = iova_size ? MIN(VFIO_GIOMMU_RETRY_IOVA, iova_size / 2) :
+                                  VFIO_GIOMMU_RETRY_IOVA;
+    ram_size = vfio_get_ram_size();
+
+    iova_to_track[0] = REAL_HOST_PAGE_ALIGN(iova_size);
+    iova_to_track[1] = REAL_HOST_PAGE_ALIGN(iova_retry_size);
+    iova_to_track[2] = REAL_HOST_PAGE_ALIGN(MIN(ram_size, iova_retry_size / 2));
+
+    for (i = 0; i < VFIO_GIOMMU_RETRY_COUNT; i++) {
+        if (!iova_to_track[i]) {
+            continue;
+        }
+
+        container->giommu_tracked_range = iova_to_track[i];
+        ret = vfio_devices_dma_logging_start(container, true);
+        if (!ret) {
+            break;
+        }
+
+        if (i < VFIO_GIOMMU_RETRY_COUNT - 1) {
+            warn_report("Failed to start device dirty tracking with vIOMMU "
+                        "with range of size 0x%" HWADDR_PRIx
+                        ", err: %d. Retrying with range "
+                        "of size 0x%" HWADDR_PRIx,
+                        iova_to_track[i], ret, iova_to_track[i + 1]);
+        } else {
+            error_report("Failed to start device dirty tracking with vIOMMU "
+                         "with range of size 0x%" HWADDR_PRIx ", err: %d",
+                         iova_to_track[i], ret);
+        }
+    }
+
+    return ret;
+}
+
 static void vfio_listener_log_global_start(MemoryListener *listener)
 {
     VFIOContainer *container = container_of(listener, VFIOContainer, listener);
     int ret;
 
     if (vfio_devices_all_device_dirty_tracking(container)) {
-        if (vfio_have_giommu(container)) {
-            /* Device dirty page tracking currently doesn't support vIOMMU */
-            return;
-        }
-
-        ret = vfio_devices_dma_logging_start(container);
+        ret = vfio_devices_start_dirty_page_tracking(container);
     } else {
         ret = vfio_set_dirty_page_tracking(container, true);
     }
@@ -1573,11 +1685,6 @@ static void vfio_listener_log_global_stop(MemoryListener *listener)
     int ret;
 
     if (vfio_devices_all_device_dirty_tracking(container)) {
-        if (vfio_have_giommu(container)) {
-            /* Device dirty page tracking currently doesn't support vIOMMU */
-            return;
-        }
-
         ret = vfio_devices_dma_logging_stop(container);
     } else {
         ret = vfio_set_dirty_page_tracking(container, false);
@@ -1616,6 +1723,17 @@ static int vfio_device_dma_logging_report(VFIODevice *vbasedev, hwaddr iova,
     return 0;
 }
 
+static bool vfio_iommu_range_is_device_tracked(VFIOContainer *container,
+                                               hwaddr iova, hwaddr size)
+{
+    /* Check overflow */
+    if (iova + size < iova) {
+        return false;
+    }
+
+    return iova + size <= container->giommu_tracked_range;
+}
+
 static int vfio_devices_query_dirty_bitmap(VFIOContainer *container,
                                            VFIOBitmap *vbmap, hwaddr iova,
                                            hwaddr size)
@@ -1625,10 +1743,11 @@ static int vfio_devices_query_dirty_bitmap(VFIOContainer *container,
     int ret;
 
     if (vfio_have_giommu(container)) {
-        /* Device dirty page tracking currently doesn't support vIOMMU */
-        bitmap_set(vbmap->bitmap, 0, vbmap->pages);
+        if (!vfio_iommu_range_is_device_tracked(container, iova, size)) {
+            bitmap_set(vbmap->bitmap, 0, vbmap->pages);
 
-        return 0;
+            return 0;
+        }
     }
 
     QLIST_FOREACH(group, &container->group_list, container_next) {
-- 
2.26.3



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 16/18] vfio/common: Optimize device dirty page tracking with vIOMMU
  2023-01-26 18:49 [PATCH 00/18] vfio: Add migration pre-copy support and device dirty tracking Avihai Horon
                   ` (14 preceding siblings ...)
  2023-01-26 18:49 ` [PATCH 15/18] vfio/common: Support device dirty page tracking with vIOMMU Avihai Horon
@ 2023-01-26 18:49 ` Avihai Horon
  2023-01-26 18:49 ` [PATCH 17/18] vfio/migration: Query device dirty page tracking support Avihai Horon
  2023-01-26 18:49 ` [PATCH 18/18] docs/devel: Document VFIO device dirty page tracking Avihai Horon
  17 siblings, 0 replies; 42+ messages in thread
From: Avihai Horon @ 2023-01-26 18:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Avihai Horon,
	Kirti Wankhede, Tarun Gupta, Joao Martins

When vIOMMU is enabled, syncing dirty page bitmaps is done by replaying
the vIOMMU mappings and querying the dirty bitmap for each mapping.

With device dirty tracking this causes a lot of overhead, since the HW
is queried many times (even with a small idle guest this can end up with
thousands of calls to the HW).

Optimize this by de-coupling dirty bitmap query from vIOMMU replay.
Now a single dirty bitmap is queried per vIOMMU MR section, which is
then used for all corresponding vIOMMU mappings within that MR section.
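
The per-mapping slice is then taken out of the section-wide bitmap with a
plain offset copy, roughly (illustrative names; the real code is
vfio_iommu_set_dirty_bitmap() in the diff below):

    /* Bit offset of this mapping within the section-wide bitmap */
    copy_offset = (mapping_iova - tracked_start_iova) /
                  qemu_real_host_page_size();

    /* Copy just this mapping's bits and mark the matching RAM pages dirty */
    bitmap_copy_with_src_offset(mapping_bitmap, section_bitmap, copy_offset,
                                mapping_pages);
    cpu_physical_memory_set_dirty_lebitmap(mapping_bitmap, ram_addr,
                                           mapping_pages);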

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
 hw/vfio/common.c | 85 ++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 83 insertions(+), 2 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index c3a27cbbd5..4f27cd669f 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1848,8 +1848,42 @@ out:
 typedef struct {
     IOMMUNotifier n;
     VFIOGuestIOMMU *giommu;
+    VFIOBitmap *vbmap;
 } vfio_giommu_dirty_notifier;
 
+static int vfio_iommu_set_dirty_bitmap(VFIOContainer *container,
+                                       vfio_giommu_dirty_notifier *gdn,
+                                       hwaddr iova, hwaddr size,
+                                       ram_addr_t ram_addr)
+{
+    VFIOBitmap *vbmap = gdn->vbmap;
+    VFIOBitmap *dst_vbmap;
+    hwaddr start_iova = REAL_HOST_PAGE_ALIGN(gdn->n.start);
+    hwaddr copy_offset;
+
+    dst_vbmap = vfio_bitmap_alloc(size);
+    if (!dst_vbmap) {
+        return -errno;
+    }
+
+    if (!vfio_iommu_range_is_device_tracked(container, iova, size)) {
+        bitmap_set(dst_vbmap->bitmap, 0, dst_vbmap->pages);
+
+        goto out;
+    }
+
+    copy_offset = (iova - start_iova) / qemu_real_host_page_size();
+    bitmap_copy_with_src_offset(dst_vbmap->bitmap, vbmap->bitmap, copy_offset,
+                                dst_vbmap->pages);
+
+out:
+    cpu_physical_memory_set_dirty_lebitmap(dst_vbmap->bitmap, ram_addr,
+                                           dst_vbmap->pages);
+    vfio_bitmap_dealloc(dst_vbmap);
+
+    return 0;
+}
+
 static void vfio_iommu_map_dirty_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
 {
     vfio_giommu_dirty_notifier *gdn = container_of(n,
@@ -1870,8 +1904,15 @@ static void vfio_iommu_map_dirty_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
 
     rcu_read_lock();
     if (vfio_get_xlat_addr(iotlb, NULL, &translated_addr, NULL)) {
-        ret = vfio_get_dirty_bitmap(container, iova, iotlb->addr_mask + 1,
-                                    translated_addr);
+        if (gdn->vbmap) {
+            ret = vfio_iommu_set_dirty_bitmap(container, gdn, iova,
+                                              iotlb->addr_mask + 1,
+                                              translated_addr);
+        } else {
+            ret = vfio_get_dirty_bitmap(container, iova, iotlb->addr_mask + 1,
+                                        translated_addr);
+        }
+
         if (ret) {
             error_report("vfio_iommu_map_dirty_notify(%p, 0x%"HWADDR_PRIx", "
                          "0x%"HWADDR_PRIx") = %d (%s)",
@@ -1935,6 +1976,7 @@ static int vfio_sync_iommu_dirty_bitmap(VFIOContainer *container,
 {
     VFIOGuestIOMMU *giommu;
     bool found = false;
+    VFIOBitmap *vbmap = NULL;
     Int128 llend;
     vfio_giommu_dirty_notifier gdn;
     int idx;
@@ -1952,6 +1994,7 @@ static int vfio_sync_iommu_dirty_bitmap(VFIOContainer *container,
     }
 
     gdn.giommu = giommu;
+    gdn.vbmap = NULL;
     idx = memory_region_iommu_attrs_to_index(giommu->iommu_mr,
                                              MEMTXATTRS_UNSPECIFIED);
 
@@ -1959,11 +2002,49 @@ static int vfio_sync_iommu_dirty_bitmap(VFIOContainer *container,
                        section->size);
     llend = int128_sub(llend, int128_one());
 
+    /*
+     * Optimize device dirty tracking if the MR section is at least partially
+     * tracked. Optimization is done by querying a single dirty bitmap for the
+     * entire range instead of querying dirty bitmap for each vIOMMU mapping.
+     */
+    if (vfio_devices_all_device_dirty_tracking(container)) {
+        hwaddr start = REAL_HOST_PAGE_ALIGN(section->offset_within_region);
+        hwaddr end = int128_get64(llend);
+        hwaddr size;
+        int ret;
+
+        if (start >= container->giommu_tracked_range) {
+            goto notifier_init;
+        }
+
+        size = REAL_HOST_PAGE_ALIGN(
+            MIN(container->giommu_tracked_range - 1, end) - start);
+
+        vbmap = vfio_bitmap_alloc(size);
+        if (!vbmap) {
+            return -errno;
+        }
+
+        ret = vfio_devices_query_dirty_bitmap(container, vbmap, start, size);
+        if (ret) {
+            vfio_bitmap_dealloc(vbmap);
+
+            return ret;
+        }
+
+        gdn.vbmap = vbmap;
+    }
+
+notifier_init:
     iommu_notifier_init(&gdn.n, vfio_iommu_map_dirty_notify, IOMMU_NOTIFIER_MAP,
                         section->offset_within_region, int128_get64(llend),
                         idx);
     memory_region_iommu_replay(giommu->iommu_mr, &gdn.n);
 
+    if (vbmap) {
+        vfio_bitmap_dealloc(vbmap);
+    }
+
     return 0;
 }
 
-- 
2.26.3



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 17/18] vfio/migration: Query device dirty page tracking support
  2023-01-26 18:49 [PATCH 00/18] vfio: Add migration pre-copy support and device dirty tracking Avihai Horon
                   ` (15 preceding siblings ...)
  2023-01-26 18:49 ` [PATCH 16/18] vfio/common: Optimize " Avihai Horon
@ 2023-01-26 18:49 ` Avihai Horon
  2023-01-26 18:49 ` [PATCH 18/18] docs/devel: Document VFIO device dirty page tracking Avihai Horon
  17 siblings, 0 replies; 42+ messages in thread
From: Avihai Horon @ 2023-01-26 18:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Avihai Horon,
	Kirti Wankhede, Tarun Gupta, Joao Martins

From: Joao Martins <joao.m.martins@oracle.com>

Now that everything has been set up for device dirty page tracking,
query the device for device dirty page tracking support.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
 hw/vfio/migration.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 2a0a663023..5aeda47345 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -658,6 +658,19 @@ static int vfio_migration_query_flags(VFIODevice *vbasedev, uint64_t *mig_flags)
     return 0;
 }
 
+static bool vfio_dma_logging_supported(VFIODevice *vbasedev)
+{
+    uint64_t buf[DIV_ROUND_UP(sizeof(struct vfio_device_feature),
+                              sizeof(uint64_t))] = {};
+    struct vfio_device_feature *feature = (struct vfio_device_feature *)buf;
+
+    feature->argsz = sizeof(buf);
+    feature->flags =
+        VFIO_DEVICE_FEATURE_PROBE | VFIO_DEVICE_FEATURE_DMA_LOGGING_START;
+
+    return !ioctl(vbasedev->fd, VFIO_DEVICE_FEATURE, feature);
+}
+
 static int vfio_migration_init(VFIODevice *vbasedev)
 {
     int ret;
@@ -693,6 +706,8 @@ static int vfio_migration_init(VFIODevice *vbasedev)
     migration->data_fd = -1;
     migration->mig_flags = mig_flags;
 
+    vbasedev->dirty_pages_supported = vfio_dma_logging_supported(vbasedev);
+
     oid = vmstate_if_get_id(VMSTATE_IF(DEVICE(obj)));
     if (oid) {
         path = g_strdup_printf("%s/vfio", oid);
-- 
2.26.3



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 18/18] docs/devel: Document VFIO device dirty page tracking
  2023-01-26 18:49 [PATCH 00/18] vfio: Add migration pre-copy support and device dirty tracking Avihai Horon
                   ` (16 preceding siblings ...)
  2023-01-26 18:49 ` [PATCH 17/18] vfio/migration: Query device dirty page tracking support Avihai Horon
@ 2023-01-26 18:49 ` Avihai Horon
  17 siblings, 0 replies; 42+ messages in thread
From: Avihai Horon @ 2023-01-26 18:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Avihai Horon,
	Kirti Wankhede, Tarun Gupta, Joao Martins

Adjust the VFIO dirty page tracking documentation and add a section to
describe device dirty page tracking.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
 docs/devel/vfio-migration.rst | 50 ++++++++++++++++++++++-------------
 1 file changed, 32 insertions(+), 18 deletions(-)

diff --git a/docs/devel/vfio-migration.rst b/docs/devel/vfio-migration.rst
index 51f5e1a537..c02f62e534 100644
--- a/docs/devel/vfio-migration.rst
+++ b/docs/devel/vfio-migration.rst
@@ -63,22 +63,37 @@ System memory dirty pages tracking
 ----------------------------------
 
 A ``log_global_start`` and ``log_global_stop`` memory listener callback informs
-the VFIO IOMMU module to start and stop dirty page tracking. A ``log_sync``
-memory listener callback marks those system memory pages as dirty which are
-used for DMA by the VFIO device. The dirty pages bitmap is queried per
-container. All pages pinned by the vendor driver through external APIs have to
-be marked as dirty during migration. When there are CPU writes, CPU dirty page
-tracking can identify dirtied pages, but any page pinned by the vendor driver
-can also be written by the device. There is currently no device or IOMMU
-support for dirty page tracking in hardware.
+the VFIO dirty tracking module to start and stop dirty page tracking. A
+``log_sync`` memory listener callback queries the dirty page bitmap from the
+dirty tracking module and marks system memory pages which were DMA-ed by the
+VFIO device as dirty. The dirty page bitmap is queried per container.
+
+Currently there are two ways dirty page tracking can be done:
+(1) Device dirty tracking:
+In this method, the device is responsible for logging and reporting its DMAs.
+This method can be used only if the device is capable of tracking its DMAs.
+Discovering device capability, starting and stopping dirty tracking, and
+syncing the dirty bitmaps from the device are done using the DMA logging uAPI.
+More info about the uAPI can be found in the comments of the
+``vfio_device_feature_dma_logging_control`` and
+``vfio_device_feature_dma_logging_report`` structures in the header file
+linux-headers/linux/vfio.h.
+
+(2) VFIO IOMMU module:
+In this method, dirty tracking is done by the IOMMU. However, there is currently no
+IOMMU support for dirty page tracking. For this reason, all pages are
+perpetually marked dirty, unless the device driver pins pages through external
+APIs in which case only those pinned pages are perpetually marked dirty.
+
+If the above two methods are not supported, all pages are perpetually marked
+dirty by QEMU.
 
 By default, dirty pages are tracked during pre-copy as well as stop-and-copy
-phase. So, a page pinned by the vendor driver will be copied to the destination
-in both phases. Copying dirty pages in pre-copy phase helps QEMU to predict if
-it can achieve its downtime tolerances. If QEMU during pre-copy phase keeps
-finding dirty pages continuously, then it understands that even in stop-and-copy
-phase, it is likely to find dirty pages and can predict the downtime
-accordingly.
+phase. So, a page marked as dirty will be copied to the destination in both
+phases. Copying dirty pages in pre-copy phase helps QEMU to predict if it can
+achieve its downtime tolerances. If QEMU during pre-copy phase keeps finding
+dirty pages continuously, then it understands that even in stop-and-copy phase,
+it is likely to find dirty pages and can predict the downtime accordingly.
 
 QEMU also provides a per device opt-out option ``pre-copy-dirty-page-tracking``
 which disables querying the dirty bitmap during pre-copy phase. If it is set to
@@ -89,10 +104,9 @@ System memory dirty pages tracking when vIOMMU is enabled
 ---------------------------------------------------------
 
 With vIOMMU, an IO virtual address range can get unmapped while in pre-copy
-phase of migration. In that case, the unmap ioctl returns any dirty pages in
-that range and QEMU reports corresponding guest physical pages dirty. During
-stop-and-copy phase, an IOMMU notifier is used to get a callback for mapped
-pages and then dirty pages bitmap is fetched from VFIO IOMMU modules for those
+phase of migration. In that case, the dirty page bitmap for this range is queried
+and synced with QEMU. During stop-and-copy phase, an IOMMU notifier is used to
+get a callback for mapped pages and then the dirty page bitmap is fetched for those
 mapped ranges.
 
 Flow of state changes during Live migration
-- 
2.26.3



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: [PATCH 01/18] vfio/migration: Add VFIO migration pre-copy support
  2023-01-26 18:49 ` [PATCH 01/18] vfio/migration: Add VFIO migration pre-copy support Avihai Horon
@ 2023-01-26 23:52   ` Alex Williamson
  2023-01-31 12:44     ` Avihai Horon
  0 siblings, 1 reply; 42+ messages in thread
From: Alex Williamson @ 2023-01-26 23:52 UTC (permalink / raw)
  To: Avihai Horon
  Cc: qemu-devel, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta, Joao Martins

On Thu, 26 Jan 2023 20:49:31 +0200
Avihai Horon <avihaih@nvidia.com> wrote:

> Pre-copy support allows the VFIO device data to be transferred while the
> VM is running. This helps to accommodate VFIO devices that have a large
> amount of data that needs to be transferred, and it can reduce migration
> downtime.
> 
> Pre-copy support is optional in VFIO migration protocol v2.
> Implement pre-copy of VFIO migration protocol v2 and use it for devices
> that support it. Full description of it can be found here [1].
> 
> [1]
> https://lore.kernel.org/kvm/20221206083438.37807-3-yishaih@nvidia.com/
> 
> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
> ---
>  docs/devel/vfio-migration.rst |  29 ++++++---
>  include/hw/vfio/vfio-common.h |   3 +
>  hw/vfio/common.c              |   8 ++-
>  hw/vfio/migration.c           | 112 ++++++++++++++++++++++++++++++++--
>  hw/vfio/trace-events          |   5 +-
>  5 files changed, 140 insertions(+), 17 deletions(-)
> 
> diff --git a/docs/devel/vfio-migration.rst b/docs/devel/vfio-migration.rst
> index 1d50c2fe5f..51f5e1a537 100644
> --- a/docs/devel/vfio-migration.rst
> +++ b/docs/devel/vfio-migration.rst
> @@ -7,12 +7,14 @@ the guest is running on source host and restoring this saved state on the
>  destination host. This document details how saving and restoring of VFIO
>  devices is done in QEMU.
>  
> -Migration of VFIO devices currently consists of a single stop-and-copy phase.
> -During the stop-and-copy phase the guest is stopped and the entire VFIO device
> -data is transferred to the destination.
> -
> -The pre-copy phase of migration is currently not supported for VFIO devices.
> -Support for VFIO pre-copy will be added later on.
> +Migration of VFIO devices consists of two phases: the optional pre-copy phase,
> +and the stop-and-copy phase. The pre-copy phase is iterative and allows to
> +accommodate VFIO devices that have a large amount of data that needs to be
> +transferred. The iterative pre-copy phase of migration allows for the guest to
> +continue whilst the VFIO device state is transferred to the destination, this
> +helps to reduce the total downtime of the VM. VFIO devices can choose to skip
> +the pre-copy phase of migration by not reporting the VFIO_MIGRATION_PRE_COPY
> +flag in VFIO_DEVICE_FEATURE_MIGRATION ioctl.
>  
>  A detailed description of the UAPI for VFIO device migration can be found in
>  the comment for the ``vfio_device_mig_state`` structure in the header file
> @@ -29,6 +31,12 @@ VFIO implements the device hooks for the iterative approach as follows:
>    driver, which indicates the amount of data that the vendor driver has yet to
>    save for the VFIO device.
>  
> +* An ``is_active_iterate`` function that indicates ``save_live_iterate`` is
> +  active only if the VFIO device is in pre-copy states.
> +
> +* A ``save_live_iterate`` function that reads the VFIO device's data from the
> +  vendor driver during iterative phase.
> +
>  * A ``save_state`` function to save the device config space if it is present.
>  
>  * A ``save_live_complete_precopy`` function that sets the VFIO device in
> @@ -91,8 +99,10 @@ Flow of state changes during Live migration
>  ===========================================
>  
>  Below is the flow of state change during live migration.
> -The values in the brackets represent the VM state, the migration state, and
> +The values in the parentheses represent the VM state, the migration state, and
>  the VFIO device state, respectively.
> +The text in the square brackets represents the flow if the VFIO device supports
> +pre-copy.
>  
>  Live migration save path
>  ------------------------
> @@ -104,11 +114,12 @@ Live migration save path
>                                    |
>                       migrate_init spawns migration_thread
>                  Migration thread then calls each device's .save_setup()
> -                       (RUNNING, _SETUP, _RUNNING)
> +                  (RUNNING, _SETUP, _RUNNING [_PRE_COPY])
>                                    |
> -                      (RUNNING, _ACTIVE, _RUNNING)
> +                  (RUNNING, _ACTIVE, _RUNNING [_PRE_COPY])
>               If device is active, get pending_bytes by .save_live_pending()
>            If total pending_bytes >= threshold_size, call .save_live_iterate()
> +                  [Data of VFIO device for pre-copy phase is copied]
>          Iterate till total pending bytes converge and are less than threshold
>                                    |
>    On migration completion, vCPU stops and calls .save_live_complete_precopy for
> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> index 5f8e7a02fe..88c2194fb9 100644
> --- a/include/hw/vfio/vfio-common.h
> +++ b/include/hw/vfio/vfio-common.h
> @@ -67,7 +67,10 @@ typedef struct VFIOMigration {
>      int data_fd;
>      void *data_buffer;
>      size_t data_buffer_size;
> +    uint64_t mig_flags;
>      uint64_t stop_copy_size;
> +    uint64_t precopy_init_size;
> +    uint64_t precopy_dirty_size;
>  } VFIOMigration;
>  
>  typedef struct VFIOAddressSpace {
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index 9a0dbee6b4..93b18c5e3d 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -357,7 +357,9 @@ static bool vfio_devices_all_dirty_tracking(VFIOContainer *container)
>  
>              if ((vbasedev->pre_copy_dirty_page_tracking == ON_OFF_AUTO_OFF) &&
>                  (migration->device_state == VFIO_DEVICE_STATE_RUNNING ||
> -                 migration->device_state == VFIO_DEVICE_STATE_RUNNING_P2P)) {
> +                 migration->device_state == VFIO_DEVICE_STATE_RUNNING_P2P ||
> +                 migration->device_state == VFIO_DEVICE_STATE_PRE_COPY ||
> +                 migration->device_state == VFIO_DEVICE_STATE_PRE_COPY_P2P)) {

Should this just turn into a test that we're not in STOP_COPY?

>                  return false;
>              }
>          }
> @@ -387,7 +389,9 @@ static bool vfio_devices_all_running_and_mig_active(VFIOContainer *container)
>              }
>  
>              if (migration->device_state == VFIO_DEVICE_STATE_RUNNING ||
> -                migration->device_state == VFIO_DEVICE_STATE_RUNNING_P2P) {
> +                migration->device_state == VFIO_DEVICE_STATE_RUNNING_P2P ||
> +                migration->device_state == VFIO_DEVICE_STATE_PRE_COPY ||
> +                migration->device_state == VFIO_DEVICE_STATE_PRE_COPY_P2P) {
>                  continue;
>              } else {
>                  return false;

Hmm, this only seems to highlight that between this series and the
previous, we're adding tests for states that we never actually use, ie.
these _P2P states.

IIRC, the reason we have these _P2P states is so that we can transition
a set of devices, which may have active P2P DMA between them, to STOP,
STOP_COPY, and even RUNNING states safely without lost data given that
we cannot simultaneously transition all devices.  That suggests that
missing from both these series is support for bringing all devices to
these _P2P states before we move any device to one of STOP, STOP_COPY,
or RUNNING states (in the case of RESUMING).

Also, I recall discussions that we need to enforce configuration
restrictions when not all devices support the _P2P states?  For example
adding a migration blocker when there are multiple vfio devices and at
least one of them does not support _P2P migration states.  Or perhaps
initially, requiring support for _P2P states.

I think what's implemented here, where we don't make use of the _P2P
states would require adding a migration blocker whenever there are
multiple vfio devices, regardless of the device support for _P2P.
Thanks,

Alex



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 05/18] vfio/common: Add VFIOBitmap and (de)alloc functions
  2023-01-26 18:49 ` [PATCH 05/18] vfio/common: Add VFIOBitmap and (de)alloc functions Avihai Horon
@ 2023-01-27 21:11   ` Alex Williamson
  2023-02-12 15:36     ` Avihai Horon
  0 siblings, 1 reply; 42+ messages in thread
From: Alex Williamson @ 2023-01-27 21:11 UTC (permalink / raw)
  To: Avihai Horon
  Cc: qemu-devel, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta, Joao Martins

On Thu, 26 Jan 2023 20:49:35 +0200
Avihai Horon <avihaih@nvidia.com> wrote:

> There are already two places where dirty page bitmap allocation and
> calculations are done in open code. With device dirty page tracking
> being added in next patches, there are going to be even more places.
> 
> To avoid code duplication, introduce VFIOBitmap struct and corresponding
> alloc and dealloc functions and use them where applicable.
> 
> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
> ---
>  hw/vfio/common.c | 89 ++++++++++++++++++++++++++++++++----------------
>  1 file changed, 60 insertions(+), 29 deletions(-)
> 
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index 8e8ffbc046..e554573eb5 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -319,6 +319,41 @@ const MemoryRegionOps vfio_region_ops = {
>   * Device state interfaces
>   */
>  
> +typedef struct {
> +    unsigned long *bitmap;
> +    hwaddr size;
> +    hwaddr pages;
> +} VFIOBitmap;
> +
> +static VFIOBitmap *vfio_bitmap_alloc(hwaddr size)
> +{
> +    VFIOBitmap *vbmap = g_try_new0(VFIOBitmap, 1);
> +    if (!vbmap) {
> +        errno = ENOMEM;
> +
> +        return NULL;
> +    }
> +
> +    vbmap->pages = REAL_HOST_PAGE_ALIGN(size) / qemu_real_host_page_size();
> +    vbmap->size =  ROUND_UP(vbmap->pages, sizeof(__u64) * BITS_PER_BYTE) /
> +                                          BITS_PER_BYTE;
> +    vbmap->bitmap = g_try_malloc0(vbmap->size);
> +    if (!vbmap->bitmap) {
> +        g_free(vbmap);
> +        errno = ENOMEM;
> +
> +        return NULL;
> +    }
> +
> +    return vbmap;
> +}
> +
> +static void vfio_bitmap_dealloc(VFIOBitmap *vbmap)
> +{
> +    g_free(vbmap->bitmap);
> +    g_free(vbmap);
> +}
> +
>  bool vfio_mig_active(void)
>  {
>      VFIOGroup *group;
> @@ -421,9 +456,14 @@ static int vfio_dma_unmap_bitmap(VFIOContainer *container,
>  {
>      struct vfio_iommu_type1_dma_unmap *unmap;
>      struct vfio_bitmap *bitmap;
> -    uint64_t pages = REAL_HOST_PAGE_ALIGN(size) / qemu_real_host_page_size();
> +    VFIOBitmap *vbmap;
>      int ret;
>  
> +    vbmap = vfio_bitmap_alloc(size);
> +    if (!vbmap) {
> +        return -errno;
> +    }
> +
>      unmap = g_malloc0(sizeof(*unmap) + sizeof(*bitmap));
>  
>      unmap->argsz = sizeof(*unmap) + sizeof(*bitmap);
> @@ -437,35 +477,28 @@ static int vfio_dma_unmap_bitmap(VFIOContainer *container,
>       * qemu_real_host_page_size to mark those dirty. Hence set bitmap_pgsize
>       * to qemu_real_host_page_size.
>       */
> -
>      bitmap->pgsize = qemu_real_host_page_size();
> -    bitmap->size = ROUND_UP(pages, sizeof(__u64) * BITS_PER_BYTE) /
> -                   BITS_PER_BYTE;
> +    bitmap->size = vbmap->size;
> +    bitmap->data = (__u64 *)vbmap->bitmap;
>  
> -    if (bitmap->size > container->max_dirty_bitmap_size) {
> -        error_report("UNMAP: Size of bitmap too big 0x%"PRIx64,
> -                     (uint64_t)bitmap->size);
> +    if (vbmap->size > container->max_dirty_bitmap_size) {
> +        error_report("UNMAP: Size of bitmap too big 0x%"PRIx64, vbmap->size);

Why not pass the container to the alloc function so we can test this
consistently for each bitmap we allocate?  Thanks,

Alex
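
For readers following along, a minimal sketch of the container-aware
allocator suggested above (hypothetical signature; it assumes the
VFIOBitmap struct and the container->max_dirty_bitmap_size field shown
in the quoted patch, and the errno choice is illustrative only):

static VFIOBitmap *vfio_bitmap_alloc(VFIOContainer *container, hwaddr size)
{
    VFIOBitmap *vbmap = g_try_new0(VFIOBitmap, 1);

    if (!vbmap) {
        errno = ENOMEM;
        return NULL;
    }

    vbmap->pages = REAL_HOST_PAGE_ALIGN(size) / qemu_real_host_page_size();
    vbmap->size = ROUND_UP(vbmap->pages, sizeof(__u64) * BITS_PER_BYTE) /
                  BITS_PER_BYTE;

    /* One central place for the size check instead of open coding it */
    if (vbmap->size > container->max_dirty_bitmap_size) {
        g_free(vbmap);
        errno = E2BIG;
        return NULL;
    }

    vbmap->bitmap = g_try_malloc0(vbmap->size);
    if (!vbmap->bitmap) {
        g_free(vbmap);
        errno = ENOMEM;
        return NULL;
    }

    return vbmap;
}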



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 08/18] vfio/common: Record DMA mapped IOVA ranges
  2023-01-26 18:49 ` [PATCH 08/18] vfio/common: Record DMA mapped IOVA ranges Avihai Horon
@ 2023-01-27 21:42   ` Alex Williamson
  2023-02-12 15:40     ` Avihai Horon
  0 siblings, 1 reply; 42+ messages in thread
From: Alex Williamson @ 2023-01-27 21:42 UTC (permalink / raw)
  To: Avihai Horon
  Cc: qemu-devel, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta, Joao Martins

On Thu, 26 Jan 2023 20:49:38 +0200
Avihai Horon <avihaih@nvidia.com> wrote:

> From: Joao Martins <joao.m.martins@oracle.com>
> 
> According to the device DMA logging uAPI, IOVA ranges to be logged by
> the device must be provided all at once upon DMA logging start.
> 
> As preparation for the following patches which will add device dirty
> page tracking, keep a record of all DMA mapped IOVA ranges so later they
> can be used for DMA logging start.
> 
> Note that when vIOMMU is enabled DMA mapped IOVA ranges are not tracked.
> This is due to the dynamic nature of vIOMMU DMA mapping/unmapping.
> Following patches will address the vIOMMU case specifically.
> 
> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
> ---
>  include/hw/vfio/vfio-common.h |  3 ++
>  hw/vfio/common.c              | 86 +++++++++++++++++++++++++++++++++--
>  2 files changed, 86 insertions(+), 3 deletions(-)
> 
> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> index 88c2194fb9..d54000d7ae 100644
> --- a/include/hw/vfio/vfio-common.h
> +++ b/include/hw/vfio/vfio-common.h
> @@ -23,6 +23,7 @@
>  
>  #include "exec/memory.h"
>  #include "qemu/queue.h"
> +#include "qemu/iova-tree.h"
>  #include "qemu/notify.h"
>  #include "ui/console.h"
>  #include "hw/display/ramfb.h"
> @@ -94,6 +95,8 @@ typedef struct VFIOContainer {
>      uint64_t max_dirty_bitmap_size;
>      unsigned long pgsizes;
>      unsigned int dma_max_mappings;
> +    IOVATree *mappings;
> +    QemuMutex mappings_mutex;
>      QLIST_HEAD(, VFIOGuestIOMMU) giommu_list;
>      QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list;
>      QLIST_HEAD(, VFIOGroup) group_list;
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index e554573eb5..fafc361cea 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -43,6 +43,7 @@
>  #include "migration/misc.h"
>  #include "migration/qemu-file.h"
>  #include "sysemu/tpm.h"
> +#include "qemu/iova-tree.h"
>  
>  VFIOGroupList vfio_group_list =
>      QLIST_HEAD_INITIALIZER(vfio_group_list);
> @@ -373,6 +374,11 @@ bool vfio_mig_active(void)
>      return true;
>  }
>  
> +static bool vfio_have_giommu(VFIOContainer *container)
> +{
> +    return !QLIST_EMPTY(&container->giommu_list);
> +}
> +
>  static void vfio_set_migration_error(int err)
>  {
>      MigrationState *ms = migrate_get_current();
> @@ -450,6 +456,51 @@ static bool vfio_devices_all_running_and_mig_active(VFIOContainer *container)
>      return true;
>  }
>  
> +static int vfio_record_mapping(VFIOContainer *container, hwaddr iova,
> +                               hwaddr size, bool readonly)
> +{
> +    DMAMap map = {
> +        .iova = iova,
> +        .size = size - 1, /* IOVATree is inclusive, so subtract 1 from size */

<facepalm>



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 11/18] vfio/common: Add device dirty page bitmap sync
  2023-01-26 18:49 ` [PATCH 11/18] vfio/common: Add device dirty page bitmap sync Avihai Horon
@ 2023-01-27 23:37   ` Alex Williamson
  2023-02-12 15:49     ` Avihai Horon
  0 siblings, 1 reply; 42+ messages in thread
From: Alex Williamson @ 2023-01-27 23:37 UTC (permalink / raw)
  To: Avihai Horon
  Cc: qemu-devel, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta, Joao Martins

On Thu, 26 Jan 2023 20:49:41 +0200
Avihai Horon <avihaih@nvidia.com> wrote:

> From: Joao Martins <joao.m.martins@oracle.com>
> 
> Add device dirty page bitmap sync functionality. This uses the device
> DMA logging uAPI to sync dirty page bitmap from the device.
> 
> Device dirty page bitmap sync is used only if all devices within a
> container support device dirty page tracking.
> 
> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
> ---
>  hw/vfio/common.c | 93 ++++++++++++++++++++++++++++++++++++++++++------
>  1 file changed, 82 insertions(+), 11 deletions(-)
> 
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index 3caa73d6f7..0003f2421d 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -355,6 +355,9 @@ static void vfio_bitmap_dealloc(VFIOBitmap *vbmap)
>      g_free(vbmap);
>  }
>  
> +static int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
> +                                 uint64_t size, ram_addr_t ram_addr);
> +
>  bool vfio_mig_active(void)
>  {
>      VFIOGroup *group;
> @@ -582,10 +585,19 @@ static int vfio_dma_unmap(VFIOContainer *container,
>          .iova = iova,
>          .size = size,
>      };
> +    int ret;
>  
> -    if (iotlb && container->dirty_pages_supported &&
> -        vfio_devices_all_running_and_mig_active(container)) {
> -        return vfio_dma_unmap_bitmap(container, iova, size, iotlb);
> +    if (iotlb && vfio_devices_all_running_and_mig_active(container)) {
> +        if (!vfio_devices_all_device_dirty_tracking(container) &&
> +            container->dirty_pages_supported) {
> +            return vfio_dma_unmap_bitmap(container, iova, size, iotlb);
> +        }
> +
> +        ret = vfio_get_dirty_bitmap(container, iova, size,
> +                                    iotlb->translated_addr);
> +        if (ret) {
> +            return ret;
> +        }

Isn't the ordering backwards here?  Only after the range is unmapped
can we know that this container can no longer dirty pages within the
range.  Thanks,

Alex
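
To make the ordering concern concrete, a rough sketch of the reordered
flow (based on the helpers and structures in the quoted patch, not the
posted code):

/* Unmap first, so this container can no longer dirty pages in the range */
ret = ioctl(container->fd, VFIO_IOMMU_UNMAP_DMA, &unmap);
if (ret) {
    error_report("VFIO_UNMAP_DMA failed: %s", strerror(errno));
    return -errno;
}

/* ...and only then fetch the dirty bitmap covering that range */
if (iotlb && vfio_devices_all_device_dirty_tracking(container)) {
    ret = vfio_get_dirty_bitmap(container, iova, size,
                                iotlb->translated_addr);
    if (ret) {
        return ret;
    }
}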



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 01/18] vfio/migration: Add VFIO migration pre-copy support
  2023-01-26 23:52   ` Alex Williamson
@ 2023-01-31 12:44     ` Avihai Horon
  2023-01-31 22:43       ` Alex Williamson
  0 siblings, 1 reply; 42+ messages in thread
From: Avihai Horon @ 2023-01-31 12:44 UTC (permalink / raw)
  To: Alex Williamson
  Cc: qemu-devel, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta, Joao Martins


On 27/01/2023 1:52, Alex Williamson wrote:
>
>
> On Thu, 26 Jan 2023 20:49:31 +0200
> Avihai Horon <avihaih@nvidia.com> wrote:
>
>> Pre-copy support allows the VFIO device data to be transferred while the
>> VM is running. This helps to accommodate VFIO devices that have a large
>> amount of data that needs to be transferred, and it can reduce migration
>> downtime.
>>
>> Pre-copy support is optional in VFIO migration protocol v2.
>> Implement pre-copy of VFIO migration protocol v2 and use it for devices
>> that support it. Full description of it can be found here [1].
>>
>> [1]
>> https://lore.kernel.org/kvm/20221206083438.37807-3-yishaih@nvidia.com/
>>
>> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
>> ---
>>   docs/devel/vfio-migration.rst |  29 ++++++---
>>   include/hw/vfio/vfio-common.h |   3 +
>>   hw/vfio/common.c              |   8 ++-
>>   hw/vfio/migration.c           | 112 ++++++++++++++++++++++++++++++++--
>>   hw/vfio/trace-events          |   5 +-
>>   5 files changed, 140 insertions(+), 17 deletions(-)
>>
>> diff --git a/docs/devel/vfio-migration.rst b/docs/devel/vfio-migration.rst
>> index 1d50c2fe5f..51f5e1a537 100644
>> --- a/docs/devel/vfio-migration.rst
>> +++ b/docs/devel/vfio-migration.rst
>> @@ -7,12 +7,14 @@ the guest is running on source host and restoring this saved state on the
>>   destination host. This document details how saving and restoring of VFIO
>>   devices is done in QEMU.
>>
>> -Migration of VFIO devices currently consists of a single stop-and-copy phase.
>> -During the stop-and-copy phase the guest is stopped and the entire VFIO device
>> -data is transferred to the destination.
>> -
>> -The pre-copy phase of migration is currently not supported for VFIO devices.
>> -Support for VFIO pre-copy will be added later on.
>> +Migration of VFIO devices consists of two phases: the optional pre-copy phase,
>> +and the stop-and-copy phase. The pre-copy phase is iterative and allows to
>> +accommodate VFIO devices that have a large amount of data that needs to be
>> +transferred. The iterative pre-copy phase of migration allows for the guest to
>> +continue whilst the VFIO device state is transferred to the destination, this
>> +helps to reduce the total downtime of the VM. VFIO devices can choose to skip
>> +the pre-copy phase of migration by not reporting the VFIO_MIGRATION_PRE_COPY
>> +flag in VFIO_DEVICE_FEATURE_MIGRATION ioctl.
>>
>>   A detailed description of the UAPI for VFIO device migration can be found in
>>   the comment for the ``vfio_device_mig_state`` structure in the header file
>> @@ -29,6 +31,12 @@ VFIO implements the device hooks for the iterative approach as follows:
>>     driver, which indicates the amount of data that the vendor driver has yet to
>>     save for the VFIO device.
>>
>> +* An ``is_active_iterate`` function that indicates ``save_live_iterate`` is
>> +  active only if the VFIO device is in pre-copy states.
>> +
>> +* A ``save_live_iterate`` function that reads the VFIO device's data from the
>> +  vendor driver during iterative phase.
>> +
>>   * A ``save_state`` function to save the device config space if it is present.
>>
>>   * A ``save_live_complete_precopy`` function that sets the VFIO device in
>> @@ -91,8 +99,10 @@ Flow of state changes during Live migration
>>   ===========================================
>>
>>   Below is the flow of state change during live migration.
>> -The values in the brackets represent the VM state, the migration state, and
>> +The values in the parentheses represent the VM state, the migration state, and
>>   the VFIO device state, respectively.
>> +The text in the square brackets represents the flow if the VFIO device supports
>> +pre-copy.
>>
>>   Live migration save path
>>   ------------------------
>> @@ -104,11 +114,12 @@ Live migration save path
>>                                     |
>>                        migrate_init spawns migration_thread
>>                   Migration thread then calls each device's .save_setup()
>> -                       (RUNNING, _SETUP, _RUNNING)
>> +                  (RUNNING, _SETUP, _RUNNING [_PRE_COPY])
>>                                     |
>> -                      (RUNNING, _ACTIVE, _RUNNING)
>> +                  (RUNNING, _ACTIVE, _RUNNING [_PRE_COPY])
>>                If device is active, get pending_bytes by .save_live_pending()
>>             If total pending_bytes >= threshold_size, call .save_live_iterate()
>> +                  [Data of VFIO device for pre-copy phase is copied]
>>           Iterate till total pending bytes converge and are less than threshold
>>                                     |
>>     On migration completion, vCPU stops and calls .save_live_complete_precopy for
>> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
>> index 5f8e7a02fe..88c2194fb9 100644
>> --- a/include/hw/vfio/vfio-common.h
>> +++ b/include/hw/vfio/vfio-common.h
>> @@ -67,7 +67,10 @@ typedef struct VFIOMigration {
>>       int data_fd;
>>       void *data_buffer;
>>       size_t data_buffer_size;
>> +    uint64_t mig_flags;
>>       uint64_t stop_copy_size;
>> +    uint64_t precopy_init_size;
>> +    uint64_t precopy_dirty_size;
>>   } VFIOMigration;
>>
>>   typedef struct VFIOAddressSpace {
>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>> index 9a0dbee6b4..93b18c5e3d 100644
>> --- a/hw/vfio/common.c
>> +++ b/hw/vfio/common.c
>> @@ -357,7 +357,9 @@ static bool vfio_devices_all_dirty_tracking(VFIOContainer *container)
>>
>>               if ((vbasedev->pre_copy_dirty_page_tracking == ON_OFF_AUTO_OFF) &&
>>                   (migration->device_state == VFIO_DEVICE_STATE_RUNNING ||
>> -                 migration->device_state == VFIO_DEVICE_STATE_RUNNING_P2P)) {
>> +                 migration->device_state == VFIO_DEVICE_STATE_RUNNING_P2P ||
>> +                 migration->device_state == VFIO_DEVICE_STATE_PRE_COPY ||
>> +                 migration->device_state == VFIO_DEVICE_STATE_PRE_COPY_P2P)) {
> Should this just turn into a test that we're not in STOP_COPY?

Then we would need to check we are not in STOP_COPY and not in STOP.
The STOP check is for the case where PRE_COPY is not supported, since 
RAM will ask for dirty page sync when the device is in STOP state.
Without the STOP check, the device will skip the final dirty page sync.
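
For illustration, a minimal sketch of the combined check being discussed,
assuming the state names from the quoted hunk (this is the inverted
formulation, not the posted code):

/*
 * Sketch only: instead of enumerating every running/pre-copy state,
 * test that the device has not yet reached a stopped state. STOP must
 * stay excluded so the final dirty page sync done in STOP is not
 * skipped when PRE_COPY is unsupported.
 */
if ((vbasedev->pre_copy_dirty_page_tracking == ON_OFF_AUTO_OFF) &&
    migration->device_state != VFIO_DEVICE_STATE_STOP_COPY &&
    migration->device_state != VFIO_DEVICE_STATE_STOP) {
    return false;
}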

>
>>                   return false;
>>               }
>>           }
>> @@ -387,7 +389,9 @@ static bool vfio_devices_all_running_and_mig_active(VFIOContainer *container)
>>               }
>>
>>               if (migration->device_state == VFIO_DEVICE_STATE_RUNNING ||
>> -                migration->device_state == VFIO_DEVICE_STATE_RUNNING_P2P) {
>> +                migration->device_state == VFIO_DEVICE_STATE_RUNNING_P2P ||
>> +                migration->device_state == VFIO_DEVICE_STATE_PRE_COPY ||
>> +                migration->device_state == VFIO_DEVICE_STATE_PRE_COPY_P2P) {
>>                   continue;
>>               } else {
>>                   return false;
> Hmm, this only seems to highlight that between this series and the
> previous, we're adding tests for states that we never actually use, ie.
> these _P2P states.
>
> IIRC, the reason we have these _P2P states is so that we can transition
> a set of devices, which may have active P2P DMA between them, to STOP,
> STOP_COPY, and even RUNNING states safely without lost data given that
> we cannot simultaneously transition all devices.  That suggest that
> missing from both these series is support for bringing all devices to
> these _P2P states before we move any device to one of STOP, STOP_COPY,
> or RUNNING states (in the case of RESUMING).
>
> Also, I recall discussions that we need to enforce configuration
> restrictions when not all devices support the _P2P states?  For example
> adding a migration blocker when there are multiple vfio devices and at
> least one of them does not support _P2P migration states.  Or perhaps
> initially, requiring support for _P2P states.
>
> I think what's implemented here, where we don't make use of the _P2P
> states would require adding a migration blocker whenever there are
> multiple vfio devices, regardless of the device support for _P2P.

Yes, I think you are right.

I will add a migration blocker if there are multiple devices as part of 
v9 of the basic series.
When P2P support is added, I will block migration of multiple devices if 
one or more of them doesn't support P2P.

Thanks.



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 01/18] vfio/migration: Add VFIO migration pre-copy support
  2023-01-31 12:44     ` Avihai Horon
@ 2023-01-31 22:43       ` Alex Williamson
  2023-01-31 23:29         ` Jason Gunthorpe
  0 siblings, 1 reply; 42+ messages in thread
From: Alex Williamson @ 2023-01-31 22:43 UTC (permalink / raw)
  To: Avihai Horon
  Cc: qemu-devel, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta, Joao Martins

On Tue, 31 Jan 2023 14:44:54 +0200
Avihai Horon <avihaih@nvidia.com> wrote:

> On 27/01/2023 1:52, Alex Williamson wrote:
> >
> >
> > On Thu, 26 Jan 2023 20:49:31 +0200
> > Avihai Horon <avihaih@nvidia.com> wrote:
> >  
> >> Pre-copy support allows the VFIO device data to be transferred while the
> >> VM is running. This helps to accommodate VFIO devices that have a large
> >> amount of data that needs to be transferred, and it can reduce migration
> >> downtime.
> >>
> >> Pre-copy support is optional in VFIO migration protocol v2.
> >> Implement pre-copy of VFIO migration protocol v2 and use it for devices
> >> that support it. Full description of it can be found here [1].
> >>
> >> [1]
> >> https://lore.kernel.org/kvm/20221206083438.37807-3-yishaih@nvidia.com/
> >>
> >> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
> >> ---
> >>   docs/devel/vfio-migration.rst |  29 ++++++---
> >>   include/hw/vfio/vfio-common.h |   3 +
> >>   hw/vfio/common.c              |   8 ++-
> >>   hw/vfio/migration.c           | 112 ++++++++++++++++++++++++++++++++--
> >>   hw/vfio/trace-events          |   5 +-
> >>   5 files changed, 140 insertions(+), 17 deletions(-)
> >>
> >> diff --git a/docs/devel/vfio-migration.rst b/docs/devel/vfio-migration.rst
> >> index 1d50c2fe5f..51f5e1a537 100644
> >> --- a/docs/devel/vfio-migration.rst
> >> +++ b/docs/devel/vfio-migration.rst
> >> @@ -7,12 +7,14 @@ the guest is running on source host and restoring this saved state on the
> >>   destination host. This document details how saving and restoring of VFIO
> >>   devices is done in QEMU.
> >>
> >> -Migration of VFIO devices currently consists of a single stop-and-copy phase.
> >> -During the stop-and-copy phase the guest is stopped and the entire VFIO device
> >> -data is transferred to the destination.
> >> -
> >> -The pre-copy phase of migration is currently not supported for VFIO devices.
> >> -Support for VFIO pre-copy will be added later on.
> >> +Migration of VFIO devices consists of two phases: the optional pre-copy phase,
> >> +and the stop-and-copy phase. The pre-copy phase is iterative and allows to
> >> +accommodate VFIO devices that have a large amount of data that needs to be
> >> +transferred. The iterative pre-copy phase of migration allows for the guest to
> >> +continue whilst the VFIO device state is transferred to the destination, this
> >> +helps to reduce the total downtime of the VM. VFIO devices can choose to skip
> >> +the pre-copy phase of migration by not reporting the VFIO_MIGRATION_PRE_COPY
> >> +flag in VFIO_DEVICE_FEATURE_MIGRATION ioctl.
> >>
> >>   A detailed description of the UAPI for VFIO device migration can be found in
> >>   the comment for the ``vfio_device_mig_state`` structure in the header file
> >> @@ -29,6 +31,12 @@ VFIO implements the device hooks for the iterative approach as follows:
> >>     driver, which indicates the amount of data that the vendor driver has yet to
> >>     save for the VFIO device.
> >>
> >> +* An ``is_active_iterate`` function that indicates ``save_live_iterate`` is
> >> +  active only if the VFIO device is in pre-copy states.
> >> +
> >> +* A ``save_live_iterate`` function that reads the VFIO device's data from the
> >> +  vendor driver during iterative phase.
> >> +
> >>   * A ``save_state`` function to save the device config space if it is present.
> >>
> >>   * A ``save_live_complete_precopy`` function that sets the VFIO device in
> >> @@ -91,8 +99,10 @@ Flow of state changes during Live migration
> >>   ===========================================
> >>
> >>   Below is the flow of state change during live migration.
> >> -The values in the brackets represent the VM state, the migration state, and
> >> +The values in the parentheses represent the VM state, the migration state, and
> >>   the VFIO device state, respectively.
> >> +The text in the square brackets represents the flow if the VFIO device supports
> >> +pre-copy.
> >>
> >>   Live migration save path
> >>   ------------------------
> >> @@ -104,11 +114,12 @@ Live migration save path
> >>                                     |
> >>                        migrate_init spawns migration_thread
> >>                   Migration thread then calls each device's .save_setup()
> >> -                       (RUNNING, _SETUP, _RUNNING)
> >> +                  (RUNNING, _SETUP, _RUNNING [_PRE_COPY])
> >>                                     |
> >> -                      (RUNNING, _ACTIVE, _RUNNING)
> >> +                  (RUNNING, _ACTIVE, _RUNNING [_PRE_COPY])
> >>                If device is active, get pending_bytes by .save_live_pending()
> >>             If total pending_bytes >= threshold_size, call .save_live_iterate()
> >> +                  [Data of VFIO device for pre-copy phase is copied]
> >>           Iterate till total pending bytes converge and are less than threshold
> >>                                     |
> >>     On migration completion, vCPU stops and calls .save_live_complete_precopy for
> >> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> >> index 5f8e7a02fe..88c2194fb9 100644
> >> --- a/include/hw/vfio/vfio-common.h
> >> +++ b/include/hw/vfio/vfio-common.h
> >> @@ -67,7 +67,10 @@ typedef struct VFIOMigration {
> >>       int data_fd;
> >>       void *data_buffer;
> >>       size_t data_buffer_size;
> >> +    uint64_t mig_flags;
> >>       uint64_t stop_copy_size;
> >> +    uint64_t precopy_init_size;
> >> +    uint64_t precopy_dirty_size;
> >>   } VFIOMigration;
> >>
> >>   typedef struct VFIOAddressSpace {
> >> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> >> index 9a0dbee6b4..93b18c5e3d 100644
> >> --- a/hw/vfio/common.c
> >> +++ b/hw/vfio/common.c
> >> @@ -357,7 +357,9 @@ static bool vfio_devices_all_dirty_tracking(VFIOContainer *container)
> >>
> >>               if ((vbasedev->pre_copy_dirty_page_tracking == ON_OFF_AUTO_OFF) &&
> >>                   (migration->device_state == VFIO_DEVICE_STATE_RUNNING ||
> >> -                 migration->device_state == VFIO_DEVICE_STATE_RUNNING_P2P)) {
> >> +                 migration->device_state == VFIO_DEVICE_STATE_RUNNING_P2P ||
> >> +                 migration->device_state == VFIO_DEVICE_STATE_PRE_COPY ||
> >> +                 migration->device_state == VFIO_DEVICE_STATE_PRE_COPY_P2P)) {  
> > Should this just turn into a test that we're not in STOP_COPY?  
> 
> Then we would need to check we are not in STOP_COPY and not in STOP.
> The STOP check is for the case where PRE_COPY is not supported, since 
> RAM will ask for dirty page sync when the device is in STOP state.
> Without the STOP check, the device will skip the final dirty page sync.
> 
> >  
> >>                   return false;
> >>               }
> >>           }
> >> @@ -387,7 +389,9 @@ static bool vfio_devices_all_running_and_mig_active(VFIOContainer *container)
> >>               }
> >>
> >>               if (migration->device_state == VFIO_DEVICE_STATE_RUNNING ||
> >> -                migration->device_state == VFIO_DEVICE_STATE_RUNNING_P2P) {
> >> +                migration->device_state == VFIO_DEVICE_STATE_RUNNING_P2P ||
> >> +                migration->device_state == VFIO_DEVICE_STATE_PRE_COPY ||
> >> +                migration->device_state == VFIO_DEVICE_STATE_PRE_COPY_P2P) {
> >>                   continue;
> >>               } else {
> >>                   return false;  
> > Hmm, this only seems to highlight that between this series and the
> > previous, we're adding tests for states that we never actually use, ie.
> > these _P2P states.
> >
> > IIRC, the reason we have these _P2P states is so that we can transition
> > a set of devices, which may have active P2P DMA between them, to STOP,
> > STOP_COPY, and even RUNNING states safely without lost data given that
> > we cannot simultaneously transition all devices.  That suggest that
> > missing from both these series is support for bringing all devices to
> > these _P2P states before we move any device to one of STOP, STOP_COPY,
> > or RUNNING states (in the case of RESUMING).
> >
> > Also, I recall discussions that we need to enforce configuration
> > restrictions when not all devices support the _P2P states?  For example
> > adding a migration blocker when there are multiple vfio devices and at
> > least one of them does not support _P2P migration states.  Or perhaps
> > initially, requiring support for _P2P states.
> >
> > I think what's implemented here, where we don't make use of the _P2P
> > states would require adding a migration blocker whenever there are
> > multiple vfio devices, regardless of the device support for _P2P.  
> 
> Yes, I think you are right.
> 
> I will add a migration blocker if there are multiple devices as part of 
> v9 of the basic series.
> When P2P support is added, I will block migration of multiple devices if 
> one or more of them doesn't support P2P.

How does this affect our path towards supported migration?  I'm
thinking about a user experience where QEMU supports migration if
device A OR device B are attached, but not devices A and B attached to
the same VM.  We might have a device C where QEMU supports migration
with B AND C, but not A AND C, nor A AND B AND C.  This would be the
case if device B and device C both supported P2P states, but device A
did not. The user has no observability of this feature, so all of this
looks effectively random to the user.

Even in the single device case, we need to make an assumption that a
device that does not support P2P migration states (or when QEMU doesn't
make use of P2P states) cannot be a DMA target, or otherwise have its
MMIO space accessed while in a STOP state.  Can we guarantee that when
other devices have not yet transitioned to STOP?

We could disable the direct map MemoryRegions when we move to a STOP
state, which would give QEMU visibility to those accesses, but besides
pulling an abort should such an access occur, could we queue them in
software, add them to the migration stream, and replay them after the
device moves to the RUNNING state?  We'd need to account for the lack of
RESUMING_P2P states as well, trapping and queueing accesses from devices
already RUNNING to those still in RESUMING (not _P2P).

This all looks complicated.  Is it better to start with requiring P2P
state support?  Thanks,

Alex



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 01/18] vfio/migration: Add VFIO migration pre-copy support
  2023-01-31 22:43       ` Alex Williamson
@ 2023-01-31 23:29         ` Jason Gunthorpe
  2023-02-01  4:15           ` Alex Williamson
  0 siblings, 1 reply; 42+ messages in thread
From: Jason Gunthorpe @ 2023-01-31 23:29 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Avihai Horon, qemu-devel, Michael S. Tsirkin, Peter Xu,
	Jason Wang, Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Maor Gottlieb, Kirti Wankhede, Tarun Gupta,
	Joao Martins

On Tue, Jan 31, 2023 at 03:43:01PM -0700, Alex Williamson wrote:

> How does this affect our path towards supported migration?  I'm
> thinking about a user experience where QEMU supports migration if
> device A OR device B are attached, but not devices A and B attached to
> the same VM.  We might have a device C where QEMU supports migration
> with B AND C, but not A AND C, nor A AND B AND C.  This would be the
> case if device B and device C both supported P2P states, but device A
> did not. The user has no observability of this feature, so all of this
> looks effectively random to the user.

I think qemu should just log if it encounters a device without P2P
support.

> Even in the single device case, we need to make an assumption that a
> device that does not support P2P migration states (or when QEMU doesn't
> make use of P2P states) cannot be a DMA target, or otherwise have its
> MMIO space accessed while in a STOP state.  Can we guarantee that when
> other devices have not yet transitioned to STOP?

You mean the software devices created by qemu?

> We could disable the direct map MemoryRegions when we move to a STOP
> state, which would give QEMU visibility to those accesses, but besides
> pulling an abort should such an access occur, could we queue them in
> software, add them to the migration stream, and replay them after the
> device moves to the RUNNING state?  We'd need to account for the lack of
> RESUMING_P2P states as well, trapping and queue accesses from devices
> already RUNNING to those still in RESUMING (not _P2P).

I think any internal SW devices should just fail all accesses to the
P2P space, all the time.

qemu simply acts like a real system that doesn't support P2P.

IMHO this is generally the way forward to do multi-device as well,
remove the MMIO from all the address maps: VFIO, SW access, all of
them. Nothing can touch MMIO except for the vCPU.

> This all looks complicated.  Is it better to start with requiring P2P
> state support?  Thanks,

People have built HW without it, so I don't see this as so good..

Jason


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 01/18] vfio/migration: Add VFIO migration pre-copy support
  2023-01-31 23:29         ` Jason Gunthorpe
@ 2023-02-01  4:15           ` Alex Williamson
  2023-02-01 17:28             ` Jason Gunthorpe
  0 siblings, 1 reply; 42+ messages in thread
From: Alex Williamson @ 2023-02-01  4:15 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Avihai Horon, qemu-devel, Michael S. Tsirkin, Peter Xu,
	Jason Wang, Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Maor Gottlieb, Kirti Wankhede, Tarun Gupta,
	Joao Martins

On Tue, 31 Jan 2023 19:29:48 -0400
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Tue, Jan 31, 2023 at 03:43:01PM -0700, Alex Williamson wrote:
> 
> > How does this affect our path towards supported migration?  I'm
> > thinking about a user experience where QEMU supports migration if
> > device A OR device B are attached, but not devices A and B attached to
> > the same VM.  We might have a device C where QEMU supports migration
> > with B AND C, but not A AND C, nor A AND B AND C.  This would be the
> > case if device B and device C both supported P2P states, but device A
> > did not. The user has no observability of this feature, so all of this
> > looks effectively random to the user.  
> 
> I think qemu should just log if it encounters a device without P2P
> support.

Better for debugging, but still poor from a VM management perspective.

> > Even in the single device case, we need to make an assumption that a
> > device that does not support P2P migration states (or when QEMU doesn't
> > make use of P2P states) cannot be a DMA target, or otherwise have its
> > MMIO space accessed while in a STOP state.  Can we guarantee that when
> > other devices have not yet transitioned to STOP?  
> 
> You mean the software devices created by qemu?

Other devices, software or otherwise, yes.

> > We could disable the direct map MemoryRegions when we move to a STOP
> > state, which would give QEMU visibility to those accesses, but besides
> > pulling an abort should such an access occur, could we queue them in
> > software, add them to the migration stream, and replay them after the
> > device moves to the RUNNING state?  We'd need to account for the lack of
> > RESUMING_P2P states as well, trapping and queue accesses from devices
> > already RUNNING to those still in RESUMING (not _P2P).  
> 
> I think any internal SW devices should just fail all accesses to the
> P2P space, all the time.
> 
> qemu simply acts like a real system that doesn't support P2P.
> 
> IMHO this is generally the way forward to do multi-device as well,
> remove the MMIO from all the address maps: VFIO, SW access, all of
> them. Nothing can touch MMIO except for the vCPU.

Are you suggesting this relative to migration or in general?  P2P is a
feature with tangible benefits and real use cases.  Real systems seem
to be moving towards making P2P work better, so it would seem short
sighted to move to and enforce only configurations w/o P2P in QEMU
generally.  Besides, this would require some sort of deprecation, so are
we intending to make users choose between migration and P2P?
 
> > This all looks complicated.  Is it better to start with requiring P2P
> > state support?  Thanks,  
> 
> People have built HW without it, so I don't see this as so good..

Are we obliged to start with that hardware?  I'm just trying to think
about whether a single device restriction is sufficient to prevent any
possible P2P or whether there might be an easier starting point for
more capable hardware.  There's no shortage of hardware that could
support migration given sufficient effort.  Thanks,

Alex



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 01/18] vfio/migration: Add VFIO migration pre-copy support
  2023-02-01  4:15           ` Alex Williamson
@ 2023-02-01 17:28             ` Jason Gunthorpe
  2023-02-01 18:42               ` Alex Williamson
  0 siblings, 1 reply; 42+ messages in thread
From: Jason Gunthorpe @ 2023-02-01 17:28 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Avihai Horon, qemu-devel, Michael S. Tsirkin, Peter Xu,
	Jason Wang, Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Maor Gottlieb, Kirti Wankhede, Tarun Gupta,
	Joao Martins

On Tue, Jan 31, 2023 at 09:15:03PM -0700, Alex Williamson wrote:

> > IMHO this is generally the way forward to do multi-device as well,
> > remove the MMIO from all the address maps: VFIO, SW access, all of
> > them. Nothing can touch MMIO except for the vCPU.
> 
> Are you suggesting this relative to migration or in general?  

I would suggest a general qemu p2p on/off option.

> P2P is a feature with tangible benefits and real use cases.  Real
> systems seem to be moving towards making P2P work better, so it
> would seem short sighted to move to and enforce only configurations
> w/o P2P in QEMU generally.  

qemu needs to support it, but it should be a user option.

Every system I've been involved with for enabling P2P into a VM has
been a total nightmare. This is not something you just turn on and it
works great :\ The whole thing was carefully engineered right down to
the BIOS to be able to work safely.

P2P in baremetal is much easier compared to P2P inside a VM.

> Besides, this would require some sort of deprecation, so are we
> intending to make users choose between migration and P2P?

Give qemu an option 'p2p on/p2p off' and default it to on for
backwards compatibility.

If p2p is on and migration devices don't support P2P states, then
migration is disabled. The user made this choice when they bought
un-capable HW.

Log warnings to make it more discoverable. I think with the cdev
patches we can make it so libvirt can query the device FD for
capabilities to be even cleaner.

If user sets 'p2p off' then migration works with all HW.

p2p on/off is a global switch. With p2p off nothing, no HW or SW or
hybrid device, can touch the MMIO memory.

'p2p off' is a valuable option in its own right because this stuff
doesn't work reliably and is actively dangerous. Did you know you can
hard crash the bare metal from a guest on some platforms with P2P
operations? Yikes. If you don't need to use it turn it off and don't
take the risk.

Arguably for this reason 'p2p off' should trend toward the default as
it is much safer.

> Are we obliged to start with that hardware?  I'm just trying to think
> about whether a single device restriction is sufficient to prevent any
> possible P2P or whether there might be an easier starting point for
> more capable hardware.  There's no shortage of hardware that could
> support migration given sufficient effort.  Thanks,

I think multi-device will likely have some use cases, so I'd like to
see a path to have support for them. For this series I think it is
probably fine since it is already 18 patches.

Jason


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 01/18] vfio/migration: Add VFIO migration pre-copy support
  2023-02-01 17:28             ` Jason Gunthorpe
@ 2023-02-01 18:42               ` Alex Williamson
  2023-02-01 20:10                 ` Jason Gunthorpe
  0 siblings, 1 reply; 42+ messages in thread
From: Alex Williamson @ 2023-02-01 18:42 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Avihai Horon, qemu-devel, Michael S. Tsirkin, Peter Xu,
	Jason Wang, Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Maor Gottlieb, Kirti Wankhede, Tarun Gupta,
	Joao Martins

On Wed, 1 Feb 2023 13:28:40 -0400
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Tue, Jan 31, 2023 at 09:15:03PM -0700, Alex Williamson wrote:
> 
> > > IMHO this is generally the way forward to do multi-device as well,
> > > remove the MMIO from all the address maps: VFIO, SW access, all of
> > > them. Nothing can touch MMIO except for the vCPU.  
> > 
> > Are you suggesting this relative to migration or in general?    
> 
> I would suggest a general qemu p2p on/off option.
> 
> > P2P is a feature with tangible benefits and real use cases.  Real
> > systems seem to be moving towards making P2P work better, so it
> > would seem short sighted to move to and enforce only configurations
> > w/o P2P in QEMU generally.    
> 
> qemu needs to support it, but it should be a user option option.
> 
> Every system I've been involved with for enabling P2P into a VM has
> been a total nightmare. This is not something you just turn on and it
> works great :\ The whole thing was carefully engineered right down to
> the BIOS to be able to work safely.
> 
> P2P in baremetal is much easier compared to P2P inside a VM.
> 
> > Besides, this would require some sort of deprecation, so are we
> > intending to make users choose between migration and P2P?  
> 
> Give qemu an option 'p2p on/p2p off' and default it to on for
> backwards compatability.
> 
> If p2p on and migration devices don't support P2P states then
> migration is disabled. The user made this choice when they bought
> un-capable HW.
> 
> Log warnings to make it more discoverable. I think with the cdev
> patches we can make it so libvirt can query the device FD for
> capabilities to be even cleaner.
> 
> If user sets 'p2p off' then migration works with all HW.
> 
> p2p on/off is a global switch. With p2p off nothing, no HW or SW or
> hybrid device, can touch the MMIO memory.
> 
> 'p2p off' is a valuable option in its own right because this stuff
> doesn't work reliably and is actively dangerous. Did you know you can
> hard crash the bare metal from a guest on some platforms with P2P
> operations? Yikes. If you don't need to use it turn it off and don't
> take the risk.

If we're honest, there are a number of cases of non-exceptional faults
that an assigned device can generate that the platform might escalate
to fatal errors.

> Arguably for this reason 'p2p off' should trend toward the default as
> it is much safer.

Safety in the hands of the userspace to protect the host though?
Shouldn't the opt-in be at the kernel level whether to allow p2p
mappings?  I don't have an issue if QEMU were to mirror this by
creating a RAM-only AddressSpace for devices which would be used when
p2p is disabled (it'd save us some headaches for various unaligned
devices as well), but we shouldn't pretend that actually protects the
host.  OTOH, QEMU could feel confident supporting migration of devices
w/o support of the migration P2P states with that restriction.

> > Are we obliged to start with that hardware?  I'm just trying to think
> > about whether a single device restriction is sufficient to prevent any
> > possible P2P or whether there might be an easier starting point for
> > more capable hardware.  There's no shortage of hardware that could
> > support migration given sufficient effort.  Thanks,  
> 
> I think multi-device will likely have some use cases, so I'd like to
> see a path to have support for them. For this series I think it is
> probably fine since it is already 18 patches.

It might be fine for this series because it hasn't yet proposed to make
migration non-experimental, but it's unclear where the goal post is
that we can actually make that transition.

If we restrict migration to a single vfio device, is that enough?
Theoretically it's not, but perhaps in practice... maybe?

Therefore, do we depend on QEMU implementing a new RAM-only AddressSpace
for devices?  What's the path to making it the default?  Maybe there
are other aspects of the VM from which we can infer a preference
towards migration support, ex. 'host' CPU type.

Another option, as previously mentioned, is to start with requiring P2P
migration support both at the device and QEMU, where we only restrict
the set of devices that could initially support migration without
modifying existing behavior of the VM.

In any case, it seems we're a bit further from being able to declare
this as supported than some had hoped.  Thanks,

Alex



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 01/18] vfio/migration: Add VFIO migration pre-copy support
  2023-02-01 18:42               ` Alex Williamson
@ 2023-02-01 20:10                 ` Jason Gunthorpe
  0 siblings, 0 replies; 42+ messages in thread
From: Jason Gunthorpe @ 2023-02-01 20:10 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Avihai Horon, qemu-devel, Michael S. Tsirkin, Peter Xu,
	Jason Wang, Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Maor Gottlieb, Kirti Wankhede, Tarun Gupta,
	Joao Martins

On Wed, Feb 01, 2023 at 11:42:46AM -0700, Alex Williamson wrote:

> > 'p2p off' is a valuable option in its own right because this stuff
> > doesn't work reliably and is actively dangerous. Did you know you can
> > hard crash the bare metal from a guest on some platforms with P2P
> > operations? Yikes. If you don't need to use it turn it off and don't
> > take the risk.
> 
> If we're honest, there are a number of cases of non-exceptional faults
> that an assigned device can generate that the platform might escalate
> to fatal errors.

My understanding is that this is true on some commodity hardware, but
systems engineered to run as cloud hypervisors have these problems
solved and VFIO is made safe.

Unfortunately there is no way to know if you have a safe or unsafe
system from the OS.

> > Arguably for this reason 'p2p off' should trend toward the default as
> > it is much safer.
> 
> Safety in the hands of the userspace to protect the host though?
> Shouldn't the opt-in be at the kernel level whether to allow p2p
> mappings?  

I haven't seen anyone interested in doing this kind of work. The
expectation seems to be that places seriously concerned about security
either don't include VFIO at all in their environments or have
engineered their platforms to make it safe.

Where this leaves the enterprise space, I don't know. I think they end
up with systems that functionally work but possibly have DOS problems.

So, given this landscape I think a user option in qemu is the best we
can do at the moment.

> I don't have an issue if QEMU were to mirror this by
> creating a RAM-only AddressSpace for devices which would be used when
> p2p is disable (it'd save us some headaches for various unaligned
> devices as well), but we shouldn't pretend that actually protects the
> host.  OTOH, QEMU could feel confident supporting migration of devices
> w/o support of the migration P2P states with that restriction.

It protects the host from a hostile VM; it does not fully protect the
host from a compromised qemu. That is still an improvement.

> > I think multi-device will likely have some use cases, so I'd like to
> > see a path to have support for them. For this series I think it is
> > probably fine since it is already 18 patches.
> 
> It might be fine for this series because it hasn't yet proposed to make
> migration non-experimental, but it's unclear where the goal post is
> that we can actually make that transition.

IMHO non-experimental just means the solution works with whatever
configuration limitations it comes with. It shouldn't mean every
device or every configuration combination works.

So if you want to do single device, or just hard require P2P for now,
those are both reasonable temporary choices IMHO.

But they are temporary and we should come with a remedy to allow
non-P2P migration devices to work as well.

Given we merged a non-P2P kernel driver I prefer the single device
option as it is more logically consistent with the kernel situation.

Jason


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 13/18] memory/iommu: Add IOMMU_ATTR_MAX_IOVA attribute
  2023-01-26 18:49 ` [PATCH 13/18] memory/iommu: Add IOMMU_ATTR_MAX_IOVA attribute Avihai Horon
@ 2023-02-09 22:16   ` Peter Xu
  0 siblings, 0 replies; 42+ messages in thread
From: Peter Xu @ 2023-02-09 22:16 UTC (permalink / raw)
  To: Avihai Horon
  Cc: qemu-devel, Alex Williamson, Michael S. Tsirkin, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta, Joao Martins

On Thu, Jan 26, 2023 at 08:49:43PM +0200, Avihai Horon wrote:
> Add a new IOMMU attribute IOMMU_ATTR_MAX_IOVA which indicates the
> maximal IOVA that an IOMMU can use.
> 
> This attribute will be used by VFIO device dirty page tracking so it can
> track the entire IOVA space when needed (i.e. when vIOMMU is enabled).
> 
> Signed-off-by: Avihai Horon <avihaih@nvidia.com>

Acked-by: Peter Xu <peterx@redhat.com>

-- 
Peter Xu
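
For context on how the new attribute is meant to be consumed, a short
sketch (assuming the existing memory_region_iommu_get_attr() helper and
an iommu_mr handle to the vIOMMU region):

hwaddr max_iova;

/* Ask the vIOMMU how large its IOVA space is; fall back if unsupported */
if (memory_region_iommu_get_attr(iommu_mr, IOMMU_ATTR_MAX_IOVA, &max_iova)) {
    max_iova = (hwaddr)-1; /* attribute not implemented by this vIOMMU */
}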



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 14/18] intel-iommu: Implement get_attr() method
  2023-01-26 18:49 ` [PATCH 14/18] intel-iommu: Implement get_attr() method Avihai Horon
@ 2023-02-09 22:18   ` Peter Xu
  0 siblings, 0 replies; 42+ messages in thread
From: Peter Xu @ 2023-02-09 22:18 UTC (permalink / raw)
  To: Avihai Horon
  Cc: qemu-devel, Alex Williamson, Michael S. Tsirkin, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta, Joao Martins

On Thu, Jan 26, 2023 at 08:49:44PM +0200, Avihai Horon wrote:
> Implement get_attr() method and use the address width property to report
> the IOMMU_ATTR_MAX_IOVA attribute.
> 
> Signed-off-by: Avihai Horon <avihaih@nvidia.com>

Acked-by: Peter Xu <peterx@redhat.com>

-- 
Peter Xu
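
A minimal sketch of what such a get_attr() implementation could look like
(inferred from the commit message only; aw_bits is assumed to be the
address width property it refers to):

static int vtd_iommu_get_attr(IOMMUMemoryRegion *iommu_mr,
                              enum IOMMUMemoryRegionAttr attr, void *data)
{
    VTDAddressSpace *vtd_as = container_of(iommu_mr, VTDAddressSpace, iommu);
    IntelIOMMUState *s = vtd_as->iommu_state;

    if (attr == IOMMU_ATTR_MAX_IOVA) {
        hwaddr *max_iova = data;

        /* Highest IOVA the guest can program, given the address width */
        *max_iova = (1ULL << s->aw_bits) - 1;
        return 0;
    }

    return -EINVAL;
}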



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 07/18] util: Extend iova_tree_foreach() to take data argument
  2023-01-26 18:49 ` [PATCH 07/18] util: Extend iova_tree_foreach() to take data argument Avihai Horon
@ 2023-02-09 22:21   ` Peter Xu
  0 siblings, 0 replies; 42+ messages in thread
From: Peter Xu @ 2023-02-09 22:21 UTC (permalink / raw)
  To: Avihai Horon
  Cc: qemu-devel, Alex Williamson, Michael S. Tsirkin, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta, Joao Martins

On Thu, Jan 26, 2023 at 08:49:37PM +0200, Avihai Horon wrote:
> From: Joao Martins <joao.m.martins@oracle.com>
> 
> Extend iova_tree_foreach() to take data argument to be passed and used
> by the iterator.
> 
> While at it, fix a documentation error:
> The documentation says iova_tree_foreach() returns a value even though
> it is a void function.
> 
> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>

Acked-by: Peter Xu <peterx@redhat.com>

-- 
Peter Xu
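
For reference, a sketch of what the extended interface could look like
(hypothetical, inferred from the commit message; the pre-existing
prototype takes no data argument):

typedef gboolean (*iova_tree_iterator)(DMAMap *map, gpointer data);

/* Walk the tree in ascending IOVA order, passing 'data' to each call */
void iova_tree_foreach(IOVATree *tree, iova_tree_iterator iterator,
                       gpointer data);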



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 06/18] util: Add iova_tree_nnodes()
  2023-01-26 18:49 ` [PATCH 06/18] util: Add iova_tree_nnodes() Avihai Horon
@ 2023-02-09 22:21   ` Peter Xu
  0 siblings, 0 replies; 42+ messages in thread
From: Peter Xu @ 2023-02-09 22:21 UTC (permalink / raw)
  To: Avihai Horon
  Cc: qemu-devel, Alex Williamson, Michael S. Tsirkin, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta, Joao Martins

On Thu, Jan 26, 2023 at 08:49:36PM +0200, Avihai Horon wrote:
> From: Joao Martins <joao.m.martins@oracle.com>
> 
> Add iova_tree_nnodes() which returns the number of nodes in the IOVA
> tree.
> 
> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>

Acked-by: Peter Xu <peterx@redhat.com>

-- 
Peter Xu
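
A sketch of the helper being added (assuming IOVATree wraps a GTree
internally, as util/iova-tree.c currently does):

gint iova_tree_nnodes(IOVATree *tree)
{
    /* Number of DMA mappings currently recorded in the tree */
    return g_tree_nnodes(tree->tree);
}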



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 05/18] vfio/common: Add VFIOBitmap and (de)alloc functions
  2023-01-27 21:11   ` Alex Williamson
@ 2023-02-12 15:36     ` Avihai Horon
  2023-02-14 21:28       ` Alex Williamson
  0 siblings, 1 reply; 42+ messages in thread
From: Avihai Horon @ 2023-02-12 15:36 UTC (permalink / raw)
  To: Alex Williamson
  Cc: qemu-devel, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta, Joao Martins


On 27/01/2023 23:11, Alex Williamson wrote:
> On Thu, 26 Jan 2023 20:49:35 +0200
> Avihai Horon <avihaih@nvidia.com> wrote:
>
>> There are already two places where dirty page bitmap allocation and
>> calculations are done in open code. With device dirty page tracking
>> being added in next patches, there are going to be even more places.
>>
>> To avoid code duplication, introduce VFIOBitmap struct and corresponding
>> alloc and dealloc functions and use them where applicable.
>>
>> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
>> ---
>>   hw/vfio/common.c | 89 ++++++++++++++++++++++++++++++++----------------
>>   1 file changed, 60 insertions(+), 29 deletions(-)
>>
>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>> index 8e8ffbc046..e554573eb5 100644
>> --- a/hw/vfio/common.c
>> +++ b/hw/vfio/common.c
>> @@ -319,6 +319,41 @@ const MemoryRegionOps vfio_region_ops = {
>>    * Device state interfaces
>>    */
>>
>> +typedef struct {
>> +    unsigned long *bitmap;
>> +    hwaddr size;
>> +    hwaddr pages;
>> +} VFIOBitmap;
>> +
>> +static VFIOBitmap *vfio_bitmap_alloc(hwaddr size)
>> +{
>> +    VFIOBitmap *vbmap = g_try_new0(VFIOBitmap, 1);
>> +    if (!vbmap) {
>> +        errno = ENOMEM;
>> +
>> +        return NULL;
>> +    }
>> +
>> +    vbmap->pages = REAL_HOST_PAGE_ALIGN(size) / qemu_real_host_page_size();
>> +    vbmap->size =  ROUND_UP(vbmap->pages, sizeof(__u64) * BITS_PER_BYTE) /
>> +                                          BITS_PER_BYTE;
>> +    vbmap->bitmap = g_try_malloc0(vbmap->size);
>> +    if (!vbmap->bitmap) {
>> +        g_free(vbmap);
>> +        errno = ENOMEM;
>> +
>> +        return NULL;
>> +    }
>> +
>> +    return vbmap;
>> +}
>> +
>> +static void vfio_bitmap_dealloc(VFIOBitmap *vbmap)
>> +{
>> +    g_free(vbmap->bitmap);
>> +    g_free(vbmap);
>> +}
>> +
>>   bool vfio_mig_active(void)
>>   {
>>       VFIOGroup *group;
>> @@ -421,9 +456,14 @@ static int vfio_dma_unmap_bitmap(VFIOContainer *container,
>>   {
>>       struct vfio_iommu_type1_dma_unmap *unmap;
>>       struct vfio_bitmap *bitmap;
>> -    uint64_t pages = REAL_HOST_PAGE_ALIGN(size) / qemu_real_host_page_size();
>> +    VFIOBitmap *vbmap;
>>       int ret;
>>
>> +    vbmap = vfio_bitmap_alloc(size);
>> +    if (!vbmap) {
>> +        return -errno;
>> +    }
>> +
>>       unmap = g_malloc0(sizeof(*unmap) + sizeof(*bitmap));
>>
>>       unmap->argsz = sizeof(*unmap) + sizeof(*bitmap);
>> @@ -437,35 +477,28 @@ static int vfio_dma_unmap_bitmap(VFIOContainer *container,
>>        * qemu_real_host_page_size to mark those dirty. Hence set bitmap_pgsize
>>        * to qemu_real_host_page_size.
>>        */
>> -
>>       bitmap->pgsize = qemu_real_host_page_size();
>> -    bitmap->size = ROUND_UP(pages, sizeof(__u64) * BITS_PER_BYTE) /
>> -                   BITS_PER_BYTE;
>> +    bitmap->size = vbmap->size;
>> +    bitmap->data = (__u64 *)vbmap->bitmap;
>>
>> -    if (bitmap->size > container->max_dirty_bitmap_size) {
>> -        error_report("UNMAP: Size of bitmap too big 0x%"PRIx64,
>> -                     (uint64_t)bitmap->size);
>> +    if (vbmap->size > container->max_dirty_bitmap_size) {
>> +        error_report("UNMAP: Size of bitmap too big 0x%"PRIx64, vbmap->size);
> Why not pass the container to the alloc function so we can test this
> consistently for each bitmap we allocate?

Hi, sorry for the delay.

This test is relevant only for VFIO IOMMU dirty tracking. With device 
dirty tracking it should be skipped.
Do you think we should still move it to the alloc function?

Thanks.




* Re: [PATCH 08/18] vfio/common: Record DMA mapped IOVA ranges
  2023-01-27 21:42   ` Alex Williamson
@ 2023-02-12 15:40     ` Avihai Horon
  2023-02-13 15:25       ` Alex Williamson
  0 siblings, 1 reply; 42+ messages in thread
From: Avihai Horon @ 2023-02-12 15:40 UTC (permalink / raw)
  To: Alex Williamson
  Cc: qemu-devel, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta, Joao Martins


On 27/01/2023 23:42, Alex Williamson wrote:
> On Thu, 26 Jan 2023 20:49:38 +0200
> Avihai Horon <avihaih@nvidia.com> wrote:
>
>> From: Joao Martins <joao.m.martins@oracle.com>
>>
>> According to the device DMA logging uAPI, IOVA ranges to be logged by
>> the device must be provided all at once upon DMA logging start.
>>
>> As preparation for the following patches which will add device dirty
>> page tracking, keep a record of all DMA mapped IOVA ranges so later they
>> can be used for DMA logging start.
>>
>> Note that when vIOMMU is enabled DMA mapped IOVA ranges are not tracked.
>> This is due to the dynamic nature of vIOMMU DMA mapping/unmapping.
>> Following patches will address the vIOMMU case specifically.
>>
>> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
>> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
>> ---
>>   include/hw/vfio/vfio-common.h |  3 ++
>>   hw/vfio/common.c              | 86 +++++++++++++++++++++++++++++++++--
>>   2 files changed, 86 insertions(+), 3 deletions(-)
>>
>> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
>> index 88c2194fb9..d54000d7ae 100644
>> --- a/include/hw/vfio/vfio-common.h
>> +++ b/include/hw/vfio/vfio-common.h
>> @@ -23,6 +23,7 @@
>>
>>   #include "exec/memory.h"
>>   #include "qemu/queue.h"
>> +#include "qemu/iova-tree.h"
>>   #include "qemu/notify.h"
>>   #include "ui/console.h"
>>   #include "hw/display/ramfb.h"
>> @@ -94,6 +95,8 @@ typedef struct VFIOContainer {
>>       uint64_t max_dirty_bitmap_size;
>>       unsigned long pgsizes;
>>       unsigned int dma_max_mappings;
>> +    IOVATree *mappings;
>> +    QemuMutex mappings_mutex;
>>       QLIST_HEAD(, VFIOGuestIOMMU) giommu_list;
>>       QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list;
>>       QLIST_HEAD(, VFIOGroup) group_list;
>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>> index e554573eb5..fafc361cea 100644
>> --- a/hw/vfio/common.c
>> +++ b/hw/vfio/common.c
>> @@ -43,6 +43,7 @@
>>   #include "migration/misc.h"
>>   #include "migration/qemu-file.h"
>>   #include "sysemu/tpm.h"
>> +#include "qemu/iova-tree.h"
>>
>>   VFIOGroupList vfio_group_list =
>>       QLIST_HEAD_INITIALIZER(vfio_group_list);
>> @@ -373,6 +374,11 @@ bool vfio_mig_active(void)
>>       return true;
>>   }
>>
>> +static bool vfio_have_giommu(VFIOContainer *container)
>> +{
>> +    return !QLIST_EMPTY(&container->giommu_list);
>> +}
>> +
>>   static void vfio_set_migration_error(int err)
>>   {
>>       MigrationState *ms = migrate_get_current();
>> @@ -450,6 +456,51 @@ static bool vfio_devices_all_running_and_mig_active(VFIOContainer *container)
>>       return true;
>>   }
>>
>> +static int vfio_record_mapping(VFIOContainer *container, hwaddr iova,
>> +                               hwaddr size, bool readonly)
>> +{
>> +    DMAMap map = {
>> +        .iova = iova,
>> +        .size = size - 1, /* IOVATree is inclusive, so subtract 1 from size */
> <facepalm>

I am not sure what error you are referring to here.
Is it because DMAMap.size is not actually the size?
Or something else?

Thanks.




* Re: [PATCH 11/18] vfio/common: Add device dirty page bitmap sync
  2023-01-27 23:37   ` Alex Williamson
@ 2023-02-12 15:49     ` Avihai Horon
  0 siblings, 0 replies; 42+ messages in thread
From: Avihai Horon @ 2023-02-12 15:49 UTC (permalink / raw)
  To: Alex Williamson
  Cc: qemu-devel, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta, Joao Martins


On 28/01/2023 1:37, Alex Williamson wrote:
> On Thu, 26 Jan 2023 20:49:41 +0200
> Avihai Horon <avihaih@nvidia.com> wrote:
>
>> From: Joao Martins <joao.m.martins@oracle.com>
>>
>> Add device dirty page bitmap sync functionality. This uses the device
>> DMA logging uAPI to sync the dirty page bitmap from the device.
>>
>> Device dirty page bitmap sync is used only if all devices within a
>> container support device dirty page tracking.
>>
>> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
>> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
>> ---
>>   hw/vfio/common.c | 93 ++++++++++++++++++++++++++++++++++++++++++------
>>   1 file changed, 82 insertions(+), 11 deletions(-)
>>
>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>> index 3caa73d6f7..0003f2421d 100644
>> --- a/hw/vfio/common.c
>> +++ b/hw/vfio/common.c
>> @@ -355,6 +355,9 @@ static void vfio_bitmap_dealloc(VFIOBitmap *vbmap)
>>       g_free(vbmap);
>>   }
>>
>> +static int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
>> +                                 uint64_t size, ram_addr_t ram_addr);
>> +
>>   bool vfio_mig_active(void)
>>   {
>>       VFIOGroup *group;
>> @@ -582,10 +585,19 @@ static int vfio_dma_unmap(VFIOContainer *container,
>>           .iova = iova,
>>           .size = size,
>>       };
>> +    int ret;
>>
>> -    if (iotlb && container->dirty_pages_supported &&
>> -        vfio_devices_all_running_and_mig_active(container)) {
>> -        return vfio_dma_unmap_bitmap(container, iova, size, iotlb);
>> +    if (iotlb && vfio_devices_all_running_and_mig_active(container)) {
>> +        if (!vfio_devices_all_device_dirty_tracking(container) &&
>> +            container->dirty_pages_supported) {
>> +            return vfio_dma_unmap_bitmap(container, iova, size, iotlb);
>> +        }
>> +
>> +        ret = vfio_get_dirty_bitmap(container, iova, size,
>> +                                    iotlb->translated_addr);
>> +        if (ret) {
>> +            return ret;
>> +        }
> Isn't the ordering backwards here?  Only after the range is unmapped
> can we know that this container can no longer dirty pages within the
> range.

Oops, I thought it was OK to query the dirty bitmap when we get the
vIOMMU unmap notification.
I will reverse the order.

Thanks.
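
For readers following the thread, the reordered flow discussed above would
look roughly like the sketch below: perform the unmap first, and only then
collect the dirty bitmap for the range, since only after the unmap is it
guaranteed the device can no longer dirty pages in it. The helper name and
structure here are illustrative and are not taken from the actual respin.

    /* Sketch: with device dirty tracking, unmap before the bitmap query. */
    static int vfio_dma_unmap_then_sync(VFIOContainer *container, hwaddr iova,
                                        hwaddr size, IOMMUTLBEntry *iotlb)
    {
        int ret;

        /* Plain unmap first (no type1 unmap-bitmap involved). */
        ret = vfio_dma_unmap(container, iova, size, NULL);
        if (ret) {
            return ret;
        }

        /* The range can no longer be dirtied by the device; read the log. */
        return vfio_get_dirty_bitmap(container, iova, size,
                                     iotlb->translated_addr);
    }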




* Re: [PATCH 08/18] vfio/common: Record DMA mapped IOVA ranges
  2023-02-12 15:40     ` Avihai Horon
@ 2023-02-13 15:25       ` Alex Williamson
  0 siblings, 0 replies; 42+ messages in thread
From: Alex Williamson @ 2023-02-13 15:25 UTC (permalink / raw)
  To: Avihai Horon
  Cc: qemu-devel, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta, Joao Martins

On Sun, 12 Feb 2023 17:40:06 +0200
Avihai Horon <avihaih@nvidia.com> wrote:

> On 27/01/2023 23:42, Alex Williamson wrote:
> > On Thu, 26 Jan 2023 20:49:38 +0200
> > Avihai Horon <avihaih@nvidia.com> wrote:
> >  
> >> From: Joao Martins <joao.m.martins@oracle.com>
> >>
> >> According to the device DMA logging uAPI, IOVA ranges to be logged by
> >> the device must be provided all at once upon DMA logging start.
> >>
> >> As preparation for the following patches which will add device dirty
> >> page tracking, keep a record of all DMA mapped IOVA ranges so later they
> >> can be used for DMA logging start.
> >>
> >> Note that when vIOMMU is enabled DMA mapped IOVA ranges are not tracked.
> >> This is due to the dynamic nature of vIOMMU DMA mapping/unmapping.
> >> Following patches will address the vIOMMU case specifically.
> >>
> >> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
> >> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
> >> ---
> >>   include/hw/vfio/vfio-common.h |  3 ++
> >>   hw/vfio/common.c              | 86 +++++++++++++++++++++++++++++++++--
> >>   2 files changed, 86 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> >> index 88c2194fb9..d54000d7ae 100644
> >> --- a/include/hw/vfio/vfio-common.h
> >> +++ b/include/hw/vfio/vfio-common.h
> >> @@ -23,6 +23,7 @@
> >>
> >>   #include "exec/memory.h"
> >>   #include "qemu/queue.h"
> >> +#include "qemu/iova-tree.h"
> >>   #include "qemu/notify.h"
> >>   #include "ui/console.h"
> >>   #include "hw/display/ramfb.h"
> >> @@ -94,6 +95,8 @@ typedef struct VFIOContainer {
> >>       uint64_t max_dirty_bitmap_size;
> >>       unsigned long pgsizes;
> >>       unsigned int dma_max_mappings;
> >> +    IOVATree *mappings;
> >> +    QemuMutex mappings_mutex;
> >>       QLIST_HEAD(, VFIOGuestIOMMU) giommu_list;
> >>       QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list;
> >>       QLIST_HEAD(, VFIOGroup) group_list;
> >> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> >> index e554573eb5..fafc361cea 100644
> >> --- a/hw/vfio/common.c
> >> +++ b/hw/vfio/common.c
> >> @@ -43,6 +43,7 @@
> >>   #include "migration/misc.h"
> >>   #include "migration/qemu-file.h"
> >>   #include "sysemu/tpm.h"
> >> +#include "qemu/iova-tree.h"
> >>
> >>   VFIOGroupList vfio_group_list =
> >>       QLIST_HEAD_INITIALIZER(vfio_group_list);
> >> @@ -373,6 +374,11 @@ bool vfio_mig_active(void)
> >>       return true;
> >>   }
> >>
> >> +static bool vfio_have_giommu(VFIOContainer *container)
> >> +{
> >> +    return !QLIST_EMPTY(&container->giommu_list);
> >> +}
> >> +
> >>   static void vfio_set_migration_error(int err)
> >>   {
> >>       MigrationState *ms = migrate_get_current();
> >> @@ -450,6 +456,51 @@ static bool vfio_devices_all_running_and_mig_active(VFIOContainer *container)
> >>       return true;
> >>   }
> >>
> >> +static int vfio_record_mapping(VFIOContainer *container, hwaddr iova,
> >> +                               hwaddr size, bool readonly)
> >> +{
> >> +    DMAMap map = {
> >> +        .iova = iova,
> >> +        .size = size - 1, /* IOVATree is inclusive, so subtract 1 from size */  
> > <facepalm>  
> 
> I am not sure what error you are referring to here.
> Is it because DMAMap.size is not actually the size?
> Or something else?

Sorry, I'm just lamenting what a terrible interface IOVATree provides
with this inclusive range nonsense.  Thanks,

Alex
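
For readers unfamiliar with the helper: IOVATree stores inclusive ranges,
so DMAMap.size is the last offset covered rather than the byte count. A
minimal sketch of recording a byte-count-sized range is shown below; the
helper name is illustrative and assumes qemu/iova-tree.h is included.

    /* Sketch: insert a mapping of `size` bytes at `iova` into an IOVATree. */
    static int record_range(IOVATree *tree, hwaddr iova, hwaddr size)
    {
        DMAMap map = {
            .iova = iova,
            .size = size - 1,   /* inclusive: the last IOVA covered */
        };

        /* Returns IOVA_OK, or IOVA_ERR_OVERLAP/IOVA_ERR_INVALID on failure. */
        return iova_tree_insert(tree, &map);
    }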




* Re: [PATCH 05/18] vfio/common: Add VFIOBitmap and (de)alloc functions
  2023-02-12 15:36     ` Avihai Horon
@ 2023-02-14 21:28       ` Alex Williamson
  0 siblings, 0 replies; 42+ messages in thread
From: Alex Williamson @ 2023-02-14 21:28 UTC (permalink / raw)
  To: Avihai Horon
  Cc: qemu-devel, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta, Joao Martins

On Sun, 12 Feb 2023 17:36:49 +0200
Avihai Horon <avihaih@nvidia.com> wrote:

> On 27/01/2023 23:11, Alex Williamson wrote:
> > On Thu, 26 Jan 2023 20:49:35 +0200
> > Avihai Horon <avihaih@nvidia.com> wrote:
> >  
> >> There are already two places where dirty page bitmap allocation and
> >> calculations are done in open code. With device dirty page tracking
> >> being added in next patches, there are going to be even more places.
> >>
> >> To avoid code duplication, introduce VFIOBitmap struct and corresponding
> >> alloc and dealloc functions and use them where applicable.
> >>
> >> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
> >> ---
> >>   hw/vfio/common.c | 89 ++++++++++++++++++++++++++++++++----------------
> >>   1 file changed, 60 insertions(+), 29 deletions(-)
> >>
> >> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> >> index 8e8ffbc046..e554573eb5 100644
> >> --- a/hw/vfio/common.c
> >> +++ b/hw/vfio/common.c
> >> @@ -319,6 +319,41 @@ const MemoryRegionOps vfio_region_ops = {
> >>    * Device state interfaces
> >>    */
> >>
> >> +typedef struct {
> >> +    unsigned long *bitmap;
> >> +    hwaddr size;
> >> +    hwaddr pages;
> >> +} VFIOBitmap;
> >> +
> >> +static VFIOBitmap *vfio_bitmap_alloc(hwaddr size)
> >> +{
> >> +    VFIOBitmap *vbmap = g_try_new0(VFIOBitmap, 1);
> >> +    if (!vbmap) {
> >> +        errno = ENOMEM;
> >> +
> >> +        return NULL;
> >> +    }
> >> +
> >> +    vbmap->pages = REAL_HOST_PAGE_ALIGN(size) / qemu_real_host_page_size();
> >> +    vbmap->size =  ROUND_UP(vbmap->pages, sizeof(__u64) * BITS_PER_BYTE) /
> >> +                                          BITS_PER_BYTE;
> >> +    vbmap->bitmap = g_try_malloc0(vbmap->size);
> >> +    if (!vbmap->bitmap) {
> >> +        g_free(vbmap);
> >> +        errno = ENOMEM;
> >> +
> >> +        return NULL;
> >> +    }
> >> +
> >> +    return vbmap;
> >> +}
> >> +
> >> +static void vfio_bitmap_dealloc(VFIOBitmap *vbmap)
> >> +{
> >> +    g_free(vbmap->bitmap);
> >> +    g_free(vbmap);
> >> +}
> >> +
> >>   bool vfio_mig_active(void)
> >>   {
> >>       VFIOGroup *group;
> >> @@ -421,9 +456,14 @@ static int vfio_dma_unmap_bitmap(VFIOContainer *container,
> >>   {
> >>       struct vfio_iommu_type1_dma_unmap *unmap;
> >>       struct vfio_bitmap *bitmap;
> >> -    uint64_t pages = REAL_HOST_PAGE_ALIGN(size) / qemu_real_host_page_size();
> >> +    VFIOBitmap *vbmap;
> >>       int ret;
> >>
> >> +    vbmap = vfio_bitmap_alloc(size);
> >> +    if (!vbmap) {
> >> +        return -errno;
> >> +    }
> >> +
> >>       unmap = g_malloc0(sizeof(*unmap) + sizeof(*bitmap));
> >>
> >>       unmap->argsz = sizeof(*unmap) + sizeof(*bitmap);
> >> @@ -437,35 +477,28 @@ static int vfio_dma_unmap_bitmap(VFIOContainer *container,
> >>        * qemu_real_host_page_size to mark those dirty. Hence set bitmap_pgsize
> >>        * to qemu_real_host_page_size.
> >>        */
> >> -
> >>       bitmap->pgsize = qemu_real_host_page_size();
> >> -    bitmap->size = ROUND_UP(pages, sizeof(__u64) * BITS_PER_BYTE) /
> >> -                   BITS_PER_BYTE;
> >> +    bitmap->size = vbmap->size;
> >> +    bitmap->data = (__u64 *)vbmap->bitmap;
> >>
> >> -    if (bitmap->size > container->max_dirty_bitmap_size) {
> >> -        error_report("UNMAP: Size of bitmap too big 0x%"PRIx64,
> >> -                     (uint64_t)bitmap->size);
> >> +    if (vbmap->size > container->max_dirty_bitmap_size) {
> >> +        error_report("UNMAP: Size of bitmap too big 0x%"PRIx64, vbmap->size);  
> > Why not pass the container to the alloc function so we can test this
> > consistently for each bitmap we allocate?  
> 
> Hi, sorry for the delay.
> 
> This test is relevant only for VFIO IOMMU dirty tracking. With device 
> dirty tracking it should be skipped.
> Do you think we should still move it to the alloc function?

Ah, ok.  Sounds like we'll have to live with a separate test for the
container path.  Thanks,

Alex
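
So, presumably, vfio_bitmap_alloc() stays generic and the type1-only limit
check remains at the vfio_dma_unmap_bitmap() call site, roughly along the
lines of the sketch below; the -E2BIG return is an assumption carried over
from the pre-patch code.

    vbmap = vfio_bitmap_alloc(size);
    if (!vbmap) {
        return -errno;
    }

    /* Only the type1 IOMMU path has a kernel-imposed bitmap size limit. */
    if (vbmap->size > container->max_dirty_bitmap_size) {
        error_report("UNMAP: Size of bitmap too big 0x%"PRIx64, vbmap->size);
        vfio_bitmap_dealloc(vbmap);
        return -E2BIG;
    }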




* Re: [PATCH 02/18] vfio/common: Fix error reporting in vfio_get_dirty_bitmap()
  2023-01-26 18:49 ` [PATCH 02/18] vfio/common: Fix error reporting in vfio_get_dirty_bitmap() Avihai Horon
@ 2023-02-15  9:21   ` Cédric Le Goater
  0 siblings, 0 replies; 42+ messages in thread
From: Cédric Le Goater @ 2023-02-15  9:21 UTC (permalink / raw)
  To: Avihai Horon, qemu-devel
  Cc: Alex Williamson, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta, Joao Martins

On 1/26/23 19:49, Avihai Horon wrote:
> Return -errno instead of -1 if VFIO_IOMMU_DIRTY_PAGES ioctl fails in
> vfio_get_dirty_bitmap().
> 
> Signed-off-by: Avihai Horon <avihaih@nvidia.com>


Reviewed-by: Cédric Le Goater <clg@redhat.com>

Thanks,

C.


> ---
>   hw/vfio/common.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index 93b18c5e3d..d892609cf1 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -1288,6 +1288,7 @@ static int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
>   
>       ret = ioctl(container->fd, VFIO_IOMMU_DIRTY_PAGES, dbitmap);
>       if (ret) {
> +        ret = -errno;
>           error_report("Failed to get dirty bitmap for iova: 0x%"PRIx64
>                   " size: 0x%"PRIx64" err: %d", (uint64_t)range->iova,
>                   (uint64_t)range->size, errno);




* Re: [PATCH 03/18] vfio/common: Fix wrong %m usages
  2023-01-26 18:49 ` [PATCH 03/18] vfio/common: Fix wrong %m usages Avihai Horon
@ 2023-02-15  9:21   ` Cédric Le Goater
  0 siblings, 0 replies; 42+ messages in thread
From: Cédric Le Goater @ 2023-02-15  9:21 UTC (permalink / raw)
  To: Avihai Horon, qemu-devel
  Cc: Alex Williamson, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta, Joao Martins

On 1/26/23 19:49, Avihai Horon wrote:
> There are several places where the %m conversion is used if one of
> vfio_dma_map(), vfio_dma_unmap() or vfio_get_dirty_bitmap() fails.
> 
> The %m usage in these places is wrong since %m relies on the errno value,
> while the above functions don't report errors via errno.
> 
> Fix it by using strerror() with the returned value instead.
> 
> Signed-off-by: Avihai Horon <avihaih@nvidia.com>


Reviewed-by: Cédric Le Goater <clg@redhat.com>

Thanks,

C.


> ---
>   hw/vfio/common.c | 29 ++++++++++++++++-------------
>   1 file changed, 16 insertions(+), 13 deletions(-)
> 
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index d892609cf1..643418f6f1 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -656,17 +656,17 @@ static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
>                              read_only);
>           if (ret) {
>               error_report("vfio_dma_map(%p, 0x%"HWADDR_PRIx", "
> -                         "0x%"HWADDR_PRIx", %p) = %d (%m)",
> +                         "0x%"HWADDR_PRIx", %p) = %d (%s)",
>                            container, iova,
> -                         iotlb->addr_mask + 1, vaddr, ret);
> +                         iotlb->addr_mask + 1, vaddr, ret, strerror(-ret));
>           }
>       } else {
>           ret = vfio_dma_unmap(container, iova, iotlb->addr_mask + 1, iotlb);
>           if (ret) {
>               error_report("vfio_dma_unmap(%p, 0x%"HWADDR_PRIx", "
> -                         "0x%"HWADDR_PRIx") = %d (%m)",
> +                         "0x%"HWADDR_PRIx") = %d (%s)",
>                            container, iova,
> -                         iotlb->addr_mask + 1, ret);
> +                         iotlb->addr_mask + 1, ret, strerror(-ret));
>           }
>       }
>   out:
> @@ -1048,8 +1048,9 @@ static void vfio_listener_region_add(MemoryListener *listener,
>                          vaddr, section->readonly);
>       if (ret) {
>           error_setg(&err, "vfio_dma_map(%p, 0x%"HWADDR_PRIx", "
> -                   "0x%"HWADDR_PRIx", %p) = %d (%m)",
> -                   container, iova, int128_get64(llsize), vaddr, ret);
> +                   "0x%"HWADDR_PRIx", %p) = %d (%s)",
> +                   container, iova, int128_get64(llsize), vaddr, ret,
> +                   strerror(-ret));
>           if (memory_region_is_ram_device(section->mr)) {
>               /* Allow unexpected mappings not to be fatal for RAM devices */
>               error_report_err(err);
> @@ -1181,16 +1182,18 @@ static void vfio_listener_region_del(MemoryListener *listener,
>               ret = vfio_dma_unmap(container, iova, int128_get64(llsize), NULL);
>               if (ret) {
>                   error_report("vfio_dma_unmap(%p, 0x%"HWADDR_PRIx", "
> -                             "0x%"HWADDR_PRIx") = %d (%m)",
> -                             container, iova, int128_get64(llsize), ret);
> +                             "0x%"HWADDR_PRIx") = %d (%s)",
> +                             container, iova, int128_get64(llsize), ret,
> +                             strerror(-ret));
>               }
>               iova += int128_get64(llsize);
>           }
>           ret = vfio_dma_unmap(container, iova, int128_get64(llsize), NULL);
>           if (ret) {
>               error_report("vfio_dma_unmap(%p, 0x%"HWADDR_PRIx", "
> -                         "0x%"HWADDR_PRIx") = %d (%m)",
> -                         container, iova, int128_get64(llsize), ret);
> +                         "0x%"HWADDR_PRIx") = %d (%s)",
> +                         container, iova, int128_get64(llsize), ret,
> +                         strerror(-ret));
>           }
>       }
>   
> @@ -1337,9 +1340,9 @@ static void vfio_iommu_map_dirty_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
>                                       translated_addr);
>           if (ret) {
>               error_report("vfio_iommu_map_dirty_notify(%p, 0x%"HWADDR_PRIx", "
> -                         "0x%"HWADDR_PRIx") = %d (%m)",
> -                         container, iova,
> -                         iotlb->addr_mask + 1, ret);
> +                         "0x%"HWADDR_PRIx") = %d (%s)",
> +                         container, iova, iotlb->addr_mask + 1, ret,
> +                         strerror(-ret));
>           }
>       }
>       rcu_read_unlock();
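
As background for the %m issue the patch fixes: %m is a glibc printf
extension that expands to strerror(errno), which is only meaningful right
after a failing libc call. The wrappers above return a negative errno value
instead of setting errno, so the message must decode the return value with
strerror(-ret). A standalone illustration follows (sketch, not QEMU code).

    #include <errno.h>
    #include <stdio.h>
    #include <string.h>

    /* Stand-in for vfio_dma_map() & co.: errors come back as -errno. */
    static int do_map(void)
    {
        return -EINVAL;
    }

    int main(void)
    {
        int ret = do_map();

        if (ret) {
            /* Wrong: %m prints strerror(errno), which do_map() never set. */
            printf("map failed: %d (%m)\n", ret);
            /* Right: decode the returned -errno value explicitly. */
            printf("map failed: %d (%s)\n", ret, strerror(-ret));
        }
        return 0;
    }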




* Re: [PATCH 04/18] vfio/common: Abort migration if dirty log start/stop/sync fails
  2023-01-26 18:49 ` [PATCH 04/18] vfio/common: Abort migration if dirty log start/stop/sync fails Avihai Horon
@ 2023-02-15  9:41   ` Cédric Le Goater
  0 siblings, 0 replies; 42+ messages in thread
From: Cédric Le Goater @ 2023-02-15  9:41 UTC (permalink / raw)
  To: Avihai Horon, qemu-devel
  Cc: Alex Williamson, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta, Joao Martins

On 1/26/23 19:49, Avihai Horon wrote:
> If VFIO dirty pages log start/stop/sync fails during migration,
> migration should be aborted as pages dirtied by VFIO devices might not
> be reported properly.
> 
> This is not the case today, where in such scenario only an error is
> printed.
> 
> Fix it by aborting migration in the above scenario.
> 
> Fixes: 758b96b61d5c ("vfio/migrate: Move switch of dirty tracking into vfio_memory_listener")
> Fixes: b6dd6504e303 ("vfio: Add vfio_listener_log_sync to mark dirty pages")
> Fixes: 9e7b0442f23a ("vfio: Add ioctl to get dirty pages bitmap during dma unmap")
> Signed-off-by: Avihai Horon <avihaih@nvidia.com>




Reviewed-by: Cédric Le Goater <clg@redhat.com>

Thanks,

C.

> ---
>   hw/vfio/common.c | 53 ++++++++++++++++++++++++++++++++++++++++--------
>   1 file changed, 45 insertions(+), 8 deletions(-)
> 
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index 643418f6f1..8e8ffbc046 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -41,6 +41,7 @@
>   #include "qapi/error.h"
>   #include "migration/migration.h"
>   #include "migration/misc.h"
> +#include "migration/qemu-file.h"
>   #include "sysemu/tpm.h"
>   
>   VFIOGroupList vfio_group_list =
> @@ -337,6 +338,19 @@ bool vfio_mig_active(void)
>       return true;
>   }
>   
> +static void vfio_set_migration_error(int err)
> +{
> +    MigrationState *ms = migrate_get_current();
> +
> +    if (migration_is_setup_or_active(ms->state)) {
> +        WITH_QEMU_LOCK_GUARD(&ms->qemu_file_lock) {
> +            if (ms->to_dst_file) {
> +                qemu_file_set_error(ms->to_dst_file, err);
> +            }
> +        }
> +    }
> +}
> +
>   static bool vfio_devices_all_dirty_tracking(VFIOContainer *container)
>   {
>       VFIOGroup *group;
> @@ -633,6 +647,7 @@ static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
>       if (iotlb->target_as != &address_space_memory) {
>           error_report("Wrong target AS \"%s\", only system memory is allowed",
>                        iotlb->target_as->name ? iotlb->target_as->name : "none");
> +        vfio_set_migration_error(-EINVAL);
>           return;
>       }
>   
> @@ -667,6 +682,7 @@ static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
>                            "0x%"HWADDR_PRIx") = %d (%s)",
>                            container, iova,
>                            iotlb->addr_mask + 1, ret, strerror(-ret));
> +            vfio_set_migration_error(ret);
>           }
>       }
>   out:
> @@ -1212,7 +1228,7 @@ static void vfio_listener_region_del(MemoryListener *listener,
>       }
>   }
>   
> -static void vfio_set_dirty_page_tracking(VFIOContainer *container, bool start)
> +static int vfio_set_dirty_page_tracking(VFIOContainer *container, bool start)
>   {
>       int ret;
>       struct vfio_iommu_type1_dirty_bitmap dirty = {
> @@ -1220,7 +1236,7 @@ static void vfio_set_dirty_page_tracking(VFIOContainer *container, bool start)
>       };
>   
>       if (!container->dirty_pages_supported) {
> -        return;
> +        return 0;
>       }
>   
>       if (start) {
> @@ -1231,23 +1247,34 @@ static void vfio_set_dirty_page_tracking(VFIOContainer *container, bool start)
>   
>       ret = ioctl(container->fd, VFIO_IOMMU_DIRTY_PAGES, &dirty);
>       if (ret) {
> +        ret = -errno;
>           error_report("Failed to set dirty tracking flag 0x%x errno: %d",
>                        dirty.flags, errno);
>       }
> +
> +    return ret;
>   }
>   
>   static void vfio_listener_log_global_start(MemoryListener *listener)
>   {
>       VFIOContainer *container = container_of(listener, VFIOContainer, listener);
> +    int ret;
>   
> -    vfio_set_dirty_page_tracking(container, true);
> +    ret = vfio_set_dirty_page_tracking(container, true);
> +    if (ret) {
> +        vfio_set_migration_error(ret);
> +    }
>   }
>   
>   static void vfio_listener_log_global_stop(MemoryListener *listener)
>   {
>       VFIOContainer *container = container_of(listener, VFIOContainer, listener);
> +    int ret;
>   
> -    vfio_set_dirty_page_tracking(container, false);
> +    ret = vfio_set_dirty_page_tracking(container, false);
> +    if (ret) {
> +        vfio_set_migration_error(ret);
> +    }
>   }
>   
>   static int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
> @@ -1323,19 +1350,18 @@ static void vfio_iommu_map_dirty_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
>       VFIOContainer *container = giommu->container;
>       hwaddr iova = iotlb->iova + giommu->iommu_offset;
>       ram_addr_t translated_addr;
> +    int ret = -EINVAL;
>   
>       trace_vfio_iommu_map_dirty_notify(iova, iova + iotlb->addr_mask);
>   
>       if (iotlb->target_as != &address_space_memory) {
>           error_report("Wrong target AS \"%s\", only system memory is allowed",
>                        iotlb->target_as->name ? iotlb->target_as->name : "none");
> -        return;
> +        goto out;
>       }
>   
>       rcu_read_lock();
>       if (vfio_get_xlat_addr(iotlb, NULL, &translated_addr, NULL)) {
> -        int ret;
> -
>           ret = vfio_get_dirty_bitmap(container, iova, iotlb->addr_mask + 1,
>                                       translated_addr);
>           if (ret) {
> @@ -1346,6 +1372,11 @@ static void vfio_iommu_map_dirty_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
>           }
>       }
>       rcu_read_unlock();
> +
> +out:
> +    if (ret) {
> +        vfio_set_migration_error(ret);
> +    }
>   }
>   
>   static int vfio_ram_discard_get_dirty_bitmap(MemoryRegionSection *section,
> @@ -1438,13 +1469,19 @@ static void vfio_listener_log_sync(MemoryListener *listener,
>           MemoryRegionSection *section)
>   {
>       VFIOContainer *container = container_of(listener, VFIOContainer, listener);
> +    int ret;
>   
>       if (vfio_listener_skipped_section(section)) {
>           return;
>       }
>   
>       if (vfio_devices_all_dirty_tracking(container)) {
> -        vfio_sync_dirty_bitmap(container, section);
> +        ret = vfio_sync_dirty_bitmap(container, section);
> +        if (ret) {
> +            error_report("vfio: Failed to sync dirty bitmap, err: %d (%s)", ret,
> +                         strerror(-ret));
> +            vfio_set_migration_error(ret);
> +        }
>       }
>   }
>   




Thread overview: 42+ messages
2023-01-26 18:49 [PATCH 00/18] vfio: Add migration pre-copy support and device dirty tracking Avihai Horon
2023-01-26 18:49 ` [PATCH 01/18] vfio/migration: Add VFIO migration pre-copy support Avihai Horon
2023-01-26 23:52   ` Alex Williamson
2023-01-31 12:44     ` Avihai Horon
2023-01-31 22:43       ` Alex Williamson
2023-01-31 23:29         ` Jason Gunthorpe
2023-02-01  4:15           ` Alex Williamson
2023-02-01 17:28             ` Jason Gunthorpe
2023-02-01 18:42               ` Alex Williamson
2023-02-01 20:10                 ` Jason Gunthorpe
2023-01-26 18:49 ` [PATCH 02/18] vfio/common: Fix error reporting in vfio_get_dirty_bitmap() Avihai Horon
2023-02-15  9:21   ` Cédric Le Goater
2023-01-26 18:49 ` [PATCH 03/18] vfio/common: Fix wrong %m usages Avihai Horon
2023-02-15  9:21   ` Cédric Le Goater
2023-01-26 18:49 ` [PATCH 04/18] vfio/common: Abort migration if dirty log start/stop/sync fails Avihai Horon
2023-02-15  9:41   ` Cédric Le Goater
2023-01-26 18:49 ` [PATCH 05/18] vfio/common: Add VFIOBitmap and (de)alloc functions Avihai Horon
2023-01-27 21:11   ` Alex Williamson
2023-02-12 15:36     ` Avihai Horon
2023-02-14 21:28       ` Alex Williamson
2023-01-26 18:49 ` [PATCH 06/18] util: Add iova_tree_nnodes() Avihai Horon
2023-02-09 22:21   ` Peter Xu
2023-01-26 18:49 ` [PATCH 07/18] util: Extend iova_tree_foreach() to take data argument Avihai Horon
2023-02-09 22:21   ` Peter Xu
2023-01-26 18:49 ` [PATCH 08/18] vfio/common: Record DMA mapped IOVA ranges Avihai Horon
2023-01-27 21:42   ` Alex Williamson
2023-02-12 15:40     ` Avihai Horon
2023-02-13 15:25       ` Alex Williamson
2023-01-26 18:49 ` [PATCH 09/18] vfio/common: Add device dirty page tracking start/stop Avihai Horon
2023-01-26 18:49 ` [PATCH 10/18] vfio/common: Extract code from vfio_get_dirty_bitmap() to new function Avihai Horon
2023-01-26 18:49 ` [PATCH 11/18] vfio/common: Add device dirty page bitmap sync Avihai Horon
2023-01-27 23:37   ` Alex Williamson
2023-02-12 15:49     ` Avihai Horon
2023-01-26 18:49 ` [PATCH 12/18] vfio/common: Extract vIOMMU code from vfio_sync_dirty_bitmap() Avihai Horon
2023-01-26 18:49 ` [PATCH 13/18] memory/iommu: Add IOMMU_ATTR_MAX_IOVA attribute Avihai Horon
2023-02-09 22:16   ` Peter Xu
2023-01-26 18:49 ` [PATCH 14/18] intel-iommu: Implement get_attr() method Avihai Horon
2023-02-09 22:18   ` Peter Xu
2023-01-26 18:49 ` [PATCH 15/18] vfio/common: Support device dirty page tracking with vIOMMU Avihai Horon
2023-01-26 18:49 ` [PATCH 16/18] vfio/common: Optimize " Avihai Horon
2023-01-26 18:49 ` [PATCH 17/18] vfio/migration: Query device dirty page tracking support Avihai Horon
2023-01-26 18:49 ` [PATCH 18/18] docs/devel: Document VFIO device dirty page tracking Avihai Horon
