* [PATCH v2 00/20] vfio: Add migration pre-copy support and device dirty tracking
@ 2023-02-22 17:48 Avihai Horon
  2023-02-22 17:48 ` [PATCH v2 01/20] migration: Pass threshold_size to .state_pending_{estimate, exact}() Avihai Horon via
                   ` (21 more replies)
  0 siblings, 22 replies; 93+ messages in thread
From: Avihai Horon @ 2023-02-22 17:48 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Avihai Horon,
	Kirti Wankhede, Tarun Gupta, Joao Martins

Hello,

This series is based on the previous one that added the basic VFIO
migration protocol v2 implementation [1].

The series starts by adding pre-copy support for VFIO migration protocol
v2. Pre-copy support allows the VFIO device data to be transferred while
the VM is running. This can improve performance and reduce migration
downtime. A full description can be found here [2].

The series then implements device dirty page tracking, which allows the
VFIO device to record its DMAs and report them back when needed. This is
part of VFIO migration and is used during the pre-copy phase to track the
RAM pages that the device has written to and mark them dirty, so they can
later be re-sent to the target.

Device dirty page tracking uses the DMA logging uAPI to discover device
capabilities, to start and stop tracking, and to get the dirty page bitmap
report. Extra details and the uAPI definition can be found here [3].

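For illustration, below is a minimal sketch of starting device DMA logging
over a single IOVA range through that uAPI. Struct and ioctl names follow
the kernel headers referenced in [3]; capability probing, the STOP/REPORT
operations, multi-range setups and error reporting are omitted, and the
helper itself is not part of this series.

#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/* Start device DMA logging over one IOVA range on an open VFIO device fd. */
static int dma_logging_start(int device_fd, uint64_t iova, uint64_t length,
                             uint64_t page_size)
{
    size_t sz = sizeof(struct vfio_device_feature) +
                sizeof(struct vfio_device_feature_dma_logging_control);
    struct vfio_device_feature *feature = calloc(1, sz);
    struct vfio_device_feature_dma_logging_control *control;
    struct vfio_device_feature_dma_logging_range range = {
        .iova = iova,
        .length = length,
    };
    int ret;

    if (!feature) {
        return -ENOMEM;
    }

    control = (void *)feature->data;
    feature->argsz = sz;
    feature->flags = VFIO_DEVICE_FEATURE_SET |
                     VFIO_DEVICE_FEATURE_DMA_LOGGING_START;
    control->page_size = page_size;
    control->num_ranges = 1;
    control->ranges = (uintptr_t)&range;

    ret = ioctl(device_fd, VFIO_DEVICE_FEATURE, feature);
    ret = ret ? -errno : 0;
    free(feature);

    return ret;
}
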
Device dirty page tracking operates at VFIOContainer scope: when dirty
tracking is started or stopped, or when a dirty page report is queried,
the operation is applied to each device within the VFIOContainer in turn.

Device dirty page tracking is used only if all devices within a
VFIOContainer support it. Otherwise, VFIO IOMMU dirty page tracking is
used, and if that is not supported either, QEMU perpetually marks all
memory as dirty. Note that since VFIO IOMMU dirty page tracking has no HW
support, the last two options usually have the same effect of perpetually
marking all pages dirty.

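A minimal sketch of that selection order (the names below are purely
illustrative and are not the helpers added by this series):

#include <stdbool.h>

typedef enum {
    TRACK_DEVICE,      /* per-device DMA logging */
    TRACK_VFIO_IOMMU,  /* VFIO IOMMU dirty tracking (no HW support) */
    TRACK_ALL_DIRTY,   /* QEMU perpetually marks all pages dirty */
} DirtyTrackingMethod;

/* Evaluated per VFIOContainer. */
static DirtyTrackingMethod
select_dirty_tracking(bool all_devices_support_dma_logging,
                      bool iommu_dirty_tracking_supported)
{
    if (all_devices_support_dma_logging) {
        return TRACK_DEVICE;
    }

    return iommu_dirty_tracking_supported ? TRACK_VFIO_IOMMU : TRACK_ALL_DIRTY;
}
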
Normally, when asked to start dirty tracking, all the currently DMA
mapped ranges are tracked by device dirty page tracking. However, when a
vIOMMU is enabled, IOVA ranges are DMA mapped/unmapped on the fly as the
vIOMMU maps/unmaps them, and these ranges can potentially be mapped
anywhere in the vIOMMU IOVA space. Due to this dynamic nature of vIOMMU
mapping/unmapping, tracking only the currently DMA mapped IOVA ranges
doesn't work well.

Thus, when a vIOMMU is enabled, we try to track the entire vIOMMU IOVA
space. If that fails (the IOVA space can be rather big and we might hit a
HW limitation), we fall back to tracking a smaller range while marking the
untracked ranges dirty.

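The sketch below illustrates that fallback, reusing the hypothetical
dma_logging_start() helper from the sketch above. The window size and the
way untracked ranges are reported dirty are assumptions for illustration,
not the actual implementation in patches 14-18.

/* Illustrative only: try the full vIOMMU IOVA space first, then retry with
 * a smaller window on failure (e.g. a HW limit on the trackable range). */
static int viommu_dirty_tracking_start(int device_fd, uint64_t max_iova,
                                       uint64_t fallback_limit,
                                       uint64_t page_size)
{
    int ret = dma_logging_start(device_fd, 0, max_iova, page_size);

    if (!ret) {
        return 0;
    }

    /*
     * Track a smaller window; the caller must then report
     * [fallback_limit, max_iova) as perpetually dirty.
     */
    return dma_logging_start(device_fd, 0, fallback_limit, page_size);
}
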
Patch breakdown:
- Patches 1-3 add VFIO migration pre-copy support.
- Patches 4-10 fix bugs and do some preparatory work required prior to
  adding device dirty page tracking.
- Patches 11-13 implement device dirty page tracking.
- Patches 14-18 add vIOMMU support to device dirty page tracking.
- Patches 19-20 enable device dirty page tracking and document it.



Changes from v1 [4]:
- Rebased on latest master branch. As part of it, made some changes in
  pre-copy to adjust it to Juan's new patches:
  1. Added a new patch that passes threshold_size parameter to
     .state_pending_{estimate,exact}() handlers.
  2. Added a new patch that refactors vfio_save_block().
  3. Changed the pre-copy patch to cache and report pending pre-copy
     size in the .state_pending_estimate() handler.
- Removed unnecessary P2P code. This should be added later on when P2P
  support is added. (Alex)
- Moved the dirty sync to be after the DMA unmap in vfio_dma_unmap()
  (patch #11). (Alex)
- Stored vfio_devices_all_device_dirty_tracking()'s value in a local
  variable in vfio_get_dirty_bitmap() so it can be re-used (patch #11).
- Refactored the viommu device dirty tracking ranges creation code to
  make it clearer (patch #15).
- Changed overflow check in vfio_iommu_range_is_device_tracked() to
  emphasize that we specifically check for 2^64 wrap around (patch #15).
- Added R-bs / Acks.

Thanks.

[1]
https://lore.kernel.org/qemu-devel/167658846945.932837.1420176491103357684.stgit@omen/

[2]
https://lore.kernel.org/kvm/20221206083438.37807-3-yishaih@nvidia.com/

[3]
https://lore.kernel.org/netdev/20220908183448.195262-4-yishaih@nvidia.com/

Avihai Horon (14):
  migration: Pass threshold_size to .state_pending_{estimate,exact}()
  vfio/migration: Refactor vfio_save_block() to return saved data size
  vfio/migration: Add VFIO migration pre-copy support
  vfio/common: Fix error reporting in vfio_get_dirty_bitmap()
  vfio/common: Fix wrong %m usages
  vfio/common: Abort migration if dirty log start/stop/sync fails
  vfio/common: Add VFIOBitmap and (de)alloc functions
  vfio/common: Extract code from vfio_get_dirty_bitmap() to new function
  vfio/common: Extract vIOMMU code from vfio_sync_dirty_bitmap()
  memory/iommu: Add IOMMU_ATTR_MAX_IOVA attribute
  intel-iommu: Implement get_attr() method
  vfio/common: Support device dirty page tracking with vIOMMU
  vfio/common: Optimize device dirty page tracking with vIOMMU
  docs/devel: Document VFIO device dirty page tracking

Joao Martins (6):
  util: Add iova_tree_nnodes()
  util: Extend iova_tree_foreach() to take data argument
  vfio/common: Record DMA mapped IOVA ranges
  vfio/common: Add device dirty page tracking start/stop
  vfio/common: Add device dirty page bitmap sync
  vfio/migration: Query device dirty page tracking support

 docs/devel/vfio-migration.rst  |  85 ++-
 include/exec/memory.h          |   3 +-
 include/hw/vfio/vfio-common.h  |  10 +
 include/migration/register.h   |   7 +-
 include/qemu/iova-tree.h       |  19 +-
 migration/savevm.h             |   6 +-
 hw/i386/intel_iommu.c          |  18 +
 hw/s390x/s390-stattrib.c       |   4 +-
 hw/vfio/common.c               | 911 +++++++++++++++++++++++++++++----
 hw/vfio/migration.c            | 210 +++++++-
 migration/block-dirty-bitmap.c |   2 +-
 migration/block.c              |   4 +-
 migration/migration.c          |  12 +-
 migration/ram.c                |   6 +-
 migration/savevm.c             |  12 +-
 util/iova-tree.c               |  23 +-
 hw/vfio/trace-events           |   4 +-
 migration/trace-events         |   4 +-
 18 files changed, 1161 insertions(+), 179 deletions(-)

-- 
2.26.3




* [PATCH v2 01/20] migration: Pass threshold_size to .state_pending_{estimate, exact}()
  2023-02-22 17:48 [PATCH v2 00/20] vfio: Add migration pre-copy support and device dirty tracking Avihai Horon
@ 2023-02-22 17:48 ` Avihai Horon via
  2023-02-22 17:48 ` [PATCH v2 02/20] vfio/migration: Refactor vfio_save_block() to return saved data size Avihai Horon
                   ` (20 subsequent siblings)
  21 siblings, 0 replies; 93+ messages in thread
From: Avihai Horon via @ 2023-02-22 17:48 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Avihai Horon,
	Kirti Wankhede, Tarun Gupta, Joao Martins

Pass threshold_size to .state_pending_{estimate,exact}().

This parameter will be used in the following patch by VFIO migration to
force the complete transmission of all VFIO pre-copy initial bytes prior
to moving to the stop-copy phase, which can reduce migration downtime.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
 include/migration/register.h   |  7 ++++---
 migration/savevm.h             |  6 ++++--
 hw/s390x/s390-stattrib.c       |  4 ++--
 hw/vfio/migration.c            |  3 ++-
 migration/block-dirty-bitmap.c |  2 +-
 migration/block.c              |  4 ++--
 migration/migration.c          | 12 ++++++++----
 migration/ram.c                |  6 ++++--
 migration/savevm.c             | 12 ++++++++----
 migration/trace-events         |  4 ++--
 10 files changed, 37 insertions(+), 23 deletions(-)

diff --git a/include/migration/register.h b/include/migration/register.h
index a8dfd8fefd..85d22931a7 100644
--- a/include/migration/register.h
+++ b/include/migration/register.h
@@ -61,11 +61,12 @@ typedef struct SaveVMHandlers {
      * pending data.
      */
     /* This estimates the remaining data to transfer */
-    void (*state_pending_estimate)(void *opaque, uint64_t *must_precopy,
+    void (*state_pending_estimate)(void *opaque, uint64_t threshold_size,
+                                   uint64_t *must_precopy,
                                    uint64_t *can_postcopy);
     /* This calculate the exact remaining data to transfer */
-    void (*state_pending_exact)(void *opaque, uint64_t *must_precopy,
-                                uint64_t *can_postcopy);
+    void (*state_pending_exact)(void *opaque, uint64_t threshold_size,
+                                uint64_t *must_precopy, uint64_t *can_postcopy);
     LoadStateHandler *load_state;
     int (*load_setup)(QEMUFile *f, void *opaque);
     int (*load_cleanup)(void *opaque);
diff --git a/migration/savevm.h b/migration/savevm.h
index fb636735f0..c94d31f051 100644
--- a/migration/savevm.h
+++ b/migration/savevm.h
@@ -40,9 +40,11 @@ void qemu_savevm_state_cleanup(void);
 void qemu_savevm_state_complete_postcopy(QEMUFile *f);
 int qemu_savevm_state_complete_precopy(QEMUFile *f, bool iterable_only,
                                        bool inactivate_disks);
-void qemu_savevm_state_pending_exact(uint64_t *must_precopy,
+void qemu_savevm_state_pending_exact(uint64_t threshold_size,
+                                     uint64_t *must_precopy,
                                      uint64_t *can_postcopy);
-void qemu_savevm_state_pending_estimate(uint64_t *must_precopy,
+void qemu_savevm_state_pending_estimate(uint64_t threshold_size,
+                                        uint64_t *must_precopy,
                                         uint64_t *can_postcopy);
 void qemu_savevm_send_ping(QEMUFile *f, uint32_t value);
 void qemu_savevm_send_open_return_path(QEMUFile *f);
diff --git a/hw/s390x/s390-stattrib.c b/hw/s390x/s390-stattrib.c
index aed919ad7d..f1d4064c09 100644
--- a/hw/s390x/s390-stattrib.c
+++ b/hw/s390x/s390-stattrib.c
@@ -182,8 +182,8 @@ static int cmma_save_setup(QEMUFile *f, void *opaque)
     return 0;
 }
 
-static void cmma_state_pending(void *opaque, uint64_t *must_precopy,
-                               uint64_t *can_postcopy)
+static void cmma_state_pending(void *opaque, uint64_t threshold_size,
+                               uint64_t *must_precopy, uint64_t *can_postcopy)
 {
     S390StAttribState *sas = S390_STATTRIB(opaque);
     S390StAttribClass *sac = S390_STATTRIB_GET_CLASS(sas);
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index a2c3d9bade..4fb7d01532 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -314,7 +314,8 @@ static void vfio_save_cleanup(void *opaque)
  * repeatedly while pending RAM size is over the threshold, thus migration
  * can't converge and querying the VFIO device pending data size is useless.
  */
-static void vfio_state_pending_exact(void *opaque, uint64_t *must_precopy,
+static void vfio_state_pending_exact(void *opaque, uint64_t threshold_size,
+                                     uint64_t *must_precopy,
                                      uint64_t *can_postcopy)
 {
     VFIODevice *vbasedev = opaque;
diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index fe73aa94b1..4fe0b83bc8 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -762,7 +762,7 @@ static int dirty_bitmap_save_complete(QEMUFile *f, void *opaque)
     return 0;
 }
 
-static void dirty_bitmap_state_pending(void *opaque,
+static void dirty_bitmap_state_pending(void *opaque, uint64_t threshold_size,
                                        uint64_t *must_precopy,
                                        uint64_t *can_postcopy)
 {
diff --git a/migration/block.c b/migration/block.c
index 426a25bb19..70438a299c 100644
--- a/migration/block.c
+++ b/migration/block.c
@@ -853,8 +853,8 @@ static int block_save_complete(QEMUFile *f, void *opaque)
     return 0;
 }
 
-static void block_state_pending(void *opaque, uint64_t *must_precopy,
-                                uint64_t *can_postcopy)
+static void block_state_pending(void *opaque, uint64_t threshold_size,
+                                uint64_t *must_precopy, uint64_t *can_postcopy)
 {
     /* Estimate pending number of bytes to send */
     uint64_t pending;
diff --git a/migration/migration.c b/migration/migration.c
index ae2025d9d8..a0777d9848 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -3866,15 +3866,19 @@ static MigIterateState migration_iteration_run(MigrationState *s)
     uint64_t must_precopy, can_postcopy;
     bool in_postcopy = s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE;
 
-    qemu_savevm_state_pending_estimate(&must_precopy, &can_postcopy);
+    qemu_savevm_state_pending_estimate(s->threshold_size, &must_precopy,
+                                       &can_postcopy);
     uint64_t pending_size = must_precopy + can_postcopy;
 
-    trace_migrate_pending_estimate(pending_size, must_precopy, can_postcopy);
+    trace_migrate_pending_estimate(pending_size, s->threshold_size,
+                                   must_precopy, can_postcopy);
 
     if (must_precopy <= s->threshold_size) {
-        qemu_savevm_state_pending_exact(&must_precopy, &can_postcopy);
+        qemu_savevm_state_pending_exact(s->threshold_size, &must_precopy,
+                                        &can_postcopy);
         pending_size = must_precopy + can_postcopy;
-        trace_migrate_pending_exact(pending_size, must_precopy, can_postcopy);
+        trace_migrate_pending_exact(pending_size, s->threshold_size,
+                                    must_precopy, can_postcopy);
     }
 
     if (!pending_size || pending_size < s->threshold_size) {
diff --git a/migration/ram.c b/migration/ram.c
index 96e8a19a58..514a18b5d7 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -3489,7 +3489,8 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
     return 0;
 }
 
-static void ram_state_pending_estimate(void *opaque, uint64_t *must_precopy,
+static void ram_state_pending_estimate(void *opaque, uint64_t threshold_size,
+                                       uint64_t *must_precopy,
                                        uint64_t *can_postcopy)
 {
     RAMState **temp = opaque;
@@ -3505,7 +3506,8 @@ static void ram_state_pending_estimate(void *opaque, uint64_t *must_precopy,
     }
 }
 
-static void ram_state_pending_exact(void *opaque, uint64_t *must_precopy,
+static void ram_state_pending_exact(void *opaque, uint64_t threshold_size,
+                                    uint64_t *must_precopy,
                                     uint64_t *can_postcopy)
 {
     RAMState **temp = opaque;
diff --git a/migration/savevm.c b/migration/savevm.c
index aa54a67fda..a642c0dd5a 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1541,7 +1541,8 @@ flush:
  * the result is split into the amount for units that can and
  * for units that can't do postcopy.
  */
-void qemu_savevm_state_pending_estimate(uint64_t *must_precopy,
+void qemu_savevm_state_pending_estimate(uint64_t threshold_size,
+                                        uint64_t *must_precopy,
                                         uint64_t *can_postcopy)
 {
     SaveStateEntry *se;
@@ -1558,11 +1559,13 @@ void qemu_savevm_state_pending_estimate(uint64_t *must_precopy,
                 continue;
             }
         }
-        se->ops->state_pending_estimate(se->opaque, must_precopy, can_postcopy);
+        se->ops->state_pending_estimate(se->opaque, threshold_size,
+                                        must_precopy, can_postcopy);
     }
 }
 
-void qemu_savevm_state_pending_exact(uint64_t *must_precopy,
+void qemu_savevm_state_pending_exact(uint64_t threshold_size,
+                                     uint64_t *must_precopy,
                                      uint64_t *can_postcopy)
 {
     SaveStateEntry *se;
@@ -1579,7 +1582,8 @@ void qemu_savevm_state_pending_exact(uint64_t *must_precopy,
                 continue;
             }
         }
-        se->ops->state_pending_exact(se->opaque, must_precopy, can_postcopy);
+        se->ops->state_pending_exact(se->opaque, threshold_size, must_precopy,
+                                     can_postcopy);
     }
 }
 
diff --git a/migration/trace-events b/migration/trace-events
index 92161eeac5..b23c044f5e 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -150,8 +150,8 @@ migrate_fd_cleanup(void) ""
 migrate_fd_error(const char *error_desc) "error=%s"
 migrate_fd_cancel(void) ""
 migrate_handle_rp_req_pages(const char *rbname, size_t start, size_t len) "in %s at 0x%zx len 0x%zx"
-migrate_pending_exact(uint64_t size, uint64_t pre, uint64_t post) "exact pending size %" PRIu64 " (pre = %" PRIu64 " post=%" PRIu64 ")"
-migrate_pending_estimate(uint64_t size, uint64_t pre, uint64_t post) "estimate pending size %" PRIu64 " (pre = %" PRIu64 " post=%" PRIu64 ")"
+migrate_pending_exact(uint64_t size, uint64_t threshold_size, uint64_t pre, uint64_t post) "exact pending size %" PRIu64 " threshold size %" PRIu64 " (pre = %" PRIu64 " post=%" PRIu64 ")"
+migrate_pending_estimate(uint64_t size, uint64_t threshold_size, uint64_t pre, uint64_t post) "estimate pending size %" PRIu64 " threshold size %" PRIu64 " (pre = %" PRIu64 " post=%" PRIu64 ")"
 migrate_send_rp_message(int msg_type, uint16_t len) "%d: len %d"
 migrate_send_rp_recv_bitmap(char *name, int64_t size) "block '%s' size 0x%"PRIi64
 migration_completion_file_err(void) ""
-- 
2.26.3




* [PATCH v2 02/20] vfio/migration: Refactor vfio_save_block() to return saved data size
  2023-02-22 17:48 [PATCH v2 00/20] vfio: Add migration pre-copy support and device dirty tracking Avihai Horon
  2023-02-22 17:48 ` [PATCH v2 01/20] migration: Pass threshold_size to .state_pending_{estimate, exact}() Avihai Horon via
@ 2023-02-22 17:48 ` Avihai Horon
  2023-02-27 14:10   ` Cédric Le Goater
  2023-02-22 17:48 ` [PATCH v2 03/20] vfio/migration: Add VFIO migration pre-copy support Avihai Horon
                   ` (19 subsequent siblings)
  21 siblings, 1 reply; 93+ messages in thread
From: Avihai Horon @ 2023-02-22 17:48 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Avihai Horon,
	Kirti Wankhede, Tarun Gupta, Joao Martins

Refactor vfio_save_block() to return the size of saved data on success
and -errno on error.

This will be used in the next patch to implement VFIO migration pre-copy
support.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
 hw/vfio/migration.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 4fb7d01532..94a4df73d0 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -240,8 +240,8 @@ static int vfio_query_stop_copy_size(VFIODevice *vbasedev,
     return 0;
 }
 
-/* Returns 1 if end-of-stream is reached, 0 if more data and -errno if error */
-static int vfio_save_block(QEMUFile *f, VFIOMigration *migration)
+/* Returns the size of saved data on success and -errno on error */
+static ssize_t vfio_save_block(QEMUFile *f, VFIOMigration *migration)
 {
     ssize_t data_size;
 
@@ -251,7 +251,7 @@ static int vfio_save_block(QEMUFile *f, VFIOMigration *migration)
         return -errno;
     }
     if (data_size == 0) {
-        return 1;
+        return 0;
     }
 
     qemu_put_be64(f, VFIO_MIG_FLAG_DEV_DATA_STATE);
@@ -261,7 +261,7 @@ static int vfio_save_block(QEMUFile *f, VFIOMigration *migration)
 
     trace_vfio_save_block(migration->vbasedev->name, data_size);
 
-    return qemu_file_get_error(f);
+    return qemu_file_get_error(f) ?: data_size;
 }
 
 /* ---------------------------------------------------------------------- */
@@ -335,6 +335,7 @@ static void vfio_state_pending_exact(void *opaque, uint64_t threshold_size,
 static int vfio_save_complete_precopy(QEMUFile *f, void *opaque)
 {
     VFIODevice *vbasedev = opaque;
+    ssize_t data_size;
     int ret;
 
     /* We reach here with device state STOP only */
@@ -345,11 +346,11 @@ static int vfio_save_complete_precopy(QEMUFile *f, void *opaque)
     }
 
     do {
-        ret = vfio_save_block(f, vbasedev->migration);
-        if (ret < 0) {
-            return ret;
+        data_size = vfio_save_block(f, vbasedev->migration);
+        if (data_size < 0) {
+            return data_size;
         }
-    } while (!ret);
+    } while (data_size);
 
     qemu_put_be64(f, VFIO_MIG_FLAG_END_OF_STATE);
     ret = qemu_file_get_error(f);
-- 
2.26.3




* [PATCH v2 03/20] vfio/migration: Add VFIO migration pre-copy support
  2023-02-22 17:48 [PATCH v2 00/20] vfio: Add migration pre-copy support and device dirty tracking Avihai Horon
  2023-02-22 17:48 ` [PATCH v2 01/20] migration: Pass threshold_size to .state_pending_{estimate, exact}() Avihai Horon via
  2023-02-22 17:48 ` [PATCH v2 02/20] vfio/migration: Refactor vfio_save_block() to return saved data size Avihai Horon
@ 2023-02-22 17:48 ` Avihai Horon
  2023-02-22 20:58   ` Alex Williamson
  2023-02-22 17:48 ` [PATCH v2 04/20] vfio/common: Fix error reporting in vfio_get_dirty_bitmap() Avihai Horon
                   ` (18 subsequent siblings)
  21 siblings, 1 reply; 93+ messages in thread
From: Avihai Horon @ 2023-02-22 17:48 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Avihai Horon,
	Kirti Wankhede, Tarun Gupta, Joao Martins

Pre-copy support allows the VFIO device data to be transferred while the
VM is running. This helps to accommodate VFIO devices that have a large
amount of data that needs to be transferred, and it can reduce migration
downtime.

Pre-copy support is optional in VFIO migration protocol v2.
Implement it and use it for devices that support it. A full description
can be found here [1].

[1]
https://lore.kernel.org/kvm/20221206083438.37807-3-yishaih@nvidia.com/

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
 docs/devel/vfio-migration.rst |  35 +++++--
 include/hw/vfio/vfio-common.h |   3 +
 hw/vfio/common.c              |   6 +-
 hw/vfio/migration.c           | 175 ++++++++++++++++++++++++++++++++--
 hw/vfio/trace-events          |   4 +-
 5 files changed, 201 insertions(+), 22 deletions(-)

diff --git a/docs/devel/vfio-migration.rst b/docs/devel/vfio-migration.rst
index c214c73e28..ba80b9150d 100644
--- a/docs/devel/vfio-migration.rst
+++ b/docs/devel/vfio-migration.rst
@@ -7,12 +7,14 @@ the guest is running on source host and restoring this saved state on the
 destination host. This document details how saving and restoring of VFIO
 devices is done in QEMU.
 
-Migration of VFIO devices currently consists of a single stop-and-copy phase.
-During the stop-and-copy phase the guest is stopped and the entire VFIO device
-data is transferred to the destination.
-
-The pre-copy phase of migration is currently not supported for VFIO devices.
-Support for VFIO pre-copy will be added later on.
+Migration of VFIO devices consists of two phases: the optional pre-copy phase,
+and the stop-and-copy phase. The pre-copy phase is iterative and allows to
+accommodate VFIO devices that have a large amount of data that needs to be
+transferred. The iterative pre-copy phase of migration allows for the guest to
+continue whilst the VFIO device state is transferred to the destination, this
+helps to reduce the total downtime of the VM. VFIO devices can choose to skip
+the pre-copy phase of migration by not reporting the VFIO_MIGRATION_PRE_COPY
+flag in VFIO_DEVICE_FEATURE_MIGRATION ioctl.
 
 Note that currently VFIO migration is supported only for a single device. This
 is due to VFIO migration's lack of P2P support. However, P2P support is planned
@@ -29,10 +31,20 @@ VFIO implements the device hooks for the iterative approach as follows:
 * A ``load_setup`` function that sets the VFIO device on the destination in
   _RESUMING state.
 
+* A ``state_pending_estimate`` function that reports an estimate of the
+  remaining pre-copy data that the vendor driver has yet to save for the VFIO
+  device.
+
 * A ``state_pending_exact`` function that reads pending_bytes from the vendor
   driver, which indicates the amount of data that the vendor driver has yet to
   save for the VFIO device.
 
+* An ``is_active_iterate`` function that indicates ``save_live_iterate`` is
+  active only when the VFIO device is in pre-copy states.
+
+* A ``save_live_iterate`` function that reads the VFIO device's data from the
+  vendor driver during iterative pre-copy phase.
+
 * A ``save_state`` function to save the device config space if it is present.
 
 * A ``save_live_complete_precopy`` function that sets the VFIO device in
@@ -95,8 +107,10 @@ Flow of state changes during Live migration
 ===========================================
 
 Below is the flow of state change during live migration.
-The values in the brackets represent the VM state, the migration state, and
+The values in the parentheses represent the VM state, the migration state, and
 the VFIO device state, respectively.
+The text in the square brackets represents the flow if the VFIO device supports
+pre-copy.
 
 Live migration save path
 ------------------------
@@ -108,11 +122,12 @@ Live migration save path
                                   |
                      migrate_init spawns migration_thread
                 Migration thread then calls each device's .save_setup()
-                       (RUNNING, _SETUP, _RUNNING)
+                  (RUNNING, _SETUP, _RUNNING [_PRE_COPY])
                                   |
-                      (RUNNING, _ACTIVE, _RUNNING)
-             If device is active, get pending_bytes by .state_pending_exact()
+                  (RUNNING, _ACTIVE, _RUNNING [_PRE_COPY])
+      If device is active, get pending_bytes by .state_pending_{estimate,exact}()
           If total pending_bytes >= threshold_size, call .save_live_iterate()
+                  [Data of VFIO device for pre-copy phase is copied]
         Iterate till total pending bytes converge and are less than threshold
                                   |
   On migration completion, vCPU stops and calls .save_live_complete_precopy for
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 87524c64a4..ee55d442b4 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -66,6 +66,9 @@ typedef struct VFIOMigration {
     int data_fd;
     void *data_buffer;
     size_t data_buffer_size;
+    uint64_t precopy_init_size;
+    uint64_t precopy_dirty_size;
+    uint64_t mig_flags;
 } VFIOMigration;
 
 typedef struct VFIOAddressSpace {
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index bab83c0e55..6f5afe9f5a 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -409,7 +409,8 @@ static bool vfio_devices_all_dirty_tracking(VFIOContainer *container)
             }
 
             if (vbasedev->pre_copy_dirty_page_tracking == ON_OFF_AUTO_OFF &&
-                migration->device_state == VFIO_DEVICE_STATE_RUNNING) {
+                (migration->device_state == VFIO_DEVICE_STATE_RUNNING ||
+                 migration->device_state == VFIO_DEVICE_STATE_PRE_COPY)) {
                 return false;
             }
         }
@@ -438,7 +439,8 @@ static bool vfio_devices_all_running_and_mig_active(VFIOContainer *container)
                 return false;
             }
 
-            if (migration->device_state == VFIO_DEVICE_STATE_RUNNING) {
+            if (migration->device_state == VFIO_DEVICE_STATE_RUNNING ||
+                migration->device_state == VFIO_DEVICE_STATE_PRE_COPY) {
                 continue;
             } else {
                 return false;
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 94a4df73d0..307983d57d 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -67,6 +67,8 @@ static const char *mig_state_to_str(enum vfio_device_mig_state state)
         return "STOP_COPY";
     case VFIO_DEVICE_STATE_RESUMING:
         return "RESUMING";
+    case VFIO_DEVICE_STATE_PRE_COPY:
+        return "PRE_COPY";
     default:
         return "UNKNOWN STATE";
     }
@@ -240,6 +242,23 @@ static int vfio_query_stop_copy_size(VFIODevice *vbasedev,
     return 0;
 }
 
+static int vfio_query_precopy_size(VFIOMigration *migration,
+                                   uint64_t *init_size, uint64_t *dirty_size)
+{
+    struct vfio_precopy_info precopy = {
+        .argsz = sizeof(precopy),
+    };
+
+    if (ioctl(migration->data_fd, VFIO_MIG_GET_PRECOPY_INFO, &precopy)) {
+        return -errno;
+    }
+
+    *init_size = precopy.initial_bytes;
+    *dirty_size = precopy.dirty_bytes;
+
+    return 0;
+}
+
 /* Returns the size of saved data on success and -errno on error */
 static ssize_t vfio_save_block(QEMUFile *f, VFIOMigration *migration)
 {
@@ -248,6 +267,11 @@ static ssize_t vfio_save_block(QEMUFile *f, VFIOMigration *migration)
     data_size = read(migration->data_fd, migration->data_buffer,
                      migration->data_buffer_size);
     if (data_size < 0) {
+        /* Pre-copy emptied all the device state for now */
+        if (errno == ENOMSG) {
+            return 0;
+        }
+
         return -errno;
     }
     if (data_size == 0) {
@@ -264,6 +288,31 @@ static ssize_t vfio_save_block(QEMUFile *f, VFIOMigration *migration)
     return qemu_file_get_error(f) ?: data_size;
 }
 
+static void vfio_update_estimated_pending_data(VFIOMigration *migration,
+                                               uint64_t data_size)
+{
+    if (!data_size) {
+        /*
+         * Pre-copy emptied all the device state for now, update estimated sizes
+         * accordingly.
+         */
+        migration->precopy_init_size = 0;
+        migration->precopy_dirty_size = 0;
+
+        return;
+    }
+
+    if (migration->precopy_init_size) {
+        uint64_t init_size = MIN(migration->precopy_init_size, data_size);
+
+        migration->precopy_init_size -= init_size;
+        data_size -= init_size;
+    }
+
+    migration->precopy_dirty_size -= MIN(migration->precopy_dirty_size,
+                                         data_size);
+}
+
 /* ---------------------------------------------------------------------- */
 
 static int vfio_save_setup(QEMUFile *f, void *opaque)
@@ -284,6 +333,35 @@ static int vfio_save_setup(QEMUFile *f, void *opaque)
         return -ENOMEM;
     }
 
+    if (migration->mig_flags & VFIO_MIGRATION_PRE_COPY) {
+        uint64_t init_size = 0, dirty_size = 0;
+        int ret;
+
+        switch (migration->device_state) {
+        case VFIO_DEVICE_STATE_RUNNING:
+            ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_PRE_COPY,
+                                           VFIO_DEVICE_STATE_RUNNING);
+            if (ret) {
+                return ret;
+            }
+
+            vfio_query_precopy_size(migration, &init_size, &dirty_size);
+            migration->precopy_init_size = init_size;
+            migration->precopy_dirty_size = dirty_size;
+
+            break;
+        case VFIO_DEVICE_STATE_STOP:
+            /* vfio_save_complete_precopy() will go to STOP_COPY */
+
+            migration->precopy_init_size = 0;
+            migration->precopy_dirty_size = 0;
+
+            break;
+        default:
+            return -EINVAL;
+        }
+    }
+
     trace_vfio_save_setup(vbasedev->name, migration->data_buffer_size);
 
     qemu_put_be64(f, VFIO_MIG_FLAG_END_OF_STATE);
@@ -302,23 +380,44 @@ static void vfio_save_cleanup(void *opaque)
     trace_vfio_save_cleanup(vbasedev->name);
 }
 
+static void vfio_state_pending_estimate(void *opaque, uint64_t threshold_size,
+                                        uint64_t *must_precopy,
+                                        uint64_t *can_postcopy)
+{
+    VFIODevice *vbasedev = opaque;
+    VFIOMigration *migration = vbasedev->migration;
+
+    if (migration->device_state != VFIO_DEVICE_STATE_PRE_COPY) {
+        return;
+    }
+
+    /*
+     * Initial size should be transferred during pre-copy phase so stop-copy
+     * phase will not be slowed down. Report threshold_size to force another
+     * pre-copy iteration.
+     */
+    *must_precopy += migration->precopy_init_size ?
+                         threshold_size :
+                         migration->precopy_dirty_size;
+
+    trace_vfio_state_pending_estimate(vbasedev->name, *must_precopy,
+                                      *can_postcopy,
+                                      migration->precopy_init_size,
+                                      migration->precopy_dirty_size);
+}
+
 /*
  * Migration size of VFIO devices can be as little as a few KBs or as big as
  * many GBs. This value should be big enough to cover the worst case.
  */
 #define VFIO_MIG_STOP_COPY_SIZE (100 * GiB)
 
-/*
- * Only exact function is implemented and not estimate function. The reason is
- * that during pre-copy phase of migration the estimate function is called
- * repeatedly while pending RAM size is over the threshold, thus migration
- * can't converge and querying the VFIO device pending data size is useless.
- */
 static void vfio_state_pending_exact(void *opaque, uint64_t threshold_size,
                                      uint64_t *must_precopy,
                                      uint64_t *can_postcopy)
 {
     VFIODevice *vbasedev = opaque;
+    VFIOMigration *migration = vbasedev->migration;
     uint64_t stop_copy_size = VFIO_MIG_STOP_COPY_SIZE;
 
     /*
@@ -328,8 +427,57 @@ static void vfio_state_pending_exact(void *opaque, uint64_t threshold_size,
     vfio_query_stop_copy_size(vbasedev, &stop_copy_size);
     *must_precopy += stop_copy_size;
 
+    if (migration->device_state == VFIO_DEVICE_STATE_PRE_COPY) {
+        uint64_t init_size = 0, dirty_size = 0;
+
+        vfio_query_precopy_size(migration, &init_size, &dirty_size);
+        migration->precopy_init_size = init_size;
+        migration->precopy_dirty_size = dirty_size;
+
+        /*
+         * Initial size should be transferred during pre-copy phase so
+         * stop-copy phase will not be slowed down. Report threshold_size
+         * to force another pre-copy iteration.
+         */
+        *must_precopy += migration->precopy_init_size ?
+                             threshold_size :
+                             migration->precopy_dirty_size;
+    }
+
     trace_vfio_state_pending_exact(vbasedev->name, *must_precopy, *can_postcopy,
-                                   stop_copy_size);
+                                   stop_copy_size, migration->precopy_init_size,
+                                   migration->precopy_dirty_size);
+}
+
+static bool vfio_is_active_iterate(void *opaque)
+{
+    VFIODevice *vbasedev = opaque;
+    VFIOMigration *migration = vbasedev->migration;
+
+    return migration->device_state == VFIO_DEVICE_STATE_PRE_COPY;
+}
+
+static int vfio_save_iterate(QEMUFile *f, void *opaque)
+{
+    VFIODevice *vbasedev = opaque;
+    VFIOMigration *migration = vbasedev->migration;
+    ssize_t data_size;
+
+    data_size = vfio_save_block(f, migration);
+    if (data_size < 0) {
+        return data_size;
+    }
+    qemu_put_be64(f, VFIO_MIG_FLAG_END_OF_STATE);
+
+    vfio_update_estimated_pending_data(migration, data_size);
+
+    trace_vfio_save_iterate(vbasedev->name);
+
+    /*
+     * A VFIO device's pre-copy dirty_bytes is not guaranteed to reach zero.
+     * Return 1 so following handlers will not be potentially blocked.
+     */
+    return 1;
 }
 
 static int vfio_save_complete_precopy(QEMUFile *f, void *opaque)
@@ -338,7 +486,7 @@ static int vfio_save_complete_precopy(QEMUFile *f, void *opaque)
     ssize_t data_size;
     int ret;
 
-    /* We reach here with device state STOP only */
+    /* We reach here with device state STOP or STOP_COPY only */
     ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_STOP_COPY,
                                    VFIO_DEVICE_STATE_STOP);
     if (ret) {
@@ -457,7 +605,10 @@ static int vfio_load_state(QEMUFile *f, void *opaque, int version_id)
 static const SaveVMHandlers savevm_vfio_handlers = {
     .save_setup = vfio_save_setup,
     .save_cleanup = vfio_save_cleanup,
+    .state_pending_estimate = vfio_state_pending_estimate,
     .state_pending_exact = vfio_state_pending_exact,
+    .is_active_iterate = vfio_is_active_iterate,
+    .save_live_iterate = vfio_save_iterate,
     .save_live_complete_precopy = vfio_save_complete_precopy,
     .save_state = vfio_save_state,
     .load_setup = vfio_load_setup,
@@ -470,13 +621,18 @@ static const SaveVMHandlers savevm_vfio_handlers = {
 static void vfio_vmstate_change(void *opaque, bool running, RunState state)
 {
     VFIODevice *vbasedev = opaque;
+    VFIOMigration *migration = vbasedev->migration;
     enum vfio_device_mig_state new_state;
     int ret;
 
     if (running) {
         new_state = VFIO_DEVICE_STATE_RUNNING;
     } else {
-        new_state = VFIO_DEVICE_STATE_STOP;
+        new_state =
+            (migration->device_state == VFIO_DEVICE_STATE_PRE_COPY &&
+             (state == RUN_STATE_FINISH_MIGRATE || state == RUN_STATE_PAUSED)) ?
+                VFIO_DEVICE_STATE_STOP_COPY :
+                VFIO_DEVICE_STATE_STOP;
     }
 
     /*
@@ -590,6 +746,7 @@ static int vfio_migration_init(VFIODevice *vbasedev)
     migration->vbasedev = vbasedev;
     migration->device_state = VFIO_DEVICE_STATE_RUNNING;
     migration->data_fd = -1;
+    migration->mig_flags = mig_flags;
 
     oid = vmstate_if_get_id(VMSTATE_IF(DEVICE(obj)));
     if (oid) {
diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
index 669d9fe07c..51613e02e6 100644
--- a/hw/vfio/trace-events
+++ b/hw/vfio/trace-events
@@ -161,6 +161,8 @@ vfio_save_block(const char *name, int data_size) " (%s) data_size %d"
 vfio_save_cleanup(const char *name) " (%s)"
 vfio_save_complete_precopy(const char *name, int ret) " (%s) ret %d"
 vfio_save_device_config_state(const char *name) " (%s)"
+vfio_save_iterate(const char *name) " (%s)"
 vfio_save_setup(const char *name, uint64_t data_buffer_size) " (%s) data buffer size 0x%"PRIx64
-vfio_state_pending_exact(const char *name, uint64_t precopy, uint64_t postcopy, uint64_t stopcopy_size) " (%s) precopy 0x%"PRIx64" postcopy 0x%"PRIx64" stopcopy size 0x%"PRIx64
+vfio_state_pending_estimate(const char *name, uint64_t precopy, uint64_t postcopy, uint64_t precopy_init_size, uint64_t precopy_dirty_size) " (%s) precopy 0x%"PRIx64" postcopy 0x%"PRIx64" precopy initial size 0x%"PRIx64" precopy dirty size 0x%"PRIx64
+vfio_state_pending_exact(const char *name, uint64_t precopy, uint64_t postcopy, uint64_t stopcopy_size, uint64_t precopy_init_size, uint64_t precopy_dirty_size) " (%s) precopy 0x%"PRIx64" postcopy 0x%"PRIx64" stopcopy size 0x%"PRIx64" precopy initial size 0x%"PRIx64" precopy dirty size 0x%"PRIx64
 vfio_vmstate_change(const char *name, int running, const char *reason, const char *dev_state) " (%s) running %d reason %s device state %s"
-- 
2.26.3




* [PATCH v2 04/20] vfio/common: Fix error reporting in vfio_get_dirty_bitmap()
  2023-02-22 17:48 [PATCH v2 00/20] vfio: Add migration pre-copy support and device dirty tracking Avihai Horon
                   ` (2 preceding siblings ...)
  2023-02-22 17:48 ` [PATCH v2 03/20] vfio/migration: Add VFIO migration pre-copy support Avihai Horon
@ 2023-02-22 17:48 ` Avihai Horon
  2023-02-22 17:49 ` [PATCH v2 05/20] vfio/common: Fix wrong %m usages Avihai Horon
                   ` (17 subsequent siblings)
  21 siblings, 0 replies; 93+ messages in thread
From: Avihai Horon @ 2023-02-22 17:48 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Avihai Horon,
	Kirti Wankhede, Tarun Gupta, Joao Martins

Return -errno instead of -1 if VFIO_IOMMU_DIRTY_PAGES ioctl fails in
vfio_get_dirty_bitmap().

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
---
 hw/vfio/common.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 6f5afe9f5a..27db71427e 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1337,6 +1337,7 @@ static int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
 
     ret = ioctl(container->fd, VFIO_IOMMU_DIRTY_PAGES, dbitmap);
     if (ret) {
+        ret = -errno;
         error_report("Failed to get dirty bitmap for iova: 0x%"PRIx64
                 " size: 0x%"PRIx64" err: %d", (uint64_t)range->iova,
                 (uint64_t)range->size, errno);
-- 
2.26.3




* [PATCH v2 05/20] vfio/common: Fix wrong %m usages
  2023-02-22 17:48 [PATCH v2 00/20] vfio: Add migration pre-copy support and device dirty tracking Avihai Horon
                   ` (3 preceding siblings ...)
  2023-02-22 17:48 ` [PATCH v2 04/20] vfio/common: Fix error reporting in vfio_get_dirty_bitmap() Avihai Horon
@ 2023-02-22 17:49 ` Avihai Horon
  2023-02-22 17:49 ` [PATCH v2 06/20] vfio/common: Abort migration if dirty log start/stop/sync fails Avihai Horon
                   ` (16 subsequent siblings)
  21 siblings, 0 replies; 93+ messages in thread
From: Avihai Horon @ 2023-02-22 17:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Avihai Horon,
	Kirti Wankhede, Tarun Gupta, Joao Martins

There are several places where the %m conversion is used if one of
vfio_dma_map(), vfio_dma_unmap() or vfio_get_dirty_bitmap() fails.

The %m usage in these places is wrong, since %m relies on the errno value
while the above functions don't report errors via errno.

Fix it by using strerror() with the returned value instead.

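As a standalone illustration of the problem (not code from this series;
the helper below merely stands in for the QEMU functions, and %m is the
glibc printf extension that expands to strerror(errno)):

#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for vfio_dma_map() & co.: failure is reported only via the
 * return value, errno is left untouched. */
static int do_map(void)
{
    return -EINVAL;
}

int main(void)
{
    int ret;

    errno = 0;          /* whatever errno happens to hold at this point */
    ret = do_map();

    printf("with %%m      : %m\n");                 /* "Success" - misleading */
    printf("with strerror : %s\n", strerror(-ret)); /* "Invalid argument" */

    return 0;
}
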
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
---
 hw/vfio/common.c | 29 ++++++++++++++++-------------
 1 file changed, 16 insertions(+), 13 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 27db71427e..930eda40a1 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -705,17 +705,17 @@ static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
                            read_only);
         if (ret) {
             error_report("vfio_dma_map(%p, 0x%"HWADDR_PRIx", "
-                         "0x%"HWADDR_PRIx", %p) = %d (%m)",
+                         "0x%"HWADDR_PRIx", %p) = %d (%s)",
                          container, iova,
-                         iotlb->addr_mask + 1, vaddr, ret);
+                         iotlb->addr_mask + 1, vaddr, ret, strerror(-ret));
         }
     } else {
         ret = vfio_dma_unmap(container, iova, iotlb->addr_mask + 1, iotlb);
         if (ret) {
             error_report("vfio_dma_unmap(%p, 0x%"HWADDR_PRIx", "
-                         "0x%"HWADDR_PRIx") = %d (%m)",
+                         "0x%"HWADDR_PRIx") = %d (%s)",
                          container, iova,
-                         iotlb->addr_mask + 1, ret);
+                         iotlb->addr_mask + 1, ret, strerror(-ret));
         }
     }
 out:
@@ -1097,8 +1097,9 @@ static void vfio_listener_region_add(MemoryListener *listener,
                        vaddr, section->readonly);
     if (ret) {
         error_setg(&err, "vfio_dma_map(%p, 0x%"HWADDR_PRIx", "
-                   "0x%"HWADDR_PRIx", %p) = %d (%m)",
-                   container, iova, int128_get64(llsize), vaddr, ret);
+                   "0x%"HWADDR_PRIx", %p) = %d (%s)",
+                   container, iova, int128_get64(llsize), vaddr, ret,
+                   strerror(-ret));
         if (memory_region_is_ram_device(section->mr)) {
             /* Allow unexpected mappings not to be fatal for RAM devices */
             error_report_err(err);
@@ -1230,16 +1231,18 @@ static void vfio_listener_region_del(MemoryListener *listener,
             ret = vfio_dma_unmap(container, iova, int128_get64(llsize), NULL);
             if (ret) {
                 error_report("vfio_dma_unmap(%p, 0x%"HWADDR_PRIx", "
-                             "0x%"HWADDR_PRIx") = %d (%m)",
-                             container, iova, int128_get64(llsize), ret);
+                             "0x%"HWADDR_PRIx") = %d (%s)",
+                             container, iova, int128_get64(llsize), ret,
+                             strerror(-ret));
             }
             iova += int128_get64(llsize);
         }
         ret = vfio_dma_unmap(container, iova, int128_get64(llsize), NULL);
         if (ret) {
             error_report("vfio_dma_unmap(%p, 0x%"HWADDR_PRIx", "
-                         "0x%"HWADDR_PRIx") = %d (%m)",
-                         container, iova, int128_get64(llsize), ret);
+                         "0x%"HWADDR_PRIx") = %d (%s)",
+                         container, iova, int128_get64(llsize), ret,
+                         strerror(-ret));
         }
     }
 
@@ -1386,9 +1389,9 @@ static void vfio_iommu_map_dirty_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
                                     translated_addr);
         if (ret) {
             error_report("vfio_iommu_map_dirty_notify(%p, 0x%"HWADDR_PRIx", "
-                         "0x%"HWADDR_PRIx") = %d (%m)",
-                         container, iova,
-                         iotlb->addr_mask + 1, ret);
+                         "0x%"HWADDR_PRIx") = %d (%s)",
+                         container, iova, iotlb->addr_mask + 1, ret,
+                         strerror(-ret));
         }
     }
     rcu_read_unlock();
-- 
2.26.3




* [PATCH v2 06/20] vfio/common: Abort migration if dirty log start/stop/sync fails
  2023-02-22 17:48 [PATCH v2 00/20] vfio: Add migration pre-copy support and device dirty tracking Avihai Horon
                   ` (4 preceding siblings ...)
  2023-02-22 17:49 ` [PATCH v2 05/20] vfio/common: Fix wrong %m usages Avihai Horon
@ 2023-02-22 17:49 ` Avihai Horon
  2023-02-22 17:49 ` [PATCH v2 07/20] vfio/common: Add VFIOBitmap and (de)alloc functions Avihai Horon
                   ` (15 subsequent siblings)
  21 siblings, 0 replies; 93+ messages in thread
From: Avihai Horon @ 2023-02-22 17:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Avihai Horon,
	Kirti Wankhede, Tarun Gupta, Joao Martins

If VFIO dirty pages log start/stop/sync fails during migration, the
migration should be aborted, as pages dirtied by VFIO devices might not
be reported properly.

This is not the case today: in such a scenario only an error is printed.

Fix it by aborting the migration in the above scenario.

Fixes: 758b96b61d5c ("vfio/migrate: Move switch of dirty tracking into vfio_memory_listener")
Fixes: b6dd6504e303 ("vfio: Add vfio_listener_log_sync to mark dirty pages")
Fixes: 9e7b0442f23a ("vfio: Add ioctl to get dirty pages bitmap during dma unmap")
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
---
 hw/vfio/common.c | 53 ++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 45 insertions(+), 8 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 930eda40a1..ac93b85632 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -42,6 +42,7 @@
 #include "migration/migration.h"
 #include "migration/misc.h"
 #include "migration/blocker.h"
+#include "migration/qemu-file.h"
 #include "sysemu/tpm.h"
 
 VFIOGroupList vfio_group_list =
@@ -390,6 +391,19 @@ void vfio_unblock_multiple_devices_migration(void)
     multiple_devices_migration_blocker = NULL;
 }
 
+static void vfio_set_migration_error(int err)
+{
+    MigrationState *ms = migrate_get_current();
+
+    if (migration_is_setup_or_active(ms->state)) {
+        WITH_QEMU_LOCK_GUARD(&ms->qemu_file_lock) {
+            if (ms->to_dst_file) {
+                qemu_file_set_error(ms->to_dst_file, err);
+            }
+        }
+    }
+}
+
 static bool vfio_devices_all_dirty_tracking(VFIOContainer *container)
 {
     VFIOGroup *group;
@@ -682,6 +696,7 @@ static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
     if (iotlb->target_as != &address_space_memory) {
         error_report("Wrong target AS \"%s\", only system memory is allowed",
                      iotlb->target_as->name ? iotlb->target_as->name : "none");
+        vfio_set_migration_error(-EINVAL);
         return;
     }
 
@@ -716,6 +731,7 @@ static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
                          "0x%"HWADDR_PRIx") = %d (%s)",
                          container, iova,
                          iotlb->addr_mask + 1, ret, strerror(-ret));
+            vfio_set_migration_error(ret);
         }
     }
 out:
@@ -1261,7 +1277,7 @@ static void vfio_listener_region_del(MemoryListener *listener,
     }
 }
 
-static void vfio_set_dirty_page_tracking(VFIOContainer *container, bool start)
+static int vfio_set_dirty_page_tracking(VFIOContainer *container, bool start)
 {
     int ret;
     struct vfio_iommu_type1_dirty_bitmap dirty = {
@@ -1269,7 +1285,7 @@ static void vfio_set_dirty_page_tracking(VFIOContainer *container, bool start)
     };
 
     if (!container->dirty_pages_supported) {
-        return;
+        return 0;
     }
 
     if (start) {
@@ -1280,23 +1296,34 @@ static void vfio_set_dirty_page_tracking(VFIOContainer *container, bool start)
 
     ret = ioctl(container->fd, VFIO_IOMMU_DIRTY_PAGES, &dirty);
     if (ret) {
+        ret = -errno;
         error_report("Failed to set dirty tracking flag 0x%x errno: %d",
                      dirty.flags, errno);
     }
+
+    return ret;
 }
 
 static void vfio_listener_log_global_start(MemoryListener *listener)
 {
     VFIOContainer *container = container_of(listener, VFIOContainer, listener);
+    int ret;
 
-    vfio_set_dirty_page_tracking(container, true);
+    ret = vfio_set_dirty_page_tracking(container, true);
+    if (ret) {
+        vfio_set_migration_error(ret);
+    }
 }
 
 static void vfio_listener_log_global_stop(MemoryListener *listener)
 {
     VFIOContainer *container = container_of(listener, VFIOContainer, listener);
+    int ret;
 
-    vfio_set_dirty_page_tracking(container, false);
+    ret = vfio_set_dirty_page_tracking(container, false);
+    if (ret) {
+        vfio_set_migration_error(ret);
+    }
 }
 
 static int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
@@ -1372,19 +1399,18 @@ static void vfio_iommu_map_dirty_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
     VFIOContainer *container = giommu->container;
     hwaddr iova = iotlb->iova + giommu->iommu_offset;
     ram_addr_t translated_addr;
+    int ret = -EINVAL;
 
     trace_vfio_iommu_map_dirty_notify(iova, iova + iotlb->addr_mask);
 
     if (iotlb->target_as != &address_space_memory) {
         error_report("Wrong target AS \"%s\", only system memory is allowed",
                      iotlb->target_as->name ? iotlb->target_as->name : "none");
-        return;
+        goto out;
     }
 
     rcu_read_lock();
     if (vfio_get_xlat_addr(iotlb, NULL, &translated_addr, NULL)) {
-        int ret;
-
         ret = vfio_get_dirty_bitmap(container, iova, iotlb->addr_mask + 1,
                                     translated_addr);
         if (ret) {
@@ -1395,6 +1421,11 @@ static void vfio_iommu_map_dirty_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
         }
     }
     rcu_read_unlock();
+
+out:
+    if (ret) {
+        vfio_set_migration_error(ret);
+    }
 }
 
 static int vfio_ram_discard_get_dirty_bitmap(MemoryRegionSection *section,
@@ -1487,13 +1518,19 @@ static void vfio_listener_log_sync(MemoryListener *listener,
         MemoryRegionSection *section)
 {
     VFIOContainer *container = container_of(listener, VFIOContainer, listener);
+    int ret;
 
     if (vfio_listener_skipped_section(section)) {
         return;
     }
 
     if (vfio_devices_all_dirty_tracking(container)) {
-        vfio_sync_dirty_bitmap(container, section);
+        ret = vfio_sync_dirty_bitmap(container, section);
+        if (ret) {
+            error_report("vfio: Failed to sync dirty bitmap, err: %d (%s)", ret,
+                         strerror(-ret));
+            vfio_set_migration_error(ret);
+        }
     }
 }
 
-- 
2.26.3




* [PATCH v2 07/20] vfio/common: Add VFIOBitmap and (de)alloc functions
  2023-02-22 17:48 [PATCH v2 00/20] vfio: Add migration pre-copy support and device dirty tracking Avihai Horon
                   ` (5 preceding siblings ...)
  2023-02-22 17:49 ` [PATCH v2 06/20] vfio/common: Abort migration if dirty log start/stop/sync fails Avihai Horon
@ 2023-02-22 17:49 ` Avihai Horon
  2023-02-22 21:40   ` Alex Williamson
  2023-02-27 14:09   ` Cédric Le Goater
  2023-02-22 17:49 ` [PATCH v2 08/20] util: Add iova_tree_nnodes() Avihai Horon
                   ` (14 subsequent siblings)
  21 siblings, 2 replies; 93+ messages in thread
From: Avihai Horon @ 2023-02-22 17:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Avihai Horon,
	Kirti Wankhede, Tarun Gupta, Joao Martins

There are already two places where dirty page bitmap allocation and
calculations are open-coded. With device dirty page tracking being added
in the next patches, there are going to be even more such places.

To avoid code duplication, introduce a VFIOBitmap struct and corresponding
alloc and dealloc functions, and use them where applicable.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
 hw/vfio/common.c | 89 ++++++++++++++++++++++++++++++++----------------
 1 file changed, 60 insertions(+), 29 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index ac93b85632..84f08bdbbb 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -320,6 +320,41 @@ const MemoryRegionOps vfio_region_ops = {
  * Device state interfaces
  */
 
+typedef struct {
+    unsigned long *bitmap;
+    hwaddr size;
+    hwaddr pages;
+} VFIOBitmap;
+
+static VFIOBitmap *vfio_bitmap_alloc(hwaddr size)
+{
+    VFIOBitmap *vbmap = g_try_new0(VFIOBitmap, 1);
+    if (!vbmap) {
+        errno = ENOMEM;
+
+        return NULL;
+    }
+
+    vbmap->pages = REAL_HOST_PAGE_ALIGN(size) / qemu_real_host_page_size();
+    vbmap->size = ROUND_UP(vbmap->pages, sizeof(__u64) * BITS_PER_BYTE) /
+                                         BITS_PER_BYTE;
+    vbmap->bitmap = g_try_malloc0(vbmap->size);
+    if (!vbmap->bitmap) {
+        g_free(vbmap);
+        errno = ENOMEM;
+
+        return NULL;
+    }
+
+    return vbmap;
+}
+
+static void vfio_bitmap_dealloc(VFIOBitmap *vbmap)
+{
+    g_free(vbmap->bitmap);
+    g_free(vbmap);
+}
+
 bool vfio_mig_active(void)
 {
     VFIOGroup *group;
@@ -470,9 +505,14 @@ static int vfio_dma_unmap_bitmap(VFIOContainer *container,
 {
     struct vfio_iommu_type1_dma_unmap *unmap;
     struct vfio_bitmap *bitmap;
-    uint64_t pages = REAL_HOST_PAGE_ALIGN(size) / qemu_real_host_page_size();
+    VFIOBitmap *vbmap;
     int ret;
 
+    vbmap = vfio_bitmap_alloc(size);
+    if (!vbmap) {
+        return -errno;
+    }
+
     unmap = g_malloc0(sizeof(*unmap) + sizeof(*bitmap));
 
     unmap->argsz = sizeof(*unmap) + sizeof(*bitmap);
@@ -486,35 +526,28 @@ static int vfio_dma_unmap_bitmap(VFIOContainer *container,
      * qemu_real_host_page_size to mark those dirty. Hence set bitmap_pgsize
      * to qemu_real_host_page_size.
      */
-
     bitmap->pgsize = qemu_real_host_page_size();
-    bitmap->size = ROUND_UP(pages, sizeof(__u64) * BITS_PER_BYTE) /
-                   BITS_PER_BYTE;
+    bitmap->size = vbmap->size;
+    bitmap->data = (__u64 *)vbmap->bitmap;
 
-    if (bitmap->size > container->max_dirty_bitmap_size) {
-        error_report("UNMAP: Size of bitmap too big 0x%"PRIx64,
-                     (uint64_t)bitmap->size);
+    if (vbmap->size > container->max_dirty_bitmap_size) {
+        error_report("UNMAP: Size of bitmap too big 0x%"PRIx64, vbmap->size);
         ret = -E2BIG;
         goto unmap_exit;
     }
 
-    bitmap->data = g_try_malloc0(bitmap->size);
-    if (!bitmap->data) {
-        ret = -ENOMEM;
-        goto unmap_exit;
-    }
-
     ret = ioctl(container->fd, VFIO_IOMMU_UNMAP_DMA, unmap);
     if (!ret) {
-        cpu_physical_memory_set_dirty_lebitmap((unsigned long *)bitmap->data,
-                iotlb->translated_addr, pages);
+        cpu_physical_memory_set_dirty_lebitmap(vbmap->bitmap,
+                iotlb->translated_addr, vbmap->pages);
     } else {
         error_report("VFIO_UNMAP_DMA with DIRTY_BITMAP : %m");
     }
 
-    g_free(bitmap->data);
 unmap_exit:
     g_free(unmap);
+    vfio_bitmap_dealloc(vbmap);
+
     return ret;
 }
 
@@ -1331,7 +1364,7 @@ static int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
 {
     struct vfio_iommu_type1_dirty_bitmap *dbitmap;
     struct vfio_iommu_type1_dirty_bitmap_get *range;
-    uint64_t pages;
+    VFIOBitmap *vbmap;
     int ret;
 
     if (!container->dirty_pages_supported) {
@@ -1341,6 +1374,11 @@ static int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
         return 0;
     }
 
+    vbmap = vfio_bitmap_alloc(size);
+    if (!vbmap) {
+        return -errno;
+    }
+
     dbitmap = g_malloc0(sizeof(*dbitmap) + sizeof(*range));
 
     dbitmap->argsz = sizeof(*dbitmap) + sizeof(*range);
@@ -1355,15 +1393,8 @@ static int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
      * to qemu_real_host_page_size.
      */
     range->bitmap.pgsize = qemu_real_host_page_size();
-
-    pages = REAL_HOST_PAGE_ALIGN(range->size) / qemu_real_host_page_size();
-    range->bitmap.size = ROUND_UP(pages, sizeof(__u64) * BITS_PER_BYTE) /
-                                         BITS_PER_BYTE;
-    range->bitmap.data = g_try_malloc0(range->bitmap.size);
-    if (!range->bitmap.data) {
-        ret = -ENOMEM;
-        goto err_out;
-    }
+    range->bitmap.size = vbmap->size;
+    range->bitmap.data = (__u64 *)vbmap->bitmap;
 
     ret = ioctl(container->fd, VFIO_IOMMU_DIRTY_PAGES, dbitmap);
     if (ret) {
@@ -1374,14 +1405,14 @@ static int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
         goto err_out;
     }
 
-    cpu_physical_memory_set_dirty_lebitmap((unsigned long *)range->bitmap.data,
-                                            ram_addr, pages);
+    cpu_physical_memory_set_dirty_lebitmap(vbmap->bitmap, ram_addr,
+                                           vbmap->pages);
 
     trace_vfio_get_dirty_bitmap(container->fd, range->iova, range->size,
                                 range->bitmap.size, ram_addr);
 err_out:
-    g_free(range->bitmap.data);
     g_free(dbitmap);
+    vfio_bitmap_dealloc(vbmap);
 
     return ret;
 }
-- 
2.26.3



^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v2 08/20] util: Add iova_tree_nnodes()
  2023-02-22 17:48 [PATCH v2 00/20] vfio: Add migration pre-copy support and device dirty tracking Avihai Horon
                   ` (6 preceding siblings ...)
  2023-02-22 17:49 ` [PATCH v2 07/20] vfio/common: Add VFIOBitmap and (de)alloc functions Avihai Horon
@ 2023-02-22 17:49 ` Avihai Horon
  2023-02-22 17:49 ` [PATCH v2 09/20] util: Extend iova_tree_foreach() to take data argument Avihai Horon
                   ` (13 subsequent siblings)
  21 siblings, 0 replies; 93+ messages in thread
From: Avihai Horon @ 2023-02-22 17:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Avihai Horon,
	Kirti Wankhede, Tarun Gupta, Joao Martins

From: Joao Martins <joao.m.martins@oracle.com>

Add iova_tree_nnodes() which returns the number of nodes in the IOVA
tree.
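
Not part of the patch; a trivial usage sketch, assuming a tree that was
previously populated with iova_tree_insert():

    IOVATree *tree = iova_tree_new();
    /* ... iova_tree_insert() calls ... */
    gint n = iova_tree_nnodes(tree);
    /* e.g. check whether all recorded ranges fit in one page worth of
     * vfio_device_feature_dma_logging_range entries */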

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Acked-by: Peter Xu <peterx@redhat.com>
---
 include/qemu/iova-tree.h | 11 +++++++++++
 util/iova-tree.c         |  5 +++++
 2 files changed, 16 insertions(+)

diff --git a/include/qemu/iova-tree.h b/include/qemu/iova-tree.h
index 8528e5c98f..7bb80783ce 100644
--- a/include/qemu/iova-tree.h
+++ b/include/qemu/iova-tree.h
@@ -164,4 +164,15 @@ int iova_tree_alloc_map(IOVATree *tree, DMAMap *map, hwaddr iova_begin,
  */
 void iova_tree_destroy(IOVATree *tree);
 
+/**
+ * iova_tree_nnodes:
+ *
+ * @tree: the iova tree to consult
+ *
+ * Returns the number of nodes in the iova tree
+ *
+ * Return: >=0 for the number of nodes.
+ */
+gint iova_tree_nnodes(IOVATree *tree);
+
 #endif
diff --git a/util/iova-tree.c b/util/iova-tree.c
index 536789797e..6141a6229b 100644
--- a/util/iova-tree.c
+++ b/util/iova-tree.c
@@ -280,3 +280,8 @@ void iova_tree_destroy(IOVATree *tree)
     g_tree_destroy(tree->tree);
     g_free(tree);
 }
+
+gint iova_tree_nnodes(IOVATree *tree)
+{
+    return g_tree_nnodes(tree->tree);
+}
-- 
2.26.3



^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v2 09/20] util: Extend iova_tree_foreach() to take data argument
  2023-02-22 17:48 [PATCH v2 00/20] vfio: Add migration pre-copy support and device dirty tracking Avihai Horon
                   ` (7 preceding siblings ...)
  2023-02-22 17:49 ` [PATCH v2 08/20] util: Add iova_tree_nnodes() Avihai Horon
@ 2023-02-22 17:49 ` Avihai Horon
  2023-02-22 17:49 ` [PATCH v2 10/20] vfio/common: Record DMA mapped IOVA ranges Avihai Horon
                   ` (12 subsequent siblings)
  21 siblings, 0 replies; 93+ messages in thread
From: Avihai Horon @ 2023-02-22 17:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Avihai Horon,
	Kirti Wankhede, Tarun Gupta, Joao Martins

From: Joao Martins <joao.m.martins@oracle.com>

Extend iova_tree_foreach() to take a data argument that is passed to and
used by the iterator.

While at it, fix a documentation error:
The documentation says iova_tree_foreach() returns a value even though
it is a void function.
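
Not part of the patch; a sketch of an iterator using the new data argument.
The callback name is hypothetical:

    static gboolean vfio_count_ro_mappings(DMAMap *map, gpointer data)
    {
        unsigned int *count = data;

        if (map->perm == IOMMU_RO) {
            (*count)++;
        }

        return false;   /* false == keep iterating */
    }

    unsigned int count = 0;

    iova_tree_foreach(tree, vfio_count_ro_mappings, &count);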

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Acked-by: Peter Xu <peterx@redhat.com>
---
 include/qemu/iova-tree.h |  8 +++++---
 util/iova-tree.c         | 18 ++++++++++++++----
 2 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/include/qemu/iova-tree.h b/include/qemu/iova-tree.h
index 7bb80783ce..1332dce014 100644
--- a/include/qemu/iova-tree.h
+++ b/include/qemu/iova-tree.h
@@ -38,7 +38,7 @@ typedef struct DMAMap {
     hwaddr size;                /* Inclusive */
     IOMMUAccessFlags perm;
 } QEMU_PACKED DMAMap;
-typedef gboolean (*iova_tree_iterator)(DMAMap *map);
+typedef gboolean (*iova_tree_iterator)(DMAMap *map, gpointer data);
 
 /**
  * iova_tree_new:
@@ -129,12 +129,14 @@ const DMAMap *iova_tree_find_address(const IOVATree *tree, hwaddr iova);
  *
  * @tree: the iova tree to iterate on
  * @iterator: the interator for the mappings, return true to stop
+ * @data: data to be passed to the iterator
  *
  * Iterate over the iova tree.
  *
- * Return: 1 if found any overlap, 0 if not, <0 if error.
+ * Return: None.
  */
-void iova_tree_foreach(IOVATree *tree, iova_tree_iterator iterator);
+void iova_tree_foreach(IOVATree *tree, iova_tree_iterator iterator,
+                       gpointer data);
 
 /**
  * iova_tree_alloc_map:
diff --git a/util/iova-tree.c b/util/iova-tree.c
index 6141a6229b..9845427b86 100644
--- a/util/iova-tree.c
+++ b/util/iova-tree.c
@@ -42,6 +42,11 @@ typedef struct IOVATreeFindIOVAArgs {
     const DMAMap *result;
 } IOVATreeFindIOVAArgs;
 
+typedef struct IOVATreeIterator {
+    iova_tree_iterator fn;
+    gpointer data;
+} IOVATreeIterator;
+
 /**
  * Iterate args to the next hole
  *
@@ -151,17 +156,22 @@ int iova_tree_insert(IOVATree *tree, const DMAMap *map)
 static gboolean iova_tree_traverse(gpointer key, gpointer value,
                                 gpointer data)
 {
-    iova_tree_iterator iterator = data;
+    IOVATreeIterator *iterator = data;
     DMAMap *map = key;
 
     g_assert(key == value);
 
-    return iterator(map);
+    return iterator->fn(map, iterator->data);
 }
 
-void iova_tree_foreach(IOVATree *tree, iova_tree_iterator iterator)
+void iova_tree_foreach(IOVATree *tree, iova_tree_iterator iterator,
+                       gpointer data)
 {
-    g_tree_foreach(tree->tree, iova_tree_traverse, iterator);
+    IOVATreeIterator arg = {
+        .fn = iterator,
+        .data = data,
+    };
+    g_tree_foreach(tree->tree, iova_tree_traverse, &arg);
 }
 
 void iova_tree_remove(IOVATree *tree, DMAMap map)
-- 
2.26.3



^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v2 10/20] vfio/common: Record DMA mapped IOVA ranges
  2023-02-22 17:48 [PATCH v2 00/20] vfio: Add migration pre-copy support and device dirty tracking Avihai Horon
                   ` (8 preceding siblings ...)
  2023-02-22 17:49 ` [PATCH v2 09/20] util: Extend iova_tree_foreach() to take data argument Avihai Horon
@ 2023-02-22 17:49 ` Avihai Horon
  2023-02-22 22:10   ` Alex Williamson
  2023-02-22 17:49 ` [PATCH v2 11/20] vfio/common: Add device dirty page tracking start/stop Avihai Horon
                   ` (11 subsequent siblings)
  21 siblings, 1 reply; 93+ messages in thread
From: Avihai Horon @ 2023-02-22 17:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Avihai Horon,
	Kirti Wankhede, Tarun Gupta, Joao Martins

From: Joao Martins <joao.m.martins@oracle.com>

According to the device DMA logging uAPI, IOVA ranges to be logged by
the device must be provided all at once upon DMA logging start.

As preparation for the following patches, which will add device dirty
page tracking, keep a record of all DMA mapped IOVA ranges so they can
later be used when DMA logging is started.

Note that when vIOMMU is enabled, DMA mapped IOVA ranges are not tracked.
This is due to the dynamic nature of vIOMMU DMA mapping/unmapping.
Following patches will address the vIOMMU case specifically.
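
Not part of the patch; the record/erase pairing added around DMA map/unmap,
in condensed form:

    /* on vfio_dma_map(): */
    DMAMap map = {
        .iova = iova,
        .size = size - 1,               /* IOVATree ranges are inclusive */
        .perm = readonly ? IOMMU_RO : IOMMU_RW,
    };
    WITH_QEMU_LOCK_GUARD(&container->mappings_mutex) {
        ret = iova_tree_insert(container->mappings, &map);
    }

    /* on vfio_dma_unmap(): */
    WITH_QEMU_LOCK_GUARD(&container->mappings_mutex) {
        iova_tree_remove(container->mappings, map);
    }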

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
 include/hw/vfio/vfio-common.h |  3 ++
 hw/vfio/common.c              | 86 +++++++++++++++++++++++++++++++++--
 2 files changed, 86 insertions(+), 3 deletions(-)

diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index ee55d442b4..6f36876ce0 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -23,6 +23,7 @@
 
 #include "exec/memory.h"
 #include "qemu/queue.h"
+#include "qemu/iova-tree.h"
 #include "qemu/notify.h"
 #include "ui/console.h"
 #include "hw/display/ramfb.h"
@@ -92,6 +93,8 @@ typedef struct VFIOContainer {
     uint64_t max_dirty_bitmap_size;
     unsigned long pgsizes;
     unsigned int dma_max_mappings;
+    IOVATree *mappings;
+    QemuMutex mappings_mutex;
     QLIST_HEAD(, VFIOGuestIOMMU) giommu_list;
     QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list;
     QLIST_HEAD(, VFIOGroup) group_list;
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 84f08bdbbb..6041da6c7e 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -44,6 +44,7 @@
 #include "migration/blocker.h"
 #include "migration/qemu-file.h"
 #include "sysemu/tpm.h"
+#include "qemu/iova-tree.h"
 
 VFIOGroupList vfio_group_list =
     QLIST_HEAD_INITIALIZER(vfio_group_list);
@@ -426,6 +427,11 @@ void vfio_unblock_multiple_devices_migration(void)
     multiple_devices_migration_blocker = NULL;
 }
 
+static bool vfio_have_giommu(VFIOContainer *container)
+{
+    return !QLIST_EMPTY(&container->giommu_list);
+}
+
 static void vfio_set_migration_error(int err)
 {
     MigrationState *ms = migrate_get_current();
@@ -499,6 +505,51 @@ static bool vfio_devices_all_running_and_mig_active(VFIOContainer *container)
     return true;
 }
 
+static int vfio_record_mapping(VFIOContainer *container, hwaddr iova,
+                               hwaddr size, bool readonly)
+{
+    DMAMap map = {
+        .iova = iova,
+        .size = size - 1, /* IOVATree is inclusive, so subtract 1 from size */
+        .perm = readonly ? IOMMU_RO : IOMMU_RW,
+    };
+    int ret;
+
+    if (vfio_have_giommu(container)) {
+        return 0;
+    }
+
+    WITH_QEMU_LOCK_GUARD(&container->mappings_mutex) {
+        ret = iova_tree_insert(container->mappings, &map);
+        if (ret) {
+            if (ret == IOVA_ERR_INVALID) {
+                ret = -EINVAL;
+            } else if (ret == IOVA_ERR_OVERLAP) {
+                ret = -EEXIST;
+            }
+        }
+    }
+
+    return ret;
+}
+
+static void vfio_erase_mapping(VFIOContainer *container, hwaddr iova,
+                                hwaddr size)
+{
+    DMAMap map = {
+        .iova = iova,
+        .size = size - 1, /* IOVATree is inclusive, so subtract 1 from size */
+    };
+
+    if (vfio_have_giommu(container)) {
+        return;
+    }
+
+    WITH_QEMU_LOCK_GUARD(&container->mappings_mutex) {
+        iova_tree_remove(container->mappings, map);
+    }
+}
+
 static int vfio_dma_unmap_bitmap(VFIOContainer *container,
                                  hwaddr iova, ram_addr_t size,
                                  IOMMUTLBEntry *iotlb)
@@ -599,6 +650,8 @@ static int vfio_dma_unmap(VFIOContainer *container,
                                             DIRTY_CLIENTS_NOCODE);
     }
 
+    vfio_erase_mapping(container, iova, size);
+
     return 0;
 }
 
@@ -612,6 +665,16 @@ static int vfio_dma_map(VFIOContainer *container, hwaddr iova,
         .iova = iova,
         .size = size,
     };
+    int ret;
+
+    ret = vfio_record_mapping(container, iova, size, readonly);
+    if (ret) {
+        error_report("vfio: Failed to record mapping, iova: 0x%" HWADDR_PRIx
+                     ", size: 0x" RAM_ADDR_FMT ", ret: %d (%s)",
+                     iova, size, ret, strerror(-ret));
+
+        return ret;
+    }
 
     if (!readonly) {
         map.flags |= VFIO_DMA_MAP_FLAG_WRITE;
@@ -628,8 +691,12 @@ static int vfio_dma_map(VFIOContainer *container, hwaddr iova,
         return 0;
     }
 
+    ret = -errno;
     error_report("VFIO_MAP_DMA failed: %s", strerror(errno));
-    return -errno;
+
+    vfio_erase_mapping(container, iova, size);
+
+    return ret;
 }
 
 static void vfio_host_win_add(VFIOContainer *container,
@@ -2183,16 +2250,23 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
     QLIST_INIT(&container->giommu_list);
     QLIST_INIT(&container->hostwin_list);
     QLIST_INIT(&container->vrdl_list);
+    container->mappings = iova_tree_new();
+    if (!container->mappings) {
+        error_setg(errp, "Cannot allocate DMA mappings tree");
+        ret = -ENOMEM;
+        goto free_container_exit;
+    }
+    qemu_mutex_init(&container->mappings_mutex);
 
     ret = vfio_init_container(container, group->fd, errp);
     if (ret) {
-        goto free_container_exit;
+        goto destroy_mappings_exit;
     }
 
     ret = vfio_ram_block_discard_disable(container, true);
     if (ret) {
         error_setg_errno(errp, -ret, "Cannot set discarding of RAM broken");
-        goto free_container_exit;
+        goto destroy_mappings_exit;
     }
 
     switch (container->iommu_type) {
@@ -2328,6 +2402,10 @@ listener_release_exit:
 enable_discards_exit:
     vfio_ram_block_discard_disable(container, false);
 
+destroy_mappings_exit:
+    qemu_mutex_destroy(&container->mappings_mutex);
+    iova_tree_destroy(container->mappings);
+
 free_container_exit:
     g_free(container);
 
@@ -2382,6 +2460,8 @@ static void vfio_disconnect_container(VFIOGroup *group)
         }
 
         trace_vfio_disconnect_container(container->fd);
+        qemu_mutex_destroy(&container->mappings_mutex);
+        iova_tree_destroy(container->mappings);
         close(container->fd);
         g_free(container);
 
-- 
2.26.3



^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v2 11/20] vfio/common: Add device dirty page tracking start/stop
  2023-02-22 17:48 [PATCH v2 00/20] vfio: Add migration pre-copy support and device dirty tracking Avihai Horon
                   ` (9 preceding siblings ...)
  2023-02-22 17:49 ` [PATCH v2 10/20] vfio/common: Record DMA mapped IOVA ranges Avihai Horon
@ 2023-02-22 17:49 ` Avihai Horon
  2023-02-22 22:40   ` Alex Williamson
  2023-02-22 17:49 ` [PATCH v2 12/20] vfio/common: Extract code from vfio_get_dirty_bitmap() to new function Avihai Horon
                   ` (10 subsequent siblings)
  21 siblings, 1 reply; 93+ messages in thread
From: Avihai Horon @ 2023-02-22 17:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Avihai Horon,
	Kirti Wankhede, Tarun Gupta, Joao Martins

From: Joao Martins <joao.m.martins@oracle.com>

Add device dirty page tracking start/stop functionality. This uses the
device DMA logging uAPI to start and stop dirty page tracking by the device.

Device dirty page tracking is used only if all devices within a
container support device dirty page tracking.
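
Not part of the patch; the shape of a single DMA_LOGGING_STOP call against
one device, condensed from the code below:

    uint64_t buf[DIV_ROUND_UP(sizeof(struct vfio_device_feature),
                              sizeof(uint64_t))] = {};
    struct vfio_device_feature *feature = (struct vfio_device_feature *)buf;

    feature->argsz = sizeof(buf);
    feature->flags = VFIO_DEVICE_FEATURE_SET |
                     VFIO_DEVICE_FEATURE_DMA_LOGGING_STOP;

    if (ioctl(vbasedev->fd, VFIO_DEVICE_FEATURE, feature)) {
        return -errno;
    }
    vbasedev->dirty_tracking = false;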

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
 include/hw/vfio/vfio-common.h |   2 +
 hw/vfio/common.c              | 211 +++++++++++++++++++++++++++++++++-
 2 files changed, 211 insertions(+), 2 deletions(-)

diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 6f36876ce0..1f21e1fa43 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -149,6 +149,8 @@ typedef struct VFIODevice {
     VFIOMigration *migration;
     Error *migration_blocker;
     OnOffAuto pre_copy_dirty_page_tracking;
+    bool dirty_pages_supported;
+    bool dirty_tracking;
 } VFIODevice;
 
 struct VFIODeviceOps {
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 6041da6c7e..740153e7d7 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -473,6 +473,22 @@ static bool vfio_devices_all_dirty_tracking(VFIOContainer *container)
     return true;
 }
 
+static bool vfio_devices_all_device_dirty_tracking(VFIOContainer *container)
+{
+    VFIOGroup *group;
+    VFIODevice *vbasedev;
+
+    QLIST_FOREACH(group, &container->group_list, container_next) {
+        QLIST_FOREACH(vbasedev, &group->device_list, next) {
+            if (!vbasedev->dirty_pages_supported) {
+                return false;
+            }
+        }
+    }
+
+    return true;
+}
+
 /*
  * Check if all VFIO devices are running and migration is active, which is
  * essentially equivalent to the migration being in pre-copy phase.
@@ -1404,13 +1420,192 @@ static int vfio_set_dirty_page_tracking(VFIOContainer *container, bool start)
     return ret;
 }
 
+static int vfio_devices_dma_logging_set(VFIOContainer *container,
+                                        struct vfio_device_feature *feature)
+{
+    bool status = (feature->flags & VFIO_DEVICE_FEATURE_MASK) ==
+                  VFIO_DEVICE_FEATURE_DMA_LOGGING_START;
+    VFIODevice *vbasedev;
+    VFIOGroup *group;
+    int ret = 0;
+
+    QLIST_FOREACH(group, &container->group_list, container_next) {
+        QLIST_FOREACH(vbasedev, &group->device_list, next) {
+            if (vbasedev->dirty_tracking == status) {
+                continue;
+            }
+
+            ret = ioctl(vbasedev->fd, VFIO_DEVICE_FEATURE, feature);
+            if (ret) {
+                ret = -errno;
+                error_report("%s: Failed to set DMA logging %s, err %d (%s)",
+                             vbasedev->name, status ? "start" : "stop", ret,
+                             strerror(errno));
+                goto out;
+            }
+            vbasedev->dirty_tracking = status;
+        }
+    }
+
+out:
+    return ret;
+}
+
+static int vfio_devices_dma_logging_stop(VFIOContainer *container)
+{
+    uint64_t buf[DIV_ROUND_UP(sizeof(struct vfio_device_feature),
+                              sizeof(uint64_t))] = {};
+    struct vfio_device_feature *feature = (struct vfio_device_feature *)buf;
+
+    feature->argsz = sizeof(buf);
+    feature->flags = VFIO_DEVICE_FEATURE_SET;
+    feature->flags |= VFIO_DEVICE_FEATURE_DMA_LOGGING_STOP;
+
+    return vfio_devices_dma_logging_set(container, feature);
+}
+
+static gboolean vfio_device_dma_logging_range_add(DMAMap *map, gpointer data)
+{
+    struct vfio_device_feature_dma_logging_range **out = data;
+    struct vfio_device_feature_dma_logging_range *range = *out;
+
+    range->iova = map->iova;
+    /* IOVATree is inclusive, DMA logging uAPI isn't, so add 1 to length */
+    range->length = map->size + 1;
+
+    *out = ++range;
+
+    return false;
+}
+
+static gboolean vfio_iova_tree_get_first(DMAMap *map, gpointer data)
+{
+    DMAMap *first = data;
+
+    first->iova = map->iova;
+    first->size = map->size;
+
+    return true;
+}
+
+static gboolean vfio_iova_tree_get_last(DMAMap *map, gpointer data)
+{
+    DMAMap *last = data;
+
+    last->iova = map->iova;
+    last->size = map->size;
+
+    return false;
+}
+
+static struct vfio_device_feature *
+vfio_device_feature_dma_logging_start_create(VFIOContainer *container)
+{
+    struct vfio_device_feature *feature;
+    size_t feature_size;
+    struct vfio_device_feature_dma_logging_control *control;
+    struct vfio_device_feature_dma_logging_range *ranges;
+    unsigned int max_ranges;
+    unsigned int cur_ranges;
+
+    feature_size = sizeof(struct vfio_device_feature) +
+                   sizeof(struct vfio_device_feature_dma_logging_control);
+    feature = g_malloc0(feature_size);
+    feature->argsz = feature_size;
+    feature->flags = VFIO_DEVICE_FEATURE_SET;
+    feature->flags |= VFIO_DEVICE_FEATURE_DMA_LOGGING_START;
+
+    control = (struct vfio_device_feature_dma_logging_control *)feature->data;
+    control->page_size = qemu_real_host_page_size();
+
+    QEMU_LOCK_GUARD(&container->mappings_mutex);
+
+    /*
+     * DMA logging uAPI guarantees to support at least num_ranges that fits into
+     * a single host kernel page. To be on the safe side, use this as a limit
+     * from which to merge to a single range.
+     */
+    max_ranges = qemu_real_host_page_size() / sizeof(*ranges);
+    cur_ranges = iova_tree_nnodes(container->mappings);
+    control->num_ranges = (cur_ranges <= max_ranges) ? cur_ranges : 1;
+    ranges = g_try_new0(struct vfio_device_feature_dma_logging_range,
+                        control->num_ranges);
+    if (!ranges) {
+        g_free(feature);
+        errno = ENOMEM;
+
+        return NULL;
+    }
+
+    control->ranges = (uint64_t)ranges;
+    if (cur_ranges <= max_ranges) {
+        iova_tree_foreach(container->mappings,
+                          vfio_device_dma_logging_range_add, &ranges);
+    } else {
+        DMAMap first, last;
+
+        iova_tree_foreach(container->mappings, vfio_iova_tree_get_first,
+                          &first);
+        iova_tree_foreach(container->mappings, vfio_iova_tree_get_last, &last);
+        ranges->iova = first.iova;
+        /* IOVATree is inclusive, DMA logging uAPI isn't, so add 1 to length */
+        ranges->length = (last.iova - first.iova) + last.size + 1;
+    }
+
+    return feature;
+}
+
+static void vfio_device_feature_dma_logging_start_destroy(
+    struct vfio_device_feature *feature)
+{
+    struct vfio_device_feature_dma_logging_control *control =
+        (struct vfio_device_feature_dma_logging_control *)feature->data;
+    struct vfio_device_feature_dma_logging_range *ranges =
+        (struct vfio_device_feature_dma_logging_range *)control->ranges;
+
+    g_free(ranges);
+    g_free(feature);
+}
+
+static int vfio_devices_dma_logging_start(VFIOContainer *container)
+{
+    struct vfio_device_feature *feature;
+    int ret;
+
+    feature = vfio_device_feature_dma_logging_start_create(container);
+    if (!feature) {
+        return -errno;
+    }
+
+    ret = vfio_devices_dma_logging_set(container, feature);
+    if (ret) {
+        vfio_devices_dma_logging_stop(container);
+    }
+
+    vfio_device_feature_dma_logging_start_destroy(feature);
+
+    return ret;
+}
+
 static void vfio_listener_log_global_start(MemoryListener *listener)
 {
     VFIOContainer *container = container_of(listener, VFIOContainer, listener);
     int ret;
 
-    ret = vfio_set_dirty_page_tracking(container, true);
+    if (vfio_devices_all_device_dirty_tracking(container)) {
+        if (vfio_have_giommu(container)) {
+            /* Device dirty page tracking currently doesn't support vIOMMU */
+            return;
+        }
+
+        ret = vfio_devices_dma_logging_start(container);
+    } else {
+        ret = vfio_set_dirty_page_tracking(container, true);
+    }
+
     if (ret) {
+        error_report("vfio: Could not start dirty page tracking, err: %d (%s)",
+                     ret, strerror(-ret));
         vfio_set_migration_error(ret);
     }
 }
@@ -1420,8 +1615,20 @@ static void vfio_listener_log_global_stop(MemoryListener *listener)
     VFIOContainer *container = container_of(listener, VFIOContainer, listener);
     int ret;
 
-    ret = vfio_set_dirty_page_tracking(container, false);
+    if (vfio_devices_all_device_dirty_tracking(container)) {
+        if (vfio_have_giommu(container)) {
+            /* Device dirty page tracking currently doesn't support vIOMMU */
+            return;
+        }
+
+        ret = vfio_devices_dma_logging_stop(container);
+    } else {
+        ret = vfio_set_dirty_page_tracking(container, false);
+    }
+
     if (ret) {
+        error_report("vfio: Could not stop dirty page tracking, err: %d (%s)",
+                     ret, strerror(-ret));
         vfio_set_migration_error(ret);
     }
 }
-- 
2.26.3



^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v2 12/20] vfio/common: Extract code from vfio_get_dirty_bitmap() to new function
  2023-02-22 17:48 [PATCH v2 00/20] vfio: Add migration pre-copy support and device dirty tracking Avihai Horon
                   ` (10 preceding siblings ...)
  2023-02-22 17:49 ` [PATCH v2 11/20] vfio/common: Add device dirty page tracking start/stop Avihai Horon
@ 2023-02-22 17:49 ` Avihai Horon
  2023-02-22 17:49 ` [PATCH v2 13/20] vfio/common: Add device dirty page bitmap sync Avihai Horon
                   ` (9 subsequent siblings)
  21 siblings, 0 replies; 93+ messages in thread
From: Avihai Horon @ 2023-02-22 17:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Avihai Horon,
	Kirti Wankhede, Tarun Gupta, Joao Martins

Extract the VFIO_IOMMU_DIRTY_PAGES ioctl code in vfio_get_dirty_bitmap()
to its own function.

This will help keep the code readable after the next patch adds device
dirty page bitmap sync functionality.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
 hw/vfio/common.c | 53 ++++++++++++++++++++++++++++++------------------
 1 file changed, 33 insertions(+), 20 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 740153e7d7..3ab5d8d442 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1633,26 +1633,13 @@ static void vfio_listener_log_global_stop(MemoryListener *listener)
     }
 }
 
-static int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
-                                 uint64_t size, ram_addr_t ram_addr)
+static int vfio_query_dirty_bitmap(VFIOContainer *container, VFIOBitmap *vbmap,
+                                   hwaddr iova, hwaddr size)
 {
     struct vfio_iommu_type1_dirty_bitmap *dbitmap;
     struct vfio_iommu_type1_dirty_bitmap_get *range;
-    VFIOBitmap *vbmap;
     int ret;
 
-    if (!container->dirty_pages_supported) {
-        cpu_physical_memory_set_dirty_range(ram_addr, size,
-                                            tcg_enabled() ? DIRTY_CLIENTS_ALL :
-                                            DIRTY_CLIENTS_NOCODE);
-        return 0;
-    }
-
-    vbmap = vfio_bitmap_alloc(size);
-    if (!vbmap) {
-        return -errno;
-    }
-
     dbitmap = g_malloc0(sizeof(*dbitmap) + sizeof(*range));
 
     dbitmap->argsz = sizeof(*dbitmap) + sizeof(*range);
@@ -1676,16 +1663,42 @@ static int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
         error_report("Failed to get dirty bitmap for iova: 0x%"PRIx64
                 " size: 0x%"PRIx64" err: %d", (uint64_t)range->iova,
                 (uint64_t)range->size, errno);
-        goto err_out;
+    }
+
+    g_free(dbitmap);
+
+    return ret;
+}
+
+static int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
+                                 uint64_t size, ram_addr_t ram_addr)
+{
+    VFIOBitmap *vbmap;
+    int ret;
+
+    if (!container->dirty_pages_supported) {
+        cpu_physical_memory_set_dirty_range(ram_addr, size,
+                                            tcg_enabled() ? DIRTY_CLIENTS_ALL :
+                                            DIRTY_CLIENTS_NOCODE);
+        return 0;
+    }
+
+    vbmap = vfio_bitmap_alloc(size);
+    if (!vbmap) {
+        return -errno;
+    }
+
+    ret = vfio_query_dirty_bitmap(container, vbmap, iova, size);
+    if (ret) {
+        goto out;
     }
 
     cpu_physical_memory_set_dirty_lebitmap(vbmap->bitmap, ram_addr,
                                            vbmap->pages);
 
-    trace_vfio_get_dirty_bitmap(container->fd, range->iova, range->size,
-                                range->bitmap.size, ram_addr);
-err_out:
-    g_free(dbitmap);
+    trace_vfio_get_dirty_bitmap(container->fd, iova, size, vbmap->size,
+                                ram_addr);
+out:
     vfio_bitmap_dealloc(vbmap);
 
     return ret;
-- 
2.26.3



^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v2 13/20] vfio/common: Add device dirty page bitmap sync
  2023-02-22 17:48 [PATCH v2 00/20] vfio: Add migration pre-copy support and device dirty tracking Avihai Horon
                   ` (11 preceding siblings ...)
  2023-02-22 17:49 ` [PATCH v2 12/20] vfio/common: Extract code from vfio_get_dirty_bitmap() to new function Avihai Horon
@ 2023-02-22 17:49 ` Avihai Horon
  2023-02-22 17:49 ` [PATCH v2 14/20] vfio/common: Extract vIOMMU code from vfio_sync_dirty_bitmap() Avihai Horon
                   ` (8 subsequent siblings)
  21 siblings, 0 replies; 93+ messages in thread
From: Avihai Horon @ 2023-02-22 17:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Avihai Horon,
	Kirti Wankhede, Tarun Gupta, Joao Martins

From: Joao Martins <joao.m.martins@oracle.com>

Add device dirty page bitmap sync functionality. This uses the device
DMA logging uAPI to sync the dirty page bitmap from the device.

Device dirty page bitmap sync is used only if all devices within a
container support device dirty page tracking.
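
Not part of the patch; the per-device report query in condensed form. The
container-level sync below loops this over all devices and feeds the result
to cpu_physical_memory_set_dirty_lebitmap():

    uint64_t buf[DIV_ROUND_UP(sizeof(struct vfio_device_feature) +
                        sizeof(struct vfio_device_feature_dma_logging_report),
                        sizeof(uint64_t))] = {};
    struct vfio_device_feature *feature = (struct vfio_device_feature *)buf;
    struct vfio_device_feature_dma_logging_report *report =
        (struct vfio_device_feature_dma_logging_report *)feature->data;

    report->iova = iova;
    report->length = size;
    report->page_size = qemu_real_host_page_size();
    report->bitmap = (uint64_t)vbmap->bitmap;

    feature->argsz = sizeof(buf);
    feature->flags = VFIO_DEVICE_FEATURE_GET |
                     VFIO_DEVICE_FEATURE_DMA_LOGGING_REPORT;

    if (ioctl(vbasedev->fd, VFIO_DEVICE_FEATURE, feature)) {
        return -errno;
    }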

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
 hw/vfio/common.c | 95 +++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 86 insertions(+), 9 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 3ab5d8d442..797eb2c26e 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -356,6 +356,9 @@ static void vfio_bitmap_dealloc(VFIOBitmap *vbmap)
     g_free(vbmap);
 }
 
+static int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
+                                 uint64_t size, ram_addr_t ram_addr);
+
 bool vfio_mig_active(void)
 {
     VFIOGroup *group;
@@ -631,10 +634,16 @@ static int vfio_dma_unmap(VFIOContainer *container,
         .iova = iova,
         .size = size,
     };
+    bool need_dirty_sync = false;
+    int ret;
 
-    if (iotlb && container->dirty_pages_supported &&
-        vfio_devices_all_running_and_mig_active(container)) {
-        return vfio_dma_unmap_bitmap(container, iova, size, iotlb);
+    if (iotlb && vfio_devices_all_running_and_mig_active(container)) {
+        if (!vfio_devices_all_device_dirty_tracking(container) &&
+            container->dirty_pages_supported) {
+            return vfio_dma_unmap_bitmap(container, iova, size, iotlb);
+        }
+
+        need_dirty_sync = true;
     }
 
     while (ioctl(container->fd, VFIO_IOMMU_UNMAP_DMA, &unmap)) {
@@ -660,10 +669,12 @@ static int vfio_dma_unmap(VFIOContainer *container,
         return -errno;
     }
 
-    if (iotlb && vfio_devices_all_running_and_mig_active(container)) {
-        cpu_physical_memory_set_dirty_range(iotlb->translated_addr, size,
-                                            tcg_enabled() ? DIRTY_CLIENTS_ALL :
-                                            DIRTY_CLIENTS_NOCODE);
+    if (need_dirty_sync) {
+        ret = vfio_get_dirty_bitmap(container, iova, size,
+                                    iotlb->translated_addr);
+        if (ret) {
+            return ret;
+        }
     }
 
     vfio_erase_mapping(container, iova, size);
@@ -1633,6 +1644,65 @@ static void vfio_listener_log_global_stop(MemoryListener *listener)
     }
 }
 
+static int vfio_device_dma_logging_report(VFIODevice *vbasedev, hwaddr iova,
+                                          hwaddr size, void *bitmap)
+{
+    uint64_t buf[DIV_ROUND_UP(sizeof(struct vfio_device_feature) +
+                        sizeof(struct vfio_device_feature_dma_logging_report),
+                        sizeof(uint64_t))] = {};
+    struct vfio_device_feature *feature = (struct vfio_device_feature *)buf;
+    struct vfio_device_feature_dma_logging_report *report =
+        (struct vfio_device_feature_dma_logging_report *)feature->data;
+
+    report->iova = iova;
+    report->length = size;
+    report->page_size = qemu_real_host_page_size();
+    report->bitmap = (uint64_t)bitmap;
+
+    feature->argsz = sizeof(buf);
+    feature->flags =
+        VFIO_DEVICE_FEATURE_GET | VFIO_DEVICE_FEATURE_DMA_LOGGING_REPORT;
+
+    if (ioctl(vbasedev->fd, VFIO_DEVICE_FEATURE, feature)) {
+        return -errno;
+    }
+
+    return 0;
+}
+
+static int vfio_devices_query_dirty_bitmap(VFIOContainer *container,
+                                           VFIOBitmap *vbmap, hwaddr iova,
+                                           hwaddr size)
+{
+    VFIODevice *vbasedev;
+    VFIOGroup *group;
+    int ret;
+
+    if (vfio_have_giommu(container)) {
+        /* Device dirty page tracking currently doesn't support vIOMMU */
+        bitmap_set(vbmap->bitmap, 0, vbmap->pages);
+
+        return 0;
+    }
+
+    QLIST_FOREACH(group, &container->group_list, container_next) {
+        QLIST_FOREACH(vbasedev, &group->device_list, next) {
+            ret = vfio_device_dma_logging_report(vbasedev, iova, size,
+                                                 vbmap->bitmap);
+            if (ret) {
+                error_report("%s: Failed to get DMA logging report, iova: "
+                             "0x%" HWADDR_PRIx ", size: 0x%" HWADDR_PRIx
+                             ", err: %d (%s)",
+                             vbasedev->name, iova, size, ret, strerror(-ret));
+
+                return ret;
+            }
+        }
+    }
+
+    return 0;
+}
+
 static int vfio_query_dirty_bitmap(VFIOContainer *container, VFIOBitmap *vbmap,
                                    hwaddr iova, hwaddr size)
 {
@@ -1673,10 +1743,12 @@ static int vfio_query_dirty_bitmap(VFIOContainer *container, VFIOBitmap *vbmap,
 static int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
                                  uint64_t size, ram_addr_t ram_addr)
 {
+    bool all_device_dirty_tracking =
+        vfio_devices_all_device_dirty_tracking(container);
     VFIOBitmap *vbmap;
     int ret;
 
-    if (!container->dirty_pages_supported) {
+    if (!container->dirty_pages_supported && !all_device_dirty_tracking) {
         cpu_physical_memory_set_dirty_range(ram_addr, size,
                                             tcg_enabled() ? DIRTY_CLIENTS_ALL :
                                             DIRTY_CLIENTS_NOCODE);
@@ -1688,7 +1760,12 @@ static int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
         return -errno;
     }
 
-    ret = vfio_query_dirty_bitmap(container, vbmap, iova, size);
+    if (all_device_dirty_tracking) {
+        ret = vfio_devices_query_dirty_bitmap(container, vbmap, iova, size);
+    } else {
+        ret = vfio_query_dirty_bitmap(container, vbmap, iova, size);
+    }
+
     if (ret) {
         goto out;
     }
-- 
2.26.3



^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v2 14/20] vfio/common: Extract vIOMMU code from vfio_sync_dirty_bitmap()
  2023-02-22 17:48 [PATCH v2 00/20] vfio: Add migration pre-copy support and device dirty tracking Avihai Horon
                   ` (12 preceding siblings ...)
  2023-02-22 17:49 ` [PATCH v2 13/20] vfio/common: Add device dirty page bitmap sync Avihai Horon
@ 2023-02-22 17:49 ` Avihai Horon
  2023-02-22 17:49 ` [PATCH v2 15/20] memory/iommu: Add IOMMU_ATTR_MAX_IOVA attribute Avihai Horon
                   ` (7 subsequent siblings)
  21 siblings, 0 replies; 93+ messages in thread
From: Avihai Horon @ 2023-02-22 17:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Avihai Horon,
	Kirti Wankhede, Tarun Gupta, Joao Martins

Extract vIOMMU code from vfio_sync_dirty_bitmap() to a new function and
restructure the code.

This is done as preparation for the following patches, which will add
vIOMMU support to device dirty page tracking. No functional changes are
intended.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
 hw/vfio/common.c | 63 +++++++++++++++++++++++++++++-------------------
 1 file changed, 38 insertions(+), 25 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 797eb2c26e..4a7fff6eeb 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1866,37 +1866,50 @@ static int vfio_sync_ram_discard_listener_dirty_bitmap(VFIOContainer *container,
                                                 &vrdl);
 }
 
+static int vfio_sync_iommu_dirty_bitmap(VFIOContainer *container,
+                                        MemoryRegionSection *section)
+{
+    VFIOGuestIOMMU *giommu;
+    bool found = false;
+    Int128 llend;
+    vfio_giommu_dirty_notifier gdn;
+    int idx;
+
+    QLIST_FOREACH(giommu, &container->giommu_list, giommu_next) {
+        if (MEMORY_REGION(giommu->iommu_mr) == section->mr &&
+            giommu->n.start == section->offset_within_region) {
+            found = true;
+            break;
+        }
+    }
+
+    if (!found) {
+        return 0;
+    }
+
+    gdn.giommu = giommu;
+    idx = memory_region_iommu_attrs_to_index(giommu->iommu_mr,
+                                             MEMTXATTRS_UNSPECIFIED);
+
+    llend = int128_add(int128_make64(section->offset_within_region),
+                       section->size);
+    llend = int128_sub(llend, int128_one());
+
+    iommu_notifier_init(&gdn.n, vfio_iommu_map_dirty_notify, IOMMU_NOTIFIER_MAP,
+                        section->offset_within_region, int128_get64(llend),
+                        idx);
+    memory_region_iommu_replay(giommu->iommu_mr, &gdn.n);
+
+    return 0;
+}
+
 static int vfio_sync_dirty_bitmap(VFIOContainer *container,
                                   MemoryRegionSection *section)
 {
     ram_addr_t ram_addr;
 
     if (memory_region_is_iommu(section->mr)) {
-        VFIOGuestIOMMU *giommu;
-
-        QLIST_FOREACH(giommu, &container->giommu_list, giommu_next) {
-            if (MEMORY_REGION(giommu->iommu_mr) == section->mr &&
-                giommu->n.start == section->offset_within_region) {
-                Int128 llend;
-                vfio_giommu_dirty_notifier gdn = { .giommu = giommu };
-                int idx = memory_region_iommu_attrs_to_index(giommu->iommu_mr,
-                                                       MEMTXATTRS_UNSPECIFIED);
-
-                llend = int128_add(int128_make64(section->offset_within_region),
-                                   section->size);
-                llend = int128_sub(llend, int128_one());
-
-                iommu_notifier_init(&gdn.n,
-                                    vfio_iommu_map_dirty_notify,
-                                    IOMMU_NOTIFIER_MAP,
-                                    section->offset_within_region,
-                                    int128_get64(llend),
-                                    idx);
-                memory_region_iommu_replay(giommu->iommu_mr, &gdn.n);
-                break;
-            }
-        }
-        return 0;
+        return vfio_sync_iommu_dirty_bitmap(container, section);
     } else if (memory_region_has_ram_discard_manager(section->mr)) {
         return vfio_sync_ram_discard_listener_dirty_bitmap(container, section);
     }
-- 
2.26.3



^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v2 15/20] memory/iommu: Add IOMMU_ATTR_MAX_IOVA attribute
  2023-02-22 17:48 [PATCH v2 00/20] vfio: Add migration pre-copy support and device dirty tracking Avihai Horon
                   ` (13 preceding siblings ...)
  2023-02-22 17:49 ` [PATCH v2 14/20] vfio/common: Extract vIOMMU code from vfio_sync_dirty_bitmap() Avihai Horon
@ 2023-02-22 17:49 ` Avihai Horon
  2023-02-22 17:49 ` [PATCH v2 16/20] intel-iommu: Implement get_attr() method Avihai Horon
                   ` (6 subsequent siblings)
  21 siblings, 0 replies; 93+ messages in thread
From: Avihai Horon @ 2023-02-22 17:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Avihai Horon,
	Kirti Wankhede, Tarun Gupta, Joao Martins

Add a new IOMMU attribute IOMMU_ATTR_MAX_IOVA which indicates the
maximal IOVA that an IOMMU can use.

This attribute will be used by VFIO device dirty page tracking so it can
track the entire IOVA space when needed (i.e. when vIOMMU is enabled).
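
Not part of the patch; the intended query pattern on the VFIO side:

    hwaddr max_iova;

    if (!memory_region_iommu_get_attr(giommu->iommu_mr, IOMMU_ATTR_MAX_IOVA,
                                      &max_iova)) {
        /* max_iova is the highest usable vIOMMU IOVA; track [0, max_iova] */
    }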

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Acked-by: Peter Xu <peterx@redhat.com>
---
 include/exec/memory.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 2e602a2fad..cdd47fb79b 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -316,7 +316,8 @@ typedef struct MemoryRegionClass {
 
 
 enum IOMMUMemoryRegionAttr {
-    IOMMU_ATTR_SPAPR_TCE_FD
+    IOMMU_ATTR_SPAPR_TCE_FD,
+    IOMMU_ATTR_MAX_IOVA,
 };
 
 /*
-- 
2.26.3



^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v2 16/20] intel-iommu: Implement get_attr() method
  2023-02-22 17:48 [PATCH v2 00/20] vfio: Add migration pre-copy support and device dirty tracking Avihai Horon
                   ` (14 preceding siblings ...)
  2023-02-22 17:49 ` [PATCH v2 15/20] memory/iommu: Add IOMMU_ATTR_MAX_IOVA attribute Avihai Horon
@ 2023-02-22 17:49 ` Avihai Horon
  2023-02-22 17:49 ` [PATCH v2 17/20] vfio/common: Support device dirty page tracking with vIOMMU Avihai Horon
                   ` (5 subsequent siblings)
  21 siblings, 0 replies; 93+ messages in thread
From: Avihai Horon @ 2023-02-22 17:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Avihai Horon,
	Kirti Wankhede, Tarun Gupta, Joao Martins

Implement get_attr() method and use the address width property to report
the IOMMU_ATTR_MAX_IOVA attribute.
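
As a worked example (assuming the default "aw-bits" value of 39 for the
emulated Intel IOMMU), the reported attribute would be:

    *max_iova = (1ULL << 39) - 1;   /* 0x7fffffffff, a 512 GiB IOVA space */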

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Acked-by: Peter Xu <peterx@redhat.com>
---
 hw/i386/intel_iommu.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 98a5c304a7..b0068b0df4 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -3841,6 +3841,23 @@ static void vtd_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n)
     return;
 }
 
+static int vtd_iommu_get_attr(IOMMUMemoryRegion *iommu_mr,
+                              enum IOMMUMemoryRegionAttr attr, void *data)
+{
+    VTDAddressSpace *vtd_as = container_of(iommu_mr, VTDAddressSpace, iommu);
+    IntelIOMMUState *s = vtd_as->iommu_state;
+
+    if (attr == IOMMU_ATTR_MAX_IOVA) {
+        hwaddr *max_iova = data;
+
+        *max_iova = (1ULL << s->aw_bits) - 1;
+
+        return 0;
+    }
+
+    return -EINVAL;
+}
+
 /* Do the initialization. It will also be called when reset, so pay
  * attention when adding new initialization stuff.
  */
@@ -4173,6 +4190,7 @@ static void vtd_iommu_memory_region_class_init(ObjectClass *klass,
     imrc->translate = vtd_iommu_translate;
     imrc->notify_flag_changed = vtd_iommu_notify_flag_changed;
     imrc->replay = vtd_iommu_replay;
+    imrc->get_attr = vtd_iommu_get_attr;
 }
 
 static const TypeInfo vtd_iommu_memory_region_info = {
-- 
2.26.3



^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v2 17/20] vfio/common: Support device dirty page tracking with vIOMMU
  2023-02-22 17:48 [PATCH v2 00/20] vfio: Add migration pre-copy support and device dirty tracking Avihai Horon
                   ` (15 preceding siblings ...)
  2023-02-22 17:49 ` [PATCH v2 16/20] intel-iommu: Implement get_attr() method Avihai Horon
@ 2023-02-22 17:49 ` Avihai Horon
  2023-02-22 23:34   ` Alex Williamson
  2023-02-22 17:49 ` [PATCH v2 18/20] vfio/common: Optimize " Avihai Horon
                   ` (4 subsequent siblings)
  21 siblings, 1 reply; 93+ messages in thread
From: Avihai Horon @ 2023-02-22 17:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Avihai Horon,
	Kirti Wankhede, Tarun Gupta, Joao Martins

Currently, device dirty page tracking with vIOMMU is not supported - RAM
pages are perpetually marked dirty in this case.

When vIOMMU is used, IOVA ranges are DMA mapped/unmapped on the fly as
the vIOMMU maps/unmaps them. These IOVA ranges can potentially be mapped
anywhere in the vIOMMU IOVA space.

Due to this dynamic nature of vIOMMU mapping/unmapping, tracking only
the currently mapped IOVA ranges, as done in the non-vIOMMU case,
doesn't work very well.

Instead, to support device dirty tracking when vIOMMU is enabled, track
the entire vIOMMU IOVA space. If that fails (the IOVA space can be rather
big and we might hit HW limitations), try tracking a smaller range while
marking untracked ranges dirty.
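
Not part of the patch; the retry ladder in condensed form, using the
iommu_max_iova, retry_iova and ram_size values computed in the code below.
Each entry is attempted in order until DMA logging start succeeds:

    hwaddr ranges[3] = {
        REAL_HOST_PAGE_ALIGN(iommu_max_iova),                  /* full space */
        REAL_HOST_PAGE_ALIGN(retry_iova),                      /* smaller    */
        REAL_HOST_PAGE_ALIGN(MIN(ram_size, retry_iova / 2)),   /* RAM-sized  */
    };

    for (i = 0; i < ARRAY_SIZE(ranges); i++) {
        container->giommu_tracked_range = ranges[i];
        ret = vfio_devices_dma_logging_start(container, true);
        if (!ret) {
            break;
        }
    }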

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
 include/hw/vfio/vfio-common.h |   2 +
 hw/vfio/common.c              | 196 +++++++++++++++++++++++++++++++---
 2 files changed, 181 insertions(+), 17 deletions(-)

diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 1f21e1fa43..1dc00cabcd 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -95,6 +95,8 @@ typedef struct VFIOContainer {
     unsigned int dma_max_mappings;
     IOVATree *mappings;
     QemuMutex mappings_mutex;
+    /* Represents the range [0, giommu_tracked_range) not inclusive */
+    hwaddr giommu_tracked_range;
     QLIST_HEAD(, VFIOGuestIOMMU) giommu_list;
     QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list;
     QLIST_HEAD(, VFIOGroup) group_list;
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 4a7fff6eeb..1024788bcc 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -45,6 +45,8 @@
 #include "migration/qemu-file.h"
 #include "sysemu/tpm.h"
 #include "qemu/iova-tree.h"
+#include "hw/boards.h"
+#include "hw/mem/memory-device.h"
 
 VFIOGroupList vfio_group_list =
     QLIST_HEAD_INITIALIZER(vfio_group_list);
@@ -430,6 +432,38 @@ void vfio_unblock_multiple_devices_migration(void)
     multiple_devices_migration_blocker = NULL;
 }
 
+static uint64_t vfio_get_ram_size(void)
+{
+    MachineState *ms = MACHINE(qdev_get_machine());
+    uint64_t plugged_size;
+
+    plugged_size = get_plugged_memory_size();
+    if (plugged_size == (uint64_t)-1) {
+        plugged_size = 0;
+    }
+
+    return ms->ram_size + plugged_size;
+}
+
+static int vfio_iommu_get_max_iova(VFIOContainer *container, hwaddr *max_iova)
+{
+    VFIOGuestIOMMU *giommu;
+    int ret;
+
+    giommu = QLIST_FIRST(&container->giommu_list);
+    if (!giommu) {
+        return -ENOENT;
+    }
+
+    ret = memory_region_iommu_get_attr(giommu->iommu_mr, IOMMU_ATTR_MAX_IOVA,
+                                       max_iova);
+    if (ret) {
+        return ret;
+    }
+
+    return 0;
+}
+
 static bool vfio_have_giommu(VFIOContainer *container)
 {
     return !QLIST_EMPTY(&container->giommu_list);
@@ -1510,7 +1544,8 @@ static gboolean vfio_iova_tree_get_last(DMAMap *map, gpointer data)
 }
 
 static struct vfio_device_feature *
-vfio_device_feature_dma_logging_start_create(VFIOContainer *container)
+vfio_device_feature_dma_logging_start_create(VFIOContainer *container,
+                                             bool giommu)
 {
     struct vfio_device_feature *feature;
     size_t feature_size;
@@ -1529,6 +1564,16 @@ vfio_device_feature_dma_logging_start_create(VFIOContainer *container)
     control = (struct vfio_device_feature_dma_logging_control *)feature->data;
     control->page_size = qemu_real_host_page_size();
 
+    if (giommu) {
+        ranges = g_malloc0(sizeof(*ranges));
+        ranges->iova = 0;
+        ranges->length = container->giommu_tracked_range;
+        control->num_ranges = 1;
+        control->ranges = (uint64_t)ranges;
+
+        return feature;
+    }
+
     QEMU_LOCK_GUARD(&container->mappings_mutex);
 
     /*
@@ -1578,12 +1623,12 @@ static void vfio_device_feature_dma_logging_start_destroy(
     g_free(feature);
 }
 
-static int vfio_devices_dma_logging_start(VFIOContainer *container)
+static int vfio_devices_dma_logging_start(VFIOContainer *container, bool giommu)
 {
     struct vfio_device_feature *feature;
     int ret;
 
-    feature = vfio_device_feature_dma_logging_start_create(container);
+    feature = vfio_device_feature_dma_logging_start_create(container, giommu);
     if (!feature) {
         return -errno;
     }
@@ -1598,18 +1643,128 @@ static int vfio_devices_dma_logging_start(VFIOContainer *container)
     return ret;
 }
 
+typedef struct {
+    hwaddr *ranges;
+    unsigned int ranges_num;
+} VFIOGIOMMUDeviceDTRanges;
+
+/*
+ * This value is used in the second attempt to start device dirty tracking with
+ * vIOMMU, or if the giommu fails to report its max iova.
+ * It should be in the middle, not too big and not too small, allowing devices
+ * with HW limitations to do device dirty tracking while covering a fair amount
+ * of the IOVA space.
+ *
+ * This arbitrary value was chosen because it is the minimum value of Intel
+ * IOMMU max IOVA and mlx5 devices support tracking a range of this size.
+ */
+#define VFIO_IOMMU_DEFAULT_MAX_IOVA ((1ULL << 39) - 1)
+
+#define VFIO_IOMMU_RANGES_NUM 3
+static VFIOGIOMMUDeviceDTRanges *
+vfio_iommu_device_dirty_tracking_ranges_create(VFIOContainer *container)
+{
+    hwaddr iommu_max_iova = VFIO_IOMMU_DEFAULT_MAX_IOVA;
+    hwaddr retry_iova;
+    hwaddr ram_size = vfio_get_ram_size();
+    VFIOGIOMMUDeviceDTRanges *dt_ranges;
+    int ret;
+
+    dt_ranges = g_try_new0(VFIOGIOMMUDeviceDTRanges, 1);
+    if (!dt_ranges) {
+        errno = ENOMEM;
+
+        return NULL;
+    }
+
+    dt_ranges->ranges_num = VFIO_IOMMU_RANGES_NUM;
+
+    dt_ranges->ranges = g_try_new0(hwaddr, dt_ranges->ranges_num);
+    if (!dt_ranges->ranges) {
+        g_free(dt_ranges);
+        errno = ENOMEM;
+
+        return NULL;
+    }
+
+    /*
+     * With vIOMMU we try to track the entire IOVA space. As the IOVA space can
+     * be rather big, devices might not be able to track it due to HW
+     * limitations. In that case:
+     * (1) Retry tracking a smaller part of the IOVA space.
+     * (2) Retry tracking a range in the size of the physical memory.
+     */
+    ret = vfio_iommu_get_max_iova(container, &iommu_max_iova);
+    if (!ret) {
+        /* Check 2^64 wrap around */
+        if (!REAL_HOST_PAGE_ALIGN(iommu_max_iova)) {
+            iommu_max_iova -= qemu_real_host_page_size();
+        }
+    }
+
+    retry_iova = MIN(iommu_max_iova / 2, VFIO_IOMMU_DEFAULT_MAX_IOVA);
+
+    dt_ranges->ranges[0] = REAL_HOST_PAGE_ALIGN(iommu_max_iova);
+    dt_ranges->ranges[1] = REAL_HOST_PAGE_ALIGN(retry_iova);
+    dt_ranges->ranges[2] = REAL_HOST_PAGE_ALIGN(MIN(ram_size, retry_iova / 2));
+
+    return dt_ranges;
+}
+
+static void vfio_iommu_device_dirty_tracking_ranges_destroy(
+    VFIOGIOMMUDeviceDTRanges *dt_ranges)
+{
+    g_free(dt_ranges->ranges);
+    g_free(dt_ranges);
+}
+
+static int vfio_devices_start_dirty_page_tracking(VFIOContainer *container)
+{
+    VFIOGIOMMUDeviceDTRanges *dt_ranges;
+    int ret;
+    int i;
+
+    if (!vfio_have_giommu(container)) {
+        return vfio_devices_dma_logging_start(container, false);
+    }
+
+    dt_ranges = vfio_iommu_device_dirty_tracking_ranges_create(container);
+    if (!dt_ranges) {
+        return -errno;
+    }
+
+    for (i = 0; i < dt_ranges->ranges_num; i++) {
+        container->giommu_tracked_range = dt_ranges->ranges[i];
+        ret = vfio_devices_dma_logging_start(container, true);
+        if (!ret) {
+            break;
+        }
+
+        if (i < dt_ranges->ranges_num - 1) {
+            warn_report("Failed to start device dirty tracking with vIOMMU "
+                        "with range of size 0x%" HWADDR_PRIx
+                        ", err: %d. Retrying with range "
+                        "of size 0x%" HWADDR_PRIx,
+                        dt_ranges->ranges[i], ret, dt_ranges->ranges[i + 1]);
+        } else {
+            error_report("Failed to start device dirty tracking with vIOMMU "
+                         "with range of size 0x%" HWADDR_PRIx ", err: %d",
+                         dt_ranges->ranges[i], ret);
+        }
+    }
+
+    vfio_iommu_device_dirty_tracking_ranges_destroy(dt_ranges);
+
+    return ret;
+}
+
 static void vfio_listener_log_global_start(MemoryListener *listener)
 {
     VFIOContainer *container = container_of(listener, VFIOContainer, listener);
     int ret;
 
     if (vfio_devices_all_device_dirty_tracking(container)) {
-        if (vfio_have_giommu(container)) {
-            /* Device dirty page tracking currently doesn't support vIOMMU */
-            return;
-        }
-
-        ret = vfio_devices_dma_logging_start(container);
+        ret = vfio_devices_start_dirty_page_tracking(container);
     } else {
         ret = vfio_set_dirty_page_tracking(container, true);
     }
@@ -1627,11 +1782,6 @@ static void vfio_listener_log_global_stop(MemoryListener *listener)
     int ret;
 
     if (vfio_devices_all_device_dirty_tracking(container)) {
-        if (vfio_have_giommu(container)) {
-            /* Device dirty page tracking currently doesn't support vIOMMU */
-            return;
-        }
-
         ret = vfio_devices_dma_logging_stop(container);
     } else {
         ret = vfio_set_dirty_page_tracking(container, false);
@@ -1670,6 +1820,17 @@ static int vfio_device_dma_logging_report(VFIODevice *vbasedev, hwaddr iova,
     return 0;
 }
 
+static bool vfio_iommu_range_is_device_tracked(VFIOContainer *container,
+                                               hwaddr iova, hwaddr size)
+{
+    /* Check for 2^64 wrap around */
+    if (!(iova + size)) {
+        return false;
+    }
+
+    return iova + size <= container->giommu_tracked_range;
+}
+
 static int vfio_devices_query_dirty_bitmap(VFIOContainer *container,
                                            VFIOBitmap *vbmap, hwaddr iova,
                                            hwaddr size)
@@ -1679,10 +1840,11 @@ static int vfio_devices_query_dirty_bitmap(VFIOContainer *container,
     int ret;
 
     if (vfio_have_giommu(container)) {
-        /* Device dirty page tracking currently doesn't support vIOMMU */
-        bitmap_set(vbmap->bitmap, 0, vbmap->pages);
+        if (!vfio_iommu_range_is_device_tracked(container, iova, size)) {
+            bitmap_set(vbmap->bitmap, 0, vbmap->pages);
 
-        return 0;
+            return 0;
+        }
     }
 
     QLIST_FOREACH(group, &container->group_list, container_next) {
-- 
2.26.3



^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v2 18/20] vfio/common: Optimize device dirty page tracking with vIOMMU
  2023-02-22 17:48 [PATCH v2 00/20] vfio: Add migration pre-copy support and device dirty tracking Avihai Horon
                   ` (16 preceding siblings ...)
  2023-02-22 17:49 ` [PATCH v2 17/20] vfio/common: Support device dirty page tracking with vIOMMU Avihai Horon
@ 2023-02-22 17:49 ` Avihai Horon
  2023-02-22 17:49 ` [PATCH v2 19/20] vfio/migration: Query device dirty page tracking support Avihai Horon
                   ` (3 subsequent siblings)
  21 siblings, 0 replies; 93+ messages in thread
From: Avihai Horon @ 2023-02-22 17:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Avihai Horon,
	Kirti Wankhede, Tarun Gupta, Joao Martins

When vIOMMU is enabled, syncing dirty page bitmaps is done by replaying
the vIOMMU mappings and querying the dirty bitmap for each mapping.

With device dirty tracking this causes a lot of overhead, since the HW
is queried many times (even with a small idle guest this can end up with
thousands of calls to HW).

Optimize this by decoupling the dirty bitmap query from the vIOMMU replay.
Now a single dirty bitmap is queried per vIOMMU MR section, which is
then used for all corresponding vIOMMU mappings within that MR section.
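
As a rough illustration (the variable names below are made up; the actual code
is in vfio_iommu_set_dirty_bitmap() in the patch), the per-mapping slice of
the section-wide bitmap boils down to:

    /* One bitmap was already queried for the whole MR section */
    copy_offset = (mapping_iova - section_start_iova) /
                  qemu_real_host_page_size();
    bitmap_copy_with_src_offset(mapping_bitmap, section_bitmap, copy_offset,
                                mapping_pages);

so the device is queried once per MR section instead of once per mapping.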

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
 hw/vfio/common.c | 85 ++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 83 insertions(+), 2 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 1024788bcc..f16a57d42b 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1946,8 +1946,42 @@ out:
 typedef struct {
     IOMMUNotifier n;
     VFIOGuestIOMMU *giommu;
+    VFIOBitmap *vbmap;
 } vfio_giommu_dirty_notifier;
 
+static int vfio_iommu_set_dirty_bitmap(VFIOContainer *container,
+                                       vfio_giommu_dirty_notifier *gdn,
+                                       hwaddr iova, hwaddr size,
+                                       ram_addr_t ram_addr)
+{
+    VFIOBitmap *vbmap = gdn->vbmap;
+    VFIOBitmap *dst_vbmap;
+    hwaddr start_iova = REAL_HOST_PAGE_ALIGN(gdn->n.start);
+    hwaddr copy_offset;
+
+    dst_vbmap = vfio_bitmap_alloc(size);
+    if (!dst_vbmap) {
+        return -errno;
+    }
+
+    if (!vfio_iommu_range_is_device_tracked(container, iova, size)) {
+        bitmap_set(dst_vbmap->bitmap, 0, dst_vbmap->pages);
+
+        goto out;
+    }
+
+    copy_offset = (iova - start_iova) / qemu_real_host_page_size();
+    bitmap_copy_with_src_offset(dst_vbmap->bitmap, vbmap->bitmap, copy_offset,
+                                dst_vbmap->pages);
+
+out:
+    cpu_physical_memory_set_dirty_lebitmap(dst_vbmap->bitmap, ram_addr,
+                                           dst_vbmap->pages);
+    vfio_bitmap_dealloc(dst_vbmap);
+
+    return 0;
+}
+
 static void vfio_iommu_map_dirty_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
 {
     vfio_giommu_dirty_notifier *gdn = container_of(n,
@@ -1968,8 +2002,15 @@ static void vfio_iommu_map_dirty_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
 
     rcu_read_lock();
     if (vfio_get_xlat_addr(iotlb, NULL, &translated_addr, NULL)) {
-        ret = vfio_get_dirty_bitmap(container, iova, iotlb->addr_mask + 1,
-                                    translated_addr);
+        if (gdn->vbmap) {
+            ret = vfio_iommu_set_dirty_bitmap(container, gdn, iova,
+                                              iotlb->addr_mask + 1,
+                                              translated_addr);
+        } else {
+            ret = vfio_get_dirty_bitmap(container, iova, iotlb->addr_mask + 1,
+                                        translated_addr);
+        }
+
         if (ret) {
             error_report("vfio_iommu_map_dirty_notify(%p, 0x%"HWADDR_PRIx", "
                          "0x%"HWADDR_PRIx") = %d (%s)",
@@ -2033,6 +2074,7 @@ static int vfio_sync_iommu_dirty_bitmap(VFIOContainer *container,
 {
     VFIOGuestIOMMU *giommu;
     bool found = false;
+    VFIOBitmap *vbmap = NULL;
     Int128 llend;
     vfio_giommu_dirty_notifier gdn;
     int idx;
@@ -2050,6 +2092,7 @@ static int vfio_sync_iommu_dirty_bitmap(VFIOContainer *container,
     }
 
     gdn.giommu = giommu;
+    gdn.vbmap = NULL;
     idx = memory_region_iommu_attrs_to_index(giommu->iommu_mr,
                                              MEMTXATTRS_UNSPECIFIED);
 
@@ -2057,11 +2100,49 @@ static int vfio_sync_iommu_dirty_bitmap(VFIOContainer *container,
                        section->size);
     llend = int128_sub(llend, int128_one());
 
+    /*
+     * Optimize device dirty tracking if the MR section is at least partially
+     * tracked. Optimization is done by querying a single dirty bitmap for the
+     * entire range instead of querying dirty bitmap for each vIOMMU mapping.
+     */
+    if (vfio_devices_all_device_dirty_tracking(container)) {
+        hwaddr start = REAL_HOST_PAGE_ALIGN(section->offset_within_region);
+        hwaddr end = int128_get64(llend);
+        hwaddr size;
+        int ret;
+
+        if (start >= container->giommu_tracked_range) {
+            goto notifier_init;
+        }
+
+        size = REAL_HOST_PAGE_ALIGN(
+            MIN(container->giommu_tracked_range - 1, end) - start);
+
+        vbmap = vfio_bitmap_alloc(size);
+        if (!vbmap) {
+            return -errno;
+        }
+
+        ret = vfio_devices_query_dirty_bitmap(container, vbmap, start, size);
+        if (ret) {
+            vfio_bitmap_dealloc(vbmap);
+
+            return ret;
+        }
+
+        gdn.vbmap = vbmap;
+    }
+
+notifier_init:
     iommu_notifier_init(&gdn.n, vfio_iommu_map_dirty_notify, IOMMU_NOTIFIER_MAP,
                         section->offset_within_region, int128_get64(llend),
                         idx);
     memory_region_iommu_replay(giommu->iommu_mr, &gdn.n);
 
+    if (vbmap) {
+        vfio_bitmap_dealloc(vbmap);
+    }
+
     return 0;
 }
 
-- 
2.26.3



^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v2 19/20] vfio/migration: Query device dirty page tracking support
  2023-02-22 17:48 [PATCH v2 00/20] vfio: Add migration pre-copy support and device dirty tracking Avihai Horon
                   ` (17 preceding siblings ...)
  2023-02-22 17:49 ` [PATCH v2 18/20] vfio/common: Optimize " Avihai Horon
@ 2023-02-22 17:49 ` Avihai Horon
  2023-02-22 17:49 ` [PATCH v2 20/20] docs/devel: Document VFIO device dirty page tracking Avihai Horon
                   ` (2 subsequent siblings)
  21 siblings, 0 replies; 93+ messages in thread
From: Avihai Horon @ 2023-02-22 17:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Avihai Horon,
	Kirti Wankhede, Tarun Gupta, Joao Martins

From: Joao Martins <joao.m.martins@oracle.com>

Now that everything has been set up for device dirty page tracking,
query the device for device dirty page tracking support.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
 hw/vfio/migration.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 307983d57d..ae2be3dd3a 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -713,6 +713,19 @@ static int vfio_migration_query_flags(VFIODevice *vbasedev, uint64_t *mig_flags)
     return 0;
 }
 
+static bool vfio_dma_logging_supported(VFIODevice *vbasedev)
+{
+    uint64_t buf[DIV_ROUND_UP(sizeof(struct vfio_device_feature),
+                              sizeof(uint64_t))] = {};
+    struct vfio_device_feature *feature = (struct vfio_device_feature *)buf;
+
+    feature->argsz = sizeof(buf);
+    feature->flags =
+        VFIO_DEVICE_FEATURE_PROBE | VFIO_DEVICE_FEATURE_DMA_LOGGING_START;
+
+    return !ioctl(vbasedev->fd, VFIO_DEVICE_FEATURE, feature);
+}
+
 static int vfio_migration_init(VFIODevice *vbasedev)
 {
     int ret;
@@ -748,6 +761,8 @@ static int vfio_migration_init(VFIODevice *vbasedev)
     migration->data_fd = -1;
     migration->mig_flags = mig_flags;
 
+    vbasedev->dirty_pages_supported = vfio_dma_logging_supported(vbasedev);
+
     oid = vmstate_if_get_id(VMSTATE_IF(DEVICE(obj)));
     if (oid) {
         path = g_strdup_printf("%s/vfio", oid);
-- 
2.26.3



^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v2 20/20] docs/devel: Document VFIO device dirty page tracking
  2023-02-22 17:48 [PATCH v2 00/20] vfio: Add migration pre-copy support and device dirty tracking Avihai Horon
                   ` (18 preceding siblings ...)
  2023-02-22 17:49 ` [PATCH v2 19/20] vfio/migration: Query device dirty page tracking support Avihai Horon
@ 2023-02-22 17:49 ` Avihai Horon
  2023-02-27 14:29   ` Cédric Le Goater
  2023-02-22 18:00 ` [PATCH v2 00/20] vfio: Add migration pre-copy support and device dirty tracking Avihai Horon
  2023-02-22 20:55 ` Alex Williamson
  21 siblings, 1 reply; 93+ messages in thread
From: Avihai Horon @ 2023-02-22 17:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Avihai Horon,
	Kirti Wankhede, Tarun Gupta, Joao Martins

Adjust the VFIO dirty page tracking documentation and add a section to
describe device dirty page tracking.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
 docs/devel/vfio-migration.rst | 50 ++++++++++++++++++++++-------------
 1 file changed, 32 insertions(+), 18 deletions(-)

diff --git a/docs/devel/vfio-migration.rst b/docs/devel/vfio-migration.rst
index ba80b9150d..a432cda081 100644
--- a/docs/devel/vfio-migration.rst
+++ b/docs/devel/vfio-migration.rst
@@ -71,22 +71,37 @@ System memory dirty pages tracking
 ----------------------------------
 
 A ``log_global_start`` and ``log_global_stop`` memory listener callback informs
-the VFIO IOMMU module to start and stop dirty page tracking. A ``log_sync``
-memory listener callback marks those system memory pages as dirty which are
-used for DMA by the VFIO device. The dirty pages bitmap is queried per
-container. All pages pinned by the vendor driver through external APIs have to
-be marked as dirty during migration. When there are CPU writes, CPU dirty page
-tracking can identify dirtied pages, but any page pinned by the vendor driver
-can also be written by the device. There is currently no device or IOMMU
-support for dirty page tracking in hardware.
+the VFIO dirty tracking module to start and stop dirty page tracking. A
+``log_sync`` memory listener callback queries the dirty page bitmap from the
+dirty tracking module and marks system memory pages which were DMA-ed by the
+VFIO device as dirty. The dirty page bitmap is queried per container.
+
+Currently there are two ways dirty page tracking can be done:
+(1) Device dirty tracking:
+In this method the device is responsible for logging and reporting its DMAs.
+This method can be used only if the device is capable of tracking its DMAs.
+Discovering device capability, starting and stopping dirty tracking, and
+syncing the dirty bitmaps from the device are done using the DMA logging uAPI.
+More info about the uAPI can be found in the comments of the
+``vfio_device_feature_dma_logging_control`` and
+``vfio_device_feature_dma_logging_report`` structures in the header file
+linux-headers/linux/vfio.h.
+
+(2) VFIO IOMMU module:
+In this method dirty tracking is done by the IOMMU. However, there is currently
+no IOMMU support for dirty page tracking. For this reason, all pages are
+perpetually marked dirty, unless the device driver pins pages through external
+APIs in which case only those pinned pages are perpetually marked dirty.
+
+If the above two methods are not supported, all pages are perpetually marked
+dirty by QEMU.
 
 By default, dirty pages are tracked during pre-copy as well as stop-and-copy
-phase. So, a page pinned by the vendor driver will be copied to the destination
-in both phases. Copying dirty pages in pre-copy phase helps QEMU to predict if
-it can achieve its downtime tolerances. If QEMU during pre-copy phase keeps
-finding dirty pages continuously, then it understands that even in stop-and-copy
-phase, it is likely to find dirty pages and can predict the downtime
-accordingly.
+phase. So, a page marked as dirty will be copied to the destination in both
+phases. Copying dirty pages in pre-copy phase helps QEMU to predict if it can
+achieve its downtime tolerances. If QEMU during pre-copy phase keeps finding
+dirty pages continuously, then it understands that even in stop-and-copy phase,
+it is likely to find dirty pages and can predict the downtime accordingly.
 
 QEMU also provides a per device opt-out option ``pre-copy-dirty-page-tracking``
 which disables querying the dirty bitmap during pre-copy phase. If it is set to
@@ -97,10 +112,9 @@ System memory dirty pages tracking when vIOMMU is enabled
 ---------------------------------------------------------
 
 With vIOMMU, an IO virtual address range can get unmapped while in pre-copy
-phase of migration. In that case, the unmap ioctl returns any dirty pages in
-that range and QEMU reports corresponding guest physical pages dirty. During
-stop-and-copy phase, an IOMMU notifier is used to get a callback for mapped
-pages and then dirty pages bitmap is fetched from VFIO IOMMU modules for those
+phase of migration. In that case, dirty page bitmap for this range is queried
+and synced with QEMU. During stop-and-copy phase, an IOMMU notifier is used to
+get a callback for mapped pages and then dirty page bitmap is fetched for those
 mapped ranges.
 
 Flow of state changes during Live migration
-- 
2.26.3



^ permalink raw reply related	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 00/20] vfio: Add migration pre-copy support and device dirty tracking
  2023-02-22 17:48 [PATCH v2 00/20] vfio: Add migration pre-copy support and device dirty tracking Avihai Horon
                   ` (19 preceding siblings ...)
  2023-02-22 17:49 ` [PATCH v2 20/20] docs/devel: Document VFIO device dirty page tracking Avihai Horon
@ 2023-02-22 18:00 ` Avihai Horon
  2023-02-22 20:55 ` Alex Williamson
  21 siblings, 0 replies; 93+ messages in thread
From: Avihai Horon @ 2023-02-22 18:00 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta, Joao Martins


On 22/02/2023 19:48, Avihai Horon wrote:
> Changes from v1 [4]:
> - Rebased on latest master branch. As part of it, made some changes in
>    pre-copy to adjust it to Juan's new patches:
>    1. Added a new patch that passes threshold_size parameter to
>       .state_pending_{estimate,exact}() handlers.
>    2. Added a new patch that refactors vfio_save_block().
>    3. Changed the pre-copy patch to cache and report pending pre-copy
>       size in the .state_pending_estimate() handler.
> - Removed unnecessary P2P code. This should be added later on when P2P
>    support is added. (Alex)
> - Moved the dirty sync to be after the DMA unmap in vfio_dma_unmap()
>    (patch #11). (Alex)
> - Stored vfio_devices_all_device_dirty_tracking()'s value in a local
>    variable in vfio_get_dirty_bitmap() so it can be re-used (patch #11).
> - Refactored the viommu device dirty tracking ranges creation code to
>    make it clearer (patch #15).
> - Changed overflow check in vfio_iommu_range_is_device_tracked() to
>    emphasize that we specifically check for 2^64 wrap around (patch #15).
> - Added R-bs / Acks.
>
> Thanks.
>
> [1]
> https://lore.kernel.org/qemu-devel/167658846945.932837.1420176491103357684.stgit@omen/
>
> [2]
> https://lore.kernel.org/kvm/20221206083438.37807-3-yishaih@nvidia.com/
>
> [3]
> https://lore.kernel.org/netdev/20220908183448.195262-4-yishaih@nvidia.com/

and here is v1 link:
[4]
https://lore.kernel.org/qemu-devel/20230126184948.10478-1-avihaih@nvidia.com/

Thanks.



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 00/20] vfio: Add migration pre-copy support and device dirty tracking
  2023-02-22 17:48 [PATCH v2 00/20] vfio: Add migration pre-copy support and device dirty tracking Avihai Horon
                   ` (20 preceding siblings ...)
  2023-02-22 18:00 ` [PATCH v2 00/20] vfio: Add migration pre-copy support and device dirty tracking Avihai Horon
@ 2023-02-22 20:55 ` Alex Williamson
  2023-02-23 10:05   ` Cédric Le Goater
  2023-02-23 14:56   ` Avihai Horon
  21 siblings, 2 replies; 93+ messages in thread
From: Alex Williamson @ 2023-02-22 20:55 UTC (permalink / raw)
  To: Avihai Horon
  Cc: qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta, Joao Martins


There are various errors running this through the CI on gitlab.

This one seems bogus but needs to be resolved regardless:

https://gitlab.com/alex.williamson/qemu/-/jobs/3817940731
FAILED: libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o 
2786s390x-linux-gnu-gcc -m64 -Ilibqemu-aarch64-softmmu.fa.p -I. -I.. -Itarget/arm -I../target/arm -Iqapi -Itrace -Iui -Iui/shader -I/usr/include/pixman-1 -I/usr/include/capstone -I/usr/include/glib-2.0 -I/usr/lib/s390x-linux-gnu/glib-2.0/include -fdiagnostics-color=auto -Wall -Winvalid-pch -Werror -std=gnu11 -O2 -g -isystem /builds/alex.williamson/qemu/linux-headers -isystem linux-headers -iquote . -iquote /builds/alex.williamson/qemu -iquote /builds/alex.williamson/qemu/include -iquote /builds/alex.williamson/qemu/tcg/s390x -pthread -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -fno-strict-aliasing -fno-common -fwrapv -Wundef -Wwrite-strings -Wmissing-prototypes -Wstrict-prototypes -Wredundant-decls -Wold-style-declaration -Wold-style-definition -Wtype-limits -Wformat-security -Wformat-y2k -Winit-self -Wignored-qualifiers -Wempty-body -Wnested-externs -Wendif-labels -Wexpansion-to-defined -Wimplicit-fallthrough=2 -Wmissing-format-attribute -Wno-missing-include-dirs -Wno-shift-negative-value -Wno-psabi -fstack-protector-strong -fPIE -isystem../linux-headers -isystemlinux-headers -DNEED_CPU_H '-DCONFIG_TARGET="aarch64-softmmu-config-target.h"' '-DCONFIG_DEVICES="aarch64-softmmu-config-devices.h"' -MD -MQ libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o -MF libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o.d -o libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o -c ../hw/vfio/common.c
2787../hw/vfio/common.c: In function ‘vfio_listener_log_global_start’:
2788../hw/vfio/common.c:1772:8: error: ‘ret’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
2789 1772 |     if (ret) {
2790      |        ^

32-bit builds have some actual errors though:

https://gitlab.com/alex.williamson/qemu/-/jobs/3817940719
FAILED: libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o 
2601cc -m32 -Ilibqemu-aarch64-softmmu.fa.p -I. -I.. -Itarget/arm -I../target/arm -Iqapi -Itrace -Iui -Iui/shader -I/usr/include/pixman-1 -I/usr/include/glib-2.0 -I/usr/lib/glib-2.0/include -I/usr/include/sysprof-4 -fdiagnostics-color=auto -Wall -Winvalid-pch -Werror -std=gnu11 -O2 -g -isystem /builds/alex.williamson/qemu/linux-headers -isystem linux-headers -iquote . -iquote /builds/alex.williamson/qemu -iquote /builds/alex.williamson/qemu/include -iquote /builds/alex.williamson/qemu/tcg/i386 -pthread -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -fno-strict-aliasing -fno-common -fwrapv -Wundef -Wwrite-strings -Wmissing-prototypes -Wstrict-prototypes -Wredundant-decls -Wold-style-declaration -Wold-style-definition -Wtype-limits -Wformat-security -Wformat-y2k -Winit-self -Wignored-qualifiers -Wempty-body -Wnested-externs -Wendif-labels -Wexpansion-to-defined -Wimplicit-fallthrough=2 -Wmissing-format-attribute -Wno-missing-include-dirs -Wno-shift-negative-value -Wno-psabi -fstack-protector-strong -fPIE -isystem../linux-headers -isystemlinux-headers -DNEED_CPU_H '-DCONFIG_TARGET="aarch64-softmmu-config-target.h"' '-DCONFIG_DEVICES="aarch64-softmmu-config-devices.h"' -MD -MQ libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o -MF libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o.d -o libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o -c ../hw/vfio/common.c
2602../hw/vfio/common.c: In function 'vfio_device_feature_dma_logging_start_create':
2603../hw/vfio/common.c:1572:27: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
2604 1572 |         control->ranges = (uint64_t)ranges;
2605      |                           ^
2606../hw/vfio/common.c:1596:23: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
2607 1596 |     control->ranges = (uint64_t)ranges;
2608      |                       ^
2609../hw/vfio/common.c: In function 'vfio_device_feature_dma_logging_start_destroy':
2610../hw/vfio/common.c:1620:9: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
2611 1620 |         (struct vfio_device_feature_dma_logging_range *)control->ranges;
2612      |         ^
2613../hw/vfio/common.c: In function 'vfio_device_dma_logging_report':
2614../hw/vfio/common.c:1810:22: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
2615 1810 |     report->bitmap = (uint64_t)bitmap;
2616      |                      ^
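
(For what it's worth, the usual way to make these casts 32-bit clean is to go
through uintptr_t instead of casting directly between a pointer and a u64,
e.g.:

    control->ranges = (uintptr_t)ranges;
    ranges = (struct vfio_device_feature_dma_logging_range *)
                 (uintptr_t)control->ranges;

untested here, just the common pattern.)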

Thanks,
Alex



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 03/20] vfio/migration: Add VFIO migration pre-copy support
  2023-02-22 17:48 ` [PATCH v2 03/20] vfio/migration: Add VFIO migration pre-copy support Avihai Horon
@ 2023-02-22 20:58   ` Alex Williamson
  2023-02-23 15:25     ` Avihai Horon
  0 siblings, 1 reply; 93+ messages in thread
From: Alex Williamson @ 2023-02-22 20:58 UTC (permalink / raw)
  To: Avihai Horon
  Cc: qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta, Joao Martins

On Wed, 22 Feb 2023 19:48:58 +0200
Avihai Horon <avihaih@nvidia.com> wrote:

> Pre-copy support allows the VFIO device data to be transferred while the
> VM is running. This helps to accommodate VFIO devices that have a large
> amount of data that needs to be transferred, and it can reduce migration
> downtime.
> 
> Pre-copy support is optional in VFIO migration protocol v2.
> Implement pre-copy of VFIO migration protocol v2 and use it for devices
> that support it. Full description of it can be found here [1].
> 
> [1]
> https://lore.kernel.org/kvm/20221206083438.37807-3-yishaih@nvidia.com/
> 
> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
> ---
>  docs/devel/vfio-migration.rst |  35 +++++--
>  include/hw/vfio/vfio-common.h |   3 +
>  hw/vfio/common.c              |   6 +-
>  hw/vfio/migration.c           | 175 ++++++++++++++++++++++++++++++++--
>  hw/vfio/trace-events          |   4 +-
>  5 files changed, 201 insertions(+), 22 deletions(-)
> 
> diff --git a/docs/devel/vfio-migration.rst b/docs/devel/vfio-migration.rst
> index c214c73e28..ba80b9150d 100644
> --- a/docs/devel/vfio-migration.rst
> +++ b/docs/devel/vfio-migration.rst
> @@ -7,12 +7,14 @@ the guest is running on source host and restoring this saved state on the
>  destination host. This document details how saving and restoring of VFIO
>  devices is done in QEMU.
>  
> -Migration of VFIO devices currently consists of a single stop-and-copy phase.
> -During the stop-and-copy phase the guest is stopped and the entire VFIO device
> -data is transferred to the destination.
> -
> -The pre-copy phase of migration is currently not supported for VFIO devices.
> -Support for VFIO pre-copy will be added later on.
> +Migration of VFIO devices consists of two phases: the optional pre-copy phase,
> +and the stop-and-copy phase. The pre-copy phase is iterative and allows to
> +accommodate VFIO devices that have a large amount of data that needs to be
> +transferred. The iterative pre-copy phase of migration allows for the guest to
> +continue whilst the VFIO device state is transferred to the destination, this
> +helps to reduce the total downtime of the VM. VFIO devices can choose to skip
> +the pre-copy phase of migration by not reporting the VFIO_MIGRATION_PRE_COPY
> +flag in VFIO_DEVICE_FEATURE_MIGRATION ioctl.

Or alternatively for the last sentence,

  VFIO devices opt-in to pre-copy support by reporting the
  VFIO_MIGRATION_PRE_COPY flag in the VFIO_DEVICE_FEATURE_MIGRATION
  ioctl.


>  Note that currently VFIO migration is supported only for a single device. This
>  is due to VFIO migration's lack of P2P support. However, P2P support is planned
> @@ -29,10 +31,20 @@ VFIO implements the device hooks for the iterative approach as follows:
>  * A ``load_setup`` function that sets the VFIO device on the destination in
>    _RESUMING state.
>  
> +* A ``state_pending_estimate`` function that reports an estimate of the
> +  remaining pre-copy data that the vendor driver has yet to save for the VFIO
> +  device.
> +
>  * A ``state_pending_exact`` function that reads pending_bytes from the vendor
>    driver, which indicates the amount of data that the vendor driver has yet to
>    save for the VFIO device.
>  
> +* An ``is_active_iterate`` function that indicates ``save_live_iterate`` is
> +  active only when the VFIO device is in pre-copy states.
> +
> +* A ``save_live_iterate`` function that reads the VFIO device's data from the
> +  vendor driver during iterative pre-copy phase.
> +
>  * A ``save_state`` function to save the device config space if it is present.
>  
>  * A ``save_live_complete_precopy`` function that sets the VFIO device in
> @@ -95,8 +107,10 @@ Flow of state changes during Live migration
>  ===========================================
>  
>  Below is the flow of state change during live migration.
> -The values in the brackets represent the VM state, the migration state, and
> +The values in the parentheses represent the VM state, the migration state, and
>  the VFIO device state, respectively.
> +The text in the square brackets represents the flow if the VFIO device supports
> +pre-copy.
>  
>  Live migration save path
>  ------------------------
> @@ -108,11 +122,12 @@ Live migration save path
>                                    |
>                       migrate_init spawns migration_thread
>                  Migration thread then calls each device's .save_setup()
> -                       (RUNNING, _SETUP, _RUNNING)
> +                  (RUNNING, _SETUP, _RUNNING [_PRE_COPY])
>                                    |
> -                      (RUNNING, _ACTIVE, _RUNNING)
> -             If device is active, get pending_bytes by .state_pending_exact()
> +                  (RUNNING, _ACTIVE, _RUNNING [_PRE_COPY])
> +      If device is active, get pending_bytes by .state_pending_{estimate,exact}()
>            If total pending_bytes >= threshold_size, call .save_live_iterate()
> +                  [Data of VFIO device for pre-copy phase is copied]
>          Iterate till total pending bytes converge and are less than threshold
>                                    |
>    On migration completion, vCPU stops and calls .save_live_complete_precopy for
> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> index 87524c64a4..ee55d442b4 100644
> --- a/include/hw/vfio/vfio-common.h
> +++ b/include/hw/vfio/vfio-common.h
> @@ -66,6 +66,9 @@ typedef struct VFIOMigration {
>      int data_fd;
>      void *data_buffer;
>      size_t data_buffer_size;
> +    uint64_t precopy_init_size;
> +    uint64_t precopy_dirty_size;

size_t?

> +    uint64_t mig_flags;
>  } VFIOMigration;
>  
>  typedef struct VFIOAddressSpace {
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index bab83c0e55..6f5afe9f5a 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -409,7 +409,8 @@ static bool vfio_devices_all_dirty_tracking(VFIOContainer *container)
>              }
>  
>              if (vbasedev->pre_copy_dirty_page_tracking == ON_OFF_AUTO_OFF &&
> -                migration->device_state == VFIO_DEVICE_STATE_RUNNING) {
> +                (migration->device_state == VFIO_DEVICE_STATE_RUNNING ||
> +                 migration->device_state == VFIO_DEVICE_STATE_PRE_COPY)) {
>                  return false;
>              }
>          }
> @@ -438,7 +439,8 @@ static bool vfio_devices_all_running_and_mig_active(VFIOContainer *container)
>                  return false;
>              }
>  
> -            if (migration->device_state == VFIO_DEVICE_STATE_RUNNING) {
> +            if (migration->device_state == VFIO_DEVICE_STATE_RUNNING ||
> +                migration->device_state == VFIO_DEVICE_STATE_PRE_COPY) {
>                  continue;
>              } else {
>                  return false;
> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
> index 94a4df73d0..307983d57d 100644
> --- a/hw/vfio/migration.c
> +++ b/hw/vfio/migration.c
> @@ -67,6 +67,8 @@ static const char *mig_state_to_str(enum vfio_device_mig_state state)
>          return "STOP_COPY";
>      case VFIO_DEVICE_STATE_RESUMING:
>          return "RESUMING";
> +    case VFIO_DEVICE_STATE_PRE_COPY:
> +        return "PRE_COPY";
>      default:
>          return "UNKNOWN STATE";
>      }
> @@ -240,6 +242,23 @@ static int vfio_query_stop_copy_size(VFIODevice *vbasedev,
>      return 0;
>  }
>  
> +static int vfio_query_precopy_size(VFIOMigration *migration,
> +                                   uint64_t *init_size, uint64_t *dirty_size)

size_t?  Seems like a concern throughout.

> +{
> +    struct vfio_precopy_info precopy = {
> +        .argsz = sizeof(precopy),
> +    };
> +
> +    if (ioctl(migration->data_fd, VFIO_MIG_GET_PRECOPY_INFO, &precopy)) {
> +        return -errno;
> +    }
> +
> +    *init_size = precopy.initial_bytes;
> +    *dirty_size = precopy.dirty_bytes;
> +
> +    return 0;
> +}
> +
>  /* Returns the size of saved data on success and -errno on error */
>  static ssize_t vfio_save_block(QEMUFile *f, VFIOMigration *migration)
>  {
> @@ -248,6 +267,11 @@ static ssize_t vfio_save_block(QEMUFile *f, VFIOMigration *migration)
>      data_size = read(migration->data_fd, migration->data_buffer,
>                       migration->data_buffer_size);
>      if (data_size < 0) {
> +        /* Pre-copy emptied all the device state for now */
> +        if (errno == ENOMSG) {
> +            return 0;
> +        }
> +
>          return -errno;
>      }
>      if (data_size == 0) {
> @@ -264,6 +288,31 @@ static ssize_t vfio_save_block(QEMUFile *f, VFIOMigration *migration)
>      return qemu_file_get_error(f) ?: data_size;
>  }
>  
> +static void vfio_update_estimated_pending_data(VFIOMigration *migration,
> +                                               uint64_t data_size)
> +{
> +    if (!data_size) {
> +        /*
> +         * Pre-copy emptied all the device state for now, update estimated sizes
> +         * accordingly.
> +         */
> +        migration->precopy_init_size = 0;
> +        migration->precopy_dirty_size = 0;
> +
> +        return;
> +    }
> +
> +    if (migration->precopy_init_size) {
> +        uint64_t init_size = MIN(migration->precopy_init_size, data_size);
> +
> +        migration->precopy_init_size -= init_size;
> +        data_size -= init_size;
> +    }
> +
> +    migration->precopy_dirty_size -= MIN(migration->precopy_dirty_size,
> +                                         data_size);
> +}
> +
>  /* ---------------------------------------------------------------------- */
>  
>  static int vfio_save_setup(QEMUFile *f, void *opaque)
> @@ -284,6 +333,35 @@ static int vfio_save_setup(QEMUFile *f, void *opaque)
>          return -ENOMEM;
>      }
>  
> +    if (migration->mig_flags & VFIO_MIGRATION_PRE_COPY) {
> +        uint64_t init_size = 0, dirty_size = 0;
> +        int ret;
> +
> +        switch (migration->device_state) {
> +        case VFIO_DEVICE_STATE_RUNNING:
> +            ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_PRE_COPY,
> +                                           VFIO_DEVICE_STATE_RUNNING);
> +            if (ret) {
> +                return ret;
> +            }
> +
> +            vfio_query_precopy_size(migration, &init_size, &dirty_size);
> +            migration->precopy_init_size = init_size;
> +            migration->precopy_dirty_size = dirty_size;

Seems like we could do away with {init,dirty}_size, initialize
migration->precopy_{init,dirty}_size before the switch, pass them
directly to vfio_query_precopy_size() and remove all but the break from
the case below.  But then that also suggests we could redefine
vfio_query_precopy_size() to

static int vfio_update_precopy_info(VFIOMigration *migration)

which sets the fields directly since this is the only way it's used.
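
Roughly (an untested sketch, just folding the field updates into the existing
helper):

static int vfio_update_precopy_info(VFIOMigration *migration)
{
    struct vfio_precopy_info precopy = {
        .argsz = sizeof(precopy),
    };

    if (ioctl(migration->data_fd, VFIO_MIG_GET_PRECOPY_INFO, &precopy)) {
        return -errno;
    }

    migration->precopy_init_size = precopy.initial_bytes;
    migration->precopy_dirty_size = precopy.dirty_bytes;

    return 0;
}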

> +
> +            break;
> +        case VFIO_DEVICE_STATE_STOP:
> +            /* vfio_save_complete_precopy() will go to STOP_COPY */
> +
> +            migration->precopy_init_size = 0;
> +            migration->precopy_dirty_size = 0;
> +
> +            break;
> +        default:
> +            return -EINVAL;
> +        }
> +    }
> +
>      trace_vfio_save_setup(vbasedev->name, migration->data_buffer_size);
>  
>      qemu_put_be64(f, VFIO_MIG_FLAG_END_OF_STATE);
> @@ -302,23 +380,44 @@ static void vfio_save_cleanup(void *opaque)
>      trace_vfio_save_cleanup(vbasedev->name);
>  }
>  
> +static void vfio_state_pending_estimate(void *opaque, uint64_t threshold_size,
> +                                        uint64_t *must_precopy,
> +                                        uint64_t *can_postcopy)
> +{
> +    VFIODevice *vbasedev = opaque;
> +    VFIOMigration *migration = vbasedev->migration;
> +
> +    if (migration->device_state != VFIO_DEVICE_STATE_PRE_COPY) {
> +        return;
> +    }
> +
> +    /*
> +     * Initial size should be transferred during pre-copy phase so stop-copy
> +     * phase will not be slowed down. Report threshold_size to force another
> +     * pre-copy iteration.
> +     */
> +    *must_precopy += migration->precopy_init_size ?
> +                         threshold_size :
> +                         migration->precopy_dirty_size;

This sure feels like we're feeding false data back to the iterator to
spoof it to run another iteration, when the vfio migration protocol
only recommends that initial_bytes reaches zero before proceeding to
stop-copy; it's not a requirement.  What benefit is actually observed
from this?  Why is this required for initial pre-copy support?  It
seems devious.

> +
> +    trace_vfio_state_pending_estimate(vbasedev->name, *must_precopy,
> +                                      *can_postcopy,
> +                                      migration->precopy_init_size,
> +                                      migration->precopy_dirty_size);
> +}
> +
>  /*
>   * Migration size of VFIO devices can be as little as a few KBs or as big as
>   * many GBs. This value should be big enough to cover the worst case.
>   */
>  #define VFIO_MIG_STOP_COPY_SIZE (100 * GiB)
>  
> -/*
> - * Only exact function is implemented and not estimate function. The reason is
> - * that during pre-copy phase of migration the estimate function is called
> - * repeatedly while pending RAM size is over the threshold, thus migration
> - * can't converge and querying the VFIO device pending data size is useless.
> - */
>  static void vfio_state_pending_exact(void *opaque, uint64_t threshold_size,
>                                       uint64_t *must_precopy,
>                                       uint64_t *can_postcopy)
>  {
>      VFIODevice *vbasedev = opaque;
> +    VFIOMigration *migration = vbasedev->migration;
>      uint64_t stop_copy_size = VFIO_MIG_STOP_COPY_SIZE;
>  
>      /*
> @@ -328,8 +427,57 @@ static void vfio_state_pending_exact(void *opaque, uint64_t threshold_size,
>      vfio_query_stop_copy_size(vbasedev, &stop_copy_size);
>      *must_precopy += stop_copy_size;
>  
> +    if (migration->device_state == VFIO_DEVICE_STATE_PRE_COPY) {
> +        uint64_t init_size = 0, dirty_size = 0;
> +
> +        vfio_query_precopy_size(migration, &init_size, &dirty_size);
> +        migration->precopy_init_size = init_size;
> +        migration->precopy_dirty_size = dirty_size;

This is the only other caller of vfio_query_precopy_size(), following
the same pattern that could be simplified if the function filled the
migration fields itself.

> +
> +        /*
> +         * Initial size should be transferred during pre-copy phase so
> +         * stop-copy phase will not be slowed down. Report threshold_size
> +         * to force another pre-copy iteration.
> +         */
> +        *must_precopy += migration->precopy_init_size ?
> +                             threshold_size :
> +                             migration->precopy_dirty_size;
> +    }

Just as sketchy as above.  Thanks,

Alex

> +
>      trace_vfio_state_pending_exact(vbasedev->name, *must_precopy, *can_postcopy,
> -                                   stop_copy_size);
> +                                   stop_copy_size, migration->precopy_init_size,
> +                                   migration->precopy_dirty_size);
> +}
> +
> +static bool vfio_is_active_iterate(void *opaque)
> +{
> +    VFIODevice *vbasedev = opaque;
> +    VFIOMigration *migration = vbasedev->migration;
> +
> +    return migration->device_state == VFIO_DEVICE_STATE_PRE_COPY;
> +}
> +
> +static int vfio_save_iterate(QEMUFile *f, void *opaque)
> +{
> +    VFIODevice *vbasedev = opaque;
> +    VFIOMigration *migration = vbasedev->migration;
> +    ssize_t data_size;
> +
> +    data_size = vfio_save_block(f, migration);
> +    if (data_size < 0) {
> +        return data_size;
> +    }
> +    qemu_put_be64(f, VFIO_MIG_FLAG_END_OF_STATE);
> +
> +    vfio_update_estimated_pending_data(migration, data_size);
> +
> +    trace_vfio_save_iterate(vbasedev->name);
> +
> +    /*
> +     * A VFIO device's pre-copy dirty_bytes is not guaranteed to reach zero.
> +     * Return 1 so following handlers will not be potentially blocked.
> +     */
> +    return 1;
>  }
>  
>  static int vfio_save_complete_precopy(QEMUFile *f, void *opaque)
> @@ -338,7 +486,7 @@ static int vfio_save_complete_precopy(QEMUFile *f, void *opaque)
>      ssize_t data_size;
>      int ret;
>  
> -    /* We reach here with device state STOP only */
> +    /* We reach here with device state STOP or STOP_COPY only */
>      ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_STOP_COPY,
>                                     VFIO_DEVICE_STATE_STOP);
>      if (ret) {
> @@ -457,7 +605,10 @@ static int vfio_load_state(QEMUFile *f, void *opaque, int version_id)
>  static const SaveVMHandlers savevm_vfio_handlers = {
>      .save_setup = vfio_save_setup,
>      .save_cleanup = vfio_save_cleanup,
> +    .state_pending_estimate = vfio_state_pending_estimate,
>      .state_pending_exact = vfio_state_pending_exact,
> +    .is_active_iterate = vfio_is_active_iterate,
> +    .save_live_iterate = vfio_save_iterate,
>      .save_live_complete_precopy = vfio_save_complete_precopy,
>      .save_state = vfio_save_state,
>      .load_setup = vfio_load_setup,
> @@ -470,13 +621,18 @@ static const SaveVMHandlers savevm_vfio_handlers = {
>  static void vfio_vmstate_change(void *opaque, bool running, RunState state)
>  {
>      VFIODevice *vbasedev = opaque;
> +    VFIOMigration *migration = vbasedev->migration;
>      enum vfio_device_mig_state new_state;
>      int ret;
>  
>      if (running) {
>          new_state = VFIO_DEVICE_STATE_RUNNING;
>      } else {
> -        new_state = VFIO_DEVICE_STATE_STOP;
> +        new_state =
> +            (migration->device_state == VFIO_DEVICE_STATE_PRE_COPY &&
> +             (state == RUN_STATE_FINISH_MIGRATE || state == RUN_STATE_PAUSED)) ?
> +                VFIO_DEVICE_STATE_STOP_COPY :
> +                VFIO_DEVICE_STATE_STOP;
>      }
>  
>      /*
> @@ -590,6 +746,7 @@ static int vfio_migration_init(VFIODevice *vbasedev)
>      migration->vbasedev = vbasedev;
>      migration->device_state = VFIO_DEVICE_STATE_RUNNING;
>      migration->data_fd = -1;
> +    migration->mig_flags = mig_flags;
>  
>      oid = vmstate_if_get_id(VMSTATE_IF(DEVICE(obj)));
>      if (oid) {
> diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
> index 669d9fe07c..51613e02e6 100644
> --- a/hw/vfio/trace-events
> +++ b/hw/vfio/trace-events
> @@ -161,6 +161,8 @@ vfio_save_block(const char *name, int data_size) " (%s) data_size %d"
>  vfio_save_cleanup(const char *name) " (%s)"
>  vfio_save_complete_precopy(const char *name, int ret) " (%s) ret %d"
>  vfio_save_device_config_state(const char *name) " (%s)"
> +vfio_save_iterate(const char *name) " (%s)"
>  vfio_save_setup(const char *name, uint64_t data_buffer_size) " (%s) data buffer size 0x%"PRIx64
> -vfio_state_pending_exact(const char *name, uint64_t precopy, uint64_t postcopy, uint64_t stopcopy_size) " (%s) precopy 0x%"PRIx64" postcopy 0x%"PRIx64" stopcopy size 0x%"PRIx64
> +vfio_state_pending_estimate(const char *name, uint64_t precopy, uint64_t postcopy, uint64_t precopy_init_size, uint64_t precopy_dirty_size) " (%s) precopy 0x%"PRIx64" postcopy 0x%"PRIx64" precopy initial size 0x%"PRIx64" precopy dirty size 0x%"PRIx64
> +vfio_state_pending_exact(const char *name, uint64_t precopy, uint64_t postcopy, uint64_t stopcopy_size, uint64_t precopy_init_size, uint64_t precopy_dirty_size) " (%s) precopy 0x%"PRIx64" postcopy 0x%"PRIx64" stopcopy size 0x%"PRIx64" precopy initial size 0x%"PRIx64" precopy dirty size 0x%"PRIx64
>  vfio_vmstate_change(const char *name, int running, const char *reason, const char *dev_state) " (%s) running %d reason %s device state %s"



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 07/20] vfio/common: Add VFIOBitmap and (de)alloc functions
  2023-02-22 17:49 ` [PATCH v2 07/20] vfio/common: Add VFIOBitmap and (de)alloc functions Avihai Horon
@ 2023-02-22 21:40   ` Alex Williamson
  2023-02-23 15:27     ` Avihai Horon
  2023-02-27 14:09   ` Cédric Le Goater
  1 sibling, 1 reply; 93+ messages in thread
From: Alex Williamson @ 2023-02-22 21:40 UTC (permalink / raw)
  To: Avihai Horon
  Cc: qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta, Joao Martins

On Wed, 22 Feb 2023 19:49:02 +0200
Avihai Horon <avihaih@nvidia.com> wrote:

> There are already two places where dirty page bitmap allocation and
> calculations are done in open code. With device dirty page tracking
> being added in next patches, there are going to be even more places.
> 
> To avoid code duplication, introduce VFIOBitmap struct and corresponding
> alloc and dealloc functions and use them where applicable.
> 
> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
> ---
>  hw/vfio/common.c | 89 ++++++++++++++++++++++++++++++++----------------
>  1 file changed, 60 insertions(+), 29 deletions(-)
> 
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index ac93b85632..84f08bdbbb 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -320,6 +320,41 @@ const MemoryRegionOps vfio_region_ops = {
>   * Device state interfaces
>   */
>  
> +typedef struct {
> +    unsigned long *bitmap;
> +    hwaddr size;
> +    hwaddr pages;
> +} VFIOBitmap;
> +
> +static VFIOBitmap *vfio_bitmap_alloc(hwaddr size)
> +{
> +    VFIOBitmap *vbmap = g_try_new0(VFIOBitmap, 1);
> +    if (!vbmap) {
> +        errno = ENOMEM;
> +
> +        return NULL;
> +    }
> +
> +    vbmap->pages = REAL_HOST_PAGE_ALIGN(size) / qemu_real_host_page_size();
> +    vbmap->size = ROUND_UP(vbmap->pages, sizeof(__u64) * BITS_PER_BYTE) /
> +                                         BITS_PER_BYTE;
> +    vbmap->bitmap = g_try_malloc0(vbmap->size);
> +    if (!vbmap->bitmap) {
> +        g_free(vbmap);
> +        errno = ENOMEM;
> +
> +        return NULL;
> +    }
> +
> +    return vbmap;
> +}
> +
> +static void vfio_bitmap_dealloc(VFIOBitmap *vbmap)
> +{
> +    g_free(vbmap->bitmap);
> +    g_free(vbmap);
> +}

Nit, '_alloc' and '_free' seems like a more standard convention.
Thanks,

Alex



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 10/20] vfio/common: Record DMA mapped IOVA ranges
  2023-02-22 17:49 ` [PATCH v2 10/20] vfio/common: Record DMA mapped IOVA ranges Avihai Horon
@ 2023-02-22 22:10   ` Alex Williamson
  2023-02-23 10:37     ` Joao Martins
  0 siblings, 1 reply; 93+ messages in thread
From: Alex Williamson @ 2023-02-22 22:10 UTC (permalink / raw)
  To: Avihai Horon
  Cc: qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta, Joao Martins

On Wed, 22 Feb 2023 19:49:05 +0200
Avihai Horon <avihaih@nvidia.com> wrote:

> From: Joao Martins <joao.m.martins@oracle.com>
> 
> According to the device DMA logging uAPI, IOVA ranges to be logged by
> the device must be provided all at once upon DMA logging start.
> 
> As preparation for the following patches which will add device dirty
> page tracking, keep a record of all DMA mapped IOVA ranges so later they
> can be used for DMA logging start.
> 
> Note that when vIOMMU is enabled DMA mapped IOVA ranges are not tracked.
> This is due to the dynamic nature of vIOMMU DMA mapping/unmapping.
> Following patches will address the vIOMMU case specifically.
> 
> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
> ---
>  include/hw/vfio/vfio-common.h |  3 ++
>  hw/vfio/common.c              | 86 +++++++++++++++++++++++++++++++++--
>  2 files changed, 86 insertions(+), 3 deletions(-)
> 
> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> index ee55d442b4..6f36876ce0 100644
> --- a/include/hw/vfio/vfio-common.h
> +++ b/include/hw/vfio/vfio-common.h
> @@ -23,6 +23,7 @@
>  
>  #include "exec/memory.h"
>  #include "qemu/queue.h"
> +#include "qemu/iova-tree.h"
>  #include "qemu/notify.h"
>  #include "ui/console.h"
>  #include "hw/display/ramfb.h"
> @@ -92,6 +93,8 @@ typedef struct VFIOContainer {
>      uint64_t max_dirty_bitmap_size;
>      unsigned long pgsizes;
>      unsigned int dma_max_mappings;
> +    IOVATree *mappings;
> +    QemuMutex mappings_mutex;
>      QLIST_HEAD(, VFIOGuestIOMMU) giommu_list;
>      QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list;
>      QLIST_HEAD(, VFIOGroup) group_list;
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index 84f08bdbbb..6041da6c7e 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -44,6 +44,7 @@
>  #include "migration/blocker.h"
>  #include "migration/qemu-file.h"
>  #include "sysemu/tpm.h"
> +#include "qemu/iova-tree.h"
>  
>  VFIOGroupList vfio_group_list =
>      QLIST_HEAD_INITIALIZER(vfio_group_list);
> @@ -426,6 +427,11 @@ void vfio_unblock_multiple_devices_migration(void)
>      multiple_devices_migration_blocker = NULL;
>  }
>  
> +static bool vfio_have_giommu(VFIOContainer *container)
> +{
> +    return !QLIST_EMPTY(&container->giommu_list);
> +}
> +
>  static void vfio_set_migration_error(int err)
>  {
>      MigrationState *ms = migrate_get_current();
> @@ -499,6 +505,51 @@ static bool vfio_devices_all_running_and_mig_active(VFIOContainer *container)
>      return true;
>  }
>  
> +static int vfio_record_mapping(VFIOContainer *container, hwaddr iova,
> +                               hwaddr size, bool readonly)
> +{
> +    DMAMap map = {
> +        .iova = iova,
> +        .size = size - 1, /* IOVATree is inclusive, so subtract 1 from size */
> +        .perm = readonly ? IOMMU_RO : IOMMU_RW,
> +    };
> +    int ret;
> +
> +    if (vfio_have_giommu(container)) {
> +        return 0;
> +    }
> +
> +    WITH_QEMU_LOCK_GUARD(&container->mappings_mutex) {
> +        ret = iova_tree_insert(container->mappings, &map);
> +        if (ret) {
> +            if (ret == IOVA_ERR_INVALID) {
> +                ret = -EINVAL;
> +            } else if (ret == IOVA_ERR_OVERLAP) {
> +                ret = -EEXIST;
> +            }
> +        }
> +    }
> +
> +    return ret;
> +}
> +
> +static void vfio_erase_mapping(VFIOContainer *container, hwaddr iova,
> +                                hwaddr size)
> +{
> +    DMAMap map = {
> +        .iova = iova,
> +        .size = size - 1, /* IOVATree is inclusive, so subtract 1 from size */
> +    };
> +
> +    if (vfio_have_giommu(container)) {
> +        return;
> +    }
> +
> +    WITH_QEMU_LOCK_GUARD(&container->mappings_mutex) {
> +        iova_tree_remove(container->mappings, map);
> +    }
> +}

Nit, 'insert' and 'remove' to match the IOVATree semantics?

>  static int vfio_dma_unmap_bitmap(VFIOContainer *container,
>                                   hwaddr iova, ram_addr_t size,
>                                   IOMMUTLBEntry *iotlb)
> @@ -599,6 +650,8 @@ static int vfio_dma_unmap(VFIOContainer *container,
>                                              DIRTY_CLIENTS_NOCODE);
>      }
>  
> +    vfio_erase_mapping(container, iova, size);
> +
>      return 0;
>  }
>  
> @@ -612,6 +665,16 @@ static int vfio_dma_map(VFIOContainer *container, hwaddr iova,
>          .iova = iova,
>          .size = size,
>      };
> +    int ret;
> +
> +    ret = vfio_record_mapping(container, iova, size, readonly);
> +    if (ret) {
> +        error_report("vfio: Failed to record mapping, iova: 0x%" HWADDR_PRIx
> +                     ", size: 0x" RAM_ADDR_FMT ", ret: %d (%s)",
> +                     iova, size, ret, strerror(-ret));
> +
> +        return ret;
> +    }

Is there no way to replay the mappings when a migration is started?
This seems like a horrible latency and bloat trade-off for the
possibility that the VM might migrate and the device might support
these features.  Our performance with vIOMMU is already terrible; I
can't help but believe this makes it worse.  Thanks,

Alex



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 11/20] vfio/common: Add device dirty page tracking start/stop
  2023-02-22 17:49 ` [PATCH v2 11/20] vfio/common: Add device dirty page tracking start/stop Avihai Horon
@ 2023-02-22 22:40   ` Alex Williamson
  2023-02-23  2:02     ` Jason Gunthorpe
  2023-02-23 15:36     ` Avihai Horon
  0 siblings, 2 replies; 93+ messages in thread
From: Alex Williamson @ 2023-02-22 22:40 UTC (permalink / raw)
  To: Avihai Horon
  Cc: qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta, Joao Martins

On Wed, 22 Feb 2023 19:49:06 +0200
Avihai Horon <avihaih@nvidia.com> wrote:

> From: Joao Martins <joao.m.martins@oracle.com>
> 
> Add device dirty page tracking start/stop functionality. This uses the
> device DMA logging uAPI to start and stop dirty page tracking by device.
> 
> Device dirty page tracking is used only if all devices within a
> container support device dirty page tracking.
> 
> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
> ---
>  include/hw/vfio/vfio-common.h |   2 +
>  hw/vfio/common.c              | 211 +++++++++++++++++++++++++++++++++-
>  2 files changed, 211 insertions(+), 2 deletions(-)
> 
> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> index 6f36876ce0..1f21e1fa43 100644
> --- a/include/hw/vfio/vfio-common.h
> +++ b/include/hw/vfio/vfio-common.h
> @@ -149,6 +149,8 @@ typedef struct VFIODevice {
>      VFIOMigration *migration;
>      Error *migration_blocker;
>      OnOffAuto pre_copy_dirty_page_tracking;
> +    bool dirty_pages_supported;
> +    bool dirty_tracking;
>  } VFIODevice;
>  
>  struct VFIODeviceOps {
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index 6041da6c7e..740153e7d7 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -473,6 +473,22 @@ static bool vfio_devices_all_dirty_tracking(VFIOContainer *container)
>      return true;
>  }
>  
> +static bool vfio_devices_all_device_dirty_tracking(VFIOContainer *container)
> +{
> +    VFIOGroup *group;
> +    VFIODevice *vbasedev;
> +
> +    QLIST_FOREACH(group, &container->group_list, container_next) {
> +        QLIST_FOREACH(vbasedev, &group->device_list, next) {
> +            if (!vbasedev->dirty_pages_supported) {
> +                return false;
> +            }
> +        }
> +    }
> +
> +    return true;
> +}
> +
>  /*
>   * Check if all VFIO devices are running and migration is active, which is
>   * essentially equivalent to the migration being in pre-copy phase.
> @@ -1404,13 +1420,192 @@ static int vfio_set_dirty_page_tracking(VFIOContainer *container, bool start)
>      return ret;
>  }
>  
> +static int vfio_devices_dma_logging_set(VFIOContainer *container,
> +                                        struct vfio_device_feature *feature)
> +{
> +    bool status = (feature->flags & VFIO_DEVICE_FEATURE_MASK) ==
> +                  VFIO_DEVICE_FEATURE_DMA_LOGGING_START;
> +    VFIODevice *vbasedev;
> +    VFIOGroup *group;
> +    int ret = 0;
> +
> +    QLIST_FOREACH(group, &container->group_list, container_next) {
> +        QLIST_FOREACH(vbasedev, &group->device_list, next) {
> +            if (vbasedev->dirty_tracking == status) {
> +                continue;
> +            }
> +
> +            ret = ioctl(vbasedev->fd, VFIO_DEVICE_FEATURE, feature);
> +            if (ret) {
> +                ret = -errno;
> +                error_report("%s: Failed to set DMA logging %s, err %d (%s)",
> +                             vbasedev->name, status ? "start" : "stop", ret,
> +                             strerror(errno));
> +                goto out;
> +            }
> +            vbasedev->dirty_tracking = status;
> +        }
> +    }
> +
> +out:
> +    return ret;
> +}
> +
> +static int vfio_devices_dma_logging_stop(VFIOContainer *container)
> +{
> +    uint64_t buf[DIV_ROUND_UP(sizeof(struct vfio_device_feature),
> +                              sizeof(uint64_t))] = {};
> +    struct vfio_device_feature *feature = (struct vfio_device_feature *)buf;
> +
> +    feature->argsz = sizeof(buf);
> +    feature->flags = VFIO_DEVICE_FEATURE_SET;
> +    feature->flags |= VFIO_DEVICE_FEATURE_DMA_LOGGING_STOP;
> +
> +    return vfio_devices_dma_logging_set(container, feature);
> +}
> +
> +static gboolean vfio_device_dma_logging_range_add(DMAMap *map, gpointer data)
> +{
> +    struct vfio_device_feature_dma_logging_range **out = data;
> +    struct vfio_device_feature_dma_logging_range *range = *out;
> +
> +    range->iova = map->iova;
> +    /* IOVATree is inclusive, DMA logging uAPI isn't, so add 1 to length */
> +    range->length = map->size + 1;
> +
> +    *out = ++range;
> +
> +    return false;
> +}
> +
> +static gboolean vfio_iova_tree_get_first(DMAMap *map, gpointer data)
> +{
> +    DMAMap *first = data;
> +
> +    first->iova = map->iova;
> +    first->size = map->size;
> +
> +    return true;
> +}
> +
> +static gboolean vfio_iova_tree_get_last(DMAMap *map, gpointer data)
> +{
> +    DMAMap *last = data;
> +
> +    last->iova = map->iova;
> +    last->size = map->size;
> +
> +    return false;
> +}
> +
> +static struct vfio_device_feature *
> +vfio_device_feature_dma_logging_start_create(VFIOContainer *container)
> +{
> +    struct vfio_device_feature *feature;
> +    size_t feature_size;
> +    struct vfio_device_feature_dma_logging_control *control;
> +    struct vfio_device_feature_dma_logging_range *ranges;
> +    unsigned int max_ranges;
> +    unsigned int cur_ranges;
> +
> +    feature_size = sizeof(struct vfio_device_feature) +
> +                   sizeof(struct vfio_device_feature_dma_logging_control);
> +    feature = g_malloc0(feature_size);
> +    feature->argsz = feature_size;
> +    feature->flags = VFIO_DEVICE_FEATURE_SET;
> +    feature->flags |= VFIO_DEVICE_FEATURE_DMA_LOGGING_START;
> +
> +    control = (struct vfio_device_feature_dma_logging_control *)feature->data;
> +    control->page_size = qemu_real_host_page_size();
> +
> +    QEMU_LOCK_GUARD(&container->mappings_mutex);
> +
> +    /*
> +     * DMA logging uAPI guarantees to support at least num_ranges that fits into
> +     * a single host kernel page. To be on the safe side, use this as a limit
> +     * from which to merge to a single range.
> +     */
> +    max_ranges = qemu_real_host_page_size() / sizeof(*ranges);
> +    cur_ranges = iova_tree_nnodes(container->mappings);
> +    control->num_ranges = (cur_ranges <= max_ranges) ? cur_ranges : 1;

This makes me suspicious that we're implementing to the characteristics
of a specific device rather than strictly to the vfio migration API.
Are we just trying to avoid the error handling needed to support the
try-and-fall-back-to-a-single-range behavior (see the sketch after the
quoted patch below)?  If we want to make a simplification, then document
it as such.  The "[t]o be on the safe side" phrasing above could later
be interpreted as avoiding an issue and might discourage a more complete
implementation.  Thanks,

Alex

> +    ranges = g_try_new0(struct vfio_device_feature_dma_logging_range,
> +                        control->num_ranges);
> +    if (!ranges) {
> +        g_free(feature);
> +        errno = ENOMEM;
> +
> +        return NULL;
> +    }
> +
> +    control->ranges = (uint64_t)ranges;
> +    if (cur_ranges <= max_ranges) {
> +        iova_tree_foreach(container->mappings,
> +                          vfio_device_dma_logging_range_add, &ranges);
> +    } else {
> +        DMAMap first, last;
> +
> +        iova_tree_foreach(container->mappings, vfio_iova_tree_get_first,
> +                          &first);
> +        iova_tree_foreach(container->mappings, vfio_iova_tree_get_last, &last);
> +        ranges->iova = first.iova;
> +        /* IOVATree is inclusive, DMA logging uAPI isn't, so add 1 to length */
> +        ranges->length = (last.iova - first.iova) + last.size + 1;
> +    }
> +
> +    return feature;
> +}
> +
> +static void vfio_device_feature_dma_logging_start_destroy(
> +    struct vfio_device_feature *feature)
> +{
> +    struct vfio_device_feature_dma_logging_control *control =
> +        (struct vfio_device_feature_dma_logging_control *)feature->data;
> +    struct vfio_device_feature_dma_logging_range *ranges =
> +        (struct vfio_device_feature_dma_logging_range *)control->ranges;
> +
> +    g_free(ranges);
> +    g_free(feature);
> +}
> +
> +static int vfio_devices_dma_logging_start(VFIOContainer *container)
> +{
> +    struct vfio_device_feature *feature;
> +    int ret;
> +
> +    feature = vfio_device_feature_dma_logging_start_create(container);
> +    if (!feature) {
> +        return -errno;
> +    }
> +
> +    ret = vfio_devices_dma_logging_set(container, feature);
> +    if (ret) {
> +        vfio_devices_dma_logging_stop(container);
> +    }
> +
> +    vfio_device_feature_dma_logging_start_destroy(feature);
> +
> +    return ret;
> +}
> +
>  static void vfio_listener_log_global_start(MemoryListener *listener)
>  {
>      VFIOContainer *container = container_of(listener, VFIOContainer, listener);
>      int ret;
>  
> -    ret = vfio_set_dirty_page_tracking(container, true);
> +    if (vfio_devices_all_device_dirty_tracking(container)) {
> +        if (vfio_have_giommu(container)) {
> +            /* Device dirty page tracking currently doesn't support vIOMMU */
> +            return;
> +        }
> +
> +        ret = vfio_devices_dma_logging_start(container);
> +    } else {
> +        ret = vfio_set_dirty_page_tracking(container, true);
> +    }
> +
>      if (ret) {
> +        error_report("vfio: Could not start dirty page tracking, err: %d (%s)",
> +                     ret, strerror(-ret));
>          vfio_set_migration_error(ret);
>      }
>  }
> @@ -1420,8 +1615,20 @@ static void vfio_listener_log_global_stop(MemoryListener *listener)
>      VFIOContainer *container = container_of(listener, VFIOContainer, listener);
>      int ret;
>  
> -    ret = vfio_set_dirty_page_tracking(container, false);
> +    if (vfio_devices_all_device_dirty_tracking(container)) {
> +        if (vfio_have_giommu(container)) {
> +            /* Device dirty page tracking currently doesn't support vIOMMU */
> +            return;
> +        }
> +
> +        ret = vfio_devices_dma_logging_stop(container);
> +    } else {
> +        ret = vfio_set_dirty_page_tracking(container, false);
> +    }
> +
>      if (ret) {
> +        error_report("vfio: Could not stop dirty page tracking, err: %d (%s)",
> +                     ret, strerror(-ret));
>          vfio_set_migration_error(ret);
>      }
>  }
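
A hedged sketch (not part of the posted series) of the try-and-fall-back
flow questioned above, assuming vfio_device_feature_dma_logging_start_create()
built the full per-mapping range list, and assuming a hypothetical
vfio_device_feature_dma_logging_start_create_merged() variant that builds a
single merged [first, last] range:

static int vfio_devices_dma_logging_start_with_fallback(VFIOContainer *container)
{
    struct vfio_device_feature *feature;
    int ret;

    /* First attempt: one logging range per currently recorded mapping. */
    feature = vfio_device_feature_dma_logging_start_create(container);
    if (!feature) {
        return -errno;
    }
    ret = vfio_devices_dma_logging_set(container, feature);
    vfio_device_feature_dma_logging_start_destroy(feature);
    if (!ret) {
        return 0;
    }

    /* Second attempt: collapse everything into a single range and retry. */
    vfio_devices_dma_logging_stop(container);
    feature = vfio_device_feature_dma_logging_start_create_merged(container);
    if (!feature) {
        return -errno;
    }
    ret = vfio_devices_dma_logging_set(container, feature);
    if (ret) {
        vfio_devices_dma_logging_stop(container);
    }
    vfio_device_feature_dma_logging_start_destroy(feature);

    return ret;
}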




* Re: [PATCH v2 17/20] vfio/common: Support device dirty page tracking with vIOMMU
  2023-02-22 17:49 ` [PATCH v2 17/20] vfio/common: Support device dirty page tracking with vIOMMU Avihai Horon
@ 2023-02-22 23:34   ` Alex Williamson
  2023-02-23  2:08     ` Jason Gunthorpe
  0 siblings, 1 reply; 93+ messages in thread
From: Alex Williamson @ 2023-02-22 23:34 UTC (permalink / raw)
  To: Avihai Horon
  Cc: qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta, Joao Martins

On Wed, 22 Feb 2023 19:49:12 +0200
Avihai Horon <avihaih@nvidia.com> wrote:

> Currently, device dirty page tracking with vIOMMU is not supported - RAM
> pages are perpetually marked dirty in this case.
> 
> When vIOMMU is used, IOVA ranges are DMA mapped/unmapped on the fly as
> the vIOMMU maps/unmaps them. These IOVA ranges can potentially be mapped
> anywhere in the vIOMMU IOVA space.
> 
> Due to this dynamic nature of vIOMMU mapping/unmapping, tracking only
> the currently mapped IOVA ranges, as done in the non-vIOMMU case,
> doesn't work very well.
> 
> Instead, to support device dirty tracking when vIOMMU is enabled, track
> the entire vIOMMU IOVA space. If that fails (IOVA space can be rather
> big and we might hit HW limitation), try tracking smaller range while
> marking untracked ranges dirty.
> 
> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
> ---
>  include/hw/vfio/vfio-common.h |   2 +
>  hw/vfio/common.c              | 196 +++++++++++++++++++++++++++++++---
>  2 files changed, 181 insertions(+), 17 deletions(-)
> 
> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> index 1f21e1fa43..1dc00cabcd 100644
> --- a/include/hw/vfio/vfio-common.h
> +++ b/include/hw/vfio/vfio-common.h
> @@ -95,6 +95,8 @@ typedef struct VFIOContainer {
>      unsigned int dma_max_mappings;
>      IOVATree *mappings;
>      QemuMutex mappings_mutex;
> +    /* Represents the range [0, giommu_tracked_range) not inclusive */
> +    hwaddr giommu_tracked_range;
>      QLIST_HEAD(, VFIOGuestIOMMU) giommu_list;
>      QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list;
>      QLIST_HEAD(, VFIOGroup) group_list;
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index 4a7fff6eeb..1024788bcc 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -45,6 +45,8 @@
>  #include "migration/qemu-file.h"
>  #include "sysemu/tpm.h"
>  #include "qemu/iova-tree.h"
> +#include "hw/boards.h"
> +#include "hw/mem/memory-device.h"
>  
>  VFIOGroupList vfio_group_list =
>      QLIST_HEAD_INITIALIZER(vfio_group_list);
> @@ -430,6 +432,38 @@ void vfio_unblock_multiple_devices_migration(void)
>      multiple_devices_migration_blocker = NULL;
>  }
>  
> +static uint64_t vfio_get_ram_size(void)
> +{
> +    MachineState *ms = MACHINE(qdev_get_machine());
> +    uint64_t plugged_size;
> +
> +    plugged_size = get_plugged_memory_size();
> +    if (plugged_size == (uint64_t)-1) {
> +        plugged_size = 0;
> +    }
> +
> +    return ms->ram_size + plugged_size;
> +}
> +
> +static int vfio_iommu_get_max_iova(VFIOContainer *container, hwaddr *max_iova)
> +{
> +    VFIOGuestIOMMU *giommu;
> +    int ret;
> +
> +    giommu = QLIST_FIRST(&container->giommu_list);
> +    if (!giommu) {
> +        return -ENOENT;
> +    }
> +
> +    ret = memory_region_iommu_get_attr(giommu->iommu_mr, IOMMU_ATTR_MAX_IOVA,
> +                                       max_iova);
> +    if (ret) {
> +        return ret;
> +    }
> +
> +    return 0;
> +}
> +
>  static bool vfio_have_giommu(VFIOContainer *container)
>  {
>      return !QLIST_EMPTY(&container->giommu_list);
> @@ -1510,7 +1544,8 @@ static gboolean vfio_iova_tree_get_last(DMAMap *map, gpointer data)
>  }
>  
>  static struct vfio_device_feature *
> -vfio_device_feature_dma_logging_start_create(VFIOContainer *container)
> +vfio_device_feature_dma_logging_start_create(VFIOContainer *container,
> +                                             bool giommu)
>  {
>      struct vfio_device_feature *feature;
>      size_t feature_size;
> @@ -1529,6 +1564,16 @@ vfio_device_feature_dma_logging_start_create(VFIOContainer *container)
>      control = (struct vfio_device_feature_dma_logging_control *)feature->data;
>      control->page_size = qemu_real_host_page_size();
>  
> +    if (giommu) {
> +        ranges = g_malloc0(sizeof(*ranges));
> +        ranges->iova = 0;
> +        ranges->length = container->giommu_tracked_range;
> +        control->num_ranges = 1;
> +        control->ranges = (uint64_t)ranges;
> +
> +        return feature;
> +    }
> +
>      QEMU_LOCK_GUARD(&container->mappings_mutex);
>  
>      /*
> @@ -1578,12 +1623,12 @@ static void vfio_device_feature_dma_logging_start_destroy(
>      g_free(feature);
>  }
>  
> -static int vfio_devices_dma_logging_start(VFIOContainer *container)
> +static int vfio_devices_dma_logging_start(VFIOContainer *container, bool giommu)
>  {
>      struct vfio_device_feature *feature;
>      int ret;
>  
> -    feature = vfio_device_feature_dma_logging_start_create(container);
> +    feature = vfio_device_feature_dma_logging_start_create(container, giommu);
>      if (!feature) {
>          return -errno;
>      }
> @@ -1598,18 +1643,128 @@ static int vfio_devices_dma_logging_start(VFIOContainer *container)
>      return ret;
>  }
>  
> +typedef struct {
> +    hwaddr *ranges;
> +    unsigned int ranges_num;
> +} VFIOGIOMMUDeviceDTRanges;
> +
> +/*
> + * This value is used in the second attempt to start device dirty tracking with
> + * vIOMMU, or if the giommu fails to report its max iova.
> + * It should be in the middle, not too big and not too small, allowing devices
> + * with HW limitations to do device dirty tracking while covering a fair amount
> + * of the IOVA space.
> + *
> + * This arbitrary value was chosen because it is the minimum value of Intel
> + * IOMMU max IOVA and mlx5 devices support tracking a range of this size.
> + */
> +#define VFIO_IOMMU_DEFAULT_MAX_IOVA ((1ULL << 39) - 1)
> +
> +#define VFIO_IOMMU_RANGES_NUM 3
> +static VFIOGIOMMUDeviceDTRanges *
> +vfio_iommu_device_dirty_tracking_ranges_create(VFIOContainer *container)
> +{
> +    hwaddr iommu_max_iova = VFIO_IOMMU_DEFAULT_MAX_IOVA;
> +    hwaddr retry_iova;
> +    hwaddr ram_size = vfio_get_ram_size();
> +    VFIOGIOMMUDeviceDTRanges *dt_ranges;
> +    int ret;
> +
> +    dt_ranges = g_try_new0(VFIOGIOMMUDeviceDTRanges, 1);
> +    if (!dt_ranges) {
> +        errno = ENOMEM;
> +
> +        return NULL;
> +    }
> +
> +    dt_ranges->ranges_num = VFIO_IOMMU_RANGES_NUM;
> +
> +    dt_ranges->ranges = g_try_new0(hwaddr, dt_ranges->ranges_num);
> +    if (!dt_ranges->ranges) {
> +        g_free(dt_ranges);
> +        errno = ENOMEM;
> +
> +        return NULL;
> +    }
> +
> +    /*
> +     * With vIOMMU we try to track the entire IOVA space. As the IOVA space can
> +     * be rather big, devices might not be able to track it due to HW
> +     * limitations. In that case:
> +     * (1) Retry tracking a smaller part of the IOVA space.
> +     * (2) Retry tracking a range in the size of the physical memory.

This looks really sketchy; why do we think there's a "good enough"
value here?  If we get it wrong, the device potentially has access to
IOVA space that we're not tracking, right?

I'd think the only viable fallback if the vIOMMU doesn't report its max
IOVA is the full 64-bit address space, otherwise it seems like we need
to add a migration blocker (see the sketch after the quoted patch
below).

BTW, virtio-iommu is actively working to support vfio devices; we
should include support for it as well as VT-d.  Thanks,

Alex

> +     */
> +    ret = vfio_iommu_get_max_iova(container, &iommu_max_iova);
> +    if (!ret) {
> +        /* Check 2^64 wrap around */
> +        if (!REAL_HOST_PAGE_ALIGN(iommu_max_iova)) {
> +            iommu_max_iova -= qemu_real_host_page_size();
> +        }
> +    }
> +
> +    retry_iova = MIN(iommu_max_iova / 2, VFIO_IOMMU_DEFAULT_MAX_IOVA);
> +
> +    dt_ranges->ranges[0] = REAL_HOST_PAGE_ALIGN(iommu_max_iova);
> +    dt_ranges->ranges[1] = REAL_HOST_PAGE_ALIGN(retry_iova);
> +    dt_ranges->ranges[2] = REAL_HOST_PAGE_ALIGN(MIN(ram_size, retry_iova / 2));
> +
> +    return dt_ranges;
> +}
> +
> +static void vfio_iommu_device_dirty_tracking_ranges_destroy(
> +    VFIOGIOMMUDeviceDTRanges *dt_ranges)
> +{
> +    g_free(dt_ranges->ranges);
> +    g_free(dt_ranges);
> +}
> +
> +static int vfio_devices_start_dirty_page_tracking(VFIOContainer *container)
> +{
> +    VFIOGIOMMUDeviceDTRanges *dt_ranges;
> +    int ret;
> +    int i;
> +
> +    if (!vfio_have_giommu(container)) {
> +        return vfio_devices_dma_logging_start(container, false);
> +    }
> +
> +    dt_ranges = vfio_iommu_device_dirty_tracking_ranges_create(container);
> +    if (!dt_ranges) {
> +        return -errno;
> +    }
> +
> +    for (i = 0; i < dt_ranges->ranges_num; i++) {
> +        container->giommu_tracked_range = dt_ranges->ranges[i];
> +        ret = vfio_devices_dma_logging_start(container, true);
> +        if (!ret) {
> +            break;
> +        }
> +
> +        if (i < dt_ranges->ranges_num - 1) {
> +            warn_report("Failed to start device dirty tracking with vIOMMU "
> +                        "with range of size 0x%" HWADDR_PRIx
> +                        ", err: %d. Retrying with range "
> +                        "of size 0x%" HWADDR_PRIx,
> +                        dt_ranges->ranges[i], ret, dt_ranges->ranges[i + 1]);
> +        } else {
> +            error_report("Failed to start device dirty tracking with vIOMMU "
> +                         "with range of size 0x%" HWADDR_PRIx ", err: %d",
> +                         dt_ranges->ranges[i], ret);
> +        }
> +    }
> +
> +    vfio_iommu_device_dirty_tracking_ranges_destroy(dt_ranges);
> +
> +    return ret;
> +}
> +
>  static void vfio_listener_log_global_start(MemoryListener *listener)
>  {
>      VFIOContainer *container = container_of(listener, VFIOContainer, listener);
>      int ret;
>  
>      if (vfio_devices_all_device_dirty_tracking(container)) {
> -        if (vfio_have_giommu(container)) {
> -            /* Device dirty page tracking currently doesn't support vIOMMU */
> -            return;
> -        }
> -
> -        ret = vfio_devices_dma_logging_start(container);
> +        ret = vfio_devices_start_dirty_page_tracking(container);
>      } else {
>          ret = vfio_set_dirty_page_tracking(container, true);
>      }
> @@ -1627,11 +1782,6 @@ static void vfio_listener_log_global_stop(MemoryListener *listener)
>      int ret;
>  
>      if (vfio_devices_all_device_dirty_tracking(container)) {
> -        if (vfio_have_giommu(container)) {
> -            /* Device dirty page tracking currently doesn't support vIOMMU */
> -            return;
> -        }
> -
>          ret = vfio_devices_dma_logging_stop(container);
>      } else {
>          ret = vfio_set_dirty_page_tracking(container, false);
> @@ -1670,6 +1820,17 @@ static int vfio_device_dma_logging_report(VFIODevice *vbasedev, hwaddr iova,
>      return 0;
>  }
>  
> +static bool vfio_iommu_range_is_device_tracked(VFIOContainer *container,
> +                                               hwaddr iova, hwaddr size)
> +{
> +    /* Check for 2^64 wrap around */
> +    if (!(iova + size)) {
> +        return false;
> +    }
> +
> +    return iova + size <= container->giommu_tracked_range;
> +}
> +
>  static int vfio_devices_query_dirty_bitmap(VFIOContainer *container,
>                                             VFIOBitmap *vbmap, hwaddr iova,
>                                             hwaddr size)
> @@ -1679,10 +1840,11 @@ static int vfio_devices_query_dirty_bitmap(VFIOContainer *container,
>      int ret;
>  
>      if (vfio_have_giommu(container)) {
> -        /* Device dirty page tracking currently doesn't support vIOMMU */
> -        bitmap_set(vbmap->bitmap, 0, vbmap->pages);
> +        if (!vfio_iommu_range_is_device_tracked(container, iova, size)) {
> +            bitmap_set(vbmap->bitmap, 0, vbmap->pages);
>  
> -        return 0;
> +            return 0;
> +        }
>      }
>  
>      QLIST_FOREACH(group, &container->group_list, container_next) {
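
As a rough illustration (not from the posted series) of the fallback
suggested above: if the vIOMMU does not report IOMMU_ATTR_MAX_IOVA, cover
the whole 64-bit IOVA space or fail so the caller can add a migration
blocker, instead of guessing a smaller range.  The helper name below is
hypothetical and the 2^64 wrap-around handling from the patch is elided:

static void vfio_giommu_tracked_range_init(VFIOContainer *container)
{
    hwaddr max_iova;

    if (!vfio_iommu_get_max_iova(container, &max_iova)) {
        /* giommu_tracked_range represents [0, giommu_tracked_range). */
        container->giommu_tracked_range = REAL_HOST_PAGE_ALIGN(max_iova);
        return;
    }

    /*
     * No max IOVA reported: track the full 64-bit space, aligned down to
     * a host page boundary.  If the device cannot log a range this large,
     * the caller would register a migration blocker rather than silently
     * tracking a smaller, guessed range.
     */
    container->giommu_tracked_range =
        UINT64_MAX & ~((hwaddr)qemu_real_host_page_size() - 1);
}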




* Re: [PATCH v2 11/20] vfio/common: Add device dirty page tracking start/stop
  2023-02-22 22:40   ` Alex Williamson
@ 2023-02-23  2:02     ` Jason Gunthorpe
  2023-02-23 19:27       ` Alex Williamson
  2023-02-23 15:36     ` Avihai Horon
  1 sibling, 1 reply; 93+ messages in thread
From: Jason Gunthorpe @ 2023-02-23  2:02 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Avihai Horon, qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Maor Gottlieb, Kirti Wankhede, Tarun Gupta,
	Joao Martins

On Wed, Feb 22, 2023 at 03:40:43PM -0700, Alex Williamson wrote:
> > +    /*
> > +     * DMA logging uAPI guarantees to support at least num_ranges that fits into
> > +     * a single host kernel page. To be on the safe side, use this as a limit
> > +     * from which to merge to a single range.
> > +     */
> > +    max_ranges = qemu_real_host_page_size() / sizeof(*ranges);
> > +    cur_ranges = iova_tree_nnodes(container->mappings);
> > +    control->num_ranges = (cur_ranges <= max_ranges) ? cur_ranges : 1;
> 
> This makes me suspicious that we're implementing to the characteristics
> of a specific device rather than strictly to the vfio migration API.
> Are we just trying to avoid the error handling to support the try and
> fall back to a single range behavior?

This was what we agreed to when making the kernel patches. Userspace
is restricted to send one page of range list to the kernel, and the
kernel will always adjust that to whatever smaller list the device needs.

We added this limit only because we don't want to have a way for
userspace to consume a lot of kernel memory.

See LOG_MAX_RANGES in vfio_main.c

If qemu is in viommu mode and it has a huge number of ranges, then it
must cut the list down before passing it to the kernel.
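
A generic, hedged sketch of that "cut it down" step (hypothetical helper
names; the guaranteed minimum comes from the one-host-page rule above):
compute how many ranges fit in one page and collapse any overflow into the
last kept entry.  The merged tail over-approximates the tracked area, which
is safe for dirty tracking:

#include <stddef.h>
#include <stdint.h>

struct dma_logging_range {
    uint64_t iova;
    uint64_t length;
};

/* How many ranges are guaranteed to be accepted: one host page worth. */
static size_t dma_logging_max_ranges(size_t host_page_size)
{
    return host_page_size / sizeof(struct dma_logging_range);
}

/*
 * Clamp a sorted, non-overlapping range list to at most max_ranges entries
 * by extending the last kept entry to the end of the final range.
 */
static size_t dma_logging_clamp_ranges(struct dma_logging_range *r, size_t n,
                                       size_t max_ranges)
{
    if (max_ranges == 0 || n <= max_ranges) {
        return n;
    }

    r[max_ranges - 1].length =
        r[n - 1].iova + r[n - 1].length - r[max_ranges - 1].iova;

    return max_ranges;
}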

Jason



* Re: [PATCH v2 17/20] vfio/common: Support device dirty page tracking with vIOMMU
  2023-02-22 23:34   ` Alex Williamson
@ 2023-02-23  2:08     ` Jason Gunthorpe
  2023-02-23 20:06       ` Alex Williamson
  0 siblings, 1 reply; 93+ messages in thread
From: Jason Gunthorpe @ 2023-02-23  2:08 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Avihai Horon, qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Maor Gottlieb, Kirti Wankhede, Tarun Gupta,
	Joao Martins

On Wed, Feb 22, 2023 at 04:34:39PM -0700, Alex Williamson wrote:
> > +    /*
> > +     * With vIOMMU we try to track the entire IOVA space. As the IOVA space can
> > +     * be rather big, devices might not be able to track it due to HW
> > +     * limitations. In that case:
> > +     * (1) Retry tracking a smaller part of the IOVA space.
> > +     * (2) Retry tracking a range in the size of the physical memory.
> 
> This looks really sketchy, why do we think there's a "good enough"
> value here?  If we get it wrong, the device potentially has access to
> IOVA space that we're not tracking, right?

The idea was the untracked range becomes permanently dirty, so at
worst this means the migration never converges.
 
#2 is the presumption that the guest is using an identity map.

> I'd think the only viable fallback if the vIOMMU doesn't report its max
> IOVA is the full 64-bit address space, otherwise it seems like we need
> to add a migration blocker.

This is basically saying vIOMMU doesn't work with migration, and we've
heard that this isn't OK. There are cases where vIOMMU is on but the
guest always uses identity maps, e.g. for virtual interrupt remapping.

We also have a future problem in that nested translation is
incompatible with device dirty tracking.

Jason



* Re: [PATCH v2 00/20] vfio: Add migration pre-copy support and device dirty tracking
  2023-02-22 20:55 ` Alex Williamson
@ 2023-02-23 10:05   ` Cédric Le Goater
  2023-02-23 15:07     ` Avihai Horon
  2023-02-23 14:56   ` Avihai Horon
  1 sibling, 1 reply; 93+ messages in thread
From: Cédric Le Goater @ 2023-02-23 10:05 UTC (permalink / raw)
  To: Alex Williamson, Avihai Horon
  Cc: qemu-devel, Juan Quintela, Dr. David Alan Gilbert,
	Michael S. Tsirkin, Peter Xu, Jason Wang, Marcel Apfelbaum,
	Paolo Bonzini, Richard Henderson, Eduardo Habkost,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta, Joao Martins

On 2/22/23 21:55, Alex Williamson wrote:
> 
> There are various errors running this through the CI on gitlab.
> 
> This one seems bogus but needs to be resolved regardless:
> 
> https://gitlab.com/alex.williamson/qemu/-/jobs/3817940731
> FAILED: libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o
> 2786s390x-linux-gnu-gcc -m64 -Ilibqemu-aarch64-softmmu.fa.p -I. -I.. -Itarget/arm -I../target/arm -Iqapi -Itrace -Iui -Iui/shader -I/usr/include/pixman-1 -I/usr/include/capstone -I/usr/include/glib-2.0 -I/usr/lib/s390x-linux-gnu/glib-2.0/include -fdiagnostics-color=auto -Wall -Winvalid-pch -Werror -std=gnu11 -O2 -g -isystem /builds/alex.williamson/qemu/linux-headers -isystem linux-headers -iquote . -iquote /builds/alex.williamson/qemu -iquote /builds/alex.williamson/qemu/include -iquote /builds/alex.williamson/qemu/tcg/s390x -pthread -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -fno-strict-aliasing -fno-common -fwrapv -Wundef -Wwrite-strings -Wmissing-prototypes -Wstrict-prototypes -Wredundant-decls -Wold-style-declaration -Wold-style-definition -Wtype-limits -Wformat-security -Wformat-y2k -Winit-self -Wignored-qualifiers -Wempty-body -Wnested-externs -Wendif-labels -Wexpansion-to-defined -Wimplicit-fallthrough=2 -Wmissing-format-attribute -Wno-missing-include-dirs -Wno-shift-negative-value -Wno-psabi -fstack-protector-strong -fPIE -isystem../linux-headers -isystemlinux-headers -DNEED_CPU_H '-DCONFIG_TARGET="aarch64-softmmu-config-target.h"' '-DCONFIG_DEVICES="aarch64-softmmu-config-devices.h"' -MD -MQ libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o -MF libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o.d -o libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o -c ../hw/vfio/common.c
> 2787../hw/vfio/common.c: In function ‘vfio_listener_log_global_start’:
> 2788../hw/vfio/common.c:1772:8: error: ‘ret’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
> 2789 1772 |     if (ret) {
> 2790      |        ^


The routine to fix is vfio_devices_start_dirty_page_tracking(). The compiler
is doing some inlining.

Thanks,
C.


* Re: [PATCH v2 10/20] vfio/common: Record DMA mapped IOVA ranges
  2023-02-22 22:10   ` Alex Williamson
@ 2023-02-23 10:37     ` Joao Martins
  2023-02-23 21:05       ` Alex Williamson
  0 siblings, 1 reply; 93+ messages in thread
From: Joao Martins @ 2023-02-23 10:37 UTC (permalink / raw)
  To: Alex Williamson, Avihai Horon
  Cc: qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta

On 22/02/2023 22:10, Alex Williamson wrote:
> On Wed, 22 Feb 2023 19:49:05 +0200
> Avihai Horon <avihaih@nvidia.com> wrote:
>> From: Joao Martins <joao.m.martins@oracle.com>
>> @@ -612,6 +665,16 @@ static int vfio_dma_map(VFIOContainer *container, hwaddr iova,
>>          .iova = iova,
>>          .size = size,
>>      };
>> +    int ret;
>> +
>> +    ret = vfio_record_mapping(container, iova, size, readonly);
>> +    if (ret) {
>> +        error_report("vfio: Failed to record mapping, iova: 0x%" HWADDR_PRIx
>> +                     ", size: 0x" RAM_ADDR_FMT ", ret: %d (%s)",
>> +                     iova, size, ret, strerror(-ret));
>> +
>> +        return ret;
>> +    }
> 
> Is there no way to replay the mappings when a migration is started?
> This seems like a horrible latency and bloat trade-off for the
> possibility that the VM might migrate and the device might support
> these features.  Our performance with vIOMMU is already terrible, I
> can't help but believe this makes it worse.  Thanks,
> 

It is a nop if the vIOMMU is being used (entries in container->giommu_list),
as that case uses a max-iova based IOVA range. So this is really for IOMMU
identity mapping and the no-vIOMMU case. We could replay the mappings if they
were tracked/stored anywhere.

I suppose we could move vfio_devices_all_device_dirty_tracking() into this
patch and then call vfio_{record,erase}_mapping() conditionally, skipping it
when we are passing through a device that doesn't have live-migration
support? Would that address the impact you're concerned about for
non-live-migratable devices?

On the other hand, the hypothetical PCI device hotplug case makes this a bit
more complicated, as we can still attempt to hotplug a device before
migration is even attempted. Meaning that we could start with
live-migratable devices and add the tracking, up to hotplugging a device
without such support (which adds a blocker), leaving the recorded mappings
there with no further use. So it felt simpler to always track, and to skip
recording mappings only if the vIOMMU is in active use?
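
A minimal sketch of that conditional recording (hypothetical helper names,
reusing the predicates already introduced in this series):

/*
 * Only pay the IOVATree cost when the recorded mappings can actually be
 * consumed by device dirty tracking: every device supports it and no
 * vIOMMU (which uses a max-IOVA based range instead) is in use.
 */
static bool vfio_should_record_mappings(VFIOContainer *container)
{
    return !vfio_have_giommu(container) &&
           vfio_devices_all_device_dirty_tracking(container);
}

static int vfio_record_mapping_if_needed(VFIOContainer *container, hwaddr iova,
                                         ram_addr_t size, bool readonly)
{
    if (!vfio_should_record_mappings(container)) {
        return 0;
    }

    return vfio_record_mapping(container, iova, size, readonly);
}

The catch, as noted above, is that a container which starts out with only
live-migratable devices could later hotplug one without dirty tracking
support, leaving the decision stale, which is why the posted patch records
unconditionally.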



* Re: [PATCH v2 00/20] vfio: Add migration pre-copy support and device dirty tracking
  2023-02-22 20:55 ` Alex Williamson
  2023-02-23 10:05   ` Cédric Le Goater
@ 2023-02-23 14:56   ` Avihai Horon
  2023-02-24 19:26     ` Joao Martins
  1 sibling, 1 reply; 93+ messages in thread
From: Avihai Horon @ 2023-02-23 14:56 UTC (permalink / raw)
  To: Alex Williamson
  Cc: qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta, Joao Martins


On 22/02/2023 22:55, Alex Williamson wrote:
>
>
> There are various errors running this through the CI on gitlab.
>
> This one seems bogus but needs to be resolved regardless:
>
> https://gitlab.com/alex.williamson/qemu/-/jobs/3817940731
> FAILED: libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o
> 2786s390x-linux-gnu-gcc -m64 -Ilibqemu-aarch64-softmmu.fa.p -I. -I.. -Itarget/arm -I../target/arm -Iqapi -Itrace -Iui -Iui/shader -I/usr/include/pixman-1 -I/usr/include/capstone -I/usr/include/glib-2.0 -I/usr/lib/s390x-linux-gnu/glib-2.0/include -fdiagnostics-color=auto -Wall -Winvalid-pch -Werror -std=gnu11 -O2 -g -isystem /builds/alex.williamson/qemu/linux-headers -isystem linux-headers -iquote . -iquote /builds/alex.williamson/qemu -iquote /builds/alex.williamson/qemu/include -iquote /builds/alex.williamson/qemu/tcg/s390x -pthread -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -fno-strict-aliasing -fno-common -fwrapv -Wundef -Wwrite-strings -Wmissing-prototypes -Wstrict-prototypes -Wredundant-decls -Wold-style-declaration -Wold-style-definition -Wtype-limits -Wformat-security -Wformat-y2k -Winit-self -Wignored-qualifiers -Wempty-body -Wnested-externs -Wendif-labels -Wexpansion-to-defined -Wimplicit-fallthrough=2 -Wmissing-format-attribute -Wno-missing-include-dirs -Wno-shift-negative-value -Wno-psabi -fstack-protector-strong -fPIE -isystem../linux-headers -isystemlinux-headers -DNEED_CPU_H '-DCONFIG_TARGET="aarch64-softmmu-config-target.h"' '-DCONFIG_DEVICES="aarch64-softmmu-config-devices.h"' -MD -MQ libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o -MF libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o.d -o libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o -c ../hw/vfio/common.c
> 2787../hw/vfio/common.c: In function ‘vfio_listener_log_global_start’:
> 2788../hw/vfio/common.c:1772:8: error: ‘ret’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
> 2789 1772 |     if (ret) {
> 2790      |        ^
>
> 32-bit builds have some actual errors though:
>
> https://gitlab.com/alex.williamson/qemu/-/jobs/3817940719
> FAILED: libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o
> 2601cc -m32 -Ilibqemu-aarch64-softmmu.fa.p -I. -I.. -Itarget/arm -I../target/arm -Iqapi -Itrace -Iui -Iui/shader -I/usr/include/pixman-1 -I/usr/include/glib-2.0 -I/usr/lib/glib-2.0/include -I/usr/include/sysprof-4 -fdiagnostics-color=auto -Wall -Winvalid-pch -Werror -std=gnu11 -O2 -g -isystem /builds/alex.williamson/qemu/linux-headers -isystem linux-headers -iquote . -iquote /builds/alex.williamson/qemu -iquote /builds/alex.williamson/qemu/include -iquote /builds/alex.williamson/qemu/tcg/i386 -pthread -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -fno-strict-aliasing -fno-common -fwrapv -Wundef -Wwrite-strings -Wmissing-prototypes -Wstrict-prototypes -Wredundant-decls -Wold-style-declaration -Wold-style-definition -Wtype-limits -Wformat-security -Wformat-y2k -Winit-self -Wignored-qualifiers -Wempty-body -Wnested-externs -Wendif-labels -Wexpansion-to-defined -Wimplicit-fallthrough=2 -Wmissing-format-attribute -Wno-missing-include-dirs -Wno-shift-negative-value -Wno-psabi -fstack-protector-strong -fPIE -isystem../linux-headers -isystemlinux-headers -DNEED_CPU_H '-DCONFIG_TARGET="aarch64-softmmu-config-target.h"' '-DCONFIG_DEVICES="aarch64-softmmu-config-devices.h"' -MD -MQ libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o -MF libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o.d -o libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o -c ../hw/vfio/common.c
> 2602../hw/vfio/common.c: In function 'vfio_device_feature_dma_logging_start_create':
> 2603../hw/vfio/common.c:1572:27: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
> 2604 1572 |         control->ranges = (uint64_t)ranges;
> 2605      |                           ^
> 2606../hw/vfio/common.c:1596:23: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
> 2607 1596 |     control->ranges = (uint64_t)ranges;
> 2608      |                       ^
> 2609../hw/vfio/common.c: In function 'vfio_device_feature_dma_logging_start_destroy':
> 2610../hw/vfio/common.c:1620:9: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
> 2611 1620 |         (struct vfio_device_feature_dma_logging_range *)control->ranges;
> 2612      |         ^
> 2613../hw/vfio/common.c: In function 'vfio_device_dma_logging_report':
> 2614../hw/vfio/common.c:1810:22: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
> 2615 1810 |     report->bitmap = (uint64_t)bitmap;
> 2616      |                      ^

Sure, I will fix these errors.

Thanks.



* Re: [PATCH v2 00/20] vfio: Add migration pre-copy support and device dirty tracking
  2023-02-23 10:05   ` Cédric Le Goater
@ 2023-02-23 15:07     ` Avihai Horon
  2023-02-27 10:24       ` Cédric Le Goater
  0 siblings, 1 reply; 93+ messages in thread
From: Avihai Horon @ 2023-02-23 15:07 UTC (permalink / raw)
  To: Cédric Le Goater, Alex Williamson
  Cc: qemu-devel, Juan Quintela, Dr. David Alan Gilbert,
	Michael S. Tsirkin, Peter Xu, Jason Wang, Marcel Apfelbaum,
	Paolo Bonzini, Richard Henderson, Eduardo Habkost,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta, Joao Martins


On 23/02/2023 12:05, Cédric Le Goater wrote:
>
>
> On 2/22/23 21:55, Alex Williamson wrote:
>>
>> There are various errors running this through the CI on gitlab.
>>
>> This one seems bogus but needs to be resolved regardless:
>>
>> https://gitlab.com/alex.williamson/qemu/-/jobs/3817940731
>> FAILED: libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o
>> 2786s390x-linux-gnu-gcc -m64 -Ilibqemu-aarch64-softmmu.fa.p -I. -I.. 
>> -Itarget/arm -I../target/arm -Iqapi -Itrace -Iui -Iui/shader 
>> -I/usr/include/pixman-1 -I/usr/include/capstone 
>> -I/usr/include/glib-2.0 -I/usr/lib/s390x-linux-gnu/glib-2.0/include 
>> -fdiagnostics-color=auto -Wall -Winvalid-pch -Werror -std=gnu11 -O2 
>> -g -isystem /builds/alex.williamson/qemu/linux-headers -isystem 
>> linux-headers -iquote . -iquote /builds/alex.williamson/qemu -iquote 
>> /builds/alex.williamson/qemu/include -iquote 
>> /builds/alex.williamson/qemu/tcg/s390x -pthread -U_FORTIFY_SOURCE 
>> -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 
>> -D_LARGEFILE_SOURCE -fno-strict-aliasing -fno-common -fwrapv -Wundef 
>> -Wwrite-strings -Wmissing-prototypes -Wstrict-prototypes 
>> -Wredundant-decls -Wold-style-declaration -Wold-style-definition 
>> -Wtype-limits -Wformat-security -Wformat-y2k -Winit-self 
>> -Wignored-qualifiers -Wempty-body -Wnested-externs -Wendif-labels 
>> -Wexpansion-to-defined -Wimplicit-fallthrough=2 
>> -Wmissing-format-attribute -Wno-missing-include-dirs 
>> -Wno-shift-negative-value -Wno-psabi -fstack-protector-strong -fPIE 
>> -isystem../linux-headers -isystemlinux-headers -DNEED_CPU_H 
>> '-DCONFIG_TARGET="aarch64-softmmu-config-target.h"' 
>> '-DCONFIG_DEVICES="aarch64-softmmu-config-devices.h"' -MD -MQ 
>> libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o -MF 
>> libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o.d -o 
>> libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o -c ../hw/vfio/common.c
>> 2787../hw/vfio/common.c: In function ‘vfio_listener_log_global_start’:
>> 2788../hw/vfio/common.c:1772:8: error: ‘ret’ may be used 
>> uninitialized in this function [-Werror=maybe-uninitialized]
>> 2789 1772 |     if (ret) {
>> 2790      |        ^
>
>
> The routine to fix is vfio_devices_start_dirty_page_tracking(). The 
> compiler
> is doing some inlining.
>
I don't think I understand how inlining could cause it.
Could you elaborate on this?

I thought that the compiler just missed the initialization of ret because it
happens in the if/else statement, and that simply doing "int ret = 0;" would
solve it.

Thanks.




* Re: [PATCH v2 03/20] vfio/migration: Add VFIO migration pre-copy support
  2023-02-22 20:58   ` Alex Williamson
@ 2023-02-23 15:25     ` Avihai Horon
  2023-02-23 21:16       ` Alex Williamson
  0 siblings, 1 reply; 93+ messages in thread
From: Avihai Horon @ 2023-02-23 15:25 UTC (permalink / raw)
  To: Alex Williamson
  Cc: qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta, Joao Martins


On 22/02/2023 22:58, Alex Williamson wrote:
>
>
> On Wed, 22 Feb 2023 19:48:58 +0200
> Avihai Horon <avihaih@nvidia.com> wrote:
>
>> Pre-copy support allows the VFIO device data to be transferred while the
>> VM is running. This helps to accommodate VFIO devices that have a large
>> amount of data that needs to be transferred, and it can reduce migration
>> downtime.
>>
>> Pre-copy support is optional in VFIO migration protocol v2.
>> Implement pre-copy of VFIO migration protocol v2 and use it for devices
>> that support it. Full description of it can be found here [1].
>>
>> [1]
>> https://lore.kernel.org/kvm/20221206083438.37807-3-yishaih@nvidia.com/
>>
>> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
>> ---
>>   docs/devel/vfio-migration.rst |  35 +++++--
>>   include/hw/vfio/vfio-common.h |   3 +
>>   hw/vfio/common.c              |   6 +-
>>   hw/vfio/migration.c           | 175 ++++++++++++++++++++++++++++++++--
>>   hw/vfio/trace-events          |   4 +-
>>   5 files changed, 201 insertions(+), 22 deletions(-)
>>
>> diff --git a/docs/devel/vfio-migration.rst b/docs/devel/vfio-migration.rst
>> index c214c73e28..ba80b9150d 100644
>> --- a/docs/devel/vfio-migration.rst
>> +++ b/docs/devel/vfio-migration.rst
>> @@ -7,12 +7,14 @@ the guest is running on source host and restoring this saved state on the
>>   destination host. This document details how saving and restoring of VFIO
>>   devices is done in QEMU.
>>
>> -Migration of VFIO devices currently consists of a single stop-and-copy phase.
>> -During the stop-and-copy phase the guest is stopped and the entire VFIO device
>> -data is transferred to the destination.
>> -
>> -The pre-copy phase of migration is currently not supported for VFIO devices.
>> -Support for VFIO pre-copy will be added later on.
>> +Migration of VFIO devices consists of two phases: the optional pre-copy phase,
>> +and the stop-and-copy phase. The pre-copy phase is iterative and allows to
>> +accommodate VFIO devices that have a large amount of data that needs to be
>> +transferred. The iterative pre-copy phase of migration allows for the guest to
>> +continue whilst the VFIO device state is transferred to the destination, this
>> +helps to reduce the total downtime of the VM. VFIO devices can choose to skip
>> +the pre-copy phase of migration by not reporting the VFIO_MIGRATION_PRE_COPY
>> +flag in VFIO_DEVICE_FEATURE_MIGRATION ioctl.
> Or alternatively for the last sentence,
>
>    VFIO devices opt-in to pre-copy support by reporting the
>    VFIO_MIGRATION_PRE_COPY flag in the VFIO_DEVICE_FEATURE_MIGRATION
>    ioctl.

Sounds good, I will change it.

>
>>   Note that currently VFIO migration is supported only for a single device. This
>>   is due to VFIO migration's lack of P2P support. However, P2P support is planned
>> @@ -29,10 +31,20 @@ VFIO implements the device hooks for the iterative approach as follows:
>>   * A ``load_setup`` function that sets the VFIO device on the destination in
>>     _RESUMING state.
>>
>> +* A ``state_pending_estimate`` function that reports an estimate of the
>> +  remaining pre-copy data that the vendor driver has yet to save for the VFIO
>> +  device.
>> +
>>   * A ``state_pending_exact`` function that reads pending_bytes from the vendor
>>     driver, which indicates the amount of data that the vendor driver has yet to
>>     save for the VFIO device.
>>
>> +* An ``is_active_iterate`` function that indicates ``save_live_iterate`` is
>> +  active only when the VFIO device is in pre-copy states.
>> +
>> +* A ``save_live_iterate`` function that reads the VFIO device's data from the
>> +  vendor driver during iterative pre-copy phase.
>> +
>>   * A ``save_state`` function to save the device config space if it is present.
>>
>>   * A ``save_live_complete_precopy`` function that sets the VFIO device in
>> @@ -95,8 +107,10 @@ Flow of state changes during Live migration
>>   ===========================================
>>
>>   Below is the flow of state change during live migration.
>> -The values in the brackets represent the VM state, the migration state, and
>> +The values in the parentheses represent the VM state, the migration state, and
>>   the VFIO device state, respectively.
>> +The text in the square brackets represents the flow if the VFIO device supports
>> +pre-copy.
>>
>>   Live migration save path
>>   ------------------------
>> @@ -108,11 +122,12 @@ Live migration save path
>>                                     |
>>                        migrate_init spawns migration_thread
>>                   Migration thread then calls each device's .save_setup()
>> -                       (RUNNING, _SETUP, _RUNNING)
>> +                  (RUNNING, _SETUP, _RUNNING [_PRE_COPY])
>>                                     |
>> -                      (RUNNING, _ACTIVE, _RUNNING)
>> -             If device is active, get pending_bytes by .state_pending_exact()
>> +                  (RUNNING, _ACTIVE, _RUNNING [_PRE_COPY])
>> +      If device is active, get pending_bytes by .state_pending_{estimate,exact}()
>>             If total pending_bytes >= threshold_size, call .save_live_iterate()
>> +                  [Data of VFIO device for pre-copy phase is copied]
>>           Iterate till total pending bytes converge and are less than threshold
>>                                     |
>>     On migration completion, vCPU stops and calls .save_live_complete_precopy for
>> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
>> index 87524c64a4..ee55d442b4 100644
>> --- a/include/hw/vfio/vfio-common.h
>> +++ b/include/hw/vfio/vfio-common.h
>> @@ -66,6 +66,9 @@ typedef struct VFIOMigration {
>>       int data_fd;
>>       void *data_buffer;
>>       size_t data_buffer_size;
>> +    uint64_t precopy_init_size;
>> +    uint64_t precopy_dirty_size;
> size_t?
>
>> +    uint64_t mig_flags;
>>   } VFIOMigration;
>>
>>   typedef struct VFIOAddressSpace {
>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>> index bab83c0e55..6f5afe9f5a 100644
>> --- a/hw/vfio/common.c
>> +++ b/hw/vfio/common.c
>> @@ -409,7 +409,8 @@ static bool vfio_devices_all_dirty_tracking(VFIOContainer *container)
>>               }
>>
>>               if (vbasedev->pre_copy_dirty_page_tracking == ON_OFF_AUTO_OFF &&
>> -                migration->device_state == VFIO_DEVICE_STATE_RUNNING) {
>> +                (migration->device_state == VFIO_DEVICE_STATE_RUNNING ||
>> +                 migration->device_state == VFIO_DEVICE_STATE_PRE_COPY)) {
>>                   return false;
>>               }
>>           }
>> @@ -438,7 +439,8 @@ static bool vfio_devices_all_running_and_mig_active(VFIOContainer *container)
>>                   return false;
>>               }
>>
>> -            if (migration->device_state == VFIO_DEVICE_STATE_RUNNING) {
>> +            if (migration->device_state == VFIO_DEVICE_STATE_RUNNING ||
>> +                migration->device_state == VFIO_DEVICE_STATE_PRE_COPY) {
>>                   continue;
>>               } else {
>>                   return false;
>> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
>> index 94a4df73d0..307983d57d 100644
>> --- a/hw/vfio/migration.c
>> +++ b/hw/vfio/migration.c
>> @@ -67,6 +67,8 @@ static const char *mig_state_to_str(enum vfio_device_mig_state state)
>>           return "STOP_COPY";
>>       case VFIO_DEVICE_STATE_RESUMING:
>>           return "RESUMING";
>> +    case VFIO_DEVICE_STATE_PRE_COPY:
>> +        return "PRE_COPY";
>>       default:
>>           return "UNKNOWN STATE";
>>       }
>> @@ -240,6 +242,23 @@ static int vfio_query_stop_copy_size(VFIODevice *vbasedev,
>>       return 0;
>>   }
>>
>> +static int vfio_query_precopy_size(VFIOMigration *migration,
>> +                                   uint64_t *init_size, uint64_t *dirty_size)
> size_t?  Seems like a concern throughout.

Yes, I will change it in all places.

>> +{
>> +    struct vfio_precopy_info precopy = {
>> +        .argsz = sizeof(precopy),
>> +    };
>> +
>> +    if (ioctl(migration->data_fd, VFIO_MIG_GET_PRECOPY_INFO, &precopy)) {
>> +        return -errno;
>> +    }
>> +
>> +    *init_size = precopy.initial_bytes;
>> +    *dirty_size = precopy.dirty_bytes;
>> +
>> +    return 0;
>> +}
>> +
>>   /* Returns the size of saved data on success and -errno on error */
>>   static ssize_t vfio_save_block(QEMUFile *f, VFIOMigration *migration)
>>   {
>> @@ -248,6 +267,11 @@ static ssize_t vfio_save_block(QEMUFile *f, VFIOMigration *migration)
>>       data_size = read(migration->data_fd, migration->data_buffer,
>>                        migration->data_buffer_size);
>>       if (data_size < 0) {
>> +        /* Pre-copy emptied all the device state for now */
>> +        if (errno == ENOMSG) {
>> +            return 0;
>> +        }
>> +
>>           return -errno;
>>       }
>>       if (data_size == 0) {
>> @@ -264,6 +288,31 @@ static ssize_t vfio_save_block(QEMUFile *f, VFIOMigration *migration)
>>       return qemu_file_get_error(f) ?: data_size;
>>   }
>>
>> +static void vfio_update_estimated_pending_data(VFIOMigration *migration,
>> +                                               uint64_t data_size)
>> +{
>> +    if (!data_size) {
>> +        /*
>> +         * Pre-copy emptied all the device state for now, update estimated sizes
>> +         * accordingly.
>> +         */
>> +        migration->precopy_init_size = 0;
>> +        migration->precopy_dirty_size = 0;
>> +
>> +        return;
>> +    }
>> +
>> +    if (migration->precopy_init_size) {
>> +        uint64_t init_size = MIN(migration->precopy_init_size, data_size);
>> +
>> +        migration->precopy_init_size -= init_size;
>> +        data_size -= init_size;
>> +    }
>> +
>> +    migration->precopy_dirty_size -= MIN(migration->precopy_dirty_size,
>> +                                         data_size);
>> +}
>> +
>>   /* ---------------------------------------------------------------------- */
>>
>>   static int vfio_save_setup(QEMUFile *f, void *opaque)
>> @@ -284,6 +333,35 @@ static int vfio_save_setup(QEMUFile *f, void *opaque)
>>           return -ENOMEM;
>>       }
>>
>> +    if (migration->mig_flags & VFIO_MIGRATION_PRE_COPY) {
>> +        uint64_t init_size = 0, dirty_size = 0;
>> +        int ret;
>> +
>> +        switch (migration->device_state) {
>> +        case VFIO_DEVICE_STATE_RUNNING:
>> +            ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_PRE_COPY,
>> +                                           VFIO_DEVICE_STATE_RUNNING);
>> +            if (ret) {
>> +                return ret;
>> +            }
>> +
>> +            vfio_query_precopy_size(migration, &init_size, &dirty_size);
>> +            migration->precopy_init_size = init_size;
>> +            migration->precopy_dirty_size = dirty_size;
> Seems like we could do away with {init,dirty}_size, initialize
> migration->precopy_{init,dirty}_size before the switch, pass them
> directly to vfio_query_precopy_size() and remove all but the break from
> the case below.  But then that also suggests we could redefine
> vfio_query_precopy_size() to
>
> static int vfio_update_precopy_info(VFIOMigration *migration)
>
> which sets the fields directly since this is the only way it's used.

You are right, I will change it.
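
For illustration, that helper could end up looking roughly like this (a
sketch based on the vfio_query_precopy_size() quoted above, not the final
code):

static int vfio_update_precopy_info(VFIOMigration *migration)
{
    struct vfio_precopy_info precopy = {
        .argsz = sizeof(precopy),
    };

    if (ioctl(migration->data_fd, VFIO_MIG_GET_PRECOPY_INFO, &precopy)) {
        return -errno;
    }

    /* Store the estimates directly; both callers want exactly this. */
    migration->precopy_init_size = precopy.initial_bytes;
    migration->precopy_dirty_size = precopy.dirty_bytes;

    return 0;
}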

>> +
>> +            break;
>> +        case VFIO_DEVICE_STATE_STOP:
>> +            /* vfio_save_complete_precopy() will go to STOP_COPY */
>> +
>> +            migration->precopy_init_size = 0;
>> +            migration->precopy_dirty_size = 0;
>> +
>> +            break;
>> +        default:
>> +            return -EINVAL;
>> +        }
>> +    }
>> +
>>       trace_vfio_save_setup(vbasedev->name, migration->data_buffer_size);
>>
>>       qemu_put_be64(f, VFIO_MIG_FLAG_END_OF_STATE);
>> @@ -302,23 +380,44 @@ static void vfio_save_cleanup(void *opaque)
>>       trace_vfio_save_cleanup(vbasedev->name);
>>   }
>>
>> +static void vfio_state_pending_estimate(void *opaque, uint64_t threshold_size,
>> +                                        uint64_t *must_precopy,
>> +                                        uint64_t *can_postcopy)
>> +{
>> +    VFIODevice *vbasedev = opaque;
>> +    VFIOMigration *migration = vbasedev->migration;
>> +
>> +    if (migration->device_state != VFIO_DEVICE_STATE_PRE_COPY) {
>> +        return;
>> +    }
>> +
>> +    /*
>> +     * Initial size should be transferred during pre-copy phase so stop-copy
>> +     * phase will not be slowed down. Report threshold_size to force another
>> +     * pre-copy iteration.
>> +     */
>> +    *must_precopy += migration->precopy_init_size ?
>> +                         threshold_size :
>> +                         migration->precopy_dirty_size;
> This sure feels like we're feeding false data back to the iterator to
> spoof it to run another iteration, when the vfio migration protocol
> only recommends that initial_bytes reaches zero before proceeding to
> stop-copy, it's not a requirement.  What benefit is actually observed
> from this?  Why is this required for initial pre-copy support?  It
> seems devious.

As previously discussed in the thread that added the pre-copy uAPI [1],
initial_bytes can be used by drivers to reduce downtime.
For example, mlx5 transfers some metadata to the target so that it can
pre-allocate resources there, etc.

[1] 
https://lore.kernel.org/kvm/ae4a6259-349d-0131-896c-7a6ea775cc9e@nvidia.com/

Thanks!

>> +
>> +    trace_vfio_state_pending_estimate(vbasedev->name, *must_precopy,
>> +                                      *can_postcopy,
>> +                                      migration->precopy_init_size,
>> +                                      migration->precopy_dirty_size);
>> +}
>> +
>>   /*
>>    * Migration size of VFIO devices can be as little as a few KBs or as big as
>>    * many GBs. This value should be big enough to cover the worst case.
>>    */
>>   #define VFIO_MIG_STOP_COPY_SIZE (100 * GiB)
>>
>> -/*
>> - * Only exact function is implemented and not estimate function. The reason is
>> - * that during pre-copy phase of migration the estimate function is called
>> - * repeatedly while pending RAM size is over the threshold, thus migration
>> - * can't converge and querying the VFIO device pending data size is useless.
>> - */
>>   static void vfio_state_pending_exact(void *opaque, uint64_t threshold_size,
>>                                        uint64_t *must_precopy,
>>                                        uint64_t *can_postcopy)
>>   {
>>       VFIODevice *vbasedev = opaque;
>> +    VFIOMigration *migration = vbasedev->migration;
>>       uint64_t stop_copy_size = VFIO_MIG_STOP_COPY_SIZE;
>>
>>       /*
>> @@ -328,8 +427,57 @@ static void vfio_state_pending_exact(void *opaque, uint64_t threshold_size,
>>       vfio_query_stop_copy_size(vbasedev, &stop_copy_size);
>>       *must_precopy += stop_copy_size;
>>
>> +    if (migration->device_state == VFIO_DEVICE_STATE_PRE_COPY) {
>> +        uint64_t init_size = 0, dirty_size = 0;
>> +
>> +        vfio_query_precopy_size(migration, &init_size, &dirty_size);
>> +        migration->precopy_init_size = init_size;
>> +        migration->precopy_dirty_size = dirty_size;
> This is the only other caller of vfio_query_precopy_size(), following
> the same pattern that could be simplified if the function filled the
> migration fields itself.
>
>> +
>> +        /*
>> +         * Initial size should be transferred during pre-copy phase so
>> +         * stop-copy phase will not be slowed down. Report threshold_size
>> +         * to force another pre-copy iteration.
>> +         */
>> +        *must_precopy += migration->precopy_init_size ?
>> +                             threshold_size :
>> +                             migration->precopy_dirty_size;
>> +    }
> Just as sketchy as above.  Thanks,
>
> Alex
>
>> +
>>       trace_vfio_state_pending_exact(vbasedev->name, *must_precopy, *can_postcopy,
>> -                                   stop_copy_size);
>> +                                   stop_copy_size, migration->precopy_init_size,
>> +                                   migration->precopy_dirty_size);
>> +}
>> +
>> +static bool vfio_is_active_iterate(void *opaque)
>> +{
>> +    VFIODevice *vbasedev = opaque;
>> +    VFIOMigration *migration = vbasedev->migration;
>> +
>> +    return migration->device_state == VFIO_DEVICE_STATE_PRE_COPY;
>> +}
>> +
>> +static int vfio_save_iterate(QEMUFile *f, void *opaque)
>> +{
>> +    VFIODevice *vbasedev = opaque;
>> +    VFIOMigration *migration = vbasedev->migration;
>> +    ssize_t data_size;
>> +
>> +    data_size = vfio_save_block(f, migration);
>> +    if (data_size < 0) {
>> +        return data_size;
>> +    }
>> +    qemu_put_be64(f, VFIO_MIG_FLAG_END_OF_STATE);
>> +
>> +    vfio_update_estimated_pending_data(migration, data_size);
>> +
>> +    trace_vfio_save_iterate(vbasedev->name);
>> +
>> +    /*
>> +     * A VFIO device's pre-copy dirty_bytes is not guaranteed to reach zero.
>> +     * Return 1 so following handlers will not be potentially blocked.
>> +     */
>> +    return 1;
>>   }
>>
>>   static int vfio_save_complete_precopy(QEMUFile *f, void *opaque)
>> @@ -338,7 +486,7 @@ static int vfio_save_complete_precopy(QEMUFile *f, void *opaque)
>>       ssize_t data_size;
>>       int ret;
>>
>> -    /* We reach here with device state STOP only */
>> +    /* We reach here with device state STOP or STOP_COPY only */
>>       ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_STOP_COPY,
>>                                      VFIO_DEVICE_STATE_STOP);
>>       if (ret) {
>> @@ -457,7 +605,10 @@ static int vfio_load_state(QEMUFile *f, void *opaque, int version_id)
>>   static const SaveVMHandlers savevm_vfio_handlers = {
>>       .save_setup = vfio_save_setup,
>>       .save_cleanup = vfio_save_cleanup,
>> +    .state_pending_estimate = vfio_state_pending_estimate,
>>       .state_pending_exact = vfio_state_pending_exact,
>> +    .is_active_iterate = vfio_is_active_iterate,
>> +    .save_live_iterate = vfio_save_iterate,
>>       .save_live_complete_precopy = vfio_save_complete_precopy,
>>       .save_state = vfio_save_state,
>>       .load_setup = vfio_load_setup,
>> @@ -470,13 +621,18 @@ static const SaveVMHandlers savevm_vfio_handlers = {
>>   static void vfio_vmstate_change(void *opaque, bool running, RunState state)
>>   {
>>       VFIODevice *vbasedev = opaque;
>> +    VFIOMigration *migration = vbasedev->migration;
>>       enum vfio_device_mig_state new_state;
>>       int ret;
>>
>>       if (running) {
>>           new_state = VFIO_DEVICE_STATE_RUNNING;
>>       } else {
>> -        new_state = VFIO_DEVICE_STATE_STOP;
>> +        new_state =
>> +            (migration->device_state == VFIO_DEVICE_STATE_PRE_COPY &&
>> +             (state == RUN_STATE_FINISH_MIGRATE || state == RUN_STATE_PAUSED)) ?
>> +                VFIO_DEVICE_STATE_STOP_COPY :
>> +                VFIO_DEVICE_STATE_STOP;
>>       }
>>
>>       /*
>> @@ -590,6 +746,7 @@ static int vfio_migration_init(VFIODevice *vbasedev)
>>       migration->vbasedev = vbasedev;
>>       migration->device_state = VFIO_DEVICE_STATE_RUNNING;
>>       migration->data_fd = -1;
>> +    migration->mig_flags = mig_flags;
>>
>>       oid = vmstate_if_get_id(VMSTATE_IF(DEVICE(obj)));
>>       if (oid) {
>> diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
>> index 669d9fe07c..51613e02e6 100644
>> --- a/hw/vfio/trace-events
>> +++ b/hw/vfio/trace-events
>> @@ -161,6 +161,8 @@ vfio_save_block(const char *name, int data_size) " (%s) data_size %d"
>>   vfio_save_cleanup(const char *name) " (%s)"
>>   vfio_save_complete_precopy(const char *name, int ret) " (%s) ret %d"
>>   vfio_save_device_config_state(const char *name) " (%s)"
>> +vfio_save_iterate(const char *name) " (%s)"
>>   vfio_save_setup(const char *name, uint64_t data_buffer_size) " (%s) data buffer size 0x%"PRIx64
>> -vfio_state_pending_exact(const char *name, uint64_t precopy, uint64_t postcopy, uint64_t stopcopy_size) " (%s) precopy 0x%"PRIx64" postcopy 0x%"PRIx64" stopcopy size 0x%"PRIx64
>> +vfio_state_pending_estimate(const char *name, uint64_t precopy, uint64_t postcopy, uint64_t precopy_init_size, uint64_t precopy_dirty_size) " (%s) precopy 0x%"PRIx64" postcopy 0x%"PRIx64" precopy initial size 0x%"PRIx64" precopy dirty size 0x%"PRIx64
>> +vfio_state_pending_exact(const char *name, uint64_t precopy, uint64_t postcopy, uint64_t stopcopy_size, uint64_t precopy_init_size, uint64_t precopy_dirty_size) " (%s) precopy 0x%"PRIx64" postcopy 0x%"PRIx64" stopcopy size 0x%"PRIx64" precopy initial size 0x%"PRIx64" precopy dirty size 0x%"PRIx64
>>   vfio_vmstate_change(const char *name, int running, const char *reason, const char *dev_state) " (%s) running %d reason %s device state %s"


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 07/20] vfio/common: Add VFIOBitmap and (de)alloc functions
  2023-02-22 21:40   ` Alex Williamson
@ 2023-02-23 15:27     ` Avihai Horon
  0 siblings, 0 replies; 93+ messages in thread
From: Avihai Horon @ 2023-02-23 15:27 UTC (permalink / raw)
  To: Alex Williamson
  Cc: qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta, Joao Martins


On 22/02/2023 23:40, Alex Williamson wrote:
> On Wed, 22 Feb 2023 19:49:02 +0200
> Avihai Horon <avihaih@nvidia.com> wrote:
>
>> There are already two places where dirty page bitmap allocation and
>> calculations are done in open code. With device dirty page tracking
>> being added in next patches, there are going to be even more places.
>>
>> To avoid code duplication, introduce VFIOBitmap struct and corresponding
>> alloc and dealloc functions and use them where applicable.
>>
>> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
>> ---
>>   hw/vfio/common.c | 89 ++++++++++++++++++++++++++++++++----------------
>>   1 file changed, 60 insertions(+), 29 deletions(-)
>>
>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>> index ac93b85632..84f08bdbbb 100644
>> --- a/hw/vfio/common.c
>> +++ b/hw/vfio/common.c
>> @@ -320,6 +320,41 @@ const MemoryRegionOps vfio_region_ops = {
>>    * Device state interfaces
>>    */
>>
>> +typedef struct {
>> +    unsigned long *bitmap;
>> +    hwaddr size;
>> +    hwaddr pages;
>> +} VFIOBitmap;
>> +
>> +static VFIOBitmap *vfio_bitmap_alloc(hwaddr size)
>> +{
>> +    VFIOBitmap *vbmap = g_try_new0(VFIOBitmap, 1);
>> +    if (!vbmap) {
>> +        errno = ENOMEM;
>> +
>> +        return NULL;
>> +    }
>> +
>> +    vbmap->pages = REAL_HOST_PAGE_ALIGN(size) / qemu_real_host_page_size();
>> +    vbmap->size = ROUND_UP(vbmap->pages, sizeof(__u64) * BITS_PER_BYTE) /
>> +                                         BITS_PER_BYTE;
>> +    vbmap->bitmap = g_try_malloc0(vbmap->size);
>> +    if (!vbmap->bitmap) {
>> +        g_free(vbmap);
>> +        errno = ENOMEM;
>> +
>> +        return NULL;
>> +    }
>> +
>> +    return vbmap;
>> +}
>> +
>> +static void vfio_bitmap_dealloc(VFIOBitmap *vbmap)
>> +{
>> +    g_free(vbmap->bitmap);
>> +    g_free(vbmap);
>> +}
> Nit, '_alloc' and '_free' seems like a more standard convention.

Sure, will change.

Thanks.



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 11/20] vfio/common: Add device dirty page tracking start/stop
  2023-02-22 22:40   ` Alex Williamson
  2023-02-23  2:02     ` Jason Gunthorpe
@ 2023-02-23 15:36     ` Avihai Horon
  1 sibling, 0 replies; 93+ messages in thread
From: Avihai Horon @ 2023-02-23 15:36 UTC (permalink / raw)
  To: Alex Williamson
  Cc: qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta, Joao Martins


On 23/02/2023 0:40, Alex Williamson wrote:
> On Wed, 22 Feb 2023 19:49:06 +0200
> Avihai Horon <avihaih@nvidia.com> wrote:
>
>> From: Joao Martins <joao.m.martins@oracle.com>
>>
>> Add device dirty page tracking start/stop functionality. This uses the
>> device DMA logging uAPI to start and stop dirty page tracking by device.
>>
>> Device dirty page tracking is used only if all devices within a
>> container support device dirty page tracking.
>>
>> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
>> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
>> ---
>>   include/hw/vfio/vfio-common.h |   2 +
>>   hw/vfio/common.c              | 211 +++++++++++++++++++++++++++++++++-
>>   2 files changed, 211 insertions(+), 2 deletions(-)
>>
>> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
>> index 6f36876ce0..1f21e1fa43 100644
>> --- a/include/hw/vfio/vfio-common.h
>> +++ b/include/hw/vfio/vfio-common.h
>> @@ -149,6 +149,8 @@ typedef struct VFIODevice {
>>       VFIOMigration *migration;
>>       Error *migration_blocker;
>>       OnOffAuto pre_copy_dirty_page_tracking;
>> +    bool dirty_pages_supported;
>> +    bool dirty_tracking;
>>   } VFIODevice;
>>
>>   struct VFIODeviceOps {
>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>> index 6041da6c7e..740153e7d7 100644
>> --- a/hw/vfio/common.c
>> +++ b/hw/vfio/common.c
>> @@ -473,6 +473,22 @@ static bool vfio_devices_all_dirty_tracking(VFIOContainer *container)
>>       return true;
>>   }
>>
>> +static bool vfio_devices_all_device_dirty_tracking(VFIOContainer *container)
>> +{
>> +    VFIOGroup *group;
>> +    VFIODevice *vbasedev;
>> +
>> +    QLIST_FOREACH(group, &container->group_list, container_next) {
>> +        QLIST_FOREACH(vbasedev, &group->device_list, next) {
>> +            if (!vbasedev->dirty_pages_supported) {
>> +                return false;
>> +            }
>> +        }
>> +    }
>> +
>> +    return true;
>> +}
>> +
>>   /*
>>    * Check if all VFIO devices are running and migration is active, which is
>>    * essentially equivalent to the migration being in pre-copy phase.
>> @@ -1404,13 +1420,192 @@ static int vfio_set_dirty_page_tracking(VFIOContainer *container, bool start)
>>       return ret;
>>   }
>>
>> +static int vfio_devices_dma_logging_set(VFIOContainer *container,
>> +                                        struct vfio_device_feature *feature)
>> +{
>> +    bool status = (feature->flags & VFIO_DEVICE_FEATURE_MASK) ==
>> +                  VFIO_DEVICE_FEATURE_DMA_LOGGING_START;
>> +    VFIODevice *vbasedev;
>> +    VFIOGroup *group;
>> +    int ret = 0;
>> +
>> +    QLIST_FOREACH(group, &container->group_list, container_next) {
>> +        QLIST_FOREACH(vbasedev, &group->device_list, next) {
>> +            if (vbasedev->dirty_tracking == status) {
>> +                continue;
>> +            }
>> +
>> +            ret = ioctl(vbasedev->fd, VFIO_DEVICE_FEATURE, feature);
>> +            if (ret) {
>> +                ret = -errno;
>> +                error_report("%s: Failed to set DMA logging %s, err %d (%s)",
>> +                             vbasedev->name, status ? "start" : "stop", ret,
>> +                             strerror(errno));
>> +                goto out;
>> +            }
>> +            vbasedev->dirty_tracking = status;
>> +        }
>> +    }
>> +
>> +out:
>> +    return ret;
>> +}
>> +
>> +static int vfio_devices_dma_logging_stop(VFIOContainer *container)
>> +{
>> +    uint64_t buf[DIV_ROUND_UP(sizeof(struct vfio_device_feature),
>> +                              sizeof(uint64_t))] = {};
>> +    struct vfio_device_feature *feature = (struct vfio_device_feature *)buf;
>> +
>> +    feature->argsz = sizeof(buf);
>> +    feature->flags = VFIO_DEVICE_FEATURE_SET;
>> +    feature->flags |= VFIO_DEVICE_FEATURE_DMA_LOGGING_STOP;
>> +
>> +    return vfio_devices_dma_logging_set(container, feature);
>> +}
>> +
>> +static gboolean vfio_device_dma_logging_range_add(DMAMap *map, gpointer data)
>> +{
>> +    struct vfio_device_feature_dma_logging_range **out = data;
>> +    struct vfio_device_feature_dma_logging_range *range = *out;
>> +
>> +    range->iova = map->iova;
>> +    /* IOVATree is inclusive, DMA logging uAPI isn't, so add 1 to length */
>> +    range->length = map->size + 1;
>> +
>> +    *out = ++range;
>> +
>> +    return false;
>> +}
>> +
>> +static gboolean vfio_iova_tree_get_first(DMAMap *map, gpointer data)
>> +{
>> +    DMAMap *first = data;
>> +
>> +    first->iova = map->iova;
>> +    first->size = map->size;
>> +
>> +    return true;
>> +}
>> +
>> +static gboolean vfio_iova_tree_get_last(DMAMap *map, gpointer data)
>> +{
>> +    DMAMap *last = data;
>> +
>> +    last->iova = map->iova;
>> +    last->size = map->size;
>> +
>> +    return false;
>> +}
>> +
>> +static struct vfio_device_feature *
>> +vfio_device_feature_dma_logging_start_create(VFIOContainer *container)
>> +{
>> +    struct vfio_device_feature *feature;
>> +    size_t feature_size;
>> +    struct vfio_device_feature_dma_logging_control *control;
>> +    struct vfio_device_feature_dma_logging_range *ranges;
>> +    unsigned int max_ranges;
>> +    unsigned int cur_ranges;
>> +
>> +    feature_size = sizeof(struct vfio_device_feature) +
>> +                   sizeof(struct vfio_device_feature_dma_logging_control);
>> +    feature = g_malloc0(feature_size);
>> +    feature->argsz = feature_size;
>> +    feature->flags = VFIO_DEVICE_FEATURE_SET;
>> +    feature->flags |= VFIO_DEVICE_FEATURE_DMA_LOGGING_START;
>> +
>> +    control = (struct vfio_device_feature_dma_logging_control *)feature->data;
>> +    control->page_size = qemu_real_host_page_size();
>> +
>> +    QEMU_LOCK_GUARD(&container->mappings_mutex);
>> +
>> +    /*
>> +     * DMA logging uAPI guarantees to support at least num_ranges that fits into
>> +     * a single host kernel page. To be on the safe side, use this as a limit
>> +     * from which to merge to a single range.
>> +     */
>> +    max_ranges = qemu_real_host_page_size() / sizeof(*ranges);
>> +    cur_ranges = iova_tree_nnodes(container->mappings);
>> +    control->num_ranges = (cur_ranges <= max_ranges) ? cur_ranges : 1;
> This makes me suspicious that we're implementing to the characteristics
> of a specific device rather than strictly to the vfio migration API.
> Are we just trying to avoid the error handling to support the try and
> fall back to a single range behavior?  If we want to make a
> simplification, then document it as such.  The "[t]o be on the safe
> side" phrasing above could later be interpreted as avoiding an issue
> and might discourage a more complete implementation.

Yes, it was mainly to make things simple.
I will replace the "To be on the safe side..." phrasing.

Thanks.

>> +    ranges = g_try_new0(struct vfio_device_feature_dma_logging_range,
>> +                        control->num_ranges);
>> +    if (!ranges) {
>> +        g_free(feature);
>> +        errno = ENOMEM;
>> +
>> +        return NULL;
>> +    }
>> +
>> +    control->ranges = (uint64_t)ranges;
>> +    if (cur_ranges <= max_ranges) {
>> +        iova_tree_foreach(container->mappings,
>> +                          vfio_device_dma_logging_range_add, &ranges);
>> +    } else {
>> +        DMAMap first, last;
>> +
>> +        iova_tree_foreach(container->mappings, vfio_iova_tree_get_first,
>> +                          &first);
>> +        iova_tree_foreach(container->mappings, vfio_iova_tree_get_last, &last);
>> +        ranges->iova = first.iova;
>> +        /* IOVATree is inclusive, DMA logging uAPI isn't, so add 1 to length */
>> +        ranges->length = (last.iova - first.iova) + last.size + 1;
>> +    }
>> +
>> +    return feature;
>> +}
>> +
>> +static void vfio_device_feature_dma_logging_start_destroy(
>> +    struct vfio_device_feature *feature)
>> +{
>> +    struct vfio_device_feature_dma_logging_control *control =
>> +        (struct vfio_device_feature_dma_logging_control *)feature->data;
>> +    struct vfio_device_feature_dma_logging_range *ranges =
>> +        (struct vfio_device_feature_dma_logging_range *)control->ranges;
>> +
>> +    g_free(ranges);
>> +    g_free(feature);
>> +}
>> +
>> +static int vfio_devices_dma_logging_start(VFIOContainer *container)
>> +{
>> +    struct vfio_device_feature *feature;
>> +    int ret;
>> +
>> +    feature = vfio_device_feature_dma_logging_start_create(container);
>> +    if (!feature) {
>> +        return -errno;
>> +    }
>> +
>> +    ret = vfio_devices_dma_logging_set(container, feature);
>> +    if (ret) {
>> +        vfio_devices_dma_logging_stop(container);
>> +    }
>> +
>> +    vfio_device_feature_dma_logging_start_destroy(feature);
>> +
>> +    return ret;
>> +}
>> +
>>   static void vfio_listener_log_global_start(MemoryListener *listener)
>>   {
>>       VFIOContainer *container = container_of(listener, VFIOContainer, listener);
>>       int ret;
>>
>> -    ret = vfio_set_dirty_page_tracking(container, true);
>> +    if (vfio_devices_all_device_dirty_tracking(container)) {
>> +        if (vfio_have_giommu(container)) {
>> +            /* Device dirty page tracking currently doesn't support vIOMMU */
>> +            return;
>> +        }
>> +
>> +        ret = vfio_devices_dma_logging_start(container);
>> +    } else {
>> +        ret = vfio_set_dirty_page_tracking(container, true);
>> +    }
>> +
>>       if (ret) {
>> +        error_report("vfio: Could not start dirty page tracking, err: %d (%s)",
>> +                     ret, strerror(-ret));
>>           vfio_set_migration_error(ret);
>>       }
>>   }
>> @@ -1420,8 +1615,20 @@ static void vfio_listener_log_global_stop(MemoryListener *listener)
>>       VFIOContainer *container = container_of(listener, VFIOContainer, listener);
>>       int ret;
>>
>> -    ret = vfio_set_dirty_page_tracking(container, false);
>> +    if (vfio_devices_all_device_dirty_tracking(container)) {
>> +        if (vfio_have_giommu(container)) {
>> +            /* Device dirty page tracking currently doesn't support vIOMMU */
>> +            return;
>> +        }
>> +
>> +        ret = vfio_devices_dma_logging_stop(container);
>> +    } else {
>> +        ret = vfio_set_dirty_page_tracking(container, false);
>> +    }
>> +
>>       if (ret) {
>> +        error_report("vfio: Could not stop dirty page tracking, err: %d (%s)",
>> +                     ret, strerror(-ret));
>>           vfio_set_migration_error(ret);
>>       }
>>   }


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 11/20] vfio/common: Add device dirty page tracking start/stop
  2023-02-23  2:02     ` Jason Gunthorpe
@ 2023-02-23 19:27       ` Alex Williamson
  2023-02-23 19:30         ` Jason Gunthorpe
  0 siblings, 1 reply; 93+ messages in thread
From: Alex Williamson @ 2023-02-23 19:27 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Avihai Horon, qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Maor Gottlieb, Kirti Wankhede, Tarun Gupta,
	Joao Martins

On Wed, 22 Feb 2023 22:02:24 -0400
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Wed, Feb 22, 2023 at 03:40:43PM -0700, Alex Williamson wrote:
> > > +    /*
> > > +     * DMA logging uAPI guarantees to support at least num_ranges that fits into
> > > +     * a single host kernel page. To be on the safe side, use this as a limit
> > > +     * from which to merge to a single range.
> > > +     */
> > > +    max_ranges = qemu_real_host_page_size() / sizeof(*ranges);
> > > +    cur_ranges = iova_tree_nnodes(container->mappings);
> > > +    control->num_ranges = (cur_ranges <= max_ranges) ? cur_ranges : 1;  
> > 
> > This makes me suspicious that we're implementing to the characteristics
> > of a specific device rather than strictly to the vfio migration API.
> > Are we just trying to avoid the error handling to support the try and
> > fall back to a single range behavior?  
> 
> This was what we agreed to when making the kernel patches. Userspace
> is restricted to sending one page's worth of ranges to the kernel, and
> the kernel will always adjust that to whatever smaller list the device needs.
> 
> We added this limit only because we don't want to have a way for
> userspace to consume a lot of kernel memory.
> 
> See LOG_MAX_RANGES in vfio_main.c
> 
> If qemu is in viommu mode and has a huge number of ranges, then it must
> cut the list down before passing it to the kernel.

Ok, that's the kernel implementation, but the uAPI states:

 * The core kernel code guarantees to support by minimum num_ranges that fit
 * into a single kernel page. User space can try higher values but should give
 * up if the above can't be achieved as of some driver limitations.

So again, I think I'm just looking for a better comment that doesn't
add FUD to the reasoning behind switching to a single range, ie. a)
it's easier to deal with given the kernel guarantee and b) the current
kernel implementation imposes a hard limit at page size anyway.  Thanks,

Alex



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 11/20] vfio/common: Add device dirty page tracking start/stop
  2023-02-23 19:27       ` Alex Williamson
@ 2023-02-23 19:30         ` Jason Gunthorpe
  2023-02-23 20:16           ` Alex Williamson
  0 siblings, 1 reply; 93+ messages in thread
From: Jason Gunthorpe @ 2023-02-23 19:30 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Avihai Horon, qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Maor Gottlieb, Kirti Wankhede, Tarun Gupta,
	Joao Martins

On Thu, Feb 23, 2023 at 12:27:23PM -0700, Alex Williamson wrote:
> So again, I think I'm just looking for a better comment that doesn't
> add FUD to the reasoning behind switching to a single range, 

It isn't a single range, it is a single page of ranges, right?

The comment should say

"Keep the implementation simple and use at most a PAGE_SIZE of ranges
because the kernel is guaranteed to be able to parse that"

Jason


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 17/20] vfio/common: Support device dirty page tracking with vIOMMU
  2023-02-23  2:08     ` Jason Gunthorpe
@ 2023-02-23 20:06       ` Alex Williamson
  2023-02-23 20:55         ` Jason Gunthorpe
  0 siblings, 1 reply; 93+ messages in thread
From: Alex Williamson @ 2023-02-23 20:06 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Avihai Horon, qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Maor Gottlieb, Kirti Wankhede, Tarun Gupta,
	Joao Martins

On Wed, 22 Feb 2023 22:08:33 -0400
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Wed, Feb 22, 2023 at 04:34:39PM -0700, Alex Williamson wrote:
> > > +    /*
> > > +     * With vIOMMU we try to track the entire IOVA space. As the IOVA space can
> > > +     * be rather big, devices might not be able to track it due to HW
> > > +     * limitations. In that case:
> > > +     * (1) Retry tracking a smaller part of the IOVA space.
> > > +     * (2) Retry tracking a range in the size of the physical memory.  
> > 
> > This looks really sketchy, why do we think there's a "good enough"
> > value here?  If we get it wrong, the device potentially has access to
> > IOVA space that we're not tracking, right?  
> 
> The idea was the untracked range becomes permanently dirty, so at
> worst this means the migration never converges.

I didn't spot the mechanics where that's implemented, I'll look again.
 
> #2 is the presumption that the guest is using an identity map.

This is a dangerous assumption.

> > I'd think the only viable fallback if the vIOMMU doesn't report its max
> > IOVA is the full 64-bit address space, otherwise it seems like we need
> > to add a migration blocker.  
> 
> This is basically saying vIOMMU doesn't work with migration, and we've
> heard that this isn't OK. There are cases where vIOMMU is on but the
> guest always uses identity maps. eg for virtual interrupt remapping.

Yes, the vIOMMU can be automatically added to a VM when we exceed 255
vCPUs, but I don't see how we can therefore deduce anything about the
usage mode of the vIOMMU.  Users also make use of vfio with vIOMMU for
nested assignment, ie. userspace drivers running within the guest,
where making assumptions about the IOVA extents of the userspace driver
seems dangerous.

Let's backup though, if a device doesn't support the full address width
of the platform, it's the responsibility of the device driver to
implement a DMA mask such that the device is never asked to DMA outside
of its address space support.  Therefore how could a device ever dirty
pages outside of its own limitations?

Isn't it reasonable to require that a device support dirty tracking for
the entire extent of its DMA address width in order to support this
feature?

If we can make those assumptions, then the vfio driver should happily
accept a range exceeding the device's DMA address width capabilities,
knowing that the device cannot dirty anything beyond its addressable
range.

> We also have future problems that nested translation is incompatible
> with device dirty tracking..

:-\  Thanks,

Alex



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 11/20] vfio/common: Add device dirty page tracking start/stop
  2023-02-23 19:30         ` Jason Gunthorpe
@ 2023-02-23 20:16           ` Alex Williamson
  2023-02-23 20:54             ` Jason Gunthorpe
  0 siblings, 1 reply; 93+ messages in thread
From: Alex Williamson @ 2023-02-23 20:16 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Avihai Horon, qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Maor Gottlieb, Kirti Wankhede, Tarun Gupta,
	Joao Martins

On Thu, 23 Feb 2023 15:30:28 -0400
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Thu, Feb 23, 2023 at 12:27:23PM -0700, Alex Williamson wrote:
> > So again, I think I'm just looking for a better comment that doesn't
> > add FUD to the reasoning behind switching to a single range,   
> 
> It isn't a single range, it is a single page of ranges, right?

Exceeding a single page of ranges is the inflection point at which we
switch to a single range.
 
> The comment should say
> 
> "Keep the implementation simple and use at most a PAGE_SIZE of ranges
> because the kernel is guaranteed to be able to parse that"

Something along those lines, yeah.  And bonus points for noting that
the kernel implementation is currently hard coded at this limit, so
there's no point in trying larger arrays as implied in the uAPI.
Thanks,

Alex



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 11/20] vfio/common: Add device dirty page tracking start/stop
  2023-02-23 20:16           ` Alex Williamson
@ 2023-02-23 20:54             ` Jason Gunthorpe
  2023-02-26 16:54               ` Avihai Horon
  0 siblings, 1 reply; 93+ messages in thread
From: Jason Gunthorpe @ 2023-02-23 20:54 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Avihai Horon, qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Maor Gottlieb, Kirti Wankhede, Tarun Gupta,
	Joao Martins

On Thu, Feb 23, 2023 at 01:16:40PM -0700, Alex Williamson wrote:
> On Thu, 23 Feb 2023 15:30:28 -0400
> Jason Gunthorpe <jgg@nvidia.com> wrote:
> 
> > On Thu, Feb 23, 2023 at 12:27:23PM -0700, Alex Williamson wrote:
> > > So again, I think I'm just looking for a better comment that doesn't
> > > add FUD to the reasoning behind switching to a single range,   
> > 
> > It isn't a single range, it is a single page of ranges, right?
> 
> Exceeding a single page of ranges is the inflection point at which we
> switch to a single range.

Oh, that isn't what it should do - it should cut it back to fit in a
page..
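
Concretely, "cut it back" could be a simple coalescing pass over the (sorted,
non-overlapping) range list before handing it to the kernel, with max being
qemu_real_host_page_size() / sizeof(*r) -- a hypothetical sketch, not what the
posted patch does:

    /*
     * Collapse 'n' sorted ranges into at most 'max' ranges by merging
     * groups of neighbours into covering ranges. A merged range also
     * covers the gaps in between, so some extra IOVA space gets tracked,
     * but nothing is lost.
     */
    static unsigned int vfio_merge_ranges(
        struct vfio_device_feature_dma_logging_range *r,
        unsigned int n, unsigned int max)
    {
        unsigned int per_out = DIV_ROUND_UP(n, max);
        unsigned int in, out;

        for (in = 0, out = 0; in < n; out++) {
            unsigned int last = MIN(in + per_out, n) - 1;

            r[out].iova = r[in].iova;
            r[out].length = r[last].iova + r[last].length - r[in].iova;
            in = last + 1;
        }

        return out; /* <= max */
    }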

Jason


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 17/20] vfio/common: Support device dirty page tracking with vIOMMU
  2023-02-23 20:06       ` Alex Williamson
@ 2023-02-23 20:55         ` Jason Gunthorpe
  2023-02-23 21:30           ` Joao Martins
  2023-02-23 22:33           ` Alex Williamson
  0 siblings, 2 replies; 93+ messages in thread
From: Jason Gunthorpe @ 2023-02-23 20:55 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Avihai Horon, qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Maor Gottlieb, Kirti Wankhede, Tarun Gupta,
	Joao Martins

On Thu, Feb 23, 2023 at 01:06:33PM -0700, Alex Williamson wrote:
> > #2 is the presumption that the guest is using an identity map.
> 
> This is a dangerous assumption.
> 
> > > I'd think the only viable fallback if the vIOMMU doesn't report its max
> > > IOVA is the full 64-bit address space, otherwise it seems like we need
> > > to add a migration blocker.  
> > 
> > This is basically saying vIOMMU doesn't work with migration, and we've
> > heard that this isn't OK. There are cases where vIOMMU is on but the
> > guest always uses identity maps. eg for virtual interrupt remapping.
> 
> Yes, the vIOMMU can be automatically added to a VM when we exceed 255
> vCPUs, but I don't see how we can therefore deduce anything about the
> usage mode of the vIOMMU.  

We just lose optimizations. Any mappings that are established outside
the dirty tracking range are permanently dirty. So at worst the guest
can block migration by establishing bad mappings. It is not exactly
production quality but it is still useful for a closed environment
with known guest configurations.
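
Concretely, the "permanently dirty" behavior would amount to something like
this in the dirty bitmap query path (a rough sketch; the helper name is
invented):

    /*
     * Hypothetical: if this IOVA range was never handed to the device
     * tracker (it lies outside the ranges we managed to start), report
     * every page in it as dirty rather than querying the device.
     */
    if (!vfio_iova_range_is_device_tracked(container, iova, size)) {
        bitmap_set(vbmap->bitmap, 0, vbmap->pages);
        return 0;
    }

Correctness is preserved; only convergence suffers for those ranges.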

> nested assignment, ie. userspace drivers running within the guest,
> where making assumptions about the IOVA extents of the userspace driver
> seems dangerous.
>
> Let's backup though, if a device doesn't support the full address width
> of the platform, it's the responsibility of the device driver to
> implement a DMA mask such that the device is never asked to DMA outside
> of its address space support.  Therefore how could a device ever dirty
> pages outside of its own limitations?

The device always supports the full address space. We can't enforce
any kind of limit on the VM

It just can't dirty track it all.

> Isn't it reasonable to require that a device support dirty tracking for
> > the entire extent of its DMA address width in order to support this
> feature?

No, 2**64 is too big a number to be reasonable.

Ideally we'd work it the other way and tell the vIOMMU that the vHW
only supports a limited number of address bits for the translation, eg
through the ACPI tables. Then the dirty tracking could safely cover
the larger of all system memory or the limited IOVA address space.

Or even better figure out how to get interrupt remapping without IOMMU
support :\

Jason


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 10/20] vfio/common: Record DMA mapped IOVA ranges
  2023-02-23 10:37     ` Joao Martins
@ 2023-02-23 21:05       ` Alex Williamson
  2023-02-23 21:19         ` Joao Martins
  0 siblings, 1 reply; 93+ messages in thread
From: Alex Williamson @ 2023-02-23 21:05 UTC (permalink / raw)
  To: Joao Martins
  Cc: Avihai Horon, qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta

On Thu, 23 Feb 2023 10:37:10 +0000
Joao Martins <joao.m.martins@oracle.com> wrote:

> On 22/02/2023 22:10, Alex Williamson wrote:
> > On Wed, 22 Feb 2023 19:49:05 +0200
> > Avihai Horon <avihaih@nvidia.com> wrote:  
> >> From: Joao Martins <joao.m.martins@oracle.com>
> >> @@ -612,6 +665,16 @@ static int vfio_dma_map(VFIOContainer *container, hwaddr iova,
> >>          .iova = iova,
> >>          .size = size,
> >>      };
> >> +    int ret;
> >> +
> >> +    ret = vfio_record_mapping(container, iova, size, readonly);
> >> +    if (ret) {
> >> +        error_report("vfio: Failed to record mapping, iova: 0x%" HWADDR_PRIx
> >> +                     ", size: 0x" RAM_ADDR_FMT ", ret: %d (%s)",
> >> +                     iova, size, ret, strerror(-ret));
> >> +
> >> +        return ret;
> >> +    }  
> > 
> > Is there no way to replay the mappings when a migration is started?
> > This seems like a horrible latency and bloat trade-off for the
> > possibility that the VM might migrate and the device might support
> > these features.  Our performance with vIOMMU is already terrible, I
> > can't help but believe this makes it worse.  Thanks,
> >   
> 
> It is a nop if the vIOMMU is being used (entries in container->giommu_list) as
> that uses a max-iova based IOVA range. So this is really for iommu identity
> mapping and no-VIOMMU.

Ok, yes, there are no mappings recorded for any containers that have a
non-empty giommu_list.

> We could replay them if they were tracked/stored anywhere.

Rather than piggybacking on vfio_memory_listener, why not simply
register a new MemoryListener when migration is started?  That will
replay all the existing ranges and allow tracking to happen separate
from mapping, and only when needed.
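
Something along these lines (a rough sketch with invented names --
tracking_listener and vfio_dirty_tracking_record() -- just to show the shape):

    static void vfio_dirty_tracking_add(MemoryListener *listener,
                                        MemoryRegionSection *section)
    {
        VFIOContainer *container = container_of(listener, VFIOContainer,
                                                tracking_listener);
        hwaddr iova = section->offset_within_address_space;
        hwaddr end = iova + int128_get64(section->size);

        if (!memory_region_is_ram(section->mr)) {
            return;
        }

        /* stash [iova, end) so the DMA logging start ioctl can use it */
        vfio_dirty_tracking_record(container, iova, end);
    }

    /* in the log_global_start path, before starting device tracking */
    container->tracking_listener = (MemoryListener) {
        .name = "vfio-dirty-tracking",
        .region_add = vfio_dirty_tracking_add,
    };
    memory_listener_register(&container->tracking_listener,
                             container->space->as);
    /* all existing sections have been replayed into region_add by now */
    ret = vfio_devices_dma_logging_start(container);
    memory_listener_unregister(&container->tracking_listener);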

> I suppose we could move the vfio_devices_all_device_dirty_tracking() into this
> patch and then conditionally call this vfio_{record,erase}_mapping() in case we
> are passing through a device that doesn't have live-migration support? Would
> that address the impact you're concerned wrt to non-live-migrateable devices?
> 
> On the other hand, the PCI device hotplug hypothetical even makes this a bit
> complicated as we can still attempt to hotplug a device before migration is even
> attempted. Meaning that we start with live-migrateable devices, and we added the
> tracking, up to hotpluging a device without such support (adding a blocker)
> leaving the mappings there with no further use. So it felt simpler to just track
> always and avoid any mappings recording if the vIOMMU is in active use?

My preference would be that there's no runtime overhead for migration
support until a migration is initiated.  I currently don't see why we
can't achieve that by dynamically adding a new MemoryListener around
migration for that purpose.  Do you?  Thanks,

Alex



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 03/20] vfio/migration: Add VFIO migration pre-copy support
  2023-02-23 15:25     ` Avihai Horon
@ 2023-02-23 21:16       ` Alex Williamson
  2023-02-26 16:43         ` Avihai Horon
  0 siblings, 1 reply; 93+ messages in thread
From: Alex Williamson @ 2023-02-23 21:16 UTC (permalink / raw)
  To: Avihai Horon
  Cc: qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta, Joao Martins

On Thu, 23 Feb 2023 17:25:12 +0200
Avihai Horon <avihaih@nvidia.com> wrote:

> On 22/02/2023 22:58, Alex Williamson wrote:
> > On Wed, 22 Feb 2023 19:48:58 +0200
> > Avihai Horon <avihaih@nvidia.com> wrote:
> >  
> >> @@ -302,23 +380,44 @@ static void vfio_save_cleanup(void *opaque)
> >>       trace_vfio_save_cleanup(vbasedev->name);
> >>   }
> >>
> >> +static void vfio_state_pending_estimate(void *opaque, uint64_t threshold_size,
> >> +                                        uint64_t *must_precopy,
> >> +                                        uint64_t *can_postcopy)
> >> +{
> >> +    VFIODevice *vbasedev = opaque;
> >> +    VFIOMigration *migration = vbasedev->migration;
> >> +
> >> +    if (migration->device_state != VFIO_DEVICE_STATE_PRE_COPY) {
> >> +        return;
> >> +    }
> >> +
> >> +    /*
> >> +     * Initial size should be transferred during pre-copy phase so stop-copy
> >> +     * phase will not be slowed down. Report threshold_size to force another
> >> +     * pre-copy iteration.
> >> +     */
> >> +    *must_precopy += migration->precopy_init_size ?
> >> +                         threshold_size :
> >> +                         migration->precopy_dirty_size;  
> > This sure feels like we're feeding false data back to the iterator to
> > spoof it to run another iteration, when the vfio migration protocol
> > only recommends that initial_bytes reaches zero before proceeding to
> > stop-copy, it's not a requirement.  What benefit is actually observed
> > from this?  Why is this required for initial pre-copy support?  It
> > seems devious.  
> 
> As previously discussed in the thread that added the pre-copy uAPI [1], 
> the init_bytes can be used by drivers to reduce the downtime.
> For example, mlx5 transfers some metadata to the target so it will be 
> able to pre-allocate resources etc.
> 
> [1] 
> https://lore.kernel.org/kvm/ae4a6259-349d-0131-896c-7a6ea775cc9e@nvidia.com/

Yes, but how does that become a requirement on QEMU that it must
iterate until the initial segment is complete?  Especially when we need
to trigger that behavior via such nefarious means.  AIUI, QEMU should
be allowed to move to stop-copy at any point.  We should make efforts
to ensure QEMU never decides on its own to move from pre-copy to
stop-copy without completing the init_bytes (which sounds suspiciously
like the purpose of @must_precopy), but if, for instance, a user forces a
transition to stop-copy, I don't see that we have any business imposing
a policy that delays it until the init_bytes is complete.  Thanks,

Alex



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 10/20] vfio/common: Record DMA mapped IOVA ranges
  2023-02-23 21:05       ` Alex Williamson
@ 2023-02-23 21:19         ` Joao Martins
  2023-02-23 21:50           ` Alex Williamson
  0 siblings, 1 reply; 93+ messages in thread
From: Joao Martins @ 2023-02-23 21:19 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Avihai Horon, qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta

On 23/02/2023 21:05, Alex Williamson wrote:
> On Thu, 23 Feb 2023 10:37:10 +0000
> Joao Martins <joao.m.martins@oracle.com> wrote:
>> On 22/02/2023 22:10, Alex Williamson wrote:
>>> On Wed, 22 Feb 2023 19:49:05 +0200
>>> Avihai Horon <avihaih@nvidia.com> wrote:  
>>>> From: Joao Martins <joao.m.martins@oracle.com>
>>>> @@ -612,6 +665,16 @@ static int vfio_dma_map(VFIOContainer *container, hwaddr iova,
>>>>          .iova = iova,
>>>>          .size = size,
>>>>      };
>>>> +    int ret;
>>>> +
>>>> +    ret = vfio_record_mapping(container, iova, size, readonly);
>>>> +    if (ret) {
>>>> +        error_report("vfio: Failed to record mapping, iova: 0x%" HWADDR_PRIx
>>>> +                     ", size: 0x" RAM_ADDR_FMT ", ret: %d (%s)",
>>>> +                     iova, size, ret, strerror(-ret));
>>>> +
>>>> +        return ret;
>>>> +    }  
>>>
>>> Is there no way to replay the mappings when a migration is started?
>>> This seems like a horrible latency and bloat trade-off for the
>>> possibility that the VM might migrate and the device might support
>>> these features.  Our performance with vIOMMU is already terrible, I
>>> can't help but believe this makes it worse.  Thanks,
>>>   
>>
>> It is a nop if the vIOMMU is being used (entries in container->giommu_list) as
>> that uses a max-iova based IOVA range. So this is really for iommu identity
>> mapping and no-VIOMMU.
> 
> Ok, yes, there are no mappings recorded for any containers that have a
> non-empty giommu_list.
> 
>> We could replay them if they were tracked/stored anywhere.
> 
> Rather than piggybacking on vfio_memory_listener, why not simply
> register a new MemoryListener when migration is started?  That will
> replay all the existing ranges and allow tracking to happen separate
> from mapping, and only when needed.
> 

The problem with that is that *starting* dirty tracking needs to have all the
ranges up front; we aren't supposed to start each range separately. So in a
memory listener callback we can't tell when we are dealing with the last
range, can we?

>> I suppose we could move the vfio_devices_all_device_dirty_tracking() into this
>> patch and then conditionally call this vfio_{record,erase}_mapping() in case we
>> are passing through a device that doesn't have live-migration support? Would
>> that address the impact you're concerned wrt to non-live-migrateable devices?
>>
>> On the other hand, the PCI device hotplug hypothetical even makes this a bit
>> complicated as we can still attempt to hotplug a device before migration is even
>> attempted. Meaning that we start with live-migrateable devices, and we added the
>> tracking, up to hotpluging a device without such support (adding a blocker)
>> leaving the mappings there with no further use. So it felt simpler to just track
>> always and avoid any mappings recording if the vIOMMU is in active use?
> 
> My preference would be that there's no runtime overhead for migration
> support until a migration is initiated.  I currently don't see why we
> can't achieve that by dynamically adding a new MemoryListener around
> migration for that purpose.  Do you?  Thanks,

I definitely agree with the general sentiment of being more dynamic, but perhaps
I am not seeing how.


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 17/20] vfio/common: Support device dirty page tracking with vIOMMU
  2023-02-23 20:55         ` Jason Gunthorpe
@ 2023-02-23 21:30           ` Joao Martins
  2023-02-23 22:33           ` Alex Williamson
  1 sibling, 0 replies; 93+ messages in thread
From: Joao Martins @ 2023-02-23 21:30 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson
  Cc: Avihai Horon, qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Maor Gottlieb, Kirti Wankhede, Tarun Gupta

On 23/02/2023 20:55, Jason Gunthorpe wrote:
> On Thu, Feb 23, 2023 at 01:06:33PM -0700, Alex Williamson wrote:
>>> #2 is the presumption that the guest is using an identity map.
>> Isn't it reasonable to require that a device support dirty tracking for
>> the entire extent if its DMA address width in order to support this
>> feature?
> 
> No, 2**64 is too big a number to be reasonable.
> 
+1

> Ideally we'd work it the other way and tell the vIOMMU that the vHW
> only supports a limited number of address bits for the translation, eg
> through the ACPI tables. Then the dirty tracking could safely cover
> the larger of all system memory or the limited IOVA address space.
> 
> Or even better figure out how to get interrupt remapping without IOMMU
> support :\

FWIW That's generally my use of `iommu=pt` because all I want is interrupt
remapping, not the DMA remapping part. And this is going to be especially
relevant with these new boxes that easily surpass the >255 dedicated physical
CPUs mark with just two sockets.

The only other alternative I could see is to rely on an IOMMU attribute for DMA
translation. Today you can actually toggle that 'off' in VT-d (and I can imagine
the same thing working for AMD-vIOMMU). In Intel it just omits the 39-bit
address-width cap, which means it doesn't have virtual addressing. Similar to
what Avihai already does for MAX_IOVA, we would do the same for DMA_TRANSLATION,
and let each vIOMMU implementation support that.

But to be honest I am not sure how robust relying on that is, as it doesn't
really represent a hardware implementation. Without vIOMMU you have a (KVM) PV
op in new *guest* kernels that (ab)uses some unused bits in the IOAPIC for a
24-bit DestID. But this only works on new guests and hypervisors; old *guests*
running kernels older than 5.15 won't work.

... So iommu=pt really is the most convenient right now :/

	Joao


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 10/20] vfio/common: Record DMA mapped IOVA ranges
  2023-02-23 21:19         ` Joao Martins
@ 2023-02-23 21:50           ` Alex Williamson
  2023-02-23 21:54             ` Joao Martins
  2023-02-28 12:11             ` Joao Martins
  0 siblings, 2 replies; 93+ messages in thread
From: Alex Williamson @ 2023-02-23 21:50 UTC (permalink / raw)
  To: Joao Martins
  Cc: Avihai Horon, qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta

On Thu, 23 Feb 2023 21:19:12 +0000
Joao Martins <joao.m.martins@oracle.com> wrote:

> On 23/02/2023 21:05, Alex Williamson wrote:
> > On Thu, 23 Feb 2023 10:37:10 +0000
> > Joao Martins <joao.m.martins@oracle.com> wrote:  
> >> On 22/02/2023 22:10, Alex Williamson wrote:  
> >>> On Wed, 22 Feb 2023 19:49:05 +0200
> >>> Avihai Horon <avihaih@nvidia.com> wrote:    
> >>>> From: Joao Martins <joao.m.martins@oracle.com>
> >>>> @@ -612,6 +665,16 @@ static int vfio_dma_map(VFIOContainer *container, hwaddr iova,
> >>>>          .iova = iova,
> >>>>          .size = size,
> >>>>      };
> >>>> +    int ret;
> >>>> +
> >>>> +    ret = vfio_record_mapping(container, iova, size, readonly);
> >>>> +    if (ret) {
> >>>> +        error_report("vfio: Failed to record mapping, iova: 0x%" HWADDR_PRIx
> >>>> +                     ", size: 0x" RAM_ADDR_FMT ", ret: %d (%s)",
> >>>> +                     iova, size, ret, strerror(-ret));
> >>>> +
> >>>> +        return ret;
> >>>> +    }    
> >>>
> >>> Is there no way to replay the mappings when a migration is started?
> >>> This seems like a horrible latency and bloat trade-off for the
> >>> possibility that the VM might migrate and the device might support
> >>> these features.  Our performance with vIOMMU is already terrible, I
> >>> can't help but believe this makes it worse.  Thanks,
> >>>     
> >>
> >> It is a nop if the vIOMMU is being used (entries in container->giommu_list) as
> >> that uses a max-iova based IOVA range. So this is really for iommu identity
> >> mapping and no-VIOMMU.  
> > 
> > Ok, yes, there are no mappings recorded for any containers that have a
> > non-empty giommu_list.
> >   
> >> We could replay them if they were tracked/stored anywhere.  
> > 
> > Rather than piggybacking on vfio_memory_listener, why not simply
> > register a new MemoryListener when migration is started?  That will
> > replay all the existing ranges and allow tracking to happen separate
> > from mapping, and only when needed.
> >   
> 
> The problem with that is that *starting* dirty tracking needs to have all the
> ranges up front; we aren't supposed to start each range separately. So in a
> memory listener callback we can't tell when we are dealing with the last
> range, can we?

As soon as memory_listener_register() returns, all your callbacks to
build the IOVATree have been called and you can act on the result the
same as if you were relying on the vfio mapping MemoryListener.  I'm
not seeing the problem.  Thanks,

Alex



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 10/20] vfio/common: Record DMA mapped IOVA ranges
  2023-02-23 21:50           ` Alex Williamson
@ 2023-02-23 21:54             ` Joao Martins
  2023-02-28 12:11             ` Joao Martins
  1 sibling, 0 replies; 93+ messages in thread
From: Joao Martins @ 2023-02-23 21:54 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Avihai Horon, qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta

On 23/02/2023 21:50, Alex Williamson wrote:
> On Thu, 23 Feb 2023 21:19:12 +0000
> Joao Martins <joao.m.martins@oracle.com> wrote:
> 
>> On 23/02/2023 21:05, Alex Williamson wrote:
>>> On Thu, 23 Feb 2023 10:37:10 +0000
>>> Joao Martins <joao.m.martins@oracle.com> wrote:  
>>>> On 22/02/2023 22:10, Alex Williamson wrote:  
>>>>> On Wed, 22 Feb 2023 19:49:05 +0200
>>>>> Avihai Horon <avihaih@nvidia.com> wrote:    
>>>>>> From: Joao Martins <joao.m.martins@oracle.com>
>>>>>> @@ -612,6 +665,16 @@ static int vfio_dma_map(VFIOContainer *container, hwaddr iova,
>>>>>>          .iova = iova,
>>>>>>          .size = size,
>>>>>>      };
>>>>>> +    int ret;
>>>>>> +
>>>>>> +    ret = vfio_record_mapping(container, iova, size, readonly);
>>>>>> +    if (ret) {
>>>>>> +        error_report("vfio: Failed to record mapping, iova: 0x%" HWADDR_PRIx
>>>>>> +                     ", size: 0x" RAM_ADDR_FMT ", ret: %d (%s)",
>>>>>> +                     iova, size, ret, strerror(-ret));
>>>>>> +
>>>>>> +        return ret;
>>>>>> +    }    
>>>>>
>>>>> Is there no way to replay the mappings when a migration is started?
>>>>> This seems like a horrible latency and bloat trade-off for the
>>>>> possibility that the VM might migrate and the device might support
>>>>> these features.  Our performance with vIOMMU is already terrible, I
>>>>> can't help but believe this makes it worse.  Thanks,
>>>>>     
>>>>
>>>> It is a nop if the vIOMMU is being used (entries in container->giommu_list) as
>>>> that uses a max-iova based IOVA range. So this is really for iommu identity
>>>> mapping and no-VIOMMU.  
>>>
>>> Ok, yes, there are no mappings recorded for any containers that have a
>>> non-empty giommu_list.
>>>   
>>>> We could replay them if they were tracked/stored anywhere.  
>>>
>>> Rather than piggybacking on vfio_memory_listener, why not simply
>>> register a new MemoryListener when migration is started?  That will
>>> replay all the existing ranges and allow tracking to happen separate
>>> from mapping, and only when needed.
>>>   
>>
>> The problem with that is that *starting* dirty tracking needs to have all the
>> ranges up front; we aren't supposed to start each range separately. So in a
>> memory listener callback we can't tell when we are dealing with the last
>> range, can we?
> 
> As soon as memory_listener_register() returns, all your callbacks to
> build the IOVATree have been called and you can act on the result the
> same as if you were relying on the vfio mapping MemoryListener.  I'm
> not seeing the problem.  Thanks,

I was just checking memory_global_dirty_log_start() (i.e. where dirty tracking
gets enabled) and yes, you're definitely right.

I thought this was asynchronous given that there are so many MRs, but I must be
confusing it with something else.


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 17/20] vfio/common: Support device dirty page tracking with vIOMMU
  2023-02-23 20:55         ` Jason Gunthorpe
  2023-02-23 21:30           ` Joao Martins
@ 2023-02-23 22:33           ` Alex Williamson
  2023-02-23 23:26             ` Jason Gunthorpe
  1 sibling, 1 reply; 93+ messages in thread
From: Alex Williamson @ 2023-02-23 22:33 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Avihai Horon, qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Maor Gottlieb, Kirti Wankhede, Tarun Gupta,
	Joao Martins

On Thu, 23 Feb 2023 16:55:54 -0400
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Thu, Feb 23, 2023 at 01:06:33PM -0700, Alex Williamson wrote:
> > > #2 is the presumption that the guest is using an identity map.  
> > 
> > This is a dangerous assumption.
> >   
> > > > I'd think the only viable fallback if the vIOMMU doesn't report its max
> > > > IOVA is the full 64-bit address space, otherwise it seems like we need
> > > > to add a migration blocker.    
> > > 
> > > This is basically saying vIOMMU doesn't work with migration, and we've
> > > heard that this isn't OK. There are cases where vIOMMU is on but the
> > > guest always uses identity maps. eg for virtual interrupt remapping.  
> > 
> > Yes, the vIOMMU can be automatically added to a VM when we exceed 255
> > vCPUs, but I don't see how we can therefore deduce anything about the
> > usage mode of the vIOMMU.    
> 
> > We just lose optimizations. Any mappings that are established outside
> the dirty tracking range are permanently dirty. So at worst the guest
> can block migration by establishing bad mappings. It is not exactly
> production quality but it is still useful for a closed environment
> with known guest configurations.

That doesn't seem to be what happens in this series, nor does it really
make sense to me that userspace would simply decide to truncate the
dirty tracking ranges array.

> > nested assignment, ie. userspace drivers running within the guest,
> > where making assumptions about the IOVA extents of the userspace driver
> > seems dangerous.
> >
> > Let's backup though, if a device doesn't support the full address width
> > of the platform, it's the responsibility of the device driver to
> > implement a DMA mask such that the device is never asked to DMA outside
> > of its address space support.  Therefore how could a device ever dirty
> > pages outside of its own limitations?  
> 
> The device always supports the full address space. We can't enforce
> any kind of limit on the VM
> 
> It just can't dirty track it all.
> 
> > Isn't it reasonable to require that a device support dirty tracking for
> > the entire extent of its DMA address width in order to support this
> > feature?  
> 
> No, 2**64 is too big a number to be reasonable.

So what are the actual restrictions we're dealing with here?  I think it
would help us collaborate on a solution if we didn't have these
device-specific restrictions sprinkled through the base implementation.

> Ideally we'd work it the other way and tell the vIOMMU that the vHW
> only supports a limited number of address bits for the translation, eg
> through the ACPI tables. Then the dirty tracking could safely cover
> the larger of all system memory or the limited IOVA address space.

Why can't we do that?  Hotplug is an obvious issue, but maybe it's not
vHW telling the vIOMMU a restriction, maybe it's a QEMU machine or
vIOMMU option and if it's not set to something the device can support,
migration is blocked.
 
> Or even better figure out how to get interrupt remapping without IOMMU
> support :\

-machine q35,default_bus_bypass_iommu=on,kernel-irqchip=split \
-device intel-iommu,caching-mode=on,intremap=on

Thanks,
Alex



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 17/20] vfio/common: Support device dirty page tracking with vIOMMU
  2023-02-23 22:33           ` Alex Williamson
@ 2023-02-23 23:26             ` Jason Gunthorpe
  2023-02-24 11:25               ` Joao Martins
  0 siblings, 1 reply; 93+ messages in thread
From: Jason Gunthorpe @ 2023-02-23 23:26 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Avihai Horon, qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Maor Gottlieb, Kirti Wankhede, Tarun Gupta,
	Joao Martins

On Thu, Feb 23, 2023 at 03:33:09PM -0700, Alex Williamson wrote:
> On Thu, 23 Feb 2023 16:55:54 -0400
> Jason Gunthorpe <jgg@nvidia.com> wrote:
> 
> > On Thu, Feb 23, 2023 at 01:06:33PM -0700, Alex Williamson wrote:
> > > > #2 is the presumption that the guest is using an identity map.  
> > > 
> > > This is a dangerous assumption.
> > >   
> > > > > I'd think the only viable fallback if the vIOMMU doesn't report its max
> > > > > IOVA is the full 64-bit address space, otherwise it seems like we need
> > > > > to add a migration blocker.    
> > > > 
> > > > This is basically saying vIOMMU doesn't work with migration, and we've
> > > > heard that this isn't OK. There are cases where vIOMMU is on but the
> > > > guest always uses identity maps. eg for virtual interrupt remapping.  
> > > 
> > > Yes, the vIOMMU can be automatically added to a VM when we exceed 255
> > > vCPUs, but I don't see how we can therefore deduce anything about the
> > > usage mode of the vIOMMU.    
> > 
> > We just lose optimizations. Any mappings that are established outside
> > the dirty tracking range are permanently dirty. So at worst the guest
> > can block migration by establishing bad mappings. It is not exactly
> > production quality but it is still useful for a closed environment
> > with known guest configurations.
> 
> That doesn't seem to be what happens in this series, 

Seems like something is missed then

> nor does it really make sense to me that userspace would simply
> decide to truncate the dirty tracking ranges array.

Who else would do it?

> > No, 2**64 is too big a number to be reasonable.
> 
> So what are the actual restrictions we're dealing with here?  I think it
> would help us collaborate on a solution if we didn't have these
> device-specific restrictions sprinkled through the base implementation.

Hmm? It was always like this: the driver gets to decide whether it accepts
the proposed tracking ranges or not. Given how the implementation has
to work, there is no device that could do 2**64...

At least for mlx5 it is in the multi-TB range. Enough for physical
memory on any real server.

> > Ideally we'd work it the other way and tell the vIOMMU that the vHW
> > only supports a limited number of address bits for the translation, eg
> > through the ACPI tables. Then the dirty tracking could safely cover
> > the larger of all system memory or the limited IOVA address space.
> 
> Why can't we do that?  Hotplug is an obvious issue, but maybe it's not
> vHW telling the vIOMMU a restriction, maybe it's a QEMU machine or
> vIOMMU option and if it's not set to something the device can support,
> migration is blocked.

I don't know, maybe we should if we can.

> > Or even better figure out how to get interrupt remapping without IOMMU
> > support :\
> 
> -machine q35,default_bus_bypass_iommu=on,kernel-irqchip=split \
> -device intel-iommu,caching-mode=on,intremap=on

Joao?

If this works, let's just block migration if the vIOMMU is turned on...

Jason


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 17/20] vfio/common: Support device dirty page tracking with vIOMMU
  2023-02-23 23:26             ` Jason Gunthorpe
@ 2023-02-24 11:25               ` Joao Martins
  2023-02-24 12:53                 ` Joao Martins
  0 siblings, 1 reply; 93+ messages in thread
From: Joao Martins @ 2023-02-24 11:25 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson
  Cc: Avihai Horon, qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Maor Gottlieb, Kirti Wankhede, Tarun Gupta

On 23/02/2023 23:26, Jason Gunthorpe wrote:
> On Thu, Feb 23, 2023 at 03:33:09PM -0700, Alex Williamson wrote:
>> On Thu, 23 Feb 2023 16:55:54 -0400
>> Jason Gunthorpe <jgg@nvidia.com> wrote:
>>> On Thu, Feb 23, 2023 at 01:06:33PM -0700, Alex Williamson wrote:
>>> Or even better figure out how to get interrupt remapping without IOMMU
>>> support :\
>>
>> -machine q35,default_bus_bypass_iommu=on,kernel-irqchip=split \
>> -device intel-iommu,caching-mode=on,intremap=on
> 
> Joao?
> 
> If this works lets just block migration if the vIOMMU is turned on..

At first glance, this looked like my regular iommu incantation.

But reading the code, this ::bypass_iommu (new to me) apparently controls
whether the vIOMMU is bypassed for the PCI devices, all the way to avoiding
their enumeration in the IVRS/DMAR ACPI tables. And I see VFIO double-checks
whether a PCI device is within the IOMMU address space (or bypassed) prior to
DMA maps and such.

You can see from the other email that all of the other options in my head were
either a bit inconvenient or risky. I wasn't aware of this option, for what
it's worth -- much simpler, should work!

And avoiding vIOMMU simplifies the whole patchset too, if it's OK to add a live
migration blocker if `bypass_iommu` is off for any PCI device.
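
For illustration, a minimal sketch of what such a blocker could look like,
assuming the check is done when the device is realized, that
pci_device_iommu_address_space() is a reasonable way to detect the bypass
case, and that the per-device migration_blocker field and the function name
are hypothetical -- this is not the actual patch:

    /* Sketch: block migration when the device sits behind a vIOMMU. */
    static int vfio_block_migration_if_viommu(VFIOPCIDevice *vdev, Error **errp)
    {
        PCIDevice *pdev = &vdev->pdev;

        /* A bypassed device uses the plain system memory address space. */
        if (pci_device_iommu_address_space(pdev) == &address_space_memory) {
            return 0;
        }

        error_setg(&vdev->migration_blocker,   /* hypothetical field */
                   "%s: migration is not supported with vIOMMU translation",
                   vdev->vbasedev.name);
        return migrate_add_blocker(vdev->migration_blocker, errp);
    }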

	Joao


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 17/20] vfio/common: Support device dirty page tracking with vIOMMU
  2023-02-24 11:25               ` Joao Martins
@ 2023-02-24 12:53                 ` Joao Martins
  2023-02-24 15:47                   ` Jason Gunthorpe
  2023-02-24 15:56                   ` Alex Williamson
  0 siblings, 2 replies; 93+ messages in thread
From: Joao Martins @ 2023-02-24 12:53 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson
  Cc: Avihai Horon, qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Maor Gottlieb, Kirti Wankhede, Tarun Gupta

On 24/02/2023 11:25, Joao Martins wrote:
> On 23/02/2023 23:26, Jason Gunthorpe wrote:
>> On Thu, Feb 23, 2023 at 03:33:09PM -0700, Alex Williamson wrote:
>>> On Thu, 23 Feb 2023 16:55:54 -0400
>>> Jason Gunthorpe <jgg@nvidia.com> wrote:
>>>> On Thu, Feb 23, 2023 at 01:06:33PM -0700, Alex Williamson wrote:
>>>> Or even better figure out how to get interrupt remapping without IOMMU
>>>> support :\
>>>
>>> -machine q35,default_bus_bypass_iommu=on,kernel-irqchip=split \
>>> -device intel-iommu,caching-mode=on,intremap=on
>>
>> Joao?
>>
>> If this works lets just block migration if the vIOMMU is turned on..
> 
> At a first glance, this looked like my regular iommu incantation.
> 
> But reading the code this ::bypass_iommu (new to me) apparently tells that
> vIOMMU is bypassed or not for the PCI devices all the way to avoiding
> enumerating in the IVRS/DMAR ACPI tables. And I see VFIO double-checks whether
> PCI device is within the IOMMU address space (or bypassed) prior to DMA maps and
> such.
> 
> You can see from the other email that all of the other options in my head were
> either bit inconvenient or risky. I wasn't aware of this option for what is
> worth -- much simpler, should work!
>

I say *should*, but on second thought interrupt remapping may still be
required for one of these IOMMU-bypassed devices. Say, to set affinities for
vCPUs above 255? I was trying this out with more than 255 vCPUs and a couple
of VFs, and at first glance these VFs fail to probe (these are CX6 VFs).

It is a working setup without the parameter, but adding
default_bus_bypass_iommu=on now fails to init the VFs:

[   32.412733] mlx5_core 0000:00:02.0: Rate limit: 127 rates are supported,
range: 0Mbps to 97656Mbps
[   32.416242] mlx5_core 0000:00:02.0: mlx5_load:1204:(pid 3361): Failed to
alloc IRQs
[   33.227852] mlx5_core 0000:00:02.0: probe_one:1684:(pid 3361): mlx5_init_one
failed with error code -19
[   33.242182] mlx5_core 0000:00:03.0: firmware version: 22.31.1660
[   33.415876] mlx5_core 0000:00:03.0: Rate limit: 127 rates are supported,
range: 0Mbps to 97656Mbps
[   33.448016] mlx5_core 0000:00:03.0: mlx5_load:1204:(pid 3361): Failed to
alloc IRQs
[   34.207532] mlx5_core 0000:00:03.0: probe_one:1684:(pid 3361): mlx5_init_one
failed with error code -19

I haven't dug into why it fails yet.

> And avoiding vIOMMU simplifies the whole patchset too, if it's OK to add a live
> migration blocker if `bypass_iommu` is off for any PCI device.
> 

Still, for starters we could have a live migration blocker until we revisit
the vIOMMU case ... or should we deem default_bus_bypass_iommu=on and the
others I suggested to be non-options?


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 17/20] vfio/common: Support device dirty page tracking with vIOMMU
  2023-02-24 12:53                 ` Joao Martins
@ 2023-02-24 15:47                   ` Jason Gunthorpe
  2023-02-24 15:56                   ` Alex Williamson
  1 sibling, 0 replies; 93+ messages in thread
From: Jason Gunthorpe @ 2023-02-24 15:47 UTC (permalink / raw)
  To: Joao Martins
  Cc: Alex Williamson, Avihai Horon, qemu-devel, Cédric Le Goater,
	Juan Quintela, Dr. David Alan Gilbert, Michael S. Tsirkin,
	Peter Xu, Jason Wang, Marcel Apfelbaum, Paolo Bonzini,
	Richard Henderson, Eduardo Habkost, David Hildenbrand,
	Philippe Mathieu-Daudé,
	Yishai Hadas, Maor Gottlieb, Kirti Wankhede, Tarun Gupta

On Fri, Feb 24, 2023 at 12:53:26PM +0000, Joao Martins wrote:
> > But reading the code this ::bypass_iommu (new to me) apparently tells that
> > vIOMMU is bypassed or not for the PCI devices all the way to avoiding
> > enumerating in the IVRS/DMAR ACPI tables. And I see VFIO double-checks whether
> > PCI device is within the IOMMU address space (or bypassed) prior to DMA maps and
> > such.
> > 
> > You can see from the other email that all of the other options in my head were
> > either bit inconvenient or risky. I wasn't aware of this option for what is
> > worth -- much simpler, should work!
> >
> 
> I say *should*, but on a second thought interrupt remapping may still be
> required to one of these devices that are IOMMU-bypassed. Say to put affinities
> to vcpus above 255? I was trying this out with more than 255 vcpus with a couple
> VFs and at a first glance these VFs fail to probe (these are CX6
> VFs).

It is pretty bizarre, but the Intel iommu driver is responsible for
installing the interrupt remapping irq driver on the devices.

So if there is no iommu driver bound, then there won't be any interrupt
remapping capability for the device, even if the interrupt remapping HW is
otherwise set up.

The only reason Avihai is touching this is to try and keep the interrupt
remapping emulation usable; we could certainly punt on that for now if it
looks too ugly.

Jason


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 17/20] vfio/common: Support device dirty page tracking with vIOMMU
  2023-02-24 12:53                 ` Joao Martins
  2023-02-24 15:47                   ` Jason Gunthorpe
@ 2023-02-24 15:56                   ` Alex Williamson
  2023-02-24 19:16                     ` Joao Martins
  1 sibling, 1 reply; 93+ messages in thread
From: Alex Williamson @ 2023-02-24 15:56 UTC (permalink / raw)
  To: Joao Martins
  Cc: Jason Gunthorpe, Avihai Horon, qemu-devel, Cédric Le Goater,
	Juan Quintela, Dr. David Alan Gilbert, Michael S. Tsirkin,
	Peter Xu, Jason Wang, Marcel Apfelbaum, Paolo Bonzini,
	Richard Henderson, Eduardo Habkost, David Hildenbrand,
	Philippe Mathieu-Daudé,
	Yishai Hadas, Maor Gottlieb, Kirti Wankhede, Tarun Gupta

On Fri, 24 Feb 2023 12:53:26 +0000
Joao Martins <joao.m.martins@oracle.com> wrote:

> On 24/02/2023 11:25, Joao Martins wrote:
> > On 23/02/2023 23:26, Jason Gunthorpe wrote:  
> >> On Thu, Feb 23, 2023 at 03:33:09PM -0700, Alex Williamson wrote:  
> >>> On Thu, 23 Feb 2023 16:55:54 -0400
> >>> Jason Gunthorpe <jgg@nvidia.com> wrote:  
> >>>> On Thu, Feb 23, 2023 at 01:06:33PM -0700, Alex Williamson wrote:
> >>>> Or even better figure out how to get interrupt remapping without IOMMU
> >>>> support :\  
> >>>
> >>> -machine q35,default_bus_bypass_iommu=on,kernel-irqchip=split \
> >>> -device intel-iommu,caching-mode=on,intremap=on  
> >>
> >> Joao?
> >>
> >> If this works lets just block migration if the vIOMMU is turned on..  
> > 
> > At a first glance, this looked like my regular iommu incantation.
> > 
> > But reading the code this ::bypass_iommu (new to me) apparently tells that
> > vIOMMU is bypassed or not for the PCI devices all the way to avoiding
> > enumerating in the IVRS/DMAR ACPI tables. And I see VFIO double-checks whether
> > PCI device is within the IOMMU address space (or bypassed) prior to DMA maps and
> > such.
> > 
> > You can see from the other email that all of the other options in my head were
> > either bit inconvenient or risky. I wasn't aware of this option for what is
> > worth -- much simpler, should work!
> >  
> 
> I say *should*, but on a second thought interrupt remapping may still be
> required to one of these devices that are IOMMU-bypassed. Say to put affinities
> to vcpus above 255? I was trying this out with more than 255 vcpus with a couple
> VFs and at a first glance these VFs fail to probe (these are CX6 VFs).
> 
> It is a working setup without the parameter, but now adding a
> default_bus_bypass_iommu=on fails to init VFs:
> 
> [   32.412733] mlx5_core 0000:00:02.0: Rate limit: 127 rates are supported,
> range: 0Mbps to 97656Mbps
> [   32.416242] mlx5_core 0000:00:02.0: mlx5_load:1204:(pid 3361): Failed to
> alloc IRQs
> [   33.227852] mlx5_core 0000:00:02.0: probe_one:1684:(pid 3361): mlx5_init_one
> failed with error code -19
> [   33.242182] mlx5_core 0000:00:03.0: firmware version: 22.31.1660
> [   33.415876] mlx5_core 0000:00:03.0: Rate limit: 127 rates are supported,
> range: 0Mbps to 97656Mbps
> [   33.448016] mlx5_core 0000:00:03.0: mlx5_load:1204:(pid 3361): Failed to
> alloc IRQs
> [   34.207532] mlx5_core 0000:00:03.0: probe_one:1684:(pid 3361): mlx5_init_one
> failed with error code -19
> 
> I haven't dived yet into why it fails.

Hmm, I was thinking this would only affect DMA, but on second thought
I think the DRHD also describes the interrupt remapping hardware and
while interrupt remapping is an optional feature of the DRHD, DMA
remapping is always supported afaict.  I saw IR vectors in
/proc/interrupts and thought it worked, but indeed an assigned device
is having trouble getting vectors.

> 
> > And avoiding vIOMMU simplifies the whole patchset too, if it's OK to add a live
> > migration blocker if `bypass_iommu` is off for any PCI device.
> >   
> 
> Still we could have for starters a live migration blocker until we revisit the
> vIOMMU case ... should we deem that the default_bus_bypass_iommu=on or the
> others I suggested as non-options?

I'm very uncomfortable presuming a vIOMMU usage model, especially when
it leads to potentially untracked DMA if our assumptions are violated.
We could use a MemoryListener on the IOVA space to record a high-water
mark, but we'd need to continue to monitor that mark while we're in
pre-copy, and I don't think anyone would agree that a migratable VM
suddenly becoming unmigratable due to a random IOVA allocation would be
supportable.  That leads me to think that a machine option to limit the
vIOMMU address space, and testing that against the device prior to
declaring migration support of the device, is possibly our best option.
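
As a reference point, a rough sketch of that high-water-mark idea, assuming a
dedicated MemoryListener registered on the container's IOVA address space; the
iova_watermark_listener and max_seen_iova container fields are made up for
illustration:

    /* Sketch: remember the highest IOVA that was ever mapped. */
    static void vfio_iova_watermark_region_add(MemoryListener *listener,
                                               MemoryRegionSection *section)
    {
        VFIOContainer *container = container_of(listener, VFIOContainer,
                                                iova_watermark_listener); /* hypothetical */
        hwaddr end = section->offset_within_address_space +
                     int128_get64(section->size) - 1;

        if (end > container->max_seen_iova) {   /* hypothetical */
            container->max_seen_iova = end;
        }
    }

The catch, as said above, is that this mark can keep growing while we're in
pre-copy, so the tracking ranges would have to follow it.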

Is that feasible?  Do all the vIOMMU models have a means to limit the
IOVA space?  How does QEMU learn a limit for a given device?  We
probably need to think about whether there are devices that can even
support the guest physical memory ranges when we start relocating RAM
to arbitrary addresses (ex. hypertransport).  Can we infer anything
from the vCPU virtual address space or is that still an unreasonable
range to track for devices?  Thanks,

Alex



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 17/20] vfio/common: Support device dirty page tracking with vIOMMU
  2023-02-24 15:56                   ` Alex Williamson
@ 2023-02-24 19:16                     ` Joao Martins
  0 siblings, 0 replies; 93+ messages in thread
From: Joao Martins @ 2023-02-24 19:16 UTC (permalink / raw)
  To: Alex Williamson, Jason Gunthorpe
  Cc: Avihai Horon, qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Maor Gottlieb, Kirti Wankhede, Tarun Gupta

On 24/02/2023 15:56, Alex Williamson wrote:
> On Fri, 24 Feb 2023 12:53:26 +0000
> Joao Martins <joao.m.martins@oracle.com> wrote:
> 
>> On 24/02/2023 11:25, Joao Martins wrote:
>>> On 23/02/2023 23:26, Jason Gunthorpe wrote:  
>>>> On Thu, Feb 23, 2023 at 03:33:09PM -0700, Alex Williamson wrote:  
>>>>> On Thu, 23 Feb 2023 16:55:54 -0400
>>>>> Jason Gunthorpe <jgg@nvidia.com> wrote:  
>>>>>> On Thu, Feb 23, 2023 at 01:06:33PM -0700, Alex Williamson wrote:
>>>>>> Or even better figure out how to get interrupt remapping without IOMMU
>>>>>> support :\  
>>>>>
>>>>> -machine q35,default_bus_bypass_iommu=on,kernel-irqchip=split \
>>>>> -device intel-iommu,caching-mode=on,intremap=on  
>>>>
>>>> Joao?
>>>>
>>>> If this works lets just block migration if the vIOMMU is turned on..  
>>>
>>> At a first glance, this looked like my regular iommu incantation.
>>>
>>> But reading the code this ::bypass_iommu (new to me) apparently tells that
>>> vIOMMU is bypassed or not for the PCI devices all the way to avoiding
>>> enumerating in the IVRS/DMAR ACPI tables. And I see VFIO double-checks whether
>>> PCI device is within the IOMMU address space (or bypassed) prior to DMA maps and
>>> such.
>>>
>>> You can see from the other email that all of the other options in my head were
>>> either bit inconvenient or risky. I wasn't aware of this option for what is
>>> worth -- much simpler, should work!
>>>  
>>
>> I say *should*, but on a second thought interrupt remapping may still be
>> required to one of these devices that are IOMMU-bypassed. Say to put affinities
>> to vcpus above 255? I was trying this out with more than 255 vcpus with a couple
>> VFs and at a first glance these VFs fail to probe (these are CX6 VFs).
>>
>> It is a working setup without the parameter, but now adding a
>> default_bus_bypass_iommu=on fails to init VFs:
>>
>> [   32.412733] mlx5_core 0000:00:02.0: Rate limit: 127 rates are supported,
>> range: 0Mbps to 97656Mbps
>> [   32.416242] mlx5_core 0000:00:02.0: mlx5_load:1204:(pid 3361): Failed to
>> alloc IRQs
>> [   33.227852] mlx5_core 0000:00:02.0: probe_one:1684:(pid 3361): mlx5_init_one
>> failed with error code -19
>> [   33.242182] mlx5_core 0000:00:03.0: firmware version: 22.31.1660
>> [   33.415876] mlx5_core 0000:00:03.0: Rate limit: 127 rates are supported,
>> range: 0Mbps to 97656Mbps
>> [   33.448016] mlx5_core 0000:00:03.0: mlx5_load:1204:(pid 3361): Failed to
>> alloc IRQs
>> [   34.207532] mlx5_core 0000:00:03.0: probe_one:1684:(pid 3361): mlx5_init_one
>> failed with error code -19
>>
>> I haven't dived yet into why it fails.
> 
> Hmm, I was thinking this would only affect DMA, but on second thought
> I think the DRHD also describes the interrupt remapping hardware and
> while interrupt remapping is an optional feature of the DRHD, DMA
> remapping is always supported afaict.  I saw IR vectors in
> /proc/interrupts and thought it worked, but indeed an assigned device
> is having trouble getting vectors.
> 

AMD/IVRS might be a little different.

I also tried disabling the DMA translation IOMMU feature, as I had mentioned
in another email, and that renders the same result as default_bus_bypass_iommu.

So it's either this KVM pv-op (which is not really interrupt remapping, and is
x86 specific) or the full vIOMMU. The PV op[*] has the natural disadvantage of
requiring a compatible guest kernel.

[*] See, KVM_FEATURE_MSI_EXT_DEST_ID.

>>
>>> And avoiding vIOMMU simplifies the whole patchset too, if it's OK to add a live
>>> migration blocker if `bypass_iommu` is off for any PCI device.
>>>   
>>
>> Still we could have for starters a live migration blocker until we revisit the
>> vIOMMU case ... should we deem that the default_bus_bypass_iommu=on or the
>> others I suggested as non-options?
> 
> I'm very uncomfortable presuming a vIOMMU usage model, especially when
> it leads to potentially untracked DMA if our assumptions are violated.

We can track DMA that got dirtied, but that doesn't mean said DMA is mapped.
I don't think VFIO ties those two together? You can ask to track certain
ranges, but if a range isn't mapped in the IOMMU the device gets a target
abort. Starting dirty tracking doesn't imply that you allow such DMA.

With a vIOMMU, anything that falls outside the IOMMU-mapped ranges (or the
identity map) is always marked dirty if it wasn't armed in the device dirty
tracker. It's best effort -- I don't think supporting vIOMMU has a ton of
options without a more significant compromise. If the vIOMMU is in passthrough
mode, then things work just as if no vIOMMU were there. Avihai's code reflects
that.

Considering your earlier suggestion that we only start dirty tracking and
record ranges *when* the dirty tracking start operation happens ... this gets
further simplified. We also have to take into account that there is no
guarantee we can change the ranges under tracking dynamically.

For improving the vIOMMU case we either track up to MAX_IOVA or we compose an
artificial range based on the max IOVA of the current vIOMMU maps.

> We could use a MemoryListener on the IOVA space to record a high level
> mark, but we'd need to continue to monitor that mark while we're in
> pre-copy and I don't think anyone would agree that a migratable VM can
> suddenly become unmigratable due to a random IOVA allocation would be
> supportable.  That leads me to think that a machine option to limit the
> vIOMMU address space, and testing that against the device prior to
> declaring migration support of the device is possibly our best option.
> 
> Is that feasible?  Do all the vIOMMU models have a means to limit the
> IOVA space? 

I can say that *at least* AMD and Intel support that. Intel supports either
39- or 48-bit address-width modes (only those two values, as I understand it).
AMD supposedly has more granular management of VASize and PASize.

I have no idea about smmuv3 or virtio-iommu.
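
For the Intel case there is already a device property for this, so assuming
the 39/48 restriction above holds, a capped configuration would look something
like the line below; whether the other vIOMMU models expose an equivalent knob
is the open question:

    -device intel-iommu,intremap=on,caching-mode=on,aw-bits=39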

But isn't this actually what Avihai does in the series, minus the device
part? The address width is fetched directly from the vIOMMU model via
IOMMU_ATTR_MAX_IOVA, and one of the options is to compose a range based on the
max vIOMMU range.

> How does QEMU learn a limit for a given device? 

IOMMU_ATTR_MAX_IOVA for the vIOMMU.

For the device this is not described in ACPI or any other place that I know
of :/ without getting into VF specifics.
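
A minimal sketch of the vIOMMU-side query, using the IOMMU_ATTR_MAX_IOVA
attribute this series introduces together with the existing
memory_region_iommu_get_attr() helper; the wrapper name is made up:

    /* Sketch: ask the vIOMMU memory region for its maximum IOVA. */
    static hwaddr vfio_viommu_max_iova(IOMMUMemoryRegion *iommu_mr)
    {
        hwaddr max_iova = 0;

        if (memory_region_iommu_get_attr(iommu_mr, IOMMU_ATTR_MAX_IOVA,
                                         &max_iova)) {
            /* The vIOMMU model doesn't implement the attribute. */
            return HWADDR_MAX;
        }

        return max_iova;
    }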

> We
> probably need to think about whether there are devices that can even
> support the guest physical memory ranges when we start relocating RAM
> to arbitrary addresses (ex. hypertransport). 

In theory we require one bit more in the device DMA engine, so instead of a
max of 39 bits we require 40 bits for a 1T guest. GPUs and modern NICs are
64-bit DMA capable devices, but it's a bit hard to learn this as it's device
specific.

> Can we infer anything
> from the vCPU virtual address space or is that still an unreasonable
> range to track for devices?  Thanks,
> 
We sort of rely on that for the iommu=pt or no-vIOMMU case, where the vCPU
address space matches the IOVA space, but I'm not sure how much the vCPU
address space would give you that the vIOMMU mappings don't already.


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 00/20] vfio: Add migration pre-copy support and device dirty tracking
  2023-02-23 14:56   ` Avihai Horon
@ 2023-02-24 19:26     ` Joao Martins
  2023-02-26 17:00       ` Avihai Horon
  0 siblings, 1 reply; 93+ messages in thread
From: Joao Martins @ 2023-02-24 19:26 UTC (permalink / raw)
  To: Avihai Horon, Alex Williamson
  Cc: qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta



On 23/02/2023 14:56, Avihai Horon wrote:
> On 22/02/2023 22:55, Alex Williamson wrote:
>> There are various errors running this through the CI on gitlab.
>>
>> This one seems bogus but needs to be resolved regardless:
>>
>> https://gitlab.com/alex.williamson/qemu/-/jobs/3817940731
>> FAILED: libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o
>> 2786s390x-linux-gnu-gcc -m64 -Ilibqemu-aarch64-softmmu.fa.p -I. -I..
>> -Itarget/arm -I../target/arm -Iqapi -Itrace -Iui -Iui/shader
>> -I/usr/include/pixman-1 -I/usr/include/capstone -I/usr/include/glib-2.0
>> -I/usr/lib/s390x-linux-gnu/glib-2.0/include -fdiagnostics-color=auto -Wall
>> -Winvalid-pch -Werror -std=gnu11 -O2 -g -isystem
>> /builds/alex.williamson/qemu/linux-headers -isystem linux-headers -iquote .
>> -iquote /builds/alex.williamson/qemu -iquote
>> /builds/alex.williamson/qemu/include -iquote
>> /builds/alex.williamson/qemu/tcg/s390x -pthread -U_FORTIFY_SOURCE
>> -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE
>> -fno-strict-aliasing -fno-common -fwrapv -Wundef -Wwrite-strings
>> -Wmissing-prototypes -Wstrict-prototypes -Wredundant-decls
>> -Wold-style-declaration -Wold-style-definition -Wtype-limits -Wformat-security
>> -Wformat-y2k -Winit-self -Wignored-qualifiers -Wempty-body -Wnested-externs
>> -Wendif-labels -Wexpansion-to-defined -Wimplicit-fallthrough=2
>> -Wmissing-format-attribute -Wno-missing-include-dirs -Wno-shift-negative-value
>> -Wno-psabi -fstack-protector-strong -fPIE -isystem../linux-headers
>> -isystemlinux-headers -DNEED_CPU_H
>> '-DCONFIG_TARGET="aarch64-softmmu-config-target.h"'
>> '-DCONFIG_DEVICES="aarch64-softmmu-config-devices.h"' -MD -MQ
>> libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o -MF
>> libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o.d -o
>> libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o -c ../hw/vfio/common.c
>> 2787../hw/vfio/common.c: In function ‘vfio_listener_log_global_start’:
>> 2788../hw/vfio/common.c:1772:8: error: ‘ret’ may be used uninitialized in this
>> function [-Werror=maybe-uninitialized]
>> 2789 1772 |     if (ret) {
>> 2790      |        ^
>>
>> 32-bit builds have some actual errors though:
>>
>> https://gitlab.com/alex.williamson/qemu/-/jobs/3817940719
>> FAILED: libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o
>> 2601cc -m32 -Ilibqemu-aarch64-softmmu.fa.p -I. -I.. -Itarget/arm
>> -I../target/arm -Iqapi -Itrace -Iui -Iui/shader -I/usr/include/pixman-1
>> -I/usr/include/glib-2.0 -I/usr/lib/glib-2.0/include -I/usr/include/sysprof-4
>> -fdiagnostics-color=auto -Wall -Winvalid-pch -Werror -std=gnu11 -O2 -g
>> -isystem /builds/alex.williamson/qemu/linux-headers -isystem linux-headers
>> -iquote . -iquote /builds/alex.williamson/qemu -iquote
>> /builds/alex.williamson/qemu/include -iquote
>> /builds/alex.williamson/qemu/tcg/i386 -pthread -U_FORTIFY_SOURCE
>> -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE
>> -fno-strict-aliasing -fno-common -fwrapv -Wundef -Wwrite-strings
>> -Wmissing-prototypes -Wstrict-prototypes -Wredundant-decls
>> -Wold-style-declaration -Wold-style-definition -Wtype-limits -Wformat-security
>> -Wformat-y2k -Winit-self -Wignored-qualifiers -Wempty-body -Wnested-externs
>> -Wendif-labels -Wexpansion-to-defined -Wimplicit-fallthrough=2
>> -Wmissing-format-attribute -Wno-missing-include-dirs -Wno-shift-negative-value
>> -Wno-psabi -fstack-protector-strong -fPIE -isystem../linux-headers
>> -isystemlinux-headers -DNEED_CPU_H
>> '-DCONFIG_TARGET="aarch64-softmmu-config-target.h"'
>> '-DCONFIG_DEVICES="aarch64-softmmu-config-devices.h"' -MD -MQ
>> libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o -MF
>> libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o.d -o
>> libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o -c ../hw/vfio/common.c
>> 2602../hw/vfio/common.c: In function
>> 'vfio_device_feature_dma_logging_start_create':
>> 2603../hw/vfio/common.c:1572:27: error: cast from pointer to integer of
>> different size [-Werror=pointer-to-int-cast]
>> 2604 1572 |         control->ranges = (uint64_t)ranges;
>> 2605      |                           ^
>> 2606../hw/vfio/common.c:1596:23: error: cast from pointer to integer of
>> different size [-Werror=pointer-to-int-cast]
>> 2607 1596 |     control->ranges = (uint64_t)ranges;
>> 2608      |                       ^
>> 2609../hw/vfio/common.c: In function
>> 'vfio_device_feature_dma_logging_start_destroy':
>> 2610../hw/vfio/common.c:1620:9: error: cast to pointer from integer of
>> different size [-Werror=int-to-pointer-cast]
>> 2611 1620 |         (struct vfio_device_feature_dma_logging_range
>> *)control->ranges;
>> 2612      |         ^
>> 2613../hw/vfio/common.c: In function 'vfio_device_dma_logging_report':
>> 2614../hw/vfio/common.c:1810:22: error: cast from pointer to integer of
>> different size [-Werror=pointer-to-int-cast]
>> 2615 1810 |     report->bitmap = (uint64_t)bitmap;
>> 2616      |                      ^
> 
> Sure, I will fix these errors.
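
For the 32-bit cast errors, the usual QEMU pattern is to go through uintptr_t
when stuffing a host pointer into a __u64 uAPI field, roughly along the lines
below; a sketch of one common fix, not the final patch:

    /* Pointer -> __u64: widen via uintptr_t so 32-bit hosts don't complain. */
    control->ranges = (uintptr_t)ranges;

    /* __u64 -> pointer: narrow back through uintptr_t the same way. */
    ranges = (struct vfio_device_feature_dma_logging_range *)(uintptr_t)control->ranges;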

Just a thought: should the pre-copy patches be moved towards the end of this
series, given that they're more of a downtime improvement than a must-have
like dirty tracking?

	Joao


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 03/20] vfio/migration: Add VFIO migration pre-copy support
  2023-02-23 21:16       ` Alex Williamson
@ 2023-02-26 16:43         ` Avihai Horon
  2023-02-27 16:14           ` Alex Williamson
  0 siblings, 1 reply; 93+ messages in thread
From: Avihai Horon @ 2023-02-26 16:43 UTC (permalink / raw)
  To: Alex Williamson
  Cc: qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta, Joao Martins


On 23/02/2023 23:16, Alex Williamson wrote:
> External email: Use caution opening links or attachments
>
>
> On Thu, 23 Feb 2023 17:25:12 +0200
> Avihai Horon <avihaih@nvidia.com> wrote:
>
>> On 22/02/2023 22:58, Alex Williamson wrote:
>>> External email: Use caution opening links or attachments
>>>
>>>
>>> On Wed, 22 Feb 2023 19:48:58 +0200
>>> Avihai Horon <avihaih@nvidia.com> wrote:
>>>
>>>> @@ -302,23 +380,44 @@ static void vfio_save_cleanup(void *opaque)
>>>>        trace_vfio_save_cleanup(vbasedev->name);
>>>>    }
>>>>
>>>> +static void vfio_state_pending_estimate(void *opaque, uint64_t threshold_size,
>>>> +                                        uint64_t *must_precopy,
>>>> +                                        uint64_t *can_postcopy)
>>>> +{
>>>> +    VFIODevice *vbasedev = opaque;
>>>> +    VFIOMigration *migration = vbasedev->migration;
>>>> +
>>>> +    if (migration->device_state != VFIO_DEVICE_STATE_PRE_COPY) {
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    /*
>>>> +     * Initial size should be transferred during pre-copy phase so stop-copy
>>>> +     * phase will not be slowed down. Report threshold_size to force another
>>>> +     * pre-copy iteration.
>>>> +     */
>>>> +    *must_precopy += migration->precopy_init_size ?
>>>> +                         threshold_size :
>>>> +                         migration->precopy_dirty_size;
>>> This sure feels like we're feeding false data back to the iterator to
>>> spoof it to run another iteration, when the vfio migration protocol
>>> only recommends that initial_bytes reaches zero before proceeding to
>>> stop-copy, it's not a requirement.  What benefit is actually observed
>>> from this?  Why is this required for initial pre-copy support?  It
>>> seems devious.
>> As previously discussed in the thread that added the pre-copy uAPI [1],
>> the init_bytes can be used by drivers to reduce the downtime.
>> For example, mlx5 transfers some metadata to the target so it will be
>> able to pre-allocate resources etc.
>>
>> [1]
>> https://lore.kernel.org/kvm/ae4a6259-349d-0131-896c-7a6ea775cc9e@nvidia.com/
> Yes, but how does that become a requirement to QEMU that it must
> iterate until the initial segment is complete?  Especially when we need
> to trigger that behavior via such nefarious means.  AIUI, QEMU should
> be allowed to move to stop-copy at any point.  We should make efforts
> that QEMU would never decide on its own to move from pre-copy to
> stop-copy without completing the init_bytes (which sounds suspiciously
> like the purpose of @must_precopy),

@must_precopy represents the pending bytes that must be transferred 
during pre-copy or stop-copy. If it's under the threshold, then 
migration will move to stop-copy and be completed.
So simply adding init_bytes to @must_precopy will not guarantee that we 
send all init_bytes before moving to stop-copy, since the transition to 
stop-copy can happen when @must_precopy != 0.

>   but if, for instance a user forces a
> transition to stop-copy, I don't see that we have any business to
> impose a policy to delay that until the init_bytes is complete.

Is there a way a user can force the migration to move to stop-copy?
Looking at the migration code, it seems that the only way to move to 
stop-copy is if @must_precopy is below the threshold.
If so, then this is our effort to make QEMU send all init_bytes before 
moving to stop-copy, and we can only benefit from it.

Regarding how to do it -- maybe instead of spoofing @must_precopy we can 
introduce a new parameter in the upper migration layer (e.g., @init_precopy) 
and add another condition in the migration layer that it must be zero to 
move to stop-copy.
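
A rough sketch of that idea, treating @init_precopy as a hypothetical extra
output of the pending handlers and simplifying the migration-core decision;
this is not existing QEMU code:

    /* Simplified decision logic; the real code lives in migration_iteration_run(). */
    qemu_savevm_state_pending_estimate(&must_precopy, &can_postcopy, &init_precopy);

    if (must_precopy <= s->threshold_size && init_precopy == 0) {
        /* Only now is it OK to move to stop-copy and complete the migration. */
        migration_completion(s);
    } else {
        /* Otherwise run another pre-copy iteration. */
        qemu_savevm_state_iterate(s->to_dst_file, in_postcopy);
    }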

Thanks.



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 11/20] vfio/common: Add device dirty page tracking start/stop
  2023-02-23 20:54             ` Jason Gunthorpe
@ 2023-02-26 16:54               ` Avihai Horon
  0 siblings, 0 replies; 93+ messages in thread
From: Avihai Horon @ 2023-02-26 16:54 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson
  Cc: qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Maor Gottlieb, Kirti Wankhede, Tarun Gupta,
	Joao Martins


On 23/02/2023 22:54, Jason Gunthorpe wrote:
> On Thu, Feb 23, 2023 at 01:16:40PM -0700, Alex Williamson wrote:
>> On Thu, 23 Feb 2023 15:30:28 -0400
>> Jason Gunthorpe <jgg@nvidia.com> wrote:
>>
>>> On Thu, Feb 23, 2023 at 12:27:23PM -0700, Alex Williamson wrote:
>>>> So again, I think I'm just looking for a better comment that doesn't
>>>> add FUD to the reasoning behind switching to a single range,
>>> It isn't a single range, it is a single page of ranges, right?
>> Exceeding a single page of ranges is the inflection point at which we
>> switch to a single range.
> Oh, that isn't what it should do - it should cut it back to fit in a
> page..

Sure, I will change it accordingly (and rephrase the comment).

Thanks.



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 00/20] vfio: Add migration pre-copy support and device dirty tracking
  2023-02-24 19:26     ` Joao Martins
@ 2023-02-26 17:00       ` Avihai Horon
  2023-02-27 13:50         ` Cédric Le Goater
  0 siblings, 1 reply; 93+ messages in thread
From: Avihai Horon @ 2023-02-26 17:00 UTC (permalink / raw)
  To: Joao Martins, Alex Williamson
  Cc: qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta


On 24/02/2023 21:26, Joao Martins wrote:
> External email: Use caution opening links or attachments
>
>
> On 23/02/2023 14:56, Avihai Horon wrote:
>> On 22/02/2023 22:55, Alex Williamson wrote:
>>> There are various errors running this through the CI on gitlab.
>>>
>>> This one seems bogus but needs to be resolved regardless:
>>>
>>> https://gitlab.com/alex.williamson/qemu/-/jobs/3817940731
>>> FAILED: libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o
>>> 2786s390x-linux-gnu-gcc -m64 -Ilibqemu-aarch64-softmmu.fa.p -I. -I..
>>> -Itarget/arm -I../target/arm -Iqapi -Itrace -Iui -Iui/shader
>>> -I/usr/include/pixman-1 -I/usr/include/capstone -I/usr/include/glib-2.0
>>> -I/usr/lib/s390x-linux-gnu/glib-2.0/include -fdiagnostics-color=auto -Wall
>>> -Winvalid-pch -Werror -std=gnu11 -O2 -g -isystem
>>> /builds/alex.williamson/qemu/linux-headers -isystem linux-headers -iquote .
>>> -iquote /builds/alex.williamson/qemu -iquote
>>> /builds/alex.williamson/qemu/include -iquote
>>> /builds/alex.williamson/qemu/tcg/s390x -pthread -U_FORTIFY_SOURCE
>>> -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE
>>> -fno-strict-aliasing -fno-common -fwrapv -Wundef -Wwrite-strings
>>> -Wmissing-prototypes -Wstrict-prototypes -Wredundant-decls
>>> -Wold-style-declaration -Wold-style-definition -Wtype-limits -Wformat-security
>>> -Wformat-y2k -Winit-self -Wignored-qualifiers -Wempty-body -Wnested-externs
>>> -Wendif-labels -Wexpansion-to-defined -Wimplicit-fallthrough=2
>>> -Wmissing-format-attribute -Wno-missing-include-dirs -Wno-shift-negative-value
>>> -Wno-psabi -fstack-protector-strong -fPIE -isystem../linux-headers
>>> -isystemlinux-headers -DNEED_CPU_H
>>> '-DCONFIG_TARGET="aarch64-softmmu-config-target.h"'
>>> '-DCONFIG_DEVICES="aarch64-softmmu-config-devices.h"' -MD -MQ
>>> libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o -MF
>>> libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o.d -o
>>> libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o -c ../hw/vfio/common.c
>>> 2787../hw/vfio/common.c: In function ‘vfio_listener_log_global_start’:
>>> 2788../hw/vfio/common.c:1772:8: error: ‘ret’ may be used uninitialized in this
>>> function [-Werror=maybe-uninitialized]
>>> 2789 1772 |     if (ret) {
>>> 2790      |        ^
>>>
>>> 32-bit builds have some actual errors though:
>>>
>>> https://gitlab.com/alex.williamson/qemu/-/jobs/3817940719
>>> FAILED: libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o
>>> 2601cc -m32 -Ilibqemu-aarch64-softmmu.fa.p -I. -I.. -Itarget/arm
>>> -I../target/arm -Iqapi -Itrace -Iui -Iui/shader -I/usr/include/pixman-1
>>> -I/usr/include/glib-2.0 -I/usr/lib/glib-2.0/include -I/usr/include/sysprof-4
>>> -fdiagnostics-color=auto -Wall -Winvalid-pch -Werror -std=gnu11 -O2 -g
>>> -isystem /builds/alex.williamson/qemu/linux-headers -isystem linux-headers
>>> -iquote . -iquote /builds/alex.williamson/qemu -iquote
>>> /builds/alex.williamson/qemu/include -iquote
>>> /builds/alex.williamson/qemu/tcg/i386 -pthread -U_FORTIFY_SOURCE
>>> -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE
>>> -fno-strict-aliasing -fno-common -fwrapv -Wundef -Wwrite-strings
>>> -Wmissing-prototypes -Wstrict-prototypes -Wredundant-decls
>>> -Wold-style-declaration -Wold-style-definition -Wtype-limits -Wformat-security
>>> -Wformat-y2k -Winit-self -Wignored-qualifiers -Wempty-body -Wnested-externs
>>> -Wendif-labels -Wexpansion-to-defined -Wimplicit-fallthrough=2
>>> -Wmissing-format-attribute -Wno-missing-include-dirs -Wno-shift-negative-value
>>> -Wno-psabi -fstack-protector-strong -fPIE -isystem../linux-headers
>>> -isystemlinux-headers -DNEED_CPU_H
>>> '-DCONFIG_TARGET="aarch64-softmmu-config-target.h"'
>>> '-DCONFIG_DEVICES="aarch64-softmmu-config-devices.h"' -MD -MQ
>>> libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o -MF
>>> libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o.d -o
>>> libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o -c ../hw/vfio/common.c
>>> 2602../hw/vfio/common.c: In function
>>> 'vfio_device_feature_dma_logging_start_create':
>>> 2603../hw/vfio/common.c:1572:27: error: cast from pointer to integer of
>>> different size [-Werror=pointer-to-int-cast]
>>> 2604 1572 |         control->ranges = (uint64_t)ranges;
>>> 2605      |                           ^
>>> 2606../hw/vfio/common.c:1596:23: error: cast from pointer to integer of
>>> different size [-Werror=pointer-to-int-cast]
>>> 2607 1596 |     control->ranges = (uint64_t)ranges;
>>> 2608      |                       ^
>>> 2609../hw/vfio/common.c: In function
>>> 'vfio_device_feature_dma_logging_start_destroy':
>>> 2610../hw/vfio/common.c:1620:9: error: cast to pointer from integer of
>>> different size [-Werror=int-to-pointer-cast]
>>> 2611 1620 |         (struct vfio_device_feature_dma_logging_range
>>> *)control->ranges;
>>> 2612      |         ^
>>> 2613../hw/vfio/common.c: In function 'vfio_device_dma_logging_report':
>>> 2614../hw/vfio/common.c:1810:22: error: cast from pointer to integer of
>>> different size [-Werror=pointer-to-int-cast]
>>> 2615 1810 |     report->bitmap = (uint64_t)bitmap;
>>> 2616      |                      ^
>> Sure, I will fix these errors.
> Just a thought: should the pre-copy series be moved towards the end of this
> series, given that it's more of an improvement of downtime than a must-have like
> dirty tracking?

Given recent discussion, maybe it would be better to split this series 
and go one step at a time:
Start with basic support for device dirty tracking (without vIOMMU 
support), then add pre-copy and then add vIOMMU support to device dirty 
tracking.

Thanks.



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 00/20] vfio: Add migration pre-copy support and device dirty tracking
  2023-02-23 15:07     ` Avihai Horon
@ 2023-02-27 10:24       ` Cédric Le Goater
  0 siblings, 0 replies; 93+ messages in thread
From: Cédric Le Goater @ 2023-02-27 10:24 UTC (permalink / raw)
  To: Avihai Horon, Alex Williamson
  Cc: qemu-devel, Juan Quintela, Dr. David Alan Gilbert,
	Michael S. Tsirkin, Peter Xu, Jason Wang, Marcel Apfelbaum,
	Paolo Bonzini, Richard Henderson, Eduardo Habkost,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta, Joao Martins

On 2/23/23 16:07, Avihai Horon wrote:
> 
> On 23/02/2023 12:05, Cédric Le Goater wrote:
>> External email: Use caution opening links or attachments
>>
>>
>> On 2/22/23 21:55, Alex Williamson wrote:
>>>
>>> There are various errors running this through the CI on gitlab.
>>>
>>> This one seems bogus but needs to be resolved regardless:
>>>
>>> https://gitlab.com/alex.williamson/qemu/-/jobs/3817940731
>>> FAILED: libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o
>>> 2786s390x-linux-gnu-gcc -m64 -Ilibqemu-aarch64-softmmu.fa.p -I. -I.. -Itarget/arm -I../target/arm -Iqapi -Itrace -Iui -Iui/shader -I/usr/include/pixman-1 -I/usr/include/capstone -I/usr/include/glib-2.0 -I/usr/lib/s390x-linux-gnu/glib-2.0/include -fdiagnostics-color=auto -Wall -Winvalid-pch -Werror -std=gnu11 -O2 -g -isystem /builds/alex.williamson/qemu/linux-headers -isystem linux-headers -iquote . -iquote /builds/alex.williamson/qemu -iquote /builds/alex.williamson/qemu/include -iquote /builds/alex.williamson/qemu/tcg/s390x -pthread -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -fno-strict-aliasing -fno-common -fwrapv -Wundef -Wwrite-strings -Wmissing-prototypes -Wstrict-prototypes -Wredundant-decls -Wold-style-declaration -Wold-style-definition -Wtype-limits -Wformat-security -Wformat-y2k -Winit-self -Wignored-qualifiers -Wempty-body -Wnested-externs -Wendif-labels -Wexpansion-to-defined -Wimplicit-fallthrough=2 
>>> -Wmissing-format-attribute -Wno-missing-include-dirs -Wno-shift-negative-value -Wno-psabi -fstack-protector-strong -fPIE -isystem../linux-headers -isystemlinux-headers -DNEED_CPU_H '-DCONFIG_TARGET="aarch64-softmmu-config-target.h"' '-DCONFIG_DEVICES="aarch64-softmmu-config-devices.h"' -MD -MQ libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o -MF libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o.d -o libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o -c ../hw/vfio/common.c
>>> 2787../hw/vfio/common.c: In function ‘vfio_listener_log_global_start’:
>>> 2788../hw/vfio/common.c:1772:8: error: ‘ret’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
>>> 2789 1772 |     if (ret) {
>>> 2790      |        ^
>>
>>
>> The routine to fix is vfio_devices_start_dirty_page_tracking(). The compiler
>> is doing some inlining.
>>
> I don't think I understand how inlining could cause it.
> Could you elaborate on this?

The compiler reports an error in routine 'vfio_listener_log_global_start',
but the fix should be in 'vfio_devices_start_dirty_page_tracking', surely
because the compiler optimization inlines the latter.

> 
I thought that the compiler just missed the initialization of ret because it happens in the if/else statement, and that simply doing "int ret = 0;" would solve it.

Yes. This will work.
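
i.e. initializing the variable where it is declared in
vfio_devices_start_dirty_page_tracking(), whatever the rest of the function
looks like in the posted series:

    int ret = 0;    /* quiets -Werror=maybe-uninitialized on the s390x cross build */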

Thanks,

C.



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 00/20] vfio: Add migration pre-copy support and device dirty tracking
  2023-02-26 17:00       ` Avihai Horon
@ 2023-02-27 13:50         ` Cédric Le Goater
  2023-03-01 19:04           ` Avihai Horon
  0 siblings, 1 reply; 93+ messages in thread
From: Cédric Le Goater @ 2023-02-27 13:50 UTC (permalink / raw)
  To: Avihai Horon, Joao Martins, Alex Williamson
  Cc: qemu-devel, Juan Quintela, Dr. David Alan Gilbert,
	Michael S. Tsirkin, Peter Xu, Jason Wang, Marcel Apfelbaum,
	Paolo Bonzini, Richard Henderson, Eduardo Habkost,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta

On 2/26/23 18:00, Avihai Horon wrote:
> 
> On 24/02/2023 21:26, Joao Martins wrote:
>> External email: Use caution opening links or attachments
>>
>>
>> On 23/02/2023 14:56, Avihai Horon wrote:
>>> On 22/02/2023 22:55, Alex Williamson wrote:
>>>> There are various errors running this through the CI on gitlab.
>>>>
>>>> This one seems bogus but needs to be resolved regardless:
>>>>
>>>> https://gitlab.com/alex.williamson/qemu/-/jobs/3817940731
>>>> FAILED: libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o
>>>> 2786s390x-linux-gnu-gcc -m64 -Ilibqemu-aarch64-softmmu.fa.p -I. -I..
>>>> -Itarget/arm -I../target/arm -Iqapi -Itrace -Iui -Iui/shader
>>>> -I/usr/include/pixman-1 -I/usr/include/capstone -I/usr/include/glib-2.0
>>>> -I/usr/lib/s390x-linux-gnu/glib-2.0/include -fdiagnostics-color=auto -Wall
>>>> -Winvalid-pch -Werror -std=gnu11 -O2 -g -isystem
>>>> /builds/alex.williamson/qemu/linux-headers -isystem linux-headers -iquote .
>>>> -iquote /builds/alex.williamson/qemu -iquote
>>>> /builds/alex.williamson/qemu/include -iquote
>>>> /builds/alex.williamson/qemu/tcg/s390x -pthread -U_FORTIFY_SOURCE
>>>> -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE
>>>> -fno-strict-aliasing -fno-common -fwrapv -Wundef -Wwrite-strings
>>>> -Wmissing-prototypes -Wstrict-prototypes -Wredundant-decls
>>>> -Wold-style-declaration -Wold-style-definition -Wtype-limits -Wformat-security
>>>> -Wformat-y2k -Winit-self -Wignored-qualifiers -Wempty-body -Wnested-externs
>>>> -Wendif-labels -Wexpansion-to-defined -Wimplicit-fallthrough=2
>>>> -Wmissing-format-attribute -Wno-missing-include-dirs -Wno-shift-negative-value
>>>> -Wno-psabi -fstack-protector-strong -fPIE -isystem../linux-headers
>>>> -isystemlinux-headers -DNEED_CPU_H
>>>> '-DCONFIG_TARGET="aarch64-softmmu-config-target.h"'
>>>> '-DCONFIG_DEVICES="aarch64-softmmu-config-devices.h"' -MD -MQ
>>>> libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o -MF
>>>> libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o.d -o
>>>> libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o -c ../hw/vfio/common.c
>>>> 2787../hw/vfio/common.c: In function ‘vfio_listener_log_global_start’:
>>>> 2788../hw/vfio/common.c:1772:8: error: ‘ret’ may be used uninitialized in this
>>>> function [-Werror=maybe-uninitialized]
>>>> 2789 1772 |     if (ret) {
>>>> 2790      |        ^
>>>>
>>>> 32-bit builds have some actual errors though:
>>>>
>>>> https://gitlab.com/alex.williamson/qemu/-/jobs/3817940719
>>>> FAILED: libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o
>>>> 2601cc -m32 -Ilibqemu-aarch64-softmmu.fa.p -I. -I.. -Itarget/arm
>>>> -I../target/arm -Iqapi -Itrace -Iui -Iui/shader -I/usr/include/pixman-1
>>>> -I/usr/include/glib-2.0 -I/usr/lib/glib-2.0/include -I/usr/include/sysprof-4
>>>> -fdiagnostics-color=auto -Wall -Winvalid-pch -Werror -std=gnu11 -O2 -g
>>>> -isystem /builds/alex.williamson/qemu/linux-headers -isystem linux-headers
>>>> -iquote . -iquote /builds/alex.williamson/qemu -iquote
>>>> /builds/alex.williamson/qemu/include -iquote
>>>> /builds/alex.williamson/qemu/tcg/i386 -pthread -U_FORTIFY_SOURCE
>>>> -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE
>>>> -fno-strict-aliasing -fno-common -fwrapv -Wundef -Wwrite-strings
>>>> -Wmissing-prototypes -Wstrict-prototypes -Wredundant-decls
>>>> -Wold-style-declaration -Wold-style-definition -Wtype-limits -Wformat-security
>>>> -Wformat-y2k -Winit-self -Wignored-qualifiers -Wempty-body -Wnested-externs
>>>> -Wendif-labels -Wexpansion-to-defined -Wimplicit-fallthrough=2
>>>> -Wmissing-format-attribute -Wno-missing-include-dirs -Wno-shift-negative-value
>>>> -Wno-psabi -fstack-protector-strong -fPIE -isystem../linux-headers
>>>> -isystemlinux-headers -DNEED_CPU_H
>>>> '-DCONFIG_TARGET="aarch64-softmmu-config-target.h"'
>>>> '-DCONFIG_DEVICES="aarch64-softmmu-config-devices.h"' -MD -MQ
>>>> libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o -MF
>>>> libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o.d -o
>>>> libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o -c ../hw/vfio/common.c
>>>> 2602../hw/vfio/common.c: In function
>>>> 'vfio_device_feature_dma_logging_start_create':
>>>> 2603../hw/vfio/common.c:1572:27: error: cast from pointer to integer of
>>>> different size [-Werror=pointer-to-int-cast]
>>>> 2604 1572 |         control->ranges = (uint64_t)ranges;
>>>> 2605      |                           ^
>>>> 2606../hw/vfio/common.c:1596:23: error: cast from pointer to integer of
>>>> different size [-Werror=pointer-to-int-cast]
>>>> 2607 1596 |     control->ranges = (uint64_t)ranges;
>>>> 2608      |                       ^
>>>> 2609../hw/vfio/common.c: In function
>>>> 'vfio_device_feature_dma_logging_start_destroy':
>>>> 2610../hw/vfio/common.c:1620:9: error: cast to pointer from integer of
>>>> different size [-Werror=int-to-pointer-cast]
>>>> 2611 1620 |         (struct vfio_device_feature_dma_logging_range
>>>> *)control->ranges;
>>>> 2612      |         ^
>>>> 2613../hw/vfio/common.c: In function 'vfio_device_dma_logging_report':
>>>> 2614../hw/vfio/common.c:1810:22: error: cast from pointer to integer of
>>>> different size [-Werror=pointer-to-int-cast]
>>>> 2615 1810 |     report->bitmap = (uint64_t)bitmap;
>>>> 2616      |                      ^
>>> Sure, I will fix these errors.
>> Just a thought: should the pre-copy series be moved towards the end of this
>> series, given that it's more of an improvement of downtime than a must-have like
>> dirty tracking?
> 
> Given recent discussion, maybe it would be better to split this series and go one step at a time:
> Start with basic support for device dirty tracking (without vIOMMU support), then add pre-copy and then add vIOMMU support to device dirty tracking.

and add the fixes first in the series. They could be merged quickly.

Thanks,

C.




^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 07/20] vfio/common: Add VFIOBitmap and (de)alloc functions
  2023-02-22 17:49 ` [PATCH v2 07/20] vfio/common: Add VFIOBitmap and (de)alloc functions Avihai Horon
  2023-02-22 21:40   ` Alex Williamson
@ 2023-02-27 14:09   ` Cédric Le Goater
  2023-03-01 18:56     ` Avihai Horon
  2023-03-02 13:24     ` Joao Martins
  1 sibling, 2 replies; 93+ messages in thread
From: Cédric Le Goater @ 2023-02-27 14:09 UTC (permalink / raw)
  To: Avihai Horon, qemu-devel
  Cc: Alex Williamson, Juan Quintela, Dr. David Alan Gilbert,
	Michael S. Tsirkin, Peter Xu, Jason Wang, Marcel Apfelbaum,
	Paolo Bonzini, Richard Henderson, Eduardo Habkost,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta, Joao Martins

On 2/22/23 18:49, Avihai Horon wrote:
> There are already two places where dirty page bitmap allocation and
> calculations are done in open code. With device dirty page tracking
> being added in next patches, there are going to be even more places.
> 
> To avoid code duplication, introduce VFIOBitmap struct and corresponding
> alloc and dealloc functions and use them where applicable.
> 
> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
> ---
>   hw/vfio/common.c | 89 ++++++++++++++++++++++++++++++++----------------
>   1 file changed, 60 insertions(+), 29 deletions(-)
> 
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index ac93b85632..84f08bdbbb 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -320,6 +320,41 @@ const MemoryRegionOps vfio_region_ops = {
>    * Device state interfaces
>    */
>   
> +typedef struct {
> +    unsigned long *bitmap;
> +    hwaddr size;
> +    hwaddr pages;
> +} VFIOBitmap;
> +
> +static VFIOBitmap *vfio_bitmap_alloc(hwaddr size)
> +{
> +    VFIOBitmap *vbmap = g_try_new0(VFIOBitmap, 1);

I think using g_malloc0() for the VFIOBitmap should be fine. If QEMU cannot
allocate a couple of bytes, we are in trouble anyway.
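
i.e. the struct allocation could become infallible while the potentially
large bitmap allocation stays fallible; a sketch of the suggestion, keeping
the size/pages math as posted:

    static VFIOBitmap *vfio_bitmap_alloc(hwaddr size)
    {
        VFIOBitmap *vbmap = g_new0(VFIOBitmap, 1);  /* aborts on OOM, fine for a few bytes */

        vbmap->pages = REAL_HOST_PAGE_ALIGN(size) / qemu_real_host_page_size();
        vbmap->size = ROUND_UP(vbmap->pages, sizeof(__u64) * BITS_PER_BYTE) /
                      BITS_PER_BYTE;
        vbmap->bitmap = g_try_malloc0(vbmap->size);
        if (!vbmap->bitmap) {
            g_free(vbmap);
            errno = ENOMEM;
            return NULL;
        }

        return vbmap;
    }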

Thanks,

C.


> +    if (!vbmap) {
> +        errno = ENOMEM;
> +
> +        return NULL;
> +    }
> +
> +    vbmap->pages = REAL_HOST_PAGE_ALIGN(size) / qemu_real_host_page_size();
> +    vbmap->size = ROUND_UP(vbmap->pages, sizeof(__u64) * BITS_PER_BYTE) /
> +                                         BITS_PER_BYTE;
> +    vbmap->bitmap = g_try_malloc0(vbmap->size);
> +    if (!vbmap->bitmap) {
> +        g_free(vbmap);
> +        errno = ENOMEM;
> +
> +        return NULL;
> +    }
> +
> +    return vbmap;
> +}
> +
> +static void vfio_bitmap_dealloc(VFIOBitmap *vbmap)
> +{
> +    g_free(vbmap->bitmap);
> +    g_free(vbmap);
> +}
> +
>   bool vfio_mig_active(void)
>   {
>       VFIOGroup *group;
> @@ -470,9 +505,14 @@ static int vfio_dma_unmap_bitmap(VFIOContainer *container,
>   {
>       struct vfio_iommu_type1_dma_unmap *unmap;
>       struct vfio_bitmap *bitmap;
> -    uint64_t pages = REAL_HOST_PAGE_ALIGN(size) / qemu_real_host_page_size();
> +    VFIOBitmap *vbmap;
>       int ret;
>   
> +    vbmap = vfio_bitmap_alloc(size);
> +    if (!vbmap) {
> +        return -errno;
> +    }
> +
>       unmap = g_malloc0(sizeof(*unmap) + sizeof(*bitmap));
>   
>       unmap->argsz = sizeof(*unmap) + sizeof(*bitmap);
> @@ -486,35 +526,28 @@ static int vfio_dma_unmap_bitmap(VFIOContainer *container,
>        * qemu_real_host_page_size to mark those dirty. Hence set bitmap_pgsize
>        * to qemu_real_host_page_size.
>        */
> -
>       bitmap->pgsize = qemu_real_host_page_size();
> -    bitmap->size = ROUND_UP(pages, sizeof(__u64) * BITS_PER_BYTE) /
> -                   BITS_PER_BYTE;
> +    bitmap->size = vbmap->size;
> +    bitmap->data = (__u64 *)vbmap->bitmap;
>   
> -    if (bitmap->size > container->max_dirty_bitmap_size) {
> -        error_report("UNMAP: Size of bitmap too big 0x%"PRIx64,
> -                     (uint64_t)bitmap->size);
> +    if (vbmap->size > container->max_dirty_bitmap_size) {
> +        error_report("UNMAP: Size of bitmap too big 0x%"PRIx64, vbmap->size);
>           ret = -E2BIG;
>           goto unmap_exit;
>       }
>   
> -    bitmap->data = g_try_malloc0(bitmap->size);
> -    if (!bitmap->data) {
> -        ret = -ENOMEM;
> -        goto unmap_exit;
> -    }
> -
>       ret = ioctl(container->fd, VFIO_IOMMU_UNMAP_DMA, unmap);
>       if (!ret) {
> -        cpu_physical_memory_set_dirty_lebitmap((unsigned long *)bitmap->data,
> -                iotlb->translated_addr, pages);
> +        cpu_physical_memory_set_dirty_lebitmap(vbmap->bitmap,
> +                iotlb->translated_addr, vbmap->pages);
>       } else {
>           error_report("VFIO_UNMAP_DMA with DIRTY_BITMAP : %m");
>       }
>   
> -    g_free(bitmap->data);
>   unmap_exit:
>       g_free(unmap);
> +    vfio_bitmap_dealloc(vbmap);
> +
>       return ret;
>   }
>   
> @@ -1331,7 +1364,7 @@ static int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
>   {
>       struct vfio_iommu_type1_dirty_bitmap *dbitmap;
>       struct vfio_iommu_type1_dirty_bitmap_get *range;
> -    uint64_t pages;
> +    VFIOBitmap *vbmap;
>       int ret;
>   
>       if (!container->dirty_pages_supported) {
> @@ -1341,6 +1374,11 @@ static int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
>           return 0;
>       }
>   
> +    vbmap = vfio_bitmap_alloc(size);
> +    if (!vbmap) {
> +        return -errno;
> +    }
> +
>       dbitmap = g_malloc0(sizeof(*dbitmap) + sizeof(*range));
>   
>       dbitmap->argsz = sizeof(*dbitmap) + sizeof(*range);
> @@ -1355,15 +1393,8 @@ static int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
>        * to qemu_real_host_page_size.
>        */
>       range->bitmap.pgsize = qemu_real_host_page_size();
> -
> -    pages = REAL_HOST_PAGE_ALIGN(range->size) / qemu_real_host_page_size();
> -    range->bitmap.size = ROUND_UP(pages, sizeof(__u64) * BITS_PER_BYTE) /
> -                                         BITS_PER_BYTE;
> -    range->bitmap.data = g_try_malloc0(range->bitmap.size);
> -    if (!range->bitmap.data) {
> -        ret = -ENOMEM;
> -        goto err_out;
> -    }
> +    range->bitmap.size = vbmap->size;
> +    range->bitmap.data = (__u64 *)vbmap->bitmap;
>   
>       ret = ioctl(container->fd, VFIO_IOMMU_DIRTY_PAGES, dbitmap);
>       if (ret) {
> @@ -1374,14 +1405,14 @@ static int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
>           goto err_out;
>       }
>   
> -    cpu_physical_memory_set_dirty_lebitmap((unsigned long *)range->bitmap.data,
> -                                            ram_addr, pages);
> +    cpu_physical_memory_set_dirty_lebitmap(vbmap->bitmap, ram_addr,
> +                                           vbmap->pages);
>   
>       trace_vfio_get_dirty_bitmap(container->fd, range->iova, range->size,
>                                   range->bitmap.size, ram_addr);
>   err_out:
> -    g_free(range->bitmap.data);
>       g_free(dbitmap);
> +    vfio_bitmap_dealloc(vbmap);
>   
>       return ret;
>   }



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 02/20] vfio/migration: Refactor vfio_save_block() to return saved data size
  2023-02-22 17:48 ` [PATCH v2 02/20] vfio/migration: Refactor vfio_save_block() to return saved data size Avihai Horon
@ 2023-02-27 14:10   ` Cédric Le Goater
  0 siblings, 0 replies; 93+ messages in thread
From: Cédric Le Goater @ 2023-02-27 14:10 UTC (permalink / raw)
  To: Avihai Horon, qemu-devel
  Cc: Alex Williamson, Juan Quintela, Dr. David Alan Gilbert,
	Michael S. Tsirkin, Peter Xu, Jason Wang, Marcel Apfelbaum,
	Paolo Bonzini, Richard Henderson, Eduardo Habkost,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta, Joao Martins

On 2/22/23 18:48, Avihai Horon wrote:
> Refactor vfio_save_block() to return the size of saved data on success
> and -errno on error.
> 
> This will be used in next patch to implement VFIO migration pre-copy
> support.
> 
> Signed-off-by: Avihai Horon <avihaih@nvidia.com>

LGTM

Reviewed-by: Cédric Le Goater <clg@redhat.com>

Thanks,

C.

> ---
>   hw/vfio/migration.c | 17 +++++++++--------
>   1 file changed, 9 insertions(+), 8 deletions(-)
> 
> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
> index 4fb7d01532..94a4df73d0 100644
> --- a/hw/vfio/migration.c
> +++ b/hw/vfio/migration.c
> @@ -240,8 +240,8 @@ static int vfio_query_stop_copy_size(VFIODevice *vbasedev,
>       return 0;
>   }
>   
> -/* Returns 1 if end-of-stream is reached, 0 if more data and -errno if error */
> -static int vfio_save_block(QEMUFile *f, VFIOMigration *migration)
> +/* Returns the size of saved data on success and -errno on error */
> +static ssize_t vfio_save_block(QEMUFile *f, VFIOMigration *migration)
>   {
>       ssize_t data_size;
>   
> @@ -251,7 +251,7 @@ static int vfio_save_block(QEMUFile *f, VFIOMigration *migration)
>           return -errno;
>       }
>       if (data_size == 0) {
> -        return 1;
> +        return 0;
>       }
>   
>       qemu_put_be64(f, VFIO_MIG_FLAG_DEV_DATA_STATE);
> @@ -261,7 +261,7 @@ static int vfio_save_block(QEMUFile *f, VFIOMigration *migration)
>   
>       trace_vfio_save_block(migration->vbasedev->name, data_size);
>   
> -    return qemu_file_get_error(f);
> +    return qemu_file_get_error(f) ?: data_size;
>   }
>   
>   /* ---------------------------------------------------------------------- */
> @@ -335,6 +335,7 @@ static void vfio_state_pending_exact(void *opaque, uint64_t threshold_size,
>   static int vfio_save_complete_precopy(QEMUFile *f, void *opaque)
>   {
>       VFIODevice *vbasedev = opaque;
> +    ssize_t data_size;
>       int ret;
>   
>       /* We reach here with device state STOP only */
> @@ -345,11 +346,11 @@ static int vfio_save_complete_precopy(QEMUFile *f, void *opaque)
>       }
>   
>       do {
> -        ret = vfio_save_block(f, vbasedev->migration);
> -        if (ret < 0) {
> -            return ret;
> +        data_size = vfio_save_block(f, vbasedev->migration);
> +        if (data_size < 0) {
> +            return data_size;
>           }
> -    } while (!ret);
> +    } while (data_size);
>   
>       qemu_put_be64(f, VFIO_MIG_FLAG_END_OF_STATE);
>       ret = qemu_file_get_error(f);



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 20/20] docs/devel: Document VFIO device dirty page tracking
  2023-02-22 17:49 ` [PATCH v2 20/20] docs/devel: Document VFIO device dirty page tracking Avihai Horon
@ 2023-02-27 14:29   ` Cédric Le Goater
  0 siblings, 0 replies; 93+ messages in thread
From: Cédric Le Goater @ 2023-02-27 14:29 UTC (permalink / raw)
  To: Avihai Horon, qemu-devel
  Cc: Alex Williamson, Juan Quintela, Dr. David Alan Gilbert,
	Michael S. Tsirkin, Peter Xu, Jason Wang, Marcel Apfelbaum,
	Paolo Bonzini, Richard Henderson, Eduardo Habkost,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta, Joao Martins

On 2/22/23 18:49, Avihai Horon wrote:
> Adjust the VFIO dirty page tracking documentation and add a section to
> describe device dirty page tracking.
> 
> Signed-off-by: Avihai Horon <avihaih@nvidia.com>

Reviewed-by: Cédric Le Goater <clg@redhat.com>

Thanks,

C.

> ---
>   docs/devel/vfio-migration.rst | 50 ++++++++++++++++++++++-------------
>   1 file changed, 32 insertions(+), 18 deletions(-)
> 
> diff --git a/docs/devel/vfio-migration.rst b/docs/devel/vfio-migration.rst
> index ba80b9150d..a432cda081 100644
> --- a/docs/devel/vfio-migration.rst
> +++ b/docs/devel/vfio-migration.rst
> @@ -71,22 +71,37 @@ System memory dirty pages tracking
>   ----------------------------------
>   
>   A ``log_global_start`` and ``log_global_stop`` memory listener callback informs
> -the VFIO IOMMU module to start and stop dirty page tracking. A ``log_sync``
> -memory listener callback marks those system memory pages as dirty which are
> -used for DMA by the VFIO device. The dirty pages bitmap is queried per
> -container. All pages pinned by the vendor driver through external APIs have to
> -be marked as dirty during migration. When there are CPU writes, CPU dirty page
> -tracking can identify dirtied pages, but any page pinned by the vendor driver
> -can also be written by the device. There is currently no device or IOMMU
> -support for dirty page tracking in hardware.
> +the VFIO dirty tracking module to start and stop dirty page tracking. A
> +``log_sync`` memory listener callback queries the dirty page bitmap from the
> +dirty tracking module and marks system memory pages which were DMA-ed by the
> +VFIO device as dirty. The dirty page bitmap is queried per container.
> +
> +Currently there are two ways dirty page tracking can be done:
> +(1) Device dirty tracking:
> +In this method the device is responsible to log and report its DMAs. This
> +method can be used only if the device is capable of tracking its DMAs.
> +Discovering device capability, starting and stopping dirty tracking, and
> +syncing the dirty bitmaps from the device are done using the DMA logging uAPI.
> +More info about the uAPI can be found in the comments of the
> +``vfio_device_feature_dma_logging_control`` and
> +``vfio_device_feature_dma_logging_report`` structures in the header file
> +linux-headers/linux/vfio.h.
> +
> +(2) VFIO IOMMU module:
> +In this method dirty tracking is done by IOMMU. However, there is currently no
> +IOMMU support for dirty page tracking. For this reason, all pages are
> +perpetually marked dirty, unless the device driver pins pages through external
> +APIs in which case only those pinned pages are perpetually marked dirty.
> +
> +If the above two methods are not supported, all pages are perpetually marked
> +dirty by QEMU.
>   
>   By default, dirty pages are tracked during pre-copy as well as stop-and-copy
> -phase. So, a page pinned by the vendor driver will be copied to the destination
> -in both phases. Copying dirty pages in pre-copy phase helps QEMU to predict if
> -it can achieve its downtime tolerances. If QEMU during pre-copy phase keeps
> -finding dirty pages continuously, then it understands that even in stop-and-copy
> -phase, it is likely to find dirty pages and can predict the downtime
> -accordingly.
> +phase. So, a page marked as dirty will be copied to the destination in both
> +phases. Copying dirty pages in pre-copy phase helps QEMU to predict if it can
> +achieve its downtime tolerances. If QEMU during pre-copy phase keeps finding
> +dirty pages continuously, then it understands that even in stop-and-copy phase,
> +it is likely to find dirty pages and can predict the downtime accordingly.
>   
>   QEMU also provides a per device opt-out option ``pre-copy-dirty-page-tracking``
>   which disables querying the dirty bitmap during pre-copy phase. If it is set to
> @@ -97,10 +112,9 @@ System memory dirty pages tracking when vIOMMU is enabled
>   ---------------------------------------------------------
>   
>   With vIOMMU, an IO virtual address range can get unmapped while in pre-copy
> -phase of migration. In that case, the unmap ioctl returns any dirty pages in
> -that range and QEMU reports corresponding guest physical pages dirty. During
> -stop-and-copy phase, an IOMMU notifier is used to get a callback for mapped
> -pages and then dirty pages bitmap is fetched from VFIO IOMMU modules for those
> +phase of migration. In that case, dirty page bitmap for this range is queried
> +and synced with QEMU. During stop-and-copy phase, an IOMMU notifier is used to
> +get a callback for mapped pages and then dirty page bitmap is fetched for those
>   mapped ranges.
>   
>   Flow of state changes during Live migration



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 03/20] vfio/migration: Add VFIO migration pre-copy support
  2023-02-26 16:43         ` Avihai Horon
@ 2023-02-27 16:14           ` Alex Williamson
  2023-02-27 17:26             ` Jason Gunthorpe
  0 siblings, 1 reply; 93+ messages in thread
From: Alex Williamson @ 2023-02-27 16:14 UTC (permalink / raw)
  To: Avihai Horon
  Cc: qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta, Joao Martins

On Sun, 26 Feb 2023 18:43:50 +0200
Avihai Horon <avihaih@nvidia.com> wrote:

> On 23/02/2023 23:16, Alex Williamson wrote:
> > External email: Use caution opening links or attachments
> >
> >
> > On Thu, 23 Feb 2023 17:25:12 +0200
> > Avihai Horon <avihaih@nvidia.com> wrote:
> >  
> >> On 22/02/2023 22:58, Alex Williamson wrote:  
> >>> External email: Use caution opening links or attachments
> >>>
> >>>
> >>> On Wed, 22 Feb 2023 19:48:58 +0200
> >>> Avihai Horon <avihaih@nvidia.com> wrote:
> >>>  
> >>>> @@ -302,23 +380,44 @@ static void vfio_save_cleanup(void *opaque)
> >>>>        trace_vfio_save_cleanup(vbasedev->name);
> >>>>    }
> >>>>
> >>>> +static void vfio_state_pending_estimate(void *opaque, uint64_t threshold_size,
> >>>> +                                        uint64_t *must_precopy,
> >>>> +                                        uint64_t *can_postcopy)
> >>>> +{
> >>>> +    VFIODevice *vbasedev = opaque;
> >>>> +    VFIOMigration *migration = vbasedev->migration;
> >>>> +
> >>>> +    if (migration->device_state != VFIO_DEVICE_STATE_PRE_COPY) {
> >>>> +        return;
> >>>> +    }
> >>>> +
> >>>> +    /*
> >>>> +     * Initial size should be transferred during pre-copy phase so stop-copy
> >>>> +     * phase will not be slowed down. Report threshold_size to force another
> >>>> +     * pre-copy iteration.
> >>>> +     */
> >>>> +    *must_precopy += migration->precopy_init_size ?
> >>>> +                         threshold_size :
> >>>> +                         migration->precopy_dirty_size;  
> >>> This sure feels like we're feeding false data back to the iterator to
> >>> spoof it to run another iteration, when the vfio migration protocol
> >>> only recommends that initial_bytes reaches zero before proceeding to
> >>> stop-copy, it's not a requirement.  What benefit is actually observed
> >>> from this?  Why is this required for initial pre-copy support?  It
> >>> seems devious.  
> >> As previously discussed in the thread that added the pre-copy uAPI [1],
> >> the init_bytes can be used by drivers to reduce the downtime.
> >> For example, mlx5 transfers some metadata to the target so it will be
> >> able to pre-allocate resources etc.
> >>
> >> [1]
> >> https://lore.kernel.org/kvm/ae4a6259-349d-0131-896c-7a6ea775cc9e@nvidia.com/  
> > Yes, but how does that become a requirement to QEMU that it must
> > iterate until the initial segment is complete?  Especially when we need
> > to trigger that behavior via such nefarious means.  AIUI, QEMU should
> > be allowed to move to stop-copy at any point.  We should make efforts
> > that QEMU would never decide on its own to move from pre-copy to
> > stop-copy without completing the init_bytes (which sounds suspiciously
> > like the purpose of @must_precopy),  
> 
> @must_precopy represents the pending bytes that must be transferred 
> during pre-copy or stop-copy. If it's under the threshold, then 
> migration will move to stop-copy and be completed.
> So simply adding init_bytes to @must_precopy will not guarantee that we 
> send all init_bytes before moving to stop-copy, since the transition to 
> stop-copy can happen when @must_precopy != 0.

But we have no requirement to send all init_bytes before stop-copy.
This is a hack to achieve a theoretical benefit that a driver might be
able to improve the latency on the target by completing another
iteration.  If drivers are filling in a "must_precopy" arg, it sounds
like even if migration moves to stop-copy, that data should be migrated
first and deferring stop-copy could potentially extend the migration in
other areas.

> >   but if, for instance a user forces a
> > transition to stop-copy, I don't see that we have any business to
> > impose a policy to delay that until the init_bytes is complete.  
> 
> Is there a way a user can force the migration to move to stop-copy?
> Looking at migration code, it seems that the only way to move to 
> stop-copy is if @must_precopy is below the threshold.
> If so, then this is our effort to make QEMU send all init_bytes before
> moving to stop-copy, and we can only benefit from it.

But we have no requirement to send all init_bytes before stop-copy.
This is a hack to achieve a theoretical benefit that a driver might be
able to improve the latency on the target by completing another
iteration.  If drivers are filling in a "must_precopy" arg, it sounds
like even if migration moves to stop-copy, that data should be migrated
first and deferring stop-copy could potentially extend the migration in
other areas.
 
> Regarding how to do it -- maybe instead of spoofing @must_precopy we can
> introduce a new parameter in the upper migration layer (e.g., @init_precopy)
> and add another condition in the migration layer that it must be zero to
> move to stop-copy.

Why not just move to stop-copy but transfer all must_precopy data
first?  That would seem to align with the naming to me.  I don't think
the device actually cares if the transfer happens while the device is
running or stopped, it just wants it at the target device early enough
to start configuration, right?

I'd drop this for an initial implementation, the uAPI does not require
that QEMU complete init_bytes before transitioning to stop-copy and
this is clearly not a very clean or well justified means to try to
achieve that as a non-requirement.  Thanks,

Alex



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 03/20] vfio/migration: Add VFIO migration pre-copy support
  2023-02-27 16:14           ` Alex Williamson
@ 2023-02-27 17:26             ` Jason Gunthorpe
  2023-02-27 17:43               ` Alex Williamson
  0 siblings, 1 reply; 93+ messages in thread
From: Jason Gunthorpe @ 2023-02-27 17:26 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Avihai Horon, qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Maor Gottlieb, Kirti Wankhede, Tarun Gupta,
	Joao Martins

On Mon, Feb 27, 2023 at 09:14:44AM -0700, Alex Williamson wrote:

> But we have no requirement to send all init_bytes before stop-copy.
> This is a hack to achieve a theoretical benefit that a driver might be
> able to improve the latency on the target by completing another
> iteration.

I think this is another half-step at this point..

The goal is to not stop the VM until the target VFIO driver has
completed loading initial_bytes.

This signals that the time consuming pre-setup is completed in the
device and we don't have to use downtime to do that work.

We've measured this in our devices and the time-shift can be
significant, like seconds levels of time removed from the downtime
period.

Stopping the VM before this pre-setup is done is simply extending the
stopped VM downtime.

Really what we want is to have the far side acknowledge that
initial_bytes has completed loading.

To remind, what mlx5 is doing here with precopy is time-shifting work,
not data. We want to put expensive work (ie time) into the period when
the VM is still running and have less downtime.

This challenges the assumption built into QEMU that all data has equal
time and that it can estimate downtime simply by scaling the estimated
data. We have a data-size independent time component to deal with as
well.

Jason


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 03/20] vfio/migration: Add VFIO migration pre-copy support
  2023-02-27 17:26             ` Jason Gunthorpe
@ 2023-02-27 17:43               ` Alex Williamson
  2023-03-01 18:49                 ` Avihai Horon
  0 siblings, 1 reply; 93+ messages in thread
From: Alex Williamson @ 2023-02-27 17:43 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Avihai Horon, qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Maor Gottlieb, Kirti Wankhede, Tarun Gupta,
	Joao Martins

On Mon, 27 Feb 2023 13:26:00 -0400
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Mon, Feb 27, 2023 at 09:14:44AM -0700, Alex Williamson wrote:
> 
> > But we have no requirement to send all init_bytes before stop-copy.
> > This is a hack to achieve a theoretical benefit that a driver might be
> > able to improve the latency on the target by completing another
> > iteration.  
> 
> I think this is another half-step at this point..
> 
> The goal is to not stop the VM until the target VFIO driver has
> completed loading initial_bytes.
> 
> This signals that the time consuming pre-setup is completed in the
> device and we don't have to use downtime to do that work.
> 
> We've measured this in our devices and the time-shift can be
> significant, like seconds levels of time removed from the downtime
> period.
> 
> Stopping the VM before this pre-setup is done is simply extending the
> stopped VM downtime.
> 
> Really what we want is to have the far side acknowledge that
> initial_bytes has completed loading.
> 
> To remind, what mlx5 is doing here with precopy is time-shifting work,
> not data. We want to put expensive work (ie time) into the period when
> the VM is still running and have less downtime.
> 
> This challenges the assumption built into qmeu that all data has equal
> time and it can estimate downtime time simply by scaling the estimated
> data. We have a data-size independent time component to deal with as
> well.

As I mentioned before, I understand the motivation, but imo the
implementation is exploiting the interface it extended in order to force
a device driven policy which is specifically not a requirement of the
vfio migration uAPI.  It sounds like there's more work required in the
QEMU migration interfaces to properly factor this information into the
algorithm.  Until then, this seems like a follow-on improvement unless
you can convince the migration maintainers that providing false
information in order to force another pre-copy iteration is a valid use
of passing the threshold value to the driver.  Thanks,

Alex



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 10/20] vfio/common: Record DMA mapped IOVA ranges
  2023-02-23 21:50           ` Alex Williamson
  2023-02-23 21:54             ` Joao Martins
@ 2023-02-28 12:11             ` Joao Martins
  2023-02-28 20:36               ` Alex Williamson
  1 sibling, 1 reply; 93+ messages in thread
From: Joao Martins @ 2023-02-28 12:11 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Avihai Horon, qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta

On 23/02/2023 21:50, Alex Williamson wrote:
> On Thu, 23 Feb 2023 21:19:12 +0000
> Joao Martins <joao.m.martins@oracle.com> wrote:
>> On 23/02/2023 21:05, Alex Williamson wrote:
>>> On Thu, 23 Feb 2023 10:37:10 +0000
>>> Joao Martins <joao.m.martins@oracle.com> wrote:  
>>>> On 22/02/2023 22:10, Alex Williamson wrote:  
>>>>> On Wed, 22 Feb 2023 19:49:05 +0200
>>>>> Avihai Horon <avihaih@nvidia.com> wrote:    
>>>>>> From: Joao Martins <joao.m.martins@oracle.com>
>>>>>> @@ -612,6 +665,16 @@ static int vfio_dma_map(VFIOContainer *container, hwaddr iova,
>>>>>>          .iova = iova,
>>>>>>          .size = size,
>>>>>>      };
>>>>>> +    int ret;
>>>>>> +
>>>>>> +    ret = vfio_record_mapping(container, iova, size, readonly);
>>>>>> +    if (ret) {
>>>>>> +        error_report("vfio: Failed to record mapping, iova: 0x%" HWADDR_PRIx
>>>>>> +                     ", size: 0x" RAM_ADDR_FMT ", ret: %d (%s)",
>>>>>> +                     iova, size, ret, strerror(-ret));
>>>>>> +
>>>>>> +        return ret;
>>>>>> +    }    
>>>>>
>>>>> Is there no way to replay the mappings when a migration is started?
>>>>> This seems like a horrible latency and bloat trade-off for the
>>>>> possibility that the VM might migrate and the device might support
>>>>> these features.  Our performance with vIOMMU is already terrible, I
>>>>> can't help but believe this makes it worse.  Thanks,
>>>>>     
>>>>
>>>> It is a nop if the vIOMMU is being used (entries in container->giommu_list) as
>>>> that uses a max-iova based IOVA range. So this is really for iommu identity
>>>> mapping and no-VIOMMU.  
>>>
>>> Ok, yes, there are no mappings recorded for any containers that have a
>>> non-empty giommu_list.
>>>   
>>>> We could replay them if they were tracked/stored anywhere.  
>>>
>>> Rather than piggybacking on vfio_memory_listener, why not simply
>>> register a new MemoryListener when migration is started?  That will
>>> replay all the existing ranges and allow tracking to happen separate
>>> from mapping, and only when needed.
>>>   
>>
>> The problem with that is that *starting* dirty tracking needs to have all the
>> range, we aren't supposed to start each range separately. So on a memory
>> listener callback you don't have introspection when you are dealing with the
>> last range, do we?
> 
> As soon as memory_listener_register() returns, all your callbacks to
> build the IOVATree have been called and you can act on the result the
> same as if you were relying on the vfio mapping MemoryListener.  I'm
> not seeing the problem.  Thanks,
> 

While doing these changes, the nice thing about the current patch is that
whatever changes apply to vfio_listener_region_add() will be reflected in the
mappings tree that stores what we will dirty track. If we move the mappings
calculation to the point where dirty tracking is started, we will have to
duplicate the same checks, and open the door to bugs where we ask for ranges
to be dirty tracked that haven't been DMA mapped. These two aren't necessarily
tied, but I felt I should raise the potential duplication of the checks (and
the same thing applies for handling virtio-mem and whatnot).

I understand that if we were going to store *a lot* of mappings, this would
add up in space requirements. But for the no-vIOMMU (or iommu=pt) case this is
only about 12 ranges or so, so it is much simpler to piggyback on the existing
listener. Would you still want to move this to its own dedicated memory
listener?

	Joao
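
For illustration, below is a minimal sketch of the dedicated-listener
approach being discussed: register a MemoryListener only when dirty tracking
starts, so that all existing sections are replayed into min/max IOVA marks.
The VFIODirtyRanges type and function names are made up for the sketch and
are not taken from the series:

  typedef struct VFIODirtyRanges {
      MemoryListener listener;
      hwaddr min;
      hwaddr max;
  } VFIODirtyRanges;

  static void vfio_dirty_tracking_region_add(MemoryListener *listener,
                                             MemoryRegionSection *section)
  {
      VFIODirtyRanges *range = container_of(listener, VFIODirtyRanges,
                                            listener);
      hwaddr iova = section->offset_within_address_space;
      hwaddr end = iova + int128_get64(section->size) - 1;

      /* Widen the marks that will later be handed to the device. */
      range->min = MIN(range->min, iova);
      range->max = MAX(range->max, end);
  }

  static void vfio_dirty_tracking_init(VFIOContainer *container,
                                       VFIODirtyRanges *range)
  {
      range->min = HWADDR_MAX;
      range->max = 0;
      range->listener = (MemoryListener) {
          .region_add = vfio_dirty_tracking_region_add,
      };
      /* Registering replays all current sections, so the ranges are known
       * before DMA logging is started. */
      memory_listener_register(&range->listener, container->space->as);
      memory_listener_unregister(&range->listener);
  }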


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 10/20] vfio/common: Record DMA mapped IOVA ranges
  2023-02-28 12:11             ` Joao Martins
@ 2023-02-28 20:36               ` Alex Williamson
  2023-03-02  0:07                 ` Joao Martins
  0 siblings, 1 reply; 93+ messages in thread
From: Alex Williamson @ 2023-02-28 20:36 UTC (permalink / raw)
  To: Joao Martins
  Cc: Avihai Horon, qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta

On Tue, 28 Feb 2023 12:11:06 +0000
Joao Martins <joao.m.martins@oracle.com> wrote:

> On 23/02/2023 21:50, Alex Williamson wrote:
> > On Thu, 23 Feb 2023 21:19:12 +0000
> > Joao Martins <joao.m.martins@oracle.com> wrote:  
> >> On 23/02/2023 21:05, Alex Williamson wrote:  
> >>> On Thu, 23 Feb 2023 10:37:10 +0000
> >>> Joao Martins <joao.m.martins@oracle.com> wrote:    
> >>>> On 22/02/2023 22:10, Alex Williamson wrote:    
> >>>>> On Wed, 22 Feb 2023 19:49:05 +0200
> >>>>> Avihai Horon <avihaih@nvidia.com> wrote:      
> >>>>>> From: Joao Martins <joao.m.martins@oracle.com>
> >>>>>> @@ -612,6 +665,16 @@ static int vfio_dma_map(VFIOContainer *container, hwaddr iova,
> >>>>>>          .iova = iova,
> >>>>>>          .size = size,
> >>>>>>      };
> >>>>>> +    int ret;
> >>>>>> +
> >>>>>> +    ret = vfio_record_mapping(container, iova, size, readonly);
> >>>>>> +    if (ret) {
> >>>>>> +        error_report("vfio: Failed to record mapping, iova: 0x%" HWADDR_PRIx
> >>>>>> +                     ", size: 0x" RAM_ADDR_FMT ", ret: %d (%s)",
> >>>>>> +                     iova, size, ret, strerror(-ret));
> >>>>>> +
> >>>>>> +        return ret;
> >>>>>> +    }      
> >>>>>
> >>>>> Is there no way to replay the mappings when a migration is started?
> >>>>> This seems like a horrible latency and bloat trade-off for the
> >>>>> possibility that the VM might migrate and the device might support
> >>>>> these features.  Our performance with vIOMMU is already terrible, I
> >>>>> can't help but believe this makes it worse.  Thanks,
> >>>>>       
> >>>>
> >>>> It is a nop if the vIOMMU is being used (entries in container->giommu_list) as
> >>>> that uses a max-iova based IOVA range. So this is really for iommu identity
> >>>> mapping and no-VIOMMU.    
> >>>
> >>> Ok, yes, there are no mappings recorded for any containers that have a
> >>> non-empty giommu_list.
> >>>     
> >>>> We could replay them if they were tracked/stored anywhere.    
> >>>
> >>> Rather than piggybacking on vfio_memory_listener, why not simply
> >>> register a new MemoryListener when migration is started?  That will
> >>> replay all the existing ranges and allow tracking to happen separate
> >>> from mapping, and only when needed.
> >>>     
> >>
> >> The problem with that is that *starting* dirty tracking needs to have all the
> >> range, we aren't supposed to start each range separately. So on a memory
> >> listener callback you don't have introspection when you are dealing with the
> >> last range, do we?  
> > 
> > As soon as memory_listener_register() returns, all your callbacks to
> > build the IOVATree have been called and you can act on the result the
> > same as if you were relying on the vfio mapping MemoryListener.  I'm
> > not seeing the problem.  Thanks,
> >   
> 
> While doing these changes, the nice thing about the current patch is that
> whatever changes apply to vfio_listener_region_add() will be reflected in the
> mappings tree that stores what we will dirty track. If we move the mappings
> calculation to the point where dirty tracking is started, we will have to
> duplicate the same checks, and open the door to bugs where we ask for ranges
> to be dirty tracked that haven't been DMA mapped. These two aren't necessarily
> tied, but I felt I should raise the potential duplication of the checks (and
> the same thing applies for handling virtio-mem and whatnot).
> 
> I understand that if we were going to store *a lot* of mappings, this would
> add up in space requirements. But for the no-vIOMMU (or iommu=pt) case this is
> only about 12 ranges or so, so it is much simpler to piggyback on the existing
> listener. Would you still want to move this to its own dedicated memory
> listener?

Code duplication and bugs are good points, but while typically we're
only seeing a few handfuls of ranges, doesn't virtio-mem in particular
allow that we could be seeing quite a lot more?

We used to be limited to a fairly small number of KVM memory slots,
which effectively bounded non-vIOMMU DMA mappings, but that value is
now 2^15, so we need to anticipate that we could see many more than a
dozen mappings.

Can we make the same argument that the overhead is negligible if a VM
makes use of 10s of GB of virtio-mem with 2MB block size?

But then on a 4KB host we're limited to 256 tracking entries, so
wasting all that time and space on a runtime IOVATree is even more
dubious.

In fact, it doesn't really matter that vfio_listener_region_add and
this potentially new listener come to the same result, as long as the
new listener is a superset of the existing listener.  So I think we can
simplify out a lot of the places we'd see duplication and bugs.  I'm
not even really sure why we wouldn't simplify things further and only
record a single range covering the low and high memory marks for
non-vIOMMU VMs, or potentially an approximation removing gaps of 1GB or
more, for example.  Thanks,

Alex
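
As a rough illustration of the single-range / gap-removal approximation
mentioned above (purely a sketch; the 1GB threshold, type and helper name
are illustrative, not from the series):

  #define GAP_THRESHOLD   (1ULL << 30)    /* illustrative: merge gaps < 1GB */

  typedef struct {
      hwaddr iova;
      hwaddr end;     /* inclusive */
  } DirtyRange;

  /*
   * Coalesce a sorted array of DMA-mapped ranges, swallowing any gap smaller
   * than GAP_THRESHOLD, so the device only needs to track a handful of large
   * ranges.  Returns the number of ranges kept.
   */
  static int coalesce_ranges(DirtyRange *r, int n)
  {
      int out = 0;

      for (int i = 1; i < n; i++) {
          if (r[i].iova - r[out].end < GAP_THRESHOLD) {
              r[out].end = r[i].end;      /* small gap: extend previous range */
          } else {
              r[++out] = r[i];            /* large gap: keep a separate range */
          }
      }
      return n ? out + 1 : 0;
  }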



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 03/20] vfio/migration: Add VFIO migration pre-copy support
  2023-02-27 17:43               ` Alex Williamson
@ 2023-03-01 18:49                 ` Avihai Horon
  2023-03-01 19:55                   ` Alex Williamson
  0 siblings, 1 reply; 93+ messages in thread
From: Avihai Horon @ 2023-03-01 18:49 UTC (permalink / raw)
  To: Alex Williamson, Jason Gunthorpe
  Cc: qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Maor Gottlieb, Kirti Wankhede, Tarun Gupta,
	Joao Martins


On 27/02/2023 19:43, Alex Williamson wrote:
> External email: Use caution opening links or attachments
>
>
> On Mon, 27 Feb 2023 13:26:00 -0400
> Jason Gunthorpe <jgg@nvidia.com> wrote:
>
>> On Mon, Feb 27, 2023 at 09:14:44AM -0700, Alex Williamson wrote:
>>
>>> But we have no requirement to send all init_bytes before stop-copy.
>>> This is a hack to achieve a theoretical benefit that a driver might be
>>> able to improve the latency on the target by completing another
>>> iteration.
>> I think this is another half-step at this point..
>>
>> The goal is to not stop the VM until the target VFIO driver has
>> completed loading initial_bytes.
>>
>> This signals that the time consuming pre-setup is completed in the
>> device and we don't have to use downtime to do that work.
>>
>> We've measured this in our devices and the time-shift can be
>> significant, like seconds levels of time removed from the downtime
>> period.
>>
>> Stopping the VM before this pre-setup is done is simply extending the
>> stopped VM downtime.
>>
>> Really what we want is to have the far side acknowledge that
>> initial_bytes has completed loading.
>>
>> To remind, what mlx5 is doing here with precopy is time-shifting work,
>> not data. We want to put expensive work (ie time) into the period when
>> the VM is still running and have less downtime.
>>
>> This challenges the assumption built into QEMU that all data has equal
>> time and that it can estimate downtime simply by scaling the estimated
>> data. We have a data-size independent time component to deal with as
>> well.
> As I mentioned before, I understand the motivation, but imo the
> implementation is exploiting the interface it extended in order to force
> a device driven policy which is specifically not a requirement of the
> vfio migration uAPI.  It sounds like there's more work required in the
> QEMU migration interfaces to properly factor this information into the
> algorithm.  Until then, this seems like a follow-on improvement unless
> you can convince the migration maintainers that providing false
> information in order to force another pre-copy iteration is a valid use
> of passing the threshold value to the driver.

In my previous message I suggested dropping this exploit and instead 
changing the QEMU migration API to introduce the concept of pre-copy 
initial bytes -- data that must be transferred before the source VM 
stops (which is different from the current @must_precopy, which represents 
data that can be transferred even while the VM is stopped).
We could do it by adding a new parameter "init_precopy_size" to the 
state_pending_{estimate,exact} handlers and every migration user could 
use it (RAM, block, etc).
We will also change the migration algorithm to take this new parameter 
into account when deciding to move to stop-copy.
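
To make the idea concrete, a rough sketch of what such an extension could 
look like (the extra parameter and its plumbing are hypothetical -- the 
exact shape would be up to the migration maintainers):

  /* Hypothetical extension of the pending-state handlers (sketch only): */
  static void vfio_state_pending_estimate(void *opaque, uint64_t threshold_size,
                                          uint64_t *must_precopy,
                                          uint64_t *can_postcopy,
                                          uint64_t *init_precopy_size)
  {
      VFIODevice *vbasedev = opaque;
      VFIOMigration *migration = vbasedev->migration;

      *must_precopy += migration->precopy_dirty_size;
      /* Data that has to be sent while the source VM is still running. */
      *init_precopy_size += migration->precopy_init_size;
  }

  /*
   * The migration core would then require both conditions before switching
   * to stop-copy (sketch):
   *
   *     if (init_precopy_size == 0 && must_precopy <= threshold_size) {
   *         ... move to stop-copy ...
   *     }
   */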

Of course this will have to be approved by migration maintainers first, 
but if it's done in a standard way such as above, via the migration API, 
would it be OK by you to go this way?

Thanks.



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 07/20] vfio/common: Add VFIOBitmap and (de)alloc functions
  2023-02-27 14:09   ` Cédric Le Goater
@ 2023-03-01 18:56     ` Avihai Horon
  2023-03-02 13:24     ` Joao Martins
  1 sibling, 0 replies; 93+ messages in thread
From: Avihai Horon @ 2023-03-01 18:56 UTC (permalink / raw)
  To: Cédric Le Goater, qemu-devel
  Cc: Alex Williamson, Juan Quintela, Dr. David Alan Gilbert,
	Michael S. Tsirkin, Peter Xu, Jason Wang, Marcel Apfelbaum,
	Paolo Bonzini, Richard Henderson, Eduardo Habkost,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta, Joao Martins


On 27/02/2023 16:09, Cédric Le Goater wrote:
> External email: Use caution opening links or attachments
>
>
> On 2/22/23 18:49, Avihai Horon wrote:
>> There are already two places where dirty page bitmap allocation and
>> calculations are done in open code. With device dirty page tracking
>> being added in next patches, there are going to be even more places.
>>
>> To avoid code duplication, introduce VFIOBitmap struct and corresponding
>> alloc and dealloc functions and use them where applicable.
>>
>> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
>> ---
>>   hw/vfio/common.c | 89 ++++++++++++++++++++++++++++++++----------------
>>   1 file changed, 60 insertions(+), 29 deletions(-)
>>
>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>> index ac93b85632..84f08bdbbb 100644
>> --- a/hw/vfio/common.c
>> +++ b/hw/vfio/common.c
>> @@ -320,6 +320,41 @@ const MemoryRegionOps vfio_region_ops = {
>>    * Device state interfaces
>>    */
>>
>> +typedef struct {
>> +    unsigned long *bitmap;
>> +    hwaddr size;
>> +    hwaddr pages;
>> +} VFIOBitmap;
>> +
>> +static VFIOBitmap *vfio_bitmap_alloc(hwaddr size)
>> +{
>> +    VFIOBitmap *vbmap = g_try_new0(VFIOBitmap, 1);
>
> I think using g_malloc0() for the VFIOBitmap should be fine. If QEMU
> cannot allocate a couple of bytes, we are in trouble anyway.
>
Sure, this will simplify the code a bit. I will change it.

Thanks.
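
For illustration, one possible shape of the helper after that change: the 
tiny struct allocated with plain g_new0(), while the potentially large 
bitmap keeps the fallible allocation. This is only a sketch, not the final 
code:

  static VFIOBitmap *vfio_bitmap_alloc(hwaddr size)
  {
      VFIOBitmap *vbmap = g_new0(VFIOBitmap, 1); /* small struct, abort on OOM */

      vbmap->pages = REAL_HOST_PAGE_ALIGN(size) / qemu_real_host_page_size();
      vbmap->size = ROUND_UP(vbmap->pages, sizeof(__u64) * BITS_PER_BYTE) /
                    BITS_PER_BYTE;
      vbmap->bitmap = g_try_malloc0(vbmap->size); /* can be large, keep fallible */
      if (!vbmap->bitmap) {
          g_free(vbmap);
          errno = ENOMEM;
          return NULL;
      }

      return vbmap;
  }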

>
>
>> +    if (!vbmap) {
>> +        errno = ENOMEM;
>> +
>> +        return NULL;
>> +    }
>> +
>> +    vbmap->pages = REAL_HOST_PAGE_ALIGN(size) / 
>> qemu_real_host_page_size();
>> +    vbmap->size = ROUND_UP(vbmap->pages, sizeof(__u64) * 
>> BITS_PER_BYTE) /
>> +                                         BITS_PER_BYTE;
>> +    vbmap->bitmap = g_try_malloc0(vbmap->size);
>> +    if (!vbmap->bitmap) {
>> +        g_free(vbmap);
>> +        errno = ENOMEM;
>> +
>> +        return NULL;
>> +    }
>> +
>> +    return vbmap;
>> +}
>> +
>> +static void vfio_bitmap_dealloc(VFIOBitmap *vbmap)
>> +{
>> +    g_free(vbmap->bitmap);
>> +    g_free(vbmap);
>> +}
>> +
>>   bool vfio_mig_active(void)
>>   {
>>       VFIOGroup *group;
>> @@ -470,9 +505,14 @@ static int vfio_dma_unmap_bitmap(VFIOContainer 
>> *container,
>>   {
>>       struct vfio_iommu_type1_dma_unmap *unmap;
>>       struct vfio_bitmap *bitmap;
>> -    uint64_t pages = REAL_HOST_PAGE_ALIGN(size) / 
>> qemu_real_host_page_size();
>> +    VFIOBitmap *vbmap;
>>       int ret;
>>
>> +    vbmap = vfio_bitmap_alloc(size);
>> +    if (!vbmap) {
>> +        return -errno;
>> +    }
>> +
>>       unmap = g_malloc0(sizeof(*unmap) + sizeof(*bitmap));
>>
>>       unmap->argsz = sizeof(*unmap) + sizeof(*bitmap);
>> @@ -486,35 +526,28 @@ static int vfio_dma_unmap_bitmap(VFIOContainer 
>> *container,
>>        * qemu_real_host_page_size to mark those dirty. Hence set 
>> bitmap_pgsize
>>        * to qemu_real_host_page_size.
>>        */
>> -
>>       bitmap->pgsize = qemu_real_host_page_size();
>> -    bitmap->size = ROUND_UP(pages, sizeof(__u64) * BITS_PER_BYTE) /
>> -                   BITS_PER_BYTE;
>> +    bitmap->size = vbmap->size;
>> +    bitmap->data = (__u64 *)vbmap->bitmap;
>>
>> -    if (bitmap->size > container->max_dirty_bitmap_size) {
>> -        error_report("UNMAP: Size of bitmap too big 0x%"PRIx64,
>> -                     (uint64_t)bitmap->size);
>> +    if (vbmap->size > container->max_dirty_bitmap_size) {
>> +        error_report("UNMAP: Size of bitmap too big 0x%"PRIx64, 
>> vbmap->size);
>>           ret = -E2BIG;
>>           goto unmap_exit;
>>       }
>>
>> -    bitmap->data = g_try_malloc0(bitmap->size);
>> -    if (!bitmap->data) {
>> -        ret = -ENOMEM;
>> -        goto unmap_exit;
>> -    }
>> -
>>       ret = ioctl(container->fd, VFIO_IOMMU_UNMAP_DMA, unmap);
>>       if (!ret) {
>> -        cpu_physical_memory_set_dirty_lebitmap((unsigned long 
>> *)bitmap->data,
>> -                iotlb->translated_addr, pages);
>> + cpu_physical_memory_set_dirty_lebitmap(vbmap->bitmap,
>> +                iotlb->translated_addr, vbmap->pages);
>>       } else {
>>           error_report("VFIO_UNMAP_DMA with DIRTY_BITMAP : %m");
>>       }
>>
>> -    g_free(bitmap->data);
>>   unmap_exit:
>>       g_free(unmap);
>> +    vfio_bitmap_dealloc(vbmap);
>> +
>>       return ret;
>>   }
>>
>> @@ -1331,7 +1364,7 @@ static int vfio_get_dirty_bitmap(VFIOContainer 
>> *container, uint64_t iova,
>>   {
>>       struct vfio_iommu_type1_dirty_bitmap *dbitmap;
>>       struct vfio_iommu_type1_dirty_bitmap_get *range;
>> -    uint64_t pages;
>> +    VFIOBitmap *vbmap;
>>       int ret;
>>
>>       if (!container->dirty_pages_supported) {
>> @@ -1341,6 +1374,11 @@ static int vfio_get_dirty_bitmap(VFIOContainer 
>> *container, uint64_t iova,
>>           return 0;
>>       }
>>
>> +    vbmap = vfio_bitmap_alloc(size);
>> +    if (!vbmap) {
>> +        return -errno;
>> +    }
>> +
>>       dbitmap = g_malloc0(sizeof(*dbitmap) + sizeof(*range));
>>
>>       dbitmap->argsz = sizeof(*dbitmap) + sizeof(*range);
>> @@ -1355,15 +1393,8 @@ static int vfio_get_dirty_bitmap(VFIOContainer 
>> *container, uint64_t iova,
>>        * to qemu_real_host_page_size.
>>        */
>>       range->bitmap.pgsize = qemu_real_host_page_size();
>> -
>> -    pages = REAL_HOST_PAGE_ALIGN(range->size) / 
>> qemu_real_host_page_size();
>> -    range->bitmap.size = ROUND_UP(pages, sizeof(__u64) * 
>> BITS_PER_BYTE) /
>> -                                         BITS_PER_BYTE;
>> -    range->bitmap.data = g_try_malloc0(range->bitmap.size);
>> -    if (!range->bitmap.data) {
>> -        ret = -ENOMEM;
>> -        goto err_out;
>> -    }
>> +    range->bitmap.size = vbmap->size;
>> +    range->bitmap.data = (__u64 *)vbmap->bitmap;
>>
>>       ret = ioctl(container->fd, VFIO_IOMMU_DIRTY_PAGES, dbitmap);
>>       if (ret) {
>> @@ -1374,14 +1405,14 @@ static int 
>> vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
>>           goto err_out;
>>       }
>>
>> -    cpu_physical_memory_set_dirty_lebitmap((unsigned long 
>> *)range->bitmap.data,
>> -                                            ram_addr, pages);
>> +    cpu_physical_memory_set_dirty_lebitmap(vbmap->bitmap, ram_addr,
>> +                                           vbmap->pages);
>>
>>       trace_vfio_get_dirty_bitmap(container->fd, range->iova, 
>> range->size,
>>                                   range->bitmap.size, ram_addr);
>>   err_out:
>> -    g_free(range->bitmap.data);
>>       g_free(dbitmap);
>> +    vfio_bitmap_dealloc(vbmap);
>>
>>       return ret;
>>   }
>


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 00/20] vfio: Add migration pre-copy support and device dirty tracking
  2023-02-27 13:50         ` Cédric Le Goater
@ 2023-03-01 19:04           ` Avihai Horon
  0 siblings, 0 replies; 93+ messages in thread
From: Avihai Horon @ 2023-03-01 19:04 UTC (permalink / raw)
  To: Cédric Le Goater, Joao Martins, Alex Williamson
  Cc: qemu-devel, Juan Quintela, Dr. David Alan Gilbert,
	Michael S. Tsirkin, Peter Xu, Jason Wang, Marcel Apfelbaum,
	Paolo Bonzini, Richard Henderson, Eduardo Habkost,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta


On 27/02/2023 15:50, Cédric Le Goater wrote:
> External email: Use caution opening links or attachments
>
>
> On 2/26/23 18:00, Avihai Horon wrote:
>>
>> On 24/02/2023 21:26, Joao Martins wrote:
>>> External email: Use caution opening links or attachments
>>>
>>>
>>> On 23/02/2023 14:56, Avihai Horon wrote:
>>>> On 22/02/2023 22:55, Alex Williamson wrote:
>>>>> There are various errors running this through the CI on gitlab.
>>>>>
>>>>> This one seems bogus but needs to be resolved regardless:
>>>>>
>>>>> https://gitlab.com/alex.williamson/qemu/-/jobs/3817940731
>>>>> FAILED: libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o
>>>>> 2786s390x-linux-gnu-gcc -m64 -Ilibqemu-aarch64-softmmu.fa.p -I. -I..
>>>>> -Itarget/arm -I../target/arm -Iqapi -Itrace -Iui -Iui/shader
>>>>> -I/usr/include/pixman-1 -I/usr/include/capstone 
>>>>> -I/usr/include/glib-2.0
>>>>> -I/usr/lib/s390x-linux-gnu/glib-2.0/include 
>>>>> -fdiagnostics-color=auto -Wall
>>>>> -Winvalid-pch -Werror -std=gnu11 -O2 -g -isystem
>>>>> /builds/alex.williamson/qemu/linux-headers -isystem linux-headers 
>>>>> -iquote .
>>>>> -iquote /builds/alex.williamson/qemu -iquote
>>>>> /builds/alex.williamson/qemu/include -iquote
>>>>> /builds/alex.williamson/qemu/tcg/s390x -pthread -U_FORTIFY_SOURCE
>>>>> -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 
>>>>> -D_LARGEFILE_SOURCE
>>>>> -fno-strict-aliasing -fno-common -fwrapv -Wundef -Wwrite-strings
>>>>> -Wmissing-prototypes -Wstrict-prototypes -Wredundant-decls
>>>>> -Wold-style-declaration -Wold-style-definition -Wtype-limits 
>>>>> -Wformat-security
>>>>> -Wformat-y2k -Winit-self -Wignored-qualifiers -Wempty-body 
>>>>> -Wnested-externs
>>>>> -Wendif-labels -Wexpansion-to-defined -Wimplicit-fallthrough=2
>>>>> -Wmissing-format-attribute -Wno-missing-include-dirs 
>>>>> -Wno-shift-negative-value
>>>>> -Wno-psabi -fstack-protector-strong -fPIE -isystem../linux-headers
>>>>> -isystemlinux-headers -DNEED_CPU_H
>>>>> '-DCONFIG_TARGET="aarch64-softmmu-config-target.h"'
>>>>> '-DCONFIG_DEVICES="aarch64-softmmu-config-devices.h"' -MD -MQ
>>>>> libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o -MF
>>>>> libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o.d -o
>>>>> libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o -c 
>>>>> ../hw/vfio/common.c
>>>>> 2787../hw/vfio/common.c: In function 
>>>>> ‘vfio_listener_log_global_start’:
>>>>> 2788../hw/vfio/common.c:1772:8: error: ‘ret’ may be used 
>>>>> uninitialized in this
>>>>> function [-Werror=maybe-uninitialized]
>>>>> 2789 1772 |     if (ret) {
>>>>> 2790      |        ^
>>>>>
>>>>> 32-bit builds have some actual errors though:
>>>>>
>>>>> https://gitlab.com/alex.williamson/qemu/-/jobs/3817940719
>>>>> FAILED: libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o
>>>>> 2601cc -m32 -Ilibqemu-aarch64-softmmu.fa.p -I. -I.. -Itarget/arm
>>>>> -I../target/arm -Iqapi -Itrace -Iui -Iui/shader 
>>>>> -I/usr/include/pixman-1
>>>>> -I/usr/include/glib-2.0 -I/usr/lib/glib-2.0/include 
>>>>> -I/usr/include/sysprof-4
>>>>> -fdiagnostics-color=auto -Wall -Winvalid-pch -Werror -std=gnu11 
>>>>> -O2 -g
>>>>> -isystem /builds/alex.williamson/qemu/linux-headers -isystem 
>>>>> linux-headers
>>>>> -iquote . -iquote /builds/alex.williamson/qemu -iquote
>>>>> /builds/alex.williamson/qemu/include -iquote
>>>>> /builds/alex.williamson/qemu/tcg/i386 -pthread -U_FORTIFY_SOURCE
>>>>> -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 
>>>>> -D_LARGEFILE_SOURCE
>>>>> -fno-strict-aliasing -fno-common -fwrapv -Wundef -Wwrite-strings
>>>>> -Wmissing-prototypes -Wstrict-prototypes -Wredundant-decls
>>>>> -Wold-style-declaration -Wold-style-definition -Wtype-limits 
>>>>> -Wformat-security
>>>>> -Wformat-y2k -Winit-self -Wignored-qualifiers -Wempty-body 
>>>>> -Wnested-externs
>>>>> -Wendif-labels -Wexpansion-to-defined -Wimplicit-fallthrough=2
>>>>> -Wmissing-format-attribute -Wno-missing-include-dirs 
>>>>> -Wno-shift-negative-value
>>>>> -Wno-psabi -fstack-protector-strong -fPIE -isystem../linux-headers
>>>>> -isystemlinux-headers -DNEED_CPU_H
>>>>> '-DCONFIG_TARGET="aarch64-softmmu-config-target.h"'
>>>>> '-DCONFIG_DEVICES="aarch64-softmmu-config-devices.h"' -MD -MQ
>>>>> libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o -MF
>>>>> libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o.d -o
>>>>> libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o -c 
>>>>> ../hw/vfio/common.c
>>>>> 2602../hw/vfio/common.c: In function
>>>>> 'vfio_device_feature_dma_logging_start_create':
>>>>> 2603../hw/vfio/common.c:1572:27: error: cast from pointer to 
>>>>> integer of
>>>>> different size [-Werror=pointer-to-int-cast]
>>>>> 2604 1572 |         control->ranges = (uint64_t)ranges;
>>>>> 2605      |                           ^
>>>>> 2606../hw/vfio/common.c:1596:23: error: cast from pointer to 
>>>>> integer of
>>>>> different size [-Werror=pointer-to-int-cast]
>>>>> 2607 1596 |     control->ranges = (uint64_t)ranges;
>>>>> 2608      |                       ^
>>>>> 2609../hw/vfio/common.c: In function
>>>>> 'vfio_device_feature_dma_logging_start_destroy':
>>>>> 2610../hw/vfio/common.c:1620:9: error: cast to pointer from 
>>>>> integer of
>>>>> different size [-Werror=int-to-pointer-cast]
>>>>> 2611 1620 |         (struct vfio_device_feature_dma_logging_range
>>>>> *)control->ranges;
>>>>> 2612      |         ^
>>>>> 2613../hw/vfio/common.c: In function 
>>>>> 'vfio_device_dma_logging_report':
>>>>> 2614../hw/vfio/common.c:1810:22: error: cast from pointer to 
>>>>> integer of
>>>>> different size [-Werror=pointer-to-int-cast]
>>>>> 2615 1810 |     report->bitmap = (uint64_t)bitmap;
>>>>> 2616      |                      ^
>>>> Sure, I will fix these errors.
>>> Just a thought: should the pre-copy series be moved towards the end 
>>> of this
>>> series, given that it's more of an improvement of downtime than a 
>>> must-have like
>>> dirty tracking?
>>
>> Given recent discussion, maybe it would be better to split this 
>> series and go one step at a time:
>> Start with basic support for device dirty tracking (without vIOMMU 
>> support), then add pre-copy and then add vIOMMU support to device 
>> dirty tracking.
>
> and add the fixes first in the series. They could be merged quickly.

Yes, of course. I will add them.

Thanks.



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 03/20] vfio/migration: Add VFIO migration pre-copy support
  2023-03-01 18:49                 ` Avihai Horon
@ 2023-03-01 19:55                   ` Alex Williamson
  2023-03-01 21:12                     ` Jason Gunthorpe
  0 siblings, 1 reply; 93+ messages in thread
From: Alex Williamson @ 2023-03-01 19:55 UTC (permalink / raw)
  To: Avihai Horon
  Cc: Jason Gunthorpe, qemu-devel, Cédric Le Goater,
	Juan Quintela, Dr. David Alan Gilbert, Michael S. Tsirkin,
	Peter Xu, Jason Wang, Marcel Apfelbaum, Paolo Bonzini,
	Richard Henderson, Eduardo Habkost, David Hildenbrand,
	Philippe Mathieu-Daudé,
	Yishai Hadas, Maor Gottlieb, Kirti Wankhede, Tarun Gupta,
	Joao Martins

On Wed, 1 Mar 2023 20:49:28 +0200
Avihai Horon <avihaih@nvidia.com> wrote:

> On 27/02/2023 19:43, Alex Williamson wrote:
> > External email: Use caution opening links or attachments
> >
> >
> > On Mon, 27 Feb 2023 13:26:00 -0400
> > Jason Gunthorpe <jgg@nvidia.com> wrote:
> >  
> >> On Mon, Feb 27, 2023 at 09:14:44AM -0700, Alex Williamson wrote:
> >>  
> >>> But we have no requirement to send all init_bytes before stop-copy.
> >>> This is a hack to achieve a theoretical benefit that a driver might be
> >>> able to improve the latency on the target by completing another
> >>> iteration.  
> >> I think this is another half-step at this point..
> >>
> >> The goal is to not stop the VM until the target VFIO driver has
> >> completed loading initial_bytes.
> >>
> >> This signals that the time consuming pre-setup is completed in the
> >> device and we don't have to use downtime to do that work.
> >>
> >> We've measured this in our devices and the time-shift can be
> >> significant, like seconds levels of time removed from the downtime
> >> period.
> >>
> >> Stopping the VM before this pre-setup is done is simply extending the
> >> stopped VM downtime.
> >>
> >> Really what we want is to have the far side acknowledge that
> >> initial_bytes has completed loading.
> >>
> >> To remind, what mlx5 is doing here with precopy is time-shifting work,
> >> not data. We want to put expensive work (ie time) into the period when
> >> the VM is still running and have less downtime.
> >>
> >> This challenges the assumption built into QEMU that all data has equal
> >> time and that it can estimate downtime simply by scaling the estimated
> >> data. We have a data-size independent time component to deal with as
> >> well.
> > As I mentioned before, I understand the motivation, but imo the
> > implementation is exploiting the interface it extended in order to force
> > a device driven policy which is specifically not a requirement of the
> > vfio migration uAPI.  It sounds like there's more work required in the
> > QEMU migration interfaces to properly factor this information into the
> > algorithm.  Until then, this seems like a follow-on improvement unless
> > you can convince the migration maintainers that providing false
> > information in order to force another pre-copy iteration is a valid use
> > of passing the threshold value to the driver.  
> 
> In my previous message I suggested dropping this exploit and instead 
> changing the QEMU migration API to introduce the concept of pre-copy 
> initial bytes -- data that must be transferred before the source VM 
> stops (which is different from the current @must_precopy, which represents 
> data that can be transferred even while the VM is stopped).
> We could do it by adding a new parameter "init_precopy_size" to the 
> state_pending_{estimate,exact} handlers and every migration user could 
> use it (RAM, block, etc).
> We will also change the migration algorithm to take this new parameter 
> into account when deciding to move to stop-copy.
> 
> Of course this will have to be approved by migration maintainers first, 
> but if it's done in a standard way such as above, via the migration API, 
> would it be OK by you to go this way?

I still think we're conflating information and requirements by allowing
a device to impose a policy which keeps QEMU in pre-copy.  AIUI, what
we're trying to do is maximize the time separation between the
initial_bytes from the device and the end-of-stream.  But knowing the
data size of initial_bytes is not really all that useful.

If we think about the limits of network bandwidth, all data transfers
approach zero time, but the startup latency of the target device that
we're trying to maximize here is fixed.  By prioritizing initial_bytes,
we're separating in space the beginning of target device setup from the
end-of-stream, but that's only an approximation of time, which is what
QEMU really needs to know to honor downtime requirements.

So it seems like what we need here is both a preface buffer size and a
target device latency.  The QEMU pre-copy algorithm should factor both
the remaining data size and the device latency into deciding when to
transition to stop-copy, thereby allowing the device to feed actually
relevant data into the algorithm rather than dictate its behavior.
Thanks,

Alex



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 03/20] vfio/migration: Add VFIO migration pre-copy support
  2023-03-01 19:55                   ` Alex Williamson
@ 2023-03-01 21:12                     ` Jason Gunthorpe
  2023-03-01 22:39                       ` Alex Williamson
  0 siblings, 1 reply; 93+ messages in thread
From: Jason Gunthorpe @ 2023-03-01 21:12 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Avihai Horon, qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Maor Gottlieb, Kirti Wankhede, Tarun Gupta,
	Joao Martins

On Wed, Mar 01, 2023 at 12:55:59PM -0700, Alex Williamson wrote:

> So it seems like what we need here is both a preface buffer size and a
> target device latency.  The QEMU pre-copy algorithm should factor both
> the remaining data size and the device latency into deciding when to
> transition to stop-copy, thereby allowing the device to feed actually
> relevant data into the algorithm rather than dictate its behavior.

I don't know that we can realistically estimate startup latency,
especially having the sender estimate latency on the receiver.

I feel like trying to overlap the device startup with the STOP phase
is an unnecessary optimization? How do you see it benefiting?

I've been thinking of this from the perspective that we should always
ensure device startup is completed, it is time that has to be paid,
why pay it during STOP?

Jason


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 03/20] vfio/migration: Add VFIO migration pre-copy support
  2023-03-01 21:12                     ` Jason Gunthorpe
@ 2023-03-01 22:39                       ` Alex Williamson
  2023-03-06 19:01                         ` Jason Gunthorpe
  0 siblings, 1 reply; 93+ messages in thread
From: Alex Williamson @ 2023-03-01 22:39 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Avihai Horon, qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Maor Gottlieb, Kirti Wankhede, Tarun Gupta,
	Joao Martins

On Wed, 1 Mar 2023 17:12:51 -0400
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Wed, Mar 01, 2023 at 12:55:59PM -0700, Alex Williamson wrote:
> 
> > So it seems like what we need here is both a preface buffer size and a
> > target device latency.  The QEMU pre-copy algorithm should factor both
> > the remaining data size and the device latency into deciding when to
> > transition to stop-copy, thereby allowing the device to feed actually
> > relevant data into the algorithm rather than dictate its behavior.  
> 
> I don't know that we can realistically estimate startup latency,
> especially having the sender estimate latency on the receiver.

Knowing that the target device is compatible with the source is a point
towards making an educated guess.

> I feel like trying to overlap the device startup with the STOP phase
> is an unnecessary optimization? How do you see it benefiting?

If we can't guarantee that there's some time difference between sending
initial bytes immediately at the end of pre-copy vs immediately at the
beginning of stop-copy, does that mean any handling of initial bytes is
an unnecessary optimization?

I'm imagining that completing initial bytes triggers some
initialization sequence in the target host driver which runs in
parallel to the remaining data stream, so in practice, even if sent at
the beginning of stop-copy, the target device gets a head start.

> I've been thinking of this from the perspective that we should always
> ensure device startup is completed, it is time that has to be paid,
> why pay it during STOP?

Creating a policy for QEMU to send initial bytes in a given phase
doesn't ensure startup is complete.  There's no guaranteed time
difference between sending that data and the beginning of stop-copy.

QEMU is trying to achieve a downtime goal, where it estimates network
bandwidth to get a data size threshold, and then polls devices for
remaining data.  That downtime goal might exceed the startup latency of
the target device anyway, in which case it's the operator's choice to pay
that time in stop-copy or while stalled on the target.

But if we actually want to ensure startup of the target is complete,
then drivers should be able to return both data size and estimated time
for the target device to initialize.  That time estimate should be
updated by the driver based on if/when initial_bytes is drained.  The
decision whether to continue iterating pre-copy would then be based on
both the maximum remaining device startup time and the calculated time
based on remaining data size.
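
To make that concrete, here is a minimal sketch of such a convergence check,
assuming a hypothetical per-device estimate of target startup time reported
alongside the remaining data size (the names are illustrative, not an existing
QEMU API):

#include <stdbool.h>
#include <stdint.h>

/*
 * Stop iterating pre-copy only if transferring the remaining data and
 * starting up the slowest target device both fit in the downtime goal.
 * If device startup cannot overlap the remaining transfer, the max()
 * below would become a sum instead.
 */
static bool precopy_can_converge(uint64_t remaining_bytes,
                                 uint64_t bandwidth_bytes_per_ms,
                                 uint64_t max_device_startup_ms,
                                 uint64_t downtime_limit_ms)
{
    uint64_t bw = bandwidth_bytes_per_ms ? bandwidth_bytes_per_ms : 1;
    uint64_t transfer_ms = remaining_bytes / bw;
    uint64_t worst_ms = transfer_ms > max_device_startup_ms ?
                        transfer_ms : max_device_startup_ms;

    return worst_ms <= downtime_limit_ms;
}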

I think this provides a better guarantee than anything based simply on
transferring a given chunk of data in a specific phase of the process.
Thoughts?  Thanks,

Alex



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 10/20] vfio/common: Record DMA mapped IOVA ranges
  2023-02-28 20:36               ` Alex Williamson
@ 2023-03-02  0:07                 ` Joao Martins
  2023-03-02  0:13                   ` Joao Martins
  2023-03-02 18:42                   ` Alex Williamson
  0 siblings, 2 replies; 93+ messages in thread
From: Joao Martins @ 2023-03-02  0:07 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Avihai Horon, qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta

On 28/02/2023 20:36, Alex Williamson wrote:
> On Tue, 28 Feb 2023 12:11:06 +0000
> Joao Martins <joao.m.martins@oracle.com> wrote:
>> On 23/02/2023 21:50, Alex Williamson wrote:
>>> On Thu, 23 Feb 2023 21:19:12 +0000
>>> Joao Martins <joao.m.martins@oracle.com> wrote:  
>>>> On 23/02/2023 21:05, Alex Williamson wrote:  
>>>>> On Thu, 23 Feb 2023 10:37:10 +0000
>>>>> Joao Martins <joao.m.martins@oracle.com> wrote:    
>>>>>> On 22/02/2023 22:10, Alex Williamson wrote:    
>>>>>>> On Wed, 22 Feb 2023 19:49:05 +0200
>>>>>>> Avihai Horon <avihaih@nvidia.com> wrote:      
>>>>>>>> From: Joao Martins <joao.m.martins@oracle.com>
>>>>>>>> @@ -612,6 +665,16 @@ static int vfio_dma_map(VFIOContainer *container, hwaddr iova,
>>>>>>>>          .iova = iova,
>>>>>>>>          .size = size,
>>>>>>>>      };
>>>>>>>> +    int ret;
>>>>>>>> +
>>>>>>>> +    ret = vfio_record_mapping(container, iova, size, readonly);
>>>>>>>> +    if (ret) {
>>>>>>>> +        error_report("vfio: Failed to record mapping, iova: 0x%" HWADDR_PRIx
>>>>>>>> +                     ", size: 0x" RAM_ADDR_FMT ", ret: %d (%s)",
>>>>>>>> +                     iova, size, ret, strerror(-ret));
>>>>>>>> +
>>>>>>>> +        return ret;
>>>>>>>> +    }      
>>>>>>>
>>>>>>> Is there no way to replay the mappings when a migration is started?
>>>>>>> This seems like a horrible latency and bloat trade-off for the
>>>>>>> possibility that the VM might migrate and the device might support
>>>>>>> these features.  Our performance with vIOMMU is already terrible, I
>>>>>>> can't help but believe this makes it worse.  Thanks,
>>>>>>>       
>>>>>>
>>>>>> It is a nop if the vIOMMU is being used (entries in container->giommu_list) as
>>>>>> that uses a max-iova based IOVA range. So this is really for iommu identity
>>>>>> mapping and no-VIOMMU.    
>>>>>
>>>>> Ok, yes, there are no mappings recorded for any containers that have a
>>>>> non-empty giommu_list.
>>>>>     
>>>>>> We could replay them if they were tracked/stored anywhere.    
>>>>>
>>>>> Rather than piggybacking on vfio_memory_listener, why not simply
>>>>> register a new MemoryListener when migration is started?  That will
>>>>> replay all the existing ranges and allow tracking to happen separate
>>>>> from mapping, and only when needed.
>>>>
>>>> The problem with that is that *starting* dirty tracking needs to have all the
>>>> range, we aren't supposed to start each range separately. So on a memory
>>>> listener callback you don't have introspection when you are dealing with the
>>>> last range, do we?  
>>>
>>> As soon as memory_listener_register() returns, all your callbacks to
>>> build the IOVATree have been called and you can act on the result the
>>> same as if you were relying on the vfio mapping MemoryListener.  I'm
>>> not seeing the problem.  Thanks,
>>>   
>>
>> While doing these changes, the nice thing of the current patch is that whatever
>> changes apply to vfio_listener_region_add() will be reflected in the mappings
>> tree that stores what we will dirty track. If we move the mappings calculation
>> necessary for dirty tracking only when we start, we will have to duplicate the
>> same checks, and open for bugs where we ask things to be dirty track-ed that
>> haven't been DMA mapped. These two aren't necessarily tied, but felt like I
>> should raise the potentially duplication of the checks (and the same thing
>> applies for handling virtio-mem and what not).
>>
>> I understand that if we were going to store *a lot* of mappings that this would
>> add up in space requirements. But for no-vIOMMU (or iommu=pt) case this is only
>> about 12ranges or so, it is much simpler to piggyback the existing listener.
>> Would you still want to move this to its own dedicated memory listener?
> 
> Code duplication and bugs are good points, but while typically we're
> only seeing a few handfuls of ranges, doesn't virtio-mem in particular
> allow that we could be seeing quite a lot more?
> 
Ugh yes, it could be.

> We used to be limited to a fairly small number of KVM memory slots,
> which effectively bounded non-vIOMMU DMA mappings, but that value is
> now 2^15, so we need to anticipate that we could see many more than a
> dozen mappings.
> 

Even with 32k memory slots, today we are still limited to a handful. The
hv-balloon and virtio-mem approaches, though, are the ones that may stress
such a limit, IIUC, prior to starting migration.

> Can we make the same argument that the overhead is negligible if a VM
> makes use of 10s of GB of virtio-mem with 2MB block size?
> 
> But then on a 4KB host we're limited to 256 tracking entries, so
> wasting all that time and space on a runtime IOVATree is even more
> dubious.
>
> In fact, it doesn't really matter that vfio_listener_region_add and
> this potentially new listener come to the same result, as long as the
> new listener is a superset of the existing listener. 

I am trying to structure this in a way that's not too ugly while reusing as
much as possible between vfio_listener_region_add() and vfio_migration_mapping_add().

To give you an idea, here's how it looks thus far:

https://github.com/jpemartins/qemu/commits/vfio-dirty-tracking

Particularly this one:

https://github.com/jpemartins/qemu/commit/3b11fa0e4faa0f9c0f42689a7367284a25d1b585

vfio_get_section_iova_range() is where most of these checks live; they are
essentially a subset of the ones in vfio_listener_region_add().

> So I think we can
> simplify out a lot of the places we'd see duplication and bugs.  I'm
> not even really sure why we wouldn't simplify things further and only
> record a single range covering the low and high memory marks for a
> non-vIOMMU VMs, or potentially an approximation removing gaps of 1GB or
> more, for example.  Thanks,

Yes, for Qemu, having one single artificial range with a computed min IOVA and
max IOVA is the simplest to implement. It would avoid us maintaining an
IOVATree, as you would only track a min/max pair (maybe max_below).

My concern with a reduced single range is 1) big holes in the address space
leading to asking for more than you need[0] and 2) device dirty tracking limits,
e.g. hardware may have upper limits, so you may prematurely exercise those. So
giving more choice to the vfio drivers to decide how to cope with the mapped
address space description looks to have a bit more longevity.

Anyway, the temptation with having a single range is that this can all go away
if vfio_listener_region_add() tracks just a min/max IOVA pair.

Below the scissors mark is how this patch looks in the commit above, keeping a
full list of mappings. It's also stored here:

https://github.com/jpemartins/qemu/commits/vfio-dirty-tracking

I'll respond here with a patch on what it looks like with the range watermark
approach.

	Joao

[0] The AMD 1T boundary is what comes to mind, which on Qemu relocates memory
above 4G to after 1T.

------------------>8--------------------

From: Joao Martins <joao.m.martins@oracle.com>
Date: Wed, 22 Feb 2023 19:49:05 +0200
Subject: [PATCH wip 7/12] vfio/common: Record DMA mapped IOVA ranges

According to the device DMA logging uAPI, IOVA ranges to be logged by
the device must be provided all at once upon DMA logging start.

As preparation for the following patches which will add device dirty
page tracking, keep a record of all DMA mapped IOVA ranges so later they
can be used for DMA logging start.

Note that when vIOMMU is enabled DMA mapped IOVA ranges are not tracked.
This is due to the dynamic nature of vIOMMU DMA mapping/unmapping.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
 hw/vfio/common.c              | 147 +++++++++++++++++++++++++++++++++-
 hw/vfio/trace-events          |   2 +
 include/hw/vfio/vfio-common.h |   4 +
 3 files changed, 150 insertions(+), 3 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 655e8dbb74d4..17971e6dbaeb 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -44,6 +44,7 @@
 #include "migration/blocker.h"
 #include "migration/qemu-file.h"
 #include "sysemu/tpm.h"
+#include "qemu/iova-tree.h"

 VFIOGroupList vfio_group_list =
     QLIST_HEAD_INITIALIZER(vfio_group_list);
@@ -426,6 +427,11 @@ void vfio_unblock_multiple_devices_migration(void)
     multiple_devices_migration_blocker = NULL;
 }

+static bool vfio_have_giommu(VFIOContainer *container)
+{
+    return !QLIST_EMPTY(&container->giommu_list);
+}
+
 static void vfio_set_migration_error(int err)
 {
     MigrationState *ms = migrate_get_current();
@@ -610,6 +616,7 @@ static int vfio_dma_map(VFIOContainer *container, hwaddr iova,
         .iova = iova,
         .size = size,
     };
+    int ret;

     if (!readonly) {
         map.flags |= VFIO_DMA_MAP_FLAG_WRITE;
@@ -626,8 +633,10 @@ static int vfio_dma_map(VFIOContainer *container, hwaddr iova,
         return 0;
     }

+    ret = -errno;
     error_report("VFIO_MAP_DMA failed: %s", strerror(errno));
-    return -errno;
+
+    return ret;
 }

 static void vfio_host_win_add(VFIOContainer *container,
@@ -1326,11 +1335,127 @@ static int vfio_set_dirty_page_tracking(VFIOContainer *container, bool start)
     return ret;
 }

+static bool vfio_get_section_iova_range(VFIOContainer *container,
+                                        MemoryRegionSection *section,
+                                        hwaddr *out_iova, hwaddr *out_end)
+{
+    Int128 llend, llsize;
+    hwaddr iova, end;
+
+    iova = REAL_HOST_PAGE_ALIGN(section->offset_within_address_space);
+    llend = int128_make64(section->offset_within_address_space);
+    llend = int128_add(llend, section->size);
+    llend = int128_and(llend, int128_exts64(qemu_real_host_page_mask()));
+
+    if (int128_ge(int128_make64(iova), llend)) {
+        return false;
+    }
+    end = int128_get64(int128_sub(llend, int128_one()));
+
+    if (memory_region_is_iommu(section->mr) ||
+        memory_region_has_ram_discard_manager(section->mr)) {
+        return false;
+    }
+
+    llsize = int128_sub(llend, int128_make64(iova));
+
+    if (memory_region_is_ram_device(section->mr)) {
+        VFIOHostDMAWindow *hostwin;
+        hwaddr pgmask;
+
+        hostwin = vfio_find_hostwin(container, iova, end);
+        if (!hostwin) {
+            return false;
+        }
+
+        pgmask = (1ULL << ctz64(hostwin->iova_pgsizes)) - 1;
+        if ((iova & pgmask) || (int128_get64(llsize) & pgmask)) {
+            return false;
+        }
+    }
+
+    *out_iova = iova;
+    *out_end = int128_get64(llend);
+    return true;
+}
+
+static void vfio_migration_add_mapping(MemoryListener *listener,
+                                       MemoryRegionSection *section)
+{
+    VFIOContainer *container = container_of(listener, VFIOContainer, mappings_listener);
+    hwaddr end = 0;
+    DMAMap map;
+    int ret;
+
+    if (vfio_have_giommu(container)) {
+        vfio_set_migration_error(-EOPNOTSUPP);
+        return;
+    }
+
+    if (!vfio_listener_valid_section(section) ||
+        !vfio_get_section_iova_range(container, section, &map.iova, &end)) {
+        return;
+    }
+
+    map.size = end - map.iova - 1; /* IOVATree is inclusive, so subtract 1 from size */
+    map.perm = section->readonly ? IOMMU_RO : IOMMU_RW;
+
+    WITH_QEMU_LOCK_GUARD(&container->mappings_mutex) {
+        ret = iova_tree_insert(container->mappings, &map);
+        if (ret) {
+            if (ret == IOVA_ERR_INVALID) {
+                ret = -EINVAL;
+            } else if (ret == IOVA_ERR_OVERLAP) {
+                ret = -EEXIST;
+            }
+        }
+    }
+
+    trace_vfio_migration_mapping_add(map.iova, map.iova + map.size, ret);
+
+    if (ret)
+        vfio_set_migration_error(ret);
+    return;
+}
+
+static void vfio_migration_remove_mapping(MemoryListener *listener,
+                                          MemoryRegionSection *section)
+{
+    VFIOContainer *container = container_of(listener, VFIOContainer, mappings_listener);
+    hwaddr end = 0;
+    DMAMap map;
+
+    if (vfio_have_giommu(container)) {
+        vfio_set_migration_error(-EOPNOTSUPP);
+        return;
+    }
+
+    if (!vfio_listener_valid_section(section) ||
+        !vfio_get_section_iova_range(container, section, &map.iova, &end)) {
+        return;
+    }
+
+    WITH_QEMU_LOCK_GUARD(&container->mappings_mutex) {
+        iova_tree_remove(container->mappings, map);
+    }
+
+    trace_vfio_migration_mapping_del(map.iova, map.iova + map.size);
+}
+
+
+static const MemoryListener vfio_dirty_tracking_listener = {
+    .name = "vfio-migration",
+    .region_add = vfio_migration_add_mapping,
+    .region_del = vfio_migration_remove_mapping,
+};
+
 static void vfio_listener_log_global_start(MemoryListener *listener)
 {
     VFIOContainer *container = container_of(listener, VFIOContainer, listener);
     int ret;

+    memory_listener_register(&container->mappings_listener, container->space->as);
+
     ret = vfio_set_dirty_page_tracking(container, true);
     if (ret) {
         vfio_set_migration_error(ret);
@@ -1346,6 +1471,8 @@ static void vfio_listener_log_global_stop(MemoryListener *listener)
     if (ret) {
         vfio_set_migration_error(ret);
     }
+
+    memory_listener_unregister(&container->mappings_listener);
 }

 static int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
@@ -2172,16 +2299,24 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
     QLIST_INIT(&container->giommu_list);
     QLIST_INIT(&container->hostwin_list);
     QLIST_INIT(&container->vrdl_list);
+    container->mappings = iova_tree_new();
+    if (!container->mappings) {
+        error_setg(errp, "Cannot allocate DMA mappings tree");
+        ret = -ENOMEM;
+        goto free_container_exit;
+    }
+    qemu_mutex_init(&container->mappings_mutex);
+    container->mappings_listener = vfio_dirty_tracking_listener;

     ret = vfio_init_container(container, group->fd, errp);
     if (ret) {
-        goto free_container_exit;
+        goto destroy_mappings_exit;
     }

     ret = vfio_ram_block_discard_disable(container, true);
     if (ret) {
         error_setg_errno(errp, -ret, "Cannot set discarding of RAM broken");
-        goto free_container_exit;
+        goto destroy_mappings_exit;
     }

     switch (container->iommu_type) {
@@ -2317,6 +2452,10 @@ listener_release_exit:
 enable_discards_exit:
     vfio_ram_block_discard_disable(container, false);

+destroy_mappings_exit:
+    qemu_mutex_destroy(&container->mappings_mutex);
+    iova_tree_destroy(container->mappings);
+
 free_container_exit:
     g_free(container);

@@ -2371,6 +2510,8 @@ static void vfio_disconnect_container(VFIOGroup *group)
         }

         trace_vfio_disconnect_container(container->fd);
+        qemu_mutex_destroy(&container->mappings_mutex);
+        iova_tree_destroy(container->mappings);
         close(container->fd);
         g_free(container);

diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
index 669d9fe07cd9..c92eaadcc7c4 100644
--- a/hw/vfio/trace-events
+++ b/hw/vfio/trace-events
@@ -104,6 +104,8 @@ vfio_known_safe_misalignment(const char *name, uint64_t iova, uint64_t offset_wi
 vfio_listener_region_add_no_dma_map(const char *name, uint64_t iova, uint64_t size, uint64_t page_size) "Region \"%s\" 0x%"PRIx64" size=0x%"PRIx64" is not aligned to 0x%"PRIx64" and cannot be mapped for DMA"
 vfio_listener_region_del_skip(uint64_t start, uint64_t end) "SKIPPING region_del 0x%"PRIx64" - 0x%"PRIx64
 vfio_listener_region_del(uint64_t start, uint64_t end) "region_del 0x%"PRIx64" - 0x%"PRIx64
+vfio_migration_mapping_add(uint64_t start, uint64_t end, int err) "mapping_add 0x%"PRIx64" - 0x%"PRIx64" err=%d"
+vfio_migration_mapping_del(uint64_t start, uint64_t end) "mapping_del 0x%"PRIx64" - 0x%"PRIx64
 vfio_disconnect_container(int fd) "close container->fd=%d"
 vfio_put_group(int fd) "close group->fd=%d"
 vfio_get_device(const char * name, unsigned int flags, unsigned int num_regions, unsigned int num_irqs) "Device %s flags: %u, regions: %u, irqs: %u"
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 87524c64a443..48951da11ab4 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -23,6 +23,7 @@

 #include "exec/memory.h"
 #include "qemu/queue.h"
+#include "qemu/iova-tree.h"
 #include "qemu/notify.h"
 #include "ui/console.h"
 #include "hw/display/ramfb.h"
@@ -81,6 +82,7 @@ typedef struct VFIOContainer {
     int fd; /* /dev/vfio/vfio, empowered by the attached groups */
     MemoryListener listener;
     MemoryListener prereg_listener;
+    MemoryListener mappings_listener;
     unsigned iommu_type;
     Error *error;
     bool initialized;
@@ -89,6 +91,8 @@ typedef struct VFIOContainer {
     uint64_t max_dirty_bitmap_size;
     unsigned long pgsizes;
     unsigned int dma_max_mappings;
+    IOVATree *mappings;
+    QemuMutex mappings_mutex;
     QLIST_HEAD(, VFIOGuestIOMMU) giommu_list;
     QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list;
     QLIST_HEAD(, VFIOGroup) group_list;
--
2.17.2


^ permalink raw reply related	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 10/20] vfio/common: Record DMA mapped IOVA ranges
  2023-03-02  0:07                 ` Joao Martins
@ 2023-03-02  0:13                   ` Joao Martins
  2023-03-02 18:42                   ` Alex Williamson
  1 sibling, 0 replies; 93+ messages in thread
From: Joao Martins @ 2023-03-02  0:13 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Avihai Horon, qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta

On 02/03/2023 00:07, Joao Martins wrote:
> On 28/02/2023 20:36, Alex Williamson wrote:

[...]

>> Can we make the same argument that the overhead is negligible if a VM
>> makes use of 10s of GB of virtio-mem with 2MB block size?
>>
>> But then on a 4KB host we're limited to 256 tracking entries, so
>> wasting all that time and space on a runtime IOVATree is even more
>> dubious.
>>
>> In fact, it doesn't really matter that vfio_listener_region_add and
>> this potentially new listener come to the same result, as long as the
>> new listener is a superset of the existing listener. 
> 
> I am trying to put this in a way that's not too ugly to reuse the most between
> vfio_listener_region_add() and the vfio_migration_mapping_add().
> 
> For you to have an idea, here's so far how it looks thus far:
> 
> https://github.com/jpemartins/qemu/commits/vfio-dirty-tracking
> 
> Particularly this one:
> 
> https://github.com/jpemartins/qemu/commit/3b11fa0e4faa0f9c0f42689a7367284a25d1b585
> 
> vfio_get_section_iova_range() is where most of these checks are that are sort of
> a subset of the ones in vfio_listener_region_add().
> 
>> So I think we can
>> simplify out a lot of the places we'd see duplication and bugs.  I'm
>> not even really sure why we wouldn't simplify things further and only
>> record a single range covering the low and high memory marks for a
>> non-vIOMMU VMs, or potentially an approximation removing gaps of 1GB or
>> more, for example.  Thanks,
> 
> Yes, for Qemu, to have one single artificial range with a computed min IOVA and
> max IOVA is the simplest to get it implemented. It would avoid us maintaining an
> IOVATree as you would only track min/max pair (maybe max_below).
> 
> My concern with a reduced single range is 1) big holes in address space leading
> to asking more than you need[*] and then 2) device dirty tracking limits e.g.
> hardware may have upper limits, so you may prematurely exercise those. So giving
> more choice to the vfio drivers to decide how to cope with the mapped address
> space description looks to have a bit more longevity.
> 
> Anyway the temptation with having a single range is that this can all go away if
> the vfio_listener_region_add() tracks just min/max IOVA pair.
> 
> Below scissors mark it's how this patch is looking like in the commit above
> while being a full list of mappings. It's also stored here:
> 
> https://github.com/jpemartins/qemu/commits/vfio-dirty-tracking
> 
> I'll respond here with a patch on what it looks like with the range watermark
> approach.
> 

... Which is here:

https://github.com/jpemartins/qemu/commits/vfio-dirty-tracking-range

And below the scissors mark at the end is this patch in the series. It is
smaller; most of the churn is the new checks. I need to adjust the commit
messages, depending on which way the group decides to go, so take those with a
grain of salt.

> 
> [0] AMD 1T boundary is what comes to mind, which on Qemu relocates memory above
> 4G into after 1T.

---------------->8-----------------

From: Joao Martins <joao.m.martins@oracle.com>
Date: Wed, 22 Feb 2023 19:49:05 +0200
Subject: [PATCH wip 7/12] vfio/common: Record DMA mapped IOVA ranges

According to the device DMA logging uAPI, IOVA ranges to be logged by
the device must be provided all at once upon DMA logging start.

As preparation for the following patches which will add device dirty
page tracking, keep a record of all DMA mapped IOVA ranges so later they
can be used for DMA logging start.

Note that when vIOMMU is enabled DMA mapped IOVA ranges are not tracked.
This is due to the dynamic nature of vIOMMU DMA mapping/unmapping.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
 hw/vfio/common.c              | 110 ++++++++++++++++++++++++++++++++--
 hw/vfio/trace-events          |   1 +
 include/hw/vfio/vfio-common.h |   5 ++
 3 files changed, 112 insertions(+), 4 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 655e8dbb74d4..ff4a2aa0e14b 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -44,6 +44,7 @@
 #include "migration/blocker.h"
 #include "migration/qemu-file.h"
 #include "sysemu/tpm.h"
+#include "qemu/iova-tree.h"

 VFIOGroupList vfio_group_list =
     QLIST_HEAD_INITIALIZER(vfio_group_list);
@@ -426,6 +427,11 @@ void vfio_unblock_multiple_devices_migration(void)
     multiple_devices_migration_blocker = NULL;
 }

+static bool vfio_have_giommu(VFIOContainer *container)
+{
+    return !QLIST_EMPTY(&container->giommu_list);
+}
+
 static void vfio_set_migration_error(int err)
 {
     MigrationState *ms = migrate_get_current();
@@ -610,6 +616,7 @@ static int vfio_dma_map(VFIOContainer *container, hwaddr iova,
         .iova = iova,
         .size = size,
     };
+    int ret;

     if (!readonly) {
         map.flags |= VFIO_DMA_MAP_FLAG_WRITE;
@@ -626,8 +633,10 @@ static int vfio_dma_map(VFIOContainer *container, hwaddr iova,
         return 0;
     }

+    ret = -errno;
     error_report("VFIO_MAP_DMA failed: %s", strerror(errno));
-    return -errno;
+
+    return ret;
 }

 static void vfio_host_win_add(VFIOContainer *container,
@@ -1326,11 +1335,93 @@ static int vfio_set_dirty_page_tracking(VFIOContainer *container, bool start)
     return ret;
 }

+static bool vfio_get_section_iova_range(VFIOContainer *container,
+                                        MemoryRegionSection *section,
+                                        hwaddr *out_iova, hwaddr *out_end)
+{
+    Int128 llend, llsize;
+    hwaddr iova, end;
+
+    iova = REAL_HOST_PAGE_ALIGN(section->offset_within_address_space);
+    llend = int128_make64(section->offset_within_address_space);
+    llend = int128_add(llend, section->size);
+    llend = int128_and(llend, int128_exts64(qemu_real_host_page_mask()));
+
+    if (int128_ge(int128_make64(iova), llend)) {
+        return false;
+    }
+    end = int128_get64(int128_sub(llend, int128_one()));
+
+    if (memory_region_is_iommu(section->mr) ||
+        memory_region_has_ram_discard_manager(section->mr)) {
+        return false;
+    }
+
+    llsize = int128_sub(llend, int128_make64(iova));
+
+    if (memory_region_is_ram_device(section->mr)) {
+        VFIOHostDMAWindow *hostwin;
+        hwaddr pgmask;
+
+        hostwin = vfio_find_hostwin(container, iova, end);
+        if (!hostwin) {
+            return false;
+        }
+
+        pgmask = (1ULL << ctz64(hostwin->iova_pgsizes)) - 1;
+        if ((iova & pgmask) || (int128_get64(llsize) & pgmask)) {
+            return false;
+        }
+    }
+
+    *out_iova = iova;
+    *out_end = int128_get64(llend);
+    return true;
+}
+
+static void vfio_dma_tracking_update(MemoryListener *listener,
+                                     MemoryRegionSection *section)
+{
+    VFIOContainer *container = container_of(listener, VFIOContainer, mappings_listener);
+    hwaddr iova, end;
+
+    if (vfio_have_giommu(container)) {
+        vfio_set_migration_error(-EOPNOTSUPP);
+        return;
+    }
+
+    if (!vfio_listener_valid_section(section) ||
+        !vfio_get_section_iova_range(container, section, &iova, &end)) {
+        return;
+    }
+
+    WITH_QEMU_LOCK_GUARD(&container->mappings_mutex) {
+        if (container->min_tracking_iova > iova) {
+            container->min_tracking_iova = iova;
+        }
+        if (container->max_tracking_iova < end) {
+            container->max_tracking_iova = end;
+        }
+    }
+
+    trace_vfio_dma_tracking_update(iova, end,
+                                   container->min_tracking_iova,
+                                   container->max_tracking_iova);
+    return;
+}
+
+static const MemoryListener vfio_dirty_tracking_listener = {
+    .name = "vfio-tracking",
+    .region_add = vfio_dma_tracking_update,
+};
+
 static void vfio_listener_log_global_start(MemoryListener *listener)
 {
     VFIOContainer *container = container_of(listener, VFIOContainer, listener);
     int ret;

+    memory_listener_register(&container->mappings_listener, container->space->as);
+
     ret = vfio_set_dirty_page_tracking(container, true);
     if (ret) {
         vfio_set_migration_error(ret);
@@ -1346,6 +1437,13 @@ static void vfio_listener_log_global_stop(MemoryListener *listener)
     if (ret) {
         vfio_set_migration_error(ret);
     }
+
+    memory_listener_unregister(&container->mappings_listener);
+
+    WITH_QEMU_LOCK_GUARD(&container->mappings_mutex) {
+        container->min_tracking_iova = 0;
+        container->max_tracking_iova = 0;
+    }
 }

 static int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
@@ -2172,16 +2270,18 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
     QLIST_INIT(&container->giommu_list);
     QLIST_INIT(&container->hostwin_list);
     QLIST_INIT(&container->vrdl_list);
+    qemu_mutex_init(&container->mappings_mutex);
+    container->mappings_listener = vfio_dirty_tracking_listener;

     ret = vfio_init_container(container, group->fd, errp);
     if (ret) {
-        goto free_container_exit;
+        goto destroy_mappings_exit;
     }

     ret = vfio_ram_block_discard_disable(container, true);
     if (ret) {
         error_setg_errno(errp, -ret, "Cannot set discarding of RAM broken");
-        goto free_container_exit;
+        goto destroy_mappings_exit;
     }

     switch (container->iommu_type) {
@@ -2317,7 +2417,8 @@ listener_release_exit:
 enable_discards_exit:
     vfio_ram_block_discard_disable(container, false);

-free_container_exit:
+destroy_mappings_exit:
+    qemu_mutex_destroy(&container->mappings_mutex);
     g_free(container);

 close_fd_exit:
@@ -2371,6 +2472,7 @@ static void vfio_disconnect_container(VFIOGroup *group)
         }

         trace_vfio_disconnect_container(container->fd);
+        qemu_mutex_destroy(&container->mappings_mutex);
         close(container->fd);
         g_free(container);

diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
index 669d9fe07cd9..8591f660595b 100644
--- a/hw/vfio/trace-events
+++ b/hw/vfio/trace-events
@@ -104,6 +104,7 @@ vfio_known_safe_misalignment(const char *name, uint64_t iova, uint64_t offset_wi
 vfio_listener_region_add_no_dma_map(const char *name, uint64_t iova, uint64_t size, uint64_t page_size) "Region \"%s\" 0x%"PRIx64" size=0x%"PRIx64" is not aligned to 0x%"PRIx64" and cannot be mapped for DMA"
 vfio_listener_region_del_skip(uint64_t start, uint64_t end) "SKIPPING region_del 0x%"PRIx64" - 0x%"PRIx64
 vfio_listener_region_del(uint64_t start, uint64_t end) "region_del 0x%"PRIx64" - 0x%"PRIx64
+vfio_dma_tracking_update(uint64_t start, uint64_t end, uint64_t min, uint64_t max) "tracking_update 0x%"PRIx64" - 0x%"PRIx64" -> [0x%"PRIx64" - 0x%"PRIx64"]"
 vfio_disconnect_container(int fd) "close container->fd=%d"
 vfio_put_group(int fd) "close group->fd=%d"
 vfio_get_device(const char * name, unsigned int flags, unsigned int num_regions, unsigned int num_irqs) "Device %s flags: %u, regions: %u, irqs: %u"
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 87524c64a443..bb54f204ab8b 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -23,6 +23,7 @@

 #include "exec/memory.h"
 #include "qemu/queue.h"
+#include "qemu/iova-tree.h"
 #include "qemu/notify.h"
 #include "ui/console.h"
 #include "hw/display/ramfb.h"
@@ -81,6 +82,7 @@ typedef struct VFIOContainer {
     int fd; /* /dev/vfio/vfio, empowered by the attached groups */
     MemoryListener listener;
     MemoryListener prereg_listener;
+    MemoryListener mappings_listener;
     unsigned iommu_type;
     Error *error;
     bool initialized;
@@ -89,6 +91,9 @@ typedef struct VFIOContainer {
     uint64_t max_dirty_bitmap_size;
     unsigned long pgsizes;
     unsigned int dma_max_mappings;
+    hwaddr min_tracking_iova;
+    hwaddr max_tracking_iova;
+    QemuMutex mappings_mutex;
     QLIST_HEAD(, VFIOGuestIOMMU) giommu_list;
     QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list;
     QLIST_HEAD(, VFIOGroup) group_list;
--
2.17.2



^ permalink raw reply related	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 07/20] vfio/common: Add VFIOBitmap and (de)alloc functions
  2023-02-27 14:09   ` Cédric Le Goater
  2023-03-01 18:56     ` Avihai Horon
@ 2023-03-02 13:24     ` Joao Martins
  2023-03-02 14:52       ` Cédric Le Goater
  1 sibling, 1 reply; 93+ messages in thread
From: Joao Martins @ 2023-03-02 13:24 UTC (permalink / raw)
  To: Cédric Le Goater, Avihai Horon
  Cc: Alex Williamson, Juan Quintela, qemu-devel,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta

On 27/02/2023 14:09, Cédric Le Goater wrote:
> On 2/22/23 18:49, Avihai Horon wrote:
>> --- a/hw/vfio/common.c
>> +++ b/hw/vfio/common.c
>> @@ -320,6 +320,41 @@ const MemoryRegionOps vfio_region_ops = {
>>    * Device state interfaces
>>    */
>>   +typedef struct {
>> +    unsigned long *bitmap;
>> +    hwaddr size;
>> +    hwaddr pages;
>> +} VFIOBitmap;
>> +
>> +static VFIOBitmap *vfio_bitmap_alloc(hwaddr size)
>> +{
>> +    VFIOBitmap *vbmap = g_try_new0(VFIOBitmap, 1);
> 
> I think using g_malloc0() for the VFIOBitmap should be fine. If QEMU can
> not allocate a couple of bytes, we are in trouble anyway.
> 

OOM situations are rather unpredictable, and switching to g_malloc0 means we
will exit ungracefully in the middle of fetching dirty bitmaps. And this
function (vfio_bitmap_alloc) overall will be allocating megabytes for terabyte
guests.

It would be OK if we were initializing, but this is at runtime when we do
migration. I think we should stick with g_try_new0. Exit on failure should be
reserved for failure to switch the kernel migration state, where we are likely
dealing with a hardware failure and thus something more drastic is required.
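
As a rough illustration of the difference (not the exact code in the patch):
g_new0()/g_malloc0() abort the whole process when the allocation fails, while
the g_try_*() variants return NULL so the caller can fail just the migration:

#include <glib.h>

/* g_malloc0() calls g_error() and aborts QEMU on failure.  With
 * g_try_malloc0() we get NULL back and can propagate -ENOMEM, so only
 * the migration fails while the VM keeps running. */
static unsigned long *alloc_dirty_bitmap(gsize size)
{
    unsigned long *bitmap = g_try_malloc0(size);

    if (!bitmap) {
        /* caller reports the error and aborts the migration, not QEMU */
        return NULL;
    }

    return bitmap;
}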

	Joao


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 07/20] vfio/common: Add VFIOBitmap and (de)alloc functions
  2023-03-02 13:24     ` Joao Martins
@ 2023-03-02 14:52       ` Cédric Le Goater
  2023-03-02 16:30         ` Joao Martins
  2023-03-04  0:23         ` Joao Martins
  0 siblings, 2 replies; 93+ messages in thread
From: Cédric Le Goater @ 2023-03-02 14:52 UTC (permalink / raw)
  To: Joao Martins, Avihai Horon
  Cc: Alex Williamson, Juan Quintela, qemu-devel,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta

Hello Joao,

On 3/2/23 14:24, Joao Martins wrote:
> On 27/02/2023 14:09, Cédric Le Goater wrote:
>> On 2/22/23 18:49, Avihai Horon wrote:
>>> --- a/hw/vfio/common.c
>>> +++ b/hw/vfio/common.c
>>> @@ -320,6 +320,41 @@ const MemoryRegionOps vfio_region_ops = {
>>>     * Device state interfaces
>>>     */
>>>    +typedef struct {
>>> +    unsigned long *bitmap;
>>> +    hwaddr size;
>>> +    hwaddr pages;
>>> +} VFIOBitmap;
>>> +
>>> +static VFIOBitmap *vfio_bitmap_alloc(hwaddr size)
>>> +{
>>> +    VFIOBitmap *vbmap = g_try_new0(VFIOBitmap, 1);
>>
>> I think using g_malloc0() for the VFIOBitmap should be fine. If QEMU can
>> not allocate a couple of bytes, we are in trouble anyway.
>>
> 
> OOM situations are rather unpredictable, and switching to g_malloc0 means we
> will exit ungracefully in the middle of fetching dirty bitmaps. And this
> function (vfio_bitmap_alloc) overall will be allocating megabytes for terabyte
> guests.
> 
> It would be ok if we are initializing, but this is at runtime when we do
> migration. I think we should stick with g_try_new0. exit on failure should be
> reserved to failure to switch the kernel migration state whereby we are likely
> to be dealing with a hardware failure and thus requires something more drastic.

I agree for large allocation :

     vbmap->bitmap = g_try_malloc0(vbmap->size);

but not for the smaller ones, like VFIOBitmap. You would have to
convert some other g_malloc0() calls, like the one allocating 'unmap'
in vfio_dma_unmap_bitmap(), to be consistent.

Given the size of VFIOBitmap, I think it could live on the stack in
routine vfio_dma_unmap_bitmap() and routine vfio_get_dirty_bitmap()
since the reference is not kept.

The 'vbmap' attribute of vfio_giommu_dirty_notifier does not need
to be a pointer either.

vfio_bitmap_alloc(hwaddr size) could then become
vfio_bitmap_init(VFIOBitmap *vbmap, hwaddr size).

Anyhow, this is minor. It would simplify a bit the exit path
and error handling.
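
For instance, with the VFIOBitmap living in the caller's stack frame, the
helper could look roughly like this (a sketch of the suggestion, assuming
QEMU's usual REAL_HOST_PAGE_ALIGN/ROUND_UP/BITS_PER_BYTE helpers, not the
code in the series):

typedef struct {
    unsigned long *bitmap;
    hwaddr size;
    hwaddr pages;
} VFIOBitmap;

static int vfio_bitmap_init(VFIOBitmap *vbmap, hwaddr size)
{
    vbmap->pages = REAL_HOST_PAGE_ALIGN(size) / qemu_real_host_page_size();
    vbmap->size = ROUND_UP(vbmap->pages, sizeof(__u64) * BITS_PER_BYTE) /
                                         BITS_PER_BYTE;
    vbmap->bitmap = g_try_malloc0(vbmap->size);
    if (!vbmap->bitmap) {
        return -ENOMEM;
    }

    return 0;
}

The caller would then keep the VFIOBitmap on the stack and only needs to
g_free(vbmap.bitmap) on the exit path.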

Thanks,

C.





^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 07/20] vfio/common: Add VFIOBitmap and (de)alloc functions
  2023-03-02 14:52       ` Cédric Le Goater
@ 2023-03-02 16:30         ` Joao Martins
  2023-03-04  0:23         ` Joao Martins
  1 sibling, 0 replies; 93+ messages in thread
From: Joao Martins @ 2023-03-02 16:30 UTC (permalink / raw)
  To: Cédric Le Goater, Avihai Horon
  Cc: Alex Williamson, Juan Quintela, qemu-devel,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta

On 02/03/2023 14:52, Cédric Le Goater wrote:
> Hello Joao,
> On 3/2/23 14:24, Joao Martins wrote:
>> On 27/02/2023 14:09, Cédric Le Goater wrote:
>>> On 2/22/23 18:49, Avihai Horon wrote:
>>>> --- a/hw/vfio/common.c
>>>> +++ b/hw/vfio/common.c
>>>> @@ -320,6 +320,41 @@ const MemoryRegionOps vfio_region_ops = {
>>>>     * Device state interfaces
>>>>     */
>>>>    +typedef struct {
>>>> +    unsigned long *bitmap;
>>>> +    hwaddr size;
>>>> +    hwaddr pages;
>>>> +} VFIOBitmap;
>>>> +
>>>> +static VFIOBitmap *vfio_bitmap_alloc(hwaddr size)
>>>> +{
>>>> +    VFIOBitmap *vbmap = g_try_new0(VFIOBitmap, 1);
>>>
>>> I think using g_malloc0() for the VFIOBitmap should be fine. If QEMU can
>>> not allocate a couple of bytes, we are in trouble anyway.
>>>
>>
>> OOM situations are rather unpredictable, and switching to g_malloc0 means we
>> will exit ungracefully in the middle of fetching dirty bitmaps. And this
>> function (vfio_bitmap_alloc) overall will be allocating megabytes for terabyte
>> guests.
>>
>> It would be ok if we are initializing, but this is at runtime when we do
>> migration. I think we should stick with g_try_new0. exit on failure should be
>> reserved to failure to switch the kernel migration state whereby we are likely
>> to be dealing with a hardware failure and thus requires something more drastic.
> 
> I agree for large allocation :
> 
>     vbmap->bitmap = g_try_malloc0(vbmap->size);
> 
> but not for the smaller ones, like VFIOBitmap. You would have to
> convert some other g_malloc0() calls, like the one allocating 'unmap'
> in vfio_dma_unmap_bitmap(), to be consistent.
> 
> Given the size of VFIOBitmap, I think it could live on the stack in
> routine vfio_dma_unmap_bitmap() and routine vfio_get_dirty_bitmap()
> since the reference is not kept.
> 

Both good points. Especially the g_malloc0 ones, though the DMA unmap path
wouldn't be in use for a device that supports dirty tracking. But there's one we
add by mistake, and that's vfio_device_feature_dma_logging_start_create(); it
shouldn't be g_malloc0 there either. The rest, except the dma_unmap and
type1-iommu get_dirty_bitmap functions, arguably only happen during
initialization.

> The 'vbmap' attribute of vfio_giommu_dirty_notifier does not need
> to be a pointer either.
> 
> vfio_bitmap_alloc(hwaddr size) could then become
> vfio_bitmap_init(VFIOBitmap *vbmap, hwaddr size).
> 
> Anyhow, this is minor. It would simplify a bit the exit path
> and error handling.
> 
By simplify, presumably it's because vfio_bitmap_free() would be a single line,
thus avoiding the new helper, and we would just live with vfio_bitmap_alloc(). I
am in two minds about alloc vs init, considering we are still allocating the
actual bitmap. Still leaning more toward staying with alloc than init.


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 10/20] vfio/common: Record DMA mapped IOVA ranges
  2023-03-02  0:07                 ` Joao Martins
  2023-03-02  0:13                   ` Joao Martins
@ 2023-03-02 18:42                   ` Alex Williamson
  2023-03-03  0:19                     ` Joao Martins
  1 sibling, 1 reply; 93+ messages in thread
From: Alex Williamson @ 2023-03-02 18:42 UTC (permalink / raw)
  To: Joao Martins
  Cc: Avihai Horon, qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta

On Thu, 2 Mar 2023 00:07:35 +0000
Joao Martins <joao.m.martins@oracle.com> wrote:

> On 28/02/2023 20:36, Alex Williamson wrote:
> > On Tue, 28 Feb 2023 12:11:06 +0000
> > Joao Martins <joao.m.martins@oracle.com> wrote:  
> >> On 23/02/2023 21:50, Alex Williamson wrote:  
> >>> On Thu, 23 Feb 2023 21:19:12 +0000
> >>> Joao Martins <joao.m.martins@oracle.com> wrote:    
> >>>> On 23/02/2023 21:05, Alex Williamson wrote:    
> >>>>> On Thu, 23 Feb 2023 10:37:10 +0000
> >>>>> Joao Martins <joao.m.martins@oracle.com> wrote:      
> >>>>>> On 22/02/2023 22:10, Alex Williamson wrote:      
> >>>>>>> On Wed, 22 Feb 2023 19:49:05 +0200
> >>>>>>> Avihai Horon <avihaih@nvidia.com> wrote:        
> >>>>>>>> From: Joao Martins <joao.m.martins@oracle.com>
> >>>>>>>> @@ -612,6 +665,16 @@ static int vfio_dma_map(VFIOContainer *container, hwaddr iova,
> >>>>>>>>          .iova = iova,
> >>>>>>>>          .size = size,
> >>>>>>>>      };
> >>>>>>>> +    int ret;
> >>>>>>>> +
> >>>>>>>> +    ret = vfio_record_mapping(container, iova, size, readonly);
> >>>>>>>> +    if (ret) {
> >>>>>>>> +        error_report("vfio: Failed to record mapping, iova: 0x%" HWADDR_PRIx
> >>>>>>>> +                     ", size: 0x" RAM_ADDR_FMT ", ret: %d (%s)",
> >>>>>>>> +                     iova, size, ret, strerror(-ret));
> >>>>>>>> +
> >>>>>>>> +        return ret;
> >>>>>>>> +    }        
> >>>>>>>
> >>>>>>> Is there no way to replay the mappings when a migration is started?
> >>>>>>> This seems like a horrible latency and bloat trade-off for the
> >>>>>>> possibility that the VM might migrate and the device might support
> >>>>>>> these features.  Our performance with vIOMMU is already terrible, I
> >>>>>>> can't help but believe this makes it worse.  Thanks,
> >>>>>>>         
> >>>>>>
> >>>>>> It is a nop if the vIOMMU is being used (entries in container->giommu_list) as
> >>>>>> that uses a max-iova based IOVA range. So this is really for iommu identity
> >>>>>> mapping and no-VIOMMU.      
> >>>>>
> >>>>> Ok, yes, there are no mappings recorded for any containers that have a
> >>>>> non-empty giommu_list.
> >>>>>       
> >>>>>> We could replay them if they were tracked/stored anywhere.      
> >>>>>
> >>>>> Rather than piggybacking on vfio_memory_listener, why not simply
> >>>>> register a new MemoryListener when migration is started?  That will
> >>>>> replay all the existing ranges and allow tracking to happen separate
> >>>>> from mapping, and only when needed.  
> >>>>
> >>>> The problem with that is that *starting* dirty tracking needs to have all the
> >>>> range, we aren't supposed to start each range separately. So on a memory
> >>>> listener callback you don't have introspection when you are dealing with the
> >>>> last range, do we?    
> >>>
> >>> As soon as memory_listener_register() returns, all your callbacks to
> >>> build the IOVATree have been called and you can act on the result the
> >>> same as if you were relying on the vfio mapping MemoryListener.  I'm
> >>> not seeing the problem.  Thanks,
> >>>     
> >>
> >> While doing these changes, the nice thing of the current patch is that whatever
> >> changes apply to vfio_listener_region_add() will be reflected in the mappings
> >> tree that stores what we will dirty track. If we move the mappings calculation
> >> necessary for dirty tracking only when we start, we will have to duplicate the
> >> same checks, and open for bugs where we ask things to be dirty track-ed that
> >> haven't been DMA mapped. These two aren't necessarily tied, but felt like I
> >> should raise the potentially duplication of the checks (and the same thing
> >> applies for handling virtio-mem and what not).
> >>
> >> I understand that if we were going to store *a lot* of mappings that this would
> >> add up in space requirements. But for no-vIOMMU (or iommu=pt) case this is only
> >> about 12ranges or so, it is much simpler to piggyback the existing listener.
> >> Would you still want to move this to its own dedicated memory listener?  
> > 
> > Code duplication and bugs are good points, but while typically we're
> > only seeing a few handfuls of ranges, doesn't virtio-mem in particular
> > allow that we could be seeing quite a lot more?
> >   
> Ugh yes, it could be.
> 
> > We used to be limited to a fairly small number of KVM memory slots,
> > which effectively bounded non-vIOMMU DMA mappings, but that value is
> > now 2^15, so we need to anticipate that we could see many more than a
> > dozen mappings.
> >   
> 
> Even with 32k memory slots today we are still reduced on a handful. hv-balloon
> and virtio-mem approaches though are the ones that may stress such limit IIUC
> prior to starting migration.
> 
> > Can we make the same argument that the overhead is negligible if a VM
> > makes use of 10s of GB of virtio-mem with 2MB block size?
> > 
> > But then on a 4KB host we're limited to 256 tracking entries, so
> > wasting all that time and space on a runtime IOVATree is even more
> > dubious.
> >
> > In fact, it doesn't really matter that vfio_listener_region_add and
> > this potentially new listener come to the same result, as long as the
> > new listener is a superset of the existing listener.   
> 
> I am trying to put this in a way that's not too ugly to reuse the most between
> vfio_listener_region_add() and the vfio_migration_mapping_add().
> 
> For you to have an idea, here's so far how it looks thus far:
> 
> https://github.com/jpemartins/qemu/commits/vfio-dirty-tracking
> 
> Particularly this one:
> 
> https://github.com/jpemartins/qemu/commit/3b11fa0e4faa0f9c0f42689a7367284a25d1b585
> 
> vfio_get_section_iova_range() is where most of these checks are that are sort of
> a subset of the ones in vfio_listener_region_add().
> 
> > So I think we can
> > simplify out a lot of the places we'd see duplication and bugs.  I'm
> > not even really sure why we wouldn't simplify things further and only
> > record a single range covering the low and high memory marks for a
> > non-vIOMMU VMs, or potentially an approximation removing gaps of 1GB or
> > more, for example.  Thanks,  
> 
> Yes, for Qemu, to have one single artificial range with a computed min IOVA and
> max IOVA is the simplest to get it implemented. It would avoid us maintaining an
> IOVATree as you would only track min/max pair (maybe max_below).
> 
> My concern with a reduced single range is 1) big holes in address space leading
> to asking more than you need[*] and then 2) device dirty tracking limits e.g.
> hardware may have upper limits, so you may prematurely exercise those. So giving
> more choice to the vfio drivers to decide how to cope with the mapped address
> space description looks to have a bit more longevity.

The fact that we don't know anything about the device dirty tracking
limits worries me.  If QEMU reports the VM is migratable, ie. lacking
migration blockers, then we really shouldn't have non-extraordinary
things like the VM actually having a bigger address space than the
device can support, or enabling a vIOMMU, suddenly make the VM
non-migratable.

If we only needed to worry about scenarios like the AMD hypertransport
memory relocation, then tracking ranges for 32-bit and 64-bit RAM
separately would be an easy solution, we always have 1-2 ranges for the
device to track.  That's still a big simplification from tracking every
DMA mappings.

AIUI, we really can't even rely on the device supporting a full host
page size worth of mappings; the uAPI only stipulates that the core
kernel code will support such a request.  So it seems prudent that
userspace should conserve entries wherever it can.  For the
alternative, to provide ranges that closely match actual mappings, I
think we'd need to be able to collapse IOVATree entries with the
smallest gap when we reach the limit, and continue to collapse each time
the driver rejects the number of ranges provided.  That's obviously
much more complicated and I'd prefer to avoid it if there are easier
approximations.
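
For reference, the collapse step itself is not much code once the recorded
ranges are kept sorted; a rough sketch over a flat array (the series would have
to do the equivalent on top of the IOVATree, re-running it whenever the driver
rejects the number of ranges):

#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint64_t start, end;    /* half-open [start, end) */
} DirtyRange;

/*
 * Merge the pair of adjacent ranges separated by the smallest gap until
 * no more than 'limit' ranges remain.  Assumes 'r' is sorted and
 * non-overlapping; the merged gaps simply become extra tracked space.
 */
static size_t collapse_ranges(DirtyRange *r, size_t n, size_t limit)
{
    while (n > limit && n > 1) {
        size_t best = 0;
        uint64_t best_gap = UINT64_MAX;
        size_t i;

        for (i = 0; i + 1 < n; i++) {
            uint64_t gap = r[i + 1].start - r[i].end;

            if (gap < best_gap) {
                best_gap = gap;
                best = i;
            }
        }

        r[best].end = r[best + 1].end;
        memmove(&r[best + 1], &r[best + 2], (n - best - 2) * sizeof(*r));
        n--;
    }

    return n;
}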

> Anyway the temptation with having a single range is that this can all go away if
> the vfio_listener_region_add() tracks just min/max IOVA pair.
> 
> Below scissors mark it's how this patch is looking like in the commit above
> while being a full list of mappings. It's also stored here:
> 
> https://github.com/jpemartins/qemu/commits/vfio-dirty-tracking
> 
> I'll respond here with a patch on what it looks like with the range watermark
> approach.
> 
> 	Joao
> 
> [0] AMD 1T boundary is what comes to mind, which on Qemu relocates memory above
> 4G into after 1T.
> 
> ------------------>8--------------------  
> 
> From: Joao Martins <joao.m.martins@oracle.com>
> Date: Wed, 22 Feb 2023 19:49:05 +0200
> Subject: [PATCH wip 7/12] vfio/common: Record DMA mapped IOVA ranges
> 
> According to the device DMA logging uAPI, IOVA ranges to be logged by
> the device must be provided all at once upon DMA logging start.
> 
> As preparation for the following patches which will add device dirty
> page tracking, keep a record of all DMA mapped IOVA ranges so later they
> can be used for DMA logging start.
> 
> Note that when vIOMMU is enabled DMA mapped IOVA ranges are not tracked.
> This is due to the dynamic nature of vIOMMU DMA mapping/unmapping.
> 
> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
> ---
>  hw/vfio/common.c              | 147 +++++++++++++++++++++++++++++++++-
>  hw/vfio/trace-events          |   2 +
>  include/hw/vfio/vfio-common.h |   4 +
>  3 files changed, 150 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index 655e8dbb74d4..17971e6dbaeb 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -44,6 +44,7 @@
>  #include "migration/blocker.h"
>  #include "migration/qemu-file.h"
>  #include "sysemu/tpm.h"
> +#include "qemu/iova-tree.h"
> 
>  VFIOGroupList vfio_group_list =
>      QLIST_HEAD_INITIALIZER(vfio_group_list);
> @@ -426,6 +427,11 @@ void vfio_unblock_multiple_devices_migration(void)
>      multiple_devices_migration_blocker = NULL;
>  }
> 
> +static bool vfio_have_giommu(VFIOContainer *container)
> +{
> +    return !QLIST_EMPTY(&container->giommu_list);
> +}

I think it's the case, but can you confirm we build the giommu_list
regardless of whether the vIOMMU is actually enabled?

> +
>  static void vfio_set_migration_error(int err)
>  {
>      MigrationState *ms = migrate_get_current();
> @@ -610,6 +616,7 @@ static int vfio_dma_map(VFIOContainer *container, hwaddr iova,
>          .iova = iova,
>          .size = size,
>      };
> +    int ret;
> 
>      if (!readonly) {
>          map.flags |= VFIO_DMA_MAP_FLAG_WRITE;
> @@ -626,8 +633,10 @@ static int vfio_dma_map(VFIOContainer *container, hwaddr iova,
>          return 0;
>      }
> 
> +    ret = -errno;
>      error_report("VFIO_MAP_DMA failed: %s", strerror(errno));
> -    return -errno;
> +
> +    return ret;
>  }
> 
>  static void vfio_host_win_add(VFIOContainer *container,
> @@ -1326,11 +1335,127 @@ static int vfio_set_dirty_page_tracking(VFIOContainer
> *container, bool start)
>      return ret;
>  }
> 
> +static bool vfio_get_section_iova_range(VFIOContainer *container,
> +                                        MemoryRegionSection *section,
> +                                        hwaddr *out_iova, hwaddr *out_end)
> +{
> +    Int128 llend, llsize;
> +    hwaddr iova, end;
> +
> +    iova = REAL_HOST_PAGE_ALIGN(section->offset_within_address_space);
> +    llend = int128_make64(section->offset_within_address_space);
> +    llend = int128_add(llend, section->size);
> +    llend = int128_and(llend, int128_exts64(qemu_real_host_page_mask()));
> +
> +    if (int128_ge(int128_make64(iova), llend)) {
> +        return false;
> +    }
> +    end = int128_get64(int128_sub(llend, int128_one()));
> +
> +    if (memory_region_is_iommu(section->mr) ||

Shouldn't there already be a migration blocker in place preventing this
from being possible?

> +        memory_region_has_ram_discard_manager(section->mr)) {

Are we claiming not to support virtio-mem VMs as well?  The current
comment in vfio/common.c that states we only want to map actually
populated parts seems like it doesn't apply here, we'd want dirty
tracking ranges to include these regardless.  Unless there's some
reason virtio-mem changes are blocked during pre-copy.

> +	return false;
> +    }
> +
> +    llsize = int128_sub(llend, int128_make64(iova));
> +
> +    if (memory_region_is_ram_device(section->mr)) {
> +        VFIOHostDMAWindow *hostwin;
> +        hwaddr pgmask;
> +
> +        hostwin = vfio_find_hostwin(container, iova, end);
> +        if (!hostwin) {
> +            return false;
> +        }
> +
> +	pgmask = (1ULL << ctz64(hostwin->iova_pgsizes)) - 1;
> +        if ((iova & pgmask) || (int128_get64(llsize) & pgmask)) {
> +            return false;
> +        }
> +    }

ram_device is intended to be an address range on another device, so do
we really need it in DMA dirty tracking?  ex. we don't include device
BARs in the dirty bitmap, we expect modified device state to be
reported by the device, so it seems like there's no case where we'd
include this in the device dirty tracking ranges.

> +
> +    *out_iova = iova;
> +    *out_end = int128_get64(llend);
> +    return true;
> +}
> +
> +static void vfio_migration_add_mapping(MemoryListener *listener,
> +                                       MemoryRegionSection *section)
> +{
> +    VFIOContainer *container = container_of(listener, VFIOContainer,
> mappings_listener);
> +    hwaddr end = 0;
> +    DMAMap map;
> +    int ret;
> +
> +    if (vfio_have_giommu(container)) {
> +        vfio_set_migration_error(-EOPNOTSUPP);

There should be a migration blocker that prevents this from ever being
called in this case.

> +        return;
> +    }
> +
> +    if (!vfio_listener_valid_section(section) ||
> +        !vfio_get_section_iova_range(container, section, &map.iova, &end)) {
> +        return;
> +    }
> +
> +    map.size = end - map.iova - 1; // IOVATree is inclusive, so subtract 1 from
> size
> +    map.perm = section->readonly ? IOMMU_RO : IOMMU_RW;
> +
> +    WITH_QEMU_LOCK_GUARD(&container->mappings_mutex) {
> +        ret = iova_tree_insert(container->mappings, &map);
> +        if (ret) {
> +            if (ret == IOVA_ERR_INVALID) {
> +                ret = -EINVAL;
> +            } else if (ret == IOVA_ERR_OVERLAP) {
> +                ret = -EEXIST;
> +            }
> +        }
> +    }
> +
> +    trace_vfio_migration_mapping_add(map.iova, map.iova + map.size, ret);
> +
> +    if (ret)
> +        vfio_set_migration_error(ret);
> +    return;
> +}
> +
> +static void vfio_migration_remove_mapping(MemoryListener *listener,
> +                                          MemoryRegionSection *section)
> +{
> +    VFIOContainer *container = container_of(listener, VFIOContainer,
> mappings_listener);
> +    hwaddr end = 0;
> +    DMAMap map;
> +
> +    if (vfio_have_giommu(container)) {
> +        vfio_set_migration_error(-EOPNOTSUPP);
> +        return;
> +    }
> +
> +    if (!vfio_listener_valid_section(section) ||
> +        !vfio_get_section_iova_range(container, section, &map.iova, &end)) {
> +        return;
> +    }
> +
> +    WITH_QEMU_LOCK_GUARD(&container->mappings_mutex) {
> +        iova_tree_remove(container->mappings, map);
> +    }
> +
> +    trace_vfio_migration_mapping_del(map.iova, map.iova + map.size);
> +}

Why do we need a region_del callback?  We don't support modifying the
dirty tracking ranges we've provided to the device.

> +
> +
> +static const MemoryListener vfio_dirty_tracking_listener = {
> +    .name = "vfio-migration",
> +    .region_add = vfio_migration_add_mapping,
> +    .region_del = vfio_migration_remove_mapping,
> +};
> +
>  static void vfio_listener_log_global_start(MemoryListener *listener)
>  {
>      VFIOContainer *container = container_of(listener, VFIOContainer, listener);
>      int ret;
> 
> +    memory_listener_register(&container->mappings_listener, container->space->as);
> +
>      ret = vfio_set_dirty_page_tracking(container, true);
>      if (ret) {
>          vfio_set_migration_error(ret);
> @@ -1346,6 +1471,8 @@ static void vfio_listener_log_global_stop(MemoryListener
> *listener)
>      if (ret) {
>          vfio_set_migration_error(ret);
>      }
> +
> +    memory_listener_unregister(&container->mappings_listener);

We don't have a way to update dirty tracking ranges for a device once
dirty tracking is enabled, so what's the point of this listener running
in more than a one-shot mode?  The only purpose of a listener
continuing to run seems like it would be to generate an error for
untracked ranges and either generate a migration error or mark them
perpetually dirty.

>  }
> 
>  static int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
> @@ -2172,16 +2299,24 @@ static int vfio_connect_container(VFIOGroup *group,
> AddressSpace *as,
>      QLIST_INIT(&container->giommu_list);
>      QLIST_INIT(&container->hostwin_list);
>      QLIST_INIT(&container->vrdl_list);
> +    container->mappings = iova_tree_new();
> +    if (!container->mappings) {
> +        error_setg(errp, "Cannot allocate DMA mappings tree");
> +        ret = -ENOMEM;
> +        goto free_container_exit;
> +    }
> +    qemu_mutex_init(&container->mappings_mutex);
> +    container->mappings_listener = vfio_dirty_tracking_listener;

This all seems like code that would only be necessary before starting
the listener.

> 
>      ret = vfio_init_container(container, group->fd, errp);
>      if (ret) {
> -        goto free_container_exit;
> +        goto destroy_mappings_exit;
>      }
> 
>      ret = vfio_ram_block_discard_disable(container, true);
>      if (ret) {
>          error_setg_errno(errp, -ret, "Cannot set discarding of RAM broken");
> -        goto free_container_exit;
> +        goto destroy_mappings_exit;
>      }
> 
>      switch (container->iommu_type) {
> @@ -2317,6 +2452,10 @@ listener_release_exit:
>  enable_discards_exit:
>      vfio_ram_block_discard_disable(container, false);
> 
> +destroy_mappings_exit:
> +    qemu_mutex_destroy(&container->mappings_mutex);
> +    iova_tree_destroy(container->mappings);
> +
>  free_container_exit:
>      g_free(container);
> 
> @@ -2371,6 +2510,8 @@ static void vfio_disconnect_container(VFIOGroup *group)
>          }
> 
>          trace_vfio_disconnect_container(container->fd);
> +        qemu_mutex_destroy(&container->mappings_mutex);
> +        iova_tree_destroy(container->mappings);

The IOVATree should be destroyed as soon as we're done processing the
result upon starting logging.  It serves no purpose to keep it around.

Comparing with the follow-up that sets {min,max}_tracking_iova, many of
the same comments apply.  Both of these are only preparing for the
question of what we actually do with this data.  In the IOVATree
approach we have more fine-grained information, but we can also exceed
what the device supports and we need to be able to handle that.  If our
fallback is to simply identify the min and max based on the IOVATree,
and we expect that to work better than the more granular approach, why
not start with just min/max?  If we expect there's value in the more
granular approach, then why not proceed to collapse the IOVATree until
we find a set of ranges the device can support?  Thanks,

Alex

>          close(container->fd);
>          g_free(container);
> 
> diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
> index 669d9fe07cd9..c92eaadcc7c4 100644
> --- a/hw/vfio/trace-events
> +++ b/hw/vfio/trace-events
> @@ -104,6 +104,8 @@ vfio_known_safe_misalignment(const char *name, uint64_t
> iova, uint64_t offset_wi
>  vfio_listener_region_add_no_dma_map(const char *name, uint64_t iova, uint64_t
> size, uint64_t page_size) "Region \"%s\" 0x%"PRIx64" size=0x%"PRIx64" is not
> aligned to 0x%"PRIx64" and cannot be mapped for DMA"
>  vfio_listener_region_del_skip(uint64_t start, uint64_t end) "SKIPPING
> region_del 0x%"PRIx64" - 0x%"PRIx64
>  vfio_listener_region_del(uint64_t start, uint64_t end) "region_del 0x%"PRIx64"
> - 0x%"PRIx64
> +vfio_migration_mapping_add(uint64_t start, uint64_t end, int err) "mapping_add
> 0x%"PRIx64" - 0x%"PRIx64" err=%d"
> +vfio_migration_mapping_del(uint64_t start, uint64_t end) "mapping_del
> 0x%"PRIx64" - 0x%"PRIx64
>  vfio_disconnect_container(int fd) "close container->fd=%d"
>  vfio_put_group(int fd) "close group->fd=%d"
>  vfio_get_device(const char * name, unsigned int flags, unsigned int
> num_regions, unsigned int num_irqs) "Device %s flags: %u, regions: %u, irqs: %u"
> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> index 87524c64a443..48951da11ab4 100644
> --- a/include/hw/vfio/vfio-common.h
> +++ b/include/hw/vfio/vfio-common.h
> @@ -23,6 +23,7 @@
> 
>  #include "exec/memory.h"
>  #include "qemu/queue.h"
> +#include "qemu/iova-tree.h"
>  #include "qemu/notify.h"
>  #include "ui/console.h"
>  #include "hw/display/ramfb.h"
> @@ -81,6 +82,7 @@ typedef struct VFIOContainer {
>      int fd; /* /dev/vfio/vfio, empowered by the attached groups */
>      MemoryListener listener;
>      MemoryListener prereg_listener;
> +    MemoryListener mappings_listener;
>      unsigned iommu_type;
>      Error *error;
>      bool initialized;
> @@ -89,6 +91,8 @@ typedef struct VFIOContainer {
>      uint64_t max_dirty_bitmap_size;
>      unsigned long pgsizes;
>      unsigned int dma_max_mappings;
> +    IOVATree *mappings;
> +    QemuMutex mappings_mutex;
>      QLIST_HEAD(, VFIOGuestIOMMU) giommu_list;
>      QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list;
>      QLIST_HEAD(, VFIOGroup) group_list;
> --
> 2.17.2
> 




* Re: [PATCH v2 10/20] vfio/common: Record DMA mapped IOVA ranges
  2023-03-02 18:42                   ` Alex Williamson
@ 2023-03-03  0:19                     ` Joao Martins
  2023-03-03 16:58                       ` Joao Martins
  0 siblings, 1 reply; 93+ messages in thread
From: Joao Martins @ 2023-03-03  0:19 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Avihai Horon, qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta

On 02/03/2023 18:42, Alex Williamson wrote:
> On Thu, 2 Mar 2023 00:07:35 +0000
> Joao Martins <joao.m.martins@oracle.com> wrote:
>> On 28/02/2023 20:36, Alex Williamson wrote:
>>> On Tue, 28 Feb 2023 12:11:06 +0000
>>> Joao Martins <joao.m.martins@oracle.com> wrote:  
>>>> On 23/02/2023 21:50, Alex Williamson wrote:  
>>>>> On Thu, 23 Feb 2023 21:19:12 +0000
>>>>> Joao Martins <joao.m.martins@oracle.com> wrote:    
>>>>>> On 23/02/2023 21:05, Alex Williamson wrote:    
>>>>>>> On Thu, 23 Feb 2023 10:37:10 +0000
>>>>>>> Joao Martins <joao.m.martins@oracle.com> wrote:      
>>>>>>>> On 22/02/2023 22:10, Alex Williamson wrote:      
>>>>>>>>> On Wed, 22 Feb 2023 19:49:05 +0200
>>>>>>>>> Avihai Horon <avihaih@nvidia.com> wrote:        
>>>>>>>>>> From: Joao Martins <joao.m.martins@oracle.com>
>>>>>>>>>> @@ -612,6 +665,16 @@ static int vfio_dma_map(VFIOContainer *container, hwaddr iova,
>>>>>>>>>>          .iova = iova,
>>>>>>>>>>          .size = size,
>>>>>>>>>>      };
>>>>>>>>>> +    int ret;
>>>>>>>>>> +
>>>>>>>>>> +    ret = vfio_record_mapping(container, iova, size, readonly);
>>>>>>>>>> +    if (ret) {
>>>>>>>>>> +        error_report("vfio: Failed to record mapping, iova: 0x%" HWADDR_PRIx
>>>>>>>>>> +                     ", size: 0x" RAM_ADDR_FMT ", ret: %d (%s)",
>>>>>>>>>> +                     iova, size, ret, strerror(-ret));
>>>>>>>>>> +
>>>>>>>>>> +        return ret;
>>>>>>>>>> +    }        
>>>>>>>>>
>>>>>>>>> Is there no way to replay the mappings when a migration is started?
>>>>>>>>> This seems like a horrible latency and bloat trade-off for the
>>>>>>>>> possibility that the VM might migrate and the device might support
>>>>>>>>> these features.  Our performance with vIOMMU is already terrible, I
>>>>>>>>> can't help but believe this makes it worse.  Thanks,
>>>>>>>>>         
>>>>>>>>
>>>>>>>> It is a nop if the vIOMMU is being used (entries in container->giommu_list) as
>>>>>>>> that uses a max-iova based IOVA range. So this is really for iommu identity
>>>>>>>> mapping and no-VIOMMU.      
>>>>>>>
>>>>>>> Ok, yes, there are no mappings recorded for any containers that have a
>>>>>>> non-empty giommu_list.
>>>>>>>       
>>>>>>>> We could replay them if they were tracked/stored anywhere.      
>>>>>>>
>>>>>>> Rather than piggybacking on vfio_memory_listener, why not simply
>>>>>>> register a new MemoryListener when migration is started?  That will
>>>>>>> replay all the existing ranges and allow tracking to happen separate
>>>>>>> from mapping, and only when needed.  
>>>>>>
>>>>>> The problem with that is that *starting* dirty tracking needs to have all the
>>>>>> range, we aren't supposed to start each range separately. So on a memory
>>>>>> listener callback you don't have introspection when you are dealing with the
>>>>>> last range, do we?    
>>>>>
>>>>> As soon as memory_listener_register() returns, all your callbacks to
>>>>> build the IOVATree have been called and you can act on the result the
>>>>> same as if you were relying on the vfio mapping MemoryListener.  I'm
>>>>> not seeing the problem.  Thanks,
>>>>>     
>>>>
>>>> While doing these changes, the nice thing of the current patch is that whatever
>>>> changes apply to vfio_listener_region_add() will be reflected in the mappings
>>>> tree that stores what we will dirty track. If we move the mappings calculation
>>>> necessary for dirty tracking only when we start, we will have to duplicate the
>>>> same checks, and open for bugs where we ask things to be dirty track-ed that
>>>> haven't been DMA mapped. These two aren't necessarily tied, but felt like I
>>>> should raise the potentially duplication of the checks (and the same thing
>>>> applies for handling virtio-mem and what not).
>>>>
>>>> I understand that if we were going to store *a lot* of mappings that this would
>>>> add up in space requirements. But for no-vIOMMU (or iommu=pt) case this is only
>>>> about 12ranges or so, it is much simpler to piggyback the existing listener.
>>>> Would you still want to move this to its own dedicated memory listener?  
>>>
>>> Code duplication and bugs are good points, but while typically we're
>>> only seeing a few handfuls of ranges, doesn't virtio-mem in particular
>>> allow that we could be seeing quite a lot more?
>>>   
>> Ugh yes, it could be.
>>
>>> We used to be limited to a fairly small number of KVM memory slots,
>>> which effectively bounded non-vIOMMU DMA mappings, but that value is
>>> now 2^15, so we need to anticipate that we could see many more than a
>>> dozen mappings.
>>>   
>>
>> Even with 32k memory slots today we are still reduced on a handful. hv-balloon
>> and virtio-mem approaches though are the ones that may stress such limit IIUC
>> prior to starting migration.
>>
>>> Can we make the same argument that the overhead is negligible if a VM
>>> makes use of 10s of GB of virtio-mem with 2MB block size?
>>>
>>> But then on a 4KB host we're limited to 256 tracking entries, so
>>> wasting all that time and space on a runtime IOVATree is even more
>>> dubious.
>>>
>>> In fact, it doesn't really matter that vfio_listener_region_add and
>>> this potentially new listener come to the same result, as long as the
>>> new listener is a superset of the existing listener.   
>>
>> I am trying to put this in a way that's not too ugly to reuse the most between
>> vfio_listener_region_add() and the vfio_migration_mapping_add().
>>
>> For you to have an idea, here's so far how it looks thus far:
>>
>> https://github.com/jpemartins/qemu/commits/vfio-dirty-tracking
>>
>> Particularly this one:
>>
>> https://github.com/jpemartins/qemu/commit/3b11fa0e4faa0f9c0f42689a7367284a25d1b585
>>
>> vfio_get_section_iova_range() is where most of these checks are that are sort of
>> a subset of the ones in vfio_listener_region_add().
>>
>>> So I think we can
>>> simplify out a lot of the places we'd see duplication and bugs.  I'm
>>> not even really sure why we wouldn't simplify things further and only
>>> record a single range covering the low and high memory marks for a
>>> non-vIOMMU VMs, or potentially an approximation removing gaps of 1GB or
>>> more, for example.  Thanks,  
>>
>> Yes, for Qemu, to have one single artificial range with a computed min IOVA and
>> max IOVA is the simplest to get it implemented. It would avoid us maintaining an
>> IOVATree as you would only track min/max pair (maybe max_below).
>>
>> My concern with a reduced single range is 1) big holes in address space leading
>> to asking more than you need[*] and then 2) device dirty tracking limits e.g.
>> hardware may have upper limits, so you may prematurely exercise those. So giving
>> more choice to the vfio drivers to decide how to cope with the mapped address
>> space description looks to have a bit more longevity.
> 
> The fact that we don't know anything about the device dirty tracking
> limits worries me.  If QEMU reports the VM is migratable, ie. lacking
> migration blockers, then we really shouldn't have non-extraordinary
> things like the VM actually having a bigger address space than the
> device can support, or enabling a vIOMMU, suddenly make the VM
> non-migratable.
> 
Makes sense.

> If we only needed to worry about scenarios like the AMD hypertransport
> memory relocation, 

So far that's my major concern, considering it's still a 1T gap.

> then tracking ranges for 32-bit and 64-bit RAM
> separately would be an easy solution, we always have 1-2 ranges for the
> device to track.  That's still a big simplification from tracking every
> DMA mappings.
> 
I was too stuck on the naming in the PC code, so I thought it would be too x86
specific. But in reality we would just be covering 32-bit and 64-bit limits, so
it should cover everybody without being specific to a target.

The kernel (and device) will ultimately grab that info and adjust it to its own
limits if it needs to be split, as long as it can track what's requested.

That should work, and greatly simplify things here as you say. And later on for
vIOMMU we can expand the max limit to cover the 39-bit/48-bit Intel max (or any
equivalent max defined by another IOMMU).
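
Roughly, the bookkeeping could end up looking something like this (only a
sketch; the struct, field and helper names are made up, and the min fields
are assumed to start at HWADDR_MAX with the max fields at 0 before the
listener replay):

    typedef struct VFIODirtyRanges {
        hwaddr min32, max32;    /* DMA mapped below 4G */
        hwaddr min64, max64;    /* DMA mapped at or above 4G */
    } VFIODirtyRanges;

    static void vfio_dirty_tracking_update(VFIODirtyRanges *ranges,
                                           hwaddr iova, hwaddr end)
    {
        hwaddr *min, *max;

        /* Pick the 32-bit or 64-bit bucket based on where the section ends. */
        if (end <= UINT32_MAX) {
            min = &ranges->min32;
            max = &ranges->max32;
        } else {
            min = &ranges->min64;
            max = &ranges->max64;
        }

        *min = MIN(*min, iova);
        *max = MAX(*max, end);
    }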

> AIUI, we really can't even rely on the device supporting a full host
> page size worth of mappings, the uAPI only stipulates that the core
> kernel code will support such a request. 

Yes. Limited to LOG_MAX_RANGES, which is 256 ranges IIRC.

> So it seems prudent that
> userspace should conserve entries wherever it can.

Indeed.

> For the
> alternative, to provide ranges that closely match actual mappings, I
> think we'd need to be able to collapse IOVATree entries with the
> smallest gap when we reach the limit, and continue to collapse each time
> the driver rejects the number of ranges provided.  That's obviously
> much more complicated and I'd prefer to avoid it if there are easier
> approximations.
>

I am starting to like the min/max {32,64} range limit approach, given the much
lower runtime overhead and complexity to maintain [while supporting the same
things].

>> Anyway the temptation with having a single range is that this can all go away if
>> the vfio_listener_region_add() tracks just min/max IOVA pair.
>>
>> Below scissors mark it's how this patch is looking like in the commit above
>> while being a full list of mappings. It's also stored here:
>>
>> https://github.com/jpemartins/qemu/commits/vfio-dirty-tracking
>>
>> I'll respond here with a patch on what it looks like with the range watermark
>> approach.
>>
>> 	Joao
>>
>> [0] AMD 1T boundary is what comes to mind, which on Qemu relocates memory above
>> 4G into after 1T.
>>
>> ------------------>8--------------------  
>>
>> From: Joao Martins <joao.m.martins@oracle.com>
>> Date: Wed, 22 Feb 2023 19:49:05 +0200
>> Subject: [PATCH wip 7/12] vfio/common: Record DMA mapped IOVA ranges
>>
>> According to the device DMA logging uAPI, IOVA ranges to be logged by
>> the device must be provided all at once upon DMA logging start.
>>
>> As preparation for the following patches which will add device dirty
>> page tracking, keep a record of all DMA mapped IOVA ranges so later they
>> can be used for DMA logging start.
>>
>> Note that when vIOMMU is enabled DMA mapped IOVA ranges are not tracked.
>> This is due to the dynamic nature of vIOMMU DMA mapping/unmapping.
>>
>> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
>> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
>> ---
>>  hw/vfio/common.c              | 147 +++++++++++++++++++++++++++++++++-
>>  hw/vfio/trace-events          |   2 +
>>  include/hw/vfio/vfio-common.h |   4 +
>>  3 files changed, 150 insertions(+), 3 deletions(-)
>>
>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>> index 655e8dbb74d4..17971e6dbaeb 100644
>> --- a/hw/vfio/common.c
>> +++ b/hw/vfio/common.c
>> @@ -44,6 +44,7 @@
>>  #include "migration/blocker.h"
>>  #include "migration/qemu-file.h"
>>  #include "sysemu/tpm.h"
>> +#include "qemu/iova-tree.h"
>>
>>  VFIOGroupList vfio_group_list =
>>      QLIST_HEAD_INITIALIZER(vfio_group_list);
>> @@ -426,6 +427,11 @@ void vfio_unblock_multiple_devices_migration(void)
>>      multiple_devices_migration_blocker = NULL;
>>  }
>>
>> +static bool vfio_have_giommu(VFIOContainer *container)
>> +{
>> +    return !QLIST_EMPTY(&container->giommu_list);
>> +}
> 
> I think it's the case, but can you confirm we build the giommu_list
> regardless of whether the vIOMMU is actually enabled?
> 
I think that is only non-empty once we have the first IOVA mappings; e.g. in
IOMMU passthrough mode *I think* it's empty. Let me confirm.

Otherwise I'll have to find a TYPE_IOMMU_MEMORY_REGION object to determine if
the VM was configured with a vIOMMU or not. That is to create the LM blocker.

>> +
>>  static void vfio_set_migration_error(int err)
>>  {
>>      MigrationState *ms = migrate_get_current();
>> @@ -610,6 +616,7 @@ static int vfio_dma_map(VFIOContainer *container, hwaddr iova,
>>          .iova = iova,
>>          .size = size,
>>      };
>> +    int ret;
>>
>>      if (!readonly) {
>>          map.flags |= VFIO_DMA_MAP_FLAG_WRITE;
>> @@ -626,8 +633,10 @@ static int vfio_dma_map(VFIOContainer *container, hwaddr iova,
>>          return 0;
>>      }
>>
>> +    ret = -errno;
>>      error_report("VFIO_MAP_DMA failed: %s", strerror(errno));
>> -    return -errno;
>> +
>> +    return ret;
>>  }
>>
>>  static void vfio_host_win_add(VFIOContainer *container,
>> @@ -1326,11 +1335,127 @@ static int vfio_set_dirty_page_tracking(VFIOContainer
>> *container, bool start)
>>      return ret;
>>  }
>>
>> +static bool vfio_get_section_iova_range(VFIOContainer *container,
>> +                                        MemoryRegionSection *section,
>> +                                        hwaddr *out_iova, hwaddr *out_end)
>> +{
>> +    Int128 llend, llsize;
>> +    hwaddr iova, end;
>> +
>> +    iova = REAL_HOST_PAGE_ALIGN(section->offset_within_address_space);
>> +    llend = int128_make64(section->offset_within_address_space);
>> +    llend = int128_add(llend, section->size);
>> +    llend = int128_and(llend, int128_exts64(qemu_real_host_page_mask()));
>> +
>> +    if (int128_ge(int128_make64(iova), llend)) {
>> +        return false;
>> +    }
>> +    end = int128_get64(int128_sub(llend, int128_one()));
>> +
>> +    if (memory_region_is_iommu(section->mr) ||
> 
> Shouldn't there already be a migration blocker in place preventing this
> from being possible?
> 
Yes. As mentioned in my previous comment, I am still working it out.

>> +        memory_region_has_ram_discard_manager(section->mr)) {
> 
> Are we claiming not to support virtio-mem VMs as well? 

That was not the intention. From the explanation below, I likely misunderstood
the handling of unpopulated parts and included that check.

> The current
> comment in vfio/common.c that states we only want to map actually
> populated parts seems like it doesn't apply here, we'd want dirty
> tracking ranges to include these regardless.  Unless there's some
> reason virtio-mem changes are blocked during pre-copy.
> 
As far as I am aware, virtio-mem is deemed busy when !migration_is_idle(), hence
plug and unplug requests are blocked throughout migration (device add/del is also
blocked during migration, so hotplug of memory/CPUs/devices is blocked as well).

Anyway, your point still holds regardless; I'll drop the check.

>> +	return false;
>> +    }
>> +
>> +    llsize = int128_sub(llend, int128_make64(iova));
>> +
>> +    if (memory_region_is_ram_device(section->mr)) {
>> +        VFIOHostDMAWindow *hostwin;
>> +        hwaddr pgmask;
>> +
>> +        hostwin = vfio_find_hostwin(container, iova, end);
>> +        if (!hostwin) {
>> +            return false;
>> +        }
>> +
>> +	pgmask = (1ULL << ctz64(hostwin->iova_pgsizes)) - 1;
>> +        if ((iova & pgmask) || (int128_get64(llsize) & pgmask)) {
>> +            return false;
>> +        }
>> +    }
> 
> ram_device is intended to be an address range on another device, so do
> we really need it in DMA dirty tracking?

I don't think so.

> ex. we don't include device
> BARs in the dirty bitmap, we expect modified device state to be
> reported by the device, so it seems like there's no case where we'd
> include this in the device dirty tracking ranges.
> 
/me nods

>> +
>> +    *out_iova = iova;
>> +    *out_end = int128_get64(llend);
>> +    return true;
>> +}
>> +
>> +static void vfio_migration_add_mapping(MemoryListener *listener,
>> +                                       MemoryRegionSection *section)
>> +{
>> +    VFIOContainer *container = container_of(listener, VFIOContainer,
>> mappings_listener);
>> +    hwaddr end = 0;
>> +    DMAMap map;
>> +    int ret;
>> +
>> +    if (vfio_have_giommu(container)) {
>> +        vfio_set_migration_error(-EOPNOTSUPP);
> 
> There should be a migration blocker that prevents this from ever being
> called in this case.
> 
Correct.

>> +        return;
>> +    }
>> +
>> +    if (!vfio_listener_valid_section(section) ||
>> +        !vfio_get_section_iova_range(container, section, &map.iova, &end)) {
>> +        return;
>> +    }
>> +
>> +    map.size = end - map.iova - 1; // IOVATree is inclusive, so subtract 1 from
>> size
>> +    map.perm = section->readonly ? IOMMU_RO : IOMMU_RW;
>> +
>> +    WITH_QEMU_LOCK_GUARD(&container->mappings_mutex) {
>> +        ret = iova_tree_insert(container->mappings, &map);
>> +        if (ret) {
>> +            if (ret == IOVA_ERR_INVALID) {
>> +                ret = -EINVAL;
>> +            } else if (ret == IOVA_ERR_OVERLAP) {
>> +                ret = -EEXIST;
>> +            }
>> +        }
>> +    }
>> +
>> +    trace_vfio_migration_mapping_add(map.iova, map.iova + map.size, ret);
>> +
>> +    if (ret)
>> +        vfio_set_migration_error(ret);
>> +    return;
>> +}
>> +
>> +static void vfio_migration_remove_mapping(MemoryListener *listener,
>> +                                          MemoryRegionSection *section)
>> +{
>> +    VFIOContainer *container = container_of(listener, VFIOContainer,
>> mappings_listener);
>> +    hwaddr end = 0;
>> +    DMAMap map;
>> +
>> +    if (vfio_have_giommu(container)) {
>> +        vfio_set_migration_error(-EOPNOTSUPP);
>> +        return;
>> +    }
>> +
>> +    if (!vfio_listener_valid_section(section) ||
>> +        !vfio_get_section_iova_range(container, section, &map.iova, &end)) {
>> +        return;
>> +    }
>> +
>> +    WITH_QEMU_LOCK_GUARD(&container->mappings_mutex) {
>> +        iova_tree_remove(container->mappings, map);
>> +    }
>> +
>> +    trace_vfio_migration_mapping_del(map.iova, map.iova + map.size);
>> +}
> 
> Why do we need a region_del callback?  We don't support modifying the
> dirty tracking ranges we've provided to the device.
> 

My intention with a region_del callback was the simple case where migration
fails or is cancelled and you want to try again later on, which could mean a
different min/max on each retry depending on the setup. Given that most
operations that change the address space are blocked, this would work.

In the range alternative I was only clearing the min/max back to zero.

>> +
>> +
>> +static const MemoryListener vfio_dirty_tracking_listener = {
>> +    .name = "vfio-migration",
>> +    .region_add = vfio_migration_add_mapping,
>> +    .region_del = vfio_migration_remove_mapping,
>> +};
>> +
>>  static void vfio_listener_log_global_start(MemoryListener *listener)
>>  {
>>      VFIOContainer *container = container_of(listener, VFIOContainer, listener);
>>      int ret;
>>
>> +    memory_listener_register(&container->mappings_listener, container->space->as);
>> +
>>      ret = vfio_set_dirty_page_tracking(container, true);
>>      if (ret) {
>>          vfio_set_migration_error(ret);
>> @@ -1346,6 +1471,8 @@ static void vfio_listener_log_global_stop(MemoryListener
>> *listener)
>>      if (ret) {
>>          vfio_set_migration_error(ret);
>>      }
>> +
>> +    memory_listener_unregister(&container->mappings_listener);
> 
> We don't have a way to update dirty tracking ranges for a device once
> dirty tracking is enabled, so what's the point of this listener running
> in more than a one-shot mode? 

See above.

> The only purpose of a listener
> continuing to run seems like it would be to generate an error for
> untracked ranges and either generate a migration error or mark them
> perpetually dirty.
> 
True for vIOMMU. But my intention was the "migration failure and later retry"
case. I am not sure we should be outright deleting what we requested tracking
for, but we certainly shouldn't be changing it, yes.

I can delete the region_del callback, and on migration (re)start clear/destroy
the mapping tracking info if it is already there.

>>  }
>>
>>  static int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
>> @@ -2172,16 +2299,24 @@ static int vfio_connect_container(VFIOGroup *group,
>> AddressSpace *as,
>>      QLIST_INIT(&container->giommu_list);
>>      QLIST_INIT(&container->hostwin_list);
>>      QLIST_INIT(&container->vrdl_list);
>> +    container->mappings = iova_tree_new();
>> +    if (!container->mappings) {
>> +        error_setg(errp, "Cannot allocate DMA mappings tree");
>> +        ret = -ENOMEM;
>> +        goto free_container_exit;
>> +    }
>> +    qemu_mutex_init(&container->mappings_mutex);
>> +    container->mappings_listener = vfio_dirty_tracking_listener;
> 
> This all seems like code that would only be necessary before starting
> the listener.
> 
I can move it there.

>>
>>      ret = vfio_init_container(container, group->fd, errp);
>>      if (ret) {
>> -        goto free_container_exit;
>> +        goto destroy_mappings_exit;
>>      }
>>
>>      ret = vfio_ram_block_discard_disable(container, true);
>>      if (ret) {
>>          error_setg_errno(errp, -ret, "Cannot set discarding of RAM broken");
>> -        goto free_container_exit;
>> +        goto destroy_mappings_exit;
>>      }
>>
>>      switch (container->iommu_type) {
>> @@ -2317,6 +2452,10 @@ listener_release_exit:
>>  enable_discards_exit:
>>      vfio_ram_block_discard_disable(container, false);
>>
>> +destroy_mappings_exit:
>> +    qemu_mutex_destroy(&container->mappings_mutex);
>> +    iova_tree_destroy(container->mappings);
>> +
>>  free_container_exit:
>>      g_free(container);
>>
>> @@ -2371,6 +2510,8 @@ static void vfio_disconnect_container(VFIOGroup *group)
>>          }
>>
>>          trace_vfio_disconnect_container(container->fd);
>> +        qemu_mutex_destroy(&container->mappings_mutex);
>> +        iova_tree_destroy(container->mappings);
> 
> The IOVATree should be destroyed as soon as we're done processing the
> result upon starting logging.  It serves no purpose to keep it around.
> 
OK

> Comparing with the follow-up, setting {min,max}_tracking_iova, many of
> the same comments apply.

/me nods

> Both of these are only preparing for the
> question of what do we actually do with this data.  In the IOVATree
> approach, we have more fine grained information, but we can also exceed
> what the device supports and we need to be able to handle that.  If our
> fallback is to simply identify the min and max based on the IOVATree,
> and we expect that to work better than the more granular approach, why
> not start with just min/max?  If we expect there's value to the more
> granular approach, then when not proceed to collapse the IOVATree until
> we find a set of ranges the device can support?  Thanks,
> 

Your proposed solution for the HT address space gap handles my bigger concern.
I think it makes more sense to adopt a simplistic approach, especially as we
know it is also applicable to the vIOMMU case that a later series will handle,
and it is less fragile against the limited number of ranges in the uAPI. On
second consideration, while a two-range watermark isn't faithful to the real
guest address space, it still lets the VFIO driver pick and split it to
accommodate its own requirements. The range approach also doesn't affect how we
request the logging reports, which still honor the real address space (hence no
difference in runtime bitmap sizes).

With the tree approach, I fear that collapsing the tree already adds too much
complexity to accommodate the reduced number of ranges we can pass, in addition
to the runtime overhead it implies (especially for virtio-mem-like cases). I am
happy to resurrect that approach, should it be deemed necessary.
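
For reference, the collapsing would amount to something like the below (a rough
sketch over a plain sorted array rather than the IOVATree API, with made-up
types), merging the two neighbours separated by the smallest gap until the
device's range limit is met:

    typedef struct { hwaddr iova; hwaddr end; } TrackRange;    /* made-up type */

    static void collapse_ranges(TrackRange *ranges, unsigned *nr, unsigned limit)
    {
        while (*nr > limit) {
            unsigned best = 0;
            hwaddr best_gap = HWADDR_MAX;

            for (unsigned i = 0; i + 1 < *nr; i++) {
                hwaddr gap = ranges[i + 1].iova - ranges[i].end;
                if (gap < best_gap) {
                    best_gap = gap;
                    best = i;
                }
            }

            /* Merge ranges[best] and ranges[best + 1], absorbing the gap. */
            ranges[best].end = ranges[best + 1].end;
            memmove(&ranges[best + 1], &ranges[best + 2],
                    (*nr - best - 2) * sizeof(*ranges));
            (*nr)--;
        }
    }

The merge step itself isn't much code; the complexity I worry about is keeping
the tree coherent at runtime and retrying whenever the driver rejects the number
of ranges provided.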

Unless there are objections, we should be able to send v3 of this series
addressing all the comments above (and those already given earlier in the
series) using the min/max range approach, *I hope* no later than tomorrow.

> Alex
> 
>>          close(container->fd);
>>          g_free(container);
>>
>> diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
>> index 669d9fe07cd9..c92eaadcc7c4 100644
>> --- a/hw/vfio/trace-events
>> +++ b/hw/vfio/trace-events
>> @@ -104,6 +104,8 @@ vfio_known_safe_misalignment(const char *name, uint64_t
>> iova, uint64_t offset_wi
>>  vfio_listener_region_add_no_dma_map(const char *name, uint64_t iova, uint64_t
>> size, uint64_t page_size) "Region \"%s\" 0x%"PRIx64" size=0x%"PRIx64" is not
>> aligned to 0x%"PRIx64" and cannot be mapped for DMA"
>>  vfio_listener_region_del_skip(uint64_t start, uint64_t end) "SKIPPING
>> region_del 0x%"PRIx64" - 0x%"PRIx64
>>  vfio_listener_region_del(uint64_t start, uint64_t end) "region_del 0x%"PRIx64"
>> - 0x%"PRIx64
>> +vfio_migration_mapping_add(uint64_t start, uint64_t end, int err) "mapping_add
>> 0x%"PRIx64" - 0x%"PRIx64" err=%d"
>> +vfio_migration_mapping_del(uint64_t start, uint64_t end) "mapping_del
>> 0x%"PRIx64" - 0x%"PRIx64
>>  vfio_disconnect_container(int fd) "close container->fd=%d"
>>  vfio_put_group(int fd) "close group->fd=%d"
>>  vfio_get_device(const char * name, unsigned int flags, unsigned int
>> num_regions, unsigned int num_irqs) "Device %s flags: %u, regions: %u, irqs: %u"
>> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
>> index 87524c64a443..48951da11ab4 100644
>> --- a/include/hw/vfio/vfio-common.h
>> +++ b/include/hw/vfio/vfio-common.h
>> @@ -23,6 +23,7 @@
>>
>>  #include "exec/memory.h"
>>  #include "qemu/queue.h"
>> +#include "qemu/iova-tree.h"
>>  #include "qemu/notify.h"
>>  #include "ui/console.h"
>>  #include "hw/display/ramfb.h"
>> @@ -81,6 +82,7 @@ typedef struct VFIOContainer {
>>      int fd; /* /dev/vfio/vfio, empowered by the attached groups */
>>      MemoryListener listener;
>>      MemoryListener prereg_listener;
>> +    MemoryListener mappings_listener;
>>      unsigned iommu_type;
>>      Error *error;
>>      bool initialized;
>> @@ -89,6 +91,8 @@ typedef struct VFIOContainer {
>>      uint64_t max_dirty_bitmap_size;
>>      unsigned long pgsizes;
>>      unsigned int dma_max_mappings;
>> +    IOVATree *mappings;
>> +    QemuMutex mappings_mutex;
>>      QLIST_HEAD(, VFIOGuestIOMMU) giommu_list;
>>      QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list;
>>      QLIST_HEAD(, VFIOGroup) group_list;
>> --
>> 2.17.2
>>
> 



* Re: [PATCH v2 10/20] vfio/common: Record DMA mapped IOVA ranges
  2023-03-03  0:19                     ` Joao Martins
@ 2023-03-03 16:58                       ` Joao Martins
  2023-03-03 17:05                         ` Alex Williamson
  0 siblings, 1 reply; 93+ messages in thread
From: Joao Martins @ 2023-03-03 16:58 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Avihai Horon, qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta

On 03/03/2023 00:19, Joao Martins wrote:
> On 02/03/2023 18:42, Alex Williamson wrote:
>> On Thu, 2 Mar 2023 00:07:35 +0000
>> Joao Martins <joao.m.martins@oracle.com> wrote:
>>> @@ -426,6 +427,11 @@ void vfio_unblock_multiple_devices_migration(void)
>>>      multiple_devices_migration_blocker = NULL;
>>>  }
>>>
>>> +static bool vfio_have_giommu(VFIOContainer *container)
>>> +{
>>> +    return !QLIST_EMPTY(&container->giommu_list);
>>> +}
>>
>> I think it's the case, but can you confirm we build the giommu_list
>> regardless of whether the vIOMMU is actually enabled?
>>
> I think that is only non-empty when we have the first IOVA mappings e.g. on
> IOMMU passthrough mode *I think* it's empty. Let me confirm.
> 
Yeap, it's empty.

> Otherwise I'll have to find a TYPE_IOMMU_MEMORY_REGION object to determine if
> the VM was configured with a vIOMMU or not. That is to create the LM blocker.
> 
I am trying it this way, with something like the snippet below, but neither
x86_iommu_get_default() nor the below is really working out yet. I am a little
afraid of having to add the live migration blocker in each machine_init_done
hook, unless there's a more obvious way. vfio_realize should run at a much later
stage, so I am surprised that an IOMMU object doesn't exist at that time.

@@ -416,9 +421,26 @@ void vfio_unblock_multiple_devices_migration(void)
     multiple_devices_migration_blocker = NULL;
 }

-static bool vfio_have_giommu(VFIOContainer *container)
+int vfio_block_giommu_migration(Error **errp)
 {
-    return !QLIST_EMPTY(&container->giommu_list);
+    int ret;
+
+    if (!object_resolve_path_type("", TYPE_INTEL_IOMMU_DEVICE, NULL) ||
+        !object_resolve_path_type("", TYPE_AMD_IOMMU_DEVICE, NULL) ||
+        !object_resolve_path_type("", TYPE_ARM_SMMU, NULL) ||
+        !object_resolve_path_type("", TYPE_VIRTIO_IOMMU, NULL)) {
+       return 0;
+    }
+
+    error_setg(&giommu_migration_blocker,
+               "Migration is currently not supported with vIOMMU enabled");
+    ret = migrate_add_blocker(giommu_migration_blocker, errp);
+    if (ret < 0) {
+        error_free(giommu_migration_blocker);
+        giommu_migration_blocker = NULL;
+    }
+
+    return ret;
 }

diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 8981ae71a6f8..127a44ccaf19 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -649,6 +649,11 @@ int vfio_migration_probe(VFIODevice *vbasedev, Error **errp)
         return ret;
     }

+    ret = vfio_block_giommu_migration(errp);
+    if (ret) {
+        return ret;
+    }
+
     trace_vfio_migration_probe(vbasedev->name);
     return 0;



* Re: [PATCH v2 10/20] vfio/common: Record DMA mapped IOVA ranges
  2023-03-03 16:58                       ` Joao Martins
@ 2023-03-03 17:05                         ` Alex Williamson
  2023-03-03 19:14                           ` Joao Martins
  0 siblings, 1 reply; 93+ messages in thread
From: Alex Williamson @ 2023-03-03 17:05 UTC (permalink / raw)
  To: Joao Martins
  Cc: Avihai Horon, qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta

On Fri, 3 Mar 2023 16:58:55 +0000
Joao Martins <joao.m.martins@oracle.com> wrote:

> On 03/03/2023 00:19, Joao Martins wrote:
> > On 02/03/2023 18:42, Alex Williamson wrote:  
> >> On Thu, 2 Mar 2023 00:07:35 +0000
> >> Joao Martins <joao.m.martins@oracle.com> wrote:  
> >>> @@ -426,6 +427,11 @@ void vfio_unblock_multiple_devices_migration(void)
> >>>      multiple_devices_migration_blocker = NULL;
> >>>  }
> >>>
> >>> +static bool vfio_have_giommu(VFIOContainer *container)
> >>> +{
> >>> +    return !QLIST_EMPTY(&container->giommu_list);
> >>> +}  
> >>
> >> I think it's the case, but can you confirm we build the giommu_list
> >> regardless of whether the vIOMMU is actually enabled?
> >>  
> > I think that is only non-empty when we have the first IOVA mappings e.g. on
> > IOMMU passthrough mode *I think* it's empty. Let me confirm.
> >   
> Yeap, it's empty.
> 
> > Otherwise I'll have to find a TYPE_IOMMU_MEMORY_REGION object to determine if
> > the VM was configured with a vIOMMU or not. That is to create the LM blocker.
> >   
> I am trying this way, with something like this, but neither
> x86_iommu_get_default() nor below is really working out yet. A little afraid of
> having to add the live migration blocker on each machine_init_done hook, unless
> t here's a more obvious way. vfio_realize should be at a much later stage, so I
> am surprised how an IOMMU object doesn't exist at that time.

Can we just test whether the container address space is system_memory?
Thanks,

Alex




* Re: [PATCH v2 10/20] vfio/common: Record DMA mapped IOVA ranges
  2023-03-03 17:05                         ` Alex Williamson
@ 2023-03-03 19:14                           ` Joao Martins
  2023-03-03 19:40                             ` Alex Williamson
  0 siblings, 1 reply; 93+ messages in thread
From: Joao Martins @ 2023-03-03 19:14 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Avihai Horon, qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta

On 03/03/2023 17:05, Alex Williamson wrote:
> On Fri, 3 Mar 2023 16:58:55 +0000
> Joao Martins <joao.m.martins@oracle.com> wrote:
> 
>> On 03/03/2023 00:19, Joao Martins wrote:
>>> On 02/03/2023 18:42, Alex Williamson wrote:  
>>>> On Thu, 2 Mar 2023 00:07:35 +0000
>>>> Joao Martins <joao.m.martins@oracle.com> wrote:  
>>>>> @@ -426,6 +427,11 @@ void vfio_unblock_multiple_devices_migration(void)
>>>>>      multiple_devices_migration_blocker = NULL;
>>>>>  }
>>>>>
>>>>> +static bool vfio_have_giommu(VFIOContainer *container)
>>>>> +{
>>>>> +    return !QLIST_EMPTY(&container->giommu_list);
>>>>> +}  
>>>>
>>>> I think it's the case, but can you confirm we build the giommu_list
>>>> regardless of whether the vIOMMU is actually enabled?
>>>>  
>>> I think that is only non-empty when we have the first IOVA mappings e.g. on
>>> IOMMU passthrough mode *I think* it's empty. Let me confirm.
>>>   
>> Yeap, it's empty.
>>
>>> Otherwise I'll have to find a TYPE_IOMMU_MEMORY_REGION object to determine if
>>> the VM was configured with a vIOMMU or not. That is to create the LM blocker.
>>>   
>> I am trying this way, with something like this, but neither
>> x86_iommu_get_default() nor below is really working out yet. A little afraid of
>> having to add the live migration blocker on each machine_init_done hook, unless
>> t here's a more obvious way. vfio_realize should be at a much later stage, so I
>> am surprised how an IOMMU object doesn't exist at that time.
> 
> Can we just test whether the container address space is system_memory?

IIUC, it doesn't work (see the snippet below).

The problem is that you start as a regular VFIO guest, and it is only when the
guest boots that new mappings get established/invalidated and propagated into
the listeners (vfio_listener_region_add), and they morph into having a giommu.
That is when you can figure out in the higher layers that 'you have a vIOMMU',
as that is when the address space gets changed, without being specific to a
particular IOMMU model. Maybe region_add is where to add it, but then it depends
on the guest.

I was going to attempt it in vtd_machine_done_notify_one()?

@@ -416,9 +416,25 @@ void vfio_unblock_multiple_devices_migration(void)
     multiple_devices_migration_blocker = NULL;
 }

-static bool vfio_have_giommu(VFIOContainer *container)
+static VFIOAddressSpace *vfio_get_address_space(AddressSpace *as);
+
+int vfio_block_giommu_migration(Error **errp)
 {
-    return !QLIST_EMPTY(&container->giommu_list);
+    int ret;
+
+    if (vfio_get_address_space(&address_space_memory)) {
+        return 0;
+    }
+
+    error_setg(&giommu_migration_blocker,
+               "Migration is currently not supported with vIOMMU enabled");
+    ret = migrate_add_blocker(giommu_migration_blocker, errp);
+    if (ret < 0) {
+        error_free(giommu_migration_blocker);
+        giommu_migration_blocker = NULL;
+    }
+
+    return ret;
 }



* Re: [PATCH v2 10/20] vfio/common: Record DMA mapped IOVA ranges
  2023-03-03 19:14                           ` Joao Martins
@ 2023-03-03 19:40                             ` Alex Williamson
  2023-03-03 20:16                               ` Joao Martins
  0 siblings, 1 reply; 93+ messages in thread
From: Alex Williamson @ 2023-03-03 19:40 UTC (permalink / raw)
  To: Joao Martins
  Cc: Avihai Horon, qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta

On Fri, 3 Mar 2023 19:14:50 +0000
Joao Martins <joao.m.martins@oracle.com> wrote:

> On 03/03/2023 17:05, Alex Williamson wrote:
> > On Fri, 3 Mar 2023 16:58:55 +0000
> > Joao Martins <joao.m.martins@oracle.com> wrote:
> >   
> >> On 03/03/2023 00:19, Joao Martins wrote:  
> >>> On 02/03/2023 18:42, Alex Williamson wrote:    
> >>>> On Thu, 2 Mar 2023 00:07:35 +0000
> >>>> Joao Martins <joao.m.martins@oracle.com> wrote:    
> >>>>> @@ -426,6 +427,11 @@ void vfio_unblock_multiple_devices_migration(void)
> >>>>>      multiple_devices_migration_blocker = NULL;
> >>>>>  }
> >>>>>
> >>>>> +static bool vfio_have_giommu(VFIOContainer *container)
> >>>>> +{
> >>>>> +    return !QLIST_EMPTY(&container->giommu_list);
> >>>>> +}    
> >>>>
> >>>> I think it's the case, but can you confirm we build the giommu_list
> >>>> regardless of whether the vIOMMU is actually enabled?
> >>>>    
> >>> I think that is only non-empty when we have the first IOVA mappings e.g. on
> >>> IOMMU passthrough mode *I think* it's empty. Let me confirm.
> >>>     
> >> Yeap, it's empty.
> >>  
> >>> Otherwise I'll have to find a TYPE_IOMMU_MEMORY_REGION object to determine if
> >>> the VM was configured with a vIOMMU or not. That is to create the LM blocker.
> >>>     
> >> I am trying this way, with something like this, but neither
> >> x86_iommu_get_default() nor below is really working out yet. A little afraid of
> >> having to add the live migration blocker on each machine_init_done hook, unless
> >> t here's a more obvious way. vfio_realize should be at a much later stage, so I
> >> am surprised how an IOMMU object doesn't exist at that time.  
> > 
> > Can we just test whether the container address space is system_memory?  
> 
> IIUC, it doesn't work (see below snippet).
> 
> The problem is that you start as a regular VFIO guest, and when the guest boot
> is when new mappings get established/invalidated and propagated into listeners
> (vfio_listener_region_add) and they morph into having a giommu. And that's when
> you can figure out in higher layers that 'you have a vIOMMU' as that's when the
> address space gets changed? That is without being specific to a particular IOMMU
> model. Maybe region_add is where to add, but then it then depends on the guest.

This doesn't seem right to me; look for instance at
pci_device_iommu_address_space(), which returns address_space_memory
when there is no vIOMMU.  If devices share an address space, they can
share a container.  When a vIOMMU is present (not even enabled), each
device gets its own container due to the fact that it's in its own
address space (modulo devices within the same address space due to
aliasing).  Thanks,

Alex




* Re: [PATCH v2 10/20] vfio/common: Record DMA mapped IOVA ranges
  2023-03-03 19:40                             ` Alex Williamson
@ 2023-03-03 20:16                               ` Joao Martins
  2023-03-03 23:47                                 ` Alex Williamson
  0 siblings, 1 reply; 93+ messages in thread
From: Joao Martins @ 2023-03-03 20:16 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Avihai Horon, qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta

On 03/03/2023 19:40, Alex Williamson wrote:
> On Fri, 3 Mar 2023 19:14:50 +0000
> Joao Martins <joao.m.martins@oracle.com> wrote:
> 
>> On 03/03/2023 17:05, Alex Williamson wrote:
>>> On Fri, 3 Mar 2023 16:58:55 +0000
>>> Joao Martins <joao.m.martins@oracle.com> wrote:
>>>   
>>>> On 03/03/2023 00:19, Joao Martins wrote:  
>>>>> On 02/03/2023 18:42, Alex Williamson wrote:    
>>>>>> On Thu, 2 Mar 2023 00:07:35 +0000
>>>>>> Joao Martins <joao.m.martins@oracle.com> wrote:    
>>>>>>> @@ -426,6 +427,11 @@ void vfio_unblock_multiple_devices_migration(void)
>>>>>>>      multiple_devices_migration_blocker = NULL;
>>>>>>>  }
>>>>>>>
>>>>>>> +static bool vfio_have_giommu(VFIOContainer *container)
>>>>>>> +{
>>>>>>> +    return !QLIST_EMPTY(&container->giommu_list);
>>>>>>> +}    
>>>>>>
>>>>>> I think it's the case, but can you confirm we build the giommu_list
>>>>>> regardless of whether the vIOMMU is actually enabled?
>>>>>>    
>>>>> I think that is only non-empty when we have the first IOVA mappings e.g. on
>>>>> IOMMU passthrough mode *I think* it's empty. Let me confirm.
>>>>>     
>>>> Yeap, it's empty.
>>>>  
>>>>> Otherwise I'll have to find a TYPE_IOMMU_MEMORY_REGION object to determine if
>>>>> the VM was configured with a vIOMMU or not. That is to create the LM blocker.
>>>>>     
>>>> I am trying this way, with something like this, but neither
>>>> x86_iommu_get_default() nor below is really working out yet. A little afraid of
>>>> having to add the live migration blocker on each machine_init_done hook, unless
>>>> t here's a more obvious way. vfio_realize should be at a much later stage, so I
>>>> am surprised how an IOMMU object doesn't exist at that time.  
>>>
>>> Can we just test whether the container address space is system_memory?  
>>
>> IIUC, it doesn't work (see below snippet).
>>
>> The problem is that you start as a regular VFIO guest, and when the guest boot
>> is when new mappings get established/invalidated and propagated into listeners
>> (vfio_listener_region_add) and they morph into having a giommu. And that's when
>> you can figure out in higher layers that 'you have a vIOMMU' as that's when the
>> address space gets changed? That is without being specific to a particular IOMMU
>> model. Maybe region_add is where to add, but then it then depends on the guest.
> 
> This doesn't seem right to me, look for instance at
> pci_device_iommu_address_space() which returns address_space_memory
> when there is no vIOMMU.  If devices share an address space, they can
> share a container.  When a vIOMMU is present (not even enabled), each
> device gets it's own container due to the fact that it's in its own
> address space (modulo devices within the same address space due to
> aliasing).

You're obviously right, I was reading this whole thing wrong. This works as far
as I have tested with an iommu=pt guest (and without a vIOMMU).

I am going to shape this up and hopefully submit v3 overnight.

@@ -416,9 +416,26 @@ void vfio_unblock_multiple_devices_migration(void)
     multiple_devices_migration_blocker = NULL;
 }

-static bool vfio_have_giommu(VFIOContainer *container)
+static VFIOAddressSpace *vfio_get_address_space(AddressSpace *as);
+
+int vfio_block_giommu_migration(VFIODevice *vbasedev, Error **errp)
 {
-    return !QLIST_EMPTY(&container->giommu_list);
+    int ret;
+
+    if (vbasedev->type == VFIO_DEVICE_TYPE_PCI &&
+       !vfio_has_iommu(vbasedev)) {
+       return 0;
+    }
+
+    error_setg(&giommu_migration_blocker,
+               "Migration is currently not supported with vIOMMU enabled");
+    ret = migrate_add_blocker(giommu_migration_blocker, errp);
+    if (ret < 0) {
+        error_free(giommu_migration_blocker);
+        giommu_migration_blocker = NULL;
+    }
+
+    return ret;
 }
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 939dcc3d4a9e..f4cf0b41a157 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -2843,6 +2843,15 @@ static void vfio_unregister_req_notifier(VFIOPCIDevice *vdev)
     vdev->req_enabled = false;
 }

+bool vfio_has_iommu(VFIODevice *vbasedev)
+{
+    VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
+    PCIDevice *pdev = &vdev->pdev;
+    AddressSpace *as = &address_space_memory;
+
+    return !(pci_device_iommu_address_space(pdev) == as);
+}
+




* Re: [PATCH v2 10/20] vfio/common: Record DMA mapped IOVA ranges
  2023-03-03 20:16                               ` Joao Martins
@ 2023-03-03 23:47                                 ` Alex Williamson
  2023-03-03 23:57                                   ` Joao Martins
  0 siblings, 1 reply; 93+ messages in thread
From: Alex Williamson @ 2023-03-03 23:47 UTC (permalink / raw)
  To: Joao Martins
  Cc: Avihai Horon, qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta

On Fri, 3 Mar 2023 20:16:19 +0000
Joao Martins <joao.m.martins@oracle.com> wrote:

> On 03/03/2023 19:40, Alex Williamson wrote:
> > On Fri, 3 Mar 2023 19:14:50 +0000
> > Joao Martins <joao.m.martins@oracle.com> wrote:
> >   
> >> On 03/03/2023 17:05, Alex Williamson wrote:  
> >>> On Fri, 3 Mar 2023 16:58:55 +0000
> >>> Joao Martins <joao.m.martins@oracle.com> wrote:
> >>>     
> >>>> On 03/03/2023 00:19, Joao Martins wrote:    
> >>>>> On 02/03/2023 18:42, Alex Williamson wrote:      
> >>>>>> On Thu, 2 Mar 2023 00:07:35 +0000
> >>>>>> Joao Martins <joao.m.martins@oracle.com> wrote:      
> >>>>>>> @@ -426,6 +427,11 @@ void vfio_unblock_multiple_devices_migration(void)
> >>>>>>>      multiple_devices_migration_blocker = NULL;
> >>>>>>>  }
> >>>>>>>
> >>>>>>> +static bool vfio_have_giommu(VFIOContainer *container)
> >>>>>>> +{
> >>>>>>> +    return !QLIST_EMPTY(&container->giommu_list);
> >>>>>>> +}      
> >>>>>>
> >>>>>> I think it's the case, but can you confirm we build the giommu_list
> >>>>>> regardless of whether the vIOMMU is actually enabled?
> >>>>>>      
> >>>>> I think that is only non-empty when we have the first IOVA mappings e.g. on
> >>>>> IOMMU passthrough mode *I think* it's empty. Let me confirm.
> >>>>>       
> >>>> Yeap, it's empty.
> >>>>    
> >>>>> Otherwise I'll have to find a TYPE_IOMMU_MEMORY_REGION object to determine if
> >>>>> the VM was configured with a vIOMMU or not. That is to create the LM blocker.
> >>>>>       
> >>>> I am trying this way, with something like this, but neither
> >>>> x86_iommu_get_default() nor below is really working out yet. A little afraid of
> >>>> having to add the live migration blocker on each machine_init_done hook, unless
> >>>> t here's a more obvious way. vfio_realize should be at a much later stage, so I
> >>>> am surprised how an IOMMU object doesn't exist at that time.    
> >>>
> >>> Can we just test whether the container address space is system_memory?    
> >>
> >> IIUC, it doesn't work (see below snippet).
> >>
> >> The problem is that you start as a regular VFIO guest, and when the guest boot
> >> is when new mappings get established/invalidated and propagated into listeners
> >> (vfio_listener_region_add) and they morph into having a giommu. And that's when
> >> you can figure out in higher layers that 'you have a vIOMMU' as that's when the
> >> address space gets changed? That is without being specific to a particular IOMMU
> >> model. Maybe region_add is where to add, but then it then depends on the guest.  
> > 
> > This doesn't seem right to me, look for instance at
> > pci_device_iommu_address_space() which returns address_space_memory
> > when there is no vIOMMU.  If devices share an address space, they can
> > share a container.  When a vIOMMU is present (not even enabled), each
> > device gets it's own container due to the fact that it's in its own
> > address space (modulo devices within the same address space due to
> > aliasing).  
> 
> You're obviously right, I was reading this whole thing wrong. This works as far
> as I tested with an iommu=pt guest (and without an vIOMMU).
> 
> I am gonna shape this up, and hopefully submit v3 during over night.
> 
> @@ -416,9 +416,26 @@ void vfio_unblock_multiple_devices_migration(void)
>      multiple_devices_migration_blocker = NULL;
>  }
> 
> -static bool vfio_have_giommu(VFIOContainer *container)
> +static VFIOAddressSpace *vfio_get_address_space(AddressSpace *as);
> +
> +int vfio_block_giommu_migration(VFIODevice *vbasedev, Error **errp)
>  {
> -    return !QLIST_EMPTY(&container->giommu_list);
> +    int ret;
> +
> +    if (vbasedev->type == VFIO_DEVICE_TYPE_PCI &&
> +       !vfio_has_iommu(vbasedev)) {
> +       return 0;
> +    }
> +
> +    error_setg(&giommu_migration_blocker,
> +               "Migration is currently not supported with vIOMMU enabled");
> +    ret = migrate_add_blocker(giommu_migration_blocker, errp);
> +    if (ret < 0) {
> +        error_free(giommu_migration_blocker);
> +        giommu_migration_blocker = NULL;
> +    }
> +
> +    return ret;
>  }
> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> index 939dcc3d4a9e..f4cf0b41a157 100644
> --- a/hw/vfio/pci.c
> +++ b/hw/vfio/pci.c
> @@ -2843,6 +2843,15 @@ static void vfio_unregister_req_notifier(VFIOPCIDevice *vdev)
>      vdev->req_enabled = false;
>  }
> 
> +bool vfio_has_iommu(VFIODevice *vbasedev)
> +{
> +    VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
> +    PCIDevice *pdev = &vdev->pdev;
> +    AddressSpace *as = &address_space_memory;
> +
> +    return !(pci_device_iommu_address_space(pdev) == as);
> +}


Shouldn't this be something non-PCI specific like:

    return vbasedev->group->container->space->as != &address_space_memory;
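
i.e. slotted into the blocker helper from your earlier snippet, roughly
(untested sketch, not necessarily what v3 should look like verbatim):

    static Error *giommu_migration_blocker;     /* file scope, as before */

    int vfio_block_giommu_migration(VFIODevice *vbasedev, Error **errp)
    {
        int ret;

        if (vbasedev->group->container->space->as == &address_space_memory) {
            return 0;   /* no vIOMMU backing this container */
        }

        error_setg(&giommu_migration_blocker,
                   "Migration is currently not supported with vIOMMU enabled");
        ret = migrate_add_blocker(giommu_migration_blocker, errp);
        if (ret < 0) {
            error_free(giommu_migration_blocker);
            giommu_migration_blocker = NULL;
        }

        return ret;
    }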

Thanks,
Alex




* Re: [PATCH v2 10/20] vfio/common: Record DMA mapped IOVA ranges
  2023-03-03 23:47                                 ` Alex Williamson
@ 2023-03-03 23:57                                   ` Joao Martins
  2023-03-04  0:21                                     ` Joao Martins
  0 siblings, 1 reply; 93+ messages in thread
From: Joao Martins @ 2023-03-03 23:57 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Avihai Horon, qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta

On 03/03/2023 23:47, Alex Williamson wrote:
> On Fri, 3 Mar 2023 20:16:19 +0000
> Joao Martins <joao.m.martins@oracle.com> wrote:
> 
>> On 03/03/2023 19:40, Alex Williamson wrote:
>>> On Fri, 3 Mar 2023 19:14:50 +0000
>>> Joao Martins <joao.m.martins@oracle.com> wrote:
>>>   
>>>> On 03/03/2023 17:05, Alex Williamson wrote:  
>>>>> On Fri, 3 Mar 2023 16:58:55 +0000
>>>>> Joao Martins <joao.m.martins@oracle.com> wrote:
>>>>>     
>>>>>> On 03/03/2023 00:19, Joao Martins wrote:    
>>>>>>> On 02/03/2023 18:42, Alex Williamson wrote:      
>>>>>>>> On Thu, 2 Mar 2023 00:07:35 +0000
>>>>>>>> Joao Martins <joao.m.martins@oracle.com> wrote:      
>>>>>>>>> @@ -426,6 +427,11 @@ void vfio_unblock_multiple_devices_migration(void)
>>>>>>>>>      multiple_devices_migration_blocker = NULL;
>>>>>>>>>  }
>>>>>>>>>
>>>>>>>>> +static bool vfio_have_giommu(VFIOContainer *container)
>>>>>>>>> +{
>>>>>>>>> +    return !QLIST_EMPTY(&container->giommu_list);
>>>>>>>>> +}      
>>>>>>>>
>>>>>>>> I think it's the case, but can you confirm we build the giommu_list
>>>>>>>> regardless of whether the vIOMMU is actually enabled?
>>>>>>>>      
>>>>>>> I think that is only non-empty when we have the first IOVA mappings e.g. on
>>>>>>> IOMMU passthrough mode *I think* it's empty. Let me confirm.
>>>>>>>       
>>>>>> Yeap, it's empty.
>>>>>>    
>>>>>>> Otherwise I'll have to find a TYPE_IOMMU_MEMORY_REGION object to determine if
>>>>>>> the VM was configured with a vIOMMU or not. That is to create the LM blocker.
>>>>>>>       
>>>>>> I am trying this way, with something like this, but neither
>>>>>> x86_iommu_get_default() nor below is really working out yet. A little afraid of
>>>>>> having to add the live migration blocker on each machine_init_done hook, unless
>>>>>> t here's a more obvious way. vfio_realize should be at a much later stage, so I
>>>>>> am surprised how an IOMMU object doesn't exist at that time.    
>>>>>
>>>>> Can we just test whether the container address space is system_memory?    
>>>>
>>>> IIUC, it doesn't work (see below snippet).
>>>>
>>>> The problem is that you start as a regular VFIO guest, and when the guest boot
>>>> is when new mappings get established/invalidated and propagated into listeners
>>>> (vfio_listener_region_add) and they morph into having a giommu. And that's when
>>>> you can figure out in higher layers that 'you have a vIOMMU' as that's when the
>>>> address space gets changed? That is without being specific to a particular IOMMU
>>>> model. Maybe region_add is where to add, but then it then depends on the guest.  
>>>
>>> This doesn't seem right to me, look for instance at
>>> pci_device_iommu_address_space() which returns address_space_memory
>>> when there is no vIOMMU.  If devices share an address space, they can
>>> share a container.  When a vIOMMU is present (not even enabled), each
>>> device gets it's own container due to the fact that it's in its own
>>> address space (modulo devices within the same address space due to
>>> aliasing).  
>>
>> You're obviously right, I was reading this whole thing wrong. This works as far
>> as I tested with an iommu=pt guest (and without an vIOMMU).
>>
>> I am gonna shape this up, and hopefully submit v3 during over night.
>>
>> @@ -416,9 +416,26 @@ void vfio_unblock_multiple_devices_migration(void)
>>      multiple_devices_migration_blocker = NULL;
>>  }
>>
>> -static bool vfio_have_giommu(VFIOContainer *container)
>> +static VFIOAddressSpace *vfio_get_address_space(AddressSpace *as);
>> +
>> +int vfio_block_giommu_migration(VFIODevice *vbasedev, Error **errp)
>>  {
>> -    return !QLIST_EMPTY(&container->giommu_list);
>> +    int ret;
>> +
>> +    if (vbasedev->type == VFIO_DEVICE_TYPE_PCI &&
>> +       !vfio_has_iommu(vbasedev)) {
>> +       return 0;
>> +    }
>> +
>> +    error_setg(&giommu_migration_blocker,
>> +               "Migration is currently not supported with vIOMMU enabled");
>> +    ret = migrate_add_blocker(giommu_migration_blocker, errp);
>> +    if (ret < 0) {
>> +        error_free(giommu_migration_blocker);
>> +        giommu_migration_blocker = NULL;
>> +    }
>> +
>> +    return ret;
>>  }
>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>> index 939dcc3d4a9e..f4cf0b41a157 100644
>> --- a/hw/vfio/pci.c
>> +++ b/hw/vfio/pci.c
>> @@ -2843,6 +2843,15 @@ static void vfio_unregister_req_notifier(VFIOPCIDevice *vdev)
>>      vdev->req_enabled = false;
>>  }
>>
>> +bool vfio_has_iommu(VFIODevice *vbasedev)
>> +{
>> +    VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
>> +    PCIDevice *pdev = &vdev->pdev;
>> +    AddressSpace *as = &address_space_memory;
>> +
>> +    return !(pci_device_iommu_address_space(pdev) == as);
>> +}
> 
> 
> Shouldn't this be something non-PCI specific like:
> 
>     return vbasedev->group->container->space != &address_space_memory;
> 

Yes, much better, I've applied the following (partial diff below):

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 6cd0100bbe09..60af3c3018dc 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -421,8 +421,7 @@ int vfio_block_giommu_migration(VFIODevice *vbasedev, Error **errp)
 {
     int ret;

-    if (vbasedev->type == VFIO_DEVICE_TYPE_PCI &&
-       !vfio_has_iommu(vbasedev)) {
+    if (vbasedev->group->container->space->as == &address_space_memory) {
        return 0;
     }
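
For reference, putting the two hunks together, the resulting function
reads roughly as below (a sketch assembled from the diffs quoted above,
assuming the blocker handling stays as in the earlier snippet):

/*
 * Sketch only: combined view of vfio_block_giommu_migration() with the
 * container address space check applied.
 */
int vfio_block_giommu_migration(VFIODevice *vbasedev, Error **errp)
{
    int ret;

    /* No vIOMMU: the device's container is backed by the system address space. */
    if (vbasedev->group->container->space->as == &address_space_memory) {
        return 0;
    }

    error_setg(&giommu_migration_blocker,
               "Migration is currently not supported with vIOMMU enabled");
    ret = migrate_add_blocker(giommu_migration_blocker, errp);
    if (ret < 0) {
        error_free(giommu_migration_blocker);
        giommu_migration_blocker = NULL;
    }

    return ret;
}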



^ permalink raw reply related	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 10/20] vfio/common: Record DMA mapped IOVA ranges
  2023-03-03 23:57                                   ` Joao Martins
@ 2023-03-04  0:21                                     ` Joao Martins
  0 siblings, 0 replies; 93+ messages in thread
From: Joao Martins @ 2023-03-04  0:21 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Avihai Horon, qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta

On 03/03/2023 23:57, Joao Martins wrote:
> On 03/03/2023 23:47, Alex Williamson wrote:
>> On Fri, 3 Mar 2023 20:16:19 +0000
>> Joao Martins <joao.m.martins@oracle.com> wrote:
>>
>>> On 03/03/2023 19:40, Alex Williamson wrote:
>>>> On Fri, 3 Mar 2023 19:14:50 +0000
>>>> Joao Martins <joao.m.martins@oracle.com> wrote:
>>>>   
>>>>> On 03/03/2023 17:05, Alex Williamson wrote:  
>>>>>> On Fri, 3 Mar 2023 16:58:55 +0000
>>>>>> Joao Martins <joao.m.martins@oracle.com> wrote:
>>>>>>     
>>>>>>> On 03/03/2023 00:19, Joao Martins wrote:    
>>>>>>>> On 02/03/2023 18:42, Alex Williamson wrote:      
>>>>>>>>> On Thu, 2 Mar 2023 00:07:35 +0000
>>>>>>>>> Joao Martins <joao.m.martins@oracle.com> wrote:      
>>>>>>>>>> @@ -426,6 +427,11 @@ void vfio_unblock_multiple_devices_migration(void)
>>>>>>>>>>      multiple_devices_migration_blocker = NULL;
>>>>>>>>>>  }
>>>>>>>>>>
>>>>>>>>>> +static bool vfio_have_giommu(VFIOContainer *container)
>>>>>>>>>> +{
>>>>>>>>>> +    return !QLIST_EMPTY(&container->giommu_list);
>>>>>>>>>> +}      
>>>>>>>>>
>>>>>>>>> I think it's the case, but can you confirm we build the giommu_list
>>>>>>>>> regardless of whether the vIOMMU is actually enabled?
>>>>>>>>>      
>>>>>>>> I think that is only non-empty when we have the first IOVA mappings e.g. on
>>>>>>>> IOMMU passthrough mode *I think* it's empty. Let me confirm.
>>>>>>>>       
>>>>>>> Yeap, it's empty.
>>>>>>>    
>>>>>>>> Otherwise I'll have to find a TYPE_IOMMU_MEMORY_REGION object to determine if
>>>>>>>> the VM was configured with a vIOMMU or not. That is to create the LM blocker.
>>>>>>>>       
>>>>>>> I am trying this way, with something like this, but neither
>>>>>>> x86_iommu_get_default() nor below is really working out yet. A little afraid of
>>>>>>> having to add the live migration blocker on each machine_init_done hook, unless
>>>>>>> t here's a more obvious way. vfio_realize should be at a much later stage, so I
>>>>>>> am surprised how an IOMMU object doesn't exist at that time.    
>>>>>>
>>>>>> Can we just test whether the container address space is system_memory?    
>>>>>
>>>>> IIUC, it doesn't work (see below snippet).
>>>>>
>>>>> The problem is that you start as a regular VFIO guest, and when the guest boot
>>>>> is when new mappings get established/invalidated and propagated into listeners
>>>>> (vfio_listener_region_add) and they morph into having a giommu. And that's when
>>>>> you can figure out in higher layers that 'you have a vIOMMU' as that's when the
>>>>> address space gets changed? That is without being specific to a particular IOMMU
>>>>> model. Maybe region_add is where to add, but then it then depends on the guest.  
>>>>
>>>> This doesn't seem right to me, look for instance at
>>>> pci_device_iommu_address_space() which returns address_space_memory
>>>> when there is no vIOMMU.  If devices share an address space, they can
>>>> share a container.  When a vIOMMU is present (not even enabled), each
>>>> device gets it's own container due to the fact that it's in its own
>>>> address space (modulo devices within the same address space due to
>>>> aliasing).  
>>>
>>> You're obviously right, I was reading this whole thing wrong. This works as far
>>> as I tested with an iommu=pt guest (and without an vIOMMU).
>>>
>>> I am gonna shape this up, and hopefully submit v3 during over night.
>>>
>>> @@ -416,9 +416,26 @@ void vfio_unblock_multiple_devices_migration(void)
>>>      multiple_devices_migration_blocker = NULL;
>>>  }
>>>
>>> -static bool vfio_have_giommu(VFIOContainer *container)
>>> +static VFIOAddressSpace *vfio_get_address_space(AddressSpace *as);
>>> +
>>> +int vfio_block_giommu_migration(VFIODevice *vbasedev, Error **errp)
>>>  {
>>> -    return !QLIST_EMPTY(&container->giommu_list);
>>> +    int ret;
>>> +
>>> +    if (vbasedev->type == VFIO_DEVICE_TYPE_PCI &&
>>> +       !vfio_has_iommu(vbasedev)) {
>>> +       return 0;
>>> +    }
>>> +
>>> +    error_setg(&giommu_migration_blocker,
>>> +               "Migration is currently not supported with vIOMMU enabled");
>>> +    ret = migrate_add_blocker(giommu_migration_blocker, errp);
>>> +    if (ret < 0) {
>>> +        error_free(giommu_migration_blocker);
>>> +        giommu_migration_blocker = NULL;
>>> +    }
>>> +
>>> +    return ret;
>>>  }
>>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>>> index 939dcc3d4a9e..f4cf0b41a157 100644
>>> --- a/hw/vfio/pci.c
>>> +++ b/hw/vfio/pci.c
>>> @@ -2843,6 +2843,15 @@ static void vfio_unregister_req_notifier(VFIOPCIDevice *vdev)
>>>      vdev->req_enabled = false;
>>>  }
>>>
>>> +bool vfio_has_iommu(VFIODevice *vbasedev)
>>> +{
>>> +    VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
>>> +    PCIDevice *pdev = &vdev->pdev;
>>> +    AddressSpace *as = &address_space_memory;
>>> +
>>> +    return !(pci_device_iommu_address_space(pdev) == as);
>>> +}
>>
>>
>> Shouldn't this be something non-PCI specific like:
>>
>>     return vbasedev->group->container->space != &address_space_memory;
>>
> 
> Yes, much better, I've applied the following (partial diff below):
> 
I've also structured this similarly to the other blocker wrt multiple vfio devices.

	Joao


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 07/20] vfio/common: Add VFIOBitmap and (de)alloc functions
  2023-03-02 14:52       ` Cédric Le Goater
  2023-03-02 16:30         ` Joao Martins
@ 2023-03-04  0:23         ` Joao Martins
  1 sibling, 0 replies; 93+ messages in thread
From: Joao Martins @ 2023-03-04  0:23 UTC (permalink / raw)
  To: Cédric Le Goater, Avihai Horon
  Cc: Alex Williamson, Juan Quintela, qemu-devel,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Jason Gunthorpe, Maor Gottlieb, Kirti Wankhede,
	Tarun Gupta

On 02/03/2023 14:52, Cédric Le Goater wrote:
> Hello Joao,
> 
> On 3/2/23 14:24, Joao Martins wrote:
>> On 27/02/2023 14:09, Cédric Le Goater wrote:
>>> On 2/22/23 18:49, Avihai Horon wrote:
>>>> --- a/hw/vfio/common.c
>>>> +++ b/hw/vfio/common.c
>>>> @@ -320,6 +320,41 @@ const MemoryRegionOps vfio_region_ops = {
>>>>     * Device state interfaces
>>>>     */
>>>>    +typedef struct {
>>>> +    unsigned long *bitmap;
>>>> +    hwaddr size;
>>>> +    hwaddr pages;
>>>> +} VFIOBitmap;
>>>> +
>>>> +static VFIOBitmap *vfio_bitmap_alloc(hwaddr size)
>>>> +{
>>>> +    VFIOBitmap *vbmap = g_try_new0(VFIOBitmap, 1);
>>>
>>> I think using g_malloc0() for the VFIOBitmap should be fine. If QEMU can
>>> not allocate a couple of bytes, we are in trouble anyway.
>>>
>>
>> OOM situations are rather unpredictable, and switching to g_malloc0 means we
>> will exit ungracefully in the middle of fetching dirty bitmaps. And this
>> function (vfio_bitmap_alloc) overall will be allocating megabytes for terabyte
>> guests.
>>
>> It would be ok if we are initializing, but this is at runtime when we do
>> migration. I think we should stick with g_try_new0. exit on failure should be
>> reserved to failure to switch the kernel migration state whereby we are likely
>> to be dealing with a hardware failure and thus requires something more drastic.
> 
> I agree for large allocation :
> 
>     vbmap->bitmap = g_try_malloc0(vbmap->size);
> 
> but not for the smaller ones, like VFIOBitmap. You would have to
> convert some other g_malloc0() calls, like the one allocating 'unmap'
> in vfio_dma_unmap_bitmap(), to be consistent.
> 
> Given the size of VFIOBitmap, I think it could live on the stack in
> routine vfio_dma_unmap_bitmap() and routine vfio_get_dirty_bitmap()
> since the reference is not kept.
> 
> The 'vbmap' attribute of vfio_giommu_dirty_notifier does not need
> to be a pointer either.
> 
> vfio_bitmap_alloc(hwaddr size) could then become
> vfio_bitmap_init(VFIOBitmap *vbmap, hwaddr size).
> 
> Anyhow, this is minor. It would simplify a bit the exit path
> and error handling.
>

FWIW, I've addressed this in v3, following your suggestion.
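
For reference, a minimal sketch of what such an init helper could look
like (the pages/size arithmetic below is only illustrative, borrowed
from the pre-existing dirty bitmap handling, and is not necessarily
what v3 ends up with):

/* Sketch only: the VFIOBitmap lives on the caller's stack and only the
 * (potentially large) bitmap itself uses a fallible allocation. */
static int vfio_bitmap_init(VFIOBitmap *vbmap, hwaddr size)
{
    vbmap->pages = REAL_HOST_PAGE_ALIGN(size) / qemu_real_host_page_size();
    vbmap->size = ROUND_UP(vbmap->pages, sizeof(__u64) * BITS_PER_BYTE) /
                  BITS_PER_BYTE;
    vbmap->bitmap = g_try_malloc0(vbmap->size);
    if (!vbmap->bitmap) {
        return -ENOMEM;
    }

    return 0;
}

Callers such as vfio_dma_unmap_bitmap() and vfio_get_dirty_bitmap() can
then declare a VFIOBitmap on the stack, check the return value, and free
only vbmap.bitmap on the exit path, which is what simplifies the error
handling.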

	Joao


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 03/20] vfio/migration: Add VFIO migration pre-copy support
  2023-03-01 22:39                       ` Alex Williamson
@ 2023-03-06 19:01                         ` Jason Gunthorpe
  0 siblings, 0 replies; 93+ messages in thread
From: Jason Gunthorpe @ 2023-03-06 19:01 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Avihai Horon, qemu-devel, Cédric Le Goater, Juan Quintela,
	Dr. David Alan Gilbert, Michael S. Tsirkin, Peter Xu, Jason Wang,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, David Hildenbrand, Philippe Mathieu-Daudé,
	Yishai Hadas, Maor Gottlieb, Kirti Wankhede, Tarun Gupta,
	Joao Martins

On Wed, Mar 01, 2023 at 03:39:17PM -0700, Alex Williamson wrote:
> On Wed, 1 Mar 2023 17:12:51 -0400
> Jason Gunthorpe <jgg@nvidia.com> wrote:
> 
> > On Wed, Mar 01, 2023 at 12:55:59PM -0700, Alex Williamson wrote:
> > 
> > > So it seems like what we need here is both a preface buffer size and a
> > > target device latency.  The QEMU pre-copy algorithm should factor both
> > > the remaining data size and the device latency into deciding when to
> > > transition to stop-copy, thereby allowing the device to feed actually
> > > relevant data into the algorithm rather than dictate its behavior.  
> > 
> > I don't know that we can realistically estimate startup latency,
> > especially have the sender estimate latency on the receiver..
> 
> Knowing that the target device is compatible with the source is a point
> towards making an educated guess.
> 
> > I feel like trying to overlap the device start up with the STOP phase
> > is an unnecessary optimization? How do you see it benifits?
> 
> If we can't guarantee that there's some time difference between sending
> initial bytes immediately at the end of pre-copy vs immediately at the
> beginning of stop-copy, does that mean any handling of initial bytes is
> an unnecessary optimization?

Sure, if the device doesn't implement an initial_bytes startup phase
then it is all pointless, but those devices should probably return 0
for initial_bytes. If we see initial_bytes and assume it indicates a
startup phase, why not do it?

> I'm imagining that completing initial bytes triggers some
> initialization sequence in the target host driver which runs in
> parallel to the remaining data stream, so in practice, even if sent at
> the beginning of stop-copy, the target device gets a head start.

It isn't parallel in mlx5. The load operation of the initial bytes on
the receiver will execute the load command, and that command takes an
amount of time roughly proportional to how much data is in the device.
IIRC the mlx5 VFIO driver will block the read until this finishes.

It is convoluted, but ultimately it is allocating (potentially a lot
of) pages in the hypervisor kernel, so the time predictability is not
very good.

Other device types we are looking at might make network connections at
this step, e.g. a storage device might open a network connection to its
back end. This could be unpredictably long in degenerate cases.

> > I've been thinking of this from the perspective that we should always
> > ensure device startup is completed, it is time that has to be paid,
> > why pay it during STOP?
> 
> Creating a policy for QEMU to send initial bytes in a given phase
> doesn't ensure startup is complete.  There's no guaranteed time
> difference between sending that data and the beginning of stop-copy.

As I've said, to really do a good job here we want the sender to wait
until the receiver completes startup, and not just treat it as a
unidirectional byte stream. That isn't this patch.

> QEMU is trying to achieve a downtime goal, where it estimates network
> bandwidth to get a data size threshold, and then polls devices for
> remaining data.  That downtime goal might exceed the startup latency of
> the target device anyway, where it's then the operators choice to pay
> that time in stop-copy, or stalled on the target.

If you are saying there should be a policy flag ('optimize for total
migration time' vs 'optimize for minimum downtime'), that seems
reasonable, though I wonder who would pick the first option.
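
For context, the downtime-goal check being discussed boils down to
something like the following (an illustrative sketch with made-up
names, not the actual QEMU migration code):

#include <stdbool.h>
#include <stdint.h>

/*
 * Keep pre-copying until the estimated remaining data (including the
 * VFIO device state) fits into the downtime budget at the estimated
 * bandwidth; only then transition to stop-copy.
 */
static bool should_enter_stop_copy(uint64_t remaining_bytes,
                                   uint64_t bandwidth_bytes_per_ms,
                                   uint64_t downtime_limit_ms)
{
    uint64_t threshold_bytes = bandwidth_bytes_per_ms * downtime_limit_ms;

    return remaining_bytes <= threshold_bytes;
}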
 
> But if we actually want to ensure startup of the target is complete,
> then drivers should be able to return both data size and estimated time
> for the target device to initialize.  That time estimate should be
> updated by the driver based on if/when initial_bytes is drained.  The
> decision whether to continue iterating pre-copy would then be based on
> both the maximum remaining device startup time and the calculated time
> based on remaining data size.

That seems complicated. Why not just wait for the other side to
acknowledge it has started the device? Then we aren't trying to guess.

AFAIK this sort of happens implicitly in this patch: once initial_bytes
is pushed, the next data that follows it will block on the pending load,
and the single socket will backpressure until the load is done.
Horrible, yes, but it is where QEMU is at. multi-fd is really
important :)

Jason


^ permalink raw reply	[flat|nested] 93+ messages in thread

end of thread, other threads:[~2023-03-06 19:07 UTC | newest]

Thread overview: 93+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-22 17:48 [PATCH v2 00/20] vfio: Add migration pre-copy support and device dirty tracking Avihai Horon
2023-02-22 17:48 ` [PATCH v2 01/20] migration: Pass threshold_size to .state_pending_{estimate, exact}() Avihai Horon via
2023-02-22 17:48 ` [PATCH v2 02/20] vfio/migration: Refactor vfio_save_block() to return saved data size Avihai Horon
2023-02-27 14:10   ` Cédric Le Goater
2023-02-22 17:48 ` [PATCH v2 03/20] vfio/migration: Add VFIO migration pre-copy support Avihai Horon
2023-02-22 20:58   ` Alex Williamson
2023-02-23 15:25     ` Avihai Horon
2023-02-23 21:16       ` Alex Williamson
2023-02-26 16:43         ` Avihai Horon
2023-02-27 16:14           ` Alex Williamson
2023-02-27 17:26             ` Jason Gunthorpe
2023-02-27 17:43               ` Alex Williamson
2023-03-01 18:49                 ` Avihai Horon
2023-03-01 19:55                   ` Alex Williamson
2023-03-01 21:12                     ` Jason Gunthorpe
2023-03-01 22:39                       ` Alex Williamson
2023-03-06 19:01                         ` Jason Gunthorpe
2023-02-22 17:48 ` [PATCH v2 04/20] vfio/common: Fix error reporting in vfio_get_dirty_bitmap() Avihai Horon
2023-02-22 17:49 ` [PATCH v2 05/20] vfio/common: Fix wrong %m usages Avihai Horon
2023-02-22 17:49 ` [PATCH v2 06/20] vfio/common: Abort migration if dirty log start/stop/sync fails Avihai Horon
2023-02-22 17:49 ` [PATCH v2 07/20] vfio/common: Add VFIOBitmap and (de)alloc functions Avihai Horon
2023-02-22 21:40   ` Alex Williamson
2023-02-23 15:27     ` Avihai Horon
2023-02-27 14:09   ` Cédric Le Goater
2023-03-01 18:56     ` Avihai Horon
2023-03-02 13:24     ` Joao Martins
2023-03-02 14:52       ` Cédric Le Goater
2023-03-02 16:30         ` Joao Martins
2023-03-04  0:23         ` Joao Martins
2023-02-22 17:49 ` [PATCH v2 08/20] util: Add iova_tree_nnodes() Avihai Horon
2023-02-22 17:49 ` [PATCH v2 09/20] util: Extend iova_tree_foreach() to take data argument Avihai Horon
2023-02-22 17:49 ` [PATCH v2 10/20] vfio/common: Record DMA mapped IOVA ranges Avihai Horon
2023-02-22 22:10   ` Alex Williamson
2023-02-23 10:37     ` Joao Martins
2023-02-23 21:05       ` Alex Williamson
2023-02-23 21:19         ` Joao Martins
2023-02-23 21:50           ` Alex Williamson
2023-02-23 21:54             ` Joao Martins
2023-02-28 12:11             ` Joao Martins
2023-02-28 20:36               ` Alex Williamson
2023-03-02  0:07                 ` Joao Martins
2023-03-02  0:13                   ` Joao Martins
2023-03-02 18:42                   ` Alex Williamson
2023-03-03  0:19                     ` Joao Martins
2023-03-03 16:58                       ` Joao Martins
2023-03-03 17:05                         ` Alex Williamson
2023-03-03 19:14                           ` Joao Martins
2023-03-03 19:40                             ` Alex Williamson
2023-03-03 20:16                               ` Joao Martins
2023-03-03 23:47                                 ` Alex Williamson
2023-03-03 23:57                                   ` Joao Martins
2023-03-04  0:21                                     ` Joao Martins
2023-02-22 17:49 ` [PATCH v2 11/20] vfio/common: Add device dirty page tracking start/stop Avihai Horon
2023-02-22 22:40   ` Alex Williamson
2023-02-23  2:02     ` Jason Gunthorpe
2023-02-23 19:27       ` Alex Williamson
2023-02-23 19:30         ` Jason Gunthorpe
2023-02-23 20:16           ` Alex Williamson
2023-02-23 20:54             ` Jason Gunthorpe
2023-02-26 16:54               ` Avihai Horon
2023-02-23 15:36     ` Avihai Horon
2023-02-22 17:49 ` [PATCH v2 12/20] vfio/common: Extract code from vfio_get_dirty_bitmap() to new function Avihai Horon
2023-02-22 17:49 ` [PATCH v2 13/20] vfio/common: Add device dirty page bitmap sync Avihai Horon
2023-02-22 17:49 ` [PATCH v2 14/20] vfio/common: Extract vIOMMU code from vfio_sync_dirty_bitmap() Avihai Horon
2023-02-22 17:49 ` [PATCH v2 15/20] memory/iommu: Add IOMMU_ATTR_MAX_IOVA attribute Avihai Horon
2023-02-22 17:49 ` [PATCH v2 16/20] intel-iommu: Implement get_attr() method Avihai Horon
2023-02-22 17:49 ` [PATCH v2 17/20] vfio/common: Support device dirty page tracking with vIOMMU Avihai Horon
2023-02-22 23:34   ` Alex Williamson
2023-02-23  2:08     ` Jason Gunthorpe
2023-02-23 20:06       ` Alex Williamson
2023-02-23 20:55         ` Jason Gunthorpe
2023-02-23 21:30           ` Joao Martins
2023-02-23 22:33           ` Alex Williamson
2023-02-23 23:26             ` Jason Gunthorpe
2023-02-24 11:25               ` Joao Martins
2023-02-24 12:53                 ` Joao Martins
2023-02-24 15:47                   ` Jason Gunthorpe
2023-02-24 15:56                   ` Alex Williamson
2023-02-24 19:16                     ` Joao Martins
2023-02-22 17:49 ` [PATCH v2 18/20] vfio/common: Optimize " Avihai Horon
2023-02-22 17:49 ` [PATCH v2 19/20] vfio/migration: Query device dirty page tracking support Avihai Horon
2023-02-22 17:49 ` [PATCH v2 20/20] docs/devel: Document VFIO device dirty page tracking Avihai Horon
2023-02-27 14:29   ` Cédric Le Goater
2023-02-22 18:00 ` [PATCH v2 00/20] vfio: Add migration pre-copy support and device dirty tracking Avihai Horon
2023-02-22 20:55 ` Alex Williamson
2023-02-23 10:05   ` Cédric Le Goater
2023-02-23 15:07     ` Avihai Horon
2023-02-27 10:24       ` Cédric Le Goater
2023-02-23 14:56   ` Avihai Horon
2023-02-24 19:26     ` Joao Martins
2023-02-26 17:00       ` Avihai Horon
2023-02-27 13:50         ` Cédric Le Goater
2023-03-01 19:04           ` Avihai Horon
