* [question] VFIO Device Migration: The vCPU may be paused during vfio device DMA in iommu nested stage mode && vSVA
@ 2021-09-24  6:18 Kunkun Jiang
From: Kunkun Jiang @ 2021-09-24  6:18 UTC
  To: Tarun Gupta, Alex Williamson, Kirti Wankhede, Eric Auger,
	Shameer Kolothum, open list:All patches CC here, kevin.tian
  Cc: Zenghui Yu, wanghaibin.wang, liulongfang, Keqian Zhu, tangnianyao

Hi all,

I encountered a problem in a vfio device migration test. The
vCPU may be paused while a vfio-pci DMA is in flight when iommu
nested stage mode and vSVA are in use. This may lead to migration
failure and other problems that depend on the device hardware and
driver implementation.

It may be a bit early to discuss this issue. After all, the iommu
nested stage mode and vSVA are not yet mature. But judging
from the current implementation, we will definitely encounter
this problem in the future.

This is how vSVA currently handles a translation fault in iommu
nested stage mode (taking SMMU as an example):

guest os        4. handle translation fault      5. send CMD_RESUME to vSMMU


qemu            3. inject fault into guest os    6. deliver response to host os
(vfio/vsmmu)


host os         2. notify the qemu               7. send CMD_RESUME to SMMU
(vfio/smmu)


SMMU            1. address translation fault     8. retry or terminate

The order is 1--->8.
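
To make the flow above concrete, here is a rough Python sketch of it.
It is only a toy model, not QEMU or kernel code. The names FAULT_FLOW
and last_completed_step are made up for illustration, and it assumes
that only the guest os steps (4 and 5) need a running vCPU while the
SMMU, host os and qemu steps can still proceed after the vCPU is
paused.

```python
from typing import Optional

# (step number, layer that performs it), in the order 1 ---> 8.
FAULT_FLOW = [
    (1, "SMMU"),      # address translation fault
    (2, "host os"),   # notify the qemu
    (3, "qemu"),      # inject fault into guest os
    (4, "guest os"),  # handle translation fault
    (5, "guest os"),  # send CMD_RESUME to vSMMU
    (6, "qemu"),      # deliver response to host os
    (7, "host os"),   # send CMD_RESUME to SMMU
    (8, "SMMU"),      # retry or terminate
]


def last_completed_step(vcpu_paused_after: Optional[int]) -> int:
    """Return the last step that completes when the vCPU is paused
    right after step `vcpu_paused_after` (None: never paused).

    Assumption of this sketch: a guest os step cannot run once the
    vCPU is paused; every other layer keeps running.
    """
    done = 0
    for step, layer in FAULT_FLOW:
        paused = vcpu_paused_after is not None and step > vcpu_paused_after
        if layer == "guest os" and paused:
            break  # the guest never gets to run this step
        done = step
    return done
```

In this model, last_completed_step(2) stops at step 3: the fault is
injected but the guest never handles it, so step 8 is never reached
and the DMA stays outstanding. Pausing after step 5 lets the flow run
to step 8.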

Currently, qemu may pause the vCPU at any step. It is possible to
pause the vCPU during steps 1-5, that is, in the middle of a DMA.
This may lead to migration failure and other problems that depend
on the device hardware and driver implementation. For example, the
device state cannot be changed from RUNNING && SAVING to SAVING,
because the device DMA has not completed.

As far as I can see, the vCPU should not be paused during a device
IO process, such as DMA. However, live migration currently does
not look at the state of the vfio device when pausing the vCPU.
And as long as the vCPU is not paused, the vfio device keeps
running, so there is never a point where it is known to be idle.
This looks like a *deadlock*.
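
The circular dependency can be sketched as a small Python model. Again
this is only an illustration under stated assumptions, not real vfio
code: the two bit values are meant to mirror the RUNNING and SAVING
bits of the vfio migration device_state, the helper names are
hypothetical, and the device is assumed to refuse to leave RUNNING
while a DMA is outstanding, as described above.

```python
# Meant to mirror the device_state bits of the vfio migration
# interface; treat the exact values as an assumption of this sketch.
VFIO_DEVICE_STATE_RUNNING = 1 << 0
VFIO_DEVICE_STATE_SAVING = 1 << 1


def try_stop_device(state: int, dma_pending: bool) -> int:
    """Attempt RUNNING|SAVING -> SAVING.

    Assumed device behaviour: the state is left unchanged while a
    DMA is still outstanding.
    """
    if dma_pending:
        return state
    return state & ~VFIO_DEVICE_STATE_RUNNING


def stop_and_copy(fault_pending: bool) -> int:
    """Model the switch to stop-and-copy: pause the vCPU first,
    then try to stop the device. Returns the final device state."""
    state = VFIO_DEVICE_STATE_RUNNING | VFIO_DEVICE_STATE_SAVING
    vcpu_running = False  # migration pauses the vCPU unconditionally
    # A stalled DMA needs the guest to handle the fault, which needs
    # a running vCPU; otherwise it stays pending.
    dma_pending = fault_pending and not vcpu_running
    return try_stop_device(state, dma_pending)
```

With a fault pending, stop_and_copy(True) leaves the device in
RUNNING | SAVING, and nothing in the model can move it forward, since
the vCPU is already paused. Without a pending fault the device reaches
SAVING as expected.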

Do you have any ideas to solve this problem?
Looking forward to your reply.

Thanks,
Kunkun Jiang




