qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Yan Zhao <yan.y.zhao@intel.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: Zhengxiao.zx@Alibaba-inc.com, kevin.tian@intel.com,
	yi.l.liu@intel.com, cjia@nvidia.com, kvm@vger.kernel.org,
	eskultet@redhat.com, ziye.yang@intel.com, qemu-devel@nongnu.org,
	cohuck@redhat.com, shuangtai.tst@alibaba-inc.com,
	dgilbert@redhat.com, zhi.a.wang@intel.com, mlevitsk@redhat.com,
	pasic@linux.ibm.com, aik@ozlabs.ru,
	Kirti Wankhede <kwankhede@nvidia.com>,
	eauger@redhat.com, felipe@nutanix.com,
	jonathan.davies@nutanix.com, changpeng.liu@intel.com,
	Ken.Xue@amd.com
Subject: Re: [PATCH Kernel v22 0/8] Add UAPIs to support migration for VFIO devices
Date: Mon, 25 May 2020 02:59:26 -0400	[thread overview]
Message-ID: <20200525065925.GA698@joy-OptiPlex-7040> (raw)
In-Reply-To: <20200519105804.02f3cae8@x1.home>

On Tue, May 19, 2020 at 10:58:04AM -0600, Alex Williamson wrote:
> Hi folks,
> 
> My impression is that we're getting pretty close to a workable
> implementation here with v22 plus respins of patches 5, 6, and 8.  We
> also have a matching QEMU series and a proposal for a new i40e
> consumer, as well as I assume GVT-g updates happening internally at
> Intel.  I expect all of the latter needs further review and discussion,
> but we should be at the point where we can validate these proposed
> kernel interfaces.  Therefore I'd like to make a call for reviews so
> that we can get this wrapped up for the v5.8 merge window.  I know
> Connie has some outstanding documentation comments and I'd like to make
> sure everyone has an opportunity to check that their comments have been
> addressed and we don't discover any new blocking issues.  Please send
> your Acked-by/Reviewed-by/Tested-by tags if you're satisfied with this
> interface and implementation.  Thanks!
>
hi Alex
after porting gvt/i40e vf migration code to kernel/qemu v23, we spoted
two bugs.
1. "Failed to get dirty bitmap for iova: 0xfe011000 size: 0x3fb0 err: 22"
   This is a qemu bug that the dirty bitmap query range is not the same
   as the dma map range. It can be fixed in qemu. and I just have a little
   concern for kernel to have this restriction.

2. migration abortion, reporting
"qemu-system-x86_64-lm: vfio_load_state: Error allocating buffer
qemu-system-x86_64-lm: error while loading state section id 49(vfio)
qemu-system-x86_64-lm: load of migration failed: Cannot allocate memory"

It's still a qemu bug and we can fixed it by
"
if (migration->pending_bytes == 0) {
+            qemu_put_be64(f, 0);
+            qemu_put_be64(f, VFIO_MIG_FLAG_END_OF_STATE);
"
and actually there are some extra concerns about this part, as reported in
[1][2].

[1] data_size should be read ahead of data_offset
https://lists.gnu.org/archive/html/qemu-devel/2020-05/msg02795.html.
[2] should not repeatedly update pending_bytes in vfio_save_iterate()
https://lists.gnu.org/archive/html/qemu-devel/2020-05/msg02796.html.

but as those errors are all in qemu, and we have finished basic tests in
both gvt & i40e, we're fine with the kernel part interface in general now.
(except for my concern [1], which needs to update kernel patch 1)

so I wonder which way in your mind is better, to give our reviewed-by to
the kernel part now, or hold until next qemu fixes?
and as performance data from gvt is requested from your previous mail, is
that still required before the code is accepted?

BTW, we have also conducted some basic tests when viommu is on, and found out
errors like 
"qemu-system-x86_64-dt: vtd_iova_to_slpte: detected slpte permission error (iova=0x0, level=0x3, slpte=0x0, write=1)
qemu-system-x86_64-dt: vtd_iommu_translate: detected translation failure (dev=00:03:00, iova=0x0)
qemu-system-x86_64-dt: New fault is not recorded due to compression of faults".

Thanks
Yan






  parent reply	other threads:[~2020-05-25  7:10 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-18  5:56 [PATCH Kernel v22 0/8] Add UAPIs to support migration for VFIO devices Kirti Wankhede
2020-05-18  5:56 ` [PATCH Kernel v22 1/8] vfio: UAPI for migration interface for device state Kirti Wankhede
2020-05-18  5:56 ` [PATCH Kernel v22 2/8] vfio iommu: Remove atomicity of ref_count of pinned pages Kirti Wankhede
2020-05-18  5:56 ` [PATCH Kernel v22 3/8] vfio iommu: Cache pgsize_bitmap in struct vfio_iommu Kirti Wankhede
2020-05-20 10:08   ` Cornelia Huck
2020-05-20 14:46     ` Kirti Wankhede
2020-05-18  5:56 ` [PATCH Kernel v22 4/8] vfio iommu: Add ioctl definition for dirty pages tracking Kirti Wankhede
2020-05-18  5:56 ` [PATCH Kernel v22 5/8] vfio iommu: Implementation of ioctl " Kirti Wankhede
2020-05-18 21:53   ` Alex Williamson
2020-05-19  7:11     ` Kirti Wankhede
2020-05-19  6:52       ` Kirti Wankhede
2020-05-18  5:56 ` [PATCH Kernel v22 6/8] vfio iommu: Update UNMAP_DMA ioctl to get dirty bitmap before unmap Kirti Wankhede
2020-05-19  6:54   ` Kirti Wankhede
2020-05-20 10:27     ` Cornelia Huck
2020-05-20 15:16       ` Kirti Wankhede
2020-05-18  5:56 ` [PATCH Kernel v22 7/8] vfio iommu: Add migration capability to report supported features Kirti Wankhede
2020-05-20 10:42   ` Cornelia Huck
2020-05-20 15:23     ` Kirti Wankhede
2020-05-18  5:56 ` [PATCH Kernel v22 8/8] vfio: Selective dirty page tracking if IOMMU backed device pins pages Kirti Wankhede
2020-05-19  6:54   ` Kirti Wankhede
2020-05-19 16:58 ` [PATCH Kernel v22 0/8] Add UAPIs to support migration for VFIO devices Alex Williamson
2020-05-20  2:55   ` Yan Zhao
2020-05-20 13:40     ` Kirti Wankhede
2020-05-20 16:46       ` Alex Williamson
2020-05-21  5:08         ` Yan Zhao
2020-05-21  7:09           ` Kirti Wankhede
2020-05-21  7:04             ` Yan Zhao
2020-05-21  7:28               ` Kirti Wankhede
2020-05-21  7:32               ` Kirti Wankhede
2020-05-25  6:59   ` Yan Zhao [this message]
2020-05-25 13:20     ` Kirti Wankhede
2020-05-26 20:19       ` Alex Williamson
2020-05-27  6:23         ` Yan Zhao
2020-05-27  8:48           ` Dr. David Alan Gilbert
2020-05-28  8:01             ` Yan Zhao
2020-05-28 22:53               ` Alex Williamson
2020-05-29 11:12                 ` Dr. David Alan Gilbert
2020-05-28 22:59             ` Alex Williamson
2020-05-29  4:15               ` Yan Zhao
2020-05-29 17:57               ` Kirti Wankhede

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200525065925.GA698@joy-OptiPlex-7040 \
    --to=yan.y.zhao@intel.com \
    --cc=Ken.Xue@amd.com \
    --cc=Zhengxiao.zx@Alibaba-inc.com \
    --cc=aik@ozlabs.ru \
    --cc=alex.williamson@redhat.com \
    --cc=changpeng.liu@intel.com \
    --cc=cjia@nvidia.com \
    --cc=cohuck@redhat.com \
    --cc=dgilbert@redhat.com \
    --cc=eauger@redhat.com \
    --cc=eskultet@redhat.com \
    --cc=felipe@nutanix.com \
    --cc=jonathan.davies@nutanix.com \
    --cc=kevin.tian@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=kwankhede@nvidia.com \
    --cc=mlevitsk@redhat.com \
    --cc=pasic@linux.ibm.com \
    --cc=qemu-devel@nongnu.org \
    --cc=shuangtai.tst@alibaba-inc.com \
    --cc=yi.l.liu@intel.com \
    --cc=zhi.a.wang@intel.com \
    --cc=ziye.yang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).