kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Yi Liu <yi.l.liu@intel.com>
To: Jason Gunthorpe <jgg@nvidia.com>, "Tian, Kevin" <kevin.tian@intel.com>
Cc: "Peng, Chao P" <chao.p.peng@intel.com>,
	"Sun, Yi Y" <yi.y.sun@intel.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"david@gibson.dropbear.id.au" <david@gibson.dropbear.id.au>,
	"thuth@redhat.com" <thuth@redhat.com>,
	"farman@linux.ibm.com" <farman@linux.ibm.com>,
	"mjrosato@linux.ibm.com" <mjrosato@linux.ibm.com>,
	"akrowiak@linux.ibm.com" <akrowiak@linux.ibm.com>,
	"pasic@linux.ibm.com" <pasic@linux.ibm.com>,
	"jjherne@linux.ibm.com" <jjherne@linux.ibm.com>,
	"jasowang@redhat.com" <jasowang@redhat.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"nicolinc@nvidia.com" <nicolinc@nvidia.com>,
	"eric.auger@redhat.com" <eric.auger@redhat.com>,
	"eric.auger.pro@gmail.com" <eric.auger.pro@gmail.com>,
	"peterx@redhat.com" <peterx@redhat.com>,
	"Alex Williamson (alex.williamson@redhat.com)" 
	<alex.williamson@redhat.com>
Subject: Re: [RFC 15/18] vfio/iommufd: Implement iommufd backend
Date: Tue, 26 Apr 2022 22:08:30 +0800	[thread overview]
Message-ID: <79de081d-31dc-41a4-d38f-1e28327b1152@intel.com> (raw)
In-Reply-To: <20220426134114.GM2125828@nvidia.com>


On 2022/4/26 21:41, Jason Gunthorpe wrote:
> On Tue, Apr 26, 2022 at 10:41:01AM +0000, Tian, Kevin wrote:
> 
>> That's one case of incompatibility, but the IOMMU attach group callback
>> can fail in a variety of ways.  One that we've seen that is not
>> uncommon is that we might have an mdev container with various  mappings
>> to other devices.  None of those mappings are validated until the mdev
>> driver tries to pin something, where it's generally unlikely that
>> they'd pin those particular mappings.  Then QEMU hot-adds a regular
>> IOMMU backed device, we allocate a domain for the device and replay the
>> mappings from the container, but now they get validated and potentially
>> fail.  The kernel returns a failure for the SET_IOMMU ioctl, QEMU
>> creates a new container and fills it from the same AddressSpace, where
>> now QEMU can determine which mappings can be safely skipped.
> 
> I think it is strange that the allowed DMA a guest can do depends on
> the order how devices are plugged into the guest, and varys from
> device to device?
> 
> IMHO it would be nicer if qemu would be able to read the new reserved
> regions and unmap the conflicts before hot plugging the new device. We
> don't have a kernel API to do this, maybe we should have one?

For userspace drivers, it is fine to do it. For QEMU, it's not quite easy 
since the IOVA is GPA which is determined per the e820 table.

>> A:
>> QEMU sets up a MemoryListener for the device AddressSpace and attempts
>> to map anything that triggers that listener, which includes not only VM
>> RAM which is our primary mapping goal, but also miscellaneous devices,
>> unaligned regions, and other device regions, ex. BARs.  Some of these
>> we filter out in QEMU with broad generalizations that unaligned ranges
>> aren't anything we can deal with, but other device regions covers
>> anything that's mmap'd in QEMU, ie. it has an associated KVM memory
>> slot.  IIRC, in the case I'm thinking of, the mapping that triggered
>> the replay failure was the BAR for an mdev device.  No attempt was made
>> to use gup or PFNMAP to resolve the mapping when only the mdev device
>> was present and the mdev host driver didn't attempt to pin pages within
>> its own BAR, but neither of these methods worked for the replay (I
>> don't recall further specifics).
> 
> This feels sort of like a bug in iommufd, or perhaps qemu..
> 
> With iommufd only normal GUP'able memory should be passed to
> map. Special memory will have to go through some other API. This is
> different from vfio containers.
> 
> We could possibly check the VMAs in iommufd during map to enforce
> normal memory.. However I'm also a bit surprised that qemu can't ID
> the underlying memory source and avoid this?
> 
> eg currently I see the log messages that it is passing P2P BAR memory
> into iommufd map, this should be prevented inside qemu because it is
> not reliable right now if iommufd will correctly reject it.

yeah. qemu can filter the P2P BAR mapping and just stop it in qemu. We
haven't added it as it is something you will add in future. so didn't
add it in this RFC. :-) Please let me know if it feels better to filter
it from today.

> IMHO multi-container should be avoided because it does force creating
> multiple iommu_domains which does have a memory/performance cost.

yes. for multi-hw_pgtable, there is no choice as it is mostly due to
compatibility. But for multi-container, seems to be solvable if kernel
and qemu has some extra support like you mentioned. But I'd like to
echo below. It seems there may be other possible reasons to fail in
the attach.

 >> That's one case of incompatibility, but the IOMMU attach group callback
 >> can fail in a variety of ways."

> Though, it is not so important that it is urgent (and copy makes it
> work better anyhow), qemu can stay as it is.

yes. as a start, keep it would be simpler.

> Jason

-- 
Regards,
Yi Liu

  reply	other threads:[~2022-04-26 14:09 UTC|newest]

Thread overview: 75+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-14 10:46 [RFC 00/18] vfio: Adopt iommufd Yi Liu
2022-04-14 10:46 ` [RFC 01/18] scripts/update-linux-headers: Add iommufd.h Yi Liu
2022-04-14 10:46 ` [RFC 02/18] linux-headers: Import latest vfio.h and iommufd.h Yi Liu
2022-04-14 10:46 ` [RFC 03/18] hw/vfio/pci: fix vfio_pci_hot_reset_result trace point Yi Liu
2022-04-14 10:46 ` [RFC 04/18] vfio/pci: Use vbasedev local variable in vfio_realize() Yi Liu
2022-04-14 10:46 ` [RFC 05/18] vfio/common: Rename VFIOGuestIOMMU::iommu into ::iommu_mr Yi Liu
2022-04-14 10:46 ` [RFC 07/18] vfio: Add base object for VFIOContainer Yi Liu
2022-04-29  6:29   ` David Gibson
2022-05-03 13:05     ` Yi Liu
2022-04-14 10:47 ` [RFC 08/18] vfio/container: Introduce vfio_[attach/detach]_device Yi Liu
2022-04-14 10:47 ` [RFC 09/18] vfio/platform: Use vfio_[attach/detach]_device Yi Liu
2022-04-14 10:47 ` [RFC 10/18] vfio/ap: " Yi Liu
2022-04-14 10:47 ` [RFC 11/18] vfio/ccw: " Yi Liu
2022-04-14 10:47 ` [RFC 12/18] vfio/container-obj: Introduce [attach/detach]_device container callbacks Yi Liu
2022-04-14 10:47 ` [RFC 13/18] vfio/container-obj: Introduce VFIOContainer reset callback Yi Liu
2022-04-14 10:47 ` [RFC 14/18] hw/iommufd: Creation Yi Liu
2022-04-14 10:47 ` [RFC 15/18] vfio/iommufd: Implement iommufd backend Yi Liu
2022-04-22 14:58   ` Jason Gunthorpe
2022-04-22 21:33     ` Alex Williamson
2022-04-26  9:55     ` Yi Liu
2022-04-26 10:41       ` Tian, Kevin
2022-04-26 13:41         ` Jason Gunthorpe
2022-04-26 14:08           ` Yi Liu [this message]
2022-04-26 14:11             ` Jason Gunthorpe
2022-04-26 18:45               ` Alex Williamson
2022-04-26 19:27                 ` Jason Gunthorpe
2022-04-26 20:59                   ` Alex Williamson
2022-04-26 23:08                     ` Jason Gunthorpe
2022-04-26 13:53       ` Jason Gunthorpe
2022-04-14 10:47 ` [RFC 16/18] vfio/iommufd: Add IOAS_COPY_DMA support Yi Liu
2022-04-14 10:47 ` [RFC 17/18] vfio/as: Allow the selection of a given iommu backend Yi Liu
2022-04-14 10:47 ` [RFC 18/18] vfio/pci: Add an iommufd option Yi Liu
2022-04-15  8:37 ` [RFC 00/18] vfio: Adopt iommufd Nicolin Chen
2022-04-17 10:30   ` Eric Auger
2022-04-19  3:26     ` Nicolin Chen
2022-04-25 19:40       ` Eric Auger
2022-04-18  8:49 ` Tian, Kevin
2022-04-18 12:09   ` Yi Liu
2022-04-25 19:51     ` Eric Auger
2022-04-25 19:55   ` Eric Auger
2022-04-26  8:39     ` Tian, Kevin
2022-04-22 22:09 ` Alex Williamson
2022-04-25 10:10   ` Daniel P. Berrangé
2022-04-25 13:36     ` Jason Gunthorpe
2022-04-25 14:37     ` Alex Williamson
2022-04-26  8:37       ` Tian, Kevin
2022-04-26 12:33         ` Jason Gunthorpe
2022-04-26 16:21         ` Alex Williamson
2022-04-26 16:42           ` Jason Gunthorpe
2022-04-26 19:24             ` Alex Williamson
2022-04-26 19:36               ` Jason Gunthorpe
2022-04-28  3:21           ` Tian, Kevin
2022-04-28 14:24             ` Alex Williamson
2022-04-28 16:20               ` Daniel P. Berrangé
2022-04-29  0:45                 ` Tian, Kevin
2022-04-25 20:23   ` Eric Auger
2022-04-25 22:53     ` Alex Williamson
2022-04-26  9:47 ` Shameerali Kolothum Thodi
2022-04-26 11:44   ` Eric Auger
2022-04-26 12:43     ` Shameerali Kolothum Thodi
2022-04-26 16:35       ` Alex Williamson
2022-05-09 14:24         ` Zhangfei Gao
2022-05-10  3:17           ` Yi Liu
2022-05-10  6:51             ` Eric Auger
2022-05-10 12:35               ` Zhangfei Gao
2022-05-10 12:45                 ` Jason Gunthorpe
2022-05-10 14:08                   ` Yi Liu
2022-05-11 14:17                     ` zhangfei.gao
2022-05-12  9:01                       ` zhangfei.gao
2022-05-17  8:55                         ` Yi Liu
2022-05-18  7:22                           ` zhangfei.gao
2022-05-18 14:00                             ` Yi Liu
2022-06-28  8:14                               ` Shameerali Kolothum Thodi
2022-06-28  8:58                                 ` Eric Auger
2022-05-17  8:52                       ` Yi Liu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=79de081d-31dc-41a4-d38f-1e28327b1152@intel.com \
    --to=yi.l.liu@intel.com \
    --cc=akrowiak@linux.ibm.com \
    --cc=alex.williamson@redhat.com \
    --cc=chao.p.peng@intel.com \
    --cc=david@gibson.dropbear.id.au \
    --cc=eric.auger.pro@gmail.com \
    --cc=eric.auger@redhat.com \
    --cc=farman@linux.ibm.com \
    --cc=jasowang@redhat.com \
    --cc=jgg@nvidia.com \
    --cc=jjherne@linux.ibm.com \
    --cc=kevin.tian@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=mjrosato@linux.ibm.com \
    --cc=nicolinc@nvidia.com \
    --cc=pasic@linux.ibm.com \
    --cc=peterx@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=thuth@redhat.com \
    --cc=yi.y.sun@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).