From: Eric Auger <eric.auger@redhat.com>
To: Yi Liu <yi.l.liu@intel.com>, "Tian, Kevin" <kevin.tian@intel.com>,
"alex.williamson@redhat.com" <alex.williamson@redhat.com>,
"cohuck@redhat.com" <cohuck@redhat.com>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>
Cc: "david@gibson.dropbear.id.au" <david@gibson.dropbear.id.au>,
"thuth@redhat.com" <thuth@redhat.com>,
"farman@linux.ibm.com" <farman@linux.ibm.com>,
"mjrosato@linux.ibm.com" <mjrosato@linux.ibm.com>,
"akrowiak@linux.ibm.com" <akrowiak@linux.ibm.com>,
"pasic@linux.ibm.com" <pasic@linux.ibm.com>,
"jjherne@linux.ibm.com" <jjherne@linux.ibm.com>,
"jasowang@redhat.com" <jasowang@redhat.com>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
"jgg@nvidia.com" <jgg@nvidia.com>,
"nicolinc@nvidia.com" <nicolinc@nvidia.com>,
"eric.auger.pro@gmail.com" <eric.auger.pro@gmail.com>,
"Peng, Chao P" <chao.p.peng@intel.com>,
"Sun, Yi Y" <yi.y.sun@intel.com>,
"peterx@redhat.com" <peterx@redhat.com>
Subject: Re: [RFC 00/18] vfio: Adopt iommufd
Date: Mon, 25 Apr 2022 21:51:50 +0200
Message-ID: <7015518e-dc19-d1f6-1eb8-a143be8d3721@redhat.com>
In-Reply-To: <abfebe33-149d-ce34-a178-f735afe2ca95@intel.com>
Hi,
On 4/18/22 2:09 PM, Yi Liu wrote:
> Hi Kevin,
>
> On 2022/4/18 16:49, Tian, Kevin wrote:
>>> From: Liu, Yi L <yi.l.liu@intel.com>
>>> Sent: Thursday, April 14, 2022 6:47 PM
>>>
>>> With the introduction of iommufd[1], the Linux kernel provides a generic
>>> interface for userspace drivers to propagate their DMA mappings to the
>>> kernel for assigned devices. This series ports the VFIO devices onto the
>>> /dev/iommu uapi and lets it coexist with the legacy implementation.
>>> Other devices like vdpa, vfio mdev, etc. are not considered yet.
>>
>> vfio mdev has no special support in Qemu. Just that it's not supported
>> by iommufd yet thus can only be operated in legacy container
>> interface at
>> this point. Later once it's supported by the kernel suppose no
>> additional
>> enabling work is required for mdev in Qemu.
>
> yes. will make it more precise in next version.
>
>>>
>>> For vfio devices, the new interface is tied with device fd and iommufd
>>> as the iommufd solution is device-centric. This is different from
>>> legacy
>>> vfio which is group-centric. To support both interfaces in QEMU, this
>>> series introduces the iommu backend concept in the form of different
>>> container classes. The existing vfio container is named legacy
>>> container
>>> (equivalent with legacy iommu backend in this series), while the new
>>> iommufd based container is named as iommufd container (may also be
>>> mentioned
>>> as iommufd backend in this series). The two backend types have their
>>> own
>>> way to setup secure context and dma management interface. Below diagram
>>> shows how it looks like with both BEs.
>>>
>>>                    VFIO                           AddressSpace/Memory
>>>    +-------+  +----------+  +-----+  +-----+
>>>    |  pci  |  | platform |  |  ap |  | ccw |
>>>    +---+---+  +----+-----+  +--+--+  +--+--+     +----------------------+
>>>        |           |           |        |        |   AddressSpace       |
>>>        |           |           |        |        +------------+---------+
>>>    +---V-----------V-----------V--------V----+               /
>>>    |           VFIOAddressSpace              | <------------+
>>>    |                  |                      |  MemoryListener
>>>    |          VFIOContainer list             |
>>>    +-------+----------------------------+----+
>>>            |                            |
>>>            |                            |
>>>    +-------V------+            +--------V----------+
>>>    |   iommufd    |            |    vfio legacy    |
>>>    |  container   |            |     container     |
>>>    +-------+------+            +--------+----------+
>>>            |                            |
>>>            | /dev/iommu                 | /dev/vfio/vfio
>>>            | /dev/vfio/devices/vfioX    | /dev/vfio/$group_id
>>> Userspace  |                            |
>>> ===========+============================+================================
>>> Kernel     |  device fd                 |
>>>            +---------------+            | group/container fd
>>>            | (BIND_IOMMUFD |            | (SET_CONTAINER/SET_IOMMU)
>>>            |  ATTACH_IOAS) |            | device fd
>>>            |               |            |
>>>            |       +-------V------------V-----------------+
>>>    iommufd |       |                vfio                  |
>>> (map/unmap |       +---------+--------------------+-------+
>>> ioas_copy) |                 |                    | map/unmap
>>>            |                 |                    |
>>>     +------V------+    +-----V------+      +------V--------+
>>>     | iommfd core |    |  device    |      |  vfio iommu   |
>>>     +-------------+    +------------+      +---------------+
>>
>> last row: s/iommfd/iommufd/
>
> thanks. a typo.
>
>> overall this sounds a reasonable abstraction. Later when vdpa starts
>> supporting iommufd probably the iommufd BE will become even
>> smaller with more logic shareable between vfio and vdpa.
>
> let's see if Jason Wang will give some idea. :-)
>
>>>
>>> [Secure Context setup]
>>> - iommufd BE: uses device fd and iommufd to setup secure context
>>> (bind_iommufd, attach_ioas)
>>> - vfio legacy BE: uses group fd and container fd to setup secure
>>> context
>>> (set_container, set_iommu)
>>> [Device access]
>>> - iommufd BE: device fd is opened through /dev/vfio/devices/vfioX
>>> - vfio legacy BE: device fd is retrieved from group fd ioctl
>>> [DMA Mapping flow]
>>> - VFIOAddressSpace receives MemoryRegion add/del via MemoryListener
>>> - VFIO populates DMA map/unmap via the container BEs
>>> *) iommufd BE: uses iommufd
>>> *) vfio legacy BE: uses container fd
>>>
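As an illustration of the DMA mapping flow listed above, here is a minimal sketch of generic code dispatching map/unmap through per-backend callbacks. It is not code from the series: the type and function names (ContainerOps, container_region_add, and so on) are hypothetical, and the callbacks only count mappings where a real backend would issue the map/unmap ioctl on the container fd or on the iommufd.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct Container Container;

/* Hypothetical callback table; these names are not taken from the series. */
typedef struct {
    const char *name;
    int (*dma_map)(Container *c, uint64_t iova, uint64_t size);
    int (*dma_unmap)(Container *c, uint64_t iova, uint64_t size);
} ContainerOps;

struct Container {
    const ContainerOps *ops;   /* backend selected for this container */
    int fd;                    /* container fd (legacy) or iommufd (iommufd BE) */
    unsigned long nr_mappings; /* stand-in for real mapping state */
};

/* Stub shared by both backends: a real backend would call into the kernel. */
static int stub_map(Container *c, uint64_t iova, uint64_t size)
{
    (void)iova;
    (void)size;
    c->nr_mappings++;
    return 0;
}

static int stub_unmap(Container *c, uint64_t iova, uint64_t size)
{
    (void)iova;
    (void)size;
    if (c->nr_mappings == 0) {
        return -1; /* nothing mapped */
    }
    c->nr_mappings--;
    return 0;
}

const ContainerOps legacy_ops  = { "legacy",  stub_map, stub_unmap };
const ContainerOps iommufd_ops = { "iommufd", stub_map, stub_unmap };

/* Generic path: invoked when the MemoryListener reports a region add. */
int container_region_add(Container *c, uint64_t iova, uint64_t size)
{
    assert(c != NULL && c->ops != NULL && c->ops->dma_map != NULL);
    return c->ops->dma_map(c, iova, size);
}

/* Generic path: invoked when the MemoryListener reports a region del. */
int container_region_del(Container *c, uint64_t iova, uint64_t size)
{
    assert(c != NULL && c->ops != NULL && c->ops->dma_unmap != NULL);
    return c->ops->dma_unmap(c, iova, size);
}
```

The generic functions never look at which backend is in use; only the ops table differs between the two container types.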
>>> This series qomifies the VFIOContainer object which acts as a base
>>> class
>>
>> what does 'qomify' mean? I didn't find this word from dictionary...
>>
>>> for a container. This base class is derived into the legacy VFIO
>>> container
>>> and the new iommufd based container. The base class implements generic
>>> code
>>> such as code related to memory_listener and address space management
>>> whereas
>>> the derived class implements callbacks that depend on the kernel
>>> user space
>>
>> 'the kernel user space'?
>
> aha, just want to express different BE callbacks will use different
> user interface exposed by kernel. will refine the wording.
>
>>
>>> being used.
>>>
>>> The selection of the backend is made on a device basis using the new
>>> iommufd option (on/off/auto). By default the iommufd backend is
>>> selected
>>> if supported by the host and by QEMU (iommufd KConfig). This option is
>>> currently available only for the vfio-pci device. For other types of
>>> devices, it does not yet exist and the legacy BE is chosen by default.
>>>
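For illustration only, the on/off/auto selection described above could be sketched as follows. The enum and function names are hypothetical, and failing when "on" is requested but iommufd is unavailable is an assumption; the cover letter only states the default (auto) behaviour.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; the series' actual types may differ. */
typedef enum { IOMMUFD_OFF, IOMMUFD_ON, IOMMUFD_AUTO } IommufdOpt;
typedef enum { BE_LEGACY, BE_IOMMUFD, BE_ERROR } BackendChoice;

/*
 * Pick a backend for one device.
 * host_has_iommufd: the host kernel exposes /dev/iommu.
 * qemu_has_iommufd: QEMU was built with iommufd support.
 */
BackendChoice select_backend(IommufdOpt opt, bool host_has_iommufd,
                             bool qemu_has_iommufd)
{
    bool available = host_has_iommufd && qemu_has_iommufd;

    assert(opt == IOMMUFD_OFF || opt == IOMMUFD_ON || opt == IOMMUFD_AUTO);

    switch (opt) {
    case IOMMUFD_OFF:
        return BE_LEGACY;
    case IOMMUFD_ON:
        /* Assumption: an explicit request that cannot be met is an error. */
        return available ? BE_IOMMUFD : BE_ERROR;
    case IOMMUFD_AUTO:
        /* Default: prefer iommufd, fall back to the legacy container. */
        return available ? BE_IOMMUFD : BE_LEGACY;
    }
    return BE_ERROR;
}
```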
>>> Test done:
>>> - PCI and Platform device were tested
>>
>> In this case PCI uses iommufd while platform device uses legacy?
>
> For PCI, both legacy and iommufd were tested. The exploration kernel
> branch doesn't have the new device uapi for platform device, so I
> didn't test it.
> But I remember Eric should have tested it with iommufd. Eric?
No, I only ran non-regression tests for vfio-platform, in legacy mode. I
did not integrate with the new device uapi for platform devices.
>
>>> - ccw and ap were only compile-tested
>>> - limited device hotplug test
>>> - vIOMMU test run for both legacy and iommufd backends (limited tests)
>>>
>>> This series was co-developed by Eric Auger and me based on the exploration
>>> iommufd kernel[2]; the complete code of this series is available in [3]. As
>>> the iommufd kernel is at an early stage (only the iommufd generic interface
>>> is on the mailing list), this series hasn't made the iommufd backend fully
>>> on par with the legacy backend w.r.t. features like p2p mappings, coherency
>>> tracking,
>>
>> what does 'coherency tracking' mean here? if related to iommu enforce
>> snoop it is fully handled by the kernel so far. I didn't find any use of
>> VFIO_DMA_CC_IOMMU in current Qemu.
>
> It's the kvm_group add/del stuff. Perhaps saying kvm_group add/del
> equivalence would be better?
>
>>> live migration, etc. This series does not support PCI devices without FLR
>>> either, as the kernel doesn't support VFIO_DEVICE_PCI_HOT_RESET when
>>> userspace is using iommufd. The kernel needs to be updated to accept a
>>> device fd list for reset when userspace is using iommufd. Related work is
>>> in progress by Jason[4].
>>>
>>> TODOs:
>>> - Add DMA alias check for iommufd BE (group level)
>>> - Make pci.c to be BE agnostic. Needs kernel change as well to fix the
>>> VFIO_DEVICE_PCI_HOT_RESET gap
>>> - Cleanup the VFIODevice fields as it's used in both BEs
>>> - Add locks
>>> - Replace list with g_tree
>>> - More tests
>>>
>>> Patch Overview:
>>>
>>> - Preparation:
>>> 0001-scripts-update-linux-headers-Add-iommufd.h.patch
>>> 0002-linux-headers-Import-latest-vfio.h-and-iommufd.h.patch
>>> 0003-hw-vfio-pci-fix-vfio_pci_hot_reset_result-trace-poin.patch
>>> 0004-vfio-pci-Use-vbasedev-local-variable-in-vfio_realize.patch
>>> 0005-vfio-common-Rename-VFIOGuestIOMMU-iommu-into-iommu_m.patch
>>
>> 3-5 are pure cleanups which could be sent out separately
>
> yes. may send later after checking with Eric. :-)
Yes, it makes sense to send them separately.
Thanks
Eric
>
>>> 0006-vfio-common-Split-common.c-into-common.c-container.c.patch
>>>
>>> - Introduce container object and convert existing vfio to use it:
>>> 0007-vfio-Add-base-object-for-VFIOContainer.patch
>>> 0008-vfio-container-Introduce-vfio_attach-detach_device.patch
>>> 0009-vfio-platform-Use-vfio_-attach-detach-_device.patch
>>> 0010-vfio-ap-Use-vfio_-attach-detach-_device.patch
>>> 0011-vfio-ccw-Use-vfio_-attach-detach-_device.patch
>>> 0012-vfio-container-obj-Introduce-attach-detach-_device-c.patch
>>> 0013-vfio-container-obj-Introduce-VFIOContainer-reset-cal.patch
>>>
>>> - Introduce iommufd based container:
>>> 0014-hw-iommufd-Creation.patch
>>> 0015-vfio-iommufd-Implement-iommufd-backend.patch
>>> 0016-vfio-iommufd-Add-IOAS_COPY_DMA-support.patch
>>>
>>> - Add backend selection for vfio-pci:
>>> 0017-vfio-as-Allow-the-selection-of-a-given-iommu-backend.patch
>>> 0018-vfio-pci-Add-an-iommufd-option.patch
>>>
>>> [1] https://lore.kernel.org/kvm/0-v1-e79cd8d168e8+6-iommufd_jgg@nvidia.com/
>>> [2] https://github.com/luxis1999/iommufd/tree/iommufd-v5.17-rc6
>>> [3] https://github.com/luxis1999/qemu/tree/qemu-for-5.17-rc6-vm-rfcv1
>>> [4] https://lore.kernel.org/kvm/0-v1-a8faf768d202+125dd-vfio_mdev_no_group_jgg@nvidia.com/
>>
>> Following is probably more relevant to [4]:
>>
>> https://lore.kernel.org/all/10-v1-33906a626da1+16b0-vfio_kvm_no_group_jgg@nvidia.com/
>>
>
> absolutely.:-) thanks.
>
>> Thanks
>> Kevin
>