Hi Linus,

This PR includes the dirty tracking and first part of the nested
translation items for iommufd, details in the tag.

For those following, these series are still progressing:

- User page table invalidation:
 https://lore.kernel.org/r/20231020092426.13907-1-yi.l.liu@intel.com
 https://lore.kernel.org/r/20231020093719.18725-1-yi.l.liu@intel.com

- ARM SMMUv3 nested translation:
 https://lore.kernel.org/all/cover.1683688960.git.nicolinc@nvidia.com/

- Draft AMD IOMMU nested translation:
 https://lore.kernel.org/all/20230621235508.113949-1-suravee.suthikulpanit@amd.com/

- ARM SMMUv3 Dirty tracking:
 https://github.com/jpemartins/linux/commits/smmu-iommufd-v3

There is also a lot of ongoing work to consistently and generically enable
PASID and SVA support in all the IOMMU drivers:

 SMMUv3:
  https://lore.kernel.org/r/0-v1-e289ca9121be+2be-smmuv3_newapi_p1_jgg@nvidia.com
  https://lore.kernel.org/r/0-v1-afbb86647bbd+5-smmuv3_newapi_p2_jgg@nvidia.com

 AMD:
  https://lore.kernel.org/all/20231016104351.5749-1-vasant.hegde@amd.com/
  https://lore.kernel.org/all/20231013151652.6008-1-vasant.hegde@amd.com/

 Intel:
  https://lore.kernel.org/r/20231017032045.114868-1-tina.zhang@intel.com

RFC patches for PASID support in iommufd & vfio:
 https://lore.kernel.org/all/20230926092651.17041-1-yi.l.liu@intel.com/
 https://lore.kernel.org/all/20230926093121.18676-1-yi.l.liu@intel.com/

IO page faults and events delivered to userspace through iommufd:
 https://lore.kernel.org/all/20231026024930.382898-1-baolu.lu@linux.intel.com/

RFC patches exploring support for the first Intel Scalable IO Virtualization
(SIOV r1) device are posted:
 https://lore.kernel.org/all/20231009085123.463179-1-yi.l.liu@intel.com/

Along with qemu patches implementing iommufd:
 https://lore.kernel.org/all/20231016083223.1519410-1-zhenzhong.duan@intel.com/

There are some conflicts with Joerg's main iommu tree, most are of the
append to list type of conflict.
A few notes:

drivers/iommu/iommufd/selftest.c needs a non-conflict hunk:

- static struct iommu_domain *mock_domain_alloc(unsigned int iommu_domain_type)
- {
-	if (iommu_domain_type == IOMMU_DOMAIN_BLOCKED)
-		return &mock_blocking_domain;
-	if (iommu_domain_type == IOMMU_DOMAIN_UNMANAGED)
-		return mock_domain_alloc_paging(NULL);
-	return NULL;
- }
-

drivers/iommu/iommufd/selftest.c should be:

@@@ -466,10 -293,8 +450,9 @@@ static const struct iommu_ops mock_ops
  	.owner = THIS_MODULE,
  	.pgsize_bitmap = MOCK_IO_PAGE_SIZE,
  	.hw_info = mock_domain_hw_info,
- 	.domain_alloc = mock_domain_alloc,
+ 	.domain_alloc_paging = mock_domain_alloc_paging,
+ 	.domain_alloc_user = mock_domain_alloc_user,
  	.capable = mock_domain_capable,
- 	.set_platform_dma_ops = mock_domain_set_plaform_dma_ops,

include/linux/iommu.h should be:

- * @domain_alloc: allocate iommu domain
+ * @domain_alloc: allocate and return an iommu domain if success. Otherwise
+ *                NULL is returned. The domain is not fully initialized until
+ *                the caller iommu_domain_alloc() returns.
+ * @domain_alloc_user: Allocate an iommu domain corresponding to the input
+ *                     parameters as defined in include/uapi/linux/iommufd.h.
+ *                     Unlike @domain_alloc, it is called only by IOMMUFD and
+ *                     must fully initialize the new domain before return.
+ *                     Upon success, if the @user_data is valid and the @parent
+ *                     points to a kernel-managed domain, the new domain must be
+ *                     IOMMU_DOMAIN_NESTED type; otherwise, the @parent must be
+ *                     NULL while the @user_data can be optionally provided, the
+ *                     new domain must support __IOMMU_DOMAIN_PAGING.
+ *                     Upon failure, ERR_PTR must be returned.
+ * @domain_alloc_paging: Allocate an iommu_domain that can be used for
+ *                       UNMANAGED, DMA, and DMA_FQ domain types.

The rest were straightforward.

The tag for-linus-iommufd-merged with my merge resolution to your tree is
also available to pull.
Thanks,
Jason

The following changes since commit ce9ecca0238b140b88f43859b211c9fdfd8e5b70:

  Linux 6.6-rc2 (2023-09-17 14:40:24 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd.git tags/for-linus-iommufd

for you to fetch changes up to b2b67c997bf74453f3469d8b54e4859f190943bd:

  iommufd: Organize the mock domain alloc functions closer to Joerg's tree (2023-10-30 18:01:56 -0300)

----------------------------------------------------------------
iommufd for 6.7

This branch has three new iommufd capabilities:

 - Dirty tracking for DMA.

   AMD/ARM/Intel CPUs can now record if a DMA writes to a page in the IOPTEs
   within the IO page table. This can be used to generate a record of what
   memory is being dirtied by DMA activities during a VM migration process.
   A VMM like qemu will combine the IOMMU dirty bits with the CPU's dirty
   log to determine what memory to transfer.

   VFIO already has a DMA dirty tracking framework that requires PCI
   devices to implement tracking HW internally. The iommufd version
   provides an alternative that the VMM can select, if available. The two
   are designed to have very similar APIs.

 - Userspace controlled attributes for hardware page tables
   (HWPT/iommu_domain). There are currently a few generic attributes for
   HWPTs (support dirty tracking, and parent of a nest). This is an entry
   point for the userspace iommu driver to control the HW in detail.

 - Nested translation support for HWPTs. This is a 2D translation scheme
   similar to the CPU where a DMA goes through a first stage to determine
   an intermediate address which is then translated through a second stage
   to a physical address.

   Like for CPU translation the first stage table would exist in VM
   controlled memory and the second stage is in the kernel and matches the
   VM's guest to physical map.
   As every IOMMU has a unique set of parameters to describe the S1 IO page
   table and its associated parameters, the userspace IOMMU driver has to
   marshal the information into the correct format.

   This is 1/3 of the feature: it allows creating the nested translation
   and binding it to VFIO devices. However, the API to support IOTLB and
   ATC invalidation of the stage 1 IO page table, and forwarding of IO
   faults, are still in progress.

The series includes AMD and Intel support for dirty tracking, and Intel
support for nested translation.

Along the way are a number of internal items:

 - New iommu core items: ops->domain_alloc_user(),
   ops->set_dirty_tracking(), ops->read_and_clear_dirty(),
   IOMMU_DOMAIN_NESTED, and iommu_copy_struct_from_user

 - UAF fix in iopt_area_split()

 - Spelling fixes and some test suite improvements

----------------------------------------------------------------
GuokaiXu (1):
      iommufd: Fix spelling errors in comments

Jason Gunthorpe (4):
      iommufd: Rename IOMMUFD_OBJ_HW_PAGETABLE to IOMMUFD_OBJ_HWPT_PAGING
      iommufd/device: Wrap IOMMUFD_OBJ_HWPT_PAGING-only configurations
      iommufd: Add iopt_area_alloc()
      iommufd: Organize the mock domain alloc functions closer to Joerg's tree

Joao Martins (19):
      vfio/iova_bitmap: Export more API symbols
      vfio: Move iova_bitmap into iommufd
      iommufd/iova_bitmap: Move symbols to IOMMUFD namespace
      iommu: Add iommu_domain ops for dirty tracking
      iommufd: Add a flag to enforce dirty tracking on attach
      iommufd: Add IOMMU_HWPT_SET_DIRTY_TRACKING
      iommufd: Add IOMMU_HWPT_GET_DIRTY_BITMAP
      iommufd: Add capabilities to IOMMU_GET_HW_INFO
      iommufd: Add a flag to skip clearing of IOPTE dirty
      iommu/amd: Add domain_alloc_user based domain allocation
      iommu/amd: Access/Dirty bit support in IOPTEs
      iommu/vt-d: Access/Dirty bit support for SS domains
      iommufd/selftest: Expand mock_domain with dev_flags
      iommufd/selftest: Test IOMMU_HWPT_ALLOC_DIRTY_TRACKING
      iommufd/selftest: Test IOMMU_HWPT_SET_DIRTY_TRACKING
      iommufd/selftest: Test IOMMU_HWPT_GET_DIRTY_BITMAP
      iommufd/selftest: Test out_capabilities in IOMMU_GET_HW_INFO
      iommufd/selftest: Test IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR flag
      iommufd/selftest: Fix page-size check in iommufd_test_dirty()

Koichiro Den (1):
      iommufd: Fix missing update of domains_itree after splitting iopt_area

Lu Baolu (6):
      iommu: Add IOMMU_DOMAIN_NESTED
      iommu/vt-d: Extend dmar_domain to support nested domain
      iommu/vt-d: Add helper for nested domain allocation
      iommu/vt-d: Add helper to setup pasid nested translation
      iommu/vt-d: Add nested domain allocation
      iommu/vt-d: Disallow read-only mappings to nest parent domain

Nicolin Chen (10):
      iommufd/selftest: Iterate idev_ids in mock_domain's alloc_hwpt test
      iommufd/selftest: Rework TEST_LENGTH to test min_size explicitly
      iommufd: Correct IOMMU_HWPT_ALLOC_NEST_PARENT description
      iommufd: Only enforce cache coherency in iommufd_hw_pagetable_alloc
      iommufd: Derive iommufd_hwpt_paging from iommufd_hw_pagetable
      iommufd: Share iommufd_hwpt_alloc with IOMMUFD_OBJ_HWPT_NESTED
      iommufd: Add a nested HW pagetable object
      iommu: Add iommu_copy_struct_from_user helper
      iommufd/selftest: Add nested domain allocation for mock domain
      iommufd/selftest: Add coverage for IOMMU_HWPT_ALLOC with nested HWPTs

Yi Liu (11):
      iommu: Add new iommu op to create domains owned by userspace
      iommufd: Use the domain_alloc_user() op for domain allocation
      iommufd: Flow user flags for domain allocation to domain_alloc_user()
      iommufd: Support allocating nested parent domain
      iommufd/selftest: Add domain_alloc_user() support in iommu mock
      iommu/vt-d: Add domain_alloc_user op
      iommu: Pass in parent domain with user_data to domain_alloc_user op
      iommu/vt-d: Enhance capability check for nested parent domain allocation
      iommufd: Add data structure for Intel VT-d stage-1 domain allocation
      iommu/vt-d: Make domain attach helpers to be extern
      iommu/vt-d: Set the nested domain to a device

 drivers/iommu/Kconfig                            |   4 +
 drivers/iommu/amd/Kconfig                        |   1 +
 drivers/iommu/amd/amd_iommu_types.h              |  12 +
 drivers/iommu/amd/io_pgtable.c                   |  68 ++++
 drivers/iommu/amd/iommu.c                        | 147 ++++++++-
 drivers/iommu/intel/Kconfig                      |   1 +
 drivers/iommu/intel/Makefile                     |   2 +-
 drivers/iommu/intel/iommu.c                      | 156 +++++++++-
 drivers/iommu/intel/iommu.h                      |  64 +++-
 drivers/iommu/intel/nested.c                     | 117 +++++++
 drivers/iommu/intel/pasid.c                      | 221 +++++++++++++
 drivers/iommu/intel/pasid.h                      |   6 +
 drivers/iommu/iommufd/Makefile                   |   1 +
 drivers/iommu/iommufd/device.c                   | 174 +++++++----
 drivers/iommu/iommufd/hw_pagetable.c             | 304 ++++++++++++++----
 drivers/iommu/iommufd/io_pagetable.c             | 200 +++++++++++-
 drivers/iommu/iommufd/iommufd_private.h          |  84 ++++-
 drivers/iommu/iommufd/iommufd_test.h             |  39 +++
 drivers/{vfio => iommu/iommufd}/iova_bitmap.c    |   5 +-
 drivers/iommu/iommufd/main.c                     |  17 +-
 drivers/iommu/iommufd/pages.c                    |   2 +
 drivers/iommu/iommufd/selftest.c                 | 328 ++++++++++++++++++--
 drivers/iommu/iommufd/vfio_compat.c              |   6 +-
 drivers/vfio/Makefile                            |   3 +-
 drivers/vfio/pci/mlx5/Kconfig                    |   1 +
 drivers/vfio/pci/mlx5/main.c                     |   1 +
 drivers/vfio/pci/pds/Kconfig                     |   1 +
 drivers/vfio/pci/pds/pci_drv.c                   |   1 +
 drivers/vfio/vfio_main.c                         |   1 +
 include/linux/io-pgtable.h                       |   4 +
 include/linux/iommu.h                            | 146 ++++++++-
 include/linux/iova_bitmap.h                      |  26 ++
 include/uapi/linux/iommufd.h                     | 180 ++++++++++-
 tools/testing/selftests/iommu/iommufd.c          | 379 ++++++++++++++++++++++-
 tools/testing/selftests/iommu/iommufd_fail_nth.c |   7 +-
 tools/testing/selftests/iommu/iommufd_utils.h    | 233 +++++++++++++-
 36 files changed, 2723 insertions(+), 219 deletions(-)