[RFC PATCHES 00/17] IOMMUFD: Deliver IO page faults to user space

* [RFC PATCHES 00/17] IOMMUFD: Deliver IO page faults to user space
@ 2023-05-30  5:37 Lu Baolu
  2023-05-30  5:37 ` [RFC PATCHES 01/17] iommu: Move iommu fault data to linux/iommu.h Lu Baolu
                   ` (19 more replies)
  0 siblings, 20 replies; 37+ messages in thread
From: Lu Baolu @ 2023-05-30  5:37 UTC (permalink / raw)
  To: Jason Gunthorpe, Kevin Tian, Joerg Roedel, Will Deacon,
	Robin Murphy, Jean-Philippe Brucker, Nicolin Chen, Yi Liu,
	Jacob Pan
  Cc: iommu, linux-kselftest, virtualization, linux-kernel, Lu Baolu

Hi folks,

This series implements the functionality of delivering IO page faults to
user space through the IOMMUFD framework. The use case is nested
translation, where modern IOMMU hardware supports two-stage translation
tables. The second-stage translation table is managed by the host VMM
while the first-stage translation table is owned by the user space.
Hence, any IO page fault that occurs on the first-stage page table
should be delivered to the user space and handled there. The user space
should respond the page fault handling result to the device top-down
through the IOMMUFD response uAPI.

User space indicates its capablity of handling IO page faults by setting
a user HWPT allocation flag IOMMU_HWPT_ALLOC_FLAGS_IOPF_CAPABLE. IOMMUFD
will then setup its infrastructure for page fault delivery. Together
with the iopf-capable flag, user space should also provide an eventfd
where it will listen on any down-top page fault messages.

On a successful return of the allocation of iopf-capable HWPT, a fault
fd will be returned. User space can open and read fault messages from it
once the eventfd is signaled.

Besides the overall design, I'd like to hear comments about below
designs:

- The IOMMUFD fault message format. It is very similar to that in
  uapi/linux/iommu which has been discussed before and partially used by
  the IOMMU SVA implementation. I'd like to get more comments on the
  format when it comes to IOMMUFD.

- The timeout value for the pending page fault messages. Ideally we
  should determine the timeout value from the device configuration, but
  I failed to find any statement in the PCI specification (version 6.x).
  A default 100 milliseconds is selected in the implementation, but it
  leave the room for grow the code for per-device setting.

This series is only for review comment purpose. I used IOMMUFD selftest
to verify the hwpt allocation, attach/detach and replace. But I didn't
get a chance to run it with real hardware yet. I will do more test in
the subsequent versions when I am confident that I am heading on the
right way.

This series is based on the latest implementation of the nested
translation under discussion. The whole series and related patches are
available on gitbub:

https://github.com/LuBaolu/intel-iommu/commits/iommufd-io-pgfault-delivery-v1

Best regards,
baolu

Lu Baolu (17):
  iommu: Move iommu fault data to linux/iommu.h
  iommu: Support asynchronous I/O page fault response
  iommu: Add helper to set iopf handler for domain
  iommu: Pass device parameter to iopf handler
  iommu: Split IO page fault handling from SVA
  iommu: Add iommu page fault cookie helpers
  iommufd: Add iommu page fault data
  iommufd: IO page fault delivery initialization and release
  iommufd: Add iommufd hwpt iopf handler
  iommufd: Add IOMMU_HWPT_ALLOC_FLAGS_USER_PASID_TABLE for hwpt_alloc
  iommufd: Deliver fault messages to user space
  iommufd: Add io page fault response support
  iommufd: Add a timer for each iommufd fault data
  iommufd: Drain all pending faults when destroying hwpt
  iommufd: Allow new hwpt_alloc flags
  iommufd/selftest: Add IOPF feature for mock devices
  iommufd/selftest: Cover iopf-capable nested hwpt

 include/linux/iommu.h                         | 175 +++++++++-
 drivers/iommu/{iommu-sva.h => io-pgfault.h}   |  25 +-
 drivers/iommu/iommu-priv.h                    |   3 +
 drivers/iommu/iommufd/iommufd_private.h       |  32 ++
 include/uapi/linux/iommu.h                    | 161 ---------
 include/uapi/linux/iommufd.h                  |  73 +++-
 tools/testing/selftests/iommu/iommufd_utils.h |  20 +-
 .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c   |   2 +-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   |   2 +-
 drivers/iommu/intel/iommu.c                   |   2 +-
 drivers/iommu/intel/svm.c                     |   2 +-
 drivers/iommu/io-pgfault.c                    |   7 +-
 drivers/iommu/iommu-sva.c                     |   4 +-
 drivers/iommu/iommu.c                         |  50 ++-
 drivers/iommu/iommufd/device.c                |  64 +++-
 drivers/iommu/iommufd/hw_pagetable.c          | 318 +++++++++++++++++-
 drivers/iommu/iommufd/main.c                  |   3 +
 drivers/iommu/iommufd/selftest.c              |  71 ++++
 tools/testing/selftests/iommu/iommufd.c       |  17 +-
 MAINTAINERS                                   |   1 -
 drivers/iommu/Kconfig                         |   4 +
 drivers/iommu/Makefile                        |   3 +-
 drivers/iommu/intel/Kconfig                   |   1 +
 23 files changed, 837 insertions(+), 203 deletions(-)
 rename drivers/iommu/{iommu-sva.h => io-pgfault.h} (71%)
 delete mode 100644 include/uapi/linux/iommu.h

-- 
2.34.1

^ permalink raw reply	[flat|nested] 37+ messages in thread