All of lore.kernel.org
 help / color / mirror / Atom feed
From: Yi Liu <yi.l.liu@intel.com>
To: joro@8bytes.org, alex.williamson@redhat.com, jgg@nvidia.com,
	kevin.tian@intel.com, robin.murphy@arm.com,
	baolu.lu@linux.intel.com
Cc: cohuck@redhat.com, eric.auger@redhat.com, nicolinc@nvidia.com,
	kvm@vger.kernel.org, mjrosato@linux.ibm.com,
	chao.p.peng@linux.intel.com, yi.l.liu@intel.com,
	yi.y.sun@linux.intel.com, peterx@redhat.com, jasowang@redhat.com,
	shameerali.kolothum.thodi@huawei.com, lulu@redhat.com,
	suravee.suthikulpanit@amd.com, iommu@lists.linux.dev,
	linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
	zhenzhong.duan@intel.com
Subject: [PATCH v3 00/17] iommufd: Add nesting infrastructure
Date: Mon, 24 Jul 2023 04:03:49 -0700	[thread overview]
Message-ID: <20230724110406.107212-1-yi.l.liu@intel.com> (raw)

Nested translation is a hardware feature that is supported by many modern
IOMMU hardwares. It has two stages (stage-1, stage-2) address translation
to get access to the physical address. stage-1 translation table is owned
by userspace (e.g. by a guest OS), while stage-2 is owned by kernel. Changes
to stage-1 translation table should be followed by an IOTLB invalidation.

Take Intel VT-d as an example, the stage-1 translation table is I/O page
table. As the below diagram shows, guest I/O page table pointer in GPA
(guest physical address) is passed to host and be used to perform the stage-1
address translation. Along with it, modifications to present mappings in the
guest I/O page table should be followed with an IOTLB invalidation.

    .-------------.  .---------------------------.
    |   vIOMMU    |  | Guest I/O page table      |
    |             |  '---------------------------'
    .----------------/
    | PASID Entry |--- PASID cache flush --+
    '-------------'                        |
    |             |                        V
    |             |           I/O page table pointer in GPA
    '-------------'
Guest
------| Shadow |---------------------------|--------
      v        v                           v
Host
    .-------------.  .------------------------.
    |   pIOMMU    |  |  FS for GIOVA->GPA     |
    |             |  '------------------------'
    .----------------/  |
    | PASID Entry |     V (Nested xlate)
    '----------------\.----------------------------------.
    |             |   | SS for GPA->HPA, unmanaged domain|
    |             |   '----------------------------------'
    '-------------'
Where:
 - FS = First stage page tables
 - SS = Second stage page tables
<Intel VT-d Nested translation>

In IOMMUFD, all the translation tables are tracked by hw_pagetable (hwpt)
and each has an iommu_domain allocated from iommu driver. So in this series
hw_pagetable and iommu_domain means the same thing if no special note.
IOMMUFD has already supported allocating hw_pagetable that is linked with
an IOAS. However, nesting requires IOMMUFD to allow allocating hw_pagetable
with driver specific parameters and interface to sync stage-1 IOTLB as user
owns the stage-1 translation table.

This series is based on the iommu hw info reporting series [1]. It first
introduces new iommu op for allocating domains with user data and the op
for invalidate stage-1 IOTLB, and then extend the IOMMUFD internal infrastructure
to accept user_data and parent hwpt, then relay the data to iommu core to
allocate user iommu_domain. After it, extends the ioctl IOMMU_HWPT_ALLOC to
accept user data and stage-2 hwpt ID to allocate hwpt. Along with it, ioctl
IOMMU_HWPT_INVALIDATE is added to invalidate stage-1 IOTLB. This is needed
for user-managed hwpts. Selftest is added as well to cover the new ioctls.

Complete code can be found in [2], QEMU could can be found in [3].

At last, this is a team work together with Nicolin Chen, Lu Baolu. Thanks
them for the help. ^_^. Look forward to your feedbacks.

[1] https://lore.kernel.org/linux-iommu/20230724105936.107042-1-yi.l.liu@intel.com/
[2] https://github.com/yiliu1765/iommufd/tree/iommufd_nesting
[3] https://github.com/yiliu1765/qemu/tree/wip/iommufd_rfcv4_nesting

Change log:

v3:
 - Add new uAPI things in alphabetical order
 - Pass in "enum iommu_hwpt_type hwpt_type" to op->domain_alloc_user for
   sanity, replacing the previous op->domain_alloc_user_data_len solution
 - Return ERR_PTR from domain_alloc_user instead of NULL
 - Only add IOMMU_RESV_SW_MSI to kernel-managed HWPT in nested translation (Kevin)
 - Add IOMMU_RESV_IOVA_RANGES to report resv iova ranges to userspace hence
   userspace is able to exclude the ranges in the stage-1 HWPT (e.g. guest I/O
   page table). (Kevin)
 - Add selftest coverage for the new IOMMU_RESV_IOVA_RANGES ioctl
 - Minor changes per Kevin's inputs

v2: https://lore.kernel.org/linux-iommu/20230511143844.22693-1-yi.l.liu@intel.com/
 - Add union iommu_domain_user_data to include all user data structures to avoid
   passing void * in kernel APIs.
 - Add iommu op to return user data length for user domain allocation
 - Rename struct iommu_hwpt_alloc::data_type to be hwpt_type
 - Store the invalidation data length in iommu_domain_ops::cache_invalidate_user_data_len
 - Convert cache_invalidate_user op to be int instead of void
 - Remove @data_type in struct iommu_hwpt_invalidate
 - Remove out_hwpt_type_bitmap in struct iommu_hw_info hence drop patch 08 of v1

v1: https://lore.kernel.org/linux-iommu/20230309080910.607396-1-yi.l.liu@intel.com/

Thanks,
	Yi Liu

Lu Baolu (2):
  iommu: Add new iommu op to create domains owned by userspace
  iommu: Add nested domain support

Nicolin Chen (6):
  iommufd/hw_pagetable: Do not populate user-managed hw_pagetables
  iommufd: Only enforce IOMMU_RESV_SW_MSI when attaching user-managed
    HWPT
  iommufd/selftest: Add domain_alloc_user() support in iommu mock
  iommufd/selftest: Add coverage for IOMMU_HWPT_ALLOC with user data
  iommufd/selftest: Add IOMMU_TEST_OP_MD_CHECK_IOTLB test op
  iommufd/selftest: Add coverage for IOMMU_HWPT_INVALIDATE ioctl

Yi Liu (9):
  iommufd/hw_pagetable: Use domain_alloc_user op for domain allocation
  iommufd: Pass in hwpt_type/parent/user_data to
    iommufd_hw_pagetable_alloc()
  iommufd: Add IOMMU_RESV_IOVA_RANGES
  iommufd: IOMMU_HWPT_ALLOC allocation with user data
  iommufd: Add IOMMU_HWPT_INVALIDATE
  iommufd/selftest: Add a helper to get test device
  iommufd/selftest: Add IOMMU_TEST_OP_DEV_[ADD|DEL]_RESERVED to add/del
    reserved regions to selftest device
  iommufd/selftest: Add .get_resv_regions() for mock_dev
  iommufd/selftest: Add coverage for IOMMU_RESV_IOVA_RANGES

 drivers/iommu/iommufd/device.c                |   9 +-
 drivers/iommu/iommufd/hw_pagetable.c          | 181 +++++++++++-
 drivers/iommu/iommufd/io_pagetable.c          |   5 +-
 drivers/iommu/iommufd/iommufd_private.h       |  20 +-
 drivers/iommu/iommufd/iommufd_test.h          |  36 +++
 drivers/iommu/iommufd/main.c                  |  59 +++-
 drivers/iommu/iommufd/selftest.c              | 266 ++++++++++++++++--
 include/linux/iommu.h                         |  34 +++
 include/uapi/linux/iommufd.h                  |  96 ++++++-
 tools/testing/selftests/iommu/iommufd.c       | 224 ++++++++++++++-
 tools/testing/selftests/iommu/iommufd_utils.h |  70 +++++
 11 files changed, 958 insertions(+), 42 deletions(-)

-- 
2.34.1


             reply	other threads:[~2023-07-24 11:04 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-07-24 11:03 Yi Liu [this message]
2023-07-24 11:03 ` [PATCH v3 01/17] iommu: Add new iommu op to create domains owned by userspace Yi Liu
2023-07-28  9:37   ` Tian, Kevin
2023-07-28 16:56     ` Jason Gunthorpe
2023-07-31 12:44       ` Liu, Yi L
2023-07-31 13:19         ` Jason Gunthorpe
2023-08-03  2:28       ` Nicolin Chen
2023-07-24 11:03 ` [PATCH v3 02/17] iommu: Add nested domain support Yi Liu
2023-07-28  9:38   ` Tian, Kevin
2023-07-28 16:59   ` Jason Gunthorpe
2023-08-03  2:36     ` Nicolin Chen
2023-08-03  2:53       ` Liu, Yi L
2023-08-03  3:04         ` Nicolin Chen
2023-07-24 11:03 ` [PATCH v3 03/17] iommufd/hw_pagetable: Use domain_alloc_user op for domain allocation Yi Liu
2023-07-28  9:39   ` Tian, Kevin
2023-07-24 11:03 ` [PATCH v3 04/17] iommufd: Pass in hwpt_type/parent/user_data to iommufd_hw_pagetable_alloc() Yi Liu
2023-07-28  9:49   ` Tian, Kevin
2023-07-24 11:03 ` [PATCH v3 05/17] iommufd/hw_pagetable: Do not populate user-managed hw_pagetables Yi Liu
2023-07-28  9:51   ` Tian, Kevin
2023-07-24 11:03 ` [PATCH v3 06/17] iommufd: Only enforce IOMMU_RESV_SW_MSI when attaching user-managed HWPT Yi Liu
2023-07-28 10:02   ` Tian, Kevin
2023-07-28 17:06     ` Jason Gunthorpe
2023-07-24 11:03 ` [PATCH v3 07/17] iommufd: Add IOMMU_RESV_IOVA_RANGES Yi Liu
2023-07-28 10:07   ` Tian, Kevin
2023-07-28 17:16     ` Jason Gunthorpe
2023-07-31  6:14       ` Tian, Kevin
2023-07-31 13:12         ` Jason Gunthorpe
2023-07-28 17:44   ` Jason Gunthorpe
2023-07-31  6:21     ` Tian, Kevin
2023-07-31  9:53       ` Liu, Yi L
2023-07-31 13:23       ` Jason Gunthorpe
2023-08-01  2:40         ` Tian, Kevin
2023-08-01 18:22           ` Jason Gunthorpe
2023-08-02  1:09             ` Tian, Kevin
2023-08-02 12:22               ` Jason Gunthorpe
2023-08-03  1:23               ` Nicolin Chen
2023-08-03  1:25                 ` Tian, Kevin
2023-08-03  2:17                   ` Nicolin Chen
2023-07-24 11:03 ` [PATCH v3 08/17] iommufd: IOMMU_HWPT_ALLOC allocation with user data Yi Liu
2023-07-28 17:55   ` Jason Gunthorpe
2023-07-28 19:10     ` Nicolin Chen
2023-07-31  7:22       ` Liu, Yi L
2023-07-31  6:31     ` Tian, Kevin
2023-07-31 13:16       ` Jason Gunthorpe
2023-08-01  2:35         ` Tian, Kevin
2023-08-02 23:42         ` Nicolin Chen
2023-08-02 23:43           ` Jason Gunthorpe
2023-08-03  0:53             ` Nicolin Chen
2023-08-03 16:47               ` Jason Gunthorpe
2023-07-24 11:03 ` [PATCH v3 09/17] iommufd: Add IOMMU_HWPT_INVALIDATE Yi Liu
2023-07-28 18:02   ` Jason Gunthorpe
2023-07-31 10:07     ` Liu, Yi L
2023-07-31 13:19       ` Jason Gunthorpe
2023-08-03  2:16         ` Nicolin Chen
2023-08-03  3:07           ` Liu, Yi L
2023-08-03  3:13             ` Nicolin Chen
2023-08-03  2:56         ` Liu, Yi L
2023-08-03  2:07     ` Nicolin Chen
2023-07-24 11:03 ` [PATCH v3 10/17] iommufd/selftest: Add a helper to get test device Yi Liu
2023-07-24 11:04 ` [PATCH v3 11/17] iommufd/selftest: Add IOMMU_TEST_OP_DEV_[ADD|DEL]_RESERVED to add/del reserved regions to selftest device Yi Liu
2023-07-24 11:04 ` [PATCH v3 12/17] iommufd/selftest: Add .get_resv_regions() for mock_dev Yi Liu
2023-07-24 11:04 ` [PATCH v3 13/17] iommufd/selftest: Add coverage for IOMMU_RESV_IOVA_RANGES Yi Liu
2023-07-24 11:04 ` [PATCH v3 14/17] iommufd/selftest: Add domain_alloc_user() support in iommu mock Yi Liu
2023-07-24 11:04 ` [PATCH v3 15/17] iommufd/selftest: Add coverage for IOMMU_HWPT_ALLOC with user data Yi Liu
2023-07-24 11:04 ` [PATCH v3 16/17] iommufd/selftest: Add IOMMU_TEST_OP_MD_CHECK_IOTLB test op Yi Liu
2023-07-24 11:04 ` [PATCH v3 17/17] iommufd/selftest: Add coverage for IOMMU_HWPT_INVALIDATE ioctl Yi Liu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230724110406.107212-1-yi.l.liu@intel.com \
    --to=yi.l.liu@intel.com \
    --cc=alex.williamson@redhat.com \
    --cc=baolu.lu@linux.intel.com \
    --cc=chao.p.peng@linux.intel.com \
    --cc=cohuck@redhat.com \
    --cc=eric.auger@redhat.com \
    --cc=iommu@lists.linux.dev \
    --cc=jasowang@redhat.com \
    --cc=jgg@nvidia.com \
    --cc=joro@8bytes.org \
    --cc=kevin.tian@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=lulu@redhat.com \
    --cc=mjrosato@linux.ibm.com \
    --cc=nicolinc@nvidia.com \
    --cc=peterx@redhat.com \
    --cc=robin.murphy@arm.com \
    --cc=shameerali.kolothum.thodi@huawei.com \
    --cc=suravee.suthikulpanit@amd.com \
    --cc=yi.y.sun@linux.intel.com \
    --cc=zhenzhong.duan@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.