From: Eric Auger <eric.auger@redhat.com>
To: eric.auger.pro@gmail.com, eric.auger@redhat.com,
iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org,
kvm@vger.kernel.org, kvmarm@lists.cs.columbia.edu,
joro@8bytes.org, alex.williamson@redhat.com,
jacob.jun.pan@linux.intel.com, yi.l.liu@linux.intel.com,
jean-philippe.brucker@arm.com, will.deacon@arm.com,
robin.murphy@arm.com
Cc: tianyu.lan@intel.com, ashok.raj@intel.com, marc.zyngier@arm.com,
christoffer.dall@arm.com, peter.maydell@linaro.org
Subject: [RFC v2 19/20] vfio: Document nested stage control
Date: Tue, 18 Sep 2018 16:24:56 +0200 [thread overview]
Message-ID: <20180918142457.3325-20-eric.auger@redhat.com> (raw)
In-Reply-To: <20180918142457.3325-1-eric.auger@redhat.com>
New iotcls were introduced to pass information about guest stage1
to the host through VFIO. Let's document the nested stage control.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
fault reporting is current missing to the picture
v1 -> v2:
- use the new ioctl names
- add doc related to fault handling
---
Documentation/vfio.txt | 60 ++++++++++++++++++++++++++++++++++++++++++
1 file changed, 60 insertions(+)
diff --git a/Documentation/vfio.txt b/Documentation/vfio.txt
index f1a4d3c3ba0b..d4611759d306 100644
--- a/Documentation/vfio.txt
+++ b/Documentation/vfio.txt
@@ -239,6 +239,66 @@ group and can access them as follows::
/* Gratuitous device reset and go... */
ioctl(device, VFIO_DEVICE_RESET);
+IOMMU Dual Stage Control
+------------------------
+
+Some IOMMUs support 2 stages/levels of translation. "Stage" corresponds to
+the ARM terminology while "level" corresponds to Intel's VTD terminology. In
+the following text we use either without distinction.
+
+This is useful when the guest is exposed with a virtual IOMMU and some
+devices are assigned to the guest through VFIO. Then the guest OS can use
+stage 1 (IOVA -> GPA), while the hypervisor uses stage 2 for VM isolation
+(GPA -> HPA).
+
+The guest gets ownership of the stage 1 page tables and also owns stage 1
+configuration structures. The hypervisor owns the root configuration structure
+(for security reason), including stage 2 configuration. This works as long
+configuration structures and page table format are compatible between the
+virtual IOMMU and the physical IOMMU.
+
+Assuming the HW supports it, this nested mode is selected by choosing the
+VFIO_TYPE1_NESTING_IOMMU type through:
+
+ioctl(container, VFIO_SET_IOMMU, VFIO_TYPE1_NESTING_IOMMU);
+
+This forces the hypervisor to use the stage 2, leaving stage 1 available for
+guest usage.
+
+Once groups are attached to the container, the guest stage 1 translation
+configuration data can be passed to VFIO by using
+
+ioctl(container, VFIO_IOMMU_BIND_PASID_TABLE, &pasid_table_info);
+
+This allows to combine guest stage 1 configuration structure along with
+hypervisor stage 2 configuration structure. stage 1 configuration structures
+are dependent on the IOMMU type.
+
+As the stage 1 translation is fully delegated to the HW, physical events that
+may occur (especially translation faults), need to be propagated up to
+the virtualizer and re-injected into the guest.
+
+By calling: ioctl(container, VFIO_IOMMU_SET_FAULT_EVENTFD &fault_config),
+the virtualizer can register an eventfd. This latter will be signalled
+whenever an event is detected at physical level. The fault handler,
+executed at userspace level has to call:
+ioctl(container, VFIO_IOMMU_GET_FAULT_EVENTS, &fault_info) to retrieve
+the pending fault events.
+
+When the guest invalidates stage 1 related caches, invalidations must be
+forwarded to the host through
+ioctl(container, VFIO_IOMMU_CACHE_INVALIDATE, &inv_data);
+Those invalidations can happen at various granularity levels, page, context, ...
+
+The ARM SMMU specification introduces another challenge: MSIs are translated by
+both the virtual SMMU and the physical SMMU. To build a nested mapping for the
+IOVA programmed into the assigned device, the guest needs to pass its IOVA/MSI
+doorbell GPA binding to the host. Then the hypervisor can build a nested stage 2
+binding eventually translating into the physical MSI doorbell.
+
+This is achieved by
+ioctl(container, VFIO_IOMMU_BIND_MSI, &guest_binding);
+
VFIO User API
-------------------------------------------------------------------------------
--
2.17.1
next prev parent reply other threads:[~2018-09-18 14:27 UTC|newest]
Thread overview: 33+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-09-18 14:24 [RFC v2 00/20] SMMUv3 Nested Stage Setup Eric Auger
2018-09-18 14:24 ` [RFC v2 01/20] iommu: Introduce bind_pasid_table API Eric Auger
2018-09-20 17:21 ` Jacob Pan
2018-09-21 9:45 ` Auger Eric
2018-09-18 14:24 ` [RFC v2 02/20] iommu: Introduce cache_invalidate API Eric Auger
2018-09-18 14:24 ` [RFC v2 03/20] iommu: Introduce bind_guest_msi Eric Auger
2018-09-18 14:24 ` [RFC v2 04/20] vfio: VFIO_IOMMU_BIND_PASID_TABLE Eric Auger
2018-09-18 14:24 ` [RFC v2 05/20] vfio: VFIO_IOMMU_CACHE_INVALIDATE Eric Auger
2018-09-18 14:24 ` [RFC v2 06/20] vfio: VFIO_IOMMU_BIND_MSI Eric Auger
2018-09-18 14:24 ` [RFC v2 07/20] iommu/arm-smmu-v3: Link domains and devices Eric Auger
2018-09-18 14:24 ` [RFC v2 08/20] iommu/arm-smmu-v3: Maintain a SID->device structure Eric Auger
2018-09-18 14:24 ` [RFC v2 09/20] iommu/smmuv3: Get prepared for nested stage support Eric Auger
2018-09-18 14:24 ` [RFC v2 10/20] iommu/smmuv3: Implement bind_pasid_table Eric Auger
2018-09-18 14:24 ` [RFC v2 11/20] iommu/smmuv3: Implement cache_invalidate Eric Auger
2018-09-18 14:24 ` [RFC v2 12/20] dma-iommu: Implement NESTED_MSI cookie Eric Auger
2018-10-24 18:02 ` Robin Murphy
2018-10-24 18:44 ` Auger Eric
2018-10-24 22:05 ` Robin Murphy
2018-10-27 9:24 ` Auger Eric
2018-09-18 14:24 ` [RFC v2 13/20] iommu/smmuv3: Implement bind_guest_msi Eric Auger
2018-09-18 14:24 ` [RFC v2 14/20] iommu: introduce device fault data Eric Auger
2018-09-20 22:06 ` Jacob Pan
2018-09-21 9:54 ` Auger Eric
2018-09-21 16:18 ` Jacob Pan
2018-12-12 8:21 ` Auger Eric
2018-12-15 0:30 ` Jacob Pan
2018-12-17 9:04 ` Auger Eric
2018-09-18 14:24 ` [RFC v2 15/20] driver core: add per device iommu param Eric Auger
2018-09-18 14:24 ` [RFC v2 16/20] iommu: introduce device fault report API Eric Auger
2018-09-18 14:24 ` [RFC v2 17/20] vfio: VFIO_IOMMU_SET_FAULT_EVENTFD Eric Auger
2018-09-18 14:24 ` [RFC v2 18/20] vfio: VFIO_IOMMU_GET_FAULT_EVENTS Eric Auger
2018-09-18 14:24 ` Eric Auger [this message]
2018-09-18 14:24 ` [RFC v2 20/20] iommu/smmuv3: Report non recoverable faults Eric Auger
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180918142457.3325-20-eric.auger@redhat.com \
--to=eric.auger@redhat.com \
--cc=alex.williamson@redhat.com \
--cc=ashok.raj@intel.com \
--cc=christoffer.dall@arm.com \
--cc=eric.auger.pro@gmail.com \
--cc=iommu@lists.linux-foundation.org \
--cc=jacob.jun.pan@linux.intel.com \
--cc=jean-philippe.brucker@arm.com \
--cc=joro@8bytes.org \
--cc=kvm@vger.kernel.org \
--cc=kvmarm@lists.cs.columbia.edu \
--cc=linux-kernel@vger.kernel.org \
--cc=marc.zyngier@arm.com \
--cc=peter.maydell@linaro.org \
--cc=robin.murphy@arm.com \
--cc=tianyu.lan@intel.com \
--cc=will.deacon@arm.com \
--cc=yi.l.liu@linux.intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).