linux-kernel.vger.kernel.org archive mirror
* [PATCH v8 0/7] KVM PCIe/MSI passthrough on ARM/ARM64: kernel part 3/3: vfio changes
@ 2016-04-28  8:28 Eric Auger
  2016-04-28  8:28 ` [PATCH v8 1/7] vfio: introduce a vfio_dma type field Eric Auger
                   ` (7 more replies)
  0 siblings, 8 replies; 10+ messages in thread
From: Eric Auger @ 2016-04-28  8:28 UTC
  To: eric.auger, eric.auger, robin.murphy, alex.williamson,
	will.deacon, joro, tglx, jason, marc.zyngier, christoffer.dall,
	linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

This series allows user space to register a reserved IOVA domain.
It completes the kernel integration of the whole functionality on top
of parts 1 & 2.

It also depends on [PATCH 1/3] iommu: Add MMIO mapping type series,
http://comments.gmane.org/gmane.linux.kernel.iommu/12869

We reuse the VFIO DMA MAP ioctl with a new flag to bridge to the
msi-iommu API. The need for provisioning such an MSI IOVA range is
reported through the VFIO_IOMMU_GET_INFO ioctl.
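
As an illustration only, user space could fill the map request along
these lines. This is a minimal sketch: struct demo_dma_map and the DEMO_*
names are local stand-ins that mirror struct vfio_iommu_type1_dma_map and
the flag bits from patch 4 of this series, and demo_fill_msi_request is a
hypothetical helper. Setting only the reserved flag (no READ/WRITE) is an
assumption. A real caller would pass the filled structure to the
VFIO_IOMMU_MAP_DMA ioctl on the container fd.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of struct vfio_iommu_type1_dma_map, for illustration only. */
struct demo_dma_map {
	uint32_t argsz;
	uint32_t flags;
	uint64_t vaddr;	/* ignored for a reserved MSI IOVA registration */
	uint64_t iova;
	uint64_t size;
};

/* Bit positions as they appear in patch 4 of this series. */
#define DEMO_DMA_MAP_FLAG_READ			(1u << 0)
#define DEMO_DMA_MAP_FLAG_WRITE			(1u << 1)
#define DEMO_DMA_MAP_FLAG_RESERVED_MSI_IOVA	(1u << 2)

/* Hypothetical helper: prepare a request registering [iova, iova + size). */
void demo_fill_msi_request(struct demo_dma_map *req,
			   uint64_t iova, uint64_t size)
{
	memset(req, 0, sizeof(*req));
	req->argsz = sizeof(*req);
	/* Assumption: READ/WRITE are not needed together with this flag. */
	req->flags = DEMO_DMA_MAP_FLAG_RESERVED_MSI_IOVA;
	req->iova = iova;
	req->size = size;
}
```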

vfio_iommu_type1 checks if the MSI mapping is safe when attaching the
vfio group to the container (allow_unsafe_interrupts modality).

On ARM/ARM64, the IOMMU does not abstract IRQ remapping; the modality is
abstracted on the MSI controller side. The GICv3 ITS is the first
controller advertising the modality.

More details & context can be found at:
http://www.linaro.org/blog/core-dump/kvm-pciemsi-passthrough-armarm64/

Best Regards

Eric

Testing:
- functional on ARM64 AMD Overdrive HW (single GICv2m frame) with
  Intel X540-T2 (SR-IOV capable)
- Not tested: ARM GICv3 ITS

References:
[1] [RFC 0/2] VFIO: Add virtual MSI doorbell support
    (https://lkml.org/lkml/2015/7/24/135)
[2] [RFC PATCH 0/6] vfio: Add interface to map MSI pages
    (https://lists.cs.columbia.edu/pipermail/kvmarm/2015-September/016607.html)
[3] [PATCH v2 0/3] Introduce MSI hardware mapping for VFIO
    (http://permalink.gmane.org/gmane.comp.emulators.kvm.arm.devel/3858)

Git: complete series available at
https://git.linaro.org/people/eric.auger/linux.git/shortlog/refs/heads/v4.6-rc5-pcie-passthrough-v8

previous version at
https://git.linaro.org/people/eric.auger/linux.git/shortlog/refs/heads/v4.6-rc4-pcie-passthrough-v7

QEMU Integration:
[RFC v2 0/8] KVM PCI/MSI passthrough with mach-virt
(http://lists.gnu.org/archive/html/qemu-arm/2016-01/msg00444.html)
https://git.linaro.org/people/eric.auger/qemu.git/shortlog/refs/heads/v2.5.0-pci-passthrough-rfc-v2

History:
v7 -> v8:
- use renamed msi-iommu API
- VFIO only responsible for setting the IOVA aperture
- use new DOMAIN_ATTR_MSI_GEOMETRY iommu domain attribute

v6 -> v7:
- vfio_find_dma now accepts a dma_type argument.
- should have recovered the capability to unmap the whole user IOVA range
- remove computation of nb IOVA pages -> will post a separate RFC for that
  while respinning the QEMU part

RFC v5 -> patch v6:
- split to ease the review process

RFC v4 -> RFC v5:
- take into account Thomas' comments on MSI related patches
  - split "msi: IOMMU map the doorbell address when needed"
  - increase readability and add comments
  - fix style issues
 - split "iommu: Add DOMAIN_ATTR_MSI_MAPPING attribute"
 - platform ITS now advertises IOMMU_CAP_INTR_REMAP
 - fix compilation issue with CONFIG_IOMMU API unset
 - arm-smmu-v3 now advertises DOMAIN_ATTR_MSI_MAPPING

RFC v3 -> v4:
- Move doorbell mapping/unmapping in msi.c
- fix ref count issue on set_affinity: in case of a change in the address
  the previous address is decremented
- doorbell map/unmap now is done on msi composition. Should allow the use
  case for platform MSI controllers
- create dma-reserved-iommu.h/c exposing/implementing a new API dedicated
  to reserved IOVA management (looking like dma-iommu glue)
- series reordering to ease the review:
  - first part is related to IOMMU
  - second related to MSI sub-system
  - third related to VFIO (except arm-smmu IOMMU_CAP_INTR_REMAP removal)
- expose the number of requested IOVA pages through VFIO_IOMMU_GET_INFO
  [this partially addresses Marc's comments on iommu_get/put_single_reserved
   size/alignment problematic - which I did not ignore - but I don't know
   how much I can do at the moment]

RFC v2 -> RFC v3:
- should fix wrong handling of some CONFIG combinations:
  CONFIG_IOVA, CONFIG_IOMMU_API, CONFIG_PCI_MSI_IRQ_DOMAIN
- fix MSI_FLAG_IRQ_REMAPPING setting in GICv3 ITS (although not tested)

PATCH v1 -> RFC v2:
- reverted to RFC since it looks more reasonable ;-) the code is split
  between VFIO, IOMMU, MSI controller and I am not sure I made the right
  choices. Also the API needs to be further discussed.
- iova API usage in arm-smmu.c.
- MSI controller natively programs the MSI addr with either the PA or IOVA.
  This is not done anymore in vfio-pci driver as suggested by Alex.
- check irq remapping capability of the group

RFC v1 [2] -> PATCH v1:
- use the existing dma map/unmap ioctl interface with a flag to register a
  reserved IOVA range. Use the legacy RB tree to store this special vfio_dma.
- a single reserved IOVA contiguous region now is allowed
- use of an RB tree indexed by PA to store allocated reserved slots
- use of a vfio_domain iova_domain to manage iova allocation within the
  window provided by the userspace
- vfio alloc_map/unmap_free take a vfio_group handle
- vfio_group handle is cached in vfio_pci_device
- add ref counting to bindings
- user modality enabled at the end of the series


Eric Auger (7):
  vfio: introduce a vfio_dma type field
  vfio/type1: vfio_find_dma accepting a type argument
  vfio/type1: bypass unmap/unpin and replay for VFIO_IOVA_RESERVED slots
  vfio: allow reserved msi iova registration
  vfio/type1: also check IRQ remapping capability at msi domain
  iommu/arm-smmu: do not advertise IOMMU_CAP_INTR_REMAP
  vfio/type1: return MSI mapping requirements with VFIO_IOMMU_GET_INFO

 drivers/iommu/arm-smmu-v3.c     |   3 +-
 drivers/iommu/arm-smmu.c        |   3 +-
 drivers/vfio/vfio_iommu_type1.c | 227 +++++++++++++++++++++++++++++++++++++---
 include/uapi/linux/vfio.h       |  14 ++-
 4 files changed, 230 insertions(+), 17 deletions(-)

-- 
1.9.1


* [PATCH v8 1/7] vfio: introduce a vfio_dma type field
  2016-04-28  8:28 [PATCH v8 0/7] KVM PCIe/MSI passthrough on ARM/ARM64: kernel part 3/3: vfio changes Eric Auger
@ 2016-04-28  8:28 ` Eric Auger
  2016-04-28  8:28 ` [PATCH v8 2/7] vfio/type1: vfio_find_dma accepting a type argument Eric Auger
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Eric Auger @ 2016-04-28  8:28 UTC
  To: eric.auger, eric.auger, robin.murphy, alex.williamson,
	will.deacon, joro, tglx, jason, marc.zyngier, christoffer.dall,
	linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

We introduce a vfio_dma type since we will need to discriminate
different types of dma slots:
- VFIO_IOVA_USER: IOVA region used to map user vaddr
- VFIO_IOVA_RESERVED: IOVA region reserved to map host device PA such
  as MSI doorbells

Signed-off-by: Eric Auger <eric.auger@linaro.org>

---

v6 -> v7:
- add VFIO_IOVA_ANY
- do not introduce yet any VFIO_IOVA_RESERVED handling
---
 drivers/vfio/vfio_iommu_type1.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 75b24e9..aaf5a6c 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -53,6 +53,16 @@ module_param_named(disable_hugepages,
 MODULE_PARM_DESC(disable_hugepages,
 		 "Disable VFIO IOMMU support for IOMMU hugepages.");
 
+enum vfio_iova_type {
+	VFIO_IOVA_USER = 0,	/* standard IOVA used to map user vaddr */
+	/*
+	 * IOVA reserved to map special host physical addresses,
+	 * MSI frames for instance
+	 */
+	VFIO_IOVA_RESERVED,
+	VFIO_IOVA_ANY,		/* matches any IOVA type */
+};
+
 struct vfio_iommu {
 	struct list_head	domain_list;
 	struct mutex		lock;
@@ -75,6 +85,7 @@ struct vfio_dma {
 	unsigned long		vaddr;		/* Process virtual addr */
 	size_t			size;		/* Map size (bytes) */
 	int			prot;		/* IOMMU_READ/WRITE */
+	enum vfio_iova_type	type;		/* type of IOVA */
 };
 
 struct vfio_group {
-- 
1.9.1


* [PATCH v8 2/7] vfio/type1: vfio_find_dma accepting a type argument
  2016-04-28  8:28 [PATCH v8 0/7] KVM PCIe/MSI passthrough on ARM/ARM64: kernel part 3/3: vfio changes Eric Auger
  2016-04-28  8:28 ` [PATCH v8 1/7] vfio: introduce a vfio_dma type field Eric Auger
@ 2016-04-28  8:28 ` Eric Auger
  2016-04-28  8:28 ` [PATCH v8 3/7] vfio/type1: bypass unmap/unpin and replay for VFIO_IOVA_RESERVED slots Eric Auger
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Eric Auger @ 2016-04-28  8:28 UTC
  To: eric.auger, eric.auger, robin.murphy, alex.williamson,
	will.deacon, joro, tglx, jason, marc.zyngier, christoffer.dall,
	linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

In our RB-tree we now have slots of different types (USER and RESERVED).
It becomes useful to be able to search for dma slots of a specific type or
any type. This patch proposes an implementation for that modality and also
changes the existing callers using the USER type.
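
The typed lookup can be pictured with a standalone model. The sketch
below is not the patch code: it uses a plain binary search tree (struct
demo_slot, hypothetical) instead of the kernel rb tree, assumes slots
never overlap each other, and assumes a non-zero window size. When the
slot intersecting the window has the wrong type, the parts of the window
on either side of that slot are searched in its subtrees.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum demo_iova_type {
	DEMO_IOVA_USER = 0,
	DEMO_IOVA_RESERVED,
	DEMO_IOVA_ANY,		/* matches any slot type */
};

/* Hypothetical slot: covers [iova, iova + size), ordered by iova. */
struct demo_slot {
	uint64_t iova;
	uint64_t size;
	enum demo_iova_type type;
	struct demo_slot *left;
	struct demo_slot *right;
};

/* Return a slot of the requested type intersecting the window, or NULL. */
struct demo_slot *demo_find_slot(struct demo_slot *node, uint64_t start,
				 uint64_t size, enum demo_iova_type type)
{
	struct demo_slot *found;

	/* Regular descent: stop at the first slot intersecting the window. */
	while (node) {
		if (start + size <= node->iova)
			node = node->left;
		else if (start >= node->iova + node->size)
			node = node->right;
		else
			break;
	}
	if (!node)
		return NULL;

	if (type == DEMO_IOVA_ANY || node->type == type)
		return node;

	/* Wrong type: the window may still reach slots on either side. */
	if (start < node->iova) {
		found = demo_find_slot(node->left, start, size, type);
		if (found)
			return found;
	}
	if (start + size > node->iova + node->size)
		return demo_find_slot(node->right, start, size, type);

	return NULL;
}
```

A window fully contained in a slot of the wrong type yields NULL, since
no other slot can intersect it.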

Signed-off-by: Eric Auger <eric.auger@linaro.org>
---
 drivers/vfio/vfio_iommu_type1.c | 63 ++++++++++++++++++++++++++++++++++-------
 1 file changed, 53 insertions(+), 10 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index aaf5a6c..2d769d4 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -98,23 +98,64 @@ struct vfio_group {
  * into DMA'ble space using the IOMMU
  */
 
-static struct vfio_dma *vfio_find_dma(struct vfio_iommu *iommu,
-				      dma_addr_t start, size_t size)
+/**
+ * vfio_find_dma_from_node: looks for a dma slot intersecting a window
+ * from a given rb tree node
+ * @top: top rb tree node where the search starts (including this node)
+ * @start: window start
+ * @size: window size
+ * @type: window type
+ */
+static struct vfio_dma *vfio_find_dma_from_node(struct rb_node *top,
+						dma_addr_t start, size_t size,
+						enum vfio_iova_type type)
 {
-	struct rb_node *node = iommu->dma_list.rb_node;
+	struct rb_node *node = top;
+	struct vfio_dma *dma;
 
 	while (node) {
-		struct vfio_dma *dma = rb_entry(node, struct vfio_dma, node);
-
+		dma = rb_entry(node, struct vfio_dma, node);
 		if (start + size <= dma->iova)
 			node = node->rb_left;
 		else if (start >= dma->iova + dma->size)
 			node = node->rb_right;
 		else
+			break;
+	}
+	if (!node)
+		return NULL;
+
+	/* a dma slot intersects our window, check the type also matches */
+	if (type == VFIO_IOVA_ANY || dma->type == type)
+		return dma;
+
+	/* restart 2 searches skipping the current node */
+	if (start < dma->iova) {
+		dma = vfio_find_dma_from_node(node->rb_left, start,
+					      size, type);
+		if (dma)
 			return dma;
 	}
+	if (start + size > dma->iova + dma->size)
+		dma = vfio_find_dma_from_node(node->rb_right, start,
+					      size, type);
+	return dma;
+}
+
+/**
+ * vfio_find_dma: find a dma slot intersecting a given window
+ * @iommu: vfio iommu handle
+ * @start: window base iova
+ * @size: window size
+ * @type: window type
+ */
+static struct vfio_dma *vfio_find_dma(struct vfio_iommu *iommu,
+				      dma_addr_t start, size_t size,
+				      enum vfio_iova_type type)
+{
+	struct rb_node *top_node = iommu->dma_list.rb_node;
 
-	return NULL;
+	return vfio_find_dma_from_node(top_node, start, size, type);
 }
 
 static void vfio_link_dma(struct vfio_iommu *iommu, struct vfio_dma *new)
@@ -488,19 +529,21 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
 	 * mappings within the range.
 	 */
 	if (iommu->v2) {
-		dma = vfio_find_dma(iommu, unmap->iova, 0);
+		dma = vfio_find_dma(iommu, unmap->iova, 0, VFIO_IOVA_USER);
 		if (dma && dma->iova != unmap->iova) {
 			ret = -EINVAL;
 			goto unlock;
 		}
-		dma = vfio_find_dma(iommu, unmap->iova + unmap->size - 1, 0);
+		dma = vfio_find_dma(iommu, unmap->iova + unmap->size - 1, 0,
+				    VFIO_IOVA_USER);
 		if (dma && dma->iova + dma->size != unmap->iova + unmap->size) {
 			ret = -EINVAL;
 			goto unlock;
 		}
 	}
 
-	while ((dma = vfio_find_dma(iommu, unmap->iova, unmap->size))) {
+	while ((dma = vfio_find_dma(iommu, unmap->iova, unmap->size,
+				    VFIO_IOVA_USER))) {
 		if (!iommu->v2 && unmap->iova > dma->iova)
 			break;
 		unmapped += dma->size;
@@ -604,7 +647,7 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
 
 	mutex_lock(&iommu->lock);
 
-	if (vfio_find_dma(iommu, iova, size)) {
+	if (vfio_find_dma(iommu, iova, size, VFIO_IOVA_ANY)) {
 		mutex_unlock(&iommu->lock);
 		return -EEXIST;
 	}
-- 
1.9.1


* [PATCH v8 3/7] vfio/type1: bypass unmap/unpin and replay for VFIO_IOVA_RESERVED slots
  2016-04-28  8:28 [PATCH v8 0/7] KVM PCIe/MSI passthrough on ARM/ARM64: kernel part 3/3: vfio changes Eric Auger
  2016-04-28  8:28 ` [PATCH v8 1/7] vfio: introduce a vfio_dma type field Eric Auger
  2016-04-28  8:28 ` [PATCH v8 2/7] vfio/type1: vfio_find_dma accepting a type argument Eric Auger
@ 2016-04-28  8:28 ` Eric Auger
  2016-04-28  8:28 ` [PATCH v8 4/7] vfio: allow reserved msi iova registration Eric Auger
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Eric Auger @ 2016-04-28  8:28 UTC
  To: eric.auger, eric.auger, robin.murphy, alex.williamson,
	will.deacon, joro, tglx, jason, marc.zyngier, christoffer.dall,
	linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

Before allowing the end-user to create VFIO_IOVA_RESERVED dma slots,
let's implement the expected behavior for removal and replay. As opposed
to user dma slots, IOVAs are not systematically bound to PAs and PAs are
not pinned. VFIO just initializes the IOVA "aperture". IOVAs are allocated
outside of the VFIO framework, typically by the MSI layer, which is
responsible for freeing and unmapping them. The MSI mapping resources are
freed by the IOMMU driver on domain destruction.

Signed-off-by: Eric Auger <eric.auger@linaro.org>

---

v7 -> v8:
- do not destroy anything anymore, just bypass unmap/unpin and iommu_map
  on replay
---
 drivers/vfio/vfio_iommu_type1.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 2d769d4..94a9916 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -391,7 +391,7 @@ static void vfio_unmap_unpin(struct vfio_iommu *iommu, struct vfio_dma *dma)
 	struct vfio_domain *domain, *d;
 	long unlocked = 0;
 
-	if (!dma->size)
+	if (!dma->size || dma->type != VFIO_IOVA_USER)
 		return;
 	/*
 	 * We use the IOMMU to track the physical addresses, otherwise we'd
@@ -727,6 +727,9 @@ static int vfio_iommu_replay(struct vfio_iommu *iommu,
 		dma = rb_entry(n, struct vfio_dma, node);
 		iova = dma->iova;
 
+		if (dma->type == VFIO_IOVA_RESERVED)
+			continue;
+
 		while (iova < dma->iova + dma->size) {
 			phys_addr_t phys = iommu_iova_to_phys(d->domain, iova);
 			size_t size;
-- 
1.9.1


* [PATCH v8 4/7] vfio: allow reserved msi iova registration
  2016-04-28  8:28 [PATCH v8 0/7] KVM PCIe/MSI passthrough on ARM/ARM64: kernel part 3/3: vfio changes Eric Auger
                   ` (2 preceding siblings ...)
  2016-04-28  8:28 ` [PATCH v8 3/7] vfio/type1: bypass unmap/unpin and replay for VFIO_IOVA_RESERVED slots Eric Auger
@ 2016-04-28  8:28 ` Eric Auger
  2016-04-28  8:28 ` [PATCH v8 5/7] vfio/type1: also check IRQ remapping capability at msi domain Eric Auger
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Eric Auger @ 2016-04-28  8:28 UTC
  To: eric.auger, eric.auger, robin.murphy, alex.williamson,
	will.deacon, joro, tglx, jason, marc.zyngier, christoffer.dall,
	linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

The user is allowed to register a reserved MSI IOVA range by using the
DMA MAP API and setting the new flag: VFIO_DMA_MAP_FLAG_RESERVED_MSI_IOVA.
This region is stored in the vfio_dma rb tree. At that point the IOVA
range is not mapped to any target address yet. The host kernel will use
those IOVAs when needed, typically when MSIs are allocated.
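
The sanity checks applied to a candidate range can be sketched in
isolation as follows. This is a simplified user-space model, assuming
demo_check_msi_range (a hypothetical name) receives the IOMMU page-size
bitmap directly. It rejects empty, misaligned and wrapping ranges and
computes the inclusive last address of the aperture. The zero-bitmap
guard is an addition of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical model of the range checks: returns 0 and stores the
 * inclusive last address in *last, or -EINVAL if the range is rejected.
 */
int demo_check_msi_range(uint64_t iova, uint64_t size,
			 uint64_t pgsize_bitmap, uint64_t *last)
{
	uint64_t min_pgsize, mask;

	/* Guard added by this sketch: no supported page size at all. */
	if (!pgsize_bitmap)
		return -EINVAL;

	/* Isolate the lowest set bit, i.e. the smallest page size. */
	min_pgsize = pgsize_bitmap & (~pgsize_bitmap + 1);
	mask = min_pgsize - 1;

	/* Empty range, or iova/size not aligned on the smallest page. */
	if (!size || ((size | iova) & mask))
		return -EINVAL;

	/* Don't allow IOVA address wrap. */
	if (iova + size - 1 < iova)
		return -EINVAL;

	*last = iova + size - 1;
	return 0;
}
```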

Signed-off-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>

---
v7 -> v8:
- use iommu_msi_set_aperture function. There is no notion of
  unregistration anymore since the reserved msi slot remains
  until the container gets closed.

v6 -> v7:
- use iommu_free_reserved_iova_domain
- convey prot attributes down to dma-reserved-iommu iova domain creation
- reserved bindings teardown now performed on iommu domain destruction
- rename VFIO_DMA_MAP_FLAG_MSI_RESERVED_IOVA into
         VFIO_DMA_MAP_FLAG_RESERVED_MSI_IOVA
- change title
- pass the protection attribute to dma-reserved-iommu API

v3 -> v4:
- use iommu_alloc/free_reserved_iova_domain exported by dma-reserved-iommu
- protect vfio_register_reserved_iova_range implementation with
  CONFIG_IOMMU_DMA_RESERVED
- handle unregistration by user-space and on vfio_iommu_type1 release

v1 -> v2:
- set returned value according to alloc_reserved_iova_domain result
- free the iova domains in case any error occurs

RFC v1 -> v1:
- takes into account Alex's comments, based on
  [RFC PATCH 1/6] vfio: Add interface for add/del reserved iova region:
- use the existing dma map/unmap ioctl interface with a flag to register
  a reserved IOVA range. A single reserved iova region is allowed.
---
 drivers/vfio/vfio_iommu_type1.c | 78 ++++++++++++++++++++++++++++++++++++++++-
 include/uapi/linux/vfio.h       | 10 +++++-
 2 files changed, 86 insertions(+), 2 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 94a9916..4d3a6f1 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -36,6 +36,7 @@
 #include <linux/uaccess.h>
 #include <linux/vfio.h>
 #include <linux/workqueue.h>
+#include <linux/msi-iommu.h>
 
 #define DRIVER_VERSION  "0.2"
 #define DRIVER_AUTHOR   "Alex Williamson <alex.williamson@redhat.com>"
@@ -445,6 +446,20 @@ static void vfio_unmap_unpin(struct vfio_iommu *iommu, struct vfio_dma *dma)
 	vfio_lock_acct(-unlocked);
 }
 
+static int vfio_set_msi_aperture(struct vfio_iommu *iommu,
+				dma_addr_t iova, size_t size)
+{
+	struct vfio_domain *d;
+	int ret = 0;
+
+	list_for_each_entry(d, &iommu->domain_list, next) {
+		ret = iommu_msi_set_aperture(d->domain, iova, iova + size - 1);
+		if (ret)
+			break;
+	}
+	return ret;
+}
+
 static void vfio_remove_dma(struct vfio_iommu *iommu, struct vfio_dma *dma)
 {
 	vfio_unmap_unpin(iommu, dma);
@@ -693,6 +708,63 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
 	return ret;
 }
 
+static int vfio_register_msi_range(struct vfio_iommu *iommu,
+				   struct vfio_iommu_type1_dma_map *map)
+{
+	dma_addr_t iova = map->iova;
+	size_t size = map->size;
+	int ret = 0;
+	struct vfio_dma *dma;
+	unsigned long order;
+	uint64_t mask;
+
+	/* Verify that none of our __u64 fields overflow */
+	if (map->size != size || map->iova != iova)
+		return -EINVAL;
+
+	order =  __ffs(vfio_pgsize_bitmap(iommu));
+	mask = ((uint64_t)1 << order) - 1;
+
+	WARN_ON(mask & PAGE_MASK);
+
+	if (!size || (size | iova) & mask)
+		return -EINVAL;
+
+	/* Don't allow IOVA address wrap */
+	if (iova + size - 1 < iova)
+		return -EINVAL;
+
+	mutex_lock(&iommu->lock);
+
+	if (vfio_find_dma(iommu, iova, size, VFIO_IOVA_ANY)) {
+		ret =  -EEXIST;
+		goto unlock;
+	}
+
+	dma = kzalloc(sizeof(*dma), GFP_KERNEL);
+	if (!dma) {
+		ret = -ENOMEM;
+		goto unlock;
+	}
+
+	dma->iova = iova;
+	dma->size = size;
+	dma->type = VFIO_IOVA_RESERVED;
+
+	ret = vfio_set_msi_aperture(iommu, iova, size);
+	if (ret)
+		goto free_unlock;
+
+	vfio_link_dma(iommu, dma);
+	goto unlock;
+
+free_unlock:
+	kfree(dma);
+unlock:
+	mutex_unlock(&iommu->lock);
+	return ret;
+}
+
 static int vfio_bus_type(struct device *dev, void *data)
 {
 	struct bus_type **bus = data;
@@ -1062,7 +1134,8 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
 	} else if (cmd == VFIO_IOMMU_MAP_DMA) {
 		struct vfio_iommu_type1_dma_map map;
 		uint32_t mask = VFIO_DMA_MAP_FLAG_READ |
-				VFIO_DMA_MAP_FLAG_WRITE;
+				VFIO_DMA_MAP_FLAG_WRITE |
+				VFIO_DMA_MAP_FLAG_RESERVED_MSI_IOVA;
 
 		minsz = offsetofend(struct vfio_iommu_type1_dma_map, size);
 
@@ -1072,6 +1145,9 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
 		if (map.argsz < minsz || map.flags & ~mask)
 			return -EINVAL;
 
+		if (map.flags & VFIO_DMA_MAP_FLAG_RESERVED_MSI_IOVA)
+			return vfio_register_msi_range(iommu, &map);
+
 		return vfio_dma_do_map(iommu, &map);
 
 	} else if (cmd == VFIO_IOMMU_UNMAP_DMA) {
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 255a211..4a9dbc2 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -498,12 +498,19 @@ struct vfio_iommu_type1_info {
  *
  * Map process virtual addresses to IO virtual addresses using the
  * provided struct vfio_dma_map. Caller sets argsz. READ &/ WRITE required.
+ *
+ * In case RESERVED_MSI_IOVA flag is set, the API only aims at registering an
+ * IOVA region that will be used on some platforms to map the host MSI frames.
+ * In that specific case, vaddr is ignored. Once registered, an MSI reserved
+ * IOVA region stays until the container is closed.
  */
 struct vfio_iommu_type1_dma_map {
 	__u32	argsz;
 	__u32	flags;
 #define VFIO_DMA_MAP_FLAG_READ (1 << 0)		/* readable from device */
 #define VFIO_DMA_MAP_FLAG_WRITE (1 << 1)	/* writable from device */
+/* reserved iova for MSI vectors*/
+#define VFIO_DMA_MAP_FLAG_RESERVED_MSI_IOVA (1 << 2)
 	__u64	vaddr;				/* Process virtual address */
 	__u64	iova;				/* IO virtual address */
 	__u64	size;				/* Size of mapping (bytes) */
@@ -519,7 +526,8 @@ struct vfio_iommu_type1_dma_map {
  * Caller sets argsz.  The actual unmapped size is returned in the size
  * field.  No guarantee is made to the user that arbitrary unmaps of iova
  * or size different from those used in the original mapping call will
- * succeed.
+ * succeed. Once registered, an MSI region cannot be unmapped and stays
+ * until the container is closed.
  */
 struct vfio_iommu_type1_dma_unmap {
 	__u32	argsz;
-- 
1.9.1


* [PATCH v8 5/7] vfio/type1: also check IRQ remapping capability at msi domain
  2016-04-28  8:28 [PATCH v8 0/7] KVM PCIe/MSI passthrough on ARM/ARM64: kernel part 3/3: vfio changes Eric Auger
                   ` (3 preceding siblings ...)
  2016-04-28  8:28 ` [PATCH v8 4/7] vfio: allow reserved msi iova registration Eric Auger
@ 2016-04-28  8:28 ` Eric Auger
  2016-04-28  8:28 ` [PATCH v8 6/7] iommu/arm-smmu: do not advertise IOMMU_CAP_INTR_REMAP Eric Auger
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Eric Auger @ 2016-04-28  8:28 UTC
  To: eric.auger, eric.auger, robin.murphy, alex.williamson,
	will.deacon, joro, tglx, jason, marc.zyngier, christoffer.dall,
	linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

On x86 IRQ remapping is abstracted by the IOMMU. On ARM this is abstracted
by the MSI controller. vfio_safe_irq_domain checks whether interrupts are
"safe" for a given device. They are if the device does not use MSI or if
the device uses MSI and the msi-parent controller supports IRQ remapping.

Then we check at group level if all devices have safe interrupts: if not,
we only allow the group to be attached if allow_unsafe_interrupts is set.
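
The resulting attach policy boils down to a small predicate. The model
below is a sketch with hypothetical types (struct demo_dev,
demo_group_may_attach); it only mirrors the decision, not the kernel
data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device view of the MSI setup. */
struct demo_dev {
	bool has_msi_domain;	/* device is attached to an MSI irq domain */
	bool msi_irq_remapping;	/* that domain advertises IRQ remapping */
};

/* A device is safe if it has no MSI domain or the domain remaps IRQs. */
bool demo_dev_is_safe(const struct demo_dev *dev)
{
	return !dev->has_msi_domain || dev->msi_irq_remapping;
}

/*
 * The group may be attached if the user opted into unsafe interrupts,
 * if the IOMMU itself remaps interrupts, or if every device is safe.
 */
bool demo_group_may_attach(const struct demo_dev *devs, size_t nr_devs,
			   bool iommu_intr_remap,
			   bool allow_unsafe_interrupts)
{
	bool safe_irq_domains = true;
	size_t i;

	for (i = 0; i < nr_devs; i++) {
		if (!demo_dev_is_safe(&devs[i])) {
			safe_irq_domains = false;
			break;
		}
	}

	return allow_unsafe_interrupts || iommu_intr_remap ||
	       safe_irq_domains;
}
```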

At this point the ARM SMMU still advertises IOMMU_CAP_INTR_REMAP. This is
changed in the next patch.

Signed-off-by: Eric Auger <eric.auger@linaro.org>

---
v3 -> v4:
- rename vfio_msi_parent_irq_remapping_capable into vfio_safe_irq_domain
  and irq_remapping into safe_irq_domains

v2 -> v3:
- protect vfio_msi_parent_irq_remapping_capable with
  CONFIG_GENERIC_MSI_IRQ_DOMAIN
---
 drivers/vfio/vfio_iommu_type1.c | 44 +++++++++++++++++++++++++++++++++++++++--
 1 file changed, 42 insertions(+), 2 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 4d3a6f1..2fc8197 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -37,6 +37,8 @@
 #include <linux/vfio.h>
 #include <linux/workqueue.h>
 #include <linux/msi-iommu.h>
+#include <linux/irqdomain.h>
+#include <linux/msi.h>
 
 #define DRIVER_VERSION  "0.2"
 #define DRIVER_AUTHOR   "Alex Williamson <alex.williamson@redhat.com>"
@@ -777,6 +779,33 @@ static int vfio_bus_type(struct device *dev, void *data)
 	return 0;
 }
 
+/**
+ * vfio_safe_irq_domain: returns whether the irq domain
+ * the device is attached to is safe with respect to MSI isolation.
+ * If the irq domain is not an MSI domain, we return it is safe.
+ *
+ * @dev: device handle
+ * @data: unused
+ * returns 0 if the irq domain is safe, -1 if not.
+ */
+static int vfio_safe_irq_domain(struct device *dev, void *data)
+{
+#ifdef CONFIG_GENERIC_MSI_IRQ_DOMAIN
+	struct irq_domain *domain;
+	struct msi_domain_info *info;
+
+	domain = dev_get_msi_domain(dev);
+	if (!domain)
+		return 0;
+
+	info = msi_get_domain_info(domain);
+
+	if (!(info->flags & MSI_FLAG_IRQ_REMAPPING))
+		return -1;
+#endif
+	return 0;
+}
+
 static int vfio_iommu_replay(struct vfio_iommu *iommu,
 			     struct vfio_domain *domain)
 {
@@ -870,7 +899,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	struct vfio_group *group, *g;
 	struct vfio_domain *domain, *d;
 	struct bus_type *bus = NULL;
-	int ret;
+	int ret, safe_irq_domains;
 
 	mutex_lock(&iommu->lock);
 
@@ -893,6 +922,13 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 
 	group->iommu_group = iommu_group;
 
+	/*
+	 * Determine if all the devices of the group have a safe irq domain
+	 * with respect to MSI isolation
+	 */
+	safe_irq_domains = !iommu_group_for_each_dev(iommu_group, &bus,
+				       vfio_safe_irq_domain);
+
 	/* Determine bus_type in order to allocate a domain */
 	ret = iommu_group_for_each_dev(iommu_group, &bus, vfio_bus_type);
 	if (ret)
@@ -920,8 +956,12 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	INIT_LIST_HEAD(&domain->group_list);
 	list_add(&group->next, &domain->group_list);
 
+	/*
+	 * to advertise safe interrupts either the IOMMU or the MSI controllers
+	 * must support IRQ remapping/interrupt translation
+	 */
 	if (!allow_unsafe_interrupts &&
-	    !iommu_capable(bus, IOMMU_CAP_INTR_REMAP)) {
+	    (!iommu_capable(bus, IOMMU_CAP_INTR_REMAP) && !safe_irq_domains)) {
 		pr_warn("%s: No interrupt remapping support.  Use the module param \"allow_unsafe_interrupts\" to enable VFIO IOMMU support on this platform\n",
 		       __func__);
 		ret = -EPERM;
-- 
1.9.1


* [PATCH v8 6/7] iommu/arm-smmu: do not advertise IOMMU_CAP_INTR_REMAP
  2016-04-28  8:28 [PATCH v8 0/7] KVM PCIe/MSI passthrough on ARM/ARM64: kernel part 3/3: vfio changes Eric Auger
                   ` (4 preceding siblings ...)
  2016-04-28  8:28 ` [PATCH v8 5/7] vfio/type1: also check IRQ remapping capability at msi domain Eric Auger
@ 2016-04-28  8:28 ` Eric Auger
  2016-04-28  8:28 ` [PATCH v8 7/7] vfio/type1: return MSI mapping requirements with VFIO_IOMMU_GET_INFO Eric Auger
  2016-05-04 11:17 ` [PATCH v8 0/7] KVM PCIe/MSI passthrough on ARM/ARM64: kernel part 3/3: vfio changes Yehuda Yitschak
  7 siblings, 0 replies; 10+ messages in thread
From: Eric Auger @ 2016-04-28  8:28 UTC
  To: eric.auger, eric.auger, robin.murphy, alex.williamson,
	will.deacon, joro, tglx, jason, marc.zyngier, christoffer.dall,
	linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

Do not advertise IOMMU_CAP_INTR_REMAP for arm-smmu(-v3). Indeed, the
IRQ remapping capability is abstracted on the irqchip side for ARM, as
opposed to the Intel IOMMU, which features IRQ remapping HW.

So to check IRQ remapping capability, the msi domain needs to be
checked instead.

This commit needs to be applied after "vfio/type1: also check IRQ
remapping capability at msi domain"; otherwise legacy interrupt
assignment breaks with arm-smmu.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
---
 drivers/iommu/arm-smmu-v3.c | 3 ++-
 drivers/iommu/arm-smmu.c    | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index b5d9826..778212c 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -1386,7 +1386,8 @@ static bool arm_smmu_capable(enum iommu_cap cap)
 	case IOMMU_CAP_CACHE_COHERENCY:
 		return true;
 	case IOMMU_CAP_INTR_REMAP:
-		return true; /* MSIs are just memory writes */
+		/* interrupt translation handled at MSI controller level */
+		return false;
 	case IOMMU_CAP_NOEXEC:
 		return true;
 	default:
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 0908985..e7c9e89 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1320,7 +1320,8 @@ static bool arm_smmu_capable(enum iommu_cap cap)
 		 */
 		return true;
 	case IOMMU_CAP_INTR_REMAP:
-		return true; /* MSIs are just memory writes */
+		/* interrupt translation handled at MSI controller level */
+		return false;
 	case IOMMU_CAP_NOEXEC:
 		return true;
 	default:
-- 
1.9.1


* [PATCH v8 7/7] vfio/type1: return MSI mapping requirements with VFIO_IOMMU_GET_INFO
  2016-04-28  8:28 [PATCH v8 0/7] KVM PCIe/MSI passthrough on ARM/ARM64: kernel part 3/3: vfio changes Eric Auger
                   ` (5 preceding siblings ...)
  2016-04-28  8:28 ` [PATCH v8 6/7] iommu/arm-smmu: do not advertise IOMMU_CAP_INTR_REMAP Eric Auger
@ 2016-04-28  8:28 ` Eric Auger
  2016-05-04 11:17 ` [PATCH v8 0/7] KVM PCIe/MSI passthrough on ARM/ARM64: kernel part 3/3: vfio changes Yehuda Yitschak
  7 siblings, 0 replies; 10+ messages in thread
From: Eric Auger @ 2016-04-28  8:28 UTC
  To: eric.auger, eric.auger, robin.murphy, alex.williamson,
	will.deacon, joro, tglx, jason, marc.zyngier, christoffer.dall,
	linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

This patch lets user space know whether MSI addresses need to be mapped
in the IOMMU. User space calls the VFIO_IOMMU_GET_INFO ioctl, and
VFIO_IOMMU_INFO_REQUIRE_MSI_MAP is set if they do.

The computation of the number of IOVA pages to be provided by the user
space will be implemented in a separate patch using capability chains.
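
A user-space consumer might interpret the returned info as sketched
below. The helper name demo_msi_window_alignment is hypothetical, and
the bit value of the new flag is passed in as a parameter rather than
assumed here. The returned alignment follows from the registration path,
which requires iova and size to be multiples of the smallest page size
in the bitmap.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: given the flags and iova_pgsizes fields returned
 * by VFIO_IOMMU_GET_INFO, return the alignment a reserved MSI IOVA
 * window must honour, or 0 if no window needs to be registered (or if
 * no page size is reported).
 *
 * require_msi_map_bit is the value of the "require MSI map" info flag;
 * it is a parameter because this sketch does not assume its bit position.
 */
uint64_t demo_msi_window_alignment(uint32_t info_flags,
				   uint64_t iova_pgsizes,
				   uint32_t require_msi_map_bit)
{
	if (!(info_flags & require_msi_map_bit))
		return 0;

	/* Lowest set bit of the bitmap is the smallest page size. */
	return iova_pgsizes & (~iova_pgsizes + 1);
}
```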

Signed-off-by: Eric Auger <eric.auger@linaro.org>

---
v6 -> v7:
- remove the computation of the number of IOVA pages to be provisioned.
  This number depends on the domain/group/device topology which can
  dynamically change. Let's rely instead on an arbitrary max depending
  on the system

v4 -> v5:
- move msi_info and ret declaration within the conditional code

v3 -> v4:
- replace former vfio_domains_require_msi_mapping by
  more complex computation of MSI mapping requirements, especially the
  number of pages to be provided by the user-space.
- reword patch title

RFC v1 -> v1:
- derived from
  [RFC PATCH 3/6] vfio: Extend iommu-info to return MSIs automap state
- renamed allow_msi_reconfig into require_msi_mapping
- fixed VFIO_IOMMU_GET_INFO
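
For illustration only (not part of the patch): a minimal sketch of how
user space might test the new flag once VFIO_IOMMU_GET_INFO has filled
the info structure. The struct and macro names prefixed with example_ /
EXAMPLE_ are hypothetical local mirrors of the layout and bit values
shown in this patch; they are used so the snippet builds without the
patched uapi header. A real program would include <linux/vfio.h> from a
kernel carrying this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical local mirror of struct vfio_iommu_type1_info as it
 * appears in this patch (argsz, flags, iova_pgsizes).
 */
struct example_iommu_info {
	uint32_t argsz;
	uint32_t flags;
	uint64_t iova_pgsizes;	/* bitmap of supported page sizes */
};

/* Bit values as defined by this patch; names are local assumptions. */
#define EXAMPLE_INFO_PGSIZES		(1u << 0)
#define EXAMPLE_INFO_REQUIRE_MSI_MAP	(1u << 1)

/*
 * Return true when the kernel reported that MSI doorbells must be
 * IOMMU mapped, i.e. user space has to provision a reserved IOVA range.
 *
 * In a real program, info would first be filled with:
 *	info.argsz = sizeof(info);
 *	ioctl(container_fd, VFIO_IOMMU_GET_INFO, &info);
 */
static bool example_msi_map_required(const struct example_iommu_info *info)
{
	assert(info != NULL);
	return (info->flags & EXAMPLE_INFO_REQUIRE_MSI_MAP) != 0;
}
```

If the helper returns true, the caller would go on to register a
reserved IOVA region through the DMA map ioctl, as described in the
vfio.h comment touched by this patch.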
---
 drivers/vfio/vfio_iommu_type1.c | 26 ++++++++++++++++++++++++++
 include/uapi/linux/vfio.h       |  4 ++++
 2 files changed, 30 insertions(+)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 2fc8197..86d97d9 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -310,6 +310,29 @@ static int vaddr_get_pfn(unsigned long vaddr, int prot, unsigned long *pfn)
 }
 
 /*
+ * vfio_domains_require_msi_mapping: return whether MSI doorbells must be
+ * iommu mapped
+ *
+ * returns true if msi mapping is requested
+ */
+static bool vfio_domains_require_msi_mapping(struct vfio_iommu *iommu)
+{
+	struct iommu_domain_msi_geometry msi_geometry;
+	struct vfio_domain *d;
+	bool flag;
+
+	mutex_lock(&iommu->lock);
+	/* All domains have same require_msi_map property, pick first */
+	d = list_first_entry(&iommu->domain_list, struct vfio_domain, next);
+	iommu_domain_get_attr(d->domain, DOMAIN_ATTR_MSI_GEOMETRY,
+			      &msi_geometry);
+	flag = msi_geometry.programmable;
+
+	mutex_unlock(&iommu->lock);
+
+	return flag;
+}
+/*
  * Attempt to pin pages.  We really don't want to track all the pfns and
  * the iommu can only map chunks of consecutive pfns anyway, so get the
  * first page and all consecutive pages with the same locking.
@@ -1166,6 +1189,9 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
 
 		info.flags = VFIO_IOMMU_INFO_PGSIZES;
 
+		if (vfio_domains_require_msi_mapping(iommu))
+			info.flags |= VFIO_IOMMU_INFO_REQUIRE_MSI_MAP;
+
 		info.iova_pgsizes = vfio_pgsize_bitmap(iommu);
 
 		return copy_to_user((void __user *)arg, &info, minsz) ?
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 4a9dbc2..3e27263 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -488,6 +488,7 @@ struct vfio_iommu_type1_info {
 	__u32	argsz;
 	__u32	flags;
 #define VFIO_IOMMU_INFO_PGSIZES (1 << 0)	/* supported page sizes info */
+#define VFIO_IOMMU_INFO_REQUIRE_MSI_MAP (1 << 1)/* MSI must be mapped */
 	__u64	iova_pgsizes;		/* Bitmap of supported page sizes */
 };
 
@@ -503,6 +504,9 @@ struct vfio_iommu_type1_info {
  * IOVA region that will be used on some platforms to map the host MSI frames.
  * In that specific case, vaddr is ignored. Once registered, an MSI reserved
  * IOVA region stays until the container is closed.
+ * The requirement for provisioning such reserved IOVA range can be checked by
+ * calling VFIO_IOMMU_GET_INFO and testing the VFIO_IOMMU_INFO_REQUIRE_MSI_MAP
+ * flag.
  */
 struct vfio_iommu_type1_dma_map {
 	__u32	argsz;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* RE: [PATCH v8 0/7] KVM PCIe/MSI passthrough on ARM/ARM64: kernel part 3/3: vfio changes
  2016-04-28  8:28 [PATCH v8 0/7] KVM PCIe/MSI passthrough on ARM/ARM64: kernel part 3/3: vfio changes Eric Auger
                   ` (6 preceding siblings ...)
  2016-04-28  8:28 ` [PATCH v8 7/7] vfio/type1: return MSI mapping requirements with VFIO_IOMMU_GET_INFO Eric Auger
@ 2016-05-04 11:17 ` Yehuda Yitschak
  2016-05-04 11:29   ` Eric Auger
  7 siblings, 1 reply; 10+ messages in thread
From: Yehuda Yitschak @ 2016-05-04 11:17 UTC (permalink / raw)
  To: Eric Auger, eric.auger, robin.murphy, alex.williamson,
	will.deacon, joro, tglx, jason, marc.zyngier, christoffer.dall,
	linux-arm-kernel
  Cc: julien.grall, patches, Jean-Philippe.Brucker, p.fedin,
	linux-kernel, Bharat.Bhushan, iommu, pranav.sawargaonkar


Tested-by: Yehuda Yitschak <yehuday@marvell.com>

Tested on Armada-7040 using an Intel IXGBE (82599ES).

> -----Original Message-----
> From: linux-arm-kernel [mailto:linux-arm-kernel-
> bounces@lists.infradead.org] On Behalf Of Eric Auger
> Sent: Thursday, April 28, 2016 11:29
> To: eric.auger@st.com; eric.auger@linaro.org; robin.murphy@arm.com;
> alex.williamson@redhat.com; will.deacon@arm.com; joro@8bytes.org;
> tglx@linutronix.de; jason@lakedaemon.net; marc.zyngier@arm.com;
> christoffer.dall@linaro.org; linux-arm-kernel@lists.infradead.org
> Cc: julien.grall@arm.com; patches@linaro.org; Jean-
> Philippe.Brucker@arm.com; p.fedin@samsung.com; linux-
> kernel@vger.kernel.org; Bharat.Bhushan@freescale.com;
> iommu@lists.linux-foundation.org; pranav.sawargaonkar@gmail.com
> Subject: [PATCH v8 0/7] KVM PCIe/MSI passthrough on ARM/ARM64: kernel
> part 3/3: vfio changes
> 
> This series allows the user-space to register a reserved IOVA domain.
> This completes the kernel integration of the whole functionality on top of
> part 1 & 2.
> 
> It also depends on [PATCH 1/3] iommu: Add MMIO mapping type series,
> http://comments.gmane.org/gmane.linux.kernel.iommu/12869
> 
> We reuse the VFIO DMA MAP ioctl with a new flag to bridge to the
> msi-iommu API. The need for provisioning such MSI IOVA range is reported
> through the VFIO_IOMMU_GET_INFO ioctl.
> 
> vfio_iommu_type1 checks if the MSI mapping is safe when attaching the vfio
> group to the container (allow_unsafe_interrupts modality).
> 
> On ARM/ARM64, the IOMMU does not abstract IRQ remapping. The modality
> is abstracted on MSI controller side. The GICv3 ITS is the first controller
> advertising the modality.
> 
> More details & context can be found at:
> http://www.linaro.org/blog/core-dump/kvm-pciemsi-passthrough-armarm64/
> 
> Best Regards
> 
> Eric
> 
> Testing:
> - functional on ARM64 AMD Overdrive HW (single GICv2m frame) with
>   Intel X540-T2 (SR-IOV capable)
> - Not tested: ARM GICv3 ITS
> 
> References:
> [1] [RFC 0/2] VFIO: Add virtual MSI doorbell support
>     (https://lkml.org/lkml/2015/7/24/135)
> [2] [RFC PATCH 0/6] vfio: Add interface to map MSI pages
>     (https://lists.cs.columbia.edu/pipermail/kvmarm/2015-September/016607.html)
> [3] [PATCH v2 0/3] Introduce MSI hardware mapping for VFIO
>     (http://permalink.gmane.org/gmane.comp.emulators.kvm.arm.devel/3858)
> 
> Git: complete series available at
> https://git.linaro.org/people/eric.auger/linux.git/shortlog/refs/heads/v4.6-rc5-pcie-passthrough-v8
> 
> previous version at
> https://git.linaro.org/people/eric.auger/linux.git/shortlog/refs/heads/v4.6-rc4-pcie-passthrough-v7
> 
> QEMU Integration:
> [RFC v2 0/8] KVM PCI/MSI passthrough with mach-virt
> (http://lists.gnu.org/archive/html/qemu-arm/2016-01/msg00444.html)
> https://git.linaro.org/people/eric.auger/qemu.git/shortlog/refs/heads/v2.5.0-pci-passthrough-rfc-v2
> 
> History:
> v7 -> v8:
> - use renamed msi-iommu API
> - VFIO only responsible for setting the IOVA aperture
> - use new DOMAIN_ATTR_MSI_GEOMETRY iommu domain attribute
> 
> v6 -> v7:
> - vfio_find_dma now accepts a dma_type argument.
> - should have recovered the capability to unmap the whole user IOVA range
> - remove computation of nb IOVA pages -> will post a separate RFC for that
>   while respinning the QEMU part
> 
> RFC v5 -> patch v6:
> - split to ease the review process
> 
> RFC v4 -> RFC v5:
> - take into account Thomas' comments on MSI related patches
>   - split "msi: IOMMU map the doorbell address when needed"
>   - increase readability and add comments
>   - fix style issues
>  - split "iommu: Add DOMAIN_ATTR_MSI_MAPPING attribute"
>  - platform ITS now advertises IOMMU_CAP_INTR_REMAP
>  - fix compilation issue with CONFIG_IOMMU API unset
>  - arm-smmu-v3 now advertises DOMAIN_ATTR_MSI_MAPPING
> 
> RFC v3 -> v4:
> - Move doorbell mapping/unmapping in msi.c
> - fix ref count issue on set_affinity: in case of a change in the address
>   the previous address is decremented
> - doorbell map/unmap now is done on msi composition. Should allow the use
>   case for platform MSI controllers
> - create dma-reserved-iommu.h/c exposing/implementing a new API dedicated
>   to reserved IOVA management (looking like dma-iommu glue)
> - series reordering to ease the review:
>   - first part is related to IOMMU
>   - second related to MSI sub-system
>   - third related to VFIO (except arm-smmu IOMMU_CAP_INTR_REMAP removal)
> - expose the number of requested IOVA pages through VFIO_IOMMU_GET_INFO
>   [this partially addresses Marc's comments on iommu_get/put_single_reserved
>    size/alignment problematic - which I did not ignore - but I don't know
>    how much I can do at the moment]
> 
> RFC v2 -> RFC v3:
> - should fix wrong handling of some CONFIG combinations:
>   CONFIG_IOVA, CONFIG_IOMMU_API, CONFIG_PCI_MSI_IRQ_DOMAIN
> - fix MSI_FLAG_IRQ_REMAPPING setting in GICv3 ITS (although not tested)
> 
> PATCH v1 -> RFC v2:
> - reverted to RFC since it looks more reasonable ;-) the code is split
>   between VFIO, IOMMU, MSI controller and I am not sure I did the right
>   choices. Also API need to be further discussed.
> - iova API usage in arm-smmu.c.
> - MSI controller natively programs the MSI addr with either the PA or IOVA.
>   This is not done anymore in vfio-pci driver as suggested by Alex.
> - check irq remapping capability of the group
> 
> RFC v1 [2] -> PATCH v1:
> - use the existing dma map/unmap ioctl interface with a flag to register a
>   reserved IOVA range. Use the legacy Rb to store this special vfio_dma.
> - a single reserved IOVA contiguous region now is allowed
> - use of an RB tree indexed by PA to store allocated reserved slots
> - use of a vfio_domain iova_domain to manage iova allocation within the
>   window provided by the userspace
> - vfio alloc_map/unmap_free take a vfio_group handle
> - vfio_group handle is cached in vfio_pci_device
> - add ref counting to bindings
> - user modality enabled at the end of the series
> 
> 
> Eric Auger (7):
>   vfio: introduce a vfio_dma type field
>   vfio/type1: vfio_find_dma accepting a type argument
>   vfio/type1: bypass unmap/unpin and replay for VFIO_IOVA_RESERVED slots
>   vfio: allow reserved msi iova registration
>   vfio/type1: also check IRQ remapping capability at msi domain
>   iommu/arm-smmu: do not advertise IOMMU_CAP_INTR_REMAP
>   vfio/type1: return MSI mapping requirements with VFIO_IOMMU_GET_INFO
> 
>  drivers/iommu/arm-smmu-v3.c     |   3 +-
>  drivers/iommu/arm-smmu.c        |   3 +-
>  drivers/vfio/vfio_iommu_type1.c | 227 +++++++++++++++++++++++++++++++++++++---
>  include/uapi/linux/vfio.h       |  14 ++-
>  4 files changed, 230 insertions(+), 17 deletions(-)
> 
> --
> 1.9.1

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v8 0/7] KVM PCIe/MSI passthrough on ARM/ARM64: kernel part 3/3: vfio changes
  2016-05-04 11:17 ` [PATCH v8 0/7] KVM PCIe/MSI passthrough on ARM/ARM64: kernel part 3/3: vfio changes Yehuda Yitschak
@ 2016-05-04 11:29   ` Eric Auger
  0 siblings, 0 replies; 10+ messages in thread
From: Eric Auger @ 2016-05-04 11:29 UTC (permalink / raw)
  To: Yehuda Yitschak, eric.auger, robin.murphy, alex.williamson,
	will.deacon, joro, tglx, jason, marc.zyngier, christoffer.dall,
	linux-arm-kernel
  Cc: julien.grall, patches, Jean-Philippe.Brucker, p.fedin,
	linux-kernel, Bharat.Bhushan, iommu, pranav.sawargaonkar

Hi Yehuda,
On 05/04/2016 01:17 PM, Yehuda Yitschak wrote:
> 
> Tested-by: Yehuda Yitschak <yehuday@marvell.com>
Many thanks for the T-b!

I am about to submit a small update of parts I & III today (v9), taking
into account Alex's last comments. The MSI layer part (II) is left unchanged
(v8).

The way the need for MSI mapping is reported to user space is going to
change, and I will respin the QEMU part accordingly. Besides, this info
was not yet used in the QEMU integration.

Best Regards

Eric
> 
> Tested on Armada-7040 using an Intel IXGBE (82599ES).
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

