linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v7 00/19] KVM PCIe/MSI passthrough on ARM/ARM64 and IOVA reserved regions
@ 2017-01-09 13:45 Eric Auger
  2017-01-09 13:45 ` [PATCH v7 01/19] iommu/dma: Implement PCI allocation optimisation Eric Auger
                   ` (19 more replies)
  0 siblings, 20 replies; 33+ messages in thread
From: Eric Auger @ 2017-01-09 13:45 UTC (permalink / raw)
  To: eric.auger, eric.auger.pro, christoffer.dall, marc.zyngier,
	robin.murphy, alex.williamson, will.deacon, joro, tglx, jason,
	linux-arm-kernel
  Cc: kvm, drjones, linux-kernel, pranav.sawargaonkar, iommu,
	punit.agrawal, diana.craciun, gpkulkarni, shankerd,
	bharat.bhushan, geethasowjanya.akula

Following LPC discussions, we now report reserved regions through
the iommu-group sysfs reserved_regions attribute file.

Reserved regions are populated through the IOMMU get_resv_region
callback (former get_dm_regions), now implemented by amd-iommu,
intel-iommu and arm-smmu:
- the intel-iommu reports the [0xfee00000 - 0xfeefffff] MSI window
  as a reserved region and RMRR regions as direct-mapped regions.
- the amd-iommu reports device direct mapped regions, the MSI region
  and HT regions.
- the arm-smmu reports the MSI window (arbitrarily located at
  0x8000000 and 1MB large).

Unsafe interrupt assignment is tested by enumerating all MSI irq
domains and checking MSI remapping is supported in the above hierarchy.
This check is done in case we detect the iommu translates MSI
(an IOMMU_RESV_MSI window exists). Otherwise the IRQ remapping
capability is checked at IOMMU level. Obviously this is a defensive
IRQ safety assessment: Assuming there are several MSI controllers
in the system and at least one does not implement IRQ remapping,
the assignment will be considered as unsafe (even if this controller
is not acessible from the assigned devices).

The two first patches stem from Robin's branch:
http://linux-arm.org/git?p=linux-rm.git;a=shortlog;h=refs/heads/iommu/misc
They are likely to be pulled separately.

Best Regards

Eric

Git: complete series available at
https://github.com/eauger/linux/tree/v4.10-rc3-reserved-v7

History:

PATCHv6 -> PATCHv7:
- "iommu/dma: Implement PCI allocation optimisation" was added to apply
  "iommu/dma: Allow MSI-only cookies"
- report Intel RMRR as direct-mapped regions
- report the type in the iommu group sysfs reserved_regions file
- do not merge regions of different types when building the list
  of reserved regions
- intgerate Robin's "iommu/dma: Allow MSI-only cookies" last
  version
- update Documentation/ABI/testing/sysfs-kernel-iommu_groups
- rename IOMMU_RESV_NOMAP into IOMMU_RESV_RESERVED

PATCHv5 -> PATCHv6
- Introduce IRQ_DOMAIN_FLAG_MSI as suggested by Marc
- irq_domain_is_msi, irq_domain_is_msi_remap,
  irq_domain_hierarchical_is_msi_remap,
- set IRQ_DOMAIN_FLAG_MSI in msi_create_irq_domain
- fix compil issue on i386
- rework test at VFIO level

RFCv4 -> PATCHv5
- fix IRQ security assessment by looking at irq domain parents
- check DOMAIN_BUS_FSL_MC_MSI irq domains
- AMD MSI and HT regions are exposed in iommu group sysfs

RFCv3 -> RFCv4:
- arm-smmu driver does not register PCI host bridge windows as
  reserved regions anymore
- Implement reserved region get/put callbacks also in arm-smmuv3
- take the iommu_group lock on iommu_get_group_resv_regions
- add a type field in iommu_resv_region instead of using prot
- init the region list_head in iommu_alloc_resv_region, also
  add type parameter
- iommu_insert_resv_region manage overlaps and sort reserved
  windows
- address IRQ safety assessment by enumerating all the MSI irq
  domains and checking the MSI_REMAP flag
- update Documentation/ABI/testing/sysfs-kernel-iommu_groups

RFC v2 -> v3:
- switch to an iommu-group sysfs API
- use new dummy allocator provided by Robin
- dummy allocator initialized by vfio-iommu-type1 after enumerating
  the reserved regions
- at the moment ARM MSI base address/size is left unchanged compared
  to v2
- we currently report reserved regions and not usable IOVA regions as
  requested by Alex

RFC v1 -> v2:
- fix intel_add_reserved_regions
- add mutex lock/unlock in vfio_iommu_type1


Eric Auger (17):
  iommu: Rename iommu_dm_regions into iommu_resv_regions
  iommu: Add a new type field in iommu_resv_region
  iommu: iommu_alloc_resv_region
  iommu: Only map direct mapped regions
  iommu: iommu_get_group_resv_regions
  iommu: Implement reserved_regions iommu-group sysfs file
  iommu/vt-d: Implement reserved region get/put callbacks
  iommu/amd: Declare MSI and HT regions as reserved IOVA regions
  iommu/arm-smmu: Implement reserved region get/put callbacks
  iommu/arm-smmu-v3: Implement reserved region get/put callbacks
  irqdomain: Add irq domain MSI and MSI_REMAP flags
  genirq/msi: Set IRQ_DOMAIN_FLAG_MSI on MSI domain creation
  irqdomain: irq_domain_check_msi_remap
  irqchip/gicv3-its: Sets IRQ_DOMAIN_FLAG_MSI_REMAP
  vfio/type1: Allow transparent MSI IOVA allocation
  vfio/type1: Check MSI remapping at irq domain level
  iommu/arm-smmu: Do not advertise IOMMU_CAP_INTR_REMAP anymore

Robin Murphy (2):
  iommu/dma: Implement PCI allocation optimisation
  iommu/dma: Allow MSI-only cookies

 .../ABI/testing/sysfs-kernel-iommu_groups          |  12 ++
 drivers/iommu/amd_iommu.c                          |  54 ++++---
 drivers/iommu/arm-smmu-v3.c                        |  30 +++-
 drivers/iommu/arm-smmu.c                           |  30 +++-
 drivers/iommu/dma-iommu.c                          | 138 ++++++++++++----
 drivers/iommu/intel-iommu.c                        |  92 ++++++++---
 drivers/iommu/iommu.c                              | 179 +++++++++++++++++++--
 drivers/irqchip/irq-gic-v3-its.c                   |   1 +
 drivers/vfio/vfio_iommu_type1.c                    |  37 ++++-
 include/linux/dma-iommu.h                          |   6 +
 include/linux/iommu.h                              |  46 ++++--
 include/linux/irqdomain.h                          |  36 +++++
 kernel/irq/irqdomain.c                             |  39 +++++
 kernel/irq/msi.c                                   |   4 +-
 14 files changed, 606 insertions(+), 98 deletions(-)

-- 
1.9.1

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v7 01/19] iommu/dma: Implement PCI allocation optimisation
  2017-01-09 13:45 [PATCH v7 00/19] KVM PCIe/MSI passthrough on ARM/ARM64 and IOVA reserved regions Eric Auger
@ 2017-01-09 13:45 ` Eric Auger
  2017-01-10 15:41   ` Robin Murphy
  2017-01-09 13:45 ` [PATCH v7 02/19] iommu/dma: Allow MSI-only cookies Eric Auger
                   ` (18 subsequent siblings)
  19 siblings, 1 reply; 33+ messages in thread
From: Eric Auger @ 2017-01-09 13:45 UTC (permalink / raw)
  To: eric.auger, eric.auger.pro, christoffer.dall, marc.zyngier,
	robin.murphy, alex.williamson, will.deacon, joro, tglx, jason,
	linux-arm-kernel
  Cc: kvm, drjones, linux-kernel, pranav.sawargaonkar, iommu,
	punit.agrawal, diana.craciun, gpkulkarni, shankerd,
	bharat.bhushan, geethasowjanya.akula

From: Robin Murphy <robin.murphy@arm.com>

Whilst PCI devices may have 64-bit DMA masks, they still benefit from
using 32-bit addresses wherever possible in order to avoid DAC (PCI) or
longer address packets (PCIe), which may incur a performance overhead.
Implement the same optimisation as other allocators by trying to get a
32-bit address first, only falling back to the full mask if that fails.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/iommu/dma-iommu.c | 21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 2db0d64..a59ca47 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -204,19 +204,28 @@ int dma_direction_to_prot(enum dma_data_direction dir, bool coherent)
 }
 
 static struct iova *__alloc_iova(struct iommu_domain *domain, size_t size,
-		dma_addr_t dma_limit)
+		dma_addr_t dma_limit, struct device *dev)
 {
 	struct iova_domain *iovad = cookie_iovad(domain);
 	unsigned long shift = iova_shift(iovad);
 	unsigned long length = iova_align(iovad, size) >> shift;
+	struct iova *iova = NULL;
 
 	if (domain->geometry.force_aperture)
 		dma_limit = min(dma_limit, domain->geometry.aperture_end);
+
+	/* Try to get PCI devices a SAC address */
+	if (dma_limit > DMA_BIT_MASK(32) && dev_is_pci(dev))
+		iova = alloc_iova(iovad, length, DMA_BIT_MASK(32) >> shift,
+				  true);
 	/*
 	 * Enforce size-alignment to be safe - there could perhaps be an
 	 * attribute to control this per-device, or at least per-domain...
 	 */
-	return alloc_iova(iovad, length, dma_limit >> shift, true);
+	if (!iova)
+		iova = alloc_iova(iovad, length, dma_limit >> shift, true);
+
+	return iova;
 }
 
 /* The IOVA allocator knows what we mapped, so just unmap whatever that was */
@@ -369,7 +378,7 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
 	if (!pages)
 		return NULL;
 
-	iova = __alloc_iova(domain, size, dev->coherent_dma_mask);
+	iova = __alloc_iova(domain, size, dev->coherent_dma_mask, dev);
 	if (!iova)
 		goto out_free_pages;
 
@@ -440,7 +449,7 @@ static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys,
 	struct iova_domain *iovad = cookie_iovad(domain);
 	size_t iova_off = iova_offset(iovad, phys);
 	size_t len = iova_align(iovad, size + iova_off);
-	struct iova *iova = __alloc_iova(domain, len, dma_get_mask(dev));
+	struct iova *iova = __alloc_iova(domain, len, dma_get_mask(dev), dev);
 
 	if (!iova)
 		return DMA_ERROR_CODE;
@@ -598,7 +607,7 @@ int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
 		prev = s;
 	}
 
-	iova = __alloc_iova(domain, iova_len, dma_get_mask(dev));
+	iova = __alloc_iova(domain, iova_len, dma_get_mask(dev), dev);
 	if (!iova)
 		goto out_restore_sg;
 
@@ -675,7 +684,7 @@ static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev,
 	if (!msi_page)
 		return NULL;
 
-	iova = __alloc_iova(domain, iovad->granule, dma_get_mask(dev));
+	iova = __alloc_iova(domain, iovad->granule, dma_get_mask(dev), dev);
 	if (!iova)
 		goto out_free_page;
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v7 02/19] iommu/dma: Allow MSI-only cookies
  2017-01-09 13:45 [PATCH v7 00/19] KVM PCIe/MSI passthrough on ARM/ARM64 and IOVA reserved regions Eric Auger
  2017-01-09 13:45 ` [PATCH v7 01/19] iommu/dma: Implement PCI allocation optimisation Eric Auger
@ 2017-01-09 13:45 ` Eric Auger
  2017-01-09 13:45 ` [PATCH v7 03/19] iommu: Rename iommu_dm_regions into iommu_resv_regions Eric Auger
                   ` (17 subsequent siblings)
  19 siblings, 0 replies; 33+ messages in thread
From: Eric Auger @ 2017-01-09 13:45 UTC (permalink / raw)
  To: eric.auger, eric.auger.pro, christoffer.dall, marc.zyngier,
	robin.murphy, alex.williamson, will.deacon, joro, tglx, jason,
	linux-arm-kernel
  Cc: kvm, drjones, linux-kernel, pranav.sawargaonkar, iommu,
	punit.agrawal, diana.craciun, gpkulkarni, shankerd,
	bharat.bhushan, geethasowjanya.akula

From: Robin Murphy <robin.murphy@arm.com>

IOMMU domain users such as VFIO face a similar problem to DMA API ops
with regard to mapping MSI messages in systems where the MSI write is
subject to IOMMU translation. With the relevant infrastructure now in
place for managed DMA domains, it's actually really simple for other
users to piggyback off that and reap the benefits without giving up
their own IOVA management, and without having to reinvent their own
wheel in the MSI layer.

Allow such users to opt into automatic MSI remapping by dedicating a
region of their IOVA space to a managed cookie, and extend the mapping
routine to implement a trivial linear allocator in such cases, to avoid
the needless overhead of a full-blown IOVA domain.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/iommu/dma-iommu.c | 119 +++++++++++++++++++++++++++++++++++++---------
 include/linux/dma-iommu.h |   6 +++
 2 files changed, 102 insertions(+), 23 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index a59ca47..008ad88 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -37,15 +37,50 @@ struct iommu_dma_msi_page {
 	phys_addr_t		phys;
 };
 
+enum iommu_dma_cookie_type {
+	IOMMU_DMA_IOVA_COOKIE,
+	IOMMU_DMA_MSI_COOKIE,
+};
+
 struct iommu_dma_cookie {
-	struct iova_domain	iovad;
-	struct list_head	msi_page_list;
-	spinlock_t		msi_lock;
+	enum iommu_dma_cookie_type	type;
+	union {
+		/* Full allocator for IOMMU_DMA_IOVA_COOKIE */
+		struct iova_domain	iovad;
+		/* Trivial linear page allocator for IOMMU_DMA_MSI_COOKIE */
+		dma_addr_t		msi_iova;
+	};
+	struct list_head		msi_page_list;
+	spinlock_t			msi_lock;
 };
 
+static inline size_t cookie_msi_granule(struct iommu_dma_cookie *cookie)
+{
+	if (cookie->type == IOMMU_DMA_IOVA_COOKIE)
+		return cookie->iovad.granule;
+	return PAGE_SIZE;
+}
+
 static inline struct iova_domain *cookie_iovad(struct iommu_domain *domain)
 {
-	return &((struct iommu_dma_cookie *)domain->iova_cookie)->iovad;
+	struct iommu_dma_cookie *cookie = domain->iova_cookie;
+
+	if (cookie->type == IOMMU_DMA_IOVA_COOKIE)
+		return &cookie->iovad;
+	return NULL;
+}
+
+static struct iommu_dma_cookie *cookie_alloc(enum iommu_dma_cookie_type type)
+{
+	struct iommu_dma_cookie *cookie;
+
+	cookie = kzalloc(sizeof(*cookie), GFP_KERNEL);
+	if (cookie) {
+		spin_lock_init(&cookie->msi_lock);
+		INIT_LIST_HEAD(&cookie->msi_page_list);
+		cookie->type = type;
+	}
+	return cookie;
 }
 
 int iommu_dma_init(void)
@@ -62,25 +97,53 @@ int iommu_dma_init(void)
  */
 int iommu_get_dma_cookie(struct iommu_domain *domain)
 {
+	if (domain->iova_cookie)
+		return -EEXIST;
+
+	domain->iova_cookie = cookie_alloc(IOMMU_DMA_IOVA_COOKIE);
+	if (!domain->iova_cookie)
+		return -ENOMEM;
+
+	return 0;
+}
+EXPORT_SYMBOL(iommu_get_dma_cookie);
+
+/**
+ * iommu_get_msi_cookie - Acquire just MSI remapping resources
+ * @domain: IOMMU domain to prepare
+ * @base: Start address of IOVA region for MSI mappings
+ *
+ * Users who manage their own IOVA allocation and do not want DMA API support,
+ * but would still like to take advantage of automatic MSI remapping, can use
+ * this to initialise their own domain appropriately. Users should reserve a
+ * contiguous IOVA region, starting at @base, large enough to accommodate the
+ * number of PAGE_SIZE mappings necessary to cover every MSI doorbell address
+ * used by the devices attached to @domain.
+ */
+int iommu_get_msi_cookie(struct iommu_domain *domain, dma_addr_t base)
+{
 	struct iommu_dma_cookie *cookie;
 
+	if (domain->type != IOMMU_DOMAIN_UNMANAGED)
+		return -EINVAL;
+
 	if (domain->iova_cookie)
 		return -EEXIST;
 
-	cookie = kzalloc(sizeof(*cookie), GFP_KERNEL);
+	cookie = cookie_alloc(IOMMU_DMA_MSI_COOKIE);
 	if (!cookie)
 		return -ENOMEM;
 
-	spin_lock_init(&cookie->msi_lock);
-	INIT_LIST_HEAD(&cookie->msi_page_list);
+	cookie->msi_iova = base;
 	domain->iova_cookie = cookie;
 	return 0;
 }
-EXPORT_SYMBOL(iommu_get_dma_cookie);
+EXPORT_SYMBOL(iommu_get_msi_cookie);
 
 /**
  * iommu_put_dma_cookie - Release a domain's DMA mapping resources
- * @domain: IOMMU domain previously prepared by iommu_get_dma_cookie()
+ * @domain: IOMMU domain previously prepared by iommu_get_dma_cookie() or
+ *          iommu_get_msi_cookie()
  *
  * IOMMU drivers should normally call this from their domain_free callback.
  */
@@ -92,7 +155,7 @@ void iommu_put_dma_cookie(struct iommu_domain *domain)
 	if (!cookie)
 		return;
 
-	if (cookie->iovad.granule)
+	if (cookie->type == IOMMU_DMA_IOVA_COOKIE && cookie->iovad.granule)
 		put_iova_domain(&cookie->iovad);
 
 	list_for_each_entry_safe(msi, tmp, &cookie->msi_page_list, list) {
@@ -137,11 +200,12 @@ static void iova_reserve_pci_windows(struct pci_dev *dev,
 int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
 		u64 size, struct device *dev)
 {
-	struct iova_domain *iovad = cookie_iovad(domain);
+	struct iommu_dma_cookie *cookie = domain->iova_cookie;
+	struct iova_domain *iovad = &cookie->iovad;
 	unsigned long order, base_pfn, end_pfn;
 
-	if (!iovad)
-		return -ENODEV;
+	if (!cookie || cookie->type != IOMMU_DMA_IOVA_COOKIE)
+		return -EINVAL;
 
 	/* Use the smallest supported page size for IOVA granularity */
 	order = __ffs(domain->pgsize_bitmap);
@@ -671,11 +735,12 @@ static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev,
 {
 	struct iommu_dma_cookie *cookie = domain->iova_cookie;
 	struct iommu_dma_msi_page *msi_page;
-	struct iova_domain *iovad = &cookie->iovad;
+	struct iova_domain *iovad = cookie_iovad(domain);
 	struct iova *iova;
 	int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
+	size_t size = cookie_msi_granule(cookie);
 
-	msi_addr &= ~(phys_addr_t)iova_mask(iovad);
+	msi_addr &= ~(phys_addr_t)(size - 1);
 	list_for_each_entry(msi_page, &cookie->msi_page_list, list)
 		if (msi_page->phys == msi_addr)
 			return msi_page;
@@ -684,13 +749,18 @@ static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev,
 	if (!msi_page)
 		return NULL;
 
-	iova = __alloc_iova(domain, iovad->granule, dma_get_mask(dev), dev);
-	if (!iova)
-		goto out_free_page;
-
 	msi_page->phys = msi_addr;
-	msi_page->iova = iova_dma_addr(iovad, iova);
-	if (iommu_map(domain, msi_page->iova, msi_addr, iovad->granule, prot))
+	if (iovad) {
+		iova = __alloc_iova(domain, size, dma_get_mask(dev), dev);
+		if (!iova)
+			goto out_free_page;
+		msi_page->iova = iova_dma_addr(iovad, iova);
+	} else {
+		msi_page->iova = cookie->msi_iova;
+		cookie->msi_iova += size;
+	}
+
+	if (iommu_map(domain, msi_page->iova, msi_addr, size, prot))
 		goto out_free_iova;
 
 	INIT_LIST_HEAD(&msi_page->list);
@@ -698,7 +768,10 @@ static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev,
 	return msi_page;
 
 out_free_iova:
-	__free_iova(iovad, iova);
+	if (iovad)
+		__free_iova(iovad, iova);
+	else
+		cookie->msi_iova -= size;
 out_free_page:
 	kfree(msi_page);
 	return NULL;
@@ -739,7 +812,7 @@ void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
 		msg->data = ~0U;
 	} else {
 		msg->address_hi = upper_32_bits(msi_page->iova);
-		msg->address_lo &= iova_mask(&cookie->iovad);
+		msg->address_lo &= cookie_msi_granule(cookie) - 1;
 		msg->address_lo += lower_32_bits(msi_page->iova);
 	}
 }
diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
index 7f7e9a7..28df844 100644
--- a/include/linux/dma-iommu.h
+++ b/include/linux/dma-iommu.h
@@ -27,6 +27,7 @@
 
 /* Domain management interface for IOMMU drivers */
 int iommu_get_dma_cookie(struct iommu_domain *domain);
+int iommu_get_msi_cookie(struct iommu_domain *domain, dma_addr_t base);
 void iommu_put_dma_cookie(struct iommu_domain *domain);
 
 /* Setup call for arch DMA mapping code */
@@ -86,6 +87,11 @@ static inline int iommu_get_dma_cookie(struct iommu_domain *domain)
 	return -ENODEV;
 }
 
+static inline int iommu_get_msi_cookie(struct iommu_domain *domain, dma_addr_t base)
+{
+	return -ENODEV;
+}
+
 static inline void iommu_put_dma_cookie(struct iommu_domain *domain)
 {
 }
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v7 03/19] iommu: Rename iommu_dm_regions into iommu_resv_regions
  2017-01-09 13:45 [PATCH v7 00/19] KVM PCIe/MSI passthrough on ARM/ARM64 and IOVA reserved regions Eric Auger
  2017-01-09 13:45 ` [PATCH v7 01/19] iommu/dma: Implement PCI allocation optimisation Eric Auger
  2017-01-09 13:45 ` [PATCH v7 02/19] iommu/dma: Allow MSI-only cookies Eric Auger
@ 2017-01-09 13:45 ` Eric Auger
  2017-01-09 13:45 ` [PATCH v7 04/19] iommu: Add a new type field in iommu_resv_region Eric Auger
                   ` (16 subsequent siblings)
  19 siblings, 0 replies; 33+ messages in thread
From: Eric Auger @ 2017-01-09 13:45 UTC (permalink / raw)
  To: eric.auger, eric.auger.pro, christoffer.dall, marc.zyngier,
	robin.murphy, alex.williamson, will.deacon, joro, tglx, jason,
	linux-arm-kernel
  Cc: kvm, drjones, linux-kernel, pranav.sawargaonkar, iommu,
	punit.agrawal, diana.craciun, gpkulkarni, shankerd,
	bharat.bhushan, geethasowjanya.akula

We want to extend the callbacks used for dm regions and
use them for reserved regions. Reserved regions can be
- directly mapped regions
- regions that cannot be iommu mapped (PCI host bridge windows, ...)
- MSI regions (because they belong to another address space or because
  they are not translated by the IOMMU and need special handling)

So let's rename the struct and also the callbacks.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>

---

v3 -> v4:
- add Robin's A-b
---
 drivers/iommu/amd_iommu.c | 20 ++++++++++----------
 drivers/iommu/iommu.c     | 22 +++++++++++-----------
 include/linux/iommu.h     | 29 +++++++++++++++--------------
 3 files changed, 36 insertions(+), 35 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 3ef0f42..f7a024f 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -3161,8 +3161,8 @@ static bool amd_iommu_capable(enum iommu_cap cap)
 	return false;
 }
 
-static void amd_iommu_get_dm_regions(struct device *dev,
-				     struct list_head *head)
+static void amd_iommu_get_resv_regions(struct device *dev,
+				       struct list_head *head)
 {
 	struct unity_map_entry *entry;
 	int devid;
@@ -3172,7 +3172,7 @@ static void amd_iommu_get_dm_regions(struct device *dev,
 		return;
 
 	list_for_each_entry(entry, &amd_iommu_unity_map, list) {
-		struct iommu_dm_region *region;
+		struct iommu_resv_region *region;
 
 		if (devid < entry->devid_start || devid > entry->devid_end)
 			continue;
@@ -3195,18 +3195,18 @@ static void amd_iommu_get_dm_regions(struct device *dev,
 	}
 }
 
-static void amd_iommu_put_dm_regions(struct device *dev,
+static void amd_iommu_put_resv_regions(struct device *dev,
 				     struct list_head *head)
 {
-	struct iommu_dm_region *entry, *next;
+	struct iommu_resv_region *entry, *next;
 
 	list_for_each_entry_safe(entry, next, head, list)
 		kfree(entry);
 }
 
-static void amd_iommu_apply_dm_region(struct device *dev,
+static void amd_iommu_apply_resv_region(struct device *dev,
 				      struct iommu_domain *domain,
-				      struct iommu_dm_region *region)
+				      struct iommu_resv_region *region)
 {
 	struct dma_ops_domain *dma_dom = to_dma_ops_domain(to_pdomain(domain));
 	unsigned long start, end;
@@ -3230,9 +3230,9 @@ static void amd_iommu_apply_dm_region(struct device *dev,
 	.add_device = amd_iommu_add_device,
 	.remove_device = amd_iommu_remove_device,
 	.device_group = amd_iommu_device_group,
-	.get_dm_regions = amd_iommu_get_dm_regions,
-	.put_dm_regions = amd_iommu_put_dm_regions,
-	.apply_dm_region = amd_iommu_apply_dm_region,
+	.get_resv_regions = amd_iommu_get_resv_regions,
+	.put_resv_regions = amd_iommu_put_resv_regions,
+	.apply_resv_region = amd_iommu_apply_resv_region,
 	.pgsize_bitmap	= AMD_IOMMU_PGSIZES,
 };
 
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index dbe7f65..1cee5c3 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -318,7 +318,7 @@ static int iommu_group_create_direct_mappings(struct iommu_group *group,
 					      struct device *dev)
 {
 	struct iommu_domain *domain = group->default_domain;
-	struct iommu_dm_region *entry;
+	struct iommu_resv_region *entry;
 	struct list_head mappings;
 	unsigned long pg_size;
 	int ret = 0;
@@ -331,14 +331,14 @@ static int iommu_group_create_direct_mappings(struct iommu_group *group,
 	pg_size = 1UL << __ffs(domain->pgsize_bitmap);
 	INIT_LIST_HEAD(&mappings);
 
-	iommu_get_dm_regions(dev, &mappings);
+	iommu_get_resv_regions(dev, &mappings);
 
 	/* We need to consider overlapping regions for different devices */
 	list_for_each_entry(entry, &mappings, list) {
 		dma_addr_t start, end, addr;
 
-		if (domain->ops->apply_dm_region)
-			domain->ops->apply_dm_region(dev, domain, entry);
+		if (domain->ops->apply_resv_region)
+			domain->ops->apply_resv_region(dev, domain, entry);
 
 		start = ALIGN(entry->start, pg_size);
 		end   = ALIGN(entry->start + entry->length, pg_size);
@@ -358,7 +358,7 @@ static int iommu_group_create_direct_mappings(struct iommu_group *group,
 	}
 
 out:
-	iommu_put_dm_regions(dev, &mappings);
+	iommu_put_resv_regions(dev, &mappings);
 
 	return ret;
 }
@@ -1559,20 +1559,20 @@ int iommu_domain_set_attr(struct iommu_domain *domain,
 }
 EXPORT_SYMBOL_GPL(iommu_domain_set_attr);
 
-void iommu_get_dm_regions(struct device *dev, struct list_head *list)
+void iommu_get_resv_regions(struct device *dev, struct list_head *list)
 {
 	const struct iommu_ops *ops = dev->bus->iommu_ops;
 
-	if (ops && ops->get_dm_regions)
-		ops->get_dm_regions(dev, list);
+	if (ops && ops->get_resv_regions)
+		ops->get_resv_regions(dev, list);
 }
 
-void iommu_put_dm_regions(struct device *dev, struct list_head *list)
+void iommu_put_resv_regions(struct device *dev, struct list_head *list)
 {
 	const struct iommu_ops *ops = dev->bus->iommu_ops;
 
-	if (ops && ops->put_dm_regions)
-		ops->put_dm_regions(dev, list);
+	if (ops && ops->put_resv_regions)
+		ops->put_resv_regions(dev, list);
 }
 
 /* Request that a device is direct mapped by the IOMMU */
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 0ff5111..bfecb8b 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -118,13 +118,13 @@ enum iommu_attr {
 };
 
 /**
- * struct iommu_dm_region - descriptor for a direct mapped memory region
+ * struct iommu_resv_region - descriptor for a reserved memory region
  * @list: Linked list pointers
  * @start: System physical start address of the region
  * @length: Length of the region in bytes
  * @prot: IOMMU Protection flags (READ/WRITE/...)
  */
-struct iommu_dm_region {
+struct iommu_resv_region {
 	struct list_head	list;
 	phys_addr_t		start;
 	size_t			length;
@@ -150,9 +150,9 @@ struct iommu_dm_region {
  * @device_group: find iommu group for a particular device
  * @domain_get_attr: Query domain attributes
  * @domain_set_attr: Change domain attributes
- * @get_dm_regions: Request list of direct mapping requirements for a device
- * @put_dm_regions: Free list of direct mapping requirements for a device
- * @apply_dm_region: Temporary helper call-back for iova reserved ranges
+ * @get_resv_regions: Request list of reserved regions for a device
+ * @put_resv_regions: Free list of reserved regions for a device
+ * @apply_resv_region: Temporary helper call-back for iova reserved ranges
  * @domain_window_enable: Configure and enable a particular window for a domain
  * @domain_window_disable: Disable a particular window for a domain
  * @domain_set_windows: Set the number of windows for a domain
@@ -184,11 +184,12 @@ struct iommu_ops {
 	int (*domain_set_attr)(struct iommu_domain *domain,
 			       enum iommu_attr attr, void *data);
 
-	/* Request/Free a list of direct mapping requirements for a device */
-	void (*get_dm_regions)(struct device *dev, struct list_head *list);
-	void (*put_dm_regions)(struct device *dev, struct list_head *list);
-	void (*apply_dm_region)(struct device *dev, struct iommu_domain *domain,
-				struct iommu_dm_region *region);
+	/* Request/Free a list of reserved regions for a device */
+	void (*get_resv_regions)(struct device *dev, struct list_head *list);
+	void (*put_resv_regions)(struct device *dev, struct list_head *list);
+	void (*apply_resv_region)(struct device *dev,
+				  struct iommu_domain *domain,
+				  struct iommu_resv_region *region);
 
 	/* Window handling functions */
 	int (*domain_window_enable)(struct iommu_domain *domain, u32 wnd_nr,
@@ -233,8 +234,8 @@ extern size_t default_iommu_map_sg(struct iommu_domain *domain, unsigned long io
 extern void iommu_set_fault_handler(struct iommu_domain *domain,
 			iommu_fault_handler_t handler, void *token);
 
-extern void iommu_get_dm_regions(struct device *dev, struct list_head *list);
-extern void iommu_put_dm_regions(struct device *dev, struct list_head *list);
+extern void iommu_get_resv_regions(struct device *dev, struct list_head *list);
+extern void iommu_put_resv_regions(struct device *dev, struct list_head *list);
 extern int iommu_request_dm_for_dev(struct device *dev);
 
 extern int iommu_attach_group(struct iommu_domain *domain,
@@ -443,12 +444,12 @@ static inline void iommu_set_fault_handler(struct iommu_domain *domain,
 {
 }
 
-static inline void iommu_get_dm_regions(struct device *dev,
+static inline void iommu_get_resv_regions(struct device *dev,
 					struct list_head *list)
 {
 }
 
-static inline void iommu_put_dm_regions(struct device *dev,
+static inline void iommu_put_resv_regions(struct device *dev,
 					struct list_head *list)
 {
 }
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v7 04/19] iommu: Add a new type field in iommu_resv_region
  2017-01-09 13:45 [PATCH v7 00/19] KVM PCIe/MSI passthrough on ARM/ARM64 and IOVA reserved regions Eric Auger
                   ` (2 preceding siblings ...)
  2017-01-09 13:45 ` [PATCH v7 03/19] iommu: Rename iommu_dm_regions into iommu_resv_regions Eric Auger
@ 2017-01-09 13:45 ` Eric Auger
  2017-01-09 13:45 ` [PATCH v7 05/19] iommu: iommu_alloc_resv_region Eric Auger
                   ` (15 subsequent siblings)
  19 siblings, 0 replies; 33+ messages in thread
From: Eric Auger @ 2017-01-09 13:45 UTC (permalink / raw)
  To: eric.auger, eric.auger.pro, christoffer.dall, marc.zyngier,
	robin.murphy, alex.williamson, will.deacon, joro, tglx, jason,
	linux-arm-kernel
  Cc: kvm, drjones, linux-kernel, pranav.sawargaonkar, iommu,
	punit.agrawal, diana.craciun, gpkulkarni, shankerd,
	bharat.bhushan, geethasowjanya.akula

We introduce a new field to differentiate the reserved region
types and specialize the apply_resv_region implementation.

Legacy direct mapped regions have IOMMU_RESV_DIRECT type.
We introduce 2 new reserved memory types:
- IOMMU_RESV_MSI will characterize MSI regions that are mapped
- IOMMU_RESV_RESERVED characterize regions that cannot by mapped.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v6 -> v7:
- rename IOMMU_RESV_NOMAP into IOMMU_RESV_RESERVED
---
 drivers/iommu/amd_iommu.c | 1 +
 include/linux/iommu.h     | 7 +++++++
 2 files changed, 8 insertions(+)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index f7a024f..5f7ea4f 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -3186,6 +3186,7 @@ static void amd_iommu_get_resv_regions(struct device *dev,
 
 		region->start = entry->address_start;
 		region->length = entry->address_end - entry->address_start;
+		region->type = IOMMU_RESV_DIRECT;
 		if (entry->prot & IOMMU_PROT_IR)
 			region->prot |= IOMMU_READ;
 		if (entry->prot & IOMMU_PROT_IW)
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index bfecb8b..233a6bf 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -117,18 +117,25 @@ enum iommu_attr {
 	DOMAIN_ATTR_MAX,
 };
 
+/* These are the possible reserved region types */
+#define IOMMU_RESV_DIRECT	(1 << 0)
+#define IOMMU_RESV_RESERVED	(1 << 1)
+#define IOMMU_RESV_MSI		(1 << 2)
+
 /**
  * struct iommu_resv_region - descriptor for a reserved memory region
  * @list: Linked list pointers
  * @start: System physical start address of the region
  * @length: Length of the region in bytes
  * @prot: IOMMU Protection flags (READ/WRITE/...)
+ * @type: Type of the reserved region
  */
 struct iommu_resv_region {
 	struct list_head	list;
 	phys_addr_t		start;
 	size_t			length;
 	int			prot;
+	int			type;
 };
 
 #ifdef CONFIG_IOMMU_API
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v7 05/19] iommu: iommu_alloc_resv_region
  2017-01-09 13:45 [PATCH v7 00/19] KVM PCIe/MSI passthrough on ARM/ARM64 and IOVA reserved regions Eric Auger
                   ` (3 preceding siblings ...)
  2017-01-09 13:45 ` [PATCH v7 04/19] iommu: Add a new type field in iommu_resv_region Eric Auger
@ 2017-01-09 13:45 ` Eric Auger
  2017-01-09 13:45 ` [PATCH v7 06/19] iommu: Only map direct mapped regions Eric Auger
                   ` (14 subsequent siblings)
  19 siblings, 0 replies; 33+ messages in thread
From: Eric Auger @ 2017-01-09 13:45 UTC (permalink / raw)
  To: eric.auger, eric.auger.pro, christoffer.dall, marc.zyngier,
	robin.murphy, alex.williamson, will.deacon, joro, tglx, jason,
	linux-arm-kernel
  Cc: kvm, drjones, linux-kernel, pranav.sawargaonkar, iommu,
	punit.agrawal, diana.craciun, gpkulkarni, shankerd,
	bharat.bhushan, geethasowjanya.akula

Introduce a new helper serving the purpose to allocate a reserved
region. This will be used in iommu driver implementing reserved
region callbacks.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v3 -> v4:
- add INIT_LIST_HEAD(&region->list)
- use int for prot param and add int type param
- remove implementation outside of CONFIG_IOMMU_API
---
 drivers/iommu/iommu.c | 18 ++++++++++++++++++
 include/linux/iommu.h |  2 ++
 2 files changed, 20 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 1cee5c3..927878d 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1575,6 +1575,24 @@ void iommu_put_resv_regions(struct device *dev, struct list_head *list)
 		ops->put_resv_regions(dev, list);
 }
 
+struct iommu_resv_region *iommu_alloc_resv_region(phys_addr_t start,
+						  size_t length,
+						  int prot, int type)
+{
+	struct iommu_resv_region *region;
+
+	region = kzalloc(sizeof(*region), GFP_KERNEL);
+	if (!region)
+		return NULL;
+
+	INIT_LIST_HEAD(&region->list);
+	region->start = start;
+	region->length = length;
+	region->prot = prot;
+	region->type = type;
+	return region;
+}
+
 /* Request that a device is direct mapped by the IOMMU */
 int iommu_request_dm_for_dev(struct device *dev)
 {
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 233a6bf..f6bb55d3 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -244,6 +244,8 @@ extern void iommu_set_fault_handler(struct iommu_domain *domain,
 extern void iommu_get_resv_regions(struct device *dev, struct list_head *list);
 extern void iommu_put_resv_regions(struct device *dev, struct list_head *list);
 extern int iommu_request_dm_for_dev(struct device *dev);
+extern struct iommu_resv_region *
+iommu_alloc_resv_region(phys_addr_t start, size_t length, int prot, int type);
 
 extern int iommu_attach_group(struct iommu_domain *domain,
 			      struct iommu_group *group);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v7 06/19] iommu: Only map direct mapped regions
  2017-01-09 13:45 [PATCH v7 00/19] KVM PCIe/MSI passthrough on ARM/ARM64 and IOVA reserved regions Eric Auger
                   ` (4 preceding siblings ...)
  2017-01-09 13:45 ` [PATCH v7 05/19] iommu: iommu_alloc_resv_region Eric Auger
@ 2017-01-09 13:45 ` Eric Auger
  2017-01-09 13:45 ` [PATCH v7 07/19] iommu: iommu_get_group_resv_regions Eric Auger
                   ` (13 subsequent siblings)
  19 siblings, 0 replies; 33+ messages in thread
From: Eric Auger @ 2017-01-09 13:45 UTC (permalink / raw)
  To: eric.auger, eric.auger.pro, christoffer.dall, marc.zyngier,
	robin.murphy, alex.williamson, will.deacon, joro, tglx, jason,
	linux-arm-kernel
  Cc: kvm, drjones, linux-kernel, pranav.sawargaonkar, iommu,
	punit.agrawal, diana.craciun, gpkulkarni, shankerd,
	bharat.bhushan, geethasowjanya.akula

As we introduced new reserved region types which do not require
mapping, let's make sure we only map direct mapped regions.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v3 -> v4:
- use region's type and reword commit message and title
---
 drivers/iommu/iommu.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 927878d..41c1906 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -343,6 +343,9 @@ static int iommu_group_create_direct_mappings(struct iommu_group *group,
 		start = ALIGN(entry->start, pg_size);
 		end   = ALIGN(entry->start + entry->length, pg_size);
 
+		if (entry->type != IOMMU_RESV_DIRECT)
+			continue;
+
 		for (addr = start; addr < end; addr += pg_size) {
 			phys_addr_t phys_addr;
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v7 07/19] iommu: iommu_get_group_resv_regions
  2017-01-09 13:45 [PATCH v7 00/19] KVM PCIe/MSI passthrough on ARM/ARM64 and IOVA reserved regions Eric Auger
                   ` (5 preceding siblings ...)
  2017-01-09 13:45 ` [PATCH v7 06/19] iommu: Only map direct mapped regions Eric Auger
@ 2017-01-09 13:45 ` Eric Auger
  2017-01-09 13:45 ` [PATCH v7 08/19] iommu: Implement reserved_regions iommu-group sysfs file Eric Auger
                   ` (12 subsequent siblings)
  19 siblings, 0 replies; 33+ messages in thread
From: Eric Auger @ 2017-01-09 13:45 UTC (permalink / raw)
  To: eric.auger, eric.auger.pro, christoffer.dall, marc.zyngier,
	robin.murphy, alex.williamson, will.deacon, joro, tglx, jason,
	linux-arm-kernel
  Cc: kvm, drjones, linux-kernel, pranav.sawargaonkar, iommu,
	punit.agrawal, diana.craciun, gpkulkarni, shankerd,
	bharat.bhushan, geethasowjanya.akula

Introduce iommu_get_group_resv_regions whose role consists in
enumerating all devices from the group and collecting their
reserved regions. The list is sorted and overlaps between
regions of the same type are handled by merging the regions.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
v6 -> v7:
- avoid merge of regions of different type

v3 -> v4:
- take the iommu_group lock in iommu_get_group_resv_regions
- the list now is sorted and overlaps are checked

NOTE:
- we do not move list elements from device to group list since
  the iommu_put_resv_regions() could not be called.
- at the moment I did not introduce any iommu_put_group_resv_regions
  since it simply consists in voiding/freeing the list
---
 drivers/iommu/iommu.c | 98 +++++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/iommu.h |  8 +++++
 2 files changed, 106 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 41c1906..640056b 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -133,6 +133,104 @@ static ssize_t iommu_group_show_name(struct iommu_group *group, char *buf)
 	return sprintf(buf, "%s\n", group->name);
 }
 
+/**
+ * iommu_insert_resv_region - Insert a new region in the
+ * list of reserved regions.
+ * @new: new region to insert
+ * @regions: list of regions
+ *
+ * The new element is sorted by address with respect to the other
+ * regions of the same type. In case it overlaps with another
+ * region of the same type, regions are merged. In case it
+ * overlaps with another region of different type, regions are
+ * not merged.
+ */
+static int iommu_insert_resv_region(struct iommu_resv_region *new,
+				    struct list_head *regions)
+{
+	struct iommu_resv_region *region;
+	phys_addr_t start = new->start;
+	phys_addr_t end = new->start + new->length - 1;
+	struct list_head *pos = regions->next;
+
+	while (pos != regions) {
+		struct iommu_resv_region *entry =
+			list_entry(pos, struct iommu_resv_region, list);
+		phys_addr_t a = entry->start;
+		phys_addr_t b = entry->start + entry->length - 1;
+		int type = entry->type;
+
+		if (end < a) {
+			goto insert;
+		} else if (start > b) {
+			pos = pos->next;
+		} else if ((start >= a) && (end <= b)) {
+			if (new->type == type)
+				goto done;
+			else
+				pos = pos->next;
+		} else {
+			if (new->type == type) {
+				phys_addr_t new_start = min(a, start);
+				phys_addr_t new_end = max(b, end);
+
+				list_del(&entry->list);
+				entry->start = new_start;
+				entry->length = new_end - new_start + 1;
+				iommu_insert_resv_region(entry, regions);
+			} else {
+				pos = pos->next;
+			}
+		}
+	}
+insert:
+	region = iommu_alloc_resv_region(new->start, new->length,
+					 new->prot, new->type);
+	if (!region)
+		return -ENOMEM;
+
+	list_add_tail(&region->list, pos);
+done:
+	return 0;
+}
+
+static int
+iommu_insert_device_resv_regions(struct list_head *dev_resv_regions,
+				 struct list_head *group_resv_regions)
+{
+	struct iommu_resv_region *entry;
+	int ret;
+
+	list_for_each_entry(entry, dev_resv_regions, list) {
+		ret = iommu_insert_resv_region(entry, group_resv_regions);
+		if (ret)
+			break;
+	}
+	return ret;
+}
+
+int iommu_get_group_resv_regions(struct iommu_group *group,
+				 struct list_head *head)
+{
+	struct iommu_device *device;
+	int ret = 0;
+
+	mutex_lock(&group->mutex);
+	list_for_each_entry(device, &group->devices, list) {
+		struct list_head dev_resv_regions;
+
+		INIT_LIST_HEAD(&dev_resv_regions);
+		iommu_get_resv_regions(device->dev, &dev_resv_regions);
+		ret = iommu_insert_device_resv_regions(&dev_resv_regions, head);
+		iommu_put_resv_regions(device->dev, &dev_resv_regions);
+		if (ret)
+			break;
+	}
+	mutex_unlock(&group->mutex);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(iommu_get_group_resv_regions);
+
 static IOMMU_GROUP_ATTR(name, S_IRUGO, iommu_group_show_name, NULL);
 
 static void iommu_group_release(struct kobject *kobj)
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index f6bb55d3..bec3730 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -246,6 +246,8 @@ extern void iommu_set_fault_handler(struct iommu_domain *domain,
 extern int iommu_request_dm_for_dev(struct device *dev);
 extern struct iommu_resv_region *
 iommu_alloc_resv_region(phys_addr_t start, size_t length, int prot, int type);
+extern int iommu_get_group_resv_regions(struct iommu_group *group,
+					struct list_head *head);
 
 extern int iommu_attach_group(struct iommu_domain *domain,
 			      struct iommu_group *group);
@@ -463,6 +465,12 @@ static inline void iommu_put_resv_regions(struct device *dev,
 {
 }
 
+static inline int iommu_get_group_resv_regions(struct iommu_group *group,
+					       struct list_head *head)
+{
+	return -ENODEV;
+}
+
 static inline int iommu_request_dm_for_dev(struct device *dev)
 {
 	return -ENODEV;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v7 08/19] iommu: Implement reserved_regions iommu-group sysfs file
  2017-01-09 13:45 [PATCH v7 00/19] KVM PCIe/MSI passthrough on ARM/ARM64 and IOVA reserved regions Eric Auger
                   ` (6 preceding siblings ...)
  2017-01-09 13:45 ` [PATCH v7 07/19] iommu: iommu_get_group_resv_regions Eric Auger
@ 2017-01-09 13:45 ` Eric Auger
  2017-01-10 16:20   ` Auger Eric
  2017-01-09 13:46 ` [PATCH v7 09/19] iommu/vt-d: Implement reserved region get/put callbacks Eric Auger
                   ` (11 subsequent siblings)
  19 siblings, 1 reply; 33+ messages in thread
From: Eric Auger @ 2017-01-09 13:45 UTC (permalink / raw)
  To: eric.auger, eric.auger.pro, christoffer.dall, marc.zyngier,
	robin.murphy, alex.williamson, will.deacon, joro, tglx, jason,
	linux-arm-kernel
  Cc: kvm, drjones, linux-kernel, pranav.sawargaonkar, iommu,
	punit.agrawal, diana.craciun, gpkulkarni, shankerd,
	bharat.bhushan, geethasowjanya.akula

A new iommu-group sysfs attribute file is introduced. It contains
the list of reserved regions for the iommu-group. Each reserved
region is described on a separate line:
- first field is the start IOVA address,
- second is the end IOVA address,
- third is the type.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
v6 -> v7:
- also report the type of the reserved region as a string
- updated ABI documentation

v3 -> v4:
- add cast to long long int when printing to avoid warning on
  i386
- change S_IRUGO into 0444
- remove sort. The list is natively sorted now.

The file layout is inspired of /sys/bus/pci/devices/BDF/resource.
I also read Documentation/filesystems/sysfs.txt so I expect this
to be frowned upon.
---
 .../ABI/testing/sysfs-kernel-iommu_groups          | 12 +++++++
 drivers/iommu/iommu.c                              | 38 ++++++++++++++++++++++
 2 files changed, 50 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-kernel-iommu_groups b/Documentation/ABI/testing/sysfs-kernel-iommu_groups
index 9b31556..35c64e0 100644
--- a/Documentation/ABI/testing/sysfs-kernel-iommu_groups
+++ b/Documentation/ABI/testing/sysfs-kernel-iommu_groups
@@ -12,3 +12,15 @@ Description:	/sys/kernel/iommu_groups/ contains a number of sub-
 		file if the IOMMU driver has chosen to register a more
 		common name for the group.
 Users:
+
+What:		/sys/kernel/iommu_groups/reserved_regions
+Date: 		January 2017
+KernelVersion:  v4.11
+Contact: 	Eric Auger <eric.auger@redhat.com>
+Description:    /sys/kernel/iommu_groups/reserved_regions list IOVA
+		regions that are reserved. Not necessarily all
+		reserved regions are listed. This is typically used to
+		output direct-mapped, MSI, non mappable regions. Each
+		region is described on a single line: the 1st field is
+		the base IOVA, the second is the end IOVA and the third
+		field describes the type of the region.
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 640056b..0123daa 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -68,6 +68,12 @@ struct iommu_group_attribute {
 			 const char *buf, size_t count);
 };
 
+static const char * const iommu_group_resv_type_string[] = {
+	[IOMMU_RESV_DIRECT]	= "direct",
+	[IOMMU_RESV_RESERVED]	= "reserved",
+	[IOMMU_RESV_MSI]	= "msi",
+};
+
 #define IOMMU_GROUP_ATTR(_name, _mode, _show, _store)		\
 struct iommu_group_attribute iommu_group_attr_##_name =		\
 	__ATTR(_name, _mode, _show, _store)
@@ -231,8 +237,33 @@ int iommu_get_group_resv_regions(struct iommu_group *group,
 }
 EXPORT_SYMBOL_GPL(iommu_get_group_resv_regions);
 
+static ssize_t iommu_group_show_resv_regions(struct iommu_group *group,
+					     char *buf)
+{
+	struct iommu_resv_region *region, *next;
+	struct list_head group_resv_regions;
+	char *str = buf;
+
+	INIT_LIST_HEAD(&group_resv_regions);
+	iommu_get_group_resv_regions(group, &group_resv_regions);
+
+	list_for_each_entry_safe(region, next, &group_resv_regions, list) {
+		str += sprintf(str, "0x%016llx 0x%016llx %s\n",
+			       (long long int)region->start,
+			       (long long int)(region->start +
+						region->length - 1),
+			       iommu_group_resv_type_string[region->type]);
+		kfree(region);
+	}
+
+	return (str - buf);
+}
+
 static IOMMU_GROUP_ATTR(name, S_IRUGO, iommu_group_show_name, NULL);
 
+static IOMMU_GROUP_ATTR(reserved_regions, 0444,
+			iommu_group_show_resv_regions, NULL);
+
 static void iommu_group_release(struct kobject *kobj)
 {
 	struct iommu_group *group = to_iommu_group(kobj);
@@ -247,6 +278,8 @@ static void iommu_group_release(struct kobject *kobj)
 	if (group->default_domain)
 		iommu_domain_free(group->default_domain);
 
+	iommu_group_remove_file(group, &iommu_group_attr_reserved_regions);
+
 	kfree(group->name);
 	kfree(group);
 }
@@ -310,6 +343,11 @@ struct iommu_group *iommu_group_alloc(void)
 	 */
 	kobject_put(&group->kobj);
 
+	ret = iommu_group_create_file(group,
+				      &iommu_group_attr_reserved_regions);
+	if (ret)
+		return ERR_PTR(ret);
+
 	pr_debug("Allocated group %d\n", group->id);
 
 	return group;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v7 09/19] iommu/vt-d: Implement reserved region get/put callbacks
  2017-01-09 13:45 [PATCH v7 00/19] KVM PCIe/MSI passthrough on ARM/ARM64 and IOVA reserved regions Eric Auger
                   ` (7 preceding siblings ...)
  2017-01-09 13:45 ` [PATCH v7 08/19] iommu: Implement reserved_regions iommu-group sysfs file Eric Auger
@ 2017-01-09 13:46 ` Eric Auger
  2017-01-09 13:46 ` [PATCH v7 10/19] iommu/amd: Declare MSI and HT regions as reserved IOVA regions Eric Auger
                   ` (10 subsequent siblings)
  19 siblings, 0 replies; 33+ messages in thread
From: Eric Auger @ 2017-01-09 13:46 UTC (permalink / raw)
  To: eric.auger, eric.auger.pro, christoffer.dall, marc.zyngier,
	robin.murphy, alex.williamson, will.deacon, joro, tglx, jason,
	linux-arm-kernel
  Cc: kvm, drjones, linux-kernel, pranav.sawargaonkar, iommu,
	punit.agrawal, diana.craciun, gpkulkarni, shankerd,
	bharat.bhushan, geethasowjanya.akula

This patch registers the [FEE0_0000h - FEF0_000h] 1MB MSI
range as a reserved region and RMRR regions as direct regions.

This will allow to report those reserved regions in the
iommu-group sysfs.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
v6 -> v7:
- report RMRR regions as direct regions
- Due to the usage of rcu_read_lock, the rmrr reserved region
  allocation is done on rmrr allocation.
- use IOMMU_RESV_RESERVED

RFCv2 -> RFCv3:
- use get/put_resv_region callbacks.

RFC v1 -> RFC v2:
- fix intel_iommu_add_reserved_regions name
- use IOAPIC_RANGE_START and IOAPIC_RANGE_END defines
- return if the MSI region is already registered;
---
 drivers/iommu/intel-iommu.c | 92 ++++++++++++++++++++++++++++++++++++---------
 1 file changed, 74 insertions(+), 18 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 8a18525..bce59a5 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -440,6 +440,7 @@ struct dmar_rmrr_unit {
 	u64	end_address;		/* reserved end address */
 	struct dmar_dev_scope *devices;	/* target devices */
 	int	devices_cnt;		/* target device count */
+	struct iommu_resv_region *resv; /* reserved region handle */
 };
 
 struct dmar_atsr_unit {
@@ -4246,27 +4247,40 @@ static inline void init_iommu_pm_ops(void) {}
 int __init dmar_parse_one_rmrr(struct acpi_dmar_header *header, void *arg)
 {
 	struct acpi_dmar_reserved_memory *rmrr;
+	int prot = DMA_PTE_READ|DMA_PTE_WRITE;
 	struct dmar_rmrr_unit *rmrru;
+	size_t length;
 
 	rmrru = kzalloc(sizeof(*rmrru), GFP_KERNEL);
 	if (!rmrru)
-		return -ENOMEM;
+		goto out;
 
 	rmrru->hdr = header;
 	rmrr = (struct acpi_dmar_reserved_memory *)header;
 	rmrru->base_address = rmrr->base_address;
 	rmrru->end_address = rmrr->end_address;
+
+	length = rmrr->end_address - rmrr->base_address + 1;
+	rmrru->resv = iommu_alloc_resv_region(rmrr->base_address, length, prot,
+					      IOMMU_RESV_DIRECT);
+	if (!rmrru->resv)
+		goto free_rmrru;
+
 	rmrru->devices = dmar_alloc_dev_scope((void *)(rmrr + 1),
 				((void *)rmrr) + rmrr->header.length,
 				&rmrru->devices_cnt);
-	if (rmrru->devices_cnt && rmrru->devices == NULL) {
-		kfree(rmrru);
-		return -ENOMEM;
-	}
+	if (rmrru->devices_cnt && rmrru->devices == NULL)
+		goto free_all;
 
 	list_add(&rmrru->list, &dmar_rmrr_units);
 
 	return 0;
+free_all:
+	kfree(rmrru->resv);
+free_rmrru:
+	kfree(rmrru);
+out:
+	return -ENOMEM;
 }
 
 static struct dmar_atsr_unit *dmar_find_atsr(struct acpi_dmar_atsr *atsr)
@@ -4480,6 +4494,7 @@ static void intel_iommu_free_dmars(void)
 	list_for_each_entry_safe(rmrru, rmrr_n, &dmar_rmrr_units, list) {
 		list_del(&rmrru->list);
 		dmar_free_dev_scope(&rmrru->devices, &rmrru->devices_cnt);
+		kfree(rmrru->resv);
 		kfree(rmrru);
 	}
 
@@ -5203,6 +5218,45 @@ static void intel_iommu_remove_device(struct device *dev)
 	iommu_device_unlink(iommu->iommu_dev, dev);
 }
 
+static void intel_iommu_get_resv_regions(struct device *device,
+					 struct list_head *head)
+{
+	struct iommu_resv_region *reg;
+	struct dmar_rmrr_unit *rmrr;
+	struct device *i_dev;
+	int i;
+
+	rcu_read_lock();
+	for_each_rmrr_units(rmrr) {
+		for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
+					  i, i_dev) {
+			if (i_dev != device)
+				continue;
+
+			list_add_tail(&rmrr->resv->list, head);
+		}
+	}
+	rcu_read_unlock();
+
+	reg = iommu_alloc_resv_region(IOAPIC_RANGE_START,
+				      IOAPIC_RANGE_END - IOAPIC_RANGE_START + 1,
+				      0, IOMMU_RESV_RESERVED);
+	if (!reg)
+		return;
+	list_add_tail(&reg->list, head);
+}
+
+static void intel_iommu_put_resv_regions(struct device *dev,
+					 struct list_head *head)
+{
+	struct iommu_resv_region *entry, *next;
+
+	list_for_each_entry_safe(entry, next, head, list) {
+		if (entry->type == IOMMU_RESV_RESERVED)
+			kfree(entry);
+	}
+}
+
 #ifdef CONFIG_INTEL_IOMMU_SVM
 #define MAX_NR_PASID_BITS (20)
 static inline unsigned long intel_iommu_get_pts(struct intel_iommu *iommu)
@@ -5333,19 +5387,21 @@ struct intel_iommu *intel_svm_device_to_iommu(struct device *dev)
 #endif /* CONFIG_INTEL_IOMMU_SVM */
 
 static const struct iommu_ops intel_iommu_ops = {
-	.capable	= intel_iommu_capable,
-	.domain_alloc	= intel_iommu_domain_alloc,
-	.domain_free	= intel_iommu_domain_free,
-	.attach_dev	= intel_iommu_attach_device,
-	.detach_dev	= intel_iommu_detach_device,
-	.map		= intel_iommu_map,
-	.unmap		= intel_iommu_unmap,
-	.map_sg		= default_iommu_map_sg,
-	.iova_to_phys	= intel_iommu_iova_to_phys,
-	.add_device	= intel_iommu_add_device,
-	.remove_device	= intel_iommu_remove_device,
-	.device_group   = pci_device_group,
-	.pgsize_bitmap	= INTEL_IOMMU_PGSIZES,
+	.capable		= intel_iommu_capable,
+	.domain_alloc		= intel_iommu_domain_alloc,
+	.domain_free		= intel_iommu_domain_free,
+	.attach_dev		= intel_iommu_attach_device,
+	.detach_dev		= intel_iommu_detach_device,
+	.map			= intel_iommu_map,
+	.unmap			= intel_iommu_unmap,
+	.map_sg			= default_iommu_map_sg,
+	.iova_to_phys		= intel_iommu_iova_to_phys,
+	.add_device		= intel_iommu_add_device,
+	.remove_device		= intel_iommu_remove_device,
+	.get_resv_regions	= intel_iommu_get_resv_regions,
+	.put_resv_regions	= intel_iommu_put_resv_regions,
+	.device_group		= pci_device_group,
+	.pgsize_bitmap		= INTEL_IOMMU_PGSIZES,
 };
 
 static void quirk_iommu_g4x_gfx(struct pci_dev *dev)
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v7 10/19] iommu/amd: Declare MSI and HT regions as reserved IOVA regions
  2017-01-09 13:45 [PATCH v7 00/19] KVM PCIe/MSI passthrough on ARM/ARM64 and IOVA reserved regions Eric Auger
                   ` (8 preceding siblings ...)
  2017-01-09 13:46 ` [PATCH v7 09/19] iommu/vt-d: Implement reserved region get/put callbacks Eric Auger
@ 2017-01-09 13:46 ` Eric Auger
  2017-01-09 13:46 ` [PATCH v7 11/19] iommu/arm-smmu: Implement reserved region get/put callbacks Eric Auger
                   ` (9 subsequent siblings)
  19 siblings, 0 replies; 33+ messages in thread
From: Eric Auger @ 2017-01-09 13:46 UTC (permalink / raw)
  To: eric.auger, eric.auger.pro, christoffer.dall, marc.zyngier,
	robin.murphy, alex.williamson, will.deacon, joro, tglx, jason,
	linux-arm-kernel
  Cc: kvm, drjones, linux-kernel, pranav.sawargaonkar, iommu,
	punit.agrawal, diana.craciun, gpkulkarni, shankerd,
	bharat.bhushan, geethasowjanya.akula

This patch registers the MSI and HT regions as non mappable
reserved regions. They will be exposed in the iommu-group sysfs.

For direct-mapped regions let's also use iommu_alloc_resv_region().

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
v6 -> v7:
- use IOMMU_RESV_RESERVED

v5: creation
---
 drivers/iommu/amd_iommu.c | 37 ++++++++++++++++++++++++++-----------
 1 file changed, 26 insertions(+), 11 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 5f7ea4f..d109e41 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -3164,6 +3164,7 @@ static bool amd_iommu_capable(enum iommu_cap cap)
 static void amd_iommu_get_resv_regions(struct device *dev,
 				       struct list_head *head)
 {
+	struct iommu_resv_region *region;
 	struct unity_map_entry *entry;
 	int devid;
 
@@ -3172,28 +3173,42 @@ static void amd_iommu_get_resv_regions(struct device *dev,
 		return;
 
 	list_for_each_entry(entry, &amd_iommu_unity_map, list) {
-		struct iommu_resv_region *region;
+		size_t length;
+		int prot = 0;
 
 		if (devid < entry->devid_start || devid > entry->devid_end)
 			continue;
 
-		region = kzalloc(sizeof(*region), GFP_KERNEL);
+		length = entry->address_end - entry->address_start;
+		if (entry->prot & IOMMU_PROT_IR)
+			prot |= IOMMU_READ;
+		if (entry->prot & IOMMU_PROT_IW)
+			prot |= IOMMU_WRITE;
+
+		region = iommu_alloc_resv_region(entry->address_start,
+						 length, prot,
+						 IOMMU_RESV_DIRECT);
 		if (!region) {
 			pr_err("Out of memory allocating dm-regions for %s\n",
 				dev_name(dev));
 			return;
 		}
-
-		region->start = entry->address_start;
-		region->length = entry->address_end - entry->address_start;
-		region->type = IOMMU_RESV_DIRECT;
-		if (entry->prot & IOMMU_PROT_IR)
-			region->prot |= IOMMU_READ;
-		if (entry->prot & IOMMU_PROT_IW)
-			region->prot |= IOMMU_WRITE;
-
 		list_add_tail(&region->list, head);
 	}
+
+	region = iommu_alloc_resv_region(MSI_RANGE_START,
+					 MSI_RANGE_END - MSI_RANGE_START + 1,
+					 0, IOMMU_RESV_RESERVED);
+	if (!region)
+		return;
+	list_add_tail(&region->list, head);
+
+	region = iommu_alloc_resv_region(HT_RANGE_START,
+					 HT_RANGE_END - HT_RANGE_START + 1,
+					 0, IOMMU_RESV_RESERVED);
+	if (!region)
+		return;
+	list_add_tail(&region->list, head);
 }
 
 static void amd_iommu_put_resv_regions(struct device *dev,
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v7 11/19] iommu/arm-smmu: Implement reserved region get/put callbacks
  2017-01-09 13:45 [PATCH v7 00/19] KVM PCIe/MSI passthrough on ARM/ARM64 and IOVA reserved regions Eric Auger
                   ` (9 preceding siblings ...)
  2017-01-09 13:46 ` [PATCH v7 10/19] iommu/amd: Declare MSI and HT regions as reserved IOVA regions Eric Auger
@ 2017-01-09 13:46 ` Eric Auger
  2017-01-10 14:16   ` Will Deacon
  2017-01-09 13:46 ` [PATCH v7 12/19] iommu/arm-smmu-v3: " Eric Auger
                   ` (8 subsequent siblings)
  19 siblings, 1 reply; 33+ messages in thread
From: Eric Auger @ 2017-01-09 13:46 UTC (permalink / raw)
  To: eric.auger, eric.auger.pro, christoffer.dall, marc.zyngier,
	robin.murphy, alex.williamson, will.deacon, joro, tglx, jason,
	linux-arm-kernel
  Cc: kvm, drjones, linux-kernel, pranav.sawargaonkar, iommu,
	punit.agrawal, diana.craciun, gpkulkarni, shankerd,
	bharat.bhushan, geethasowjanya.akula

The get() populates the list with the MSI IOVA reserved window.

At the moment an arbitray MSI IOVA window is set at 0x8000000
of size 1MB. This will allow to report those info in iommu-group
sysfs.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v3 -> v4:
- do not handle PCI host bridge windows anymore
- encode prot

RFC v2 -> v3:
- use existing get/put_resv_regions

RFC v1 -> v2:
- use defines for MSI IOVA base and length
---
 drivers/iommu/arm-smmu.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index a60cded..a354572 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -281,6 +281,9 @@ enum arm_smmu_s2cr_privcfg {
 
 #define FSYNR0_WNR			(1 << 4)
 
+#define MSI_IOVA_BASE			0x8000000
+#define MSI_IOVA_LENGTH			0x100000
+
 static int force_stage;
 module_param(force_stage, int, S_IRUGO);
 MODULE_PARM_DESC(force_stage,
@@ -1549,6 +1552,29 @@ static int arm_smmu_of_xlate(struct device *dev, struct of_phandle_args *args)
 	return iommu_fwspec_add_ids(dev, &fwid, 1);
 }
 
+static void arm_smmu_get_resv_regions(struct device *dev,
+				      struct list_head *head)
+{
+	struct iommu_resv_region *region;
+	int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
+
+	region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
+					 prot, IOMMU_RESV_MSI);
+	if (!region)
+		return;
+
+	list_add_tail(&region->list, head);
+}
+
+static void arm_smmu_put_resv_regions(struct device *dev,
+				      struct list_head *head)
+{
+	struct iommu_resv_region *entry, *next;
+
+	list_for_each_entry_safe(entry, next, head, list)
+		kfree(entry);
+}
+
 static struct iommu_ops arm_smmu_ops = {
 	.capable		= arm_smmu_capable,
 	.domain_alloc		= arm_smmu_domain_alloc,
@@ -1564,6 +1590,8 @@ static int arm_smmu_of_xlate(struct device *dev, struct of_phandle_args *args)
 	.domain_get_attr	= arm_smmu_domain_get_attr,
 	.domain_set_attr	= arm_smmu_domain_set_attr,
 	.of_xlate		= arm_smmu_of_xlate,
+	.get_resv_regions	= arm_smmu_get_resv_regions,
+	.put_resv_regions	= arm_smmu_put_resv_regions,
 	.pgsize_bitmap		= -1UL, /* Restricted during device attach */
 };
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v7 12/19] iommu/arm-smmu-v3: Implement reserved region get/put callbacks
  2017-01-09 13:45 [PATCH v7 00/19] KVM PCIe/MSI passthrough on ARM/ARM64 and IOVA reserved regions Eric Auger
                   ` (10 preceding siblings ...)
  2017-01-09 13:46 ` [PATCH v7 11/19] iommu/arm-smmu: Implement reserved region get/put callbacks Eric Auger
@ 2017-01-09 13:46 ` Eric Auger
  2017-01-10 14:17   ` Will Deacon
  2017-01-09 13:46 ` [PATCH v7 13/19] irqdomain: Add irq domain MSI and MSI_REMAP flags Eric Auger
                   ` (7 subsequent siblings)
  19 siblings, 1 reply; 33+ messages in thread
From: Eric Auger @ 2017-01-09 13:46 UTC (permalink / raw)
  To: eric.auger, eric.auger.pro, christoffer.dall, marc.zyngier,
	robin.murphy, alex.williamson, will.deacon, joro, tglx, jason,
	linux-arm-kernel
  Cc: kvm, drjones, linux-kernel, pranav.sawargaonkar, iommu,
	punit.agrawal, diana.craciun, gpkulkarni, shankerd,
	bharat.bhushan, geethasowjanya.akula

iommu/arm-smmu: Implement reserved region get/put callbacks

The get() populates the list with the MSI IOVA reserved window.

At the moment an arbitray MSI IOVA window is set at 0x8000000
of size 1MB. This will allow to report those info in iommu-group
sysfs.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v4: creation
---
 drivers/iommu/arm-smmu-v3.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 4d6ec44..6c4111c 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -412,6 +412,9 @@
 /* High-level queue structures */
 #define ARM_SMMU_POLL_TIMEOUT_US	100
 
+#define MSI_IOVA_BASE			0x8000000
+#define MSI_IOVA_LENGTH			0x100000
+
 static bool disable_bypass;
 module_param_named(disable_bypass, disable_bypass, bool, S_IRUGO);
 MODULE_PARM_DESC(disable_bypass,
@@ -1883,6 +1886,29 @@ static int arm_smmu_of_xlate(struct device *dev, struct of_phandle_args *args)
 	return iommu_fwspec_add_ids(dev, args->args, 1);
 }
 
+static void arm_smmu_get_resv_regions(struct device *dev,
+				      struct list_head *head)
+{
+	struct iommu_resv_region *region;
+	int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
+
+	region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
+					 prot, IOMMU_RESV_MSI);
+	if (!region)
+		return;
+
+	list_add_tail(&region->list, head);
+}
+
+static void arm_smmu_put_resv_regions(struct device *dev,
+				      struct list_head *head)
+{
+	struct iommu_resv_region *entry, *next;
+
+	list_for_each_entry_safe(entry, next, head, list)
+		kfree(entry);
+}
+
 static struct iommu_ops arm_smmu_ops = {
 	.capable		= arm_smmu_capable,
 	.domain_alloc		= arm_smmu_domain_alloc,
@@ -1898,6 +1924,8 @@ static int arm_smmu_of_xlate(struct device *dev, struct of_phandle_args *args)
 	.domain_get_attr	= arm_smmu_domain_get_attr,
 	.domain_set_attr	= arm_smmu_domain_set_attr,
 	.of_xlate		= arm_smmu_of_xlate,
+	.get_resv_regions	= arm_smmu_get_resv_regions,
+	.put_resv_regions	= arm_smmu_put_resv_regions,
 	.pgsize_bitmap		= -1UL, /* Restricted during device attach */
 };
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v7 13/19] irqdomain: Add irq domain MSI and MSI_REMAP flags
  2017-01-09 13:45 [PATCH v7 00/19] KVM PCIe/MSI passthrough on ARM/ARM64 and IOVA reserved regions Eric Auger
                   ` (11 preceding siblings ...)
  2017-01-09 13:46 ` [PATCH v7 12/19] iommu/arm-smmu-v3: " Eric Auger
@ 2017-01-09 13:46 ` Eric Auger
  2017-01-10 11:50   ` Marc Zyngier
  2017-01-09 13:46 ` [PATCH v7 14/19] genirq/msi: Set IRQ_DOMAIN_FLAG_MSI on MSI domain creation Eric Auger
                   ` (6 subsequent siblings)
  19 siblings, 1 reply; 33+ messages in thread
From: Eric Auger @ 2017-01-09 13:46 UTC (permalink / raw)
  To: eric.auger, eric.auger.pro, christoffer.dall, marc.zyngier,
	robin.murphy, alex.williamson, will.deacon, joro, tglx, jason,
	linux-arm-kernel
  Cc: kvm, drjones, linux-kernel, pranav.sawargaonkar, iommu,
	punit.agrawal, diana.craciun, gpkulkarni, shankerd,
	bharat.bhushan, geethasowjanya.akula

We introduce two new enum values for the irq domain flag:
- IRQ_DOMAIN_FLAG_MSI indicates the irq domain corresponds to
  an MSI domain
- IRQ_DOMAIN_FLAG_MSI_REMAP indicates the irq domain has MSI
  remapping capabilities.

Those values will be useful to check all MSI irq domains have
MSI remapping support when assessing the safety of IRQ assignment
to a guest.

irq_domain_hierarchical_is_msi_remap() allows to check if an
irq domain or any parent implements MSI remapping.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v6:
- add IRQ_DOMAIN_FLAG_MSI as suggested by Marc
- add irq_domain_is_msi, irq_domain_is_msi_remap and
  irq_domain_hierarchical_is_msi_remap
---
 include/linux/irqdomain.h | 35 +++++++++++++++++++++++++++++++++++
 kernel/irq/irqdomain.c    | 16 ++++++++++++++++
 2 files changed, 51 insertions(+)

diff --git a/include/linux/irqdomain.h b/include/linux/irqdomain.h
index ffb8460..bc2f571 100644
--- a/include/linux/irqdomain.h
+++ b/include/linux/irqdomain.h
@@ -183,6 +183,12 @@ enum {
 	/* Irq domain is an IPI domain with single virq */
 	IRQ_DOMAIN_FLAG_IPI_SINGLE	= (1 << 3),
 
+	/* Irq domain implements MSIs */
+	IRQ_DOMAIN_FLAG_MSI		= (1 << 4),
+
+	/* Irq domain implements MSI remapping */
+	IRQ_DOMAIN_FLAG_MSI_REMAP	= (1 << 5),
+
 	/*
 	 * Flags starting from IRQ_DOMAIN_FLAG_NONCORE are reserved
 	 * for implementation specific purposes and ignored by the
@@ -446,6 +452,19 @@ static inline bool irq_domain_is_ipi_single(struct irq_domain *domain)
 {
 	return domain->flags & IRQ_DOMAIN_FLAG_IPI_SINGLE;
 }
+
+static inline bool irq_domain_is_msi(struct irq_domain *domain)
+{
+	return domain->flags & IRQ_DOMAIN_FLAG_MSI;
+}
+
+static inline bool irq_domain_is_msi_remap(struct irq_domain *domain)
+{
+	return domain->flags & IRQ_DOMAIN_FLAG_MSI_REMAP;
+}
+
+extern bool irq_domain_hierarchical_is_msi_remap(struct irq_domain *domain);
+
 #else	/* CONFIG_IRQ_DOMAIN_HIERARCHY */
 static inline void irq_domain_activate_irq(struct irq_data *data) { }
 static inline void irq_domain_deactivate_irq(struct irq_data *data) { }
@@ -477,6 +496,22 @@ static inline bool irq_domain_is_ipi_single(struct irq_domain *domain)
 {
 	return false;
 }
+
+static inline bool irq_domain_is_msi(struct irq_domain *domain)
+{
+	return false;
+}
+
+static inline bool irq_domain_is_msi_remap(struct irq_domain *domain)
+{
+	return false;
+}
+
+static inline bool
+irq_domain_hierarchical_is_msi_remap(struct irq_domain *domain)
+{
+	return false;
+}
 #endif	/* CONFIG_IRQ_DOMAIN_HIERARCHY */
 
 #else /* CONFIG_IRQ_DOMAIN */
diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
index 8c0a0ae..101ee8f 100644
--- a/kernel/irq/irqdomain.c
+++ b/kernel/irq/irqdomain.c
@@ -1392,6 +1392,22 @@ static void irq_domain_check_hierarchy(struct irq_domain *domain)
 	if (domain->ops->alloc)
 		domain->flags |= IRQ_DOMAIN_FLAG_HIERARCHY;
 }
+
+/**
+ * irq_domain_hierarchical_is_msi_remap - Check if the domain or any
+ * parent has MSI remapping support
+ * @domain: domain pointer
+ */
+bool irq_domain_hierarchical_is_msi_remap(struct irq_domain *domain)
+{
+	struct irq_domain *h = domain;
+
+	for (; h; h = h->parent) {
+		if (irq_domain_is_msi_remap(h))
+			return true;
+	}
+	return false;
+}
 #else	/* CONFIG_IRQ_DOMAIN_HIERARCHY */
 /**
  * irq_domain_get_irq_data - Get irq_data associated with @virq and @domain
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v7 14/19] genirq/msi: Set IRQ_DOMAIN_FLAG_MSI on MSI domain creation
  2017-01-09 13:45 [PATCH v7 00/19] KVM PCIe/MSI passthrough on ARM/ARM64 and IOVA reserved regions Eric Auger
                   ` (12 preceding siblings ...)
  2017-01-09 13:46 ` [PATCH v7 13/19] irqdomain: Add irq domain MSI and MSI_REMAP flags Eric Auger
@ 2017-01-09 13:46 ` Eric Auger
  2017-01-09 13:46 ` [PATCH v7 15/19] irqdomain: irq_domain_check_msi_remap Eric Auger
                   ` (5 subsequent siblings)
  19 siblings, 0 replies; 33+ messages in thread
From: Eric Auger @ 2017-01-09 13:46 UTC (permalink / raw)
  To: eric.auger, eric.auger.pro, christoffer.dall, marc.zyngier,
	robin.murphy, alex.williamson, will.deacon, joro, tglx, jason,
	linux-arm-kernel
  Cc: kvm, drjones, linux-kernel, pranav.sawargaonkar, iommu,
	punit.agrawal, diana.craciun, gpkulkarni, shankerd,
	bharat.bhushan, geethasowjanya.akula

Now we have a flag value indicating an IRQ domain implements MSI,
let's set it on msi_create_irq_domain().

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v6: new
---
 kernel/irq/msi.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/irq/msi.c b/kernel/irq/msi.c
index ee23006..ddc2f54 100644
--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -270,8 +270,8 @@ struct irq_domain *msi_create_irq_domain(struct fwnode_handle *fwnode,
 	if (info->flags & MSI_FLAG_USE_DEF_CHIP_OPS)
 		msi_domain_update_chip_ops(info);
 
-	return irq_domain_create_hierarchy(parent, 0, 0, fwnode,
-					   &msi_domain_ops, info);
+	return irq_domain_create_hierarchy(parent, IRQ_DOMAIN_FLAG_MSI, 0,
+					   fwnode, &msi_domain_ops, info);
 }
 
 int msi_domain_prepare_irqs(struct irq_domain *domain, struct device *dev,
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v7 15/19] irqdomain: irq_domain_check_msi_remap
  2017-01-09 13:45 [PATCH v7 00/19] KVM PCIe/MSI passthrough on ARM/ARM64 and IOVA reserved regions Eric Auger
                   ` (13 preceding siblings ...)
  2017-01-09 13:46 ` [PATCH v7 14/19] genirq/msi: Set IRQ_DOMAIN_FLAG_MSI on MSI domain creation Eric Auger
@ 2017-01-09 13:46 ` Eric Auger
  2017-01-10 11:54   ` Marc Zyngier
  2017-01-09 13:46 ` [PATCH v7 16/19] irqchip/gicv3-its: Sets IRQ_DOMAIN_FLAG_MSI_REMAP Eric Auger
                   ` (4 subsequent siblings)
  19 siblings, 1 reply; 33+ messages in thread
From: Eric Auger @ 2017-01-09 13:46 UTC (permalink / raw)
  To: eric.auger, eric.auger.pro, christoffer.dall, marc.zyngier,
	robin.murphy, alex.williamson, will.deacon, joro, tglx, jason,
	linux-arm-kernel
  Cc: kvm, drjones, linux-kernel, pranav.sawargaonkar, iommu,
	punit.agrawal, diana.craciun, gpkulkarni, shankerd,
	bharat.bhushan, geethasowjanya.akula

This new function checks whether all MSI irq domains
implement IRQ remapping. This is useful to understand
whether VFIO passthrough is safe with respect to interrupts.

On ARM typically an MSI controller can sit downstream
to the IOMMU without preventing VFIO passthrough.
As such any assigned device can write into the MSI doorbell.
In case the MSI controller implements IRQ remapping, assigned
devices will not be able to trigger interrupts towards the
host. On the contrary, the assignment must be emphasized as
unsafe with respect to interrupts.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
v5 -> v6:
- use irq_domain_hierarchical_is_msi_remap()
- comment rewording

v4 -> v5:
- Handle DOMAIN_BUS_FSL_MC_MSI domains
- Check parents
---
 include/linux/irqdomain.h |  1 +
 kernel/irq/irqdomain.c    | 23 +++++++++++++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/include/linux/irqdomain.h b/include/linux/irqdomain.h
index bc2f571..188eced 100644
--- a/include/linux/irqdomain.h
+++ b/include/linux/irqdomain.h
@@ -222,6 +222,7 @@ struct irq_domain *irq_domain_add_legacy(struct device_node *of_node,
 					 void *host_data);
 extern struct irq_domain *irq_find_matching_fwspec(struct irq_fwspec *fwspec,
 						   enum irq_domain_bus_token bus_token);
+extern bool irq_domain_check_msi_remap(void);
 extern void irq_set_default_host(struct irq_domain *host);
 extern int irq_domain_alloc_descs(int virq, unsigned int nr_irqs,
 				  irq_hw_number_t hwirq, int node,
diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
index 101ee8f..fff30cb 100644
--- a/kernel/irq/irqdomain.c
+++ b/kernel/irq/irqdomain.c
@@ -278,6 +278,29 @@ struct irq_domain *irq_find_matching_fwspec(struct irq_fwspec *fwspec,
 EXPORT_SYMBOL_GPL(irq_find_matching_fwspec);
 
 /**
+ * irq_domain_check_msi_remap - Check whether all MSI
+ * irq domains implement IRQ remapping
+ */
+bool irq_domain_check_msi_remap(void)
+{
+	struct irq_domain *h;
+	bool ret = true;
+
+	mutex_lock(&irq_domain_mutex);
+	list_for_each_entry(h, &irq_domain_list, link) {
+		if (irq_domain_is_msi(h) &&
+		    !irq_domain_hierarchical_is_msi_remap(h)) {
+			ret = false;
+			goto out;
+		}
+	}
+out:
+	mutex_unlock(&irq_domain_mutex);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(irq_domain_check_msi_remap);
+
+/**
  * irq_set_default_host() - Set a "default" irq domain
  * @domain: default domain pointer
  *
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v7 16/19] irqchip/gicv3-its: Sets IRQ_DOMAIN_FLAG_MSI_REMAP
  2017-01-09 13:45 [PATCH v7 00/19] KVM PCIe/MSI passthrough on ARM/ARM64 and IOVA reserved regions Eric Auger
                   ` (14 preceding siblings ...)
  2017-01-09 13:46 ` [PATCH v7 15/19] irqdomain: irq_domain_check_msi_remap Eric Auger
@ 2017-01-09 13:46 ` Eric Auger
  2017-01-10 11:57   ` Marc Zyngier
  2017-01-09 13:46 ` [PATCH v7 17/19] vfio/type1: Allow transparent MSI IOVA allocation Eric Auger
                   ` (3 subsequent siblings)
  19 siblings, 1 reply; 33+ messages in thread
From: Eric Auger @ 2017-01-09 13:46 UTC (permalink / raw)
  To: eric.auger, eric.auger.pro, christoffer.dall, marc.zyngier,
	robin.murphy, alex.williamson, will.deacon, joro, tglx, jason,
	linux-arm-kernel
  Cc: kvm, drjones, linux-kernel, pranav.sawargaonkar, iommu,
	punit.agrawal, diana.craciun, gpkulkarni, shankerd,
	bharat.bhushan, geethasowjanya.akula

The GICv3 ITS is MSI remapping capable. Let's advertise
this property so that VFIO passthrough can assess IRQ safety.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 drivers/irqchip/irq-gic-v3-its.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 69b040f..9d4fefc 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -1642,6 +1642,7 @@ static int its_init_domain(struct fwnode_handle *handle, struct its_node *its)
 
 	inner_domain->parent = its_parent;
 	inner_domain->bus_token = DOMAIN_BUS_NEXUS;
+	inner_domain->flags |= IRQ_DOMAIN_FLAG_MSI_REMAP;
 	info->ops = &its_msi_domain_ops;
 	info->data = its;
 	inner_domain->host_data = info;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v7 17/19] vfio/type1: Allow transparent MSI IOVA allocation
  2017-01-09 13:45 [PATCH v7 00/19] KVM PCIe/MSI passthrough on ARM/ARM64 and IOVA reserved regions Eric Auger
                   ` (15 preceding siblings ...)
  2017-01-09 13:46 ` [PATCH v7 16/19] irqchip/gicv3-its: Sets IRQ_DOMAIN_FLAG_MSI_REMAP Eric Auger
@ 2017-01-09 13:46 ` Eric Auger
  2017-01-09 13:46 ` [PATCH v7 18/19] vfio/type1: Check MSI remapping at irq domain level Eric Auger
                   ` (2 subsequent siblings)
  19 siblings, 0 replies; 33+ messages in thread
From: Eric Auger @ 2017-01-09 13:46 UTC (permalink / raw)
  To: eric.auger, eric.auger.pro, christoffer.dall, marc.zyngier,
	robin.murphy, alex.williamson, will.deacon, joro, tglx, jason,
	linux-arm-kernel
  Cc: kvm, drjones, linux-kernel, pranav.sawargaonkar, iommu,
	punit.agrawal, diana.craciun, gpkulkarni, shankerd,
	bharat.bhushan, geethasowjanya.akula

When attaching a group to the container, check the group's
reserved regions and test whether the IOMMU translates MSI
transactions. If yes, we initialize an IOVA allocator through
the iommu_get_msi_cookie API. This will allow the MSI IOVAs
to be transparently allocated on MSI controller's compose().

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v3 -> v4:
- test region's type: IOMMU_RESV_MSI
- restructure the code to prepare for safety assessment
- reword title
---
 drivers/vfio/vfio_iommu_type1.c | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 9266271..5651faf 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -39,6 +39,7 @@
 #include <linux/pid_namespace.h>
 #include <linux/mdev.h>
 #include <linux/notifier.h>
+#include <linux/dma-iommu.h>
 
 #define DRIVER_VERSION  "0.2"
 #define DRIVER_AUTHOR   "Alex Williamson <alex.williamson@redhat.com>"
@@ -1181,6 +1182,28 @@ static struct vfio_group *find_iommu_group(struct vfio_domain *domain,
 	return NULL;
 }
 
+static bool vfio_iommu_has_resv_msi(struct iommu_group *group,
+				    phys_addr_t *base)
+{
+	struct list_head group_resv_regions;
+	struct iommu_resv_region *region, *next;
+	bool ret = false;
+
+	INIT_LIST_HEAD(&group_resv_regions);
+	iommu_get_group_resv_regions(group, &group_resv_regions);
+	list_for_each_entry(region, &group_resv_regions, list) {
+		if (region->type & IOMMU_RESV_MSI) {
+			*base = region->start;
+			ret = true;
+			goto out;
+		}
+	}
+out:
+	list_for_each_entry_safe(region, next, &group_resv_regions, list)
+		kfree(region);
+	return ret;
+}
+
 static int vfio_iommu_type1_attach_group(void *iommu_data,
 					 struct iommu_group *iommu_group)
 {
@@ -1189,6 +1212,8 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	struct vfio_domain *domain, *d;
 	struct bus_type *bus = NULL, *mdev_bus;
 	int ret;
+	bool resv_msi;
+	phys_addr_t resv_msi_base;
 
 	mutex_lock(&iommu->lock);
 
@@ -1258,6 +1283,8 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	if (ret)
 		goto out_domain;
 
+	resv_msi = vfio_iommu_has_resv_msi(iommu_group, &resv_msi_base);
+
 	INIT_LIST_HEAD(&domain->group_list);
 	list_add(&group->next, &domain->group_list);
 
@@ -1304,6 +1331,9 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	if (ret)
 		goto out_detach;
 
+	if (resv_msi && iommu_get_msi_cookie(domain->domain, resv_msi_base))
+		goto out_detach;
+
 	list_add(&domain->next, &iommu->domain_list);
 
 	mutex_unlock(&iommu->lock);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v7 18/19] vfio/type1: Check MSI remapping at irq domain level
  2017-01-09 13:45 [PATCH v7 00/19] KVM PCIe/MSI passthrough on ARM/ARM64 and IOVA reserved regions Eric Auger
                   ` (16 preceding siblings ...)
  2017-01-09 13:46 ` [PATCH v7 17/19] vfio/type1: Allow transparent MSI IOVA allocation Eric Auger
@ 2017-01-09 13:46 ` Eric Auger
  2017-01-09 13:46 ` [PATCH v7 19/19] iommu/arm-smmu: Do not advertise IOMMU_CAP_INTR_REMAP anymore Eric Auger
  2017-01-10 14:09 ` [PATCH v7 00/19] KVM PCIe/MSI passthrough on ARM/ARM64 and IOVA reserved regions Joerg Roedel
  19 siblings, 0 replies; 33+ messages in thread
From: Eric Auger @ 2017-01-09 13:46 UTC (permalink / raw)
  To: eric.auger, eric.auger.pro, christoffer.dall, marc.zyngier,
	robin.murphy, alex.williamson, will.deacon, joro, tglx, jason,
	linux-arm-kernel
  Cc: kvm, drjones, linux-kernel, pranav.sawargaonkar, iommu,
	punit.agrawal, diana.craciun, gpkulkarni, shankerd,
	bharat.bhushan, geethasowjanya.akula

In case the IOMMU translates MSI transactions (typical case
on ARM), we check MSI remapping capability at IRQ domain
level. Otherwise it is checked at IOMMU level.

At this stage the arm-smmu-(v3) still advertise the
IOMMU_CAP_INTR_REMAP capability at IOMMU level. This will be
removed in subsequent patches.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v6: rewrite test
---
 drivers/vfio/vfio_iommu_type1.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 5651faf..ec903a0 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -40,6 +40,7 @@
 #include <linux/mdev.h>
 #include <linux/notifier.h>
 #include <linux/dma-iommu.h>
+#include <linux/irqdomain.h>
 
 #define DRIVER_VERSION  "0.2"
 #define DRIVER_AUTHOR   "Alex Williamson <alex.williamson@redhat.com>"
@@ -1212,7 +1213,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	struct vfio_domain *domain, *d;
 	struct bus_type *bus = NULL, *mdev_bus;
 	int ret;
-	bool resv_msi;
+	bool resv_msi, msi_remap;
 	phys_addr_t resv_msi_base;
 
 	mutex_lock(&iommu->lock);
@@ -1288,8 +1289,10 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	INIT_LIST_HEAD(&domain->group_list);
 	list_add(&group->next, &domain->group_list);
 
-	if (!allow_unsafe_interrupts &&
-	    !iommu_capable(bus, IOMMU_CAP_INTR_REMAP)) {
+	msi_remap = resv_msi ? irq_domain_check_msi_remap() :
+				iommu_capable(bus, IOMMU_CAP_INTR_REMAP);
+
+	if (!allow_unsafe_interrupts && !msi_remap) {
 		pr_warn("%s: No interrupt remapping support.  Use the module param \"allow_unsafe_interrupts\" to enable VFIO IOMMU support on this platform\n",
 		       __func__);
 		ret = -EPERM;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v7 19/19] iommu/arm-smmu: Do not advertise IOMMU_CAP_INTR_REMAP anymore
  2017-01-09 13:45 [PATCH v7 00/19] KVM PCIe/MSI passthrough on ARM/ARM64 and IOVA reserved regions Eric Auger
                   ` (17 preceding siblings ...)
  2017-01-09 13:46 ` [PATCH v7 18/19] vfio/type1: Check MSI remapping at irq domain level Eric Auger
@ 2017-01-09 13:46 ` Eric Auger
  2017-01-10 14:17   ` Will Deacon
  2017-01-10 14:09 ` [PATCH v7 00/19] KVM PCIe/MSI passthrough on ARM/ARM64 and IOVA reserved regions Joerg Roedel
  19 siblings, 1 reply; 33+ messages in thread
From: Eric Auger @ 2017-01-09 13:46 UTC (permalink / raw)
  To: eric.auger, eric.auger.pro, christoffer.dall, marc.zyngier,
	robin.murphy, alex.williamson, will.deacon, joro, tglx, jason,
	linux-arm-kernel
  Cc: kvm, drjones, linux-kernel, pranav.sawargaonkar, iommu,
	punit.agrawal, diana.craciun, gpkulkarni, shankerd,
	bharat.bhushan, geethasowjanya.akula

IOMMU_CAP_INTR_REMAP has been advertised in arm-smmu(-v3) although
on ARM this property is not attached to the IOMMU but rather is
implemented in the MSI controller (GICv3 ITS).

Now vfio_iommu_type1 checks MSI remapping capability at MSI controller
level, let's correct this.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 drivers/iommu/arm-smmu-v3.c | 2 --
 drivers/iommu/arm-smmu.c    | 2 --
 2 files changed, 4 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 6c4111c..d9cf6cb 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -1375,8 +1375,6 @@ static bool arm_smmu_capable(enum iommu_cap cap)
 	switch (cap) {
 	case IOMMU_CAP_CACHE_COHERENCY:
 		return true;
-	case IOMMU_CAP_INTR_REMAP:
-		return true; /* MSIs are just memory writes */
 	case IOMMU_CAP_NOEXEC:
 		return true;
 	default:
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index a354572..13d2600 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1374,8 +1374,6 @@ static bool arm_smmu_capable(enum iommu_cap cap)
 		 * requests.
 		 */
 		return true;
-	case IOMMU_CAP_INTR_REMAP:
-		return true; /* MSIs are just memory writes */
 	case IOMMU_CAP_NOEXEC:
 		return true;
 	default:
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH v7 13/19] irqdomain: Add irq domain MSI and MSI_REMAP flags
  2017-01-09 13:46 ` [PATCH v7 13/19] irqdomain: Add irq domain MSI and MSI_REMAP flags Eric Auger
@ 2017-01-10 11:50   ` Marc Zyngier
  0 siblings, 0 replies; 33+ messages in thread
From: Marc Zyngier @ 2017-01-10 11:50 UTC (permalink / raw)
  To: Eric Auger, eric.auger.pro, christoffer.dall, robin.murphy,
	alex.williamson, will.deacon, joro, tglx, jason,
	linux-arm-kernel
  Cc: kvm, drjones, linux-kernel, pranav.sawargaonkar, iommu,
	punit.agrawal, diana.craciun, gpkulkarni, shankerd,
	bharat.bhushan, geethasowjanya.akula

Hi Eric,

On 09/01/17 13:46, Eric Auger wrote:
> We introduce two new enum values for the irq domain flag:
> - IRQ_DOMAIN_FLAG_MSI indicates the irq domain corresponds to
>   an MSI domain
> - IRQ_DOMAIN_FLAG_MSI_REMAP indicates the irq domain has MSI
>   remapping capabilities.
> 
> Those values will be useful to check all MSI irq domains have
> MSI remapping support when assessing the safety of IRQ assignment
> to a guest.
> 
> irq_domain_hierarchical_is_msi_remap() allows to check if an
> irq domain or any parent implements MSI remapping.
> 
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> 
> ---
> 
> v6:
> - add IRQ_DOMAIN_FLAG_MSI as suggested by Marc
> - add irq_domain_is_msi, irq_domain_is_msi_remap and
>   irq_domain_hierarchical_is_msi_remap
> ---
>  include/linux/irqdomain.h | 35 +++++++++++++++++++++++++++++++++++
>  kernel/irq/irqdomain.c    | 16 ++++++++++++++++
>  2 files changed, 51 insertions(+)
> 
> diff --git a/include/linux/irqdomain.h b/include/linux/irqdomain.h
> index ffb8460..bc2f571 100644
> --- a/include/linux/irqdomain.h
> +++ b/include/linux/irqdomain.h
> @@ -183,6 +183,12 @@ enum {
>  	/* Irq domain is an IPI domain with single virq */
>  	IRQ_DOMAIN_FLAG_IPI_SINGLE	= (1 << 3),
>  
> +	/* Irq domain implements MSIs */
> +	IRQ_DOMAIN_FLAG_MSI		= (1 << 4),
> +
> +	/* Irq domain implements MSI remapping */
> +	IRQ_DOMAIN_FLAG_MSI_REMAP	= (1 << 5),
> +
>  	/*
>  	 * Flags starting from IRQ_DOMAIN_FLAG_NONCORE are reserved
>  	 * for implementation specific purposes and ignored by the
> @@ -446,6 +452,19 @@ static inline bool irq_domain_is_ipi_single(struct irq_domain *domain)
>  {
>  	return domain->flags & IRQ_DOMAIN_FLAG_IPI_SINGLE;
>  }
> +
> +static inline bool irq_domain_is_msi(struct irq_domain *domain)
> +{
> +	return domain->flags & IRQ_DOMAIN_FLAG_MSI;
> +}
> +
> +static inline bool irq_domain_is_msi_remap(struct irq_domain *domain)
> +{
> +	return domain->flags & IRQ_DOMAIN_FLAG_MSI_REMAP;
> +}
> +
> +extern bool irq_domain_hierarchical_is_msi_remap(struct irq_domain *domain);
> +
>  #else	/* CONFIG_IRQ_DOMAIN_HIERARCHY */
>  static inline void irq_domain_activate_irq(struct irq_data *data) { }
>  static inline void irq_domain_deactivate_irq(struct irq_data *data) { }
> @@ -477,6 +496,22 @@ static inline bool irq_domain_is_ipi_single(struct irq_domain *domain)
>  {
>  	return false;
>  }
> +
> +static inline bool irq_domain_is_msi(struct irq_domain *domain)
> +{
> +	return false;
> +}
> +
> +static inline bool irq_domain_is_msi_remap(struct irq_domain *domain)
> +{
> +	return false;
> +}
> +
> +static inline bool
> +irq_domain_hierarchical_is_msi_remap(struct irq_domain *domain)
> +{
> +	return false;
> +}
>  #endif	/* CONFIG_IRQ_DOMAIN_HIERARCHY */
>  
>  #else /* CONFIG_IRQ_DOMAIN */
> diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
> index 8c0a0ae..101ee8f 100644
> --- a/kernel/irq/irqdomain.c
> +++ b/kernel/irq/irqdomain.c
> @@ -1392,6 +1392,22 @@ static void irq_domain_check_hierarchy(struct irq_domain *domain)
>  	if (domain->ops->alloc)
>  		domain->flags |= IRQ_DOMAIN_FLAG_HIERARCHY;
>  }
> +
> +/**
> + * irq_domain_hierarchical_is_msi_remap - Check if the domain or any
> + * parent has MSI remapping support
> + * @domain: domain pointer
> + */
> +bool irq_domain_hierarchical_is_msi_remap(struct irq_domain *domain)
> +{
> +	struct irq_domain *h = domain;
> +
> +	for (; h; h = h->parent) {
> +		if (irq_domain_is_msi_remap(h))

nit: Why do you need to go via this extra 'h' variable?

	for (; domain; domain = domain->parent) {
		if (irq_domain_is_msi_remap(domain))

should work just as well.

> +			return true;
> +	}
> +	return false;
> +}
>  #else	/* CONFIG_IRQ_DOMAIN_HIERARCHY */
>  /**
>   * irq_domain_get_irq_data - Get irq_data associated with @virq and @domain
> 

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v7 15/19] irqdomain: irq_domain_check_msi_remap
  2017-01-09 13:46 ` [PATCH v7 15/19] irqdomain: irq_domain_check_msi_remap Eric Auger
@ 2017-01-10 11:54   ` Marc Zyngier
  0 siblings, 0 replies; 33+ messages in thread
From: Marc Zyngier @ 2017-01-10 11:54 UTC (permalink / raw)
  To: Eric Auger, eric.auger.pro, christoffer.dall, robin.murphy,
	alex.williamson, will.deacon, joro, tglx, jason,
	linux-arm-kernel
  Cc: kvm, drjones, linux-kernel, pranav.sawargaonkar, iommu,
	punit.agrawal, diana.craciun, gpkulkarni, shankerd,
	bharat.bhushan, geethasowjanya.akula

On 09/01/17 13:46, Eric Auger wrote:
> This new function checks whether all MSI irq domains
> implement IRQ remapping. This is useful to understand
> whether VFIO passthrough is safe with respect to interrupts.
> 
> On ARM typically an MSI controller can sit downstream
> to the IOMMU without preventing VFIO passthrough.
> As such any assigned device can write into the MSI doorbell.
> In case the MSI controller implements IRQ remapping, assigned
> devices will not be able to trigger interrupts towards the
> host. On the contrary, the assignment must be emphasized as
> unsafe with respect to interrupts.
> 
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> 
> ---
> v5 -> v6:
> - use irq_domain_hierarchical_is_msi_remap()
> - comment rewording
> 
> v4 -> v5:
> - Handle DOMAIN_BUS_FSL_MC_MSI domains
> - Check parents
> ---
>  include/linux/irqdomain.h |  1 +
>  kernel/irq/irqdomain.c    | 23 +++++++++++++++++++++++
>  2 files changed, 24 insertions(+)
> 
> diff --git a/include/linux/irqdomain.h b/include/linux/irqdomain.h
> index bc2f571..188eced 100644
> --- a/include/linux/irqdomain.h
> +++ b/include/linux/irqdomain.h
> @@ -222,6 +222,7 @@ struct irq_domain *irq_domain_add_legacy(struct device_node *of_node,
>  					 void *host_data);
>  extern struct irq_domain *irq_find_matching_fwspec(struct irq_fwspec *fwspec,
>  						   enum irq_domain_bus_token bus_token);
> +extern bool irq_domain_check_msi_remap(void);
>  extern void irq_set_default_host(struct irq_domain *host);
>  extern int irq_domain_alloc_descs(int virq, unsigned int nr_irqs,
>  				  irq_hw_number_t hwirq, int node,
> diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
> index 101ee8f..fff30cb 100644
> --- a/kernel/irq/irqdomain.c
> +++ b/kernel/irq/irqdomain.c
> @@ -278,6 +278,29 @@ struct irq_domain *irq_find_matching_fwspec(struct irq_fwspec *fwspec,
>  EXPORT_SYMBOL_GPL(irq_find_matching_fwspec);
>  
>  /**
> + * irq_domain_check_msi_remap - Check whether all MSI
> + * irq domains implement IRQ remapping
> + */
> +bool irq_domain_check_msi_remap(void)
> +{
> +	struct irq_domain *h;
> +	bool ret = true;
> +
> +	mutex_lock(&irq_domain_mutex);
> +	list_for_each_entry(h, &irq_domain_list, link) {
> +		if (irq_domain_is_msi(h) &&
> +		    !irq_domain_hierarchical_is_msi_remap(h)) {
> +			ret = false;
> +			goto out;

Let's avoid gotos if we can. a break statement will have the same effect
here, and we can drop the label.

> +		}
> +	}
> +out:
> +	mutex_unlock(&irq_domain_mutex);
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(irq_domain_check_msi_remap);
> +
> +/**
>   * irq_set_default_host() - Set a "default" irq domain
>   * @domain: default domain pointer
>   *
> 

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v7 16/19] irqchip/gicv3-its: Sets IRQ_DOMAIN_FLAG_MSI_REMAP
  2017-01-09 13:46 ` [PATCH v7 16/19] irqchip/gicv3-its: Sets IRQ_DOMAIN_FLAG_MSI_REMAP Eric Auger
@ 2017-01-10 11:57   ` Marc Zyngier
  0 siblings, 0 replies; 33+ messages in thread
From: Marc Zyngier @ 2017-01-10 11:57 UTC (permalink / raw)
  To: Eric Auger, eric.auger.pro, christoffer.dall, robin.murphy,
	alex.williamson, will.deacon, joro, tglx, jason,
	linux-arm-kernel
  Cc: kvm, drjones, linux-kernel, pranav.sawargaonkar, iommu,
	punit.agrawal, diana.craciun, gpkulkarni, shankerd,
	bharat.bhushan, geethasowjanya.akula

Hi Eric,

On 09/01/17 13:46, Eric Auger wrote:
> The GICv3 ITS is MSI remapping capable. Let's advertise
> this property so that VFIO passthrough can assess IRQ safety.
> 
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> ---
>  drivers/irqchip/irq-gic-v3-its.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index 69b040f..9d4fefc 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -1642,6 +1642,7 @@ static int its_init_domain(struct fwnode_handle *handle, struct its_node *its)
>  
>  	inner_domain->parent = its_parent;
>  	inner_domain->bus_token = DOMAIN_BUS_NEXUS;
> +	inner_domain->flags |= IRQ_DOMAIN_FLAG_MSI_REMAP;
>  	info->ops = &its_msi_domain_ops;
>  	info->data = its;
>  	inner_domain->host_data = info;
> 

For patches 13 to 16, and provided that you address the couple of nits I
mentioned in reply to patches 13 and 15:

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v7 00/19] KVM PCIe/MSI passthrough on ARM/ARM64 and IOVA reserved regions
  2017-01-09 13:45 [PATCH v7 00/19] KVM PCIe/MSI passthrough on ARM/ARM64 and IOVA reserved regions Eric Auger
                   ` (18 preceding siblings ...)
  2017-01-09 13:46 ` [PATCH v7 19/19] iommu/arm-smmu: Do not advertise IOMMU_CAP_INTR_REMAP anymore Eric Auger
@ 2017-01-10 14:09 ` Joerg Roedel
  2017-01-10 14:15   ` Will Deacon
  2017-01-10 15:58   ` Auger Eric
  19 siblings, 2 replies; 33+ messages in thread
From: Joerg Roedel @ 2017-01-10 14:09 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, christoffer.dall, marc.zyngier, robin.murphy,
	alex.williamson, will.deacon, tglx, jason, linux-arm-kernel, kvm,
	drjones, linux-kernel, pranav.sawargaonkar, iommu, punit.agrawal,
	diana.craciun, gpkulkarni, shankerd, bharat.bhushan,
	geethasowjanya.akula

On Mon, Jan 09, 2017 at 01:45:51PM +0000, Eric Auger wrote:
> Eric Auger (17):
>   iommu: Rename iommu_dm_regions into iommu_resv_regions
>   iommu: Add a new type field in iommu_resv_region
>   iommu: iommu_alloc_resv_region
>   iommu: Only map direct mapped regions
>   iommu: iommu_get_group_resv_regions
>   iommu: Implement reserved_regions iommu-group sysfs file
>   iommu/vt-d: Implement reserved region get/put callbacks
>   iommu/amd: Declare MSI and HT regions as reserved IOVA regions
>   iommu/arm-smmu: Implement reserved region get/put callbacks
>   iommu/arm-smmu-v3: Implement reserved region get/put callbacks

IOMMU patches look good, what is the plan to merge this? I'd like to
take the IOMMU patches and can provide a branch for someone else to base
the rest on.


	Joerg

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v7 00/19] KVM PCIe/MSI passthrough on ARM/ARM64 and IOVA reserved regions
  2017-01-10 14:09 ` [PATCH v7 00/19] KVM PCIe/MSI passthrough on ARM/ARM64 and IOVA reserved regions Joerg Roedel
@ 2017-01-10 14:15   ` Will Deacon
  2017-01-10 15:58   ` Auger Eric
  1 sibling, 0 replies; 33+ messages in thread
From: Will Deacon @ 2017-01-10 14:15 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Eric Auger, eric.auger.pro, christoffer.dall, marc.zyngier,
	robin.murphy, alex.williamson, tglx, jason, linux-arm-kernel,
	kvm, drjones, linux-kernel, pranav.sawargaonkar, iommu,
	punit.agrawal, diana.craciun, gpkulkarni, shankerd,
	bharat.bhushan, geethasowjanya.akula

On Tue, Jan 10, 2017 at 03:09:24PM +0100, Joerg Roedel wrote:
> On Mon, Jan 09, 2017 at 01:45:51PM +0000, Eric Auger wrote:
> > Eric Auger (17):
> >   iommu: Rename iommu_dm_regions into iommu_resv_regions
> >   iommu: Add a new type field in iommu_resv_region
> >   iommu: iommu_alloc_resv_region
> >   iommu: Only map direct mapped regions
> >   iommu: iommu_get_group_resv_regions
> >   iommu: Implement reserved_regions iommu-group sysfs file
> >   iommu/vt-d: Implement reserved region get/put callbacks
> >   iommu/amd: Declare MSI and HT regions as reserved IOVA regions
> >   iommu/arm-smmu: Implement reserved region get/put callbacks
> >   iommu/arm-smmu-v3: Implement reserved region get/put callbacks
> 
> IOMMU patches look good, what is the plan to merge this? I'd like to
> take the IOMMU patches and can provide a branch for someone else to base
> the rest on.

I'm perfectly happy with this going through you. In fact, I suspect you
could take the whole series once Marc is happy with the few remaining
irq domain niggles (which are really minor at this point).

That just leaves the VFIO bits -- Alex, are you happy with those now?
If not, we can hold off on the last three patches for the time being.

Will

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v7 11/19] iommu/arm-smmu: Implement reserved region get/put callbacks
  2017-01-09 13:46 ` [PATCH v7 11/19] iommu/arm-smmu: Implement reserved region get/put callbacks Eric Auger
@ 2017-01-10 14:16   ` Will Deacon
  0 siblings, 0 replies; 33+ messages in thread
From: Will Deacon @ 2017-01-10 14:16 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, christoffer.dall, marc.zyngier, robin.murphy,
	alex.williamson, joro, tglx, jason, linux-arm-kernel, kvm,
	drjones, linux-kernel, pranav.sawargaonkar, iommu, punit.agrawal,
	diana.craciun, gpkulkarni, shankerd, bharat.bhushan,
	geethasowjanya.akula

On Mon, Jan 09, 2017 at 01:46:02PM +0000, Eric Auger wrote:
> The get() populates the list with the MSI IOVA reserved window.
> 
> At the moment an arbitray MSI IOVA window is set at 0x8000000
> of size 1MB. This will allow to report those info in iommu-group
> sysfs.
> 
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> 
> ---
> 
> v3 -> v4:
> - do not handle PCI host bridge windows anymore
> - encode prot
> 
> RFC v2 -> v3:
> - use existing get/put_resv_regions
> 
> RFC v1 -> v2:
> - use defines for MSI IOVA base and length
> ---
>  drivers/iommu/arm-smmu.c | 28 ++++++++++++++++++++++++++++
>  1 file changed, 28 insertions(+)

Acked-by: Will Deacon <will.deacon@arm.com>

Will

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v7 12/19] iommu/arm-smmu-v3: Implement reserved region get/put callbacks
  2017-01-09 13:46 ` [PATCH v7 12/19] iommu/arm-smmu-v3: " Eric Auger
@ 2017-01-10 14:17   ` Will Deacon
  0 siblings, 0 replies; 33+ messages in thread
From: Will Deacon @ 2017-01-10 14:17 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, christoffer.dall, marc.zyngier, robin.murphy,
	alex.williamson, joro, tglx, jason, linux-arm-kernel, kvm,
	drjones, linux-kernel, pranav.sawargaonkar, iommu, punit.agrawal,
	diana.craciun, gpkulkarni, shankerd, bharat.bhushan,
	geethasowjanya.akula

On Mon, Jan 09, 2017 at 01:46:03PM +0000, Eric Auger wrote:
> iommu/arm-smmu: Implement reserved region get/put callbacks
> 
> The get() populates the list with the MSI IOVA reserved window.
> 
> At the moment an arbitray MSI IOVA window is set at 0x8000000
> of size 1MB. This will allow to report those info in iommu-group
> sysfs.
> 
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> 
> ---
> 
> v4: creation
> ---
>  drivers/iommu/arm-smmu-v3.c | 28 ++++++++++++++++++++++++++++
>  1 file changed, 28 insertions(+)

Acked-by: Will Deacon <will.deacon@arm.com>

Will

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v7 19/19] iommu/arm-smmu: Do not advertise IOMMU_CAP_INTR_REMAP anymore
  2017-01-09 13:46 ` [PATCH v7 19/19] iommu/arm-smmu: Do not advertise IOMMU_CAP_INTR_REMAP anymore Eric Auger
@ 2017-01-10 14:17   ` Will Deacon
  0 siblings, 0 replies; 33+ messages in thread
From: Will Deacon @ 2017-01-10 14:17 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, christoffer.dall, marc.zyngier, robin.murphy,
	alex.williamson, joro, tglx, jason, linux-arm-kernel, kvm,
	drjones, linux-kernel, pranav.sawargaonkar, iommu, punit.agrawal,
	diana.craciun, gpkulkarni, shankerd, bharat.bhushan,
	geethasowjanya.akula

On Mon, Jan 09, 2017 at 01:46:10PM +0000, Eric Auger wrote:
> IOMMU_CAP_INTR_REMAP has been advertised in arm-smmu(-v3) although
> on ARM this property is not attached to the IOMMU but rather is
> implemented in the MSI controller (GICv3 ITS).
> 
> Now vfio_iommu_type1 checks MSI remapping capability at MSI controller
> level, let's correct this.
> 
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> ---
>  drivers/iommu/arm-smmu-v3.c | 2 --
>  drivers/iommu/arm-smmu.c    | 2 --
>  2 files changed, 4 deletions(-)

Dependent on the previous two VFIO patches:

Acked-by: Will Deacon <will.deacon@arm.com>

Will

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v7 01/19] iommu/dma: Implement PCI allocation optimisation
  2017-01-09 13:45 ` [PATCH v7 01/19] iommu/dma: Implement PCI allocation optimisation Eric Auger
@ 2017-01-10 15:41   ` Robin Murphy
  0 siblings, 0 replies; 33+ messages in thread
From: Robin Murphy @ 2017-01-10 15:41 UTC (permalink / raw)
  To: Eric Auger, eric.auger.pro, christoffer.dall, marc.zyngier,
	alex.williamson, will.deacon, joro, tglx, jason,
	linux-arm-kernel
  Cc: kvm, drjones, linux-kernel, pranav.sawargaonkar, iommu,
	punit.agrawal, diana.craciun, gpkulkarni, shankerd,
	bharat.bhushan, geethasowjanya.akula

On 09/01/17 13:45, Eric Auger wrote:
> From: Robin Murphy <robin.murphy@arm.com>
> 
> Whilst PCI devices may have 64-bit DMA masks, they still benefit from
> using 32-bit addresses wherever possible in order to avoid DAC (PCI) or
> longer address packets (PCIe), which may incur a performance overhead.
> Implement the same optimisation as other allocators by trying to get a
> 32-bit address first, only falling back to the full mask if that fails.

Oops, this was just some development work which got promoted to my misc
branch for safe keeping; it really has nothing to do with this series.

I'd managed to overlook that one of the __alloc_iova() conflicts hit the
MSI cookie patch, sorry. Things are now in a more appropriate order in
my tree.

Robin.

> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> ---
>  drivers/iommu/dma-iommu.c | 21 +++++++++++++++------
>  1 file changed, 15 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 2db0d64..a59ca47 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -204,19 +204,28 @@ int dma_direction_to_prot(enum dma_data_direction dir, bool coherent)
>  }
>  
>  static struct iova *__alloc_iova(struct iommu_domain *domain, size_t size,
> -		dma_addr_t dma_limit)
> +		dma_addr_t dma_limit, struct device *dev)
>  {
>  	struct iova_domain *iovad = cookie_iovad(domain);
>  	unsigned long shift = iova_shift(iovad);
>  	unsigned long length = iova_align(iovad, size) >> shift;
> +	struct iova *iova = NULL;
>  
>  	if (domain->geometry.force_aperture)
>  		dma_limit = min(dma_limit, domain->geometry.aperture_end);
> +
> +	/* Try to get PCI devices a SAC address */
> +	if (dma_limit > DMA_BIT_MASK(32) && dev_is_pci(dev))
> +		iova = alloc_iova(iovad, length, DMA_BIT_MASK(32) >> shift,
> +				  true);
>  	/*
>  	 * Enforce size-alignment to be safe - there could perhaps be an
>  	 * attribute to control this per-device, or at least per-domain...
>  	 */
> -	return alloc_iova(iovad, length, dma_limit >> shift, true);
> +	if (!iova)
> +		iova = alloc_iova(iovad, length, dma_limit >> shift, true);
> +
> +	return iova;
>  }
>  
>  /* The IOVA allocator knows what we mapped, so just unmap whatever that was */
> @@ -369,7 +378,7 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
>  	if (!pages)
>  		return NULL;
>  
> -	iova = __alloc_iova(domain, size, dev->coherent_dma_mask);
> +	iova = __alloc_iova(domain, size, dev->coherent_dma_mask, dev);
>  	if (!iova)
>  		goto out_free_pages;
>  
> @@ -440,7 +449,7 @@ static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys,
>  	struct iova_domain *iovad = cookie_iovad(domain);
>  	size_t iova_off = iova_offset(iovad, phys);
>  	size_t len = iova_align(iovad, size + iova_off);
> -	struct iova *iova = __alloc_iova(domain, len, dma_get_mask(dev));
> +	struct iova *iova = __alloc_iova(domain, len, dma_get_mask(dev), dev);
>  
>  	if (!iova)
>  		return DMA_ERROR_CODE;
> @@ -598,7 +607,7 @@ int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
>  		prev = s;
>  	}
>  
> -	iova = __alloc_iova(domain, iova_len, dma_get_mask(dev));
> +	iova = __alloc_iova(domain, iova_len, dma_get_mask(dev), dev);
>  	if (!iova)
>  		goto out_restore_sg;
>  
> @@ -675,7 +684,7 @@ static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev,
>  	if (!msi_page)
>  		return NULL;
>  
> -	iova = __alloc_iova(domain, iovad->granule, dma_get_mask(dev));
> +	iova = __alloc_iova(domain, iovad->granule, dma_get_mask(dev), dev);
>  	if (!iova)
>  		goto out_free_page;
>  
> 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v7 00/19] KVM PCIe/MSI passthrough on ARM/ARM64 and IOVA reserved regions
  2017-01-10 14:09 ` [PATCH v7 00/19] KVM PCIe/MSI passthrough on ARM/ARM64 and IOVA reserved regions Joerg Roedel
  2017-01-10 14:15   ` Will Deacon
@ 2017-01-10 15:58   ` Auger Eric
  1 sibling, 0 replies; 33+ messages in thread
From: Auger Eric @ 2017-01-10 15:58 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: eric.auger.pro, christoffer.dall, marc.zyngier, robin.murphy,
	alex.williamson, will.deacon, tglx, jason, linux-arm-kernel, kvm,
	drjones, linux-kernel, pranav.sawargaonkar, iommu, punit.agrawal,
	diana.craciun, gpkulkarni, shankerd, bharat.bhushan,
	geethasowjanya.akula

Hi all,

On 10/01/2017 15:09, Joerg Roedel wrote:
> On Mon, Jan 09, 2017 at 01:45:51PM +0000, Eric Auger wrote:
>> Eric Auger (17):
>>   iommu: Rename iommu_dm_regions into iommu_resv_regions
>>   iommu: Add a new type field in iommu_resv_region
>>   iommu: iommu_alloc_resv_region
>>   iommu: Only map direct mapped regions
>>   iommu: iommu_get_group_resv_regions
>>   iommu: Implement reserved_regions iommu-group sysfs file
>>   iommu/vt-d: Implement reserved region get/put callbacks
>>   iommu/amd: Declare MSI and HT regions as reserved IOVA regions
>>   iommu/arm-smmu: Implement reserved region get/put callbacks
>>   iommu/arm-smmu-v3: Implement reserved region get/put callbacks
> 
> IOMMU patches look good, what is the plan to merge this? I'd like to
> take the IOMMU patches and can provide a branch for someone else to base
> the rest on.

I will respin asap taking into account Marc's nits, Robin's update. Also
I would like to do one change in

[PATCH v7 08/19] iommu: Implement reserved_regions iommu-group sysfs file

I will send a separate email.

Thanks

Eric

> 
> 
> 	Joerg
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v7 08/19] iommu: Implement reserved_regions iommu-group sysfs file
  2017-01-09 13:45 ` [PATCH v7 08/19] iommu: Implement reserved_regions iommu-group sysfs file Eric Auger
@ 2017-01-10 16:20   ` Auger Eric
  2017-01-10 17:14     ` Joerg Roedel
  0 siblings, 1 reply; 33+ messages in thread
From: Auger Eric @ 2017-01-10 16:20 UTC (permalink / raw)
  To: eric.auger.pro, christoffer.dall, marc.zyngier, robin.murphy,
	alex.williamson, will.deacon, joro, tglx, jason,
	linux-arm-kernel
  Cc: kvm, drjones, linux-kernel, pranav.sawargaonkar, iommu,
	punit.agrawal, diana.craciun, gpkulkarni, shankerd,
	bharat.bhushan, geethasowjanya.akula

Hi Joerg,

On 09/01/2017 14:45, Eric Auger wrote:
> A new iommu-group sysfs attribute file is introduced. It contains
> the list of reserved regions for the iommu-group. Each reserved
> region is described on a separate line:
> - first field is the start IOVA address,
> - second is the end IOVA address,
> - third is the type.
> 
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> 
> ---
> v6 -> v7:
> - also report the type of the reserved region as a string
> - updated ABI documentation
> 
> v3 -> v4:
> - add cast to long long int when printing to avoid warning on
>   i386
> - change S_IRUGO into 0444
> - remove sort. The list is natively sorted now.
> 
> The file layout is inspired of /sys/bus/pci/devices/BDF/resource.
> I also read Documentation/filesystems/sysfs.txt so I expect this
> to be frowned upon.
> ---
>  .../ABI/testing/sysfs-kernel-iommu_groups          | 12 +++++++
>  drivers/iommu/iommu.c                              | 38 ++++++++++++++++++++++
>  2 files changed, 50 insertions(+)
> 
> diff --git a/Documentation/ABI/testing/sysfs-kernel-iommu_groups b/Documentation/ABI/testing/sysfs-kernel-iommu_groups
> index 9b31556..35c64e0 100644
> --- a/Documentation/ABI/testing/sysfs-kernel-iommu_groups
> +++ b/Documentation/ABI/testing/sysfs-kernel-iommu_groups
> @@ -12,3 +12,15 @@ Description:	/sys/kernel/iommu_groups/ contains a number of sub-
>  		file if the IOMMU driver has chosen to register a more
>  		common name for the group.
>  Users:
> +
> +What:		/sys/kernel/iommu_groups/reserved_regions
> +Date: 		January 2017
> +KernelVersion:  v4.11
> +Contact: 	Eric Auger <eric.auger@redhat.com>
> +Description:    /sys/kernel/iommu_groups/reserved_regions list IOVA
> +		regions that are reserved. Not necessarily all
> +		reserved regions are listed. This is typically used to
> +		output direct-mapped, MSI, non mappable regions. Each
> +		region is described on a single line: the 1st field is
> +		the base IOVA, the second is the end IOVA and the third
> +		field describes the type of the region.
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 640056b..0123daa 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -68,6 +68,12 @@ struct iommu_group_attribute {
>  			 const char *buf, size_t count);
>  };
>  
> +static const char * const iommu_group_resv_type_string[] = {
> +	[IOMMU_RESV_DIRECT]	= "direct",
> +	[IOMMU_RESV_RESERVED]	= "reserved",
> +	[IOMMU_RESV_MSI]	= "msi",
> +};
> +
>  #define IOMMU_GROUP_ATTR(_name, _mode, _show, _store)		\
>  struct iommu_group_attribute iommu_group_attr_##_name =		\
>  	__ATTR(_name, _mode, _show, _store)
> @@ -231,8 +237,33 @@ int iommu_get_group_resv_regions(struct iommu_group *group,
>  }
>  EXPORT_SYMBOL_GPL(iommu_get_group_resv_regions);
>  
> +static ssize_t iommu_group_show_resv_regions(struct iommu_group *group,
> +					     char *buf)
> +{
> +	struct iommu_resv_region *region, *next;
> +	struct list_head group_resv_regions;
> +	char *str = buf;
> +
> +	INIT_LIST_HEAD(&group_resv_regions);
> +	iommu_get_group_resv_regions(group, &group_resv_regions);
> +
> +	list_for_each_entry_safe(region, next, &group_resv_regions, list) {
> +		str += sprintf(str, "0x%016llx 0x%016llx %s\n",
> +			       (long long int)region->start,
> +			       (long long int)(region->start +
> +						region->length - 1),
> +			       iommu_group_resv_type_string[region->type]);
> +		kfree(region);
> +	}
> +
> +	return (str - buf);
> +}
> +
>  static IOMMU_GROUP_ATTR(name, S_IRUGO, iommu_group_show_name, NULL);
>  
> +static IOMMU_GROUP_ATTR(reserved_regions, 0444,
> +			iommu_group_show_resv_regions, NULL);
> +
>  static void iommu_group_release(struct kobject *kobj)
>  {
>  	struct iommu_group *group = to_iommu_group(kobj);
> @@ -247,6 +278,8 @@ static void iommu_group_release(struct kobject *kobj)
>  	if (group->default_domain)
>  		iommu_domain_free(group->default_domain);
>  
> +	iommu_group_remove_file(group, &iommu_group_attr_reserved_regions);

The /sys/kernel/iommu_groups/n directory seems to be removed before this
gets called and this may produce a WARNING when devices get removed from
the group. I intend to remove the call since I have the feeling
everything gets cleaned up properly.

Do you see any issue?

Thanks

Eric

[  350.753618] iommu: Removing device 0000:01:10.0 from group 7
[  350.759331] kernfs: can not remove 'reserved_regions', no directory
[  350.765603] ------------[ cut here ]------------
[  350.770216] WARNING: CPU: 3 PID: 2617 at fs/kernfs/dir.c:1406
kernfs_remove_by_name_ns+0x8c/0x98
../..
[  351.028154] [<ffff00000828819c>] kernfs_remove_by_name_ns+0x8c/0x98
[  351.034414] [<ffff000008289bf8>] sysfs_remove_file_ns+0x14/0x1c
[  351.040325] [<ffff00000856d918>] iommu_group_release+0x68/0x9c
[  351.046150] [<ffff0000083d152c>] kobject_cleanup+0x8c/0x194
[  351.051715] [<ffff0000083d13c8>] kobject_put+0x44/0x6c
[  351.056844] [<ffff0000083d1490>] kobject_del+0x40/0x50
[  351.061974] [<ffff0000083d1508>] kobject_cleanup+0x68/0x194
[  351.067538] [<ffff0000083d13c8>] kobject_put+0x44/0x6c
[  351.072668] [<ffff00000856df30>] iommu_group_remove_device+0x128/0x1d4
[  351.079187] [<ffff000008575eb0>] arm_smmu_remove_device+0x12c/0x158




> +
>  	kfree(group->name);
>  	kfree(group);
>  }
> @@ -310,6 +343,11 @@ struct iommu_group *iommu_group_alloc(void)
>  	 */
>  	kobject_put(&group->kobj);
>  
> +	ret = iommu_group_create_file(group,
> +				      &iommu_group_attr_reserved_regions);
> +	if (ret)
> +		return ERR_PTR(ret);
> +
>  	pr_debug("Allocated group %d\n", group->id);
>  
>  	return group;
> 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v7 08/19] iommu: Implement reserved_regions iommu-group sysfs file
  2017-01-10 16:20   ` Auger Eric
@ 2017-01-10 17:14     ` Joerg Roedel
  2017-01-10 21:21       ` Auger Eric
  0 siblings, 1 reply; 33+ messages in thread
From: Joerg Roedel @ 2017-01-10 17:14 UTC (permalink / raw)
  To: Auger Eric
  Cc: eric.auger.pro, christoffer.dall, marc.zyngier, robin.murphy,
	alex.williamson, will.deacon, tglx, jason, linux-arm-kernel, kvm,
	drjones, linux-kernel, pranav.sawargaonkar, iommu, punit.agrawal,
	diana.craciun, gpkulkarni, shankerd, bharat.bhushan,
	geethasowjanya.akula

On Tue, Jan 10, 2017 at 05:20:34PM +0100, Auger Eric wrote:
> The /sys/kernel/iommu_groups/n directory seems to be removed before this
> gets called and this may produce a WARNING when devices get removed from
> the group. I intend to remove the call since I have the feeling
> everything gets cleaned up properly.

A feeling is not enough, please check that in the code.


	Joerg

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v7 08/19] iommu: Implement reserved_regions iommu-group sysfs file
  2017-01-10 17:14     ` Joerg Roedel
@ 2017-01-10 21:21       ` Auger Eric
  0 siblings, 0 replies; 33+ messages in thread
From: Auger Eric @ 2017-01-10 21:21 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: drjones, jason, kvm, marc.zyngier, bharat.bhushan, punit.agrawal,
	will.deacon, linux-kernel, iommu, geethasowjanya.akula,
	diana.craciun, alex.williamson, pranav.sawargaonkar, shankerd,
	linux-arm-kernel, tglx, gpkulkarni, robin.murphy,
	christoffer.dall, eric.auger.pro

Hi Joerg,
On 10/01/2017 18:14, Joerg Roedel wrote:
> On Tue, Jan 10, 2017 at 05:20:34PM +0100, Auger Eric wrote:
>> The /sys/kernel/iommu_groups/n directory seems to be removed before this
>> gets called and this may produce a WARNING when devices get removed from
>> the group. I intend to remove the call since I have the feeling
>> everything gets cleaned up properly.
> 
> A feeling is not enough, please check that in the code.

So my understanding is on group's kobject_release we have:
kobject_release
|_ kobject_cleanup
	|_ kobject_del
		|_ sysfs_remove_dir
			|_ kernfs_remove
				|_ _kernfs_remove
	../..
	|_ ktype release (iommu_group_release)

_kernfs_remove() calls kernfs_put() on all descendant nodes, leading to
the whole directory cleanup.

In iommu_group_release I called sysfs_remove_file on the
reserved_regions attribute file. My understanding is its job is
identifical as what was done previously and the node was already
destroyed hence the warning.

sysfs_remove_file
|_ sysfs_remove_file_ns
	|_ kernfs_remove_by_name_ns
		|_kernfs_remove

So my understanding is it is safe to remove it.

Thanks

Eric
> 
> 
> 	Joerg
> 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2017-01-10 21:21 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-01-09 13:45 [PATCH v7 00/19] KVM PCIe/MSI passthrough on ARM/ARM64 and IOVA reserved regions Eric Auger
2017-01-09 13:45 ` [PATCH v7 01/19] iommu/dma: Implement PCI allocation optimisation Eric Auger
2017-01-10 15:41   ` Robin Murphy
2017-01-09 13:45 ` [PATCH v7 02/19] iommu/dma: Allow MSI-only cookies Eric Auger
2017-01-09 13:45 ` [PATCH v7 03/19] iommu: Rename iommu_dm_regions into iommu_resv_regions Eric Auger
2017-01-09 13:45 ` [PATCH v7 04/19] iommu: Add a new type field in iommu_resv_region Eric Auger
2017-01-09 13:45 ` [PATCH v7 05/19] iommu: iommu_alloc_resv_region Eric Auger
2017-01-09 13:45 ` [PATCH v7 06/19] iommu: Only map direct mapped regions Eric Auger
2017-01-09 13:45 ` [PATCH v7 07/19] iommu: iommu_get_group_resv_regions Eric Auger
2017-01-09 13:45 ` [PATCH v7 08/19] iommu: Implement reserved_regions iommu-group sysfs file Eric Auger
2017-01-10 16:20   ` Auger Eric
2017-01-10 17:14     ` Joerg Roedel
2017-01-10 21:21       ` Auger Eric
2017-01-09 13:46 ` [PATCH v7 09/19] iommu/vt-d: Implement reserved region get/put callbacks Eric Auger
2017-01-09 13:46 ` [PATCH v7 10/19] iommu/amd: Declare MSI and HT regions as reserved IOVA regions Eric Auger
2017-01-09 13:46 ` [PATCH v7 11/19] iommu/arm-smmu: Implement reserved region get/put callbacks Eric Auger
2017-01-10 14:16   ` Will Deacon
2017-01-09 13:46 ` [PATCH v7 12/19] iommu/arm-smmu-v3: " Eric Auger
2017-01-10 14:17   ` Will Deacon
2017-01-09 13:46 ` [PATCH v7 13/19] irqdomain: Add irq domain MSI and MSI_REMAP flags Eric Auger
2017-01-10 11:50   ` Marc Zyngier
2017-01-09 13:46 ` [PATCH v7 14/19] genirq/msi: Set IRQ_DOMAIN_FLAG_MSI on MSI domain creation Eric Auger
2017-01-09 13:46 ` [PATCH v7 15/19] irqdomain: irq_domain_check_msi_remap Eric Auger
2017-01-10 11:54   ` Marc Zyngier
2017-01-09 13:46 ` [PATCH v7 16/19] irqchip/gicv3-its: Sets IRQ_DOMAIN_FLAG_MSI_REMAP Eric Auger
2017-01-10 11:57   ` Marc Zyngier
2017-01-09 13:46 ` [PATCH v7 17/19] vfio/type1: Allow transparent MSI IOVA allocation Eric Auger
2017-01-09 13:46 ` [PATCH v7 18/19] vfio/type1: Check MSI remapping at irq domain level Eric Auger
2017-01-09 13:46 ` [PATCH v7 19/19] iommu/arm-smmu: Do not advertise IOMMU_CAP_INTR_REMAP anymore Eric Auger
2017-01-10 14:17   ` Will Deacon
2017-01-10 14:09 ` [PATCH v7 00/19] KVM PCIe/MSI passthrough on ARM/ARM64 and IOVA reserved regions Joerg Roedel
2017-01-10 14:15   ` Will Deacon
2017-01-10 15:58   ` Auger Eric

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).