* [PATCH v6 0/7] KVM PCIe/MSI passthrough on ARM/ARM64: kernel part 1/3: iommu changes
@ 2016-04-04  8:06 Eric Auger
  2016-04-04  8:06 ` [PATCH v6 1/7] iommu: Add DOMAIN_ATTR_MSI_MAPPING attribute Eric Auger
                   ` (7 more replies)
  0 siblings, 8 replies; 18+ messages in thread
From: Eric Auger @ 2016-04-04  8:06 UTC
  To: eric.auger, eric.auger, robin.murphy, alex.williamson,
	will.deacon, joro, tglx, jason, marc.zyngier, christoffer.dall,
	linux-arm-kernel, kvmarm, kvm
  Cc: suravee.suthikulpanit, patches, linux-kernel, Manish.Jaggi,
	Bharat.Bhushan, pranav.sawargaonkar, p.fedin, iommu,
	Jean-Philippe.Brucker, julien.grall

This series introduces the dma-reserved-iommu API used to:
- create/destroy an iova domain dedicated to reserved iova bindings
- map/unmap physical addresses onto reserved IOVAs.
- unmap and destroy all IOVA reserved bindings

Currently reserved IOVAs are meant to map MSI physical doorbells. A single
reserved iova domain exists per iommu domain.

Also, a new domain attribute is introduced to signal whether the MSI
addresses must be mapped in the IOMMU.

The VFIO subsystem is expected to create/destroy the reserved iova domain.

When the MSI sub-system is about to handle an MSI physical address
that needs to be bound, it uses the dma-reserved-iommu API to map/unmap
the address. Since several drivers are likely to use the same doorbell,
reference counting must exist on the bindings. An RB-tree indexed by PA
is used.
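
For illustration, the expected calling pattern from the MSI layer looks
roughly as follows (an untested sketch; doorbell_pa, sz and the
surrounding error handling are illustrative only):

	dma_addr_t iova;
	int ret;

	/* bind the doorbell PA onto a reserved IOVA (takes a reference) */
	ret = iommu_get_reserved_iova(domain, doorbell_pa, sz,
				      IOMMU_WRITE, &iova);
	if (ret)
		return ret;

	/* ... program iova as the MSI address ... */

	/* on teardown or on an address change (drops the reference) */
	iommu_put_reserved_iova(domain, iova);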

More details & context can be found at:
http://www.linaro.org/blog/core-dump/kvm-pciemsi-passthrough-armarm64/

Best Regards

Eric

Git: complete series available at
https://git.linaro.org/people/eric.auger/linux.git/shortlog/refs/heads/v4.6-rc1-pcie-passthrough-v6

History:

RFC v5 -> patch v6:
- split to ease the review process
- in dma-reserved-api use a spin lock instead of a mutex (reported by
  Jean-Philippe)
- revisit iommu_get_reserved_iova API to pass a size parameter upon
  Marc's request
- Consistently use the page order passed when creating the iova domain.
- init reserved_binding_list (reported by Julien)

RFC v4 -> RFC v5:
- take into account Thomas' comments on MSI related patches
  - split "msi: IOMMU map the doorbell address when needed"
  - increase readability and add comments
  - fix style issues
 - split "iommu: Add DOMAIN_ATTR_MSI_MAPPING attribute"
 - platform ITS now advertises IOMMU_CAP_INTR_REMAP
 - fix compilation issue with CONFIG_IOMMU API unset
 - arm-smmu-v3 now advertises DOMAIN_ATTR_MSI_MAPPING

RFC v3 -> v4:
- Move doorbell mapping/unmapping in msi.c
- fix ref count issue on set_affinity: in case of a change in the address
  the previous address is decremented
- doorbell map/unmap is now done on MSI composition. This should allow the
  use case for platform MSI controllers
- create dma-reserved-iommu.h/c exposing/implementing a new API dedicated
  to reserved IOVA management (looking like dma-iommu glue)
- series reordering to ease the review:
  - first part is related to IOMMU
  - second related to MSI sub-system
  - third related to VFIO (except arm-smmu IOMMU_CAP_INTR_REMAP removal)
- expose the number of requested IOVA pages through VFIO_IOMMU_GET_INFO
  [this partially addresses Marc's comments on iommu_get/put_single_reserved
   size/alignment problematic - which I did not ignore - but I don't know
   how much I can do at the moment]

RFC v2 -> RFC v3:
- should fix wrong handling of some CONFIG combinations:
  CONFIG_IOVA, CONFIG_IOMMU_API, CONFIG_PCI_MSI_IRQ_DOMAIN
- fix MSI_FLAG_IRQ_REMAPPING setting in GICv3 ITS (although not tested)

PATCH v1 -> RFC v2:
- reverted to RFC since it looks more reasonable ;-) the code is split
  between VFIO, IOMMU, MSI controller and I am not sure I made the right
  choices. Also the API needs to be further discussed.
- iova API usage in arm-smmu.c.
- MSI controller natively programs the MSI addr with either the PA or IOVA.
  This is not done anymore in vfio-pci driver as suggested by Alex.
- check irq remapping capability of the group

RFC v1 [2] -> PATCH v1:
- use the existing dma map/unmap ioctl interface with a flag to register a
  reserved IOVA range. Use the legacy RB tree to store this special vfio_dma.
- a single contiguous reserved IOVA region is now allowed
- use of an RB tree indexed by PA to store allocated reserved slots
- use of a vfio_domain iova_domain to manage iova allocation within the
  window provided by the userspace
- vfio alloc_map/unmap_free take a vfio_group handle
- vfio_group handle is cached in vfio_pci_device
- add ref counting to bindings
- user modality enabled at the end of the series


Eric Auger (7):
  iommu: Add DOMAIN_ATTR_MSI_MAPPING attribute
  iommu/arm-smmu: advertise DOMAIN_ATTR_MSI_MAPPING attribute
  iommu: introduce a reserved iova cookie
  dma-reserved-iommu: alloc/free_reserved_iova_domain
  dma-reserved-iommu: reserved binding rb-tree and helpers
  dma-reserved-iommu: iommu_get/put_single_reserved
  dma-reserved-iommu: iommu_unmap_reserved

 drivers/iommu/Kconfig              |   8 +
 drivers/iommu/Makefile             |   1 +
 drivers/iommu/arm-smmu-v3.c        |   2 +
 drivers/iommu/arm-smmu.c           |   2 +
 drivers/iommu/dma-reserved-iommu.c | 321 +++++++++++++++++++++++++++++++++++++
 drivers/iommu/iommu.c              |   2 +
 include/linux/dma-reserved-iommu.h |  80 +++++++++
 include/linux/iommu.h              |   7 +
 8 files changed, 423 insertions(+)
 create mode 100644 drivers/iommu/dma-reserved-iommu.c
 create mode 100644 include/linux/dma-reserved-iommu.h

-- 
1.9.1


* [PATCH v6 1/7] iommu: Add DOMAIN_ATTR_MSI_MAPPING attribute
  2016-04-04  8:06 [PATCH v6 0/7] KVM PCIe/MSI passthrough on ARM/ARM64: kernel part 1/3: iommu changes Eric Auger
@ 2016-04-04  8:06 ` Eric Auger
  2016-04-04  8:06 ` [PATCH v6 2/7] iommu/arm-smmu: advertise " Eric Auger
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 18+ messages in thread
From: Eric Auger @ 2016-04-04  8:06 UTC
  To: eric.auger, eric.auger, robin.murphy, alex.williamson,
	will.deacon, joro, tglx, jason, marc.zyngier, christoffer.dall,
	linux-arm-kernel, kvmarm, kvm
  Cc: suravee.suthikulpanit, patches, linux-kernel, Manish.Jaggi,
	Bharat.Bhushan, pranav.sawargaonkar, p.fedin, iommu,
	Jean-Philippe.Brucker, julien.grall

Introduce a new DOMAIN_ATTR_MSI_MAPPING domain attribute. If supported,
this means the MSI addresses need to be mapped in the IOMMU.

x86 IOMMUs typically don't expose the attribute since, on x86, MSI write
transaction addresses are always within the 1MB PA window [FEE0_0000h -
FEF0_0000h], which directly targets the APIC configuration space and
hence bypasses the IOMMU. On ARM and PowerPC however, MSI transactions are
conveyed through the IOMMU.

Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
Signed-off-by: Eric Auger <eric.auger@linaro.org>

---

v4 -> v5:
- introduce the user in the next patch

RFC v1 -> v1:
- the data field is not used
- for this attribute, domain_get_attr simply returns 0 if the MSI_MAPPING
  capability is needed or <0 if not.
- removed struct iommu_domain_msi_maps
---
 include/linux/iommu.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index a5c539f..a4fe04a 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -112,6 +112,7 @@ enum iommu_attr {
 	DOMAIN_ATTR_FSL_PAMU_ENABLE,
 	DOMAIN_ATTR_FSL_PAMUV1,
 	DOMAIN_ATTR_NESTING,	/* two stages of translation */
+	DOMAIN_ATTR_MSI_MAPPING, /* Require MSIs mapping in iommu */
 	DOMAIN_ATTR_MAX,
 };
 
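For context, a consumer such as VFIO could probe the attribute along
these lines (an illustrative sketch, not part of the patch; the data
argument is unused for this attribute):

	static bool domain_requires_msi_mapping(struct iommu_domain *d)
	{
		/* the driver's get_attr returns 0 iff MSIs must be mapped */
		return iommu_domain_get_attr(d, DOMAIN_ATTR_MSI_MAPPING,
					     NULL) == 0;
	}
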
-- 
1.9.1


* [PATCH v6 2/7] iommu/arm-smmu: advertise DOMAIN_ATTR_MSI_MAPPING attribute
  2016-04-04  8:06 [PATCH v6 0/7] KVM PCIe/MSI passthrough on ARM/ARM64: kernel part 1/3: iommu changes Eric Auger
  2016-04-04  8:06 ` [PATCH v6 1/7] iommu: Add DOMAIN_ATTR_MSI_MAPPING attribute Eric Auger
@ 2016-04-04  8:06 ` Eric Auger
  2016-04-04  8:06 ` [PATCH v6 3/7] iommu: introduce a reserved iova cookie Eric Auger
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 18+ messages in thread
From: Eric Auger @ 2016-04-04  8:06 UTC
  To: eric.auger, eric.auger, robin.murphy, alex.williamson,
	will.deacon, joro, tglx, jason, marc.zyngier, christoffer.dall,
	linux-arm-kernel, kvmarm, kvm
  Cc: suravee.suthikulpanit, patches, linux-kernel, Manish.Jaggi,
	Bharat.Bhushan, pranav.sawargaonkar, p.fedin, iommu,
	Jean-Philippe.Brucker, julien.grall

On ARM, MSI write transactions from devices upstream of the smmu
are conveyed through the iommu. Therefore target physical addresses
must be mapped, and DOMAIN_ATTR_MSI_MAPPING is set to advertise
this requirement on arm-smmu and arm-smmu-v3.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>

---

v4 -> v5:
- don't handle fsl_pamu_domain anymore
- handle arm-smmu-v3
---
 drivers/iommu/arm-smmu-v3.c | 2 ++
 drivers/iommu/arm-smmu.c    | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 4ff73ff..a077a35 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -1900,6 +1900,8 @@ static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
 	case DOMAIN_ATTR_NESTING:
 		*(int *)data = (smmu_domain->stage == ARM_SMMU_DOMAIN_NESTED);
 		return 0;
+	case DOMAIN_ATTR_MSI_MAPPING:
+		return 0;
 	default:
 		return -ENODEV;
 	}
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 5158feb..6562752 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1429,6 +1429,8 @@ static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
 	case DOMAIN_ATTR_NESTING:
 		*(int *)data = (smmu_domain->stage == ARM_SMMU_DOMAIN_NESTED);
 		return 0;
+	case DOMAIN_ATTR_MSI_MAPPING:
+		return 0;
 	default:
 		return -ENODEV;
 	}
-- 
1.9.1


* [PATCH v6 3/7] iommu: introduce a reserved iova cookie
  2016-04-04  8:06 [PATCH v6 0/7] KVM PCIe/MSI passthrough on ARM/ARM64: kernel part 1/3: iommu changes Eric Auger
  2016-04-04  8:06 ` [PATCH v6 1/7] iommu: Add DOMAIN_ATTR_MSI_MAPPING attribute Eric Auger
  2016-04-04  8:06 ` [PATCH v6 2/7] iommu/arm-smmu: advertise " Eric Auger
@ 2016-04-04  8:06 ` Eric Auger
  2016-04-04  8:06 ` [PATCH v6 4/7] dma-reserved-iommu: alloc/free_reserved_iova_domain Eric Auger
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 18+ messages in thread
From: Eric Auger @ 2016-04-04  8:06 UTC
  To: eric.auger, eric.auger, robin.murphy, alex.williamson,
	will.deacon, joro, tglx, jason, marc.zyngier, christoffer.dall,
	linux-arm-kernel, kvmarm, kvm
  Cc: suravee.suthikulpanit, patches, linux-kernel, Manish.Jaggi,
	Bharat.Bhushan, pranav.sawargaonkar, p.fedin, iommu,
	Jean-Philippe.Brucker, julien.grall

This patch introduces some new fields in the iommu_domain struct,
dedicated to reserved iova management.

In a similar way as for the DMA mapping IOVA window, we need to store
information related to the reserved IOVA window.

The reserved_iova_cookie will store the reserved iova_domain
handle. An RB tree indexed by physical address is introduced to
store the host physical addresses bound to reserved IOVAs.

Those physical addresses will correspond to MSI frame base
addresses, also referred to as doorbells. Their number should be
quite limited per domain.

Also a spinlock is introduced to protect accesses to the iova_domain
and RB tree.

Signed-off-by: Eric Auger <eric.auger@linaro.org>

---

v5 -> v6:
- initialize reserved_binding_list
- use a spinlock instead of a mutex
---
 drivers/iommu/iommu.c | 2 ++
 include/linux/iommu.h | 6 ++++++
 2 files changed, 8 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index bfd4f7c..cdb7845 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1072,6 +1072,8 @@ static struct iommu_domain *__iommu_domain_alloc(struct bus_type *bus,
 
 	domain->ops  = bus->iommu_ops;
 	domain->type = type;
+	spin_lock_init(&domain->reserved_lock);
+	domain->reserved_binding_list = RB_ROOT;
 
 	return domain;
 }
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index a4fe04a..630a207 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -24,6 +24,7 @@
 #include <linux/of.h>
 #include <linux/types.h>
 #include <linux/scatterlist.h>
+#include <linux/spinlock.h>
 #include <trace/events/iommu.h>
 
 #define IOMMU_READ	(1 << 0)
@@ -82,6 +83,11 @@ struct iommu_domain {
 	void *handler_token;
 	struct iommu_domain_geometry geometry;
 	void *iova_cookie;
+	void *reserved_iova_cookie;
+	/* rb tree indexed by PA, for reserved bindings only */
+	struct rb_root reserved_binding_list;
+	/* protects reserved cookie and rbtree manipulation */
+	spinlock_t reserved_lock;
 };
 
 enum iommu_cap {
-- 
1.9.1


* [PATCH v6 4/7] dma-reserved-iommu: alloc/free_reserved_iova_domain
  2016-04-04  8:06 [PATCH v6 0/7] KVM PCIe/MSI passthrough on ARM/ARM64: kernel part 1/3: iommu changes Eric Auger
                   ` (2 preceding siblings ...)
  2016-04-04  8:06 ` [PATCH v6 3/7] iommu: introduce a reserved iova cookie Eric Auger
@ 2016-04-04  8:06 ` Eric Auger
  2016-04-06 23:00   ` Alex Williamson
  2016-04-04  8:07 ` [PATCH v6 5/7] dma-reserved-iommu: reserved binding rb-tree and helpers Eric Auger
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 18+ messages in thread
From: Eric Auger @ 2016-04-04  8:06 UTC
  To: eric.auger, eric.auger, robin.murphy, alex.williamson,
	will.deacon, joro, tglx, jason, marc.zyngier, christoffer.dall,
	linux-arm-kernel, kvmarm, kvm
  Cc: suravee.suthikulpanit, patches, linux-kernel, Manish.Jaggi,
	Bharat.Bhushan, pranav.sawargaonkar, p.fedin, iommu,
	Jean-Philippe.Brucker, julien.grall

Introduce alloc/free_reserved_iova_domain in the IOMMU API.
alloc_reserved_iova_domain initializes an iova domain at a given
iova base address and with a given size. This iova domain will
be used to allocate iovas within that window. Those IOVAs will be reserved
for special purposes, typically MSI frame binding. Allocation functions
within the reserved iova domain will be introduced in subsequent patches.

Those functions are implemented and exposed if CONFIG_IOMMU_DMA_RESERVED
is set.

Signed-off-by: Eric Auger <eric.auger@linaro.org>

---

v5 -> v6:
- use spin lock instead of mutex

v3 -> v4:
- formerly in "iommu/arm-smmu: implement alloc/free_reserved_iova_domain" &
  "iommu: add alloc/free_reserved_iova_domain"

v2 -> v3:
- remove iommu_alloc_reserved_iova_domain & iommu_free_reserved_iova_domain
  static implementation in case CONFIG_IOMMU_API is not set

v1 -> v2:
- moved from vfio API to IOMMU API
---
 drivers/iommu/Kconfig              |  8 ++++
 drivers/iommu/Makefile             |  1 +
 drivers/iommu/dma-reserved-iommu.c | 75 ++++++++++++++++++++++++++++++++++++++
 include/linux/dma-reserved-iommu.h | 45 +++++++++++++++++++++++
 4 files changed, 129 insertions(+)
 create mode 100644 drivers/iommu/dma-reserved-iommu.c
 create mode 100644 include/linux/dma-reserved-iommu.h

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index dd1dc39..a5a1530 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -74,6 +74,12 @@ config IOMMU_DMA
 	select IOMMU_IOVA
 	select NEED_SG_DMA_LENGTH
 
+# IOMMU reserved IOVA mapping (MSI doorbell)
+config IOMMU_DMA_RESERVED
+	bool
+	select IOMMU_API
+	select IOMMU_IOVA
+
 config FSL_PAMU
 	bool "Freescale IOMMU support"
 	depends on PPC32
@@ -307,6 +313,7 @@ config SPAPR_TCE_IOMMU
 config ARM_SMMU
 	bool "ARM Ltd. System MMU (SMMU) Support"
 	depends on (ARM64 || ARM) && MMU
+	select IOMMU_DMA_RESERVED
 	select IOMMU_API
 	select IOMMU_IO_PGTABLE_LPAE
 	select ARM_DMA_USE_IOMMU if ARM
@@ -320,6 +327,7 @@ config ARM_SMMU
 config ARM_SMMU_V3
 	bool "ARM Ltd. System MMU Version 3 (SMMUv3) Support"
 	depends on ARM64 && PCI
+	select IOMMU_DMA_RESERVED
 	select IOMMU_API
 	select IOMMU_IO_PGTABLE_LPAE
 	select GENERIC_MSI_IRQ_DOMAIN
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index c6edb31..6c9ae99 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -2,6 +2,7 @@ obj-$(CONFIG_IOMMU_API) += iommu.o
 obj-$(CONFIG_IOMMU_API) += iommu-traces.o
 obj-$(CONFIG_IOMMU_API) += iommu-sysfs.o
 obj-$(CONFIG_IOMMU_DMA) += dma-iommu.o
+obj-$(CONFIG_IOMMU_DMA_RESERVED) += dma-reserved-iommu.o
 obj-$(CONFIG_IOMMU_IO_PGTABLE) += io-pgtable.o
 obj-$(CONFIG_IOMMU_IO_PGTABLE_ARMV7S) += io-pgtable-arm-v7s.o
 obj-$(CONFIG_IOMMU_IO_PGTABLE_LPAE) += io-pgtable-arm.o
diff --git a/drivers/iommu/dma-reserved-iommu.c b/drivers/iommu/dma-reserved-iommu.c
new file mode 100644
index 0000000..a461482
--- /dev/null
+++ b/drivers/iommu/dma-reserved-iommu.c
@@ -0,0 +1,75 @@
+/*
+ * Reserved IOVA Management
+ *
+ * Copyright (c) 2015 Linaro Ltd.
+ *              www.linaro.org
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ */
+
+#include <linux/iommu.h>
+#include <linux/iova.h>
+
+int iommu_alloc_reserved_iova_domain(struct iommu_domain *domain,
+				     dma_addr_t iova, size_t size,
+				     unsigned long order)
+{
+	unsigned long granule, mask;
+	struct iova_domain *iovad;
+	unsigned long flags;
+	int ret = 0;
+
+	granule = 1UL << order;
+	mask = granule - 1;
+	if (iova & mask || (!size) || (size & mask))
+		return -EINVAL;
+
+	iovad = kzalloc(sizeof(struct iova_domain), GFP_KERNEL);
+	if (!iovad)
+		return -ENOMEM;
+
+	spin_lock_irqsave(&domain->reserved_lock, flags);
+
+	if (domain->reserved_iova_cookie) {
+		ret = -EEXIST;
+		goto free_unlock;
+	}
+
+	init_iova_domain(iovad, granule,
+			 iova >> order, (iova + size - 1) >> order);
+	domain->reserved_iova_cookie = iovad;
+	goto unlock;
+
+free_unlock:
+	kfree(iovad);
+unlock:
+	spin_unlock_irqrestore(&domain->reserved_lock, flags);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(iommu_alloc_reserved_iova_domain);
+
+void iommu_free_reserved_iova_domain(struct iommu_domain *domain)
+{
+	struct iova_domain *iovad =
+		(struct iova_domain *)domain->reserved_iova_cookie;
+	unsigned long flags;
+
+	if (!iovad)
+		return;
+
+	spin_lock_irqsave(&domain->reserved_lock, flags);
+
+	put_iova_domain(iovad);
+	kfree(iovad);
+
+	spin_unlock_irqrestore(&domain->reserved_lock, flags);
+}
+EXPORT_SYMBOL_GPL(iommu_free_reserved_iova_domain);
diff --git a/include/linux/dma-reserved-iommu.h b/include/linux/dma-reserved-iommu.h
new file mode 100644
index 0000000..5bf863b
--- /dev/null
+++ b/include/linux/dma-reserved-iommu.h
@@ -0,0 +1,45 @@
+/*
+ * Copyright (c) 2015 Linaro Ltd.
+ *              www.linaro.org
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ */
+#ifndef __DMA_RESERVED_IOMMU_H
+#define __DMA_RESERVED_IOMMU_H
+
+#ifdef __KERNEL__
+#include <asm/errno.h>
+
+#ifdef CONFIG_IOMMU_DMA_RESERVED
+#include <linux/iommu.h>
+
+/**
+ * iommu_alloc_reserved_iova_domain: allocate the reserved iova domain
+ *
+ * @domain: iommu domain handle
+ * @iova: base iova address
+ * @size: iova window size
+ * @order: page order
+ */
+int iommu_alloc_reserved_iova_domain(struct iommu_domain *domain,
+				     dma_addr_t iova, size_t size,
+				     unsigned long order);
+
+/**
+ * iommu_free_reserved_iova_domain: free the reserved iova domain
+ *
+ * @domain: iommu domain handle
+ */
+void iommu_free_reserved_iova_domain(struct iommu_domain *domain);
+
+#endif	/* CONFIG_IOMMU_DMA_RESERVED */
+#endif	/* __KERNEL__ */
+#endif	/* __DMA_RESERVED_IOMMU_H */
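
For illustration, a prospective caller (VFIO in this series) would use
the pair along these lines (sketch; the base address, size and order are
arbitrary example values):

	int ret;

	/* carve out a 1MB reserved IOVA window at 0x8000000, 4kB granule */
	ret = iommu_alloc_reserved_iova_domain(domain, 0x8000000, 0x100000,
					       PAGE_SHIFT);
	if (ret)
		return ret;

	/* ... bind reserved IOVAs within the window (later patches) ... */

	iommu_free_reserved_iova_domain(domain);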
-- 
1.9.1


* [PATCH v6 5/7] dma-reserved-iommu: reserved binding rb-tree and helpers
  2016-04-04  8:06 [PATCH v6 0/7] KVM PCIe/MSI passthrough on ARM/ARM64: kernel part 1/3: iommu changes Eric Auger
                   ` (3 preceding siblings ...)
  2016-04-04  8:06 ` [PATCH v6 4/7] dma-reserved-iommu: alloc/free_reserved_iova_domain Eric Auger
@ 2016-04-04  8:07 ` Eric Auger
  2016-04-04  8:07 ` [PATCH v6 6/7] dma-reserved-iommu: iommu_get/put_single_reserved Eric Auger
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 18+ messages in thread
From: Eric Auger @ 2016-04-04  8:07 UTC
  To: eric.auger, eric.auger, robin.murphy, alex.williamson,
	will.deacon, joro, tglx, jason, marc.zyngier, christoffer.dall,
	linux-arm-kernel, kvmarm, kvm
  Cc: suravee.suthikulpanit, patches, linux-kernel, Manish.Jaggi,
	Bharat.Bhushan, pranav.sawargaonkar, p.fedin, iommu,
	Jean-Philippe.Brucker, julien.grall

We will need to track which host physical addresses are mapped to
reserved IOVAs. For that purpose we introduce a new RB tree indexed
by physical address. This RB tree is only used for reserved IOVA
bindings.

It is expected that this RB tree will contain very few bindings. Those
generally correspond to single pages, each mapping one MSI frame (GICv2m
frame or ITS GITS_TRANSLATER frame).

Signed-off-by: Eric Auger <eric.auger@linaro.org>

---
v5 -> v6:
- add comment about @d->reserved_lock to be held

v3 -> v4:
- that code was formerly in "iommu/arm-smmu: add a reserved binding RB tree"
---
 drivers/iommu/dma-reserved-iommu.c | 63 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 63 insertions(+)

diff --git a/drivers/iommu/dma-reserved-iommu.c b/drivers/iommu/dma-reserved-iommu.c
index a461482..f592118 100644
--- a/drivers/iommu/dma-reserved-iommu.c
+++ b/drivers/iommu/dma-reserved-iommu.c
@@ -18,6 +18,69 @@
 #include <linux/iommu.h>
 #include <linux/iova.h>
 
+struct iommu_reserved_binding {
+	struct kref		kref;
+	struct rb_node		node;
+	struct iommu_domain	*domain;
+	phys_addr_t		addr;
+	dma_addr_t		iova;
+	size_t			size;
+};
+
+/* Reserved binding RB-tree manipulation */
+
+/* @d->reserved_lock must be held */
+static struct iommu_reserved_binding *find_reserved_binding(
+				    struct iommu_domain *d,
+				    phys_addr_t start, size_t size)
+{
+	struct rb_node *node = d->reserved_binding_list.rb_node;
+
+	while (node) {
+		struct iommu_reserved_binding *binding =
+			rb_entry(node, struct iommu_reserved_binding, node);
+
+		if (start + size <= binding->addr)
+			node = node->rb_left;
+		else if (start >= binding->addr + binding->size)
+			node = node->rb_right;
+		else
+			return binding;
+	}
+
+	return NULL;
+}
+
+/* @d->reserved_lock must be held */
+static void link_reserved_binding(struct iommu_domain *d,
+				  struct iommu_reserved_binding *new)
+{
+	struct rb_node **link = &d->reserved_binding_list.rb_node;
+	struct rb_node *parent = NULL;
+	struct iommu_reserved_binding *binding;
+
+	while (*link) {
+		parent = *link;
+		binding = rb_entry(parent, struct iommu_reserved_binding,
+				   node);
+
+		if (new->addr + new->size <= binding->addr)
+			link = &(*link)->rb_left;
+		else
+			link = &(*link)->rb_right;
+	}
+
+	rb_link_node(&new->node, parent, link);
+	rb_insert_color(&new->node, &d->reserved_binding_list);
+}
+
+/* @d->reserved_lock must be held */
+static void unlink_reserved_binding(struct iommu_domain *d,
+				    struct iommu_reserved_binding *old)
+{
+	rb_erase(&old->node, &d->reserved_binding_list);
+}
+
 int iommu_alloc_reserved_iova_domain(struct iommu_domain *domain,
 				     dma_addr_t iova, size_t size,
 				     unsigned long order)
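
For illustration, the helpers are meant to be called with the
reserved_lock held, along these lines (sketch; the PA and size are
made-up values):

	unsigned long flags;
	struct iommu_reserved_binding *b;

	spin_lock_irqsave(&d->reserved_lock, flags);
	/* returns any binding whose [addr, addr + size) overlaps the range */
	b = find_reserved_binding(d, 0x2c020000, 0x1000);
	if (b)
		kref_get(&b->kref);	/* as done in the next patch */
	spin_unlock_irqrestore(&d->reserved_lock, flags);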
-- 
1.9.1


* [PATCH v6 6/7] dma-reserved-iommu: iommu_get/put_single_reserved
  2016-04-04  8:06 [PATCH v6 0/7] KVM PCIe/MSI passthrough on ARM/ARM64: kernel part 1/3: iommu changes Eric Auger
                   ` (4 preceding siblings ...)
  2016-04-04  8:07 ` [PATCH v6 5/7] dma-reserved-iommu: reserved binding rb-tree and helpers Eric Auger
@ 2016-04-04  8:07 ` Eric Auger
  2016-04-06 23:12   ` Alex Williamson
  2016-04-04  8:07 ` [PATCH v6 7/7] dma-reserved-iommu: iommu_unmap_reserved Eric Auger
  2016-04-06 23:15 ` [PATCH v6 0/7] KVM PCIe/MSI passthrough on ARM/ARM64: kernel part 1/3: iommu changes Alex Williamson
  7 siblings, 1 reply; 18+ messages in thread
From: Eric Auger @ 2016-04-04  8:07 UTC
  To: eric.auger, eric.auger, robin.murphy, alex.williamson,
	will.deacon, joro, tglx, jason, marc.zyngier, christoffer.dall,
	linux-arm-kernel, kvmarm, kvm
  Cc: suravee.suthikulpanit, patches, linux-kernel, Manish.Jaggi,
	Bharat.Bhushan, pranav.sawargaonkar, p.fedin, iommu,
	Jean-Philippe.Brucker, julien.grall

This patch introduces iommu_get/put_reserved_iova.

iommu_get_reserved_iova allows allocating new reserved iova pages and
mapping them onto the physical pages that contain a given physical
address. The page size is the IOMMU page size. It is the responsibility
of the system integrator to make sure the IOMMU page size in use
corresponds to the granularity of the MSI frame.

It returns the iova that is mapped onto the provided physical address.
Hence the physical address passed as argument does not need to be aligned.

In case a mapping already exists between both pages, the IOVA mapped
to the PA is directly returned.

Each time an iova is successfully returned, a binding ref count is
incremented.

iommu_put_reserved_iova decrements the ref count and, when it reaches
zero, the mapping is destroyed and the iova is released.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Ankit Jindal <ajindal@apm.com>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>

---

v5 -> v6:
- revisit locking with spin_lock instead of mutex
- do not kref_get on 1st get
- add size parameter to the get function following Marc's request
- use the iova domain shift instead of using the smallest supported page size

v3 -> v4:
- formerly in iommu: iommu_get/put_single_reserved &
  iommu/arm-smmu: implement iommu_get/put_single_reserved
- Attempted to address Marc's doubts about missing size/alignment
  at VFIO level (user-space knows the IOMMU page size and the number
  of IOVA pages to provision)

v2 -> v3:
- remove static implementation of iommu_get_single_reserved &
  iommu_put_single_reserved when CONFIG_IOMMU_API is not set

v1 -> v2:
- previously a VFIO API, named vfio_alloc_map/unmap_free_reserved_iova
---
 drivers/iommu/dma-reserved-iommu.c | 146 +++++++++++++++++++++++++++++++++++++
 include/linux/dma-reserved-iommu.h |  28 +++++++
 2 files changed, 174 insertions(+)

diff --git a/drivers/iommu/dma-reserved-iommu.c b/drivers/iommu/dma-reserved-iommu.c
index f592118..3c759d9 100644
--- a/drivers/iommu/dma-reserved-iommu.c
+++ b/drivers/iommu/dma-reserved-iommu.c
@@ -136,3 +136,149 @@ void iommu_free_reserved_iova_domain(struct iommu_domain *domain)
 	spin_unlock_irqrestore(&domain->reserved_lock, flags);
 }
 EXPORT_SYMBOL_GPL(iommu_free_reserved_iova_domain);
+
+static void delete_reserved_binding(struct iommu_domain *domain,
+				    struct iommu_reserved_binding *b)
+{
+	struct iova_domain *iovad =
+		(struct iova_domain *)domain->reserved_iova_cookie;
+	unsigned long order = iova_shift(iovad);
+
+	iommu_unmap(domain, b->iova, b->size);
+	free_iova(iovad, b->iova >> order);
+	kfree(b);
+}
+
+int iommu_get_reserved_iova(struct iommu_domain *domain,
+			      phys_addr_t addr, size_t size, int prot,
+			      dma_addr_t *iova)
+{
+	struct iova_domain *iovad =
+		(struct iova_domain *)domain->reserved_iova_cookie;
+	unsigned long order = iova_shift(iovad);
+	unsigned long  base_pfn, end_pfn, nb_iommu_pages;
+	size_t iommu_page_size = 1 << order, binding_size;
+	phys_addr_t aligned_base, offset;
+	struct iommu_reserved_binding *b, *newb;
+	unsigned long flags;
+	struct iova *p_iova;
+	bool unmap = false;
+	int ret;
+
+	base_pfn = addr >> order;
+	end_pfn = (addr + size - 1) >> order;
+	nb_iommu_pages = end_pfn - base_pfn + 1;
+	aligned_base = base_pfn << order;
+	offset = addr - aligned_base;
+	binding_size = nb_iommu_pages * iommu_page_size;
+
+	if (!iovad)
+		return -EINVAL;
+
+	spin_lock_irqsave(&domain->reserved_lock, flags);
+
+	b = find_reserved_binding(domain, aligned_base, binding_size);
+	if (b) {
+		*iova = b->iova + offset;
+		kref_get(&b->kref);
+		ret = 0;
+		goto unlock;
+	}
+
+	spin_unlock_irqrestore(&domain->reserved_lock, flags);
+
+	/*
+	 * no reserved IOVA was found for this PA, start allocating and
+	 * registering one while the spin-lock is not held. iommu_map/unmap
+	 * are not supposed to be atomic
+	 */
+
+	p_iova = alloc_iova(iovad, nb_iommu_pages, iovad->dma_32bit_pfn, true);
+	if (!p_iova)
+		return -ENOMEM;
+
+	*iova = iova_dma_addr(iovad, p_iova);
+
+	newb = kzalloc(sizeof(*b), GFP_KERNEL);
+	if (!newb) {
+		free_iova(iovad, p_iova->pfn_lo);
+		return -ENOMEM;
+	}
+
+	ret = iommu_map(domain, *iova, aligned_base, binding_size, prot);
+	if (ret) {
+		kfree(newb);
+		free_iova(iovad, p_iova->pfn_lo);
+		return ret;
+	}
+
+	spin_lock_irqsave(&domain->reserved_lock, flags);
+
+	/* re-check the PA was not mapped behind our back while unlocked */
+	b = find_reserved_binding(domain, aligned_base, binding_size);
+	if (b) {
+		*iova = b->iova + offset;
+		kref_get(&b->kref);
+		ret = 0;
+		unmap = true;
+		goto unlock;
+	}
+
+	kref_init(&newb->kref);
+	newb->domain = domain;
+	newb->addr = aligned_base;
+	newb->iova = *iova;
+	newb->size = binding_size;
+
+	link_reserved_binding(domain, newb);
+
+	*iova += offset;
+	goto unlock;
+
+unlock:
+	spin_unlock_irqrestore(&domain->reserved_lock, flags);
+	if (unmap)
+		delete_reserved_binding(domain, newb);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(iommu_get_reserved_iova);
+
+void iommu_put_reserved_iova(struct iommu_domain *domain, dma_addr_t iova)
+{
+	struct iova_domain *iovad =
+		(struct iova_domain *)domain->reserved_iova_cookie;
+	unsigned long order;
+	phys_addr_t aligned_addr;
+	dma_addr_t aligned_iova, page_size, mask, offset;
+	struct iommu_reserved_binding *b;
+	unsigned long flags;
+	bool unmap = false;
+
+	order = iova_shift(iovad);
+	page_size = (uint64_t)1 << order;
+	mask = page_size - 1;
+
+	aligned_iova = iova & ~mask;
+	offset = iova - aligned_iova;
+
+	aligned_addr = iommu_iova_to_phys(domain, aligned_iova);
+
+	spin_lock_irqsave(&domain->reserved_lock, flags);
+	b = find_reserved_binding(domain, aligned_addr, page_size);
+	if (!b)
+		goto unlock;
+
+	if (atomic_sub_and_test(1, &b->kref.refcount)) {
+		unlink_reserved_binding(domain, b);
+		unmap = true;
+	}
+
+unlock:
+	spin_unlock_irqrestore(&domain->reserved_lock, flags);
+	if (unmap)
+		delete_reserved_binding(domain, b);
+}
+EXPORT_SYMBOL_GPL(iommu_put_reserved_iova);
+
+
+
diff --git a/include/linux/dma-reserved-iommu.h b/include/linux/dma-reserved-iommu.h
index 5bf863b..dedea56 100644
--- a/include/linux/dma-reserved-iommu.h
+++ b/include/linux/dma-reserved-iommu.h
@@ -40,6 +40,34 @@ int iommu_alloc_reserved_iova_domain(struct iommu_domain *domain,
  */
 void iommu_free_reserved_iova_domain(struct iommu_domain *domain);
 
+/**
+ * iommu_get_reserved_iova: allocate a contiguous set of iova pages and
+ * map them to the physical range defined by @addr and @size.
+ *
+ * @domain: iommu domain handle
+ * @addr: physical address to bind
+ * @size: size of the binding
+ * @prot: mapping protection attribute
+ * @iova: returned iova
+ *
+ * Mapped physical pfns are within [@addr >> order, (@addr + size -1) >> order]
+ * where order corresponds to the iova domain order.
+ * This mapping is reference counted as a whole and cannot be split.
+ */
+int iommu_get_reserved_iova(struct iommu_domain *domain,
+			      phys_addr_t addr, size_t size, int prot,
+			      dma_addr_t *iova);
+
+/**
+ * iommu_put_reserved_iova: decrement the ref count of the reserved mapping
+ *
+ * @domain: iommu domain handle
+ * @iova: reserved iova whose binding ref count is decremented
+ *
+ * If the binding ref count reaches zero, destroy the reserved mapping
+ */
+void iommu_put_reserved_iova(struct iommu_domain *domain, dma_addr_t iova);
+
 #endif	/* CONFIG_IOMMU_DMA_RESERVED */
 #endif	/* __KERNEL__ */
 #endif	/* __DMA_RESERVED_IOMMU_H */
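
To illustrate the ref counting semantics (sketch; doorbell_pa and sz are
made-up example values, return values left unchecked for brevity):

	dma_addr_t iova1, iova2;

	iommu_get_reserved_iova(domain, doorbell_pa, sz, IOMMU_WRITE, &iova1);
	iommu_get_reserved_iova(domain, doorbell_pa, sz, IOMMU_WRITE, &iova2);
	/* the existing binding is found: iova1 == iova2, ref count is 2 */

	iommu_put_reserved_iova(domain, iova1);	/* ref count 1, still mapped */
	iommu_put_reserved_iova(domain, iova2);	/* ref count 0, unmapped */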
-- 
1.9.1


* [PATCH v6 7/7] dma-reserved-iommu: iommu_unmap_reserved
  2016-04-04  8:06 [PATCH v6 0/7] KVM PCIe/MSI passthrough on ARM/ARM64: kernel part 1/3: iommu changes Eric Auger
                   ` (5 preceding siblings ...)
  2016-04-04  8:07 ` [PATCH v6 6/7] dma-reserved-iommu: iommu_get/put_single_reserved Eric Auger
@ 2016-04-04  8:07 ` Eric Auger
  2016-04-06 23:15 ` [PATCH v6 0/7] KVM PCIe/MSI passthrough on ARM/ARM64: kernel part 1/3: iommu changes Alex Williamson
  7 siblings, 0 replies; 18+ messages in thread
From: Eric Auger @ 2016-04-04  8:07 UTC
  To: eric.auger, eric.auger, robin.murphy, alex.williamson,
	will.deacon, joro, tglx, jason, marc.zyngier, christoffer.dall,
	linux-arm-kernel, kvmarm, kvm
  Cc: suravee.suthikulpanit, patches, linux-kernel, Manish.Jaggi,
	Bharat.Bhushan, pranav.sawargaonkar, p.fedin, iommu,
	Jean-Philippe.Brucker, julien.grall

Introduce a new function whose role is to unmap all allocated
reserved IOVAs and free the reserved iova domain.

Signed-off-by: Eric Auger <eric.auger@linaro.org>

---
v5 -> v6:
- use spin_lock instead of mutex

v3 -> v4:
- previously "iommu/arm-smmu: relinquish reserved resources on
  domain deletion"
---
 drivers/iommu/dma-reserved-iommu.c | 45 ++++++++++++++++++++++++++++++++++----
 include/linux/dma-reserved-iommu.h |  7 ++++++
 2 files changed, 48 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/dma-reserved-iommu.c b/drivers/iommu/dma-reserved-iommu.c
index 3c759d9..c06c39e 100644
--- a/drivers/iommu/dma-reserved-iommu.c
+++ b/drivers/iommu/dma-reserved-iommu.c
@@ -119,20 +119,24 @@ unlock:
 }
 EXPORT_SYMBOL_GPL(iommu_alloc_reserved_iova_domain);
 
-void iommu_free_reserved_iova_domain(struct iommu_domain *domain)
+void __iommu_free_reserved_iova_domain(struct iommu_domain *domain)
 {
 	struct iova_domain *iovad =
 		(struct iova_domain *)domain->reserved_iova_cookie;
-	unsigned long flags;
 
 	if (!iovad)
 		return;
 
-	spin_lock_irqsave(&domain->reserved_lock, flags);
-
 	put_iova_domain(iovad);
 	kfree(iovad);
+}
+
+void iommu_free_reserved_iova_domain(struct iommu_domain *domain)
+{
+	unsigned long flags;
 
+	spin_lock_irqsave(&domain->reserved_lock, flags);
+	__iommu_free_reserved_iova_domain(domain);
 	spin_unlock_irqrestore(&domain->reserved_lock, flags);
 }
 EXPORT_SYMBOL_GPL(iommu_free_reserved_iova_domain);
@@ -281,4 +285,37 @@ unlock:
 EXPORT_SYMBOL_GPL(iommu_put_reserved_iova);
 
 
+static void reserved_binding_release(struct kref *kref)
+{
+	struct iommu_reserved_binding *b =
+		container_of(kref, struct iommu_reserved_binding, kref);
+	struct iommu_domain *d = b->domain;
+
+	delete_reserved_binding(d, b);
+}
+
+void iommu_unmap_reserved(struct iommu_domain *domain)
+{
+	struct rb_node *node;
+	unsigned long flags;
+
+	spin_lock_irqsave(&domain->reserved_lock, flags);
+	while ((node = rb_first(&domain->reserved_binding_list))) {
+		struct iommu_reserved_binding *b =
+			rb_entry(node, struct iommu_reserved_binding, node);
+
+		unlink_reserved_binding(domain, b);
+		spin_unlock_irqrestore(&domain->reserved_lock, flags);
+
+		while (!kref_put(&b->kref, reserved_binding_release))
+			;
+		spin_lock_irqsave(&domain->reserved_lock, flags);
+	}
+	domain->reserved_binding_list = RB_ROOT;
+	__iommu_free_reserved_iova_domain(domain);
+	spin_unlock_irqrestore(&domain->reserved_lock, flags);
+}
+EXPORT_SYMBOL_GPL(iommu_unmap_reserved);
+
+
 
diff --git a/include/linux/dma-reserved-iommu.h b/include/linux/dma-reserved-iommu.h
index dedea56..9fba930 100644
--- a/include/linux/dma-reserved-iommu.h
+++ b/include/linux/dma-reserved-iommu.h
@@ -68,6 +68,13 @@ int iommu_get_reserved_iova(struct iommu_domain *domain,
  */
 void iommu_put_reserved_iova(struct iommu_domain *domain, dma_addr_t iova);
 
+/**
+ * iommu_unmap_reserved: unmap & destroy the reserved iova bindings
+ *
+ * @domain: iommu domain handle
+ */
+void iommu_unmap_reserved(struct iommu_domain *domain);
+
 #endif	/* CONFIG_IOMMU_DMA_RESERVED */
 #endif	/* __KERNEL__ */
 #endif	/* __DMA_RESERVED_IOMMU_H */
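
The expected use, e.g. on VFIO domain teardown, is a single call that
drops every binding whatever its ref count and then releases the
reserved iova domain (sketch):

	/* tear down all reserved bindings and the reserved iova domain */
	iommu_unmap_reserved(domain);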
-- 
1.9.1


* Re: [PATCH v6 4/7] dma-reserved-iommu: alloc/free_reserved_iova_domain
  2016-04-04  8:06 ` [PATCH v6 4/7] dma-reserved-iommu: alloc/free_reserved_iova_domain Eric Auger
@ 2016-04-06 23:00   ` Alex Williamson
  2016-04-07  9:33     ` Eric Auger
  0 siblings, 1 reply; 18+ messages in thread
From: Alex Williamson @ 2016-04-06 23:00 UTC
  To: Eric Auger
  Cc: eric.auger, robin.murphy, will.deacon, joro, tglx, jason,
	marc.zyngier, christoffer.dall, linux-arm-kernel, kvmarm, kvm,
	suravee.suthikulpanit, patches, linux-kernel, Manish.Jaggi,
	Bharat.Bhushan, pranav.sawargaonkar, p.fedin, iommu,
	Jean-Philippe.Brucker, julien.grall

On Mon,  4 Apr 2016 08:06:59 +0000
Eric Auger <eric.auger@linaro.org> wrote:

> Introduce alloc/free_reserved_iova_domain in the IOMMU API.
> alloc_reserved_iova_domain initializes an iova domain at a given
> iova base address and with a given size. This iova domain will
> be used to allocate iovas within that window. Those IOVAs will be reserved
> for special purposes, typically MSI frame binding. Allocation functions
> within the reserved iova domain will be introduced in subsequent patches.
> 
> Those functions are implemented and exposed if CONFIG_IOMMU_DMA_RESERVED
> is set.
> 
> Signed-off-by: Eric Auger <eric.auger@linaro.org>
> 
> ---
> 
> v5 -> v6:
> - use spin lock instead of mutex
> 
> v3 -> v4:
> - formerly in "iommu/arm-smmu: implement alloc/free_reserved_iova_domain" &
>   "iommu: add alloc/free_reserved_iova_domain"
> 
> v2 -> v3:
> - remove iommu_alloc_reserved_iova_domain & iommu_free_reserved_iova_domain
>   static implementation in case CONFIG_IOMMU_API is not set
> 
> v1 -> v2:
> - moved from vfio API to IOMMU API
> ---
>  drivers/iommu/Kconfig              |  8 ++++
>  drivers/iommu/Makefile             |  1 +
>  drivers/iommu/dma-reserved-iommu.c | 75 ++++++++++++++++++++++++++++++++++++++
>  include/linux/dma-reserved-iommu.h | 45 +++++++++++++++++++++++
>  4 files changed, 129 insertions(+)
>  create mode 100644 drivers/iommu/dma-reserved-iommu.c
>  create mode 100644 include/linux/dma-reserved-iommu.h
> 
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index dd1dc39..a5a1530 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -74,6 +74,12 @@ config IOMMU_DMA
>  	select IOMMU_IOVA
>  	select NEED_SG_DMA_LENGTH
>  
> +# IOMMU reserved IOVA mapping (MSI doorbell)
> +config IOMMU_DMA_RESERVED
> +	bool
> +	select IOMMU_API
> +	select IOMMU_IOVA
> +
>  config FSL_PAMU
>  	bool "Freescale IOMMU support"
>  	depends on PPC32
> @@ -307,6 +313,7 @@ config SPAPR_TCE_IOMMU
>  config ARM_SMMU
>  	bool "ARM Ltd. System MMU (SMMU) Support"
>  	depends on (ARM64 || ARM) && MMU
> +	select IOMMU_DMA_RESERVED
>  	select IOMMU_API
>  	select IOMMU_IO_PGTABLE_LPAE
>  	select ARM_DMA_USE_IOMMU if ARM
> @@ -320,6 +327,7 @@ config ARM_SMMU
>  config ARM_SMMU_V3
>  	bool "ARM Ltd. System MMU Version 3 (SMMUv3) Support"
>  	depends on ARM64 && PCI
> +	select IOMMU_DMA_RESERVED
>  	select IOMMU_API
>  	select IOMMU_IO_PGTABLE_LPAE
>  	select GENERIC_MSI_IRQ_DOMAIN
> diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
> index c6edb31..6c9ae99 100644
> --- a/drivers/iommu/Makefile
> +++ b/drivers/iommu/Makefile
> @@ -2,6 +2,7 @@ obj-$(CONFIG_IOMMU_API) += iommu.o
>  obj-$(CONFIG_IOMMU_API) += iommu-traces.o
>  obj-$(CONFIG_IOMMU_API) += iommu-sysfs.o
>  obj-$(CONFIG_IOMMU_DMA) += dma-iommu.o
> +obj-$(CONFIG_IOMMU_DMA_RESERVED) += dma-reserved-iommu.o
>  obj-$(CONFIG_IOMMU_IO_PGTABLE) += io-pgtable.o
>  obj-$(CONFIG_IOMMU_IO_PGTABLE_ARMV7S) += io-pgtable-arm-v7s.o
>  obj-$(CONFIG_IOMMU_IO_PGTABLE_LPAE) += io-pgtable-arm.o
> diff --git a/drivers/iommu/dma-reserved-iommu.c b/drivers/iommu/dma-reserved-iommu.c
> new file mode 100644
> index 0000000..a461482
> --- /dev/null
> +++ b/drivers/iommu/dma-reserved-iommu.c
> @@ -0,0 +1,75 @@
> +/*
> + * Reserved IOVA Management
> + *
> + * Copyright (c) 2015 Linaro Ltd.
> + *              www.linaro.org
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + */
> +
> +#include <linux/iommu.h>
> +#include <linux/iova.h>
> +
> +int iommu_alloc_reserved_iova_domain(struct iommu_domain *domain,
> +				     dma_addr_t iova, size_t size,
> +				     unsigned long order)
> +{
> +	unsigned long granule, mask;
> +	struct iova_domain *iovad;
> +	unsigned long flags;
> +	int ret = 0;
> +
> +	granule = 1UL << order;
> +	mask = granule - 1;
> +	if (iova & mask || (!size) || (size & mask))
> +		return -EINVAL;
> +
> +	iovad = kzalloc(sizeof(struct iova_domain), GFP_KERNEL);
> +	if (!iovad)
> +		return -ENOMEM;
> +
> +	spin_lock_irqsave(&domain->reserved_lock, flags);
> +
> +	if (domain->reserved_iova_cookie) {
> +		ret = -EEXIST;
> +		goto free_unlock;
> +	}
> +
> +	init_iova_domain(iovad, granule,
> +			 iova >> order, (iova + size - 1) >> order);
> +	domain->reserved_iova_cookie = iovad;
> +	goto unlock;
> +
> +free_unlock:
> +	kfree(iovad);
> +unlock:
> +	spin_unlock_irqrestore(&domain->reserved_lock, flags);
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(iommu_alloc_reserved_iova_domain);
> +
> +void iommu_free_reserved_iova_domain(struct iommu_domain *domain)
> +{
> +	struct iova_domain *iovad =
> +		(struct iova_domain *)domain->reserved_iova_cookie;
> +	unsigned long flags;
> +
> +	if (!iovad)
> +		return;
> +
> +	spin_lock_irqsave(&domain->reserved_lock, flags);
> +
> +	put_iova_domain(iovad);
> +	kfree(iovad);
> +
> +	spin_unlock_irqrestore(&domain->reserved_lock, flags);
> +}

The locking here looks wrong: reserved_iova_cookie needs to be read and
cleared (the clearing is currently missing) within the lock; the put and
free can probably happen outside the lock.  Likewise the
init_iova_domain() in the alloc path probably doesn't need to occur
under the spinlock, but testing and setting reserved_iova_cookie does.
Seems like a mutex would do for locking here, but maybe the rb-tree
manipulation has reasons for using a spinlock.
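
Something along these lines would keep the cookie handling inside the
lock (an untested sketch):

	void iommu_free_reserved_iova_domain(struct iommu_domain *domain)
	{
		struct iova_domain *iovad;
		unsigned long flags;

		spin_lock_irqsave(&domain->reserved_lock, flags);
		iovad = domain->reserved_iova_cookie;
		domain->reserved_iova_cookie = NULL;
		spin_unlock_irqrestore(&domain->reserved_lock, flags);

		if (!iovad)
			return;

		put_iova_domain(iovad);
		kfree(iovad);
	}

Thanks,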

Alex

> +EXPORT_SYMBOL_GPL(iommu_free_reserved_iova_domain);
> diff --git a/include/linux/dma-reserved-iommu.h b/include/linux/dma-reserved-iommu.h
> new file mode 100644
> index 0000000..5bf863b
> --- /dev/null
> +++ b/include/linux/dma-reserved-iommu.h
> @@ -0,0 +1,45 @@
> +/*
> + * Copyright (c) 2015 Linaro Ltd.
> + *              www.linaro.org
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + */
> +#ifndef __DMA_RESERVED_IOMMU_H
> +#define __DMA_RESERVED_IOMMU_H
> +
> +#ifdef __KERNEL__
> +#include <asm/errno.h>
> +
> +#ifdef CONFIG_IOMMU_DMA_RESERVED
> +#include <linux/iommu.h>
> +
> +/**
> + * iommu_alloc_reserved_iova_domain: allocate the reserved iova domain
> + *
> + * @domain: iommu domain handle
> + * @iova: base iova address
> + * @size: iova window size
> + * @order: page order
> + */
> +int iommu_alloc_reserved_iova_domain(struct iommu_domain *domain,
> +				     dma_addr_t iova, size_t size,
> +				     unsigned long order);
> +
> +/**
> + * iommu_free_reserved_iova_domain: free the reserved iova domain
> + *
> + * @domain: iommu domain handle
> + */
> +void iommu_free_reserved_iova_domain(struct iommu_domain *domain);
> +
> +#endif	/* CONFIG_IOMMU_DMA_RESERVED */
> +#endif	/* __KERNEL__ */
> +#endif	/* __DMA_RESERVED_IOMMU_H */


* Re: [PATCH v6 6/7] dma-reserved-iommu: iommu_get/put_single_reserved
  2016-04-04  8:07 ` [PATCH v6 6/7] dma-reserved-iommu: iommu_get/put_single_reserved Eric Auger
@ 2016-04-06 23:12   ` Alex Williamson
  2016-04-07  9:33     ` Eric Auger
  0 siblings, 1 reply; 18+ messages in thread
From: Alex Williamson @ 2016-04-06 23:12 UTC
  To: Eric Auger
  Cc: eric.auger, robin.murphy, will.deacon, joro, tglx, jason,
	marc.zyngier, christoffer.dall, linux-arm-kernel, kvmarm, kvm,
	suravee.suthikulpanit, patches, linux-kernel, Manish.Jaggi,
	Bharat.Bhushan, pranav.sawargaonkar, p.fedin, iommu,
	Jean-Philippe.Brucker, julien.grall

On Mon,  4 Apr 2016 08:07:01 +0000
Eric Auger <eric.auger@linaro.org> wrote:

> This patch introduces iommu_get/put_reserved_iova.
> 
> iommu_get_reserved_iova allows allocating new reserved iova pages and
> mapping them onto the physical pages that contain a given physical
> address. The page size is the IOMMU page size. It is the responsibility
> of the system integrator to make sure the IOMMU page size in use
> corresponds to the granularity of the MSI frame.
> 
> It returns the iova that is mapped onto the provided physical address.
> Hence the physical address passed as argument does not need to be aligned.
> 
> In case a mapping already exists between both pages, the IOVA mapped
> to the PA is directly returned.
> 
> Each time an iova is successfully returned, a binding ref count is
> incremented.
> 
> iommu_put_reserved_iova decrements the ref count and, when it reaches
> zero, the mapping is destroyed and the iova is released.
> 
> Signed-off-by: Eric Auger <eric.auger@linaro.org>
> Signed-off-by: Ankit Jindal <ajindal@apm.com>
> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
> Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
> 
> ---
> 
> v5 -> v6:
> - revisit locking with spin_lock instead of mutex
> - do not kref_get on 1st get
> - add size parameter to the get function following Marc's request
> - use the iova domain shift instead of using the smallest supported page size
> 
> v3 -> v4:
> - formerly in iommu: iommu_get/put_single_reserved &
>   iommu/arm-smmu: implement iommu_get/put_single_reserved
> - Attempted to address Marc's doubts about missing size/alignment
>   at VFIO level (user-space knows the IOMMU page size and the number
>   of IOVA pages to provision)
> 
> v2 -> v3:
> - remove static implementation of iommu_get_single_reserved &
>   iommu_put_single_reserved when CONFIG_IOMMU_API is not set
> 
> v1 -> v2:
> - previously a VFIO API, named vfio_alloc_map/unmap_free_reserved_iova
> ---
>  drivers/iommu/dma-reserved-iommu.c | 146 +++++++++++++++++++++++++++++++++++++
>  include/linux/dma-reserved-iommu.h |  28 +++++++
>  2 files changed, 174 insertions(+)
> 
> diff --git a/drivers/iommu/dma-reserved-iommu.c b/drivers/iommu/dma-reserved-iommu.c
> index f592118..3c759d9 100644
> --- a/drivers/iommu/dma-reserved-iommu.c
> +++ b/drivers/iommu/dma-reserved-iommu.c
> @@ -136,3 +136,149 @@ void iommu_free_reserved_iova_domain(struct iommu_domain *domain)
>  	spin_unlock_irqrestore(&domain->reserved_lock, flags);
>  }
>  EXPORT_SYMBOL_GPL(iommu_free_reserved_iova_domain);
> +
> +static void delete_reserved_binding(struct iommu_domain *domain,
> +				    struct iommu_reserved_binding *b)
> +{
> +	struct iova_domain *iovad =
> +		(struct iova_domain *)domain->reserved_iova_cookie;
> +	unsigned long order = iova_shift(iovad);
> +
> +	iommu_unmap(domain, b->iova, b->size);
> +	free_iova(iovad, b->iova >> order);
> +	kfree(b);
> +}
> +
> +int iommu_get_reserved_iova(struct iommu_domain *domain,
> +			      phys_addr_t addr, size_t size, int prot,
> +			      dma_addr_t *iova)
> +{
> +	struct iova_domain *iovad =
> +		(struct iova_domain *)domain->reserved_iova_cookie;
> +	unsigned long order = iova_shift(iovad);
> +	unsigned long  base_pfn, end_pfn, nb_iommu_pages;
> +	size_t iommu_page_size = 1 << order, binding_size;
> +	phys_addr_t aligned_base, offset;
> +	struct iommu_reserved_binding *b, *newb;
> +	unsigned long flags;
> +	struct iova *p_iova;
> +	bool unmap = false;
> +	int ret;
> +
> +	base_pfn = addr >> order;
> +	end_pfn = (addr + size - 1) >> order;
> +	nb_iommu_pages = end_pfn - base_pfn + 1;
> +	aligned_base = base_pfn << order;
> +	offset = addr - aligned_base;
> +	binding_size = nb_iommu_pages * iommu_page_size;
> +
> +	if (!iovad)
> +		return -EINVAL;
> +
> +	spin_lock_irqsave(&domain->reserved_lock, flags);
> +
> +	b = find_reserved_binding(domain, aligned_base, binding_size);
> +	if (b) {
> +		*iova = b->iova + offset;
> +		kref_get(&b->kref);
> +		ret = 0;
> +		goto unlock;
> +	}
> +
> +	spin_unlock_irqrestore(&domain->reserved_lock, flags);
> +
> +	/*
> +	 * no reserved IOVA was found for this PA, start allocating and
> +	 * registering one while the spin-lock is not held. iommu_map/unmap
> +	 * are not supposed to be atomic
> +	 */
> +
> +	p_iova = alloc_iova(iovad, nb_iommu_pages, iovad->dma_32bit_pfn, true);
> +	if (!p_iova)
> +		return -ENOMEM;

Here we're using iovad, which was read from reserved_iova_cookie outside
of the locking, which makes the locking ineffective.  Isn't this lock
also protecting our iova domain?  I'm confused.

> +
> +	*iova = iova_dma_addr(iovad, p_iova);
> +
> +	newb = kzalloc(sizeof(*b), GFP_KERNEL);
> +	if (!newb) {
> +		free_iova(iovad, p_iova->pfn_lo);
> +		return -ENOMEM;
> +	}
> +
> +	ret = iommu_map(domain, *iova, aligned_base, binding_size, prot);
> +	if (ret) {
> +		kfree(newb);
> +		free_iova(iovad, p_iova->pfn_lo);
> +		return ret;
> +	}
> +
> +	spin_lock_irqsave(&domain->reserved_lock, flags);
> +
> +	/* re-check the PA was not mapped behind our back while unlocked */
> +	b = find_reserved_binding(domain, aligned_base, binding_size);
> +	if (b) {
> +		*iova = b->iova + offset;
> +		kref_get(&b->kref);
> +		ret = 0;
> +		unmap = true;
> +		goto unlock;
> +	}
> +
> +	kref_init(&newb->kref);
> +	newb->domain = domain;
> +	newb->addr = aligned_base;
> +	newb->iova = *iova;
> +	newb->size = binding_size;
> +
> +	link_reserved_binding(domain, newb);
> +
> +	*iova += offset;
> +	goto unlock;

??

> +
> +unlock:
> +	spin_unlock_irqrestore(&domain->reserved_lock, flags);
> +	if (unmap)
> +		delete_reserved_binding(domain, newb);
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(iommu_get_reserved_iova);
> +
> +void iommu_put_reserved_iova(struct iommu_domain *domain, dma_addr_t iova)
> +{
> +	struct iova_domain *iovad =
> +		(struct iova_domain *)domain->reserved_iova_cookie;
> +	unsigned long order;
> +	phys_addr_t aligned_addr;
> +	dma_addr_t aligned_iova, page_size, mask, offset;
> +	struct iommu_reserved_binding *b;
> +	unsigned long flags;
> +	bool unmap = false;
> +
> +	order = iova_shift(iovad);
> +	page_size = (uint64_t)1 << order;
> +	mask = page_size - 1;
> +
> +	aligned_iova = iova & ~mask;
> +	offset = iova - aligned_iova;
> +
> +	aligned_addr = iommu_iova_to_phys(domain, aligned_iova);
> +
> +	spin_lock_irqsave(&domain->reserved_lock, flags);
> +	b = find_reserved_binding(domain, aligned_addr, page_size);
> +	if (!b)
> +		goto unlock;
> +
> +	if (atomic_sub_and_test(1, &b->kref.refcount)) {
> +		unlink_reserved_binding(domain, b);
> +		unmap = true;
> +	}
> +
> +unlock:
> +	spin_unlock_irqrestore(&domain->reserved_lock, flags);
> +	if (unmap)
> +		delete_reserved_binding(domain, b);
> +}
> +EXPORT_SYMBOL_GPL(iommu_put_reserved_iova);
> +
> +
> +
> diff --git a/include/linux/dma-reserved-iommu.h b/include/linux/dma-reserved-iommu.h
> index 5bf863b..dedea56 100644
> --- a/include/linux/dma-reserved-iommu.h
> +++ b/include/linux/dma-reserved-iommu.h
> @@ -40,6 +40,34 @@ int iommu_alloc_reserved_iova_domain(struct iommu_domain *domain,
>   */
>  void iommu_free_reserved_iova_domain(struct iommu_domain *domain);
>  
> +/**
> + * iommu_get_reserved_iova: allocate a contiguous set of iova pages and
> + * map them to the physical range defined by @addr and @size.
> + *
> + * @domain: iommu domain handle
> + * @addr: physical address to bind
> + * @size: size of the binding
> + * @prot: mapping protection attribute
> + * @iova: returned iova
> + *
> + * Mapped physical pfns are within [@addr >> order, (@addr + size -1) >> order]
> + * where order corresponds to the iova domain order.
> + * This mapping is reference counted as a whole and cannot be split.
> + */
> +int iommu_get_reserved_iova(struct iommu_domain *domain,
> +			      phys_addr_t addr, size_t size, int prot,
> +			      dma_addr_t *iova);
> +
> +/**
> + * iommu_put_reserved_iova: decrement the ref count of the reserved mapping
> + *
> + * @domain: iommu domain handle
> + * @iova: reserved iova whose binding ref count is decremented
> + *
> + * If the binding ref count reaches zero, destroy the reserved mapping
> + */
> +void iommu_put_reserved_iova(struct iommu_domain *domain, dma_addr_t iova);
> +
>  #endif	/* CONFIG_IOMMU_DMA_RESERVED */
>  #endif	/* __KERNEL__ */
>  #endif	/* __DMA_RESERVED_IOMMU_H */


* Re: [PATCH v6 0/7] KVM PCIe/MSI passthrough on ARM/ARM64: kernel part 1/3: iommu changes
  2016-04-04  8:06 [PATCH v6 0/7] KVM PCIe/MSI passthrough on ARM/ARM64: kernel part 1/3: iommu changes Eric Auger
                   ` (6 preceding siblings ...)
  2016-04-04  8:07 ` [PATCH v6 7/7] dma-reserved-iommu: iommu_unmap_reserved Eric Auger
@ 2016-04-06 23:15 ` Alex Williamson
  2016-04-07 12:28   ` Eric Auger
  7 siblings, 1 reply; 18+ messages in thread
From: Alex Williamson @ 2016-04-06 23:15 UTC
  To: Eric Auger
  Cc: eric.auger, robin.murphy, will.deacon, joro, tglx, jason,
	marc.zyngier, christoffer.dall, linux-arm-kernel, kvmarm, kvm,
	suravee.suthikulpanit, patches, linux-kernel, Manish.Jaggi,
	Bharat.Bhushan, pranav.sawargaonkar, p.fedin, iommu,
	Jean-Philippe.Brucker, julien.grall

On Mon,  4 Apr 2016 08:06:55 +0000
Eric Auger <eric.auger@linaro.org> wrote:

> This series introduces the dma-reserved-iommu API used to:
> - create/destroy an iova domain dedicated to reserved iova bindings
> - map/unmap physical addresses onto reserved IOVAs.
> - unmap and destroy all IOVA reserved bindings

Why are we making the decision to have an unbalanced map vs unmap?  We
can create individual mappings, but only unmap the whole thing and start
over.  That's a strange interface.  Thanks,

Alex
 
> Currently reserved IOVAs are meant to map MSI physical doorbells. A single
> reserved iova domain exists per iommu domain.
> 
> Also, a new domain attribute is introduced to signal whether the MSI
> addresses must be mapped in the IOMMU.
> 
> The VFIO subsystem is expected to create/destroy the reserved iova domain.
> 
> When the MSI sub-system is about to handle an MSI physical address
> that needs to be bound, it uses the dma-reserved-iommu API to map/unmap
> the address. Since several drivers are likely to use the same doorbell,
> reference counting must exist on the bindings. An RB-tree indexed by PA
> is used.
> 
> More details & context can be found at:
> http://www.linaro.org/blog/core-dump/kvm-pciemsi-passthrough-armarm64/
> 
> Best Regards
> 
> Eric
> 
> Git: complete series available at
> https://git.linaro.org/people/eric.auger/linux.git/shortlog/refs/heads/v4.6-rc1-pcie-passthrough-v6
> 
> History:
> 
> RFC v5 -> patch v6:
> - split to ease the review process
> - in dma-reserved-api use a spin lock instead of a mutex (reported by
>   Jean-Philippe)
> - revisit iommu_get_reserved_iova API to pass a size parameter upon
>   Marc's request
> - Consistently use the page order passed when creating the iova domain.
> - init reserved_binding_list (reported by Julien)
> 
> RFC v4 -> RFC v5:
> - take into account Thomas' comments on MSI related patches
>   - split "msi: IOMMU map the doorbell address when needed"
>   - increase readability and add comments
>   - fix style issues
>  - split "iommu: Add DOMAIN_ATTR_MSI_MAPPING attribute"
>  - platform ITS now advertises IOMMU_CAP_INTR_REMAP
>  - fix compilation issue with CONFIG_IOMMU API unset
>  - arm-smmu-v3 now advertises DOMAIN_ATTR_MSI_MAPPING
> 
> RFC v3 -> v4:
> - Move doorbell mapping/unmapping in msi.c
> - fix ref count issue on set_affinity: in case of a change in the address
>   the previous address is decremented
> - doorbell map/unmap now is done on msi composition. Should allow the use
>   case for platform MSI controllers
> - create dma-reserved-iommu.h/c exposing/implementing a new API dedicated
>   to reserved IOVA management (looking like dma-iommu glue)
> - series reordering to ease the review:
>   - first part is related to IOMMU
>   - second related to MSI sub-system
>   - third related to VFIO (except arm-smmu IOMMU_CAP_INTR_REMAP removal)
> - expose the number of requested IOVA pages through VFIO_IOMMU_GET_INFO
>   [this partially addresses Marc's comments on iommu_get/put_single_reserved
>    size/alignment problematic - which I did not ignore - but I don't know
>    how much I can do at the moment]
> 
> RFC v2 -> RFC v3:
> - should fix wrong handling of some CONFIG combinations:
>   CONFIG_IOVA, CONFIG_IOMMU_API, CONFIG_PCI_MSI_IRQ_DOMAIN
> - fix MSI_FLAG_IRQ_REMAPPING setting in GICv3 ITS (although not tested)
> 
> PATCH v1 -> RFC v2:
> - reverted to RFC since it looks more reasonable ;-) the code is split
>   between VFIO, IOMMU, MSI controller and I am not sure I did the right
>   choices. Also API need to be further discussed.
> - iova API usage in arm-smmu.c.
> - MSI controller natively programs the MSI addr with either the PA or IOVA.
>   This is not done anymore in vfio-pci driver as suggested by Alex.
> - check irq remapping capability of the group
> 
> RFC v1 [2] -> PATCH v1:
> - use the existing dma map/unmap ioctl interface with a flag to register a
>   reserved IOVA range. Use the legacy Rb to store this special vfio_dma.
> - a single reserved IOVA contiguous region now is allowed
> - use of an RB tree indexed by PA to store allocated reserved slots
> - use of a vfio_domain iova_domain to manage iova allocation within the
>   window provided by the userspace
> - vfio alloc_map/unmap_free take a vfio_group handle
> - vfio_group handle is cached in vfio_pci_device
> - add ref counting to bindings
> - user modality enabled at the end of the series
> 
> 
> Eric Auger (7):
>   iommu: Add DOMAIN_ATTR_MSI_MAPPING attribute
>   iommu/arm-smmu: advertise DOMAIN_ATTR_MSI_MAPPING attribute
>   iommu: introduce a reserved iova cookie
>   dma-reserved-iommu: alloc/free_reserved_iova_domain
>   dma-reserved-iommu: reserved binding rb-tree and helpers
>   dma-reserved-iommu: iommu_get/put_single_reserved
>   dma-reserved-iommu: iommu_unmap_reserved
> 
>  drivers/iommu/Kconfig              |   8 +
>  drivers/iommu/Makefile             |   1 +
>  drivers/iommu/arm-smmu-v3.c        |   2 +
>  drivers/iommu/arm-smmu.c           |   2 +
>  drivers/iommu/dma-reserved-iommu.c | 321 +++++++++++++++++++++++++++++++++++++
>  drivers/iommu/iommu.c              |   2 +
>  include/linux/dma-reserved-iommu.h |  80 +++++++++
>  include/linux/iommu.h              |   7 +
>  8 files changed, 423 insertions(+)
>  create mode 100644 drivers/iommu/dma-reserved-iommu.c
>  create mode 100644 include/linux/dma-reserved-iommu.h
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v6 4/7] dma-reserved-iommu: alloc/free_reserved_iova_domain
  2016-04-06 23:00   ` Alex Williamson
@ 2016-04-07  9:33     ` Eric Auger
  0 siblings, 0 replies; 18+ messages in thread
From: Eric Auger @ 2016-04-07  9:33 UTC (permalink / raw)
  To: Alex Williamson
  Cc: eric.auger, robin.murphy, will.deacon, joro, tglx, jason,
	marc.zyngier, christoffer.dall, linux-arm-kernel, kvmarm, kvm,
	suravee.suthikulpanit, patches, linux-kernel, Manish.Jaggi,
	Bharat.Bhushan, pranav.sawargaonkar, p.fedin, iommu,
	Jean-Philippe.Brucker, julien.grall

Hi Alex,
On 04/07/2016 01:00 AM, Alex Williamson wrote:
> On Mon,  4 Apr 2016 08:06:59 +0000
> Eric Auger <eric.auger@linaro.org> wrote:
> 
>> Introduce alloc/free_reserved_iova_domain in the IOMMU API.
>> alloc_reserved_iova_domain initializes an iova domain at a given
>> iova base address and with a given size. This iova domain will
>> be used to allocate IOVAs within that window. Those IOVAs will be reserved
>> for special purposes, typically MSI frame binding. Allocation functions
>> within the reserved iova domain will be introduced in subsequent patches.
>>
>> Those functions are implemented and exposed if CONFIG_IOMMU_DMA_RESERVED
>> is set.
>>
>> Signed-off-by: Eric Auger <eric.auger@linaro.org>
>>
>> ---
>>
>> v5 -> v6:
>> - use spin lock instead of mutex
>>
>> v3 -> v4:
>> - formerly in "iommu/arm-smmu: implement alloc/free_reserved_iova_domain" &
>>   "iommu: add alloc/free_reserved_iova_domain"
>>
>> v2 -> v3:
>> - remove iommu_alloc_reserved_iova_domain & iommu_free_reserved_iova_domain
>>   static implementation in case CONFIG_IOMMU_API is not set
>>
>> v1 -> v2:
>> - moved from vfio API to IOMMU API
>> ---
>>  drivers/iommu/Kconfig              |  8 ++++
>>  drivers/iommu/Makefile             |  1 +
>>  drivers/iommu/dma-reserved-iommu.c | 75 ++++++++++++++++++++++++++++++++++++++
>>  include/linux/dma-reserved-iommu.h | 45 +++++++++++++++++++++++
>>  4 files changed, 129 insertions(+)
>>  create mode 100644 drivers/iommu/dma-reserved-iommu.c
>>  create mode 100644 include/linux/dma-reserved-iommu.h
>>
>> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
>> index dd1dc39..a5a1530 100644
>> --- a/drivers/iommu/Kconfig
>> +++ b/drivers/iommu/Kconfig
>> @@ -74,6 +74,12 @@ config IOMMU_DMA
>>  	select IOMMU_IOVA
>>  	select NEED_SG_DMA_LENGTH
>>  
>> +# IOMMU reserved IOVA mapping (MSI doorbell)
>> +config IOMMU_DMA_RESERVED
>> +	bool
>> +	select IOMMU_API
>> +	select IOMMU_IOVA
>> +
>>  config FSL_PAMU
>>  	bool "Freescale IOMMU support"
>>  	depends on PPC32
>> @@ -307,6 +313,7 @@ config SPAPR_TCE_IOMMU
>>  config ARM_SMMU
>>  	bool "ARM Ltd. System MMU (SMMU) Support"
>>  	depends on (ARM64 || ARM) && MMU
>> +	select IOMMU_DMA_RESERVED
>>  	select IOMMU_API
>>  	select IOMMU_IO_PGTABLE_LPAE
>>  	select ARM_DMA_USE_IOMMU if ARM
>> @@ -320,6 +327,7 @@ config ARM_SMMU
>>  config ARM_SMMU_V3
>>  	bool "ARM Ltd. System MMU Version 3 (SMMUv3) Support"
>>  	depends on ARM64 && PCI
>> +	select IOMMU_DMA_RESERVED
>>  	select IOMMU_API
>>  	select IOMMU_IO_PGTABLE_LPAE
>>  	select GENERIC_MSI_IRQ_DOMAIN
>> diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
>> index c6edb31..6c9ae99 100644
>> --- a/drivers/iommu/Makefile
>> +++ b/drivers/iommu/Makefile
>> @@ -2,6 +2,7 @@ obj-$(CONFIG_IOMMU_API) += iommu.o
>>  obj-$(CONFIG_IOMMU_API) += iommu-traces.o
>>  obj-$(CONFIG_IOMMU_API) += iommu-sysfs.o
>>  obj-$(CONFIG_IOMMU_DMA) += dma-iommu.o
>> +obj-$(CONFIG_IOMMU_DMA_RESERVED) += dma-reserved-iommu.o
>>  obj-$(CONFIG_IOMMU_IO_PGTABLE) += io-pgtable.o
>>  obj-$(CONFIG_IOMMU_IO_PGTABLE_ARMV7S) += io-pgtable-arm-v7s.o
>>  obj-$(CONFIG_IOMMU_IO_PGTABLE_LPAE) += io-pgtable-arm.o
>> diff --git a/drivers/iommu/dma-reserved-iommu.c b/drivers/iommu/dma-reserved-iommu.c
>> new file mode 100644
>> index 0000000..a461482
>> --- /dev/null
>> +++ b/drivers/iommu/dma-reserved-iommu.c
>> @@ -0,0 +1,75 @@
>> +/*
>> + * Reserved IOVA Management
>> + *
>> + * Copyright (c) 2015 Linaro Ltd.
>> + *              www.linaro.org
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + */
>> +
>> +#include <linux/iommu.h>
>> +#include <linux/iova.h>
>> +
>> +int iommu_alloc_reserved_iova_domain(struct iommu_domain *domain,
>> +				     dma_addr_t iova, size_t size,
>> +				     unsigned long order)
>> +{
>> +	unsigned long granule, mask;
>> +	struct iova_domain *iovad;
>> +	unsigned long flags;
>> +	int ret = 0;
>> +
>> +	granule = 1UL << order;
>> +	mask = granule - 1;
>> +	if (iova & mask || (!size) || (size & mask))
>> +		return -EINVAL;
>> +
>> +	iovad = kzalloc(sizeof(struct iova_domain), GFP_KERNEL);
>> +	if (!iovad)
>> +		return -ENOMEM;
>> +
>> +	spin_lock_irqsave(&domain->reserved_lock, flags);
>> +
>> +	if (domain->reserved_iova_cookie) {
>> +		ret = -EEXIST;
>> +		goto free_unlock;
>> +	}
>> +
>> +	init_iova_domain(iovad, granule,
>> +			 iova >> order, (iova + size - 1) >> order);
>> +	domain->reserved_iova_cookie = iovad;
>> +	goto unlock;
>> +
>> +free_unlock:
>> +	kfree(iovad);
>> +unlock:
>> +	spin_unlock_irqrestore(&domain->reserved_lock, flags);
>> +	return ret;
>> +}
>> +EXPORT_SYMBOL_GPL(iommu_alloc_reserved_iova_domain);
>> +
>> +void iommu_free_reserved_iova_domain(struct iommu_domain *domain)
>> +{
>> +	struct iova_domain *iovad =
>> +		(struct iova_domain *)domain->reserved_iova_cookie;
>> +	unsigned long flags;
>> +
>> +	if (!iovad)
>> +		return;
>> +
>> +	spin_lock_irqsave(&domain->reserved_lock, flags);
>> +
>> +	put_iova_domain(iovad);
>> +	kfree(iovad);
>> +
>> +	spin_unlock_irqrestore(&domain->reserved_lock, flags);
>> +}
> 
> The locking here looks wrong, reserved_iova_cookie needs to be read
> and cleared (currently missing) within the lock, the put and free can
> probably happen outside the lock. Likewise the init_iova_domain() in
> the alloc path probably doesn't need to occur under spinlock, but
> testing and setting reserved_iova_cookie does.
I agree with your statements. I need to rework this.

> Seems like a mutex
> would do for locking here, but maybe the rb-tree manipulation has
> reasons for using a spinlock.
I moved to a spinlock after Jean-Philippe spotted a call sequence
disallowing sleeps:
https://lkml.org/lkml/2016/3/10/216
... which I did not address correctly in this respin, as I focused on
this flawed spinlock rework.
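
For the free path, a sketch along the lines of your comment (read and
clear the cookie under the lock, do the put and free outside of it):

	void iommu_free_reserved_iova_domain(struct iommu_domain *domain)
	{
		struct iova_domain *iovad;
		unsigned long flags;

		spin_lock_irqsave(&domain->reserved_lock, flags);
		iovad = domain->reserved_iova_cookie;
		domain->reserved_iova_cookie = NULL;
		spin_unlock_irqrestore(&domain->reserved_lock, flags);

		if (!iovad)
			return;

		/* the cookie is no longer reachable: no lock needed here */
		put_iova_domain(iovad);
		kfree(iovad);
	}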

Thanks

Eric
> Thanks,
> 
> Alex
> 
>> +EXPORT_SYMBOL_GPL(iommu_free_reserved_iova_domain);
>> diff --git a/include/linux/dma-reserved-iommu.h b/include/linux/dma-reserved-iommu.h
>> new file mode 100644
>> index 0000000..5bf863b
>> --- /dev/null
>> +++ b/include/linux/dma-reserved-iommu.h
>> @@ -0,0 +1,45 @@
>> +/*
>> + * Copyright (c) 2015 Linaro Ltd.
>> + *              www.linaro.org
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + */
>> +#ifndef __DMA_RESERVED_IOMMU_H
>> +#define __DMA_RESERVED_IOMMU_H
>> +
>> +#ifdef __KERNEL__
>> +#include <asm/errno.h>
>> +
>> +#ifdef CONFIG_IOMMU_DMA_RESERVED
>> +#include <linux/iommu.h>
>> +
>> +/**
>> + * iommu_alloc_reserved_iova_domain: allocate the reserved iova domain
>> + *
>> + * @domain: iommu domain handle
>> + * @iova: base iova address
>> + * @size: iova window size
>> + * @order: page order
>> + */
>> +int iommu_alloc_reserved_iova_domain(struct iommu_domain *domain,
>> +				     dma_addr_t iova, size_t size,
>> +				     unsigned long order);
>> +
>> +/**
>> + * iommu_free_reserved_iova_domain: free the reserved iova domain
>> + *
>> + * @domain: iommu domain handle
>> + */
>> +void iommu_free_reserved_iova_domain(struct iommu_domain *domain);
>> +
>> +#endif	/* CONFIG_IOMMU_DMA_RESERVED */
>> +#endif	/* __KERNEL__ */
>> +#endif	/* __DMA_RESERVED_IOMMU_H */
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v6 6/7] dma-reserved-iommu: iommu_get/put_single_reserved
  2016-04-06 23:12   ` Alex Williamson
@ 2016-04-07  9:33     ` Eric Auger
  2016-04-07 14:38       ` Jean-Philippe Brucker
  0 siblings, 1 reply; 18+ messages in thread
From: Eric Auger @ 2016-04-07  9:33 UTC (permalink / raw)
  To: Alex Williamson
  Cc: eric.auger, robin.murphy, will.deacon, joro, tglx, jason,
	marc.zyngier, christoffer.dall, linux-arm-kernel, kvmarm, kvm,
	suravee.suthikulpanit, patches, linux-kernel, Manish.Jaggi,
	Bharat.Bhushan, pranav.sawargaonkar, p.fedin, iommu,
	Jean-Philippe.Brucker, julien.grall

Alex,
On 04/07/2016 01:12 AM, Alex Williamson wrote:
> On Mon,  4 Apr 2016 08:07:01 +0000
> Eric Auger <eric.auger@linaro.org> wrote:
> 
>> This patch introduces iommu_get/put_single_reserved.
>>
>> iommu_get_single_reserved allows allocating a new reserved iova page
>> and mapping it onto the physical page that contains a given physical address.
>> The page size is the IOMMU page size. It is the responsibility of the
>> system integrator to make sure the IOMMU page size in use corresponds
>> to the granularity of the MSI frame.
>>
>> It returns the iova that is mapped onto the provided physical address.
>> Hence the physical address passed in argument does not need to be aligned.
>>
>> In case a mapping already exists between both pages, the IOVA mapped
>> to the PA is directly returned.
>>
>> Each time an iova is successfully returned a binding ref count is
>> incremented.
>>
>> iommu_put_single_reserved decrements the ref count; when it reaches
>> zero, the mapping is destroyed and the iova is released.
>>
>> Signed-off-by: Eric Auger <eric.auger@linaro.org>
>> Signed-off-by: Ankit Jindal <ajindal@apm.com>
>> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
>> Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
>>
>> ---
>>
>> v5 -> v6:
>> - revisit locking with spin_lock instead of mutex
>> - do not kref_get on 1st get
>> - add size parameter to the get function following Marc's request
>> - use the iova domain shift instead of using the smallest supported page size
>>
>> v3 -> v4:
>> - formerly in iommu: iommu_get/put_single_reserved &
>>   iommu/arm-smmu: implement iommu_get/put_single_reserved
>> - Attempted to address Marc's doubts about missing size/alignment
>>   at VFIO level (user-space knows the IOMMU page size and the number
>>   of IOVA pages to provision)
>>
>> v2 -> v3:
>> - remove static implementation of iommu_get_single_reserved &
>>   iommu_put_single_reserved when CONFIG_IOMMU_API is not set
>>
>> v1 -> v2:
>> - previously a VFIO API, named vfio_alloc_map/unmap_free_reserved_iova
>> ---
>>  drivers/iommu/dma-reserved-iommu.c | 146 +++++++++++++++++++++++++++++++++++++
>>  include/linux/dma-reserved-iommu.h |  28 +++++++
>>  2 files changed, 174 insertions(+)
>>
>> diff --git a/drivers/iommu/dma-reserved-iommu.c b/drivers/iommu/dma-reserved-iommu.c
>> index f592118..3c759d9 100644
>> --- a/drivers/iommu/dma-reserved-iommu.c
>> +++ b/drivers/iommu/dma-reserved-iommu.c
>> @@ -136,3 +136,149 @@ void iommu_free_reserved_iova_domain(struct iommu_domain *domain)
>>  	spin_unlock_irqrestore(&domain->reserved_lock, flags);
>>  }
>>  EXPORT_SYMBOL_GPL(iommu_free_reserved_iova_domain);
>> +
>> +static void delete_reserved_binding(struct iommu_domain *domain,
>> +				    struct iommu_reserved_binding *b)
>> +{
>> +	struct iova_domain *iovad =
>> +		(struct iova_domain *)domain->reserved_iova_cookie;
>> +	unsigned long order = iova_shift(iovad);
>> +
>> +	iommu_unmap(domain, b->iova, b->size);
>> +	free_iova(iovad, b->iova >> order);
>> +	kfree(b);
>> +}
>> +
>> +int iommu_get_reserved_iova(struct iommu_domain *domain,
>> +			      phys_addr_t addr, size_t size, int prot,
>> +			      dma_addr_t *iova)
>> +{
>> +	struct iova_domain *iovad =
>> +		(struct iova_domain *)domain->reserved_iova_cookie;
>> +	unsigned long order = iova_shift(iovad);
>> +	unsigned long  base_pfn, end_pfn, nb_iommu_pages;
>> +	size_t iommu_page_size = 1 << order, binding_size;
>> +	phys_addr_t aligned_base, offset;
>> +	struct iommu_reserved_binding *b, *newb;
>> +	unsigned long flags;
>> +	struct iova *p_iova;
>> +	bool unmap = false;
>> +	int ret;
>> +
>> +	base_pfn = addr >> order;
>> +	end_pfn = (addr + size - 1) >> order;
>> +	nb_iommu_pages = end_pfn - base_pfn + 1;
>> +	aligned_base = base_pfn << order;
>> +	offset = addr - aligned_base;
>> +	binding_size = nb_iommu_pages * iommu_page_size;
>> +
>> +	if (!iovad)
>> +		return -EINVAL;
>> +
>> +	spin_lock_irqsave(&domain->reserved_lock, flags);
>> +
>> +	b = find_reserved_binding(domain, aligned_base, binding_size);
>> +	if (b) {
>> +		*iova = b->iova + offset;
>> +		kref_get(&b->kref);
>> +		ret = 0;
>> +		goto unlock;
>> +	}
>> +
>> +	spin_unlock_irqrestore(&domain->reserved_lock, flags);
>> +
>> +	/*
>> +	 * no reserved IOVA was found for this PA, start allocating and
>> +	 * registering one while the spin-lock is not held. iommu_map/unmap
>> +	 * are not supposed to be atomic
>> +	 */
>> +
>> +	p_iova = alloc_iova(iovad, nb_iommu_pages, iovad->dma_32bit_pfn, true);
>> +	if (!p_iova)
>> +		return -ENOMEM;
> 
> Here we're using iovad, which was the reserved_iova_cookie outside of
> the locking, which makes the locking ineffective.  Isn't this lock also
> protecting our iova domain, I'm confused.
I think I was too when I wrote that :-(
> 
>> +
>> +	*iova = iova_dma_addr(iovad, p_iova);
>> +
>> +	newb = kzalloc(sizeof(*b), GFP_KERNEL);
needs to be GFP_ATOMIC as Jean-Philippe stated before.
>> +	if (!newb) {
>> +		free_iova(iovad, p_iova->pfn_lo);
>> +		return -ENOMEM;
>> +	}
>> +
>> +	ret = iommu_map(domain, *iova, aligned_base, binding_size, prot);
one problem I have is that I would need iommu_map to be atomic (because
of the call sequence reported by Jean-Philippe) and I have no guarantee
that it is in general. It is for arm-smmu(-v3).c, which covers the ARM
use case, but what about other SMMUs potentially used in that process?

Thanks
Eric
>> +	if (ret) {
>> +		kfree(newb);
>> +		free_iova(iovad, p_iova->pfn_lo);
>> +		return ret;
>> +	}
>> +
>> +	spin_lock_irqsave(&domain->reserved_lock, flags);
>> +
>> +	/* re-check the PA was not mapped in our back when lock was not held */
>> +	b = find_reserved_binding(domain, aligned_base, binding_size);
>> +	if (b) {
>> +		*iova = b->iova + offset;
>> +		kref_get(&b->kref);
>> +		ret = 0;
>> +		unmap = true;
>> +		goto unlock;
>> +	}
>> +
>> +	kref_init(&newb->kref);
>> +	newb->domain = domain;
>> +	newb->addr = aligned_base;
>> +	newb->iova = *iova;
>> +	newb->size = binding_size;
>> +
>> +	link_reserved_binding(domain, newb);
>> +
>> +	*iova += offset;
>> +	goto unlock;
> 
> ??
> 
>> +
>> +unlock:
>> +	spin_unlock_irqrestore(&domain->reserved_lock, flags);
>> +	if (unmap)
>> +		delete_reserved_binding(domain, newb);
>> +	return ret;
>> +}
>> +EXPORT_SYMBOL_GPL(iommu_get_reserved_iova);
>> +
>> +void iommu_put_reserved_iova(struct iommu_domain *domain, dma_addr_t iova)
>> +{
>> +	struct iova_domain *iovad =
>> +		(struct iova_domain *)domain->reserved_iova_cookie;
>> +	unsigned long order;
>> +	phys_addr_t aligned_addr;
>> +	dma_addr_t aligned_iova, page_size, mask, offset;
>> +	struct iommu_reserved_binding *b;
>> +	unsigned long flags;
>> +	bool unmap = false;
>> +
>> +	order = iova_shift(iovad);
>> +	page_size = (uint64_t)1 << order;
>> +	mask = page_size - 1;
>> +
>> +	aligned_iova = iova & ~mask;
>> +	offset = iova - aligned_iova;
>> +
>> +	aligned_addr = iommu_iova_to_phys(domain, aligned_iova);
>> +
>> +	spin_lock_irqsave(&domain->reserved_lock, flags);
>> +	b = find_reserved_binding(domain, aligned_addr, page_size);
>> +	if (!b)
>> +		goto unlock;
>> +
>> +	if (atomic_sub_and_test(1, &b->kref.refcount)) {
>> +		unlink_reserved_binding(domain, b);
>> +		unmap = true;
>> +	}
>> +
>> +unlock:
>> +	spin_unlock_irqrestore(&domain->reserved_lock, flags);
>> +	if (unmap)
>> +		delete_reserved_binding(domain, b);
>> +}
>> +EXPORT_SYMBOL_GPL(iommu_put_reserved_iova);
>> +
>> +
>> +
>> diff --git a/include/linux/dma-reserved-iommu.h b/include/linux/dma-reserved-iommu.h
>> index 5bf863b..dedea56 100644
>> --- a/include/linux/dma-reserved-iommu.h
>> +++ b/include/linux/dma-reserved-iommu.h
>> @@ -40,6 +40,34 @@ int iommu_alloc_reserved_iova_domain(struct iommu_domain *domain,
>>   */
>>  void iommu_free_reserved_iova_domain(struct iommu_domain *domain);
>>  
>> +/**
>> + * iommu_get_reserved_iova: allocate a contiguous set of iova pages and
>> + * map them to the physical range defined by @addr and @size.
>> + *
>> + * @domain: iommu domain handle
>> + * @addr: physical address to bind
>> + * @size: size of the binding
>> + * @prot: mapping protection attribute
>> + * @iova: returned iova
>> + *
>> + * Mapped physical pfns are within [@addr >> order, (@addr + size - 1) >> order]
>> + * where order corresponds to the iova domain order.
>> + * This mapping is reference counted as a whole and cannot be split.
>> + */
>> +int iommu_get_reserved_iova(struct iommu_domain *domain,
>> +			      phys_addr_t addr, size_t size, int prot,
>> +			      dma_addr_t *iova);
>> +
>> +/**
>> + * iommu_put_reserved_iova: decrement a ref count of the reserved mapping
>> + *
>> + * @domain: iommu domain handle
>> + * @iova: reserved iova whose binding ref count is decremented
>> + *
>> + * if the binding ref count drops to zero, destroy the reserved mapping
>> + */
>> +void iommu_put_reserved_iova(struct iommu_domain *domain, dma_addr_t iova);
>> +
>>  #endif	/* CONFIG_IOMMU_DMA_RESERVED */
>>  #endif	/* __KERNEL__ */
>>  #endif	/* __DMA_RESERVED_IOMMU_H */
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v6 0/7] KVM PCIe/MSI passthrough on ARM/ARM64: kernel part 1/3: iommu changes
  2016-04-06 23:15 ` [PATCH v6 0/7] KVM PCIe/MSI passthrough on ARM/ARM64: kernel part 1/3: iommu changes Alex Williamson
@ 2016-04-07 12:28   ` Eric Auger
  2016-04-07 17:50     ` Alex Williamson
  0 siblings, 1 reply; 18+ messages in thread
From: Eric Auger @ 2016-04-07 12:28 UTC (permalink / raw)
  To: Alex Williamson
  Cc: eric.auger, robin.murphy, will.deacon, joro, tglx, jason,
	marc.zyngier, christoffer.dall, linux-arm-kernel, kvmarm, kvm,
	suravee.suthikulpanit, patches, linux-kernel, Manish.Jaggi,
	Bharat.Bhushan, pranav.sawargaonkar, p.fedin, iommu,
	Jean-Philippe.Brucker, julien.grall

Hi Alex,
On 04/07/2016 01:15 AM, Alex Williamson wrote:
> On Mon,  4 Apr 2016 08:06:55 +0000
> Eric Auger <eric.auger@linaro.org> wrote:
> 
>> This series introduces the dma-reserved-iommu api used to:
>> - create/destroy an iova domain dedicated to reserved iova bindings
>> - map/unmap physical addresses onto reserved IOVAs.
>> - unmap and destroy all IOVA reserved bindings
> 
> Why are we making the decision to have an unbalanced map vs unmap, we
> can create individual mappings, but only unmap the whole thing and
> start over?  That's a strange interface.  Thanks,
The "individual" balanced unmap also exists (iommu_put_reserved_iova)
and this is the "normal" path. This happens on msi_domain_deactivate
(and possibly on msi_domain_set_affinity).

I added iommu_unmap_reserved to handle the case where the userspace
registers a reserved iova domain and fails to unregister it. In that
case one needs to handle the cleanup on the kernel side, and I chose to
implement this on vfio_iommu_type1 release. All the reserved IOMMU
bindings get destroyed on that event.
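
On the VFIO side this roughly looks as follows (a sketch against the
type1 backend; only the reserved-binding loop is the new part):

	static void vfio_iommu_type1_release(void *iommu_data)
	{
		struct vfio_iommu *iommu = iommu_data;
		struct vfio_domain *d;

		/* destroy every reserved binding userspace left behind */
		list_for_each_entry(d, &iommu->domain_list, next)
			iommu_unmap_reserved(d->domain);

		/* ... existing dma unmapping and domain teardown ... */
	}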

Any advice to handle this situation?

Best Regards

Eric

> 
> Alex
>  
>> Currently reserved IOVAs are meant to map MSI physical doorbells. A single
>> reserved domain does exist per domain.
>>
>> Also a new domain attribute is introduced to signal whether the MSI
>> addresses must be mapped in the IOMMU
>>
>> VFIO subsystem is supposed to create/destroy the iommu reserved domain.
>>
>> When the MSI sub-system is about to handle an MSI physical address
>> that needs to be bound, it uses the dma-reserved_iommu API to map/unmap
>> the address. Since several drivers are likely to use the same doorbell,
>> a reference counting must exist on the bindings. An RB-tree indexed by PA
>> is used.
>>
>> More details & context can be found at:
>> http://www.linaro.org/blog/core-dump/kvm-pciemsi-passthrough-armarm64/
>>
>> Best Regards
>>
>> Eric
>>
>> Git: complete series available at
>> https://git.linaro.org/people/eric.auger/linux.git/shortlog/refs/heads/v4.6-rc1-pcie-passthrough-v6
>>
>> History:
>>
>> RFC v5 -> patch v6:
>> - split to ease the review process
>> - in dma-reserved-api use a spin lock instead of a mutex (reported by
>>   Jean-Philippe)
>> - revisit iommu_get_reserved_iova API to pass a size parameter upon
>>   Marc's request
>> - Consistently use the page order passed when creating the iova domain.
>> - init reserved_binding_list (reported by Julien)
>>
>> RFC v4 -> RFC v5:
>> - take into account Thomas' comments on MSI related patches
>>   - split "msi: IOMMU map the doorbell address when needed"
>>   - increase readability and add comments
>>   - fix style issues
>>  - split "iommu: Add DOMAIN_ATTR_MSI_MAPPING attribute"
>>  - platform ITS now advertises IOMMU_CAP_INTR_REMAP
>>  - fix compilation issue with CONFIG_IOMMU API unset
>>  - arm-smmu-v3 now advertises DOMAIN_ATTR_MSI_MAPPING
>>
>> RFC v3 -> v4:
>> - Move doorbell mapping/unmapping in msi.c
>> - fix ref count issue on set_affinity: in case of a change in the address
>>   the previous address is decremented
>> - doorbell map/unmap now is done on msi composition. Should allow the use
>>   case for platform MSI controllers
>> - create dma-reserved-iommu.h/c exposing/implementing a new API dedicated
>>   to reserved IOVA management (looking like dma-iommu glue)
>> - series reordering to ease the review:
>>   - first part is related to IOMMU
>>   - second related to MSI sub-system
>>   - third related to VFIO (except arm-smmu IOMMU_CAP_INTR_REMAP removal)
>> - expose the number of requested IOVA pages through VFIO_IOMMU_GET_INFO
>>   [this partially addresses Marc's comments on iommu_get/put_single_reserved
>>    size/alignment problematic - which I did not ignore - but I don't know
>>    how much I can do at the moment]
>>
>> RFC v2 -> RFC v3:
>> - should fix wrong handling of some CONFIG combinations:
>>   CONFIG_IOVA, CONFIG_IOMMU_API, CONFIG_PCI_MSI_IRQ_DOMAIN
>> - fix MSI_FLAG_IRQ_REMAPPING setting in GICv3 ITS (although not tested)
>>
>> PATCH v1 -> RFC v2:
>> - reverted to RFC since it looks more reasonable ;-) the code is split
>>   between VFIO, IOMMU, MSI controller and I am not sure I did the right
>>   choices. Also API need to be further discussed.
>> - iova API usage in arm-smmu.c.
>> - MSI controller natively programs the MSI addr with either the PA or IOVA.
>>   This is not done anymore in vfio-pci driver as suggested by Alex.
>> - check irq remapping capability of the group
>>
>> RFC v1 [2] -> PATCH v1:
>> - use the existing dma map/unmap ioctl interface with a flag to register a
>>   reserved IOVA range. Use the legacy Rb to store this special vfio_dma.
>> - a single reserved IOVA contiguous region now is allowed
>> - use of an RB tree indexed by PA to store allocated reserved slots
>> - use of a vfio_domain iova_domain to manage iova allocation within the
>>   window provided by the userspace
>> - vfio alloc_map/unmap_free take a vfio_group handle
>> - vfio_group handle is cached in vfio_pci_device
>> - add ref counting to bindings
>> - user modality enabled at the end of the series
>>
>>
>> Eric Auger (7):
>>   iommu: Add DOMAIN_ATTR_MSI_MAPPING attribute
>>   iommu/arm-smmu: advertise DOMAIN_ATTR_MSI_MAPPING attribute
>>   iommu: introduce a reserved iova cookie
>>   dma-reserved-iommu: alloc/free_reserved_iova_domain
>>   dma-reserved-iommu: reserved binding rb-tree and helpers
>>   dma-reserved-iommu: iommu_get/put_single_reserved
>>   dma-reserved-iommu: iommu_unmap_reserved
>>
>>  drivers/iommu/Kconfig              |   8 +
>>  drivers/iommu/Makefile             |   1 +
>>  drivers/iommu/arm-smmu-v3.c        |   2 +
>>  drivers/iommu/arm-smmu.c           |   2 +
>>  drivers/iommu/dma-reserved-iommu.c | 321 +++++++++++++++++++++++++++++++++++++
>>  drivers/iommu/iommu.c              |   2 +
>>  include/linux/dma-reserved-iommu.h |  80 +++++++++
>>  include/linux/iommu.h              |   7 +
>>  8 files changed, 423 insertions(+)
>>  create mode 100644 drivers/iommu/dma-reserved-iommu.c
>>  create mode 100644 include/linux/dma-reserved-iommu.h
>>
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v6 6/7] dma-reserved-iommu: iommu_get/put_single_reserved
  2016-04-07  9:33     ` Eric Auger
@ 2016-04-07 14:38       ` Jean-Philippe Brucker
  2016-04-07 16:44         ` Eric Auger
  0 siblings, 1 reply; 18+ messages in thread
From: Jean-Philippe Brucker @ 2016-04-07 14:38 UTC (permalink / raw)
  To: Eric Auger
  Cc: Alex Williamson, eric.auger, robin.murphy, will.deacon, joro,
	tglx, jason, marc.zyngier, christoffer.dall, linux-arm-kernel,
	kvmarm, kvm, suravee.suthikulpanit, patches, linux-kernel,
	Manish.Jaggi, Bharat.Bhushan, pranav.sawargaonkar, p.fedin,
	iommu, julien.grall

Hi Eric,

On Thu, Apr 07, 2016 at 11:33:42AM +0200, Eric Auger wrote:
> Alex,
> On 04/07/2016 01:12 AM, Alex Williamson wrote:
> > On Mon,  4 Apr 2016 08:07:01 +0000
> > Eric Auger <eric.auger@linaro.org> wrote:
> > 
> >> This patch introduces iommu_get/put_single_reserved.
> >>
> >> iommu_get_single_reserved allows allocating a new reserved iova page
> >> and mapping it onto the physical page that contains a given physical address.
> >> The page size is the IOMMU page size. It is the responsibility of the
> >> system integrator to make sure the IOMMU page size in use corresponds
> >> to the granularity of the MSI frame.
> >>
> >> It returns the iova that is mapped onto the provided physical address.
> >> Hence the physical address passed in argument does not need to be aligned.
> >>
> >> In case a mapping already exists between both pages, the IOVA mapped
> >> to the PA is directly returned.
> >>
> >> Each time an iova is successfully returned a binding ref count is
> >> incremented.
> >>
> >> iommu_put_single_reserved decrements the ref count; when it reaches
> >> zero, the mapping is destroyed and the iova is released.
> >>
> >> Signed-off-by: Eric Auger <eric.auger@linaro.org>
> >> Signed-off-by: Ankit Jindal <ajindal@apm.com>
> >> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
> >> Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
> >>
> >> ---
> >>
> >> v5 -> v6:
> >> - revisit locking with spin_lock instead of mutex
> >> - do not kref_get on 1st get
> >> - add size parameter to the get function following Marc's request
> >> - use the iova domain shift instead of using the smallest supported page size
> >>
> >> v3 -> v4:
> >> - formerly in iommu: iommu_get/put_single_reserved &
> >>   iommu/arm-smmu: implement iommu_get/put_single_reserved
> >> - Attempted to address Marc's doubts about missing size/alignment
> >>   at VFIO level (user-space knows the IOMMU page size and the number
> >>   of IOVA pages to provision)
> >>
> >> v2 -> v3:
> >> - remove static implementation of iommu_get_single_reserved &
> >>   iommu_put_single_reserved when CONFIG_IOMMU_API is not set
> >>
> >> v1 -> v2:
> >> - previously a VFIO API, named vfio_alloc_map/unmap_free_reserved_iova
> >> ---
> >>  drivers/iommu/dma-reserved-iommu.c | 146 +++++++++++++++++++++++++++++++++++++
> >>  include/linux/dma-reserved-iommu.h |  28 +++++++
> >>  2 files changed, 174 insertions(+)
> >>
> >> diff --git a/drivers/iommu/dma-reserved-iommu.c b/drivers/iommu/dma-reserved-iommu.c
> >> index f592118..3c759d9 100644
> >> --- a/drivers/iommu/dma-reserved-iommu.c
> >> +++ b/drivers/iommu/dma-reserved-iommu.c
> >> @@ -136,3 +136,149 @@ void iommu_free_reserved_iova_domain(struct iommu_domain *domain)
> >>  	spin_unlock_irqrestore(&domain->reserved_lock, flags);
> >>  }
> >>  EXPORT_SYMBOL_GPL(iommu_free_reserved_iova_domain);
> >> +
> >> +static void delete_reserved_binding(struct iommu_domain *domain,
> >> +				    struct iommu_reserved_binding *b)
> >> +{
> >> +	struct iova_domain *iovad =
> >> +		(struct iova_domain *)domain->reserved_iova_cookie;
> >> +	unsigned long order = iova_shift(iovad);
> >> +
> >> +	iommu_unmap(domain, b->iova, b->size);
> >> +	free_iova(iovad, b->iova >> order);
> >> +	kfree(b);
> >> +}
> >> +
> >> +int iommu_get_reserved_iova(struct iommu_domain *domain,
> >> +			      phys_addr_t addr, size_t size, int prot,
> >> +			      dma_addr_t *iova)
> >> +{
> >> +	struct iova_domain *iovad =
> >> +		(struct iova_domain *)domain->reserved_iova_cookie;
> >> +	unsigned long order = iova_shift(iovad);

Another nit: this call should be after the !iovad check below

> >> +	unsigned long  base_pfn, end_pfn, nb_iommu_pages;
> >> +	size_t iommu_page_size = 1 << order, binding_size;
> >> +	phys_addr_t aligned_base, offset;
> >> +	struct iommu_reserved_binding *b, *newb;
> >> +	unsigned long flags;
> >> +	struct iova *p_iova;
> >> +	bool unmap = false;
> >> +	int ret;
> >> +
> >> +	base_pfn = addr >> order;
> >> +	end_pfn = (addr + size - 1) >> order;
> >> +	nb_iommu_pages = end_pfn - base_pfn + 1;
> >> +	aligned_base = base_pfn << order;
> >> +	offset = addr - aligned_base;
> >> +	binding_size = nb_iommu_pages * iommu_page_size;
> >> +
> >> +	if (!iovad)
> >> +		return -EINVAL;
> >> +
> >> +	spin_lock_irqsave(&domain->reserved_lock, flags);
> >> +
> >> +	b = find_reserved_binding(domain, aligned_base, binding_size);
> >> +	if (b) {
> >> +		*iova = b->iova + offset;
> >> +		kref_get(&b->kref);
> >> +		ret = 0;
> >> +		goto unlock;
> >> +	}
> >> +
> >> +	spin_unlock_irqrestore(&domain->reserved_lock, flags);
> >> +
> >> +	/*
> >> +	 * no reserved IOVA was found for this PA, start allocating and
> >> +	 * registering one while the spin-lock is not held. iommu_map/unmap
> >> +	 * are not supposed to be atomic
> >> +	 */
> >> +
> >> +	p_iova = alloc_iova(iovad, nb_iommu_pages, iovad->dma_32bit_pfn, true);
> >> +	if (!p_iova)
> >> +		return -ENOMEM;
> > 
> > Here we're using iovad, which was the reserved_iova_cookie outside of
> > the locking, which makes the locking ineffective.  Isn't this lock also
> > protecting our iova domain, I'm confused.
> I think I was too when I wrote that :-(
> > 
> >> +
> >> +	*iova = iova_dma_addr(iovad, p_iova);
> >> +
> >> +	newb = kzalloc(sizeof(*b), GFP_KERNEL);
> needs to be GFP_ATOMIC as Jean-Philippe stated before.
> >> +	if (!newb) {
> >> +		free_iova(iovad, p_iova->pfn_lo);
> >> +		return -ENOMEM;
> >> +	}
> >> +
> >> +	ret = iommu_map(domain, *iova, aligned_base, binding_size, prot);
> one problem I have is that I would need iommu_map to be atomic (because
> of the call sequence reported by Jean-Philippe) and I have no guarantee
> that it is in general. It is for arm-smmu(-v3).c, which covers the ARM
> use case, but what about other SMMUs potentially used in that process?

Hmm, indeed. How about we move all the heavy mapping work to
msi_domain_prepare_irqs instead? It is allowed to sleep and, more
importantly, fail. It is called much earlier, by pci_enable_msi_range.

All we are missing is details about doorbells, which we could retrieve
from the MSI controller's driver, using one or more additional callbacks
in msi_domain_ops. Currently, we already need one such callback for
querying the number of doorbell pages; maybe we could also ask the
driver to provide their addresses? And in msi_domain_activate, simply
search for the IOVA already associated with the MSI message?
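
Roughly what I have in mind, with hypothetical names (struct
msi_doorbell_info and msi_map_doorbells do not exist, they only
illustrate the idea):

	struct msi_doorbell_info {
		phys_addr_t	addr;	/* doorbell physical address */
		size_t		size;	/* size of the doorbell region */
	};

	/*
	 * Called from msi_domain_prepare_irqs(), where sleeping and
	 * failing are both allowed; @db would be filled by new
	 * msi_domain_ops callbacks queried from the MSI controller driver.
	 */
	static int msi_map_doorbells(struct iommu_domain *domain,
				     const struct msi_doorbell_info *db,
				     unsigned int nb_doorbells)
	{
		dma_addr_t iova;
		unsigned int i;
		int ret;

		for (i = 0; i < nb_doorbells; i++) {
			ret = iommu_get_reserved_iova(domain, db[i].addr,
						      db[i].size, IOMMU_WRITE,
						      &iova);
			if (ret)
				return ret; /* caller unwinds prior bindings */
		}
		return 0;
	}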

I only briefly thought about it from the host point of view, not sure how
VFIO would cope with this method.

Thanks,
Jean-Philippe

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v6 6/7] dma-reserved-iommu: iommu_get/put_single_reserved
  2016-04-07 14:38       ` Jean-Philippe Brucker
@ 2016-04-07 16:44         ` Eric Auger
  0 siblings, 0 replies; 18+ messages in thread
From: Eric Auger @ 2016-04-07 16:44 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: Alex Williamson, eric.auger, robin.murphy, will.deacon, joro,
	tglx, jason, marc.zyngier, christoffer.dall, linux-arm-kernel,
	kvmarm, kvm, suravee.suthikulpanit, patches, linux-kernel,
	Manish.Jaggi, Bharat.Bhushan, pranav.sawargaonkar, p.fedin,
	iommu, julien.grall

On 04/07/2016 04:38 PM, Jean-Philippe Brucker wrote:
> Hi Eric,
> 
> On Thu, Apr 07, 2016 at 11:33:42AM +0200, Eric Auger wrote:
>> Alex,
>> On 04/07/2016 01:12 AM, Alex Williamson wrote:
>>> On Mon,  4 Apr 2016 08:07:01 +0000
>>> Eric Auger <eric.auger@linaro.org> wrote:
>>>
>>>> This patch introduces iommu_get/put_single_reserved.
>>>>
>>>> iommu_get_single_reserved allows allocating a new reserved iova page
>>>> and mapping it onto the physical page that contains a given physical address.
>>>> The page size is the IOMMU page size. It is the responsibility of the
>>>> system integrator to make sure the IOMMU page size in use corresponds
>>>> to the granularity of the MSI frame.
>>>>
>>>> It returns the iova that is mapped onto the provided physical address.
>>>> Hence the physical address passed in argument does not need to be aligned.
>>>>
>>>> In case a mapping already exists between both pages, the IOVA mapped
>>>> to the PA is directly returned.
>>>>
>>>> Each time an iova is successfully returned a binding ref count is
>>>> incremented.
>>>>
>>>> iommu_put_single_reserved decrements the ref count; when it reaches
>>>> zero, the mapping is destroyed and the iova is released.
>>>>
>>>> Signed-off-by: Eric Auger <eric.auger@linaro.org>
>>>> Signed-off-by: Ankit Jindal <ajindal@apm.com>
>>>> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
>>>> Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
>>>>
>>>> ---
>>>>
>>>> v5 -> v6:
>>>> - revisit locking with spin_lock instead of mutex
>>>> - do not kref_get on 1st get
>>>> - add size parameter to the get function following Marc's request
>>>> - use the iova domain shift instead of using the smallest supported page size
>>>>
>>>> v3 -> v4:
>>>> - formerly in iommu: iommu_get/put_single_reserved &
>>>>   iommu/arm-smmu: implement iommu_get/put_single_reserved
>>>> - Attempted to address Marc's doubts about missing size/alignment
>>>>   at VFIO level (user-space knows the IOMMU page size and the number
>>>>   of IOVA pages to provision)
>>>>
>>>> v2 -> v3:
>>>> - remove static implementation of iommu_get_single_reserved &
>>>>   iommu_put_single_reserved when CONFIG_IOMMU_API is not set
>>>>
>>>> v1 -> v2:
>>>> - previously a VFIO API, named vfio_alloc_map/unmap_free_reserved_iova
>>>> ---
>>>>  drivers/iommu/dma-reserved-iommu.c | 146 +++++++++++++++++++++++++++++++++++++
>>>>  include/linux/dma-reserved-iommu.h |  28 +++++++
>>>>  2 files changed, 174 insertions(+)
>>>>
>>>> diff --git a/drivers/iommu/dma-reserved-iommu.c b/drivers/iommu/dma-reserved-iommu.c
>>>> index f592118..3c759d9 100644
>>>> --- a/drivers/iommu/dma-reserved-iommu.c
>>>> +++ b/drivers/iommu/dma-reserved-iommu.c
>>>> @@ -136,3 +136,149 @@ void iommu_free_reserved_iova_domain(struct iommu_domain *domain)
>>>>  	spin_unlock_irqrestore(&domain->reserved_lock, flags);
>>>>  }
>>>>  EXPORT_SYMBOL_GPL(iommu_free_reserved_iova_domain);
>>>> +
>>>> +static void delete_reserved_binding(struct iommu_domain *domain,
>>>> +				    struct iommu_reserved_binding *b)
>>>> +{
>>>> +	struct iova_domain *iovad =
>>>> +		(struct iova_domain *)domain->reserved_iova_cookie;
>>>> +	unsigned long order = iova_shift(iovad);
>>>> +
>>>> +	iommu_unmap(domain, b->iova, b->size);
>>>> +	free_iova(iovad, b->iova >> order);
>>>> +	kfree(b);
>>>> +}
>>>> +
>>>> +int iommu_get_reserved_iova(struct iommu_domain *domain,
>>>> +			      phys_addr_t addr, size_t size, int prot,
>>>> +			      dma_addr_t *iova)
>>>> +{
>>>> +	struct iova_domain *iovad =
>>>> +		(struct iova_domain *)domain->reserved_iova_cookie;
>>>> +	unsigned long order = iova_shift(iovad);
> 
> Another nit: this call should be after the !iovad check below

Yes, definitely
> 
>>>> +	unsigned long  base_pfn, end_pfn, nb_iommu_pages;
>>>> +	size_t iommu_page_size = 1 << order, binding_size;
>>>> +	phys_addr_t aligned_base, offset;
>>>> +	struct iommu_reserved_binding *b, *newb;
>>>> +	unsigned long flags;
>>>> +	struct iova *p_iova;
>>>> +	bool unmap = false;
>>>> +	int ret;
>>>> +
>>>> +	base_pfn = addr >> order;
>>>> +	end_pfn = (addr + size - 1) >> order;
>>>> +	nb_iommu_pages = end_pfn - base_pfn + 1;
>>>> +	aligned_base = base_pfn << order;
>>>> +	offset = addr - aligned_base;
>>>> +	binding_size = nb_iommu_pages * iommu_page_size;
>>>> +
>>>> +	if (!iovad)
>>>> +		return -EINVAL;
>>>> +
>>>> +	spin_lock_irqsave(&domain->reserved_lock, flags);
>>>> +
>>>> +	b = find_reserved_binding(domain, aligned_base, binding_size);
>>>> +	if (b) {
>>>> +		*iova = b->iova + offset;
>>>> +		kref_get(&b->kref);
>>>> +		ret = 0;
>>>> +		goto unlock;
>>>> +	}
>>>> +
>>>> +	spin_unlock_irqrestore(&domain->reserved_lock, flags);
>>>> +
>>>> +	/*
>>>> +	 * no reserved IOVA was found for this PA, start allocating and
>>>> +	 * registering one while the spin-lock is not held. iommu_map/unmap
>>>> +	 * are not supposed to be atomic
>>>> +	 */
>>>> +
>>>> +	p_iova = alloc_iova(iovad, nb_iommu_pages, iovad->dma_32bit_pfn, true);
>>>> +	if (!p_iova)
>>>> +		return -ENOMEM;
>>>
>>> Here we're using iovad, which was the reserved_iova_cookie outside of
>>> the locking, which makes the locking ineffective.  Isn't this lock also
>>> protecting our iova domain, I'm confused.
>> I think I was too when I wrote that :-(
>>>
>>>> +
>>>> +	*iova = iova_dma_addr(iovad, p_iova);
>>>> +
>>>> +	newb = kzalloc(sizeof(*b), GFP_KERNEL);
>> needs to be GFP_ATOMIC as Jean-Philippe stated before.
>>>> +	if (!newb) {
>>>> +		free_iova(iovad, p_iova->pfn_lo);
>>>> +		return -ENOMEM;
>>>> +	}
>>>> +
>>>> +	ret = iommu_map(domain, *iova, aligned_base, binding_size, prot);
>> one problem I have is that I would need iommu_map to be atomic (because
>> of the call sequence reported by Jean-Philippe) and I have no guarantee
>> that it is in general. It is for arm-smmu(-v3).c, which covers the ARM
>> use case, but what about other SMMUs potentially used in that process?
> 
> Hmm, indeed. How about we move all the heavy mapping work to
> msi_domain_prepare_irqs instead? It is allowed to sleep and, more
> importantly, fail. It is called much earlier, by pci_enable_msi_range.

Indeed, this could be an option for setup.

However, a substitute for msi_domain_set_affinity should also be found,
I think, to handle a change in affinity (which can change the doorbell).

We have this path, where irq_migrate_all_off_this_cpu takes the irq_desc
raw_spin_lock:

cpuhotplug.c:irq_migrate_all_off_this_cpu
cpuhotplug.c:migrate_one_irq
irq_do_set_affinity
chip->irq_set_affinity
msi_domain_set_affinity
../..
iommu_map/unmap

> 
> All we are missing is details about doorbells, which we could retrieve
> from the MSI controller's driver, using one or more additional callbacks
> in msi_domain_ops. Currently, we already need one such callback for
> querying the number of doorbell pages;
Yes, currently I assume a single page per MSI controller, which is
arbitrary. I can add such a callback in my next version.

Thank you for your time

Eric
> maybe we could also ask the
> driver to provide their addresses? And in msi_domain_activate, simply
> search for the IOVA already associated with the MSI message?
> 
> I only briefly thought about it from the host point of view, not sure how
> VFIO would cope with this method.
> 
> Thanks,
> Jean-Philippe
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v6 0/7] KVM PCIe/MSI passthrough on ARM/ARM64: kernel part 1/3: iommu changes
  2016-04-07 12:28   ` Eric Auger
@ 2016-04-07 17:50     ` Alex Williamson
  2016-04-08 13:31       ` Eric Auger
  0 siblings, 1 reply; 18+ messages in thread
From: Alex Williamson @ 2016-04-07 17:50 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger, robin.murphy, will.deacon, joro, tglx, jason,
	marc.zyngier, christoffer.dall, linux-arm-kernel, kvmarm, kvm,
	suravee.suthikulpanit, patches, linux-kernel, Manish.Jaggi,
	Bharat.Bhushan, pranav.sawargaonkar, p.fedin, iommu,
	Jean-Philippe.Brucker, julien.grall

On Thu, 7 Apr 2016 14:28:59 +0200
Eric Auger <eric.auger@linaro.org> wrote:

> Hi Alex,
> On 04/07/2016 01:15 AM, Alex Williamson wrote:
> > On Mon,  4 Apr 2016 08:06:55 +0000
> > Eric Auger <eric.auger@linaro.org> wrote:
> >   
> >> This series introduces the dma-reserved-iommu api used to:
> >> - create/destroy an iova domain dedicated to reserved iova bindings
> >> - map/unmap physical addresses onto reserved IOVAs.
> >> - unmap and destroy all IOVA reserved bindings  
> > 
> > Why are we making the decision to have an unbalanced map vs unmap, we
> > can create individual mappings, but only unmap the whole thing and
> > start over?  That's a strange interface.  Thanks,  
> The "individual" balanced unmap also exists (iommu_put_reserved_iova)
> and this is the "normal" path. This happens on msi_domain_deactivate
> (and possibly on msi_domain_set_affinity).
> 
> I added iommu_unmap_reserved to handle the case where the userspace
> registers a reserved iova domain and fails to unregister it. In that
> case one needs to handle the cleanup on the kernel side, and I chose to
> implement this on vfio_iommu_type1 release. All the reserved IOMMU
> bindings get destroyed on that event.
> 
> Any advice to handle this situation?

If we want to model it similar to regular iommu domains, then
iommu_free_reserved_iova_domain() should release all the mappings and
destroy the iova domain.  Additionally, since the reserved iova domain
is just a construct on top of an iommu domain, it should be sufficient
to call iommu_domain_free() to also remove the reserved iova domain if
one exists.  Thanks,

Alex

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v6 0/7] KVM PCIe/MSI passthrough on ARM/ARM64: kernel part 1/3: iommu changes
  2016-04-07 17:50     ` Alex Williamson
@ 2016-04-08 13:31       ` Eric Auger
  0 siblings, 0 replies; 18+ messages in thread
From: Eric Auger @ 2016-04-08 13:31 UTC (permalink / raw)
  To: Alex Williamson
  Cc: eric.auger, robin.murphy, will.deacon, joro, tglx, jason,
	marc.zyngier, christoffer.dall, linux-arm-kernel, kvmarm, kvm,
	suravee.suthikulpanit, patches, linux-kernel, Manish.Jaggi,
	Bharat.Bhushan, pranav.sawargaonkar, p.fedin, iommu,
	Jean-Philippe.Brucker, julien.grall

Hi Alex,
On 04/07/2016 07:50 PM, Alex Williamson wrote:
> On Thu, 7 Apr 2016 14:28:59 +0200
> Eric Auger <eric.auger@linaro.org> wrote:
> 
>> Hi Alex,
>> On 04/07/2016 01:15 AM, Alex Williamson wrote:
>>> On Mon,  4 Apr 2016 08:06:55 +0000
>>> Eric Auger <eric.auger@linaro.org> wrote:
>>>   
>>>> This series introduces the dma-reserved-iommu api used to:
>>>> - create/destroy an iova domain dedicated to reserved iova bindings
>>>> - map/unmap physical addresses onto reserved IOVAs.
>>>> - unmap and destroy all IOVA reserved bindings  
>>>
>>> Why are we making the decision to have an unbalanced map vs unmap, we
>>> can create individual mappings, but only unmap the whole thing and
>>> start over?  That's a strange interface.  Thanks,  
>> The "individual" balanced unmap also exists (iommu_put_reserved_iova)
>> and this is the "normal" path. This happens on msi_domain_deactivate
>> (and possibly on msi_domain_set_affinity).
>>
>> I added iommu_unmap_reserved to handle the case where the userspace
>> registers a reserved iova domain and fails to unregister it. In that
>> case one needs to handle the cleanup on the kernel side, and I chose to
>> implement this on vfio_iommu_type1 release. All the reserved IOMMU
>> bindings get destroyed on that event.
>>
>> Any advice to handle this situation?
> 
> If we want to model it similar to regular iommu domains, then
> iommu_free_reserved_iova_domain() should release all the mappings and
> destroy the iova domain.
Yes, this sounds obvious now.
> Additionally, since the reserved iova domain
> is just a construct on top of an iommu domain, it should be sufficient
> to call iommu_domain_free() to also remove the reserved iova domain if
> one exists.  Thanks,
Yes. For the dma cookie (iommu_put_dma_cookie) I see this is done from
the iommu driver's domain_free callback.
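
i.e. something like this in the driver's domain_free path (a sketch;
the reserved teardown would mirror what iommu_put_dma_cookie does for
the dma cookie):

	static void arm_smmu_domain_free(struct iommu_domain *domain)
	{
		/* release all reserved bindings and the reserved iova domain */
		iommu_free_reserved_iova_domain(domain);

		/* ... existing page table and domain teardown ... */
	}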

Thanks

Eric
> 
> Alex
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2016-04-08 13:33 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-04-04  8:06 [PATCH v6 0/7] KVM PCIe/MSI passthrough on ARM/ARM64: kernel part 1/3: iommu changes Eric Auger
2016-04-04  8:06 ` [PATCH v6 1/7] iommu: Add DOMAIN_ATTR_MSI_MAPPING attribute Eric Auger
2016-04-04  8:06 ` [PATCH v6 2/7] iommu/arm-smmu: advertise " Eric Auger
2016-04-04  8:06 ` [PATCH v6 3/7] iommu: introduce a reserved iova cookie Eric Auger
2016-04-04  8:06 ` [PATCH v6 4/7] dma-reserved-iommu: alloc/free_reserved_iova_domain Eric Auger
2016-04-06 23:00   ` Alex Williamson
2016-04-07  9:33     ` Eric Auger
2016-04-04  8:07 ` [PATCH v6 5/7] dma-reserved-iommu: reserved binding rb-tree and helpers Eric Auger
2016-04-04  8:07 ` [PATCH v6 6/7] dma-reserved-iommu: iommu_get/put_single_reserved Eric Auger
2016-04-06 23:12   ` Alex Williamson
2016-04-07  9:33     ` Eric Auger
2016-04-07 14:38       ` Jean-Philippe Brucker
2016-04-07 16:44         ` Eric Auger
2016-04-04  8:07 ` [PATCH v6 7/7] dma-reserved-iommu: iommu_unmap_reserved Eric Auger
2016-04-06 23:15 ` [PATCH v6 0/7] KVM PCIe/MSI passthrough on ARM/ARM64: kernel part 1/3: iommu changes Alex Williamson
2016-04-07 12:28   ` Eric Auger
2016-04-07 17:50     ` Alex Williamson
2016-04-08 13:31       ` Eric Auger
