linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
@ 2013-11-19  5:17 Bharat Bhushan
  2013-11-19  5:17 ` [PATCH 1/9 v2] pci:msi: add weak function for returning msi region info Bharat Bhushan
                   ` (9 more replies)
  0 siblings, 10 replies; 34+ messages in thread
From: Bharat Bhushan @ 2013-11-19  5:17 UTC (permalink / raw)
  To: alex.williamson, joro, bhelgaas, agraf, scottwood, stuart.yoder,
	iommu, linux-pci, linuxppc-dev, linux-kernel
  Cc: Bharat Bhushan

From: Bharat Bhushan <bharat.bhushan@freescale.com>

PAMU (the FSL IOMMU) has a concept of a primary window and subwindows.
The primary window corresponds to the complete guest IOVA address space
(including MSI space); with respect to the IOMMU API this is termed
the geometry. The IOVA base of a subwindow is determined from the number
of subwindows (configurable using the IOMMU API).
The MSI I/O page must be within the geometry and the maximum supported
subwindows, so the MSI I/O page is set up just after the guest memory
IOVA space.

Patches 1/9-4/9 (inclusive) define the interface to get:
  - the number of MSI regions (which is the number of MSI banks for powerpc)
  - the MSI-region address range: the physical page which holds the
    address/addresses used for generating MSI interrupts,
    and the size of the page.

Patches 5/9-7/9 (inclusive) define the interface for setting up an
MSI IOVA base for an MSI region (bank) for a device, so that this
configured IOVA is used when the MSI message is composed.
Earlier we were using the IOMMU interface for getting the configured IOVA,
which was not correct, and Alex Williamson suggested this type of interface.

Patch 8/9 moves some common functions into a separate file so that they
can be used by the FSL_PAMU implementation (the next patch uses them).
They will also be used later for the iommu-none implementation. I believe
we can move more, but will take it step by step.

Finally, the last patch actually adds the support for FSL-PAMU :)

v1->v2
 - Added an interface for setting the MSI IOVA for an MSI region for a
   device. Earlier I added an IOMMU interface for the same, but per review
   comments that was removed and a direct interface between VFIO and MSI
   was created instead.
 - Incorporated review comments (details are in the individual patches)

Bharat Bhushan (9):
  pci:msi: add weak function for returning msi region info
  pci: msi: expose msi region information functions
  powerpc: pci: Add arch specific msi region interface
  powerpc: msi: Extend the msi region interface to get info from
    fsl_msi
  pci/msi: interface to set an iova for a msi region
  powerpc: pci: Extend msi iova page setup to arch specific
  pci: msi: Extend msi iova setting interface to powerpc arch
  vfio: moving some functions in common file
  vfio pci: Add vfio iommu implementation for FSL_PAMU

 arch/powerpc/include/asm/machdep.h |   10 +
 arch/powerpc/kernel/msi.c          |   28 +
 arch/powerpc/sysdev/fsl_msi.c      |  132 +++++-
 arch/powerpc/sysdev/fsl_msi.h      |   25 +-
 drivers/pci/msi.c                  |   35 ++
 drivers/vfio/Kconfig               |    6 +
 drivers/vfio/Makefile              |    5 +-
 drivers/vfio/vfio_iommu_common.c   |  227 ++++++++
 drivers/vfio/vfio_iommu_common.h   |   27 +
 drivers/vfio/vfio_iommu_fsl_pamu.c | 1003 ++++++++++++++++++++++++++++++++++++
 drivers/vfio/vfio_iommu_type1.c    |  206 +--------
 include/linux/msi.h                |   14 +
 include/linux/pci.h                |   21 +
 include/uapi/linux/vfio.h          |  100 ++++
 14 files changed, 1623 insertions(+), 216 deletions(-)
 create mode 100644 drivers/vfio/vfio_iommu_common.c
 create mode 100644 drivers/vfio/vfio_iommu_common.h
 create mode 100644 drivers/vfio/vfio_iommu_fsl_pamu.c



^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 1/9 v2] pci:msi: add weak function for returning msi region info
  2013-11-19  5:17 [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU) Bharat Bhushan
@ 2013-11-19  5:17 ` Bharat Bhushan
  2013-11-25 23:36   ` Bjorn Helgaas
  2013-11-19  5:17 ` [PATCH 2/9 v2] pci: msi: expose msi region information functions Bharat Bhushan
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 34+ messages in thread
From: Bharat Bhushan @ 2013-11-19  5:17 UTC (permalink / raw)
  To: alex.williamson, joro, bhelgaas, agraf, scottwood, stuart.yoder,
	iommu, linux-pci, linuxppc-dev, linux-kernel
  Cc: Bharat Bhushan, Bharat Bhushan

With an aperture type of IOMMU (like FSL PAMU), the VFIO IOMMU code needs
to know the MSI regions in order to map them in its hardware window. This
patch just defines the required weak functions; they will be used by
follow-up patches.

Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
---
v1->v2
 - Added description of "struct msi_region"

 drivers/pci/msi.c   |   22 ++++++++++++++++++++++
 include/linux/msi.h |   14 ++++++++++++++
 2 files changed, 36 insertions(+), 0 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index d5f90d6..2643a29 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -67,6 +67,28 @@ int __weak arch_msi_check_device(struct pci_dev *dev, int nvec, int type)
 	return chip->check_device(chip, dev, nvec, type);
 }
 
+int __weak arch_msi_get_region_count(void)
+{
+	return 0;
+}
+
+int __weak arch_msi_get_region(int region_num, struct msi_region *region)
+{
+	return 0;
+}
+
+int msi_get_region_count(void)
+{
+	return arch_msi_get_region_count();
+}
+EXPORT_SYMBOL(msi_get_region_count);
+
+int msi_get_region(int region_num, struct msi_region *region)
+{
+	return arch_msi_get_region(region_num, region);
+}
+EXPORT_SYMBOL(msi_get_region);
+
 int __weak arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 {
 	struct msi_desc *entry;
diff --git a/include/linux/msi.h b/include/linux/msi.h
index b17ead8..ade1480 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -51,6 +51,18 @@ struct msi_desc {
 };
 
 /*
+ * This structure is used to return
+ * - the physical address
+ * - the size
+ * of an MSI region.
+ */
+struct msi_region {
+	int region_num; /* MSI region number */
+	dma_addr_t addr; /* Address of MSI region */
+	size_t size; /* Size of MSI region */
+};
+
+/*
  * The arch hooks to setup up msi irqs. Those functions are
  * implemented as weak symbols so that they /can/ be overriden by
  * architecture specific code if needed.
@@ -64,6 +76,8 @@ void arch_restore_msi_irqs(struct pci_dev *dev, int irq);
 
 void default_teardown_msi_irqs(struct pci_dev *dev);
 void default_restore_msi_irqs(struct pci_dev *dev, int irq);
+int arch_msi_get_region_count(void);
+int arch_msi_get_region(int region_num, struct msi_region *region);
 
 struct msi_chip {
 	struct module *owner;
-- 
1.7.0.4




* [PATCH 2/9 v2] pci: msi: expose msi region information functions
  2013-11-19  5:17 [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU) Bharat Bhushan
  2013-11-19  5:17 ` [PATCH 1/9 v2] pci:msi: add weak function for returning msi region info Bharat Bhushan
@ 2013-11-19  5:17 ` Bharat Bhushan
  2013-11-19  5:17 ` [PATCH 3/9 v2] powerpc: pci: Add arch specific msi region interface Bharat Bhushan
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 34+ messages in thread
From: Bharat Bhushan @ 2013-11-19  5:17 UTC (permalink / raw)
  To: alex.williamson, joro, bhelgaas, agraf, scottwood, stuart.yoder,
	iommu, linux-pci, linuxppc-dev, linux-kernel
  Cc: Bharat Bhushan, Bharat Bhushan

By now we have defined all the interfaces for getting the MSI region
information; this patch exposes those interfaces to the rest of the kernel.
They will be used by the VFIO subsystem for setting up the IOMMU for MSI
interrupts of directly assigned devices.

Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
---
v1->v2
 - None

 include/linux/pci.h |   13 +++++++++++++
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/include/linux/pci.h b/include/linux/pci.h
index da172f9..c587034 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1142,6 +1142,7 @@ struct msix_entry {
 	u16	entry;	/* driver uses to specify entry, OS writes */
 };
 
+struct msi_region;
 
 #ifndef CONFIG_PCI_MSI
 static inline int pci_enable_msi_block(struct pci_dev *dev, unsigned int nvec)
@@ -1184,6 +1185,16 @@ static inline int pci_msi_enabled(void)
 {
 	return 0;
 }
+
+static inline int msi_get_region_count(void)
+{
+	return 0;
+}
+
+static inline int msi_get_region(int region_num, struct msi_region *region)
+{
+	return 0;
+}
 #else
 int pci_enable_msi_block(struct pci_dev *dev, unsigned int nvec);
 int pci_enable_msi_block_auto(struct pci_dev *dev, unsigned int *maxvec);
@@ -1196,6 +1207,8 @@ void pci_disable_msix(struct pci_dev *dev);
 void msi_remove_pci_irq_vectors(struct pci_dev *dev);
 void pci_restore_msi_state(struct pci_dev *dev);
 int pci_msi_enabled(void);
+int msi_get_region_count(void);
+int msi_get_region(int region_num, struct msi_region *region);
 #endif
 
 #ifdef CONFIG_PCIEPORTBUS
-- 
1.7.0.4




* [PATCH 3/9 v2] powerpc: pci: Add arch specific msi region interface
  2013-11-19  5:17 [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU) Bharat Bhushan
  2013-11-19  5:17 ` [PATCH 1/9 v2] pci:msi: add weak function for returning msi region info Bharat Bhushan
  2013-11-19  5:17 ` [PATCH 2/9 v2] pci: msi: expose msi region information functions Bharat Bhushan
@ 2013-11-19  5:17 ` Bharat Bhushan
  2013-11-19  5:17 ` [PATCH 4/9 v2] powerpc: msi: Extend the msi region interface to get info from fsl_msi Bharat Bhushan
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 34+ messages in thread
From: Bharat Bhushan @ 2013-11-19  5:17 UTC (permalink / raw)
  To: alex.williamson, joro, bhelgaas, agraf, scottwood, stuart.yoder,
	iommu, linux-pci, linuxppc-dev, linux-kernel
  Cc: Bharat Bhushan, Bharat Bhushan

This patch adds the interface to get the MSI region information from
arch-specific code. The machine-specific code is not yet defined.

Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
---
v1->v2
 - None

 arch/powerpc/include/asm/machdep.h |    8 ++++++++
 arch/powerpc/kernel/msi.c          |   18 ++++++++++++++++++
 2 files changed, 26 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
index 8b48090..8d1b787 100644
--- a/arch/powerpc/include/asm/machdep.h
+++ b/arch/powerpc/include/asm/machdep.h
@@ -30,6 +30,7 @@ struct file;
 struct pci_controller;
 struct kimage;
 struct pci_host_bridge;
+struct msi_region;
 
 struct machdep_calls {
 	char		*name;
@@ -124,6 +125,13 @@ struct machdep_calls {
 	int		(*setup_msi_irqs)(struct pci_dev *dev,
 					  int nvec, int type);
 	void		(*teardown_msi_irqs)(struct pci_dev *dev);
+
+	/* Returns the number of MSI regions (banks) */
+	int		(*msi_get_region_count)(void);
+
+	/* Returns the requested region's address and size */
+	int		(*msi_get_region)(int region_num,
+					  struct msi_region *region);
 #endif
 
 	void		(*restart)(char *cmd);
diff --git a/arch/powerpc/kernel/msi.c b/arch/powerpc/kernel/msi.c
index 8bbc12d..1a67787 100644
--- a/arch/powerpc/kernel/msi.c
+++ b/arch/powerpc/kernel/msi.c
@@ -13,6 +13,24 @@
 
 #include <asm/machdep.h>
 
+int arch_msi_get_region_count(void)
+{
+	if (ppc_md.msi_get_region_count) {
+		pr_debug("msi: Using platform get_region_count routine.\n");
+		return ppc_md.msi_get_region_count();
+	}
+	return 0;
+}
+
+int arch_msi_get_region(int region_num, struct msi_region *region)
+{
+	if (ppc_md.msi_get_region) {
+		pr_debug("msi: Using platform get_region routine.\n");
+		return ppc_md.msi_get_region(region_num, region);
+	}
+	return 0;
+}
+
 int arch_msi_check_device(struct pci_dev* dev, int nvec, int type)
 {
 	if (!ppc_md.setup_msi_irqs || !ppc_md.teardown_msi_irqs) {
-- 
1.7.0.4




* [PATCH 4/9 v2] powerpc: msi: Extend the msi region interface to get info from fsl_msi
  2013-11-19  5:17 [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU) Bharat Bhushan
                   ` (2 preceding siblings ...)
  2013-11-19  5:17 ` [PATCH 3/9 v2] powerpc: pci: Add arch specific msi region interface Bharat Bhushan
@ 2013-11-19  5:17 ` Bharat Bhushan
  2013-11-19  5:17 ` [PATCH 5/9 v2] pci/msi: interface to set an iova for a msi region Bharat Bhushan
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 34+ messages in thread
From: Bharat Bhushan @ 2013-11-19  5:17 UTC (permalink / raw)
  To: alex.williamson, joro, bhelgaas, agraf, scottwood, stuart.yoder,
	iommu, linux-pci, linuxppc-dev, linux-kernel
  Cc: Bharat Bhushan, Bharat Bhushan

The FSL MSI driver will provide the interface to get:
  - the number of MSI regions (which is the number of MSI banks for powerpc)
  - the region address range: the physical page which holds the
    address/addresses used for generating MSI interrupts,
    and the size of the page.

These are required to create IOMMU (Freescale PAMU) mappings for
devices which are directly assigned using VFIO.

Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
---
v1->v2
 - Atomic increment of the bank index for parallel probe of msi nodes

 arch/powerpc/sysdev/fsl_msi.c |   42 +++++++++++++++++++++++++++++++++++-----
 arch/powerpc/sysdev/fsl_msi.h |   11 ++++++++-
 2 files changed, 45 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/sysdev/fsl_msi.c b/arch/powerpc/sysdev/fsl_msi.c
index 77efbae..eeebbf0 100644
--- a/arch/powerpc/sysdev/fsl_msi.c
+++ b/arch/powerpc/sysdev/fsl_msi.c
@@ -109,6 +109,34 @@ static int fsl_msi_init_allocator(struct fsl_msi *msi_data)
 	return 0;
 }
 
+static int fsl_msi_get_region_count(void)
+{
+	int count = 0;
+	struct fsl_msi *msi_data;
+
+	list_for_each_entry(msi_data, &msi_head, list)
+		count++;
+
+	return count;
+}
+
+static int fsl_msi_get_region(int region_num, struct msi_region *region)
+{
+	struct fsl_msi *msi_data;
+
+	list_for_each_entry(msi_data, &msi_head, list) {
+		if (msi_data->bank_index == region_num) {
+			region->region_num = msi_data->bank_index;
+			/* Setting PAGE_SIZE as MSIIR is a 4 byte register */
+			region->size = PAGE_SIZE;
+			region->addr = msi_data->msiir & ~(region->size - 1);
+			return 0;
+		}
+	}
+
+	return -ENODEV;
+}
+
 static int fsl_msi_check_device(struct pci_dev *pdev, int nvec, int type)
 {
 	if (type == PCI_CAP_ID_MSIX)
@@ -150,7 +178,8 @@ static void fsl_compose_msi_msg(struct pci_dev *pdev, int hwirq,
 	if (reg && (len == sizeof(u64)))
 		address = be64_to_cpup(reg);
 	else
-		address = fsl_pci_immrbar_base(hose) + msi_data->msiir_offset;
+		address = fsl_pci_immrbar_base(hose) +
+			   (msi_data->msiir & 0xfffff);
 
 	msg->address_lo = lower_32_bits(address);
 	msg->address_hi = upper_32_bits(address);
@@ -393,6 +422,7 @@ static int fsl_of_msi_probe(struct platform_device *dev)
 	const struct fsl_msi_feature *features;
 	int len;
 	u32 offset;
+	static atomic_t bank_index = ATOMIC_INIT(-1);
 
 	match = of_match_device(fsl_of_msi_ids, &dev->dev);
 	if (!match)
@@ -436,18 +466,15 @@ static int fsl_of_msi_probe(struct platform_device *dev)
 				dev->dev.of_node->full_name);
 			goto error_out;
 		}
-		msi->msiir_offset =
-			features->msiir_offset + (res.start & 0xfffff);
 
 		/*
 		 * First read the MSIIR/MSIIR1 offset from dts
 		 * On failure use the hardcode MSIIR offset
 		 */
 		if (of_address_to_resource(dev->dev.of_node, 1, &msiir))
-			msi->msiir_offset = features->msiir_offset +
-					    (res.start & MSIIR_OFFSET_MASK);
+			msi->msiir = res.start + features->msiir_offset;
 		else
-			msi->msiir_offset = msiir.start & MSIIR_OFFSET_MASK;
+			msi->msiir = msiir.start;
 	}
 
 	msi->feature = features->fsl_pic_ip;
@@ -521,6 +548,7 @@ static int fsl_of_msi_probe(struct platform_device *dev)
 		}
 	}
 
+	msi->bank_index = atomic_inc_return(&bank_index);
 	list_add_tail(&msi->list, &msi_head);
 
 	/* The multiple setting ppc_md.setup_msi_irqs will not harm things */
@@ -528,6 +556,8 @@ static int fsl_of_msi_probe(struct platform_device *dev)
 		ppc_md.setup_msi_irqs = fsl_setup_msi_irqs;
 		ppc_md.teardown_msi_irqs = fsl_teardown_msi_irqs;
 		ppc_md.msi_check_device = fsl_msi_check_device;
+		ppc_md.msi_get_region_count = fsl_msi_get_region_count;
+		ppc_md.msi_get_region = fsl_msi_get_region;
 	} else if (ppc_md.setup_msi_irqs != fsl_setup_msi_irqs) {
 		dev_err(&dev->dev, "Different MSI driver already installed!\n");
 		err = -ENODEV;
diff --git a/arch/powerpc/sysdev/fsl_msi.h b/arch/powerpc/sysdev/fsl_msi.h
index df9aa9f..a2cc5a2 100644
--- a/arch/powerpc/sysdev/fsl_msi.h
+++ b/arch/powerpc/sysdev/fsl_msi.h
@@ -31,14 +31,21 @@ struct fsl_msi {
 	struct irq_domain *irqhost;
 
 	unsigned long cascade_irq;
-
-	u32 msiir_offset; /* Offset of MSIIR, relative to start of CCSR */
+	phys_addr_t msiir; /* MSIIR Address in CCSR */
 	u32 ibs_shift; /* Shift of interrupt bit select */
 	u32 srs_shift; /* Shift of the shared interrupt register select */
 	void __iomem *msi_regs;
 	u32 feature;
 	int msi_virqs[NR_MSI_REG_MAX];
 
+	/*
+	 * During probe each bank is assigned an index number,
+	 * starting from 0. For example:
+	 * MSI bank 1 = 0,
+	 * MSI bank 2 = 1, and so on.
+	 */
+	int bank_index;
+
 	struct msi_bitmap bitmap;
 
 	struct list_head list;          /* support multiple MSI banks */
-- 
1.7.0.4




* [PATCH 5/9 v2] pci/msi: interface to set an iova for a msi region
  2013-11-19  5:17 [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU) Bharat Bhushan
                   ` (3 preceding siblings ...)
  2013-11-19  5:17 ` [PATCH 4/9 v2] powerpc: msi: Extend the msi region interface to get info from fsl_msi Bharat Bhushan
@ 2013-11-19  5:17 ` Bharat Bhushan
  2013-11-19  5:17 ` [PATCH 6/9 v2] powerpc: pci: Extend msi iova page setup to arch specific Bharat Bhushan
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 34+ messages in thread
From: Bharat Bhushan @ 2013-11-19  5:17 UTC (permalink / raw)
  To: alex.williamson, joro, bhelgaas, agraf, scottwood, stuart.yoder,
	iommu, linux-pci, linuxppc-dev, linux-kernel
  Cc: Bharat Bhushan, Bharat Bhushan

This patch defines an interface by which an MSI page
can be mapped to a specific IOVA page.

This is a requirement for aperture-type IOMMUs (like Freescale PAMU),
where we map the MSI IOVA page just after the guest memory IOVA addresses.

Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
---
v2
 - new patch

 drivers/pci/msi.c   |   13 +++++++++++++
 include/linux/pci.h |    8 ++++++++
 2 files changed, 21 insertions(+), 0 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 2643a29..040609f 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -77,6 +77,19 @@ int __weak arch_msi_get_region(int region_num, struct msi_region *region)
 	return 0;
 }
 
+int __weak arch_msi_set_iova(struct pci_dev *pdev, int region_num,
+			     dma_addr_t iova, bool set)
+{
+	return 0;
+}
+
+int msi_set_iova(struct pci_dev *pdev, int region_num,
+		 dma_addr_t iova, bool set)
+{
+	return arch_msi_set_iova(pdev, region_num, iova, set);
+}
+EXPORT_SYMBOL(msi_set_iova);
+
 int msi_get_region_count(void)
 {
 	return arch_msi_get_region_count();
diff --git a/include/linux/pci.h b/include/linux/pci.h
index c587034..c6d3e58 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1195,6 +1195,12 @@ static inline int msi_get_region(int region_num, struct msi_region *region)
 {
 	return 0;
 }
+
+static inline int msi_set_iova(struct pci_dev *pdev, int region_num,
+			       dma_addr_t iova, bool set)
+{
+	return 0;
+}
 #else
 int pci_enable_msi_block(struct pci_dev *dev, unsigned int nvec);
 int pci_enable_msi_block_auto(struct pci_dev *dev, unsigned int *maxvec);
@@ -1209,6 +1215,8 @@ void pci_restore_msi_state(struct pci_dev *dev);
 int pci_msi_enabled(void);
 int msi_get_region_count(void);
 int msi_get_region(int region_num, struct msi_region *region);
+int msi_set_iova(struct pci_dev *pdev, int region_num,
+		 dma_addr_t iova, bool set);
 #endif
 
 #ifdef CONFIG_PCIEPORTBUS
-- 
1.7.0.4




* [PATCH 6/9 v2] powerpc: pci: Extend msi iova page setup to arch specific
  2013-11-19  5:17 [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU) Bharat Bhushan
                   ` (4 preceding siblings ...)
  2013-11-19  5:17 ` [PATCH 5/9 v2] pci/msi: interface to set an iova for a msi region Bharat Bhushan
@ 2013-11-19  5:17 ` Bharat Bhushan
  2013-11-19  5:17 ` [PATCH 7/9 v2] pci: msi: Extend msi iova setting interface to powerpc arch Bharat Bhushan
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 34+ messages in thread
From: Bharat Bhushan @ 2013-11-19  5:17 UTC (permalink / raw)
  To: alex.williamson, joro, bhelgaas, agraf, scottwood, stuart.yoder,
	iommu, linux-pci, linuxppc-dev, linux-kernel
  Cc: Bharat Bhushan, Bharat Bhushan

This patch extends the interface to arch-specific code for setting the
MSI IOVA address for an MSI page. The machine-specific code is not yet
implemented.

Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
---
v2
 - new patch

 arch/powerpc/include/asm/machdep.h |    2 ++
 arch/powerpc/kernel/msi.c          |   10 ++++++++++
 2 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
index 8d1b787..e87b806 100644
--- a/arch/powerpc/include/asm/machdep.h
+++ b/arch/powerpc/include/asm/machdep.h
@@ -132,6 +132,8 @@ struct machdep_calls {
 	/* Returns the requested region's address and size */
 	int		(*msi_get_region)(int region_num,
 					  struct msi_region *region);
+	int		(*msi_set_iova)(struct pci_dev *pdev, int region_num,
+					dma_addr_t iova, bool set);
 #endif
 
 	void		(*restart)(char *cmd);
diff --git a/arch/powerpc/kernel/msi.c b/arch/powerpc/kernel/msi.c
index 1a67787..e2bd555 100644
--- a/arch/powerpc/kernel/msi.c
+++ b/arch/powerpc/kernel/msi.c
@@ -13,6 +13,16 @@
 
 #include <asm/machdep.h>
 
+int arch_msi_set_iova(struct pci_dev *pdev, int region_num,
+		      dma_addr_t iova, bool set)
+{
+	if (ppc_md.msi_set_iova) {
+		pr_debug("msi: Using platform msi_set_iova routine.\n");
+		return ppc_md.msi_set_iova(pdev, region_num, iova, set);
+	}
+	return 0;
+}
+
 int arch_msi_get_region_count(void)
 {
 	if (ppc_md.msi_get_region_count) {
-- 
1.7.0.4




* [PATCH 7/9 v2] pci: msi: Extend msi iova setting interface to powerpc arch
  2013-11-19  5:17 [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU) Bharat Bhushan
                   ` (5 preceding siblings ...)
  2013-11-19  5:17 ` [PATCH 6/9 v2] powerpc: pci: Extend msi iova page setup to arch specific Bharat Bhushan
@ 2013-11-19  5:17 ` Bharat Bhushan
  2013-11-19  5:17 ` [PATCH 8/9 v2] vfio: moving some functions in common file Bharat Bhushan
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 34+ messages in thread
From: Bharat Bhushan @ 2013-11-19  5:17 UTC (permalink / raw)
  To: alex.williamson, joro, bhelgaas, agraf, scottwood, stuart.yoder,
	iommu, linux-pci, linuxppc-dev, linux-kernel
  Cc: Bharat Bhushan, Bharat Bhushan

We now keep track of the devices which have an MSI page mapping to a
specific IOVA page, for every MSI bank. When the MSI address and data are
composed, this list is traversed. If the device is found in the list then
the configured IOVA page is used, otherwise the IOVA page is taken as
before.

Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
---
v2
 - new patch

 arch/powerpc/sysdev/fsl_msi.c |   90 +++++++++++++++++++++++++++++++++++++++++
 arch/powerpc/sysdev/fsl_msi.h |   16 ++++++-
 2 files changed, 104 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/sysdev/fsl_msi.c b/arch/powerpc/sysdev/fsl_msi.c
index eeebbf0..52d2beb 100644
--- a/arch/powerpc/sysdev/fsl_msi.c
+++ b/arch/powerpc/sysdev/fsl_msi.c
@@ -137,6 +137,75 @@ static int fsl_msi_get_region(int region_num, struct msi_region *region)
 	return -ENODEV;
 }
 
+/* Add a device to the list of devices having an iova page mapping */
+static int fsl_msi_add_iova_device(struct fsl_msi *msi_data,
+				   struct pci_dev *pdev, dma_addr_t iova)
+{
+	struct fsl_msi_device *device;
+
+	mutex_lock(&msi_data->lock);
+	list_for_each_entry(device, &msi_data->device_list, list) {
+		/* If a mapping already exists then update it with the new page mapping */
+		if (device->dev == pdev) {
+			device->iova = iova;
+			mutex_unlock(&msi_data->lock);
+			return 0;
+		}
+	}
+
+	device = kzalloc(sizeof(struct fsl_msi_device), GFP_KERNEL);
+	if (!device) {
+		pr_err("%s: Memory allocation failed\n", __func__);
+		mutex_unlock(&msi_data->lock);
+		return -ENOMEM;
+	}
+
+	device->dev = pdev;
+	device->iova = iova;
+	list_add_tail(&device->list, &msi_data->device_list);
+	mutex_unlock(&msi_data->lock);
+	return 0;
+}
+
+/* Remove a device from the list of devices having an iova page mapping */
+static int fsl_msi_del_iova_device(struct fsl_msi *msi_data,
+				   struct pci_dev *pdev)
+{
+	struct fsl_msi_device *device;
+
+	mutex_lock(&msi_data->lock);
+	list_for_each_entry(device, &msi_data->device_list, list) {
+		if (device->dev == pdev) {
+			list_del(&device->list);
+			kfree(device);
+			break;
+		}
+	}
+	mutex_unlock(&msi_data->lock);
+	return 0;
+}
+
+/* set/clear device iova mapping for the requested msi region */
+static int fsl_msi_set_iova(struct pci_dev *pdev, int region_num,
+			    dma_addr_t iova, bool set)
+{
+	struct fsl_msi *msi_data;
+	int ret = -EINVAL;
+
+	list_for_each_entry(msi_data, &msi_head, list) {
+		if (msi_data->bank_index != region_num)
+			continue;
+
+		if (set)
+			ret = fsl_msi_add_iova_device(msi_data, pdev, iova);
+		else
+			ret = fsl_msi_del_iova_device(msi_data, pdev);
+
+		break;
+	}
+	return ret;
+}
+
 static int fsl_msi_check_device(struct pci_dev *pdev, int nvec, int type)
 {
 	if (type == PCI_CAP_ID_MSIX)
@@ -167,6 +236,7 @@ static void fsl_compose_msi_msg(struct pci_dev *pdev, int hwirq,
 				struct msi_msg *msg,
 				struct fsl_msi *fsl_msi_data)
 {
+	struct fsl_msi_device *device;
 	struct fsl_msi *msi_data = fsl_msi_data;
 	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
 	u64 address; /* Physical address of the MSIIR */
@@ -181,6 +251,15 @@ static void fsl_compose_msi_msg(struct pci_dev *pdev, int hwirq,
 		address = fsl_pci_immrbar_base(hose) +
 			   (msi_data->msiir & 0xfffff);
 
+	mutex_lock(&msi_data->lock);
+	list_for_each_entry(device, &msi_data->device_list, list) {
+		if (device->dev == pdev) {
+			address = device->iova | (msi_data->msiir & 0xfff);
+			break;
+		}
+	}
+	mutex_unlock(&msi_data->lock);
+
 	msg->address_lo = lower_32_bits(address);
 	msg->address_hi = upper_32_bits(address);
 
@@ -356,6 +435,7 @@ static int fsl_of_msi_remove(struct platform_device *ofdev)
 	struct fsl_msi *msi = platform_get_drvdata(ofdev);
 	int virq, i;
 	struct fsl_msi_cascade_data *cascade_data;
+	struct fsl_msi_device *device;
 
 	if (msi->list.prev != NULL)
 		list_del(&msi->list);
@@ -371,6 +451,13 @@ static int fsl_of_msi_remove(struct platform_device *ofdev)
 		msi_bitmap_free(&msi->bitmap);
 	if ((msi->feature & FSL_PIC_IP_MASK) != FSL_PIC_IP_VMPIC)
 		iounmap(msi->msi_regs);
+
+	mutex_lock(&msi->lock);
+	list_for_each_entry(device, &msi->device_list, list) {
+		list_del(&device->list);
+		kfree(device);
+	}
+	mutex_unlock(&msi->lock);
 	kfree(msi);
 
 	return 0;
@@ -436,6 +523,8 @@ static int fsl_of_msi_probe(struct platform_device *dev)
 		dev_err(&dev->dev, "No memory for MSI structure\n");
 		return -ENOMEM;
 	}
+	INIT_LIST_HEAD(&msi->device_list);
+	mutex_init(&msi->lock);
 	platform_set_drvdata(dev, msi);
 
 	msi->irqhost = irq_domain_add_linear(dev->dev.of_node,
@@ -558,6 +647,7 @@ static int fsl_of_msi_probe(struct platform_device *dev)
 		ppc_md.msi_check_device = fsl_msi_check_device;
 		ppc_md.msi_get_region_count = fsl_msi_get_region_count;
 		ppc_md.msi_get_region = fsl_msi_get_region;
+		ppc_md.msi_set_iova = fsl_msi_set_iova;
 	} else if (ppc_md.setup_msi_irqs != fsl_setup_msi_irqs) {
 		dev_err(&dev->dev, "Different MSI driver already installed!\n");
 		err = -ENODEV;
diff --git a/arch/powerpc/sysdev/fsl_msi.h b/arch/powerpc/sysdev/fsl_msi.h
index a2cc5a2..4da2af9 100644
--- a/arch/powerpc/sysdev/fsl_msi.h
+++ b/arch/powerpc/sysdev/fsl_msi.h
@@ -27,9 +27,16 @@
 #define FSL_PIC_IP_IPIC   0x00000002
 #define FSL_PIC_IP_VMPIC  0x00000003
 
+/* List of devices having specific iova page mapping */
+struct fsl_msi_device {
+	struct list_head list;
+	struct pci_dev *dev;
+	dma_addr_t iova;
+};
+
 struct fsl_msi {
 	struct irq_domain *irqhost;
-
+	struct mutex lock;
 	unsigned long cascade_irq;
 	phys_addr_t msiir; /* MSIIR Address in CCSR */
 	u32 ibs_shift; /* Shift of interrupt bit select */
@@ -37,7 +44,12 @@ struct fsl_msi {
 	void __iomem *msi_regs;
 	u32 feature;
 	int msi_virqs[NR_MSI_REG_MAX];
-
+	/*
+	 * Keep track of devices whose msi page is mapped to a specific
+	 * iova page. By default this list is empty, which means the
+	 * legacy way of setting the iova will be used.
+	 */
+	struct list_head device_list;
 	/*
 	 * During probe each bank is assigned an index number,
 	 * starting from 0. For example:
-- 
1.7.0.4




* [PATCH 8/9 v2] vfio: moving some functions in common file
  2013-11-19  5:17 [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU) Bharat Bhushan
                   ` (6 preceding siblings ...)
  2013-11-19  5:17 ` [PATCH 7/9 v2] pci: msi: Extend msi iova setting interface to powerpc arch Bharat Bhushan
@ 2013-11-19  5:17 ` Bharat Bhushan
  2013-11-19  5:17 ` [PATCH 9/9 v2] vfio pci: Add vfio iommu implementation for FSL_PAMU Bharat Bhushan
  2013-11-20 18:47 ` [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU) Alex Williamson
  9 siblings, 0 replies; 34+ messages in thread
From: Bharat Bhushan @ 2013-11-19  5:17 UTC (permalink / raw)
  To: alex.williamson, joro, bhelgaas, agraf, scottwood, stuart.yoder,
	iommu, linux-pci, linuxppc-dev, linux-kernel
  Cc: Bharat Bhushan, Bharat Bhushan

Some functions defined in vfio_iommu_type1.c are generic (not specific
to the type1 IOMMU) and we want to use them for the FSL IOMMU (PAMU) and,
going forward, in an iommu-none driver.
So I have created a new file named vfio_iommu_common.c and moved some
of the generic functions into this file.

I agree (with Alex Williamson and myself :-)) that some more functions
can be moved to this new common file (with some changes in type1/fsl_pamu
and others). But in this patch I avoided those changes and just moved
functions which are straightforward, which allows me to get the
fsl-powerpc vfio framework in place.

Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
---
v1->v2
 - removed unnecessary header file inclusions
 - marked as static the functions which are internal to *common.c

 drivers/vfio/Makefile            |    4 +-
 drivers/vfio/vfio_iommu_common.c |  227 ++++++++++++++++++++++++++++++++++++++
 drivers/vfio/vfio_iommu_common.h |   27 +++++
 drivers/vfio/vfio_iommu_type1.c  |  206 +----------------------------------
 4 files changed, 257 insertions(+), 207 deletions(-)
 create mode 100644 drivers/vfio/vfio_iommu_common.c
 create mode 100644 drivers/vfio/vfio_iommu_common.h

diff --git a/drivers/vfio/Makefile b/drivers/vfio/Makefile
index 72bfabc..c5792ec 100644
--- a/drivers/vfio/Makefile
+++ b/drivers/vfio/Makefile
@@ -1,4 +1,4 @@
 obj-$(CONFIG_VFIO) += vfio.o
-obj-$(CONFIG_VFIO_IOMMU_TYPE1) += vfio_iommu_type1.o
-obj-$(CONFIG_VFIO_IOMMU_SPAPR_TCE) += vfio_iommu_spapr_tce.o
+obj-$(CONFIG_VFIO_IOMMU_TYPE1) += vfio_iommu_common.o vfio_iommu_type1.o
+obj-$(CONFIG_VFIO_IOMMU_SPAPR_TCE) += vfio_iommu_common.o vfio_iommu_spapr_tce.o
 obj-$(CONFIG_VFIO_PCI) += pci/
diff --git a/drivers/vfio/vfio_iommu_common.c b/drivers/vfio/vfio_iommu_common.c
new file mode 100644
index 0000000..08eea71
--- /dev/null
+++ b/drivers/vfio/vfio_iommu_common.c
@@ -0,0 +1,227 @@
+/*
+ * VFIO: Common code for vfio IOMMU support
+ *
+ * Copyright (C) 2012 Red Hat, Inc.  All rights reserved.
+ *     Author: Alex Williamson <alex.williamson@redhat.com>
+ *     Author: Bharat Bhushan <bharat.bhushan@freescale.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * Derived from original vfio:
+ * Copyright 2010 Cisco Systems, Inc.  All rights reserved.
+ * Author: Tom Lyon, pugs@cisco.com
+ */
+
+#include <linux/compat.h>
+#include <linux/iommu.h>
+#include <linux/module.h>
+#include <linux/mm.h>
+#include <linux/slab.h>
+#include <linux/workqueue.h>
+
+static bool disable_hugepages;
+module_param_named(disable_hugepages,
+		   disable_hugepages, bool, S_IRUGO | S_IWUSR);
+MODULE_PARM_DESC(disable_hugepages,
+		 "Disable VFIO IOMMU support for IOMMU hugepages.");
+
+struct vwork {
+	struct mm_struct	*mm;
+	long			npage;
+	struct work_struct	work;
+};
+
+/* delayed decrement/increment for locked_vm */
+static void vfio_lock_acct_bg(struct work_struct *work)
+{
+	struct vwork *vwork = container_of(work, struct vwork, work);
+	struct mm_struct *mm;
+
+	mm = vwork->mm;
+	down_write(&mm->mmap_sem);
+	mm->locked_vm += vwork->npage;
+	up_write(&mm->mmap_sem);
+	mmput(mm);
+	kfree(vwork);
+}
+
+void vfio_lock_acct(long npage)
+{
+	struct vwork *vwork;
+	struct mm_struct *mm;
+
+	if (!current->mm || !npage)
+		return; /* process exited or nothing to do */
+
+	if (down_write_trylock(&current->mm->mmap_sem)) {
+		current->mm->locked_vm += npage;
+		up_write(&current->mm->mmap_sem);
+		return;
+	}
+
+	/*
+	 * Couldn't get mmap_sem lock, so must setup to update
+	 * mm->locked_vm later. If locked_vm were atomic, we
+	 * wouldn't need this silliness
+	 */
+	vwork = kmalloc(sizeof(struct vwork), GFP_KERNEL);
+	if (!vwork)
+		return;
+	mm = get_task_mm(current);
+	if (!mm) {
+		kfree(vwork);
+		return;
+	}
+	INIT_WORK(&vwork->work, vfio_lock_acct_bg);
+	vwork->mm = mm;
+	vwork->npage = npage;
+	schedule_work(&vwork->work);
+}
+
+/*
+ * Some mappings aren't backed by a struct page, for example an mmap'd
+ * MMIO range for our own or another device.  These use a different
+ * pfn conversion and shouldn't be tracked as locked pages.
+ */
+static bool is_invalid_reserved_pfn(unsigned long pfn)
+{
+	if (pfn_valid(pfn)) {
+		bool reserved;
+		struct page *tail = pfn_to_page(pfn);
+		struct page *head = compound_trans_head(tail);
+		reserved = !!(PageReserved(head));
+		if (head != tail) {
+			/*
+			 * "head" is not a dangling pointer
+			 * (compound_trans_head takes care of that)
+			 * but the hugepage may have been split
+			 * from under us (and we may not hold a
+			 * reference count on the head page so it can
+			 * be reused before we run PageReferenced), so
+			 * we've to check PageTail before returning
+			 * what we just read.
+			 */
+			smp_rmb();
+			if (PageTail(tail))
+				return reserved;
+		}
+		return PageReserved(tail);
+	}
+
+	return true;
+}
+
+static int put_pfn(unsigned long pfn, int prot)
+{
+	if (!is_invalid_reserved_pfn(pfn)) {
+		struct page *page = pfn_to_page(pfn);
+		if (prot & IOMMU_WRITE)
+			SetPageDirty(page);
+		put_page(page);
+		return 1;
+	}
+	return 0;
+}
+
+static int vaddr_get_pfn(unsigned long vaddr, int prot, unsigned long *pfn)
+{
+	struct page *page[1];
+	struct vm_area_struct *vma;
+	int ret = -EFAULT;
+
+	if (get_user_pages_fast(vaddr, 1, !!(prot & IOMMU_WRITE), page) == 1) {
+		*pfn = page_to_pfn(page[0]);
+		return 0;
+	}
+
+	down_read(&current->mm->mmap_sem);
+
+	vma = find_vma_intersection(current->mm, vaddr, vaddr + 1);
+
+	if (vma && vma->vm_flags & VM_PFNMAP) {
+		*pfn = ((vaddr - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
+		if (is_invalid_reserved_pfn(*pfn))
+			ret = 0;
+	}
+
+	up_read(&current->mm->mmap_sem);
+
+	return ret;
+}
+
+/*
+ * Attempt to pin pages.  We really don't want to track all the pfns and
+ * the iommu can only map chunks of consecutive pfns anyway, so get the
+ * first page and all consecutive pages with the same locking.
+ */
+long vfio_pin_pages(unsigned long vaddr, long npage,
+			   int prot, unsigned long *pfn_base)
+{
+	unsigned long limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
+	bool lock_cap = capable(CAP_IPC_LOCK);
+	long ret, i;
+
+	if (!current->mm)
+		return -ENODEV;
+
+	ret = vaddr_get_pfn(vaddr, prot, pfn_base);
+	if (ret)
+		return ret;
+
+	if (is_invalid_reserved_pfn(*pfn_base))
+		return 1;
+
+	if (!lock_cap && current->mm->locked_vm + 1 > limit) {
+		put_pfn(*pfn_base, prot);
+		pr_warn("%s: RLIMIT_MEMLOCK (%ld) exceeded\n", __func__,
+			limit << PAGE_SHIFT);
+		return -ENOMEM;
+	}
+
+	if (unlikely(disable_hugepages)) {
+		vfio_lock_acct(1);
+		return 1;
+	}
+
+	/* Lock all the consecutive pages from pfn_base */
+	for (i = 1, vaddr += PAGE_SIZE; i < npage; i++, vaddr += PAGE_SIZE) {
+		unsigned long pfn = 0;
+
+		ret = vaddr_get_pfn(vaddr, prot, &pfn);
+		if (ret)
+			break;
+
+		if (pfn != *pfn_base + i || is_invalid_reserved_pfn(pfn)) {
+			put_pfn(pfn, prot);
+			break;
+		}
+
+		if (!lock_cap && current->mm->locked_vm + i + 1 > limit) {
+			put_pfn(pfn, prot);
+			pr_warn("%s: RLIMIT_MEMLOCK (%ld) exceeded\n",
+				__func__, limit << PAGE_SHIFT);
+			break;
+		}
+	}
+
+	vfio_lock_acct(i);
+
+	return i;
+}
+
+long vfio_unpin_pages(unsigned long pfn, long npage,
+			     int prot, bool do_accounting)
+{
+	unsigned long unlocked = 0;
+	long i;
+
+	for (i = 0; i < npage; i++)
+		unlocked += put_pfn(pfn++, prot);
+
+	if (do_accounting)
+		vfio_lock_acct(-unlocked);
+
+	return unlocked;
+}
diff --git a/drivers/vfio/vfio_iommu_common.h b/drivers/vfio/vfio_iommu_common.h
new file mode 100644
index 0000000..2566ce6
--- /dev/null
+++ b/drivers/vfio/vfio_iommu_common.h
@@ -0,0 +1,27 @@
+/*
+ * Copyright (C) 2012 Red Hat, Inc.  All rights reserved.
+ * Copyright (C) 2013 Freescale Semiconductor, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published
+ * by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307 USA
+ */
+
+#ifndef _VFIO_IOMMU_COMMON_H
+#define _VFIO_IOMMU_COMMON_H
+
+void vfio_lock_acct(long npage);
+long vfio_pin_pages(unsigned long vaddr, long npage, int prot,
+		    unsigned long *pfn_base);
+long vfio_unpin_pages(unsigned long pfn, long npage,
+		      int prot, bool do_accounting);
+#endif
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index a9807de..e9a58fa 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -37,6 +37,7 @@
 #include <linux/uaccess.h>
 #include <linux/vfio.h>
 #include <linux/workqueue.h>
+#include "vfio_iommu_common.h"
 
 #define DRIVER_VERSION  "0.2"
 #define DRIVER_AUTHOR   "Alex Williamson <alex.williamson@redhat.com>"
@@ -48,12 +49,6 @@ module_param_named(allow_unsafe_interrupts,
 MODULE_PARM_DESC(allow_unsafe_interrupts,
 		 "Enable VFIO IOMMU support for on platforms without interrupt remapping support.");
 
-static bool disable_hugepages;
-module_param_named(disable_hugepages,
-		   disable_hugepages, bool, S_IRUGO | S_IWUSR);
-MODULE_PARM_DESC(disable_hugepages,
-		 "Disable VFIO IOMMU support for IOMMU hugepages.");
-
 struct vfio_iommu {
 	struct iommu_domain	*domain;
 	struct mutex		lock;
@@ -123,205 +118,6 @@ static void vfio_remove_dma(struct vfio_iommu *iommu, struct vfio_dma *old)
 	rb_erase(&old->node, &iommu->dma_list);
 }
 
-struct vwork {
-	struct mm_struct	*mm;
-	long			npage;
-	struct work_struct	work;
-};
-
-/* delayed decrement/increment for locked_vm */
-static void vfio_lock_acct_bg(struct work_struct *work)
-{
-	struct vwork *vwork = container_of(work, struct vwork, work);
-	struct mm_struct *mm;
-
-	mm = vwork->mm;
-	down_write(&mm->mmap_sem);
-	mm->locked_vm += vwork->npage;
-	up_write(&mm->mmap_sem);
-	mmput(mm);
-	kfree(vwork);
-}
-
-static void vfio_lock_acct(long npage)
-{
-	struct vwork *vwork;
-	struct mm_struct *mm;
-
-	if (!current->mm || !npage)
-		return; /* process exited or nothing to do */
-
-	if (down_write_trylock(&current->mm->mmap_sem)) {
-		current->mm->locked_vm += npage;
-		up_write(&current->mm->mmap_sem);
-		return;
-	}
-
-	/*
-	 * Couldn't get mmap_sem lock, so must setup to update
-	 * mm->locked_vm later. If locked_vm were atomic, we
-	 * wouldn't need this silliness
-	 */
-	vwork = kmalloc(sizeof(struct vwork), GFP_KERNEL);
-	if (!vwork)
-		return;
-	mm = get_task_mm(current);
-	if (!mm) {
-		kfree(vwork);
-		return;
-	}
-	INIT_WORK(&vwork->work, vfio_lock_acct_bg);
-	vwork->mm = mm;
-	vwork->npage = npage;
-	schedule_work(&vwork->work);
-}
-
-/*
- * Some mappings aren't backed by a struct page, for example an mmap'd
- * MMIO range for our own or another device.  These use a different
- * pfn conversion and shouldn't be tracked as locked pages.
- */
-static bool is_invalid_reserved_pfn(unsigned long pfn)
-{
-	if (pfn_valid(pfn)) {
-		bool reserved;
-		struct page *tail = pfn_to_page(pfn);
-		struct page *head = compound_trans_head(tail);
-		reserved = !!(PageReserved(head));
-		if (head != tail) {
-			/*
-			 * "head" is not a dangling pointer
-			 * (compound_trans_head takes care of that)
-			 * but the hugepage may have been split
-			 * from under us (and we may not hold a
-			 * reference count on the head page so it can
-			 * be reused before we run PageReferenced), so
-			 * we've to check PageTail before returning
-			 * what we just read.
-			 */
-			smp_rmb();
-			if (PageTail(tail))
-				return reserved;
-		}
-		return PageReserved(tail);
-	}
-
-	return true;
-}
-
-static int put_pfn(unsigned long pfn, int prot)
-{
-	if (!is_invalid_reserved_pfn(pfn)) {
-		struct page *page = pfn_to_page(pfn);
-		if (prot & IOMMU_WRITE)
-			SetPageDirty(page);
-		put_page(page);
-		return 1;
-	}
-	return 0;
-}
-
-static int vaddr_get_pfn(unsigned long vaddr, int prot, unsigned long *pfn)
-{
-	struct page *page[1];
-	struct vm_area_struct *vma;
-	int ret = -EFAULT;
-
-	if (get_user_pages_fast(vaddr, 1, !!(prot & IOMMU_WRITE), page) == 1) {
-		*pfn = page_to_pfn(page[0]);
-		return 0;
-	}
-
-	down_read(&current->mm->mmap_sem);
-
-	vma = find_vma_intersection(current->mm, vaddr, vaddr + 1);
-
-	if (vma && vma->vm_flags & VM_PFNMAP) {
-		*pfn = ((vaddr - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
-		if (is_invalid_reserved_pfn(*pfn))
-			ret = 0;
-	}
-
-	up_read(&current->mm->mmap_sem);
-
-	return ret;
-}
-
-/*
- * Attempt to pin pages.  We really don't want to track all the pfns and
- * the iommu can only map chunks of consecutive pfns anyway, so get the
- * first page and all consecutive pages with the same locking.
- */
-static long vfio_pin_pages(unsigned long vaddr, long npage,
-			   int prot, unsigned long *pfn_base)
-{
-	unsigned long limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
-	bool lock_cap = capable(CAP_IPC_LOCK);
-	long ret, i;
-
-	if (!current->mm)
-		return -ENODEV;
-
-	ret = vaddr_get_pfn(vaddr, prot, pfn_base);
-	if (ret)
-		return ret;
-
-	if (is_invalid_reserved_pfn(*pfn_base))
-		return 1;
-
-	if (!lock_cap && current->mm->locked_vm + 1 > limit) {
-		put_pfn(*pfn_base, prot);
-		pr_warn("%s: RLIMIT_MEMLOCK (%ld) exceeded\n", __func__,
-			limit << PAGE_SHIFT);
-		return -ENOMEM;
-	}
-
-	if (unlikely(disable_hugepages)) {
-		vfio_lock_acct(1);
-		return 1;
-	}
-
-	/* Lock all the consecutive pages from pfn_base */
-	for (i = 1, vaddr += PAGE_SIZE; i < npage; i++, vaddr += PAGE_SIZE) {
-		unsigned long pfn = 0;
-
-		ret = vaddr_get_pfn(vaddr, prot, &pfn);
-		if (ret)
-			break;
-
-		if (pfn != *pfn_base + i || is_invalid_reserved_pfn(pfn)) {
-			put_pfn(pfn, prot);
-			break;
-		}
-
-		if (!lock_cap && current->mm->locked_vm + i + 1 > limit) {
-			put_pfn(pfn, prot);
-			pr_warn("%s: RLIMIT_MEMLOCK (%ld) exceeded\n",
-				__func__, limit << PAGE_SHIFT);
-			break;
-		}
-	}
-
-	vfio_lock_acct(i);
-
-	return i;
-}
-
-static long vfio_unpin_pages(unsigned long pfn, long npage,
-			     int prot, bool do_accounting)
-{
-	unsigned long unlocked = 0;
-	long i;
-
-	for (i = 0; i < npage; i++)
-		unlocked += put_pfn(pfn++, prot);
-
-	if (do_accounting)
-		vfio_lock_acct(-unlocked);
-
-	return unlocked;
-}
-
 static int vfio_unmap_unpin(struct vfio_iommu *iommu, struct vfio_dma *dma,
 			    dma_addr_t iova, size_t *size)
 {
-- 
1.7.0.4




* [PATCH 9/9 v2] vfio pci: Add vfio iommu implementation for FSL_PAMU
From: Bharat Bhushan @ 2013-11-19  5:17 UTC (permalink / raw)
  To: alex.williamson, joro, bhelgaas, agraf, scottwood, stuart.yoder,
	iommu, linux-pci, linuxppc-dev, linux-kernel
  Cc: Bharat Bhushan

This patch adds vfio iommu support for the Freescale IOMMU (PAMU -
Peripheral Access Management Unit).

The Freescale PAMU is an aperture-based IOMMU with the following
characteristics.  Each device has an entry in a table in memory
describing the iova->phys mapping. The mapping has:
   -an overall aperture that is power-of-2 sized and has a start iova that
    is naturally aligned
   -1 or more windows within the aperture
   -the number of windows must be a power of 2, max is 256
   -the size of each window is determined by aperture size / # of windows
   -the iova of each window is determined by aperture start iova / # of windows
   -the mapped region in each window can be different from
    the window size...the mapping must be a power of 2
   -the physical address of the mapping must be naturally aligned
    with the mapping size

Some of the code is derived from the TYPE1 iommu (drivers/vfio/vfio_iommu_type1.c).

Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
---
v1->v2
 - Use lock around msi-dma list
 - check for overlap between dma and msi-dma pages
 - Some code cleanup as per various comments

 drivers/vfio/Kconfig               |    6 +
 drivers/vfio/Makefile              |    1 +
 drivers/vfio/vfio_iommu_fsl_pamu.c | 1003 ++++++++++++++++++++++++++++++++++++
 include/uapi/linux/vfio.h          |  100 ++++
 4 files changed, 1110 insertions(+), 0 deletions(-)
 create mode 100644 drivers/vfio/vfio_iommu_fsl_pamu.c

diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig
index 26b3d9d..7d1da26 100644
--- a/drivers/vfio/Kconfig
+++ b/drivers/vfio/Kconfig
@@ -8,11 +8,17 @@ config VFIO_IOMMU_SPAPR_TCE
 	depends on VFIO && SPAPR_TCE_IOMMU
 	default n
 
+config VFIO_IOMMU_FSL_PAMU
+	tristate
+	depends on VFIO
+	default n
+
 menuconfig VFIO
 	tristate "VFIO Non-Privileged userspace driver framework"
 	depends on IOMMU_API
 	select VFIO_IOMMU_TYPE1 if X86
 	select VFIO_IOMMU_SPAPR_TCE if (PPC_POWERNV || PPC_PSERIES)
+	select VFIO_IOMMU_FSL_PAMU if FSL_PAMU
 	help
 	  VFIO provides a framework for secure userspace device drivers.
 	  See Documentation/vfio.txt for more details.
diff --git a/drivers/vfio/Makefile b/drivers/vfio/Makefile
index c5792ec..7461350 100644
--- a/drivers/vfio/Makefile
+++ b/drivers/vfio/Makefile
@@ -1,4 +1,5 @@
 obj-$(CONFIG_VFIO) += vfio.o
 obj-$(CONFIG_VFIO_IOMMU_TYPE1) += vfio_iommu_common.o vfio_iommu_type1.o
 obj-$(CONFIG_VFIO_IOMMU_SPAPR_TCE) += vfio_iommu_common.o vfio_iommu_spapr_tce.o
+obj-$(CONFIG_VFIO_IOMMU_FSL_PAMU) += vfio_iommu_common.o vfio_iommu_fsl_pamu.o
 obj-$(CONFIG_VFIO_PCI) += pci/
diff --git a/drivers/vfio/vfio_iommu_fsl_pamu.c b/drivers/vfio/vfio_iommu_fsl_pamu.c
new file mode 100644
index 0000000..66efc84
--- /dev/null
+++ b/drivers/vfio/vfio_iommu_fsl_pamu.c
@@ -0,0 +1,1003 @@
+/*
+ * VFIO: IOMMU DMA mapping support for FSL PAMU IOMMU
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ *
+ * Copyright (C) 2013 Freescale Semiconductor, Inc.
+ *
+ *     Author: Bharat Bhushan <bharat.bhushan@freescale.com>
+ *
+ * This file is derived from drivers/vfio/vfio_iommu_type1.c
+ *
+ * The Freescale PAMU is an aperture-based IOMMU with the following
+ * characteristics.  Each device has an entry in a table in memory
+ * describing the iova->phys mapping. The mapping has:
+ *  -an overall aperture that is power of 2 sized, and has a start iova that
+ *   is naturally aligned
+ *  -has 1 or more windows within the aperture
+ *     -number of windows must be power of 2, max is 256
+ *     -size of each window is determined by aperture size / # of windows
+ *     -iova of each window is determined by aperture start iova / # of windows
+ *     -the mapped region in each window can be different from
+ *      the window size...the mapping must be a power of 2
+ *     -physical address of the mapping must be naturally aligned
+ *      with the mapping size
+ */
+
+#include <linux/compat.h>
+#include <linux/device.h>
+#include <linux/fs.h>
+#include <linux/iommu.h>
+#include <linux/module.h>
+#include <linux/mm.h>
+#include <linux/pci.h>		/* pci_bus_type */
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/vfio.h>
+#include <linux/hugetlb.h>
+#include <linux/msi.h>
+#include <asm/fsl_pamu_stash.h>
+
+#include "vfio_iommu_common.h"
+
+#define DRIVER_VERSION  "0.1"
+#define DRIVER_AUTHOR   "Bharat Bhushan <bharat.bhushan@freescale.com>"
+#define DRIVER_DESC     "FSL PAMU IOMMU driver for VFIO"
+
+struct vfio_iommu {
+	struct iommu_domain	*domain;
+	struct mutex		lock;
+	dma_addr_t		aperture_start;
+	dma_addr_t		aperture_end;
+	dma_addr_t		page_size;	/* Maximum mapped Page size */
+	int			nsubwindows;	/* Number of subwindows */
+	struct rb_root		dma_list;
+	struct list_head	msi_dma_list;
+	struct list_head	group_list;
+};
+
+struct vfio_dma {
+	struct rb_node		node;
+	dma_addr_t		iova;		/* Device address */
+	unsigned long		vaddr;		/* Process virtual addr */
+	size_t			size;		/* Map size (bytes) */
+	int			prot;		/* IOMMU_READ/WRITE */
+};
+
+struct vfio_msi_dma {
+	struct list_head	next;
+	dma_addr_t		iova;		/* Device address */
+	size_t			size;		/* MSI page size */
+	int			bank_id;
+	int			prot;		/* IOMMU_READ/WRITE */
+};
+
+struct vfio_group {
+	struct iommu_group	*iommu_group;
+	struct list_head	next;
+};
+
+static int iova_to_win(struct vfio_iommu *iommu, dma_addr_t iova)
+{
+	u64 offset = iova - iommu->aperture_start;
+	do_div(offset, iommu->page_size);
+	return (int) offset;
+}
+
+static int vfio_disable_iommu_domain(struct vfio_iommu *iommu)
+{
+	int enable = 0;
+	return iommu_domain_set_attr(iommu->domain,
+				     DOMAIN_ATTR_FSL_PAMU_ENABLE, &enable);
+}
+
+static int vfio_enable_iommu_domain(struct vfio_iommu *iommu)
+{
+	int enable = 1;
+	return iommu_domain_set_attr(iommu->domain,
+				     DOMAIN_ATTR_FSL_PAMU_ENABLE, &enable);
+}
+
+/* Called when unmapping a DMA region: disables the iommu domain */
+/* once no dma or msi-dma mappings remain. */
+static void vfio_check_and_disable_iommu(struct vfio_iommu *iommu)
+{
+	if (list_empty(&iommu->msi_dma_list) && !rb_first(&iommu->dma_list))
+		vfio_disable_iommu_domain(iommu);
+}
+
+static struct vfio_msi_dma *vfio_find_msi_dma(struct vfio_iommu *iommu,
+					      dma_addr_t start, size_t size)
+{
+	struct vfio_msi_dma *msi_dma;
+
+	/* Check MSI MAP entries */
+	list_for_each_entry(msi_dma, &iommu->msi_dma_list, next) {
+		if ((start + size) <= (msi_dma->iova))
+			continue;
+
+		if ((start >= (msi_dma->iova + msi_dma->size)))
+			continue;
+
+		return msi_dma;
+	}
+
+	return NULL;
+}
+
+static struct vfio_dma *vfio_find_dma(struct vfio_iommu *iommu,
+				      dma_addr_t start, size_t size)
+{
+	struct rb_node *node = iommu->dma_list.rb_node;
+
+	/* check DMA MAP entries */
+	while (node) {
+		struct vfio_dma *dma = rb_entry(node, struct vfio_dma, node);
+
+		if (start + size <= dma->iova)
+			node = node->rb_left;
+		else if (start >= dma->iova + dma->size)
+			node = node->rb_right;
+		else
+			return dma;
+	}
+
+	return NULL;
+}
+
+static void vfio_insert_dma(struct vfio_iommu *iommu, struct vfio_dma *new)
+{
+	struct rb_node **link = &iommu->dma_list.rb_node, *parent = NULL;
+	struct vfio_dma *dma;
+
+	while (*link) {
+		parent = *link;
+		dma = rb_entry(parent, struct vfio_dma, node);
+
+		if (new->iova + new->size <= dma->iova)
+			link = &(*link)->rb_left;
+		else
+			link = &(*link)->rb_right;
+	}
+
+	rb_link_node(&new->node, parent, link);
+	rb_insert_color(&new->node, &iommu->dma_list);
+}
+
+static void vfio_remove_dma(struct vfio_iommu *iommu, struct vfio_dma *old)
+{
+	rb_erase(&old->node, &iommu->dma_list);
+	vfio_check_and_disable_iommu(iommu);
+}
+
+static int vfio_unmap_unpin(struct vfio_iommu *iommu, struct vfio_dma *dma,
+			    dma_addr_t iova, size_t *size)
+{
+	dma_addr_t start = iova;
+	int win, win_start, win_end;
+	long unlocked = 0;
+	unsigned int nr_pages;
+
+	nr_pages = iommu->page_size / PAGE_SIZE;
+	win_start = iova_to_win(iommu, iova);
+	win_end = iova_to_win(iommu, iova + *size - 1);
+
+	/* Release the pinned pages */
+	for (win = win_start; win <= win_end; iova += iommu->page_size, win++) {
+		unsigned long pfn;
+
+		pfn = iommu_iova_to_phys(iommu->domain, iova) >> PAGE_SHIFT;
+		if (!pfn)
+			continue;
+
+		iommu_domain_window_disable(iommu->domain, win);
+
+		unlocked += vfio_unpin_pages(pfn, nr_pages, dma->prot, true);
+	}
+
+	vfio_lock_acct(-unlocked);
+	*size = iova - start;
+	return 0;
+}
+
+static int vfio_remove_dma_overlap(struct vfio_iommu *iommu, dma_addr_t start,
+				   size_t *size, struct vfio_dma *dma)
+{
+	size_t offset, overlap, tmp;
+	struct vfio_dma *split;
+	int ret;
+
+	if (!*size)
+		return 0;
+
+	/*
+	 * Existing dma region is completely covered, unmap all.  This is
+	 * the likely case since userspace tends to map and unmap buffers
+	 * in one shot rather than multiple mappings within a buffer.
+	 */
+	if (likely(start <= dma->iova &&
+		   start + *size >= dma->iova + dma->size)) {
+		*size = dma->size;
+		ret = vfio_unmap_unpin(iommu, dma, dma->iova, size);
+		if (ret)
+			return ret;
+
+		/*
+		 * Did we remove more than we have?  Should never happen
+		 * since a vfio_dma is contiguous in iova and vaddr.
+		 */
+		WARN_ON(*size != dma->size);
+
+		vfio_remove_dma(iommu, dma);
+		kfree(dma);
+		return 0;
+	}
+
+	/* Overlap low address of existing range */
+	if (start <= dma->iova) {
+		overlap = start + *size - dma->iova;
+		ret = vfio_unmap_unpin(iommu, dma, dma->iova, &overlap);
+		if (ret)
+			return ret;
+
+		vfio_remove_dma(iommu, dma);
+
+		/*
+		 * Check: we may have removed the whole vfio_dma.  If not,
+		 * fix up and re-insert.
+		 */
+		if (overlap < dma->size) {
+			dma->iova += overlap;
+			dma->vaddr += overlap;
+			dma->size -= overlap;
+			vfio_insert_dma(iommu, dma);
+		} else
+			kfree(dma);
+
+		*size = overlap;
+		return 0;
+	}
+
+	/* Overlap high address of existing range */
+	if (start + *size >= dma->iova + dma->size) {
+		offset = start - dma->iova;
+		overlap = dma->size - offset;
+
+		ret = vfio_unmap_unpin(iommu, dma, start, &overlap);
+		if (ret)
+			return ret;
+
+		dma->size -= overlap;
+		*size = overlap;
+		return 0;
+	}
+
+	/* Split existing */
+
+	/*
+	 * Allocate our tracking structure early even though it may not
+	 * be used.  An allocation failure later loses track of pages and
+	 * is more difficult to unwind.
+	 */
+	split = kzalloc(sizeof(*split), GFP_KERNEL);
+	if (!split)
+		return -ENOMEM;
+
+	offset = start - dma->iova;
+
+	ret = vfio_unmap_unpin(iommu, dma, start, size);
+	if (ret || !*size) {
+		kfree(split);
+		return ret;
+	}
+
+	tmp = dma->size;
+
+	/* Resize the lower vfio_dma in place, before the below insert */
+	dma->size = offset;
+
+	/* Insert new for remainder, assuming it didn't all get unmapped */
+	if (likely(offset + *size < tmp)) {
+		split->size = tmp - offset - *size;
+		split->iova = dma->iova + offset + *size;
+		split->vaddr = dma->vaddr + offset + *size;
+		split->prot = dma->prot;
+		vfio_insert_dma(iommu, split);
+	} else
+		kfree(split);
+
+	return 0;
+}
+
+/* Map DMA region */
+static int vfio_dma_map(struct vfio_iommu *iommu, dma_addr_t iova,
+			  unsigned long vaddr, long npage, int prot)
+{
+	int ret = 0, i;
+	size_t size;
+	unsigned int win, nr_subwindows;
+	dma_addr_t iovamap;
+
+	win = iova_to_win(iommu, iova);
+	if (iova != iommu->aperture_start + iommu->page_size * win) {
+		pr_err("%s iova(%llx) unaligned to window size %llx\n",
+			__func__, iova, iommu->page_size);
+		return -EINVAL;
+	}
+
+	/* total size to be mapped */
+	size = npage << PAGE_SHIFT;
+	nr_subwindows = size >> ilog2(iommu->page_size);
+	iovamap = iova;
+
+	for (i = 0; i < nr_subwindows; i++, win++) {
+		unsigned long pfn;
+		unsigned long nr_pages;
+		dma_addr_t mapsize;
+		struct vfio_dma *dma = NULL;
+
+		mapsize = min(iova + size - iovamap, iommu->page_size);
+		nr_pages = mapsize >> PAGE_SHIFT;
+
+		/* Pin a contiguous chunk of memory */
+		ret = vfio_pin_pages(vaddr, nr_pages, prot, &pfn);
+		if (ret != nr_pages) {
+			pr_err("%s unable to pin pages = %lx, pinned(%lx/%lx)\n",
+				__func__, vaddr, npage, nr_pages);
+			ret = -EINVAL;
+			break;
+		}
+
+		ret = iommu_domain_window_enable(iommu->domain, win,
+						 (phys_addr_t)pfn << PAGE_SHIFT,
+						 mapsize, prot);
+		if (ret) {
+			pr_err("%s unable to iommu_map()\n", __func__);
+			ret = -EINVAL;
+			break;
+		}
+
+		/*
+		 * Check if we abut a region below - nothing below 0.
+		 * This is the most likely case when mapping chunks of
+		 * physically contiguous regions within a virtual address
+		 * range.  Update the abutting entry in place since iova
+		 * doesn't change.
+		 */
+		if (likely(iovamap)) {
+			struct vfio_dma *tmp;
+			tmp = vfio_find_dma(iommu, iovamap - 1, 1);
+			if (tmp && tmp->prot == prot &&
+			    tmp->vaddr + tmp->size == vaddr) {
+				tmp->size += mapsize;
+				dma = tmp;
+			}
+		}
+
+		/*
+		 * Check if we abut a region above - nothing above ~0 + 1.
+		 * If we abut above and below, remove and free.  If only
+		 * abut above, remove, modify, reinsert.
+		 */
+		if (likely(iovamap + mapsize)) {
+			struct vfio_dma *tmp;
+			tmp = vfio_find_dma(iommu, iovamap + mapsize, 1);
+			if (tmp && tmp->prot == prot &&
+			    tmp->vaddr == vaddr + mapsize) {
+				vfio_remove_dma(iommu, tmp);
+				if (dma) {
+					dma->size += tmp->size;
+					kfree(tmp);
+				} else {
+					tmp->size += mapsize;
+					tmp->iova = iovamap;
+					tmp->vaddr = vaddr;
+					vfio_insert_dma(iommu, tmp);
+					dma = tmp;
+				}
+			}
+		}
+
+		if (!dma) {
+			dma = kzalloc(sizeof(*dma), GFP_KERNEL);
+			if (!dma) {
+				iommu_unmap(iommu->domain, iovamap, mapsize);
+				vfio_unpin_pages(pfn, nr_pages, prot, true);
+				ret = -ENOMEM;
+				break;
+			}
+
+			dma->size = mapsize;
+			dma->iova = iovamap;
+			dma->vaddr = vaddr;
+			dma->prot = prot;
+			vfio_insert_dma(iommu, dma);
+		}
+
+		iovamap += mapsize;
+		vaddr += mapsize;
+	}
+
+	if (ret) {
+		struct vfio_dma *tmp;
+		while ((tmp = vfio_find_dma(iommu, iova, size))) {
+			int r = vfio_remove_dma_overlap(iommu, iova,
+							&size, tmp);
+			if (WARN_ON(r || !size))
+				break;
+		}
+		return ret;
+	}
+
+	vfio_enable_iommu_domain(iommu);
+	return 0;
+}
+
+static int vfio_dma_do_map(struct vfio_iommu *iommu,
+			   struct vfio_iommu_type1_dma_map *map)
+{
+	dma_addr_t iova = map->iova;
+	size_t size = map->size;
+	unsigned long vaddr = map->vaddr;
+	int ret = 0, prot = 0;
+	long npage;
+
+	/* READ/WRITE from device perspective */
+	if (map->flags & VFIO_DMA_MAP_FLAG_WRITE)
+		prot |= IOMMU_WRITE;
+	if (map->flags & VFIO_DMA_MAP_FLAG_READ)
+		prot |= IOMMU_READ;
+
+	if (!prot)
+		return -EINVAL; /* No READ/WRITE? */
+
+	/* Don't allow IOVA wrap */
+	if (iova + size && iova + size < iova)
+		return -EINVAL;
+
+	/* Don't allow virtual address wrap */
+	if (vaddr + size && vaddr + size < vaddr)
+		return -EINVAL;
+
+	/*
+	 * FIXME: Currently we only support mapping page-size
+	 * of subwindow-size.
+	 */
+	if (size < iommu->page_size)
+		return -EINVAL;
+
+	npage = size >> PAGE_SHIFT;
+	if (!npage)
+		return -EINVAL;
+
+	mutex_lock(&iommu->lock);
+
+	/* Check for overlapping dma and msi-dma mappings */
+	if (vfio_find_dma(iommu, iova, size) ||
+	    vfio_find_msi_dma(iommu, iova, size)) {
+		ret = -EEXIST;
+		goto out_lock;
+	}
+
+	ret = vfio_dma_map(iommu, iova, vaddr, npage, prot);
+
+out_lock:
+	mutex_unlock(&iommu->lock);
+	return ret;
+}
+
+static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
+			     struct vfio_iommu_type1_dma_unmap *unmap)
+{
+	struct vfio_dma *dma;
+	size_t unmapped = 0, size;
+	int ret = 0;
+
+	mutex_lock(&iommu->lock);
+
+	while ((dma = vfio_find_dma(iommu, unmap->iova, unmap->size))) {
+		size = unmap->size;
+		ret = vfio_remove_dma_overlap(iommu, unmap->iova, &size, dma);
+		if (ret || !size)
+			break;
+		unmapped += size;
+	}
+
+	mutex_unlock(&iommu->lock);
+
+	/*
+	 * We may unmap more than requested, update the unmap struct so
+	 * userspace can know.
+	 */
+	unmap->size = unmapped;
+
+	return ret;
+}
+
+static int vfio_handle_get_attr(struct vfio_iommu *iommu,
+			 struct vfio_pamu_attr *pamu_attr)
+{
+	int ret = 0;
+
+	switch (pamu_attr->attribute) {
+	case VFIO_ATTR_GEOMETRY: {
+		struct iommu_domain_geometry geom;
+		ret = iommu_domain_get_attr(iommu->domain,
+					  DOMAIN_ATTR_GEOMETRY, &geom);
+		pamu_attr->attr_info.attr.aperture_start = geom.aperture_start;
+		pamu_attr->attr_info.attr.aperture_end = geom.aperture_end;
+		break;
+	}
+	case VFIO_ATTR_WINDOWS: {
+		u32 count;
+		ret = iommu_domain_get_attr(iommu->domain,
+				      DOMAIN_ATTR_WINDOWS, &count);
+		pamu_attr->attr_info.windows = count;
+		break;
+	}
+	case VFIO_ATTR_PAMU_STASH: {
+		struct pamu_stash_attribute stash;
+		ret = iommu_domain_get_attr(iommu->domain,
+				      DOMAIN_ATTR_FSL_PAMU_STASH, &stash);
+		pamu_attr->attr_info.stash.cpu = stash.cpu;
+		pamu_attr->attr_info.stash.cache = stash.cache;
+		break;
+	}
+
+	default:
+		pr_err("%s Error: Invalid attribute (%d)\n",
+			 __func__, pamu_attr->attribute);
+		return -EINVAL;
+	}
+
+	return ret;
+}
+
+static int vfio_handle_set_attr(struct vfio_iommu *iommu,
+			 struct vfio_pamu_attr *pamu_attr)
+{
+	int ret = 0;
+
+	switch (pamu_attr->attribute) {
+	case VFIO_ATTR_GEOMETRY: {
+		struct iommu_domain_geometry geom;
+
+		geom.aperture_start = pamu_attr->attr_info.attr.aperture_start;
+		geom.aperture_end = pamu_attr->attr_info.attr.aperture_end;
+		iommu->aperture_start = geom.aperture_start;
+		iommu->aperture_end = geom.aperture_end;
+		geom.force_aperture = 1;
+		ret = iommu_domain_set_attr(iommu->domain,
+					  DOMAIN_ATTR_GEOMETRY, &geom);
+		break;
+	}
+	case VFIO_ATTR_WINDOWS: {
+		u32 count = pamu_attr->attr_info.windows;
+		u64 size = iommu->aperture_end - iommu->aperture_start + 1;
+
+		ret = iommu_domain_set_attr(iommu->domain,
+				      DOMAIN_ATTR_WINDOWS, &count);
+		if (!ret) {
+			iommu->nsubwindows = pamu_attr->attr_info.windows;
+			iommu->page_size = size >> ilog2(count);
+		}
+
+		break;
+	}
+	case VFIO_ATTR_PAMU_STASH: {
+		struct pamu_stash_attribute stash;
+
+		stash.cpu = pamu_attr->attr_info.stash.cpu;
+		stash.cache = pamu_attr->attr_info.stash.cache;
+		ret = iommu_domain_set_attr(iommu->domain,
+				      DOMAIN_ATTR_FSL_PAMU_STASH, &stash);
+		break;
+	}
+
+	default:
+		pr_err("%s Error: Invalid attribute (%d)\n",
+			 __func__, pamu_attr->attribute);
+		return -EINVAL;
+	}
+
+	return ret;
+}
+
+static int pci_msi_set_device_iova(struct device *dev, void *data)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+	struct vfio_msi_dma *msi_dma = data;
+
+	return msi_set_iova(pdev, msi_dma->bank_id, msi_dma->iova, 1);
+}
+
+static int pci_msi_clear_device_iova(struct device *dev, void *data)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+	struct vfio_msi_dma *msi_dma = data;
+
+	return msi_set_iova(pdev, msi_dma->bank_id, msi_dma->iova, 0);
+}
+
+static int vfio_iommu_set_msi_iova(struct vfio_iommu *iommu,
+				   struct vfio_msi_dma *msi_dma)
+{
+	struct vfio_group *group;
+	int ret = 0;
+
+	list_for_each_entry(group, &iommu->group_list, next) {
+		ret = iommu_group_for_each_dev(group->iommu_group, msi_dma,
+					       pci_msi_set_device_iova);
+	}
+
+	return ret;
+}
+
+static int vfio_iommu_clear_msi_iova(struct vfio_iommu *iommu,
+				     struct vfio_msi_dma *msi_dma)
+{
+	struct vfio_group *group;
+	int ret = 0;
+
+	list_for_each_entry(group, &iommu->group_list, next) {
+		ret = iommu_group_for_each_dev(group->iommu_group, msi_dma,
+					       pci_msi_clear_device_iova);
+	}
+
+	return ret;
+}
+
+static int vfio_do_msi_map(struct vfio_iommu *iommu,
+			struct vfio_pamu_msi_bank_map *msi_map)
+{
+	struct msi_region region;
+	struct vfio_msi_dma *msi_dma;
+	int window;
+	int prot = 0;
+	int ret;
+
+	/* READ/WRITE from device perspective */
+	if (msi_map->flags & VFIO_DMA_MAP_FLAG_WRITE)
+		prot |= IOMMU_WRITE;
+	if (msi_map->flags & VFIO_DMA_MAP_FLAG_READ)
+		prot |= IOMMU_READ;
+
+	if (!prot)
+		return -EINVAL; /* No READ/WRITE? */
+
+	ret = msi_get_region(msi_map->msi_bank_index, &region);
+	if (ret) {
+		pr_err("%s MSI region (%d) not found\n", __func__,
+		       msi_map->msi_bank_index);
+		return ret;
+	}
+
+	mutex_lock(&iommu->lock);
+	/* Check for existing dma mapping and msi_dma mapping */
+	if (vfio_find_dma(iommu, msi_map->iova, region.size) ||
+	    vfio_find_msi_dma(iommu, msi_map->iova, region.size)) {
+		ret = -EEXIST;
+		goto out_lock;
+	}
+
+	window = iova_to_win(iommu, msi_map->iova);
+	ret = iommu_domain_window_enable(iommu->domain, window, region.addr,
+					 region.size, prot);
+	if (ret) {
+		pr_err("%s Error: unable to map msi region\n", __func__);
+		goto out_lock;
+	}
+
+	msi_dma = kzalloc(sizeof(*msi_dma), GFP_KERNEL);
+	if (!msi_dma) {
+		ret = -ENOMEM;
+		goto out_lock;
+	}
+
+	msi_dma->iova = msi_map->iova;
+	msi_dma->size = region.size;
+	msi_dma->bank_id = msi_map->msi_bank_index;
+	list_add(&msi_dma->next, &iommu->msi_dma_list);
+
+	/* Set the iova for all devices in the iommu group for the given msi bank */
+	ret = vfio_iommu_set_msi_iova(iommu, msi_dma);
+
+out_lock:
+	mutex_unlock(&iommu->lock);
+	return ret;
+}
+
+static void vfio_msi_unmap(struct vfio_iommu *iommu, dma_addr_t iova)
+{
+	int window = iova_to_win(iommu, iova);
+
+	iommu_domain_window_disable(iommu->domain, window);
+}
+
+static int vfio_do_msi_unmap(struct vfio_iommu *iommu,
+			     struct vfio_pamu_msi_bank_unmap *msi_unmap)
+{
+	struct vfio_msi_dma *mdma, *mdma_tmp;
+
+	mutex_lock(&iommu->lock);
+
+	list_for_each_entry_safe(mdma, mdma_tmp, &iommu->msi_dma_list, next) {
+		if (mdma->iova == msi_unmap->iova) {
+			/* Clear mapping for msi iova page mapping */
+			vfio_iommu_clear_msi_iova(iommu, mdma);
+			/* Unmap in iommu (PAMU) */
+			vfio_msi_unmap(iommu, mdma->iova);
+			list_del(&mdma->next);
+			vfio_check_and_disable_iommu(iommu);
+			kfree(mdma);
+			mutex_unlock(&iommu->lock);
+			return 0;
+		}
+	}
+
+	mutex_unlock(&iommu->lock);
+	return -EINVAL;
+}
+
+static void *vfio_iommu_fsl_pamu_open(unsigned long arg)
+{
+	struct vfio_iommu *iommu;
+
+	if (arg != VFIO_FSL_PAMU_IOMMU)
+		return ERR_PTR(-EINVAL);
+
+	iommu = kzalloc(sizeof(*iommu), GFP_KERNEL);
+	if (!iommu)
+		return ERR_PTR(-ENOMEM);
+
+	INIT_LIST_HEAD(&iommu->group_list);
+	iommu->dma_list = RB_ROOT;
+	INIT_LIST_HEAD(&iommu->msi_dma_list);
+	mutex_init(&iommu->lock);
+
+	/*
+	 * Wish we didn't have to know about bus_type here.
+	 */
+	iommu->domain = iommu_domain_alloc(&pci_bus_type);
+	if (!iommu->domain) {
+		kfree(iommu);
+		return ERR_PTR(-EIO);
+	}
+
+	return iommu;
+}
+
+static void vfio_iommu_fsl_pamu_release(void *iommu_data)
+{
+	struct vfio_iommu *iommu = iommu_data;
+	struct vfio_group *group, *group_tmp;
+	struct vfio_msi_dma *mdma, *mdma_tmp;
+	struct rb_node *node;
+
+	list_for_each_entry_safe(group, group_tmp, &iommu->group_list, next) {
+		iommu_detach_group(iommu->domain, group->iommu_group);
+		list_del(&group->next);
+		kfree(group);
+	}
+
+	while ((node = rb_first(&iommu->dma_list))) {
+		struct vfio_dma *dma = rb_entry(node, struct vfio_dma, node);
+		size_t size = dma->size;
+		vfio_remove_dma_overlap(iommu, dma->iova, &size, dma);
+		if (WARN_ON(!size))
+			break;
+	}
+
+	list_for_each_entry_safe(mdma, mdma_tmp, &iommu->msi_dma_list, next) {
+		vfio_msi_unmap(iommu, mdma->iova);
+		list_del(&mdma->next);
+		kfree(mdma);
+	}
+
+	/* Disable the iommu as there is no valid entry */
+	vfio_disable_iommu_domain(iommu);
+
+	iommu_domain_free(iommu->domain);
+	iommu->domain = NULL;
+	kfree(iommu);
+}
+
+static long vfio_iommu_fsl_pamu_ioctl(void *iommu_data,
+				      unsigned int cmd, unsigned long arg)
+{
+	struct vfio_iommu *iommu = iommu_data;
+	unsigned long minsz;
+
+	if (cmd == VFIO_CHECK_EXTENSION) {
+		switch (arg) {
+		case VFIO_FSL_PAMU_IOMMU:
+			return 1;
+		default:
+			return 0;
+		}
+	} else if (cmd == VFIO_IOMMU_MAP_DMA) {
+		struct vfio_iommu_type1_dma_map map;
+		uint32_t mask = VFIO_DMA_MAP_FLAG_READ |
+				VFIO_DMA_MAP_FLAG_WRITE;
+
+		minsz = offsetofend(struct vfio_iommu_type1_dma_map, size);
+
+		if (copy_from_user(&map, (void __user *)arg, minsz))
+			return -EFAULT;
+
+		if (map.argsz < minsz || map.flags & ~mask)
+			return -EINVAL;
+
+		return vfio_dma_do_map(iommu, &map);
+
+	} else if (cmd == VFIO_IOMMU_UNMAP_DMA) {
+		struct vfio_iommu_type1_dma_unmap unmap;
+		long ret;
+
+		minsz = offsetofend(struct vfio_iommu_type1_dma_unmap, size);
+
+		if (copy_from_user(&unmap, (void __user *)arg, minsz))
+			return -EFAULT;
+
+		if (unmap.argsz < minsz || unmap.flags)
+			return -EINVAL;
+
+		ret = vfio_dma_do_unmap(iommu, &unmap);
+		if (ret)
+			return ret;
+
+		return copy_to_user((void __user *)arg, &unmap, minsz) ?
+			-EFAULT : 0;
+	} else if (cmd == VFIO_IOMMU_PAMU_GET_ATTR) {
+		struct vfio_pamu_attr pamu_attr;
+		long ret;
+
+		minsz = offsetofend(struct vfio_pamu_attr, attr_info);
+		if (copy_from_user(&pamu_attr, (void __user *)arg, minsz))
+			return -EFAULT;
+
+		if (pamu_attr.argsz < minsz)
+			return -EINVAL;
+
+		ret = vfio_handle_get_attr(iommu, &pamu_attr);
+		if (ret)
+			return ret;
+
+		if (copy_to_user((void __user *)arg, &pamu_attr, minsz))
+			return -EFAULT;
+		return 0;
+	} else if (cmd == VFIO_IOMMU_PAMU_SET_ATTR) {
+		struct vfio_pamu_attr pamu_attr;
+
+		minsz = offsetofend(struct vfio_pamu_attr, attr_info);
+		if (copy_from_user(&pamu_attr, (void __user *)arg, minsz))
+			return -EFAULT;
+
+		if (pamu_attr.argsz < minsz)
+			return -EINVAL;
+
+		return vfio_handle_set_attr(iommu, &pamu_attr);
+	} else if (cmd == VFIO_IOMMU_PAMU_GET_MSI_BANK_COUNT) {
+		return msi_get_region_count();
+	} else if (cmd == VFIO_IOMMU_PAMU_MAP_MSI_BANK) {
+		struct vfio_pamu_msi_bank_map msi_map;
+
+		minsz = offsetofend(struct vfio_pamu_msi_bank_map, iova);
+		if (copy_from_user(&msi_map, (void __user *)arg, minsz))
+			return -EFAULT;
+
+		if (msi_map.argsz < minsz)
+			return -EINVAL;
+
+		return vfio_do_msi_map(iommu, &msi_map);
+	} else if (cmd == VFIO_IOMMU_PAMU_UNMAP_MSI_BANK) {
+		struct vfio_pamu_msi_bank_unmap msi_unmap;
+
+		minsz = offsetofend(struct vfio_pamu_msi_bank_unmap, iova);
+		if (copy_from_user(&msi_unmap, (void __user *)arg, minsz))
+			return -EFAULT;
+
+		if (msi_unmap.argsz < minsz)
+			return -EINVAL;
+
+		return vfio_do_msi_unmap(iommu, &msi_unmap);
+
+	}
+
+	return -ENOTTY;
+}
+
+static int vfio_iommu_fsl_pamu_attach_group(void *iommu_data,
+					 struct iommu_group *iommu_group)
+{
+	struct vfio_iommu *iommu = iommu_data;
+	struct vfio_group *group, *tmp;
+	int ret;
+
+	group = kzalloc(sizeof(*group), GFP_KERNEL);
+	if (!group)
+		return -ENOMEM;
+
+	mutex_lock(&iommu->lock);
+
+	list_for_each_entry(tmp, &iommu->group_list, next) {
+		if (tmp->iommu_group == iommu_group) {
+			mutex_unlock(&iommu->lock);
+			kfree(group);
+			return -EINVAL;
+		}
+	}
+
+	ret = iommu_attach_group(iommu->domain, iommu_group);
+	if (ret) {
+		mutex_unlock(&iommu->lock);
+		kfree(group);
+		return ret;
+	}
+
+	group->iommu_group = iommu_group;
+	list_add(&group->next, &iommu->group_list);
+
+	mutex_unlock(&iommu->lock);
+
+	return 0;
+}
+
+static void vfio_iommu_fsl_pamu_detach_group(void *iommu_data,
+					  struct iommu_group *iommu_group)
+{
+	struct vfio_iommu *iommu = iommu_data;
+	struct vfio_group *group;
+
+	mutex_lock(&iommu->lock);
+
+	list_for_each_entry(group, &iommu->group_list, next) {
+		if (group->iommu_group == iommu_group) {
+			iommu_detach_group(iommu->domain, iommu_group);
+			list_del(&group->next);
+			kfree(group);
+			break;
+		}
+	}
+
+	mutex_unlock(&iommu->lock);
+}
+
+static const struct vfio_iommu_driver_ops vfio_iommu_driver_ops_fsl_pamu = {
+	.name		= "vfio-iommu-fsl_pamu",
+	.owner		= THIS_MODULE,
+	.open		= vfio_iommu_fsl_pamu_open,
+	.release	= vfio_iommu_fsl_pamu_release,
+	.ioctl		= vfio_iommu_fsl_pamu_ioctl,
+	.attach_group	= vfio_iommu_fsl_pamu_attach_group,
+	.detach_group	= vfio_iommu_fsl_pamu_detach_group,
+};
+
+static int __init vfio_iommu_fsl_pamu_init(void)
+{
+	if (!iommu_present(&pci_bus_type))
+		return -ENODEV;
+
+	return vfio_register_iommu_driver(&vfio_iommu_driver_ops_fsl_pamu);
+}
+
+static void __exit vfio_iommu_fsl_pamu_cleanup(void)
+{
+	vfio_unregister_iommu_driver(&vfio_iommu_driver_ops_fsl_pamu);
+}
+
+module_init(vfio_iommu_fsl_pamu_init);
+module_exit(vfio_iommu_fsl_pamu_cleanup);
+
+MODULE_VERSION(DRIVER_VERSION);
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR(DRIVER_AUTHOR);
+MODULE_DESCRIPTION(DRIVER_DESC);
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 0fd47f5..d359055 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -23,6 +23,7 @@
 
 #define VFIO_TYPE1_IOMMU		1
 #define VFIO_SPAPR_TCE_IOMMU		2
+#define VFIO_FSL_PAMU_IOMMU		3
 
 /*
  * The IOCTL interface is designed for extensibility by embedding the
@@ -451,4 +452,103 @@ struct vfio_iommu_spapr_tce_info {
 
 /* ***************************************************************** */
 
+/*********** APIs for VFIO_PAMU type only ****************/
+/*
+ * VFIO_IOMMU_PAMU_GET_ATTR - _IO(VFIO_TYPE, VFIO_BASE + 17,
+ *				  struct vfio_pamu_attr)
+ *
+ * Gets the iommu attributes for the current vfio container.
+ * Caller sets argsz and attribute.  The ioctl fills in
+ * the provided struct vfio_pamu_attr based on the attribute
+ * value that was set.
+ * Return: 0 on success, -errno on failure
+ */
+struct vfio_pamu_attr {
+	__u32	argsz;
+	__u32	flags;	/* no flags currently */
+#define VFIO_ATTR_GEOMETRY	0
+#define VFIO_ATTR_WINDOWS	1
+#define VFIO_ATTR_PAMU_STASH	2
+	__u32	attribute;
+
+	union {
+		/* VFIO_ATTR_GEOMETRY */
+		struct {
+			/* first addr that can be mapped */
+			__u64 aperture_start;
+			/* last addr that can be mapped */
+			__u64 aperture_end;
+		} attr;
+
+		/* VFIO_ATTR_WINDOWS */
+		__u32 windows;  /* number of windows in the aperture
+				 * initially this will be the max number
+				 * of windows that can be set
+				 */
+		/* VFIO_ATTR_PAMU_STASH */
+		struct {
+			__u32 cpu;	/* CPU number for stashing */
+			__u32 cache;	/* cache ID for stashing */
+		} stash;
+	} attr_info;
+};
+#define VFIO_IOMMU_PAMU_GET_ATTR  _IO(VFIO_TYPE, VFIO_BASE + 17)
+
+/*
+ * VFIO_IOMMU_PAMU_SET_ATTR - _IO(VFIO_TYPE, VFIO_BASE + 18,
+ *				  struct vfio_pamu_attr)
+ *
+ * Sets the iommu attributes for the current vfio container.
+ * Caller sets struct vfio_pamu_attr, including argsz and attribute, and
+ * sets any fields that are valid for the attribute.
+ * Return: 0 on success, -errno on failure
+ */
+#define VFIO_IOMMU_PAMU_SET_ATTR  _IO(VFIO_TYPE, VFIO_BASE + 18)
+
+/*
+ * VFIO_IOMMU_PAMU_GET_MSI_BANK_COUNT - _IO(VFIO_TYPE, VFIO_BASE + 19, __u32)
+ *
+ * Returns the number of MSI banks for this platform.  This tells user space
+ * how many aperture windows should be reserved for MSI banks when setting
+ * the PAMU geometry and window count.
+ * Return: __u32 bank count on success, -errno on failure
+ */
+#define VFIO_IOMMU_PAMU_GET_MSI_BANK_COUNT _IO(VFIO_TYPE, VFIO_BASE + 19)
+
+/*
+ * VFIO_IOMMU_PAMU_MAP_MSI_BANK - _IO(VFIO_TYPE, VFIO_BASE + 20,
+ *				      struct vfio_pamu_msi_bank_map)
+ *
+ * Maps the MSI bank at the specified index and iova.  User space must
+ * call this ioctl once for each MSI bank (count of banks is returned by
+ * VFIO_IOMMU_PAMU_GET_MSI_BANK_COUNT).
+ * Caller provides struct vfio_pamu_msi_bank_map with all fields set.
+ * Return: 0 on success, -errno on failure
+ */
+
+struct vfio_pamu_msi_bank_map {
+	__u32	argsz;
+	__u32	flags;		/* no flags currently */
+	__u32	msi_bank_index;	/* the index of the MSI bank */
+	__u64	iova;		/* the iova the bank is to be mapped to */
+};
+#define VFIO_IOMMU_PAMU_MAP_MSI_BANK  _IO(VFIO_TYPE, VFIO_BASE + 20)
+
+/*
+ * VFIO_IOMMU_PAMU_UNMAP_MSI_BANK - _IO(VFIO_TYPE, VFIO_BASE + 21,
+ *					struct vfio_pamu_msi_bank_unmap)
+ *
+ * Unmaps the MSI bank at the specified iova.
+ * Caller provides struct vfio_pamu_msi_bank_unmap with all fields set.
+ * Operates on VFIO file descriptor (/dev/vfio/vfio).
+ * Return: 0 on success, -errno on failure
+ */
+
+struct vfio_pamu_msi_bank_unmap {
+	__u32	argsz;
+	__u32	flags;	/* no flags currently */
+	__u64	iova;	/* the iova to be unmapped */
+};
+#define VFIO_IOMMU_PAMU_UNMAP_MSI_BANK  _IO(VFIO_TYPE, VFIO_BASE + 21)
+
 #endif /* _UAPIVFIO_H */
-- 
1.7.0.4



^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
  2013-11-19  5:17 [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU) Bharat Bhushan
                   ` (8 preceding siblings ...)
  2013-11-19  5:17 ` [PATCH 9/9 v2] vfio pci: Add vfio iommu implementation for FSL_PAMU Bharat Bhushan
@ 2013-11-20 18:47 ` Alex Williamson
  2013-11-21 11:20   ` Varun Sethi
  2013-11-21 11:20   ` Bharat Bhushan
  9 siblings, 2 replies; 34+ messages in thread
From: Alex Williamson @ 2013-11-20 18:47 UTC (permalink / raw)
  To: Bharat Bhushan
  Cc: joro, bhelgaas, agraf, scottwood, stuart.yoder, iommu, linux-pci,
	linuxppc-dev, linux-kernel, Bharat Bhushan

On Tue, 2013-11-19 at 10:47 +0530, Bharat Bhushan wrote:
> From: Bharat Bhushan <bharat.bhushan@freescale.com>
> 
> PAMU (FSL IOMMU) has a concept of primary window and subwindows.
> Primary window corresponds to the complete guest iova address space
> (including MSI space), with respect to IOMMU_API this is termed as
> geometry. IOVA Base of subwindow is determined from the number of
> subwindows (configurable using iommu API).
> MSI I/O page must be within the geometry and maximum supported
> subwindows, so MSI IO-page is setup just after guest memory iova space.
> 
> So patch 1/9-4/9(inclusive) are for defining the interface to get:
>   - Number of MSI regions (which is number of MSI banks for powerpc)
>   - MSI-region address range: Physical page which have the
>     address/addresses used for generating MSI interrupt
>     and size of the page.
> 
> Patch 5/9-7/9(inclusive) is defining the interface of setting up
> MSI iova-base for a msi region(bank) for a device. so that when
> msi-message will be composed then this configured iova will be used.
> Earlier we were using iommu interface for getting the configured iova
> which was not correct and Alex Williamson suggested this type of interface.
> 
> patch 8/9 moves some common functions in a separate file so that these
> can be used by FSL_PAMU implementation (next patch uses this).
> These will be used later for iommu-none implementation. I believe we
> can do more of this but will take step by step.
> 
> Finally last patch actually adds the support for FSL-PAMU :)

Patches 1-3: msi_get_region needs to return an error (probably
-EINVAL) if called on a path where there's no backend implementation.
Otherwise the caller doesn't know that the data in the region pointer
isn't valid.

Patches 5&6: same as above for msi_set_iova, return an error if no
backend implementation.

Patch 7: Why does fsl_msi_del_iova_device bother to return anything if
it's always zero?  Return -ENODEV when not found?

Patch 9:

vfio_handle_get_attr() passes random kernel data back to userspace in
the event of iommu_domain_get_attr() error.

vfio_handle_set_attr(): I don't see any data validation happening, is
iommu_domain_set_attr() really that safe?

For both of those, drop the pr_err on unknown attribute, it's sufficient
to return error.

Is VFIO_IOMMU_PAMU_GET_MSI_BANK_COUNT per aperture (ie. each vfio user
has $COUNT regions at their disposal exclusively)?  Thanks,

Alex

> v1->v2
>  - Added interface for setting msi iova for a msi region for a device.
>    Earlier I added iommu interface for same but as per comment that is
>    removed and now created a direct interface between vfio and msi.
>  - Incorporated review comments (details is in individual patch)
> 
> Bharat Bhushan (9):
>   pci:msi: add weak function for returning msi region info
>   pci: msi: expose msi region information functions
>   powerpc: pci: Add arch specific msi region interface
>   powerpc: msi: Extend the msi region interface to get info from
>     fsl_msi
>   pci/msi: interface to set an iova for a msi region
>   powerpc: pci: Extend msi iova page setup to arch specific
>   pci: msi: Extend msi iova setting interface to powerpc arch
>   vfio: moving some functions in common file
>   vfio pci: Add vfio iommu implementation for FSL_PAMU
> 
>  arch/powerpc/include/asm/machdep.h |   10 +
>  arch/powerpc/kernel/msi.c          |   28 +
>  arch/powerpc/sysdev/fsl_msi.c      |  132 +++++-
>  arch/powerpc/sysdev/fsl_msi.h      |   25 +-
>  drivers/pci/msi.c                  |   35 ++
>  drivers/vfio/Kconfig               |    6 +
>  drivers/vfio/Makefile              |    5 +-
>  drivers/vfio/vfio_iommu_common.c   |  227 ++++++++
>  drivers/vfio/vfio_iommu_common.h   |   27 +
>  drivers/vfio/vfio_iommu_fsl_pamu.c | 1003 ++++++++++++++++++++++++++++++++++++
>  drivers/vfio/vfio_iommu_type1.c    |  206 +--------
>  include/linux/msi.h                |   14 +
>  include/linux/pci.h                |   21 +
>  include/uapi/linux/vfio.h          |  100 ++++
>  14 files changed, 1623 insertions(+), 216 deletions(-)
>  create mode 100644 drivers/vfio/vfio_iommu_common.c
>  create mode 100644 drivers/vfio/vfio_iommu_common.h
>  create mode 100644 drivers/vfio/vfio_iommu_fsl_pamu.c
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/




^ permalink raw reply	[flat|nested] 34+ messages in thread

* RE: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
  2013-11-20 18:47 ` [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU) Alex Williamson
@ 2013-11-21 11:20   ` Varun Sethi
  2013-11-21 11:20   ` Bharat Bhushan
  1 sibling, 0 replies; 34+ messages in thread
From: Varun Sethi @ 2013-11-21 11:20 UTC (permalink / raw)
  To: Alex Williamson, Bharat Bhushan
  Cc: linux-pci, agraf, Stuart Yoder, Scott Wood, iommu, bhelgaas,
	linuxppc-dev, linux-kernel



> -----Original Message-----
> From: iommu-bounces@lists.linux-foundation.org [mailto:iommu-
> bounces@lists.linux-foundation.org] On Behalf Of Alex Williamson
> Sent: Thursday, November 21, 2013 12:17 AM
> To: Bhushan Bharat-R65777
> Cc: linux-pci@vger.kernel.org; agraf@suse.de; Yoder Stuart-B08248; Wood
> Scott-B07421; iommu@lists.linux-foundation.org; bhelgaas@google.com;
> linuxppc-dev@lists.ozlabs.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU
> (PAMU)
> 
> On Tue, 2013-11-19 at 10:47 +0530, Bharat Bhushan wrote:
> > From: Bharat Bhushan <bharat.bhushan@freescale.com>
> >
> > PAMU (FSL IOMMU) has a concept of primary window and subwindows.
> > Primary window corresponds to the complete guest iova address space
> > (including MSI space), with respect to IOMMU_API this is termed as
> > geometry. IOVA Base of subwindow is determined from the number of
> > subwindows (configurable using iommu API).
> > MSI I/O page must be within the geometry and maximum supported
> > subwindows, so MSI IO-page is setup just after guest memory iova space.
> >
> > So patch 1/9-4/9(inclusive) are for defining the interface to get:
> >   - Number of MSI regions (which is number of MSI banks for powerpc)
> >   - MSI-region address range: Physical page which have the
> >     address/addresses used for generating MSI interrupt
> >     and size of the page.
> >
> > Patch 5/9-7/9(inclusive) is defining the interface of setting up MSI
> > iova-base for a msi region(bank) for a device. so that when
> > msi-message will be composed then this configured iova will be used.
> > Earlier we were using iommu interface for getting the configured iova
> > which was not correct and Alex Williamson suggested this type of
> interface.
> >
> > patch 8/9 moves some common functions in a separate file so that these
> > can be used by FSL_PAMU implementation (next patch uses this).
> > These will be used later for iommu-none implementation. I believe we
> > can do more of this but will take step by step.
> >
> > Finally last patch actually adds the support for FSL-PAMU :)
> 
> Patches 1-3: msi_get_region needs to return an error (probably
> -EINVAL) if called on a path where there's no backend implementation.
> Otherwise the caller doesn't know that the data in the region pointer
> isn't valid.
> 
> Patches 5&6: same as above for msi_set_iova, return an error if no
> backend implementation.
> 
> Patch 7: Why does fsl_msi_del_iova_device bother to return anything if
> it's always zero?  Return -ENODEV when not found?
> 
> Patch 9:
> 
> vfio_handle_get_attr() passes random kernel data back to userspace in the
> event of iommu_domain_get_attr() error.
> 
> vfio_handle_set_attr(): I don't see any data validation happening, is
> iommu_domain_set_attr() really that safe?
[Sethi Varun-B16395] The parameter validation can be left to the lower level iommu driver. The attribute could be specific to a given hardware.

-Varun


^ permalink raw reply	[flat|nested] 34+ messages in thread

* RE: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
  2013-11-20 18:47 ` [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU) Alex Williamson
  2013-11-21 11:20   ` Varun Sethi
@ 2013-11-21 11:20   ` Bharat Bhushan
  2013-11-21 20:43     ` Alex Williamson
  1 sibling, 1 reply; 34+ messages in thread
From: Bharat Bhushan @ 2013-11-21 11:20 UTC (permalink / raw)
  To: Alex Williamson
  Cc: joro, bhelgaas, agraf, Scott Wood, Stuart Yoder, iommu,
	linux-pci, linuxppc-dev, linux-kernel




> -----Original Message-----
> From: Alex Williamson [mailto:alex.williamson@redhat.com]
> Sent: Thursday, November 21, 2013 12:17 AM
> To: Bhushan Bharat-R65777
> Cc: joro@8bytes.org; bhelgaas@google.com; agraf@suse.de; Wood Scott-B07421;
> Yoder Stuart-B08248; iommu@lists.linux-foundation.org; linux-
> pci@vger.kernel.org; linuxppc-dev@lists.ozlabs.org; linux-
> kernel@vger.kernel.org; Bhushan Bharat-R65777
> Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
> 
> On Tue, 2013-11-19 at 10:47 +0530, Bharat Bhushan wrote:
> > From: Bharat Bhushan <bharat.bhushan@freescale.com>
> >
> > PAMU (FSL IOMMU) has a concept of primary window and subwindows.
> > Primary window corresponds to the complete guest iova address space
> > (including MSI space), with respect to IOMMU_API this is termed as
> > geometry. IOVA Base of subwindow is determined from the number of
> > subwindows (configurable using iommu API).
> > MSI I/O page must be within the geometry and maximum supported
> > subwindows, so MSI IO-page is setup just after guest memory iova space.
> >
> > So patch 1/9-4/9(inclusive) are for defining the interface to get:
> >   - Number of MSI regions (which is number of MSI banks for powerpc)
> >   - MSI-region address range: Physical page which have the
> >     address/addresses used for generating MSI interrupt
> >     and size of the page.
> >
> > Patch 5/9-7/9(inclusive) is defining the interface of setting up MSI
> > iova-base for a msi region(bank) for a device. so that when
> > msi-message will be composed then this configured iova will be used.
> > Earlier we were using iommu interface for getting the configured iova
> > which was not correct and Alex Williamson suggested this type of interface.
> >
> > patch 8/9 moves some common functions in a separate file so that these
> > can be used by FSL_PAMU implementation (next patch uses this).
> > These will be used later for iommu-none implementation. I believe we
> > can do more of this but will take step by step.
> >
> > Finally last patch actually adds the support for FSL-PAMU :)
> 
> Patches 1-3: msi_get_region needs to return an error (probably
> -EINVAL) if called on a path where there's no backend implementation.
> Otherwise the caller doesn't know that the data in the region pointer isn't
> valid.

will correct.

> 
> Patches 5&6: same as above for msi_set_iova, return an error if no backend
> implementation.

Ok

> 
> Patch 7: Why does fsl_msi_del_iova_device bother to return anything if it's
> always zero?  Return -ENODEV when not found?

Will make -ENODEV.

> 
> Patch 9:
> 
> vfio_handle_get_attr() passes random kernel data back to userspace in the event
> of iommu_domain_get_attr() error.

Will correct.

> 
> vfio_handle_set_attr(): I don't see any data validation happening, is
> iommu_domain_set_attr() really that safe?

We do not need any data validation here and iommu driver does whatever needed.
So yes,  iommu_domain_set_attr() is safe.

> 
> For both of those, drop the pr_err on unknown attribute, it's sufficient to
> return error.

ok

> 
> Is VFIO_IOMMU_PAMU_GET_MSI_BANK_COUNT per aperture (ie. each vfio user has
> $COUNT regions at their disposal exclusively)?

The msi-bank count is system wide, not per aperture, but windows will be set for the banks in each device's aperture.
So say we are direct-assigning 2 PCI devices (each in its own iommu group, so 2 apertures in the iommu) to a VM.
Now qemu can make only one call to learn how many msi banks there are, but it must set subwindows for all the banks for both PCI devices in their respective apertures.

Thanks
-Bharat

>  Thanks,
> 
> Alex
> 
> > v1->v2
> >  - Added interface for setting msi iova for a msi region for a device.
> >    Earlier I added iommu interface for same but as per comment that is
> >    removed and now created a direct interface between vfio and msi.
> >  - Incorporated review comments (details is in individual patch)
> >
> > Bharat Bhushan (9):
> >   pci:msi: add weak function for returning msi region info
> >   pci: msi: expose msi region information functions
> >   powerpc: pci: Add arch specific msi region interface
> >   powerpc: msi: Extend the msi region interface to get info from
> >     fsl_msi
> >   pci/msi: interface to set an iova for a msi region
> >   powerpc: pci: Extend msi iova page setup to arch specific
> >   pci: msi: Extend msi iova setting interface to powerpc arch
> >   vfio: moving some functions in common file
> >   vfio pci: Add vfio iommu implementation for FSL_PAMU
> >
> >  arch/powerpc/include/asm/machdep.h |   10 +
> >  arch/powerpc/kernel/msi.c          |   28 +
> >  arch/powerpc/sysdev/fsl_msi.c      |  132 +++++-
> >  arch/powerpc/sysdev/fsl_msi.h      |   25 +-
> >  drivers/pci/msi.c                  |   35 ++
> >  drivers/vfio/Kconfig               |    6 +
> >  drivers/vfio/Makefile              |    5 +-
> >  drivers/vfio/vfio_iommu_common.c   |  227 ++++++++
> >  drivers/vfio/vfio_iommu_common.h   |   27 +
> >  drivers/vfio/vfio_iommu_fsl_pamu.c | 1003
> ++++++++++++++++++++++++++++++++++++
> >  drivers/vfio/vfio_iommu_type1.c    |  206 +--------
> >  include/linux/msi.h                |   14 +
> >  include/linux/pci.h                |   21 +
> >  include/uapi/linux/vfio.h          |  100 ++++
> >  14 files changed, 1623 insertions(+), 216 deletions(-)  create mode
> > 100644 drivers/vfio/vfio_iommu_common.c  create mode 100644
> > drivers/vfio/vfio_iommu_common.h  create mode 100644
> > drivers/vfio/vfio_iommu_fsl_pamu.c
> >
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe
> > linux-kernel" in the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at  http://www.tux.org/lkml/
> 
> 
> 


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
  2013-11-21 11:20   ` Bharat Bhushan
@ 2013-11-21 20:43     ` Alex Williamson
  2013-11-21 20:47       ` Scott Wood
  0 siblings, 1 reply; 34+ messages in thread
From: Alex Williamson @ 2013-11-21 20:43 UTC (permalink / raw)
  To: Bharat Bhushan
  Cc: joro, bhelgaas, agraf, Scott Wood, Stuart Yoder, iommu,
	linux-pci, linuxppc-dev, linux-kernel

On Thu, 2013-11-21 at 11:20 +0000, Bharat Bhushan wrote:
> 
> > -----Original Message-----
> > From: Alex Williamson [mailto:alex.williamson@redhat.com]
> > Sent: Thursday, November 21, 2013 12:17 AM
> > To: Bhushan Bharat-R65777
> > Cc: joro@8bytes.org; bhelgaas@google.com; agraf@suse.de; Wood Scott-B07421;
> > Yoder Stuart-B08248; iommu@lists.linux-foundation.org; linux-
> > pci@vger.kernel.org; linuxppc-dev@lists.ozlabs.org; linux-
> > kernel@vger.kernel.org; Bhushan Bharat-R65777
> > Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
> > 
> > On Tue, 2013-11-19 at 10:47 +0530, Bharat Bhushan wrote:
> > > From: Bharat Bhushan <bharat.bhushan@freescale.com>
> > >
> > > PAMU (FSL IOMMU) has a concept of primary window and subwindows.
> > > Primary window corresponds to the complete guest iova address space
> > > (including MSI space), with respect to IOMMU_API this is termed as
> > > geometry. IOVA Base of subwindow is determined from the number of
> > > subwindows (configurable using iommu API).
> > > MSI I/O page must be within the geometry and maximum supported
> > > subwindows, so MSI IO-page is setup just after guest memory iova space.
> > >
> > > So patch 1/9-4/9(inclusive) are for defining the interface to get:
> > >   - Number of MSI regions (which is number of MSI banks for powerpc)
> > >   - MSI-region address range: Physical page which have the
> > >     address/addresses used for generating MSI interrupt
> > >     and size of the page.
> > >
> > > Patch 5/9-7/9(inclusive) is defining the interface of setting up MSI
> > > iova-base for a msi region(bank) for a device. so that when
> > > msi-message will be composed then this configured iova will be used.
> > > Earlier we were using the iommu interface for getting the configured iova,
> > > which was not correct, and Alex Williamson suggested this type of interface.
> > >
> > > patch 8/9 moves some common functions in a separate file so that these
> > > can be used by FSL_PAMU implementation (next patch uses this).
> > > These will be used later for iommu-none implementation. I believe we
> > > can do more of this but will take step by step.
> > >
> > > Finally last patch actually adds the support for FSL-PAMU :)
> > 
> > Patches 1-3: msi_get_region needs to return an error (probably
> > -EINVAL) if called on a path where there's no backend implementation.
> > Otherwise the caller doesn't know that the data in the region pointer isn't
> > valid.
> 
> will correct.
> 
> > 
> > Patches 5&6: same as above for msi_set_iova, return an error if no backend
> > implementation.
> 
> Ok
> 
> > 
> > Patch 7: Why does fsl_msi_del_iova_device bother to return anything if it's
> > always zero?  Return -ENODEV when not found?
> 
> Will make -ENODEV.
> 
> > 
> > Patch 9:
> > 
> > vfio_handle_get_attr() passes random kernel data back to userspace in the event
> > of iommu_domain_get_attr() error.
> 
> Will correct.
> 
> > 
> > vfio_handle_set_attr(): I don't see any data validation happening, is
> > iommu_domain_set_attr() really that safe?
> 
> We do not need any data validation here; the iommu driver does whatever is needed.
> So yes, iommu_domain_set_attr() is safe.
> 
> > 
> > For both of those, drop the pr_err on unknown attribute, it's sufficient to
> > return error.
> 
> ok
> 
> > 
> > Is VFIO_IOMMU_PAMU_GET_MSI_BANK_COUNT per aperture (ie. each vfio user has
> > $COUNT regions at their disposal exclusively)?
> 
> The MSI-bank count is system wide, not per aperture, but windows for the banks are set in the device aperture.
> So say we are direct-assigning 2 PCI devices (both in different iommu groups, so 2 apertures in the iommu) to a VM.
> Now qemu can make only one call to learn how many MSI banks there are, but it must set sub-windows for all banks for both PCI devices in their respective apertures.

I'm still confused.  What I want to make sure of is that the banks are
independent per aperture.  For instance, if we have two separate
userspace processes operating independently and they both chose to use
msi bank zero for their device, that's bank zero within each aperture
and doesn't interfere.  Or another way to ask is can a malicious user
interfere with other users by using the wrong bank.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
  2013-11-21 20:43     ` Alex Williamson
@ 2013-11-21 20:47       ` Scott Wood
  2013-11-21 21:00         ` Alex Williamson
  0 siblings, 1 reply; 34+ messages in thread
From: Scott Wood @ 2013-11-21 20:47 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Bharat Bhushan, linux-pci, agraf, Stuart Yoder, iommu, bhelgaas,
	linuxppc-dev, linux-kernel

On Thu, 2013-11-21 at 13:43 -0700, Alex Williamson wrote:
> On Thu, 2013-11-21 at 11:20 +0000, Bharat Bhushan wrote:
> > 
> > > -----Original Message-----
> > > From: Alex Williamson [mailto:alex.williamson@redhat.com]
> > > Sent: Thursday, November 21, 2013 12:17 AM
> > > To: Bhushan Bharat-R65777
> > > Cc: joro@8bytes.org; bhelgaas@google.com; agraf@suse.de; Wood Scott-B07421;
> > > Yoder Stuart-B08248; iommu@lists.linux-foundation.org; linux-
> > > pci@vger.kernel.org; linuxppc-dev@lists.ozlabs.org; linux-
> > > kernel@vger.kernel.org; Bhushan Bharat-R65777
> > > Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
> > > 
> > > Is VFIO_IOMMU_PAMU_GET_MSI_BANK_COUNT per aperture (ie. each vfio user has
> > > $COUNT regions at their disposal exclusively)?
> > 
> > Number of msi-bank count is system wide and not per aperture, But will be setting windows for banks in the device aperture.
> > So say if we are direct assigning 2 pci device (both have different iommu group, so 2 aperture in iommu) to VM.
> > Now qemu can make only one call to know how many msi-banks are there but it must set sub-windows for all banks for both pci device in its respective aperture.
> 
> I'm still confused.  What I want to make sure of is that the banks are
> independent per aperture.  For instance, if we have two separate
> userspace processes operating independently and they both chose to use
> msi bank zero for their device, that's bank zero within each aperture
> and doesn't interfere.  Or another way to ask is can a malicious user
> interfere with other users by using the wrong bank.  Thanks,

They can interfere.  With this hardware, the only way to prevent that is
to make sure that a bank is not shared by multiple protection contexts.
For some of our users, though, I believe preventing this is less
important than the performance benefit.

-Scott




^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
  2013-11-21 20:47       ` Scott Wood
@ 2013-11-21 21:00         ` Alex Williamson
  2013-11-25  5:33           ` Bharat Bhushan
  0 siblings, 1 reply; 34+ messages in thread
From: Alex Williamson @ 2013-11-21 21:00 UTC (permalink / raw)
  To: Scott Wood
  Cc: Bharat Bhushan, linux-pci, agraf, Stuart Yoder, iommu, bhelgaas,
	linuxppc-dev, linux-kernel

On Thu, 2013-11-21 at 14:47 -0600, Scott Wood wrote:
> On Thu, 2013-11-21 at 13:43 -0700, Alex Williamson wrote:
> > On Thu, 2013-11-21 at 11:20 +0000, Bharat Bhushan wrote:
> > > 
> > > > -----Original Message-----
> > > > From: Alex Williamson [mailto:alex.williamson@redhat.com]
> > > > Sent: Thursday, November 21, 2013 12:17 AM
> > > > To: Bhushan Bharat-R65777
> > > > Cc: joro@8bytes.org; bhelgaas@google.com; agraf@suse.de; Wood Scott-B07421;
> > > > Yoder Stuart-B08248; iommu@lists.linux-foundation.org; linux-
> > > > pci@vger.kernel.org; linuxppc-dev@lists.ozlabs.org; linux-
> > > > kernel@vger.kernel.org; Bhushan Bharat-R65777
> > > > Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
> > > > 
> > > > Is VFIO_IOMMU_PAMU_GET_MSI_BANK_COUNT per aperture (ie. each vfio user has
> > > > $COUNT regions at their disposal exclusively)?
> > > 
> > > Number of msi-bank count is system wide and not per aperture, But will be setting windows for banks in the device aperture.
> > > So say if we are direct assigning 2 pci device (both have different iommu group, so 2 aperture in iommu) to VM.
> > > Now qemu can make only one call to know how many msi-banks are there but it must set sub-windows for all banks for both pci device in its respective aperture.
> > 
> > I'm still confused.  What I want to make sure of is that the banks are
> > independent per aperture.  For instance, if we have two separate
> > userspace processes operating independently and they both chose to use
> > msi bank zero for their device, that's bank zero within each aperture
> > and doesn't interfere.  Or another way to ask is can a malicious user
> > interfere with other users by using the wrong bank.  Thanks,
> 
> They can interfere.  With this hardware, the only way to prevent that is
> to make sure that a bank is not shared by multiple protection contexts.
> For some of our users, though, I believe preventing this is less
> important than the performance benefit.

I think we need some sort of ownership model around the msi banks then.
Otherwise there's nothing preventing another userspace from attempting
an MSI based attack on other users, or perhaps even on the host.  VFIO
can't allow that.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 34+ messages in thread

* RE: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
  2013-11-21 21:00         ` Alex Williamson
@ 2013-11-25  5:33           ` Bharat Bhushan
  2013-11-25 16:38             ` Alex Williamson
  2013-12-06  0:00             ` Scott Wood
  0 siblings, 2 replies; 34+ messages in thread
From: Bharat Bhushan @ 2013-11-25  5:33 UTC (permalink / raw)
  To: Alex Williamson, Scott Wood
  Cc: linux-pci, agraf, Stuart Yoder, iommu, bhelgaas, linuxppc-dev,
	linux-kernel




> -----Original Message-----
> From: Alex Williamson [mailto:alex.williamson@redhat.com]
> Sent: Friday, November 22, 2013 2:31 AM
> To: Wood Scott-B07421
> Cc: Bhushan Bharat-R65777; linux-pci@vger.kernel.org; agraf@suse.de; Yoder
> Stuart-B08248; iommu@lists.linux-foundation.org; bhelgaas@google.com; linuxppc-
> dev@lists.ozlabs.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
> 
> On Thu, 2013-11-21 at 14:47 -0600, Scott Wood wrote:
> > On Thu, 2013-11-21 at 13:43 -0700, Alex Williamson wrote:
> > > On Thu, 2013-11-21 at 11:20 +0000, Bharat Bhushan wrote:
> > > >
> > > > > -----Original Message-----
> > > > > From: Alex Williamson [mailto:alex.williamson@redhat.com]
> > > > > Sent: Thursday, November 21, 2013 12:17 AM
> > > > > To: Bhushan Bharat-R65777
> > > > > Cc: joro@8bytes.org; bhelgaas@google.com; agraf@suse.de; Wood
> > > > > Scott-B07421; Yoder Stuart-B08248;
> > > > > iommu@lists.linux-foundation.org; linux- pci@vger.kernel.org;
> > > > > linuxppc-dev@lists.ozlabs.org; linux- kernel@vger.kernel.org;
> > > > > Bhushan Bharat-R65777
> > > > > Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale
> > > > > IOMMU (PAMU)
> > > > >
> > > > > Is VFIO_IOMMU_PAMU_GET_MSI_BANK_COUNT per aperture (ie. each
> > > > > vfio user has $COUNT regions at their disposal exclusively)?
> > > >
> > > > Number of msi-bank count is system wide and not per aperture, But will be
> setting windows for banks in the device aperture.
> > > > So say if we are direct assigning 2 pci device (both have different iommu
> group, so 2 aperture in iommu) to VM.
> > > > Now qemu can make only one call to know how many msi-banks are there but
> it must set sub-windows for all banks for both pci device in its respective
> aperture.
> > >
> > > I'm still confused.  What I want to make sure of is that the banks
> > > are independent per aperture.  For instance, if we have two separate
> > > userspace processes operating independently and they both chose to
> > > use msi bank zero for their device, that's bank zero within each
> > > aperture and doesn't interfere.  Or another way to ask is can a
> > > malicious user interfere with other users by using the wrong bank.
> > > Thanks,
> >
> > They can interfere.

Can you explain exactly how they can interfere?

>>  With this hardware, the only way to prevent that
> > is to make sure that a bank is not shared by multiple protection contexts.
> > For some of our users, though, I believe preventing this is less
> > important than the performance benefit.

So should we let this patch series in without protection?

> 
> I think we need some sort of ownership model around the msi banks then.
> Otherwise there's nothing preventing another userspace from attempting an MSI
> based attack on other users, or perhaps even on the host.  VFIO can't allow
> that.  Thanks,

We have very few MSI banks (3 on most chips), so we cannot assign one to each userspace. What we can do is ensure the host and userspace never share an MSI bank, while userspaces share MSI banks among themselves.


Thanks
-Bharat

> 
> Alex
> 


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
  2013-11-25  5:33           ` Bharat Bhushan
@ 2013-11-25 16:38             ` Alex Williamson
  2013-11-27 16:08               ` Bharat Bhushan
  2013-11-28  9:19               ` Bharat Bhushan
  2013-12-06  0:00             ` Scott Wood
  1 sibling, 2 replies; 34+ messages in thread
From: Alex Williamson @ 2013-11-25 16:38 UTC (permalink / raw)
  To: Bharat Bhushan
  Cc: Scott Wood, linux-pci, agraf, Stuart Yoder, iommu, bhelgaas,
	linuxppc-dev, linux-kernel

On Mon, 2013-11-25 at 05:33 +0000, Bharat Bhushan wrote:
> 
> > -----Original Message-----
> > From: Alex Williamson [mailto:alex.williamson@redhat.com]
> > Sent: Friday, November 22, 2013 2:31 AM
> > To: Wood Scott-B07421
> > Cc: Bhushan Bharat-R65777; linux-pci@vger.kernel.org; agraf@suse.de; Yoder
> > Stuart-B08248; iommu@lists.linux-foundation.org; bhelgaas@google.com; linuxppc-
> > dev@lists.ozlabs.org; linux-kernel@vger.kernel.org
> > Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
> > 
> > On Thu, 2013-11-21 at 14:47 -0600, Scott Wood wrote:
> > > On Thu, 2013-11-21 at 13:43 -0700, Alex Williamson wrote:
> > > > On Thu, 2013-11-21 at 11:20 +0000, Bharat Bhushan wrote:
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Alex Williamson [mailto:alex.williamson@redhat.com]
> > > > > > Sent: Thursday, November 21, 2013 12:17 AM
> > > > > > To: Bhushan Bharat-R65777
> > > > > > Cc: joro@8bytes.org; bhelgaas@google.com; agraf@suse.de; Wood
> > > > > > Scott-B07421; Yoder Stuart-B08248;
> > > > > > iommu@lists.linux-foundation.org; linux- pci@vger.kernel.org;
> > > > > > linuxppc-dev@lists.ozlabs.org; linux- kernel@vger.kernel.org;
> > > > > > Bhushan Bharat-R65777
> > > > > > Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale
> > > > > > IOMMU (PAMU)
> > > > > >
> > > > > > Is VFIO_IOMMU_PAMU_GET_MSI_BANK_COUNT per aperture (ie. each
> > > > > > vfio user has $COUNT regions at their disposal exclusively)?
> > > > >
> > > > > Number of msi-bank count is system wide and not per aperture, But will be
> > setting windows for banks in the device aperture.
> > > > > So say if we are direct assigning 2 pci device (both have different iommu
> > group, so 2 aperture in iommu) to VM.
> > > > > Now qemu can make only one call to know how many msi-banks are there but
> > it must set sub-windows for all banks for both pci device in its respective
> > aperture.
> > > >
> > > > I'm still confused.  What I want to make sure of is that the banks
> > > > are independent per aperture.  For instance, if we have two separate
> > > > userspace processes operating independently and they both chose to
> > > > use msi bank zero for their device, that's bank zero within each
> > > > aperture and doesn't interfere.  Or another way to ask is can a
> > > > malicious user interfere with other users by using the wrong bank.
> > > > Thanks,
> > >
> > > They can interfere.
> 
> Want to be sure of how they can interfere?

What happens if more than one user selects the same MSI bank?
Minimally, wouldn't that result in the IOMMU blocking transactions from
the previous user once the new user activates their mapping?

> >>  With this hardware, the only way to prevent that
> > > is to make sure that a bank is not shared by multiple protection contexts.
> > > For some of our users, though, I believe preventing this is less
> > > important than the performance benefit.
> 
> So should we let this patch series in without protection?

No.

> > 
> > I think we need some sort of ownership model around the msi banks then.
> > Otherwise there's nothing preventing another userspace from attempting an MSI
> > based attack on other users, or perhaps even on the host.  VFIO can't allow
> > that.  Thanks,
> 
> We have very few (3 MSI bank on most of chips), so we can not assign
> one to each userspace. What we can do is host and userspace does not
> share a MSI bank while userspace will share a MSI bank.

Then you probably need VFIO to "own" the MSI bank and program devices
into it rather than exposing the MSI banks to userspace to let them have
direct access.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 1/9 v2] pci:msi: add weak function for returning msi region info
  2013-11-19  5:17 ` [PATCH 1/9 v2] pci:msi: add weak function for returning msi region info Bharat Bhushan
@ 2013-11-25 23:36   ` Bjorn Helgaas
  2013-11-28 10:08     ` Bharat Bhushan
  0 siblings, 1 reply; 34+ messages in thread
From: Bjorn Helgaas @ 2013-11-25 23:36 UTC (permalink / raw)
  To: Bharat Bhushan
  Cc: alex.williamson, joro, agraf, scottwood, stuart.yoder, iommu,
	linux-pci, linuxppc-dev, linux-kernel, Bharat Bhushan

On Tue, Nov 19, 2013 at 10:47:05AM +0530, Bharat Bhushan wrote:
> In an aperture-type IOMMU (like FSL PAMU), the VFIO-iommu system needs to know
> the MSI region to map its window in h/w. This patch just defines the
> required weak functions and will be used by follow-up patches.
> 
> Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
> ---
> v1->v2
>  - Added description on "struct msi_region" 
> 
>  drivers/pci/msi.c   |   22 ++++++++++++++++++++++
>  include/linux/msi.h |   14 ++++++++++++++
>  2 files changed, 36 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
> index d5f90d6..2643a29 100644
> --- a/drivers/pci/msi.c
> +++ b/drivers/pci/msi.c
> @@ -67,6 +67,28 @@ int __weak arch_msi_check_device(struct pci_dev *dev, int nvec, int type)
>  	return chip->check_device(chip, dev, nvec, type);
>  }
>  
> +int __weak arch_msi_get_region_count(void)
> +{
> +	return 0;
> +}
> +
> +int __weak arch_msi_get_region(int region_num, struct msi_region *region)
> +{
> +	return 0;
> +}
> +
> +int msi_get_region_count(void)
> +{
> +	return arch_msi_get_region_count();
> +}
> +EXPORT_SYMBOL(msi_get_region_count);
> +
> +int msi_get_region(int region_num, struct msi_region *region)
> +{
> +	return arch_msi_get_region(region_num, region);
> +}
> +EXPORT_SYMBOL(msi_get_region);
> +
>  int __weak arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
>  {
>  	struct msi_desc *entry;
> diff --git a/include/linux/msi.h b/include/linux/msi.h
> index b17ead8..ade1480 100644
> --- a/include/linux/msi.h
> +++ b/include/linux/msi.h
> @@ -51,6 +51,18 @@ struct msi_desc {
>  };
>  
>  /*
> + * This structure is used to get
> + * - physical address
> + * - size
> + * of a msi region
> + */
> +struct msi_region {
> +	int region_num; /* MSI region number */
> +	dma_addr_t addr; /* Address of MSI region */
> +	size_t size; /* Size of MSI region */
> +};
> +
> +/*
>   * The arch hooks to setup up msi irqs. Those functions are
>   * implemented as weak symbols so that they /can/ be overriden by
>   * architecture specific code if needed.
> @@ -64,6 +76,8 @@ void arch_restore_msi_irqs(struct pci_dev *dev, int irq);
>  
>  void default_teardown_msi_irqs(struct pci_dev *dev);
>  void default_restore_msi_irqs(struct pci_dev *dev, int irq);
> +int arch_msi_get_region_count(void);
> +int arch_msi_get_region(int region_num, struct msi_region *region);

It doesn't look like any of this (struct msi_region, msi_get_region(),
msi_get_region_count()) is actually used by drivers/pci/msi.c, so I don't
think it needs to be declared in generic code.  It looks like it's only
used in drivers/vfio/vfio_iommu_fsl_pamu.c, where you already know you have
an FSL IOMMU, and you can just call FSL-specific interfaces directly.

Bjorn

>  
>  struct msi_chip {
>  	struct module *owner;
> -- 
> 1.7.0.4
> 
> 

^ permalink raw reply	[flat|nested] 34+ messages in thread

* RE: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
  2013-11-25 16:38             ` Alex Williamson
@ 2013-11-27 16:08               ` Bharat Bhushan
  2013-11-28  9:19               ` Bharat Bhushan
  1 sibling, 0 replies; 34+ messages in thread
From: Bharat Bhushan @ 2013-11-27 16:08 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Scott Wood, linux-pci, agraf, Stuart Yoder, iommu, bhelgaas,
	linuxppc-dev, linux-kernel




> -----Original Message-----
> From: Alex Williamson [mailto:alex.williamson@redhat.com]
> Sent: Monday, November 25, 2013 10:08 PM
> To: Bhushan Bharat-R65777
> Cc: Wood Scott-B07421; linux-pci@vger.kernel.org; agraf@suse.de; Yoder Stuart-
> B08248; iommu@lists.linux-foundation.org; bhelgaas@google.com; linuxppc-
> dev@lists.ozlabs.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
> 
> On Mon, 2013-11-25 at 05:33 +0000, Bharat Bhushan wrote:
> >
> > > -----Original Message-----
> > > From: Alex Williamson [mailto:alex.williamson@redhat.com]
> > > Sent: Friday, November 22, 2013 2:31 AM
> > > To: Wood Scott-B07421
> > > Cc: Bhushan Bharat-R65777; linux-pci@vger.kernel.org; agraf@suse.de;
> > > Yoder Stuart-B08248; iommu@lists.linux-foundation.org;
> > > bhelgaas@google.com; linuxppc- dev@lists.ozlabs.org;
> > > linux-kernel@vger.kernel.org
> > > Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale
> > > IOMMU (PAMU)
> > >
> > > On Thu, 2013-11-21 at 14:47 -0600, Scott Wood wrote:
> > > > On Thu, 2013-11-21 at 13:43 -0700, Alex Williamson wrote:
> > > > > On Thu, 2013-11-21 at 11:20 +0000, Bharat Bhushan wrote:
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Alex Williamson [mailto:alex.williamson@redhat.com]
> > > > > > > Sent: Thursday, November 21, 2013 12:17 AM
> > > > > > > To: Bhushan Bharat-R65777
> > > > > > > Cc: joro@8bytes.org; bhelgaas@google.com; agraf@suse.de;
> > > > > > > Wood Scott-B07421; Yoder Stuart-B08248;
> > > > > > > iommu@lists.linux-foundation.org; linux-
> > > > > > > pci@vger.kernel.org; linuxppc-dev@lists.ozlabs.org; linux-
> > > > > > > kernel@vger.kernel.org; Bhushan Bharat-R65777
> > > > > > > Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for
> > > > > > > Freescale IOMMU (PAMU)
> > > > > > >
> > > > > > > Is VFIO_IOMMU_PAMU_GET_MSI_BANK_COUNT per aperture (ie. each
> > > > > > > vfio user has $COUNT regions at their disposal exclusively)?
> > > > > >
> > > > > > Number of msi-bank count is system wide and not per aperture,
> > > > > > But will be
> > > setting windows for banks in the device aperture.
> > > > > > So say if we are direct assigning 2 pci device (both have
> > > > > > different iommu
> > > group, so 2 aperture in iommu) to VM.
> > > > > > Now qemu can make only one call to know how many msi-banks are
> > > > > > there but
> > > it must set sub-windows for all banks for both pci device in its
> > > respective aperture.
> > > > >
> > > > > I'm still confused.  What I want to make sure of is that the
> > > > > banks are independent per aperture.  For instance, if we have
> > > > > two separate userspace processes operating independently and
> > > > > they both chose to use msi bank zero for their device, that's
> > > > > bank zero within each aperture and doesn't interfere.  Or
> > > > > another way to ask is can a malicious user interfere with other users by
> using the wrong bank.
> > > > > Thanks,
> > > >
> > > > They can interfere.
> >
> > Want to be sure of how they can interfere?
> 
> What happens if more than one user selects the same MSI bank?
> Minimally, wouldn't that result in the IOMMU blocking transactions from the
> previous user once the new user activates their mapping?

Yes and no; with the current implementation yes, but with a minor change no. Later in this response I will explain how.

> 
> > >>  With this hardware, the only way to prevent that
> > > > is to make sure that a bank is not shared by multiple protection contexts.
> > > > For some of our users, though, I believe preventing this is less
> > > > important than the performance benefit.
> >
> > So should we let this patch series in without protection?
> 
> No.
> 
> > >
> > > I think we need some sort of ownership model around the msi banks then.
> > > Otherwise there's nothing preventing another userspace from
> > > attempting an MSI based attack on other users, or perhaps even on
> > > the host.  VFIO can't allow that.  Thanks,
> >
> > We have very few (3 MSI bank on most of chips), so we can not assign
> > one to each userspace. What we can do is host and userspace does not
> > share a MSI bank while userspace will share a MSI bank.
> 
> Then you probably need VFIO to "own" the MSI bank and program devices into it
> rather than exposing the MSI banks to userspace to let them have direct access.

The overall idea of exposing the details of MSI regions to userspace is:
 1) Userspace can define the aperture size to fit the MSI mapping in the IOMMU.
 2) Set up an iova for the MSI banks, which is just after guest memory.

But currently we expose both the "size" and "address" of MSI banks; passing the address is of no use and can be problematic.
If we provide only the size of an MSI bank to userspace, then userspace cannot do anything wrong.

It is still the responsibility of the host (MSI+VFIO) to compose the MSI address and MSI data, so I think this should look fine.

> Thanks,
> 
> Alex
> 


^ permalink raw reply	[flat|nested] 34+ messages in thread

* RE: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
  2013-11-25 16:38             ` Alex Williamson
  2013-11-27 16:08               ` Bharat Bhushan
@ 2013-11-28  9:19               ` Bharat Bhushan
  2013-12-06  0:21                 ` Scott Wood
  1 sibling, 1 reply; 34+ messages in thread
From: Bharat Bhushan @ 2013-11-28  9:19 UTC (permalink / raw)
  To: Bharat Bhushan, Alex Williamson
  Cc: Scott Wood, linux-pci, agraf, Stuart Yoder, iommu, bhelgaas,
	linuxppc-dev, linux-kernel




> -----Original Message-----
> From: Bhushan Bharat-R65777
> Sent: Wednesday, November 27, 2013 9:39 PM
> To: 'Alex Williamson'
> Cc: Wood Scott-B07421; linux-pci@vger.kernel.org; agraf@suse.de; Yoder Stuart-
> B08248; iommu@lists.linux-foundation.org; bhelgaas@google.com; linuxppc-
> dev@lists.ozlabs.org; linux-kernel@vger.kernel.org
> Subject: RE: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
> 
> 
> 
> > -----Original Message-----
> > From: Alex Williamson [mailto:alex.williamson@redhat.com]
> > Sent: Monday, November 25, 2013 10:08 PM
> > To: Bhushan Bharat-R65777
> > Cc: Wood Scott-B07421; linux-pci@vger.kernel.org; agraf@suse.de; Yoder
> > Stuart- B08248; iommu@lists.linux-foundation.org; bhelgaas@google.com;
> > linuxppc- dev@lists.ozlabs.org; linux-kernel@vger.kernel.org
> > Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU
> > (PAMU)
> >
> > On Mon, 2013-11-25 at 05:33 +0000, Bharat Bhushan wrote:
> > >
> > > > -----Original Message-----
> > > > From: Alex Williamson [mailto:alex.williamson@redhat.com]
> > > > Sent: Friday, November 22, 2013 2:31 AM
> > > > To: Wood Scott-B07421
> > > > Cc: Bhushan Bharat-R65777; linux-pci@vger.kernel.org;
> > > > agraf@suse.de; Yoder Stuart-B08248;
> > > > iommu@lists.linux-foundation.org; bhelgaas@google.com; linuxppc-
> > > > dev@lists.ozlabs.org; linux-kernel@vger.kernel.org
> > > > Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale
> > > > IOMMU (PAMU)
> > > >
> > > > On Thu, 2013-11-21 at 14:47 -0600, Scott Wood wrote:
> > > > > On Thu, 2013-11-21 at 13:43 -0700, Alex Williamson wrote:
> > > > > > On Thu, 2013-11-21 at 11:20 +0000, Bharat Bhushan wrote:
> > > > > > >
> > > > > > > > -----Original Message-----
> > > > > > > > From: Alex Williamson [mailto:alex.williamson@redhat.com]
> > > > > > > > Sent: Thursday, November 21, 2013 12:17 AM
> > > > > > > > To: Bhushan Bharat-R65777
> > > > > > > > Cc: joro@8bytes.org; bhelgaas@google.com; agraf@suse.de;
> > > > > > > > Wood Scott-B07421; Yoder Stuart-B08248;
> > > > > > > > iommu@lists.linux-foundation.org; linux-
> > > > > > > > pci@vger.kernel.org; linuxppc-dev@lists.ozlabs.org; linux-
> > > > > > > > kernel@vger.kernel.org; Bhushan Bharat-R65777
> > > > > > > > Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for
> > > > > > > > Freescale IOMMU (PAMU)
> > > > > > > >
> > > > > > > > Is VFIO_IOMMU_PAMU_GET_MSI_BANK_COUNT per aperture (ie.
> > > > > > > > each vfio user has $COUNT regions at their disposal exclusively)?
> > > > > > >
> > > > > > > Number of msi-bank count is system wide and not per
> > > > > > > aperture, But will be
> > > > setting windows for banks in the device aperture.
> > > > > > > So say if we are direct assigning 2 pci device (both have
> > > > > > > different iommu
> > > > group, so 2 aperture in iommu) to VM.
> > > > > > > Now qemu can make only one call to know how many msi-banks
> > > > > > > are there but
> > > > it must set sub-windows for all banks for both pci device in its
> > > > respective aperture.
> > > > > >
> > > > > > I'm still confused.  What I want to make sure of is that the
> > > > > > banks are independent per aperture.  For instance, if we have
> > > > > > two separate userspace processes operating independently and
> > > > > > they both chose to use msi bank zero for their device, that's
> > > > > > bank zero within each aperture and doesn't interfere.  Or
> > > > > > another way to ask is can a malicious user interfere with
> > > > > > other users by
> > using the wrong bank.
> > > > > > Thanks,
> > > > >
> > > > > They can interfere.
> > >
> > > Want to be sure of how they can interfere?
> >
> > What happens if more than one user selects the same MSI bank?
> > Minimally, wouldn't that result in the IOMMU blocking transactions
> > from the previous user once the new user activates their mapping?
> 
> Yes and no; With current implementation yes but with a minor change no. Later in
> this response I will explain how.
> 
> >
> > > >>  With this hardware, the only way to prevent that
> > > > > is to make sure that a bank is not shared by multiple protection
> contexts.
> > > > > For some of our users, though, I believe preventing this is less
> > > > > important than the performance benefit.
> > >
> > > So should we let this patch series in without protection?
> >
> > No.
> >
> > > >
> > > > I think we need some sort of ownership model around the msi banks then.
> > > > Otherwise there's nothing preventing another userspace from
> > > > attempting an MSI based attack on other users, or perhaps even on
> > > > the host.  VFIO can't allow that.  Thanks,
> > >
> > > We have very few (3 MSI bank on most of chips), so we can not assign
> > > one to each userspace. What we can do is host and userspace does not
> > > share a MSI bank while userspace will share a MSI bank.
> >
> > Then you probably need VFIO to "own" the MSI bank and program devices
> > into it rather than exposing the MSI banks to userspace to let them have
> direct access.
> 
> Overall idea of exposing the details of msi regions to userspace are
>  1) User space can define the aperture size to fit MSI mapping in IOMMU.
>  2) setup iova for a MSI banks; which is just after guest memory.
> 
> But currently we expose the "size" and "address" of MSI banks, passing address
> is of no use and can be problematic.

I am sorry, the information above is not correct. Currently we expose neither the "address" nor the "size" to userspace. We only expose the MSI bank count, and userspace adds one sub-window for each bank.

> If we just provide the size of MSI bank to userspace then userspace cannot do
> anything wrong.

So userspace does not know the address, and therefore cannot mmap it and cause interference by directly reading/writing.
When userspace makes the VFIO_DEVICE_SET_IRQS ioctl for the MSI type, VFIO together with the MSI layer composes and writes the MSI address and data into the actual device. This is all abstracted within the host kernel.

Do we see any issue with this approach?

Thanks
-Bharat

> 
> While it is still the responsibility of host (MSI+VFIO) to compose MSI-address
> and MSI-data; so I think this should look fine.
> 
> > Thanks,
> >
> > Alex
> >


^ permalink raw reply	[flat|nested] 34+ messages in thread

* RE: [PATCH 1/9 v2] pci:msi: add weak function for returning msi region info
  2013-11-25 23:36   ` Bjorn Helgaas
@ 2013-11-28 10:08     ` Bharat Bhushan
  0 siblings, 0 replies; 34+ messages in thread
From: Bharat Bhushan @ 2013-11-28 10:08 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: alex.williamson, joro, agraf, Scott Wood, Stuart Yoder, iommu,
	linux-pci, linuxppc-dev, linux-kernel



> -----Original Message-----
> From: linux-pci-owner@vger.kernel.org [mailto:linux-pci-owner@vger.kernel.org]
> On Behalf Of Bjorn Helgaas
> Sent: Tuesday, November 26, 2013 5:06 AM
> To: Bhushan Bharat-R65777
> Cc: alex.williamson@redhat.com; joro@8bytes.org; agraf@suse.de; Wood Scott-
> B07421; Yoder Stuart-B08248; iommu@lists.linux-foundation.org; linux-
> pci@vger.kernel.org; linuxppc-dev@lists.ozlabs.org; linux-
> kernel@vger.kernel.org; Bhushan Bharat-R65777
> Subject: Re: [PATCH 1/9 v2] pci:msi: add weak function for returning msi region
> info
> 
> On Tue, Nov 19, 2013 at 10:47:05AM +0530, Bharat Bhushan wrote:
> > In Aperture type of IOMMU (like FSL PAMU), VFIO-iommu system need to
> > know the MSI region to map its window in h/w. This patch just defines
> > the required weak functions only and will be used by followup patches.
> >
> > Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
> > ---
> > v1->v2
> >  - Added description on "struct msi_region"
> >
> >  drivers/pci/msi.c   |   22 ++++++++++++++++++++++
> >  include/linux/msi.h |   14 ++++++++++++++
> >  2 files changed, 36 insertions(+), 0 deletions(-)
> >
> > diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c index
> > d5f90d6..2643a29 100644
> > --- a/drivers/pci/msi.c
> > +++ b/drivers/pci/msi.c
> > @@ -67,6 +67,28 @@ int __weak arch_msi_check_device(struct pci_dev *dev, int
> nvec, int type)
> >  	return chip->check_device(chip, dev, nvec, type);  }
> >
> > +int __weak arch_msi_get_region_count(void) {
> > +	return 0;
> > +}
> > +
> > +int __weak arch_msi_get_region(int region_num, struct msi_region
> > +*region) {
> > +	return 0;
> > +}
> > +
> > +int msi_get_region_count(void)
> > +{
> > +	return arch_msi_get_region_count();
> > +}
> > +EXPORT_SYMBOL(msi_get_region_count);
> > +
> > +int msi_get_region(int region_num, struct msi_region *region) {
> > +	return arch_msi_get_region(region_num, region); }
> > +EXPORT_SYMBOL(msi_get_region);
> > +
> >  int __weak arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int
> > type)  {
> >  	struct msi_desc *entry;
> > diff --git a/include/linux/msi.h b/include/linux/msi.h index
> > b17ead8..ade1480 100644
> > --- a/include/linux/msi.h
> > +++ b/include/linux/msi.h
> > @@ -51,6 +51,18 @@ struct msi_desc {
> >  };
> >
> >  /*
> > + * This structure is used to get
> > + * - physical address
> > + * - size
> > + * of a msi region
> > + */
> > +struct msi_region {
> > +	int region_num; /* MSI region number */
> > +	dma_addr_t addr; /* Address of MSI region */
> > +	size_t size; /* Size of MSI region */ };
> > +
> > +/*
> >   * The arch hooks to setup up msi irqs. Those functions are
> >   * implemented as weak symbols so that they /can/ be overriden by
> >   * architecture specific code if needed.
> > @@ -64,6 +76,8 @@ void arch_restore_msi_irqs(struct pci_dev *dev, int
> > irq);
> >
> >  void default_teardown_msi_irqs(struct pci_dev *dev);  void
> > default_restore_msi_irqs(struct pci_dev *dev, int irq);
> > +int arch_msi_get_region_count(void);
> > +int arch_msi_get_region(int region_num, struct msi_region *region);
> 
> It doesn't look like any of this (struct msi_region, msi_get_region(),
> msi_get_region_count()) is actually used by drivers/pci/msi.c, so I don't think
> it needs to be declared in generic code.  It looks like it's only used in
> drivers/vfio/vfio_iommu_fsl_pamu.c, where you already know you have an FSL
> IOMMU, and you can just call FSL-specific interfaces directly.

Thanks Bjorn,

I want to be sure of what you are suggesting.

What I understood is that we define these (struct msi_region, msi_get_region(), msi_get_region_count()) in arch/powerpc/include/fsl_msi.h (a new file) and include that header file directly in drivers/vfio/vfio_iommu_fsl_pamu.c.

Does the same also apply to msi_set_iova() in patch 5?

-Bharat

> 
> Bjorn
> 
> >
> >  struct msi_chip {
> >  	struct module *owner;
> > --
> > 1.7.0.4
> >
> >



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
  2013-11-25  5:33           ` Bharat Bhushan
  2013-11-25 16:38             ` Alex Williamson
@ 2013-12-06  0:00             ` Scott Wood
  2013-12-06  4:17               ` Bharat Bhushan
  1 sibling, 1 reply; 34+ messages in thread
From: Scott Wood @ 2013-12-06  0:00 UTC (permalink / raw)
  To: Bharat Bhushan
  Cc: Alex Williamson, linux-pci, agraf, Yoder Stuart-B08248, iommu,
	bhelgaas, linuxppc-dev, linux-kernel

On Sun, 2013-11-24 at 23:33 -0600, Bharat Bhushan wrote:
> 
> > -----Original Message-----
> > From: Alex Williamson [mailto:alex.williamson@redhat.com]
> > Sent: Friday, November 22, 2013 2:31 AM
> > To: Wood Scott-B07421
> > Cc: Bhushan Bharat-R65777; linux-pci@vger.kernel.org; agraf@suse.de; Yoder
> > Stuart-B08248; iommu@lists.linux-foundation.org; bhelgaas@google.com; linuxppc-
> > dev@lists.ozlabs.org; linux-kernel@vger.kernel.org
> > Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
> >
> > On Thu, 2013-11-21 at 14:47 -0600, Scott Wood wrote:
> > > On Thu, 2013-11-21 at 13:43 -0700, Alex Williamson wrote:
> > > > On Thu, 2013-11-21 at 11:20 +0000, Bharat Bhushan wrote:
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Alex Williamson [mailto:alex.williamson@redhat.com]
> > > > > > Sent: Thursday, November 21, 2013 12:17 AM
> > > > > > To: Bhushan Bharat-R65777
> > > > > > Cc: joro@8bytes.org; bhelgaas@google.com; agraf@suse.de; Wood
> > > > > > Scott-B07421; Yoder Stuart-B08248;
> > > > > > iommu@lists.linux-foundation.org; linux- pci@vger.kernel.org;
> > > > > > linuxppc-dev@lists.ozlabs.org; linux- kernel@vger.kernel.org;
> > > > > > Bhushan Bharat-R65777
> > > > > > Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale
> > > > > > IOMMU (PAMU)
> > > > > >
> > > > > > Is VFIO_IOMMU_PAMU_GET_MSI_BANK_COUNT per aperture (ie. each
> > > > > > vfio user has $COUNT regions at their disposal exclusively)?
> > > > >
> > > > > Number of msi-bank count is system wide and not per aperture, But will be
> > setting windows for banks in the device aperture.
> > > > > So say if we are direct assigning 2 pci device (both have different iommu
> > group, so 2 aperture in iommu) to VM.
> > > > > Now qemu can make only one call to know how many msi-banks are there but
> > it must set sub-windows for all banks for both pci device in its respective
> > aperture.
> > > >
> > > > I'm still confused.  What I want to make sure of is that the banks
> > > > are independent per aperture.  For instance, if we have two separate
> > > > userspace processes operating independently and they both chose to
> > > > use msi bank zero for their device, that's bank zero within each
> > > > aperture and doesn't interfere.  Or another way to ask is can a
> > > > malicious user interfere with other users by using the wrong bank.
> > > > Thanks,
> > >
> > > They can interfere.
> 
> Want to be sure of how they can interfere?

If more than one VFIO user shares the same MSI group, one of the users
can send MSIs to another user, by using the wrong interrupt within the
bank.  Unexpected MSIs could cause misbehavior or denial of service.

> >>  With this hardware, the only way to prevent that
> > > is to make sure that a bank is not shared by multiple protection contexts.
> > > For some of our users, though, I believe preventing this is less
> > > important than the performance benefit.
> 
> So should we let this patch series in without protection?

No, there should be some sort of opt-in mechanism similar to IOMMU-less
VFIO -- but not the same exact one, since one is a much more serious
loss of isolation than the other.

> > I think we need some sort of ownership model around the msi banks then.
> > Otherwise there's nothing preventing another userspace from attempting an MSI
> > based attack on other users, or perhaps even on the host.  VFIO can't allow
> > that.  Thanks,
> 
> We have very few (3 MSI bank on most of chips), so we can not assign
> one to each userspace.

That depends on how many users there are.

-Scott




^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
  2013-11-28  9:19               ` Bharat Bhushan
@ 2013-12-06  0:21                 ` Scott Wood
  2013-12-06  4:11                   ` Bharat Bhushan
  0 siblings, 1 reply; 34+ messages in thread
From: Scott Wood @ 2013-12-06  0:21 UTC (permalink / raw)
  To: Bharat Bhushan
  Cc: Alex Williamson, linux-pci, agraf, Yoder Stuart-B08248, iommu,
	bhelgaas, linuxppc-dev, linux-kernel

On Thu, 2013-11-28 at 03:19 -0600, Bharat Bhushan wrote:
> 
> > -----Original Message-----
> > From: Bhushan Bharat-R65777
> > Sent: Wednesday, November 27, 2013 9:39 PM
> > To: 'Alex Williamson'
> > Cc: Wood Scott-B07421; linux-pci@vger.kernel.org; agraf@suse.de; Yoder Stuart-
> > B08248; iommu@lists.linux-foundation.org; bhelgaas@google.com; linuxppc-
> > dev@lists.ozlabs.org; linux-kernel@vger.kernel.org
> > Subject: RE: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
> >
> >
> >
> > > -----Original Message-----
> > > From: Alex Williamson [mailto:alex.williamson@redhat.com]
> > > Sent: Monday, November 25, 2013 10:08 PM
> > > To: Bhushan Bharat-R65777
> > > Cc: Wood Scott-B07421; linux-pci@vger.kernel.org; agraf@suse.de; Yoder
> > > Stuart- B08248; iommu@lists.linux-foundation.org; bhelgaas@google.com;
> > > linuxppc- dev@lists.ozlabs.org; linux-kernel@vger.kernel.org
> > > Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU
> > > (PAMU)
> > >
> > > On Mon, 2013-11-25 at 05:33 +0000, Bharat Bhushan wrote:
> > > >
> > > > > -----Original Message-----
> > > > > From: Alex Williamson [mailto:alex.williamson@redhat.com]
> > > > > Sent: Friday, November 22, 2013 2:31 AM
> > > > > To: Wood Scott-B07421
> > > > > Cc: Bhushan Bharat-R65777; linux-pci@vger.kernel.org;
> > > > > agraf@suse.de; Yoder Stuart-B08248;
> > > > > iommu@lists.linux-foundation.org; bhelgaas@google.com; linuxppc-
> > > > > dev@lists.ozlabs.org; linux-kernel@vger.kernel.org
> > > > > Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale
> > > > > IOMMU (PAMU)
> > > > >
> > > > > On Thu, 2013-11-21 at 14:47 -0600, Scott Wood wrote:
> > > > > > On Thu, 2013-11-21 at 13:43 -0700, Alex Williamson wrote:
> > > > > > > On Thu, 2013-11-21 at 11:20 +0000, Bharat Bhushan wrote:
> > > > > > > >
> > > > > > > > > -----Original Message-----
> > > > > > > > > From: Alex Williamson [mailto:alex.williamson@redhat.com]
> > > > > > > > > Sent: Thursday, November 21, 2013 12:17 AM
> > > > > > > > > To: Bhushan Bharat-R65777
> > > > > > > > > Cc: joro@8bytes.org; bhelgaas@google.com; agraf@suse.de;
> > > > > > > > > Wood Scott-B07421; Yoder Stuart-B08248;
> > > > > > > > > iommu@lists.linux-foundation.org; linux-
> > > > > > > > > pci@vger.kernel.org; linuxppc-dev@lists.ozlabs.org; linux-
> > > > > > > > > kernel@vger.kernel.org; Bhushan Bharat-R65777
> > > > > > > > > Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for
> > > > > > > > > Freescale IOMMU (PAMU)
> > > > > > > > >
> > > > > > > > > Is VFIO_IOMMU_PAMU_GET_MSI_BANK_COUNT per aperture (ie.
> > > > > > > > > each vfio user has $COUNT regions at their disposal exclusively)?
> > > > > > > >
> > > > > > > > Number of msi-bank count is system wide and not per
> > > > > > > > aperture, But will be
> > > > > setting windows for banks in the device aperture.
> > > > > > > > So say if we are direct assigning 2 pci device (both have
> > > > > > > > different iommu
> > > > > group, so 2 aperture in iommu) to VM.
> > > > > > > > Now qemu can make only one call to know how many msi-banks
> > > > > > > > are there but
> > > > > it must set sub-windows for all banks for both pci device in its
> > > > > respective aperture.
> > > > > > >
> > > > > > > I'm still confused.  What I want to make sure of is that the
> > > > > > > banks are independent per aperture.  For instance, if we have
> > > > > > > two separate userspace processes operating independently and
> > > > > > > they both chose to use msi bank zero for their device, that's
> > > > > > > bank zero within each aperture and doesn't interfere.  Or
> > > > > > > another way to ask is can a malicious user interfere with
> > > > > > > other users by
> > > using the wrong bank.
> > > > > > > Thanks,
> > > > > >
> > > > > > They can interfere.
> > > >
> > > > Want to be sure of how they can interfere?
> > >
> > > What happens if more than one user selects the same MSI bank?
> > > Minimally, wouldn't that result in the IOMMU blocking transactions
> > > from the previous user once the new user activates their mapping?
> >
> > Yes and no; With current implementation yes but with a minor change no. Later in
> > this response I will explain how.
> >
> > >
> > > > >>  With this hardware, the only way to prevent that
> > > > > > is to make sure that a bank is not shared by multiple protection
> > contexts.
> > > > > > For some of our users, though, I believe preventing this is less
> > > > > > important than the performance benefit.
> > > >
> > > > So should we let this patch series in without protection?
> > >
> > > No.
> > >
> > > > >
> > > > > I think we need some sort of ownership model around the msi banks then.
> > > > > Otherwise there's nothing preventing another userspace from
> > > > > attempting an MSI based attack on other users, or perhaps even on
> > > > > the host.  VFIO can't allow that.  Thanks,
> > > >
> > > > We have very few (3 MSI bank on most of chips), so we can not assign
> > > > one to each userspace. What we can do is host and userspace does not
> > > > share a MSI bank while userspace will share a MSI bank.
> > >
> > > Then you probably need VFIO to "own" the MSI bank and program devices
> > > into it rather than exposing the MSI banks to userspace to let them have
> > direct access.
> >
> > Overall idea of exposing the details of msi regions to userspace are
> >  1) User space can define the aperture size to fit MSI mapping in IOMMU.
> >  2) setup iova for a MSI banks; which is just after guest memory.
> >
> > But currently we expose the "size" and "address" of MSI banks, passing address
> > is of no use and can be problematic.
> 
> I am sorry, above information is not correct. Currently neither we expose "address" nor "size" to user space. We only expose number of MSI BANK count and userspace adds one sub-window for each bank.
> 
> > If we just provide the size of MSI bank to userspace then userspace cannot do
> > anything wrong.
> 
> So userspace does not know address, so it cannot mmap and cause any interference by directly reading/writing.

That's security through obscurity...  Couldn't the malicious user find
out the address via other means, such as experimentation on another
system over which they have full control?  What would happen if the user
reads from their device's PCI config space?  Or gets the information via
some back door in the PCI device they own?  Or pokes throughout the
address space looking for something that generates an interrupt to its
own device?

-Scott




^ permalink raw reply	[flat|nested] 34+ messages in thread

* RE: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
  2013-12-06  0:21                 ` Scott Wood
@ 2013-12-06  4:11                   ` Bharat Bhushan
  2013-12-06 18:59                     ` Scott Wood
  0 siblings, 1 reply; 34+ messages in thread
From: Bharat Bhushan @ 2013-12-06  4:11 UTC (permalink / raw)
  To: Scott Wood
  Cc: Alex Williamson, linux-pci, agraf, Stuart Yoder, iommu, bhelgaas,
	linuxppc-dev, linux-kernel




> -----Original Message-----
> From: Wood Scott-B07421
> Sent: Friday, December 06, 2013 5:52 AM
> To: Bhushan Bharat-R65777
> Cc: Alex Williamson; linux-pci@vger.kernel.org; agraf@suse.de; Yoder Stuart-
> B08248; iommu@lists.linux-foundation.org; bhelgaas@google.com; linuxppc-
> dev@lists.ozlabs.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
> 
> On Thu, 2013-11-28 at 03:19 -0600, Bharat Bhushan wrote:
> >
> > > -----Original Message-----
> > > From: Bhushan Bharat-R65777
> > > Sent: Wednesday, November 27, 2013 9:39 PM
> > > To: 'Alex Williamson'
> > > Cc: Wood Scott-B07421; linux-pci@vger.kernel.org; agraf@suse.de;
> > > Yoder Stuart- B08248; iommu@lists.linux-foundation.org;
> > > bhelgaas@google.com; linuxppc- dev@lists.ozlabs.org;
> > > linux-kernel@vger.kernel.org
> > > Subject: RE: [PATCH 0/9 v2] vfio-pci: add support for Freescale
> > > IOMMU (PAMU)
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: Alex Williamson [mailto:alex.williamson@redhat.com]
> > > > Sent: Monday, November 25, 2013 10:08 PM
> > > > To: Bhushan Bharat-R65777
> > > > Cc: Wood Scott-B07421; linux-pci@vger.kernel.org; agraf@suse.de;
> > > > Yoder
> > > > Stuart- B08248; iommu@lists.linux-foundation.org;
> > > > bhelgaas@google.com;
> > > > linuxppc- dev@lists.ozlabs.org; linux-kernel@vger.kernel.org
> > > > Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale
> > > > IOMMU
> > > > (PAMU)
> > > >
> > > > On Mon, 2013-11-25 at 05:33 +0000, Bharat Bhushan wrote:
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Alex Williamson [mailto:alex.williamson@redhat.com]
> > > > > > Sent: Friday, November 22, 2013 2:31 AM
> > > > > > To: Wood Scott-B07421
> > > > > > Cc: Bhushan Bharat-R65777; linux-pci@vger.kernel.org;
> > > > > > agraf@suse.de; Yoder Stuart-B08248;
> > > > > > iommu@lists.linux-foundation.org; bhelgaas@google.com;
> > > > > > linuxppc- dev@lists.ozlabs.org; linux-kernel@vger.kernel.org
> > > > > > Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for
> > > > > > Freescale IOMMU (PAMU)
> > > > > >
> > > > > > On Thu, 2013-11-21 at 14:47 -0600, Scott Wood wrote:
> > > > > > > On Thu, 2013-11-21 at 13:43 -0700, Alex Williamson wrote:
> > > > > > > > On Thu, 2013-11-21 at 11:20 +0000, Bharat Bhushan wrote:
> > > > > > > > >
> > > > > > > > > > -----Original Message-----
> > > > > > > > > > From: Alex Williamson
> > > > > > > > > > [mailto:alex.williamson@redhat.com]
> > > > > > > > > > Sent: Thursday, November 21, 2013 12:17 AM
> > > > > > > > > > To: Bhushan Bharat-R65777
> > > > > > > > > > Cc: joro@8bytes.org; bhelgaas@google.com;
> > > > > > > > > > agraf@suse.de; Wood Scott-B07421; Yoder Stuart-B08248;
> > > > > > > > > > iommu@lists.linux-foundation.org; linux-
> > > > > > > > > > pci@vger.kernel.org; linuxppc-dev@lists.ozlabs.org;
> > > > > > > > > > linux- kernel@vger.kernel.org; Bhushan Bharat-R65777
> > > > > > > > > > Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for
> > > > > > > > > > Freescale IOMMU (PAMU)
> > > > > > > > > >
> > > > > > > > > > Is VFIO_IOMMU_PAMU_GET_MSI_BANK_COUNT per aperture (ie.
> > > > > > > > > > each vfio user has $COUNT regions at their disposal
> exclusively)?
> > > > > > > > >
> > > > > > > > > Number of msi-bank count is system wide and not per
> > > > > > > > > aperture, But will be
> > > > > > setting windows for banks in the device aperture.
> > > > > > > > > So say if we are direct assigning 2 pci device (both
> > > > > > > > > have different iommu
> > > > > > group, so 2 aperture in iommu) to VM.
> > > > > > > > > Now qemu can make only one call to know how many
> > > > > > > > > msi-banks are there but
> > > > > > it must set sub-windows for all banks for both pci device in
> > > > > > its respective aperture.
> > > > > > > >
> > > > > > > > I'm still confused.  What I want to make sure of is that
> > > > > > > > the banks are independent per aperture.  For instance, if
> > > > > > > > we have two separate userspace processes operating
> > > > > > > > independently and they both chose to use msi bank zero for
> > > > > > > > their device, that's bank zero within each aperture and
> > > > > > > > doesn't interfere.  Or another way to ask is can a
> > > > > > > > malicious user interfere with other users by
> > > > using the wrong bank.
> > > > > > > > Thanks,
> > > > > > >
> > > > > > > They can interfere.
> > > > >
> > > > > Want to be sure of how they can interfere?
> > > >
> > > > What happens if more than one user selects the same MSI bank?
> > > > Minimally, wouldn't that result in the IOMMU blocking transactions
> > > > from the previous user once the new user activates their mapping?
> > >
> > > Yes and no; With current implementation yes but with a minor change
> > > no. Later in this response I will explain how.
> > >
> > > >
> > > > > >>  With this hardware, the only way to prevent that
> > > > > > > is to make sure that a bank is not shared by multiple
> > > > > > > protection
> > > contexts.
> > > > > > > For some of our users, though, I believe preventing this is
> > > > > > > less important than the performance benefit.
> > > > >
> > > > > So should we let this patch series in without protection?
> > > >
> > > > No.
> > > >
> > > > > >
> > > > > > I think we need some sort of ownership model around the msi banks
> then.
> > > > > > Otherwise there's nothing preventing another userspace from
> > > > > > attempting an MSI based attack on other users, or perhaps even
> > > > > > on the host.  VFIO can't allow that.  Thanks,
> > > > >
> > > > > We have very few (3 MSI bank on most of chips), so we can not
> > > > > assign one to each userspace. What we can do is host and
> > > > > userspace does not share a MSI bank while userspace will share a MSI
> bank.
> > > >
> > > > Then you probably need VFIO to "own" the MSI bank and program
> > > > devices into it rather than exposing the MSI banks to userspace to
> > > > let them have
> > > direct access.
> > >
> > > Overall idea of exposing the details of msi regions to userspace are
> > >  1) User space can define the aperture size to fit MSI mapping in IOMMU.
> > >  2) setup iova for a MSI banks; which is just after guest memory.
> > >
> > > But currently we expose the "size" and "address" of MSI banks,
> > > passing address is of no use and can be problematic.
> >
> > I am sorry, above information is not correct. Currently neither we expose
> "address" nor "size" to user space. We only expose number of MSI BANK count and
> userspace adds one sub-window for each bank.
> >
> > > If we just provide the size of MSI bank to userspace then userspace
> > > cannot do anything wrong.
> >
> > So userspace does not know address, so it cannot mmap and cause any
> interference by directly reading/writing.
> 
> That's security through obscurity...  Couldn't the malicious user find out the
> address via other means, such as experimentation on another system over which
> they have full control?  What would happen if the user reads from their device's
> PCI config space?  Or gets the information via some back door in the PCI device
> they own?  Or pokes throughout the address space looking for something that
> generates an interrupt to its own device?

So how do we solve this problem? Any suggestions?

We have to map one window in the PAMU for MSIs, and a malicious user can ask its device to DMA to the MSI window region with any pair of address and data, which can lead to unexpected MSIs in the system.

Thanks
-Bharat

> 
> -Scott
> 


^ permalink raw reply	[flat|nested] 34+ messages in thread

* RE: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
  2013-12-06  0:00             ` Scott Wood
@ 2013-12-06  4:17               ` Bharat Bhushan
  2013-12-06 19:25                 ` Scott Wood
  0 siblings, 1 reply; 34+ messages in thread
From: Bharat Bhushan @ 2013-12-06  4:17 UTC (permalink / raw)
  To: Scott Wood
  Cc: Alex Williamson, linux-pci, agraf, Stuart Yoder, iommu, bhelgaas,
	linuxppc-dev, linux-kernel




> -----Original Message-----
> From: Wood Scott-B07421
> Sent: Friday, December 06, 2013 5:31 AM
> To: Bhushan Bharat-R65777
> Cc: Alex Williamson; linux-pci@vger.kernel.org; agraf@suse.de; Yoder Stuart-
> B08248; iommu@lists.linux-foundation.org; bhelgaas@google.com; linuxppc-
> dev@lists.ozlabs.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
> 
> On Sun, 2013-11-24 at 23:33 -0600, Bharat Bhushan wrote:
> >
> > > -----Original Message-----
> > > From: Alex Williamson [mailto:alex.williamson@redhat.com]
> > > Sent: Friday, November 22, 2013 2:31 AM
> > > To: Wood Scott-B07421
> > > Cc: Bhushan Bharat-R65777; linux-pci@vger.kernel.org; agraf@suse.de;
> > > Yoder Stuart-B08248; iommu@lists.linux-foundation.org;
> > > bhelgaas@google.com; linuxppc- dev@lists.ozlabs.org;
> > > linux-kernel@vger.kernel.org
> > > Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale
> > > IOMMU (PAMU)
> > >
> > > On Thu, 2013-11-21 at 14:47 -0600, Scott Wood wrote:
> > > > On Thu, 2013-11-21 at 13:43 -0700, Alex Williamson wrote:
> > > > > On Thu, 2013-11-21 at 11:20 +0000, Bharat Bhushan wrote:
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Alex Williamson [mailto:alex.williamson@redhat.com]
> > > > > > > Sent: Thursday, November 21, 2013 12:17 AM
> > > > > > > To: Bhushan Bharat-R65777
> > > > > > > Cc: joro@8bytes.org; bhelgaas@google.com; agraf@suse.de;
> > > > > > > Wood Scott-B07421; Yoder Stuart-B08248;
> > > > > > > iommu@lists.linux-foundation.org; linux-
> > > > > > > pci@vger.kernel.org; linuxppc-dev@lists.ozlabs.org; linux-
> > > > > > > kernel@vger.kernel.org; Bhushan Bharat-R65777
> > > > > > > Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for
> > > > > > > Freescale IOMMU (PAMU)
> > > > > > >
> > > > > > > Is VFIO_IOMMU_PAMU_GET_MSI_BANK_COUNT per aperture (ie. each
> > > > > > > vfio user has $COUNT regions at their disposal exclusively)?
> > > > > >
> > > > > > Number of msi-bank count is system wide and not per aperture,
> > > > > > But will be
> > > setting windows for banks in the device aperture.
> > > > > > So say if we are direct assigning 2 pci device (both have
> > > > > > different iommu
> > > group, so 2 aperture in iommu) to VM.
> > > > > > Now qemu can make only one call to know how many msi-banks are
> > > > > > there but
> > > it must set sub-windows for all banks for both pci device in its
> > > respective aperture.
> > > > >
> > > > > I'm still confused.  What I want to make sure of is that the
> > > > > banks are independent per aperture.  For instance, if we have
> > > > > two separate userspace processes operating independently and
> > > > > they both chose to use msi bank zero for their device, that's
> > > > > bank zero within each aperture and doesn't interfere.  Or
> > > > > another way to ask is can a malicious user interfere with other users by
> using the wrong bank.
> > > > > Thanks,
> > > >
> > > > They can interfere.
> >
> > Want to be sure of how they can interfere?
> 
> If more than one VFIO user shares the same MSI group, one of the users can send
> MSIs to another user, by using the wrong interrupt within the bank.  Unexpected
> MSIs could cause misbehavior or denial of service.
> 
> > >>  With this hardware, the only way to prevent that
> > > > is to make sure that a bank is not shared by multiple protection contexts.
> > > > For some of our users, though, I believe preventing this is less
> > > > important than the performance benefit.
> >
> > So should we let this patch series in without protection?
> 
> No, there should be some sort of opt-in mechanism similar to IOMMU-less VFIO --
> but not the same exact one, since one is a much more serious loss of isolation
> than the other.

Can you please elaborate on the "opt-in mechanism"?

> 
> > > I think we need some sort of ownership model around the msi banks then.
> > > Otherwise there's nothing preventing another userspace from
> > > attempting an MSI based attack on other users, or perhaps even on
> > > the host.  VFIO can't allow that.  Thanks,
> >
> > We have very few (3 MSI bank on most of chips), so we can not assign
> > one to each userspace.
> 
> That depends on how many users there are.

What I think we can do is:
 - Reserve one MSI region for the host; the host will not share an MSI region with any guest.
 - For up to 2 guests (max MSI banks minus the one kept for the host), give each its own separate MSI region.
 - Additional guests will share MSI regions with the other guests.

Any better suggestions are most welcome.

Thanks
-Bharat
> 
> -Scott
> 


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
  2013-12-06  4:11                   ` Bharat Bhushan
@ 2013-12-06 18:59                     ` Scott Wood
  2013-12-06 19:30                       ` Alex Williamson
  0 siblings, 1 reply; 34+ messages in thread
From: Scott Wood @ 2013-12-06 18:59 UTC (permalink / raw)
  To: Bharat Bhushan
  Cc: Alex Williamson, linux-pci, agraf, Yoder Stuart-B08248, iommu,
	bhelgaas, linuxppc-dev, linux-kernel

On Thu, 2013-12-05 at 22:11 -0600, Bharat Bhushan wrote:
> 
> > -----Original Message-----
> > From: Wood Scott-B07421
> > Sent: Friday, December 06, 2013 5:52 AM
> > To: Bhushan Bharat-R65777
> > Cc: Alex Williamson; linux-pci@vger.kernel.org; agraf@suse.de; Yoder Stuart-
> > B08248; iommu@lists.linux-foundation.org; bhelgaas@google.com; linuxppc-
> > dev@lists.ozlabs.org; linux-kernel@vger.kernel.org
> > Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
> >
> > On Thu, 2013-11-28 at 03:19 -0600, Bharat Bhushan wrote:
> > >
> > > > -----Original Message-----
> > > > From: Bhushan Bharat-R65777
> > > > Sent: Wednesday, November 27, 2013 9:39 PM
> > > > To: 'Alex Williamson'
> > > > Cc: Wood Scott-B07421; linux-pci@vger.kernel.org; agraf@suse.de;
> > > > Yoder Stuart- B08248; iommu@lists.linux-foundation.org;
> > > > bhelgaas@google.com; linuxppc- dev@lists.ozlabs.org;
> > > > linux-kernel@vger.kernel.org
> > > > Subject: RE: [PATCH 0/9 v2] vfio-pci: add support for Freescale
> > > > IOMMU (PAMU)
> > > >
> > > > If we just provide the size of MSI bank to userspace then userspace
> > > > cannot do anything wrong.
> > >
> > > So userspace does not know address, so it cannot mmap and cause any
> > interference by directly reading/writing.
> >
> > That's security through obscurity...  Couldn't the malicious user find out the
> > address via other means, such as experimentation on another system over which
> > they have full control?  What would happen if the user reads from their device's
> > PCI config space?  Or gets the information via some back door in the PCI device
> > they own?  Or pokes throughout the address space looking for something that
> > generates an interrupt to its own device?
> 
> So how to solve this problem, Any suggestion ?
> 
> We have to map one window in PAMU for MSIs and a malicious user can ask
> its device to do DMA to MSI window region with any pair of address and
> data, which can lead to unexpected MSIs in system?

I don't think there are any solutions other than to limit each bank to
one user, unless the admin turns some knob that says they're OK with the
partial loss of isolation.

-Scott




^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
  2013-12-06  4:17               ` Bharat Bhushan
@ 2013-12-06 19:25                 ` Scott Wood
       [not found]                   ` <34c6142137194b1cbd2336013ed3b10d@BN1PR03MB266.namprd03.prod.outlook.com>
  0 siblings, 1 reply; 34+ messages in thread
From: Scott Wood @ 2013-12-06 19:25 UTC (permalink / raw)
  To: Bharat Bhushan
  Cc: Alex Williamson, linux-pci, agraf, Yoder Stuart-B08248, iommu,
	bhelgaas, linuxppc-dev, linux-kernel

On Thu, 2013-12-05 at 22:17 -0600, Bharat Bhushan wrote:
> 
> > -----Original Message-----
> > From: Wood Scott-B07421
> > Sent: Friday, December 06, 2013 5:31 AM
> > To: Bhushan Bharat-R65777
> > Cc: Alex Williamson; linux-pci@vger.kernel.org; agraf@suse.de; Yoder Stuart-
> > B08248; iommu@lists.linux-foundation.org; bhelgaas@google.com; linuxppc-
> > dev@lists.ozlabs.org; linux-kernel@vger.kernel.org
> > Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
> >
> > On Sun, 2013-11-24 at 23:33 -0600, Bharat Bhushan wrote:
> > >
> > > > -----Original Message-----
> > > > From: Alex Williamson [mailto:alex.williamson@redhat.com]
> > > > Sent: Friday, November 22, 2013 2:31 AM
> > > > To: Wood Scott-B07421
> > > > Cc: Bhushan Bharat-R65777; linux-pci@vger.kernel.org; agraf@suse.de;
> > > > Yoder Stuart-B08248; iommu@lists.linux-foundation.org;
> > > > bhelgaas@google.com; linuxppc- dev@lists.ozlabs.org;
> > > > linux-kernel@vger.kernel.org
> > > > Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale
> > > > IOMMU (PAMU)
> > > >
> > > > On Thu, 2013-11-21 at 14:47 -0600, Scott Wood wrote:
> > > > > They can interfere.
> > >
> > > Want to be sure of how they can interfere?
> >
> > If more than one VFIO user shares the same MSI group, one of the users can send
> > MSIs to another user, by using the wrong interrupt within the bank.  Unexpected
> > MSIs could cause misbehavior or denial of service.
> >
> > > >>  With this hardware, the only way to prevent that
> > > > > is to make sure that a bank is not shared by multiple protection contexts.
> > > > > For some of our users, though, I believe preventing this is less
> > > > > important than the performance benefit.
> > >
> > > So should we let this patch series in without protection?
> >
> > No, there should be some sort of opt-in mechanism similar to IOMMU-less VFIO --
> > but not the same exact one, since one is a much more serious loss of isolation
> > than the other.
> 
> Can you please elaborate on the "opt-in mechanism"?

The system should be secure by default.  If the administrator wants to
relax protection in order to accomplish some functionality, that should
require an explicit request such as a write to a sysfs file.
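For comparison, the existing x86 VFIO type1 IOMMU driver already exposes exactly this kind of knob for systems without interrupt remapping; a PAMU opt-in could follow the same pattern (the second path below is hypothetical, shown only as a sketch):

```shell
# Existing precedent: x86 VFIO opt-in for systems lacking interrupt remapping
echo 1 > /sys/module/vfio_iommu_type1/parameters/allow_unsafe_interrupts

# A hypothetical PAMU equivalent might look like:
# echo 1 > /sys/module/vfio_iommu_fsl_pamu/parameters/allow_unsafe_interrupts
```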

> > > > I think we need some sort of ownership model around the msi banks then.
> > > > Otherwise there's nothing preventing another userspace from
> > > > attempting an MSI based attack on other users, or perhaps even on
> > > > the host.  VFIO can't allow that.  Thanks,
> > >
> > > We have very few MSI banks (3 on most chips), so we cannot assign
> > > one to each userspace.
> >
> > That depends on how many users there are.
> 
> What I think we can do is:
>  - Reserve one MSI region for the host. The host will not share its MSI region with guests.
>  - For up to 2 guests (max MSI banks minus the host's one), give them separate MSI sub-regions
>  - Additional guests will share MSI regions with other guests.
> 
> Any better suggestions are most welcome.

If the administrator does not opt into this partial loss of isolation,
then once you run out of MSI groups, new users should not be able to set
up MSIs.

-Scott





* Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
  2013-12-06 18:59                     ` Scott Wood
@ 2013-12-06 19:30                       ` Alex Williamson
  2013-12-07  0:22                         ` Scott Wood
  2013-12-10  5:37                         ` Bharat.Bhushan
  0 siblings, 2 replies; 34+ messages in thread
From: Alex Williamson @ 2013-12-06 19:30 UTC (permalink / raw)
  To: Scott Wood
  Cc: Bharat Bhushan, linux-pci, agraf, Yoder Stuart-B08248, iommu,
	bhelgaas, linuxppc-dev, linux-kernel

On Fri, 2013-12-06 at 12:59 -0600, Scott Wood wrote:
> On Thu, 2013-12-05 at 22:11 -0600, Bharat Bhushan wrote:
> > 
> > > -----Original Message-----
> > > From: Wood Scott-B07421
> > > Sent: Friday, December 06, 2013 5:52 AM
> > > To: Bhushan Bharat-R65777
> > > Cc: Alex Williamson; linux-pci@vger.kernel.org; agraf@suse.de; Yoder Stuart-
> > > B08248; iommu@lists.linux-foundation.org; bhelgaas@google.com; linuxppc-
> > > dev@lists.ozlabs.org; linux-kernel@vger.kernel.org
> > > Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
> > >
> > > On Thu, 2013-11-28 at 03:19 -0600, Bharat Bhushan wrote:
> > > >
> > > > > -----Original Message-----
> > > > > From: Bhushan Bharat-R65777
> > > > > Sent: Wednesday, November 27, 2013 9:39 PM
> > > > > To: 'Alex Williamson'
> > > > > Cc: Wood Scott-B07421; linux-pci@vger.kernel.org; agraf@suse.de;
> > > > > Yoder Stuart- B08248; iommu@lists.linux-foundation.org;
> > > > > bhelgaas@google.com; linuxppc- dev@lists.ozlabs.org;
> > > > > linux-kernel@vger.kernel.org
> > > > > Subject: RE: [PATCH 0/9 v2] vfio-pci: add support for Freescale
> > > > > IOMMU (PAMU)
> > > > >
> > > > > If we just provide the size of MSI bank to userspace then userspace
> > > > > cannot do anything wrong.
> > > >
> > > > So userspace does not know address, so it cannot mmap and cause any
> > > interference by directly reading/writing.
> > >
> > > That's security through obscurity...  Couldn't the malicious user find out the
> > > address via other means, such as experimentation on another system over which
> > > they have full control?  What would happen if the user reads from their device's
> > > PCI config space?  Or gets the information via some back door in the PCI device
> > > they own?  Or pokes throughout the address space looking for something that
> > > generates an interrupt to its own device?
> > 
> > So how to solve this problem, Any suggestion ?
> > 
> > We have to map one window in PAMU for MSIs and a malicious user can ask
> > its device to do DMA to MSI window region with any pair of address and
> > data, which can lead to unexpected MSIs in system?
> 
> I don't think there are any solutions other than to limit each bank to
> one user, unless the admin turns some knob that says they're OK with the
> partial loss of isolation.

Even if the admin does opt-in to an allow_unsafe_interrupts options, it
should still be reasonably difficult for one guest to interfere with the
other.  I don't think we want to rely on the blind luck of making the
full MSI bank accessible to multiple guests and hoping they don't step
on each other.  That probably means that vfio needs to manage the space
rather than the guest.  Thanks,

Alex



* Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
  2013-12-06 19:30                       ` Alex Williamson
@ 2013-12-07  0:22                         ` Scott Wood
  2013-12-10  5:37                         ` Bharat.Bhushan
  1 sibling, 0 replies; 34+ messages in thread
From: Scott Wood @ 2013-12-07  0:22 UTC (permalink / raw)
  To: Alex Williamson
  Cc: linux-pci, agraf, Yoder Stuart-B08248, iommu, bhelgaas,
	linuxppc-dev, linux-kernel

On Fri, 2013-12-06 at 12:30 -0700, Alex Williamson wrote:
> On Fri, 2013-12-06 at 12:59 -0600, Scott Wood wrote:
> > On Thu, 2013-12-05 at 22:11 -0600, Bharat Bhushan wrote:
> > > 
> > > > -----Original Message-----
> > > > From: Wood Scott-B07421
> > > > Sent: Friday, December 06, 2013 5:52 AM
> > > > To: Bhushan Bharat-R65777
> > > > Cc: Alex Williamson; linux-pci@vger.kernel.org; agraf@suse.de; Yoder Stuart-
> > > > B08248; iommu@lists.linux-foundation.org; bhelgaas@google.com; linuxppc-
> > > > dev@lists.ozlabs.org; linux-kernel@vger.kernel.org
> > > > Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
> > > >
> > > > On Thu, 2013-11-28 at 03:19 -0600, Bharat Bhushan wrote:
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Bhushan Bharat-R65777
> > > > > > Sent: Wednesday, November 27, 2013 9:39 PM
> > > > > > To: 'Alex Williamson'
> > > > > > Cc: Wood Scott-B07421; linux-pci@vger.kernel.org; agraf@suse.de;
> > > > > > Yoder Stuart- B08248; iommu@lists.linux-foundation.org;
> > > > > > bhelgaas@google.com; linuxppc- dev@lists.ozlabs.org;
> > > > > > linux-kernel@vger.kernel.org
> > > > > > Subject: RE: [PATCH 0/9 v2] vfio-pci: add support for Freescale
> > > > > > IOMMU (PAMU)
> > > > > >
> > > > > > If we just provide the size of MSI bank to userspace then userspace
> > > > > > cannot do anything wrong.
> > > > >
> > > > > So userspace does not know address, so it cannot mmap and cause any
> > > > interference by directly reading/writing.
> > > >
> > > > That's security through obscurity...  Couldn't the malicious user find out the
> > > > address via other means, such as experimentation on another system over which
> > > > they have full control?  What would happen if the user reads from their device's
> > > > PCI config space?  Or gets the information via some back door in the PCI device
> > > > they own?  Or pokes throughout the address space looking for something that
> > > > generates an interrupt to its own device?
> > > 
> > > So how to solve this problem, Any suggestion ?
> > > 
> > > We have to map one window in PAMU for MSIs and a malicious user can ask
> > > its device to do DMA to MSI window region with any pair of address and
> > > data, which can lead to unexpected MSIs in system?
> > 
> > I don't think there are any solutions other than to limit each bank to
> > one user, unless the admin turns some knob that says they're OK with the
> > partial loss of isolation.
> 
> Even if the admin does opt-in to an allow_unsafe_interrupts options, it
> should still be reasonably difficult for one guest to interfere with the
> other.  I don't think we want to rely on the blind luck of making the
> full MSI bank accessible to multiple guests and hoping they don't step
> on each other.  That probably means that vfio needs to manage the space
> rather than the guest.  Thanks,

Yes, the MSIs within a given bank would be allocated by the host kernel
in any case (presumably by the MSI driver, not VFIO itself).  This is
just about what happens if the MSI page is written to outside of the
normal mechanism.

-Scott





* RE: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
  2013-12-06 19:30                       ` Alex Williamson
  2013-12-07  0:22                         ` Scott Wood
@ 2013-12-10  5:37                         ` Bharat.Bhushan
  2013-12-10  5:53                           ` Alex Williamson
  1 sibling, 1 reply; 34+ messages in thread
From: Bharat.Bhushan @ 2013-12-10  5:37 UTC (permalink / raw)
  To: Alex Williamson, Scott Wood
  Cc: Bharat.Bhushan, linux-pci, agraf, Stuart Yoder, iommu, bhelgaas,
	linuxppc-dev, linux-kernel




> -----Original Message-----
> From: Alex Williamson [mailto:alex.williamson@redhat.com]
> Sent: Saturday, December 07, 2013 1:00 AM
> To: Wood Scott-B07421
> Cc: Bhushan Bharat-R65777; linux-pci@vger.kernel.org; agraf@suse.de; Yoder
> Stuart-B08248; iommu@lists.linux-foundation.org; bhelgaas@google.com; linuxppc-
> dev@lists.ozlabs.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
> 
> On Fri, 2013-12-06 at 12:59 -0600, Scott Wood wrote:
> > On Thu, 2013-12-05 at 22:11 -0600, Bharat Bhushan wrote:
> > >
> > > > -----Original Message-----
> > > > From: Wood Scott-B07421
> > > > Sent: Friday, December 06, 2013 5:52 AM
> > > > To: Bhushan Bharat-R65777
> > > > Cc: Alex Williamson; linux-pci@vger.kernel.org; agraf@suse.de;
> > > > Yoder Stuart- B08248; iommu@lists.linux-foundation.org;
> > > > bhelgaas@google.com; linuxppc- dev@lists.ozlabs.org;
> > > > linux-kernel@vger.kernel.org
> > > > Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale
> > > > IOMMU (PAMU)
> > > >
> > > > On Thu, 2013-11-28 at 03:19 -0600, Bharat Bhushan wrote:
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Bhushan Bharat-R65777
> > > > > > Sent: Wednesday, November 27, 2013 9:39 PM
> > > > > > To: 'Alex Williamson'
> > > > > > Cc: Wood Scott-B07421; linux-pci@vger.kernel.org;
> > > > > > agraf@suse.de; Yoder Stuart- B08248;
> > > > > > iommu@lists.linux-foundation.org; bhelgaas@google.com;
> > > > > > linuxppc- dev@lists.ozlabs.org; linux-kernel@vger.kernel.org
> > > > > > Subject: RE: [PATCH 0/9 v2] vfio-pci: add support for
> > > > > > Freescale IOMMU (PAMU)
> > > > > >
> > > > > > If we just provide the size of MSI bank to userspace then
> > > > > > userspace cannot do anything wrong.
> > > > >
> > > > > So userspace does not know address, so it cannot mmap and cause
> > > > > any
> > > > interference by directly reading/writing.
> > > >
> > > > That's security through obscurity...  Couldn't the malicious user
> > > > find out the address via other means, such as experimentation on
> > > > another system over which they have full control?  What would
> > > > happen if the user reads from their device's PCI config space?  Or
> > > > gets the information via some back door in the PCI device they
> > > > own?  Or pokes throughout the address space looking for something that
> generates an interrupt to its own device?
> > >
> > > So how to solve this problem, Any suggestion ?
> > >
> > > We have to map one window in PAMU for MSIs and a malicious user can
> > > ask its device to do DMA to MSI window region with any pair of
> > > address and data, which can lead to unexpected MSIs in system?
> >
> > I don't think there are any solutions other than to limit each bank to
> > one user, unless the admin turns some knob that says they're OK with
> > the partial loss of isolation.
> 
> Even if the admin does opt-in to an allow_unsafe_interrupts options, it should
> still be reasonably difficult for one guest to interfere with the other.  I
> don't think we want to rely on the blind luck of making the full MSI bank
> accessible to multiple guests and hoping they don't step on each other.

I am not sure how to solve this case (sharing the MSI page).

>  That probably means that vfio needs to manage the space rather than the guest.

What do you mean by "vfio needs to manage the space rather than the guest"?

Thanks
-Bharat

> Thanks,
> 
> Alex
> 



* Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
  2013-12-10  5:37                         ` Bharat.Bhushan
@ 2013-12-10  5:53                           ` Alex Williamson
  2013-12-10  9:09                             ` Bharat.Bhushan
  0 siblings, 1 reply; 34+ messages in thread
From: Alex Williamson @ 2013-12-10  5:53 UTC (permalink / raw)
  To: Bharat.Bhushan
  Cc: Scott Wood, linux-pci, agraf, Stuart Yoder, iommu, bhelgaas,
	linuxppc-dev, linux-kernel

On Tue, 2013-12-10 at 05:37 +0000, Bharat.Bhushan@freescale.com wrote:
> 
> > -----Original Message-----
> > From: Alex Williamson [mailto:alex.williamson@redhat.com]
> > Sent: Saturday, December 07, 2013 1:00 AM
> > To: Wood Scott-B07421
> > Cc: Bhushan Bharat-R65777; linux-pci@vger.kernel.org; agraf@suse.de; Yoder
> > Stuart-B08248; iommu@lists.linux-foundation.org; bhelgaas@google.com; linuxppc-
> > dev@lists.ozlabs.org; linux-kernel@vger.kernel.org
> > Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
> > 
> > On Fri, 2013-12-06 at 12:59 -0600, Scott Wood wrote:
> > > On Thu, 2013-12-05 at 22:11 -0600, Bharat Bhushan wrote:
> > > >
> > > > > -----Original Message-----
> > > > > From: Wood Scott-B07421
> > > > > Sent: Friday, December 06, 2013 5:52 AM
> > > > > To: Bhushan Bharat-R65777
> > > > > Cc: Alex Williamson; linux-pci@vger.kernel.org; agraf@suse.de;
> > > > > Yoder Stuart- B08248; iommu@lists.linux-foundation.org;
> > > > > bhelgaas@google.com; linuxppc- dev@lists.ozlabs.org;
> > > > > linux-kernel@vger.kernel.org
> > > > > Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale
> > > > > IOMMU (PAMU)
> > > > >
> > > > > On Thu, 2013-11-28 at 03:19 -0600, Bharat Bhushan wrote:
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Bhushan Bharat-R65777
> > > > > > > Sent: Wednesday, November 27, 2013 9:39 PM
> > > > > > > To: 'Alex Williamson'
> > > > > > > Cc: Wood Scott-B07421; linux-pci@vger.kernel.org;
> > > > > > > agraf@suse.de; Yoder Stuart- B08248;
> > > > > > > iommu@lists.linux-foundation.org; bhelgaas@google.com;
> > > > > > > linuxppc- dev@lists.ozlabs.org; linux-kernel@vger.kernel.org
> > > > > > > Subject: RE: [PATCH 0/9 v2] vfio-pci: add support for
> > > > > > > Freescale IOMMU (PAMU)
> > > > > > >
> > > > > > > If we just provide the size of MSI bank to userspace then
> > > > > > > userspace cannot do anything wrong.
> > > > > >
> > > > > > So userspace does not know address, so it cannot mmap and cause
> > > > > > any
> > > > > interference by directly reading/writing.
> > > > >
> > > > > That's security through obscurity...  Couldn't the malicious user
> > > > > find out the address via other means, such as experimentation on
> > > > > another system over which they have full control?  What would
> > > > > happen if the user reads from their device's PCI config space?  Or
> > > > > gets the information via some back door in the PCI device they
> > > > > own?  Or pokes throughout the address space looking for something that
> > generates an interrupt to its own device?
> > > >
> > > > So how to solve this problem, Any suggestion ?
> > > >
> > > > We have to map one window in PAMU for MSIs and a malicious user can
> > > > ask its device to do DMA to MSI window region with any pair of
> > > > address and data, which can lead to unexpected MSIs in system?
> > >
> > > I don't think there are any solutions other than to limit each bank to
> > > one user, unless the admin turns some knob that says they're OK with
> > > the partial loss of isolation.
> > 
> > Even if the admin does opt-in to an allow_unsafe_interrupts options, it should
> > still be reasonably difficult for one guest to interfere with the other.  I
> > don't think we want to rely on the blind luck of making the full MSI bank
> > accessible to multiple guests and hoping they don't step on each other.
> 
> I am not sure how to solve this case (sharing the MSI page).
> 
> >  That probably means that vfio needs to manage the space rather than the guest.
> 
> What do you mean by "vfio needs to manage the space rather than the guest"?

I mean there needs to be some kernel component managing the contents of
the MSI page rather than just handing it out to the user and hoping for
the best.  The user API also needs to remain the same whether the user
has the MSI page exclusively or it's shared with others (kernel or
users).  Thanks,

Alex



* RE: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
  2013-12-10  5:53                           ` Alex Williamson
@ 2013-12-10  9:09                             ` Bharat.Bhushan
  0 siblings, 0 replies; 34+ messages in thread
From: Bharat.Bhushan @ 2013-12-10  9:09 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Scott Wood, linux-pci, agraf, Stuart Yoder, iommu, bhelgaas,
	linuxppc-dev, linux-kernel




> -----Original Message-----
> From: Alex Williamson [mailto:alex.williamson@redhat.com]
> Sent: Tuesday, December 10, 2013 11:23 AM
> To: Bhushan Bharat-R65777
> Cc: Wood Scott-B07421; linux-pci@vger.kernel.org; agraf@suse.de; Yoder Stuart-
> B08248; iommu@lists.linux-foundation.org; bhelgaas@google.com; linuxppc-
> dev@lists.ozlabs.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
> 
> On Tue, 2013-12-10 at 05:37 +0000, Bharat.Bhushan@freescale.com wrote:
> >
> > > -----Original Message-----
> > > From: Alex Williamson [mailto:alex.williamson@redhat.com]
> > > Sent: Saturday, December 07, 2013 1:00 AM
> > > To: Wood Scott-B07421
> > > Cc: Bhushan Bharat-R65777; linux-pci@vger.kernel.org; agraf@suse.de;
> > > Yoder Stuart-B08248; iommu@lists.linux-foundation.org;
> > > bhelgaas@google.com; linuxppc- dev@lists.ozlabs.org;
> > > linux-kernel@vger.kernel.org
> > > Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale
> > > IOMMU (PAMU)
> > >
> > > On Fri, 2013-12-06 at 12:59 -0600, Scott Wood wrote:
> > > > On Thu, 2013-12-05 at 22:11 -0600, Bharat Bhushan wrote:
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Wood Scott-B07421
> > > > > > Sent: Friday, December 06, 2013 5:52 AM
> > > > > > To: Bhushan Bharat-R65777
> > > > > > Cc: Alex Williamson; linux-pci@vger.kernel.org; agraf@suse.de;
> > > > > > Yoder Stuart- B08248; iommu@lists.linux-foundation.org;
> > > > > > bhelgaas@google.com; linuxppc- dev@lists.ozlabs.org;
> > > > > > linux-kernel@vger.kernel.org
> > > > > > Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for
> > > > > > Freescale IOMMU (PAMU)
> > > > > >
> > > > > > On Thu, 2013-11-28 at 03:19 -0600, Bharat Bhushan wrote:
> > > > > > >
> > > > > > > > -----Original Message-----
> > > > > > > > From: Bhushan Bharat-R65777
> > > > > > > > Sent: Wednesday, November 27, 2013 9:39 PM
> > > > > > > > To: 'Alex Williamson'
> > > > > > > > Cc: Wood Scott-B07421; linux-pci@vger.kernel.org;
> > > > > > > > agraf@suse.de; Yoder Stuart- B08248;
> > > > > > > > iommu@lists.linux-foundation.org; bhelgaas@google.com;
> > > > > > > > linuxppc- dev@lists.ozlabs.org;
> > > > > > > > linux-kernel@vger.kernel.org
> > > > > > > > Subject: RE: [PATCH 0/9 v2] vfio-pci: add support for
> > > > > > > > Freescale IOMMU (PAMU)
> > > > > > > >
> > > > > > > > If we just provide the size of MSI bank to userspace then
> > > > > > > > userspace cannot do anything wrong.
> > > > > > >
> > > > > > > So userspace does not know address, so it cannot mmap and
> > > > > > > cause any
> > > > > > interference by directly reading/writing.
> > > > > >
> > > > > > That's security through obscurity...  Couldn't the malicious
> > > > > > user find out the address via other means, such as
> > > > > > experimentation on another system over which they have full
> > > > > > control?  What would happen if the user reads from their
> > > > > > device's PCI config space?  Or gets the information via some
> > > > > > back door in the PCI device they own?  Or pokes throughout the
> > > > > > address space looking for something that
> > > generates an interrupt to its own device?
> > > > >
> > > > > So how to solve this problem, Any suggestion ?
> > > > >
> > > > > We have to map one window in PAMU for MSIs and a malicious user
> > > > > can ask its device to do DMA to MSI window region with any pair
> > > > > of address and data, which can lead to unexpected MSIs in system?
> > > >
> > > > I don't think there are any solutions other than to limit each
> > > > bank to one user, unless the admin turns some knob that says
> > > > they're OK with the partial loss of isolation.
> > >
> > > Even if the admin does opt-in to an allow_unsafe_interrupts options,
> > > it should still be reasonably difficult for one guest to interfere
> > > with the other.  I don't think we want to rely on the blind luck of
> > > making the full MSI bank accessible to multiple guests and hoping they don't
> step on each other.
> >
> > I am not sure how to solve this case (sharing the MSI page).
> >
> > >  That probably means that vfio needs to manage the space rather than the
> guest.
> >
> > What do you mean by "vfio needs to manage the space rather than the guest"?
> 
> I mean there needs to be some kernel component managing the contents of the MSI
> page rather than just handing it out to the user and hoping for the best.  The
> user API also needs to remain the same whether the user has the MSI page
> exclusively or it's shared with others (kernel or users).  Thanks,

We have a limited number of MSI banks, so we cannot provide an exclusive MSI bank to each VM.
Below is a summary of the MSI allocation/ownership models I am thinking of:

Option-1: User-space aware of MSI banks
========= 
1) Userspace will call GET_MSI_REGION (requesting a number of MSI banks)
	- VFIO will allocate the requested number of MSI banks;
	- If allocation succeeds, then return the number of banks
	- If allocation fails, then check the opt-in flag set by the administrator (allow_unsafe_interrupts):
         allow_unsafe_interrupts == 0: not allowed to share; return failure (-ENODEV)
         else share the MSI bank of another VM.

2) Userspace will adjust the geometry size as per the number of banks and call SET_GEOMETRY

3) Userspace will do DMA_MAP for its memory

4) Userspace will do MSI_MAP for the number of banks it has
	- MSI_MAP(iova, bank number);
	- Should the iova be passed by userspace or not? I think we should pass the iova, as VFIO does not know whether userspace will call DMA_MAP for the same iova later on.
	  VFIO could somehow find a magic IOVA within the geometry, but it would have to assume that userspace will not DMA_MAP it later on.
	

Option-2: Userspace-transparent MSI banks
========= 
1) Userspace sets up the geometry of its memory (call it the "userspace geometry") (SET_GEOMETRY)
	- VFIO will allocate MSI bank(s); how many?
	- Error out if none are available (shared and/or exclusive, same as in Option-1 above)
	- VFIO will adjust the geometry accordingly (call it the "actual geometry").

2) Userspace will do DMA_MAP for its memory.
	- VFIO allows mappings only within the "userspace geometry".

3) Userspace will do MSI_MAP after all DMA_MAPs are complete
	- VFIO will find a magic IOVA after the "userspace geometry" but within the "actual geometry".
	- The MSI bank(s) allocated in step 1 are mapped in the IOMMU

=========

Note: Irrespective of which option we use, a malicious userspace can interfere with another userspace by programming its device's DMA incorrectly.

Option-1 looks flexible and good to me, but I am open to suggestions.

Thanks
-Bharat


> 
> Alex
> 
> 



* Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
       [not found]                   ` <34c6142137194b1cbd2336013ed3b10d@BN1PR03MB266.namprd03.prod.outlook.com>
@ 2013-12-10 20:29                     ` Scott Wood
  0 siblings, 0 replies; 34+ messages in thread
From: Scott Wood @ 2013-12-10 20:29 UTC (permalink / raw)
  To: Bharat.Bhushan
  Cc: 'Wood Scott-B07421',
	linux-pci, agraf, iommu, Stuart Yoder, Alex Williamson, bhelgaas,
	linuxppc-dev, linux-kernel

My e-mail address is <scottwood@freescale.com>, not
<IMCEAEX-_O=MMS_OU=EXTERNAL+20+28FYDIBOHF25SPDLT
+29_CN=RECIPIENTS_CN=F0FAAC8D7E74473A9EE1C45B068D838A@namprd03.prod.outlook.com>

On Tue, 2013-12-10 at 05:37 +0000, Bharat.Bhushan@freescale.com wrote:
> 
> > -----Original Message-----
> > From: Wood Scott-B07421
> > Sent: Saturday, December 07, 2013 12:55 AM
> > To: Bhushan Bharat-R65777
> > Cc: Alex Williamson; linux-pci@vger.kernel.org; agraf@suse.de; Yoder Stuart-
> > B08248; iommu@lists.linux-foundation.org; bhelgaas@google.com; linuxppc-
> > dev@lists.ozlabs.org; linux-kernel@vger.kernel.org
> > Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
> > 
> > If the administrator does not opt into this partial loss of isolation, then once
> > you run out of MSI groups, new users should not be able to set up MSIs.
> 
> So you mean vfio should fall back to legacy interrupts when out of MSI banks?

Yes, if the administrator hasn't granted permission to share.

-Scott





end of thread, other threads:[~2013-12-10 20:29 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-11-19  5:17 [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU) Bharat Bhushan
2013-11-19  5:17 ` [PATCH 1/9 v2] pci:msi: add weak function for returning msi region info Bharat Bhushan
2013-11-25 23:36   ` Bjorn Helgaas
2013-11-28 10:08     ` Bharat Bhushan
2013-11-19  5:17 ` [PATCH 2/9 v2] pci: msi: expose msi region information functions Bharat Bhushan
2013-11-19  5:17 ` [PATCH 3/9 v2] powerpc: pci: Add arch specific msi region interface Bharat Bhushan
2013-11-19  5:17 ` [PATCH 4/9 v2] powerpc: msi: Extend the msi region interface to get info from fsl_msi Bharat Bhushan
2013-11-19  5:17 ` [PATCH 5/9 v2] pci/msi: interface to set an iova for a msi region Bharat Bhushan
2013-11-19  5:17 ` [PATCH 6/9 v2] powerpc: pci: Extend msi iova page setup to arch specific Bharat Bhushan
2013-11-19  5:17 ` [PATCH 7/9 v2] pci: msi: Extend msi iova setting interface to powerpc arch Bharat Bhushan
2013-11-19  5:17 ` [PATCH 8/9 v2] vfio: moving some functions in common file Bharat Bhushan
2013-11-19  5:17 ` [PATCH 9/9 v2] vfio pci: Add vfio iommu implementation for FSL_PAMU Bharat Bhushan
2013-11-20 18:47 ` [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU) Alex Williamson
2013-11-21 11:20   ` Varun Sethi
2013-11-21 11:20   ` Bharat Bhushan
2013-11-21 20:43     ` Alex Williamson
2013-11-21 20:47       ` Scott Wood
2013-11-21 21:00         ` Alex Williamson
2013-11-25  5:33           ` Bharat Bhushan
2013-11-25 16:38             ` Alex Williamson
2013-11-27 16:08               ` Bharat Bhushan
2013-11-28  9:19               ` Bharat Bhushan
2013-12-06  0:21                 ` Scott Wood
2013-12-06  4:11                   ` Bharat Bhushan
2013-12-06 18:59                     ` Scott Wood
2013-12-06 19:30                       ` Alex Williamson
2013-12-07  0:22                         ` Scott Wood
2013-12-10  5:37                         ` Bharat.Bhushan
2013-12-10  5:53                           ` Alex Williamson
2013-12-10  9:09                             ` Bharat.Bhushan
2013-12-06  0:00             ` Scott Wood
2013-12-06  4:17               ` Bharat Bhushan
2013-12-06 19:25                 ` Scott Wood
     [not found]                   ` <34c6142137194b1cbd2336013ed3b10d@BN1PR03MB266.namprd03.prod.outlook.com>
2013-12-10 20:29                     ` Scott Wood
