linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH v6 01/20] iommu/arm-smmu: change IOMMU_EXEC to IOMMU_NOEXEC
       [not found] <1401987808-23596-1-git-send-email-a.motakis@virtualopensystems.com>
@ 2014-06-05 17:03 ` Antonios Motakis
  2014-06-16 15:04   ` Will Deacon
  2014-06-05 17:03 ` [RFC PATCH v6 02/20] iommu: add capability IOMMU_CAP_NOEXEC Antonios Motakis
                   ` (18 subsequent siblings)
  19 siblings, 1 reply; 53+ messages in thread
From: Antonios Motakis @ 2014-06-05 17:03 UTC (permalink / raw)
  To: alex.williamson, kvmarm, iommu
  Cc: tech, a.rigo, kvm, christoffer.dall, will.deacon, kim.phillips,
	stuart.yoder, eric.auger, Antonios Motakis, Joerg Roedel,
	Varun Sethi, Alexey Kardashevskiy, Shuah Khan,
	Upinder Malhi (umalhi),
	moderated list:ARM SMMU DRIVER, open list

Exposing the XN flag of the SMMU driver as IOMMU_NOEXEC instead of
IOMMU_EXEC makes it enforceable, since for IOMMUs that don't support
the XN flag pages will always be executable.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
---
 drivers/iommu/arm-smmu.c | 2 +-
 include/linux/iommu.h    | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 647c3c7..d5a2200 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1294,7 +1294,7 @@ static int arm_smmu_alloc_init_pte(struct arm_smmu_device *smmu, pmd_t *pmd,
 	}
 
 	/* If no access, create a faulting entry to avoid TLB fills */
-	if (prot & IOMMU_EXEC)
+	if (!(prot & IOMMU_NOEXEC))
 		pteval &= ~ARM_SMMU_PTE_XN;
 	else if (!(prot & (IOMMU_READ | IOMMU_WRITE)))
 		pteval &= ~ARM_SMMU_PTE_PAGE;
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index b96a5b2..fc464d2 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -27,7 +27,7 @@
 #define IOMMU_READ	(1 << 0)
 #define IOMMU_WRITE	(1 << 1)
 #define IOMMU_CACHE	(1 << 2) /* DMA cache coherency */
-#define IOMMU_EXEC	(1 << 3)
+#define IOMMU_NOEXEC	(1 << 3)
 
 struct iommu_ops;
 struct iommu_group;
-- 
1.8.3.2


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [RFC PATCH v6 02/20] iommu: add capability IOMMU_CAP_NOEXEC
       [not found] <1401987808-23596-1-git-send-email-a.motakis@virtualopensystems.com>
  2014-06-05 17:03 ` [RFC PATCH v6 01/20] iommu/arm-smmu: change IOMMU_EXEC to IOMMU_NOEXEC Antonios Motakis
@ 2014-06-05 17:03 ` Antonios Motakis
  2014-06-05 20:03   ` Alex Williamson
  2014-06-05 17:03 ` [RFC PATCH v6 03/20] iommu/arm-smmu: add IOMMU_CAP_NOEXEC to the ARM SMMU driver Antonios Motakis
                   ` (17 subsequent siblings)
  19 siblings, 1 reply; 53+ messages in thread
From: Antonios Motakis @ 2014-06-05 17:03 UTC (permalink / raw)
  To: alex.williamson, kvmarm, iommu
  Cc: tech, a.rigo, kvm, christoffer.dall, will.deacon, kim.phillips,
	stuart.yoder, eric.auger, Antonios Motakis, Joerg Roedel,
	Varun Sethi, Alexey Kardashevskiy, Shuah Khan,
	Upinder Malhi (umalhi),
	open list

Some IOMMUs accept an IOMMU_NOEXEC protection flag in addition to
IOMMU_READ and IOMMU_WRITE. Expose this as an IOMMU capability.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
---
 include/linux/iommu.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index fc464d2..7e152fb 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -57,8 +57,9 @@ struct iommu_domain {
 	struct iommu_domain_geometry geometry;
 };
 
-#define IOMMU_CAP_CACHE_COHERENCY	0x1
-#define IOMMU_CAP_INTR_REMAP		0x2	/* isolates device intrs */
+#define IOMMU_CAP_CACHE_COHERENCY	(1 << 0)
+#define IOMMU_CAP_INTR_REMAP		(1 << 1) /* isolates device intrs */
+#define IOMMU_CAP_NOEXEC		(1 << 2) /* IOMMU_NOEXEC flag */
 
 /*
  * Following constraints are specifc to FSL_PAMUV1:
-- 
1.8.3.2


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [RFC PATCH v6 03/20] iommu/arm-smmu: add IOMMU_CAP_NOEXEC to the ARM SMMU driver
       [not found] <1401987808-23596-1-git-send-email-a.motakis@virtualopensystems.com>
  2014-06-05 17:03 ` [RFC PATCH v6 01/20] iommu/arm-smmu: change IOMMU_EXEC to IOMMU_NOEXEC Antonios Motakis
  2014-06-05 17:03 ` [RFC PATCH v6 02/20] iommu: add capability IOMMU_CAP_NOEXEC Antonios Motakis
@ 2014-06-05 17:03 ` Antonios Motakis
  2014-06-16 15:04   ` Will Deacon
  2014-06-05 17:03 ` [RFC PATCH v6 04/20] iommu/arm-smmu: add capability IOMMU_CAP_INTR_REMAP Antonios Motakis
                   ` (16 subsequent siblings)
  19 siblings, 1 reply; 53+ messages in thread
From: Antonios Motakis @ 2014-06-05 17:03 UTC (permalink / raw)
  To: alex.williamson, kvmarm, iommu
  Cc: tech, a.rigo, kvm, christoffer.dall, will.deacon, kim.phillips,
	stuart.yoder, eric.auger, Antonios Motakis, Joerg Roedel,
	moderated list:ARM SMMU DRIVER, open list

The ARM SMMU supports the IOMMU_NOEXEC protection flag. Add the
corresponding IOMMU capability.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
---
 drivers/iommu/arm-smmu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index d5a2200..15ab2af 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1544,6 +1544,8 @@ static int arm_smmu_domain_has_cap(struct iommu_domain *domain,
 	if (smmu_domain->root_cfg.smmu->features & ARM_SMMU_FEAT_COHERENT_WALK)
 		caps |= IOMMU_CAP_CACHE_COHERENCY;
 
+	caps |= IOMMU_CAP_NOEXEC;
+
 	return !!(cap & caps);
 }
 
-- 
1.8.3.2


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [RFC PATCH v6 04/20] iommu/arm-smmu: add capability IOMMU_CAP_INTR_REMAP
       [not found] <1401987808-23596-1-git-send-email-a.motakis@virtualopensystems.com>
                   ` (2 preceding siblings ...)
  2014-06-05 17:03 ` [RFC PATCH v6 03/20] iommu/arm-smmu: add IOMMU_CAP_NOEXEC to the ARM SMMU driver Antonios Motakis
@ 2014-06-05 17:03 ` Antonios Motakis
  2014-06-05 18:31   ` Varun Sethi
  2014-06-08 10:31   ` Christoffer Dall
  2014-06-05 17:03 ` [RFC PATCH v6 05/20] vfio/iommu_type1: support for platform bus devices on ARM Antonios Motakis
                   ` (15 subsequent siblings)
  19 siblings, 2 replies; 53+ messages in thread
From: Antonios Motakis @ 2014-06-05 17:03 UTC (permalink / raw)
  To: alex.williamson, kvmarm, iommu
  Cc: tech, a.rigo, kvm, christoffer.dall, will.deacon, kim.phillips,
	stuart.yoder, eric.auger, Antonios Motakis, Joerg Roedel,
	moderated list:ARM SMMU DRIVER, open list

With an ARM SMMU, interrupt remapping should always be safe from the
SMMU's point of view, as it is properly handled by the GIC.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
---
 drivers/iommu/arm-smmu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 15ab2af..ff29402 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1544,7 +1544,7 @@ static int arm_smmu_domain_has_cap(struct iommu_domain *domain,
 	if (smmu_domain->root_cfg.smmu->features & ARM_SMMU_FEAT_COHERENT_WALK)
 		caps |= IOMMU_CAP_CACHE_COHERENCY;
 
-	caps |= IOMMU_CAP_NOEXEC;
+	caps |= IOMMU_CAP_NOEXEC | IOMMU_CAP_INTR_REMAP;
 
 	return !!(cap & caps);
 }
-- 
1.8.3.2


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [RFC PATCH v6 05/20] vfio/iommu_type1: support for platform bus devices on ARM
       [not found] <1401987808-23596-1-git-send-email-a.motakis@virtualopensystems.com>
                   ` (3 preceding siblings ...)
  2014-06-05 17:03 ` [RFC PATCH v6 04/20] iommu/arm-smmu: add capability IOMMU_CAP_INTR_REMAP Antonios Motakis
@ 2014-06-05 17:03 ` Antonios Motakis
  2014-06-05 17:03 ` [RFC PATCH v6 06/20] vfio: introduce the VFIO_DMA_MAP_FLAG_NOEXEC flag Antonios Motakis
                   ` (14 subsequent siblings)
  19 siblings, 0 replies; 53+ messages in thread
From: Antonios Motakis @ 2014-06-05 17:03 UTC (permalink / raw)
  To: alex.williamson, kvmarm, iommu
  Cc: tech, a.rigo, kvm, christoffer.dall, will.deacon, kim.phillips,
	stuart.yoder, eric.auger, Antonios Motakis, open list

This allows to make use of the VFIO_IOMMU_TYPE1 driver with platform
devices on ARM. The driver can then be used with an Exynos SMMU, or
ARM SMMU driver.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
---
 drivers/vfio/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig
index af7b204..3a598ed 100644
--- a/drivers/vfio/Kconfig
+++ b/drivers/vfio/Kconfig
@@ -11,7 +11,7 @@ config VFIO_IOMMU_SPAPR_TCE
 menuconfig VFIO
 	tristate "VFIO Non-Privileged userspace driver framework"
 	depends on IOMMU_API
-	select VFIO_IOMMU_TYPE1 if X86
+	select VFIO_IOMMU_TYPE1 if X86 || ARM
 	select VFIO_IOMMU_SPAPR_TCE if (PPC_POWERNV || PPC_PSERIES)
 	select ANON_INODES
 	help
-- 
1.8.3.2


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [RFC PATCH v6 06/20] vfio: introduce the VFIO_DMA_MAP_FLAG_NOEXEC flag
       [not found] <1401987808-23596-1-git-send-email-a.motakis@virtualopensystems.com>
                   ` (4 preceding siblings ...)
  2014-06-05 17:03 ` [RFC PATCH v6 05/20] vfio/iommu_type1: support for platform bus devices on ARM Antonios Motakis
@ 2014-06-05 17:03 ` Antonios Motakis
  2014-06-05 17:03 ` [RFC PATCH v6 07/20] vfio/iommu_type1: implement " Antonios Motakis
                   ` (13 subsequent siblings)
  19 siblings, 0 replies; 53+ messages in thread
From: Antonios Motakis @ 2014-06-05 17:03 UTC (permalink / raw)
  To: alex.williamson, kvmarm, iommu
  Cc: tech, a.rigo, kvm, christoffer.dall, will.deacon, kim.phillips,
	stuart.yoder, eric.auger, Antonios Motakis, open list

We introduce the VFIO_DMA_MAP_FLAG_NOEXEC flag to the VFIO dma map call,
and expose its availability via the capability VFIO_IOMMU_PROT_NOEXEC.
This way the user can control whether the XN flag will be set on the
requested mappings. The IOMMU_NOEXEC flag needs to be available for all
the IOMMUs of the container used.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
---
 include/uapi/linux/vfio.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index cb9023d..29d234c 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -29,6 +29,7 @@
  * capability is subject to change as groups are added or removed.
  */
 #define VFIO_DMA_CC_IOMMU		4
+#define VFIO_IOMMU_PROT_NOEXEC		5
 
 /*
  * The IOCTL interface is designed for extensibility by embedding the
@@ -398,6 +399,7 @@ struct vfio_iommu_type1_dma_map {
 	__u32	flags;
 #define VFIO_DMA_MAP_FLAG_READ (1 << 0)		/* readable from device */
 #define VFIO_DMA_MAP_FLAG_WRITE (1 << 1)	/* writable from device */
+#define VFIO_DMA_MAP_FLAG_NOEXEC (1 << 2)	/* not executable from device */
 	__u64	vaddr;				/* Process virtual address */
 	__u64	iova;				/* IO virtual address */
 	__u64	size;				/* Size of mapping (bytes) */
-- 
1.8.3.2


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [RFC PATCH v6 07/20] vfio/iommu_type1: implement the VFIO_DMA_MAP_FLAG_NOEXEC flag
       [not found] <1401987808-23596-1-git-send-email-a.motakis@virtualopensystems.com>
                   ` (5 preceding siblings ...)
  2014-06-05 17:03 ` [RFC PATCH v6 06/20] vfio: introduce the VFIO_DMA_MAP_FLAG_NOEXEC flag Antonios Motakis
@ 2014-06-05 17:03 ` Antonios Motakis
  2014-06-05 20:48   ` Alex Williamson
  2014-06-05 17:03 ` [RFC PATCH v6 08/20] driver core: platform: add device binding path 'driver_override' Antonios Motakis
                   ` (12 subsequent siblings)
  19 siblings, 1 reply; 53+ messages in thread
From: Antonios Motakis @ 2014-06-05 17:03 UTC (permalink / raw)
  To: alex.williamson, kvmarm, iommu
  Cc: tech, a.rigo, kvm, christoffer.dall, will.deacon, kim.phillips,
	stuart.yoder, eric.auger, Antonios Motakis, open list

Some IOMMU drivers, such as the ARM SMMU driver, make available the
IOMMU_NOEXEC flag, to set the page tables for a device as XN (execute never).
This affects devices such as the ARM PL330 DMA Controller, which respects
this flag and will refuse to fetch DMA instructions from memory where the
XN flag has been set.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
---
 drivers/vfio/vfio_iommu_type1.c | 30 +++++++++++++++++++++++++++++-
 1 file changed, 29 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 6673e7b..e2566fd 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -80,6 +80,24 @@ struct vfio_group {
 	struct list_head	next;
 };
 
+static int vfio_domains_have_cap_noexec(struct vfio_iommu *iommu)
+{
+	struct vfio_domain *d;
+	int ret = 1;
+
+	mutex_lock(&iommu->lock);
+	list_for_each_entry(d, &iommu->domain_list, next) {
+		if (!iommu_domain_has_cap(d->domain, IOMMU_CAP_NOEXEC)) {
+			ret = 0;
+			break;
+		}
+	}
+	mutex_unlock(&iommu->lock);
+
+	return ret;
+}
+
+
 /*
  * This code handles mapping and unmapping of user data buffers
  * into DMA'ble space using the IOMMU
@@ -542,6 +560,11 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
 		prot |= IOMMU_WRITE;
 	if (map->flags & VFIO_DMA_MAP_FLAG_READ)
 		prot |= IOMMU_READ;
+	if (map->flags & VFIO_DMA_MAP_FLAG_NOEXEC) {
+		if (!vfio_domains_have_cap_noexec(iommu))
+			return -EINVAL;
+		prot |= IOMMU_NOEXEC;
+	}
 
 	if (!prot)
 		return -EINVAL; /* No READ/WRITE? */
@@ -899,6 +922,10 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
 			if (!iommu)
 				return 0;
 			return vfio_domains_have_iommu_cache(iommu);
+		case VFIO_IOMMU_PROT_NOEXEC:
+			if (!iommu)
+				return 0;
+			return vfio_domains_have_cap_noexec(iommu);
 		default:
 			return 0;
 		}
@@ -922,7 +949,8 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
 	} else if (cmd == VFIO_IOMMU_MAP_DMA) {
 		struct vfio_iommu_type1_dma_map map;
 		uint32_t mask = VFIO_DMA_MAP_FLAG_READ |
-				VFIO_DMA_MAP_FLAG_WRITE;
+				VFIO_DMA_MAP_FLAG_WRITE |
+				VFIO_DMA_MAP_FLAG_NOEXEC;
 
 		minsz = offsetofend(struct vfio_iommu_type1_dma_map, size);
 
-- 
1.8.3.2


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [RFC PATCH v6 08/20] driver core: platform: add device binding path 'driver_override'
       [not found] <1401987808-23596-1-git-send-email-a.motakis@virtualopensystems.com>
                   ` (6 preceding siblings ...)
  2014-06-05 17:03 ` [RFC PATCH v6 07/20] vfio/iommu_type1: implement " Antonios Motakis
@ 2014-06-05 17:03 ` Antonios Motakis
  2014-06-05 17:03 ` [RFC PATCH v6 09/20] vfio/platform: initial skeleton of VFIO support for platform devices Antonios Motakis
                   ` (11 subsequent siblings)
  19 siblings, 0 replies; 53+ messages in thread
From: Antonios Motakis @ 2014-06-05 17:03 UTC (permalink / raw)
  To: alex.williamson, kvmarm, iommu
  Cc: tech, a.rigo, kvm, christoffer.dall, will.deacon, kim.phillips,
	stuart.yoder, eric.auger, Randy Dunlap, Greg Kroah-Hartman,
	Alexander Graf, Johan Hovold, Rob Herring,
	open list:DOCUMENTATION, open list

From: Kim Phillips <kim.phillips@freescale.com>

Needed by platform device drivers, such as the upcoming
vfio-platform driver, in order to bypass the existing OF, ACPI,
id_table and name string matches, and successfully be able to be
bound to any device, like so:

echo vfio-platform > /sys/bus/platform/devices/fff51000.ethernet/driver_override
echo fff51000.ethernet > /sys/bus/platform/devices/fff51000.ethernet/driver/unbind
echo fff51000.ethernet > /sys/bus/platform/drivers_probe

This mimics "PCI: Introduce new device binding path using
pci_dev.driver_override", which is an interface enhancement
for more deterministic PCI device binding, e.g., when in the
presence of hotplug.

Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Stuart Yoder <stuart.yoder@freescale.com>
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
---
 Documentation/ABI/testing/sysfs-bus-platform | 20 ++++++++++++
 drivers/base/platform.c                      | 47 ++++++++++++++++++++++++++++
 include/linux/platform_device.h              |  1 +
 3 files changed, 68 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-bus-platform

diff --git a/Documentation/ABI/testing/sysfs-bus-platform b/Documentation/ABI/testing/sysfs-bus-platform
new file mode 100644
index 0000000..5172a61
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-bus-platform
@@ -0,0 +1,20 @@
+What:		/sys/bus/platform/devices/.../driver_override
+Date:		April 2014
+Contact:	Kim Phillips <kim.phillips@freescale.com>
+Description:
+		This file allows the driver for a device to be specified which
+		will override standard OF, ACPI, ID table, and name matching.
+		When specified, only a driver with a name matching the value
+		written to driver_override will have an opportunity to bind
+		to the device.  The override is specified by writing a string
+		to the driver_override file (echo vfio-platform > \
+		driver_override) and may be cleared with an empty string
+		(echo > driver_override).  This returns the device to standard
+		matching rules binding.  Writing to driver_override does not
+		automatically unbind the device from its current driver or make
+		any attempt to automatically load the specified driver.  If no
+		driver with a matching name is currently loaded in the kernel,
+		the device will not bind to any driver.  This also allows
+		devices to opt-out of driver binding using a driver_override
+		name such as "none".  Only a single driver may be specified in
+		the override, there is no support for parsing delimiters.
diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index 5b47210..4f47563 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -23,6 +23,7 @@
 #include <linux/pm_runtime.h>
 #include <linux/idr.h>
 #include <linux/acpi.h>
+#include <linux/limits.h>
 
 #include "base.h"
 #include "power/power.h"
@@ -188,6 +189,7 @@ static void platform_device_release(struct device *dev)
 	kfree(pa->pdev.dev.platform_data);
 	kfree(pa->pdev.mfd_cell);
 	kfree(pa->pdev.resource);
+	kfree(pa->pdev.driver_override);
 	kfree(pa);
 }
 
@@ -695,8 +697,49 @@ static ssize_t modalias_show(struct device *dev, struct device_attribute *a,
 }
 static DEVICE_ATTR_RO(modalias);
 
+static ssize_t driver_override_store(struct device *dev,
+				     struct device_attribute *attr,
+				     const char *buf, size_t count)
+{
+	struct platform_device *pdev = to_platform_device(dev);
+	char *driver_override, *old = pdev->driver_override, *cp;
+
+	if (count > PATH_MAX)
+		return -EINVAL;
+
+	driver_override = kstrndup(buf, count, GFP_KERNEL);
+	if (!driver_override)
+		return -ENOMEM;
+
+	cp = strchr(driver_override, '\n');
+	if (cp)
+		*cp = '\0';
+
+	if (strlen(driver_override)) {
+		pdev->driver_override = driver_override;
+	} else {
+		kfree(driver_override);
+		pdev->driver_override = NULL;
+	}
+
+	kfree(old);
+
+	return count;
+}
+
+static ssize_t driver_override_show(struct device *dev,
+				    struct device_attribute *attr, char *buf)
+{
+	struct platform_device *pdev = to_platform_device(dev);
+
+	return sprintf(buf, "%s\n", pdev->driver_override);
+}
+static DEVICE_ATTR_RW(driver_override);
+
+
 static struct attribute *platform_dev_attrs[] = {
 	&dev_attr_modalias.attr,
+	&dev_attr_driver_override.attr,
 	NULL,
 };
 ATTRIBUTE_GROUPS(platform_dev);
@@ -752,6 +795,10 @@ static int platform_match(struct device *dev, struct device_driver *drv)
 	struct platform_device *pdev = to_platform_device(dev);
 	struct platform_driver *pdrv = to_platform_driver(drv);
 
+	/* When driver_override is set, only bind to the matching driver */
+	if (pdev->driver_override)
+		return !strcmp(pdev->driver_override, drv->name);
+
 	/* Attempt an OF style match first */
 	if (of_driver_match_device(dev, drv))
 		return 1;
diff --git a/include/linux/platform_device.h b/include/linux/platform_device.h
index 16f6654..153d303 100644
--- a/include/linux/platform_device.h
+++ b/include/linux/platform_device.h
@@ -28,6 +28,7 @@ struct platform_device {
 	struct resource	*resource;
 
 	const struct platform_device_id	*id_entry;
+	char *driver_override; /* Driver name to force a match */
 
 	/* MFD cell pointer */
 	struct mfd_cell *mfd_cell;
-- 
1.8.3.2


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [RFC PATCH v6 09/20] vfio/platform: initial skeleton of VFIO support for platform devices
       [not found] <1401987808-23596-1-git-send-email-a.motakis@virtualopensystems.com>
                   ` (7 preceding siblings ...)
  2014-06-05 17:03 ` [RFC PATCH v6 08/20] driver core: platform: add device binding path 'driver_override' Antonios Motakis
@ 2014-06-05 17:03 ` Antonios Motakis
  2014-06-05 17:03 ` [RFC PATCH v6 10/20] vfio/platform: return info for device and its memory mapped IO regions Antonios Motakis
                   ` (10 subsequent siblings)
  19 siblings, 0 replies; 53+ messages in thread
From: Antonios Motakis @ 2014-06-05 17:03 UTC (permalink / raw)
  To: alex.williamson, kvmarm, iommu
  Cc: tech, a.rigo, kvm, christoffer.dall, will.deacon, kim.phillips,
	stuart.yoder, eric.auger, Antonios Motakis, Catalin Marinas,
	Mark Rutland, Vladimir Murzin, open list

This patch forms the skeleton for platform devices support with VFIO.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
---
 drivers/vfio/Kconfig                          |   1 +
 drivers/vfio/Makefile                         |   1 +
 drivers/vfio/platform/Kconfig                 |   9 ++
 drivers/vfio/platform/Makefile                |   4 +
 drivers/vfio/platform/vfio_platform.c         | 172 ++++++++++++++++++++++++++
 drivers/vfio/platform/vfio_platform_private.h |  22 ++++
 include/uapi/linux/vfio.h                     |   1 +
 7 files changed, 210 insertions(+)
 create mode 100644 drivers/vfio/platform/Kconfig
 create mode 100644 drivers/vfio/platform/Makefile
 create mode 100644 drivers/vfio/platform/vfio_platform.c
 create mode 100644 drivers/vfio/platform/vfio_platform_private.h

diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig
index 3a598ed..a484d72 100644
--- a/drivers/vfio/Kconfig
+++ b/drivers/vfio/Kconfig
@@ -21,3 +21,4 @@ menuconfig VFIO
 	  If you don't know what to do here, say N.
 
 source "drivers/vfio/pci/Kconfig"
+source "drivers/vfio/platform/Kconfig"
diff --git a/drivers/vfio/Makefile b/drivers/vfio/Makefile
index 72bfabc..b5e4a33 100644
--- a/drivers/vfio/Makefile
+++ b/drivers/vfio/Makefile
@@ -2,3 +2,4 @@ obj-$(CONFIG_VFIO) += vfio.o
 obj-$(CONFIG_VFIO_IOMMU_TYPE1) += vfio_iommu_type1.o
 obj-$(CONFIG_VFIO_IOMMU_SPAPR_TCE) += vfio_iommu_spapr_tce.o
 obj-$(CONFIG_VFIO_PCI) += pci/
+obj-$(CONFIG_VFIO_PLATFORM) += platform/
diff --git a/drivers/vfio/platform/Kconfig b/drivers/vfio/platform/Kconfig
new file mode 100644
index 0000000..c51af17
--- /dev/null
+++ b/drivers/vfio/platform/Kconfig
@@ -0,0 +1,9 @@
+config VFIO_PLATFORM
+	tristate "VFIO support for platform devices"
+	depends on VFIO && EVENTFD && ARM
+	help
+	  Support for platform devices with VFIO. This is required to make
+	  use of platform devices present on the system using the VFIO
+	  framework.
+
+	  If you don't know what to do here, say N.
diff --git a/drivers/vfio/platform/Makefile b/drivers/vfio/platform/Makefile
new file mode 100644
index 0000000..df3a014
--- /dev/null
+++ b/drivers/vfio/platform/Makefile
@@ -0,0 +1,4 @@
+
+vfio-platform-y := vfio_platform.o
+
+obj-$(CONFIG_VFIO_PLATFORM) += vfio-platform.o
diff --git a/drivers/vfio/platform/vfio_platform.c b/drivers/vfio/platform/vfio_platform.c
new file mode 100644
index 0000000..1df76d8
--- /dev/null
+++ b/drivers/vfio/platform/vfio_platform.c
@@ -0,0 +1,172 @@
+/*
+ * Copyright (C) 2013 - Virtual Open Systems
+ * Author: Antonios Motakis <a.motakis@virtualopensystems.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/device.h>
+#include <linux/eventfd.h>
+#include <linux/interrupt.h>
+#include <linux/iommu.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/notifier.h>
+#include <linux/pm_runtime.h>
+#include <linux/slab.h>
+#include <linux/types.h>
+#include <linux/uaccess.h>
+#include <linux/vfio.h>
+#include <linux/io.h>
+#include <linux/platform_device.h>
+#include <linux/irq.h>
+
+#include "vfio_platform_private.h"
+
+#define DRIVER_VERSION  "0.6"
+#define DRIVER_AUTHOR   "Antonios Motakis <a.motakis@virtualopensystems.com>"
+#define DRIVER_DESC     "VFIO for platform devices - User Level meta-driver"
+
+static void vfio_platform_release(void *device_data)
+{
+	module_put(THIS_MODULE);
+}
+
+static int vfio_platform_open(void *device_data)
+{
+	if (!try_module_get(THIS_MODULE))
+		return -ENODEV;
+
+	return 0;
+}
+
+static long vfio_platform_ioctl(void *device_data,
+			   unsigned int cmd, unsigned long arg)
+{
+	struct vfio_platform_device *vdev = device_data;
+	unsigned long minsz;
+
+	if (cmd == VFIO_DEVICE_GET_INFO) {
+		struct vfio_device_info info;
+
+		minsz = offsetofend(struct vfio_device_info, num_irqs);
+
+		if (copy_from_user(&info, (void __user *)arg, minsz))
+			return -EFAULT;
+
+		if (info.argsz < minsz)
+			return -EINVAL;
+
+		info.flags = VFIO_DEVICE_FLAGS_PLATFORM;
+		info.num_regions = 0;
+		info.num_irqs = 0;
+
+		return copy_to_user((void __user *)arg, &info, minsz);
+
+	} else if (cmd == VFIO_DEVICE_GET_REGION_INFO)
+		return -EINVAL;
+
+	else if (cmd == VFIO_DEVICE_GET_IRQ_INFO)
+		return -EINVAL;
+
+	else if (cmd == VFIO_DEVICE_SET_IRQS)
+		return -EINVAL;
+
+	else if (cmd == VFIO_DEVICE_RESET)
+		return -EINVAL;
+
+	return -ENOTTY;
+}
+
+static ssize_t vfio_platform_read(void *device_data, char __user *buf,
+			     size_t count, loff_t *ppos)
+{
+	return -EINVAL;
+}
+
+static ssize_t vfio_platform_write(void *device_data, const char __user *buf,
+			      size_t count, loff_t *ppos)
+{
+	return -EINVAL;
+}
+
+static int vfio_platform_mmap(void *device_data, struct vm_area_struct *vma)
+{
+	return -EINVAL;
+}
+
+static const struct vfio_device_ops vfio_platform_ops = {
+	.name		= "vfio-platform",
+	.open		= vfio_platform_open,
+	.release	= vfio_platform_release,
+	.ioctl		= vfio_platform_ioctl,
+	.read		= vfio_platform_read,
+	.write		= vfio_platform_write,
+	.mmap		= vfio_platform_mmap,
+};
+
+static int vfio_platform_probe(struct platform_device *pdev)
+{
+	struct vfio_platform_device *vdev;
+	struct iommu_group *group;
+	int ret;
+
+	group = iommu_group_get(&pdev->dev);
+	if (!group) {
+		pr_err("VFIO: No IOMMU group for device %s\n", pdev->name);
+		return -EINVAL;
+	}
+
+	vdev = kzalloc(sizeof(*vdev), GFP_KERNEL);
+	if (!vdev) {
+		iommu_group_put(group);
+		return -ENOMEM;
+	}
+
+	vdev->pdev = pdev;
+
+	ret = vfio_add_group_dev(&pdev->dev, &vfio_platform_ops, vdev);
+	if (ret) {
+		iommu_group_put(group);
+		kfree(vdev);
+	}
+
+	return ret;
+}
+
+static int vfio_platform_remove(struct platform_device *pdev)
+{
+	struct vfio_platform_device *vdev;
+
+	vdev = vfio_del_group_dev(&pdev->dev);
+	if (!vdev)
+		return -EINVAL;
+
+	iommu_group_put(pdev->dev.iommu_group);
+	kfree(vdev);
+
+	return 0;
+}
+
+static struct platform_driver vfio_platform_driver = {
+	.probe		= vfio_platform_probe,
+	.remove		= vfio_platform_remove,
+	.driver	= {
+		.name	= "vfio-platform",
+		.owner	= THIS_MODULE,
+	},
+};
+
+module_platform_driver(vfio_platform_driver);
+
+MODULE_VERSION(DRIVER_VERSION);
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR(DRIVER_AUTHOR);
+MODULE_DESCRIPTION(DRIVER_DESC);
diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
new file mode 100644
index 0000000..4ae88f8
--- /dev/null
+++ b/drivers/vfio/platform/vfio_platform_private.h
@@ -0,0 +1,22 @@
+/*
+ * Copyright (C) 2013 - Virtual Open Systems
+ * Author: Antonios Motakis <a.motakis@virtualopensystems.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef VFIO_PLATFORM_PRIVATE_H
+#define VFIO_PLATFORM_PRIVATE_H
+
+struct vfio_platform_device {
+	struct platform_device		*pdev;
+};
+
+#endif /* VFIO_PLATFORM_PRIVATE_H */
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 29d234c..d381107 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -155,6 +155,7 @@ struct vfio_device_info {
 	__u32	flags;
 #define VFIO_DEVICE_FLAGS_RESET	(1 << 0)	/* Device supports reset */
 #define VFIO_DEVICE_FLAGS_PCI	(1 << 1)	/* vfio-pci device */
+#define VFIO_DEVICE_FLAGS_PLATFORM (1 << 2)	/* vfio-platform device */
 	__u32	num_regions;	/* Max region index + 1 */
 	__u32	num_irqs;	/* Max IRQ index + 1 */
 };
-- 
1.8.3.2


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [RFC PATCH v6 10/20] vfio/platform: return info for device and its memory mapped IO regions
       [not found] <1401987808-23596-1-git-send-email-a.motakis@virtualopensystems.com>
                   ` (8 preceding siblings ...)
  2014-06-05 17:03 ` [RFC PATCH v6 09/20] vfio/platform: initial skeleton of VFIO support for platform devices Antonios Motakis
@ 2014-06-05 17:03 ` Antonios Motakis
  2014-06-05 21:14   ` Alex Williamson
  2014-06-05 17:03 ` [RFC PATCH v6 11/20] vfio/platform: read and write support for the device fd Antonios Motakis
                   ` (9 subsequent siblings)
  19 siblings, 1 reply; 53+ messages in thread
From: Antonios Motakis @ 2014-06-05 17:03 UTC (permalink / raw)
  To: alex.williamson, kvmarm, iommu
  Cc: tech, a.rigo, kvm, christoffer.dall, will.deacon, kim.phillips,
	stuart.yoder, eric.auger, Antonios Motakis, Catalin Marinas,
	Mark Rutland, Vladimir Murzin, open list

A VFIO userspace driver will start by opening the VFIO device
that corresponds to an IOMMU group, and will use the ioctl interface
to get the basic device info, such as number of memory regions and
interrupts, and their properties.

This patch enables the IOCTLs:
 - VFIO_DEVICE_GET_INFO
 - VFIO_DEVICE_GET_REGION_INFO

IRQ info is provided by one of the latter patches.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
---
 drivers/vfio/platform/vfio_platform.c         | 79 +++++++++++++++++++++++++--
 drivers/vfio/platform/vfio_platform_private.h | 17 ++++++
 2 files changed, 90 insertions(+), 6 deletions(-)

diff --git a/drivers/vfio/platform/vfio_platform.c b/drivers/vfio/platform/vfio_platform.c
index 1df76d8..eeaebc4 100644
--- a/drivers/vfio/platform/vfio_platform.c
+++ b/drivers/vfio/platform/vfio_platform.c
@@ -34,17 +34,66 @@
 #define DRIVER_AUTHOR   "Antonios Motakis <a.motakis@virtualopensystems.com>"
 #define DRIVER_DESC     "VFIO for platform devices - User Level meta-driver"
 
+static int vfio_platform_regions_init(struct vfio_platform_device *vdev)
+{
+	int cnt = 0, i;
+
+	while (platform_get_resource(vdev->pdev, IORESOURCE_MEM, cnt))
+		cnt++;
+
+	vdev->region = kzalloc(sizeof(struct vfio_platform_region) * cnt,
+				GFP_KERNEL);
+	if (!vdev->region)
+		return -ENOMEM;
+
+	for (i = 0; i < cnt;  i++) {
+		struct resource *res =
+			platform_get_resource(vdev->pdev, IORESOURCE_MEM, i);
+
+		if (!res)
+			goto err;
+
+		vdev->region[i].addr = res->start;
+		vdev->region[i].size = resource_size(res);
+		vdev->region[i].flags = 0;
+	}
+
+	vdev->num_regions = cnt;
+
+	return 0;
+err:
+	kfree(vdev->region);
+	return -EINVAL;
+}
+
+static void vfio_platform_regions_cleanup(struct vfio_platform_device *vdev)
+{
+	vdev->num_regions = 0;
+	kfree(vdev->region);
+}
+
 static void vfio_platform_release(void *device_data)
 {
+	struct vfio_platform_device *vdev = device_data;
+
+	vfio_platform_regions_cleanup(vdev);
+
 	module_put(THIS_MODULE);
 }
 
 static int vfio_platform_open(void *device_data)
 {
+	struct vfio_platform_device *vdev = device_data;
+	int ret;
+
 	if (!try_module_get(THIS_MODULE))
 		return -ENODEV;
 
-	return 0;
+	ret = vfio_platform_regions_init(vdev);
+	if (ret)
+		module_put(THIS_MODULE);
+
+	return ret;
 }
 
 static long vfio_platform_ioctl(void *device_data,
@@ -65,18 +114,36 @@ static long vfio_platform_ioctl(void *device_data,
 			return -EINVAL;
 
 		info.flags = VFIO_DEVICE_FLAGS_PLATFORM;
-		info.num_regions = 0;
+		info.num_regions = vdev->num_regions;
 		info.num_irqs = 0;
 
 		return copy_to_user((void __user *)arg, &info, minsz);
 
-	} else if (cmd == VFIO_DEVICE_GET_REGION_INFO)
-		return -EINVAL;
+	} else if (cmd == VFIO_DEVICE_GET_REGION_INFO) {
+		struct vfio_region_info info;
+
+		minsz = offsetofend(struct vfio_region_info, offset);
+
+		if (copy_from_user(&info, (void __user *)arg, minsz))
+			return -EFAULT;
+
+		if (info.argsz < minsz)
+			return -EINVAL;
+
+		if (info.index >= vdev->num_regions)
+			return -EINVAL;
+
+		/* map offset to the physical address  */
+		info.offset = VFIO_PLATFORM_INDEX_TO_OFFSET(info.index);
+		info.size = vdev->region[info.index].size;
+		info.flags = vdev->region[info.index].flags;
+
+		return copy_to_user((void __user *)arg, &info, minsz);
 
-	else if (cmd == VFIO_DEVICE_GET_IRQ_INFO)
+	} else if (cmd == VFIO_DEVICE_GET_IRQ_INFO) {
 		return -EINVAL;
 
-	else if (cmd == VFIO_DEVICE_SET_IRQS)
+	} else if (cmd == VFIO_DEVICE_SET_IRQS)
 		return -EINVAL;
 
 	else if (cmd == VFIO_DEVICE_RESET)
diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
index 4ae88f8..3448f918 100644
--- a/drivers/vfio/platform/vfio_platform_private.h
+++ b/drivers/vfio/platform/vfio_platform_private.h
@@ -15,8 +15,25 @@
 #ifndef VFIO_PLATFORM_PRIVATE_H
 #define VFIO_PLATFORM_PRIVATE_H
 
+#define VFIO_PLATFORM_OFFSET_SHIFT   40
+#define VFIO_PLATFORM_OFFSET_MASK (((u64)(1) << VFIO_PLATFORM_OFFSET_SHIFT) - 1)
+
+#define VFIO_PLATFORM_OFFSET_TO_INDEX(off)	\
+	(off >> VFIO_PLATFORM_OFFSET_SHIFT)
+
+#define VFIO_PLATFORM_INDEX_TO_OFFSET(index)	\
+	((u64)(index) << VFIO_PLATFORM_OFFSET_SHIFT)
+
+struct vfio_platform_region {
+	u64			addr;
+	resource_size_t		size;
+	u32			flags;
+};
+
 struct vfio_platform_device {
 	struct platform_device		*pdev;
+	struct vfio_platform_region	*region;
+	u32				num_regions;
 };
 
 #endif /* VFIO_PLATFORM_PRIVATE_H */
-- 
1.8.3.2


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [RFC PATCH v6 11/20] vfio/platform: read and write support for the device fd
       [not found] <1401987808-23596-1-git-send-email-a.motakis@virtualopensystems.com>
                   ` (9 preceding siblings ...)
  2014-06-05 17:03 ` [RFC PATCH v6 10/20] vfio/platform: return info for device and its memory mapped IO regions Antonios Motakis
@ 2014-06-05 17:03 ` Antonios Motakis
  2014-06-05 17:03 ` [RFC PATCH v6 12/20] vfio/platform: support MMAP of MMIO regions Antonios Motakis
                   ` (8 subsequent siblings)
  19 siblings, 0 replies; 53+ messages in thread
From: Antonios Motakis @ 2014-06-05 17:03 UTC (permalink / raw)
  To: alex.williamson, kvmarm, iommu
  Cc: tech, a.rigo, kvm, christoffer.dall, will.deacon, kim.phillips,
	stuart.yoder, eric.auger, Antonios Motakis, Catalin Marinas,
	Mark Rutland, Vladimir Murzin, open list

VFIO returns a file descriptor which we can use to manipulate the memory
regions of the device. Usually, the user will mmap memory regions that are
addressable on page boundaries, however for memory regions where this is
not the case we cannot provide mmap functionality due to security concerns.
For this reason we also need allow to read and write to the memory regions
via the file descriptor.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
---
 drivers/vfio/platform/vfio_platform.c         | 116 +++++++++++++++++++++++++-
 drivers/vfio/platform/vfio_platform_private.h |   1 +
 2 files changed, 114 insertions(+), 3 deletions(-)

diff --git a/drivers/vfio/platform/vfio_platform.c b/drivers/vfio/platform/vfio_platform.c
index eeaebc4..c66afbc 100644
--- a/drivers/vfio/platform/vfio_platform.c
+++ b/drivers/vfio/platform/vfio_platform.c
@@ -55,7 +55,8 @@ static int vfio_platform_regions_init(struct vfio_platform_device *vdev)
 
 		vdev->region[i].addr = res->start;
 		vdev->region[i].size = resource_size(res);
-		vdev->region[i].flags = 0;
+		vdev->region[i].flags = VFIO_REGION_INFO_FLAG_READ
+					| VFIO_REGION_INFO_FLAG_WRITE;
 	}
 
 	vdev->num_regions = cnt;
@@ -155,13 +156,122 @@ static long vfio_platform_ioctl(void *device_data,
 static ssize_t vfio_platform_read(void *device_data, char __user *buf,
 			     size_t count, loff_t *ppos)
 {
-	return -EINVAL;
+	struct vfio_platform_device *vdev = device_data;
+	unsigned int index = VFIO_PLATFORM_OFFSET_TO_INDEX(*ppos);
+	loff_t off = *ppos & VFIO_PLATFORM_OFFSET_MASK;
+	unsigned int done = 0;
+
+	if (index >= vdev->num_regions)
+		return -EINVAL;
+
+	if (!vdev->region[index].ioaddr) {
+		vdev->region[index].ioaddr =
+			ioremap_nocache(vdev->region[index].addr,
+					vdev->region[index].size);
+
+		if (!vdev->region[index].ioaddr)
+			return -ENOMEM;
+	}
+
+	while (count) {
+		size_t filled;
+
+		if (count >= 4 && !(off % 4)) {
+			u32 val;
+
+			val = ioread32(vdev->region[index].ioaddr + off);
+			if (copy_to_user(buf, &val, 4))
+				goto err;
+
+			filled = 4;
+		} else if (count >= 2 && !(off % 2)) {
+			u16 val;
+
+			val = ioread16(vdev->region[index].ioaddr + off);
+			if (copy_to_user(buf, &val, 2))
+				goto err;
+
+			filled = 2;
+		} else {
+			u8 val;
+
+			val = ioread8(vdev->region[index].ioaddr + off);
+			if (copy_to_user(buf, &val, 1))
+				goto err;
+
+			filled = 1;
+		}
+
+
+		count -= filled;
+		done += filled;
+		off += filled;
+		buf += filled;
+	}
+
+	return done;
+err:
+	return -EFAULT;
 }
 
 static ssize_t vfio_platform_write(void *device_data, const char __user *buf,
 			      size_t count, loff_t *ppos)
 {
-	return -EINVAL;
+	struct vfio_platform_device *vdev = device_data;
+	unsigned int index = VFIO_PLATFORM_OFFSET_TO_INDEX(*ppos);
+	loff_t off = *ppos & VFIO_PLATFORM_OFFSET_MASK;
+	unsigned int done = 0;
+
+	if (index >= vdev->num_regions)
+		return -EINVAL;
+
+	if (!vdev->region[index].ioaddr) {
+		vdev->region[index].ioaddr =
+			ioremap_nocache(vdev->region[index].addr,
+					vdev->region[index].size);
+
+		if (!vdev->region[index].ioaddr)
+			return -ENOMEM;
+	}
+
+	while (count) {
+		size_t filled;
+
+		if (count >= 4 && !(off % 4)) {
+			u32 val;
+
+			if (copy_from_user(&val, buf, 4))
+				goto err;
+			iowrite32(val, vdev->region[index].ioaddr + off);
+
+			filled = 4;
+		} else if (count >= 2 && !(off % 2)) {
+			u16 val;
+
+			if (copy_from_user(&val, buf, 2))
+				goto err;
+			iowrite16(val, vdev->region[index].ioaddr + off);
+
+			filled = 2;
+		} else {
+			u8 val;
+
+			if (copy_from_user(&val, buf, 1))
+				goto err;
+			iowrite8(val, vdev->region[index].ioaddr + off);
+
+			filled = 1;
+		}
+
+		count -= filled;
+		done += filled;
+		off += filled;
+		buf += filled;
+	}
+
+	return done;
+err:
+	return -EFAULT;
 }
 
 static int vfio_platform_mmap(void *device_data, struct vm_area_struct *vma)
diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
index 3448f918..c56f4b6 100644
--- a/drivers/vfio/platform/vfio_platform_private.h
+++ b/drivers/vfio/platform/vfio_platform_private.h
@@ -28,6 +28,7 @@ struct vfio_platform_region {
 	u64			addr;
 	resource_size_t		size;
 	u32			flags;
+	void __iomem		*ioaddr;
 };
 
 struct vfio_platform_device {
-- 
1.8.3.2


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [RFC PATCH v6 12/20] vfio/platform: support MMAP of MMIO regions
       [not found] <1401987808-23596-1-git-send-email-a.motakis@virtualopensystems.com>
                   ` (10 preceding siblings ...)
  2014-06-05 17:03 ` [RFC PATCH v6 11/20] vfio/platform: read and write support for the device fd Antonios Motakis
@ 2014-06-05 17:03 ` Antonios Motakis
  2014-06-05 17:03 ` [RFC PATCH v6 13/20] vfio/platform: return IRQ info Antonios Motakis
                   ` (7 subsequent siblings)
  19 siblings, 0 replies; 53+ messages in thread
From: Antonios Motakis @ 2014-06-05 17:03 UTC (permalink / raw)
  To: alex.williamson, kvmarm, iommu
  Cc: tech, a.rigo, kvm, christoffer.dall, will.deacon, kim.phillips,
	stuart.yoder, eric.auger, Antonios Motakis, open list

Allow to memory map the MMIO regions of the device so userspace can
directly access them.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
---
 drivers/vfio/platform/vfio_platform.c | 40 ++++++++++++++++++++++++++++++++++-
 1 file changed, 39 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/platform/vfio_platform.c b/drivers/vfio/platform/vfio_platform.c
index c66afbc..1515175 100644
--- a/drivers/vfio/platform/vfio_platform.c
+++ b/drivers/vfio/platform/vfio_platform.c
@@ -57,6 +57,11 @@ static int vfio_platform_regions_init(struct vfio_platform_device *vdev)
 		vdev->region[i].size = resource_size(res);
 		vdev->region[i].flags = VFIO_REGION_INFO_FLAG_READ
 					| VFIO_REGION_INFO_FLAG_WRITE;
+		/* Only regions addressed with PAGE granularity may be MMAPed
+		 * securely. */
+		if (!(vdev->region[i].addr & ~PAGE_MASK)
+				&& !(vdev->region[i].size & ~PAGE_MASK))
+			vdev->region[i].flags |= VFIO_REGION_INFO_FLAG_MMAP;
 	}
 
 	vdev->num_regions = cnt;
@@ -276,7 +281,40 @@ err:
 
 static int vfio_platform_mmap(void *device_data, struct vm_area_struct *vma)
 {
-	return -EINVAL;
+	struct vfio_platform_device *vdev = device_data;
+	unsigned int index;
+	u64 req_len, pgoff, req_start;
+	struct vfio_platform_region region;
+
+	index = vma->vm_pgoff >> (VFIO_PLATFORM_OFFSET_SHIFT - PAGE_SHIFT);
+
+	if (vma->vm_end < vma->vm_start)
+		return -EINVAL;
+	if ((vma->vm_flags & VM_SHARED) == 0)
+		return -EINVAL;
+	if (index >= vdev->num_regions)
+		return -EINVAL;
+	if (vma->vm_start & ~PAGE_MASK)
+		return -EINVAL;
+	if (vma->vm_end & ~PAGE_MASK)
+		return -EINVAL;
+
+	region = vdev->region[index];
+
+	req_len = vma->vm_end - vma->vm_start;
+	pgoff = vma->vm_pgoff &
+		((1U << (VFIO_PLATFORM_OFFSET_SHIFT - PAGE_SHIFT)) - 1);
+	req_start = pgoff << PAGE_SHIFT;
+
+	if (region.size < PAGE_SIZE || req_start + req_len > region.size)
+		return -EINVAL;
+
+	vma->vm_private_data = vdev;
+	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
+	vma->vm_pgoff = (region.addr >> PAGE_SHIFT) + pgoff;
+
+	return remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff,
+			       req_len, vma->vm_page_prot);
 }
 
 static const struct vfio_device_ops vfio_platform_ops = {
-- 
1.8.3.2


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [RFC PATCH v6 13/20] vfio/platform: return IRQ info
       [not found] <1401987808-23596-1-git-send-email-a.motakis@virtualopensystems.com>
                   ` (11 preceding siblings ...)
  2014-06-05 17:03 ` [RFC PATCH v6 12/20] vfio/platform: support MMAP of MMIO regions Antonios Motakis
@ 2014-06-05 17:03 ` Antonios Motakis
  2014-06-05 17:03 ` [RFC PATCH v6 14/20] vfio/platform: initial interrupts support Antonios Motakis
                   ` (6 subsequent siblings)
  19 siblings, 0 replies; 53+ messages in thread
From: Antonios Motakis @ 2014-06-05 17:03 UTC (permalink / raw)
  To: alex.williamson, kvmarm, iommu
  Cc: tech, a.rigo, kvm, christoffer.dall, will.deacon, kim.phillips,
	stuart.yoder, eric.auger, Antonios Motakis, Catalin Marinas,
	Mark Rutland, Vladimir Murzin, open list

Return information for the interrupts exposed by the device.
This patch extends VFIO_DEVICE_GET_INFO with the number of IRQs
and enables VFIO_DEVICE_GET_IRQ_INFO

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
---
 drivers/vfio/platform/Makefile                |  2 +-
 drivers/vfio/platform/vfio_platform.c         | 34 +++++++++++++--
 drivers/vfio/platform/vfio_platform_irq.c     | 59 +++++++++++++++++++++++++++
 drivers/vfio/platform/vfio_platform_private.h | 11 +++++
 4 files changed, 102 insertions(+), 4 deletions(-)
 create mode 100644 drivers/vfio/platform/vfio_platform_irq.c

diff --git a/drivers/vfio/platform/Makefile b/drivers/vfio/platform/Makefile
index df3a014..2c53327 100644
--- a/drivers/vfio/platform/Makefile
+++ b/drivers/vfio/platform/Makefile
@@ -1,4 +1,4 @@
 
-vfio-platform-y := vfio_platform.o
+vfio-platform-y := vfio_platform.o vfio_platform_irq.o
 
 obj-$(CONFIG_VFIO_PLATFORM) += vfio-platform.o
diff --git a/drivers/vfio/platform/vfio_platform.c b/drivers/vfio/platform/vfio_platform.c
index 1515175..192291c 100644
--- a/drivers/vfio/platform/vfio_platform.c
+++ b/drivers/vfio/platform/vfio_platform.c
@@ -83,6 +83,7 @@ static void vfio_platform_release(void *device_data)
 	struct vfio_platform_device *vdev = device_data;
 
 	vfio_platform_regions_cleanup(vdev);
+	vfio_platform_irq_cleanup(vdev);
 
 	module_put(THIS_MODULE);
 }
@@ -97,7 +98,18 @@ static int vfio_platform_open(void *device_data)
 
 	ret = vfio_platform_regions_init(vdev);
 	if (ret)
-		module_put(THIS_MODULE);
+		goto err_reg;
+
+	ret = vfio_platform_irq_init(vdev);
+	if (ret)
+		goto err_irq;
+
+	return 0;
+
+err_irq:
+	vfio_platform_regions_cleanup(vdev);
+err_reg:
+	module_put(THIS_MODULE);
 
 	return ret;
 }
@@ -121,7 +133,7 @@ static long vfio_platform_ioctl(void *device_data,
 
 		info.flags = VFIO_DEVICE_FLAGS_PLATFORM;
 		info.num_regions = vdev->num_regions;
-		info.num_irqs = 0;
+		info.num_irqs = vdev->num_irqs;
 
 		return copy_to_user((void __user *)arg, &info, minsz);
 
@@ -147,7 +159,23 @@ static long vfio_platform_ioctl(void *device_data,
 		return copy_to_user((void __user *)arg, &info, minsz);
 
 	} else if (cmd == VFIO_DEVICE_GET_IRQ_INFO) {
-		return -EINVAL;
+		struct vfio_irq_info info;
+
+		minsz = offsetofend(struct vfio_irq_info, count);
+
+		if (copy_from_user(&info, (void __user *)arg, minsz))
+			return -EFAULT;
+
+		if (info.argsz < minsz)
+			return -EINVAL;
+
+		if (info.index >= vdev->num_irqs)
+			return -EINVAL;
+
+		info.flags = vdev->irq[info.index].flags;
+		info.count = vdev->irq[info.index].count;
+
+		return copy_to_user((void __user *)arg, &info, minsz);
 
 	} else if (cmd == VFIO_DEVICE_SET_IRQS)
 		return -EINVAL;
diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
new file mode 100644
index 0000000..22c214f
--- /dev/null
+++ b/drivers/vfio/platform/vfio_platform_irq.c
@@ -0,0 +1,59 @@
+/*
+ * VFIO platform devices interrupt handling
+ *
+ * Copyright (C) 2013 - Virtual Open Systems
+ * Author: Antonios Motakis <a.motakis@virtualopensystems.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/device.h>
+#include <linux/eventfd.h>
+#include <linux/interrupt.h>
+#include <linux/iommu.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/notifier.h>
+#include <linux/pm_runtime.h>
+#include <linux/slab.h>
+#include <linux/types.h>
+#include <linux/uaccess.h>
+#include <linux/vfio.h>
+#include <linux/platform_device.h>
+#include <linux/irq.h>
+
+#include "vfio_platform_private.h"
+
+int vfio_platform_irq_init(struct vfio_platform_device *vdev)
+{
+	int cnt = 0, i;
+
+	while (platform_get_irq(vdev->pdev, cnt) > 0)
+		cnt++;
+
+	vdev->irq = kzalloc(sizeof(struct vfio_platform_irq) * cnt, GFP_KERNEL);
+	if (!vdev->irq)
+		return -ENOMEM;
+
+	for (i = 0; i < cnt; i++) {
+		vdev->irq[i].flags = 0;
+		vdev->irq[i].count = 1;
+	}
+
+	vdev->num_irqs = cnt;
+
+	return 0;
+}
+
+void vfio_platform_irq_cleanup(struct vfio_platform_device *vdev)
+{
+	vdev->num_irqs = 0;
+	kfree(vdev->irq);
+}
diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
index c56f4b6..632a294 100644
--- a/drivers/vfio/platform/vfio_platform_private.h
+++ b/drivers/vfio/platform/vfio_platform_private.h
@@ -24,6 +24,11 @@
 #define VFIO_PLATFORM_INDEX_TO_OFFSET(index)	\
 	((u64)(index) << VFIO_PLATFORM_OFFSET_SHIFT)
 
+struct vfio_platform_irq {
+	u32			flags;
+	u32			count;
+};
+
 struct vfio_platform_region {
 	u64			addr;
 	resource_size_t		size;
@@ -35,6 +40,12 @@ struct vfio_platform_device {
 	struct platform_device		*pdev;
 	struct vfio_platform_region	*region;
 	u32				num_regions;
+	struct vfio_platform_irq	*irq;
+	u32				num_irqs;
 };
 
+extern int vfio_platform_irq_init(struct vfio_platform_device *vdev);
+
+extern void vfio_platform_irq_cleanup(struct vfio_platform_device *vdev);
+
 #endif /* VFIO_PLATFORM_PRIVATE_H */
-- 
1.8.3.2


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [RFC PATCH v6 14/20] vfio/platform: initial interrupts support
       [not found] <1401987808-23596-1-git-send-email-a.motakis@virtualopensystems.com>
                   ` (12 preceding siblings ...)
  2014-06-05 17:03 ` [RFC PATCH v6 13/20] vfio/platform: return IRQ info Antonios Motakis
@ 2014-06-05 17:03 ` Antonios Motakis
  2014-06-08 10:09   ` Christoffer Dall
  2014-06-05 17:03 ` [RFC PATCH v6 15/20] vfio/platform: support for maskable and automasked interrupts Antonios Motakis
                   ` (5 subsequent siblings)
  19 siblings, 1 reply; 53+ messages in thread
From: Antonios Motakis @ 2014-06-05 17:03 UTC (permalink / raw)
  To: alex.williamson, kvmarm, iommu
  Cc: tech, a.rigo, kvm, christoffer.dall, will.deacon, kim.phillips,
	stuart.yoder, eric.auger, Antonios Motakis, Catalin Marinas,
	Mark Rutland, Vladimir Murzin, open list

This patch allows to set an eventfd for a patform device's interrupt,
and also to trigger the interrupt eventfd from userspace for testing.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
---
 drivers/vfio/platform/vfio_platform.c         |  36 ++++++-
 drivers/vfio/platform/vfio_platform_irq.c     | 130 +++++++++++++++++++++++++-
 drivers/vfio/platform/vfio_platform_private.h |   7 ++
 3 files changed, 169 insertions(+), 4 deletions(-)

diff --git a/drivers/vfio/platform/vfio_platform.c b/drivers/vfio/platform/vfio_platform.c
index 192291c..f4c06c6 100644
--- a/drivers/vfio/platform/vfio_platform.c
+++ b/drivers/vfio/platform/vfio_platform.c
@@ -177,10 +177,40 @@ static long vfio_platform_ioctl(void *device_data,
 
 		return copy_to_user((void __user *)arg, &info, minsz);
 
-	} else if (cmd == VFIO_DEVICE_SET_IRQS)
-		return -EINVAL;
+	} else if (cmd == VFIO_DEVICE_SET_IRQS) {
+		struct vfio_irq_set hdr;
+		int ret = 0;
+
+		minsz = offsetofend(struct vfio_irq_set, count);
+
+		if (copy_from_user(&hdr, (void __user *)arg, minsz))
+			return -EFAULT;
+
+		if (hdr.argsz < minsz)
+			return -EINVAL;
+
+		if (hdr.index >= vdev->num_irqs)
+			return -EINVAL;
+
+		if (hdr.start != 0 || hdr.count > 1)
+			return -EINVAL;
+
+		if (hdr.count == 0 &&
+			(!(hdr.flags & VFIO_IRQ_SET_DATA_NONE) ||
+			 !(hdr.flags & VFIO_IRQ_SET_ACTION_TRIGGER)))
+			return -EINVAL;
+
+		if (hdr.flags & ~(VFIO_IRQ_SET_DATA_TYPE_MASK |
+				  VFIO_IRQ_SET_ACTION_TYPE_MASK))
+			return -EINVAL;
+
+		ret = vfio_platform_set_irqs_ioctl(vdev, hdr.flags, hdr.index,
+						   hdr.start, hdr.count,
+						   (void *)arg+minsz);
+
+		return ret;
 
-	else if (cmd == VFIO_DEVICE_RESET)
+	} else if (cmd == VFIO_DEVICE_RESET)
 		return -EINVAL;
 
 	return -ENOTTY;
diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
index 22c214f..d79f5af 100644
--- a/drivers/vfio/platform/vfio_platform_irq.c
+++ b/drivers/vfio/platform/vfio_platform_irq.c
@@ -31,6 +31,9 @@
 
 #include "vfio_platform_private.h"
 
+static int vfio_set_trigger(struct vfio_platform_device *vdev,
+			    int index, int fd);
+
 int vfio_platform_irq_init(struct vfio_platform_device *vdev)
 {
 	int cnt = 0, i;
@@ -43,17 +46,142 @@ int vfio_platform_irq_init(struct vfio_platform_device *vdev)
 		return -ENOMEM;
 
 	for (i = 0; i < cnt; i++) {
-		vdev->irq[i].flags = 0;
+		int hwirq = platform_get_irq(vdev->pdev, i);
+
+		if (hwirq < 0)
+			goto err;
+
+		vdev->irq[i].flags = VFIO_IRQ_INFO_EVENTFD;
 		vdev->irq[i].count = 1;
+		vdev->irq[i].hwirq = hwirq;
 	}
 
 	vdev->num_irqs = cnt;
 
 	return 0;
+err:
+	kfree(vdev->irq);
+	return -EINVAL;
 }
 
 void vfio_platform_irq_cleanup(struct vfio_platform_device *vdev)
 {
+	int i;
+
+	for (i = 0; i < vdev->num_irqs; i++)
+		vfio_set_trigger(vdev, i, -1);
+
 	vdev->num_irqs = 0;
 	kfree(vdev->irq);
 }
+
+static irqreturn_t vfio_irq_handler(int irq, void *dev_id)
+{
+	struct eventfd_ctx *trigger = dev_id;
+
+	eventfd_signal(trigger, 1);
+
+	return IRQ_HANDLED;
+}
+
+static int vfio_set_trigger(struct vfio_platform_device *vdev,
+			    int index, int fd)
+{
+	struct vfio_platform_irq *irq = &vdev->irq[index];
+	struct eventfd_ctx *trigger;
+	int ret;
+
+	if (irq->trigger) {
+		free_irq(irq->hwirq, irq);
+		kfree(irq->name);
+		eventfd_ctx_put(irq->trigger);
+		irq->trigger = NULL;
+	}
+
+	if (fd < 0) /* Disable only */
+		return 0;
+
+	irq->name = kasprintf(GFP_KERNEL, "vfio-irq[%d](%s)",
+						irq->hwirq, vdev->pdev->name);
+	if (!irq->name)
+		return -ENOMEM;
+
+	trigger = eventfd_ctx_fdget(fd);
+	if (IS_ERR(trigger)) {
+		kfree(irq->name);
+		return PTR_ERR(trigger);
+	}
+
+	irq->trigger = trigger;
+
+	ret = request_irq(irq->hwirq, vfio_irq_handler, 0, irq->name, irq);
+	if (ret) {
+		kfree(irq->name);
+		eventfd_ctx_put(trigger);
+		irq->trigger = NULL;
+		return ret;
+	}
+
+	return 0;
+}
+
+static int vfio_platform_set_irq_trigger(struct vfio_platform_device *vdev,
+				     unsigned index, unsigned start,
+				     unsigned count, uint32_t flags, void *data)
+{
+	struct vfio_platform_irq *irq = &vdev->irq[index];
+	uint8_t arr;
+	int32_t fd;
+
+	switch (flags & VFIO_IRQ_SET_DATA_TYPE_MASK) {
+	case VFIO_IRQ_SET_DATA_NONE:
+		if (count == 0)
+			return vfio_set_trigger(vdev, index, -1);
+
+		vfio_irq_handler(irq->hwirq, irq);
+		return 0;
+
+	case VFIO_IRQ_SET_DATA_BOOL:
+		if (copy_from_user(&arr, data, sizeof(uint8_t)))
+			return -EFAULT;
+
+		if (arr == 0x1) {
+			vfio_irq_handler(irq->hwirq, irq);
+			return 0;
+		}
+
+		return -EINVAL;
+
+	case VFIO_IRQ_SET_DATA_EVENTFD:
+		if (copy_from_user(&fd, data, sizeof(int32_t)))
+			return -EFAULT;
+
+		return vfio_set_trigger(vdev, index, fd);
+	}
+
+	return -EFAULT;
+}
+
+int vfio_platform_set_irqs_ioctl(struct vfio_platform_device *vdev,
+				 uint32_t flags, unsigned index, unsigned start,
+				 unsigned count, void *data)
+{
+	int (*func)(struct vfio_platform_device *vdev, unsigned index,
+		    unsigned start, unsigned count, uint32_t flags,
+		    void *data) = NULL;
+
+	switch (flags & VFIO_IRQ_SET_ACTION_TYPE_MASK) {
+	case VFIO_IRQ_SET_ACTION_MASK:
+	case VFIO_IRQ_SET_ACTION_UNMASK:
+		/* XXX not implemented */
+		break;
+	case VFIO_IRQ_SET_ACTION_TRIGGER:
+		func = vfio_platform_set_irq_trigger;
+		break;
+	}
+
+	if (!func)
+		return -ENOTTY;
+
+	return func(vdev, index, start, count, flags, data);
+}
diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
index 632a294..d6200df 100644
--- a/drivers/vfio/platform/vfio_platform_private.h
+++ b/drivers/vfio/platform/vfio_platform_private.h
@@ -25,8 +25,11 @@
 	((u64)(index) << VFIO_PLATFORM_OFFSET_SHIFT)
 
 struct vfio_platform_irq {
+	struct eventfd_ctx	*trigger;
 	u32			flags;
 	u32			count;
+	int			hwirq;
+	char			*name;
 };
 
 struct vfio_platform_region {
@@ -48,4 +51,8 @@ extern int vfio_platform_irq_init(struct vfio_platform_device *vdev);
 
 extern void vfio_platform_irq_cleanup(struct vfio_platform_device *vdev);
 
+extern int vfio_platform_set_irqs_ioctl(struct vfio_platform_device *vdev,
+			uint32_t flags, unsigned index, unsigned start,
+			unsigned count, void *data);
+
 #endif /* VFIO_PLATFORM_PRIVATE_H */
-- 
1.8.3.2


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [RFC PATCH v6 15/20] vfio/platform: support for maskable and automasked interrupts
       [not found] <1401987808-23596-1-git-send-email-a.motakis@virtualopensystems.com>
                   ` (13 preceding siblings ...)
  2014-06-05 17:03 ` [RFC PATCH v6 14/20] vfio/platform: initial interrupts support Antonios Motakis
@ 2014-06-05 17:03 ` Antonios Motakis
  2014-06-08 10:17   ` Christoffer Dall
  2014-06-05 17:03 ` [RFC PATCH v6 16/20] vfio: move eventfd support code for VFIO_PCI to a sepparate file Antonios Motakis
                   ` (4 subsequent siblings)
  19 siblings, 1 reply; 53+ messages in thread
From: Antonios Motakis @ 2014-06-05 17:03 UTC (permalink / raw)
  To: alex.williamson, kvmarm, iommu
  Cc: tech, a.rigo, kvm, christoffer.dall, will.deacon, kim.phillips,
	stuart.yoder, eric.auger, Antonios Motakis, Catalin Marinas,
	Mark Rutland, Vladimir Murzin, open list

Adds support to mask interrupts, and also for automasked interrupts.
Level sensitive interrupts are exposed as automasked interrupts and
are masked and disabled automatically when they fire.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
---
 drivers/vfio/platform/vfio_platform_irq.c     | 112 ++++++++++++++++++++++++--
 drivers/vfio/platform/vfio_platform_private.h |   2 +
 2 files changed, 109 insertions(+), 5 deletions(-)

diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
index d79f5af..10dfbf0 100644
--- a/drivers/vfio/platform/vfio_platform_irq.c
+++ b/drivers/vfio/platform/vfio_platform_irq.c
@@ -51,9 +51,17 @@ int vfio_platform_irq_init(struct vfio_platform_device *vdev)
 		if (hwirq < 0)
 			goto err;
 
-		vdev->irq[i].flags = VFIO_IRQ_INFO_EVENTFD;
+		spin_lock_init(&vdev->irq[i].lock);
+
+		vdev->irq[i].flags = VFIO_IRQ_INFO_EVENTFD
+					| VFIO_IRQ_INFO_MASKABLE;
+
+		if (irq_get_trigger_type(hwirq) & IRQ_TYPE_LEVEL_MASK)
+			vdev->irq[i].flags |= VFIO_IRQ_INFO_AUTOMASKED;
+
 		vdev->irq[i].count = 1;
 		vdev->irq[i].hwirq = hwirq;
+		vdev->irq[i].masked = false;
 	}
 
 	vdev->num_irqs = cnt;
@@ -77,11 +85,27 @@ void vfio_platform_irq_cleanup(struct vfio_platform_device *vdev)
 
 static irqreturn_t vfio_irq_handler(int irq, void *dev_id)
 {
-	struct eventfd_ctx *trigger = dev_id;
+	struct vfio_platform_irq *irq_ctx = dev_id;
+	unsigned long flags;
+	int ret = IRQ_NONE;
+
+	spin_lock_irqsave(&irq_ctx->lock, flags);
+
+	if (!irq_ctx->masked) {
+		ret = IRQ_HANDLED;
+
+		if (irq_ctx->flags & VFIO_IRQ_INFO_AUTOMASKED) {
+			disable_irq_nosync(irq_ctx->hwirq);
+			irq_ctx->masked = true;
+		}
+	}
 
-	eventfd_signal(trigger, 1);
+	spin_unlock_irqrestore(&irq_ctx->lock, flags);
 
-	return IRQ_HANDLED;
+	if (ret == IRQ_HANDLED)
+		eventfd_signal(irq_ctx->trigger, 1);
+
+	return ret;
 }
 
 static int vfio_set_trigger(struct vfio_platform_device *vdev,
@@ -162,6 +186,82 @@ static int vfio_platform_set_irq_trigger(struct vfio_platform_device *vdev,
 	return -EFAULT;
 }
 
+static int vfio_platform_set_irq_unmask(struct vfio_platform_device *vdev,
+				    unsigned index, unsigned start,
+				    unsigned count, uint32_t flags, void *data)
+{
+	uint8_t arr;
+
+	if (start != 0 || count != 1)
+		return -EINVAL;
+
+	switch (flags & VFIO_IRQ_SET_DATA_TYPE_MASK) {
+	case VFIO_IRQ_SET_DATA_BOOL:
+		if (copy_from_user(&arr, data, sizeof(uint8_t)))
+			return -EFAULT;
+
+		if (arr != 0x1)
+			return -EINVAL;
+
+	case VFIO_IRQ_SET_DATA_NONE:
+
+		spin_lock_irq(&vdev->irq[index].lock);
+
+		if (vdev->irq[index].masked) {
+			enable_irq(vdev->irq[index].hwirq);
+			vdev->irq[index].masked = false;
+		}
+
+		spin_unlock_irq(&vdev->irq[index].lock);
+
+		return 0;
+
+	case VFIO_IRQ_SET_DATA_EVENTFD: /* XXX not implemented yet */
+	default:
+		return -ENOTTY;
+	}
+
+	return 0;
+}
+
+static int vfio_platform_set_irq_mask(struct vfio_platform_device *vdev,
+				    unsigned index, unsigned start,
+				    unsigned count, uint32_t flags, void *data)
+{
+	uint8_t arr;
+
+	if (start != 0 || count != 1)
+		return -EINVAL;
+
+	switch (flags & VFIO_IRQ_SET_DATA_TYPE_MASK) {
+	case VFIO_IRQ_SET_DATA_BOOL:
+		if (copy_from_user(&arr, data, sizeof(uint8_t)))
+			return -EFAULT;
+
+		if (arr != 0x1)
+			return -EINVAL;
+
+	case VFIO_IRQ_SET_DATA_NONE:
+
+		spin_lock_irq(&vdev->irq[index].lock);
+
+		if (!vdev->irq[index].masked) {
+			disable_irq(vdev->irq[index].hwirq);
+			vdev->irq[index].masked = true;
+		}
+
+		spin_unlock_irq(&vdev->irq[index].lock);
+
+		return 0;
+
+	case VFIO_IRQ_SET_DATA_EVENTFD: /* XXX not implemented yet */
+	default:
+		return -ENOTTY;
+	}
+
+	return 0;
+}
+
 int vfio_platform_set_irqs_ioctl(struct vfio_platform_device *vdev,
 				 uint32_t flags, unsigned index, unsigned start,
 				 unsigned count, void *data)
@@ -172,8 +272,10 @@ int vfio_platform_set_irqs_ioctl(struct vfio_platform_device *vdev,
 
 	switch (flags & VFIO_IRQ_SET_ACTION_TYPE_MASK) {
 	case VFIO_IRQ_SET_ACTION_MASK:
+		func = vfio_platform_set_irq_mask;
+		break;
 	case VFIO_IRQ_SET_ACTION_UNMASK:
-		/* XXX not implemented */
+		func = vfio_platform_set_irq_unmask;
 		break;
 	case VFIO_IRQ_SET_ACTION_TRIGGER:
 		func = vfio_platform_set_irq_trigger;
diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
index d6200df..4d887fd 100644
--- a/drivers/vfio/platform/vfio_platform_private.h
+++ b/drivers/vfio/platform/vfio_platform_private.h
@@ -30,6 +30,8 @@ struct vfio_platform_irq {
 	u32			count;
 	int			hwirq;
 	char			*name;
+	bool			masked;
+	spinlock_t		lock;
 };
 
 struct vfio_platform_region {
-- 
1.8.3.2


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [RFC PATCH v6 16/20] vfio: move eventfd support code for VFIO_PCI to a sepparate file
       [not found] <1401987808-23596-1-git-send-email-a.motakis@virtualopensystems.com>
                   ` (14 preceding siblings ...)
  2014-06-05 17:03 ` [RFC PATCH v6 15/20] vfio/platform: support for maskable and automasked interrupts Antonios Motakis
@ 2014-06-05 17:03 ` Antonios Motakis
  2014-06-05 17:03 ` [RFC PATCH v6 17/20] vfio: add local lock in virqfd instead of depending on VFIO PCI Antonios Motakis
                   ` (3 subsequent siblings)
  19 siblings, 0 replies; 53+ messages in thread
From: Antonios Motakis @ 2014-06-05 17:03 UTC (permalink / raw)
  To: alex.williamson, kvmarm, iommu
  Cc: tech, a.rigo, kvm, christoffer.dall, will.deacon, kim.phillips,
	stuart.yoder, eric.auger, Antonios Motakis, Alexander Gordeev,
	Bjorn Helgaas, open list

The virqfd functionality that is used by VFIO_PCI to implement interrupt
masking and unmasking via an eventfd, is generic enough and can be reused
by another driver. Move it to a separate file in order to allow the code
to be shared.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
---
 drivers/vfio/Makefile               |   2 +-
 drivers/vfio/pci/vfio_pci_intrs.c   | 213 ------------------------------------
 drivers/vfio/pci/vfio_pci_private.h |   3 -
 drivers/vfio/virqfd.c               | 212 +++++++++++++++++++++++++++++++++++
 include/linux/vfio.h                |  27 +++++
 5 files changed, 240 insertions(+), 217 deletions(-)
 create mode 100644 drivers/vfio/virqfd.c

diff --git a/drivers/vfio/Makefile b/drivers/vfio/Makefile
index b5e4a33..0d8f48b 100644
--- a/drivers/vfio/Makefile
+++ b/drivers/vfio/Makefile
@@ -1,4 +1,4 @@
-obj-$(CONFIG_VFIO) += vfio.o
+obj-$(CONFIG_VFIO) += vfio.o virqfd.o
 obj-$(CONFIG_VFIO_IOMMU_TYPE1) += vfio_iommu_type1.o
 obj-$(CONFIG_VFIO_IOMMU_SPAPR_TCE) += vfio_iommu_spapr_tce.o
 obj-$(CONFIG_VFIO_PCI) += pci/
diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index 9dd49c9..3f909bb 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -18,226 +18,13 @@
 #include <linux/eventfd.h>
 #include <linux/pci.h>
 #include <linux/file.h>
-#include <linux/poll.h>
 #include <linux/vfio.h>
 #include <linux/wait.h>
-#include <linux/workqueue.h>
 #include <linux/slab.h>
 
 #include "vfio_pci_private.h"
 
 /*
- * IRQfd - generic
- */
-struct virqfd {
-	struct vfio_pci_device	*vdev;
-	struct eventfd_ctx	*eventfd;
-	int			(*handler)(struct vfio_pci_device *, void *);
-	void			(*thread)(struct vfio_pci_device *, void *);
-	void			*data;
-	struct work_struct	inject;
-	wait_queue_t		wait;
-	poll_table		pt;
-	struct work_struct	shutdown;
-	struct virqfd		**pvirqfd;
-};
-
-static struct workqueue_struct *vfio_irqfd_cleanup_wq;
-
-int __init vfio_pci_virqfd_init(void)
-{
-	vfio_irqfd_cleanup_wq =
-		create_singlethread_workqueue("vfio-irqfd-cleanup");
-	if (!vfio_irqfd_cleanup_wq)
-		return -ENOMEM;
-
-	return 0;
-}
-
-void vfio_pci_virqfd_exit(void)
-{
-	destroy_workqueue(vfio_irqfd_cleanup_wq);
-}
-
-static void virqfd_deactivate(struct virqfd *virqfd)
-{
-	queue_work(vfio_irqfd_cleanup_wq, &virqfd->shutdown);
-}
-
-static int virqfd_wakeup(wait_queue_t *wait, unsigned mode, int sync, void *key)
-{
-	struct virqfd *virqfd = container_of(wait, struct virqfd, wait);
-	unsigned long flags = (unsigned long)key;
-
-	if (flags & POLLIN) {
-		/* An event has been signaled, call function */
-		if ((!virqfd->handler ||
-		     virqfd->handler(virqfd->vdev, virqfd->data)) &&
-		    virqfd->thread)
-			schedule_work(&virqfd->inject);
-	}
-
-	if (flags & POLLHUP) {
-		unsigned long flags;
-		spin_lock_irqsave(&virqfd->vdev->irqlock, flags);
-
-		/*
-		 * The eventfd is closing, if the virqfd has not yet been
-		 * queued for release, as determined by testing whether the
-		 * vdev pointer to it is still valid, queue it now.  As
-		 * with kvm irqfds, we know we won't race against the virqfd
-		 * going away because we hold wqh->lock to get here.
-		 */
-		if (*(virqfd->pvirqfd) == virqfd) {
-			*(virqfd->pvirqfd) = NULL;
-			virqfd_deactivate(virqfd);
-		}
-
-		spin_unlock_irqrestore(&virqfd->vdev->irqlock, flags);
-	}
-
-	return 0;
-}
-
-static void virqfd_ptable_queue_proc(struct file *file,
-				     wait_queue_head_t *wqh, poll_table *pt)
-{
-	struct virqfd *virqfd = container_of(pt, struct virqfd, pt);
-	add_wait_queue(wqh, &virqfd->wait);
-}
-
-static void virqfd_shutdown(struct work_struct *work)
-{
-	struct virqfd *virqfd = container_of(work, struct virqfd, shutdown);
-	u64 cnt;
-
-	eventfd_ctx_remove_wait_queue(virqfd->eventfd, &virqfd->wait, &cnt);
-	flush_work(&virqfd->inject);
-	eventfd_ctx_put(virqfd->eventfd);
-
-	kfree(virqfd);
-}
-
-static void virqfd_inject(struct work_struct *work)
-{
-	struct virqfd *virqfd = container_of(work, struct virqfd, inject);
-	if (virqfd->thread)
-		virqfd->thread(virqfd->vdev, virqfd->data);
-}
-
-static int virqfd_enable(struct vfio_pci_device *vdev,
-			 int (*handler)(struct vfio_pci_device *, void *),
-			 void (*thread)(struct vfio_pci_device *, void *),
-			 void *data, struct virqfd **pvirqfd, int fd)
-{
-	struct fd irqfd;
-	struct eventfd_ctx *ctx;
-	struct virqfd *virqfd;
-	int ret = 0;
-	unsigned int events;
-
-	virqfd = kzalloc(sizeof(*virqfd), GFP_KERNEL);
-	if (!virqfd)
-		return -ENOMEM;
-
-	virqfd->pvirqfd = pvirqfd;
-	virqfd->vdev = vdev;
-	virqfd->handler = handler;
-	virqfd->thread = thread;
-	virqfd->data = data;
-
-	INIT_WORK(&virqfd->shutdown, virqfd_shutdown);
-	INIT_WORK(&virqfd->inject, virqfd_inject);
-
-	irqfd = fdget(fd);
-	if (!irqfd.file) {
-		ret = -EBADF;
-		goto err_fd;
-	}
-
-	ctx = eventfd_ctx_fileget(irqfd.file);
-	if (IS_ERR(ctx)) {
-		ret = PTR_ERR(ctx);
-		goto err_ctx;
-	}
-
-	virqfd->eventfd = ctx;
-
-	/*
-	 * virqfds can be released by closing the eventfd or directly
-	 * through ioctl.  These are both done through a workqueue, so
-	 * we update the pointer to the virqfd under lock to avoid
-	 * pushing multiple jobs to release the same virqfd.
-	 */
-	spin_lock_irq(&vdev->irqlock);
-
-	if (*pvirqfd) {
-		spin_unlock_irq(&vdev->irqlock);
-		ret = -EBUSY;
-		goto err_busy;
-	}
-	*pvirqfd = virqfd;
-
-	spin_unlock_irq(&vdev->irqlock);
-
-	/*
-	 * Install our own custom wake-up handling so we are notified via
-	 * a callback whenever someone signals the underlying eventfd.
-	 */
-	init_waitqueue_func_entry(&virqfd->wait, virqfd_wakeup);
-	init_poll_funcptr(&virqfd->pt, virqfd_ptable_queue_proc);
-
-	events = irqfd.file->f_op->poll(irqfd.file, &virqfd->pt);
-
-	/*
-	 * Check if there was an event already pending on the eventfd
-	 * before we registered and trigger it as if we didn't miss it.
-	 */
-	if (events & POLLIN) {
-		if ((!handler || handler(vdev, data)) && thread)
-			schedule_work(&virqfd->inject);
-	}
-
-	/*
-	 * Do not drop the file until the irqfd is fully initialized,
-	 * otherwise we might race against the POLLHUP.
-	 */
-	fdput(irqfd);
-
-	return 0;
-err_busy:
-	eventfd_ctx_put(ctx);
-err_ctx:
-	fdput(irqfd);
-err_fd:
-	kfree(virqfd);
-
-	return ret;
-}
-
-static void virqfd_disable(struct vfio_pci_device *vdev,
-			   struct virqfd **pvirqfd)
-{
-	unsigned long flags;
-
-	spin_lock_irqsave(&vdev->irqlock, flags);
-
-	if (*pvirqfd) {
-		virqfd_deactivate(*pvirqfd);
-		*pvirqfd = NULL;
-	}
-
-	spin_unlock_irqrestore(&vdev->irqlock, flags);
-
-	/*
-	 * Block until we know all outstanding shutdown jobs have completed.
-	 * Even if we don't queue the job, flush the wq to be sure it's
-	 * been released.
-	 */
-	flush_workqueue(vfio_irqfd_cleanup_wq);
-}
-
-/*
  * INTx
  */
 static void vfio_send_intx_eventfd(struct vfio_pci_device *vdev, void *unused)
diff --git a/drivers/vfio/pci/vfio_pci_private.h b/drivers/vfio/pci/vfio_pci_private.h
index 9c6d5d0..72b86c7 100644
--- a/drivers/vfio/pci/vfio_pci_private.h
+++ b/drivers/vfio/pci/vfio_pci_private.h
@@ -85,9 +85,6 @@ extern ssize_t vfio_pci_vga_rw(struct vfio_pci_device *vdev, char __user *buf,
 extern int vfio_pci_init_perm_bits(void);
 extern void vfio_pci_uninit_perm_bits(void);
 
-extern int vfio_pci_virqfd_init(void);
-extern void vfio_pci_virqfd_exit(void);
-
 extern int vfio_config_init(struct vfio_pci_device *vdev);
 extern void vfio_config_free(struct vfio_pci_device *vdev);
 #endif /* VFIO_PCI_PRIVATE_H */
diff --git a/drivers/vfio/virqfd.c b/drivers/vfio/virqfd.c
new file mode 100644
index 0000000..9cf2842
--- /dev/null
+++ b/drivers/vfio/virqfd.c
@@ -0,0 +1,212 @@
+/*
+ * VFIO generic eventfd code for IRQFD support.
+ * Derived from drivers/vfio/pci/vfio_pci_intrs.c
+ *
+ * Copyright (C) 2012 Red Hat, Inc.  All rights reserved.
+ *     Author: Alex Williamson <alex.williamson@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/vfio.h>
+#include <linux/eventfd.h>
+#include <linux/file.h>
+#include <linux/slab.h>
+#include "pci/vfio_pci_private.h"
+
+static struct workqueue_struct *vfio_irqfd_cleanup_wq;
+
+int __init vfio_pci_virqfd_init(void)
+{
+	vfio_irqfd_cleanup_wq =
+		create_singlethread_workqueue("vfio-irqfd-cleanup");
+	if (!vfio_irqfd_cleanup_wq)
+		return -ENOMEM;
+
+	return 0;
+}
+
+void vfio_pci_virqfd_exit(void)
+{
+	destroy_workqueue(vfio_irqfd_cleanup_wq);
+}
+
+static void virqfd_deactivate(struct virqfd *virqfd)
+{
+	queue_work(vfio_irqfd_cleanup_wq, &virqfd->shutdown);
+}
+
+static int virqfd_wakeup(wait_queue_t *wait, unsigned mode, int sync, void *key)
+{
+	struct virqfd *virqfd = container_of(wait, struct virqfd, wait);
+	unsigned long flags = (unsigned long)key;
+
+	if (flags & POLLIN) {
+		/* An event has been signaled, call function */
+		if ((!virqfd->handler ||
+		     virqfd->handler(virqfd->vdev, virqfd->data)) &&
+		    virqfd->thread)
+			schedule_work(&virqfd->inject);
+	}
+
+	if (flags & POLLHUP) {
+		unsigned long flags;
+		spin_lock_irqsave(&virqfd->vdev->irqlock, flags);
+
+		/*
+		 * The eventfd is closing, if the virqfd has not yet been
+		 * queued for release, as determined by testing whether the
+		 * vdev pointer to it is still valid, queue it now.  As
+		 * with kvm irqfds, we know we won't race against the virqfd
+		 * going away because we hold wqh->lock to get here.
+		 */
+		if (*(virqfd->pvirqfd) == virqfd) {
+			*(virqfd->pvirqfd) = NULL;
+			virqfd_deactivate(virqfd);
+		}
+
+		spin_unlock_irqrestore(&virqfd->vdev->irqlock, flags);
+	}
+
+	return 0;
+}
+
+static void virqfd_ptable_queue_proc(struct file *file,
+				     wait_queue_head_t *wqh, poll_table *pt)
+{
+	struct virqfd *virqfd = container_of(pt, struct virqfd, pt);
+	add_wait_queue(wqh, &virqfd->wait);
+}
+
+static void virqfd_shutdown(struct work_struct *work)
+{
+	struct virqfd *virqfd = container_of(work, struct virqfd, shutdown);
+	u64 cnt;
+
+	eventfd_ctx_remove_wait_queue(virqfd->eventfd, &virqfd->wait, &cnt);
+	flush_work(&virqfd->inject);
+	eventfd_ctx_put(virqfd->eventfd);
+
+	kfree(virqfd);
+}
+
+static void virqfd_inject(struct work_struct *work)
+{
+	struct virqfd *virqfd = container_of(work, struct virqfd, inject);
+	if (virqfd->thread)
+		virqfd->thread(virqfd->vdev, virqfd->data);
+}
+
+int virqfd_enable(struct vfio_pci_device *vdev,
+			 int (*handler)(struct vfio_pci_device *, void *),
+			 void (*thread)(struct vfio_pci_device *, void *),
+			 void *data, struct virqfd **pvirqfd, int fd)
+{
+	struct fd irqfd;
+	struct eventfd_ctx *ctx;
+	struct virqfd *virqfd;
+	int ret = 0;
+	unsigned int events;
+
+	virqfd = kzalloc(sizeof(*virqfd), GFP_KERNEL);
+	if (!virqfd)
+		return -ENOMEM;
+
+	virqfd->pvirqfd = pvirqfd;
+	virqfd->vdev = vdev;
+	virqfd->handler = handler;
+	virqfd->thread = thread;
+	virqfd->data = data;
+
+	INIT_WORK(&virqfd->shutdown, virqfd_shutdown);
+	INIT_WORK(&virqfd->inject, virqfd_inject);
+
+	irqfd = fdget(fd);
+	if (!irqfd.file) {
+		ret = -EBADF;
+		goto err_fd;
+	}
+
+	ctx = eventfd_ctx_fileget(irqfd.file);
+	if (IS_ERR(ctx)) {
+		ret = PTR_ERR(ctx);
+		goto err_ctx;
+	}
+
+	virqfd->eventfd = ctx;
+
+	/*
+	 * virqfds can be released by closing the eventfd or directly
+	 * through ioctl.  These are both done through a workqueue, so
+	 * we update the pointer to the virqfd under lock to avoid
+	 * pushing multiple jobs to release the same virqfd.
+	 */
+	spin_lock_irq(&vdev->irqlock);
+
+	if (*pvirqfd) {
+		spin_unlock_irq(&vdev->irqlock);
+		ret = -EBUSY;
+		goto err_busy;
+	}
+	*pvirqfd = virqfd;
+
+	spin_unlock_irq(&vdev->irqlock);
+
+	/*
+	 * Install our own custom wake-up handling so we are notified via
+	 * a callback whenever someone signals the underlying eventfd.
+	 */
+	init_waitqueue_func_entry(&virqfd->wait, virqfd_wakeup);
+	init_poll_funcptr(&virqfd->pt, virqfd_ptable_queue_proc);
+
+	events = irqfd.file->f_op->poll(irqfd.file, &virqfd->pt);
+
+	/*
+	 * Check if there was an event already pending on the eventfd
+	 * before we registered and trigger it as if we didn't miss it.
+	 */
+	if (events & POLLIN) {
+		if ((!handler || handler(vdev, data)) && thread)
+			schedule_work(&virqfd->inject);
+	}
+
+	/*
+	 * Do not drop the file until the irqfd is fully initialized,
+	 * otherwise we might race against the POLLHUP.
+	 */
+	fdput(irqfd);
+
+	return 0;
+err_busy:
+	eventfd_ctx_put(ctx);
+err_ctx:
+	fdput(irqfd);
+err_fd:
+	kfree(virqfd);
+
+	return ret;
+}
+
+void virqfd_disable(struct vfio_pci_device *vdev,
+			   struct virqfd **pvirqfd)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&vdev->irqlock, flags);
+
+	if (*pvirqfd) {
+		virqfd_deactivate(*pvirqfd);
+		*pvirqfd = NULL;
+	}
+
+	spin_unlock_irqrestore(&vdev->irqlock, flags);
+
+	/*
+	 * Block until we know all outstanding shutdown jobs have completed.
+	 * Even if we don't queue the job, flush the wq to be sure it's
+	 * been released.
+	 */
+	flush_workqueue(vfio_irqfd_cleanup_wq);
+}
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index 81022a52..43968e8 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -14,6 +14,8 @@
 
 #include <linux/iommu.h>
 #include <linux/mm.h>
+#include <linux/workqueue.h>
+#include <linux/poll.h>
 #include <uapi/linux/vfio.h>
 
 /**
@@ -99,4 +101,29 @@ extern int vfio_external_user_iommu_id(struct vfio_group *group);
 extern long vfio_external_check_extension(struct vfio_group *group,
 					  unsigned long arg);
 
+/*
+ * IRQFD support
+ */
+struct virqfd {
+	struct vfio_pci_device	*vdev;
+	struct eventfd_ctx	*eventfd;
+	int			(*handler)(struct vfio_pci_device *, void *);
+	void			(*thread)(struct vfio_pci_device *, void *);
+	void			*data;
+	struct work_struct	inject;
+	wait_queue_t		wait;
+	poll_table		pt;
+	struct work_struct	shutdown;
+	struct virqfd		**pvirqfd;
+};
+
+extern int vfio_pci_virqfd_init(void);
+extern void vfio_pci_virqfd_exit(void);
+extern int virqfd_enable(struct vfio_pci_device *vdev,
+			 int (*handler)(struct vfio_pci_device *, void *),
+			 void (*thread)(struct vfio_pci_device *, void *),
+			 void *data, struct virqfd **pvirqfd, int fd);
+extern void virqfd_disable(struct vfio_pci_device *vdev,
+			   struct virqfd **pvirqfd);
+
 #endif /* VFIO_H */
-- 
1.8.3.2


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [RFC PATCH v6 17/20] vfio: add local lock in virqfd instead of depending on VFIO PCI
       [not found] <1401987808-23596-1-git-send-email-a.motakis@virtualopensystems.com>
                   ` (15 preceding siblings ...)
  2014-06-05 17:03 ` [RFC PATCH v6 16/20] vfio: move eventfd support code for VFIO_PCI to a sepparate file Antonios Motakis
@ 2014-06-05 17:03 ` Antonios Motakis
  2014-06-05 22:19   ` Alex Williamson
  2014-06-05 17:03 ` [RFC PATCH v6 18/20] vfio: pass an opaque pointer on virqfd initialization Antonios Motakis
                   ` (2 subsequent siblings)
  19 siblings, 1 reply; 53+ messages in thread
From: Antonios Motakis @ 2014-06-05 17:03 UTC (permalink / raw)
  To: alex.williamson, kvmarm, iommu
  Cc: tech, a.rigo, kvm, christoffer.dall, will.deacon, kim.phillips,
	stuart.yoder, eric.auger, Antonios Motakis, Alexander Gordeev,
	Bjorn Helgaas, open list

Sharing the same spinlock with the VFIO bus driver is not necessary for
the virqfd code, so remove that dependency.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
---
 drivers/vfio/pci/vfio_pci_intrs.c | 10 +++++-----
 drivers/vfio/virqfd.c             | 24 +++++++++++++-----------
 include/linux/vfio.h              |  3 +--
 3 files changed, 19 insertions(+), 18 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index 3f909bb..e56c814 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -226,8 +226,8 @@ static int vfio_intx_set_signal(struct vfio_pci_device *vdev, int fd)
 static void vfio_intx_disable(struct vfio_pci_device *vdev)
 {
 	vfio_intx_set_signal(vdev, -1);
-	virqfd_disable(vdev, &vdev->ctx[0].unmask);
-	virqfd_disable(vdev, &vdev->ctx[0].mask);
+	virqfd_disable(&vdev->ctx[0].unmask);
+	virqfd_disable(&vdev->ctx[0].mask);
 	vdev->irq_type = VFIO_PCI_NUM_IRQS;
 	vdev->num_ctx = 0;
 	kfree(vdev->ctx);
@@ -377,8 +377,8 @@ static void vfio_msi_disable(struct vfio_pci_device *vdev, bool msix)
 	vfio_msi_set_block(vdev, 0, vdev->num_ctx, NULL, msix);
 
 	for (i = 0; i < vdev->num_ctx; i++) {
-		virqfd_disable(vdev, &vdev->ctx[i].unmask);
-		virqfd_disable(vdev, &vdev->ctx[i].mask);
+		virqfd_disable(&vdev->ctx[i].unmask);
+		virqfd_disable(&vdev->ctx[i].mask);
 	}
 
 	if (msix) {
@@ -415,7 +415,7 @@ static int vfio_pci_set_intx_unmask(struct vfio_pci_device *vdev,
 					     vfio_send_intx_eventfd, NULL,
 					     &vdev->ctx[0].unmask, fd);
 
-		virqfd_disable(vdev, &vdev->ctx[0].unmask);
+		virqfd_disable(&vdev->ctx[0].unmask);
 	}
 
 	return 0;
diff --git a/drivers/vfio/virqfd.c b/drivers/vfio/virqfd.c
index 9cf2842..4528450 100644
--- a/drivers/vfio/virqfd.c
+++ b/drivers/vfio/virqfd.c
@@ -17,6 +17,7 @@
 #include "pci/vfio_pci_private.h"
 
 static struct workqueue_struct *vfio_irqfd_cleanup_wq;
+static spinlock_t lock;
 
 int __init vfio_pci_virqfd_init(void)
 {
@@ -25,6 +26,8 @@ int __init vfio_pci_virqfd_init(void)
 	if (!vfio_irqfd_cleanup_wq)
 		return -ENOMEM;
 
+	spin_lock_init(&lock);
+
 	return 0;
 }
 
@@ -53,21 +56,21 @@ static int virqfd_wakeup(wait_queue_t *wait, unsigned mode, int sync, void *key)
 
 	if (flags & POLLHUP) {
 		unsigned long flags;
-		spin_lock_irqsave(&virqfd->vdev->irqlock, flags);
+		spin_lock_irqsave(&lock, flags);
 
 		/*
 		 * The eventfd is closing, if the virqfd has not yet been
 		 * queued for release, as determined by testing whether the
-		 * vdev pointer to it is still valid, queue it now.  As
+		 * virqfd pointer to it is still valid, queue it now.  As
 		 * with kvm irqfds, we know we won't race against the virqfd
-		 * going away because we hold wqh->lock to get here.
+		 * going away because we hold the lock to get here.
 		 */
 		if (*(virqfd->pvirqfd) == virqfd) {
 			*(virqfd->pvirqfd) = NULL;
 			virqfd_deactivate(virqfd);
 		}
 
-		spin_unlock_irqrestore(&virqfd->vdev->irqlock, flags);
+		spin_unlock_irqrestore(&lock, flags);
 	}
 
 	return 0;
@@ -143,16 +146,16 @@ int virqfd_enable(struct vfio_pci_device *vdev,
 	 * we update the pointer to the virqfd under lock to avoid
 	 * pushing multiple jobs to release the same virqfd.
 	 */
-	spin_lock_irq(&vdev->irqlock);
+	spin_lock_irq(&lock);
 
 	if (*pvirqfd) {
-		spin_unlock_irq(&vdev->irqlock);
+		spin_unlock_irq(&lock);
 		ret = -EBUSY;
 		goto err_busy;
 	}
 	*pvirqfd = virqfd;
 
-	spin_unlock_irq(&vdev->irqlock);
+	spin_unlock_irq(&lock);
 
 	/*
 	 * Install our own custom wake-up handling so we are notified via
@@ -189,19 +192,18 @@ err_fd:
 	return ret;
 }
 
-void virqfd_disable(struct vfio_pci_device *vdev,
-			   struct virqfd **pvirqfd)
+void virqfd_disable(struct virqfd **pvirqfd)
 {
 	unsigned long flags;
 
-	spin_lock_irqsave(&vdev->irqlock, flags);
+	spin_lock_irqsave(&lock, flags);
 
 	if (*pvirqfd) {
 		virqfd_deactivate(*pvirqfd);
 		*pvirqfd = NULL;
 	}
 
-	spin_unlock_irqrestore(&vdev->irqlock, flags);
+	spin_unlock_irqrestore(&lock, flags);
 
 	/*
 	 * Block until we know all outstanding shutdown jobs have completed.
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index 43968e8..44c9808 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -123,7 +123,6 @@ extern int virqfd_enable(struct vfio_pci_device *vdev,
 			 int (*handler)(struct vfio_pci_device *, void *),
 			 void (*thread)(struct vfio_pci_device *, void *),
 			 void *data, struct virqfd **pvirqfd, int fd);
-extern void virqfd_disable(struct vfio_pci_device *vdev,
-			   struct virqfd **pvirqfd);
+extern void virqfd_disable(struct virqfd **pvirqfd);
 
 #endif /* VFIO_H */
-- 
1.8.3.2


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [RFC PATCH v6 18/20] vfio: pass an opaque pointer on virqfd initialization
       [not found] <1401987808-23596-1-git-send-email-a.motakis@virtualopensystems.com>
                   ` (16 preceding siblings ...)
  2014-06-05 17:03 ` [RFC PATCH v6 17/20] vfio: add local lock in virqfd instead of depending on VFIO PCI Antonios Motakis
@ 2014-06-05 17:03 ` Antonios Motakis
  2014-06-05 17:03 ` [RFC PATCH v6 19/20] vfio: initialize the virqfd workqueue in VFIO generic code Antonios Motakis
  2014-06-05 17:03 ` [RFC PATCH v6 20/20] vfio/platform: implement IRQ masking/unmasking via an eventfd Antonios Motakis
  19 siblings, 0 replies; 53+ messages in thread
From: Antonios Motakis @ 2014-06-05 17:03 UTC (permalink / raw)
  To: alex.williamson, kvmarm, iommu
  Cc: tech, a.rigo, kvm, christoffer.dall, will.deacon, kim.phillips,
	stuart.yoder, eric.auger, Antonios Motakis, Alexander Gordeev,
	Bjorn Helgaas, open list

VFIO_PCI passes the VFIO device structure *vdev via eventfd to the handler
that implements masking/unmasking of IRQs via an eventfd. We can replace
it in the virqfd infrastructure with an opaque type so we can make use
of the mechanism from other VFIO bus drivers.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
---
 drivers/vfio/pci/vfio_pci_intrs.c | 11 +++++++----
 drivers/vfio/virqfd.c             | 17 ++++++++---------
 include/linux/vfio.h              | 12 ++++++------
 3 files changed, 21 insertions(+), 19 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index e56c814..6ca22a8 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -27,8 +27,10 @@
 /*
  * INTx
  */
-static void vfio_send_intx_eventfd(struct vfio_pci_device *vdev, void *unused)
+static void vfio_send_intx_eventfd(void *opaque, void *unused)
 {
+	struct vfio_pci_device *vdev = opaque;
+
 	if (likely(is_intx(vdev) && !vdev->virq_disabled))
 		eventfd_signal(vdev->ctx[0].trigger, 1);
 }
@@ -71,9 +73,9 @@ void vfio_pci_intx_mask(struct vfio_pci_device *vdev)
  * a signal is necessary, which can then be handled via a work queue
  * or directly depending on the caller.
  */
-static int vfio_pci_intx_unmask_handler(struct vfio_pci_device *vdev,
-					void *unused)
+static int vfio_pci_intx_unmask_handler(void *opaque, void *unused)
 {
+	struct vfio_pci_device *vdev = opaque;
 	struct pci_dev *pdev = vdev->pdev;
 	unsigned long flags;
 	int ret = 0;
@@ -411,7 +413,8 @@ static int vfio_pci_set_intx_unmask(struct vfio_pci_device *vdev,
 	} else if (flags & VFIO_IRQ_SET_DATA_EVENTFD) {
 		int32_t fd = *(int32_t *)data;
 		if (fd >= 0)
-			return virqfd_enable(vdev, vfio_pci_intx_unmask_handler,
+			return virqfd_enable((void *) vdev,
+					     vfio_pci_intx_unmask_handler,
 					     vfio_send_intx_eventfd, NULL,
 					     &vdev->ctx[0].unmask, fd);
 
diff --git a/drivers/vfio/virqfd.c b/drivers/vfio/virqfd.c
index 4528450..8c97e2e 100644
--- a/drivers/vfio/virqfd.c
+++ b/drivers/vfio/virqfd.c
@@ -14,7 +14,6 @@
 #include <linux/eventfd.h>
 #include <linux/file.h>
 #include <linux/slab.h>
-#include "pci/vfio_pci_private.h"
 
 static struct workqueue_struct *vfio_irqfd_cleanup_wq;
 static spinlock_t lock;
@@ -49,7 +48,7 @@ static int virqfd_wakeup(wait_queue_t *wait, unsigned mode, int sync, void *key)
 	if (flags & POLLIN) {
 		/* An event has been signaled, call function */
 		if ((!virqfd->handler ||
-		     virqfd->handler(virqfd->vdev, virqfd->data)) &&
+		     virqfd->handler(virqfd->opaque, virqfd->data)) &&
 		    virqfd->thread)
 			schedule_work(&virqfd->inject);
 	}
@@ -99,13 +98,13 @@ static void virqfd_inject(struct work_struct *work)
 {
 	struct virqfd *virqfd = container_of(work, struct virqfd, inject);
 	if (virqfd->thread)
-		virqfd->thread(virqfd->vdev, virqfd->data);
+		virqfd->thread(virqfd->opaque, virqfd->data);
 }
 
-int virqfd_enable(struct vfio_pci_device *vdev,
-			 int (*handler)(struct vfio_pci_device *, void *),
-			 void (*thread)(struct vfio_pci_device *, void *),
-			 void *data, struct virqfd **pvirqfd, int fd)
+int virqfd_enable(void *opaque,
+		  int (*handler)(void *, void *),
+		  void (*thread)(void *, void *),
+		  void *data, struct virqfd **pvirqfd, int fd)
 {
 	struct fd irqfd;
 	struct eventfd_ctx *ctx;
@@ -118,7 +117,7 @@ int virqfd_enable(struct vfio_pci_device *vdev,
 		return -ENOMEM;
 
 	virqfd->pvirqfd = pvirqfd;
-	virqfd->vdev = vdev;
+	virqfd->opaque = opaque;
 	virqfd->handler = handler;
 	virqfd->thread = thread;
 	virqfd->data = data;
@@ -171,7 +170,7 @@ int virqfd_enable(struct vfio_pci_device *vdev,
 	 * before we registered and trigger it as if we didn't miss it.
 	 */
 	if (events & POLLIN) {
-		if ((!handler || handler(vdev, data)) && thread)
+		if ((!handler || handler(opaque, data)) && thread)
 			schedule_work(&virqfd->inject);
 	}
 
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index 44c9808..e4f7de8 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -105,10 +105,10 @@ extern long vfio_external_check_extension(struct vfio_group *group,
  * IRQFD support
  */
 struct virqfd {
-	struct vfio_pci_device	*vdev;
+	void			*opaque;
 	struct eventfd_ctx	*eventfd;
-	int			(*handler)(struct vfio_pci_device *, void *);
-	void			(*thread)(struct vfio_pci_device *, void *);
+	int			(*handler)(void *, void *);
+	void			(*thread)(void *, void *);
 	void			*data;
 	struct work_struct	inject;
 	wait_queue_t		wait;
@@ -119,9 +119,9 @@ struct virqfd {
 
 extern int vfio_pci_virqfd_init(void);
 extern void vfio_pci_virqfd_exit(void);
-extern int virqfd_enable(struct vfio_pci_device *vdev,
-			 int (*handler)(struct vfio_pci_device *, void *),
-			 void (*thread)(struct vfio_pci_device *, void *),
+extern int virqfd_enable(void *opaque,
+			 int (*handler)(void *, void *),
+			 void (*thread)(void *, void *),
 			 void *data, struct virqfd **pvirqfd, int fd);
 extern void virqfd_disable(struct virqfd **pvirqfd);
 
-- 
1.8.3.2


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [RFC PATCH v6 19/20] vfio: initialize the virqfd workqueue in VFIO generic code
       [not found] <1401987808-23596-1-git-send-email-a.motakis@virtualopensystems.com>
                   ` (17 preceding siblings ...)
  2014-06-05 17:03 ` [RFC PATCH v6 18/20] vfio: pass an opaque pointer on virqfd initialization Antonios Motakis
@ 2014-06-05 17:03 ` Antonios Motakis
  2014-06-05 17:03 ` [RFC PATCH v6 20/20] vfio/platform: implement IRQ masking/unmasking via an eventfd Antonios Motakis
  19 siblings, 0 replies; 53+ messages in thread
From: Antonios Motakis @ 2014-06-05 17:03 UTC (permalink / raw)
  To: alex.williamson, kvmarm, iommu
  Cc: tech, a.rigo, kvm, christoffer.dall, will.deacon, kim.phillips,
	stuart.yoder, eric.auger, Antonios Motakis, Bjorn Helgaas,
	Al Viro, open list

Now we have finally completely decoupled virqfd from VFIO_PCI. We can
initialize it from the VFIO generic code, in order to safely use it from
many different VFIO bus drivers.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
---
 drivers/vfio/pci/vfio_pci.c | 8 --------
 drivers/vfio/vfio.c         | 8 ++++++++
 drivers/vfio/virqfd.c       | 4 ++--
 include/linux/vfio.h        | 4 ++--
 4 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 7ba0424..9969eaf 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -899,7 +899,6 @@ static struct pci_driver vfio_pci_driver = {
 static void __exit vfio_pci_cleanup(void)
 {
 	pci_unregister_driver(&vfio_pci_driver);
-	vfio_pci_virqfd_exit();
 	vfio_pci_uninit_perm_bits();
 }
 
@@ -912,11 +911,6 @@ static int __init vfio_pci_init(void)
 	if (ret)
 		return ret;
 
-	/* Start the virqfd cleanup handler */
-	ret = vfio_pci_virqfd_init();
-	if (ret)
-		goto out_virqfd;
-
 	/* Register and scan for devices */
 	ret = pci_register_driver(&vfio_pci_driver);
 	if (ret)
@@ -925,8 +919,6 @@ static int __init vfio_pci_init(void)
 	return 0;
 
 out_driver:
-	vfio_pci_virqfd_exit();
-out_virqfd:
 	vfio_pci_uninit_perm_bits();
 	return ret;
 }
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 512f479..4f95a17 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -1470,6 +1470,11 @@ static int __init vfio_init(void)
 	if (ret)
 		goto err_cdev_add;
 
+	/* Start the virqfd cleanup handler used by some VFIO bus drivers */
+	ret = vfio_virqfd_init();
+	if (ret)
+		goto err_virqfd;
+
 	pr_info(DRIVER_DESC " version: " DRIVER_VERSION "\n");
 
 	/*
@@ -1482,6 +1487,8 @@ static int __init vfio_init(void)
 
 	return 0;
 
+err_virqfd:
+	cdev_del(&vfio.group_cdev);
 err_cdev_add:
 	unregister_chrdev_region(vfio.group_devt, MINORMASK);
 err_alloc_chrdev:
@@ -1496,6 +1503,7 @@ static void __exit vfio_cleanup(void)
 {
 	WARN_ON(!list_empty(&vfio.group_list));
 
+	vfio_virqfd_exit();
 	idr_destroy(&vfio.group_idr);
 	cdev_del(&vfio.group_cdev);
 	unregister_chrdev_region(vfio.group_devt, MINORMASK);
diff --git a/drivers/vfio/virqfd.c b/drivers/vfio/virqfd.c
index 8c97e2e..8d286ca 100644
--- a/drivers/vfio/virqfd.c
+++ b/drivers/vfio/virqfd.c
@@ -18,7 +18,7 @@
 static struct workqueue_struct *vfio_irqfd_cleanup_wq;
 static spinlock_t lock;
 
-int __init vfio_pci_virqfd_init(void)
+int __init vfio_virqfd_init(void)
 {
 	vfio_irqfd_cleanup_wq =
 		create_singlethread_workqueue("vfio-irqfd-cleanup");
@@ -30,7 +30,7 @@ int __init vfio_pci_virqfd_init(void)
 	return 0;
 }
 
-void vfio_pci_virqfd_exit(void)
+void vfio_virqfd_exit(void)
 {
 	destroy_workqueue(vfio_irqfd_cleanup_wq);
 }
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index e4f7de8..132e61f 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -117,8 +117,8 @@ struct virqfd {
 	struct virqfd		**pvirqfd;
 };
 
-extern int vfio_pci_virqfd_init(void);
-extern void vfio_pci_virqfd_exit(void);
+extern int vfio_virqfd_init(void);
+extern void vfio_virqfd_exit(void);
 extern int virqfd_enable(void *opaque,
 			 int (*handler)(void *, void *),
 			 void (*thread)(void *, void *),
-- 
1.8.3.2


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [RFC PATCH v6 20/20] vfio/platform: implement IRQ masking/unmasking via an eventfd
       [not found] <1401987808-23596-1-git-send-email-a.motakis@virtualopensystems.com>
                   ` (18 preceding siblings ...)
  2014-06-05 17:03 ` [RFC PATCH v6 19/20] vfio: initialize the virqfd workqueue in VFIO generic code Antonios Motakis
@ 2014-06-05 17:03 ` Antonios Motakis
  19 siblings, 0 replies; 53+ messages in thread
From: Antonios Motakis @ 2014-06-05 17:03 UTC (permalink / raw)
  To: alex.williamson, kvmarm, iommu
  Cc: tech, a.rigo, kvm, christoffer.dall, will.deacon, kim.phillips,
	stuart.yoder, eric.auger, Antonios Motakis, Catalin Marinas,
	Mark Rutland, Vladimir Murzin, open list

With this patch the VFIO user will be able to set an eventfd that can be
used in order to mask and unmask IRQs of platform devices.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
---
 drivers/vfio/platform/vfio_platform_irq.c     | 86 ++++++++++++++++++++++-----
 drivers/vfio/platform/vfio_platform_private.h |  2 +
 2 files changed, 72 insertions(+), 16 deletions(-)

diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
index 10dfbf0..6768508 100644
--- a/drivers/vfio/platform/vfio_platform_irq.c
+++ b/drivers/vfio/platform/vfio_platform_irq.c
@@ -108,6 +108,52 @@ static irqreturn_t vfio_irq_handler(int irq, void *dev_id)
 	return ret;
 }
 
+static void vfio_platform_unmask(struct vfio_platform_irq *irq_ctx)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&irq_ctx->lock, flags);
+
+	if (irq_ctx->masked) {
+		enable_irq(irq_ctx->hwirq);
+		irq_ctx->masked = false;
+	}
+
+	spin_unlock_irqrestore(&irq_ctx->lock, flags);
+}
+
+static void vfio_platform_mask(struct vfio_platform_irq *irq_ctx)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&irq_ctx->lock, flags);
+
+	if (!irq_ctx->masked) {
+		disable_irq(irq_ctx->hwirq);
+		irq_ctx->masked = true;
+	}
+
+	spin_unlock_irqrestore(&irq_ctx->lock, flags);
+}
+
+static int vfio_platform_unmask_handler(void *opaque, void *unused)
+{
+	struct vfio_platform_irq *irq_ctx = opaque;
+
+	vfio_platform_unmask(irq_ctx);
+
+	return 0;
+}
+
+static int vfio_platform_mask_handler(void *opaque, void *unused)
+{
+	struct vfio_platform_irq *irq_ctx = opaque;
+
+	vfio_platform_mask(irq_ctx);
+
+	return 0;
+}
+
 static int vfio_set_trigger(struct vfio_platform_device *vdev,
 			    int index, int fd)
 {
@@ -191,6 +237,7 @@ static int vfio_platform_set_irq_unmask(struct vfio_platform_device *vdev,
 				    unsigned count, uint32_t flags, void *data)
 {
 	uint8_t arr;
+	int32_t fd;
 
 	if (start != 0 || count != 1)
 		return -EINVAL;
@@ -204,19 +251,22 @@ static int vfio_platform_set_irq_unmask(struct vfio_platform_device *vdev,
 			return -EINVAL;
 
 	case VFIO_IRQ_SET_DATA_NONE:
+		vfio_platform_unmask(&vdev->irq[index]);
+		return 0;
 
-		spin_lock_irq(&vdev->irq[index].lock);
-
-		if (vdev->irq[index].masked) {
-			enable_irq(vdev->irq[index].hwirq);
-			vdev->irq[index].masked = false;
-		}
+	case VFIO_IRQ_SET_DATA_EVENTFD:
+		if (copy_from_user(&fd, data, sizeof(int32_t)))
+			return -EFAULT;
 
-		spin_unlock_irq(&vdev->irq[index].lock);
+		if (fd >= 0)
+			return virqfd_enable((void *) &vdev->irq[index],
+					     vfio_platform_unmask_handler,
+					     NULL, NULL,
+					     &vdev->irq[index].unmask, fd);
 
+		virqfd_disable(&vdev->irq[index].unmask);
 		return 0;
 
-	case VFIO_IRQ_SET_DATA_EVENTFD: /* XXX not implemented yet */
 	default:
 		return -ENOTTY;
 	}
@@ -229,6 +279,7 @@ static int vfio_platform_set_irq_mask(struct vfio_platform_device *vdev,
 				    unsigned count, uint32_t flags, void *data)
 {
 	uint8_t arr;
+	int32_t fd;
 
 	if (start != 0 || count != 1)
 		return -EINVAL;
@@ -242,19 +293,22 @@ static int vfio_platform_set_irq_mask(struct vfio_platform_device *vdev,
 			return -EINVAL;
 
 	case VFIO_IRQ_SET_DATA_NONE:
+		vfio_platform_mask(&vdev->irq[index]);
+		return 0;
 
-		spin_lock_irq(&vdev->irq[index].lock);
-
-		if (!vdev->irq[index].masked) {
-			disable_irq(vdev->irq[index].hwirq);
-			vdev->irq[index].masked = true;
-		}
+	case VFIO_IRQ_SET_DATA_EVENTFD:
+		if (copy_from_user(&fd, data, sizeof(int32_t)))
+			return -EFAULT;
 
-		spin_unlock_irq(&vdev->irq[index].lock);
+		if (fd >= 0)
+			return virqfd_enable((void *) &vdev->irq[index],
+					     vfio_platform_mask_handler,
+					     NULL, NULL,
+					     &vdev->irq[index].mask, fd);
 
+		virqfd_disable(&vdev->irq[index].mask);
 		return 0;
 
-	case VFIO_IRQ_SET_DATA_EVENTFD: /* XXX not implemented yet */
 	default:
 		return -ENOTTY;
 	}
diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
index 4d887fd..86a9201 100644
--- a/drivers/vfio/platform/vfio_platform_private.h
+++ b/drivers/vfio/platform/vfio_platform_private.h
@@ -32,6 +32,8 @@ struct vfio_platform_irq {
 	char			*name;
 	bool			masked;
 	spinlock_t		lock;
+	struct virqfd		*unmask;
+	struct virqfd		*mask;
 };
 
 struct vfio_platform_region {
-- 
1.8.3.2


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* RE: [RFC PATCH v6 04/20] iommu/arm-smmu: add capability IOMMU_CAP_INTR_REMAP
  2014-06-05 17:03 ` [RFC PATCH v6 04/20] iommu/arm-smmu: add capability IOMMU_CAP_INTR_REMAP Antonios Motakis
@ 2014-06-05 18:31   ` Varun Sethi
  2014-06-08 10:31   ` Christoffer Dall
  1 sibling, 0 replies; 53+ messages in thread
From: Varun Sethi @ 2014-06-05 18:31 UTC (permalink / raw)
  To: Antonios Motakis, alex.williamson, kvmarm, iommu
  Cc: kvm, eric.auger, open list, will.deacon, a.rigo, Stuart Yoder,
	moderated list:ARM SMMU DRIVER, tech, christoffer.dall



> -----Original Message-----
> From: iommu-bounces@lists.linux-foundation.org [mailto:iommu-
> bounces@lists.linux-foundation.org] On Behalf Of Antonios Motakis
> Sent: Thursday, June 05, 2014 10:33 PM
> To: alex.williamson@redhat.com; kvmarm@lists.cs.columbia.edu;
> iommu@lists.linux-foundation.org
> Cc: kvm@vger.kernel.org; eric.auger@linaro.org; open list;
> will.deacon@arm.com; a.rigo@virtualopensystems.com; Yoder Stuart-B08248;
> moderated list:ARM SMMU DRIVER; Antonios Motakis;
> tech@virtualopensystems.com; christoffer.dall@linaro.org
> Subject: [RFC PATCH v6 04/20] iommu/arm-smmu: add capability
> IOMMU_CAP_INTR_REMAP
> 
> With an ARM SMMU, interrupt remapping should always be safe from the
> SMMU's point of view, as it is properly handled by the GIC.
> 
> Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
> ---
>  drivers/iommu/arm-smmu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index
> 15ab2af..ff29402 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -1544,7 +1544,7 @@ static int arm_smmu_domain_has_cap(struct
> iommu_domain *domain,
>  	if (smmu_domain->root_cfg.smmu->features &
> ARM_SMMU_FEAT_COHERENT_WALK)
>  		caps |= IOMMU_CAP_CACHE_COHERENCY;
> 
> -	caps |= IOMMU_CAP_NOEXEC;
> +	caps |= IOMMU_CAP_NOEXEC | IOMMU_CAP_INTR_REMAP;
> 
>  	return !!(cap & caps);
>  }
Why mention this as an IOMMU capability when IOMMU doesn't support it? Also, wouldn't this depend on the GIC IP version? What are we gaining by adding this IOMMU capability list?

-Varun

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC PATCH v6 02/20] iommu: add capability IOMMU_CAP_NOEXEC
  2014-06-05 17:03 ` [RFC PATCH v6 02/20] iommu: add capability IOMMU_CAP_NOEXEC Antonios Motakis
@ 2014-06-05 20:03   ` Alex Williamson
  0 siblings, 0 replies; 53+ messages in thread
From: Alex Williamson @ 2014-06-05 20:03 UTC (permalink / raw)
  To: Antonios Motakis
  Cc: kvmarm, iommu, tech, a.rigo, kvm, christoffer.dall, will.deacon,
	kim.phillips, stuart.yoder, eric.auger, Joerg Roedel,
	Varun Sethi, Alexey Kardashevskiy, Shuah Khan,
	Upinder Malhi (umalhi),
	open list

On Thu, 2014-06-05 at 19:03 +0200, Antonios Motakis wrote:
> Some IOMMUs accept an IOMMU_NOEXEC protection flag in addition to
> IOMMU_READ and IOMMU_WRITE. Expose this as an IOMMU capability.
> 
> Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
> ---
>  include/linux/iommu.h | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index fc464d2..7e152fb 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -57,8 +57,9 @@ struct iommu_domain {
>  	struct iommu_domain_geometry geometry;
>  };
>  
> -#define IOMMU_CAP_CACHE_COHERENCY	0x1
> -#define IOMMU_CAP_INTR_REMAP		0x2	/* isolates device intrs */
> +#define IOMMU_CAP_CACHE_COHERENCY	(1 << 0)
> +#define IOMMU_CAP_INTR_REMAP		(1 << 1) /* isolates device intrs */
> +#define IOMMU_CAP_NOEXEC		(1 << 2) /* IOMMU_NOEXEC flag */
>  
>  /*
>   * Following constraints are specifc to FSL_PAMUV1:

Hmm, was this really intended to be a bitmap?  Why?  Thanks,

Alex


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC PATCH v6 07/20] vfio/iommu_type1: implement the VFIO_DMA_MAP_FLAG_NOEXEC flag
  2014-06-05 17:03 ` [RFC PATCH v6 07/20] vfio/iommu_type1: implement " Antonios Motakis
@ 2014-06-05 20:48   ` Alex Williamson
  0 siblings, 0 replies; 53+ messages in thread
From: Alex Williamson @ 2014-06-05 20:48 UTC (permalink / raw)
  To: Antonios Motakis
  Cc: kvmarm, iommu, tech, a.rigo, kvm, christoffer.dall, will.deacon,
	kim.phillips, stuart.yoder, eric.auger, open list

On Thu, 2014-06-05 at 19:03 +0200, Antonios Motakis wrote:
> Some IOMMU drivers, such as the ARM SMMU driver, make available the
> IOMMU_NOEXEC flag, to set the page tables for a device as XN (execute never).
> This affects devices such as the ARM PL330 DMA Controller, which respects
> this flag and will refuse to fetch DMA instructions from memory where the
> XN flag has been set.
> 
> Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
> ---
>  drivers/vfio/vfio_iommu_type1.c | 30 +++++++++++++++++++++++++++++-
>  1 file changed, 29 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index 6673e7b..e2566fd 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -80,6 +80,24 @@ struct vfio_group {
>  	struct list_head	next;
>  };
>  
> +static int vfio_domains_have_cap_noexec(struct vfio_iommu *iommu)
> +{
> +	struct vfio_domain *d;
> +	int ret = 1;
> +
> +	mutex_lock(&iommu->lock);
> +	list_for_each_entry(d, &iommu->domain_list, next) {
> +		if (!iommu_domain_has_cap(d->domain, IOMMU_CAP_NOEXEC)) {
> +			ret = 0;
> +			break;
> +		}
> +	}
> +	mutex_unlock(&iommu->lock);
> +
> +	return ret;
> +}
> +
> +
>  /*
>   * This code handles mapping and unmapping of user data buffers
>   * into DMA'ble space using the IOMMU
> @@ -542,6 +560,11 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
>  		prot |= IOMMU_WRITE;
>  	if (map->flags & VFIO_DMA_MAP_FLAG_READ)
>  		prot |= IOMMU_READ;
> +	if (map->flags & VFIO_DMA_MAP_FLAG_NOEXEC) {
> +		if (!vfio_domains_have_cap_noexec(iommu))
> +			return -EINVAL;
> +		prot |= IOMMU_NOEXEC;
> +	}
>  
>  	if (!prot)
>  		return -EINVAL; /* No READ/WRITE? */
> @@ -899,6 +922,10 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
>  			if (!iommu)
>  				return 0;
>  			return vfio_domains_have_iommu_cache(iommu);
> +		case VFIO_IOMMU_PROT_NOEXEC:
> +			if (!iommu)
> +				return 0;
> +			return vfio_domains_have_cap_noexec(iommu);
>  		default:
>  			return 0;
>  		}
> @@ -922,7 +949,8 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
>  	} else if (cmd == VFIO_IOMMU_MAP_DMA) {
>  		struct vfio_iommu_type1_dma_map map;
>  		uint32_t mask = VFIO_DMA_MAP_FLAG_READ |
> -				VFIO_DMA_MAP_FLAG_WRITE;
> +				VFIO_DMA_MAP_FLAG_WRITE |
> +				VFIO_DMA_MAP_FLAG_NOEXEC;
>  
>  		minsz = offsetofend(struct vfio_iommu_type1_dma_map, size);
>  

This doesn't look complete.  What happens if we support NOEXEC, then we
add a group behind an IOMMU that doesn't support NOEXEC?  In one case we
attempt to use the same domain, which might imply the same IOMMU page
tables, but is the hardware going to support that?  In the other case we
create a new domain, then try to replay the mappings, including the
NOEXEC bit... what's that going to do?  Can we add a non-NOEXEC domain
to a container that was previously NOEXEC?  The point of changing the
flag from EXEC to NOEXEC was to make it enforceable, but doesn't that
mean we need to prevent that happening if we have NOEXEC mappings?

Something needs to happen in vfio_iommu_type1_attach_group() to figure
out what's allowed.  Also, vfio_domains_have_cap_noexec() should be
named vfio_domains_have_iommu_nexec() to match _iommu_cache() and
possibly they should both be wrappers to a common function if they end
up sharing similar implementations.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC PATCH v6 10/20] vfio/platform: return info for device and its memory mapped IO regions
  2014-06-05 17:03 ` [RFC PATCH v6 10/20] vfio/platform: return info for device and its memory mapped IO regions Antonios Motakis
@ 2014-06-05 21:14   ` Alex Williamson
  2014-06-06 16:39     ` Antonios Motakis
  0 siblings, 1 reply; 53+ messages in thread
From: Alex Williamson @ 2014-06-05 21:14 UTC (permalink / raw)
  To: Antonios Motakis
  Cc: kvmarm, iommu, tech, a.rigo, kvm, christoffer.dall, will.deacon,
	kim.phillips, stuart.yoder, eric.auger, Catalin Marinas,
	Mark Rutland, Vladimir Murzin, open list

On Thu, 2014-06-05 at 19:03 +0200, Antonios Motakis wrote:
> A VFIO userspace driver will start by opening the VFIO device
> that corresponds to an IOMMU group, and will use the ioctl interface
> to get the basic device info, such as number of memory regions and
> interrupts, and their properties.
> 
> This patch enables the IOCTLs:
>  - VFIO_DEVICE_GET_INFO
>  - VFIO_DEVICE_GET_REGION_INFO
> 
> IRQ info is provided by one of the latter patches.
> 
> Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
> ---
>  drivers/vfio/platform/vfio_platform.c         | 79 +++++++++++++++++++++++++--
>  drivers/vfio/platform/vfio_platform_private.h | 17 ++++++
>  2 files changed, 90 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/vfio/platform/vfio_platform.c b/drivers/vfio/platform/vfio_platform.c
> index 1df76d8..eeaebc4 100644
> --- a/drivers/vfio/platform/vfio_platform.c
> +++ b/drivers/vfio/platform/vfio_platform.c
> @@ -34,17 +34,66 @@
>  #define DRIVER_AUTHOR   "Antonios Motakis <a.motakis@virtualopensystems.com>"
>  #define DRIVER_DESC     "VFIO for platform devices - User Level meta-driver"
>  
> +static int vfio_platform_regions_init(struct vfio_platform_device *vdev)
> +{
> +	int cnt = 0, i;
> +
> +	while (platform_get_resource(vdev->pdev, IORESOURCE_MEM, cnt))
> +		cnt++;
> +
> +	vdev->region = kzalloc(sizeof(struct vfio_platform_region) * cnt,
> +				GFP_KERNEL);
> +	if (!vdev->region)
> +		return -ENOMEM;
> +
> +	for (i = 0; i < cnt;  i++) {
> +		struct resource *res =
> +			platform_get_resource(vdev->pdev, IORESOURCE_MEM, i);
> +
> +		if (!res)
> +			goto err;
> +
> +		vdev->region[i].addr = res->start;
> +		vdev->region[i].size = resource_size(res);
> +		vdev->region[i].flags = 0;
> +	}
> +
> +	vdev->num_regions = cnt;
> +
> +	return 0;
> +err:
> +	kfree(vdev->region);
> +	return -EINVAL;
> +}
> +
> +static void vfio_platform_regions_cleanup(struct vfio_platform_device *vdev)
> +{
> +	vdev->num_regions = 0;
> +	kfree(vdev->region);
> +}
> +
>  static void vfio_platform_release(void *device_data)
>  {
> +	struct vfio_platform_device *vdev = device_data;
> +
> +	vfio_platform_regions_cleanup(vdev);
> +
>  	module_put(THIS_MODULE);
>  }
>  
>  static int vfio_platform_open(void *device_data)
>  {
> +	struct vfio_platform_device *vdev = device_data;
> +	int ret;
> +
>  	if (!try_module_get(THIS_MODULE))
>  		return -ENODEV;
>  
> -	return 0;
> +	ret = vfio_platform_regions_init(vdev);
> +	if (ret)
> +		module_put(THIS_MODULE);
> +
> +	return ret;

The user can call VFIO_GROUP_GET_DEVICE_FD for a single device more than
once, vfio_pci maintains a reference count to make sure we only allocate
and initialize on the first open and cleanup on the last close.

>  }
>  
>  static long vfio_platform_ioctl(void *device_data,
> @@ -65,18 +114,36 @@ static long vfio_platform_ioctl(void *device_data,
>  			return -EINVAL;
>  
>  		info.flags = VFIO_DEVICE_FLAGS_PLATFORM;
> -		info.num_regions = 0;
> +		info.num_regions = vdev->num_regions;
>  		info.num_irqs = 0;
>  
>  		return copy_to_user((void __user *)arg, &info, minsz);
>  
> -	} else if (cmd == VFIO_DEVICE_GET_REGION_INFO)
> -		return -EINVAL;
> +	} else if (cmd == VFIO_DEVICE_GET_REGION_INFO) {
> +		struct vfio_region_info info;
> +
> +		minsz = offsetofend(struct vfio_region_info, offset);
> +
> +		if (copy_from_user(&info, (void __user *)arg, minsz))
> +			return -EFAULT;
> +
> +		if (info.argsz < minsz)
> +			return -EINVAL;
> +
> +		if (info.index >= vdev->num_regions)
> +			return -EINVAL;
> +
> +		/* map offset to the physical address  */
> +		info.offset = VFIO_PLATFORM_INDEX_TO_OFFSET(info.index);
> +		info.size = vdev->region[info.index].size;
> +		info.flags = vdev->region[info.index].flags;
> +
> +		return copy_to_user((void __user *)arg, &info, minsz);
>  
> -	else if (cmd == VFIO_DEVICE_GET_IRQ_INFO)
> +	} else if (cmd == VFIO_DEVICE_GET_IRQ_INFO) {
>  		return -EINVAL;
>  
> -	else if (cmd == VFIO_DEVICE_SET_IRQS)
> +	} else if (cmd == VFIO_DEVICE_SET_IRQS)
>  		return -EINVAL;
>  
>  	else if (cmd == VFIO_DEVICE_RESET)
> diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
> index 4ae88f8..3448f918 100644
> --- a/drivers/vfio/platform/vfio_platform_private.h
> +++ b/drivers/vfio/platform/vfio_platform_private.h
> @@ -15,8 +15,25 @@
>  #ifndef VFIO_PLATFORM_PRIVATE_H
>  #define VFIO_PLATFORM_PRIVATE_H
>  
> +#define VFIO_PLATFORM_OFFSET_SHIFT   40
> +#define VFIO_PLATFORM_OFFSET_MASK (((u64)(1) << VFIO_PLATFORM_OFFSET_SHIFT) - 1)
> +
> +#define VFIO_PLATFORM_OFFSET_TO_INDEX(off)	\
> +	(off >> VFIO_PLATFORM_OFFSET_SHIFT)
> +
> +#define VFIO_PLATFORM_INDEX_TO_OFFSET(index)	\
> +	((u64)(index) << VFIO_PLATFORM_OFFSET_SHIFT)
> +
> +struct vfio_platform_region {
> +	u64			addr;
> +	resource_size_t		size;
> +	u32			flags;
> +};
> +
>  struct vfio_platform_device {
>  	struct platform_device		*pdev;
> +	struct vfio_platform_region	*region;

"regions" (plural) has slightly better consistency.  Thanks,

Alex

> +	u32				num_regions;
>  };
>  
>  #endif /* VFIO_PLATFORM_PRIVATE_H */




^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC PATCH v6 17/20] vfio: add local lock in virqfd instead of depending on VFIO PCI
  2014-06-05 17:03 ` [RFC PATCH v6 17/20] vfio: add local lock in virqfd instead of depending on VFIO PCI Antonios Motakis
@ 2014-06-05 22:19   ` Alex Williamson
  2014-06-06 16:57     ` Antonios Motakis
  0 siblings, 1 reply; 53+ messages in thread
From: Alex Williamson @ 2014-06-05 22:19 UTC (permalink / raw)
  To: Antonios Motakis
  Cc: kvmarm, iommu, tech, a.rigo, kvm, christoffer.dall, will.deacon,
	kim.phillips, stuart.yoder, eric.auger, Alexander Gordeev,
	Bjorn Helgaas, open list

On Thu, 2014-06-05 at 19:03 +0200, Antonios Motakis wrote:
> Sharing the same spinlock with the VFIO bus driver is not necessary for
> the virqfd code, so remove that dependency.

I like the idea of consolidating this code, but I need more
justification for why the use of the lock here is independent of the bus
driver and why it's ok to move from a per-device lock to a global lock.
Thanks,

Alex

> 
> Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
> ---
>  drivers/vfio/pci/vfio_pci_intrs.c | 10 +++++-----
>  drivers/vfio/virqfd.c             | 24 +++++++++++++-----------
>  include/linux/vfio.h              |  3 +--
>  3 files changed, 19 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
> index 3f909bb..e56c814 100644
> --- a/drivers/vfio/pci/vfio_pci_intrs.c
> +++ b/drivers/vfio/pci/vfio_pci_intrs.c
> @@ -226,8 +226,8 @@ static int vfio_intx_set_signal(struct vfio_pci_device *vdev, int fd)
>  static void vfio_intx_disable(struct vfio_pci_device *vdev)
>  {
>  	vfio_intx_set_signal(vdev, -1);
> -	virqfd_disable(vdev, &vdev->ctx[0].unmask);
> -	virqfd_disable(vdev, &vdev->ctx[0].mask);
> +	virqfd_disable(&vdev->ctx[0].unmask);
> +	virqfd_disable(&vdev->ctx[0].mask);
>  	vdev->irq_type = VFIO_PCI_NUM_IRQS;
>  	vdev->num_ctx = 0;
>  	kfree(vdev->ctx);
> @@ -377,8 +377,8 @@ static void vfio_msi_disable(struct vfio_pci_device *vdev, bool msix)
>  	vfio_msi_set_block(vdev, 0, vdev->num_ctx, NULL, msix);
>  
>  	for (i = 0; i < vdev->num_ctx; i++) {
> -		virqfd_disable(vdev, &vdev->ctx[i].unmask);
> -		virqfd_disable(vdev, &vdev->ctx[i].mask);
> +		virqfd_disable(&vdev->ctx[i].unmask);
> +		virqfd_disable(&vdev->ctx[i].mask);
>  	}
>  
>  	if (msix) {
> @@ -415,7 +415,7 @@ static int vfio_pci_set_intx_unmask(struct vfio_pci_device *vdev,
>  					     vfio_send_intx_eventfd, NULL,
>  					     &vdev->ctx[0].unmask, fd);
>  
> -		virqfd_disable(vdev, &vdev->ctx[0].unmask);
> +		virqfd_disable(&vdev->ctx[0].unmask);
>  	}
>  
>  	return 0;
> diff --git a/drivers/vfio/virqfd.c b/drivers/vfio/virqfd.c
> index 9cf2842..4528450 100644
> --- a/drivers/vfio/virqfd.c
> +++ b/drivers/vfio/virqfd.c
> @@ -17,6 +17,7 @@
>  #include "pci/vfio_pci_private.h"
>  
>  static struct workqueue_struct *vfio_irqfd_cleanup_wq;
> +static spinlock_t lock;
>  
>  int __init vfio_pci_virqfd_init(void)
>  {
> @@ -25,6 +26,8 @@ int __init vfio_pci_virqfd_init(void)
>  	if (!vfio_irqfd_cleanup_wq)
>  		return -ENOMEM;
>  
> +	spin_lock_init(&lock);
> +
>  	return 0;
>  }
>  
> @@ -53,21 +56,21 @@ static int virqfd_wakeup(wait_queue_t *wait, unsigned mode, int sync, void *key)
>  
>  	if (flags & POLLHUP) {
>  		unsigned long flags;
> -		spin_lock_irqsave(&virqfd->vdev->irqlock, flags);
> +		spin_lock_irqsave(&lock, flags);
>  
>  		/*
>  		 * The eventfd is closing, if the virqfd has not yet been
>  		 * queued for release, as determined by testing whether the
> -		 * vdev pointer to it is still valid, queue it now.  As
> +		 * virqfd pointer to it is still valid, queue it now.  As
>  		 * with kvm irqfds, we know we won't race against the virqfd
> -		 * going away because we hold wqh->lock to get here.
> +		 * going away because we hold the lock to get here.
>  		 */
>  		if (*(virqfd->pvirqfd) == virqfd) {
>  			*(virqfd->pvirqfd) = NULL;
>  			virqfd_deactivate(virqfd);
>  		}
>  
> -		spin_unlock_irqrestore(&virqfd->vdev->irqlock, flags);
> +		spin_unlock_irqrestore(&lock, flags);
>  	}
>  
>  	return 0;
> @@ -143,16 +146,16 @@ int virqfd_enable(struct vfio_pci_device *vdev,
>  	 * we update the pointer to the virqfd under lock to avoid
>  	 * pushing multiple jobs to release the same virqfd.
>  	 */
> -	spin_lock_irq(&vdev->irqlock);
> +	spin_lock_irq(&lock);
>  
>  	if (*pvirqfd) {
> -		spin_unlock_irq(&vdev->irqlock);
> +		spin_unlock_irq(&lock);
>  		ret = -EBUSY;
>  		goto err_busy;
>  	}
>  	*pvirqfd = virqfd;
>  
> -	spin_unlock_irq(&vdev->irqlock);
> +	spin_unlock_irq(&lock);
>  
>  	/*
>  	 * Install our own custom wake-up handling so we are notified via
> @@ -189,19 +192,18 @@ err_fd:
>  	return ret;
>  }
>  
> -void virqfd_disable(struct vfio_pci_device *vdev,
> -			   struct virqfd **pvirqfd)
> +void virqfd_disable(struct virqfd **pvirqfd)
>  {
>  	unsigned long flags;
>  
> -	spin_lock_irqsave(&vdev->irqlock, flags);
> +	spin_lock_irqsave(&lock, flags);
>  
>  	if (*pvirqfd) {
>  		virqfd_deactivate(*pvirqfd);
>  		*pvirqfd = NULL;
>  	}
>  
> -	spin_unlock_irqrestore(&vdev->irqlock, flags);
> +	spin_unlock_irqrestore(&lock, flags);
>  
>  	/*
>  	 * Block until we know all outstanding shutdown jobs have completed.
> diff --git a/include/linux/vfio.h b/include/linux/vfio.h
> index 43968e8..44c9808 100644
> --- a/include/linux/vfio.h
> +++ b/include/linux/vfio.h
> @@ -123,7 +123,6 @@ extern int virqfd_enable(struct vfio_pci_device *vdev,
>  			 int (*handler)(struct vfio_pci_device *, void *),
>  			 void (*thread)(struct vfio_pci_device *, void *),
>  			 void *data, struct virqfd **pvirqfd, int fd);
> -extern void virqfd_disable(struct vfio_pci_device *vdev,
> -			   struct virqfd **pvirqfd);
> +extern void virqfd_disable(struct virqfd **pvirqfd);
>  
>  #endif /* VFIO_H */




^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC PATCH v6 10/20] vfio/platform: return info for device and its memory mapped IO regions
  2014-06-05 21:14   ` Alex Williamson
@ 2014-06-06 16:39     ` Antonios Motakis
  0 siblings, 0 replies; 53+ messages in thread
From: Antonios Motakis @ 2014-06-06 16:39 UTC (permalink / raw)
  To: Alex Williamson
  Cc: kvm-arm, Linux IOMMU, VirtualOpenSystems Technical Team,
	alvise rigo, KVM devel mailing list, Christoffer Dall,
	Will Deacon, Kim Phillips, Stuart Yoder, Eric Auger,
	Catalin Marinas, Mark Rutland, Vladimir Murzin, open list

On Thu, Jun 5, 2014 at 11:14 PM, Alex Williamson
<alex.williamson@redhat.com> wrote:
> On Thu, 2014-06-05 at 19:03 +0200, Antonios Motakis wrote:
>> A VFIO userspace driver will start by opening the VFIO device
>> that corresponds to an IOMMU group, and will use the ioctl interface
>> to get the basic device info, such as number of memory regions and
>> interrupts, and their properties.
>>
>> This patch enables the IOCTLs:
>>  - VFIO_DEVICE_GET_INFO
>>  - VFIO_DEVICE_GET_REGION_INFO
>>
>> IRQ info is provided by one of the latter patches.
>>
>> Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
>> ---
>>  drivers/vfio/platform/vfio_platform.c         | 79 +++++++++++++++++++++++++--
>>  drivers/vfio/platform/vfio_platform_private.h | 17 ++++++
>>  2 files changed, 90 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/vfio/platform/vfio_platform.c b/drivers/vfio/platform/vfio_platform.c
>> index 1df76d8..eeaebc4 100644
>> --- a/drivers/vfio/platform/vfio_platform.c
>> +++ b/drivers/vfio/platform/vfio_platform.c
>> @@ -34,17 +34,66 @@
>>  #define DRIVER_AUTHOR   "Antonios Motakis <a.motakis@virtualopensystems.com>"
>>  #define DRIVER_DESC     "VFIO for platform devices - User Level meta-driver"
>>
>> +static int vfio_platform_regions_init(struct vfio_platform_device *vdev)
>> +{
>> +     int cnt = 0, i;
>> +
>> +     while (platform_get_resource(vdev->pdev, IORESOURCE_MEM, cnt))
>> +             cnt++;
>> +
>> +     vdev->region = kzalloc(sizeof(struct vfio_platform_region) * cnt,
>> +                             GFP_KERNEL);
>> +     if (!vdev->region)
>> +             return -ENOMEM;
>> +
>> +     for (i = 0; i < cnt;  i++) {
>> +             struct resource *res =
>> +                     platform_get_resource(vdev->pdev, IORESOURCE_MEM, i);
>> +
>> +             if (!res)
>> +                     goto err;
>> +
>> +             vdev->region[i].addr = res->start;
>> +             vdev->region[i].size = resource_size(res);
>> +             vdev->region[i].flags = 0;
>> +     }
>> +
>> +     vdev->num_regions = cnt;
>> +
>> +     return 0;
>> +err:
>> +     kfree(vdev->region);
>> +     return -EINVAL;
>> +}
>> +
>> +static void vfio_platform_regions_cleanup(struct vfio_platform_device *vdev)
>> +{
>> +     vdev->num_regions = 0;
>> +     kfree(vdev->region);
>> +}
>> +
>>  static void vfio_platform_release(void *device_data)
>>  {
>> +     struct vfio_platform_device *vdev = device_data;
>> +
>> +     vfio_platform_regions_cleanup(vdev);
>> +
>>       module_put(THIS_MODULE);
>>  }
>>
>>  static int vfio_platform_open(void *device_data)
>>  {
>> +     struct vfio_platform_device *vdev = device_data;
>> +     int ret;
>> +
>>       if (!try_module_get(THIS_MODULE))
>>               return -ENODEV;
>>
>> -     return 0;
>> +     ret = vfio_platform_regions_init(vdev);
>> +     if (ret)
>> +             module_put(THIS_MODULE);
>> +
>> +     return ret;
>
> The user can call VFIO_GROUP_GET_DEVICE_FD for a single device more than
> once, vfio_pci maintains a reference count to make sure we only allocate
> and initialize on the first open and cleanup on the last close.
>

Ack

>>  }
>>
>>  static long vfio_platform_ioctl(void *device_data,
>> @@ -65,18 +114,36 @@ static long vfio_platform_ioctl(void *device_data,
>>                       return -EINVAL;
>>
>>               info.flags = VFIO_DEVICE_FLAGS_PLATFORM;
>> -             info.num_regions = 0;
>> +             info.num_regions = vdev->num_regions;
>>               info.num_irqs = 0;
>>
>>               return copy_to_user((void __user *)arg, &info, minsz);
>>
>> -     } else if (cmd == VFIO_DEVICE_GET_REGION_INFO)
>> -             return -EINVAL;
>> +     } else if (cmd == VFIO_DEVICE_GET_REGION_INFO) {
>> +             struct vfio_region_info info;
>> +
>> +             minsz = offsetofend(struct vfio_region_info, offset);
>> +
>> +             if (copy_from_user(&info, (void __user *)arg, minsz))
>> +                     return -EFAULT;
>> +
>> +             if (info.argsz < minsz)
>> +                     return -EINVAL;
>> +
>> +             if (info.index >= vdev->num_regions)
>> +                     return -EINVAL;
>> +
>> +             /* map offset to the physical address  */
>> +             info.offset = VFIO_PLATFORM_INDEX_TO_OFFSET(info.index);
>> +             info.size = vdev->region[info.index].size;
>> +             info.flags = vdev->region[info.index].flags;
>> +
>> +             return copy_to_user((void __user *)arg, &info, minsz);
>>
>> -     else if (cmd == VFIO_DEVICE_GET_IRQ_INFO)
>> +     } else if (cmd == VFIO_DEVICE_GET_IRQ_INFO) {
>>               return -EINVAL;
>>
>> -     else if (cmd == VFIO_DEVICE_SET_IRQS)
>> +     } else if (cmd == VFIO_DEVICE_SET_IRQS)
>>               return -EINVAL;
>>
>>       else if (cmd == VFIO_DEVICE_RESET)
>> diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
>> index 4ae88f8..3448f918 100644
>> --- a/drivers/vfio/platform/vfio_platform_private.h
>> +++ b/drivers/vfio/platform/vfio_platform_private.h
>> @@ -15,8 +15,25 @@
>>  #ifndef VFIO_PLATFORM_PRIVATE_H
>>  #define VFIO_PLATFORM_PRIVATE_H
>>
>> +#define VFIO_PLATFORM_OFFSET_SHIFT   40
>> +#define VFIO_PLATFORM_OFFSET_MASK (((u64)(1) << VFIO_PLATFORM_OFFSET_SHIFT) - 1)
>> +
>> +#define VFIO_PLATFORM_OFFSET_TO_INDEX(off)   \
>> +     (off >> VFIO_PLATFORM_OFFSET_SHIFT)
>> +
>> +#define VFIO_PLATFORM_INDEX_TO_OFFSET(index) \
>> +     ((u64)(index) << VFIO_PLATFORM_OFFSET_SHIFT)
>> +
>> +struct vfio_platform_region {
>> +     u64                     addr;
>> +     resource_size_t         size;
>> +     u32                     flags;
>> +};
>> +
>>  struct vfio_platform_device {
>>       struct platform_device          *pdev;
>> +     struct vfio_platform_region     *region;
>
> "regions" (plural) has slightly better consistency.  Thanks,
>

Ack as well.


-- 
Antonios Motakis
Virtual Open Systems

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC PATCH v6 17/20] vfio: add local lock in virqfd instead of depending on VFIO PCI
  2014-06-05 22:19   ` Alex Williamson
@ 2014-06-06 16:57     ` Antonios Motakis
  0 siblings, 0 replies; 53+ messages in thread
From: Antonios Motakis @ 2014-06-06 16:57 UTC (permalink / raw)
  To: Alex Williamson
  Cc: kvm-arm, Linux IOMMU, VirtualOpenSystems Technical Team,
	alvise rigo, KVM devel mailing list, Christoffer Dall,
	Will Deacon, Kim Phillips, Stuart Yoder, Eric Auger,
	Alexander Gordeev, Bjorn Helgaas, open list

On Fri, Jun 6, 2014 at 12:19 AM, Alex Williamson
<alex.williamson@redhat.com> wrote:
> On Thu, 2014-06-05 at 19:03 +0200, Antonios Motakis wrote:
>> Sharing the same spinlock with the VFIO bus driver is not necessary for
>> the virqfd code, so remove that dependency.
>
> I like the idea of consolidating this code, but I need more
> justification for why the use of the lock here is independent of the bus
> driver and why it's ok to move from a per-device lock to a global lock.
> Thanks,

In the bus driver we use the lock when working with IRQs (for example
when masking/unmasking). VFIO_PCI will use a global lock for all IRQs
of a device, VFIO_PLATFORM will use a different lock for each IRQ.

The virqfd code, as decoupled here from the VFIO_PCI code will not
touch this directly, but a lock is used to keep  any *virqfds
variables consistent. In fact, virqfd is not married to IRQs in
particular, in theory you could register a handler for any eventfd. So
the lock here is used only when creating and destroying eventfds.

I believe the virqfd code should not be particularly aware of what is
behind an eventfd, and what the right lock(s) to use would be - if
any. Any sensitive variables will be accessed only from he handler
registered from the bus driver, who can still hold the spinlock to do
it safely.

Virqfd just needs to keep accesses to any struct *virqfd safe, but
this comes into play only when creating or destroying eventfds, so the
global lock should not be a bottleneck.

>
> Alex
>
>>
>> Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
>> ---
>>  drivers/vfio/pci/vfio_pci_intrs.c | 10 +++++-----
>>  drivers/vfio/virqfd.c             | 24 +++++++++++++-----------
>>  include/linux/vfio.h              |  3 +--
>>  3 files changed, 19 insertions(+), 18 deletions(-)
>>
>> diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
>> index 3f909bb..e56c814 100644
>> --- a/drivers/vfio/pci/vfio_pci_intrs.c
>> +++ b/drivers/vfio/pci/vfio_pci_intrs.c
>> @@ -226,8 +226,8 @@ static int vfio_intx_set_signal(struct vfio_pci_device *vdev, int fd)
>>  static void vfio_intx_disable(struct vfio_pci_device *vdev)
>>  {
>>       vfio_intx_set_signal(vdev, -1);
>> -     virqfd_disable(vdev, &vdev->ctx[0].unmask);
>> -     virqfd_disable(vdev, &vdev->ctx[0].mask);
>> +     virqfd_disable(&vdev->ctx[0].unmask);
>> +     virqfd_disable(&vdev->ctx[0].mask);
>>       vdev->irq_type = VFIO_PCI_NUM_IRQS;
>>       vdev->num_ctx = 0;
>>       kfree(vdev->ctx);
>> @@ -377,8 +377,8 @@ static void vfio_msi_disable(struct vfio_pci_device *vdev, bool msix)
>>       vfio_msi_set_block(vdev, 0, vdev->num_ctx, NULL, msix);
>>
>>       for (i = 0; i < vdev->num_ctx; i++) {
>> -             virqfd_disable(vdev, &vdev->ctx[i].unmask);
>> -             virqfd_disable(vdev, &vdev->ctx[i].mask);
>> +             virqfd_disable(&vdev->ctx[i].unmask);
>> +             virqfd_disable(&vdev->ctx[i].mask);
>>       }
>>
>>       if (msix) {
>> @@ -415,7 +415,7 @@ static int vfio_pci_set_intx_unmask(struct vfio_pci_device *vdev,
>>                                            vfio_send_intx_eventfd, NULL,
>>                                            &vdev->ctx[0].unmask, fd);
>>
>> -             virqfd_disable(vdev, &vdev->ctx[0].unmask);
>> +             virqfd_disable(&vdev->ctx[0].unmask);
>>       }
>>
>>       return 0;
>> diff --git a/drivers/vfio/virqfd.c b/drivers/vfio/virqfd.c
>> index 9cf2842..4528450 100644
>> --- a/drivers/vfio/virqfd.c
>> +++ b/drivers/vfio/virqfd.c
>> @@ -17,6 +17,7 @@
>>  #include "pci/vfio_pci_private.h"
>>
>>  static struct workqueue_struct *vfio_irqfd_cleanup_wq;
>> +static spinlock_t lock;
>>
>>  int __init vfio_pci_virqfd_init(void)
>>  {
>> @@ -25,6 +26,8 @@ int __init vfio_pci_virqfd_init(void)
>>       if (!vfio_irqfd_cleanup_wq)
>>               return -ENOMEM;
>>
>> +     spin_lock_init(&lock);
>> +
>>       return 0;
>>  }
>>
>> @@ -53,21 +56,21 @@ static int virqfd_wakeup(wait_queue_t *wait, unsigned mode, int sync, void *key)
>>
>>       if (flags & POLLHUP) {
>>               unsigned long flags;
>> -             spin_lock_irqsave(&virqfd->vdev->irqlock, flags);
>> +             spin_lock_irqsave(&lock, flags);
>>
>>               /*
>>                * The eventfd is closing, if the virqfd has not yet been
>>                * queued for release, as determined by testing whether the
>> -              * vdev pointer to it is still valid, queue it now.  As
>> +              * virqfd pointer to it is still valid, queue it now.  As
>>                * with kvm irqfds, we know we won't race against the virqfd
>> -              * going away because we hold wqh->lock to get here.
>> +              * going away because we hold the lock to get here.
>>                */
>>               if (*(virqfd->pvirqfd) == virqfd) {
>>                       *(virqfd->pvirqfd) = NULL;
>>                       virqfd_deactivate(virqfd);
>>               }
>>
>> -             spin_unlock_irqrestore(&virqfd->vdev->irqlock, flags);
>> +             spin_unlock_irqrestore(&lock, flags);
>>       }
>>
>>       return 0;
>> @@ -143,16 +146,16 @@ int virqfd_enable(struct vfio_pci_device *vdev,
>>        * we update the pointer to the virqfd under lock to avoid
>>        * pushing multiple jobs to release the same virqfd.
>>        */
>> -     spin_lock_irq(&vdev->irqlock);
>> +     spin_lock_irq(&lock);
>>
>>       if (*pvirqfd) {
>> -             spin_unlock_irq(&vdev->irqlock);
>> +             spin_unlock_irq(&lock);
>>               ret = -EBUSY;
>>               goto err_busy;
>>       }
>>       *pvirqfd = virqfd;
>>
>> -     spin_unlock_irq(&vdev->irqlock);
>> +     spin_unlock_irq(&lock);
>>
>>       /*
>>        * Install our own custom wake-up handling so we are notified via
>> @@ -189,19 +192,18 @@ err_fd:
>>       return ret;
>>  }
>>
>> -void virqfd_disable(struct vfio_pci_device *vdev,
>> -                        struct virqfd **pvirqfd)
>> +void virqfd_disable(struct virqfd **pvirqfd)
>>  {
>>       unsigned long flags;
>>
>> -     spin_lock_irqsave(&vdev->irqlock, flags);
>> +     spin_lock_irqsave(&lock, flags);
>>
>>       if (*pvirqfd) {
>>               virqfd_deactivate(*pvirqfd);
>>               *pvirqfd = NULL;
>>       }
>>
>> -     spin_unlock_irqrestore(&vdev->irqlock, flags);
>> +     spin_unlock_irqrestore(&lock, flags);
>>
>>       /*
>>        * Block until we know all outstanding shutdown jobs have completed.
>> diff --git a/include/linux/vfio.h b/include/linux/vfio.h
>> index 43968e8..44c9808 100644
>> --- a/include/linux/vfio.h
>> +++ b/include/linux/vfio.h
>> @@ -123,7 +123,6 @@ extern int virqfd_enable(struct vfio_pci_device *vdev,
>>                        int (*handler)(struct vfio_pci_device *, void *),
>>                        void (*thread)(struct vfio_pci_device *, void *),
>>                        void *data, struct virqfd **pvirqfd, int fd);
>> -extern void virqfd_disable(struct vfio_pci_device *vdev,
>> -                        struct virqfd **pvirqfd);
>> +extern void virqfd_disable(struct virqfd **pvirqfd);
>>
>>  #endif /* VFIO_H */
>
>
>



-- 
Antonios Motakis
Virtual Open Systems

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC PATCH v6 14/20] vfio/platform: initial interrupts support
  2014-06-05 17:03 ` [RFC PATCH v6 14/20] vfio/platform: initial interrupts support Antonios Motakis
@ 2014-06-08 10:09   ` Christoffer Dall
  2014-09-02 16:07     ` Antonios Motakis
  0 siblings, 1 reply; 53+ messages in thread
From: Christoffer Dall @ 2014-06-08 10:09 UTC (permalink / raw)
  To: Antonios Motakis
  Cc: alex.williamson, kvmarm, iommu, tech, a.rigo, kvm, will.deacon,
	kim.phillips, stuart.yoder, eric.auger, Catalin Marinas,
	Mark Rutland, Vladimir Murzin, open list

On Thu, Jun 05, 2014 at 07:03:22PM +0200, Antonios Motakis wrote:
> This patch allows to set an eventfd for a patform device's interrupt,
> and also to trigger the interrupt eventfd from userspace for testing.
> 
> Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
> ---
>  drivers/vfio/platform/vfio_platform.c         |  36 ++++++-
>  drivers/vfio/platform/vfio_platform_irq.c     | 130 +++++++++++++++++++++++++-
>  drivers/vfio/platform/vfio_platform_private.h |   7 ++
>  3 files changed, 169 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/vfio/platform/vfio_platform.c b/drivers/vfio/platform/vfio_platform.c
> index 192291c..f4c06c6 100644
> --- a/drivers/vfio/platform/vfio_platform.c
> +++ b/drivers/vfio/platform/vfio_platform.c
> @@ -177,10 +177,40 @@ static long vfio_platform_ioctl(void *device_data,
>  
>  		return copy_to_user((void __user *)arg, &info, minsz);
>  
> -	} else if (cmd == VFIO_DEVICE_SET_IRQS)
> -		return -EINVAL;
> +	} else if (cmd == VFIO_DEVICE_SET_IRQS) {
> +		struct vfio_irq_set hdr;
> +		int ret = 0;
> +
> +		minsz = offsetofend(struct vfio_irq_set, count);
> +
> +		if (copy_from_user(&hdr, (void __user *)arg, minsz))
> +			return -EFAULT;
> +
> +		if (hdr.argsz < minsz)
> +			return -EINVAL;
> +
> +		if (hdr.index >= vdev->num_irqs)
> +			return -EINVAL;
> +
> +		if (hdr.start != 0 || hdr.count > 1)
> +			return -EINVAL;
> +
> +		if (hdr.count == 0 &&
> +			(!(hdr.flags & VFIO_IRQ_SET_DATA_NONE) ||
> +			 !(hdr.flags & VFIO_IRQ_SET_ACTION_TRIGGER)))
> +			return -EINVAL;
> +
> +		if (hdr.flags & ~(VFIO_IRQ_SET_DATA_TYPE_MASK |
> +				  VFIO_IRQ_SET_ACTION_TYPE_MASK))
> +			return -EINVAL;
> +
> +		ret = vfio_platform_set_irqs_ioctl(vdev, hdr.flags, hdr.index,
> +						   hdr.start, hdr.count,
> +						   (void *)arg+minsz);
> +
> +		return ret;
>  
> -	else if (cmd == VFIO_DEVICE_RESET)
> +	} else if (cmd == VFIO_DEVICE_RESET)
>  		return -EINVAL;
>  
>  	return -ENOTTY;
> diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
> index 22c214f..d79f5af 100644
> --- a/drivers/vfio/platform/vfio_platform_irq.c
> +++ b/drivers/vfio/platform/vfio_platform_irq.c
> @@ -31,6 +31,9 @@
>  
>  #include "vfio_platform_private.h"
>  
> +static int vfio_set_trigger(struct vfio_platform_device *vdev,
> +			    int index, int fd);
> +
>  int vfio_platform_irq_init(struct vfio_platform_device *vdev)
>  {
>  	int cnt = 0, i;
> @@ -43,17 +46,142 @@ int vfio_platform_irq_init(struct vfio_platform_device *vdev)
>  		return -ENOMEM;
>  
>  	for (i = 0; i < cnt; i++) {
> -		vdev->irq[i].flags = 0;
> +		int hwirq = platform_get_irq(vdev->pdev, i);
> +
> +		if (hwirq < 0)
> +			goto err;
> +
> +		vdev->irq[i].flags = VFIO_IRQ_INFO_EVENTFD;
>  		vdev->irq[i].count = 1;
> +		vdev->irq[i].hwirq = hwirq;
>  	}
>  
>  	vdev->num_irqs = cnt;
>  
>  	return 0;
> +err:
> +	kfree(vdev->irq);
> +	return -EINVAL;
>  }
>  
>  void vfio_platform_irq_cleanup(struct vfio_platform_device *vdev)
>  {
> +	int i;
> +
> +	for (i = 0; i < vdev->num_irqs; i++)
> +		vfio_set_trigger(vdev, i, -1);
> +
>  	vdev->num_irqs = 0;
>  	kfree(vdev->irq);
>  }
> +
> +static irqreturn_t vfio_irq_handler(int irq, void *dev_id)
> +{
> +	struct eventfd_ctx *trigger = dev_id;
> +
> +	eventfd_signal(trigger, 1);
> +
> +	return IRQ_HANDLED;
> +}
> +
> +static int vfio_set_trigger(struct vfio_platform_device *vdev,
> +			    int index, int fd)
> +{
> +	struct vfio_platform_irq *irq = &vdev->irq[index];
> +	struct eventfd_ctx *trigger;
> +	int ret;
> +
> +	if (irq->trigger) {
> +		free_irq(irq->hwirq, irq);
> +		kfree(irq->name);
> +		eventfd_ctx_put(irq->trigger);
> +		irq->trigger = NULL;
> +	}

this feels incredibly racy, is some lock protecting this access?

-Christoffer

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC PATCH v6 15/20] vfio/platform: support for maskable and automasked interrupts
  2014-06-05 17:03 ` [RFC PATCH v6 15/20] vfio/platform: support for maskable and automasked interrupts Antonios Motakis
@ 2014-06-08 10:17   ` Christoffer Dall
  2014-09-02 16:06     ` Antonios Motakis
  0 siblings, 1 reply; 53+ messages in thread
From: Christoffer Dall @ 2014-06-08 10:17 UTC (permalink / raw)
  To: Antonios Motakis
  Cc: alex.williamson, kvmarm, iommu, tech, a.rigo, kvm, will.deacon,
	kim.phillips, stuart.yoder, eric.auger, Catalin Marinas,
	Mark Rutland, Vladimir Murzin, open list

On Thu, Jun 05, 2014 at 07:03:23PM +0200, Antonios Motakis wrote:
> Adds support to mask interrupts, and also for automasked interrupts.
> Level sensitive interrupts are exposed as automasked interrupts and
> are masked and disabled automatically when they fire.
> 
> Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
> ---
>  drivers/vfio/platform/vfio_platform_irq.c     | 112 ++++++++++++++++++++++++--
>  drivers/vfio/platform/vfio_platform_private.h |   2 +
>  2 files changed, 109 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
> index d79f5af..10dfbf0 100644
> --- a/drivers/vfio/platform/vfio_platform_irq.c
> +++ b/drivers/vfio/platform/vfio_platform_irq.c
> @@ -51,9 +51,17 @@ int vfio_platform_irq_init(struct vfio_platform_device *vdev)
>  		if (hwirq < 0)
>  			goto err;
>  
> -		vdev->irq[i].flags = VFIO_IRQ_INFO_EVENTFD;
> +		spin_lock_init(&vdev->irq[i].lock);
> +
> +		vdev->irq[i].flags = VFIO_IRQ_INFO_EVENTFD
> +					| VFIO_IRQ_INFO_MASKABLE;
> +
> +		if (irq_get_trigger_type(hwirq) & IRQ_TYPE_LEVEL_MASK)
> +			vdev->irq[i].flags |= VFIO_IRQ_INFO_AUTOMASKED;

This seems to rely on the fact that you had actually loaded a driver for
your device to set the right type.  Is this assumption always correct?

It seems to me that this configuration bit should now be up to your user
space drive who is the best candidate to know details about your device
at this point?

> +
>  		vdev->irq[i].count = 1;
>  		vdev->irq[i].hwirq = hwirq;
> +		vdev->irq[i].masked = false;
>  	}
>  
>  	vdev->num_irqs = cnt;
> @@ -77,11 +85,27 @@ void vfio_platform_irq_cleanup(struct vfio_platform_device *vdev)
>  
>  static irqreturn_t vfio_irq_handler(int irq, void *dev_id)
>  {
> -	struct eventfd_ctx *trigger = dev_id;
> +	struct vfio_platform_irq *irq_ctx = dev_id;
> +	unsigned long flags;
> +	int ret = IRQ_NONE;
> +
> +	spin_lock_irqsave(&irq_ctx->lock, flags);
> +
> +	if (!irq_ctx->masked) {
> +		ret = IRQ_HANDLED;
> +
> +		if (irq_ctx->flags & VFIO_IRQ_INFO_AUTOMASKED) {
> +			disable_irq_nosync(irq_ctx->hwirq);
> +			irq_ctx->masked = true;
> +		}
> +	}
>  
> -	eventfd_signal(trigger, 1);
> +	spin_unlock_irqrestore(&irq_ctx->lock, flags);
>  
> -	return IRQ_HANDLED;
> +	if (ret == IRQ_HANDLED)
> +		eventfd_signal(irq_ctx->trigger, 1);
> +
> +	return ret;
>  }
>  
>  static int vfio_set_trigger(struct vfio_platform_device *vdev,
> @@ -162,6 +186,82 @@ static int vfio_platform_set_irq_trigger(struct vfio_platform_device *vdev,
>  	return -EFAULT;
>  }
>  
> +static int vfio_platform_set_irq_unmask(struct vfio_platform_device *vdev,
> +				    unsigned index, unsigned start,
> +				    unsigned count, uint32_t flags, void *data)
> +{
> +	uint8_t arr;


arr?

> +
> +	if (start != 0 || count != 1)
> +		return -EINVAL;
> +
> +	switch (flags & VFIO_IRQ_SET_DATA_TYPE_MASK) {
> +	case VFIO_IRQ_SET_DATA_BOOL:
> +		if (copy_from_user(&arr, data, sizeof(uint8_t)))
> +			return -EFAULT;
> +
> +		if (arr != 0x1)
> +			return -EINVAL;

why the fallthrough, what's this about?

> +
> +	case VFIO_IRQ_SET_DATA_NONE:
> +
> +		spin_lock_irq(&vdev->irq[index].lock);
> +
> +		if (vdev->irq[index].masked) {
> +			enable_irq(vdev->irq[index].hwirq);
> +			vdev->irq[index].masked = false;
> +		}
> +
> +		spin_unlock_irq(&vdev->irq[index].lock);
> +
> +		return 0;
> +
> +	case VFIO_IRQ_SET_DATA_EVENTFD: /* XXX not implemented yet */
> +	default:
> +		return -ENOTTY;
> +	}
> +
> +	return 0;
> +}
> +
> +static int vfio_platform_set_irq_mask(struct vfio_platform_device *vdev,
> +				    unsigned index, unsigned start,
> +				    unsigned count, uint32_t flags, void *data)
> +{
> +	uint8_t arr;
> +
> +	if (start != 0 || count != 1)
> +		return -EINVAL;
> +
> +	switch (flags & VFIO_IRQ_SET_DATA_TYPE_MASK) {
> +	case VFIO_IRQ_SET_DATA_BOOL:
> +		if (copy_from_user(&arr, data, sizeof(uint8_t)))
> +			return -EFAULT;
> +
> +		if (arr != 0x1)
> +			return -EINVAL;
> +
> +	case VFIO_IRQ_SET_DATA_NONE:
> +
> +		spin_lock_irq(&vdev->irq[index].lock);
> +
> +		if (!vdev->irq[index].masked) {
> +			disable_irq(vdev->irq[index].hwirq);
> +			vdev->irq[index].masked = true;
> +		}
> +
> +		spin_unlock_irq(&vdev->irq[index].lock);
> +
> +		return 0;
> +
> +	case VFIO_IRQ_SET_DATA_EVENTFD: /* XXX not implemented yet */
> +	default:
> +		return -ENOTTY;
> +	}
> +
> +	return 0;
> +}
> +
>  int vfio_platform_set_irqs_ioctl(struct vfio_platform_device *vdev,
>  				 uint32_t flags, unsigned index, unsigned start,
>  				 unsigned count, void *data)
> @@ -172,8 +272,10 @@ int vfio_platform_set_irqs_ioctl(struct vfio_platform_device *vdev,
>  
>  	switch (flags & VFIO_IRQ_SET_ACTION_TYPE_MASK) {
>  	case VFIO_IRQ_SET_ACTION_MASK:
> +		func = vfio_platform_set_irq_mask;
> +		break;
>  	case VFIO_IRQ_SET_ACTION_UNMASK:
> -		/* XXX not implemented */
> +		func = vfio_platform_set_irq_unmask;
>  		break;
>  	case VFIO_IRQ_SET_ACTION_TRIGGER:
>  		func = vfio_platform_set_irq_trigger;
> diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
> index d6200df..4d887fd 100644
> --- a/drivers/vfio/platform/vfio_platform_private.h
> +++ b/drivers/vfio/platform/vfio_platform_private.h
> @@ -30,6 +30,8 @@ struct vfio_platform_irq {
>  	u32			count;
>  	int			hwirq;
>  	char			*name;
> +	bool			masked;
> +	spinlock_t		lock;
>  };
>  
>  struct vfio_platform_region {
> -- 
> 1.8.3.2
> 

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC PATCH v6 04/20] iommu/arm-smmu: add capability IOMMU_CAP_INTR_REMAP
  2014-06-05 17:03 ` [RFC PATCH v6 04/20] iommu/arm-smmu: add capability IOMMU_CAP_INTR_REMAP Antonios Motakis
  2014-06-05 18:31   ` Varun Sethi
@ 2014-06-08 10:31   ` Christoffer Dall
  2014-06-16 14:53     ` Joerg Roedel
  1 sibling, 1 reply; 53+ messages in thread
From: Christoffer Dall @ 2014-06-08 10:31 UTC (permalink / raw)
  To: Antonios Motakis
  Cc: alex.williamson, kvmarm, iommu, tech, a.rigo, kvm, will.deacon,
	kim.phillips, stuart.yoder, eric.auger, Joerg Roedel,
	moderated list:ARM SMMU DRIVER, open list

On Thu, Jun 05, 2014 at 07:03:12PM +0200, Antonios Motakis wrote:
> With an ARM SMMU, interrupt remapping should always be safe from the
> SMMU's point of view, as it is properly handled by the GIC.
> 
> Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
> ---
>  drivers/iommu/arm-smmu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index 15ab2af..ff29402 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -1544,7 +1544,7 @@ static int arm_smmu_domain_has_cap(struct iommu_domain *domain,
>  	if (smmu_domain->root_cfg.smmu->features & ARM_SMMU_FEAT_COHERENT_WALK)
>  		caps |= IOMMU_CAP_CACHE_COHERENCY;
>  
> -	caps |= IOMMU_CAP_NOEXEC;
> +	caps |= IOMMU_CAP_NOEXEC | IOMMU_CAP_INTR_REMAP;
>  
>  	return !!(cap & caps);
>  }
> -- 
> 1.8.3.2
> 
What does IOMMU_CAP_INTR_REMAP signify exactly?  Is there docs/examples
somewhere I can look at?  (A quick scan of the Linux souce code doesn't
reveal much, and I'm not sure if this is purely MSI related or what...)

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC PATCH v6 04/20] iommu/arm-smmu: add capability IOMMU_CAP_INTR_REMAP
  2014-06-08 10:31   ` Christoffer Dall
@ 2014-06-16 14:53     ` Joerg Roedel
  2014-06-16 15:13       ` Will Deacon
  0 siblings, 1 reply; 53+ messages in thread
From: Joerg Roedel @ 2014-06-16 14:53 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Antonios Motakis, alex.williamson, kvmarm, iommu, tech, a.rigo,
	kvm, will.deacon, kim.phillips, stuart.yoder, eric.auger,
	moderated list:ARM SMMU DRIVER, open list

On Sun, Jun 08, 2014 at 12:31:29PM +0200, Christoffer Dall wrote:
> On Thu, Jun 05, 2014 at 07:03:12PM +0200, Antonios Motakis wrote:
> > With an ARM SMMU, interrupt remapping should always be safe from the
> > SMMU's point of view, as it is properly handled by the GIC.
> > 
> > Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
> > ---
> >  drivers/iommu/arm-smmu.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> > index 15ab2af..ff29402 100644
> > --- a/drivers/iommu/arm-smmu.c
> > +++ b/drivers/iommu/arm-smmu.c
> > @@ -1544,7 +1544,7 @@ static int arm_smmu_domain_has_cap(struct iommu_domain *domain,
> >  	if (smmu_domain->root_cfg.smmu->features & ARM_SMMU_FEAT_COHERENT_WALK)
> >  		caps |= IOMMU_CAP_CACHE_COHERENCY;
> >  
> > -	caps |= IOMMU_CAP_NOEXEC;
> > +	caps |= IOMMU_CAP_NOEXEC | IOMMU_CAP_INTR_REMAP;
> >  
> >  	return !!(cap & caps);
> >  }
> > -- 
> > 1.8.3.2
> > 
> What does IOMMU_CAP_INTR_REMAP signify exactly?  Is there docs/examples
> somewhere I can look at?  (A quick scan of the Linux souce code doesn't
> reveal much, and I'm not sure if this is purely MSI related or what...)

The flag was introduced for x86 IOMMUs to detect whether an IOMMU in the
system has and enables interrupt remapping to allow safe device
assignment to KVM guests. Without interrupt remapping a malicious guest
could attack the host with MSIs from the attached device.

How are PCI MSIs implemented on ARM with SMMU and GIC enabled. MSIs are
only memory dma transactions in the end, is it guaranteed on ARM that a
device only sends MSI transactions it is allowed to?


	Joerg



^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC PATCH v6 01/20] iommu/arm-smmu: change IOMMU_EXEC to IOMMU_NOEXEC
  2014-06-05 17:03 ` [RFC PATCH v6 01/20] iommu/arm-smmu: change IOMMU_EXEC to IOMMU_NOEXEC Antonios Motakis
@ 2014-06-16 15:04   ` Will Deacon
  0 siblings, 0 replies; 53+ messages in thread
From: Will Deacon @ 2014-06-16 15:04 UTC (permalink / raw)
  To: Antonios Motakis
  Cc: alex.williamson, kvmarm, iommu, tech, a.rigo, kvm,
	christoffer.dall, kim.phillips, stuart.yoder, eric.auger,
	Joerg Roedel, Varun Sethi, Alexey Kardashevskiy, Shuah Khan,
	Upinder Malhi (umalhi),
	moderated list:ARM SMMU DRIVER, open list

On Thu, Jun 05, 2014 at 06:03:09PM +0100, Antonios Motakis wrote:
> Exposing the XN flag of the SMMU driver as IOMMU_NOEXEC instead of
> IOMMU_EXEC makes it enforceable, since for IOMMUs that don't support
> the XN flag pages will always be executable.
> 
> Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
> ---
>  drivers/iommu/arm-smmu.c | 2 +-
>  include/linux/iommu.h    | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index 647c3c7..d5a2200 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -1294,7 +1294,7 @@ static int arm_smmu_alloc_init_pte(struct arm_smmu_device *smmu, pmd_t *pmd,
>  	}
>  
>  	/* If no access, create a faulting entry to avoid TLB fills */
> -	if (prot & IOMMU_EXEC)
> +	if (!(prot & IOMMU_NOEXEC))
>  		pteval &= ~ARM_SMMU_PTE_XN;

It's probably simpler to change the logic so that we initialise pteval
without XN set, then set it if IOMMU_NOEXEC is set (rather than set it
by default, then clear it if NOEXEC is not set).

Will

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC PATCH v6 03/20] iommu/arm-smmu: add IOMMU_CAP_NOEXEC to the ARM SMMU driver
  2014-06-05 17:03 ` [RFC PATCH v6 03/20] iommu/arm-smmu: add IOMMU_CAP_NOEXEC to the ARM SMMU driver Antonios Motakis
@ 2014-06-16 15:04   ` Will Deacon
  2014-06-16 15:25     ` Alex Williamson
  0 siblings, 1 reply; 53+ messages in thread
From: Will Deacon @ 2014-06-16 15:04 UTC (permalink / raw)
  To: Antonios Motakis
  Cc: alex.williamson, kvmarm, iommu, tech, a.rigo, kvm,
	christoffer.dall, kim.phillips, stuart.yoder, eric.auger,
	Joerg Roedel, moderated list:ARM SMMU DRIVER, open list

On Thu, Jun 05, 2014 at 06:03:11PM +0100, Antonios Motakis wrote:
> The ARM SMMU supports the IOMMU_NOEXEC protection flag. Add the
> corresponding IOMMU capability.
> 
> Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
> ---
>  drivers/iommu/arm-smmu.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index d5a2200..15ab2af 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -1544,6 +1544,8 @@ static int arm_smmu_domain_has_cap(struct iommu_domain *domain,
>  	if (smmu_domain->root_cfg.smmu->features & ARM_SMMU_FEAT_COHERENT_WALK)
>  		caps |= IOMMU_CAP_CACHE_COHERENCY;
>  
> +	caps |= IOMMU_CAP_NOEXEC;
> +

Just initialise caps to IOMMU_CAP_NOEXEC.

Will

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC PATCH v6 04/20] iommu/arm-smmu: add capability IOMMU_CAP_INTR_REMAP
  2014-06-16 14:53     ` Joerg Roedel
@ 2014-06-16 15:13       ` Will Deacon
  2014-06-16 15:21         ` Joerg Roedel
  0 siblings, 1 reply; 53+ messages in thread
From: Will Deacon @ 2014-06-16 15:13 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Christoffer Dall, Antonios Motakis, alex.williamson, kvmarm,
	iommu, tech, a.rigo, kvm, kim.phillips, stuart.yoder, eric.auger,
	moderated list:ARM SMMU DRIVER, open list

On Mon, Jun 16, 2014 at 03:53:44PM +0100, Joerg Roedel wrote:
> On Sun, Jun 08, 2014 at 12:31:29PM +0200, Christoffer Dall wrote:
> > On Thu, Jun 05, 2014 at 07:03:12PM +0200, Antonios Motakis wrote:
> > > With an ARM SMMU, interrupt remapping should always be safe from the
> > > SMMU's point of view, as it is properly handled by the GIC.
> > > 
> > > Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
> > > ---
> > >  drivers/iommu/arm-smmu.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> > > index 15ab2af..ff29402 100644
> > > --- a/drivers/iommu/arm-smmu.c
> > > +++ b/drivers/iommu/arm-smmu.c
> > > @@ -1544,7 +1544,7 @@ static int arm_smmu_domain_has_cap(struct iommu_domain *domain,
> > >  	if (smmu_domain->root_cfg.smmu->features & ARM_SMMU_FEAT_COHERENT_WALK)
> > >  		caps |= IOMMU_CAP_CACHE_COHERENCY;
> > >  
> > > -	caps |= IOMMU_CAP_NOEXEC;
> > > +	caps |= IOMMU_CAP_NOEXEC | IOMMU_CAP_INTR_REMAP;
> > >  
> > >  	return !!(cap & caps);
> > >  }
> > > -- 
> > > 1.8.3.2
> > > 
> > What does IOMMU_CAP_INTR_REMAP signify exactly?  Is there docs/examples
> > somewhere I can look at?  (A quick scan of the Linux souce code doesn't
> > reveal much, and I'm not sure if this is purely MSI related or what...)
> 
> The flag was introduced for x86 IOMMUs to detect whether an IOMMU in the
> system has and enables interrupt remapping to allow safe device
> assignment to KVM guests. Without interrupt remapping a malicious guest
> could attack the host with MSIs from the attached device.
> 
> How are PCI MSIs implemented on ARM with SMMU and GIC enabled. MSIs are
> only memory dma transactions in the end, is it guaranteed on ARM that a
> device only sends MSI transactions it is allowed to?

MSIs look just like memory accesses made by the device, so the SMMU will
translate them to point at the GIC ITS (doorbell). The ITS then has tables
to work out how to route the MSI.

So, if IOMMU_CAP_INTR_REMAP is simply supposed to indicate that the SMMU can
translate MSIs to point somewhere else, then the ARM SMMU can always do it.
If it's supposed to indicate that the actual MSI payload can be
filtered/routed, then that requires the GIC ITS.

The part I'm unsure of is how VFIO knows where to map the MSIs to. That
requires knowledge of the physical and virtual doorbell pages -- is that
discoverable in the API?

Will

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC PATCH v6 04/20] iommu/arm-smmu: add capability IOMMU_CAP_INTR_REMAP
  2014-06-16 15:13       ` Will Deacon
@ 2014-06-16 15:21         ` Joerg Roedel
  2014-06-16 15:25           ` Will Deacon
  2014-06-16 15:30           ` Alex Williamson
  0 siblings, 2 replies; 53+ messages in thread
From: Joerg Roedel @ 2014-06-16 15:21 UTC (permalink / raw)
  To: Will Deacon
  Cc: Christoffer Dall, Antonios Motakis, alex.williamson, kvmarm,
	iommu, tech, a.rigo, kvm, kim.phillips, stuart.yoder, eric.auger,
	moderated list:ARM SMMU DRIVER, open list

On Mon, Jun 16, 2014 at 04:13:29PM +0100, Will Deacon wrote:
> MSIs look just like memory accesses made by the device, so the SMMU
> will translate them to point at the GIC ITS (doorbell). The ITS then
> has tables to work out how to route the MSI.
> 
> So, if IOMMU_CAP_INTR_REMAP is simply supposed to indicate that the
> SMMU can translate MSIs to point somewhere else, then the ARM SMMU can
> always do it.  If it's supposed to indicate that the actual MSI
> payload can be filtered/routed, then that requires the GIC ITS.
> 
> The part I'm unsure of is how VFIO knows where to map the MSIs to.
> That requires knowledge of the physical and virtual doorbell pages --
> is that discoverable in the API?

VFIO does not care about the actual routing, it only cares that the
device can not send interrupts it is not allowed to send (e.g.
interrupts to vectors used by other devices or, on x86, exception vectors).
If that is guaranteed by the SMMU or the GIC ITS hardware and driver
then it is fine to set this flag.


	Joerg



^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC PATCH v6 04/20] iommu/arm-smmu: add capability IOMMU_CAP_INTR_REMAP
  2014-06-16 15:21         ` Joerg Roedel
@ 2014-06-16 15:25           ` Will Deacon
  2014-06-16 15:38             ` Joerg Roedel
  2014-06-16 15:30           ` Alex Williamson
  1 sibling, 1 reply; 53+ messages in thread
From: Will Deacon @ 2014-06-16 15:25 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Christoffer Dall, Antonios Motakis, alex.williamson, kvmarm,
	iommu, tech, a.rigo, kvm, kim.phillips, stuart.yoder, eric.auger,
	moderated list:ARM SMMU DRIVER, open list

On Mon, Jun 16, 2014 at 04:21:58PM +0100, Joerg Roedel wrote:
> On Mon, Jun 16, 2014 at 04:13:29PM +0100, Will Deacon wrote:
> > MSIs look just like memory accesses made by the device, so the SMMU
> > will translate them to point at the GIC ITS (doorbell). The ITS then
> > has tables to work out how to route the MSI.
> > 
> > So, if IOMMU_CAP_INTR_REMAP is simply supposed to indicate that the
> > SMMU can translate MSIs to point somewhere else, then the ARM SMMU can
> > always do it.  If it's supposed to indicate that the actual MSI
> > payload can be filtered/routed, then that requires the GIC ITS.
> > 
> > The part I'm unsure of is how VFIO knows where to map the MSIs to.
> > That requires knowledge of the physical and virtual doorbell pages --
> > is that discoverable in the API?
> 
> VFIO does not care about the actual routing, it only cares that the
> device can not send interrupts it is not allowed to send (e.g.
> interrupts to vectors used by other devices or, on x86, exception vectors).
> If that is guaranteed by the SMMU or the GIC ITS hardware and driver
> then it is fine to set this flag.

Ok, thanks. In which case, I think this is really a combined property of
the SMMU and the interrupt controller, so we might need some extra code
so that the SMMU can check that the interrupt controller for the device
is also capable of interrupt remapping.

Will

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC PATCH v6 03/20] iommu/arm-smmu: add IOMMU_CAP_NOEXEC to the ARM SMMU driver
  2014-06-16 15:04   ` Will Deacon
@ 2014-06-16 15:25     ` Alex Williamson
  2014-06-16 15:30       ` Will Deacon
  0 siblings, 1 reply; 53+ messages in thread
From: Alex Williamson @ 2014-06-16 15:25 UTC (permalink / raw)
  To: Will Deacon
  Cc: Antonios Motakis, kvmarm, iommu, tech, a.rigo, kvm,
	christoffer.dall, kim.phillips, stuart.yoder, eric.auger,
	Joerg Roedel, moderated list:ARM SMMU DRIVER, open list

On Mon, 2014-06-16 at 16:04 +0100, Will Deacon wrote:
> On Thu, Jun 05, 2014 at 06:03:11PM +0100, Antonios Motakis wrote:
> > The ARM SMMU supports the IOMMU_NOEXEC protection flag. Add the
> > corresponding IOMMU capability.
> > 
> > Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
> > ---
> >  drivers/iommu/arm-smmu.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> > index d5a2200..15ab2af 100644
> > --- a/drivers/iommu/arm-smmu.c
> > +++ b/drivers/iommu/arm-smmu.c
> > @@ -1544,6 +1544,8 @@ static int arm_smmu_domain_has_cap(struct iommu_domain *domain,
> >  	if (smmu_domain->root_cfg.smmu->features & ARM_SMMU_FEAT_COHERENT_WALK)
> >  		caps |= IOMMU_CAP_CACHE_COHERENCY;
> >  
> > +	caps |= IOMMU_CAP_NOEXEC;
> > +
> 
> Just initialise caps to IOMMU_CAP_NOEXEC.

No, it shouldn't be a bitmap in the first place.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC PATCH v6 03/20] iommu/arm-smmu: add IOMMU_CAP_NOEXEC to the ARM SMMU driver
  2014-06-16 15:25     ` Alex Williamson
@ 2014-06-16 15:30       ` Will Deacon
  0 siblings, 0 replies; 53+ messages in thread
From: Will Deacon @ 2014-06-16 15:30 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Antonios Motakis, kvmarm, iommu, tech, a.rigo, kvm,
	christoffer.dall, kim.phillips, stuart.yoder, eric.auger,
	Joerg Roedel, moderated list:ARM SMMU DRIVER, open list

On Mon, Jun 16, 2014 at 04:25:28PM +0100, Alex Williamson wrote:
> On Mon, 2014-06-16 at 16:04 +0100, Will Deacon wrote:
> > On Thu, Jun 05, 2014 at 06:03:11PM +0100, Antonios Motakis wrote:
> > > The ARM SMMU supports the IOMMU_NOEXEC protection flag. Add the
> > > corresponding IOMMU capability.
> > > 
> > > Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
> > > ---
> > >  drivers/iommu/arm-smmu.c | 2 ++
> > >  1 file changed, 2 insertions(+)
> > > 
> > > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> > > index d5a2200..15ab2af 100644
> > > --- a/drivers/iommu/arm-smmu.c
> > > +++ b/drivers/iommu/arm-smmu.c
> > > @@ -1544,6 +1544,8 @@ static int arm_smmu_domain_has_cap(struct iommu_domain *domain,
> > >  	if (smmu_domain->root_cfg.smmu->features & ARM_SMMU_FEAT_COHERENT_WALK)
> > >  		caps |= IOMMU_CAP_CACHE_COHERENCY;
> > >  
> > > +	caps |= IOMMU_CAP_NOEXEC;
> > > +
> > 
> > Just initialise caps to IOMMU_CAP_NOEXEC.
> 
> No, it shouldn't be a bitmap in the first place.  Thanks,

Ah, ok. I probably led Antonios down the garden path with the ARM SMMU
driver then. Apologies!

Will

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC PATCH v6 04/20] iommu/arm-smmu: add capability IOMMU_CAP_INTR_REMAP
  2014-06-16 15:21         ` Joerg Roedel
  2014-06-16 15:25           ` Will Deacon
@ 2014-06-16 15:30           ` Alex Williamson
  1 sibling, 0 replies; 53+ messages in thread
From: Alex Williamson @ 2014-06-16 15:30 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Will Deacon, Christoffer Dall, Antonios Motakis, kvmarm, iommu,
	tech, a.rigo, kvm, kim.phillips, stuart.yoder, eric.auger,
	moderated list:ARM SMMU DRIVER, open list

On Mon, 2014-06-16 at 17:21 +0200, Joerg Roedel wrote:
> On Mon, Jun 16, 2014 at 04:13:29PM +0100, Will Deacon wrote:
> > MSIs look just like memory accesses made by the device, so the SMMU
> > will translate them to point at the GIC ITS (doorbell). The ITS then
> > has tables to work out how to route the MSI.
> > 
> > So, if IOMMU_CAP_INTR_REMAP is simply supposed to indicate that the
> > SMMU can translate MSIs to point somewhere else, then the ARM SMMU can
> > always do it.  If it's supposed to indicate that the actual MSI
> > payload can be filtered/routed, then that requires the GIC ITS.
> > 
> > The part I'm unsure of is how VFIO knows where to map the MSIs to.
> > That requires knowledge of the physical and virtual doorbell pages --
> > is that discoverable in the API?
> 
> VFIO does not care about the actual routing, it only cares that the
> device can not send interrupts it is not allowed to send (e.g.
> interrupts to vectors used by other devices or, on x86, exception vectors).
> If that is guaranteed by the SMMU or the GIC ITS hardware and driver
> then it is fine to set this flag.

Yep, I agree.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC PATCH v6 04/20] iommu/arm-smmu: add capability IOMMU_CAP_INTR_REMAP
  2014-06-16 15:25           ` Will Deacon
@ 2014-06-16 15:38             ` Joerg Roedel
  2014-06-26 18:08               ` Chalamarla, Tirumalesh
  0 siblings, 1 reply; 53+ messages in thread
From: Joerg Roedel @ 2014-06-16 15:38 UTC (permalink / raw)
  To: Will Deacon
  Cc: Christoffer Dall, Antonios Motakis, alex.williamson, kvmarm,
	iommu, tech, a.rigo, kvm, kim.phillips, stuart.yoder, eric.auger,
	moderated list:ARM SMMU DRIVER, open list

On Mon, Jun 16, 2014 at 04:25:26PM +0100, Will Deacon wrote:
> Ok, thanks. In which case, I think this is really a combined property of
> the SMMU and the interrupt controller, so we might need some extra code
> so that the SMMU can check that the interrupt controller for the device
> is also capable of interrupt remapping.

Right, that this is part of IOMMU code has more or less historic reasons
on x86. Interrupt remapping is purely implemented in the IOMMU there, so
on ARM some clue-code between interrupt controler and smmu is needed.


	Joerg



^ permalink raw reply	[flat|nested] 53+ messages in thread

* RE: [RFC PATCH v6 04/20] iommu/arm-smmu: add capability IOMMU_CAP_INTR_REMAP
  2014-06-16 15:38             ` Joerg Roedel
@ 2014-06-26 18:08               ` Chalamarla, Tirumalesh
  2014-06-26 18:15                 ` Chalamarla, Tirumalesh
  0 siblings, 1 reply; 53+ messages in thread
From: Chalamarla, Tirumalesh @ 2014-06-26 18:08 UTC (permalink / raw)
  To: Joerg Roedel, Will Deacon
  Cc: stuart.yoder, kvm, open list, iommu, alex.williamson,
	moderated list:ARM SMMU DRIVER, tech, kvmarm, Christoffer Dall

Forgive me if this discussion is not relative here, but I thought it is.  

How is VFIO restricting devices from writing  to MSI/MSI-X,
Is all the vector area is mapped by VFIO to trap the accesses.  I am asking this because we might need to emulate ITS somewhere either in KVM or VFIO to provide direct access to devices.
And I don't see any mentions on that.   I think this flag needs to be set by ITS emulation.

Regards,
Tirumalesh.

-----Original Message-----
From: kvmarm-bounces@lists.cs.columbia.edu [mailto:kvmarm-bounces@lists.cs.columbia.edu] On Behalf Of Joerg Roedel
Sent: Monday, June 16, 2014 8:39 AM
To: Will Deacon
Cc: stuart.yoder@freescale.com; kvm@vger.kernel.org; open list; iommu@lists.linux-foundation.org; alex.williamson@redhat.com; moderated list:ARM SMMU DRIVER; tech@virtualopensystems.com; kvmarm@lists.cs.columbia.edu; Christoffer Dall
Subject: Re: [RFC PATCH v6 04/20] iommu/arm-smmu: add capability IOMMU_CAP_INTR_REMAP

On Mon, Jun 16, 2014 at 04:25:26PM +0100, Will Deacon wrote:
> Ok, thanks. In which case, I think this is really a combined property 
> of the SMMU and the interrupt controller, so we might need some extra 
> code so that the SMMU can check that the interrupt controller for the 
> device is also capable of interrupt remapping.

Right, that this is part of IOMMU code has more or less historic reasons on x86. Interrupt remapping is purely implemented in the IOMMU there, so on ARM some clue-code between interrupt controler and smmu is needed.


	Joerg


_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 53+ messages in thread

* RE: [RFC PATCH v6 04/20] iommu/arm-smmu: add capability IOMMU_CAP_INTR_REMAP
  2014-06-26 18:08               ` Chalamarla, Tirumalesh
@ 2014-06-26 18:15                 ` Chalamarla, Tirumalesh
  2014-06-26 18:41                   ` Chalamarla, Tirumalesh
  0 siblings, 1 reply; 53+ messages in thread
From: Chalamarla, Tirumalesh @ 2014-06-26 18:15 UTC (permalink / raw)
  To: Chalamarla, Tirumalesh, Joerg Roedel, Will Deacon
  Cc: kvm, open list, alex.williamson, stuart.yoder, iommu, tech,
	kvmarm, moderated list:ARM SMMU DRIVER

When I say emulating ITS, I mean translating guest ITS commands to physical ITS commands  and placing them in physical queue. 

Regards,
Tirumalesh.

-----Original Message-----
From: kvmarm-bounces@lists.cs.columbia.edu [mailto:kvmarm-bounces@lists.cs.columbia.edu] On Behalf Of Chalamarla, Tirumalesh
Sent: Thursday, June 26, 2014 11:08 AM
To: Joerg Roedel; Will Deacon
Cc: kvm@vger.kernel.org; open list; alex.williamson@redhat.com; stuart.yoder@freescale.com; iommu@lists.linux-foundation.org; tech@virtualopensystems.com; kvmarm@lists.cs.columbia.edu; moderated list:ARM SMMU DRIVER
Subject: RE: [RFC PATCH v6 04/20] iommu/arm-smmu: add capability IOMMU_CAP_INTR_REMAP

Forgive me if this discussion is not relative here, but I thought it is.  

How is VFIO restricting devices from writing  to MSI/MSI-X, Is all the vector area is mapped by VFIO to trap the accesses.  I am asking this because we might need to emulate ITS somewhere either in KVM or VFIO to provide direct access to devices.
And I don't see any mentions on that.   I think this flag needs to be set by ITS emulation.

Regards,
Tirumalesh.

-----Original Message-----
From: kvmarm-bounces@lists.cs.columbia.edu [mailto:kvmarm-bounces@lists.cs.columbia.edu] On Behalf Of Joerg Roedel
Sent: Monday, June 16, 2014 8:39 AM
To: Will Deacon
Cc: stuart.yoder@freescale.com; kvm@vger.kernel.org; open list; iommu@lists.linux-foundation.org; alex.williamson@redhat.com; moderated list:ARM SMMU DRIVER; tech@virtualopensystems.com; kvmarm@lists.cs.columbia.edu; Christoffer Dall
Subject: Re: [RFC PATCH v6 04/20] iommu/arm-smmu: add capability IOMMU_CAP_INTR_REMAP

On Mon, Jun 16, 2014 at 04:25:26PM +0100, Will Deacon wrote:
> Ok, thanks. In which case, I think this is really a combined property 
> of the SMMU and the interrupt controller, so we might need some extra 
> code so that the SMMU can check that the interrupt controller for the 
> device is also capable of interrupt remapping.

Right, that this is part of IOMMU code has more or less historic reasons on x86. Interrupt remapping is purely implemented in the IOMMU there, so on ARM some clue-code between interrupt controler and smmu is needed.


	Joerg


_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 53+ messages in thread

* RE: [RFC PATCH v6 04/20] iommu/arm-smmu: add capability IOMMU_CAP_INTR_REMAP
  2014-06-26 18:15                 ` Chalamarla, Tirumalesh
@ 2014-06-26 18:41                   ` Chalamarla, Tirumalesh
  2014-06-26 19:00                     ` Alex Williamson
  0 siblings, 1 reply; 53+ messages in thread
From: Chalamarla, Tirumalesh @ 2014-06-26 18:41 UTC (permalink / raw)
  To: Joerg Roedel, Will Deacon
  Cc: kvm, open list, alex.williamson, stuart.yoder, iommu, tech,
	kvmarm, moderated list:ARM SMMU DRIVER

Sorry there was a type,

The question is:
 
            How is VFIO restricting software from writing to MSI/MSI-X vectors of the device. 

-----Original Message-----
From: Chalamarla, Tirumalesh 
Sent: Thursday, June 26, 2014 11:16 AM
To: Chalamarla, Tirumalesh; Joerg Roedel; Will Deacon
Cc: kvm@vger.kernel.org; open list; alex.williamson@redhat.com; stuart.yoder@freescale.com; iommu@lists.linux-foundation.org; tech@virtualopensystems.com; kvmarm@lists.cs.columbia.edu; moderated list:ARM SMMU DRIVER
Subject: RE: [RFC PATCH v6 04/20] iommu/arm-smmu: add capability IOMMU_CAP_INTR_REMAP

When I say emulating ITS, I mean translating guest ITS commands to physical ITS commands  and placing them in physical queue. 

Regards,
Tirumalesh.

-----Original Message-----
From: kvmarm-bounces@lists.cs.columbia.edu [mailto:kvmarm-bounces@lists.cs.columbia.edu] On Behalf Of Chalamarla, Tirumalesh
Sent: Thursday, June 26, 2014 11:08 AM
To: Joerg Roedel; Will Deacon
Cc: kvm@vger.kernel.org; open list; alex.williamson@redhat.com; stuart.yoder@freescale.com; iommu@lists.linux-foundation.org; tech@virtualopensystems.com; kvmarm@lists.cs.columbia.edu; moderated list:ARM SMMU DRIVER
Subject: RE: [RFC PATCH v6 04/20] iommu/arm-smmu: add capability IOMMU_CAP_INTR_REMAP

Forgive me if this discussion is not relative here, but I thought it is.  

How is VFIO restricting devices from writing  to MSI/MSI-X, Is all the vector area is mapped by VFIO to trap the accesses.  I am asking this because we might need to emulate ITS somewhere either in KVM or VFIO to provide direct access to devices.
And I don't see any mentions on that.   I think this flag needs to be set by ITS emulation.

Regards,
Tirumalesh.

-----Original Message-----
From: kvmarm-bounces@lists.cs.columbia.edu [mailto:kvmarm-bounces@lists.cs.columbia.edu] On Behalf Of Joerg Roedel
Sent: Monday, June 16, 2014 8:39 AM
To: Will Deacon
Cc: stuart.yoder@freescale.com; kvm@vger.kernel.org; open list; iommu@lists.linux-foundation.org; alex.williamson@redhat.com; moderated list:ARM SMMU DRIVER; tech@virtualopensystems.com; kvmarm@lists.cs.columbia.edu; Christoffer Dall
Subject: Re: [RFC PATCH v6 04/20] iommu/arm-smmu: add capability IOMMU_CAP_INTR_REMAP

On Mon, Jun 16, 2014 at 04:25:26PM +0100, Will Deacon wrote:
> Ok, thanks. In which case, I think this is really a combined property 
> of the SMMU and the interrupt controller, so we might need some extra 
> code so that the SMMU can check that the interrupt controller for the 
> device is also capable of interrupt remapping.

Right, that this is part of IOMMU code has more or less historic reasons on x86. Interrupt remapping is purely implemented in the IOMMU there, so on ARM some clue-code between interrupt controler and smmu is needed.


	Joerg


_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC PATCH v6 04/20] iommu/arm-smmu: add capability IOMMU_CAP_INTR_REMAP
  2014-06-26 18:41                   ` Chalamarla, Tirumalesh
@ 2014-06-26 19:00                     ` Alex Williamson
  2014-06-26 19:10                       ` Chalamarla, Tirumalesh
  0 siblings, 1 reply; 53+ messages in thread
From: Alex Williamson @ 2014-06-26 19:00 UTC (permalink / raw)
  To: Chalamarla, Tirumalesh
  Cc: Joerg Roedel, Will Deacon, kvm, open list, stuart.yoder, iommu,
	tech, kvmarm, moderated list:ARM SMMU DRIVER

On Thu, 2014-06-26 at 18:41 +0000, Chalamarla, Tirumalesh wrote:
> Sorry there was a type,
> 
> The question is:
>  
>             How is VFIO restricting software from writing to MSI/MSI-X vectors of the device. 

All interrupts are configured via ioctl, not MSI config space or the
MSI-X vector table in MMIO space.  VFIO protects the MSI config area by
virtualizing it (you can't actually write the physical enable bit or
address/data through VFIO).  The MSI-X vector table is protected by
preventing read, write, or mmap access to it.  QEMU provides further
virtualization above the basics provided by VFIO.  We really can't
guarantee that devices don't have backdoors to configure these though.
See the realtek quirk in QEMU for an example of a device that has such a
backdoor.  That's why we require interrupt remapping, so that a device
that does this can only hurt the guest, and require the user to opt-out
if they feel they have a sufficiently trusted guest.  Thanks,

Alex

> 
> -----Original Message-----
> From: Chalamarla, Tirumalesh 
> Sent: Thursday, June 26, 2014 11:16 AM
> To: Chalamarla, Tirumalesh; Joerg Roedel; Will Deacon
> Cc: kvm@vger.kernel.org; open list; alex.williamson@redhat.com; stuart.yoder@freescale.com; iommu@lists.linux-foundation.org; tech@virtualopensystems.com; kvmarm@lists.cs.columbia.edu; moderated list:ARM SMMU DRIVER
> Subject: RE: [RFC PATCH v6 04/20] iommu/arm-smmu: add capability IOMMU_CAP_INTR_REMAP
> 
> When I say emulating ITS, I mean translating guest ITS commands to physical ITS commands  and placing them in physical queue. 
> 
> Regards,
> Tirumalesh.
> 
> -----Original Message-----
> From: kvmarm-bounces@lists.cs.columbia.edu [mailto:kvmarm-bounces@lists.cs.columbia.edu] On Behalf Of Chalamarla, Tirumalesh
> Sent: Thursday, June 26, 2014 11:08 AM
> To: Joerg Roedel; Will Deacon
> Cc: kvm@vger.kernel.org; open list; alex.williamson@redhat.com; stuart.yoder@freescale.com; iommu@lists.linux-foundation.org; tech@virtualopensystems.com; kvmarm@lists.cs.columbia.edu; moderated list:ARM SMMU DRIVER
> Subject: RE: [RFC PATCH v6 04/20] iommu/arm-smmu: add capability IOMMU_CAP_INTR_REMAP
> 
> Forgive me if this discussion is not relative here, but I thought it is.  
> 
> How is VFIO restricting devices from writing  to MSI/MSI-X, Is all the vector area is mapped by VFIO to trap the accesses.  I am asking this because we might need to emulate ITS somewhere either in KVM or VFIO to provide direct access to devices.
> And I don't see any mentions on that.   I think this flag needs to be set by ITS emulation.
> 
> Regards,
> Tirumalesh.
> 
> -----Original Message-----
> From: kvmarm-bounces@lists.cs.columbia.edu [mailto:kvmarm-bounces@lists.cs.columbia.edu] On Behalf Of Joerg Roedel
> Sent: Monday, June 16, 2014 8:39 AM
> To: Will Deacon
> Cc: stuart.yoder@freescale.com; kvm@vger.kernel.org; open list; iommu@lists.linux-foundation.org; alex.williamson@redhat.com; moderated list:ARM SMMU DRIVER; tech@virtualopensystems.com; kvmarm@lists.cs.columbia.edu; Christoffer Dall
> Subject: Re: [RFC PATCH v6 04/20] iommu/arm-smmu: add capability IOMMU_CAP_INTR_REMAP
> 
> On Mon, Jun 16, 2014 at 04:25:26PM +0100, Will Deacon wrote:
> > Ok, thanks. In which case, I think this is really a combined property 
> > of the SMMU and the interrupt controller, so we might need some extra 
> > code so that the SMMU can check that the interrupt controller for the 
> > device is also capable of interrupt remapping.
> 
> Right, that this is part of IOMMU code has more or less historic reasons on x86. Interrupt remapping is purely implemented in the IOMMU there, so on ARM some clue-code between interrupt controler and smmu is needed.
> 
> 
> 	Joerg
> 
> 
> _______________________________________________
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
> _______________________________________________
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm




^ permalink raw reply	[flat|nested] 53+ messages in thread

* RE: [RFC PATCH v6 04/20] iommu/arm-smmu: add capability IOMMU_CAP_INTR_REMAP
  2014-06-26 19:00                     ` Alex Williamson
@ 2014-06-26 19:10                       ` Chalamarla, Tirumalesh
  2014-06-26 19:36                         ` Alex Williamson
  0 siblings, 1 reply; 53+ messages in thread
From: Chalamarla, Tirumalesh @ 2014-06-26 19:10 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Joerg Roedel, Will Deacon, kvm, open list, stuart.yoder, iommu,
	tech, kvmarm, moderated list:ARM SMMU DRIVER

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 5274 bytes --]

Thanks for the clarification Alex, That’s exactly my point, why are we relying on  QEMU or something else to emulate the MSI space when we can directly give access to devices using ITS (of course with a small emulation code).
This way we are also benefited from all ITS services like VCPU migration etc.  

What about non QEMU VFIO users, for example, if I wanted to use VFIO to assign a device to a user process I don't need to depend on QEMU.   I thought this is one of the main goals of vfio to make it independent of hypervisors.     

Thanks,
Tirumalesh. 
          

-----Original Message-----
From: Alex Williamson [mailto:alex.williamson@redhat.com] 
Sent: Thursday, June 26, 2014 12:00 PM
To: Chalamarla, Tirumalesh
Cc: Joerg Roedel; Will Deacon; kvm@vger.kernel.org; open list; stuart.yoder@freescale.com; iommu@lists.linux-foundation.org; tech@virtualopensystems.com; kvmarm@lists.cs.columbia.edu; moderated list:ARM SMMU DRIVER
Subject: Re: [RFC PATCH v6 04/20] iommu/arm-smmu: add capability IOMMU_CAP_INTR_REMAP

On Thu, 2014-06-26 at 18:41 +0000, Chalamarla, Tirumalesh wrote:
> Sorry there was a type,
> 
> The question is:
>  
>             How is VFIO restricting software from writing to MSI/MSI-X vectors of the device. 

All interrupts are configured via ioctl, not MSI config space or the MSI-X vector table in MMIO space.  VFIO protects the MSI config area by virtualizing it (you can't actually write the physical enable bit or address/data through VFIO).  The MSI-X vector table is protected by preventing read, write, or mmap access to it.  QEMU provides further virtualization above the basics provided by VFIO.  We really can't guarantee that devices don't have backdoors to configure these though.
See the realtek quirk in QEMU for an example of a device that has such a backdoor.  That's why we require interrupt remapping, so that a device that does this can only hurt the guest, and require the user to opt-out if they feel they have a sufficiently trusted guest.  Thanks,

Alex

> 
> -----Original Message-----
> From: Chalamarla, Tirumalesh
> Sent: Thursday, June 26, 2014 11:16 AM
> To: Chalamarla, Tirumalesh; Joerg Roedel; Will Deacon
> Cc: kvm@vger.kernel.org; open list; alex.williamson@redhat.com; 
> stuart.yoder@freescale.com; iommu@lists.linux-foundation.org; 
> tech@virtualopensystems.com; kvmarm@lists.cs.columbia.edu; moderated 
> list:ARM SMMU DRIVER
> Subject: RE: [RFC PATCH v6 04/20] iommu/arm-smmu: add capability 
> IOMMU_CAP_INTR_REMAP
> 
> When I say emulating ITS, I mean translating guest ITS commands to physical ITS commands  and placing them in physical queue. 
> 
> Regards,
> Tirumalesh.
> 
> -----Original Message-----
> From: kvmarm-bounces@lists.cs.columbia.edu 
> [mailto:kvmarm-bounces@lists.cs.columbia.edu] On Behalf Of Chalamarla, 
> Tirumalesh
> Sent: Thursday, June 26, 2014 11:08 AM
> To: Joerg Roedel; Will Deacon
> Cc: kvm@vger.kernel.org; open list; alex.williamson@redhat.com; 
> stuart.yoder@freescale.com; iommu@lists.linux-foundation.org; 
> tech@virtualopensystems.com; kvmarm@lists.cs.columbia.edu; moderated 
> list:ARM SMMU DRIVER
> Subject: RE: [RFC PATCH v6 04/20] iommu/arm-smmu: add capability 
> IOMMU_CAP_INTR_REMAP
> 
> Forgive me if this discussion is not relative here, but I thought it is.  
> 
> How is VFIO restricting devices from writing  to MSI/MSI-X, Is all the vector area is mapped by VFIO to trap the accesses.  I am asking this because we might need to emulate ITS somewhere either in KVM or VFIO to provide direct access to devices.
> And I don't see any mentions on that.   I think this flag needs to be set by ITS emulation.
> 
> Regards,
> Tirumalesh.
> 
> -----Original Message-----
> From: kvmarm-bounces@lists.cs.columbia.edu 
> [mailto:kvmarm-bounces@lists.cs.columbia.edu] On Behalf Of Joerg 
> Roedel
> Sent: Monday, June 16, 2014 8:39 AM
> To: Will Deacon
> Cc: stuart.yoder@freescale.com; kvm@vger.kernel.org; open list; 
> iommu@lists.linux-foundation.org; alex.williamson@redhat.com; 
> moderated list:ARM SMMU DRIVER; tech@virtualopensystems.com; 
> kvmarm@lists.cs.columbia.edu; Christoffer Dall
> Subject: Re: [RFC PATCH v6 04/20] iommu/arm-smmu: add capability 
> IOMMU_CAP_INTR_REMAP
> 
> On Mon, Jun 16, 2014 at 04:25:26PM +0100, Will Deacon wrote:
> > Ok, thanks. In which case, I think this is really a combined 
> > property of the SMMU and the interrupt controller, so we might need 
> > some extra code so that the SMMU can check that the interrupt 
> > controller for the device is also capable of interrupt remapping.
> 
> Right, that this is part of IOMMU code has more or less historic reasons on x86. Interrupt remapping is purely implemented in the IOMMU there, so on ARM some clue-code between interrupt controler and smmu is needed.
> 
> 
> 	Joerg
> 
> 
> _______________________________________________
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
> _______________________________________________
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm



ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC PATCH v6 04/20] iommu/arm-smmu: add capability IOMMU_CAP_INTR_REMAP
  2014-06-26 19:10                       ` Chalamarla, Tirumalesh
@ 2014-06-26 19:36                         ` Alex Williamson
  2014-06-27  8:47                           ` Will Deacon
  0 siblings, 1 reply; 53+ messages in thread
From: Alex Williamson @ 2014-06-26 19:36 UTC (permalink / raw)
  To: Chalamarla, Tirumalesh
  Cc: Joerg Roedel, Will Deacon, kvm, open list, stuart.yoder, iommu,
	tech, kvmarm, moderated list:ARM SMMU DRIVER

On Thu, 2014-06-26 at 19:10 +0000, Chalamarla, Tirumalesh wrote:
> Thanks for the clarification Alex, That’s exactly my point, why are we relying on  QEMU or something else to emulate the MSI space when we can directly give access to devices using ITS (of course with a small emulation code).
> This way we are also benefited from all ITS services like VCPU migration etc.  

I have no idea what ITS is.

> What about non QEMU VFIO users, for example, if I wanted to use VFIO to assign a device to a user process I don't need to depend on QEMU.   I thought this is one of the main goals of vfio to make it independent of hypervisors.     

Where did QEMU become a requirement?  Maybe I'm missing something in the
ARM part of the conversation that got chopped off, but this is exactly
why we have the VFIO/QEMU split that we do.  VFIO provides basic
virtualization for config space and restricts access to other areas that
users shouldn't be allowed to change.  QEMU is just one example of a
userspace VFIO driver.  QEMU takes the decomposed device exposed through
the VFIO ABI and re-creates a PCI device out of it.  VFIO itself has no
dependency on QEMU.  Thanks,

Alex  

> -----Original Message-----
> From: Alex Williamson [mailto:alex.williamson@redhat.com] 
> Sent: Thursday, June 26, 2014 12:00 PM
> To: Chalamarla, Tirumalesh
> Cc: Joerg Roedel; Will Deacon; kvm@vger.kernel.org; open list; stuart.yoder@freescale.com; iommu@lists.linux-foundation.org; tech@virtualopensystems.com; kvmarm@lists.cs.columbia.edu; moderated list:ARM SMMU DRIVER
> Subject: Re: [RFC PATCH v6 04/20] iommu/arm-smmu: add capability IOMMU_CAP_INTR_REMAP
> 
> On Thu, 2014-06-26 at 18:41 +0000, Chalamarla, Tirumalesh wrote:
> > Sorry there was a type,
> > 
> > The question is:
> >  
> >             How is VFIO restricting software from writing to MSI/MSI-X vectors of the device. 
> 
> All interrupts are configured via ioctl, not MSI config space or the MSI-X vector table in MMIO space.  VFIO protects the MSI config area by virtualizing it (you can't actually write the physical enable bit or address/data through VFIO).  The MSI-X vector table is protected by preventing read, write, or mmap access to it.  QEMU provides further virtualization above the basics provided by VFIO.  We really can't guarantee that devices don't have backdoors to configure these though.
> See the realtek quirk in QEMU for an example of a device that has such a backdoor.  That's why we require interrupt remapping, so that a device that does this can only hurt the guest, and require the user to opt-out if they feel they have a sufficiently trusted guest.  Thanks,
> 
> Alex
> 
> > 
> > -----Original Message-----
> > From: Chalamarla, Tirumalesh
> > Sent: Thursday, June 26, 2014 11:16 AM
> > To: Chalamarla, Tirumalesh; Joerg Roedel; Will Deacon
> > Cc: kvm@vger.kernel.org; open list; alex.williamson@redhat.com; 
> > stuart.yoder@freescale.com; iommu@lists.linux-foundation.org; 
> > tech@virtualopensystems.com; kvmarm@lists.cs.columbia.edu; moderated 
> > list:ARM SMMU DRIVER
> > Subject: RE: [RFC PATCH v6 04/20] iommu/arm-smmu: add capability 
> > IOMMU_CAP_INTR_REMAP
> > 
> > When I say emulating ITS, I mean translating guest ITS commands to physical ITS commands  and placing them in physical queue. 
> > 
> > Regards,
> > Tirumalesh.
> > 
> > -----Original Message-----
> > From: kvmarm-bounces@lists.cs.columbia.edu 
> > [mailto:kvmarm-bounces@lists.cs.columbia.edu] On Behalf Of Chalamarla, 
> > Tirumalesh
> > Sent: Thursday, June 26, 2014 11:08 AM
> > To: Joerg Roedel; Will Deacon
> > Cc: kvm@vger.kernel.org; open list; alex.williamson@redhat.com; 
> > stuart.yoder@freescale.com; iommu@lists.linux-foundation.org; 
> > tech@virtualopensystems.com; kvmarm@lists.cs.columbia.edu; moderated 
> > list:ARM SMMU DRIVER
> > Subject: RE: [RFC PATCH v6 04/20] iommu/arm-smmu: add capability 
> > IOMMU_CAP_INTR_REMAP
> > 
> > Forgive me if this discussion is not relative here, but I thought it is.  
> > 
> > How is VFIO restricting devices from writing  to MSI/MSI-X, Is all the vector area is mapped by VFIO to trap the accesses.  I am asking this because we might need to emulate ITS somewhere either in KVM or VFIO to provide direct access to devices.
> > And I don't see any mentions on that.   I think this flag needs to be set by ITS emulation.
> > 
> > Regards,
> > Tirumalesh.
> > 
> > -----Original Message-----
> > From: kvmarm-bounces@lists.cs.columbia.edu 
> > [mailto:kvmarm-bounces@lists.cs.columbia.edu] On Behalf Of Joerg 
> > Roedel
> > Sent: Monday, June 16, 2014 8:39 AM
> > To: Will Deacon
> > Cc: stuart.yoder@freescale.com; kvm@vger.kernel.org; open list; 
> > iommu@lists.linux-foundation.org; alex.williamson@redhat.com; 
> > moderated list:ARM SMMU DRIVER; tech@virtualopensystems.com; 
> > kvmarm@lists.cs.columbia.edu; Christoffer Dall
> > Subject: Re: [RFC PATCH v6 04/20] iommu/arm-smmu: add capability 
> > IOMMU_CAP_INTR_REMAP
> > 
> > On Mon, Jun 16, 2014 at 04:25:26PM +0100, Will Deacon wrote:
> > > Ok, thanks. In which case, I think this is really a combined 
> > > property of the SMMU and the interrupt controller, so we might need 
> > > some extra code so that the SMMU can check that the interrupt 
> > > controller for the device is also capable of interrupt remapping.
> > 
> > Right, that this is part of IOMMU code has more or less historic reasons on x86. Interrupt remapping is purely implemented in the IOMMU there, so on ARM some clue-code between interrupt controler and smmu is needed.
> > 
> > 
> > 	Joerg
> > 
> > 
> > _______________________________________________
> > kvmarm mailing list
> > kvmarm@lists.cs.columbia.edu
> > https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
> > _______________________________________________
> > kvmarm mailing list
> > kvmarm@lists.cs.columbia.edu
> > https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
> 
> 
> 




^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC PATCH v6 04/20] iommu/arm-smmu: add capability IOMMU_CAP_INTR_REMAP
  2014-06-26 19:36                         ` Alex Williamson
@ 2014-06-27  8:47                           ` Will Deacon
  2014-06-27 21:57                             ` Chalamarla, Tirumalesh
  0 siblings, 1 reply; 53+ messages in thread
From: Will Deacon @ 2014-06-27  8:47 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Chalamarla, Tirumalesh, Joerg Roedel, kvm, open list,
	stuart.yoder, iommu, tech, kvmarm,
	moderated list:ARM SMMU DRIVER, marc.zyngier

On Thu, Jun 26, 2014 at 08:36:24PM +0100, Alex Williamson wrote:
> On Thu, 2014-06-26 at 19:10 +0000, Chalamarla, Tirumalesh wrote:
> > Thanks for the clarification Alex, That’s exactly my point, why are we
> > relying on  QEMU or something else to emulate the MSI space when we can
> > directly give access to devices using ITS (of course with a small
> > emulation code).  This way we are also benefited from all ITS services
> > like VCPU migration etc.  
> 
> I have no idea what ITS is.

ITS is the MSI doorbell for GICv3 (ARM's latest interrupt controller).

I agree that we will need an ITS emulation if we want to use MSIs in the
guest, and I believe that Marc (CC'd) had already started thinking about
that.

> > What about non QEMU VFIO users, for example, if I wanted to use VFIO to
> > assign a device to a user process I don't need to depend on QEMU.   I
> > thought this is one of the main goals of vfio to make it independent of
> > hypervisors.     
> 
> Where did QEMU become a requirement?  Maybe I'm missing something in the
> ARM part of the conversation that got chopped off, but this is exactly
> why we have the VFIO/QEMU split that we do.  VFIO provides basic
> virtualization for config space and restricts access to other areas that
> users shouldn't be allowed to change.  QEMU is just one example of a
> userspace VFIO driver.  QEMU takes the decomposed device exposed through
> the VFIO ABI and re-creates a PCI device out of it.  VFIO itself has no
> dependency on QEMU.  Thanks,

I also don't understand the QEMU part here. The MSI emulation would be in
the kernel, just like the GICv2 emulation that we already have. For
userspace drivers, wouldn't you just use eventfd rather than bother with
emulating MSIs?

Finally, the interrupt remapping part is about the SMMU preventing MSI
writes to arbitrary portions of the host address space. The ITS is about
routing interrupts to CPUs.

Will

^ permalink raw reply	[flat|nested] 53+ messages in thread

* RE: [RFC PATCH v6 04/20] iommu/arm-smmu: add capability IOMMU_CAP_INTR_REMAP
  2014-06-27  8:47                           ` Will Deacon
@ 2014-06-27 21:57                             ` Chalamarla, Tirumalesh
  2014-06-28  7:05                               ` Marc Zyngier
  0 siblings, 1 reply; 53+ messages in thread
From: Chalamarla, Tirumalesh @ 2014-06-27 21:57 UTC (permalink / raw)
  To: Will Deacon, Alex Williamson
  Cc: Joerg Roedel, kvm, open list, stuart.yoder, iommu, tech, kvmarm,
	moderated list:ARM SMMU DRIVER, marc.zyngier

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 2809 bytes --]

Marc,

         What is your opinion on ITS emulation . is it should be part of KVM or VFIO.
        Also this code needs to depend on ITS host driver a lot, Host ITS driver needs to have an interface for this code to use.

Thanks,
 Tirumalesh
     
-----Original Message-----
From: Will Deacon [mailto:will.deacon@arm.com] 
Sent: Friday, June 27, 2014 1:47 AM
To: Alex Williamson
Cc: Chalamarla, Tirumalesh; Joerg Roedel; kvm@vger.kernel.org; open list; stuart.yoder@freescale.com; iommu@lists.linux-foundation.org; tech@virtualopensystems.com; kvmarm@lists.cs.columbia.edu; moderated list:ARM SMMU DRIVER; marc.zyngier@arm.com
Subject: Re: [RFC PATCH v6 04/20] iommu/arm-smmu: add capability IOMMU_CAP_INTR_REMAP

On Thu, Jun 26, 2014 at 08:36:24PM +0100, Alex Williamson wrote:
> On Thu, 2014-06-26 at 19:10 +0000, Chalamarla, Tirumalesh wrote:
> > Thanks for the clarification Alex, That’s exactly my point, why are 
> > we relying on  QEMU or something else to emulate the MSI space when 
> > we can directly give access to devices using ITS (of course with a 
> > small emulation code).  This way we are also benefited from all ITS 
> > services like VCPU migration etc.
> 
> I have no idea what ITS is.

ITS is the MSI doorbell for GICv3 (ARM's latest interrupt controller).

I agree that we will need an ITS emulation if we want to use MSIs in the guest, and I believe that Marc (CC'd) had already started thinking about that.


> > What about non QEMU VFIO users, for example, if I wanted to use VFIO to
> > assign a device to a user process I don't need to depend on QEMU.   I
> > thought this is one of the main goals of vfio to make it independent of
> > hypervisors.     
> 
> Where did QEMU become a requirement?  Maybe I'm missing something in 
> the ARM part of the conversation that got chopped off, but this is 
> exactly why we have the VFIO/QEMU split that we do.  VFIO provides 
> basic virtualization for config space and restricts access to other 
> areas that users shouldn't be allowed to change.  QEMU is just one 
> example of a userspace VFIO driver.  QEMU takes the decomposed device 
> exposed through the VFIO ABI and re-creates a PCI device out of it.  
> VFIO itself has no dependency on QEMU.  Thanks,

I also don't understand the QEMU part here. The MSI emulation would be in the kernel, just like the GICv2 emulation that we already have. For userspace drivers, wouldn't you just use eventfd rather than bother with emulating MSIs?

Finally, the interrupt remapping part is about the SMMU preventing MSI writes to arbitrary portions of the host address space. The ITS is about routing interrupts to CPUs.

Will
ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC PATCH v6 04/20] iommu/arm-smmu: add capability IOMMU_CAP_INTR_REMAP
  2014-06-27 21:57                             ` Chalamarla, Tirumalesh
@ 2014-06-28  7:05                               ` Marc Zyngier
  0 siblings, 0 replies; 53+ messages in thread
From: Marc Zyngier @ 2014-06-28  7:05 UTC (permalink / raw)
  To: Chalamarla, Tirumalesh
  Cc: Will Deacon, Alex Williamson, Joerg Roedel, kvm, open list,
	stuart.yoder, iommu, tech, kvmarm,
	moderated list:ARM SMMU DRIVER

On Fri, Jun 27 2014 at 10:57:28 PM, "Chalamarla, Tirumalesh" <Tirumalesh.Chalamarla@caviumnetworks.com> wrote:
> Marc,
>
>          What is your opinion on ITS emulation . is it should be part
>          of KVM or VFIO.

Making any sort of emulation part of VFIO sounds quite wrong. That's not
what VFIO is about, at all. Emulation belongs to the hypervisor, and
nowhere else.

>         Also this code needs to depend on ITS host driver a lot, Host
>         ITS driver needs to have an interface for this code to use.

You can share the command interface as some form of library, but that's
about it. There is no more relationship between the ITS driver and the
ITS emulation as there is between the GIC driver and its emulation
counterpart.

	M.


> Thanks,
>  Tirumalesh
>      
> -----Original Message-----
> From: Will Deacon [mailto:will.deacon@arm.com] 
> Sent: Friday, June 27, 2014 1:47 AM
> To: Alex Williamson
> Cc: Chalamarla, Tirumalesh; Joerg Roedel; kvm@vger.kernel.org; open list; stuart.yoder@freescale.com; iommu@lists.linux-foundation.org; tech@virtualopensystems.com; kvmarm@lists.cs.columbia.edu; moderated list:ARM SMMU DRIVER; marc.zyngier@arm.com
> Subject: Re: [RFC PATCH v6 04/20] iommu/arm-smmu: add capability IOMMU_CAP_INTR_REMAP
>
> On Thu, Jun 26, 2014 at 08:36:24PM +0100, Alex Williamson wrote:
>> On Thu, 2014-06-26 at 19:10 +0000, Chalamarla, Tirumalesh wrote:
>> > Thanks for the clarification Alex, That’s exactly my point, why are 
>> > we relying on  QEMU or something else to emulate the MSI space when 
>> > we can directly give access to devices using ITS (of course with a 
>> > small emulation code).  This way we are also benefited from all ITS 
>> > services like VCPU migration etc.
>> 
>> I have no idea what ITS is.
>
> ITS is the MSI doorbell for GICv3 (ARM's latest interrupt controller).
>
> I agree that we will need an ITS emulation if we want to use MSIs in
>> the guest, and I believe that Marc (CC'd) had already started
>> thinking about that.
>
>
>> > What about non QEMU VFIO users, for example, if I wanted to use VFIO to
>> > assign a device to a user process I don't need to depend on QEMU.   I
>> > thought this is one of the main goals of vfio to make it independent of
>> > hypervisors.     
>> 
>> Where did QEMU become a requirement?  Maybe I'm missing something in 
>> the ARM part of the conversation that got chopped off, but this is 
>> exactly why we have the VFIO/QEMU split that we do.  VFIO provides 
>> basic virtualization for config space and restricts access to other 
>> areas that users shouldn't be allowed to change.  QEMU is just one 
>> example of a userspace VFIO driver.  QEMU takes the decomposed device 
>> exposed through the VFIO ABI and re-creates a PCI device out of it.  
>> VFIO itself has no dependency on QEMU.  Thanks,
>
> I also don't understand the QEMU part here. The MSI emulation would be
>> in the kernel, just like the GICv2 emulation that we already
>> have. For userspace drivers, wouldn't you just use eventfd rather
>> than bother with emulating MSIs?
>
> Finally, the interrupt remapping part is about the SMMU preventing MSI
>> writes to arbitrary portions of the host address space. The ITS is
>> about routing interrupts to CPUs.
>
> Will

-- 
Jazz is not dead. It just smells funny.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC PATCH v6 15/20] vfio/platform: support for maskable and automasked interrupts
  2014-06-08 10:17   ` Christoffer Dall
@ 2014-09-02 16:06     ` Antonios Motakis
  2014-09-10 10:13       ` Christoffer Dall
  0 siblings, 1 reply; 53+ messages in thread
From: Antonios Motakis @ 2014-09-02 16:06 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Alex Williamson, kvm-arm, Linux IOMMU,
	VirtualOpenSystems Technical Team, alvise rigo,
	KVM devel mailing list, Will Deacon, Kim Phillips, Stuart Yoder,
	Eric Auger, Catalin Marinas, Mark Rutland, Vladimir Murzin,
	open list

On Sun, Jun 8, 2014 at 12:17 PM, Christoffer Dall
<christoffer.dall@linaro.org> wrote:
> On Thu, Jun 05, 2014 at 07:03:23PM +0200, Antonios Motakis wrote:
>> Adds support to mask interrupts, and also for automasked interrupts.
>> Level sensitive interrupts are exposed as automasked interrupts and
>> are masked and disabled automatically when they fire.
>>
>> Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
>> ---
>>  drivers/vfio/platform/vfio_platform_irq.c     | 112 ++++++++++++++++++++++++--
>>  drivers/vfio/platform/vfio_platform_private.h |   2 +
>>  2 files changed, 109 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
>> index d79f5af..10dfbf0 100644
>> --- a/drivers/vfio/platform/vfio_platform_irq.c
>> +++ b/drivers/vfio/platform/vfio_platform_irq.c
>> @@ -51,9 +51,17 @@ int vfio_platform_irq_init(struct vfio_platform_device *vdev)
>>               if (hwirq < 0)
>>                       goto err;
>>
>> -             vdev->irq[i].flags = VFIO_IRQ_INFO_EVENTFD;
>> +             spin_lock_init(&vdev->irq[i].lock);
>> +
>> +             vdev->irq[i].flags = VFIO_IRQ_INFO_EVENTFD
>> +                                     | VFIO_IRQ_INFO_MASKABLE;
>> +
>> +             if (irq_get_trigger_type(hwirq) & IRQ_TYPE_LEVEL_MASK)
>> +                     vdev->irq[i].flags |= VFIO_IRQ_INFO_AUTOMASKED;
>
> This seems to rely on the fact that you had actually loaded a driver for
> your device to set the right type.  Is this assumption always correct?
>
> It seems to me that this configuration bit should now be up to your user
> space drive who is the best candidate to know details about your device
> at this point?
>

Hm, I see this type being set usually either in a device tree source,
or in the support code for a specific platform. Are there any
situations where this is actually set by the driver? If I understand
right this is not the case for the PL330, but if it is possible that
it is the case for another device then I need to rethink this. Though
as far as I understand this should not be the case.

>> +
>>               vdev->irq[i].count = 1;
>>               vdev->irq[i].hwirq = hwirq;
>> +             vdev->irq[i].masked = false;
>>       }
>>
>>       vdev->num_irqs = cnt;
>> @@ -77,11 +85,27 @@ void vfio_platform_irq_cleanup(struct vfio_platform_device *vdev)
>>
>>  static irqreturn_t vfio_irq_handler(int irq, void *dev_id)
>>  {
>> -     struct eventfd_ctx *trigger = dev_id;
>> +     struct vfio_platform_irq *irq_ctx = dev_id;
>> +     unsigned long flags;
>> +     int ret = IRQ_NONE;
>> +
>> +     spin_lock_irqsave(&irq_ctx->lock, flags);
>> +
>> +     if (!irq_ctx->masked) {
>> +             ret = IRQ_HANDLED;
>> +
>> +             if (irq_ctx->flags & VFIO_IRQ_INFO_AUTOMASKED) {
>> +                     disable_irq_nosync(irq_ctx->hwirq);
>> +                     irq_ctx->masked = true;
>> +             }
>> +     }
>>
>> -     eventfd_signal(trigger, 1);
>> +     spin_unlock_irqrestore(&irq_ctx->lock, flags);
>>
>> -     return IRQ_HANDLED;
>> +     if (ret == IRQ_HANDLED)
>> +             eventfd_signal(irq_ctx->trigger, 1);
>> +
>> +     return ret;
>>  }
>>
>>  static int vfio_set_trigger(struct vfio_platform_device *vdev,
>> @@ -162,6 +186,82 @@ static int vfio_platform_set_irq_trigger(struct vfio_platform_device *vdev,
>>       return -EFAULT;
>>  }
>>
>> +static int vfio_platform_set_irq_unmask(struct vfio_platform_device *vdev,
>> +                                 unsigned index, unsigned start,
>> +                                 unsigned count, uint32_t flags, void *data)
>> +{
>> +     uint8_t arr;
>
>
> arr?

arr for array! As in, the VFIO API allows an array of IRQs. However
for platform devices we don't use this, each IRQ is exposed
independently as an array of 1 IRQ.

>
>> +
>> +     if (start != 0 || count != 1)
>> +             return -EINVAL;
>> +
>> +     switch (flags & VFIO_IRQ_SET_DATA_TYPE_MASK) {
>> +     case VFIO_IRQ_SET_DATA_BOOL:
>> +             if (copy_from_user(&arr, data, sizeof(uint8_t)))
>> +                     return -EFAULT;
>> +
>> +             if (arr != 0x1)
>> +                     return -EINVAL;
>
> why the fallthrough, what's this about?

The VFIO API allows to unmask/mask an array of IRQs, however with
platform devices we only have arrays of 1 IRQ (so not really arrays).

So if the user uses VFIO_IRQ_SET_DATA_BOOL, we need to check that arr
== 0x1. When that is the case, a fallthrough to the same code for
VFIO_IRQ_SET_DATA_NONE is safe.

If that is not readable enough, then I can add a comment or duplicate
the code that does the unmasking. I realize that if you don't know the
VFIO API well, then this can look confusing.

>
>> +
>> +     case VFIO_IRQ_SET_DATA_NONE:
>> +
>> +             spin_lock_irq(&vdev->irq[index].lock);
>> +
>> +             if (vdev->irq[index].masked) {
>> +                     enable_irq(vdev->irq[index].hwirq);
>> +                     vdev->irq[index].masked = false;
>> +             }
>> +
>> +             spin_unlock_irq(&vdev->irq[index].lock);
>> +
>> +             return 0;
>> +
>> +     case VFIO_IRQ_SET_DATA_EVENTFD: /* XXX not implemented yet */
>> +     default:
>> +             return -ENOTTY;
>> +     }
>> +
>> +     return 0;
>> +}
>> +
>> +static int vfio_platform_set_irq_mask(struct vfio_platform_device *vdev,
>> +                                 unsigned index, unsigned start,
>> +                                 unsigned count, uint32_t flags, void *data)
>> +{
>> +     uint8_t arr;
>> +
>> +     if (start != 0 || count != 1)
>> +             return -EINVAL;
>> +
>> +     switch (flags & VFIO_IRQ_SET_DATA_TYPE_MASK) {
>> +     case VFIO_IRQ_SET_DATA_BOOL:
>> +             if (copy_from_user(&arr, data, sizeof(uint8_t)))
>> +                     return -EFAULT;
>> +
>> +             if (arr != 0x1)
>> +                     return -EINVAL;
>> +
>> +     case VFIO_IRQ_SET_DATA_NONE:
>> +
>> +             spin_lock_irq(&vdev->irq[index].lock);
>> +
>> +             if (!vdev->irq[index].masked) {
>> +                     disable_irq(vdev->irq[index].hwirq);
>> +                     vdev->irq[index].masked = true;
>> +             }
>> +
>> +             spin_unlock_irq(&vdev->irq[index].lock);
>> +
>> +             return 0;
>> +
>> +     case VFIO_IRQ_SET_DATA_EVENTFD: /* XXX not implemented yet */
>> +     default:
>> +             return -ENOTTY;
>> +     }
>> +
>> +     return 0;
>> +}
>> +
>>  int vfio_platform_set_irqs_ioctl(struct vfio_platform_device *vdev,
>>                                uint32_t flags, unsigned index, unsigned start,
>>                                unsigned count, void *data)
>> @@ -172,8 +272,10 @@ int vfio_platform_set_irqs_ioctl(struct vfio_platform_device *vdev,
>>
>>       switch (flags & VFIO_IRQ_SET_ACTION_TYPE_MASK) {
>>       case VFIO_IRQ_SET_ACTION_MASK:
>> +             func = vfio_platform_set_irq_mask;
>> +             break;
>>       case VFIO_IRQ_SET_ACTION_UNMASK:
>> -             /* XXX not implemented */
>> +             func = vfio_platform_set_irq_unmask;
>>               break;
>>       case VFIO_IRQ_SET_ACTION_TRIGGER:
>>               func = vfio_platform_set_irq_trigger;
>> diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
>> index d6200df..4d887fd 100644
>> --- a/drivers/vfio/platform/vfio_platform_private.h
>> +++ b/drivers/vfio/platform/vfio_platform_private.h
>> @@ -30,6 +30,8 @@ struct vfio_platform_irq {
>>       u32                     count;
>>       int                     hwirq;
>>       char                    *name;
>> +     bool                    masked;
>> +     spinlock_t              lock;
>>  };
>>
>>  struct vfio_platform_region {
>> --
>> 1.8.3.2
>>



-- 
Antonios Motakis
Virtual Open Systems

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC PATCH v6 14/20] vfio/platform: initial interrupts support
  2014-06-08 10:09   ` Christoffer Dall
@ 2014-09-02 16:07     ` Antonios Motakis
  0 siblings, 0 replies; 53+ messages in thread
From: Antonios Motakis @ 2014-09-02 16:07 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Alex Williamson, kvm-arm, Linux IOMMU,
	VirtualOpenSystems Technical Team, alvise rigo,
	KVM devel mailing list, Will Deacon, Kim Phillips, Stuart Yoder,
	Eric Auger, Catalin Marinas, Mark Rutland, Vladimir Murzin,
	open list

On Sun, Jun 8, 2014 at 12:09 PM, Christoffer Dall
<christoffer.dall@linaro.org> wrote:
>
> On Thu, Jun 05, 2014 at 07:03:22PM +0200, Antonios Motakis wrote:
> > This patch allows to set an eventfd for a patform device's interrupt,
> > and also to trigger the interrupt eventfd from userspace for testing.
> >
> > Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
> > ---
> >  drivers/vfio/platform/vfio_platform.c         |  36 ++++++-
> >  drivers/vfio/platform/vfio_platform_irq.c     | 130 +++++++++++++++++++++++++-
> >  drivers/vfio/platform/vfio_platform_private.h |   7 ++
> >  3 files changed, 169 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/vfio/platform/vfio_platform.c b/drivers/vfio/platform/vfio_platform.c
> > index 192291c..f4c06c6 100644
> > --- a/drivers/vfio/platform/vfio_platform.c
> > +++ b/drivers/vfio/platform/vfio_platform.c
> > @@ -177,10 +177,40 @@ static long vfio_platform_ioctl(void *device_data,
> >
> >               return copy_to_user((void __user *)arg, &info, minsz);
> >
> > -     } else if (cmd == VFIO_DEVICE_SET_IRQS)
> > -             return -EINVAL;
> > +     } else if (cmd == VFIO_DEVICE_SET_IRQS) {
> > +             struct vfio_irq_set hdr;
> > +             int ret = 0;
> > +
> > +             minsz = offsetofend(struct vfio_irq_set, count);
> > +
> > +             if (copy_from_user(&hdr, (void __user *)arg, minsz))
> > +                     return -EFAULT;
> > +
> > +             if (hdr.argsz < minsz)
> > +                     return -EINVAL;
> > +
> > +             if (hdr.index >= vdev->num_irqs)
> > +                     return -EINVAL;
> > +
> > +             if (hdr.start != 0 || hdr.count > 1)
> > +                     return -EINVAL;
> > +
> > +             if (hdr.count == 0 &&
> > +                     (!(hdr.flags & VFIO_IRQ_SET_DATA_NONE) ||
> > +                      !(hdr.flags & VFIO_IRQ_SET_ACTION_TRIGGER)))
> > +                     return -EINVAL;
> > +
> > +             if (hdr.flags & ~(VFIO_IRQ_SET_DATA_TYPE_MASK |
> > +                               VFIO_IRQ_SET_ACTION_TYPE_MASK))
> > +                     return -EINVAL;
> > +
> > +             ret = vfio_platform_set_irqs_ioctl(vdev, hdr.flags, hdr.index,
> > +                                                hdr.start, hdr.count,
> > +                                                (void *)arg+minsz);
> > +
> > +             return ret;
> >
> > -     else if (cmd == VFIO_DEVICE_RESET)
> > +     } else if (cmd == VFIO_DEVICE_RESET)
> >               return -EINVAL;
> >
> >       return -ENOTTY;
> > diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
> > index 22c214f..d79f5af 100644
> > --- a/drivers/vfio/platform/vfio_platform_irq.c
> > +++ b/drivers/vfio/platform/vfio_platform_irq.c
> > @@ -31,6 +31,9 @@
> >
> >  #include "vfio_platform_private.h"
> >
> > +static int vfio_set_trigger(struct vfio_platform_device *vdev,
> > +                         int index, int fd);
> > +
> >  int vfio_platform_irq_init(struct vfio_platform_device *vdev)
> >  {
> >       int cnt = 0, i;
> > @@ -43,17 +46,142 @@ int vfio_platform_irq_init(struct vfio_platform_device *vdev)
> >               return -ENOMEM;
> >
> >       for (i = 0; i < cnt; i++) {
> > -             vdev->irq[i].flags = 0;
> > +             int hwirq = platform_get_irq(vdev->pdev, i);
> > +
> > +             if (hwirq < 0)
> > +                     goto err;
> > +
> > +             vdev->irq[i].flags = VFIO_IRQ_INFO_EVENTFD;
> >               vdev->irq[i].count = 1;
> > +             vdev->irq[i].hwirq = hwirq;
> >       }
> >
> >       vdev->num_irqs = cnt;
> >
> >       return 0;
> > +err:
> > +     kfree(vdev->irq);
> > +     return -EINVAL;
> >  }
> >
> >  void vfio_platform_irq_cleanup(struct vfio_platform_device *vdev)
> >  {
> > +     int i;
> > +
> > +     for (i = 0; i < vdev->num_irqs; i++)
> > +             vfio_set_trigger(vdev, i, -1);
> > +
> >       vdev->num_irqs = 0;
> >       kfree(vdev->irq);
> >  }
> > +
> > +static irqreturn_t vfio_irq_handler(int irq, void *dev_id)
> > +{
> > +     struct eventfd_ctx *trigger = dev_id;
> > +
> > +     eventfd_signal(trigger, 1);
> > +
> > +     return IRQ_HANDLED;
> > +}
> > +
> > +static int vfio_set_trigger(struct vfio_platform_device *vdev,
> > +                         int index, int fd)
> > +{
> > +     struct vfio_platform_irq *irq = &vdev->irq[index];
> > +     struct eventfd_ctx *trigger;
> > +     int ret;
> > +
> > +     if (irq->trigger) {
> > +             free_irq(irq->hwirq, irq);
> > +             kfree(irq->name);
> > +             eventfd_ctx_put(irq->trigger);
> > +             irq->trigger = NULL;
> > +     }
>
> this feels incredibly racy, is some lock protecting this access?
>

Good catch; there should have been a mutex earlier protecting it, but
it is missing. Thanks.

> -Christoffer




-- 
Antonios Motakis
Virtual Open Systems

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC PATCH v6 15/20] vfio/platform: support for maskable and automasked interrupts
  2014-09-02 16:06     ` Antonios Motakis
@ 2014-09-10 10:13       ` Christoffer Dall
  2014-09-11 17:20         ` Antonios Motakis
  0 siblings, 1 reply; 53+ messages in thread
From: Christoffer Dall @ 2014-09-10 10:13 UTC (permalink / raw)
  To: Antonios Motakis
  Cc: Alex Williamson, kvm-arm, Linux IOMMU,
	VirtualOpenSystems Technical Team, alvise rigo,
	KVM devel mailing list, Will Deacon, Kim Phillips, Stuart Yoder,
	Eric Auger, Catalin Marinas, Mark Rutland, Vladimir Murzin,
	open list

On Tue, Sep 02, 2014 at 06:06:17PM +0200, Antonios Motakis wrote:
> On Sun, Jun 8, 2014 at 12:17 PM, Christoffer Dall
> <christoffer.dall@linaro.org> wrote:
> > On Thu, Jun 05, 2014 at 07:03:23PM +0200, Antonios Motakis wrote:
> >> Adds support to mask interrupts, and also for automasked interrupts.
> >> Level sensitive interrupts are exposed as automasked interrupts and
> >> are masked and disabled automatically when they fire.
> >>
> >> Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
> >> ---
> >>  drivers/vfio/platform/vfio_platform_irq.c     | 112 ++++++++++++++++++++++++--
> >>  drivers/vfio/platform/vfio_platform_private.h |   2 +
> >>  2 files changed, 109 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
> >> index d79f5af..10dfbf0 100644
> >> --- a/drivers/vfio/platform/vfio_platform_irq.c
> >> +++ b/drivers/vfio/platform/vfio_platform_irq.c
> >> @@ -51,9 +51,17 @@ int vfio_platform_irq_init(struct vfio_platform_device *vdev)
> >>               if (hwirq < 0)
> >>                       goto err;
> >>
> >> -             vdev->irq[i].flags = VFIO_IRQ_INFO_EVENTFD;
> >> +             spin_lock_init(&vdev->irq[i].lock);
> >> +
> >> +             vdev->irq[i].flags = VFIO_IRQ_INFO_EVENTFD
> >> +                                     | VFIO_IRQ_INFO_MASKABLE;
> >> +
> >> +             if (irq_get_trigger_type(hwirq) & IRQ_TYPE_LEVEL_MASK)
> >> +                     vdev->irq[i].flags |= VFIO_IRQ_INFO_AUTOMASKED;
> >
> > This seems to rely on the fact that you had actually loaded a driver for
> > your device to set the right type.  Is this assumption always correct?
> >
> > It seems to me that this configuration bit should now be up to your user
> > space drive who is the best candidate to know details about your device
> > at this point?
> >
> 
> Hm, I see this type being set usually either in a device tree source,
> or in the support code for a specific platform. Are there any
> situations where this is actually set by the driver? If I understand
> right this is not the case for the PL330, but if it is possible that
> it is the case for another device then I need to rethink this. Though
> as far as I understand this should not be the case.
> 

Wow, this has been incredibly long time since I looked at this code, so
not sure if I remember my original reasoning anymore, however,

while device properties are set in the DT, they would only be available
to this code if you actually loaded a device driver for that device,
right?  I'm just not sure that assumption always holds for devices used
by VFIO, but I'm really not sure anymore.  Maybe I'm rambling.

> >> +
> >>               vdev->irq[i].count = 1;
> >>               vdev->irq[i].hwirq = hwirq;
> >> +             vdev->irq[i].masked = false;
> >>       }
> >>
> >>       vdev->num_irqs = cnt;
> >> @@ -77,11 +85,27 @@ void vfio_platform_irq_cleanup(struct vfio_platform_device *vdev)
> >>
> >>  static irqreturn_t vfio_irq_handler(int irq, void *dev_id)
> >>  {
> >> -     struct eventfd_ctx *trigger = dev_id;
> >> +     struct vfio_platform_irq *irq_ctx = dev_id;
> >> +     unsigned long flags;
> >> +     int ret = IRQ_NONE;
> >> +
> >> +     spin_lock_irqsave(&irq_ctx->lock, flags);
> >> +
> >> +     if (!irq_ctx->masked) {
> >> +             ret = IRQ_HANDLED;
> >> +
> >> +             if (irq_ctx->flags & VFIO_IRQ_INFO_AUTOMASKED) {
> >> +                     disable_irq_nosync(irq_ctx->hwirq);
> >> +                     irq_ctx->masked = true;
> >> +             }
> >> +     }
> >>
> >> -     eventfd_signal(trigger, 1);
> >> +     spin_unlock_irqrestore(&irq_ctx->lock, flags);
> >>
> >> -     return IRQ_HANDLED;
> >> +     if (ret == IRQ_HANDLED)
> >> +             eventfd_signal(irq_ctx->trigger, 1);
> >> +
> >> +     return ret;
> >>  }
> >>
> >>  static int vfio_set_trigger(struct vfio_platform_device *vdev,
> >> @@ -162,6 +186,82 @@ static int vfio_platform_set_irq_trigger(struct vfio_platform_device *vdev,
> >>       return -EFAULT;
> >>  }
> >>
> >> +static int vfio_platform_set_irq_unmask(struct vfio_platform_device *vdev,
> >> +                                 unsigned index, unsigned start,
> >> +                                 unsigned count, uint32_t flags, void *data)
> >> +{
> >> +     uint8_t arr;
> >
> >
> > arr?
> 
> arr for array! As in, the VFIO API allows an array of IRQs. However
> for platform devices we don't use this, each IRQ is exposed
> independently as an array of 1 IRQ.
> 

but it's not an array.  If it contains IRQs, call it irqs.  Unless this
is referring specifically to a field *named* array, I don't remember the
API at current, but reading the code along it didn't make sense to me to
have a uint8_t called arr, and code should make as much sense
independenly as possible.

This reminds me of people writing code like:

String str;

yuck.

> >
> >> +
> >> +     if (start != 0 || count != 1)
> >> +             return -EINVAL;
> >> +
> >> +     switch (flags & VFIO_IRQ_SET_DATA_TYPE_MASK) {
> >> +     case VFIO_IRQ_SET_DATA_BOOL:
> >> +             if (copy_from_user(&arr, data, sizeof(uint8_t)))
> >> +                     return -EFAULT;
> >> +
> >> +             if (arr != 0x1)
> >> +                     return -EINVAL;
> >
> > why the fallthrough, what's this about?
> 
> The VFIO API allows to unmask/mask an array of IRQs, however with
> platform devices we only have arrays of 1 IRQ (so not really arrays).
> 
> So if the user uses VFIO_IRQ_SET_DATA_BOOL, we need to check that arr
> == 0x1. When that is the case, a fallthrough to the same code for
> VFIO_IRQ_SET_DATA_NONE is safe.
> 
> If that is not readable enough, then I can add a comment or duplicate
> the code that does the unmasking. I realize that if you don't know the
> VFIO API well, then this can look confusing.
> 

yeah, please put a big fat comment explaining the fallthrough.

-Christoffer

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC PATCH v6 15/20] vfio/platform: support for maskable and automasked interrupts
  2014-09-10 10:13       ` Christoffer Dall
@ 2014-09-11 17:20         ` Antonios Motakis
  0 siblings, 0 replies; 53+ messages in thread
From: Antonios Motakis @ 2014-09-11 17:20 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Alex Williamson, kvm-arm, Linux IOMMU,
	VirtualOpenSystems Technical Team, alvise rigo,
	KVM devel mailing list, Will Deacon, Kim Phillips, Stuart Yoder,
	Eric Auger, Catalin Marinas, Mark Rutland, Vladimir Murzin,
	open list

On Wed, Sep 10, 2014 at 12:13 PM, Christoffer Dall
<christoffer.dall@linaro.org> wrote:
> On Tue, Sep 02, 2014 at 06:06:17PM +0200, Antonios Motakis wrote:
>> On Sun, Jun 8, 2014 at 12:17 PM, Christoffer Dall
>> <christoffer.dall@linaro.org> wrote:
>> > On Thu, Jun 05, 2014 at 07:03:23PM +0200, Antonios Motakis wrote:
>> >> Adds support to mask interrupts, and also for automasked interrupts.
>> >> Level sensitive interrupts are exposed as automasked interrupts and
>> >> are masked and disabled automatically when they fire.
>> >>
>> >> Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
>> >> ---
>> >>  drivers/vfio/platform/vfio_platform_irq.c     | 112 ++++++++++++++++++++++++--
>> >>  drivers/vfio/platform/vfio_platform_private.h |   2 +
>> >>  2 files changed, 109 insertions(+), 5 deletions(-)
>> >>
>> >> diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
>> >> index d79f5af..10dfbf0 100644
>> >> --- a/drivers/vfio/platform/vfio_platform_irq.c
>> >> +++ b/drivers/vfio/platform/vfio_platform_irq.c
>> >> @@ -51,9 +51,17 @@ int vfio_platform_irq_init(struct vfio_platform_device *vdev)
>> >>               if (hwirq < 0)
>> >>                       goto err;
>> >>
>> >> -             vdev->irq[i].flags = VFIO_IRQ_INFO_EVENTFD;
>> >> +             spin_lock_init(&vdev->irq[i].lock);
>> >> +
>> >> +             vdev->irq[i].flags = VFIO_IRQ_INFO_EVENTFD
>> >> +                                     | VFIO_IRQ_INFO_MASKABLE;
>> >> +
>> >> +             if (irq_get_trigger_type(hwirq) & IRQ_TYPE_LEVEL_MASK)
>> >> +                     vdev->irq[i].flags |= VFIO_IRQ_INFO_AUTOMASKED;
>> >
>> > This seems to rely on the fact that you had actually loaded a driver for
>> > your device to set the right type.  Is this assumption always correct?
>> >
>> > It seems to me that this configuration bit should now be up to your user
>> > space drive who is the best candidate to know details about your device
>> > at this point?
>> >
>>
>> Hm, I see this type being set usually either in a device tree source,
>> or in the support code for a specific platform. Are there any
>> situations where this is actually set by the driver? If I understand
>> right this is not the case for the PL330, but if it is possible that
>> it is the case for another device then I need to rethink this. Though
>> as far as I understand this should not be the case.
>>
>
> Wow, this has been incredibly long time since I looked at this code, so
> not sure if I remember my original reasoning anymore, however,
>
> while device properties are set in the DT, they would only be available
> to this code if you actually loaded a device driver for that device,
> right?  I'm just not sure that assumption always holds for devices used
> by VFIO, but I'm really not sure anymore.  Maybe I'm rambling.

The device I'm testing with, the PL330 DMAC, is one of the devices
that exposes level sensitive interrupts, and therefore for it to
properly work VFIO needs to be able to expose it as automasked.

I just tested the code on a kernel that doesn't include the native
PL330 DMA driver. It seems that even so, the unmasked property is
properly detected and exposed by VFIO. So for this scenario at least
the assumptions are true...

I'm afraid I have to admit that if there are any edge cases where this
might not be true, I don't know which they are :(

>
>> >> +
>> >>               vdev->irq[i].count = 1;
>> >>               vdev->irq[i].hwirq = hwirq;
>> >> +             vdev->irq[i].masked = false;
>> >>       }
>> >>
>> >>       vdev->num_irqs = cnt;
>> >> @@ -77,11 +85,27 @@ void vfio_platform_irq_cleanup(struct vfio_platform_device *vdev)
>> >>
>> >>  static irqreturn_t vfio_irq_handler(int irq, void *dev_id)
>> >>  {
>> >> -     struct eventfd_ctx *trigger = dev_id;
>> >> +     struct vfio_platform_irq *irq_ctx = dev_id;
>> >> +     unsigned long flags;
>> >> +     int ret = IRQ_NONE;
>> >> +
>> >> +     spin_lock_irqsave(&irq_ctx->lock, flags);
>> >> +
>> >> +     if (!irq_ctx->masked) {
>> >> +             ret = IRQ_HANDLED;
>> >> +
>> >> +             if (irq_ctx->flags & VFIO_IRQ_INFO_AUTOMASKED) {
>> >> +                     disable_irq_nosync(irq_ctx->hwirq);
>> >> +                     irq_ctx->masked = true;
>> >> +             }
>> >> +     }
>> >>
>> >> -     eventfd_signal(trigger, 1);
>> >> +     spin_unlock_irqrestore(&irq_ctx->lock, flags);
>> >>
>> >> -     return IRQ_HANDLED;
>> >> +     if (ret == IRQ_HANDLED)
>> >> +             eventfd_signal(irq_ctx->trigger, 1);
>> >> +
>> >> +     return ret;
>> >>  }
>> >>
>> >>  static int vfio_set_trigger(struct vfio_platform_device *vdev,
>> >> @@ -162,6 +186,82 @@ static int vfio_platform_set_irq_trigger(struct vfio_platform_device *vdev,
>> >>       return -EFAULT;
>> >>  }
>> >>
>> >> +static int vfio_platform_set_irq_unmask(struct vfio_platform_device *vdev,
>> >> +                                 unsigned index, unsigned start,
>> >> +                                 unsigned count, uint32_t flags, void *data)
>> >> +{
>> >> +     uint8_t arr;
>> >
>> >
>> > arr?
>>
>> arr for array! As in, the VFIO API allows an array of IRQs. However
>> for platform devices we don't use this, each IRQ is exposed
>> independently as an array of 1 IRQ.
>>
>
> but it's not an array.  If it contains IRQs, call it irqs.  Unless this
> is referring specifically to a field *named* array, I don't remember the
> API at current, but reading the code along it didn't make sense to me to
> have a uint8_t called arr, and code should make as much sense
> independenly as possible.
>
> This reminds me of people writing code like:
>
> String str;
>
> yuck.

You are completely right, the naming is unfortunate. I will change it
to something clearer, (e.g. irq_bitmap?)

>
>> >
>> >> +
>> >> +     if (start != 0 || count != 1)
>> >> +             return -EINVAL;
>> >> +
>> >> +     switch (flags & VFIO_IRQ_SET_DATA_TYPE_MASK) {
>> >> +     case VFIO_IRQ_SET_DATA_BOOL:
>> >> +             if (copy_from_user(&arr, data, sizeof(uint8_t)))
>> >> +                     return -EFAULT;
>> >> +
>> >> +             if (arr != 0x1)
>> >> +                     return -EINVAL;
>> >
>> > why the fallthrough, what's this about?
>>
>> The VFIO API allows to unmask/mask an array of IRQs, however with
>> platform devices we only have arrays of 1 IRQ (so not really arrays).
>>
>> So if the user uses VFIO_IRQ_SET_DATA_BOOL, we need to check that arr
>> == 0x1. When that is the case, a fallthrough to the same code for
>> VFIO_IRQ_SET_DATA_NONE is safe.
>>
>> If that is not readable enough, then I can add a comment or duplicate
>> the code that does the unmasking. I realize that if you don't know the
>> VFIO API well, then this can look confusing.
>>
>
> yeah, please put a big fat comment explaining the fallthrough.

Ack.

>
> -Christoffer



-- 
Antonios Motakis
Virtual Open Systems

^ permalink raw reply	[flat|nested] 53+ messages in thread

end of thread, other threads:[~2014-09-11 17:20 UTC | newest]

Thread overview: 53+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <1401987808-23596-1-git-send-email-a.motakis@virtualopensystems.com>
2014-06-05 17:03 ` [RFC PATCH v6 01/20] iommu/arm-smmu: change IOMMU_EXEC to IOMMU_NOEXEC Antonios Motakis
2014-06-16 15:04   ` Will Deacon
2014-06-05 17:03 ` [RFC PATCH v6 02/20] iommu: add capability IOMMU_CAP_NOEXEC Antonios Motakis
2014-06-05 20:03   ` Alex Williamson
2014-06-05 17:03 ` [RFC PATCH v6 03/20] iommu/arm-smmu: add IOMMU_CAP_NOEXEC to the ARM SMMU driver Antonios Motakis
2014-06-16 15:04   ` Will Deacon
2014-06-16 15:25     ` Alex Williamson
2014-06-16 15:30       ` Will Deacon
2014-06-05 17:03 ` [RFC PATCH v6 04/20] iommu/arm-smmu: add capability IOMMU_CAP_INTR_REMAP Antonios Motakis
2014-06-05 18:31   ` Varun Sethi
2014-06-08 10:31   ` Christoffer Dall
2014-06-16 14:53     ` Joerg Roedel
2014-06-16 15:13       ` Will Deacon
2014-06-16 15:21         ` Joerg Roedel
2014-06-16 15:25           ` Will Deacon
2014-06-16 15:38             ` Joerg Roedel
2014-06-26 18:08               ` Chalamarla, Tirumalesh
2014-06-26 18:15                 ` Chalamarla, Tirumalesh
2014-06-26 18:41                   ` Chalamarla, Tirumalesh
2014-06-26 19:00                     ` Alex Williamson
2014-06-26 19:10                       ` Chalamarla, Tirumalesh
2014-06-26 19:36                         ` Alex Williamson
2014-06-27  8:47                           ` Will Deacon
2014-06-27 21:57                             ` Chalamarla, Tirumalesh
2014-06-28  7:05                               ` Marc Zyngier
2014-06-16 15:30           ` Alex Williamson
2014-06-05 17:03 ` [RFC PATCH v6 05/20] vfio/iommu_type1: support for platform bus devices on ARM Antonios Motakis
2014-06-05 17:03 ` [RFC PATCH v6 06/20] vfio: introduce the VFIO_DMA_MAP_FLAG_NOEXEC flag Antonios Motakis
2014-06-05 17:03 ` [RFC PATCH v6 07/20] vfio/iommu_type1: implement " Antonios Motakis
2014-06-05 20:48   ` Alex Williamson
2014-06-05 17:03 ` [RFC PATCH v6 08/20] driver core: platform: add device binding path 'driver_override' Antonios Motakis
2014-06-05 17:03 ` [RFC PATCH v6 09/20] vfio/platform: initial skeleton of VFIO support for platform devices Antonios Motakis
2014-06-05 17:03 ` [RFC PATCH v6 10/20] vfio/platform: return info for device and its memory mapped IO regions Antonios Motakis
2014-06-05 21:14   ` Alex Williamson
2014-06-06 16:39     ` Antonios Motakis
2014-06-05 17:03 ` [RFC PATCH v6 11/20] vfio/platform: read and write support for the device fd Antonios Motakis
2014-06-05 17:03 ` [RFC PATCH v6 12/20] vfio/platform: support MMAP of MMIO regions Antonios Motakis
2014-06-05 17:03 ` [RFC PATCH v6 13/20] vfio/platform: return IRQ info Antonios Motakis
2014-06-05 17:03 ` [RFC PATCH v6 14/20] vfio/platform: initial interrupts support Antonios Motakis
2014-06-08 10:09   ` Christoffer Dall
2014-09-02 16:07     ` Antonios Motakis
2014-06-05 17:03 ` [RFC PATCH v6 15/20] vfio/platform: support for maskable and automasked interrupts Antonios Motakis
2014-06-08 10:17   ` Christoffer Dall
2014-09-02 16:06     ` Antonios Motakis
2014-09-10 10:13       ` Christoffer Dall
2014-09-11 17:20         ` Antonios Motakis
2014-06-05 17:03 ` [RFC PATCH v6 16/20] vfio: move eventfd support code for VFIO_PCI to a sepparate file Antonios Motakis
2014-06-05 17:03 ` [RFC PATCH v6 17/20] vfio: add local lock in virqfd instead of depending on VFIO PCI Antonios Motakis
2014-06-05 22:19   ` Alex Williamson
2014-06-06 16:57     ` Antonios Motakis
2014-06-05 17:03 ` [RFC PATCH v6 18/20] vfio: pass an opaque pointer on virqfd initialization Antonios Motakis
2014-06-05 17:03 ` [RFC PATCH v6 19/20] vfio: initialize the virqfd workqueue in VFIO generic code Antonios Motakis
2014-06-05 17:03 ` [RFC PATCH v6 20/20] vfio/platform: implement IRQ masking/unmasking via an eventfd Antonios Motakis

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).