* [PATCH v6 0/7] vfio/type1: Add support for valid iova list management
@ 2018-04-18 11:40 ` Shameer Kolothum
  0 siblings, 0 replies; 42+ messages in thread
From: Shameer Kolothum @ 2018-04-18 11:40 UTC (permalink / raw)
  To: alex.williamson, eric.auger, pmorel
  Cc: kvm, linux-kernel, iommu, linuxarm, john.garry, xuwei5, Shameer Kolothum

This series introduces an iova list associated with a vfio iommu. The
list is kept up to date with the iommu apertures and reserved regions.
The series also adds checks for conflicts with existing dma mappings
whenever a new device group is attached to the domain.

User-space can retrieve the valid iova ranges using the
VFIO_IOMMU_GET_INFO ioctl capability chain. Any dma map request outside
the valid iova ranges is rejected.
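
For illustration, below is a minimal user-space sketch (error handling
omitted) of walking the capability chain to read the valid iova ranges.
It assumes the VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE capability and the
cap_offset field added by patch #6 of this series.

#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

static void print_valid_iova_ranges(int container)
{
	struct vfio_iommu_type1_info *info;
	struct vfio_info_cap_header *hdr;
	__u32 i;

	/* First call with the base size just to learn the required argsz */
	info = calloc(1, sizeof(*info));
	info->argsz = sizeof(*info);
	ioctl(container, VFIO_IOMMU_GET_INFO, info);

	info = realloc(info, info->argsz);
	ioctl(container, VFIO_IOMMU_GET_INFO, info);

	if (!(info->flags & VFIO_IOMMU_INFO_CAPS) || !info->cap_offset)
		goto out;

	/* Walk the capability chain for the iova range capability */
	for (hdr = (void *)info + info->cap_offset; ;
	     hdr = (void *)info + hdr->next) {
		if (hdr->id == VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE) {
			struct vfio_iommu_type1_info_cap_iova_range *cap =
				(void *)hdr;

			for (i = 0; i < cap->nr_iovas; i++)
				printf("valid iova: 0x%llx - 0x%llx\n",
				       (unsigned long long)cap->iova_ranges[i].start,
				       (unsigned long long)cap->iova_ranges[i].end);
		}
		if (!hdr->next)
			break;
	}
out:
	free(info);
}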

v5 --> v6

 -Rebased to 4.17-rc1.
 -Reordered the series so that the previous patch #7 ("iommu/dma: Move
  PCI window region reservation back...") is now patch #4. This avoids
  the bisection issue pointed out by Alex.
 -Added Robin's Reviewed-by tag to patch #4.

v4 --> v5

 -Rebased to next-20180315.
 -Incorporated the corner case bug fix suggested by Alex into patch #5.
 -Added patch #7 based on suggestions from Alex and Robin. This moves
  the PCI window reservation back into the DMA-specific path, fixing
  the issue reported by Eric [1].

Note:
Patch #4 (previously patch #7) depends on [2] and [3].

1. https://patchwork.kernel.org/patch/10232043/
2. https://patchwork.kernel.org/patch/10216553/
3. https://patchwork.kernel.org/patch/10216555/

v3 --> v4
 Addressed comments received for v3:
 -Use dma_addr_t instead of phys_addr_t.
 -Use LIST_HEAD().
 -Free up the iova_copy list in case of error.
 -Updated the logic for filling in the iova capability info (patch #5).

RFCv2 --> v3
 Removed the RFC tag.
 Addressed comments from Alex and Eric:
 -Added comments to make the iova list management logic clearer.
 -Work on a copy of the iova list so that the original is not altered
  in case of failure.

RFCv1 --> RFCv2
 Addressed comments from Alex:
 -Introduced IOVA list management and added checks for conflicts with
  existing dma map entries during attach/detach.

Shameer Kolothum (7):
  vfio/type1: Introduce iova list and add iommu aperture validity check
  vfio/type1: Check reserve region conflict and update iova list
  vfio/type1: Update iova list on detach
  iommu/dma: Move PCI window region reservation back into dma specific
    path.
  vfio/type1: check dma map request is within a valid iova range
  vfio/type1: Add IOVA range capability support
  vfio/type1: remove duplicate retrieval of reserved regions

 drivers/iommu/dma-iommu.c       |  54 ++---
 drivers/vfio/vfio_iommu_type1.c | 497 +++++++++++++++++++++++++++++++++++++++-
 include/uapi/linux/vfio.h       |  23 ++
 3 files changed, 533 insertions(+), 41 deletions(-)

-- 
2.7.4


* [PATCH v6 1/7] vfio/type1: Introduce iova list and add iommu aperture validity check
@ 2018-04-18 11:40   ` Shameer Kolothum
  0 siblings, 0 replies; 42+ messages in thread
From: Shameer Kolothum @ 2018-04-18 11:40 UTC (permalink / raw)
  To: alex.williamson, eric.auger, pmorel
  Cc: kvm, linux-kernel, iommu, linuxarm, john.garry, xuwei5, Shameer Kolothum

This introduces a list of iova ranges that are valid for dma mappings.
During attach, make sure the new iommu aperture window doesn't conflict
with the current one or with any existing dma mappings.

Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
---
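For illustration, a minimal user-space sketch (not part of the patch;
the fd names and the iova value are assumptions) of how the new
conflict check surfaces: a dma mapping that would fall outside a newly
attached group's iommu aperture makes the attach fail.

	struct vfio_iommu_type1_dma_map map = {
		.argsz = sizeof(map),
		.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE,
		.vaddr = (__u64)(uintptr_t)buf,
		.iova  = 0x0,	/* assumed to lie below group2's aperture */
		.size  = 0x100000,
	};

	/* Succeeds: iova 0x0 is within the current valid iova list */
	ioctl(container_fd, VFIO_IOMMU_MAP_DMA, &map);

	/*
	 * group2's aperture starts above the mapping, so
	 * vfio_iommu_aper_conflict() fails the attach with -EINVAL.
	 */
	ret = ioctl(group2_fd, VFIO_GROUP_SET_CONTAINER, &container_fd);
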
 drivers/vfio/vfio_iommu_type1.c | 183 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 180 insertions(+), 3 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 5c212bf..775946d 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -60,6 +60,7 @@ MODULE_PARM_DESC(disable_hugepages,
 
 struct vfio_iommu {
 	struct list_head	domain_list;
+	struct list_head	iova_list;
 	struct vfio_domain	*external_domain; /* domain for external user */
 	struct mutex		lock;
 	struct rb_root		dma_list;
@@ -92,6 +93,12 @@ struct vfio_group {
 	struct list_head	next;
 };
 
+struct vfio_iova {
+	struct list_head	list;
+	dma_addr_t		start;
+	dma_addr_t		end;
+};
+
 /*
  * Guest RAM pinning working set or DMA target
  */
@@ -1313,6 +1320,149 @@ static bool vfio_iommu_has_sw_msi(struct iommu_group *group, phys_addr_t *base)
 	return ret;
 }
 
+/*
+ * This is a helper function to insert an address range into the iova list.
+ * The list starts with a single entry corresponding to the IOMMU
+ * domain geometry to which the device group is attached. The list
+ * aperture gets modified when a new domain is added to the container
+ * if the new aperture doesn't conflict with the current one or with
+ * any existing dma mappings. The list is also modified to exclude
+ * any reserved regions associated with the device group.
+ */
+static int vfio_iommu_iova_insert(struct list_head *head,
+				  dma_addr_t start, dma_addr_t end)
+{
+	struct vfio_iova *region;
+
+	region = kmalloc(sizeof(*region), GFP_KERNEL);
+	if (!region)
+		return -ENOMEM;
+
+	INIT_LIST_HEAD(&region->list);
+	region->start = start;
+	region->end = end;
+
+	list_add_tail(&region->list, head);
+	return 0;
+}
+
+/*
+ * Check whether the new iommu aperture conflicts with the existing
+ * aperture or with any existing dma mappings.
+ */
+static bool vfio_iommu_aper_conflict(struct vfio_iommu *iommu,
+				     dma_addr_t start, dma_addr_t end)
+{
+	struct vfio_iova *first, *last;
+	struct list_head *iova = &iommu->iova_list;
+
+	if (list_empty(iova))
+		return false;
+
+	/* Disjoint sets, return conflict */
+	first = list_first_entry(iova, struct vfio_iova, list);
+	last = list_last_entry(iova, struct vfio_iova, list);
+	if ((start > last->end) || (end < first->start))
+		return true;
+
+	/* Check for any existing dma mappings outside the new start */
+	if (start > first->start) {
+		if (vfio_find_dma(iommu, first->start, start - first->start))
+			return true;
+	}
+
+	/* Check for any existing dma mappings outside the new end */
+	if (end < last->end) {
+		if (vfio_find_dma(iommu, end + 1, last->end - end))
+			return true;
+	}
+
+	return false;
+}
+
+/*
+ * Resize iommu iova aperture window. This is called only if the new
+ * aperture has no conflict with existing aperture and dma mappings.
+ */
+static int vfio_iommu_aper_resize(struct list_head *iova,
+				      dma_addr_t start,
+				      dma_addr_t end)
+{
+	struct vfio_iova *node, *next;
+
+	if (list_empty(iova))
+		return vfio_iommu_iova_insert(iova, start, end);
+
+	/* Adjust iova list start */
+	list_for_each_entry_safe(node, next, iova, list) {
+		if (start < node->start)
+			break;
+		if ((start >= node->start) && (start < node->end)) {
+			node->start = start;
+			break;
+		}
+		/* Delete nodes before new start */
+		list_del(&node->list);
+		kfree(node);
+	}
+
+	/* Adjust iova list end */
+	list_for_each_entry_safe(node, next, iova, list) {
+		if (end > node->end)
+			continue;
+		if ((end > node->start) && (end <= node->end)) {
+			node->end = end;
+			continue;
+		}
+		/* Delete nodes after new end */
+		list_del(&node->list);
+		kfree(node);
+	}
+
+	return 0;
+}
+
+static void vfio_iommu_iova_free(struct list_head *iova)
+{
+	struct vfio_iova *n, *next;
+
+	list_for_each_entry_safe(n, next, iova, list) {
+		list_del(&n->list);
+		kfree(n);
+	}
+}
+
+static int vfio_iommu_iova_get_copy(struct vfio_iommu *iommu,
+				    struct list_head *iova_copy)
+{
+
+	struct list_head *iova = &iommu->iova_list;
+	struct vfio_iova *n;
+	int ret;
+
+	list_for_each_entry(n, iova, list) {
+		ret = vfio_iommu_iova_insert(iova_copy, n->start, n->end);
+		if (ret)
+			goto out_free;
+	}
+
+	return 0;
+
+out_free:
+	vfio_iommu_iova_free(iova_copy);
+	return ret;
+}
+
+static void vfio_iommu_iova_insert_copy(struct vfio_iommu *iommu,
+					struct list_head *iova_copy)
+{
+	struct list_head *iova = &iommu->iova_list;
+
+	vfio_iommu_iova_free(iova);
+
+	list_splice_tail(iova_copy, iova);
+}
+
 static int vfio_iommu_type1_attach_group(void *iommu_data,
 					 struct iommu_group *iommu_group)
 {
@@ -1323,6 +1473,8 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	int ret;
 	bool resv_msi, msi_remap;
 	phys_addr_t resv_msi_base;
+	struct iommu_domain_geometry geo;
+	LIST_HEAD(iova_copy);
 
 	mutex_lock(&iommu->lock);
 
@@ -1392,6 +1544,25 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	if (ret)
 		goto out_domain;
 
+	/* Get aperture info */
+	iommu_domain_get_attr(domain->domain, DOMAIN_ATTR_GEOMETRY, &geo);
+
+	if (vfio_iommu_aper_conflict(iommu, geo.aperture_start,
+					    geo.aperture_end)) {
+		ret = -EINVAL;
+		goto out_detach;
+	}
+
+	/* Get a copy of the current iova list and work on it */
+	ret = vfio_iommu_iova_get_copy(iommu, &iova_copy);
+	if (ret)
+		goto out_detach;
+
+	ret = vfio_iommu_aper_resize(&iova_copy, geo.aperture_start,
+						 geo.aperture_end);
+	if (ret)
+		goto out_detach;
+
 	resv_msi = vfio_iommu_has_sw_msi(iommu_group, &resv_msi_base);
 
 	INIT_LIST_HEAD(&domain->group_list);
@@ -1425,8 +1596,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 				list_add(&group->next, &d->group_list);
 				iommu_domain_free(domain->domain);
 				kfree(domain);
-				mutex_unlock(&iommu->lock);
-				return 0;
+				goto done;
 			}
 
 			ret = iommu_attach_group(domain->domain, iommu_group);
@@ -1449,7 +1619,9 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	}
 
 	list_add(&domain->next, &iommu->domain_list);
-
+done:
+	/* Delete the old one and insert new iova list */
+	vfio_iommu_iova_insert_copy(iommu, &iova_copy);
 	mutex_unlock(&iommu->lock);
 
 	return 0;
@@ -1458,6 +1630,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	iommu_detach_group(domain->domain, iommu_group);
 out_domain:
 	iommu_domain_free(domain->domain);
+	vfio_iommu_iova_free(&iova_copy);
 out_free:
 	kfree(domain);
 	kfree(group);
@@ -1596,6 +1769,7 @@ static void *vfio_iommu_type1_open(unsigned long arg)
 	}
 
 	INIT_LIST_HEAD(&iommu->domain_list);
+	INIT_LIST_HEAD(&iommu->iova_list);
 	iommu->dma_list = RB_ROOT;
 	mutex_init(&iommu->lock);
 	BLOCKING_INIT_NOTIFIER_HEAD(&iommu->notifier);
@@ -1638,6 +1812,9 @@ static void vfio_iommu_type1_release(void *iommu_data)
 		list_del(&domain->next);
 		kfree(domain);
 	}
+
+	vfio_iommu_iova_free(&iommu->iova_list);
+
 	kfree(iommu);
 }
 
-- 
2.7.4


* [PATCH v6 2/7] vfio/type1: Check reserve region conflict and update iova list
@ 2018-04-18 11:40   ` Shameer Kolothum
  0 siblings, 0 replies; 42+ messages in thread
From: Shameer Kolothum @ 2018-04-18 11:40 UTC (permalink / raw)
  To: alex.williamson, eric.auger, pmorel
  Cc: kvm, linux-kernel, iommu, linuxarm, john.garry, xuwei5, Shameer Kolothum

This retrieves the reserved regions associated with the device group
and checks for conflicts with any existing dma mappings. It also
updates the iova list to exclude the reserved regions.

Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
---
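For illustration, a minimal stand-alone model (an assumption-laden
sketch, not the patch code) of how excluding one reserved region splits
a valid iova node, mirroring the insert-before-then-delete scheme in
vfio_iommu_resv_exclude():

#include <stdio.h>

struct range { unsigned long long start, end; };	/* inclusive */

/* Remove reserved region [rs, re] from r; up to two pieces survive */
static int exclude(struct range r, unsigned long long rs,
		   unsigned long long re, struct range out[2])
{
	int n = 0;

	if (rs > r.end || re < r.start) {	/* no overlap */
		out[n++] = r;
		return n;
	}
	if (rs > r.start)	/* piece below the reserved region */
		out[n++] = (struct range){ r.start, rs - 1 };
	if (re < r.end)		/* piece above the reserved region */
		out[n++] = (struct range){ re + 1, r.end };
	return n;
}

int main(void)
{
	struct range out[2];
	/* e.g. an MSI window [0x8000000, 0x80fffff] (example values) */
	int i, n = exclude((struct range){ 0x0, 0xffffffffULL },
			   0x8000000, 0x80fffff, out);

	for (i = 0; i < n; i++)
		printf("[0x%llx, 0x%llx]\n", out[i].start, out[i].end);
	/* prints [0x0, 0x7ffffff] and [0x8100000, 0xffffffff] */
	return 0;
}
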
 drivers/vfio/vfio_iommu_type1.c | 90 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 90 insertions(+)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 775946d..a0a79e1 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -1422,6 +1422,82 @@ static int vfio_iommu_aper_resize(struct list_head *iova,
 	return 0;
 }
 
+/*
+ * Check reserved region conflicts with existing dma mappings
+ */
+static bool vfio_iommu_resv_conflict(struct vfio_iommu *iommu,
+				struct list_head *resv_regions)
+{
+	struct iommu_resv_region *region;
+
+	/* Check for conflict with existing dma mappings */
+	list_for_each_entry(region, resv_regions, list) {
+		if (vfio_find_dma(iommu, region->start, region->length))
+			return true;
+	}
+
+	return false;
+}
+
+/*
+ * Check the iova ranges for overlap with the reserved regions and
+ * exclude the reserved regions from the iommu iova range
+ */
+static int vfio_iommu_resv_exclude(struct list_head *iova,
+					struct list_head *resv_regions)
+{
+	struct iommu_resv_region *resv;
+	struct vfio_iova *n, *next;
+
+	list_for_each_entry(resv, resv_regions, list) {
+		phys_addr_t start, end;
+
+		start = resv->start;
+		end = resv->start + resv->length - 1;
+
+		list_for_each_entry_safe(n, next, iova, list) {
+			int ret = 0;
+
+			/* No overlap */
+			if ((start > n->end) || (end < n->start))
+				continue;
+			/*
+			 * Insert new nodes if the current node overlaps with
+			 * the reserved region, to exclude that region from
+			 * the valid iova range. Note that the new nodes are
+			 * inserted before the current node, and the current
+			 * node is finally deleted, keeping the list sorted.
+			 */
+			if (start > n->start)
+				ret = vfio_iommu_iova_insert(&n->list,
+							n->start, start - 1);
+			if (!ret && end < n->end)
+				ret = vfio_iommu_iova_insert(&n->list,
+							end + 1, n->end);
+			if (ret)
+				return ret;
+
+			list_del(&n->list);
+			kfree(n);
+		}
+	}
+
+	if (list_empty(iova))
+		return -EINVAL;
+
+	return 0;
+}
+
+static void vfio_iommu_resv_free(struct list_head *resv_regions)
+{
+	struct iommu_resv_region *n, *next;
+
+	list_for_each_entry_safe(n, next, resv_regions, list) {
+		list_del(&n->list);
+		kfree(n);
+	}
+}
+
 static void vfio_iommu_iova_free(struct list_head *iova)
 {
 	struct vfio_iova *n, *next;
@@ -1475,6 +1551,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	phys_addr_t resv_msi_base;
 	struct iommu_domain_geometry geo;
 	LIST_HEAD(iova_copy);
+	LIST_HEAD(group_resv_regions);
 
 	mutex_lock(&iommu->lock);
 
@@ -1553,6 +1630,13 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 		goto out_detach;
 	}
 
+	iommu_get_group_resv_regions(iommu_group, &group_resv_regions);
+
+	if (vfio_iommu_resv_conflict(iommu, &group_resv_regions)) {
+		ret = -EINVAL;
+		goto out_detach;
+	}
+
 	/* Get a copy of the current iova list and work on it */
 	ret = vfio_iommu_iova_get_copy(iommu, &iova_copy);
 	if (ret)
@@ -1563,6 +1647,10 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	if (ret)
 		goto out_detach;
 
+	ret = vfio_iommu_resv_exclude(&iova_copy, &group_resv_regions);
+	if (ret)
+		goto out_detach;
+
 	resv_msi = vfio_iommu_has_sw_msi(iommu_group, &resv_msi_base);
 
 	INIT_LIST_HEAD(&domain->group_list);
@@ -1623,6 +1711,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	/* Delete the old one and insert new iova list */
 	vfio_iommu_iova_insert_copy(iommu, &iova_copy);
 	mutex_unlock(&iommu->lock);
+	vfio_iommu_resv_free(&group_resv_regions);
 
 	return 0;
 
@@ -1631,6 +1720,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 out_domain:
 	iommu_domain_free(domain->domain);
 	vfio_iommu_iova_free(&iova_copy);
+	vfio_iommu_resv_free(&group_resv_regions);
 out_free:
 	kfree(domain);
 	kfree(group);
-- 
2.7.4


* [PATCH v6 3/7] vfio/type1: Update iova list on detach
@ 2018-04-18 11:40   ` Shameer Kolothum
  0 siblings, 0 replies; 42+ messages in thread
From: Shameer Kolothum @ 2018-04-18 11:40 UTC (permalink / raw)
  To: alex.williamson, eric.auger, pmorel
  Cc: kvm, linux-kernel, iommu, linuxarm, john.garry, xuwei5, Shameer Kolothum

Get a copy of the iova list on group detach and try to update the copy.
On success, replace the current list with the copy. Leave the list as
it is if the update fails.

Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
---
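For illustration, with assumed example values: if domain D1 has
aperture [0x0, 0xffffffffffff] and domain D2 has aperture
[0x1000, 0xffffffff], the valid iova list is bounded by the
intersection [0x1000, 0xffffffff] while both are attached. When D2's
last group detaches and D2 is freed, vfio_iommu_aper_expand()
recomputes the window from the remaining domains, widening the copy
back to [0x0, 0xffffffffffff], and vfio_iommu_resv_refresh() then
re-excludes the reserved regions of the groups still attached before
the copy replaces the current list.
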
 drivers/vfio/vfio_iommu_type1.c | 91 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 91 insertions(+)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index a0a79e1..6fd6841 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -1776,12 +1776,88 @@ static void vfio_sanity_check_pfn_list(struct vfio_iommu *iommu)
 	WARN_ON(iommu->notifier.head);
 }
 
+/*
+ * Called when a domain is removed in detach. It is possible that
+ * the removed domain determined the iova aperture window. Update the
+ * iova aperture to the smallest window among the remaining domains.
+ */
+static void vfio_iommu_aper_expand(struct vfio_iommu *iommu,
+				   struct list_head *iova_copy)
+{
+	struct vfio_domain *domain;
+	struct iommu_domain_geometry geo;
+	struct vfio_iova *node;
+	dma_addr_t start = 0;
+	dma_addr_t end = (dma_addr_t)~0;
+
+	list_for_each_entry(domain, &iommu->domain_list, next) {
+		iommu_domain_get_attr(domain->domain, DOMAIN_ATTR_GEOMETRY,
+				      &geo);
+		if (geo.aperture_start > start)
+			start = geo.aperture_start;
+		if (geo.aperture_end < end)
+			end = geo.aperture_end;
+	}
+
+	/* Modify aperture limits. The new aperture is the same or bigger */
+	node = list_first_entry(iova_copy, struct vfio_iova, list);
+	node->start = start;
+	node = list_last_entry(iova_copy, struct vfio_iova, list);
+	node->end = end;
+}
+
+/*
+ * Called when a group is detached. The reserved regions for that
+ * group can now become part of the valid iova ranges. But since
+ * reserved regions may be duplicated among groups, populate the
+ * valid iova regions list again.
+ */
+static int vfio_iommu_resv_refresh(struct vfio_iommu *iommu,
+				   struct list_head *iova_copy)
+{
+	struct vfio_domain *d;
+	struct vfio_group *g;
+	struct vfio_iova *node;
+	dma_addr_t start, end;
+	LIST_HEAD(resv_regions);
+	int ret;
+
+	list_for_each_entry(d, &iommu->domain_list, next) {
+		list_for_each_entry(g, &d->group_list, next)
+			iommu_get_group_resv_regions(g->iommu_group,
+							 &resv_regions);
+	}
+
+	if (list_empty(&resv_regions))
+		return 0;
+
+	node = list_first_entry(iova_copy, struct vfio_iova, list);
+	start = node->start;
+	node = list_last_entry(iova_copy, struct vfio_iova, list);
+	end = node->end;
+
+	/* Purge the iova list and create a new one */
+	vfio_iommu_iova_free(iova_copy);
+
+	ret = vfio_iommu_aper_resize(iova_copy, start, end);
+	if (ret)
+		goto done;
+
+	/* Exclude current reserved regions from iova ranges */
+	ret = vfio_iommu_resv_exclude(iova_copy, &resv_regions);
+done:
+	vfio_iommu_resv_free(&resv_regions);
+	return ret;
+}
+
 static void vfio_iommu_type1_detach_group(void *iommu_data,
 					  struct iommu_group *iommu_group)
 {
 	struct vfio_iommu *iommu = iommu_data;
 	struct vfio_domain *domain;
 	struct vfio_group *group;
+	bool iova_copy_fail;
+	LIST_HEAD(iova_copy);
 
 	mutex_lock(&iommu->lock);
 
@@ -1804,6 +1880,12 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 		}
 	}
 
+	/*
+	 * Get a copy of the iova list. On success, use the copy to
+	 * update the list and then replace the current one.
+	 */
+	iova_copy_fail = !!vfio_iommu_iova_get_copy(iommu, &iova_copy);
+
 	list_for_each_entry(domain, &iommu->domain_list, next) {
 		group = find_iommu_group(domain, iommu_group);
 		if (!group)
@@ -1829,10 +1911,19 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 			iommu_domain_free(domain->domain);
 			list_del(&domain->next);
 			kfree(domain);
+			if (!iova_copy_fail && !list_empty(&iommu->domain_list))
+				vfio_iommu_aper_expand(iommu, &iova_copy);
 		}
 		break;
 	}
 
+	if (!iova_copy_fail && !list_empty(&iommu->domain_list)) {
+		if (!vfio_iommu_resv_refresh(iommu, &iova_copy))
+			vfio_iommu_iova_insert_copy(iommu, &iova_copy);
+		else
+			vfio_iommu_iova_free(&iova_copy);
+	}
+
 detach_group_done:
 	mutex_unlock(&iommu->lock);
 }
-- 
2.7.4


* [PATCH v6 4/7] iommu/dma: Move PCI window region reservation back into dma specific path.
@ 2018-04-18 11:40   ` Shameer Kolothum
  0 siblings, 0 replies; 42+ messages in thread
From: Shameer Kolothum @ 2018-04-18 11:40 UTC (permalink / raw)
  To: alex.williamson, eric.auger, pmorel
  Cc: kvm, linux-kernel, iommu, linuxarm, john.garry, xuwei5,
	Shameer Kolothum, Joerg Roedel

This pretty much reverts commit 273df9635385 ("iommu/dma: Make PCI
window reservation generic") by moving the PCI window region
reservation back into the dma specific path so that these regions
don't get exposed via the IOMMU API interface. With this change,
the vfio interface will report only iommu specific reserved regions
to user space.
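
To make the effect concrete, a hypothetical in-kernel caller of the
group interface would now see only the iommu specific regions (e.g.
MSI), since the PCI windows are reserved internally on the dma path
(a sketch only, using the existing API):

	struct iommu_resv_region *region;
	LIST_HEAD(resv_regions);

	iommu_get_group_resv_regions(group, &resv_regions);
	/* with this patch, no host bridge windows show up here */
	list_for_each_entry(region, &resv_regions, list)
		pr_info("resv region: %pa length %zu type %d\n",
			&region->start, region->length, region->type);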

Cc: Joerg Roedel <joro@8bytes.org>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/iommu/dma-iommu.c | 54 ++++++++++++++++++++++-------------------------
 1 file changed, 25 insertions(+), 29 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index f05f3cf..ddcbbdb 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -167,40 +167,16 @@ EXPORT_SYMBOL(iommu_put_dma_cookie);
  * @list: Reserved region list from iommu_get_resv_regions()
  *
  * IOMMU drivers can use this to implement their .get_resv_regions callback
- * for general non-IOMMU-specific reservations. Currently, this covers host
- * bridge windows for PCI devices and GICv3 ITS region reservation on ACPI
- * based ARM platforms that may require HW MSI reservation.
+ * for general non-IOMMU-specific reservations. Currently, this covers GICv3
+ * ITS region reservation on ACPI based ARM platforms that may require HW MSI
+ * reservation.
  */
 void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list)
 {
-	struct pci_host_bridge *bridge;
-	struct resource_entry *window;
-
-	if (!is_of_node(dev->iommu_fwspec->iommu_fwnode) &&
-		iort_iommu_msi_get_resv_regions(dev, list) < 0)
-		return;
-
-	if (!dev_is_pci(dev))
-		return;
-
-	bridge = pci_find_host_bridge(to_pci_dev(dev)->bus);
-	resource_list_for_each_entry(window, &bridge->windows) {
-		struct iommu_resv_region *region;
-		phys_addr_t start;
-		size_t length;
-
-		if (resource_type(window->res) != IORESOURCE_MEM)
-			continue;
 
-		start = window->res->start - window->offset;
-		length = window->res->end - window->res->start + 1;
-		region = iommu_alloc_resv_region(start, length, 0,
-				IOMMU_RESV_RESERVED);
-		if (!region)
-			return;
+	if (!is_of_node(dev->iommu_fwspec->iommu_fwnode))
+		iort_iommu_msi_get_resv_regions(dev, list);
 
-		list_add_tail(&region->list, list);
-	}
 }
 EXPORT_SYMBOL(iommu_dma_get_resv_regions);
 
@@ -229,6 +205,23 @@ static int cookie_init_hw_msi_region(struct iommu_dma_cookie *cookie,
 	return 0;
 }
 
+static void iova_reserve_pci_windows(struct pci_dev *dev,
+		struct iova_domain *iovad)
+{
+	struct pci_host_bridge *bridge = pci_find_host_bridge(dev->bus);
+	struct resource_entry *window;
+	unsigned long lo, hi;
+
+	resource_list_for_each_entry(window, &bridge->windows) {
+		if (resource_type(window->res) != IORESOURCE_MEM)
+			continue;
+
+		lo = iova_pfn(iovad, window->res->start - window->offset);
+		hi = iova_pfn(iovad, window->res->end - window->offset);
+		reserve_iova(iovad, lo, hi);
+	}
+}
+
 static int iova_reserve_iommu_regions(struct device *dev,
 		struct iommu_domain *domain)
 {
@@ -238,6 +231,9 @@ static int iova_reserve_iommu_regions(struct device *dev,
 	LIST_HEAD(resv_regions);
 	int ret = 0;
 
+	if (dev_is_pci(dev))
+		iova_reserve_pci_windows(to_pci_dev(dev), iovad);
+
 	iommu_get_resv_regions(dev, &resv_regions);
 	list_for_each_entry(region, &resv_regions, list) {
 		unsigned long lo, hi;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v6 5/7] vfio/type1: check dma map request is within a valid iova range
  2018-04-18 11:40 ` Shameer Kolothum
@ 2018-04-18 11:40   ` Shameer Kolothum
  -1 siblings, 0 replies; 42+ messages in thread
From: Shameer Kolothum @ 2018-04-18 11:40 UTC (permalink / raw)
  To: alex.williamson, eric.auger, pmorel
  Cc: kvm, linux-kernel, iommu, linuxarm, john.garry, xuwei5, Shameer Kolothum

This checks and rejects any dma map request that falls outside the
valid iova ranges.
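
For reference, a minimal user-space sketch of a request this check
would reject (error handling trimmed; the container fd, buffer and
addresses are hypothetical):

	struct vfio_iommu_type1_dma_map map = {
		.argsz = sizeof(map),
		.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE,
		.vaddr = (__u64)(uintptr_t)buf, /* page-aligned buffer */
		.iova  = 0x8000000,  /* assume this hits a reserved region */
		.size  = 0x100000,
	};

	/* fails with -EINVAL unless [iova, iova + size - 1] lies
	 * entirely within one valid iova range
	 */
	if (ioctl(container, VFIO_IOMMU_MAP_DMA, &map))
		perror("VFIO_IOMMU_MAP_DMA");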

Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
---
 drivers/vfio/vfio_iommu_type1.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 6fd6841..bf33281 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -1091,6 +1091,23 @@ static int vfio_pin_map_dma(struct vfio_iommu *iommu, struct vfio_dma *dma,
 	return ret;
 }
 
+/*
+ * Check dma map request is within a valid iova range
+ */
+static bool vfio_iommu_iova_dma_valid(struct vfio_iommu *iommu,
+				dma_addr_t start, dma_addr_t end)
+{
+	struct list_head *iova = &iommu->iova_list;
+	struct vfio_iova *node;
+
+	list_for_each_entry(node, iova, list) {
+		if ((start >= node->start) && (end <= node->end))
+			return true;
+	}
+
+	return false;
+}
+
 static int vfio_dma_do_map(struct vfio_iommu *iommu,
 			   struct vfio_iommu_type1_dma_map *map)
 {
@@ -1129,6 +1146,11 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
 		goto out_unlock;
 	}
 
+	if (!vfio_iommu_iova_dma_valid(iommu, iova, iova + size - 1)) {
+		ret = -EINVAL;
+		goto out_unlock;
+	}
+
 	dma = kzalloc(sizeof(*dma), GFP_KERNEL);
 	if (!dma) {
 		ret = -ENOMEM;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v6 6/7] vfio/type1: Add IOVA range capability support
@ 2018-04-18 11:40   ` Shameer Kolothum
  0 siblings, 0 replies; 42+ messages in thread
From: Shameer Kolothum @ 2018-04-18 11:40 UTC (permalink / raw)
  To: alex.williamson, eric.auger, pmorel
  Cc: kvm, linux-kernel, iommu, linuxarm, john.garry, xuwei5, Shameer Kolothum

This allows user space to retrieve the supported IOVA
range(s), excluding any reserved regions. The implementation
is based on capability chains added to the VFIO_IOMMU_GET_INFO ioctl.
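
For reference, a minimal user-space sketch (error handling trimmed) of
how the new capability might be consumed, assuming the usual VFIO
convention that cap_offset and each header's next field are byte
offsets from the start of the info structure:

	struct vfio_iommu_type1_info *info;
	__u32 argsz = sizeof(*info);

	info = calloc(1, argsz);
	info->argsz = argsz;
	/* first call: the kernel bumps argsz if the caps don't fit */
	ioctl(container, VFIO_IOMMU_GET_INFO, info);

	if (info->argsz > argsz) {
		argsz = info->argsz;
		info = realloc(info, argsz);
		info->argsz = argsz;
		ioctl(container, VFIO_IOMMU_GET_INFO, info);
	}

	if ((info->flags & VFIO_IOMMU_INFO_CAPS) && info->cap_offset) {
		struct vfio_info_cap_header *hdr =
			(void *)((char *)info + info->cap_offset);

		for (;;) {
			if (hdr->id == VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE) {
				struct vfio_iommu_type1_info_cap_iova_range
					*cap = (void *)hdr;
				__u32 i;

				for (i = 0; i < cap->nr_iovas; i++)
					printf("iova range %u: 0x%llx - 0x%llx\n", i,
					       (unsigned long long)cap->iova_ranges[i].start,
					       (unsigned long long)cap->iova_ranges[i].end);
			}
			if (!hdr->next)
				break;
			hdr = (void *)((char *)info + hdr->next);
		}
	}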

Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
---
 drivers/vfio/vfio_iommu_type1.c | 96 +++++++++++++++++++++++++++++++++++++++++
 include/uapi/linux/vfio.h       | 23 ++++++++++
 2 files changed, 119 insertions(+)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index bf33281..44d0f13d 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -2038,6 +2038,68 @@ static int vfio_domains_have_iommu_cache(struct vfio_iommu *iommu)
 	return ret;
 }
 
+static int vfio_iommu_iova_add_cap(struct vfio_info_cap *caps,
+		 struct vfio_iommu_type1_info_cap_iova_range *cap_iovas,
+		 size_t size)
+{
+	struct vfio_info_cap_header *header;
+	struct vfio_iommu_type1_info_cap_iova_range *iova_cap;
+
+	header = vfio_info_cap_add(caps, size,
+				VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE, 1);
+	if (IS_ERR(header))
+		return PTR_ERR(header);
+
+	iova_cap = container_of(header,
+			struct vfio_iommu_type1_info_cap_iova_range, header);
+	iova_cap->nr_iovas = cap_iovas->nr_iovas;
+	memcpy(iova_cap->iova_ranges, cap_iovas->iova_ranges,
+		cap_iovas->nr_iovas * sizeof(*cap_iovas->iova_ranges));
+	return 0;
+}
+
+static int vfio_iommu_iova_build_caps(struct vfio_iommu *iommu,
+				struct vfio_info_cap *caps)
+{
+	struct vfio_iommu_type1_info_cap_iova_range *cap_iovas;
+	struct vfio_iova *iova;
+	size_t size;
+	int iovas = 0, i = 0, ret;
+
+	mutex_lock(&iommu->lock);
+
+	list_for_each_entry(iova, &iommu->iova_list, list)
+		iovas++;
+
+	if (!iovas) {
+		ret = -EINVAL;
+		goto out_unlock;
+	}
+
+	size = sizeof(*cap_iovas) + (iovas * sizeof(*cap_iovas->iova_ranges));
+
+	cap_iovas = kzalloc(size, GFP_KERNEL);
+	if (!cap_iovas) {
+		ret = -ENOMEM;
+		goto out_unlock;
+	}
+
+	cap_iovas->nr_iovas = iovas;
+
+	list_for_each_entry(iova, &iommu->iova_list, list) {
+		cap_iovas->iova_ranges[i].start = iova->start;
+		cap_iovas->iova_ranges[i].end = iova->end;
+		i++;
+	}
+
+	ret = vfio_iommu_iova_add_cap(caps, cap_iovas, size);
+
+	kfree(cap_iovas);
+out_unlock:
+	mutex_unlock(&iommu->lock);
+	return ret;
+}
+
 static long vfio_iommu_type1_ioctl(void *iommu_data,
 				   unsigned int cmd, unsigned long arg)
 {
@@ -2059,19 +2121,53 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
 		}
 	} else if (cmd == VFIO_IOMMU_GET_INFO) {
 		struct vfio_iommu_type1_info info;
+		struct vfio_info_cap caps = { .buf = NULL, .size = 0 };
+		unsigned long capsz;
+		int ret;
 
 		minsz = offsetofend(struct vfio_iommu_type1_info, iova_pgsizes);
 
+		/* For backward compatibility, cannot require this */
+		capsz = offsetofend(struct vfio_iommu_type1_info, cap_offset);
+
 		if (copy_from_user(&info, (void __user *)arg, minsz))
 			return -EFAULT;
 
 		if (info.argsz < minsz)
 			return -EINVAL;
 
+		if (info.argsz >= capsz) {
+			minsz = capsz;
+			info.cap_offset = 0; /* output, no-recopy necessary */
+		}
+
 		info.flags = VFIO_IOMMU_INFO_PGSIZES;
 
 		info.iova_pgsizes = vfio_pgsize_bitmap(iommu);
 
+		ret = vfio_iommu_iova_build_caps(iommu, &caps);
+		if (ret)
+			return ret;
+
+		if (caps.size) {
+			info.flags |= VFIO_IOMMU_INFO_CAPS;
+
+			if (info.argsz < sizeof(info) + caps.size) {
+				info.argsz = sizeof(info) + caps.size;
+			} else {
+				vfio_info_cap_shift(&caps, sizeof(info));
+				if (copy_to_user((void __user *)arg +
+						sizeof(info), caps.buf,
+						caps.size)) {
+					kfree(caps.buf);
+					return -EFAULT;
+				}
+				info.cap_offset = sizeof(info);
+			}
+
+			kfree(caps.buf);
+		}
+
 		return copy_to_user((void __user *)arg, &info, minsz) ?
 			-EFAULT : 0;
 
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 1aa7b82..f1bf6b7 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -616,7 +616,30 @@ struct vfio_iommu_type1_info {
 	__u32	argsz;
 	__u32	flags;
 #define VFIO_IOMMU_INFO_PGSIZES (1 << 0)	/* supported page sizes info */
+#define VFIO_IOMMU_INFO_CAPS	(1 << 1)	/* Info supports caps */
 	__u64	iova_pgsizes;		/* Bitmap of supported page sizes */
+	__u32   cap_offset;	/* Offset within info struct of first cap */
+};
+
+/*
+ * The IOVA capability allows reporting the valid IOVA range(s),
+ * excluding any reserved regions associated with the device group.
+ * Any dma map attempt outside the valid iova ranges will return an error.
+ *
+ * The structures below define version 1 of this capability.
+ */
+#define VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE  1
+
+struct vfio_iova_range {
+	__u64	start;
+	__u64	end;
+};
+
+struct vfio_iommu_type1_info_cap_iova_range {
+	struct vfio_info_cap_header header;
+	__u32	nr_iovas;
+	__u32	reserved;
+	struct vfio_iova_range iova_ranges[];
 };
 
 #define VFIO_IOMMU_GET_INFO _IO(VFIO_TYPE, VFIO_BASE + 12)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v6 7/7] vfio/type1: remove duplicate retrieval of reserved regions
@ 2018-04-18 11:40   ` Shameer Kolothum
  0 siblings, 0 replies; 42+ messages in thread
From: Shameer Kolothum @ 2018-04-18 11:40 UTC (permalink / raw)
  To: alex.williamson, eric.auger, pmorel
  Cc: kvm, linux-kernel, iommu, linuxarm, john.garry, xuwei5, Shameer Kolothum

As we now already have the reserved regions list, just pass that into
the vfio_iommu_has_sw_msi() function.

Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
---
 drivers/vfio/vfio_iommu_type1.c | 15 ++++++---------
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 44d0f13d..13c631a 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -1313,15 +1313,13 @@ static struct vfio_group *find_iommu_group(struct vfio_domain *domain,
 	return NULL;
 }
 
-static bool vfio_iommu_has_sw_msi(struct iommu_group *group, phys_addr_t *base)
+static bool vfio_iommu_has_sw_msi(struct list_head *group_resv_regions,
+						phys_addr_t *base)
 {
-	struct list_head group_resv_regions;
-	struct iommu_resv_region *region, *next;
+	struct iommu_resv_region *region;
 	bool ret = false;
 
-	INIT_LIST_HEAD(&group_resv_regions);
-	iommu_get_group_resv_regions(group, &group_resv_regions);
-	list_for_each_entry(region, &group_resv_regions, list) {
+	list_for_each_entry(region, group_resv_regions, list) {
 		/*
 		 * The presence of any 'real' MSI regions should take
 		 * precedence over the software-managed one if the
@@ -1337,8 +1335,7 @@ static bool vfio_iommu_has_sw_msi(struct iommu_group *group, phys_addr_t *base)
 			ret = true;
 		}
 	}
-	list_for_each_entry_safe(region, next, &group_resv_regions, list)
-		kfree(region);
+
 	return ret;
 }
 
@@ -1673,7 +1670,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	if (ret)
 		goto out_detach;
 
-	resv_msi = vfio_iommu_has_sw_msi(iommu_group, &resv_msi_base);
+	resv_msi = vfio_iommu_has_sw_msi(&group_resv_regions, &resv_msi_base);
 
 	INIT_LIST_HEAD(&domain->group_list);
 	list_add(&group->next, &domain->group_list);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* RE: [PATCH v6 4/7] iommu/dma: Move PCI window region reservation back into dma specific path.
@ 2018-04-24 13:16     ` Shameerali Kolothum Thodi
  0 siblings, 0 replies; 42+ messages in thread
From: Shameerali Kolothum Thodi @ 2018-04-24 13:16 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: kvm, linux-kernel, iommu, Linuxarm, John Garry, xuwei (O),
	Robin Murphy, alex.williamson, eric.auger, pmorel

Hi Joerg,

Could you please take a look at this patch and let me know your thoughts?

I have rebased this to 4.17-rc1 and added Robin's R-by.

This series[1] is now pending on this patch, as without it the series will
break a few ARM platforms[2].

Thanks,
Shameer

[1] https://lkml.org/lkml/2018/4/18/293
[2] https://lkml.org/lkml/2018/3/14/881


> -----Original Message-----
> From: Shameerali Kolothum Thodi
> Sent: Wednesday, April 18, 2018 12:41 PM
> To: alex.williamson@redhat.com; eric.auger@redhat.com;
> pmorel@linux.vnet.ibm.com
> Cc: kvm@vger.kernel.org; linux-kernel@vger.kernel.org; iommu@lists.linux-
> foundation.org; Linuxarm <linuxarm@huawei.com>; John Garry
> <john.garry@huawei.com>; xuwei (O) <xuwei5@huawei.com>; Shameerali
> Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>; Joerg Roedel
> <joro@8bytes.org>
> Subject: [PATCH v6 4/7] iommu/dma: Move PCI window region reservation
> back into dma specific path.
> 
> This pretty much reverts commit 273df9635385 ("iommu/dma: Make PCI
> window reservation generic")  by moving the PCI window region
> reservation back into the dma specific path so that these regions
> doesn't get exposed via the IOMMU API interface. With this change,
> the vfio interface will report only iommu specific reserved regions
> to the user space.
> 
> Cc: Joerg Roedel <joro@8bytes.org>
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> Reviewed-by: Robin Murphy <robin.murphy@arm.com>
> ---
>  drivers/iommu/dma-iommu.c | 54 ++++++++++++++++++++++----------------------
> ---
>  1 file changed, 25 insertions(+), 29 deletions(-)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index f05f3cf..ddcbbdb 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -167,40 +167,16 @@ EXPORT_SYMBOL(iommu_put_dma_cookie);
>   * @list: Reserved region list from iommu_get_resv_regions()
>   *
>   * IOMMU drivers can use this to implement their .get_resv_regions callback
> - * for general non-IOMMU-specific reservations. Currently, this covers host
> - * bridge windows for PCI devices and GICv3 ITS region reservation on ACPI
> - * based ARM platforms that may require HW MSI reservation.
> + * for general non-IOMMU-specific reservations. Currently, this covers GICv3
> + * ITS region reservation on ACPI based ARM platforms that may require HW
> MSI
> + * reservation.
>   */
>  void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list)
>  {
> -	struct pci_host_bridge *bridge;
> -	struct resource_entry *window;
> -
> -	if (!is_of_node(dev->iommu_fwspec->iommu_fwnode) &&
> -		iort_iommu_msi_get_resv_regions(dev, list) < 0)
> -		return;
> -
> -	if (!dev_is_pci(dev))
> -		return;
> -
> -	bridge = pci_find_host_bridge(to_pci_dev(dev)->bus);
> -	resource_list_for_each_entry(window, &bridge->windows) {
> -		struct iommu_resv_region *region;
> -		phys_addr_t start;
> -		size_t length;
> -
> -		if (resource_type(window->res) != IORESOURCE_MEM)
> -			continue;
> 
> -		start = window->res->start - window->offset;
> -		length = window->res->end - window->res->start + 1;
> -		region = iommu_alloc_resv_region(start, length, 0,
> -				IOMMU_RESV_RESERVED);
> -		if (!region)
> -			return;
> +	if (!is_of_node(dev->iommu_fwspec->iommu_fwnode))
> +		iort_iommu_msi_get_resv_regions(dev, list);
> 
> -		list_add_tail(&region->list, list);
> -	}
>  }
>  EXPORT_SYMBOL(iommu_dma_get_resv_regions);
> 
> @@ -229,6 +205,23 @@ static int cookie_init_hw_msi_region(struct
> iommu_dma_cookie *cookie,
>  	return 0;
>  }
> 
> +static void iova_reserve_pci_windows(struct pci_dev *dev,
> +		struct iova_domain *iovad)
> +{
> +	struct pci_host_bridge *bridge = pci_find_host_bridge(dev->bus);
> +	struct resource_entry *window;
> +	unsigned long lo, hi;
> +
> +	resource_list_for_each_entry(window, &bridge->windows) {
> +		if (resource_type(window->res) != IORESOURCE_MEM)
> +			continue;
> +
> +		lo = iova_pfn(iovad, window->res->start - window->offset);
> +		hi = iova_pfn(iovad, window->res->end - window->offset);
> +		reserve_iova(iovad, lo, hi);
> +	}
> +}
> +
>  static int iova_reserve_iommu_regions(struct device *dev,
>  		struct iommu_domain *domain)
>  {
> @@ -238,6 +231,9 @@ static int iova_reserve_iommu_regions(struct device
> *dev,
>  	LIST_HEAD(resv_regions);
>  	int ret = 0;
> 
> +	if (dev_is_pci(dev))
> +		iova_reserve_pci_windows(to_pci_dev(dev), iovad);
> +
>  	iommu_get_resv_regions(dev, &resv_regions);
>  	list_for_each_entry(region, &resv_regions, list) {
>  		unsigned long lo, hi;
> --
> 2.7.4
> 

^ permalink raw reply	[flat|nested] 42+ messages in thread

* RE: [PATCH v6 4/7] iommu/dma: Move PCI window region reservation back into dma specific path.
@ 2018-04-24 13:16     ` Shameerali Kolothum Thodi
  0 siblings, 0 replies; 42+ messages in thread
From: Shameerali Kolothum Thodi @ 2018-04-24 13:16 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: kvm-u79uwXL29TY76Z2rM5mHXA,
	pmorel-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Linuxarm,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, xuwei (O)

Hi Joerg,

Could you please take a look at this patch and let me know.

I have rebased this to 4.17-rc1  and added Robin's R-by.

This series[1] is now pending on this patch as without this it will break few
ARM platforms[2]. 

Please take a look and let me know.

Thanks,
Shameer

[1] https://lkml.org/lkml/2018/4/18/293
[2] https://lkml.org/lkml/2018/3/14/881


> -----Original Message-----
> From: Shameerali Kolothum Thodi
> Sent: Wednesday, April 18, 2018 12:41 PM
> To: alex.williamson-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org; eric.auger-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org;
> pmorel-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org
> Cc: kvm-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; iommu-cunTk1MwBs/ROKNJybVBZg@public.gmane.org
> foundation.org; Linuxarm <linuxarm-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>; John Garry
> <john.garry-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>; xuwei (O) <xuwei5-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>; Shameerali
> Kolothum Thodi <shameerali.kolothum.thodi-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>; Joerg Roedel
> <joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
> Subject: [PATCH v6 4/7] iommu/dma: Move PCI window region reservation
> back into dma specific path.
> 
> This pretty much reverts commit 273df9635385 ("iommu/dma: Make PCI
> window reservation generic")  by moving the PCI window region
> reservation back into the dma specific path so that these regions
> doesn't get exposed via the IOMMU API interface. With this change,
> the vfio interface will report only iommu specific reserved regions
> to the user space.
> 
> Cc: Joerg Roedel <joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
> Reviewed-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
> ---
>  drivers/iommu/dma-iommu.c | 54 ++++++++++++++++++++++----------------------
> ---
>  1 file changed, 25 insertions(+), 29 deletions(-)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index f05f3cf..ddcbbdb 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -167,40 +167,16 @@ EXPORT_SYMBOL(iommu_put_dma_cookie);
>   * @list: Reserved region list from iommu_get_resv_regions()
>   *
>   * IOMMU drivers can use this to implement their .get_resv_regions callback
> - * for general non-IOMMU-specific reservations. Currently, this covers host
> - * bridge windows for PCI devices and GICv3 ITS region reservation on ACPI
> - * based ARM platforms that may require HW MSI reservation.
> + * for general non-IOMMU-specific reservations. Currently, this covers GICv3
> + * ITS region reservation on ACPI based ARM platforms that may require HW
> MSI
> + * reservation.
>   */
>  void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list)
>  {
> -	struct pci_host_bridge *bridge;
> -	struct resource_entry *window;
> -
> -	if (!is_of_node(dev->iommu_fwspec->iommu_fwnode) &&
> -		iort_iommu_msi_get_resv_regions(dev, list) < 0)
> -		return;
> -
> -	if (!dev_is_pci(dev))
> -		return;
> -
> -	bridge = pci_find_host_bridge(to_pci_dev(dev)->bus);
> -	resource_list_for_each_entry(window, &bridge->windows) {
> -		struct iommu_resv_region *region;
> -		phys_addr_t start;
> -		size_t length;
> -
> -		if (resource_type(window->res) != IORESOURCE_MEM)
> -			continue;
> 
> -		start = window->res->start - window->offset;
> -		length = window->res->end - window->res->start + 1;
> -		region = iommu_alloc_resv_region(start, length, 0,
> -				IOMMU_RESV_RESERVED);
> -		if (!region)
> -			return;
> +	if (!is_of_node(dev->iommu_fwspec->iommu_fwnode))
> +		iort_iommu_msi_get_resv_regions(dev, list);
> 
> -		list_add_tail(&region->list, list);
> -	}
>  }
>  EXPORT_SYMBOL(iommu_dma_get_resv_regions);
> 
> @@ -229,6 +205,23 @@ static int cookie_init_hw_msi_region(struct
> iommu_dma_cookie *cookie,
>  	return 0;
>  }
> 
> +static void iova_reserve_pci_windows(struct pci_dev *dev,
> +		struct iova_domain *iovad)
> +{
> +	struct pci_host_bridge *bridge = pci_find_host_bridge(dev->bus);
> +	struct resource_entry *window;
> +	unsigned long lo, hi;
> +
> +	resource_list_for_each_entry(window, &bridge->windows) {
> +		if (resource_type(window->res) != IORESOURCE_MEM)
> +			continue;
> +
> +		lo = iova_pfn(iovad, window->res->start - window->offset);
> +		hi = iova_pfn(iovad, window->res->end - window->offset);
> +		reserve_iova(iovad, lo, hi);
> +	}
> +}
> +
>  static int iova_reserve_iommu_regions(struct device *dev,
>  		struct iommu_domain *domain)
>  {
> @@ -238,6 +231,9 @@ static int iova_reserve_iommu_regions(struct device
> *dev,
>  	LIST_HEAD(resv_regions);
>  	int ret = 0;
> 
> +	if (dev_is_pci(dev))
> +		iova_reserve_pci_windows(to_pci_dev(dev), iovad);
> +
>  	iommu_get_resv_regions(dev, &resv_regions);
>  	list_for_each_entry(region, &resv_regions, list) {
>  		unsigned long lo, hi;
> --
> 2.7.4
> 

^ permalink raw reply	[flat|nested] 42+ messages in thread

* RE: [PATCH v6 4/7] iommu/dma: Move PCI window region reservation back into dma specific path.
@ 2018-04-24 13:16     ` Shameerali Kolothum Thodi
  0 siblings, 0 replies; 42+ messages in thread
From: Shameerali Kolothum Thodi @ 2018-04-24 13:16 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: kvm-u79uwXL29TY76Z2rM5mHXA,
	pmorel-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Linuxarm,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, xuwei (O)

Hi Joerg,

Could you please take a look at this patch and let me know.

I have rebased this to 4.17-rc1  and added Robin's R-by.

This series[1] is now pending on this patch as without this it will break few
ARM platforms[2]. 

Please take a look and let me know.

Thanks,
Shameer

[1] https://lkml.org/lkml/2018/4/18/293
[2] https://lkml.org/lkml/2018/3/14/881


> -----Original Message-----
> From: Shameerali Kolothum Thodi
> Sent: Wednesday, April 18, 2018 12:41 PM
> To: alex.williamson-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org; eric.auger-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org;
> pmorel-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org
> Cc: kvm-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; iommu-cunTk1MwBs/ROKNJybVBZg@public.gmane.org
> foundation.org; Linuxarm <linuxarm-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>; John Garry
> <john.garry-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>; xuwei (O) <xuwei5-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>; Shameerali
> Kolothum Thodi <shameerali.kolothum.thodi-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>; Joerg Roedel
> <joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
> Subject: [PATCH v6 4/7] iommu/dma: Move PCI window region reservation
> back into dma specific path.
> 
> This pretty much reverts commit 273df9635385 ("iommu/dma: Make PCI
> window reservation generic")  by moving the PCI window region
> reservation back into the dma specific path so that these regions
> doesn't get exposed via the IOMMU API interface. With this change,
> the vfio interface will report only iommu specific reserved regions
> to the user space.
> 
> Cc: Joerg Roedel <joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
> Reviewed-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
> ---
>  drivers/iommu/dma-iommu.c | 54 ++++++++++++++++++++++----------------------
> ---
>  1 file changed, 25 insertions(+), 29 deletions(-)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index f05f3cf..ddcbbdb 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -167,40 +167,16 @@ EXPORT_SYMBOL(iommu_put_dma_cookie);
>   * @list: Reserved region list from iommu_get_resv_regions()
>   *
>   * IOMMU drivers can use this to implement their .get_resv_regions callback
> - * for general non-IOMMU-specific reservations. Currently, this covers host
> - * bridge windows for PCI devices and GICv3 ITS region reservation on ACPI
> - * based ARM platforms that may require HW MSI reservation.
> + * for general non-IOMMU-specific reservations. Currently, this covers GICv3
> + * ITS region reservation on ACPI based ARM platforms that may require HW
> MSI
> + * reservation.
>   */
>  void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list)
>  {
> -	struct pci_host_bridge *bridge;
> -	struct resource_entry *window;
> -
> -	if (!is_of_node(dev->iommu_fwspec->iommu_fwnode) &&
> -		iort_iommu_msi_get_resv_regions(dev, list) < 0)
> -		return;
> -
> -	if (!dev_is_pci(dev))
> -		return;
> -
> -	bridge = pci_find_host_bridge(to_pci_dev(dev)->bus);
> -	resource_list_for_each_entry(window, &bridge->windows) {
> -		struct iommu_resv_region *region;
> -		phys_addr_t start;
> -		size_t length;
> -
> -		if (resource_type(window->res) != IORESOURCE_MEM)
> -			continue;
> 
> -		start = window->res->start - window->offset;
> -		length = window->res->end - window->res->start + 1;
> -		region = iommu_alloc_resv_region(start, length, 0,
> -				IOMMU_RESV_RESERVED);
> -		if (!region)
> -			return;
> +	if (!is_of_node(dev->iommu_fwspec->iommu_fwnode))
> +		iort_iommu_msi_get_resv_regions(dev, list);
> 
> -		list_add_tail(&region->list, list);
> -	}
>  }
>  EXPORT_SYMBOL(iommu_dma_get_resv_regions);
> 
> @@ -229,6 +205,23 @@ static int cookie_init_hw_msi_region(struct iommu_dma_cookie *cookie,
>  	return 0;
>  }
> 
> +static void iova_reserve_pci_windows(struct pci_dev *dev,
> +		struct iova_domain *iovad)
> +{
> +	struct pci_host_bridge *bridge = pci_find_host_bridge(dev->bus);
> +	struct resource_entry *window;
> +	unsigned long lo, hi;
> +
> +	resource_list_for_each_entry(window, &bridge->windows) {
> +		if (resource_type(window->res) != IORESOURCE_MEM)
> +			continue;
> +
> +		lo = iova_pfn(iovad, window->res->start - window->offset);
> +		hi = iova_pfn(iovad, window->res->end - window->offset);
> +		reserve_iova(iovad, lo, hi);
> +	}
> +}
> +
>  static int iova_reserve_iommu_regions(struct device *dev,
>  		struct iommu_domain *domain)
>  {
> @@ -238,6 +231,9 @@ static int iova_reserve_iommu_regions(struct device *dev,
>  	LIST_HEAD(resv_regions);
>  	int ret = 0;
> 
> +	if (dev_is_pci(dev))
> +		iova_reserve_pci_windows(to_pci_dev(dev), iovad);
> +
>  	iommu_get_resv_regions(dev, &resv_regions);
>  	list_for_each_entry(region, &resv_regions, list) {
>  		unsigned long lo, hi;
> --
> 2.7.4
> 

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v6 4/7] iommu/dma: Move PCI window region reservation back into dma specific path.
@ 2018-04-30 16:10       ` Alex Williamson
  0 siblings, 0 replies; 42+ messages in thread
From: Alex Williamson @ 2018-04-30 16:10 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Shameerali Kolothum Thodi, kvm, linux-kernel, iommu, Linuxarm,
	John Garry, xuwei (O),
	Robin Murphy, eric.auger, pmorel

On Tue, 24 Apr 2018 13:16:19 +0000
Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> wrote:

> Hi Joerg,
> 
> Could you please take a look at this patch and let me know your thoughts?
> 
> I have rebased this to 4.17-rc1 and added Robin's R-by.
> 
> This series[1] is now pending on this patch, as without it a few ARM
> platforms[2] will break.

Hi Joerg,

Any thoughts on this patch?  Shameer is blocked on this series without
your ack.  We could consider moving this to a separate IOMMU patch, but
then I think we'd need to order our v4.18 pull requests, so there's a
bit of a preference to keep it within this series.  Thanks,

Alex

> [1] https://lkml.org/lkml/2018/4/18/293
> [2] https://lkml.org/lkml/2018/3/14/881
> 
> 
> > -----Original Message-----
> > From: Shameerali Kolothum Thodi
> > Sent: Wednesday, April 18, 2018 12:41 PM
> > To: alex.williamson@redhat.com; eric.auger@redhat.com;
> > pmorel@linux.vnet.ibm.com
> > Cc: kvm@vger.kernel.org; linux-kernel@vger.kernel.org; iommu@lists.linux-
> > foundation.org; Linuxarm <linuxarm@huawei.com>; John Garry
> > <john.garry@huawei.com>; xuwei (O) <xuwei5@huawei.com>; Shameerali
> > Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>; Joerg Roedel
> > <joro@8bytes.org>
> > Subject: [PATCH v6 4/7] iommu/dma: Move PCI window region reservation
> > back into dma specific path.
> > 
> > This pretty much reverts commit 273df9635385 ("iommu/dma: Make PCI
> > window reservation generic") by moving the PCI window region
> > reservation back into the dma specific path so that these regions
> > don't get exposed via the IOMMU API interface. With this change,
> > the vfio interface will report only iommu specific reserved regions
> > to the user space.
> > 
> > Cc: Joerg Roedel <joro@8bytes.org>
> > Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> > Reviewed-by: Robin Murphy <robin.murphy@arm.com>
> > ---
> >  drivers/iommu/dma-iommu.c | 54 ++++++++++++++++++++++-------------------------
> >  1 file changed, 25 insertions(+), 29 deletions(-)
> > 
> > diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> > index f05f3cf..ddcbbdb 100644
> > --- a/drivers/iommu/dma-iommu.c
> > +++ b/drivers/iommu/dma-iommu.c
> > @@ -167,40 +167,16 @@ EXPORT_SYMBOL(iommu_put_dma_cookie);
> >   * @list: Reserved region list from iommu_get_resv_regions()
> >   *
> >   * IOMMU drivers can use this to implement their .get_resv_regions callback
> > - * for general non-IOMMU-specific reservations. Currently, this covers host
> > - * bridge windows for PCI devices and GICv3 ITS region reservation on ACPI
> > - * based ARM platforms that may require HW MSI reservation.
> > + * for general non-IOMMU-specific reservations. Currently, this covers GICv3
> > + * ITS region reservation on ACPI based ARM platforms that may require HW MSI
> > + * reservation.
> >   */
> >  void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list)
> >  {
> > -	struct pci_host_bridge *bridge;
> > -	struct resource_entry *window;
> > -
> > -	if (!is_of_node(dev->iommu_fwspec->iommu_fwnode) &&
> > -		iort_iommu_msi_get_resv_regions(dev, list) < 0)
> > -		return;
> > -
> > -	if (!dev_is_pci(dev))
> > -		return;
> > -
> > -	bridge = pci_find_host_bridge(to_pci_dev(dev)->bus);
> > -	resource_list_for_each_entry(window, &bridge->windows) {
> > -		struct iommu_resv_region *region;
> > -		phys_addr_t start;
> > -		size_t length;
> > -
> > -		if (resource_type(window->res) != IORESOURCE_MEM)
> > -			continue;
> > 
> > -		start = window->res->start - window->offset;
> > -		length = window->res->end - window->res->start + 1;
> > -		region = iommu_alloc_resv_region(start, length, 0,
> > -				IOMMU_RESV_RESERVED);
> > -		if (!region)
> > -			return;
> > +	if (!is_of_node(dev->iommu_fwspec->iommu_fwnode))
> > +		iort_iommu_msi_get_resv_regions(dev, list);
> > 
> > -		list_add_tail(&region->list, list);
> > -	}
> >  }
> >  EXPORT_SYMBOL(iommu_dma_get_resv_regions);
> > 
> > @@ -229,6 +205,23 @@ static int cookie_init_hw_msi_region(struct iommu_dma_cookie *cookie,
> >  	return 0;
> >  }
> > 
> > +static void iova_reserve_pci_windows(struct pci_dev *dev,
> > +		struct iova_domain *iovad)
> > +{
> > +	struct pci_host_bridge *bridge = pci_find_host_bridge(dev->bus);
> > +	struct resource_entry *window;
> > +	unsigned long lo, hi;
> > +
> > +	resource_list_for_each_entry(window, &bridge->windows) {
> > +		if (resource_type(window->res) != IORESOURCE_MEM)
> > +			continue;
> > +
> > +		lo = iova_pfn(iovad, window->res->start - window->offset);
> > +		hi = iova_pfn(iovad, window->res->end - window->offset);
> > +		reserve_iova(iovad, lo, hi);
> > +	}
> > +}
> > +
> >  static int iova_reserve_iommu_regions(struct device *dev,
> >  		struct iommu_domain *domain)
> >  {
> > @@ -238,6 +231,9 @@ static int iova_reserve_iommu_regions(struct device *dev,
> >  	LIST_HEAD(resv_regions);
> >  	int ret = 0;
> > 
> > +	if (dev_is_pci(dev))
> > +		iova_reserve_pci_windows(to_pci_dev(dev), iovad);
> > +
> >  	iommu_get_resv_regions(dev, &resv_regions);
> >  	list_for_each_entry(region, &resv_regions, list) {
> >  		unsigned long lo, hi;
> > --
> > 2.7.4
> >   
> 

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v6 4/7] iommu/dma: Move PCI window region reservation back into dma specific path.
@ 2018-05-03 13:16     ` Joerg Roedel
  0 siblings, 0 replies; 42+ messages in thread
From: Joerg Roedel @ 2018-05-03 13:16 UTC (permalink / raw)
  To: Shameer Kolothum
  Cc: alex.williamson, eric.auger, pmorel, kvm, linux-kernel, iommu,
	linuxarm, john.garry, xuwei5

On Wed, Apr 18, 2018 at 12:40:42PM +0100, Shameer Kolothum wrote:
> This pretty much reverts commit 273df9635385 ("iommu/dma: Make PCI
> window reservation generic") by moving the PCI window region
> reservation back into the dma specific path so that these regions
> don't get exposed via the IOMMU API interface. With this change,
> the vfio interface will report only iommu specific reserved regions
> to the user space.
> 
> Cc: Joerg Roedel <joro@8bytes.org>
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> Reviewed-by: Robin Murphy <robin.murphy@arm.com>
> ---
>  drivers/iommu/dma-iommu.c | 54 ++++++++++++++++++++++-------------------------
>  1 file changed, 25 insertions(+), 29 deletions(-)

Applied to my iommu/fixes branch.

Thanks,

	Joerg

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v6 0/7] vfio/type1: Add support for valid iova list management
@ 2018-05-24 18:20   ` Alex Williamson
  0 siblings, 0 replies; 42+ messages in thread
From: Alex Williamson @ 2018-05-24 18:20 UTC (permalink / raw)
  To: Shameer Kolothum
  Cc: eric.auger, pmorel, kvm, linux-kernel, iommu, linuxarm,
	john.garry, xuwei5, Joerg Roedel

[Cc +Joerg: AMD-Vi observation towards the end]

On Wed, 18 Apr 2018 12:40:38 +0100
Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> wrote:

> This series introduces an iova list associated with a vfio 
> iommu. The list is kept updated taking care of iommu apertures,
> and reserved regions. Also this series adds checks for any conflict
> with existing dma mappings whenever a new device group is attached to
> the domain.
> 
> User-space can retrieve valid iova ranges using VFIO_IOMMU_GET_INFO
> ioctl capability chains. Any dma map request outside the valid iova
> range will be rejected.

Hi Shameer,

I ran into two minor issues in testing this series, both related to
mdev usage of type1.  First, in patch 5/7 when we try to validate a dma
map request:

> +static bool vfio_iommu_iova_dma_valid(struct vfio_iommu *iommu,
> +				dma_addr_t start, dma_addr_t end)
> +{
> +	struct list_head *iova = &iommu->iova_list;
> +	struct vfio_iova *node;
> +
> +	list_for_each_entry(node, iova, list) {
> +		if ((start >= node->start) && (end <= node->end))
> +			return true;
> +	}
> +
> +	return false;
> +}

A container with only an mdev device will have an empty list because it
has no backing iommu to set ranges or reserved regions, so any dma map
will fail.  I think this is resolved as follows:

--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -1100,7 +1100,7 @@ static bool vfio_iommu_iova_dma_valid(struct vfio_iommu *iommu,
                        return true;
        }
 
-       return false;
+       return list_empty(&iommu->iova_list);
 }
 
ie. return false only if there was anything to test against.

The second issue is similar, patch 6/7 adds to VFIO_IOMMU_GET_INFO:

+		ret = vfio_iommu_iova_build_caps(iommu, &caps);
+		if (ret)
+			return ret;

And build_caps has:

+	list_for_each_entry(iova, &iommu->iova_list, list)
+		iovas++;
+
+	if (!iovas) {
+		ret = -EINVAL;

Therefore if the iova list is empty, as for mdevs, the user can no
longer even call VFIO_IOMMU_GET_INFO on the container, which is a
regression.  Again, I think the fix is simple:

@@ -2090,7 +2090,7 @@ static int vfio_iommu_iova_build_caps(struct vfio_iommu *iommu,
                iovas++;
 
        if (!iovas) {
-               ret = -EINVAL;
+               ret = 0;
                goto out_unlock;
        }
 
ie. build_caps needs to handle lack of an iova_list as a non-error.

Also, I wrote a small unit test to validate the iova list for my
systems[1].  With the above changes, my Intel test system gives expected
results:

# ./vfio-type1-iova-list /sys/bus/mdev/devices/c08db5ed-05d3-4b39-b150-438a18bc698f /sys/bus/pci/devices/0000:00:1b.0
---- Adding device: c08db5ed-05d3-4b39-b150-438a18bc698f ----
Initial info struct size: 0x18
No caps
---- Adding device: 0000:00:1b.0 ----
Initial info struct size: 0x18
Requested info struct size: 0x48
New info struct size: 0x48
argsz: 0x48, flags: 0x3, cap_offset: 0x18
	00: 4800 0000 0300 0000 00f0 ffff ffff ffff
	10: 1800 0000 0000 0000 0100 0100 0000 0000
	20: 0200 0000 0000 0000 0000 0000 0000 0000
	30: ffff dffe 0000 0000 0000 f0fe 0000 0000
	40: ffff ffff ffff ff01 
[cap id: 1, version: 1, next: 0x0]
Found type1 iova range version: 1
	00: 0000000000000000 - 00000000fedfffff
	01: 00000000fef00000 - 01ffffffffffffff

Adding an mdev device to the container results in no iova list; adding
the physical device updates it to the expected set with the MSI range
excluded.
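
For reference, the test consumes the new capability roughly like the
minimal sketch below (not the gist code itself).  It assumes a vfio.h
with the uapi additions from patch 6/7 applied; the struct and macro
names are taken from that patch, so treat them as assumptions, and
error handling is omitted:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/* Print the valid iova ranges advertised by a type1 container fd. */
static void dump_iova_ranges(int container)
{
	struct vfio_iommu_type1_info *info;
	size_t argsz = sizeof(*info);
	__u32 off;

	/* First call reports the required size, second fetches the caps. */
	info = calloc(1, argsz);
	info->argsz = argsz;
	ioctl(container, VFIO_IOMMU_GET_INFO, info);

	argsz = info->argsz;
	info = realloc(info, argsz);
	memset(info, 0, argsz);
	info->argsz = argsz;
	ioctl(container, VFIO_IOMMU_GET_INFO, info);

	if (!(info->flags & VFIO_IOMMU_INFO_CAPS)) {
		printf("No caps\n");	/* e.g. an mdev-only container */
		free(info);
		return;
	}

	/* cap_offset and header.next are offsets from the info struct. */
	for (off = info->cap_offset; off;) {
		struct vfio_info_cap_header *hdr =
			(struct vfio_info_cap_header *)((char *)info + off);

		if (hdr->id == VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE) {
			struct vfio_iommu_type1_info_cap_iova_range *cap =
				(struct vfio_iommu_type1_info_cap_iova_range *)hdr;
			__u32 i;

			for (i = 0; i < cap->nr_iovas; i++)
				printf("\t%02u: %016llx - %016llx\n", i,
				       (unsigned long long)cap->iova_ranges[i].start,
				       (unsigned long long)cap->iova_ranges[i].end);
		}
		off = hdr->next;
	}
	free(info);
}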

I was a little surprised by an AMD system:

# ./vfio-type1-iova-list /sys/bus/pci/devices/0000:01:00.0
---- Adding device: 0000:01:00.0 ----
Initial info struct size: 0x18
Requested info struct size: 0x58
New info struct size: 0x58
argsz: 0x58, flags: 0x3, cap_offset: 0x18
	00: 5800 0000 0300 0000 00f0 ffff 7fff ffff
	10: 1800 0000 0000 0000 0100 0100 0000 0000
	20: 0300 0000 0000 0000 0000 0000 0000 0000
	30: ffff dffe 0000 0000 0000 f0fe 0000 0000
	40: ffff ffff fc00 0000 0000 0000 0001 0000
	50: ffff ffff ffff ffff 
[cap id: 1, version: 1, next: 0x0]
Found type1 iova range version: 1
	00: 0000000000000000 - 00000000fedfffff
	01: 00000000fef00000 - 000000fcffffffff
	02: 0000010000000000 - ffffffffffffffff

The additional excluded range is the HyperTransport area, which for
99.9+% of use cases isn't going to be a problem, but it imposes a
rather surprising limit on the size of a VM we can use on AMD, just
under 1TB.  I fully expect we'll see a bug report from that and we
should be thinking about how to address it.  Otherwise, if the changes
above look good to you I'll incorporate them into their respective
patches and push to my next branch.  Thanks,

Alex

[1] https://gist.github.com/awilliam/25e39252ab28101a2d89ace1ff765fdd

^ permalink raw reply	[flat|nested] 42+ messages in thread

* RE: [PATCH v6 0/7] vfio/type1: Add support for valid iova list management
  2018-05-24 18:20   ` Alex Williamson
  (?)
@ 2018-05-25  8:45   ` Shameerali Kolothum Thodi
  2018-05-25 20:54       ` Alex Williamson
  -1 siblings, 1 reply; 42+ messages in thread
From: Shameerali Kolothum Thodi @ 2018-05-25  8:45 UTC (permalink / raw)
  To: Alex Williamson
  Cc: eric.auger, pmorel, kvm, linux-kernel, iommu, Linuxarm,
	John Garry, xuwei (O),
	Joerg Roedel

Hi Alex,

> -----Original Message-----
> From: Alex Williamson [mailto:alex.williamson@redhat.com]
> Sent: Thursday, May 24, 2018 7:21 PM
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
> Cc: eric.auger@redhat.com; pmorel@linux.vnet.ibm.com;
> kvm@vger.kernel.org; linux-kernel@vger.kernel.org; iommu@lists.linux-
> foundation.org; Linuxarm <linuxarm@huawei.com>; John Garry
> <john.garry@huawei.com>; xuwei (O) <xuwei5@huawei.com>; Joerg Roedel
> <joro@8bytes.org>
> Subject: Re: [PATCH v6 0/7] vfio/type1: Add support for valid iova list
> management
> 
> [Cc +Joerg: AMD-Vi observation towards the end]
> 
> On Wed, 18 Apr 2018 12:40:38 +0100
> Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> wrote:
> 
> > This series introduces an iova list associated with a vfio
> > iommu. The list is kept updated taking care of iommu apertures,
> > and reserved regions. Also this series adds checks for any conflict
> > with existing dma mappings whenever a new device group is attached to
> > the domain.
> >
> > User-space can retrieve valid iova ranges using VFIO_IOMMU_GET_INFO
> > ioctl capability chains. Any dma map request outside the valid iova
> > range will be rejected.
> 
> Hi Shameer,
> 
> I ran into two minor issues in testing this series, both related to
> mdev usage of type1.  First, in patch 5/7 when we try to validate a dma
> map request:

I must admit I haven't looked into the mdev use case at all, and my
impression was that it would be the same as the others. Thanks for
doing these tests.

> > +static bool vfio_iommu_iova_dma_valid(struct vfio_iommu *iommu,
> > +				dma_addr_t start, dma_addr_t end)
> > +{
> > +	struct list_head *iova = &iommu->iova_list;
> > +	struct vfio_iova *node;
> > +
> > +	list_for_each_entry(node, iova, list) {
> > +		if ((start >= node->start) && (end <= node->end))
> > +			return true;
> > +	}
> > +
> > +	return false;
> > +}
> 
> A container with only an mdev device will have an empty list because it
> has no backing iommu to set ranges or reserved regions, so any dma map
> will fail.  I think this is resolved as follows:
> 
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -1100,7 +1100,7 @@ static bool vfio_iommu_iova_dma_valid(struct vfio_iommu *iommu,
>                         return true;
>         }
> 
> -       return false;
> +       return list_empty(&iommu->iova_list);
>  }

Ok.

> ie. return false only if there was anything to test against.
> 
> The second issue is similar, patch 6/7 adds to VFIO_IOMMU_GET_INFO:
> 
> +		ret = vfio_iommu_iova_build_caps(iommu, &caps);
> +		if (ret)
> +			return ret;
> 
> And build_caps has:
> 
> +	list_for_each_entry(iova, &iommu->iova_list, list)
> +		iovas++;
> +
> +	if (!iovas) {
> +		ret = -EINVAL;
> 
> Therefore if the iova list is empty, as for mdevs, the user can no
> longer even call VFIO_IOMMU_GET_INFO on the container, which is a
> regression.  Again, I think the fix is simple:
> 
> @@ -2090,7 +2090,7 @@ static int vfio_iommu_iova_build_caps(struct vfio_iommu *iommu,
>                 iovas++;
> 
>         if (!iovas) {
> -               ret = -EINVAL;
> +               ret = 0;
>                 goto out_unlock;
>         }
> 
> ie. build_caps needs to handle lack of an iova_list as a non-error.

Ok.

> Also, I wrote a small unit test to validate the iova list for my
> systems[1].  With the above changes, my Intel test system gives expected
> results:
> 
> # ./vfio-type1-iova-list /sys/bus/mdev/devices/c08db5ed-05d3-4b39-b150-438a18bc698f /sys/bus/pci/devices/0000:00:1b.0
> ---- Adding device: c08db5ed-05d3-4b39-b150-438a18bc698f ----
> Initial info struct size: 0x18
> No caps
> ---- Adding device: 0000:00:1b.0 ----
> Initial info struct size: 0x18
> Requested info struct size: 0x48
> New info struct size: 0x48
> argsz: 0x48, flags: 0x3, cap_offset: 0x18
> 	00: 4800 0000 0300 0000 00f0 ffff ffff ffff
> 	10: 1800 0000 0000 0000 0100 0100 0000 0000
> 	20: 0200 0000 0000 0000 0000 0000 0000 0000
> 	30: ffff dffe 0000 0000 0000 f0fe 0000 0000
> 	40: ffff ffff ffff ff01
> [cap id: 1, version: 1, next: 0x0]
> Found type1 iova range version: 1
> 	00: 0000000000000000 - 00000000fedfffff
> 	01: 00000000fef00000 - 01ffffffffffffff
> 
> Adding an mdev device to the container results in no iova list; adding
> the physical device updates it to the expected set with the MSI range
> excluded.
> 
> I was a little surprised by an AMD system:
> 
> # ./vfio-type1-iova-list /sys/bus/pci/devices/0000:01:00.0
> ---- Adding device: 0000:01:00.0 ----
> Initial info struct size: 0x18
> Requested info struct size: 0x58
> New info struct size: 0x58
> argsz: 0x58, flags: 0x3, cap_offset: 0x18
> 	00: 5800 0000 0300 0000 00f0 ffff 7fff ffff
> 	10: 1800 0000 0000 0000 0100 0100 0000 0000
> 	20: 0300 0000 0000 0000 0000 0000 0000 0000
> 	30: ffff dffe 0000 0000 0000 f0fe 0000 0000
> 	40: ffff ffff fc00 0000 0000 0000 0001 0000
> 	50: ffff ffff ffff ffff
> [cap id: 1, version: 1, next: 0x0]
> Found type1 iova range version: 1
> 	00: 0000000000000000 - 00000000fedfffff
> 	01: 00000000fef00000 - 000000fcffffffff
> 	02: 0000010000000000 - ffffffffffffffff
> 
> The additional excluded range is the HyperTransport area, which for
> 99.9+% of use cases isn't going to be a problem, but it imposes a
> rather surprising limit on the size of a VM we can use on AMD, just
> under 1TB.  I fully expect we'll see a bug report from that and we
> should be thinking about how to address it.  Otherwise, if the changes
> above look good to you I'll incorporate them into their respective
> patches and push to my next branch.  Thanks,

Yes, the above changes related to the list-empty cases look fine to me.

Much appreciated,
Shameer
 
> Alex
> 
> [1] https://gist.github.com/awilliam/25e39252ab28101a2d89ace1ff765fdd

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v6 0/7] vfio/type1: Add support for valid iova list management
@ 2018-05-25 20:54       ` Alex Williamson
  0 siblings, 0 replies; 42+ messages in thread
From: Alex Williamson @ 2018-05-25 20:54 UTC (permalink / raw)
  To: Shameerali Kolothum Thodi
  Cc: eric.auger, pmorel, kvm, linux-kernel, iommu, Linuxarm,
	John Garry, xuwei (O),
	Joerg Roedel

On Fri, 25 May 2018 08:45:36 +0000
Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> wrote:

> Yes, the above changes related to the list-empty cases look fine to me.

Thanks Shameer, applied to my next branch with the discussed fixes for
v4.18 (modulo patch 4/7 which Joerg already pushed for v4.17).  Thanks,

Alex

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v6 0/7] vfio/type1: Add support for valid iova list management
  2018-05-25 20:54       ` Alex Williamson
  (?)
  (?)
@ 2018-06-05 17:03       ` Alex Williamson
  2018-06-06  6:52         ` Shameerali Kolothum Thodi
  -1 siblings, 1 reply; 42+ messages in thread
From: Alex Williamson @ 2018-06-05 17:03 UTC (permalink / raw)
  To: Shameerali Kolothum Thodi
  Cc: eric.auger, pmorel, kvm, linux-kernel, iommu, Linuxarm,
	John Garry, xuwei (O),
	Joerg Roedel, David Woodhouse

[Cc +dwmw2]

On Fri, 25 May 2018 14:54:08 -0600
Alex Williamson <alex.williamson@redhat.com> wrote:

> On Fri, 25 May 2018 08:45:36 +0000
> Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> wrote:
> 
> > > Yes, the above changes related to the list-empty cases look fine to me.
> 
> Thanks Shameer, applied to my next branch with the discussed fixes for
> v4.18 (modulo patch 4/7 which Joerg already pushed for v4.17).  Thanks,

Hi Shameer,

We're still hitting issues with this.  VT-d marks reserved regions for
any RMRR ranges, which are IOVA ranges that the BIOS requests to be
identity mapped for a device.  These are indicated by these sorts of
log entries on boot:

 DMAR: Setting RMRR:
 DMAR: Setting identity map for device 0000:00:02.0 [0xbf800000 - 0xcf9fffff]
 DMAR: Setting identity map for device 0000:00:1a.0 [0xbe8d1000 - 0xbe8dffff]
 DMAR: Setting identity map for device 0000:00:1d.0 [0xbe8d1000 - 0xbe8dffff]

So while for an unaffected device, I see very usable ranges for QEMU,
such as:

	00: 0000000000000000 - 00000000fedfffff
	01: 00000000fef00000 - 01ffffffffffffff

If I try to assign the previously assignable 00:02.0 IGD graphics
device, I get:

	00: 0000000000000000 - 00000000bf7fffff
	01: 00000000cfa00000 - 00000000fedfffff
	02: 00000000fef00000 - 01ffffffffffffff

And we get a fault when QEMU tries the following mapping:

vfio_dma_map(0x55f790421a20, 0x100000, 0xbff00000, 0x7ff163f00000)

bff00000 clearly extends into the gap starting at bf800000.  VT-d is
rather split-brained about RMRRs: typically we'd exclude a device from
assignment entirely for relying on RMRRs, and these reserved ranges
would be a welcome mechanism to avoid conflicts with them.  But for
RMRR ranges covering IGD and USB devices we've decided that we don't
need to honor the RMRR (see device_is_rmrr_locked()), yet it's still
listed as a reserved range and bites us here.

Unless we can get VT-d to exclude RMRRs from reserved regions where
we've already seen fit to allow the devices to participate in the IOMMU
API, this would introduce a functional regression for assignment of
these devices.  David, should we skip GFX and USB devices with RMRRs in
intel_iommu_get_resv_regions() just as we do in
device_is_rmrr_locked()?  Thanks,

Alex

^ permalink raw reply	[flat|nested] 42+ messages in thread

* RE: [PATCH v6 0/7] vfio/type1: Add support for valid iova list management
  2018-06-05 17:03       ` Alex Williamson
@ 2018-06-06  6:52         ` Shameerali Kolothum Thodi
  2018-06-07 15:05             ` Alex Williamson
  0 siblings, 1 reply; 42+ messages in thread
From: Shameerali Kolothum Thodi @ 2018-06-06  6:52 UTC (permalink / raw)
  To: Alex Williamson
  Cc: eric.auger, pmorel, kvm, linux-kernel, iommu, Linuxarm,
	John Garry, xuwei (O),
	Joerg Roedel, David Woodhouse



> -----Original Message-----
> From: Alex Williamson [mailto:alex.williamson@redhat.com]
> Sent: 05 June 2018 18:03
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
> Cc: eric.auger@redhat.com; pmorel@linux.vnet.ibm.com;
> kvm@vger.kernel.org; linux-kernel@vger.kernel.org; iommu@lists.linux-
> foundation.org; Linuxarm <linuxarm@huawei.com>; John Garry
> <john.garry@huawei.com>; xuwei (O) <xuwei5@huawei.com>; Joerg Roedel
> <joro@8bytes.org>; David Woodhouse <dwmw2@infradead.org>
> Subject: Re: [PATCH v6 0/7] vfio/type1: Add support for valid iova list
> management
> 
> [Cc +dwmw2]
> 
> On Fri, 25 May 2018 14:54:08 -0600
> Alex Williamson <alex.williamson@redhat.com> wrote:
> 
> > On Fri, 25 May 2018 08:45:36 +0000
> > Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
> wrote:
> >
> > > Yes, the above changes related to the list-empty cases look fine to me.
> >
> > Thanks Shameer, applied to my next branch with the discussed fixes for
> > v4.18 (modulo patch 4/7 which Joerg already pushed for v4.17).  Thanks,
> 
> Hi Shameer,
> 
> We're still hitting issues with this.  VT-d marks reserved regions for
> any RMRR ranges, which are IOVA ranges that the BIOS requests to be
> identity mapped for a device.  These are indicated by these sorts of
> log entries on boot:
> 
>  DMAR: Setting RMRR:
>  DMAR: Setting identity map for device 0000:00:02.0 [0xbf800000 - 0xcf9fffff]
>  DMAR: Setting identity map for device 0000:00:1a.0 [0xbe8d1000 - 0xbe8dffff]
>  DMAR: Setting identity map for device 0000:00:1d.0 [0xbe8d1000 - 0xbe8dffff]
> 
> So while for an unaffected device, I see very usable ranges for QEMU,
> such as:
> 
> 	00: 0000000000000000 - 00000000fedfffff
> 	01: 00000000fef00000 - 01ffffffffffffff
> 
> If I try to assign the previously assignable 00:02.0 IGD graphics
> device, I get:
> 
> 	00: 0000000000000000 - 00000000bf7fffff
> 	01: 00000000cfa00000 - 00000000fedfffff
> 	02: 00000000fef00000 - 01ffffffffffffff
> 
> And we get a fault when QEMU tries the following mapping:
> 
> vfio_dma_map(0x55f790421a20, 0x100000, 0xbff00000, 0x7ff163f00000)
> 
> bff00000 clearly extends into the gap starting at bf800000.  VT-d is
> rather split-brained about RMRRs: typically we'd exclude a device from
> assignment entirely for relying on RMRRs, and these reserved ranges
> would be a welcome mechanism to avoid conflicts with them.  But for
> RMRR ranges covering IGD and USB devices we've decided that we don't
> need to honor the RMRR (see device_is_rmrr_locked()), yet it's still
> listed as a reserved range and bites us here.

Ah.. :(. This looks similar to the PCI window range reporting issue we
faced in the ARM world. I see the RFC you have sent out to ignore these
"known" RMRRs. Hope we will be able to resolve this soon.
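
For reference, my understanding is that the change being discussed
would sit in intel_iommu_get_resv_regions(), along the lines of the
untested sketch below (surrounding declarations elided, reusing the
existing device_is_rmrr_locked() policy):

	rcu_read_lock();
	for_each_rmrr_units(rmrr) {
		for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
					  i, i_dev) {
			if (i_dev != device)
				continue;

			/*
			 * GFX/USB RMRRs are already not honored for
			 * assignment (device_is_rmrr_locked() returns
			 * false for them), so arguably they should not
			 * be reported as reserved regions either.
			 */
			if (!device_is_rmrr_locked(device))
				continue;

			/* ...existing iommu_alloc_resv_region() path... */
		}
	}
	rcu_read_unlock();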

Thanks,
Shameer

> Unless we can get VT-d to exclude RMRRs from reserved regions where
> we've already seen fit to allow the devices to participate in the IOMMU
> API, this would introduce a functional regression for assignment of
> these devices.  David, should we skip GFX and USB devices with RMRRs in
> intel_iommu_get_resv_regions() just as we do in
> device_is_rmrr_locked()?  Thanks,
> 
> Alex

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v6 0/7] vfio/type1: Add support for valid iova list management
@ 2018-06-07 15:05             ` Alex Williamson
  0 siblings, 0 replies; 42+ messages in thread
From: Alex Williamson @ 2018-06-07 15:05 UTC (permalink / raw)
  To: Shameerali Kolothum Thodi
  Cc: eric.auger, pmorel, kvm, linux-kernel, iommu, Linuxarm,
	John Garry, xuwei (O),
	Joerg Roedel, David Woodhouse

On Wed, 6 Jun 2018 06:52:08 +0000
Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> wrote:

> > -----Original Message-----
> > From: Alex Williamson [mailto:alex.williamson@redhat.com]
> > Sent: 05 June 2018 18:03
> > To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
> > Cc: eric.auger@redhat.com; pmorel@linux.vnet.ibm.com;
> > kvm@vger.kernel.org; linux-kernel@vger.kernel.org; iommu@lists.linux-
> > foundation.org; Linuxarm <linuxarm@huawei.com>; John Garry
> > <john.garry@huawei.com>; xuwei (O) <xuwei5@huawei.com>; Joerg Roedel
> > <joro@8bytes.org>; David Woodhouse <dwmw2@infradead.org>
> > Subject: Re: [PATCH v6 0/7] vfio/type1: Add support for valid iova list
> > management
> > 
> > [Cc +dwmw2]
> > 
> > On Fri, 25 May 2018 14:54:08 -0600
> > Alex Williamson <alex.williamson@redhat.com> wrote:
> >   
> > > On Fri, 25 May 2018 08:45:36 +0000
> > > Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>  
> > wrote:  
> > >  
> > > > Yes, the above changes related to the list-empty cases look fine to me.
> > >
> > > Thanks Shameer, applied to my next branch with the discussed fixes for
> > > v4.18 (modulo patch 4/7 which Joerg already pushed for v4.17).  Thanks,  
> > 
> > Hi Shameer,
> > 
> > We're still hitting issues with this.  VT-d marks reserved regions for
> > any RMRR ranges, which are IOVA ranges that the BIOS requests to be
> > identity mapped for a device.  These are indicated by these sorts of
> > log entries on boot:
> > 
> >  DMAR: Setting RMRR:
> >  DMAR: Setting identity map for device 0000:00:02.0 [0xbf800000 - 0xcf9fffff]
> >  DMAR: Setting identity map for device 0000:00:1a.0 [0xbe8d1000 - 0xbe8dffff]
> >  DMAR: Setting identity map for device 0000:00:1d.0 [0xbe8d1000 - 0xbe8dffff]
> > 
> > So while for an unaffected device, I see very usable ranges for QEMU,
> > such as:
> > 
> > 	00: 0000000000000000 - 00000000fedfffff
> > 	01: 00000000fef00000 - 01ffffffffffffff
> > 
> > If I try to assign the previously assignable 00:02.0 IGD graphics
> > device, I get:
> > 
> > 	00: 0000000000000000 - 00000000bf7fffff
> > 	01: 00000000cfa00000 - 00000000fedfffff
> > 	02: 00000000fef00000 - 01ffffffffffffff
> > 
> > And we get a fault when QEMU tries the following mapping:
> > 
> > vfio_dma_map(0x55f790421a20, 0x100000, 0xbff00000, 0x7ff163f00000)
> > 
> > bff00000 clearly extends into the gap starting at bf800000.  VT-d is
> > rather split-brained about RMRRs: typically we'd exclude a device from
> > assignment entirely for relying on RMRRs, and these reserved ranges
> > would be a welcome mechanism to avoid conflicts with them.  But for
> > RMRR ranges covering IGD and USB devices we've decided that we don't
> > need to honor the RMRR (see device_is_rmrr_locked()), yet it's still
> > listed as a reserved range and bites us here.
> 
> Ah.. :(. This looks similar to the PCI window range reporting issue we
> faced in the ARM world. I see the RFC you have sent out to ignore these
> "known" RMRRs. Hope we will be able to resolve this soon.

In the meantime, it seems that I need to drop the iova list from my
branch for v4.18.  I don't think it makes much sense to expose the list
to userspace if we cannot also enforce it, and it seems to present API
issues if we were to expose the iova list without enforcement and later
try to enforce it.  Sorry I didn't catch this earlier.  Thanks,

Alex

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v6 0/7] vfio/type1: Add support for valid iova list management
@ 2018-06-07 15:05             ` Alex Williamson
  0 siblings, 0 replies; 42+ messages in thread
From: Alex Williamson @ 2018-06-07 15:05 UTC (permalink / raw)
  To: Shameerali Kolothum Thodi
  Cc: kvm-u79uwXL29TY76Z2rM5mHXA,
	pmorel-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Linuxarm,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, xuwei (O),
	David Woodhouse

On Wed, 6 Jun 2018 06:52:08 +0000
Shameerali Kolothum Thodi <shameerali.kolothum.thodi-hv44wF8Li93QT0dZR+AlfA@public.gmane.org> wrote:

> > -----Original Message-----
> > From: Alex Williamson [mailto:alex.williamson-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org]
> > Sent: 05 June 2018 18:03
> > To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
> > Cc: eric.auger-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org; pmorel-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org;
> > kvm-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; iommu-cunTk1MwBs/ROKNJybVBZg@public.gmane.org
> > foundation.org; Linuxarm <linuxarm-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>; John Garry
> > <john.garry-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>; xuwei (O) <xuwei5-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>; Joerg Roedel
> > <joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>; David Woodhouse <dwmw2-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
> > Subject: Re: [PATCH v6 0/7] vfio/type1: Add support for valid iova list
> > management
> > 
> > [Cc +dwmw2]
> > 
> > On Fri, 25 May 2018 14:54:08 -0600
> > Alex Williamson <alex.williamson-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> >   
> > > On Fri, 25 May 2018 08:45:36 +0000
> > > Shameerali Kolothum Thodi <shameerali.kolothum.thodi-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>  
> > wrote:  
> > >  
> > > > Yes, the above changes related to list empty cases looks fine to me.  
> > >
> > > Thanks Shameer, applied to my next branch with the discussed fixes for
> > > v4.18 (modulo patch 4/7 which Joerg already pushed for v4.17).  Thanks,  
> > 
> > Hi Shameer,
> > 
> > We're still hitting issues with this.  VT-d marks reserved regions for
> > any RMRR ranges, which are IOVA ranges that the BIOS requests to be
> > identity mapped for a device.  These are indicated by these sorts of
> > log entries on boot:
> > 
> >  DMAR: Setting RMRR:
> >  DMAR: Setting identity map for device 0000:00:02.0 [0xbf800000 - 0xcf9fffff]
> >  DMAR: Setting identity map for device 0000:00:1a.0 [0xbe8d1000 - 0xbe8dffff]
> >  DMAR: Setting identity map for device 0000:00:1d.0 [0xbe8d1000 - 0xbe8dffff]
> > 
> > So while for an unaffected device, I see very usable ranges for QEMU,
> > such as:
> > 
> > 	00: 0000000000000000 - 00000000fedfffff
> > 	01: 00000000fef00000 - 01ffffffffffffff
> > 
> > If I try to assign the previously assignable 00:02.0 IGD graphics
> > device, I get:
> > 
> > 	00: 0000000000000000 - 00000000bf7fffff
> > 	01: 00000000cfa00000 - 00000000fedfffff
> > 	02: 00000000fef00000 - 01ffffffffffffff
> > 
> > And we get a fault when QEMU tries the following mapping:
> > 
> > vfio_dma_map(0x55f790421a20, 0x100000, 0xbff00000, 0x7ff163f00000)
> > 
> > bff00000 clearly extends into the gap starting at bf800000.  VT-d is
> > rather split-brained about RMRRs, typically we'd exclude devices from
> > assignment at all for relying on RMRRs and these reserved ranges
> > would be a welcome mechanism to avoid conflicts with those ranges, but
> > for RMRR ranges covering IGD and USB devices we've decided that we
> > don't need to honor the RMRR (see device_is_rmrr_locked()), but it's
> > still listed as a reserved range and bites us here.  
> 
> Ah..:(. This looks similar to the pci window range reporting issue faced in
> the arm world. I see the RFC you have sent out to ignore these "known" 
> RMRRs. Hope we will be able to resolve this soon.

In the meantime, it seems that I need to drop the iova list from my
branch for v4.18, I don't think it makes much sense to expose to
userspace if we cannot also enforce it and it seems like it presents API
issues if we were to expose the iova list without enforcement and later
try to enforce it.  Sorry I didn't catch this earlier.  Thanks,

Alex

^ permalink raw reply	[flat|nested] 42+ messages in thread

end of thread, other threads:[~2018-06-07 15:05 UTC | newest]

Thread overview: 42+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-18 11:40 [PATCH v6 0/7] vfio/type1: Add support for valid iova list management Shameer Kolothum
2018-04-18 11:40 ` [PATCH v6 1/7] vfio/type1: Introduce iova list and add iommu aperture validity check Shameer Kolothum
2018-04-18 11:40 ` [PATCH v6 2/7] vfio/type1: Check reserve region conflict and update iova list Shameer Kolothum
2018-04-18 11:40 ` [PATCH v6 3/7] vfio/type1: Update iova list on detach Shameer Kolothum
2018-04-18 11:40 ` [PATCH v6 4/7] iommu/dma: Move PCI window region reservation back into dma specific path Shameer Kolothum
2018-04-24 13:16   ` Shameerali Kolothum Thodi
2018-04-30 16:10     ` Alex Williamson
2018-05-03 13:16   ` Joerg Roedel
2018-04-18 11:40 ` [PATCH v6 5/7] vfio/type1: check dma map request is within a valid iova range Shameer Kolothum
2018-04-18 11:40 ` [PATCH v6 6/7] vfio/type1: Add IOVA range capability support Shameer Kolothum
2018-04-18 11:40 ` [PATCH v6 7/7] vfio/type1: remove duplicate retrieval of reserved regions Shameer Kolothum
2018-05-24 18:20 ` [PATCH v6 0/7] vfio/type1: Add support for valid iova list management Alex Williamson
2018-05-25  8:45   ` Shameerali Kolothum Thodi
2018-05-25 20:54     ` Alex Williamson
2018-06-05 17:03       ` Alex Williamson
2018-06-06  6:52         ` Shameerali Kolothum Thodi
2018-06-07 15:05           ` Alex Williamson
