linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/7] RMRR related fixes and enhancements
@ 2019-05-16  8:47 Eric Auger
  2019-05-16  8:47 ` [PATCH v2 1/7] iommu: Pass a GFP flag parameter to iommu_alloc_resv_region() Eric Auger
                   ` (6 more replies)
  0 siblings, 7 replies; 9+ messages in thread
From: Eric Auger @ 2019-05-16  8:47 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, joro, iommu, linux-kernel, dwmw2,
	lorenzo.pieralisi, robin.murphy, will.deacon, hanjun.guo,
	sudeep.holla
  Cc: alex.williamson, shameerali.kolothum.thodi

Currently the Intel reserved region is attached to the
RMRR unit and when building the list of RMRR seen by a device
we link this unique reserved region without taking care of
potential multiple usage of this reserved region by several devices.

Also while reading the vtd spec it is unclear to me whether
the RMRR device scope referenced by an RMRR ACPI struct could
be a PCI-PCI bridge, in which case I think we also need to
check the device belongs to the PCI sub-hierarchy of the device
referenced in the scope. This would be true for device_has_rmrr()
and intel_iommu_get_resv_regions().

Last, the VFIO subsystem would need to compute the usable IOVA range
by querying the iommu_get_group_resv_regions() API. This would allow,
for instance, to report potential conflicts between the guest physical
address space and host reserved regions.

However iommu_get_group_resv_regions() currently fails to differentiate
RMRRs that are known safe for device assignment and RMRRs that must be
enforced. So we introduce a new reserved memory region type (relaxable),
reported when associated to an USB or GFX device. The last 2 patches aim
at unblocking [1] which is stuck since 4.18.

[1-5] are fixes
[6-7] are enhancements

References:
[1] [PATCH v6 0/7] vfio/type1: Add support for valid iova list management
    https://patchwork.kernel.org/patch/10425309/

The two parts can be considered separately if needed.

History

1->2:
- introduce is_downstream_to_pci_bridge() in a separate patch, change param
  names and add kerneldoc comment (Jacob)
- add 6,7


Eric Auger (7):
  iommu: Pass a GFP flag parameter to iommu_alloc_resv_region()
  iommu/vt-d: Duplicate iommu_resv_region objects per device list
  iommu/vt-d: Introduce is_downstream_to_pci_bridge helper
  iommu/vt-d: Handle RMRR with PCI bridge device scopes
  iommu/vt-d: Handle PCI bridge RMRR device scopes in
    intel_iommu_get_resv_regions
  iommu: Introduce IOMMU_RESV_DIRECT_RELAXABLE reserved memory regions
  iommu/vt-d: Differentiate relaxable and non relaxable RMRRs

 drivers/acpi/arm64/iort.c   |  3 +-
 drivers/iommu/amd_iommu.c   |  7 ++--
 drivers/iommu/arm-smmu-v3.c |  2 +-
 drivers/iommu/arm-smmu.c    |  2 +-
 drivers/iommu/intel-iommu.c | 82 +++++++++++++++++++++++++------------
 drivers/iommu/iommu.c       | 19 +++++----
 include/linux/iommu.h       |  8 +++-
 7 files changed, 82 insertions(+), 41 deletions(-)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v2 1/7] iommu: Pass a GFP flag parameter to iommu_alloc_resv_region()
  2019-05-16  8:47 [PATCH v2 0/7] RMRR related fixes and enhancements Eric Auger
@ 2019-05-16  8:47 ` Eric Auger
  2019-05-16  8:47 ` [PATCH v2 2/7] iommu/vt-d: Duplicate iommu_resv_region objects per device list Eric Auger
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Eric Auger @ 2019-05-16  8:47 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, joro, iommu, linux-kernel, dwmw2,
	lorenzo.pieralisi, robin.murphy, will.deacon, hanjun.guo,
	sudeep.holla
  Cc: alex.williamson, shameerali.kolothum.thodi

We plan to call iommu_alloc_resv_region in a non preemptible section.
Pass a GFP flag to the function and update all the call sites.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 drivers/acpi/arm64/iort.c   | 3 ++-
 drivers/iommu/amd_iommu.c   | 7 ++++---
 drivers/iommu/arm-smmu-v3.c | 2 +-
 drivers/iommu/arm-smmu.c    | 2 +-
 drivers/iommu/intel-iommu.c | 4 ++--
 drivers/iommu/iommu.c       | 7 ++++---
 include/linux/iommu.h       | 2 +-
 7 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index 9058cb084b91..8313b2d1b710 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -868,7 +868,8 @@ int iort_iommu_msi_get_resv_regions(struct device *dev, struct list_head *head)
 			struct iommu_resv_region *region;
 
 			region = iommu_alloc_resv_region(base + SZ_64K, SZ_64K,
-							 prot, IOMMU_RESV_MSI);
+							 prot, IOMMU_RESV_MSI,
+							 GFP_KERNEL);
 			if (region) {
 				list_add_tail(&region->list, head);
 				resv++;
diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 09c9e45f7fa2..f2eb8e9cd8a6 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -3136,7 +3136,8 @@ static void amd_iommu_get_resv_regions(struct device *dev,
 			type = IOMMU_RESV_RESERVED;
 
 		region = iommu_alloc_resv_region(entry->address_start,
-						 length, prot, type);
+						 length, prot, type,
+						 GFP_KERNEL);
 		if (!region) {
 			dev_err(dev, "Out of memory allocating dm-regions\n");
 			return;
@@ -3146,14 +3147,14 @@ static void amd_iommu_get_resv_regions(struct device *dev,
 
 	region = iommu_alloc_resv_region(MSI_RANGE_START,
 					 MSI_RANGE_END - MSI_RANGE_START + 1,
-					 0, IOMMU_RESV_MSI);
+					 0, IOMMU_RESV_MSI, GFP_KERNEL);
 	if (!region)
 		return;
 	list_add_tail(&region->list, head);
 
 	region = iommu_alloc_resv_region(HT_RANGE_START,
 					 HT_RANGE_END - HT_RANGE_START + 1,
-					 0, IOMMU_RESV_RESERVED);
+					 0, IOMMU_RESV_RESERVED, GFP_KERNEL);
 	if (!region)
 		return;
 	list_add_tail(&region->list, head);
diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 4d5a694f02c2..f9b1279ef5bf 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -2226,7 +2226,7 @@ static void arm_smmu_get_resv_regions(struct device *dev,
 	int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
 
 	region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
-					 prot, IOMMU_RESV_SW_MSI);
+					 prot, IOMMU_RESV_SW_MSI, GFP_KERNEL);
 	if (!region)
 		return;
 
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 5e54cc0a28b3..646e76813e91 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1670,7 +1670,7 @@ static void arm_smmu_get_resv_regions(struct device *dev,
 	int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
 
 	region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
-					 prot, IOMMU_RESV_SW_MSI);
+					 prot, IOMMU_RESV_SW_MSI, GFP_KERNEL);
 	if (!region)
 		return;
 
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index a209199f3af6..2be36dff189a 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -4220,7 +4220,7 @@ int __init dmar_parse_one_rmrr(struct acpi_dmar_header *header, void *arg)
 
 	length = rmrr->end_address - rmrr->base_address + 1;
 	rmrru->resv = iommu_alloc_resv_region(rmrr->base_address, length, prot,
-					      IOMMU_RESV_DIRECT);
+					      IOMMU_RESV_DIRECT, GFP_KERNEL);
 	if (!rmrru->resv)
 		goto free_rmrru;
 
@@ -5489,7 +5489,7 @@ static void intel_iommu_get_resv_regions(struct device *device,
 
 	reg = iommu_alloc_resv_region(IOAPIC_RANGE_START,
 				      IOAPIC_RANGE_END - IOAPIC_RANGE_START + 1,
-				      0, IOMMU_RESV_MSI);
+				      0, IOMMU_RESV_MSI, GFP_KERNEL);
 	if (!reg)
 		return;
 	list_add_tail(&reg->list, head);
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 67ee6623f9b2..ae4ea5c0e6f9 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -256,7 +256,7 @@ static int iommu_insert_resv_region(struct iommu_resv_region *new,
 	}
 insert:
 	region = iommu_alloc_resv_region(new->start, new->length,
-					 new->prot, new->type);
+					 new->prot, new->type, GFP_KERNEL);
 	if (!region)
 		return -ENOMEM;
 
@@ -1891,11 +1891,12 @@ void iommu_put_resv_regions(struct device *dev, struct list_head *list)
 
 struct iommu_resv_region *iommu_alloc_resv_region(phys_addr_t start,
 						  size_t length, int prot,
-						  enum iommu_resv_type type)
+						  enum iommu_resv_type type,
+						  gfp_t flags)
 {
 	struct iommu_resv_region *region;
 
-	region = kzalloc(sizeof(*region), GFP_KERNEL);
+	region = kzalloc(sizeof(*region), flags);
 	if (!region)
 		return NULL;
 
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index a815cf6f6f47..ba91666998fb 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -364,7 +364,7 @@ extern void iommu_put_resv_regions(struct device *dev, struct list_head *list);
 extern int iommu_request_dm_for_dev(struct device *dev);
 extern struct iommu_resv_region *
 iommu_alloc_resv_region(phys_addr_t start, size_t length, int prot,
-			enum iommu_resv_type type);
+			enum iommu_resv_type type, gfp_t flags);
 extern int iommu_get_group_resv_regions(struct iommu_group *group,
 					struct list_head *head);
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v2 2/7] iommu/vt-d: Duplicate iommu_resv_region objects per device list
  2019-05-16  8:47 [PATCH v2 0/7] RMRR related fixes and enhancements Eric Auger
  2019-05-16  8:47 ` [PATCH v2 1/7] iommu: Pass a GFP flag parameter to iommu_alloc_resv_region() Eric Auger
@ 2019-05-16  8:47 ` Eric Auger
  2019-05-16  8:47 ` [PATCH v2 3/7] iommu/vt-d: Introduce is_downstream_to_pci_bridge helper Eric Auger
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Eric Auger @ 2019-05-16  8:47 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, joro, iommu, linux-kernel, dwmw2,
	lorenzo.pieralisi, robin.murphy, will.deacon, hanjun.guo,
	sudeep.holla
  Cc: alex.williamson, shameerali.kolothum.thodi

intel_iommu_get_resv_regions() aims to return the list of
reserved regions accessible by a given @device. However several
devices can access the same reserved memory region and when
building the list it is not safe to use a single iommu_resv_region
object, whose container is the RMRR. This iommu_resv_region must
be duplicated per device reserved region list.

Let's remove the struct iommu_resv_region from the RMRR unit
and allocate the iommu_resv_region directly in
intel_iommu_get_resv_regions().

Fixes: 0659b8dc45a6 ("iommu/vt-d: Implement reserved region get/put callbacks")
Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 drivers/iommu/intel-iommu.c | 31 ++++++++++++++++---------------
 1 file changed, 16 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 2be36dff189a..590a0e78d11d 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -322,7 +322,6 @@ struct dmar_rmrr_unit {
 	u64	end_address;		/* reserved end address */
 	struct dmar_dev_scope *devices;	/* target devices */
 	int	devices_cnt;		/* target device count */
-	struct iommu_resv_region *resv; /* reserved region handle */
 };
 
 struct dmar_atsr_unit {
@@ -4205,7 +4204,6 @@ static inline void init_iommu_pm_ops(void) {}
 int __init dmar_parse_one_rmrr(struct acpi_dmar_header *header, void *arg)
 {
 	struct acpi_dmar_reserved_memory *rmrr;
-	int prot = DMA_PTE_READ|DMA_PTE_WRITE;
 	struct dmar_rmrr_unit *rmrru;
 	size_t length;
 
@@ -4219,22 +4217,16 @@ int __init dmar_parse_one_rmrr(struct acpi_dmar_header *header, void *arg)
 	rmrru->end_address = rmrr->end_address;
 
 	length = rmrr->end_address - rmrr->base_address + 1;
-	rmrru->resv = iommu_alloc_resv_region(rmrr->base_address, length, prot,
-					      IOMMU_RESV_DIRECT, GFP_KERNEL);
-	if (!rmrru->resv)
-		goto free_rmrru;
 
 	rmrru->devices = dmar_alloc_dev_scope((void *)(rmrr + 1),
 				((void *)rmrr) + rmrr->header.length,
 				&rmrru->devices_cnt);
 	if (rmrru->devices_cnt && rmrru->devices == NULL)
-		goto free_all;
+		goto free_rmrru;
 
 	list_add(&rmrru->list, &dmar_rmrr_units);
 
 	return 0;
-free_all:
-	kfree(rmrru->resv);
 free_rmrru:
 	kfree(rmrru);
 out:
@@ -4452,7 +4444,6 @@ static void intel_iommu_free_dmars(void)
 	list_for_each_entry_safe(rmrru, rmrr_n, &dmar_rmrr_units, list) {
 		list_del(&rmrru->list);
 		dmar_free_dev_scope(&rmrru->devices, &rmrru->devices_cnt);
-		kfree(rmrru->resv);
 		kfree(rmrru);
 	}
 
@@ -5470,6 +5461,7 @@ static void intel_iommu_remove_device(struct device *dev)
 static void intel_iommu_get_resv_regions(struct device *device,
 					 struct list_head *head)
 {
+	int prot = DMA_PTE_READ|DMA_PTE_WRITE;
 	struct iommu_resv_region *reg;
 	struct dmar_rmrr_unit *rmrr;
 	struct device *i_dev;
@@ -5479,10 +5471,21 @@ static void intel_iommu_get_resv_regions(struct device *device,
 	for_each_rmrr_units(rmrr) {
 		for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
 					  i, i_dev) {
+			struct iommu_resv_region *resv;
+			size_t length;
+
 			if (i_dev != device)
 				continue;
 
-			list_add_tail(&rmrr->resv->list, head);
+			length = rmrr->end_address - rmrr->base_address + 1;
+			resv = iommu_alloc_resv_region(rmrr->base_address,
+						       length, prot,
+						       IOMMU_RESV_DIRECT,
+						       GFP_ATOMIC);
+			if (!resv)
+				break;
+
+			list_add_tail(&resv->list, head);
 		}
 	}
 	rcu_read_unlock();
@@ -5500,10 +5503,8 @@ static void intel_iommu_put_resv_regions(struct device *dev,
 {
 	struct iommu_resv_region *entry, *next;
 
-	list_for_each_entry_safe(entry, next, head, list) {
-		if (entry->type == IOMMU_RESV_MSI)
-			kfree(entry);
-	}
+	list_for_each_entry_safe(entry, next, head, list)
+		kfree(entry);
 }
 
 int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct device *dev)
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v2 3/7] iommu/vt-d: Introduce is_downstream_to_pci_bridge helper
  2019-05-16  8:47 [PATCH v2 0/7] RMRR related fixes and enhancements Eric Auger
  2019-05-16  8:47 ` [PATCH v2 1/7] iommu: Pass a GFP flag parameter to iommu_alloc_resv_region() Eric Auger
  2019-05-16  8:47 ` [PATCH v2 2/7] iommu/vt-d: Duplicate iommu_resv_region objects per device list Eric Auger
@ 2019-05-16  8:47 ` Eric Auger
  2019-05-16  8:47 ` [PATCH v2 4/7] iommu/vt-d: Handle RMRR with PCI bridge device scopes Eric Auger
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Eric Auger @ 2019-05-16  8:47 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, joro, iommu, linux-kernel, dwmw2,
	lorenzo.pieralisi, robin.murphy, will.deacon, hanjun.guo,
	sudeep.holla
  Cc: alex.williamson, shameerali.kolothum.thodi

Several call sites are about to check whether a device belongs
to the PCI sub-hierarchy of a candidate PCI-PCI bridge.
Introduce an helper to perform that check.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 drivers/iommu/intel-iommu.c | 37 +++++++++++++++++++++++++++++--------
 1 file changed, 29 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 590a0e78d11d..15c2f9677491 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -736,12 +736,39 @@ static int iommu_dummy(struct device *dev)
 	return dev->archdata.iommu == DUMMY_DEVICE_DOMAIN_INFO;
 }
 
+/* is_downstream_to_pci_bridge - test if a device belongs to the
+ * PCI sub-hierarchy of a candidate PCI-PCI bridge
+ *
+ * @dev: candidate PCI device belonging to @bridge PCI sub-hierarchy
+ * @bridge: the candidate PCI-PCI bridge
+ *
+ * Return: true if @dev belongs to @bridge PCI sub-hierarchy
+ */
+static bool
+is_downstream_to_pci_bridge(struct device *dev, struct device *bridge)
+{
+	struct pci_dev *pdev, *pbridge;
+
+	if (!dev_is_pci(dev) || !dev_is_pci(bridge))
+		return false;
+
+	pdev = to_pci_dev(dev);
+	pbridge = to_pci_dev(bridge);
+
+	if (pbridge->subordinate &&
+	    pbridge->subordinate->number <= pdev->bus->number &&
+	    pbridge->subordinate->busn_res.end >= pdev->bus->number)
+		return true;
+
+	return false;
+}
+
 static struct intel_iommu *device_to_iommu(struct device *dev, u8 *bus, u8 *devfn)
 {
 	struct dmar_drhd_unit *drhd = NULL;
 	struct intel_iommu *iommu;
 	struct device *tmp;
-	struct pci_dev *ptmp, *pdev = NULL;
+	struct pci_dev *pdev = NULL;
 	u16 segment = 0;
 	int i;
 
@@ -787,13 +814,7 @@ static struct intel_iommu *device_to_iommu(struct device *dev, u8 *bus, u8 *devf
 				goto out;
 			}
 
-			if (!pdev || !dev_is_pci(tmp))
-				continue;
-
-			ptmp = to_pci_dev(tmp);
-			if (ptmp->subordinate &&
-			    ptmp->subordinate->number <= pdev->bus->number &&
-			    ptmp->subordinate->busn_res.end >= pdev->bus->number)
+			if (is_downstream_to_pci_bridge(dev, tmp))
 				goto got_pdev;
 		}
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v2 4/7] iommu/vt-d: Handle RMRR with PCI bridge device scopes
  2019-05-16  8:47 [PATCH v2 0/7] RMRR related fixes and enhancements Eric Auger
                   ` (2 preceding siblings ...)
  2019-05-16  8:47 ` [PATCH v2 3/7] iommu/vt-d: Introduce is_downstream_to_pci_bridge helper Eric Auger
@ 2019-05-16  8:47 ` Eric Auger
  2019-05-16  8:47 ` [PATCH v2 5/7] iommu/vt-d: Handle PCI bridge RMRR device scopes in intel_iommu_get_resv_regions Eric Auger
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Eric Auger @ 2019-05-16  8:47 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, joro, iommu, linux-kernel, dwmw2,
	lorenzo.pieralisi, robin.murphy, will.deacon, hanjun.guo,
	sudeep.holla
  Cc: alex.williamson, shameerali.kolothum.thodi

When reading the vtd specification and especially the
Reserved Memory Region Reporting Structure chapter,
it is not obvious a device scope element cannot be a
PCI-PCI bridge, in which case all downstream ports are
likely to access the reserved memory region. Let's handle
this case in device_has_rmrr.

Fixes: ea2447f700ca ("intel-iommu: Prevent devices with RMRRs from being placed into SI Domain")

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v1 -> v2:
- is_downstream_to_pci_bridge helper introduced in a separate patch
---
 drivers/iommu/intel-iommu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 15c2f9677491..7ed820e79313 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -2910,7 +2910,8 @@ static bool device_has_rmrr(struct device *dev)
 		 */
 		for_each_active_dev_scope(rmrr->devices,
 					  rmrr->devices_cnt, i, tmp)
-			if (tmp == dev) {
+			if (tmp == dev ||
+			    is_downstream_to_pci_bridge(dev, tmp)) {
 				rcu_read_unlock();
 				return true;
 			}
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v2 5/7] iommu/vt-d: Handle PCI bridge RMRR device scopes in intel_iommu_get_resv_regions
  2019-05-16  8:47 [PATCH v2 0/7] RMRR related fixes and enhancements Eric Auger
                   ` (3 preceding siblings ...)
  2019-05-16  8:47 ` [PATCH v2 4/7] iommu/vt-d: Handle RMRR with PCI bridge device scopes Eric Auger
@ 2019-05-16  8:47 ` Eric Auger
  2019-05-16  8:47 ` [PATCH v2 6/7] iommu: Introduce IOMMU_RESV_DIRECT_RELAXABLE reserved memory regions Eric Auger
  2019-05-16  8:47 ` [PATCH v2 7/7] iommu/vt-d: Differentiate relaxable and non relaxable RMRRs Eric Auger
  6 siblings, 0 replies; 9+ messages in thread
From: Eric Auger @ 2019-05-16  8:47 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, joro, iommu, linux-kernel, dwmw2,
	lorenzo.pieralisi, robin.murphy, will.deacon, hanjun.guo,
	sudeep.holla
  Cc: alex.williamson, shameerali.kolothum.thodi

In the case the RMRR device scope is a PCI-PCI bridge, let's check
the device belongs to the PCI sub-hierarchy.

Fixes: 0659b8dc45a6 ("iommu/vt-d: Implement reserved region get/put callbacks")

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 drivers/iommu/intel-iommu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 7ed820e79313..a36604f4900f 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5496,7 +5496,8 @@ static void intel_iommu_get_resv_regions(struct device *device,
 			struct iommu_resv_region *resv;
 			size_t length;
 
-			if (i_dev != device)
+			if (i_dev != device &&
+			    !is_downstream_to_pci_bridge(device, i_dev))
 				continue;
 
 			length = rmrr->end_address - rmrr->base_address + 1;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v2 6/7] iommu: Introduce IOMMU_RESV_DIRECT_RELAXABLE reserved memory regions
  2019-05-16  8:47 [PATCH v2 0/7] RMRR related fixes and enhancements Eric Auger
                   ` (4 preceding siblings ...)
  2019-05-16  8:47 ` [PATCH v2 5/7] iommu/vt-d: Handle PCI bridge RMRR device scopes in intel_iommu_get_resv_regions Eric Auger
@ 2019-05-16  8:47 ` Eric Auger
  2019-05-16  9:32   ` Auger Eric
  2019-05-16  8:47 ` [PATCH v2 7/7] iommu/vt-d: Differentiate relaxable and non relaxable RMRRs Eric Auger
  6 siblings, 1 reply; 9+ messages in thread
From: Eric Auger @ 2019-05-16  8:47 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, joro, iommu, linux-kernel, dwmw2,
	lorenzo.pieralisi, robin.murphy, will.deacon, hanjun.guo,
	sudeep.holla
  Cc: alex.williamson, shameerali.kolothum.thodi

Introduce a new type for reserved region. This corresponds
to directly mapped regions which are known to be relaxable
in some specific conditions, such as device assignment use
case. Well known examples are those used by USB controllers
providing PS/2 keyboard emulation for pre-boot BIOS and
early BOOT or RMRRs associated to IGD working in legacy mode.

Since commit c875d2c1b808 ("iommu/vt-d: Exclude devices using RMRRs
from IOMMU API domains") and commit 18436afdc11a ("iommu/vt-d: Allow
RMRR on graphics devices too"), those regions are currently
considered "safe" with respect to device assignment use case
which requires a non direct mapping at IOMMU physical level
(RAM GPA -> HPA mapping).

Those RMRRs currently exist and sometimes the device is
attempting to access it but this has not been considered
an issue until now.

However at the moment, iommu_get_group_resv_regions() is
not able to make any difference between directly mapped
regions: those which must be absolutely enforced and those
like above ones which are known as relaxable.

This is a blocker for reporting severe conflicts between
non relaxable RMRRs (like MSI doorbells) and guest GPA space.

With this new reserved region type we will be able to use
iommu_get_group_resv_regions() to enumerate the IOVA space
that is usable through the IOMMU API without introducing
regressions with respect to existing device assignment
use cases (USB and IGD).

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

Note: At the moment the sysfs ABI is not changed. However I wonder
whether it wouldn't be preferable to report the direct region as
"direct_relaxed" there. At the moment, in case the same direct
region is used by 2 devices, one USB/GFX and another not belonging
to the previous categories, the direct region will be output twice
with "direct" type.

This would unblock Shameer's series:
[PATCH v6 0/7] vfio/type1: Add support for valid iova list management
https://patchwork.kernel.org/patch/10425309/

which failed to get pulled for 4.18 merge window due to IGD
device assignment regression.
---
 drivers/iommu/iommu.c | 12 +++++++-----
 include/linux/iommu.h |  6 ++++++
 2 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index ae4ea5c0e6f9..84dcb6af6511 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -73,10 +73,11 @@ struct iommu_group_attribute {
 };
 
 static const char * const iommu_group_resv_type_string[] = {
-	[IOMMU_RESV_DIRECT]	= "direct",
-	[IOMMU_RESV_RESERVED]	= "reserved",
-	[IOMMU_RESV_MSI]	= "msi",
-	[IOMMU_RESV_SW_MSI]	= "msi",
+	[IOMMU_RESV_DIRECT]			= "direct",
+	[IOMMU_RESV_DIRECT_RELAXABLE]		= "direct",
+	[IOMMU_RESV_RESERVED]			= "reserved",
+	[IOMMU_RESV_MSI]			= "msi",
+	[IOMMU_RESV_SW_MSI]			= "msi",
 };
 
 #define IOMMU_GROUP_ATTR(_name, _mode, _show, _store)		\
@@ -573,7 +574,8 @@ static int iommu_group_create_direct_mappings(struct iommu_group *group,
 		start = ALIGN(entry->start, pg_size);
 		end   = ALIGN(entry->start + entry->length, pg_size);
 
-		if (entry->type != IOMMU_RESV_DIRECT)
+		if (entry->type != IOMMU_RESV_DIRECT ||
+		    entry->type != IOMMU_RESV_DIRECT_RELAXABLE)
 			continue;
 
 		for (addr = start; addr < end; addr += pg_size) {
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index ba91666998fb..14a521f85f14 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -135,6 +135,12 @@ enum iommu_attr {
 enum iommu_resv_type {
 	/* Memory regions which must be mapped 1:1 at all times */
 	IOMMU_RESV_DIRECT,
+	/*
+	 * Memory regions which are advertised to be 1:1 but are
+	 * commonly considered relaxable in some conditions,
+	 * for instance in device assignment use case (USB, Graphics)
+	 */
+	IOMMU_RESV_DIRECT_RELAXABLE,
 	/* Arbitrary "never map this or give it to a device" address ranges */
 	IOMMU_RESV_RESERVED,
 	/* Hardware MSI region (untranslated) */
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v2 7/7] iommu/vt-d: Differentiate relaxable and non relaxable RMRRs
  2019-05-16  8:47 [PATCH v2 0/7] RMRR related fixes and enhancements Eric Auger
                   ` (5 preceding siblings ...)
  2019-05-16  8:47 ` [PATCH v2 6/7] iommu: Introduce IOMMU_RESV_DIRECT_RELAXABLE reserved memory regions Eric Auger
@ 2019-05-16  8:47 ` Eric Auger
  6 siblings, 0 replies; 9+ messages in thread
From: Eric Auger @ 2019-05-16  8:47 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, joro, iommu, linux-kernel, dwmw2,
	lorenzo.pieralisi, robin.murphy, will.deacon, hanjun.guo,
	sudeep.holla
  Cc: alex.williamson, shameerali.kolothum.thodi

Now we have a new IOMMU_RESV_DIRECT_RELAXABLE reserved memory
region type, let's report USB and GFX RMRRs as relaxable ones.

This allows to have a finer reporting at IOMMU API level of
reserved memory regions. This will be exploitable by VFIO to
define the usable IOVA range and detect potential conflicts
between the guest physical address space and host reserved
regions.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 drivers/iommu/intel-iommu.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index a36604f4900f..af1d65fdedfc 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5493,7 +5493,9 @@ static void intel_iommu_get_resv_regions(struct device *device,
 	for_each_rmrr_units(rmrr) {
 		for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
 					  i, i_dev) {
+			struct pci_dev *pdev = to_pci_dev(device);
 			struct iommu_resv_region *resv;
+			enum iommu_resv_type type;
 			size_t length;
 
 			if (i_dev != device &&
@@ -5501,9 +5503,13 @@ static void intel_iommu_get_resv_regions(struct device *device,
 				continue;
 
 			length = rmrr->end_address - rmrr->base_address + 1;
+
+			type = (pdev &&
+				(IS_USB_DEVICE(pdev) || IS_GFX_DEVICE(pdev))) ?
+				IOMMU_RESV_DIRECT_RELAXABLE : IOMMU_RESV_DIRECT;
+
 			resv = iommu_alloc_resv_region(rmrr->base_address,
-						       length, prot,
-						       IOMMU_RESV_DIRECT,
+						       length, prot, type,
 						       GFP_ATOMIC);
 			if (!resv)
 				break;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 6/7] iommu: Introduce IOMMU_RESV_DIRECT_RELAXABLE reserved memory regions
  2019-05-16  8:47 ` [PATCH v2 6/7] iommu: Introduce IOMMU_RESV_DIRECT_RELAXABLE reserved memory regions Eric Auger
@ 2019-05-16  9:32   ` Auger Eric
  0 siblings, 0 replies; 9+ messages in thread
From: Auger Eric @ 2019-05-16  9:32 UTC (permalink / raw)
  To: eric.auger.pro, joro, iommu, linux-kernel, dwmw2,
	lorenzo.pieralisi, robin.murphy, will.deacon, hanjun.guo,
	sudeep.holla
  Cc: alex.williamson, shameerali.kolothum.thodi

Hi

On 5/16/19 10:47 AM, Eric Auger wrote:
> Introduce a new type for reserved region. This corresponds
> to directly mapped regions which are known to be relaxable
> in some specific conditions, such as device assignment use
> case. Well known examples are those used by USB controllers
> providing PS/2 keyboard emulation for pre-boot BIOS and
> early BOOT or RMRRs associated to IGD working in legacy mode.
> 
> Since commit c875d2c1b808 ("iommu/vt-d: Exclude devices using RMRRs
> from IOMMU API domains") and commit 18436afdc11a ("iommu/vt-d: Allow
> RMRR on graphics devices too"), those regions are currently
> considered "safe" with respect to device assignment use case
> which requires a non direct mapping at IOMMU physical level
> (RAM GPA -> HPA mapping).
> 
> Those RMRRs currently exist and sometimes the device is
> attempting to access it but this has not been considered
> an issue until now.
> 
> However at the moment, iommu_get_group_resv_regions() is
> not able to make any difference between directly mapped
> regions: those which must be absolutely enforced and those
> like above ones which are known as relaxable.
> 
> This is a blocker for reporting severe conflicts between
> non relaxable RMRRs (like MSI doorbells) and guest GPA space.
> 
> With this new reserved region type we will be able to use
> iommu_get_group_resv_regions() to enumerate the IOVA space
> that is usable through the IOMMU API without introducing
> regressions with respect to existing device assignment
> use cases (USB and IGD).
> 
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> 
> ---
> 
> Note: At the moment the sysfs ABI is not changed. However I wonder
> whether it wouldn't be preferable to report the direct region as
> "direct_relaxed" there. At the moment, in case the same direct
> region is used by 2 devices, one USB/GFX and another not belonging
> to the previous categories, the direct region will be output twice
> with "direct" type.
> 
> This would unblock Shameer's series:
> [PATCH v6 0/7] vfio/type1: Add support for valid iova list management
> https://patchwork.kernel.org/patch/10425309/
> 
> which failed to get pulled for 4.18 merge window due to IGD
> device assignment regression.
> ---
>  drivers/iommu/iommu.c | 12 +++++++-----
>  include/linux/iommu.h |  6 ++++++
>  2 files changed, 13 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index ae4ea5c0e6f9..84dcb6af6511 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -73,10 +73,11 @@ struct iommu_group_attribute {
>  };
>  
>  static const char * const iommu_group_resv_type_string[] = {
> -	[IOMMU_RESV_DIRECT]	= "direct",
> -	[IOMMU_RESV_RESERVED]	= "reserved",
> -	[IOMMU_RESV_MSI]	= "msi",
> -	[IOMMU_RESV_SW_MSI]	= "msi",
> +	[IOMMU_RESV_DIRECT]			= "direct",
> +	[IOMMU_RESV_DIRECT_RELAXABLE]		= "direct",
> +	[IOMMU_RESV_RESERVED]			= "reserved",
> +	[IOMMU_RESV_MSI]			= "msi",
> +	[IOMMU_RESV_SW_MSI]			= "msi",
>  };
>  
>  #define IOMMU_GROUP_ATTR(_name, _mode, _show, _store)		\
> @@ -573,7 +574,8 @@ static int iommu_group_create_direct_mappings(struct iommu_group *group,
>  		start = ALIGN(entry->start, pg_size);
>  		end   = ALIGN(entry->start + entry->length, pg_size);
>  
> -		if (entry->type != IOMMU_RESV_DIRECT)
> +		if (entry->type != IOMMU_RESV_DIRECT ||
It must be a "&&" and not "||"

Respinning ...

Eric

> +		    entry->type != IOMMU_RESV_DIRECT_RELAXABLE)
>  			continue;
>  
>  		for (addr = start; addr < end; addr += pg_size) {
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index ba91666998fb..14a521f85f14 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -135,6 +135,12 @@ enum iommu_attr {
>  enum iommu_resv_type {
>  	/* Memory regions which must be mapped 1:1 at all times */
>  	IOMMU_RESV_DIRECT,
> +	/*
> +	 * Memory regions which are advertised to be 1:1 but are
> +	 * commonly considered relaxable in some conditions,
> +	 * for instance in device assignment use case (USB, Graphics)
> +	 */
> +	IOMMU_RESV_DIRECT_RELAXABLE,
>  	/* Arbitrary "never map this or give it to a device" address ranges */
>  	IOMMU_RESV_RESERVED,
>  	/* Hardware MSI region (untranslated) */
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2019-05-16  9:32 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-16  8:47 [PATCH v2 0/7] RMRR related fixes and enhancements Eric Auger
2019-05-16  8:47 ` [PATCH v2 1/7] iommu: Pass a GFP flag parameter to iommu_alloc_resv_region() Eric Auger
2019-05-16  8:47 ` [PATCH v2 2/7] iommu/vt-d: Duplicate iommu_resv_region objects per device list Eric Auger
2019-05-16  8:47 ` [PATCH v2 3/7] iommu/vt-d: Introduce is_downstream_to_pci_bridge helper Eric Auger
2019-05-16  8:47 ` [PATCH v2 4/7] iommu/vt-d: Handle RMRR with PCI bridge device scopes Eric Auger
2019-05-16  8:47 ` [PATCH v2 5/7] iommu/vt-d: Handle PCI bridge RMRR device scopes in intel_iommu_get_resv_regions Eric Auger
2019-05-16  8:47 ` [PATCH v2 6/7] iommu: Introduce IOMMU_RESV_DIRECT_RELAXABLE reserved memory regions Eric Auger
2019-05-16  9:32   ` Auger Eric
2019-05-16  8:47 ` [PATCH v2 7/7] iommu/vt-d: Differentiate relaxable and non relaxable RMRRs Eric Auger

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).