linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/4] iommu/vt-d: Fix-up device-domain relationship by refactoring to use iommu group default domain.
@ 2019-03-04 15:41 James Sewart
  2019-03-04 15:45 ` [PATCH 1/4] iommu: Move iommu_group_create_direct_mappings to after device_attach James Sewart
                   ` (3 more replies)
  0 siblings, 4 replies; 55+ messages in thread
From: James Sewart @ 2019-03-04 15:41 UTC (permalink / raw)
  To: iommu; +Cc: Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel, Lu Baolu

Hey,

This patchset moves IOMMU_DOMAIN_DMA iommu domain management to iommu.c. 
This avoids the use of find_or_alloc_domain, whose domain assignment is 
inconsistent with the iommu grouping as determined by pci_device_group.

Patch 3 permits domain type IOMMU_DOMAIN_DMA to be allocated via the 
iommu_ops api, allowing the default_domain of an iommu group to be set in 
iommu.c. This domain will be attached to every device that is brought up 
with an iommu group, and each device's reserved regions will be mapped using 
regions returned by get_resv_regions.
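
The intended flow can be sketched outside the kernel as a minimal,
self-contained C model (the struct and function names below are invented
for illustration and are not the driver's API): every reserved region is
handed to the apply callback, and only direct regions are identity-mapped
into the default domain.

```c
#include <stddef.h>

struct resv_region {
	unsigned long start;
	unsigned long length;
	int direct;		/* IOMMU_RESV_DIRECT-style region? */
};

/*
 * Walk a device's reserved regions the way the generic code would after
 * this series: every region is reserved (modelled here by counting the
 * reserved PFNs), and only direct regions are identity-mapped.
 * Returns the number of regions that were identity-mapped.
 */
static int create_direct_mappings(const struct resv_region *r, size_t n,
				  unsigned long *reserved_pfns)
{
	int mapped = 0;

	*reserved_pfns = 0;
	for (size_t i = 0; i < n; i++) {
		*reserved_pfns += r[i].length >> 12;	/* apply_resv_region */
		if (r[i].direct)
			mapped++;			/* iommu_map() the range */
	}
	return mapped;
}
```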

In intel_iommu_domain_alloc we don't know which IOMMU this domain will be 
associated with, so we defer full initialisation until 
intel_iommu_attach_device. Currently iommu.c:iommu_group_add_device will 
try to map a device's reserved regions before attaching the domain, which 
would cause issues if the domain is not fully initialised. This is 
addressed in patch 1 by moving the mapping to after attaching.
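
The ordering problem can be modelled in a few lines of plain C (the toy_*
names are hypothetical stand-ins, not kernel symbols): a domain whose
initialisation is deferred must reject mappings until attach has run, since
attach is the first point where the IOMMU is known.

```c
#define DOMAIN_FLAG_INITIALISED	(1 << 0)

struct toy_domain {
	unsigned int flags;
	void *pgd;		/* stands in for the page-table root */
};

/* Mapping needs the page tables that only initialisation sets up. */
static int toy_map(struct toy_domain *d)
{
	if (!(d->flags & DOMAIN_FLAG_INITIALISED))
		return -1;	/* would have dereferenced a NULL pgd */
	return 0;
}

/* Attach is the first point where the IOMMU is known, so a deferred
 * domain is initialised here before anything is mapped through it. */
static int toy_attach(struct toy_domain *d)
{
	static char fake_pgd[64];

	if (!(d->flags & DOMAIN_FLAG_INITIALISED)) {
		d->pgd = fake_pgd;
		d->flags |= DOMAIN_FLAG_INITIALISED;
	}
	return 0;
}
```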

Patch 2 implements the apply_resv_region callback, used in 
iommu_group_create_direct_mappings to mark the reserved regions as 
non-mappable for the dma_map_ops api.
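
The effect reserve_iova has on the allocator can be sketched with a toy PFN
allocator (all names below are illustrative; only the IOVA_PFN shift mirrors
the driver's macro): once a range is reserved, allocations never land
inside it.

```c
#define IOVA_SHIFT	12
#define IOVA_PFN(addr)	((addr) >> IOVA_SHIFT)

struct pfn_range {
	unsigned long lo, hi;	/* inclusive bounds, like reserve_iova() */
};

/*
 * Toy allocator: return the first PFN at or above `start` that does not
 * fall inside any reserved range. Reserving a region in the real driver
 * has the same effect on the domain's IOVA allocator.
 */
static unsigned long alloc_pfn(const struct pfn_range *resv, int n,
			       unsigned long start)
{
	unsigned long pfn = start;
	int i = 0;

	while (i < n) {
		if (pfn >= resv[i].lo && pfn <= resv[i].hi) {
			pfn = resv[i].hi + 1;	/* skip the whole range */
			i = 0;			/* rescan against all ranges */
		} else {
			i++;
		}
	}
	return pfn;
}
```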

Patch 4 removes the domain lazy allocation logic. Before this patch, the 
lazy allocation logic would not be used, as any domain allocated via 
these paths would be replaced when attaching the group default domain. 
Default domain allocation has been tested with and without this patch on 
4.19.

Cheers,
James.

James Sewart (4):
  iommu: Move iommu_group_create_direct_mappings to after device_attach
  iommu/vt-d: Implement apply_resv_region for reserving IOVA ranges
  iommu/vt-d: Allow IOMMU_DOMAIN_DMA and IOMMU_DOMAIN_IDENTITY to be allocated
  iommu/vt-d: Remove lazy allocation of domains

 drivers/iommu/intel-iommu.c | 329 ++++++++++++------------------------
 drivers/iommu/iommu.c       |   4 +-
 2 files changed, 108 insertions(+), 225 deletions(-)

-- 
2.17.1



* [PATCH 1/4] iommu: Move iommu_group_create_direct_mappings to after device_attach
  2019-03-04 15:41 [PATCH 0/4] iommu/vt-d: Fix-up device-domain relationship by refactoring to use iommu group default domain James Sewart
@ 2019-03-04 15:45 ` James Sewart
  2019-03-04 15:46 ` [PATCH 2/4] iommu/vt-d: Implement apply_resv_region for reserving IOVA ranges James Sewart
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 55+ messages in thread
From: James Sewart @ 2019-03-04 15:45 UTC (permalink / raw)
  To: iommu; +Cc: Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel, Lu Baolu

If an IOMMU driver needs to know which IOMMU a domain is associated with
in order to initialise that domain, it will do so in device_attach;
before that point it is not safe to call iommu_ops.

Signed-off-by: James Sewart <jamessewart@arista.com>
---
 drivers/iommu/iommu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 3ed4db334341..1c6ffbb2d20e 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -652,8 +652,6 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev)
 
 	dev->iommu_group = group;
 
-	iommu_group_create_direct_mappings(group, dev);
-
 	mutex_lock(&group->mutex);
 	list_add_tail(&device->list, &group->devices);
 	if (group->domain)
@@ -662,6 +660,8 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev)
 	if (ret)
 		goto err_put_group;
 
+	iommu_group_create_direct_mappings(group, dev);
+
 	/* Notify any listeners about change to group. */
 	blocking_notifier_call_chain(&group->notifier,
 				     IOMMU_GROUP_NOTIFY_ADD_DEVICE, dev);
-- 
2.17.1





* [PATCH 2/4] iommu/vt-d: Implement apply_resv_region for reserving IOVA ranges
  2019-03-04 15:41 [PATCH 0/4] iommu/vt-d: Fix-up device-domain relationship by refactoring to use iommu group default domain James Sewart
  2019-03-04 15:45 ` [PATCH 1/4] iommu: Move iommu_group_create_direct_mappings to after device_attach James Sewart
@ 2019-03-04 15:46 ` James Sewart
  2019-03-04 15:46   ` [PATCH 3/4] iommu/vt-d: Allow IOMMU_DOMAIN_DMA and IOMMU_DOMAIN_IDENTITY to be allocated James Sewart
  2019-03-05  6:05 ` [PATCH 0/4] iommu/vt-d: Fix-up device-domain relationship by refactoring to use iommu group default domain Lu Baolu
  2019-03-14 11:56 ` [PATCH v2 0/7] " James Sewart
  3 siblings, 1 reply; 55+ messages in thread
From: James Sewart @ 2019-03-04 15:46 UTC (permalink / raw)
  To: iommu; +Cc: Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel, Lu Baolu

This callback is used by iommu.c before creating identity mappings for
reserved ranges, to ensure dma-map-ops will never remap these addresses.

Signed-off-by: James Sewart <jamessewart@arista.com>
---
 drivers/iommu/intel-iommu.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 78188bf7e90d..8e0a4e2ff77f 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5299,6 +5299,19 @@ static void intel_iommu_put_resv_regions(struct device *dev,
 	}
 }
 
+static void intel_iommu_apply_resv_region(struct device *dev,
+		struct iommu_domain *domain,
+		struct iommu_resv_region *region)
+{
+	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
+	unsigned long start, end;
+
+	start = IOVA_PFN(region->start);
+	end   = IOVA_PFN(region->start + region->length - 1);
+
+	WARN_ON_ONCE(reserve_iova(&dmar_domain->iovad, start, end) == NULL);
+}
+
 #ifdef CONFIG_INTEL_IOMMU_SVM
 int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct intel_svm_dev *sdev)
 {
@@ -5392,6 +5405,7 @@ const struct iommu_ops intel_iommu_ops = {
 	.remove_device		= intel_iommu_remove_device,
 	.get_resv_regions	= intel_iommu_get_resv_regions,
 	.put_resv_regions	= intel_iommu_put_resv_regions,
+	.apply_resv_region	= intel_iommu_apply_resv_region,
 	.device_group		= pci_device_group,
 	.pgsize_bitmap		= INTEL_IOMMU_PGSIZES,
 };
-- 
2.17.1



* [PATCH 3/4] iommu/vt-d: Allow IOMMU_DOMAIN_DMA and IOMMU_DOMAIN_IDENTITY to be allocated
  2019-03-04 15:46 ` [PATCH 2/4] iommu/vt-d: Implement apply_resv_region for reserving IOVA ranges James Sewart
@ 2019-03-04 15:46   ` James Sewart
  2019-03-04 15:47     ` [PATCH 4/4] iommu/vt-d: Remove lazy allocation of domains James Sewart
                       ` (2 more replies)
  0 siblings, 3 replies; 55+ messages in thread
From: James Sewart @ 2019-03-04 15:46 UTC (permalink / raw)
  To: iommu; +Cc: Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel, Lu Baolu

Allowing IOMMU_DOMAIN_DMA type IOMMU domain to be allocated allows the
default_domain of an iommu_group to be set. This delegates device-domain
relationships to the generic IOMMU code.

Signed-off-by: James Sewart <jamessewart@arista.com>
---
 drivers/iommu/intel-iommu.c | 113 +++++++++++++++++++++++++++---------
 1 file changed, 84 insertions(+), 29 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 8e0a4e2ff77f..71cd6bbfec05 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -309,6 +309,14 @@ static int hw_pass_through = 1;
 /* si_domain contains mulitple devices */
 #define DOMAIN_FLAG_STATIC_IDENTITY	(1 << 1)
 
+/* Domain managed externally, don't cleanup if it isn't attached
+ * to any devices. */
+#define DOMAIN_FLAG_MANAGED_EXTERNALLY	(1 << 2)
+
+/* Set after domain initialisation. Used when allocating dma domains to
+ * defer domain initialisation until it is attached to a device */
+#define DOMAIN_FLAG_INITIALISED	(1 << 4)
+
 #define for_each_domain_iommu(idx, domain)			\
 	for (idx = 0; idx < g_num_of_iommus; idx++)		\
 		if (domain->iommu_refcnt[idx])
@@ -558,6 +566,16 @@ static inline int domain_type_is_vm_or_si(struct dmar_domain *domain)
 				DOMAIN_FLAG_STATIC_IDENTITY);
 }
 
+static inline int domain_managed_externally(struct dmar_domain *domain)
+{
+	return domain->flags & DOMAIN_FLAG_MANAGED_EXTERNALLY;
+}
+
+static inline int domain_is_initialised(struct dmar_domain *domain)
+{
+	return domain->flags & DOMAIN_FLAG_INITIALISED;
+}
+
 static inline int domain_pfn_supported(struct dmar_domain *domain,
 				       unsigned long pfn)
 {
@@ -1662,7 +1680,7 @@ static void disable_dmar_iommu(struct intel_iommu *iommu)
 
 		__dmar_remove_one_dev_info(info);
 
-		if (!domain_type_is_vm_or_si(domain)) {
+		if (!domain_managed_externally(domain)) {
 			/*
 			 * The domain_exit() function  can't be called under
 			 * device_domain_lock, as it takes this lock itself.
@@ -1895,6 +1913,7 @@ static int domain_init(struct dmar_domain *domain, struct intel_iommu *iommu,
 	domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid);
 	if (!domain->pgd)
 		return -ENOMEM;
+	domain->flags |= DOMAIN_FLAG_INITIALISED;
 	__iommu_flush_cache(iommu, domain->pgd, PAGE_SIZE);
 	return 0;
 }
@@ -2807,14 +2826,10 @@ static inline void iommu_prepare_isa(void)
 
 static int md_domain_init(struct dmar_domain *domain, int guest_width);
 
-static int __init si_domain_init(int hw)
+static int __init si_domain_init(struct dmar_domain *si_domain, int hw)
 {
 	int nid, ret = 0;
 
-	si_domain = alloc_domain(DOMAIN_FLAG_STATIC_IDENTITY);
-	if (!si_domain)
-		return -EFAULT;
-
 	if (md_domain_init(si_domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) {
 		domain_exit(si_domain);
 		return -EFAULT;
@@ -3417,9 +3432,16 @@ static int __init init_dmars(void)
 	check_tylersburg_isoch();
 
 	if (iommu_identity_mapping) {
-		ret = si_domain_init(hw_pass_through);
-		if (ret)
+		si_domain = alloc_domain(DOMAIN_FLAG_STATIC_IDENTITY);
+		if (!si_domain) {
+			ret = -EFAULT;
 			goto free_iommu;
+		}
+		ret = si_domain_init(si_domain, hw_pass_through);
+		if (ret) {
+			domain_exit(si_domain);
+			goto free_iommu;
+		}
 	}
 
 
@@ -4575,7 +4597,7 @@ static int device_notifier(struct notifier_block *nb,
 		return 0;
 
 	dmar_remove_one_dev_info(domain, dev);
-	if (!domain_type_is_vm_or_si(domain) && list_empty(&domain->devices))
+	if (!domain_managed_externally(domain) && list_empty(&domain->devices))
 		domain_exit(domain);
 
 	return 0;
@@ -5020,6 +5042,7 @@ static int md_domain_init(struct dmar_domain *domain, int guest_width)
 	domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid);
 	if (!domain->pgd)
 		return -ENOMEM;
+	domain->flags |= DOMAIN_FLAG_INITIALISED;
 	domain_flush_cache(domain, domain->pgd, PAGE_SIZE);
 	return 0;
 }
@@ -5028,28 +5051,54 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type)
 {
 	struct dmar_domain *dmar_domain;
 	struct iommu_domain *domain;
+	int flags = DOMAIN_FLAG_MANAGED_EXTERNALLY;
 
-	if (type != IOMMU_DOMAIN_UNMANAGED)
-		return NULL;
-
-	dmar_domain = alloc_domain(DOMAIN_FLAG_VIRTUAL_MACHINE);
-	if (!dmar_domain) {
-		pr_err("Can't allocate dmar_domain\n");
-		return NULL;
-	}
-	if (md_domain_init(dmar_domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) {
-		pr_err("Domain initialization failed\n");
-		domain_exit(dmar_domain);
+	switch (type) {
+	case IOMMU_DOMAIN_UNMANAGED:
+		flags |= DOMAIN_FLAG_VIRTUAL_MACHINE | DOMAIN_FLAG_INITIALISED;
+		dmar_domain = alloc_domain(flags);
+		if (!dmar_domain) {
+			pr_err("Can't allocate dmar_domain\n");
+			return NULL;
+		}
+		if (md_domain_init(dmar_domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) {
+			pr_err("Domain initialization failed\n");
+			domain_exit(dmar_domain);
+			return NULL;
+		}
+		domain_update_iommu_cap(dmar_domain);
+		domain = &dmar_domain->domain;
+		domain->geometry.aperture_start = 0;
+		domain->geometry.aperture_end   = __DOMAIN_MAX_ADDR(dmar_domain->gaw);
+		domain->geometry.force_aperture = true;
+		break;
+	case IOMMU_DOMAIN_DMA:
+		dmar_domain = alloc_domain(flags);
+		if (!dmar_domain) {
+			pr_err("Can't allocate dmar_domain\n");
+			return NULL;
+		}
+		// init domain in device attach when we know IOMMU capabilities
+		break;
+	case IOMMU_DOMAIN_IDENTITY:
+		flags |= DOMAIN_FLAG_STATIC_IDENTITY | DOMAIN_FLAG_INITIALISED;
+		dmar_domain = alloc_domain(flags);
+		if (!dmar_domain) {
+			pr_err("Can't allocate dmar_domain\n");
+			return NULL;
+		}
+		if (si_domain_init(dmar_domain, 0)) {
+			pr_err("Domain initialization failed\n");
+			domain_exit(dmar_domain);
+			return NULL;
+		}
+		break;
+	default:
 		return NULL;
 	}
-	domain_update_iommu_cap(dmar_domain);
-
-	domain = &dmar_domain->domain;
-	domain->geometry.aperture_start = 0;
-	domain->geometry.aperture_end   = __DOMAIN_MAX_ADDR(dmar_domain->gaw);
-	domain->geometry.force_aperture = true;
 
-	return domain;
+	dmar_domain->domain.type = type;
+	return &dmar_domain->domain;
 }
 
 static void intel_iommu_domain_free(struct iommu_domain *domain)
@@ -5080,8 +5129,8 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
 			dmar_remove_one_dev_info(old_domain, dev);
 			rcu_read_unlock();
 
-			if (!domain_type_is_vm_or_si(old_domain) &&
-			     list_empty(&old_domain->devices))
+			if (list_empty(&old_domain->devices) &&
+			     !domain_managed_externally(old_domain))
 				domain_exit(old_domain);
 		}
 	}
@@ -5090,6 +5139,12 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
 	if (!iommu)
 		return -ENODEV;
 
+	// Initialise domain with IOMMU capabilities if it isn't already initialised
+	if (!domain_is_initialised(dmar_domain)) {
+		if (domain_init(dmar_domain, iommu, DEFAULT_DOMAIN_ADDRESS_WIDTH))
+			return -ENOMEM;
+	}
+
 	/* check if this iommu agaw is sufficient for max mapped address */
 	addr_width = agaw_to_width(iommu->agaw);
 	if (addr_width > cap_mgaw(iommu->cap))
-- 
2.17.1



* [PATCH 4/4] iommu/vt-d: Remove lazy allocation of domains
  2019-03-04 15:46   ` [PATCH 3/4] iommu/vt-d: Allow IOMMU_DOMAIN_DMA and IOMMU_DOMAIN_IDENTITY to be allocated James Sewart
@ 2019-03-04 15:47     ` James Sewart
  2019-03-05  6:59       ` Lu Baolu
  2019-03-05  6:46     ` [PATCH 3/4] iommu/vt-d: Allow IOMMU_DOMAIN_DMA and IOMMU_DOMAIN_IDENTITY to be allocated Lu Baolu
  2019-03-08  1:20     ` Dmitry Safonov
  2 siblings, 1 reply; 55+ messages in thread
From: James Sewart @ 2019-03-04 15:47 UTC (permalink / raw)
  To: iommu; +Cc: Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel, Lu Baolu

The generic IOMMU code will allocate and attach a dma ops domain to each
device that comes online, replacing any lazily allocated domain. This
patch removes RMRR application at iommu init time, as no domain will be
attached to any device at that point; iommu.c will apply these mappings
after attaching a device, via iommu_group_create_direct_mappings.

Signed-off-by: James Sewart <jamessewart@arista.com>
---
 drivers/iommu/intel-iommu.c | 202 ++----------------------------------
 1 file changed, 8 insertions(+), 194 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 71cd6bbfec05..282257e2628d 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -2595,118 +2595,6 @@ static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu,
 	return domain;
 }
 
-static int get_last_alias(struct pci_dev *pdev, u16 alias, void *opaque)
-{
-	*(u16 *)opaque = alias;
-	return 0;
-}
-
-static struct dmar_domain *find_or_alloc_domain(struct device *dev, int gaw)
-{
-	struct device_domain_info *info = NULL;
-	struct dmar_domain *domain = NULL;
-	struct intel_iommu *iommu;
-	u16 dma_alias;
-	unsigned long flags;
-	u8 bus, devfn;
-
-	iommu = device_to_iommu(dev, &bus, &devfn);
-	if (!iommu)
-		return NULL;
-
-	if (dev_is_pci(dev)) {
-		struct pci_dev *pdev = to_pci_dev(dev);
-
-		pci_for_each_dma_alias(pdev, get_last_alias, &dma_alias);
-
-		spin_lock_irqsave(&device_domain_lock, flags);
-		info = dmar_search_domain_by_dev_info(pci_domain_nr(pdev->bus),
-						      PCI_BUS_NUM(dma_alias),
-						      dma_alias & 0xff);
-		if (info) {
-			iommu = info->iommu;
-			domain = info->domain;
-		}
-		spin_unlock_irqrestore(&device_domain_lock, flags);
-
-		/* DMA alias already has a domain, use it */
-		if (info)
-			goto out;
-	}
-
-	/* Allocate and initialize new domain for the device */
-	domain = alloc_domain(0);
-	if (!domain)
-		return NULL;
-	if (domain_init(domain, iommu, gaw)) {
-		domain_exit(domain);
-		return NULL;
-	}
-
-out:
-
-	return domain;
-}
-
-static struct dmar_domain *set_domain_for_dev(struct device *dev,
-					      struct dmar_domain *domain)
-{
-	struct intel_iommu *iommu;
-	struct dmar_domain *tmp;
-	u16 req_id, dma_alias;
-	u8 bus, devfn;
-
-	iommu = device_to_iommu(dev, &bus, &devfn);
-	if (!iommu)
-		return NULL;
-
-	req_id = ((u16)bus << 8) | devfn;
-
-	if (dev_is_pci(dev)) {
-		struct pci_dev *pdev = to_pci_dev(dev);
-
-		pci_for_each_dma_alias(pdev, get_last_alias, &dma_alias);
-
-		/* register PCI DMA alias device */
-		if (req_id != dma_alias) {
-			tmp = dmar_insert_one_dev_info(iommu, PCI_BUS_NUM(dma_alias),
-					dma_alias & 0xff, NULL, domain);
-
-			if (!tmp || tmp != domain)
-				return tmp;
-		}
-	}
-
-	tmp = dmar_insert_one_dev_info(iommu, bus, devfn, dev, domain);
-	if (!tmp || tmp != domain)
-		return tmp;
-
-	return domain;
-}
-
-static struct dmar_domain *get_domain_for_dev(struct device *dev, int gaw)
-{
-	struct dmar_domain *domain, *tmp;
-
-	domain = find_domain(dev);
-	if (domain)
-		goto out;
-
-	domain = find_or_alloc_domain(dev, gaw);
-	if (!domain)
-		goto out;
-
-	tmp = set_domain_for_dev(dev, domain);
-	if (!tmp || domain != tmp) {
-		domain_exit(domain);
-		domain = tmp;
-	}
-
-out:
-
-	return domain;
-}
-
 static int iommu_domain_identity_map(struct dmar_domain *domain,
 				     unsigned long long start,
 				     unsigned long long end)
@@ -2779,7 +2667,7 @@ static int iommu_prepare_identity_map(struct device *dev,
 	struct dmar_domain *domain;
 	int ret;
 
-	domain = get_domain_for_dev(dev, DEFAULT_DOMAIN_ADDRESS_WIDTH);
+	domain = find_domain(dev);
 	if (!domain)
 		return -ENOMEM;
 
@@ -3301,11 +3189,9 @@ static int copy_translation_tables(struct intel_iommu *iommu)
 static int __init init_dmars(void)
 {
 	struct dmar_drhd_unit *drhd;
-	struct dmar_rmrr_unit *rmrr;
 	bool copied_tables = false;
-	struct device *dev;
 	struct intel_iommu *iommu;
-	int i, ret;
+	int ret;
 
 	/*
 	 * for each drhd
@@ -3466,32 +3352,6 @@ static int __init init_dmars(void)
 			goto free_iommu;
 		}
 	}
-	/*
-	 * For each rmrr
-	 *   for each dev attached to rmrr
-	 *   do
-	 *     locate drhd for dev, alloc domain for dev
-	 *     allocate free domain
-	 *     allocate page table entries for rmrr
-	 *     if context not allocated for bus
-	 *           allocate and init context
-	 *           set present in root table for this bus
-	 *     init context with domain, translation etc
-	 *    endfor
-	 * endfor
-	 */
-	pr_info("Setting RMRR:\n");
-	for_each_rmrr_units(rmrr) {
-		/* some BIOS lists non-exist devices in DMAR table. */
-		for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
-					  i, dev) {
-			ret = iommu_prepare_rmrr_dev(rmrr, dev);
-			if (ret)
-				pr_err("Mapping reserved region failed\n");
-		}
-	}
-
-	iommu_prepare_isa();
 
 domains_done:
 
@@ -3580,53 +3440,6 @@ static unsigned long intel_alloc_iova(struct device *dev,
 	return iova_pfn;
 }
 
-struct dmar_domain *get_valid_domain_for_dev(struct device *dev)
-{
-	struct dmar_domain *domain, *tmp;
-	struct dmar_rmrr_unit *rmrr;
-	struct device *i_dev;
-	int i, ret;
-
-	domain = find_domain(dev);
-	if (domain)
-		goto out;
-
-	domain = find_or_alloc_domain(dev, DEFAULT_DOMAIN_ADDRESS_WIDTH);
-	if (!domain)
-		goto out;
-
-	/* We have a new domain - setup possible RMRRs for the device */
-	rcu_read_lock();
-	for_each_rmrr_units(rmrr) {
-		for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
-					  i, i_dev) {
-			if (i_dev != dev)
-				continue;
-
-			ret = domain_prepare_identity_map(dev, domain,
-							  rmrr->base_address,
-							  rmrr->end_address);
-			if (ret)
-				dev_err(dev, "Mapping reserved region failed\n");
-		}
-	}
-	rcu_read_unlock();
-
-	tmp = set_domain_for_dev(dev, domain);
-	if (!tmp || domain != tmp) {
-		domain_exit(domain);
-		domain = tmp;
-	}
-
-out:
-
-	if (!domain)
-		pr_err("Allocating domain for %s failed\n", dev_name(dev));
-
-
-	return domain;
-}
-
 /* Check if the dev needs to go through non-identity map and unmap process.*/
 static int iommu_no_mapping(struct device *dev)
 {
@@ -3689,7 +3502,7 @@ static dma_addr_t __intel_map_page(struct device *dev, struct page *page,
 	if (iommu_no_mapping(dev))
 		return paddr;
 
-	domain = get_valid_domain_for_dev(dev);
+	domain = find_domain(dev);
 	if (!domain)
 		return DMA_MAPPING_ERROR;
 
@@ -3753,7 +3566,8 @@ static void intel_unmap(struct device *dev, dma_addr_t dev_addr, size_t size)
 		return;
 
 	domain = find_domain(dev);
-	BUG_ON(!domain);
+	if (!domain)
+		return;
 
 	iommu = domain_get_iommu(domain);
 
@@ -3899,7 +3713,7 @@ static int intel_map_sg(struct device *dev, struct scatterlist *sglist, int nele
 	if (iommu_no_mapping(dev))
 		return intel_nontranslate_map_sg(dev, sglist, nelems, dir);
 
-	domain = get_valid_domain_for_dev(dev);
+	domain = find_domain(dev);
 	if (!domain)
 		return 0;
 
@@ -5377,9 +5191,9 @@ int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct intel_svm_dev *sd
 	u64 ctx_lo;
 	int ret;
 
-	domain = get_valid_domain_for_dev(sdev->dev);
+	domain = find_domain(sdev->dev);
 	if (!domain)
-		return -EINVAL;
+		return -ENOMEM;
 
 	spin_lock_irqsave(&device_domain_lock, flags);
 	spin_lock(&iommu->lock);
-- 
2.17.1



* Re: [PATCH 0/4] iommu/vt-d: Fix-up device-domain relationship by refactoring to use iommu group default domain.
  2019-03-04 15:41 [PATCH 0/4] iommu/vt-d: Fix-up device-domain relationship by refactoring to use iommu group default domain James Sewart
  2019-03-04 15:45 ` [PATCH 1/4] iommu: Move iommu_group_create_direct_mappings to after device_attach James Sewart
  2019-03-04 15:46 ` [PATCH 2/4] iommu/vt-d: Implement apply_resv_region for reserving IOVA ranges James Sewart
@ 2019-03-05  6:05 ` Lu Baolu
  2019-03-05 11:14   ` James Sewart
  2019-03-14 11:56 ` [PATCH v2 0/7] " James Sewart
  3 siblings, 1 reply; 55+ messages in thread
From: Lu Baolu @ 2019-03-05  6:05 UTC (permalink / raw)
  To: James Sewart, iommu
  Cc: baolu.lu, Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel

Hi James,

Very glad to see this. Thank you!

On 3/4/19 11:41 PM, James Sewart wrote:
> Hey,
> 
> This patchset moves IOMMU_DOMAIN_DMA iommu domain management to iommu.c.
> This avoids the use of find_or_alloc_domain whose domain assignment is
> inconsistent with the iommu grouping as determined by pci_device_group.

Is this a bug fix or an improvement? What real issue would we hit if we
go without this patch set?

Best regards,
Lu Baolu



* Re: [PATCH 3/4] iommu/vt-d: Allow IOMMU_DOMAIN_DMA and IOMMU_DOMAIN_IDENTITY to be allocated
  2019-03-04 15:46   ` [PATCH 3/4] iommu/vt-d: Allow IOMMU_DOMAIN_DMA and IOMMU_DOMAIN_IDENTITY to be allocated James Sewart
  2019-03-04 15:47     ` [PATCH 4/4] iommu/vt-d: Remove lazy allocation of domains James Sewart
@ 2019-03-05  6:46     ` Lu Baolu
  2019-03-05 11:34       ` James Sewart
  2019-03-08  1:20     ` Dmitry Safonov
  2 siblings, 1 reply; 55+ messages in thread
From: Lu Baolu @ 2019-03-05  6:46 UTC (permalink / raw)
  To: James Sewart, iommu
  Cc: baolu.lu, Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel

Hi,

On 3/4/19 11:46 PM, James Sewart wrote:
> Allowing IOMMU_DOMAIN_DMA type IOMMU domain to be allocated allows the
> default_domain of an iommu_group to be set. This delegates device-domain
> relationships to the generic IOMMU code.
> 
> Signed-off-by: James Sewart <jamessewart@arista.com>
> ---
>   drivers/iommu/intel-iommu.c | 113 +++++++++++++++++++++++++++---------
>   1 file changed, 84 insertions(+), 29 deletions(-)
> 
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 8e0a4e2ff77f..71cd6bbfec05 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -309,6 +309,14 @@ static int hw_pass_through = 1;
>   /* si_domain contains mulitple devices */
>   #define DOMAIN_FLAG_STATIC_IDENTITY	(1 << 1)
>   
> +/* Domain managed externally, don't cleanup if it isn't attached
> + * to any devices. */
> +#define DOMAIN_FLAG_MANAGED_EXTERNALLY	(1 << 2)
> +
> +/* Set after domain initialisation. Used when allocating dma domains to
> + * defer domain initialisation until it is attached to a device */
> +#define DOMAIN_FLAG_INITIALISED	(1 << 4)

Why do you skip bit 3?

> +
>   #define for_each_domain_iommu(idx, domain)			\
>   	for (idx = 0; idx < g_num_of_iommus; idx++)		\
>   		if (domain->iommu_refcnt[idx])
> @@ -558,6 +566,16 @@ static inline int domain_type_is_vm_or_si(struct dmar_domain *domain)
>   				DOMAIN_FLAG_STATIC_IDENTITY);
>   }
>   
> +static inline int domain_managed_externally(struct dmar_domain *domain)
> +{
> +	return domain->flags & DOMAIN_FLAG_MANAGED_EXTERNALLY;
> +}
> +
> +static inline int domain_is_initialised(struct dmar_domain *domain)
> +{
> +	return domain->flags & DOMAIN_FLAG_INITIALISED;
> +}
> +
>   static inline int domain_pfn_supported(struct dmar_domain *domain,
>   				       unsigned long pfn)
>   {
> @@ -1662,7 +1680,7 @@ static void disable_dmar_iommu(struct intel_iommu *iommu)
>   
>   		__dmar_remove_one_dev_info(info);
>   
> -		if (!domain_type_is_vm_or_si(domain)) {
> +		if (!domain_managed_externally(domain)) {
>   			/*
>   			 * The domain_exit() function  can't be called under
>   			 * device_domain_lock, as it takes this lock itself.
> @@ -1895,6 +1913,7 @@ static int domain_init(struct dmar_domain *domain, struct intel_iommu *iommu,
>   	domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid);
>   	if (!domain->pgd)
>   		return -ENOMEM;
> +	domain->flags |= DOMAIN_FLAG_INITIALISED;
>   	__iommu_flush_cache(iommu, domain->pgd, PAGE_SIZE);
>   	return 0;
>   }
> @@ -2807,14 +2826,10 @@ static inline void iommu_prepare_isa(void)
>   
>   static int md_domain_init(struct dmar_domain *domain, int guest_width);
>   
> -static int __init si_domain_init(int hw)
> +static int __init si_domain_init(struct dmar_domain *si_domain, int hw)
>   {
>   	int nid, ret = 0;
>   
> -	si_domain = alloc_domain(DOMAIN_FLAG_STATIC_IDENTITY);
> -	if (!si_domain)
> -		return -EFAULT;
> -
>   	if (md_domain_init(si_domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) {
>   		domain_exit(si_domain);
>   		return -EFAULT;
> @@ -3417,9 +3432,16 @@ static int __init init_dmars(void)
>   	check_tylersburg_isoch();
>   
>   	if (iommu_identity_mapping) {
> -		ret = si_domain_init(hw_pass_through);
> -		if (ret)
> +		si_domain = alloc_domain(DOMAIN_FLAG_STATIC_IDENTITY);

Where will this si_domain be used? Do we still need to keep the global
si_domain?

> +		if (!si_domain) {
> +			ret = -EFAULT;
>   			goto free_iommu;
> +		}
> +		ret = si_domain_init(si_domain, hw_pass_through);
> +		if (ret) {
> +			domain_exit(si_domain);
> +			goto free_iommu;
> +		}
>   	}
>   
>   
> @@ -4575,7 +4597,7 @@ static int device_notifier(struct notifier_block *nb,
>   		return 0;
>   
>   	dmar_remove_one_dev_info(domain, dev);
> -	if (!domain_type_is_vm_or_si(domain) && list_empty(&domain->devices))
> +	if (!domain_managed_externally(domain) && list_empty(&domain->devices))
>   		domain_exit(domain);
>   
>   	return 0;
> @@ -5020,6 +5042,7 @@ static int md_domain_init(struct dmar_domain *domain, int guest_width)
>   	domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid);
>   	if (!domain->pgd)
>   		return -ENOMEM;
> +	domain->flags |= DOMAIN_FLAG_INITIALISED;
>   	domain_flush_cache(domain, domain->pgd, PAGE_SIZE);
>   	return 0;
>   }
> @@ -5028,28 +5051,54 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type)
>   {
>   	struct dmar_domain *dmar_domain;
>   	struct iommu_domain *domain;
> +	int flags = DOMAIN_FLAG_MANAGED_EXTERNALLY;
>   
> -	if (type != IOMMU_DOMAIN_UNMANAGED)
> -		return NULL;
> -
> -	dmar_domain = alloc_domain(DOMAIN_FLAG_VIRTUAL_MACHINE);
> -	if (!dmar_domain) {
> -		pr_err("Can't allocate dmar_domain\n");
> -		return NULL;
> -	}
> -	if (md_domain_init(dmar_domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) {
> -		pr_err("Domain initialization failed\n");
> -		domain_exit(dmar_domain);
> +	switch (type) {
> +	case IOMMU_DOMAIN_UNMANAGED:
> +		flags |= DOMAIN_FLAG_VIRTUAL_MACHINE | DOMAIN_FLAG_INITIALISED;
> +		dmar_domain = alloc_domain(flags);
> +		if (!dmar_domain) {
> +			pr_err("Can't allocate dmar_domain\n");
> +			return NULL;
> +		}
> +		if (md_domain_init(dmar_domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) {
> +			pr_err("Domain initialization failed\n");
> +			domain_exit(dmar_domain);
> +			return NULL;
> +		}
> +		domain_update_iommu_cap(dmar_domain);
> +		domain = &dmar_domain->domain;
> +		domain->geometry.aperture_start = 0;
> +		domain->geometry.aperture_end   = __DOMAIN_MAX_ADDR(dmar_domain->gaw);
> +		domain->geometry.force_aperture = true;
> +		break;
> +	case IOMMU_DOMAIN_DMA:
> +		dmar_domain = alloc_domain(flags);
> +		if (!dmar_domain) {
> +			pr_err("Can't allocate dmar_domain\n");

The error message is not necessary for a memory allocation failure.

> +			return NULL;
> +		}
> +		// init domain in device attach when we know IOMMU capabilities

Use /* */ for comments, please.

> +		break;
> +	case IOMMU_DOMAIN_IDENTITY:
> +		flags |= DOMAIN_FLAG_STATIC_IDENTITY | DOMAIN_FLAG_INITIALISED;
> +		dmar_domain = alloc_domain(flags);
> +		if (!dmar_domain) {
> +			pr_err("Can't allocate dmar_domain\n");
> +			return NULL;
> +		}
> +		if (si_domain_init(dmar_domain, 0)) {
> +			pr_err("Domain initialization failed\n");
> +			domain_exit(dmar_domain);
> +			return NULL;
> +		}
> +		break;

How about moving the si domain allocation into a separate patch? There are
two significant changes here, so it's worth a new patch for discussion.

1) a shared single si domain vs. a per-group si domain;
2) previously the vt-d driver determined what kind of devices should use
    the identity domain; now that decision moves to the generic iommu
    layer, so the behaviour will change.

> +	default:
>   		return NULL;
>   	}
> -	domain_update_iommu_cap(dmar_domain);
> -
> -	domain = &dmar_domain->domain;
> -	domain->geometry.aperture_start = 0;
> -	domain->geometry.aperture_end   = __DOMAIN_MAX_ADDR(dmar_domain->gaw);
> -	domain->geometry.force_aperture = true;
>   
> -	return domain;
> +	dmar_domain->domain.type = type;

Is it necessary to set this?

> +	return &dmar_domain->domain;
>   }
>   
>   static void intel_iommu_domain_free(struct iommu_domain *domain)
> @@ -5080,8 +5129,8 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
>   			dmar_remove_one_dev_info(old_domain, dev);
>   			rcu_read_unlock();
>   
> -			if (!domain_type_is_vm_or_si(old_domain) &&
> -			     list_empty(&old_domain->devices))
> +			if (list_empty(&old_domain->devices) &&
> +			     !domain_managed_externally(old_domain))
>   				domain_exit(old_domain);
>   		}
>   	}
> @@ -5090,6 +5139,12 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
>   	if (!iommu)
>   		return -ENODEV;
>   
> +	// Initialise domain with IOMMU capabilities if it isn't already initialised

Ditto.

> +	if (!domain_is_initialised(dmar_domain)) {
> +		if (domain_init(dmar_domain, iommu, DEFAULT_DOMAIN_ADDRESS_WIDTH))
> +			return -ENOMEM;
> +	}

Each domain, no matter whether it is a DMA domain or an UNMANAGED domain,
is allocated first and then gets initialized during device attach, right?
If so, why do we need a special DOMAIN_FLAG_INITIALISED flag for DMA
domains?

> +
>   	/* check if this iommu agaw is sufficient for max mapped address */
>   	addr_width = agaw_to_width(iommu->agaw);
>   	if (addr_width > cap_mgaw(iommu->cap))
> 

Best regards,
Lu Baolu

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 4/4] iommu/vt-d: Remove lazy allocation of domains
  2019-03-04 15:47     ` [PATCH 4/4] iommu/vt-d: Remove lazy allocation of domains James Sewart
@ 2019-03-05  6:59       ` Lu Baolu
  2019-03-05 11:46         ` James Sewart
  0 siblings, 1 reply; 55+ messages in thread
From: Lu Baolu @ 2019-03-05  6:59 UTC (permalink / raw)
  To: James Sewart, iommu
  Cc: baolu.lu, Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel

Hi,

It's hard for me to understand why we remove the RMRR-related
code in this patch.

And now we have two places that hold a domain for a device: the group
and dev->info. We can consider optimizing the use of per-device iommu
arch data. This should be later work anyway.

More comments inline.

On 3/4/19 11:47 PM, James Sewart wrote:
> The generic IOMMU code will allocate and attach a dma ops domain to each
> device that comes online, replacing any lazy allocated domain. Removes
> RMRR application at iommu init time as we won't have a domain attached
> to any device. iommu.c will do this after attaching a device using
> create_direct_mappings.
> 
> Signed-off-by: James Sewart <jamessewart@arista.com>
> ---
>   drivers/iommu/intel-iommu.c | 202 ++----------------------------------
>   1 file changed, 8 insertions(+), 194 deletions(-)
> 
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 71cd6bbfec05..282257e2628d 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -2595,118 +2595,6 @@ static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu,
>   	return domain;
>   }
>   
> -static int get_last_alias(struct pci_dev *pdev, u16 alias, void *opaque)
> -{
> -	*(u16 *)opaque = alias;
> -	return 0;
> -}
> -
> -static struct dmar_domain *find_or_alloc_domain(struct device *dev, int gaw)
> -{
> -	struct device_domain_info *info = NULL;
> -	struct dmar_domain *domain = NULL;
> -	struct intel_iommu *iommu;
> -	u16 dma_alias;
> -	unsigned long flags;
> -	u8 bus, devfn;
> -
> -	iommu = device_to_iommu(dev, &bus, &devfn);
> -	if (!iommu)
> -		return NULL;
> -
> -	if (dev_is_pci(dev)) {
> -		struct pci_dev *pdev = to_pci_dev(dev);
> -
> -		pci_for_each_dma_alias(pdev, get_last_alias, &dma_alias);
> -
> -		spin_lock_irqsave(&device_domain_lock, flags);
> -		info = dmar_search_domain_by_dev_info(pci_domain_nr(pdev->bus),
> -						      PCI_BUS_NUM(dma_alias),
> -						      dma_alias & 0xff);
> -		if (info) {
> -			iommu = info->iommu;
> -			domain = info->domain;
> -		}
> -		spin_unlock_irqrestore(&device_domain_lock, flags);
> -
> -		/* DMA alias already has a domain, use it */
> -		if (info)
> -			goto out;
> -	}
> -
> -	/* Allocate and initialize new domain for the device */
> -	domain = alloc_domain(0);
> -	if (!domain)
> -		return NULL;
> -	if (domain_init(domain, iommu, gaw)) {
> -		domain_exit(domain);
> -		return NULL;
> -	}
> -
> -out:
> -
> -	return domain;
> -}
> -
> -static struct dmar_domain *set_domain_for_dev(struct device *dev,
> -					      struct dmar_domain *domain)
> -{
> -	struct intel_iommu *iommu;
> -	struct dmar_domain *tmp;
> -	u16 req_id, dma_alias;
> -	u8 bus, devfn;
> -
> -	iommu = device_to_iommu(dev, &bus, &devfn);
> -	if (!iommu)
> -		return NULL;
> -
> -	req_id = ((u16)bus << 8) | devfn;
> -
> -	if (dev_is_pci(dev)) {
> -		struct pci_dev *pdev = to_pci_dev(dev);
> -
> -		pci_for_each_dma_alias(pdev, get_last_alias, &dma_alias);
> -
> -		/* register PCI DMA alias device */
> -		if (req_id != dma_alias) {
> -			tmp = dmar_insert_one_dev_info(iommu, PCI_BUS_NUM(dma_alias),
> -					dma_alias & 0xff, NULL, domain);
> -
> -			if (!tmp || tmp != domain)
> -				return tmp;
> -		}
> -	}
> -
> -	tmp = dmar_insert_one_dev_info(iommu, bus, devfn, dev, domain);
> -	if (!tmp || tmp != domain)
> -		return tmp;
> -
> -	return domain;
> -}
> -
> -static struct dmar_domain *get_domain_for_dev(struct device *dev, int gaw)
> -{
> -	struct dmar_domain *domain, *tmp;
> -
> -	domain = find_domain(dev);
> -	if (domain)
> -		goto out;
> -
> -	domain = find_or_alloc_domain(dev, gaw);
> -	if (!domain)
> -		goto out;
> -
> -	tmp = set_domain_for_dev(dev, domain);
> -	if (!tmp || domain != tmp) {
> -		domain_exit(domain);
> -		domain = tmp;
> -	}
> -
> -out:
> -
> -	return domain;
> -}
> -
>   static int iommu_domain_identity_map(struct dmar_domain *domain,
>   				     unsigned long long start,
>   				     unsigned long long end)
> @@ -2779,7 +2667,7 @@ static int iommu_prepare_identity_map(struct device *dev,
>   	struct dmar_domain *domain;
>   	int ret;
>   
> -	domain = get_domain_for_dev(dev, DEFAULT_DOMAIN_ADDRESS_WIDTH);
> +	domain = find_domain(dev);
>   	if (!domain)
>   		return -ENOMEM;
>   
> @@ -3301,11 +3189,9 @@ static int copy_translation_tables(struct intel_iommu *iommu)
>   static int __init init_dmars(void)
>   {
>   	struct dmar_drhd_unit *drhd;
> -	struct dmar_rmrr_unit *rmrr;
>   	bool copied_tables = false;
> -	struct device *dev;
>   	struct intel_iommu *iommu;
> -	int i, ret;
> +	int ret;
>   
>   	/*
>   	 * for each drhd
> @@ -3466,32 +3352,6 @@ static int __init init_dmars(void)
>   			goto free_iommu;
>   		}
>   	}
> -	/*
> -	 * For each rmrr
> -	 *   for each dev attached to rmrr
> -	 *   do
> -	 *     locate drhd for dev, alloc domain for dev
> -	 *     allocate free domain
> -	 *     allocate page table entries for rmrr
> -	 *     if context not allocated for bus
> -	 *           allocate and init context
> -	 *           set present in root table for this bus
> -	 *     init context with domain, translation etc
> -	 *    endfor
> -	 * endfor
> -	 */
> -	pr_info("Setting RMRR:\n");
> -	for_each_rmrr_units(rmrr) {
> -		/* some BIOS lists non-exist devices in DMAR table. */
> -		for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
> -					  i, dev) {
> -			ret = iommu_prepare_rmrr_dev(rmrr, dev);
> -			if (ret)
> -				pr_err("Mapping reserved region failed\n");
> -		}
> -	}
> -
> -	iommu_prepare_isa();

Why do you want to remove this segment of code?

>   
>   domains_done:
>   
> @@ -3580,53 +3440,6 @@ static unsigned long intel_alloc_iova(struct device *dev,
>   	return iova_pfn;
>   }
>   
> -struct dmar_domain *get_valid_domain_for_dev(struct device *dev)
> -{
> -	struct dmar_domain *domain, *tmp;
> -	struct dmar_rmrr_unit *rmrr;
> -	struct device *i_dev;
> -	int i, ret;
> -
> -	domain = find_domain(dev);
> -	if (domain)
> -		goto out;
> -
> -	domain = find_or_alloc_domain(dev, DEFAULT_DOMAIN_ADDRESS_WIDTH);
> -	if (!domain)
> -		goto out;
> -
> -	/* We have a new domain - setup possible RMRRs for the device */
> -	rcu_read_lock();
> -	for_each_rmrr_units(rmrr) {
> -		for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
> -					  i, i_dev) {
> -			if (i_dev != dev)
> -				continue;
> -
> -			ret = domain_prepare_identity_map(dev, domain,
> -							  rmrr->base_address,
> -							  rmrr->end_address);
> -			if (ret)
> -				dev_err(dev, "Mapping reserved region failed\n");

We can't simply remove this segment of code, right? At least it should
be moved to the domain allocation interface.

> -		}
> -	}
> -	rcu_read_unlock();
> -
> -	tmp = set_domain_for_dev(dev, domain);
> -	if (!tmp || domain != tmp) {
> -		domain_exit(domain);
> -		domain = tmp;
> -	}
> -
> -out:
> -
> -	if (!domain)
> -		pr_err("Allocating domain for %s failed\n", dev_name(dev));
> -
> -
> -	return domain;
> -}
> -
>   /* Check if the dev needs to go through non-identity map and unmap process.*/
>   static int iommu_no_mapping(struct device *dev)
>   {
> @@ -3689,7 +3502,7 @@ static dma_addr_t __intel_map_page(struct device *dev, struct page *page,
>   	if (iommu_no_mapping(dev))
>   		return paddr;
>   
> -	domain = get_valid_domain_for_dev(dev);
> +	domain = find_domain(dev);
>   	if (!domain)
>   		return DMA_MAPPING_ERROR;
>   
> @@ -3753,7 +3566,8 @@ static void intel_unmap(struct device *dev, dma_addr_t dev_addr, size_t size)
>   		return;
>   
>   	domain = find_domain(dev);
> -	BUG_ON(!domain);
> +	if (!domain)
> +		return;
>   

This is not related to this patch. Let's do it in a separate patch.

>   	iommu = domain_get_iommu(domain);
>   
> @@ -3899,7 +3713,7 @@ static int intel_map_sg(struct device *dev, struct scatterlist *sglist, int nele
>   	if (iommu_no_mapping(dev))
>   		return intel_nontranslate_map_sg(dev, sglist, nelems, dir);
>   
> -	domain = get_valid_domain_for_dev(dev);
> +	domain = find_domain(dev);
>   	if (!domain)
>   		return 0;
>   
> @@ -5377,9 +5191,9 @@ int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct intel_svm_dev *sd
>   	u64 ctx_lo;
>   	int ret;
>   
> -	domain = get_valid_domain_for_dev(sdev->dev);
> +	domain = find_domain(sdev->dev);
>   	if (!domain)
> -		return -EINVAL;
> +		return -ENOMEM;

This is not related to this patch. Let's do it in a separate patch.

>   
>   	spin_lock_irqsave(&device_domain_lock, flags);
>   	spin_lock(&iommu->lock);
> 

Best regards,
Lu Baolu


* Re: [PATCH 0/4] iommu/vt-d: Fix-up device-domain relationship by refactoring to use iommu group default domain.
  2019-03-05  6:05 ` [PATCH 0/4] iommu/vt-d: Fix-up device-domain relationship by refactoring to use iommu group default domain Lu Baolu
@ 2019-03-05 11:14   ` James Sewart
  2019-03-06  6:27     ` Lu Baolu
  0 siblings, 1 reply; 55+ messages in thread
From: James Sewart @ 2019-03-05 11:14 UTC (permalink / raw)
  To: Lu Baolu; +Cc: iommu, Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel

Hey Lu,

The motivation for this is the buggy domain <-> IOMMU group relationship 
when using find_or_alloc_domain. From what I have read, it should be the 
case that an IOMMU domain is shared by all devices in the same group, and 
thus so are its mappings. This is because the IOMMU group is determined by 
things like ACS settings on PCI switches, which determine whether devices 
can talk to each other.

In find_or_alloc_domain we only determine domain sharing based on devices 
returned by pci_for_each_dma_alias, whereas there are many more checks in 
pci_device_group (e.g. ACS settings of intermediary PCI switches), which 
is used for determining which IOMMU group a device is in. This 
discontinuity means it can be the case that each device within an IOMMU 
group has a different domain.

We see issues because it is supposed to be safe to assume that devices 
within a group are in the same address space, but if each device has its 
own domain then any mapping created won’t be valid for the other devices, 
and a region could even be mapped differently for each device. This could 
also cause issues with a device whose RMRRs are not applied to the domains 
of the other devices within its group: those regions could be remapped 
for the other devices, resulting in unexpected behaviour during 
peer-to-peer transfers.

Cheers,
James


> On 5 Mar 2019, at 06:05, Lu Baolu <baolu.lu@linux.intel.com> wrote:
> 
> Hi James,
> 
> Very glad to see this. Thank you!
> 
> On 3/4/19 11:41 PM, James Sewart wrote:
>> Hey,
>> This patchset moves IOMMU_DOMAIN_DMA iommu domain management to iommu.c.
>> This avoids the use of find_or_alloc_domain whose domain assignment is
>> inconsistent with the iommu grouping as determined by pci_device_group.
> 
> Is this a bug fix or an improvement? What real issue will it cause
> if we go without this patch set?
> 
> Best regards,
> Lu Baolu
> 
>> Patch 3 permits domain type IOMMU_DOMAIN_DMA to be allocated via the
>> iommu_ops api, allowing the default_domain of an iommu group to be set in
>> iommu.c. This domain will be attached to every device that is brought up
>> with an iommu group, and the devices reserved regions will be mapped using
>> regions returned by get_resv_regions.
>> In intel_iommu_domain_alloc we don’t know the IOMMU this domain will be
>> associated with so we defer full initialisation until
>> intel_iommu_attach_device. Currently iommu.c:iommu_group_add_device will
>> try to map a devices reserved regions before attaching the domain which
>> would cause issue if the domain is not fully initialised. This is
>> addressed in patch 1 by moving the mapping to after attaching.
>> Patch 2 implements function apply_resv_region, used in
>> iommu_group_create_direct_mappings to mark the reserved regions as non
>> mappable for the dma_map_ops api.
>> Patch 4 removes the domain lazy allocation logic. Before this patch the
>> lazy allocation logic would not be used as any domain allocated using
>> these paths would be replaced when attaching the group default domain.
>> Default domain allocation has been tested with and without this patch on
>> 4.19.
>> Cheers,
>> James.
>> James Sewart (4):
>>   iommu: Move iommu_group_create_direct_mappings to after device_attach
>>   iommu/vt-d: Implement apply_resv_region for reserving IOVA ranges
>>   iommu/vt-d: Allow IOMMU_DOMAIN_DMA and IOMMU_DOMAIN_IDENTITY to be allocated
>>   iommu/vt-d: Remove lazy allocation of domains
>>  drivers/iommu/intel-iommu.c | 329 ++++++++++++------------------------
>>  drivers/iommu/iommu.c       |   4 +-
>>  2 files changed, 108 insertions(+), 225 deletions(-)



* Re: [PATCH 3/4] iommu/vt-d: Allow IOMMU_DOMAIN_DMA and IOMMU_DOMAIN_IDENTITY to be allocated
  2019-03-05  6:46     ` [PATCH 3/4] iommu/vt-d: Allow IOMMU_DOMAIN_DMA and IOMMU_DOMAIN_IDENTITY to be allocated Lu Baolu
@ 2019-03-05 11:34       ` James Sewart
  0 siblings, 0 replies; 55+ messages in thread
From: James Sewart @ 2019-03-05 11:34 UTC (permalink / raw)
  To: Lu Baolu; +Cc: iommu, Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel

Hey Lu,

> On 5 Mar 2019, at 06:46, Lu Baolu <baolu.lu@linux.intel.com> wrote:
> 
> Hi,
> 
> On 3/4/19 11:46 PM, James Sewart wrote:
>> Allowing IOMMU_DOMAIN_DMA type IOMMU domain to be allocated allows the
>> default_domain of an iommu_group to be set. This delegates device-domain
>> relationships to the generic IOMMU code.
>> Signed-off-by: James Sewart <jamessewart@arista.com>
>> ---
>>  drivers/iommu/intel-iommu.c | 113 +++++++++++++++++++++++++++---------
>>  1 file changed, 84 insertions(+), 29 deletions(-)
>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>> index 8e0a4e2ff77f..71cd6bbfec05 100644
>> --- a/drivers/iommu/intel-iommu.c
>> +++ b/drivers/iommu/intel-iommu.c
>> @@ -309,6 +309,14 @@ static int hw_pass_through = 1;
>>  /* si_domain contains mulitple devices */
>>  #define DOMAIN_FLAG_STATIC_IDENTITY	(1 << 1)
>>  +/* Domain managed externally, don't cleanup if it isn't attached
>> + * to any devices. */
>> +#define DOMAIN_FLAG_MANAGED_EXTERNALLY	(1 << 2)
>> +
>> +/* Set after domain initialisation. Used when allocating dma domains to
>> + * defer domain initialisation until it is attached to a device */
>> +#define DOMAIN_FLAG_INITIALISED	(1 << 4)
> 
> Why do you skip bit 3?

This was an oversight, I will update to use bit 3.

> 
>> +
>>  #define for_each_domain_iommu(idx, domain)			\
>>  	for (idx = 0; idx < g_num_of_iommus; idx++)		\
>>  		if (domain->iommu_refcnt[idx])
>> @@ -558,6 +566,16 @@ static inline int domain_type_is_vm_or_si(struct dmar_domain *domain)
>>  				DOMAIN_FLAG_STATIC_IDENTITY);
>>  }
>>  +static inline int domain_managed_externally(struct dmar_domain *domain)
>> +{
>> +	return domain->flags & DOMAIN_FLAG_MANAGED_EXTERNALLY;
>> +}
>> +
>> +static inline int domain_is_initialised(struct dmar_domain *domain)
>> +{
>> +	return domain->flags & DOMAIN_FLAG_INITIALISED;
>> +}
>> +
>>  static inline int domain_pfn_supported(struct dmar_domain *domain,
>>  				       unsigned long pfn)
>>  {
>> @@ -1662,7 +1680,7 @@ static void disable_dmar_iommu(struct intel_iommu *iommu)
>>    		__dmar_remove_one_dev_info(info);
>>  -		if (!domain_type_is_vm_or_si(domain)) {
>> +		if (!domain_managed_externally(domain)) {
>>  			/*
>>  			 * The domain_exit() function  can't be called under
>>  			 * device_domain_lock, as it takes this lock itself.
>> @@ -1895,6 +1913,7 @@ static int domain_init(struct dmar_domain *domain, struct intel_iommu *iommu,
>>  	domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid);
>>  	if (!domain->pgd)
>>  		return -ENOMEM;
>> +	domain->flags |= DOMAIN_FLAG_INITIALISED;
>>  	__iommu_flush_cache(iommu, domain->pgd, PAGE_SIZE);
>>  	return 0;
>>  }
>> @@ -2807,14 +2826,10 @@ static inline void iommu_prepare_isa(void)
>>    static int md_domain_init(struct dmar_domain *domain, int guest_width);
>>  -static int __init si_domain_init(int hw)
>> +static int __init si_domain_init(struct dmar_domain *si_domain, int hw)
>>  {
>>  	int nid, ret = 0;
>>  -	si_domain = alloc_domain(DOMAIN_FLAG_STATIC_IDENTITY);
>> -	if (!si_domain)
>> -		return -EFAULT;
>> -
>>  	if (md_domain_init(si_domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) {
>>  		domain_exit(si_domain);
>>  		return -EFAULT;
>> @@ -3417,9 +3432,16 @@ static int __init init_dmars(void)
>>  	check_tylersburg_isoch();
>>    	if (iommu_identity_mapping) {
>> -		ret = si_domain_init(hw_pass_through);
>> -		if (ret)
>> +		si_domain = alloc_domain(DOMAIN_FLAG_STATIC_IDENTITY);
> 
> Where will this si_domain be used? We still need to keep the global
> si_domain?

I am unsure of the best thing to do here. The si_domain can be initialised 
as a hardware passthrough which means it doesn’t have any mappings applied. 
I think any user allocating a domain here will always want a software 
passthrough domain. I’m not sure if/how to consolidate these two.

> 
>> +		if (!si_domain) {
>> +			ret = -EFAULT;
>>  			goto free_iommu;
>> +		}
>> +		ret = si_domain_init(si_domain, hw_pass_through);
>> +		if (ret) {
>> +			domain_exit(si_domain);
>> +			goto free_iommu;
>> +		}
>>  	}
>>    @@ -4575,7 +4597,7 @@ static int device_notifier(struct notifier_block *nb,
>>  		return 0;
>>    	dmar_remove_one_dev_info(domain, dev);
>> -	if (!domain_type_is_vm_or_si(domain) && list_empty(&domain->devices))
>> +	if (!domain_managed_externally(domain) && list_empty(&domain->devices))
>>  		domain_exit(domain);
>>    	return 0;
>> @@ -5020,6 +5042,7 @@ static int md_domain_init(struct dmar_domain *domain, int guest_width)
>>  	domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid);
>>  	if (!domain->pgd)
>>  		return -ENOMEM;
>> +	domain->flags |= DOMAIN_FLAG_INITIALISED;
>>  	domain_flush_cache(domain, domain->pgd, PAGE_SIZE);
>>  	return 0;
>>  }
>> @@ -5028,28 +5051,54 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type)
>>  {
>>  	struct dmar_domain *dmar_domain;
>>  	struct iommu_domain *domain;
>> +	int flags = DOMAIN_FLAG_MANAGED_EXTERNALLY;
>>  -	if (type != IOMMU_DOMAIN_UNMANAGED)
>> -		return NULL;
>> -
>> -	dmar_domain = alloc_domain(DOMAIN_FLAG_VIRTUAL_MACHINE);
>> -	if (!dmar_domain) {
>> -		pr_err("Can't allocate dmar_domain\n");
>> -		return NULL;
>> -	}
>> -	if (md_domain_init(dmar_domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) {
>> -		pr_err("Domain initialization failed\n");
>> -		domain_exit(dmar_domain);
>> +	switch (type) {
>> +	case IOMMU_DOMAIN_UNMANAGED:
>> +		flags |= DOMAIN_FLAG_VIRTUAL_MACHINE | DOMAIN_FLAG_INITIALISED;
>> +		dmar_domain = alloc_domain(flags);
>> +		if (!dmar_domain) {
>> +			pr_err("Can't allocate dmar_domain\n");
>> +			return NULL;
>> +		}
>> +		if (md_domain_init(dmar_domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) {
>> +			pr_err("Domain initialization failed\n");
>> +			domain_exit(dmar_domain);
>> +			return NULL;
>> +		}
>> +		domain_update_iommu_cap(dmar_domain);
>> +		domain = &dmar_domain->domain;
>> +		domain->geometry.aperture_start = 0;
>> +		domain->geometry.aperture_end   = __DOMAIN_MAX_ADDR(dmar_domain->gaw);
>> +		domain->geometry.force_aperture = true;
>> +		break;
>> +	case IOMMU_DOMAIN_DMA:
>> +		dmar_domain = alloc_domain(flags);
>> +		if (!dmar_domain) {
>> +			pr_err("Can't allocate dmar_domain\n");
> 
> The error message is not necessary for a memory allocation failure.

I’ll update in the next revision

> 
>> +			return NULL;
>> +		}
>> +		// init domain in device attach when we know IOMMU capabilities
> 
> Use /* */ for comments, please.

I’ll update in the next revision

> 
>> +		break;
>> +	case IOMMU_DOMAIN_IDENTITY:
>> +		flags |= DOMAIN_FLAG_STATIC_IDENTITY | DOMAIN_FLAG_INITIALISED;
>> +		dmar_domain = alloc_domain(flags);
>> +		if (!dmar_domain) {
>> +			pr_err("Can't allocate dmar_domain\n");
>> +			return NULL;
>> +		}
>> +		if (si_domain_init(dmar_domain, 0)) {
>> +			pr_err("Domain initialization failed\n");
>> +			domain_exit(dmar_domain);
>> +			return NULL;
>> +		}
>> +		break;
> 
> How about moving the si domain allocation into a separate patch? There are
> two significant changes here, so it's worth a new patch for discussion.
> 
> 1) a shared single si domain vs. a per-group si domain;
> 2) previously the vt-d driver determined what kind of devices should use
>    the identity domain; now that decision moves to the generic iommu
>    layer, so the behaviour will change.

I’ll move it into another patch. Do you think the global si_domain should 
be returned here?

>> +	default:
>>  		return NULL;
>>  	}
>> -	domain_update_iommu_cap(dmar_domain);
>> -
>> -	domain = &dmar_domain->domain;
>> -	domain->geometry.aperture_start = 0;
>> -	domain->geometry.aperture_end   = __DOMAIN_MAX_ADDR(dmar_domain->gaw);
>> -	domain->geometry.force_aperture = true;
>>  -	return domain;
>> +	dmar_domain->domain.type = type;
> 
> Is it necessary to set this?

It isn’t. I’ll remove this in the next revision.

> 
>> +	return &dmar_domain->domain;
>>  }
>>    static void intel_iommu_domain_free(struct iommu_domain *domain)
>> @@ -5080,8 +5129,8 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
>>  			dmar_remove_one_dev_info(old_domain, dev);
>>  			rcu_read_unlock();
>>  -			if (!domain_type_is_vm_or_si(old_domain) &&
>> -			     list_empty(&old_domain->devices))
>> +			if (list_empty(&old_domain->devices) &&
>> +			     !domain_managed_externally(old_domain))
>>  				domain_exit(old_domain);
>>  		}
>>  	}
>> @@ -5090,6 +5139,12 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
>>  	if (!iommu)
>>  		return -ENODEV;
>>  +	// Initialise domain with IOMMU capabilities if it isn't already initialised
> 
> Ditto.
> 
>> +	if (!domain_is_initialised(dmar_domain)) {
>> +		if (domain_init(dmar_domain, iommu, DEFAULT_DOMAIN_ADDRESS_WIDTH))
>> +			return -ENOMEM;
>> +	}
> 
> Each domain, no matter whether it is a DMA domain or an UNMANAGED domain,
> is allocated first and then gets initialized during device attach, right?
> If so, why do we need a special DOMAIN_FLAG_INITIALISED flag for DMA
> domains?

Only the DMA domain is initialised on attach; UNMANAGED and IDENTITY 
domains initialise with the lowest common denominator of the parameters 
of all attached IOMMUs, using domain_update_iommu_cap.

> 
>> +
>>  	/* check if this iommu agaw is sufficient for max mapped address */
>>  	addr_width = agaw_to_width(iommu->agaw);
>>  	if (addr_width > cap_mgaw(iommu->cap))
> 
> Best regards,
> Lu Baolu

Cheers,
James.


* Re: [PATCH 4/4] iommu/vt-d: Remove lazy allocation of domains
  2019-03-05  6:59       ` Lu Baolu
@ 2019-03-05 11:46         ` James Sewart
  2019-03-06  7:00           ` Lu Baolu
  0 siblings, 1 reply; 55+ messages in thread
From: James Sewart @ 2019-03-05 11:46 UTC (permalink / raw)
  To: Lu Baolu; +Cc: iommu, Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel

Hey Lu,

> On 5 Mar 2019, at 06:59, Lu Baolu <baolu.lu@linux.intel.com> wrote:
> 
> Hi,
> 
> It's hard for me to understand why we remove the RMRR-related
> code in this patch.

The RMRR code removed here requires the lazy allocation of domains to 
exist, as it runs before iommu.c would assign IOMMU groups and attach a 
domain. Before patch 3, removing this code would mean the RMRR regions 
are never mapped for a domain: iommu.c will allocate a default domain for 
the group that a device is about to be put in, attach the domain to the 
device, and then create an identity map for each region returned by 
get_resv_regions; this is where the RMRRs are set up for the default 
domain.

> 
> And now we have two places that hold a domain for a device: the group
> and dev->info. We can consider optimizing the use of per-device iommu
> arch data. This should be later work anyway.
> 
> More comments inline.
> 
> On 3/4/19 11:47 PM, James Sewart wrote:
>> The generic IOMMU code will allocate and attach a dma ops domain to each
>> device that comes online, replacing any lazy allocated domain. Removes
>> RMRR application at iommu init time as we won't have a domain attached
>> to any device. iommu.c will do this after attaching a device using
>> create_direct_mappings.
>> Signed-off-by: James Sewart <jamessewart@arista.com>
>> ---
>>  drivers/iommu/intel-iommu.c | 202 ++----------------------------------
>>  1 file changed, 8 insertions(+), 194 deletions(-)
>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>> index 71cd6bbfec05..282257e2628d 100644
>> --- a/drivers/iommu/intel-iommu.c
>> +++ b/drivers/iommu/intel-iommu.c
>> @@ -2595,118 +2595,6 @@ static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu,
>>  	return domain;
>>  }
>>  -static int get_last_alias(struct pci_dev *pdev, u16 alias, void *opaque)
>> -{
>> -	*(u16 *)opaque = alias;
>> -	return 0;
>> -}
>> -
>> -static struct dmar_domain *find_or_alloc_domain(struct device *dev, int gaw)
>> -{
>> -	struct device_domain_info *info = NULL;
>> -	struct dmar_domain *domain = NULL;
>> -	struct intel_iommu *iommu;
>> -	u16 dma_alias;
>> -	unsigned long flags;
>> -	u8 bus, devfn;
>> -
>> -	iommu = device_to_iommu(dev, &bus, &devfn);
>> -	if (!iommu)
>> -		return NULL;
>> -
>> -	if (dev_is_pci(dev)) {
>> -		struct pci_dev *pdev = to_pci_dev(dev);
>> -
>> -		pci_for_each_dma_alias(pdev, get_last_alias, &dma_alias);
>> -
>> -		spin_lock_irqsave(&device_domain_lock, flags);
>> -		info = dmar_search_domain_by_dev_info(pci_domain_nr(pdev->bus),
>> -						      PCI_BUS_NUM(dma_alias),
>> -						      dma_alias & 0xff);
>> -		if (info) {
>> -			iommu = info->iommu;
>> -			domain = info->domain;
>> -		}
>> -		spin_unlock_irqrestore(&device_domain_lock, flags);
>> -
>> -		/* DMA alias already has a domain, use it */
>> -		if (info)
>> -			goto out;
>> -	}
>> -
>> -	/* Allocate and initialize new domain for the device */
>> -	domain = alloc_domain(0);
>> -	if (!domain)
>> -		return NULL;
>> -	if (domain_init(domain, iommu, gaw)) {
>> -		domain_exit(domain);
>> -		return NULL;
>> -	}
>> -
>> -out:
>> -
>> -	return domain;
>> -}
>> -
>> -static struct dmar_domain *set_domain_for_dev(struct device *dev,
>> -					      struct dmar_domain *domain)
>> -{
>> -	struct intel_iommu *iommu;
>> -	struct dmar_domain *tmp;
>> -	u16 req_id, dma_alias;
>> -	u8 bus, devfn;
>> -
>> -	iommu = device_to_iommu(dev, &bus, &devfn);
>> -	if (!iommu)
>> -		return NULL;
>> -
>> -	req_id = ((u16)bus << 8) | devfn;
>> -
>> -	if (dev_is_pci(dev)) {
>> -		struct pci_dev *pdev = to_pci_dev(dev);
>> -
>> -		pci_for_each_dma_alias(pdev, get_last_alias, &dma_alias);
>> -
>> -		/* register PCI DMA alias device */
>> -		if (req_id != dma_alias) {
>> -			tmp = dmar_insert_one_dev_info(iommu, PCI_BUS_NUM(dma_alias),
>> -					dma_alias & 0xff, NULL, domain);
>> -
>> -			if (!tmp || tmp != domain)
>> -				return tmp;
>> -		}
>> -	}
>> -
>> -	tmp = dmar_insert_one_dev_info(iommu, bus, devfn, dev, domain);
>> -	if (!tmp || tmp != domain)
>> -		return tmp;
>> -
>> -	return domain;
>> -}
>> -
>> -static struct dmar_domain *get_domain_for_dev(struct device *dev, int gaw)
>> -{
>> -	struct dmar_domain *domain, *tmp;
>> -
>> -	domain = find_domain(dev);
>> -	if (domain)
>> -		goto out;
>> -
>> -	domain = find_or_alloc_domain(dev, gaw);
>> -	if (!domain)
>> -		goto out;
>> -
>> -	tmp = set_domain_for_dev(dev, domain);
>> -	if (!tmp || domain != tmp) {
>> -		domain_exit(domain);
>> -		domain = tmp;
>> -	}
>> -
>> -out:
>> -
>> -	return domain;
>> -}
>> -
>>  static int iommu_domain_identity_map(struct dmar_domain *domain,
>>  				     unsigned long long start,
>>  				     unsigned long long end)
>> @@ -2779,7 +2667,7 @@ static int iommu_prepare_identity_map(struct device *dev,
>>  	struct dmar_domain *domain;
>>  	int ret;
>>  -	domain = get_domain_for_dev(dev, DEFAULT_DOMAIN_ADDRESS_WIDTH);
>> +	domain = find_domain(dev);
>>  	if (!domain)
>>  		return -ENOMEM;
>>  @@ -3301,11 +3189,9 @@ static int copy_translation_tables(struct intel_iommu *iommu)
>>  static int __init init_dmars(void)
>>  {
>>  	struct dmar_drhd_unit *drhd;
>> -	struct dmar_rmrr_unit *rmrr;
>>  	bool copied_tables = false;
>> -	struct device *dev;
>>  	struct intel_iommu *iommu;
>> -	int i, ret;
>> +	int ret;
>>    	/*
>>  	 * for each drhd
>> @@ -3466,32 +3352,6 @@ static int __init init_dmars(void)
>>  			goto free_iommu;
>>  		}
>>  	}
>> -	/*
>> -	 * For each rmrr
>> -	 *   for each dev attached to rmrr
>> -	 *   do
>> -	 *     locate drhd for dev, alloc domain for dev
>> -	 *     allocate free domain
>> -	 *     allocate page table entries for rmrr
>> -	 *     if context not allocated for bus
>> -	 *           allocate and init context
>> -	 *           set present in root table for this bus
>> -	 *     init context with domain, translation etc
>> -	 *    endfor
>> -	 * endfor
>> -	 */
>> -	pr_info("Setting RMRR:\n");
>> -	for_each_rmrr_units(rmrr) {
>> -		/* some BIOS lists non-exist devices in DMAR table. */
>> -		for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
>> -					  i, dev) {
>> -			ret = iommu_prepare_rmrr_dev(rmrr, dev);
>> -			if (ret)
>> -				pr_err("Mapping reserved region failed\n");
>> -		}
>> -	}
>> -
>> -	iommu_prepare_isa();
> 
> Why do you want to remove this segment of code?

This will only work if the lazy allocation of domains exists; these 
mappings will disappear once a default domain is attached to a device, 
and will then be remade by iommu_group_create_direct_mappings. This code 
is redundant, and removing it allows us to remove all of the lazy 
allocation logic.

iommu_prepare_isa does need moving to get_resv_regions for its mappings 
to be applied; this will need some refactoring.

> 
>>    domains_done:
>>  @@ -3580,53 +3440,6 @@ static unsigned long intel_alloc_iova(struct device *dev,
>>  	return iova_pfn;
>>  }
>>  -struct dmar_domain *get_valid_domain_for_dev(struct device *dev)
>> -{
>> -	struct dmar_domain *domain, *tmp;
>> -	struct dmar_rmrr_unit *rmrr;
>> -	struct device *i_dev;
>> -	int i, ret;
>> -
>> -	domain = find_domain(dev);
>> -	if (domain)
>> -		goto out;
>> -
>> -	domain = find_or_alloc_domain(dev, DEFAULT_DOMAIN_ADDRESS_WIDTH);
>> -	if (!domain)
>> -		goto out;
>> -
>> -	/* We have a new domain - setup possible RMRRs for the device */
>> -	rcu_read_lock();
>> -	for_each_rmrr_units(rmrr) {
>> -		for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
>> -					  i, i_dev) {
>> -			if (i_dev != dev)
>> -				continue;
>> -
>> -			ret = domain_prepare_identity_map(dev, domain,
>> -							  rmrr->base_address,
>> -							  rmrr->end_address);
>> -			if (ret)
>> -				dev_err(dev, "Mapping reserved region failed\n");
> 
> We can't simply remove this segment of code, right? At least it should
> be moved to the domain allocation interface.

iommu_group_create_direct_mappings will take care of these mappings; this 
code is not used once an externally managed domain (the group default 
domain) is attached to the device.

> 
>> -		}
>> -	}
>> -	rcu_read_unlock();
>> -
>> -	tmp = set_domain_for_dev(dev, domain);
>> -	if (!tmp || domain != tmp) {
>> -		domain_exit(domain);
>> -		domain = tmp;
>> -	}
>> -
>> -out:
>> -
>> -	if (!domain)
>> -		pr_err("Allocating domain for %s failed\n", dev_name(dev));
>> -
>> -
>> -	return domain;
>> -}
>> -
>>  /* Check if the dev needs to go through non-identity map and unmap process.*/
>>  static int iommu_no_mapping(struct device *dev)
>>  {
>> @@ -3689,7 +3502,7 @@ static dma_addr_t __intel_map_page(struct device *dev, struct page *page,
>>  	if (iommu_no_mapping(dev))
>>  		return paddr;
>>  -	domain = get_valid_domain_for_dev(dev);
>> +	domain = find_domain(dev);
>>  	if (!domain)
>>  		return DMA_MAPPING_ERROR;
>>  @@ -3753,7 +3566,8 @@ static void intel_unmap(struct device *dev, dma_addr_t dev_addr, size_t size)
>>  		return;
>>    	domain = find_domain(dev);
>> -	BUG_ON(!domain);
>> +	if (!domain)
>> +		return;
>>  
> 
> This is not related to this patch. Let's do it in a separated patch.
> 
>>  	iommu = domain_get_iommu(domain);
>>  @@ -3899,7 +3713,7 @@ static int intel_map_sg(struct device *dev, struct scatterlist *sglist, int nele
>>  	if (iommu_no_mapping(dev))
>>  		return intel_nontranslate_map_sg(dev, sglist, nelems, dir);
>>  -	domain = get_valid_domain_for_dev(dev);
>> +	domain = find_domain(dev);
>>  	if (!domain)
>>  		return 0;
>>  @@ -5377,9 +5191,9 @@ int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct intel_svm_dev *sd
>>  	u64 ctx_lo;
>>  	int ret;
>>  -	domain = get_valid_domain_for_dev(sdev->dev);
>> +	domain = find_domain(sdev->dev);
>>  	if (!domain)
>> -		return -EINVAL;
>> +		return -ENOMEM;
> 
> This is not related to this patch. Let's do it in a separated patch.
> 
>>    	spin_lock_irqsave(&device_domain_lock, flags);
>>  	spin_lock(&iommu->lock);
> 
> Best regards,
> Lu Baolu


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 0/4] iommu/vt-d: Fix-up device-domain relationship by refactoring to use iommu group default domain.
  2019-03-05 11:14   ` James Sewart
@ 2019-03-06  6:27     ` Lu Baolu
  0 siblings, 0 replies; 55+ messages in thread
From: Lu Baolu @ 2019-03-06  6:27 UTC (permalink / raw)
  To: James Sewart
  Cc: baolu.lu, iommu, Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel

Hi,

On 3/5/19 7:14 PM, James Sewart wrote:
> Hey Lu,
> 
> The motivation for this is buggy domain <-> IOMMU group relationship when
> using find_or_alloc_domain. From what I have read it should be the case
> that an IOMMU domain is shared by all devices in the same group, thus the
> same mappings. This is because the IOMMU group is determined by things
> like ACS settings on pci switches which determines whether devices can
> talk to each other.
Yes, exactly. This is a known issue.

> 
> In find_or_alloc_domain we only determine domain sharing based on devices
> returned by pci_for_each_dma_alias, whereas there are many more checks in
> pci_device_group(e.g. ACS settings of intermediary pci switches), which is
> used for determining which IOMMU group a device is in. This discontinuity
> means it can be the case that each device within an IOMMU group will have
> different domains.

Yes. Agree again.

> 
> We see issues as it is supposed to be safe to assume that devices within a
> group should be considered to be in the same address space, but if each
> device has its own domain then any mapping created won’t be valid for the
> other devices, and even could be mapped differently for each device. This
> also could cause issues with a device whose RMRR's are not applied to the
> domains of other devices within its group, these regions could be remapped
> for the other devices resulting in unexpected behaviour during
> peer-to-peer transfers.

Yes, fair enough.

I asked this question because I am interested to know whether multiple
domains per group due to lack of ACS consideration will cause any issues
that we need to fix in the stable kernels.

Best regards,
Lu Baolu

> 
> Cheers,
> James
> 
> 
>> On 5 Mar 2019, at 06:05, Lu Baolu <baolu.lu@linux.intel.com> wrote:
>>
>> Hi James,
>>
>> Very glad to see this. Thank you!
>>
>> On 3/4/19 11:41 PM, James Sewart wrote:
>>> Hey,
>>> This patchset moves IOMMU_DOMAIN_DMA iommu domain management to iommu.c.
>>> This avoids the use of find_or_alloc_domain whose domain assignment is
>>> inconsistent with the iommu grouping as determined by pci_device_group.
>>
>> Is this a bug fix or an improvement? What's the real issue will it cause
>> if we go without this patch set?
>>
>> Best regards,
>> Lu Baolu
>>
>>> Patch 3 permits domain type IOMMU_DOMAIN_DMA to be allocated via the
>>> iommu_ops api, allowing the default_domain of an iommu group to be set in
>>> iommu.c. This domain will be attached to every device that is brought up
>>> with an iommu group, and the devices reserved regions will be mapped using
>>> regions returned by get_resv_regions.
>>> In intel_iommu_domain_alloc we don’t know the IOMMU this domain will be
>>> associated with so we defer full initialisation until
>>> intel_iommu_attach_device. Currently iommu.c:iommu_group_add_device will
>>> try to map a devices reserved regions before attaching the domain which
>>> would cause issue if the domain is not fully initialised. This is
>>> addressed in patch 1 by moving the mapping to after attaching.
>>> Patch 2 implements function apply_resv_region, used in
>>> iommu_group_create_direct_mappings to mark the reserved regions as non
>>> mappable for the dma_map_ops api.
>>> Patch 4 removes the domain lazy allocation logic. Before this patch the
>>> lazy allocation logic would not be used as any domain allocated using
>>> these paths would be replaced when attaching the group default domain.
>>> Default domain allocation has been tested with and without this patch on
>>> 4.19.
>>> Cheers,
>>> James.
>>> James Sewart (4):
>>>    iommu: Move iommu_group_create_direct_mappings to after device_attach
>>>    iommu/vt-d: Implement apply_resv_region for reserving IOVA ranges
>>>    iommu/vt-d: Allow IOMMU_DOMAIN_DMA and IOMMU_DOMAIN_IDENTITY to be allocated
>>>    iommu/vt-d: Remove lazy allocation of domains
>>>   drivers/iommu/intel-iommu.c | 329 ++++++++++++------------------------
>>>   drivers/iommu/iommu.c       |   4 +-
>>>   2 files changed, 108 insertions(+), 225 deletions(-)
> 
> 


* Re: [PATCH 4/4] iommu/vt-d: Remove lazy allocation of domains
  2019-03-05 11:46         ` James Sewart
@ 2019-03-06  7:00           ` Lu Baolu
  2019-03-06 18:08             ` James Sewart
  0 siblings, 1 reply; 55+ messages in thread
From: Lu Baolu @ 2019-03-06  7:00 UTC (permalink / raw)
  To: James Sewart
  Cc: baolu.lu, iommu, Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel

Hi,

On 3/5/19 7:46 PM, James Sewart wrote:
> Hey Lu,
> 
>> On 5 Mar 2019, at 06:59, Lu Baolu <baolu.lu@linux.intel.com> wrote:
>>
>> Hi,
>>
>> It's hard for me to understand why do we remove the rmrr related
>> code in this patch.
> 
> The RMRR code removed here requires the lazy allocation of domains to
> exist, as it is run before iommu.c would assign IOMMU groups and attach a
> domain. Before patch 3, removing this code would mean the RMRR regions are
> never mapped for a domain: iommu.c will allocate a default domain for the
> group that a device is about to be put in, it will attach the domain to
> the device, then for each region returned by get_resv_regions it will
> create an identity map, this is where the RMRRs are setup for the default
> domain.
>>
>> And, now we have two places to hold a domain for a device: group and
>> dev->info. We can consider to optimize the use of per device iommu
>> arch data. This should be later work anyway.
>>
>> More comments inline.
>>
>> On 3/4/19 11:47 PM, James Sewart wrote:
>>> The generic IOMMU code will allocate and attach a dma ops domain to each
>>> device that comes online, replacing any lazy allocated domain. Removes
>>> RMRR application at iommu init time as we won't have a domain attached
>>> to any device. iommu.c will do this after attaching a device using
>>> create_direct_mappings.
>>> Signed-off-by: James Sewart <jamessewart@arista.com>
>>> ---
>>>   drivers/iommu/intel-iommu.c | 202 ++----------------------------------
>>>   1 file changed, 8 insertions(+), 194 deletions(-)
>>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>>> index 71cd6bbfec05..282257e2628d 100644
>>> --- a/drivers/iommu/intel-iommu.c
>>> +++ b/drivers/iommu/intel-iommu.c
>>> @@ -2595,118 +2595,6 @@ static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu,
>>>   	return domain;
>>>   }
>>>   -static int get_last_alias(struct pci_dev *pdev, u16 alias, void *opaque)
>>> -{
>>> -	*(u16 *)opaque = alias;
>>> -	return 0;
>>> -}
>>> -
>>> -static struct dmar_domain *find_or_alloc_domain(struct device *dev, int gaw)
>>> -{
>>> -	struct device_domain_info *info = NULL;
>>> -	struct dmar_domain *domain = NULL;
>>> -	struct intel_iommu *iommu;
>>> -	u16 dma_alias;
>>> -	unsigned long flags;
>>> -	u8 bus, devfn;
>>> -
>>> -	iommu = device_to_iommu(dev, &bus, &devfn);
>>> -	if (!iommu)
>>> -		return NULL;
>>> -
>>> -	if (dev_is_pci(dev)) {
>>> -		struct pci_dev *pdev = to_pci_dev(dev);
>>> -
>>> -		pci_for_each_dma_alias(pdev, get_last_alias, &dma_alias);
>>> -
>>> -		spin_lock_irqsave(&device_domain_lock, flags);
>>> -		info = dmar_search_domain_by_dev_info(pci_domain_nr(pdev->bus),
>>> -						      PCI_BUS_NUM(dma_alias),
>>> -						      dma_alias & 0xff);
>>> -		if (info) {
>>> -			iommu = info->iommu;
>>> -			domain = info->domain;
>>> -		}
>>> -		spin_unlock_irqrestore(&device_domain_lock, flags);
>>> -
>>> -		/* DMA alias already has a domain, use it */
>>> -		if (info)
>>> -			goto out;
>>> -	}
>>> -
>>> -	/* Allocate and initialize new domain for the device */
>>> -	domain = alloc_domain(0);
>>> -	if (!domain)
>>> -		return NULL;
>>> -	if (domain_init(domain, iommu, gaw)) {
>>> -		domain_exit(domain);
>>> -		return NULL;
>>> -	}
>>> -
>>> -out:
>>> -
>>> -	return domain;
>>> -}
>>> -
>>> -static struct dmar_domain *set_domain_for_dev(struct device *dev,
>>> -					      struct dmar_domain *domain)
>>> -{
>>> -	struct intel_iommu *iommu;
>>> -	struct dmar_domain *tmp;
>>> -	u16 req_id, dma_alias;
>>> -	u8 bus, devfn;
>>> -
>>> -	iommu = device_to_iommu(dev, &bus, &devfn);
>>> -	if (!iommu)
>>> -		return NULL;
>>> -
>>> -	req_id = ((u16)bus << 8) | devfn;
>>> -
>>> -	if (dev_is_pci(dev)) {
>>> -		struct pci_dev *pdev = to_pci_dev(dev);
>>> -
>>> -		pci_for_each_dma_alias(pdev, get_last_alias, &dma_alias);
>>> -
>>> -		/* register PCI DMA alias device */
>>> -		if (req_id != dma_alias) {
>>> -			tmp = dmar_insert_one_dev_info(iommu, PCI_BUS_NUM(dma_alias),
>>> -					dma_alias & 0xff, NULL, domain);
>>> -
>>> -			if (!tmp || tmp != domain)
>>> -				return tmp;
>>> -		}
>>> -	}
>>> -
>>> -	tmp = dmar_insert_one_dev_info(iommu, bus, devfn, dev, domain);
>>> -	if (!tmp || tmp != domain)
>>> -		return tmp;
>>> -
>>> -	return domain;
>>> -}
>>> -
>>> -static struct dmar_domain *get_domain_for_dev(struct device *dev, int gaw)
>>> -{
>>> -	struct dmar_domain *domain, *tmp;
>>> -
>>> -	domain = find_domain(dev);
>>> -	if (domain)
>>> -		goto out;
>>> -
>>> -	domain = find_or_alloc_domain(dev, gaw);
>>> -	if (!domain)
>>> -		goto out;
>>> -
>>> -	tmp = set_domain_for_dev(dev, domain);
>>> -	if (!tmp || domain != tmp) {
>>> -		domain_exit(domain);
>>> -		domain = tmp;
>>> -	}
>>> -
>>> -out:
>>> -
>>> -	return domain;
>>> -}
>>> -
>>>   static int iommu_domain_identity_map(struct dmar_domain *domain,
>>>   				     unsigned long long start,
>>>   				     unsigned long long end)
>>> @@ -2779,7 +2667,7 @@ static int iommu_prepare_identity_map(struct device *dev,
>>>   	struct dmar_domain *domain;
>>>   	int ret;
>>>   -	domain = get_domain_for_dev(dev, DEFAULT_DOMAIN_ADDRESS_WIDTH);
>>> +	domain = find_domain(dev);
>>>   	if (!domain)
>>>   		return -ENOMEM;
>>>   @@ -3301,11 +3189,9 @@ static int copy_translation_tables(struct intel_iommu *iommu)
>>>   static int __init init_dmars(void)
>>>   {
>>>   	struct dmar_drhd_unit *drhd;
>>> -	struct dmar_rmrr_unit *rmrr;
>>>   	bool copied_tables = false;
>>> -	struct device *dev;
>>>   	struct intel_iommu *iommu;
>>> -	int i, ret;
>>> +	int ret;
>>>     	/*
>>>   	 * for each drhd
>>> @@ -3466,32 +3352,6 @@ static int __init init_dmars(void)
>>>   			goto free_iommu;
>>>   		}
>>>   	}
>>> -	/*
>>> -	 * For each rmrr
>>> -	 *   for each dev attached to rmrr
>>> -	 *   do
>>> -	 *     locate drhd for dev, alloc domain for dev
>>> -	 *     allocate free domain
>>> -	 *     allocate page table entries for rmrr
>>> -	 *     if context not allocated for bus
>>> -	 *           allocate and init context
>>> -	 *           set present in root table for this bus
>>> -	 *     init context with domain, translation etc
>>> -	 *    endfor
>>> -	 * endfor
>>> -	 */
>>> -	pr_info("Setting RMRR:\n");
>>> -	for_each_rmrr_units(rmrr) {
>>> -		/* some BIOS lists non-exist devices in DMAR table. */
>>> -		for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
>>> -					  i, dev) {
>>> -			ret = iommu_prepare_rmrr_dev(rmrr, dev);
>>> -			if (ret)
>>> -				pr_err("Mapping reserved region failed\n");
>>> -		}
>>> -	}
>>> -
>>> -	iommu_prepare_isa();
>>
>> Why do you want to remove this segment of code?
> 
> This will only work if the lazy allocation of domains exists, these
> mappings will disappear once a default domain is attached to a device and
> then remade by iommu_group_create_direct_mappings. This code is redundant
> and removing it allows us to remove all the lazy allocation logic.

Not exactly.

We need to set up the rmrr mappings before enabling dma remapping;
otherwise there will be a window between dma remapping being enabled and
the rmrr getting mapped, during which people might see dma faults.

> 
> iommu_prepare_isa does need moving to get_resv_regions for its mappings to
> be applied, this will need some refactoring.

Yes, agree.

> 
>>
>>>     domains_done:
>>>   @@ -3580,53 +3440,6 @@ static unsigned long intel_alloc_iova(struct device *dev,
>>>   	return iova_pfn;
>>>   }
>>>   -struct dmar_domain *get_valid_domain_for_dev(struct device *dev)
>>> -{
>>> -	struct dmar_domain *domain, *tmp;
>>> -	struct dmar_rmrr_unit *rmrr;
>>> -	struct device *i_dev;
>>> -	int i, ret;
>>> -
>>> -	domain = find_domain(dev);
>>> -	if (domain)
>>> -		goto out;
>>> -
>>> -	domain = find_or_alloc_domain(dev, DEFAULT_DOMAIN_ADDRESS_WIDTH);
>>> -	if (!domain)
>>> -		goto out;
>>> -
>>> -	/* We have a new domain - setup possible RMRRs for the device */
>>> -	rcu_read_lock();
>>> -	for_each_rmrr_units(rmrr) {
>>> -		for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
>>> -					  i, i_dev) {
>>> -			if (i_dev != dev)
>>> -				continue;
>>> -
>>> -			ret = domain_prepare_identity_map(dev, domain,
>>> -							  rmrr->base_address,
>>> -							  rmrr->end_address);
>>> -			if (ret)
>>> -				dev_err(dev, "Mapping reserved region failed\n");
>>
>> We can't simply remove this segment of code, right? At least it should
>> be moved to the domain allocation interface.
> 
> iommu_group_create_direct_mappings will take care of these mappings, this
> code is not used once an externally managed domain(group domain) is
> attached to the device.

Yes, this seems to be duplicated even with lazy domain allocation.

Best regards,
Lu Baolu

> 
>>
>>> -		}
>>> -	}
>>> -	rcu_read_unlock();
>>> -
>>> -	tmp = set_domain_for_dev(dev, domain);
>>> -	if (!tmp || domain != tmp) {
>>> -		domain_exit(domain);
>>> -		domain = tmp;
>>> -	}
>>> -
>>> -out:
>>> -
>>> -	if (!domain)
>>> -		pr_err("Allocating domain for %s failed\n", dev_name(dev));
>>> -
>>> -
>>> -	return domain;
>>> -}
>>> -
>>>   /* Check if the dev needs to go through non-identity map and unmap process.*/
>>>   static int iommu_no_mapping(struct device *dev)
>>>   {
>>> @@ -3689,7 +3502,7 @@ static dma_addr_t __intel_map_page(struct device *dev, struct page *page,
>>>   	if (iommu_no_mapping(dev))
>>>   		return paddr;
>>>   -	domain = get_valid_domain_for_dev(dev);
>>> +	domain = find_domain(dev);
>>>   	if (!domain)
>>>   		return DMA_MAPPING_ERROR;
>>>   @@ -3753,7 +3566,8 @@ static void intel_unmap(struct device *dev, dma_addr_t dev_addr, size_t size)
>>>   		return;
>>>     	domain = find_domain(dev);
>>> -	BUG_ON(!domain);
>>> +	if (!domain)
>>> +		return;
>>>   
>>
>> This is not related to this patch. Let's do it in a separated patch.
>>
>>>   	iommu = domain_get_iommu(domain);
>>>   @@ -3899,7 +3713,7 @@ static int intel_map_sg(struct device *dev, struct scatterlist *sglist, int nele
>>>   	if (iommu_no_mapping(dev))
>>>   		return intel_nontranslate_map_sg(dev, sglist, nelems, dir);
>>>   -	domain = get_valid_domain_for_dev(dev);
>>> +	domain = find_domain(dev);
>>>   	if (!domain)
>>>   		return 0;
>>>   @@ -5377,9 +5191,9 @@ int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct intel_svm_dev *sd
>>>   	u64 ctx_lo;
>>>   	int ret;
>>>   -	domain = get_valid_domain_for_dev(sdev->dev);
>>> +	domain = find_domain(sdev->dev);
>>>   	if (!domain)
>>> -		return -EINVAL;
>>> +		return -ENOMEM;
>>
>> This is not related to this patch. Let's do it in a separated patch.
>>
>>>     	spin_lock_irqsave(&device_domain_lock, flags);
>>>   	spin_lock(&iommu->lock);
>>
>> Best regards,
>> Lu Baolu
> 
> 


* Re: [PATCH 4/4] iommu/vt-d: Remove lazy allocation of domains
  2019-03-06  7:00           ` Lu Baolu
@ 2019-03-06 18:08             ` James Sewart
  2019-03-07  6:31               ` Lu Baolu
  0 siblings, 1 reply; 55+ messages in thread
From: James Sewart @ 2019-03-06 18:08 UTC (permalink / raw)
  To: Lu Baolu; +Cc: iommu, Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel

Hey,

> On 6 Mar 2019, at 07:00, Lu Baolu <baolu.lu@linux.intel.com> wrote:
> 
> Hi,
> 
> On 3/5/19 7:46 PM, James Sewart wrote:
>> Hey Lu,
>>> On 5 Mar 2019, at 06:59, Lu Baolu <baolu.lu@linux.intel.com> wrote:
>>> 
>>> Hi,
>>> 
>>> It's hard for me to understand why do we remove the rmrr related
>>> code in this patch.
>> The RMRR code removed here requires the lazy allocation of domains to
>> exist, as it is run before iommu.c would assign IOMMU groups and attach a
>> domain. Before patch 3, removing this code would mean the RMRR regions are
>> never mapped for a domain: iommu.c will allocate a default domain for the
>> group that a device is about to be put in, it will attach the domain to
>> the device, then for each region returned by get_resv_regions it will
>> create an identity map, this is where the RMRRs are setup for the default
>> domain.
>>> 
>>> And, now we have two places to hold a domain for a device: group and
>>> dev->info. We can consider to optimize the use of per device iommu
>>> arch data. This should be later work anyway.
>>> 
>>> More comments inline.
>>> 
>>> On 3/4/19 11:47 PM, James Sewart wrote:
>>>> The generic IOMMU code will allocate and attach a dma ops domain to each
>>>> device that comes online, replacing any lazy allocated domain. Removes
>>>> RMRR application at iommu init time as we won't have a domain attached
>>>> to any device. iommu.c will do this after attaching a device using
>>>> create_direct_mappings.
>>>> Signed-off-by: James Sewart <jamessewart@arista.com>
>>>> ---
>>>>  drivers/iommu/intel-iommu.c | 202 ++----------------------------------
>>>>  1 file changed, 8 insertions(+), 194 deletions(-)
>>>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>>>> index 71cd6bbfec05..282257e2628d 100644
>>>> --- a/drivers/iommu/intel-iommu.c
>>>> +++ b/drivers/iommu/intel-iommu.c
>>>> @@ -2595,118 +2595,6 @@ static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu,
>>>>  	return domain;
>>>>  }
>>>>  -static int get_last_alias(struct pci_dev *pdev, u16 alias, void *opaque)
>>>> -{
>>>> -	*(u16 *)opaque = alias;
>>>> -	return 0;
>>>> -}
>>>> -
>>>> -static struct dmar_domain *find_or_alloc_domain(struct device *dev, int gaw)
>>>> -{
>>>> -	struct device_domain_info *info = NULL;
>>>> -	struct dmar_domain *domain = NULL;
>>>> -	struct intel_iommu *iommu;
>>>> -	u16 dma_alias;
>>>> -	unsigned long flags;
>>>> -	u8 bus, devfn;
>>>> -
>>>> -	iommu = device_to_iommu(dev, &bus, &devfn);
>>>> -	if (!iommu)
>>>> -		return NULL;
>>>> -
>>>> -	if (dev_is_pci(dev)) {
>>>> -		struct pci_dev *pdev = to_pci_dev(dev);
>>>> -
>>>> -		pci_for_each_dma_alias(pdev, get_last_alias, &dma_alias);
>>>> -
>>>> -		spin_lock_irqsave(&device_domain_lock, flags);
>>>> -		info = dmar_search_domain_by_dev_info(pci_domain_nr(pdev->bus),
>>>> -						      PCI_BUS_NUM(dma_alias),
>>>> -						      dma_alias & 0xff);
>>>> -		if (info) {
>>>> -			iommu = info->iommu;
>>>> -			domain = info->domain;
>>>> -		}
>>>> -		spin_unlock_irqrestore(&device_domain_lock, flags);
>>>> -
>>>> -		/* DMA alias already has a domain, use it */
>>>> -		if (info)
>>>> -			goto out;
>>>> -	}
>>>> -
>>>> -	/* Allocate and initialize new domain for the device */
>>>> -	domain = alloc_domain(0);
>>>> -	if (!domain)
>>>> -		return NULL;
>>>> -	if (domain_init(domain, iommu, gaw)) {
>>>> -		domain_exit(domain);
>>>> -		return NULL;
>>>> -	}
>>>> -
>>>> -out:
>>>> -
>>>> -	return domain;
>>>> -}
>>>> -
>>>> -static struct dmar_domain *set_domain_for_dev(struct device *dev,
>>>> -					      struct dmar_domain *domain)
>>>> -{
>>>> -	struct intel_iommu *iommu;
>>>> -	struct dmar_domain *tmp;
>>>> -	u16 req_id, dma_alias;
>>>> -	u8 bus, devfn;
>>>> -
>>>> -	iommu = device_to_iommu(dev, &bus, &devfn);
>>>> -	if (!iommu)
>>>> -		return NULL;
>>>> -
>>>> -	req_id = ((u16)bus << 8) | devfn;
>>>> -
>>>> -	if (dev_is_pci(dev)) {
>>>> -		struct pci_dev *pdev = to_pci_dev(dev);
>>>> -
>>>> -		pci_for_each_dma_alias(pdev, get_last_alias, &dma_alias);
>>>> -
>>>> -		/* register PCI DMA alias device */
>>>> -		if (req_id != dma_alias) {
>>>> -			tmp = dmar_insert_one_dev_info(iommu, PCI_BUS_NUM(dma_alias),
>>>> -					dma_alias & 0xff, NULL, domain);
>>>> -
>>>> -			if (!tmp || tmp != domain)
>>>> -				return tmp;
>>>> -		}
>>>> -	}
>>>> -
>>>> -	tmp = dmar_insert_one_dev_info(iommu, bus, devfn, dev, domain);
>>>> -	if (!tmp || tmp != domain)
>>>> -		return tmp;
>>>> -
>>>> -	return domain;
>>>> -}
>>>> -
>>>> -static struct dmar_domain *get_domain_for_dev(struct device *dev, int gaw)
>>>> -{
>>>> -	struct dmar_domain *domain, *tmp;
>>>> -
>>>> -	domain = find_domain(dev);
>>>> -	if (domain)
>>>> -		goto out;
>>>> -
>>>> -	domain = find_or_alloc_domain(dev, gaw);
>>>> -	if (!domain)
>>>> -		goto out;
>>>> -
>>>> -	tmp = set_domain_for_dev(dev, domain);
>>>> -	if (!tmp || domain != tmp) {
>>>> -		domain_exit(domain);
>>>> -		domain = tmp;
>>>> -	}
>>>> -
>>>> -out:
>>>> -
>>>> -	return domain;
>>>> -}
>>>> -
>>>>  static int iommu_domain_identity_map(struct dmar_domain *domain,
>>>>  				     unsigned long long start,
>>>>  				     unsigned long long end)
>>>> @@ -2779,7 +2667,7 @@ static int iommu_prepare_identity_map(struct device *dev,
>>>>  	struct dmar_domain *domain;
>>>>  	int ret;
>>>>  -	domain = get_domain_for_dev(dev, DEFAULT_DOMAIN_ADDRESS_WIDTH);
>>>> +	domain = find_domain(dev);
>>>>  	if (!domain)
>>>>  		return -ENOMEM;
>>>>  @@ -3301,11 +3189,9 @@ static int copy_translation_tables(struct intel_iommu *iommu)
>>>>  static int __init init_dmars(void)
>>>>  {
>>>>  	struct dmar_drhd_unit *drhd;
>>>> -	struct dmar_rmrr_unit *rmrr;
>>>>  	bool copied_tables = false;
>>>> -	struct device *dev;
>>>>  	struct intel_iommu *iommu;
>>>> -	int i, ret;
>>>> +	int ret;
>>>>    	/*
>>>>  	 * for each drhd
>>>> @@ -3466,32 +3352,6 @@ static int __init init_dmars(void)
>>>>  			goto free_iommu;
>>>>  		}
>>>>  	}
>>>> -	/*
>>>> -	 * For each rmrr
>>>> -	 *   for each dev attached to rmrr
>>>> -	 *   do
>>>> -	 *     locate drhd for dev, alloc domain for dev
>>>> -	 *     allocate free domain
>>>> -	 *     allocate page table entries for rmrr
>>>> -	 *     if context not allocated for bus
>>>> -	 *           allocate and init context
>>>> -	 *           set present in root table for this bus
>>>> -	 *     init context with domain, translation etc
>>>> -	 *    endfor
>>>> -	 * endfor
>>>> -	 */
>>>> -	pr_info("Setting RMRR:\n");
>>>> -	for_each_rmrr_units(rmrr) {
>>>> -		/* some BIOS lists non-exist devices in DMAR table. */
>>>> -		for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
>>>> -					  i, dev) {
>>>> -			ret = iommu_prepare_rmrr_dev(rmrr, dev);
>>>> -			if (ret)
>>>> -				pr_err("Mapping reserved region failed\n");
>>>> -		}
>>>> -	}
>>>> -
>>>> -	iommu_prepare_isa();
>>> 
>>> Why do you want to remove this segment of code?
>> This will only work if the lazy allocation of domains exists, these
>> mappings will disappear once a default domain is attached to a device and
>> then remade by iommu_group_create_direct_mappings. This code is redundant
>> and removing it allows us to remove all the lazy allocation logic.
> 
> Not exactly.
> 
> We need to setup the rmrr mapping before enabling dma remapping,
> otherwise, there will be a window after dma remapping enabling and
> rmrr getting mapped, during which people might see dma faults.

Do you think this patch instead should be a refactoring to simplify this 
initial domain setup before the default domain takes over? It seems like 
duplicated effort to have both lazy allocated domains and externally 
managed domains. We could allocate a domain here for any devices with 
RMRR and call get_resv_regions to avoid duplicating the RMRR loop.

> 
>> iommu_prepare_isa does need moving to get_resv_regions for its mappings to
>> be applied, this will need some refactoring.
> 
> Yes, agree.
> 
>>> 
>>>>    domains_done:
>>>>  @@ -3580,53 +3440,6 @@ static unsigned long intel_alloc_iova(struct device *dev,
>>>>  	return iova_pfn;
>>>>  }
>>>>  -struct dmar_domain *get_valid_domain_for_dev(struct device *dev)
>>>> -{
>>>> -	struct dmar_domain *domain, *tmp;
>>>> -	struct dmar_rmrr_unit *rmrr;
>>>> -	struct device *i_dev;
>>>> -	int i, ret;
>>>> -
>>>> -	domain = find_domain(dev);
>>>> -	if (domain)
>>>> -		goto out;
>>>> -
>>>> -	domain = find_or_alloc_domain(dev, DEFAULT_DOMAIN_ADDRESS_WIDTH);
>>>> -	if (!domain)
>>>> -		goto out;
>>>> -
>>>> -	/* We have a new domain - setup possible RMRRs for the device */
>>>> -	rcu_read_lock();
>>>> -	for_each_rmrr_units(rmrr) {
>>>> -		for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
>>>> -					  i, i_dev) {
>>>> -			if (i_dev != dev)
>>>> -				continue;
>>>> -
>>>> -			ret = domain_prepare_identity_map(dev, domain,
>>>> -							  rmrr->base_address,
>>>> -							  rmrr->end_address);
>>>> -			if (ret)
>>>> -				dev_err(dev, "Mapping reserved region failed\n");
>>> 
>>> We can't simply remove this segment of code, right? At least it should
>>> be moved to the domain allocation interface.
>> iommu_group_create_direct_mappings will take care of these mappings, this
>> code is not used once an externally managed domain(group domain) is
>> attached to the device.
> 
> Yes, this seems to be duplicated even with lazy domain allocation.
> 
> Best regards,
> Lu Baolu
> 
>>> 
>>>> -		}
>>>> -	}
>>>> -	rcu_read_unlock();
>>>> -
>>>> -	tmp = set_domain_for_dev(dev, domain);
>>>> -	if (!tmp || domain != tmp) {
>>>> -		domain_exit(domain);
>>>> -		domain = tmp;
>>>> -	}
>>>> -
>>>> -out:
>>>> -
>>>> -	if (!domain)
>>>> -		pr_err("Allocating domain for %s failed\n", dev_name(dev));
>>>> -
>>>> -
>>>> -	return domain;
>>>> -}
>>>> -
>>>>  /* Check if the dev needs to go through non-identity map and unmap process.*/
>>>>  static int iommu_no_mapping(struct device *dev)
>>>>  {
>>>> @@ -3689,7 +3502,7 @@ static dma_addr_t __intel_map_page(struct device *dev, struct page *page,
>>>>  	if (iommu_no_mapping(dev))
>>>>  		return paddr;
>>>>  -	domain = get_valid_domain_for_dev(dev);
>>>> +	domain = find_domain(dev);
>>>>  	if (!domain)
>>>>  		return DMA_MAPPING_ERROR;
>>>>  @@ -3753,7 +3566,8 @@ static void intel_unmap(struct device *dev, dma_addr_t dev_addr, size_t size)
>>>>  		return;
>>>>    	domain = find_domain(dev);
>>>> -	BUG_ON(!domain);
>>>> +	if (!domain)
>>>> +		return;
>>>>  
>>> 
>>> This is not related to this patch. Let's do it in a separated patch.
>>> 
>>>>  	iommu = domain_get_iommu(domain);
>>>>  @@ -3899,7 +3713,7 @@ static int intel_map_sg(struct device *dev, struct scatterlist *sglist, int nele
>>>>  	if (iommu_no_mapping(dev))
>>>>  		return intel_nontranslate_map_sg(dev, sglist, nelems, dir);
>>>>  -	domain = get_valid_domain_for_dev(dev);
>>>> +	domain = find_domain(dev);
>>>>  	if (!domain)
>>>>  		return 0;
>>>>  @@ -5377,9 +5191,9 @@ int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct intel_svm_dev *sd
>>>>  	u64 ctx_lo;
>>>>  	int ret;
>>>>  -	domain = get_valid_domain_for_dev(sdev->dev);
>>>> +	domain = find_domain(sdev->dev);
>>>>  	if (!domain)
>>>> -		return -EINVAL;
>>>> +		return -ENOMEM;
>>> 
>>> This is not related to this patch. Let's do it in a separated patch.
>>> 
>>>>    	spin_lock_irqsave(&device_domain_lock, flags);
>>>>  	spin_lock(&iommu->lock);
>>> 
>>> Best regards,
>>> Lu Baolu

Cheers,
James.



* Re: [PATCH 4/4] iommu/vt-d: Remove lazy allocation of domains
  2019-03-06 18:08             ` James Sewart
@ 2019-03-07  6:31               ` Lu Baolu
  2019-03-07 10:21                 ` James Sewart
  0 siblings, 1 reply; 55+ messages in thread
From: Lu Baolu @ 2019-03-07  6:31 UTC (permalink / raw)
  To: James Sewart
  Cc: baolu.lu, iommu, Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel

Hi James,

On 3/7/19 2:08 AM, James Sewart wrote:
>>>>> -	/*
>>>>> -	 * For each rmrr
>>>>> -	 *   for each dev attached to rmrr
>>>>> -	 *   do
>>>>> -	 *     locate drhd for dev, alloc domain for dev
>>>>> -	 *     allocate free domain
>>>>> -	 *     allocate page table entries for rmrr
>>>>> -	 *     if context not allocated for bus
>>>>> -	 *           allocate and init context
>>>>> -	 *           set present in root table for this bus
>>>>> -	 *     init context with domain, translation etc
>>>>> -	 *    endfor
>>>>> -	 * endfor
>>>>> -	 */
>>>>> -	pr_info("Setting RMRR:\n");
>>>>> -	for_each_rmrr_units(rmrr) {
>>>>> -		/* some BIOS lists non-exist devices in DMAR table. */
>>>>> -		for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
>>>>> -					  i, dev) {
>>>>> -			ret = iommu_prepare_rmrr_dev(rmrr, dev);
>>>>> -			if (ret)
>>>>> -				pr_err("Mapping reserved region failed\n");
>>>>> -		}
>>>>> -	}
>>>>> -
>>>>> -	iommu_prepare_isa();
>>>> Why do you want to remove this segment of code?
>>> This will only work if the lazy allocation of domains exists; these
>>> mappings will disappear once a default domain is attached to a device and
>>> then remade by iommu_group_create_direct_mappings. This code is redundant
>>> and removing it allows us to remove all the lazy allocation logic.
>> Not exactly.
>>
>> We need to setup the rmrr mapping before enabling dma remapping,
>> otherwise, there will be a window after dma remapping enabling and
>> rmrr getting mapped, during which people might see dma faults.
> Do you think this patch instead should be a refactoring to simplify this initial domain setup before the default domain takes over? It seems like duplicated effort to have both lazy allocated domains and externally managed domains. We could allocate a domain here for any devices with RMRR and call get_resv_regions to avoid duplicating RMRR loop.
> 

Agree. We should replace the lazy allocated domains with default domain
in a clean way. Actually, your patches look great to me. But I do think
we need further cleanups. The rmrr code is one example, and the identity
domain (si_domain) is another.

Do you mind if I work on top of your patches for further cleanups and
sign off a v2 together with you?

Best regards,
Lu Baolu


* Re: [PATCH 4/4] iommu/vt-d: Remove lazy allocation of domains
  2019-03-07  6:31               ` Lu Baolu
@ 2019-03-07 10:21                 ` James Sewart
  2019-03-08  1:09                   ` Lu Baolu
  2019-03-08  3:09                   ` Lu Baolu
  0 siblings, 2 replies; 55+ messages in thread
From: James Sewart @ 2019-03-07 10:21 UTC (permalink / raw)
  To: Lu Baolu; +Cc: iommu, Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel

Hey Lu,

> On 7 Mar 2019, at 06:31, Lu Baolu <baolu.lu@linux.intel.com> wrote:
> 
> Hi James,
> 
> On 3/7/19 2:08 AM, James Sewart wrote:
>>>>>> -	/*
>>>>>> -	 * For each rmrr
>>>>>> -	 *   for each dev attached to rmrr
>>>>>> -	 *   do
>>>>>> -	 *     locate drhd for dev, alloc domain for dev
>>>>>> -	 *     allocate free domain
>>>>>> -	 *     allocate page table entries for rmrr
>>>>>> -	 *     if context not allocated for bus
>>>>>> -	 *           allocate and init context
>>>>>> -	 *           set present in root table for this bus
>>>>>> -	 *     init context with domain, translation etc
>>>>>> -	 *    endfor
>>>>>> -	 * endfor
>>>>>> -	 */
>>>>>> -	pr_info("Setting RMRR:\n");
>>>>>> -	for_each_rmrr_units(rmrr) {
>>>>>> -		/* some BIOS lists non-exist devices in DMAR table. */
>>>>>> -		for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
>>>>>> -					  i, dev) {
>>>>>> -			ret = iommu_prepare_rmrr_dev(rmrr, dev);
>>>>>> -			if (ret)
>>>>>> -				pr_err("Mapping reserved region failed\n");
>>>>>> -		}
>>>>>> -	}
>>>>>> -
>>>>>> -	iommu_prepare_isa();
>>>>> Why do you want to remove this segment of code?
>>>> This will only work if the lazy allocation of domains exists; these
>>>> mappings will disappear once a default domain is attached to a device and
>>>> then remade by iommu_group_create_direct_mappings. This code is redundant
>>>> and removing it allows us to remove all the lazy allocation logic.
>>> Not exactly.
>>> 
>>> We need to setup the rmrr mapping before enabling dma remapping,
>>> otherwise, there will be a window after dma remapping enabling and
>>> rmrr getting mapped, during which people might see dma faults.
>> Do you think this patch instead should be a refactoring to simplify this initial domain setup before the default domain takes over? It seems like duplicated effort to have both lazy allocated domains and externally managed domains. We could allocate a domain here for any devices with RMRR and call get_resv_regions to avoid duplicating RMRR loop.
> 
> Agree. We should replace the lazy allocated domains with default domain
> in a clean way. Actually, your patches look great to me. But I do think
> we need further cleanups. The rmrr code is one example, and the identity
> domain (si_domain) is another.
> 
> Do you mind if I work on top of your patches for further cleanups and
> sign off a v2 together with you?

Sure, sounds good. I’ll fixup patch 3 and have a go at integrating 
iommu_prepare_isa into get_resv_regions. This should make the initial 
domain logic here quite concise.

> 
> Best regards,
> Lu Baolu

Cheers,
James.


* Re: [PATCH 4/4] iommu/vt-d: Remove lazy allocation of domains
  2019-03-07 10:21                 ` James Sewart
@ 2019-03-08  1:09                   ` Lu Baolu
  2019-03-08  3:09                   ` Lu Baolu
  1 sibling, 0 replies; 55+ messages in thread
From: Lu Baolu @ 2019-03-08  1:09 UTC (permalink / raw)
  To: James Sewart
  Cc: baolu.lu, iommu, Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel

Hi,

On 3/7/19 6:21 PM, James Sewart wrote:
> Hey Lu,
> 
>> On 7 Mar 2019, at 06:31, Lu Baolu <baolu.lu@linux.intel.com> wrote:
>>
>> Hi James,
>>
>> On 3/7/19 2:08 AM, James Sewart wrote:
>>>>>>> -	/*
>>>>>>> -	 * For each rmrr
>>>>>>> -	 *   for each dev attached to rmrr
>>>>>>> -	 *   do
>>>>>>> -	 *     locate drhd for dev, alloc domain for dev
>>>>>>> -	 *     allocate free domain
>>>>>>> -	 *     allocate page table entries for rmrr
>>>>>>> -	 *     if context not allocated for bus
>>>>>>> -	 *           allocate and init context
>>>>>>> -	 *           set present in root table for this bus
>>>>>>> -	 *     init context with domain, translation etc
>>>>>>> -	 *    endfor
>>>>>>> -	 * endfor
>>>>>>> -	 */
>>>>>>> -	pr_info("Setting RMRR:\n");
>>>>>>> -	for_each_rmrr_units(rmrr) {
>>>>>>> -		/* some BIOS lists non-exist devices in DMAR table. */
>>>>>>> -		for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
>>>>>>> -					  i, dev) {
>>>>>>> -			ret = iommu_prepare_rmrr_dev(rmrr, dev);
>>>>>>> -			if (ret)
>>>>>>> -				pr_err("Mapping reserved region failed\n");
>>>>>>> -		}
>>>>>>> -	}
>>>>>>> -
>>>>>>> -	iommu_prepare_isa();
>>>>>> Why do you want to remove this segment of code?
>>>>> This will only work if the lazy allocation of domains exists; these
>>>>> mappings will disappear once a default domain is attached to a device and
>>>>> then remade by iommu_group_create_direct_mappings. This code is redundant
>>>>> and removing it allows us to remove all the lazy allocation logic.
>>>> Not exactly.
>>>>
>>>> We need to setup the rmrr mapping before enabling dma remapping,
>>>> otherwise, there will be a window after dma remapping enabling and
>>>> rmrr getting mapped, during which people might see dma faults.
>>> Do you think this patch instead should be a refactoring to simplify this initial domain setup before the default domain takes over? It seems like duplicated effort to have both lazy allocated domains and externally managed domains. We could allocate a domain here for any devices with RMRR and call get_resv_regions to avoid duplicating RMRR loop.
>>
>> Agree. We should replace the lazy allocated domains with default domain
>> in a clean way. Actually, your patches look great to me. But I do think
>> we need further cleanups. The rmrr code is one example, and the identity
>> domain (si_domain) is another.
>>
>> Do you mind if I work on top of your patches for further cleanups and
>> sign off a v2 together with you?
> 
> Sure, sounds good. I’ll fixup patch 3 and have a go at integrating
> iommu_prepare_isa into get_resv_regions. This should make the initial
> domain logic here quite concise.

Very appreciated! Thank you!

Best regards,
Lu Baolu


* Re: [PATCH 3/4] iommu/vt-d: Allow IOMMU_DOMAIN_DMA and IOMMU_DOMAIN_IDENTITY to be allocated
  2019-03-04 15:46   ` [PATCH 3/4] iommu/vt-d: Allow IOMMU_DOMAIN_DMA and IOMMU_DOMAIN_IDENTITY to be allocated James Sewart
  2019-03-04 15:47     ` [PATCH 4/4] iommu/vt-d: Remove lazy allocation of domains James Sewart
  2019-03-05  6:46     ` [PATCH 3/4] iommu/vt-d: Allow IOMMU_DOMAIN_DMA and IOMMU_DOMAIN_IDENTITY to be allocated Lu Baolu
@ 2019-03-08  1:20     ` Dmitry Safonov
  2019-03-09 11:57       ` James Sewart
  2 siblings, 1 reply; 55+ messages in thread
From: Dmitry Safonov @ 2019-03-08  1:20 UTC (permalink / raw)
  To: James Sewart, iommu; +Cc: Tom Murphy, Jacob Pan, linux-kernel, Lu Baolu

On 3/4/19 3:46 PM, James Sewart wrote:
> +static inline int domain_is_initialised(struct dmar_domain *domain)
> +{
> +	return domain->flags & DOMAIN_FLAG_INITIALISED;
> +}

Maybe check it in intel_iommu_domain_free(), eh?

Thanks,
          Dmitry
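A minimal standalone sketch of what Dmitry suggests might look like the following. This is a toy model, not the real driver: the flag value, the struct layout, and the return-value bookkeeping are invented for illustration; only the pattern (skip full teardown for a never-initialised domain in the free path) reflects the suggestion.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model (not kernel code) of checking DOMAIN_FLAG_INITIALISED in
 * the domain-free path, so a domain whose initialisation was deferred
 * to attach time is not torn down as if it were fully set up. */

#define DOMAIN_FLAG_INITIALISED (1 << 2)	/* illustrative bit */

struct dmar_domain {
	int flags;
};

static int domain_is_initialised(struct dmar_domain *domain)
{
	return domain->flags & DOMAIN_FLAG_INITIALISED;
}

/* Returns 1 if full teardown (page tables etc.) was performed,
 * 0 if only the bare allocation was released. */
static int domain_free(struct dmar_domain *domain)
{
	int teardown = domain_is_initialised(domain) != 0;

	if (teardown) {
		/* a domain_exit()-style cleanup would go here */
	}
	free(domain);
	return teardown;
}
```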


* Re: [PATCH 4/4] iommu/vt-d: Remove lazy allocation of domains
  2019-03-07 10:21                 ` James Sewart
  2019-03-08  1:09                   ` Lu Baolu
@ 2019-03-08  3:09                   ` Lu Baolu
  2019-03-08 16:57                     ` James Sewart
  1 sibling, 1 reply; 55+ messages in thread
From: Lu Baolu @ 2019-03-08  3:09 UTC (permalink / raw)
  To: James Sewart
  Cc: baolu.lu, iommu, Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 4290 bytes --]

Hi James,

On 3/7/19 6:21 PM, James Sewart wrote:
> Hey Lu,
> 
>> On 7 Mar 2019, at 06:31, Lu Baolu <baolu.lu@linux.intel.com> wrote:
>>
>> Hi James,
>>
>> On 3/7/19 2:08 AM, James Sewart wrote:
>>>>>>> -	/*
>>>>>>> -	 * For each rmrr
>>>>>>> -	 *   for each dev attached to rmrr
>>>>>>> -	 *   do
>>>>>>> -	 *     locate drhd for dev, alloc domain for dev
>>>>>>> -	 *     allocate free domain
>>>>>>> -	 *     allocate page table entries for rmrr
>>>>>>> -	 *     if context not allocated for bus
>>>>>>> -	 *           allocate and init context
>>>>>>> -	 *           set present in root table for this bus
>>>>>>> -	 *     init context with domain, translation etc
>>>>>>> -	 *    endfor
>>>>>>> -	 * endfor
>>>>>>> -	 */
>>>>>>> -	pr_info("Setting RMRR:\n");
>>>>>>> -	for_each_rmrr_units(rmrr) {
>>>>>>> -		/* some BIOS lists non-exist devices in DMAR table. */
>>>>>>> -		for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
>>>>>>> -					  i, dev) {
>>>>>>> -			ret = iommu_prepare_rmrr_dev(rmrr, dev);
>>>>>>> -			if (ret)
>>>>>>> -				pr_err("Mapping reserved region failed\n");
>>>>>>> -		}
>>>>>>> -	}
>>>>>>> -
>>>>>>> -	iommu_prepare_isa();
>>>>>> Why do you want to remove this segment of code?
>>>>> This will only work if the lazy allocation of domains exists; these
>>>>> mappings will disappear once a default domain is attached to a device and
>>>>> then remade by iommu_group_create_direct_mappings. This code is redundant
>>>>> and removing it allows us to remove all the lazy allocation logic.
>>>> Not exactly.
>>>>
>>>> We need to setup the rmrr mapping before enabling dma remapping,
>>>> otherwise, there will be a window after dma remapping enabling and
>>>> rmrr getting mapped, during which people might see dma faults.
>>> Do you think this patch instead should be a refactoring to simplify this initial domain setup before the default domain takes over? It seems like duplicated effort to have both lazy allocated domains and externally managed domains. We could allocate a domain here for any devices with RMRR and call get_resv_regions to avoid duplicating RMRR loop.
>>
>> Agree. We should replace the lazy allocated domains with default domain
>> in a clean way. Actually, your patches look great to me. But I do think
>> we need further cleanups. The rmrr code is one example, and the identity
>> domain (si_domain) is another.
>>
>> Do you mind if I work on top of your patches for further cleanups and
>> sign off a v2 together with you?
> 
> Sure, sounds good. I’ll fixup patch 3 and have a go at integrating
> iommu_prepare_isa into get_resv_regions. This should make the initial
> domain logic here quite concise.
> 

Attached are three extra patches which I think should be added before
PATCH 3/4, plus some further cleanup changes which you can merge with
PATCH 4/4.

----------------

0006-iommu-Add-ops-entry-for-vendor-specific-default-doma.patch
0007-iommu-vt-d-Add-is_identity_map-ops-entry.patch

These two patches add a generic method for vendor-specific iommu
drivers to specify the type of the default domain for a device. The
Intel iommu driver will register an op for this, since it has long had
its own identity map adjudicator.

----------------

0008-iommu-vt-d-Enable-DMA-remapping-after-rmrr-mapped.patch

This addresses the requirement that the rmrr identity maps be in place
before DMA remapping is enabled. With the default domain mechanism, the
default domain is allocated and attached to each device in
bus_set_iommu() during boot, so moving enablement of the DMA remapping
engines to after bus_set_iommu() fixes the rmrr issue.
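The ordering argument can be sketched with a toy model (plain C, not kernel code; the event names and helpers are invented): if translation is enabled before the reserved regions are mapped, there is a window in which DMA to an RMRR region faults; mapping first closes that window.

```c
#include <assert.h>
#include <string.h>

/* Toy model (not kernel code) of the boot-ordering change in patch
 * 0008: record the order of two events and detect whether a window
 * exists where translation is on but RMRRs are not yet mapped. */

enum { MAX_EVENTS = 8 };

struct boot_log {
	const char *events[MAX_EVENTS];
	int count;
};

static void record(struct boot_log *log, const char *ev)
{
	if (log->count < MAX_EVENTS)
		log->events[log->count++] = ev;
}

/* Old order: translation enabled in init_dmars(), before
 * bus_set_iommu() attaches default domains and maps RMRRs. */
static void boot_old(struct boot_log *log)
{
	record(log, "enable-translation");
	record(log, "map-rmrr");
}

/* New order: RMRRs mapped via the default domain first,
 * translation enabled last. */
static void boot_new(struct boot_log *log)
{
	record(log, "map-rmrr");
	record(log, "enable-translation");
}

/* 1 if translation was enabled while the RMRRs were still unmapped. */
static int has_fault_window(const struct boot_log *log)
{
	int enabled_at = -1, mapped_at = -1;
	int i;

	for (i = 0; i < log->count; i++) {
		if (!strcmp(log->events[i], "enable-translation"))
			enabled_at = i;
		else if (!strcmp(log->events[i], "map-rmrr"))
			mapped_at = i;
	}
	return enabled_at >= 0 && (mapped_at < 0 || enabled_at < mapped_at);
}
```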

----------------

0009-return-si_domain-directly.patch

I suggest we keep the current si_domain implementation, since changing
si_domain is not the purpose of this patch set. Please merge this with
PATCH 3/4 if you like it.

----------------

0010-iommu-vt-d-remove-floopy-workaround.patch
0011-iommu-vt-d-remove-prepare-rmrr-helpers.patch
0012-iommu-vt-d-remove-prepare-identity-map-during-boot.patch

The patches above are further cleanups resulting from the switch to the
default domain. Please consider merging them with PATCH 4/4.

----------------

I have done some sanity checks on my local machine against all patches.
I can do more tests after you submit a v2.


Best regards,
Lu Baolu



[-- Attachment #2: 0006-iommu-Add-ops-entry-for-vendor-specific-default-doma.patch --]
[-- Type: text/x-patch, Size: 2502 bytes --]

From d1625f1e8461cef23bc8697ec51382c47b92fa5a Mon Sep 17 00:00:00 2001
From: Lu Baolu <baolu.lu@linux.intel.com>
Date: Thu, 7 Mar 2019 10:50:27 +0800
Subject: [PATCH 06/12] iommu: Add ops entry for vendor specific default domain
 type

This adds an iommu ops entry that lets an iommu vendor driver
specify the type of the default domain for each device. This is
needed when a vendor driver, which has long had its own logic
for deciding when to use an identity domain, switches over to
the default domain mechanism.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/iommu.c | 12 +++++++++---
 include/linux/iommu.h |  4 ++++
 2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 374327018a11..4101f38a7844 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1102,12 +1102,18 @@ struct iommu_group *iommu_group_get_for_dev(struct device *dev)
 	 */
 	if (!group->default_domain) {
 		struct iommu_domain *dom;
+		int domain_type = iommu_def_domain_type;
 
-		dom = __iommu_domain_alloc(dev->bus, iommu_def_domain_type);
-		if (!dom && iommu_def_domain_type != IOMMU_DOMAIN_DMA) {
+		if (ops->is_identity_map)
+			domain_type = ops->is_identity_map(dev) ?
+				      IOMMU_DOMAIN_IDENTITY :
+				      IOMMU_DOMAIN_DMA;
+
+		dom = __iommu_domain_alloc(dev->bus, domain_type);
+		if (!dom && domain_type != IOMMU_DOMAIN_DMA) {
 			dev_warn(dev,
 				 "failed to allocate default IOMMU domain of type %u; falling back to IOMMU_DOMAIN_DMA",
-				 iommu_def_domain_type);
+				 domain_type);
 			dom = __iommu_domain_alloc(dev->bus, IOMMU_DOMAIN_DMA);
 		}
 
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index e90da6b6f3d1..8e298fd631d0 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -184,6 +184,7 @@ struct iommu_resv_region {
  * @domain_window_disable: Disable a particular window for a domain
  * @of_xlate: add OF master IDs to iommu grouping
  * @pgsize_bitmap: bitmap of all possible supported page sizes
+ * @is_identity_map: vendor customized identity map adjudicator
  */
 struct iommu_ops {
 	bool (*capable)(enum iommu_cap);
@@ -226,6 +227,9 @@ struct iommu_ops {
 	int (*of_xlate)(struct device *dev, struct of_phandle_args *args);
 	bool (*is_attach_deferred)(struct iommu_domain *domain, struct device *dev);
 
+	/* Vendor customized identity map adjudicator */
+	bool (*is_identity_map)(struct device *dev);
+
 	unsigned long pgsize_bitmap;
 };
 
-- 
2.17.1


[-- Attachment #3: 0007-iommu-vt-d-Add-is_identity_map-ops-entry.patch --]
[-- Type: text/x-patch, Size: 1299 bytes --]

From c178d3d7033ca35e3302e1e6c8d2a4d906814b9a Mon Sep 17 00:00:00 2001
From: Lu Baolu <baolu.lu@linux.intel.com>
Date: Thu, 7 Mar 2019 12:53:12 +0800
Subject: [PATCH 07/12] iommu/vt-d: Add is_identity_map ops entry

This adds the is_identity_map ops entry for Intel IOMMU driver.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel-iommu.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 282257e2628d..eb8aca8d1dc9 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5181,6 +5181,11 @@ static void intel_iommu_apply_resv_region(struct device *dev,
 	WARN_ON_ONCE(reserve_iova(&dmar_domain->iovad, start, end) == NULL);
 }
 
+static bool intel_iommu_is_identity_map(struct device *dev)
+{
+	return iommu_should_identity_map(dev, 1);
+}
+
 #ifdef CONFIG_INTEL_IOMMU_SVM
 int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct intel_svm_dev *sdev)
 {
@@ -5276,6 +5281,7 @@ const struct iommu_ops intel_iommu_ops = {
 	.put_resv_regions	= intel_iommu_put_resv_regions,
 	.apply_resv_region	= intel_iommu_apply_resv_region,
 	.device_group		= pci_device_group,
+	.is_identity_map	= intel_iommu_is_identity_map,
 	.pgsize_bitmap		= INTEL_IOMMU_PGSIZES,
 };
 
-- 
2.17.1


[-- Attachment #4: 0008-iommu-vt-d-Enable-DMA-remapping-after-rmrr-mapped.patch --]
[-- Type: text/x-patch, Size: 1937 bytes --]

From 7d60445b9ba0c447a3f027470bed7a7c680f73c9 Mon Sep 17 00:00:00 2001
From: Lu Baolu <baolu.lu@linux.intel.com>
Date: Thu, 7 Mar 2019 13:14:43 +0800
Subject: [PATCH 08/12] iommu/vt-d: Enable DMA remapping after rmrr mapped

The rmrr devices require an identity map of the rmrr regions
before DMA remapping is enabled. Otherwise, there will be a
window during which DMA from/to the rmrr regions is blocked. To
alleviate this, we move enabling of DMA remapping to after all
rmrr regions are mapped.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel-iommu.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index eb8aca8d1dc9..9641d75cbfa3 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3385,11 +3385,6 @@ static int __init init_dmars(void)
 		ret = dmar_set_interrupt(iommu);
 		if (ret)
 			goto free_iommu;
-
-		if (!translation_pre_enabled(iommu))
-			iommu_enable_translation(iommu);
-
-		iommu_disable_protect_mem_regions(iommu);
 	}
 
 	return 0;
@@ -4733,7 +4728,6 @@ int __init intel_iommu_init(void)
 		goto out_free_reserved_range;
 	}
 	up_write(&dmar_global_lock);
-	pr_info("Intel(R) Virtualization Technology for Directed I/O\n");
 
 #if defined(CONFIG_X86) && defined(CONFIG_SWIOTLB)
 	swiotlb = 0;
@@ -4756,6 +4750,16 @@ int __init intel_iommu_init(void)
 		register_memory_notifier(&intel_iommu_memory_nb);
 	cpuhp_setup_state(CPUHP_IOMMU_INTEL_DEAD, "iommu/intel:dead", NULL,
 			  intel_iommu_cpu_dead);
+
+	/* Finally, we enable the DMA remapping hardware. */
+	for_each_iommu(iommu, drhd) {
+		if (!translation_pre_enabled(iommu))
+			iommu_enable_translation(iommu);
+
+		iommu_disable_protect_mem_regions(iommu);
+	}
+	pr_info("Intel(R) Virtualization Technology for Directed I/O\n");
+
 	intel_iommu_enabled = 1;
 	intel_iommu_debugfs_init();
 
-- 
2.17.1


[-- Attachment #5: 0009-return-si_domain-directly.patch --]
[-- Type: text/x-patch, Size: 2188 bytes --]

From 0df12098d4f86354eea458497f171e77f6cf9dee Mon Sep 17 00:00:00 2001
From: Lu Baolu <baolu.lu@linux.intel.com>
Date: Thu, 7 Mar 2019 16:28:57 +0800
Subject: [PATCH 09/12] return si_domain directly

---
 drivers/iommu/intel-iommu.c | 29 ++++++++---------------------
 1 file changed, 8 insertions(+), 21 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 9641d75cbfa3..b380818878c7 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -4875,12 +4875,10 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type)
 	case IOMMU_DOMAIN_UNMANAGED:
 		flags |= DOMAIN_FLAG_VIRTUAL_MACHINE | DOMAIN_FLAG_INITIALISED;
 		dmar_domain = alloc_domain(flags);
-		if (!dmar_domain) {
-			pr_err("Can't allocate dmar_domain\n");
+		if (!dmar_domain)
 			return NULL;
-		}
+
 		if (md_domain_init(dmar_domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) {
-			pr_err("Domain initialization failed\n");
 			domain_exit(dmar_domain);
 			return NULL;
 		}
@@ -4892,30 +4890,15 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type)
 		break;
 	case IOMMU_DOMAIN_DMA:
 		dmar_domain = alloc_domain(flags);
-		if (!dmar_domain) {
-			pr_err("Can't allocate dmar_domain\n");
+		if (!dmar_domain)
 			return NULL;
-		}
-		// init domain in device attach when we know IOMMU capabilities
 		break;
 	case IOMMU_DOMAIN_IDENTITY:
-		flags |= DOMAIN_FLAG_STATIC_IDENTITY | DOMAIN_FLAG_INITIALISED;
-		dmar_domain = alloc_domain(flags);
-		if (!dmar_domain) {
-			pr_err("Can't allocate dmar_domain\n");
-			return NULL;
-		}
-		if (si_domain_init(dmar_domain, 0)) {
-			pr_err("Domain initialization failed\n");
-			domain_exit(dmar_domain);
-			return NULL;
-		}
-		break;
+		return &si_domain->domain;
 	default:
 		return NULL;
 	}
 
-	dmar_domain->domain.type = type;
 	return &dmar_domain->domain;
 }
 
@@ -5009,6 +4992,10 @@ static int intel_iommu_map(struct iommu_domain *domain,
 	int prot = 0;
 	int ret;
 
+	/* Don't bother if hardware passthrough used. */
+	if (dmar_domain == si_domain && hw_pass_through)
+		return 0;
+
 	if (iommu_prot & IOMMU_READ)
 		prot |= DMA_PTE_READ;
 	if (iommu_prot & IOMMU_WRITE)
-- 
2.17.1


[-- Attachment #6: 0010-iommu-vt-d-remove-floopy-workaround.patch --]
[-- Type: text/x-patch, Size: 1329 bytes --]

From 3c6f73cf3e37f7691150f4639a9012f22ab97ead Mon Sep 17 00:00:00 2001
From: Lu Baolu <baolu.lu@linux.intel.com>
Date: Fri, 8 Mar 2019 09:20:02 +0800
Subject: [PATCH 10/12] iommu/vt-d: remove floopy workaround

---
 drivers/iommu/intel-iommu.c | 25 -------------------------
 1 file changed, 25 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index b380818878c7..ceea9be31063 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -2687,31 +2687,6 @@ static inline int iommu_prepare_rmrr_dev(struct dmar_rmrr_unit *rmrr,
 					  rmrr->end_address);
 }
 
-#ifdef CONFIG_INTEL_IOMMU_FLOPPY_WA
-static inline void iommu_prepare_isa(void)
-{
-	struct pci_dev *pdev;
-	int ret;
-
-	pdev = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, NULL);
-	if (!pdev)
-		return;
-
-	pr_info("Prepare 0-16MiB unity mapping for LPC\n");
-	ret = iommu_prepare_identity_map(&pdev->dev, 0, 16*1024*1024 - 1);
-
-	if (ret)
-		pr_err("Failed to create 0-16MiB identity map - floppy might not work\n");
-
-	pci_dev_put(pdev);
-}
-#else
-static inline void iommu_prepare_isa(void)
-{
-	return;
-}
-#endif /* !CONFIG_INTEL_IOMMU_FLPY_WA */
-
 static int md_domain_init(struct dmar_domain *domain, int guest_width);
 
 static int __init si_domain_init(struct dmar_domain *si_domain, int hw)
-- 
2.17.1


[-- Attachment #7: 0011-iommu-vt-d-remove-prepare-rmrr-helpers.patch --]
[-- Type: text/x-patch, Size: 2845 bytes --]

From fcd893df02b2288c6ee48224d408c30786a48d68 Mon Sep 17 00:00:00 2001
From: Lu Baolu <baolu.lu@linux.intel.com>
Date: Fri, 8 Mar 2019 09:26:27 +0800
Subject: [PATCH 11/12] iommu/vt-d: remove prepare rmrr helpers

---
 drivers/iommu/intel-iommu.c | 67 -------------------------------------
 1 file changed, 67 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index ceea9be31063..aa33f65b32cc 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -2620,73 +2620,6 @@ static int iommu_domain_identity_map(struct dmar_domain *domain,
 				DMA_PTE_READ|DMA_PTE_WRITE);
 }
 
-static int domain_prepare_identity_map(struct device *dev,
-				       struct dmar_domain *domain,
-				       unsigned long long start,
-				       unsigned long long end)
-{
-	/* For _hardware_ passthrough, don't bother. But for software
-	   passthrough, we do it anyway -- it may indicate a memory
-	   range which is reserved in E820, so which didn't get set
-	   up to start with in si_domain */
-	if (domain == si_domain && hw_pass_through) {
-		pr_warn("Ignoring identity map for HW passthrough device %s [0x%Lx - 0x%Lx]\n",
-			dev_name(dev), start, end);
-		return 0;
-	}
-
-	pr_info("Setting identity map for device %s [0x%Lx - 0x%Lx]\n",
-		dev_name(dev), start, end);
-
-	if (end < start) {
-		WARN(1, "Your BIOS is broken; RMRR ends before it starts!\n"
-			"BIOS vendor: %s; Ver: %s; Product Version: %s\n",
-			dmi_get_system_info(DMI_BIOS_VENDOR),
-			dmi_get_system_info(DMI_BIOS_VERSION),
-		     dmi_get_system_info(DMI_PRODUCT_VERSION));
-		return -EIO;
-	}
-
-	if (end >> agaw_to_width(domain->agaw)) {
-		WARN(1, "Your BIOS is broken; RMRR exceeds permitted address width (%d bits)\n"
-		     "BIOS vendor: %s; Ver: %s; Product Version: %s\n",
-		     agaw_to_width(domain->agaw),
-		     dmi_get_system_info(DMI_BIOS_VENDOR),
-		     dmi_get_system_info(DMI_BIOS_VERSION),
-		     dmi_get_system_info(DMI_PRODUCT_VERSION));
-		return -EIO;
-	}
-
-	return iommu_domain_identity_map(domain, start, end);
-}
-
-static int iommu_prepare_identity_map(struct device *dev,
-				      unsigned long long start,
-				      unsigned long long end)
-{
-	struct dmar_domain *domain;
-	int ret;
-
-	domain = find_domain(dev);
-	if (!domain)
-		return -ENOMEM;
-
-	ret = domain_prepare_identity_map(dev, domain, start, end);
-	if (ret)
-		domain_exit(domain);
-
-	return ret;
-}
-
-static inline int iommu_prepare_rmrr_dev(struct dmar_rmrr_unit *rmrr,
-					 struct device *dev)
-{
-	if (dev->archdata.iommu == DUMMY_DEVICE_DOMAIN_INFO)
-		return 0;
-	return iommu_prepare_identity_map(dev, rmrr->base_address,
-					  rmrr->end_address);
-}
-
 static int md_domain_init(struct dmar_domain *domain, int guest_width);
 
 static int __init si_domain_init(struct dmar_domain *si_domain, int hw)
-- 
2.17.1


[-- Attachment #8: 0012-iommu-vt-d-remove-prepare-identity-map-during-boot.patch --]
[-- Type: text/x-patch, Size: 1080 bytes --]

From 3d02365161a3b1441aaf1d36940758df2b4f2b1c Mon Sep 17 00:00:00 2001
From: Lu Baolu <baolu.lu@linux.intel.com>
Date: Fri, 8 Mar 2019 09:34:39 +0800
Subject: [PATCH 12/12] iommu/vt-d: remove prepare identity map during boot

---
 drivers/iommu/intel-iommu.c | 10 +---------
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index aa33f65b32cc..e7a94bb927c4 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -2839,18 +2839,10 @@ static int __init dev_prepare_static_identity_mapping(struct device *dev, int hw
 
 static int __init iommu_prepare_static_identity_mapping(int hw)
 {
-	struct pci_dev *pdev = NULL;
 	struct dmar_drhd_unit *drhd;
 	struct intel_iommu *iommu;
 	struct device *dev;
-	int i;
-	int ret = 0;
-
-	for_each_pci_dev(pdev) {
-		ret = dev_prepare_static_identity_mapping(&pdev->dev, hw);
-		if (ret)
-			return ret;
-	}
+	int i, ret = 0;
 
 	for_each_active_iommu(iommu, drhd)
 		for_each_active_dev_scope(drhd->devices, drhd->devices_cnt, i, dev) {
-- 
2.17.1



* Re: [PATCH 4/4] iommu/vt-d: Remove lazy allocation of domains
  2019-03-08  3:09                   ` Lu Baolu
@ 2019-03-08 16:57                     ` James Sewart
  2019-03-09  1:53                       ` Lu Baolu
  0 siblings, 1 reply; 55+ messages in thread
From: James Sewart @ 2019-03-08 16:57 UTC (permalink / raw)
  To: Lu Baolu; +Cc: iommu, Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel

Hey Lu,

> On 8 Mar 2019, at 03:09, Lu Baolu <baolu.lu@linux.intel.com> wrote:
> 
> Hi James,
> 
> On 3/7/19 6:21 PM, James Sewart wrote:
>> Hey Lu,
>>> On 7 Mar 2019, at 06:31, Lu Baolu <baolu.lu@linux.intel.com> wrote:
>>> 
>>> Hi James,
>>> 
>>> On 3/7/19 2:08 AM, James Sewart wrote:
>>>>>>>> -	/*
>>>>>>>> -	 * For each rmrr
>>>>>>>> -	 *   for each dev attached to rmrr
>>>>>>>> -	 *   do
>>>>>>>> -	 *     locate drhd for dev, alloc domain for dev
>>>>>>>> -	 *     allocate free domain
>>>>>>>> -	 *     allocate page table entries for rmrr
>>>>>>>> -	 *     if context not allocated for bus
>>>>>>>> -	 *           allocate and init context
>>>>>>>> -	 *           set present in root table for this bus
>>>>>>>> -	 *     init context with domain, translation etc
>>>>>>>> -	 *    endfor
>>>>>>>> -	 * endfor
>>>>>>>> -	 */
>>>>>>>> -	pr_info("Setting RMRR:\n");
>>>>>>>> -	for_each_rmrr_units(rmrr) {
>>>>>>>> -		/* some BIOS lists non-exist devices in DMAR table. */
>>>>>>>> -		for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
>>>>>>>> -					  i, dev) {
>>>>>>>> -			ret = iommu_prepare_rmrr_dev(rmrr, dev);
>>>>>>>> -			if (ret)
>>>>>>>> -				pr_err("Mapping reserved region failed\n");
>>>>>>>> -		}
>>>>>>>> -	}
>>>>>>>> -
>>>>>>>> -	iommu_prepare_isa();
>>>>>>> Why do you want to remove this segment of code?
>>>>>> This will only work if the lazy allocation of domains exists; these
>>>>>> mappings will disappear once a default domain is attached to a device and
>>>>>> then remade by iommu_group_create_direct_mappings. This code is redundant
>>>>>> and removing it allows us to remove all the lazy allocation logic.
>>>>> Not exactly.
>>>>> 
>>>>> We need to setup the rmrr mapping before enabling dma remapping,
>>>>> otherwise, there will be a window after dma remapping enabling and
>>>>> rmrr getting mapped, during which people might see dma faults.
>>>> Do you think this patch instead should be a refactoring to simplify this initial domain setup before the default domain takes over? It seems like duplicated effort to have both lazy allocated domains and externally managed domains. We could allocate a domain here for any devices with RMRR and call get_resv_regions to avoid duplicating RMRR loop.
>>> 
>>> Agree. We should replace the lazy allocated domains with default domain
>>> in a clean way. Actually, your patches look great to me. But I do think
>>> we need further cleanups. The rmrr code is one example, and the identity
>>> domain (si_domain) is another.
>>> 
>>> Do you mind if I work on top of your patches for further cleanups and
>>> sign off a v2 together with you?
>> Sure, sounds good. I’ll fixup patch 3 and have a go at integrating
>> iommu_prepare_isa into get_resv_regions. This should make the initial
>> domain logic here quite concise.
> 
> Attached are three extra patches which I think should be added before
> PATCH 3/4, plus some further cleanup changes which you can merge with
> PATCH 4/4.
> 
> ----------------
> 
> 0006-iommu-Add-ops-entry-for-vendor-specific-default-doma.patch
> 0007-iommu-vt-d-Add-is_identity_map-ops-entry.patch
> 
> These two patches add a generic method for vendor-specific iommu
> drivers to specify the type of the default domain for a device. The
> Intel iommu driver will register an op for this, since it has long had
> its own identity map adjudicator.

This seems like a good idea, but as domain alloc is only called for the
default domain on the first device attached to a group, we may miss
checking whether a device added later should have an identity domain.
Should there be a path to downgrade a group's domain if one of its
devices requires one?
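For illustration, the scenario raised here can be modeled outside the kernel. This is a sketch of the question, not a proposed implementation, and all names are invented: the group's default domain type is fixed by the first device, and a later device with different requirements either forces a re-type or is left mismatched.

```c
#include <assert.h>

/* Toy model (not kernel code) of the concern above: the default
 * domain type is chosen when the first device joins the group, so a
 * later device that needs an identity mapping finds a DMA domain
 * already in place. A hypothetical "downgrade" re-types the group. */

enum domain_type { DOMAIN_TYPE_DMA, DOMAIN_TYPE_IDENTITY };

struct group {
	int has_default;
	enum domain_type type;
};

struct device {
	int needs_identity;	/* an is_identity_map()-style verdict */
};

/* Returns 1 if the group's default domain had to be re-typed to
 * accommodate the new device, 0 otherwise. */
static int group_add_device(struct group *g, const struct device *d)
{
	enum domain_type want =
		d->needs_identity ? DOMAIN_TYPE_IDENTITY : DOMAIN_TYPE_DMA;

	if (!g->has_default) {
		g->has_default = 1;
		g->type = want;
		return 0;
	}
	if (g->type != want) {
		/* hypothetical downgrade path: re-allocate the default
		 * domain with the type the new device requires */
		g->type = want;
		return 1;
	}
	return 0;
}
```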

> 
> ----------------
> 
> 0008-iommu-vt-d-Enable-DMA-remapping-after-rmrr-mapped.patch
> 
> This addresses the requirement that the rmrr identity maps be in place
> before DMA remapping is enabled. With the default domain mechanism, the
> default domain is allocated and attached to each device in
> bus_set_iommu() during boot, so moving enablement of the DMA remapping
> engines to after bus_set_iommu() fixes the rmrr issue.

That's pretty neat; it avoids any temporary domain allocation, nice!

> 
> ----------------
> 
> 0009-return-si_domain-directly.patch
> 
> I suggest that we keep the current si_domain implementation since
> changing si_domain is not the purpose of this patch set. Please merge
> this with PATCH 3/4 if you like it.

Seems reasonable.

> 
> ----------------
> 
> 0010-iommu-vt-d-remove-floopy-workaround.patch
> 0011-iommu-vt-d-remove-prepare-rmrr-helpers.patch
> 0012-iommu-vt-d-remove-prepare-identity-map-during-boot.patch
> 
> The above patches are further cleanups resulting from the switch to the
> default domain. Please consider merging them with PATCH 4/4.

Nice, good catch.

> 
> ----------------
> 
> I have done some sanity checks on my local machine against all patches.
> I can do more tests after you submit a v2.
> 
> 
> Best regards,
> Lu Baolu
> 
> 
> <0006-iommu-Add-ops-entry-for-vendor-specific-default-doma.patch><0007-iommu-vt-d-Add-is_identity_map-ops-entry.patch><0008-iommu-vt-d-Enable-DMA-remapping-after-rmrr-mapped.patch><0009-return-si_domain-directly.patch><0010-iommu-vt-d-remove-floopy-workaround.patch><0011-iommu-vt-d-remove-prepare-rmrr-helpers.patch><0012-iommu-vt-d-remove-prepare-identity-map-during-boot.patch>

I have revised my patches and hope to do some testing before reposting.

Cheers,
James.


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 4/4] iommu/vt-d: Remove lazy allocation of domains
  2019-03-08 16:57                     ` James Sewart
@ 2019-03-09  1:53                       ` Lu Baolu
  2019-03-09 11:49                         ` James Sewart
  0 siblings, 1 reply; 55+ messages in thread
From: Lu Baolu @ 2019-03-09  1:53 UTC (permalink / raw)
  To: James Sewart
  Cc: baolu.lu, iommu, Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel

Hi James,

On 3/9/19 12:57 AM, James Sewart wrote:
> Hey Lu,
> 
>> On 8 Mar 2019, at 03:09, Lu Baolu<baolu.lu@linux.intel.com>  wrote:
>>>>
>>>> Do you mind if I work on top of your patches for further cleanups and
>>>> sign off a v2 together with you?
>>> Sure, sounds good. I’ll fixup patch 3 and have a go at integrating
>>> iommu_prepare_isa into get_resv_regions. This should make the initial
>>> domain logic here quite concise.
>> Here attached three extra patches which I think should be added before
>> PATCH 3/4, and some further cleanup changes which you can merge with
>> PATCH 4/4.
>>
>> ----------------
>>
>> 0006-iommu-Add-ops-entry-for-vendor-specific-default-doma.patch
>> 0007-iommu-vt-d-Add-is_identity_map-ops-entry.patch
>>
>> These two patches aim to add a generic method for vendor-specific iommu
>> drivers to specify the type of the default domain for a device. The Intel
>> iommu driver will register an ops for this since it has had its own
>> identity-map adjudicator for a long time.
> This seems like a good idea, but as domain alloc is only called for the
> default domain on the first device attached to a group, we may miss checking
> whether a device added later should have an identity domain. Should there
> be paths to downgrade a group's domain if one of the devices requires one?
> 

Good catch!

This is supposed to be handled in iommu_no_mapping(). But obviously the
current code sticks to lazy domain allocation. I'm not sure whether
there are any real cases like this, but we should handle it in a clean way.
My idea is that we could downgrade to multiple domains per group (just
like what we have now) in this case and print a kernel message for it.

Or any other thoughts?

Best regards,
Lu Baolu


* Re: [PATCH 4/4] iommu/vt-d: Remove lazy allocation of domains
  2019-03-09  1:53                       ` Lu Baolu
@ 2019-03-09 11:49                         ` James Sewart
  2019-03-10  2:51                           ` Lu Baolu
  0 siblings, 1 reply; 55+ messages in thread
From: James Sewart @ 2019-03-09 11:49 UTC (permalink / raw)
  To: Lu Baolu; +Cc: iommu, Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel

Hey Lu,

> On 9 Mar 2019, at 01:53, Lu Baolu <baolu.lu@linux.intel.com> wrote:
> 
> Hi James,
> 
> On 3/9/19 12:57 AM, James Sewart wrote:
>> Hey Lu,
>>> On 8 Mar 2019, at 03:09, Lu Baolu<baolu.lu@linux.intel.com>  wrote:
>>>>> 
>>>>> Do you mind if I work on top of your patches for further cleanups and
>>>>> sign off a v2 together with you?
>>>> Sure, sounds good. I’ll fixup patch 3 and have a go at integrating
>>>> iommu_prepare_isa into get_resv_regions. This should make the initial
>>>> domain logic here quite concise.
>>> Here attached three extra patches which I think should be added before
>>> PATCH 3/4, and some further cleanup changes which you can merge with
>>> PATCH 4/4.
>>> 
>>> ----------------
>>> 
>>> 0006-iommu-Add-ops-entry-for-vendor-specific-default-doma.patch
>>> 0007-iommu-vt-d-Add-is_identity_map-ops-entry.patch
>>> 
>>> These two patches aim to add a generic method for vendor-specific iommu
>>> drivers to specify the type of the default domain for a device. The Intel
>>> iommu driver will register an ops for this since it has had its own
>>> identity-map adjudicator for a long time.
>> This seems like a good idea, but as domain alloc is only called for the
>> default domain on the first device attached to a group, we may miss checking
>> whether a device added later should have an identity domain. Should there
>> be paths to downgrade a group's domain if one of the devices requires one?
> 
> Good catch!
> 
> This is supposed to be handled in iommu_no_mapping(). But obviously the
> current code sticks to lazy domain allocation. I'm not sure whether
> there are any real cases like this, but we should handle it in a clean way.
> My idea is that we could downgrade to multiple domains per group (just
> like what we have now) in this case and print a kernel message for it.

I think if a device requires an identity domain, then it should ignore 
attempts to attach to something else. A print to warn a user about this 
would be a good idea.

I figure during attach: if iommu_no_mapping() then attach to si_domain and 
print, else continue with the given domain.
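For illustration, the attach-time override described above could look roughly like the stand-alone sketch below. Everything here is hypothetical scaffolding: iommu_no_mapping(), si_domain_obj and the domain structure are stand-ins for the real driver's struct dmar_domain machinery, reduced to the bare decision logic.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-alone model of the proposed attach-time check.
 * In intel-iommu.c the real iommu_no_mapping() inspects the device's
 * identity-map requirements (RMRR, passthrough policy, etc.); here it
 * is reduced to a flag passed in by the caller. */

struct domain {
	const char *name;
};

static struct domain si_domain_obj = { "si_domain" };

static int iommu_no_mapping(int dev_requires_identity)
{
	return dev_requires_identity;
}

/* Pick the domain actually attached: if the device must stay identity
 * mapped, warn and fall back to si_domain instead of the requested
 * (remapping) domain. */
static struct domain *choose_attach_domain(struct domain *requested,
					   int dev_requires_identity)
{
	if (iommu_no_mapping(dev_requires_identity)) {
		fprintf(stderr,
			"VT-d: device requires identity map, ignoring attached domain\n");
		return &si_domain_obj;
	}
	return requested;
}
```

This is the shape that later lands in the thread as the v2 patch "Ignore domain parameter in attach_device if device requires identity map", inside intel_iommu_attach_device().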

> 
> Or any other thoughts?
> 
> Best regards,
> Lu Baolu

Cheers,
James.


* Re: [PATCH 3/4] iommu/vt-d: Allow IOMMU_DOMAIN_DMA and IOMMU_DOMAIN_IDENTITY to be allocated
  2019-03-08  1:20     ` Dmitry Safonov
@ 2019-03-09 11:57       ` James Sewart
  0 siblings, 0 replies; 55+ messages in thread
From: James Sewart @ 2019-03-09 11:57 UTC (permalink / raw)
  To: Dmitry Safonov; +Cc: iommu, Tom Murphy, Jacob Pan, linux-kernel, Lu Baolu

Hey Dmitry,

> On 8 Mar 2019, at 01:20, Dmitry Safonov <dima@arista.com> wrote:
> 
> On 3/4/19 3:46 PM, James Sewart wrote:
>> +static inline int domain_is_initialised(struct dmar_domain *domain)
>> +{
>> +	return domain->flags & DOMAIN_FLAG_INITIALISED;
>> +}
> 
> Maybe check it in intel_iommu_domain_free(), eh?

Good shout; freeing a non-attached IOMMU_DOMAIN_DMA domain is now 
different from freeing an attached one.

> 
> Thanks,
>          Dmitry


Cheers,
James.


* Re: [PATCH 4/4] iommu/vt-d: Remove lazy allocation of domains
  2019-03-09 11:49                         ` James Sewart
@ 2019-03-10  2:51                           ` Lu Baolu
  0 siblings, 0 replies; 55+ messages in thread
From: Lu Baolu @ 2019-03-10  2:51 UTC (permalink / raw)
  To: James Sewart
  Cc: baolu.lu, iommu, Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel

Hi,

On 3/9/19 7:49 PM, James Sewart wrote:
> Hey Lu,
> 
>> On 9 Mar 2019, at 01:53, Lu Baolu <baolu.lu@linux.intel.com> wrote:
>>
>> Hi James,
>>
>> On 3/9/19 12:57 AM, James Sewart wrote:
>>> Hey Lu,
>>>> On 8 Mar 2019, at 03:09, Lu Baolu<baolu.lu@linux.intel.com>  wrote:
>>>>>>
>>>>>> Do you mind if I work on top of your patches for further cleanups and
>>>>>> sign off a v2 together with you?
>>>>> Sure, sounds good. I’ll fixup patch 3 and have a go at integrating
>>>>> iommu_prepare_isa into get_resv_regions. This should make the initial
>>>>> domain logic here quite concise.
>>>> Here attached three extra patches which I think should be added before
>>>> PATCH 3/4, and some further cleanup changes which you can merge with
>>>> PATCH 4/4.
>>>>
>>>> ----------------
>>>>
>>>> 0006-iommu-Add-ops-entry-for-vendor-specific-default-doma.patch
>>>> 0007-iommu-vt-d-Add-is_identity_map-ops-entry.patch
>>>>
>>>> These two patches aim to add a generic method for vendor-specific iommu
>>>> drivers to specify the type of the default domain for a device. The Intel
>>>> iommu driver will register an ops for this since it has had its own
>>>> identity-map adjudicator for a long time.
>>> This seems like a good idea, but as domain alloc is only called for the
>>> default domain on the first device attached to a group, we may miss checking
>>> whether a device added later should have an identity domain. Should there
>>> be paths to downgrade a group's domain if one of the devices requires one?
>>
>> Good catch!
>>
>> This is supposed to be handled in iommu_no_mapping(). But obviously the
>> current code sticks to lazy domain allocation. I'm not sure whether
>> there are any real cases like this, but we should handle it in a clean way.
>> My idea is that we could downgrade to multiple domains per group (just
>> like what we have now) in this case and print a kernel message for it.
> 
> I think if a device requires an identity domain, then it should ignore
> attempts to attach to something else. A print to warn a user about this
> would be a good idea.
> 
> I figure during attach: if iommu_no_mapping() then attach to si_domain and
> print, else continue with the given domain.

Yes, agree.

Best regards,
Lu Baolu


* [PATCH v2 0/7] iommu/vt-d: Fix-up device-domain relationship by refactoring to use iommu group default domain.
  2019-03-04 15:41 [PATCH 0/4] iommu/vt-d: Fix-up device-domain relationship by refactoring to use iommu group default domain James Sewart
                   ` (2 preceding siblings ...)
  2019-03-05  6:05 ` [PATCH 0/4] iommu/vt-d: Fix-up device-domain relationship by refactoring to use iommu group default domain Lu Baolu
@ 2019-03-14 11:56 ` James Sewart
  2019-03-14 11:57   ` [PATCH v2 1/7] iommu: Move iommu_group_create_direct_mappings to after device_attach James Sewart
                     ` (7 more replies)
  3 siblings, 8 replies; 55+ messages in thread
From: James Sewart @ 2019-03-14 11:56 UTC (permalink / raw)
  To: iommu; +Cc: Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel, Lu Baolu

Patches 1 and 2 are the same as v1.

v1-v2:
  Refactored ISA direct mappings to be returned by iommu_get_resv_regions.
  Integrated a patch by Lu to defer turning on DMAR until iommu.c has mapped 
reserved regions.
  Integrated patches by Lu to remove more unused code in cleanup.

Lu: I didn't integrate your patch to set the default domain type as it 
isn't directly related to the aim of this patchset. Instead, patch 4 
addresses the issue of a device requiring an identity domain by ignoring 
the domain param in attach_device and printing a warning.

I booted some of our devices with this patchset and haven't seen any 
issues. It doesn't look like we have any devices with RMRRs though, so 
those codepaths aren't tested.

James Sewart (6):
  iommu: Move iommu_group_create_direct_mappings to after device_attach
  iommu/vt-d: Implement apply_resv_region for reserving IOVA ranges
  iommu/vt-d: Expose ISA direct mapping region via
    iommu_get_resv_regions
  iommu/vt-d: Ignore domain parameter in attach_device if device
    requires identity map
  iommu/vt-d: Allow IOMMU_DOMAIN_DMA to be allocated by iommu_ops
  iommu/vt-d: Remove lazy allocation of domains

Lu Baolu (1):
  iommu/vt-d: Enable DMA remapping after rmrr mapped

 drivers/iommu/intel-iommu.c | 444 +++++++++++-------------------------
 drivers/iommu/iommu.c       |   4 +-
 2 files changed, 131 insertions(+), 317 deletions(-)

-- 
2.17.1



* [PATCH v2 1/7] iommu: Move iommu_group_create_direct_mappings to after device_attach
  2019-03-14 11:56 ` [PATCH v2 0/7] " James Sewart
@ 2019-03-14 11:57   ` James Sewart
  2019-03-14 11:58   ` [PATCH v2 2/7] iommu/vt-d: Implement apply_resv_region for reserving IOVA ranges James Sewart
                     ` (6 subsequent siblings)
  7 siblings, 0 replies; 55+ messages in thread
From: James Sewart @ 2019-03-14 11:57 UTC (permalink / raw)
  To: iommu; +Cc: Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel, Lu Baolu

If an IOMMU driver needs to know which IOMMU a domain is associated
with in order to initialise that domain, it will do so in device_attach,
before which it is not safe to call iommu_ops.

Signed-off-by: James Sewart <jamessewart@arista.com>
---
 drivers/iommu/iommu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 3ed4db334341..1c6ffbb2d20e 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -652,8 +652,6 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev)
 
 	dev->iommu_group = group;
 
-	iommu_group_create_direct_mappings(group, dev);
-
 	mutex_lock(&group->mutex);
 	list_add_tail(&device->list, &group->devices);
 	if (group->domain)
@@ -662,6 +660,8 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev)
 	if (ret)
 		goto err_put_group;
 
+	iommu_group_create_direct_mappings(group, dev);
+
 	/* Notify any listeners about change to group. */
 	blocking_notifier_call_chain(&group->notifier,
 				     IOMMU_GROUP_NOTIFY_ADD_DEVICE, dev);
-- 
2.17.1



* [PATCH v2 2/7] iommu/vt-d: Implement apply_resv_region for reserving IOVA ranges
  2019-03-14 11:56 ` [PATCH v2 0/7] " James Sewart
  2019-03-14 11:57   ` [PATCH v2 1/7] iommu: Move iommu_group_create_direct_mappings to after device_attach James Sewart
@ 2019-03-14 11:58   ` James Sewart
  2019-03-14 11:58   ` [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions James Sewart
                     ` (5 subsequent siblings)
  7 siblings, 0 replies; 55+ messages in thread
From: James Sewart @ 2019-03-14 11:58 UTC (permalink / raw)
  To: iommu; +Cc: Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel, Lu Baolu

Used by iommu.c before creating identity mappings for reserved ranges, to
ensure the dma-ops API will never remap these ranges.

Signed-off-by: James Sewart <jamessewart@arista.com>
---
 drivers/iommu/intel-iommu.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 78188bf7e90d..8e0a4e2ff77f 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5299,6 +5299,19 @@ static void intel_iommu_put_resv_regions(struct device *dev,
 	}
 }
 
+static void intel_iommu_apply_resv_region(struct device *dev,
+		struct iommu_domain *domain,
+		struct iommu_resv_region *region)
+{
+	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
+	unsigned long start, end;
+
+	start = IOVA_PFN(region->start);
+	end   = IOVA_PFN(region->start + region->length - 1);
+
+	WARN_ON_ONCE(reserve_iova(&dmar_domain->iovad, start, end) == NULL);
+}
+
 #ifdef CONFIG_INTEL_IOMMU_SVM
 int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct intel_svm_dev *sdev)
 {
@@ -5392,6 +5405,7 @@ const struct iommu_ops intel_iommu_ops = {
 	.remove_device		= intel_iommu_remove_device,
 	.get_resv_regions	= intel_iommu_get_resv_regions,
 	.put_resv_regions	= intel_iommu_put_resv_regions,
+	.apply_resv_region	= intel_iommu_apply_resv_region,
 	.device_group		= pci_device_group,
 	.pgsize_bitmap		= INTEL_IOMMU_PGSIZES,
 };
-- 
2.17.1



* [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via  iommu_get_resv_regions
  2019-03-14 11:56 ` [PATCH v2 0/7] " James Sewart
  2019-03-14 11:57   ` [PATCH v2 1/7] iommu: Move iommu_group_create_direct_mappings to after device_attach James Sewart
  2019-03-14 11:58   ` [PATCH v2 2/7] iommu/vt-d: Implement apply_resv_region for reserving IOVA ranges James Sewart
@ 2019-03-14 11:58   ` James Sewart
  2019-03-15  2:19     ` Lu Baolu
  2019-03-14 11:58   ` [PATCH v2 4/7] iommu/vt-d: Ignore domain parameter in attach_device if device requires identity map James Sewart
                     ` (4 subsequent siblings)
  7 siblings, 1 reply; 55+ messages in thread
From: James Sewart @ 2019-03-14 11:58 UTC (permalink / raw)
  To: iommu; +Cc: Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel, Lu Baolu

To support mapping the ISA region via iommu_group_create_direct_mappings,
make sure it is exposed by iommu_get_resv_regions. This allows
deduplication of reserved region mappings.

Signed-off-by: James Sewart <jamessewart@arista.com>
---
 drivers/iommu/intel-iommu.c | 42 +++++++++++++++++++++++++++++--------
 1 file changed, 33 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 8e0a4e2ff77f..2e00e8708f06 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -337,6 +337,8 @@ static LIST_HEAD(dmar_rmrr_units);
 #define for_each_rmrr_units(rmrr) \
 	list_for_each_entry(rmrr, &dmar_rmrr_units, list)
 
+static struct iommu_resv_region *isa_resv_region;
+
 /* bitmap for indexing intel_iommus */
 static int g_num_of_iommus;
 
@@ -2780,26 +2782,34 @@ static inline int iommu_prepare_rmrr_dev(struct dmar_rmrr_unit *rmrr,
 					  rmrr->end_address);
 }
 
+static inline struct iommu_resv_region *iommu_get_isa_resv_region(void)
+{
+	if (!isa_resv_region)
+		isa_resv_region = iommu_alloc_resv_region(0,
+				16*1024*1024,
+				0, IOMMU_RESV_DIRECT);
+
+	return isa_resv_region;
+}
+
 #ifdef CONFIG_INTEL_IOMMU_FLOPPY_WA
-static inline void iommu_prepare_isa(void)
+static inline void iommu_prepare_isa(struct pci_dev *pdev)
 {
-	struct pci_dev *pdev;
 	int ret;
+	struct iommu_resv_region *reg = iommu_get_isa_resv_region();
 
-	pdev = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, NULL);
-	if (!pdev)
+	if (!reg)
 		return;
 
 	pr_info("Prepare 0-16MiB unity mapping for LPC\n");
-	ret = iommu_prepare_identity_map(&pdev->dev, 0, 16*1024*1024 - 1);
+	ret = iommu_prepare_identity_map(&pdev->dev, reg->start,
+			reg->start + reg->length - 1);
 
 	if (ret)
 		pr_err("Failed to create 0-16MiB identity map - floppy might not work\n");
-
-	pci_dev_put(pdev);
 }
 #else
-static inline void iommu_prepare_isa(void)
+static inline void iommu_prepare_isa(struct pci_dev *pdev)
 {
 	return;
 }
@@ -3289,6 +3299,7 @@ static int __init init_dmars(void)
 	struct dmar_rmrr_unit *rmrr;
 	bool copied_tables = false;
 	struct device *dev;
+	struct pci_dev *pdev;
 	struct intel_iommu *iommu;
 	int i, ret;
 
@@ -3469,7 +3480,11 @@ static int __init init_dmars(void)
 		}
 	}
 
-	iommu_prepare_isa();
+	pdev = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, NULL);
+	if (pdev) {
+		iommu_prepare_isa(pdev);
+		pci_dev_put(pdev);
+	}
 
 domains_done:
 
@@ -5266,6 +5281,7 @@ static void intel_iommu_get_resv_regions(struct device *device,
 	struct iommu_resv_region *reg;
 	struct dmar_rmrr_unit *rmrr;
 	struct device *i_dev;
+	struct pci_dev *pdev;
 	int i;
 
 	rcu_read_lock();
@@ -5280,6 +5296,14 @@ static void intel_iommu_get_resv_regions(struct device *device,
 	}
 	rcu_read_unlock();
 
+	if (dev_is_pci(device)) {
+		pdev = to_pci_dev(device);
+		if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA) {
+			reg = iommu_get_isa_resv_region();
+			list_add_tail(&reg->list, head);
+		}
+	}
+
 	reg = iommu_alloc_resv_region(IOAPIC_RANGE_START,
 				      IOAPIC_RANGE_END - IOAPIC_RANGE_START + 1,
 				      0, IOMMU_RESV_MSI);
-- 
2.17.1



* [PATCH v2 4/7] iommu/vt-d: Ignore domain parameter in attach_device if  device requires identity map
  2019-03-14 11:56 ` [PATCH v2 0/7] " James Sewart
                     ` (2 preceding siblings ...)
  2019-03-14 11:58   ` [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions James Sewart
@ 2019-03-14 11:58   ` James Sewart
  2019-03-15  2:30     ` Lu Baolu
  2019-03-14 11:58   ` [PATCH v2 5/7] iommu/vt-d: Enable DMA remapping after rmrr mapped James Sewart
                     ` (3 subsequent siblings)
  7 siblings, 1 reply; 55+ messages in thread
From: James Sewart @ 2019-03-14 11:58 UTC (permalink / raw)
  To: iommu; +Cc: Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel, Lu Baolu

If a device requires an identity map then it is not safe to attach a
domain that can remap addresses. Warn the user if this occurs.

Signed-off-by: James Sewart <jamessewart@arista.com>
---
 drivers/iommu/intel-iommu.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 2e00e8708f06..104d36f225a7 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5101,6 +5101,11 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
 		}
 	}
 
+	if (iommu_no_mapping(dev)) {
+		dmar_domain = si_domain;
+		dev_warn(dev, "VT-d: Device is required to use identity IOMMU mapping, ignoring domain attached\n");
+	}
+
 	iommu = device_to_iommu(dev, &bus, &devfn);
 	if (!iommu)
 		return -ENODEV;
-- 
2.17.1



* [PATCH v2 5/7] iommu/vt-d: Enable DMA remapping after rmrr mapped
  2019-03-14 11:56 ` [PATCH v2 0/7] " James Sewart
                     ` (3 preceding siblings ...)
  2019-03-14 11:58   ` [PATCH v2 4/7] iommu/vt-d: Ignore domain parameter in attach_device if device requires identity map James Sewart
@ 2019-03-14 11:58   ` James Sewart
  2019-03-14 11:59   ` [PATCH v2 6/7] iommu/vt-d: Allow IOMMU_DOMAIN_DMA to be allocated by iommu_ops James Sewart
                     ` (2 subsequent siblings)
  7 siblings, 0 replies; 55+ messages in thread
From: James Sewart @ 2019-03-14 11:58 UTC (permalink / raw)
  To: iommu; +Cc: Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel, Lu Baolu

Devices with RMRRs require an identity map of the rmrr regions before
DMA remapping is enabled. Otherwise, there will be a window during
which DMA from/to the rmrr regions will be blocked. In order to
alleviate this, move the enabling of DMA remapping to after all rmrr
regions have been mapped.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel-iommu.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 104d36f225a7..35821df70f78 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3518,11 +3518,6 @@ static int __init init_dmars(void)
 		ret = dmar_set_interrupt(iommu);
 		if (ret)
 			goto free_iommu;
-
-		if (!translation_pre_enabled(iommu))
-			iommu_enable_translation(iommu);
-
-		iommu_disable_protect_mem_regions(iommu);
 	}
 
 	return 0;
@@ -4912,7 +4907,6 @@ int __init intel_iommu_init(void)
 		goto out_free_reserved_range;
 	}
 	up_write(&dmar_global_lock);
-	pr_info("Intel(R) Virtualization Technology for Directed I/O\n");
 
 #if defined(CONFIG_X86) && defined(CONFIG_SWIOTLB)
 	swiotlb = 0;
@@ -4935,6 +4929,16 @@ int __init intel_iommu_init(void)
 		register_memory_notifier(&intel_iommu_memory_nb);
 	cpuhp_setup_state(CPUHP_IOMMU_INTEL_DEAD, "iommu/intel:dead", NULL,
 			  intel_iommu_cpu_dead);
+
+	/* Finally, we enable the DMA remapping hardware. */
+	for_each_iommu(iommu, drhd) {
+		if (!translation_pre_enabled(iommu))
+			iommu_enable_translation(iommu);
+
+		iommu_disable_protect_mem_regions(iommu);
+	}
+	pr_info("Intel(R) Virtualization Technology for Directed I/O\n");
+
 	intel_iommu_enabled = 1;
 	intel_iommu_debugfs_init();
 
-- 
2.17.1



* [PATCH v2 6/7] iommu/vt-d: Allow IOMMU_DOMAIN_DMA to be allocated by  iommu_ops
  2019-03-14 11:56 ` [PATCH v2 0/7] " James Sewart
                     ` (4 preceding siblings ...)
  2019-03-14 11:58   ` [PATCH v2 5/7] iommu/vt-d: Enable DMA remapping after rmrr mapped James Sewart
@ 2019-03-14 11:59   ` James Sewart
  2019-03-14 11:59   ` [PATCH v2 7/7] iommu/vt-d: Remove lazy allocation of domains James Sewart
  2019-03-15  3:13   ` [PATCH v2 0/7] iommu/vt-d: Fix-up device-domain relationship by refactoring to use iommu group default domain Lu Baolu
  7 siblings, 0 replies; 55+ messages in thread
From: James Sewart @ 2019-03-14 11:59 UTC (permalink / raw)
  To: iommu; +Cc: Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel, Lu Baolu

Permitting IOMMU_DOMAIN_DMA type IOMMU domains to be allocated allows the
default_domain of an iommu_group to be set. This delegates device-domain
relationships to the generic IOMMU code.

Signed-off-by: James Sewart <jamessewart@arista.com>
---
 drivers/iommu/intel-iommu.c | 99 +++++++++++++++++++++++++++++--------
 1 file changed, 78 insertions(+), 21 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 35821df70f78..2c9d793af394 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -309,6 +309,18 @@ static int hw_pass_through = 1;
 /* si_domain contains mulitple devices */
 #define DOMAIN_FLAG_STATIC_IDENTITY	(1 << 1)
 
+/*
+ * Domain managed externally, don't cleanup if it isn't attached
+ * to any devices.
+ */
+#define DOMAIN_FLAG_MANAGED_EXTERNALLY	(1 << 2)
+
+/*
+ * Set after domain initialisation. Used when allocating dma domains to
+ * defer domain initialisation until it is attached to a device
+ */
+#define DOMAIN_FLAG_INITIALISED	(1 << 3)
+
 #define for_each_domain_iommu(idx, domain)			\
 	for (idx = 0; idx < g_num_of_iommus; idx++)		\
 		if (domain->iommu_refcnt[idx])
@@ -560,6 +572,16 @@ static inline int domain_type_is_vm_or_si(struct dmar_domain *domain)
 				DOMAIN_FLAG_STATIC_IDENTITY);
 }
 
+static inline int domain_managed_externally(struct dmar_domain *domain)
+{
+	return domain->flags & DOMAIN_FLAG_MANAGED_EXTERNALLY;
+}
+
+static inline int domain_is_initialised(struct dmar_domain *domain)
+{
+	return domain->flags & DOMAIN_FLAG_INITIALISED;
+}
+
 static inline int domain_pfn_supported(struct dmar_domain *domain,
 				       unsigned long pfn)
 {
@@ -1664,7 +1686,7 @@ static void disable_dmar_iommu(struct intel_iommu *iommu)
 
 		__dmar_remove_one_dev_info(info);
 
-		if (!domain_type_is_vm_or_si(domain)) {
+		if (!domain_managed_externally(domain)) {
 			/*
 			 * The domain_exit() function  can't be called under
 			 * device_domain_lock, as it takes this lock itself.
@@ -1897,6 +1919,7 @@ static int domain_init(struct dmar_domain *domain, struct intel_iommu *iommu,
 	domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid);
 	if (!domain->pgd)
 		return -ENOMEM;
+	domain->flags |= DOMAIN_FLAG_INITIALISED;
 	__iommu_flush_cache(iommu, domain->pgd, PAGE_SIZE);
 	return 0;
 }
@@ -1909,6 +1932,9 @@ static void domain_exit(struct dmar_domain *domain)
 	if (!domain)
 		return;
 
+	if (!domain_is_initialised(domain))
+		goto free_mem;
+
 	/* Remove associated devices and clear attached or cached domains */
 	rcu_read_lock();
 	domain_remove_dev_info(domain);
@@ -1921,6 +1947,7 @@ static void domain_exit(struct dmar_domain *domain)
 
 	dma_free_pagelist(freelist);
 
+free_mem:
 	free_domain_mem(domain);
 }
 
@@ -4585,7 +4612,7 @@ static int device_notifier(struct notifier_block *nb,
 		return 0;
 
 	dmar_remove_one_dev_info(domain, dev);
-	if (!domain_type_is_vm_or_si(domain) && list_empty(&domain->devices))
+	if (!domain_managed_externally(domain) && list_empty(&domain->devices))
 		domain_exit(domain);
 
 	return 0;
@@ -5039,6 +5066,7 @@ static int md_domain_init(struct dmar_domain *domain, int guest_width)
 	domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid);
 	if (!domain->pgd)
 		return -ENOMEM;
+	domain->flags |= DOMAIN_FLAG_INITIALISED;
 	domain_flush_cache(domain, domain->pgd, PAGE_SIZE);
 	return 0;
 }
@@ -5047,28 +5075,43 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type)
 {
 	struct dmar_domain *dmar_domain;
 	struct iommu_domain *domain;
+	int flags = DOMAIN_FLAG_MANAGED_EXTERNALLY;
 
-	if (type != IOMMU_DOMAIN_UNMANAGED)
-		return NULL;
+	switch (type) {
+	case IOMMU_DOMAIN_UNMANAGED:
+		flags |= DOMAIN_FLAG_VIRTUAL_MACHINE | DOMAIN_FLAG_INITIALISED;
+		dmar_domain = alloc_domain(flags);
+		if (!dmar_domain)
+			return NULL;
 
-	dmar_domain = alloc_domain(DOMAIN_FLAG_VIRTUAL_MACHINE);
-	if (!dmar_domain) {
-		pr_err("Can't allocate dmar_domain\n");
-		return NULL;
-	}
-	if (md_domain_init(dmar_domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) {
-		pr_err("Domain initialization failed\n");
-		domain_exit(dmar_domain);
+		if (md_domain_init(dmar_domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) {
+			pr_err("Domain initialization failed\n");
+			domain_exit(dmar_domain);
+			return NULL;
+		}
+		domain_update_iommu_cap(dmar_domain);
+		domain = &dmar_domain->domain;
+		domain->geometry.aperture_start = 0;
+		domain->geometry.aperture_end =
+			__DOMAIN_MAX_ADDR(dmar_domain->gaw);
+		domain->geometry.force_aperture = true;
+		break;
+	case IOMMU_DOMAIN_DMA:
+		dmar_domain = alloc_domain(flags);
+		if (!dmar_domain)
+			return NULL;
+		/*
+		 * init domain in device attach when we know IOMMU
+		 * capabilities
+		 */
+		break;
+	case IOMMU_DOMAIN_IDENTITY:
+		return &si_domain->domain;
+	default:
 		return NULL;
 	}
-	domain_update_iommu_cap(dmar_domain);
-
-	domain = &dmar_domain->domain;
-	domain->geometry.aperture_start = 0;
-	domain->geometry.aperture_end   = __DOMAIN_MAX_ADDR(dmar_domain->gaw);
-	domain->geometry.force_aperture = true;
 
-	return domain;
+	return &dmar_domain->domain;
 }
 
 static void intel_iommu_domain_free(struct iommu_domain *domain)
@@ -5099,8 +5142,8 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
 			dmar_remove_one_dev_info(old_domain, dev);
 			rcu_read_unlock();
 
-			if (!domain_type_is_vm_or_si(old_domain) &&
-			     list_empty(&old_domain->devices))
+			if (list_empty(&old_domain->devices) &&
+			     !domain_managed_externally(old_domain))
 				domain_exit(old_domain);
 		}
 	}
@@ -5114,6 +5157,16 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
 	if (!iommu)
 		return -ENODEV;
 
+	/*
+	 * Initialise domain with IOMMU capabilities if it isn't already
+	 * initialised
+	 */
+	if (!domain_is_initialised(dmar_domain)) {
+		if (domain_init(dmar_domain, iommu,
+				DEFAULT_DOMAIN_ADDRESS_WIDTH))
+			return -ENOMEM;
+	}
+
 	/* check if this iommu agaw is sufficient for max mapped address */
 	addr_width = agaw_to_width(iommu->agaw);
 	if (addr_width > cap_mgaw(iommu->cap))
@@ -5160,6 +5213,10 @@ static int intel_iommu_map(struct iommu_domain *domain,
 	int prot = 0;
 	int ret;
 
+	/* Don't bother if hardware passthrough used. */
+	if (dmar_domain == si_domain && hw_pass_through)
+		return 0;
+
 	if (iommu_prot & IOMMU_READ)
 		prot |= DMA_PTE_READ;
 	if (iommu_prot & IOMMU_WRITE)
-- 
2.17.1



* [PATCH v2 7/7] iommu/vt-d: Remove lazy allocation of domains
  2019-03-14 11:56 ` [PATCH v2 0/7] " James Sewart
                     ` (5 preceding siblings ...)
  2019-03-14 11:59   ` [PATCH v2 6/7] iommu/vt-d: Allow IOMMU_DOMAIN_DMA to be allocated by iommu_ops James Sewart
@ 2019-03-14 11:59   ` James Sewart
  2019-03-14 23:35     ` Jacob Pan
  2019-03-15  3:13   ` [PATCH v2 0/7] iommu/vt-d: Fix-up device-domain relationship by refactoring to use iommu group default domain Lu Baolu
  7 siblings, 1 reply; 55+ messages in thread
From: James Sewart @ 2019-03-14 11:59 UTC (permalink / raw)
  To: iommu; +Cc: Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel, Lu Baolu

The generic IOMMU code will allocate and attach a default domain to each
device that comes online. This patch removes the lazy domain allocation
and early reserved region mapping that are now redundant.

Signed-off-by: James Sewart <jamessewart@arista.com>
---
 drivers/iommu/intel-iommu.c | 300 +-----------------------------------
 1 file changed, 5 insertions(+), 295 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 2c9d793af394..f8c0c3e16935 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -2605,118 +2605,6 @@ static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu,
 	return domain;
 }
 
-static int get_last_alias(struct pci_dev *pdev, u16 alias, void *opaque)
-{
-	*(u16 *)opaque = alias;
-	return 0;
-}
-
-static struct dmar_domain *find_or_alloc_domain(struct device *dev, int gaw)
-{
-	struct device_domain_info *info = NULL;
-	struct dmar_domain *domain = NULL;
-	struct intel_iommu *iommu;
-	u16 dma_alias;
-	unsigned long flags;
-	u8 bus, devfn;
-
-	iommu = device_to_iommu(dev, &bus, &devfn);
-	if (!iommu)
-		return NULL;
-
-	if (dev_is_pci(dev)) {
-		struct pci_dev *pdev = to_pci_dev(dev);
-
-		pci_for_each_dma_alias(pdev, get_last_alias, &dma_alias);
-
-		spin_lock_irqsave(&device_domain_lock, flags);
-		info = dmar_search_domain_by_dev_info(pci_domain_nr(pdev->bus),
-						      PCI_BUS_NUM(dma_alias),
-						      dma_alias & 0xff);
-		if (info) {
-			iommu = info->iommu;
-			domain = info->domain;
-		}
-		spin_unlock_irqrestore(&device_domain_lock, flags);
-
-		/* DMA alias already has a domain, use it */
-		if (info)
-			goto out;
-	}
-
-	/* Allocate and initialize new domain for the device */
-	domain = alloc_domain(0);
-	if (!domain)
-		return NULL;
-	if (domain_init(domain, iommu, gaw)) {
-		domain_exit(domain);
-		return NULL;
-	}
-
-out:
-
-	return domain;
-}
-
-static struct dmar_domain *set_domain_for_dev(struct device *dev,
-					      struct dmar_domain *domain)
-{
-	struct intel_iommu *iommu;
-	struct dmar_domain *tmp;
-	u16 req_id, dma_alias;
-	u8 bus, devfn;
-
-	iommu = device_to_iommu(dev, &bus, &devfn);
-	if (!iommu)
-		return NULL;
-
-	req_id = ((u16)bus << 8) | devfn;
-
-	if (dev_is_pci(dev)) {
-		struct pci_dev *pdev = to_pci_dev(dev);
-
-		pci_for_each_dma_alias(pdev, get_last_alias, &dma_alias);
-
-		/* register PCI DMA alias device */
-		if (req_id != dma_alias) {
-			tmp = dmar_insert_one_dev_info(iommu, PCI_BUS_NUM(dma_alias),
-					dma_alias & 0xff, NULL, domain);
-
-			if (!tmp || tmp != domain)
-				return tmp;
-		}
-	}
-
-	tmp = dmar_insert_one_dev_info(iommu, bus, devfn, dev, domain);
-	if (!tmp || tmp != domain)
-		return tmp;
-
-	return domain;
-}
-
-static struct dmar_domain *get_domain_for_dev(struct device *dev, int gaw)
-{
-	struct dmar_domain *domain, *tmp;
-
-	domain = find_domain(dev);
-	if (domain)
-		goto out;
-
-	domain = find_or_alloc_domain(dev, gaw);
-	if (!domain)
-		goto out;
-
-	tmp = set_domain_for_dev(dev, domain);
-	if (!tmp || domain != tmp) {
-		domain_exit(domain);
-		domain = tmp;
-	}
-
-out:
-
-	return domain;
-}
-
 static int iommu_domain_identity_map(struct dmar_domain *domain,
 				     unsigned long long start,
 				     unsigned long long end)
@@ -2742,73 +2630,6 @@ static int iommu_domain_identity_map(struct dmar_domain *domain,
 				DMA_PTE_READ|DMA_PTE_WRITE);
 }
 
-static int domain_prepare_identity_map(struct device *dev,
-				       struct dmar_domain *domain,
-				       unsigned long long start,
-				       unsigned long long end)
-{
-	/* For _hardware_ passthrough, don't bother. But for software
-	   passthrough, we do it anyway -- it may indicate a memory
-	   range which is reserved in E820, so which didn't get set
-	   up to start with in si_domain */
-	if (domain == si_domain && hw_pass_through) {
-		pr_warn("Ignoring identity map for HW passthrough device %s [0x%Lx - 0x%Lx]\n",
-			dev_name(dev), start, end);
-		return 0;
-	}
-
-	pr_info("Setting identity map for device %s [0x%Lx - 0x%Lx]\n",
-		dev_name(dev), start, end);
-
-	if (end < start) {
-		WARN(1, "Your BIOS is broken; RMRR ends before it starts!\n"
-			"BIOS vendor: %s; Ver: %s; Product Version: %s\n",
-			dmi_get_system_info(DMI_BIOS_VENDOR),
-			dmi_get_system_info(DMI_BIOS_VERSION),
-		     dmi_get_system_info(DMI_PRODUCT_VERSION));
-		return -EIO;
-	}
-
-	if (end >> agaw_to_width(domain->agaw)) {
-		WARN(1, "Your BIOS is broken; RMRR exceeds permitted address width (%d bits)\n"
-		     "BIOS vendor: %s; Ver: %s; Product Version: %s\n",
-		     agaw_to_width(domain->agaw),
-		     dmi_get_system_info(DMI_BIOS_VENDOR),
-		     dmi_get_system_info(DMI_BIOS_VERSION),
-		     dmi_get_system_info(DMI_PRODUCT_VERSION));
-		return -EIO;
-	}
-
-	return iommu_domain_identity_map(domain, start, end);
-}
-
-static int iommu_prepare_identity_map(struct device *dev,
-				      unsigned long long start,
-				      unsigned long long end)
-{
-	struct dmar_domain *domain;
-	int ret;
-
-	domain = get_domain_for_dev(dev, DEFAULT_DOMAIN_ADDRESS_WIDTH);
-	if (!domain)
-		return -ENOMEM;
-
-	ret = domain_prepare_identity_map(dev, domain, start, end);
-	if (ret)
-		domain_exit(domain);
-
-	return ret;
-}
-
-static inline int iommu_prepare_rmrr_dev(struct dmar_rmrr_unit *rmrr,
-					 struct device *dev)
-{
-	if (dev->archdata.iommu == DUMMY_DEVICE_DOMAIN_INFO)
-		return 0;
-	return iommu_prepare_identity_map(dev, rmrr->base_address,
-					  rmrr->end_address);
-}
-
 static inline struct iommu_resv_region *iommu_get_isa_resv_region(void)
 {
 	if (!isa_resv_region)
@@ -2819,29 +2640,6 @@ static inline struct iommu_resv_region *iommu_get_isa_resv_region(void)
 	return isa_resv_region;
 }
 
-#ifdef CONFIG_INTEL_IOMMU_FLOPPY_WA
-static inline void iommu_prepare_isa(struct pci_dev *pdev)
-{
-	int ret;
-	struct iommu_resv_region *reg = iommu_get_isa_resv_region();
-
-	if (!reg)
-		return;
-
-	pr_info("Prepare 0-16MiB unity mapping for LPC\n");
-	ret = iommu_prepare_identity_map(&pdev->dev, reg->start,
-			reg->start + reg->length - 1);
-
-	if (ret)
-		pr_err("Failed to create 0-16MiB identity map - floppy might not work\n");
-}
-#else
-static inline void iommu_prepare_isa(struct pci_dev *pdev)
-{
-	return;
-}
-#endif /* !CONFIG_INTEL_IOMMU_FLPY_WA */
-
 static int md_domain_init(struct dmar_domain *domain, int guest_width);
 
 static int __init si_domain_init(int hw)
@@ -3065,18 +2863,10 @@ static int __init dev_prepare_static_identity_mapping(struct device *dev, int hw
 
 static int __init iommu_prepare_static_identity_mapping(int hw)
 {
-	struct pci_dev *pdev = NULL;
 	struct dmar_drhd_unit *drhd;
 	struct intel_iommu *iommu;
 	struct device *dev;
-	int i;
-	int ret = 0;
-
-	for_each_pci_dev(pdev) {
-		ret = dev_prepare_static_identity_mapping(&pdev->dev, hw);
-		if (ret)
-			return ret;
-	}
+	int i, ret = 0;
 
 	for_each_active_iommu(iommu, drhd)
 		for_each_active_dev_scope(drhd->devices, drhd->devices_cnt, i, dev) {
@@ -3323,12 +3113,9 @@ static int copy_translation_tables(struct intel_iommu *iommu)
 static int __init init_dmars(void)
 {
 	struct dmar_drhd_unit *drhd;
-	struct dmar_rmrr_unit *rmrr;
 	bool copied_tables = false;
-	struct device *dev;
-	struct pci_dev *pdev;
 	struct intel_iommu *iommu;
-	int i, ret;
+	int ret;
 
 	/*
 	 * for each drhd
@@ -3482,36 +3269,6 @@ static int __init init_dmars(void)
 			goto free_iommu;
 		}
 	}
-	/*
-	 * For each rmrr
-	 *   for each dev attached to rmrr
-	 *   do
-	 *     locate drhd for dev, alloc domain for dev
-	 *     allocate free domain
-	 *     allocate page table entries for rmrr
-	 *     if context not allocated for bus
-	 *           allocate and init context
-	 *           set present in root table for this bus
-	 *     init context with domain, translation etc
-	 *    endfor
-	 * endfor
-	 */
-	pr_info("Setting RMRR:\n");
-	for_each_rmrr_units(rmrr) {
-		/* some BIOS lists non-exist devices in DMAR table. */
-		for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
-					  i, dev) {
-			ret = iommu_prepare_rmrr_dev(rmrr, dev);
-			if (ret)
-				pr_err("Mapping reserved region failed\n");
-		}
-	}
-
-	pdev = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, NULL);
-	if (pdev) {
-		iommu_prepare_isa(pdev);
-		pci_dev_put(pdev);
-	}
 
 domains_done:
 
@@ -3595,53 +3352,6 @@ static unsigned long intel_alloc_iova(struct device *dev,
 	return iova_pfn;
 }
 
-struct dmar_domain *get_valid_domain_for_dev(struct device *dev)
-{
-	struct dmar_domain *domain, *tmp;
-	struct dmar_rmrr_unit *rmrr;
-	struct device *i_dev;
-	int i, ret;
-
-	domain = find_domain(dev);
-	if (domain)
-		goto out;
-
-	domain = find_or_alloc_domain(dev, DEFAULT_DOMAIN_ADDRESS_WIDTH);
-	if (!domain)
-		goto out;
-
-	/* We have a new domain - setup possible RMRRs for the device */
-	rcu_read_lock();
-	for_each_rmrr_units(rmrr) {
-		for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
-					  i, i_dev) {
-			if (i_dev != dev)
-				continue;
-
-			ret = domain_prepare_identity_map(dev, domain,
-							  rmrr->base_address,
-							  rmrr->end_address);
-			if (ret)
-				dev_err(dev, "Mapping reserved region failed\n");
-		}
-	}
-	rcu_read_unlock();
-
-	tmp = set_domain_for_dev(dev, domain);
-	if (!tmp || domain != tmp) {
-		domain_exit(domain);
-		domain = tmp;
-	}
-
-out:
-
-	if (!domain)
-		pr_err("Allocating domain for %s failed\n", dev_name(dev));
-
-
-	return domain;
-}
-
 /* Check if the dev needs to go through non-identity map and unmap process.*/
 static int iommu_no_mapping(struct device *dev)
 {
@@ -3704,7 +3414,7 @@ static dma_addr_t __intel_map_page(struct device *dev, struct page *page,
 	if (iommu_no_mapping(dev))
 		return paddr;
 
-	domain = get_valid_domain_for_dev(dev);
+	domain = find_domain(dev);
 	if (!domain)
 		return DMA_MAPPING_ERROR;
 
@@ -3914,7 +3624,7 @@ static int intel_map_sg(struct device *dev, struct scatterlist *sglist, int nele
 	if (iommu_no_mapping(dev))
 		return intel_nontranslate_map_sg(dev, sglist, nelems, dir);
 
-	domain = get_valid_domain_for_dev(dev);
+	domain = find_domain(dev);
 	if (!domain)
 		return 0;
 
@@ -5412,7 +5122,7 @@ int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct intel_svm_dev *sd
 	u64 ctx_lo;
 	int ret;
 
-	domain = get_valid_domain_for_dev(sdev->dev);
+	domain = find_domain(sdev->dev);
 	if (!domain)
 		return -EINVAL;
 
-- 
2.17.1



* Re: [PATCH v2 7/7] iommu/vt-d: Remove lazy allocation of domains
  2019-03-14 11:59   ` [PATCH v2 7/7] iommu/vt-d: Remove lazy allocation of domains James Sewart
@ 2019-03-14 23:35     ` Jacob Pan
  2019-03-22 10:07       ` James Sewart
  0 siblings, 1 reply; 55+ messages in thread
From: Jacob Pan @ 2019-03-14 23:35 UTC (permalink / raw)
  To: James Sewart
  Cc: iommu, Tom Murphy, Dmitry Safonov, linux-kernel, Lu Baolu, jacob.jun.pan

On Thu, 14 Mar 2019 11:59:36 +0000
James Sewart <jamessewart@arista.com> wrote:

>  
> -	domain = get_valid_domain_for_dev(dev);
> +	domain = find_domain(dev);
>  	if (!domain)
>  		return DMA_MAPPING_ERROR;
>  
> @@ -3914,7 +3624,7 @@ static int intel_map_sg(struct device *dev, struct scatterlist *sglist, int nele
>  	if (iommu_no_mapping(dev))
>  		return intel_nontranslate_map_sg(dev, sglist, nelems, dir);
>  
> -	domain = get_valid_domain_for_dev(dev);
> +	domain = find_domain(dev);
This patchset looks like a very good cleanup. I am wondering why we
can't use the generic iommu_get_domain_for_dev() here, since VT-d has a
default DMA domain after your patch.


* Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions
  2019-03-14 11:58   ` [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions James Sewart
@ 2019-03-15  2:19     ` Lu Baolu
  2019-03-22  9:57       ` James Sewart
  0 siblings, 1 reply; 55+ messages in thread
From: Lu Baolu @ 2019-03-15  2:19 UTC (permalink / raw)
  To: James Sewart, iommu
  Cc: baolu.lu, Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel

Hi James,

On 3/14/19 7:58 PM, James Sewart wrote:
> To support mapping ISA region via iommu_group_create_direct_mappings,
> make sure its exposed by iommu_get_resv_regions. This allows
> deduplication of reserved region mappings
> 
> Signed-off-by: James Sewart <jamessewart@arista.com>
> ---
>   drivers/iommu/intel-iommu.c | 42 +++++++++++++++++++++++++++++--------
>   1 file changed, 33 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 8e0a4e2ff77f..2e00e8708f06 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -337,6 +337,8 @@ static LIST_HEAD(dmar_rmrr_units);
>   #define for_each_rmrr_units(rmrr) \
>   	list_for_each_entry(rmrr, &dmar_rmrr_units, list)
>   
> +static struct iommu_resv_region *isa_resv_region;
> +
>   /* bitmap for indexing intel_iommus */
>   static int g_num_of_iommus;
>   
> @@ -2780,26 +2782,34 @@ static inline int iommu_prepare_rmrr_dev(struct dmar_rmrr_unit *rmrr,
>   					  rmrr->end_address);
>   }
>   
> +static inline struct iommu_resv_region *iommu_get_isa_resv_region(void)
> +{
> +	if (!isa_resv_region)
> +		isa_resv_region = iommu_alloc_resv_region(0,
> +				16*1024*1024,
> +				0, IOMMU_RESV_DIRECT);
> +
> +	return isa_resv_region;
> +}
> +
>   #ifdef CONFIG_INTEL_IOMMU_FLOPPY_WA
> -static inline void iommu_prepare_isa(void)
> +static inline void iommu_prepare_isa(struct pci_dev *pdev)
>   {
> -	struct pci_dev *pdev;
>   	int ret;
> +	struct iommu_resv_region *reg = iommu_get_isa_resv_region();
>   
> -	pdev = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, NULL);
> -	if (!pdev)
> +	if (!reg)
>   		return;
>   
>   	pr_info("Prepare 0-16MiB unity mapping for LPC\n");
> -	ret = iommu_prepare_identity_map(&pdev->dev, 0, 16*1024*1024 - 1);
> +	ret = iommu_prepare_identity_map(&pdev->dev, reg->start,
> +			reg->start + reg->length - 1);
>   
>   	if (ret)
>   		pr_err("Failed to create 0-16MiB identity map - floppy might not work\n");
> -
> -	pci_dev_put(pdev);
>   }
>   #else
> -static inline void iommu_prepare_isa(void)
> +static inline void iommu_prepare_isa(struct pci_dev *pdev)
>   {
>   	return;
>   }
> @@ -3289,6 +3299,7 @@ static int __init init_dmars(void)
>   	struct dmar_rmrr_unit *rmrr;
>   	bool copied_tables = false;
>   	struct device *dev;
> +	struct pci_dev *pdev;
>   	struct intel_iommu *iommu;
>   	int i, ret;
>   
> @@ -3469,7 +3480,11 @@ static int __init init_dmars(void)
>   		}
>   	}
>   
> -	iommu_prepare_isa();
> +	pdev = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, NULL);
> +	if (pdev) {
> +		iommu_prepare_isa(pdev);
> +		pci_dev_put(pdev);
> +	}
>   
>   domains_done:
>   
> @@ -5266,6 +5281,7 @@ static void intel_iommu_get_resv_regions(struct device *device,
>   	struct iommu_resv_region *reg;
>   	struct dmar_rmrr_unit *rmrr;
>   	struct device *i_dev;
> +	struct pci_dev *pdev;
>   	int i;
>   
>   	rcu_read_lock();
> @@ -5280,6 +5296,14 @@ static void intel_iommu_get_resv_regions(struct device *device,
>   	}
>   	rcu_read_unlock();
>   
> +	if (dev_is_pci(device)) {
> +		pdev = to_pci_dev(device);
> +		if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA) {
> +			reg = iommu_get_isa_resv_region();
> +			list_add_tail(&reg->list, head);
> +		}
> +	}
> +

Just wondering why not just

+#ifdef CONFIG_INTEL_IOMMU_FLOPPY_WA
+	if (dev_is_pci(device)) {
+		pdev = to_pci_dev(device);
+		if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA) {
+			reg = iommu_alloc_resv_region(0,
+				16*1024*1024,
+				0, IOMMU_RESV_DIRECT);
+			if (reg)
+				list_add_tail(&reg->list, head);
+		}
+	}
+#endif

and, remove all other related code?

Best regards,
Lu Baolu


* Re: [PATCH v2 4/7] iommu/vt-d: Ignore domain parameter in attach_device if device requires identity map
  2019-03-14 11:58   ` [PATCH v2 4/7] iommu/vt-d: Ignore domain parameter in attach_device if device requires identity map James Sewart
@ 2019-03-15  2:30     ` Lu Baolu
  0 siblings, 0 replies; 55+ messages in thread
From: Lu Baolu @ 2019-03-15  2:30 UTC (permalink / raw)
  To: James Sewart, iommu
  Cc: baolu.lu, Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel

Hi,

On 3/14/19 7:58 PM, James Sewart wrote:
> If a device requires an identity map then it is not safe to attach a
> domain that can remap addresses. Warn the user if this occurs.
> 
> Signed-off-by: James Sewart <jamessewart@arista.com>
> ---
>   drivers/iommu/intel-iommu.c | 5 +++++
>   1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 2e00e8708f06..104d36f225a7 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -5101,6 +5101,11 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
>   		}
>   	}
>   
> +	if (iommu_no_mapping(dev)) {
> +		dmar_domain = si_domain;
> +		dev_warn(dev, "VT-d: Device is required to use identity IOMMU mapping, ignoring domain attached\n");
> +	}

This is awful. :-)

The generic IOMMU layer allocates a @domain and tries to bind the
@domain to a device. This code replaces @domain with si_domain without
notifying the upper layer.

Best regards,
Lu Baolu


* Re: [PATCH v2 0/7] iommu/vt-d: Fix-up device-domain relationship by refactoring to use iommu group default domain.
  2019-03-14 11:56 ` [PATCH v2 0/7] " James Sewart
                     ` (6 preceding siblings ...)
  2019-03-14 11:59   ` [PATCH v2 7/7] iommu/vt-d: Remove lazy allocation of domains James Sewart
@ 2019-03-15  3:13   ` Lu Baolu
  2019-03-19 13:35     ` James Sewart
  7 siblings, 1 reply; 55+ messages in thread
From: Lu Baolu @ 2019-03-15  3:13 UTC (permalink / raw)
  To: James Sewart, iommu
  Cc: baolu.lu, Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel

Hi James,

On 3/14/19 7:56 PM, James Sewart wrote:
> Patches 1 and 2 are the same as v1.
> 
> v1-v2:
>    Refactored ISA direct mappings to be returned by iommu_get_resv_regions.
>    Integrated patch by Lu to defer turning on DMAR until iommu.c has mapped
> reserved regions.
>    Integrated patches by Lu to remove more unused code in cleanup.
> 
> Lu: I didn't integrate your patch to set the default domain type as it
> isn't directly related to the aim of this patchset. Instead patch 4

Without those patches, user experience will be affected and some devices
will not work on Intel platforms anymore.

For a long time, the Intel IOMMU driver has had its own logic to
determine whether a device requires an identity domain. For example,
when the user specifies "iommu=pt" on the kernel command line, all
devices will be attached to the identity domain. Furthermore, some
quirky devices require an identity domain before DMA remapping is
enabled; otherwise, they will not work. This was done by adding quirk
bits in the Intel IOMMU driver.

So from my point of view, one way is porting all those quirks and
kernel parameters into the generic IOMMU layer; another is opening a
door for the vendor IOMMU driver to determine the default domain type
on its own. I prefer the latter option since it will not impact any
behaviour on other architectures.

> addresses the issue of a device requiring an identity domain by ignoring
> the domain param in attach_device and printing a warning.

This will not work, as I commented in that thread.

> 
> I booted some of our devices with this patchset and haven't seen any
> issues. It doesn't look like we have any devices with RMRR's though so
> those codepaths aren't tested.
> 
> James Sewart (7):
>    iommu: Move iommu_group_create_direct_mappings to after device_attach
>    iommu/vt-d: Implement apply_resv_region for reserving IOVA ranges
>    iommu/vt-d: Expose ISA direct mapping region via
>      iommu_get_resv_regions
>    iommu/vt-d: Ignore domain parameter in attach_device if device
>      requires identity map
>    iommu/vt-d: Allow IOMMU_DOMAIN_DMA to be allocated by iommu_ops
>    iommu/vt-d: Remove lazy allocation of domains
> 
> Lu Baolu (1):
>    iommu/vt-d: Enable DMA remapping after rmrr mapped
> 
>   drivers/iommu/intel-iommu.c | 444 +++++++++++-------------------------
>   drivers/iommu/iommu.c       |   4 +-
>   2 files changed, 131 insertions(+), 317 deletions(-)
> 

Best regards,
Lu Baolu


* Re: [PATCH v2 0/7] iommu/vt-d: Fix-up device-domain relationship by refactoring to use iommu group default domain.
  2019-03-15  3:13   ` [PATCH v2 0/7] iommu/vt-d: Fix-up device-domain relationship by refactoring to use iommu group default domain Lu Baolu
@ 2019-03-19 13:35     ` James Sewart
  2019-03-20  1:26       ` Lu Baolu
  0 siblings, 1 reply; 55+ messages in thread
From: James Sewart @ 2019-03-19 13:35 UTC (permalink / raw)
  To: Lu Baolu; +Cc: iommu, Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel

Hey Lu,

> On 15 Mar 2019, at 03:13, Lu Baolu <baolu.lu@linux.intel.com> wrote:
> 
> Hi James,
> 
> On 3/14/19 7:56 PM, James Sewart wrote:
>> Patches 1 and 2 are the same as v1.
>> v1-v2:
>>   Refactored ISA direct mappings to be returned by iommu_get_resv_regions.
>>   Integrated patch by Lu to defer turning on DMAR until iommu.c has mapped
>> reserved regions.
>>   Integrated patches by Lu to remove more unused code in cleanup.
>> Lu: I didn't integrate your patch to set the default domain type as it
>> isn't directly related to the aim of this patchset. Instead patch 4
> 
> Without those patches, user experience will be affected and some devices
> will not work on Intel platforms anymore.
> 
> For a long time, Intel IOMMU driver has its own logic to determine
> whether a device requires an identity domain. For example, when user
> specifies "iommu=pt" in kernel parameter, all device will be attached
> with the identity domain. Further more, some quirky devices require
> an identity domain to be used before enabling DMA remapping, otherwise,
> it will not work. This was done by adding quirk bits in Intel IOMMU
> driver.
> 
> So from my point of view, one way is porting all those quirks and kernel
> parameters into IOMMU generic layer, or opening a door for vendor IOMMU
> driver to determine the default domain type by their own. I prefer the
> latter option since it will not impact any behaviors on other
> architectures.

I see your point. I’m not confident that using the proposed door to set
a group’s default domain has the desired behaviour. As discussed
before, the default domain type will be set based on the desired type
of only the first device attached to a group. I think that to change
the default domain type you would need a slightly different door that
isn’t conditioned on a device.

For situations where individual devices require an identity domain
because of quirks, calling is_identity_map per device in
iommu_group_get_for_dev may be a better solution than the one I
proposed.

> 
>> addresses the issue of a device requiring an identity domain by ignoring
>> the domain param in attach_device and printing a warning.
> 
> This will not work as I commented in that thread.
> 
>> I booted some of our devices with this patchset and haven't seen any
>> issues. It doesn't look like we have any devices with RMRR's though so
>> those codepaths aren't tested.
>> James Sewart (7):
>>   iommu: Move iommu_group_create_direct_mappings to after device_attach
>>   iommu/vt-d: Implement apply_resv_region for reserving IOVA ranges
>>   iommu/vt-d: Expose ISA direct mapping region via
>>     iommu_get_resv_regions
>>   iommu/vt-d: Ignore domain parameter in attach_device if device
>>     requires identity map
>>   iommu/vt-d: Allow IOMMU_DOMAIN_DMA to be allocated by iommu_ops
>>   iommu/vt-d: Remove lazy allocation of domains
>> Lu Baolu (1):
>>   iommu/vt-d: Enable DMA remapping after rmrr mapped
>>  drivers/iommu/intel-iommu.c | 444 +++++++++++-------------------------
>>  drivers/iommu/iommu.c       |   4 +-
>>  2 files changed, 131 insertions(+), 317 deletions(-)
> 
> Best regards,
> Lu Baolu

Cheers,
James.



* Re: [PATCH v2 0/7] iommu/vt-d: Fix-up device-domain relationship by refactoring to use iommu group default domain.
  2019-03-19 13:35     ` James Sewart
@ 2019-03-20  1:26       ` Lu Baolu
  2019-03-22 10:05         ` James Sewart
  0 siblings, 1 reply; 55+ messages in thread
From: Lu Baolu @ 2019-03-20  1:26 UTC (permalink / raw)
  To: James Sewart
  Cc: baolu.lu, iommu, Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel

Hi James,

On 3/19/19 9:35 PM, James Sewart wrote:
> Hey Lu,
> 
>> On 15 Mar 2019, at 03:13, Lu Baolu <baolu.lu@linux.intel.com> wrote:
>>
>> Hi James,
>>
>> On 3/14/19 7:56 PM, James Sewart wrote:
>>> Patches 1 and 2 are the same as v1.
>>> v1-v2:
>>>    Refactored ISA direct mappings to be returned by iommu_get_resv_regions.
>>>    Integrated patch by Lu to defer turning on DMAR until iommu.c has mapped
>>> reserved regions.
>>>    Integrated patches by Lu to remove more unused code in cleanup.
>>> Lu: I didn't integrate your patch to set the default domain type as it
>>> isn't directly related to the aim of this patchset. Instead patch 4
>>
>> Without those patches, user experience will be affected and some devices
>> will not work on Intel platforms anymore.
>>
>> For a long time, Intel IOMMU driver has its own logic to determine
>> whether a device requires an identity domain. For example, when user
>> specifies "iommu=pt" in kernel parameter, all device will be attached
>> with the identity domain. Further more, some quirky devices require
>> an identity domain to be used before enabling DMA remapping, otherwise,
>> it will not work. This was done by adding quirk bits in Intel IOMMU
>> driver.
>>
>> So from my point of view, one way is porting all those quirks and kernel
>> parameters into IOMMU generic layer, or opening a door for vendor IOMMU
>> driver to determine the default domain type by their own. I prefer the
>> latter option since it will not impact any behaviors on other
>> architectures.
> 
> I see your point. I’m not confident that using the proposed door to set a
> groups default domain has the desired behaviour. As discussed before the
> default domain type will be set based on the desired type for only the
> first device attached to a group. I think to change the default domain
> type you would need a slightly different door that wasn’t conditioned on
> device.

I think of this as another problem. Just a summary for ease of
discussion. We saw two problems:

1. When allocating a new group for a device, how should we determine
the type of the default domain? This is what my proposed patches are
trying to address.

2. If we need to put a device into an existing group which uses a
different type of domain from what the device desires to use, we might
break the functionality of the device. For this problem I'd second your
proposal below if I get your point correctly.

> 
> For situations where individual devices require an identity domain because
> of quirks then maybe calling is_identity_map per device in
> iommu_group_get_for_dev is a better solution than the one I proposed.
> 

Do you mean that if we see a quirky device that requires a domain type
other than the group's default domain type, we will assign a new group
to it? That looks good to me as far as I can see. I suppose this should
be done in vt-d's ops callback.

Best regards,
Lu Baolu


* Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions
  2019-03-15  2:19     ` Lu Baolu
@ 2019-03-22  9:57       ` James Sewart
  2019-03-25  2:03         ` Lu Baolu
  0 siblings, 1 reply; 55+ messages in thread
From: James Sewart @ 2019-03-22  9:57 UTC (permalink / raw)
  To: Lu Baolu; +Cc: iommu, Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel

Hey Lu,

> On 15 Mar 2019, at 02:19, Lu Baolu <baolu.lu@linux.intel.com> wrote:
> 
> Hi James,
> 
> On 3/14/19 7:58 PM, James Sewart wrote:
>> To support mapping ISA region via iommu_group_create_direct_mappings,
>> make sure its exposed by iommu_get_resv_regions. This allows
>> deduplication of reserved region mappings
>> Signed-off-by: James Sewart <jamessewart@arista.com>
>> ---
>>  drivers/iommu/intel-iommu.c | 42 +++++++++++++++++++++++++++++--------
>>  1 file changed, 33 insertions(+), 9 deletions(-)
>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>> index 8e0a4e2ff77f..2e00e8708f06 100644
>> --- a/drivers/iommu/intel-iommu.c
>> +++ b/drivers/iommu/intel-iommu.c
>> @@ -337,6 +337,8 @@ static LIST_HEAD(dmar_rmrr_units);
>>  #define for_each_rmrr_units(rmrr) \
>>  	list_for_each_entry(rmrr, &dmar_rmrr_units, list)
>>  +static struct iommu_resv_region *isa_resv_region;
>> +
>>  /* bitmap for indexing intel_iommus */
>>  static int g_num_of_iommus;
>>  @@ -2780,26 +2782,34 @@ static inline int iommu_prepare_rmrr_dev(struct dmar_rmrr_unit *rmrr,
>>  					  rmrr->end_address);
>>  }
>>  +static inline struct iommu_resv_region *iommu_get_isa_resv_region(void)
>> +{
>> +	if (!isa_resv_region)
>> +		isa_resv_region = iommu_alloc_resv_region(0,
>> +				16*1024*1024,
>> +				0, IOMMU_RESV_DIRECT);
>> +
>> +	return isa_resv_region;
>> +}
>> +
>>  #ifdef CONFIG_INTEL_IOMMU_FLOPPY_WA
>> -static inline void iommu_prepare_isa(void)
>> +static inline void iommu_prepare_isa(struct pci_dev *pdev)
>>  {
>> -	struct pci_dev *pdev;
>>  	int ret;
>> +	struct iommu_resv_region *reg = iommu_get_isa_resv_region();
>>  -	pdev = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, NULL);
>> -	if (!pdev)
>> +	if (!reg)
>>  		return;
>>    	pr_info("Prepare 0-16MiB unity mapping for LPC\n");
>> -	ret = iommu_prepare_identity_map(&pdev->dev, 0, 16*1024*1024 - 1);
>> +	ret = iommu_prepare_identity_map(&pdev->dev, reg->start,
>> +			reg->start + reg->length - 1);
>>    	if (ret)
>>  		pr_err("Failed to create 0-16MiB identity map - floppy might not work\n");
>> -
>> -	pci_dev_put(pdev);
>>  }
>>  #else
>> -static inline void iommu_prepare_isa(void)
>> +static inline void iommu_prepare_isa(struct pci_dev *pdev)
>>  {
>>  	return;
>>  }
>> @@ -3289,6 +3299,7 @@ static int __init init_dmars(void)
>>  	struct dmar_rmrr_unit *rmrr;
>>  	bool copied_tables = false;
>>  	struct device *dev;
>> +	struct pci_dev *pdev;
>>  	struct intel_iommu *iommu;
>>  	int i, ret;
>>  @@ -3469,7 +3480,11 @@ static int __init init_dmars(void)
>>  		}
>>  	}
>>  -	iommu_prepare_isa();
>> +	pdev = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, NULL);
>> +	if (pdev) {
>> +		iommu_prepare_isa(pdev);
>> +		pci_dev_put(pdev);
>> +	}
>>    domains_done:
>>  @@ -5266,6 +5281,7 @@ static void intel_iommu_get_resv_regions(struct device *device,
>>  	struct iommu_resv_region *reg;
>>  	struct dmar_rmrr_unit *rmrr;
>>  	struct device *i_dev;
>> +	struct pci_dev *pdev;
>>  	int i;
>>    	rcu_read_lock();
>> @@ -5280,6 +5296,14 @@ static void intel_iommu_get_resv_regions(struct device *device,
>>  	}
>>  	rcu_read_unlock();
>>  +	if (dev_is_pci(device)) {
>> +		pdev = to_pci_dev(device);
>> +		if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA) {
>> +			reg = iommu_get_isa_resv_region();
>> +			list_add_tail(&reg->list, head);
>> +		}
>> +	}
>> +
> 
> Just wondering why not just
> 
> +#ifdef CONFIG_INTEL_IOMMU_FLOPPY_WA
> +	if (dev_is_pci(device)) {
> +		pdev = to_pci_dev(device);
> +		if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA) {
> +			reg = iommu_alloc_resv_region(0,
> +				16*1024*1024,
> +				0, IOMMU_RESV_DIRECT);
> +			if (reg)
> +				list_add_tail(&reg->list, head);
> +		}
> +	}
> +#endif
> 
> and, remove all other related code?

At this point in the patchset, if we remove iommu_prepare_isa then the
ISA region won’t be mapped for the device. Only once the DMA domain is
allocatable will the reserved regions be mapped by
iommu_group_create_direct_mappings.

There’s an issue: if we choose to allocate a new resv_region with type
IOMMU_RESV_DIRECT, we will need to refactor intel_iommu_put_resv_regions
to free this entry type, which means refactoring the RMRR regions in
get_resv_regions. Should this work be in this patchset?

> 
> Best regards,
> Lu Baolu

Cheers,
James.


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v2 0/7] iommu/vt-d: Fix-up device-domain relationship by refactoring to use iommu group default domain.
  2019-03-20  1:26       ` Lu Baolu
@ 2019-03-22 10:05         ` James Sewart
  0 siblings, 0 replies; 55+ messages in thread
From: James Sewart @ 2019-03-22 10:05 UTC (permalink / raw)
  To: Lu Baolu; +Cc: iommu, Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel

Hey Lu,

> On 20 Mar 2019, at 01:26, Lu Baolu <baolu.lu@linux.intel.com> wrote:
> 
> Hi James,
> 
> On 3/19/19 9:35 PM, James Sewart wrote:
>> Hey Lu,
>>> On 15 Mar 2019, at 03:13, Lu Baolu <baolu.lu@linux.intel.com> wrote:
>>> 
>>> Hi James,
>>> 
>>> On 3/14/19 7:56 PM, James Sewart wrote:
>>>> Patches 1 and 2 are the same as v1.
>>>> v1-v2:
>>>>   Refactored ISA direct mappings to be returned by iommu_get_resv_regions.
>>>>   Integrated patch by Lu to defer turning on DMAR until iommu.c has mapped
>>>> reserved regions.
>>>>   Integrated patches by Lu to remove more unused code in cleanup.
>>>> Lu: I didn't integrate your patch to set the default domain type as it
>>>> isn't directly related to the aim of this patchset. Instead patch 4
>>> 
>>> Without those patches, user experience will be affected and some devices
>>> will not work on Intel platforms anymore.
>>> 
>>> For a long time, Intel IOMMU driver has its own logic to determine
>>> whether a device requires an identity domain. For example, when user
>>> specifies "iommu=pt" in kernel parameter, all device will be attached
>>> with the identity domain. Further more, some quirky devices require
>>> an identity domain to be used before enabling DMA remapping, otherwise,
>>> it will not work. This was done by adding quirk bits in Intel IOMMU
>>> driver.
>>> 
>>> So from my point of view, one way is porting all those quirks and kernel
>>> parameters into IOMMU generic layer, or opening a door for vendor IOMMU
>>> driver to determine the default domain type by their own. I prefer the
>>> latter option since it will not impact any behaviors on other
>>> architectures.
>> I see your point. I’m not confident that using the proposed door to set a
>> groups default domain has the desired behaviour. As discussed before the
>> default domain type will be set based on the desired type for only the
>> first device attached to a group. I think to change the default domain
>> type you would need a slightly different door that wasn’t conditioned on
>> device.
> 
> I think this as another problem. Just a summarize for the ease of
> discussion. We saw two problems:
> 
> 1. When allocating a new group for a device, how should we determine the
> type of the default domain? This is what my proposal patches trying to
> address.

This will work as expected only if all devices within a group get the same 
result from is_identity_map; otherwise we see issue 2.

> 
> 2. If we need to put a device into an existing group which uses a
> different type of domain from what the device desires to use, we might
> break the functionality of the device. For this problem I'd second your
> proposal below if I get your point correctly.
> 
>> For situations where individual devices require an identity domain because
>> of quirks then maybe calling is_identity_map per device in
>> iommu_group_get_for_dev is a better solution than the one I proposed.
> 
> Do you mean if we see a quirky device requires a different domain type
> other than the default domain type of the group, we will assign a new
> group to it? That looks good to me as far as I can see. I suppose this
> should be done in vt-d's ops callback.

I have thought about this a bit and I think the cleanest approach is to 
put devices that require an identity domain into their own group. This can 
be done in the device_group callback, avoiding any situation where we have 
to deal with creating a new group based on domain type in 
iommu_group_get_for_dev. That way we shouldn’t ever be in a situation with 
multiple different domain types per group, and your patches will work as 
expected. See below for a possible implementation.

> 
> Best regards,
> Lu Baolu

Cheers,
James.

Devices that require an identity map because of quirks or other reasons
should be put in their own IOMMU group so as to not end up with multiple
different domains per group.

Signed-off-by: James Sewart <jamessewart@arista.com>

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 3cb8b36abf50..0f5a121d31a0 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5421,6 +5421,22 @@ struct intel_iommu *intel_svm_device_to_iommu(struct device *dev)
 }
 #endif /* CONFIG_INTEL_IOMMU_SVM */

+static struct iommu_group *intel_identity_group;
+
+struct iommu_group *intel_iommu_pci_device_group(struct device *dev)
+{
+       if (iommu_no_mapping(dev)) {
+               if (!intel_identity_group) {
+                       intel_identity_group = iommu_group_alloc();
+                       if (IS_ERR(intel_identity_group))
+                               return NULL;
+               }
+               return intel_identity_group;
+       }
+
+       return pci_device_group(dev);
+}
+
 const struct iommu_ops intel_iommu_ops = {
        .capable                = intel_iommu_capable,
        .domain_alloc           = intel_iommu_domain_alloc,
@@ -5435,7 +5451,7 @@ const struct iommu_ops intel_iommu_ops = {
        .get_resv_regions       = intel_iommu_get_resv_regions,
        .put_resv_regions       = intel_iommu_put_resv_regions,
        .apply_resv_region      = intel_iommu_apply_resv_region,
-       .device_group           = pci_device_group,
+       .device_group           = intel_iommu_pci_device_group,
        .pgsize_bitmap          = INTEL_IOMMU_PGSIZES,
 };

-- 
2.17.1



* Re: [PATCH v2 7/7] iommu/vt-d: Remove lazy allocation of domains
  2019-03-14 23:35     ` Jacob Pan
@ 2019-03-22 10:07       ` James Sewart
  0 siblings, 0 replies; 55+ messages in thread
From: James Sewart @ 2019-03-22 10:07 UTC (permalink / raw)
  To: Jacob Pan; +Cc: iommu, Tom Murphy, Dmitry Safonov, linux-kernel, Lu Baolu

Hey Jacob,

> On 14 Mar 2019, at 23:35, Jacob Pan <jacob.jun.pan@linux.intel.com> wrote:
> 
> On Thu, 14 Mar 2019 11:59:36 +0000
> James Sewart <jamessewart@arista.com> wrote:
> 
>> 
>> -	domain = get_valid_domain_for_dev(dev);
>> +	domain = find_domain(dev);
>> 	if (!domain)
>> 		return DMA_MAPPING_ERROR;
>> 
>> @@ -3914,7 +3624,7 @@ static int intel_map_sg(struct device *dev,
>> struct scatterlist *sglist, int nele if (iommu_no_mapping(dev))
>> 		return intel_nontranslate_map_sg(dev, sglist,
>> nelems, dir); 
>> -	domain = get_valid_domain_for_dev(dev);
>> +	domain = find_domain(dev);
> This patchset looks like a very good clean up, I am wondering why we
> can't use the generic iommu_get_domain_for_dev() here, since VT-d has a
> default DMA domain after your patch.

This should be possible; the only downside is that we get an iommu_domain 
from iommu_get_domain_for_dev and will have to check it isn’t NULL before 
getting the dmar_domain from it. We will be able to remove find_domain 
though.

Cheers,
James.


* Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions
  2019-03-22  9:57       ` James Sewart
@ 2019-03-25  2:03         ` Lu Baolu
  2019-03-25 12:57           ` James Sewart
  0 siblings, 1 reply; 55+ messages in thread
From: Lu Baolu @ 2019-03-25  2:03 UTC (permalink / raw)
  To: James Sewart
  Cc: baolu.lu, iommu, Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel

Hi James,

On 3/22/19 5:57 PM, James Sewart wrote:
> Hey Lu,
> 
>> On 15 Mar 2019, at 02:19, Lu Baolu <baolu.lu@linux.intel.com> wrote:
>>
>> Hi James,
>>
>> On 3/14/19 7:58 PM, James Sewart wrote:
>>> To support mapping ISA region via iommu_group_create_direct_mappings,
>>> make sure its exposed by iommu_get_resv_regions. This allows
>>> deduplication of reserved region mappings
>>> Signed-off-by: James Sewart <jamessewart@arista.com>
>>> ---
>>>   drivers/iommu/intel-iommu.c | 42 +++++++++++++++++++++++++++++--------
>>>   1 file changed, 33 insertions(+), 9 deletions(-)
>>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>>> index 8e0a4e2ff77f..2e00e8708f06 100644
>>> --- a/drivers/iommu/intel-iommu.c
>>> +++ b/drivers/iommu/intel-iommu.c
>>> @@ -337,6 +337,8 @@ static LIST_HEAD(dmar_rmrr_units);
>>>   #define for_each_rmrr_units(rmrr) \
>>>   	list_for_each_entry(rmrr, &dmar_rmrr_units, list)
>>>   +static struct iommu_resv_region *isa_resv_region;
>>> +
>>>   /* bitmap for indexing intel_iommus */
>>>   static int g_num_of_iommus;
>>>   @@ -2780,26 +2782,34 @@ static inline int iommu_prepare_rmrr_dev(struct dmar_rmrr_unit *rmrr,
>>>   					  rmrr->end_address);
>>>   }
>>>   +static inline struct iommu_resv_region *iommu_get_isa_resv_region(void)
>>> +{
>>> +	if (!isa_resv_region)
>>> +		isa_resv_region = iommu_alloc_resv_region(0,
>>> +				16*1024*1024,
>>> +				0, IOMMU_RESV_DIRECT);
>>> +
>>> +	return isa_resv_region;
>>> +}
>>> +
>>>   #ifdef CONFIG_INTEL_IOMMU_FLOPPY_WA
>>> -static inline void iommu_prepare_isa(void)
>>> +static inline void iommu_prepare_isa(struct pci_dev *pdev)
>>>   {
>>> -	struct pci_dev *pdev;
>>>   	int ret;
>>> +	struct iommu_resv_region *reg = iommu_get_isa_resv_region();
>>>   -	pdev = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, NULL);
>>> -	if (!pdev)
>>> +	if (!reg)
>>>   		return;
>>>     	pr_info("Prepare 0-16MiB unity mapping for LPC\n");
>>> -	ret = iommu_prepare_identity_map(&pdev->dev, 0, 16*1024*1024 - 1);
>>> +	ret = iommu_prepare_identity_map(&pdev->dev, reg->start,
>>> +			reg->start + reg->length - 1);
>>>     	if (ret)
>>>   		pr_err("Failed to create 0-16MiB identity map - floppy might not work\n");
>>> -
>>> -	pci_dev_put(pdev);
>>>   }
>>>   #else
>>> -static inline void iommu_prepare_isa(void)
>>> +static inline void iommu_prepare_isa(struct pci_dev *pdev)
>>>   {
>>>   	return;
>>>   }
>>> @@ -3289,6 +3299,7 @@ static int __init init_dmars(void)
>>>   	struct dmar_rmrr_unit *rmrr;
>>>   	bool copied_tables = false;
>>>   	struct device *dev;
>>> +	struct pci_dev *pdev;
>>>   	struct intel_iommu *iommu;
>>>   	int i, ret;
>>>   @@ -3469,7 +3480,11 @@ static int __init init_dmars(void)
>>>   		}
>>>   	}
>>>   -	iommu_prepare_isa();
>>> +	pdev = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, NULL);
>>> +	if (pdev) {
>>> +		iommu_prepare_isa(pdev);
>>> +		pci_dev_put(pdev);
>>> +	}
>>>     domains_done:
>>>   @@ -5266,6 +5281,7 @@ static void intel_iommu_get_resv_regions(struct device *device,
>>>   	struct iommu_resv_region *reg;
>>>   	struct dmar_rmrr_unit *rmrr;
>>>   	struct device *i_dev;
>>> +	struct pci_dev *pdev;
>>>   	int i;
>>>     	rcu_read_lock();
>>> @@ -5280,6 +5296,14 @@ static void intel_iommu_get_resv_regions(struct device *device,
>>>   	}
>>>   	rcu_read_unlock();
>>>   +	if (dev_is_pci(device)) {
>>> +		pdev = to_pci_dev(device);
>>> +		if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA) {
>>> +			reg = iommu_get_isa_resv_region();
>>> +			list_add_tail(&reg->list, head);
>>> +		}
>>> +	}
>>> +
>>
>> Just wondering why not just
>>
>> +#ifdef CONFIG_INTEL_IOMMU_FLOPPY_WA
>> +	if (dev_is_pci(device)) {
>> +		pdev = to_pci_dev(device);
>> +		if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA) {
>> +			reg = iommu_alloc_resv_region(0,
>> +				16*1024*1024,
>> +				0, IOMMU_RESV_DIRECT);
>> +			if (reg)
>> +				list_add_tail(&reg->list, head);
>> +		}
>> +	}
>> +#endif
>>
>> and, remove all other related code?
> 
> At this point in the patchset if we remove iommu_prepare_isa then the ISA
> region won’t be mapped to the device. Only once the dma domain is allocable
> will the reserved regions be mapped by iommu_group_create_direct_mappings.

Yes. So if we put the allocation code here, it won't impact anything and
will take effect as soon as the dma domain is allocatable.

> 
> Theres an issue that if we choose to alloc a new resv_region with type
> IOMMU_RESV_DIRECT, we will need to refactor intel_iommu_put_resv_regions
> to free this entry type which means refactoring the rmrr regions in
> get_resv_regions. Should this work be in this patchset?

Do you mean the rmrr regions are not allocated in get_resv_regions, but
are freed in put_resv_regions? I think we should fix this in this patch
set since this might impact the device passthrough if we don't do it.

Best regards,
Lu Baolu


* Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions
  2019-03-25  2:03         ` Lu Baolu
@ 2019-03-25 12:57           ` James Sewart
  2019-03-26  1:10             ` Lu Baolu
  2019-03-26  1:24             ` Lu Baolu
  0 siblings, 2 replies; 55+ messages in thread
From: James Sewart @ 2019-03-25 12:57 UTC (permalink / raw)
  To: Lu Baolu; +Cc: iommu, Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel

Hey Lu,

> On 25 Mar 2019, at 02:03, Lu Baolu <baolu.lu@linux.intel.com> wrote:
> 
> Hi James,
> 
> On 3/22/19 5:57 PM, James Sewart wrote:
>> Hey Lu,
>>> On 15 Mar 2019, at 02:19, Lu Baolu <baolu.lu@linux.intel.com> wrote:
>>> 
>>> Hi James,
>>> 
>>> On 3/14/19 7:58 PM, James Sewart wrote:
>>>> To support mapping ISA region via iommu_group_create_direct_mappings,
>>>> make sure its exposed by iommu_get_resv_regions. This allows
>>>> deduplication of reserved region mappings
>>>> Signed-off-by: James Sewart <jamessewart@arista.com>
>>>> ---
>>>>  drivers/iommu/intel-iommu.c | 42 +++++++++++++++++++++++++++++--------
>>>>  1 file changed, 33 insertions(+), 9 deletions(-)
>>>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>>>> index 8e0a4e2ff77f..2e00e8708f06 100644
>>>> --- a/drivers/iommu/intel-iommu.c
>>>> +++ b/drivers/iommu/intel-iommu.c
>>>> @@ -337,6 +337,8 @@ static LIST_HEAD(dmar_rmrr_units);
>>>>  #define for_each_rmrr_units(rmrr) \
>>>>  	list_for_each_entry(rmrr, &dmar_rmrr_units, list)
>>>>  +static struct iommu_resv_region *isa_resv_region;
>>>> +
>>>>  /* bitmap for indexing intel_iommus */
>>>>  static int g_num_of_iommus;
>>>>  @@ -2780,26 +2782,34 @@ static inline int iommu_prepare_rmrr_dev(struct dmar_rmrr_unit *rmrr,
>>>>  					  rmrr->end_address);
>>>>  }
>>>>  +static inline struct iommu_resv_region *iommu_get_isa_resv_region(void)
>>>> +{
>>>> +	if (!isa_resv_region)
>>>> +		isa_resv_region = iommu_alloc_resv_region(0,
>>>> +				16*1024*1024,
>>>> +				0, IOMMU_RESV_DIRECT);
>>>> +
>>>> +	return isa_resv_region;
>>>> +}
>>>> +
>>>>  #ifdef CONFIG_INTEL_IOMMU_FLOPPY_WA
>>>> -static inline void iommu_prepare_isa(void)
>>>> +static inline void iommu_prepare_isa(struct pci_dev *pdev)
>>>>  {
>>>> -	struct pci_dev *pdev;
>>>>  	int ret;
>>>> +	struct iommu_resv_region *reg = iommu_get_isa_resv_region();
>>>>  -	pdev = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, NULL);
>>>> -	if (!pdev)
>>>> +	if (!reg)
>>>>  		return;
>>>>    	pr_info("Prepare 0-16MiB unity mapping for LPC\n");
>>>> -	ret = iommu_prepare_identity_map(&pdev->dev, 0, 16*1024*1024 - 1);
>>>> +	ret = iommu_prepare_identity_map(&pdev->dev, reg->start,
>>>> +			reg->start + reg->length - 1);
>>>>    	if (ret)
>>>>  		pr_err("Failed to create 0-16MiB identity map - floppy might not work\n");
>>>> -
>>>> -	pci_dev_put(pdev);
>>>>  }
>>>>  #else
>>>> -static inline void iommu_prepare_isa(void)
>>>> +static inline void iommu_prepare_isa(struct pci_dev *pdev)
>>>>  {
>>>>  	return;
>>>>  }
>>>> @@ -3289,6 +3299,7 @@ static int __init init_dmars(void)
>>>>  	struct dmar_rmrr_unit *rmrr;
>>>>  	bool copied_tables = false;
>>>>  	struct device *dev;
>>>> +	struct pci_dev *pdev;
>>>>  	struct intel_iommu *iommu;
>>>>  	int i, ret;
>>>>  @@ -3469,7 +3480,11 @@ static int __init init_dmars(void)
>>>>  		}
>>>>  	}
>>>>  -	iommu_prepare_isa();
>>>> +	pdev = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, NULL);
>>>> +	if (pdev) {
>>>> +		iommu_prepare_isa(pdev);
>>>> +		pci_dev_put(pdev);
>>>> +	}
>>>>    domains_done:
>>>>  @@ -5266,6 +5281,7 @@ static void intel_iommu_get_resv_regions(struct device *device,
>>>>  	struct iommu_resv_region *reg;
>>>>  	struct dmar_rmrr_unit *rmrr;
>>>>  	struct device *i_dev;
>>>> +	struct pci_dev *pdev;
>>>>  	int i;
>>>>    	rcu_read_lock();
>>>> @@ -5280,6 +5296,14 @@ static void intel_iommu_get_resv_regions(struct device *device,
>>>>  	}
>>>>  	rcu_read_unlock();
>>>>  +	if (dev_is_pci(device)) {
>>>> +		pdev = to_pci_dev(device);
>>>> +		if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA) {
>>>> +			reg = iommu_get_isa_resv_region();
>>>> +			list_add_tail(&reg->list, head);
>>>> +		}
>>>> +	}
>>>> +
>>> 
>>> Just wondering why not just
>>> 
>>> +#ifdef CONFIG_INTEL_IOMMU_FLOPPY_WA
>>> +	if (dev_is_pci(device)) {
>>> +		pdev = to_pci_dev(device);
>>> +		if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA) {
>>> +			reg = iommu_alloc_resv_region(0,
>>> +				16*1024*1024,
>>> +				0, IOMMU_RESV_DIRECT);
>>> +			if (reg)
>>> +				list_add_tail(&reg->list, head);
>>> +		}
>>> +	}
>>> +#endif
>>> 
>>> and, remove all other related code?
>> At this point in the patchset if we remove iommu_prepare_isa then the ISA
>> region won’t be mapped to the device. Only once the dma domain is allocable
>> will the reserved regions be mapped by iommu_group_create_direct_mappings.
> 
> Yes. So if we put the allocation code here, it won't impact anything and
> will take effect as soon as the dma domain is allocatable.
> 
>> Theres an issue that if we choose to alloc a new resv_region with type
>> IOMMU_RESV_DIRECT, we will need to refactor intel_iommu_put_resv_regions
>> to free this entry type which means refactoring the rmrr regions in
>> get_resv_regions. Should this work be in this patchset?
> 
> Do you mean the rmrr regions are not allocated in get_resv_regions, but
> are freed in put_resv_regions? I think we should fix this in this patch
> set since this might impact the device passthrough if we don't do it.

They’re not allocated and not freed currently; only type IOMMU_RESV_MSI is 
freed in put_resv_regions. If we allocate a new resv_region with type 
IOMMU_RESV_DIRECT for the ISA region, then it won’t be freed. If we modify 
put_resv_regions to free type IOMMU_RESV_DIRECT, then we will try to free 
the static RMRR regions.

Either the ISA region is static and not freed, as with my implementation, 
or the RMRR regions are converted to be allocated on each call to 
get_resv_regions and freed in put_resv_regions.

> 
> Best regards,
> Lu Baolu

Cheers,
James.


* Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions
  2019-03-25 12:57           ` James Sewart
@ 2019-03-26  1:10             ` Lu Baolu
  2019-03-26  1:24             ` Lu Baolu
  1 sibling, 0 replies; 55+ messages in thread
From: Lu Baolu @ 2019-03-26  1:10 UTC (permalink / raw)
  To: James Sewart
  Cc: baolu.lu, iommu, Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel

Hi,

On 3/25/19 8:57 PM, James Sewart wrote:
> Hey Lu,
> 
>> On 25 Mar 2019, at 02:03, Lu Baolu <baolu.lu@linux.intel.com> wrote:
>>
>> Hi James,
>>
>> On 3/22/19 5:57 PM, James Sewart wrote:
>>> Hey Lu,
>>>> On 15 Mar 2019, at 02:19, Lu Baolu <baolu.lu@linux.intel.com> wrote:
>>>>
>>>> Hi James,
>>>>
>>>> On 3/14/19 7:58 PM, James Sewart wrote:
>>>>> To support mapping ISA region via iommu_group_create_direct_mappings,
>>>>> make sure its exposed by iommu_get_resv_regions. This allows
>>>>> deduplication of reserved region mappings
>>>>> Signed-off-by: James Sewart <jamessewart@arista.com>
>>>>> ---
>>>>>   drivers/iommu/intel-iommu.c | 42 +++++++++++++++++++++++++++++--------
>>>>>   1 file changed, 33 insertions(+), 9 deletions(-)
>>>>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>>>>> index 8e0a4e2ff77f..2e00e8708f06 100644
>>>>> --- a/drivers/iommu/intel-iommu.c
>>>>> +++ b/drivers/iommu/intel-iommu.c
>>>>> @@ -337,6 +337,8 @@ static LIST_HEAD(dmar_rmrr_units);
>>>>>   #define for_each_rmrr_units(rmrr) \
>>>>>   	list_for_each_entry(rmrr, &dmar_rmrr_units, list)
>>>>>   +static struct iommu_resv_region *isa_resv_region;
>>>>> +
>>>>>   /* bitmap for indexing intel_iommus */
>>>>>   static int g_num_of_iommus;
>>>>>   @@ -2780,26 +2782,34 @@ static inline int iommu_prepare_rmrr_dev(struct dmar_rmrr_unit *rmrr,
>>>>>   					  rmrr->end_address);
>>>>>   }
>>>>>   +static inline struct iommu_resv_region *iommu_get_isa_resv_region(void)
>>>>> +{
>>>>> +	if (!isa_resv_region)
>>>>> +		isa_resv_region = iommu_alloc_resv_region(0,
>>>>> +				16*1024*1024,
>>>>> +				0, IOMMU_RESV_DIRECT);
>>>>> +
>>>>> +	return isa_resv_region;
>>>>> +}
>>>>> +
>>>>>   #ifdef CONFIG_INTEL_IOMMU_FLOPPY_WA
>>>>> -static inline void iommu_prepare_isa(void)
>>>>> +static inline void iommu_prepare_isa(struct pci_dev *pdev)
>>>>>   {
>>>>> -	struct pci_dev *pdev;
>>>>>   	int ret;
>>>>> +	struct iommu_resv_region *reg = iommu_get_isa_resv_region();
>>>>>   -	pdev = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, NULL);
>>>>> -	if (!pdev)
>>>>> +	if (!reg)
>>>>>   		return;
>>>>>     	pr_info("Prepare 0-16MiB unity mapping for LPC\n");
>>>>> -	ret = iommu_prepare_identity_map(&pdev->dev, 0, 16*1024*1024 - 1);
>>>>> +	ret = iommu_prepare_identity_map(&pdev->dev, reg->start,
>>>>> +			reg->start + reg->length - 1);
>>>>>     	if (ret)
>>>>>   		pr_err("Failed to create 0-16MiB identity map - floppy might not work\n");
>>>>> -
>>>>> -	pci_dev_put(pdev);
>>>>>   }
>>>>>   #else
>>>>> -static inline void iommu_prepare_isa(void)
>>>>> +static inline void iommu_prepare_isa(struct pci_dev *pdev)
>>>>>   {
>>>>>   	return;
>>>>>   }
>>>>> @@ -3289,6 +3299,7 @@ static int __init init_dmars(void)
>>>>>   	struct dmar_rmrr_unit *rmrr;
>>>>>   	bool copied_tables = false;
>>>>>   	struct device *dev;
>>>>> +	struct pci_dev *pdev;
>>>>>   	struct intel_iommu *iommu;
>>>>>   	int i, ret;
>>>>>   @@ -3469,7 +3480,11 @@ static int __init init_dmars(void)
>>>>>   		}
>>>>>   	}
>>>>>   -	iommu_prepare_isa();
>>>>> +	pdev = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, NULL);
>>>>> +	if (pdev) {
>>>>> +		iommu_prepare_isa(pdev);
>>>>> +		pci_dev_put(pdev);
>>>>> +	}
>>>>>     domains_done:
>>>>>   @@ -5266,6 +5281,7 @@ static void intel_iommu_get_resv_regions(struct device *device,
>>>>>   	struct iommu_resv_region *reg;
>>>>>   	struct dmar_rmrr_unit *rmrr;
>>>>>   	struct device *i_dev;
>>>>> +	struct pci_dev *pdev;
>>>>>   	int i;
>>>>>     	rcu_read_lock();
>>>>> @@ -5280,6 +5296,14 @@ static void intel_iommu_get_resv_regions(struct device *device,
>>>>>   	}
>>>>>   	rcu_read_unlock();
>>>>>   +	if (dev_is_pci(device)) {
>>>>> +		pdev = to_pci_dev(device);
>>>>> +		if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA) {
>>>>> +			reg = iommu_get_isa_resv_region();
>>>>> +			list_add_tail(&reg->list, head);
>>>>> +		}
>>>>> +	}
>>>>> +
>>>>
>>>> Just wondering why not just
>>>>
>>>> +#ifdef CONFIG_INTEL_IOMMU_FLOPPY_WA
>>>> +	if (dev_is_pci(device)) {
>>>> +		pdev = to_pci_dev(device);
>>>> +		if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA) {
>>>> +			reg = iommu_alloc_resv_region(0,
>>>> +				16*1024*1024,
>>>> +				0, IOMMU_RESV_DIRECT);
>>>> +			if (reg)
>>>> +				list_add_tail(&reg->list, head);
>>>> +		}
>>>> +	}
>>>> +#endif
>>>>
>>>> and, remove all other related code?
>>> At this point in the patchset if we remove iommu_prepare_isa then the ISA
>>> region won’t be mapped to the device. Only once the dma domain is allocable
>>> will the reserved regions be mapped by iommu_group_create_direct_mappings.
>>
>> Yes. So if we put the allocation code here, it won't impact anything and
>> will take effect as soon as the dma domain is allocatable.
>>
>>> Theres an issue that if we choose to alloc a new resv_region with type
>>> IOMMU_RESV_DIRECT, we will need to refactor intel_iommu_put_resv_regions
>>> to free this entry type which means refactoring the rmrr regions in
>>> get_resv_regions. Should this work be in this patchset?
>>
>> Do you mean the rmrr regions are not allocated in get_resv_regions, but
>> are freed in put_resv_regions? I think we should fix this in this patch
>> set since this might impact the device passthrough if we don't do it.
> 
> They’re not allocated and not freed currently, only type IOMMU_RESV_MSI is
> freed in put_resv_regions. If we allocate a new resv_region with type
> IOMMU_RESV_DIRECT for the isa region, then it won’t be freed. If we modify
> put_resv_regions to free type IOMMU_RESV_DIRECT, then we will try to free
> the static RMRR regions.

Ah yes! Thanks for the explanation.

> 
> Either the ISA region is static and not freed as with my implementation,
> or the RMRR regions are converted to be allocated on each call to
> get_resv_regions and freed in put_resv_regions.

Okay, I second your implementation now.

> 
>>
>> Best regards,
>> Lu Baolu
> 
> Cheers,
> James.
> 

Best regards,
Lu Baolu


* Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions
  2019-03-25 12:57           ` James Sewart
  2019-03-26  1:10             ` Lu Baolu
@ 2019-03-26  1:24             ` Lu Baolu
  2019-03-28 18:37               ` James Sewart
  1 sibling, 1 reply; 55+ messages in thread
From: Lu Baolu @ 2019-03-26  1:24 UTC (permalink / raw)
  To: James Sewart
  Cc: baolu.lu, iommu, Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel

Hi James,

On 3/25/19 8:57 PM, James Sewart wrote:
>>> Theres an issue that if we choose to alloc a new resv_region with type
>>> IOMMU_RESV_DIRECT, we will need to refactor intel_iommu_put_resv_regions
>>> to free this entry type which means refactoring the rmrr regions in
>>> get_resv_regions. Should this work be in this patchset?
>> Do you mean the rmrr regions are not allocated in get_resv_regions, but
>> are freed in put_resv_regions? I think we should fix this in this patch
>> set since this might impact the device passthrough if we don't do it.
> They’re not allocated and not freed currently, only type IOMMU_RESV_MSI is
> freed in put_resv_regions. If we allocate a new resv_region with type
> IOMMU_RESV_DIRECT for the isa region, then it won’t be freed. If we modify
> put_resv_regions to free type IOMMU_RESV_DIRECT, then we will try to free
> the static RMRR regions.
> 
> Either the ISA region is static and not freed as with my implementation,
> or the RMRR regions are converted to be allocated on each call to
> get_resv_regions and freed in put_resv_regions.
> 

By the way, there's another way in my mind. Let's add a new region type
for LPC devices, e.x. IOMMU_RESV_LPC, and then handle it in the same way
as those MSI regions. Just FYI.

Best regards,
Lu Baolu


* Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions
  2019-03-26  1:24             ` Lu Baolu
@ 2019-03-28 18:37               ` James Sewart
  2019-03-29 15:26                 ` James Sewart
  0 siblings, 1 reply; 55+ messages in thread
From: James Sewart @ 2019-03-28 18:37 UTC (permalink / raw)
  To: Lu Baolu; +Cc: iommu, Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel

Hey Lu,

> On 26 Mar 2019, at 01:24, Lu Baolu <baolu.lu@linux.intel.com> wrote:
> 
> Hi James,
> 
> On 3/25/19 8:57 PM, James Sewart wrote:
>>>> Theres an issue that if we choose to alloc a new resv_region with type
>>>> IOMMU_RESV_DIRECT, we will need to refactor intel_iommu_put_resv_regions
>>>> to free this entry type which means refactoring the rmrr regions in
>>>> get_resv_regions. Should this work be in this patchset?
>>> Do you mean the rmrr regions are not allocated in get_resv_regions, but
>>> are freed in put_resv_regions? I think we should fix this in this patch
>>> set since this might impact the device passthrough if we don't do it.
>> They’re not allocated and not freed currently, only type IOMMU_RESV_MSI is
>> freed in put_resv_regions. If we allocate a new resv_region with type
>> IOMMU_RESV_DIRECT for the isa region, then it won’t be freed. If we modify
>> put_resv_regions to free type IOMMU_RESV_DIRECT, then we will try to free
>> the static RMRR regions.
>> Either the ISA region is static and not freed as with my implementation,
>> or the RMRR regions are converted to be allocated on each call to
>> get_resv_regions and freed in put_resv_regions.
> 
> By the way, there's another way in my mind. Let's add a new region type
> for LPC devices, e.x. IOMMU_RESV_LPC, and then handle it in the same way
> as those MSI regions. Just FYI.

This solution would require adding some extra code to 
iommu_group_create_direct_mappings, as currently only type 
IOMMU_RESV_DIRECT is identity mapped; other types are only reserved.


> 
> Best regards,
> Lu Baolu

Cheers,
James.


* Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions
  2019-03-28 18:37               ` James Sewart
@ 2019-03-29 15:26                 ` James Sewart
  2019-04-04  6:49                   ` Lu Baolu
  0 siblings, 1 reply; 55+ messages in thread
From: James Sewart @ 2019-03-29 15:26 UTC (permalink / raw)
  To: Lu Baolu; +Cc: iommu, Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2035 bytes --]

Hey Lu,

I’ve attached a preliminary v3; if you could take a look and run some tests, 
that would be great.

Since v2 I’ve added your default domain type patches and the new 
device_group handler, and incorporated Jacob’s feedback.

> On 28 Mar 2019, at 18:37, James Sewart <jamessewart@arista.com> wrote:
> 
> Hey Lu,
> 
>> On 26 Mar 2019, at 01:24, Lu Baolu <baolu.lu@linux.intel.com> wrote:
>> 
>> Hi James,
>> 
>> On 3/25/19 8:57 PM, James Sewart wrote:
>>>>> Theres an issue that if we choose to alloc a new resv_region with type
>>>>> IOMMU_RESV_DIRECT, we will need to refactor intel_iommu_put_resv_regions
>>>>> to free this entry type which means refactoring the rmrr regions in
>>>>> get_resv_regions. Should this work be in this patchset?
>>>> Do you mean the rmrr regions are not allocated in get_resv_regions, but
>>>> are freed in put_resv_regions? I think we should fix this in this patch
>>>> set since this might impact the device passthrough if we don't do it.
>>> They’re not allocated and not freed currently, only type IOMMU_RESV_MSI is
>>> freed in put_resv_regions. If we allocate a new resv_region with type
>>> IOMMU_RESV_DIRECT for the isa region, then it won’t be freed. If we modify
>>> put_resv_regions to free type IOMMU_RESV_DIRECT, then we will try to free
>>> the static RMRR regions.
>>> Either the ISA region is static and not freed as with my implementation,
>>> or the RMRR regions are converted to be allocated on each call to
>>> get_resv_regions and freed in put_resv_regions.
>> 
>> By the way, there's another way in my mind. Let's add a new region type
>> for LPC devices, e.x. IOMMU_RESV_LPC, and then handle it in the same way
>> as those MSI regions. Just FYI.
> 
> This solution would require adding some extra code to 
> iommu_group_create_direct_mappings as currently only type 
> IOMMU_RESV_DIRECT is identity mapped, other types are only reserved.
> 
> 
>> 
>> Best regards,
>> Lu Baolu
> 
> Cheers,
> James.

Cheers,
James.


[-- Attachment #2: 0001-iommu-Move-iommu_group_create_direct_mappings-to-aft.patch --]
[-- Type: application/octet-stream, Size: 1289 bytes --]

From 798311d11929d45432641f20f2b869b112e8c956 Mon Sep 17 00:00:00 2001
From: James Sewart <jamessewart@arista.com>
Date: Fri, 1 Mar 2019 16:03:40 +0000
Subject: [PATCH 1/9] iommu: Move iommu_group_create_direct_mappings to after
 device_attach

If an IOMMU driver needs to know which IOMMU a domain is associated with
in order to initialise that domain, it will do so in device_attach;
before then it is not safe to call iommu_ops.

Signed-off-by: James Sewart <jamessewart@arista.com>

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 33a982e33716..88d589d56ded 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -652,8 +652,6 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev)
 
 	dev->iommu_group = group;
 
-	iommu_group_create_direct_mappings(group, dev);
-
 	mutex_lock(&group->mutex);
 	list_add_tail(&device->list, &group->devices);
 	if (group->domain)
@@ -662,6 +660,8 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev)
 	if (ret)
 		goto err_put_group;
 
+	iommu_group_create_direct_mappings(group, dev);
+
 	/* Notify any listeners about change to group. */
 	blocking_notifier_call_chain(&group->notifier,
 				     IOMMU_GROUP_NOTIFY_ADD_DEVICE, dev);
-- 
2.17.1
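The ordering problem this patch fixes can be modelled in a few lines of userspace C. All names below are illustrative, not the kernel's: a domain that is only fully initialised on attach cannot service mapping requests beforehand, so the direct mappings must move after device_attach.

```c
#include <assert.h>

/* Illustrative model: the driver finishes domain init on attach, when
 * it first learns which IOMMU the device sits behind. */
struct domain {
	int initialised;
};

static void attach_device(struct domain *d)
{
	d->initialised = 1;	/* deferred init completes here */
}

static int map_reserved_regions(struct domain *d)
{
	if (!d->initialised)
		return -1;	/* old order: mapping before attach fails */
	return 0;
}
```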


[-- Attachment #3: 0002-iommu-vt-d-Implement-apply_resv_region-for-reserving.patch --]
[-- Type: application/octet-stream, Size: 1609 bytes --]

From 3bf87fef92f632b1bf789fcd2966d3f9bf707c43 Mon Sep 17 00:00:00 2001
From: James Sewart <jamessewart@arista.com>
Date: Fri, 1 Mar 2019 16:24:59 +0000
Subject: [PATCH 2/9] iommu/vt-d: Implement apply_resv_region for reserving
 IOVA ranges

Used by iommu.c before creating identity mappings for reserved ranges,
to ensure dma-ops will never remap these ranges.

Signed-off-by: James Sewart <jamessewart@arista.com>

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 28cb713d728c..3caa7b6fcdde 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5307,6 +5307,19 @@ static void intel_iommu_put_resv_regions(struct device *dev,
 	}
 }
 
+static void intel_iommu_apply_resv_region(struct device *dev,
+		struct iommu_domain *domain,
+		struct iommu_resv_region *region)
+{
+	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
+	unsigned long start, end;
+
+	start = IOVA_PFN(region->start);
+	end   = IOVA_PFN(region->start + region->length - 1);
+
+	WARN_ON_ONCE(reserve_iova(&dmar_domain->iovad, start, end) == NULL);
+}
+
 #ifdef CONFIG_INTEL_IOMMU_SVM
 int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct intel_svm_dev *sdev)
 {
@@ -5400,6 +5413,7 @@ const struct iommu_ops intel_iommu_ops = {
 	.remove_device		= intel_iommu_remove_device,
 	.get_resv_regions	= intel_iommu_get_resv_regions,
 	.put_resv_regions	= intel_iommu_put_resv_regions,
+	.apply_resv_region	= intel_iommu_apply_resv_region,
 	.device_group		= pci_device_group,
 	.pgsize_bitmap		= INTEL_IOMMU_PGSIZES,
 };
-- 
2.17.1
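The PFN arithmetic in intel_iommu_apply_resv_region can be checked with a small userspace sketch, assuming 4KiB pages as IOVA_PFN() does. The helper below is illustrative, not kernel code; note the inclusive end PFN computed from the last byte of the region:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 4KiB page size, matching the kernel's IOVA_PFN() on x86. */
#define PAGE_SHIFT 12
#define IOVA_PFN(addr) ((uint64_t)(addr) >> PAGE_SHIFT)

struct pfn_range {
	uint64_t start_pfn, end_pfn;
};

/* Invented helper mirroring the start/end computation in the patch:
 * end is derived from start + length - 1 so the range is inclusive. */
static struct pfn_range resv_pfn_range(uint64_t start, uint64_t length)
{
	struct pfn_range r = {
		IOVA_PFN(start),
		IOVA_PFN(start + length - 1),
	};
	return r;
}
```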


[-- Attachment #4: 0003-iommu-vt-d-Expose-ISA-direct-mapping-region-via-iomm.patch --]
[-- Type: application/octet-stream, Size: 3328 bytes --]

From d7ccfb5297261d2781b3e29c2636ac94e9194569 Mon Sep 17 00:00:00 2001
From: James Sewart <jamessewart@arista.com>
Date: Thu, 7 Mar 2019 17:57:15 +0000
Subject: [PATCH 3/9] iommu/vt-d: Expose ISA direct mapping region via
 iommu_get_resv_regions

To support mapping the ISA region via iommu_group_create_direct_mappings,
make sure it is exposed by iommu_get_resv_regions. This allows
deduplication of reserved region mappings.

Signed-off-by: James Sewart <jamessewart@arista.com>

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 3caa7b6fcdde..1075a7c74507 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -339,6 +339,8 @@ static LIST_HEAD(dmar_rmrr_units);
 #define for_each_rmrr_units(rmrr) \
 	list_for_each_entry(rmrr, &dmar_rmrr_units, list)
 
+static struct iommu_resv_region *isa_resv_region;
+
 /* bitmap for indexing intel_iommus */
 static int g_num_of_iommus;
 
@@ -2782,26 +2784,34 @@ static inline int iommu_prepare_rmrr_dev(struct dmar_rmrr_unit *rmrr,
 					  rmrr->end_address);
 }
 
+static inline struct iommu_resv_region *iommu_get_isa_resv_region(void)
+{
+	if (!isa_resv_region)
+		isa_resv_region = iommu_alloc_resv_region(0,
+				16*1024*1024,
+				0, IOMMU_RESV_DIRECT);
+
+	return isa_resv_region;
+}
+
 #ifdef CONFIG_INTEL_IOMMU_FLOPPY_WA
-static inline void iommu_prepare_isa(void)
+static inline void iommu_prepare_isa(struct pci_dev *pdev)
 {
-	struct pci_dev *pdev;
 	int ret;
+	struct iommu_resv_region *reg = iommu_get_isa_resv_region();
 
-	pdev = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, NULL);
-	if (!pdev)
+	if (!reg)
 		return;
 
 	pr_info("Prepare 0-16MiB unity mapping for LPC\n");
-	ret = iommu_prepare_identity_map(&pdev->dev, 0, 16*1024*1024 - 1);
+	ret = iommu_prepare_identity_map(&pdev->dev, reg->start,
+			reg->start + reg->length - 1);
 
 	if (ret)
 		pr_err("Failed to create 0-16MiB identity map - floppy might not work\n");
-
-	pci_dev_put(pdev);
 }
 #else
-static inline void iommu_prepare_isa(void)
+static inline void iommu_prepare_isa(struct pci_dev *pdev)
 {
 	return;
 }
@@ -3290,6 +3300,7 @@ static int __init init_dmars(void)
 	struct dmar_rmrr_unit *rmrr;
 	bool copied_tables = false;
 	struct device *dev;
+	struct pci_dev *pdev;
 	struct intel_iommu *iommu;
 	int i, ret;
 
@@ -3470,7 +3481,11 @@ static int __init init_dmars(void)
 		}
 	}
 
-	iommu_prepare_isa();
+	pdev = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, NULL);
+	if (pdev) {
+		iommu_prepare_isa(pdev);
+		pci_dev_put(pdev);
+	}
 
 domains_done:
 
@@ -5274,6 +5289,7 @@ static void intel_iommu_get_resv_regions(struct device *device,
 	struct iommu_resv_region *reg;
 	struct dmar_rmrr_unit *rmrr;
 	struct device *i_dev;
+	struct pci_dev *pdev;
 	int i;
 
 	rcu_read_lock();
@@ -5288,6 +5304,14 @@ static void intel_iommu_get_resv_regions(struct device *device,
 	}
 	rcu_read_unlock();
 
+	if (dev_is_pci(device)) {
+		pdev = to_pci_dev(device);
+		if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA) {
+			reg = iommu_get_isa_resv_region();
+			list_add_tail(&reg->list, head);
+		}
+	}
+
 	reg = iommu_alloc_resv_region(IOAPIC_RANGE_START,
 				      IOAPIC_RANGE_END - IOAPIC_RANGE_START + 1,
 				      0, IOMMU_RESV_MSI);
-- 
2.17.1


[-- Attachment #5: 0004-iommu-vt-d-Create-an-IOMMU-group-for-devices-that-re.patch --]
[-- Type: application/octet-stream, Size: 1703 bytes --]

From f282a75bf3ed393e9eed57c0aa34a93097022e76 Mon Sep 17 00:00:00 2001
From: James Sewart <jamessewart@arista.com>
Date: Sat, 9 Mar 2019 18:05:53 +0000
Subject: [PATCH 4/9] iommu/vt-d: Create an IOMMU group for devices that
 require an identity map

Devices that require an identity map because of quirks or other reasons
should be put in their own IOMMU group, so that a group never ends up
containing devices with different domains.

Signed-off-by: James Sewart <jamessewart@arista.com>

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 1075a7c74507..1db0ceb33020 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5424,6 +5424,22 @@ struct intel_iommu *intel_svm_device_to_iommu(struct device *dev)
 }
 #endif /* CONFIG_INTEL_IOMMU_SVM */
 
+static struct iommu_group *intel_identity_group;
+
+struct iommu_group *intel_iommu_pci_device_group(struct device *dev)
+{
+	if (iommu_no_mapping(dev)) {
+		if (!intel_identity_group) {
+			intel_identity_group = iommu_group_alloc();
+			if (IS_ERR(intel_identity_group))
+				return NULL;
+		}
+		return intel_identity_group;
+	}
+
+	return pci_device_group(dev);
+}
+
 const struct iommu_ops intel_iommu_ops = {
 	.capable		= intel_iommu_capable,
 	.domain_alloc		= intel_iommu_domain_alloc,
@@ -5438,7 +5454,7 @@ const struct iommu_ops intel_iommu_ops = {
 	.get_resv_regions	= intel_iommu_get_resv_regions,
 	.put_resv_regions	= intel_iommu_put_resv_regions,
 	.apply_resv_region	= intel_iommu_apply_resv_region,
-	.device_group		= pci_device_group,
+	.device_group		= intel_iommu_pci_device_group,
 	.pgsize_bitmap		= INTEL_IOMMU_PGSIZES,
 };
 
-- 
2.17.1
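The singleton pattern used by intel_iommu_pci_device_group — one lazily allocated group shared by every identity-mapped device — can be sketched as follows. This is an illustrative userspace model; the kernel version allocates via iommu_group_alloc():

```c
#include <assert.h>

struct group {
	int id;
};

static struct group *identity_group;

/* Invented helper: allocate the shared identity group once, then hand
 * the same group back to every caller, so identity-mapped devices
 * never mix with DMA-mapped devices in one group. */
static struct group *get_identity_group(void)
{
	static struct group g = { 1 };

	if (!identity_group)
		identity_group = &g;	/* allocated once, then reused */
	return identity_group;
}
```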


[-- Attachment #6: 0005-iommu-Add-ops-entry-for-vendor-specific-default-doma.patch --]
[-- Type: application/octet-stream, Size: 2485 bytes --]

From 5210d963c9dde1d6bae95f0d338360d7eb92ce5a Mon Sep 17 00:00:00 2001
From: James Sewart <jamessewart@arista.com>
Date: Thu, 21 Mar 2019 11:31:44 +0000
Subject: [PATCH 5/9] iommu: Add ops entry for vendor specific default domain
 type

This adds an iommu_ops entry that lets a vendor IOMMU driver specify
the type of the default domain for each device. This is needed because
the vendor driver has long had its own logic for deciding when to use
an identity domain, and must preserve that behaviour when it switches
over to using default domains.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 88d589d56ded..516e4d8995c2 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1102,12 +1102,18 @@ struct iommu_group *iommu_group_get_for_dev(struct device *dev)
 	 */
 	if (!group->default_domain) {
 		struct iommu_domain *dom;
+		int domain_type = iommu_def_domain_type;
 
-		dom = __iommu_domain_alloc(dev->bus, iommu_def_domain_type);
-		if (!dom && iommu_def_domain_type != IOMMU_DOMAIN_DMA) {
+		if (ops->is_identity_map)
+			domain_type = ops->is_identity_map(dev) ?
+				      IOMMU_DOMAIN_IDENTITY :
+				      IOMMU_DOMAIN_DMA;
+
+		dom = __iommu_domain_alloc(dev->bus, domain_type);
+		if (!dom && domain_type != IOMMU_DOMAIN_DMA) {
 			dev_warn(dev,
 				 "failed to allocate default IOMMU domain of type %u; falling back to IOMMU_DOMAIN_DMA",
-				 iommu_def_domain_type);
+				 domain_type);
 			dom = __iommu_domain_alloc(dev->bus, IOMMU_DOMAIN_DMA);
 		}
 
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index ffbbc7e39cee..a5007d035218 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -186,6 +186,7 @@ struct iommu_resv_region {
  * @of_xlate: add OF master IDs to iommu grouping
  * @is_attach_deferred: Check if domain attach should be deferred from iommu
  *                      driver init to device driver init (default no)
+ * @is_identity_map: vendor customized identity map adjudicator
  * @pgsize_bitmap: bitmap of all possible supported page sizes
  */
 struct iommu_ops {
@@ -230,6 +231,9 @@ struct iommu_ops {
 	int (*of_xlate)(struct device *dev, struct of_phandle_args *args);
 	bool (*is_attach_deferred)(struct iommu_domain *domain, struct device *dev);
 
+	/* Vendor customized identity map adjudicator */
+	bool (*is_identity_map)(struct device *dev);
+
 	unsigned long pgsize_bitmap;
 };
 
-- 
2.17.1
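The selection logic this patch adds to iommu_group_get_for_dev can be sketched in userspace C: a vendor is_identity_map() hook, when present, overrides the global default type. Constants and helper names below are illustrative; the fallback-to-DMA path on allocation failure is left out for brevity:

```c
#include <assert.h>
#include <stddef.h>

enum { DOMAIN_DMA = 1, DOMAIN_IDENTITY = 2 };

/* Stand-ins for a driver's is_identity_map() implementations. */
static int always_identity(void) { return 1; }
static int never_identity(void)  { return 0; }

/* Invented helper mirroring the patch: the vendor hook, if provided,
 * decides between identity and DMA; otherwise the global default wins. */
static int pick_default_type(int global_default,
			     int (*is_identity_map)(void))
{
	if (is_identity_map)
		return is_identity_map() ? DOMAIN_IDENTITY : DOMAIN_DMA;
	return global_default;
}
```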


[-- Attachment #7: 0006-iommu-vt-d-Add-is_identity_map-ops-entry.patch --]
[-- Type: application/octet-stream, Size: 1268 bytes --]

From 55d091d75abc677438feef803fa88b1facdc7cb2 Mon Sep 17 00:00:00 2001
From: Lu Baolu <baolu.lu@linux.intel.com>
Date: Thu, 7 Mar 2019 12:53:12 +0800
Subject: [PATCH 6/9] iommu/vt-d: Add is_identity_map ops entry

This adds the is_identity_map ops entry for the Intel IOMMU driver.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 1db0ceb33020..e79df3cf548c 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5344,6 +5344,11 @@ static void intel_iommu_apply_resv_region(struct device *dev,
 	WARN_ON_ONCE(reserve_iova(&dmar_domain->iovad, start, end) == NULL);
 }
 
+static bool intel_iommu_is_identity_map(struct device *dev)
+{
+	return iommu_should_identity_map(dev, 1);
+}
+
 #ifdef CONFIG_INTEL_IOMMU_SVM
 int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct intel_svm_dev *sdev)
 {
@@ -5455,6 +5460,7 @@ const struct iommu_ops intel_iommu_ops = {
 	.put_resv_regions	= intel_iommu_put_resv_regions,
 	.apply_resv_region	= intel_iommu_apply_resv_region,
 	.device_group		= intel_iommu_pci_device_group,
+	.is_identity_map	= intel_iommu_is_identity_map,
 	.pgsize_bitmap		= INTEL_IOMMU_PGSIZES,
 };
 
-- 
2.17.1


[-- Attachment #8: 0007-iommu-vt-d-Enable-DMA-remapping-after-rmrr-mapped.patch --]
[-- Type: application/octet-stream, Size: 1889 bytes --]

From 06cabc10e1728142650e545b0038ffb512b45aff Mon Sep 17 00:00:00 2001
From: James Sewart <jamessewart@arista.com>
Date: Fri, 8 Mar 2019 16:12:34 +0000
Subject: [PATCH 7/9] iommu/vt-d: Enable DMA remapping after rmrr mapped

RMRR devices require an identity map of the RMRR regions to be in
place before DMA remapping is enabled; otherwise there is a window
during which DMA to and from the RMRR regions is blocked. To avoid
this, move enabling of DMA remapping to after all RMRR regions have
been mapped.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index e79df3cf548c..3a06e8804b6e 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3519,11 +3519,6 @@ static int __init init_dmars(void)
 		ret = dmar_set_interrupt(iommu);
 		if (ret)
 			goto free_iommu;
-
-		if (!translation_pre_enabled(iommu))
-			iommu_enable_translation(iommu);
-
-		iommu_disable_protect_mem_regions(iommu);
 	}
 
 	return 0;
@@ -4921,7 +4916,6 @@ int __init intel_iommu_init(void)
 		goto out_free_reserved_range;
 	}
 	up_write(&dmar_global_lock);
-	pr_info("Intel(R) Virtualization Technology for Directed I/O\n");
 
 #if defined(CONFIG_X86) && defined(CONFIG_SWIOTLB)
 	swiotlb = 0;
@@ -4944,6 +4938,16 @@ int __init intel_iommu_init(void)
 		register_memory_notifier(&intel_iommu_memory_nb);
 	cpuhp_setup_state(CPUHP_IOMMU_INTEL_DEAD, "iommu/intel:dead", NULL,
 			  intel_iommu_cpu_dead);
+
+	/* Finally, we enable the DMA remapping hardware. */
+	for_each_iommu(iommu, drhd) {
+		if (!translation_pre_enabled(iommu))
+			iommu_enable_translation(iommu);
+
+		iommu_disable_protect_mem_regions(iommu);
+	}
+	pr_info("Intel(R) Virtualization Technology for Directed I/O\n");
+
 	intel_iommu_enabled = 1;
 	intel_iommu_debugfs_init();
 
-- 
2.17.1


[-- Attachment #9: 0008-iommu-vt-d-Allow-IOMMU_DOMAIN_DMA-to-be-allocated-by.patch --]
[-- Type: application/octet-stream, Size: 6655 bytes --]

From a2b312c445ac6f683167d903d93b374441f1018a Mon Sep 17 00:00:00 2001
From: James Sewart <jamessewart@arista.com>
Date: Mon, 11 Mar 2019 14:53:47 +0000
Subject: [PATCH 8/9] iommu/vt-d: Allow IOMMU_DOMAIN_DMA to be allocated by
 iommu_ops

Allowing IOMMU domains of type IOMMU_DOMAIN_DMA to be allocated makes
it possible to set the default_domain of an iommu_group. This delegates
device-domain relationships to the generic IOMMU code.

Signed-off-by: James Sewart <jamessewart@arista.com>

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 3a06e8804b6e..2055c11f5978 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -311,6 +311,18 @@ static int hw_pass_through = 1;
 /* si_domain contains mulitple devices */
 #define DOMAIN_FLAG_STATIC_IDENTITY	(1 << 1)
 
+/*
+ * Domain managed externally; don't clean up if it isn't attached
+ * to any devices.
+ */
+#define DOMAIN_FLAG_MANAGED_EXTERNALLY	(1 << 2)
+
+/*
+ * Set after domain initialisation. Used when allocating dma domains to
+ * defer domain initialisation until it is attached to a device
+ */
+#define DOMAIN_FLAG_INITIALISED	(1 << 3)
+
 #define for_each_domain_iommu(idx, domain)			\
 	for (idx = 0; idx < g_num_of_iommus; idx++)		\
 		if (domain->iommu_refcnt[idx])
@@ -561,6 +573,16 @@ static inline int domain_type_is_vm_or_si(struct dmar_domain *domain)
 				DOMAIN_FLAG_STATIC_IDENTITY);
 }
 
+static inline int domain_managed_externally(struct dmar_domain *domain)
+{
+	return domain->flags & DOMAIN_FLAG_MANAGED_EXTERNALLY;
+}
+
+static inline int domain_is_initialised(struct dmar_domain *domain)
+{
+	return domain->flags & DOMAIN_FLAG_INITIALISED;
+}
+
 static inline int domain_pfn_supported(struct dmar_domain *domain,
 				       unsigned long pfn)
 {
@@ -1671,7 +1693,7 @@ static void disable_dmar_iommu(struct intel_iommu *iommu)
 
 		__dmar_remove_one_dev_info(info);
 
-		if (!domain_type_is_vm_or_si(domain)) {
+		if (!domain_managed_externally(domain)) {
 			/*
 			 * The domain_exit() function  can't be called under
 			 * device_domain_lock, as it takes this lock itself.
@@ -1904,6 +1926,7 @@ static int domain_init(struct dmar_domain *domain, struct intel_iommu *iommu,
 	domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid);
 	if (!domain->pgd)
 		return -ENOMEM;
+	domain->flags |= DOMAIN_FLAG_INITIALISED;
 	__iommu_flush_cache(iommu, domain->pgd, PAGE_SIZE);
 	return 0;
 }
@@ -1912,6 +1935,9 @@ static void domain_exit(struct dmar_domain *domain)
 {
 	struct page *freelist;
 
+	if (!domain_is_initialised(domain))
+		goto free_mem;
+
 	/* Remove associated devices and clear attached or cached domains */
 	rcu_read_lock();
 	domain_remove_dev_info(domain);
@@ -1924,6 +1950,7 @@ static void domain_exit(struct dmar_domain *domain)
 
 	dma_free_pagelist(freelist);
 
+free_mem:
 	free_domain_mem(domain);
 }
 
@@ -4589,7 +4616,7 @@ static int device_notifier(struct notifier_block *nb,
 			return 0;
 
 		dmar_remove_one_dev_info(dev);
-		if (!domain_type_is_vm_or_si(domain) &&
+		if (!domain_managed_externally(domain) &&
 		    list_empty(&domain->devices))
 			domain_exit(domain);
 	} else if (action == BUS_NOTIFY_ADD_DEVICE) {
@@ -5047,6 +5074,7 @@ static int md_domain_init(struct dmar_domain *domain, int guest_width)
 	domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid);
 	if (!domain->pgd)
 		return -ENOMEM;
+	domain->flags |= DOMAIN_FLAG_INITIALISED;
 	domain_flush_cache(domain, domain->pgd, PAGE_SIZE);
 	return 0;
 }
@@ -5055,28 +5083,43 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type)
 {
 	struct dmar_domain *dmar_domain;
 	struct iommu_domain *domain;
+	int flags = DOMAIN_FLAG_MANAGED_EXTERNALLY;
 
-	if (type != IOMMU_DOMAIN_UNMANAGED)
-		return NULL;
+	switch (type) {
+	case IOMMU_DOMAIN_UNMANAGED:
+		flags |= DOMAIN_FLAG_VIRTUAL_MACHINE | DOMAIN_FLAG_INITIALISED;
+		dmar_domain = alloc_domain(flags);
+		if (!dmar_domain)
+			return NULL;
 
-	dmar_domain = alloc_domain(DOMAIN_FLAG_VIRTUAL_MACHINE);
-	if (!dmar_domain) {
-		pr_err("Can't allocate dmar_domain\n");
-		return NULL;
-	}
-	if (md_domain_init(dmar_domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) {
-		pr_err("Domain initialization failed\n");
-		domain_exit(dmar_domain);
+		if (md_domain_init(dmar_domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) {
+			pr_err("Domain initialization failed\n");
+			domain_exit(dmar_domain);
+			return NULL;
+		}
+		domain_update_iommu_cap(dmar_domain);
+		domain = &dmar_domain->domain;
+		domain->geometry.aperture_start = 0;
+		domain->geometry.aperture_end =
+			__DOMAIN_MAX_ADDR(dmar_domain->gaw);
+		domain->geometry.force_aperture = true;
+		break;
+	case IOMMU_DOMAIN_DMA:
+		dmar_domain = alloc_domain(flags);
+		if (!dmar_domain)
+			return NULL;
+		/*
+		 * init domain in device attach when we know IOMMU
+		 * capabilities
+		 */
+		break;
+	case IOMMU_DOMAIN_IDENTITY:
+		return &si_domain->domain;
+	default:
 		return NULL;
 	}
-	domain_update_iommu_cap(dmar_domain);
-
-	domain = &dmar_domain->domain;
-	domain->geometry.aperture_start = 0;
-	domain->geometry.aperture_end   = __DOMAIN_MAX_ADDR(dmar_domain->gaw);
-	domain->geometry.force_aperture = true;
 
-	return domain;
+	return &dmar_domain->domain;
 }
 
 static void intel_iommu_domain_free(struct iommu_domain *domain)
@@ -5107,7 +5150,7 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
 			dmar_remove_one_dev_info(dev);
 			rcu_read_unlock();
 
-			if (!domain_type_is_vm_or_si(old_domain) &&
+			if (!domain_managed_externally(old_domain) &&
 			     list_empty(&old_domain->devices))
 				domain_exit(old_domain);
 		}
@@ -5117,6 +5160,16 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
 	if (!iommu)
 		return -ENODEV;
 
+	/*
+	 * Initialise domain with IOMMU capabilities if it isn't already
+	 * initialised
+	 */
+	if (!domain_is_initialised(dmar_domain)) {
+		if (domain_init(dmar_domain, iommu,
+				DEFAULT_DOMAIN_ADDRESS_WIDTH))
+			return -ENOMEM;
+	}
+
 	/* check if this iommu agaw is sufficient for max mapped address */
 	addr_width = agaw_to_width(iommu->agaw);
 	if (addr_width > cap_mgaw(iommu->cap))
@@ -5163,6 +5216,10 @@ static int intel_iommu_map(struct iommu_domain *domain,
 	int prot = 0;
 	int ret;
 
+	/* Don't bother if hardware passthrough used. */
+	if (dmar_domain == si_domain && hw_pass_through)
+		return 0;
+
 	if (iommu_prot & IOMMU_READ)
 		prot |= DMA_PTE_READ;
 	if (iommu_prot & IOMMU_WRITE)
-- 
2.17.1
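The flag scheme in this patch — DMA domains allocated uninitialised, gaining DOMAIN_FLAG_INITIALISED only on first attach when the IOMMU's capabilities are known — can be modelled in userspace. Flag values mirror the patch; the struct and helpers are illustrative stand-ins, not the kernel's:

```c
#include <assert.h>

#define DOMAIN_FLAG_MANAGED_EXTERNALLY	(1 << 2)
#define DOMAIN_FLAG_INITIALISED		(1 << 3)

struct dmar_domain_sketch {
	int flags;
};

static int domain_is_initialised(struct dmar_domain_sketch *d)
{
	return d->flags & DOMAIN_FLAG_INITIALISED;
}

/* Stand-in for intel_iommu_attach_device: deferred domain_init()
 * runs here, once, and is recorded in the flags. */
static void attach(struct dmar_domain_sketch *d)
{
	if (!domain_is_initialised(d))
		d->flags |= DOMAIN_FLAG_INITIALISED;
}
```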


[-- Attachment #10: 0009-iommu-vt-d-Remove-lazy-allocation-of-domains.patch --]
[-- Type: application/octet-stream, Size: 14536 bytes --]

From 44a879dae8c246cd52fa754e99a1d43bc019c757 Mon Sep 17 00:00:00 2001
From: James Sewart <jamessewart@arista.com>
Date: Mon, 11 Mar 2019 14:55:01 +0000
Subject: [PATCH 9/9] iommu/vt-d: Remove lazy allocation of domains

The generic IOMMU code will allocate and attach a default domain to each
device that comes online. This patch removes the lazy domain allocation
and early reserved region mapping that is now redundant.

Signed-off-by: James Sewart <jamessewart@arista.com>

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 2055c11f5978..4bf719a82b64 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -2463,21 +2463,6 @@ static void domain_remove_dev_info(struct dmar_domain *domain)
 	spin_unlock_irqrestore(&device_domain_lock, flags);
 }
 
-/*
- * find_domain
- * Note: we use struct device->archdata.iommu stores the info
- */
-static struct dmar_domain *find_domain(struct device *dev)
-{
-	struct device_domain_info *info;
-
-	/* No lock here, assumes no domain exit in normal case */
-	info = dev->archdata.iommu;
-	if (likely(info))
-		return info->domain;
-	return NULL;
-}
-
 static inline struct device_domain_info *
 dmar_search_domain_by_dev_info(int segment, int bus, int devfn)
 {
@@ -2539,8 +2524,11 @@ static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu,
 	}
 
 	spin_lock_irqsave(&device_domain_lock, flags);
-	if (dev)
-		found = find_domain(dev);
+	if (dev) {
+		struct iommu_domain *io_domain = iommu_get_domain_for_dev(dev);
+		if (io_domain)
+			found = to_dmar_domain(io_domain);
+	}
 
 	if (!found) {
 		struct device_domain_info *info2;
@@ -2608,118 +2596,6 @@ static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu,
 	return domain;
 }
 
-static int get_last_alias(struct pci_dev *pdev, u16 alias, void *opaque)
-{
-	*(u16 *)opaque = alias;
-	return 0;
-}
-
-static struct dmar_domain *find_or_alloc_domain(struct device *dev, int gaw)
-{
-	struct device_domain_info *info;
-	struct dmar_domain *domain = NULL;
-	struct intel_iommu *iommu;
-	u16 dma_alias;
-	unsigned long flags;
-	u8 bus, devfn;
-
-	iommu = device_to_iommu(dev, &bus, &devfn);
-	if (!iommu)
-		return NULL;
-
-	if (dev_is_pci(dev)) {
-		struct pci_dev *pdev = to_pci_dev(dev);
-
-		pci_for_each_dma_alias(pdev, get_last_alias, &dma_alias);
-
-		spin_lock_irqsave(&device_domain_lock, flags);
-		info = dmar_search_domain_by_dev_info(pci_domain_nr(pdev->bus),
-						      PCI_BUS_NUM(dma_alias),
-						      dma_alias & 0xff);
-		if (info) {
-			iommu = info->iommu;
-			domain = info->domain;
-		}
-		spin_unlock_irqrestore(&device_domain_lock, flags);
-
-		/* DMA alias already has a domain, use it */
-		if (info)
-			goto out;
-	}
-
-	/* Allocate and initialize new domain for the device */
-	domain = alloc_domain(0);
-	if (!domain)
-		return NULL;
-	if (domain_init(domain, iommu, gaw)) {
-		domain_exit(domain);
-		return NULL;
-	}
-
-out:
-
-	return domain;
-}
-
-static struct dmar_domain *set_domain_for_dev(struct device *dev,
-					      struct dmar_domain *domain)
-{
-	struct intel_iommu *iommu;
-	struct dmar_domain *tmp;
-	u16 req_id, dma_alias;
-	u8 bus, devfn;
-
-	iommu = device_to_iommu(dev, &bus, &devfn);
-	if (!iommu)
-		return NULL;
-
-	req_id = ((u16)bus << 8) | devfn;
-
-	if (dev_is_pci(dev)) {
-		struct pci_dev *pdev = to_pci_dev(dev);
-
-		pci_for_each_dma_alias(pdev, get_last_alias, &dma_alias);
-
-		/* register PCI DMA alias device */
-		if (req_id != dma_alias) {
-			tmp = dmar_insert_one_dev_info(iommu, PCI_BUS_NUM(dma_alias),
-					dma_alias & 0xff, NULL, domain);
-
-			if (!tmp || tmp != domain)
-				return tmp;
-		}
-	}
-
-	tmp = dmar_insert_one_dev_info(iommu, bus, devfn, dev, domain);
-	if (!tmp || tmp != domain)
-		return tmp;
-
-	return domain;
-}
-
-static struct dmar_domain *get_domain_for_dev(struct device *dev, int gaw)
-{
-	struct dmar_domain *domain, *tmp;
-
-	domain = find_domain(dev);
-	if (domain)
-		goto out;
-
-	domain = find_or_alloc_domain(dev, gaw);
-	if (!domain)
-		goto out;
-
-	tmp = set_domain_for_dev(dev, domain);
-	if (!tmp || domain != tmp) {
-		domain_exit(domain);
-		domain = tmp;
-	}
-
-out:
-
-	return domain;
-}
-
 static int iommu_domain_identity_map(struct dmar_domain *domain,
 				     unsigned long long start,
 				     unsigned long long end)
@@ -2745,72 +2621,6 @@ static int iommu_domain_identity_map(struct dmar_domain *domain,
 				DMA_PTE_READ|DMA_PTE_WRITE);
 }
 
-static int domain_prepare_identity_map(struct device *dev,
-				       struct dmar_domain *domain,
-				       unsigned long long start,
-				       unsigned long long end)
-{
-	/* For _hardware_ passthrough, don't bother. But for software
-	   passthrough, we do it anyway -- it may indicate a memory
-	   range which is reserved in E820, so which didn't get set
-	   up to start with in si_domain */
-	if (domain == si_domain && hw_pass_through) {
-		dev_warn(dev, "Ignoring identity map for HW passthrough [0x%Lx - 0x%Lx]\n",
-			 start, end);
-		return 0;
-	}
-
-	dev_info(dev, "Setting identity map [0x%Lx - 0x%Lx]\n", start, end);
-
-	if (end < start) {
-		WARN(1, "Your BIOS is broken; RMRR ends before it starts!\n"
-			"BIOS vendor: %s; Ver: %s; Product Version: %s\n",
-			dmi_get_system_info(DMI_BIOS_VENDOR),
-			dmi_get_system_info(DMI_BIOS_VERSION),
-		     dmi_get_system_info(DMI_PRODUCT_VERSION));
-		return -EIO;
-	}
-
-	if (end >> agaw_to_width(domain->agaw)) {
-		WARN(1, "Your BIOS is broken; RMRR exceeds permitted address width (%d bits)\n"
-		     "BIOS vendor: %s; Ver: %s; Product Version: %s\n",
-		     agaw_to_width(domain->agaw),
-		     dmi_get_system_info(DMI_BIOS_VENDOR),
-		     dmi_get_system_info(DMI_BIOS_VERSION),
-		     dmi_get_system_info(DMI_PRODUCT_VERSION));
-		return -EIO;
-	}
-
-	return iommu_domain_identity_map(domain, start, end);
-}
-
-static int iommu_prepare_identity_map(struct device *dev,
-				      unsigned long long start,
-				      unsigned long long end)
-{
-	struct dmar_domain *domain;
-	int ret;
-
-	domain = get_domain_for_dev(dev, DEFAULT_DOMAIN_ADDRESS_WIDTH);
-	if (!domain)
-		return -ENOMEM;
-
-	ret = domain_prepare_identity_map(dev, domain, start, end);
-	if (ret)
-		domain_exit(domain);
-
-	return ret;
-}
-
-static inline int iommu_prepare_rmrr_dev(struct dmar_rmrr_unit *rmrr,
-					 struct device *dev)
-{
-	if (dev->archdata.iommu == DUMMY_DEVICE_DOMAIN_INFO)
-		return 0;
-	return iommu_prepare_identity_map(dev, rmrr->base_address,
-					  rmrr->end_address);
-}
-
 static inline struct iommu_resv_region *iommu_get_isa_resv_region(void)
 {
 	if (!isa_resv_region)
@@ -2821,29 +2631,6 @@ static inline struct iommu_resv_region *iommu_get_isa_resv_region(void)
 	return isa_resv_region;
 }
 
-#ifdef CONFIG_INTEL_IOMMU_FLOPPY_WA
-static inline void iommu_prepare_isa(struct pci_dev *pdev)
-{
-	int ret;
-	struct iommu_resv_region *reg = iommu_get_isa_resv_region();
-
-	if (!reg)
-		return;
-
-	pr_info("Prepare 0-16MiB unity mapping for LPC\n");
-	ret = iommu_prepare_identity_map(&pdev->dev, reg->start,
-			reg->start + reg->length - 1);
-
-	if (ret)
-		pr_err("Failed to create 0-16MiB identity map - floppy might not work\n");
-}
-#else
-static inline void iommu_prepare_isa(struct pci_dev *pdev)
-{
-	return;
-}
-#endif /* !CONFIG_INTEL_IOMMU_FLPY_WA */
-
 static int md_domain_init(struct dmar_domain *domain, int guest_width);
 
 static int __init si_domain_init(int hw)
@@ -3066,18 +2853,10 @@ static int __init dev_prepare_static_identity_mapping(struct device *dev, int hw
 
 static int __init iommu_prepare_static_identity_mapping(int hw)
 {
-	struct pci_dev *pdev = NULL;
 	struct dmar_drhd_unit *drhd;
 	struct intel_iommu *iommu;
 	struct device *dev;
-	int i;
-	int ret = 0;
-
-	for_each_pci_dev(pdev) {
-		ret = dev_prepare_static_identity_mapping(&pdev->dev, hw);
-		if (ret)
-			return ret;
-	}
+	int i, ret = 0;
 
 	for_each_active_iommu(iommu, drhd)
 		for_each_active_dev_scope(drhd->devices, drhd->devices_cnt, i, dev) {
@@ -3324,12 +3103,9 @@ static int copy_translation_tables(struct intel_iommu *iommu)
 static int __init init_dmars(void)
 {
 	struct dmar_drhd_unit *drhd;
-	struct dmar_rmrr_unit *rmrr;
 	bool copied_tables = false;
-	struct device *dev;
-	struct pci_dev *pdev;
 	struct intel_iommu *iommu;
-	int i, ret;
+	int ret;
 
 	/*
 	 * for each drhd
@@ -3483,36 +3259,6 @@ static int __init init_dmars(void)
 			goto free_iommu;
 		}
 	}
-	/*
-	 * For each rmrr
-	 *   for each dev attached to rmrr
-	 *   do
-	 *     locate drhd for dev, alloc domain for dev
-	 *     allocate free domain
-	 *     allocate page table entries for rmrr
-	 *     if context not allocated for bus
-	 *           allocate and init context
-	 *           set present in root table for this bus
-	 *     init context with domain, translation etc
-	 *    endfor
-	 * endfor
-	 */
-	pr_info("Setting RMRR:\n");
-	for_each_rmrr_units(rmrr) {
-		/* some BIOS lists non-exist devices in DMAR table. */
-		for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
-					  i, dev) {
-			ret = iommu_prepare_rmrr_dev(rmrr, dev);
-			if (ret)
-				pr_err("Mapping reserved region failed\n");
-		}
-	}
-
-	pdev = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, NULL);
-	if (pdev) {
-		iommu_prepare_isa(pdev);
-		pci_dev_put(pdev);
-	}
 
 domains_done:
 
@@ -3595,53 +3341,6 @@ static unsigned long intel_alloc_iova(struct device *dev,
 	return iova_pfn;
 }
 
-struct dmar_domain *get_valid_domain_for_dev(struct device *dev)
-{
-	struct dmar_domain *domain, *tmp;
-	struct dmar_rmrr_unit *rmrr;
-	struct device *i_dev;
-	int i, ret;
-
-	domain = find_domain(dev);
-	if (domain)
-		goto out;
-
-	domain = find_or_alloc_domain(dev, DEFAULT_DOMAIN_ADDRESS_WIDTH);
-	if (!domain)
-		goto out;
-
-	/* We have a new domain - setup possible RMRRs for the device */
-	rcu_read_lock();
-	for_each_rmrr_units(rmrr) {
-		for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
-					  i, i_dev) {
-			if (i_dev != dev)
-				continue;
-
-			ret = domain_prepare_identity_map(dev, domain,
-							  rmrr->base_address,
-							  rmrr->end_address);
-			if (ret)
-				dev_err(dev, "Mapping reserved region failed\n");
-		}
-	}
-	rcu_read_unlock();
-
-	tmp = set_domain_for_dev(dev, domain);
-	if (!tmp || domain != tmp) {
-		domain_exit(domain);
-		domain = tmp;
-	}
-
-out:
-
-	if (!domain)
-		dev_err(dev, "Allocating domain failed\n");
-
-
-	return domain;
-}
-
 /* Check if the dev needs to go through non-identity map and unmap process.*/
 static int iommu_no_mapping(struct device *dev)
 {
@@ -3688,6 +3387,7 @@ static dma_addr_t __intel_map_single(struct device *dev, phys_addr_t paddr,
 				     size_t size, int dir, u64 dma_mask)
 {
 	struct dmar_domain *domain;
+	struct iommu_domain *io_domain;
 	phys_addr_t start_paddr;
 	unsigned long iova_pfn;
 	int prot = 0;
@@ -3700,9 +3400,10 @@ static dma_addr_t __intel_map_single(struct device *dev, phys_addr_t paddr,
 	if (iommu_no_mapping(dev))
 		return paddr;
 
-	domain = get_valid_domain_for_dev(dev);
-	if (!domain)
+	io_domain = iommu_get_domain_for_dev(dev);
+	if (!io_domain)
 		return DMA_MAPPING_ERROR;
+	domain = to_dmar_domain(io_domain);
 
 	iommu = domain_get_iommu(domain);
 	size = aligned_nrpages(paddr, size);
@@ -3762,6 +3463,7 @@ static dma_addr_t intel_map_resource(struct device *dev, phys_addr_t phys_addr,
 static void intel_unmap(struct device *dev, dma_addr_t dev_addr, size_t size)
 {
 	struct dmar_domain *domain;
+	struct iommu_domain *io_domain;
 	unsigned long start_pfn, last_pfn;
 	unsigned long nrpages;
 	unsigned long iova_pfn;
@@ -3771,8 +3473,9 @@ static void intel_unmap(struct device *dev, dma_addr_t dev_addr, size_t size)
 	if (iommu_no_mapping(dev))
 		return;
 
-	domain = find_domain(dev);
-	BUG_ON(!domain);
+	io_domain = iommu_get_domain_for_dev(dev);
+	BUG_ON(!io_domain);
+	domain = to_dmar_domain(io_domain);
 
 	iommu = domain_get_iommu(domain);
 
@@ -3906,6 +3609,7 @@ static int intel_map_sg(struct device *dev, struct scatterlist *sglist, int nele
 {
 	int i;
 	struct dmar_domain *domain;
+	struct iommu_domain *io_domain;
 	size_t size = 0;
 	int prot = 0;
 	unsigned long iova_pfn;
@@ -3918,9 +3622,10 @@ static int intel_map_sg(struct device *dev, struct scatterlist *sglist, int nele
 	if (iommu_no_mapping(dev))
 		return intel_nontranslate_map_sg(dev, sglist, nelems, dir);
 
-	domain = get_valid_domain_for_dev(dev);
-	if (!domain)
+	io_domain = iommu_get_domain_for_dev(dev);
+	if (!io_domain)
 		return 0;
+	domain = to_dmar_domain(io_domain);
 
 	iommu = domain_get_iommu(domain);
 
@@ -4611,9 +4316,11 @@ static int device_notifier(struct notifier_block *nb,
 		return 0;
 
 	if (action == BUS_NOTIFY_REMOVED_DEVICE) {
-		domain = find_domain(dev);
-		if (!domain)
+		struct iommu_domain *io_domain;
+		io_domain = iommu_get_domain_for_dev(dev);
+		if (!io_domain)
 			return 0;
+		domain = to_dmar_domain(io_domain);
 
 		dmar_remove_one_dev_info(dev);
 		if (!domain_managed_externally(domain) &&
@@ -5143,9 +4850,11 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
 	/* normally dev is not mapped */
 	if (unlikely(domain_context_mapped(dev))) {
 		struct dmar_domain *old_domain;
+		struct iommu_domain *old_io_domain;
 
-		old_domain = find_domain(dev);
-		if (old_domain) {
+		old_io_domain = iommu_get_domain_for_dev(dev);
+		if (old_io_domain) {
+			old_domain = to_dmar_domain(old_io_domain);
 			rcu_read_lock();
 			dmar_remove_one_dev_info(dev);
 			rcu_read_unlock();
@@ -5416,13 +5125,15 @@ int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct intel_svm_dev *sd
 	struct device_domain_info *info;
 	struct context_entry *context;
 	struct dmar_domain *domain;
+	struct iommu_domain *io_domain;
 	unsigned long flags;
 	u64 ctx_lo;
 	int ret;
 
-	domain = get_valid_domain_for_dev(sdev->dev);
-	if (!domain)
+	io_domain = iommu_get_domain_for_dev(sdev->dev);
+	if (!io_domain)
 		return -EINVAL;
+	domain = to_dmar_domain(io_domain);
 
 	spin_lock_irqsave(&device_domain_lock, flags);
 	spin_lock(&iommu->lock);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 55+ messages in thread

* Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions
  2019-03-29 15:26                 ` James Sewart
@ 2019-04-04  6:49                   ` Lu Baolu
  2019-04-05 18:02                     ` James Sewart
  0 siblings, 1 reply; 55+ messages in thread
From: Lu Baolu @ 2019-04-04  6:49 UTC (permalink / raw)
  To: James Sewart
  Cc: baolu.lu, iommu, Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel

Hi James,

I did a sanity test from my end. The test machine fails to boot. I
haven't seen any valuable kernel log. It results in a purple screen.

Best regards,
Lu Baolu

On 3/29/19 11:26 PM, James Sewart wrote:
> Hey Lu,
> 
> I’ve attached a preliminary v3, if you could take a look and run some tests
> that would be great.
> 
> Since v2 i’ve added your default domain type patches, the new device_group
> handler, and incorporated Jacob’s feedback.
> 
>> On 28 Mar 2019, at 18:37, James Sewart <jamessewart@arista.com> wrote:
>>
>> Hey Lu,
>>
>>> On 26 Mar 2019, at 01:24, Lu Baolu <baolu.lu@linux.intel.com> wrote:
>>>
>>> Hi James,
>>>
>>> On 3/25/19 8:57 PM, James Sewart wrote:
>>>>>> Theres an issue that if we choose to alloc a new resv_region with type
>>>>>> IOMMU_RESV_DIRECT, we will need to refactor intel_iommu_put_resv_regions
>>>>>> to free this entry type which means refactoring the rmrr regions in
>>>>>> get_resv_regions. Should this work be in this patchset?
>>>>> Do you mean the rmrr regions are not allocated in get_resv_regions, but
>>>>> are freed in put_resv_regions? I think we should fix this in this patch
>>>>> set since this might impact the device passthrough if we don't do it.
>>>> They’re not allocated and not freed currently, only type IOMMU_RESV_MSI is
>>>> freed in put_resv_regions. If we allocate a new resv_region with type
>>>> IOMMU_RESV_DIRECT for the isa region, then it won’t be freed. If we modify
>>>> put_resv_regions to free type IOMMU_RESV_DIRECT, then we will try to free
>>>> the static RMRR regions.
>>>> Either the ISA region is static and not freed as with my implementation,
>>>> or the RMRR regions are converted to be allocated on each call to
>>>> get_resv_regions and freed in put_resv_regions.
>>>
>>> By the way, there's another way in my mind. Let's add a new region type
>>> for LPC devices, e.x. IOMMU_RESV_LPC, and then handle it in the same way
>>> as those MSI regions. Just FYI.
>>
>> This solution would require adding some extra code to
>> iommu_group_create_direct_mappings as currently only type
>> IOMMU_RESV_DIRECT is identity mapped, other types are only reserved.
>>
>>
>>>
>>> Best regards,
>>> Lu Baolu
>>
>> Cheers,
>> James.
> 
> Cheers,
> James.
> 

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions
  2019-04-04  6:49                   ` Lu Baolu
@ 2019-04-05 18:02                     ` James Sewart
  2019-04-08  2:43                       ` Lu Baolu
  2019-04-10  5:22                       ` Lu Baolu
  0 siblings, 2 replies; 55+ messages in thread
From: James Sewart @ 2019-04-05 18:02 UTC (permalink / raw)
  To: Lu Baolu; +Cc: iommu, Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 389 bytes --]

Hey Lu,

My bad, did some debugging on my end. The issue was swapping out 
find_domain for iommu_get_domain_for_dev: it seems in some situations 
the domain is not attached to the group, but the device is still 
expected to have the domain stored in its archdata.

I’ve attached the final patch with the removal of find_domain reverted, 
which seems to work in my testing.

Cheers,
James.


[-- Attachment #2: 0009-iommu-vt-d-Remove-lazy-allocation-of-domains.patch --]
[-- Type: application/octet-stream, Size: 10754 bytes --]

From 7933b0778a217e15faee86f39b3628aa9c8407fc Mon Sep 17 00:00:00 2001
From: James Sewart <jamessewart@arista.com>
Date: Mon, 11 Mar 2019 14:55:01 +0000
Subject: [PATCH 9/9] iommu/vt-d: Remove lazy allocation of domains

The generic IOMMU code will allocate and attach a default domain to each
device that comes online. This patch removes the lazy domain allocation
and early reserved region mapping that is now redundant.

Signed-off-by: James Sewart <jamessewart@arista.com>

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 2055c11f5978..9e330e4a3ae3 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -2608,118 +2608,6 @@ static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu,
 	return domain;
 }
 
-static int get_last_alias(struct pci_dev *pdev, u16 alias, void *opaque)
-{
-	*(u16 *)opaque = alias;
-	return 0;
-}
-
-static struct dmar_domain *find_or_alloc_domain(struct device *dev, int gaw)
-{
-	struct device_domain_info *info;
-	struct dmar_domain *domain = NULL;
-	struct intel_iommu *iommu;
-	u16 dma_alias;
-	unsigned long flags;
-	u8 bus, devfn;
-
-	iommu = device_to_iommu(dev, &bus, &devfn);
-	if (!iommu)
-		return NULL;
-
-	if (dev_is_pci(dev)) {
-		struct pci_dev *pdev = to_pci_dev(dev);
-
-		pci_for_each_dma_alias(pdev, get_last_alias, &dma_alias);
-
-		spin_lock_irqsave(&device_domain_lock, flags);
-		info = dmar_search_domain_by_dev_info(pci_domain_nr(pdev->bus),
-						      PCI_BUS_NUM(dma_alias),
-						      dma_alias & 0xff);
-		if (info) {
-			iommu = info->iommu;
-			domain = info->domain;
-		}
-		spin_unlock_irqrestore(&device_domain_lock, flags);
-
-		/* DMA alias already has a domain, use it */
-		if (info)
-			goto out;
-	}
-
-	/* Allocate and initialize new domain for the device */
-	domain = alloc_domain(0);
-	if (!domain)
-		return NULL;
-	if (domain_init(domain, iommu, gaw)) {
-		domain_exit(domain);
-		return NULL;
-	}
-
-out:
-
-	return domain;
-}
-
-static struct dmar_domain *set_domain_for_dev(struct device *dev,
-					      struct dmar_domain *domain)
-{
-	struct intel_iommu *iommu;
-	struct dmar_domain *tmp;
-	u16 req_id, dma_alias;
-	u8 bus, devfn;
-
-	iommu = device_to_iommu(dev, &bus, &devfn);
-	if (!iommu)
-		return NULL;
-
-	req_id = ((u16)bus << 8) | devfn;
-
-	if (dev_is_pci(dev)) {
-		struct pci_dev *pdev = to_pci_dev(dev);
-
-		pci_for_each_dma_alias(pdev, get_last_alias, &dma_alias);
-
-		/* register PCI DMA alias device */
-		if (req_id != dma_alias) {
-			tmp = dmar_insert_one_dev_info(iommu, PCI_BUS_NUM(dma_alias),
-					dma_alias & 0xff, NULL, domain);
-
-			if (!tmp || tmp != domain)
-				return tmp;
-		}
-	}
-
-	tmp = dmar_insert_one_dev_info(iommu, bus, devfn, dev, domain);
-	if (!tmp || tmp != domain)
-		return tmp;
-
-	return domain;
-}
-
-static struct dmar_domain *get_domain_for_dev(struct device *dev, int gaw)
-{
-	struct dmar_domain *domain, *tmp;
-
-	domain = find_domain(dev);
-	if (domain)
-		goto out;
-
-	domain = find_or_alloc_domain(dev, gaw);
-	if (!domain)
-		goto out;
-
-	tmp = set_domain_for_dev(dev, domain);
-	if (!tmp || domain != tmp) {
-		domain_exit(domain);
-		domain = tmp;
-	}
-
-out:
-
-	return domain;
-}
-
 static int iommu_domain_identity_map(struct dmar_domain *domain,
 				     unsigned long long start,
 				     unsigned long long end)
@@ -2745,72 +2633,6 @@ static int iommu_domain_identity_map(struct dmar_domain *domain,
 				DMA_PTE_READ|DMA_PTE_WRITE);
 }
 
-static int domain_prepare_identity_map(struct device *dev,
-				       struct dmar_domain *domain,
-				       unsigned long long start,
-				       unsigned long long end)
-{
-	/* For _hardware_ passthrough, don't bother. But for software
-	   passthrough, we do it anyway -- it may indicate a memory
-	   range which is reserved in E820, so which didn't get set
-	   up to start with in si_domain */
-	if (domain == si_domain && hw_pass_through) {
-		dev_warn(dev, "Ignoring identity map for HW passthrough [0x%Lx - 0x%Lx]\n",
-			 start, end);
-		return 0;
-	}
-
-	dev_info(dev, "Setting identity map [0x%Lx - 0x%Lx]\n", start, end);
-
-	if (end < start) {
-		WARN(1, "Your BIOS is broken; RMRR ends before it starts!\n"
-			"BIOS vendor: %s; Ver: %s; Product Version: %s\n",
-			dmi_get_system_info(DMI_BIOS_VENDOR),
-			dmi_get_system_info(DMI_BIOS_VERSION),
-		     dmi_get_system_info(DMI_PRODUCT_VERSION));
-		return -EIO;
-	}
-
-	if (end >> agaw_to_width(domain->agaw)) {
-		WARN(1, "Your BIOS is broken; RMRR exceeds permitted address width (%d bits)\n"
-		     "BIOS vendor: %s; Ver: %s; Product Version: %s\n",
-		     agaw_to_width(domain->agaw),
-		     dmi_get_system_info(DMI_BIOS_VENDOR),
-		     dmi_get_system_info(DMI_BIOS_VERSION),
-		     dmi_get_system_info(DMI_PRODUCT_VERSION));
-		return -EIO;
-	}
-
-	return iommu_domain_identity_map(domain, start, end);
-}
-
-static int iommu_prepare_identity_map(struct device *dev,
-				      unsigned long long start,
-				      unsigned long long end)
-{
-	struct dmar_domain *domain;
-	int ret;
-
-	domain = get_domain_for_dev(dev, DEFAULT_DOMAIN_ADDRESS_WIDTH);
-	if (!domain)
-		return -ENOMEM;
-
-	ret = domain_prepare_identity_map(dev, domain, start, end);
-	if (ret)
-		domain_exit(domain);
-
-	return ret;
-}
-
-static inline int iommu_prepare_rmrr_dev(struct dmar_rmrr_unit *rmrr,
-					 struct device *dev)
-{
-	if (dev->archdata.iommu == DUMMY_DEVICE_DOMAIN_INFO)
-		return 0;
-	return iommu_prepare_identity_map(dev, rmrr->base_address,
-					  rmrr->end_address);
-}
-
 static inline struct iommu_resv_region *iommu_get_isa_resv_region(void)
 {
 	if (!isa_resv_region)
@@ -2821,29 +2643,6 @@ static inline struct iommu_resv_region *iommu_get_isa_resv_region(void)
 	return isa_resv_region;
 }
 
-#ifdef CONFIG_INTEL_IOMMU_FLOPPY_WA
-static inline void iommu_prepare_isa(struct pci_dev *pdev)
-{
-	int ret;
-	struct iommu_resv_region *reg = iommu_get_isa_resv_region();
-
-	if (!reg)
-		return;
-
-	pr_info("Prepare 0-16MiB unity mapping for LPC\n");
-	ret = iommu_prepare_identity_map(&pdev->dev, reg->start,
-			reg->start + reg->length - 1);
-
-	if (ret)
-		pr_err("Failed to create 0-16MiB identity map - floppy might not work\n");
-}
-#else
-static inline void iommu_prepare_isa(struct pci_dev *pdev)
-{
-	return;
-}
-#endif /* !CONFIG_INTEL_IOMMU_FLPY_WA */
-
 static int md_domain_init(struct dmar_domain *domain, int guest_width);
 
 static int __init si_domain_init(int hw)
@@ -3066,18 +2865,10 @@ static int __init dev_prepare_static_identity_mapping(struct device *dev, int hw
 
 static int __init iommu_prepare_static_identity_mapping(int hw)
 {
-	struct pci_dev *pdev = NULL;
 	struct dmar_drhd_unit *drhd;
 	struct intel_iommu *iommu;
 	struct device *dev;
-	int i;
-	int ret = 0;
-
-	for_each_pci_dev(pdev) {
-		ret = dev_prepare_static_identity_mapping(&pdev->dev, hw);
-		if (ret)
-			return ret;
-	}
+	int i, ret = 0;
 
 	for_each_active_iommu(iommu, drhd)
 		for_each_active_dev_scope(drhd->devices, drhd->devices_cnt, i, dev) {
@@ -3324,12 +3115,9 @@ static int copy_translation_tables(struct intel_iommu *iommu)
 static int __init init_dmars(void)
 {
 	struct dmar_drhd_unit *drhd;
-	struct dmar_rmrr_unit *rmrr;
 	bool copied_tables = false;
-	struct device *dev;
-	struct pci_dev *pdev;
 	struct intel_iommu *iommu;
-	int i, ret;
+	int ret;
 
 	/*
 	 * for each drhd
@@ -3483,36 +3271,6 @@ static int __init init_dmars(void)
 			goto free_iommu;
 		}
 	}
-	/*
-	 * For each rmrr
-	 *   for each dev attached to rmrr
-	 *   do
-	 *     locate drhd for dev, alloc domain for dev
-	 *     allocate free domain
-	 *     allocate page table entries for rmrr
-	 *     if context not allocated for bus
-	 *           allocate and init context
-	 *           set present in root table for this bus
-	 *     init context with domain, translation etc
-	 *    endfor
-	 * endfor
-	 */
-	pr_info("Setting RMRR:\n");
-	for_each_rmrr_units(rmrr) {
-		/* some BIOS lists non-exist devices in DMAR table. */
-		for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
-					  i, dev) {
-			ret = iommu_prepare_rmrr_dev(rmrr, dev);
-			if (ret)
-				pr_err("Mapping reserved region failed\n");
-		}
-	}
-
-	pdev = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, NULL);
-	if (pdev) {
-		iommu_prepare_isa(pdev);
-		pci_dev_put(pdev);
-	}
 
 domains_done:
 
@@ -3595,53 +3353,6 @@ static unsigned long intel_alloc_iova(struct device *dev,
 	return iova_pfn;
 }
 
-struct dmar_domain *get_valid_domain_for_dev(struct device *dev)
-{
-	struct dmar_domain *domain, *tmp;
-	struct dmar_rmrr_unit *rmrr;
-	struct device *i_dev;
-	int i, ret;
-
-	domain = find_domain(dev);
-	if (domain)
-		goto out;
-
-	domain = find_or_alloc_domain(dev, DEFAULT_DOMAIN_ADDRESS_WIDTH);
-	if (!domain)
-		goto out;
-
-	/* We have a new domain - setup possible RMRRs for the device */
-	rcu_read_lock();
-	for_each_rmrr_units(rmrr) {
-		for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
-					  i, i_dev) {
-			if (i_dev != dev)
-				continue;
-
-			ret = domain_prepare_identity_map(dev, domain,
-							  rmrr->base_address,
-							  rmrr->end_address);
-			if (ret)
-				dev_err(dev, "Mapping reserved region failed\n");
-		}
-	}
-	rcu_read_unlock();
-
-	tmp = set_domain_for_dev(dev, domain);
-	if (!tmp || domain != tmp) {
-		domain_exit(domain);
-		domain = tmp;
-	}
-
-out:
-
-	if (!domain)
-		dev_err(dev, "Allocating domain failed\n");
-
-
-	return domain;
-}
-
 /* Check if the dev needs to go through non-identity map and unmap process.*/
 static int iommu_no_mapping(struct device *dev)
 {
@@ -3700,7 +3411,7 @@ static dma_addr_t __intel_map_single(struct device *dev, phys_addr_t paddr,
 	if (iommu_no_mapping(dev))
 		return paddr;
 
-	domain = get_valid_domain_for_dev(dev);
+	domain = find_domain(dev);
 	if (!domain)
 		return DMA_MAPPING_ERROR;
 
@@ -3918,7 +3629,7 @@ static int intel_map_sg(struct device *dev, struct scatterlist *sglist, int nele
 	if (iommu_no_mapping(dev))
 		return intel_nontranslate_map_sg(dev, sglist, nelems, dir);
 
-	domain = get_valid_domain_for_dev(dev);
+	domain = find_domain(dev);
 	if (!domain)
 		return 0;
 
@@ -5420,7 +5131,7 @@ int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct intel_svm_dev *sd
 	u64 ctx_lo;
 	int ret;
 
-	domain = get_valid_domain_for_dev(sdev->dev);
+	domain = find_domain(sdev->dev);
 	if (!domain)
 		return -EINVAL;
 
-- 
2.17.1


[-- Attachment #3: Type: text/plain, Size: 2445 bytes --]




> On 4 Apr 2019, at 07:49, Lu Baolu <baolu.lu@linux.intel.com> wrote:
> 
> Hi James,
> 
> I did a sanity test from my end. The test machine fails to boot. I
> haven't seen any valuable kernel log. It results in a purple screen.
> 
> Best regards,
> Lu Baolu
> 
> On 3/29/19 11:26 PM, James Sewart wrote:
>> Hey Lu,
>> I’ve attached a preliminary v3, if you could take a look and run some tests
>> that would be great.
>> Since v2 i’ve added your default domain type patches, the new device_group
>> handler, and incorporated Jacob’s feedback.
>>> On 28 Mar 2019, at 18:37, James Sewart <jamessewart@arista.com> wrote:
>>> 
>>> Hey Lu,
>>> 
>>>> On 26 Mar 2019, at 01:24, Lu Baolu <baolu.lu@linux.intel.com> wrote:
>>>> 
>>>> Hi James,
>>>> 
>>>> On 3/25/19 8:57 PM, James Sewart wrote:
>>>>>>> Theres an issue that if we choose to alloc a new resv_region with type
>>>>>>> IOMMU_RESV_DIRECT, we will need to refactor intel_iommu_put_resv_regions
>>>>>>> to free this entry type which means refactoring the rmrr regions in
>>>>>>> get_resv_regions. Should this work be in this patchset?
>>>>>> Do you mean the rmrr regions are not allocated in get_resv_regions, but
>>>>>> are freed in put_resv_regions? I think we should fix this in this patch
>>>>>> set since this might impact the device passthrough if we don't do it.
>>>>> They’re not allocated and not freed currently, only type IOMMU_RESV_MSI is
>>>>> freed in put_resv_regions. If we allocate a new resv_region with type
>>>>> IOMMU_RESV_DIRECT for the isa region, then it won’t be freed. If we modify
>>>>> put_resv_regions to free type IOMMU_RESV_DIRECT, then we will try to free
>>>>> the static RMRR regions.
>>>>> Either the ISA region is static and not freed as with my implementation,
>>>>> or the RMRR regions are converted to be allocated on each call to
>>>>> get_resv_regions and freed in put_resv_regions.
>>>> 
>>>> By the way, there's another way in my mind. Let's add a new region type
>>>> for LPC devices, e.x. IOMMU_RESV_LPC, and then handle it in the same way
>>>> as those MSI regions. Just FYI.
>>> 
>>> This solution would require adding some extra code to
>>> iommu_group_create_direct_mappings as currently only type
>>> IOMMU_RESV_DIRECT is identity mapped, other types are only reserved.
>>> 
>>> 
>>>> 
>>>> Best regards,
>>>> Lu Baolu
>>> 
>>> Cheers,
>>> James.
>> Cheers,
>> James.


^ permalink raw reply related	[flat|nested] 55+ messages in thread

* Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions
  2019-04-05 18:02                     ` James Sewart
@ 2019-04-08  2:43                       ` Lu Baolu
  2019-04-10  5:22                       ` Lu Baolu
  1 sibling, 0 replies; 55+ messages in thread
From: Lu Baolu @ 2019-04-08  2:43 UTC (permalink / raw)
  To: James Sewart
  Cc: baolu.lu, iommu, Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel

Hi,

On 4/6/19 2:02 AM, James Sewart wrote:
> Hey Lu,
> 
> My bad, did some debugging on my end. The issue was swapping out
> find_domain for iommu_get_domain_for_dev. It seems in some situations the
> domain is not attached to the group but the device is expected to have the
> domain still stored in its archdata.
> 
> I’ve attached the final patch with find_domain unremoved which seems to
> work in my testing.

This version works for me now.

> 
> Cheers,
> James.

Best regards,
Lu Baolu

> 
> 
> 
> 
> 
>> On 4 Apr 2019, at 07:49, Lu Baolu <baolu.lu@linux.intel.com> wrote:
>>
>> Hi James,
>>
>> I did a sanity test from my end. The test machine fails to boot. I
>> haven't seen any valuable kernel log. It results in a purple screen.
>>
>> Best regards,
>> Lu Baolu
>>
>> On 3/29/19 11:26 PM, James Sewart wrote:
>>> Hey Lu,
>>> I’ve attached a preliminary v3, if you could take a look and run some tests
>>> that would be great.
>>> Since v2 i’ve added your default domain type patches, the new device_group
>>> handler, and incorporated Jacob’s feedback.
>>>> On 28 Mar 2019, at 18:37, James Sewart <jamessewart@arista.com> wrote:
>>>>
>>>> Hey Lu,
>>>>
>>>>> On 26 Mar 2019, at 01:24, Lu Baolu <baolu.lu@linux.intel.com> wrote:
>>>>>
>>>>> Hi James,
>>>>>
>>>>> On 3/25/19 8:57 PM, James Sewart wrote:
>>>>>>>> Theres an issue that if we choose to alloc a new resv_region with type
>>>>>>>> IOMMU_RESV_DIRECT, we will need to refactor intel_iommu_put_resv_regions
>>>>>>>> to free this entry type which means refactoring the rmrr regions in
>>>>>>>> get_resv_regions. Should this work be in this patchset?
>>>>>>> Do you mean the rmrr regions are not allocated in get_resv_regions, but
>>>>>>> are freed in put_resv_regions? I think we should fix this in this patch
>>>>>>> set since this might impact the device passthrough if we don't do it.
>>>>>> They’re not allocated and not freed currently, only type IOMMU_RESV_MSI is
>>>>>> freed in put_resv_regions. If we allocate a new resv_region with type
>>>>>> IOMMU_RESV_DIRECT for the isa region, then it won’t be freed. If we modify
>>>>>> put_resv_regions to free type IOMMU_RESV_DIRECT, then we will try to free
>>>>>> the static RMRR regions.
>>>>>> Either the ISA region is static and not freed as with my implementation,
>>>>>> or the RMRR regions are converted to be allocated on each call to
>>>>>> get_resv_regions and freed in put_resv_regions.
>>>>>
>>>>> By the way, there's another way in my mind. Let's add a new region type
>>>>> for LPC devices, e.x. IOMMU_RESV_LPC, and then handle it in the same way
>>>>> as those MSI regions. Just FYI.
>>>>
>>>> This solution would require adding some extra code to
>>>> iommu_group_create_direct_mappings as currently only type
>>>> IOMMU_RESV_DIRECT is identity mapped, other types are only reserved.
>>>>
>>>>
>>>>>
>>>>> Best regards,
>>>>> Lu Baolu
>>>>
>>>> Cheers,
>>>> James.
>>> Cheers,
>>> James.
> 

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions
  2019-04-05 18:02                     ` James Sewart
  2019-04-08  2:43                       ` Lu Baolu
@ 2019-04-10  5:22                       ` Lu Baolu
  2019-04-15 14:16                         ` James Sewart
  1 sibling, 1 reply; 55+ messages in thread
From: Lu Baolu @ 2019-04-10  5:22 UTC (permalink / raw)
  To: James Sewart
  Cc: baolu.lu, iommu, Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel

Hi James,

On 4/6/19 2:02 AM, James Sewart wrote:
> Hey Lu,
> 
> My bad, did some debugging on my end. The issue was swapping out
> find_domain for iommu_get_domain_for_dev. It seems in some situations the
> domain is not attached to the group but the device is expected to have the
> domain still stored in its archdata.
> 
> I’ve attached the final patch with find_domain unremoved which seems to
> work in my testing.
> 

Just looked into your v3 patch set and some thoughts from my end posted
here just for your information.

Let me post the problems we want to address.

1. When allocating a new group for a device, how should we determine the
type of the default domain?

2. If we need to put a device into an existing group which uses a
different type of domain from what the device desires to use, we might
break the functionality of the device.

My new thought is to let the generic IOMMU code determine the default
domain type (hence my proposed vendor-specific default domain type
patches could be dropped).

If the default domain type is dynamic mapping, and later in 
iommu_no_mapping() we determine that we must use an identity domain,
we then call iommu_request_dm_for_dev(dev).

If the default domain type is identity mapping, and later in
iommu_no_mapping() we determine that we must use a dynamic domain,
we then call iommu_request_dma_domain_for_dev(dev).

We already have iommu_request_dm_for_dev() in iommu.c. We only need to
implement iommu_request_dma_domain_for_dev().

With this done, your patch titled "Create an IOMMU group for devices
that require an identity map" could also be dropped.

Any thoughts?

> Cheers,
> James.

Best regards,
Lu Baolu

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions
  2019-04-10  5:22                       ` Lu Baolu
@ 2019-04-15 14:16                         ` James Sewart
  2019-04-16  2:18                           ` Lu Baolu
  0 siblings, 1 reply; 55+ messages in thread
From: James Sewart @ 2019-04-15 14:16 UTC (permalink / raw)
  To: Lu Baolu; +Cc: iommu, Tom Murphy, Dmitry Safonov, Jacob Pan, linux-kernel

Hey Lu,

> On 10 Apr 2019, at 06:22, Lu Baolu <baolu.lu@linux.intel.com> wrote:
> 
> Hi James,
> 
> On 4/6/19 2:02 AM, James Sewart wrote:
>> Hey Lu,
>> My bad, did some debugging on my end. The issue was swapping out
>> find_domain for iommu_get_domain_for_dev. It seems in some situations the
>> domain is not attached to the group but the device is expected to have the
>> domain still stored in its archdata.
>> I’ve attached the final patch with find_domain unremoved which seems to
>> work in my testing.
> 
> Just looked into your v3 patch set and some thoughts from my end posted
> here just for your information.
> 
> Let me post the problems we want to address.
> 
> 1. When allocating a new group for a device, how should we determine the
> type of the default domain?
> 
> 2. If we need to put a device into an existing group which uses a
> different type of domain from what the device desires to use, we might
> break the functionality of the device.
> 
> My new thought is letting the iommu generic code to determine the
> default domain type (hence my proposed vendor specific default domain
> type patches could be dropped).
> 
> If the default domain type is dynamical mapping, and later in iommu_no_mapping(), we determines that we must use an identity domain,
> we then call iommu_request_dm_for_dev(dev).
> 
> If the default domain type is identity mapping, and later in
> iommu_no_mapping(), we determined that we must use a dynamical domain,
> we then call iommu_request_dma_domain_for_dev(dev).
> 
> We already have iommu_request_dm_for_dev() in iommu.c. We only need to
> implement iommu_request_dma_domain_for_dev().
> 
> With this done, your patch titled "Create an IOMMU group for devices
> that require an identity map" could also be dropped.
> 
> Any thoughts?

There's a couple of issues I can think of. iommu_request_dm_for_dev 
changes the domain for all devices within the device's group; if we have 
devices with different domain requirements in the same group, then only 
the last initialised device will get the domain it wants. If we want to 
add iommu_request_dma_domain_for_dev, then we would still need another 
IOMMU group for identity-domain devices.

Both with v3 and with the proposed method here, iommu_should_identity_map 
determines whether the device requires an identity map. In v3 this is 
called once up front by the generic code to determine which domain type 
to use for the IOMMU group. In the proposed method I think this would 
happen after the initial domain allocation and would trigger the domain 
to be replaced with a different one.

I prefer the solution in v3. What do you think?

> 
>> Cheers,
>> James.
> 
> Best regards,
> Lu Baolu

Cheers,
James.


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions
  2019-04-15 14:16                         ` James Sewart
@ 2019-04-16  2:18                           ` Lu Baolu
  2019-04-24 23:47                             ` Tom Murphy
  0 siblings, 1 reply; 55+ messages in thread
From: Lu Baolu @ 2019-04-16  2:18 UTC (permalink / raw)
  To: James Sewart
  Cc: baolu.lu, iommu, Tom Murphy, Dmitry Safonov, Jacob Pan,
	linux-kernel, Ashok Raj, Tian, Kevin

Hi James,

On 4/15/19 10:16 PM, James Sewart wrote:
> Hey Lu,
> 
>> On 10 Apr 2019, at 06:22, Lu Baolu <baolu.lu@linux.intel.com> wrote:
>>
>> Hi James,
>>
>> On 4/6/19 2:02 AM, James Sewart wrote:
>>> Hey Lu,
>>> My bad, did some debugging on my end. The issue was swapping out
>>> find_domain for iommu_get_domain_for_dev. It seems in some situations the
>>> domain is not attached to the group but the device is expected to have the
>>> domain still stored in its archdata.
>>> I’ve attached the final patch with find_domain unremoved which seems to
>>> work in my testing.
>>
>> Just looked into your v3 patch set and some thoughts from my end posted
>> here just for your information.
>>
>> Let me post the problems we want to address.
>>
>> 1. When allocating a new group for a device, how should we determine the
>> type of the default domain?
>>
>> 2. If we need to put a device into an existing group which uses a
>> different type of domain from what the device desires to use, we might
>> break the functionality of the device.
>>
>> My new thought is letting the iommu generic code to determine the
>> default domain type (hence my proposed vendor specific default domain
>> type patches could be dropped).
>>
>> If the default domain type is dynamical mapping, and later in iommu_no_mapping(), we determines that we must use an identity domain,
>> we then call iommu_request_dm_for_dev(dev).
>>
>> If the default domain type is identity mapping, and later in
>> iommu_no_mapping(), we determined that we must use a dynamical domain,
>> we then call iommu_request_dma_domain_for_dev(dev).
>>
>> We already have iommu_request_dm_for_dev() in iommu.c. We only need to
>> implement iommu_request_dma_domain_for_dev().
>>
>> With this done, your patch titled "Create an IOMMU group for devices
>> that require an identity map" could also be dropped.
>>
>> Any thoughts?
> 
> Theres a couple issues I can think of. iommu_request_dm_for_dev changes
> the domain for all devices within the devices group, if we may have
> devices with different domain requirements in the same group then only the
> last initialised device will get the domain it wants. If we want to add
> iommu_request_dma_domain_for_dev then we would still need another IOMMU
> group for identity domain devices.

The current solution (a.k.a. v3) for this is to assign a new group to a
device which requires an identity-mapped domain. This will break the
functionality of direct device pass-through (to user level). The IOMMU
group represents the minimum granularity of IOMMU isolation and
protection; the requirement of an identity-mapped domain should not be a
factor in creating a new group.

Both iommu_request_dma_domain_for_dev() and iommu_request_dm_for_dev()
only support groups with a single device in them.

The users could choose between a DMA-type or an IDENTITY-type domain for
a group. If multiple devices share a single group and neither type works
for all of them, they have to disable the IOMMU. That isn't a valid
configuration from the IOMMU's point of view.

> 
> Both with v3 and the proposed method here, iommu_should_identity_map is
> determining whether the device requires an identity map. In v3 this is
> called once up front by the generic code to determine for the IOMMU group
> which domain type to use. In the proposed method I think this would happen
> after initial domain allocation and would trigger the domain to be
> replaced with a different domain.
> 
> I prefer the solution in v3. What do you think?

The only concern with the solution in v3, from my point of view, is some
duplication. Even if we can ask the Intel IOMMU driver to tell us the
default domain type, we might change it after boot up. For example, for
32-bit PCI devices, we don't know whether a device is 64-bit or 32-bit
capable during IOMMU initialization, so we might tell iommu.c to use an
identity-mapped domain. After boot up, we check again and find out that
the device is only 32-bit capable (hence can only access physical memory
below 4G) and the default identity domain doesn't work, so we have to
change it to a DMA domain via iommu_request_dma_domain_for_dev().

So, to keep it simple and straightforward, we can just let iommu.c
determine the default domain type, and afterwards the Intel IOMMU driver
can tweak it according to the quirk bits in the driver.

Best regards,
Lu Baolu

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions
  2019-04-16  2:18                           ` Lu Baolu
@ 2019-04-24 23:47                             ` Tom Murphy
  2019-04-25  1:15                               ` Lu Baolu
  0 siblings, 1 reply; 55+ messages in thread
From: Tom Murphy @ 2019-04-24 23:47 UTC (permalink / raw)
  To: Lu Baolu
  Cc: James Sewart, iommu, Dmitry Safonov, Jacob Pan, linux-kernel,
	Ashok Raj, Tian, Kevin

I can see two potential problems with these patches that should be addressed:

The default domain of a group can be changed to type
IOMMU_DOMAIN_IDENTITY via the command line. With these patches we return
the si_domain for type IOMMU_DOMAIN_IDENTITY, so there's a chance the
shared si_domain could be freed while still in use when a group is
freed. For example, here in iommu_group_release():
https://github.com/torvalds/linux/blob/cd8dead0c39457e58ec1d36db93aedca811d48f1/drivers/iommu/iommu.c#L376
"if (group->default_domain)
    iommu_domain_free(group->default_domain);"

Also, now that a domain is attached to a device earlier, we should
implement the is_attach_deferred callback and use it to defer the
domain attach from iommu driver init to device driver init when the
iommu is pre-enabled in a kdump kernel, as in this commit:
https://github.com/torvalds/linux/commit/df3f7a6e8e855e4ff533508807cd7c3723faa51f


On Tue, Apr 16, 2019 at 3:24 AM Lu Baolu <baolu.lu@linux.intel.com> wrote:
>
> Hi James,
>
> On 4/15/19 10:16 PM, James Sewart wrote:
> > Hey Lu,
> >
> >> On 10 Apr 2019, at 06:22, Lu Baolu <baolu.lu@linux.intel.com> wrote:
> >>
> >> Hi James,
> >>
> >> On 4/6/19 2:02 AM, James Sewart wrote:
> >>> Hey Lu,
> >>> My bad, I did some debugging on my end. The issue was swapping out
> >>> find_domain for iommu_get_domain_for_dev. It seems that in some
> >>> situations the domain is not attached to the group, but the device is
> >>> expected to have the domain still stored in its archdata.
> >>> I’ve attached the final patch with find_domain left in place, which
> >>> seems to work in my testing.
> >>
> >> I just looked into your v3 patch set; some thoughts from my end are
> >> posted here for your information.
> >>
> >> Let me state the problems we want to address.
> >>
> >> 1. When allocating a new group for a device, how should we determine the
> >> type of the default domain?
> >>
> >> 2. If we need to put a device into an existing group which uses a
> >> different type of domain from what the device desires to use, we might
> >> break the functionality of the device.
> >>
> >> My new thought is letting the iommu generic code determine the
> >> default domain type (hence my proposed vendor-specific default domain
> >> type patches could be dropped).
> >>
> >> If the default domain type is dynamic mapping, and later in
> >> iommu_no_mapping() we determine that we must use an identity domain,
> >> we then call iommu_request_dm_for_dev(dev).
> >>
> >> If the default domain type is identity mapping, and later in
> >> iommu_no_mapping() we determine that we must use a dynamic domain,
> >> we then call iommu_request_dma_domain_for_dev(dev).
> >>
> >> We already have iommu_request_dm_for_dev() in iommu.c. We only need to
> >> implement iommu_request_dma_domain_for_dev().
> >>
> >> With this done, your patch titled "Create an IOMMU group for devices
> >> that require an identity map" could also be dropped.
> >>
> >> Any thoughts?
> >
> > There are a couple of issues I can think of. iommu_request_dm_for_dev
> > changes the domain for all devices within the device's group; if we have
> > devices with different domain requirements in the same group, then only
> > the last initialised device will get the domain it wants. If we want to
> > add iommu_request_dma_domain_for_dev then we would still need another
> > IOMMU group for identity domain devices.
>
> The current solution (a.k.a. v3) is to assign a new group to a device
> which requires an identity mapped domain. This will break the
> functionality of direct device pass-through (to user level). The iommu
> group represents the minimum granularity of iommu isolation and
> protection, so the requirement of an identity mapped domain should not
> be a factor in creating a new group.
>
> Both iommu_request_dma_domain_for_dev() and iommu_request_dm_for_dev()
> only support groups with a single device in them.
>
> The user can choose between a domain of DMA type or IDENTITY type for a
> group. If multiple devices share a single group and neither type works
> for all of them, the user has to disable the IOMMU. That isn't a valid
> configuration from the IOMMU's point of view.
>
> >
> > With both v3 and the method proposed here, iommu_should_identity_map
> > determines whether the device requires an identity map. In v3 it is
> > called once up front by the generic code to determine which domain type
> > to use for the IOMMU group. In the proposed method I think this would
> > happen after the initial domain allocation and would trigger the domain
> > to be replaced with a different one.
> >
> > I prefer the solution in v3. What do you think?
>
> My only concern with the v3 solution is a degree of duplication. Even if
> we can ask the Intel IOMMU driver to tell us the default domain type, we
> might need to change it after boot. For example, for 32-bit PCI devices
> we don't know whether they are 64-bit or 32-bit capable during IOMMU
> initialization, so we might tell iommu.c to use an identity mapped
> domain. After boot we check again and find that a device is only 32-bit
> capable (hence can only access physical memory below 4G), so the default
> identity domain doesn't work, and we have to change it to a DMA domain
> via iommu_request_dma_domain_for_dev().
>
> So, to keep it simple and straightforward, we can just let iommu.c
> determine the default domain type, and afterwards the Intel IOMMU driver
> can tweak it according to the quirk bits in the driver.
>
> Best regards,
> Lu Baolu


* Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions
  2019-04-24 23:47                             ` Tom Murphy
@ 2019-04-25  1:15                               ` Lu Baolu
  0 siblings, 0 replies; 55+ messages in thread
From: Lu Baolu @ 2019-04-25  1:15 UTC (permalink / raw)
  To: Tom Murphy
  Cc: baolu.lu, James Sewart, iommu, Dmitry Safonov, Jacob Pan,
	linux-kernel, Ashok Raj, Tian, Kevin

Hi Tom,

On 4/25/19 7:47 AM, Tom Murphy wrote:
> I can see two potential problems with these patches that should be addressed:
> 
> The default domain of a group can be changed to type
> IOMMU_DOMAIN_IDENTITY via the command line. With these patches we return
> the si_domain for type IOMMU_DOMAIN_IDENTITY, so there's a chance the
> shared si_domain could be freed while still in use when a group is
> freed. For example, here in iommu_group_release():
> https://github.com/torvalds/linux/blob/cd8dead0c39457e58ec1d36db93aedca811d48f1/drivers/iommu/iommu.c#L376
> "if (group->default_domain)
>      iommu_domain_free(group->default_domain);"
> 
> Also, now that a domain is attached to a device earlier, we should
> implement the is_attach_deferred callback and use it to defer the
> domain attach from iommu driver init to device driver init when the
> iommu is pre-enabled in a kdump kernel, as in this commit:
> https://github.com/torvalds/linux/commit/df3f7a6e8e855e4ff533508807cd7c3723faa51f
> 

Agreed on both points; they should be considered during development.

Best regards,
Lu Baolu

> 
> On Tue, Apr 16, 2019 at 3:24 AM Lu Baolu <baolu.lu@linux.intel.com> wrote:
>>
>> Hi James,
>>
>> On 4/15/19 10:16 PM, James Sewart wrote:
>>> Hey Lu,
>>>
>>>> On 10 Apr 2019, at 06:22, Lu Baolu <baolu.lu@linux.intel.com> wrote:
>>>>
>>>> Hi James,
>>>>
>>>> On 4/6/19 2:02 AM, James Sewart wrote:
>>>>> Hey Lu,
>>>>> My bad, I did some debugging on my end. The issue was swapping out
>>>>> find_domain for iommu_get_domain_for_dev. It seems that in some
>>>>> situations the domain is not attached to the group, but the device is
>>>>> expected to have the domain still stored in its archdata.
>>>>> I’ve attached the final patch with find_domain left in place, which
>>>>> seems to work in my testing.
>>>>
>>>> I just looked into your v3 patch set; some thoughts from my end are
>>>> posted here for your information.
>>>>
>>>> Let me state the problems we want to address.
>>>>
>>>> 1. When allocating a new group for a device, how should we determine the
>>>> type of the default domain?
>>>>
>>>> 2. If we need to put a device into an existing group which uses a
>>>> different type of domain from what the device desires to use, we might
>>>> break the functionality of the device.
>>>>
>>>> My new thought is letting the iommu generic code determine the
>>>> default domain type (hence my proposed vendor-specific default domain
>>>> type patches could be dropped).
>>>>
>>>> If the default domain type is dynamic mapping, and later in
>>>> iommu_no_mapping() we determine that we must use an identity domain,
>>>> we then call iommu_request_dm_for_dev(dev).
>>>>
>>>> If the default domain type is identity mapping, and later in
>>>> iommu_no_mapping() we determine that we must use a dynamic domain,
>>>> we then call iommu_request_dma_domain_for_dev(dev).
>>>>
>>>> We already have iommu_request_dm_for_dev() in iommu.c. We only need to
>>>> implement iommu_request_dma_domain_for_dev().
>>>>
>>>> With this done, your patch titled "Create an IOMMU group for devices
>>>> that require an identity map" could also be dropped.
>>>>
>>>> Any thoughts?
>>>
>>> There are a couple of issues I can think of. iommu_request_dm_for_dev
>>> changes the domain for all devices within the device's group; if we have
>>> devices with different domain requirements in the same group, then only
>>> the last initialised device will get the domain it wants. If we want to
>>> add iommu_request_dma_domain_for_dev then we would still need another
>>> IOMMU group for identity domain devices.
>>
>> The current solution (a.k.a. v3) is to assign a new group to a device
>> which requires an identity mapped domain. This will break the
>> functionality of direct device pass-through (to user level). The iommu
>> group represents the minimum granularity of iommu isolation and
>> protection, so the requirement of an identity mapped domain should not
>> be a factor in creating a new group.
>>
>> Both iommu_request_dma_domain_for_dev() and iommu_request_dm_for_dev()
>> only support groups with a single device in them.
>>
>> The user can choose between a domain of DMA type or IDENTITY type for a
>> group. If multiple devices share a single group and neither type works
>> for all of them, the user has to disable the IOMMU. That isn't a valid
>> configuration from the IOMMU's point of view.
>>
>>>
>>> With both v3 and the method proposed here, iommu_should_identity_map
>>> determines whether the device requires an identity map. In v3 it is
>>> called once up front by the generic code to determine which domain type
>>> to use for the IOMMU group. In the proposed method I think this would
>>> happen after the initial domain allocation and would trigger the domain
>>> to be replaced with a different one.
>>>
>>> I prefer the solution in v3. What do you think?
>>
>> My only concern with the v3 solution is a degree of duplication. Even if
>> we can ask the Intel IOMMU driver to tell us the default domain type, we
>> might need to change it after boot. For example, for 32-bit PCI devices
>> we don't know whether they are 64-bit or 32-bit capable during IOMMU
>> initialization, so we might tell iommu.c to use an identity mapped
>> domain. After boot we check again and find that a device is only 32-bit
>> capable (hence can only access physical memory below 4G), so the default
>> identity domain doesn't work, and we have to change it to a DMA domain
>> via iommu_request_dma_domain_for_dev().
>>
>> So, to keep it simple and straightforward, we can just let iommu.c
>> determine the default domain type, and afterwards the Intel IOMMU driver
>> can tweak it according to the quirk bits in the driver.
>>
>> Best regards,
>> Lu Baolu
> 


end of thread, other threads:[~2019-04-25  1:25 UTC | newest]

Thread overview: 55+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-04 15:41 [PATCH 0/4] iommu/vt-d: Fix-up device-domain relationship by refactoring to use iommu group default domain James Sewart
2019-03-04 15:45 ` [PATCH 1/4] iommu: Move iommu_group_create_direct_mappings to after device_attach James Sewart
2019-03-04 15:46 ` [PATCH 2/4] iommu/vt-d: Implement apply_resv_region for reserving IOVA ranges James Sewart
2019-03-04 15:46   ` [PATCH 3/4] iommu/vt-d: Allow IOMMU_DOMAIN_DMA and IOMMU_DOMAIN_IDENTITY to be allocated James Sewart
2019-03-04 15:47     ` [PATCH 4/4] iommu/vt-d: Remove lazy allocation of domains James Sewart
2019-03-05  6:59       ` Lu Baolu
2019-03-05 11:46         ` James Sewart
2019-03-06  7:00           ` Lu Baolu
2019-03-06 18:08             ` James Sewart
2019-03-07  6:31               ` Lu Baolu
2019-03-07 10:21                 ` James Sewart
2019-03-08  1:09                   ` Lu Baolu
2019-03-08  3:09                   ` Lu Baolu
2019-03-08 16:57                     ` James Sewart
2019-03-09  1:53                       ` Lu Baolu
2019-03-09 11:49                         ` James Sewart
2019-03-10  2:51                           ` Lu Baolu
2019-03-05  6:46     ` [PATCH 3/4] iommu/vt-d: Allow IOMMU_DOMAIN_DMA and IOMMU_DOMAIN_IDENTITY to be allocated Lu Baolu
2019-03-05 11:34       ` James Sewart
2019-03-08  1:20     ` Dmitry Safonov
2019-03-09 11:57       ` James Sewart
2019-03-05  6:05 ` [PATCH 0/4] iommu/vt-d: Fix-up device-domain relationship by refactoring to use iommu group default domain Lu Baolu
2019-03-05 11:14   ` James Sewart
2019-03-06  6:27     ` Lu Baolu
2019-03-14 11:56 ` [PATCH v2 0/7] " James Sewart
2019-03-14 11:57   ` [PATCH v2 1/7] iommu: Move iommu_group_create_direct_mappings to after device_attach James Sewart
2019-03-14 11:58   ` [PATCH v2 2/7] iommu/vt-d: Implement apply_resv_region for reserving IOVA ranges James Sewart
2019-03-14 11:58   ` [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions James Sewart
2019-03-15  2:19     ` Lu Baolu
2019-03-22  9:57       ` James Sewart
2019-03-25  2:03         ` Lu Baolu
2019-03-25 12:57           ` James Sewart
2019-03-26  1:10             ` Lu Baolu
2019-03-26  1:24             ` Lu Baolu
2019-03-28 18:37               ` James Sewart
2019-03-29 15:26                 ` James Sewart
2019-04-04  6:49                   ` Lu Baolu
2019-04-05 18:02                     ` James Sewart
2019-04-08  2:43                       ` Lu Baolu
2019-04-10  5:22                       ` Lu Baolu
2019-04-15 14:16                         ` James Sewart
2019-04-16  2:18                           ` Lu Baolu
2019-04-24 23:47                             ` Tom Murphy
2019-04-25  1:15                               ` Lu Baolu
2019-03-14 11:58   ` [PATCH v2 4/7] iommu/vt-d: Ignore domain parameter in attach_device if device requires identity map James Sewart
2019-03-15  2:30     ` Lu Baolu
2019-03-14 11:58   ` [PATCH v2 5/7] iommu/vt-d: Enable DMA remapping after rmrr mapped James Sewart
2019-03-14 11:59   ` [PATCH v2 6/7] iommu/vt-d: Allow IOMMU_DOMAIN_DMA to be allocated by iommu_ops James Sewart
2019-03-14 11:59   ` [PATCH v2 7/7] iommu/vt-d: Remove lazy allocation of domains James Sewart
2019-03-14 23:35     ` Jacob Pan
2019-03-22 10:07       ` James Sewart
2019-03-15  3:13   ` [PATCH v2 0/7] iommu/vt-d: Fix-up device-domain relationship by refactoring to use iommu group default domain Lu Baolu
2019-03-19 13:35     ` James Sewart
2019-03-20  1:26       ` Lu Baolu
2019-03-22 10:05         ` James Sewart
