iommu.lists.linux-foundation.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/7] iommu/vt-d: Fixes and cleanups for linux-next
@ 2019-06-12  0:28 Lu Baolu
  2019-06-12  0:28 ` [PATCH v2 1/7] iommu/vt-d: Don't return error when device gets right domain Lu Baolu
                   ` (7 more replies)
  0 siblings, 8 replies; 18+ messages in thread
From: Lu Baolu @ 2019-06-12  0:28 UTC (permalink / raw)
  To: Joerg Roedel, David Woodhouse
  Cc: kevin.tian, ashok.raj, linux-kernel, iommu, cai, jacob.jun.pan

Hi Joerg,

This series includes several fixes and cleanups after delegating
DMA domain to generic iommu. Please review and consider them for
linux-next.

Best regards,
Baolu

Change log:
  v1->v2:
  - Refine "iommu/vt-d: Cleanup after delegating DMA domain to
    generic iommu" by removing an unnecessary cleanup.
  - Add "Allow DMA domain attaching to rmrr locked device" which
    fix a driver load issue.

Lu Baolu (6):
  iommu/vt-d: Don't return error when device gets right domain
  iommu/vt-d: Set domain type for a private domain
  iommu/vt-d: Don't enable iommu's which have been ignored
  iommu/vt-d: Allow DMA domain attaching to rmrr locked device
  iommu/vt-d: Fix suspicious RCU usage in probe_acpi_namespace_devices()
  iommu/vt-d: Consolidate domain_init() to avoid duplication

Sai Praneeth Prakhya (1):
  iommu/vt-d: Cleanup after delegating DMA domain to generic iommu

 drivers/iommu/intel-iommu.c | 200 +++++++++---------------------------
 1 file changed, 49 insertions(+), 151 deletions(-)

-- 
2.17.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v2 1/7] iommu/vt-d: Don't return error when device gets right domain
  2019-06-12  0:28 [PATCH v2 0/7] iommu/vt-d: Fixes and cleanups for linux-next Lu Baolu
@ 2019-06-12  0:28 ` Lu Baolu
  2019-06-12  0:28 ` [PATCH v2 2/7] iommu/vt-d: Set domain type for a private domain Lu Baolu
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 18+ messages in thread
From: Lu Baolu @ 2019-06-12  0:28 UTC (permalink / raw)
  To: Joerg Roedel, David Woodhouse
  Cc: kevin.tian, ashok.raj, linux-kernel, iommu, cai, jacob.jun.pan

If a device gets a right domain in add_device ops, it shouldn't
return error.

Fixes: 942067f1b6b97 ("iommu/vt-d: Identify default domains replaced with private")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel-iommu.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index b431cc6f6ba4..d5a6c8064c56 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5360,10 +5360,7 @@ static int intel_iommu_add_device(struct device *dev)
 				domain_add_dev_info(si_domain, dev);
 				dev_info(dev,
 					 "Device uses a private identity domain.\n");
-				return 0;
 			}
-
-			return -ENODEV;
 		}
 	} else {
 		if (device_def_domain_type(dev) == IOMMU_DOMAIN_DMA) {
@@ -5378,10 +5375,7 @@ static int intel_iommu_add_device(struct device *dev)
 
 				dev_info(dev,
 					 "Device uses a private dma domain.\n");
-				return 0;
 			}
-
-			return -ENODEV;
 		}
 	}
 
-- 
2.17.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 2/7] iommu/vt-d: Set domain type for a private domain
  2019-06-12  0:28 [PATCH v2 0/7] iommu/vt-d: Fixes and cleanups for linux-next Lu Baolu
  2019-06-12  0:28 ` [PATCH v2 1/7] iommu/vt-d: Don't return error when device gets right domain Lu Baolu
@ 2019-06-12  0:28 ` Lu Baolu
  2019-06-12  0:28 ` [PATCH v2 3/7] iommu/vt-d: Don't enable iommu's which have been ignored Lu Baolu
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 18+ messages in thread
From: Lu Baolu @ 2019-06-12  0:28 UTC (permalink / raw)
  To: Joerg Roedel, David Woodhouse
  Cc: kevin.tian, ashok.raj, linux-kernel, iommu, cai, jacob.jun.pan

Otherwise, domain_get_iommu() will be broken.

Fixes: 942067f1b6b97 ("iommu/vt-d: Identify default domains replaced with private")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel-iommu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index d5a6c8064c56..d1a82039e835 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3458,6 +3458,8 @@ static struct dmar_domain *get_private_domain_for_dev(struct device *dev)
 out:
 	if (!domain)
 		dev_err(dev, "Allocating domain failed\n");
+	else
+		domain->domain.type = IOMMU_DOMAIN_DMA;
 
 	return domain;
 }
-- 
2.17.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 3/7] iommu/vt-d: Don't enable iommu's which have been ignored
  2019-06-12  0:28 [PATCH v2 0/7] iommu/vt-d: Fixes and cleanups for linux-next Lu Baolu
  2019-06-12  0:28 ` [PATCH v2 1/7] iommu/vt-d: Don't return error when device gets right domain Lu Baolu
  2019-06-12  0:28 ` [PATCH v2 2/7] iommu/vt-d: Set domain type for a private domain Lu Baolu
@ 2019-06-12  0:28 ` Lu Baolu
  2019-06-12  0:28 ` [PATCH v2 4/7] iommu/vt-d: Allow DMA domain attaching to rmrr locked device Lu Baolu
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 18+ messages in thread
From: Lu Baolu @ 2019-06-12  0:28 UTC (permalink / raw)
  To: Joerg Roedel, David Woodhouse
  Cc: kevin.tian, ashok.raj, linux-kernel, iommu, cai, jacob.jun.pan

The iommu driver will ignore some iommu units if there's no
device under its scope or those devices have been explicitly
set to bypass the DMA translation. Don't enable those iommu
units, otherwise the devices under its scope won't work.

Fixes: d8190dc638866 ("iommu/vt-d: Enable DMA remapping after rmrr mapped")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel-iommu.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index d1a82039e835..5251533a18a4 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3232,7 +3232,12 @@ static int __init init_dmars(void)
 		goto error;
 	}
 
-	for_each_active_iommu(iommu, drhd) {
+	for_each_iommu(iommu, drhd) {
+		if (drhd->ignored) {
+			iommu_disable_translation(iommu);
+			continue;
+		}
+
 		/*
 		 * Find the max pasid size of all IOMMU's in the system.
 		 * We need to ensure the system pasid table is no bigger
@@ -4793,7 +4798,7 @@ int __init intel_iommu_init(void)
 
 	/* Finally, we enable the DMA remapping hardware. */
 	for_each_iommu(iommu, drhd) {
-		if (!translation_pre_enabled(iommu))
+		if (!drhd->ignored && !translation_pre_enabled(iommu))
 			iommu_enable_translation(iommu);
 
 		iommu_disable_protect_mem_regions(iommu);
-- 
2.17.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 4/7] iommu/vt-d: Allow DMA domain attaching to rmrr locked device
  2019-06-12  0:28 [PATCH v2 0/7] iommu/vt-d: Fixes and cleanups for linux-next Lu Baolu
                   ` (2 preceding siblings ...)
  2019-06-12  0:28 ` [PATCH v2 3/7] iommu/vt-d: Don't enable iommu's which have been ignored Lu Baolu
@ 2019-06-12  0:28 ` Lu Baolu
  2019-06-12  0:28 ` [PATCH v2 5/7] iommu/vt-d: Fix suspicious RCU usage in probe_acpi_namespace_devices() Lu Baolu
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 18+ messages in thread
From: Lu Baolu @ 2019-06-12  0:28 UTC (permalink / raw)
  To: Joerg Roedel, David Woodhouse
  Cc: kevin.tian, ashok.raj, linux-kernel, iommu, cai, jacob.jun.pan

We don't allow a device to be assigned to user level when it is locked
by any RMRR's. Hence, intel_iommu_attach_device() will return error if
a domain of type IOMMU_DOMAIN_UNMANAGED is about to attach to a device
locked by rmrr. But this doesn't apply to a domain of type other than
IOMMU_DOMAIN_UNMANAGED. This adds a check to fix this.

Fixes: fa954e6831789 ("iommu/vt-d: Delegate the dma domain to upper layer")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reported-and-tested-by: Qian Cai <cai@lca.pw>
---
 drivers/iommu/intel-iommu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 5251533a18a4..19c4c387a3f6 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5139,7 +5139,8 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
 {
 	int ret;
 
-	if (device_is_rmrr_locked(dev)) {
+	if (domain->type == IOMMU_DOMAIN_UNMANAGED &&
+	    device_is_rmrr_locked(dev)) {
 		dev_warn(dev, "Device is ineligible for IOMMU domain attach due to platform RMRR requirement.  Contact your platform vendor.\n");
 		return -EPERM;
 	}
-- 
2.17.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 5/7] iommu/vt-d: Fix suspicious RCU usage in probe_acpi_namespace_devices()
  2019-06-12  0:28 [PATCH v2 0/7] iommu/vt-d: Fixes and cleanups for linux-next Lu Baolu
                   ` (3 preceding siblings ...)
  2019-06-12  0:28 ` [PATCH v2 4/7] iommu/vt-d: Allow DMA domain attaching to rmrr locked device Lu Baolu
@ 2019-06-12  0:28 ` Lu Baolu
  2022-06-29 13:03   ` Robin Murphy
  2019-06-12  0:28 ` [PATCH v2 6/7] iommu/vt-d: Cleanup after delegating DMA domain to generic iommu Lu Baolu
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 18+ messages in thread
From: Lu Baolu @ 2019-06-12  0:28 UTC (permalink / raw)
  To: Joerg Roedel, David Woodhouse
  Cc: kevin.tian, ashok.raj, linux-kernel, iommu, cai, jacob.jun.pan

The drhd and device scope list should be iterated with the
iommu global lock held. Otherwise, a suspicious RCU usage
message will be displayed.

[    3.695886] =============================
[    3.695917] WARNING: suspicious RCU usage
[    3.695950] 5.2.0-rc2+ #2467 Not tainted
[    3.695981] -----------------------------
[    3.696014] drivers/iommu/intel-iommu.c:4569 suspicious rcu_dereference_check() usage!
[    3.696069]
               other info that might help us debug this:

[    3.696126]
               rcu_scheduler_active = 2, debug_locks = 1
[    3.696173] no locks held by swapper/0/1.
[    3.696204]
               stack backtrace:
[    3.696241] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.2.0-rc2+ #2467
[    3.696370] Call Trace:
[    3.696404]  dump_stack+0x85/0xcb
[    3.696441]  intel_iommu_init+0x128c/0x13ce
[    3.696478]  ? kmem_cache_free+0x16b/0x2c0
[    3.696516]  ? __fput+0x14b/0x270
[    3.696550]  ? __call_rcu+0xb7/0x300
[    3.696583]  ? get_max_files+0x10/0x10
[    3.696631]  ? set_debug_rodata+0x11/0x11
[    3.696668]  ? e820__memblock_setup+0x60/0x60
[    3.696704]  ? pci_iommu_init+0x16/0x3f
[    3.696737]  ? set_debug_rodata+0x11/0x11
[    3.696770]  pci_iommu_init+0x16/0x3f
[    3.696805]  do_one_initcall+0x5d/0x2e4
[    3.696844]  ? set_debug_rodata+0x11/0x11
[    3.696880]  ? rcu_read_lock_sched_held+0x6b/0x80
[    3.696924]  kernel_init_freeable+0x1f0/0x27c
[    3.696961]  ? rest_init+0x260/0x260
[    3.696997]  kernel_init+0xa/0x110
[    3.697028]  ret_from_fork+0x3a/0x50

Fixes: fa212a97f3a36 ("iommu/vt-d: Probe DMA-capable ACPI name space devices")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel-iommu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 19c4c387a3f6..84e650c6a46d 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -4793,8 +4793,10 @@ int __init intel_iommu_init(void)
 	cpuhp_setup_state(CPUHP_IOMMU_INTEL_DEAD, "iommu/intel:dead", NULL,
 			  intel_iommu_cpu_dead);
 
+	down_read(&dmar_global_lock);
 	if (probe_acpi_namespace_devices())
 		pr_warn("ACPI name space devices didn't probe correctly\n");
+	up_read(&dmar_global_lock);
 
 	/* Finally, we enable the DMA remapping hardware. */
 	for_each_iommu(iommu, drhd) {
-- 
2.17.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 6/7] iommu/vt-d: Cleanup after delegating DMA domain to generic iommu
  2019-06-12  0:28 [PATCH v2 0/7] iommu/vt-d: Fixes and cleanups for linux-next Lu Baolu
                   ` (4 preceding siblings ...)
  2019-06-12  0:28 ` [PATCH v2 5/7] iommu/vt-d: Fix suspicious RCU usage in probe_acpi_namespace_devices() Lu Baolu
@ 2019-06-12  0:28 ` Lu Baolu
  2019-06-12  0:28 ` [PATCH v2 7/7] iommu/vt-d: Consolidate domain_init() to avoid duplication Lu Baolu
  2019-06-12  8:37 ` [PATCH v2 0/7] iommu/vt-d: Fixes and cleanups for linux-next Joerg Roedel
  7 siblings, 0 replies; 18+ messages in thread
From: Lu Baolu @ 2019-06-12  0:28 UTC (permalink / raw)
  To: Joerg Roedel, David Woodhouse
  Cc: kevin.tian, ashok.raj, linux-kernel, iommu, cai, jacob.jun.pan

From: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>

[No functional changes]

1. Starting with commit df4f3c603aeb ("iommu/vt-d: Remove static identity
map code") there are no callers for iommu_prepare_rmrr_dev() but the
implementation of the function still exists, so remove it. Also, as a
ripple effect remove get_domain_for_dev() and iommu_prepare_identity_map()
because they aren't being used either.

2. Remove extra new line in couple of places.

Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
---
 drivers/iommu/intel-iommu.c | 55 -------------------------------------
 1 file changed, 55 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 84e650c6a46d..5215dcd535a1 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -909,7 +909,6 @@ static struct dma_pte *pfn_to_dma_pte(struct dmar_domain *domain,
 	return pte;
 }
 
-
 /* return address's pte at specific level */
 static struct dma_pte *dma_pfn_level_pte(struct dmar_domain *domain,
 					 unsigned long pfn,
@@ -1578,7 +1577,6 @@ static void iommu_disable_translation(struct intel_iommu *iommu)
 	raw_spin_unlock_irqrestore(&iommu->register_lock, flag);
 }
 
-
 static int iommu_init_domains(struct intel_iommu *iommu)
 {
 	u32 ndomains, nlongs;
@@ -1616,8 +1614,6 @@ static int iommu_init_domains(struct intel_iommu *iommu)
 		return -ENOMEM;
 	}
 
-
-
 	/*
 	 * If Caching mode is set, then invalid translations are tagged
 	 * with domain-id 0, hence we need to pre-allocate it. We also
@@ -2649,29 +2645,6 @@ static struct dmar_domain *set_domain_for_dev(struct device *dev,
 	return domain;
 }
 
-static struct dmar_domain *get_domain_for_dev(struct device *dev, int gaw)
-{
-	struct dmar_domain *domain, *tmp;
-
-	domain = find_domain(dev);
-	if (domain)
-		goto out;
-
-	domain = find_or_alloc_domain(dev, gaw);
-	if (!domain)
-		goto out;
-
-	tmp = set_domain_for_dev(dev, domain);
-	if (!tmp || domain != tmp) {
-		domain_exit(domain);
-		domain = tmp;
-	}
-
-out:
-
-	return domain;
-}
-
 static int iommu_domain_identity_map(struct dmar_domain *domain,
 				     unsigned long long start,
 				     unsigned long long end)
@@ -2736,33 +2709,6 @@ static int domain_prepare_identity_map(struct device *dev,
 	return iommu_domain_identity_map(domain, start, end);
 }
 
-static int iommu_prepare_identity_map(struct device *dev,
-				      unsigned long long start,
-				      unsigned long long end)
-{
-	struct dmar_domain *domain;
-	int ret;
-
-	domain = get_domain_for_dev(dev, DEFAULT_DOMAIN_ADDRESS_WIDTH);
-	if (!domain)
-		return -ENOMEM;
-
-	ret = domain_prepare_identity_map(dev, domain, start, end);
-	if (ret)
-		domain_exit(domain);
-
-	return ret;
-}
-
-static inline int iommu_prepare_rmrr_dev(struct dmar_rmrr_unit *rmrr,
-					 struct device *dev)
-{
-	if (dev->archdata.iommu == DUMMY_DEVICE_DOMAIN_INFO)
-		return 0;
-	return iommu_prepare_identity_map(dev, rmrr->base_address,
-					  rmrr->end_address);
-}
-
 static int md_domain_init(struct dmar_domain *domain, int guest_width);
 
 static int __init si_domain_init(int hw)
@@ -4058,7 +4004,6 @@ static void __init init_iommu_pm_ops(void)
 static inline void init_iommu_pm_ops(void) {}
 #endif	/* CONFIG_PM */
 
-
 int __init dmar_parse_one_rmrr(struct acpi_dmar_header *header, void *arg)
 {
 	struct acpi_dmar_reserved_memory *rmrr;
-- 
2.17.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 7/7] iommu/vt-d: Consolidate domain_init() to avoid duplication
  2019-06-12  0:28 [PATCH v2 0/7] iommu/vt-d: Fixes and cleanups for linux-next Lu Baolu
                   ` (5 preceding siblings ...)
  2019-06-12  0:28 ` [PATCH v2 6/7] iommu/vt-d: Cleanup after delegating DMA domain to generic iommu Lu Baolu
@ 2019-06-12  0:28 ` Lu Baolu
  2019-07-18 23:16   ` Alex Williamson
  2019-06-12  8:37 ` [PATCH v2 0/7] iommu/vt-d: Fixes and cleanups for linux-next Joerg Roedel
  7 siblings, 1 reply; 18+ messages in thread
From: Lu Baolu @ 2019-06-12  0:28 UTC (permalink / raw)
  To: Joerg Roedel, David Woodhouse
  Cc: kevin.tian, ashok.raj, linux-kernel, iommu, cai, jacob.jun.pan

The domain_init() and md_domain_init() do almost the same job.
Consolidate them to avoid duplication.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel-iommu.c | 123 +++++++++++-------------------------
 1 file changed, 36 insertions(+), 87 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 5215dcd535a1..b8c6cf1d5f90 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -1825,63 +1825,6 @@ static inline int guestwidth_to_adjustwidth(int gaw)
 	return agaw;
 }
 
-static int domain_init(struct dmar_domain *domain, struct intel_iommu *iommu,
-		       int guest_width)
-{
-	int adjust_width, agaw;
-	unsigned long sagaw;
-	int err;
-
-	init_iova_domain(&domain->iovad, VTD_PAGE_SIZE, IOVA_START_PFN);
-
-	err = init_iova_flush_queue(&domain->iovad,
-				    iommu_flush_iova, iova_entry_free);
-	if (err)
-		return err;
-
-	domain_reserve_special_ranges(domain);
-
-	/* calculate AGAW */
-	if (guest_width > cap_mgaw(iommu->cap))
-		guest_width = cap_mgaw(iommu->cap);
-	domain->gaw = guest_width;
-	adjust_width = guestwidth_to_adjustwidth(guest_width);
-	agaw = width_to_agaw(adjust_width);
-	sagaw = cap_sagaw(iommu->cap);
-	if (!test_bit(agaw, &sagaw)) {
-		/* hardware doesn't support it, choose a bigger one */
-		pr_debug("Hardware doesn't support agaw %d\n", agaw);
-		agaw = find_next_bit(&sagaw, 5, agaw);
-		if (agaw >= 5)
-			return -ENODEV;
-	}
-	domain->agaw = agaw;
-
-	if (ecap_coherent(iommu->ecap))
-		domain->iommu_coherency = 1;
-	else
-		domain->iommu_coherency = 0;
-
-	if (ecap_sc_support(iommu->ecap))
-		domain->iommu_snooping = 1;
-	else
-		domain->iommu_snooping = 0;
-
-	if (intel_iommu_superpage)
-		domain->iommu_superpage = fls(cap_super_page_val(iommu->cap));
-	else
-		domain->iommu_superpage = 0;
-
-	domain->nid = iommu->node;
-
-	/* always allocate the top pgd */
-	domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid);
-	if (!domain->pgd)
-		return -ENOMEM;
-	__iommu_flush_cache(iommu, domain->pgd, PAGE_SIZE);
-	return 0;
-}
-
 static void domain_exit(struct dmar_domain *domain)
 {
 	struct page *freelist;
@@ -2563,6 +2506,31 @@ static int get_last_alias(struct pci_dev *pdev, u16 alias, void *opaque)
 	return 0;
 }
 
+static int domain_init(struct dmar_domain *domain, int guest_width)
+{
+	int adjust_width;
+
+	init_iova_domain(&domain->iovad, VTD_PAGE_SIZE, IOVA_START_PFN);
+	domain_reserve_special_ranges(domain);
+
+	/* calculate AGAW */
+	domain->gaw = guest_width;
+	adjust_width = guestwidth_to_adjustwidth(guest_width);
+	domain->agaw = width_to_agaw(adjust_width);
+
+	domain->iommu_coherency = 0;
+	domain->iommu_snooping = 0;
+	domain->iommu_superpage = 0;
+	domain->max_addr = 0;
+
+	/* always allocate the top pgd */
+	domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid);
+	if (!domain->pgd)
+		return -ENOMEM;
+	domain_flush_cache(domain, domain->pgd, PAGE_SIZE);
+	return 0;
+}
+
 static struct dmar_domain *find_or_alloc_domain(struct device *dev, int gaw)
 {
 	struct device_domain_info *info;
@@ -2600,11 +2568,19 @@ static struct dmar_domain *find_or_alloc_domain(struct device *dev, int gaw)
 	domain = alloc_domain(0);
 	if (!domain)
 		return NULL;
-	if (domain_init(domain, iommu, gaw)) {
+
+	if (domain_init(domain, gaw)) {
 		domain_exit(domain);
 		return NULL;
 	}
 
+	if (init_iova_flush_queue(&domain->iovad,
+				  iommu_flush_iova,
+				  iova_entry_free)) {
+		pr_warn("iova flush queue initialization failed\n");
+		intel_iommu_strict = 1;
+	}
+
 out:
 	return domain;
 }
@@ -2709,8 +2685,6 @@ static int domain_prepare_identity_map(struct device *dev,
 	return iommu_domain_identity_map(domain, start, end);
 }
 
-static int md_domain_init(struct dmar_domain *domain, int guest_width);
-
 static int __init si_domain_init(int hw)
 {
 	struct dmar_rmrr_unit *rmrr;
@@ -2721,7 +2695,7 @@ static int __init si_domain_init(int hw)
 	if (!si_domain)
 		return -EFAULT;
 
-	if (md_domain_init(si_domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) {
+	if (domain_init(si_domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) {
 		domain_exit(si_domain);
 		return -EFAULT;
 	}
@@ -4837,31 +4811,6 @@ static void dmar_remove_one_dev_info(struct device *dev)
 	spin_unlock_irqrestore(&device_domain_lock, flags);
 }
 
-static int md_domain_init(struct dmar_domain *domain, int guest_width)
-{
-	int adjust_width;
-
-	init_iova_domain(&domain->iovad, VTD_PAGE_SIZE, IOVA_START_PFN);
-	domain_reserve_special_ranges(domain);
-
-	/* calculate AGAW */
-	domain->gaw = guest_width;
-	adjust_width = guestwidth_to_adjustwidth(guest_width);
-	domain->agaw = width_to_agaw(adjust_width);
-
-	domain->iommu_coherency = 0;
-	domain->iommu_snooping = 0;
-	domain->iommu_superpage = 0;
-	domain->max_addr = 0;
-
-	/* always allocate the top pgd */
-	domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid);
-	if (!domain->pgd)
-		return -ENOMEM;
-	domain_flush_cache(domain, domain->pgd, PAGE_SIZE);
-	return 0;
-}
-
 static struct iommu_domain *intel_iommu_domain_alloc(unsigned type)
 {
 	struct dmar_domain *dmar_domain;
@@ -4876,7 +4825,7 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type)
 			pr_err("Can't allocate dmar_domain\n");
 			return NULL;
 		}
-		if (md_domain_init(dmar_domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) {
+		if (domain_init(dmar_domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) {
 			pr_err("Domain initialization failed\n");
 			domain_exit(dmar_domain);
 			return NULL;
-- 
2.17.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 0/7] iommu/vt-d: Fixes and cleanups for linux-next
  2019-06-12  0:28 [PATCH v2 0/7] iommu/vt-d: Fixes and cleanups for linux-next Lu Baolu
                   ` (6 preceding siblings ...)
  2019-06-12  0:28 ` [PATCH v2 7/7] iommu/vt-d: Consolidate domain_init() to avoid duplication Lu Baolu
@ 2019-06-12  8:37 ` Joerg Roedel
  7 siblings, 0 replies; 18+ messages in thread
From: Joerg Roedel @ 2019-06-12  8:37 UTC (permalink / raw)
  To: Lu Baolu
  Cc: kevin.tian, ashok.raj, linux-kernel, iommu, cai, jacob.jun.pan,
	David Woodhouse

On Wed, Jun 12, 2019 at 08:28:44AM +0800, Lu Baolu wrote:
> Lu Baolu (6):
>   iommu/vt-d: Don't return error when device gets right domain
>   iommu/vt-d: Set domain type for a private domain
>   iommu/vt-d: Don't enable iommu's which have been ignored
>   iommu/vt-d: Allow DMA domain attaching to rmrr locked device
>   iommu/vt-d: Fix suspicious RCU usage in probe_acpi_namespace_devices()
>   iommu/vt-d: Consolidate domain_init() to avoid duplication

Applied, thanks.
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 7/7] iommu/vt-d: Consolidate domain_init() to avoid duplication
  2019-06-12  0:28 ` [PATCH v2 7/7] iommu/vt-d: Consolidate domain_init() to avoid duplication Lu Baolu
@ 2019-07-18 23:16   ` Alex Williamson
  2019-07-19  8:27     ` Lu Baolu
  0 siblings, 1 reply; 18+ messages in thread
From: Alex Williamson @ 2019-07-18 23:16 UTC (permalink / raw)
  To: Lu Baolu
  Cc: kevin.tian, ashok.raj, linux-kernel, iommu, jacob.jun.pan,
	David Woodhouse

On Wed, 12 Jun 2019 08:28:51 +0800
Lu Baolu <baolu.lu@linux.intel.com> wrote:

> The domain_init() and md_domain_init() do almost the same job.
> Consolidate them to avoid duplication.
> 
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> ---
>  drivers/iommu/intel-iommu.c | 123 +++++++++++-------------------------
>  1 file changed, 36 insertions(+), 87 deletions(-)
> 
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 5215dcd535a1..b8c6cf1d5f90 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -1825,63 +1825,6 @@ static inline int guestwidth_to_adjustwidth(int gaw)
>  	return agaw;
>  }
>  
> -static int domain_init(struct dmar_domain *domain, struct intel_iommu *iommu,
> -		       int guest_width)
> -{
> -	int adjust_width, agaw;
> -	unsigned long sagaw;
> -	int err;
> -
> -	init_iova_domain(&domain->iovad, VTD_PAGE_SIZE, IOVA_START_PFN);
> -
> -	err = init_iova_flush_queue(&domain->iovad,
> -				    iommu_flush_iova, iova_entry_free);
> -	if (err)
> -		return err;
> -
> -	domain_reserve_special_ranges(domain);
> -
> -	/* calculate AGAW */
> -	if (guest_width > cap_mgaw(iommu->cap))
> -		guest_width = cap_mgaw(iommu->cap);
> -	domain->gaw = guest_width;
> -	adjust_width = guestwidth_to_adjustwidth(guest_width);
> -	agaw = width_to_agaw(adjust_width);
> -	sagaw = cap_sagaw(iommu->cap);
> -	if (!test_bit(agaw, &sagaw)) {
> -		/* hardware doesn't support it, choose a bigger one */
> -		pr_debug("Hardware doesn't support agaw %d\n", agaw);
> -		agaw = find_next_bit(&sagaw, 5, agaw);
> -		if (agaw >= 5)
> -			return -ENODEV;
> -	}
> -	domain->agaw = agaw;
> -
> -	if (ecap_coherent(iommu->ecap))
> -		domain->iommu_coherency = 1;
> -	else
> -		domain->iommu_coherency = 0;
> -
> -	if (ecap_sc_support(iommu->ecap))
> -		domain->iommu_snooping = 1;
> -	else
> -		domain->iommu_snooping = 0;
> -
> -	if (intel_iommu_superpage)
> -		domain->iommu_superpage = fls(cap_super_page_val(iommu->cap));
> -	else
> -		domain->iommu_superpage = 0;
> -
> -	domain->nid = iommu->node;
> -
> -	/* always allocate the top pgd */
> -	domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid);
> -	if (!domain->pgd)
> -		return -ENOMEM;
> -	__iommu_flush_cache(iommu, domain->pgd, PAGE_SIZE);
> -	return 0;
> -}
> -
>  static void domain_exit(struct dmar_domain *domain)
>  {
>  	struct page *freelist;
> @@ -2563,6 +2506,31 @@ static int get_last_alias(struct pci_dev *pdev, u16 alias, void *opaque)
>  	return 0;
>  }
>  
> +static int domain_init(struct dmar_domain *domain, int guest_width)
> +{
> +	int adjust_width;
> +
> +	init_iova_domain(&domain->iovad, VTD_PAGE_SIZE, IOVA_START_PFN);
> +	domain_reserve_special_ranges(domain);
> +
> +	/* calculate AGAW */
> +	domain->gaw = guest_width;
> +	adjust_width = guestwidth_to_adjustwidth(guest_width);
> +	domain->agaw = width_to_agaw(adjust_width);


How do we justify that domain->agaw is nothing like it was previously
here?  I spent some more time working on the failure to boot that I
thought was caused by 4ec066c7b147, but there are so many breakages and
fixes in Joerg's x86/vt-d branch that I think my bisect zero'd in on
the wrong one.  Instead I cherry-picked every commit from Joerg's tree
and matched Fixes patches to their original commit, which led me to
this patch, mainline commit 123b2ffc376e.  The issue I'm seeing is that
we call domain_context_mapping_one() and we are in this section:

        struct dma_pte *pgd = domain->pgd;
        int agaw;

        context_set_domain_id(context, did);

        if (translation != CONTEXT_TT_PASS_THROUGH) {
                /*
                 * Skip top levels of page tables for iommu which has
                 * less agaw than default. Unnecessary for PT mode.
                 */
                for (agaw = domain->agaw; agaw > iommu->agaw; agaw--) {
                        ret = -ENOMEM;
                        pgd = phys_to_virt(dma_pte_addr(pgd));
                        if (!dma_pte_present(pgd))
                                goto out_unlock;
                }

Prior to this commit, we had domain->agaw=1 and iommu->agaw=1, so we
don't enter the loop.  With this commit, we have domain->agaw=3,
iommu->agaw=1 with pgd->val=0!

I don't really follow how the setting of these fields above is
equivalent to what they were previously or if they're supposed to be
updated lazily, but the current behavior is non-functional.  Commit
123b2ffc376e can be reverted with only a bit of offset, which brings
Linus' tree back into working operation for me.  Should we revert or is
there an obvious fix here?  Thanks,

Alex
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 7/7] iommu/vt-d: Consolidate domain_init() to avoid duplication
  2019-07-18 23:16   ` Alex Williamson
@ 2019-07-19  8:27     ` Lu Baolu
  2019-07-19 15:19       ` Alex Williamson
  0 siblings, 1 reply; 18+ messages in thread
From: Lu Baolu @ 2019-07-19  8:27 UTC (permalink / raw)
  To: Alex Williamson
  Cc: kevin.tian, ashok.raj, linux-kernel, iommu, jacob.jun.pan,
	David Woodhouse

Hi Alex,

On 7/19/19 7:16 AM, Alex Williamson wrote:
> On Wed, 12 Jun 2019 08:28:51 +0800
> Lu Baolu <baolu.lu@linux.intel.com> wrote:
> 
>> The domain_init() and md_domain_init() do almost the same job.
>> Consolidate them to avoid duplication.
>>
>> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
>> ---
>>   drivers/iommu/intel-iommu.c | 123 +++++++++++-------------------------
>>   1 file changed, 36 insertions(+), 87 deletions(-)
>>
>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>> index 5215dcd535a1..b8c6cf1d5f90 100644
>> --- a/drivers/iommu/intel-iommu.c
>> +++ b/drivers/iommu/intel-iommu.c
>> @@ -1825,63 +1825,6 @@ static inline int guestwidth_to_adjustwidth(int gaw)
>>   	return agaw;
>>   }
>>   
>> -static int domain_init(struct dmar_domain *domain, struct intel_iommu *iommu,
>> -		       int guest_width)
>> -{
>> -	int adjust_width, agaw;
>> -	unsigned long sagaw;
>> -	int err;
>> -
>> -	init_iova_domain(&domain->iovad, VTD_PAGE_SIZE, IOVA_START_PFN);
>> -
>> -	err = init_iova_flush_queue(&domain->iovad,
>> -				    iommu_flush_iova, iova_entry_free);
>> -	if (err)
>> -		return err;
>> -
>> -	domain_reserve_special_ranges(domain);
>> -
>> -	/* calculate AGAW */
>> -	if (guest_width > cap_mgaw(iommu->cap))
>> -		guest_width = cap_mgaw(iommu->cap);
>> -	domain->gaw = guest_width;
>> -	adjust_width = guestwidth_to_adjustwidth(guest_width);
>> -	agaw = width_to_agaw(adjust_width);
>> -	sagaw = cap_sagaw(iommu->cap);
>> -	if (!test_bit(agaw, &sagaw)) {
>> -		/* hardware doesn't support it, choose a bigger one */
>> -		pr_debug("Hardware doesn't support agaw %d\n", agaw);
>> -		agaw = find_next_bit(&sagaw, 5, agaw);
>> -		if (agaw >= 5)
>> -			return -ENODEV;
>> -	}
>> -	domain->agaw = agaw;
>> -
>> -	if (ecap_coherent(iommu->ecap))
>> -		domain->iommu_coherency = 1;
>> -	else
>> -		domain->iommu_coherency = 0;
>> -
>> -	if (ecap_sc_support(iommu->ecap))
>> -		domain->iommu_snooping = 1;
>> -	else
>> -		domain->iommu_snooping = 0;
>> -
>> -	if (intel_iommu_superpage)
>> -		domain->iommu_superpage = fls(cap_super_page_val(iommu->cap));
>> -	else
>> -		domain->iommu_superpage = 0;
>> -
>> -	domain->nid = iommu->node;
>> -
>> -	/* always allocate the top pgd */
>> -	domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid);
>> -	if (!domain->pgd)
>> -		return -ENOMEM;
>> -	__iommu_flush_cache(iommu, domain->pgd, PAGE_SIZE);
>> -	return 0;
>> -}
>> -
>>   static void domain_exit(struct dmar_domain *domain)
>>   {
>>   	struct page *freelist;
>> @@ -2563,6 +2506,31 @@ static int get_last_alias(struct pci_dev *pdev, u16 alias, void *opaque)
>>   	return 0;
>>   }
>>   
>> +static int domain_init(struct dmar_domain *domain, int guest_width)
>> +{
>> +	int adjust_width;
>> +
>> +	init_iova_domain(&domain->iovad, VTD_PAGE_SIZE, IOVA_START_PFN);
>> +	domain_reserve_special_ranges(domain);
>> +
>> +	/* calculate AGAW */
>> +	domain->gaw = guest_width;
>> +	adjust_width = guestwidth_to_adjustwidth(guest_width);
>> +	domain->agaw = width_to_agaw(adjust_width);
> 
> 
> How do we justify that domain->agaw is nothing like it was previously
> here?  I spent some more time working on the failure to boot that I
> thought was caused by 4ec066c7b147, but there are so many breakages and
> fixes in Joerg's x86/vt-d branch that I think my bisect zero'd in on
> the wrong one.  Instead I cherry-picked every commit from Joerg's tree
> and matched Fixes patches to their original commit, which led me to
> this patch, mainline commit 123b2ffc376e.  The issue I'm seeing is that
> we call domain_context_mapping_one() and we are in this section:
> 
>          struct dma_pte *pgd = domain->pgd;
>          int agaw;
> 
>          context_set_domain_id(context, did);
> 
>          if (translation != CONTEXT_TT_PASS_THROUGH) {
>                  /*
>                   * Skip top levels of page tables for iommu which has
>                   * less agaw than default. Unnecessary for PT mode.
>                   */
>                  for (agaw = domain->agaw; agaw > iommu->agaw; agaw--) {
>                          ret = -ENOMEM;
>                          pgd = phys_to_virt(dma_pte_addr(pgd));
>                          if (!dma_pte_present(pgd))
>                                  goto out_unlock;
>                  }
> 
> Prior to this commit, we had domain->agaw=1 and iommu->agaw=1, so we
> don't enter the loop.  With this commit, we have domain->agaw=3,
> iommu->agaw=1 with pgd->val=0!
> 

iommu->agaw presents the level of page table which is by default
supported by the @iommu. domain->agaw presents the level of page
table which is used by the @domain.

agaw = 1: 3-level page table
agaw = 2: 4-level page table
agaw = 3: 5-level page table

The case here is that @iommu only supports 3-level page table, but the
@domain was set to use 5-level page table. So we must skip level 4 and 5
page tables of the @domain.

This code in the loop looks odd to me. It will always goto to unlock and
leave pgd->val==0. How about below change? (not tested yet!)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 412f18aba501..98d6878cd29d 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -2020,11 +2020,15 @@ static int domain_context_mapping_one(struct 
dmar_domain *domain,
                          * Skip top levels of page tables for iommu 
which has
                          * less agaw than default. Unnecessary for PT mode.
                          */
-                       for (agaw = domain->agaw; agaw > iommu->agaw; 
agaw--) {
-                               ret = -ENOMEM;
-                               pgd = phys_to_virt(dma_pte_addr(pgd));
-                               if (!dma_pte_present(pgd))
-                                       goto out_unlock;
+                       while (iommu->agaw < domain->agaw) {
+                               struct dma_pte *pte;
+
+                               pte = domain->pgd;
+                               if (dma_pte_present(pte)) {
+                                       domain->pgd = 
phys_to_virt(dma_pte_addr(pte));
+                                       free_pgtable_page(pte);
+                               }
+                               domain->agaw--;
                         }

                         info = iommu_support_dev_iotlb(domain, iommu, 
bus, devfn);

> I don't really follow how the setting of these fields above is
> equivalent to what they were previously or if they're supposed to be
> updated lazily, but the current behavior is non-functional.  Commit
> 123b2ffc376e can be reverted with only a bit of offset, which brings
> Linus' tree back into working operation for me.  Should we revert or is
> there an obvious fix here?  Thanks,

This commit is not the root cause of this issue as far as I can see, it
only triggers above loop to get entered.

Best regards,
Baolu
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 7/7] iommu/vt-d: Consolidate domain_init() to avoid duplication
  2019-07-19  8:27     ` Lu Baolu
@ 2019-07-19 15:19       ` Alex Williamson
  2019-07-20  1:15         ` Lu Baolu
  0 siblings, 1 reply; 18+ messages in thread
From: Alex Williamson @ 2019-07-19 15:19 UTC (permalink / raw)
  To: Lu Baolu
  Cc: kevin.tian, ashok.raj, linux-kernel, iommu, jacob.jun.pan,
	David Woodhouse

On Fri, 19 Jul 2019 16:27:04 +0800
Lu Baolu <baolu.lu@linux.intel.com> wrote:

> Hi Alex,
> 
> On 7/19/19 7:16 AM, Alex Williamson wrote:
> > On Wed, 12 Jun 2019 08:28:51 +0800
> > Lu Baolu <baolu.lu@linux.intel.com> wrote:
> >   
> >> The domain_init() and md_domain_init() do almost the same job.
> >> Consolidate them to avoid duplication.
> >>
> >> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> >> ---
> >>   drivers/iommu/intel-iommu.c | 123 +++++++++++-------------------------
> >>   1 file changed, 36 insertions(+), 87 deletions(-)
> >>
> >> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> >> index 5215dcd535a1..b8c6cf1d5f90 100644
> >> --- a/drivers/iommu/intel-iommu.c
> >> +++ b/drivers/iommu/intel-iommu.c
> >> @@ -1825,63 +1825,6 @@ static inline int guestwidth_to_adjustwidth(int gaw)
> >>   	return agaw;
> >>   }
> >>   
> >> -static int domain_init(struct dmar_domain *domain, struct intel_iommu *iommu,
> >> -		       int guest_width)
> >> -{
> >> -	int adjust_width, agaw;
> >> -	unsigned long sagaw;
> >> -	int err;
> >> -
> >> -	init_iova_domain(&domain->iovad, VTD_PAGE_SIZE, IOVA_START_PFN);
> >> -
> >> -	err = init_iova_flush_queue(&domain->iovad,
> >> -				    iommu_flush_iova, iova_entry_free);
> >> -	if (err)
> >> -		return err;
> >> -
> >> -	domain_reserve_special_ranges(domain);
> >> -
> >> -	/* calculate AGAW */
> >> -	if (guest_width > cap_mgaw(iommu->cap))
> >> -		guest_width = cap_mgaw(iommu->cap);
> >> -	domain->gaw = guest_width;
> >> -	adjust_width = guestwidth_to_adjustwidth(guest_width);
> >> -	agaw = width_to_agaw(adjust_width);
> >> -	sagaw = cap_sagaw(iommu->cap);
> >> -	if (!test_bit(agaw, &sagaw)) {
> >> -		/* hardware doesn't support it, choose a bigger one */
> >> -		pr_debug("Hardware doesn't support agaw %d\n", agaw);
> >> -		agaw = find_next_bit(&sagaw, 5, agaw);
> >> -		if (agaw >= 5)
> >> -			return -ENODEV;
> >> -	}
> >> -	domain->agaw = agaw;
> >> -
> >> -	if (ecap_coherent(iommu->ecap))
> >> -		domain->iommu_coherency = 1;
> >> -	else
> >> -		domain->iommu_coherency = 0;
> >> -
> >> -	if (ecap_sc_support(iommu->ecap))
> >> -		domain->iommu_snooping = 1;
> >> -	else
> >> -		domain->iommu_snooping = 0;
> >> -
> >> -	if (intel_iommu_superpage)
> >> -		domain->iommu_superpage = fls(cap_super_page_val(iommu->cap));
> >> -	else
> >> -		domain->iommu_superpage = 0;
> >> -
> >> -	domain->nid = iommu->node;
> >> -
> >> -	/* always allocate the top pgd */
> >> -	domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid);
> >> -	if (!domain->pgd)
> >> -		return -ENOMEM;
> >> -	__iommu_flush_cache(iommu, domain->pgd, PAGE_SIZE);
> >> -	return 0;
> >> -}
> >> -
> >>   static void domain_exit(struct dmar_domain *domain)
> >>   {
> >>   	struct page *freelist;
> >> @@ -2563,6 +2506,31 @@ static int get_last_alias(struct pci_dev *pdev, u16 alias, void *opaque)
> >>   	return 0;
> >>   }
> >>   
> >> +static int domain_init(struct dmar_domain *domain, int guest_width)
> >> +{
> >> +	int adjust_width;
> >> +
> >> +	init_iova_domain(&domain->iovad, VTD_PAGE_SIZE, IOVA_START_PFN);
> >> +	domain_reserve_special_ranges(domain);
> >> +
> >> +	/* calculate AGAW */
> >> +	domain->gaw = guest_width;
> >> +	adjust_width = guestwidth_to_adjustwidth(guest_width);
> >> +	domain->agaw = width_to_agaw(adjust_width);  
> > 
> > 
> > How do we justify that domain->agaw is nothing like it was previously
> > here?  I spent some more time working on the failure to boot that I
> > thought was caused by 4ec066c7b147, but there are so many breakages and
> > fixes in Joerg's x86/vt-d branch that I think my bisect zero'd in on
> > the wrong one.  Instead I cherry-picked every commit from Joerg's tree
> > and matched Fixes patches to their original commit, which led me to
> > this patch, mainline commit 123b2ffc376e.  The issue I'm seeing is that
> > we call domain_context_mapping_one() and we are in this section:
> > 
> >          struct dma_pte *pgd = domain->pgd;
> >          int agaw;
> > 
> >          context_set_domain_id(context, did);
> > 
> >          if (translation != CONTEXT_TT_PASS_THROUGH) {
> >                  /*
> >                   * Skip top levels of page tables for iommu which has
> >                   * less agaw than default. Unnecessary for PT mode.
> >                   */
> >                  for (agaw = domain->agaw; agaw > iommu->agaw; agaw--) {
> >                          ret = -ENOMEM;
> >                          pgd = phys_to_virt(dma_pte_addr(pgd));
> >                          if (!dma_pte_present(pgd))
> >                                  goto out_unlock;
> >                  }
> > 
> > Prior to this commit, we had domain->agaw=1 and iommu->agaw=1, so we
> > don't enter the loop.  With this commit, we have domain->agaw=3,
> > iommu->agaw=1 with pgd->val=0!
> >   
> 
> iommu->agaw presents the level of page table which is by default
> supported by the @iommu. domain->agaw presents the level of page
> table which is used by the @domain.
> 
> agaw = 1: 3-level page table
> agaw = 2: 4-level page table
> agaw = 3: 5-level page table
> 
> The case here is that @iommu only supports 3-level page table, but the
> @domain was set to use 5-level page table. So we must skip level 4 and 5
> page tables of the @domain.
> 
> This code in the loop looks odd to me. It will always goto to unlock and
> leave pgd->val==0. How about below change? (not tested yet!)
> 
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 412f18aba501..98d6878cd29d 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -2020,11 +2020,15 @@ static int domain_context_mapping_one(struct 
> dmar_domain *domain,
>                           * Skip top levels of page tables for iommu 
> which has
>                           * less agaw than default. Unnecessary for PT mode.
>                           */
> -                       for (agaw = domain->agaw; agaw > iommu->agaw; 
> agaw--) {
> -                               ret = -ENOMEM;
> -                               pgd = phys_to_virt(dma_pte_addr(pgd));
> -                               if (!dma_pte_present(pgd))
> -                                       goto out_unlock;
> +                       while (iommu->agaw < domain->agaw) {
> +                               struct dma_pte *pte;
> +
> +                               pte = domain->pgd;
> +                               if (dma_pte_present(pte)) {
> +                                       domain->pgd = 
> phys_to_virt(dma_pte_addr(pte));
> +                                       free_pgtable_page(pte);
> +                               }
> +                               domain->agaw--;
>                          }
> 
>                          info = iommu_support_dev_iotlb(domain, iommu, 
> bus, devfn);


The system still fails to boot, now with DMAR faults indicating invalid
context entry.
 
> > I don't really follow how the setting of these fields above is
> > equivalent to what they were previously or if they're supposed to be
> > updated lazily, but the current behavior is non-functional.  Commit
> > 123b2ffc376e can be reverted with only a bit of offset, which brings
> > Linus' tree back into working operation for me.  Should we revert or is
> > there an obvious fix here?  Thanks,  
> 
> This commit is not the root cause of this issue as far as I can see, it
> only triggers above loop to get entered.

The system fails to boot with this commit, it does boot without this
commit.  Whether this commit exposed a latent issue or introduced a bug
itself is academic.  Thanks,

Alex
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 7/7] iommu/vt-d: Consolidate domain_init() to avoid duplication
  2019-07-19 15:19       ` Alex Williamson
@ 2019-07-20  1:15         ` Lu Baolu
  2019-07-22 14:22           ` Joerg Roedel
  0 siblings, 1 reply; 18+ messages in thread
From: Lu Baolu @ 2019-07-20  1:15 UTC (permalink / raw)
  To: Alex Williamson
  Cc: kevin.tian, ashok.raj, linux-kernel, iommu, jacob.jun.pan,
	David Woodhouse

Hi Alex,

On 7/19/19 11:19 PM, Alex Williamson wrote:
> On Fri, 19 Jul 2019 16:27:04 +0800
> Lu Baolu <baolu.lu@linux.intel.com> wrote:
> 
>> Hi Alex,
>>
>> On 7/19/19 7:16 AM, Alex Williamson wrote:
>>> On Wed, 12 Jun 2019 08:28:51 +0800
>>> Lu Baolu <baolu.lu@linux.intel.com> wrote:
>>>    
>>>> The domain_init() and md_domain_init() do almost the same job.
>>>> Consolidate them to avoid duplication.
>>>>
>>>> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
>>>> ---
>>>>    drivers/iommu/intel-iommu.c | 123 +++++++++++-------------------------
>>>>    1 file changed, 36 insertions(+), 87 deletions(-)
>>>>
>>>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>>>> index 5215dcd535a1..b8c6cf1d5f90 100644
>>>> --- a/drivers/iommu/intel-iommu.c
>>>> +++ b/drivers/iommu/intel-iommu.c
>>>> @@ -1825,63 +1825,6 @@ static inline int guestwidth_to_adjustwidth(int gaw)
>>>>    	return agaw;
>>>>    }
>>>>    
>>>> -static int domain_init(struct dmar_domain *domain, struct intel_iommu *iommu,
>>>> -		       int guest_width)
>>>> -{
>>>> -	int adjust_width, agaw;
>>>> -	unsigned long sagaw;
>>>> -	int err;
>>>> -
>>>> -	init_iova_domain(&domain->iovad, VTD_PAGE_SIZE, IOVA_START_PFN);
>>>> -
>>>> -	err = init_iova_flush_queue(&domain->iovad,
>>>> -				    iommu_flush_iova, iova_entry_free);
>>>> -	if (err)
>>>> -		return err;
>>>> -
>>>> -	domain_reserve_special_ranges(domain);
>>>> -
>>>> -	/* calculate AGAW */
>>>> -	if (guest_width > cap_mgaw(iommu->cap))
>>>> -		guest_width = cap_mgaw(iommu->cap);
>>>> -	domain->gaw = guest_width;
>>>> -	adjust_width = guestwidth_to_adjustwidth(guest_width);
>>>> -	agaw = width_to_agaw(adjust_width);
>>>> -	sagaw = cap_sagaw(iommu->cap);
>>>> -	if (!test_bit(agaw, &sagaw)) {
>>>> -		/* hardware doesn't support it, choose a bigger one */
>>>> -		pr_debug("Hardware doesn't support agaw %d\n", agaw);
>>>> -		agaw = find_next_bit(&sagaw, 5, agaw);
>>>> -		if (agaw >= 5)
>>>> -			return -ENODEV;
>>>> -	}
>>>> -	domain->agaw = agaw;
>>>> -
>>>> -	if (ecap_coherent(iommu->ecap))
>>>> -		domain->iommu_coherency = 1;
>>>> -	else
>>>> -		domain->iommu_coherency = 0;
>>>> -
>>>> -	if (ecap_sc_support(iommu->ecap))
>>>> -		domain->iommu_snooping = 1;
>>>> -	else
>>>> -		domain->iommu_snooping = 0;
>>>> -
>>>> -	if (intel_iommu_superpage)
>>>> -		domain->iommu_superpage = fls(cap_super_page_val(iommu->cap));
>>>> -	else
>>>> -		domain->iommu_superpage = 0;
>>>> -
>>>> -	domain->nid = iommu->node;
>>>> -
>>>> -	/* always allocate the top pgd */
>>>> -	domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid);
>>>> -	if (!domain->pgd)
>>>> -		return -ENOMEM;
>>>> -	__iommu_flush_cache(iommu, domain->pgd, PAGE_SIZE);
>>>> -	return 0;
>>>> -}
>>>> -
>>>>    static void domain_exit(struct dmar_domain *domain)
>>>>    {
>>>>    	struct page *freelist;
>>>> @@ -2563,6 +2506,31 @@ static int get_last_alias(struct pci_dev *pdev, u16 alias, void *opaque)
>>>>    	return 0;
>>>>    }
>>>>    
>>>> +static int domain_init(struct dmar_domain *domain, int guest_width)
>>>> +{
>>>> +	int adjust_width;
>>>> +
>>>> +	init_iova_domain(&domain->iovad, VTD_PAGE_SIZE, IOVA_START_PFN);
>>>> +	domain_reserve_special_ranges(domain);
>>>> +
>>>> +	/* calculate AGAW */
>>>> +	domain->gaw = guest_width;
>>>> +	adjust_width = guestwidth_to_adjustwidth(guest_width);
>>>> +	domain->agaw = width_to_agaw(adjust_width);
>>>
>>>
>>> How do we justify that domain->agaw is nothing like it was previously
>>> here?  I spent some more time working on the failure to boot that I
>>> thought was caused by 4ec066c7b147, but there are so many breakages and
>>> fixes in Joerg's x86/vt-d branch that I think my bisect zero'd in on
>>> the wrong one.  Instead I cherry-picked every commit from Joerg's tree
>>> and matched Fixes patches to their original commit, which led me to
>>> this patch, mainline commit 123b2ffc376e.  The issue I'm seeing is that
>>> we call domain_context_mapping_one() and we are in this section:
>>>
>>>           struct dma_pte *pgd = domain->pgd;
>>>           int agaw;
>>>
>>>           context_set_domain_id(context, did);
>>>
>>>           if (translation != CONTEXT_TT_PASS_THROUGH) {
>>>                   /*
>>>                    * Skip top levels of page tables for iommu which has
>>>                    * less agaw than default. Unnecessary for PT mode.
>>>                    */
>>>                   for (agaw = domain->agaw; agaw > iommu->agaw; agaw--) {
>>>                           ret = -ENOMEM;
>>>                           pgd = phys_to_virt(dma_pte_addr(pgd));
>>>                           if (!dma_pte_present(pgd))
>>>                                   goto out_unlock;
>>>                   }
>>>
>>> Prior to this commit, we had domain->agaw=1 and iommu->agaw=1, so we
>>> don't enter the loop.  With this commit, we have domain->agaw=3,
>>> iommu->agaw=1 with pgd->val=0!
>>>    
>>
>> iommu->agaw presents the level of page table which is by default
>> supported by the @iommu. domain->agaw presents the level of page
>> table which is used by the @domain.
>>
>> agaw = 1: 3-level page table
>> agaw = 2: 4-level page table
>> agaw = 3: 5-level page table
>>
>> The case here is that @iommu only supports 3-level page table, but the
>> @domain was set to use 5-level page table. So we must skip level 4 and 5
>> page tables of the @domain.
>>
>> This code in the loop looks odd to me. It will always goto to unlock and
>> leave pgd->val==0. How about below change? (not tested yet!)
>>
>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>> index 412f18aba501..98d6878cd29d 100644
>> --- a/drivers/iommu/intel-iommu.c
>> +++ b/drivers/iommu/intel-iommu.c
>> @@ -2020,11 +2020,15 @@ static int domain_context_mapping_one(struct
>> dmar_domain *domain,
>>                            * Skip top levels of page tables for iommu
>> which has
>>                            * less agaw than default. Unnecessary for PT mode.
>>                            */
>> -                       for (agaw = domain->agaw; agaw > iommu->agaw;
>> agaw--) {
>> -                               ret = -ENOMEM;
>> -                               pgd = phys_to_virt(dma_pte_addr(pgd));
>> -                               if (!dma_pte_present(pgd))
>> -                                       goto out_unlock;
>> +                       while (iommu->agaw < domain->agaw) {
>> +                               struct dma_pte *pte;
>> +
>> +                               pte = domain->pgd;
>> +                               if (dma_pte_present(pte)) {
>> +                                       domain->pgd =
>> phys_to_virt(dma_pte_addr(pte));
>> +                                       free_pgtable_page(pte);
>> +                               }
>> +                               domain->agaw--;
>>                           }
>>
>>                           info = iommu_support_dev_iotlb(domain, iommu,
>> bus, devfn);
> 
> 
> The system still fails to boot, now with DMAR faults indicating invalid
> context entry.

Ah! It demands more investigation. Thank you for testing this.

>   
>>> I don't really follow how the setting of these fields above is
>>> equivalent to what they were previously or if they're supposed to be
>>> updated lazily, but the current behavior is non-functional.  Commit
>>> 123b2ffc376e can be reverted with only a bit of offset, which brings
>>> Linus' tree back into working operation for me.  Should we revert or is
>>> there an obvious fix here?  Thanks,
>>
>> This commit is not the root cause of this issue as far as I can see, it
>> only triggers above loop to get entered.
> 
> The system fails to boot with this commit, it does boot without this
> commit.  Whether this commit exposed a latent issue or introduced a bug
> itself is academic.  Thanks,

Okay. I agree wit you. Let's revert this commit first.

Best regards,
Baolu
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 7/7] iommu/vt-d: Consolidate domain_init() to avoid duplication
  2019-07-20  1:15         ` Lu Baolu
@ 2019-07-22 14:22           ` Joerg Roedel
  2019-07-25  1:41             ` Lu Baolu
  0 siblings, 1 reply; 18+ messages in thread
From: Joerg Roedel @ 2019-07-22 14:22 UTC (permalink / raw)
  To: Lu Baolu
  Cc: kevin.tian, ashok.raj, linux-kernel, iommu, Alex Williamson,
	jacob.jun.pan, David Woodhouse

On Sat, Jul 20, 2019 at 09:15:58AM +0800, Lu Baolu wrote:
> Okay. I agree wit you. Let's revert this commit first.

Reverted the patch and queued it to my iommu/fixes branch.


Regards,

	Joerg
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 7/7] iommu/vt-d: Consolidate domain_init() to avoid duplication
  2019-07-22 14:22           ` Joerg Roedel
@ 2019-07-25  1:41             ` Lu Baolu
  0 siblings, 0 replies; 18+ messages in thread
From: Lu Baolu @ 2019-07-25  1:41 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: kevin.tian, ashok.raj, linux-kernel, iommu, Alex Williamson,
	jacob.jun.pan, David Woodhouse

Hi,

On 7/22/19 10:22 PM, Joerg Roedel wrote:
> On Sat, Jul 20, 2019 at 09:15:58AM +0800, Lu Baolu wrote:
>> Okay. I agree wit you. Let's revert this commit first.
> 
> Reverted the patch and queued it to my iommu/fixes branch.

Thank you! Joerg.

Best regards,
Baolu
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 5/7] iommu/vt-d: Fix suspicious RCU usage in probe_acpi_namespace_devices()
  2019-06-12  0:28 ` [PATCH v2 5/7] iommu/vt-d: Fix suspicious RCU usage in probe_acpi_namespace_devices() Lu Baolu
@ 2022-06-29 13:03   ` Robin Murphy
  2022-07-01  7:19     ` Baolu Lu
  0 siblings, 1 reply; 18+ messages in thread
From: Robin Murphy @ 2022-06-29 13:03 UTC (permalink / raw)
  To: Lu Baolu, Joerg Roedel, David Woodhouse
  Cc: kevin.tian, ashok.raj, linux-kernel, iommu, cai, jacob.jun.pan

On 2019-06-12 01:28, Lu Baolu wrote:
> The drhd and device scope list should be iterated with the
> iommu global lock held. Otherwise, a suspicious RCU usage
> message will be displayed.
> 
> [    3.695886] =============================
> [    3.695917] WARNING: suspicious RCU usage
> [    3.695950] 5.2.0-rc2+ #2467 Not tainted
> [    3.695981] -----------------------------
> [    3.696014] drivers/iommu/intel-iommu.c:4569 suspicious rcu_dereference_check() usage!
> [    3.696069]
>                 other info that might help us debug this:
> 
> [    3.696126]
>                 rcu_scheduler_active = 2, debug_locks = 1
> [    3.696173] no locks held by swapper/0/1.
> [    3.696204]
>                 stack backtrace:
> [    3.696241] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.2.0-rc2+ #2467
> [    3.696370] Call Trace:
> [    3.696404]  dump_stack+0x85/0xcb
> [    3.696441]  intel_iommu_init+0x128c/0x13ce
> [    3.696478]  ? kmem_cache_free+0x16b/0x2c0
> [    3.696516]  ? __fput+0x14b/0x270
> [    3.696550]  ? __call_rcu+0xb7/0x300
> [    3.696583]  ? get_max_files+0x10/0x10
> [    3.696631]  ? set_debug_rodata+0x11/0x11
> [    3.696668]  ? e820__memblock_setup+0x60/0x60
> [    3.696704]  ? pci_iommu_init+0x16/0x3f
> [    3.696737]  ? set_debug_rodata+0x11/0x11
> [    3.696770]  pci_iommu_init+0x16/0x3f
> [    3.696805]  do_one_initcall+0x5d/0x2e4
> [    3.696844]  ? set_debug_rodata+0x11/0x11
> [    3.696880]  ? rcu_read_lock_sched_held+0x6b/0x80
> [    3.696924]  kernel_init_freeable+0x1f0/0x27c
> [    3.696961]  ? rest_init+0x260/0x260
> [    3.696997]  kernel_init+0xa/0x110
> [    3.697028]  ret_from_fork+0x3a/0x50
> 
> Fixes: fa212a97f3a36 ("iommu/vt-d: Probe DMA-capable ACPI name space devices")
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> ---
>   drivers/iommu/intel-iommu.c | 2 ++
>   1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 19c4c387a3f6..84e650c6a46d 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -4793,8 +4793,10 @@ int __init intel_iommu_init(void)
>   	cpuhp_setup_state(CPUHP_IOMMU_INTEL_DEAD, "iommu/intel:dead", NULL,
>   			  intel_iommu_cpu_dead);
>   
> +	down_read(&dmar_global_lock);
>   	if (probe_acpi_namespace_devices())
>   		pr_warn("ACPI name space devices didn't probe correctly\n");
> +	up_read(&dmar_global_lock);

Doing a bit of archaeology here, is this actually broken? If any ANDD 
entries exist, we'd end up doing:

   down_read(&dmar_global_lock)
   probe_acpi_namespace_devices()
   -> iommu_probe_device()
      -> iommu_create_device_direct_mappings()
         -> iommu_get_resv_regions()
            -> intel_iommu_get_resv_regions()
               -> down_read(&dmar_global_lock)

I'm wondering whether this might explain why my bus_set_iommu series 
prevented Baolu's machine from booting, since "iommu: Move bus setup to 
IOMMU device registration" creates the same condition where we end up in 
get_resv_regions (via bus_iommu_probe() this time) from the same task 
that already holds dmar_global_lock. Of course that leaves me wondering 
how it *did* manage to boot OK on my Xeon box, but maybe there's a 
config difference or dumb luck at play?

Robin.

>   
>   	/* Finally, we enable the DMA remapping hardware. */
>   	for_each_iommu(iommu, drhd) {
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 5/7] iommu/vt-d: Fix suspicious RCU usage in probe_acpi_namespace_devices()
  2022-06-29 13:03   ` Robin Murphy
@ 2022-07-01  7:19     ` Baolu Lu
  2022-07-01  8:18       ` Robin Murphy
  0 siblings, 1 reply; 18+ messages in thread
From: Baolu Lu @ 2022-07-01  7:19 UTC (permalink / raw)
  To: Robin Murphy, Joerg Roedel, David Woodhouse
  Cc: kevin.tian, ashok.raj, linux-kernel, iommu, cai, jacob.jun.pan

On 2022/6/29 21:03, Robin Murphy wrote:
> On 2019-06-12 01:28, Lu Baolu wrote:
>> The drhd and device scope list should be iterated with the
>> iommu global lock held. Otherwise, a suspicious RCU usage
>> message will be displayed.
>>
>> [    3.695886] =============================
>> [    3.695917] WARNING: suspicious RCU usage
>> [    3.695950] 5.2.0-rc2+ #2467 Not tainted
>> [    3.695981] -----------------------------
>> [    3.696014] drivers/iommu/intel-iommu.c:4569 suspicious 
>> rcu_dereference_check() usage!
>> [    3.696069]
>>                 other info that might help us debug this:
>>
>> [    3.696126]
>>                 rcu_scheduler_active = 2, debug_locks = 1
>> [    3.696173] no locks held by swapper/0/1.
>> [    3.696204]
>>                 stack backtrace:
>> [    3.696241] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.2.0-rc2+ #2467
>> [    3.696370] Call Trace:
>> [    3.696404]  dump_stack+0x85/0xcb
>> [    3.696441]  intel_iommu_init+0x128c/0x13ce
>> [    3.696478]  ? kmem_cache_free+0x16b/0x2c0
>> [    3.696516]  ? __fput+0x14b/0x270
>> [    3.696550]  ? __call_rcu+0xb7/0x300
>> [    3.696583]  ? get_max_files+0x10/0x10
>> [    3.696631]  ? set_debug_rodata+0x11/0x11
>> [    3.696668]  ? e820__memblock_setup+0x60/0x60
>> [    3.696704]  ? pci_iommu_init+0x16/0x3f
>> [    3.696737]  ? set_debug_rodata+0x11/0x11
>> [    3.696770]  pci_iommu_init+0x16/0x3f
>> [    3.696805]  do_one_initcall+0x5d/0x2e4
>> [    3.696844]  ? set_debug_rodata+0x11/0x11
>> [    3.696880]  ? rcu_read_lock_sched_held+0x6b/0x80
>> [    3.696924]  kernel_init_freeable+0x1f0/0x27c
>> [    3.696961]  ? rest_init+0x260/0x260
>> [    3.696997]  kernel_init+0xa/0x110
>> [    3.697028]  ret_from_fork+0x3a/0x50
>>
>> Fixes: fa212a97f3a36 ("iommu/vt-d: Probe DMA-capable ACPI name space 
>> devices")
>> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
>> ---
>>   drivers/iommu/intel-iommu.c | 2 ++
>>   1 file changed, 2 insertions(+)
>>
>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>> index 19c4c387a3f6..84e650c6a46d 100644
>> --- a/drivers/iommu/intel-iommu.c
>> +++ b/drivers/iommu/intel-iommu.c
>> @@ -4793,8 +4793,10 @@ int __init intel_iommu_init(void)
>>       cpuhp_setup_state(CPUHP_IOMMU_INTEL_DEAD, "iommu/intel:dead", NULL,
>>                 intel_iommu_cpu_dead);
>> +    down_read(&dmar_global_lock);
>>       if (probe_acpi_namespace_devices())
>>           pr_warn("ACPI name space devices didn't probe correctly\n");
>> +    up_read(&dmar_global_lock);
> 
> Doing a bit of archaeology here, is this actually broken? If any ANDD 
> entries exist, we'd end up doing:
> 
>    down_read(&dmar_global_lock)
>    probe_acpi_namespace_devices()
>    -> iommu_probe_device()
>       -> iommu_create_device_direct_mappings()
>          -> iommu_get_resv_regions()
>             -> intel_iommu_get_resv_regions()
>                -> down_read(&dmar_global_lock)
> 
> I'm wondering whether this might explain why my bus_set_iommu series 
> prevented Baolu's machine from booting, since "iommu: Move bus setup to 
> IOMMU device registration" creates the same condition where we end up in 
> get_resv_regions (via bus_iommu_probe() this time) from the same task 
> that already holds dmar_global_lock. Of course that leaves me wondering 
> how it *did* manage to boot OK on my Xeon box, but maybe there's a 
> config difference or dumb luck at play?

This is really problematic. Where does the latest bus_set_iommu series
locate? I'd like to take a closer look at what happened here. Perhaps
two weeks later? I'm busy with preparing Intel IOMMU patches for v5.20
these days.

Best regards,
baolu
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 5/7] iommu/vt-d: Fix suspicious RCU usage in probe_acpi_namespace_devices()
  2022-07-01  7:19     ` Baolu Lu
@ 2022-07-01  8:18       ` Robin Murphy
  0 siblings, 0 replies; 18+ messages in thread
From: Robin Murphy @ 2022-07-01  8:18 UTC (permalink / raw)
  To: Baolu Lu, Joerg Roedel, David Woodhouse
  Cc: kevin.tian, ashok.raj, linux-kernel, iommu, cai, jacob.jun.pan

On 2022-07-01 08:19, Baolu Lu wrote:
> On 2022/6/29 21:03, Robin Murphy wrote:
>> On 2019-06-12 01:28, Lu Baolu wrote:
>>> The drhd and device scope list should be iterated with the
>>> iommu global lock held. Otherwise, a suspicious RCU usage
>>> message will be displayed.
>>>
>>> [    3.695886] =============================
>>> [    3.695917] WARNING: suspicious RCU usage
>>> [    3.695950] 5.2.0-rc2+ #2467 Not tainted
>>> [    3.695981] -----------------------------
>>> [    3.696014] drivers/iommu/intel-iommu.c:4569 suspicious 
>>> rcu_dereference_check() usage!
>>> [    3.696069]
>>>                 other info that might help us debug this:
>>>
>>> [    3.696126]
>>>                 rcu_scheduler_active = 2, debug_locks = 1
>>> [    3.696173] no locks held by swapper/0/1.
>>> [    3.696204]
>>>                 stack backtrace:
>>> [    3.696241] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.2.0-rc2+ 
>>> #2467
>>> [    3.696370] Call Trace:
>>> [    3.696404]  dump_stack+0x85/0xcb
>>> [    3.696441]  intel_iommu_init+0x128c/0x13ce
>>> [    3.696478]  ? kmem_cache_free+0x16b/0x2c0
>>> [    3.696516]  ? __fput+0x14b/0x270
>>> [    3.696550]  ? __call_rcu+0xb7/0x300
>>> [    3.696583]  ? get_max_files+0x10/0x10
>>> [    3.696631]  ? set_debug_rodata+0x11/0x11
>>> [    3.696668]  ? e820__memblock_setup+0x60/0x60
>>> [    3.696704]  ? pci_iommu_init+0x16/0x3f
>>> [    3.696737]  ? set_debug_rodata+0x11/0x11
>>> [    3.696770]  pci_iommu_init+0x16/0x3f
>>> [    3.696805]  do_one_initcall+0x5d/0x2e4
>>> [    3.696844]  ? set_debug_rodata+0x11/0x11
>>> [    3.696880]  ? rcu_read_lock_sched_held+0x6b/0x80
>>> [    3.696924]  kernel_init_freeable+0x1f0/0x27c
>>> [    3.696961]  ? rest_init+0x260/0x260
>>> [    3.696997]  kernel_init+0xa/0x110
>>> [    3.697028]  ret_from_fork+0x3a/0x50
>>>
>>> Fixes: fa212a97f3a36 ("iommu/vt-d: Probe DMA-capable ACPI name space 
>>> devices")
>>> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
>>> ---
>>>   drivers/iommu/intel-iommu.c | 2 ++
>>>   1 file changed, 2 insertions(+)
>>>
>>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>>> index 19c4c387a3f6..84e650c6a46d 100644
>>> --- a/drivers/iommu/intel-iommu.c
>>> +++ b/drivers/iommu/intel-iommu.c
>>> @@ -4793,8 +4793,10 @@ int __init intel_iommu_init(void)
>>>       cpuhp_setup_state(CPUHP_IOMMU_INTEL_DEAD, "iommu/intel:dead", 
>>> NULL,
>>>                 intel_iommu_cpu_dead);
>>> +    down_read(&dmar_global_lock);
>>>       if (probe_acpi_namespace_devices())
>>>           pr_warn("ACPI name space devices didn't probe correctly\n");
>>> +    up_read(&dmar_global_lock);
>>
>> Doing a bit of archaeology here, is this actually broken? If any ANDD 
>> entries exist, we'd end up doing:
>>
>>    down_read(&dmar_global_lock)
>>    probe_acpi_namespace_devices()
>>    -> iommu_probe_device()
>>       -> iommu_create_device_direct_mappings()
>>          -> iommu_get_resv_regions()
>>             -> intel_iommu_get_resv_regions()
>>                -> down_read(&dmar_global_lock)
>>
>> I'm wondering whether this might explain why my bus_set_iommu series 
>> prevented Baolu's machine from booting, since "iommu: Move bus setup 
>> to IOMMU device registration" creates the same condition where we end 
>> up in get_resv_regions (via bus_iommu_probe() this time) from the same 
>> task that already holds dmar_global_lock. Of course that leaves me 
>> wondering how it *did* manage to boot OK on my Xeon box, but maybe 
>> there's a config difference or dumb luck at play?
> 
> This is really problematic. Where does the latest bus_set_iommu series
> locate? I'd like to take a closer look at what happened here. Perhaps
> two weeks later? I'm busy with preparing Intel IOMMU patches for v5.20
> these days.

I've prepared an up-to-date series here:

https://gitlab.arm.com/linux-arm/linux-rm/-/tree/bus-set-iommu-v3

but I've been hesitant to post it without trying to make *some* progress 
on your breakage. I think last time I was just testing with 
x86_64_defconfig, so I'll double-check it with lockdep this afternoon.

Thanks,
Robin.
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2022-07-01  8:18 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-12  0:28 [PATCH v2 0/7] iommu/vt-d: Fixes and cleanups for linux-next Lu Baolu
2019-06-12  0:28 ` [PATCH v2 1/7] iommu/vt-d: Don't return error when device gets right domain Lu Baolu
2019-06-12  0:28 ` [PATCH v2 2/7] iommu/vt-d: Set domain type for a private domain Lu Baolu
2019-06-12  0:28 ` [PATCH v2 3/7] iommu/vt-d: Don't enable iommu's which have been ignored Lu Baolu
2019-06-12  0:28 ` [PATCH v2 4/7] iommu/vt-d: Allow DMA domain attaching to rmrr locked device Lu Baolu
2019-06-12  0:28 ` [PATCH v2 5/7] iommu/vt-d: Fix suspicious RCU usage in probe_acpi_namespace_devices() Lu Baolu
2022-06-29 13:03   ` Robin Murphy
2022-07-01  7:19     ` Baolu Lu
2022-07-01  8:18       ` Robin Murphy
2019-06-12  0:28 ` [PATCH v2 6/7] iommu/vt-d: Cleanup after delegating DMA domain to generic iommu Lu Baolu
2019-06-12  0:28 ` [PATCH v2 7/7] iommu/vt-d: Consolidate domain_init() to avoid duplication Lu Baolu
2019-07-18 23:16   ` Alex Williamson
2019-07-19  8:27     ` Lu Baolu
2019-07-19 15:19       ` Alex Williamson
2019-07-20  1:15         ` Lu Baolu
2019-07-22 14:22           ` Joerg Roedel
2019-07-25  1:41             ` Lu Baolu
2019-06-12  8:37 ` [PATCH v2 0/7] iommu/vt-d: Fixes and cleanups for linux-next Joerg Roedel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).