All of lore.kernel.org
* [PULL REQUEST] iommu/vt-d: patches for v5.6
@ 2020-01-02  0:18 Lu Baolu
  2020-01-02  0:18 ` [PATCH 01/22] iommu/vt-d: Add Kconfig option to enable/disable scalable mode Lu Baolu
                   ` (22 more replies)
  0 siblings, 23 replies; 41+ messages in thread
From: Lu Baolu @ 2020-01-02  0:18 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: iommu

Hi Joerg,

The patches below have piled up for v5.6.

 - Some preparation patches for VT-d nested mode support
   - Native Shared Virtual Memory (SVM) cleanups and fixes
   - Use 1st-level for IOVA translation

 - VT-d debugging and tracing
   - Extend map_sg trace event for more information
   - Add debugfs support to show page table internals

 - Kconfig option for the default status of scalable mode

 - Some miscellaneous cleanups.

Please consider them for the iommu/vt-d branch.

Best regards,
-baolu

Jacob Pan (8):
  iommu/vt-d: Fix CPU and IOMMU SVM feature matching checks
  iommu/vt-d: Match CPU and IOMMU paging mode
  iommu/vt-d: Reject SVM bind for failed capability check
  iommu/vt-d: Avoid duplicated code for PASID setup
  iommu/vt-d: Fix off-by-one in PASID allocation
  iommu/vt-d: Replace Intel specific PASID allocator with IOASID
  iommu/vt-d: Avoid sending invalid page response
  iommu/vt-d: Misc macro clean up for SVM

Lu Baolu (14):
  iommu/vt-d: Add Kconfig option to enable/disable scalable mode
  iommu/vt-d: trace: Extend map_sg trace event
  iommu/vt-d: Avoid iova flush queue in strict mode
  iommu/vt-d: Loosen requirement for flush queue initialization
  iommu/vt-d: Identify domains using first level page table
  iommu/vt-d: Add set domain DOMAIN_ATTR_NESTING attr
  iommu/vt-d: Add PASID_FLAG_FL5LP for first-level pasid setup
  iommu/vt-d: Setup pasid entries for iova over first level
  iommu/vt-d: Flush PASID-based iotlb for iova over first level
  iommu/vt-d: Make first level IOVA canonical
  iommu/vt-d: Update first level super page capability
  iommu/vt-d: Use iova over first level
  iommu/vt-d: debugfs: Add support to show page table internals
  iommu/vt-d: Add a quirk flag for scope mismatched devices

 drivers/iommu/Kconfig               |  13 ++
 drivers/iommu/dmar.c                |  78 +++++--
 drivers/iommu/intel-iommu-debugfs.c |  75 +++++++
 drivers/iommu/intel-iommu.c         | 305 +++++++++++++++++++++++-----
 drivers/iommu/intel-pasid.c         |  97 +++------
 drivers/iommu/intel-pasid.h         |   6 +
 drivers/iommu/intel-svm.c           | 171 +++++++++-------
 include/linux/intel-iommu.h         |  25 ++-
 include/trace/events/intel_iommu.h  |  48 ++++-
 9 files changed, 593 insertions(+), 225 deletions(-)

-- 
2.17.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 01/22] iommu/vt-d: Add Kconfig option to enable/disable scalable mode
  2020-01-02  0:18 [PULL REQUEST] iommu/vt-d: patches for v5.6 Lu Baolu
@ 2020-01-02  0:18 ` Lu Baolu
  2020-01-02  0:18 ` [PATCH 02/22] iommu/vt-d: Fix CPU and IOMMU SVM feature matching checks Lu Baolu
                   ` (21 subsequent siblings)
  22 siblings, 0 replies; 41+ messages in thread
From: Lu Baolu @ 2020-01-02  0:18 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: iommu

Add a Kconfig option, INTEL_IOMMU_SCALABLE_MODE_DEFAULT_ON, to make
it easier for distributions to enable or disable Intel IOMMU scalable
mode by default at kernel build time.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/Kconfig       | 12 ++++++++++++
 drivers/iommu/intel-iommu.c |  7 ++++++-
 2 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 0b9d78a0f3ac..bcd1c9510458 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -248,6 +248,18 @@ config INTEL_IOMMU_FLOPPY_WA
 	  workaround will setup a 1:1 mapping for the first
 	  16MiB to make floppy (an ISA device) work.
 
+config INTEL_IOMMU_SCALABLE_MODE_DEFAULT_ON
+	bool "Enable Intel IOMMU scalable mode by default"
+	depends on INTEL_IOMMU
+	help
+	  Selecting this option will enable the scalable mode by default if
+	  the hardware advertises the capability. The scalable mode is
+	  defined in VT-d 3.0. Whether an IOMMU supports it can be checked
+	  by reading /sys/devices/virtual/iommu/dmar*/intel-iommu/ecap. If
+	  this option is not selected, scalable mode can still be enabled
+	  at boot time by passing intel_iommu=sm_on to the kernel. If not
+	  sure, use the default value.
+
 config IRQ_REMAP
 	bool "Support for Interrupt Remapping"
 	depends on X86_64 && X86_IO_APIC && PCI_MSI && ACPI
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 42966611a192..26c40134817e 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -355,9 +355,14 @@ static phys_addr_t intel_iommu_iova_to_phys(struct iommu_domain *domain,
 int dmar_disabled = 0;
 #else
 int dmar_disabled = 1;
-#endif /*CONFIG_INTEL_IOMMU_DEFAULT_ON*/
+#endif /* CONFIG_INTEL_IOMMU_DEFAULT_ON */
 
+#ifdef CONFIG_INTEL_IOMMU_SCALABLE_MODE_DEFAULT_ON
+int intel_iommu_sm = 1;
+#else
 int intel_iommu_sm;
+#endif /* CONFIG_INTEL_IOMMU_SCALABLE_MODE_DEFAULT_ON */
+
 int intel_iommu_enabled = 0;
 EXPORT_SYMBOL_GPL(intel_iommu_enabled);
 
-- 
2.17.1

* [PATCH 02/22] iommu/vt-d: Fix CPU and IOMMU SVM feature matching checks
  2020-01-02  0:18 [PULL REQUEST] iommu/vt-d: patches for v5.6 Lu Baolu
  2020-01-02  0:18 ` [PATCH 01/22] iommu/vt-d: Add Kconfig option to enable/disable scalable mode Lu Baolu
@ 2020-01-02  0:18 ` Lu Baolu
  2020-01-02  0:18 ` [PATCH 03/22] iommu/vt-d: Match CPU and IOMMU paging mode Lu Baolu
                   ` (20 subsequent siblings)
  22 siblings, 0 replies; 41+ messages in thread
From: Lu Baolu @ 2020-01-02  0:18 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: iommu

From: Jacob Pan <jacob.jun.pan@linux.intel.com>

Shared Virtual Memory (SVM) is built on a collective set of hardware
features detected at runtime, and it requires matching CPU and IOMMU
capabilities.

The current code checks the CPU and IOMMU feature sets for SVM support,
but the result is never stored nor used, so SVM can still be used even
when these checks fail. The consequences can be:
1. The CPU uses 5-level paging and 57-bit virtual addresses, but the
IOMMU supports only 4-level paging and 48-bit addresses for DMA.
2. The CPU uses 1GB pages but the IOMMU does not support them. VT-d
unrecoverable faults may be generated.

The best solution to fix these problems is to prevent them in the first
place.

This patch consolidates the code that checks PASID support and CPU
vs. IOMMU paging-mode compatibility, and provides a specific error
message for each failed check. On sane hardware configurations, these
error messages should never appear in the kernel log.
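For illustration, the matching rule can be sketched in plain userspace
C. This is a minimal sketch, not the kernel code: struct caps, its
field names, and svm_capable() are invented here; the real code reads
CPU features via cpu_feature_enabled() and the IOMMU capability
register via the cap_fl1gp_support()/cap_5lp_support() helpers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the CPU feature flags and the IOMMU
 * capability-register bits checked by this patch. */
struct caps {
	bool cpu_la57;		/* CPU runs 5-level paging (57-bit VA)  */
	bool cpu_gbpages;	/* CPU may create 1GB mappings          */
	bool iommu_5lp;		/* IOMMU first level walks 5 levels     */
	bool iommu_fl1gp;	/* IOMMU first level takes 1GB pages    */
};

/* SVM is usable only if the IOMMU can walk every format the CPU emits. */
static bool svm_capable(const struct caps *c)
{
	if (c->cpu_gbpages && !c->iommu_fl1gp)
		return false;	/* a CPU 1GB mapping would fault in the IOMMU */
	if (c->cpu_la57 && !c->iommu_5lp)
		return false;	/* 57-bit VAs exceed a 4-level IOMMU walk */
	return true;
}
```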

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel-iommu.c | 10 ++--------
 drivers/iommu/intel-svm.c   | 40 +++++++++++++++++++++++++------------
 include/linux/intel-iommu.h |  5 ++++-
 3 files changed, 33 insertions(+), 22 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 26c40134817e..5328e2ed2dd3 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3299,10 +3299,7 @@ static int __init init_dmars(void)
 
 		if (!ecap_pass_through(iommu->ecap))
 			hw_pass_through = 0;
-#ifdef CONFIG_INTEL_IOMMU_SVM
-		if (pasid_supported(iommu))
-			intel_svm_init(iommu);
-#endif
+		intel_svm_check(iommu);
 	}
 
 	/*
@@ -4495,10 +4492,7 @@ static int intel_iommu_add(struct dmar_drhd_unit *dmaru)
 	if (ret)
 		goto out;
 
-#ifdef CONFIG_INTEL_IOMMU_SVM
-	if (pasid_supported(iommu))
-		intel_svm_init(iommu);
-#endif
+	intel_svm_check(iommu);
 
 	if (dmaru->ignored) {
 		/*
diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index dca88f9fdf29..e4a5d542b84f 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -23,19 +23,6 @@
 
 static irqreturn_t prq_event_thread(int irq, void *d);
 
-int intel_svm_init(struct intel_iommu *iommu)
-{
-	if (cpu_feature_enabled(X86_FEATURE_GBPAGES) &&
-			!cap_fl1gp_support(iommu->cap))
-		return -EINVAL;
-
-	if (cpu_feature_enabled(X86_FEATURE_LA57) &&
-			!cap_5lp_support(iommu->cap))
-		return -EINVAL;
-
-	return 0;
-}
-
 #define PRQ_ORDER 0
 
 int intel_svm_enable_prq(struct intel_iommu *iommu)
@@ -99,6 +86,33 @@ int intel_svm_finish_prq(struct intel_iommu *iommu)
 	return 0;
 }
 
+static inline bool intel_svm_capable(struct intel_iommu *iommu)
+{
+	return iommu->flags & VTD_FLAG_SVM_CAPABLE;
+}
+
+void intel_svm_check(struct intel_iommu *iommu)
+{
+	if (!pasid_supported(iommu))
+		return;
+
+	if (cpu_feature_enabled(X86_FEATURE_GBPAGES) &&
+	    !cap_fl1gp_support(iommu->cap)) {
+		pr_err("%s SVM disabled, incompatible 1GB page capability\n",
+		       iommu->name);
+		return;
+	}
+
+	if (cpu_feature_enabled(X86_FEATURE_LA57) &&
+	    !cap_5lp_support(iommu->cap)) {
+		pr_err("%s SVM disabled, incompatible paging mode\n",
+		       iommu->name);
+		return;
+	}
+
+	iommu->flags |= VTD_FLAG_SVM_CAPABLE;
+}
+
 static void intel_flush_svm_range_dev (struct intel_svm *svm, struct intel_svm_dev *sdev,
 				unsigned long address, unsigned long pages, int ih)
 {
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 6d8bf4bdf240..aaece25c055f 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -435,6 +435,7 @@ enum {
 
 #define VTD_FLAG_TRANS_PRE_ENABLED	(1 << 0)
 #define VTD_FLAG_IRQ_REMAP_PRE_ENABLED	(1 << 1)
+#define VTD_FLAG_SVM_CAPABLE		(1 << 2)
 
 extern int intel_iommu_sm;
 
@@ -658,7 +659,7 @@ void iommu_flush_write_buffer(struct intel_iommu *iommu);
 int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct device *dev);
 
 #ifdef CONFIG_INTEL_IOMMU_SVM
-int intel_svm_init(struct intel_iommu *iommu);
+extern void intel_svm_check(struct intel_iommu *iommu);
 extern int intel_svm_enable_prq(struct intel_iommu *iommu);
 extern int intel_svm_finish_prq(struct intel_iommu *iommu);
 
@@ -686,6 +687,8 @@ struct intel_svm {
 };
 
 extern struct intel_iommu *intel_svm_device_to_iommu(struct device *dev);
+#else
+static inline void intel_svm_check(struct intel_iommu *iommu) {}
 #endif
 
 #ifdef CONFIG_INTEL_IOMMU_DEBUGFS
-- 
2.17.1

* [PATCH 03/22] iommu/vt-d: Match CPU and IOMMU paging mode
  2020-01-02  0:18 [PULL REQUEST] iommu/vt-d: patches for v5.6 Lu Baolu
  2020-01-02  0:18 ` [PATCH 01/22] iommu/vt-d: Add Kconfig option to enable/disable scalable mode Lu Baolu
  2020-01-02  0:18 ` [PATCH 02/22] iommu/vt-d: Fix CPU and IOMMU SVM feature matching checks Lu Baolu
@ 2020-01-02  0:18 ` Lu Baolu
  2020-01-02  0:18 ` [PATCH 04/22] iommu/vt-d: Reject SVM bind for failed capability check Lu Baolu
                   ` (19 subsequent siblings)
  22 siblings, 0 replies; 41+ messages in thread
From: Lu Baolu @ 2020-01-02  0:18 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: iommu

From: Jacob Pan <jacob.jun.pan@linux.intel.com>

When setting up first-level page tables for sharing with the CPU, we
need to ensure that the IOMMU supports no fewer paging levels than the
CPU does. It is not adequate, as the current code does, to set up
5-level paging in the PASID-entry First Level Paging Mode (FLPM) field
based solely on the CPU feature.

Currently, intel_pasid_setup_first_level() is only used by the native
SVM code, which already checks that the paging modes match. However,
future use of this helper function may not be limited to native SVM.
https://lkml.org/lkml/2019/11/18/1037
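The width arithmetic behind the mismatch follows from the x86
page-table format: each level resolves 9 address bits (512 entries per
table) on top of the 12-bit page offset, so a 4-level walk covers 48
bits and a 5-level walk 57 bits. A one-line helper makes the numbers
concrete (va_bits() is an illustrative name, not a kernel function):

```c
#include <assert.h>

/* Virtual-address bits covered by an x86 page-table walk: the 12-bit
 * page offset plus 9 bits (512 entries) per translation level.
 * Illustrative helper, not kernel code. */
static int va_bits(int levels)
{
	return 12 + 9 * levels;
}
```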

Fixes: 437f35e1cd4c8 ("iommu/vt-d: Add first level page table interface")
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel-pasid.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c
index 040a445be300..e7cb0b8a7332 100644
--- a/drivers/iommu/intel-pasid.c
+++ b/drivers/iommu/intel-pasid.c
@@ -499,8 +499,16 @@ int intel_pasid_setup_first_level(struct intel_iommu *iommu,
 	}
 
 #ifdef CONFIG_X86
-	if (cpu_feature_enabled(X86_FEATURE_LA57))
-		pasid_set_flpm(pte, 1);
+	/* Both CPU and IOMMU paging mode need to match */
+	if (cpu_feature_enabled(X86_FEATURE_LA57)) {
+		if (cap_5lp_support(iommu->cap)) {
+			pasid_set_flpm(pte, 1);
+		} else {
+			pr_err("VT-d has no 5-level paging support for CPU\n");
+			pasid_clear_entry(pte);
+			return -EINVAL;
+		}
+	}
 #endif /* CONFIG_X86 */
 
 	pasid_set_domain_id(pte, did);
-- 
2.17.1

* [PATCH 04/22] iommu/vt-d: Reject SVM bind for failed capability check
  2020-01-02  0:18 [PULL REQUEST] iommu/vt-d: patches for v5.6 Lu Baolu
                   ` (2 preceding siblings ...)
  2020-01-02  0:18 ` [PATCH 03/22] iommu/vt-d: Match CPU and IOMMU paging mode Lu Baolu
@ 2020-01-02  0:18 ` Lu Baolu
  2020-01-02  0:18 ` [PATCH 05/22] iommu/vt-d: Avoid duplicated code for PASID setup Lu Baolu
                   ` (18 subsequent siblings)
  22 siblings, 0 replies; 41+ messages in thread
From: Lu Baolu @ 2020-01-02  0:18 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: iommu

From: Jacob Pan <jacob.jun.pan@linux.intel.com>

Add a check during SVM bind to ensure that the CPU and IOMMU hardware
capability requirements are met.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel-svm.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index e4a5d542b84f..48205ab1fea4 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -234,6 +234,9 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_
 	if (!iommu || dmar_disabled)
 		return -EINVAL;
 
+	if (!intel_svm_capable(iommu))
+		return -ENOTSUPP;
+
 	if (dev_is_pci(dev)) {
 		pasid_max = pci_max_pasids(to_pci_dev(dev));
 		if (pasid_max < 0)
-- 
2.17.1

* [PATCH 05/22] iommu/vt-d: Avoid duplicated code for PASID setup
  2020-01-02  0:18 [PULL REQUEST] iommu/vt-d: patches for v5.6 Lu Baolu
                   ` (3 preceding siblings ...)
  2020-01-02  0:18 ` [PATCH 04/22] iommu/vt-d: Reject SVM bind for failed capability check Lu Baolu
@ 2020-01-02  0:18 ` Lu Baolu
  2020-01-02  0:18 ` [PATCH 06/22] iommu/vt-d: Fix off-by-one in PASID allocation Lu Baolu
                   ` (17 subsequent siblings)
  22 siblings, 0 replies; 41+ messages in thread
From: Lu Baolu @ 2020-01-02  0:18 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: iommu

From: Jacob Pan <jacob.jun.pan@linux.intel.com>

After each PASID entry setup, the related translation caches must be
flushed. Combine the duplicated flush code into one helper function,
which is less error prone.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel-pasid.c | 48 ++++++++++++++-----------------------
 1 file changed, 18 insertions(+), 30 deletions(-)

diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c
index e7cb0b8a7332..732bfee228df 100644
--- a/drivers/iommu/intel-pasid.c
+++ b/drivers/iommu/intel-pasid.c
@@ -465,6 +465,21 @@ void intel_pasid_tear_down_entry(struct intel_iommu *iommu,
 		devtlb_invalidation_with_pasid(iommu, dev, pasid);
 }
 
+static void pasid_flush_caches(struct intel_iommu *iommu,
+				struct pasid_entry *pte,
+				int pasid, u16 did)
+{
+	if (!ecap_coherent(iommu->ecap))
+		clflush_cache_range(pte, sizeof(*pte));
+
+	if (cap_caching_mode(iommu->cap)) {
+		pasid_cache_invalidation_with_pasid(iommu, did, pasid);
+		iotlb_invalidation_with_pasid(iommu, did, pasid);
+	} else {
+		iommu_flush_write_buffer(iommu);
+	}
+}
+
 /*
  * Set up the scalable mode pasid table entry for first only
  * translation type.
@@ -518,16 +533,7 @@ int intel_pasid_setup_first_level(struct intel_iommu *iommu,
 	/* Setup Present and PASID Granular Transfer Type: */
 	pasid_set_translation_type(pte, 1);
 	pasid_set_present(pte);
-
-	if (!ecap_coherent(iommu->ecap))
-		clflush_cache_range(pte, sizeof(*pte));
-
-	if (cap_caching_mode(iommu->cap)) {
-		pasid_cache_invalidation_with_pasid(iommu, did, pasid);
-		iotlb_invalidation_with_pasid(iommu, did, pasid);
-	} else {
-		iommu_flush_write_buffer(iommu);
-	}
+	pasid_flush_caches(iommu, pte, pasid, did);
 
 	return 0;
 }
@@ -591,16 +597,7 @@ int intel_pasid_setup_second_level(struct intel_iommu *iommu,
 	 */
 	pasid_set_sre(pte);
 	pasid_set_present(pte);
-
-	if (!ecap_coherent(iommu->ecap))
-		clflush_cache_range(pte, sizeof(*pte));
-
-	if (cap_caching_mode(iommu->cap)) {
-		pasid_cache_invalidation_with_pasid(iommu, did, pasid);
-		iotlb_invalidation_with_pasid(iommu, did, pasid);
-	} else {
-		iommu_flush_write_buffer(iommu);
-	}
+	pasid_flush_caches(iommu, pte, pasid, did);
 
 	return 0;
 }
@@ -634,16 +631,7 @@ int intel_pasid_setup_pass_through(struct intel_iommu *iommu,
 	 */
 	pasid_set_sre(pte);
 	pasid_set_present(pte);
-
-	if (!ecap_coherent(iommu->ecap))
-		clflush_cache_range(pte, sizeof(*pte));
-
-	if (cap_caching_mode(iommu->cap)) {
-		pasid_cache_invalidation_with_pasid(iommu, did, pasid);
-		iotlb_invalidation_with_pasid(iommu, did, pasid);
-	} else {
-		iommu_flush_write_buffer(iommu);
-	}
+	pasid_flush_caches(iommu, pte, pasid, did);
 
 	return 0;
 }
-- 
2.17.1

* [PATCH 06/22] iommu/vt-d: Fix off-by-one in PASID allocation
  2020-01-02  0:18 [PULL REQUEST] iommu/vt-d: patches for v5.6 Lu Baolu
                   ` (4 preceding siblings ...)
  2020-01-02  0:18 ` [PATCH 05/22] iommu/vt-d: Avoid duplicated code for PASID setup Lu Baolu
@ 2020-01-02  0:18 ` Lu Baolu
  2020-01-02  0:18 ` [PATCH 07/22] iommu/vt-d: Replace Intel specific PASID allocator with IOASID Lu Baolu
                   ` (16 subsequent siblings)
  22 siblings, 0 replies; 41+ messages in thread
From: Lu Baolu @ 2020-01-02  0:18 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: iommu

From: Jacob Pan <jacob.jun.pan@linux.intel.com>

The PASID allocator uses an IDR, which treats the end of the
allocation range as exclusive, so there is no need to decrement
pasid_max.
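idr_alloc() allocates from the half-open range [start, end). A toy
userspace allocator (toy_alloc() and its bitmap are invented for this
sketch) shows why passing pasid_max - 1 as the end would wrongly make
pasid_max - 1 itself unallocatable:

```c
#include <assert.h>

/* Toy allocator mimicking idr_alloc()'s [start, end) semantics: returns
 * the lowest free id in start..end-1, or -1 when that range is full.
 * Purely illustrative; not the kernel IDR implementation. */
#define TOY_MAX_IDS 64
static unsigned char toy_used[TOY_MAX_IDS];

static int toy_alloc(int start, int end)	/* end is EXCLUSIVE */
{
	for (int i = start; i < end; i++) {
		if (!toy_used[i]) {
			toy_used[i] = 1;
			return i;
		}
	}
	return -1;
}
```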

Fixes: af39507305fb ("iommu/vt-d: Apply global PASID in SVA")
Reported-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel-svm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index 48205ab1fea4..9b32614910a5 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -334,7 +334,7 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_
 		/* Do not use PASID 0 in caching mode (virtualised IOMMU) */
 		ret = intel_pasid_alloc_id(svm,
 					   !!cap_caching_mode(iommu->cap),
-					   pasid_max - 1, GFP_KERNEL);
+					   pasid_max, GFP_KERNEL);
 		if (ret < 0) {
 			kfree(svm);
 			kfree(sdev);
-- 
2.17.1

* [PATCH 07/22] iommu/vt-d: Replace Intel specific PASID allocator with IOASID
  2020-01-02  0:18 [PULL REQUEST] iommu/vt-d: patches for v5.6 Lu Baolu
                   ` (5 preceding siblings ...)
  2020-01-02  0:18 ` [PATCH 06/22] iommu/vt-d: Fix off-by-one in PASID allocation Lu Baolu
@ 2020-01-02  0:18 ` Lu Baolu
  2020-01-02  0:18 ` [PATCH 08/22] iommu/vt-d: Avoid sending invalid page response Lu Baolu
                   ` (15 subsequent siblings)
  22 siblings, 0 replies; 41+ messages in thread
From: Lu Baolu @ 2020-01-02  0:18 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: iommu

From: Jacob Pan <jacob.jun.pan@linux.intel.com>

Make use of the generic IOASID code to manage PASID allocation,
freeing, and lookup, replacing the Intel-specific allocator.
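Unlike the old intel_pasid_lookup_id(), ioasid_find() can return
either NULL (no entry) or an ERR_PTR-encoded errno, which is why the
lookup sites in this patch gain IS_ERR()/IS_ERR_OR_NULL() checks. A
minimal userspace sketch of the kernel's error-pointer convention
(simplified; the helper names mirror the kernel's, but this is not the
kernel implementation):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The kernel folds small negative errno values into the very top of
 * the address space, where no valid pointer can live, so a single
 * return value can carry either a pointer or an error code. */
#define MAX_ERRNO 4095

static void *err_ptr(long err)		/* err is negative, e.g. -28 (ENOSPC) */
{
	return (void *)(uintptr_t)err;
}

static long ptr_err(const void *p)	/* recover the errno from an error pointer */
{
	return (long)(intptr_t)p;
}

static bool is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Lookup callers must reject both "not found" and "error" results. */
static bool is_err_or_null(const void *p)
{
	return !p || is_err(p);
}
```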

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/Kconfig       |  1 +
 drivers/iommu/intel-iommu.c | 13 +++++++------
 drivers/iommu/intel-pasid.c | 36 ------------------------------------
 drivers/iommu/intel-svm.c   | 36 ++++++++++++++++++++++--------------
 4 files changed, 30 insertions(+), 56 deletions(-)

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index bcd1c9510458..9a9e2882f5db 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -214,6 +214,7 @@ config INTEL_IOMMU_SVM
 	select PCI_PASID
 	select PCI_PRI
 	select MMU_NOTIFIER
+	select IOASID
 	help
 	  Shared Virtual Memory (SVM) provides a facility for devices
 	  to access DMA resources through process address space by
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 5328e2ed2dd3..0d100741cf2e 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5282,7 +5282,7 @@ static void auxiliary_unlink_device(struct dmar_domain *domain,
 	domain->auxd_refcnt--;
 
 	if (!domain->auxd_refcnt && domain->default_pasid > 0)
-		intel_pasid_free_id(domain->default_pasid);
+		ioasid_free(domain->default_pasid);
 }
 
 static int aux_domain_add_dev(struct dmar_domain *domain,
@@ -5300,10 +5300,11 @@ static int aux_domain_add_dev(struct dmar_domain *domain,
 	if (domain->default_pasid <= 0) {
 		int pasid;
 
-		pasid = intel_pasid_alloc_id(domain, PASID_MIN,
-					     pci_max_pasids(to_pci_dev(dev)),
-					     GFP_KERNEL);
-		if (pasid <= 0) {
+		/* No private data needed for the default pasid */
+		pasid = ioasid_alloc(NULL, PASID_MIN,
+				     pci_max_pasids(to_pci_dev(dev)) - 1,
+				     NULL);
+		if (pasid == INVALID_IOASID) {
 			pr_err("Can't allocate default pasid\n");
 			return -ENODEV;
 		}
@@ -5339,7 +5340,7 @@ static int aux_domain_add_dev(struct dmar_domain *domain,
 	spin_unlock(&iommu->lock);
 	spin_unlock_irqrestore(&device_domain_lock, flags);
 	if (!domain->auxd_refcnt && domain->default_pasid > 0)
-		intel_pasid_free_id(domain->default_pasid);
+		ioasid_free(domain->default_pasid);
 
 	return ret;
 }
diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c
index 732bfee228df..3cb569e76642 100644
--- a/drivers/iommu/intel-pasid.c
+++ b/drivers/iommu/intel-pasid.c
@@ -26,42 +26,6 @@
  */
 static DEFINE_SPINLOCK(pasid_lock);
 u32 intel_pasid_max_id = PASID_MAX;
-static DEFINE_IDR(pasid_idr);
-
-int intel_pasid_alloc_id(void *ptr, int start, int end, gfp_t gfp)
-{
-	int ret, min, max;
-
-	min = max_t(int, start, PASID_MIN);
-	max = min_t(int, end, intel_pasid_max_id);
-
-	WARN_ON(in_interrupt());
-	idr_preload(gfp);
-	spin_lock(&pasid_lock);
-	ret = idr_alloc(&pasid_idr, ptr, min, max, GFP_ATOMIC);
-	spin_unlock(&pasid_lock);
-	idr_preload_end();
-
-	return ret;
-}
-
-void intel_pasid_free_id(int pasid)
-{
-	spin_lock(&pasid_lock);
-	idr_remove(&pasid_idr, pasid);
-	spin_unlock(&pasid_lock);
-}
-
-void *intel_pasid_lookup_id(int pasid)
-{
-	void *p;
-
-	spin_lock(&pasid_lock);
-	p = idr_find(&pasid_idr, pasid);
-	spin_unlock(&pasid_lock);
-
-	return p;
-}
 
 /*
  * Per device pasid table management:
diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index 9b32614910a5..f0410e29fbc1 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -17,6 +17,7 @@
 #include <linux/dmar.h>
 #include <linux/interrupt.h>
 #include <linux/mm_types.h>
+#include <linux/ioasid.h>
 #include <asm/page.h>
 
 #include "intel-pasid.h"
@@ -331,16 +332,15 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_
 		if (pasid_max > intel_pasid_max_id)
 			pasid_max = intel_pasid_max_id;
 
-		/* Do not use PASID 0 in caching mode (virtualised IOMMU) */
-		ret = intel_pasid_alloc_id(svm,
-					   !!cap_caching_mode(iommu->cap),
-					   pasid_max, GFP_KERNEL);
-		if (ret < 0) {
+		/* Do not use PASID 0, reserved for RID to PASID */
+		svm->pasid = ioasid_alloc(NULL, PASID_MIN,
+					  pasid_max - 1, svm);
+		if (svm->pasid == INVALID_IOASID) {
 			kfree(svm);
 			kfree(sdev);
+			ret = -ENOSPC;
 			goto out;
 		}
-		svm->pasid = ret;
 		svm->notifier.ops = &intel_mmuops;
 		svm->mm = mm;
 		svm->flags = flags;
@@ -350,7 +350,7 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_
 		if (mm) {
 			ret = mmu_notifier_register(&svm->notifier, mm);
 			if (ret) {
-				intel_pasid_free_id(svm->pasid);
+				ioasid_free(svm->pasid);
 				kfree(svm);
 				kfree(sdev);
 				goto out;
@@ -366,7 +366,7 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_
 		if (ret) {
 			if (mm)
 				mmu_notifier_unregister(&svm->notifier, mm);
-			intel_pasid_free_id(svm->pasid);
+			ioasid_free(svm->pasid);
 			kfree(svm);
 			kfree(sdev);
 			goto out;
@@ -414,10 +414,15 @@ int intel_svm_unbind_mm(struct device *dev, int pasid)
 	if (!iommu)
 		goto out;
 
-	svm = intel_pasid_lookup_id(pasid);
+	svm = ioasid_find(NULL, pasid, NULL);
 	if (!svm)
 		goto out;
 
+	if (IS_ERR(svm)) {
+		ret = PTR_ERR(svm);
+		goto out;
+	}
+
 	list_for_each_entry(sdev, &svm->devs, list) {
 		if (dev == sdev->dev) {
 			ret = 0;
@@ -436,7 +441,7 @@ int intel_svm_unbind_mm(struct device *dev, int pasid)
 				kfree_rcu(sdev, rcu);
 
 				if (list_empty(&svm->devs)) {
-					intel_pasid_free_id(svm->pasid);
+					ioasid_free(svm->pasid);
 					if (svm->mm)
 						mmu_notifier_unregister(&svm->notifier, svm->mm);
 
@@ -471,10 +476,14 @@ int intel_svm_is_pasid_valid(struct device *dev, int pasid)
 	if (!iommu)
 		goto out;
 
-	svm = intel_pasid_lookup_id(pasid);
+	svm = ioasid_find(NULL, pasid, NULL);
 	if (!svm)
 		goto out;
 
+	if (IS_ERR(svm)) {
+		ret = PTR_ERR(svm);
+		goto out;
+	}
 	/* init_mm is used in this case */
 	if (!svm->mm)
 		ret = 1;
@@ -581,13 +590,12 @@ static irqreturn_t prq_event_thread(int irq, void *d)
 
 		if (!svm || svm->pasid != req->pasid) {
 			rcu_read_lock();
-			svm = intel_pasid_lookup_id(req->pasid);
+			svm = ioasid_find(NULL, req->pasid, NULL);
 			/* It *can't* go away, because the driver is not permitted
 			 * to unbind the mm while any page faults are outstanding.
 			 * So we only need RCU to protect the internal idr code. */
 			rcu_read_unlock();
-
-			if (!svm) {
+			if (IS_ERR_OR_NULL(svm)) {
 				pr_err("%s: Page request for invalid PASID %d: %08llx %08llx\n",
 				       iommu->name, req->pasid, ((unsigned long long *)req)[0],
 				       ((unsigned long long *)req)[1]);
-- 
2.17.1

* [PATCH 08/22] iommu/vt-d: Avoid sending invalid page response
  2020-01-02  0:18 [PULL REQUEST] iommu/vt-d: patches for v5.6 Lu Baolu
                   ` (6 preceding siblings ...)
  2020-01-02  0:18 ` [PATCH 07/22] iommu/vt-d: Replace Intel specific PASID allocator with IOASID Lu Baolu
@ 2020-01-02  0:18 ` Lu Baolu
  2020-01-02  0:18 ` [PATCH 09/22] iommu/vt-d: Misc macro clean up for SVM Lu Baolu
                   ` (14 subsequent siblings)
  22 siblings, 0 replies; 41+ messages in thread
From: Lu Baolu @ 2020-01-02  0:18 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: iommu

From: Jacob Pan <jacob.jun.pan@linux.intel.com>

Page responses should only be sent when the last page in group (LPIG)
bit is set or private data is present in the page request. This patch
avoids sending invalid response descriptors.

Fixes: 5d308fc1ecf53 ("iommu/vt-d: Add 256-bit invalidation descriptor support")
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel-svm.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index f0410e29fbc1..7c6a6e8b1c96 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -679,11 +679,10 @@ static irqreturn_t prq_event_thread(int irq, void *d)
 			if (req->priv_data_present)
 				memcpy(&resp.qw2, req->priv_data,
 				       sizeof(req->priv_data));
+			resp.qw2 = 0;
+			resp.qw3 = 0;
+			qi_submit_sync(&resp, iommu);
 		}
-		resp.qw2 = 0;
-		resp.qw3 = 0;
-		qi_submit_sync(&resp, iommu);
-
 		head = (head + sizeof(*req)) & PRQ_RING_MASK;
 	}
 
-- 
2.17.1

* [PATCH 09/22] iommu/vt-d: Misc macro clean up for SVM
  2020-01-02  0:18 [PULL REQUEST] iommu/vt-d: patches for v5.6 Lu Baolu
                   ` (7 preceding siblings ...)
  2020-01-02  0:18 ` [PATCH 08/22] iommu/vt-d: Avoid sending invalid page response Lu Baolu
@ 2020-01-02  0:18 ` Lu Baolu
  2020-01-02  0:18 ` [PATCH 10/22] iommu/vt-d: trace: Extend map_sg trace event Lu Baolu
                   ` (13 subsequent siblings)
  22 siblings, 0 replies; 41+ messages in thread
From: Lu Baolu @ 2020-01-02  0:18 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: iommu

From: Jacob Pan <jacob.jun.pan@linux.intel.com>

Use the combined macro for_each_svm_dev() to simplify SVM device
iteration and error checking.
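The macro filters while it iterates; its trailing "if (...) {} else"
keeps break and continue bound to the enclosing loop and avoids the
dangling-else surprises a bare "if (match)" body would invite. A
self-contained userspace sketch of the same pattern (struct node,
for_each_dev_node(), and the demo list are all invented here):

```c
#include <assert.h>
#include <stddef.h>

struct node {
	int dev;
	struct node *next;
};

/* Same shape as for_each_svm_dev(): iterate the list, but run the loop
 * body only for nodes whose dev matches d. The empty-then branch means
 * the user's statement becomes the else, so a following `else` in the
 * caller cannot pair with the wrong `if`. */
#define for_each_dev_node(n, head, d)			\
	for ((n) = (head); (n); (n) = (n)->next)	\
		if ((n)->dev != (d)) {} else

static int count_matches(struct node *head, int dev)
{
	struct node *n;
	int count = 0;

	for_each_dev_node(n, head, dev)
		count++;		/* runs only for matching nodes */
	return count;
}

/* Three-node demo list with devs 1, 2, 1. */
static struct node n3 = { 1, NULL };
static struct node n2 = { 2, &n3 };
static struct node n1 = { 1, &n2 };

static int demo_count(int dev)
{
	return count_matches(&n1, dev);
}
```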

Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel-svm.c | 79 +++++++++++++++++++--------------------
 1 file changed, 39 insertions(+), 40 deletions(-)

diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index 7c6a6e8b1c96..04023033b79f 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -222,6 +222,10 @@ static const struct mmu_notifier_ops intel_mmuops = {
 static DEFINE_MUTEX(pasid_mutex);
 static LIST_HEAD(global_svm_list);
 
+#define for_each_svm_dev(sdev, svm, d)			\
+	list_for_each_entry((sdev), &(svm)->devs, list)	\
+		if ((d) != (sdev)->dev) {} else
+
 int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_ops *ops)
 {
 	struct intel_iommu *iommu = intel_svm_device_to_iommu(dev);
@@ -270,15 +274,14 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_
 				goto out;
 			}
 
-			list_for_each_entry(sdev, &svm->devs, list) {
-				if (dev == sdev->dev) {
-					if (sdev->ops != ops) {
-						ret = -EBUSY;
-						goto out;
-					}
-					sdev->users++;
-					goto success;
+			/* Find the matching device in svm list */
+			for_each_svm_dev(sdev, svm, dev) {
+				if (sdev->ops != ops) {
+					ret = -EBUSY;
+					goto out;
 				}
+				sdev->users++;
+				goto success;
 			}
 
 			break;
@@ -423,40 +426,36 @@ int intel_svm_unbind_mm(struct device *dev, int pasid)
 		goto out;
 	}
 
-	list_for_each_entry(sdev, &svm->devs, list) {
-		if (dev == sdev->dev) {
-			ret = 0;
-			sdev->users--;
-			if (!sdev->users) {
-				list_del_rcu(&sdev->list);
-				/* Flush the PASID cache and IOTLB for this device.
-				 * Note that we do depend on the hardware *not* using
-				 * the PASID any more. Just as we depend on other
-				 * devices never using PASIDs that they have no right
-				 * to use. We have a *shared* PASID table, because it's
-				 * large and has to be physically contiguous. So it's
-				 * hard to be as defensive as we might like. */
-				intel_pasid_tear_down_entry(iommu, dev, svm->pasid);
-				intel_flush_svm_range_dev(svm, sdev, 0, -1, 0);
-				kfree_rcu(sdev, rcu);
-
-				if (list_empty(&svm->devs)) {
-					ioasid_free(svm->pasid);
-					if (svm->mm)
-						mmu_notifier_unregister(&svm->notifier, svm->mm);
-
-					list_del(&svm->list);
-
-					/* We mandate that no page faults may be outstanding
-					 * for the PASID when intel_svm_unbind_mm() is called.
-					 * If that is not obeyed, subtle errors will happen.
-					 * Let's make them less subtle... */
-					memset(svm, 0x6b, sizeof(*svm));
-					kfree(svm);
-				}
+	for_each_svm_dev(sdev, svm, dev) {
+		ret = 0;
+		sdev->users--;
+		if (!sdev->users) {
+			list_del_rcu(&sdev->list);
+			/* Flush the PASID cache and IOTLB for this device.
+			 * Note that we do depend on the hardware *not* using
+			 * the PASID any more. Just as we depend on other
+			 * devices never using PASIDs that they have no right
+			 * to use. We have a *shared* PASID table, because it's
+			 * large and has to be physically contiguous. So it's
+			 * hard to be as defensive as we might like. */
+			intel_pasid_tear_down_entry(iommu, dev, svm->pasid);
+			intel_flush_svm_range_dev(svm, sdev, 0, -1, 0);
+			kfree_rcu(sdev, rcu);
+
+			if (list_empty(&svm->devs)) {
+				ioasid_free(svm->pasid);
+				if (svm->mm)
+					mmu_notifier_unregister(&svm->notifier, svm->mm);
+				list_del(&svm->list);
+				/* We mandate that no page faults may be outstanding
+				 * for the PASID when intel_svm_unbind_mm() is called.
+				 * If that is not obeyed, subtle errors will happen.
+				 * Let's make them less subtle... */
+				memset(svm, 0x6b, sizeof(*svm));
+				kfree(svm);
 			}
-			break;
 		}
+		break;
 	}
  out:
 	mutex_unlock(&pasid_mutex);
-- 
2.17.1


* [PATCH 10/22] iommu/vt-d: trace: Extend map_sg trace event
  2020-01-02  0:18 [PULL REQUEST] iommu/vt-d: patches for v5.6 Lu Baolu
                   ` (8 preceding siblings ...)
  2020-01-02  0:18 ` [PATCH 09/22] iommu/vt-d: Misc macro clean up for SVM Lu Baolu
@ 2020-01-02  0:18 ` Lu Baolu
  2020-01-02  0:18 ` [PATCH 11/22] iommu/vt-d: Avoid iova flush queue in strict mode Lu Baolu
                   ` (12 subsequent siblings)
  22 siblings, 0 replies; 41+ messages in thread
From: Lu Baolu @ 2020-01-02  0:18 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: iommu

The current map_sg trace event stores its message in a coarse
manner. Extend it so that more detailed per-scatterlist-entry
information can be traced.

The map_sg trace message looks like:

map_sg: dev=0000:00:17.0 [1/9] dev_addr=0xf8f90000 phys_addr=0x158051000 size=4096
map_sg: dev=0000:00:17.0 [2/9] dev_addr=0xf8f91000 phys_addr=0x15a858000 size=4096
map_sg: dev=0000:00:17.0 [3/9] dev_addr=0xf8f92000 phys_addr=0x15aa13000 size=4096
map_sg: dev=0000:00:17.0 [4/9] dev_addr=0xf8f93000 phys_addr=0x1570f1000 size=8192
map_sg: dev=0000:00:17.0 [5/9] dev_addr=0xf8f95000 phys_addr=0x15c6d0000 size=4096
map_sg: dev=0000:00:17.0 [6/9] dev_addr=0xf8f96000 phys_addr=0x157194000 size=4096
map_sg: dev=0000:00:17.0 [7/9] dev_addr=0xf8f97000 phys_addr=0x169552000 size=4096
map_sg: dev=0000:00:17.0 [8/9] dev_addr=0xf8f98000 phys_addr=0x169dde000 size=4096
map_sg: dev=0000:00:17.0 [9/9] dev_addr=0xf8f99000 phys_addr=0x148351000 size=4096

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel-iommu.c        |  7 +++--
 include/trace/events/intel_iommu.h | 48 ++++++++++++++++++++++++++----
 2 files changed, 47 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 0d100741cf2e..fb21a7745db2 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3776,8 +3776,8 @@ static int intel_map_sg(struct device *dev, struct scatterlist *sglist, int nele
 		return 0;
 	}
 
-	trace_map_sg(dev, iova_pfn << PAGE_SHIFT,
-		     sg_phys(sglist), size << VTD_PAGE_SHIFT);
+	for_each_sg(sglist, sg, nelems, i)
+		trace_map_sg(dev, i + 1, nelems, sg);
 
 	return nelems;
 }
@@ -3989,6 +3989,9 @@ bounce_map_sg(struct device *dev, struct scatterlist *sglist, int nelems,
 		sg_dma_len(sg) = sg->length;
 	}
 
+	for_each_sg(sglist, sg, nelems, i)
+		trace_bounce_map_sg(dev, i + 1, nelems, sg);
+
 	return nelems;
 
 out_unmap:
diff --git a/include/trace/events/intel_iommu.h b/include/trace/events/intel_iommu.h
index 54e61d456cdf..112bd06487bf 100644
--- a/include/trace/events/intel_iommu.h
+++ b/include/trace/events/intel_iommu.h
@@ -49,12 +49,6 @@ DEFINE_EVENT(dma_map, map_single,
 	TP_ARGS(dev, dev_addr, phys_addr, size)
 );
 
-DEFINE_EVENT(dma_map, map_sg,
-	TP_PROTO(struct device *dev, dma_addr_t dev_addr, phys_addr_t phys_addr,
-		 size_t size),
-	TP_ARGS(dev, dev_addr, phys_addr, size)
-);
-
 DEFINE_EVENT(dma_map, bounce_map_single,
 	TP_PROTO(struct device *dev, dma_addr_t dev_addr, phys_addr_t phys_addr,
 		 size_t size),
@@ -99,6 +93,48 @@ DEFINE_EVENT(dma_unmap, bounce_unmap_single,
 	TP_ARGS(dev, dev_addr, size)
 );
 
+DECLARE_EVENT_CLASS(dma_map_sg,
+	TP_PROTO(struct device *dev, int index, int total,
+		 struct scatterlist *sg),
+
+	TP_ARGS(dev, index, total, sg),
+
+	TP_STRUCT__entry(
+		__string(dev_name, dev_name(dev))
+		__field(dma_addr_t, dev_addr)
+		__field(phys_addr_t, phys_addr)
+		__field(size_t,	size)
+		__field(int, index)
+		__field(int, total)
+	),
+
+	TP_fast_assign(
+		__assign_str(dev_name, dev_name(dev));
+		__entry->dev_addr = sg->dma_address;
+		__entry->phys_addr = sg_phys(sg);
+		__entry->size = sg->dma_length;
+		__entry->index = index;
+		__entry->total = total;
+	),
+
+	TP_printk("dev=%s [%d/%d] dev_addr=0x%llx phys_addr=0x%llx size=%zu",
+		  __get_str(dev_name), __entry->index, __entry->total,
+		  (unsigned long long)__entry->dev_addr,
+		  (unsigned long long)__entry->phys_addr,
+		  __entry->size)
+);
+
+DEFINE_EVENT(dma_map_sg, map_sg,
+	TP_PROTO(struct device *dev, int index, int total,
+		 struct scatterlist *sg),
+	TP_ARGS(dev, index, total, sg)
+);
+
+DEFINE_EVENT(dma_map_sg, bounce_map_sg,
+	TP_PROTO(struct device *dev, int index, int total,
+		 struct scatterlist *sg),
+	TP_ARGS(dev, index, total, sg)
+);
 #endif /* _TRACE_INTEL_IOMMU_H */
 
 /* This part must be outside protection */
-- 
2.17.1


* [PATCH 11/22] iommu/vt-d: Avoid iova flush queue in strict mode
  2020-01-02  0:18 [PULL REQUEST] iommu/vt-d: patches for v5.6 Lu Baolu
                   ` (9 preceding siblings ...)
  2020-01-02  0:18 ` [PATCH 10/22] iommu/vt-d: trace: Extend map_sg trace event Lu Baolu
@ 2020-01-02  0:18 ` Lu Baolu
  2020-01-02  0:18 ` [PATCH 12/22] iommu/vt-d: Loosen requirement for flush queue initialization Lu Baolu
                   ` (11 subsequent siblings)
  22 siblings, 0 replies; 41+ messages in thread
From: Lu Baolu @ 2020-01-02  0:18 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: iommu

If Intel IOMMU strict mode is enabled by the user, there is no need
to create the iova flush queue, since every unmap is flushed
synchronously.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel-iommu.c | 24 +++++++++++++++---------
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index fb21a7745db2..4631b1796482 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -1858,10 +1858,12 @@ static int domain_init(struct dmar_domain *domain, struct intel_iommu *iommu,
 
 	init_iova_domain(&domain->iovad, VTD_PAGE_SIZE, IOVA_START_PFN);
 
-	err = init_iova_flush_queue(&domain->iovad,
-				    iommu_flush_iova, iova_entry_free);
-	if (err)
-		return err;
+	if (!intel_iommu_strict) {
+		err = init_iova_flush_queue(&domain->iovad,
+					    iommu_flush_iova, iova_entry_free);
+		if (err)
+			return err;
+	}
 
 	domain_reserve_special_ranges(domain);
 
@@ -5199,6 +5201,7 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type)
 {
 	struct dmar_domain *dmar_domain;
 	struct iommu_domain *domain;
+	int ret;
 
 	switch (type) {
 	case IOMMU_DOMAIN_DMA:
@@ -5215,11 +5218,14 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type)
 			return NULL;
 		}
 
-		if (type == IOMMU_DOMAIN_DMA &&
-		    init_iova_flush_queue(&dmar_domain->iovad,
-					  iommu_flush_iova, iova_entry_free)) {
-			pr_warn("iova flush queue initialization failed\n");
-			intel_iommu_strict = 1;
+		if (!intel_iommu_strict && type == IOMMU_DOMAIN_DMA) {
+			ret = init_iova_flush_queue(&dmar_domain->iovad,
+						    iommu_flush_iova,
+						    iova_entry_free);
+			if (ret) {
+				pr_warn("iova flush queue initialization failed\n");
+				intel_iommu_strict = 1;
+			}
 		}
 
 		domain_update_iommu_cap(dmar_domain);
-- 
2.17.1


* [PATCH 12/22] iommu/vt-d: Loosen requirement for flush queue initialization
  2020-01-02  0:18 [PULL REQUEST] iommu/vt-d: patches for v5.6 Lu Baolu
                   ` (10 preceding siblings ...)
  2020-01-02  0:18 ` [PATCH 11/22] iommu/vt-d: Avoid iova flush queue in strict mode Lu Baolu
@ 2020-01-02  0:18 ` Lu Baolu
  2020-01-02  0:18 ` [PATCH 13/22] iommu/vt-d: Identify domains using first level page table Lu Baolu
                   ` (10 subsequent siblings)
  22 siblings, 0 replies; 41+ messages in thread
From: Lu Baolu @ 2020-01-02  0:18 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: iommu

Currently, if flush queue initialization fails, we either return an
error or enforce system-wide strict mode. Neither is necessary,
because we always check for the existence of a flush queue before
queuing any iovas for lazy flushing. Printing an informational
message is enough.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel-iommu.c | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 4631b1796482..34723f6be672 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -1854,15 +1854,15 @@ static int domain_init(struct dmar_domain *domain, struct intel_iommu *iommu,
 {
 	int adjust_width, agaw;
 	unsigned long sagaw;
-	int err;
+	int ret;
 
 	init_iova_domain(&domain->iovad, VTD_PAGE_SIZE, IOVA_START_PFN);
 
 	if (!intel_iommu_strict) {
-		err = init_iova_flush_queue(&domain->iovad,
+		ret = init_iova_flush_queue(&domain->iovad,
 					    iommu_flush_iova, iova_entry_free);
-		if (err)
-			return err;
+		if (ret)
+			pr_info("iova flush queue initialization failed\n");
 	}
 
 	domain_reserve_special_ranges(domain);
@@ -5222,10 +5222,8 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type)
 			ret = init_iova_flush_queue(&dmar_domain->iovad,
 						    iommu_flush_iova,
 						    iova_entry_free);
-			if (ret) {
-				pr_warn("iova flush queue initialization failed\n");
-				intel_iommu_strict = 1;
-			}
+			if (ret)
+				pr_info("iova flush queue initialization failed\n");
 		}
 
 		domain_update_iommu_cap(dmar_domain);
-- 
2.17.1


* [PATCH 13/22] iommu/vt-d: Identify domains using first level page table
  2020-01-02  0:18 [PULL REQUEST] iommu/vt-d: patches for v5.6 Lu Baolu
                   ` (11 preceding siblings ...)
  2020-01-02  0:18 ` [PATCH 12/22] iommu/vt-d: Loosen requirement for flush queue initialization Lu Baolu
@ 2020-01-02  0:18 ` Lu Baolu
  2020-01-02  0:18 ` [PATCH 14/22] iommu/vt-d: Add set domain DOMAIN_ATTR_NESTING attr Lu Baolu
                   ` (9 subsequent siblings)
  22 siblings, 0 replies; 41+ messages in thread
From: Lu Baolu @ 2020-01-02  0:18 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: iommu

This checks whether a domain should use the first level page
table for map/unmap and marks it in the domain structure.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel-iommu.c | 39 +++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 34723f6be672..71ad5e5feae2 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -307,6 +307,14 @@ static int hw_pass_through = 1;
  */
 #define DOMAIN_FLAG_LOSE_CHILDREN		BIT(1)
 
+/*
+ * When VT-d works in the scalable mode, it allows DMA translation to
+ * happen through either first level or second level page table. This
+ * bit marks that the DMA translation for the domain goes through the
+ * first level page table, otherwise, it goes through the second level.
+ */
+#define DOMAIN_FLAG_USE_FIRST_LEVEL		BIT(2)
+
 #define for_each_domain_iommu(idx, domain)			\
 	for (idx = 0; idx < g_num_of_iommus; idx++)		\
 		if (domain->iommu_refcnt[idx])
@@ -1714,6 +1722,35 @@ static void free_dmar_iommu(struct intel_iommu *iommu)
 #endif
 }
 
+/*
+ * Check and return whether first level is used by default for
+ * DMA translation. Currently, we make it off by setting
+ * first_level_support = 0, and will change it to -1 after all
+ * map/unmap paths support first level page table.
+ */
+static bool first_level_by_default(void)
+{
+	struct dmar_drhd_unit *drhd;
+	struct intel_iommu *iommu;
+	static int first_level_support = 0;
+
+	if (likely(first_level_support != -1))
+		return first_level_support;
+
+	first_level_support = 1;
+
+	rcu_read_lock();
+	for_each_active_iommu(iommu, drhd) {
+		if (!sm_supported(iommu) || !ecap_flts(iommu->ecap)) {
+			first_level_support = 0;
+			break;
+		}
+	}
+	rcu_read_unlock();
+
+	return first_level_support;
+}
+
 static struct dmar_domain *alloc_domain(int flags)
 {
 	struct dmar_domain *domain;
@@ -1725,6 +1762,8 @@ static struct dmar_domain *alloc_domain(int flags)
 	memset(domain, 0, sizeof(*domain));
 	domain->nid = NUMA_NO_NODE;
 	domain->flags = flags;
+	if (first_level_by_default())
+		domain->flags |= DOMAIN_FLAG_USE_FIRST_LEVEL;
 	domain->has_iotlb_device = false;
 	INIT_LIST_HEAD(&domain->devices);
 
-- 
2.17.1


* [PATCH 14/22] iommu/vt-d: Add set domain DOMAIN_ATTR_NESTING attr
  2020-01-02  0:18 [PULL REQUEST] iommu/vt-d: patches for v5.6 Lu Baolu
                   ` (12 preceding siblings ...)
  2020-01-02  0:18 ` [PATCH 13/22] iommu/vt-d: Identify domains using first level page table Lu Baolu
@ 2020-01-02  0:18 ` Lu Baolu
  2020-01-02  0:18 ` [PATCH 15/22] iommu/vt-d: Add PASID_FLAG_FL5LP for first-level pasid setup Lu Baolu
                   ` (8 subsequent siblings)
  22 siblings, 0 replies; 41+ messages in thread
From: Lu Baolu @ 2020-01-02  0:18 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: iommu, Yi Sun

This adds the Intel VT-d specific callback for setting the
DOMAIN_ATTR_NESTING domain attribute. It is necessary
to let the VT-d driver know that the domain represents
a virtual machine which requires the IOMMU hardware to
support nested translation mode. Return success if the
IOMMU hardware supports nested mode, otherwise failure.

Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel-iommu.c | 56 +++++++++++++++++++++++++++++++++++++
 1 file changed, 56 insertions(+)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 71ad5e5feae2..35f65628202c 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -315,6 +315,12 @@ static int hw_pass_through = 1;
  */
 #define DOMAIN_FLAG_USE_FIRST_LEVEL		BIT(2)
 
+/*
+ * Domain represents a virtual machine which demands iommu nested
+ * translation mode support.
+ */
+#define DOMAIN_FLAG_NESTING_MODE		BIT(3)
+
 #define for_each_domain_iommu(idx, domain)			\
 	for (idx = 0; idx < g_num_of_iommus; idx++)		\
 		if (domain->iommu_refcnt[idx])
@@ -5640,6 +5646,24 @@ static inline bool iommu_pasid_support(void)
 	return ret;
 }
 
+static inline bool nested_mode_support(void)
+{
+	struct dmar_drhd_unit *drhd;
+	struct intel_iommu *iommu;
+	bool ret = true;
+
+	rcu_read_lock();
+	for_each_active_iommu(iommu, drhd) {
+		if (!sm_supported(iommu) || !ecap_nest(iommu->ecap)) {
+			ret = false;
+			break;
+		}
+	}
+	rcu_read_unlock();
+
+	return ret;
+}
+
 static bool intel_iommu_capable(enum iommu_cap cap)
 {
 	if (cap == IOMMU_CAP_CACHE_COHERENCY)
@@ -6018,10 +6042,42 @@ static bool intel_iommu_is_attach_deferred(struct iommu_domain *domain,
 	return dev->archdata.iommu == DEFER_DEVICE_DOMAIN_INFO;
 }
 
+static int
+intel_iommu_domain_set_attr(struct iommu_domain *domain,
+			    enum iommu_attr attr, void *data)
+{
+	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
+	unsigned long flags;
+	int ret = 0;
+
+	if (domain->type != IOMMU_DOMAIN_UNMANAGED)
+		return -EINVAL;
+
+	switch (attr) {
+	case DOMAIN_ATTR_NESTING:
+		spin_lock_irqsave(&device_domain_lock, flags);
+		if (nested_mode_support() &&
+		    list_empty(&dmar_domain->devices)) {
+			dmar_domain->flags |= DOMAIN_FLAG_NESTING_MODE;
+			dmar_domain->flags &= ~DOMAIN_FLAG_USE_FIRST_LEVEL;
+		} else {
+			ret = -ENODEV;
+		}
+		spin_unlock_irqrestore(&device_domain_lock, flags);
+		break;
+	default:
+		ret = -EINVAL;
+		break;
+	}
+
+	return ret;
+}
+
 const struct iommu_ops intel_iommu_ops = {
 	.capable		= intel_iommu_capable,
 	.domain_alloc		= intel_iommu_domain_alloc,
 	.domain_free		= intel_iommu_domain_free,
+	.domain_set_attr	= intel_iommu_domain_set_attr,
 	.attach_dev		= intel_iommu_attach_device,
 	.detach_dev		= intel_iommu_detach_device,
 	.aux_attach_dev		= intel_iommu_aux_attach_device,
-- 
2.17.1


* [PATCH 15/22] iommu/vt-d: Add PASID_FLAG_FL5LP for first-level pasid setup
  2020-01-02  0:18 [PULL REQUEST] iommu/vt-d: patches for v5.6 Lu Baolu
                   ` (13 preceding siblings ...)
  2020-01-02  0:18 ` [PATCH 14/22] iommu/vt-d: Add set domain DOMAIN_ATTR_NESTING attr Lu Baolu
@ 2020-01-02  0:18 ` Lu Baolu
  2020-01-02  0:18 ` [PATCH 16/22] iommu/vt-d: Setup pasid entries for iova over first level Lu Baolu
                   ` (7 subsequent siblings)
  22 siblings, 0 replies; 41+ messages in thread
From: Lu Baolu @ 2020-01-02  0:18 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: iommu

Currently, intel_pasid_setup_first_level() uses 5-level paging for
first-level translation whenever the CPU uses 5-level paging mode.
This makes sense for SVA usages, since the page table is shared
between CPUs and IOMMUs. But it makes no sense if we only want
to use the first level for IOVA translation. Add a PASID_FLAG_FL5LP
bit to the flags, which indicates whether 5-level paging
mode should be used.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel-pasid.c | 7 ++-----
 drivers/iommu/intel-pasid.h | 6 ++++++
 drivers/iommu/intel-svm.c   | 8 ++++++--
 3 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c
index 3cb569e76642..22b30f10b396 100644
--- a/drivers/iommu/intel-pasid.c
+++ b/drivers/iommu/intel-pasid.c
@@ -477,18 +477,15 @@ int intel_pasid_setup_first_level(struct intel_iommu *iommu,
 		pasid_set_sre(pte);
 	}
 
-#ifdef CONFIG_X86
-	/* Both CPU and IOMMU paging mode need to match */
-	if (cpu_feature_enabled(X86_FEATURE_LA57)) {
+	if (flags & PASID_FLAG_FL5LP) {
 		if (cap_5lp_support(iommu->cap)) {
 			pasid_set_flpm(pte, 1);
 		} else {
-			pr_err("VT-d has no 5-level paging support for CPU\n");
+			pr_err("No 5-level paging support for first-level\n");
 			pasid_clear_entry(pte);
 			return -EINVAL;
 		}
 	}
-#endif /* CONFIG_X86 */
 
 	pasid_set_domain_id(pte, did);
 	pasid_set_address_width(pte, iommu->agaw);
diff --git a/drivers/iommu/intel-pasid.h b/drivers/iommu/intel-pasid.h
index fc8cd8f17de1..92de6df24ccb 100644
--- a/drivers/iommu/intel-pasid.h
+++ b/drivers/iommu/intel-pasid.h
@@ -37,6 +37,12 @@
  */
 #define PASID_FLAG_SUPERVISOR_MODE	BIT(0)
 
+/*
+ * The PASID_FLAG_FL5LP flag Indicates using 5-level paging for first-
+ * level translation, otherwise, 4-level paging will be used.
+ */
+#define PASID_FLAG_FL5LP		BIT(1)
+
 struct pasid_dir_entry {
 	u64 val;
 };
diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index 04023033b79f..d7f2a5358900 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -364,7 +364,9 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_
 		ret = intel_pasid_setup_first_level(iommu, dev,
 				mm ? mm->pgd : init_mm.pgd,
 				svm->pasid, FLPT_DEFAULT_DID,
-				mm ? 0 : PASID_FLAG_SUPERVISOR_MODE);
+				(mm ? 0 : PASID_FLAG_SUPERVISOR_MODE) |
+				(cpu_feature_enabled(X86_FEATURE_LA57) ?
+				 PASID_FLAG_FL5LP : 0));
 		spin_unlock(&iommu->lock);
 		if (ret) {
 			if (mm)
@@ -385,7 +387,9 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_
 		ret = intel_pasid_setup_first_level(iommu, dev,
 						mm ? mm->pgd : init_mm.pgd,
 						svm->pasid, FLPT_DEFAULT_DID,
-						mm ? 0 : PASID_FLAG_SUPERVISOR_MODE);
+						(mm ? 0 : PASID_FLAG_SUPERVISOR_MODE) |
+						(cpu_feature_enabled(X86_FEATURE_LA57) ?
+						PASID_FLAG_FL5LP : 0));
 		spin_unlock(&iommu->lock);
 		if (ret) {
 			kfree(sdev);
-- 
2.17.1


* [PATCH 16/22] iommu/vt-d: Setup pasid entries for iova over first level
  2020-01-02  0:18 [PULL REQUEST] iommu/vt-d: patches for v5.6 Lu Baolu
                   ` (14 preceding siblings ...)
  2020-01-02  0:18 ` [PATCH 15/22] iommu/vt-d: Add PASID_FLAG_FL5LP for first-level pasid setup Lu Baolu
@ 2020-01-02  0:18 ` Lu Baolu
  2020-01-02  0:18 ` [PATCH 17/22] iommu/vt-d: Flush PASID-based iotlb " Lu Baolu
                   ` (6 subsequent siblings)
  22 siblings, 0 replies; 41+ messages in thread
From: Lu Baolu @ 2020-01-02  0:18 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: iommu

Intel VT-d in scalable mode supports two types of page tables for
IOVA translation: first level and second level. The IOMMU driver
can choose either of them for IOVA translation according to the use
case. This sets up the pasid entry if a domain is selected to use
the first-level page table for IOVA translation.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel-iommu.c | 57 +++++++++++++++++++++++++++++++++----
 include/linux/intel-iommu.h | 16 +++++++----
 2 files changed, 62 insertions(+), 11 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 35f65628202c..071cbc172ce8 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -571,6 +571,11 @@ static inline int domain_type_is_si(struct dmar_domain *domain)
 	return domain->flags & DOMAIN_FLAG_STATIC_IDENTITY;
 }
 
+static inline bool domain_use_first_level(struct dmar_domain *domain)
+{
+	return domain->flags & DOMAIN_FLAG_USE_FIRST_LEVEL;
+}
+
 static inline int domain_pfn_supported(struct dmar_domain *domain,
 				       unsigned long pfn)
 {
@@ -932,6 +937,8 @@ static struct dma_pte *pfn_to_dma_pte(struct dmar_domain *domain,
 
 			domain_flush_cache(domain, tmp_page, VTD_PAGE_SIZE);
 			pteval = ((uint64_t)virt_to_dma_pfn(tmp_page) << VTD_PAGE_SHIFT) | DMA_PTE_READ | DMA_PTE_WRITE;
+			if (domain_use_first_level(domain))
+				pteval |= DMA_FL_PTE_XD;
 			if (cmpxchg64(&pte->val, 0ULL, pteval))
 				/* Someone else set it while we were thinking; use theirs. */
 				free_pgtable_page(tmp_page);
@@ -2281,17 +2288,20 @@ static int __domain_mapping(struct dmar_domain *domain, unsigned long iov_pfn,
 	unsigned long sg_res = 0;
 	unsigned int largepage_lvl = 0;
 	unsigned long lvl_pages = 0;
+	u64 attr;
 
 	BUG_ON(!domain_pfn_supported(domain, iov_pfn + nr_pages - 1));
 
 	if ((prot & (DMA_PTE_READ|DMA_PTE_WRITE)) == 0)
 		return -EINVAL;
 
-	prot &= DMA_PTE_READ | DMA_PTE_WRITE | DMA_PTE_SNP;
+	attr = prot & (DMA_PTE_READ | DMA_PTE_WRITE | DMA_PTE_SNP);
+	if (domain_use_first_level(domain))
+		attr |= DMA_FL_PTE_PRESENT | DMA_FL_PTE_XD;
 
 	if (!sg) {
 		sg_res = nr_pages;
-		pteval = ((phys_addr_t)phys_pfn << VTD_PAGE_SHIFT) | prot;
+		pteval = ((phys_addr_t)phys_pfn << VTD_PAGE_SHIFT) | attr;
 	}
 
 	while (nr_pages > 0) {
@@ -2303,7 +2313,7 @@ static int __domain_mapping(struct dmar_domain *domain, unsigned long iov_pfn,
 			sg_res = aligned_nrpages(sg->offset, sg->length);
 			sg->dma_address = ((dma_addr_t)iov_pfn << VTD_PAGE_SHIFT) + pgoff;
 			sg->dma_length = sg->length;
-			pteval = (sg_phys(sg) - pgoff) | prot;
+			pteval = (sg_phys(sg) - pgoff) | attr;
 			phys_pfn = pteval >> VTD_PAGE_SHIFT;
 		}
 
@@ -2515,6 +2525,36 @@ dmar_search_domain_by_dev_info(int segment, int bus, int devfn)
 	return NULL;
 }
 
+static int domain_setup_first_level(struct intel_iommu *iommu,
+				    struct dmar_domain *domain,
+				    struct device *dev,
+				    int pasid)
+{
+	int flags = PASID_FLAG_SUPERVISOR_MODE;
+	struct dma_pte *pgd = domain->pgd;
+	int agaw, level;
+
+	/*
+	 * Skip top levels of page tables for iommu which has
+	 * less agaw than default. Unnecessary for PT mode.
+	 */
+	for (agaw = domain->agaw; agaw > iommu->agaw; agaw--) {
+		pgd = phys_to_virt(dma_pte_addr(pgd));
+		if (!dma_pte_present(pgd))
+			return -ENOMEM;
+	}
+
+	level = agaw_to_level(agaw);
+	if (level != 4 && level != 5)
+		return -EINVAL;
+
+	flags |= (level == 5) ? PASID_FLAG_FL5LP : 0;
+
+	return intel_pasid_setup_first_level(iommu, dev, (pgd_t *)pgd, pasid,
+					     domain->iommu_did[iommu->seq_id],
+					     flags);
+}
+
 static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu,
 						    int bus, int devfn,
 						    struct device *dev,
@@ -2614,6 +2654,9 @@ static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu,
 		if (hw_pass_through && domain_type_is_si(domain))
 			ret = intel_pasid_setup_pass_through(iommu, domain,
 					dev, PASID_RID2PASID);
+		else if (domain_use_first_level(domain))
+			ret = domain_setup_first_level(iommu, domain, dev,
+					PASID_RID2PASID);
 		else
 			ret = intel_pasid_setup_second_level(iommu, domain,
 					dev, PASID_RID2PASID);
@@ -5374,8 +5417,12 @@ static int aux_domain_add_dev(struct dmar_domain *domain,
 		goto attach_failed;
 
 	/* Setup the PASID entry for mediated devices: */
-	ret = intel_pasid_setup_second_level(iommu, domain, dev,
-					     domain->default_pasid);
+	if (domain_use_first_level(domain))
+		ret = domain_setup_first_level(iommu, domain, dev,
+					       domain->default_pasid);
+	else
+		ret = intel_pasid_setup_second_level(iommu, domain, dev,
+						     domain->default_pasid);
 	if (ret)
 		goto table_failed;
 	spin_unlock(&iommu->lock);
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index aaece25c055f..454c69712131 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -34,10 +34,13 @@
 #define VTD_STRIDE_SHIFT        (9)
 #define VTD_STRIDE_MASK         (((u64)-1) << VTD_STRIDE_SHIFT)
 
-#define DMA_PTE_READ (1)
-#define DMA_PTE_WRITE (2)
-#define DMA_PTE_LARGE_PAGE (1 << 7)
-#define DMA_PTE_SNP (1 << 11)
+#define DMA_PTE_READ		BIT_ULL(0)
+#define DMA_PTE_WRITE		BIT_ULL(1)
+#define DMA_PTE_LARGE_PAGE	BIT_ULL(7)
+#define DMA_PTE_SNP		BIT_ULL(11)
+
+#define DMA_FL_PTE_PRESENT	BIT_ULL(0)
+#define DMA_FL_PTE_XD		BIT_ULL(63)
 
 #define CONTEXT_TT_MULTI_LEVEL	0
 #define CONTEXT_TT_DEV_IOTLB	1
@@ -610,10 +613,11 @@ static inline void dma_clear_pte(struct dma_pte *pte)
 static inline u64 dma_pte_addr(struct dma_pte *pte)
 {
 #ifdef CONFIG_64BIT
-	return pte->val & VTD_PAGE_MASK;
+	return pte->val & VTD_PAGE_MASK & (~DMA_FL_PTE_XD);
 #else
 	/* Must have a full atomic 64-bit read */
-	return  __cmpxchg64(&pte->val, 0ULL, 0ULL) & VTD_PAGE_MASK;
+	return  __cmpxchg64(&pte->val, 0ULL, 0ULL) &
+			VTD_PAGE_MASK & (~DMA_FL_PTE_XD);
 #endif
 }
 
-- 
2.17.1


* [PATCH 17/22] iommu/vt-d: Flush PASID-based iotlb for iova over first level
  2020-01-02  0:18 [PULL REQUEST] iommu/vt-d: patches for v5.6 Lu Baolu
                   ` (15 preceding siblings ...)
  2020-01-02  0:18 ` [PATCH 16/22] iommu/vt-d: Setup pasid entries for iova over first level Lu Baolu
@ 2020-01-02  0:18 ` Lu Baolu
  2020-01-02  0:18 ` [PATCH 18/22] iommu/vt-d: Make first level IOVA canonical Lu Baolu
                   ` (5 subsequent siblings)
  22 siblings, 0 replies; 41+ messages in thread
From: Lu Baolu @ 2020-01-02  0:18 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: iommu

When software has changed first-level tables, it should invalidate
the affected IOTLB and paging-structure caches using the PASID-
based-IOTLB Invalidate Descriptor defined in the VT-d spec,
Section 6.5.2.4.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/dmar.c        | 41 +++++++++++++++++++++++++++
 drivers/iommu/intel-iommu.c | 56 +++++++++++++++++++++++++++----------
 include/linux/intel-iommu.h |  2 ++
 3 files changed, 84 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
index 3acfa6a25fa2..fb30d5053664 100644
--- a/drivers/iommu/dmar.c
+++ b/drivers/iommu/dmar.c
@@ -1371,6 +1371,47 @@ void qi_flush_dev_iotlb(struct intel_iommu *iommu, u16 sid, u16 pfsid,
 	qi_submit_sync(&desc, iommu);
 }
 
+/* PASID-based IOTLB invalidation */
+void qi_flush_piotlb(struct intel_iommu *iommu, u16 did, u32 pasid, u64 addr,
+		     unsigned long npages, bool ih)
+{
+	struct qi_desc desc = {.qw2 = 0, .qw3 = 0};
+
+	/*
+	 * npages == -1 means a PASID-selective invalidation, otherwise,
+	 * a positive value for Page-selective-within-PASID invalidation.
+	 * 0 is not a valid input.
+	 */
+	if (WARN_ON(!npages)) {
+		pr_err("Invalid input npages = %ld\n", npages);
+		return;
+	}
+
+	if (npages == -1) {
+		desc.qw0 = QI_EIOTLB_PASID(pasid) |
+				QI_EIOTLB_DID(did) |
+				QI_EIOTLB_GRAN(QI_GRAN_NONG_PASID) |
+				QI_EIOTLB_TYPE;
+		desc.qw1 = 0;
+	} else {
+		int mask = ilog2(__roundup_pow_of_two(npages));
+		unsigned long align = (1ULL << (VTD_PAGE_SHIFT + mask));
+
+		if (WARN_ON_ONCE(!IS_ALIGNED(addr, align)))
+			addr &= ~(align - 1);
+
+		desc.qw0 = QI_EIOTLB_PASID(pasid) |
+				QI_EIOTLB_DID(did) |
+				QI_EIOTLB_GRAN(QI_GRAN_PSI_PASID) |
+				QI_EIOTLB_TYPE;
+		desc.qw1 = QI_EIOTLB_ADDR(addr) |
+				QI_EIOTLB_IH(ih) |
+				QI_EIOTLB_AM(mask);
+	}
+
+	qi_submit_sync(&desc, iommu);
+}
+
 /*
  * Disable Queued Invalidation interface.
  */
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 071cbc172ce8..54db6bc0b281 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -1509,6 +1509,20 @@ static void iommu_flush_dev_iotlb(struct dmar_domain *domain,
 	spin_unlock_irqrestore(&device_domain_lock, flags);
 }
 
+static void domain_flush_piotlb(struct intel_iommu *iommu,
+				struct dmar_domain *domain,
+				u64 addr, unsigned long npages, bool ih)
+{
+	u16 did = domain->iommu_did[iommu->seq_id];
+
+	if (domain->default_pasid)
+		qi_flush_piotlb(iommu, did, domain->default_pasid,
+				addr, npages, ih);
+
+	if (!list_empty(&domain->devices))
+		qi_flush_piotlb(iommu, did, PASID_RID2PASID, addr, npages, ih);
+}
+
 static void iommu_flush_iotlb_psi(struct intel_iommu *iommu,
 				  struct dmar_domain *domain,
 				  unsigned long pfn, unsigned int pages,
@@ -1522,18 +1536,23 @@ static void iommu_flush_iotlb_psi(struct intel_iommu *iommu,
 
 	if (ih)
 		ih = 1 << 6;
-	/*
-	 * Fallback to domain selective flush if no PSI support or the size is
-	 * too big.
-	 * PSI requires page size to be 2 ^ x, and the base address is naturally
-	 * aligned to the size
-	 */
-	if (!cap_pgsel_inv(iommu->cap) || mask > cap_max_amask_val(iommu->cap))
-		iommu->flush.flush_iotlb(iommu, did, 0, 0,
-						DMA_TLB_DSI_FLUSH);
-	else
-		iommu->flush.flush_iotlb(iommu, did, addr | ih, mask,
-						DMA_TLB_PSI_FLUSH);
+
+	if (domain_use_first_level(domain)) {
+		domain_flush_piotlb(iommu, domain, addr, pages, ih);
+	} else {
+		/*
+		 * Fallback to domain selective flush if no PSI support or
+		 * the size is too big. PSI requires page size to be 2 ^ x,
+		 * and the base address is naturally aligned to the size.
+		 */
+		if (!cap_pgsel_inv(iommu->cap) ||
+		    mask > cap_max_amask_val(iommu->cap))
+			iommu->flush.flush_iotlb(iommu, did, 0, 0,
+							DMA_TLB_DSI_FLUSH);
+		else
+			iommu->flush.flush_iotlb(iommu, did, addr | ih, mask,
+							DMA_TLB_PSI_FLUSH);
+	}
 
 	/*
 	 * In caching mode, changes of pages from non-present to present require
@@ -1548,8 +1567,11 @@ static inline void __mapping_notify_one(struct intel_iommu *iommu,
 					struct dmar_domain *domain,
 					unsigned long pfn, unsigned int pages)
 {
-	/* It's a non-present to present mapping. Only flush if caching mode */
-	if (cap_caching_mode(iommu->cap))
+	/*
+	 * It's a non-present to present mapping. Only flush if caching mode
+	 * and second level.
+	 */
+	if (cap_caching_mode(iommu->cap) && !domain_use_first_level(domain))
 		iommu_flush_iotlb_psi(iommu, domain, pfn, pages, 0, 1);
 	else
 		iommu_flush_write_buffer(iommu);
@@ -1566,7 +1588,11 @@ static void iommu_flush_iova(struct iova_domain *iovad)
 		struct intel_iommu *iommu = g_iommus[idx];
 		u16 did = domain->iommu_did[iommu->seq_id];
 
-		iommu->flush.flush_iotlb(iommu, did, 0, 0, DMA_TLB_DSI_FLUSH);
+		if (domain_use_first_level(domain))
+			domain_flush_piotlb(iommu, domain, 0, -1, 0);
+		else
+			iommu->flush.flush_iotlb(iommu, did, 0, 0,
+						 DMA_TLB_DSI_FLUSH);
 
 		if (!cap_caching_mode(iommu->cap))
 			iommu_flush_dev_iotlb(get_iommu_domain(iommu, did),
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 454c69712131..3a4708a8a414 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -650,6 +650,8 @@ extern void qi_flush_iotlb(struct intel_iommu *iommu, u16 did, u64 addr,
 			  unsigned int size_order, u64 type);
 extern void qi_flush_dev_iotlb(struct intel_iommu *iommu, u16 sid, u16 pfsid,
 			u16 qdep, u64 addr, unsigned mask);
+void qi_flush_piotlb(struct intel_iommu *iommu, u16 did, u32 pasid, u64 addr,
+		     unsigned long npages, bool ih);
 extern int qi_submit_sync(struct qi_desc *desc, struct intel_iommu *iommu);
 
 extern int dmar_ir_support(void);
-- 
2.17.1
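For readers following the descriptor encoding above, the address-mask (AM)
computation used by qi_flush_piotlb() for page-selective-within-PASID
invalidation can be sketched in standalone C. The helpers below are
illustrative stand-ins for the kernel's ilog2() and __roundup_pow_of_two(),
not kernel APIs:

```c
#include <assert.h>
#include <stdint.h>

#define VTD_PAGE_SHIFT 12

/* Stand-in for the kernel's ilog2(): floor(log2(v)) for v > 0. */
static unsigned int ilog2_ul(unsigned long v)
{
	unsigned int r = 0;

	while (v >>= 1)
		r++;
	return r;
}

/* Stand-in for the kernel's __roundup_pow_of_two(). */
static unsigned long roundup_pow_of_two_ul(unsigned long v)
{
	unsigned long p = 1;

	while (p < v)
		p <<= 1;
	return p;
}

/*
 * AM encoding as in qi_flush_piotlb(): npages is rounded up to a
 * power of two and AM is its log2; the invalidation then covers
 * 2^AM pages starting at an address aligned to that span.
 */
static unsigned int piotlb_addr_mask(unsigned long npages)
{
	return ilog2_ul(roundup_pow_of_two_ul(npages));
}

/* Force the target address down to the alignment AM requires. */
static uint64_t piotlb_align_addr(uint64_t addr, unsigned int mask)
{
	uint64_t align = 1ULL << (VTD_PAGE_SHIFT + mask);

	return addr & ~(align - 1);
}
```

With this encoding, a 9-page flush is widened to a 16-page (AM=4) span,
which is why the driver warns and aligns the address when the caller's
base address is not naturally aligned to the widened region.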

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 18/22] iommu/vt-d: Make first level IOVA canonical
  2020-01-02  0:18 [PULL REQUEST] iommu/vt-d: patches for v5.6 Lu Baolu
                   ` (16 preceding siblings ...)
  2020-01-02  0:18 ` [PATCH 17/22] iommu/vt-d: Flush PASID-based iotlb " Lu Baolu
@ 2020-01-02  0:18 ` Lu Baolu
  2020-01-02  0:18 ` [PATCH 19/22] iommu/vt-d: Update first level super page capability Lu Baolu
                   ` (4 subsequent siblings)
  22 siblings, 0 replies; 41+ messages in thread
From: Lu Baolu @ 2020-01-02  0:18 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: iommu

First-level translation restricts the input address to a canonical
address (i.e., address bits 63:N have the same value as address
bit [N-1], where N is 48 with 4-level paging and 57 with 5-level
paging); see section 3.6 of the spec.

Make the first-level IOVA canonical by allocating IOVA with bit
[N-1] always cleared.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel-iommu.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 54db6bc0b281..1ebf5ed460cf 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3505,8 +3505,21 @@ static unsigned long intel_alloc_iova(struct device *dev,
 {
 	unsigned long iova_pfn;
 
-	/* Restrict dma_mask to the width that the iommu can handle */
-	dma_mask = min_t(uint64_t, DOMAIN_MAX_ADDR(domain->gaw), dma_mask);
+	/*
+	 * Restrict dma_mask to the width that the iommu can handle.
+	 * First-level translation restricts the input-address to a
+	 * canonical address (i.e., address bits 63:N have the same
+	 * value as address bit [N-1], where N is 48-bits with 4-level
+	 * paging and 57-bits with 5-level paging). Hence, skip bit
+	 * [N-1].
+	 */
+	if (domain_use_first_level(domain))
+		dma_mask = min_t(uint64_t, DOMAIN_MAX_ADDR(domain->gaw - 1),
+				 dma_mask);
+	else
+		dma_mask = min_t(uint64_t, DOMAIN_MAX_ADDR(domain->gaw),
+				 dma_mask);
+
 	/* Ensure we reserve the whole size-aligned region */
 	nrpages = __roundup_pow_of_two(nrpages);
 
-- 
2.17.1
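The canonicality rule the patch relies on can be checked in standalone C:
keeping bit [N-1] clear forces bits 63:N-1 to all be zero, which trivially
satisfies "bits 63:N equal bit N-1". This is a sketch, not kernel code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * An address is canonical for an N-bit width when sign-extending
 * from bit N-1 reproduces the address. (The arithmetic right shift
 * of a negative value is the common two's-complement behavior of
 * gcc/clang, assumed here.)
 */
static bool is_canonical(uint64_t addr, unsigned int width)
{
	unsigned int shift = 64 - width;

	return (uint64_t)((int64_t)(addr << shift) >> shift) == addr;
}
```

Every IOVA the patch can allocate has bit 47 (or 56) clear, so it falls in
the lower canonical half for 4-level (or 5-level) paging.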


* [PATCH 19/22] iommu/vt-d: Update first level super page capability
  2020-01-02  0:18 [PULL REQUEST] iommu/vt-d: patches for v5.6 Lu Baolu
                   ` (17 preceding siblings ...)
  2020-01-02  0:18 ` [PATCH 18/22] iommu/vt-d: Make first level IOVA canonical Lu Baolu
@ 2020-01-02  0:18 ` Lu Baolu
  2020-01-02  0:18 ` [PATCH 20/22] iommu/vt-d: Use iova over first level Lu Baolu
                   ` (3 subsequent siblings)
  22 siblings, 0 replies; 41+ messages in thread
From: Lu Baolu @ 2020-01-02  0:18 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: iommu

First-level translation may map input addresses to 4-KByte pages,
2-MByte pages, or 1-GByte pages. Support for 4-KByte and 2-MByte
pages is mandatory for first-level translation. Hardware support
for 1-GByte pages is reported through the FL1GP field in the
Capability Register.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel-iommu.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 1ebf5ed460cf..34e619318f64 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -685,11 +685,12 @@ static int domain_update_iommu_snooping(struct intel_iommu *skip)
 	return ret;
 }
 
-static int domain_update_iommu_superpage(struct intel_iommu *skip)
+static int domain_update_iommu_superpage(struct dmar_domain *domain,
+					 struct intel_iommu *skip)
 {
 	struct dmar_drhd_unit *drhd;
 	struct intel_iommu *iommu;
-	int mask = 0xf;
+	int mask = 0x3;
 
 	if (!intel_iommu_superpage) {
 		return 0;
@@ -699,7 +700,13 @@ static int domain_update_iommu_superpage(struct intel_iommu *skip)
 	rcu_read_lock();
 	for_each_active_iommu(iommu, drhd) {
 		if (iommu != skip) {
-			mask &= cap_super_page_val(iommu->cap);
+			if (domain && domain_use_first_level(domain)) {
+				if (!cap_fl1gp_support(iommu->cap))
+					mask = 0x1;
+			} else {
+				mask &= cap_super_page_val(iommu->cap);
+			}
+
 			if (!mask)
 				break;
 		}
@@ -714,7 +721,7 @@ static void domain_update_iommu_cap(struct dmar_domain *domain)
 {
 	domain_update_iommu_coherency(domain);
 	domain->iommu_snooping = domain_update_iommu_snooping(NULL);
-	domain->iommu_superpage = domain_update_iommu_superpage(NULL);
+	domain->iommu_superpage = domain_update_iommu_superpage(domain, NULL);
 }
 
 struct context_entry *iommu_context_addr(struct intel_iommu *iommu, u8 bus,
@@ -4604,7 +4611,7 @@ static int intel_iommu_add(struct dmar_drhd_unit *dmaru)
 			iommu->name);
 		return -ENXIO;
 	}
-	sp = domain_update_iommu_superpage(iommu) - 1;
+	sp = domain_update_iommu_superpage(NULL, iommu) - 1;
 	if (sp >= 0 && !(cap_super_page_val(iommu->cap) & (1 << sp))) {
 		pr_warn("%s: Doesn't support large page.\n",
 			iommu->name);
-- 
2.17.1
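The mask negotiation above can be sketched outside the driver: bit 0 of the
mask means 2-MByte pages, bit 1 means 1-GByte pages. For first-level
domains, 2-MByte support is mandatory, so the mask starts at 0x3 and only
drops to 0x1 when some IOMMU lacks FL1GP. The names here are illustrative,
not kernel code:

```c
#include <assert.h>
#include <stdbool.h>

#define SP_2M 0x1	/* bit 0: 2-MByte pages */
#define SP_1G 0x2	/* bit 1: 1-GByte pages */

/*
 * Hypothetical sketch of the first-level super-page mask negotiation:
 * fl1gp[i] says whether IOMMU i reports FL1GP in its capability register.
 */
static int fl_superpage_mask(int niommus, const bool *fl1gp)
{
	int mask = SP_2M | SP_1G;	/* 0x3: 2M mandatory, assume 1G */
	int i;

	for (i = 0; i < niommus; i++)
		if (!fl1gp[i])
			mask = SP_2M;	/* any IOMMU without FL1GP drops 1G */
	return mask;
}
```

Contrast with second-level translation, where no page size is mandatory and
the mask is simply ANDed down from each IOMMU's SLLPS field.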


* [PATCH 20/22] iommu/vt-d: Use iova over first level
  2020-01-02  0:18 [PULL REQUEST] iommu/vt-d: patches for v5.6 Lu Baolu
                   ` (18 preceding siblings ...)
  2020-01-02  0:18 ` [PATCH 19/22] iommu/vt-d: Update first level super page capability Lu Baolu
@ 2020-01-02  0:18 ` Lu Baolu
  2020-01-02  0:18 ` [PATCH 21/22] iommu/vt-d: debugfs: Add support to show page table internals Lu Baolu
                   ` (2 subsequent siblings)
  22 siblings, 0 replies; 41+ messages in thread
From: Lu Baolu @ 2020-01-02  0:18 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: iommu

Now that all map/unmap paths support the first-level page table,
turn it on by default if the hardware supports scalable mode.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel-iommu.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 34e619318f64..51d60bad0b1d 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -1770,15 +1770,13 @@ static void free_dmar_iommu(struct intel_iommu *iommu)
 
 /*
  * Check and return whether first level is used by default for
- * DMA translation. Currently, we make it off by setting
- * first_level_support = 0, and will change it to -1 after all
- * map/unmap paths support first level page table.
+ * DMA translation.
  */
 static bool first_level_by_default(void)
 {
 	struct dmar_drhd_unit *drhd;
 	struct intel_iommu *iommu;
-	static int first_level_support = 0;
+	static int first_level_support = -1;
 
 	if (likely(first_level_support != -1))
 		return first_level_support;
-- 
2.17.1


* [PATCH 21/22] iommu/vt-d: debugfs: Add support to show page table internals
  2020-01-02  0:18 [PULL REQUEST] iommu/vt-d: patches for v5.6 Lu Baolu
                   ` (19 preceding siblings ...)
  2020-01-02  0:18 ` [PATCH 20/22] iommu/vt-d: Use iova over first level Lu Baolu
@ 2020-01-02  0:18 ` Lu Baolu
  2020-01-02  0:18 ` [PATCH 22/22] iommu/vt-d: Add a quirk flag for scope mismatched devices Lu Baolu
  2020-01-07 13:06 ` [PULL REQUEST] iommu/vt-d: patches for v5.6 Joerg Roedel
  22 siblings, 0 replies; 41+ messages in thread
From: Lu Baolu @ 2020-01-02  0:18 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: iommu

Export page table internals of the domain attached to each device.
Example of such dump on a Skylake machine:

$ sudo cat /sys/kernel/debug/iommu/intel/domain_translation_struct
[ ... ]
Device 0000:00:14.0 with pasid 0 @0x15f3d9000
IOVA_PFN                PML5E                   PML4E
0x000000008ced0 |       0x0000000000000000      0x000000015f3da003
0x000000008ced1 |       0x0000000000000000      0x000000015f3da003
0x000000008ced2 |       0x0000000000000000      0x000000015f3da003
0x000000008ced3 |       0x0000000000000000      0x000000015f3da003
0x000000008ced4 |       0x0000000000000000      0x000000015f3da003
0x000000008ced5 |       0x0000000000000000      0x000000015f3da003
0x000000008ced6 |       0x0000000000000000      0x000000015f3da003
0x000000008ced7 |       0x0000000000000000      0x000000015f3da003
0x000000008ced8 |       0x0000000000000000      0x000000015f3da003
0x000000008ced9 |       0x0000000000000000      0x000000015f3da003

PDPE                    PDE                     PTE
0x000000015f3db003      0x000000015f3dc003      0x000000008ced0003
0x000000015f3db003      0x000000015f3dc003      0x000000008ced1003
0x000000015f3db003      0x000000015f3dc003      0x000000008ced2003
0x000000015f3db003      0x000000015f3dc003      0x000000008ced3003
0x000000015f3db003      0x000000015f3dc003      0x000000008ced4003
0x000000015f3db003      0x000000015f3dc003      0x000000008ced5003
0x000000015f3db003      0x000000015f3dc003      0x000000008ced6003
0x000000015f3db003      0x000000015f3dc003      0x000000008ced7003
0x000000015f3db003      0x000000015f3dc003      0x000000008ced8003
0x000000015f3db003      0x000000015f3dc003      0x000000008ced9003
[ ... ]

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel-iommu-debugfs.c | 75 +++++++++++++++++++++++++++++
 drivers/iommu/intel-iommu.c         |  4 +-
 include/linux/intel-iommu.h         |  2 +
 3 files changed, 79 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/intel-iommu-debugfs.c b/drivers/iommu/intel-iommu-debugfs.c
index 471f05d452e0..c1257bef553c 100644
--- a/drivers/iommu/intel-iommu-debugfs.c
+++ b/drivers/iommu/intel-iommu-debugfs.c
@@ -5,6 +5,7 @@
  * Authors: Gayatri Kammela <gayatri.kammela@intel.com>
  *	    Sohil Mehta <sohil.mehta@intel.com>
  *	    Jacob Pan <jacob.jun.pan@linux.intel.com>
+ *	    Lu Baolu <baolu.lu@linux.intel.com>
  */
 
 #include <linux/debugfs.h>
@@ -283,6 +284,77 @@ static int dmar_translation_struct_show(struct seq_file *m, void *unused)
 }
 DEFINE_SHOW_ATTRIBUTE(dmar_translation_struct);
 
+static inline unsigned long level_to_directory_size(int level)
+{
+	return BIT_ULL(VTD_PAGE_SHIFT + VTD_STRIDE_SHIFT * (level - 1));
+}
+
+static inline void
+dump_page_info(struct seq_file *m, unsigned long iova, u64 *path)
+{
+	seq_printf(m, "0x%013lx |\t0x%016llx\t0x%016llx\t0x%016llx\t0x%016llx\t0x%016llx\n",
+		   iova >> VTD_PAGE_SHIFT, path[5], path[4],
+		   path[3], path[2], path[1]);
+}
+
+static void pgtable_walk_level(struct seq_file *m, struct dma_pte *pde,
+			       int level, unsigned long start,
+			       u64 *path)
+{
+	int i;
+
+	if (level > 5 || level < 1)
+		return;
+
+	for (i = 0; i < BIT_ULL(VTD_STRIDE_SHIFT);
+			i++, pde++, start += level_to_directory_size(level)) {
+		if (!dma_pte_present(pde))
+			continue;
+
+		path[level] = pde->val;
+		if (dma_pte_superpage(pde) || level == 1)
+			dump_page_info(m, start, path);
+		else
+			pgtable_walk_level(m, phys_to_virt(dma_pte_addr(pde)),
+					   level - 1, start, path);
+		path[level] = 0;
+	}
+}
+
+static int show_device_domain_translation(struct device *dev, void *data)
+{
+	struct dmar_domain *domain = find_domain(dev);
+	struct seq_file *m = data;
+	u64 path[6] = { 0 };
+
+	if (!domain)
+		return 0;
+
+	seq_printf(m, "Device %s with pasid %d @0x%llx\n",
+		   dev_name(dev), domain->default_pasid,
+		   (u64)virt_to_phys(domain->pgd));
+	seq_puts(m, "IOVA_PFN\t\tPML5E\t\t\tPML4E\t\t\tPDPE\t\t\tPDE\t\t\tPTE\n");
+
+	pgtable_walk_level(m, domain->pgd, domain->agaw + 2, 0, path);
+	seq_putc(m, '\n');
+
+	return 0;
+}
+
+static int domain_translation_struct_show(struct seq_file *m, void *unused)
+{
+	unsigned long flags;
+	int ret;
+
+	spin_lock_irqsave(&device_domain_lock, flags);
+	ret = bus_for_each_dev(&pci_bus_type, NULL, m,
+			       show_device_domain_translation);
+	spin_unlock_irqrestore(&device_domain_lock, flags);
+
+	return ret;
+}
+DEFINE_SHOW_ATTRIBUTE(domain_translation_struct);
+
 #ifdef CONFIG_IRQ_REMAP
 static void ir_tbl_remap_entry_show(struct seq_file *m,
 				    struct intel_iommu *iommu)
@@ -396,6 +468,9 @@ void __init intel_iommu_debugfs_init(void)
 			    &iommu_regset_fops);
 	debugfs_create_file("dmar_translation_struct", 0444, intel_iommu_debug,
 			    NULL, &dmar_translation_struct_fops);
+	debugfs_create_file("domain_translation_struct", 0444,
+			    intel_iommu_debug, NULL,
+			    &domain_translation_struct_fops);
 #ifdef CONFIG_IRQ_REMAP
 	debugfs_create_file("ir_translation_struct", 0444, intel_iommu_debug,
 			    NULL, &ir_translation_struct_fops);
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 51d60bad0b1d..609931f6d771 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -396,7 +396,7 @@ EXPORT_SYMBOL_GPL(intel_iommu_gfx_mapped);
 
 #define DUMMY_DEVICE_DOMAIN_INFO ((struct device_domain_info *)(-1))
 #define DEFER_DEVICE_DOMAIN_INFO ((struct device_domain_info *)(-2))
-static DEFINE_SPINLOCK(device_domain_lock);
+DEFINE_SPINLOCK(device_domain_lock);
 static LIST_HEAD(device_domain_list);
 
 #define device_needs_bounce(d) (!intel_no_bounce && dev_is_pci(d) &&	\
@@ -2513,7 +2513,7 @@ static void domain_remove_dev_info(struct dmar_domain *domain)
 	spin_unlock_irqrestore(&device_domain_lock, flags);
 }
 
-static struct dmar_domain *find_domain(struct device *dev)
+struct dmar_domain *find_domain(struct device *dev)
 {
 	struct device_domain_info *info;
 
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 3a4708a8a414..4a16b39ae353 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -441,6 +441,7 @@ enum {
 #define VTD_FLAG_SVM_CAPABLE		(1 << 2)
 
 extern int intel_iommu_sm;
+extern spinlock_t device_domain_lock;
 
 #define sm_supported(iommu)	(intel_iommu_sm && ecap_smts((iommu)->ecap))
 #define pasid_supported(iommu)	(sm_supported(iommu) &&			\
@@ -663,6 +664,7 @@ int for_each_device_domain(int (*fn)(struct device_domain_info *info,
 				     void *data), void *data);
 void iommu_flush_write_buffer(struct intel_iommu *iommu);
 int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct device *dev);
+struct dmar_domain *find_domain(struct device *dev);
 
 #ifdef CONFIG_INTEL_IOMMU_SVM
 extern void intel_svm_check(struct intel_iommu *iommu);
-- 
2.17.1
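The key piece of the dump format is level_to_directory_size(), which gives
the amount of IOVA covered by one entry at a given paging level; mirrored
here as standalone C for illustration:

```c
#include <assert.h>

#define VTD_PAGE_SHIFT 12	/* 4-KByte base page */
#define VTD_STRIDE_SHIFT 9	/* 512 entries per table */

/*
 * IOVA span covered by a single entry at the given level:
 * level 1 (PTE) -> 4 KiB, level 2 (PDE) -> 2 MiB,
 * level 3 (PDPE) -> 1 GiB, and so on up the walk.
 */
static unsigned long long level_to_directory_size(int level)
{
	return 1ULL << (VTD_PAGE_SHIFT + VTD_STRIDE_SHIFT * (level - 1));
}
```

pgtable_walk_level() advances its start IOVA by this amount per entry, which
is how each dumped IOVA_PFN row lines up with the path of table entries that
translate it.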


* [PATCH 22/22] iommu/vt-d: Add a quirk flag for scope mismatched devices
  2020-01-02  0:18 [PULL REQUEST] iommu/vt-d: patches for v5.6 Lu Baolu
                   ` (20 preceding siblings ...)
  2020-01-02  0:18 ` [PATCH 21/22] iommu/vt-d: debugfs: Add support to show page table internals Lu Baolu
@ 2020-01-02  0:18 ` Lu Baolu
  2020-01-02  2:11   ` Roland Dreier via iommu
  2020-01-08 14:16   ` Christoph Hellwig
  2020-01-07 13:06 ` [PULL REQUEST] iommu/vt-d: patches for v5.6 Joerg Roedel
  22 siblings, 2 replies; 41+ messages in thread
From: Lu Baolu @ 2020-01-02  0:18 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Roland Dreier, Jim Yan, iommu

We expect devices with endpoint scope to have normal PCI headers,
and devices with bridge scope to have bridge PCI headers. However,
some PCI devices may be listed in the DMAR table with bridge scope,
even though they have a normal PCI header. Add a quirk flag for
those special devices.

Cc: Roland Dreier <roland@purestorage.com>
Cc: Jim Yan <jimyan@baidu.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Tested-by: Jim Yan <jimyan@baidu.com>
---
 drivers/iommu/dmar.c | 37 +++++++++++++++++++++++--------------
 1 file changed, 23 insertions(+), 14 deletions(-)

diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
index fb30d5053664..fc24abc70a05 100644
--- a/drivers/iommu/dmar.c
+++ b/drivers/iommu/dmar.c
@@ -65,6 +65,26 @@ static void free_iommu(struct intel_iommu *iommu);
 
 extern const struct iommu_ops intel_iommu_ops;
 
+static int scope_mismatch_quirk;
+static void quirk_dmar_scope_mismatch(struct pci_dev *dev)
+{
+	pci_info(dev, "scope mismatch ignored\n");
+	scope_mismatch_quirk = 1;
+}
+
+/*
+ * We expect devices with endpoint scope to have normal PCI
+ * headers, and devices with bridge scope to have bridge PCI
+ * headers.  However, some PCI devices may be listed in the
+ * DMAR table with bridge scope, even though they have a
+ * normal PCI header.  We don't declare a scope mismatch for
+ * the special cases below.
+ */
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2f0d,	/* NTB devices  */
+			 quirk_dmar_scope_mismatch);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2020,	/* NVME host */
+			 quirk_dmar_scope_mismatch);
+
 static void dmar_register_drhd_unit(struct dmar_drhd_unit *drhd)
 {
 	/*
@@ -231,20 +251,9 @@ int dmar_insert_dev_scope(struct dmar_pci_notify_info *info,
 		if (!dmar_match_pci_path(info, scope->bus, path, level))
 			continue;
 
-		/*
-		 * We expect devices with endpoint scope to have normal PCI
-		 * headers, and devices with bridge scope to have bridge PCI
-		 * headers.  However PCI NTB devices may be listed in the
-		 * DMAR table with bridge scope, even though they have a
-		 * normal PCI header.  NTB devices are identified by class
-		 * "BRIDGE_OTHER" (0680h) - we don't declare a socpe mismatch
-		 * for this special case.
-		 */
-		if ((scope->entry_type == ACPI_DMAR_SCOPE_TYPE_ENDPOINT &&
-		     info->dev->hdr_type != PCI_HEADER_TYPE_NORMAL) ||
-		    (scope->entry_type == ACPI_DMAR_SCOPE_TYPE_BRIDGE &&
-		     (info->dev->hdr_type == PCI_HEADER_TYPE_NORMAL &&
-		      info->dev->class >> 8 != PCI_CLASS_BRIDGE_OTHER))) {
+		if (!scope_mismatch_quirk &&
+		    ((scope->entry_type == ACPI_DMAR_SCOPE_TYPE_ENDPOINT) ^
+		     (info->dev->hdr_type == PCI_HEADER_TYPE_NORMAL))) {
 			pr_warn("Device scope type does not match for %s\n",
 				pci_name(info->dev));
 			return -EINVAL;
-- 
2.17.1
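The rewritten check collapses the old multi-clause condition into a single
XOR: a mismatch is an endpoint-scope entry whose device lacks a normal
header, or a bridge-scope entry whose device has one, unless the quirk is
set. A standalone sketch of that truth table (illustrative names, not
kernel code):

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of the simplified dmar_insert_dev_scope() check:
 * scope_is_endpoint - DMAR entry type is ACPI_DMAR_SCOPE_TYPE_ENDPOINT
 * hdr_is_normal     - device has PCI_HEADER_TYPE_NORMAL
 * quirk             - scope_mismatch_quirk was set by a PCI fixup
 */
static bool scope_mismatch(bool scope_is_endpoint, bool hdr_is_normal,
			   bool quirk)
{
	return !quirk && (scope_is_endpoint ^ hdr_is_normal);
}
```

Note the XOR also drops the old PCI_CLASS_BRIDGE_OTHER carve-out, which is
exactly what the thread below discusses: once the quirk fires for any listed
device, mismatches are ignored system-wide.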


* Re: [PATCH 22/22] iommu/vt-d: Add a quirk flag for scope mismatched devices
  2020-01-02  0:18 ` [PATCH 22/22] iommu/vt-d: Add a quirk flag for scope mismatched devices Lu Baolu
@ 2020-01-02  2:11   ` Roland Dreier via iommu
  2020-01-02  2:14     ` Lu Baolu
  2020-01-08 14:16   ` Christoph Hellwig
  1 sibling, 1 reply; 41+ messages in thread
From: Roland Dreier via iommu @ 2020-01-02  2:11 UTC (permalink / raw)
  To: Lu Baolu; +Cc: Jim Yan, iommu

> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2f0d,  /* NTB devices  */
> +                        quirk_dmar_scope_mismatch);
> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2020,  /* NVME host */
> +                        quirk_dmar_scope_mismatch);

what's the motivation for changing the logic into a quirk table, which
has to be maintained with new device IDs?

In particular this has the Haswell NTB ID 2F0Dh but already leaves out
the Broadwell ID 6F0Dh.

 - R.

* Re: [PATCH 22/22] iommu/vt-d: Add a quirk flag for scope mismatched devices
  2020-01-02  2:11   ` Roland Dreier via iommu
@ 2020-01-02  2:14     ` Lu Baolu
  2020-01-02  2:25       ` Roland Dreier via iommu
  0 siblings, 1 reply; 41+ messages in thread
From: Lu Baolu @ 2020-01-02  2:14 UTC (permalink / raw)
  To: Roland Dreier; +Cc: Jim Yan, iommu

Hi,

On 1/2/20 10:11 AM, Roland Dreier wrote:
>> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2f0d,  /* NTB devices  */
>> +                        quirk_dmar_scope_mismatch);
>> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2020,  /* NVME host */
>> +                        quirk_dmar_scope_mismatch);
> 
> what's the motivation for changing the logic into a quirk table, which
> has to be maintained with new device IDs?

We saw more devices with the same mismatch quirk. So maintaining them in
a quirk table will make it more readable and maintainable.

Best regards,
-baolu

> 
> In particular this has the Haswell NTB ID 2F0Dh but already leaves out
> the Broadwell ID 6F0Dh.
> 
>   - R.
> 

* Re: [PATCH 22/22] iommu/vt-d: Add a quirk flag for scope mismatched devices
  2020-01-02  2:14     ` Lu Baolu
@ 2020-01-02  2:25       ` Roland Dreier via iommu
  2020-01-02  2:34         ` Lu Baolu
                           ` (2 more replies)
  0 siblings, 3 replies; 41+ messages in thread
From: Roland Dreier via iommu @ 2020-01-02  2:25 UTC (permalink / raw)
  To: Lu Baolu; +Cc: Jim Yan, iommu

> We saw more devices with the same mismatch quirk. So maintaining them in
> a quirk table will make it more readable and maintainable.

I guess I disagree about the maintainable part, given that this patch
already regresses Broadwell NTB.

I'm not even sure what the DMAR table says about NTB on my Skylake
systems, exactly because the existing code means I did not have any
problems.  But we might need to add device 201Ch too.

Maybe we don't need the mismatch check at all?  Your patch sets the
quirk if any possibly mismatching device is present in the system, so
we'll ignore any scope mismatch on a system with, say, the 8086:2020
NVMe host in it.  So could we just drop the check completely and not
have a quirk to disable the check?

 - R.

* Re: [PATCH 22/22] iommu/vt-d: Add a quirk flag for scope mismatched devices
  2020-01-02  2:25       ` Roland Dreier via iommu
@ 2020-01-02  2:34         ` Lu Baolu
  2020-01-03  0:32         ` Lu Baolu
  2020-01-06 17:05         ` Jerry Snitselaar
  2 siblings, 0 replies; 41+ messages in thread
From: Lu Baolu @ 2020-01-02  2:34 UTC (permalink / raw)
  To: Roland Dreier; +Cc: Jim Yan, iommu

Hi,

On 1/2/20 10:25 AM, Roland Dreier wrote:
>> We saw more devices with the same mismatch quirk. So maintaining them in
>> a quirk table will make it more readable and maintainable.
> 
> I guess I disagree about the maintainable part, given that this patch
> already regresses Broadwell NTB.
> 
> I'm not even sure what the DMAR table says about NTB on my Skylake
> systems, exactly because the existing code means I did not have any
> problems.  But we might need to add device 201Ch too.
> 
> Maybe we don't need the mismatch check at all?  Your patch sets the
> quirk if any possibly mismatching device is present in the system, so
> we'll ignore any scope mismatch on a system with, say, the 8086:2020
> NVMe host in it.  So could we just drop the check completely and not
> have a quirk to disable the check?

Fair enough. Instead of no check, how about putting a pr_info() there
to give the end user a chance to know about this?

Best regards,
-baolu

* Re: [PATCH 22/22] iommu/vt-d: Add a quirk flag for scope mismatched devices
  2020-01-02  2:25       ` Roland Dreier via iommu
  2020-01-02  2:34         ` Lu Baolu
@ 2020-01-03  0:32         ` Lu Baolu
  2020-01-04 16:52           ` Roland Dreier via iommu
  2020-01-06 17:05         ` Jerry Snitselaar
  2 siblings, 1 reply; 41+ messages in thread
From: Lu Baolu @ 2020-01-03  0:32 UTC (permalink / raw)
  To: Roland Dreier; +Cc: Jim Yan, iommu

Hi Roland,

Jim proposed another solution.

https://lkml.org/lkml/2019/12/23/653

Does this work for you?

Best regards,
baolu

On 1/2/20 10:25 AM, Roland Dreier wrote:
>> We saw more devices with the same mismatch quirk. So maintaining them in
>> a quirk table will make it more readable and maintainable.
> 
> I guess I disagree about the maintainable part, given that this patch
> already regresses Broadwell NTB.
> 
> I'm not even sure what the DMAR table says about NTB on my Skylake
> systems, exactly because the existing code means I did not have any
> problems.  But we might need to add device 201Ch too.
> 
> Maybe we don't need the mismatch check at all?  Your patch sets the
> quirk if any possibly mismatching device is present in the system, so
> we'll ignore any scope mismatch on a system with, say, the 8086:2020
> NVMe host in it.  So could we just drop the check completely and not
> have a quirk to disable the check?
> 
>   - R.
> 

* Re: [PATCH 22/22] iommu/vt-d: Add a quirk flag for scope mismatched devices
  2020-01-03  0:32         ` Lu Baolu
@ 2020-01-04 16:52           ` Roland Dreier via iommu
  2020-01-05  3:43             ` Lu Baolu
  0 siblings, 1 reply; 41+ messages in thread
From: Roland Dreier via iommu @ 2020-01-04 16:52 UTC (permalink / raw)
  To: Lu Baolu; +Cc: Jim Yan, iommu

> Jim proposed another solution.
>
> https://lkml.org/lkml/2019/12/23/653
>
> Does this work for you?

Yes, that's OK for the cases I've seen too.  All the NTB devices I've
seen are PCI_CLASS_BRIDGE_OTHER with type 0 headers, so this patch
would not break anything.  And I think the idea of allowing DMAR
bridge scope for all devices with PCI class bridge is logical - BIOS
writers probably are going by PCI class rather than header type when
assigning scope.

 - R.

* Re: [PATCH 22/22] iommu/vt-d: Add a quirk flag for scope mismatched devices
  2020-01-04 16:52           ` Roland Dreier via iommu
@ 2020-01-05  3:43             ` Lu Baolu
  0 siblings, 0 replies; 41+ messages in thread
From: Lu Baolu @ 2020-01-05  3:43 UTC (permalink / raw)
  To: Roland Dreier; +Cc: Jim Yan, iommu

Hi Jim,

On 1/5/20 12:52 AM, Roland Dreier wrote:
>> Jim proposed another solution.
>>
>> https://lkml.org/lkml/2019/12/23/653
>>
>> Does this work for you?
> 
> Yes, that's OK for the cases I've seen too.  All the NTB devices I've
> seen are PCI_CLASS_BRIDGE_OTHER with type 0 headers, so this patch
> would not break anything.  And I think the idea of allowing DMAR
> bridge scope for all devices with PCI class bridge is logical - BIOS
> writers probably are going by PCI class rather than header type when
> assigning scope.

Can you please post a v2 of this patch with the change you proposed in
https://lkml.org/lkml/2019/12/23/653 ?

Best regards,
baolu

* Re: [PATCH 22/22] iommu/vt-d: Add a quirk flag for scope mismatched devices
  2020-01-02  2:25       ` Roland Dreier via iommu
  2020-01-02  2:34         ` Lu Baolu
  2020-01-03  0:32         ` Lu Baolu
@ 2020-01-06 17:05         ` Jerry Snitselaar
  2020-01-07  0:35           ` Lu Baolu
  2 siblings, 1 reply; 41+ messages in thread
From: Jerry Snitselaar @ 2020-01-06 17:05 UTC (permalink / raw)
  To: Roland Dreier; +Cc: Jim Yan, iommu

On Wed Jan 01 20, Roland Dreier via iommu wrote:
>> We saw more devices with the same mismatch quirk. So maintaining them in
>> a quirk table will make it more readable and maintainable.
>
>I guess I disagree about the maintainable part, given that this patch
>already regresses Broadwell NTB.
>
>I'm not even sure what the DMAR table says about NTB on my Skylake
>systems, exactly because the existing code means I did not have any
>problems.  But we might need to add device 201Ch too.
>
>Maybe we don't need the mismatch check at all?  Your patch sets the
>quirk if any possibly mismatching device is present in the system, so
>we'll ignore any scope mismatch on a system with, say, the 8086:2020
>NVMe host in it.  So could we just drop the check completely and not
>have a quirk to disable the check?
>
> - R.

If the check is removed, what happens in cases where there is an actual
problem in the DMAR table? I just worked an issue with some Intel
people where a Purley system had an RMRR entry pointing to a bridge as
the endpoint device instead of the RAID module sitting behind it.
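
For context, the quirk-table approach being weighed here against removing the check entirely keeps the strict check but suppresses the mismatch warning for known vendor:device pairs. A minimal sketch of such a lookup (the structure and helper names are invented for illustration; the device IDs are the ones cited in the thread, not an exhaustive list):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PCI_VENDOR_ID_INTEL 0x8086

/* Illustrative quirk table; not the kernel's actual data structures. */
struct dmar_scope_quirk {
	unsigned short vendor;
	unsigned short device;
};

static const struct dmar_scope_quirk scope_quirks[] = {
	{ PCI_VENDOR_ID_INTEL, 0x2f0d },	/* NTB */
	{ PCI_VENDOR_ID_INTEL, 0x2020 },	/* NVMe controller */
};

/* Return true if the scope-mismatch warning should be suppressed
 * for this vendor:device pair. */
static bool dmar_scope_mismatch_quirked(unsigned short vendor,
					unsigned short device)
{
	size_t i;

	for (i = 0; i < sizeof(scope_quirks) / sizeof(scope_quirks[0]); i++)
		if (scope_quirks[i].vendor == vendor &&
		    scope_quirks[i].device == device)
			return true;
	return false;
}
```

This makes Roland's maintainability objection concrete: every newly discovered mismatching platform needs a fresh table entry (e.g. the 201Ch device he mentions), whereas matching on PCI class needs none — but, as raised above, the table preserves the check's ability to flag genuinely broken DMAR tables on unlisted devices.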


* Re: [PATCH 22/22] iommu/vt-d: Add a quirk flag for scope mismatched devices
  2020-01-06 17:05         ` Jerry Snitselaar
@ 2020-01-07  0:35           ` Lu Baolu
  2020-01-07  1:30             ` Jerry Snitselaar
  0 siblings, 1 reply; 41+ messages in thread
From: Lu Baolu @ 2020-01-07  0:35 UTC (permalink / raw)
  To: Jerry Snitselaar, Roland Dreier; +Cc: Jim Yan, iommu

Hi Jerry,

On 1/7/20 1:05 AM, Jerry Snitselaar wrote:
> On Wed Jan 01 20, Roland Dreier via iommu wrote:
>>> We saw more devices with the same mismatch quirk. So maintaining them in
>>> a quirk table will make it more readable and maintainable.
>>
>> I guess I disagree about the maintainable part, given that this patch
>> already regresses Broadwell NTB.
>>
>> I'm not even sure what the DMAR table says about NTB on my Skylake
>> systems, exactly because the existing code means I did not have any
>> problems.  But we might need to add device 201Ch too.
>>
>> Maybe we don't need the mismatch check at all?  Your patch sets the
>> quirk if any possibly mismatching device is present in the system, so
>> we'll ignore any scope mismatch on a system with, say, the 8086:2020
>> NVMe host in it.  So could we just drop the check completely and not
>> have a quirk to disable the check?
>>
>> - R.
> 
> If the check is removed what happens for cases where there is an actual
> problem in the dmar table? I just worked an issue with some Intel
> people where a purley system had an rmrr entry pointing to a bridge as
> the endpoint device instead of the raid module sitting behind it.

The latest solution is here: https://lkml.org/lkml/2020/1/5/103. Does
this work for you?

Best regards,
baolu
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 22/22] iommu/vt-d: Add a quirk flag for scope mismatched devices
  2020-01-07  0:35           ` Lu Baolu
@ 2020-01-07  1:30             ` Jerry Snitselaar
  2020-01-07  1:47               ` Lu Baolu
  0 siblings, 1 reply; 41+ messages in thread
From: Jerry Snitselaar @ 2020-01-07  1:30 UTC (permalink / raw)
  To: Lu Baolu; +Cc: Roland Dreier, Jim Yan, iommu

On Tue Jan 07 20, Lu Baolu wrote:
>Hi Jerry,
>
>On 1/7/20 1:05 AM, Jerry Snitselaar wrote:
>>On Wed Jan 01 20, Roland Dreier via iommu wrote:
>>>>We saw more devices with the same mismatch quirk. So maintaining them in
>>>>a quirk table will make it more readable and maintainable.
>>>
>>>I guess I disagree about the maintainable part, given that this patch
>>>already regresses Broadwell NTB.
>>>
>>>I'm not even sure what the DMAR table says about NTB on my Skylake
>>>systems, exactly because the existing code means I did not have any
>>>problems.  But we might need to add device 201Ch too.
>>>
>>>Maybe we don't need the mismatch check at all?  Your patch sets the
>>>quirk if any possibly mismatching device is present in the system, so
>>>we'll ignore any scope mismatch on a system with, say, the 8086:2020
>>>NVMe host in it.  So could we just drop the check completely and not
>>>have a quirk to disable the check?
>>>
>>>- R.
>>
>>If the check is removed what happens for cases where there is an actual
>>problem in the dmar table? I just worked an issue with some Intel
>>people where a purley system had an rmrr entry pointing to a bridge as
>>the endpoint device instead of the raid module sitting behind it.
>
>The latest solution was here. https://lkml.org/lkml/2020/1/5/103, does
>this work for you?
>
>Best regards,
>baolu
>

Hi Baolu,

They resolved it by updating the RMRR entry in the DMAR table to add
the extra path needed for it to point at the RAID module. Looking at
the code, though, I imagine that without the firmware update they
would still have the problem, because IIRC it was a combination of an
endpoint scope type and a PCI bridge header, so that first check would
fail as it did before. My worry is that if the suggestion is to remove
the check completely, a case like that wouldn't report anything wrong.

I think Jim's latest patch solves the issue for what he was seeing,
and the NTB case.


* Re: [PATCH 22/22] iommu/vt-d: Add a quirk flag for scope mismatched devices
  2020-01-07  1:30             ` Jerry Snitselaar
@ 2020-01-07  1:47               ` Lu Baolu
  2020-01-09  0:12                 ` Roland Dreier via iommu
  0 siblings, 1 reply; 41+ messages in thread
From: Lu Baolu @ 2020-01-07  1:47 UTC (permalink / raw)
  To: Jerry Snitselaar; +Cc: Roland Dreier, Jim Yan, iommu

Hi,

On 1/7/20 9:30 AM, Jerry Snitselaar wrote:
> On Tue Jan 07 20, Lu Baolu wrote:
>> Hi Jerry,
>>
>> On 1/7/20 1:05 AM, Jerry Snitselaar wrote:
>>> On Wed Jan 01 20, Roland Dreier via iommu wrote:
>>>>> We saw more devices with the same mismatch quirk. So maintaining 
>>>>> them in
>>>>> a quirk table will make it more readable and maintainable.
>>>>
>>>> I guess I disagree about the maintainable part, given that this patch
>>>> already regresses Broadwell NTB.
>>>>
>>>> I'm not even sure what the DMAR table says about NTB on my Skylake
>>>> systems, exactly because the existing code means I did not have any
>>>> problems.  But we might need to add device 201Ch too.
>>>>
>>>> Maybe we don't need the mismatch check at all?  Your patch sets the
>>>> quirk if any possibly mismatching device is present in the system, so
>>>> we'll ignore any scope mismatch on a system with, say, the 8086:2020
>>>> NVMe host in it.  So could we just drop the check completely and not
>>>> have a quirk to disable the check?
>>>>
>>>> - R.
>>>
>>> If the check is removed what happens for cases where there is an actual
>>> problem in the dmar table? I just worked an issue with some Intel
>>> people where a purley system had an rmrr entry pointing to a bridge as
>>> the endpoint device instead of the raid module sitting behind it.
>>
>> The latest solution was here. https://lkml.org/lkml/2020/1/5/103, does
>> this work for you?
>>
>> Best regards,
>> baolu
>>
> 
> Hi Baolu,
> 
> They resolved it by updating the rmrr entry in the dmar table to add
> the extra path needed for it to point at the raid module. Looking
> at the code though I imagine without the firmware update they would
> still have the problem because IIRC it was a combo of an endpoint
> scope type, and a pci bridge header so that first check would fail
> as it did before. My worry was if the suggestion is to remove the
> check completely, a case like that wouldn't report anything wrong.

Yes, agreed.

> 
> Jim's latest patch I think solves the issue for what he was seeing
> and the NTB case.
> 

Jerry and Roland,

Are you willing to add your reviewed-by for Jim's v2 patch? I will
queue it for v5.6 if you both agree.

Best regards,
baolu

* Re: [PULL REQUEST] iommu/vt-d: patches for v5.6
  2020-01-02  0:18 [PULL REQUEST] iommu/vt-d: patches for v5.6 Lu Baolu
                   ` (21 preceding siblings ...)
  2020-01-02  0:18 ` [PATCH 22/22] iommu/vt-d: Add a quirk flag for scope mismatched devices Lu Baolu
@ 2020-01-07 13:06 ` Joerg Roedel
  22 siblings, 0 replies; 41+ messages in thread
From: Joerg Roedel @ 2020-01-07 13:06 UTC (permalink / raw)
  To: Lu Baolu; +Cc: iommu

On Thu, Jan 02, 2020 at 08:18:01AM +0800, Lu Baolu wrote:
> Hi Joerg,
> 
> Below patches have been piled up for v5.6.
> 
>  - Some preparation patches for VT-d nested mode support
>    - VT-d Native Shared virtual memory cleanup and fixes
>    - Use 1st-level for IOVA translation
> 
>  - VT-d debugging and tracing
>    - Extend map_sg trace event for more information
>    - Add debugfs support to show page table internals
> 
>  - Kconfig option for the default status of scalable mode
> 
>  - Some miscellaneous cleanups.
> 
> Please consider them for the iommu/vt-d branch.

Applied patches 1-21 to the x86/vt-d branch, thanks Baolu.


* Re: [PATCH 22/22] iommu/vt-d: Add a quirk flag for scope mismatched devices
  2020-01-02  0:18 ` [PATCH 22/22] iommu/vt-d: Add a quirk flag for scope mismatched devices Lu Baolu
  2020-01-02  2:11   ` Roland Dreier via iommu
@ 2020-01-08 14:16   ` Christoph Hellwig
  2020-01-08 23:28     ` Lu Baolu
  1 sibling, 1 reply; 41+ messages in thread
From: Christoph Hellwig @ 2020-01-08 14:16 UTC (permalink / raw)
  To: Lu Baolu; +Cc: Roland Dreier, Jim Yan, iommu

> +/*
> + * We expect devices with endpoint scope to have normal PCI
> + * headers, and devices with bridge scope to have bridge PCI
> + * headers.  However some PCI devices may be listed in the
> + * DMAR table with bridge scope, even though they have a
> + * normal PCI header. We don't declare a socpe mismatch for
> + * below special cases.
> + */

Please use up all 80 columns for comments.

> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2f0d,	/* NTB devices  */
> +			 quirk_dmar_scope_mismatch);
> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2020,	/* NVME host */
> +			 quirk_dmar_scope_mismatch);

As said before "NVME host" host.  Besides the wrong spelling of NVMe,
the NVMe host is the Linux kernel, so describing a device as such seems
rather bogus.

* Re: [PATCH 22/22] iommu/vt-d: Add a quirk flag for scope mismatched devices
  2020-01-08 14:16   ` Christoph Hellwig
@ 2020-01-08 23:28     ` Lu Baolu
  2020-01-09  7:06       ` Christoph Hellwig
  0 siblings, 1 reply; 41+ messages in thread
From: Lu Baolu @ 2020-01-08 23:28 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Roland Dreier, Jim Yan, iommu

Hi Christoph,

On 1/8/20 10:16 PM, Christoph Hellwig wrote:
>> +/*
>> + * We expect devices with endpoint scope to have normal PCI
>> + * headers, and devices with bridge scope to have bridge PCI
>> + * headers.  However some PCI devices may be listed in the
>> + * DMAR table with bridge scope, even though they have a
>> + * normal PCI header. We don't declare a socpe mismatch for
>> + * below special cases.
>> + */
> 
> Please use up all 80 lines for comments.
> 
>> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2f0d,	/* NTB devices  */
>> +			 quirk_dmar_scope_mismatch);
>> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2020,	/* NVME host */
>> +			 quirk_dmar_scope_mismatch);
> 
> As said before "NVME host" host.  Besides the wrong spelling of NVMe,
> the NVMe host is the Linux kernel, so describing a device as such seems
> rather bogus.
> 

This patch has been replaced with this one.

https://lkml.org/lkml/2020/1/5/103

Best regards,
baolu

* Re: [PATCH 22/22] iommu/vt-d: Add a quirk flag for scope mismatched devices
  2020-01-07  1:47               ` Lu Baolu
@ 2020-01-09  0:12                 ` Roland Dreier via iommu
  0 siblings, 0 replies; 41+ messages in thread
From: Roland Dreier via iommu @ 2020-01-09  0:12 UTC (permalink / raw)
  To: Lu Baolu; +Cc: Jim Yan, iommu

> Are you willing to add your reviewed-by for Jim's v2 patch? I will
> queue it for v5.6 if you both agree.

Sure:

Reviewed-by: Roland Dreier <roland@purestorage.com>

* Re: [PATCH 22/22] iommu/vt-d: Add a quirk flag for scope mismatched devices
  2020-01-08 23:28     ` Lu Baolu
@ 2020-01-09  7:06       ` Christoph Hellwig
  2020-01-09  8:53         ` Lu Baolu
  0 siblings, 1 reply; 41+ messages in thread
From: Christoph Hellwig @ 2020-01-09  7:06 UTC (permalink / raw)
  To: Lu Baolu; +Cc: Roland Dreier, Jim Yan, iommu

On Thu, Jan 09, 2020 at 07:28:41AM +0800, Lu Baolu wrote:
> This patch has been replaced with this one.
> 
> https://lkml.org/lkml/2020/1/5/103

That still mentions a "nvme host device", which despite the different
spelling still doesn't make any sense.

* Re: [PATCH 22/22] iommu/vt-d: Add a quirk flag for scope mismatched devices
  2020-01-09  7:06       ` Christoph Hellwig
@ 2020-01-09  8:53         ` Lu Baolu
  2020-01-09  8:56           ` Reply: " Jim,Yan
  0 siblings, 1 reply; 41+ messages in thread
From: Lu Baolu @ 2020-01-09  8:53 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Roland Dreier, Jim Yan, iommu

On 1/9/20 3:06 PM, Christoph Hellwig wrote:
> On Thu, Jan 09, 2020 at 07:28:41AM +0800, Lu Baolu wrote:
>> This patch has been replaced with this one.
>>
>> https://lkml.org/lkml/2020/1/5/103
> 
> That still mentions a "nvme host device", which despite the different
> spelling still doesn't make any sense.
> 

Jim, can you please refine it accordingly?

Best regards,
baolu

* Reply: [PATCH 22/22] iommu/vt-d: Add a quirk flag for scope mismatched devices
  2020-01-09  8:53         ` Lu Baolu
@ 2020-01-09  8:56           ` Jim,Yan
  0 siblings, 0 replies; 41+ messages in thread
From: Jim,Yan @ 2020-01-09  8:56 UTC (permalink / raw)
  To: Lu Baolu, Christoph Hellwig; +Cc: Roland Dreier, iommu

Hi Baolu,

> -----Original Message-----
> From: Lu Baolu [mailto:baolu.lu@linux.intel.com]
> Sent: January 9, 2020 16:53
> To: Christoph Hellwig <hch@infradead.org>
> Cc: baolu.lu@linux.intel.com; Joerg Roedel <joro@8bytes.org>; Roland
> Dreier <roland@purestorage.com>; Jim,Yan <jimyan@baidu.com>;
> iommu@lists.linux-foundation.org
> Subject: Re: [PATCH 22/22] iommu/vt-d: Add a quirk flag for scope mismatched
> devices
> 
> On 1/9/20 3:06 PM, Christoph Hellwig wrote:
> > On Thu, Jan 09, 2020 at 07:28:41AM +0800, Lu Baolu wrote:
> >> This patch has been replaced with this one.
> >>
> >> https://lkml.org/lkml/2020/1/5/103
> >
> > That still mentions a "nvme host device", which despite the different
> > spelling still doesn't make any sense.
> >
> 
> Jim, can you please refine it accordingly?
> 
> Best regards,
> Baolu


Yes, I am working on it.

Regards
Jim

end of thread, other threads:[~2020-01-09  8:57 UTC | newest]

Thread overview: 41+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-02  0:18 [PULL REQUEST] iommu/vt-d: patches for v5.6 Lu Baolu
2020-01-02  0:18 ` [PATCH 01/22] iommu/vt-d: Add Kconfig option to enable/disable scalable mode Lu Baolu
2020-01-02  0:18 ` [PATCH 02/22] iommu/vt-d: Fix CPU and IOMMU SVM feature matching checks Lu Baolu
2020-01-02  0:18 ` [PATCH 03/22] iommu/vt-d: Match CPU and IOMMU paging mode Lu Baolu
2020-01-02  0:18 ` [PATCH 04/22] iommu/vt-d: Reject SVM bind for failed capability check Lu Baolu
2020-01-02  0:18 ` [PATCH 05/22] iommu/vt-d: Avoid duplicated code for PASID setup Lu Baolu
2020-01-02  0:18 ` [PATCH 06/22] iommu/vt-d: Fix off-by-one in PASID allocation Lu Baolu
2020-01-02  0:18 ` [PATCH 07/22] iommu/vt-d: Replace Intel specific PASID allocator with IOASID Lu Baolu
2020-01-02  0:18 ` [PATCH 08/22] iommu/vt-d: Avoid sending invalid page response Lu Baolu
2020-01-02  0:18 ` [PATCH 09/22] iommu/vt-d: Misc macro clean up for SVM Lu Baolu
2020-01-02  0:18 ` [PATCH 10/22] iommu/vt-d: trace: Extend map_sg trace event Lu Baolu
2020-01-02  0:18 ` [PATCH 11/22] iommu/vt-d: Avoid iova flush queue in strict mode Lu Baolu
2020-01-02  0:18 ` [PATCH 12/22] iommu/vt-d: Loose requirement for flush queue initializaton Lu Baolu
2020-01-02  0:18 ` [PATCH 13/22] iommu/vt-d: Identify domains using first level page table Lu Baolu
2020-01-02  0:18 ` [PATCH 14/22] iommu/vt-d: Add set domain DOMAIN_ATTR_NESTING attr Lu Baolu
2020-01-02  0:18 ` [PATCH 15/22] iommu/vt-d: Add PASID_FLAG_FL5LP for first-level pasid setup Lu Baolu
2020-01-02  0:18 ` [PATCH 16/22] iommu/vt-d: Setup pasid entries for iova over first level Lu Baolu
2020-01-02  0:18 ` [PATCH 17/22] iommu/vt-d: Flush PASID-based iotlb " Lu Baolu
2020-01-02  0:18 ` [PATCH 18/22] iommu/vt-d: Make first level IOVA canonical Lu Baolu
2020-01-02  0:18 ` [PATCH 19/22] iommu/vt-d: Update first level super page capability Lu Baolu
2020-01-02  0:18 ` [PATCH 20/22] iommu/vt-d: Use iova over first level Lu Baolu
2020-01-02  0:18 ` [PATCH 21/22] iommu/vt-d: debugfs: Add support to show page table internals Lu Baolu
2020-01-02  0:18 ` [PATCH 22/22] iommu/vt-d: Add a quirk flag for scope mismatched devices Lu Baolu
2020-01-02  2:11   ` Roland Dreier via iommu
2020-01-02  2:14     ` Lu Baolu
2020-01-02  2:25       ` Roland Dreier via iommu
2020-01-02  2:34         ` Lu Baolu
2020-01-03  0:32         ` Lu Baolu
2020-01-04 16:52           ` Roland Dreier via iommu
2020-01-05  3:43             ` Lu Baolu
2020-01-06 17:05         ` Jerry Snitselaar
2020-01-07  0:35           ` Lu Baolu
2020-01-07  1:30             ` Jerry Snitselaar
2020-01-07  1:47               ` Lu Baolu
2020-01-09  0:12                 ` Roland Dreier via iommu
2020-01-08 14:16   ` Christoph Hellwig
2020-01-08 23:28     ` Lu Baolu
2020-01-09  7:06       ` Christoph Hellwig
2020-01-09  8:53         ` Lu Baolu
2020-01-09  8:56           ` Reply: " Jim,Yan
2020-01-07 13:06 ` [PULL REQUEST] iommu/vt-d: patches for v5.6 Joerg Roedel
