All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/1] iommu/vt-d: Support enforce_cache_coherency only for empty domains
@ 2023-11-14  1:10 Lu Baolu
  2023-11-14  1:10 ` [PATCH 1/1] iommu/vt-d: Omit devTLB invalidation requests when TES=0 Lu Baolu
                   ` (4 more replies)
  0 siblings, 5 replies; 26+ messages in thread
From: Lu Baolu @ 2023-11-14  1:10 UTC (permalink / raw)
  To: Joerg Roedel, Will Deacon, Robin Murphy, Jason Gunthorpe, Kevin Tian
  Cc: iommu, linux-kernel, Lu Baolu

The enforce_cache_coherency callback ensures DMA cache coherency for
devices attached to the domain.

Intel IOMMU supports enforced DMA cache coherency when the Snoop
Control bit in the IOMMU's extended capability register is set.
Supporting it differs between legacy and scalable modes.

In legacy mode, it's supported page-level by setting the SNP field
in second-stage page-table entries. In scalable mode, it's supported
in PASID-table granularity by setting the PGSNP field in PASID-table
entries.

In legacy mode, mappings before attaching to a device have SNP
fields cleared, while mappings after the callback have them set.
This means partial DMAs are cache coherent while others are not.

One possible fix is replaying mappings and flipping SNP bits when
attaching a domain to a device. But this seems to be over-engineered,
given that all real use cases just attach an empty domain to a device.

To meet practical needs while reducing mode differences, only support
enforce_cache_coherency on a domain without mappings if SNP field is
used.

Fixes: fc0051cb9590 ("iommu/vt-d: Check domain force_snooping against attached devices")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel/iommu.h | 3 +++
 drivers/iommu/intel/iommu.c | 5 ++++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index 65d37a138c75..ce030c5b5772 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -602,6 +602,9 @@ struct dmar_domain {
 					 */
 	u8 dirty_tracking:1;		/* Dirty tracking is enabled */
 	u8 nested_parent:1;		/* Has other domains nested on it */
+	u8 has_mappings:1;		/* Has mappings configured through
+					 * iommu_map() interface.
+					 */
 
 	spinlock_t lock;		/* Protect device tracking lists */
 	struct list_head devices;	/* all devices' list */
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 3531b956556c..11670cd812a3 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -2207,6 +2207,8 @@ __domain_mapping(struct dmar_domain *domain, unsigned long iov_pfn,
 			attr |= DMA_FL_PTE_DIRTY;
 	}
 
+	domain->has_mappings = true;
+
 	pteval = ((phys_addr_t)phys_pfn << VTD_PAGE_SHIFT) | attr;
 
 	while (nr_pages > 0) {
@@ -4360,7 +4362,8 @@ static bool intel_iommu_enforce_cache_coherency(struct iommu_domain *domain)
 		return true;
 
 	spin_lock_irqsave(&dmar_domain->lock, flags);
-	if (!domain_support_force_snooping(dmar_domain)) {
+	if (!domain_support_force_snooping(dmar_domain) ||
+	    (!dmar_domain->use_first_level && dmar_domain->has_mappings)) {
 		spin_unlock_irqrestore(&dmar_domain->lock, flags);
 		return false;
 	}
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 1/1] iommu/vt-d: Omit devTLB invalidation requests when TES=0
  2023-11-14  1:10 [PATCH 1/1] iommu/vt-d: Support enforce_cache_coherency only for empty domains Lu Baolu
@ 2023-11-14  1:10 ` Lu Baolu
  2023-11-14  3:14   ` Tian, Kevin
  2023-11-29 20:10   ` Jason Gunthorpe
  2023-11-14  1:10 ` [PATCH 1/1] iommu/vt-d: Disable PCI ATS in legacy passthrough mode Lu Baolu
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 26+ messages in thread
From: Lu Baolu @ 2023-11-14  1:10 UTC (permalink / raw)
  To: Joerg Roedel, Will Deacon, Robin Murphy, Jason Gunthorpe, Kevin Tian
  Cc: iommu, linux-kernel, Lu Baolu

The latest VT-d spec indicates that when remapping hardware is disabled
(TES=0 in Global Status Register), upstream ATS Invalidation Completion
requests are treated as UR (Unsupported Request).

Consequently, the spec recommends in section 4.3 Handling of Device-TLB
Invalidations that software refrain from submitting any Device-TLB
invalidation requests when address remapping hardware is disabled.

Verify address remapping hardware is enabled prior to submitting Device-
TLB invalidation requests.

Fixes: 792fb43ce2c9 ("iommu/vt-d: Enable Intel IOMMU scalable mode by default")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel/dmar.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
index a3414afe11b0..23cb80d62a9a 100644
--- a/drivers/iommu/intel/dmar.c
+++ b/drivers/iommu/intel/dmar.c
@@ -1522,6 +1522,15 @@ void qi_flush_dev_iotlb(struct intel_iommu *iommu, u16 sid, u16 pfsid,
 {
 	struct qi_desc desc;
 
+	/*
+	 * VT-d spec, section 4.3:
+	 *
+	 * Software is recommended to not submit any Device-TLB invalidation
+	 * requests while address remapping hardware is disabled.
+	 */
+	if (!(iommu->gcmd & DMA_GCMD_TE))
+		return;
+
 	if (mask) {
 		addr |= (1ULL << (VTD_PAGE_SHIFT + mask - 1)) - 1;
 		desc.qw1 = QI_DEV_IOTLB_ADDR(addr) | QI_DEV_IOTLB_SIZE;
@@ -1587,6 +1596,15 @@ void qi_flush_dev_iotlb_pasid(struct intel_iommu *iommu, u16 sid, u16 pfsid,
 	unsigned long mask = 1UL << (VTD_PAGE_SHIFT + size_order - 1);
 	struct qi_desc desc = {.qw1 = 0, .qw2 = 0, .qw3 = 0};
 
+	/*
+	 * VT-d spec, section 4.3:
+	 *
+	 * Software is recommended to not submit any Device-TLB invalidation
+	 * requests while address remapping hardware is disabled.
+	 */
+	if (!(iommu->gcmd & DMA_GCMD_TE))
+		return;
+
 	desc.qw0 = QI_DEV_EIOTLB_PASID(pasid) | QI_DEV_EIOTLB_SID(sid) |
 		QI_DEV_EIOTLB_QDEP(qdep) | QI_DEIOTLB_TYPE |
 		QI_DEV_IOTLB_PFSID(pfsid);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 1/1] iommu/vt-d: Disable PCI ATS in legacy passthrough mode
  2023-11-14  1:10 [PATCH 1/1] iommu/vt-d: Support enforce_cache_coherency only for empty domains Lu Baolu
  2023-11-14  1:10 ` [PATCH 1/1] iommu/vt-d: Omit devTLB invalidation requests when TES=0 Lu Baolu
@ 2023-11-14  1:10 ` Lu Baolu
  2023-11-14  3:14   ` Tian, Kevin
                     ` (2 more replies)
  2023-11-14  1:10 ` [PATCH 1/1] iommu/vt-d: Make context clearing consistent with context mapping Lu Baolu
                   ` (2 subsequent siblings)
  4 siblings, 3 replies; 26+ messages in thread
From: Lu Baolu @ 2023-11-14  1:10 UTC (permalink / raw)
  To: Joerg Roedel, Will Deacon, Robin Murphy, Jason Gunthorpe, Kevin Tian
  Cc: iommu, linux-kernel, Lu Baolu

When IOMMU hardware operates in legacy mode, the TT field of the context
entry determines the translation type, with three supported types (Section
9.3 Context Entry):

- DMA translation without device TLB support
- DMA translation with device TLB support
- Passthrough mode with translated and translation requests blocked

Device TLB support is absent when hardware is configured in passthrough
mode.

Disable the PCI ATS feature when IOMMU is configured for passthrough
translation type in legacy (non-scalable) mode.

Fixes: 0faa19a1515f ("iommu/vt-d: Decouple PASID & PRI enabling from SVA")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel/iommu.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 11670cd812a3..c3ec09118ab1 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1413,6 +1413,10 @@ static void iommu_enable_pci_caps(struct device_domain_info *info)
 	if (!dev_is_pci(info->dev))
 		return;
 
+	if (!sm_supported(info->iommu) && info->domain &&
+	    domain_type_is_si(info->domain))
+		return;
+
 	pdev = to_pci_dev(info->dev);
 
 	/* The PCIe spec, in its wisdom, declares that the behaviour of
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 1/1] iommu/vt-d: Make context clearing consistent with context mapping
  2023-11-14  1:10 [PATCH 1/1] iommu/vt-d: Support enforce_cache_coherency only for empty domains Lu Baolu
  2023-11-14  1:10 ` [PATCH 1/1] iommu/vt-d: Omit devTLB invalidation requests when TES=0 Lu Baolu
  2023-11-14  1:10 ` [PATCH 1/1] iommu/vt-d: Disable PCI ATS in legacy passthrough mode Lu Baolu
@ 2023-11-14  1:10 ` Lu Baolu
  2023-11-14  3:20   ` Tian, Kevin
  2023-11-14  3:16 ` [PATCH 1/1] iommu/vt-d: Support enforce_cache_coherency only for empty domains Tian, Kevin
  2023-11-29 20:08 ` Jason Gunthorpe
  4 siblings, 1 reply; 26+ messages in thread
From: Lu Baolu @ 2023-11-14  1:10 UTC (permalink / raw)
  To: Joerg Roedel, Will Deacon, Robin Murphy, Jason Gunthorpe, Kevin Tian
  Cc: iommu, linux-kernel, Lu Baolu

In the iommu probe_device path, domain_context_mapping() allows setting
up the context entry for a non-PCI device. However, in the iommu
release_device path, domain_context_clear() only clears context entries
for PCI devices.

Make domain_context_clear() behave consistently with
domain_context_mapping() by clearing context entries for both PCI and
non-PCI devices.

Fixes: 579305f75d34 ("iommu/vt-d: Update to use PCI DMA aliases")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel/iommu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index c3ec09118ab1..061df1b68ff7 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -3931,8 +3931,8 @@ static int domain_context_clear_one_cb(struct pci_dev *pdev, u16 alias, void *op
  */
 static void domain_context_clear(struct device_domain_info *info)
 {
-	if (!info->iommu || !info->dev || !dev_is_pci(info->dev))
-		return;
+	if (!dev_is_pci(info->dev))
+		domain_context_clear_one(info, info->bus, info->devfn);
 
 	pci_for_each_dma_alias(to_pci_dev(info->dev),
 			       &domain_context_clear_one_cb, info);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/1] iommu/vt-d: Omit devTLB invalidation requests when TES=0
  2023-11-14  3:14   ` Tian, Kevin
@ 2023-11-14  3:13     ` Baolu Lu
  2023-11-14  4:45       ` Tian, Kevin
  0 siblings, 1 reply; 26+ messages in thread
From: Baolu Lu @ 2023-11-14  3:13 UTC (permalink / raw)
  To: Tian, Kevin, Joerg Roedel, Will Deacon, Robin Murphy, Jason Gunthorpe
  Cc: baolu.lu, iommu, linux-kernel

On 11/14/23 11:14 AM, Tian, Kevin wrote:
>> From: Lu Baolu<baolu.lu@linux.intel.com>
>> Sent: Tuesday, November 14, 2023 9:11 AM
>>
>> The latest VT-d spec indicates that when remapping hardware is disabled
>> (TES=0 in Global Status Register), upstream ATS Invalidation Completion
>> requests are treated as UR (Unsupported Request).
>>
>> Consequently, the spec recommends in section 4.3 Handling of Device-TLB
>> Invalidations that software refrain from submitting any Device-TLB
>> invalidation requests when address remapping hardware is disabled.
>>
>> Verify address remapping hardware is enabled prior to submitting Device-
>> TLB invalidation requests.
>>
>> Fixes: 792fb43ce2c9 ("iommu/vt-d: Enable Intel IOMMU scalable mode by
>> default")
>> Signed-off-by: Lu Baolu<baolu.lu@linux.intel.com>
>> ---
>>   drivers/iommu/intel/dmar.c | 18 ++++++++++++++++++
>>   1 file changed, 18 insertions(+)
>>
>> diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
>> index a3414afe11b0..23cb80d62a9a 100644
>> --- a/drivers/iommu/intel/dmar.c
>> +++ b/drivers/iommu/intel/dmar.c
>> @@ -1522,6 +1522,15 @@ void qi_flush_dev_iotlb(struct intel_iommu
>> *iommu, u16 sid, u16 pfsid,
>>   {
>>   	struct qi_desc desc;
>>
>> +	/*
>> +	 * VT-d spec, section 4.3:
>> +	 *
>> +	 * Software is recommended to not submit any Device-TLB
>> invalidation
>> +	 * requests while address remapping hardware is disabled.
>> +	 */
>> +	if (!(iommu->gcmd & DMA_GCMD_TE))
>> +		return;
>> +
> Is it a valid case to see such request when the iommu is disabled?
> If not then let's add a WARN.

There might be valid cases. The VT-d translation is turned on after all
devices get probed.

Best regards,
baolu

^ permalink raw reply	[flat|nested] 26+ messages in thread

* RE: [PATCH 1/1] iommu/vt-d: Omit devTLB invalidation requests when TES=0
  2023-11-14  1:10 ` [PATCH 1/1] iommu/vt-d: Omit devTLB invalidation requests when TES=0 Lu Baolu
@ 2023-11-14  3:14   ` Tian, Kevin
  2023-11-14  3:13     ` Baolu Lu
  2023-11-29 20:10   ` Jason Gunthorpe
  1 sibling, 1 reply; 26+ messages in thread
From: Tian, Kevin @ 2023-11-14  3:14 UTC (permalink / raw)
  To: Lu Baolu, Joerg Roedel, Will Deacon, Robin Murphy, Jason Gunthorpe
  Cc: iommu, linux-kernel

> From: Lu Baolu <baolu.lu@linux.intel.com>
> Sent: Tuesday, November 14, 2023 9:11 AM
> 
> The latest VT-d spec indicates that when remapping hardware is disabled
> (TES=0 in Global Status Register), upstream ATS Invalidation Completion
> requests are treated as UR (Unsupported Request).
> 
> Consequently, the spec recommends in section 4.3 Handling of Device-TLB
> Invalidations that software refrain from submitting any Device-TLB
> invalidation requests when address remapping hardware is disabled.
> 
> Verify address remapping hardware is enabled prior to submitting Device-
> TLB invalidation requests.
> 
> Fixes: 792fb43ce2c9 ("iommu/vt-d: Enable Intel IOMMU scalable mode by
> default")
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> ---
>  drivers/iommu/intel/dmar.c | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
> 
> diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
> index a3414afe11b0..23cb80d62a9a 100644
> --- a/drivers/iommu/intel/dmar.c
> +++ b/drivers/iommu/intel/dmar.c
> @@ -1522,6 +1522,15 @@ void qi_flush_dev_iotlb(struct intel_iommu
> *iommu, u16 sid, u16 pfsid,
>  {
>  	struct qi_desc desc;
> 
> +	/*
> +	 * VT-d spec, section 4.3:
> +	 *
> +	 * Software is recommended to not submit any Device-TLB
> invalidation
> +	 * requests while address remapping hardware is disabled.
> +	 */
> +	if (!(iommu->gcmd & DMA_GCMD_TE))
> +		return;
> +

Is it a valid case to see such request when the iommu is disabled?
If not then let's add a WARN.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* RE: [PATCH 1/1] iommu/vt-d: Disable PCI ATS in legacy passthrough mode
  2023-11-14  1:10 ` [PATCH 1/1] iommu/vt-d: Disable PCI ATS in legacy passthrough mode Lu Baolu
@ 2023-11-14  3:14   ` Tian, Kevin
  2023-11-16  7:35   ` Baolu Lu
  2023-11-29 20:13   ` Jason Gunthorpe
  2 siblings, 0 replies; 26+ messages in thread
From: Tian, Kevin @ 2023-11-14  3:14 UTC (permalink / raw)
  To: Lu Baolu, Joerg Roedel, Will Deacon, Robin Murphy, Jason Gunthorpe
  Cc: iommu, linux-kernel

> From: Lu Baolu <baolu.lu@linux.intel.com>
> Sent: Tuesday, November 14, 2023 9:11 AM
> 
> When IOMMU hardware operates in legacy mode, the TT field of the context
> entry determines the translation type, with three supported types (Section
> 9.3 Context Entry):
> 
> - DMA translation without device TLB support
> - DMA translation with device TLB support
> - Passthrough mode with translated and translation requests blocked
> 
> Device TLB support is absent when hardware is configured in passthrough
> mode.
> 
> Disable the PCI ATS feature when IOMMU is configured for passthrough
> translation type in legacy (non-scalable) mode.
> 
> Fixes: 0faa19a1515f ("iommu/vt-d: Decouple PASID & PRI enabling from SVA")
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* RE: [PATCH 1/1] iommu/vt-d: Support enforce_cache_coherency only for empty domains
  2023-11-14  1:10 [PATCH 1/1] iommu/vt-d: Support enforce_cache_coherency only for empty domains Lu Baolu
                   ` (2 preceding siblings ...)
  2023-11-14  1:10 ` [PATCH 1/1] iommu/vt-d: Make context clearing consistent with context mapping Lu Baolu
@ 2023-11-14  3:16 ` Tian, Kevin
  2023-11-29 20:08 ` Jason Gunthorpe
  4 siblings, 0 replies; 26+ messages in thread
From: Tian, Kevin @ 2023-11-14  3:16 UTC (permalink / raw)
  To: Lu Baolu, Joerg Roedel, Will Deacon, Robin Murphy, Jason Gunthorpe
  Cc: iommu, linux-kernel

> From: Lu Baolu <baolu.lu@linux.intel.com>
> Sent: Tuesday, November 14, 2023 9:11 AM
> 
> The enforce_cache_coherency callback ensures DMA cache coherency for
> devices attached to the domain.
> 
> Intel IOMMU supports enforced DMA cache coherency when the Snoop
> Control bit in the IOMMU's extended capability register is set.
> Supporting it differs between legacy and scalable modes.
> 
> In legacy mode, it's supported page-level by setting the SNP field
> in second-stage page-table entries. In scalable mode, it's supported
> in PASID-table granularity by setting the PGSNP field in PASID-table
> entries.
> 
> In legacy mode, mappings before attaching to a device have SNP
> fields cleared, while mappings after the callback have them set.
> This means partial DMAs are cache coherent while others are not.
> 
> One possible fix is replaying mappings and flipping SNP bits when
> attaching a domain to a device. But this seems to be over-engineered,
> given that all real use cases just attach an empty domain to a device.
> 
> To meet practical needs while reducing mode differences, only support
> enforce_cache_coherency on a domain without mappings if SNP field is
> used.
> 
> Fixes: fc0051cb9590 ("iommu/vt-d: Check domain force_snooping against
> attached devices")
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* RE: [PATCH 1/1] iommu/vt-d: Make context clearing consistent with context mapping
  2023-11-14  1:10 ` [PATCH 1/1] iommu/vt-d: Make context clearing consistent with context mapping Lu Baolu
@ 2023-11-14  3:20   ` Tian, Kevin
  2023-11-14  3:22     ` Baolu Lu
  0 siblings, 1 reply; 26+ messages in thread
From: Tian, Kevin @ 2023-11-14  3:20 UTC (permalink / raw)
  To: Lu Baolu, Joerg Roedel, Will Deacon, Robin Murphy, Jason Gunthorpe
  Cc: iommu, linux-kernel

> From: Lu Baolu <baolu.lu@linux.intel.com>
> Sent: Tuesday, November 14, 2023 9:11 AM
> 
> In the iommu probe_device path, domain_context_mapping() allows setting
> up the context entry for a non-PCI device. However, in the iommu
> release_device path, domain_context_clear() only clears context entries
> for PCI devices.
> 
> Make domain_context_clear() behave consistently with
> domain_context_mapping() by clearing context entries for both PCI and
> non-PCI devices.
> 
> Fixes: 579305f75d34 ("iommu/vt-d: Update to use PCI DMA aliases")
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>

The code before the fix tag also has the same problem. If we really want
backport then let's find out the very first commit which exhibits this
problem.

But I wonder the actual impact w/o such fix. If there is no hot-remove
possible for those non-PCI devices the context entry will be leaved
enabled until the machine is off. Then this fix is nice-to-have then
probably no need to backport?

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/1] iommu/vt-d: Make context clearing consistent with context mapping
  2023-11-14  3:20   ` Tian, Kevin
@ 2023-11-14  3:22     ` Baolu Lu
  2023-11-14  4:46       ` Tian, Kevin
  0 siblings, 1 reply; 26+ messages in thread
From: Baolu Lu @ 2023-11-14  3:22 UTC (permalink / raw)
  To: Tian, Kevin, Joerg Roedel, Will Deacon, Robin Murphy, Jason Gunthorpe
  Cc: baolu.lu, iommu, linux-kernel

On 11/14/23 11:20 AM, Tian, Kevin wrote:
>> From: Lu Baolu <baolu.lu@linux.intel.com>
>> Sent: Tuesday, November 14, 2023 9:11 AM
>>
>> In the iommu probe_device path, domain_context_mapping() allows setting
>> up the context entry for a non-PCI device. However, in the iommu
>> release_device path, domain_context_clear() only clears context entries
>> for PCI devices.
>>
>> Make domain_context_clear() behave consistently with
>> domain_context_mapping() by clearing context entries for both PCI and
>> non-PCI devices.
>>
>> Fixes: 579305f75d34 ("iommu/vt-d: Update to use PCI DMA aliases")
>> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> 
> The code before the fix tag also has the same problem. If we really want
> backport then let's find out the very first commit which exhibits this
> problem.

Commit 579305f75d34 allows non-PCI devices.

+       if (!dev_is_pci(dev))
+               return domain_context_mapping_one(domain, iommu, bus, devfn,
                                                   translation);
+
+       data.domain = domain;
+       data.iommu = iommu;
+       data.translation = translation;
+
+       return pci_for_each_dma_alias(to_pci_dev(dev),
+                                     &domain_context_mapping_cb, &data);

But it forgot to update the domain_context_clear() helper. So this is
actually a fix for that commit.

> 
> But I wonder the actual impact w/o such fix. If there is no hot-remove
> possible for those non-PCI devices the context entry will be leaved
> enabled until the machine is off. Then this fix is nice-to-have then
> probably no need to backport?

It doesn't cause real issues as far as I can see. So there's no need to
back port it to stable kernels. That's the reason I didn't add cc-stable
tag. But we still need a fix tag as it's actually a fix.

Best regards,
baolu

^ permalink raw reply	[flat|nested] 26+ messages in thread

* RE: [PATCH 1/1] iommu/vt-d: Omit devTLB invalidation requests when TES=0
  2023-11-14  3:13     ` Baolu Lu
@ 2023-11-14  4:45       ` Tian, Kevin
  2023-11-14  4:54         ` Baolu Lu
  0 siblings, 1 reply; 26+ messages in thread
From: Tian, Kevin @ 2023-11-14  4:45 UTC (permalink / raw)
  To: Baolu Lu, Joerg Roedel, Will Deacon, Robin Murphy, Jason Gunthorpe
  Cc: iommu, linux-kernel

> From: Baolu Lu <baolu.lu@linux.intel.com>
> Sent: Tuesday, November 14, 2023 11:14 AM
> 
> On 11/14/23 11:14 AM, Tian, Kevin wrote:
> >> From: Lu Baolu<baolu.lu@linux.intel.com>
> >> Sent: Tuesday, November 14, 2023 9:11 AM
> >>
> >> The latest VT-d spec indicates that when remapping hardware is disabled
> >> (TES=0 in Global Status Register), upstream ATS Invalidation Completion
> >> requests are treated as UR (Unsupported Request).
> >>
> >> Consequently, the spec recommends in section 4.3 Handling of Device-TLB
> >> Invalidations that software refrain from submitting any Device-TLB
> >> invalidation requests when address remapping hardware is disabled.
> >>
> >> Verify address remapping hardware is enabled prior to submitting Device-
> >> TLB invalidation requests.
> >>
> >> Fixes: 792fb43ce2c9 ("iommu/vt-d: Enable Intel IOMMU scalable mode by
> >> default")
> >> Signed-off-by: Lu Baolu<baolu.lu@linux.intel.com>
> >> ---
> >>   drivers/iommu/intel/dmar.c | 18 ++++++++++++++++++
> >>   1 file changed, 18 insertions(+)
> >>
> >> diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
> >> index a3414afe11b0..23cb80d62a9a 100644
> >> --- a/drivers/iommu/intel/dmar.c
> >> +++ b/drivers/iommu/intel/dmar.c
> >> @@ -1522,6 +1522,15 @@ void qi_flush_dev_iotlb(struct intel_iommu
> >> *iommu, u16 sid, u16 pfsid,
> >>   {
> >>   	struct qi_desc desc;
> >>
> >> +	/*
> >> +	 * VT-d spec, section 4.3:
> >> +	 *
> >> +	 * Software is recommended to not submit any Device-TLB
> >> invalidation
> >> +	 * requests while address remapping hardware is disabled.
> >> +	 */
> >> +	if (!(iommu->gcmd & DMA_GCMD_TE))
> >> +		return;
> >> +
> > Is it a valid case to see such request when the iommu is disabled?
> > If not then let's add a WARN.
> 
> There might be valid cases. The VT-d translation is turned on after all
> devices get probed.
> 

but I didn't get why there will be actual mapping changes before 
vtd translation is enabled...

^ permalink raw reply	[flat|nested] 26+ messages in thread

* RE: [PATCH 1/1] iommu/vt-d: Make context clearing consistent with context mapping
  2023-11-14  3:22     ` Baolu Lu
@ 2023-11-14  4:46       ` Tian, Kevin
  0 siblings, 0 replies; 26+ messages in thread
From: Tian, Kevin @ 2023-11-14  4:46 UTC (permalink / raw)
  To: Baolu Lu, Joerg Roedel, Will Deacon, Robin Murphy, Jason Gunthorpe
  Cc: iommu, linux-kernel

> From: Baolu Lu <baolu.lu@linux.intel.com>
> Sent: Tuesday, November 14, 2023 11:22 AM
> 
> On 11/14/23 11:20 AM, Tian, Kevin wrote:
> >> From: Lu Baolu <baolu.lu@linux.intel.com>
> >> Sent: Tuesday, November 14, 2023 9:11 AM
> >>
> >> In the iommu probe_device path, domain_context_mapping() allows
> setting
> >> up the context entry for a non-PCI device. However, in the iommu
> >> release_device path, domain_context_clear() only clears context entries
> >> for PCI devices.
> >>
> >> Make domain_context_clear() behave consistently with
> >> domain_context_mapping() by clearing context entries for both PCI and
> >> non-PCI devices.
> >>
> >> Fixes: 579305f75d34 ("iommu/vt-d: Update to use PCI DMA aliases")
> >> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> >
> > The code before the fix tag also has the same problem. If we really want
> > backport then let's find out the very first commit which exhibits this
> > problem.
> 
> Commit 579305f75d34 allows non-PCI devices.
> 
> +       if (!dev_is_pci(dev))
> +               return domain_context_mapping_one(domain, iommu, bus, devfn,
>                                                    translation);
> +
> +       data.domain = domain;
> +       data.iommu = iommu;
> +       data.translation = translation;
> +
> +       return pci_for_each_dma_alias(to_pci_dev(dev),
> +                                     &domain_context_mapping_cb, &data);
> 
> But it forgot to update the domain_context_clear() helper. So this is
> actually a fix for that commit.

thanks. I didn't note that side effect from commit msg.

> 
> >
> > But I wonder the actual impact w/o such fix. If there is no hot-remove
> > possible for those non-PCI devices the context entry will be leaved
> > enabled until the machine is off. Then this fix is nice-to-have then
> > probably no need to backport?
> 
> It doesn't cause real issues as far as I can see. So there's no need to
> back port it to stable kernels. That's the reason I didn't add cc-stable
> tag. But we still need a fix tag as it's actually a fix.
> 

make sense.

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/1] iommu/vt-d: Omit devTLB invalidation requests when TES=0
  2023-11-14  4:45       ` Tian, Kevin
@ 2023-11-14  4:54         ` Baolu Lu
  2023-11-14  5:31           ` Tian, Kevin
  0 siblings, 1 reply; 26+ messages in thread
From: Baolu Lu @ 2023-11-14  4:54 UTC (permalink / raw)
  To: Tian, Kevin, Joerg Roedel, Will Deacon, Robin Murphy, Jason Gunthorpe
  Cc: baolu.lu, iommu, linux-kernel

On 11/14/23 12:45 PM, Tian, Kevin wrote:
>> From: Baolu Lu <baolu.lu@linux.intel.com>
>> Sent: Tuesday, November 14, 2023 11:14 AM
>>
>> On 11/14/23 11:14 AM, Tian, Kevin wrote:
>>>> From: Lu Baolu<baolu.lu@linux.intel.com>
>>>> Sent: Tuesday, November 14, 2023 9:11 AM
>>>>
>>>> The latest VT-d spec indicates that when remapping hardware is disabled
>>>> (TES=0 in Global Status Register), upstream ATS Invalidation Completion
>>>> requests are treated as UR (Unsupported Request).
>>>>
>>>> Consequently, the spec recommends in section 4.3 Handling of Device-TLB
>>>> Invalidations that software refrain from submitting any Device-TLB
>>>> invalidation requests when address remapping hardware is disabled.
>>>>
>>>> Verify address remapping hardware is enabled prior to submitting Device-
>>>> TLB invalidation requests.
>>>>
>>>> Fixes: 792fb43ce2c9 ("iommu/vt-d: Enable Intel IOMMU scalable mode by
>>>> default")
>>>> Signed-off-by: Lu Baolu<baolu.lu@linux.intel.com>
>>>> ---
>>>>    drivers/iommu/intel/dmar.c | 18 ++++++++++++++++++
>>>>    1 file changed, 18 insertions(+)
>>>>
>>>> diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
>>>> index a3414afe11b0..23cb80d62a9a 100644
>>>> --- a/drivers/iommu/intel/dmar.c
>>>> +++ b/drivers/iommu/intel/dmar.c
>>>> @@ -1522,6 +1522,15 @@ void qi_flush_dev_iotlb(struct intel_iommu
>>>> *iommu, u16 sid, u16 pfsid,
>>>>    {
>>>>    	struct qi_desc desc;
>>>>
>>>> +	/*
>>>> +	 * VT-d spec, section 4.3:
>>>> +	 *
>>>> +	 * Software is recommended to not submit any Device-TLB
>>>> invalidation
>>>> +	 * requests while address remapping hardware is disabled.
>>>> +	 */
>>>> +	if (!(iommu->gcmd & DMA_GCMD_TE))
>>>> +		return;
>>>> +
>>> Is it a valid case to see such request when the iommu is disabled?
>>> If not then let's add a WARN.
>>
>> There might be valid cases. The VT-d translation is turned on after all
>> devices get probed.
>>
> 
> but I didn't get why there will be actual mapping changes before
> vtd translation is enabled...

For an example, in iommu_create_device_direct_mappings(),
iommu_flush_iotlb_all() is called after direct mappings are created.

Best regards,
baolu

^ permalink raw reply	[flat|nested] 26+ messages in thread

* RE: [PATCH 1/1] iommu/vt-d: Omit devTLB invalidation requests when TES=0
  2023-11-14  4:54         ` Baolu Lu
@ 2023-11-14  5:31           ` Tian, Kevin
  0 siblings, 0 replies; 26+ messages in thread
From: Tian, Kevin @ 2023-11-14  5:31 UTC (permalink / raw)
  To: Baolu Lu, Joerg Roedel, Will Deacon, Robin Murphy, Jason Gunthorpe
  Cc: iommu, linux-kernel

> From: Baolu Lu <baolu.lu@linux.intel.com>
> Sent: Tuesday, November 14, 2023 12:55 PM
> 
> On 11/14/23 12:45 PM, Tian, Kevin wrote:
> >> From: Baolu Lu <baolu.lu@linux.intel.com>
> >> Sent: Tuesday, November 14, 2023 11:14 AM
> >>
> >> On 11/14/23 11:14 AM, Tian, Kevin wrote:
> >>>> From: Lu Baolu<baolu.lu@linux.intel.com>
> >>>> Sent: Tuesday, November 14, 2023 9:11 AM
> >>>>
> >>>> The latest VT-d spec indicates that when remapping hardware is
> disabled
> >>>> (TES=0 in Global Status Register), upstream ATS Invalidation Completion
> >>>> requests are treated as UR (Unsupported Request).
> >>>>
> >>>> Consequently, the spec recommends in section 4.3 Handling of Device-
> TLB
> >>>> Invalidations that software refrain from submitting any Device-TLB
> >>>> invalidation requests when address remapping hardware is disabled.
> >>>>
> >>>> Verify address remapping hardware is enabled prior to submitting
> Device-
> >>>> TLB invalidation requests.
> >>>>
> >>>> Fixes: 792fb43ce2c9 ("iommu/vt-d: Enable Intel IOMMU scalable mode
> by
> >>>> default")
> >>>> Signed-off-by: Lu Baolu<baolu.lu@linux.intel.com>
> >>>> ---
> >>>>    drivers/iommu/intel/dmar.c | 18 ++++++++++++++++++
> >>>>    1 file changed, 18 insertions(+)
> >>>>
> >>>> diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
> >>>> index a3414afe11b0..23cb80d62a9a 100644
> >>>> --- a/drivers/iommu/intel/dmar.c
> >>>> +++ b/drivers/iommu/intel/dmar.c
> >>>> @@ -1522,6 +1522,15 @@ void qi_flush_dev_iotlb(struct intel_iommu
> >>>> *iommu, u16 sid, u16 pfsid,
> >>>>    {
> >>>>    	struct qi_desc desc;
> >>>>
> >>>> +	/*
> >>>> +	 * VT-d spec, section 4.3:
> >>>> +	 *
> >>>> +	 * Software is recommended to not submit any Device-TLB
> >>>> invalidation
> >>>> +	 * requests while address remapping hardware is disabled.
> >>>> +	 */
> >>>> +	if (!(iommu->gcmd & DMA_GCMD_TE))
> >>>> +		return;
> >>>> +
> >>> Is it a valid case to see such request when the iommu is disabled?
> >>> If not then let's add a WARN.
> >>
> >> There might be valid cases. The VT-d translation is turned on after all
> >> devices get probed.
> >>
> >
> > but I didn't get why there will be actual mapping changes before
> > vtd translation is enabled...
> 
> For an example, in iommu_create_device_direct_mappings(),
> iommu_flush_iotlb_all() is called after direct mappings are created.
> 

Okay.

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/1] iommu/vt-d: Disable PCI ATS in legacy passthrough mode
  2023-11-14  1:10 ` [PATCH 1/1] iommu/vt-d: Disable PCI ATS in legacy passthrough mode Lu Baolu
  2023-11-14  3:14   ` Tian, Kevin
@ 2023-11-16  7:35   ` Baolu Lu
  2023-11-16  8:24     ` Tian, Kevin
  2023-11-29 20:13   ` Jason Gunthorpe
  2 siblings, 1 reply; 26+ messages in thread
From: Baolu Lu @ 2023-11-16  7:35 UTC (permalink / raw)
  To: Joerg Roedel, Will Deacon, Robin Murphy, Jason Gunthorpe, Kevin Tian
  Cc: baolu.lu, iommu, linux-kernel

On 2023/11/14 9:10, Lu Baolu wrote:
> When IOMMU hardware operates in legacy mode, the TT field of the context
> entry determines the translation type, with three supported types (Section
> 9.3 Context Entry):
> 
> - DMA translation without device TLB support
> - DMA translation with device TLB support
> - Passthrough mode with translated and translation requests blocked
> 
> Device TLB support is absent when hardware is configured in passthrough
> mode.
> 
> Disable the PCI ATS feature when IOMMU is configured for passthrough
> translation type in legacy (non-scalable) mode.
> 
> Fixes: 0faa19a1515f ("iommu/vt-d: Decouple PASID & PRI enabling from SVA")
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> ---
>   drivers/iommu/intel/iommu.c | 4 ++++
>   1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 11670cd812a3..c3ec09118ab1 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -1413,6 +1413,10 @@ static void iommu_enable_pci_caps(struct device_domain_info *info)
>   	if (!dev_is_pci(info->dev))
>   		return;
>   
> +	if (!sm_supported(info->iommu) && info->domain &&
> +	    domain_type_is_si(info->domain))
> +		return;
> +
>   	pdev = to_pci_dev(info->dev);
>   
>   	/* The PCIe spec, in its wisdom, declares that the behaviour of

Perhaps we could move the check into the caller and make this helper
transparent to the iommu mode and domain type?

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 11670cd812a3..9bddd4fbbdf8 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -2492,7 +2492,8 @@ static int dmar_domain_attach_device(struct 
dmar_domain *domain,
                 return ret;
         }

-       iommu_enable_pci_caps(info);
+       if (sm_supported(info->iommu) || !domain_type_is_si(info->domain))
+               iommu_enable_pci_caps(info);

         return 0;
  }

Best regards,
baolu

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* RE: [PATCH 1/1] iommu/vt-d: Disable PCI ATS in legacy passthrough mode
  2023-11-16  7:35   ` Baolu Lu
@ 2023-11-16  8:24     ` Tian, Kevin
  2023-11-17  1:09       ` Baolu Lu
  0 siblings, 1 reply; 26+ messages in thread
From: Tian, Kevin @ 2023-11-16  8:24 UTC (permalink / raw)
  To: Baolu Lu, Joerg Roedel, Will Deacon, Robin Murphy, Jason Gunthorpe
  Cc: iommu, linux-kernel

> From: Baolu Lu <baolu.lu@linux.intel.com>
> Sent: Thursday, November 16, 2023 3:35 PM
> 
> On 2023/11/14 9:10, Lu Baolu wrote:
> > When IOMMU hardware operates in legacy mode, the TT field of the
> context
> > entry determines the translation type, with three supported types (Section
> > 9.3 Context Entry):
> >
> > - DMA translation without device TLB support
> > - DMA translation with device TLB support
> > - Passthrough mode with translated and translation requests blocked
> >
> > Device TLB support is absent when hardware is configured in passthrough
> > mode.
> >
> > Disable the PCI ATS feature when IOMMU is configured for passthrough
> > translation type in legacy (non-scalable) mode.
> >
> > Fixes: 0faa19a1515f ("iommu/vt-d: Decouple PASID & PRI enabling from
> SVA")
> > Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> > ---
> >   drivers/iommu/intel/iommu.c | 4 ++++
> >   1 file changed, 4 insertions(+)
> >
> > diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> > index 11670cd812a3..c3ec09118ab1 100644
> > --- a/drivers/iommu/intel/iommu.c
> > +++ b/drivers/iommu/intel/iommu.c
> > @@ -1413,6 +1413,10 @@ static void iommu_enable_pci_caps(struct
> device_domain_info *info)
> >   	if (!dev_is_pci(info->dev))
> >   		return;
> >
> > +	if (!sm_supported(info->iommu) && info->domain &&
> > +	    domain_type_is_si(info->domain))
> > +		return;
> > +
> >   	pdev = to_pci_dev(info->dev);
> >
> >   	/* The PCIe spec, in its wisdom, declares that the behaviour of
> 
> Perhaps we could move the check into the caller and make this helper
> transparent to the iommu mode and domain type?
> 
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 11670cd812a3..9bddd4fbbdf8 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -2492,7 +2492,8 @@ static int dmar_domain_attach_device(struct
> dmar_domain *domain,
>                  return ret;
>          }
> 
> -       iommu_enable_pci_caps(info);
> +       if (sm_supported(info->iommu) || !domain_type_is_si(info->domain))
> +               iommu_enable_pci_caps(info);
> 

IMHO both old and this new version are confusing regarding to that
the commit msg talks only about ATS but the actual code disable all
pci caps. It's correct, being that only ATS is relevant in legacy mode,
but the readability is not good.

what about introducing a helper e.g. device_domain_ats_supported(info)
which includes above checks plus info->ats_supported and then use it
to replace info->ats_supported in iommu_enable_pci_caps()?


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/1] iommu/vt-d: Disable PCI ATS in legacy passthrough mode
  2023-11-16  8:24     ` Tian, Kevin
@ 2023-11-17  1:09       ` Baolu Lu
  2023-11-17  2:22         ` Tian, Kevin
  0 siblings, 1 reply; 26+ messages in thread
From: Baolu Lu @ 2023-11-17  1:09 UTC (permalink / raw)
  To: Tian, Kevin, Joerg Roedel, Will Deacon, Robin Murphy, Jason Gunthorpe
  Cc: baolu.lu, iommu, linux-kernel

On 11/16/23 4:24 PM, Tian, Kevin wrote:
>> From: Baolu Lu <baolu.lu@linux.intel.com>
>> Sent: Thursday, November 16, 2023 3:35 PM
>>
>> On 2023/11/14 9:10, Lu Baolu wrote:
>>> When IOMMU hardware operates in legacy mode, the TT field of the
>> context
>>> entry determines the translation type, with three supported types (Section
>>> 9.3 Context Entry):
>>>
>>> - DMA translation without device TLB support
>>> - DMA translation with device TLB support
>>> - Passthrough mode with translated and translation requests blocked
>>>
>>> Device TLB support is absent when hardware is configured in passthrough
>>> mode.
>>>
>>> Disable the PCI ATS feature when IOMMU is configured for passthrough
>>> translation type in legacy (non-scalable) mode.
>>>
>>> Fixes: 0faa19a1515f ("iommu/vt-d: Decouple PASID & PRI enabling from
>> SVA")
>>> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
>>> ---
>>>    drivers/iommu/intel/iommu.c | 4 ++++
>>>    1 file changed, 4 insertions(+)
>>>
>>> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
>>> index 11670cd812a3..c3ec09118ab1 100644
>>> --- a/drivers/iommu/intel/iommu.c
>>> +++ b/drivers/iommu/intel/iommu.c
>>> @@ -1413,6 +1413,10 @@ static void iommu_enable_pci_caps(struct
>> device_domain_info *info)
>>>    	if (!dev_is_pci(info->dev))
>>>    		return;
>>>
>>> +	if (!sm_supported(info->iommu) && info->domain &&
>>> +	    domain_type_is_si(info->domain))
>>> +		return;
>>> +
>>>    	pdev = to_pci_dev(info->dev);
>>>
>>>    	/* The PCIe spec, in its wisdom, declares that the behaviour of
>>
>> Perhaps we could move the check into the caller and make this helper
>> transparent to the iommu mode and domain type?
>>
>> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
>> index 11670cd812a3..9bddd4fbbdf8 100644
>> --- a/drivers/iommu/intel/iommu.c
>> +++ b/drivers/iommu/intel/iommu.c
>> @@ -2492,7 +2492,8 @@ static int dmar_domain_attach_device(struct
>> dmar_domain *domain,
>>                   return ret;
>>           }
>>
>> -       iommu_enable_pci_caps(info);
>> +       if (sm_supported(info->iommu) || !domain_type_is_si(info->domain))
>> +               iommu_enable_pci_caps(info);
>>
> 
> IMHO both old and this new version are confusing regarding to that
> the commit msg talks only about ATS but the actual code disable all
> pci caps. It's correct, being that only ATS is relevant in legacy mode,
> but the readability is not good.

The function name is a bit misleading, but its actual purpose is to
enable the ATS feature. PASID is enabled within this function due to a
specification requirement that prevents modifications to PASID registers
after ATS activation.

> 
> what about introducing a helper e.g. device_domain_ats_supported(info)
> which includes above checks plus info->ats_supported and then use it
> to replace info->ats_supported in iommu_enable_pci_caps()?
> 

I have a patch series under test which moves the ATS control out to the
device drivers or users (through sysfs node). With that done, we could
simply remove all these and give the control to device drivers or users.

How about considering above after I post that series? And for now, we
can make the fix patch quick and simple.

Best regards,
baolu

^ permalink raw reply	[flat|nested] 26+ messages in thread

* RE: [PATCH 1/1] iommu/vt-d: Disable PCI ATS in legacy passthrough mode
  2023-11-17  1:09       ` Baolu Lu
@ 2023-11-17  2:22         ` Tian, Kevin
  0 siblings, 0 replies; 26+ messages in thread
From: Tian, Kevin @ 2023-11-17  2:22 UTC (permalink / raw)
  To: Baolu Lu, Joerg Roedel, Will Deacon, Robin Murphy, Jason Gunthorpe
  Cc: iommu, linux-kernel

> From: Baolu Lu <baolu.lu@linux.intel.com>
> Sent: Friday, November 17, 2023 9:10 AM
> 
> On 11/16/23 4:24 PM, Tian, Kevin wrote:
> >> From: Baolu Lu <baolu.lu@linux.intel.com>
> >> Sent: Thursday, November 16, 2023 3:35 PM
> >>
> >> On 2023/11/14 9:10, Lu Baolu wrote:
> >>> When IOMMU hardware operates in legacy mode, the TT field of the
> >> context
> >>> entry determines the translation type, with three supported types
> (Section
> >>> 9.3 Context Entry):
> >>>
> >>> - DMA translation without device TLB support
> >>> - DMA translation with device TLB support
> >>> - Passthrough mode with translated and translation requests blocked
> >>>
> >>> Device TLB support is absent when hardware is configured in
> passthrough
> >>> mode.
> >>>
> >>> Disable the PCI ATS feature when IOMMU is configured for passthrough
> >>> translation type in legacy (non-scalable) mode.
> >>>
> >>> Fixes: 0faa19a1515f ("iommu/vt-d: Decouple PASID & PRI enabling from
> >> SVA")
> >>> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> >>> ---
> >>>    drivers/iommu/intel/iommu.c | 4 ++++
> >>>    1 file changed, 4 insertions(+)
> >>>
> >>> diff --git a/drivers/iommu/intel/iommu.c
> b/drivers/iommu/intel/iommu.c
> >>> index 11670cd812a3..c3ec09118ab1 100644
> >>> --- a/drivers/iommu/intel/iommu.c
> >>> +++ b/drivers/iommu/intel/iommu.c
> >>> @@ -1413,6 +1413,10 @@ static void iommu_enable_pci_caps(struct
> >> device_domain_info *info)
> >>>    	if (!dev_is_pci(info->dev))
> >>>    		return;
> >>>
> >>> +	if (!sm_supported(info->iommu) && info->domain &&
> >>> +	    domain_type_is_si(info->domain))
> >>> +		return;
> >>> +
> >>>    	pdev = to_pci_dev(info->dev);
> >>>
> >>>    	/* The PCIe spec, in its wisdom, declares that the behaviour of
> >>
> >> Perhaps we could move the check into the caller and make this helper
> >> transparent to the iommu mode and domain type?
> >>
> >> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> >> index 11670cd812a3..9bddd4fbbdf8 100644
> >> --- a/drivers/iommu/intel/iommu.c
> >> +++ b/drivers/iommu/intel/iommu.c
> >> @@ -2492,7 +2492,8 @@ static int dmar_domain_attach_device(struct
> >> dmar_domain *domain,
> >>                   return ret;
> >>           }
> >>
> >> -       iommu_enable_pci_caps(info);
> >> +       if (sm_supported(info->iommu) || !domain_type_is_si(info-
> >domain))
> >> +               iommu_enable_pci_caps(info);
> >>
> >
> > IMHO both old and this new version are confusing regarding to that
> > the commit msg talks only about ATS but the actual code disable all
> > pci caps. It's correct, being that only ATS is relevant in legacy mode,
> > but the readability is not good.
> 
> The function name is a bit misleading, but its actual purpose is to
> enable the ATS feature. PASID is enabled within this function due to a
> specification requirement that prevents modifications to PASID registers
> after ATS activation.
> 
> >
> > what about introducing a helper e.g. device_domain_ats_supported(info)
> > which includes above checks plus info->ats_supported and then use it
> > to replace info->ats_supported in iommu_enable_pci_caps()?
> >
> 
> I have a patch series under test which moves the ATS control out to the
> device drivers or users (through sysfs node). With that done, we could
> simply remove all these and give the control to device drivers or users.
> 
> How about considering above after I post that series? And for now, we
> can make the fix patch quick and simple.
> 

Then please add a comment for the rationale.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/1] iommu/vt-d: Support enforce_cache_coherency only for empty domains
  2023-11-14  1:10 [PATCH 1/1] iommu/vt-d: Support enforce_cache_coherency only for empty domains Lu Baolu
                   ` (3 preceding siblings ...)
  2023-11-14  3:16 ` [PATCH 1/1] iommu/vt-d: Support enforce_cache_coherency only for empty domains Tian, Kevin
@ 2023-11-29 20:08 ` Jason Gunthorpe
  2023-11-30  3:48   ` Baolu Lu
  4 siblings, 1 reply; 26+ messages in thread
From: Jason Gunthorpe @ 2023-11-29 20:08 UTC (permalink / raw)
  To: Lu Baolu
  Cc: Joerg Roedel, Will Deacon, Robin Murphy, Kevin Tian, iommu, linux-kernel

On Tue, Nov 14, 2023 at 09:10:33AM +0800, Lu Baolu wrote:
> diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
> index 65d37a138c75..ce030c5b5772 100644
> --- a/drivers/iommu/intel/iommu.h
> +++ b/drivers/iommu/intel/iommu.h
> @@ -602,6 +602,9 @@ struct dmar_domain {
>  					 */
>  	u8 dirty_tracking:1;		/* Dirty tracking is enabled */
>  	u8 nested_parent:1;		/* Has other domains nested on it */
> +	u8 has_mappings:1;		/* Has mappings configured through
> +					 * iommu_map() interface.
> +					 */

Is it racey?

The other option is to make iommfd do this and forbid it from
switching the enforce_cache_coherency if the IOAS has any maps
attached. We can get the correct locking at that point.

AMD has the same issue if it ever wants to implement its per-PTE bit

Jason

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/1] iommu/vt-d: Omit devTLB invalidation requests when TES=0
  2023-11-14  1:10 ` [PATCH 1/1] iommu/vt-d: Omit devTLB invalidation requests when TES=0 Lu Baolu
  2023-11-14  3:14   ` Tian, Kevin
@ 2023-11-29 20:10   ` Jason Gunthorpe
  2023-11-30  4:06     ` Baolu Lu
  1 sibling, 1 reply; 26+ messages in thread
From: Jason Gunthorpe @ 2023-11-29 20:10 UTC (permalink / raw)
  To: Lu Baolu
  Cc: Joerg Roedel, Will Deacon, Robin Murphy, Kevin Tian, iommu, linux-kernel

On Tue, Nov 14, 2023 at 09:10:34AM +0800, Lu Baolu wrote:
> The latest VT-d spec indicates that when remapping hardware is disabled
> (TES=0 in Global Status Register), upstream ATS Invalidation Completion
> requests are treated as UR (Unsupported Request).
> 
> Consequently, the spec recommends in section 4.3 Handling of Device-TLB
> Invalidations that software refrain from submitting any Device-TLB
> invalidation requests when address remapping hardware is disabled.
> 
> Verify address remapping hardware is enabled prior to submitting Device-
> TLB invalidation requests.
> 
> Fixes: 792fb43ce2c9 ("iommu/vt-d: Enable Intel IOMMU scalable mode by default")
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> ---
>  drivers/iommu/intel/dmar.c | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)

How did you get to the point where flush_dev_iotlb could even be
called if the iommu has somehow been globally disabled?

Shouldn't the attach of the domain compeltely fail if the HW is
disabled?

If the domain is not attached to anything why would flushing happen?

Jason

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/1] iommu/vt-d: Disable PCI ATS in legacy passthrough mode
  2023-11-14  1:10 ` [PATCH 1/1] iommu/vt-d: Disable PCI ATS in legacy passthrough mode Lu Baolu
  2023-11-14  3:14   ` Tian, Kevin
  2023-11-16  7:35   ` Baolu Lu
@ 2023-11-29 20:13   ` Jason Gunthorpe
  2023-11-30  5:44     ` Baolu Lu
  2 siblings, 1 reply; 26+ messages in thread
From: Jason Gunthorpe @ 2023-11-29 20:13 UTC (permalink / raw)
  To: Lu Baolu
  Cc: Joerg Roedel, Will Deacon, Robin Murphy, Kevin Tian, iommu, linux-kernel

On Tue, Nov 14, 2023 at 09:10:35AM +0800, Lu Baolu wrote:
> When IOMMU hardware operates in legacy mode, the TT field of the context
> entry determines the translation type, with three supported types (Section
> 9.3 Context Entry):
> 
> - DMA translation without device TLB support
> - DMA translation with device TLB support
> - Passthrough mode with translated and translation requests blocked
> 
> Device TLB support is absent when hardware is configured in passthrough
> mode.
> 
> Disable the PCI ATS feature when IOMMU is configured for passthrough
> translation type in legacy (non-scalable) mode.

Oh.. That is the same horrible outcome that ARM has :(

The issue is what to do if the RID translation is in identity but a
PASID is attached that should be using ATS - eg do you completely
loose SVA support if the RID is set to the optimized identity mode?

I vote no. We should make the drivers aware that they should not use
ATS on their RIDs instead :(

Jason

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/1] iommu/vt-d: Support enforce_cache_coherency only for empty domains
  2023-11-29 20:08 ` Jason Gunthorpe
@ 2023-11-30  3:48   ` Baolu Lu
  0 siblings, 0 replies; 26+ messages in thread
From: Baolu Lu @ 2023-11-30  3:48 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: baolu.lu, Joerg Roedel, Will Deacon, Robin Murphy, Kevin Tian,
	iommu, linux-kernel

On 2023/11/30 4:08, Jason Gunthorpe wrote:
> On Tue, Nov 14, 2023 at 09:10:33AM +0800, Lu Baolu wrote:
>> diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
>> index 65d37a138c75..ce030c5b5772 100644
>> --- a/drivers/iommu/intel/iommu.h
>> +++ b/drivers/iommu/intel/iommu.h
>> @@ -602,6 +602,9 @@ struct dmar_domain {
>>   					 */
>>   	u8 dirty_tracking:1;		/* Dirty tracking is enabled */
>>   	u8 nested_parent:1;		/* Has other domains nested on it */
>> +	u8 has_mappings:1;		/* Has mappings configured through
>> +					 * iommu_map() interface.
>> +					 */
> Is it racey?
> 
> The other option is to make iommfd do this and forbid it from
> switching the enforce_cache_coherency if the IOAS has any maps
> attached. We can get the correct locking at that point.
> 
> AMD has the same issue if it ever wants to implement its per-PTE bit

Yes. It's better to do this in iommufd. With that done, we can then
remove this code.

Best regards,
baolu

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/1] iommu/vt-d: Omit devTLB invalidation requests when TES=0
  2023-11-29 20:10   ` Jason Gunthorpe
@ 2023-11-30  4:06     ` Baolu Lu
  2023-11-30 12:15       ` Jason Gunthorpe
  0 siblings, 1 reply; 26+ messages in thread
From: Baolu Lu @ 2023-11-30  4:06 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: baolu.lu, Joerg Roedel, Will Deacon, Robin Murphy, Kevin Tian,
	iommu, linux-kernel

On 2023/11/30 4:10, Jason Gunthorpe wrote:
> On Tue, Nov 14, 2023 at 09:10:34AM +0800, Lu Baolu wrote:
>> The latest VT-d spec indicates that when remapping hardware is disabled
>> (TES=0 in Global Status Register), upstream ATS Invalidation Completion
>> requests are treated as UR (Unsupported Request).
>>
>> Consequently, the spec recommends in section 4.3 Handling of Device-TLB
>> Invalidations that software refrain from submitting any Device-TLB
>> invalidation requests when address remapping hardware is disabled.
>>
>> Verify address remapping hardware is enabled prior to submitting Device-
>> TLB invalidation requests.
>>
>> Fixes: 792fb43ce2c9 ("iommu/vt-d: Enable Intel IOMMU scalable mode by default")
>> Signed-off-by: Lu Baolu<baolu.lu@linux.intel.com>
>> ---
>>   drivers/iommu/intel/dmar.c | 18 ++++++++++++++++++
>>   1 file changed, 18 insertions(+)
> How did you get to the point where flush_dev_iotlb could even be
> called if the iommu has somehow been globally disabled?
> 
> Shouldn't the attach of the domain compeltely fail if the HW is
> disabled?
> 
> If the domain is not attached to anything why would flushing happen?

The VT-d hardware can be in a state where the hardware is on but DMA
translation is deactivated. In this state, the device probe process
during boot proceeds as follows:

1) Initialize the IOMMU contexts: This sets up the data structures that
    the IOMMU uses to manage address translation for DMA operations.

2) Register the IOMMU devices: This registers the IOMMU devices to the
    core. The core then probes devices on buses like PCI.

3) Enable DMA translation: This step activates DMA translation.

With regard to step 2), the call to iommu_flush_iotlb_all() in
iommu_create_device_direct_mappings() can potentially cause device TBL
invalidation when the VT-d DMA translation is deactivated.

Best regards,
baolu

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/1] iommu/vt-d: Disable PCI ATS in legacy passthrough mode
  2023-11-29 20:13   ` Jason Gunthorpe
@ 2023-11-30  5:44     ` Baolu Lu
  2023-11-30 16:18       ` Jason Gunthorpe
  0 siblings, 1 reply; 26+ messages in thread
From: Baolu Lu @ 2023-11-30  5:44 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: baolu.lu, Joerg Roedel, Will Deacon, Robin Murphy, Kevin Tian,
	iommu, linux-kernel

On 2023/11/30 4:13, Jason Gunthorpe wrote:
> On Tue, Nov 14, 2023 at 09:10:35AM +0800, Lu Baolu wrote:
>> When IOMMU hardware operates in legacy mode, the TT field of the context
>> entry determines the translation type, with three supported types (Section
>> 9.3 Context Entry):
>>
>> - DMA translation without device TLB support
>> - DMA translation with device TLB support
>> - Passthrough mode with translated and translation requests blocked
>>
>> Device TLB support is absent when hardware is configured in passthrough
>> mode.
>>
>> Disable the PCI ATS feature when IOMMU is configured for passthrough
>> translation type in legacy (non-scalable) mode.
> Oh.. That is the same horrible outcome that ARM has 🙁
> 
> The issue is what to do if the RID translation is in identity but a
> PASID is attached that should be using ATS - eg do you completely
> loose SVA support if the RID is set to the optimized identity mode?

This fix only affects the non-scalable mode that doesn't support PASID
  features.

> I vote no. We should make the drivers aware that they should not use
> ATS on their RIDs instead 🙁

Intel VT-d hardware does not supported ATS in non-scalable mode when
translation is set to passthrough mode. Historically, the Intel IOMMU
driver has never enabled the ATS feature in this configuration.

Commit 0faa19a1515f changed this behavior by accident, potentially
leading to incorrect hardware configuration. This patch fixes the issue
by reverting to the previous behavior. Otherwise, it leads to hardware
configuration errors and potential DMA malfunction, as all translated
DMA requests would be blocked by the IOMMU.

Going forward, I agree that device drivers should have the ability to
access and potentially control the ATS status. I have a upcoming series
that will enable device drivers or user-space to manage the ATS as we
have discussed in other threads before.

Best regards,
baolu

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/1] iommu/vt-d: Omit devTLB invalidation requests when TES=0
  2023-11-30  4:06     ` Baolu Lu
@ 2023-11-30 12:15       ` Jason Gunthorpe
  0 siblings, 0 replies; 26+ messages in thread
From: Jason Gunthorpe @ 2023-11-30 12:15 UTC (permalink / raw)
  To: Baolu Lu
  Cc: Joerg Roedel, Will Deacon, Robin Murphy, Kevin Tian, iommu, linux-kernel

On Thu, Nov 30, 2023 at 12:06:59PM +0800, Baolu Lu wrote:
> On 2023/11/30 4:10, Jason Gunthorpe wrote:
> > On Tue, Nov 14, 2023 at 09:10:34AM +0800, Lu Baolu wrote:
> > > The latest VT-d spec indicates that when remapping hardware is disabled
> > > (TES=0 in Global Status Register), upstream ATS Invalidation Completion
> > > requests are treated as UR (Unsupported Request).
> > > 
> > > Consequently, the spec recommends in section 4.3 Handling of Device-TLB
> > > Invalidations that software refrain from submitting any Device-TLB
> > > invalidation requests when address remapping hardware is disabled.
> > > 
> > > Verify address remapping hardware is enabled prior to submitting Device-
> > > TLB invalidation requests.
> > > 
> > > Fixes: 792fb43ce2c9 ("iommu/vt-d: Enable Intel IOMMU scalable mode by default")
> > > Signed-off-by: Lu Baolu<baolu.lu@linux.intel.com>
> > > ---
> > >   drivers/iommu/intel/dmar.c | 18 ++++++++++++++++++
> > >   1 file changed, 18 insertions(+)
> > How did you get to the point where flush_dev_iotlb could even be
> > called if the iommu has somehow been globally disabled?
> > 
> > Shouldn't the attach of the domain compeltely fail if the HW is
> > disabled?
> > 
> > If the domain is not attached to anything why would flushing happen?
> 
> The VT-d hardware can be in a state where the hardware is on but DMA
> translation is deactivated. In this state, the device probe process
> during boot proceeds as follows:
> 
> 1) Initialize the IOMMU contexts: This sets up the data structures that
>    the IOMMU uses to manage address translation for DMA operations.
> 
> 2) Register the IOMMU devices: This registers the IOMMU devices to the
>    core. The core then probes devices on buses like PCI.
> 
> 3) Enable DMA translation: This step activates DMA translation.
> 
> With regard to step 2), the call to iommu_flush_iotlb_all() in
> iommu_create_device_direct_mappings() can potentially cause device TBL
> invalidation when the VT-d DMA translation is deactivated.

You are trying to create an atomic change at boot from non-translating
to DMA translating for HW that doesn't support the identity mode?

This should probably get a comment in this patch..

Jason

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/1] iommu/vt-d: Disable PCI ATS in legacy passthrough mode
  2023-11-30  5:44     ` Baolu Lu
@ 2023-11-30 16:18       ` Jason Gunthorpe
  0 siblings, 0 replies; 26+ messages in thread
From: Jason Gunthorpe @ 2023-11-30 16:18 UTC (permalink / raw)
  To: Baolu Lu
  Cc: Joerg Roedel, Will Deacon, Robin Murphy, Kevin Tian, iommu, linux-kernel

On Thu, Nov 30, 2023 at 01:44:19PM +0800, Baolu Lu wrote:
> On 2023/11/30 4:13, Jason Gunthorpe wrote:
> > On Tue, Nov 14, 2023 at 09:10:35AM +0800, Lu Baolu wrote:
> > > When IOMMU hardware operates in legacy mode, the TT field of the context
> > > entry determines the translation type, with three supported types (Section
> > > 9.3 Context Entry):
> > > 
> > > - DMA translation without device TLB support
> > > - DMA translation with device TLB support
> > > - Passthrough mode with translated and translation requests blocked
> > > 
> > > Device TLB support is absent when hardware is configured in passthrough
> > > mode.
> > > 
> > > Disable the PCI ATS feature when IOMMU is configured for passthrough
> > > translation type in legacy (non-scalable) mode.
> > Oh.. That is the same horrible outcome that ARM has 🙁
> > 
> > The issue is what to do if the RID translation is in identity but a
> > PASID is attached that should be using ATS - eg do you completely
> > loose SVA support if the RID is set to the optimized identity mode?
> 
> This fix only affects the non-scalable mode that doesn't support PASID
>  features.

Ah, OK so it is OK. I'm glad the new mode supports ATS against
passthrough.

Thanks,
Jason

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2023-11-30 16:18 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-11-14  1:10 [PATCH 1/1] iommu/vt-d: Support enforce_cache_coherency only for empty domains Lu Baolu
2023-11-14  1:10 ` [PATCH 1/1] iommu/vt-d: Omit devTLB invalidation requests when TES=0 Lu Baolu
2023-11-14  3:14   ` Tian, Kevin
2023-11-14  3:13     ` Baolu Lu
2023-11-14  4:45       ` Tian, Kevin
2023-11-14  4:54         ` Baolu Lu
2023-11-14  5:31           ` Tian, Kevin
2023-11-29 20:10   ` Jason Gunthorpe
2023-11-30  4:06     ` Baolu Lu
2023-11-30 12:15       ` Jason Gunthorpe
2023-11-14  1:10 ` [PATCH 1/1] iommu/vt-d: Disable PCI ATS in legacy passthrough mode Lu Baolu
2023-11-14  3:14   ` Tian, Kevin
2023-11-16  7:35   ` Baolu Lu
2023-11-16  8:24     ` Tian, Kevin
2023-11-17  1:09       ` Baolu Lu
2023-11-17  2:22         ` Tian, Kevin
2023-11-29 20:13   ` Jason Gunthorpe
2023-11-30  5:44     ` Baolu Lu
2023-11-30 16:18       ` Jason Gunthorpe
2023-11-14  1:10 ` [PATCH 1/1] iommu/vt-d: Make context clearing consistent with context mapping Lu Baolu
2023-11-14  3:20   ` Tian, Kevin
2023-11-14  3:22     ` Baolu Lu
2023-11-14  4:46       ` Tian, Kevin
2023-11-14  3:16 ` [PATCH 1/1] iommu/vt-d: Support enforce_cache_coherency only for empty domains Tian, Kevin
2023-11-29 20:08 ` Jason Gunthorpe
2023-11-30  3:48   ` Baolu Lu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.