linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/2] iommu/vt-d: Avoid unnecessary device TLB flush in map path
@ 2024-04-07 14:42 Lu Baolu
  2024-04-07 14:42 ` [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush Lu Baolu
                   ` (3 more replies)
  0 siblings, 4 replies; 21+ messages in thread
From: Lu Baolu @ 2024-04-07 14:42 UTC (permalink / raw)
  To: iommu
  Cc: Kevin Tian, Yi Liu, Joerg Roedel, Will Deacon, Robin Murphy,
	linux-kernel, Lu Baolu

iommu_flush_iotlb_psi() is called in map and unmap paths. The caching
mode check before device TLB invalidation will cause device TLB
invalidation always issued if IOMMU is not running in the caching mode.
This is inefficient and causes performance overhead.

Make device TLB invalidation behavior consistent between batched mode
unmapping and strict mode unmapping. Device TLB invalidation should only
be requested in the unmap path if the IOMMU is not in caching mode.

Fixes: bf92df30df90 ("intel-iommu: Only avoid flushing device IOTLB for domain ID 0 in caching mode")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel/iommu.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 50eb9aed47cc..493b6a600394 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1501,11 +1501,7 @@ static void iommu_flush_iotlb_psi(struct intel_iommu *iommu,
 	else
 		__iommu_flush_iotlb_psi(iommu, did, pfn, pages, ih);
 
-	/*
-	 * In caching mode, changes of pages from non-present to present require
-	 * flush. However, device IOTLB doesn't need to be flushed in this case.
-	 */
-	if (!cap_caching_mode(iommu->cap) || !map)
+	if (!cap_caching_mode(iommu->cap) && !map)
 		iommu_flush_dev_iotlb(domain, addr, mask);
 }
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush
  2024-04-07 14:42 [PATCH 1/2] iommu/vt-d: Avoid unnecessary device TLB flush in map path Lu Baolu
@ 2024-04-07 14:42 ` Lu Baolu
  2024-04-08  7:21   ` Ethan Zhao
                     ` (3 more replies)
  2024-04-09  7:20 ` [PATCH 1/2] iommu/vt-d: Avoid unnecessary device TLB flush in map path Tian, Kevin
                   ` (2 subsequent siblings)
  3 siblings, 4 replies; 21+ messages in thread
From: Lu Baolu @ 2024-04-07 14:42 UTC (permalink / raw)
  To: iommu
  Cc: Kevin Tian, Yi Liu, Joerg Roedel, Will Deacon, Robin Murphy,
	linux-kernel, Lu Baolu

The Caching Mode (CM) of the Intel IOMMU indicates if the hardware
implementation caches not-present or erroneous translation-structure
entries except the first-stage translation. The caching mode is
unrelated to the device TLB , therefore there is no need to check
it before a device TLB invalidation operation.

Before the scalable mode is introduced, caching mode is treated as
an indication that the driver is running in a VM guest. This is just
a software contract as shadow page table is the only way to implement
a virtual IOMMU. But the VT-d spec doesn't state this anywhere. After
the scalable mode is introduced, this doesn't stand for anymore, as
caching mode is not relevant for the first-stage translation. A virtual
IOMMU implementation is free to support first-stage translation only
with caching mode cleared.

Remove the caching mode check before device TLB invalidation to ensure
compatibility with the scalable mode use cases.

Fixes: 792fb43ce2c9 ("iommu/vt-d: Enable Intel IOMMU scalable mode by default")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel/iommu.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 493b6a600394..681789b1258d 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1501,7 +1501,7 @@ static void iommu_flush_iotlb_psi(struct intel_iommu *iommu,
 	else
 		__iommu_flush_iotlb_psi(iommu, did, pfn, pages, ih);
 
-	if (!cap_caching_mode(iommu->cap) && !map)
+	if (!map)
 		iommu_flush_dev_iotlb(domain, addr, mask);
 }
 
@@ -1575,8 +1575,7 @@ static void intel_flush_iotlb_all(struct iommu_domain *domain)
 			iommu->flush.flush_iotlb(iommu, did, 0, 0,
 						 DMA_TLB_DSI_FLUSH);
 
-		if (!cap_caching_mode(iommu->cap))
-			iommu_flush_dev_iotlb(dmar_domain, 0, MAX_AGAW_PFN_WIDTH);
+		iommu_flush_dev_iotlb(dmar_domain, 0, MAX_AGAW_PFN_WIDTH);
 	}
 
 	if (dmar_domain->nested_parent)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush
  2024-04-07 14:42 ` [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush Lu Baolu
@ 2024-04-08  7:21   ` Ethan Zhao
  2024-04-08  7:23     ` Baolu Lu
  2024-04-08 21:03   ` Jacob Pan
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 21+ messages in thread
From: Ethan Zhao @ 2024-04-08  7:21 UTC (permalink / raw)
  To: Lu Baolu, iommu
  Cc: Kevin Tian, Yi Liu, Joerg Roedel, Will Deacon, Robin Murphy,
	linux-kernel

On 4/7/2024 10:42 PM, Lu Baolu wrote:
> The Caching Mode (CM) of the Intel IOMMU indicates if the hardware
> implementation caches not-present or erroneous translation-structure
> entries except the first-stage translation. The caching mode is
> unrelated to the device TLB , therefore there is no need to check
> it before a device TLB invalidation operation.
>
> Before the scalable mode is introduced, caching mode is treated as
> an indication that the driver is running in a VM guest. This is just
> a software contract as shadow page table is the only way to implement
> a virtual IOMMU. But the VT-d spec doesn't state this anywhere. After
> the scalable mode is introduced, this doesn't stand for anymore, as
> caching mode is not relevant for the first-stage translation. A virtual
> IOMMU implementation is free to support first-stage translation only
> with caching mode cleared.
>
> Remove the caching mode check before device TLB invalidation to ensure
> compatibility with the scalable mode use cases.
>
> Fixes: 792fb43ce2c9 ("iommu/vt-d: Enable Intel IOMMU scalable mode by default")
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> ---
>   drivers/iommu/intel/iommu.c | 5 ++---
>   1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 493b6a600394..681789b1258d 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -1501,7 +1501,7 @@ static void iommu_flush_iotlb_psi(struct intel_iommu *iommu,
>   	else
>   		__iommu_flush_iotlb_psi(iommu, did, pfn, pages, ih);
>   
> -	if (!cap_caching_mode(iommu->cap) && !map)
> +	if (!map)

My understanding, we don't need patch[1/2] at all, and customer is just asking
about the CM & tlb flushing, it is great to have this commit [2/2].


Thanks,
Ethan

>   		iommu_flush_dev_iotlb(domain, addr, mask);
>   }
>   
> @@ -1575,8 +1575,7 @@ static void intel_flush_iotlb_all(struct iommu_domain *domain)
>   			iommu->flush.flush_iotlb(iommu, did, 0, 0,
>   						 DMA_TLB_DSI_FLUSH);
>   
> -		if (!cap_caching_mode(iommu->cap))
> -			iommu_flush_dev_iotlb(dmar_domain, 0, MAX_AGAW_PFN_WIDTH);
> +		iommu_flush_dev_iotlb(dmar_domain, 0, MAX_AGAW_PFN_WIDTH);
>   	}
>   
>   	if (dmar_domain->nested_parent)

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush
  2024-04-08  7:21   ` Ethan Zhao
@ 2024-04-08  7:23     ` Baolu Lu
  2024-04-08  7:43       ` Ethan Zhao
  0 siblings, 1 reply; 21+ messages in thread
From: Baolu Lu @ 2024-04-08  7:23 UTC (permalink / raw)
  To: Ethan Zhao, iommu
  Cc: baolu.lu, Kevin Tian, Yi Liu, Joerg Roedel, Will Deacon,
	Robin Murphy, linux-kernel

On 2024/4/8 15:21, Ethan Zhao wrote:
> On 4/7/2024 10:42 PM, Lu Baolu wrote:
>> The Caching Mode (CM) of the Intel IOMMU indicates if the hardware
>> implementation caches not-present or erroneous translation-structure
>> entries except the first-stage translation. The caching mode is
>> unrelated to the device TLB , therefore there is no need to check
>> it before a device TLB invalidation operation.
>>
>> Before the scalable mode is introduced, caching mode is treated as
>> an indication that the driver is running in a VM guest. This is just
>> a software contract as shadow page table is the only way to implement
>> a virtual IOMMU. But the VT-d spec doesn't state this anywhere. After
>> the scalable mode is introduced, this doesn't stand for anymore, as
>> caching mode is not relevant for the first-stage translation. A virtual
>> IOMMU implementation is free to support first-stage translation only
>> with caching mode cleared.
>>
>> Remove the caching mode check before device TLB invalidation to ensure
>> compatibility with the scalable mode use cases.
>>
>> Fixes: 792fb43ce2c9 ("iommu/vt-d: Enable Intel IOMMU scalable mode by 
>> default")
>> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
>> ---
>>   drivers/iommu/intel/iommu.c | 5 ++---
>>   1 file changed, 2 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
>> index 493b6a600394..681789b1258d 100644
>> --- a/drivers/iommu/intel/iommu.c
>> +++ b/drivers/iommu/intel/iommu.c
>> @@ -1501,7 +1501,7 @@ static void iommu_flush_iotlb_psi(struct 
>> intel_iommu *iommu,
>>       else
>>           __iommu_flush_iotlb_psi(iommu, did, pfn, pages, ih);
>> -    if (!cap_caching_mode(iommu->cap) && !map)
>> +    if (!map)
> 
> My understanding, we don't need patch[1/2] at all, and customer is just 
> asking
> about the CM & tlb flushing, it is great to have this commit [2/2].

Actually they fix different problems.

Best regards,
baolu

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush
  2024-04-08  7:23     ` Baolu Lu
@ 2024-04-08  7:43       ` Ethan Zhao
  0 siblings, 0 replies; 21+ messages in thread
From: Ethan Zhao @ 2024-04-08  7:43 UTC (permalink / raw)
  To: Baolu Lu, iommu
  Cc: Kevin Tian, Yi Liu, Joerg Roedel, Will Deacon, Robin Murphy,
	linux-kernel

On 4/8/2024 3:23 PM, Baolu Lu wrote:
> On 2024/4/8 15:21, Ethan Zhao wrote:
>> On 4/7/2024 10:42 PM, Lu Baolu wrote:
>>> The Caching Mode (CM) of the Intel IOMMU indicates if the hardware
>>> implementation caches not-present or erroneous translation-structure
>>> entries except the first-stage translation. The caching mode is
>>> unrelated to the device TLB , therefore there is no need to check
>>> it before a device TLB invalidation operation.
>>>
>>> Before the scalable mode is introduced, caching mode is treated as
>>> an indication that the driver is running in a VM guest. This is just
>>> a software contract as shadow page table is the only way to implement
>>> a virtual IOMMU. But the VT-d spec doesn't state this anywhere. After
>>> the scalable mode is introduced, this doesn't stand for anymore, as
>>> caching mode is not relevant for the first-stage translation. A virtual
>>> IOMMU implementation is free to support first-stage translation only
>>> with caching mode cleared.
>>>
>>> Remove the caching mode check before device TLB invalidation to ensure
>>> compatibility with the scalable mode use cases.
>>>
>>> Fixes: 792fb43ce2c9 ("iommu/vt-d: Enable Intel IOMMU scalable mode 
>>> by default")
>>> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
>>> ---
>>>   drivers/iommu/intel/iommu.c | 5 ++---
>>>   1 file changed, 2 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
>>> index 493b6a600394..681789b1258d 100644
>>> --- a/drivers/iommu/intel/iommu.c
>>> +++ b/drivers/iommu/intel/iommu.c
>>> @@ -1501,7 +1501,7 @@ static void iommu_flush_iotlb_psi(struct 
>>> intel_iommu *iommu,
>>>       else
>>>           __iommu_flush_iotlb_psi(iommu, did, pfn, pages, ih);
>>> -    if (!cap_caching_mode(iommu->cap) && !map)
>>> +    if (!map)
>>
>> My understanding, we don't need patch[1/2] at all, and customer is 
>> just asking
>> about the CM & tlb flushing, it is great to have this commit [2/2].
>
> Actually they fix different problems.

Yup, progress view.

Thanks,
Ethan

>
> Best regards,
> baolu

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush
  2024-04-07 14:42 ` [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush Lu Baolu
  2024-04-08  7:21   ` Ethan Zhao
@ 2024-04-08 21:03   ` Jacob Pan
  2024-04-09  3:12     ` Baolu Lu
  2024-04-09  7:30   ` Tian, Kevin
  2024-04-09  8:36   ` Yi Liu
  3 siblings, 1 reply; 21+ messages in thread
From: Jacob Pan @ 2024-04-08 21:03 UTC (permalink / raw)
  To: Lu Baolu
  Cc: iommu, Kevin Tian, Yi Liu, Joerg Roedel, Will Deacon,
	Robin Murphy, linux-kernel, jacob.jun.pan

Hi Lu,

On Sun,  7 Apr 2024 22:42:32 +0800, Lu Baolu <baolu.lu@linux.intel.com>
wrote:

> The Caching Mode (CM) of the Intel IOMMU indicates if the hardware
> implementation caches not-present or erroneous translation-structure
> entries except the first-stage translation. The caching mode is
> unrelated to the device TLB , therefore there is no need to check
> it before a device TLB invalidation operation.
> 
> Before the scalable mode is introduced, caching mode is treated as
> an indication that the driver is running in a VM guest. This is just
> a software contract as shadow page table is the only way to implement
> a virtual IOMMU. But the VT-d spec doesn't state this anywhere. After
> the scalable mode is introduced, this doesn't stand for anymore, as
> caching mode is not relevant for the first-stage translation. A virtual
> IOMMU implementation is free to support first-stage translation only
> with caching mode cleared.
> 
> Remove the caching mode check before device TLB invalidation to ensure
> compatibility with the scalable mode use cases.
> 
I agree with the changes below, what about this CM check:

/* Notification for newly created mappings */
static void __mapping_notify_one(struct intel_iommu *iommu, struct dmar_domain *domain,
				 unsigned long pfn, unsigned int pages)
{
	/*
	 * It's a non-present to present mapping. Only flush if caching mode
	 * and second level.
	 */
	if (cap_caching_mode(iommu->cap) && !domain->use_first_level)
		iommu_flush_iotlb_psi(iommu, domain, pfn, pages, 0, 1);

We are still tying devTLB flush to CM=1, no?

If we are running in the guest with second level page table (shadowed), can
we decide if devTLB flush is needed based on ATS enable just as the rest of
the cases?


> Fixes: 792fb43ce2c9 ("iommu/vt-d: Enable Intel IOMMU scalable mode by
> default") Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> ---
>  drivers/iommu/intel/iommu.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 493b6a600394..681789b1258d 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -1501,7 +1501,7 @@ static void iommu_flush_iotlb_psi(struct
> intel_iommu *iommu, else
>  		__iommu_flush_iotlb_psi(iommu, did, pfn, pages, ih);
>  
> -	if (!cap_caching_mode(iommu->cap) && !map)
> +	if (!map)
>  		iommu_flush_dev_iotlb(domain, addr, mask);
>  }
>  
> @@ -1575,8 +1575,7 @@ static void intel_flush_iotlb_all(struct
> iommu_domain *domain) iommu->flush.flush_iotlb(iommu, did, 0, 0,
>  						 DMA_TLB_DSI_FLUSH);
>  
> -		if (!cap_caching_mode(iommu->cap))
> -			iommu_flush_dev_iotlb(dmar_domain, 0,
> MAX_AGAW_PFN_WIDTH);
> +		iommu_flush_dev_iotlb(dmar_domain, 0,
> MAX_AGAW_PFN_WIDTH); }
>  
>  	if (dmar_domain->nested_parent)


Thanks,

Jacob

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush
  2024-04-08 21:03   ` Jacob Pan
@ 2024-04-09  3:12     ` Baolu Lu
  2024-04-09 17:31       ` Jacob Pan
  0 siblings, 1 reply; 21+ messages in thread
From: Baolu Lu @ 2024-04-09  3:12 UTC (permalink / raw)
  To: Jacob Pan
  Cc: baolu.lu, iommu, Kevin Tian, Yi Liu, Joerg Roedel, Will Deacon,
	Robin Murphy, linux-kernel

On 4/9/24 5:03 AM, Jacob Pan wrote:
> Hi Lu,

Hi Jacob,

> 
> On Sun,  7 Apr 2024 22:42:32 +0800, Lu Baolu<baolu.lu@linux.intel.com>
> wrote:
> 
>> The Caching Mode (CM) of the Intel IOMMU indicates if the hardware
>> implementation caches not-present or erroneous translation-structure
>> entries except the first-stage translation. The caching mode is
>> unrelated to the device TLB , therefore there is no need to check
>> it before a device TLB invalidation operation.
>>
>> Before the scalable mode is introduced, caching mode is treated as
>> an indication that the driver is running in a VM guest. This is just
>> a software contract as shadow page table is the only way to implement
>> a virtual IOMMU. But the VT-d spec doesn't state this anywhere. After
>> the scalable mode is introduced, this doesn't stand for anymore, as
>> caching mode is not relevant for the first-stage translation. A virtual
>> IOMMU implementation is free to support first-stage translation only
>> with caching mode cleared.
>>
>> Remove the caching mode check before device TLB invalidation to ensure
>> compatibility with the scalable mode use cases.
>>
> I agree with the changes below, what about this CM check:
> 
> /* Notification for newly created mappings */
> static void __mapping_notify_one(struct intel_iommu *iommu, struct dmar_domain *domain,
> 				 unsigned long pfn, unsigned int pages)
> {
> 	/*
> 	 * It's a non-present to present mapping. Only flush if caching mode
> 	 * and second level.
> 	 */
> 	if (cap_caching_mode(iommu->cap) && !domain->use_first_level)
> 		iommu_flush_iotlb_psi(iommu, domain, pfn, pages, 0, 1);
> 
> We are still tying devTLB flush to CM=1, no?

__mapping_notify_one() is called in the path where some PTEs are changed
from non-present to present.

In this scenario,

- if CM is set and first-stage translation is not used, the IOTLB caches
   are required to be explicitly flushed.
- else if hardware requires write buffer flushing, do it.
- Otherwise, no op.
- devtlb invalidation is irrelevant to this path.

The code after the fix appears to do the right thing. devTLB is not
invalidated in iommu_flush_iotlb_psi() since it's a map (map == 1).

Or perhaps I overlooked anything?

> 
> If we are running in the guest with second level page table (shadowed), can
> we decide if devTLB flush is needed based on ATS enable just as the rest of
> the cases?

I think the ATS check should be consistent. It's generic no matter how
the IOMMU is implemented (in hardware or emulated in software).

Best regards,
baolu

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: [PATCH 1/2] iommu/vt-d: Avoid unnecessary device TLB flush in map path
  2024-04-07 14:42 [PATCH 1/2] iommu/vt-d: Avoid unnecessary device TLB flush in map path Lu Baolu
  2024-04-07 14:42 ` [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush Lu Baolu
@ 2024-04-09  7:20 ` Tian, Kevin
  2024-04-09  7:21 ` Tian, Kevin
  2024-04-09  8:27 ` Yi Liu
  3 siblings, 0 replies; 21+ messages in thread
From: Tian, Kevin @ 2024-04-09  7:20 UTC (permalink / raw)
  To: Lu Baolu, iommu
  Cc: Liu, Yi L, Joerg Roedel, Will Deacon, Robin Murphy, linux-kernel

> From: Lu Baolu <baolu.lu@linux.intel.com>
> Sent: Sunday, April 7, 2024 10:43 PM
> 
> iommu_flush_iotlb_psi() is called in map and unmap paths. The caching
> mode check before device TLB invalidation will cause device TLB
> invalidation always issued if IOMMU is not running in the caching mode.
> This is inefficient and causes performance overhead.

s/inefficient/wrong/

'inefficient' means that it's a right thing to do but just inefficient compared
to other options. But here by definition it's not required and caching mode
is irrelevant in the context of devtlb.

> 
> Make device TLB invalidation behavior consistent between batched mode
> unmapping and strict mode unmapping. Device TLB invalidation should only

I don't quite understand the connection to batch vs. strict.

> be requested in the unmap path if the IOMMU is not in caching mode.

I'll remove the part about caching mode as it's irrelevant.

> 
> Fixes: bf92df30df90 ("intel-iommu: Only avoid flushing device IOTLB for
> domain ID 0 in caching mode")
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> ---
>  drivers/iommu/intel/iommu.c | 6 +-----
>  1 file changed, 1 insertion(+), 5 deletions(-)
> 
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 50eb9aed47cc..493b6a600394 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -1501,11 +1501,7 @@ static void iommu_flush_iotlb_psi(struct
> intel_iommu *iommu,
>  	else
>  		__iommu_flush_iotlb_psi(iommu, did, pfn, pages, ih);
> 
> -	/*
> -	 * In caching mode, changes of pages from non-present to present
> require
> -	 * flush. However, device IOTLB doesn't need to be flushed in this
> case.
> -	 */
> -	if (!cap_caching_mode(iommu->cap) || !map)
> +	if (!cap_caching_mode(iommu->cap) && !map)
>  		iommu_flush_dev_iotlb(domain, addr, mask);

It's a bit strange to do this half-baked way and then rely on next patch
to do the right thing. It temporarily removes all devtlb invalidations
for caching mode here and then add them back for unmap in next patch.

caching mode has nothing to do with devtlb. So I'd just do:

	if (!map)
		iommu_flush_dev_iotlb(domain, addr, mask);



^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: [PATCH 1/2] iommu/vt-d: Avoid unnecessary device TLB flush in map path
  2024-04-07 14:42 [PATCH 1/2] iommu/vt-d: Avoid unnecessary device TLB flush in map path Lu Baolu
  2024-04-07 14:42 ` [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush Lu Baolu
  2024-04-09  7:20 ` [PATCH 1/2] iommu/vt-d: Avoid unnecessary device TLB flush in map path Tian, Kevin
@ 2024-04-09  7:21 ` Tian, Kevin
  2024-04-09  8:27 ` Yi Liu
  3 siblings, 0 replies; 21+ messages in thread
From: Tian, Kevin @ 2024-04-09  7:21 UTC (permalink / raw)
  To: Lu Baolu, iommu
  Cc: Liu, Yi L, Joerg Roedel, Will Deacon, Robin Murphy, linux-kernel

> From: Lu Baolu <baolu.lu@linux.intel.com>
> Sent: Sunday, April 7, 2024 10:43 PM
> 
> iommu_flush_iotlb_psi() is called in map and unmap paths. The caching
> mode check before device TLB invalidation will cause device TLB
> invalidation always issued if IOMMU is not running in the caching mode.
> This is inefficient and causes performance overhead.
> 
> Make device TLB invalidation behavior consistent between batched mode
> unmapping and strict mode unmapping. Device TLB invalidation should only
> be requested in the unmap path if the IOMMU is not in caching mode.
> 
> Fixes: bf92df30df90 ("intel-iommu: Only avoid flushing device IOTLB for
> domain ID 0 in caching mode")
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>

and given it only affects performance then not qualified for backporting?

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush
  2024-04-07 14:42 ` [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush Lu Baolu
  2024-04-08  7:21   ` Ethan Zhao
  2024-04-08 21:03   ` Jacob Pan
@ 2024-04-09  7:30   ` Tian, Kevin
  2024-04-10  5:40     ` Baolu Lu
  2024-04-10 23:49     ` Zhang, Tina
  2024-04-09  8:36   ` Yi Liu
  3 siblings, 2 replies; 21+ messages in thread
From: Tian, Kevin @ 2024-04-09  7:30 UTC (permalink / raw)
  To: Lu Baolu, iommu
  Cc: Liu, Yi L, Joerg Roedel, Will Deacon, Robin Murphy, linux-kernel

> From: Lu Baolu <baolu.lu@linux.intel.com>
> Sent: Sunday, April 7, 2024 10:43 PM
> 
> The Caching Mode (CM) of the Intel IOMMU indicates if the hardware
> implementation caches not-present or erroneous translation-structure
> entries except the first-stage translation. The caching mode is
> unrelated to the device TLB , therefore there is no need to check
> it before a device TLB invalidation operation.
> 
> Before the scalable mode is introduced, caching mode is treated as
> an indication that the driver is running in a VM guest. This is just
> a software contract as shadow page table is the only way to implement
> a virtual IOMMU. But the VT-d spec doesn't state this anywhere. After
> the scalable mode is introduced, this doesn't stand for anymore, as
> caching mode is not relevant for the first-stage translation. A virtual
> IOMMU implementation is free to support first-stage translation only
> with caching mode cleared.

I didn't get the connection to the scalable mode.

if required we can still use caching mode to imply running as a guest.
Just need to make sure its implementation conforming to the VT-d spec.

> 
> Remove the caching mode check before device TLB invalidation to ensure
> compatibility with the scalable mode use cases.
> 
> Fixes: 792fb43ce2c9 ("iommu/vt-d: Enable Intel IOMMU scalable mode by
> default")
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> ---
>  drivers/iommu/intel/iommu.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 493b6a600394..681789b1258d 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -1501,7 +1501,7 @@ static void iommu_flush_iotlb_psi(struct
> intel_iommu *iommu,
>  	else
>  		__iommu_flush_iotlb_psi(iommu, did, pfn, pages, ih);
> 
> -	if (!cap_caching_mode(iommu->cap) && !map)
> +	if (!map)
>  		iommu_flush_dev_iotlb(domain, addr, mask);

as commented earlier better squash this in patch1.

>  }
> 
> @@ -1575,8 +1575,7 @@ static void intel_flush_iotlb_all(struct
> iommu_domain *domain)
>  			iommu->flush.flush_iotlb(iommu, did, 0, 0,
>  						 DMA_TLB_DSI_FLUSH);
> 
> -		if (!cap_caching_mode(iommu->cap))
> -			iommu_flush_dev_iotlb(dmar_domain, 0,
> MAX_AGAW_PFN_WIDTH);
> +		iommu_flush_dev_iotlb(dmar_domain, 0,
> MAX_AGAW_PFN_WIDTH);
>  	}
> 

I'm hesitating to agree with this change. Strictly speaking it's correct.
but w/o supporting batch invalidation this implies performance drop
on viommu due to more VM-exits and there may incur user complaints
when their VMs upgrade to a newer kernel version.

So it'd be better to keep this behavior and fix it together with batch
invalidation support. Anyway none of the viommu implementations
today (either shadow or nested translation) relies on the correct
devtlb behavior from the guest otherwise it's already broken.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/2] iommu/vt-d: Avoid unnecessary device TLB flush in map path
  2024-04-07 14:42 [PATCH 1/2] iommu/vt-d: Avoid unnecessary device TLB flush in map path Lu Baolu
                   ` (2 preceding siblings ...)
  2024-04-09  7:21 ` Tian, Kevin
@ 2024-04-09  8:27 ` Yi Liu
  3 siblings, 0 replies; 21+ messages in thread
From: Yi Liu @ 2024-04-09  8:27 UTC (permalink / raw)
  To: Lu Baolu, iommu
  Cc: Kevin Tian, Joerg Roedel, Will Deacon, Robin Murphy, linux-kernel

On 2024/4/7 22:42, Lu Baolu wrote:
> iommu_flush_iotlb_psi() is called in map and unmap paths. The caching
> mode check before device TLB invalidation will cause device TLB
> invalidation always issued if IOMMU is not running in the caching mode.
> This is inefficient and causes performance overhead.
> 
> Make device TLB invalidation behavior consistent between batched mode
> unmapping and strict mode unmapping. Device TLB invalidation should only
> be requested in the unmap path if the IOMMU is not in caching mode.
> 
> Fixes: bf92df30df90 ("intel-iommu: Only avoid flushing device IOTLB for domain ID 0 in caching mode")
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> ---
>   drivers/iommu/intel/iommu.c | 6 +-----
>   1 file changed, 1 insertion(+), 5 deletions(-)
> 
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 50eb9aed47cc..493b6a600394 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -1501,11 +1501,7 @@ static void iommu_flush_iotlb_psi(struct intel_iommu *iommu,
>   	else
>   		__iommu_flush_iotlb_psi(iommu, did, pfn, pages, ih);
>   
> -	/*
> -	 * In caching mode, changes of pages from non-present to present require
> -	 * flush. However, device IOTLB doesn't need to be flushed in this case.
> -	 */
> -	if (!cap_caching_mode(iommu->cap) || !map)
> +	if (!cap_caching_mode(iommu->cap) && !map)
>   		iommu_flush_dev_iotlb(domain, addr, mask);
>   }
>   

The existing code works but kind of confusing. The iommu_flush_iotlb_psi()
helper will be called in both the map and unmap path. But device-TLB only
needed to be flushed in the unmap path since there is no chance for
device ATC to have cache for a non-present mapping. And only when caching
mode is reported, then should the helper be called in the map path. So the
fact is if caching mode is 0, the @map should always be false. If caching
mode is 1, then @map can be either false or true. To be simpler, it should
be enough to check @map before flushing device-TLB. no matter caching mode
or not.

-- 
Regards,
Yi Liu

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush
  2024-04-07 14:42 ` [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush Lu Baolu
                     ` (2 preceding siblings ...)
  2024-04-09  7:30   ` Tian, Kevin
@ 2024-04-09  8:36   ` Yi Liu
  3 siblings, 0 replies; 21+ messages in thread
From: Yi Liu @ 2024-04-09  8:36 UTC (permalink / raw)
  To: Lu Baolu, iommu
  Cc: Kevin Tian, Joerg Roedel, Will Deacon, Robin Murphy, linux-kernel

On 2024/4/7 22:42, Lu Baolu wrote:
> The Caching Mode (CM) of the Intel IOMMU indicates if the hardware
> implementation caches not-present or erroneous translation-structure
> entries except the first-stage translation. The caching mode is
> unrelated to the device TLB , therefore there is no need to check
> it before a device TLB invalidation operation.
> 
> Before the scalable mode is introduced, caching mode is treated as
> an indication that the driver is running in a VM guest. This is just
> a software contract as shadow page table is the only way to implement
> a virtual IOMMU. But the VT-d spec doesn't state this anywhere. After
> the scalable mode is introduced, this doesn't stand for anymore, as
> caching mode is not relevant for the first-stage translation. A virtual
> IOMMU implementation is free to support first-stage translation only
> with caching mode cleared.
> 
> Remove the caching mode check before device TLB invalidation to ensure
> compatibility with the scalable mode use cases.
> 
> Fixes: 792fb43ce2c9 ("iommu/vt-d: Enable Intel IOMMU scalable mode by default")
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> ---
>   drivers/iommu/intel/iommu.c | 5 ++---
>   1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 493b6a600394..681789b1258d 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -1501,7 +1501,7 @@ static void iommu_flush_iotlb_psi(struct intel_iommu *iommu,
>   	else
>   		__iommu_flush_iotlb_psi(iommu, did, pfn, pages, ih);
>   
> -	if (!cap_caching_mode(iommu->cap) && !map)
> +	if (!map)
>   		iommu_flush_dev_iotlb(domain, addr, mask);
>   }
>   

IMHO. this change can be merged with patch 01. And the reason is to make
the logic simple and clear.

> @@ -1575,8 +1575,7 @@ static void intel_flush_iotlb_all(struct iommu_domain *domain)
>   			iommu->flush.flush_iotlb(iommu, did, 0, 0,
>   						 DMA_TLB_DSI_FLUSH);
>   
> -		if (!cap_caching_mode(iommu->cap))
> -			iommu_flush_dev_iotlb(dmar_domain, 0, MAX_AGAW_PFN_WIDTH);
> +		iommu_flush_dev_iotlb(dmar_domain, 0, MAX_AGAW_PFN_WIDTH);
>   	}
>   
>   	if (dmar_domain->nested_parent)

you can also move the iommu_flush_dev_iotlb() call out of the iommu_array 
loop since it does not require checking iommu cap.

-- 
Regards,
Yi Liu

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush
  2024-04-09  3:12     ` Baolu Lu
@ 2024-04-09 17:31       ` Jacob Pan
  2024-04-10  0:32         ` Tian, Kevin
  0 siblings, 1 reply; 21+ messages in thread
From: Jacob Pan @ 2024-04-09 17:31 UTC (permalink / raw)
  To: Baolu Lu
  Cc: iommu, Kevin Tian, Yi Liu, Joerg Roedel, Will Deacon,
	Robin Murphy, linux-kernel, jacob.jun.pan

Hi Baolu,

On Tue, 9 Apr 2024 11:12:20 +0800, Baolu Lu <baolu.lu@linux.intel.com>
wrote:

> On 4/9/24 5:03 AM, Jacob Pan wrote:
> > Hi Lu,  
> 
> Hi Jacob,
> 
> > 
> > On Sun,  7 Apr 2024 22:42:32 +0800, Lu Baolu<baolu.lu@linux.intel.com>
> > wrote:
> >   
> >> The Caching Mode (CM) of the Intel IOMMU indicates if the hardware
> >> implementation caches not-present or erroneous translation-structure
> >> entries except the first-stage translation. The caching mode is
> >> unrelated to the device TLB , therefore there is no need to check
> >> it before a device TLB invalidation operation.
> >>
> >> Before the scalable mode is introduced, caching mode is treated as
> >> an indication that the driver is running in a VM guest. This is just
> >> a software contract as shadow page table is the only way to implement
> >> a virtual IOMMU. But the VT-d spec doesn't state this anywhere. After
> >> the scalable mode is introduced, this doesn't stand for anymore, as
> >> caching mode is not relevant for the first-stage translation. A virtual
> >> IOMMU implementation is free to support first-stage translation only
> >> with caching mode cleared.
> >>
> >> Remove the caching mode check before device TLB invalidation to ensure
> >> compatibility with the scalable mode use cases.
> >>  
> > I agree with the changes below, what about this CM check:
> > 
> > /* Notification for newly created mappings */
> > static void __mapping_notify_one(struct intel_iommu *iommu, struct
> > dmar_domain *domain, unsigned long pfn, unsigned int pages)
> > {
> > 	/*
> > 	 * It's a non-present to present mapping. Only flush if caching
> > mode
> > 	 * and second level.
> > 	 */
> > 	if (cap_caching_mode(iommu->cap) && !domain->use_first_level)
> > 		iommu_flush_iotlb_psi(iommu, domain, pfn, pages, 0, 1);
> > 
> > We are still tying devTLB flush to CM=1, no?  
> 
> __mapping_notify_one() is called in the path where some PTEs are changed
> from non-present to present.
> 
> In this scenario,
> 
> - if CM is set and first-stage translation is not used, the IOTLB caches
>    are required to be explicitly flushed.
> - else if hardware requires write buffer flushing, do it.
> - Otherwise, no op.
> - devtlb invalidation is irrelevant to this path.
> 
> The code after the fix appears to do the right thing. devTLB is not
> invalidated in iommu_flush_iotlb_psi() since it's a map (map == 1).
> 
> Or perhaps I overlooked anything?
My confusion is that, on one side, this patch is saying devTLB flush has
nothing to do with CM. But here, if CMD==1, we don't flush devTLB since
map==1.

If the guest uses SL page tables in vIOMMU, we don;t expose ATS to the
guest. So ATS is not relevant here, does't matter map or unmap.

Can we remove the map argument in iommu_flush_iotlb_psi(iommu, domain,pfn,
pages, 0, 1)?

Then devTLB flush will naturally be skipped in the guest (CM=1, SL) since
ATS is not enabled.
iommu_flush_dev_iotlb(domain, addr, mask);

i.e.

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 50eb9aed47cc..ee3e5a1af0c5 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1483,7 +1483,7 @@ static void __iommu_flush_iotlb_psi(struct intel_iommu *iommu, u16 did,
 static void iommu_flush_iotlb_psi(struct intel_iommu *iommu,
                                  struct dmar_domain *domain,
                                  unsigned long pfn, unsigned int pages,
-                                 int ih, int map)
+                                 int ih)
 {
        unsigned int aligned_pages = __roundup_pow_of_two(pages);
        unsigned int mask = ilog2(aligned_pages);
@@ -1501,12 +1501,7 @@ static void iommu_flush_iotlb_psi(struct intel_iommu *iommu,
        else
                __iommu_flush_iotlb_psi(iommu, did, pfn, pages, ih);
 
-       /*
-        * In caching mode, changes of pages from non-present to present require
-        * flush. However, device IOTLB doesn't need to be flushed in this case.
-        */
-       if (!cap_caching_mode(iommu->cap) || !map)
-               iommu_flush_dev_iotlb(domain, addr, mask);
+       iommu_flush_dev_iotlb(domain, addr, mask);
 }
 
 /* Notification for newly created mappings */
@@ -1518,7 +1513,7 @@ static void __mapping_notify_one(struct intel_iommu *iommu, struct dmar_domain *
         * and second level.
         */
        if (cap_caching_mode(iommu->cap) && !domain->use_first_level)
-               iommu_flush_iotlb_psi(iommu, domain, pfn, pages, 0, 1);
+               iommu_flush_iotlb_psi(iommu, domain, pfn, pages, 0);
        else
                iommu_flush_write_buffer(iommu);
 }



> > 
> > If we are running in the guest with second level page table (shadowed),
> > can we decide if devTLB flush is needed based on ATS enable just as the
> > rest of the cases?  
> 
> I think the ATS check should be consistent. It's generic no matter how
> the IOMMU is implemented (in hardware or emulated in software).
> 
> Best regards,
> baolu


Thanks,

Jacob

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* RE: [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush
  2024-04-09 17:31       ` Jacob Pan
@ 2024-04-10  0:32         ` Tian, Kevin
  2024-04-10 16:19           ` Jacob Pan
  0 siblings, 1 reply; 21+ messages in thread
From: Tian, Kevin @ 2024-04-10  0:32 UTC (permalink / raw)
  To: Jacob Pan, Baolu Lu
  Cc: iommu, Liu, Yi L, Joerg Roedel, Will Deacon, Robin Murphy, linux-kernel

> From: Jacob Pan <jacob.jun.pan@linux.intel.com>
> Sent: Wednesday, April 10, 2024 1:32 AM
> 
> If the guest uses SL page tables in vIOMMU, we don;t expose ATS to the
> guest. So ATS is not relevant here, does't matter map or unmap.
> 

ATS is orthogonal to SL vs. FL. Where is this restriction coming from?

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush
  2024-04-09  7:30   ` Tian, Kevin
@ 2024-04-10  5:40     ` Baolu Lu
  2024-04-10 23:49     ` Zhang, Tina
  1 sibling, 0 replies; 21+ messages in thread
From: Baolu Lu @ 2024-04-10  5:40 UTC (permalink / raw)
  To: Tian, Kevin, iommu
  Cc: baolu.lu, Liu, Yi L, Joerg Roedel, Will Deacon, Robin Murphy,
	linux-kernel

On 4/9/24 3:30 PM, Tian, Kevin wrote:
>> Remove the caching mode check before device TLB invalidation to ensure
>> compatibility with the scalable mode use cases.
>>
>> Fixes: 792fb43ce2c9 ("iommu/vt-d: Enable Intel IOMMU scalable mode by
>> default")
>> Signed-off-by: Lu Baolu<baolu.lu@linux.intel.com>
>> ---
>>   drivers/iommu/intel/iommu.c | 5 ++---
>>   1 file changed, 2 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
>> index 493b6a600394..681789b1258d 100644
>> --- a/drivers/iommu/intel/iommu.c
>> +++ b/drivers/iommu/intel/iommu.c
>> @@ -1501,7 +1501,7 @@ static void iommu_flush_iotlb_psi(struct
>> intel_iommu *iommu,
>>   	else
>>   		__iommu_flush_iotlb_psi(iommu, did, pfn, pages, ih);
>>
>> -	if (!cap_caching_mode(iommu->cap) && !map)
>> +	if (!map)
>>   		iommu_flush_dev_iotlb(domain, addr, mask);
> as commented earlier better squash this in patch1.

Okay, let me squash them into a single patch and make the commit message 
more descriptive.

Best regards,
baolu

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush
  2024-04-10  0:32         ` Tian, Kevin
@ 2024-04-10 16:19           ` Jacob Pan
  2024-04-10 23:23             ` Tian, Kevin
  0 siblings, 1 reply; 21+ messages in thread
From: Jacob Pan @ 2024-04-10 16:19 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Baolu Lu, iommu, Liu, Yi L, Joerg Roedel, Will Deacon,
	Robin Murphy, linux-kernel, jacob.jun.pan

Hi Kevin,

On Wed, 10 Apr 2024 00:32:06 +0000, "Tian, Kevin" <kevin.tian@intel.com>
wrote:

> > From: Jacob Pan <jacob.jun.pan@linux.intel.com>
> > Sent: Wednesday, April 10, 2024 1:32 AM
> > 
> > If the guest uses SL page tables in vIOMMU, we don;t expose ATS to the
> > guest. So ATS is not relevant here, does't matter map or unmap.
> >   
> 
> ATS is orthogonal to SL vs. FL. Where is this restriction coming from?
For practical purposes, what would be the usage to have SL in the guest and
ATS enabled. i.e. shadowing SL but directly expose ATS? 

Thanks,

Jacob

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush
  2024-04-10 16:19           ` Jacob Pan
@ 2024-04-10 23:23             ` Tian, Kevin
  2024-04-11 16:17               ` Jacob Pan
  0 siblings, 1 reply; 21+ messages in thread
From: Tian, Kevin @ 2024-04-10 23:23 UTC (permalink / raw)
  To: Jacob Pan
  Cc: Baolu Lu, iommu, Liu, Yi L, Joerg Roedel, Will Deacon,
	Robin Murphy, linux-kernel

> From: Jacob Pan <jacob.jun.pan@linux.intel.com>
> Sent: Thursday, April 11, 2024 12:20 AM
> 
> Hi Kevin,
> 
> On Wed, 10 Apr 2024 00:32:06 +0000, "Tian, Kevin" <kevin.tian@intel.com>
> wrote:
> 
> > > From: Jacob Pan <jacob.jun.pan@linux.intel.com>
> > > Sent: Wednesday, April 10, 2024 1:32 AM
> > >
> > > If the guest uses SL page tables in vIOMMU, we don;t expose ATS to the
> > > guest. So ATS is not relevant here, does't matter map or unmap.
> > >
> >
> > ATS is orthogonal to SL vs. FL. Where is this restriction coming from?
> For practical purposes, what would be the usage to have SL in the guest and
> ATS enabled. i.e. shadowing SL but directly expose ATS?
> 

ATS is about the protocol between device and iommu to look up
translations. Why does it care about internal paging layout in
iommu?

I'm not sure which spec explicitly states such restriction.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush
  2024-04-09  7:30   ` Tian, Kevin
  2024-04-10  5:40     ` Baolu Lu
@ 2024-04-10 23:49     ` Zhang, Tina
  2024-04-11 12:15       ` Baolu Lu
  1 sibling, 1 reply; 21+ messages in thread
From: Zhang, Tina @ 2024-04-10 23:49 UTC (permalink / raw)
  To: Tian, Kevin, Lu Baolu, iommu
  Cc: Liu, Yi L, Joerg Roedel, Will Deacon, Robin Murphy, linux-kernel

Hi,

> -----Original Message-----
> From: Tian, Kevin <kevin.tian@intel.com>
> Sent: Tuesday, April 9, 2024 3:30 PM
> To: Lu Baolu <baolu.lu@linux.intel.com>; iommu@lists.linux.dev
> Cc: Liu, Yi L <yi.l.liu@intel.com>; Joerg Roedel <joro@8bytes.org>; Will
> Deacon <will@kernel.org>; Robin Murphy <robin.murphy@arm.com>; linux-
> kernel@vger.kernel.org
> Subject: RE: [PATCH 2/2] iommu/vt-d: Remove caching mode check before
> devtlb flush
> 
> > From: Lu Baolu <baolu.lu@linux.intel.com>
> > Sent: Sunday, April 7, 2024 10:43 PM
> >
> > The Caching Mode (CM) of the Intel IOMMU indicates if the hardware
> > implementation caches not-present or erroneous translation-structure
> > entries except the first-stage translation. The caching mode is
> > unrelated to the device TLB , therefore there is no need to check it
> > before a device TLB invalidation operation.
> >
> > Before the scalable mode is introduced, caching mode is treated as an
> > indication that the driver is running in a VM guest. This is just a
> > software contract as shadow page table is the only way to implement a
> > virtual IOMMU. But the VT-d spec doesn't state this anywhere. After
> > the scalable mode is introduced, this doesn't stand for anymore, as
> > caching mode is not relevant for the first-stage translation. A
> > virtual IOMMU implementation is free to support first-stage
> > translation only with caching mode cleared.
> 
> I didn't get the connection to the scalable mode.
> 
> if required we can still use caching mode to imply running as a guest.
> Just need to make sure its implementation conforming to the VT-d spec.
> 
> >
> > Remove the caching mode check before device TLB invalidation to ensure
> > compatibility with the scalable mode use cases.
> >
> > Fixes: 792fb43ce2c9 ("iommu/vt-d: Enable Intel IOMMU scalable mode by
> > default")
> > Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> > ---
> >  drivers/iommu/intel/iommu.c | 5 ++---
> >  1 file changed, 2 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> > index 493b6a600394..681789b1258d 100644
> > --- a/drivers/iommu/intel/iommu.c
> > +++ b/drivers/iommu/intel/iommu.c
> > @@ -1501,7 +1501,7 @@ static void iommu_flush_iotlb_psi(struct
> > intel_iommu *iommu,
> >  	else
> >  		__iommu_flush_iotlb_psi(iommu, did, pfn, pages, ih);
> >
> > -	if (!cap_caching_mode(iommu->cap) && !map)
> > +	if (!map)
> >  		iommu_flush_dev_iotlb(domain, addr, mask);
> 
> as commented earlier better squash this in patch1.
> 
> >  }
> >
> > @@ -1575,8 +1575,7 @@ static void intel_flush_iotlb_all(struct
> > iommu_domain *domain)
> >  			iommu->flush.flush_iotlb(iommu, did, 0, 0,
> >  						 DMA_TLB_DSI_FLUSH);
> >
> > -		if (!cap_caching_mode(iommu->cap))
> > -			iommu_flush_dev_iotlb(dmar_domain, 0,
> > MAX_AGAW_PFN_WIDTH);
> > +		iommu_flush_dev_iotlb(dmar_domain, 0,
> > MAX_AGAW_PFN_WIDTH);
> >  	}
> >
> 
> I'm hesitating to agree with this change. Strictly speaking it's correct.
> but w/o supporting batch invalidation this implies performance drop on
> viommu due to more VM-exits and there may incur user complaints when
> their VMs upgrade to a newer kernel version.
> 
> So it'd be better to keep this behavior and fix it together with batch
> invalidation support. Anyway none of the viommu implementations today
> (either shadow or nested translation) relies on the correct devtlb behavior
> from the guest otherwise it's already broken.
How about we split this change into a patch. I'm working on the batch invalidation patch-set now and I'm happy to include this code change into the batch invalidation series.

Regards,
-Tina


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush
  2024-04-10 23:49     ` Zhang, Tina
@ 2024-04-11 12:15       ` Baolu Lu
  0 siblings, 0 replies; 21+ messages in thread
From: Baolu Lu @ 2024-04-11 12:15 UTC (permalink / raw)
  To: Zhang, Tina, Tian, Kevin, iommu
  Cc: baolu.lu, Liu, Yi L, Joerg Roedel, Will Deacon, Robin Murphy,
	linux-kernel

On 2024/4/11 7:49, Zhang, Tina wrote:
> Hi,
> 
>> -----Original Message-----
>> From: Tian, Kevin <kevin.tian@intel.com>
>> Sent: Tuesday, April 9, 2024 3:30 PM
>> To: Lu Baolu <baolu.lu@linux.intel.com>; iommu@lists.linux.dev
>> Cc: Liu, Yi L <yi.l.liu@intel.com>; Joerg Roedel <joro@8bytes.org>; Will
>> Deacon <will@kernel.org>; Robin Murphy <robin.murphy@arm.com>; linux-
>> kernel@vger.kernel.org
>> Subject: RE: [PATCH 2/2] iommu/vt-d: Remove caching mode check before
>> devtlb flush
>>
>>> From: Lu Baolu <baolu.lu@linux.intel.com>
>>> Sent: Sunday, April 7, 2024 10:43 PM
>>>
>>> The Caching Mode (CM) of the Intel IOMMU indicates if the hardware
>>> implementation caches not-present or erroneous translation-structure
>>> entries except the first-stage translation. The caching mode is
>>> unrelated to the device TLB , therefore there is no need to check it
>>> before a device TLB invalidation operation.
>>>
>>> Before the scalable mode is introduced, caching mode is treated as an
>>> indication that the driver is running in a VM guest. This is just a
>>> software contract as shadow page table is the only way to implement a
>>> virtual IOMMU. But the VT-d spec doesn't state this anywhere. After
>>> the scalable mode is introduced, this doesn't stand for anymore, as
>>> caching mode is not relevant for the first-stage translation. A
>>> virtual IOMMU implementation is free to support first-stage
>>> translation only with caching mode cleared.
>>
>> I didn't get the connection to the scalable mode.
>>
>> if required we can still use caching mode to imply running as a guest.
>> Just need to make sure its implementation conforming to the VT-d spec.
>>
>>>
>>> Remove the caching mode check before device TLB invalidation to ensure
>>> compatibility with the scalable mode use cases.
>>>
>>> Fixes: 792fb43ce2c9 ("iommu/vt-d: Enable Intel IOMMU scalable mode by
>>> default")
>>> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
>>> ---
>>>   drivers/iommu/intel/iommu.c | 5 ++---
>>>   1 file changed, 2 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
>>> index 493b6a600394..681789b1258d 100644
>>> --- a/drivers/iommu/intel/iommu.c
>>> +++ b/drivers/iommu/intel/iommu.c
>>> @@ -1501,7 +1501,7 @@ static void iommu_flush_iotlb_psi(struct
>>> intel_iommu *iommu,
>>>   	else
>>>   		__iommu_flush_iotlb_psi(iommu, did, pfn, pages, ih);
>>>
>>> -	if (!cap_caching_mode(iommu->cap) && !map)
>>> +	if (!map)
>>>   		iommu_flush_dev_iotlb(domain, addr, mask);
>>
>> as commented earlier better squash this in patch1.
>>
>>>   }
>>>
>>> @@ -1575,8 +1575,7 @@ static void intel_flush_iotlb_all(struct
>>> iommu_domain *domain)
>>>   			iommu->flush.flush_iotlb(iommu, did, 0, 0,
>>>   						 DMA_TLB_DSI_FLUSH);
>>>
>>> -		if (!cap_caching_mode(iommu->cap))
>>> -			iommu_flush_dev_iotlb(dmar_domain, 0,
>>> MAX_AGAW_PFN_WIDTH);
>>> +		iommu_flush_dev_iotlb(dmar_domain, 0,
>>> MAX_AGAW_PFN_WIDTH);
>>>   	}
>>>
>>
>> I'm hesitating to agree with this change. Strictly speaking it's correct.
>> but w/o supporting batch invalidation this implies performance drop on
>> viommu due to more VM-exits and there may incur user complaints when
>> their VMs upgrade to a newer kernel version.
>>
>> So it'd be better to keep this behavior and fix it together with batch
>> invalidation support. Anyway none of the viommu implementations today
>> (either shadow or nested translation) relies on the correct devtlb behavior
>> from the guest otherwise it's already broken.
> How about we split this change into a patch. I'm working on the batch invalidation patch-set now and I'm happy to include this code change into the batch invalidation series.

No worries. It turned out that these two checks are just unnecessary.
Removing them won't cause any driver behavior change. So, I will make it
a cleanup patch and target it for the next merge window.

Best regards,
baolu

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush
  2024-04-10 23:23             ` Tian, Kevin
@ 2024-04-11 16:17               ` Jacob Pan
  2024-04-12  3:13                 ` Tian, Kevin
  0 siblings, 1 reply; 21+ messages in thread
From: Jacob Pan @ 2024-04-11 16:17 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Baolu Lu, iommu, Liu, Yi L, Joerg Roedel, Will Deacon,
	Robin Murphy, linux-kernel, jacob.jun.pan

Hi Kevin,

On Wed, 10 Apr 2024 23:23:57 +0000, "Tian, Kevin" <kevin.tian@intel.com>
wrote:

> > From: Jacob Pan <jacob.jun.pan@linux.intel.com>
> > Sent: Thursday, April 11, 2024 12:20 AM
> > 
> > Hi Kevin,
> > 
> > On Wed, 10 Apr 2024 00:32:06 +0000, "Tian, Kevin" <kevin.tian@intel.com>
> > wrote:
> >   
> > > > From: Jacob Pan <jacob.jun.pan@linux.intel.com>
> > > > Sent: Wednesday, April 10, 2024 1:32 AM
> > > >
> > > > If the guest uses SL page tables in vIOMMU, we don;t expose ATS to
> > > > the guest. So ATS is not relevant here, does't matter map or unmap.
> > > >  
> > >
> > > ATS is orthogonal to SL vs. FL. Where is this restriction coming
> > > from?  
> > For practical purposes, what would be the usage to have SL in the guest
> > and ATS enabled. i.e. shadowing SL but directly expose ATS?
> >   
> 
> ATS is about the protocol between device and iommu to look up
> translations. Why does it care about internal paging layout in
> iommu?
> 
Maybe the original intent was missed, I was suggesting the devTLB flush
should be based on ATS cap (as you said here) not map/unmap.
 
-       /*
-        * In caching mode, changes of pages from non-present to present require
-        * flush. However, device IOTLB doesn't need to be flushed in this case.
-        */
-       if (!cap_caching_mode(iommu->cap) || !map)
-               iommu_flush_dev_iotlb(domain, addr, mask);
+       iommu_flush_dev_iotlb(domain, addr, mask);


Thanks,

Jacob

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush
  2024-04-11 16:17               ` Jacob Pan
@ 2024-04-12  3:13                 ` Tian, Kevin
  0 siblings, 0 replies; 21+ messages in thread
From: Tian, Kevin @ 2024-04-12  3:13 UTC (permalink / raw)
  To: Jacob Pan
  Cc: Baolu Lu, iommu, Liu, Yi L, Joerg Roedel, Will Deacon,
	Robin Murphy, linux-kernel

> From: Jacob Pan <jacob.jun.pan@linux.intel.com>
> Sent: Friday, April 12, 2024 12:18 AM
> 
> Hi Kevin,
> 
> On Wed, 10 Apr 2024 23:23:57 +0000, "Tian, Kevin" <kevin.tian@intel.com>
> wrote:
> 
> > > From: Jacob Pan <jacob.jun.pan@linux.intel.com>
> > > Sent: Thursday, April 11, 2024 12:20 AM
> > >
> > > Hi Kevin,
> > >
> > > On Wed, 10 Apr 2024 00:32:06 +0000, "Tian, Kevin"
> <kevin.tian@intel.com>
> > > wrote:
> > >
> > > > > From: Jacob Pan <jacob.jun.pan@linux.intel.com>
> > > > > Sent: Wednesday, April 10, 2024 1:32 AM
> > > > >
> > > > > If the guest uses SL page tables in vIOMMU, we don;t expose ATS to
> > > > > the guest. So ATS is not relevant here, does't matter map or unmap.
> > > > >
> > > >
> > > > ATS is orthogonal to SL vs. FL. Where is this restriction coming
> > > > from?
> > > For practical purposes, what would be the usage to have SL in the guest
> > > and ATS enabled. i.e. shadowing SL but directly expose ATS?
> > >
> >
> > ATS is about the protocol between device and iommu to look up
> > translations. Why does it care about internal paging layout in
> > iommu?
> >
> Maybe the original intent was missed, I was suggesting the devTLB flush
> should be based on ATS cap (as you said here) not map/unmap.
> 
> -       /*
> -        * In caching mode, changes of pages from non-present to present
> require
> -        * flush. However, device IOTLB doesn't need to be flushed in this case.
> -        */
> -       if (!cap_caching_mode(iommu->cap) || !map)
> -               iommu_flush_dev_iotlb(domain, addr, mask);
> +       iommu_flush_dev_iotlb(domain, addr, mask);
> 

We need check both, as devtlb doesn't cache non-present 
so the invalidation is required only for unmap.

Here just the check of caching mode is irrelevant.

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2024-04-12  3:13 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-07 14:42 [PATCH 1/2] iommu/vt-d: Avoid unnecessary device TLB flush in map path Lu Baolu
2024-04-07 14:42 ` [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush Lu Baolu
2024-04-08  7:21   ` Ethan Zhao
2024-04-08  7:23     ` Baolu Lu
2024-04-08  7:43       ` Ethan Zhao
2024-04-08 21:03   ` Jacob Pan
2024-04-09  3:12     ` Baolu Lu
2024-04-09 17:31       ` Jacob Pan
2024-04-10  0:32         ` Tian, Kevin
2024-04-10 16:19           ` Jacob Pan
2024-04-10 23:23             ` Tian, Kevin
2024-04-11 16:17               ` Jacob Pan
2024-04-12  3:13                 ` Tian, Kevin
2024-04-09  7:30   ` Tian, Kevin
2024-04-10  5:40     ` Baolu Lu
2024-04-10 23:49     ` Zhang, Tina
2024-04-11 12:15       ` Baolu Lu
2024-04-09  8:36   ` Yi Liu
2024-04-09  7:20 ` [PATCH 1/2] iommu/vt-d: Avoid unnecessary device TLB flush in map path Tian, Kevin
2024-04-09  7:21 ` Tian, Kevin
2024-04-09  8:27 ` Yi Liu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).