linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] iommu/vt-d: Some fine tuning of SVA
@ 2022-04-16 12:30 Lu Baolu
  2022-04-16 12:30 ` [PATCH 1/3] iommu/vt-d: Set PGSNP bit in pasid table entry for sva binding Lu Baolu
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Lu Baolu @ 2022-04-16 12:30 UTC (permalink / raw)
  To: Jacob jun Pan, Kevin Tian, Ashok Raj, Liu Yi L
  Cc: iommu, linux-kernel, Lu Baolu

Hi folks,

This includes several tunings of Intel SVA implementation. I plan to
target them for v5.19. Please help to review.

Best regards,
baolu


Lu Baolu (3):
  iommu/vt-d: Set PGSNP bit in pasid table entry for sva binding
  iommu/vt-d: Drop stop marker messages
  iommu/vt-d: Size Page Request Queue to avoid overflow condition

 include/linux/intel-svm.h   | 2 +-
 drivers/iommu/intel/pasid.c | 2 +-
 drivers/iommu/intel/svm.c   | 5 +++++
 3 files changed, 7 insertions(+), 2 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/3] iommu/vt-d: Set PGSNP bit in pasid table entry for sva binding
  2022-04-16 12:30 [PATCH 0/3] iommu/vt-d: Some fine tuning of SVA Lu Baolu
@ 2022-04-16 12:30 ` Lu Baolu
  2022-04-18  6:56   ` Tian, Kevin
  2022-04-16 12:30 ` [PATCH 2/3] iommu/vt-d: Drop stop marker messages Lu Baolu
  2022-04-16 12:30 ` [PATCH 3/3] iommu/vt-d: Size Page Request Queue to avoid overflow condition Lu Baolu
  2 siblings, 1 reply; 10+ messages in thread
From: Lu Baolu @ 2022-04-16 12:30 UTC (permalink / raw)
  To: Jacob jun Pan, Kevin Tian, Ashok Raj, Liu Yi L
  Cc: iommu, linux-kernel, Lu Baolu

This field make the requests snoop processor caches irrespective of other
attributes in the request or other fields in paging structure entries
used to translate the request. The latest VT-d specification states that
this field is treated as Reserved(0) for implementations not supporting
Snoop Control (SC=0 in the Extended Capability Register). Hence add a
check in the code.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel/pasid.c | 2 +-
 drivers/iommu/intel/svm.c   | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
index f8d215d85695..9ca3c67a2058 100644
--- a/drivers/iommu/intel/pasid.c
+++ b/drivers/iommu/intel/pasid.c
@@ -625,7 +625,7 @@ int intel_pasid_setup_first_level(struct intel_iommu *iommu,
 		}
 	}
 
-	if (flags & PASID_FLAG_PAGE_SNOOP)
+	if ((flags & PASID_FLAG_PAGE_SNOOP) && ecap_sc_support(iommu->ecap))
 		pasid_set_pgsnp(pte);
 
 	pasid_set_domain_id(pte, did);
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 23a38763c1d1..d88af37c20ef 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -394,6 +394,7 @@ static struct iommu_sva *intel_svm_bind_mm(struct intel_iommu *iommu,
 	sflags = (flags & SVM_FLAG_SUPERVISOR_MODE) ?
 			PASID_FLAG_SUPERVISOR_MODE : 0;
 	sflags |= cpu_feature_enabled(X86_FEATURE_LA57) ? PASID_FLAG_FL5LP : 0;
+	sflags |= PASID_FLAG_PAGE_SNOOP;
 	spin_lock_irqsave(&iommu->lock, iflags);
 	ret = intel_pasid_setup_first_level(iommu, dev, mm->pgd, mm->pasid,
 					    FLPT_DEFAULT_DID, sflags);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/3] iommu/vt-d: Drop stop marker messages
  2022-04-16 12:30 [PATCH 0/3] iommu/vt-d: Some fine tuning of SVA Lu Baolu
  2022-04-16 12:30 ` [PATCH 1/3] iommu/vt-d: Set PGSNP bit in pasid table entry for sva binding Lu Baolu
@ 2022-04-16 12:30 ` Lu Baolu
  2022-04-18  6:58   ` Tian, Kevin
  2022-04-16 12:30 ` [PATCH 3/3] iommu/vt-d: Size Page Request Queue to avoid overflow condition Lu Baolu
  2 siblings, 1 reply; 10+ messages in thread
From: Lu Baolu @ 2022-04-16 12:30 UTC (permalink / raw)
  To: Jacob jun Pan, Kevin Tian, Ashok Raj, Liu Yi L
  Cc: iommu, linux-kernel, Lu Baolu, Jacob Pan

The VT-d driver explicitly drains the pending page requests when a CPU
page table (represented by a mm struct) is unbound from a PASID according
to the procedures defined in the VT-d spec. Hence, there's no need to
report the stop-marker message in prq_event_thread(). The stop marker
messages do not need a response. This drops stop marker messages silently
if any of them is found in the page request queue.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
---
 drivers/iommu/intel/svm.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index d88af37c20ef..d1c42dfae6ca 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -758,6 +758,10 @@ static irqreturn_t prq_event_thread(int irq, void *d)
 			goto bad_req;
 		}
 
+		/* Drop Stop Marker message. No need for a response. */
+		if (unlikely(req->lpig && !req->rd_req && !req->wr_req))
+			goto prq_advance;
+
 		if (!svm || svm->pasid != req->pasid) {
 			/*
 			 * It can't go away, because the driver is not permitted
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 3/3] iommu/vt-d: Size Page Request Queue to avoid overflow condition
  2022-04-16 12:30 [PATCH 0/3] iommu/vt-d: Some fine tuning of SVA Lu Baolu
  2022-04-16 12:30 ` [PATCH 1/3] iommu/vt-d: Set PGSNP bit in pasid table entry for sva binding Lu Baolu
  2022-04-16 12:30 ` [PATCH 2/3] iommu/vt-d: Drop stop marker messages Lu Baolu
@ 2022-04-16 12:30 ` Lu Baolu
  2022-04-18  7:00   ` Tian, Kevin
  2 siblings, 1 reply; 10+ messages in thread
From: Lu Baolu @ 2022-04-16 12:30 UTC (permalink / raw)
  To: Jacob jun Pan, Kevin Tian, Ashok Raj, Liu Yi L
  Cc: iommu, linux-kernel, Lu Baolu

PRQ overflow may cause I/O throughput congestion, resulting in unnecessary
degradation of IO performance. Appropriately increasing the length of PRQ
can greatly reduce the occurrence of PRQ overflow. The count of maximum
page requests that can be generated in parallel by a PCIe device is
statically defined in the Outstanding Page Request Capacity field of the
PCIe ATS configure space.

The new lenght of PRQ is calculated by summing up the value of Outstanding
Page Request Capacity register across all devices where Page Requests are
supported on the real PR-capable platfrom (Intel Sapphire Rapids). The
result is round to the nearest higher power of 2.

The PRQ length is also double sized as the VT-d IOMMU driver only updates
the Page Request Queue Head Register (PQH_REG) after processing the entire
queue.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 include/linux/intel-svm.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/intel-svm.h b/include/linux/intel-svm.h
index b3b125b332aa..207ef06ba3e1 100644
--- a/include/linux/intel-svm.h
+++ b/include/linux/intel-svm.h
@@ -9,7 +9,7 @@
 #define __INTEL_SVM_H__
 
 /* Page Request Queue depth */
-#define PRQ_ORDER	2
+#define PRQ_ORDER	4
 #define PRQ_RING_MASK	((0x1000 << PRQ_ORDER) - 0x20)
 #define PRQ_DEPTH	((0x1000 << PRQ_ORDER) >> 5)
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* RE: [PATCH 1/3] iommu/vt-d: Set PGSNP bit in pasid table entry for sva binding
  2022-04-16 12:30 ` [PATCH 1/3] iommu/vt-d: Set PGSNP bit in pasid table entry for sva binding Lu Baolu
@ 2022-04-18  6:56   ` Tian, Kevin
  2022-04-18 11:28     ` Lu Baolu
  0 siblings, 1 reply; 10+ messages in thread
From: Tian, Kevin @ 2022-04-18  6:56 UTC (permalink / raw)
  To: Lu Baolu, Pan, Jacob jun, Raj, Ashok, Liu, Yi L; +Cc: iommu, linux-kernel

> From: Lu Baolu <baolu.lu@linux.intel.com>
> Sent: Saturday, April 16, 2022 8:31 PM
> 
> This field make the requests snoop processor caches irrespective of other
> attributes in the request or other fields in paging structure entries
> used to translate the request. The latest VT-d specification states that
> this field is treated as Reserved(0) for implementations not supporting
> Snoop Control (SC=0 in the Extended Capability Register). Hence add a
> check in the code.
> 
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> ---
>  drivers/iommu/intel/pasid.c | 2 +-
>  drivers/iommu/intel/svm.c   | 1 +
>  2 files changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
> index f8d215d85695..9ca3c67a2058 100644
> --- a/drivers/iommu/intel/pasid.c
> +++ b/drivers/iommu/intel/pasid.c
> @@ -625,7 +625,7 @@ int intel_pasid_setup_first_level(struct intel_iommu
> *iommu,
>  		}
>  	}
> 
> -	if (flags & PASID_FLAG_PAGE_SNOOP)
> +	if ((flags & PASID_FLAG_PAGE_SNOOP) && ecap_sc_support(iommu-
> >ecap))
>  		pasid_set_pgsnp(pte);

If the caller wants snoop for some reason is it correct to simply
ignore the request when lacking of hw support? Suppose certain
errno should be returned here...

> 
>  	pasid_set_domain_id(pte, did);
> diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
> index 23a38763c1d1..d88af37c20ef 100644
> --- a/drivers/iommu/intel/svm.c
> +++ b/drivers/iommu/intel/svm.c
> @@ -394,6 +394,7 @@ static struct iommu_sva *intel_svm_bind_mm(struct
> intel_iommu *iommu,
>  	sflags = (flags & SVM_FLAG_SUPERVISOR_MODE) ?
>  			PASID_FLAG_SUPERVISOR_MODE : 0;
>  	sflags |= cpu_feature_enabled(X86_FEATURE_LA57) ?
> PASID_FLAG_FL5LP : 0;
> +	sflags |= PASID_FLAG_PAGE_SNOOP;
>  	spin_lock_irqsave(&iommu->lock, iflags);
>  	ret = intel_pasid_setup_first_level(iommu, dev, mm->pgd, mm-
> >pasid,
>  					    FLPT_DEFAULT_DID, sflags);
> --
> 2.25.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [PATCH 2/3] iommu/vt-d: Drop stop marker messages
  2022-04-16 12:30 ` [PATCH 2/3] iommu/vt-d: Drop stop marker messages Lu Baolu
@ 2022-04-18  6:58   ` Tian, Kevin
  2022-04-18 11:37     ` Lu Baolu
  0 siblings, 1 reply; 10+ messages in thread
From: Tian, Kevin @ 2022-04-18  6:58 UTC (permalink / raw)
  To: Lu Baolu, Pan, Jacob jun, Raj, Ashok, Liu, Yi L
  Cc: iommu, linux-kernel, Jacob Pan

> From: Lu Baolu <baolu.lu@linux.intel.com>
> Sent: Saturday, April 16, 2022 8:31 PM
> 
> The VT-d driver explicitly drains the pending page requests when a CPU
> page table (represented by a mm struct) is unbound from a PASID according
> to the procedures defined in the VT-d spec. Hence, there's no need to
> report the stop-marker message in prq_event_thread(). The stop marker
> messages do not need a response. This drops stop marker messages silently
> if any of them is found in the page request queue.

The comment for iommu_queue_iopf says:

 * This module doesn't handle PCI PASID Stop Marker; IOMMU drivers must discard
 * them before reporting faults. A PASID Stop Marker (LRW = 0b100) doesn't
 * expect a response. It may be generated when disabling a PASID (issuing a
 * PASID stop request) by some PCI devices.

So obviously the current vt-d driver behavior violates that requirement.
Then should this be a bug fix instead?

> 
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> Reviewed-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
> ---
>  drivers/iommu/intel/svm.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
> index d88af37c20ef..d1c42dfae6ca 100644
> --- a/drivers/iommu/intel/svm.c
> +++ b/drivers/iommu/intel/svm.c
> @@ -758,6 +758,10 @@ static irqreturn_t prq_event_thread(int irq, void *d)
>  			goto bad_req;
>  		}
> 
> +		/* Drop Stop Marker message. No need for a response. */
> +		if (unlikely(req->lpig && !req->rd_req && !req->wr_req))
> +			goto prq_advance;
> +
>  		if (!svm || svm->pasid != req->pasid) {
>  			/*
>  			 * It can't go away, because the driver is not
> permitted
> --
> 2.25.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [PATCH 3/3] iommu/vt-d: Size Page Request Queue to avoid overflow condition
  2022-04-16 12:30 ` [PATCH 3/3] iommu/vt-d: Size Page Request Queue to avoid overflow condition Lu Baolu
@ 2022-04-18  7:00   ` Tian, Kevin
  2022-04-18 11:44     ` Lu Baolu
  0 siblings, 1 reply; 10+ messages in thread
From: Tian, Kevin @ 2022-04-18  7:00 UTC (permalink / raw)
  To: Lu Baolu, Pan, Jacob jun, Raj, Ashok, Liu, Yi L; +Cc: iommu, linux-kernel

> From: Lu Baolu <baolu.lu@linux.intel.com>
> Sent: Saturday, April 16, 2022 8:31 PM
> 
> PRQ overflow may cause I/O throughput congestion, resulting in unnecessary
> degradation of IO performance. Appropriately increasing the length of PRQ
> can greatly reduce the occurrence of PRQ overflow. The count of maximum
> page requests that can be generated in parallel by a PCIe device is
> statically defined in the Outstanding Page Request Capacity field of the
> PCIe ATS configure space.
> 
> The new lenght of PRQ is calculated by summing up the value of Outstanding
> Page Request Capacity register across all devices where Page Requests are
> supported on the real PR-capable platfrom (Intel Sapphire Rapids). The
> result is round to the nearest higher power of 2.

The actual requirement is usage and platform specific. What about
doubling the default size and also provide an option for admin to
tune?

> 
> The PRQ length is also double sized as the VT-d IOMMU driver only updates
> the Page Request Queue Head Register (PQH_REG) after processing the
> entire
> queue.
> 
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> ---
>  include/linux/intel-svm.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/linux/intel-svm.h b/include/linux/intel-svm.h
> index b3b125b332aa..207ef06ba3e1 100644
> --- a/include/linux/intel-svm.h
> +++ b/include/linux/intel-svm.h
> @@ -9,7 +9,7 @@
>  #define __INTEL_SVM_H__
> 
>  /* Page Request Queue depth */
> -#define PRQ_ORDER	2
> +#define PRQ_ORDER	4
>  #define PRQ_RING_MASK	((0x1000 << PRQ_ORDER) - 0x20)
>  #define PRQ_DEPTH	((0x1000 << PRQ_ORDER) >> 5)
> 
> --
> 2.25.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/3] iommu/vt-d: Set PGSNP bit in pasid table entry for sva binding
  2022-04-18  6:56   ` Tian, Kevin
@ 2022-04-18 11:28     ` Lu Baolu
  0 siblings, 0 replies; 10+ messages in thread
From: Lu Baolu @ 2022-04-18 11:28 UTC (permalink / raw)
  To: Tian, Kevin, Pan, Jacob jun, Raj, Ashok, Liu, Yi L
  Cc: baolu.lu, iommu, linux-kernel

On 2022/4/18 14:56, Tian, Kevin wrote:
>> From: Lu Baolu<baolu.lu@linux.intel.com>
>> Sent: Saturday, April 16, 2022 8:31 PM
>>
>> This field make the requests snoop processor caches irrespective of other
>> attributes in the request or other fields in paging structure entries
>> used to translate the request. The latest VT-d specification states that
>> this field is treated as Reserved(0) for implementations not supporting
>> Snoop Control (SC=0 in the Extended Capability Register). Hence add a
>> check in the code.
>>
>> Signed-off-by: Lu Baolu<baolu.lu@linux.intel.com>
>> ---
>>   drivers/iommu/intel/pasid.c | 2 +-
>>   drivers/iommu/intel/svm.c   | 1 +
>>   2 files changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
>> index f8d215d85695..9ca3c67a2058 100644
>> --- a/drivers/iommu/intel/pasid.c
>> +++ b/drivers/iommu/intel/pasid.c
>> @@ -625,7 +625,7 @@ int intel_pasid_setup_first_level(struct intel_iommu
>> *iommu,
>>   		}
>>   	}
>>
>> -	if (flags & PASID_FLAG_PAGE_SNOOP)
>> +	if ((flags & PASID_FLAG_PAGE_SNOOP) && ecap_sc_support(iommu-
>>> ecap))
>>   		pasid_set_pgsnp(pte);
> If the caller wants snoop for some reason is it correct to simply
> ignore the request when lacking of hw support? Suppose certain
> errno should be returned here...

Good catch. Perhaps I should make the cap check in a separated patch.

Best regards,
baolu

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/3] iommu/vt-d: Drop stop marker messages
  2022-04-18  6:58   ` Tian, Kevin
@ 2022-04-18 11:37     ` Lu Baolu
  0 siblings, 0 replies; 10+ messages in thread
From: Lu Baolu @ 2022-04-18 11:37 UTC (permalink / raw)
  To: Tian, Kevin, Pan, Jacob jun, Raj, Ashok, Liu, Yi L
  Cc: baolu.lu, iommu, linux-kernel, Jacob Pan

On 2022/4/18 14:58, Tian, Kevin wrote:
>> From: Lu Baolu<baolu.lu@linux.intel.com>
>> Sent: Saturday, April 16, 2022 8:31 PM
>>
>> The VT-d driver explicitly drains the pending page requests when a CPU
>> page table (represented by a mm struct) is unbound from a PASID according
>> to the procedures defined in the VT-d spec. Hence, there's no need to
>> report the stop-marker message in prq_event_thread(). The stop marker
>> messages do not need a response. This drops stop marker messages silently
>> if any of them is found in the page request queue.
> The comment for iommu_queue_iopf says:
> 
>   * This module doesn't handle PCI PASID Stop Marker; IOMMU drivers must discard
>   * them before reporting faults. A PASID Stop Marker (LRW = 0b100) doesn't
>   * expect a response. It may be generated when disabling a PASID (issuing a
>   * PASID stop request) by some PCI devices.
> 
> So obviously the current vt-d driver behavior violates that requirement.
> Then should this be a bug fix instead?
> 

Our platforms have no devices issuing Stop Marker messages yet. But in a
virtualization environment, a device might be emulated with this
capability. Yes! Let me make it as a fix.

Best regards,
baolu

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 3/3] iommu/vt-d: Size Page Request Queue to avoid overflow condition
  2022-04-18  7:00   ` Tian, Kevin
@ 2022-04-18 11:44     ` Lu Baolu
  0 siblings, 0 replies; 10+ messages in thread
From: Lu Baolu @ 2022-04-18 11:44 UTC (permalink / raw)
  To: Tian, Kevin, Pan, Jacob jun, Raj, Ashok, Liu, Yi L
  Cc: baolu.lu, iommu, linux-kernel

On 2022/4/18 15:00, Tian, Kevin wrote:
>> From: Lu Baolu<baolu.lu@linux.intel.com>
>> Sent: Saturday, April 16, 2022 8:31 PM
>>
>> PRQ overflow may cause I/O throughput congestion, resulting in unnecessary
>> degradation of IO performance. Appropriately increasing the length of PRQ
>> can greatly reduce the occurrence of PRQ overflow. The count of maximum
>> page requests that can be generated in parallel by a PCIe device is
>> statically defined in the Outstanding Page Request Capacity field of the
>> PCIe ATS configure space.
>>
>> The new lenght of PRQ is calculated by summing up the value of Outstanding
>> Page Request Capacity register across all devices where Page Requests are
>> supported on the real PR-capable platfrom (Intel Sapphire Rapids). The
>> result is round to the nearest higher power of 2.
> The actual requirement is usage and platform specific. What about
> doubling the default size and also provide an option for admin to
> tune?

I also have this in my list to provide an opt-in interface for user. But
I don't want to include it in this series as this is only for small
tuning. The opt-in interface looks more like a new feature. :-)

Best regards,
baolu

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2022-04-18 11:44 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-16 12:30 [PATCH 0/3] iommu/vt-d: Some fine tuning of SVA Lu Baolu
2022-04-16 12:30 ` [PATCH 1/3] iommu/vt-d: Set PGSNP bit in pasid table entry for sva binding Lu Baolu
2022-04-18  6:56   ` Tian, Kevin
2022-04-18 11:28     ` Lu Baolu
2022-04-16 12:30 ` [PATCH 2/3] iommu/vt-d: Drop stop marker messages Lu Baolu
2022-04-18  6:58   ` Tian, Kevin
2022-04-18 11:37     ` Lu Baolu
2022-04-16 12:30 ` [PATCH 3/3] iommu/vt-d: Size Page Request Queue to avoid overflow condition Lu Baolu
2022-04-18  7:00   ` Tian, Kevin
2022-04-18 11:44     ` Lu Baolu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).