linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3] iommu/vt-d: Add a fix for devices need extra dtlb flush
@ 2022-11-30  6:24 Jacob Pan
  2022-12-01  4:02 ` Baolu Lu
  2022-12-05 10:43 ` Joerg Roedel
  0 siblings, 2 replies; 7+ messages in thread
From: Jacob Pan @ 2022-11-30  6:24 UTC (permalink / raw)
  To: LKML, iommu, Lu Baolu, Joerg Roedel
  Cc: Robin Murphy, Will Deacon, David Woodhouse, Raj Ashok, Tian,
	Kevin, Yi Liu, Jacob Pan, Yuzhang Luo

QAT devices on Intel Sapphire Rapids and Emerald Rapids have a defect in
address translation service (ATS). These devices may inadvertently issue
ATS invalidation completion before posted writes initiated with
translated address that utilized translations matching the invalidation
address range, violating the invalidation completion ordering.

This patch adds an extra device TLB invalidation for the affected devices,
it is needed to ensure no more posted writes with translated address
following the invalidation completion. Therefore, the ordering is
preserved and data-corruption is prevented.

Device TLBs are invalidated under the following six conditions:
1. Device driver does DMA API unmap IOVA
2. Device driver unbind a PASID from a process, sva_unbind_device()
3. PASID is torn down, after PASID cache is flushed. e.g. process
exit_mmap() due to crash
4. Under SVA usage, called by mmu_notifier.invalidate_range() where
VM has to free pages that were unmapped
5. userspace driver unmaps a DMA buffer
6. Cache invalidation in vSVA usage (upcoming)

For #1 and #2, device drivers are responsible for stopping DMA traffic
before unmap/unbind. For #3, iommu driver gets mmu_notifier to
invalidate TLB the same way as normal user unmap which will do an extra
invalidation. The dTLB invalidation after PASID cache flush does not
need an extra invalidation.

Therefore, we only need to deal with #4 and #5 in this patch. #1 is also
covered by this patch due to common code path with #5.

Tested-by: Yuzhang Luo <yuzhang.luo@intel.com>
Reviewed-by: Ashok Raj <ashok.raj@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
---
v3
- renamed quirk function
- add more comments to explain risky_device() check
v2
- removed risky_device() check based on the review by Robin, added comments
  to explain the exemption.
- reworked commit message based on the review from Ashok
---
 drivers/iommu/intel/iommu.c | 67 +++++++++++++++++++++++++++++++++++--
 drivers/iommu/intel/iommu.h |  3 ++
 drivers/iommu/intel/svm.c   |  5 ++-
 3 files changed, 72 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 996a8b5ee5ee..d8759f445aff 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1396,6 +1396,23 @@ static void domain_update_iotlb(struct dmar_domain *domain)
 	spin_unlock_irqrestore(&domain->lock, flags);
 }
 
+/*
+ * Impacted QAT device IDs ranging from 0x4940 to 0x4943.
+ * This quirk is exempted from risky_device() check because it applies only
+ * to the built-in QAT devices and it doesn't grant additional privileges.
+ */
+#define BUGGY_QAT_DEVID_MASK 0x494c
+static bool dev_needs_extra_dtlb_flush(struct pci_dev *pdev)
+{
+	if (pdev->vendor != PCI_VENDOR_ID_INTEL)
+		return false;
+
+	if ((pdev->device & 0xfffc) != BUGGY_QAT_DEVID_MASK)
+		return false;
+
+	return true;
+}
+
 static void iommu_enable_pci_caps(struct device_domain_info *info)
 {
 	struct pci_dev *pdev;
@@ -1478,6 +1495,7 @@ static void __iommu_flush_dev_iotlb(struct device_domain_info *info,
 	qdep = info->ats_qdep;
 	qi_flush_dev_iotlb(info->iommu, sid, info->pfsid,
 			   qdep, addr, mask);
+	quirk_extra_dev_tlb_flush(info, addr, mask, PASID_RID2PASID, qdep);
 }
 
 static void iommu_flush_dev_iotlb(struct dmar_domain *domain,
@@ -4490,9 +4508,10 @@ static struct iommu_device *intel_iommu_probe_device(struct device *dev)
 	if (dev_is_pci(dev)) {
 		if (ecap_dev_iotlb_support(iommu->ecap) &&
 		    pci_ats_supported(pdev) &&
-		    dmar_ats_supported(pdev, iommu))
+		    dmar_ats_supported(pdev, iommu)) {
 			info->ats_supported = 1;
-
+			info->dtlb_extra_inval = dev_needs_extra_dtlb_flush(pdev);
+		}
 		if (sm_supported(iommu)) {
 			if (pasid_supported(iommu)) {
 				int features = pci_pasid_features(pdev);
@@ -4931,3 +4950,47 @@ static void __init check_tylersburg_isoch(void)
 	pr_warn("Recommended TLB entries for ISOCH unit is 16; your BIOS set %d\n",
 	       vtisochctrl);
 }
+
+/*
+ * Here we deal with a device TLB defect where device may inadvertently issue ATS
+ * invalidation completion before posted writes initiated with translated address
+ * that utilized translations matching the invalidation address range, violating
+ * the invalidation completion ordering.
+ * Therefore, any use cases that cannot guarantee DMA is stopped before unmap is
+ * vulnerable to this defect. In other words, any dTLB invalidation initiated not
+ * under the control of the trusted/privileged host device driver must use this
+ * quirk.
+ * Device TLBs are invalidated under the following six conditions:
+ * 1. Device driver does DMA API unmap IOVA
+ * 2. Device driver unbind a PASID from a process, sva_unbind_device()
+ * 3. PASID is torn down, after PASID cache is flushed. e.g. process
+ *    exit_mmap() due to crash
+ * 4. Under SVA usage, called by mmu_notifier.invalidate_range() where
+ *    VM has to free pages that were unmapped
+ * 5. Userspace driver unmaps a DMA buffer
+ * 6. Cache invalidation in vSVA usage (upcoming)
+ *
+ * For #1 and #2, device drivers are responsible for stopping DMA traffic
+ * before unmap/unbind. For #3, iommu driver gets mmu_notifier to
+ * invalidate TLB the same way as normal user unmap which will use this quirk.
+ * The dTLB invalidation after PASID cache flush does not need this quirk.
+ *
+ * As a reminder, #6 will *NEED* this quirk as we enable nested translation.
+ */
+void quirk_extra_dev_tlb_flush(struct device_domain_info *info, unsigned long address,
+		   unsigned long mask, u32 pasid, u16 qdep)
+{
+	u16 sid;
+
+	if (likely(!info->dtlb_extra_inval))
+		return;
+
+	sid = PCI_DEVID(info->bus, info->devfn);
+	if (pasid == PASID_RID2PASID) {
+		qi_flush_dev_iotlb(info->iommu, sid, info->pfsid,
+				   qdep, address, mask);
+	} else {
+		qi_flush_dev_iotlb_pasid(info->iommu, sid, info->pfsid,
+					 pasid, qdep, address, mask);
+	}
+}
diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index 92023dff9513..36297e17d815 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -623,6 +623,7 @@ struct device_domain_info {
 	u8 pri_enabled:1;
 	u8 ats_supported:1;
 	u8 ats_enabled:1;
+	u8 dtlb_extra_inval:1;	/* Quirk for devices need extra flush */
 	u8 ats_qdep;
 	struct device *dev; /* it's NULL for PCIe-to-PCI bridge */
 	struct intel_iommu *iommu; /* IOMMU used by this device */
@@ -728,6 +729,8 @@ void qi_flush_piotlb(struct intel_iommu *iommu, u16 did, u32 pasid, u64 addr,
 void qi_flush_dev_iotlb_pasid(struct intel_iommu *iommu, u16 sid, u16 pfsid,
 			      u32 pasid, u16 qdep, u64 addr,
 			      unsigned int size_order);
+void quirk_extra_dev_tlb_flush(struct device_domain_info *info, unsigned long address,
+		   unsigned long pages, u32 pasid, u16 qdep);
 void qi_flush_pasid_cache(struct intel_iommu *iommu, u16 did, u64 granu,
 			  u32 pasid);
 
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 7d08eb034f2d..fe615c53479c 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -184,10 +184,13 @@ static void __flush_svm_range_dev(struct intel_svm *svm,
 		return;
 
 	qi_flush_piotlb(sdev->iommu, sdev->did, svm->pasid, address, pages, ih);
-	if (info->ats_enabled)
+	if (info->ats_enabled) {
 		qi_flush_dev_iotlb_pasid(sdev->iommu, sdev->sid, info->pfsid,
 					 svm->pasid, sdev->qdep, address,
 					 order_base_2(pages));
+		quirk_extra_dev_tlb_flush(info, address, order_base_2(pages),
+					  svm->pasid, sdev->qdep);
+	}
 }
 
 static void intel_flush_svm_range_dev(struct intel_svm *svm,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v3] iommu/vt-d: Add a fix for devices need extra dtlb flush
  2022-11-30  6:24 [PATCH v3] iommu/vt-d: Add a fix for devices need extra dtlb flush Jacob Pan
@ 2022-12-01  4:02 ` Baolu Lu
  2022-12-05 10:43 ` Joerg Roedel
  1 sibling, 0 replies; 7+ messages in thread
From: Baolu Lu @ 2022-12-01  4:02 UTC (permalink / raw)
  To: Jacob Pan, LKML, iommu, Joerg Roedel
  Cc: baolu.lu, Robin Murphy, Will Deacon, David Woodhouse, Raj Ashok,
	Tian, Kevin, Yi Liu, Yuzhang Luo

On 2022/11/30 14:24, Jacob Pan wrote:
> QAT devices on Intel Sapphire Rapids and Emerald Rapids have a defect in
> address translation service (ATS). These devices may inadvertently issue
> ATS invalidation completion before posted writes initiated with
> translated address that utilized translations matching the invalidation
> address range, violating the invalidation completion ordering.
> 
> This patch adds an extra device TLB invalidation for the affected devices,
> it is needed to ensure no more posted writes with translated address
> following the invalidation completion. Therefore, the ordering is
> preserved and data-corruption is prevented.
> 
> Device TLBs are invalidated under the following six conditions:
> 1. Device driver does DMA API unmap IOVA
> 2. Device driver unbind a PASID from a process, sva_unbind_device()
> 3. PASID is torn down, after PASID cache is flushed. e.g. process
> exit_mmap() due to crash
> 4. Under SVA usage, called by mmu_notifier.invalidate_range() where
> VM has to free pages that were unmapped
> 5. userspace driver unmaps a DMA buffer
> 6. Cache invalidation in vSVA usage (upcoming)
> 
> For #1 and #2, device drivers are responsible for stopping DMA traffic
> before unmap/unbind. For #3, iommu driver gets mmu_notifier to
> invalidate TLB the same way as normal user unmap which will do an extra
> invalidation. The dTLB invalidation after PASID cache flush does not
> need an extra invalidation.
> 
> Therefore, we only need to deal with #4 and #5 in this patch. #1 is also
> covered by this patch due to common code path with #5.
> 
> Tested-by: Yuzhang Luo<yuzhang.luo@intel.com>
> Reviewed-by: Ashok Raj<ashok.raj@intel.com>
> Reviewed-by: Kevin Tian<kevin.tian@intel.com>
> Signed-off-by: Jacob Pan<jacob.jun.pan@linux.intel.com>

Queued for Joerg. Thank you!

Best regards,
baolu

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v3] iommu/vt-d: Add a fix for devices need extra dtlb flush
  2022-11-30  6:24 [PATCH v3] iommu/vt-d: Add a fix for devices need extra dtlb flush Jacob Pan
  2022-12-01  4:02 ` Baolu Lu
@ 2022-12-05 10:43 ` Joerg Roedel
  2022-12-05 10:59   ` Baolu Lu
  1 sibling, 1 reply; 7+ messages in thread
From: Joerg Roedel @ 2022-12-05 10:43 UTC (permalink / raw)
  To: Jacob Pan
  Cc: LKML, iommu, Lu Baolu, Robin Murphy, Will Deacon,
	David Woodhouse, Raj Ashok, Tian, Kevin, Yi Liu, Yuzhang Luo

On Tue, Nov 29, 2022 at 10:24:49PM -0800, Jacob Pan wrote:
>  drivers/iommu/intel/iommu.c | 67 +++++++++++++++++++++++++++++++++++--
>  drivers/iommu/intel/iommu.h |  3 ++
>  drivers/iommu/intel/svm.c   |  5 ++-
>  3 files changed, 72 insertions(+), 3 deletions(-)

I removed this commit from my fixes branch. Please re-submit with the
fix included and I will queue it for 6.2 once Lu Baolu acked it.

Thanks,

	Joerg

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v3] iommu/vt-d: Add a fix for devices need extra dtlb flush
  2022-12-05 10:43 ` Joerg Roedel
@ 2022-12-05 10:59   ` Baolu Lu
  2022-12-05 12:35     ` Joerg Roedel
  2022-12-05 12:44     ` Joerg Roedel
  0 siblings, 2 replies; 7+ messages in thread
From: Baolu Lu @ 2022-12-05 10:59 UTC (permalink / raw)
  To: Joerg Roedel, Jacob Pan
  Cc: baolu.lu, LKML, iommu, Robin Murphy, Will Deacon,
	David Woodhouse, Raj Ashok, Tian, Kevin, Yi Liu, Yuzhang Luo

On 2022/12/5 18:43, Joerg Roedel wrote:
> On Tue, Nov 29, 2022 at 10:24:49PM -0800, Jacob Pan wrote:
>>   drivers/iommu/intel/iommu.c | 67 +++++++++++++++++++++++++++++++++++--
>>   drivers/iommu/intel/iommu.h |  3 ++
>>   drivers/iommu/intel/svm.c   |  5 ++-
>>   3 files changed, 72 insertions(+), 3 deletions(-)
> I removed this commit from my fixes branch. Please re-submit with the
> fix included and I will queue it for 6.2 once Lu Baolu acked it.

But this patch has already been merged in v6.1-rc7.

https://lore.kernel.org/linux-iommu/Y4oF8e7quzjm2kzD@8bytes.org/

Is this still feasible?

Best regards,
baolu

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v3] iommu/vt-d: Add a fix for devices need extra dtlb flush
  2022-12-05 10:59   ` Baolu Lu
@ 2022-12-05 12:35     ` Joerg Roedel
  2022-12-05 12:44     ` Joerg Roedel
  1 sibling, 0 replies; 7+ messages in thread
From: Joerg Roedel @ 2022-12-05 12:35 UTC (permalink / raw)
  To: Baolu Lu
  Cc: Jacob Pan, LKML, iommu, Robin Murphy, Will Deacon,
	David Woodhouse, Raj Ashok, Tian, Kevin, Yi Liu, Yuzhang Luo

On Mon, Dec 05, 2022 at 06:59:19PM +0800, Baolu Lu wrote:
> But this patch has already been merged in v6.1-rc7.
> 
> https://lore.kernel.org/linux-iommu/Y4oF8e7quzjm2kzD@8bytes.org/

Ah right, Linus did pull it. I though it wasn't pulled because I didn't
get a bot message about it. I will merge the fix.

Thanks,

	Joerg

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v3] iommu/vt-d: Add a fix for devices need extra dtlb flush
  2022-12-05 10:59   ` Baolu Lu
  2022-12-05 12:35     ` Joerg Roedel
@ 2022-12-05 12:44     ` Joerg Roedel
  2022-12-05 13:19       ` Baolu Lu
  1 sibling, 1 reply; 7+ messages in thread
From: Joerg Roedel @ 2022-12-05 12:44 UTC (permalink / raw)
  To: Baolu Lu
  Cc: Jacob Pan, LKML, iommu, Robin Murphy, Will Deacon,
	David Woodhouse, Raj Ashok, Tian, Kevin, Yi Liu, Yuzhang Luo

Baolu,

On Mon, Dec 05, 2022 at 06:59:19PM +0800, Baolu Lu wrote:
> Is this still feasible?

Can you please review the fix and provide your tag? I will put in the
fixes branch then.

Regards,

	Joerg

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v3] iommu/vt-d: Add a fix for devices need extra dtlb flush
  2022-12-05 12:44     ` Joerg Roedel
@ 2022-12-05 13:19       ` Baolu Lu
  0 siblings, 0 replies; 7+ messages in thread
From: Baolu Lu @ 2022-12-05 13:19 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: baolu.lu, Jacob Pan, LKML, iommu, Robin Murphy, Will Deacon,
	David Woodhouse, Raj Ashok, Tian, Kevin, Yi Liu, Yuzhang Luo

On 2022/12/5 20:44, Joerg Roedel wrote:
> Baolu,
> 
> On Mon, Dec 05, 2022 at 06:59:19PM +0800, Baolu Lu wrote:
>> Is this still feasible?
> 
> Can you please review the fix and provide your tag? I will put in the
> fixes branch then.

It seems that this patch doesn't apply to the fixes branch. Let me
prepare a pull request for you.

Best regards,
baolu

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2022-12-05 13:19 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-30  6:24 [PATCH v3] iommu/vt-d: Add a fix for devices need extra dtlb flush Jacob Pan
2022-12-01  4:02 ` Baolu Lu
2022-12-05 10:43 ` Joerg Roedel
2022-12-05 10:59   ` Baolu Lu
2022-12-05 12:35     ` Joerg Roedel
2022-12-05 12:44     ` Joerg Roedel
2022-12-05 13:19       ` Baolu Lu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).