linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] iommu/vt-d: Force to flush iotlb before creating superpage
@ 2021-04-15  0:46 Longpeng(Mike)
  2021-04-15  5:49 ` Lu Baolu
  2021-04-15 14:13 ` Joerg Roedel
  0 siblings, 2 replies; 4+ messages in thread
From: Longpeng(Mike) @ 2021-04-15  0:46 UTC (permalink / raw)
  To: iommu, linux-kernel
  Cc: longpeng2, David Woodhouse, Lu Baolu, Nadav Amit,
	Alex Williamson, Joerg Roedel, Kevin Tian, Gonglei, stable

The translation caches may preserve obsolete data when the
mapping size is changed, suppose the following sequence which
can reveal the problem with high probability.

1.mmap(4GB,MAP_HUGETLB)
2.
  while (1) {
   (a)    DMA MAP   0,0xa0000
   (b)    DMA UNMAP 0,0xa0000
   (c)    DMA MAP   0,0xc0000000
             * DMA read IOVA 0 may failure here (Not present)
             * if the problem occurs.
   (d)    DMA UNMAP 0,0xc0000000
  }

The page table(only focus on IOVA 0) after (a) is:
 PML4: 0x19db5c1003   entry:0xffff899bdcd2f000
  PDPE: 0x1a1cacb003  entry:0xffff89b35b5c1000
   PDE: 0x1a30a72003  entry:0xffff89b39cacb000
    PTE: 0x21d200803  entry:0xffff89b3b0a72000

The page table after (b) is:
 PML4: 0x19db5c1003   entry:0xffff899bdcd2f000
  PDPE: 0x1a1cacb003  entry:0xffff89b35b5c1000
   PDE: 0x1a30a72003  entry:0xffff89b39cacb000
    PTE: 0x0          entry:0xffff89b3b0a72000

The page table after (c) is:
 PML4: 0x19db5c1003   entry:0xffff899bdcd2f000
  PDPE: 0x1a1cacb003  entry:0xffff89b35b5c1000
   PDE: 0x21d200883   entry:0xffff89b39cacb000 (*)

Because the PDE entry after (b) is present, it won't be
flushed even if the iommu driver flush cache when unmap,
so the obsolete data may be preserved in cache, which
would cause the wrong translation at end.

However, we can see the PDE entry is finally switch to
2M-superpage mapping, but it does not transform
to 0x21d200883 directly:

1. PDE: 0x1a30a72003
2. __domain_mapping
     dma_pte_free_pagetable
       Set the PDE entry to ZERO
     Set the PDE entry to 0x21d200883

So we must flush the cache after the entry switch to ZERO
to avoid the obsolete info be preserved.

Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Gonglei (Arei) <arei.gonglei@huawei.com>

Fixes: 6491d4d02893 ("intel-iommu: Free old page tables before creating superpage")
Cc: <stable@vger.kernel.org> # v3.0+
Link: https://lore.kernel.org/linux-iommu/670baaf8-4ff8-4e84-4be3-030b95ab5a5e@huawei.com/
Suggested-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
---
v1 -> v2:
  - add Joerg
  - reconstruct the solution base on the Baolu's suggestion
---
 drivers/iommu/intel/iommu.c | 52 +++++++++++++++++++++++++++++++++------------
 1 file changed, 38 insertions(+), 14 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index ee09323..881c9f2 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -2289,6 +2289,41 @@ static inline int hardware_largepage_caps(struct dmar_domain *domain,
 	return level;
 }
 
+/*
+ * Ensure that old small page tables are removed to make room for superpage(s).
+ * We're going to add new large pages, so make sure we don't remove their parent
+ * tables. The IOTLB/devTLBs should be flushed if any PDE/PTEs are cleared.
+ */
+static void switch_to_super_page(struct dmar_domain *domain,
+				 unsigned long start_pfn,
+				 unsigned long end_pfn, int level)
+{
+	unsigned long lvl_pages = lvl_to_nr_pages(level);
+	struct dma_pte *pte = NULL;
+	int i;
+
+	while (start_pfn <= end_pfn) {
+		if (!pte)
+			pte = pfn_to_dma_pte(domain, start_pfn, &level);
+
+		if (dma_pte_present(pte)) {
+			dma_pte_free_pagetable(domain, start_pfn,
+					       start_pfn + lvl_pages - 1,
+					       level + 1);
+
+			for_each_domain_iommu(i, domain)
+				iommu_flush_iotlb_psi(g_iommus[i], domain,
+						      start_pfn, lvl_pages,
+						      0, 0);
+		}
+
+		pte++;
+		start_pfn += lvl_pages;
+		if (first_pte_in_page(pte))
+			pte = NULL;
+	}
+}
+
 static int
 __domain_mapping(struct dmar_domain *domain, unsigned long iov_pfn,
 		 unsigned long phys_pfn, unsigned long nr_pages, int prot)
@@ -2329,22 +2364,11 @@ static inline int hardware_largepage_caps(struct dmar_domain *domain,
 				return -ENOMEM;
 			/* It is large page*/
 			if (largepage_lvl > 1) {
-				unsigned long nr_superpages, end_pfn;
+				unsigned long end_pfn;
 
 				pteval |= DMA_PTE_LARGE_PAGE;
-				lvl_pages = lvl_to_nr_pages(largepage_lvl);
-
-				nr_superpages = nr_pages / lvl_pages;
-				end_pfn = iov_pfn + nr_superpages * lvl_pages - 1;
-
-				/*
-				 * Ensure that old small page tables are
-				 * removed to make room for superpage(s).
-				 * We're adding new large pages, so make sure
-				 * we don't remove their parent tables.
-				 */
-				dma_pte_free_pagetable(domain, iov_pfn, end_pfn,
-						       largepage_lvl + 1);
+				end_pfn = ((iov_pfn + nr_pages) & level_mask(largepage_lvl)) - 1;
+				switch_to_super_page(domain, iov_pfn, end_pfn, largepage_lvl);
 			} else {
 				pteval &= ~(uint64_t)DMA_PTE_LARGE_PAGE;
 			}
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH v2] iommu/vt-d: Force to flush iotlb before creating superpage
  2021-04-15  0:46 [PATCH v2] iommu/vt-d: Force to flush iotlb before creating superpage Longpeng(Mike)
@ 2021-04-15  5:49 ` Lu Baolu
  2021-04-15 14:13 ` Joerg Roedel
  1 sibling, 0 replies; 4+ messages in thread
From: Lu Baolu @ 2021-04-15  5:49 UTC (permalink / raw)
  To: Longpeng(Mike), iommu, linux-kernel
  Cc: baolu.lu, David Woodhouse, Nadav Amit, Alex Williamson,
	Joerg Roedel, Kevin Tian, Gonglei, stable

Hi Longpeng,

On 4/15/21 8:46 AM, Longpeng(Mike) wrote:
> The translation caches may preserve obsolete data when the
> mapping size is changed, suppose the following sequence which
> can reveal the problem with high probability.
> 
> 1.mmap(4GB,MAP_HUGETLB)
> 2.
>    while (1) {
>     (a)    DMA MAP   0,0xa0000
>     (b)    DMA UNMAP 0,0xa0000
>     (c)    DMA MAP   0,0xc0000000
>               * DMA read IOVA 0 may failure here (Not present)
>               * if the problem occurs.
>     (d)    DMA UNMAP 0,0xc0000000
>    }
> 
> The page table(only focus on IOVA 0) after (a) is:
>   PML4: 0x19db5c1003   entry:0xffff899bdcd2f000
>    PDPE: 0x1a1cacb003  entry:0xffff89b35b5c1000
>     PDE: 0x1a30a72003  entry:0xffff89b39cacb000
>      PTE: 0x21d200803  entry:0xffff89b3b0a72000
> 
> The page table after (b) is:
>   PML4: 0x19db5c1003   entry:0xffff899bdcd2f000
>    PDPE: 0x1a1cacb003  entry:0xffff89b35b5c1000
>     PDE: 0x1a30a72003  entry:0xffff89b39cacb000
>      PTE: 0x0          entry:0xffff89b3b0a72000
> 
> The page table after (c) is:
>   PML4: 0x19db5c1003   entry:0xffff899bdcd2f000
>    PDPE: 0x1a1cacb003  entry:0xffff89b35b5c1000
>     PDE: 0x21d200883   entry:0xffff89b39cacb000 (*)
> 
> Because the PDE entry after (b) is present, it won't be
> flushed even if the iommu driver flush cache when unmap,
> so the obsolete data may be preserved in cache, which
> would cause the wrong translation at end.
> 
> However, we can see the PDE entry is finally switch to
> 2M-superpage mapping, but it does not transform
> to 0x21d200883 directly:
> 
> 1. PDE: 0x1a30a72003
> 2. __domain_mapping
>       dma_pte_free_pagetable
>         Set the PDE entry to ZERO
>       Set the PDE entry to 0x21d200883
> 
> So we must flush the cache after the entry switch to ZERO
> to avoid the obsolete info be preserved.
> 
> Cc: David Woodhouse <dwmw2@infradead.org>
> Cc: Lu Baolu <baolu.lu@linux.intel.com>
> Cc: Nadav Amit <nadav.amit@gmail.com>
> Cc: Alex Williamson <alex.williamson@redhat.com>
> Cc: Joerg Roedel <joro@8bytes.org>
> Cc: Kevin Tian <kevin.tian@intel.com>
> Cc: Gonglei (Arei) <arei.gonglei@huawei.com>
> 
> Fixes: 6491d4d02893 ("intel-iommu: Free old page tables before creating superpage")
> Cc: <stable@vger.kernel.org> # v3.0+
> Link: https://lore.kernel.org/linux-iommu/670baaf8-4ff8-4e84-4be3-030b95ab5a5e@huawei.com/
> Suggested-by: Lu Baolu <baolu.lu@linux.intel.com>
> Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
> ---
> v1 -> v2:
>    - add Joerg
>    - reconstruct the solution base on the Baolu's suggestion
> ---
>   drivers/iommu/intel/iommu.c | 52 +++++++++++++++++++++++++++++++++------------
>   1 file changed, 38 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index ee09323..881c9f2 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -2289,6 +2289,41 @@ static inline int hardware_largepage_caps(struct dmar_domain *domain,
>   	return level;
>   }
>   
> +/*
> + * Ensure that old small page tables are removed to make room for superpage(s).
> + * We're going to add new large pages, so make sure we don't remove their parent
> + * tables. The IOTLB/devTLBs should be flushed if any PDE/PTEs are cleared.
> + */
> +static void switch_to_super_page(struct dmar_domain *domain,
> +				 unsigned long start_pfn,
> +				 unsigned long end_pfn, int level)
> +{
> +	unsigned long lvl_pages = lvl_to_nr_pages(level);
> +	struct dma_pte *pte = NULL;
> +	int i;
> +
> +	while (start_pfn <= end_pfn) {
> +		if (!pte)
> +			pte = pfn_to_dma_pte(domain, start_pfn, &level);
> +
> +		if (dma_pte_present(pte)) {
> +			dma_pte_free_pagetable(domain, start_pfn,
> +					       start_pfn + lvl_pages - 1,
> +					       level + 1);
> +
> +			for_each_domain_iommu(i, domain)
> +				iommu_flush_iotlb_psi(g_iommus[i], domain,
> +						      start_pfn, lvl_pages,
> +						      0, 0);
> +		}
> +
> +		pte++;
> +		start_pfn += lvl_pages;
> +		if (first_pte_in_page(pte))
> +			pte = NULL;
> +	}
> +}
> +
>   static int
>   __domain_mapping(struct dmar_domain *domain, unsigned long iov_pfn,
>   		 unsigned long phys_pfn, unsigned long nr_pages, int prot)
> @@ -2329,22 +2364,11 @@ static inline int hardware_largepage_caps(struct dmar_domain *domain,
>   				return -ENOMEM;
>   			/* It is large page*/
>   			if (largepage_lvl > 1) {
> -				unsigned long nr_superpages, end_pfn;
> +				unsigned long end_pfn;
>   
>   				pteval |= DMA_PTE_LARGE_PAGE;
> -				lvl_pages = lvl_to_nr_pages(largepage_lvl);
> -
> -				nr_superpages = nr_pages / lvl_pages;
> -				end_pfn = iov_pfn + nr_superpages * lvl_pages - 1;
> -
> -				/*
> -				 * Ensure that old small page tables are
> -				 * removed to make room for superpage(s).
> -				 * We're adding new large pages, so make sure
> -				 * we don't remove their parent tables.
> -				 */
> -				dma_pte_free_pagetable(domain, iov_pfn, end_pfn,
> -						       largepage_lvl + 1);
> +				end_pfn = ((iov_pfn + nr_pages) & level_mask(largepage_lvl)) - 1;
> +				switch_to_super_page(domain, iov_pfn, end_pfn, largepage_lvl);
>   			} else {
>   				pteval &= ~(uint64_t)DMA_PTE_LARGE_PAGE;
>   			}
> 

Thank you!

Acked-by: Lu Baolu <baolu.lu@linux.intel.com>

Best regards,
baolu

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH v2] iommu/vt-d: Force to flush iotlb before creating superpage
  2021-04-15  0:46 [PATCH v2] iommu/vt-d: Force to flush iotlb before creating superpage Longpeng(Mike)
  2021-04-15  5:49 ` Lu Baolu
@ 2021-04-15 14:13 ` Joerg Roedel
  2021-04-30 22:50   ` Nadav Amit
  1 sibling, 1 reply; 4+ messages in thread
From: Joerg Roedel @ 2021-04-15 14:13 UTC (permalink / raw)
  To: Longpeng(Mike)
  Cc: iommu, linux-kernel, David Woodhouse, Lu Baolu, Nadav Amit,
	Alex Williamson, Kevin Tian, Gonglei, stable

On Thu, Apr 15, 2021 at 08:46:28AM +0800, Longpeng(Mike) wrote:
> Fixes: 6491d4d02893 ("intel-iommu: Free old page tables before creating superpage")
> Cc: <stable@vger.kernel.org> # v3.0+
> Link: https://lore.kernel.org/linux-iommu/670baaf8-4ff8-4e84-4be3-030b95ab5a5e@huawei.com/
> Suggested-by: Lu Baolu <baolu.lu@linux.intel.com>
> Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
> ---
> v1 -> v2:
>   - add Joerg
>   - reconstruct the solution base on the Baolu's suggestion
> ---
>  drivers/iommu/intel/iommu.c | 52 +++++++++++++++++++++++++++++++++------------
>  1 file changed, 38 insertions(+), 14 deletions(-)

Applied, thanks.


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH v2] iommu/vt-d: Force to flush iotlb before creating superpage
  2021-04-15 14:13 ` Joerg Roedel
@ 2021-04-30 22:50   ` Nadav Amit
  0 siblings, 0 replies; 4+ messages in thread
From: Nadav Amit @ 2021-04-30 22:50 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Longpeng(Mike),
	iommu, Linux Kernel Mailing List, David Woodhouse, Lu Baolu,
	Alex Williamson, Kevin Tian, Gonglei, stable

[-- Attachment #1: Type: text/plain, Size: 859 bytes --]



> On Apr 15, 2021, at 7:13 AM, Joerg Roedel <joro@8bytes.org> wrote:
> 
> On Thu, Apr 15, 2021 at 08:46:28AM +0800, Longpeng(Mike) wrote:
>> Fixes: 6491d4d02893 ("intel-iommu: Free old page tables before creating superpage")
>> Cc: <stable@vger.kernel.org> # v3.0+
>> Link: https://lore.kernel.org/linux-iommu/670baaf8-4ff8-4e84-4be3-030b95ab5a5e@huawei.com/
>> Suggested-by: Lu Baolu <baolu.lu@linux.intel.com>
>> Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
>> ---
>> v1 -> v2:
>>  - add Joerg
>>  - reconstruct the solution base on the Baolu's suggestion
>> ---
>> drivers/iommu/intel/iommu.c | 52 +++++++++++++++++++++++++++++++++------------
>> 1 file changed, 38 insertions(+), 14 deletions(-)
> 
> Applied, thanks.
> 

Err.. There is a bug in my patch, and some other problem. I will
investigate and get back to you.


[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2021-04-30 22:50 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-15  0:46 [PATCH v2] iommu/vt-d: Force to flush iotlb before creating superpage Longpeng(Mike)
2021-04-15  5:49 ` Lu Baolu
2021-04-15 14:13 ` Joerg Roedel
2021-04-30 22:50   ` Nadav Amit

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).