linux-kernel.vger.kernel.org archive mirror
* [PATCH v2] kvm: arm: Skip stage2 huge mappings for unaligned ipa backed by THP
@ 2019-04-10 15:14 Suzuki K Poulose
  2019-04-10 15:23 ` [PATCH 0/2] kvm: arm: Unify stage2 mapping for THP backed memory Suzuki K Poulose
  0 siblings, 1 reply; 9+ messages in thread
From: Suzuki K Poulose @ 2019-04-10 15:14 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, kvm, kvmarm, julien.thierry, christoffer.dall,
	marc.zyngier, andrew.murray, eric.auger, zhengxiang9, yuzenghui,
	wanghaibin.wang, Suzuki K Poulose

With commit a80868f398554842b14, we no longer ensure that the
THP page is properly aligned in the guest IPA. Skip the stage2
huge mapping for unaligned IPA backed by transparent hugepages.
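
For reference, a simplified sketch of the check performed by the existing
fault_supports_stage2_huge_mapping() helper that is reused below (the
function name and body here are an illustration of the logic, not the
exact kernel code):

	/*
	 * Sketch: can the fault at (hva, ipa) be mapped with a stage2 block
	 * of map_size (PMD_SIZE for THP)?
	 */
	static bool sketch_supports_huge_mapping(struct kvm_memory_slot *memslot,
						 unsigned long hva,
						 unsigned long map_size)
	{
		gpa_t gpa_start = memslot->base_gfn << PAGE_SHIFT;
		hva_t uaddr_start = memslot->userspace_addr;
		hva_t uaddr_end = uaddr_start + memslot->npages * PAGE_SIZE;

		/* The IPA and the HVA must share the same offset within a block. */
		if ((gpa_start & (map_size - 1)) != (uaddr_start & (map_size - 1)))
			return false;

		/* The block containing hva must lie entirely within the memslot. */
		return (hva & ~(map_size - 1)) >= uaddr_start &&
		       (hva & ~(map_size - 1)) + map_size <= uaddr_end;
	}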

Fixes: a80868f398554842b14 ("KVM: arm/arm64: Enforce PTE mappings at stage2 when needed")
Reported-by: Eric Auger <eric.auger@redhat.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Christoffer Dall <christoffer.dall@arm.com>
Cc: Zenghui Yu <yuzenghui@huawei.com>
Cc: Zheng Xiang <zhengxiang9@huawei.com>
Cc: Andrew Murray <andrew.murray@arm.com>
Cc: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>

---
Changes since V1:
 - Zenghui Yu reported that the first version missed the boundary
   check for the end address. Let's reuse the existing helper to
   check it.
---
 virt/kvm/arm/mmu.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 27c9583..a39dcfd 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1781,8 +1781,12 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		 * Only PMD_SIZE transparent hugepages(THP) are
 		 * currently supported. This code will need to be
 		 * updated to support other THP sizes.
+		 *
+		 * Make sure the host VA and the guest IPA are sufficiently
+		 * aligned and that the block is contained within the memslot.
 		 */
-		if (transparent_hugepage_adjust(&pfn, &fault_ipa))
+		if (fault_supports_stage2_huge_mapping(memslot, hva, PMD_SIZE) &&
+		    transparent_hugepage_adjust(&pfn, &fault_ipa))
 			vma_pagesize = PMD_SIZE;
 	}
 
-- 
2.7.4



* [PATCH 0/2] kvm: arm: Unify stage2 mapping for THP backed memory
  2019-04-10 15:14 [PATCH v2] kvm: arm: Skip stage2 huge mappings for unaligned ipa backed by THP Suzuki K Poulose
@ 2019-04-10 15:23 ` Suzuki K Poulose
  2019-04-10 15:23   ` [PATCH 1/2] kvm: arm: Clean up the checking for huge mapping Suzuki K Poulose
  2019-04-10 15:23   ` [PATCH 2/2] kvm: arm: Unify handling THP backed host memory Suzuki K Poulose
  0 siblings, 2 replies; 9+ messages in thread
From: Suzuki K Poulose @ 2019-04-10 15:23 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, kvm, kvmarm, julien.thierry, christoffer.dall,
	marc.zyngier, andrew.murray, eric.auger, zhengxiang9, yuzenghui,
	wanghaibin.wang, Suzuki K Poulose


This series cleans up the handling of the stage2 huge mapping for THP.
Applies on top of [1].


[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2019-April/645324.html

Suzuki K Poulose (2):
  kvm: arm: Clean up the checking for huge mapping
  kvm: arm: Unify handling THP backed host memory

 virt/kvm/arm/mmu.c | 127 ++++++++++++++++++++++++++++-------------------------
 1 file changed, 66 insertions(+), 61 deletions(-)

-- 
2.7.4



* [PATCH 1/2] kvm: arm: Clean up the checking for huge mapping
  2019-04-10 15:23 ` [PATCH 0/2] kvm: arm: Unify stage2 mapping for THP backed memory Suzuki K Poulose
@ 2019-04-10 15:23   ` Suzuki K Poulose
  2019-04-11  1:48     ` Zenghui Yu
  2019-04-10 15:23   ` [PATCH 2/2] kvm: arm: Unify handling THP backed host memory Suzuki K Poulose
  1 sibling, 1 reply; 9+ messages in thread
From: Suzuki K Poulose @ 2019-04-10 15:23 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, kvm, kvmarm, julien.thierry, christoffer.dall,
	marc.zyngier, andrew.murray, eric.auger, zhengxiang9, yuzenghui,
	wanghaibin.wang, Suzuki K Poulose

If we are checking whether the stage2 can map PAGE_SIZE,
we don't have to do the boundary checks as both the host
VMA and the guest memslots are page aligned. Bail out
early in that case.
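
A minimal sketch of why the PAGE_SIZE case is trivially safe, paraphrasing
the generic memslot validation (illustrative, not verbatim kernel code):
userspace cannot register a memslot whose size, guest physical address or
userspace address is not PAGE_SIZE aligned, and host VMAs are always page
aligned by construction.

	/* Paraphrased memslot sanity checks: everything is PAGE_SIZE aligned. */
	if (mem->memory_size & (PAGE_SIZE - 1))
		return -EINVAL;
	if (mem->guest_phys_addr & (PAGE_SIZE - 1))
		return -EINVAL;
	if (mem->userspace_addr & (PAGE_SIZE - 1))
		return -EINVAL;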

Cc: Christoffer Dall <christoffer.dall@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 virt/kvm/arm/mmu.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index a39dcfd..6d73322 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1624,6 +1624,10 @@ static bool fault_supports_stage2_huge_mapping(struct kvm_memory_slot *memslot,
 	hva_t uaddr_start, uaddr_end;
 	size_t size;
 
+	/* The memslot and the VMA are guaranteed to be aligned to PAGE_SIZE */
+	if (map_size == PAGE_SIZE)
+		return true;
+
 	size = memslot->npages * PAGE_SIZE;
 
 	gpa_start = memslot->base_gfn << PAGE_SHIFT;
-- 
2.7.4



* [PATCH 2/2] kvm: arm: Unify handling THP backed host memory
  2019-04-10 15:23 ` [PATCH 0/2] kvm: arm: Unify stage2 mapping for THP backed memory Suzuki K Poulose
  2019-04-10 15:23   ` [PATCH 1/2] kvm: arm: Clean up the checking for huge mapping Suzuki K Poulose
@ 2019-04-10 15:23   ` Suzuki K Poulose
  2019-04-11  1:59     ` Zenghui Yu
  1 sibling, 1 reply; 9+ messages in thread
From: Suzuki K Poulose @ 2019-04-10 15:23 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, kvm, kvmarm, julien.thierry, christoffer.dall,
	marc.zyngier, andrew.murray, eric.auger, zhengxiang9, yuzenghui,
	wanghaibin.wang, Suzuki K Poulose

We support mapping host memory backed by PMD transparent hugepages
at stage2 as huge pages. However the checks are now spread across
two different places. Let us unify the handling of the THPs to
keep the code cleaner (and future proof for PUD THP support).
This patch moves transparent_hugepage_adjust() closer to the caller
to avoid a forward declaration for fault_supports_stage2_huge_mapping().

Also, since we already handle the case where the host VA and the guest
PA may not be aligned, the explicit VM_BUG_ON() is not required.
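
As a worked example of the adjustment performed for a THP-backed fault
(editorial illustration; the values are hypothetical, assuming 4K pages,
PMD_SIZE == 2M and PTRS_PER_PMD == 512):

	/* A fault on a tail page of a 2M THP is realigned to the head. */
	kvm_pfn_t   pfn = 0x12345;	/* tail pfn, offset 0x145 in its 2M block */
	phys_addr_t ipa = 0x40345000;	/* gfn 0x40345, same offset 0x145 */

	pfn &= ~(PTRS_PER_PMD - 1);	/* head pfn: 0x12200 */
	ipa &= PMD_MASK;		/* block-aligned IPA: 0x40200000 */

The matching 0x145 offset on both sides is exactly what
fault_supports_stage2_huge_mapping() guarantees, so the resulting stage2 PMD
covers the whole huge page.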

Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Christoffer Dall <christoffer.dall@arm.com>
Cc: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 virt/kvm/arm/mmu.c | 123 +++++++++++++++++++++++++++--------------------------
 1 file changed, 62 insertions(+), 61 deletions(-)

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 6d73322..714eec2 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1380,53 +1380,6 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa,
 	return ret;
 }
 
-static bool transparent_hugepage_adjust(kvm_pfn_t *pfnp, phys_addr_t *ipap)
-{
-	kvm_pfn_t pfn = *pfnp;
-	gfn_t gfn = *ipap >> PAGE_SHIFT;
-	struct page *page = pfn_to_page(pfn);
-
-	/*
-	 * PageTransCompoundMap() returns true for THP and
-	 * hugetlbfs. Make sure the adjustment is done only for THP
-	 * pages.
-	 */
-	if (!PageHuge(page) && PageTransCompoundMap(page)) {
-		unsigned long mask;
-		/*
-		 * The address we faulted on is backed by a transparent huge
-		 * page.  However, because we map the compound huge page and
-		 * not the individual tail page, we need to transfer the
-		 * refcount to the head page.  We have to be careful that the
-		 * THP doesn't start to split while we are adjusting the
-		 * refcounts.
-		 *
-		 * We are sure this doesn't happen, because mmu_notifier_retry
-		 * was successful and we are holding the mmu_lock, so if this
-		 * THP is trying to split, it will be blocked in the mmu
-		 * notifier before touching any of the pages, specifically
-		 * before being able to call __split_huge_page_refcount().
-		 *
-		 * We can therefore safely transfer the refcount from PG_tail
-		 * to PG_head and switch the pfn from a tail page to the head
-		 * page accordingly.
-		 */
-		mask = PTRS_PER_PMD - 1;
-		VM_BUG_ON((gfn & mask) != (pfn & mask));
-		if (pfn & mask) {
-			*ipap &= PMD_MASK;
-			kvm_release_pfn_clean(pfn);
-			pfn &= ~mask;
-			kvm_get_pfn(pfn);
-			*pfnp = pfn;
-		}
-
-		return true;
-	}
-
-	return false;
-}
-
 /**
  * stage2_wp_ptes - write protect PMD range
  * @pmd:	pointer to pmd entry
@@ -1677,6 +1630,61 @@ static bool fault_supports_stage2_huge_mapping(struct kvm_memory_slot *memslot,
 	       (hva & ~(map_size - 1)) + map_size <= uaddr_end;
 }
 
+/*
+ * Check if the given hva is backed by a transparent huge page (THP)
+ * and whether it can be mapped using block mapping in stage2. If so, adjust
+ * the stage2 PFN and IPA accordingly. Only PMD_SIZE THPs are currently
+ * supported. This will need to be updated to support other THP sizes.
+ *
+ * Returns the size of the mapping.
+ */
+static unsigned long
+transparent_hugepage_adjust(struct kvm_memory_slot *memslot,
+			    unsigned long hva, kvm_pfn_t *pfnp,
+			    phys_addr_t *ipap)
+{
+	kvm_pfn_t pfn = *pfnp;
+	struct page *page = pfn_to_page(pfn);
+
+	/*
+	 * PageTransCompoundMap() returns true for THP and
+	 * hugetlbfs. Make sure the adjustment is done only for THP
+	 * pages. Also make sure that the HVA and IPA are sufficiently
+	 * aligned and that the  block map is contained within the memslot.
+	 */
+	if (!PageHuge(page) && PageTransCompoundMap(page) &&
+	    fault_supports_stage2_huge_mapping(memslot, hva, PMD_SIZE)) {
+		/*
+		 * The address we faulted on is backed by a transparent huge
+		 * page.  However, because we map the compound huge page and
+		 * not the individual tail page, we need to transfer the
+		 * refcount to the head page.  We have to be careful that the
+		 * THP doesn't start to split while we are adjusting the
+		 * refcounts.
+		 *
+		 * We are sure this doesn't happen, because mmu_notifier_retry
+		 * was successful and we are holding the mmu_lock, so if this
+		 * THP is trying to split, it will be blocked in the mmu
+		 * notifier before touching any of the pages, specifically
+		 * before being able to call __split_huge_page_refcount().
+		 *
+		 * We can therefore safely transfer the refcount from PG_tail
+		 * to PG_head and switch the pfn from a tail page to the head
+		 * page accordingly.
+		 */
+		*ipap &= PMD_MASK;
+		kvm_release_pfn_clean(pfn);
+		pfn &= ~(PTRS_PER_PMD - 1);
+		kvm_get_pfn(pfn);
+		*pfnp = pfn;
+
+		return PMD_SIZE;
+	}
+
+	/* Use page mapping if we cannot use block mapping */
+	return PAGE_SIZE;
+}
+
 static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 			  struct kvm_memory_slot *memslot, unsigned long hva,
 			  unsigned long fault_status)
@@ -1780,20 +1788,13 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	if (mmu_notifier_retry(kvm, mmu_seq))
 		goto out_unlock;
 
-	if (vma_pagesize == PAGE_SIZE && !force_pte) {
-		/*
-		 * Only PMD_SIZE transparent hugepages(THP) are
-		 * currently supported. This code will need to be
-		 * updated to support other THP sizes.
-		 *
-		 * Make sure the host VA and the guest IPA are sufficiently
-		 * aligned and that the block is contained within the memslot.
-		 */
-		if (fault_supports_stage2_huge_mapping(memslot, hva, PMD_SIZE) &&
-		    transparent_hugepage_adjust(&pfn, &fault_ipa))
-			vma_pagesize = PMD_SIZE;
-	}
-
+	/*
+	 * If we are not forced to use page mapping, check if we are
+	 * backed by a THP and thus use block mapping if possible.
+	 */
+	if (vma_pagesize == PAGE_SIZE && !force_pte)
+		vma_pagesize = transparent_hugepage_adjust(memslot, hva,
+							   &pfn, &fault_ipa);
 	if (writable)
 		kvm_set_pfn_dirty(pfn);
 
-- 
2.7.4



* Re: [PATCH 1/2] kvm: arm: Clean up the checking for huge mapping
  2019-04-10 15:23   ` [PATCH 1/2] kvm: arm: Clean up the checking for huge mapping Suzuki K Poulose
@ 2019-04-11  1:48     ` Zenghui Yu
  2019-04-11  9:47       ` Suzuki K Poulose
  0 siblings, 1 reply; 9+ messages in thread
From: Zenghui Yu @ 2019-04-11  1:48 UTC (permalink / raw)
  To: Suzuki K Poulose, linux-arm-kernel
  Cc: linux-kernel, kvm, kvmarm, julien.thierry, christoffer.dall,
	marc.zyngier, andrew.murray, eric.auger, zhengxiang9,
	wanghaibin.wang


On 2019/4/10 23:23, Suzuki K Poulose wrote:
> If we are checking whether the stage2 can map PAGE_SIZE,
> we don't have to do the boundary checks as both the host
> VMA and the guest memslots are page aligned. Bail out
> early in that case.
> 
> Cc: Christoffer Dall <christoffer.dall@arm.com>
> Cc: Marc Zyngier <marc.zyngier@arm.com>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
>   virt/kvm/arm/mmu.c | 4 ++++
>   1 file changed, 4 insertions(+)
> 
> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> index a39dcfd..6d73322 100644
> --- a/virt/kvm/arm/mmu.c
> +++ b/virt/kvm/arm/mmu.c
> @@ -1624,6 +1624,10 @@ static bool fault_supports_stage2_huge_mapping(struct kvm_memory_slot *memslot,
>   	hva_t uaddr_start, uaddr_end;
>   	size_t size;
>   
> +	/* The memslot and the VMA are guaranteed to be aligned to PAGE_SIZE */
> +	if (map_size == PAGE_SIZE)
> +		return true;
> +
>   	size = memslot->npages * PAGE_SIZE;
>   
>   	gpa_start = memslot->base_gfn << PAGE_SHIFT;
> 
We can do a comment clean up as well in this patch.

s/<< PAGE_SIZE/<< PAGE_SHIFT/


thanks,
zenghui



* Re: [PATCH 2/2] kvm: arm: Unify handling THP backed host memory
  2019-04-10 15:23   ` [PATCH 2/2] kvm: arm: Unify handling THP backed host memory Suzuki K Poulose
@ 2019-04-11  1:59     ` Zenghui Yu
  2019-04-11 15:16       ` Suzuki K Poulose
  0 siblings, 1 reply; 9+ messages in thread
From: Zenghui Yu @ 2019-04-11  1:59 UTC (permalink / raw)
  To: Suzuki K Poulose, linux-arm-kernel
  Cc: linux-kernel, kvm, kvmarm, julien.thierry, christoffer.dall,
	marc.zyngier, andrew.murray, eric.auger, zhengxiang9,
	wanghaibin.wang

Hi Suzuki,

On 2019/4/10 23:23, Suzuki K Poulose wrote:
> We support mapping host memory backed by PMD transparent hugepages
> at stage2 as huge pages. However the checks are now spread across
> two different places. Let us unify the handling of the THPs to
> keep the code cleaner (and future proof for PUD THP support).
> This patch moves transparent_hugepage_adjust() closer to the caller
> to avoid a forward declaration for fault_supports_stage2_huge_mapping().
> 
> Also, since we already handle the case where the host VA and the guest
> PA may not be aligned, the explicit VM_BUG_ON() is not required.
> 
> Cc: Marc Zyngier <marc.zyngier@arm.com>
> Cc: Christoffer Dall <christoffer.dall@arm.com>
> Cc: Zenghui Yu <yuzenghui@huawei.com>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
>   virt/kvm/arm/mmu.c | 123 +++++++++++++++++++++++++++--------------------------
>   1 file changed, 62 insertions(+), 61 deletions(-)
> 
> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> index 6d73322..714eec2 100644
> --- a/virt/kvm/arm/mmu.c
> +++ b/virt/kvm/arm/mmu.c
> @@ -1380,53 +1380,6 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa,
>   	return ret;
>   }
>   
> -static bool transparent_hugepage_adjust(kvm_pfn_t *pfnp, phys_addr_t *ipap)
> -{
> -	kvm_pfn_t pfn = *pfnp;
> -	gfn_t gfn = *ipap >> PAGE_SHIFT;
> -	struct page *page = pfn_to_page(pfn);
> -
> -	/*
> -	 * PageTransCompoundMap() returns true for THP and
> -	 * hugetlbfs. Make sure the adjustment is done only for THP
> -	 * pages.
> -	 */
> -	if (!PageHuge(page) && PageTransCompoundMap(page)) {
> -		unsigned long mask;
> -		/*
> -		 * The address we faulted on is backed by a transparent huge
> -		 * page.  However, because we map the compound huge page and
> -		 * not the individual tail page, we need to transfer the
> -		 * refcount to the head page.  We have to be careful that the
> -		 * THP doesn't start to split while we are adjusting the
> -		 * refcounts.
> -		 *
> -		 * We are sure this doesn't happen, because mmu_notifier_retry
> -		 * was successful and we are holding the mmu_lock, so if this
> -		 * THP is trying to split, it will be blocked in the mmu
> -		 * notifier before touching any of the pages, specifically
> -		 * before being able to call __split_huge_page_refcount().
> -		 *
> -		 * We can therefore safely transfer the refcount from PG_tail
> -		 * to PG_head and switch the pfn from a tail page to the head
> -		 * page accordingly.
> -		 */
> -		mask = PTRS_PER_PMD - 1;
> -		VM_BUG_ON((gfn & mask) != (pfn & mask));
> -		if (pfn & mask) {
> -			*ipap &= PMD_MASK;
> -			kvm_release_pfn_clean(pfn);
> -			pfn &= ~mask;
> -			kvm_get_pfn(pfn);
> -			*pfnp = pfn;
> -		}
> -
> -		return true;
> -	}
> -
> -	return false;
> -}
> -
>   /**
>    * stage2_wp_ptes - write protect PMD range
>    * @pmd:	pointer to pmd entry
> @@ -1677,6 +1630,61 @@ static bool fault_supports_stage2_huge_mapping(struct kvm_memory_slot *memslot,
>   	       (hva & ~(map_size - 1)) + map_size <= uaddr_end;
>   }
>   
> +/*
> + * Check if the given hva is backed by a transparent huge page (THP)
> + * and whether it can be mapped using block mapping in stage2. If so, adjust
> + * the stage2 PFN and IPA accordingly. Only PMD_SIZE THPs are currently
> + * supported. This will need to be updated to support other THP sizes.
> + *
> + * Returns the size of the mapping.
> + */
> +static unsigned long
> +transparent_hugepage_adjust(struct kvm_memory_slot *memslot,
> +			    unsigned long hva, kvm_pfn_t *pfnp,
> +			    phys_addr_t *ipap)
> +{
> +	kvm_pfn_t pfn = *pfnp;
> +	struct page *page = pfn_to_page(pfn);
> +
> +	/*
> +	 * PageTransCompoundMap() returns true for THP and
> +	 * hugetlbfs. Make sure the adjustment is done only for THP
> +	 * pages. Also make sure that the HVA and IPA are sufficiently
> +	 * aligned and that the  block map is contained within the memslot.
> +	 */
> +	if (!PageHuge(page) && PageTransCompoundMap(page) &&

Since we have managed to get here, we are guaranteed to be dealing only
with normal-size pages and no hugetlbfs pages will be involved.
"!PageHuge(page)" will therefore always return true, so we can drop it.

> +	    fault_supports_stage2_huge_mapping(memslot, hva, PMD_SIZE)) {
> +		/*
> +		 * The address we faulted on is backed by a transparent huge
> +		 * page.  However, because we map the compound huge page and
> +		 * not the individual tail page, we need to transfer the
> +		 * refcount to the head page.  We have to be careful that the
> +		 * THP doesn't start to split while we are adjusting the
> +		 * refcounts.
> +		 *
> +		 * We are sure this doesn't happen, because mmu_notifier_retry
> +		 * was successful and we are holding the mmu_lock, so if this
> +		 * THP is trying to split, it will be blocked in the mmu
> +		 * notifier before touching any of the pages, specifically
> +		 * before being able to call __split_huge_page_refcount().
> +		 *
> +		 * We can therefore safely transfer the refcount from PG_tail
> +		 * to PG_head and switch the pfn from a tail page to the head
> +		 * page accordingly.
> +		 */
> +		*ipap &= PMD_MASK;
> +		kvm_release_pfn_clean(pfn);
> +		pfn &= ~(PTRS_PER_PMD - 1);
> +		kvm_get_pfn(pfn);
> +		*pfnp = pfn;
> +
> +		return PMD_SIZE;
> +	}
> +
> +	/* Use page mapping if we cannot use block mapping */
> +	return PAGE_SIZE;
> +}
> +
>   static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>   			  struct kvm_memory_slot *memslot, unsigned long hva,
>   			  unsigned long fault_status)
> @@ -1780,20 +1788,13 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>   	if (mmu_notifier_retry(kvm, mmu_seq))
>   		goto out_unlock;
>   
> -	if (vma_pagesize == PAGE_SIZE && !force_pte) {
> -		/*
> -		 * Only PMD_SIZE transparent hugepages(THP) are
> -		 * currently supported. This code will need to be
> -		 * updated to support other THP sizes.
> -		 *
> -		 * Make sure the host VA and the guest IPA are sufficiently
> -		 * aligned and that the block is contained within the memslot.
> -		 */
> -		if (fault_supports_stage2_huge_mapping(memslot, hva, PMD_SIZE) &&
> -		    transparent_hugepage_adjust(&pfn, &fault_ipa))
> -			vma_pagesize = PMD_SIZE;
> -	}
> -
> +	/*
> +	 * If we are not forced to use page mapping, check if we are
> +	 * backed by a THP and thus use block mapping if possible.
> +	 */
> +	if (vma_pagesize == PAGE_SIZE && !force_pte)
> +		vma_pagesize = transparent_hugepage_adjust(memslot, hva,
> +							   &pfn, &fault_ipa);
>   	if (writable)
>   		kvm_set_pfn_dirty(pfn);
>   
thanks,
zenghui



* Re: [PATCH 1/2] kvm: arm: Clean up the checking for huge mapping
  2019-04-11  1:48     ` Zenghui Yu
@ 2019-04-11  9:47       ` Suzuki K Poulose
  0 siblings, 0 replies; 9+ messages in thread
From: Suzuki K Poulose @ 2019-04-11  9:47 UTC (permalink / raw)
  To: yuzenghui, linux-arm-kernel
  Cc: linux-kernel, kvm, kvmarm, julien.thierry, christoffer.dall,
	marc.zyngier, andrew.murray, eric.auger, zhengxiang9,
	wanghaibin.wang

On 04/11/2019 02:48 AM, Zenghui Yu wrote:
> 
> On 2019/4/10 23:23, Suzuki K Poulose wrote:
>> If we are checking whether the stage2 can map PAGE_SIZE,
>> we don't have to do the boundary checks as both the host
>> VMA and the guest memslots are page aligned. Bail out
>> early in that case.
>>
>> Cc: Christoffer Dall <christoffer.dall@arm.com>
>> Cc: Marc Zyngier <marc.zyngier@arm.com>
>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
>> ---
>>   virt/kvm/arm/mmu.c | 4 ++++
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>> index a39dcfd..6d73322 100644
>> --- a/virt/kvm/arm/mmu.c
>> +++ b/virt/kvm/arm/mmu.c
>> @@ -1624,6 +1624,10 @@ static bool 
>> fault_supports_stage2_huge_mapping(struct kvm_memory_slot *memslot,
>>       hva_t uaddr_start, uaddr_end;
>>       size_t size;
>> +    /* The memslot and the VMA are guaranteed to be aligned to 
>> PAGE_SIZE */
>> +    if (map_size == PAGE_SIZE)
>> +        return true;
>> +
>>       size = memslot->npages * PAGE_SIZE;
>>       gpa_start = memslot->base_gfn << PAGE_SHIFT;
>>
> We can do a comment clean up as well in this patch.
> 
> s/<< PAGE_SIZE/<< PAGE_SHIFT/

Sure, I missed that. Will fix it in the next version.

Cheers
Suzuki


* Re: [PATCH 2/2] kvm: arm: Unify handling THP backed host memory
  2019-04-11  1:59     ` Zenghui Yu
@ 2019-04-11 15:16       ` Suzuki K Poulose
  2019-04-12  9:34         ` Zenghui Yu
  0 siblings, 1 reply; 9+ messages in thread
From: Suzuki K Poulose @ 2019-04-11 15:16 UTC (permalink / raw)
  To: yuzenghui, linux-arm-kernel
  Cc: linux-kernel, kvm, kvmarm, julien.thierry, christoffer.dall,
	marc.zyngier, andrew.murray, eric.auger, zhengxiang9,
	wanghaibin.wang

Hi Zenghui,

On 11/04/2019 02:59, Zenghui Yu wrote:
> Hi Suzuki,
> 
> On 2019/4/10 23:23, Suzuki K Poulose wrote:
>> We support mapping host memory backed by PMD transparent hugepages
>> at stage2 as huge pages. However the checks are now spread across
>> two different places. Let us unify the handling of the THPs to
>> keep the code cleaner (and future proof for PUD THP support).
>> This patch moves transparent_hugepage_adjust() closer to the caller
>> to avoid a forward declaration for fault_supports_stage2_huge_mapping().
>>
>> Also, since we already handle the case where the host VA and the guest
>> PA may not be aligned, the explicit VM_BUG_ON() is not required.
>>
>> Cc: Marc Zyngier <marc.zyngier@arm.com>
>> Cc: Christoffer Dall <christoffer.dall@arm.com>
>> Cc: Zenghui Yu <yuzenghui@huawei.com>
>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
>> ---
>>    virt/kvm/arm/mmu.c | 123 +++++++++++++++++++++++++++--------------------------
>>    1 file changed, 62 insertions(+), 61 deletions(-)
>>
>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>> index 6d73322..714eec2 100644
>> --- a/virt/kvm/arm/mmu.c
>> +++ b/virt/kvm/arm/mmu.c
>> @@ -1380,53 +1380,6 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa,
>>    	return ret;
>>    }
>>    
>> -static bool transparent_hugepage_adjust(kvm_pfn_t *pfnp, phys_addr_t *ipap)
>> -{
>> -	kvm_pfn_t pfn = *pfnp;
>> -	gfn_t gfn = *ipap >> PAGE_SHIFT;
>> -	struct page *page = pfn_to_page(pfn);
>> -
>> -	/*
>> -	 * PageTransCompoundMap() returns true for THP and
>> -	 * hugetlbfs. Make sure the adjustment is done only for THP
>> -	 * pages.
>> -	 */
>> -	if (!PageHuge(page) && PageTransCompoundMap(page)) {
>> -		unsigned long mask;
>> -		/*
>> -		 * The address we faulted on is backed by a transparent huge
>> -		 * page.  However, because we map the compound huge page and
>> -		 * not the individual tail page, we need to transfer the
>> -		 * refcount to the head page.  We have to be careful that the
>> -		 * THP doesn't start to split while we are adjusting the
>> -		 * refcounts.
>> -		 *
>> -		 * We are sure this doesn't happen, because mmu_notifier_retry
>> -		 * was successful and we are holding the mmu_lock, so if this
>> -		 * THP is trying to split, it will be blocked in the mmu
>> -		 * notifier before touching any of the pages, specifically
>> -		 * before being able to call __split_huge_page_refcount().
>> -		 *
>> -		 * We can therefore safely transfer the refcount from PG_tail
>> -		 * to PG_head and switch the pfn from a tail page to the head
>> -		 * page accordingly.
>> -		 */
>> -		mask = PTRS_PER_PMD - 1;
>> -		VM_BUG_ON((gfn & mask) != (pfn & mask));
>> -		if (pfn & mask) {
>> -			*ipap &= PMD_MASK;
>> -			kvm_release_pfn_clean(pfn);
>> -			pfn &= ~mask;
>> -			kvm_get_pfn(pfn);
>> -			*pfnp = pfn;
>> -		}
>> -
>> -		return true;
>> -	}
>> -
>> -	return false;
>> -}
>> -
>>    /**
>>     * stage2_wp_ptes - write protect PMD range
>>     * @pmd:	pointer to pmd entry
>> @@ -1677,6 +1630,61 @@ static bool fault_supports_stage2_huge_mapping(struct kvm_memory_slot *memslot,
>>    	       (hva & ~(map_size - 1)) + map_size <= uaddr_end;
>>    }
>>    
>> +/*
>> + * Check if the given hva is backed by a transparent huge page (THP)
>> + * and whether it can be mapped using block mapping in stage2. If so, adjust
>> + * the stage2 PFN and IPA accordingly. Only PMD_SIZE THPs are currently
>> + * supported. This will need to be updated to support other THP sizes.
>> + *
>> + * Returns the size of the mapping.
>> + */
>> +static unsigned long
>> +transparent_hugepage_adjust(struct kvm_memory_slot *memslot,
>> +			    unsigned long hva, kvm_pfn_t *pfnp,
>> +			    phys_addr_t *ipap)
>> +{
>> +	kvm_pfn_t pfn = *pfnp;
>> +	struct page *page = pfn_to_page(pfn);
>> +
>> +	/*
>> +	 * PageTransCompoundMap() returns true for THP and
>> +	 * hugetlbfs. Make sure the adjustment is done only for THP
>> +	 * pages. Also make sure that the HVA and IPA are sufficiently
>> +	 * aligned and that the  block map is contained within the memslot.
>> +	 */
>> +	if (!PageHuge(page) && PageTransCompoundMap(page) &&
> 
> Since we have managed to get here, we are guaranteed to be dealing only
> with normal-size pages and no hugetlbfs pages will be involved.
> "!PageHuge(page)" will therefore always return true, so we can drop it.

I think that is a bit tricky. If someone ever modifies user_mem_abort()
and we end up getting called with a HugeTLB-backed page, things could go
wrong.

I could remove the check, but I would like to add a WARN_ON_ONCE() to make
sure our assumption holds.

i.e.,

	WARN_ON_ONCE(PageHuge(page));

	if (PageTransCompoundMap(page) &&
	    fault_supports_stage2_huge_mapping(memslot, hva, PMD_SIZE)) {

...


Cheers
Suzuki


* Re: [PATCH 2/2] kvm: arm: Unify handling THP backed host memory
  2019-04-11 15:16       ` Suzuki K Poulose
@ 2019-04-12  9:34         ` Zenghui Yu
  0 siblings, 0 replies; 9+ messages in thread
From: Zenghui Yu @ 2019-04-12  9:34 UTC (permalink / raw)
  To: Suzuki K Poulose, linux-arm-kernel
  Cc: linux-kernel, kvm, kvmarm, julien.thierry, christoffer.dall,
	marc.zyngier, andrew.murray, eric.auger, zhengxiang9,
	wanghaibin.wang


On 2019/4/11 23:16, Suzuki K Poulose wrote:
> Hi Zenghui,
> 
> On 11/04/2019 02:59, Zenghui Yu wrote:
>> Hi Suzuki,
>>
>> On 2019/4/10 23:23, Suzuki K Poulose wrote:
>>> We support mapping host memory backed by PMD transparent hugepages
>>> at stage2 as huge pages. However the checks are now spread across
>>> two different places. Let us unify the handling of the THPs to
>>> keep the code cleaner (and future proof for PUD THP support).
>>> This patch moves transparent_hugepage_adjust() closer to the caller
>>> to avoid a forward declaration for 
>>> fault_supports_stage2_huge_mapping().
>>>
>>> Also, since we already handle the case where the host VA and the guest
>>> PA may not be aligned, the explicit VM_BUG_ON() is not required.
>>>
>>> Cc: Marc Zyngier <marc.zyngier@arm.com>
>>> Cc: Christoffer Dall <christoffer.dall@arm.com>
>>> Cc: Zenghui Yu <yuzenghui@huawei.com>
>>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
>>> ---
>>>    virt/kvm/arm/mmu.c | 123 
>>> +++++++++++++++++++++++++++--------------------------
>>>    1 file changed, 62 insertions(+), 61 deletions(-)
>>>
>>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>>> index 6d73322..714eec2 100644
>>> --- a/virt/kvm/arm/mmu.c
>>> +++ b/virt/kvm/arm/mmu.c
>>> @@ -1380,53 +1380,6 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, 
>>> phys_addr_t guest_ipa,
>>>        return ret;
>>>    }
>>> -static bool transparent_hugepage_adjust(kvm_pfn_t *pfnp, phys_addr_t 
>>> *ipap)
>>> -{
>>> -    kvm_pfn_t pfn = *pfnp;
>>> -    gfn_t gfn = *ipap >> PAGE_SHIFT;
>>> -    struct page *page = pfn_to_page(pfn);
>>> -
>>> -    /*
>>> -     * PageTransCompoundMap() returns true for THP and
>>> -     * hugetlbfs. Make sure the adjustment is done only for THP
>>> -     * pages.
>>> -     */
>>> -    if (!PageHuge(page) && PageTransCompoundMap(page)) {
>>> -        unsigned long mask;
>>> -        /*
>>> -         * The address we faulted on is backed by a transparent huge
>>> -         * page.  However, because we map the compound huge page and
>>> -         * not the individual tail page, we need to transfer the
>>> -         * refcount to the head page.  We have to be careful that the
>>> -         * THP doesn't start to split while we are adjusting the
>>> -         * refcounts.
>>> -         *
>>> -         * We are sure this doesn't happen, because mmu_notifier_retry
>>> -         * was successful and we are holding the mmu_lock, so if this
>>> -         * THP is trying to split, it will be blocked in the mmu
>>> -         * notifier before touching any of the pages, specifically
>>> -         * before being able to call __split_huge_page_refcount().
>>> -         *
>>> -         * We can therefore safely transfer the refcount from PG_tail
>>> -         * to PG_head and switch the pfn from a tail page to the head
>>> -         * page accordingly.
>>> -         */
>>> -        mask = PTRS_PER_PMD - 1;
>>> -        VM_BUG_ON((gfn & mask) != (pfn & mask));
>>> -        if (pfn & mask) {
>>> -            *ipap &= PMD_MASK;
>>> -            kvm_release_pfn_clean(pfn);
>>> -            pfn &= ~mask;
>>> -            kvm_get_pfn(pfn);
>>> -            *pfnp = pfn;
>>> -        }
>>> -
>>> -        return true;
>>> -    }
>>> -
>>> -    return false;
>>> -}
>>> -
>>>    /**
>>>     * stage2_wp_ptes - write protect PMD range
>>>     * @pmd:    pointer to pmd entry
>>> @@ -1677,6 +1630,61 @@ static bool 
>>> fault_supports_stage2_huge_mapping(struct kvm_memory_slot *memslot,
>>>               (hva & ~(map_size - 1)) + map_size <= uaddr_end;
>>>    }
>>> +/*
>>> + * Check if the given hva is backed by a transparent huge page (THP)
>>> + * and whether it can be mapped using block mapping in stage2. If 
>>> so, adjust
>>> + * the stage2 PFN and IPA accordingly. Only PMD_SIZE THPs are currently
>>> + * supported. This will need to be updated to support other THP sizes.
>>> + *
>>> + * Returns the size of the mapping.
>>> + */
>>> +static unsigned long
>>> +transparent_hugepage_adjust(struct kvm_memory_slot *memslot,
>>> +                unsigned long hva, kvm_pfn_t *pfnp,
>>> +                phys_addr_t *ipap)
>>> +{
>>> +    kvm_pfn_t pfn = *pfnp;
>>> +    struct page *page = pfn_to_page(pfn);
>>> +
>>> +    /*
>>> +     * PageTransCompoundMap() returns true for THP and
>>> +     * hugetlbfs. Make sure the adjustment is done only for THP
>>> +     * pages. Also make sure that the HVA and IPA are sufficiently
>>> +     * aligned and that the  block map is contained within the memslot.
>>> +     */
>>> +    if (!PageHuge(page) && PageTransCompoundMap(page) &&
>>
>> Since we have managed to get here, we are guaranteed to be dealing only
>> with normal-size pages and no hugetlbfs pages will be involved.
>> "!PageHuge(page)" will therefore always return true, so we can drop it.
> 
> I think that is a bit tricky. If someone ever modifies user_mem_abort()
> and we end up getting called with a HugeTLB-backed page, things could go
> wrong.

That will be bad. I'm not sure if it's possible in the future.

> I could remove the check, but I would like to add a WARN_ON_ONCE() to make
> sure our assumption holds.
> 
> i.e.,
>      WARN_ON_ONCE(PageHuge(page));

But this is a careful approach. I think this will be valuable both for
developers and the code itself. Thanks!


zenghui

> 
>      if (PageTransCompoundMap(page) &&
>          fault_supports_stage2_huge_mapping(memslot, hva, PMD_SIZE)) {
> 
> ...
> 




