* [PATCH v5 0/2] kvm/arm64: Try stage2 block mapping for host device MMIO
@ 2021-05-07 11:03 Keqian Zhu
  2021-05-07 11:03 ` [PATCH v5 1/2] kvm/arm64: Remove the creation time's mapping of MMIO regions Keqian Zhu
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Keqian Zhu @ 2021-05-07 11:03 UTC (permalink / raw)
  To: linux-arm-kernel, kvm, kvmarm, Marc Zyngier; +Cc: wanghaibin.wang

Hi Marc,

This rebases to the newest mainline kernel.

Thanks,
Keqian


We have two paths to build stage2 mappings for MMIO regions: the
creation-time path and the stage2 fault path.

Patch #1 removes the creation-time mapping of MMIO regions.
Patch #2 tries stage2 block mapping for host device MMIO on the fault path.

Changelog:

v5:
 - Rebase to newest mainline.

v4:
 - Use get_vma_page_shift() to handle all cases. (Marc)
 - Get rid of force_pte for device mappings. (Marc)

v3:
 - No need to check the memslot boundary in device_rough_page_shift(). (Marc)


Keqian Zhu (2):
  kvm/arm64: Remove the creation time's mapping of MMIO regions
  kvm/arm64: Try stage2 block mapping for host device MMIO

 arch/arm64/kvm/mmu.c | 99 ++++++++++++++++++++++++--------------------
 1 file changed, 54 insertions(+), 45 deletions(-)

-- 
2.19.1



* [PATCH v5 1/2] kvm/arm64: Remove the creation time's mapping of MMIO regions
  2021-05-07 11:03 [PATCH v5 0/2] kvm/arm64: Try stage2 block mapping for host device MMIO Keqian Zhu
@ 2021-05-07 11:03 ` Keqian Zhu
  2021-05-07 11:03 ` [PATCH v5 2/2] kvm/arm64: Try stage2 block mapping for host device MMIO Keqian Zhu
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Keqian Zhu @ 2021-05-07 11:03 UTC (permalink / raw)
  To: linux-arm-kernel, kvm, kvmarm, Marc Zyngier; +Cc: wanghaibin.wang

MMIO regions may be unmapped for many reasons and can be remapped by
the stage2 fault path. Mapping MMIO regions at creation time is
therefore only a minor optimization, and it makes the two mapping
paths hard to keep in sync.

Remove the mapping code while keeping the useful sanity check.

Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
---
 arch/arm64/kvm/mmu.c | 38 +++-----------------------------------
 1 file changed, 3 insertions(+), 35 deletions(-)
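
For readers following the hunks below, the VMA walk that remains in
kvm_arch_prepare_memory_region() after this patch looks roughly like
this (a sketch reconstructed from the diff, not a literal copy of the
resulting file):

	do {
		struct vm_area_struct *vma;

		/* Find the next VMA that overlaps [hva, reg_end) */
		vma = find_vma_intersection(current->mm, hva, reg_end);
		if (!vma)
			break;

		/*
		 * Only the sanity check is kept: dirty page logging is
		 * not allowed on IO (VM_PFNMAP) regions.
		 */
		if (vma->vm_flags & VM_PFNMAP) {
			if (memslot->flags & KVM_MEM_LOG_DIRTY_PAGES) {
				ret = -EINVAL;
				break;
			}
		}
		hva = min(reg_end, vma->vm_end);
	} while (hva < reg_end);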

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index c5d1f3c87dbd..16efd47439a6 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1346,7 +1346,6 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
 {
 	hva_t hva = mem->userspace_addr;
 	hva_t reg_end = hva + mem->memory_size;
-	bool writable = !(mem->flags & KVM_MEM_READONLY);
 	int ret = 0;
 
 	if (change != KVM_MR_CREATE && change != KVM_MR_MOVE &&
@@ -1363,8 +1362,7 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
 	mmap_read_lock(current->mm);
 	/*
 	 * A memory region could potentially cover multiple VMAs, and any holes
-	 * between them, so iterate over all of them to find out if we can map
-	 * any of them right now.
+	 * between them, so iterate over all of them.
 	 *
 	 *     +--------------------------------------------+
 	 * +---------------+----------------+   +----------------+
@@ -1375,51 +1373,21 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
 	 */
 	do {
 		struct vm_area_struct *vma;
-		hva_t vm_start, vm_end;
 
 		vma = find_vma_intersection(current->mm, hva, reg_end);
 		if (!vma)
 			break;
 
-		/*
-		 * Take the intersection of this VMA with the memory region
-		 */
-		vm_start = max(hva, vma->vm_start);
-		vm_end = min(reg_end, vma->vm_end);
-
 		if (vma->vm_flags & VM_PFNMAP) {
-			gpa_t gpa = mem->guest_phys_addr +
-				    (vm_start - mem->userspace_addr);
-			phys_addr_t pa;
-
-			pa = (phys_addr_t)vma->vm_pgoff << PAGE_SHIFT;
-			pa += vm_start - vma->vm_start;
-
 			/* IO region dirty page logging not allowed */
 			if (memslot->flags & KVM_MEM_LOG_DIRTY_PAGES) {
 				ret = -EINVAL;
-				goto out;
-			}
-
-			ret = kvm_phys_addr_ioremap(kvm, gpa, pa,
-						    vm_end - vm_start,
-						    writable);
-			if (ret)
 				break;
+			}
 		}
-		hva = vm_end;
+		hva = min(reg_end, vma->vm_end);
 	} while (hva < reg_end);
 
-	if (change == KVM_MR_FLAGS_ONLY)
-		goto out;
-
-	spin_lock(&kvm->mmu_lock);
-	if (ret)
-		unmap_stage2_range(&kvm->arch.mmu, mem->guest_phys_addr, mem->memory_size);
-	else if (!cpus_have_final_cap(ARM64_HAS_STAGE2_FWB))
-		stage2_flush_memslot(kvm, memslot);
-	spin_unlock(&kvm->mmu_lock);
-out:
 	mmap_read_unlock(current->mm);
 	return ret;
 }
-- 
2.19.1



* [PATCH v5 2/2] kvm/arm64: Try stage2 block mapping for host device MMIO
  2021-05-07 11:03 [PATCH v5 0/2] kvm/arm64: Try stage2 block mapping for host device MMIO Keqian Zhu
  2021-05-07 11:03 ` [PATCH v5 1/2] kvm/arm64: Remove the creation time's mapping of MMIO regions Keqian Zhu
@ 2021-05-07 11:03 ` Keqian Zhu
  2021-06-01  1:16 ` [PATCH v5 0/2] " Keqian Zhu
  2021-06-01 11:07 ` Marc Zyngier
  3 siblings, 0 replies; 5+ messages in thread
From: Keqian Zhu @ 2021-05-07 11:03 UTC (permalink / raw)
  To: linux-arm-kernel, kvm, kvmarm, Marc Zyngier; +Cc: wanghaibin.wang

The MMIO region of a device may be huge (GB level), so try to use
block mapping at stage2 to speed up both map and unmap.

Compared to normal memory mapping, we have to consider two more
points when trying block mapping for an MMIO region:

1. For normal memory mapping, the PA (host physical address) and the
HVA have the same alignment within PUD_SIZE or PMD_SIZE when we use
the HVA to request a hugepage, so we don't need to consider PA
alignment when verifying a block mapping. But for device memory
mapping, the PA and HVA may have different alignments (a worked
example follows the diffstat below).

2. For normal memory mapping, we are sure a hugepage properly fits
into the vma, so we don't check whether the mapping size exceeds the
vma boundary. But for device memory mapping, we have to pay attention
to this.

This adds get_vma_page_shift() to get the page shift for both normal
memory and device MMIO regions, and checks these two points when
selecting the block mapping size for an MMIO region.

Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
---
 arch/arm64/kvm/mmu.c | 61 ++++++++++++++++++++++++++++++++++++--------
 1 file changed, 51 insertions(+), 10 deletions(-)
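
As a minimal illustration of the alignment rule from point 1, here is
a standalone userspace sketch (the map_size() helper and all address
values are made up for this demo, and PMD_SIZE assumes 2MiB blocks):

	#include <stdio.h>

	#define PMD_SIZE	(2UL << 20)	/* assume 2MiB PMD blocks */

	/*
	 * A PMD block mapping requires the HVA and the PA to share the
	 * same offset within PMD_SIZE; otherwise only PTEs can be used.
	 */
	static const char *map_size(unsigned long hva, unsigned long pa)
	{
		return (hva & (PMD_SIZE - 1)) == (pa & (PMD_SIZE - 1)) ?
		       "PMD block" : "PTE only";
	}

	int main(void)
	{
		unsigned long hva = 0x40000000UL;	/* 2MiB-aligned HVA */

		/* Same offset within 2MiB: block mapping is possible */
		printf("%s\n", map_size(hva, 0x80000000UL));
		/* PA shifted by 4KiB: block mapping is impossible */
		printf("%s\n", map_size(hva, 0x80001000UL));
		return 0;
	}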

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 16efd47439a6..2a3d3b760550 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -822,6 +822,35 @@ transparent_hugepage_adjust(struct kvm_memory_slot *memslot,
 	return PAGE_SIZE;
 }
 
+static int get_vma_page_shift(struct vm_area_struct *vma, unsigned long hva)
+{
+	unsigned long pa;
+
+	if (is_vm_hugetlb_page(vma) && !(vma->vm_flags & VM_PFNMAP))
+		return huge_page_shift(hstate_vma(vma));
+
+	if (!(vma->vm_flags & VM_PFNMAP))
+		return PAGE_SHIFT;
+
+	VM_BUG_ON(is_vm_hugetlb_page(vma));
+
+	pa = (vma->vm_pgoff << PAGE_SHIFT) + (hva - vma->vm_start);
+
+#ifndef __PAGETABLE_PMD_FOLDED
+	if ((hva & (PUD_SIZE - 1)) == (pa & (PUD_SIZE - 1)) &&
+	    ALIGN_DOWN(hva, PUD_SIZE) >= vma->vm_start &&
+	    ALIGN(hva, PUD_SIZE) <= vma->vm_end)
+		return PUD_SHIFT;
+#endif
+
+	if ((hva & (PMD_SIZE - 1)) == (pa & (PMD_SIZE - 1)) &&
+	    ALIGN_DOWN(hva, PMD_SIZE) >= vma->vm_start &&
+	    ALIGN(hva, PMD_SIZE) <= vma->vm_end)
+		return PMD_SHIFT;
+
+	return PAGE_SHIFT;
+}
+
 static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 			  struct kvm_memory_slot *memslot, unsigned long hva,
 			  unsigned long fault_status)
@@ -853,7 +882,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		return -EFAULT;
 	}
 
-	/* Let's check if we will get back a huge page backed by hugetlbfs */
+	/*
+	 * Let's check if we will get back a huge page backed by hugetlbfs, or
+	 * get block mapping for device MMIO region.
+	 */
 	mmap_read_lock(current->mm);
 	vma = find_vma_intersection(current->mm, hva, hva + 1);
 	if (unlikely(!vma)) {
@@ -862,15 +894,15 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		return -EFAULT;
 	}
 
-	if (is_vm_hugetlb_page(vma))
-		vma_shift = huge_page_shift(hstate_vma(vma));
-	else
-		vma_shift = PAGE_SHIFT;
-
-	if (logging_active ||
-	    (vma->vm_flags & VM_PFNMAP)) {
+	/*
+	 * logging_active is guaranteed to never be true for VM_PFNMAP
+	 * memslots.
+	 */
+	if (logging_active) {
 		force_pte = true;
 		vma_shift = PAGE_SHIFT;
+	} else {
+		vma_shift = get_vma_page_shift(vma, hva);
 	}
 
 	switch (vma_shift) {
@@ -943,8 +975,17 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		return -EFAULT;
 
 	if (kvm_is_device_pfn(pfn)) {
+		/*
+		 * If the page was identified as device early by looking at
+		 * the VMA flags, vma_pagesize is already representing the
+		 * largest quantity we can map.  If instead it was mapped
+		 * via gfn_to_pfn_prot(), vma_pagesize is set to PAGE_SIZE
+		 * and must not be upgraded.
+		 *
+		 * In both cases, we don't let transparent_hugepage_adjust()
+		 * change things at the last minute.
+		 */
 		device = true;
-		force_pte = true;
 	} else if (logging_active && !write_fault) {
 		/*
 		 * Only actually map the page as writable if this was a write
@@ -965,7 +1006,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	 * If we are not forced to use page mapping, check if we are
 	 * backed by a THP and thus use block mapping if possible.
 	 */
-	if (vma_pagesize == PAGE_SIZE && !force_pte)
+	if (vma_pagesize == PAGE_SIZE && !(force_pte || device))
 		vma_pagesize = transparent_hugepage_adjust(memslot, hva,
 							   &pfn, &fault_ipa);
 	if (writable)
-- 
2.19.1



* Re: [PATCH v5 0/2] kvm/arm64: Try stage2 block mapping for host device MMIO
  2021-05-07 11:03 [PATCH v5 0/2] kvm/arm64: Try stage2 block mapping for host device MMIO Keqian Zhu
  2021-05-07 11:03 ` [PATCH v5 1/2] kvm/arm64: Remove the creation time's mapping of MMIO regions Keqian Zhu
  2021-05-07 11:03 ` [PATCH v5 2/2] kvm/arm64: Try stage2 block mapping for host device MMIO Keqian Zhu
@ 2021-06-01  1:16 ` Keqian Zhu
  2021-06-01 11:07 ` Marc Zyngier
  3 siblings, 0 replies; 5+ messages in thread
From: Keqian Zhu @ 2021-06-01  1:16 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: linux-arm-kernel, kvm, kvmarm, wanghaibin.wang

Hi Marc,

A gentle reminder. :)

BRs,
Keqian

On 2021/5/7 19:03, Keqian Zhu wrote:
> Hi Marc,
> 
> This rebases to the newest mainline kernel.
> 
> Thanks,
> Keqian
> 
> [...]


* Re: [PATCH v5 0/2] kvm/arm64: Try stage2 block mapping for host device MMIO
  2021-05-07 11:03 [PATCH v5 0/2] kvm/arm64: Try stage2 block mapping for host device MMIO Keqian Zhu
                   ` (2 preceding siblings ...)
  2021-06-01  1:16 ` [PATCH v5 0/2] " Keqian Zhu
@ 2021-06-01 11:07 ` Marc Zyngier
  3 siblings, 0 replies; 5+ messages in thread
From: Marc Zyngier @ 2021-06-01 11:07 UTC (permalink / raw)
  To: linux-arm-kernel, kvm, kvmarm, Keqian Zhu; +Cc: wanghaibin.wang

On Fri, 7 May 2021 19:03:20 +0800, Keqian Zhu wrote:
> This rebases to the newest mainline kernel.
> 
> Thanks,
> Keqian
> 
> 
> We have two paths to build stage2 mappings for MMIO regions:
> 
> [...]

Applied to next, thanks!

[1/2] kvm/arm64: Remove the creation time's mapping of MMIO regions
      commit: fd6f17bade2147b31198ad00b22d3acf5a398aec
[2/2] kvm/arm64: Try stage2 block mapping for host device MMIO
      commit: 2aa53d68cee6603931f73b28ef6b51ff3fde9397

Cheers,

	M.
-- 
Without deviation from the norm, progress is not possible.



