linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/6] kvm, mem-hotplug: Do not pin ept identity pagetable and apic access page.
@ 2014-07-23 19:42 Tang Chen
  2014-07-23 19:42 ` [PATCH v3 1/6] kvm: Add gfn_to_page_no_pin() to translate gfn to page without pinning Tang Chen
                   ` (5 more replies)
  0 siblings, 6 replies; 9+ messages in thread
From: Tang Chen @ 2014-07-23 19:42 UTC (permalink / raw)
  To: gleb, mtosatti, nadav.amit, jan.kiszka
  Cc: kvm, laijs, isimatu.yasuaki, guz.fnst, linux-kernel, tangchen

ept identity pagetable and apic access page in kvm are pinned in memory.
As a result, they cannot be migrated/hot-removed.

But actually they don't need to be pinned in memory.

[For ept identity page]
Just do not pin it. When it is migrated, guest will be able to find the
new page in the next ept violation.

[For apic access page]
The hpa of apic access page is stored in VMCS APIC_ACCESS_ADDR pointer.
When apic access page is migrated, we update VMCS APIC_ACCESS_ADDR pointer
for each vcpu in addition.

NOTE: Patch 1~5 are tested with -cpu xxx,-x2apic option, and they work well.
      Patch 6 is not tested yet, not sure if it is right.

Change log v2 -> v3:
1. Remove original [PATCH 3/6] since ept_identity_pagetable has been removed
   in new [PATCH 3/6].
2. In [PATCH 3/6], fix the problem that kvm->slots_lock does not protect 
   kvm->arch.ept_identity_pagetable_done checking.
3. In [PATCH 3/6], drop gfn_to_page() since ept_identity_pagetable has been 
   removed.
4. Add new [PATCH 4/6], remove redundant variable in init_rmode_identity_map(), 
   and make it return 0 on success.
5. In [PATCH 5/6], drop put_page(kvm->arch.apic_access_page) from x86.c .
6. In [PATCH 5/6], update kvm->arch.apic_access_page in vcpu_reload_apic_access_page().
7. Add new [PATCH 6/6], reload apic access page in L2->L1 exit.

Change log v1 -> v2:
1. Add [PATCH 4/5] to remove unnecessary kvm_arch->ept_identity_pagetable.
2. In [PATCH 5/5], only introduce KVM_REQ_APIC_PAGE_RELOAD request.
3. In [PATCH 5/5], add set_apic_access_page_addr() for svm.


Tang Chen (6):
  kvm: Add gfn_to_page_no_pin() to translate gfn to page without
    pinning.
  kvm: Use APIC_DEFAULT_PHYS_BASE macro as the apic access page address.
  kvm: Remove ept_identity_pagetable from struct kvm_arch.
  kvm: Make init_rmode_identity_map() return 0 on success.
  kvm, mem-hotplug: Do not pin apic access page in memory.
  kvm, mem-hotplug: Reload L1's apic access page if it is migrated when
    L2 is running.

 arch/x86/include/asm/kvm_host.h |   3 +-
 arch/x86/kvm/svm.c              |  15 +++++-
 arch/x86/kvm/vmx.c              | 108 +++++++++++++++++++++++++++-------------
 arch/x86/kvm/x86.c              |  22 ++++++--
 include/linux/kvm_host.h        |   3 ++
 virt/kvm/kvm_main.c             |  29 ++++++++++-
 6 files changed, 139 insertions(+), 41 deletions(-)

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v3 1/6] kvm: Add gfn_to_page_no_pin() to translate gfn to page without pinning.
  2014-07-23 19:42 [PATCH v3 0/6] kvm, mem-hotplug: Do not pin ept identity pagetable and apic access page Tang Chen
@ 2014-07-23 19:42 ` Tang Chen
  2014-07-23 19:42 ` [PATCH v3 2/6] kvm: Use APIC_DEFAULT_PHYS_BASE macro as the apic access page address Tang Chen
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Tang Chen @ 2014-07-23 19:42 UTC (permalink / raw)
  To: gleb, mtosatti, nadav.amit, jan.kiszka
  Cc: kvm, laijs, isimatu.yasuaki, guz.fnst, linux-kernel, tangchen

gfn_to_page() will finally call hva_to_pfn() to get the pfn, and pin the page
in memory by calling GUP functions. This function unpins the page.

Will be used by the followed patches.

Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
---
 include/linux/kvm_host.h |  1 +
 virt/kvm/kvm_main.c      | 17 ++++++++++++++++-
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index ec4e3bd..7c58d9d 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -541,6 +541,7 @@ int gfn_to_page_many_atomic(struct kvm *kvm, gfn_t gfn, struct page **pages,
 			    int nr_pages);
 
 struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn);
+struct page *gfn_to_page_no_pin(struct kvm *kvm, gfn_t gfn);
 unsigned long gfn_to_hva(struct kvm *kvm, gfn_t gfn);
 unsigned long gfn_to_hva_prot(struct kvm *kvm, gfn_t gfn, bool *writable);
 unsigned long gfn_to_hva_memslot(struct kvm_memory_slot *slot, gfn_t gfn);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 4b6c01b..6091849 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1371,9 +1371,24 @@ struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn)
 
 	return kvm_pfn_to_page(pfn);
 }
-
 EXPORT_SYMBOL_GPL(gfn_to_page);
 
+struct page *gfn_to_page_no_pin(struct kvm *kvm, gfn_t gfn)
+{
+	struct page *page = gfn_to_page(kvm, gfn);
+
+	/*
+	 * gfn_to_page() will finally call hva_to_pfn() to get the pfn, and pin
+	 * the page in memory by calling GUP functions. This function unpins
+	 * the page.
+	 */
+	if (!is_error_page(page))
+		put_page(page);
+
+	return page;
+}
+EXPORT_SYMBOL_GPL(gfn_to_page_no_pin);
+
 void kvm_release_page_clean(struct page *page)
 {
 	WARN_ON(is_error_page(page));
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v3 2/6] kvm: Use APIC_DEFAULT_PHYS_BASE macro as the apic access page address.
  2014-07-23 19:42 [PATCH v3 0/6] kvm, mem-hotplug: Do not pin ept identity pagetable and apic access page Tang Chen
  2014-07-23 19:42 ` [PATCH v3 1/6] kvm: Add gfn_to_page_no_pin() to translate gfn to page without pinning Tang Chen
@ 2014-07-23 19:42 ` Tang Chen
  2014-07-23 19:42 ` [PATCH v3 3/6] kvm: Remove ept_identity_pagetable from struct kvm_arch Tang Chen
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Tang Chen @ 2014-07-23 19:42 UTC (permalink / raw)
  To: gleb, mtosatti, nadav.amit, jan.kiszka
  Cc: kvm, laijs, isimatu.yasuaki, guz.fnst, linux-kernel, tangchen

We have APIC_DEFAULT_PHYS_BASE defined as 0xfee00000, which is also the address of
apic access page. So use this macro.

Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
---
 arch/x86/kvm/svm.c | 3 ++-
 arch/x86/kvm/vmx.c | 6 +++---
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index ec8366c..576b525 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1257,7 +1257,8 @@ static struct kvm_vcpu *svm_create_vcpu(struct kvm *kvm, unsigned int id)
 	svm->asid_generation = 0;
 	init_vmcb(svm);
 
-	svm->vcpu.arch.apic_base = 0xfee00000 | MSR_IA32_APICBASE_ENABLE;
+	svm->vcpu.arch.apic_base = APIC_DEFAULT_PHYS_BASE |
+				   MSR_IA32_APICBASE_ENABLE;
 	if (kvm_vcpu_is_bsp(&svm->vcpu))
 		svm->vcpu.arch.apic_base |= MSR_IA32_APICBASE_BSP;
 
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 801332e..0e1117c 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -3982,13 +3982,13 @@ static int alloc_apic_access_page(struct kvm *kvm)
 		goto out;
 	kvm_userspace_mem.slot = APIC_ACCESS_PAGE_PRIVATE_MEMSLOT;
 	kvm_userspace_mem.flags = 0;
-	kvm_userspace_mem.guest_phys_addr = 0xfee00000ULL;
+	kvm_userspace_mem.guest_phys_addr = APIC_DEFAULT_PHYS_BASE;
 	kvm_userspace_mem.memory_size = PAGE_SIZE;
 	r = __kvm_set_memory_region(kvm, &kvm_userspace_mem);
 	if (r)
 		goto out;
 
-	page = gfn_to_page(kvm, 0xfee00);
+	page = gfn_to_page(kvm, APIC_DEFAULT_PHYS_BASE >> PAGE_SHIFT);
 	if (is_error_page(page)) {
 		r = -EFAULT;
 		goto out;
@@ -4460,7 +4460,7 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu)
 
 	vmx->vcpu.arch.regs[VCPU_REGS_RDX] = get_rdx_init_val();
 	kvm_set_cr8(&vmx->vcpu, 0);
-	apic_base_msr.data = 0xfee00000 | MSR_IA32_APICBASE_ENABLE;
+	apic_base_msr.data = APIC_DEFAULT_PHYS_BASE | MSR_IA32_APICBASE_ENABLE;
 	if (kvm_vcpu_is_bsp(&vmx->vcpu))
 		apic_base_msr.data |= MSR_IA32_APICBASE_BSP;
 	apic_base_msr.host_initiated = true;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v3 3/6] kvm: Remove ept_identity_pagetable from struct kvm_arch.
  2014-07-23 19:42 [PATCH v3 0/6] kvm, mem-hotplug: Do not pin ept identity pagetable and apic access page Tang Chen
  2014-07-23 19:42 ` [PATCH v3 1/6] kvm: Add gfn_to_page_no_pin() to translate gfn to page without pinning Tang Chen
  2014-07-23 19:42 ` [PATCH v3 2/6] kvm: Use APIC_DEFAULT_PHYS_BASE macro as the apic access page address Tang Chen
@ 2014-07-23 19:42 ` Tang Chen
  2014-07-23 19:42 ` [PATCH v3 4/6] kvm: Make init_rmode_identity_map() return 0 on success Tang Chen
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Tang Chen @ 2014-07-23 19:42 UTC (permalink / raw)
  To: gleb, mtosatti, nadav.amit, jan.kiszka
  Cc: kvm, laijs, isimatu.yasuaki, guz.fnst, linux-kernel, tangchen

kvm_arch->ept_identity_pagetable holds the ept identity pagetable page. But
it is never used to refer to the page at all.

In vcpu initialization, it indicates two things:
1. indicates if ept page is allocated
2. indicates if a memory slot for identity page is initialized

Actually, kvm_arch->ept_identity_pagetable_done is enough to tell if the ept
identity pagetable is initialized. So we can remove ept_identity_pagetable.

NOTE: In the original code, ept identity pagetable page is pinned in memroy.
      As a result, it cannot be migrated/hot-removed. After this patch, since
      kvm_arch->ept_identity_pagetable is removed, ept identity pagetable page
      is no longer pinned in memory. And it can be migrated/hot-removed.

Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
---
 arch/x86/include/asm/kvm_host.h |  1 -
 arch/x86/kvm/vmx.c              | 50 ++++++++++++++++++++---------------------
 arch/x86/kvm/x86.c              |  2 --
 3 files changed, 25 insertions(+), 28 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 4931415..62f973e 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -578,7 +578,6 @@ struct kvm_arch {
 
 	gpa_t wall_clock;
 
-	struct page *ept_identity_pagetable;
 	bool ept_identity_pagetable_done;
 	gpa_t ept_identity_map_addr;
 
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 0e1117c..b8bf47d 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -741,6 +741,7 @@ static void vmx_sync_pir_to_irr_dummy(struct kvm_vcpu *vcpu);
 static void copy_vmcs12_to_shadow(struct vcpu_vmx *vmx);
 static void copy_shadow_to_vmcs12(struct vcpu_vmx *vmx);
 static bool vmx_mpx_supported(void);
+static int alloc_identity_pagetable(struct kvm *kvm);
 
 static DEFINE_PER_CPU(struct vmcs *, vmxarea);
 static DEFINE_PER_CPU(struct vmcs *, current_vmcs);
@@ -3921,21 +3922,27 @@ out:
 
 static int init_rmode_identity_map(struct kvm *kvm)
 {
-	int i, idx, r, ret;
+	int i, idx, r, ret = 0;
 	pfn_t identity_map_pfn;
 	u32 tmp;
 
 	if (!enable_ept)
 		return 1;
-	if (unlikely(!kvm->arch.ept_identity_pagetable)) {
-		printk(KERN_ERR "EPT: identity-mapping pagetable "
-			"haven't been allocated!\n");
-		return 0;
+
+	/* Protect kvm->arch.ept_identity_pagetable_done. */
+	mutex_lock(&kvm->slots_lock);
+
+	if (likely(kvm->arch.ept_identity_pagetable_done)) {
+		ret = 1;
+		goto out2;
 	}
-	if (likely(kvm->arch.ept_identity_pagetable_done))
-		return 1;
-	ret = 0;
+
 	identity_map_pfn = kvm->arch.ept_identity_map_addr >> PAGE_SHIFT;
+
+	r = alloc_identity_pagetable(kvm);
+	if (r)
+		goto out2;
+
 	idx = srcu_read_lock(&kvm->srcu);
 	r = kvm_clear_guest_page(kvm, identity_map_pfn, 0, PAGE_SIZE);
 	if (r < 0)
@@ -3953,6 +3960,9 @@ static int init_rmode_identity_map(struct kvm *kvm)
 	ret = 1;
 out:
 	srcu_read_unlock(&kvm->srcu, idx);
+
+out2:
+	mutex_unlock(&kvm->slots_lock);
 	return ret;
 }
 
@@ -4002,31 +4012,23 @@ out:
 
 static int alloc_identity_pagetable(struct kvm *kvm)
 {
-	struct page *page;
+	/*
+	 * In init_rmode_identity_map(), kvm->arch.ept_identity_pagetable_done
+	 * is checked before calling this function and set to true after the
+	 * calling. The access to kvm->arch.ept_identity_pagetable_done should
+	 * be protected by kvm->slots_lock.
+	 */
+
 	struct kvm_userspace_memory_region kvm_userspace_mem;
 	int r = 0;
 
-	mutex_lock(&kvm->slots_lock);
-	if (kvm->arch.ept_identity_pagetable)
-		goto out;
 	kvm_userspace_mem.slot = IDENTITY_PAGETABLE_PRIVATE_MEMSLOT;
 	kvm_userspace_mem.flags = 0;
 	kvm_userspace_mem.guest_phys_addr =
 		kvm->arch.ept_identity_map_addr;
 	kvm_userspace_mem.memory_size = PAGE_SIZE;
 	r = __kvm_set_memory_region(kvm, &kvm_userspace_mem);
-	if (r)
-		goto out;
 
-	page = gfn_to_page(kvm, kvm->arch.ept_identity_map_addr >> PAGE_SHIFT);
-	if (is_error_page(page)) {
-		r = -EFAULT;
-		goto out;
-	}
-
-	kvm->arch.ept_identity_pagetable = page;
-out:
-	mutex_unlock(&kvm->slots_lock);
 	return r;
 }
 
@@ -7582,8 +7584,6 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
 			kvm->arch.ept_identity_map_addr =
 				VMX_EPT_IDENTITY_PAGETABLE_ADDR;
 		err = -ENOMEM;
-		if (alloc_identity_pagetable(kvm) != 0)
-			goto free_vmcs;
 		if (!init_rmode_identity_map(kvm))
 			goto free_vmcs;
 	}
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index f32a025..ffbe557 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7177,8 +7177,6 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
 	kvm_free_vcpus(kvm);
 	if (kvm->arch.apic_access_page)
 		put_page(kvm->arch.apic_access_page);
-	if (kvm->arch.ept_identity_pagetable)
-		put_page(kvm->arch.ept_identity_pagetable);
 	kfree(rcu_dereference_check(kvm->arch.apic_map, 1));
 }
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v3 4/6] kvm: Make init_rmode_identity_map() return 0 on success.
  2014-07-23 19:42 [PATCH v3 0/6] kvm, mem-hotplug: Do not pin ept identity pagetable and apic access page Tang Chen
                   ` (2 preceding siblings ...)
  2014-07-23 19:42 ` [PATCH v3 3/6] kvm: Remove ept_identity_pagetable from struct kvm_arch Tang Chen
@ 2014-07-23 19:42 ` Tang Chen
  2014-07-23 19:42 ` [PATCH v3 5/6] kvm, mem-hotplug: Do not pin apic access page in memory Tang Chen
  2014-07-23 19:42 ` [PATCH v3 6/6] kvm, mem-hotplug: Reload L1's apic access page if it is migrated when L2 is running Tang Chen
  5 siblings, 0 replies; 9+ messages in thread
From: Tang Chen @ 2014-07-23 19:42 UTC (permalink / raw)
  To: gleb, mtosatti, nadav.amit, jan.kiszka
  Cc: kvm, laijs, isimatu.yasuaki, guz.fnst, linux-kernel, tangchen

In init_rmode_identity_map(), there two variables indicating the return
value, r and ret, and it return 0 on error, 1 on success. The function
is only called by vmx_create_vcpu(), and r is redundant.

This patch removes the redundant variable r, and make init_rmode_identity_map()
return 0 on success, -errno on failure.

Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
---
 arch/x86/kvm/vmx.c | 25 +++++++++++--------------
 1 file changed, 11 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index b8bf47d..6ab4f87 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -3922,45 +3922,42 @@ out:
 
 static int init_rmode_identity_map(struct kvm *kvm)
 {
-	int i, idx, r, ret = 0;
+	int i, idx, ret = 0;
 	pfn_t identity_map_pfn;
 	u32 tmp;
 
 	if (!enable_ept)
-		return 1;
+		return 0;
 
 	/* Protect kvm->arch.ept_identity_pagetable_done. */
 	mutex_lock(&kvm->slots_lock);
 
-	if (likely(kvm->arch.ept_identity_pagetable_done)) {
-		ret = 1;
+	if (likely(kvm->arch.ept_identity_pagetable_done))
 		goto out2;
-	}
 
 	identity_map_pfn = kvm->arch.ept_identity_map_addr >> PAGE_SHIFT;
 
-	r = alloc_identity_pagetable(kvm);
-	if (r)
+	ret = alloc_identity_pagetable(kvm);
+	if (ret)
 		goto out2;
 
 	idx = srcu_read_lock(&kvm->srcu);
-	r = kvm_clear_guest_page(kvm, identity_map_pfn, 0, PAGE_SIZE);
-	if (r < 0)
+	ret = kvm_clear_guest_page(kvm, identity_map_pfn, 0, PAGE_SIZE);
+	if (ret)
 		goto out;
 	/* Set up identity-mapping pagetable for EPT in real mode */
 	for (i = 0; i < PT32_ENT_PER_PAGE; i++) {
 		tmp = (i << 22) + (_PAGE_PRESENT | _PAGE_RW | _PAGE_USER |
 			_PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_PSE);
-		r = kvm_write_guest_page(kvm, identity_map_pfn,
+		ret = kvm_write_guest_page(kvm, identity_map_pfn,
 				&tmp, i * sizeof(tmp), sizeof(tmp));
-		if (r < 0)
+		if (ret)
 			goto out;
 	}
 	kvm->arch.ept_identity_pagetable_done = true;
-	ret = 1;
+
 out:
 	srcu_read_unlock(&kvm->srcu, idx);
-
 out2:
 	mutex_unlock(&kvm->slots_lock);
 	return ret;
@@ -7584,7 +7581,7 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
 			kvm->arch.ept_identity_map_addr =
 				VMX_EPT_IDENTITY_PAGETABLE_ADDR;
 		err = -ENOMEM;
-		if (!init_rmode_identity_map(kvm))
+		if (init_rmode_identity_map(kvm))
 			goto free_vmcs;
 	}
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v3 5/6] kvm, mem-hotplug: Do not pin apic access page in memory.
  2014-07-23 19:42 [PATCH v3 0/6] kvm, mem-hotplug: Do not pin ept identity pagetable and apic access page Tang Chen
                   ` (3 preceding siblings ...)
  2014-07-23 19:42 ` [PATCH v3 4/6] kvm: Make init_rmode_identity_map() return 0 on success Tang Chen
@ 2014-07-23 19:42 ` Tang Chen
  2014-07-23 19:42 ` [PATCH v3 6/6] kvm, mem-hotplug: Reload L1's apic access page if it is migrated when L2 is running Tang Chen
  5 siblings, 0 replies; 9+ messages in thread
From: Tang Chen @ 2014-07-23 19:42 UTC (permalink / raw)
  To: gleb, mtosatti, nadav.amit, jan.kiszka
  Cc: kvm, laijs, isimatu.yasuaki, guz.fnst, linux-kernel, tangchen

apic access page is pinned in memory. As a result, it cannot be migrated/hot-removed.
Actually, it is not necessary to be pinned.

The hpa of apic access page is stored in VMCS APIC_ACCESS_ADDR pointer. When
the page is migrated, kvm_mmu_notifier_invalidate_page() will invalidate the
corresponding ept entry. This patch introduces a new vcpu request named
KVM_REQ_APIC_PAGE_RELOAD, and makes this request to all the vcpus at this time,
and force all the vcpus exit guest, and re-enter guest till they updates the VMCS
APIC_ACCESS_ADDR pointer to the new apic access page address, and updates
kvm->arch.apic_access_page to the new page.

Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/svm.c              |  6 ++++++
 arch/x86/kvm/vmx.c              |  8 +++++++-
 arch/x86/kvm/x86.c              | 17 +++++++++++++++--
 include/linux/kvm_host.h        |  2 ++
 virt/kvm/kvm_main.c             | 12 ++++++++++++
 6 files changed, 43 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 62f973e..9ce6bfd 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -737,6 +737,7 @@ struct kvm_x86_ops {
 	void (*hwapic_isr_update)(struct kvm *kvm, int isr);
 	void (*load_eoi_exitmap)(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap);
 	void (*set_virtual_x2apic_mode)(struct kvm_vcpu *vcpu, bool set);
+	void (*set_apic_access_page_addr)(struct kvm *kvm, hpa_t hpa);
 	void (*deliver_posted_interrupt)(struct kvm_vcpu *vcpu, int vector);
 	void (*sync_pir_to_irr)(struct kvm_vcpu *vcpu);
 	int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 576b525..dc76f29 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -3612,6 +3612,11 @@ static void svm_set_virtual_x2apic_mode(struct kvm_vcpu *vcpu, bool set)
 	return;
 }
 
+static void svm_set_apic_access_page_addr(struct kvm *kvm, hpa_t hpa)
+{
+	return;
+}
+
 static int svm_vm_has_apicv(struct kvm *kvm)
 {
 	return 0;
@@ -4365,6 +4370,7 @@ static struct kvm_x86_ops svm_x86_ops = {
 	.enable_irq_window = enable_irq_window,
 	.update_cr8_intercept = update_cr8_intercept,
 	.set_virtual_x2apic_mode = svm_set_virtual_x2apic_mode,
+	.set_apic_access_page_addr = svm_set_apic_access_page_addr,
 	.vm_has_apicv = svm_vm_has_apicv,
 	.load_eoi_exitmap = svm_load_eoi_exitmap,
 	.hwapic_isr_update = svm_hwapic_isr_update,
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 6ab4f87..c123c1d 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -3995,7 +3995,7 @@ static int alloc_apic_access_page(struct kvm *kvm)
 	if (r)
 		goto out;
 
-	page = gfn_to_page(kvm, APIC_DEFAULT_PHYS_BASE >> PAGE_SHIFT);
+	page = gfn_to_page_no_pin(kvm, APIC_DEFAULT_PHYS_BASE >> PAGE_SHIFT);
 	if (is_error_page(page)) {
 		r = -EFAULT;
 		goto out;
@@ -7072,6 +7072,11 @@ static void vmx_set_virtual_x2apic_mode(struct kvm_vcpu *vcpu, bool set)
 	vmx_set_msr_bitmap(vcpu);
 }
 
+static void vmx_set_apic_access_page_addr(struct kvm *kvm, hpa_t hpa)
+{
+	vmcs_write64(APIC_ACCESS_ADDR, hpa);
+}
+
 static void vmx_hwapic_isr_update(struct kvm *kvm, int isr)
 {
 	u16 status;
@@ -8841,6 +8846,7 @@ static struct kvm_x86_ops vmx_x86_ops = {
 	.enable_irq_window = enable_irq_window,
 	.update_cr8_intercept = update_cr8_intercept,
 	.set_virtual_x2apic_mode = vmx_set_virtual_x2apic_mode,
+	.set_apic_access_page_addr = vmx_set_apic_access_page_addr,
 	.vm_has_apicv = vmx_vm_has_apicv,
 	.load_eoi_exitmap = vmx_load_eoi_exitmap,
 	.hwapic_irr_update = vmx_hwapic_irr_update,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ffbe557..7541a66 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5929,6 +5929,19 @@ static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
 	kvm_apic_update_tmr(vcpu, tmr);
 }
 
+static void vcpu_reload_apic_access_page(struct kvm_vcpu *vcpu)
+{
+	/*
+	 * apic access page could be migrated. When the page is being migrated,
+	 * GUP will wait till the migrate entry is replaced with the new pte
+	 * entry pointing to the new page.
+	 */
+	vcpu->kvm->arch.apic_access_page = gfn_to_page_no_pin(vcpu->kvm,
+				APIC_DEFAULT_PHYS_BASE >> PAGE_SHIFT);
+	kvm_x86_ops->set_apic_access_page_addr(vcpu->kvm,
+				page_to_phys(vcpu->kvm->arch.apic_access_page));
+}
+
 /*
  * Returns 1 to let __vcpu_run() continue the guest execution loop without
  * exiting to the userspace.  Otherwise, the value will be returned to the
@@ -5989,6 +6002,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 			kvm_deliver_pmi(vcpu);
 		if (kvm_check_request(KVM_REQ_SCAN_IOAPIC, vcpu))
 			vcpu_scan_ioapic(vcpu);
+		if (kvm_check_request(KVM_REQ_APIC_PAGE_RELOAD, vcpu))
+			vcpu_reload_apic_access_page(vcpu);
 	}
 
 	if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
@@ -7175,8 +7190,6 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
 	kfree(kvm->arch.vpic);
 	kfree(kvm->arch.vioapic);
 	kvm_free_vcpus(kvm);
-	if (kvm->arch.apic_access_page)
-		put_page(kvm->arch.apic_access_page);
 	kfree(rcu_dereference_check(kvm->arch.apic_map, 1));
 }
 
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 7c58d9d..f49be86 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -136,6 +136,7 @@ static inline bool is_error_page(struct page *page)
 #define KVM_REQ_GLOBAL_CLOCK_UPDATE 22
 #define KVM_REQ_ENABLE_IBS        23
 #define KVM_REQ_DISABLE_IBS       24
+#define KVM_REQ_APIC_PAGE_RELOAD  25
 
 #define KVM_USERSPACE_IRQ_SOURCE_ID		0
 #define KVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID	1
@@ -596,6 +597,7 @@ void kvm_flush_remote_tlbs(struct kvm *kvm);
 void kvm_reload_remote_mmus(struct kvm *kvm);
 void kvm_make_mclock_inprogress_request(struct kvm *kvm);
 void kvm_make_scan_ioapic_request(struct kvm *kvm);
+void kvm_reload_apic_access_page(struct kvm *kvm);
 
 long kvm_arch_dev_ioctl(struct file *filp,
 			unsigned int ioctl, unsigned long arg);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 6091849..965b702 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -210,6 +210,11 @@ void kvm_make_scan_ioapic_request(struct kvm *kvm)
 	make_all_cpus_request(kvm, KVM_REQ_SCAN_IOAPIC);
 }
 
+void kvm_reload_apic_access_page(struct kvm *kvm)
+{
+	make_all_cpus_request(kvm, KVM_REQ_APIC_PAGE_RELOAD);
+}
+
 int kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id)
 {
 	struct page *page;
@@ -294,6 +299,13 @@ static void kvm_mmu_notifier_invalidate_page(struct mmu_notifier *mn,
 	if (need_tlb_flush)
 		kvm_flush_remote_tlbs(kvm);
 
+	/*
+	 * The physical address of apic access page is stroed in VMCS.
+	 * So need to update it when it becomes invalid.
+	 */
+	if (address == gfn_to_hva(kvm, APIC_DEFAULT_PHYS_BASE >> PAGE_SHIFT))
+		kvm_reload_apic_access_page(kvm);
+
 	spin_unlock(&kvm->mmu_lock);
 	srcu_read_unlock(&kvm->srcu, idx);
 }
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v3 6/6] kvm, mem-hotplug: Reload L1's apic access page if it is migrated when L2 is running.
  2014-07-23 19:42 [PATCH v3 0/6] kvm, mem-hotplug: Do not pin ept identity pagetable and apic access page Tang Chen
                   ` (4 preceding siblings ...)
  2014-07-23 19:42 ` [PATCH v3 5/6] kvm, mem-hotplug: Do not pin apic access page in memory Tang Chen
@ 2014-07-23 19:42 ` Tang Chen
  2014-07-26  8:44   ` Jan Kiszka
  5 siblings, 1 reply; 9+ messages in thread
From: Tang Chen @ 2014-07-23 19:42 UTC (permalink / raw)
  To: gleb, mtosatti, nadav.amit, jan.kiszka
  Cc: kvm, laijs, isimatu.yasuaki, guz.fnst, linux-kernel, tangchen

This patch only handle "L1 and L2 vm share one apic access page" situation.

When L1 vm is running, if the shared apic access page is migrated, mmu_notifier will
request all vcpus to exit to L0, and reload apic access page physical address for
all the vcpus' vmcs (which is done by patch 5/6). And when it enters L2 vm, L2's vmcs
will be updated in prepare_vmcs02() called by nested_vm_run(). So we need to do
nothing.

When L2 vm is running, if the shared apic access page is migrated, mmu_notifier will
request all vcpus to exit to L0, and reload apic access page physical address for
all L2 vmcs. And this patch requests apic access page reload in L2->L1 vmexit.

Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/svm.c              |  6 ++++++
 arch/x86/kvm/vmx.c              | 37 +++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/x86.c              |  3 +++
 4 files changed, 47 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 9ce6bfd..613ee7f 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -738,6 +738,7 @@ struct kvm_x86_ops {
 	void (*load_eoi_exitmap)(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap);
 	void (*set_virtual_x2apic_mode)(struct kvm_vcpu *vcpu, bool set);
 	void (*set_apic_access_page_addr)(struct kvm *kvm, hpa_t hpa);
+	void (*set_nested_apic_page_migrated)(struct kvm_vcpu *vcpu, bool set);
 	void (*deliver_posted_interrupt)(struct kvm_vcpu *vcpu, int vector);
 	void (*sync_pir_to_irr)(struct kvm_vcpu *vcpu);
 	int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index dc76f29..87273ef 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -3617,6 +3617,11 @@ static void svm_set_apic_access_page_addr(struct kvm *kvm, hpa_t hpa)
 	return;
 }
 
+static void svm_set_nested_apic_page_migrated(struct kvm_vcpu *vcpu, bool set)
+{
+	return;
+}
+
 static int svm_vm_has_apicv(struct kvm *kvm)
 {
 	return 0;
@@ -4371,6 +4376,7 @@ static struct kvm_x86_ops svm_x86_ops = {
 	.update_cr8_intercept = update_cr8_intercept,
 	.set_virtual_x2apic_mode = svm_set_virtual_x2apic_mode,
 	.set_apic_access_page_addr = svm_set_apic_access_page_addr,
+	.set_nested_apic_page_migrated = svm_set_nested_apic_page_migrated,
 	.vm_has_apicv = svm_vm_has_apicv,
 	.load_eoi_exitmap = svm_load_eoi_exitmap,
 	.hwapic_isr_update = svm_hwapic_isr_update,
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index c123c1d..9231afe 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -379,6 +379,16 @@ struct nested_vmx {
 	 * we must keep them pinned while L2 runs.
 	 */
 	struct page *apic_access_page;
+	/*
+	 * L1's apic access page can be migrated. When L1 and L2 are sharing
+	 * the apic access page, after the page is migrated when L2 is running,
+	 * we have to reload it to L1 vmcs before we enter L1.
+	 *
+	 * When the shared apic access page is migrated in L1 mode, we don't
+	 * need to do anything else because we reload apic access page each
+	 * time when entering L2 in prepare_vmcs02().
+	 */
+	bool apic_access_page_migrated;
 	u64 msr_ia32_feature_control;
 
 	struct hrtimer preemption_timer;
@@ -7077,6 +7087,12 @@ static void vmx_set_apic_access_page_addr(struct kvm *kvm, hpa_t hpa)
 	vmcs_write64(APIC_ACCESS_ADDR, hpa);
 }
 
+static void vmx_set_nested_apic_page_migrated(struct kvm_vcpu *vcpu, bool set)
+{
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+	vmx->nested.apic_access_page_migrated = set;
+}
+
 static void vmx_hwapic_isr_update(struct kvm *kvm, int isr)
 {
 	u16 status;
@@ -8727,6 +8743,26 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
 	}
 
 	/*
+	 * When shared (L1 & L2) apic access page is migrated during L2 is
+	 * running, mmu_notifier will force to reload the page's hpa for L2
+	 * vmcs. Need to reload it for L1 before entering L1.
+	 */
+	if (vmx->nested.apic_access_page_migrated) {
+		/*
+		 * Do not call kvm_reload_apic_access_page() because we are now
+		 * in L2. We should not call make_all_cpus_request() to exit to
+		 * L0, otherwise we will reload for L2 vmcs again.
+		 */
+		int i;
+
+		for (i = 0; i < atomic_read(&vcpu->kvm->online_vcpus); i++)
+			kvm_make_request(KVM_REQ_APIC_PAGE_RELOAD,
+					 vcpu->kvm->vcpus[i]);
+
+		vmx->nested.apic_access_page_migrated = false;
+	}
+
+	/*
 	 * Exiting from L2 to L1, we're now back to L1 which thinks it just
 	 * finished a VMLAUNCH or VMRESUME instruction, so we need to set the
 	 * success or failure flag accordingly.
@@ -8847,6 +8883,7 @@ static struct kvm_x86_ops vmx_x86_ops = {
 	.update_cr8_intercept = update_cr8_intercept,
 	.set_virtual_x2apic_mode = vmx_set_virtual_x2apic_mode,
 	.set_apic_access_page_addr = vmx_set_apic_access_page_addr,
+	.set_nested_apic_page_migrated = vmx_set_nested_apic_page_migrated,
 	.vm_has_apicv = vmx_vm_has_apicv,
 	.load_eoi_exitmap = vmx_load_eoi_exitmap,
 	.hwapic_irr_update = vmx_hwapic_irr_update,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7541a66..0c11e12 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5940,6 +5940,9 @@ static void vcpu_reload_apic_access_page(struct kvm_vcpu *vcpu)
 				APIC_DEFAULT_PHYS_BASE >> PAGE_SHIFT);
 	kvm_x86_ops->set_apic_access_page_addr(vcpu->kvm,
 				page_to_phys(vcpu->kvm->arch.apic_access_page));
+
+	if (is_guest_mode(vcpu))
+		kvm_x86_ops->set_nested_apic_page_migrated(vcpu, true);
 }
 
 /*
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v3 6/6] kvm, mem-hotplug: Reload L1's apic access page if it is migrated when L2 is running.
  2014-07-23 19:42 ` [PATCH v3 6/6] kvm, mem-hotplug: Reload L1's apic access page if it is migrated when L2 is running Tang Chen
@ 2014-07-26  8:44   ` Jan Kiszka
  2014-07-29 22:56     ` tangchen
  0 siblings, 1 reply; 9+ messages in thread
From: Jan Kiszka @ 2014-07-26  8:44 UTC (permalink / raw)
  To: Tang Chen, gleb, mtosatti, nadav.amit
  Cc: kvm, laijs, isimatu.yasuaki, guz.fnst, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 822 bytes --]

On 2014-07-23 21:42, Tang Chen wrote:
> This patch only handle "L1 and L2 vm share one apic access page" situation.
> 
> When L1 vm is running, if the shared apic access page is migrated, mmu_notifier will
> request all vcpus to exit to L0, and reload apic access page physical address for
> all the vcpus' vmcs (which is done by patch 5/6). And when it enters L2 vm, L2's vmcs
> will be updated in prepare_vmcs02() called by nested_vm_run(). So we need to do
> nothing.
> 
> When L2 vm is running, if the shared apic access page is migrated, mmu_notifier will
> request all vcpus to exit to L0, and reload apic access page physical address for
> all L2 vmcs. And this patch requests apic access page reload in L2->L1 vmexit.

Shouldn't this patch come before we allow apic access page migration?

Jan



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 263 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v3 6/6] kvm, mem-hotplug: Reload L1's apic access page if it is migrated when L2 is running.
  2014-07-26  8:44   ` Jan Kiszka
@ 2014-07-29 22:56     ` tangchen
  0 siblings, 0 replies; 9+ messages in thread
From: tangchen @ 2014-07-29 22:56 UTC (permalink / raw)
  To: Jan Kiszka, gleb, mtosatti, nadav.amit
  Cc: kvm, laijs, isimatu.yasuaki, guz.fnst, linux-kernel, tangchen


On 07/26/2014 04:44 AM, Jan Kiszka wrote:
> On 2014-07-23 21:42, Tang Chen wrote:
>> This patch only handle "L1 and L2 vm share one apic access page" situation.
>>
>> When L1 vm is running, if the shared apic access page is migrated, mmu_notifier will
>> request all vcpus to exit to L0, and reload apic access page physical address for
>> all the vcpus' vmcs (which is done by patch 5/6). And when it enters L2 vm, L2's vmcs
>> will be updated in prepare_vmcs02() called by nested_vm_run(). So we need to do
>> nothing.
>>
>> When L2 vm is running, if the shared apic access page is migrated, mmu_notifier will
>> request all vcpus to exit to L0, and reload apic access page physical address for
>> all L2 vmcs. And this patch requests apic access page reload in L2->L1 vmexit.
> Shouldn't this patch come before we allow apic access page migration?
Yes, it should come before patch 5.

Thanks.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2014-07-29 10:57 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-07-23 19:42 [PATCH v3 0/6] kvm, mem-hotplug: Do not pin ept identity pagetable and apic access page Tang Chen
2014-07-23 19:42 ` [PATCH v3 1/6] kvm: Add gfn_to_page_no_pin() to translate gfn to page without pinning Tang Chen
2014-07-23 19:42 ` [PATCH v3 2/6] kvm: Use APIC_DEFAULT_PHYS_BASE macro as the apic access page address Tang Chen
2014-07-23 19:42 ` [PATCH v3 3/6] kvm: Remove ept_identity_pagetable from struct kvm_arch Tang Chen
2014-07-23 19:42 ` [PATCH v3 4/6] kvm: Make init_rmode_identity_map() return 0 on success Tang Chen
2014-07-23 19:42 ` [PATCH v3 5/6] kvm, mem-hotplug: Do not pin apic access page in memory Tang Chen
2014-07-23 19:42 ` [PATCH v3 6/6] kvm, mem-hotplug: Reload L1's apic access page if it is migrated when L2 is running Tang Chen
2014-07-26  8:44   ` Jan Kiszka
2014-07-29 22:56     ` tangchen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).