All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] kvm, mem-hotplug: Do not pin ept identity pagetable and apic access page.
@ 2014-07-02  9:00 Tang Chen
  2014-07-02  9:00 ` [PATCH 1/4] kvm: Add gfn_to_page_no_pin() Tang Chen
                   ` (4 more replies)
  0 siblings, 5 replies; 29+ messages in thread
From: Tang Chen @ 2014-07-02  9:00 UTC (permalink / raw)
  To: gleb, mtosatti
  Cc: kvm, laijs, isimatu.yasuaki, guz.fnst, linux-kernel, tangchen

Hi Gleb, Marcelo,

Please help to review this patch-set.

NOTE: This patch-set doesn't work properly.


ept identity pagetable and apic access page in kvm are pinned in memory.
As a result, they cannot be migrated/hot-removed.

But actually they don't need to be pinned in memory.

This patch-set introduces two new vcpu requests: KVM_REQ_MIGRATE_EPT and KVM_REQ_MIGRATE_APIC.
These two requests are made when the two pages are migrated by the mmu_notifier
to reset the related variable to unusable value. And will also be made when
ept violation happens to reset new pages.


[Known problem]
After this patch-set applied, the two pages can be migrated/hot-removed.
But after migrating apic access page, the guest died.

The host physical address of apic access page is stored in VMCS. I reset
it to 0 to stop guest from accessing it when it is unmapped by
kvm_mmu_notifier_invalidate_page(). And reset it to new page's host physical
address in tdp_page_fault(). But it seems that guest will access apic page
directly by the host physical address.


Tang Chen (4):
  kvm: Add gfn_to_page_no_pin()
  kvm: Add macro VMX_APIC_ACCESS_PAGE_ADDR
  kvm, memory-hotplug: Update ept identity pagetable when it is
    migrated.
  kvm, mem-hotplug: Update apic access page when it is migrated.

 arch/x86/include/asm/kvm_host.h |  3 +++
 arch/x86/include/asm/vmx.h      |  2 +-
 arch/x86/kvm/mmu.c              | 26 ++++++++++++++++++++++++++
 arch/x86/kvm/svm.c              |  3 ++-
 arch/x86/kvm/vmx.c              | 17 +++++++++++++----
 arch/x86/kvm/x86.c              | 36 ++++++++++++++++++++++++++++++++++++
 include/linux/kvm_host.h        |  3 +++
 virt/kvm/kvm_main.c             | 38 +++++++++++++++++++++++++++++++++++++-
 8 files changed, 121 insertions(+), 7 deletions(-)

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 1/4] kvm: Add gfn_to_page_no_pin()
  2014-07-02  9:00 [PATCH 0/4] kvm, mem-hotplug: Do not pin ept identity pagetable and apic access page Tang Chen
@ 2014-07-02  9:00 ` Tang Chen
  2014-07-02  9:00 ` [PATCH 2/4] kvm: Add macro VMX_APIC_ACCESS_PAGE_ADDR Tang Chen
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 29+ messages in thread
From: Tang Chen @ 2014-07-02  9:00 UTC (permalink / raw)
  To: gleb, mtosatti
  Cc: kvm, laijs, isimatu.yasuaki, guz.fnst, linux-kernel, tangchen

Used by the followed patches.
---
 include/linux/kvm_host.h |  1 +
 virt/kvm/kvm_main.c      | 17 ++++++++++++++++-
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index ec4e3bd..7c58d9d 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -541,6 +541,7 @@ int gfn_to_page_many_atomic(struct kvm *kvm, gfn_t gfn, struct page **pages,
 			    int nr_pages);
 
 struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn);
+struct page *gfn_to_page_no_pin(struct kvm *kvm, gfn_t gfn);
 unsigned long gfn_to_hva(struct kvm *kvm, gfn_t gfn);
 unsigned long gfn_to_hva_prot(struct kvm *kvm, gfn_t gfn, bool *writable);
 unsigned long gfn_to_hva_memslot(struct kvm_memory_slot *slot, gfn_t gfn);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 4b6c01b..6091849 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1371,9 +1371,24 @@ struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn)
 
 	return kvm_pfn_to_page(pfn);
 }
-
 EXPORT_SYMBOL_GPL(gfn_to_page);
 
+struct page *gfn_to_page_no_pin(struct kvm *kvm, gfn_t gfn)
+{
+	struct page *page = gfn_to_page(kvm, gfn);
+
+	/*
+	 * gfn_to_page() will finally call hva_to_pfn() to get the pfn, and pin
+	 * the page in memory by calling GUP functions. This function unpins
+	 * the page.
+	 */
+	if (!is_error_page(page))
+		put_page(page);
+
+	return page;
+}
+EXPORT_SYMBOL_GPL(gfn_to_page_no_pin);
+
 void kvm_release_page_clean(struct page *page)
 {
 	WARN_ON(is_error_page(page));
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 2/4] kvm: Add macro VMX_APIC_ACCESS_PAGE_ADDR
  2014-07-02  9:00 [PATCH 0/4] kvm, mem-hotplug: Do not pin ept identity pagetable and apic access page Tang Chen
  2014-07-02  9:00 ` [PATCH 1/4] kvm: Add gfn_to_page_no_pin() Tang Chen
@ 2014-07-02  9:00 ` Tang Chen
  2014-07-02 16:24   ` Gleb Natapov
  2014-07-02  9:00 ` [PATCH 3/4] kvm, memory-hotplug: Update ept identity pagetable when it is migrated Tang Chen
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 29+ messages in thread
From: Tang Chen @ 2014-07-02  9:00 UTC (permalink / raw)
  To: gleb, mtosatti
  Cc: kvm, laijs, isimatu.yasuaki, guz.fnst, linux-kernel, tangchen

Define guest phys_addr of apic access page.
---
 arch/x86/include/asm/vmx.h | 2 +-
 arch/x86/kvm/svm.c         | 3 ++-
 arch/x86/kvm/vmx.c         | 7 ++++---
 3 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 7004d21..c4672d1 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -422,7 +422,7 @@ enum vmcs_field {
 #define VMX_EPT_DIRTY_BIT				(1ull << 9)
 
 #define VMX_EPT_IDENTITY_PAGETABLE_ADDR		0xfffbc000ul
-
+#define VMX_APIC_ACCESS_PAGE_ADDR		0xfee00000ull
 
 #define ASM_VMX_VMCLEAR_RAX       ".byte 0x66, 0x0f, 0xc7, 0x30"
 #define ASM_VMX_VMLAUNCH          ".byte 0x0f, 0x01, 0xc2"
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index ec8366c..22aa2ae 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1257,7 +1257,8 @@ static struct kvm_vcpu *svm_create_vcpu(struct kvm *kvm, unsigned int id)
 	svm->asid_generation = 0;
 	init_vmcb(svm);
 
-	svm->vcpu.arch.apic_base = 0xfee00000 | MSR_IA32_APICBASE_ENABLE;
+	svm->vcpu.arch.apic_base = VMX_APIC_ACCESS_PAGE_ADDR |
+				   MSR_IA32_APICBASE_ENABLE;
 	if (kvm_vcpu_is_bsp(&svm->vcpu))
 		svm->vcpu.arch.apic_base |= MSR_IA32_APICBASE_BSP;
 
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 801332e..366b5b3 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -3982,13 +3982,13 @@ static int alloc_apic_access_page(struct kvm *kvm)
 		goto out;
 	kvm_userspace_mem.slot = APIC_ACCESS_PAGE_PRIVATE_MEMSLOT;
 	kvm_userspace_mem.flags = 0;
-	kvm_userspace_mem.guest_phys_addr = 0xfee00000ULL;
+	kvm_userspace_mem.guest_phys_addr = VMX_APIC_ACCESS_PAGE_ADDR;
 	kvm_userspace_mem.memory_size = PAGE_SIZE;
 	r = __kvm_set_memory_region(kvm, &kvm_userspace_mem);
 	if (r)
 		goto out;
 
-	page = gfn_to_page(kvm, 0xfee00);
+	page = gfn_to_page(kvm, VMX_APIC_ACCESS_PAGE_ADDR >> PAGE_SHIFT);
 	if (is_error_page(page)) {
 		r = -EFAULT;
 		goto out;
@@ -4460,7 +4460,8 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu)
 
 	vmx->vcpu.arch.regs[VCPU_REGS_RDX] = get_rdx_init_val();
 	kvm_set_cr8(&vmx->vcpu, 0);
-	apic_base_msr.data = 0xfee00000 | MSR_IA32_APICBASE_ENABLE;
+	apic_base_msr.data = VMX_APIC_ACCESS_PAGE_ADDR |
+			     MSR_IA32_APICBASE_ENABLE;
 	if (kvm_vcpu_is_bsp(&vmx->vcpu))
 		apic_base_msr.data |= MSR_IA32_APICBASE_BSP;
 	apic_base_msr.host_initiated = true;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 3/4] kvm, memory-hotplug: Update ept identity pagetable when it is migrated.
  2014-07-02  9:00 [PATCH 0/4] kvm, mem-hotplug: Do not pin ept identity pagetable and apic access page Tang Chen
  2014-07-02  9:00 ` [PATCH 1/4] kvm: Add gfn_to_page_no_pin() Tang Chen
  2014-07-02  9:00 ` [PATCH 2/4] kvm: Add macro VMX_APIC_ACCESS_PAGE_ADDR Tang Chen
@ 2014-07-02  9:00 ` Tang Chen
  2014-07-02 16:34   ` Gleb Natapov
  2014-07-02  9:00 ` [PATCH 4/4] kvm, mem-hotplug: Update apic access page " Tang Chen
  2014-07-03  1:17 ` [PATCH 0/4] kvm, mem-hotplug: Do not pin ept identity pagetable and apic access page Tang Chen
  4 siblings, 1 reply; 29+ messages in thread
From: Tang Chen @ 2014-07-02  9:00 UTC (permalink / raw)
  To: gleb, mtosatti
  Cc: kvm, laijs, isimatu.yasuaki, guz.fnst, linux-kernel, tangchen

ept identity pagetable is pinned in memory, and as a result it cannot be
migrated/hot-removed.

But actually it doesn't need to be pinned in memory.

This patch introduces a new vcpu request: KVM_REQ_MIGRATE_EPT to reset ept
indetity pagetable related variable. This request will be made when
kvm_mmu_notifier_invalidate_page() is called when the page is unmapped
from the qemu user space to reset kvm->arch.ept_identity_pagetable to NULL.
And will also be made when ept violation happens to reset
kvm->arch.ept_identity_pagetable to the new page.
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/mmu.c              | 11 +++++++++++
 arch/x86/kvm/vmx.c              |  3 ++-
 arch/x86/kvm/x86.c              | 16 ++++++++++++++++
 include/linux/kvm_host.h        |  1 +
 virt/kvm/kvm_main.c             |  6 ++++++
 6 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 4931415..8771c0f 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -581,6 +581,7 @@ struct kvm_arch {
 	struct page *ept_identity_pagetable;
 	bool ept_identity_pagetable_done;
 	gpa_t ept_identity_map_addr;
+	bool ept_identity_pagetable_migrated;
 
 	unsigned long irq_sources_bitmap;
 	s64 kvmclock_offset;
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 9314678..c0d72f6 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -3425,6 +3425,17 @@ static int tdp_page_fault(struct kvm_vcpu *vcpu, gva_t gpa, u32 error_code,
 		transparent_hugepage_adjust(vcpu, &gfn, &pfn, &level);
 	r = __direct_map(vcpu, gpa, write, map_writable,
 			 level, gfn, pfn, prefault);
+
+	/*
+	 * Update ept identity pagetable page and apic access page if
+	 * they are migrated.
+	 */
+	if (gpa == vcpu->kvm->arch.ept_identity_map_addr &&
+	    vcpu->kvm->arch.ept_identity_pagetable_migrated) {
+		vcpu->kvm->arch.ept_identity_pagetable_migrated = false;
+		kvm_make_request(KVM_REQ_MIGRATE_EPT, vcpu);
+	}
+
 	spin_unlock(&vcpu->kvm->mmu_lock);
 
 	return r;
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 366b5b3..c336cb3 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -4018,7 +4018,8 @@ static int alloc_identity_pagetable(struct kvm *kvm)
 	if (r)
 		goto out;
 
-	page = gfn_to_page(kvm, kvm->arch.ept_identity_map_addr >> PAGE_SHIFT);
+	page = gfn_to_page_no_pin(kvm,
+			kvm->arch.ept_identity_map_addr >> PAGE_SHIFT);
 	if (is_error_page(page)) {
 		r = -EFAULT;
 		goto out;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index f32a025..a26524f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5929,6 +5929,20 @@ static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
 	kvm_apic_update_tmr(vcpu, tmr);
 }
 
+static void vcpu_migrated_page_update_ept(struct kvm_vcpu *vcpu)
+{
+	struct kvm *kvm = vcpu->kvm;
+
+	if (kvm->arch.ept_identity_pagetable_migrated)
+		kvm->arch.ept_identity_pagetable = NULL;
+	else {
+		struct page *page;
+		page = gfn_to_page_no_pin(kvm,
+				kvm->arch.ept_identity_map_addr >> PAGE_SHIFT);
+		kvm->arch.ept_identity_pagetable = page;
+	}
+}
+
 /*
  * Returns 1 to let __vcpu_run() continue the guest execution loop without
  * exiting to the userspace.  Otherwise, the value will be returned to the
@@ -5989,6 +6003,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 			kvm_deliver_pmi(vcpu);
 		if (kvm_check_request(KVM_REQ_SCAN_IOAPIC, vcpu))
 			vcpu_scan_ioapic(vcpu);
+		if (kvm_check_request(KVM_REQ_MIGRATE_EPT, vcpu))
+			vcpu_migrated_page_update_ept(vcpu);
 	}
 
 	if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 7c58d9d..4b7e51a 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -136,6 +136,7 @@ static inline bool is_error_page(struct page *page)
 #define KVM_REQ_GLOBAL_CLOCK_UPDATE 22
 #define KVM_REQ_ENABLE_IBS        23
 #define KVM_REQ_DISABLE_IBS       24
+#define KVM_REQ_MIGRATE_EPT       25
 
 #define KVM_USERSPACE_IRQ_SOURCE_ID		0
 #define KVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID	1
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 6091849..d271e89 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -294,6 +294,12 @@ static void kvm_mmu_notifier_invalidate_page(struct mmu_notifier *mn,
 	if (need_tlb_flush)
 		kvm_flush_remote_tlbs(kvm);
 
+	if (address ==
+	    gfn_to_hva(kvm, kvm->arch.ept_identity_map_addr >> PAGE_SHIFT)) {
+		kvm->arch.ept_identity_pagetable_migrated = true;
+		kvm_make_request(KVM_REQ_MIGRATE_EPT, kvm->vcpus[0]);
+	}
+
 	spin_unlock(&kvm->mmu_lock);
 	srcu_read_unlock(&kvm->srcu, idx);
 }
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 4/4] kvm, mem-hotplug: Update apic access page when it is migrated.
  2014-07-02  9:00 [PATCH 0/4] kvm, mem-hotplug: Do not pin ept identity pagetable and apic access page Tang Chen
                   ` (2 preceding siblings ...)
  2014-07-02  9:00 ` [PATCH 3/4] kvm, memory-hotplug: Update ept identity pagetable when it is migrated Tang Chen
@ 2014-07-02  9:00 ` Tang Chen
  2014-07-03 13:55   ` Gleb Natapov
  2014-07-07 10:35   ` Wanpeng Li
  2014-07-03  1:17 ` [PATCH 0/4] kvm, mem-hotplug: Do not pin ept identity pagetable and apic access page Tang Chen
  4 siblings, 2 replies; 29+ messages in thread
From: Tang Chen @ 2014-07-02  9:00 UTC (permalink / raw)
  To: gleb, mtosatti
  Cc: kvm, laijs, isimatu.yasuaki, guz.fnst, linux-kernel, tangchen

apic access page is pinned in memory, and as a result it cannot be
migrated/hot-removed.

Actually it doesn't need to be pinned in memory.

This patch introduces a new vcpu request: KVM_REQ_MIGRATE_EPT. This requet
will be made when kvm_mmu_notifier_invalidate_page() is called when the page
is unmapped from the qemu user space to reset APIC_ACCESS_ADDR pointer in
each online vcpu to 0. And will also be made when ept violation happens to
reset APIC_ACCESS_ADDR to the new page phys_addr (host phys_addr).
---
 arch/x86/include/asm/kvm_host.h |  2 ++
 arch/x86/kvm/mmu.c              | 15 +++++++++++++++
 arch/x86/kvm/vmx.c              |  9 ++++++++-
 arch/x86/kvm/x86.c              | 20 ++++++++++++++++++++
 include/linux/kvm_host.h        |  1 +
 virt/kvm/kvm_main.c             | 15 +++++++++++++++
 6 files changed, 61 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 8771c0f..f104b87 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -575,6 +575,7 @@ struct kvm_arch {
 
 	unsigned int tss_addr;
 	struct page *apic_access_page;
+	bool apic_access_page_migrated;
 
 	gpa_t wall_clock;
 
@@ -739,6 +740,7 @@ struct kvm_x86_ops {
 	void (*hwapic_isr_update)(struct kvm *kvm, int isr);
 	void (*load_eoi_exitmap)(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap);
 	void (*set_virtual_x2apic_mode)(struct kvm_vcpu *vcpu, bool set);
+	void (*set_apic_access_page_addr)(struct kvm *kvm, hpa_t hpa);
 	void (*deliver_posted_interrupt)(struct kvm_vcpu *vcpu, int vector);
 	void (*sync_pir_to_irr)(struct kvm_vcpu *vcpu);
 	int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index c0d72f6..a655444 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -3436,6 +3436,21 @@ static int tdp_page_fault(struct kvm_vcpu *vcpu, gva_t gpa, u32 error_code,
 		kvm_make_request(KVM_REQ_MIGRATE_EPT, vcpu);
 	}
 
+	if (gpa == VMX_APIC_ACCESS_PAGE_ADDR &&
+	    vcpu->kvm->arch.apic_access_page_migrated) {
+		int i;
+
+		vcpu->kvm->arch.apic_access_page_migrated = false;
+
+		/*
+		 * We need update APIC_ACCESS_ADDR pointer in each VMCS of
+		 * all the online vcpus.
+		 */
+		for (i = 0; i < atomic_read(&vcpu->kvm->online_vcpus); i++)
+			kvm_make_request(KVM_REQ_MIGRATE_APIC,
+					 vcpu->kvm->vcpus[i]);
+	}
+
 	spin_unlock(&vcpu->kvm->mmu_lock);
 
 	return r;
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index c336cb3..abc152f 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -3988,7 +3988,7 @@ static int alloc_apic_access_page(struct kvm *kvm)
 	if (r)
 		goto out;
 
-	page = gfn_to_page(kvm, VMX_APIC_ACCESS_PAGE_ADDR >> PAGE_SHIFT);
+	page = gfn_to_page_no_pin(kvm, VMX_APIC_ACCESS_PAGE_ADDR >> PAGE_SHIFT);
 	if (is_error_page(page)) {
 		r = -EFAULT;
 		goto out;
@@ -7075,6 +7075,12 @@ static void vmx_set_virtual_x2apic_mode(struct kvm_vcpu *vcpu, bool set)
 	vmx_set_msr_bitmap(vcpu);
 }
 
+static void vmx_set_apic_access_page_addr(struct kvm *kvm, hpa_t hpa)
+{
+	if (vm_need_virtualize_apic_accesses(kvm))
+		vmcs_write64(APIC_ACCESS_ADDR, hpa);
+}
+
 static void vmx_hwapic_isr_update(struct kvm *kvm, int isr)
 {
 	u16 status;
@@ -8846,6 +8852,7 @@ static struct kvm_x86_ops vmx_x86_ops = {
 	.enable_irq_window = enable_irq_window,
 	.update_cr8_intercept = update_cr8_intercept,
 	.set_virtual_x2apic_mode = vmx_set_virtual_x2apic_mode,
+	.set_apic_access_page_addr = vmx_set_apic_access_page_addr,
 	.vm_has_apicv = vmx_vm_has_apicv,
 	.load_eoi_exitmap = vmx_load_eoi_exitmap,
 	.hwapic_irr_update = vmx_hwapic_irr_update,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a26524f..14e7174 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5943,6 +5943,24 @@ static void vcpu_migrated_page_update_ept(struct kvm_vcpu *vcpu)
 	}
 }
 
+static void vcpu_migrated_page_update_apic(struct kvm_vcpu *vcpu)
+{
+	struct kvm *kvm = vcpu->kvm;
+
+	if (kvm->arch.apic_access_page_migrated) {
+		if (kvm->arch.apic_access_page)
+			kvm->arch.apic_access_page = pfn_to_page(0);
+		kvm_x86_ops->set_apic_access_page_addr(kvm, 0x0ull);
+	} else {
+		struct page *page;
+		page = gfn_to_page_no_pin(kvm,
+				VMX_APIC_ACCESS_PAGE_ADDR >> PAGE_SHIFT);
+		kvm->arch.apic_access_page = page;
+		kvm_x86_ops->set_apic_access_page_addr(kvm,
+						       page_to_phys(page));
+	}
+}
+
 /*
  * Returns 1 to let __vcpu_run() continue the guest execution loop without
  * exiting to the userspace.  Otherwise, the value will be returned to the
@@ -6005,6 +6023,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 			vcpu_scan_ioapic(vcpu);
 		if (kvm_check_request(KVM_REQ_MIGRATE_EPT, vcpu))
 			vcpu_migrated_page_update_ept(vcpu);
+		if (kvm_check_request(KVM_REQ_MIGRATE_APIC, vcpu))
+			vcpu_migrated_page_update_apic(vcpu);
 	}
 
 	if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 4b7e51a..e2ad65e 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -137,6 +137,7 @@ static inline bool is_error_page(struct page *page)
 #define KVM_REQ_ENABLE_IBS        23
 #define KVM_REQ_DISABLE_IBS       24
 #define KVM_REQ_MIGRATE_EPT       25
+#define KVM_REQ_MIGRATE_APIC      26
 
 #define KVM_USERSPACE_IRQ_SOURCE_ID		0
 #define KVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID	1
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index d271e89..f06438c 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -54,6 +54,7 @@
 #include <asm/io.h>
 #include <asm/uaccess.h>
 #include <asm/pgtable.h>
+#include <asm/vmx.h>
 
 #include "coalesced_mmio.h"
 #include "async_pf.h"
@@ -300,6 +301,20 @@ static void kvm_mmu_notifier_invalidate_page(struct mmu_notifier *mn,
 		kvm_make_request(KVM_REQ_MIGRATE_EPT, kvm->vcpus[0]);
 	}
 
+	if (address ==
+	    gfn_to_hva(kvm, VMX_APIC_ACCESS_PAGE_ADDR >> PAGE_SHIFT)) {
+		int i;
+
+		kvm->arch.apic_access_page_migrated = true;
+
+		/*
+		 * We need update APIC_ACCESS_ADDR pointer in each VMCS of
+		 * all the online vcpus.
+		 */
+		for (i = 0; i < atomic_read(&kvm->online_vcpus); i++)
+			kvm_make_request(KVM_REQ_MIGRATE_APIC, kvm->vcpus[i]);
+	}
+
 	spin_unlock(&kvm->mmu_lock);
 	srcu_read_unlock(&kvm->srcu, idx);
 }
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/4] kvm: Add macro VMX_APIC_ACCESS_PAGE_ADDR
  2014-07-02  9:00 ` [PATCH 2/4] kvm: Add macro VMX_APIC_ACCESS_PAGE_ADDR Tang Chen
@ 2014-07-02 16:24   ` Gleb Natapov
  2014-07-03  1:19     ` Tang Chen
  0 siblings, 1 reply; 29+ messages in thread
From: Gleb Natapov @ 2014-07-02 16:24 UTC (permalink / raw)
  To: Tang Chen; +Cc: mtosatti, kvm, laijs, isimatu.yasuaki, guz.fnst, linux-kernel

On Wed, Jul 02, 2014 at 05:00:35PM +0800, Tang Chen wrote:
> Define guest phys_addr of apic access page.
> ---
>  arch/x86/include/asm/vmx.h | 2 +-
>  arch/x86/kvm/svm.c         | 3 ++-
>  arch/x86/kvm/vmx.c         | 7 ++++---
>  3 files changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
> index 7004d21..c4672d1 100644
> --- a/arch/x86/include/asm/vmx.h
> +++ b/arch/x86/include/asm/vmx.h
> @@ -422,7 +422,7 @@ enum vmcs_field {
>  #define VMX_EPT_DIRTY_BIT				(1ull << 9)
>  
>  #define VMX_EPT_IDENTITY_PAGETABLE_ADDR		0xfffbc000ul
> -
> +#define VMX_APIC_ACCESS_PAGE_ADDR		0xfee00000ull
>  
It has nothing to do with VMX and there is already define for that: APIC_DEFAULT_PHYS_BASE

--
			Gleb.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/4] kvm, memory-hotplug: Update ept identity pagetable when it is migrated.
  2014-07-02  9:00 ` [PATCH 3/4] kvm, memory-hotplug: Update ept identity pagetable when it is migrated Tang Chen
@ 2014-07-02 16:34   ` Gleb Natapov
  2014-07-03  1:19     ` Tang Chen
  2014-07-04  2:36     ` Tang Chen
  0 siblings, 2 replies; 29+ messages in thread
From: Gleb Natapov @ 2014-07-02 16:34 UTC (permalink / raw)
  To: Tang Chen; +Cc: mtosatti, kvm, laijs, isimatu.yasuaki, guz.fnst, linux-kernel

On Wed, Jul 02, 2014 at 05:00:36PM +0800, Tang Chen wrote:
> ept identity pagetable is pinned in memory, and as a result it cannot be
> migrated/hot-removed.
> 
> But actually it doesn't need to be pinned in memory.
> 
> This patch introduces a new vcpu request: KVM_REQ_MIGRATE_EPT to reset ept
> indetity pagetable related variable. This request will be made when
> kvm_mmu_notifier_invalidate_page() is called when the page is unmapped
> from the qemu user space to reset kvm->arch.ept_identity_pagetable to NULL.
> And will also be made when ept violation happens to reset
> kvm->arch.ept_identity_pagetable to the new page.

kvm->arch.ept_identity_pagetable is never used as a page address, just
boolean null/!null to see if identity pagetable is initialized. I do
not see why would we want to track its address at all. Changing it to bool
and assigning true during initialization should be enough.


--
			Gleb.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/4] kvm, mem-hotplug: Do not pin ept identity pagetable and apic access page.
  2014-07-02  9:00 [PATCH 0/4] kvm, mem-hotplug: Do not pin ept identity pagetable and apic access page Tang Chen
                   ` (3 preceding siblings ...)
  2014-07-02  9:00 ` [PATCH 4/4] kvm, mem-hotplug: Update apic access page " Tang Chen
@ 2014-07-03  1:17 ` Tang Chen
  2014-07-03  6:04   ` Gleb Natapov
  4 siblings, 1 reply; 29+ messages in thread
From: Tang Chen @ 2014-07-03  1:17 UTC (permalink / raw)
  To: gleb
  Cc: Tang Chen, mtosatti, kvm, laijs, isimatu.yasuaki, guz.fnst, linux-kernel

Hi Gleb,

On 07/02/2014 05:00 PM, Tang Chen wrote:
> Hi Gleb, Marcelo,
>
> Please help to review this patch-set.
>
> NOTE: This patch-set doesn't work properly.
>
>
> ept identity pagetable and apic access page in kvm are pinned in memory.
> As a result, they cannot be migrated/hot-removed.
>
> But actually they don't need to be pinned in memory.
>
> This patch-set introduces two new vcpu requests: KVM_REQ_MIGRATE_EPT and KVM_REQ_MIGRATE_APIC.
> These two requests are made when the two pages are migrated by the mmu_notifier
> to reset the related variable to unusable value. And will also be made when
> ept violation happens to reset new pages.
>
>
> [Known problem]
> After this patch-set applied, the two pages can be migrated/hot-removed.
> But after migrating apic access page, the guest died.
>
> The host physical address of apic access page is stored in VMCS. I reset
> it to 0 to stop guest from accessing it when it is unmapped by
> kvm_mmu_notifier_invalidate_page(). And reset it to new page's host physical
> address in tdp_page_fault(). But it seems that guest will access apic page
> directly by the host physical address.

Would you please to give some advice about this problem ?

Thanks.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/4] kvm: Add macro VMX_APIC_ACCESS_PAGE_ADDR
  2014-07-02 16:24   ` Gleb Natapov
@ 2014-07-03  1:19     ` Tang Chen
  0 siblings, 0 replies; 29+ messages in thread
From: Tang Chen @ 2014-07-03  1:19 UTC (permalink / raw)
  To: Gleb Natapov
  Cc: mtosatti, kvm, laijs, isimatu.yasuaki, guz.fnst, linux-kernel

On 07/03/2014 12:24 AM, Gleb Natapov wrote:
> On Wed, Jul 02, 2014 at 05:00:35PM +0800, Tang Chen wrote:
>> Define guest phys_addr of apic access page.
>> ---
>>   arch/x86/include/asm/vmx.h | 2 +-
>>   arch/x86/kvm/svm.c         | 3 ++-
>>   arch/x86/kvm/vmx.c         | 7 ++++---
>>   3 files changed, 7 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
>> index 7004d21..c4672d1 100644
>> --- a/arch/x86/include/asm/vmx.h
>> +++ b/arch/x86/include/asm/vmx.h
>> @@ -422,7 +422,7 @@ enum vmcs_field {
>>   #define VMX_EPT_DIRTY_BIT				(1ull<<  9)
>>
>>   #define VMX_EPT_IDENTITY_PAGETABLE_ADDR		0xfffbc000ul
>> -
>> +#define VMX_APIC_ACCESS_PAGE_ADDR		0xfee00000ull
>>
> It has nothing to do with VMX and there is already define for that: APIC_DEFAULT_PHYS_BASE

OK, followed.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/4] kvm, memory-hotplug: Update ept identity pagetable when it is migrated.
  2014-07-02 16:34   ` Gleb Natapov
@ 2014-07-03  1:19     ` Tang Chen
  2014-07-04  2:36     ` Tang Chen
  1 sibling, 0 replies; 29+ messages in thread
From: Tang Chen @ 2014-07-03  1:19 UTC (permalink / raw)
  To: Gleb Natapov
  Cc: mtosatti, kvm, laijs, isimatu.yasuaki, guz.fnst, linux-kernel

On 07/03/2014 12:34 AM, Gleb Natapov wrote:
> On Wed, Jul 02, 2014 at 05:00:36PM +0800, Tang Chen wrote:
>> ept identity pagetable is pinned in memory, and as a result it cannot be
>> migrated/hot-removed.
>>
>> But actually it doesn't need to be pinned in memory.
>>
>> This patch introduces a new vcpu request: KVM_REQ_MIGRATE_EPT to reset ept
>> indetity pagetable related variable. This request will be made when
>> kvm_mmu_notifier_invalidate_page() is called when the page is unmapped
>> from the qemu user space to reset kvm->arch.ept_identity_pagetable to NULL.
>> And will also be made when ept violation happens to reset
>> kvm->arch.ept_identity_pagetable to the new page.
>
> kvm->arch.ept_identity_pagetable is never used as a page address, just
> boolean null/!null to see if identity pagetable is initialized. I do
> not see why would we want to track its address at all. Changing it to bool
> and assigning true during initialization should be enough.

OK, followed.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/4] kvm, mem-hotplug: Do not pin ept identity pagetable and apic access page.
  2014-07-03  1:17 ` [PATCH 0/4] kvm, mem-hotplug: Do not pin ept identity pagetable and apic access page Tang Chen
@ 2014-07-03  6:04   ` Gleb Natapov
  2014-07-04  6:41     ` Tang Chen
  0 siblings, 1 reply; 29+ messages in thread
From: Gleb Natapov @ 2014-07-03  6:04 UTC (permalink / raw)
  To: Tang Chen; +Cc: mtosatti, kvm, laijs, isimatu.yasuaki, guz.fnst, linux-kernel

On Thu, Jul 03, 2014 at 09:17:59AM +0800, Tang Chen wrote:
> Hi Gleb,
> 
> On 07/02/2014 05:00 PM, Tang Chen wrote:
> >Hi Gleb, Marcelo,
> >
> >Please help to review this patch-set.
> >
> >NOTE: This patch-set doesn't work properly.
> >
> >
> >ept identity pagetable and apic access page in kvm are pinned in memory.
> >As a result, they cannot be migrated/hot-removed.
> >
> >But actually they don't need to be pinned in memory.
> >
> >This patch-set introduces two new vcpu requests: KVM_REQ_MIGRATE_EPT and KVM_REQ_MIGRATE_APIC.
> >These two requests are made when the two pages are migrated by the mmu_notifier
> >to reset the related variable to unusable value. And will also be made when
> >ept violation happens to reset new pages.
> >
> >
> >[Known problem]
> >After this patch-set applied, the two pages can be migrated/hot-removed.
> >But after migrating apic access page, the guest died.
> >
> >The host physical address of apic access page is stored in VMCS. I reset
> >it to 0 to stop guest from accessing it when it is unmapped by
> >kvm_mmu_notifier_invalidate_page(). And reset it to new page's host physical
> >address in tdp_page_fault(). But it seems that guest will access apic page
> >directly by the host physical address.
> 
> Would you please to give some advice about this problem ?
> 
I haven't reviewed third patch yet, will do ASAP.

--
			Gleb.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 4/4] kvm, mem-hotplug: Update apic access page when it is migrated.
  2014-07-02  9:00 ` [PATCH 4/4] kvm, mem-hotplug: Update apic access page " Tang Chen
@ 2014-07-03 13:55   ` Gleb Natapov
  2014-07-04  2:18     ` Tang Chen
  2014-07-04  2:18     ` Tang Chen
  2014-07-07 10:35   ` Wanpeng Li
  1 sibling, 2 replies; 29+ messages in thread
From: Gleb Natapov @ 2014-07-03 13:55 UTC (permalink / raw)
  To: Tang Chen; +Cc: mtosatti, kvm, laijs, isimatu.yasuaki, guz.fnst, linux-kernel

On Wed, Jul 02, 2014 at 05:00:37PM +0800, Tang Chen wrote:
> apic access page is pinned in memory, and as a result it cannot be
> migrated/hot-removed.
> 
> Actually it doesn't need to be pinned in memory.
> 
> This patch introduces a new vcpu request: KVM_REQ_MIGRATE_EPT. This requet
> will be made when kvm_mmu_notifier_invalidate_page() is called when the page
> is unmapped from the qemu user space to reset APIC_ACCESS_ADDR pointer in
> each online vcpu to 0. And will also be made when ept violation happens to
> reset APIC_ACCESS_ADDR to the new page phys_addr (host phys_addr).
> ---
>  arch/x86/include/asm/kvm_host.h |  2 ++
>  arch/x86/kvm/mmu.c              | 15 +++++++++++++++
>  arch/x86/kvm/vmx.c              |  9 ++++++++-
>  arch/x86/kvm/x86.c              | 20 ++++++++++++++++++++
>  include/linux/kvm_host.h        |  1 +
>  virt/kvm/kvm_main.c             | 15 +++++++++++++++
>  6 files changed, 61 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 8771c0f..f104b87 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -575,6 +575,7 @@ struct kvm_arch {
>  
>  	unsigned int tss_addr;
>  	struct page *apic_access_page;
> +	bool apic_access_page_migrated;
Better have two requests KVM_REQ_APIC_PAGE_MAP, KVM_REQ_APIC_PAGE_UNMAP IMO.

>  
>  	gpa_t wall_clock;
>  
> @@ -739,6 +740,7 @@ struct kvm_x86_ops {
>  	void (*hwapic_isr_update)(struct kvm *kvm, int isr);
>  	void (*load_eoi_exitmap)(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap);
>  	void (*set_virtual_x2apic_mode)(struct kvm_vcpu *vcpu, bool set);
> +	void (*set_apic_access_page_addr)(struct kvm *kvm, hpa_t hpa);
>  	void (*deliver_posted_interrupt)(struct kvm_vcpu *vcpu, int vector);
>  	void (*sync_pir_to_irr)(struct kvm_vcpu *vcpu);
>  	int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index c0d72f6..a655444 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -3436,6 +3436,21 @@ static int tdp_page_fault(struct kvm_vcpu *vcpu, gva_t gpa, u32 error_code,
>  		kvm_make_request(KVM_REQ_MIGRATE_EPT, vcpu);
>  	}
>  
> +	if (gpa == VMX_APIC_ACCESS_PAGE_ADDR &&
> +	    vcpu->kvm->arch.apic_access_page_migrated) {
Why check arch.apic_access_page_migrated here? Isn't it enough that the fault is on apic
address.

> +		int i;
> +
> +		vcpu->kvm->arch.apic_access_page_migrated = false;
> +
> +		/*
> +		 * We need update APIC_ACCESS_ADDR pointer in each VMCS of
> +		 * all the online vcpus.
> +		 */
> +		for (i = 0; i < atomic_read(&vcpu->kvm->online_vcpus); i++)
> +			kvm_make_request(KVM_REQ_MIGRATE_APIC,
> +					 vcpu->kvm->vcpus[i]);
make_all_cpus_request(). You need to kick all vcpus from a guest mode.

> +	}
> +
>  	spin_unlock(&vcpu->kvm->mmu_lock);
>  
>  	return r;
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index c336cb3..abc152f 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -3988,7 +3988,7 @@ static int alloc_apic_access_page(struct kvm *kvm)
>  	if (r)
>  		goto out;
>  
> -	page = gfn_to_page(kvm, VMX_APIC_ACCESS_PAGE_ADDR >> PAGE_SHIFT);
> +	page = gfn_to_page_no_pin(kvm, VMX_APIC_ACCESS_PAGE_ADDR >> PAGE_SHIFT);
>  	if (is_error_page(page)) {
>  		r = -EFAULT;
>  		goto out;
> @@ -7075,6 +7075,12 @@ static void vmx_set_virtual_x2apic_mode(struct kvm_vcpu *vcpu, bool set)
>  	vmx_set_msr_bitmap(vcpu);
>  }
>  
> +static void vmx_set_apic_access_page_addr(struct kvm *kvm, hpa_t hpa)
> +{
> +	if (vm_need_virtualize_apic_accesses(kvm))
This shouldn't even been called if apic access page is not supported. Nor
mmu_notifier path neither tdp_page_fault path should ever see 0xfee00000
address. BUG() is more appropriate here.


> +		vmcs_write64(APIC_ACCESS_ADDR, hpa);
> +}
> +
>  static void vmx_hwapic_isr_update(struct kvm *kvm, int isr)
>  {
>  	u16 status;
> @@ -8846,6 +8852,7 @@ static struct kvm_x86_ops vmx_x86_ops = {
>  	.enable_irq_window = enable_irq_window,
>  	.update_cr8_intercept = update_cr8_intercept,
>  	.set_virtual_x2apic_mode = vmx_set_virtual_x2apic_mode,
> +	.set_apic_access_page_addr = vmx_set_apic_access_page_addr,
svm needs that too.

>  	.vm_has_apicv = vmx_vm_has_apicv,
>  	.load_eoi_exitmap = vmx_load_eoi_exitmap,
>  	.hwapic_irr_update = vmx_hwapic_irr_update,
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index a26524f..14e7174 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -5943,6 +5943,24 @@ static void vcpu_migrated_page_update_ept(struct kvm_vcpu *vcpu)
>  	}
>  }
>  
> +static void vcpu_migrated_page_update_apic(struct kvm_vcpu *vcpu)
> +{
> +	struct kvm *kvm = vcpu->kvm;
> +
> +	if (kvm->arch.apic_access_page_migrated) {
> +		if (kvm->arch.apic_access_page)
> +			kvm->arch.apic_access_page = pfn_to_page(0);
All vcpus will access apic_access_page without locking here. May be
set kvm->arch.apic_access_page to zero in mmu_notifier and here call
 kvm_x86_ops->set_apic_access_page_addr(kvm, kvm->arch.apic_access_page);


> +		kvm_x86_ops->set_apic_access_page_addr(kvm, 0x0ull);
> +	} else {
> +		struct page *page;
> +		page = gfn_to_page_no_pin(kvm,
> +				VMX_APIC_ACCESS_PAGE_ADDR >> PAGE_SHIFT);
> +		kvm->arch.apic_access_page = page;
Same, set it during tdp fault when page is mapped.

> +		kvm_x86_ops->set_apic_access_page_addr(kvm,
> +						       page_to_phys(page));
> +	}
> +}
> +
>  /*
>   * Returns 1 to let __vcpu_run() continue the guest execution loop without
>   * exiting to the userspace.  Otherwise, the value will be returned to the
> @@ -6005,6 +6023,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>  			vcpu_scan_ioapic(vcpu);
>  		if (kvm_check_request(KVM_REQ_MIGRATE_EPT, vcpu))
>  			vcpu_migrated_page_update_ept(vcpu);
> +		if (kvm_check_request(KVM_REQ_MIGRATE_APIC, vcpu))
> +			vcpu_migrated_page_update_apic(vcpu);
>  	}
>  
>  	if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 4b7e51a..e2ad65e 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -137,6 +137,7 @@ static inline bool is_error_page(struct page *page)
>  #define KVM_REQ_ENABLE_IBS        23
>  #define KVM_REQ_DISABLE_IBS       24
>  #define KVM_REQ_MIGRATE_EPT       25
> +#define KVM_REQ_MIGRATE_APIC      26
>  
>  #define KVM_USERSPACE_IRQ_SOURCE_ID		0
>  #define KVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID	1
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index d271e89..f06438c 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -54,6 +54,7 @@
>  #include <asm/io.h>
>  #include <asm/uaccess.h>
>  #include <asm/pgtable.h>
> +#include <asm/vmx.h>
>  
>  #include "coalesced_mmio.h"
>  #include "async_pf.h"
> @@ -300,6 +301,20 @@ static void kvm_mmu_notifier_invalidate_page(struct mmu_notifier *mn,
>  		kvm_make_request(KVM_REQ_MIGRATE_EPT, kvm->vcpus[0]);
>  	}
>  
> +	if (address ==
> +	    gfn_to_hva(kvm, VMX_APIC_ACCESS_PAGE_ADDR >> PAGE_SHIFT)) {
> +		int i;
> +
> +		kvm->arch.apic_access_page_migrated = true;
> +
> +		/*
> +		 * We need update APIC_ACCESS_ADDR pointer in each VMCS of
> +		 * all the online vcpus.
> +		 */
> +		for (i = 0; i < atomic_read(&kvm->online_vcpus); i++)
> +			kvm_make_request(KVM_REQ_MIGRATE_APIC, kvm->vcpus[i]);
> +	}
make_all_cpus_request()

Also you need to drop put_page(kvm->arch.apic_access_page); from x86.c
> +
>  	spin_unlock(&kvm->mmu_lock);
>  	srcu_read_unlock(&kvm->srcu, idx);
>  }
> -- 
> 1.8.3.1
> 

--
			Gleb.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 4/4] kvm, mem-hotplug: Update apic access page when it is migrated.
  2014-07-03 13:55   ` Gleb Natapov
@ 2014-07-04  2:18     ` Tang Chen
  2014-07-04  2:18     ` Tang Chen
  1 sibling, 0 replies; 29+ messages in thread
From: Tang Chen @ 2014-07-04  2:18 UTC (permalink / raw)
  To: Gleb Natapov
  Cc: mtosatti, kvm, laijs, isimatu.yasuaki, guz.fnst, linux-kernel

Hi Gleb,

Thanks for the advices. Please see below.

On 07/03/2014 09:55 PM, Gleb Natapov wrote:
......
>> @@ -575,6 +575,7 @@ struct kvm_arch {
>>
>>   	unsigned int tss_addr;
>>   	struct page *apic_access_page;
>> +	bool apic_access_page_migrated;
> Better have two requests KVM_REQ_APIC_PAGE_MAP, KVM_REQ_APIC_PAGE_UNMAP IMO.
>

vcpu->requests is an unsigned long, and we can only has 64 requests. Isn't
adding two requests for apic page and another similar two for ept page too
many ?  Not sure.

>>
>>   	gpa_t wall_clock;
>>
>> @@ -739,6 +740,7 @@ struct kvm_x86_ops {
>>   	void (*hwapic_isr_update)(struct kvm *kvm, int isr);
>>   	void (*load_eoi_exitmap)(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap);
>>   	void (*set_virtual_x2apic_mode)(struct kvm_vcpu *vcpu, bool set);
>> +	void (*set_apic_access_page_addr)(struct kvm *kvm, hpa_t hpa);
>>   	void (*deliver_posted_interrupt)(struct kvm_vcpu *vcpu, int vector);
>>   	void (*sync_pir_to_irr)(struct kvm_vcpu *vcpu);
>>   	int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
>> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
>> index c0d72f6..a655444 100644
>> --- a/arch/x86/kvm/mmu.c
>> +++ b/arch/x86/kvm/mmu.c
>> @@ -3436,6 +3436,21 @@ static int tdp_page_fault(struct kvm_vcpu *vcpu, gva_t gpa, u32 error_code,
>>   		kvm_make_request(KVM_REQ_MIGRATE_EPT, vcpu);
>>   	}
>>
>> +	if (gpa == VMX_APIC_ACCESS_PAGE_ADDR&&
>> +	    vcpu->kvm->arch.apic_access_page_migrated) {
> Why check arch.apic_access_page_migrated here? Isn't it enough that the fault is on apic
> address.
>

True. It's enough. Followed.

>> +		int i;
>> +
>> +		vcpu->kvm->arch.apic_access_page_migrated = false;
>> +
>> +		/*
>> +		 * We need update APIC_ACCESS_ADDR pointer in each VMCS of
>> +		 * all the online vcpus.
>> +		 */
>> +		for (i = 0; i<  atomic_read(&vcpu->kvm->online_vcpus); i++)
>> +			kvm_make_request(KVM_REQ_MIGRATE_APIC,
>> +					 vcpu->kvm->vcpus[i]);
> make_all_cpus_request(). You need to kick all vcpus from a guest mode.
>

OK, followed. But would you please explain more about this. :)
Why need to kick all vcpus from guest mode when making request to all 
vcpus ?

>> +	}
>> +
>>   	spin_unlock(&vcpu->kvm->mmu_lock);
>>
>>   	return r;
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index c336cb3..abc152f 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -3988,7 +3988,7 @@ static int alloc_apic_access_page(struct kvm *kvm)
>>   	if (r)
>>   		goto out;
>>
>> -	page = gfn_to_page(kvm, VMX_APIC_ACCESS_PAGE_ADDR>>  PAGE_SHIFT);
>> +	page = gfn_to_page_no_pin(kvm, VMX_APIC_ACCESS_PAGE_ADDR>>  PAGE_SHIFT);
>>   	if (is_error_page(page)) {
>>   		r = -EFAULT;
>>   		goto out;
>> @@ -7075,6 +7075,12 @@ static void vmx_set_virtual_x2apic_mode(struct kvm_vcpu *vcpu, bool set)
>>   	vmx_set_msr_bitmap(vcpu);
>>   }
>>
>> +static void vmx_set_apic_access_page_addr(struct kvm *kvm, hpa_t hpa)
>> +{
>> +	if (vm_need_virtualize_apic_accesses(kvm))
> This shouldn't even been called if apic access page is not supported. Nor
> mmu_notifier path neither tdp_page_fault path should ever see 0xfee00000
> address. BUG() is more appropriate here.
>

I don't quite understand. Why calling this function here will leed to bug ?
(Sorry, I'm not quite understand the internal of KVM. Please help.)

>
>> +		vmcs_write64(APIC_ACCESS_ADDR, hpa);
>> +}
>> +
>>   static void vmx_hwapic_isr_update(struct kvm *kvm, int isr)
>>   {
>>   	u16 status;
>> @@ -8846,6 +8852,7 @@ static struct kvm_x86_ops vmx_x86_ops = {
>>   	.enable_irq_window = enable_irq_window,
>>   	.update_cr8_intercept = update_cr8_intercept,
>>   	.set_virtual_x2apic_mode = vmx_set_virtual_x2apic_mode,
>> +	.set_apic_access_page_addr = vmx_set_apic_access_page_addr,
> svm needs that too.
>

OK, will add one for svm.

>>   	.vm_has_apicv = vmx_vm_has_apicv,
>>   	.load_eoi_exitmap = vmx_load_eoi_exitmap,
>>   	.hwapic_irr_update = vmx_hwapic_irr_update,
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index a26524f..14e7174 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -5943,6 +5943,24 @@ static void vcpu_migrated_page_update_ept(struct kvm_vcpu *vcpu)
>>   	}
>>   }
>>
>> +static void vcpu_migrated_page_update_apic(struct kvm_vcpu *vcpu)
>> +{
>> +	struct kvm *kvm = vcpu->kvm;
>> +
>> +	if (kvm->arch.apic_access_page_migrated) {
>> +		if (kvm->arch.apic_access_page)
>> +			kvm->arch.apic_access_page = pfn_to_page(0);
> All vcpus will access apic_access_page without locking here. May be
> set kvm->arch.apic_access_page to zero in mmu_notifier and here call
>   kvm_x86_ops->set_apic_access_page_addr(kvm, kvm->arch.apic_access_page);
>

I'm a little confused. apic access page's phys_addr is stored in vmcs, and
I think it will be used by vcpu directly to access the physical page.
Setting kvm->arch.apic_access_page to zero will not stop it, right ?

I'm wondering what happens when apic page is migrated, but the vmcs is 
still
holding its old phys_addr before the vcpu request is handled.

>
>> +		kvm_x86_ops->set_apic_access_page_addr(kvm, 0x0ull);
>> +	} else {
>> +		struct page *page;
>> +		page = gfn_to_page_no_pin(kvm,
>> +				VMX_APIC_ACCESS_PAGE_ADDR>>  PAGE_SHIFT);
>> +		kvm->arch.apic_access_page = page;
> Same, set it during tdp fault when page is mapped.
>
>> +		kvm_x86_ops->set_apic_access_page_addr(kvm,
>> +						       page_to_phys(page));
>> +	}
>> +}
>> +
>>   /*
>>    * Returns 1 to let __vcpu_run() continue the guest execution loop without
>>    * exiting to the userspace.  Otherwise, the value will be returned to the
>> @@ -6005,6 +6023,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>>   			vcpu_scan_ioapic(vcpu);
>>   		if (kvm_check_request(KVM_REQ_MIGRATE_EPT, vcpu))
>>   			vcpu_migrated_page_update_ept(vcpu);
>> +		if (kvm_check_request(KVM_REQ_MIGRATE_APIC, vcpu))
>> +			vcpu_migrated_page_update_apic(vcpu);
>>   	}
>>
>>   	if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
>> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
>> index 4b7e51a..e2ad65e 100644
>> --- a/include/linux/kvm_host.h
>> +++ b/include/linux/kvm_host.h
>> @@ -137,6 +137,7 @@ static inline bool is_error_page(struct page *page)
>>   #define KVM_REQ_ENABLE_IBS        23
>>   #define KVM_REQ_DISABLE_IBS       24
>>   #define KVM_REQ_MIGRATE_EPT       25
>> +#define KVM_REQ_MIGRATE_APIC      26
>>
>>   #define KVM_USERSPACE_IRQ_SOURCE_ID		0
>>   #define KVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID	1
>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>> index d271e89..f06438c 100644
>> --- a/virt/kvm/kvm_main.c
>> +++ b/virt/kvm/kvm_main.c
>> @@ -54,6 +54,7 @@
>>   #include<asm/io.h>
>>   #include<asm/uaccess.h>
>>   #include<asm/pgtable.h>
>> +#include<asm/vmx.h>
>>
>>   #include "coalesced_mmio.h"
>>   #include "async_pf.h"
>> @@ -300,6 +301,20 @@ static void kvm_mmu_notifier_invalidate_page(struct mmu_notifier *mn,
>>   		kvm_make_request(KVM_REQ_MIGRATE_EPT, kvm->vcpus[0]);
>>   	}
>>
>> +	if (address ==
>> +	    gfn_to_hva(kvm, VMX_APIC_ACCESS_PAGE_ADDR>>  PAGE_SHIFT)) {
>> +		int i;
>> +
>> +		kvm->arch.apic_access_page_migrated = true;
>> +
>> +		/*
>> +		 * We need update APIC_ACCESS_ADDR pointer in each VMCS of
>> +		 * all the online vcpus.
>> +		 */
>> +		for (i = 0; i<  atomic_read(&kvm->online_vcpus); i++)
>> +			kvm_make_request(KVM_REQ_MIGRATE_APIC, kvm->vcpus[i]);
>> +	}
> make_all_cpus_request()
>
> Also you need to drop put_page(kvm->arch.apic_access_page); from x86.c

Yes. Forgot it.

Thanks. :)

>> +
>>   	spin_unlock(&kvm->mmu_lock);
>>   	srcu_read_unlock(&kvm->srcu, idx);
>>   }
>> --
>> 1.8.3.1
>>
>
> --
> 			Gleb.
> .
>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 4/4] kvm, mem-hotplug: Update apic access page when it is migrated.
  2014-07-03 13:55   ` Gleb Natapov
  2014-07-04  2:18     ` Tang Chen
@ 2014-07-04  2:18     ` Tang Chen
  2014-07-04 10:13       ` Gleb Natapov
  1 sibling, 1 reply; 29+ messages in thread
From: Tang Chen @ 2014-07-04  2:18 UTC (permalink / raw)
  To: Gleb Natapov
  Cc: mtosatti, kvm, laijs, isimatu.yasuaki, guz.fnst, linux-kernel, Tang Chen

Hi Gleb,

Thanks for the advices. Please see below.

On 07/03/2014 09:55 PM, Gleb Natapov wrote:
......
>> @@ -575,6 +575,7 @@ struct kvm_arch {
>>
>>   	unsigned int tss_addr;
>>   	struct page *apic_access_page;
>> +	bool apic_access_page_migrated;
> Better have two requests KVM_REQ_APIC_PAGE_MAP, KVM_REQ_APIC_PAGE_UNMAP IMO.
>

vcpu->requests is an unsigned long, and we can only has 64 requests. Isn't
adding two requests for apic page and another similar two for ept page too
many ?  Not sure.

>>
>>   	gpa_t wall_clock;
>>
>> @@ -739,6 +740,7 @@ struct kvm_x86_ops {
>>   	void (*hwapic_isr_update)(struct kvm *kvm, int isr);
>>   	void (*load_eoi_exitmap)(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap);
>>   	void (*set_virtual_x2apic_mode)(struct kvm_vcpu *vcpu, bool set);
>> +	void (*set_apic_access_page_addr)(struct kvm *kvm, hpa_t hpa);
>>   	void (*deliver_posted_interrupt)(struct kvm_vcpu *vcpu, int vector);
>>   	void (*sync_pir_to_irr)(struct kvm_vcpu *vcpu);
>>   	int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
>> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
>> index c0d72f6..a655444 100644
>> --- a/arch/x86/kvm/mmu.c
>> +++ b/arch/x86/kvm/mmu.c
>> @@ -3436,6 +3436,21 @@ static int tdp_page_fault(struct kvm_vcpu *vcpu, gva_t gpa, u32 error_code,
>>   		kvm_make_request(KVM_REQ_MIGRATE_EPT, vcpu);
>>   	}
>>
>> +	if (gpa == VMX_APIC_ACCESS_PAGE_ADDR&&
>> +	    vcpu->kvm->arch.apic_access_page_migrated) {
> Why check arch.apic_access_page_migrated here? Isn't it enough that the fault is on apic
> address.
>

True. It's enough. Followed.

>> +		int i;
>> +
>> +		vcpu->kvm->arch.apic_access_page_migrated = false;
>> +
>> +		/*
>> +		 * We need update APIC_ACCESS_ADDR pointer in each VMCS of
>> +		 * all the online vcpus.
>> +		 */
>> +		for (i = 0; i<  atomic_read(&vcpu->kvm->online_vcpus); i++)
>> +			kvm_make_request(KVM_REQ_MIGRATE_APIC,
>> +					 vcpu->kvm->vcpus[i]);
> make_all_cpus_request(). You need to kick all vcpus from a guest mode.
>

OK, followed. But would you please explain more about this. :)
Why need to kick all vcpus from guest mode when making request to all 
vcpus ?

>> +	}
>> +
>>   	spin_unlock(&vcpu->kvm->mmu_lock);
>>
>>   	return r;
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index c336cb3..abc152f 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -3988,7 +3988,7 @@ static int alloc_apic_access_page(struct kvm *kvm)
>>   	if (r)
>>   		goto out;
>>
>> -	page = gfn_to_page(kvm, VMX_APIC_ACCESS_PAGE_ADDR>>  PAGE_SHIFT);
>> +	page = gfn_to_page_no_pin(kvm, VMX_APIC_ACCESS_PAGE_ADDR>>  PAGE_SHIFT);
>>   	if (is_error_page(page)) {
>>   		r = -EFAULT;
>>   		goto out;
>> @@ -7075,6 +7075,12 @@ static void vmx_set_virtual_x2apic_mode(struct kvm_vcpu *vcpu, bool set)
>>   	vmx_set_msr_bitmap(vcpu);
>>   }
>>
>> +static void vmx_set_apic_access_page_addr(struct kvm *kvm, hpa_t hpa)
>> +{
>> +	if (vm_need_virtualize_apic_accesses(kvm))
> This shouldn't even been called if apic access page is not supported. Nor
> mmu_notifier path neither tdp_page_fault path should ever see 0xfee00000
> address. BUG() is more appropriate here.
>

I don't quite understand. Why calling this function here will leed to bug ?
(Sorry, I'm not quite understand the internal of KVM. Please help.)

>
>> +		vmcs_write64(APIC_ACCESS_ADDR, hpa);
>> +}
>> +
>>   static void vmx_hwapic_isr_update(struct kvm *kvm, int isr)
>>   {
>>   	u16 status;
>> @@ -8846,6 +8852,7 @@ static struct kvm_x86_ops vmx_x86_ops = {
>>   	.enable_irq_window = enable_irq_window,
>>   	.update_cr8_intercept = update_cr8_intercept,
>>   	.set_virtual_x2apic_mode = vmx_set_virtual_x2apic_mode,
>> +	.set_apic_access_page_addr = vmx_set_apic_access_page_addr,
> svm needs that too.
>

OK, will add one for svm.

>>   	.vm_has_apicv = vmx_vm_has_apicv,
>>   	.load_eoi_exitmap = vmx_load_eoi_exitmap,
>>   	.hwapic_irr_update = vmx_hwapic_irr_update,
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index a26524f..14e7174 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -5943,6 +5943,24 @@ static void vcpu_migrated_page_update_ept(struct kvm_vcpu *vcpu)
>>   	}
>>   }
>>
>> +static void vcpu_migrated_page_update_apic(struct kvm_vcpu *vcpu)
>> +{
>> +	struct kvm *kvm = vcpu->kvm;
>> +
>> +	if (kvm->arch.apic_access_page_migrated) {
>> +		if (kvm->arch.apic_access_page)
>> +			kvm->arch.apic_access_page = pfn_to_page(0);
> All vcpus will access apic_access_page without locking here. May be
> set kvm->arch.apic_access_page to zero in mmu_notifier and here call
>   kvm_x86_ops->set_apic_access_page_addr(kvm, kvm->arch.apic_access_page);
>

I'm a little confused. apic access page's phys_addr is stored in vmcs, and
I think it will be used by vcpu directly to access the physical page.
Setting kvm->arch.apic_access_page to zero will not stop it, right ?

I'm wondering what happens when apic page is migrated, but the vmcs is 
still
holding its old phys_addr before the vcpu request is handled.

>
>> +		kvm_x86_ops->set_apic_access_page_addr(kvm, 0x0ull);
>> +	} else {
>> +		struct page *page;
>> +		page = gfn_to_page_no_pin(kvm,
>> +				VMX_APIC_ACCESS_PAGE_ADDR>>  PAGE_SHIFT);
>> +		kvm->arch.apic_access_page = page;
> Same, set it during tdp fault when page is mapped.
>
>> +		kvm_x86_ops->set_apic_access_page_addr(kvm,
>> +						       page_to_phys(page));
>> +	}
>> +}
>> +
>>   /*
>>    * Returns 1 to let __vcpu_run() continue the guest execution loop without
>>    * exiting to the userspace.  Otherwise, the value will be returned to the
>> @@ -6005,6 +6023,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>>   			vcpu_scan_ioapic(vcpu);
>>   		if (kvm_check_request(KVM_REQ_MIGRATE_EPT, vcpu))
>>   			vcpu_migrated_page_update_ept(vcpu);
>> +		if (kvm_check_request(KVM_REQ_MIGRATE_APIC, vcpu))
>> +			vcpu_migrated_page_update_apic(vcpu);
>>   	}
>>
>>   	if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
>> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
>> index 4b7e51a..e2ad65e 100644
>> --- a/include/linux/kvm_host.h
>> +++ b/include/linux/kvm_host.h
>> @@ -137,6 +137,7 @@ static inline bool is_error_page(struct page *page)
>>   #define KVM_REQ_ENABLE_IBS        23
>>   #define KVM_REQ_DISABLE_IBS       24
>>   #define KVM_REQ_MIGRATE_EPT       25
>> +#define KVM_REQ_MIGRATE_APIC      26
>>
>>   #define KVM_USERSPACE_IRQ_SOURCE_ID		0
>>   #define KVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID	1
>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>> index d271e89..f06438c 100644
>> --- a/virt/kvm/kvm_main.c
>> +++ b/virt/kvm/kvm_main.c
>> @@ -54,6 +54,7 @@
>>   #include<asm/io.h>
>>   #include<asm/uaccess.h>
>>   #include<asm/pgtable.h>
>> +#include<asm/vmx.h>
>>
>>   #include "coalesced_mmio.h"
>>   #include "async_pf.h"
>> @@ -300,6 +301,20 @@ static void kvm_mmu_notifier_invalidate_page(struct mmu_notifier *mn,
>>   		kvm_make_request(KVM_REQ_MIGRATE_EPT, kvm->vcpus[0]);
>>   	}
>>
>> +	if (address ==
>> +	    gfn_to_hva(kvm, VMX_APIC_ACCESS_PAGE_ADDR>>  PAGE_SHIFT)) {
>> +		int i;
>> +
>> +		kvm->arch.apic_access_page_migrated = true;
>> +
>> +		/*
>> +		 * We need update APIC_ACCESS_ADDR pointer in each VMCS of
>> +		 * all the online vcpus.
>> +		 */
>> +		for (i = 0; i<  atomic_read(&kvm->online_vcpus); i++)
>> +			kvm_make_request(KVM_REQ_MIGRATE_APIC, kvm->vcpus[i]);
>> +	}
> make_all_cpus_request()
>
> Also you need to drop put_page(kvm->arch.apic_access_page); from x86.c

Yes. Forgot it.

Thanks. :)

>> +
>>   	spin_unlock(&kvm->mmu_lock);
>>   	srcu_read_unlock(&kvm->srcu, idx);
>>   }
>> --
>> 1.8.3.1
>>
>
> --
> 			Gleb.
> .
>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/4] kvm, memory-hotplug: Update ept identity pagetable when it is migrated.
  2014-07-02 16:34   ` Gleb Natapov
  2014-07-03  1:19     ` Tang Chen
@ 2014-07-04  2:36     ` Tang Chen
  2014-07-04  9:49       ` Gleb Natapov
  1 sibling, 1 reply; 29+ messages in thread
From: Tang Chen @ 2014-07-04  2:36 UTC (permalink / raw)
  To: Gleb Natapov
  Cc: mtosatti, kvm, laijs, isimatu.yasuaki, guz.fnst, linux-kernel

Hi Gleb,

On 07/03/2014 12:34 AM, Gleb Natapov wrote:
> On Wed, Jul 02, 2014 at 05:00:36PM +0800, Tang Chen wrote:
>> ept identity pagetable is pinned in memory, and as a result it cannot be
>> migrated/hot-removed.
>>
>> But actually it doesn't need to be pinned in memory.
>>
>> This patch introduces a new vcpu request: KVM_REQ_MIGRATE_EPT to reset ept
>> indetity pagetable related variable. This request will be made when
>> kvm_mmu_notifier_invalidate_page() is called when the page is unmapped
>> from the qemu user space to reset kvm->arch.ept_identity_pagetable to NULL.
>> And will also be made when ept violation happens to reset
>> kvm->arch.ept_identity_pagetable to the new page.
>
> kvm->arch.ept_identity_pagetable is never used as a page address, just
> boolean null/!null to see if identity pagetable is initialized. I do
> not see why would we want to track its address at all. Changing it to bool
> and assigning true during initialization should be enough.

We already have kvm->arch.ept_identity_pagetable_done to indicate if the 
ept
identity table is initialized. If we make 
kvm->arch.ept_identity_pagetable to
bool, do you mean we have:

kvm->arch.ept_identity_pagetable: indicate if ept page is allocated,
kvm->arch.ept_identity_pagetable_done: indicate if ept page is initialized ?

I don't think we need this. Shall we remove 
kvm->arch.ept_identity_pagetable ?

Thanks.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/4] kvm, mem-hotplug: Do not pin ept identity pagetable and apic access page.
  2014-07-03  6:04   ` Gleb Natapov
@ 2014-07-04  6:41     ` Tang Chen
  0 siblings, 0 replies; 29+ messages in thread
From: Tang Chen @ 2014-07-04  6:41 UTC (permalink / raw)
  To: Gleb Natapov
  Cc: mtosatti, kvm, laijs, isimatu.yasuaki, guz.fnst, linux-kernel, Tang Chen

Hi Gleb,

On 07/03/2014 02:04 PM, Gleb Natapov wrote:
> On Thu, Jul 03, 2014 at 09:17:59AM +0800, Tang Chen wrote:
>> Hi Gleb,
>>
>> On 07/02/2014 05:00 PM, Tang Chen wrote:
>>> Hi Gleb, Marcelo,
>>>
>>> Please help to review this patch-set.
>>>
>>> NOTE: This patch-set doesn't work properly.
>>>
>>>
>>> ept identity pagetable and apic access page in kvm are pinned in memory.
>>> As a result, they cannot be migrated/hot-removed.
>>>
>>> But actually they don't need to be pinned in memory.
>>>
>>> This patch-set introduces two new vcpu requests: KVM_REQ_MIGRATE_EPT and KVM_REQ_MIGRATE_APIC.
>>> These two requests are made when the two pages are migrated by the mmu_notifier
>>> to reset the related variable to unusable value. And will also be made when
>>> ept violation happens to reset new pages.
>>>
>>>
>>> [Known problem]
>>> After this patch-set applied, the two pages can be migrated/hot-removed.
>>> But after migrating apic access page, the guest died.
>>>
>>> The host physical address of apic access page is stored in VMCS. I reset
>>> it to 0 to stop guest from accessing it when it is unmapped by
>>> kvm_mmu_notifier_invalidate_page(). And reset it to new page's host physical
>>> address in tdp_page_fault(). But it seems that guest will access apic page
>>> directly by the host physical address.
>>
>> Would you please to give some advice about this problem ?
>>
> I haven't reviewed third patch yet, will do ASAP.
>

I printed some info in the kernel, and I found that mmu_notifier 
unmapped the
apic page and set VMCS APIC_ACCESS_ADDR to 0. But apic page ept 
violation didn't
happen. And the guest stopped running.

I think when guest tried to access apic page, there was no ept violation 
happened.
And as a result, VMCS APIC_ACCESS_ADDR was not correctly set.

Referring to Intel Software Developer's Manuel Vol 3B, when accessing 
apic page
using translation with a large page (2M, 4M, 1G), APIC VM_exit will not 
happen.

How do you think about this ?

Thanks. :)





^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/4] kvm, memory-hotplug: Update ept identity pagetable when it is migrated.
  2014-07-04  2:36     ` Tang Chen
@ 2014-07-04  9:49       ` Gleb Natapov
  0 siblings, 0 replies; 29+ messages in thread
From: Gleb Natapov @ 2014-07-04  9:49 UTC (permalink / raw)
  To: Tang Chen; +Cc: mtosatti, kvm, laijs, isimatu.yasuaki, guz.fnst, linux-kernel

On Fri, Jul 04, 2014 at 10:36:06AM +0800, Tang Chen wrote:
> Hi Gleb,
> 
> On 07/03/2014 12:34 AM, Gleb Natapov wrote:
> >On Wed, Jul 02, 2014 at 05:00:36PM +0800, Tang Chen wrote:
> >>ept identity pagetable is pinned in memory, and as a result it cannot be
> >>migrated/hot-removed.
> >>
> >>But actually it doesn't need to be pinned in memory.
> >>
> >>This patch introduces a new vcpu request: KVM_REQ_MIGRATE_EPT to reset ept
> >>indetity pagetable related variable. This request will be made when
> >>kvm_mmu_notifier_invalidate_page() is called when the page is unmapped
> >>from the qemu user space to reset kvm->arch.ept_identity_pagetable to NULL.
> >>And will also be made when ept violation happens to reset
> >>kvm->arch.ept_identity_pagetable to the new page.
> >
> >kvm->arch.ept_identity_pagetable is never used as a page address, just
> >boolean null/!null to see if identity pagetable is initialized. I do
> >not see why would we want to track its address at all. Changing it to bool
> >and assigning true during initialization should be enough.
> 
> We already have kvm->arch.ept_identity_pagetable_done to indicate if the ept
> identity table is initialized. If we make kvm->arch.ept_identity_pagetable
> to
> bool, do you mean we have:
> 
> kvm->arch.ept_identity_pagetable: indicate if ept page is allocated,
> kvm->arch.ept_identity_pagetable_done: indicate if ept page is initialized ?
> 
ept_identity_pagetable also means that a memory slot for identity page is initialized.

> I don't think we need this. Shall we remove kvm->arch.ept_identity_pagetable
> ?
> 
May be those two can be consolidated somehow, but lets leave it to a separate patch
to make this one simpler.

--
			Gleb.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 4/4] kvm, mem-hotplug: Update apic access page when it is migrated.
  2014-07-04  2:18     ` Tang Chen
@ 2014-07-04 10:13       ` Gleb Natapov
  2014-07-07  6:17         ` Tang Chen
  2014-07-07  9:52         ` Tang Chen
  0 siblings, 2 replies; 29+ messages in thread
From: Gleb Natapov @ 2014-07-04 10:13 UTC (permalink / raw)
  To: Tang Chen; +Cc: mtosatti, kvm, laijs, isimatu.yasuaki, guz.fnst, linux-kernel

On Fri, Jul 04, 2014 at 10:18:25AM +0800, Tang Chen wrote:
> Hi Gleb,
> 
> Thanks for the advices. Please see below.
> 
> On 07/03/2014 09:55 PM, Gleb Natapov wrote:
> ......
> >>@@ -575,6 +575,7 @@ struct kvm_arch {
> >>
> >>  	unsigned int tss_addr;
> >>  	struct page *apic_access_page;
> >>+	bool apic_access_page_migrated;
> >Better have two requests KVM_REQ_APIC_PAGE_MAP, KVM_REQ_APIC_PAGE_UNMAP IMO.
> >
> 
> vcpu->requests is an unsigned long, and we can only has 64 requests. Isn't
> adding two requests for apic page and another similar two for ept page too
> many ?  Not sure.
> 
Lets not worry about that for now. May be it is enough to have only one
KVM_REQ_APIC_PAGE_RELOAD request set apic_access_page to a new value
before sending the request and reload whatever is in apic_access_page
during KVM_REQ_APIC_PAGE_RELOAD processing. Or we can even reload
apic_access_page as part of mmu reload and reuse KVM_REQ_MMU_RELOAD.

> >>
> >>  	gpa_t wall_clock;
> >>
> >>@@ -739,6 +740,7 @@ struct kvm_x86_ops {
> >>  	void (*hwapic_isr_update)(struct kvm *kvm, int isr);
> >>  	void (*load_eoi_exitmap)(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap);
> >>  	void (*set_virtual_x2apic_mode)(struct kvm_vcpu *vcpu, bool set);
> >>+	void (*set_apic_access_page_addr)(struct kvm *kvm, hpa_t hpa);
> >>  	void (*deliver_posted_interrupt)(struct kvm_vcpu *vcpu, int vector);
> >>  	void (*sync_pir_to_irr)(struct kvm_vcpu *vcpu);
> >>  	int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
> >>diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> >>index c0d72f6..a655444 100644
> >>--- a/arch/x86/kvm/mmu.c
> >>+++ b/arch/x86/kvm/mmu.c
> >>@@ -3436,6 +3436,21 @@ static int tdp_page_fault(struct kvm_vcpu *vcpu, gva_t gpa, u32 error_code,
> >>  		kvm_make_request(KVM_REQ_MIGRATE_EPT, vcpu);
> >>  	}
> >>
> >>+	if (gpa == VMX_APIC_ACCESS_PAGE_ADDR&&
> >>+	    vcpu->kvm->arch.apic_access_page_migrated) {
> >Why check arch.apic_access_page_migrated here? Isn't it enough that the fault is on apic
> >address.
> >
> 
> True. It's enough. Followed.
> 
> >>+		int i;
> >>+
> >>+		vcpu->kvm->arch.apic_access_page_migrated = false;
> >>+
> >>+		/*
> >>+		 * We need update APIC_ACCESS_ADDR pointer in each VMCS of
> >>+		 * all the online vcpus.
> >>+		 */
> >>+		for (i = 0; i<  atomic_read(&vcpu->kvm->online_vcpus); i++)
> >>+			kvm_make_request(KVM_REQ_MIGRATE_APIC,
> >>+					 vcpu->kvm->vcpus[i]);
> >make_all_cpus_request(). You need to kick all vcpus from a guest mode.
> >
> 
> OK, followed. But would you please explain more about this. :)
> Why need to kick all vcpus from guest mode when making request to all vcpus
> ?
Because if you do not force other vcpus from a guest mode they will not reload
apic_access_page value till next vmexit, but since EPT page table now has a mapping
for 0xfee00000 access to this address will not cause EPT violation and will not cause
apic exit either.

> 
> >>+	}
> >>+
> >>  	spin_unlock(&vcpu->kvm->mmu_lock);
> >>
> >>  	return r;
> >>diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> >>index c336cb3..abc152f 100644
> >>--- a/arch/x86/kvm/vmx.c
> >>+++ b/arch/x86/kvm/vmx.c
> >>@@ -3988,7 +3988,7 @@ static int alloc_apic_access_page(struct kvm *kvm)
> >>  	if (r)
> >>  		goto out;
> >>
> >>-	page = gfn_to_page(kvm, VMX_APIC_ACCESS_PAGE_ADDR>>  PAGE_SHIFT);
> >>+	page = gfn_to_page_no_pin(kvm, VMX_APIC_ACCESS_PAGE_ADDR>>  PAGE_SHIFT);
> >>  	if (is_error_page(page)) {
> >>  		r = -EFAULT;
> >>  		goto out;
> >>@@ -7075,6 +7075,12 @@ static void vmx_set_virtual_x2apic_mode(struct kvm_vcpu *vcpu, bool set)
> >>  	vmx_set_msr_bitmap(vcpu);
> >>  }
> >>
> >>+static void vmx_set_apic_access_page_addr(struct kvm *kvm, hpa_t hpa)
> >>+{
> >>+	if (vm_need_virtualize_apic_accesses(kvm))
> >This shouldn't even been called if apic access page is not supported. Nor
> >mmu_notifier path neither tdp_page_fault path should ever see 0xfee00000
> >address. BUG() is more appropriate here.
> >
> 
> I don't quite understand. Why calling this function here will leed to bug ?
> (Sorry, I'm not quite understand the internal of KVM. Please help.)
I didn't say that calling this function here will lead to a bug. I am saying that
if vm_need_virtualize_apic_accesses() is false this function should not be called
at all, so this check is redundant.

> 
> >
> >>+		vmcs_write64(APIC_ACCESS_ADDR, hpa);
> >>+}
> >>+
> >>  static void vmx_hwapic_isr_update(struct kvm *kvm, int isr)
> >>  {
> >>  	u16 status;
> >>@@ -8846,6 +8852,7 @@ static struct kvm_x86_ops vmx_x86_ops = {
> >>  	.enable_irq_window = enable_irq_window,
> >>  	.update_cr8_intercept = update_cr8_intercept,
> >>  	.set_virtual_x2apic_mode = vmx_set_virtual_x2apic_mode,
> >>+	.set_apic_access_page_addr = vmx_set_apic_access_page_addr,
> >svm needs that too.
> >
> 
> OK, will add one for svm.
> 
> >>  	.vm_has_apicv = vmx_vm_has_apicv,
> >>  	.load_eoi_exitmap = vmx_load_eoi_exitmap,
> >>  	.hwapic_irr_update = vmx_hwapic_irr_update,
> >>diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> >>index a26524f..14e7174 100644
> >>--- a/arch/x86/kvm/x86.c
> >>+++ b/arch/x86/kvm/x86.c
> >>@@ -5943,6 +5943,24 @@ static void vcpu_migrated_page_update_ept(struct kvm_vcpu *vcpu)
> >>  	}
> >>  }
> >>
> >>+static void vcpu_migrated_page_update_apic(struct kvm_vcpu *vcpu)
> >>+{
> >>+	struct kvm *kvm = vcpu->kvm;
> >>+
> >>+	if (kvm->arch.apic_access_page_migrated) {
> >>+		if (kvm->arch.apic_access_page)
> >>+			kvm->arch.apic_access_page = pfn_to_page(0);
> >All vcpus will access apic_access_page without locking here. May be
> >set kvm->arch.apic_access_page to zero in mmu_notifier and here call
> >  kvm_x86_ops->set_apic_access_page_addr(kvm, kvm->arch.apic_access_page);
> >
> 
> I'm a little confused. apic access page's phys_addr is stored in vmcs, and
> I think it will be used by vcpu directly to access the physical page.
> Setting kvm->arch.apic_access_page to zero will not stop it, right ?
> 
Right, kvm->arch.apic_access_page is just a shadow value for whatever is written
in vmcs. After setting it all vcpus need to update their vmcs values.

> I'm wondering what happens when apic page is migrated, but the vmcs is still
> holding its old phys_addr before the vcpu request is handled.
> 
apic page should not be migrated untill all vpus are forced out of a guest mode and
instructed to reload new value on a next guest entry. That's what we are trying to
achieve here.

--
			Gleb.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 4/4] kvm, mem-hotplug: Update apic access page when it is migrated.
  2014-07-04 10:13       ` Gleb Natapov
@ 2014-07-07  6:17         ` Tang Chen
  2014-07-07  9:52         ` Tang Chen
  1 sibling, 0 replies; 29+ messages in thread
From: Tang Chen @ 2014-07-07  6:17 UTC (permalink / raw)
  To: Gleb Natapov
  Cc: mtosatti, kvm, laijs, isimatu.yasuaki, guz.fnst, linux-kernel, Tang Chen

Hi Gleb,

Thanks for all the advices. Please see below.

On 07/04/2014 06:13 PM, Gleb Natapov wrote:
......
>>>> +static void vmx_set_apic_access_page_addr(struct kvm *kvm, hpa_t hpa)
>>>> +{
>>>> +	if (vm_need_virtualize_apic_accesses(kvm))
>>> This shouldn't even been called if apic access page is not supported. Nor
>>> mmu_notifier path neither tdp_page_fault path should ever see 0xfee00000
>>> address. BUG() is more appropriate here.
>>>
>>
>> I don't quite understand. Why calling this function here will leed to bug ?
>> (Sorry, I'm not quite understand the internal of KVM. Please help.)
> I didn't say that calling this function here will lead to a bug. I am saying that
> if vm_need_virtualize_apic_accesses() is false this function should not be called
> at all, so this check is redundant.
>

Do you mean when vm_need_virtualize_apic_accesses() is false, it should 
not be called ?
It has to be true ?

......
>>>> +	if (kvm->arch.apic_access_page_migrated) {
>>>> +		if (kvm->arch.apic_access_page)
>>>> +			kvm->arch.apic_access_page = pfn_to_page(0);
>>> All vcpus will access apic_access_page without locking here. May be
>>> set kvm->arch.apic_access_page to zero in mmu_notifier and here call
>>>   kvm_x86_ops->set_apic_access_page_addr(kvm, kvm->arch.apic_access_page);
>>>
>>
>> I'm a little confused. apic access page's phys_addr is stored in vmcs, and
>> I think it will be used by vcpu directly to access the physical page.
>> Setting kvm->arch.apic_access_page to zero will not stop it, right ?
>>
> Right, kvm->arch.apic_access_page is just a shadow value for whatever is written
> in vmcs. After setting it all vcpus need to update their vmcs values.
>
>> I'm wondering what happens when apic page is migrated, but the vmcs is still
>> holding its old phys_addr before the vcpu request is handled.
>>
> apic page should not be migrated untill all vpus are forced out of a guest mode and
> instructed to reload new value on a next guest entry. That's what we are trying to
> achieve here.
>

So, setting VMCS APIC_ACCESS_ADDR pointer to zero will not stop vcpu to 
access
apic access page, right ?

If so, all the vcpus have to stop till apic page finishes its migration, 
and new
value is set in each vcpu, which means we should stop guest, right ?

Thanks.




^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 4/4] kvm, mem-hotplug: Update apic access page when it is migrated.
  2014-07-04 10:13       ` Gleb Natapov
  2014-07-07  6:17         ` Tang Chen
@ 2014-07-07  9:52         ` Tang Chen
  2014-07-07 11:42           ` Nadav Amit
  1 sibling, 1 reply; 29+ messages in thread
From: Tang Chen @ 2014-07-07  9:52 UTC (permalink / raw)
  To: Gleb Natapov
  Cc: mtosatti, kvm, laijs, isimatu.yasuaki, guz.fnst, linux-kernel, Tang Chen

Hi Gleb,

The guest hang problem has been solved.

When mmu_notifier is called, I set VMCS APIC_ACCESS_ADDR to the new value
instead of setting it to 0. And only update kvm->arch.apic_access_page in
the next ept violation.

The guest is running well now.

I'll post the new patches tomorrow. ;)

Thanks.


On 07/04/2014 06:13 PM, Gleb Natapov wrote:
> On Fri, Jul 04, 2014 at 10:18:25AM +0800, Tang Chen wrote:
>> Hi Gleb,
>>
>> Thanks for the advices. Please see below.
>>
>> On 07/03/2014 09:55 PM, Gleb Natapov wrote:
>> ......
>>>> @@ -575,6 +575,7 @@ struct kvm_arch {
>>>>
>>>>   	unsigned int tss_addr;
>>>>   	struct page *apic_access_page;
>>>> +	bool apic_access_page_migrated;
>>> Better have two requests KVM_REQ_APIC_PAGE_MAP, KVM_REQ_APIC_PAGE_UNMAP IMO.
>>>
>>
>> vcpu->requests is an unsigned long, and we can only has 64 requests. Isn't
>> adding two requests for apic page and another similar two for ept page too
>> many ?  Not sure.
>>
> Lets not worry about that for now. May be it is enough to have only one
> KVM_REQ_APIC_PAGE_RELOAD request set apic_access_page to a new value
> before sending the request and reload whatever is in apic_access_page
> during KVM_REQ_APIC_PAGE_RELOAD processing. Or we can even reload
> apic_access_page as part of mmu reload and reuse KVM_REQ_MMU_RELOAD.
>
>>>>
>>>>   	gpa_t wall_clock;
>>>>
>>>> @@ -739,6 +740,7 @@ struct kvm_x86_ops {
>>>>   	void (*hwapic_isr_update)(struct kvm *kvm, int isr);
>>>>   	void (*load_eoi_exitmap)(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap);
>>>>   	void (*set_virtual_x2apic_mode)(struct kvm_vcpu *vcpu, bool set);
>>>> +	void (*set_apic_access_page_addr)(struct kvm *kvm, hpa_t hpa);
>>>>   	void (*deliver_posted_interrupt)(struct kvm_vcpu *vcpu, int vector);
>>>>   	void (*sync_pir_to_irr)(struct kvm_vcpu *vcpu);
>>>>   	int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
>>>> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
>>>> index c0d72f6..a655444 100644
>>>> --- a/arch/x86/kvm/mmu.c
>>>> +++ b/arch/x86/kvm/mmu.c
>>>> @@ -3436,6 +3436,21 @@ static int tdp_page_fault(struct kvm_vcpu *vcpu, gva_t gpa, u32 error_code,
>>>>   		kvm_make_request(KVM_REQ_MIGRATE_EPT, vcpu);
>>>>   	}
>>>>
>>>> +	if (gpa == VMX_APIC_ACCESS_PAGE_ADDR&&
>>>> +	    vcpu->kvm->arch.apic_access_page_migrated) {
>>> Why check arch.apic_access_page_migrated here? Isn't it enough that the fault is on apic
>>> address.
>>>
>>
>> True. It's enough. Followed.
>>
>>>> +		int i;
>>>> +
>>>> +		vcpu->kvm->arch.apic_access_page_migrated = false;
>>>> +
>>>> +		/*
>>>> +		 * We need update APIC_ACCESS_ADDR pointer in each VMCS of
>>>> +		 * all the online vcpus.
>>>> +		 */
>>>> +		for (i = 0; i<   atomic_read(&vcpu->kvm->online_vcpus); i++)
>>>> +			kvm_make_request(KVM_REQ_MIGRATE_APIC,
>>>> +					 vcpu->kvm->vcpus[i]);
>>> make_all_cpus_request(). You need to kick all vcpus from a guest mode.
>>>
>>
>> OK, followed. But would you please explain more about this. :)
>> Why need to kick all vcpus from guest mode when making request to all vcpus
>> ?
> Because if you do not force other vcpus from a guest mode they will not reload
> apic_access_page value till next vmexit, but since EPT page table now has a mapping
> for 0xfee00000 access to this address will not cause EPT violation and will not cause
> apic exit either.
>
>>
>>>> +	}
>>>> +
>>>>   	spin_unlock(&vcpu->kvm->mmu_lock);
>>>>
>>>>   	return r;
>>>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>>>> index c336cb3..abc152f 100644
>>>> --- a/arch/x86/kvm/vmx.c
>>>> +++ b/arch/x86/kvm/vmx.c
>>>> @@ -3988,7 +3988,7 @@ static int alloc_apic_access_page(struct kvm *kvm)
>>>>   	if (r)
>>>>   		goto out;
>>>>
>>>> -	page = gfn_to_page(kvm, VMX_APIC_ACCESS_PAGE_ADDR>>   PAGE_SHIFT);
>>>> +	page = gfn_to_page_no_pin(kvm, VMX_APIC_ACCESS_PAGE_ADDR>>   PAGE_SHIFT);
>>>>   	if (is_error_page(page)) {
>>>>   		r = -EFAULT;
>>>>   		goto out;
>>>> @@ -7075,6 +7075,12 @@ static void vmx_set_virtual_x2apic_mode(struct kvm_vcpu *vcpu, bool set)
>>>>   	vmx_set_msr_bitmap(vcpu);
>>>>   }
>>>>
>>>> +static void vmx_set_apic_access_page_addr(struct kvm *kvm, hpa_t hpa)
>>>> +{
>>>> +	if (vm_need_virtualize_apic_accesses(kvm))
>>> This shouldn't even been called if apic access page is not supported. Nor
>>> mmu_notifier path neither tdp_page_fault path should ever see 0xfee00000
>>> address. BUG() is more appropriate here.
>>>
>>
>> I don't quite understand. Why calling this function here will leed to bug ?
>> (Sorry, I'm not quite understand the internal of KVM. Please help.)
> I didn't say that calling this function here will lead to a bug. I am saying that
> if vm_need_virtualize_apic_accesses() is false this function should not be called
> at all, so this check is redundant.
>
>>
>>>
>>>> +		vmcs_write64(APIC_ACCESS_ADDR, hpa);
>>>> +}
>>>> +
>>>>   static void vmx_hwapic_isr_update(struct kvm *kvm, int isr)
>>>>   {
>>>>   	u16 status;
>>>> @@ -8846,6 +8852,7 @@ static struct kvm_x86_ops vmx_x86_ops = {
>>>>   	.enable_irq_window = enable_irq_window,
>>>>   	.update_cr8_intercept = update_cr8_intercept,
>>>>   	.set_virtual_x2apic_mode = vmx_set_virtual_x2apic_mode,
>>>> +	.set_apic_access_page_addr = vmx_set_apic_access_page_addr,
>>> svm needs that too.
>>>
>>
>> OK, will add one for svm.
>>
>>>>   	.vm_has_apicv = vmx_vm_has_apicv,
>>>>   	.load_eoi_exitmap = vmx_load_eoi_exitmap,
>>>>   	.hwapic_irr_update = vmx_hwapic_irr_update,
>>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>>> index a26524f..14e7174 100644
>>>> --- a/arch/x86/kvm/x86.c
>>>> +++ b/arch/x86/kvm/x86.c
>>>> @@ -5943,6 +5943,24 @@ static void vcpu_migrated_page_update_ept(struct kvm_vcpu *vcpu)
>>>>   	}
>>>>   }
>>>>
>>>> +static void vcpu_migrated_page_update_apic(struct kvm_vcpu *vcpu)
>>>> +{
>>>> +	struct kvm *kvm = vcpu->kvm;
>>>> +
>>>> +	if (kvm->arch.apic_access_page_migrated) {
>>>> +		if (kvm->arch.apic_access_page)
>>>> +			kvm->arch.apic_access_page = pfn_to_page(0);
>>> All vcpus will access apic_access_page without locking here. May be
>>> set kvm->arch.apic_access_page to zero in mmu_notifier and here call
>>>   kvm_x86_ops->set_apic_access_page_addr(kvm, kvm->arch.apic_access_page);
>>>
>>
>> I'm a little confused. apic access page's phys_addr is stored in vmcs, and
>> I think it will be used by vcpu directly to access the physical page.
>> Setting kvm->arch.apic_access_page to zero will not stop it, right ?
>>
> Right, kvm->arch.apic_access_page is just a shadow value for whatever is written
> in vmcs. After setting it all vcpus need to update their vmcs values.
>
>> I'm wondering what happens when apic page is migrated, but the vmcs is still
>> holding its old phys_addr before the vcpu request is handled.
>>
> apic page should not be migrated untill all vpus are forced out of a guest mode and
> instructed to reload new value on a next guest entry. That's what we are trying to
> achieve here.
>
> --
> 			Gleb.
> .
>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 4/4] kvm, mem-hotplug: Update apic access page when it is migrated.
  2014-07-02  9:00 ` [PATCH 4/4] kvm, mem-hotplug: Update apic access page " Tang Chen
  2014-07-03 13:55   ` Gleb Natapov
@ 2014-07-07 10:35   ` Wanpeng Li
  2014-07-08  9:40     ` Tang Chen
  1 sibling, 1 reply; 29+ messages in thread
From: Wanpeng Li @ 2014-07-07 10:35 UTC (permalink / raw)
  To: Tang Chen
  Cc: gleb, mtosatti, kvm, laijs, isimatu.yasuaki, guz.fnst, linux-kernel

On Wed, Jul 02, 2014 at 05:00:37PM +0800, Tang Chen wrote:
>apic access page is pinned in memory, and as a result it cannot be
>migrated/hot-removed.
>
>Actually it doesn't need to be pinned in memory.
>
>This patch introduces a new vcpu request: KVM_REQ_MIGRATE_EPT. This requet

s/KVM_REQ_MIGRATE_EPT/KVM_REQ_MIGRATE_APIC

>will be made when kvm_mmu_notifier_invalidate_page() is called when the page
>is unmapped from the qemu user space to reset APIC_ACCESS_ADDR pointer in
>each online vcpu to 0. And will also be made when ept violation happens to
>reset APIC_ACCESS_ADDR to the new page phys_addr (host phys_addr).
>---
> arch/x86/include/asm/kvm_host.h |  2 ++
> arch/x86/kvm/mmu.c              | 15 +++++++++++++++
> arch/x86/kvm/vmx.c              |  9 ++++++++-
> arch/x86/kvm/x86.c              | 20 ++++++++++++++++++++
> include/linux/kvm_host.h        |  1 +
> virt/kvm/kvm_main.c             | 15 +++++++++++++++
> 6 files changed, 61 insertions(+), 1 deletion(-)
>
>diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>index 8771c0f..f104b87 100644
>--- a/arch/x86/include/asm/kvm_host.h
>+++ b/arch/x86/include/asm/kvm_host.h
>@@ -575,6 +575,7 @@ struct kvm_arch {
> 
> 	unsigned int tss_addr;
> 	struct page *apic_access_page;
>+	bool apic_access_page_migrated;
> 
> 	gpa_t wall_clock;
> 
>@@ -739,6 +740,7 @@ struct kvm_x86_ops {
> 	void (*hwapic_isr_update)(struct kvm *kvm, int isr);
> 	void (*load_eoi_exitmap)(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap);
> 	void (*set_virtual_x2apic_mode)(struct kvm_vcpu *vcpu, bool set);
>+	void (*set_apic_access_page_addr)(struct kvm *kvm, hpa_t hpa);
> 	void (*deliver_posted_interrupt)(struct kvm_vcpu *vcpu, int vector);
> 	void (*sync_pir_to_irr)(struct kvm_vcpu *vcpu);
> 	int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
>diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
>index c0d72f6..a655444 100644
>--- a/arch/x86/kvm/mmu.c
>+++ b/arch/x86/kvm/mmu.c
>@@ -3436,6 +3436,21 @@ static int tdp_page_fault(struct kvm_vcpu *vcpu, gva_t gpa, u32 error_code,
> 		kvm_make_request(KVM_REQ_MIGRATE_EPT, vcpu);
> 	}
> 
>+	if (gpa == VMX_APIC_ACCESS_PAGE_ADDR &&
>+	    vcpu->kvm->arch.apic_access_page_migrated) {
>+		int i;
>+
>+		vcpu->kvm->arch.apic_access_page_migrated = false;
>+
>+		/*
>+		 * We need update APIC_ACCESS_ADDR pointer in each VMCS of
>+		 * all the online vcpus.
>+		 */
>+		for (i = 0; i < atomic_read(&vcpu->kvm->online_vcpus); i++)
>+			kvm_make_request(KVM_REQ_MIGRATE_APIC,
>+					 vcpu->kvm->vcpus[i]);
>+	}
>+
> 	spin_unlock(&vcpu->kvm->mmu_lock);
> 
> 	return r;
>diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>index c336cb3..abc152f 100644
>--- a/arch/x86/kvm/vmx.c
>+++ b/arch/x86/kvm/vmx.c
>@@ -3988,7 +3988,7 @@ static int alloc_apic_access_page(struct kvm *kvm)
> 	if (r)
> 		goto out;
> 
>-	page = gfn_to_page(kvm, VMX_APIC_ACCESS_PAGE_ADDR >> PAGE_SHIFT);
>+	page = gfn_to_page_no_pin(kvm, VMX_APIC_ACCESS_PAGE_ADDR >> PAGE_SHIFT);
> 	if (is_error_page(page)) {
> 		r = -EFAULT;
> 		goto out;
>@@ -7075,6 +7075,12 @@ static void vmx_set_virtual_x2apic_mode(struct kvm_vcpu *vcpu, bool set)
> 	vmx_set_msr_bitmap(vcpu);
> }
> 
>+static void vmx_set_apic_access_page_addr(struct kvm *kvm, hpa_t hpa)
>+{
>+	if (vm_need_virtualize_apic_accesses(kvm))
>+		vmcs_write64(APIC_ACCESS_ADDR, hpa);
>+}
>+
> static void vmx_hwapic_isr_update(struct kvm *kvm, int isr)
> {
> 	u16 status;
>@@ -8846,6 +8852,7 @@ static struct kvm_x86_ops vmx_x86_ops = {
> 	.enable_irq_window = enable_irq_window,
> 	.update_cr8_intercept = update_cr8_intercept,
> 	.set_virtual_x2apic_mode = vmx_set_virtual_x2apic_mode,
>+	.set_apic_access_page_addr = vmx_set_apic_access_page_addr,
> 	.vm_has_apicv = vmx_vm_has_apicv,
> 	.load_eoi_exitmap = vmx_load_eoi_exitmap,
> 	.hwapic_irr_update = vmx_hwapic_irr_update,
>diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>index a26524f..14e7174 100644
>--- a/arch/x86/kvm/x86.c
>+++ b/arch/x86/kvm/x86.c
>@@ -5943,6 +5943,24 @@ static void vcpu_migrated_page_update_ept(struct kvm_vcpu *vcpu)
> 	}
> }
> 
>+static void vcpu_migrated_page_update_apic(struct kvm_vcpu *vcpu)
>+{
>+	struct kvm *kvm = vcpu->kvm;
>+
>+	if (kvm->arch.apic_access_page_migrated) {
>+		if (kvm->arch.apic_access_page)
>+			kvm->arch.apic_access_page = pfn_to_page(0);
>+		kvm_x86_ops->set_apic_access_page_addr(kvm, 0x0ull);
>+	} else {
>+		struct page *page;
>+		page = gfn_to_page_no_pin(kvm,
>+				VMX_APIC_ACCESS_PAGE_ADDR >> PAGE_SHIFT);
>+		kvm->arch.apic_access_page = page;
>+		kvm_x86_ops->set_apic_access_page_addr(kvm,
>+						       page_to_phys(page));
>+	}
>+}
>+
> /*
>  * Returns 1 to let __vcpu_run() continue the guest execution loop without
>  * exiting to the userspace.  Otherwise, the value will be returned to the
>@@ -6005,6 +6023,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> 			vcpu_scan_ioapic(vcpu);
> 		if (kvm_check_request(KVM_REQ_MIGRATE_EPT, vcpu))
> 			vcpu_migrated_page_update_ept(vcpu);
>+		if (kvm_check_request(KVM_REQ_MIGRATE_APIC, vcpu))
>+			vcpu_migrated_page_update_apic(vcpu);
> 	}
> 
> 	if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
>diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
>index 4b7e51a..e2ad65e 100644
>--- a/include/linux/kvm_host.h
>+++ b/include/linux/kvm_host.h
>@@ -137,6 +137,7 @@ static inline bool is_error_page(struct page *page)
> #define KVM_REQ_ENABLE_IBS        23
> #define KVM_REQ_DISABLE_IBS       24
> #define KVM_REQ_MIGRATE_EPT       25
>+#define KVM_REQ_MIGRATE_APIC      26
> 
> #define KVM_USERSPACE_IRQ_SOURCE_ID		0
> #define KVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID	1
>diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>index d271e89..f06438c 100644
>--- a/virt/kvm/kvm_main.c
>+++ b/virt/kvm/kvm_main.c
>@@ -54,6 +54,7 @@
> #include <asm/io.h>
> #include <asm/uaccess.h>
> #include <asm/pgtable.h>
>+#include <asm/vmx.h>
> 
> #include "coalesced_mmio.h"
> #include "async_pf.h"
>@@ -300,6 +301,20 @@ static void kvm_mmu_notifier_invalidate_page(struct mmu_notifier *mn,
> 		kvm_make_request(KVM_REQ_MIGRATE_EPT, kvm->vcpus[0]);
> 	}
> 
>+	if (address ==
>+	    gfn_to_hva(kvm, VMX_APIC_ACCESS_PAGE_ADDR >> PAGE_SHIFT)) {
>+		int i;
>+
>+		kvm->arch.apic_access_page_migrated = true;
>+
>+		/*
>+		 * We need update APIC_ACCESS_ADDR pointer in each VMCS of
>+		 * all the online vcpus.
>+		 */
>+		for (i = 0; i < atomic_read(&kvm->online_vcpus); i++)
>+			kvm_make_request(KVM_REQ_MIGRATE_APIC, kvm->vcpus[i]);
>+	}
>+
> 	spin_unlock(&kvm->mmu_lock);
> 	srcu_read_unlock(&kvm->srcu, idx);
> }
>-- 
>1.8.3.1
>
>--
>To unsubscribe from this list: send the line "unsubscribe kvm" in
>the body of a message to majordomo@vger.kernel.org
>More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 4/4] kvm, mem-hotplug: Update apic access page when it is migrated.
  2014-07-07  9:52         ` Tang Chen
@ 2014-07-07 11:42           ` Nadav Amit
  2014-07-07 11:54             ` Gleb Natapov
  0 siblings, 1 reply; 29+ messages in thread
From: Nadav Amit @ 2014-07-07 11:42 UTC (permalink / raw)
  To: Tang Chen, Gleb Natapov
  Cc: mtosatti, kvm, laijs, isimatu.yasuaki, guz.fnst, linux-kernel

Tang,

Running some (unrelated) tests I see that KVM does not handle APIC base 
relocation correctly. When the base is changed, kvm_lapic_set_base just 
changes lapic->base_address without taking further action (i.e., 
modifying the VMCS apic address in VMX).

This patch follows KVM bad behavior by using the constant 
VMX_APIC_ACCESS_PAGE_ADDR instead of lapic->base_address.

Anyhow, I didn't see anything that would make my life (in fixing the 
lapic base issue) too difficult. Yet, feel free in making it more 
"fix-friendly".

Thanks,
Nadav

On 7/7/14, 12:52 PM, Tang Chen wrote:
> Hi Gleb,
>
> The guest hang problem has been solved.
>
> When mmu_notifier is called, I set VMCS APIC_ACCESS_ADDR to the new value
> instead of setting it to 0. And only update kvm->arch.apic_access_page in
> the next ept violation.
>
> The guest is running well now.
>
> I'll post the new patches tomorrow. ;)
>
> Thanks.
>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 4/4] kvm, mem-hotplug: Update apic access page when it is migrated.
  2014-07-07 11:42           ` Nadav Amit
@ 2014-07-07 11:54             ` Gleb Natapov
  2014-07-07 12:10               ` Nadav Amit
  0 siblings, 1 reply; 29+ messages in thread
From: Gleb Natapov @ 2014-07-07 11:54 UTC (permalink / raw)
  To: Nadav Amit
  Cc: Tang Chen, mtosatti, kvm, laijs, isimatu.yasuaki, guz.fnst, linux-kernel

On Mon, Jul 07, 2014 at 02:42:27PM +0300, Nadav Amit wrote:
> Tang,
> 
> Running some (unrelated) tests I see that KVM does not handle APIC base
> relocation correctly. When the base is changed, kvm_lapic_set_base just
> changes lapic->base_address without taking further action (i.e., modifying
> the VMCS apic address in VMX).
> 
> This patch follows KVM bad behavior by using the constant
> VMX_APIC_ACCESS_PAGE_ADDR instead of lapic->base_address.
There is no OS out there that relocates APIC base (in fact it was not always
relocatable on real HW), so there is not point in complicating the code to support
it. In fact current APIC_ACCESS_ADDR handling relies on the fact that all vcpus
has apic mapped at the same address.

> 
> Anyhow, I didn't see anything that would make my life (in fixing the lapic
> base issue) too difficult. Yet, feel free in making it more "fix-friendly".
> 
Why would you want to fix it?

> Thanks,
> Nadav
> 
> On 7/7/14, 12:52 PM, Tang Chen wrote:
> >Hi Gleb,
> >
> >The guest hang problem has been solved.
> >
> >When mmu_notifier is called, I set VMCS APIC_ACCESS_ADDR to the new value
> >instead of setting it to 0. And only update kvm->arch.apic_access_page in
> >the next ept violation.
> >
> >The guest is running well now.
> >
> >I'll post the new patches tomorrow. ;)
> >
> >Thanks.
> >

--
			Gleb.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 4/4] kvm, mem-hotplug: Update apic access page when it is migrated.
  2014-07-07 11:54             ` Gleb Natapov
@ 2014-07-07 12:10               ` Nadav Amit
  2014-07-07 12:15                 ` Gleb Natapov
  2014-07-08  1:44                 ` [PATCH 4/4] kvm, mem-hotplug: Update apic access page when it is migrated Tang Chen
  0 siblings, 2 replies; 29+ messages in thread
From: Nadav Amit @ 2014-07-07 12:10 UTC (permalink / raw)
  To: Gleb Natapov
  Cc: Tang Chen, mtosatti, kvm, laijs, isimatu.yasuaki, guz.fnst, linux-kernel

On 7/7/14, 2:54 PM, Gleb Natapov wrote:
> On Mon, Jul 07, 2014 at 02:42:27PM +0300, Nadav Amit wrote:
>> Tang,
>>
>> Running some (unrelated) tests I see that KVM does not handle APIC base
>> relocation correctly. When the base is changed, kvm_lapic_set_base just
>> changes lapic->base_address without taking further action (i.e., modifying
>> the VMCS apic address in VMX).
>>
>> This patch follows KVM bad behavior by using the constant
>> VMX_APIC_ACCESS_PAGE_ADDR instead of lapic->base_address.
> There is no OS out there that relocates APIC base (in fact it was not always
> relocatable on real HW), so there is not point in complicating the code to support
> it. In fact current APIC_ACCESS_ADDR handling relies on the fact that all vcpus
> has apic mapped at the same address.
>
>>
>> Anyhow, I didn't see anything that would make my life (in fixing the lapic
>> base issue) too difficult. Yet, feel free in making it more "fix-friendly".
>>
> Why would you want to fix it?
>
If there is no general need, I will not send a fix. However, I think the 
very least a warning message should be appear if the guest relocates the 
APIC base.

Nadav


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 4/4] kvm, mem-hotplug: Update apic access page when it is migrated.
  2014-07-07 12:10               ` Nadav Amit
@ 2014-07-07 12:15                 ` Gleb Natapov
       [not found]                   ` <1408522330-18009-1-git-send-email-namit@cs.technion.ac.il>
  2014-07-08  1:44                 ` [PATCH 4/4] kvm, mem-hotplug: Update apic access page when it is migrated Tang Chen
  1 sibling, 1 reply; 29+ messages in thread
From: Gleb Natapov @ 2014-07-07 12:15 UTC (permalink / raw)
  To: Nadav Amit
  Cc: Tang Chen, mtosatti, kvm, laijs, isimatu.yasuaki, guz.fnst, linux-kernel

On Mon, Jul 07, 2014 at 03:10:23PM +0300, Nadav Amit wrote:
> On 7/7/14, 2:54 PM, Gleb Natapov wrote:
> >On Mon, Jul 07, 2014 at 02:42:27PM +0300, Nadav Amit wrote:
> >>Tang,
> >>
> >>Running some (unrelated) tests I see that KVM does not handle APIC base
> >>relocation correctly. When the base is changed, kvm_lapic_set_base just
> >>changes lapic->base_address without taking further action (i.e., modifying
> >>the VMCS apic address in VMX).
> >>
> >>This patch follows KVM bad behavior by using the constant
> >>VMX_APIC_ACCESS_PAGE_ADDR instead of lapic->base_address.
> >There is no OS out there that relocates APIC base (in fact it was not always
> >relocatable on real HW), so there is not point in complicating the code to support
> >it. In fact current APIC_ACCESS_ADDR handling relies on the fact that all vcpus
> >has apic mapped at the same address.
> >
> >>
> >>Anyhow, I didn't see anything that would make my life (in fixing the lapic
> >>base issue) too difficult. Yet, feel free in making it more "fix-friendly".
> >>
> >Why would you want to fix it?
> >
> If there is no general need, I will not send a fix. However, I think the
It is much more works that it may look like for no apparent gain.

> very least a warning message should be appear if the guest relocates the
> APIC base.
> 
Warning is fine. Make it _once().

--
			Gleb.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 4/4] kvm, mem-hotplug: Update apic access page when it is migrated.
  2014-07-07 12:10               ` Nadav Amit
  2014-07-07 12:15                 ` Gleb Natapov
@ 2014-07-08  1:44                 ` Tang Chen
  2014-07-08  6:46                   ` Nadav Amit
  1 sibling, 1 reply; 29+ messages in thread
From: Tang Chen @ 2014-07-08  1:44 UTC (permalink / raw)
  To: Nadav Amit
  Cc: Gleb Natapov, mtosatti, kvm, laijs, isimatu.yasuaki, guz.fnst,
	linux-kernel

Hi Nadav,

Thanks for the reply, please see below.

On 07/07/2014 08:10 PM, Nadav Amit wrote:
> On 7/7/14, 2:54 PM, Gleb Natapov wrote:
>> On Mon, Jul 07, 2014 at 02:42:27PM +0300, Nadav Amit wrote:
>>> Tang,
>>>
>>> Running some (unrelated) tests I see that KVM does not handle APIC base
>>> relocation correctly. When the base is changed, kvm_lapic_set_base just
>>> changes lapic->base_address without taking further action (i.e.,
>>> modifying
>>> the VMCS apic address in VMX).
>>>
>>> This patch follows KVM bad behavior by using the constant
>>> VMX_APIC_ACCESS_PAGE_ADDR instead of lapic->base_address.
>> There is no OS out there that relocates APIC base (in fact it was not
>> always
>> relocatable on real HW), so there is not point in complicating the
>> code to support
>> it. In fact current APIC_ACCESS_ADDR handling relies on the fact that
>> all vcpus
>> has apic mapped at the same address.
>>
>>>
>>> Anyhow, I didn't see anything that would make my life (in fixing the
>>> lapic
>>> base issue) too difficult. Yet, feel free in making it more
>>> "fix-friendly".
>>>
>> Why would you want to fix it?
>>
> If there is no general need, I will not send a fix. However, I think the
> very least a warning message should be appear if the guest relocates the
> APIC base.

Maybe I didn't understand you question correctly. If I'm wrong, please 
tell me.

This patch does not relocate APIC base in guest, but in host. Host migrates
the apic page to somewhere else, and KVM updates ept pagetable to track it.
In guest, apic base address (gpa) doesn't change.

Is this lapic->base_address a hpa ?

Is there anywhere I need to update in my patch ?

Thanks.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 4/4] kvm, mem-hotplug: Update apic access page when it is migrated.
  2014-07-08  1:44                 ` [PATCH 4/4] kvm, mem-hotplug: Update apic access page when it is migrated Tang Chen
@ 2014-07-08  6:46                   ` Nadav Amit
  0 siblings, 0 replies; 29+ messages in thread
From: Nadav Amit @ 2014-07-08  6:46 UTC (permalink / raw)
  To: Tang Chen
  Cc: Gleb Natapov, mtosatti, kvm, laijs, isimatu.yasuaki, guz.fnst,
	linux-kernel

Tang,

I am sorry if I caused any confusion.

Following Gleb response, there is no apparent need for dealing with the 
scenario I mentioned (relocating the APIC base), so you don't need to do 
any changes to your patch, and I will post another patch later to warn 
if the guest relocates its APIC (from the default address to another 
guest physical address). My answers to your questions are below.

On 7/8/14, 4:44 AM, Tang Chen wrote:
> Hi Nadav,
>
> Thanks for the reply, please see below.
>
> On 07/07/2014 08:10 PM, Nadav Amit wrote:
>> On 7/7/14, 2:54 PM, Gleb Natapov wrote:
>>> On Mon, Jul 07, 2014 at 02:42:27PM +0300, Nadav Amit wrote:
>>>> Tang,
>>>>
>>>> Running some (unrelated) tests I see that KVM does not handle APIC base
>>>> relocation correctly. When the base is changed, kvm_lapic_set_base just
>>>> changes lapic->base_address without taking further action (i.e.,
>>>> modifying
>>>> the VMCS apic address in VMX).
>>>>
>>>> This patch follows KVM bad behavior by using the constant
>>>> VMX_APIC_ACCESS_PAGE_ADDR instead of lapic->base_address.
>>> There is no OS out there that relocates APIC base (in fact it was not
>>> always
>>> relocatable on real HW), so there is not point in complicating the
>>> code to support
>>> it. In fact current APIC_ACCESS_ADDR handling relies on the fact that
>>> all vcpus
>>> has apic mapped at the same address.
>>>
>>>>
>>>> Anyhow, I didn't see anything that would make my life (in fixing the
>>>> lapic
>>>> base issue) too difficult. Yet, feel free in making it more
>>>> "fix-friendly".
>>>>
>>> Why would you want to fix it?
>>>
>> If there is no general need, I will not send a fix. However, I think the
>> very least a warning message should be appear if the guest relocates the
>> APIC base.
>
> Maybe I didn't understand you question correctly. If I'm wrong, please
> tell me.
>
> This patch does not relocate APIC base in guest, but in host. Host migrates
> the apic page to somewhere else, and KVM updates ept pagetable to track it.
> In guest, apic base address (gpa) doesn't change.
The last claim is true in practice, according to Gleb, but it is not 
necessarily so according to the specifications. Pentium 4, Intel Xeon 
and P6 family processors support APIC base relocation. See the Intel SDM 
section 10.4.5. Anyhow, Gleb claims it is not used by any OS.

>
> Is this lapic->base_address a hpa ?
No, it is guest physical address.

>
> Is there anywhere I need to update in my patch ?

No. I'll send another patch on top of yours that prints a warning if the 
APIC base is relocated (i.e., the guest physical address of the APIC 
base is changed). Such relocation is done explicitly by the guest, not 
by your patch.

Nadav


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 4/4] kvm, mem-hotplug: Update apic access page when it is migrated.
  2014-07-07 10:35   ` Wanpeng Li
@ 2014-07-08  9:40     ` Tang Chen
  0 siblings, 0 replies; 29+ messages in thread
From: Tang Chen @ 2014-07-08  9:40 UTC (permalink / raw)
  To: Wanpeng Li
  Cc: gleb, mtosatti, kvm, laijs, isimatu.yasuaki, guz.fnst, linux-kernel

Hi Wanpeng,

On 07/07/2014 06:35 PM, Wanpeng Li wrote:
> On Wed, Jul 02, 2014 at 05:00:37PM +0800, Tang Chen wrote:
>> apic access page is pinned in memory, and as a result it cannot be
>> migrated/hot-removed.
>>
>> Actually it doesn't need to be pinned in memory.
>>
>> This patch introduces a new vcpu request: KVM_REQ_MIGRATE_EPT. This requet
>
> s/KVM_REQ_MIGRATE_EPT/KVM_REQ_MIGRATE_APIC

Thanks, will fix it in the next version.

Thanks.

>
>> will be made when kvm_mmu_notifier_invalidate_page() is called when the page
>> is unmapped from the qemu user space to reset APIC_ACCESS_ADDR pointer in
>> each online vcpu to 0. And will also be made when ept violation happens to
>> reset APIC_ACCESS_ADDR to the new page phys_addr (host phys_addr).
>> ---
>> arch/x86/include/asm/kvm_host.h |  2 ++
>> arch/x86/kvm/mmu.c              | 15 +++++++++++++++
>> arch/x86/kvm/vmx.c              |  9 ++++++++-
>> arch/x86/kvm/x86.c              | 20 ++++++++++++++++++++
>> include/linux/kvm_host.h        |  1 +
>> virt/kvm/kvm_main.c             | 15 +++++++++++++++
>> 6 files changed, 61 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> index 8771c0f..f104b87 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -575,6 +575,7 @@ struct kvm_arch {
>>
>> 	unsigned int tss_addr;
>> 	struct page *apic_access_page;
>> +	bool apic_access_page_migrated;
>>
>> 	gpa_t wall_clock;
>>
>> @@ -739,6 +740,7 @@ struct kvm_x86_ops {
>> 	void (*hwapic_isr_update)(struct kvm *kvm, int isr);
>> 	void (*load_eoi_exitmap)(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap);
>> 	void (*set_virtual_x2apic_mode)(struct kvm_vcpu *vcpu, bool set);
>> +	void (*set_apic_access_page_addr)(struct kvm *kvm, hpa_t hpa);
>> 	void (*deliver_posted_interrupt)(struct kvm_vcpu *vcpu, int vector);
>> 	void (*sync_pir_to_irr)(struct kvm_vcpu *vcpu);
>> 	int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
>> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
>> index c0d72f6..a655444 100644
>> --- a/arch/x86/kvm/mmu.c
>> +++ b/arch/x86/kvm/mmu.c
>> @@ -3436,6 +3436,21 @@ static int tdp_page_fault(struct kvm_vcpu *vcpu, gva_t gpa, u32 error_code,
>> 		kvm_make_request(KVM_REQ_MIGRATE_EPT, vcpu);
>> 	}
>>
>> +	if (gpa == VMX_APIC_ACCESS_PAGE_ADDR&&
>> +	    vcpu->kvm->arch.apic_access_page_migrated) {
>> +		int i;
>> +
>> +		vcpu->kvm->arch.apic_access_page_migrated = false;
>> +
>> +		/*
>> +		 * We need update APIC_ACCESS_ADDR pointer in each VMCS of
>> +		 * all the online vcpus.
>> +		 */
>> +		for (i = 0; i<  atomic_read(&vcpu->kvm->online_vcpus); i++)
>> +			kvm_make_request(KVM_REQ_MIGRATE_APIC,
>> +					 vcpu->kvm->vcpus[i]);
>> +	}
>> +
>> 	spin_unlock(&vcpu->kvm->mmu_lock);
>>
>> 	return r;
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index c336cb3..abc152f 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -3988,7 +3988,7 @@ static int alloc_apic_access_page(struct kvm *kvm)
>> 	if (r)
>> 		goto out;
>>
>> -	page = gfn_to_page(kvm, VMX_APIC_ACCESS_PAGE_ADDR>>  PAGE_SHIFT);
>> +	page = gfn_to_page_no_pin(kvm, VMX_APIC_ACCESS_PAGE_ADDR>>  PAGE_SHIFT);
>> 	if (is_error_page(page)) {
>> 		r = -EFAULT;
>> 		goto out;
>> @@ -7075,6 +7075,12 @@ static void vmx_set_virtual_x2apic_mode(struct kvm_vcpu *vcpu, bool set)
>> 	vmx_set_msr_bitmap(vcpu);
>> }
>>
>> +static void vmx_set_apic_access_page_addr(struct kvm *kvm, hpa_t hpa)
>> +{
>> +	if (vm_need_virtualize_apic_accesses(kvm))
>> +		vmcs_write64(APIC_ACCESS_ADDR, hpa);
>> +}
>> +
>> static void vmx_hwapic_isr_update(struct kvm *kvm, int isr)
>> {
>> 	u16 status;
>> @@ -8846,6 +8852,7 @@ static struct kvm_x86_ops vmx_x86_ops = {
>> 	.enable_irq_window = enable_irq_window,
>> 	.update_cr8_intercept = update_cr8_intercept,
>> 	.set_virtual_x2apic_mode = vmx_set_virtual_x2apic_mode,
>> +	.set_apic_access_page_addr = vmx_set_apic_access_page_addr,
>> 	.vm_has_apicv = vmx_vm_has_apicv,
>> 	.load_eoi_exitmap = vmx_load_eoi_exitmap,
>> 	.hwapic_irr_update = vmx_hwapic_irr_update,
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index a26524f..14e7174 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -5943,6 +5943,24 @@ static void vcpu_migrated_page_update_ept(struct kvm_vcpu *vcpu)
>> 	}
>> }
>>
>> +static void vcpu_migrated_page_update_apic(struct kvm_vcpu *vcpu)
>> +{
>> +	struct kvm *kvm = vcpu->kvm;
>> +
>> +	if (kvm->arch.apic_access_page_migrated) {
>> +		if (kvm->arch.apic_access_page)
>> +			kvm->arch.apic_access_page = pfn_to_page(0);
>> +		kvm_x86_ops->set_apic_access_page_addr(kvm, 0x0ull);
>> +	} else {
>> +		struct page *page;
>> +		page = gfn_to_page_no_pin(kvm,
>> +				VMX_APIC_ACCESS_PAGE_ADDR>>  PAGE_SHIFT);
>> +		kvm->arch.apic_access_page = page;
>> +		kvm_x86_ops->set_apic_access_page_addr(kvm,
>> +						       page_to_phys(page));
>> +	}
>> +}
>> +
>> /*
>>   * Returns 1 to let __vcpu_run() continue the guest execution loop without
>>   * exiting to the userspace.  Otherwise, the value will be returned to the
>> @@ -6005,6 +6023,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>> 			vcpu_scan_ioapic(vcpu);
>> 		if (kvm_check_request(KVM_REQ_MIGRATE_EPT, vcpu))
>> 			vcpu_migrated_page_update_ept(vcpu);
>> +		if (kvm_check_request(KVM_REQ_MIGRATE_APIC, vcpu))
>> +			vcpu_migrated_page_update_apic(vcpu);
>> 	}
>>
>> 	if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
>> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
>> index 4b7e51a..e2ad65e 100644
>> --- a/include/linux/kvm_host.h
>> +++ b/include/linux/kvm_host.h
>> @@ -137,6 +137,7 @@ static inline bool is_error_page(struct page *page)
>> #define KVM_REQ_ENABLE_IBS        23
>> #define KVM_REQ_DISABLE_IBS       24
>> #define KVM_REQ_MIGRATE_EPT       25
>> +#define KVM_REQ_MIGRATE_APIC      26
>>
>> #define KVM_USERSPACE_IRQ_SOURCE_ID		0
>> #define KVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID	1
>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>> index d271e89..f06438c 100644
>> --- a/virt/kvm/kvm_main.c
>> +++ b/virt/kvm/kvm_main.c
>> @@ -54,6 +54,7 @@
>> #include<asm/io.h>
>> #include<asm/uaccess.h>
>> #include<asm/pgtable.h>
>> +#include<asm/vmx.h>
>>
>> #include "coalesced_mmio.h"
>> #include "async_pf.h"
>> @@ -300,6 +301,20 @@ static void kvm_mmu_notifier_invalidate_page(struct mmu_notifier *mn,
>> 		kvm_make_request(KVM_REQ_MIGRATE_EPT, kvm->vcpus[0]);
>> 	}
>>
>> +	if (address ==
>> +	    gfn_to_hva(kvm, VMX_APIC_ACCESS_PAGE_ADDR>>  PAGE_SHIFT)) {
>> +		int i;
>> +
>> +		kvm->arch.apic_access_page_migrated = true;
>> +
>> +		/*
>> +		 * We need update APIC_ACCESS_ADDR pointer in each VMCS of
>> +		 * all the online vcpus.
>> +		 */
>> +		for (i = 0; i<  atomic_read(&kvm->online_vcpus); i++)
>> +			kvm_make_request(KVM_REQ_MIGRATE_APIC, kvm->vcpus[i]);
>> +	}
>> +
>> 	spin_unlock(&kvm->mmu_lock);
>> 	srcu_read_unlock(&kvm->srcu, idx);
>> }
>> --
>> 1.8.3.1
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe kvm" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> .
>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] KVM: x86: Warn on APIC base relocation
       [not found]                   ` <1408522330-18009-1-git-send-email-namit@cs.technion.ac.il>
@ 2014-08-20  8:44                     ` Nadav Amit
  0 siblings, 0 replies; 29+ messages in thread
From: Nadav Amit @ 2014-08-20  8:44 UTC (permalink / raw)
  To: Nadav Amit; +Cc: gleb, Paolo Bonzini, KVM

[-- Attachment #1: Type: text/plain, Size: 1259 bytes --]

CC’ing the KVM mailing list which I forgot.

On Aug 20, 2014, at 11:12 AM, Nadav Amit <namit@cs.technion.ac.il> wrote:

> APIC base relocation is unsupported by KVM. If anyone uses it, the least should
> be to report a warning in the hypervisor. Note that kvm-unit-tests performs
> APIC base relocation, and causes the warning to be printed.
> 
> Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
> ---
> arch/x86/kvm/lapic.c | 5 +++++
> 1 file changed, 5 insertions(+)
> 
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 08e8a89..6655e20 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -1416,6 +1416,11 @@ void kvm_lapic_set_base(struct kvm_vcpu *vcpu, u64 value)
>  	apic->base_address = apic->vcpu->arch.apic_base &
>  			     MSR_IA32_APICBASE_BASE;
> 
> +	if ((value & MSR_IA32_APICBASE_ENABLE) &&
> +	     apic->base_address != APIC_DEFAULT_PHYS_BASE)
> +		printk_once(KERN_WARNING
> +			    "APIC base relocation is unsupported by KVM\n");
> +
>  	/* with FSB delivery interrupt, we can restart APIC functionality */
>  	apic_debug("apic base msr is 0x%016" PRIx64 ", and base address is "
>  		   "0x%lx.\n", apic->vcpu->arch.apic_base, apic->base_address);
> -- 
> 1.9.1
> 


[-- Attachment #2: Message signed with OpenPGP using GPGMail --]
[-- Type: application/pgp-signature, Size: 495 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2014-08-20  8:44 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-07-02  9:00 [PATCH 0/4] kvm, mem-hotplug: Do not pin ept identity pagetable and apic access page Tang Chen
2014-07-02  9:00 ` [PATCH 1/4] kvm: Add gfn_to_page_no_pin() Tang Chen
2014-07-02  9:00 ` [PATCH 2/4] kvm: Add macro VMX_APIC_ACCESS_PAGE_ADDR Tang Chen
2014-07-02 16:24   ` Gleb Natapov
2014-07-03  1:19     ` Tang Chen
2014-07-02  9:00 ` [PATCH 3/4] kvm, memory-hotplug: Update ept identity pagetable when it is migrated Tang Chen
2014-07-02 16:34   ` Gleb Natapov
2014-07-03  1:19     ` Tang Chen
2014-07-04  2:36     ` Tang Chen
2014-07-04  9:49       ` Gleb Natapov
2014-07-02  9:00 ` [PATCH 4/4] kvm, mem-hotplug: Update apic access page " Tang Chen
2014-07-03 13:55   ` Gleb Natapov
2014-07-04  2:18     ` Tang Chen
2014-07-04  2:18     ` Tang Chen
2014-07-04 10:13       ` Gleb Natapov
2014-07-07  6:17         ` Tang Chen
2014-07-07  9:52         ` Tang Chen
2014-07-07 11:42           ` Nadav Amit
2014-07-07 11:54             ` Gleb Natapov
2014-07-07 12:10               ` Nadav Amit
2014-07-07 12:15                 ` Gleb Natapov
     [not found]                   ` <1408522330-18009-1-git-send-email-namit@cs.technion.ac.il>
2014-08-20  8:44                     ` [PATCH] KVM: x86: Warn on APIC base relocation Nadav Amit
2014-07-08  1:44                 ` [PATCH 4/4] kvm, mem-hotplug: Update apic access page when it is migrated Tang Chen
2014-07-08  6:46                   ` Nadav Amit
2014-07-07 10:35   ` Wanpeng Li
2014-07-08  9:40     ` Tang Chen
2014-07-03  1:17 ` [PATCH 0/4] kvm, mem-hotplug: Do not pin ept identity pagetable and apic access page Tang Chen
2014-07-03  6:04   ` Gleb Natapov
2014-07-04  6:41     ` Tang Chen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.