linux-kernel.vger.kernel.org archive mirror
* [PATCH 00/58] KVM updates for 2.6.23
@ 2007-06-17  9:43 Avi Kivity
  2007-06-17  9:43 ` [PATCH 01/58] KVM: VMX: Enable io bitmaps to avoid IO port 0x80 VMEXITs Avi Kivity
                   ` (56 more replies)
  0 siblings, 57 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:43 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel

Following is my patchqueue for the 2.6.23 merge window, not including
the cpu hotplug fixes posted earlier.  The changes include performance
improvements, guest smp, random fixes, and cleanups.  Comments welcome.

Anthony Liguori (1):
      KVM: SVM: Allow direct guest access to PC debug port

Avi Kivity (44):
      KVM: Assume that writes smaller than 4 bytes are to non-pagetable pages
      KVM: Avoid saving and restoring some host CPU state on lightweight vmexit
      KVM: Unindent some code
      KVM: Reduce misfirings of the fork detector
      KVM: Be more careful restoring fs on lightweight vmexit
      KVM: Unify kvm_mmu_pre_write() and kvm_mmu_post_write()
      KVM: MMU: Respect nonpae pagetable quadrant when zapping ptes
      KVM: Update shadow pte on write to guest pte
      KVM: Increase mmu shadow cache to 1024 pages
      KVM: Fix potential guest state leak into host
      KVM: Move some more msr mangling into vmx_save_host_state()
      KVM: Rationalize exception bitmap usage
      KVM: Consolidate guest fpu activation and deactivation
      KVM: Set cr0.mp for guests
      KVM: MMU: Simplify kvm_mmu_free_page() a tiny bit
      KVM: MMU: Store shadow page tables as kernel virtual addresses, not physical
      KVM: VMX: Only reload guest msrs if they are already loaded
      KVM: Avoid corrupting tr in real mode
      KVM: Fix vmx I/O bitmap initialization on highmem systems
      KVM: VMX: Use local labels in inline assembly
      KVM: x86 emulator: implement wbinvd
      KVM: MMU: Use slab caches for shadow pages and their headers
      KVM: MMU: Simplify fetch() a little bit
      KVM: MMU: Move set_pte_common() to pte width dependent code
      KVM: MMU: Pass the guest pde to set_pte_common
      KVM: MMU: Fold fix_read_pf() into set_pte_common()
      KVM: MMU: Fold fix_write_pf() into set_pte_common()
      KVM: Move shadow pte modifications from set_pte/set_pde to set_pde_common()
      KVM: Make shadow pte updates atomic
      KVM: MMU: Make setting shadow ptes atomic on i386
      KVM: MMU: Remove cr0.wp tricks
      KVM: MMU: Simplify accessed/dirty/present/nx bit handling
      KVM: MMU: Don't cache guest access bits in the shadow page table
      KVM: MMU: Remove unused large page marker
      KVM: Lazy guest cr3 switching
      KVM: Fix vcpu freeing for guest smp
      KVM: Fix adding an smp virtual machine to the vm list
      KVM: Enable guest smp
      KVM: Move duplicate halt handling code into kvm_main.c
      KVM: Emulate hlt on real mode for Intel
      KVM: Keep an upper bound of initialized vcpus
      KVM: Flush remote tlbs when reducing shadow pte permissions
      KVM: Initialize the BSP bit in the APIC_BASE msr correctly
      KVM: VMX: Ensure vcpu time stamp counter is monotonous

Eddie Dong (4):
      KVM: VMX: Avoid saving and restoring msrs on lightweight vmexit
      KVM: VMX: Cleanup redundant code in MSR set
      KVM: VMX: Avoid saving and restoring msr_efer on lightweight vmexit
      KVM: Use symbolic constants instead of magic numbers

He, Qing (1):
      KVM: VMX: Enable io bitmaps to avoid IO port 0x80 VMEXITs

Jan Engelhardt (1):
      Use menuconfig objects II - KVM/Virt

Markus Rechberger (1):
      KVM: Fix includes

Matthew Gregan (1):
      KVM: Implement IA32_EBL_CR_POWERON msr

Nguyen Anh Quynh (1):
      KVM: Remove unnecessary initialization and checks in mark_page_dirty()

Nitin A Kamble (1):
      KVM: VMX: Handle #SS faults from real mode

Robert P. J. Day (1):
      KVM: Replace C code with call to ARRAY_SIZE() macro.

Shani Moideen (2):
      KVM: SVM: Replace memset(<addr>, 0, PAGESIZE) with clear_page(<addr>)
      KVM: VMX: Replace memset(<addr>, 0, PAGESIZE) with clear_page(<addr>)

 drivers/kvm/Kconfig       |    9 +-
 drivers/kvm/kvm.h         |   53 ++++-
 drivers/kvm/kvm_main.c    |  115 ++++++++-
 drivers/kvm/mmu.c         |  284 +++++++++-----------
 drivers/kvm/paging_tmpl.h |  273 ++++++++++---------
 drivers/kvm/svm.c         |   46 ++--
 drivers/kvm/vmx.c         |  640 ++++++++++++++++++++++++++++-----------------
 drivers/kvm/x86_emulate.c |   10 +-
 8 files changed, 868 insertions(+), 562 deletions(-)

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [PATCH 01/58] KVM: VMX: Enable io bitmaps to avoid IO port 0x80 VMEXITs
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
@ 2007-06-17  9:43 ` Avi Kivity
  2007-06-17  9:43 ` [PATCH 02/58] KVM: SVM: Allow direct guest access to PC debug port Avi Kivity
                   ` (55 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:43 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, He, Qing, Avi Kivity

From: He, Qing <qing.he@intel.com>

This patch enables I/O bitmap control on vmx and unmasks port 0x80 to
avoid VMEXITs caused by accessing it.  Port 0x80 is used for I/O delays
(see include/asm/io.h), so handling VMEXITs on its access is unnecessary
and only slows things down.  This patch improves kernel build times by
around 3%-5%.

Because every VM uses the same io bitmap, it is shared between all VMs
rather than kept as a per-VM data structure.

Signed-off-by: Qing He <qing.he@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/vmx.c |   50 ++++++++++++++++++++++++++++++++++++++++++++++----
 1 files changed, 46 insertions(+), 4 deletions(-)

diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index c1ac106..52bd5f0 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -34,6 +34,9 @@ MODULE_LICENSE("GPL");
 static DEFINE_PER_CPU(struct vmcs *, vmxarea);
 static DEFINE_PER_CPU(struct vmcs *, current_vmcs);
 
+static struct page *vmx_io_bitmap_a;
+static struct page *vmx_io_bitmap_b;
+
 #ifdef CONFIG_X86_64
 #define HOST_IS_64 1
 #else
@@ -1129,8 +1132,8 @@ static int vmx_vcpu_setup(struct kvm_vcpu *vcpu)
 	vmcs_write32(GUEST_PENDING_DBG_EXCEPTIONS, 0);
 
 	/* I/O */
-	vmcs_write64(IO_BITMAP_A, 0);
-	vmcs_write64(IO_BITMAP_B, 0);
+	vmcs_write64(IO_BITMAP_A, page_to_phys(vmx_io_bitmap_a));
+	vmcs_write64(IO_BITMAP_B, page_to_phys(vmx_io_bitmap_b));
 
 	guest_write_tsc(0);
 
@@ -1150,7 +1153,7 @@ static int vmx_vcpu_setup(struct kvm_vcpu *vcpu)
 			       CPU_BASED_HLT_EXITING         /* 20.6.2 */
 			       | CPU_BASED_CR8_LOAD_EXITING    /* 20.6.2 */
 			       | CPU_BASED_CR8_STORE_EXITING   /* 20.6.2 */
-			       | CPU_BASED_UNCOND_IO_EXITING   /* 20.6.2 */
+			       | CPU_BASED_ACTIVATE_IO_BITMAP  /* 20.6.2 */
 			       | CPU_BASED_MOV_DR_EXITING
 			       | CPU_BASED_USE_TSC_OFFSETING   /* 21.3 */
 			);
@@ -2188,11 +2191,50 @@ static struct kvm_arch_ops vmx_arch_ops = {
 
 static int __init vmx_init(void)
 {
-	return kvm_init_arch(&vmx_arch_ops, THIS_MODULE);
+	void *iova;
+	int r;
+
+	vmx_io_bitmap_a = alloc_page(GFP_KERNEL | __GFP_HIGHMEM);
+	if (!vmx_io_bitmap_a)
+		return -ENOMEM;
+
+	vmx_io_bitmap_b = alloc_page(GFP_KERNEL | __GFP_HIGHMEM);
+	if (!vmx_io_bitmap_b) {
+		r = -ENOMEM;
+		goto out;
+	}
+
+	/*
+	 * Allow direct access to the PC debug port (it is often used for I/O
+	 * delays, but the vmexits simply slow things down).
+	 */
+	iova = kmap(vmx_io_bitmap_a);
+	memset(iova, 0xff, PAGE_SIZE);
+	clear_bit(0x80, iova);
+	kunmap(iova);
+
+	iova = kmap(vmx_io_bitmap_b);
+	memset(iova, 0xff, PAGE_SIZE);
+	kunmap(iova);
+
+	r = kvm_init_arch(&vmx_arch_ops, THIS_MODULE);
+	if (r)
+		goto out1;
+
+	return 0;
+
+out1:
+	__free_page(vmx_io_bitmap_b);
+out:
+	__free_page(vmx_io_bitmap_a);
+	return r;
 }
 
 static void __exit vmx_exit(void)
 {
+	__free_page(vmx_io_bitmap_b);
+	__free_page(vmx_io_bitmap_a);
+
 	kvm_exit_arch();
 }
 
-- 
1.5.0.6



* [PATCH 02/58] KVM: SVM: Allow direct guest access to PC debug port
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
  2007-06-17  9:43 ` [PATCH 01/58] KVM: VMX: Enable io bitmaps to avoid IO port 0x80 VMEXITs Avi Kivity
@ 2007-06-17  9:43 ` Avi Kivity
  2007-06-17  9:43 ` [PATCH 03/58] KVM: Assume that writes smaller than 4 bytes are to non-pagetable pages Avi Kivity
                   ` (54 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:43 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Anthony Liguori, Avi Kivity

From: Anthony Liguori <aliguori@us.ibm.com>

The PC debug port is used for IO delay and does not require emulation.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/svm.c |    8 +++++---
 1 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/kvm/svm.c b/drivers/kvm/svm.c
index fa17d6d..6cd6a50 100644
--- a/drivers/kvm/svm.c
+++ b/drivers/kvm/svm.c
@@ -378,7 +378,7 @@ static __init int svm_hardware_setup(void)
 	int cpu;
 	struct page *iopm_pages;
 	struct page *msrpm_pages;
-	void *msrpm_va;
+	void *iopm_va, *msrpm_va;
 	int r;
 
 	kvm_emulator_want_group7_invlpg();
@@ -387,8 +387,10 @@ static __init int svm_hardware_setup(void)
 
 	if (!iopm_pages)
 		return -ENOMEM;
-	memset(page_address(iopm_pages), 0xff,
-					PAGE_SIZE * (1 << IOPM_ALLOC_ORDER));
+
+	iopm_va = page_address(iopm_pages);
+	memset(iopm_va, 0xff, PAGE_SIZE * (1 << IOPM_ALLOC_ORDER));
+	clear_bit(0x80, iopm_va); /* allow direct access to PC debug port */
 	iopm_base = page_to_pfn(iopm_pages) << PAGE_SHIFT;
 
 
-- 
1.5.0.6



* [PATCH 03/58] KVM: Assume that writes smaller than 4 bytes are to non-pagetable pages
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
  2007-06-17  9:43 ` [PATCH 01/58] KVM: VMX: Enable io bitmaps to avoid IO port 0x80 VMEXITs Avi Kivity
  2007-06-17  9:43 ` [PATCH 02/58] KVM: SVM: Allow direct guest access to PC debug port Avi Kivity
@ 2007-06-17  9:43 ` Avi Kivity
  2007-06-17  9:43 ` [PATCH 04/58] KVM: Avoid saving and restoring some host CPU state on lightweight vmexit Avi Kivity
                   ` (53 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:43 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

This allows us to remove write protection earlier than otherwise.  Should
some mad OS choose to use byte writes to update pagetables, it will suffer
a performance hit, but still work correctly.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/mmu.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index e8e2281..2277b7c 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -1169,6 +1169,7 @@ void kvm_mmu_pre_write(struct kvm_vcpu *vcpu, gpa_t gpa, int bytes)
 			continue;
 		pte_size = page->role.glevels == PT32_ROOT_LEVEL ? 4 : 8;
 		misaligned = (offset ^ (offset + bytes - 1)) & ~(pte_size - 1);
+		misaligned |= bytes < 4;
 		if (misaligned || flooded) {
 			/*
 			 * Misaligned accesses are too much trouble to fix
-- 
1.5.0.6



* [PATCH 04/58] KVM: Avoid saving and restoring some host CPU state on lightweight vmexit
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (2 preceding siblings ...)
  2007-06-17  9:43 ` [PATCH 03/58] KVM: Assume that writes smaller than 4 bytes are to non-pagetable pages Avi Kivity
@ 2007-06-17  9:43 ` Avi Kivity
  2007-06-17  9:43 ` [PATCH 05/58] KVM: Unindent some code Avi Kivity
                   ` (52 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:43 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity, Yaozu Dong

Many msrs and the like will only be used by the host if we schedule() or
return to userspace.  Therefore, we avoid saving them if we handle the
exit within the kernel, and if a reschedule is not requested.

Based on a patch from Eddie Dong <eddie.dong@intel.com> with a couple of
fixes by me.

Signed-off-by: Yaozu(Eddie) Dong <eddie.dong@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/kvm.h      |    1 +
 drivers/kvm/kvm_main.c |    1 +
 drivers/kvm/vmx.c      |  105 +++++++++++++++++++++++++++--------------------
 3 files changed, 62 insertions(+), 45 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index 152312c..7facebd 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -252,6 +252,7 @@ struct kvm_stat {
 	u32 halt_exits;
 	u32 request_irq_exits;
 	u32 irq_exits;
+	u32 light_exits;
 };
 
 struct kvm_vcpu {
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index 8f1f07a..7d68258 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -72,6 +72,7 @@ static struct kvm_stats_debugfs_item {
 	{ "halt_exits", STAT_OFFSET(halt_exits) },
 	{ "request_irq", STAT_OFFSET(request_irq_exits) },
 	{ "irq_exits", STAT_OFFSET(irq_exits) },
+	{ "light_exits", STAT_OFFSET(light_exits) },
 	{ NULL }
 };
 
diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index 52bd5f0..84ce0c0 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -483,6 +483,13 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 data)
 	case MSR_GS_BASE:
 		vmcs_writel(GUEST_GS_BASE, data);
 		break;
+	case MSR_LSTAR:
+	case MSR_SYSCALL_MASK:
+		msr = find_msr_entry(vcpu, msr_index);
+		if (msr)
+			msr->data = data;
+		load_msrs(vcpu->guest_msrs, NR_BAD_MSRS);
+		break;
 #endif
 	case MSR_IA32_SYSENTER_CS:
 		vmcs_write32(GUEST_SYSENTER_CS, data);
@@ -1820,7 +1827,7 @@ static int vmx_vcpu_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 	int fs_gs_ldt_reload_needed;
 	int r;
 
-again:
+preempted:
 	/*
 	 * Set host fs and gs selectors.  Unfortunately, 22.2.3 does not
 	 * allow segment selectors with cpl > 0 or ti == 1.
@@ -1851,13 +1858,6 @@ again:
 	if (vcpu->guest_debug.enabled)
 		kvm_guest_debug_pre(vcpu);
 
-	kvm_load_guest_fpu(vcpu);
-
-	/*
-	 * Loading guest fpu may have cleared host cr0.ts
-	 */
-	vmcs_writel(HOST_CR0, read_cr0());
-
 #ifdef CONFIG_X86_64
 	if (is_long_mode(vcpu)) {
 		save_msrs(vcpu->host_msrs + msr_offset_kernel_gs_base, 1);
@@ -1865,6 +1865,14 @@ again:
 	}
 #endif
 
+again:
+	kvm_load_guest_fpu(vcpu);
+
+	/*
+	 * Loading guest fpu may have cleared host cr0.ts
+	 */
+	vmcs_writel(HOST_CR0, read_cr0());
+
 	asm (
 		/* Store host registers */
 		"pushf \n\t"
@@ -1984,36 +1992,8 @@ again:
 		[cr2]"i"(offsetof(struct kvm_vcpu, cr2))
 	      : "cc", "memory" );
 
-	/*
-	 * Reload segment selectors ASAP. (it's needed for a functional
-	 * kernel: x86 relies on having __KERNEL_PDA in %fs and x86_64
-	 * relies on having 0 in %gs for the CPU PDA to work.)
-	 */
-	if (fs_gs_ldt_reload_needed) {
-		load_ldt(ldt_sel);
-		load_fs(fs_sel);
-		/*
-		 * If we have to reload gs, we must take care to
-		 * preserve our gs base.
-		 */
-		local_irq_disable();
-		load_gs(gs_sel);
-#ifdef CONFIG_X86_64
-		wrmsrl(MSR_GS_BASE, vmcs_readl(HOST_GS_BASE));
-#endif
-		local_irq_enable();
-
-		reload_tss();
-	}
 	++vcpu->stat.exits;
 
-#ifdef CONFIG_X86_64
-	if (is_long_mode(vcpu)) {
-		save_msrs(vcpu->guest_msrs, NR_BAD_MSRS);
-		load_msrs(vcpu->host_msrs, NR_BAD_MSRS);
-	}
-#endif
-
 	vcpu->interrupt_window_open = (vmcs_read32(GUEST_INTERRUPTIBILITY_INFO) & 3) == 0;
 
 	asm ("mov %0, %%ds; mov %0, %%es" : : "r"(__USER_DS));
@@ -2035,24 +2015,59 @@ again:
 		if (r > 0) {
 			/* Give scheduler a change to reschedule. */
 			if (signal_pending(current)) {
-				++vcpu->stat.signal_exits;
-				post_kvm_run_save(vcpu, kvm_run);
+				r = -EINTR;
 				kvm_run->exit_reason = KVM_EXIT_INTR;
-				return -EINTR;
+				++vcpu->stat.signal_exits;
+				goto out;
 			}
 
 			if (dm_request_for_irq_injection(vcpu, kvm_run)) {
-				++vcpu->stat.request_irq_exits;
-				post_kvm_run_save(vcpu, kvm_run);
+				r = -EINTR;
 				kvm_run->exit_reason = KVM_EXIT_INTR;
-				return -EINTR;
+				++vcpu->stat.request_irq_exits;
+				goto out;
+			}
+			if (!need_resched()) {
+				++vcpu->stat.light_exits;
+				goto again;
 			}
-
-			kvm_resched(vcpu);
-			goto again;
 		}
 	}
 
+out:
+	/*
+	 * Reload segment selectors ASAP. (it's needed for a functional
+	 * kernel: x86 relies on having __KERNEL_PDA in %fs and x86_64
+	 * relies on having 0 in %gs for the CPU PDA to work.)
+	 */
+	if (fs_gs_ldt_reload_needed) {
+		load_ldt(ldt_sel);
+		load_fs(fs_sel);
+		/*
+		 * If we have to reload gs, we must take care to
+		 * preserve our gs base.
+		 */
+		local_irq_disable();
+		load_gs(gs_sel);
+#ifdef CONFIG_X86_64
+		wrmsrl(MSR_GS_BASE, vmcs_readl(HOST_GS_BASE));
+#endif
+		local_irq_enable();
+
+		reload_tss();
+	}
+#ifdef CONFIG_X86_64
+	if (is_long_mode(vcpu)) {
+		save_msrs(vcpu->guest_msrs, NR_BAD_MSRS);
+		load_msrs(vcpu->host_msrs, NR_BAD_MSRS);
+	}
+#endif
+
+	if (r > 0) {
+		kvm_resched(vcpu);
+		goto preempted;
+	}
+
 	post_kvm_run_save(vcpu, kvm_run);
 	return r;
 }
-- 
1.5.0.6



* [PATCH 05/58] KVM: Unindent some code
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (3 preceding siblings ...)
  2007-06-17  9:43 ` [PATCH 04/58] KVM: Avoid saving and restoring some host CPU state on lightweight vmexit Avi Kivity
@ 2007-06-17  9:43 ` Avi Kivity
  2007-06-17  9:43 ` [PATCH 06/58] KVM: Reduce misfirings of the fork detector Avi Kivity
                   ` (51 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:43 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/vmx.c |   58 ++++++++++++++++++++++++++--------------------------
 1 files changed, 29 insertions(+), 29 deletions(-)

diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index 84ce0c0..9ebb18d 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -1998,39 +1998,39 @@ again:
 
 	asm ("mov %0, %%ds; mov %0, %%es" : : "r"(__USER_DS));
 
-	if (fail) {
+	if (unlikely(fail)) {
 		kvm_run->exit_reason = KVM_EXIT_FAIL_ENTRY;
 		kvm_run->fail_entry.hardware_entry_failure_reason
 			= vmcs_read32(VM_INSTRUCTION_ERROR);
 		r = 0;
-	} else {
-		/*
-		 * Profile KVM exit RIPs:
-		 */
-		if (unlikely(prof_on == KVM_PROFILING))
-			profile_hit(KVM_PROFILING, (void *)vmcs_readl(GUEST_RIP));
-
-		vcpu->launched = 1;
-		r = kvm_handle_exit(kvm_run, vcpu);
-		if (r > 0) {
-			/* Give scheduler a change to reschedule. */
-			if (signal_pending(current)) {
-				r = -EINTR;
-				kvm_run->exit_reason = KVM_EXIT_INTR;
-				++vcpu->stat.signal_exits;
-				goto out;
-			}
-
-			if (dm_request_for_irq_injection(vcpu, kvm_run)) {
-				r = -EINTR;
-				kvm_run->exit_reason = KVM_EXIT_INTR;
-				++vcpu->stat.request_irq_exits;
-				goto out;
-			}
-			if (!need_resched()) {
-				++vcpu->stat.light_exits;
-				goto again;
-			}
+		goto out;
+	}
+	/*
+	 * Profile KVM exit RIPs:
+	 */
+	if (unlikely(prof_on == KVM_PROFILING))
+		profile_hit(KVM_PROFILING, (void *)vmcs_readl(GUEST_RIP));
+
+	vcpu->launched = 1;
+	r = kvm_handle_exit(kvm_run, vcpu);
+	if (r > 0) {
+		/* Give scheduler a change to reschedule. */
+		if (signal_pending(current)) {
+			r = -EINTR;
+			kvm_run->exit_reason = KVM_EXIT_INTR;
+			++vcpu->stat.signal_exits;
+			goto out;
+		}
+
+		if (dm_request_for_irq_injection(vcpu, kvm_run)) {
+			r = -EINTR;
+			kvm_run->exit_reason = KVM_EXIT_INTR;
+			++vcpu->stat.request_irq_exits;
+			goto out;
+		}
+		if (!need_resched()) {
+			++vcpu->stat.light_exits;
+			goto again;
 		}
 	}
 
-- 
1.5.0.6



* [PATCH 06/58] KVM: Reduce misfirings of the fork detector
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (4 preceding siblings ...)
  2007-06-17  9:43 ` [PATCH 05/58] KVM: Unindent some code Avi Kivity
@ 2007-06-17  9:43 ` Avi Kivity
  2007-06-17  9:43 ` [PATCH 07/58] KVM: Be more careful restoring fs on lightweight vmexit Avi Kivity
                   ` (50 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:43 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

The kvm mmu tries to detect forks by looking for repeated writes to a
page table.  If it sees a fork, it unshadows the page table so the page
table copying can proceed at native speed instead of being emulated.

However, the detector also triggered on simple demand paging access patterns:
a linear walk of memory would of course cause repeated writes to the same
pagetable page, causing it to unshadow prematurely.

Fix by resetting the fork detector if we detect a demand fault.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/paging_tmpl.h |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/kvm/paging_tmpl.h b/drivers/kvm/paging_tmpl.h
index 73ffbff..bc64cce 100644
--- a/drivers/kvm/paging_tmpl.h
+++ b/drivers/kvm/paging_tmpl.h
@@ -421,6 +421,7 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, gva_t addr,
 		pgprintk("%s: guest page fault\n", __FUNCTION__);
 		inject_page_fault(vcpu, addr, walker.error_code);
 		FNAME(release_walker)(&walker);
+		vcpu->last_pt_write_count = 0; /* reset fork detector */
 		return 0;
 	}
 
@@ -442,6 +443,9 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, gva_t addr,
 
 	FNAME(release_walker)(&walker);
 
+	if (!write_pt)
+		vcpu->last_pt_write_count = 0; /* reset fork detector */
+
 	/*
 	 * mmio: emulate if accessible, otherwise its a guest fault.
 	 */
-- 
1.5.0.6



* [PATCH 07/58] KVM: Be more careful restoring fs on lightweight vmexit
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (5 preceding siblings ...)
  2007-06-17  9:43 ` [PATCH 06/58] KVM: Reduce misfirings of the fork detector Avi Kivity
@ 2007-06-17  9:43 ` Avi Kivity
  2007-06-17  9:43 ` [PATCH 08/58] KVM: Unify kvm_mmu_pre_write() and kvm_mmu_post_write() Avi Kivity
                   ` (49 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:43 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

i386 wants fs for accessing the pda even on a lightweight exit, so ensure
we can always restore it.  This fixes a regression on i386 introduced by
the lightweight vmexit patch.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/vmx.c |   22 +++++++++++-----------
 1 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index 9ebb18d..49cadd3 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -1832,16 +1832,21 @@ preempted:
 	 * Set host fs and gs selectors.  Unfortunately, 22.2.3 does not
 	 * allow segment selectors with cpl > 0 or ti == 1.
 	 */
-	fs_sel = read_fs();
-	gs_sel = read_gs();
 	ldt_sel = read_ldt();
-	fs_gs_ldt_reload_needed = (fs_sel & 7) | (gs_sel & 7) | ldt_sel;
-	if (!fs_gs_ldt_reload_needed) {
+	fs_gs_ldt_reload_needed = ldt_sel;
+	fs_sel = read_fs();
+	if (!(fs_sel & 7))
 		vmcs_write16(HOST_FS_SELECTOR, fs_sel);
-		vmcs_write16(HOST_GS_SELECTOR, gs_sel);
-	} else {
+	else {
 		vmcs_write16(HOST_FS_SELECTOR, 0);
+		fs_gs_ldt_reload_needed = 1;
+	}
+	gs_sel = read_gs();
+	if (!(gs_sel & 7))
+		vmcs_write16(HOST_GS_SELECTOR, gs_sel);
+	else {
 		vmcs_write16(HOST_GS_SELECTOR, 0);
+		fs_gs_ldt_reload_needed = 1;
 	}
 
 #ifdef CONFIG_X86_64
@@ -2035,11 +2040,6 @@ again:
 	}
 
 out:
-	/*
-	 * Reload segment selectors ASAP. (it's needed for a functional
-	 * kernel: x86 relies on having __KERNEL_PDA in %fs and x86_64
-	 * relies on having 0 in %gs for the CPU PDA to work.)
-	 */
 	if (fs_gs_ldt_reload_needed) {
 		load_ldt(ldt_sel);
 		load_fs(fs_sel);
-- 
1.5.0.6



* [PATCH 08/58] KVM: Unify kvm_mmu_pre_write() and kvm_mmu_post_write()
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (6 preceding siblings ...)
  2007-06-17  9:43 ` [PATCH 07/58] KVM: Be more careful restoring fs on lightweight vmexit Avi Kivity
@ 2007-06-17  9:43 ` Avi Kivity
  2007-06-17  9:43 ` [PATCH 09/58] KVM: MMU: Respect nonpae pagetable quadrant when zapping ptes Avi Kivity
                   ` (48 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:43 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

Instead of calling two functions and repeating expensive checks, call one
function and provide it with before/after information.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/kvm.h      |    4 ++--
 drivers/kvm/kvm_main.c |    4 ++--
 drivers/kvm/mmu.c      |   11 ++++-------
 3 files changed, 8 insertions(+), 11 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index 7facebd..11c519e 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -525,8 +525,8 @@ int kvm_write_guest(struct kvm_vcpu *vcpu,
 
 unsigned long segment_base(u16 selector);
 
-void kvm_mmu_pre_write(struct kvm_vcpu *vcpu, gpa_t gpa, int bytes);
-void kvm_mmu_post_write(struct kvm_vcpu *vcpu, gpa_t gpa, int bytes);
+void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
+		       const u8 *old, const u8 *new, int bytes);
 int kvm_mmu_unprotect_page_virt(struct kvm_vcpu *vcpu, gva_t gva);
 void kvm_mmu_free_some_pages(struct kvm_vcpu *vcpu);
 
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index 7d68258..b6ad9c6 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -1071,18 +1071,18 @@ static int emulator_write_phys(struct kvm_vcpu *vcpu, gpa_t gpa,
 {
 	struct page *page;
 	void *virt;
+	unsigned offset = offset_in_page(gpa);
 
 	if (((gpa + bytes - 1) >> PAGE_SHIFT) != (gpa >> PAGE_SHIFT))
 		return 0;
 	page = gfn_to_page(vcpu->kvm, gpa >> PAGE_SHIFT);
 	if (!page)
 		return 0;
-	kvm_mmu_pre_write(vcpu, gpa, bytes);
 	mark_page_dirty(vcpu->kvm, gpa >> PAGE_SHIFT);
 	virt = kmap_atomic(page, KM_USER0);
+	kvm_mmu_pte_write(vcpu, gpa, virt + offset, val, bytes);
 	memcpy(virt + offset_in_page(gpa), val, bytes);
 	kunmap_atomic(virt, KM_USER0);
-	kvm_mmu_post_write(vcpu, gpa, bytes);
 	return 1;
 }
 
diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index 2277b7c..b3a83ef 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -1118,7 +1118,7 @@ out:
 	return r;
 }
 
-static void mmu_pre_write_zap_pte(struct kvm_vcpu *vcpu,
+static void mmu_pte_write_zap_pte(struct kvm_vcpu *vcpu,
 				  struct kvm_mmu_page *page,
 				  u64 *spte)
 {
@@ -1137,7 +1137,8 @@ static void mmu_pre_write_zap_pte(struct kvm_vcpu *vcpu,
 	*spte = 0;
 }
 
-void kvm_mmu_pre_write(struct kvm_vcpu *vcpu, gpa_t gpa, int bytes)
+void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
+		       const u8 *old, const u8 *new, int bytes)
 {
 	gfn_t gfn = gpa >> PAGE_SHIFT;
 	struct kvm_mmu_page *page;
@@ -1206,16 +1207,12 @@ void kvm_mmu_pre_write(struct kvm_vcpu *vcpu, gpa_t gpa, int bytes)
 		spte = __va(page->page_hpa);
 		spte += page_offset / sizeof(*spte);
 		while (npte--) {
-			mmu_pre_write_zap_pte(vcpu, page, spte);
+			mmu_pte_write_zap_pte(vcpu, page, spte);
 			++spte;
 		}
 	}
 }
 
-void kvm_mmu_post_write(struct kvm_vcpu *vcpu, gpa_t gpa, int bytes)
-{
-}
-
 int kvm_mmu_unprotect_page_virt(struct kvm_vcpu *vcpu, gva_t gva)
 {
 	gpa_t gpa = vcpu->mmu.gva_to_gpa(vcpu, gva);
-- 
1.5.0.6



* [PATCH 09/58] KVM: MMU: Respect nonpae pagetable quadrant when zapping ptes
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (7 preceding siblings ...)
  2007-06-17  9:43 ` [PATCH 08/58] KVM: Unify kvm_mmu_pre_write() and kvm_mmu_post_write() Avi Kivity
@ 2007-06-17  9:43 ` Avi Kivity
  2007-06-17  9:43 ` [PATCH 10/58] KVM: Update shadow pte on write to guest pte Avi Kivity
                   ` (47 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:43 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

When a guest writes to a page that has an mmu shadow, we have to clear
the shadow pte corresponding to the memory location touched by the guest.

Now, in nonpae mode, a single guest page may have two or four shadow
pages (because a nonpae page maps 4MB or 4GB, whereas the pae shadow maps
2MB or 1GB), so when we look up the page we find up to three additional
aliases for it.  Since we _clear_ the shadow pte, this doesn't matter
beyond a slight performance penalty, but if we want to _update_ the
shadow pte instead of clearing it, it is vital that we don't modify the
aliases.

Fortunately, exactly which page is needed (the "quadrant") is easily
computed, and is accessible in the shadow page header.  All we need is
to ignore shadow pages from the wrong quadrants.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/mmu.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index b3a83ef..23dc461 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -1150,6 +1150,7 @@ void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
 	unsigned pte_size;
 	unsigned page_offset;
 	unsigned misaligned;
+	unsigned quadrant;
 	int level;
 	int flooded = 0;
 	int npte;
@@ -1202,7 +1203,10 @@ void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
 				page_offset <<= 1;
 				npte = 2;
 			}
+			quadrant = page_offset >> PAGE_SHIFT;
 			page_offset &= ~PAGE_MASK;
+			if (quadrant != page->role.quadrant)
+				continue;
 		}
 		spte = __va(page->page_hpa);
 		spte += page_offset / sizeof(*spte);
-- 
1.5.0.6



* [PATCH 10/58] KVM: Update shadow pte on write to guest pte
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (8 preceding siblings ...)
  2007-06-17  9:43 ` [PATCH 09/58] KVM: MMU: Respect nonpae pagetable quadrant when zapping ptes Avi Kivity
@ 2007-06-17  9:43 ` Avi Kivity
  2007-06-17  9:43 ` [PATCH 11/58] KVM: Increase mmu shadow cache to 1024 pages Avi Kivity
                   ` (46 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:43 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

A typical demand page/copy on write pattern is:

- page fault on vaddr
- kvm propagates fault to guest
- guest handles fault, updates pte
- kvm traps write, clears shadow pte, resumes guest
- guest returns to userspace, re-faults on same vaddr
- kvm installs shadow pte, resumes guest
- guest continues

So, three vmexits for a single guest page fault.  But if, instead of
clearing the page table entry, we update it to correspond to the value
that the guest has just written, we eliminate the third vmexit.

This patch does exactly that, reducing kbuild time by about 10%.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/mmu.c         |   15 +++++++++++++++
 drivers/kvm/paging_tmpl.h |   15 +++++++++++++++
 2 files changed, 30 insertions(+), 0 deletions(-)

diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index 23dc461..9ec3df9 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -1137,6 +1137,20 @@ static void mmu_pte_write_zap_pte(struct kvm_vcpu *vcpu,
 	*spte = 0;
 }
 
+static void mmu_pte_write_new_pte(struct kvm_vcpu *vcpu,
+				  struct kvm_mmu_page *page,
+				  u64 *spte,
+				  const void *new, int bytes)
+{
+	if (page->role.level != PT_PAGE_TABLE_LEVEL)
+		return;
+
+	if (page->role.glevels == PT32_ROOT_LEVEL)
+		paging32_update_pte(vcpu, page, spte, new, bytes);
+	else
+		paging64_update_pte(vcpu, page, spte, new, bytes);
+}
+
 void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
 		       const u8 *old, const u8 *new, int bytes)
 {
@@ -1212,6 +1226,7 @@ void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
 		spte += page_offset / sizeof(*spte);
 		while (npte--) {
 			mmu_pte_write_zap_pte(vcpu, page, spte);
+			mmu_pte_write_new_pte(vcpu, page, spte, new, bytes);
 			++spte;
 		}
 	}
diff --git a/drivers/kvm/paging_tmpl.h b/drivers/kvm/paging_tmpl.h
index bc64cce..10ba0a8 100644
--- a/drivers/kvm/paging_tmpl.h
+++ b/drivers/kvm/paging_tmpl.h
@@ -202,6 +202,21 @@ static void FNAME(set_pte)(struct kvm_vcpu *vcpu, u64 guest_pte,
 		       guest_pte & PT_DIRTY_MASK, access_bits, gfn);
 }
 
+static void FNAME(update_pte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *page,
+			      u64 *spte, const void *pte, int bytes)
+{
+	pt_element_t gpte;
+
+	if (bytes < sizeof(pt_element_t))
+		return;
+	gpte = *(const pt_element_t *)pte;
+	if (~gpte & (PT_PRESENT_MASK | PT_ACCESSED_MASK))
+		return;
+	pgprintk("%s: gpte %llx spte %p\n", __FUNCTION__, (u64)gpte, spte);
+	FNAME(set_pte)(vcpu, gpte, spte, 6,
+		       (gpte & PT_BASE_ADDR_MASK) >> PAGE_SHIFT);
+}
+
 static void FNAME(set_pde)(struct kvm_vcpu *vcpu, u64 guest_pde,
 			   u64 *shadow_pte, u64 access_bits, gfn_t gfn)
 {
-- 
1.5.0.6



* [PATCH 11/58] KVM: Increase mmu shadow cache to 1024 pages
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (9 preceding siblings ...)
  2007-06-17  9:43 ` [PATCH 10/58] KVM: Update shadow pte on write to guest pte Avi Kivity
@ 2007-06-17  9:43 ` Avi Kivity
  2007-06-17  9:43 ` [PATCH 12/58] KVM: Fix potential guest state leak into host Avi Kivity
                   ` (45 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:43 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

This improves kbuild times by about 10%, bringing it within a respectable
25% of native.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/kvm.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index 11c519e..f6ee189 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -54,7 +54,7 @@
 #define KVM_MAX_VCPUS 1
 #define KVM_ALIAS_SLOTS 4
 #define KVM_MEMORY_SLOTS 4
-#define KVM_NUM_MMU_PAGES 256
+#define KVM_NUM_MMU_PAGES 1024
 #define KVM_MIN_FREE_MMU_PAGES 5
 #define KVM_REFILL_PAGES 25
 #define KVM_MAX_CPUID_ENTRIES 40
-- 
1.5.0.6



* [PATCH 12/58] KVM: Fix potential guest state leak into host
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (10 preceding siblings ...)
  2007-06-17  9:43 ` [PATCH 11/58] KVM: Increase mmu shadow cache to 1024 pages Avi Kivity
@ 2007-06-17  9:43 ` Avi Kivity
  2007-06-17  9:43 ` [PATCH 13/58] KVM: Move some more msr mangling into vmx_save_host_state() Avi Kivity
                   ` (44 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:43 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

The lightweight vmexit path avoids saving and reloading certain host
state.  However, in certain cases lightweight vmexit handling can
schedule(), which requires reloading the host state.

So we store the host state in the vcpu structure, and reload it if we
relinquish the vcpu.
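The save/restore protocol reduces to a `loaded` flag guarding both directions, as in the vmx_save_host_state()/vmx_load_host_state() pair below.  A minimal model (a single fake register and invented names, standing in for the fs/gs/ldt selectors):

```c
#include <assert.h>

struct host_state {
	int loaded;
	int fs_sel;		/* stand-in for the saved selectors */
};

static int hw_fs = 42;		/* stand-in for the live register */

/* Save at most once per "loaded" period; a second call is a no-op even
 * if the guest has clobbered the register in between. */
static void save_host_state(struct host_state *hs)
{
	if (hs->loaded)
		return;
	hs->loaded = 1;
	hs->fs_sel = hw_fs;
}

/* Restore only if something was actually saved. */
static void load_host_state(struct host_state *hs)
{
	if (!hs->loaded)
		return;
	hs->loaded = 0;
	hw_fs = hs->fs_sel;
}
```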

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/kvm.h |    5 ++
 drivers/kvm/vmx.c |  160 +++++++++++++++++++++++++++++-----------------------
 2 files changed, 94 insertions(+), 71 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index f6ee189..bb32383 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -306,6 +306,11 @@ struct kvm_vcpu {
 	char *guest_fx_image;
 	int fpu_active;
 	int guest_fpu_loaded;
+	struct vmx_host_state {
+		int loaded;
+		u16 fs_sel, gs_sel, ldt_sel;
+		int fs_gs_ldt_reload_needed;
+	} vmx_host_state;
 
 	int mmio_needed;
 	int mmio_read_completed;
diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index 49cadd3..677b38c 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -237,6 +237,93 @@ static void vmcs_set_bits(unsigned long field, u32 mask)
 	vmcs_writel(field, vmcs_readl(field) | mask);
 }
 
+static void reload_tss(void)
+{
+#ifndef CONFIG_X86_64
+
+	/*
+	 * VT restores TR but not its size.  Useless.
+	 */
+	struct descriptor_table gdt;
+	struct segment_descriptor *descs;
+
+	get_gdt(&gdt);
+	descs = (void *)gdt.base;
+	descs[GDT_ENTRY_TSS].type = 9; /* available TSS */
+	load_TR_desc();
+#endif
+}
+
+static void vmx_save_host_state(struct kvm_vcpu *vcpu)
+{
+	struct vmx_host_state *hs = &vcpu->vmx_host_state;
+
+	if (hs->loaded)
+		return;
+
+	hs->loaded = 1;
+	/*
+	 * Set host fs and gs selectors.  Unfortunately, 22.2.3 does not
+	 * allow segment selectors with cpl > 0 or ti == 1.
+	 */
+	hs->ldt_sel = read_ldt();
+	hs->fs_gs_ldt_reload_needed = hs->ldt_sel;
+	hs->fs_sel = read_fs();
+	if (!(hs->fs_sel & 7))
+		vmcs_write16(HOST_FS_SELECTOR, hs->fs_sel);
+	else {
+		vmcs_write16(HOST_FS_SELECTOR, 0);
+		hs->fs_gs_ldt_reload_needed = 1;
+	}
+	hs->gs_sel = read_gs();
+	if (!(hs->gs_sel & 7))
+		vmcs_write16(HOST_GS_SELECTOR, hs->gs_sel);
+	else {
+		vmcs_write16(HOST_GS_SELECTOR, 0);
+		hs->fs_gs_ldt_reload_needed = 1;
+	}
+
+#ifdef CONFIG_X86_64
+	vmcs_writel(HOST_FS_BASE, read_msr(MSR_FS_BASE));
+	vmcs_writel(HOST_GS_BASE, read_msr(MSR_GS_BASE));
+#else
+	vmcs_writel(HOST_FS_BASE, segment_base(hs->fs_sel));
+	vmcs_writel(HOST_GS_BASE, segment_base(hs->gs_sel));
+#endif
+}
+
+static void vmx_load_host_state(struct kvm_vcpu *vcpu)
+{
+	struct vmx_host_state *hs = &vcpu->vmx_host_state;
+
+	if (!hs->loaded)
+		return;
+
+	hs->loaded = 0;
+	if (hs->fs_gs_ldt_reload_needed) {
+		load_ldt(hs->ldt_sel);
+		load_fs(hs->fs_sel);
+		/*
+		 * If we have to reload gs, we must take care to
+		 * preserve our gs base.
+		 */
+		local_irq_disable();
+		load_gs(hs->gs_sel);
+#ifdef CONFIG_X86_64
+		wrmsrl(MSR_GS_BASE, vmcs_readl(HOST_GS_BASE));
+#endif
+		local_irq_enable();
+
+		reload_tss();
+	}
+#ifdef CONFIG_X86_64
+	if (is_long_mode(vcpu)) {
+		save_msrs(vcpu->guest_msrs, NR_BAD_MSRS);
+		load_msrs(vcpu->host_msrs, NR_BAD_MSRS);
+	}
+#endif
+}
+
 /*
  * Switches to specified vcpu, until a matching vcpu_put(), but assumes
  * vcpu mutex is already taken.
@@ -283,6 +370,7 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu)
 
 static void vmx_vcpu_put(struct kvm_vcpu *vcpu)
 {
+	vmx_load_host_state(vcpu);
 	kvm_put_guest_fpu(vcpu);
 	put_cpu();
 }
@@ -397,23 +485,6 @@ static void guest_write_tsc(u64 guest_tsc)
 	vmcs_write64(TSC_OFFSET, guest_tsc - host_tsc);
 }
 
-static void reload_tss(void)
-{
-#ifndef CONFIG_X86_64
-
-	/*
-	 * VT restores TR but not its size.  Useless.
-	 */
-	struct descriptor_table gdt;
-	struct segment_descriptor *descs;
-
-	get_gdt(&gdt);
-	descs = (void *)gdt.base;
-	descs[GDT_ENTRY_TSS].type = 9; /* available TSS */
-	load_TR_desc();
-#endif
-}
-
 /*
  * Reads an msr value (of 'msr_index') into 'pdata'.
  * Returns 0 on success, non-0 otherwise.
@@ -1823,40 +1894,9 @@ static int dm_request_for_irq_injection(struct kvm_vcpu *vcpu,
 static int vmx_vcpu_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 {
 	u8 fail;
-	u16 fs_sel, gs_sel, ldt_sel;
-	int fs_gs_ldt_reload_needed;
 	int r;
 
 preempted:
-	/*
-	 * Set host fs and gs selectors.  Unfortunately, 22.2.3 does not
-	 * allow segment selectors with cpl > 0 or ti == 1.
-	 */
-	ldt_sel = read_ldt();
-	fs_gs_ldt_reload_needed = ldt_sel;
-	fs_sel = read_fs();
-	if (!(fs_sel & 7))
-		vmcs_write16(HOST_FS_SELECTOR, fs_sel);
-	else {
-		vmcs_write16(HOST_FS_SELECTOR, 0);
-		fs_gs_ldt_reload_needed = 1;
-	}
-	gs_sel = read_gs();
-	if (!(gs_sel & 7))
-		vmcs_write16(HOST_GS_SELECTOR, gs_sel);
-	else {
-		vmcs_write16(HOST_GS_SELECTOR, 0);
-		fs_gs_ldt_reload_needed = 1;
-	}
-
-#ifdef CONFIG_X86_64
-	vmcs_writel(HOST_FS_BASE, read_msr(MSR_FS_BASE));
-	vmcs_writel(HOST_GS_BASE, read_msr(MSR_GS_BASE));
-#else
-	vmcs_writel(HOST_FS_BASE, segment_base(fs_sel));
-	vmcs_writel(HOST_GS_BASE, segment_base(gs_sel));
-#endif
-
 	if (!vcpu->mmio_read_completed)
 		do_interrupt_requests(vcpu, kvm_run);
 
@@ -1871,6 +1911,7 @@ preempted:
 #endif
 
 again:
+	vmx_save_host_state(vcpu);
 	kvm_load_guest_fpu(vcpu);
 
 	/*
@@ -2040,29 +2081,6 @@ again:
 	}
 
 out:
-	if (fs_gs_ldt_reload_needed) {
-		load_ldt(ldt_sel);
-		load_fs(fs_sel);
-		/*
-		 * If we have to reload gs, we must take care to
-		 * preserve our gs base.
-		 */
-		local_irq_disable();
-		load_gs(gs_sel);
-#ifdef CONFIG_X86_64
-		wrmsrl(MSR_GS_BASE, vmcs_readl(HOST_GS_BASE));
-#endif
-		local_irq_enable();
-
-		reload_tss();
-	}
-#ifdef CONFIG_X86_64
-	if (is_long_mode(vcpu)) {
-		save_msrs(vcpu->guest_msrs, NR_BAD_MSRS);
-		load_msrs(vcpu->host_msrs, NR_BAD_MSRS);
-	}
-#endif
-
 	if (r > 0) {
 		kvm_resched(vcpu);
 		goto preempted;
-- 
1.5.0.6



* [PATCH 13/58] KVM: Move some more msr mangling into vmx_save_host_state()
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (11 preceding siblings ...)
  2007-06-17  9:43 ` [PATCH 12/58] KVM: Fix potential guest state leak into host Avi Kivity
@ 2007-06-17  9:43 ` Avi Kivity
  2007-06-17  9:43 ` [PATCH 14/58] KVM: Rationalize exception bitmap usage Avi Kivity
                   ` (43 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:43 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/vmx.c |   14 +++++++-------
 1 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index 677b38c..93c3abf 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -290,6 +290,13 @@ static void vmx_save_host_state(struct kvm_vcpu *vcpu)
 	vmcs_writel(HOST_FS_BASE, segment_base(hs->fs_sel));
 	vmcs_writel(HOST_GS_BASE, segment_base(hs->gs_sel));
 #endif
+
+#ifdef CONFIG_X86_64
+	if (is_long_mode(vcpu)) {
+		save_msrs(vcpu->host_msrs + msr_offset_kernel_gs_base, 1);
+		load_msrs(vcpu->guest_msrs, NR_BAD_MSRS);
+	}
+#endif
 }
 
 static void vmx_load_host_state(struct kvm_vcpu *vcpu)
@@ -1903,13 +1910,6 @@ preempted:
 	if (vcpu->guest_debug.enabled)
 		kvm_guest_debug_pre(vcpu);
 
-#ifdef CONFIG_X86_64
-	if (is_long_mode(vcpu)) {
-		save_msrs(vcpu->host_msrs + msr_offset_kernel_gs_base, 1);
-		load_msrs(vcpu->guest_msrs, NR_BAD_MSRS);
-	}
-#endif
-
 again:
 	vmx_save_host_state(vcpu);
 	kvm_load_guest_fpu(vcpu);
-- 
1.5.0.6



* [PATCH 14/58] KVM: Rationalize exception bitmap usage
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (12 preceding siblings ...)
  2007-06-17  9:43 ` [PATCH 13/58] KVM: Move some more msr mangling into vmx_save_host_state() Avi Kivity
@ 2007-06-17  9:43 ` Avi Kivity
  2007-06-17  9:43 ` [PATCH 15/58] KVM: Consolidate guest fpu activation and deactivation Avi Kivity
                   ` (42 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:43 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

Everyone owns a piece of the exception bitmap, but they happily write to
the entire thing like there's no tomorrow.  Centralize handling in
update_exception_bitmap() and have everyone call that.
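The whole bitmap becomes a pure function of three vcpu flags, matching the update_exception_bitmap() body introduced below (vector numbers are the architectural ones; the standalone function signature here is for illustration only):

```c
#include <assert.h>
#include <stdint.h>

#define DB_VECTOR 1
#define NM_VECTOR 7
#define PF_VECTOR 14

static uint32_t compute_exception_bitmap(int fpu_active, int debug_enabled,
					 int rmode_active)
{
	uint32_t eb = 1u << PF_VECTOR;	/* always trap page faults */

	if (!fpu_active)
		eb |= 1u << NM_VECTOR;	/* trap #NM to load the fpu lazily */
	if (debug_enabled)
		eb |= 1u << DB_VECTOR;	/* trap debug exceptions */
	if (rmode_active)
		eb = ~0u;		/* real mode: trap everything */
	return eb;
}
```

Because each caller recomputes from scratch, no caller can leave a stale bit behind for another owner.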

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/vmx.c |   42 +++++++++++++++++++++---------------------
 1 files changed, 21 insertions(+), 21 deletions(-)

diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index 93c3abf..2190020 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -237,6 +237,20 @@ static void vmcs_set_bits(unsigned long field, u32 mask)
 	vmcs_writel(field, vmcs_readl(field) | mask);
 }
 
+static void update_exception_bitmap(struct kvm_vcpu *vcpu)
+{
+	u32 eb;
+
+	eb = 1u << PF_VECTOR;
+	if (!vcpu->fpu_active)
+		eb |= 1u << NM_VECTOR;
+	if (vcpu->guest_debug.enabled)
+		eb |= 1u << 1;
+	if (vcpu->rmode.active)
+		eb = ~0;
+	vmcs_write32(EXCEPTION_BITMAP, eb);
+}
+
 static void reload_tss(void)
 {
 #ifndef CONFIG_X86_64
@@ -618,10 +632,8 @@ static void vcpu_put_rsp_rip(struct kvm_vcpu *vcpu)
 static int set_guest_debug(struct kvm_vcpu *vcpu, struct kvm_debug_guest *dbg)
 {
 	unsigned long dr7 = 0x400;
-	u32 exception_bitmap;
 	int old_singlestep;
 
-	exception_bitmap = vmcs_read32(EXCEPTION_BITMAP);
 	old_singlestep = vcpu->guest_debug.singlestep;
 
 	vcpu->guest_debug.enabled = dbg->enabled;
@@ -637,13 +649,9 @@ static int set_guest_debug(struct kvm_vcpu *vcpu, struct kvm_debug_guest *dbg)
 			dr7 |= 0 << (i*4+16); /* execution breakpoint */
 		}
 
-		exception_bitmap |= (1u << 1);  /* Trap debug exceptions */
-
 		vcpu->guest_debug.singlestep = dbg->singlestep;
-	} else {
-		exception_bitmap &= ~(1u << 1); /* Ignore debug exceptions */
+	} else
 		vcpu->guest_debug.singlestep = 0;
-	}
 
 	if (old_singlestep && !vcpu->guest_debug.singlestep) {
 		unsigned long flags;
@@ -653,7 +661,7 @@ static int set_guest_debug(struct kvm_vcpu *vcpu, struct kvm_debug_guest *dbg)
 		vmcs_writel(GUEST_RFLAGS, flags);
 	}
 
-	vmcs_write32(EXCEPTION_BITMAP, exception_bitmap);
+	update_exception_bitmap(vcpu);
 	vmcs_writel(GUEST_DR7, dr7);
 
 	return 0;
@@ -767,14 +775,6 @@ static __exit void hardware_unsetup(void)
 	free_kvm_area();
 }
 
-static void update_exception_bitmap(struct kvm_vcpu *vcpu)
-{
-	if (vcpu->rmode.active)
-		vmcs_write32(EXCEPTION_BITMAP, ~0);
-	else
-		vmcs_write32(EXCEPTION_BITMAP, 1 << PF_VECTOR);
-}
-
 static void fix_pmode_dataseg(int seg, struct kvm_save_segment *save)
 {
 	struct kvm_vmx_segment_field *sf = &kvm_vmx_segment_fields[seg];
@@ -942,7 +942,7 @@ static void vmx_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
 
 	if (!(cr0 & CR0_TS_MASK)) {
 		vcpu->fpu_active = 1;
-		vmcs_clear_bits(EXCEPTION_BITMAP, CR0_TS_MASK);
+		update_exception_bitmap(vcpu);
 	}
 
 	vmcs_writel(CR0_READ_SHADOW, cr0);
@@ -958,7 +958,7 @@ static void vmx_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
 	if (!(vcpu->cr0 & CR0_TS_MASK)) {
 		vcpu->fpu_active = 0;
 		vmcs_set_bits(GUEST_CR0, CR0_TS_MASK);
-		vmcs_set_bits(EXCEPTION_BITMAP, 1 << NM_VECTOR);
+		update_exception_bitmap(vcpu);
 	}
 }
 
@@ -1243,7 +1243,6 @@ static int vmx_vcpu_setup(struct kvm_vcpu *vcpu)
 			       | CPU_BASED_USE_TSC_OFFSETING   /* 21.3 */
 			);
 
-	vmcs_write32(EXCEPTION_BITMAP, 1 << PF_VECTOR);
 	vmcs_write32(PAGE_FAULT_ERROR_CODE_MASK, 0);
 	vmcs_write32(PAGE_FAULT_ERROR_CODE_MATCH, 0);
 	vmcs_write32(CR3_TARGET_COUNT, 0);           /* 22.2.1 */
@@ -1329,6 +1328,7 @@ static int vmx_vcpu_setup(struct kvm_vcpu *vcpu)
 #ifdef CONFIG_X86_64
 	vmx_set_efer(vcpu, 0);
 #endif
+	update_exception_bitmap(vcpu);
 
 	return 0;
 
@@ -1489,7 +1489,7 @@ static int handle_exception(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 
 	if (is_no_device(intr_info)) {
 		vcpu->fpu_active = 1;
-		vmcs_clear_bits(EXCEPTION_BITMAP, 1 << NM_VECTOR);
+		update_exception_bitmap(vcpu);
 		if (!(vcpu->cr0 & CR0_TS_MASK))
 			vmcs_clear_bits(GUEST_CR0, CR0_TS_MASK);
 		return 1;
@@ -1684,7 +1684,7 @@ static int handle_cr(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 	case 2: /* clts */
 		vcpu_load_rsp_rip(vcpu);
 		vcpu->fpu_active = 1;
-		vmcs_clear_bits(EXCEPTION_BITMAP, 1 << NM_VECTOR);
+		update_exception_bitmap(vcpu);
 		vmcs_clear_bits(GUEST_CR0, CR0_TS_MASK);
 		vcpu->cr0 &= ~CR0_TS_MASK;
 		vmcs_writel(CR0_READ_SHADOW, vcpu->cr0);
-- 
1.5.0.6



* [PATCH 15/58] KVM: Consolidate guest fpu activation and deactivation
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (13 preceding siblings ...)
  2007-06-17  9:43 ` [PATCH 14/58] KVM: Rationalize exception bitmap usage Avi Kivity
@ 2007-06-17  9:43 ` Avi Kivity
  2007-06-17  9:43 ` [PATCH 16/58] KVM: Set cr0.mp for guests Avi Kivity
                   ` (41 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:43 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

Easier to keep track of where the fpu is this way.
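The shape of the consolidated pair, reduced to its essentials: both helpers are idempotent, so callers can invoke them freely without tracking fpu state themselves.  The `nm_trapped` flag stands in for the #NM bit in the exception bitmap; names are illustrative:

```c
#include <assert.h>

static int fpu_active;
static int nm_trapped = 1;	/* trap #NM while the guest fpu is inactive */

static void fpu_activate(void)
{
	if (fpu_active)
		return;		/* idempotent */
	fpu_active = 1;
	nm_trapped = 0;
}

static void fpu_deactivate(void)
{
	if (!fpu_active)
		return;		/* idempotent */
	fpu_active = 0;
	nm_trapped = 1;
}
```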

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/kvm.h |    2 +-
 drivers/kvm/vmx.c |   50 +++++++++++++++++++++++++++++++-------------------
 2 files changed, 32 insertions(+), 20 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index bb32383..4724087 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -42,7 +42,7 @@
 	(CR0_PG_MASK | CR0_PE_MASK | CR0_WP_MASK | CR0_NE_MASK \
 	 | CR0_NW_MASK | CR0_CD_MASK)
 #define KVM_VM_CR0_ALWAYS_ON \
-	(CR0_PG_MASK | CR0_PE_MASK | CR0_WP_MASK | CR0_NE_MASK)
+	(CR0_PG_MASK | CR0_PE_MASK | CR0_WP_MASK | CR0_NE_MASK | CR0_TS_MASK)
 #define KVM_GUEST_CR4_MASK \
 	(CR4_PSE_MASK | CR4_PAE_MASK | CR4_PGE_MASK | CR4_VMXE_MASK | CR4_VME_MASK)
 #define KVM_PMODE_VM_CR4_ALWAYS_ON (CR4_VMXE_MASK | CR4_PAE_MASK)
diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index 2190020..096cb6a 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -396,6 +396,26 @@ static void vmx_vcpu_put(struct kvm_vcpu *vcpu)
 	put_cpu();
 }
 
+static void vmx_fpu_activate(struct kvm_vcpu *vcpu)
+{
+	if (vcpu->fpu_active)
+		return;
+	vcpu->fpu_active = 1;
+	vmcs_clear_bits(GUEST_CR0, CR0_TS_MASK);
+	if (vcpu->cr0 & CR0_TS_MASK)
+		vmcs_set_bits(GUEST_CR0, CR0_TS_MASK);
+	update_exception_bitmap(vcpu);
+}
+
+static void vmx_fpu_deactivate(struct kvm_vcpu *vcpu)
+{
+	if (!vcpu->fpu_active)
+		return;
+	vcpu->fpu_active = 0;
+	vmcs_set_bits(GUEST_CR0, CR0_TS_MASK);
+	update_exception_bitmap(vcpu);
+}
+
 static void vmx_vcpu_decache(struct kvm_vcpu *vcpu)
 {
 	vcpu_clear(vcpu);
@@ -925,6 +945,8 @@ static void vmx_decache_cr4_guest_bits(struct kvm_vcpu *vcpu)
 
 static void vmx_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
 {
+	vmx_fpu_deactivate(vcpu);
+
 	if (vcpu->rmode.active && (cr0 & CR0_PE_MASK))
 		enter_pmode(vcpu);
 
@@ -940,26 +962,20 @@ static void vmx_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
 	}
 #endif
 
-	if (!(cr0 & CR0_TS_MASK)) {
-		vcpu->fpu_active = 1;
-		update_exception_bitmap(vcpu);
-	}
-
 	vmcs_writel(CR0_READ_SHADOW, cr0);
 	vmcs_writel(GUEST_CR0,
 		    (cr0 & ~KVM_GUEST_CR0_MASK) | KVM_VM_CR0_ALWAYS_ON);
 	vcpu->cr0 = cr0;
+
+	if (!(cr0 & CR0_TS_MASK) || !(cr0 & CR0_PE_MASK))
+		vmx_fpu_activate(vcpu);
 }
 
 static void vmx_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
 {
 	vmcs_writel(GUEST_CR3, cr3);
-
-	if (!(vcpu->cr0 & CR0_TS_MASK)) {
-		vcpu->fpu_active = 0;
-		vmcs_set_bits(GUEST_CR0, CR0_TS_MASK);
-		update_exception_bitmap(vcpu);
-	}
+	if (vcpu->cr0 & CR0_PE_MASK)
+		vmx_fpu_deactivate(vcpu);
 }
 
 static void vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
@@ -1328,6 +1344,7 @@ static int vmx_vcpu_setup(struct kvm_vcpu *vcpu)
 #ifdef CONFIG_X86_64
 	vmx_set_efer(vcpu, 0);
 #endif
+	vmx_fpu_activate(vcpu);
 	update_exception_bitmap(vcpu);
 
 	return 0;
@@ -1488,10 +1505,7 @@ static int handle_exception(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 	}
 
 	if (is_no_device(intr_info)) {
-		vcpu->fpu_active = 1;
-		update_exception_bitmap(vcpu);
-		if (!(vcpu->cr0 & CR0_TS_MASK))
-			vmcs_clear_bits(GUEST_CR0, CR0_TS_MASK);
+		vmx_fpu_activate(vcpu);
 		return 1;
 	}
 
@@ -1683,11 +1697,10 @@ static int handle_cr(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 		break;
 	case 2: /* clts */
 		vcpu_load_rsp_rip(vcpu);
-		vcpu->fpu_active = 1;
-		update_exception_bitmap(vcpu);
-		vmcs_clear_bits(GUEST_CR0, CR0_TS_MASK);
+		vmx_fpu_deactivate(vcpu);
 		vcpu->cr0 &= ~CR0_TS_MASK;
 		vmcs_writel(CR0_READ_SHADOW, vcpu->cr0);
+		vmx_fpu_activate(vcpu);
 		skip_emulated_instruction(vcpu);
 		return 1;
 	case 1: /*mov from cr*/
@@ -2158,7 +2171,6 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
 	vmcs_clear(vmcs);
 	vcpu->vmcs = vmcs;
 	vcpu->launched = 0;
-	vcpu->fpu_active = 1;
 
 	return 0;
 
-- 
1.5.0.6



* [PATCH 16/58] KVM: Set cr0.mp for guests
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (14 preceding siblings ...)
  2007-06-17  9:43 ` [PATCH 15/58] KVM: Consolidate guest fpu activation and deactivation Avi Kivity
@ 2007-06-17  9:43 ` Avi Kivity
  2007-06-17  9:43 ` [PATCH 17/58] KVM: Implement IA32_EBL_CR_POWERON msr Avi Kivity
                   ` (40 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:43 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

This allows fwait instructions to be trapped when the guest fpu is not
loaded.
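Architecturally, wait/fwait raises #NM only when cr0.mp and cr0.ts are both set, which is why mp joins ts in the always-on mask below.  A small check using the masks from this patch (the `fwait_traps` helper is illustrative, not part of the patch):

```c
#include <assert.h>
#include <stdint.h>

#define CR0_PE_MASK (1ULL << 0)
#define CR0_MP_MASK (1ULL << 1)
#define CR0_TS_MASK (1ULL << 3)
#define CR0_NE_MASK (1ULL << 5)
#define CR0_WP_MASK (1ULL << 16)
#define CR0_PG_MASK (1ULL << 31)

#define KVM_VM_CR0_ALWAYS_ON \
	(CR0_PG_MASK | CR0_PE_MASK | CR0_WP_MASK | CR0_NE_MASK | CR0_TS_MASK \
	 | CR0_MP_MASK)

/* Does an fwait trap with this cr0 in effect on the hardware? */
static int fwait_traps(uint64_t cr0)
{
	return (cr0 & CR0_MP_MASK) && (cr0 & CR0_TS_MASK) ? 1 : 0;
}
```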

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/kvm.h |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index 4724087..5e6dac5 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -18,6 +18,7 @@
 #include <linux/kvm_para.h>
 
 #define CR0_PE_MASK (1ULL << 0)
+#define CR0_MP_MASK (1ULL << 1)
 #define CR0_TS_MASK (1ULL << 3)
 #define CR0_NE_MASK (1ULL << 5)
 #define CR0_WP_MASK (1ULL << 16)
@@ -42,7 +43,8 @@
 	(CR0_PG_MASK | CR0_PE_MASK | CR0_WP_MASK | CR0_NE_MASK \
 	 | CR0_NW_MASK | CR0_CD_MASK)
 #define KVM_VM_CR0_ALWAYS_ON \
-	(CR0_PG_MASK | CR0_PE_MASK | CR0_WP_MASK | CR0_NE_MASK | CR0_TS_MASK)
+	(CR0_PG_MASK | CR0_PE_MASK | CR0_WP_MASK | CR0_NE_MASK | CR0_TS_MASK \
+	 | CR0_MP_MASK)
 #define KVM_GUEST_CR4_MASK \
 	(CR4_PSE_MASK | CR4_PAE_MASK | CR4_PGE_MASK | CR4_VMXE_MASK | CR4_VME_MASK)
 #define KVM_PMODE_VM_CR4_ALWAYS_ON (CR4_VMXE_MASK | CR4_PAE_MASK)
-- 
1.5.0.6



* [PATCH 17/58] KVM: Implement IA32_EBL_CR_POWERON msr
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (15 preceding siblings ...)
  2007-06-17  9:43 ` [PATCH 16/58] KVM: Set cr0.mp for guests Avi Kivity
@ 2007-06-17  9:43 ` Avi Kivity
  2007-06-17  9:43 ` [PATCH 18/58] KVM: MMU: Simplify kvm_mmu_free_page() a tiny bit Avi Kivity
                   ` (39 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:43 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Matthew Gregan, Avi Kivity

From: Matthew Gregan <kinetik@flim.org>

Attempting to boot the default 'bsd' kernel of OpenBSD 4.1 i386 in a guest
fails early in the kernel init inside p3_get_bus_clock while trying to read
the IA32_EBL_CR_POWERON MSR.  KVM logs an 'unhandled MSR' message and the
guest kernel faults.

This patch is sufficient to allow OpenBSD to boot, after which it seems to
run fine.  I'm not sure if this is the correct solution for dealing with
this particular MSR, but it works for me.

Signed-off-by: Matthew Gregan <kinetik@flim.org>
Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/kvm_main.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index b6ad9c6..095d673 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -1470,6 +1470,7 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata)
 	case MSR_IA32_MC0_MISC+16:
 	case MSR_IA32_UCODE_REV:
 	case MSR_IA32_PERF_STATUS:
+	case MSR_IA32_EBL_CR_POWERON:
 		/* MTRR registers */
 	case 0xfe:
 	case 0x200 ... 0x2ff:
-- 
1.5.0.6



* [PATCH 18/58] KVM: MMU: Simplify kvm_mmu_free_page() a tiny bit
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (16 preceding siblings ...)
  2007-06-17  9:43 ` [PATCH 17/58] KVM: Implement IA32_EBL_CR_POWERON msr Avi Kivity
@ 2007-06-17  9:43 ` Avi Kivity
  2007-06-17  9:44 ` [PATCH 19/58] KVM: MMU: Store shadow page tables as kernel virtual addresses, not physical Avi Kivity
                   ` (38 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:43 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/mmu.c |   10 ++++------
 1 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index 9ec3df9..a96c9ae 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -455,12 +455,10 @@ static int is_empty_shadow_page(hpa_t page_hpa)
 }
 #endif
 
-static void kvm_mmu_free_page(struct kvm_vcpu *vcpu, hpa_t page_hpa)
+static void kvm_mmu_free_page(struct kvm_vcpu *vcpu,
+			      struct kvm_mmu_page *page_head)
 {
-	struct kvm_mmu_page *page_head = page_header(page_hpa);
-
-	ASSERT(is_empty_shadow_page(page_hpa));
-	page_head->page_hpa = page_hpa;
+	ASSERT(is_empty_shadow_page(page_head->page_hpa));
 	list_move(&page_head->link, &vcpu->free_pages);
 	++vcpu->kvm->n_free_mmu_pages;
 }
@@ -690,7 +688,7 @@ static void kvm_mmu_zap_page(struct kvm_vcpu *vcpu,
 	kvm_mmu_page_unlink_children(vcpu, page);
 	if (!page->root_count) {
 		hlist_del(&page->hash_link);
-		kvm_mmu_free_page(vcpu, page->page_hpa);
+		kvm_mmu_free_page(vcpu, page);
 	} else
 		list_move(&page->link, &vcpu->kvm->active_mmu_pages);
 }
-- 
1.5.0.6



* [PATCH 19/58] KVM: MMU: Store shadow page tables as kernel virtual addresses, not physical
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (17 preceding siblings ...)
  2007-06-17  9:43 ` [PATCH 18/58] KVM: MMU: Simplify kvm_mmu_free_page() a tiny bit Avi Kivity
@ 2007-06-17  9:44 ` Avi Kivity
  2007-06-17  9:44 ` [PATCH 20/58] KVM: VMX: Only reload guest msrs if they are already loaded Avi Kivity
                   ` (37 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:44 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

Simplifies things a bit.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/kvm.h         |    2 +-
 drivers/kvm/mmu.c         |   32 +++++++++++++++-----------------
 drivers/kvm/paging_tmpl.h |    2 +-
 3 files changed, 17 insertions(+), 19 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index 5e6dac5..fc4a6c1 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -139,7 +139,7 @@ struct kvm_mmu_page {
 	gfn_t gfn;
 	union kvm_mmu_page_role role;
 
-	hpa_t page_hpa;
+	u64 *spt;
 	unsigned long slot_bitmap; /* One bit set per slot which has memory
 				    * in this shadow page.
 				    */
diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index a96c9ae..c85c664 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -439,13 +439,12 @@ static void rmap_write_protect(struct kvm_vcpu *vcpu, u64 gfn)
 }
 
 #ifdef MMU_DEBUG
-static int is_empty_shadow_page(hpa_t page_hpa)
+static int is_empty_shadow_page(u64 *spt)
 {
 	u64 *pos;
 	u64 *end;
 
-	for (pos = __va(page_hpa), end = pos + PAGE_SIZE / sizeof(u64);
-		      pos != end; pos++)
+	for (pos = spt, end = pos + PAGE_SIZE / sizeof(u64); pos != end; pos++)
 		if (*pos != 0) {
 			printk(KERN_ERR "%s: %p %llx\n", __FUNCTION__,
 			       pos, *pos);
@@ -458,7 +457,7 @@ static int is_empty_shadow_page(hpa_t page_hpa)
 static void kvm_mmu_free_page(struct kvm_vcpu *vcpu,
 			      struct kvm_mmu_page *page_head)
 {
-	ASSERT(is_empty_shadow_page(page_head->page_hpa));
+	ASSERT(is_empty_shadow_page(page_head->spt));
 	list_move(&page_head->link, &vcpu->free_pages);
 	++vcpu->kvm->n_free_mmu_pages;
 }
@@ -478,7 +477,7 @@ static struct kvm_mmu_page *kvm_mmu_alloc_page(struct kvm_vcpu *vcpu,
 
 	page = list_entry(vcpu->free_pages.next, struct kvm_mmu_page, link);
 	list_move(&page->link, &vcpu->kvm->active_mmu_pages);
-	ASSERT(is_empty_shadow_page(page->page_hpa));
+	ASSERT(is_empty_shadow_page(page->spt));
 	page->slot_bitmap = 0;
 	page->multimapped = 0;
 	page->parent_pte = parent_pte;
@@ -636,7 +635,7 @@ static void kvm_mmu_page_unlink_children(struct kvm_vcpu *vcpu,
 	u64 *pt;
 	u64 ent;
 
-	pt = __va(page->page_hpa);
+	pt = page->spt;
 
 	if (page->role.level == PT_PAGE_TABLE_LEVEL) {
 		for (i = 0; i < PT64_ENT_PER_PAGE; ++i) {
@@ -803,7 +802,7 @@ static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, hpa_t p)
 				return -ENOMEM;
 			}
 
-			table[index] = new_table->page_hpa | PT_PRESENT_MASK
+			table[index] = __pa(new_table->spt) | PT_PRESENT_MASK
 				| PT_WRITABLE_MASK | PT_USER_MASK;
 		}
 		table_addr = table[index] & PT64_BASE_ADDR_MASK;
@@ -855,7 +854,7 @@ static void mmu_alloc_roots(struct kvm_vcpu *vcpu)
 		ASSERT(!VALID_PAGE(root));
 		page = kvm_mmu_get_page(vcpu, root_gfn, 0,
 					PT64_ROOT_LEVEL, 0, 0, NULL);
-		root = page->page_hpa;
+		root = __pa(page->spt);
 		++page->root_count;
 		vcpu->mmu.root_hpa = root;
 		return;
@@ -876,7 +875,7 @@ static void mmu_alloc_roots(struct kvm_vcpu *vcpu)
 		page = kvm_mmu_get_page(vcpu, root_gfn, i << 30,
 					PT32_ROOT_LEVEL, !is_paging(vcpu),
 					0, NULL);
-		root = page->page_hpa;
+		root = __pa(page->spt);
 		++page->root_count;
 		vcpu->mmu.pae_root[i] = root | PT_PRESENT_MASK;
 	}
@@ -1220,8 +1219,7 @@ void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
 			if (quadrant != page->role.quadrant)
 				continue;
 		}
-		spte = __va(page->page_hpa);
-		spte += page_offset / sizeof(*spte);
+		spte = &page->spt[page_offset / sizeof(*spte)];
 		while (npte--) {
 			mmu_pte_write_zap_pte(vcpu, page, spte);
 			mmu_pte_write_new_pte(vcpu, page, spte, new, bytes);
@@ -1262,8 +1260,8 @@ static void free_mmu_pages(struct kvm_vcpu *vcpu)
 		page = list_entry(vcpu->free_pages.next,
 				  struct kvm_mmu_page, link);
 		list_del(&page->link);
-		__free_page(pfn_to_page(page->page_hpa >> PAGE_SHIFT));
-		page->page_hpa = INVALID_PAGE;
+		free_page((unsigned long)page->spt);
+		page->spt = NULL;
 	}
 	free_page((unsigned long)vcpu->mmu.pae_root);
 }
@@ -1282,8 +1280,8 @@ static int alloc_mmu_pages(struct kvm_vcpu *vcpu)
 		if ((page = alloc_page(GFP_KERNEL)) == NULL)
 			goto error_1;
 		set_page_private(page, (unsigned long)page_header);
-		page_header->page_hpa = (hpa_t)page_to_pfn(page) << PAGE_SHIFT;
-		memset(__va(page_header->page_hpa), 0, PAGE_SIZE);
+		page_header->spt = page_address(page);
+		memset(page_header->spt, 0, PAGE_SIZE);
 		list_add(&page_header->link, &vcpu->free_pages);
 		++vcpu->kvm->n_free_mmu_pages;
 	}
@@ -1346,7 +1344,7 @@ void kvm_mmu_slot_remove_write_access(struct kvm_vcpu *vcpu, int slot)
 		if (!test_bit(slot, &page->slot_bitmap))
 			continue;
 
-		pt = __va(page->page_hpa);
+		pt = page->spt;
 		for (i = 0; i < PT64_ENT_PER_PAGE; ++i)
 			/* avoid RMW */
 			if (pt[i] & PT_WRITABLE_MASK) {
@@ -1497,7 +1495,7 @@ static int count_writable_mappings(struct kvm_vcpu *vcpu)
 	int i;
 
 	list_for_each_entry(page, &vcpu->kvm->active_mmu_pages, link) {
-		u64 *pt = __va(page->page_hpa);
+		u64 *pt = page->spt;
 
 		if (page->role.level != PT_PAGE_TABLE_LEVEL)
 			continue;
diff --git a/drivers/kvm/paging_tmpl.h b/drivers/kvm/paging_tmpl.h
index 10ba0a8..6dd0da9 100644
--- a/drivers/kvm/paging_tmpl.h
+++ b/drivers/kvm/paging_tmpl.h
@@ -304,7 +304,7 @@ static u64 *FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
 		shadow_page = kvm_mmu_get_page(vcpu, table_gfn, addr, level-1,
 					       metaphysical, hugepage_access,
 					       shadow_ent);
-		shadow_addr = shadow_page->page_hpa;
+		shadow_addr = __pa(shadow_page->spt);
 		shadow_pte = shadow_addr | PT_PRESENT_MASK | PT_ACCESSED_MASK
 			| PT_WRITABLE_MASK | PT_USER_MASK;
 		*shadow_ent = shadow_pte;
-- 
1.5.0.6


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH 20/58] KVM: VMX: Only reload guest msrs if they are already loaded
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (18 preceding siblings ...)
  2007-06-17  9:44 ` [PATCH 19/58] KVM: MMU: Store shadow page tables as kernel virtual addresses, not physical Avi Kivity
@ 2007-06-17  9:44 ` Avi Kivity
  2007-06-17  9:44 ` [PATCH 21/58] KVM: Avoid corrupting tr in real mode Avi Kivity
                   ` (36 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:44 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

If we set an msr via an ioctl() instead of by handling a guest exit, we
have the host state loaded, so reloading the msrs would clobber host
state instead of guest state.

This fixes a host oops (and loss of a cpu) on a guest reboot.
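
As a userspace sketch of the guard being added (the struct and names below are invented for illustration, not KVM's actual types): always update the software copy of the msr, but touch the CPU only while guest msr state is actually live:

```c
#include <assert.h>
#include <string.h>

struct msr_entry { unsigned int index; unsigned long long data; };

struct vcpu_sketch {
    int host_state_saved;            /* 1 => between entries, guest msrs live */
    struct msr_entry guest_msrs[2];  /* shadow copy, updated via ioctl */
    unsigned long long hw_msrs[2];   /* stands in for the real CPU MSRs */
};

/* Mirror the shadow copy into the "CPU", as load_msrs() would. */
static void load_msrs(struct vcpu_sketch *v, int n)
{
    int i;

    for (i = 0; i < n; ++i)
        v->hw_msrs[i] = v->guest_msrs[i].data;
}

/* The fix: always update the shadow, but reload the CPU only when the
 * guest's msrs are the ones currently loaded. */
static void set_msr(struct vcpu_sketch *v, int slot, unsigned long long data)
{
    v->guest_msrs[slot].data = data;
    if (v->host_state_saved)
        load_msrs(v, 2);
}
```

On the pure ioctl path the flag is clear, so the host's live msr values are never clobbered.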

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/vmx.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index 096cb6a..b353eaa 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -600,7 +600,8 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 data)
 		msr = find_msr_entry(vcpu, msr_index);
 		if (msr)
 			msr->data = data;
-		load_msrs(vcpu->guest_msrs, NR_BAD_MSRS);
+		if (vcpu->vmx_host_state.loaded)
+			load_msrs(vcpu->guest_msrs, NR_BAD_MSRS);
 		break;
 #endif
 	case MSR_IA32_SYSENTER_CS:
-- 
1.5.0.6



* [PATCH 21/58] KVM: Avoid corrupting tr in real mode
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (19 preceding siblings ...)
  2007-06-17  9:44 ` [PATCH 20/58] KVM: VMX: Only reload guest msrs if they are already loaded Avi Kivity
@ 2007-06-17  9:44 ` Avi Kivity
  2007-06-17  9:44 ` [PATCH 22/58] KVM: Fix vmx I/O bitmap initialization on highmem systems Avi Kivity
                   ` (35 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:44 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

The real mode tr needs to be set to a specific tss so that I/O
instructions can function.  Divert the new tr values to the real
mode save area from where they will be restored on transition to
protected mode.

This fixes some crashes on reboot when the BIOS executes an I/O
instruction.
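
A minimal userspace model of the diversion (all names invented for the sketch): while the vcpu is in real mode, writes to tr land in a save area instead of the live VMCS fields, so the special real-mode tss stays installed:

```c
#include <assert.h>
#include <string.h>

struct seg { unsigned long base; unsigned int limit, selector, ar; };

struct vcpu_sketch {
    int rmode_active;
    struct seg vmcs_tr;   /* stands in for the VMCS TR fields */
    struct seg rmode_tr;  /* save area, applied on protected-mode entry */
};

static void set_tr(struct vcpu_sketch *v, const struct seg *var)
{
    if (v->rmode_active) {
        v->rmode_tr = *var;   /* divert: real-mode tss stays intact */
        return;
    }
    v->vmcs_tr = *var;
}

/* Leaving real mode: the diverted value becomes the live one. */
static void enter_pmode_tr(struct vcpu_sketch *v)
{
    v->rmode_active = 0;
    v->vmcs_tr = v->rmode_tr;
}
```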

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/vmx.c |   45 +++++++++++++++++++++++++++++++--------------
 1 files changed, 31 insertions(+), 14 deletions(-)

diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index b353eaa..e39ebe0 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -1042,23 +1042,11 @@ static void vmx_get_segment(struct kvm_vcpu *vcpu,
 	var->unusable = (ar >> 16) & 1;
 }
 
-static void vmx_set_segment(struct kvm_vcpu *vcpu,
-			    struct kvm_segment *var, int seg)
+static u32 vmx_segment_access_rights(struct kvm_segment *var)
 {
-	struct kvm_vmx_segment_field *sf = &kvm_vmx_segment_fields[seg];
 	u32 ar;
 
-	vmcs_writel(sf->base, var->base);
-	vmcs_write32(sf->limit, var->limit);
-	vmcs_write16(sf->selector, var->selector);
-	if (vcpu->rmode.active && var->s) {
-		/*
-		 * Hack real-mode segments into vm86 compatibility.
-		 */
-		if (var->base == 0xffff0000 && var->selector == 0xf000)
-			vmcs_writel(sf->base, 0xf0000);
-		ar = 0xf3;
-	} else if (var->unusable)
+	if (var->unusable)
 		ar = 1 << 16;
 	else {
 		ar = var->type & 15;
@@ -1072,6 +1060,35 @@ static void vmx_set_segment(struct kvm_vcpu *vcpu,
 	}
 	if (ar == 0) /* a 0 value means unusable */
 		ar = AR_UNUSABLE_MASK;
+
+	return ar;
+}
+
+static void vmx_set_segment(struct kvm_vcpu *vcpu,
+			    struct kvm_segment *var, int seg)
+{
+	struct kvm_vmx_segment_field *sf = &kvm_vmx_segment_fields[seg];
+	u32 ar;
+
+	if (vcpu->rmode.active && seg == VCPU_SREG_TR) {
+		vcpu->rmode.tr.selector = var->selector;
+		vcpu->rmode.tr.base = var->base;
+		vcpu->rmode.tr.limit = var->limit;
+		vcpu->rmode.tr.ar = vmx_segment_access_rights(var);
+		return;
+	}
+	vmcs_writel(sf->base, var->base);
+	vmcs_write32(sf->limit, var->limit);
+	vmcs_write16(sf->selector, var->selector);
+	if (vcpu->rmode.active && var->s) {
+		/*
+		 * Hack real-mode segments into vm86 compatibility.
+		 */
+		if (var->base == 0xffff0000 && var->selector == 0xf000)
+			vmcs_writel(sf->base, 0xf0000);
+		ar = 0xf3;
+	} else
+		ar = vmx_segment_access_rights(var);
 	vmcs_write32(sf->ar_bytes, ar);
 }
 
-- 
1.5.0.6



* [PATCH 22/58] KVM: Fix vmx I/O bitmap initialization on highmem systems
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (20 preceding siblings ...)
  2007-06-17  9:44 ` [PATCH 21/58] KVM: Avoid corrupting tr in real mode Avi Kivity
@ 2007-06-17  9:44 ` Avi Kivity
  2007-06-17  9:44 ` [PATCH 23/58] KVM: VMX: Use local labels in inline assembly Avi Kivity
                   ` (34 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:44 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

kunmap() expects a struct page, not a virtual address.  Fixes an oops loading
kvm-intel.ko on i386 with CONFIG_HIGHMEM.

Thanks to Michael Ivanov <deruhu@peterstar.ru> for reporting.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/vmx.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index e39ebe0..34171d9 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -2274,11 +2274,11 @@ static int __init vmx_init(void)
 	iova = kmap(vmx_io_bitmap_a);
 	memset(iova, 0xff, PAGE_SIZE);
 	clear_bit(0x80, iova);
-	kunmap(iova);
+	kunmap(vmx_io_bitmap_a);
 
 	iova = kmap(vmx_io_bitmap_b);
 	memset(iova, 0xff, PAGE_SIZE);
-	kunmap(iova);
+	kunmap(vmx_io_bitmap_b);
 
 	r = kvm_init_arch(&vmx_arch_ops, THIS_MODULE);
 	if (r)
-- 
1.5.0.6



* [PATCH 23/58] KVM: VMX: Use local labels in inline assembly
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (21 preceding siblings ...)
  2007-06-17  9:44 ` [PATCH 22/58] KVM: Fix vmx I/O bitmap initialization on highmem systems Avi Kivity
@ 2007-06-17  9:44 ` Avi Kivity
  2007-06-17  9:44 ` [PATCH 24/58] KVM: VMX: Handle #SS faults from real mode Avi Kivity
                   ` (33 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:44 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

This makes oprofile dumps and disassembly easier to read.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/vmx.c |   15 +++++++--------
 1 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index 34171d9..c4c5535 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -1188,7 +1188,7 @@ static int vmx_vcpu_setup(struct kvm_vcpu *vcpu)
 	struct descriptor_table dt;
 	int i;
 	int ret = 0;
-	extern asmlinkage void kvm_vmx_return(void);
+	unsigned long kvm_vmx_return;
 
 	if (!init_rmode_tss(vcpu->kvm)) {
 		ret = -ENOMEM;
@@ -1306,8 +1306,8 @@ static int vmx_vcpu_setup(struct kvm_vcpu *vcpu)
 	get_idt(&dt);
 	vmcs_writel(HOST_IDTR_BASE, dt.base);   /* 22.2.4 */
 
-
-	vmcs_writel(HOST_RIP, (unsigned long)kvm_vmx_return); /* 22.2.5 */
+	asm ("mov $.Lkvm_vmx_return, %0" : "=r"(kvm_vmx_return));
+	vmcs_writel(HOST_RIP, kvm_vmx_return); /* 22.2.5 */
 
 	rdmsr(MSR_IA32_SYSENTER_CS, host_sysenter_cs, junk);
 	vmcs_write32(HOST_IA32_SYSENTER_CS, host_sysenter_cs);
@@ -1997,12 +1997,11 @@ again:
 		"mov %c[rcx](%3), %%ecx \n\t" /* kills %3 (ecx) */
 #endif
 		/* Enter guest mode */
-		"jne launched \n\t"
+		"jne .Llaunched \n\t"
 		ASM_VMX_VMLAUNCH "\n\t"
-		"jmp kvm_vmx_return \n\t"
-		"launched: " ASM_VMX_VMRESUME "\n\t"
-		".globl kvm_vmx_return \n\t"
-		"kvm_vmx_return: "
+		"jmp .Lkvm_vmx_return \n\t"
+		".Llaunched: " ASM_VMX_VMRESUME "\n\t"
+		".Lkvm_vmx_return: "
 		/* Save guest registers, load host registers, keep flags */
 #ifdef CONFIG_X86_64
 		"xchg %3,     (%%rsp) \n\t"
-- 
1.5.0.6



* [PATCH 24/58] KVM: VMX: Handle #SS faults from real mode
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (22 preceding siblings ...)
  2007-06-17  9:44 ` [PATCH 23/58] KVM: VMX: Use local labels in inline assembly Avi Kivity
@ 2007-06-17  9:44 ` Avi Kivity
  2007-06-17  9:44 ` [PATCH 25/58] KVM: VMX: Avoid saving and restoring msrs on lightweight vmexit Avi Kivity
                   ` (32 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:44 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Nitin A Kamble, Avi Kivity

From: Nitin A Kamble <nitin.a.kamble@intel.com>

Instructions with the address-size override prefix (opcode 0x67) cause
a #SS fault with a zero error code in VM86 mode.  Forward them to the
emulator.
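
The widened check can be sketched as a pure predicate (vector numbers are the architectural x86 ones, #GP = 13 and #SS = 12; the function name is invented):

```c
#include <assert.h>

#define GP_VECTOR 13
#define SS_VECTOR 12

/* #GP(0) was already forwarded to the emulator in real mode; this patch
 * adds #SS(0), which 0x67-prefixed instructions raise in vm86 mode. */
static int forward_to_emulator(int rmode_active, int vec, unsigned err_code)
{
    if (!rmode_active)
        return 0;
    return (vec == GP_VECTOR || vec == SS_VECTOR) && err_code == 0;
}
```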

Signed-Off-By: Nitin A Kamble <nitin.a.kamble@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/vmx.c |    6 +++++-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index c4c5535..a05bfa0 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -1488,7 +1488,11 @@ static int handle_rmode_exception(struct kvm_vcpu *vcpu,
 	if (!vcpu->rmode.active)
 		return 0;
 
-	if (vec == GP_VECTOR && err_code == 0)
+	/*
+	 * Instruction with address size override prefix opcode 0x67
+	 * Cause the #SS fault with 0 error code in VM86 mode.
+	 */
+	if (((vec == GP_VECTOR) || (vec == SS_VECTOR)) && err_code == 0)
 		if (emulate_instruction(vcpu, NULL, 0, 0) == EMULATE_DONE)
 			return 1;
 	return 0;
-- 
1.5.0.6



* [PATCH 25/58] KVM: VMX: Avoid saving and restoring msrs on lightweight vmexit
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (23 preceding siblings ...)
  2007-06-17  9:44 ` [PATCH 24/58] KVM: VMX: Handle #SS faults from real mode Avi Kivity
@ 2007-06-17  9:44 ` Avi Kivity
  2007-06-17  9:44 ` [PATCH 26/58] KVM: VMX: Cleanup redundant code in MSR set Avi Kivity
                   ` (31 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:44 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Eddie Dong, Avi Kivity

From: Eddie Dong <eddie.dong@intel.com>

In a lightweight exit (where we exit and reenter the guest without
scheduling or exiting to userspace in between), we don't need various
msrs on the host, and avoiding shuffling them around reduces raw exit
time by 8%.

i386 compile fix by Daniel Hecken <dh@bahntechnik.de>.
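
The core trick is a partition: msrs that must be swapped on every lightweight exit are moved to the front of the array, so the exit path touches only the first save_nmsrs entries. A userspace model (array sizes and msr indices are made up for the sketch):

```c
#include <assert.h>

#define NMSRS 4

struct msr_entry { unsigned int index; unsigned long long data; };

/* Swap an entry toward the front, as move_msr_up() does in the patch. */
static void move_msr_up(struct msr_entry *msrs, int from, int to)
{
    struct msr_entry tmp = msrs[to];

    msrs[to] = msrs[from];
    msrs[from] = tmp;
}

static int find_msr_index(const struct msr_entry *msrs, unsigned int msr)
{
    int i;

    for (i = 0; i < NMSRS; ++i)
        if (msrs[i].index == msr)
            return i;
    return -1;
}

/* Move every wanted msr to the front; the return value plays the role
 * of save_nmsrs. */
static int setup_msrs(struct msr_entry *msrs,
                      const unsigned int *wanted, int nwanted)
{
    int i, index, save_nmsrs = 0;

    for (i = 0; i < nwanted; ++i) {
        index = find_msr_index(msrs, wanted[i]);
        if (index >= 0)
            move_msr_up(msrs, index, save_nmsrs++);
    }
    return save_nmsrs;
}
```

save_msrs()/load_msrs() on the exit path can then be called with this count instead of the old NR_BAD_MSRS constant.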

Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/kvm.h |    4 ++
 drivers/kvm/vmx.c |  128 ++++++++++++++++++++++++++++++-----------------------
 2 files changed, 76 insertions(+), 56 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index fc4a6c1..c252efe 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -288,6 +288,10 @@ struct kvm_vcpu {
 	u64 apic_base;
 	u64 ia32_misc_enable_msr;
 	int nmsrs;
+	int save_nmsrs;
+#ifdef CONFIG_X86_64
+	int msr_offset_kernel_gs_base;
+#endif
 	struct vmx_msr_entry *guest_msrs;
 	struct vmx_msr_entry *host_msrs;
 
diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index a05bfa0..872ca03 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -85,19 +85,6 @@ static const u32 vmx_msr_index[] = {
 };
 #define NR_VMX_MSR ARRAY_SIZE(vmx_msr_index)
 
-#ifdef CONFIG_X86_64
-static unsigned msr_offset_kernel_gs_base;
-#define NR_64BIT_MSRS 4
-/*
- * avoid save/load MSR_SYSCALL_MASK and MSR_LSTAR by std vt
- * mechanism (cpu bug AA24)
- */
-#define NR_BAD_MSRS 2
-#else
-#define NR_64BIT_MSRS 0
-#define NR_BAD_MSRS 0
-#endif
-
 static inline int is_page_fault(u32 intr_info)
 {
 	return (intr_info & (INTR_INFO_INTR_TYPE_MASK | INTR_INFO_VECTOR_MASK |
@@ -118,13 +105,23 @@ static inline int is_external_interrupt(u32 intr_info)
 		== (INTR_TYPE_EXT_INTR | INTR_INFO_VALID_MASK);
 }
 
-static struct vmx_msr_entry *find_msr_entry(struct kvm_vcpu *vcpu, u32 msr)
+static int __find_msr_index(struct kvm_vcpu *vcpu, u32 msr)
 {
 	int i;
 
 	for (i = 0; i < vcpu->nmsrs; ++i)
 		if (vcpu->guest_msrs[i].index == msr)
-			return &vcpu->guest_msrs[i];
+			return i;
+	return -1;
+}
+
+static struct vmx_msr_entry *find_msr_entry(struct kvm_vcpu *vcpu, u32 msr)
+{
+	int i;
+
+	i = __find_msr_index(vcpu, msr);
+	if (i >= 0)
+		return &vcpu->guest_msrs[i];
 	return NULL;
 }
 
@@ -307,10 +304,10 @@ static void vmx_save_host_state(struct kvm_vcpu *vcpu)
 
 #ifdef CONFIG_X86_64
 	if (is_long_mode(vcpu)) {
-		save_msrs(vcpu->host_msrs + msr_offset_kernel_gs_base, 1);
-		load_msrs(vcpu->guest_msrs, NR_BAD_MSRS);
+		save_msrs(vcpu->host_msrs + vcpu->msr_offset_kernel_gs_base, 1);
 	}
 #endif
+	load_msrs(vcpu->guest_msrs, vcpu->save_nmsrs);
 }
 
 static void vmx_load_host_state(struct kvm_vcpu *vcpu)
@@ -337,12 +334,8 @@ static void vmx_load_host_state(struct kvm_vcpu *vcpu)
 
 		reload_tss();
 	}
-#ifdef CONFIG_X86_64
-	if (is_long_mode(vcpu)) {
-		save_msrs(vcpu->guest_msrs, NR_BAD_MSRS);
-		load_msrs(vcpu->host_msrs, NR_BAD_MSRS);
-	}
-#endif
+	save_msrs(vcpu->guest_msrs, vcpu->save_nmsrs);
+	load_msrs(vcpu->host_msrs, vcpu->save_nmsrs);
 }
 
 /*
@@ -464,41 +457,74 @@ static void vmx_inject_gp(struct kvm_vcpu *vcpu, unsigned error_code)
 }
 
 /*
+ * Swap MSR entry in host/guest MSR entry array.
+ */
+void move_msr_up(struct kvm_vcpu *vcpu, int from, int to)
+{
+	struct vmx_msr_entry tmp;
+	tmp = vcpu->guest_msrs[to];
+	vcpu->guest_msrs[to] = vcpu->guest_msrs[from];
+	vcpu->guest_msrs[from] = tmp;
+	tmp = vcpu->host_msrs[to];
+	vcpu->host_msrs[to] = vcpu->host_msrs[from];
+	vcpu->host_msrs[from] = tmp;
+}
+
+/*
  * Set up the vmcs to automatically save and restore system
  * msrs.  Don't touch the 64-bit msrs if the guest is in legacy
  * mode, as fiddling with msrs is very expensive.
  */
 static void setup_msrs(struct kvm_vcpu *vcpu)
 {
-	int nr_skip, nr_good_msrs;
+	int index, save_nmsrs;
 
-	if (is_long_mode(vcpu))
-		nr_skip = NR_BAD_MSRS;
-	else
-		nr_skip = NR_64BIT_MSRS;
-	nr_good_msrs = vcpu->nmsrs - nr_skip;
+	save_nmsrs = 0;
+#ifdef CONFIG_X86_64
+	if (is_long_mode(vcpu)) {
+		index = __find_msr_index(vcpu, MSR_SYSCALL_MASK);
+		if (index >= 0)
+			move_msr_up(vcpu, index, save_nmsrs++);
+		index = __find_msr_index(vcpu, MSR_LSTAR);
+		if (index >= 0)
+			move_msr_up(vcpu, index, save_nmsrs++);
+		index = __find_msr_index(vcpu, MSR_CSTAR);
+		if (index >= 0)
+			move_msr_up(vcpu, index, save_nmsrs++);
+		index = __find_msr_index(vcpu, MSR_KERNEL_GS_BASE);
+		if (index >= 0)
+			move_msr_up(vcpu, index, save_nmsrs++);
+		/*
+		 * MSR_K6_STAR is only needed on long mode guests, and only
+		 * if efer.sce is enabled.
+		 */
+		index = __find_msr_index(vcpu, MSR_K6_STAR);
+		if ((index >= 0) && (vcpu->shadow_efer & EFER_SCE))
+			move_msr_up(vcpu, index, save_nmsrs++);
+	}
+#endif
+	vcpu->save_nmsrs = save_nmsrs;
 
-	/*
-	 * MSR_K6_STAR is only needed on long mode guests, and only
-	 * if efer.sce is enabled.
-	 */
-	if (find_msr_entry(vcpu, MSR_K6_STAR)) {
-		--nr_good_msrs;
 #ifdef CONFIG_X86_64
-		if (is_long_mode(vcpu) && (vcpu->shadow_efer & EFER_SCE))
-			++nr_good_msrs;
+	vcpu->msr_offset_kernel_gs_base =
+		__find_msr_index(vcpu, MSR_KERNEL_GS_BASE);
 #endif
+	index = __find_msr_index(vcpu, MSR_EFER);
+	if (index >= 0)
+		save_nmsrs = 1;
+	else {
+		save_nmsrs = 0;
+		index = 0;
 	}
-
 	vmcs_writel(VM_ENTRY_MSR_LOAD_ADDR,
-		    virt_to_phys(vcpu->guest_msrs + nr_skip));
+		    virt_to_phys(vcpu->guest_msrs + index));
 	vmcs_writel(VM_EXIT_MSR_STORE_ADDR,
-		    virt_to_phys(vcpu->guest_msrs + nr_skip));
+		    virt_to_phys(vcpu->guest_msrs + index));
 	vmcs_writel(VM_EXIT_MSR_LOAD_ADDR,
-		    virt_to_phys(vcpu->host_msrs + nr_skip));
-	vmcs_write32(VM_EXIT_MSR_STORE_COUNT, nr_good_msrs); /* 22.2.2 */
-	vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, nr_good_msrs);  /* 22.2.2 */
-	vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, nr_good_msrs); /* 22.2.2 */
+		    virt_to_phys(vcpu->host_msrs + index));
+	vmcs_write32(VM_EXIT_MSR_STORE_COUNT, save_nmsrs);
+	vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, save_nmsrs);
+	vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, save_nmsrs);
 }
 
 /*
@@ -595,14 +621,6 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 data)
 	case MSR_GS_BASE:
 		vmcs_writel(GUEST_GS_BASE, data);
 		break;
-	case MSR_LSTAR:
-	case MSR_SYSCALL_MASK:
-		msr = find_msr_entry(vcpu, msr_index);
-		if (msr)
-			msr->data = data;
-		if (vcpu->vmx_host_state.loaded)
-			load_msrs(vcpu->guest_msrs, NR_BAD_MSRS);
-		break;
 #endif
 	case MSR_IA32_SYSENTER_CS:
 		vmcs_write32(GUEST_SYSENTER_CS, data);
@@ -620,6 +638,8 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 data)
 		msr = find_msr_entry(vcpu, msr_index);
 		if (msr) {
 			msr->data = data;
+			if (vcpu->vmx_host_state.loaded)
+				load_msrs(vcpu->guest_msrs,vcpu->save_nmsrs);
 			break;
 		}
 		return kvm_set_msr_common(vcpu, msr_index, data);
@@ -1331,10 +1351,6 @@ static int vmx_vcpu_setup(struct kvm_vcpu *vcpu)
 		vcpu->host_msrs[j].reserved = 0;
 		vcpu->host_msrs[j].data = data;
 		vcpu->guest_msrs[j] = vcpu->host_msrs[j];
-#ifdef CONFIG_X86_64
-		if (index == MSR_KERNEL_GS_BASE)
-			msr_offset_kernel_gs_base = j;
-#endif
 		++vcpu->nmsrs;
 	}
 
-- 
1.5.0.6



* [PATCH 26/58] KVM: VMX: Cleanup redundant code in MSR set
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (24 preceding siblings ...)
  2007-06-17  9:44 ` [PATCH 25/58] KVM: VMX: Avoid saving and restoring msrs on lightweight vmexit Avi Kivity
@ 2007-06-17  9:44 ` Avi Kivity
  2007-06-17  9:44 ` [PATCH 27/58] KVM: VMX: Avoid saving and restoring msr_efer on lightweight vmexit Avi Kivity
                   ` (30 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:44 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Eddie Dong, Avi Kivity

From: Eddie Dong <eddie.dong@intel.com>

Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/vmx.c |    2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index 872ca03..dc99191 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -643,8 +643,6 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 data)
 			break;
 		}
 		return kvm_set_msr_common(vcpu, msr_index, data);
-		msr->data = data;
-		break;
 	}
 
 	return 0;
-- 
1.5.0.6



* [PATCH 27/58] KVM: VMX: Avoid saving and restoring msr_efer on lightweight vmexit
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (25 preceding siblings ...)
  2007-06-17  9:44 ` [PATCH 26/58] KVM: VMX: Cleanup redundant code in MSR set Avi Kivity
@ 2007-06-17  9:44 ` Avi Kivity
  2007-06-17  9:44 ` [PATCH 28/58] Use menuconfig objects II - KVM/Virt Avi Kivity
                   ` (29 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:44 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Eddie Dong, Avi Kivity

From: Eddie Dong <eddie.dong@intel.com>

The MSR_EFER.LME/LMA bits are saved and restored automatically by the
VMX hardware; KVM only needs to handle the NX/SCE bits, and only at
heavyweight VM exit time.  Clearing the NX bit in the host environment
could hang the system if the host page tables use the EXB bit, so we
leave NX as it is.  If host NX=1 and guest NX=0, we can check the guest
page table's EXB bit before inserting a shadow pte (though no guest
expects to see that kind of gp fault).  If host NX=0, we do not expose
the Execute-Disable feature to the guest at all, so the host NX=0,
guest NX=1 combination cannot occur.

This patch reduces raw vmexit time by ~27%.

Me: fix compile warnings on i386.
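
The bit arithmetic behind EFER_SAVE_RESTORE_BITS can be sketched on its own (the bit positions are the architectural EFER bits; the helper names mirror the patch but are restated here for a standalone sketch):

```c
#include <assert.h>

#define EFER_SCE (1ULL << 0)    /* syscall enable */
#define EFER_LME (1ULL << 8)    /* long mode enable  - handled by VMX */
#define EFER_LMA (1ULL << 10)   /* long mode active  - handled by VMX */
#define EFER_NX  (1ULL << 11)   /* no-execute        - deliberately left alone */

#define EFER_SAVE_RESTORE_BITS EFER_SCE

/* Reload is needed only when host and guest disagree in the SCE bit. */
static int efer_need_save_restore(unsigned long long host_efer,
                                  unsigned long long guest_efer)
{
    return (host_efer & EFER_SAVE_RESTORE_BITS) !=
           (guest_efer & EFER_SAVE_RESTORE_BITS);
}

/* Host EFER with the guest's view of the save/restore bits merged in,
 * as load_transition_efer() computes before wrmsrl(). */
static unsigned long long transition_efer(unsigned long long host_efer,
                                          unsigned long long guest_efer)
{
    return (host_efer & ~EFER_SAVE_RESTORE_BITS) |
           (guest_efer & EFER_SAVE_RESTORE_BITS);
}
```

When the bits agree — the common case — the wrmsr is skipped entirely, which is where the exit-time saving comes from.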

Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/kvm.h      |    2 +
 drivers/kvm/kvm_main.c |   23 ++++++++++++++++
 drivers/kvm/vmx.c      |   67 +++++++++++++++++++++++++++++++++---------------
 3 files changed, 71 insertions(+), 21 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index c252efe..db2bc6f 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -255,6 +255,7 @@ struct kvm_stat {
 	u32 request_irq_exits;
 	u32 irq_exits;
 	u32 light_exits;
+	u32 efer_reload;
 };
 
 struct kvm_vcpu {
@@ -289,6 +290,7 @@ struct kvm_vcpu {
 	u64 ia32_misc_enable_msr;
 	int nmsrs;
 	int save_nmsrs;
+	int msr_offset_efer;
 #ifdef CONFIG_X86_64
 	int msr_offset_kernel_gs_base;
 #endif
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index 095d673..af07cd5 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -73,6 +73,7 @@ static struct kvm_stats_debugfs_item {
 	{ "request_irq", STAT_OFFSET(request_irq_exits) },
 	{ "irq_exits", STAT_OFFSET(irq_exits) },
 	{ "light_exits", STAT_OFFSET(light_exits) },
+	{ "efer_reload", STAT_OFFSET(efer_reload) },
 	{ NULL }
 };
 
@@ -2378,6 +2379,27 @@ out:
 	return r;
 }
 
+static void cpuid_fix_nx_cap(struct kvm_vcpu *vcpu)
+{
+	u64 efer;
+	int i;
+	struct kvm_cpuid_entry *e, *entry;
+
+	rdmsrl(MSR_EFER, efer);
+	entry = NULL;
+	for (i = 0; i < vcpu->cpuid_nent; ++i) {
+		e = &vcpu->cpuid_entries[i];
+		if (e->function == 0x80000001) {
+			entry = e;
+			break;
+		}
+	}
+	if (entry && (entry->edx & EFER_NX) && !(efer & EFER_NX)) {
+		entry->edx &= ~(1 << 20);
+		printk(KERN_INFO ": guest NX capability removed\n");
+	}
+}
+
 static int kvm_vcpu_ioctl_set_cpuid(struct kvm_vcpu *vcpu,
 				    struct kvm_cpuid *cpuid,
 				    struct kvm_cpuid_entry __user *entries)
@@ -2392,6 +2414,7 @@ static int kvm_vcpu_ioctl_set_cpuid(struct kvm_vcpu *vcpu,
 			   cpuid->nent * sizeof(struct kvm_cpuid_entry)))
 		goto out;
 	vcpu->cpuid_nent = cpuid->nent;
+	cpuid_fix_nx_cap(vcpu);
 	return 0;
 
 out:
diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index dc99191..93e5bb2 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -42,6 +42,7 @@ static struct page *vmx_io_bitmap_b;
 #else
 #define HOST_IS_64 0
 #endif
+#define EFER_SAVE_RESTORE_BITS ((u64)EFER_SCE)
 
 static struct vmcs_descriptor {
 	int size;
@@ -85,6 +86,18 @@ static const u32 vmx_msr_index[] = {
 };
 #define NR_VMX_MSR ARRAY_SIZE(vmx_msr_index)
 
+static inline u64 msr_efer_save_restore_bits(struct vmx_msr_entry msr)
+{
+	return (u64)msr.data & EFER_SAVE_RESTORE_BITS;
+}
+
+static inline int msr_efer_need_save_restore(struct kvm_vcpu *vcpu)
+{
+	int efer_offset = vcpu->msr_offset_efer;
+	return msr_efer_save_restore_bits(vcpu->host_msrs[efer_offset]) !=
+		msr_efer_save_restore_bits(vcpu->guest_msrs[efer_offset]);
+}
+
 static inline int is_page_fault(u32 intr_info)
 {
 	return (intr_info & (INTR_INFO_INTR_TYPE_MASK | INTR_INFO_VECTOR_MASK |
@@ -265,6 +278,19 @@ static void reload_tss(void)
 #endif
 }
 
+static void load_transition_efer(struct kvm_vcpu *vcpu)
+{
+	u64 trans_efer;
+	int efer_offset = vcpu->msr_offset_efer;
+
+	trans_efer = vcpu->host_msrs[efer_offset].data;
+	trans_efer &= ~EFER_SAVE_RESTORE_BITS;
+	trans_efer |= msr_efer_save_restore_bits(
+				vcpu->guest_msrs[efer_offset]);
+	wrmsrl(MSR_EFER, trans_efer);
+	vcpu->stat.efer_reload++;
+}
+
 static void vmx_save_host_state(struct kvm_vcpu *vcpu)
 {
 	struct vmx_host_state *hs = &vcpu->vmx_host_state;
@@ -308,6 +334,8 @@ static void vmx_save_host_state(struct kvm_vcpu *vcpu)
 	}
 #endif
 	load_msrs(vcpu->guest_msrs, vcpu->save_nmsrs);
+	if (msr_efer_need_save_restore(vcpu))
+		load_transition_efer(vcpu);
 }
 
 static void vmx_load_host_state(struct kvm_vcpu *vcpu)
@@ -336,6 +364,8 @@ static void vmx_load_host_state(struct kvm_vcpu *vcpu)
 	}
 	save_msrs(vcpu->guest_msrs, vcpu->save_nmsrs);
 	load_msrs(vcpu->host_msrs, vcpu->save_nmsrs);
+	if (msr_efer_need_save_restore(vcpu))
+		load_msrs(vcpu->host_msrs + vcpu->msr_offset_efer, 1);
 }
 
 /*
@@ -477,11 +507,13 @@ void move_msr_up(struct kvm_vcpu *vcpu, int from, int to)
  */
 static void setup_msrs(struct kvm_vcpu *vcpu)
 {
-	int index, save_nmsrs;
+	int save_nmsrs;
 
 	save_nmsrs = 0;
 #ifdef CONFIG_X86_64
 	if (is_long_mode(vcpu)) {
+		int index;
+
 		index = __find_msr_index(vcpu, MSR_SYSCALL_MASK);
 		if (index >= 0)
 			move_msr_up(vcpu, index, save_nmsrs++);
@@ -509,22 +541,7 @@ static void setup_msrs(struct kvm_vcpu *vcpu)
 	vcpu->msr_offset_kernel_gs_base =
 		__find_msr_index(vcpu, MSR_KERNEL_GS_BASE);
 #endif
-	index = __find_msr_index(vcpu, MSR_EFER);
-	if (index >= 0)
-		save_nmsrs = 1;
-	else {
-		save_nmsrs = 0;
-		index = 0;
-	}
-	vmcs_writel(VM_ENTRY_MSR_LOAD_ADDR,
-		    virt_to_phys(vcpu->guest_msrs + index));
-	vmcs_writel(VM_EXIT_MSR_STORE_ADDR,
-		    virt_to_phys(vcpu->guest_msrs + index));
-	vmcs_writel(VM_EXIT_MSR_LOAD_ADDR,
-		    virt_to_phys(vcpu->host_msrs + index));
-	vmcs_write32(VM_EXIT_MSR_STORE_COUNT, save_nmsrs);
-	vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, save_nmsrs);
-	vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, save_nmsrs);
+	vcpu->msr_offset_efer = __find_msr_index(vcpu, MSR_EFER);
 }
 
 /*
@@ -611,10 +628,15 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata)
 static int vmx_set_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 data)
 {
 	struct vmx_msr_entry *msr;
+	int ret = 0;
+
 	switch (msr_index) {
 #ifdef CONFIG_X86_64
 	case MSR_EFER:
-		return kvm_set_msr_common(vcpu, msr_index, data);
+		ret = kvm_set_msr_common(vcpu, msr_index, data);
+		if (vcpu->vmx_host_state.loaded)
+			load_transition_efer(vcpu);
+		break;
 	case MSR_FS_BASE:
 		vmcs_writel(GUEST_FS_BASE, data);
 		break;
@@ -639,13 +661,13 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 data)
 		if (msr) {
 			msr->data = data;
 			if (vcpu->vmx_host_state.loaded)
-				load_msrs(vcpu->guest_msrs,vcpu->save_nmsrs);
+				load_msrs(vcpu->guest_msrs, vcpu->save_nmsrs);
 			break;
 		}
-		return kvm_set_msr_common(vcpu, msr_index, data);
+		ret = kvm_set_msr_common(vcpu, msr_index, data);
 	}
 
-	return 0;
+	return ret;
 }
 
 /*
@@ -1326,6 +1348,9 @@ static int vmx_vcpu_setup(struct kvm_vcpu *vcpu)
 
 	asm ("mov $.Lkvm_vmx_return, %0" : "=r"(kvm_vmx_return));
 	vmcs_writel(HOST_RIP, kvm_vmx_return); /* 22.2.5 */
+	vmcs_write32(VM_EXIT_MSR_STORE_COUNT, 0);
+	vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, 0);
+	vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, 0);
 
 	rdmsr(MSR_IA32_SYSENTER_CS, host_sysenter_cs, junk);
 	vmcs_write32(HOST_IA32_SYSENTER_CS, host_sysenter_cs);
-- 
1.5.0.6



* [PATCH 28/58] Use menuconfig objects II - KVM/Virt
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (26 preceding siblings ...)
  2007-06-17  9:44 ` [PATCH 27/58] KVM: VMX: Avoid saving and restoring msr_efer on lightweight vmexit Avi Kivity
@ 2007-06-17  9:44 ` Avi Kivity
  2007-06-17  9:44 ` [PATCH 29/58] KVM: x86 emulator: implement wbinvd Avi Kivity
                   ` (28 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:44 UTC (permalink / raw)
  To: kvm-devel
  Cc: linux-kernel, Jan Engelhardt, Jan Engelhardt, Andrew Morton, Avi Kivity

From: Jan Engelhardt <jengelh@linux01.gwdg.de>

Make a "menuconfig" out of the Kconfig objects "menu, ..., endmenu",
so that the user can disable all the options in that menu at once
instead of having to disable each option separately.

Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/Kconfig |    8 ++++++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/kvm/Kconfig b/drivers/kvm/Kconfig
index e8e37d8..2f661e5 100644
--- a/drivers/kvm/Kconfig
+++ b/drivers/kvm/Kconfig
@@ -1,8 +1,12 @@
 #
 # KVM configuration
 #
-menu "Virtualization"
+menuconfig VIRTUALIZATION
+	bool "Virtualization"
 	depends on X86
+	default y
+
+if VIRTUALIZATION
 
 config KVM
 	tristate "Kernel-based Virtual Machine (KVM) support"
@@ -35,4 +39,4 @@ config KVM_AMD
 	  Provides support for KVM on AMD processors equipped with the AMD-V
 	  (SVM) extensions.
 
-endmenu
+endif # VIRTUALIZATION
-- 
1.5.0.6



* [PATCH 29/58] KVM: x86 emulator: implement wbinvd
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (27 preceding siblings ...)
  2007-06-17  9:44 ` [PATCH 28/58] Use menuconfig objects II - KVM/Virt Avi Kivity
@ 2007-06-17  9:44 ` Avi Kivity
  2007-06-17  9:44 ` [PATCH 30/58] KVM: Fix includes Avi Kivity
                   ` (27 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:44 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

Vista seems to trigger it.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/x86_emulate.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/drivers/kvm/x86_emulate.c b/drivers/kvm/x86_emulate.c
index 7ade090..6123c02 100644
--- a/drivers/kvm/x86_emulate.c
+++ b/drivers/kvm/x86_emulate.c
@@ -152,7 +152,7 @@ static u8 opcode_table[256] = {
 static u16 twobyte_table[256] = {
 	/* 0x00 - 0x0F */
 	0, SrcMem | ModRM | DstReg, 0, 0, 0, 0, ImplicitOps, 0,
-	0, 0, 0, 0, 0, ImplicitOps | ModRM, 0, 0,
+	0, ImplicitOps, 0, 0, 0, ImplicitOps | ModRM, 0, 0,
 	/* 0x10 - 0x1F */
 	0, 0, 0, 0, 0, 0, 0, 0, ImplicitOps | ModRM, 0, 0, 0, 0, 0, 0, 0,
 	/* 0x20 - 0x2F */
@@ -1304,6 +1304,8 @@ twobyte_special_insn:
 	/* Disable writeback. */
 	dst.orig_val = dst.val;
 	switch (b) {
+	case 0x09:		/* wbinvd */
+		break;
 	case 0x0d:		/* GrpP (prefetch) */
 	case 0x18:		/* Grp16 (prefetch/nop) */
 		break;
-- 
1.5.0.6



* [PATCH 30/58] KVM: Fix includes
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (28 preceding siblings ...)
  2007-06-17  9:44 ` [PATCH 29/58] KVM: x86 emulator: implement wbinvd Avi Kivity
@ 2007-06-17  9:44 ` Avi Kivity
  2007-06-17  9:44 ` [PATCH 31/58] KVM: Use symbolic constants instead of magic numbers Avi Kivity
                   ` (26 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:44 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Markus Rechberger, Avi Kivity

From: Markus Rechberger <markus.rechberger@amd.com>

KVM compilation fails for some .configs.  This fixes it.

Signed-off-by: Markus Rechberger <markus.rechberger@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/kvm.h |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index db2bc6f..90001b5 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -10,6 +10,8 @@
 #include <linux/list.h>
 #include <linux/mutex.h>
 #include <linux/spinlock.h>
+#include <linux/signal.h>
+#include <linux/sched.h>
 #include <linux/mm.h>
 #include <asm/signal.h>
 
-- 
1.5.0.6



* [PATCH 31/58] KVM: Use symbolic constants instead of magic numbers
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (29 preceding siblings ...)
  2007-06-17  9:44 ` [PATCH 30/58] KVM: Fix includes Avi Kivity
@ 2007-06-17  9:44 ` Avi Kivity
  2007-06-17  9:44 ` [PATCH 32/58] KVM: MMU: Use slab caches for shadow pages and their headers Avi Kivity
                   ` (25 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:44 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Eddie Dong, Avi Kivity

From: Eddie Dong <eddie.dong@intel.com>

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/paging_tmpl.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/kvm/paging_tmpl.h b/drivers/kvm/paging_tmpl.h
index 6dd0da9..183d4ca 100644
--- a/drivers/kvm/paging_tmpl.h
+++ b/drivers/kvm/paging_tmpl.h
@@ -213,7 +213,7 @@ static void FNAME(update_pte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *page,
 	if (~gpte & (PT_PRESENT_MASK | PT_ACCESSED_MASK))
 		return;
 	pgprintk("%s: gpte %llx spte %p\n", __FUNCTION__, (u64)gpte, spte);
-	FNAME(set_pte)(vcpu, gpte, spte, 6,
+	FNAME(set_pte)(vcpu, gpte, spte, PT_USER_MASK | PT_WRITABLE_MASK,
 		       (gpte & PT_BASE_ADDR_MASK) >> PAGE_SHIFT);
 }
 
-- 
1.5.0.6
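For context on the replacement above: the magic number 6 is exactly the union of the architectural user and writable PTE bits (bit 1 = R/W, bit 2 = U/S), which is what the symbolic form makes explicit. A minimal sketch (mask values mirror the x86 PTE layout; function names are illustrative, not KVM's):

```c
#include <assert.h>
#include <stdint.h>

/* x86 page-table permission bits: bit 1 is Read/Write, bit 2 is User/Supervisor. */
#define PT_WRITABLE_MASK (1ULL << 1)
#define PT_USER_MASK     (1ULL << 2)

/* The magic constant the patch removes. */
static uint64_t legacy_magic_access_bits(void)
{
	return 6;
}

/* The symbolic equivalent the patch introduces. */
static uint64_t symbolic_access_bits(void)
{
	return PT_USER_MASK | PT_WRITABLE_MASK;
}
```

The two forms are numerically identical; the symbolic one survives a future change to the mask definitions.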



* [PATCH 32/58] KVM: MMU: Use slab caches for shadow pages and their headers
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (30 preceding siblings ...)
  2007-06-17  9:44 ` [PATCH 31/58] KVM: Use symbolic constants instead of magic numbers Avi Kivity
@ 2007-06-17  9:44 ` Avi Kivity
  2007-06-17  9:44 ` [PATCH 33/58] KVM: MMU: Simplify fetch() a little bit Avi Kivity
                   ` (24 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:44 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

Use slab caches instead of a simple custom free list.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/kvm.h      |    4 +-
 drivers/kvm/kvm_main.c |    1 -
 drivers/kvm/mmu.c      |   64 +++++++++++++++++++++++++++++------------------
 3 files changed, 41 insertions(+), 28 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index 90001b5..199e1e9 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -299,12 +299,12 @@ struct kvm_vcpu {
 	struct vmx_msr_entry *guest_msrs;
 	struct vmx_msr_entry *host_msrs;
 
-	struct list_head free_pages;
-	struct kvm_mmu_page page_header_buf[KVM_NUM_MMU_PAGES];
 	struct kvm_mmu mmu;
 
 	struct kvm_mmu_memory_cache mmu_pte_chain_cache;
 	struct kvm_mmu_memory_cache mmu_rmap_desc_cache;
+	struct kvm_mmu_memory_cache mmu_page_cache;
+	struct kvm_mmu_memory_cache mmu_page_header_cache;
 
 	gfn_t last_pt_write_gfn;
 	int   last_pt_write_count;
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index af07cd5..bf35457 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -326,7 +326,6 @@ static struct kvm *kvm_create_vm(void)
 		vcpu->cpu = -1;
 		vcpu->kvm = kvm;
 		vcpu->mmu.root_hpa = INVALID_PAGE;
-		INIT_LIST_HEAD(&vcpu->free_pages);
 		spin_lock(&kvm_lock);
 		list_add(&kvm->vm_list, &vm_list);
 		spin_unlock(&kvm_lock);
diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index c85c664..46491b4 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -165,6 +165,8 @@ struct kvm_rmap_desc {
 
 static struct kmem_cache *pte_chain_cache;
 static struct kmem_cache *rmap_desc_cache;
+static struct kmem_cache *mmu_page_cache;
+static struct kmem_cache *mmu_page_header_cache;
 
 static int is_write_protection(struct kvm_vcpu *vcpu)
 {
@@ -235,6 +237,14 @@ static int __mmu_topup_memory_caches(struct kvm_vcpu *vcpu, gfp_t gfp_flags)
 		goto out;
 	r = mmu_topup_memory_cache(&vcpu->mmu_rmap_desc_cache,
 				   rmap_desc_cache, 1, gfp_flags);
+	if (r)
+		goto out;
+	r = mmu_topup_memory_cache(&vcpu->mmu_page_cache,
+				   mmu_page_cache, 4, gfp_flags);
+	if (r)
+		goto out;
+	r = mmu_topup_memory_cache(&vcpu->mmu_page_header_cache,
+				   mmu_page_header_cache, 4, gfp_flags);
 out:
 	return r;
 }
@@ -258,6 +268,8 @@ static void mmu_free_memory_caches(struct kvm_vcpu *vcpu)
 {
 	mmu_free_memory_cache(&vcpu->mmu_pte_chain_cache);
 	mmu_free_memory_cache(&vcpu->mmu_rmap_desc_cache);
+	mmu_free_memory_cache(&vcpu->mmu_page_cache);
+	mmu_free_memory_cache(&vcpu->mmu_page_header_cache);
 }
 
 static void *mmu_memory_cache_alloc(struct kvm_mmu_memory_cache *mc,
@@ -458,7 +470,9 @@ static void kvm_mmu_free_page(struct kvm_vcpu *vcpu,
 			      struct kvm_mmu_page *page_head)
 {
 	ASSERT(is_empty_shadow_page(page_head->spt));
-	list_move(&page_head->link, &vcpu->free_pages);
+	list_del(&page_head->link);
+	mmu_memory_cache_free(&vcpu->mmu_page_cache, page_head->spt);
+	mmu_memory_cache_free(&vcpu->mmu_page_header_cache, page_head);
 	++vcpu->kvm->n_free_mmu_pages;
 }
 
@@ -472,11 +486,14 @@ static struct kvm_mmu_page *kvm_mmu_alloc_page(struct kvm_vcpu *vcpu,
 {
 	struct kvm_mmu_page *page;
 
-	if (list_empty(&vcpu->free_pages))
+	if (!vcpu->kvm->n_free_mmu_pages)
 		return NULL;
 
-	page = list_entry(vcpu->free_pages.next, struct kvm_mmu_page, link);
-	list_move(&page->link, &vcpu->kvm->active_mmu_pages);
+	page = mmu_memory_cache_alloc(&vcpu->mmu_page_header_cache,
+				      sizeof *page);
+	page->spt = mmu_memory_cache_alloc(&vcpu->mmu_page_cache, PAGE_SIZE);
+	set_page_private(virt_to_page(page->spt), (unsigned long)page);
+	list_add(&page->link, &vcpu->kvm->active_mmu_pages);
 	ASSERT(is_empty_shadow_page(page->spt));
 	page->slot_bitmap = 0;
 	page->multimapped = 0;
@@ -1083,6 +1100,7 @@ static int init_kvm_mmu(struct kvm_vcpu *vcpu)
 	ASSERT(vcpu);
 	ASSERT(!VALID_PAGE(vcpu->mmu.root_hpa));
 
+	mmu_topup_memory_caches(vcpu);
 	if (!is_paging(vcpu))
 		return nonpaging_init_context(vcpu);
 	else if (is_long_mode(vcpu))
@@ -1256,13 +1274,6 @@ static void free_mmu_pages(struct kvm_vcpu *vcpu)
 				    struct kvm_mmu_page, link);
 		kvm_mmu_zap_page(vcpu, page);
 	}
-	while (!list_empty(&vcpu->free_pages)) {
-		page = list_entry(vcpu->free_pages.next,
-				  struct kvm_mmu_page, link);
-		list_del(&page->link);
-		free_page((unsigned long)page->spt);
-		page->spt = NULL;
-	}
 	free_page((unsigned long)vcpu->mmu.pae_root);
 }
 
@@ -1273,18 +1284,7 @@ static int alloc_mmu_pages(struct kvm_vcpu *vcpu)
 
 	ASSERT(vcpu);
 
-	for (i = 0; i < KVM_NUM_MMU_PAGES; i++) {
-		struct kvm_mmu_page *page_header = &vcpu->page_header_buf[i];
-
-		INIT_LIST_HEAD(&page_header->link);
-		if ((page = alloc_page(GFP_KERNEL)) == NULL)
-			goto error_1;
-		set_page_private(page, (unsigned long)page_header);
-		page_header->spt = page_address(page);
-		memset(page_header->spt, 0, PAGE_SIZE);
-		list_add(&page_header->link, &vcpu->free_pages);
-		++vcpu->kvm->n_free_mmu_pages;
-	}
+	vcpu->kvm->n_free_mmu_pages = KVM_NUM_MMU_PAGES;
 
 	/*
 	 * When emulating 32-bit mode, cr3 is only 32 bits even on x86_64.
@@ -1309,7 +1309,6 @@ int kvm_mmu_create(struct kvm_vcpu *vcpu)
 {
 	ASSERT(vcpu);
 	ASSERT(!VALID_PAGE(vcpu->mmu.root_hpa));
-	ASSERT(list_empty(&vcpu->free_pages));
 
 	return alloc_mmu_pages(vcpu);
 }
@@ -1318,7 +1317,6 @@ int kvm_mmu_setup(struct kvm_vcpu *vcpu)
 {
 	ASSERT(vcpu);
 	ASSERT(!VALID_PAGE(vcpu->mmu.root_hpa));
-	ASSERT(!list_empty(&vcpu->free_pages));
 
 	return init_kvm_mmu(vcpu);
 }
@@ -1377,6 +1375,10 @@ void kvm_mmu_module_exit(void)
 		kmem_cache_destroy(pte_chain_cache);
 	if (rmap_desc_cache)
 		kmem_cache_destroy(rmap_desc_cache);
+	if (mmu_page_cache)
+		kmem_cache_destroy(mmu_page_cache);
+	if (mmu_page_header_cache)
+		kmem_cache_destroy(mmu_page_header_cache);
 }
 
 int kvm_mmu_module_init(void)
@@ -1392,6 +1394,18 @@ int kvm_mmu_module_init(void)
 	if (!rmap_desc_cache)
 		goto nomem;
 
+	mmu_page_cache = kmem_cache_create("kvm_mmu_page",
+					   PAGE_SIZE,
+					   PAGE_SIZE, 0, NULL, NULL);
+	if (!mmu_page_cache)
+		goto nomem;
+
+	mmu_page_header_cache = kmem_cache_create("kvm_mmu_page_header",
+						  sizeof(struct kvm_mmu_page),
+						  0, 0, NULL, NULL);
+	if (!mmu_page_header_cache)
+		goto nomem;
+
 	return 0;
 
 nomem:
-- 
1.5.0.6
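The pattern the patch builds on (mmu_topup_memory_cache / mmu_memory_cache_alloc) is to refill a small per-vcpu object cache ahead of time so that allocation on the fault path cannot fail. A userspace analogue of that pattern, assuming illustrative names and a plain calloc() backing allocator in place of a kmem_cache:

```c
#include <assert.h>
#include <stdlib.h>

#define CACHE_MAX 8

/* Small object cache topped up outside the hot path. */
struct obj_cache {
	int nobjs;
	void *objs[CACHE_MAX];
};

/* Refill the cache to at least 'min' objects; returns 0 on success.
 * This is the only place that can fail, mirroring mmu_topup_memory_cache(). */
static int cache_topup(struct obj_cache *c, size_t size, int min)
{
	while (c->nobjs < min) {
		void *p = calloc(1, size);
		if (!p)
			return -1;
		c->objs[c->nobjs++] = p;
	}
	return 0;
}

/* Hot-path allocation never fails: it pops a preallocated object. */
static void *cache_alloc(struct obj_cache *c)
{
	assert(c->nobjs > 0);
	return c->objs[--c->nobjs];
}

/* Freed objects go back into the cache when there is room. */
static void cache_free(struct obj_cache *c, void *p)
{
	if (c->nobjs < CACHE_MAX)
		c->objs[c->nobjs++] = p;
	else
		free(p);
}
```

In the kernel the backing store is a kmem_cache (so same-sized objects share slabs and debugging hooks), but the topup/alloc split is the same.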



* [PATCH 33/58] KVM: MMU: Simplify fetch() a little bit
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (31 preceding siblings ...)
  2007-06-17  9:44 ` [PATCH 32/58] KVM: MMU: Use slab caches for shadow pages and their headers Avi Kivity
@ 2007-06-17  9:44 ` Avi Kivity
  2007-06-17  9:44 ` [PATCH 34/58] KVM: MMU: Move set_pte_common() to pte width dependent code Avi Kivity
                   ` (23 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:44 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/paging_tmpl.h |   34 +++++++++++++++++-----------------
 1 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/drivers/kvm/paging_tmpl.h b/drivers/kvm/paging_tmpl.h
index 183d4ca..e094a8b 100644
--- a/drivers/kvm/paging_tmpl.h
+++ b/drivers/kvm/paging_tmpl.h
@@ -241,6 +241,7 @@ static u64 *FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
 {
 	hpa_t shadow_addr;
 	int level;
+	u64 *shadow_ent;
 	u64 *prev_shadow_ent = NULL;
 	pt_element_t *guest_ent = walker->ptep;
 
@@ -257,13 +258,13 @@ static u64 *FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
 
 	for (; ; level--) {
 		u32 index = SHADOW_PT_INDEX(addr, level);
-		u64 *shadow_ent = ((u64 *)__va(shadow_addr)) + index;
 		struct kvm_mmu_page *shadow_page;
 		u64 shadow_pte;
 		int metaphysical;
 		gfn_t table_gfn;
 		unsigned hugepage_access = 0;
 
+		shadow_ent = ((u64 *)__va(shadow_addr)) + index;
 		if (is_present_pte(*shadow_ent) || is_io_pte(*shadow_ent)) {
 			if (level == PT_PAGE_TABLE_LEVEL)
 				return shadow_ent;
@@ -272,22 +273,8 @@ static u64 *FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
 			continue;
 		}
 
-		if (level == PT_PAGE_TABLE_LEVEL) {
-
-			if (walker->level == PT_DIRECTORY_LEVEL) {
-				if (prev_shadow_ent)
-					*prev_shadow_ent |= PT_SHADOW_PS_MARK;
-				FNAME(set_pde)(vcpu, *guest_ent, shadow_ent,
-					       walker->inherited_ar,
-					       walker->gfn);
-			} else {
-				ASSERT(walker->level == PT_PAGE_TABLE_LEVEL);
-				FNAME(set_pte)(vcpu, *guest_ent, shadow_ent,
-					       walker->inherited_ar,
-					       walker->gfn);
-			}
-			return shadow_ent;
-		}
+		if (level == PT_PAGE_TABLE_LEVEL)
+			break;
 
 		if (level - 1 == PT_PAGE_TABLE_LEVEL
 		    && walker->level == PT_DIRECTORY_LEVEL) {
@@ -310,6 +297,19 @@ static u64 *FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
 		*shadow_ent = shadow_pte;
 		prev_shadow_ent = shadow_ent;
 	}
+
+	if (walker->level == PT_DIRECTORY_LEVEL) {
+		if (prev_shadow_ent)
+			*prev_shadow_ent |= PT_SHADOW_PS_MARK;
+		FNAME(set_pde)(vcpu, *guest_ent, shadow_ent,
+			       walker->inherited_ar, walker->gfn);
+	} else {
+		ASSERT(walker->level == PT_PAGE_TABLE_LEVEL);
+		FNAME(set_pte)(vcpu, *guest_ent, shadow_ent,
+			       walker->inherited_ar,
+			       walker->gfn);
+	}
+	return shadow_ent;
 }
 
 /*
-- 
1.5.0.6



* [PATCH 34/58] KVM: MMU: Move set_pte_common() to pte width dependent code
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (32 preceding siblings ...)
  2007-06-17  9:44 ` [PATCH 33/58] KVM: MMU: Simplify fetch() a little bit Avi Kivity
@ 2007-06-17  9:44 ` Avi Kivity
  2007-06-17  9:44 ` [PATCH 35/58] KVM: MMU: Pass the guest pde to set_pte_common Avi Kivity
                   ` (22 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:44 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

In preparation for upcoming modifications.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/mmu.c         |   48 --------------------------------------
 drivers/kvm/paging_tmpl.h |   56 +++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 52 insertions(+), 52 deletions(-)

diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index 46491b4..a763150 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -965,54 +965,6 @@ static void paging_new_cr3(struct kvm_vcpu *vcpu)
 	kvm_arch_ops->set_cr3(vcpu, vcpu->mmu.root_hpa);
 }
 
-static inline void set_pte_common(struct kvm_vcpu *vcpu,
-			     u64 *shadow_pte,
-			     gpa_t gaddr,
-			     int dirty,
-			     u64 access_bits,
-			     gfn_t gfn)
-{
-	hpa_t paddr;
-
-	*shadow_pte |= access_bits << PT_SHADOW_BITS_OFFSET;
-	if (!dirty)
-		access_bits &= ~PT_WRITABLE_MASK;
-
-	paddr = gpa_to_hpa(vcpu, gaddr & PT64_BASE_ADDR_MASK);
-
-	*shadow_pte |= access_bits;
-
-	if (is_error_hpa(paddr)) {
-		*shadow_pte |= gaddr;
-		*shadow_pte |= PT_SHADOW_IO_MARK;
-		*shadow_pte &= ~PT_PRESENT_MASK;
-		return;
-	}
-
-	*shadow_pte |= paddr;
-
-	if (access_bits & PT_WRITABLE_MASK) {
-		struct kvm_mmu_page *shadow;
-
-		shadow = kvm_mmu_lookup_page(vcpu, gfn);
-		if (shadow) {
-			pgprintk("%s: found shadow page for %lx, marking ro\n",
-				 __FUNCTION__, gfn);
-			access_bits &= ~PT_WRITABLE_MASK;
-			if (is_writeble_pte(*shadow_pte)) {
-				    *shadow_pte &= ~PT_WRITABLE_MASK;
-				    kvm_arch_ops->tlb_flush(vcpu);
-			}
-		}
-	}
-
-	if (access_bits & PT_WRITABLE_MASK)
-		mark_page_dirty(vcpu->kvm, gaddr >> PAGE_SHIFT);
-
-	page_header_update_slot(vcpu->kvm, shadow_pte, gaddr);
-	rmap_add(vcpu, shadow_pte);
-}
-
 static void inject_page_fault(struct kvm_vcpu *vcpu,
 			      u64 addr,
 			      u32 err_code)
diff --git a/drivers/kvm/paging_tmpl.h b/drivers/kvm/paging_tmpl.h
index e094a8b..6576300 100644
--- a/drivers/kvm/paging_tmpl.h
+++ b/drivers/kvm/paging_tmpl.h
@@ -192,14 +192,62 @@ static void FNAME(mark_pagetable_dirty)(struct kvm *kvm,
 	mark_page_dirty(kvm, walker->table_gfn[walker->level - 1]);
 }
 
+static void FNAME(set_pte_common)(struct kvm_vcpu *vcpu,
+				  u64 *shadow_pte,
+				  gpa_t gaddr,
+				  int dirty,
+				  u64 access_bits,
+				  gfn_t gfn)
+{
+	hpa_t paddr;
+
+	*shadow_pte |= access_bits << PT_SHADOW_BITS_OFFSET;
+	if (!dirty)
+		access_bits &= ~PT_WRITABLE_MASK;
+
+	paddr = gpa_to_hpa(vcpu, gaddr & PT64_BASE_ADDR_MASK);
+
+	*shadow_pte |= access_bits;
+
+	if (is_error_hpa(paddr)) {
+		*shadow_pte |= gaddr;
+		*shadow_pte |= PT_SHADOW_IO_MARK;
+		*shadow_pte &= ~PT_PRESENT_MASK;
+		return;
+	}
+
+	*shadow_pte |= paddr;
+
+	if (access_bits & PT_WRITABLE_MASK) {
+		struct kvm_mmu_page *shadow;
+
+		shadow = kvm_mmu_lookup_page(vcpu, gfn);
+		if (shadow) {
+			pgprintk("%s: found shadow page for %lx, marking ro\n",
+				 __FUNCTION__, gfn);
+			access_bits &= ~PT_WRITABLE_MASK;
+			if (is_writeble_pte(*shadow_pte)) {
+				    *shadow_pte &= ~PT_WRITABLE_MASK;
+				    kvm_arch_ops->tlb_flush(vcpu);
+			}
+		}
+	}
+
+	if (access_bits & PT_WRITABLE_MASK)
+		mark_page_dirty(vcpu->kvm, gaddr >> PAGE_SHIFT);
+
+	page_header_update_slot(vcpu->kvm, shadow_pte, gaddr);
+	rmap_add(vcpu, shadow_pte);
+}
+
 static void FNAME(set_pte)(struct kvm_vcpu *vcpu, u64 guest_pte,
 			   u64 *shadow_pte, u64 access_bits, gfn_t gfn)
 {
 	ASSERT(*shadow_pte == 0);
 	access_bits &= guest_pte;
 	*shadow_pte = (guest_pte & PT_PTE_COPY_MASK);
-	set_pte_common(vcpu, shadow_pte, guest_pte & PT_BASE_ADDR_MASK,
-		       guest_pte & PT_DIRTY_MASK, access_bits, gfn);
+	FNAME(set_pte_common)(vcpu, shadow_pte, guest_pte & PT_BASE_ADDR_MASK,
+			      guest_pte & PT_DIRTY_MASK, access_bits, gfn);
 }
 
 static void FNAME(update_pte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *page,
@@ -229,8 +277,8 @@ static void FNAME(set_pde)(struct kvm_vcpu *vcpu, u64 guest_pde,
 		gaddr |= (guest_pde & PT32_DIR_PSE36_MASK) <<
 			(32 - PT32_DIR_PSE36_SHIFT);
 	*shadow_pte = guest_pde & PT_PTE_COPY_MASK;
-	set_pte_common(vcpu, shadow_pte, gaddr,
-		       guest_pde & PT_DIRTY_MASK, access_bits, gfn);
+	FNAME(set_pte_common)(vcpu, shadow_pte, gaddr,
+			      guest_pde & PT_DIRTY_MASK, access_bits, gfn);
 }
 
 /*
-- 
1.5.0.6



* [PATCH 35/58] KVM: MMU: Pass the guest pde to set_pte_common
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (33 preceding siblings ...)
  2007-06-17  9:44 ` [PATCH 34/58] KVM: MMU: Move set_pte_common() to pte width dependent code Avi Kivity
@ 2007-06-17  9:44 ` Avi Kivity
  2007-06-17  9:44 ` [PATCH 36/58] KVM: MMU: Fold fix_read_pf() into set_pte_common() Avi Kivity
                   ` (21 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:44 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

A future patch will need the accessed bit (in addition to the dirty bit),
as well as write access to the guest pte (for setting its dirty bit).

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/paging_tmpl.h |   29 +++++++++++++++--------------
 1 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/drivers/kvm/paging_tmpl.h b/drivers/kvm/paging_tmpl.h
index 6576300..7e998d1 100644
--- a/drivers/kvm/paging_tmpl.h
+++ b/drivers/kvm/paging_tmpl.h
@@ -195,11 +195,12 @@ static void FNAME(mark_pagetable_dirty)(struct kvm *kvm,
 static void FNAME(set_pte_common)(struct kvm_vcpu *vcpu,
 				  u64 *shadow_pte,
 				  gpa_t gaddr,
-				  int dirty,
+				  pt_element_t *gpte,
 				  u64 access_bits,
 				  gfn_t gfn)
 {
 	hpa_t paddr;
+	int dirty = *gpte & PT_DIRTY_MASK;
 
 	*shadow_pte |= access_bits << PT_SHADOW_BITS_OFFSET;
 	if (!dirty)
@@ -240,14 +241,14 @@ static void FNAME(set_pte_common)(struct kvm_vcpu *vcpu,
 	rmap_add(vcpu, shadow_pte);
 }
 
-static void FNAME(set_pte)(struct kvm_vcpu *vcpu, u64 guest_pte,
+static void FNAME(set_pte)(struct kvm_vcpu *vcpu, pt_element_t *gpte,
 			   u64 *shadow_pte, u64 access_bits, gfn_t gfn)
 {
 	ASSERT(*shadow_pte == 0);
-	access_bits &= guest_pte;
-	*shadow_pte = (guest_pte & PT_PTE_COPY_MASK);
-	FNAME(set_pte_common)(vcpu, shadow_pte, guest_pte & PT_BASE_ADDR_MASK,
-			      guest_pte & PT_DIRTY_MASK, access_bits, gfn);
+	access_bits &= *gpte;
+	*shadow_pte = (*gpte & PT_PTE_COPY_MASK);
+	FNAME(set_pte_common)(vcpu, shadow_pte, *gpte & PT_BASE_ADDR_MASK,
+			      gpte, access_bits, gfn);
 }
 
 static void FNAME(update_pte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *page,
@@ -261,24 +262,24 @@ static void FNAME(update_pte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *page,
 	if (~gpte & (PT_PRESENT_MASK | PT_ACCESSED_MASK))
 		return;
 	pgprintk("%s: gpte %llx spte %p\n", __FUNCTION__, (u64)gpte, spte);
-	FNAME(set_pte)(vcpu, gpte, spte, PT_USER_MASK | PT_WRITABLE_MASK,
+	FNAME(set_pte)(vcpu, &gpte, spte, PT_USER_MASK | PT_WRITABLE_MASK,
 		       (gpte & PT_BASE_ADDR_MASK) >> PAGE_SHIFT);
 }
 
-static void FNAME(set_pde)(struct kvm_vcpu *vcpu, u64 guest_pde,
+static void FNAME(set_pde)(struct kvm_vcpu *vcpu, pt_element_t *gpde,
 			   u64 *shadow_pte, u64 access_bits, gfn_t gfn)
 {
 	gpa_t gaddr;
 
 	ASSERT(*shadow_pte == 0);
-	access_bits &= guest_pde;
+	access_bits &= *gpde;
 	gaddr = (gpa_t)gfn << PAGE_SHIFT;
 	if (PTTYPE == 32 && is_cpuid_PSE36())
-		gaddr |= (guest_pde & PT32_DIR_PSE36_MASK) <<
+		gaddr |= (*gpde & PT32_DIR_PSE36_MASK) <<
 			(32 - PT32_DIR_PSE36_SHIFT);
-	*shadow_pte = guest_pde & PT_PTE_COPY_MASK;
+	*shadow_pte = *gpde & PT_PTE_COPY_MASK;
 	FNAME(set_pte_common)(vcpu, shadow_pte, gaddr,
-			      guest_pde & PT_DIRTY_MASK, access_bits, gfn);
+			      gpde, access_bits, gfn);
 }
 
 /*
@@ -349,11 +350,11 @@ static u64 *FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
 	if (walker->level == PT_DIRECTORY_LEVEL) {
 		if (prev_shadow_ent)
 			*prev_shadow_ent |= PT_SHADOW_PS_MARK;
-		FNAME(set_pde)(vcpu, *guest_ent, shadow_ent,
+		FNAME(set_pde)(vcpu, guest_ent, shadow_ent,
 			       walker->inherited_ar, walker->gfn);
 	} else {
 		ASSERT(walker->level == PT_PAGE_TABLE_LEVEL);
-		FNAME(set_pte)(vcpu, *guest_ent, shadow_ent,
+		FNAME(set_pte)(vcpu, guest_ent, shadow_ent,
 			       walker->inherited_ar,
 			       walker->gfn);
 	}
-- 
1.5.0.6
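The point of passing a pointer rather than a copied value is that the callee can now write the guest pte in place, e.g. to set the architectural dirty bit on a write fault, as the commit message anticipates. A minimal sketch, assuming the x86 bit positions (accessed = bit 5, dirty = bit 6) and an illustrative function name:

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t pt_element_t;

#define PT_ACCESSED_MASK (1ULL << 5)
#define PT_DIRTY_MASK    (1ULL << 6)

/* With a pointer to the guest pte, the fault path can update the
 * guest-visible dirty bit in place instead of working on a stale copy. */
static void mark_gpte_dirty(pt_element_t *gpte)
{
	*gpte |= PT_DIRTY_MASK;
}
```

With the old by-value interface this write would have been lost, since only the caller's local copy would change.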



* [PATCH 36/58] KVM: MMU: Fold fix_read_pf() into set_pte_common()
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (34 preceding siblings ...)
  2007-06-17  9:44 ` [PATCH 35/58] KVM: MMU: Pass the guest pde to set_pte_common Avi Kivity
@ 2007-06-17  9:44 ` Avi Kivity
  2007-06-17  9:44 ` [PATCH 37/58] KVM: MMU: Fold fix_write_pf() " Avi Kivity
                   ` (20 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:44 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/mmu.c         |   17 -----------------
 drivers/kvm/paging_tmpl.h |   34 +++++++++++++++++++++++-----------
 2 files changed, 23 insertions(+), 28 deletions(-)

diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index a763150..2079d69 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -972,23 +972,6 @@ static void inject_page_fault(struct kvm_vcpu *vcpu,
 	kvm_arch_ops->inject_page_fault(vcpu, addr, err_code);
 }
 
-static inline int fix_read_pf(u64 *shadow_ent)
-{
-	if ((*shadow_ent & PT_SHADOW_USER_MASK) &&
-	    !(*shadow_ent & PT_USER_MASK)) {
-		/*
-		 * If supervisor write protect is disabled, we shadow kernel
-		 * pages as user pages so we can trap the write access.
-		 */
-		*shadow_ent |= PT_USER_MASK;
-		*shadow_ent &= ~PT_WRITABLE_MASK;
-
-		return 1;
-
-	}
-	return 0;
-}
-
 static void paging_free(struct kvm_vcpu *vcpu)
 {
 	nonpaging_free(vcpu);
diff --git a/drivers/kvm/paging_tmpl.h b/drivers/kvm/paging_tmpl.h
index 7e998d1..869582b 100644
--- a/drivers/kvm/paging_tmpl.h
+++ b/drivers/kvm/paging_tmpl.h
@@ -197,6 +197,7 @@ static void FNAME(set_pte_common)(struct kvm_vcpu *vcpu,
 				  gpa_t gaddr,
 				  pt_element_t *gpte,
 				  u64 access_bits,
+				  int write_fault,
 				  gfn_t gfn)
 {
 	hpa_t paddr;
@@ -219,6 +220,17 @@ static void FNAME(set_pte_common)(struct kvm_vcpu *vcpu,
 
 	*shadow_pte |= paddr;
 
+	if (!write_fault && (*shadow_pte & PT_SHADOW_USER_MASK) &&
+	    !(*shadow_pte & PT_USER_MASK)) {
+		/*
+		 * If supervisor write protect is disabled, we shadow kernel
+		 * pages as user pages so we can trap the write access.
+		 */
+		*shadow_pte |= PT_USER_MASK;
+		*shadow_pte &= ~PT_WRITABLE_MASK;
+		access_bits &= ~PT_WRITABLE_MASK;
+	}
+
 	if (access_bits & PT_WRITABLE_MASK) {
 		struct kvm_mmu_page *shadow;
 
@@ -242,13 +254,14 @@ static void FNAME(set_pte_common)(struct kvm_vcpu *vcpu,
 }
 
 static void FNAME(set_pte)(struct kvm_vcpu *vcpu, pt_element_t *gpte,
-			   u64 *shadow_pte, u64 access_bits, gfn_t gfn)
+			   u64 *shadow_pte, u64 access_bits,
+			   int write_fault, gfn_t gfn)
 {
 	ASSERT(*shadow_pte == 0);
 	access_bits &= *gpte;
 	*shadow_pte = (*gpte & PT_PTE_COPY_MASK);
 	FNAME(set_pte_common)(vcpu, shadow_pte, *gpte & PT_BASE_ADDR_MASK,
-			      gpte, access_bits, gfn);
+			      gpte, access_bits, write_fault, gfn);
 }
 
 static void FNAME(update_pte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *page,
@@ -262,12 +275,13 @@ static void FNAME(update_pte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *page,
 	if (~gpte & (PT_PRESENT_MASK | PT_ACCESSED_MASK))
 		return;
 	pgprintk("%s: gpte %llx spte %p\n", __FUNCTION__, (u64)gpte, spte);
-	FNAME(set_pte)(vcpu, &gpte, spte, PT_USER_MASK | PT_WRITABLE_MASK,
+	FNAME(set_pte)(vcpu, &gpte, spte, PT_USER_MASK | PT_WRITABLE_MASK, 0,
 		       (gpte & PT_BASE_ADDR_MASK) >> PAGE_SHIFT);
 }
 
 static void FNAME(set_pde)(struct kvm_vcpu *vcpu, pt_element_t *gpde,
-			   u64 *shadow_pte, u64 access_bits, gfn_t gfn)
+			   u64 *shadow_pte, u64 access_bits, int write_fault,
+			   gfn_t gfn)
 {
 	gpa_t gaddr;
 
@@ -279,14 +293,14 @@ static void FNAME(set_pde)(struct kvm_vcpu *vcpu, pt_element_t *gpde,
 			(32 - PT32_DIR_PSE36_SHIFT);
 	*shadow_pte = *gpde & PT_PTE_COPY_MASK;
 	FNAME(set_pte_common)(vcpu, shadow_pte, gaddr,
-			      gpde, access_bits, gfn);
+			      gpde, access_bits, write_fault, gfn);
 }
 
 /*
  * Fetch a shadow pte for a specific level in the paging hierarchy.
  */
 static u64 *FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
-			      struct guest_walker *walker)
+			 struct guest_walker *walker, int write_fault)
 {
 	hpa_t shadow_addr;
 	int level;
@@ -351,12 +365,12 @@ static u64 *FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
 		if (prev_shadow_ent)
 			*prev_shadow_ent |= PT_SHADOW_PS_MARK;
 		FNAME(set_pde)(vcpu, guest_ent, shadow_ent,
-			       walker->inherited_ar, walker->gfn);
+			       walker->inherited_ar, write_fault, walker->gfn);
 	} else {
 		ASSERT(walker->level == PT_PAGE_TABLE_LEVEL);
 		FNAME(set_pte)(vcpu, guest_ent, shadow_ent,
 			       walker->inherited_ar,
-			       walker->gfn);
+			       write_fault, walker->gfn);
 	}
 	return shadow_ent;
 }
@@ -489,7 +503,7 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, gva_t addr,
 		return 0;
 	}
 
-	shadow_pte = FNAME(fetch)(vcpu, addr, &walker);
+	shadow_pte = FNAME(fetch)(vcpu, addr, &walker, write_fault);
 	pgprintk("%s: shadow pte %p %llx\n", __FUNCTION__,
 		 shadow_pte, *shadow_pte);
 
@@ -499,8 +513,6 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, gva_t addr,
 	if (write_fault)
 		fixed = FNAME(fix_write_pf)(vcpu, shadow_pte, &walker, addr,
 					    user_fault, &write_pt);
-	else
-		fixed = fix_read_pf(shadow_pte);
 
 	pgprintk("%s: updated shadow pte %p %llx\n", __FUNCTION__,
 		 shadow_pte, *shadow_pte);
-- 
1.5.0.6
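The read-fault logic being folded in is a small bit transform: when supervisor write protection is disabled, a guest kernel page is shadowed as a user page so it is reachable, but made read-only so writes still trap. A sketch of that transform, assuming an illustrative bit position for the shadow-user mark (the real PT_SHADOW_USER_MASK lives in the pte's software-available bits):

```c
#include <assert.h>
#include <stdint.h>

#define PT_WRITABLE_MASK    (1ULL << 1)
#define PT_USER_MASK        (1ULL << 2)
#define PT_SHADOW_USER_MASK (1ULL << 12) /* illustrative position only */

/* If the guest pte was a kernel mapping (shadow-user mark set, hardware
 * user bit clear), expose it as a user page but strip write access so
 * guest writes fault into the hypervisor. */
static uint64_t shadow_kernel_page_for_read(uint64_t spte)
{
	if ((spte & PT_SHADOW_USER_MASK) && !(spte & PT_USER_MASK)) {
		spte |= PT_USER_MASK;
		spte &= ~PT_WRITABLE_MASK;
	}
	return spte;
}
```

Genuine user pages pass through unchanged; only kernel-owned mappings lose the writable bit.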



* [PATCH 37/58] KVM: MMU: Fold fix_write_pf() into set_pte_common()
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (35 preceding siblings ...)
  2007-06-17  9:44 ` [PATCH 36/58] KVM: MMU: Fold fix_read_pf() into set_pte_common() Avi Kivity
@ 2007-06-17  9:44 ` Avi Kivity
  2007-06-17  9:44 ` [PATCH 38/58] KVM: Move shadow pte modifications from set_pte/set_pde to set_pde_common() Avi Kivity
                   ` (19 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:44 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

This prevents some work from being performed twice, and, more importantly,
reduces the number of places where we modify shadow ptes.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/mmu.c         |   11 +++
 drivers/kvm/paging_tmpl.h |  168 +++++++++++++++-----------------------------
 2 files changed, 68 insertions(+), 111 deletions(-)

diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index 2079d69..3cdbf68 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -731,6 +731,17 @@ static int kvm_mmu_unprotect_page(struct kvm_vcpu *vcpu, gfn_t gfn)
 	return r;
 }
 
+static void mmu_unshadow(struct kvm_vcpu *vcpu, gfn_t gfn)
+{
+	struct kvm_mmu_page *page;
+
+	while ((page = kvm_mmu_lookup_page(vcpu, gfn)) != NULL) {
+		pgprintk("%s: zap %lx %x\n",
+			 __FUNCTION__, gfn, page->role.word);
+		kvm_mmu_zap_page(vcpu, page);
+	}
+}
+
 static void page_header_update_slot(struct kvm *kvm, void *pte, gpa_t gpa)
 {
 	int slot = memslot_id(kvm, gfn_to_memslot(kvm, gpa >> PAGE_SHIFT));
diff --git a/drivers/kvm/paging_tmpl.h b/drivers/kvm/paging_tmpl.h
index 869582b..c067203 100644
--- a/drivers/kvm/paging_tmpl.h
+++ b/drivers/kvm/paging_tmpl.h
@@ -197,11 +197,26 @@ static void FNAME(set_pte_common)(struct kvm_vcpu *vcpu,
 				  gpa_t gaddr,
 				  pt_element_t *gpte,
 				  u64 access_bits,
+				  int user_fault,
 				  int write_fault,
+				  int *ptwrite,
+				  struct guest_walker *walker,
 				  gfn_t gfn)
 {
 	hpa_t paddr;
 	int dirty = *gpte & PT_DIRTY_MASK;
+	int was_rmapped = is_rmap_pte(*shadow_pte);
+
+	pgprintk("%s: spte %llx gpte %llx access %llx write_fault %d"
+		 " user_fault %d gfn %lx\n",
+		 __FUNCTION__, *shadow_pte, (u64)*gpte, access_bits,
+		 write_fault, user_fault, gfn);
+
+	if (write_fault && !dirty) {
+		*gpte |= PT_DIRTY_MASK;
+		dirty = 1;
+		FNAME(mark_pagetable_dirty)(vcpu->kvm, walker);
+	}
 
 	*shadow_pte |= access_bits << PT_SHADOW_BITS_OFFSET;
 	if (!dirty)
@@ -209,7 +224,9 @@ static void FNAME(set_pte_common)(struct kvm_vcpu *vcpu,
 
 	paddr = gpa_to_hpa(vcpu, gaddr & PT64_BASE_ADDR_MASK);
 
-	*shadow_pte |= access_bits;
+	*shadow_pte |= PT_PRESENT_MASK;
+	if (access_bits & PT_USER_MASK)
+		*shadow_pte |= PT_USER_MASK;
 
 	if (is_error_hpa(paddr)) {
 		*shadow_pte |= gaddr;
@@ -231,37 +248,50 @@ static void FNAME(set_pte_common)(struct kvm_vcpu *vcpu,
 		access_bits &= ~PT_WRITABLE_MASK;
 	}
 
-	if (access_bits & PT_WRITABLE_MASK) {
+	if ((access_bits & PT_WRITABLE_MASK)
+	    || (write_fault && !is_write_protection(vcpu) && !user_fault)) {
 		struct kvm_mmu_page *shadow;
 
+		*shadow_pte |= PT_WRITABLE_MASK;
+		if (user_fault) {
+			mmu_unshadow(vcpu, gfn);
+			goto unshadowed;
+		}
+
 		shadow = kvm_mmu_lookup_page(vcpu, gfn);
 		if (shadow) {
 			pgprintk("%s: found shadow page for %lx, marking ro\n",
 				 __FUNCTION__, gfn);
 			access_bits &= ~PT_WRITABLE_MASK;
 			if (is_writeble_pte(*shadow_pte)) {
-				    *shadow_pte &= ~PT_WRITABLE_MASK;
-				    kvm_arch_ops->tlb_flush(vcpu);
+				*shadow_pte &= ~PT_WRITABLE_MASK;
+				kvm_arch_ops->tlb_flush(vcpu);
 			}
+			if (write_fault)
+				*ptwrite = 1;
 		}
 	}
 
+unshadowed:
+
 	if (access_bits & PT_WRITABLE_MASK)
 		mark_page_dirty(vcpu->kvm, gaddr >> PAGE_SHIFT);
 
 	page_header_update_slot(vcpu->kvm, shadow_pte, gaddr);
-	rmap_add(vcpu, shadow_pte);
+	if (!was_rmapped)
+		rmap_add(vcpu, shadow_pte);
 }
 
 static void FNAME(set_pte)(struct kvm_vcpu *vcpu, pt_element_t *gpte,
 			   u64 *shadow_pte, u64 access_bits,
-			   int write_fault, gfn_t gfn)
+			   int user_fault, int write_fault, int *ptwrite,
+			   struct guest_walker *walker, gfn_t gfn)
 {
-	ASSERT(*shadow_pte == 0);
 	access_bits &= *gpte;
-	*shadow_pte = (*gpte & PT_PTE_COPY_MASK);
+	*shadow_pte |= (*gpte & PT_PTE_COPY_MASK);
 	FNAME(set_pte_common)(vcpu, shadow_pte, *gpte & PT_BASE_ADDR_MASK,
-			      gpte, access_bits, write_fault, gfn);
+			      gpte, access_bits, user_fault, write_fault,
+			      ptwrite, walker, gfn);
 }
 
 static void FNAME(update_pte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *page,
@@ -276,31 +306,34 @@ static void FNAME(update_pte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *page,
 		return;
 	pgprintk("%s: gpte %llx spte %p\n", __FUNCTION__, (u64)gpte, spte);
 	FNAME(set_pte)(vcpu, &gpte, spte, PT_USER_MASK | PT_WRITABLE_MASK, 0,
+		       0, NULL, NULL,
 		       (gpte & PT_BASE_ADDR_MASK) >> PAGE_SHIFT);
 }
 
 static void FNAME(set_pde)(struct kvm_vcpu *vcpu, pt_element_t *gpde,
-			   u64 *shadow_pte, u64 access_bits, int write_fault,
-			   gfn_t gfn)
+			   u64 *shadow_pte, u64 access_bits,
+			   int user_fault, int write_fault, int *ptwrite,
+			   struct guest_walker *walker, gfn_t gfn)
 {
 	gpa_t gaddr;
 
-	ASSERT(*shadow_pte == 0);
 	access_bits &= *gpde;
 	gaddr = (gpa_t)gfn << PAGE_SHIFT;
 	if (PTTYPE == 32 && is_cpuid_PSE36())
 		gaddr |= (*gpde & PT32_DIR_PSE36_MASK) <<
 			(32 - PT32_DIR_PSE36_SHIFT);
-	*shadow_pte = *gpde & PT_PTE_COPY_MASK;
+	*shadow_pte |= *gpde & PT_PTE_COPY_MASK;
 	FNAME(set_pte_common)(vcpu, shadow_pte, gaddr,
-			      gpde, access_bits, write_fault, gfn);
+			      gpde, access_bits, user_fault, write_fault,
+			      ptwrite, walker, gfn);
 }
 
 /*
  * Fetch a shadow pte for a specific level in the paging hierarchy.
  */
 static u64 *FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
-			 struct guest_walker *walker, int write_fault)
+			 struct guest_walker *walker,
+			 int user_fault, int write_fault, int *ptwrite)
 {
 	hpa_t shadow_addr;
 	int level;
@@ -330,7 +363,7 @@ static u64 *FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
 		shadow_ent = ((u64 *)__va(shadow_addr)) + index;
 		if (is_present_pte(*shadow_ent) || is_io_pte(*shadow_ent)) {
 			if (level == PT_PAGE_TABLE_LEVEL)
-				return shadow_ent;
+				break;
 			shadow_addr = *shadow_ent & PT64_BASE_ADDR_MASK;
 			prev_shadow_ent = shadow_ent;
 			continue;
@@ -365,95 +398,18 @@ static u64 *FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
 		if (prev_shadow_ent)
 			*prev_shadow_ent |= PT_SHADOW_PS_MARK;
 		FNAME(set_pde)(vcpu, guest_ent, shadow_ent,
-			       walker->inherited_ar, write_fault, walker->gfn);
+			       walker->inherited_ar, user_fault, write_fault,
+			       ptwrite, walker, walker->gfn);
 	} else {
 		ASSERT(walker->level == PT_PAGE_TABLE_LEVEL);
 		FNAME(set_pte)(vcpu, guest_ent, shadow_ent,
-			       walker->inherited_ar,
-			       write_fault, walker->gfn);
+			       walker->inherited_ar, user_fault, write_fault,
+			       ptwrite, walker, walker->gfn);
 	}
 	return shadow_ent;
 }
 
 /*
- * The guest faulted for write.  We need to
- *
- * - check write permissions
- * - update the guest pte dirty bit
- * - update our own dirty page tracking structures
- */
-static int FNAME(fix_write_pf)(struct kvm_vcpu *vcpu,
-			       u64 *shadow_ent,
-			       struct guest_walker *walker,
-			       gva_t addr,
-			       int user,
-			       int *write_pt)
-{
-	pt_element_t *guest_ent;
-	int writable_shadow;
-	gfn_t gfn;
-	struct kvm_mmu_page *page;
-
-	if (is_writeble_pte(*shadow_ent))
-		return !user || (*shadow_ent & PT_USER_MASK);
-
-	writable_shadow = *shadow_ent & PT_SHADOW_WRITABLE_MASK;
-	if (user) {
-		/*
-		 * User mode access.  Fail if it's a kernel page or a read-only
-		 * page.
-		 */
-		if (!(*shadow_ent & PT_SHADOW_USER_MASK) || !writable_shadow)
-			return 0;
-		ASSERT(*shadow_ent & PT_USER_MASK);
-	} else
-		/*
-		 * Kernel mode access.  Fail if it's a read-only page and
-		 * supervisor write protection is enabled.
-		 */
-		if (!writable_shadow) {
-			if (is_write_protection(vcpu))
-				return 0;
-			*shadow_ent &= ~PT_USER_MASK;
-		}
-
-	guest_ent = walker->ptep;
-
-	if (!is_present_pte(*guest_ent)) {
-		*shadow_ent = 0;
-		return 0;
-	}
-
-	gfn = walker->gfn;
-
-	if (user) {
-		/*
-		 * Usermode page faults won't be for page table updates.
-		 */
-		while ((page = kvm_mmu_lookup_page(vcpu, gfn)) != NULL) {
-			pgprintk("%s: zap %lx %x\n",
-				 __FUNCTION__, gfn, page->role.word);
-			kvm_mmu_zap_page(vcpu, page);
-		}
-	} else if (kvm_mmu_lookup_page(vcpu, gfn)) {
-		pgprintk("%s: found shadow page for %lx, marking ro\n",
-			 __FUNCTION__, gfn);
-		mark_page_dirty(vcpu->kvm, gfn);
-		FNAME(mark_pagetable_dirty)(vcpu->kvm, walker);
-		*guest_ent |= PT_DIRTY_MASK;
-		*write_pt = 1;
-		return 0;
-	}
-	mark_page_dirty(vcpu->kvm, gfn);
-	*shadow_ent |= PT_WRITABLE_MASK;
-	FNAME(mark_pagetable_dirty)(vcpu->kvm, walker);
-	*guest_ent |= PT_DIRTY_MASK;
-	rmap_add(vcpu, shadow_ent);
-
-	return 1;
-}
-
-/*
  * Page fault handler.  There are several causes for a page fault:
  *   - there is no shadow pte for the guest pte
  *   - write access through a shadow pte marked read only so that we can set
@@ -475,7 +431,6 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, gva_t addr,
 	int fetch_fault = error_code & PFERR_FETCH_MASK;
 	struct guest_walker walker;
 	u64 *shadow_pte;
-	int fixed;
 	int write_pt = 0;
 	int r;
 
@@ -503,19 +458,10 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, gva_t addr,
 		return 0;
 	}
 
-	shadow_pte = FNAME(fetch)(vcpu, addr, &walker, write_fault);
-	pgprintk("%s: shadow pte %p %llx\n", __FUNCTION__,
-		 shadow_pte, *shadow_pte);
-
-	/*
-	 * Update the shadow pte.
-	 */
-	if (write_fault)
-		fixed = FNAME(fix_write_pf)(vcpu, shadow_pte, &walker, addr,
-					    user_fault, &write_pt);
-
-	pgprintk("%s: updated shadow pte %p %llx\n", __FUNCTION__,
-		 shadow_pte, *shadow_pte);
+	shadow_pte = FNAME(fetch)(vcpu, addr, &walker, user_fault, write_fault,
+				  &write_pt);
+	pgprintk("%s: shadow pte %p %llx ptwrite %d\n", __FUNCTION__,
+		 shadow_pte, *shadow_pte, write_pt);
 
 	FNAME(release_walker)(&walker);
 
-- 
1.5.0.6


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH 38/58] KVM: Move shadow pte modifications from set_pte/set_pde to set_pte_common()
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (36 preceding siblings ...)
  2007-06-17  9:44 ` [PATCH 37/58] KVM: MMU: Fold fix_write_pf() " Avi Kivity
@ 2007-06-17  9:44 ` Avi Kivity
  2007-06-17  9:44 ` [PATCH 39/58] KVM: Make shadow pte updates atomic Avi Kivity
                   ` (18 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:44 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

We want all shadow pte modifications in one place.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/paging_tmpl.h |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/drivers/kvm/paging_tmpl.h b/drivers/kvm/paging_tmpl.h
index c067203..35f264f 100644
--- a/drivers/kvm/paging_tmpl.h
+++ b/drivers/kvm/paging_tmpl.h
@@ -218,6 +218,7 @@ static void FNAME(set_pte_common)(struct kvm_vcpu *vcpu,
 		FNAME(mark_pagetable_dirty)(vcpu->kvm, walker);
 	}
 
+	*shadow_pte |= *gpte & PT_PTE_COPY_MASK;
 	*shadow_pte |= access_bits << PT_SHADOW_BITS_OFFSET;
 	if (!dirty)
 		access_bits &= ~PT_WRITABLE_MASK;
@@ -288,7 +289,6 @@ static void FNAME(set_pte)(struct kvm_vcpu *vcpu, pt_element_t *gpte,
 			   struct guest_walker *walker, gfn_t gfn)
 {
 	access_bits &= *gpte;
-	*shadow_pte |= (*gpte & PT_PTE_COPY_MASK);
 	FNAME(set_pte_common)(vcpu, shadow_pte, *gpte & PT_BASE_ADDR_MASK,
 			      gpte, access_bits, user_fault, write_fault,
 			      ptwrite, walker, gfn);
@@ -322,7 +322,6 @@ static void FNAME(set_pde)(struct kvm_vcpu *vcpu, pt_element_t *gpde,
 	if (PTTYPE == 32 && is_cpuid_PSE36())
 		gaddr |= (*gpde & PT32_DIR_PSE36_MASK) <<
 			(32 - PT32_DIR_PSE36_SHIFT);
-	*shadow_pte |= *gpde & PT_PTE_COPY_MASK;
 	FNAME(set_pte_common)(vcpu, shadow_pte, gaddr,
 			      gpde, access_bits, user_fault, write_fault,
 			      ptwrite, walker, gfn);
-- 
1.5.0.6



* [PATCH 39/58] KVM: Make shadow pte updates atomic
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (37 preceding siblings ...)
  2007-06-17  9:44 ` [PATCH 38/58] KVM: Move shadow pte modifications from set_pte/set_pde to set_pte_common() Avi Kivity
@ 2007-06-17  9:44 ` Avi Kivity
  2007-06-17  9:44 ` [PATCH 40/58] KVM: MMU: Make setting shadow ptes atomic on i386 Avi Kivity
                   ` (17 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:44 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

With guest smp, a second vcpu might see partial updates when the first
vcpu services a page fault.  So delay all updates until we have figured
out what the pte should look like.

Note that on i386, this is still not completely atomic, as a 64-bit
write will be split into two 32-bit writes.
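The pattern can be sketched outside kernel context as below (a minimal
illustration with hypothetical mask values, not the kernel code itself;
the real `set_pte_common()` builds `spte` the same way before the final
store):

```c
#include <stdint.h>

#define PT_PRESENT_MASK  (1ULL << 0)
#define PT_WRITABLE_MASK (1ULL << 1)

/* Build the new pte in a private variable and publish it with one
 * store, so another vcpu never observes a half-constructed entry. */
static void set_pte_sketch(uint64_t *shadow_pte, uint64_t paddr, int writable)
{
	uint64_t spte = 0;               /* private working copy */

	spte |= PT_PRESENT_MASK;
	if (writable)
		spte |= PT_WRITABLE_MASK;
	spte |= paddr;

	*shadow_pte = spte;              /* single publishing store */
}
```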

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/paging_tmpl.h |   37 ++++++++++++++++++++-----------------
 1 files changed, 20 insertions(+), 17 deletions(-)

diff --git a/drivers/kvm/paging_tmpl.h b/drivers/kvm/paging_tmpl.h
index 35f264f..397a403 100644
--- a/drivers/kvm/paging_tmpl.h
+++ b/drivers/kvm/paging_tmpl.h
@@ -205,11 +205,12 @@ static void FNAME(set_pte_common)(struct kvm_vcpu *vcpu,
 {
 	hpa_t paddr;
 	int dirty = *gpte & PT_DIRTY_MASK;
-	int was_rmapped = is_rmap_pte(*shadow_pte);
+	u64 spte = *shadow_pte;
+	int was_rmapped = is_rmap_pte(spte);
 
 	pgprintk("%s: spte %llx gpte %llx access %llx write_fault %d"
 		 " user_fault %d gfn %lx\n",
-		 __FUNCTION__, *shadow_pte, (u64)*gpte, access_bits,
+		 __FUNCTION__, spte, (u64)*gpte, access_bits,
 		 write_fault, user_fault, gfn);
 
 	if (write_fault && !dirty) {
@@ -218,34 +219,35 @@ static void FNAME(set_pte_common)(struct kvm_vcpu *vcpu,
 		FNAME(mark_pagetable_dirty)(vcpu->kvm, walker);
 	}
 
-	*shadow_pte |= *gpte & PT_PTE_COPY_MASK;
-	*shadow_pte |= access_bits << PT_SHADOW_BITS_OFFSET;
+	spte |= *gpte & PT_PTE_COPY_MASK;
+	spte |= access_bits << PT_SHADOW_BITS_OFFSET;
 	if (!dirty)
 		access_bits &= ~PT_WRITABLE_MASK;
 
 	paddr = gpa_to_hpa(vcpu, gaddr & PT64_BASE_ADDR_MASK);
 
-	*shadow_pte |= PT_PRESENT_MASK;
+	spte |= PT_PRESENT_MASK;
 	if (access_bits & PT_USER_MASK)
-		*shadow_pte |= PT_USER_MASK;
+		spte |= PT_USER_MASK;
 
 	if (is_error_hpa(paddr)) {
-		*shadow_pte |= gaddr;
-		*shadow_pte |= PT_SHADOW_IO_MARK;
-		*shadow_pte &= ~PT_PRESENT_MASK;
+		spte |= gaddr;
+		spte |= PT_SHADOW_IO_MARK;
+		spte &= ~PT_PRESENT_MASK;
+		*shadow_pte = spte;
 		return;
 	}
 
-	*shadow_pte |= paddr;
+	spte |= paddr;
 
-	if (!write_fault && (*shadow_pte & PT_SHADOW_USER_MASK) &&
-	    !(*shadow_pte & PT_USER_MASK)) {
+	if (!write_fault && (spte & PT_SHADOW_USER_MASK) &&
+	    !(spte & PT_USER_MASK)) {
 		/*
 		 * If supervisor write protect is disabled, we shadow kernel
 		 * pages as user pages so we can trap the write access.
 		 */
-		*shadow_pte |= PT_USER_MASK;
-		*shadow_pte &= ~PT_WRITABLE_MASK;
+		spte |= PT_USER_MASK;
+		spte &= ~PT_WRITABLE_MASK;
 		access_bits &= ~PT_WRITABLE_MASK;
 	}
 
@@ -253,7 +255,7 @@ static void FNAME(set_pte_common)(struct kvm_vcpu *vcpu,
 	    || (write_fault && !is_write_protection(vcpu) && !user_fault)) {
 		struct kvm_mmu_page *shadow;
 
-		*shadow_pte |= PT_WRITABLE_MASK;
+		spte |= PT_WRITABLE_MASK;
 		if (user_fault) {
 			mmu_unshadow(vcpu, gfn);
 			goto unshadowed;
@@ -264,8 +266,8 @@ static void FNAME(set_pte_common)(struct kvm_vcpu *vcpu,
 			pgprintk("%s: found shadow page for %lx, marking ro\n",
 				 __FUNCTION__, gfn);
 			access_bits &= ~PT_WRITABLE_MASK;
-			if (is_writeble_pte(*shadow_pte)) {
-				*shadow_pte &= ~PT_WRITABLE_MASK;
+			if (is_writeble_pte(spte)) {
+				spte &= ~PT_WRITABLE_MASK;
 				kvm_arch_ops->tlb_flush(vcpu);
 			}
 			if (write_fault)
@@ -278,6 +280,7 @@ unshadowed:
 	if (access_bits & PT_WRITABLE_MASK)
 		mark_page_dirty(vcpu->kvm, gaddr >> PAGE_SHIFT);
 
+	*shadow_pte = spte;
 	page_header_update_slot(vcpu->kvm, shadow_pte, gaddr);
 	if (!was_rmapped)
 		rmap_add(vcpu, shadow_pte);
-- 
1.5.0.6



* [PATCH 40/58] KVM: MMU: Make setting shadow ptes atomic on i386
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (38 preceding siblings ...)
  2007-06-17  9:44 ` [PATCH 39/58] KVM: Make shadow pte updates atomic Avi Kivity
@ 2007-06-17  9:44 ` Avi Kivity
  2007-06-17  9:44 ` [PATCH 41/58] KVM: MMU: Remove cr0.wp tricks Avi Kivity
                   ` (16 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:44 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/Kconfig       |    1 +
 drivers/kvm/mmu.c         |   14 ++++++++++++--
 drivers/kvm/paging_tmpl.h |    4 ++--
 3 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/kvm/Kconfig b/drivers/kvm/Kconfig
index 2f661e5..33fa28a 100644
--- a/drivers/kvm/Kconfig
+++ b/drivers/kvm/Kconfig
@@ -11,6 +11,7 @@ if VIRTUALIZATION
 config KVM
 	tristate "Kernel-based Virtual Machine (KVM) support"
 	depends on X86 && EXPERIMENTAL
+	depends on X86_CMPXCHG64 || 64BIT
 	---help---
 	  Support hosting fully virtualized guest machines using hardware
 	  virtualization extensions.  You will need a fairly recent
diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index 3cdbf68..f24b540 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -22,6 +22,7 @@
 #include <linux/mm.h>
 #include <linux/highmem.h>
 #include <linux/module.h>
+#include <asm/cmpxchg.h>
 
 #include "vmx.h"
 #include "kvm.h"
@@ -204,6 +205,15 @@ static int is_rmap_pte(u64 pte)
 		== (PT_WRITABLE_MASK | PT_PRESENT_MASK);
 }
 
+static void set_shadow_pte(u64 *sptep, u64 spte)
+{
+#ifdef CONFIG_X86_64
+	set_64bit((unsigned long *)sptep, spte);
+#else
+	set_64bit((unsigned long long *)sptep, spte);
+#endif
+}
+
 static int mmu_topup_memory_cache(struct kvm_mmu_memory_cache *cache,
 				  struct kmem_cache *base_cache, int min,
 				  gfp_t gfp_flags)
@@ -446,7 +456,7 @@ static void rmap_write_protect(struct kvm_vcpu *vcpu, u64 gfn)
 		rmap_printk("rmap_write_protect: spte %p %llx\n", spte, *spte);
 		rmap_remove(vcpu, spte);
 		kvm_arch_ops->tlb_flush(vcpu);
-		*spte &= ~(u64)PT_WRITABLE_MASK;
+		set_shadow_pte(spte, *spte & ~PT_WRITABLE_MASK);
 	}
 }
 
@@ -699,7 +709,7 @@ static void kvm_mmu_zap_page(struct kvm_vcpu *vcpu,
 		}
 		BUG_ON(!parent_pte);
 		kvm_mmu_put_page(vcpu, page, parent_pte);
-		*parent_pte = 0;
+		set_shadow_pte(parent_pte, 0);
 	}
 	kvm_mmu_page_unlink_children(vcpu, page);
 	if (!page->root_count) {
diff --git a/drivers/kvm/paging_tmpl.h b/drivers/kvm/paging_tmpl.h
index 397a403..fabc2c9 100644
--- a/drivers/kvm/paging_tmpl.h
+++ b/drivers/kvm/paging_tmpl.h
@@ -234,7 +234,7 @@ static void FNAME(set_pte_common)(struct kvm_vcpu *vcpu,
 		spte |= gaddr;
 		spte |= PT_SHADOW_IO_MARK;
 		spte &= ~PT_PRESENT_MASK;
-		*shadow_pte = spte;
+		set_shadow_pte(shadow_pte, spte);
 		return;
 	}
 
@@ -280,7 +280,7 @@ unshadowed:
 	if (access_bits & PT_WRITABLE_MASK)
 		mark_page_dirty(vcpu->kvm, gaddr >> PAGE_SHIFT);
 
-	*shadow_pte = spte;
+	set_shadow_pte(shadow_pte, spte);
 	page_header_update_slot(vcpu->kvm, shadow_pte, gaddr);
 	if (!was_rmapped)
 		rmap_add(vcpu, shadow_pte);
-- 
1.5.0.6



* [PATCH 41/58] KVM: MMU: Remove cr0.wp tricks
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (39 preceding siblings ...)
  2007-06-17  9:44 ` [PATCH 40/58] KVM: MMU: Make setting shadow ptes atomic on i386 Avi Kivity
@ 2007-06-17  9:44 ` Avi Kivity
  2007-06-17  9:44 ` [PATCH 42/58] KVM: MMU: Simplify accessed/dirty/present/nx bit handling Avi Kivity
                   ` (15 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:44 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

No longer needed as we do everything in one place.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/paging_tmpl.h |   11 -----------
 1 files changed, 0 insertions(+), 11 deletions(-)

diff --git a/drivers/kvm/paging_tmpl.h b/drivers/kvm/paging_tmpl.h
index fabc2c9..59b4cb2 100644
--- a/drivers/kvm/paging_tmpl.h
+++ b/drivers/kvm/paging_tmpl.h
@@ -240,17 +240,6 @@ static void FNAME(set_pte_common)(struct kvm_vcpu *vcpu,
 
 	spte |= paddr;
 
-	if (!write_fault && (spte & PT_SHADOW_USER_MASK) &&
-	    !(spte & PT_USER_MASK)) {
-		/*
-		 * If supervisor write protect is disabled, we shadow kernel
-		 * pages as user pages so we can trap the write access.
-		 */
-		spte |= PT_USER_MASK;
-		spte &= ~PT_WRITABLE_MASK;
-		access_bits &= ~PT_WRITABLE_MASK;
-	}
-
 	if ((access_bits & PT_WRITABLE_MASK)
 	    || (write_fault && !is_write_protection(vcpu) && !user_fault)) {
 		struct kvm_mmu_page *shadow;
-- 
1.5.0.6



* [PATCH 42/58] KVM: MMU: Simplify accessed/dirty/present/nx bit handling
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (40 preceding siblings ...)
  2007-06-17  9:44 ` [PATCH 41/58] KVM: MMU: Remove cr0.wp tricks Avi Kivity
@ 2007-06-17  9:44 ` Avi Kivity
  2007-06-17  9:44 ` [PATCH 43/58] KVM: MMU: Don't cache guest access bits in the shadow page table Avi Kivity
                   ` (14 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:44 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

Always set the accessed and dirty bits (since having them cleared would
cause the CPU to perform a read-modify-write cycle to set them), always
set the present bit, and copy the nx bit from the guest.
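The resulting spte construction, extracted from the hunk below into a
standalone sketch (the mask values are the standard x86 pte bit
positions; `make_spte` is an illustrative name, not a kernel function):

```c
#include <stdint.h>

#define PT_PRESENT_MASK  (1ULL << 0)
#define PT_ACCESSED_MASK (1ULL << 5)
#define PT_DIRTY_MASK    (1ULL << 6)
#define PT64_NX_MASK     (1ULL << 63)

/* Pre-set present/accessed/dirty so the CPU never needs a locked
 * read-modify-write to update them; only nx is copied from the guest. */
static uint64_t make_spte(uint64_t gpte)
{
	uint64_t spte = PT_PRESENT_MASK | PT_ACCESSED_MASK | PT_DIRTY_MASK;

	spte |= gpte & PT64_NX_MASK;
	return spte;
}
```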

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/mmu.c         |    5 -----
 drivers/kvm/paging_tmpl.h |    7 ++-----
 2 files changed, 2 insertions(+), 10 deletions(-)

diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index f24b540..b47391f 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -91,11 +91,6 @@ static int dbg = 1;
 #define PT32_DIR_PSE36_MASK (((1ULL << PT32_DIR_PSE36_SIZE) - 1) << PT32_DIR_PSE36_SHIFT)
 
 
-#define PT32_PTE_COPY_MASK \
-	(PT_PRESENT_MASK | PT_ACCESSED_MASK | PT_DIRTY_MASK | PT_GLOBAL_MASK)
-
-#define PT64_PTE_COPY_MASK (PT64_NX_MASK | PT32_PTE_COPY_MASK)
-
 #define PT_FIRST_AVAIL_BITS_SHIFT 9
 #define PT64_SECOND_AVAIL_BITS_SHIFT 52
 
diff --git a/drivers/kvm/paging_tmpl.h b/drivers/kvm/paging_tmpl.h
index 59b4cb2..b17a4b7 100644
--- a/drivers/kvm/paging_tmpl.h
+++ b/drivers/kvm/paging_tmpl.h
@@ -31,7 +31,6 @@
 	#define PT_INDEX(addr, level) PT64_INDEX(addr, level)
 	#define SHADOW_PT_INDEX(addr, level) PT64_INDEX(addr, level)
 	#define PT_LEVEL_MASK(level) PT64_LEVEL_MASK(level)
-	#define PT_PTE_COPY_MASK PT64_PTE_COPY_MASK
 	#ifdef CONFIG_X86_64
 	#define PT_MAX_FULL_LEVELS 4
 	#else
@@ -46,7 +45,6 @@
 	#define PT_INDEX(addr, level) PT32_INDEX(addr, level)
 	#define SHADOW_PT_INDEX(addr, level) PT64_INDEX(addr, level)
 	#define PT_LEVEL_MASK(level) PT32_LEVEL_MASK(level)
-	#define PT_PTE_COPY_MASK PT32_PTE_COPY_MASK
 	#define PT_MAX_FULL_LEVELS 2
 #else
 	#error Invalid PTTYPE value
@@ -219,7 +217,8 @@ static void FNAME(set_pte_common)(struct kvm_vcpu *vcpu,
 		FNAME(mark_pagetable_dirty)(vcpu->kvm, walker);
 	}
 
-	spte |= *gpte & PT_PTE_COPY_MASK;
+	spte |= PT_PRESENT_MASK | PT_ACCESSED_MASK | PT_DIRTY_MASK;
+	spte |= *gpte & PT64_NX_MASK;
 	spte |= access_bits << PT_SHADOW_BITS_OFFSET;
 	if (!dirty)
 		access_bits &= ~PT_WRITABLE_MASK;
@@ -495,7 +494,5 @@ static gpa_t FNAME(gva_to_gpa)(struct kvm_vcpu *vcpu, gva_t vaddr)
 #undef PT_INDEX
 #undef SHADOW_PT_INDEX
 #undef PT_LEVEL_MASK
-#undef PT_PTE_COPY_MASK
-#undef PT_NON_PTE_COPY_MASK
 #undef PT_DIR_BASE_ADDR_MASK
 #undef PT_MAX_FULL_LEVELS
-- 
1.5.0.6



* [PATCH 43/58] KVM: MMU: Don't cache guest access bits in the shadow page table
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (41 preceding siblings ...)
  2007-06-17  9:44 ` [PATCH 42/58] KVM: MMU: Simplify accessed/dirty/present/nx bit handling Avi Kivity
@ 2007-06-17  9:44 ` Avi Kivity
  2007-06-17  9:44 ` [PATCH 44/58] KVM: MMU: Remove unused large page marker Avi Kivity
                   ` (13 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:44 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

This was once used to avoid accessing the guest pte when upgrading
the shadow pte from read-only to read-write.  But usually we need
to set the guest pte dirty or accessed bits anyway, so this wasn't
really exploited.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/mmu.c         |    8 --------
 drivers/kvm/paging_tmpl.h |    1 -
 2 files changed, 0 insertions(+), 9 deletions(-)

diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index b47391f..986d012 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -97,14 +97,6 @@ static int dbg = 1;
 #define PT_SHADOW_PS_MARK (1ULL << PT_FIRST_AVAIL_BITS_SHIFT)
 #define PT_SHADOW_IO_MARK (1ULL << PT_FIRST_AVAIL_BITS_SHIFT)
 
-#define PT_SHADOW_WRITABLE_SHIFT (PT_FIRST_AVAIL_BITS_SHIFT + 1)
-#define PT_SHADOW_WRITABLE_MASK (1ULL << PT_SHADOW_WRITABLE_SHIFT)
-
-#define PT_SHADOW_USER_SHIFT (PT_SHADOW_WRITABLE_SHIFT + 1)
-#define PT_SHADOW_USER_MASK (1ULL << (PT_SHADOW_USER_SHIFT))
-
-#define PT_SHADOW_BITS_OFFSET (PT_SHADOW_WRITABLE_SHIFT - PT_WRITABLE_SHIFT)
-
 #define VALID_PAGE(x) ((x) != INVALID_PAGE)
 
 #define PT64_LEVEL_BITS 9
diff --git a/drivers/kvm/paging_tmpl.h b/drivers/kvm/paging_tmpl.h
index b17a4b7..adc1206 100644
--- a/drivers/kvm/paging_tmpl.h
+++ b/drivers/kvm/paging_tmpl.h
@@ -219,7 +219,6 @@ static void FNAME(set_pte_common)(struct kvm_vcpu *vcpu,
 
 	spte |= PT_PRESENT_MASK | PT_ACCESSED_MASK | PT_DIRTY_MASK;
 	spte |= *gpte & PT64_NX_MASK;
-	spte |= access_bits << PT_SHADOW_BITS_OFFSET;
 	if (!dirty)
 		access_bits &= ~PT_WRITABLE_MASK;
 
-- 
1.5.0.6



* [PATCH 44/58] KVM: MMU: Remove unused large page marker
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (42 preceding siblings ...)
  2007-06-17  9:44 ` [PATCH 43/58] KVM: MMU: Don't cache guest access bits in the shadow page table Avi Kivity
@ 2007-06-17  9:44 ` Avi Kivity
  2007-06-17  9:44 ` [PATCH 45/58] KVM: Lazy guest cr3 switching Avi Kivity
                   ` (12 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:44 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

This has not been used for some time, as the same information is available
in the page header.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/mmu.c         |    1 -
 drivers/kvm/paging_tmpl.h |    2 --
 2 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index 986d012..283df03 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -94,7 +94,6 @@ static int dbg = 1;
 #define PT_FIRST_AVAIL_BITS_SHIFT 9
 #define PT64_SECOND_AVAIL_BITS_SHIFT 52
 
-#define PT_SHADOW_PS_MARK (1ULL << PT_FIRST_AVAIL_BITS_SHIFT)
 #define PT_SHADOW_IO_MARK (1ULL << PT_FIRST_AVAIL_BITS_SHIFT)
 
 #define VALID_PAGE(x) ((x) != INVALID_PAGE)
diff --git a/drivers/kvm/paging_tmpl.h b/drivers/kvm/paging_tmpl.h
index adc1206..a7c5cb0 100644
--- a/drivers/kvm/paging_tmpl.h
+++ b/drivers/kvm/paging_tmpl.h
@@ -384,8 +384,6 @@ static u64 *FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
 	}
 
 	if (walker->level == PT_DIRECTORY_LEVEL) {
-		if (prev_shadow_ent)
-			*prev_shadow_ent |= PT_SHADOW_PS_MARK;
 		FNAME(set_pde)(vcpu, guest_ent, shadow_ent,
 			       walker->inherited_ar, user_fault, write_fault,
 			       ptwrite, walker, walker->gfn);
-- 
1.5.0.6



* [PATCH 45/58] KVM: Lazy guest cr3 switching
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (43 preceding siblings ...)
  2007-06-17  9:44 ` [PATCH 44/58] KVM: MMU: Remove unused large page marker Avi Kivity
@ 2007-06-17  9:44 ` Avi Kivity
  2007-06-17  9:44 ` [PATCH 47/58] KVM: Remove unnecessary initialization and checks in mark_page_dirty() Avi Kivity
                   ` (11 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:44 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

Switching the guest paging context may require us to allocate memory,
which might fail.  Instead of wiring up error paths everywhere, make
context switching lazy and actually do the switch before the next guest
entry, where we can return an error if allocation fails.
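The lazy-reload pattern mirrors the `kvm_mmu_reload()` helper added in
the hunk below; a self-contained sketch (here `mmu_load` is a
hypothetical stand-in for `kvm_mmu_load()`, which may fail with
-ENOMEM):

```c
#include <stdint.h>

#define INVALID_PAGE ((uint64_t)-1)

struct mmu { uint64_t root_hpa; };

/* Stand-in for kvm_mmu_load(): allocates roots, may report failure. */
static int mmu_load(struct mmu *m)
{
	m->root_hpa = 0x1000;   /* pretend allocation succeeded */
	return 0;
}

/* Called before every guest entry: the common case is one cheap
 * check, and allocation failures surface at this single, recoverable
 * point instead of in every path that drops the paging context. */
static int mmu_reload(struct mmu *m)
{
	if (m->root_hpa != INVALID_PAGE)
		return 0;       /* fast path: roots still valid */
	return mmu_load(m);     /* slow path: rebuild, may return error */
}
```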

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/kvm.h |   10 ++++++++++
 drivers/kvm/mmu.c |   43 ++++++++++++++++++++++---------------------
 drivers/kvm/svm.c |    4 ++++
 drivers/kvm/vmx.c |    4 ++++
 4 files changed, 40 insertions(+), 21 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index 199e1e9..3ec4e26 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -544,6 +544,8 @@ void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
 		       const u8 *old, const u8 *new, int bytes);
 int kvm_mmu_unprotect_page_virt(struct kvm_vcpu *vcpu, gva_t gva);
 void kvm_mmu_free_some_pages(struct kvm_vcpu *vcpu);
+int kvm_mmu_load(struct kvm_vcpu *vcpu);
+void kvm_mmu_unload(struct kvm_vcpu *vcpu);
 
 int kvm_hypercall(struct kvm_vcpu *vcpu, struct kvm_run *run);
 
@@ -555,6 +557,14 @@ static inline int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gva_t gva,
 	return vcpu->mmu.page_fault(vcpu, gva, error_code);
 }
 
+static inline int kvm_mmu_reload(struct kvm_vcpu *vcpu)
+{
+	if (likely(vcpu->mmu.root_hpa != INVALID_PAGE))
+		return 0;
+
+	return kvm_mmu_load(vcpu);
+}
+
 static inline int is_long_mode(struct kvm_vcpu *vcpu)
 {
 #ifdef CONFIG_X86_64
diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index 283df03..5915d7a 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -949,9 +949,7 @@ static int nonpaging_init_context(struct kvm_vcpu *vcpu)
 	context->free = nonpaging_free;
 	context->root_level = 0;
 	context->shadow_root_level = PT32E_ROOT_LEVEL;
-	mmu_alloc_roots(vcpu);
-	ASSERT(VALID_PAGE(context->root_hpa));
-	kvm_arch_ops->set_cr3(vcpu, context->root_hpa);
+	context->root_hpa = INVALID_PAGE;
 	return 0;
 }
 
@@ -965,11 +963,6 @@ static void paging_new_cr3(struct kvm_vcpu *vcpu)
 {
 	pgprintk("%s: cr3 %lx\n", __FUNCTION__, vcpu->cr3);
 	mmu_free_roots(vcpu);
-	if (unlikely(vcpu->kvm->n_free_mmu_pages < KVM_MIN_FREE_MMU_PAGES))
-		kvm_mmu_free_some_pages(vcpu);
-	mmu_alloc_roots(vcpu);
-	kvm_mmu_flush_tlb(vcpu);
-	kvm_arch_ops->set_cr3(vcpu, vcpu->mmu.root_hpa);
 }
 
 static void inject_page_fault(struct kvm_vcpu *vcpu,
@@ -1003,10 +996,7 @@ static int paging64_init_context_common(struct kvm_vcpu *vcpu, int level)
 	context->free = paging_free;
 	context->root_level = level;
 	context->shadow_root_level = level;
-	mmu_alloc_roots(vcpu);
-	ASSERT(VALID_PAGE(context->root_hpa));
-	kvm_arch_ops->set_cr3(vcpu, context->root_hpa |
-		    (vcpu->cr3 & (CR3_PCD_MASK | CR3_WPT_MASK)));
+	context->root_hpa = INVALID_PAGE;
 	return 0;
 }
 
@@ -1025,10 +1015,7 @@ static int paging32_init_context(struct kvm_vcpu *vcpu)
 	context->free = paging_free;
 	context->root_level = PT32_ROOT_LEVEL;
 	context->shadow_root_level = PT32E_ROOT_LEVEL;
-	mmu_alloc_roots(vcpu);
-	ASSERT(VALID_PAGE(context->root_hpa));
-	kvm_arch_ops->set_cr3(vcpu, context->root_hpa |
-		    (vcpu->cr3 & (CR3_PCD_MASK | CR3_WPT_MASK)));
+	context->root_hpa = INVALID_PAGE;
 	return 0;
 }
 
@@ -1042,7 +1029,6 @@ static int init_kvm_mmu(struct kvm_vcpu *vcpu)
 	ASSERT(vcpu);
 	ASSERT(!VALID_PAGE(vcpu->mmu.root_hpa));
 
-	mmu_topup_memory_caches(vcpu);
 	if (!is_paging(vcpu))
 		return nonpaging_init_context(vcpu);
 	else if (is_long_mode(vcpu))
@@ -1064,16 +1050,31 @@ static void destroy_kvm_mmu(struct kvm_vcpu *vcpu)
 
 int kvm_mmu_reset_context(struct kvm_vcpu *vcpu)
 {
+	destroy_kvm_mmu(vcpu);
+	return init_kvm_mmu(vcpu);
+}
+
+int kvm_mmu_load(struct kvm_vcpu *vcpu)
+{
 	int r;
 
-	destroy_kvm_mmu(vcpu);
-	r = init_kvm_mmu(vcpu);
-	if (r < 0)
-		goto out;
+	spin_lock(&vcpu->kvm->lock);
 	r = mmu_topup_memory_caches(vcpu);
+	if (r)
+		goto out;
+	mmu_alloc_roots(vcpu);
+	kvm_arch_ops->set_cr3(vcpu, vcpu->mmu.root_hpa);
+	kvm_mmu_flush_tlb(vcpu);
 out:
+	spin_unlock(&vcpu->kvm->lock);
 	return r;
 }
+EXPORT_SYMBOL_GPL(kvm_mmu_load);
+
+void kvm_mmu_unload(struct kvm_vcpu *vcpu)
+{
+	mmu_free_roots(vcpu);
+}
 
 static void mmu_pte_write_zap_pte(struct kvm_vcpu *vcpu,
 				  struct kvm_mmu_page *page,
diff --git a/drivers/kvm/svm.c b/drivers/kvm/svm.c
index 6cd6a50..ec040e2 100644
--- a/drivers/kvm/svm.c
+++ b/drivers/kvm/svm.c
@@ -1483,6 +1483,10 @@ static int svm_vcpu_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 	int r;
 
 again:
+	r = kvm_mmu_reload(vcpu);
+	if (unlikely(r))
+		return r;
+
 	if (!vcpu->mmio_read_completed)
 		do_interrupt_requests(vcpu, kvm_run);
 
diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index 93e5bb2..4d25549 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -1988,6 +1988,10 @@ again:
 	vmx_save_host_state(vcpu);
 	kvm_load_guest_fpu(vcpu);
 
+	r = kvm_mmu_reload(vcpu);
+	if (unlikely(r))
+		goto out;
+
 	/*
 	 * Loading guest fpu may have cleared host cr0.ts
 	 */
-- 
1.5.0.6



* [PATCH 47/58] KVM: Remove unnecessary initialization and checks in mark_page_dirty()
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (44 preceding siblings ...)
  2007-06-17  9:44 ` [PATCH 45/58] KVM: Lazy guest cr3 switching Avi Kivity
@ 2007-06-17  9:44 ` Avi Kivity
  2007-06-17  9:44 ` [PATCH 48/58] KVM: Fix vcpu freeing for guest smp Avi Kivity
                   ` (10 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:44 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Nguyen Anh Quynh, Avi Kivity

From: Nguyen Anh Quynh <aquynh@gmail.com>

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/kvm_main.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index bf35457..3c3231d 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -970,7 +970,7 @@ EXPORT_SYMBOL_GPL(gfn_to_page);
 void mark_page_dirty(struct kvm *kvm, gfn_t gfn)
 {
 	int i;
-	struct kvm_memory_slot *memslot = NULL;
+	struct kvm_memory_slot *memslot;
 	unsigned long rel_gfn;
 
 	for (i = 0; i < kvm->nmemslots; ++i) {
@@ -979,7 +979,7 @@ void mark_page_dirty(struct kvm *kvm, gfn_t gfn)
 		if (gfn >= memslot->base_gfn
 		    && gfn < memslot->base_gfn + memslot->npages) {
 
-			if (!memslot || !memslot->dirty_bitmap)
+			if (!memslot->dirty_bitmap)
 				return;
 
 			rel_gfn = gfn - memslot->base_gfn;
-- 
1.5.0.6



* [PATCH 48/58] KVM: Fix vcpu freeing for guest smp
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (45 preceding siblings ...)
  2007-06-17  9:44 ` [PATCH 47/58] KVM: Remove unnecessary initialization and checks in mark_page_dirty() Avi Kivity
@ 2007-06-17  9:44 ` Avi Kivity
  2007-06-17  9:44 ` [PATCH 49/58] KVM: Fix adding an smp virtual machine to the vm list Avi Kivity
                   ` (9 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:44 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

A vcpu can pin up to four mmu shadow pages, which means the freeing
loop will never terminate.  Fix by first unpinning shadow pages on
all vcpus, then freeing shadow pages.
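The two-pass shape can be sketched in isolation (a toy model, not the
kernel code; `unload_vcpu_mmu` stands in for `kvm_unload_vcpu_mmu()`):

```c
#define MAX_VCPUS 4

struct vcpu { int pinned_pages; };

/* Stand-in for kvm_unload_vcpu_mmu(): drop this vcpu's root pins. */
static void unload_vcpu_mmu(struct vcpu *v)
{
	v->pinned_pages = 0;
}

/* Pass 1 unpins on every vcpu before any freeing starts, so a later
 * freeing loop cannot spin forever on still-pinned shadow pages. */
static int unpin_all(struct vcpu *vcpus, int n)
{
	int i, remaining = 0;

	for (i = 0; i < n; ++i)
		unload_vcpu_mmu(&vcpus[i]);
	for (i = 0; i < n; ++i)
		remaining += vcpus[i].pinned_pages;
	return remaining;       /* 0 means freeing may proceed */
}
```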

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/kvm_main.c |   15 +++++++++++++++
 drivers/kvm/mmu.c      |    4 ++--
 2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index 3c3231d..3ff8ee5 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -381,6 +381,16 @@ static void free_pio_guest_pages(struct kvm_vcpu *vcpu)
 		}
 }
 
+static void kvm_unload_vcpu_mmu(struct kvm_vcpu *vcpu)
+{
+	if (!vcpu->vmcs)
+		return;
+
+	vcpu_load(vcpu);
+	kvm_mmu_unload(vcpu);
+	vcpu_put(vcpu);
+}
+
 static void kvm_free_vcpu(struct kvm_vcpu *vcpu)
 {
 	if (!vcpu->vmcs)
@@ -401,6 +411,11 @@ static void kvm_free_vcpus(struct kvm *kvm)
 {
 	unsigned int i;
 
+	/*
+	 * Unpin any mmu pages first.
+	 */
+	for (i = 0; i < KVM_MAX_VCPUS; ++i)
+		kvm_unload_vcpu_mmu(&kvm->vcpus[i]);
 	for (i = 0; i < KVM_MAX_VCPUS; ++i)
 		kvm_free_vcpu(&kvm->vcpus[i]);
 }
diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index 5915d7a..d4de988 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -838,11 +838,12 @@ static void mmu_free_roots(struct kvm_vcpu *vcpu)
 	int i;
 	struct kvm_mmu_page *page;
 
+	if (!VALID_PAGE(vcpu->mmu.root_hpa))
+		return;
 #ifdef CONFIG_X86_64
 	if (vcpu->mmu.shadow_root_level == PT64_ROOT_LEVEL) {
 		hpa_t root = vcpu->mmu.root_hpa;
 
-		ASSERT(VALID_PAGE(root));
 		page = page_header(root);
 		--page->root_count;
 		vcpu->mmu.root_hpa = INVALID_PAGE;
@@ -853,7 +854,6 @@ static void mmu_free_roots(struct kvm_vcpu *vcpu)
 		hpa_t root = vcpu->mmu.pae_root[i];
 
 		if (root) {
-			ASSERT(VALID_PAGE(root));
 			root &= PT64_BASE_ADDR_MASK;
 			page = page_header(root);
 			--page->root_count;
-- 
1.5.0.6



* [PATCH 49/58] KVM: Fix adding an smp virtual machine to the vm list
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (46 preceding siblings ...)
  2007-06-17  9:44 ` [PATCH 48/58] KVM: Fix vcpu freeing for guest smp Avi Kivity
@ 2007-06-17  9:44 ` Avi Kivity
  2007-06-17  9:44 ` [PATCH 50/58] KVM: Enable guest smp Avi Kivity
                   ` (8 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:44 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

If we add the vm once per vcpu, we corrupt the list if the guest has
multiple vcpus.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/kvm_main.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index 3ff8ee5..230b25a 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -319,6 +319,9 @@ static struct kvm *kvm_create_vm(void)
 
 	spin_lock_init(&kvm->lock);
 	INIT_LIST_HEAD(&kvm->active_mmu_pages);
+	spin_lock(&kvm_lock);
+	list_add(&kvm->vm_list, &vm_list);
+	spin_unlock(&kvm_lock);
 	for (i = 0; i < KVM_MAX_VCPUS; ++i) {
 		struct kvm_vcpu *vcpu = &kvm->vcpus[i];
 
@@ -326,9 +329,6 @@ static struct kvm *kvm_create_vm(void)
 		vcpu->cpu = -1;
 		vcpu->kvm = kvm;
 		vcpu->mmu.root_hpa = INVALID_PAGE;
-		spin_lock(&kvm_lock);
-		list_add(&kvm->vm_list, &vm_list);
-		spin_unlock(&kvm_lock);
 	}
 	return kvm;
 }
-- 
1.5.0.6


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH 50/58] KVM: Enable guest smp
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (47 preceding siblings ...)
  2007-06-17  9:44 ` [PATCH 49/58] KVM: Fix adding an smp virtual machine to the vm list Avi Kivity
@ 2007-06-17  9:44 ` Avi Kivity
  2007-06-17  9:44 ` [PATCH 51/58] KVM: Move duplicate halt handling code into kvm_main.c Avi Kivity
                   ` (7 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:44 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

As we don't support guest tlb shootdown yet, this is only reliable
for real-mode guests.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/kvm.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index 3ec4e26..e665f55 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -55,7 +55,7 @@
 #define INVALID_PAGE (~(hpa_t)0)
 #define UNMAPPED_GVA (~(gpa_t)0)
 
-#define KVM_MAX_VCPUS 1
+#define KVM_MAX_VCPUS 4
 #define KVM_ALIAS_SLOTS 4
 #define KVM_MEMORY_SLOTS 4
 #define KVM_NUM_MMU_PAGES 1024
-- 
1.5.0.6


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH 51/58] KVM: Move duplicate halt handling code into kvm_main.c
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (48 preceding siblings ...)
  2007-06-17  9:44 ` [PATCH 50/58] KVM: Enable guest smp Avi Kivity
@ 2007-06-17  9:44 ` Avi Kivity
  2007-06-17  9:44 ` [PATCH 52/58] KVM: Emulate hlt on real mode for Intel Avi Kivity
                   ` (6 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:44 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

Will soon have a third user.


Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/kvm.h      |    1 +
 drivers/kvm/kvm_main.c |   11 +++++++++++
 drivers/kvm/svm.c      |    7 +------
 drivers/kvm/vmx.c      |    7 +------
 4 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index e665f55..ac358b8 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -504,6 +504,7 @@ int kvm_setup_pio(struct kvm_vcpu *vcpu, struct kvm_run *run, int in,
 		  int size, unsigned long count, int string, int down,
 		  gva_t address, int rep, unsigned port);
 void kvm_emulate_cpuid(struct kvm_vcpu *vcpu);
+int kvm_emulate_halt(struct kvm_vcpu *vcpu);
 int emulate_invlpg(struct kvm_vcpu *vcpu, gva_t address);
 int emulate_clts(struct kvm_vcpu *vcpu);
 int emulator_get_dr(struct x86_emulate_ctxt* ctxt, int dr,
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index 230b25a..5564169 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -1285,6 +1285,17 @@ int emulate_instruction(struct kvm_vcpu *vcpu,
 }
 EXPORT_SYMBOL_GPL(emulate_instruction);
 
+int kvm_emulate_halt(struct kvm_vcpu *vcpu)
+{
+	if (vcpu->irq_summary)
+		return 1;
+
+	vcpu->run->exit_reason = KVM_EXIT_HLT;
+	++vcpu->stat.halt_exits;
+	return 0;
+}
+EXPORT_SYMBOL_GPL(kvm_emulate_halt);
+
 int kvm_hypercall(struct kvm_vcpu *vcpu, struct kvm_run *run)
 {
 	unsigned long nr, a0, a1, a2, a3, a4, a5, ret;
diff --git a/drivers/kvm/svm.c b/drivers/kvm/svm.c
index ec040e2..70f386e 100644
--- a/drivers/kvm/svm.c
+++ b/drivers/kvm/svm.c
@@ -1115,12 +1115,7 @@ static int halt_interception(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 {
 	vcpu->svm->next_rip = vcpu->svm->vmcb->save.rip + 1;
 	skip_emulated_instruction(vcpu);
-	if (vcpu->irq_summary)
-		return 1;
-
-	kvm_run->exit_reason = KVM_EXIT_HLT;
-	++vcpu->stat.halt_exits;
-	return 0;
+	return kvm_emulate_halt(vcpu);
 }
 
 static int vmmcall_interception(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index a534e6f..90abd3c 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -1896,12 +1896,7 @@ static int handle_interrupt_window(struct kvm_vcpu *vcpu,
 static int handle_halt(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 {
 	skip_emulated_instruction(vcpu);
-	if (vcpu->irq_summary)
-		return 1;
-
-	kvm_run->exit_reason = KVM_EXIT_HLT;
-	++vcpu->stat.halt_exits;
-	return 0;
+	return kvm_emulate_halt(vcpu);
 }
 
 static int handle_vmcall(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
-- 
1.5.0.6


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH 52/58] KVM: Emulate hlt on real mode for Intel
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (49 preceding siblings ...)
  2007-06-17  9:44 ` [PATCH 51/58] KVM: Move duplicate halt handling code into kvm_main.c Avi Kivity
@ 2007-06-17  9:44 ` Avi Kivity
  2007-06-17  9:44 ` [PATCH 53/58] KVM: Keep an upper bound of initialized vcpus Avi Kivity
                   ` (5 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:44 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

This has two use cases: the BIOS halting when it can't boot from disk, and guest
smp bootstrap.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/kvm.h         |    1 +
 drivers/kvm/vmx.c         |    7 ++++++-
 drivers/kvm/x86_emulate.c |    6 +++++-
 3 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index ac358b8..d49b16c 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -347,6 +347,7 @@ struct kvm_vcpu {
 			u32 ar;
 		} tr, es, ds, fs, gs;
 	} rmode;
+	int halt_request; /* real mode on Intel only */
 
 	int cpuid_nent;
 	struct kvm_cpuid_entry cpuid_entries[KVM_MAX_CPUID_ENTRIES];
diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index 90abd3c..a1f51b9 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -1608,8 +1608,13 @@ static int handle_exception(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 
 	if (vcpu->rmode.active &&
 	    handle_rmode_exception(vcpu, intr_info & INTR_INFO_VECTOR_MASK,
-								error_code))
+								error_code)) {
+		if (vcpu->halt_request) {
+			vcpu->halt_request = 0;
+			return kvm_emulate_halt(vcpu);
+		}
 		return 1;
+	}
 
 	if ((intr_info & (INTR_INFO_INTR_TYPE_MASK | INTR_INFO_VECTOR_MASK)) == (INTR_TYPE_EXCEPTION | 1)) {
 		kvm_run->exit_reason = KVM_EXIT_DEBUG;
diff --git a/drivers/kvm/x86_emulate.c b/drivers/kvm/x86_emulate.c
index 6123c02..a4a8481 100644
--- a/drivers/kvm/x86_emulate.c
+++ b/drivers/kvm/x86_emulate.c
@@ -143,7 +143,8 @@ static u8 opcode_table[256] = {
 	0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
 	/* 0xF0 - 0xF7 */
 	0, 0, 0, 0,
-	0, 0, ByteOp | DstMem | SrcNone | ModRM, DstMem | SrcNone | ModRM,
+	ImplicitOps, 0,
+	ByteOp | DstMem | SrcNone | ModRM, DstMem | SrcNone | ModRM,
 	/* 0xF8 - 0xFF */
 	0, 0, 0, 0,
 	0, 0, ByteOp | DstMem | SrcNone | ModRM, DstMem | SrcNone | ModRM
@@ -1149,6 +1150,9 @@ special_insn:
 	case 0xae ... 0xaf:	/* scas */
 		DPRINTF("Urk! I don't handle SCAS.\n");
 		goto cannot_emulate;
+	case 0xf4:              /* hlt */
+		ctxt->vcpu->halt_request = 1;
+		goto done;
 	}
 	goto writeback;
 
-- 
1.5.0.6


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH 53/58] KVM: Keep an upper bound of initialized vcpus
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (50 preceding siblings ...)
  2007-06-17  9:44 ` [PATCH 52/58] KVM: Emulate hlt on real mode for Intel Avi Kivity
@ 2007-06-17  9:44 ` Avi Kivity
  2007-06-17  9:44 ` [PATCH 54/58] KVM: Flush remote tlbs when reducing shadow pte permissions Avi Kivity
                   ` (4 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:44 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

That way, we don't need to loop over all KVM_MAX_VCPUS slots for a
single-vcpu vm.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/kvm.h      |    1 +
 drivers/kvm/kvm_main.c |    5 +++++
 2 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index d49b16c..528a56b 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -379,6 +379,7 @@ struct kvm {
 	struct list_head active_mmu_pages;
 	int n_free_mmu_pages;
 	struct hlist_head mmu_page_hash[KVM_NUM_MMU_PAGES];
+	int nvcpus;
 	struct kvm_vcpu vcpus[KVM_MAX_VCPUS];
 	int memory_config_version;
 	int busy;
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index 5564169..4e1a017 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -2391,6 +2391,11 @@ static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, int n)
 	if (r < 0)
 		goto out_free_vcpus;
 
+	spin_lock(&kvm_lock);
+	if (n >= kvm->nvcpus)
+		kvm->nvcpus = n + 1;
+	spin_unlock(&kvm_lock);
+
 	return r;
 
 out_free_vcpus:
-- 
1.5.0.6


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH 54/58] KVM: Flush remote tlbs when reducing shadow pte permissions
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (51 preceding siblings ...)
  2007-06-17  9:44 ` [PATCH 53/58] KVM: Keep an upper bound of initialized vcpus Avi Kivity
@ 2007-06-17  9:44 ` Avi Kivity
  2007-06-17  9:44 ` [PATCH 55/58] KVM: SVM: Replace memset(<addr>, 0, PAGESIZE) with clear_page(<addr>) Avi Kivity
                   ` (3 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:44 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

When a vcpu causes the permissions of a shadow pte to be reduced, it
must also flush the tlb on remote vcpus.  We do that by:

- setting a bit on the vcpu that requests a tlb flush before the next entry
- if the vcpu is currently executing, we send an ipi to make sure it
  exits before we continue
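The two steps above can be sketched as a toy model (hedged Python with invented names; the real code uses a per-vcpu `requests` bitmask consumed before guest entry, and `smp_call_function_single()` for the IPI):

```python
# Toy model of the remote flush protocol: set a per-vcpu request bit,
# then IPI only the physical cpus currently running some other vcpu.

KVM_TLB_FLUSH = 0

class Vcpu:
    def __init__(self, cpu):
        self.cpu = cpu            # -1 means not loaded on any cpu
        self.requests = 0

def flush_remote_tlbs(vcpus, current_cpu):
    to_ipi = set()
    for vcpu in vcpus:
        if vcpu.requests & (1 << KVM_TLB_FLUSH):
            continue                          # flush already pending
        vcpu.requests |= 1 << KVM_TLB_FLUSH
        # Idle vcpus pick the bit up on their next guest entry; only
        # vcpus running on another physical cpu need an interrupt now.
        if vcpu.cpu not in (-1, current_cpu):
            to_ipi.add(vcpu.cpu)
    return to_ipi
```

The interrupt itself does no work; forcing the vcpu out of guest mode is enough, since the request bit is checked on the way back in.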

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/kvm.h      |    8 ++++++++
 drivers/kvm/kvm_main.c |   44 ++++++++++++++++++++++++++++++++++++++++++++
 drivers/kvm/mmu.c      |    8 +++++---
 drivers/kvm/svm.c      |   17 ++++++++++++-----
 drivers/kvm/vmx.c      |   22 +++++++++++++++-------
 5 files changed, 84 insertions(+), 15 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index 528a56b..b08272b 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -84,6 +84,11 @@
 #define KVM_PIO_PAGE_OFFSET 1
 
 /*
+ * vcpu->requests bit members
+ */
+#define KVM_TLB_FLUSH 0
+
+/*
  * Address types:
  *
  *  gva - guest virtual address
@@ -272,6 +277,8 @@ struct kvm_vcpu {
 	u64 host_tsc;
 	struct kvm_run *run;
 	int interrupt_window_open;
+	int guest_mode;
+	unsigned long requests;
 	unsigned long irq_summary; /* bit vector: 1 per word in irq_pending */
 #define NR_IRQ_WORDS KVM_IRQ_BITMAP_SIZE(unsigned long)
 	unsigned long irq_pending[NR_IRQ_WORDS];
@@ -530,6 +537,7 @@ void save_msrs(struct vmx_msr_entry *e, int n);
 void kvm_resched(struct kvm_vcpu *vcpu);
 void kvm_load_guest_fpu(struct kvm_vcpu *vcpu);
 void kvm_put_guest_fpu(struct kvm_vcpu *vcpu);
+void kvm_flush_remote_tlbs(struct kvm *kvm);
 
 int kvm_read_guest(struct kvm_vcpu *vcpu,
 	       gva_t addr,
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index 4e1a017..633c2ed 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -41,6 +41,8 @@
 #include <linux/fs.h>
 #include <linux/mount.h>
 #include <linux/sched.h>
+#include <linux/cpumask.h>
+#include <linux/smp.h>
 
 #include "x86_emulate.h"
 #include "segment_descriptor.h"
@@ -309,6 +311,48 @@ static void vcpu_put(struct kvm_vcpu *vcpu)
 	mutex_unlock(&vcpu->mutex);
 }
 
+static void ack_flush(void *_completed)
+{
+	atomic_t *completed = _completed;
+
+	atomic_inc(completed);
+}
+
+void kvm_flush_remote_tlbs(struct kvm *kvm)
+{
+	int i, cpu, needed;
+	cpumask_t cpus;
+	struct kvm_vcpu *vcpu;
+	atomic_t completed;
+
+	atomic_set(&completed, 0);
+	cpus_clear(cpus);
+	needed = 0;
+	for (i = 0; i < kvm->nvcpus; ++i) {
+		vcpu = &kvm->vcpus[i];
+		if (test_and_set_bit(KVM_TLB_FLUSH, &vcpu->requests))
+			continue;
+		cpu = vcpu->cpu;
+		if (cpu != -1 && cpu != raw_smp_processor_id())
+			if (!cpu_isset(cpu, cpus)) {
+				cpu_set(cpu, cpus);
+				++needed;
+			}
+	}
+
+	/*
+	 * We really want smp_call_function_mask() here.  But that's not
+	 * available, so ipi all cpus in parallel and wait for them
+	 * to complete.
+	 */
+	for (cpu = first_cpu(cpus); cpu != NR_CPUS; cpu = next_cpu(cpu, cpus))
+		smp_call_function_single(cpu, ack_flush, &completed, 1, 0);
+	while (atomic_read(&completed) != needed) {
+		cpu_relax();
+		barrier();
+	}
+}
+
 static struct kvm *kvm_create_vm(void)
 {
 	struct kvm *kvm = kzalloc(sizeof(struct kvm), GFP_KERNEL);
diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index d4de988..ad50cfd 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -441,7 +441,7 @@ static void rmap_write_protect(struct kvm_vcpu *vcpu, u64 gfn)
 		BUG_ON(!(*spte & PT_WRITABLE_MASK));
 		rmap_printk("rmap_write_protect: spte %p %llx\n", spte, *spte);
 		rmap_remove(vcpu, spte);
-		kvm_arch_ops->tlb_flush(vcpu);
+		kvm_flush_remote_tlbs(vcpu->kvm);
 		set_shadow_pte(spte, *spte & ~PT_WRITABLE_MASK);
 	}
 }
@@ -656,7 +656,7 @@ static void kvm_mmu_page_unlink_children(struct kvm_vcpu *vcpu,
 				rmap_remove(vcpu, &pt[i]);
 			pt[i] = 0;
 		}
-		kvm_arch_ops->tlb_flush(vcpu);
+		kvm_flush_remote_tlbs(vcpu->kvm);
 		return;
 	}
 
@@ -669,6 +669,7 @@ static void kvm_mmu_page_unlink_children(struct kvm_vcpu *vcpu,
 		ent &= PT64_BASE_ADDR_MASK;
 		mmu_page_remove_parent_pte(vcpu, page_header(ent), &pt[i]);
 	}
+	kvm_flush_remote_tlbs(vcpu->kvm);
 }
 
 static void kvm_mmu_put_page(struct kvm_vcpu *vcpu,
@@ -1093,6 +1094,7 @@ static void mmu_pte_write_zap_pte(struct kvm_vcpu *vcpu,
 		}
 	}
 	*spte = 0;
+	kvm_flush_remote_tlbs(vcpu->kvm);
 }
 
 static void mmu_pte_write_new_pte(struct kvm_vcpu *vcpu,
@@ -1308,7 +1310,7 @@ void kvm_mmu_zap_all(struct kvm_vcpu *vcpu)
 	}
 
 	mmu_free_memory_caches(vcpu);
-	kvm_arch_ops->tlb_flush(vcpu);
+	kvm_flush_remote_tlbs(vcpu->kvm);
 	init_kvm_mmu(vcpu);
 }
 
diff --git a/drivers/kvm/svm.c b/drivers/kvm/svm.c
index 70f386e..eb175c5 100644
--- a/drivers/kvm/svm.c
+++ b/drivers/kvm/svm.c
@@ -1470,6 +1470,11 @@ static void load_db_regs(unsigned long *db_regs)
 	asm volatile ("mov %0, %%dr3" : : "r"(db_regs[3]));
 }
 
+static void svm_flush_tlb(struct kvm_vcpu *vcpu)
+{
+	force_new_asid(vcpu);
+}
+
 static int svm_vcpu_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 {
 	u16 fs_selector;
@@ -1487,6 +1492,11 @@ again:
 
 	clgi();
 
+	vcpu->guest_mode = 1;
+	if (vcpu->requests)
+		if (test_and_clear_bit(KVM_TLB_FLUSH, &vcpu->requests))
+		    svm_flush_tlb(vcpu);
+
 	pre_svm_run(vcpu);
 
 	save_host_msrs(vcpu);
@@ -1618,6 +1628,8 @@ again:
 #endif
 		: "cc", "memory" );
 
+	vcpu->guest_mode = 0;
+
 	if (vcpu->fpu_active) {
 		fx_save(vcpu->guest_fx_image);
 		fx_restore(vcpu->host_fx_image);
@@ -1682,11 +1694,6 @@ again:
 	return r;
 }
 
-static void svm_flush_tlb(struct kvm_vcpu *vcpu)
-{
-	force_new_asid(vcpu);
-}
-
 static void svm_set_cr3(struct kvm_vcpu *vcpu, unsigned long root)
 {
 	vcpu->svm->vmcb->save.cr3 = root;
diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index a1f51b9..b969db1 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -1972,6 +1972,11 @@ static int dm_request_for_irq_injection(struct kvm_vcpu *vcpu,
 		(vmcs_readl(GUEST_RFLAGS) & X86_EFLAGS_IF));
 }
 
+static void vmx_flush_tlb(struct kvm_vcpu *vcpu)
+{
+	vmcs_writel(GUEST_CR3, vmcs_readl(GUEST_CR3));
+}
+
 static int vmx_vcpu_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 {
 	u8 fail;
@@ -1997,9 +2002,15 @@ again:
 	 */
 	vmcs_writel(HOST_CR0, read_cr0());
 
+	local_irq_disable();
+
+	vcpu->guest_mode = 1;
+	if (vcpu->requests)
+		if (test_and_clear_bit(KVM_TLB_FLUSH, &vcpu->requests))
+		    vmx_flush_tlb(vcpu);
+
 	asm (
 		/* Store host registers */
-		"pushf \n\t"
 #ifdef CONFIG_X86_64
 		"push %%rax; push %%rbx; push %%rdx;"
 		"push %%rsi; push %%rdi; push %%rbp;"
@@ -2091,7 +2102,6 @@ again:
 		"pop %%ecx; popa \n\t"
 #endif
 		"setbe %0 \n\t"
-		"popf \n\t"
 	      : "=q" (fail)
 	      : "r"(vcpu->launched), "d"((unsigned long)HOST_RSP),
 		"c"(vcpu),
@@ -2115,6 +2125,9 @@ again:
 		[cr2]"i"(offsetof(struct kvm_vcpu, cr2))
 	      : "cc", "memory" );
 
+	vcpu->guest_mode = 0;
+	local_irq_enable();
+
 	++vcpu->stat.exits;
 
 	vcpu->interrupt_window_open = (vmcs_read32(GUEST_INTERRUPTIBILITY_INFO) & 3) == 0;
@@ -2167,11 +2180,6 @@ out:
 	return r;
 }
 
-static void vmx_flush_tlb(struct kvm_vcpu *vcpu)
-{
-	vmcs_writel(GUEST_CR3, vmcs_readl(GUEST_CR3));
-}
-
 static void vmx_inject_page_fault(struct kvm_vcpu *vcpu,
 				  unsigned long addr,
 				  u32 err_code)
-- 
1.5.0.6


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH 55/58] KVM: SVM: Replace memset(<addr>, 0, PAGESIZE) with clear_page(<addr>)
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (52 preceding siblings ...)
  2007-06-17  9:44 ` [PATCH 54/58] KVM: Flush remote tlbs when reducing shadow pte permissions Avi Kivity
@ 2007-06-17  9:44 ` Avi Kivity
  2007-06-17  9:44 ` [PATCH 56/58] KVM: VMX: " Avi Kivity
                   ` (2 subsequent siblings)
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:44 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Shani Moideen, Avi Kivity

From: Shani Moideen <shani.moideen@wipro.com>

Signed-off-by: Shani Moideen <shani.moideen@wipro.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/svm.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/kvm/svm.c b/drivers/kvm/svm.c
index eb175c5..68841ef 100644
--- a/drivers/kvm/svm.c
+++ b/drivers/kvm/svm.c
@@ -581,7 +581,7 @@ static int svm_create_vcpu(struct kvm_vcpu *vcpu)
 		goto out2;
 
 	vcpu->svm->vmcb = page_address(page);
-	memset(vcpu->svm->vmcb, 0, PAGE_SIZE);
+	clear_page(vcpu->svm->vmcb);
 	vcpu->svm->vmcb_pa = page_to_pfn(page) << PAGE_SHIFT;
 	vcpu->svm->asid_generation = 0;
 	memset(vcpu->svm->db_regs, 0, sizeof(vcpu->svm->db_regs));
@@ -957,7 +957,7 @@ static int shutdown_interception(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 	 * VMCB is undefined after a SHUTDOWN intercept
 	 * so reinitialize it.
 	 */
-	memset(vcpu->svm->vmcb, 0, PAGE_SIZE);
+	clear_page(vcpu->svm->vmcb);
 	init_vmcb(vcpu->svm->vmcb);
 
 	kvm_run->exit_reason = KVM_EXIT_SHUTDOWN;
-- 
1.5.0.6


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH 56/58] KVM: VMX: Replace memset(<addr>, 0, PAGESIZE) with clear_page(<addr>)
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (53 preceding siblings ...)
  2007-06-17  9:44 ` [PATCH 55/58] KVM: SVM: Replace memset(<addr>, 0, PAGESIZE) with clear_page(<addr>) Avi Kivity
@ 2007-06-17  9:44 ` Avi Kivity
  2007-06-17  9:44 ` [PATCH 57/58] KVM: Initialize the BSP bit in the APIC_BASE msr correctly Avi Kivity
  2007-06-17  9:44 ` [PATCH 58/58] KVM: VMX: Ensure vcpu time stamp counter is monotonous Avi Kivity
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:44 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Shani Moideen, Avi Kivity

From: Shani Moideen <shani.moideen@wipro.com>

Signed-off-by: Shani Moideen <shani.moideen@wipro.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/vmx.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index b969db1..b909b54 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -1180,16 +1180,16 @@ static int init_rmode_tss(struct kvm* kvm)
 	}
 
 	page = kmap_atomic(p1, KM_USER0);
-	memset(page, 0, PAGE_SIZE);
+	clear_page(page);
 	*(u16*)(page + 0x66) = TSS_BASE_SIZE + TSS_REDIRECTION_SIZE;
 	kunmap_atomic(page, KM_USER0);
 
 	page = kmap_atomic(p2, KM_USER0);
-	memset(page, 0, PAGE_SIZE);
+	clear_page(page);
 	kunmap_atomic(page, KM_USER0);
 
 	page = kmap_atomic(p3, KM_USER0);
-	memset(page, 0, PAGE_SIZE);
+	clear_page(page);
 	*(page + RMODE_TSS_SIZE - 2 * PAGE_SIZE - 1) = ~0;
 	kunmap_atomic(page, KM_USER0);
 
-- 
1.5.0.6


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH 57/58] KVM: Initialize the BSP bit in the APIC_BASE msr correctly
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (54 preceding siblings ...)
  2007-06-17  9:44 ` [PATCH 56/58] KVM: VMX: " Avi Kivity
@ 2007-06-17  9:44 ` Avi Kivity
  2007-06-17  9:44 ` [PATCH 58/58] KVM: VMX: Ensure vcpu time stamp counter is monotonous Avi Kivity
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:44 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

Needs to be set on vcpu 0 only.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/svm.c |    6 +++---
 drivers/kvm/vmx.c |    6 +++---
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/kvm/svm.c b/drivers/kvm/svm.c
index 68841ef..62ec38c 100644
--- a/drivers/kvm/svm.c
+++ b/drivers/kvm/svm.c
@@ -589,9 +589,9 @@ static int svm_create_vcpu(struct kvm_vcpu *vcpu)
 
 	fx_init(vcpu);
 	vcpu->fpu_active = 1;
-	vcpu->apic_base = 0xfee00000 |
-			/*for vcpu 0*/ MSR_IA32_APICBASE_BSP |
-			MSR_IA32_APICBASE_ENABLE;
+	vcpu->apic_base = 0xfee00000 | MSR_IA32_APICBASE_ENABLE;
+	if (vcpu == &vcpu->kvm->vcpus[0])
+		vcpu->apic_base |= MSR_IA32_APICBASE_BSP;
 
 	return 0;
 
diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index b909b54..0b2aace 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -1238,9 +1238,9 @@ static int vmx_vcpu_setup(struct kvm_vcpu *vcpu)
 	memset(vcpu->regs, 0, sizeof(vcpu->regs));
 	vcpu->regs[VCPU_REGS_RDX] = get_rdx_init_val();
 	vcpu->cr8 = 0;
-	vcpu->apic_base = 0xfee00000 |
-			/*for vcpu 0*/ MSR_IA32_APICBASE_BSP |
-			MSR_IA32_APICBASE_ENABLE;
+	vcpu->apic_base = 0xfee00000 | MSR_IA32_APICBASE_ENABLE;
+	if (vcpu == &vcpu->kvm->vcpus[0])
+		vcpu->apic_base |= MSR_IA32_APICBASE_BSP;
 
 	fx_init(vcpu);
 
-- 
1.5.0.6


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH 58/58] KVM: VMX: Ensure vcpu time stamp counter is monotonous
  2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
                   ` (55 preceding siblings ...)
  2007-06-17  9:44 ` [PATCH 57/58] KVM: Initialize the BSP bit in the APIC_BASE msr correctly Avi Kivity
@ 2007-06-17  9:44 ` Avi Kivity
  56 siblings, 0 replies; 58+ messages in thread
From: Avi Kivity @ 2007-06-17  9:44 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

If the time stamp counter goes backwards, a guest delay loop can become
infinite.  This can happen if a vcpu is migrated to another cpu, where
the counter has a lower value than the first cpu.

Since we're doing an IPI to the first cpu anyway, we can use that to pick
up the old tsc, and use that to calculate the adjustment we need to make
to the tsc offset.
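The offset arithmetic can be shown with a hedged sketch (Python, invented names; the real code writes the VMX TSC_OFFSET field so the guest reads host tsc plus offset):

```python
# Toy model of the TSC offset adjustment: the guest observes
# host_tsc + offset, so folding the old-cpu/new-cpu difference into
# the offset keeps the guest-visible counter monotonic.

def adjust_offset(offset, tsc_old_cpu, tsc_new_cpu):
    # delta is positive when the destination cpu's counter lags behind
    delta = tsc_old_cpu - tsc_new_cpu
    return offset + delta

# Old cpu read 1000 when the vcpu was cleared; new cpu reads 400 now.
offset = adjust_offset(0, 1000, 400)
assert 400 + offset == 1000   # no backward jump visible to the guest
```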

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/vmx.c |    9 +++++++++
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index 0b2aace..d06c362 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -160,6 +160,7 @@ static void __vcpu_clear(void *arg)
 		vmcs_clear(vcpu->vmcs);
 	if (per_cpu(current_vmcs, cpu) == vcpu->vmcs)
 		per_cpu(current_vmcs, cpu) = NULL;
+	rdtscll(vcpu->host_tsc);
 }
 
 static void vcpu_clear(struct kvm_vcpu *vcpu)
@@ -376,6 +377,7 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu)
 {
 	u64 phys_addr = __pa(vcpu->vmcs);
 	int cpu;
+	u64 tsc_this, delta;
 
 	cpu = get_cpu();
 
@@ -409,6 +411,13 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu)
 
 		rdmsrl(MSR_IA32_SYSENTER_ESP, sysenter_esp);
 		vmcs_writel(HOST_IA32_SYSENTER_ESP, sysenter_esp); /* 22.2.3 */
+
+		/*
+		 * Make sure the time stamp counter is monotonous.
+		 */
+		rdtscll(tsc_this);
+		delta = vcpu->host_tsc - tsc_this;
+		vmcs_write64(TSC_OFFSET, vmcs_read64(TSC_OFFSET) + delta);
 	}
 }
 
-- 
1.5.0.6


^ permalink raw reply related	[flat|nested] 58+ messages in thread

end of thread, other threads:[~2007-06-17 10:04 UTC | newest]

Thread overview: 58+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-06-17  9:43 [PATCH 00/58] KVM updates for 2.6.23 Avi Kivity
2007-06-17  9:43 ` [PATCH 01/58] KVM: VMX: Enable io bitmaps to avoid IO port 0x80 VMEXITs Avi Kivity
2007-06-17  9:43 ` [PATCH 02/58] KVM: SVM: Allow direct guest access to PC debug port Avi Kivity
2007-06-17  9:43 ` [PATCH 03/58] KVM: Assume that writes smaller than 4 bytes are to non-pagetable pages Avi Kivity
2007-06-17  9:43 ` [PATCH 04/58] KVM: Avoid saving and restoring some host CPU state on lightweight vmexit Avi Kivity
2007-06-17  9:43 ` [PATCH 05/58] KVM: Unindent some code Avi Kivity
2007-06-17  9:43 ` [PATCH 06/58] KVM: Reduce misfirings of the fork detector Avi Kivity
2007-06-17  9:43 ` [PATCH 07/58] KVM: Be more careful restoring fs on lightweight vmexit Avi Kivity
2007-06-17  9:43 ` [PATCH 08/58] KVM: Unify kvm_mmu_pre_write() and kvm_mmu_post_write() Avi Kivity
2007-06-17  9:43 ` [PATCH 09/58] KVM: MMU: Respect nonpae pagetable quadrant when zapping ptes Avi Kivity
2007-06-17  9:43 ` [PATCH 10/58] KVM: Update shadow pte on write to guest pte Avi Kivity
2007-06-17  9:43 ` [PATCH 11/58] KVM: Increase mmu shadow cache to 1024 pages Avi Kivity
2007-06-17  9:43 ` [PATCH 12/58] KVM: Fix potential guest state leak into host Avi Kivity
2007-06-17  9:43 ` [PATCH 13/58] KVM: Move some more msr mangling into vmx_save_host_state() Avi Kivity
2007-06-17  9:43 ` [PATCH 14/58] KVM: Rationalize exception bitmap usage Avi Kivity
2007-06-17  9:43 ` [PATCH 15/58] KVM: Consolidate guest fpu activation and deactivation Avi Kivity
2007-06-17  9:43 ` [PATCH 16/58] KVM: Set cr0.mp for guests Avi Kivity
2007-06-17  9:43 ` [PATCH 17/58] KVM: Implement IA32_EBL_CR_POWERON msr Avi Kivity
2007-06-17  9:43 ` [PATCH 18/58] KVM: MMU: Simplify kvm_mmu_free_page() a tiny bit Avi Kivity
2007-06-17  9:44 ` [PATCH 19/58] KVM: MMU: Store shadow page tables as kernel virtual addresses, not physical Avi Kivity
2007-06-17  9:44 ` [PATCH 20/58] KVM: VMX: Only reload guest msrs if they are already loaded Avi Kivity
2007-06-17  9:44 ` [PATCH 21/58] KVM: Avoid corrupting tr in real mode Avi Kivity
2007-06-17  9:44 ` [PATCH 22/58] KVM: Fix vmx I/O bitmap initialization on highmem systems Avi Kivity
2007-06-17  9:44 ` [PATCH 23/58] KVM: VMX: Use local labels in inline assembly Avi Kivity
2007-06-17  9:44 ` [PATCH 24/58] KVM: VMX: Handle #SS faults from real mode Avi Kivity
2007-06-17  9:44 ` [PATCH 25/58] KVM: VMX: Avoid saving and restoring msrs on lightweight vmexit Avi Kivity
2007-06-17  9:44 ` [PATCH 26/58] KVM: VMX: Cleanup redundant code in MSR set Avi Kivity
2007-06-17  9:44 ` [PATCH 27/58] KVM: VMX: Avoid saving and restoring msr_efer on lightweight vmexit Avi Kivity
2007-06-17  9:44 ` [PATCH 28/58] Use menuconfig objects II - KVM/Virt Avi Kivity
2007-06-17  9:44 ` [PATCH 29/58] KVM: x86 emulator: implement wbinvd Avi Kivity
2007-06-17  9:44 ` [PATCH 30/58] KVM: Fix includes Avi Kivity
2007-06-17  9:44 ` [PATCH 31/58] KVM: Use symbolic constants instead of magic numbers Avi Kivity
2007-06-17  9:44 ` [PATCH 32/58] KVM: MMU: Use slab caches for shadow pages and their headers Avi Kivity
2007-06-17  9:44 ` [PATCH 33/58] KVM: MMU: Simplify fetch() a little bit Avi Kivity
2007-06-17  9:44 ` [PATCH 34/58] KVM: MMU: Move set_pte_common() to pte width dependent code Avi Kivity
2007-06-17  9:44 ` [PATCH 35/58] KVM: MMU: Pass the guest pde to set_pte_common Avi Kivity
2007-06-17  9:44 ` [PATCH 36/58] KVM: MMU: Fold fix_read_pf() into set_pte_common() Avi Kivity
2007-06-17  9:44 ` [PATCH 37/58] KVM: MMU: Fold fix_write_pf() " Avi Kivity
2007-06-17  9:44 ` [PATCH 38/58] KVM: Move shadow pte modifications from set_pte/set_pde to set_pde_common() Avi Kivity
2007-06-17  9:44 ` [PATCH 39/58] KVM: Make shadow pte updates atomic Avi Kivity
2007-06-17  9:44 ` [PATCH 40/58] KVM: MMU: Make setting shadow ptes atomic on i386 Avi Kivity
2007-06-17  9:44 ` [PATCH 41/58] KVM: MMU: Remove cr0.wp tricks Avi Kivity
2007-06-17  9:44 ` [PATCH 42/58] KVM: MMU: Simpify accessed/dirty/present/nx bit handling Avi Kivity
2007-06-17  9:44 ` [PATCH 43/58] KVM: MMU: Don't cache guest access bits in the shadow page table Avi Kivity
2007-06-17  9:44 ` [PATCH 44/58] KVM: MMU: Remove unused large page marker Avi Kivity
2007-06-17  9:44 ` [PATCH 45/58] KVM: Lazy guest cr3 switching Avi Kivity
2007-06-17  9:44 ` [PATCH 47/58] KVM: Remove unnecessary initialization and checks in mark_page_dirty() Avi Kivity
2007-06-17  9:44 ` [PATCH 48/58] KVM: Fix vcpu freeing for guest smp Avi Kivity
2007-06-17  9:44 ` [PATCH 49/58] KVM: Fix adding an smp virtual machine to the vm list Avi Kivity
2007-06-17  9:44 ` [PATCH 50/58] KVM: Enable guest smp Avi Kivity
2007-06-17  9:44 ` [PATCH 51/58] KVM: Move duplicate halt handling code into kvm_main.c Avi Kivity
2007-06-17  9:44 ` [PATCH 52/58] KVM: Emulate hlt on real mode for Intel Avi Kivity
2007-06-17  9:44 ` [PATCH 53/58] KVM: Keep an upper bound of initialized vcpus Avi Kivity
2007-06-17  9:44 ` [PATCH 54/58] KVM: Flush remote tlbs when reducing shadow pte permissions Avi Kivity
2007-06-17  9:44 ` [PATCH 55/58] KVM: SVM: Replace memset(<addr>, 0, PAGESIZE) with clear_page(<addr>) Avi Kivity
2007-06-17  9:44 ` [PATCH 56/58] KVM: VMX: " Avi Kivity
2007-06-17  9:44 ` [PATCH 57/58] KVM: Initialize the BSP bit in the APIC_BASE msr correctly Avi Kivity
2007-06-17  9:44 ` [PATCH 58/58] KVM: VMX: Ensure vcpu time stamp counter is monotonous Avi Kivity

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).