* [PATCH v4 00/11] KVM: x86: optimize for writing guest page
From: Xiao Guangrong @ 2011-09-22  8:52 UTC
  To: Avi Kivity; +Cc: Marcelo Tosatti, LKML, KVM

This patchset is against https://github.com/avikivity/kvm.git next branch.

This version incorporates some changes based on Avi's comments:
- fix instruction retry for nested guests
- skip write-flooding detection for sps whose level is 1
- rename some functions


* [PATCH v4 01/11] KVM: MMU: avoid pte_list_desc running out in kvm_mmu_pte_write
From: Xiao Guangrong @ 2011-09-22  8:53 UTC
  To: Avi Kivity; +Cc: Marcelo Tosatti, LKML, KVM

kvm_mmu_pte_write is unsafe: it needs to allocate pte_list_desc objects
while prefetching sptes, but we cannot know in advance how many sptes
will be prefetched on this path. That means the cache of free
pte_list_desc objects can be exhausted and the BUG_ON() triggered. In
addition, some paths do not fill the cache at all, for example emulation
of an INS instruction that does not go through a page fault.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
---
 arch/x86/kvm/mmu.c |   25 ++++++++++++++++++++-----
 1 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 5d7fbf0..b01afee 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -592,6 +592,11 @@ static int mmu_topup_memory_cache(struct kvm_mmu_memory_cache *cache,
 	return 0;
 }
 
+static int mmu_memory_cache_free_objects(struct kvm_mmu_memory_cache *cache)
+{
+	return cache->nobjs;
+}
+
 static void mmu_free_memory_cache(struct kvm_mmu_memory_cache *mc,
 				  struct kmem_cache *cache)
 {
@@ -969,6 +974,14 @@ static unsigned long *gfn_to_rmap(struct kvm *kvm, gfn_t gfn, int level)
 	return &linfo->rmap_pde;
 }
 
+static bool rmap_can_add(struct kvm_vcpu *vcpu)
+{
+	struct kvm_mmu_memory_cache *cache;
+
+	cache = &vcpu->arch.mmu_pte_list_desc_cache;
+	return mmu_memory_cache_free_objects(cache);
+}
+
 static int rmap_add(struct kvm_vcpu *vcpu, u64 *spte, gfn_t gfn)
 {
 	struct kvm_mmu_page *sp;
@@ -3585,6 +3598,12 @@ void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
 		break;
 	}
 
+	/*
+	 * No need to care whether allocation memory is successful
+	 * or not since pte prefetch is skiped if it does not have
+	 * enough objects in the cache.
+	 */
+	mmu_topup_memory_caches(vcpu);
 	spin_lock(&vcpu->kvm->mmu_lock);
 	if (atomic_read(&vcpu->kvm->arch.invlpg_counter) != invlpg_counter)
 		gentry = 0;
@@ -3655,7 +3674,7 @@ void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
 			mmu_page_zap_pte(vcpu->kvm, sp, spte);
 			if (gentry &&
 			      !((sp->role.word ^ vcpu->arch.mmu.base_role.word)
-			      & mask.word))
+			      & mask.word) && rmap_can_add(vcpu))
 				mmu_pte_write_new_pte(vcpu, sp, spte, &gentry);
 			if (!remote_flush && need_remote_flush(entry, *spte))
 				remote_flush = true;
@@ -3716,10 +3735,6 @@ int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gva_t cr2, u32 error_code,
 		goto out;
 	}
 
-	r = mmu_topup_memory_caches(vcpu);
-	if (r)
-		goto out;
-
 	er = x86_emulate_instruction(vcpu, cr2, 0, insn, insn_len);
 
 	switch (er) {
-- 
1.7.5.4
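
A minimal sketch (mine, not part of the patch) of how a prefetch path
is expected to use rmap_can_add(): stop speculating as soon as the
pte_list_desc cache is empty, instead of hitting the BUG_ON() deeper in
the allocator. The loop body is elided; only the gating matters here:

static void prefetch_sptes_sketch(struct kvm_vcpu *vcpu, u64 *start, u64 *end)
{
        u64 *spte;

        for (spte = start; spte < end; spte++) {
                if (!rmap_can_add(vcpu))
                        break;  /* cache exhausted: skip the prefetch */
                /* ... install the spte, then rmap_add(vcpu, spte, gfn) ... */
        }
}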



* [PATCH v4 02/11] KVM: x86: tag the instructions which are used to write page table
From: Xiao Guangrong @ 2011-09-22  8:53 UTC
  To: Avi Kivity; +Cc: Marcelo Tosatti, LKML, KVM

The idea is from Avi:
| tag instructions that are typically used to modify the page tables, and
| drop shadow if any other instruction is used.
| The list would include, I'd guess, and, or, bts, btc, mov, xchg, cmpxchg,
| and cmpxchg8b.

This patch tags those instructions; later in the series, the shadow
page is dropped if it is written by any other instruction.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
---
 arch/x86/kvm/emulate.c |   37 +++++++++++++++++++++----------------
 1 files changed, 21 insertions(+), 16 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index f1e3be1..a10950a 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -125,8 +125,9 @@
 #define Lock        (1<<26) /* lock prefix is allowed for the instruction */
 #define Priv        (1<<27) /* instruction generates #GP if current CPL != 0 */
 #define No64	    (1<<28)
+#define PageTable   (1 << 29)   /* instruction used to write page table */
 /* Source 2 operand type */
-#define Src2Shift   (29)
+#define Src2Shift   (30)
 #define Src2None    (OpNone << Src2Shift)
 #define Src2CL      (OpCL << Src2Shift)
 #define Src2ImmByte (OpImmByte << Src2Shift)
@@ -3033,10 +3034,10 @@ static struct opcode group7_rm7[] = {
 
 static struct opcode group1[] = {
 	I(Lock, em_add),
-	I(Lock, em_or),
+	I(Lock | PageTable, em_or),
 	I(Lock, em_adc),
 	I(Lock, em_sbb),
-	I(Lock, em_and),
+	I(Lock | PageTable, em_and),
 	I(Lock, em_sub),
 	I(Lock, em_xor),
 	I(0, em_cmp),
@@ -3096,18 +3097,21 @@ static struct group_dual group7 = { {
 
 static struct opcode group8[] = {
 	N, N, N, N,
-	D(DstMem | SrcImmByte | ModRM), D(DstMem | SrcImmByte | ModRM | Lock),
-	D(DstMem | SrcImmByte | ModRM | Lock), D(DstMem | SrcImmByte | ModRM | Lock),
+	D(DstMem | SrcImmByte | ModRM),
+	D(DstMem | SrcImmByte | ModRM | Lock | PageTable),
+	D(DstMem | SrcImmByte | ModRM | Lock),
+	D(DstMem | SrcImmByte | ModRM | Lock | PageTable),
 };
 
 static struct group_dual group9 = { {
-	N, D(DstMem64 | ModRM | Lock), N, N, N, N, N, N,
+	N, D(DstMem64 | ModRM | Lock | PageTable), N, N, N, N, N, N,
 }, {
 	N, N, N, N, N, N, N, N,
 } };
 
 static struct opcode group11[] = {
-	I(DstMem | SrcImm | ModRM | Mov, em_mov), X7(D(Undefined)),
+	I(DstMem | SrcImm | ModRM | Mov | PageTable, em_mov),
+	X7(D(Undefined)),
 };
 
 static struct gprefix pfx_0f_6f_0f_7f = {
@@ -3120,7 +3124,7 @@ static struct opcode opcode_table[256] = {
 	I(ImplicitOps | Stack | No64 | Src2ES, em_push_sreg),
 	I(ImplicitOps | Stack | No64 | Src2ES, em_pop_sreg),
 	/* 0x08 - 0x0F */
-	I6ALU(Lock, em_or),
+	I6ALU(Lock | PageTable, em_or),
 	I(ImplicitOps | Stack | No64 | Src2CS, em_push_sreg),
 	N,
 	/* 0x10 - 0x17 */
@@ -3132,7 +3136,7 @@ static struct opcode opcode_table[256] = {
 	I(ImplicitOps | Stack | No64 | Src2DS, em_push_sreg),
 	I(ImplicitOps | Stack | No64 | Src2DS, em_pop_sreg),
 	/* 0x20 - 0x27 */
-	I6ALU(Lock, em_and), N, N,
+	I6ALU(Lock | PageTable, em_and), N, N,
 	/* 0x28 - 0x2F */
 	I6ALU(Lock, em_sub), N, I(ByteOp | DstAcc | No64, em_das),
 	/* 0x30 - 0x37 */
@@ -3165,11 +3169,11 @@ static struct opcode opcode_table[256] = {
 	G(ByteOp | DstMem | SrcImm | ModRM | No64 | Group, group1),
 	G(DstMem | SrcImmByte | ModRM | Group, group1),
 	I2bv(DstMem | SrcReg | ModRM, em_test),
-	I2bv(DstMem | SrcReg | ModRM | Lock, em_xchg),
+	I2bv(DstMem | SrcReg | ModRM | Lock | PageTable, em_xchg),
 	/* 0x88 - 0x8F */
-	I2bv(DstMem | SrcReg | ModRM | Mov, em_mov),
+	I2bv(DstMem | SrcReg | ModRM | Mov | PageTable, em_mov),
 	I2bv(DstReg | SrcMem | ModRM | Mov, em_mov),
-	I(DstMem | SrcNone | ModRM | Mov, em_mov_rm_sreg),
+	I(DstMem | SrcNone | ModRM | Mov | PageTable, em_mov_rm_sreg),
 	D(ModRM | SrcMem | NoAccess | DstReg),
 	I(ImplicitOps | SrcMem16 | ModRM, em_mov_sreg_rm),
 	G(0, group1A),
@@ -3182,7 +3186,7 @@ static struct opcode opcode_table[256] = {
 	II(ImplicitOps | Stack, em_popf, popf), N, N,
 	/* 0xA0 - 0xA7 */
 	I2bv(DstAcc | SrcMem | Mov | MemAbs, em_mov),
-	I2bv(DstMem | SrcAcc | Mov | MemAbs, em_mov),
+	I2bv(DstMem | SrcAcc | Mov | MemAbs | PageTable, em_mov),
 	I2bv(SrcSI | DstDI | Mov | String, em_mov),
 	I2bv(SrcSI | DstDI | String, em_cmp),
 	/* 0xA8 - 0xAF */
@@ -3280,12 +3284,13 @@ static struct opcode twobyte_table[256] = {
 	D(DstMem | SrcReg | Src2CL | ModRM), N, N,
 	/* 0xA8 - 0xAF */
 	I(Stack | Src2GS, em_push_sreg), I(Stack | Src2GS, em_pop_sreg),
-	DI(ImplicitOps, rsm), D(DstMem | SrcReg | ModRM | BitOp | Lock),
+	DI(ImplicitOps, rsm),
+	D(DstMem | SrcReg | ModRM | BitOp | Lock | PageTable),
 	D(DstMem | SrcReg | Src2ImmByte | ModRM),
 	D(DstMem | SrcReg | Src2CL | ModRM),
 	D(ModRM), I(DstReg | SrcMem | ModRM, em_imul),
 	/* 0xB0 - 0xB7 */
-	D2bv(DstMem | SrcReg | ModRM | Lock),
+	D2bv(DstMem | SrcReg | ModRM | Lock | PageTable),
 	I(DstReg | SrcMemFAddr | ModRM | Src2SS, em_lseg),
 	D(DstMem | SrcReg | ModRM | BitOp | Lock),
 	I(DstReg | SrcMemFAddr | ModRM | Src2FS, em_lseg),
@@ -3293,7 +3298,7 @@ static struct opcode twobyte_table[256] = {
 	D(ByteOp | DstReg | SrcMem | ModRM | Mov), D(DstReg | SrcMem16 | ModRM | Mov),
 	/* 0xB8 - 0xBF */
 	N, N,
-	G(BitOp, group8), D(DstMem | SrcReg | ModRM | BitOp | Lock),
+	G(BitOp, group8), D(DstMem | SrcReg | ModRM | BitOp | Lock | PageTable),
 	D(DstReg | SrcMem | ModRM), D(DstReg | SrcMem | ModRM),
 	D(ByteOp | DstReg | SrcMem | ModRM | Mov), D(DstReg | SrcMem16 | ModRM | Mov),
 	/* 0xC0 - 0xCF */
-- 
1.7.5.4
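
Why Src2Shift moves from 29 to 30: bit 29 now belongs to PageTable, and
the multi-bit Src2 operand field must not overlap it. A stand-alone
illustration (flag values copied from the hunk above; the OpCL encoding
and the checks are mine, for demonstration only):

#include <assert.h>

#define Lock        (1ULL << 26)
#define PageTable   (1ULL << 29)   /* instruction used to write page table */
#define Src2Shift   (30)
#define Src2CL      (2ULL << Src2Shift)   /* assumed OpCL encoding */

int main(void)
{
        unsigned long long d = Lock | PageTable | Src2CL;

        /* PageTable can be tested without disturbing the Src2 field */
        assert(d & PageTable);
        assert((d >> Src2Shift) == 2);
        return 0;
}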



* [PATCH v4 04/11] KVM: x86: cleanup port-in/port-out emulation
From: Xiao Guangrong @ 2011-09-22  8:55 UTC
  To: Avi Kivity; +Cc: Marcelo Tosatti, LKML, KVM

Remove the code duplicated between emulator_pio_in_emulated and
emulator_pio_out_emulated.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
---
 arch/x86/kvm/x86.c |   59 ++++++++++++++++++++++-----------------------------
 1 files changed, 26 insertions(+), 33 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 727a6af..a69a3e5 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4327,32 +4327,24 @@ static int kernel_pio(struct kvm_vcpu *vcpu, void *pd)
 	return r;
 }
 
-
-static int emulator_pio_in_emulated(struct x86_emulate_ctxt *ctxt,
-				    int size, unsigned short port, void *val,
-				    unsigned int count)
+static int emulator_pio_in_out(struct kvm_vcpu *vcpu, int size,
+			       unsigned short port, void *val,
+			       unsigned int count, bool in)
 {
-	struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
-
-	if (vcpu->arch.pio.count)
-		goto data_avail;
-
-	trace_kvm_pio(0, port, size, count);
+	trace_kvm_pio(!in, port, size, count);
 
 	vcpu->arch.pio.port = port;
-	vcpu->arch.pio.in = 1;
+	vcpu->arch.pio.in = in;
 	vcpu->arch.pio.count  = count;
 	vcpu->arch.pio.size = size;
 
 	if (!kernel_pio(vcpu, vcpu->arch.pio_data)) {
-	data_avail:
-		memcpy(val, vcpu->arch.pio_data, size * count);
 		vcpu->arch.pio.count = 0;
 		return 1;
 	}
 
 	vcpu->run->exit_reason = KVM_EXIT_IO;
-	vcpu->run->io.direction = KVM_EXIT_IO_IN;
+	vcpu->run->io.direction = in ? KVM_EXIT_IO_IN : KVM_EXIT_IO_OUT;
 	vcpu->run->io.size = size;
 	vcpu->run->io.data_offset = KVM_PIO_PAGE_OFFSET * PAGE_SIZE;
 	vcpu->run->io.count = count;
@@ -4361,36 +4353,37 @@ static int emulator_pio_in_emulated(struct x86_emulate_ctxt *ctxt,
 	return 0;
 }
 
-static int emulator_pio_out_emulated(struct x86_emulate_ctxt *ctxt,
-				     int size, unsigned short port,
-				     const void *val, unsigned int count)
+static int emulator_pio_in_emulated(struct x86_emulate_ctxt *ctxt,
+				    int size, unsigned short port, void *val,
+				    unsigned int count)
 {
 	struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
+	int ret;
 
-	trace_kvm_pio(1, port, size, count);
-
-	vcpu->arch.pio.port = port;
-	vcpu->arch.pio.in = 0;
-	vcpu->arch.pio.count = count;
-	vcpu->arch.pio.size = size;
-
-	memcpy(vcpu->arch.pio_data, val, size * count);
+	if (vcpu->arch.pio.count)
+		goto data_avail;
 
-	if (!kernel_pio(vcpu, vcpu->arch.pio_data)) {
+	ret = emulator_pio_in_out(vcpu, size, port, val, count, true);
+	if (ret) {
+data_avail:
+		memcpy(val, vcpu->arch.pio_data, size * count);
 		vcpu->arch.pio.count = 0;
 		return 1;
 	}
 
-	vcpu->run->exit_reason = KVM_EXIT_IO;
-	vcpu->run->io.direction = KVM_EXIT_IO_OUT;
-	vcpu->run->io.size = size;
-	vcpu->run->io.data_offset = KVM_PIO_PAGE_OFFSET * PAGE_SIZE;
-	vcpu->run->io.count = count;
-	vcpu->run->io.port = port;
-
 	return 0;
 }
 
+static int emulator_pio_out_emulated(struct x86_emulate_ctxt *ctxt,
+				     int size, unsigned short port,
+				     const void *val, unsigned int count)
+{
+	struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
+
+	memcpy(vcpu->arch.pio_data, val, size * count);
+	return emulator_pio_in_out(vcpu, size, port, (void *)val, count, false);
+}
+
 static unsigned long get_segment_base(struct kvm_vcpu *vcpu, int seg)
 {
 	return kvm_x86_ops->get_segment_base(vcpu, seg);
-- 
1.7.5.4
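
The asymmetry that remains after the deduplication, summarized as a
sketch (my condensation, not code from the patch): only the 'in' side
needs the deferred data_avail completion, because an 'in' that exits to
userspace must copy the data after userspace has filled pio_data.

/*
 * out: memcpy(val -> pio_data), then emulator_pio_in_out(in = false);
 *      the bytes are staged before the common helper runs.
 *
 * in:  if a previous 'in' left data pending (pio.count != 0), copy
 *      pio_data -> val immediately (the data_avail label); otherwise
 *      call emulator_pio_in_out(in = true) and, if the I/O was handled
 *      in-kernel, copy pio_data -> val and clear pio.count.
 */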



* [PATCH v4 05/11] KVM: MMU: do not mark accessed bit on pte write path
From: Xiao Guangrong @ 2011-09-22  8:55 UTC
  To: Avi Kivity; +Cc: Marcelo Tosatti, LKML, KVM

In the current code, the accessed bit is always set when a page fault
occurs, so there is no need to set it again on the pte write path.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
---
 arch/x86/include/asm/kvm_host.h |    1 -
 arch/x86/kvm/mmu.c              |   22 +---------------------
 2 files changed, 1 insertions(+), 22 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 27a25df..58ea3a7 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -356,7 +356,6 @@ struct kvm_vcpu_arch {
 	gfn_t last_pt_write_gfn;
 	int   last_pt_write_count;
 	u64  *last_pte_updated;
-	gfn_t last_pte_gfn;
 
 	struct fpu guest_fpu;
 	u64 xcr0;
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 4e53d6b..6a35024 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2206,11 +2206,6 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
 	if (set_mmio_spte(sptep, gfn, pfn, pte_access))
 		return 0;
 
-	/*
-	 * We don't set the accessed bit, since we sometimes want to see
-	 * whether the guest actually used the pte (in order to detect
-	 * demand paging).
-	 */
 	spte = PT_PRESENT_MASK;
 	if (!speculative)
 		spte |= shadow_accessed_mask;
@@ -2361,10 +2356,8 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
 		}
 	}
 	kvm_release_pfn_clean(pfn);
-	if (speculative) {
+	if (speculative)
 		vcpu->arch.last_pte_updated = sptep;
-		vcpu->arch.last_pte_gfn = gfn;
-	}
 }
 
 static void nonpaging_new_cr3(struct kvm_vcpu *vcpu)
@@ -3532,18 +3525,6 @@ static bool last_updated_pte_accessed(struct kvm_vcpu *vcpu)
 	return !!(spte && (*spte & shadow_accessed_mask));
 }
 
-static void kvm_mmu_access_page(struct kvm_vcpu *vcpu, gfn_t gfn)
-{
-	u64 *spte = vcpu->arch.last_pte_updated;
-
-	if (spte
-	    && vcpu->arch.last_pte_gfn == gfn
-	    && shadow_accessed_mask
-	    && !(*spte & shadow_accessed_mask)
-	    && is_shadow_present_pte(*spte))
-		set_bit(PT_ACCESSED_SHIFT, (unsigned long *)spte);
-}
-
 void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
 		       const u8 *new, int bytes,
 		       bool guest_initiated)
@@ -3614,7 +3595,6 @@ void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
 	++vcpu->kvm->stat.mmu_pte_write;
 	trace_kvm_mmu_audit(vcpu, AUDIT_PRE_PTE_WRITE);
 	if (guest_initiated) {
-		kvm_mmu_access_page(vcpu, gfn);
 		if (gfn == vcpu->arch.last_pt_write_gfn
 		    && !last_updated_pte_accessed(vcpu)) {
 			++vcpu->arch.last_pt_write_count;
-- 
1.7.5.4
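
The invariant this removal relies on, condensed from set_spte() above
(a paraphrase of existing code, not an addition): every spte installed
by a real page fault already carries the accessed bit.

        spte = PT_PRESENT_MASK;
        if (!speculative)
                spte |= shadow_accessed_mask;   /* every real fault sets A */

So the extra kvm_mmu_access_page() call on the pte write path was
redundant, and last_pte_gfn can go away with it.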



* [PATCH v4 06/11] KVM: MMU: cleanup FNAME(invlpg)
From: Xiao Guangrong @ 2011-09-22  8:56 UTC
  To: Avi Kivity; +Cc: Marcelo Tosatti, LKML, KVM

Use mmu_page_zap_pte directly to zap the spte in FNAME(invlpg), and
remove the code duplicated between FNAME(invlpg) and FNAME(sync_page).

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
---
 arch/x86/kvm/mmu.c         |   16 ++++++++++------
 arch/x86/kvm/paging_tmpl.h |   44 +++++++++++++++++---------------------------
 2 files changed, 27 insertions(+), 33 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 6a35024..805a9d5 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1808,7 +1808,7 @@ static void validate_direct_spte(struct kvm_vcpu *vcpu, u64 *sptep,
 	}
 }
 
-static void mmu_page_zap_pte(struct kvm *kvm, struct kvm_mmu_page *sp,
+static bool mmu_page_zap_pte(struct kvm *kvm, struct kvm_mmu_page *sp,
 			     u64 *spte)
 {
 	u64 pte;
@@ -1816,17 +1816,21 @@ static void mmu_page_zap_pte(struct kvm *kvm, struct kvm_mmu_page *sp,
 
 	pte = *spte;
 	if (is_shadow_present_pte(pte)) {
-		if (is_last_spte(pte, sp->role.level))
+		if (is_last_spte(pte, sp->role.level)) {
 			drop_spte(kvm, spte);
-		else {
+			if (is_large_pte(pte))
+				--kvm->stat.lpages;
+		} else {
 			child = page_header(pte & PT64_BASE_ADDR_MASK);
 			drop_parent_pte(child, spte);
 		}
-	} else if (is_mmio_spte(pte))
+		return true;
+	}
+
+	if (is_mmio_spte(pte))
 		mmu_spte_clear_no_track(spte);
 
-	if (is_large_pte(pte))
-		--kvm->stat.lpages;
+	return false;
 }
 
 static void kvm_mmu_page_unlink_children(struct kvm *kvm,
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index 9299410..d8d3906 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -656,6 +656,18 @@ out_unlock:
 	return 0;
 }
 
+static gpa_t FNAME(get_level1_sp_gpa)(struct kvm_mmu_page *sp)
+{
+	int offset = 0;
+
+	WARN_ON(sp->role.level != 1);
+
+	if (PTTYPE == 32)
+		offset = sp->role.quadrant << PT64_LEVEL_BITS;
+
+	return gfn_to_gpa(sp->gfn) + offset * sizeof(pt_element_t);
+}
+
 static void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_t gva)
 {
 	struct kvm_shadow_walk_iterator iterator;
@@ -663,7 +675,6 @@ static void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_t gva)
 	gpa_t pte_gpa = -1;
 	int level;
 	u64 *sptep;
-	int need_flush = 0;
 
 	vcpu_clear_mmio_info(vcpu, gva);
 
@@ -675,36 +686,20 @@ static void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_t gva)
 
 		sp = page_header(__pa(sptep));
 		if (is_last_spte(*sptep, level)) {
-			int offset, shift;
-
 			if (!sp->unsync)
 				break;
 
-			shift = PAGE_SHIFT -
-				  (PT_LEVEL_BITS - PT64_LEVEL_BITS) * level;
-			offset = sp->role.quadrant << shift;
-
-			pte_gpa = (sp->gfn << PAGE_SHIFT) + offset;
+			pte_gpa = FNAME(get_level1_sp_gpa)(sp);
 			pte_gpa += (sptep - sp->spt) * sizeof(pt_element_t);
 
-			if (is_shadow_present_pte(*sptep)) {
-				if (is_large_pte(*sptep))
-					--vcpu->kvm->stat.lpages;
-				drop_spte(vcpu->kvm, sptep);
-				need_flush = 1;
-			} else if (is_mmio_spte(*sptep))
-				mmu_spte_clear_no_track(sptep);
-
-			break;
+			if (mmu_page_zap_pte(vcpu->kvm, sp, sptep))
+				kvm_flush_remote_tlbs(vcpu->kvm);
 		}
 
 		if (!is_shadow_present_pte(*sptep) || !sp->unsync_children)
 			break;
 	}
 
-	if (need_flush)
-		kvm_flush_remote_tlbs(vcpu->kvm);
-
 	atomic_inc(&vcpu->kvm->arch.invlpg_counter);
 
 	spin_unlock(&vcpu->kvm->mmu_lock);
@@ -769,19 +764,14 @@ static gpa_t FNAME(gva_to_gpa_nested)(struct kvm_vcpu *vcpu, gva_t vaddr,
  */
 static int FNAME(sync_page)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
 {
-	int i, offset, nr_present;
+	int i, nr_present = 0;
 	bool host_writable;
 	gpa_t first_pte_gpa;
 
-	offset = nr_present = 0;
-
 	/* direct kvm_mmu_page can not be unsync. */
 	BUG_ON(sp->role.direct);
 
-	if (PTTYPE == 32)
-		offset = sp->role.quadrant << PT64_LEVEL_BITS;
-
-	first_pte_gpa = gfn_to_gpa(sp->gfn) + offset * sizeof(pt_element_t);
+	first_pte_gpa = FNAME(get_level1_sp_gpa)(sp);
 
 	for (i = 0; i < PT64_ENT_PER_PAGE; i++) {
 		unsigned pte_access;
-- 
1.7.5.4
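
A worked example of the new helper for the PTTYPE == 32 case (a sketch;
it assumes the usual KVM constants, PT64_LEVEL_BITS == 9 and a 4-byte
pt_element_t for 32-bit guest ptes):

        /* sp->role.quadrant == 1 */
        offset  = 1 << 9;               /* 512 guest ptes */
        pte_gpa = gfn_to_gpa(sp->gfn)   /* page holding the guest page table */
                  + 512 * 4;            /* +2048: second half of the 4KiB page */

A 32-bit guest page table holds 1024 4-byte entries, but a level-1
shadow page holds only 512 sptes, so two shadow pages (quadrants 0 and
1) each cover half of the guest page; the helper recovers which half.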



* [PATCH v4 07/11] KVM: MMU: fast prefetch spte on invlpg path
From: Xiao Guangrong @ 2011-09-22  8:56 UTC
  To: Avi Kivity; +Cc: Marcelo Tosatti, LKML, KVM

Fast-prefetch sptes for the unsync shadow page on the invlpg path.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
---
 arch/x86/include/asm/kvm_host.h |    4 +---
 arch/x86/kvm/mmu.c              |   38 +++++++++++++++-----------------------
 arch/x86/kvm/paging_tmpl.h      |   30 ++++++++++++++++++------------
 arch/x86/kvm/x86.c              |    4 ++--
 4 files changed, 36 insertions(+), 40 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 58ea3a7..927ba73 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -460,7 +460,6 @@ struct kvm_arch {
 	unsigned int n_requested_mmu_pages;
 	unsigned int n_max_mmu_pages;
 	unsigned int indirect_shadow_pages;
-	atomic_t invlpg_counter;
 	struct hlist_head mmu_page_hash[KVM_NUM_MMU_PAGES];
 	/*
 	 * Hash table of struct kvm_mmu_page.
@@ -754,8 +753,7 @@ int fx_init(struct kvm_vcpu *vcpu);
 
 void kvm_mmu_flush_tlb(struct kvm_vcpu *vcpu);
 void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
-		       const u8 *new, int bytes,
-		       bool guest_initiated);
+		       const u8 *new, int bytes);
 int kvm_mmu_unprotect_page(struct kvm *kvm, gfn_t gfn);
 int kvm_mmu_unprotect_page_virt(struct kvm_vcpu *vcpu, gva_t gva);
 void __kvm_mmu_free_some_pages(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 805a9d5..4128aba 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -3530,8 +3530,7 @@ static bool last_updated_pte_accessed(struct kvm_vcpu *vcpu)
 }
 
 void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
-		       const u8 *new, int bytes,
-		       bool guest_initiated)
+		       const u8 *new, int bytes)
 {
 	gfn_t gfn = gpa >> PAGE_SHIFT;
 	union kvm_mmu_page_role mask = { .word = 0 };
@@ -3540,7 +3539,7 @@ void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
 	LIST_HEAD(invalid_list);
 	u64 entry, gentry, *spte;
 	unsigned pte_size, page_offset, misaligned, quadrant, offset;
-	int level, npte, invlpg_counter, r, flooded = 0;
+	int level, npte, r, flooded = 0;
 	bool remote_flush, local_flush, zap_page;
 
 	/*
@@ -3555,19 +3554,16 @@ void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
 
 	pgprintk("%s: gpa %llx bytes %d\n", __func__, gpa, bytes);
 
-	invlpg_counter = atomic_read(&vcpu->kvm->arch.invlpg_counter);
-
 	/*
 	 * Assume that the pte write on a page table of the same type
 	 * as the current vcpu paging mode since we update the sptes only
 	 * when they have the same mode.
 	 */
-	if ((is_pae(vcpu) && bytes == 4) || !new) {
+	if (is_pae(vcpu) && bytes == 4) {
 		/* Handle a 32-bit guest writing two halves of a 64-bit gpte */
-		if (is_pae(vcpu)) {
-			gpa &= ~(gpa_t)7;
-			bytes = 8;
-		}
+		gpa &= ~(gpa_t)7;
+		bytes = 8;
+
 		r = kvm_read_guest(vcpu->kvm, gpa, &gentry, min(bytes, 8));
 		if (r)
 			gentry = 0;
@@ -3593,22 +3589,18 @@ void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
 	 */
 	mmu_topup_memory_caches(vcpu);
 	spin_lock(&vcpu->kvm->mmu_lock);
-	if (atomic_read(&vcpu->kvm->arch.invlpg_counter) != invlpg_counter)
-		gentry = 0;
 	kvm_mmu_free_some_pages(vcpu);
 	++vcpu->kvm->stat.mmu_pte_write;
 	trace_kvm_mmu_audit(vcpu, AUDIT_PRE_PTE_WRITE);
-	if (guest_initiated) {
-		if (gfn == vcpu->arch.last_pt_write_gfn
-		    && !last_updated_pte_accessed(vcpu)) {
-			++vcpu->arch.last_pt_write_count;
-			if (vcpu->arch.last_pt_write_count >= 3)
-				flooded = 1;
-		} else {
-			vcpu->arch.last_pt_write_gfn = gfn;
-			vcpu->arch.last_pt_write_count = 1;
-			vcpu->arch.last_pte_updated = NULL;
-		}
+	if (gfn == vcpu->arch.last_pt_write_gfn
+	    && !last_updated_pte_accessed(vcpu)) {
+		++vcpu->arch.last_pt_write_count;
+		if (vcpu->arch.last_pt_write_count >= 3)
+			flooded = 1;
+	} else {
+		vcpu->arch.last_pt_write_gfn = gfn;
+		vcpu->arch.last_pt_write_count = 1;
+		vcpu->arch.last_pte_updated = NULL;
 	}
 
 	mask.cr0_wp = mask.cr4_pae = mask.nxe = 1;
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index d8d3906..9efb860 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -672,20 +672,27 @@ static void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_t gva)
 {
 	struct kvm_shadow_walk_iterator iterator;
 	struct kvm_mmu_page *sp;
-	gpa_t pte_gpa = -1;
 	int level;
 	u64 *sptep;
 
 	vcpu_clear_mmio_info(vcpu, gva);
 
-	spin_lock(&vcpu->kvm->mmu_lock);
+	/*
+	 * No need to check return value here, rmap_can_add() can
+	 * help us to skip pte prefetch later.
+	 */
+	mmu_topup_memory_caches(vcpu);
 
+	spin_lock(&vcpu->kvm->mmu_lock);
 	for_each_shadow_entry(vcpu, gva, iterator) {
 		level = iterator.level;
 		sptep = iterator.sptep;
 
 		sp = page_header(__pa(sptep));
 		if (is_last_spte(*sptep, level)) {
+			pt_element_t gpte;
+			gpa_t pte_gpa;
+
 			if (!sp->unsync)
 				break;
 
@@ -694,22 +701,21 @@ static void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_t gva)
 
 			if (mmu_page_zap_pte(vcpu->kvm, sp, sptep))
 				kvm_flush_remote_tlbs(vcpu->kvm);
+
+			if (!rmap_can_add(vcpu))
+				break;
+
+			if (kvm_read_guest_atomic(vcpu->kvm, pte_gpa, &gpte,
+						  sizeof(pt_element_t)))
+				break;
+
+			FNAME(update_pte)(vcpu, sp, sptep, &gpte);
 		}
 
 		if (!is_shadow_present_pte(*sptep) || !sp->unsync_children)
 			break;
 	}
-
-	atomic_inc(&vcpu->kvm->arch.invlpg_counter);
-
 	spin_unlock(&vcpu->kvm->mmu_lock);
-
-	if (pte_gpa == -1)
-		return;
-
-	if (mmu_topup_memory_caches(vcpu))
-		return;
-	kvm_mmu_pte_write(vcpu, pte_gpa, NULL, sizeof(pt_element_t), 0);
 }
 
 static gpa_t FNAME(gva_to_gpa)(struct kvm_vcpu *vcpu, gva_t vaddr, u32 access,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a69a3e5..9206e39 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4065,7 +4065,7 @@ int emulator_write_phys(struct kvm_vcpu *vcpu, gpa_t gpa,
 	ret = kvm_write_guest(vcpu->kvm, gpa, val, bytes);
 	if (ret < 0)
 		return 0;
-	kvm_mmu_pte_write(vcpu, gpa, val, bytes, 1);
+	kvm_mmu_pte_write(vcpu, gpa, val, bytes);
 	return 1;
 }
 
@@ -4302,7 +4302,7 @@ static int emulator_cmpxchg_emulated(struct x86_emulate_ctxt *ctxt,
 	if (!exchanged)
 		return X86EMUL_CMPXCHG_FAILED;
 
-	kvm_mmu_pte_write(vcpu, gpa, new, bytes, 1);
+	kvm_mmu_pte_write(vcpu, gpa, new, bytes);
 
 	return X86EMUL_CONTINUE;
 
-- 
1.7.5.4
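
Two guards make the prefetch safe under mmu_lock (a spinlock), as the
hunk above shows; in condensed form (annotations mine):

        if (!rmap_can_add(vcpu))        /* no pte_list_desc objects left */
                break;
        if (kvm_read_guest_atomic(vcpu->kvm, pte_gpa, &gpte,
                                  sizeof(pt_element_t)))
                break;                  /* must not sleep under the lock */
        FNAME(update_pte)(vcpu, sp, sptep, &gpte);

Either failure simply skips the prefetch; the invlpg itself has already
been handled by mmu_page_zap_pte().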



* [PATCH v4 08/11] KVM: MMU: remove unnecessary kvm_mmu_free_some_pages
From: Xiao Guangrong @ 2011-09-22  8:56 UTC
  To: Avi Kivity; +Cc: Marcelo Tosatti, LKML, KVM

kvm_mmu_pte_write does not need to allocate shadow pages, so calling
kvm_mmu_free_some_pages there is unnecessary.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
---
 arch/x86/kvm/mmu.c |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 4128aba..7b22f3a 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -3589,7 +3589,6 @@ void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
 	 */
 	mmu_topup_memory_caches(vcpu);
 	spin_lock(&vcpu->kvm->mmu_lock);
-	kvm_mmu_free_some_pages(vcpu);
 	++vcpu->kvm->stat.mmu_pte_write;
 	trace_kvm_mmu_audit(vcpu, AUDIT_PRE_PTE_WRITE);
 	if (gfn == vcpu->arch.last_pt_write_gfn
-- 
1.7.5.4



* [PATCH v4 09/11] KVM: MMU: split kvm_mmu_pte_write function
From: Xiao Guangrong @ 2011-09-22  8:57 UTC
  To: Avi Kivity; +Cc: Marcelo Tosatti, LKML, KVM

kvm_mmu_pte_write is too long; split it up for better readability.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
---
 arch/x86/kvm/mmu.c |  194 ++++++++++++++++++++++++++++++++--------------------
 1 files changed, 119 insertions(+), 75 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 7b22f3a..6e39ec5 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -3529,48 +3529,28 @@ static bool last_updated_pte_accessed(struct kvm_vcpu *vcpu)
 	return !!(spte && (*spte & shadow_accessed_mask));
 }
 
-void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
-		       const u8 *new, int bytes)
+static u64 mmu_pte_write_fetch_gpte(struct kvm_vcpu *vcpu, gpa_t *gpa,
+				    const u8 *new, int *bytes)
 {
-	gfn_t gfn = gpa >> PAGE_SHIFT;
-	union kvm_mmu_page_role mask = { .word = 0 };
-	struct kvm_mmu_page *sp;
-	struct hlist_node *node;
-	LIST_HEAD(invalid_list);
-	u64 entry, gentry, *spte;
-	unsigned pte_size, page_offset, misaligned, quadrant, offset;
-	int level, npte, r, flooded = 0;
-	bool remote_flush, local_flush, zap_page;
-
-	/*
-	 * If we don't have indirect shadow pages, it means no page is
-	 * write-protected, so we can exit simply.
-	 */
-	if (!ACCESS_ONCE(vcpu->kvm->arch.indirect_shadow_pages))
-		return;
-
-	zap_page = remote_flush = local_flush = false;
-	offset = offset_in_page(gpa);
-
-	pgprintk("%s: gpa %llx bytes %d\n", __func__, gpa, bytes);
+	u64 gentry;
+	int r;
 
 	/*
 	 * Assume that the pte write on a page table of the same type
 	 * as the current vcpu paging mode since we update the sptes only
 	 * when they have the same mode.
 	 */
-	if (is_pae(vcpu) && bytes == 4) {
+	if (is_pae(vcpu) && *bytes == 4) {
 		/* Handle a 32-bit guest writing two halves of a 64-bit gpte */
-		gpa &= ~(gpa_t)7;
-		bytes = 8;
-
-		r = kvm_read_guest(vcpu->kvm, gpa, &gentry, min(bytes, 8));
+		*gpa &= ~(gpa_t)7;
+		*bytes = 8;
+		r = kvm_read_guest(vcpu->kvm, *gpa, &gentry, min(*bytes, 8));
 		if (r)
 			gentry = 0;
 		new = (const u8 *)&gentry;
 	}
 
-	switch (bytes) {
+	switch (*bytes) {
 	case 4:
 		gentry = *(const u32 *)new;
 		break;
@@ -3582,71 +3562,135 @@ void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
 		break;
 	}
 
-	/*
-	 * No need to care whether allocation memory is successful
-	 * or not since pte prefetch is skiped if it does not have
-	 * enough objects in the cache.
-	 */
-	mmu_topup_memory_caches(vcpu);
-	spin_lock(&vcpu->kvm->mmu_lock);
-	++vcpu->kvm->stat.mmu_pte_write;
-	trace_kvm_mmu_audit(vcpu, AUDIT_PRE_PTE_WRITE);
+	return gentry;
+}
+
+/*
+ * If we're seeing too many writes to a page, it may no longer be a page table,
+ * or we may be forking, in which case it is better to unmap the page.
+ */
+static bool detect_write_flooding(struct kvm_vcpu *vcpu, gfn_t gfn)
+{
+	bool flooded = false;
+
 	if (gfn == vcpu->arch.last_pt_write_gfn
 	    && !last_updated_pte_accessed(vcpu)) {
 		++vcpu->arch.last_pt_write_count;
 		if (vcpu->arch.last_pt_write_count >= 3)
-			flooded = 1;
+			flooded = true;
 	} else {
 		vcpu->arch.last_pt_write_gfn = gfn;
 		vcpu->arch.last_pt_write_count = 1;
 		vcpu->arch.last_pte_updated = NULL;
 	}
 
+	return flooded;
+}
+
+/*
+ * Misaligned accesses are too much trouble to fix up; also, they usually
+ * indicate a page is not used as a page table.
+ */
+static bool detect_write_misaligned(struct kvm_mmu_page *sp, gpa_t gpa,
+				    int bytes)
+{
+	unsigned offset, pte_size, misaligned;
+
+	pgprintk("misaligned: gpa %llx bytes %d role %x\n",
+		 gpa, bytes, sp->role.word);
+
+	offset = offset_in_page(gpa);
+	pte_size = sp->role.cr4_pae ? 8 : 4;
+	misaligned = (offset ^ (offset + bytes - 1)) & ~(pte_size - 1);
+	misaligned |= bytes < 4;
+
+	return misaligned;
+}
+
+static u64 *get_written_sptes(struct kvm_mmu_page *sp, gpa_t gpa, int *nspte)
+{
+	unsigned page_offset, quadrant;
+	u64 *spte;
+	int level;
+
+	page_offset = offset_in_page(gpa);
+	level = sp->role.level;
+	*nspte = 1;
+	if (!sp->role.cr4_pae) {
+		page_offset <<= 1;	/* 32->64 */
+		/*
+		 * A 32-bit pde maps 4MB while the shadow pdes map
+		 * only 2MB.  So we need to double the offset again
+		 * and zap two pdes instead of one.
+		 */
+		if (level == PT32_ROOT_LEVEL) {
+			page_offset &= ~7; /* kill rounding error */
+			page_offset <<= 1;
+			*nspte = 2;
+		}
+		quadrant = page_offset >> PAGE_SHIFT;
+		page_offset &= ~PAGE_MASK;
+		if (quadrant != sp->role.quadrant)
+			return NULL;
+	}
+
+	spte = &sp->spt[page_offset / sizeof(*spte)];
+	return spte;
+}
+
+void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
+		       const u8 *new, int bytes)
+{
+	gfn_t gfn = gpa >> PAGE_SHIFT;
+	union kvm_mmu_page_role mask = { .word = 0 };
+	struct kvm_mmu_page *sp;
+	struct hlist_node *node;
+	LIST_HEAD(invalid_list);
+	u64 entry, gentry, *spte;
+	int npte;
+	bool remote_flush, local_flush, zap_page, flooded, misaligned;
+
+	/*
+	 * If we don't have indirect shadow pages, it means no page is
+	 * write-protected, so we can exit simply.
+	 */
+	if (!ACCESS_ONCE(vcpu->kvm->arch.indirect_shadow_pages))
+		return;
+
+	zap_page = remote_flush = local_flush = false;
+
+	pgprintk("%s: gpa %llx bytes %d\n", __func__, gpa, bytes);
+
+	gentry = mmu_pte_write_fetch_gpte(vcpu, &gpa, new, &bytes);
+
+	/*
+	 * No need to care whether allocation memory is successful
+	 * or not since pte prefetch is skiped if it does not have
+	 * enough objects in the cache.
+	 */
+	mmu_topup_memory_caches(vcpu);
+
+	spin_lock(&vcpu->kvm->mmu_lock);
+	++vcpu->kvm->stat.mmu_pte_write;
+	trace_kvm_mmu_audit(vcpu, AUDIT_PRE_PTE_WRITE);
+
+	flooded = detect_write_flooding(vcpu, gfn);
 	mask.cr0_wp = mask.cr4_pae = mask.nxe = 1;
 	for_each_gfn_indirect_valid_sp(vcpu->kvm, sp, gfn, node) {
-		pte_size = sp->role.cr4_pae ? 8 : 4;
-		misaligned = (offset ^ (offset + bytes - 1)) & ~(pte_size - 1);
-		misaligned |= bytes < 4;
+		misaligned = detect_write_misaligned(sp, gpa, bytes);
+
 		if (misaligned || flooded) {
-			/*
-			 * Misaligned accesses are too much trouble to fix
-			 * up; also, they usually indicate a page is not used
-			 * as a page table.
-			 *
-			 * If we're seeing too many writes to a page,
-			 * it may no longer be a page table, or we may be
-			 * forking, in which case it is better to unmap the
-			 * page.
-			 */
-			pgprintk("misaligned: gpa %llx bytes %d role %x\n",
-				 gpa, bytes, sp->role.word);
 			zap_page |= !!kvm_mmu_prepare_zap_page(vcpu->kvm, sp,
 						     &invalid_list);
 			++vcpu->kvm->stat.mmu_flooded;
 			continue;
 		}
-		page_offset = offset;
-		level = sp->role.level;
-		npte = 1;
-		if (!sp->role.cr4_pae) {
-			page_offset <<= 1;	/* 32->64 */
-			/*
-			 * A 32-bit pde maps 4MB while the shadow pdes map
-			 * only 2MB.  So we need to double the offset again
-			 * and zap two pdes instead of one.
-			 */
-			if (level == PT32_ROOT_LEVEL) {
-				page_offset &= ~7; /* kill rounding error */
-				page_offset <<= 1;
-				npte = 2;
-			}
-			quadrant = page_offset >> PAGE_SHIFT;
-			page_offset &= ~PAGE_MASK;
-			if (quadrant != sp->role.quadrant)
-				continue;
-		}
+
+		spte = get_written_sptes(sp, gpa, &npte);
+		if (!spte)
+			continue;
+
 		local_flush = true;
-		spte = &sp->spt[page_offset / sizeof(*spte)];
 		while (npte--) {
 			entry = *spte;
 			mmu_page_zap_pte(vcpu->kvm, sp, spte);
-- 
1.7.5.4
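
The misalignment test in detect_write_misaligned() is compact; here is
a stand-alone sketch (helper and sample values are mine) of what
(offset ^ (offset + bytes - 1)) & ~(pte_size - 1) detects: it is
nonzero exactly when the first and last written byte fall in different
ptes.

#include <stdio.h>

static unsigned misaligned(unsigned offset, unsigned bytes, unsigned pte_size)
{
        unsigned m = (offset ^ (offset + bytes - 1)) & ~(pte_size - 1);
        return m | (bytes < 4);
}

int main(void)
{
        printf("%u\n", misaligned(0, 8, 8));    /* 0: one 8-byte pte */
        printf("%u\n", misaligned(6, 4, 8));    /* 8: spans two ptes */
        return 0;
}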



* [PATCH v4 10/11] KVM: MMU: fix detection of misaligned accesses
From: Xiao Guangrong @ 2011-09-22  8:57 UTC
  To: Avi Kivity; +Cc: Marcelo Tosatti, LKML, KVM

Sometimes only a single byte of a pte is written to update a status
bit; for example, the Linux kernel clears the r/w bit with clear_bit(),
which uses the 'andb' instruction. In this case kvm_mmu_pte_write
treats the write as a misaligned access and zaps the shadow page.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
---
 arch/x86/kvm/mmu.c |    8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 6e39ec5..13f4d2a 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -3601,6 +3601,14 @@ static bool detect_write_misaligned(struct kvm_mmu_page *sp, gpa_t gpa,
 
 	offset = offset_in_page(gpa);
 	pte_size = sp->role.cr4_pae ? 8 : 4;
+
+	/*
+	 * Sometimes, the OS only writes the last one bytes to update status
+	 * bits, for example, in linux, andb instruction is used in clear_bit().
+	 */
+	if (!(offset & (pte_size - 1)) && bytes == 1)
+		return false;
+
 	misaligned = (offset ^ (offset + bytes - 1)) & ~(pte_size - 1);
 	misaligned |= bytes < 4;
 
-- 
1.7.5.4
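
The effect of the early return, as a worked example (a sketch reusing
the helper shape from the note after patch 09): a locked 'andb' to the
low byte of an 8-byte pte has a pte-aligned offset and bytes == 1, so
it is no longer reported as misaligned.

static unsigned misaligned_v2(unsigned offset, unsigned bytes, unsigned pte_size)
{
        if (!(offset & (pte_size - 1)) && bytes == 1)
                return 0;       /* single-byte status update of one pte */
        return ((offset ^ (offset + bytes - 1)) & ~(pte_size - 1)) |
               (bytes < 4);
}

/*
 * clear_bit() on the r/w bit -> andb on the pte's low byte:
 *   misaligned_v2(0x18, 1, 8) == 0   (nonzero before this patch,
 *                                     because bytes < 4)
 */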



* [PATCH v4 11/11] KVM: MMU: improve write-flooding detection
From: Xiao Guangrong @ 2011-09-22  8:58 UTC
  To: Avi Kivity; +Cc: Marcelo Tosatti, LKML, KVM

The current write-flooding detection does not work well: when handling
a write to a guest page table, we treat the page as write-flooded if
the last speculative spte has not been accessed. However, sptes are
speculated on many paths (pte prefetch, page sync, ...), so the last
speculative spte may not point to the written page at all, and the
written page may still be accessed via other sptes. Relying on the
Accessed bit of the last speculative spte is therefore not enough.

Instead of detecting whether the page was accessed, detect whether the
spte is accessed after being written: if a spte is written frequently
but never accessed, treat the page as no longer being a page table, or
as not having been used for a long time.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
---
 arch/x86/include/asm/kvm_host.h |    6 +--
 arch/x86/kvm/mmu.c              |   62 +++++++++++++++-----------------------
 arch/x86/kvm/paging_tmpl.h      |   12 +++----
 3 files changed, 32 insertions(+), 48 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 927ba73..9d17238 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -239,6 +239,8 @@ struct kvm_mmu_page {
 	int clear_spte_count;
 #endif
 
+	int write_flooding_count;
+
 	struct rcu_head rcu;
 };
 
@@ -353,10 +355,6 @@ struct kvm_vcpu_arch {
 	struct kvm_mmu_memory_cache mmu_page_cache;
 	struct kvm_mmu_memory_cache mmu_page_header_cache;
 
-	gfn_t last_pt_write_gfn;
-	int   last_pt_write_count;
-	u64  *last_pte_updated;
-
 	struct fpu guest_fpu;
 	u64 xcr0;
 
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 13f4d2a..77030ea 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1652,6 +1652,18 @@ static void init_shadow_page_table(struct kvm_mmu_page *sp)
 		sp->spt[i] = 0ull;
 }
 
+static void __clear_sp_write_flooding_count(struct kvm_mmu_page *sp)
+{
+	sp->write_flooding_count = 0;
+}
+
+static void clear_sp_write_flooding_count(u64 *spte)
+{
+	struct kvm_mmu_page *sp =  page_header(__pa(spte));
+
+	__clear_sp_write_flooding_count(sp);
+}
+
 static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
 					     gfn_t gfn,
 					     gva_t gaddr,
@@ -1695,6 +1707,7 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
 		} else if (sp->unsync)
 			kvm_mmu_mark_parents_unsync(sp);
 
+		__clear_sp_write_flooding_count(sp);
 		trace_kvm_mmu_get_page(sp, false);
 		return sp;
 	}
@@ -1847,15 +1860,6 @@ static void kvm_mmu_put_page(struct kvm_mmu_page *sp, u64 *parent_pte)
 	mmu_page_remove_parent_pte(sp, parent_pte);
 }
 
-static void kvm_mmu_reset_last_pte_updated(struct kvm *kvm)
-{
-	int i;
-	struct kvm_vcpu *vcpu;
-
-	kvm_for_each_vcpu(i, vcpu, kvm)
-		vcpu->arch.last_pte_updated = NULL;
-}
-
 static void kvm_mmu_unlink_parents(struct kvm *kvm, struct kvm_mmu_page *sp)
 {
 	u64 *parent_pte;
@@ -1915,7 +1919,6 @@ static int kvm_mmu_prepare_zap_page(struct kvm *kvm, struct kvm_mmu_page *sp,
 	}
 
 	sp->role.invalid = 1;
-	kvm_mmu_reset_last_pte_updated(kvm);
 	return ret;
 }
 
@@ -2360,8 +2363,6 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
 		}
 	}
 	kvm_release_pfn_clean(pfn);
-	if (speculative)
-		vcpu->arch.last_pte_updated = sptep;
 }
 
 static void nonpaging_new_cr3(struct kvm_vcpu *vcpu)
@@ -3522,13 +3523,6 @@ static void mmu_pte_write_flush_tlb(struct kvm_vcpu *vcpu, bool zap_page,
 		kvm_mmu_flush_tlb(vcpu);
 }
 
-static bool last_updated_pte_accessed(struct kvm_vcpu *vcpu)
-{
-	u64 *spte = vcpu->arch.last_pte_updated;
-
-	return !!(spte && (*spte & shadow_accessed_mask));
-}
-
 static u64 mmu_pte_write_fetch_gpte(struct kvm_vcpu *vcpu, gpa_t *gpa,
 				    const u8 *new, int *bytes)
 {
@@ -3569,22 +3563,16 @@ static u64 mmu_pte_write_fetch_gpte(struct kvm_vcpu *vcpu, gpa_t *gpa,
  * If we're seeing too many writes to a page, it may no longer be a page table,
  * or we may be forking, in which case it is better to unmap the page.
  */
-static bool detect_write_flooding(struct kvm_vcpu *vcpu, gfn_t gfn)
+static bool detect_write_flooding(struct kvm_mmu_page *sp, u64 *spte)
 {
-	bool flooded = false;
-
-	if (gfn == vcpu->arch.last_pt_write_gfn
-	    && !last_updated_pte_accessed(vcpu)) {
-		++vcpu->arch.last_pt_write_count;
-		if (vcpu->arch.last_pt_write_count >= 3)
-			flooded = true;
-	} else {
-		vcpu->arch.last_pt_write_gfn = gfn;
-		vcpu->arch.last_pt_write_count = 1;
-		vcpu->arch.last_pte_updated = NULL;
-	}
+	/*
+	 * Skip write-flooding detected for the sp whose level is 1, because
+	 * it can become unsync, then the guest page is not write-protected.
+	 */
+	if (sp->role.level == 1)
+		return false;
 
-	return flooded;
+	return ++sp->write_flooding_count >= 3;
 }
 
 /*
@@ -3656,7 +3644,7 @@ void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
 	LIST_HEAD(invalid_list);
 	u64 entry, gentry, *spte;
 	int npte;
-	bool remote_flush, local_flush, zap_page, flooded, misaligned;
+	bool remote_flush, local_flush, zap_page;
 
 	/*
 	 * If we don't have indirect shadow pages, it means no page is
@@ -3682,12 +3670,12 @@ void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
 	++vcpu->kvm->stat.mmu_pte_write;
 	trace_kvm_mmu_audit(vcpu, AUDIT_PRE_PTE_WRITE);
 
-	flooded = detect_write_flooding(vcpu, gfn);
 	mask.cr0_wp = mask.cr4_pae = mask.nxe = 1;
 	for_each_gfn_indirect_valid_sp(vcpu->kvm, sp, gfn, node) {
-		misaligned = detect_write_misaligned(sp, gpa, bytes);
+		spte = get_written_sptes(sp, gpa, &npte);
 
-		if (misaligned || flooded) {
+		if (detect_write_misaligned(sp, gpa, bytes) ||
+		      detect_write_flooding(sp, spte)) {
 			zap_page |= !!kvm_mmu_prepare_zap_page(vcpu->kvm, sp,
 						     &invalid_list);
 			++vcpu->kvm->stat.mmu_flooded;
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index 9efb860..52e9d58 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -497,6 +497,7 @@ static u64 *FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
 	     shadow_walk_next(&it)) {
 		gfn_t table_gfn;
 
+		clear_sp_write_flooding_count(it.sptep);
 		drop_large_spte(vcpu, it.sptep);
 
 		sp = NULL;
@@ -522,6 +523,7 @@ static u64 *FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
 	     shadow_walk_next(&it)) {
 		gfn_t direct_gfn;
 
+		clear_sp_write_flooding_count(it.sptep);
 		validate_direct_spte(vcpu, it.sptep, direct_access);
 
 		drop_large_spte(vcpu, it.sptep);
@@ -536,6 +538,7 @@ static u64 *FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
 		link_shadow_page(it.sptep, sp);
 	}
 
+	clear_sp_write_flooding_count(it.sptep);
 	mmu_set_spte(vcpu, it.sptep, access, gw->pte_access,
 		     user_fault, write_fault, emulate, it.level,
 		     gw->gfn, pfn, prefault, map_writable);
@@ -599,11 +602,9 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, gva_t addr, u32 error_code,
 	 */
 	if (!r) {
 		pgprintk("%s: guest page fault\n", __func__);
-		if (!prefault) {
+		if (!prefault)
 			inject_page_fault(vcpu, &walker.fault);
-			/* reset fork detector */
-			vcpu->arch.last_pt_write_count = 0;
-		}
+
 		return 0;
 	}
 
@@ -641,9 +642,6 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, gva_t addr, u32 error_code,
 	pgprintk("%s: shadow pte %p %llx emulate %d\n", __func__,
 		 sptep, *sptep, emulate);
 
-	if (!emulate)
-		vcpu->arch.last_pt_write_count = 0; /* reset fork detector */
-
 	++vcpu->stat.pf_fixed;
 	trace_kvm_mmu_audit(vcpu, AUDIT_POST_PAGE_FAULT);
 	spin_unlock(&vcpu->kvm->mmu_lock);
-- 
1.7.5.4
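
The resulting per-sp counter lifecycle, condensed (my summary of the
hunks above, not code from the patch):

/*
 * kvm_mmu_pte_write():  detect_write_flooding() reports a flood once
 *                       ++sp->write_flooding_count reaches 3, and the
 *                       sp is zapped.
 * kvm_mmu_get_page() and every level of FNAME(fetch):
 *                       clear_sp_write_flooding_count() resets it to 0.
 *
 * So a shadow page survives emulated writes only if it keeps being
 * walked, i.e. actually used for translation, in between them.
 */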



* [PATCH v4 03/11] KVM: x86: retry non-page-table writing instructions
From: Xiao Guangrong @ 2011-09-22  9:02 UTC
  To: Avi Kivity; +Cc: Marcelo Tosatti, LKML, KVM

If the emulation is caused by a #PF and the instruction is not a
page-table-writing instruction, the VM-exit was caused by shadow-page
write protection; we can zap the shadow page and retry the instruction
directly.

The idea is from Avi.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
---
 arch/x86/include/asm/kvm_emulate.h |    1 +
 arch/x86/include/asm/kvm_host.h    |    5 ++++
 arch/x86/kvm/emulate.c             |    5 ++++
 arch/x86/kvm/mmu.c                 |   25 ++++++++++++++----
 arch/x86/kvm/x86.c                 |   47 ++++++++++++++++++++++++++++++++++++
 5 files changed, 77 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h
index a026507..9a4acf4 100644
--- a/arch/x86/include/asm/kvm_emulate.h
+++ b/arch/x86/include/asm/kvm_emulate.h
@@ -364,6 +364,7 @@ enum x86_intercept {
 #endif
 
 int x86_decode_insn(struct x86_emulate_ctxt *ctxt, void *insn, int insn_len);
+bool x86_page_table_writing_insn(struct x86_emulate_ctxt *ctxt);
 #define EMULATION_FAILED -1
 #define EMULATION_OK 0
 #define EMULATION_RESTART 1
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 6ab4241..27a25df 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -443,6 +443,9 @@ struct kvm_vcpu_arch {
 
 	cpumask_var_t wbinvd_dirty_mask;
 
+	unsigned long last_retry_eip;
+	unsigned long last_retry_addr;
+
 	struct {
 		bool halted;
 		gfn_t gfns[roundup_pow_of_two(ASYNC_PF_PER_VCPU)];
@@ -689,6 +692,7 @@ enum emulation_result {
 #define EMULTYPE_NO_DECODE	    (1 << 0)
 #define EMULTYPE_TRAP_UD	    (1 << 1)
 #define EMULTYPE_SKIP		    (1 << 2)
+#define EMULTYPE_RETRY		    (1 << 3)
 int x86_emulate_instruction(struct kvm_vcpu *vcpu, unsigned long cr2,
 			    int emulation_type, void *insn, int insn_len);
 
@@ -753,6 +757,7 @@ void kvm_mmu_flush_tlb(struct kvm_vcpu *vcpu);
 void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
 		       const u8 *new, int bytes,
 		       bool guest_initiated);
+int kvm_mmu_unprotect_page(struct kvm *kvm, gfn_t gfn);
 int kvm_mmu_unprotect_page_virt(struct kvm_vcpu *vcpu, gva_t gva);
 void __kvm_mmu_free_some_pages(struct kvm_vcpu *vcpu);
 int kvm_mmu_load(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index a10950a..8547958 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -3702,6 +3702,11 @@ done:
 	return (rc != X86EMUL_CONTINUE) ? EMULATION_FAILED : EMULATION_OK;
 }
 
+bool x86_page_table_writing_insn(struct x86_emulate_ctxt *ctxt)
+{
+	return ctxt->d & PageTable;
+}
+
 static bool string_insn_completed(struct x86_emulate_ctxt *ctxt)
 {
 	/* The second termination condition only applies for REPE
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index b01afee..4e53d6b 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1997,7 +1997,7 @@ void kvm_mmu_change_mmu_pages(struct kvm *kvm, unsigned int goal_nr_mmu_pages)
 	kvm->arch.n_max_mmu_pages = goal_nr_mmu_pages;
 }
 
-static int kvm_mmu_unprotect_page(struct kvm *kvm, gfn_t gfn)
+int kvm_mmu_unprotect_page(struct kvm *kvm, gfn_t gfn)
 {
 	struct kvm_mmu_page *sp;
 	struct hlist_node *node;
@@ -2006,7 +2006,7 @@ static int kvm_mmu_unprotect_page(struct kvm *kvm, gfn_t gfn)
 
 	pgprintk("%s: looking for gfn %llx\n", __func__, gfn);
 	r = 0;
-
+	spin_lock(&kvm->mmu_lock);
 	for_each_gfn_indirect_valid_sp(kvm, sp, gfn, node) {
 		pgprintk("%s: gfn %llx role %x\n", __func__, gfn,
 			 sp->role.word);
@@ -2014,8 +2014,11 @@ static int kvm_mmu_unprotect_page(struct kvm *kvm, gfn_t gfn)
 		kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list);
 	}
 	kvm_mmu_commit_zap_page(kvm, &invalid_list);
+	spin_unlock(&kvm->mmu_lock);
+
 	return r;
 }
+EXPORT_SYMBOL_GPL(kvm_mmu_unprotect_page);
 
 static void mmu_unshadow(struct kvm *kvm, gfn_t gfn)
 {
@@ -3697,9 +3700,8 @@ int kvm_mmu_unprotect_page_virt(struct kvm_vcpu *vcpu, gva_t gva)
 
 	gpa = kvm_mmu_gva_to_gpa_read(vcpu, gva, NULL);
 
-	spin_lock(&vcpu->kvm->mmu_lock);
 	r = kvm_mmu_unprotect_page(vcpu->kvm, gpa >> PAGE_SHIFT);
-	spin_unlock(&vcpu->kvm->mmu_lock);
+
 	return r;
 }
 EXPORT_SYMBOL_GPL(kvm_mmu_unprotect_page_virt);
@@ -3720,10 +3722,18 @@ void __kvm_mmu_free_some_pages(struct kvm_vcpu *vcpu)
 	kvm_mmu_commit_zap_page(vcpu->kvm, &invalid_list);
 }
 
+static bool is_mmio_page_fault(struct kvm_vcpu *vcpu, gva_t addr)
+{
+	if (vcpu->arch.mmu.direct_map || mmu_is_nested(vcpu))
+		return vcpu_match_mmio_gpa(vcpu, addr);
+
+	return vcpu_match_mmio_gva(vcpu, addr);
+}
+
 int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gva_t cr2, u32 error_code,
 		       void *insn, int insn_len)
 {
-	int r;
+	int r, emulation_type = EMULTYPE_RETRY;
 	enum emulation_result er;
 
 	r = vcpu->arch.mmu.page_fault(vcpu, cr2, error_code, false);
@@ -3735,7 +3745,10 @@ int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gva_t cr2, u32 error_code,
 		goto out;
 	}
 
-	er = x86_emulate_instruction(vcpu, cr2, 0, insn, insn_len);
+	if (is_mmio_page_fault(vcpu, cr2))
+		emulation_type = 0;
+
+	er = x86_emulate_instruction(vcpu, cr2, emulation_type, insn, insn_len);
 
 	switch (er) {
 	case EMULATE_DONE:
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 6b37f18..727a6af 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4814,6 +4814,50 @@ static bool reexecute_instruction(struct kvm_vcpu *vcpu, gva_t gva)
 	return false;
 }
 
+static bool retry_instruction(struct x86_emulate_ctxt *ctxt,
+			      unsigned long cr2,  int emulation_type)
+{
+	struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
+	unsigned long last_retry_eip, last_retry_addr, gpa = cr2;
+
+	last_retry_eip = vcpu->arch.last_retry_eip;
+	last_retry_addr = vcpu->arch.last_retry_addr;
+
+	/*
+	 * If the emulation is caused by #PF and it is non-page_table
+	 * writing instruction, it means the VM-EXIT is caused by shadow
+	 * page protected, we can zap the shadow page and retry this
+	 * instruction directly.
+	 *
+	 * Note: if the guest uses a non-page-table modifying instruction
+	 * on the PDE that points to the instruction, then we will unmap
+	 * the instruction and go to an infinite loop. So, we cache the
+	 * last retried eip and the last fault address, if we meet the eip
+	 * and the address again, we can break out of the potential infinite
+	 * loop.
+	 */
+	vcpu->arch.last_retry_eip = vcpu->arch.last_retry_addr = 0;
+
+	if (!(emulation_type & EMULTYPE_RETRY))
+		return false;
+
+	if (x86_page_table_writing_insn(ctxt))
+		return false;
+
+	if (ctxt->eip == last_retry_eip && last_retry_addr == cr2)
+		return false;
+
+	vcpu->arch.last_retry_eip = ctxt->eip;
+	vcpu->arch.last_retry_addr = cr2;
+
+	if (!vcpu->arch.mmu.direct_map)
+		gpa = kvm_mmu_gva_to_gpa_write(vcpu, cr2, NULL);
+
+	kvm_mmu_unprotect_page(vcpu->kvm, gpa >> PAGE_SHIFT);
+
+	return true;
+}
+
 int x86_emulate_instruction(struct kvm_vcpu *vcpu,
 			    unsigned long cr2,
 			    int emulation_type,
@@ -4855,6 +4899,9 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
 		return EMULATE_DONE;
 	}
 
+	if (retry_instruction(ctxt, cr2, emulation_type))
+		return EMULATE_DONE;
+
 	/* this is needed for vmware backdoor interface to work since it
 	   changes registers values  during IO operation */
 	if (vcpu->arch.emulate_regs_need_sync_from_vcpu) {
-- 
1.7.5.4
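
The anti-livelock logic in retry_instruction(), reduced to its decision
order (my condensation of the function above):

/*
 * 1. !(emulation_type & EMULTYPE_RETRY)  -> no retry (e.g. MMIO fault)
 * 2. insn is tagged PageTable            -> no retry, keep emulating
 * 3. (eip, cr2) == last retried pair     -> no retry: the instruction
 *                                           may itself live on the page
 *                                           we keep unprotecting
 * 4. otherwise record (eip, cr2), kvm_mmu_unprotect_page(gfn), and
 *    return true so the guest re-executes the instruction natively.
 */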



* Re: [PATCH v4 00/11] KVM: x86: optimize for writing guest page
From: Marcelo Tosatti @ 2011-09-23 11:51 UTC
  To: Xiao Guangrong; +Cc: Avi Kivity, LKML, KVM

On Thu, Sep 22, 2011 at 04:52:40PM +0800, Xiao Guangrong wrote:
> This patchset is against https://github.com/avikivity/kvm.git next branch.
> 
> In this version, some changes come from Avi's comments:
> - fix instruction retried for nested guest
> - skip write-flooding for the sp whose level is 1
> - rename some functions

Looks good to me.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 00/11] KVM: x86: optimize for writing guest page
  2011-09-23 11:51 ` [PATCH v4 00/11] KVM: x86: optimize for writing guest page Marcelo Tosatti
@ 2011-09-30  3:49   ` Xiao Guangrong
  2011-10-05 13:25   ` Avi Kivity
  1 sibling, 0 replies; 26+ messages in thread
From: Xiao Guangrong @ 2011-09-30  3:49 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: Avi Kivity, LKML, KVM

On 09/23/2011 07:51 PM, Marcelo Tosatti wrote:
> On Thu, Sep 22, 2011 at 04:52:40PM +0800, Xiao Guangrong wrote:
>> This patchset is against https://github.com/avikivity/kvm.git next branch.
>>
>> In this version, some changes come from Avi's comments:
>> - fix instruction retried for nested guest
>> - skip write-flooding for the sp whose level is 1
>> - rename some functions
> 
> Looks good to me.
> 

Ping......


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 00/11] KVM: x86: optimize for writing guest page
  2011-09-23 11:51 ` [PATCH v4 00/11] KVM: x86: optimize for writing guest page Marcelo Tosatti
  2011-09-30  3:49   ` Xiao Guangrong
@ 2011-10-05 13:25   ` Avi Kivity
  1 sibling, 0 replies; 26+ messages in thread
From: Avi Kivity @ 2011-10-05 13:25 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: Xiao Guangrong, LKML, KVM

On 09/23/2011 02:51 PM, Marcelo Tosatti wrote:
> On Thu, Sep 22, 2011 at 04:52:40PM +0800, Xiao Guangrong wrote:
> >  This patchset is against https://github.com/avikivity/kvm.git next branch.
> >
> >  In this version, some changes come from Avi's comments:
> >  - fix instruction retried for nested guest
> >  - skip write-flooding for the sp whose level is 1
> >  - rename some functions
>
> Looks good to me.
>

To me as well.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 00/11] KVM: x86: optimize for writing guest page
  2011-09-22  8:52 [PATCH v4 00/11] KVM: x86: optimize for writing guest page Xiao Guangrong
                   ` (11 preceding siblings ...)
  2011-09-23 11:51 ` [PATCH v4 00/11] KVM: x86: optimize for writing guest page Marcelo Tosatti
@ 2011-10-06 17:50 ` Marcelo Tosatti
  2011-10-06 17:53 ` Marcelo Tosatti
  13 siblings, 0 replies; 26+ messages in thread
From: Marcelo Tosatti @ 2011-10-06 17:50 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Avi Kivity, LKML, KVM

On Thu, Sep 22, 2011 at 04:52:40PM +0800, Xiao Guangrong wrote:
> This patchset is against https://github.com/avikivity/kvm.git next branch.
> 
> In this version, some changes come from Avi's comments:
> - fix instruction retried for nested guest
> - skip write-flooding for the sp whose level is 1
> - rename some functions

Please rebase.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 00/11] KVM: x86: optimize for writing guest page
  2011-09-22  8:52 [PATCH v4 00/11] KVM: x86: optimize for writing guest page Xiao Guangrong
                   ` (12 preceding siblings ...)
  2011-10-06 17:50 ` Marcelo Tosatti
@ 2011-10-06 17:53 ` Marcelo Tosatti
  2011-10-08  4:06   ` Xiao Guangrong
  13 siblings, 1 reply; 26+ messages in thread
From: Marcelo Tosatti @ 2011-10-06 17:53 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Avi Kivity, LKML, KVM

On Thu, Sep 22, 2011 at 04:52:40PM +0800, Xiao Guangrong wrote:
> This patchset is against https://github.com/avikivity/kvm.git next branch.
> 
> In this version, some changes come from Avi's comments:
> - fix instruction retried for nested guest
> - skip write-flooding for the sp whose level is 1
> - rename some functions

Please rebase.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 00/11] KVM: x86: optimize for writing guest page
  2011-10-06 17:53 ` Marcelo Tosatti
@ 2011-10-08  4:06   ` Xiao Guangrong
  2011-10-09 12:24     ` Avi Kivity
  0 siblings, 1 reply; 26+ messages in thread
From: Xiao Guangrong @ 2011-10-08  4:06 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: Avi Kivity, LKML, KVM

On 10/07/2011 01:53 AM, Marcelo Tosatti wrote:
> On Thu, Sep 22, 2011 at 04:52:40PM +0800, Xiao Guangrong wrote:
>> This patchset is against https://github.com/avikivity/kvm.git next branch.
>>
>> In this version, some changes come from Avi's comments:
>> - fix instruction retried for nested guest
>> - skip write-flooding for the sp whose level is 1
>> - rename some functions
> 
> Please rebase.
> 
> 

Marcelo,

These patches apply without any conflict, and they also work well;
the current code was pulled from the https://github.com/avikivity/kvm.git next branch.

What problem did you run into when you applied them? :(

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 00/11] KVM: x86: optimize for writing guest page
  2011-10-08  4:06   ` Xiao Guangrong
@ 2011-10-09 12:24     ` Avi Kivity
  2011-10-09 13:37       ` Avi Kivity
  0 siblings, 1 reply; 26+ messages in thread
From: Avi Kivity @ 2011-10-09 12:24 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Marcelo Tosatti, LKML, KVM

On 10/08/2011 06:06 AM, Xiao Guangrong wrote:
> On 10/07/2011 01:53 AM, Marcelo Tosatti wrote:
> >  On Thu, Sep 22, 2011 at 04:52:40PM +0800, Xiao Guangrong wrote:
> >>  This patchset is against https://github.com/avikivity/kvm.git next branch.
> >>
> >>  In this version, some changes come from Avi's comments:
> >>  - fix instruction retried for nested guest
> >>  - skip write-flooding for the sp whose level is 1
> >>  - rename some functions
> >
> >  Please rebase.
> >
> >
>
> Marcelo,
>
> These patches can be applied without any conflict and it also works well,
> the current code was pulled from https://github.com/avikivity/kvm.git next branch.
>
> What problem did you meet when you applied these? :(

I guess it was a user error - it applies cleanly here too (and pushed to 
next, thanks).

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 00/11] KVM: x86: optimize for writing guest page
  2011-10-09 12:24     ` Avi Kivity
@ 2011-10-09 13:37       ` Avi Kivity
  2011-10-11  8:36         ` Xiao Guangrong
  0 siblings, 1 reply; 26+ messages in thread
From: Avi Kivity @ 2011-10-09 13:37 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Marcelo Tosatti, LKML, KVM

On 10/09/2011 02:24 PM, Avi Kivity wrote:
> On 10/08/2011 06:06 AM, Xiao Guangrong wrote:
>> On 10/07/2011 01:53 AM, Marcelo Tosatti wrote:
>> >  On Thu, Sep 22, 2011 at 04:52:40PM +0800, Xiao Guangrong wrote:
>> >>  This patchset is against https://github.com/avikivity/kvm.git 
>> next branch.
>> >>
>> >>  In this version, some changes come from Avi's comments:
>> >>  - fix instruction retried for nested guest
>> >>  - skip write-flooding for the sp whose level is 1
>> >>  - rename some functions
>> >
>> >  Please rebase.
>> >
>> >
>>
>> Marcelo,
>>
>> These patches can be applied without any conflict and it also works 
>> well,
>> the current code was pulled from https://github.com/avikivity/kvm.git 
>> next branch.
>>
>> What problem did you meet when you applied these? :(
>
> I guess it was a user error - it applies cleanly here too (and pushed 
> to next, thanks).
>

However, it seems to reduce performance.

Autotest results before:

Test                                        Status     Seconds  Info
----                                        ------     -------  ----
         (Result file: ../../results/default/status)
unittest                                    GOOD       147      completed successfully
Fedora.9.32.install.smp2                    GOOD       865      completed successfully
Fedora.9.32.boot.smp2                       GOOD       46       completed successfully
Fedora.9.32.reboot.smp2                     GOOD       49       completed successfully
Fedora.9.32.shutdown.smp2                   GOOD       15       completed successfully
Fedora.9.64.install.smp2                    GOOD       943      completed successfully
Fedora.9.64.boot.smp2                       GOOD       47       completed successfully
Fedora.9.64.reboot.smp2                     GOOD       48       completed successfully
Fedora.9.64.shutdown.smp2                   GOOD       14       completed successfully
WinXP.32.install.smp2                       GOOD       772      completed successfully
WinXP.32.setup.smp2                         GOOD       53       completed successfully
WinXP.32.boot.smp2                          GOOD       57       completed successfully
WinXP.32.reboot.smp2                        GOOD       34       completed successfully
WinXP.32.shutdown.smp2                      GOOD       5        completed successfully
WinXP.64.install.smp2                       GOOD       636      completed successfully


After:

unittest                                    GOOD       150      completed successfully
Fedora.9.32.install.smp2                    GOOD       879      completed successfully
Fedora.9.32.boot.smp2                       GOOD       50       completed successfully
Fedora.9.32.reboot.smp2                     GOOD       48       completed successfully
Fedora.9.32.shutdown.smp2                   GOOD       15       completed successfully
Fedora.9.64.install.smp2                    GOOD       997      completed successfully
Fedora.9.64.boot.smp2                       GOOD       47       completed successfully
Fedora.9.64.reboot.smp2                     GOOD       48       completed successfully
Fedora.9.64.shutdown.smp2                   GOOD       14       completed successfully
WinXP.32.install.smp2                       GOOD       764      completed successfully
WinXP.32.setup.smp2                         GOOD       51       completed successfully
WinXP.32.boot.smp2                          GOOD       40       completed successfully
WinXP.32.reboot.smp2                        GOOD       34       completed successfully
WinXP.32.shutdown.smp2                      GOOD       5        completed successfully
WinXP.64.install.smp2                       GOOD       666      completed successfully


-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 00/11] KVM: x86: optimize for writing guest page
  2011-10-09 13:37       ` Avi Kivity
@ 2011-10-11  8:36         ` Xiao Guangrong
  2011-11-04  9:16           ` Xiao Guangrong
  0 siblings, 1 reply; 26+ messages in thread
From: Xiao Guangrong @ 2011-10-11  8:36 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Marcelo Tosatti, LKML, KVM

On 10/09/2011 09:37 PM, Avi Kivity wrote:
> On 10/09/2011 02:24 PM, Avi Kivity wrote:
>> On 10/08/2011 06:06 AM, Xiao Guangrong wrote:
>>> On 10/07/2011 01:53 AM, Marcelo Tosatti wrote:
>>> >  On Thu, Sep 22, 2011 at 04:52:40PM +0800, Xiao Guangrong wrote:
>>> >>  This patchset is against https://github.com/avikivity/kvm.git next branch.
>>> >>
>>> >>  In this version, some changes come from Avi's comments:
>>> >>  - fix instruction retried for nested guest
>>> >>  - skip write-flooding for the sp whose level is 1
>>> >>  - rename some functions
>>> >
>>> >  Please rebase.
>>> >
>>> >
>>>
>>> Marcelo,
>>>
>>> These patches can be applied without any conflict and it also works well,
>>> the current code was pulled from https://github.com/avikivity/kvm.git next branch.
>>>
>>> What problem did you meet when you applied these? :(
>>
>> I guess it was a user error - it applies cleanly here too (and pushed to next, thanks).
>>
> 
> However, it seems to reduce performance.
> 

Ouch, will look into it soon.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 00/11] KVM: x86: optimize for writing guest page
  2011-10-11  8:36         ` Xiao Guangrong
@ 2011-11-04  9:16           ` Xiao Guangrong
  2011-11-06 15:35             ` Avi Kivity
  0 siblings, 1 reply; 26+ messages in thread
From: Xiao Guangrong @ 2011-11-04  9:16 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Avi Kivity, Marcelo Tosatti, LKML, KVM

On 10/11/2011 04:36 PM, Xiao Guangrong wrote:
> On 10/09/2011 09:37 PM, Avi Kivity wrote:
>> On 10/09/2011 02:24 PM, Avi Kivity wrote:
>>> On 10/08/2011 06:06 AM, Xiao Guangrong wrote:
>>>> On 10/07/2011 01:53 AM, Marcelo Tosatti wrote:
>>>>>   On Thu, Sep 22, 2011 at 04:52:40PM +0800, Xiao Guangrong wrote:
>>>>>>   This patchset is against https://github.com/avikivity/kvm.git next branch.
>>>>>>
>>>>>>   In this version, some changes come from Avi's comments:
>>>>>>   - fix instruction retried for nested guest
>>>>>>   - skip write-flooding for the sp whose level is 1
>>>>>>   - rename some functions
>>>>>
>>>>>   Please rebase.
>>>>>
>>>>>
>>>>
>>>> Marcelo,
>>>>
>>>> These patches can be applied without any conflict and it also works well,
>>>> the current code was pulled from https://github.com/avikivity/kvm.git next branch.
>>>>
>>>> What problem did you meet when you applied these? :(
>>>
>>> I guess it was a user error - it applies cleanly here too (and pushed to next, thanks).
>>>
>>
>> However, it seems to reduce performance.
>>
>
> Ouch, will look into it soon.

Hi Avi,

I have run kernbench several times on my desktop, and the results
look good:

before patchset:
real 212.27
real 213.47
real 204.99
real 200.58
real 199.99
real 199.94
real 201.51
real 199.83
real 198.19
real 205.13

after patchset:
real 199.90
real 201.89
real 194.54
real 188.71
real 185.75
real 187.70
real 188.99
real 188.53
real 186.29
real 188.25

I will test it on our server using kvm-autotest; could you share your
config file with me, please?

^ permalink raw reply	[flat|nested] 26+ messages in thread
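
For reference, the ten runs above average roughly 203.6s before the patchset
and 191.1s after, about a 6% improvement in wall time. A throwaway sketch of
that calculation (the numbers are copied from the runs quoted above; nothing
here is part of the patchset):

#include <stdio.h>

int main(void)
{
	/* kernbench "real" times quoted above, in seconds */
	static const double before[] = { 212.27, 213.47, 204.99, 200.58, 199.99,
					 199.94, 201.51, 199.83, 198.19, 205.13 };
	static const double after[]  = { 199.90, 201.89, 194.54, 188.71, 185.75,
					 187.70, 188.99, 188.53, 186.29, 188.25 };
	double sum_before = 0.0, sum_after = 0.0;
	int i, n = 10;

	for (i = 0; i < n; i++) {
		sum_before += before[i];
		sum_after += after[i];
	}
	/* Prints roughly: before 203.59  after 191.06  speedup 6.2% */
	printf("before %.2f  after %.2f  speedup %.1f%%\n",
	       sum_before / n, sum_after / n,
	       100.0 * (1.0 - sum_after / sum_before));
	return 0;
}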

* Re: [PATCH v4 00/11] KVM: x86: optimize for writing guest page
  2011-11-04  9:16           ` Xiao Guangrong
@ 2011-11-06 15:35             ` Avi Kivity
  2011-11-10 13:28               ` Xiao Guangrong
  0 siblings, 1 reply; 26+ messages in thread
From: Avi Kivity @ 2011-11-06 15:35 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Xiao Guangrong, Marcelo Tosatti, LKML, KVM

On 11/04/2011 11:16 AM, Xiao Guangrong wrote:
>
> I have done kernbench tests several times on my desktop, but it shows
> very well:
>
> before patchset:
> real 212.27
> real 213.47
> real 204.99
> real 200.58
> real 199.99
> real 199.94
> real 201.51
> real 199.83
> real 198.19
> real 205.13
>
> after patchset:
> real 199.90
> real 201.89
> real 194.54
> real 188.71
> real 185.75
> real 187.70
> real 188.99
> real 188.53
> real 186.29
> real 188.25
>
> I will test it on our server using kvm-autotest, could you share me
> your config file please?


# Copy this file to tests.cfg and edit it.
#
# This file contains the test set definitions. Define your test sets here.

include base.cfg
include subtests.cfg
include guest-os.cfg
include cdkeys.cfg

extra_params += ' -enable-kvm'
image_name(_.*)? ?<= /images/autotest/
#cdrom(_.*)? ?<= isos/

# Modify/comment the following lines if you wish to modify
# the paths of the image files, ISO files, step files or qemu binaries.
#
# As for the defaults:
# * qemu and qemu-img are expected to be found under /usr/bin/qemu-kvm and
#   /usr/bin/qemu-img respectively.
# * All image files are expected under /tmp/kvm_autotest_root/images/
# * All iso files are expected under /tmp/kvm_autotest_root/isos/
# * All step files are expected under /tmp/kvm_autotest_root/steps/
#image_name.* ?<= images/
#cdrom.* ?<= isos/

drive_cache = writeback
# -no-kvm-irqchip
#timeout_multiplier = 8

#iterations = 30

vga = cirrus

# Here are the test set variants. The variant 'qemu_kvm_windows_quick' is
# fully commented; the following ones have comments only on noteworthy
# points.
variants:
    - @avi:
        only no_pci_assignable
        only qcow2
        only ide
        #only default
        only smp2
        #only up
        only Fedora.9.32 Fedora.9.64 WinVista.64sp1 WinXP
        only install setup boot reboot migrate shutdown
        only rtl8139
        only smallpages
        #only default_host   

no migrate.exec

# Uncomment the following lines to enable abort-on-error mode:
# abort_on_error = yes
kill_vm.* ?= no
kill_unresponsive_vms.* ?= no

WinXP.64:
    no shutdown
    no reboot

Win2003.64:
    no shutdown
    no reboot

# Choose your test list from the testsets defined
only avi

pci_assignable = no

serial_console = no


-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 00/11] KVM: x86: optimize for writing guest page
  2011-11-06 15:35             ` Avi Kivity
@ 2011-11-10 13:28               ` Xiao Guangrong
  2011-11-10 14:05                 ` Avi Kivity
  0 siblings, 1 reply; 26+ messages in thread
From: Xiao Guangrong @ 2011-11-10 13:28 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Xiao Guangrong, Marcelo Tosatti, LKML, KVM

[-- Attachment #1: Type: text/plain, Size: 1075 bytes --]

On 11/06/2011 11:35 PM, Avi Kivity wrote:
> On 11/04/2011 11:16 AM, Xiao Guangrong wrote:
>>
>> I have done kernbench tests several times on my desktop, but it shows
>> very well:
>>
>> before patchset:
>> real 212.27
>> real 213.47
>> real 204.99
>> real 200.58
>> real 199.99
>> real 199.94
>> real 201.51
>> real 199.83
>> real 198.19
>> real 205.13
>>
>> after patchset:
>> real 199.90
>> real 201.89
>> real 194.54
>> real 188.71
>> real 185.75
>> real 187.70
>> real 188.99
>> real 188.53
>> real 186.29
>> real 188.25
>>
>> I will test it on our server using kvm-autotest, could you share me
>> your config file please?
>
>
> # Copy this file to tests.cfg and edit it.
> #
> # This file contains the test set definitions. Define your test sets here.
>

Thanks Avi!

I have tested RHEL.6.1 setup/boot/reboot/shutdown and the complete output of 
scan_results.py is attached.

The results show that performance is improved:
Before:			After:
570			529
555			538
552			531
546			528
553			559
553			527
550			523
553			533
547			538
550			526

What do you think about it? :)

[-- Attachment #2: result.tar.bz2 --]
[-- Type: application/x-bzip, Size: 737 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 00/11] KVM: x86: optimize for writing guest page
  2011-11-10 13:28               ` Xiao Guangrong
@ 2011-11-10 14:05                 ` Avi Kivity
  2011-11-11  3:42                   ` Xiao Guangrong
  0 siblings, 1 reply; 26+ messages in thread
From: Avi Kivity @ 2011-11-10 14:05 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Xiao Guangrong, Marcelo Tosatti, LKML, KVM

On 11/10/2011 03:28 PM, Xiao Guangrong wrote:
>
> I have tested RHEL.6.1 setup/boot/reboot/shutdown and the complete
> output of scan_results.py is attached.
>
> The result shows the performance is improved:
> before:            After:
> 570            529
> 555            538
> 552            531
> 546            528
> 553            559
> 553            527
> 550            523
> 553            533
> 547            538
> 550            526
>
> How do you think about it? :)

Well, either I was sloppy in my measurements, or maybe RHEL 6 is very
different from F9 (unlikely).  I'll measure it again and see.

btw, this is with ept=0, yes?

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 00/11] KVM: x86: optimize for writing guest page
  2011-11-10 14:05                 ` Avi Kivity
@ 2011-11-11  3:42                   ` Xiao Guangrong
  0 siblings, 0 replies; 26+ messages in thread
From: Xiao Guangrong @ 2011-11-11  3:42 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Xiao Guangrong, Marcelo Tosatti, LKML, KVM

On 11/10/2011 10:05 PM, Avi Kivity wrote:
> On 11/10/2011 03:28 PM, Xiao Guangrong wrote:
>>
>> I have tested RHEL.6.1 setup/boot/reboot/shutdown and the complete
>> output of scan_results.py is attached.
>>
>> The result shows the performance is improved:
>> before:            After:
>> 570            529
>> 555            538
>> 552            531
>> 546            528
>> 553            559
>> 553            527
>> 550            523
>> 553            533
>> 547            538
>> 550            526
>>
>> How do you think about it? :)
>
> Well, either I was sloppy in my measurements, or maybe RHEL 6 is very
> different from F9 (unlikely).  I'll measure it again and see.
>

Thanks for your time. :)

> btw, this is with ept=0, yes?
>

Yeah.

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2011-11-11  3:42 UTC | newest]

Thread overview: 26+ messages
2011-09-22  8:52 [PATCH v4 00/11] KVM: x86: optimize for writing guest page Xiao Guangrong
2011-09-22  8:53 ` [PATCH v4 01/11] KVM: MMU: avoid pte_list_desc running out in kvm_mmu_pte_write Xiao Guangrong
2011-09-22  8:53 ` [PATCH v4 02/11] KVM: x86: tag the instructions which are used to write page table Xiao Guangrong
2011-09-22  8:55 ` [PATCH v4 04/11] KVM: x86: cleanup port-in/port-out emulated Xiao Guangrong
2011-09-22  8:55 ` [PATCH v4 05/11] KVM: MMU: do not mark accessed bit on pte write path Xiao Guangrong
2011-09-22  8:56 ` [PATCH v4 06/11] KVM: MMU: cleanup FNAME(invlpg) Xiao Guangrong
2011-09-22  8:56 ` [PATCH v4 07/11] KVM: MMU: fast prefetch spte on invlpg path Xiao Guangrong
2011-09-22  8:56 ` [PATCH v4 08/11] KVM: MMU: remove unnecessary kvm_mmu_free_some_pages Xiao Guangrong
2011-09-22  8:57 ` [PATCH v4 09/11] KVM: MMU: split kvm_mmu_pte_write function Xiao Guangrong
2011-09-22  8:57 ` [PATCH v4 10/11] KVM: MMU: fix detecting misaligned accessed Xiao Guangrong
2011-09-22  8:58 ` [PATCH v4 11/11] KVM: MMU: improve write flooding detected Xiao Guangrong
2011-09-22  9:02 ` [PATCH v4 03/11] KVM: x86: retry non-page-table writing instructions Xiao Guangrong
2011-09-23 11:51 ` [PATCH v4 00/11] KVM: x86: optimize for writing guest page Marcelo Tosatti
2011-09-30  3:49   ` Xiao Guangrong
2011-10-05 13:25   ` Avi Kivity
2011-10-06 17:50 ` Marcelo Tosatti
2011-10-06 17:53 ` Marcelo Tosatti
2011-10-08  4:06   ` Xiao Guangrong
2011-10-09 12:24     ` Avi Kivity
2011-10-09 13:37       ` Avi Kivity
2011-10-11  8:36         ` Xiao Guangrong
2011-11-04  9:16           ` Xiao Guangrong
2011-11-06 15:35             ` Avi Kivity
2011-11-10 13:28               ` Xiao Guangrong
2011-11-10 14:05                 ` Avi Kivity
2011-11-11  3:42                   ` Xiao Guangrong
