On Tue, 3 Jul 2012 04:55:35 -0300, Marcelo Tosatti wrote:
> On Mon, Jun 04, 2012 at 10:37:24AM +0530, Nikunj A. Dadhania wrote:
> > flush_tlb_others_ipi depends on lot of statics in tlb.c. Replicated
> > the flush_tlb_others_ipi as kvm_flush_tlb_others to further adapt to
> > paravirtualization.
> >
> > Use the vcpu state information inside the kvm_flush_tlb_others to
> > avoid sending ipi to pre-empted vcpus.
> >
> > * Do not send ipi's to offline vcpus and set flush_on_enter flag
> > * For online vcpus: Wait for them to clear the flag
> >
> > The approach was discussed here: https://lkml.org/lkml/2012/2/20/157
> >
> > Suggested-by: Peter Zijlstra
> > Signed-off-by: Nikunj A. Dadhania
> >
> > --
> > Pseudo Algo:
> >
> >    Write()
> >    ======
> >
> >       guest_exit()
> >           flush_on_enter[i] = 0;
> >           running[i] = 0;
> >
> >       guest_enter()
> >           running[i] = 1;
> >           smp_mb();
> >           if (flush_on_enter[i]) {
> >               tlb_flush()
> >               flush_on_enter[i] = 0;
> >           }
> >
> >
> >    Read()
> >    ======
> >
> >       GUEST                                   KVM-HV
> >
> >       f->flushcpumask = cpumask - me;
> >
> > again:
> >       for_each_cpu(i, f->flushmask) {
> >
> >           if (!running[i]) {
> >                                               case 1:
> >
> >                                               running[n] = 1
> >
> >                                               (cpuN does not see
> >                                               flush_on_enter set,
> >                                               guest later finds it
> >                                               running and sends ipi,
> >                                               we are fine here, need
> >                                               to clear the flag on
> >                                               guest_exit)
> >
> >               flush_on_enter[i] = 1;
> >                                               case 2:
> >
> >                                               running[n] = 1
> >                                               (cpuN - will see flush
> >                                               on enter and an IPI as
> >                                               well - addressed in patch-4)
> >
> >               if (!running[i])
> >                   cpu_clear(f->flushmask);    All is well, vm_enter
> >                                               will do the fixup
> >           }
> >                                               case 3:
> >                                               running[n] = 0;
> >
> >                                               (cpuN went to sleep,
> >                                               we saw it as awake,
> >                                               ipi sent, but wait
> >                                               will break without
> >                                               zero_mask and goto
> >                                               again will take care)
> >
> >       }
> >       send_ipi(f->flushmask)
> >
> >       wait_a_while_for_zero_mask();
> >
> >       if (!zero_mask)
> >           goto again;
>
> Can you please measure increased vmentry/vmexit overhead? x86/vmexit.c
> of git://git.kernel.org/pub/scm/virt/kvm/kvm-unit-tests.git should
> help.

Please find below the results for 1 and 4 vcpus (a debug patch that enables
registration of kvm_vcpu_state is attached). The tests were started with the
following command:

  /usr/libexec/qemu-kvm -smp $i -device testdev,chardev=testlog \
      -chardev file,id=testlog,path=vmexit.out -serial stdio \
      -kernel ./x86/vmexit.flat

Machine: IBM xSeries with Intel(R) Xeon(R) X7560 2.27GHz, 32 cores,
32 online cpus and 4*64GB RAM.
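For context on what the patched host does differently in these runs, here is a
minimal userspace C sketch of the flush-on-enter handshake the quoted pseudo-algo
describes. The structure/field names, the simplified kvm_flush_tlb_others()
signature, and the printf-modelled IPIs/flushes are illustrative assumptions for
this sketch, not the actual patch code:

/*
 * Illustrative model of the flush-on-enter handshake from the pseudo-algo
 * above.  All names here are made up for the sketch.
 */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

#define NR_VCPUS 4

struct vcpu_state {
	atomic_int running;		/* 1 while the vcpu is in guest mode */
	atomic_int flush_on_enter;	/* deferred TLB flush request */
};

static struct vcpu_state vs[NR_VCPUS];

/* Host side: vcpu i is scheduled out (guest_exit in the pseudo-algo). */
static void guest_exit(int i)
{
	atomic_store(&vs[i].flush_on_enter, 0);	/* clear any pending request, as in the pseudo-algo */
	atomic_store(&vs[i].running, 0);
}

/* Host side: vcpu i is scheduled back in (guest_enter in the pseudo-algo). */
static void guest_enter(int i)
{
	atomic_store(&vs[i].running, 1);
	/* Mirrors smp_mb(); redundant with seq_cst atomics, kept for clarity. */
	atomic_thread_fence(memory_order_seq_cst);
	if (atomic_exchange(&vs[i].flush_on_enter, 0))
		printf("vcpu %d: flushing TLB before entry\n", i);
}

/*
 * Guest side: flush remote TLBs.  Running vcpus get an IPI (modelled by a
 * printf); preempted vcpus only get flush_on_enter set and are dropped from
 * the mask, so no IPI is wasted on them.
 */
static void kvm_flush_tlb_others(bool *mask)
{
	for (int i = 0; i < NR_VCPUS; i++) {
		if (!mask[i])
			continue;
		if (!atomic_load(&vs[i].running)) {
			atomic_store(&vs[i].flush_on_enter, 1);
			/* Re-check: the vcpu may have woken up meanwhile. */
			if (!atomic_load(&vs[i].running)) {
				mask[i] = false;	/* guest_enter() will do the flush */
				continue;
			}
		}
		printf("sending flush IPI to vcpu %d\n", i);
	}
}

int main(void)
{
	bool mask[NR_VCPUS] = { false, true, true, true };

	guest_enter(1);			/* vcpu 1 is in guest mode */
	/* vcpus 2 and 3 stay preempted: running == 0 */

	kvm_flush_tlb_others(mask);	/* IPIs only vcpu 1 */
	guest_enter(2);			/* vcpu 2 resumes and flushes on entry */
	guest_exit(1);
	return 0;
}

The point being measured below is the cost of maintaining this state on every
vmentry/vmexit, in exchange for never busy-waiting on or IPI'ing a descheduled
vcpu.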
x  base  - unpatched host kernel
+  wo_vs - patched host kernel, vcpu_state not registered
*  w_vs  - patched host kernel and vcpu_state registered

1 vcpu results:
---------------

cpuid
=====
          N        Avg     Stddev
x        10     2135.1    17.8975
+        10       2188    18.3666
*        10     2448.9    43.9910

vmcall
======
          N        Avg     Stddev
x        10     2025.5    38.1641
+        10     2047.5    24.8205
*        10     2306.2    40.3066

mov_from_cr8
============
          N        Avg     Stddev
x        10         12     0.0000
+        10         12     0.0000
*        10         12     0.0000

mov_to_cr8
==========
          N        Avg     Stddev
x        10       19.4     0.5164
+        10       19.1     0.3162
*        10       19.2     0.4216

inl_from_pmtimer
================
          N        Avg     Stddev
x        10    18093.2   462.0543
+        10    16579.7  1448.8892
*        10    18577.7   266.2676

ple-round-robin
===============
          N        Avg     Stddev
x        10       16.1     0.3162
+        10       16.2     0.4216
*        10       15.3     0.4830

4 vcpus results:
----------------

cpuid
=====
          N        Avg     Stddev
x        10     2135.8    10.0642
+        10       2165     6.4118
*        10     2423.7    12.5526

vmcall
======
          N        Avg     Stddev
x        10     2028.3    19.6641
+        10     2024.7     7.2273
*        10     2276.1    13.8680

mov_from_cr8
============
          N        Avg     Stddev
x        10         12     0.0000
+        10         12     0.0000
*        10         12     0.0000

mov_to_cr8
==========
          N        Avg     Stddev
x        10         19     0.0000
+        10         19     0.0000
*        10         19     0.0000

inl_from_pmtimer
================
          N        Avg     Stddev
x        10    25574.2  1693.5374
+        10    25190.7  2219.9223
*        10      23044  1230.8737

ipi
===
          N        Avg     Stddev
x        20   31996.75  7290.1777
+        20   33683.25  9795.1601
*        20    34563.5  8338.7826

ple-round-robin
===============
          N        Avg     Stddev
x        10     6281.7  1543.8601
+        10     6149.8  1207.7928
*        10     6433.3  2304.5377

Thanks
Nikunj