* Windows Server 2008R2 KVM guest performance issues
@ 2013-08-26 19:15 Brian Rak
  2013-08-26 22:01 ` Brian Rak
  2013-08-27  7:18 ` Paolo Bonzini
  0 siblings, 2 replies; 10+ messages in thread
From: Brian Rak @ 2013-08-26 19:15 UTC (permalink / raw)
  To: kvm

I've been trying to track down the cause of some serious performance 
issues with a Windows 2008R2 KVM guest.  So far, I've been unable to 
determine what exactly is causing the issue.

When the guest is under load, I see very high kernel CPU usage, as well 
as terrible guest performance.  The workload on the guest is 
approximately 1/4 of what we'd run unvirtualized on the same hardware.  
Even at that level, we max out every vCPU in the guest. While the guest 
runs, I see very high kernel CPU usage (based on `htop` output).


Host setup:
Linux nj1058 3.10.8-1.el6.elrepo.x86_64 #1 SMP Tue Aug 20 18:48:29 EDT 
2013 x86_64 x86_64 x86_64 GNU/Linux
CentOS 6
qemu 1.6.0
2x Intel E5-2630 (virtualization extensions turned on, total of 24 cores 
including hyperthread cores)
24GB memory
swap file is enabled, but unused

Guest setup:
Windows Server 2008R2 (64 bit)
24 vCPUs
16 GB memory
VirtIO disk and network drivers installed
/qemu16/bin/qemu-system-x86_64 -name VMID100 -S -machine 
pc-i440fx-1.6,accel=kvm,usb=off -cpu 
host,hv_relaxed,hv_vapic,hv_spinlocks=0x1000 -m 15259 -smp 
24,sockets=1,cores=12,threads=2 -uuid 
90301200-8d47-6bb3-0623-bed7c8b1dd7c -no-user-config -nodefaults 
-chardev 
socket,id=charmonitor,path=/libvirt111/var/lib/libvirt/qemu/VMID100.monitor,server,nowait 
-mon chardev=charmonitor,id=monitor,mode=readline -rtc 
base=utc,driftfix=slew -no-hpet -boot c -usb -drive 
file=/dev/vmimages/VMID100,if=none,id=drive-virtio-disk0,format=raw,cache=writeback,aio=native 
-device 
virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 
-drive if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw 
-device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 
-netdev tap,fd=18,id=hostnet0,vhost=on,vhostfd=19 -device 
virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:00:2c:6d,bus=pci.0,addr=0x3 
-vnc 127.0.0.1:100 -k en-us -vga cirrus -device 
virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5

The beginning of `perf top` output:

Samples: 62M of event 'cycles', Event count (approx.): 642019289177
  64.69%  [kernel]                    [k] _raw_spin_lock
   2.59%  qemu-system-x86_64          [.] 0x00000000001e688d
   1.90%  [kernel]                    [k] native_write_msr_safe
   0.84%  [kvm]                       [k] vcpu_enter_guest
   0.80%  [kernel]                    [k] __schedule
   0.77%  [kvm_intel]                 [k] vmx_vcpu_run
   0.68%  [kernel]                    [k] effective_load
   0.65%  [kernel]                    [k] update_cfs_shares
   0.62%  [kernel]                    [k] _raw_spin_lock_irq
   0.61%  [kernel]                    [k] native_read_msr_safe
   0.56%  [kernel]                    [k] enqueue_entity

I've captured 20,000 lines of kvm trace output.  This can be found 
https://gist.github.com/devicenull/fa8f49d4366060029ee4/raw/fb89720d34b43920be22e3e9a1d88962bf305da8/trace
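
For reference, a trace like this can be gathered from the kvm tracepoints 
under debugfs (a minimal sketch, assuming debugfs is mounted at 
/sys/kernel/debug; not necessarily how the trace above was produced):

  # enable all kvm tracepoints, capture ~20,000 lines, then disable them again
  echo 1 > /sys/kernel/debug/tracing/events/kvm/enable
  head -n 20000 /sys/kernel/debug/tracing/trace_pipe > trace
  echo 0 > /sys/kernel/debug/tracing/events/kvm/enable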

So far, I've tried the following with very little effect:
* Disable HPET on the guest
* Enable hv_relaxed, hv_vapic, hv_spinlocks
* Enable SR-IOV
* Pin vCPUs to physical CPUs (see the sketch after this list)
* Forcing x2apic enabled in the guest (bcdedit /set x2apicpolicy yes)
* bcdedit /set useplatformclock yes and no
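
For reference, the vCPU pinning above can be done through libvirt with 
something like this (just a sketch; it assumes the libvirt domain is named 
VMID100, as in the command line above, and a simple 1:1 vCPU-to-host-CPU 
layout):

  # pin vCPU N to host CPU N (add --config to make the pinning persistent)
  for i in $(seq 0 23); do
      virsh vcpupin VMID100 $i $i
  done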


Any suggestions as to what I can do to get better performance out of this 
guest?  Or reasons why I'm seeing such high kernel CPU usage with it?


* Re: Windows Server 2008R2 KVM guest performance issues
  2013-08-26 19:15 Windows Server 2008R2 KVM guest performance issues Brian Rak
@ 2013-08-26 22:01 ` Brian Rak
  2013-08-27  7:18 ` Paolo Bonzini
  1 sibling, 0 replies; 10+ messages in thread
From: Brian Rak @ 2013-08-26 22:01 UTC (permalink / raw)
  To: kvm

On 8/26/2013 3:15 PM, Brian Rak wrote:
> I've been trying to track down the cause of some serious performance 
> issues with a Windows 2008R2 KVM guest.  So far, I've been unable to 
> determine what exactly is causing the issue.
>
> When the guest is under load, I see very high kernel CPU usage, as 
> well as terrible guest performance.  The workload on the guest is 
> approximately 1/4 of what we'd run unvirtualized on the same 
> hardware.  Even at that level, we max out every vCPU in the guest. 
> While the guest runs, I see very high kernel CPU usage (based on 
> `htop` output).
>
>
> Host setup:
> Linux nj1058 3.10.8-1.el6.elrepo.x86_64 #1 SMP Tue Aug 20 18:48:29 EDT 
> 2013 x86_64 x86_64 x86_64 GNU/Linux
> CentOS 6
> qemu 1.6.0
> 2x Intel E5-2630 (virtualization extensions turned on, total of 24 
> cores including hyperthread cores)
> 24GB memory
> swap file is enabled, but unused
>
> Guest setup:
> Windows Server 2008R2 (64 bit)
> 24 vCPUs
> 16 GB memory
> VirtIO disk and network drivers installed
> /qemu16/bin/qemu-system-x86_64 -name VMID100 -S -machine 
> pc-i440fx-1.6,accel=kvm,usb=off -cpu 
> host,hv_relaxed,hv_vapic,hv_spinlocks=0x1000 -m 15259 -smp 
> 24,sockets=1,cores=12,threads=2 -uuid 
> 90301200-8d47-6bb3-0623-bed7c8b1dd7c -no-user-config -nodefaults 
> -chardev 
> socket,id=charmonitor,path=/libvirt111/var/lib/libvirt/qemu/VMID100.monitor,server,nowait 
> -mon chardev=charmonitor,id=monitor,mode=readline -rtc 
> base=utc,driftfix=slew -no-hpet -boot c -usb -drive 
> file=/dev/vmimages/VMID100,if=none,id=drive-virtio-disk0,format=raw,cache=writeback,aio=native 
> -device 
> virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 
> -drive if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw 
> -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 
> -netdev tap,fd=18,id=hostnet0,vhost=on,vhostfd=19 -device 
> virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:00:2c:6d,bus=pci.0,addr=0x3 
> -vnc 127.0.0.1:100 -k en-us -vga cirrus -device 
> virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5
>
> The beginning of `perf top` output:
>
> Samples: 62M of event 'cycles', Event count (approx.): 642019289177
>  64.69%  [kernel]                    [k] _raw_spin_lock
>   2.59%  qemu-system-x86_64          [.] 0x00000000001e688d
>   1.90%  [kernel]                    [k] native_write_msr_safe
>   0.84%  [kvm]                       [k] vcpu_enter_guest
>   0.80%  [kernel]                    [k] __schedule
>   0.77%  [kvm_intel]                 [k] vmx_vcpu_run
>   0.68%  [kernel]                    [k] effective_load
>   0.65%  [kernel]                    [k] update_cfs_shares
>   0.62%  [kernel]                    [k] _raw_spin_lock_irq
>   0.61%  [kernel]                    [k] native_read_msr_safe
>   0.56%  [kernel]                    [k] enqueue_entity
>
> I've captured 20,000 lines of kvm trace output.  This can be found 
> https://gist.github.com/devicenull/fa8f49d4366060029ee4/raw/fb89720d34b43920be22e3e9a1d88962bf305da8/trace 
>
>
> So far, I've tried the following with very little effect:
> * Disable HPET on the guest
> * Enable hv_relaxed, hv_vapic, hv_spinlocks
> * Enable SR-IOV
> * Pin vCPUs to physical CPUs
> * Forcing x2apic enabled in the guest (bcdedit /set x2apicpolicy yes)
> * bcdedit /set useplatformclock yes and no
>
>
> Any suggestions as to what I can do to get better performance out of 
> this guest?  Or reasons why I'm seeing such high kernel CPU usage with it?

I've done some additional research on this, and I believe that 'kvm_pio: 
pio_read at 0xb008 size 4 count 1' is related to Windows trying to read 
the PM timer.  This timer appears to use the TSC in some cases (I 
think).  I found this patchset: 
http://www.spinics.net/lists/kvm/msg91214.html which doesn't appear to 
have been applied yet.  Does it seem reasonable that this patchset would 
eliminate the need for Windows to read from the PM timer continuously?
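
As a quick sanity check, the captured trace can be summarized to show how 
much of the exit traffic is that 0xb008 read (a sketch, assuming the trace 
was saved to a file named 'trace'):

  # tally the exit reasons seen in the trace
  grep -o 'kvm_exit: reason [A-Z_]*' trace | sort | uniq -c | sort -rn
  # count reads of the ACPI PM timer I/O port
  grep -c 'pio_read at 0xb008' trace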


* Re: Windows Server 2008R2 KVM guest performance issues
  2013-08-26 19:15 Windows Server 2008R2 KVM guest performance issues Brian Rak
  2013-08-26 22:01 ` Brian Rak
@ 2013-08-27  7:18 ` Paolo Bonzini
  2013-08-27  7:38   ` Gleb Natapov
  2013-08-27 14:44   ` Brian Rak
  1 sibling, 2 replies; 10+ messages in thread
From: Paolo Bonzini @ 2013-08-27  7:18 UTC (permalink / raw)
  To: Brian Rak; +Cc: kvm

Il 26/08/2013 21:15, Brian Rak ha scritto:
> 
> Samples: 62M of event 'cycles', Event count (approx.): 642019289177
>  64.69%  [kernel]                    [k] _raw_spin_lock
>   2.59%  qemu-system-x86_64          [.] 0x00000000001e688d
>   1.90%  [kernel]                    [k] native_write_msr_safe
>   0.84%  [kvm]                       [k] vcpu_enter_guest
>   0.80%  [kernel]                    [k] __schedule
>   0.77%  [kvm_intel]                 [k] vmx_vcpu_run
>   0.68%  [kernel]                    [k] effective_load
>   0.65%  [kernel]                    [k] update_cfs_shares
>   0.62%  [kernel]                    [k] _raw_spin_lock_irq
>   0.61%  [kernel]                    [k] native_read_msr_safe
>   0.56%  [kernel]                    [k] enqueue_entity

Can you capture the call graphs, too (perf record -g)?
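
For example, something along these lines (a sketch; the 30-second 
system-wide sample is only a suggestion, adjust as needed):

  # system-wide sampling with call graphs for about 30 seconds
  perf record -g -a -- sleep 30
  # then expand the call chains interactively
  perf report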

> I've captured 20,000 lines of kvm trace output.  This can be found
> https://gist.github.com/devicenull/fa8f49d4366060029ee4/raw/fb89720d34b43920be22e3e9a1d88962bf305da8/trace

The guest is doing quite a lot of exits per second, mostly to (a) access
the ACPI timer (b) service NMIs.  In fact, every NMI is reading the
timer too and causing an exit to QEMU.

So it is also possible that you have to debug this inside the guest, to
see if these exits are expected or not.

Paolo


* Re: Windows Server 2008R2 KVM guest performance issues
  2013-08-27  7:18 ` Paolo Bonzini
@ 2013-08-27  7:38   ` Gleb Natapov
  2013-08-27  8:38     ` Paolo Bonzini
  2013-08-27 14:09     ` Brian Rak
  2013-08-27 14:44   ` Brian Rak
  1 sibling, 2 replies; 10+ messages in thread
From: Gleb Natapov @ 2013-08-27  7:38 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Brian Rak, kvm

On Tue, Aug 27, 2013 at 09:18:00AM +0200, Paolo Bonzini wrote:
> > I've captured 20,000 lines of kvm trace output.  This can be found
> > https://gist.github.com/devicenull/fa8f49d4366060029ee4/raw/fb89720d34b43920be22e3e9a1d88962bf305da8/trace
> 
> The guest is doing quite a lot of exits per second, mostly to (a) access
> the ACPI timer
I see a lot of PM timer access not ACPI timer. The solution for that is
the patchset Brian linked.

>                 (b) service NMIs.  In fact, every NMI is reading the
> timer too and causing an exit to QEMU.
> 
Do you mean "kvm_exit: reason EXCEPTION_NMI rip 0xfffff800016dcf84 info
0 80000307"? Those are not NMIs, single NMI will kill Windows, they are #NM
exceptions. Brian, is your workload uses floating point calculation?

> So it is also possible that you have to debug this inside the guest, to
> see if these exits are expected or not.
> 
> Paolo

--
			Gleb.


* Re: Windows Server 2008R2 KVM guest performance issues
  2013-08-27  7:38   ` Gleb Natapov
@ 2013-08-27  8:38     ` Paolo Bonzini
  2013-08-27  8:44       ` Gleb Natapov
  2013-08-27 14:09     ` Brian Rak
  1 sibling, 1 reply; 10+ messages in thread
From: Paolo Bonzini @ 2013-08-27  8:38 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Brian Rak, kvm

Il 27/08/2013 09:38, Gleb Natapov ha scritto:
> On Tue, Aug 27, 2013 at 09:18:00AM +0200, Paolo Bonzini wrote:
>>> I've captured 20,000 lines of kvm trace output.  This can be found
>>> https://gist.github.com/devicenull/fa8f49d4366060029ee4/raw/fb89720d34b43920be22e3e9a1d88962bf305da8/trace
>>
>> The guest is doing quite a lot of exits per second, mostly to (a) access
>> the ACPI timer
> I see a lot of PM timer access not ACPI timer. The solution for that is
> the patchset Brian linked.

ACPI timer = PM timer, no?

>>                 (b) service NMIs.  In fact, every NMI is reading the
>> timer too and causing an exit to QEMU.
>>
> Do you mean "kvm_exit: reason EXCEPTION_NMI rip 0xfffff800016dcf84 info
> 0 80000307"? Those are not NMIs, single NMI will kill Windows, they are #NM
> exceptions.

Oops, yes.

> Brian, does your workload use floating point calculations?

Yeah, it looks like it does; there are a lot of kvm_fpu tracepoints too.

Basically the problem is that every exit to userspace unloads the FPU in
vcpu_put.

 qemu-system-x86-16189 [003] d...  9439.144500: kvm_entry: vcpu 12
 qemu-system-x86-16189 [003] d...  9439.144502: kvm_exit: reason EXCEPTION_NMI rip 0xfffff800016dcf84 info 0 80000307
 qemu-system-x86-16189 [003] ....  9439.144502: kvm_fpu: load
 qemu-system-x86-16189 [003] d...  9439.144503: kvm_entry: vcpu 12
 qemu-system-x86-16189 [003] d...  9439.144505: kvm_exit: reason IO_INSTRUCTION rip 0xfffff8000162d17d info b008000b 0
 qemu-system-x86-16189 [003] ....  9439.144506: kvm_emulate_insn: 0:fffff8000162d17d:ed (prot64)
 qemu-system-x86-16189 [003] ....  9439.144506: kvm_pio: pio_read at 0xb008 size 4 count 1
 qemu-system-x86-16189 [003] ....  9439.144507: kvm_userspace_exit: reason KVM_EXIT_IO (2)
 qemu-system-x86-16189 [003] ....  9439.144508: kvm_fpu: unload
 qemu-system-x86-16189 [003] d...  9439.144578: kvm_entry: vcpu 12
 qemu-system-x86-16189 [003] d...  9439.144579: kvm_exit: reason EXCEPTION_NMI rip 0xfffff800016dcf84 info 0 80000307
 qemu-system-x86-16189 [003] ....  9439.144581: kvm_fpu: load
 qemu-system-x86-16189 [003] d...  9439.144581: kvm_entry: vcpu 12
 qemu-system-x86-16189 [003] d...  9439.144583: kvm_exit: reason IO_INSTRUCTION rip 0xfffff8000162d17d info b008000b 0
 qemu-system-x86-16189 [003] ....  9439.144585: kvm_emulate_insn: 0:fffff8000162d17d:ed (prot64)
 qemu-system-x86-16189 [003] ....  9439.144585: kvm_pio: pio_read at 0xb008 size 4 count 1
 qemu-system-x86-16189 [003] ....  9439.144586: kvm_userspace_exit: reason KVM_EXIT_IO (2)
 qemu-system-x86-16189 [003] ....  9439.144587: kvm_fpu: unload
 qemu-system-x86-16189 [003] d...  9439.144787: kvm_entry: vcpu 12
 qemu-system-x86-16189 [003] d...  9439.144788: kvm_exit: reason EXCEPTION_NMI rip 0xfffff800016dcf84 info 0 80000307
 qemu-system-x86-16189 [003] ....  9439.144789: kvm_fpu: load
 qemu-system-x86-16189 [003] d...  9439.144789: kvm_entry: vcpu 12
 qemu-system-x86-16189 [003] d...  9439.144791: kvm_exit: reason IO_INSTRUCTION rip 0xfffff8000162d17d info b008000b 0
 qemu-system-x86-16189 [003] ....  9439.144792: kvm_emulate_insn: 0:fffff8000162d17d:ed (prot64)
 qemu-system-x86-16189 [003] ....  9439.144793: kvm_pio: pio_read at 0xb008 size 4 count 1
 qemu-system-x86-16189 [003] ....  9439.144794: kvm_userspace_exit: reason KVM_EXIT_IO (2)
 qemu-system-x86-16189 [003] ....  9439.144794: kvm_fpu: unload

It would be interesting to analyze the cost in kvm-unit-tests.  I'll look
at it.

Paolo


* Re: Windows Server 2008R2 KVM guest performance issues
  2013-08-27  8:38     ` Paolo Bonzini
@ 2013-08-27  8:44       ` Gleb Natapov
  0 siblings, 0 replies; 10+ messages in thread
From: Gleb Natapov @ 2013-08-27  8:44 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Brian Rak, kvm

On Tue, Aug 27, 2013 at 10:38:04AM +0200, Paolo Bonzini wrote:
> Il 27/08/2013 09:38, Gleb Natapov ha scritto:
> > On Tue, Aug 27, 2013 at 09:18:00AM +0200, Paolo Bonzini wrote:
> >>> I've captured 20,000 lines of kvm trace output.  This can be found
> >>> https://gist.github.com/devicenull/fa8f49d4366060029ee4/raw/fb89720d34b43920be22e3e9a1d88962bf305da8/trace
> >>
> >> The guest is doing quite a lot of exits per second, mostly to (a) access
> >> the ACPI timer
> > I see a lot of PM timer access not ACPI timer. The solution for that is
> > the patchset Brian linked.
> 
> ACPI timer = PM timer, no?
> 
My brain often substitutes ACPI with APIC; yes, you are right.

--
			Gleb.


* Re: Windows Server 2008R2 KVM guest performance issues
  2013-08-27  7:38   ` Gleb Natapov
  2013-08-27  8:38     ` Paolo Bonzini
@ 2013-08-27 14:09     ` Brian Rak
  1 sibling, 0 replies; 10+ messages in thread
From: Brian Rak @ 2013-08-27 14:09 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Paolo Bonzini, kvm


On 8/27/2013 3:38 AM, Gleb Natapov wrote:
> On Tue, Aug 27, 2013 at 09:18:00AM +0200, Paolo Bonzini wrote:
>>> I've captured 20,000 lines of kvm trace output.  This can be found
>>> https://gist.github.com/devicenull/fa8f49d4366060029ee4/raw/fb89720d34b43920be22e3e9a1d88962bf305da8/trace
>> The guest is doing quite a lot of exits per second, mostly to (a) access
>> the ACPI timer
> I see a lot of PM timer access not ACPI timer. The solution for that is
> the patchset Brian linked.
>
>>                  (b) service NMIs.  In fact, every NMI is reading the
>> timer too and causing an exit to QEMU.
>>
> Do you mean "kvm_exit: reason EXCEPTION_NMI rip 0xfffff800016dcf84 info
> 0 80000307"? Those are not NMIs, single NMI will kill Windows, they are #NM
> exceptions. Brian, is your workload uses floating point calculation?

Yes, our workload uses floating point heavily.  I'd also strongly 
suspect it's doing various things with timers quite frequently.  (This is 
all third-party software, so I can't examine the source to determine 
exactly what it's doing.)


* Re: Windows Server 2008R2 KVM guest performance issues
  2013-08-27  7:18 ` Paolo Bonzini
  2013-08-27  7:38   ` Gleb Natapov
@ 2013-08-27 14:44   ` Brian Rak
  2013-08-27 15:09     ` Paolo Bonzini
  1 sibling, 1 reply; 10+ messages in thread
From: Brian Rak @ 2013-08-27 14:44 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm


On 8/27/2013 3:18 AM, Paolo Bonzini wrote:
> Il 26/08/2013 21:15, Brian Rak ha scritto:
>> Samples: 62M of event 'cycles', Event count (approx.): 642019289177
>>   64.69%  [kernel]                    [k] _raw_spin_lock
>>    2.59%  qemu-system-x86_64          [.] 0x00000000001e688d
>>    1.90%  [kernel]                    [k] native_write_msr_safe
>>    0.84%  [kvm]                       [k] vcpu_enter_guest
>>    0.80%  [kernel]                    [k] __schedule
>>    0.77%  [kvm_intel]                 [k] vmx_vcpu_run
>>    0.68%  [kernel]                    [k] effective_load
>>    0.65%  [kernel]                    [k] update_cfs_shares
>>    0.62%  [kernel]                    [k] _raw_spin_lock_irq
>>    0.61%  [kernel]                    [k] native_read_msr_safe
>>    0.56%  [kernel]                    [k] enqueue_entity
> Can you capture the call graphs, too (perf record -g)?

Sure.  I'm not entirely certain how to use perf effectively.  I've used 
`perf record`, then manually expanded the call stacks in `perf report`.  
If this isn't what you wanted, please let me know.

https://gist.github.com/devicenull/7961f23e6756b647a86a/raw/a04718db2c26b31e50fb7f521d47d911610383d8/gistfile1.txt

>> I've captured 20,000 lines of kvm trace output.  This can be found
>> https://gist.github.com/devicenull/fa8f49d4366060029ee4/raw/fb89720d34b43920be22e3e9a1d88962bf305da8/trace
> The guest is doing quite a lot of exits per second, mostly to (a) access
> the ACPI timer (b) service NMIs.  In fact, every NMI is reading the
> timer too and causing an exit to QEMU.
>
> So it is also possible that you have to debug this inside the guest, to
> see if these exits are expected or not.
Do you have any suggestions for how I would do this?  Given that the 
guest is Windows, I'm not certain how I could even begin to debug this.


Also, for that patchset I found, do I need a separate patch for qemu to 
actually enable the new enlightenment?  I haven't been able to find 
anything for qemu that matches it.  I did find 
http://www.mail-archive.com/kvm@vger.kernel.org/msg82495.html , but 
that's from significantly before the patchset, so I can't tell whether 
it's still relevant.


* Re: Windows Server 2008R2 KVM guest performance issues
  2013-08-27 14:44   ` Brian Rak
@ 2013-08-27 15:09     ` Paolo Bonzini
  2013-08-27 16:45       ` Brian Rak
  0 siblings, 1 reply; 10+ messages in thread
From: Paolo Bonzini @ 2013-08-27 15:09 UTC (permalink / raw)
  To: Brian Rak; +Cc: kvm

Il 27/08/2013 16:44, Brian Rak ha scritto:
>> Il 26/08/2013 21:15, Brian Rak ha scritto:
>>> Samples: 62M of event 'cycles', Event count (approx.): 642019289177
>>>   64.69%  [kernel]                    [k] _raw_spin_lock
>>>    2.59%  qemu-system-x86_64          [.] 0x00000000001e688d
>>>    1.90%  [kernel]                    [k] native_write_msr_safe
>>>    0.84%  [kvm]                       [k] vcpu_enter_guest
>>>    0.80%  [kernel]                    [k] __schedule
>>>    0.77%  [kvm_intel]                 [k] vmx_vcpu_run
>>>    0.68%  [kernel]                    [k] effective_load
>>>    0.65%  [kernel]                    [k] update_cfs_shares
>>>    0.62%  [kernel]                    [k] _raw_spin_lock_irq
>>>    0.61%  [kernel]                    [k] native_read_msr_safe
>>>    0.56%  [kernel]                    [k] enqueue_entity
>> Can you capture the call graphs, too (perf record -g)?
> 
> Sure.  I'm not entirely certain how to use perf effectively.  I've used
> `perf record`, then manually expanded the call stacks in `perf report`. 
> If this isn't what you wanted, please let me know.
> 
> https://gist.github.com/devicenull/7961f23e6756b647a86a/raw/a04718db2c26b31e50fb7f521d47d911610383d8/gistfile1.txt
> 

This is actually quite useful!

-  41.41%  qemu-system-x86  [kernel.kallsyms]                                                                     0xffffffff815ef6d5 k [k] _raw_spin_lock
   - _raw_spin_lock
      - 48.06% futex_wait_setup
           futex_wait
           do_futex
           SyS_futex
           system_call_fastpath
         - __lll_lock_wait
              99.32% 0x10100000002
      - 44.71% futex_wake
           do_futex
           SyS_futex
           system_call_fastpath
         - __lll_unlock_wake
              99.33% 0x10100000002

This could be multiple VCPUs competing for QEMU's "big lock" because the pmtimer
is being read by different VCPUs at the same time.  This can be fixed, and
probably will be in 1.7 or 1.8.
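
If you want a rough cross-check of that contention from the host, counting 
futex calls against the QEMU process works (a sketch; the pgrep pattern is 
an assumption, it expects a single matching process, and attaching strace 
adds overhead, so keep the window short):

  # summarize futex syscalls made by all QEMU threads for ~5 seconds
  timeout 5 strace -c -f -e trace=futex -p $(pgrep -f 'qemu.*VMID100')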

Thanks,

Paolo


* Re: Windows Server 2008R2 KVM guest performance issues
  2013-08-27 15:09     ` Paolo Bonzini
@ 2013-08-27 16:45       ` Brian Rak
  0 siblings, 0 replies; 10+ messages in thread
From: Brian Rak @ 2013-08-27 16:45 UTC (permalink / raw)
  To: kvm


On 8/27/2013 11:09 AM, Paolo Bonzini wrote:
> Il 27/08/2013 16:44, Brian Rak ha scritto:
>>> Il 26/08/2013 21:15, Brian Rak ha scritto:
>>>> Samples: 62M of event 'cycles', Event count (approx.): 642019289177
>>>>    64.69%  [kernel]                    [k] _raw_spin_lock
>>>>     2.59%  qemu-system-x86_64          [.] 0x00000000001e688d
>>>>     1.90%  [kernel]                    [k] native_write_msr_safe
>>>>     0.84%  [kvm]                       [k] vcpu_enter_guest
>>>>     0.80%  [kernel]                    [k] __schedule
>>>>     0.77%  [kvm_intel]                 [k] vmx_vcpu_run
>>>>     0.68%  [kernel]                    [k] effective_load
>>>>     0.65%  [kernel]                    [k] update_cfs_shares
>>>>     0.62%  [kernel]                    [k] _raw_spin_lock_irq
>>>>     0.61%  [kernel]                    [k] native_read_msr_safe
>>>>     0.56%  [kernel]                    [k] enqueue_entity
>>> Can you capture the call graphs, too (perf record -g)?
>> Sure.  I'm not entirely certain how to use perf effectively.  I've used
>> `perf record`, then manually expanded the call stacks in `perf report`.
>> If this isn't what you wanted, please let me know.
>>
>> https://gist.github.com/devicenull/7961f23e6756b647a86a/raw/a04718db2c26b31e50fb7f521d47d911610383d8/gistfile1.txt
>>
> This is actually quite useful!
>
> -  41.41%  qemu-system-x86  [kernel.kallsyms]                                                                     0xffffffff815ef6d5 k [k] _raw_spin_lock
>     - _raw_spin_lock
>        - 48.06% futex_wait_setup
>             futex_wait
>             do_futex
>             SyS_futex
>             system_call_fastpath
>           - __lll_lock_wait
>                99.32% 0x10100000002
>        - 44.71% futex_wake
>             do_futex
>             SyS_futex
>             system_call_fastpath
>           - __lll_unlock_wake
>                99.33% 0x10100000002
>
> This could be multiple VCPUs competing for QEMU's "big lock" because the pmtimer
> is being read by different VCPUs at the same time.  This can be fixed, and
> probably will be in 1.7 or 1.8.
>

I've successfully applied the patch set, and have seen significant 
performance increases.  Kernel CPU usage is no longer half of all CPU 
usage, and my insn_emulation counts are down to ~2000/s rather than 
20,000/s.

I did end up having to patch qemu in a terrible way in order to get this 
working.  I've just enabled the TSC optimizations whenever hv_vapic is 
enabled.  This is far from the best way of doing it, but I'm not really 
a C developer and we'll always want the TSC optimizations on our Windows 
VMs.  In case anyone wants to do the same, it's a pretty simple patch:

*** clean/qemu-1.6.0/target-i386/kvm.c  2013-08-15 15:56:23.000000000 -0400
--- qemu-1.6.0/target-i386/kvm.c        2013-08-27 11:08:21.388841555 -0400
*************** int kvm_arch_init_vcpu(CPUState *cs)
*** 477,482 ****
--- 477,484 ----
           if (hyperv_vapic_recommended()) {
               c->eax |= HV_X64_MSR_HYPERCALL_AVAILABLE;
               c->eax |= HV_X64_MSR_APIC_ACCESS_AVAILABLE;
+           c->eax |= HV_X64_MSR_TIME_REF_COUNT_AVAILABLE;
+           c->eax |= 0x200;
           }

           c = &cpuid_data.entries[cpuid_i++];
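
For anyone following along, the diff above is in context format and can be 
applied with plain patch; something like this (the file name is just a 
placeholder, and the -p level depends on where you run it from):

  # from inside the qemu-1.6.0 source tree, then rebuild qemu as usual
  patch -p1 < hv-time-ref-count.diff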

It also seems that if you have useplatformclock=yes in the guest, it 
will not use the enlightened TSC.  `bcdedit /set useplatformclock=no` 
and a reboot will correct that.

Are there any sort of guidelines for what I should be seeing from 
kvm_stat?  This is pretty much average for me now:

  exits               1362839114  195453
  fpu_reload           199991016   34100
  halt_exits           187767718   33222
  halt_wakeup          198400078   35628
  host_state_reload    222907845   36212
  insn_emulation        22108942    2091
  io_exits              32094455    3132
  irq_exits             88852031   15855
  irq_injections       332358611   60694
  irq_window            61495812   12125

(all the other ones do not change frequently)
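
For what it's worth, kvm_stat is typically backed by the counters under 
debugfs, so they are easy to sample from a script (assuming debugfs is 
mounted at /sys/kernel/debug):

  # dump the raw counters; sample twice and diff the values to get rates
  grep . /sys/kernel/debug/kvm/* 2>/dev/null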

The only real way I know to judge things is based on the performance of 
the guest.  Are there any sort of thresholds for these numbers that 
would indicate a problem?






Thread overview: 10+ messages
2013-08-26 19:15 Windows Server 2008R2 KVM guest performance issues Brian Rak
2013-08-26 22:01 ` Brian Rak
2013-08-27  7:18 ` Paolo Bonzini
2013-08-27  7:38   ` Gleb Natapov
2013-08-27  8:38     ` Paolo Bonzini
2013-08-27  8:44       ` Gleb Natapov
2013-08-27 14:09     ` Brian Rak
2013-08-27 14:44   ` Brian Rak
2013-08-27 15:09     ` Paolo Bonzini
2013-08-27 16:45       ` Brian Rak
