* QEMU KVM - Windows Server guest
@ 2017-04-12 19:11 Jean Baptiste Guerraz
  2017-04-13 14:05 ` Stefan Hajnoczi
  0 siblings, 1 reply; 6+ messages in thread
From: Jean Baptiste Guerraz @ 2017-04-12 19:11 UTC (permalink / raw)
  To: kvm

Hello,

We're facing an issue with a Windows guest VM which runs quite well on
a laptop (Fedora 25 - Core I7 4720HQ -
https://www.asus.com/Notebooks/N551JX/specifications/ ) but not on 2
different servers (one Debian 8 on
https://documentation.online.net/en/dedicated-server/offers/2015/server-dedibox-pro-2015-gen2#server_dedibox_pro_2015_gen2
- and one Fedora 25 on
https://documentation.online.net/en/dedicated-server/offers/2016/server-dedibox-md-2016#server_dedibox_md_2016)
with the same symptom: Windows gets stuck with the CPU at 100% (from
the services.exe process) a few minutes after the desktop is shown.

Detailed information (record / strace / qemu command / package
versions...) is available here:
https://gist.github.com/jbguerraz/faef292b48b2d0106d8d96ba0ddd943c

The recorded trace shows a loop with the following pattern (more in the
linked gist):

qemu-system-x86  7420 [000] 28829.151596: kvm:kvm_exit: reason EPT_MISCONFIG rip 0xfffff80001df7eea info 0 0
qemu-system-x86  7420 [000] 28829.151596: kvm:kvm_emulate_insn: 0:fffff80001df7eea: 44 8b 80 f0 00 00 00
qemu-system-x86  7420 [000] 28829.151597: kvm:vcpu_match_mmio: gva 0xffffffffffd090f0 gpa 0xfed000f0 Read GPA
qemu-system-x86  7420 [000] 28829.151597: kvm:kvm_mmio: mmio unsatisfied-read len 4 gpa 0xfed000f0 val 0x0
qemu-system-x86  7420 [000] 28829.151597: kvm:kvm_userspace_exit: reason KVM_EXIT_MMIO (6)
qemu-system-x86  7420 [000] 28829.151598: kvm:kvm_fpu: unload
qemu-system-x86  7420 [000] 28829.151599: kvm:kvm_mmio: mmio read len 4 gpa 0xfed000f0 val 0x5e3c5955
qemu-system-x86  7420 [000] 28829.151600: kvm:kvm_fpu: load
qemu-system-x86  7420 [000] 28829.151600: kvm:kvm_entry: vcpu 0
qemu-system-x86  7420 [000] 28829.151607: kvm:kvm_exit: reason EPT_MISCONFIG rip 0xfffff80001df7eea info 0 0
qemu-system-x86  7420 [000] 28829.151608: kvm:kvm_emulate_insn: 0:fffff80001df7eea: 44 8b 80 f0 00 00 00
qemu-system-x86  7420 [000] 28829.151609: kvm:vcpu_match_mmio: gva 0xffffffffffd090f0 gpa 0xfed000f0 Read GPA
qemu-system-x86  7420 [000] 28829.151609: kvm:kvm_mmio: mmio unsatisfied-read len 4 gpa 0xfed000f0 val 0x0
qemu-system-x86  7420 [000] 28829.151609: kvm:kvm_userspace_exit: reason KVM_EXIT_MMIO (6)
qemu-system-x86  7420 [000] 28829.151610: kvm:kvm_fpu: unload
qemu-system-x86  7420 [000] 28829.151611: kvm:kvm_mmio: mmio read len 4 gpa 0xfed000f0 val 0x5e3c5e0d
qemu-system-x86  7420 [000] 28829.151612: kvm:kvm_fpu: load
qemu-system-x86  7420 [000] 28829.151612: kvm:kvm_entry: vcpu 0
qemu-system-x86  7420 [000] 28829.151615: kvm:kvm_exit: reason EPT_MISCONFIG rip 0xfffff80001df7eea info 0 0
qemu-system-x86  7420 [000] 28829.151616: kvm:kvm_emulate_insn: 0:fffff80001df7eea: 44 8b 80 f0 00 00 00
qemu-system-x86  7420 [000] 28829.151616: kvm:vcpu_match_mmio: gva 0xffffffffffd090f0 gpa 0xfed000f0 Read GPA
qemu-system-x86  7420 [000] 28829.151617: kvm:kvm_mmio: mmio unsatisfied-read len 4 gpa 0xfed000f0 val 0x0
qemu-system-x86  7420 [000] 28829.151617: kvm:kvm_userspace_exit: reason KVM_EXIT_MMIO (6)
qemu-system-x86  7420 [000] 28829.151617: kvm:kvm_fpu: unload
qemu-system-x86  7420 [000] 28829.151618: kvm:kvm_mmio: mmio read len 4 gpa 0xfed000f0 val 0x5e3c60f5
qemu-system-x86  7420 [000] 28829.151619: kvm:kvm_fpu: load
qemu-system-x86  7420 [000] 28829.151619: kvm:kvm_entry: vcpu 0

If any of you has an idea about how to dig further, that would be super :)

Thank you!

Jean-Baptiste


* Re: QEMU KVM - Windows Server guest
  2017-04-12 19:11 QEMU KVM - Windows Server guest Jean Baptiste Guerraz
@ 2017-04-13 14:05 ` Stefan Hajnoczi
  2017-04-13 14:44   ` Radim Krčmář
  0 siblings, 1 reply; 6+ messages in thread
From: Stefan Hajnoczi @ 2017-04-13 14:05 UTC (permalink / raw)
  To: Jean Baptiste Guerraz; +Cc: kvm

On Wed, Apr 12, 2017 at 09:11:12PM +0200, Jean Baptiste Guerraz wrote:
> Hello,
> 
> We're facing an issue with a Windows guest VM which runs quite well on
> a laptop (Fedora 25 - Core I7 4720HQ -
> https://www.asus.com/Notebooks/N551JX/specifications/ ) but not on 2
> different servers (one Debian 8 on
> https://documentation.online.net/en/dedicated-server/offers/2015/server-dedibox-pro-2015-gen2#server_dedibox_pro_2015_gen2
> - and one Fedora 25 on
> https://documentation.online.net/en/dedicated-server/offers/2016/server-dedibox-md-2016#server_dedibox_md_2016)
> with the same symptom : Windows get stuck with CPU at 100% (from
> services.exe process) a few "minute" after desktop is shown.
> 
> Detailed informations (record / strace / qemu command / packages
> versions...) available here :
> https://gist.github.com/jbguerraz/faef292b48b2d0106d8d96ba0ddd943c
> 
> Record tool shows a loop on such pattern (more on the linked gist) :
> 
> qemu-system-x86  7420 [000] 28829.151596: kvm:kvm_exit: reason
> EPT_MISCONFIG rip 0xfffff80001df7eea info 0 0
> qemu-system-x86  7420 [000] 28829.151596: kvm:kvm_emulate_insn:
> 0:fffff80001df7eea: 44 8b 80 f0 00 00 00
> qemu-system-x86  7420 [000] 28829.151597: kvm:vcpu_match_mmio: gva
> 0xffffffffffd090f0 gpa 0xfed000f0 Read GPA
> qemu-system-x86  7420 [000] 28829.151597: kvm:kvm_mmio: mmio
> unsatisfied-read len 4 gpa 0xfed000f0 val 0x0
> qemu-system-x86  7420 [000] 28829.151597: kvm:kvm_userspace_exit:
> reason KVM_EXIT_MMIO (6)
> qemu-system-x86  7420 [000] 28829.151598: kvm:kvm_fpu: unload
> qemu-system-x86  7420 [000] 28829.151599: kvm:kvm_mmio: mmio read len
> 4 gpa 0xfed000f0 val 0x5e3c5955

This is the physical address range of the HPET (timer).
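For reference, the address in the trace can be decoded by hand. A minimal sketch, assuming the conventional x86 HPET base of 0xfed00000 and the register offsets from the HPET specification; the helper name decode_hpet_gpa is made up for illustration and is not part of KVM or QEMU:

```python
# Decode a guest-physical address into an HPET register name.
HPET_BASE = 0xFED00000  # assumption: default x86 HPET mapping
HPET_SIZE = 0x400       # one 1 KiB register block

# Global register offsets, per the HPET specification.
HPET_REGS = {
    0x000: "General Capabilities and ID",
    0x010: "General Configuration",
    0x020: "General Interrupt Status",
    0x0F0: "Main Counter Value",
}

# Per-timer registers; timer N starts at 0x100 + 0x20 * N.
TIMER_REGS = {
    0x00: "Configuration and Capability",
    0x08: "Comparator Value",
    0x10: "FSB Interrupt Route",
}


def decode_hpet_gpa(gpa):
    """Return a register name for gpa, or None if it is outside the HPET block."""
    offset = gpa - HPET_BASE
    if not 0 <= offset < HPET_SIZE:
        return None
    if offset in HPET_REGS:
        return HPET_REGS[offset]
    if offset >= 0x100:
        timer, reg = divmod(offset - 0x100, 0x20)
        if reg in TIMER_REGS:
            return "Timer %d %s" % (timer, TIMER_REGS[reg])
    return "unnamed offset 0x%x" % offset


print(decode_hpet_gpa(0xFED000F0))  # the gpa from the kvm_mmio events
```

With these assumptions, offset 0xf0 is the main counter, so each 4-byte read in the trace fetches the low half of the HPET counter.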

> qemu-system-x86  7420 [000] 28829.151600: kvm:kvm_fpu: load
> qemu-system-x86  7420 [000] 28829.151600: kvm:kvm_entry: vcpu 0
> qemu-system-x86  7420 [000] 28829.151607: kvm:kvm_exit: reason
> EPT_MISCONFIG rip 0xfffff80001df7eea info 0 0
> qemu-system-x86  7420 [000] 28829.151608: kvm:kvm_emulate_insn:
> 0:fffff80001df7eea: 44 8b 80 f0 00 00 00
> qemu-system-x86  7420 [000] 28829.151609: kvm:vcpu_match_mmio: gva
> 0xffffffffffd090f0 gpa 0xfed000f0 Read GPA
> qemu-system-x86  7420 [000] 28829.151609: kvm:kvm_mmio: mmio
> unsatisfied-read len 4 gpa 0xfed000f0 val 0x0
> qemu-system-x86  7420 [000] 28829.151609: kvm:kvm_userspace_exit:
> reason KVM_EXIT_MMIO (6)
> qemu-system-x86  7420 [000] 28829.151610: kvm:kvm_fpu: unload
> qemu-system-x86  7420 [000] 28829.151611: kvm:kvm_mmio: mmio read len
> 4 gpa 0xfed000f0 val 0x5e3c5e0d

HPET access again.

> qemu-system-x86  7420 [000] 28829.151612: kvm:kvm_fpu: load
> qemu-system-x86  7420 [000] 28829.151612: kvm:kvm_entry: vcpu 0
> qemu-system-x86  7420 [000] 28829.151615: kvm:kvm_exit: reason
> EPT_MISCONFIG rip 0xfffff80001df7eea info 0 0
> qemu-system-x86  7420 [000] 28829.151616: kvm:kvm_emulate_insn:
> 0:fffff80001df7eea: 44 8b 80 f0 00 00 00
> qemu-system-x86  7420 [000] 28829.151616: kvm:vcpu_match_mmio: gva
> 0xffffffffffd090f0 gpa 0xfed000f0 Read GPA
> qemu-system-x86  7420 [000] 28829.151617: kvm:kvm_mmio: mmio
> unsatisfied-read len 4 gpa 0xfed000f0 val 0x0

Here too.

> qemu-system-x86  7420 [000] 28829.151617: kvm:kvm_userspace_exit:
> reason KVM_EXIT_MMIO (6)
> qemu-system-x86  7420 [000] 28829.151617: kvm:kvm_fpu: unload
> qemu-system-x86  7420 [000] 28829.151618: kvm:kvm_mmio: mmio read len
> 4 gpa 0xfed000f0 val 0x5e3c60f5
> qemu-system-x86  7420 [000] 28829.151619: kvm:kvm_fpu: load
> qemu-system-x86  7420 [000] 28829.151619: kvm:kvm_entry: vcpu 0
> 
> If one of you have an idea about how to dig further, that would be super :)

I looked at a few of the interrupts that were injected.  An interrupt is
delivered every 15 milliseconds.  Each one was immediately
acknowledged by the interrupt handler function inside the guest.

This just looks like a running guest that's doing no I/O to me.

Can anyone else spot something suspicious that indicates 100% guest CPU
consumption?

Stefan


* Re: QEMU KVM - Windows Server guest
  2017-04-13 14:05 ` Stefan Hajnoczi
@ 2017-04-13 14:44   ` Radim Krčmář
  2017-04-14  8:45     ` Jean Baptiste Guerraz
  0 siblings, 1 reply; 6+ messages in thread
From: Radim Krčmář @ 2017-04-13 14:44 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Jean Baptiste Guerraz, kvm

2017-04-13 15:05+0100, Stefan Hajnoczi:
> On Wed, Apr 12, 2017 at 09:11:12PM +0200, Jean Baptiste Guerraz wrote:
> > Hello,
> > 
> > We're facing an issue with a Windows guest VM which runs quite well on
> > a laptop (Fedora 25 - Core I7 4720HQ -
> > https://www.asus.com/Notebooks/N551JX/specifications/ ) but not on 2
> > different servers (one Debian 8 on
> > https://documentation.online.net/en/dedicated-server/offers/2015/server-dedibox-pro-2015-gen2#server_dedibox_pro_2015_gen2
> > - and one Fedora 25 on
> > https://documentation.online.net/en/dedicated-server/offers/2016/server-dedibox-md-2016#server_dedibox_md_2016)

Interesting, is there some consistent difference between

  grep . /sys/module/kvm*/parameters/*

on those systems?
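One way to make that comparison less error-prone is to save the grep output on each host and diff the two dumps. A rough sketch, assuming the usual "path:value" lines that grep prints for multiple files; the function names and the sample values below are illustrative only, not taken from the gist:

```python
def parse_params(dump):
    """Parse 'path:value' lines (as printed by grep) into a dict."""
    params = {}
    for line in dump.splitlines():
        path, sep, value = line.strip().partition(":")
        if sep:  # skip lines without a colon
            params[path] = value
    return params


def diff_params(dump_a, dump_b):
    """Return {path: (value_a, value_b)} for every parameter that differs."""
    a, b = parse_params(dump_a), parse_params(dump_b)
    return {
        path: (a.get(path), b.get(path))
        for path in sorted(set(a) | set(b))
        if a.get(path) != b.get(path)
    }


# Hypothetical dumps from a working and a non-working host.
working = """\
/sys/module/kvm_intel/parameters/ept:Y
/sys/module/kvm_intel/parameters/pml:N
"""
broken = """\
/sys/module/kvm_intel/parameters/ept:Y
/sys/module/kvm_intel/parameters/pml:Y
"""

for path, (good, bad) in diff_params(working, broken).items():
    print(path, "working=%s" % good, "broken=%s" % bad)
```

A parameter that exists on only one host shows up with None on the other side, which is useful when the kernels differ.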

> > qemu-system-x86  7420 [000] 28829.151599: kvm:kvm_mmio: mmio read len
> > 4 gpa 0xfed000f0 val 0x5e3c5955
> 
> This is the physical address range of the HPET (timer).

Right, disabling HPET is worth a shot. :)

> > If one of you have an idea about how to dig further, that would be super :)
> 
> I looked at a few of the interrupts that were injected.  An interrupt is
> delivered every 15 milliseconds.  Each one was immediately
> acknowledged by the interrupt handler function inside the guest.
> 
> This just looks like a running guest that's doing no I/O to me.

It had two vector 47 interrupts earlier and those only got delivered
after TPR was lowered, so the problem could be a bug in KVM's TPR
handling, which only allows vector 209 after some point?
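As background on why TPR matters here: the local APIC groups vectors into priority classes (the upper four bits of the vector) and holds back any interrupt whose class is not strictly above the class in the task priority register. A simplified sketch that ignores in-service interrupts (the real comparison is against PPR); the TPR value 0x20 is just an example, not a value taken from the trace:

```python
def priority_class(value):
    """Priority class is bits 7:4 of a vector or of the TPR."""
    return (value >> 4) & 0xF


def deliverable(vector, tpr):
    """True if the vector's class is strictly above the TPR class.

    Simplification: in-service interrupts (ISR) are ignored, so this
    compares against TPR instead of the full PPR.
    """
    return priority_class(vector) > priority_class(tpr)


example_tpr = 0x20  # hypothetical value, blocks classes 0 through 2
for vector in (47, 209):
    print(
        "vector %d: class %d, deliverable=%s"
        % (vector, priority_class(vector), deliverable(vector, example_tpr))
    )
```

Vector 47 sits in class 2 and vector 209 in class 13, so a guest TPR that stays at 0x20 or higher would keep vector 47 pending while vector 209 still gets through.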

> Can anyone else spot something suspicious that indicates 100% guest CPU
> consumption?

It seems to be reading HPET in a tight loop.  No idea why, though.
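The three completed reads in the original trace give a feel for how tight the loop is. A small sketch that compares host time between reads with the counter delta, assuming QEMU's emulated HPET ticks every 10 ns (that period is an assumption here, not something stated in the thread):

```python
HPET_PERIOD_NS = 10  # assumption: emulated HPET runs at 100 MHz

# (host timestamp in seconds, main counter value) from the three
# "mmio read" events in the trace.
reads = [
    (28829.151599, 0x5E3C5955),
    (28829.151611, 0x5E3C5E0D),
    (28829.151618, 0x5E3C60F5),
]


def read_intervals(samples, period_ns=HPET_PERIOD_NS):
    """Return (host_us, counter_us) for each pair of consecutive reads."""
    intervals = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        host_us = (t1 - t0) * 1e6
        counter_us = (c1 - c0) * period_ns / 1000
        intervals.append((host_us, counter_us))
    return intervals


for host_us, counter_us in read_intervals(reads):
    print("host %.1f us, counter %.2f us" % (host_us, counter_us))
```

Under that assumption both clocks agree to within a microsecond (roughly 12 us and 7 us between reads), which would mean the counter itself advances normally and the guest is simply polling it back to back, with every read costing a full exit to userspace.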

So far, I'd compare KVM parameters, disable HPET, and check TPR, to see
where that goes.

Thanks.


* Re: QEMU KVM - Windows Server guest
  2017-04-13 14:44   ` Radim Krčmář
@ 2017-04-14  8:45     ` Jean Baptiste Guerraz
  2017-04-14  9:00       ` Paolo Bonzini
  0 siblings, 1 reply; 6+ messages in thread
From: Jean Baptiste Guerraz @ 2017-04-14  8:45 UTC (permalink / raw)
  To: Radim Krčmář; +Cc: Stefan Hajnoczi, kvm

Thank you Radim for the answer!

There are indeed some differences between those systems in the KVM
module parameters; I've updated the gist with all those details:
https://gist.github.com/jbguerraz/faef292b48b2d0106d8d96ba0ddd943c

Focusing on the differences between the two FC25 systems, we get:
- /sys/module/kvm_intel/parameters/enable_shadow_vmcs is set to Y on
the non-working one
- /sys/module/kvm_intel/parameters/pml is set to Y on the non-working one

I tried the "-no-hpet" flag but, no luck, same problem.

Finally, I used my best google-fu but I'm sadly lost with the "check
TPR" part; maybe you could give me a hint about that?




* Re: QEMU KVM - Windows Server guest
  2017-04-14  8:45     ` Jean Baptiste Guerraz
@ 2017-04-14  9:00       ` Paolo Bonzini
  2017-04-14 15:36         ` Jean Baptiste Guerraz
  0 siblings, 1 reply; 6+ messages in thread
From: Paolo Bonzini @ 2017-04-14  9:00 UTC (permalink / raw)
  To: Jean Baptiste Guerraz, Radim Krčmář; +Cc: Stefan Hajnoczi, kvm



On 14/04/2017 16:45, Jean Baptiste Guerraz wrote:
> Thank you Radim for the answer!
> 
> There are indeed some differences between those systems in KVM modules
> parameters; I've updated the gist with all those details :
> https://gist.github.com/jbguerraz/faef292b48b2d0106d8d96ba0ddd943c
> 
> With a focus on differences between the two FC25 systems, we get :
> - /sys/module/kvm_intel/parameters/enable_shadow_vmcs is set to Y on
> the not working one
> - /sys/module/kvm_intel/parameters/pml is set to Y on the not working one

So one is Ivy Bridge and one is Broadwell I think.  That's good to know
because both have the same interrupt injection path.

What version of the kernel is this?  Does loading kvm_intel with pml=0
fix it?  (Shadow VMCS is not an issue because it's only used for nested
virtualization).
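To test this, the module has to be reloaded with no guests running (for example "modprobe -r kvm_intel" followed by "modprobe kvm_intel pml=0"), and it is worth confirming afterwards that the setting actually took effect. A minimal sketch for that check; the helper name read_kvm_param and its sysfs_root argument are made up for illustration:

```python
from pathlib import Path


def read_kvm_param(name, module="kvm_intel", sysfs_root="/sys"):
    """Return the current value of a module parameter, or None if unreadable."""
    path = Path(sysfs_root) / "module" / module / "parameters" / name
    try:
        return path.read_text().strip()
    except OSError:
        # Module not loaded, or this kernel does not expose the parameter.
        return None


if __name__ == "__main__":
    for param in ("pml", "enable_shadow_vmcs"):
        print(param, "=", read_kvm_param(param))
```

After a reload with pml=0 the pml entry should read N; if it still reads Y, the option was probably not applied to the module that is actually loaded.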

Paolo


* Re: QEMU KVM - Windows Server guest
  2017-04-14  9:00       ` Paolo Bonzini
@ 2017-04-14 15:36         ` Jean Baptiste Guerraz
  0 siblings, 0 replies; 6+ messages in thread
From: Jean Baptiste Guerraz @ 2017-04-14 15:36 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Radim Krčmář, Stefan Hajnoczi, kvm

Thank you Paolo !

FC25 - VM "OK" - Linux Kernel : 4.10.5-200.fc25.x86_64
FC25 - VM "Not OK" - Linux Kernel : 4.10.8-200.fc25.x86_64

I tried with "pml=0" and got the same result.

Finally I decided (without much hope) to re-transfer the Windows disk
image from my laptop to the server and... it worked! :O I don't know
whether a couple of bits got corrupted somehow during the initial
transfer, but I suspect so; it can't be just random luck.
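For anyone hitting something similar, comparing a checksum of the image on both machines is a cheap way to rule a bad transfer in or out. A minimal sketch using SHA-256 from the Python standard library; the function names are illustrative and the file paths are whatever your image is called:

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large disk images fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_image(source_path, copy_path):
    """True if both files have identical SHA-256 digests."""
    return sha256_of(source_path) == sha256_of(copy_path)
```

In practice the two files live on different hosts, so one would run sha256_of on each side and compare the printed digests; same_image only helps when both copies are reachable from one machine.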

Thank you again guys! I really appreciated the support!



