kvm.vger.kernel.org archive mirror
* Windows guests become unresponsive and display black screen in VNC
@ 2021-09-16  0:29 Bill Sherwood
From: Bill Sherwood @ 2021-09-16  0:29 UTC (permalink / raw)
  To: kvm

Host is Ubuntu 20.04 - 5.4.0-26-generic

KVM is 1:4.2-3ubuntu6.17

Guest OS -- various editions of Windows Server 2012 or later

Symptom:   Guests appear to run fine for some time (hours or days),
but occasionally a perfectly healthy guest will "lock up" and become
unresponsive to network, VNC, etc. We are unable to access the VM,
and the only recourse is to destroy it; a shutdown will not tear the
VM down.

Looking at a number of things, we can see the following via trace_pipe
-- nothing else is emitted while in the hung state.

 cat /sys/kernel/debug/tracing/trace_pipe | grep -i <pid>

.....

 <...>-3441924 [004] .... 1907374.582975: kvm_set_irq: gsi 0 level 1 source 0
 <...>-3441924 [004] .... 1907374.582977: kvm_pic_set_irq: chip 0 pin 0 (edge|masked)
 <...>-3441924 [004] .... 1907374.582979: kvm_apic_accept_irq: apicid 0 vec 209 (Fixed|edge)
 <...>-3441924 [004] .... 1907374.582981: kvm_ioapic_set_irq: pin 2 dst 1 vec 209 (Fixed|logical|edge)
 <...>-3441924 [004] .... 1907374.582983: kvm_set_irq: gsi 0 level 0 source 0
 <...>-3441924 [004] .... 1907374.582984: kvm_pic_set_irq: chip 0 pin 0 (edge|masked)
 <...>-3441924 [004] .... 1907374.582984: kvm_ioapic_set_irq: pin 2 dst 1 vec 209 (Fixed|logical|edge)
 <...>-3441924 [004] .... 1907374.583457: kvm_set_irq: gsi 0 level 1 source 0
 <...>-3441924 [004] .... 1907374.583459: kvm_pic_set_irq: chip 0 pin 0 (edge|masked)
 <...>-3441924 [004] .... 1907374.583461: kvm_apic_accept_irq: apicid 0 vec 209 (Fixed|edge)
 <...>-3441924 [004] .... 1907374.583462: kvm_ioapic_set_irq: pin 2 dst 1 vec 209 (Fixed|logical|edge)
 <...>-3441924 [004] .... 1907374.583464: kvm_set_irq: gsi 0 level 0 source 0
.....
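
For anyone who wants to summarize a longer capture of the same kind,
here is a minimal Python sketch (not something we ran on the host). It
assumes each trace event sits on a single line, i.e. any mail-wrapped
lines have been rejoined, and the function names are purely
illustrative. It counts events by type and reports the spacing between
consecutive "gsi 0 level 1" assertions.

```python
import re
from collections import Counter

# Matches "... <seconds>.<micros>: <event>: <details>" in a trace_pipe line.
EVENT_RE = re.compile(r"\s(\d+)\.(\d{6}):\s+(\w+):\s*(.*)$")

def parse_events(lines):
    """Return (timestamp_us, event_name, details) for each matching line."""
    events = []
    for line in lines:
        m = EVENT_RE.search(line)
        if m:
            sec, usec, name, details = m.groups()
            # Integer microseconds avoid floating-point rounding.
            events.append((int(sec) * 1_000_000 + int(usec), name, details))
    return events

def assert_intervals_us(events):
    """Microseconds between consecutive 'kvm_set_irq: gsi 0 level 1' events."""
    stamps = [t for t, name, details in events
              if name == "kvm_set_irq" and details.startswith("gsi 0 level 1")]
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]

# Three lines taken from the trace above, rejoined onto single lines.
sample = [
    "<...>-3441924 [004] .... 1907374.582975: kvm_set_irq: gsi 0 level 1 source 0",
    "<...>-3441924 [004] .... 1907374.582977: kvm_pic_set_irq: chip 0 pin 0 (edge|masked)",
    "<...>-3441924 [004] .... 1907374.583457: kvm_set_irq: gsi 0 level 1 source 0",
]
events = parse_events(sample)
print(Counter(name for _, name, _ in events))
print(assert_intervals_us(events))
```

On the excerpt above, the two "gsi 0 level 1" events are 482
microseconds apart.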


----------------------


Strace (15 second) output:

 55.39    0.082181          16      4937           ioctl
 23.05    0.034208          14      2390           ppoll
 21.56    0.031990          13      2379           futex
  0.00    0.000001           0        30           read
------ ----------- ----------- --------- --------- ----------------
100.00    0.148380                  9736           total
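
To put those counts in per-second terms, a small Python sketch follows.
It only assumes that the summary covers the 15-second window mentioned
above; the helper name is illustrative.

```python
def per_second_rates(calls_by_syscall, window_seconds):
    """Convert raw call counts into calls per second over the window."""
    return {name: calls / window_seconds
            for name, calls in calls_by_syscall.items()}

# Call counts copied from the strace summary above.
calls = {"ioctl": 4937, "ppoll": 2390, "futex": 2379, "read": 30}
rates = per_second_rates(calls, 15)

for name, rate in rates.items():
    print(f"{name:6s} {rate:8.1f} calls/s")
print(f"{'total':6s} {sum(rates.values()):8.1f} calls/s")
```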


-----------------------

KVM stat live output:

         APIC_ACCESS      11491    44.50%     5.12%      1.82us   5021.41us      9.41us ( +-   9.87% )
       EPT_MISCONFIG       8007    31.01%    17.51%     11.25us  11003.56us     46.18us ( +-   8.59% )
                 HLT       3804    14.73%    76.26%      2.19us   3962.04us    423.30us ( +-   0.44% )
  EXTERNAL_INTERRUPT       2492     9.65%     0.46%      0.96us    163.72us      3.91us ( +-   4.36% )
   PAUSE_INSTRUCTION         14     0.05%     0.00%      1.24us      6.59us      3.40us ( +-  14.61% )
      IO_INSTRUCTION          8     0.03%     0.64%      8.34us  13470.05us   1693.58us ( +-  99.34% )
   PENDING_INTERRUPT          3     0.01%     0.00%      2.31us      3.69us      2.99us ( +-  13.36% )
       EPT_VIOLATION          2     0.01%     0.00%      7.74us      9.37us      8.56us ( +-   9.51% )
       EXCEPTION_NMI          1     0.00%     0.00%      3.09us      3.09us      3.09us ( +-   0.00% )
 TPR_BELOW_THRESHOLD          1     0.00%     0.00%      2.39us      2.39us      2.39us ( +-   0.00% )


We have tried quite a few of the Hyper-V enlightenments and clock
settings, and none of them resolve the issue.

We can reproduce the issue by stopping all guests (7) on the host and
starting them up at the same time. It may take a couple of cycles
(2-3) of en-masse restarts, after which 1 or 2 of the 7 guests will
exhibit this behavior.
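
A rough Python sketch of one such restart cycle is below. It rests on
assumptions that are not stated above: that the guests are managed
through libvirt and virsh, and that "stopping" means a hard
"virsh destroy". The helper names run_virsh and restart_all are
hypothetical, and this is not the exact procedure we used.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

def run_virsh(*args):
    """Run a virsh subcommand and return its stdout (assumes virsh is installed)."""
    result = subprocess.run(["virsh", *args],
                            capture_output=True, text=True, check=True)
    return result.stdout

def restart_all(domains, run=run_virsh):
    """Hard-stop every domain, then start them all at (roughly) the same time."""
    for name in domains:
        run("destroy", name)
    # One worker per domain so the starts are issued concurrently.
    with ThreadPoolExecutor(max_workers=max(1, len(domains))) as pool:
        list(pool.map(lambda name: run("start", name), domains))

def running_domains(run=run_virsh):
    """Names of currently running domains, one per line of 'virsh list --name'."""
    return [line for line in run("list", "--name").splitlines() if line.strip()]

# Example usage on the host (not executed here):
#   restart_all(running_domains())
```

Passing the command runner in as a parameter makes it possible to
dry-run the cycle with a stub before touching real guests.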


CPU is host-passthrough
Server has 24 cores with 256G of RAM


Looking to see if anyone in the community has encountered this problem.

Thanks
Bill
