* [Bug 214423] New: Windows guests become unresponsive and display black screen in VNC
From: bugzilla-daemon @ 2021-09-15 21:03 UTC
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=214423

            Bug ID: 214423
           Summary: Windows guests become unresponsive and display black
                    screen in VNC
           Product: Virtualization
           Version: unspecified
    Kernel Version: 1:4.2-3ubuntu6.17
          Hardware: All
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: high
          Priority: P1
         Component: kvm
          Assignee: virtualization_kvm@kernel-bugs.osdl.org
          Reporter: bsherwood218@gmail.com
        Regression: No

Host is Ubuntu 20.04 - 5.4.0-26-generic

KVM is 1:4.2-3ubuntu6.17

Guest OS -- various flavors of Windows 2012 server or greater

Symptom: Guests appear to run fine for some time (hours or days), but
occasionally a guest that was running normally will "lock up" and become
unresponsive to network, VNC, etc. We are unable to access the VM, and the
only recourse is to destroy it; a shutdown request will not tear down the VM.

Looking at a number of things, we can see the following via trace_pipe --
nothing else is emitted while in the hung state.

 cat /sys/kernel/debug/tracing/trace_pipe | grep -i <pid>

.....

 <...>-3441924 [004] .... 1907374.582975: kvm_set_irq: gsi 0 level 1 source 0
 <...>-3441924 [004] .... 1907374.582977: kvm_pic_set_irq: chip 0 pin 0 (edge|masked)
 <...>-3441924 [004] .... 1907374.582979: kvm_apic_accept_irq: apicid 0 vec 209 (Fixed|edge)
 <...>-3441924 [004] .... 1907374.582981: kvm_ioapic_set_irq: pin 2 dst 1 vec 209 (Fixed|logical|edge)
 <...>-3441924 [004] .... 1907374.582983: kvm_set_irq: gsi 0 level 0 source 0
 <...>-3441924 [004] .... 1907374.582984: kvm_pic_set_irq: chip 0 pin 0 (edge|masked)
 <...>-3441924 [004] .... 1907374.582984: kvm_ioapic_set_irq: pin 2 dst 1 vec 209 (Fixed|logical|edge)
 <...>-3441924 [004] .... 1907374.583457: kvm_set_irq: gsi 0 level 1 source 0
 <...>-3441924 [004] .... 1907374.583459: kvm_pic_set_irq: chip 0 pin 0 (edge|masked)
 <...>-3441924 [004] .... 1907374.583461: kvm_apic_accept_irq: apicid 0 vec 209 (Fixed|edge)
 <...>-3441924 [004] .... 1907374.583462: kvm_ioapic_set_irq: pin 2 dst 1 vec 209 (Fixed|logical|edge)
 <...>-3441924 [004] .... 1907374.583464: kvm_set_irq: gsi 0 level 0 source 0
.....
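
To check the claim that nothing else is emitted while a guest is hung, the
captured trace can be tallied by event type instead of read by eye. What
follows is only a rough sketch, not something we ran on the affected host: the
names TRACE_LINE and summarize are made up for illustration, and the regex
assumes the default ftrace line layout shown above.

```python
import re
from collections import Counter

# Assumed layout of one ftrace line (taken from the excerpt above):
#   <...>-3441924 [004] .... 1907374.582975: kvm_set_irq: gsi 0 level 1 source 0
TRACE_LINE = re.compile(
    r"^\s*(?P<task>.+)-(?P<pid>\d+)\s+\[(?P<cpu>\d+)\]\s+(?P<flags>\S+)\s+"
    r"(?P<ts>\d+\.\d+):\s+(?P<event>\w+):\s*(?P<args>.*)$"
)


def summarize(lines, pid):
    """Count trace events emitted by one PID; return (counts, span_seconds)."""
    counts = Counter()
    first = last = None
    for line in lines:
        match = TRACE_LINE.match(line)
        if match is None or int(match.group("pid")) != pid:
            continue  # not a trace line, or it belongs to another task
        counts[match.group("event")] += 1
        ts = float(match.group("ts"))
        if first is None:
            first = ts
        last = ts
    span = 0.0 if first is None else last - first
    return counts, span


# First four events from the excerpt above, used as sample input.
SAMPLE = """\
 <...>-3441924 [004] .... 1907374.582975: kvm_set_irq: gsi 0 level 1 source 0
 <...>-3441924 [004] .... 1907374.582977: kvm_pic_set_irq: chip 0 pin 0 (edge|masked)
 <...>-3441924 [004] .... 1907374.582979: kvm_apic_accept_irq: apicid 0 vec 209 (Fixed|edge)
 <...>-3441924 [004] .... 1907374.582981: kvm_ioapic_set_irq: pin 2 dst 1 vec 209 (Fixed|logical|edge)
"""

counts, span = summarize(SAMPLE.splitlines(), pid=3441924)
for event, n in counts.most_common():
    print(event, n)
```

Feeding it a longer capture (for example, the saved output of the cat command
above) would show at a glance whether any event other than the four
interrupt-related ones ever appears for that PID.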


---------------------- 


Strace (15 second) output:

 55.39    0.082181          16      4937           ioctl
 23.05    0.034208          14      2390           ppoll
 21.56    0.031990          13      2379           futex
  0.00    0.000001           0        30           read
------ ----------- ----------- --------- --------- ----------------
100.00    0.148380                  9736           total
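
For comparing a hung guest against a healthy one, the summary above can be
turned into per-second call rates. This is a small sketch under two
assumptions: that the columns follow the usual strace -c order (% time,
seconds, usecs/call, calls, syscall), and that the 15 seconds is the
wall-clock time strace was attached. The function names are hypothetical.

```python
SUMMARY = """\
 55.39    0.082181          16      4937           ioctl
 23.05    0.034208          14      2390           ppoll
 21.56    0.031990          13      2379           futex
  0.00    0.000001           0        30           read
------ ----------- ----------- --------- --------- ----------------
100.00    0.148380                  9736           total
"""


def parse_strace_summary(text):
    """Parse strace -c data rows into (syscall, seconds, calls) tuples."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines and the "total" row.
        if len(parts) < 5 or parts[-1] == "total":
            continue
        try:
            seconds, calls = float(parts[1]), int(parts[3])
        except ValueError:
            continue  # separator row of dashes
        rows.append((parts[-1], seconds, calls))
    return rows


def calls_per_second(rows, window_seconds):
    """Return {syscall: calls / window_seconds}."""
    return {name: calls / window_seconds for name, _, calls in rows}


rows = parse_strace_summary(SUMMARY)
for name, rate in calls_per_second(rows, window_seconds=15).items():
    print(f"{name:8s} {rate:8.1f} calls/s")
```

The same script run against a strace summary from a responsive guest would
give a baseline to compare these rates against.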


-----------------------

KVM stat live output:

         APIC_ACCESS      11491    44.50%     5.12%      1.82us   5021.41us      9.41us ( +-   9.87% )
       EPT_MISCONFIG       8007    31.01%    17.51%     11.25us  11003.56us     46.18us ( +-   8.59% )
                 HLT       3804    14.73%    76.26%      2.19us   3962.04us    423.30us ( +-   0.44% )
  EXTERNAL_INTERRUPT       2492     9.65%     0.46%      0.96us    163.72us      3.91us ( +-   4.36% )
   PAUSE_INSTRUCTION         14     0.05%     0.00%      1.24us      6.59us      3.40us ( +-  14.61% )
      IO_INSTRUCTION          8     0.03%     0.64%      8.34us  13470.05us   1693.58us ( +-  99.34% )
   PENDING_INTERRUPT          3     0.01%     0.00%      2.31us      3.69us      2.99us ( +-  13.36% )
       EPT_VIOLATION          2     0.01%     0.00%      7.74us      9.37us      8.56us ( +-   9.51% )
       EXCEPTION_NMI          1     0.00%     0.00%      3.09us      3.09us      3.09us ( +-   0.00% )
 TPR_BELOW_THRESHOLD          1     0.00%     0.00%      2.39us      2.39us      2.39us ( +-   0.00% )


We have tried quite a few of the Hyper-V enlightenments and clock settings,
and none of them resolve the issue.

We can reproduce the issue by stopping all 7 guests on the host and starting
them up at the same time. It may take 2-3 cycles of en-masse restarts, after
which 1 or 2 of the 7 guests will exhibit this behavior.
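
For anyone who wants to try the reproduction, the stop-everything,
start-everything cycle could be scripted roughly as below. This is only a
sketch and makes assumptions the report does not state: that the guests are
managed through libvirt and virsh, and that a hard stop (virsh destroy) is an
acceptable stand-in for "stopping". The helper names are made up.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def virsh(*args):
    """Run one virsh command and return its stdout (assumes libvirt is in use)."""
    result = subprocess.run(
        ["virsh", *args], check=True, capture_output=True, text=True
    )
    return result.stdout


def running_domains(run=virsh):
    """Names of currently running domains, from `virsh list --name`."""
    return [name.strip() for name in run("list", "--name").splitlines() if name.strip()]


def restart_all(domains, run=virsh):
    """Hard-stop every domain, then start them all at (roughly) the same time."""
    for name in domains:
        run("destroy", name)
    # One worker per guest so the starts are issued concurrently.
    with ThreadPoolExecutor(max_workers=max(1, len(domains))) as pool:
        list(pool.map(lambda name: run("start", name), domains))


# Example use on a test host (not executed here):
#   guests = running_domains()
#   for cycle in range(3):      # the report needed 2-3 cycles
#       restart_all(guests)
```

The run parameter exists only so the command runner can be swapped out for a
dry run; on a real host the default would be used. After each cycle the guests
still need to be checked by hand (VNC, ping) to see whether any have hung.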


CPU is host-passthrough
Server has 24 cores with 256G of ram 


Looking to see if anyone in the community has encountered this problem. 

Thanks 
Bill

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.
