In kvm_cpu_has_interrupt() we see the following FIXME:

	/*
	 * FIXME: interrupt.injected represents an interrupt that it's
	 * side-effects have already been applied (e.g. bit from IRR
	 * already moved to ISR). Therefore, it is incorrect to rely
	 * on interrupt.injected to know if there is a pending
	 * interrupt in the user-mode LAPIC.
	 * This leads to nVMX/nSVM not be able to distinguish
	 * if it should exit from L2 to L1 on EXTERNAL_INTERRUPT on
	 * pending interrupt or should re-inject an injected
	 * interrupt.
	 */

I'm using nested VMX for testing while I add split-irqchip support to
my VMM, and I see the vCPU lock up when attempting to deliver an
interrupt.

What seems to happen is that request_interrupt_window is set, causing
an immediate vmexit because an IRQ *can* be delivered. But then
kvm_vcpu_ready_for_interrupt_injection() returns false, because
kvm_cpu_has_interrupt() is true. Because that returns false, the kernel
just continues looping in vcpu_run(), constantly vmexiting and going
right back in.

This utterly naïve hack makes my L2 guest boot properly, by not
enabling the irq window when we were going to ignore the exit anyway.
Is there a better fix?

I must also confess I'm working on a slightly older kernel in L1, and
have forward-ported this to a more recent tree without actually testing
it, because from inspection it looks like exactly the same issue still
exists there.

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 397f599b20e5..e23f0c8b4a16 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8830,7 +8830,10 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 	}
 
 	inject_pending_event(vcpu, &req_immediate_exit);
-	if (req_int_win)
+	/* Don't enable the interrupt window for userspace if
+	 * kvm_cpu_has_interrupt() is set and we'd never actually
+	 * exit with ready_for_interrupt_window set anyway. */
+	if (req_int_win && !kvm_cpu_has_interrupt(vcpu))
 		kvm_x86_ops.enable_irq_window(vcpu);
 
 	if (kvm_lapic_enabled(vcpu)) {