On Wed, 2020-12-09 at 10:51 +0000, Joao Martins wrote:
> Isn't this what the first half of this patch was doing initially (minus the
> irq routing) ? Looks really similar:
>
> https://lore.kernel.org/kvm/20190220201609.28290-11-joao.m.martins@oracle.com/

Absolutely! This thread is in reply to your original posting of precisely
that patch, and I've had your tree open in gitk to crib from for most of
the last week.

There's a *reason* why my tree looks like yours, and why more than half
of the patches in it still show you as being the author :)

> Albeit, I gotta after seeing the irq routing removed it ends much simpler, if we just
> replace the irq routing with a domain-wide upcall vector ;)

I like "simpler". I also killed the ->cb.queued flag you had, because I
think it's redundant with evtchn_upcall_pending anyway.

> Albeit it wouldn't cover the Windows Xen open source drivers which use the EVTCHN method
> (which is a regular LAPIC vector) [to answer your question on what uses it] For the EVTCHN
> you can just inject a regular vector through lapic deliver, and guest acks it. Sadly Linux
> doesn't use it,
> and if it was the case we would probably get faster upcalls with APICv/AVIC.

It doesn't need to, because those can just be injected with KVM_SIGNAL_MSI.

At most, we just need to make sure that kvm_xen_has_interrupt() returns
false if the per-vCPU LAPIC vector is configured. But I didn't do that
because I checked Xen and it doesn't do it either.

As far as I can tell, Xen's hvm_vcpu_has_pending_irq() will still return
the domain-wide vector in preference to the one in the LAPIC, if it
actually gets invoked. And if the guest sets ->evtchn_upcall_pending to
reinject the IRQ (as Linux does on unmask), Xen won't use the per-vCPU
vector to inject that; it'll use the domain-wide vector.

> > I'm not sure that condition wasn't *already* broken in some cases for
> > KVM_INTERRUPT anyway. In kvm_vcpu_ioctl_interrupt() we set
> > vcpu->arch.pending_userspace_vector and we *do* request KVM_REQ_EVENT,
> > sure.
> >
> > But what if vcpu_enter_guest() processes that the first time (and
> > clears KVM_REQ_EVENT), and then exits for some *other* reason with
> > interrupts still disabled? Next time through vcpu_enter_guest(), even
> > though kvm_cpu_has_injectable_intr() is still true, we don't enable the
> > IRQ window vmexit because KVM_REQ_EVENT got lost, so we don't even call
> > inject_pending_event().
> >
> > So instead of just kvm_xen_has_interrupt() in my patch below, I wonder
> > if we need to use kvm_cpu_has_injectable_intr() to fix the existing
> > bug? Or am I missing something there and there isn't a bug after all?
>
> Given that the notion of an event channel pending is Xen specific handling, I am not sure
> we can remove the kvm_xen_has_interrupt()/kvm_xen_get_interrupt() logic. Much of the
> reason that I ended up checking on vmenter that checks event channels pendings..

Sure, we definitely need the check I added in vcpu_enter_guest() for Xen,
unless I'm going to come up with a way to set KVM_REQ_EVENT at the
appropriate time.

But I'm looking at the ExtINT handling, and as far as I can tell it's
buggy. So I didn't want to use it as a model for setting KVM_REQ_EVENT
for Xen events.
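
For the avoidance of doubt, the check I keep talking about is roughly the
shape below. This is only a sketch, not the actual patch: the
kvm->arch.xen.upcall_vector and vcpu->arch.xen.vcpu_info names are
placeholders for whatever the real fields end up being called.

/*
 * Sketch only. Field names under ->arch.xen are placeholders;
 * struct vcpu_info and evtchn_upcall_pending are the Xen shared
 * ABI ones.
 */
static bool kvm_xen_has_interrupt(struct kvm_vcpu *vcpu)
{
	struct vcpu_info *vi = vcpu->arch.xen.vcpu_info;

	/* Nothing to inject if no domain-wide upcall vector is set. */
	if (!vcpu->kvm->arch.xen.upcall_vector)
		return false;

	/*
	 * The guest can set evtchn_upcall_pending itself (Linux does on
	 * unmask), so KVM never sees a KVM_REQ_EVENT for it. That's why
	 * vcpu_enter_guest() has to poll this directly instead of
	 * relying on the request bit.
	 */
	return vi && READ_ONCE(vi->evtchn_upcall_pending);
}

In the real thing the vcpu_info lives in guest memory, so there's a
map/cache dance rather than a bare pointer dereference, but the logic is
the same.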