On Wed, 2019-02-20 at 20:15 +0000, Joao Martins wrote:
> @@ -176,6 +177,9 @@ int kvm_arch_set_irq_inatomic(struct kvm_kernel_irq_routing_entry *e,
>         int r;
>  
>         switch (e->type) {
> +       case KVM_IRQ_ROUTING_XEN_EVTCHN:
> +               return kvm_xen_set_evtchn(e, kvm, irq_source_id, level,
> +                                      line_status);
>         case KVM_IRQ_ROUTING_HV_SINT:
>                 return kvm_hv_set_sint(e, kvm, irq_source_id, level,
>                                        line_status);
> @@ -325,6 +329,13 @@ int kvm_set_routing_entry(struct kvm *kvm,
>                 e->hv_sint.vcpu = ue->u.hv_sint.vcpu;
>                 e->hv_sint.sint = ue->u.hv_sint.sint;
>                 break;
> +       case KVM_IRQ_ROUTING_XEN_EVTCHN:
> +               e->set = kvm_xen_set_evtchn;
> +               e->evtchn.vcpu = ue->u.evtchn.vcpu;
> +               e->evtchn.vector = ue->u.evtchn.vector;
> +               e->evtchn.via = ue->u.evtchn.via;
> +
> +               return kvm_xen_setup_evtchn(kvm, e);
>         default:
>                 return -EINVAL;
>         }


Hmm. I'm not sure I'd have done it that way.

These IRQ routing entries aren't for individual event channel ports;
they don't map to kvm_xen_evtchn_send().

They actually represent the upcall to the given vCPU when any event
channel is signalled, and it's really per-vCPU configuration.

When the kernel raises (IPI, VIRQ) events on a given CPU, it doesn't
actually use these routing entries; it just uses the values in
vcpu_xen->cb.{via,vector}, which were cached from whichever of these
IRQ routing entries happened to be processed last?

The VMM is *expected* to set up precisely one of these for each vCPU,
right? Would it not be better to do that via KVM_XEN_HVM_SET_ATTR?

The usage model for userspace is presumably that the VMM should set the
appropriate bit in evtchn_pending, check evtchn_mask and then call into
the kernel to do the set_irq() to inject the callback vector to the
guest?
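
To make that flow concrete, here is a minimal sketch of the
"set pending, check mask, then inject" sequence, loosely following the
2-level event channel ABI. The sketch_* structs are simplified
stand-ins for the shared_info and vcpu_info fields, not the real
layouts, and the function name is made up for illustration. It relies
on the GCC/Clang __atomic builtins. The return value only says whether
the caller should go on to inject the callback; how that injection
happens is deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_BITS_PER_WORD 64
#define SKETCH_NR_WORDS      64   /* 64 * 64 = 4096 ports */

/* Simplified stand-in for the event channel fields of shared_info. */
struct sketch_shared_info {
	uint64_t evtchn_pending[SKETCH_NR_WORDS];
	uint64_t evtchn_mask[SKETCH_NR_WORDS];
};

/* Simplified stand-in for the per-vCPU vcpu_info fields. */
struct sketch_vcpu_info {
	uint8_t  evtchn_upcall_pending;
	uint64_t evtchn_pending_sel;
};

/*
 * Mark 'port' pending for the vCPU described by 'v'.
 * Returns true if the caller should now inject the upcall.
 */
static bool sketch_evtchn_set_pending(struct sketch_shared_info *s,
				      struct sketch_vcpu_info *v,
				      unsigned int port)
{
	unsigned int word = port / SKETCH_BITS_PER_WORD;
	uint64_t bit = 1ULL << (port % SKETCH_BITS_PER_WORD);
	uint64_t old;

	if (port >= SKETCH_NR_WORDS * SKETCH_BITS_PER_WORD)
		return false;

	/* Already pending: nothing new to deliver. */
	old = __atomic_fetch_or(&s->evtchn_pending[word], bit,
				__ATOMIC_SEQ_CST);
	if (old & bit)
		return false;

	/* Masked: the port stays pending but no upcall is raised. */
	if (__atomic_load_n(&s->evtchn_mask[word], __ATOMIC_SEQ_CST) & bit)
		return false;

	/* Tell the guest which word of evtchn_pending to scan. */
	old = __atomic_fetch_or(&v->evtchn_pending_sel, 1ULL << word,
				__ATOMIC_SEQ_CST);
	if (old & (1ULL << word))
		return false;

	/* Only the 0 -> 1 transition needs an injection. */
	return !__atomic_exchange_n(&v->evtchn_upcall_pending, 1,
				    __ATOMIC_SEQ_CST);
}
```

Whoever owns this function, VMM or kernel, also owns the atomicity
requirements against the guest, which is part of the argument for
keeping it in one place.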

I might be more inclined to go for a model where the kernel handles the
evtchn_pending/evtchn_mask for us. What would go into the irq routing
table is { vcpu, port# } which get passed to kvm_xen_evtchn_send().

Does that seem reasonable?

Either way, I do think we need a way for events raised in the kernel to
be signalled to userspace if they are targeted at a vCPU which uses
CALLBACK_VIA_INTX, since the kernel can't deliver that directly. So we
probably *do* need that eventfd I was objecting to earlier, except that
it's not a per-evtchn thing; it's per-vCPU.
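
As a rough illustration of the userspace half of a per-vCPU eventfd,
here is a sketch using plain eventfd(2). Everything named sketch_* is
hypothetical, and sketch_signal_vcpu() merely stands in for whatever
the kernel would do on its side; no interface for registering such an
eventfd with KVM is assumed here.

```c
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

#define SKETCH_MAX_VCPUS 4

/* One eventfd per vCPU, rather than one per event channel port. */
struct sketch_vcpu_notifier {
	int efd[SKETCH_MAX_VCPUS];
};

/* Create the per-vCPU eventfds; returns 0 on success, -1 on failure. */
static int sketch_notifier_init(struct sketch_vcpu_notifier *n)
{
	for (int i = 0; i < SKETCH_MAX_VCPUS; i++) {
		n->efd[i] = eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK);
		if (n->efd[i] < 0) {
			/* Unwind the descriptors created so far. */
			while (i-- > 0)
				close(n->efd[i]);
			return -1;
		}
	}
	return 0;
}

/* Stand-in for the kernel signalling "this vCPU needs an upcall". */
static int sketch_signal_vcpu(struct sketch_vcpu_notifier *n,
			      unsigned int vcpu)
{
	uint64_t one = 1;

	if (vcpu >= SKETCH_MAX_VCPUS)
		return -1;
	return write(n->efd[vcpu], &one, sizeof(one)) ==
	       (ssize_t)sizeof(one) ? 0 : -1;
}

/*
 * VMM side: consume pending notifications for one vCPU.
 * Returns how many signals were coalesced, or 0 if there were none.
 */
static uint64_t sketch_drain_vcpu(struct sketch_vcpu_notifier *n,
				  unsigned int vcpu)
{
	uint64_t count = 0;

	if (vcpu >= SKETCH_MAX_VCPUS)
		return 0;
	if (read(n->efd[vcpu], &count, sizeof(count)) !=
	    (ssize_t)sizeof(count))
		return 0;
	return count;
}
```

Because the eventfd counter coalesces multiple signals into one read,
the VMM would still scan the pending bits after waking up; the eventfd
only says "look at this vCPU", not which port fired.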