Avi Kivity wrote:
> Gregory Haskins wrote:
>>> - works with all guests
>>> - supports hotplug/hotunplug, udev, sysfs, module autoloading, ...
>>> - supported in all OSes
>>> - someone else maintains it
>>
>> These points are all valid, and I really struggled with this particular
>> part of the design.  The entire vbus design only requires one IRQ for
>> the entire guest,
>
> Won't this have scaling issues?  One IRQ means one target vcpu.
> Whereas I'd like virtio devices to span multiple queues, each queue
> with its own MSI IRQ.

Hmm.. you know, I hadn't really thought of it that way, but you have a
point.  To clarify, my design actually uses one IRQ per "eventq", and we
can define an arbitrary number of eventqs (note: today I only define one
eventq, however).  An eventq is a shm-ring construct through which I can
pass events up to the guest, such as "device added" or "ring X signaled".
Each individual device-based virtio-ring then aggregates its "signal"
events onto this eventq mechanism to actually inject events into the
guest.  Only the eventq itself injects an actual IRQ to the assigned
vcpu.

My intended use of multiple eventqs was prioritization of different
rings.  For instance, we could define 8 priority levels, each with its
own ring/IRQ.  That way, a virtio-net that supports something like
802.1p could define 8 virtio-rings, one for each priority level.

But this scheme is targeted more at prioritization than at per-vcpu IRQ
balancing.  I suppose the eventq construct I proposed could still be
used in that fashion, since each eventq has its own routable IRQ.
However, I would have to think about that some more, because it is
beyond the current design spec.

The good news is that the decision to use the "eventq+irq" approach is
completely contained in the kvm-host+guest.patch.  We could easily
switch to a 1:1 irq:shm-signal mapping if we wanted to, and the
devices/drivers would work exactly the same without modification.

> Also, the single IRQ handler will need to scan for all potential IRQ
> sources.  Even if implemented carefully, this will cause many
> cacheline bounces.

Well, no, I think this part is covered.  As mentioned above, we use a
queuing technique, so no scanning is needed.  Ultimately I would love to
adapt a similar technique to optionally replace the LAPIC, so that we
can avoid the EOI trap and just consume the next interrupt (if
applicable) straight from the shm-ring.

>
>> so it's conceivable that I could present a simple
>> "dummy" PCI device with some "VBUS" type PCI-ID, just to piggyback on
>> the IRQ routing logic.  Then userspace could simply pass the IRQ
>> routing info down to the kernel with an ioctl, or something similar.
>>
>
> Xen does something similar, I believe.
>
>> I think ultimately I was trying to stay away from PCI in general
>> because I want to support environments that do not have PCI.  However,
>> for the kvm-transport case (at least on x86) this isn't really a
>> constraint.
>>
>>
>
> s/PCI/the native IRQ solution for your platform/.  virtio has the same
> problem; on s390 we use the native (if that word ever applies to s390)
> interrupt and device discovery mechanism.

Yeah, I agree.  We can contain the "exposure" of PCI to just the
platforms within KVM that care about it.

-Greg
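
P.S. To make the eventq idea a bit more concrete, here is a minimal
userspace sketch of the aggregation scheme described above.  The names
(struct eventq, eventq_post(), and so on) are illustrative only and are
not the actual vbus API; the sketch assumes a single producer and
consumer and ignores memory barriers and ring overflow.  It just shows
per-ring "signal" events being coalesced onto one shared ring, with the
single IRQ raised only on the empty-to-non-empty transition and the
handler draining the ring instead of scanning devices.

/*
 * Illustrative sketch only -- not the actual vbus API.  Many per-device
 * rings aggregate their events onto one event ring; an IRQ is injected
 * only when that ring goes from empty to non-empty, and the single IRQ
 * handler drains the ring, so there is no per-device scanning.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define EVENTQ_SIZE 64			/* entries; power of two */

enum event_type {
	EVENT_DEVICE_ADD,		/* hotplug-style notification */
	EVENT_RING_SIGNAL,		/* "ring X signaled" */
};

struct event {
	enum event_type type;
	unsigned int	id;		/* device or ring identifier */
};

struct eventq {
	struct event	ring[EVENTQ_SIZE];
	uint32_t	head;		/* producer index */
	uint32_t	tail;		/* consumer index */
	int		irq;		/* the one IRQ backing this queue */
};

/* Stand-in for actually injecting an interrupt into the target vcpu. */
static void inject_irq(int irq)
{
	printf("inject IRQ %d\n", irq);
}

/*
 * Producer side: any ring/device posts its event here.  Only the
 * empty->non-empty transition kicks the IRQ; further events just queue.
 */
static void eventq_post(struct eventq *q, enum event_type type,
			unsigned int id)
{
	bool was_empty = (q->head == q->tail);

	q->ring[q->head % EVENTQ_SIZE] = (struct event){ type, id };
	q->head++;

	if (was_empty)
		inject_irq(q->irq);
}

/* Consumer side: the single IRQ handler just drains the queue. */
static void eventq_drain(struct eventq *q)
{
	while (q->tail != q->head) {
		struct event *ev = &q->ring[q->tail % EVENTQ_SIZE];

		if (ev->type == EVENT_RING_SIGNAL)
			printf("ring %u signaled\n", ev->id);
		else
			printf("device %u added\n", ev->id);

		q->tail++;
	}
}

int main(void)
{
	struct eventq q = { .irq = 42 };

	/* Two different virtio-rings coalesce onto the same eventq ... */
	eventq_post(&q, EVENT_RING_SIGNAL, 0);
	eventq_post(&q, EVENT_RING_SIGNAL, 3);

	/* ... and the (single) IRQ handler consumes them without scanning. */
	eventq_drain(&q);
	return 0;
}

In the real design the ring would of course live in guest/host shared
memory and the "IRQ" would be an actual injection into the assigned
vcpu; this is only meant to illustrate why one IRQ does not imply
scanning every potential source.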