[PATCH v9 0/2] kvm: level irqfd support

* [PATCH v9 0/2] kvm: level irqfd support
@ 2012-08-21 19:28 Alex Williamson
  2012-08-21 19:29 ` [PATCH v9 1/2] kvm: Use a reserved IRQ source ID for irqfd Alex Williamson
                   ` (4 more replies)
  0 siblings, 5 replies; 35+ messages in thread
From: Alex Williamson @ 2012-08-21 19:28 UTC
  To: avi, mst; +Cc: gleb, kvm, linux-kernel

Here's the much-anticipated rewrite of support for level irqfds.  As
Michael suggested, I've rolled the eoi/ack notification fd into
KVM_IRQFD as a new mode.  For lack of a better name, as there seem to
be objections to associating this specifically with an EOI or an ACK,
I've named this OADN, or "On Ack, De-assert & Notify".

Patch 1/2 switches current KVM_IRQFDs to use their own IRQ source ID
since we're potentially stepping on KVM_USERSPACE_IRQ_SOURCE_ID.
Unfortunately I was not able to make 2/2 use a single IRQ source ID
because doing so is racy.  Objects to track OADNs are created
dynamically: we look through the existing ones for a match under a
spinlock and set up a new one if there's no match.  On teardown, we can
remove the OADN from the list under the lock, but that same lock
prevents us from de-assigning the IRQ ACK notifier or waiting for an
RCU grace period.  We must make sure that any unused GSI is
de-asserted, but the above means it's possible that another OADN has
been created for this source ID/GSI in the meantime, and de-asserting
the GSI could lead to breakage.  Instead each OADN object gets its own
source ID, but each object is shared by all users of the same GSI.  So
for PCI devices, we might have up to 4 IRQ source IDs allocated.

Michael had also suggested avoiding reference counting and using
list_empty for this OADN object.  Unfortunately, that doesn't work
for similar reasons.  We want to release the OADN object under lock,
preventing others from re-using it on the free path, but in order
to have lock-less de-assert & notify we use RCU, meaning we can't
trust list_empty until after an RCU grace period, and waiting for
one must be done outside of spinlocks.
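
To make the scheme above concrete, here is a rough userspace sketch of
the lifetime rules, not the code from the patches.  A pthread mutex
stands in for the spinlock, a bitmap stands in for the IRQ source ID
allocator, and every name in it (struct oadn, oadn_get, oadn_put,
alloc_source_id, MAX_SOURCE_IDS) is made up for illustration.  The
comment in oadn_put marks where the blocking teardown steps would go.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define MAX_SOURCE_IDS 32            /* hypothetical limit for this sketch */

struct oadn {
    int gsi;
    int source_id;                   /* private to this object */
    int refs;                        /* users currently sharing this GSI */
    struct oadn *next;
};

static pthread_mutex_t oadn_lock = PTHREAD_MUTEX_INITIALIZER;
static struct oadn *oadn_list;
static unsigned long source_id_bitmap;

/* Caller must hold oadn_lock. Returns a free ID, or -1 if none is left. */
static int alloc_source_id(void)
{
    for (int id = 0; id < MAX_SOURCE_IDS; id++) {
        if (!(source_id_bitmap & (1UL << id))) {
            source_id_bitmap |= 1UL << id;
            return id;
        }
    }
    return -1;
}

/* Find the object for this GSI under the lock, or create one if none matches. */
struct oadn *oadn_get(int gsi)
{
    struct oadn *o;

    pthread_mutex_lock(&oadn_lock);
    for (o = oadn_list; o; o = o->next) {
        if (o->gsi == gsi) {
            o->refs++;               /* share the existing object */
            goto out;
        }
    }
    o = malloc(sizeof(*o));
    if (o) {
        o->source_id = alloc_source_id();
        if (o->source_id < 0) {
            free(o);
            o = NULL;
            goto out;
        }
        o->gsi = gsi;
        o->refs = 1;
        o->next = oadn_list;
        oadn_list = o;
    }
out:
    pthread_mutex_unlock(&oadn_lock);
    return o;
}

/* Drop one reference; the last user unlinks under the lock, cleans up outside. */
void oadn_put(struct oadn *o)
{
    struct oadn **pp;
    int last;

    pthread_mutex_lock(&oadn_lock);
    assert(o->refs > 0);
    last = (--o->refs == 0);
    if (last) {
        /* Once unlinked, no new user can find and re-use this object. */
        for (pp = &oadn_list; *pp; pp = &(*pp)->next) {
            if (*pp == o) {
                *pp = o->next;
                break;
            }
        }
    }
    pthread_mutex_unlock(&oadn_lock);
    if (!last)
        return;

    /*
     * Blocking teardown would go here, outside the lock: drop the ack
     * notifier, wait for readers, de-assert the GSI for o->source_id.
     * A new object for the same GSI may already exist by now, but it
     * holds a different source ID, so the de-assert cannot disturb it.
     */
    pthread_mutex_lock(&oadn_lock);
    source_id_bitmap &= ~(1UL << o->source_id);
    pthread_mutex_unlock(&oadn_lock);
    free(o);
}
```

Two oadn_get() calls for the same GSI return the same object and share
one source ID, while a different GSI gets a new object with its own ID.
The reference count is what lets the last user decide, under the lock,
that the object is dead without having to trust list_empty.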

If there are suggestions how we can handle these better, please
make them, but I think this compromise is race-free and still
manages to make allocation of IRQ source IDs mostly a non-issue
for device assignment limits.  Thanks,

Alex

---

Alex Williamson (2):
      kvm: On Ack, De-assert & Notify KVM_IRQFD extension
      kvm: Use a reserved IRQ source ID for irqfd

 Documentation/virtual/kvm/api.txt |   13 ++
 arch/x86/kvm/x86.c                |    4 +
 include/linux/kvm.h               |    7 +
 include/linux/kvm_host.h          |    2 
 virt/kvm/eventfd.c                |  199 ++++++++++++++++++++++++++++++++++++-
 5 files changed, 218 insertions(+), 7 deletions(-)
