[PATCH] vfio: avoid SET_ACTION_TRIGGER ioctls

* [PATCH] vfio: avoid SET_ACTION_TRIGGER ioctls
@ 2020-02-28 12:08 Roman Kapl
  2020-03-05 22:37 ` Alex Williamson
  0 siblings, 1 reply; 4+ messages in thread
From: Roman Kapl @ 2020-02-28 12:08 UTC (permalink / raw)
  Cc: Alex Williamson, qemu-devel, Roman Kapl

For MSI-X interrupts, remember what the last used eventfd was (KVM
bypass vs QEMU) and only call vfio_set_irq_signaling if it has changed.

This not only helps with performance, but it seems that interrupts can
be lost during VFIO_IRQ_SET_ACTION_TRIGGER. With the 'x-no-kvm-msix'
switch and this patch, SET_ACTION_TRIGGER is not called during
mask/unmask. This really only affects guests that actively use MSI-X masking.

Signed-off-by: Roman Kapl <rka@sysgo.com>
---

This patch scratches my particular itch. I am able to get our guest (which masks
MSI on each interrupt) running, without getting randomly stuck on waiting for
interrupt. However, the solution is far from perfect (x-no-kvm-msix is required)
and pretty slow. I would be interested in hearing any ideas how to improve this.
Some ideas:

1) Fix the kernel so that SET_ACTION_TRIGGER does not loose interrupts (I think
the problem is there, but not 100% sure). I've tested on 5.3.0-40-generic
#32~18.04.1-Ubuntu SMP.

2) Add support for MASK/UNMASK for MSI-X in kernel and use that. But I don't
know how to do PBA in that case. Another IOCTL? We could look at the real PBA
array, if mapping is supported, but that seems hacky.

3) Twiddle the bits behing kernel's back, if it can be mapped?

Still, I think this patch does not hurt anything and could be applied if no-one
can think of a better way.

---

 hw/vfio/pci.c | 32 ++++++++++++++++++++++----------
 hw/vfio/pci.h |  2 ++
 2 files changed, 24 insertions(+), 10 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index e6569a7968..5f7ce91519 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -390,12 +390,16 @@ static int vfio_enable_vectors(VFIOPCIDevice *vdev, bool msix)
          * MSI-X mask and pending bits are emulated, so we want to use the
          * KVM signaling path only when configured and unmasked.
          */
-        if (vdev->msi_vectors[i].use) {
-            if (vdev->msi_vectors[i].virq < 0 ||
-                (msix && msix_is_masked(&vdev->pdev, i))) {
-                fd = event_notifier_get_fd(&vdev->msi_vectors[i].interrupt);
+        VFIOMSIVector *vector = &vdev->msi_vectors[i];
+        if (vector->use) {
+            if (vector->virq < 0 ||
+                (msix && msix_is_masked(&vdev->pdev, i)))
+            {
+                vector->kvm_path_active = false;
+                fd = event_notifier_get_fd(&vector->interrupt);
             } else {
-                fd = event_notifier_get_fd(&vdev->msi_vectors[i].kvm_interrupt);
+                vector->kvm_path_active = true;
+                fd = event_notifier_get_fd(&vector->kvm_interrupt);
             }
         }
 
@@ -509,17 +513,23 @@ static int vfio_msix_vector_do_use(PCIDevice *pdev, unsigned int nr,
     } else {
         Error *err = NULL;
         int32_t fd;
+        bool kvm_path;
 
         if (vector->virq >= 0) {
             fd = event_notifier_get_fd(&vector->kvm_interrupt);
+            kvm_path = true;
         } else {
             fd = event_notifier_get_fd(&vector->interrupt);
+            kvm_path = false;
         }
 
-        if (vfio_set_irq_signaling(&vdev->vbasedev,
-                                     VFIO_PCI_MSIX_IRQ_INDEX, nr,
-                                     VFIO_IRQ_SET_ACTION_TRIGGER, fd, &err)) {
-            error_reportf_err(err, VFIO_MSG_PREFIX, vdev->vbasedev.name);
+        if (vector->kvm_path_active != kvm_path) {
+            if (vfio_set_irq_signaling(&vdev->vbasedev,
+                                       VFIO_PCI_MSIX_IRQ_INDEX, nr,
+                                       VFIO_IRQ_SET_ACTION_TRIGGER, fd, &err)) {
+                error_reportf_err(err, VFIO_MSG_PREFIX, vdev->vbasedev.name);
+            }
+            vector->kvm_path_active = kvm_path;
         }
     }
 
@@ -555,13 +565,15 @@ static void vfio_msix_vector_release(PCIDevice *pdev, unsigned int nr)
      * core will mask the interrupt and set pending bits, allowing it to
      * be re-asserted on unmask.  Nothing to do if already using QEMU mode.
      */
-    if (vector->virq >= 0) {
+    if (vector->virq >= 0 && vector->kvm_path_active) {
         int32_t fd = event_notifier_get_fd(&vector->interrupt);
         Error *err = NULL;
 
         if (vfio_set_irq_signaling(&vdev->vbasedev, VFIO_PCI_MSIX_IRQ_INDEX, nr,
                                    VFIO_IRQ_SET_ACTION_TRIGGER, fd, &err)) {
             error_reportf_err(err, VFIO_MSG_PREFIX, vdev->vbasedev.name);
+        } else {
+            vector->kvm_path_active = false;
         }
     }
 }
diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
index b329d50338..b01d2676cf 100644
--- a/hw/vfio/pci.h
+++ b/hw/vfio/pci.h
@@ -91,6 +91,8 @@ typedef struct VFIOMSIVector {
      */
     EventNotifier interrupt;
     EventNotifier kvm_interrupt;
+    /* Set when the trigger action is set to the KVM bypass FD */
+    bool kvm_path_active;
     struct VFIOPCIDevice *vdev; /* back pointer to device */
     int virq;
     bool use;
-- 
2.22.0



^ permalink raw reply related	[flat|nested] 4+ messages in thread