* [PATCH v2 0/3] virtio: Use ioeventfd for virtqueue notify
@ 2010-11-11 13:47 ` Stefan Hajnoczi
From: Stefan Hajnoczi @ 2010-11-11 13:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: kvm, Michael S. Tsirkin

This is a rewrite of the virtio-ioeventfd patchset to work at the virtio-pci.c
level instead of virtio.c.  This results in better integration with the
host/guest notifier code and makes the code simpler (no more state machine).

Virtqueue notify is currently handled synchronously in userspace virtio.  This
prevents the vcpu from executing guest code while hardware emulation code
handles the notify.

On systems that support KVM, the ioeventfd mechanism can be used to make
virtqueue notify a lightweight exit by deferring hardware emulation to the
iothread and allowing the VM to continue execution.  This model is similar to
how vhost receives virtqueue notifies.
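
For reference, the kernel-side primitive behind this is the KVM_IOEVENTFD
ioctl: the guest's 16-bit write of the virtqueue index to the notify port
signals an eventfd instead of taking a heavyweight exit back into the vcpu
thread.  A rough sketch of the assign step (the helper name is illustrative;
in this series the QEMU wrapper kvm_set_ioeventfd_pio_word() issues the
ioctl):

    #include <stdint.h>
    #include <sys/ioctl.h>
    #include <linux/kvm.h>

    /* Ask KVM to signal 'fd' whenever the guest writes virtqueue index 'vq'
     * as a 16-bit value to the PIO notify port at 'addr'. */
    static int assign_notify_ioeventfd(int vmfd, int fd, uint64_t addr,
                                       uint16_t vq)
    {
        struct kvm_ioeventfd kick = {
            .datamatch = vq,
            .addr      = addr,
            .len       = 2,
            .fd        = fd,
            .flags     = KVM_IOEVENTFD_FLAG_DATAMATCH | KVM_IOEVENTFD_FLAG_PIO,
        };
        return ioctl(vmfd, KVM_IOEVENTFD, &kick);
    }

The iothread then watches the eventfd and runs the virtqueue handler when it
fires; deassignment uses the same ioctl with KVM_IOEVENTFD_FLAG_DEASSIGN set.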

The result of this change is improved performance for userspace virtio devices.
Virtio-blk throughput increases especially for multithreaded scenarios and
virtio-net transmit throughput increases substantially.

Now that this code lives in virtio-pci.c it is possible to enable
virtio-ioeventfd explicitly per device type.  Only virtio-blk and virtio-net
are enabled at this time.


* [PATCH 1/3] virtio-pci: Rename bugs field to flags
  2010-11-11 13:47 ` [Qemu-devel] " Stefan Hajnoczi
@ 2010-11-11 13:47   ` Stefan Hajnoczi
From: Stefan Hajnoczi @ 2010-11-11 13:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: kvm, Michael S. Tsirkin, Stefan Hajnoczi

The VirtIOPCIProxy bugs field is currently used to enable workarounds
for older guests.  Rename it to flags so that other per-device behavior
can be tracked.

A later patch uses the flags field to remember whether ioeventfd should
be used for virtqueue host notification.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
---
 hw/virtio-pci.c |   15 +++++++--------
 1 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
index 729917d..549118d 100644
--- a/hw/virtio-pci.c
+++ b/hw/virtio-pci.c
@@ -80,9 +80,8 @@
  * 12 is historical, and due to x86 page size. */
 #define VIRTIO_PCI_QUEUE_ADDR_SHIFT    12
 
-/* We can catch some guest bugs inside here so we continue supporting older
-   guests. */
-#define VIRTIO_PCI_BUG_BUS_MASTER	(1 << 0)
+/* Flags track per-device state like workarounds for quirks in older guests. */
+#define VIRTIO_PCI_FLAG_BUS_MASTER_BUG  (1 << 0)
 
 /* QEMU doesn't strictly need write barriers since everything runs in
  * lock-step.  We'll leave the calls to wmb() in though to make it obvious for
@@ -95,7 +94,7 @@
 typedef struct {
     PCIDevice pci_dev;
     VirtIODevice *vdev;
-    uint32_t bugs;
+    uint32_t flags;
     uint32_t addr;
     uint32_t class_code;
     uint32_t nvectors;
@@ -159,7 +158,7 @@ static int virtio_pci_load_config(void * opaque, QEMUFile *f)
        in ready state. Then we have a buggy guest OS. */
     if ((proxy->vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) &&
         !(proxy->pci_dev.config[PCI_COMMAND] & PCI_COMMAND_MASTER)) {
-        proxy->bugs |= VIRTIO_PCI_BUG_BUS_MASTER;
+        proxy->flags |= VIRTIO_PCI_FLAG_BUS_MASTER_BUG;
     }
     return 0;
 }
@@ -185,7 +184,7 @@ static void virtio_pci_reset(DeviceState *d)
     VirtIOPCIProxy *proxy = container_of(d, VirtIOPCIProxy, pci_dev.qdev);
     virtio_reset(proxy->vdev);
     msix_reset(&proxy->pci_dev);
-    proxy->bugs = 0;
+    proxy->flags = 0;
 }
 
 static void virtio_ioport_write(void *opaque, uint32_t addr, uint32_t val)
@@ -235,7 +234,7 @@ static void virtio_ioport_write(void *opaque, uint32_t addr, uint32_t val)
            some safety checks. */
         if ((val & VIRTIO_CONFIG_S_DRIVER_OK) &&
             !(proxy->pci_dev.config[PCI_COMMAND] & PCI_COMMAND_MASTER)) {
-            proxy->bugs |= VIRTIO_PCI_BUG_BUS_MASTER;
+            proxy->flags |= VIRTIO_PCI_FLAG_BUS_MASTER_BUG;
         }
         break;
     case VIRTIO_MSI_CONFIG_VECTOR:
@@ -403,7 +402,7 @@ static void virtio_write_config(PCIDevice *pci_dev, uint32_t address,
 
     if (PCI_COMMAND == address) {
         if (!(val & PCI_COMMAND_MASTER)) {
-            if (!(proxy->bugs & VIRTIO_PCI_BUG_BUS_MASTER)) {
+            if (!(proxy->flags & VIRTIO_PCI_FLAG_BUS_MASTER_BUG)) {
                 virtio_set_status(proxy->vdev,
                                   proxy->vdev->status & ~VIRTIO_CONFIG_S_DRIVER_OK);
             }
-- 
1.7.2.3


* [PATCH 2/3] virtio-pci: Use ioeventfd for virtqueue notify
  2010-11-11 13:47 ` [Qemu-devel] " Stefan Hajnoczi
@ 2010-11-11 13:47   ` Stefan Hajnoczi
From: Stefan Hajnoczi @ 2010-11-11 13:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: kvm, Michael S. Tsirkin, Stefan Hajnoczi

Virtqueue notify is currently handled synchronously in userspace virtio.  This
prevents the vcpu from executing guest code while hardware emulation code
handles the notify.

On systems that support KVM, the ioeventfd mechanism can be used to make
virtqueue notify a lightweight exit by deferring hardware emulation to the
iothread and allowing the VM to continue execution.  This model is similar to
how vhost receives virtqueue notifies.

The result of this change is improved performance for userspace virtio devices.
Virtio-blk throughput increases especially for multithreaded scenarios and
virtio-net transmit throughput increases substantially.

Some virtio devices are known to have guest drivers which expect a notify to be
processed synchronously and spin waiting for completion.  Only enable ioeventfd
for virtio-blk and virtio-net for now.

Care must be taken not to interfere with vhost-net, which already uses
ioeventfd host notifiers.  The following list shows the behavior implemented in
this patch and is designed to take vhost-net into account:

 * VIRTIO_CONFIG_S_DRIVER_OK -> assign host notifiers, qemu_set_fd_handler(virtio_pci_host_notifier_read)
 * reset -> qemu_set_fd_handler(NULL), deassign host notifiers
 * virtio_pci_set_host_notifier(true) -> qemu_set_fd_handler(NULL)
 * virtio_pci_set_host_notifier(false) -> qemu_set_fd_handler(virtio_pci_host_notifier_read)

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
---
 hw/virtio-pci.c |  155 +++++++++++++++++++++++++++++++++++++++++++-----------
 hw/virtio.c     |    5 ++
 hw/virtio.h     |    1 +
 3 files changed, 129 insertions(+), 32 deletions(-)

diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
index 549118d..436fc59 100644
--- a/hw/virtio-pci.c
+++ b/hw/virtio-pci.c
@@ -83,6 +83,10 @@
 /* Flags track per-device state like workarounds for quirks in older guests. */
 #define VIRTIO_PCI_FLAG_BUS_MASTER_BUG  (1 << 0)
 
+/* Performance improves when virtqueue kick processing is decoupled from the
+ * vcpu thread using ioeventfd for some devices. */
+#define VIRTIO_PCI_FLAG_USE_IOEVENTFD   (1 << 1)
+
 /* QEMU doesn't strictly need write barriers since everything runs in
  * lock-step.  We'll leave the calls to wmb() in though to make it obvious for
  * KVM or if kqemu gets SMP support.
@@ -179,12 +183,108 @@ static int virtio_pci_load_queue(void * opaque, int n, QEMUFile *f)
     return 0;
 }
 
+static int virtio_pci_set_host_notifier_ioeventfd(VirtIOPCIProxy *proxy, int n, bool assign)
+{
+    VirtQueue *vq = virtio_get_queue(proxy->vdev, n);
+    EventNotifier *notifier = virtio_queue_get_host_notifier(vq);
+    int r;
+    if (assign) {
+        r = event_notifier_init(notifier, 1);
+        if (r < 0) {
+            return r;
+        }
+        r = kvm_set_ioeventfd_pio_word(event_notifier_get_fd(notifier),
+                                       proxy->addr + VIRTIO_PCI_QUEUE_NOTIFY,
+                                       n, assign);
+        if (r < 0) {
+            event_notifier_cleanup(notifier);
+        }
+    } else {
+        r = kvm_set_ioeventfd_pio_word(event_notifier_get_fd(notifier),
+                                       proxy->addr + VIRTIO_PCI_QUEUE_NOTIFY,
+                                       n, assign);
+        if (r < 0) {
+            return r;
+        }
+        event_notifier_cleanup(notifier);
+    }
+    return r;
+}
+
+static void virtio_pci_host_notifier_read(void *opaque)
+{
+    VirtQueue *vq = opaque;
+    EventNotifier *n = virtio_queue_get_host_notifier(vq);
+    if (event_notifier_test_and_clear(n)) {
+        virtio_queue_notify_vq(vq);
+    }
+}
+
+static void virtio_pci_set_host_notifier_fd_handler(VirtIOPCIProxy *proxy, int n, bool assign)
+{
+    VirtQueue *vq = virtio_get_queue(proxy->vdev, n);
+    EventNotifier *notifier = virtio_queue_get_host_notifier(vq);
+    if (assign) {
+        qemu_set_fd_handler(event_notifier_get_fd(notifier),
+                            virtio_pci_host_notifier_read, NULL, vq);
+    } else {
+        qemu_set_fd_handler(event_notifier_get_fd(notifier),
+                            NULL, NULL, NULL);
+    }
+}
+
+static int virtio_pci_set_host_notifiers(VirtIOPCIProxy *proxy, bool assign)
+{
+    int n, r;
+
+    for (n = 0; n < VIRTIO_PCI_QUEUE_MAX; n++) {
+        if (!virtio_queue_get_num(proxy->vdev, n)) {
+            continue;
+        }
+
+        if (assign) {
+            r = virtio_pci_set_host_notifier_ioeventfd(proxy, n, true);
+            if (r < 0) {
+                goto assign_error;
+            }
+
+            virtio_pci_set_host_notifier_fd_handler(proxy, n, true);
+        } else {
+            virtio_pci_set_host_notifier_fd_handler(proxy, n, false);
+            virtio_pci_set_host_notifier_ioeventfd(proxy, n, false);
+        }
+    }
+    return 0;
+
+assign_error:
+    proxy->flags &= ~VIRTIO_PCI_FLAG_USE_IOEVENTFD;
+    while (--n >= 0) {
+        virtio_pci_set_host_notifier_fd_handler(proxy, n, false);
+        virtio_pci_set_host_notifier_ioeventfd(proxy, n, false);
+    }
+    return r;
+}
+
+static void virtio_pci_reset_vdev(VirtIOPCIProxy *proxy)
+{
+    /* Poke virtio device so it deassigns its host notifiers (if any) */
+    virtio_set_status(proxy->vdev, 0);
+
+    /* Now safely deassign our own host notifiers */
+    if (proxy->flags & VIRTIO_PCI_FLAG_USE_IOEVENTFD) {
+        virtio_pci_set_host_notifiers(proxy, false);
+    }
+
+    virtio_reset(proxy->vdev);
+    msix_unuse_all_vectors(&proxy->pci_dev);
+}
+
 static void virtio_pci_reset(DeviceState *d)
 {
     VirtIOPCIProxy *proxy = container_of(d, VirtIOPCIProxy, pci_dev.qdev);
-    virtio_reset(proxy->vdev);
+    virtio_pci_reset_vdev(proxy);
     msix_reset(&proxy->pci_dev);
-    proxy->flags = 0;
+    proxy->flags &= ~VIRTIO_PCI_FLAG_BUS_MASTER_BUG;
 }
 
 static void virtio_ioport_write(void *opaque, uint32_t addr, uint32_t val)
@@ -209,11 +309,10 @@ static void virtio_ioport_write(void *opaque, uint32_t addr, uint32_t val)
     case VIRTIO_PCI_QUEUE_PFN:
         pa = (target_phys_addr_t)val << VIRTIO_PCI_QUEUE_ADDR_SHIFT;
         if (pa == 0) {
-            virtio_reset(proxy->vdev);
-            msix_unuse_all_vectors(&proxy->pci_dev);
-        }
-        else
+            virtio_pci_reset_vdev(proxy);
+        } else {
             virtio_queue_set_addr(vdev, vdev->queue_sel, pa);
+        }
         break;
     case VIRTIO_PCI_QUEUE_SEL:
         if (val < VIRTIO_PCI_QUEUE_MAX)
@@ -223,10 +322,16 @@ static void virtio_ioport_write(void *opaque, uint32_t addr, uint32_t val)
         virtio_queue_notify(vdev, val);
         break;
     case VIRTIO_PCI_STATUS:
-        virtio_set_status(vdev, val & 0xFF);
-        if (vdev->status == 0) {
-            virtio_reset(proxy->vdev);
-            msix_unuse_all_vectors(&proxy->pci_dev);
+        if ((val & VIRTIO_CONFIG_S_DRIVER_OK) &&
+            !(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) &&
+            (proxy->flags & VIRTIO_PCI_FLAG_USE_IOEVENTFD)) {
+            virtio_pci_set_host_notifiers(proxy, true);
+        }
+
+        if (val & 0xFF) {
+            virtio_set_status(vdev, val & 0xFF);
+        } else {
+            virtio_pci_reset_vdev(proxy);
         }
 
         /* Linux before 2.6.34 sets the device as OK without enabling
@@ -480,30 +585,12 @@ assign_error:
 static int virtio_pci_set_host_notifier(void *opaque, int n, bool assign)
 {
     VirtIOPCIProxy *proxy = opaque;
-    VirtQueue *vq = virtio_get_queue(proxy->vdev, n);
-    EventNotifier *notifier = virtio_queue_get_host_notifier(vq);
-    int r;
-    if (assign) {
-        r = event_notifier_init(notifier, 1);
-        if (r < 0) {
-            return r;
-        }
-        r = kvm_set_ioeventfd_pio_word(event_notifier_get_fd(notifier),
-                                       proxy->addr + VIRTIO_PCI_QUEUE_NOTIFY,
-                                       n, assign);
-        if (r < 0) {
-            event_notifier_cleanup(notifier);
-        }
+    if (proxy->flags & VIRTIO_PCI_FLAG_USE_IOEVENTFD) {
+        virtio_pci_set_host_notifier_fd_handler(proxy, n, !assign);
+        return 0;
     } else {
-        r = kvm_set_ioeventfd_pio_word(event_notifier_get_fd(notifier),
-                                       proxy->addr + VIRTIO_PCI_QUEUE_NOTIFY,
-                                       n, assign);
-        if (r < 0) {
-            return r;
-        }
-        event_notifier_cleanup(notifier);
+        return virtio_pci_set_host_notifier_ioeventfd(proxy, n, assign);
     }
-    return r;
 }
 
 static const VirtIOBindings virtio_pci_bindings = {
@@ -702,6 +789,8 @@ static PCIDeviceInfo virtio_info[] = {
         .qdev.props = (Property[]) {
             DEFINE_PROP_HEX32("class", VirtIOPCIProxy, class_code, 0),
             DEFINE_BLOCK_PROPERTIES(VirtIOPCIProxy, block),
+            DEFINE_PROP_UINT32("flags", VirtIOPCIProxy, flags,
+                               VIRTIO_PCI_FLAG_USE_IOEVENTFD),
             DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors, 2),
             DEFINE_VIRTIO_BLK_FEATURES(VirtIOPCIProxy, host_features),
             DEFINE_PROP_END_OF_LIST(),
@@ -714,6 +803,8 @@ static PCIDeviceInfo virtio_info[] = {
         .exit       = virtio_net_exit_pci,
         .romfile    = "pxe-virtio.bin",
         .qdev.props = (Property[]) {
+            DEFINE_PROP_UINT32("flags", VirtIOPCIProxy, flags,
+                               VIRTIO_PCI_FLAG_USE_IOEVENTFD),
             DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors, 3),
             DEFINE_VIRTIO_NET_FEATURES(VirtIOPCIProxy, host_features),
             DEFINE_NIC_PROPERTIES(VirtIOPCIProxy, nic),
diff --git a/hw/virtio.c b/hw/virtio.c
index a2a657e..f588e29 100644
--- a/hw/virtio.c
+++ b/hw/virtio.c
@@ -582,6 +582,11 @@ void virtio_queue_notify(VirtIODevice *vdev, int n)
     }
 }
 
+void virtio_queue_notify_vq(VirtQueue *vq)
+{
+    virtio_queue_notify(vq->vdev, vq - vq->vdev->vq);
+}
+
 uint16_t virtio_queue_vector(VirtIODevice *vdev, int n)
 {
     return n < VIRTIO_PCI_QUEUE_MAX ? vdev->vq[n].vector :
diff --git a/hw/virtio.h b/hw/virtio.h
index 02fa312..5ae521c 100644
--- a/hw/virtio.h
+++ b/hw/virtio.h
@@ -219,5 +219,6 @@ void virtio_queue_set_last_avail_idx(VirtIODevice *vdev, int n, uint16_t idx);
 VirtQueue *virtio_get_queue(VirtIODevice *vdev, int n);
 EventNotifier *virtio_queue_get_guest_notifier(VirtQueue *vq);
 EventNotifier *virtio_queue_get_host_notifier(VirtQueue *vq);
+void virtio_queue_notify_vq(VirtQueue *vq);
 void virtio_irq(VirtQueue *vq);
 #endif
-- 
1.7.2.3


* [PATCH 3/3] virtio-pci: Don't use ioeventfd on old kernels
  2010-11-11 13:47 ` [Qemu-devel] " Stefan Hajnoczi
@ 2010-11-11 13:47   ` Stefan Hajnoczi
From: Stefan Hajnoczi @ 2010-11-11 13:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: kvm, Michael S. Tsirkin, Stefan Hajnoczi

There used to be a limit of 6 KVM io bus devices inside the kernel.  On
such a kernel, don't use ioeventfd for virtqueue host notification since
the limit is reached too easily.  This ensures that existing vhost-net
setups (which always use ioeventfd) have ioeventfds available so they
can continue to work.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
---
 hw/virtio-pci.c |    4 ++++
 kvm-all.c       |   46 ++++++++++++++++++++++++++++++++++++++++++++++
 kvm-stub.c      |    5 +++++
 kvm.h           |    1 +
 4 files changed, 56 insertions(+), 0 deletions(-)

diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
index 436fc59..365a26b 100644
--- a/hw/virtio-pci.c
+++ b/hw/virtio-pci.c
@@ -646,6 +646,10 @@ static void virtio_init_pci(VirtIOPCIProxy *proxy, VirtIODevice *vdev,
     pci_register_bar(&proxy->pci_dev, 0, size, PCI_BASE_ADDRESS_SPACE_IO,
                            virtio_map);
 
+    if (!kvm_has_many_ioeventfds()) {
+        proxy->flags &= ~VIRTIO_PCI_FLAG_USE_IOEVENTFD;
+    }
+
     virtio_bind_device(vdev, &virtio_pci_bindings, proxy);
     proxy->host_features |= 0x1 << VIRTIO_F_NOTIFY_ON_EMPTY;
     proxy->host_features |= 0x1 << VIRTIO_F_BAD_FEATURE;
diff --git a/kvm-all.c b/kvm-all.c
index 37b99c7..ba302bc 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -28,6 +28,11 @@
 #include "kvm.h"
 #include "bswap.h"
 
+/* This check must be after config-host.h is included */
+#ifdef CONFIG_EVENTFD
+#include <sys/eventfd.h>
+#endif
+
 /* KVM uses PAGE_SIZE in it's definition of COALESCED_MMIO_MAX */
 #define PAGE_SIZE TARGET_PAGE_SIZE
 
@@ -72,6 +77,7 @@ struct KVMState
     int irqchip_in_kernel;
     int pit_in_kernel;
     int xsave, xcrs;
+    int many_ioeventfds;
 };
 
 static KVMState *kvm_state;
@@ -441,6 +447,39 @@ int kvm_check_extension(KVMState *s, unsigned int extension)
     return ret;
 }
 
+static int kvm_check_many_ioeventfds(void)
+{
+    /* Older kernels have a 6 device limit on the KVM io bus.  Find out so we
+     * can avoid creating too many ioeventfds.
+     */
+#ifdef CONFIG_EVENTFD
+    int ioeventfds[7];
+    int i, ret = 0;
+    for (i = 0; i < ARRAY_SIZE(ioeventfds); i++) {
+        ioeventfds[i] = eventfd(0, EFD_CLOEXEC);
+        if (ioeventfds[i] < 0) {
+            break;
+        }
+        ret = kvm_set_ioeventfd_pio_word(ioeventfds[i], 0, i, true);
+        if (ret < 0) {
+            close(ioeventfds[i]);
+            break;
+        }
+    }
+
+    /* Decide whether many devices are supported or not */
+    ret = i == ARRAY_SIZE(ioeventfds);
+
+    while (i-- > 0) {
+        kvm_set_ioeventfd_pio_word(ioeventfds[i], 0, i, false);
+        close(ioeventfds[i]);
+    }
+    return ret;
+#else
+    return 0;
+#endif
+}
+
 static void kvm_set_phys_mem(target_phys_addr_t start_addr,
 			     ram_addr_t size,
 			     ram_addr_t phys_offset)
@@ -717,6 +756,8 @@ int kvm_init(int smp_cpus)
     kvm_state = s;
     cpu_register_phys_memory_client(&kvm_cpu_phys_memory_client);
 
+    s->many_ioeventfds = kvm_check_many_ioeventfds();
+
     return 0;
 
 err:
@@ -1046,6 +1087,11 @@ int kvm_has_xcrs(void)
     return kvm_state->xcrs;
 }
 
+int kvm_has_many_ioeventfds(void)
+{
+    return kvm_state->many_ioeventfds;
+}
+
 void kvm_setup_guest_memory(void *start, size_t size)
 {
     if (!kvm_has_sync_mmu()) {
diff --git a/kvm-stub.c b/kvm-stub.c
index 5384a4b..33d4476 100644
--- a/kvm-stub.c
+++ b/kvm-stub.c
@@ -99,6 +99,11 @@ int kvm_has_robust_singlestep(void)
     return 0;
 }
 
+int kvm_has_many_ioeventfds(void)
+{
+    return 0;
+}
+
 void kvm_setup_guest_memory(void *start, size_t size)
 {
 }
diff --git a/kvm.h b/kvm.h
index 60a9b42..ce08d42 100644
--- a/kvm.h
+++ b/kvm.h
@@ -42,6 +42,7 @@ int kvm_has_robust_singlestep(void);
 int kvm_has_debugregs(void);
 int kvm_has_xsave(void);
 int kvm_has_xcrs(void);
+int kvm_has_many_ioeventfds(void);
 
 #ifdef NEED_CPU_H
 int kvm_init_vcpu(CPUState *env);
-- 
1.7.2.3


* Re: [PATCH 2/3] virtio-pci: Use ioeventfd for virtqueue notify
  2010-11-11 13:47   ` [Qemu-devel] " Stefan Hajnoczi
@ 2010-11-11 15:53     ` Michael S. Tsirkin
From: Michael S. Tsirkin @ 2010-11-11 15:53 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: qemu-devel, kvm

On Thu, Nov 11, 2010 at 01:47:21PM +0000, Stefan Hajnoczi wrote:
> Virtqueue notify is currently handled synchronously in userspace virtio.  This
> prevents the vcpu from executing guest code while hardware emulation code
> handles the notify.
> 
> On systems that support KVM, the ioeventfd mechanism can be used to make
> virtqueue notify a lightweight exit by deferring hardware emulation to the
> iothread and allowing the VM to continue execution.  This model is similar to
> how vhost receives virtqueue notifies.
> 
> The result of this change is improved performance for userspace virtio devices.
> Virtio-blk throughput increases especially for multithreaded scenarios and
> virtio-net transmit throughput increases substantially.
> 
> Some virtio devices are known to have guest drivers which expect a notify to be
> processed synchronously and spin waiting for completion.  Only enable ioeventfd
> for virtio-blk and virtio-net for now.
> 
> Care must be taken not to interfere with vhost-net, which already uses
> ioeventfd host notifiers.  The following list shows the behavior implemented in
> this patch and is designed to take vhost-net into account:
> 
>  * VIRTIO_CONFIG_S_DRIVER_OK -> assign host notifiers, qemu_set_fd_handler(virtio_pci_host_notifier_read)

Shouldn't we also deassign when VIRTIO_CONFIG_S_DRIVER_OK is cleared,
either by an I/O write to the status register or by clearing the bus master bit?

>  * reset -> qemu_set_fd_handler(NULL), deassign host notifiers
>  * virtio_pci_set_host_notifier(true) -> qemu_set_fd_handler(NULL)
>  * virtio_pci_set_host_notifier(false) -> qemu_set_fd_handler(virtio_pci_host_notifier_read)
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
> ---
>  hw/virtio-pci.c |  155 +++++++++++++++++++++++++++++++++++++++++++-----------
>  hw/virtio.c     |    5 ++
>  hw/virtio.h     |    1 +
>  3 files changed, 129 insertions(+), 32 deletions(-)
> 
> diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
> index 549118d..436fc59 100644
> --- a/hw/virtio-pci.c
> +++ b/hw/virtio-pci.c
> @@ -83,6 +83,10 @@
>  /* Flags track per-device state like workarounds for quirks in older guests. */
>  #define VIRTIO_PCI_FLAG_BUS_MASTER_BUG  (1 << 0)
>  
> +/* Performance improves when virtqueue kick processing is decoupled from the
> + * vcpu thread using ioeventfd for some devices. */
> +#define VIRTIO_PCI_FLAG_USE_IOEVENTFD   (1 << 1)
> +
>  /* QEMU doesn't strictly need write barriers since everything runs in
>   * lock-step.  We'll leave the calls to wmb() in though to make it obvious for
>   * KVM or if kqemu gets SMP support.
> @@ -179,12 +183,108 @@ static int virtio_pci_load_queue(void * opaque, int n, QEMUFile *f)
>      return 0;
>  }
>  
> +static int virtio_pci_set_host_notifier_ioeventfd(VirtIOPCIProxy *proxy, int n, bool assign)
> +{
> +    VirtQueue *vq = virtio_get_queue(proxy->vdev, n);
> +    EventNotifier *notifier = virtio_queue_get_host_notifier(vq);
> +    int r;
> +    if (assign) {
> +        r = event_notifier_init(notifier, 1);
> +        if (r < 0) {
> +            return r;
> +        }
> +        r = kvm_set_ioeventfd_pio_word(event_notifier_get_fd(notifier),
> +                                       proxy->addr + VIRTIO_PCI_QUEUE_NOTIFY,
> +                                       n, assign);
> +        if (r < 0) {
> +            event_notifier_cleanup(notifier);
> +        }
> +    } else {
> +        r = kvm_set_ioeventfd_pio_word(event_notifier_get_fd(notifier),
> +                                       proxy->addr + VIRTIO_PCI_QUEUE_NOTIFY,
> +                                       n, assign);
> +        if (r < 0) {
> +            return r;
> +        }
> +        event_notifier_cleanup(notifier);
> +    }
> +    return r;
> +}
> +
> +static void virtio_pci_host_notifier_read(void *opaque)
> +{
> +    VirtQueue *vq = opaque;
> +    EventNotifier *n = virtio_queue_get_host_notifier(vq);
> +    if (event_notifier_test_and_clear(n)) {
> +        virtio_queue_notify_vq(vq);
> +    }
> +}
> +
> +static void virtio_pci_set_host_notifier_fd_handler(VirtIOPCIProxy *proxy, int n, bool assign)
> +{
> +    VirtQueue *vq = virtio_get_queue(proxy->vdev, n);
> +    EventNotifier *notifier = virtio_queue_get_host_notifier(vq);
> +    if (assign) {
> +        qemu_set_fd_handler(event_notifier_get_fd(notifier),
> +                            virtio_pci_host_notifier_read, NULL, vq);
> +    } else {
> +        qemu_set_fd_handler(event_notifier_get_fd(notifier),
> +                            NULL, NULL, NULL);
> +    }
> +}
> +
> +static int virtio_pci_set_host_notifiers(VirtIOPCIProxy *proxy, bool assign)
> +{
> +    int n, r;
> +
> +    for (n = 0; n < VIRTIO_PCI_QUEUE_MAX; n++) {
> +        if (!virtio_queue_get_num(proxy->vdev, n)) {
> +            continue;
> +        }
> +
> +        if (assign) {
> +            r = virtio_pci_set_host_notifier_ioeventfd(proxy, n, true);
> +            if (r < 0) {
> +                goto assign_error;
> +            }
> +
> +            virtio_pci_set_host_notifier_fd_handler(proxy, n, true);
> +        } else {
> +            virtio_pci_set_host_notifier_fd_handler(proxy, n, false);
> +            virtio_pci_set_host_notifier_ioeventfd(proxy, n, false);
> +        }
> +    }
> +    return 0;
> +
> +assign_error:
> +    proxy->flags &= ~VIRTIO_PCI_FLAG_USE_IOEVENTFD;
> +    while (--n >= 0) {
> +        virtio_pci_set_host_notifier_fd_handler(proxy, n, false);
> +        virtio_pci_set_host_notifier_ioeventfd(proxy, n, false);
> +    }
> +    return r;
> +}
> +
> +static void virtio_pci_reset_vdev(VirtIOPCIProxy *proxy)
> +{
> +    /* Poke virtio device so it deassigns its host notifiers (if any) */
> +    virtio_set_status(proxy->vdev, 0);

Hmm, virtio_reset already sets the status to 0.
I guess virtio_reset itself should just be fixed to call virtio_set_status?

> +
> +    /* Now safely deassign our own host notifiers */
> +    if (proxy->flags & VIRTIO_PCI_FLAG_USE_IOEVENTFD) {
> +        virtio_pci_set_host_notifiers(proxy, false);
> +    }
> +
> +    virtio_reset(proxy->vdev);
> +    msix_unuse_all_vectors(&proxy->pci_dev);
> +}
> +
>  static void virtio_pci_reset(DeviceState *d)
>  {
>      VirtIOPCIProxy *proxy = container_of(d, VirtIOPCIProxy, pci_dev.qdev);
> -    virtio_reset(proxy->vdev);
> +    virtio_pci_reset_vdev(proxy);
>      msix_reset(&proxy->pci_dev);
> -    proxy->flags = 0;
> +    proxy->flags &= ~VIRTIO_PCI_FLAG_BUS_MASTER_BUG;
>  }
>  
>  static void virtio_ioport_write(void *opaque, uint32_t addr, uint32_t val)
> @@ -209,11 +309,10 @@ static void virtio_ioport_write(void *opaque, uint32_t addr, uint32_t val)
>      case VIRTIO_PCI_QUEUE_PFN:
>          pa = (target_phys_addr_t)val << VIRTIO_PCI_QUEUE_ADDR_SHIFT;
>          if (pa == 0) {
> -            virtio_reset(proxy->vdev);
> -            msix_unuse_all_vectors(&proxy->pci_dev);
> -        }
> -        else
> +            virtio_pci_reset_vdev(proxy);
> +        } else {
>              virtio_queue_set_addr(vdev, vdev->queue_sel, pa);
> +        }
>          break;
>      case VIRTIO_PCI_QUEUE_SEL:
>          if (val < VIRTIO_PCI_QUEUE_MAX)
> @@ -223,10 +322,16 @@ static void virtio_ioport_write(void *opaque, uint32_t addr, uint32_t val)
>          virtio_queue_notify(vdev, val);
>          break;
>      case VIRTIO_PCI_STATUS:
> -        virtio_set_status(vdev, val & 0xFF);
> -        if (vdev->status == 0) {
> -            virtio_reset(proxy->vdev);
> -            msix_unuse_all_vectors(&proxy->pci_dev);
> +        if ((val & VIRTIO_CONFIG_S_DRIVER_OK) &&
> +            !(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) &&
> +            (proxy->flags & VIRTIO_PCI_FLAG_USE_IOEVENTFD)) {
> +            virtio_pci_set_host_notifiers(proxy, true);
> +        }

So we assign host notifiers here, but deassign them only on reset?
That seems strange. Shouldn't we disable the notifiers when the driver
clears the DRIVER_OK status bit?
And what about when bus mastering is disabled?

> +
> +        if (val & 0xFF) {
> +            virtio_set_status(vdev, val & 0xFF);
> +        } else {
> +            virtio_pci_reset_vdev(proxy);
>          }
>  
>          /* Linux before 2.6.34 sets the device as OK without enabling
> @@ -480,30 +585,12 @@ assign_error:
>  static int virtio_pci_set_host_notifier(void *opaque, int n, bool assign)
>  {
>      VirtIOPCIProxy *proxy = opaque;
> -    VirtQueue *vq = virtio_get_queue(proxy->vdev, n);
> -    EventNotifier *notifier = virtio_queue_get_host_notifier(vq);
> -    int r;
> -    if (assign) {
> -        r = event_notifier_init(notifier, 1);
> -        if (r < 0) {
> -            return r;
> -        }
> -        r = kvm_set_ioeventfd_pio_word(event_notifier_get_fd(notifier),
> -                                       proxy->addr + VIRTIO_PCI_QUEUE_NOTIFY,
> -                                       n, assign);
> -        if (r < 0) {
> -            event_notifier_cleanup(notifier);
> -        }
> +    if (proxy->flags & VIRTIO_PCI_FLAG_USE_IOEVENTFD) {
> +        virtio_pci_set_host_notifier_fd_handler(proxy, n, !assign);
> +        return 0;
>      } else {
> -        r = kvm_set_ioeventfd_pio_word(event_notifier_get_fd(notifier),
> -                                       proxy->addr + VIRTIO_PCI_QUEUE_NOTIFY,
> -                                       n, assign);
> -        if (r < 0) {
> -            return r;
> -        }
> -        event_notifier_cleanup(notifier);
> +        return virtio_pci_set_host_notifier_ioeventfd(proxy, n, assign);
>      }
> -    return r;
>  }
>  
>  static const VirtIOBindings virtio_pci_bindings = {
> @@ -702,6 +789,8 @@ static PCIDeviceInfo virtio_info[] = {
>          .qdev.props = (Property[]) {
>              DEFINE_PROP_HEX32("class", VirtIOPCIProxy, class_code, 0),
>              DEFINE_BLOCK_PROPERTIES(VirtIOPCIProxy, block),
> +            DEFINE_PROP_UINT32("flags", VirtIOPCIProxy, flags,
> +                               VIRTIO_PCI_FLAG_USE_IOEVENTFD),
>              DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors, 2),
>              DEFINE_VIRTIO_BLK_FEATURES(VirtIOPCIProxy, host_features),
>              DEFINE_PROP_END_OF_LIST(),
> @@ -714,6 +803,8 @@ static PCIDeviceInfo virtio_info[] = {
>          .exit       = virtio_net_exit_pci,
>          .romfile    = "pxe-virtio.bin",
>          .qdev.props = (Property[]) {
> +            DEFINE_PROP_UINT32("flags", VirtIOPCIProxy, flags,
> +                               VIRTIO_PCI_FLAG_USE_IOEVENTFD),
>              DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors, 3),
>              DEFINE_VIRTIO_NET_FEATURES(VirtIOPCIProxy, host_features),
>              DEFINE_NIC_PROPERTIES(VirtIOPCIProxy, nic),

This ties the user-visible interface to an internal macro value.  Further,
the user gets to tweak other bits in this integer, which we don't want.
Finally, the interface is extremely unfriendly.
Please use a bit property instead: DEFINE_PROP_BIT.
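
Something along these lines would do it (the property name and bit constant
are illustrative, not part of this series):

    /* Expose a single on/off "ioeventfd" property instead of a raw flags word. */
    #define VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT  1

        DEFINE_PROP_BIT("ioeventfd", VirtIOPCIProxy, flags,
                        VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT, true),

Users would then toggle it with ioeventfd=on|off on the -device option rather
than poking raw bits in an integer property.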

> diff --git a/hw/virtio.c b/hw/virtio.c
> index a2a657e..f588e29 100644
> --- a/hw/virtio.c
> +++ b/hw/virtio.c
> @@ -582,6 +582,11 @@ void virtio_queue_notify(VirtIODevice *vdev, int n)
>      }
>  }
>  
> +void virtio_queue_notify_vq(VirtQueue *vq)
> +{
> +    virtio_queue_notify(vq->vdev, vq - vq->vdev->vq);

Let's implement virtio_queue_notify in terms of virtio_queue_notify_vq.
Not the other way around.
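
Roughly, i.e. something like this (a sketch using the VirtQueue fields as
defined in hw/virtio.c, with the same guard as the existing
virtio_queue_notify):

    void virtio_queue_notify_vq(VirtQueue *vq)
    {
        /* Ignore notifies for queues that have not been set up. */
        if (vq->vring.desc) {
            vq->handle_output(vq->vdev, vq);
        }
    }

    void virtio_queue_notify(VirtIODevice *vdev, int n)
    {
        if (n < VIRTIO_PCI_QUEUE_MAX) {
            virtio_queue_notify_vq(&vdev->vq[n]);
        }
    }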

> +}
> +
>  uint16_t virtio_queue_vector(VirtIODevice *vdev, int n)
>  {
>      return n < VIRTIO_PCI_QUEUE_MAX ? vdev->vq[n].vector :
> diff --git a/hw/virtio.h b/hw/virtio.h
> index 02fa312..5ae521c 100644
> --- a/hw/virtio.h
> +++ b/hw/virtio.h
> @@ -219,5 +219,6 @@ void virtio_queue_set_last_avail_idx(VirtIODevice *vdev, int n, uint16_t idx);
>  VirtQueue *virtio_get_queue(VirtIODevice *vdev, int n);
>  EventNotifier *virtio_queue_get_guest_notifier(VirtQueue *vq);
>  EventNotifier *virtio_queue_get_host_notifier(VirtQueue *vq);
> +void virtio_queue_notify_vq(VirtQueue *vq);
>  void virtio_irq(VirtQueue *vq);
>  #endif
> -- 
> 1.7.2.3

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [Qemu-devel] Re: [PATCH 2/3] virtio-pci: Use ioeventfd for virtqueue notify
@ 2010-11-11 15:53     ` Michael S. Tsirkin
  0 siblings, 0 replies; 46+ messages in thread
From: Michael S. Tsirkin @ 2010-11-11 15:53 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: qemu-devel, kvm

On Thu, Nov 11, 2010 at 01:47:21PM +0000, Stefan Hajnoczi wrote:
> Virtqueue notify is currently handled synchronously in userspace virtio.  This
> prevents the vcpu from executing guest code while hardware emulation code
> handles the notify.
> 
> On systems that support KVM, the ioeventfd mechanism can be used to make
> virtqueue notify a lightweight exit by deferring hardware emulation to the
> iothread and allowing the VM to continue execution.  This model is similar to
> how vhost receives virtqueue notifies.
> 
> The result of this change is improved performance for userspace virtio devices.
> Virtio-blk throughput increases especially for multithreaded scenarios and
> virtio-net transmit throughput increases substantially.
> 
> Some virtio devices are known to have guest drivers which expect a notify to be
> processed synchronously and spin waiting for completion.  Only enable ioeventfd
> for virtio-blk and virtio-net for now.
> 
> Care must be taken not to interfere with vhost-net, which already uses
> ioeventfd host notifiers.  The following list shows the behavior implemented in
> this patch and is designed to take vhost-net into account:
> 
>  * VIRTIO_CONFIG_S_DRIVER_OK -> assign host notifiers, qemu_set_fd_handler(virtio_pci_host_notifier_read)

we should also deassign when VIRTIO_CONFIG_S_DRIVER_OK is cleared
by io write or bus master bit?

>  * reset -> qemu_set_fd_handler(NULL), deassign host notifiers
>  * virtio_pci_set_host_notifier(true) -> qemu_set_fd_handler(NULL)
>  * virtio_pci_set_host_notifier(false) -> qemu_set_fd_handler(virtio_pci_host_notifier_read)
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
> ---
>  hw/virtio-pci.c |  155 +++++++++++++++++++++++++++++++++++++++++++-----------
>  hw/virtio.c     |    5 ++
>  hw/virtio.h     |    1 +
>  3 files changed, 129 insertions(+), 32 deletions(-)
> 
> diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
> index 549118d..436fc59 100644
> --- a/hw/virtio-pci.c
> +++ b/hw/virtio-pci.c
> @@ -83,6 +83,10 @@
>  /* Flags track per-device state like workarounds for quirks in older guests. */
>  #define VIRTIO_PCI_FLAG_BUS_MASTER_BUG  (1 << 0)
>  
> +/* Performance improves when virtqueue kick processing is decoupled from the
> + * vcpu thread using ioeventfd for some devices. */
> +#define VIRTIO_PCI_FLAG_USE_IOEVENTFD   (1 << 1)
> +
>  /* QEMU doesn't strictly need write barriers since everything runs in
>   * lock-step.  We'll leave the calls to wmb() in though to make it obvious for
>   * KVM or if kqemu gets SMP support.
> @@ -179,12 +183,108 @@ static int virtio_pci_load_queue(void * opaque, int n, QEMUFile *f)
>      return 0;
>  }
>  
> +static int virtio_pci_set_host_notifier_ioeventfd(VirtIOPCIProxy *proxy, int n, bool assign)
> +{
> +    VirtQueue *vq = virtio_get_queue(proxy->vdev, n);
> +    EventNotifier *notifier = virtio_queue_get_host_notifier(vq);
> +    int r;
> +    if (assign) {
> +        r = event_notifier_init(notifier, 1);
> +        if (r < 0) {
> +            return r;
> +        }
> +        r = kvm_set_ioeventfd_pio_word(event_notifier_get_fd(notifier),
> +                                       proxy->addr + VIRTIO_PCI_QUEUE_NOTIFY,
> +                                       n, assign);
> +        if (r < 0) {
> +            event_notifier_cleanup(notifier);
> +        }
> +    } else {
> +        r = kvm_set_ioeventfd_pio_word(event_notifier_get_fd(notifier),
> +                                       proxy->addr + VIRTIO_PCI_QUEUE_NOTIFY,
> +                                       n, assign);
> +        if (r < 0) {
> +            return r;
> +        }
> +        event_notifier_cleanup(notifier);
> +    }
> +    return r;
> +}
> +
> +static void virtio_pci_host_notifier_read(void *opaque)
> +{
> +    VirtQueue *vq = opaque;
> +    EventNotifier *n = virtio_queue_get_host_notifier(vq);
> +    if (event_notifier_test_and_clear(n)) {
> +        virtio_queue_notify_vq(vq);
> +    }
> +}
> +
> +static void virtio_pci_set_host_notifier_fd_handler(VirtIOPCIProxy *proxy, int n, bool assign)
> +{
> +    VirtQueue *vq = virtio_get_queue(proxy->vdev, n);
> +    EventNotifier *notifier = virtio_queue_get_host_notifier(vq);
> +    if (assign) {
> +        qemu_set_fd_handler(event_notifier_get_fd(notifier),
> +                            virtio_pci_host_notifier_read, NULL, vq);
> +    } else {
> +        qemu_set_fd_handler(event_notifier_get_fd(notifier),
> +                            NULL, NULL, NULL);
> +    }
> +}
> +
> +static int virtio_pci_set_host_notifiers(VirtIOPCIProxy *proxy, bool assign)
> +{
> +    int n, r;
> +
> +    for (n = 0; n < VIRTIO_PCI_QUEUE_MAX; n++) {
> +        if (!virtio_queue_get_num(proxy->vdev, n)) {
> +            continue;
> +        }
> +
> +        if (assign) {
> +            r = virtio_pci_set_host_notifier_ioeventfd(proxy, n, true);
> +            if (r < 0) {
> +                goto assign_error;
> +            }
> +
> +            virtio_pci_set_host_notifier_fd_handler(proxy, n, true);
> +        } else {
> +            virtio_pci_set_host_notifier_fd_handler(proxy, n, false);
> +            virtio_pci_set_host_notifier_ioeventfd(proxy, n, false);
> +        }
> +    }
> +    return 0;
> +
> +assign_error:
> +    proxy->flags &= ~VIRTIO_PCI_FLAG_USE_IOEVENTFD;
> +    while (--n >= 0) {
> +        virtio_pci_set_host_notifier_fd_handler(proxy, n, false);
> +        virtio_pci_set_host_notifier_ioeventfd(proxy, n, false);
> +    }
> +    return r;
> +}
> +
> +static void virtio_pci_reset_vdev(VirtIOPCIProxy *proxy)
> +{
> +    /* Poke virtio device so it deassigns its host notifiers (if any) */
> +    virtio_set_status(proxy->vdev, 0);

Hmm. virtio_reset already sets status to 0.
I guess it should just be fixed to call virtio_set_status?
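
For illustration, a minimal sketch of the fix being suggested here might look
like the following (hypothetical code, assuming the existing virtio.c
structure; this is not the code that was posted or merged):

    void virtio_reset(void *opaque)
    {
        VirtIODevice *vdev = opaque;

        /* Go through virtio_set_status() instead of clearing vdev->status
         * directly, so the device's set_status callback sees the 0 write. */
        virtio_set_status(vdev, 0);

        if (vdev->reset) {
            vdev->reset(vdev);
        }

        /* ... the rest of the existing reset (per-queue state, vectors, ...)
         * would continue unchanged ... */
    }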

> +
> +    /* Now safely deassign our own host notifiers */
> +    if (proxy->flags & VIRTIO_PCI_FLAG_USE_IOEVENTFD) {
> +        virtio_pci_set_host_notifiers(proxy, false);
> +    }
> +
> +    virtio_reset(proxy->vdev);
> +    msix_unuse_all_vectors(&proxy->pci_dev);
> +}
> +
>  static void virtio_pci_reset(DeviceState *d)
>  {
>      VirtIOPCIProxy *proxy = container_of(d, VirtIOPCIProxy, pci_dev.qdev);
> -    virtio_reset(proxy->vdev);
> +    virtio_pci_reset_vdev(proxy);
>      msix_reset(&proxy->pci_dev);
> -    proxy->flags = 0;
> +    proxy->flags &= ~VIRTIO_PCI_FLAG_BUS_MASTER_BUG;
>  }
>  
>  static void virtio_ioport_write(void *opaque, uint32_t addr, uint32_t val)
> @@ -209,11 +309,10 @@ static void virtio_ioport_write(void *opaque, uint32_t addr, uint32_t val)
>      case VIRTIO_PCI_QUEUE_PFN:
>          pa = (target_phys_addr_t)val << VIRTIO_PCI_QUEUE_ADDR_SHIFT;
>          if (pa == 0) {
> -            virtio_reset(proxy->vdev);
> -            msix_unuse_all_vectors(&proxy->pci_dev);
> -        }
> -        else
> +            virtio_pci_reset_vdev(proxy);
> +        } else {
>              virtio_queue_set_addr(vdev, vdev->queue_sel, pa);
> +        }
>          break;
>      case VIRTIO_PCI_QUEUE_SEL:
>          if (val < VIRTIO_PCI_QUEUE_MAX)
> @@ -223,10 +322,16 @@ static void virtio_ioport_write(void *opaque, uint32_t addr, uint32_t val)
>          virtio_queue_notify(vdev, val);
>          break;
>      case VIRTIO_PCI_STATUS:
> -        virtio_set_status(vdev, val & 0xFF);
> -        if (vdev->status == 0) {
> -            virtio_reset(proxy->vdev);
> -            msix_unuse_all_vectors(&proxy->pci_dev);
> +        if ((val & VIRTIO_CONFIG_S_DRIVER_OK) &&
> +            !(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) &&
> +            (proxy->flags & VIRTIO_PCI_FLAG_USE_IOEVENTFD)) {
> +            virtio_pci_set_host_notifiers(proxy, true);
> +        }

So we set host notifiers to true from here, but to false
only on reset? This seems strange. Should not we disable
notifiers when driver clears OK status?
How about on bus master disable?
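
For illustration, one way to make the notifier handling symmetric along these
lines could be a helper keyed off the DRIVER_OK transition (a hypothetical
sketch, not the posted patch; virtio_pci_set_host_notifiers() and the flag are
as defined above):

    static void virtio_pci_update_host_notifiers(VirtIOPCIProxy *proxy,
                                                 uint8_t old_status,
                                                 uint8_t new_status)
    {
        bool was_ok = old_status & VIRTIO_CONFIG_S_DRIVER_OK;
        bool now_ok = new_status & VIRTIO_CONFIG_S_DRIVER_OK;

        if (!(proxy->flags & VIRTIO_PCI_FLAG_USE_IOEVENTFD) ||
            was_ok == now_ok) {
            return;
        }
        /* Assign on 0 -> DRIVER_OK, deassign on DRIVER_OK -> 0; a bus master
         * disable path could call this with new_status = 0 as well. */
        virtio_pci_set_host_notifiers(proxy, now_ok);
    }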

> +
> +        if (val & 0xFF) {
> +            virtio_set_status(vdev, val & 0xFF);
> +        } else {
> +            virtio_pci_reset_vdev(proxy);
>          }
>  
>          /* Linux before 2.6.34 sets the device as OK without enabling
> @@ -480,30 +585,12 @@ assign_error:
>  static int virtio_pci_set_host_notifier(void *opaque, int n, bool assign)
>  {
>      VirtIOPCIProxy *proxy = opaque;
> -    VirtQueue *vq = virtio_get_queue(proxy->vdev, n);
> -    EventNotifier *notifier = virtio_queue_get_host_notifier(vq);
> -    int r;
> -    if (assign) {
> -        r = event_notifier_init(notifier, 1);
> -        if (r < 0) {
> -            return r;
> -        }
> -        r = kvm_set_ioeventfd_pio_word(event_notifier_get_fd(notifier),
> -                                       proxy->addr + VIRTIO_PCI_QUEUE_NOTIFY,
> -                                       n, assign);
> -        if (r < 0) {
> -            event_notifier_cleanup(notifier);
> -        }
> +    if (proxy->flags & VIRTIO_PCI_FLAG_USE_IOEVENTFD) {
> +        virtio_pci_set_host_notifier_fd_handler(proxy, n, !assign);
> +        return 0;
>      } else {
> -        r = kvm_set_ioeventfd_pio_word(event_notifier_get_fd(notifier),
> -                                       proxy->addr + VIRTIO_PCI_QUEUE_NOTIFY,
> -                                       n, assign);
> -        if (r < 0) {
> -            return r;
> -        }
> -        event_notifier_cleanup(notifier);
> +        return virtio_pci_set_host_notifier_ioeventfd(proxy, n, assign);
>      }
> -    return r;
>  }
>  
>  static const VirtIOBindings virtio_pci_bindings = {
> @@ -702,6 +789,8 @@ static PCIDeviceInfo virtio_info[] = {
>          .qdev.props = (Property[]) {
>              DEFINE_PROP_HEX32("class", VirtIOPCIProxy, class_code, 0),
>              DEFINE_BLOCK_PROPERTIES(VirtIOPCIProxy, block),
> +            DEFINE_PROP_UINT32("flags", VirtIOPCIProxy, flags,
> +                               VIRTIO_PCI_FLAG_USE_IOEVENTFD),
>              DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors, 2),
>              DEFINE_VIRTIO_BLK_FEATURES(VirtIOPCIProxy, host_features),
>              DEFINE_PROP_END_OF_LIST(),
> @@ -714,6 +803,8 @@ static PCIDeviceInfo virtio_info[] = {
>          .exit       = virtio_net_exit_pci,
>          .romfile    = "pxe-virtio.bin",
>          .qdev.props = (Property[]) {
> +            DEFINE_PROP_UINT32("flags", VirtIOPCIProxy, flags,
> +                               VIRTIO_PCI_FLAG_USE_IOEVENTFD),
>              DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors, 3),
>              DEFINE_VIRTIO_NET_FEATURES(VirtIOPCIProxy, host_features),
>              DEFINE_NIC_PROPERTIES(VirtIOPCIProxy, nic),

This ties interface to an internal macro value.  Further, user gets to
tweak other fields in this integer which we don't want.  Finally, the
interface is extremely unfriendly.
Please use a bit property instead: DEFINE_PROP_BIT.
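
For reference, the property definition being asked for might look roughly like
the sketch below (the property name "ioeventfd" and the _BIT macro are
assumptions for illustration, not part of the posted patch):

    /* Bit number plus derived mask, so DEFINE_PROP_BIT can expose only this bit. */
    #define VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT  1
    #define VIRTIO_PCI_FLAG_USE_IOEVENTFD      (1 << VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT)

    /* In the qdev property list, enabled by default: */
    DEFINE_PROP_BIT("ioeventfd", VirtIOPCIProxy, flags,
                    VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT, true),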

> diff --git a/hw/virtio.c b/hw/virtio.c
> index a2a657e..f588e29 100644
> --- a/hw/virtio.c
> +++ b/hw/virtio.c
> @@ -582,6 +582,11 @@ void virtio_queue_notify(VirtIODevice *vdev, int n)
>      }
>  }
>  
> +void virtio_queue_notify_vq(VirtQueue *vq)
> +{
> +    virtio_queue_notify(vq->vdev, vq - vq->vdev->vq);

Let's implement virtio_queue_notify in terms of virtio_queue_notify_vq.
Not the other way around.
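
A sketch of the requested inversion, with the checks living in the vq-based
helper (hypothetical; field names follow the existing VirtQueue/VirtIODevice
definitions, not the merged code):

    void virtio_queue_notify_vq(VirtQueue *vq)
    {
        if (vq->vring.desc && vq->handle_output) {
            vq->handle_output(vq->vdev, vq);
        }
    }

    void virtio_queue_notify(VirtIODevice *vdev, int n)
    {
        if (n < VIRTIO_PCI_QUEUE_MAX) {
            virtio_queue_notify_vq(&vdev->vq[n]);
        }
    }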

> +}
> +
>  uint16_t virtio_queue_vector(VirtIODevice *vdev, int n)
>  {
>      return n < VIRTIO_PCI_QUEUE_MAX ? vdev->vq[n].vector :
> diff --git a/hw/virtio.h b/hw/virtio.h
> index 02fa312..5ae521c 100644
> --- a/hw/virtio.h
> +++ b/hw/virtio.h
> @@ -219,5 +219,6 @@ void virtio_queue_set_last_avail_idx(VirtIODevice *vdev, int n, uint16_t idx);
>  VirtQueue *virtio_get_queue(VirtIODevice *vdev, int n);
>  EventNotifier *virtio_queue_get_guest_notifier(VirtQueue *vq);
>  EventNotifier *virtio_queue_get_host_notifier(VirtQueue *vq);
> +void virtio_queue_notify_vq(VirtQueue *vq);
>  void virtio_irq(VirtQueue *vq);
>  #endif
> -- 
> 1.7.2.3

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/3] virtio-pci: Use ioeventfd for virtqueue notify
  2010-11-11 13:47   ` [Qemu-devel] " Stefan Hajnoczi
@ 2010-11-11 16:45     ` Christoph Hellwig
  -1 siblings, 0 replies; 46+ messages in thread
From: Christoph Hellwig @ 2010-11-11 16:45 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: qemu-devel, kvm, Michael S. Tsirkin

On Thu, Nov 11, 2010 at 01:47:21PM +0000, Stefan Hajnoczi wrote:
> Some virtio devices are known to have guest drivers which expect a notify to be
> processed synchronously and spin waiting for completion.  Only enable ioeventfd
> for virtio-blk and virtio-net for now.

Who guarantees that less common virtio-blk and virtio-net guest drivers
for non-Linux OSes are fine with it?  Maybe you should add a feature flag
that the guest has to ACK to enable it.


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/3] virtio-pci: Use ioeventfd for virtqueue notify
  2010-11-11 13:47   ` [Qemu-devel] " Stefan Hajnoczi
@ 2010-11-11 16:59     ` Avi Kivity
  -1 siblings, 0 replies; 46+ messages in thread
From: Avi Kivity @ 2010-11-11 16:59 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: qemu-devel, kvm, Michael S. Tsirkin

On 11/11/2010 03:47 PM, Stefan Hajnoczi wrote:
> Some virtio devices are known to have guest drivers which expect a notify to be
> processed synchronously and spin waiting for completion.  Only enable ioeventfd
> for virtio-blk and virtio-net for now.

Which drivers are these?

I only know of the virtio-gl driver, which isn't upstream.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/3] virtio-pci: Use ioeventfd for virtqueue notify
  2010-11-11 16:59     ` [Qemu-devel] " Avi Kivity
@ 2010-11-11 17:12       ` Michael S. Tsirkin
  -1 siblings, 0 replies; 46+ messages in thread
From: Michael S. Tsirkin @ 2010-11-11 17:12 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Stefan Hajnoczi, qemu-devel, kvm, gleb

On Thu, Nov 11, 2010 at 06:59:29PM +0200, Avi Kivity wrote:
> On 11/11/2010 03:47 PM, Stefan Hajnoczi wrote:
> >Some virtio devices are known to have guest drivers which expect a notify to be
> >processed synchronously and spin waiting for completion.  Only enable ioeventfd
> >for virtio-blk and virtio-net for now.
> 
> Which drivers are these?
> 
> I only know of the virtio-gl driver, which isn't upstream.

I think that PXE virtio drivers do polling. But I think this change does
not break these drivers. It'll probably make them a bit slower.  Right,
Gleb?

-- 
MST

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v2 0/3] virtio: Use ioeventfd for virtqueue notify
  2010-11-11 13:47 ` [Qemu-devel] " Stefan Hajnoczi
@ 2010-11-11 17:46   ` Michael S. Tsirkin
  -1 siblings, 0 replies; 46+ messages in thread
From: Michael S. Tsirkin @ 2010-11-11 17:46 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: qemu-devel, kvm

On Thu, Nov 11, 2010 at 01:47:19PM +0000, Stefan Hajnoczi wrote:
> This is a rewrite of the virtio-ioeventfd patchset to work at the virtio-pci.c
> level instead of virtio.c.  This results in better integration with the
> host/guest notifier code and makes the code simpler (no more state machine).

Yep, much cleaner. Sent some comments separately.

-- 
MST

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/3] virtio-pci: Use ioeventfd for virtqueue notify
  2010-11-11 17:12       ` [Qemu-devel] " Michael S. Tsirkin
@ 2010-11-11 17:55         ` Gleb Natapov
  -1 siblings, 0 replies; 46+ messages in thread
From: Gleb Natapov @ 2010-11-11 17:55 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Avi Kivity, Stefan Hajnoczi, qemu-devel, kvm

On Thu, Nov 11, 2010 at 07:12:40PM +0200, Michael S. Tsirkin wrote:
> On Thu, Nov 11, 2010 at 06:59:29PM +0200, Avi Kivity wrote:
> > On 11/11/2010 03:47 PM, Stefan Hajnoczi wrote:
> > >Some virtio devices are known to have guest drivers which expect a notify to be
> > >processed synchronously and spin waiting for completion.  Only enable ioeventfd
> > >for virtio-blk and virtio-net for now.
> > 
> > Which drivers are these?
> > 
> > I only know of the virtio-gl driver, which isn't upstream.
> 
> I think that PXE virtio drivers do polling. But I think this change does
> not break these drivers. It'll probably make them a bit slower.  Right,
> Gleb?
> 
Can't tell about PXE virtio drivers, but virtio-blk in seabios does
polling.

--
			Gleb.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/3] virtio-pci: Use ioeventfd for virtqueue notify
  2010-11-11 15:53     ` [Qemu-devel] " Michael S. Tsirkin
@ 2010-11-12  9:18       ` Stefan Hajnoczi
  -1 siblings, 0 replies; 46+ messages in thread
From: Stefan Hajnoczi @ 2010-11-12  9:18 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Stefan Hajnoczi, qemu-devel, kvm

On Thu, Nov 11, 2010 at 3:53 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> On Thu, Nov 11, 2010 at 01:47:21PM +0000, Stefan Hajnoczi wrote:
>> Care must be taken not to interfere with vhost-net, which already uses
>> ioeventfd host notifiers.  The following list shows the behavior implemented in
>> this patch and is designed to take vhost-net into account:
>>
>>  * VIRTIO_CONFIG_S_DRIVER_OK -> assign host notifiers, qemu_set_fd_handler(virtio_pci_host_notifier_read)
>
> we should also deassign when VIRTIO_CONFIG_S_DRIVER_OK is cleared
> by io write or bus master bit?

You're right, I'll fix the lifecycle to trigger symmetrically on
status bit changes rather than VIRTIO_CONFIG_S_DRIVER_OK/reset.

>> +static void virtio_pci_reset_vdev(VirtIOPCIProxy *proxy)
>> +{
>> +    /* Poke virtio device so it deassigns its host notifiers (if any) */
>> +    virtio_set_status(proxy->vdev, 0);
>
> Hmm. virtio_reset already sets status to 0.
> I guess it should just be fixed to call virtio_set_status?

This part is ugly.  The problem is that virtio_reset() calls
virtio_set_status(vdev, 0) but doesn't give the transport binding a
chance to clean up after the virtio device has cleaned up.  Since
virtio-net will spot status=0 and deassign its host notifier, we need
to perform our own clean up after vhost.

What makes this slightly less of a hack is the fact that virtio-pci.c
was already causing virtio_set_status(vdev, 0) to be invoked twice
during reset.  When 0 is written to the VIRTIO_PCI_STATUS register, we
do virtio_set_status(proxy->vdev, val & 0xFF) and then
virtio_reset(proxy->vdev).  So the status byte callback already gets
invoked twice today.

I've just split this out into virtio_pci_reset_vdev() and (ab)used it
to correctly clean up virtqueue ioeventfd.

The alternative is to add another callback from virtio.c so we are
notified after the vdev's reset callback has finished.

>> @@ -223,10 +322,16 @@ static void virtio_ioport_write(void *opaque, uint32_t addr, uint32_t val)
>>          virtio_queue_notify(vdev, val);
>>          break;
>>      case VIRTIO_PCI_STATUS:
>> -        virtio_set_status(vdev, val & 0xFF);
>> -        if (vdev->status == 0) {
>> -            virtio_reset(proxy->vdev);
>> -            msix_unuse_all_vectors(&proxy->pci_dev);
>> +        if ((val & VIRTIO_CONFIG_S_DRIVER_OK) &&
>> +            !(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) &&
>> +            (proxy->flags & VIRTIO_PCI_FLAG_USE_IOEVENTFD)) {
>> +            virtio_pci_set_host_notifiers(proxy, true);
>> +        }
>
> So we set host notifiers to true from here, but to false
> only on reset? This seems strange. Should not we disable
> notifiers when driver clears OK status?
> How about on bus master disable?

You're right, this needs to be fixed.

>> @@ -714,6 +803,8 @@ static PCIDeviceInfo virtio_info[] = {
>>          .exit       = virtio_net_exit_pci,
>>          .romfile    = "pxe-virtio.bin",
>>          .qdev.props = (Property[]) {
>> +            DEFINE_PROP_UINT32("flags", VirtIOPCIProxy, flags,
>> +                               VIRTIO_PCI_FLAG_USE_IOEVENTFD),
>>              DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors, 3),
>>              DEFINE_VIRTIO_NET_FEATURES(VirtIOPCIProxy, host_features),
>>              DEFINE_NIC_PROPERTIES(VirtIOPCIProxy, nic),
>
> This ties interface to an internal macro value.  Further, user gets to
> tweak other fields in this integer which we don't want.  Finally, the
> interface is extremely unfriendly.
> Please use a bit property instead: DEFINE_PROP_BIT.

Will fix in v3.

>> diff --git a/hw/virtio.c b/hw/virtio.c
>> index a2a657e..f588e29 100644
>> --- a/hw/virtio.c
>> +++ b/hw/virtio.c
>> @@ -582,6 +582,11 @@ void virtio_queue_notify(VirtIODevice *vdev, int n)
>>      }
>>  }
>>
>> +void virtio_queue_notify_vq(VirtQueue *vq)
>> +{
>> +    virtio_queue_notify(vq->vdev, vq - vq->vdev->vq);
>
> Let's implement virtio_queue_notify in terms of virtio_queue_notify_vq.
> Not the other way around.

Will fix in v3.

Stefan

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] Re: [PATCH 2/3] virtio-pci: Use ioeventfd for virtqueue notify
  2010-11-11 16:45     ` [Qemu-devel] " Christoph Hellwig
@ 2010-11-12  9:20       ` Stefan Hajnoczi
  -1 siblings, 0 replies; 46+ messages in thread
From: Stefan Hajnoczi @ 2010-11-12  9:20 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Stefan Hajnoczi, qemu-devel, kvm, Michael S. Tsirkin

On Thu, Nov 11, 2010 at 4:45 PM, Christoph Hellwig <hch@infradead.org> wrote:
> On Thu, Nov 11, 2010 at 01:47:21PM +0000, Stefan Hajnoczi wrote:
>> Some virtio devices are known to have guest drivers which expect a notify to be
>> processed synchronously and spin waiting for completion.  Only enable ioeventfd
>> for virtio-blk and virtio-net for now.
>
> Who guarantees that less common virtio-blk and virtio-net guest drivers
> for non-Linux OSes are fine with it?  Maybe you should add a feature flag
> that the guest has to ACK to enable it.

Virtio-blk and virtio-net are fine.  Both of those devices are
expected to operate asynchronously.  SeaBIOS and gPXE virtio-net
drivers spin but they expect to and it is okay in those environments.
They already burn CPU today.

Virtio-console expects synchronous virtqueue kick.  In Linux,
virtio_console.c __send_control_msg() and send_buf() will spin.  Qemu
userspace is able to complete those requests synchronously so that the
guest never actually burns CPU (e.g.
hw/virtio-serial-bus.c:send_control_msg()).  I don't want to burn CPU
in places where we previously didn't.

It's good that QEMU can decide whether or not to handle virtqueue kick
in the vcpu thread.  For high performance asynchronous devices like
virtio-net and virtio-blk it makes sense to use ioeventfd.  For others
it may not be useful.  I'm not sure a feature bit that exposes this
detail to the guest would be useful.

Stefan

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/3] virtio-pci: Use ioeventfd for virtqueue notify
  2010-11-12  9:18       ` [Qemu-devel] " Stefan Hajnoczi
@ 2010-11-12  9:25         ` Michael S. Tsirkin
  -1 siblings, 0 replies; 46+ messages in thread
From: Michael S. Tsirkin @ 2010-11-12  9:25 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Stefan Hajnoczi, qemu-devel, kvm

On Fri, Nov 12, 2010 at 09:18:48AM +0000, Stefan Hajnoczi wrote:
> On Thu, Nov 11, 2010 at 3:53 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> > On Thu, Nov 11, 2010 at 01:47:21PM +0000, Stefan Hajnoczi wrote:
> >> Care must be taken not to interfere with vhost-net, which already uses
> >> ioeventfd host notifiers.  The following list shows the behavior implemented in
> >> this patch and is designed to take vhost-net into account:
> >>
> >>  * VIRTIO_CONFIG_S_DRIVER_OK -> assign host notifiers, qemu_set_fd_handler(virtio_pci_host_notifier_read)
> >
> > we should also deassign when VIRTIO_CONFIG_S_DRIVER_OK is cleared
> > by io write or bus master bit?
> 
> You're right, I'll fix the lifecycle to trigger symmetrically on
> status bit changes rather than VIRTIO_CONFIG_S_DRIVER_OK/reset.
> 
> >> +static void virtio_pci_reset_vdev(VirtIOPCIProxy *proxy)
> >> +{
> >> +    /* Poke virtio device so it deassigns its host notifiers (if any) */
> >> +    virtio_set_status(proxy->vdev, 0);
> >
> > Hmm. virtio_reset already sets status to 0.
> > I guess it should just be fixed to call virtio_set_status?
> 
> This part is ugly.  The problem is that virtio_reset() calls
> virtio_set_status(vdev, 0) but doesn't give the transport binding a
> chance to clean up after the virtio device has cleaned up.  Since
> virtio-net will spot status=0 and deassign its host notifier, we need
> to perform our own clean up after vhost.
> 
> What makes this slightly less of a hack is the fact that virtio-pci.c
> was already causing virtio_set_status(vdev, 0) to be invoked twice
> during reset.  When 0 is written to the VIRTIO_PCI_STATUS register, we
> do virtio_set_status(proxy->vdev, val & 0xFF) and then
> virtio_reset(proxy->vdev).  So the status byte callback already gets
> invoked twice today.
> 
> I've just split this out into virtio_pci_reset_vdev() and (ab)used it
> to correctly clean up virtqueue ioeventfd.
> 
> The alternative is to add another callback from virtio.c so we are
> notified after the vdev's reset callback has finished.

Oh, likely not worth it. Maybe put the above explanation in the comment.
Will this go away now that you move to set notifiers on status write?

> >> @@ -223,10 +322,16 @@ static void virtio_ioport_write(void *opaque, uint32_t addr, uint32_t val)
> >>          virtio_queue_notify(vdev, val);
> >>          break;
> >>      case VIRTIO_PCI_STATUS:
> >> -        virtio_set_status(vdev, val & 0xFF);
> >> -        if (vdev->status == 0) {
> >> -            virtio_reset(proxy->vdev);
> >> -            msix_unuse_all_vectors(&proxy->pci_dev);
> >> +        if ((val & VIRTIO_CONFIG_S_DRIVER_OK) &&
> >> +            !(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) &&
> >> +            (proxy->flags & VIRTIO_PCI_FLAG_USE_IOEVENTFD)) {
> >> +            virtio_pci_set_host_notifiers(proxy, true);
> >> +        }
> >
> > So we set host notifiers to true from here, but to false
> > only on reset? This seems strange. Should not we disable
> > notifiers when driver clears OK status?
> > How about on bus master disable?
> 
> You're right, this needs to be fixed.
> 
> >> @@ -714,6 +803,8 @@ static PCIDeviceInfo virtio_info[] = {
> >>          .exit       = virtio_net_exit_pci,
> >>          .romfile    = "pxe-virtio.bin",
> >>          .qdev.props = (Property[]) {
> >> +            DEFINE_PROP_UINT32("flags", VirtIOPCIProxy, flags,
> >> +                               VIRTIO_PCI_FLAG_USE_IOEVENTFD),
> >>              DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors, 3),
> >>              DEFINE_VIRTIO_NET_FEATURES(VirtIOPCIProxy, host_features),
> >>              DEFINE_NIC_PROPERTIES(VirtIOPCIProxy, nic),
> >
> > This ties interface to an internal macro value.  Further, user gets to
> > tweak other fields in this integer which we don't want.  Finally, the
> > interface is extremely unfriendly.
> > Please use a bit property instead: DEFINE_PROP_BIT.
> 
> Will fix in v3.
> 
> >> diff --git a/hw/virtio.c b/hw/virtio.c
> >> index a2a657e..f588e29 100644
> >> --- a/hw/virtio.c
> >> +++ b/hw/virtio.c
> >> @@ -582,6 +582,11 @@ void virtio_queue_notify(VirtIODevice *vdev, int n)
> >>      }
> >>  }
> >>
> >> +void virtio_queue_notify_vq(VirtQueue *vq)
> >> +{
> >> +    virtio_queue_notify(vq->vdev, vq - vq->vdev->vq);
> >
> > Let's implement virtio_queue_notify in terms of virtio_queue_notify_vq.
> > Not the other way around.
> 
> Will fix in v3.
> 
> Stefan

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/3] virtio-pci: Use ioeventfd for virtqueue notify
  2010-11-12  9:25         ` [Qemu-devel] " Michael S. Tsirkin
@ 2010-11-12 13:10           ` Stefan Hajnoczi
  -1 siblings, 0 replies; 46+ messages in thread
From: Stefan Hajnoczi @ 2010-11-12 13:10 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Stefan Hajnoczi, qemu-devel, kvm

On Fri, Nov 12, 2010 at 9:25 AM, Michael S. Tsirkin <mst@redhat.com> wrote:
> On Fri, Nov 12, 2010 at 09:18:48AM +0000, Stefan Hajnoczi wrote:
>> On Thu, Nov 11, 2010 at 3:53 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
>> > On Thu, Nov 11, 2010 at 01:47:21PM +0000, Stefan Hajnoczi wrote:
>> >> Care must be taken not to interfere with vhost-net, which already uses
>> >> ioeventfd host notifiers.  The following list shows the behavior implemented in
>> >> this patch and is designed to take vhost-net into account:
>> >>
>> >>  * VIRTIO_CONFIG_S_DRIVER_OK -> assign host notifiers, qemu_set_fd_handler(virtio_pci_host_notifier_read)
>> >
>> > we should also deassign when VIRTIO_CONFIG_S_DRIVER_OK is cleared
>> > by io write or bus master bit?
>>
>> You're right, I'll fix the lifecycle to trigger symmetrically on
>> status bit changes rather than VIRTIO_CONFIG_S_DRIVER_OK/reset.
>>
>> >> +static void virtio_pci_reset_vdev(VirtIOPCIProxy *proxy)
>> >> +{
>> >> +    /* Poke virtio device so it deassigns its host notifiers (if any) */
>> >> +    virtio_set_status(proxy->vdev, 0);
>> >
>> > Hmm. virtio_reset already sets status to 0.
>> > I guess it should just be fixed to call virtio_set_status?
>>
>> This part is ugly.  The problem is that virtio_reset() calls
>> virtio_set_status(vdev, 0) but doesn't give the transport binding a
>> chance to clean up after the virtio device has cleaned up.  Since
>> virtio-net will spot status=0 and deassign its host notifier, we need
>> to perform our own clean up after vhost.
>>
>> What makes this slightly less of a hack is the fact that virtio-pci.c
>> was already causing virtio_set_status(vdev, 0) to be invoked twice
>> during reset.  When 0 is written to the VIRTIO_PCI_STATUS register, we
>> do virtio_set_status(proxy->vdev, val & 0xFF) and then
>> virtio_reset(proxy->vdev).  So the status byte callback already gets
>> invoked twice today.
>>
>> I've just split this out into virtio_pci_reset_vdev() and (ab)used it
>> to correctly clean up virtqueue ioeventfd.
>>
>> The alternative is to add another callback from virtio.c so we are
>> notified after the vdev's reset callback has finished.
>
> Oh, likely not worth it. Maybe put the above explanation in the comment.
> Will this go away now that you move to set notifiers on status write?

For v3 I have switched to a bindings callback.  I wish it weren't
necessary, but the only other ways I can think of to catch status
writes are hacks that depend too much on side effects.
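
A rough sketch of what such a bindings callback could look like (hypothetical
interface; the callback name and the actual v3 code may differ):

    /* New optional callback in VirtIOBindings: the transport is told about
     * every status write after the device has processed it, so virtio-pci
     * can assign/deassign its ioeventfd host notifiers at one well-defined
     * point. */
    typedef struct {
        void (*notify)(void *opaque, uint16_t vector);
        /* ... existing callbacks ... */
        void (*set_status)(void *opaque, uint8_t status);   /* assumed name */
    } VirtIOBindings;

    void virtio_set_status(VirtIODevice *vdev, uint8_t val)
    {
        if (vdev->set_status) {
            vdev->set_status(vdev, val);          /* device hook, as today */
        }
        vdev->status = val;
        if (vdev->binding && vdev->binding->set_status) {
            vdev->binding->set_status(vdev->binding_opaque, val);   /* new */
        }
    }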

Stefan

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] Re: [PATCH 2/3] virtio-pci: Use ioeventfd for virtqueue notify
  2010-11-12  9:20       ` Stefan Hajnoczi
@ 2010-11-14 10:34         ` Avi Kivity
  -1 siblings, 0 replies; 46+ messages in thread
From: Avi Kivity @ 2010-11-14 10:34 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Christoph Hellwig, Stefan Hajnoczi, qemu-devel, kvm, Michael S. Tsirkin

On 11/12/2010 11:20 AM, Stefan Hajnoczi wrote:
> >  Who guarantees that less common virtio-blk and virtio-net guest drivers
> >  for non-Linux OSes are fine with it?  Maybe you should add a feature flag
> >  that the guest has to ACK to enable it.
>
> Virtio-blk and virtio-net are fine.  Both of those devices are
> expected to operate asynchronously.  SeaBIOS and gPXE virtio-net
> drivers spin but they expect to and it is okay in those environments.
> They already burn CPU today.
>
> Virtio-console expects synchronous virtqueue kick.  In Linux,
> virtio_console.c __send_control_msg() and send_buf() will spin.  Qemu
> userspace is able to complete those requests synchronously so that the
> guest never actually burns CPU (e.g.
> hw/virtio-serial-bus.c:send_control_msg()).  I don't want to burn CPU
> in places where we previously didn't.

This is a horrible bug.  virtio is an asynchronous API.  Some hypervisor 
implementations cannot even provide synchronous notifications.

> It's good that QEMU can decide whether or not to handle virtqueue kick
> in the vcpu thread.  For high performance asynchronous devices like
> virtio-net and virtio-blk it makes sense to use ioeventfd.  For others
> it may not be useful.  I'm not sure a feature bit that exposes this
> detail to the guest would be useful.

The guest should always assume that virtio devices are asynchronous.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] Re: [PATCH 2/3] virtio-pci: Use ioeventfd for virtqueue notify
  2010-11-14 10:34         ` Avi Kivity
@ 2010-11-14 10:37           ` Stefan Hajnoczi
  -1 siblings, 0 replies; 46+ messages in thread
From: Stefan Hajnoczi @ 2010-11-14 10:37 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Christoph Hellwig, Stefan Hajnoczi, qemu-devel, kvm, Michael S. Tsirkin

On Sun, Nov 14, 2010 at 10:34 AM, Avi Kivity <avi@redhat.com> wrote:
> On 11/12/2010 11:20 AM, Stefan Hajnoczi wrote:
>>
>> >  Who guarantees that less common virtio-blk and virtio-net guest drivers
>> >  for non-Linux OSes are fine with it?  Maybe you should add a feature
>> > flag
>> >  that the guest has to ACK to enable it.
>>
>> Virtio-blk and virtio-net are fine.  Both of those devices are
>> expected to operate asynchronously.  SeaBIOS and gPXE virtio-net
>> drivers spin but they expect to and it is okay in those environments.
>> They already burn CPU today.
>>
>> Virtio-console expects synchronous virtqueue kick.  In Linux,
>> virtio_console.c __send_control_msg() and send_buf() will spin.  Qemu
>> userspace is able to complete those requests synchronously so that the
>> guest never actually burns CPU (e.g.
>> hw/virtio-serial-bus.c:send_control_msg()).  I don't want to burn CPU
>> in places where we previously didn't.
>
> This is a horrible bug.  virtio is an asynchronous API.  Some hypervisor
> implementations cannot even provide synchronous notifications.
>
>> It's good that QEMU can decide whether or not to handle virtqueue kick
>> in the vcpu thread.  For high performance asynchronous devices like
>> virtio-net and virtio-blk it makes sense to use ioeventfd.  For others
>> it may not be useful.  I'm not sure a feature bit that exposes this
>> detail to the guest would be useful.
>
> The guest should always assume that virtio devices are asynchronous.

I agree, but let's enable virtio-ioeventfd carefully because bad code
is out there.

Stefan

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] Re: [PATCH 2/3] virtio-pci: Use ioeventfd for virtqueue notify
  2010-11-14 10:37           ` Stefan Hajnoczi
@ 2010-11-14 11:05             ` Avi Kivity
  -1 siblings, 0 replies; 46+ messages in thread
From: Avi Kivity @ 2010-11-14 11:05 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Christoph Hellwig, Stefan Hajnoczi, qemu-devel, kvm, Michael S. Tsirkin

On 11/14/2010 12:37 PM, Stefan Hajnoczi wrote:
> On Sun, Nov 14, 2010 at 10:34 AM, Avi Kivity<avi@redhat.com>  wrote:
> >  On 11/12/2010 11:20 AM, Stefan Hajnoczi wrote:
> >>
> >>  >    Who guarantees that less common virtio-blk and virtio-net guest drivers
> >>  >    for non-Linux OSes are fine with it?  Maybe you should add a feature
> >>  >  flag
> >>  >    that the guest has to ACK to enable it.
> >>
> >>  Virtio-blk and virtio-net are fine.  Both of those devices are
> >>  expected to operate asynchronously.  SeaBIOS and gPXE virtio-net
> >>  drivers spin but they expect to and it is okay in those environments.
> >>  They already burn CPU today.
> >>
> >>  Virtio-console expects synchronous virtqueue kick.  In Linux,
> >>  virtio_console.c __send_control_msg() and send_buf() will spin.  Qemu
> >>  userspace is able to complete those requests synchronously so that the
> >>  guest never actually burns CPU (e.g.
> >>  hw/virtio-serial-bus.c:send_control_msg()).  I don't want to burn CPU
> >>  in places where we previously didn't.
> >
> >  This is a horrible bug.  virtio is an asynchronous API.  Some hypervisor
> >  implementations cannot even provide synchronous notifications.
> >
> >>  It's good that QEMU can decide whether or not to handle virtqueue kick
> >>  in the vcpu thread.  For high performance asynchronous devices like
> >>  virtio-net and virtio-blk it makes sense to use ioeventfd.  For others
> >>  it may not be useful.  I'm not sure a feature bit that exposes this
> >>  detail to the guest would be useful.
> >
> >  The guest should always assume that virtio devices are asynchronous.
>
> I agree, but let's enable virtio-ioeventfd carefully because bad code
> is out there.

Sure.  Note as long as the thread waiting on ioeventfd doesn't consume 
too much cpu, it will awaken quickly and we won't have the "transaction 
per timeslice" effect.

btw, what about virtio-blk with linux-aio?  Have you benchmarked that 
with and without ioeventfd?

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] Re: [PATCH 2/3] virtio-pci: Use ioeventfd for virtqueue notify
  2010-11-14 11:05             ` Avi Kivity
@ 2010-11-14 12:19               ` Avi Kivity
  -1 siblings, 0 replies; 46+ messages in thread
From: Avi Kivity @ 2010-11-14 12:19 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Christoph Hellwig, Stefan Hajnoczi, qemu-devel, kvm, Michael S. Tsirkin

On 11/14/2010 01:05 PM, Avi Kivity wrote:
>> I agree, but let's enable virtio-ioeventfd carefully because bad code
>> is out there.
>
>
> Sure.  Note as long as the thread waiting on ioeventfd doesn't consume 
> too much cpu, it will awaken quickly and we won't have the 
> "transaction per timeslice" effect.
>
> btw, what about virtio-blk with linux-aio?  Have you benchmarked that 
> with and without ioeventfd?
>

And, what about efficiency?  As in bits/cycle?

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] Re: [PATCH 2/3] virtio-pci: Use ioeventfd for virtqueue notify
  2010-11-14 12:19               ` Avi Kivity
@ 2010-11-15 11:20                 ` Stefan Hajnoczi
  -1 siblings, 0 replies; 46+ messages in thread
From: Stefan Hajnoczi @ 2010-11-15 11:20 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Christoph Hellwig, Stefan Hajnoczi, qemu-devel, kvm,
	Michael S. Tsirkin, Khoa Huynh

On Sun, Nov 14, 2010 at 12:19 PM, Avi Kivity <avi@redhat.com> wrote:
> On 11/14/2010 01:05 PM, Avi Kivity wrote:
>>>
>>> I agree, but let's enable virtio-ioeventfd carefully because bad code
>>> is out there.
>>
>>
>> Sure.  Note as long as the thread waiting on ioeventfd doesn't consume too
>> much cpu, it will awaken quickly and we won't have the "transaction per
>> timeslice" effect.
>>
>> btw, what about virtio-blk with linux-aio?  Have you benchmarked that with
>> and without ioeventfd?
>>
>
> And, what about efficiency?  As in bits/cycle?

We are running benchmarks with this latest patch and will report results.

Stefan

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] Re: [PATCH 2/3] virtio-pci: Use ioeventfd for virtqueue notify
  2010-11-15 11:20                 ` Stefan Hajnoczi
@ 2010-12-01 11:44                   ` Stefan Hajnoczi
  -1 siblings, 0 replies; 46+ messages in thread
From: Stefan Hajnoczi @ 2010-12-01 11:44 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Christoph Hellwig, Stefan Hajnoczi, qemu-devel, kvm,
	Michael S. Tsirkin, Khoa Huynh

On Mon, Nov 15, 2010 at 11:20 AM, Stefan Hajnoczi <stefanha@gmail.com> wrote:
> On Sun, Nov 14, 2010 at 12:19 PM, Avi Kivity <avi@redhat.com> wrote:
>> On 11/14/2010 01:05 PM, Avi Kivity wrote:
>>>>
>>>> I agree, but let's enable virtio-ioeventfd carefully because bad code
>>>> is out there.
>>>
>>>
>>> Sure.  Note as long as the thread waiting on ioeventfd doesn't consume too
>>> much cpu, it will awaken quickly and we won't have the "transaction per
>>> timeslice" effect.
>>>
>>> btw, what about virtio-blk with linux-aio?  Have you benchmarked that with
>>> and without ioeventfd?
>>>
>>
>> And, what about efficiency?  As in bits/cycle?
>
> We are running benchmarks with this latest patch and will report results.

Full results here (thanks to Khoa Huynh):

http://wiki.qemu.org/Features/VirtioIoeventfd

The host CPU utilization is scaled to 16 CPUs so a 2-3% reduction is
actually in the 32-48% range for a single CPU.
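
(Worked out: 2% of 16 CPUs is 0.02 x 16 = 32% of a single CPU, and 3%
is 0.03 x 16 = 48%, hence the 32-48% range.)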

The guest CPU utilization numbers include an efficiency metric: %vcpu
per MB/sec.  Here we see significant improvements too.  Guests that
previously couldn't get more CPU work done now have regained some
breathing space.

Stefan

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] Re: [PATCH 2/3] virtio-pci: Use ioeventfd for virtqueue notify
  2010-12-01 11:44                   ` Stefan Hajnoczi
@ 2010-12-01 12:30                     ` Avi Kivity
  -1 siblings, 0 replies; 46+ messages in thread
From: Avi Kivity @ 2010-12-01 12:30 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Christoph Hellwig, Stefan Hajnoczi, qemu-devel, kvm,
	Michael S. Tsirkin, Khoa Huynh

On 12/01/2010 01:44 PM, Stefan Hajnoczi wrote:
> >>
> >>  And, what about efficiency?  As in bits/cycle?
> >
> >  We are running benchmarks with this latest patch and will report results.
>
> Full results here (thanks to Khoa Huynh):
>
> http://wiki.qemu.org/Features/VirtioIoeventfd
>
> The host CPU utilization is scaled to 16 CPUs so a 2-3% reduction is
> actually in the 32-48% range for a single CPU.
>
> The guest CPU utilization numbers include an efficiency metric: %vcpu
> per MB/sec.  Here we see significant improvements too.  Guests that
> previously couldn't get more CPU work done now have regained some
> breathing space.

Thanks for those numbers.  The guest improvements were expected, but the 
host numbers surprised me.  Do you have an explanation as to why total 
host load should decrease?

Seems to me the host is doing more work, and doing it less efficiently 
(by ping-ponging requests to another thread).

In any case, looks like good improvement with no downsides.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] Re: [PATCH 2/3] virtio-pci: Use ioeventfd for virtqueue notify
  2010-12-01 12:30                     ` Avi Kivity
@ 2010-12-01 21:34                       ` Stefan Hajnoczi
  -1 siblings, 0 replies; 46+ messages in thread
From: Stefan Hajnoczi @ 2010-12-01 21:34 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Christoph Hellwig, Stefan Hajnoczi, qemu-devel, kvm,
	Michael S. Tsirkin, Khoa Huynh

On Wed, Dec 1, 2010 at 12:30 PM, Avi Kivity <avi@redhat.com> wrote:
> On 12/01/2010 01:44 PM, Stefan Hajnoczi wrote:
>>
>> >>
>> >>  And, what about efficiency?  As in bits/cycle?
>> >
>> >  We are running benchmarks with this latest patch and will report
>> > results.
>>
>> Full results here (thanks to Khoa Huynh):
>>
>> http://wiki.qemu.org/Features/VirtioIoeventfd
>>
>> The host CPU utilization is scaled to 16 CPUs so a 2-3% reduction is
>> actually in the 32-48% range for a single CPU.
>>
>> The guest CPU utilization numbers include an efficiency metric: %vcpu
>> per MB/sec.  Here we see significant improvements too.  Guests that
>> previously couldn't get more CPU work done now have regained some
>> breathing space.
>
> Thanks for those numbers.  The guest improvements were expected, but the
> host numbers surprised me.  Do you have an explanation as to why total host
> load should decrease?

The first vcpu does the virtqueue kick and holds the guest driver's
vblk->lock across the kick.  Before this kick completes, a second vcpu
tries to acquire vblk->lock, finds it contended, and spins.  So we're
burning CPU due to the long vblk->lock hold times.

With virtio-ioeventfd those kick times are reduced and there is less
contention on vblk->lock.
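
(A sketch of the guest-side pattern being described; this is not the
actual virtio_blk request code, and vblk_issue_pending() plus the
struct layout are stand-ins for illustration only:)

#include <linux/spinlock.h>
#include <linux/virtio.h>

struct vblk_dev {                       /* stand-in for the real driver state */
        spinlock_t lock;                /* the vblk->lock discussed above */
        struct virtqueue *vq;
};

static void vblk_issue_pending(struct vblk_dev *vblk)
{
        unsigned long flags;

        spin_lock_irqsave(&vblk->lock, flags);
        /* ... queue the pending requests onto vblk->vq here ... */
        virtqueue_kick(vblk->vq);       /* the guest->host notify happens with
                                         * the lock held: if the kick is emulated
                                         * synchronously in the vcpu thread,
                                         * other vcpus spin on vblk->lock for the
                                         * whole emulation; with ioeventfd they
                                         * only wait for a lightweight exit */
        spin_unlock_irqrestore(&vblk->lock, flags);
}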

Stefan

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] Re: [PATCH 2/3] virtio-pci: Use ioeventfd for virtqueue notify
  2010-12-01 21:34                       ` Stefan Hajnoczi
@ 2010-12-06 15:09                         ` Avi Kivity
  -1 siblings, 0 replies; 46+ messages in thread
From: Avi Kivity @ 2010-12-06 15:09 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Christoph Hellwig, Stefan Hajnoczi, qemu-devel, kvm,
	Michael S. Tsirkin, Khoa Huynh

On 12/01/2010 11:34 PM, Stefan Hajnoczi wrote:
> >>  The guest CPU utilization numbers include an efficiency metric: %vcpu
> >>  per MB/sec.  Here we see significant improvements too.  Guests that
> >>  previously couldn't get more CPU work done now have regained some
> >>  breathing space.
> >
> >  Thanks for those numbers.  The guest improvements were expected, but the
> >  host numbers surprised me.  Do you have an explanation as to why total host
> >  load should decrease?
>
> The first vcpu does the virtqueue kick and holds the guest driver's
> vblk->lock across the kick.  Before this kick completes, a second vcpu
> tries to acquire vblk->lock, finds it contended, and spins.  So we're
> burning CPU due to the long vblk->lock hold times.
>
> With virtio-ioeventfd those kick times are reduced and there is less
> contention on vblk->lock.

Makes sense.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 46+ messages in thread

end of thread, other threads:[~2010-12-06 15:10 UTC | newest]

Thread overview: 46+ messages
2010-11-11 13:47 [PATCH v2 0/3] virtio: Use ioeventfd for virtqueue notify Stefan Hajnoczi
2010-11-11 13:47 ` [Qemu-devel] " Stefan Hajnoczi
2010-11-11 13:47 ` [PATCH 1/3] virtio-pci: Rename bugs field to flags Stefan Hajnoczi
2010-11-11 13:47   ` [Qemu-devel] " Stefan Hajnoczi
2010-11-11 13:47 ` [PATCH 2/3] virtio-pci: Use ioeventfd for virtqueue notify Stefan Hajnoczi
2010-11-11 13:47   ` [Qemu-devel] " Stefan Hajnoczi
2010-11-11 15:53   ` Michael S. Tsirkin
2010-11-11 15:53     ` [Qemu-devel] " Michael S. Tsirkin
2010-11-12  9:18     ` Stefan Hajnoczi
2010-11-12  9:18       ` [Qemu-devel] " Stefan Hajnoczi
2010-11-12  9:25       ` Michael S. Tsirkin
2010-11-12  9:25         ` [Qemu-devel] " Michael S. Tsirkin
2010-11-12 13:10         ` Stefan Hajnoczi
2010-11-12 13:10           ` [Qemu-devel] " Stefan Hajnoczi
2010-11-11 16:45   ` Christoph Hellwig
2010-11-11 16:45     ` [Qemu-devel] " Christoph Hellwig
2010-11-12  9:20     ` Stefan Hajnoczi
2010-11-12  9:20       ` Stefan Hajnoczi
2010-11-14 10:34       ` Avi Kivity
2010-11-14 10:34         ` Avi Kivity
2010-11-14 10:37         ` Stefan Hajnoczi
2010-11-14 10:37           ` Stefan Hajnoczi
2010-11-14 11:05           ` Avi Kivity
2010-11-14 11:05             ` Avi Kivity
2010-11-14 12:19             ` Avi Kivity
2010-11-14 12:19               ` Avi Kivity
2010-11-15 11:20               ` Stefan Hajnoczi
2010-11-15 11:20                 ` Stefan Hajnoczi
2010-12-01 11:44                 ` Stefan Hajnoczi
2010-12-01 11:44                   ` Stefan Hajnoczi
2010-12-01 12:30                   ` Avi Kivity
2010-12-01 12:30                     ` Avi Kivity
2010-12-01 21:34                     ` Stefan Hajnoczi
2010-12-01 21:34                       ` Stefan Hajnoczi
2010-12-06 15:09                       ` Avi Kivity
2010-12-06 15:09                         ` Avi Kivity
2010-11-11 16:59   ` Avi Kivity
2010-11-11 16:59     ` [Qemu-devel] " Avi Kivity
2010-11-11 17:12     ` Michael S. Tsirkin
2010-11-11 17:12       ` [Qemu-devel] " Michael S. Tsirkin
2010-11-11 17:55       ` Gleb Natapov
2010-11-11 17:55         ` [Qemu-devel] " Gleb Natapov
2010-11-11 13:47 ` [PATCH 3/3] virtio-pci: Don't use ioeventfd on old kernels Stefan Hajnoczi
2010-11-11 13:47   ` [Qemu-devel] " Stefan Hajnoczi
2010-11-11 17:46 ` [PATCH v2 0/3] virtio: Use ioeventfd for virtqueue notify Michael S. Tsirkin
2010-11-11 17:46   ` [Qemu-devel] " Michael S. Tsirkin
