* [Qemu-devel] [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify @ 2010-12-12 15:02 Stefan Hajnoczi 2010-12-12 15:02 ` [Qemu-devel] [PATCH v5 1/4] virtio-pci: Rename bugs field to flags Stefan Hajnoczi ` (5 more replies) 0 siblings, 6 replies; 52+ messages in thread From: Stefan Hajnoczi @ 2010-12-12 15:02 UTC (permalink / raw) To: qemu-devel; +Cc: Michael S. Tsirkin See below for the v5 changelog. Due to lack of connectivity I am sending from GMail. Git should retain my stefanha@linux.vnet.ibm.com From address. Virtqueue notify is currently handled synchronously in userspace virtio. This prevents the vcpu from executing guest code while hardware emulation code handles the notify. On systems that support KVM, the ioeventfd mechanism can be used to make virtqueue notify a lightweight exit by deferring hardware emulation to the iothread and allowing the VM to continue execution. This model is similar to how vhost receives virtqueue notifies. The result of this change is improved performance for userspace virtio devices. Virtio-blk throughput increases especially for multithreaded scenarios and virtio-net transmit throughput increases substantially. Now that this code is in virtio-pci.c it is possible to explicitly enable devices for which virtio-ioeventfd should be used. Only virtio-blk and virtio-net are enabled at this time. v5: * Fix spurious whitespace change in documentation * Test and clear event notifier when deassigning to catch race condition v4: * Simpler start/stop ioeventfd mechanism using bool ioeventfd_started state * Support for migration * Handle deassign race condition to avoid dropping a virtqueue kick * Add missing kvm_enabled() check to kvm_has_many_ioeventfds() * Documentation updates for qdev -device with ioeventfd=on|off ^ permalink raw reply [flat|nested] 52+ messages in thread
* [Qemu-devel] [PATCH v5 1/4] virtio-pci: Rename bugs field to flags 2010-12-12 15:02 [Qemu-devel] [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify Stefan Hajnoczi @ 2010-12-12 15:02 ` Stefan Hajnoczi 2010-12-12 15:02 ` [Qemu-devel] [PATCH v5 2/4] virtio-pci: Use ioeventfd for virtqueue notify Stefan Hajnoczi ` (4 subsequent siblings) 5 siblings, 0 replies; 52+ messages in thread From: Stefan Hajnoczi @ 2010-12-12 15:02 UTC (permalink / raw) To: qemu-devel; +Cc: Stefan Hajnoczi, Michael S. Tsirkin The VirtIOPCIProxy bugs field is currently used to enable workarounds for older guests. Rename it to flags so that other per-device behavior can be tracked. A later patch uses the flags field to remember whether ioeventfd should be used for virtqueue host notification. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> --- hw/virtio-pci.c | 15 +++++++-------- 1 files changed, 7 insertions(+), 8 deletions(-) diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c index 6186142..13dd391 100644 --- a/hw/virtio-pci.c +++ b/hw/virtio-pci.c @@ -80,9 +80,8 @@ * 12 is historical, and due to x86 page size. */ #define VIRTIO_PCI_QUEUE_ADDR_SHIFT 12 -/* We can catch some guest bugs inside here so we continue supporting older - guests. */ -#define VIRTIO_PCI_BUG_BUS_MASTER (1 << 0) +/* Flags track per-device state like workarounds for quirks in older guests. */ +#define VIRTIO_PCI_FLAG_BUS_MASTER_BUG (1 << 0) /* QEMU doesn't strictly need write barriers since everything runs in * lock-step. We'll leave the calls to wmb() in though to make it obvious for @@ -95,7 +94,7 @@ typedef struct { PCIDevice pci_dev; VirtIODevice *vdev; - uint32_t bugs; + uint32_t flags; uint32_t addr; uint32_t class_code; uint32_t nvectors; @@ -159,7 +158,7 @@ static int virtio_pci_load_config(void * opaque, QEMUFile *f) in ready state. Then we have a buggy guest OS. 
*/ if ((proxy->vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) && !(proxy->pci_dev.config[PCI_COMMAND] & PCI_COMMAND_MASTER)) { - proxy->bugs |= VIRTIO_PCI_BUG_BUS_MASTER; + proxy->flags |= VIRTIO_PCI_FLAG_BUS_MASTER_BUG; } return 0; } @@ -185,7 +184,7 @@ static void virtio_pci_reset(DeviceState *d) VirtIOPCIProxy *proxy = container_of(d, VirtIOPCIProxy, pci_dev.qdev); virtio_reset(proxy->vdev); msix_reset(&proxy->pci_dev); - proxy->bugs = 0; + proxy->flags = 0; } static void virtio_ioport_write(void *opaque, uint32_t addr, uint32_t val) @@ -235,7 +234,7 @@ static void virtio_ioport_write(void *opaque, uint32_t addr, uint32_t val) some safety checks. */ if ((val & VIRTIO_CONFIG_S_DRIVER_OK) && !(proxy->pci_dev.config[PCI_COMMAND] & PCI_COMMAND_MASTER)) { - proxy->bugs |= VIRTIO_PCI_BUG_BUS_MASTER; + proxy->flags |= VIRTIO_PCI_FLAG_BUS_MASTER_BUG; } break; case VIRTIO_MSI_CONFIG_VECTOR: @@ -403,7 +402,7 @@ static void virtio_write_config(PCIDevice *pci_dev, uint32_t address, if (PCI_COMMAND == address) { if (!(val & PCI_COMMAND_MASTER)) { - if (!(proxy->bugs & VIRTIO_PCI_BUG_BUS_MASTER)) { + if (!(proxy->flags & VIRTIO_PCI_FLAG_BUS_MASTER_BUG)) { virtio_set_status(proxy->vdev, proxy->vdev->status & ~VIRTIO_CONFIG_S_DRIVER_OK); } -- 1.7.2.3 ^ permalink raw reply related [flat|nested] 52+ messages in thread
* [Qemu-devel] [PATCH v5 2/4] virtio-pci: Use ioeventfd for virtqueue notify 2010-12-12 15:02 [Qemu-devel] [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify Stefan Hajnoczi 2010-12-12 15:02 ` [Qemu-devel] [PATCH v5 1/4] virtio-pci: Rename bugs field to flags Stefan Hajnoczi @ 2010-12-12 15:02 ` Stefan Hajnoczi 2011-01-24 18:54 ` Kevin Wolf 2010-12-12 15:02 ` [Qemu-devel] [PATCH v5 3/4] virtio-pci: Don't use ioeventfd on old kernels Stefan Hajnoczi ` (3 subsequent siblings) 5 siblings, 1 reply; 52+ messages in thread From: Stefan Hajnoczi @ 2010-12-12 15:02 UTC (permalink / raw) To: qemu-devel; +Cc: Stefan Hajnoczi, Michael S. Tsirkin Virtqueue notify is currently handled synchronously in userspace virtio. This prevents the vcpu from executing guest code while hardware emulation code handles the notify. On systems that support KVM, the ioeventfd mechanism can be used to make virtqueue notify a lightweight exit by deferring hardware emulation to the iothread and allowing the VM to continue execution. This model is similar to how vhost receives virtqueue notifies. The result of this change is improved performance for userspace virtio devices. Virtio-blk throughput increases especially for multithreaded scenarios and virtio-net transmit throughput increases substantially. Some virtio devices are known to have guest drivers which expect a notify to be processed synchronously and spin waiting for completion. Only enable ioeventfd for virtio-blk and virtio-net for now. Care must be taken not to interfere with vhost-net, which uses host notifiers. If the set_host_notifier() API is used by a device virtio-pci will disable virtio-ioeventfd and let the device deal with host notifiers as it wishes. After migration and on VM change state (running/paused) virtio-ioeventfd will enable/disable itself. 
* VIRTIO_CONFIG_S_DRIVER_OK -> enable virtio-ioeventfd * !VIRTIO_CONFIG_S_DRIVER_OK -> disable virtio-ioeventfd * virtio_pci_set_host_notifier() -> disable virtio-ioeventfd * vm_change_state(running=0) -> disable virtio-ioeventfd * vm_change_state(running=1) -> enable virtio-ioeventfd Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> --- hw/virtio-pci.c | 190 ++++++++++++++++++++++++++++++++++++++++++++++++------- hw/virtio.c | 14 +++- hw/virtio.h | 1 + 3 files changed, 179 insertions(+), 26 deletions(-) diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c index 13dd391..f57c45a 100644 --- a/hw/virtio-pci.c +++ b/hw/virtio-pci.c @@ -83,6 +83,11 @@ /* Flags track per-device state like workarounds for quirks in older guests. */ #define VIRTIO_PCI_FLAG_BUS_MASTER_BUG (1 << 0) +/* Performance improves when virtqueue kick processing is decoupled from the + * vcpu thread using ioeventfd for some devices. */ +#define VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT 1 +#define VIRTIO_PCI_FLAG_USE_IOEVENTFD (1 << VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT) + /* QEMU doesn't strictly need write barriers since everything runs in * lock-step. We'll leave the calls to wmb() in though to make it obvious for * KVM or if kqemu gets SMP support. @@ -107,6 +112,8 @@ typedef struct { /* Max. 
number of ports we can have for a the virtio-serial device */ uint32_t max_virtserial_ports; virtio_net_conf net; + bool ioeventfd_started; + VMChangeStateEntry *vm_change_state_entry; } VirtIOPCIProxy; /* virtio device */ @@ -179,12 +186,131 @@ static int virtio_pci_load_queue(void * opaque, int n, QEMUFile *f) return 0; } +static int virtio_pci_set_host_notifier_internal(VirtIOPCIProxy *proxy, + int n, bool assign) +{ + VirtQueue *vq = virtio_get_queue(proxy->vdev, n); + EventNotifier *notifier = virtio_queue_get_host_notifier(vq); + int r; + if (assign) { + r = event_notifier_init(notifier, 1); + if (r < 0) { + return r; + } + r = kvm_set_ioeventfd_pio_word(event_notifier_get_fd(notifier), + proxy->addr + VIRTIO_PCI_QUEUE_NOTIFY, + n, assign); + if (r < 0) { + event_notifier_cleanup(notifier); + } + } else { + r = kvm_set_ioeventfd_pio_word(event_notifier_get_fd(notifier), + proxy->addr + VIRTIO_PCI_QUEUE_NOTIFY, + n, assign); + if (r < 0) { + return r; + } + + /* Handle the race condition where the guest kicked and we deassigned + * before we got around to handling the kick. 
+ */ + if (event_notifier_test_and_clear(notifier)) { + virtio_queue_notify_vq(vq); + } + + event_notifier_cleanup(notifier); + } + return r; +} + +static void virtio_pci_host_notifier_read(void *opaque) +{ + VirtQueue *vq = opaque; + EventNotifier *n = virtio_queue_get_host_notifier(vq); + if (event_notifier_test_and_clear(n)) { + virtio_queue_notify_vq(vq); + } +} + +static void virtio_pci_set_host_notifier_fd_handler(VirtIOPCIProxy *proxy, + int n, bool assign) +{ + VirtQueue *vq = virtio_get_queue(proxy->vdev, n); + EventNotifier *notifier = virtio_queue_get_host_notifier(vq); + if (assign) { + qemu_set_fd_handler(event_notifier_get_fd(notifier), + virtio_pci_host_notifier_read, NULL, vq); + } else { + qemu_set_fd_handler(event_notifier_get_fd(notifier), + NULL, NULL, NULL); + } +} + +static int virtio_pci_start_ioeventfd(VirtIOPCIProxy *proxy) +{ + int n, r; + + if (!(proxy->flags & VIRTIO_PCI_FLAG_USE_IOEVENTFD) || + proxy->ioeventfd_started) { + return 0; + } + + for (n = 0; n < VIRTIO_PCI_QUEUE_MAX; n++) { + if (!virtio_queue_get_num(proxy->vdev, n)) { + continue; + } + + r = virtio_pci_set_host_notifier_internal(proxy, n, true); + if (r < 0) { + goto assign_error; + } + + virtio_pci_set_host_notifier_fd_handler(proxy, n, true); + } + proxy->ioeventfd_started = true; + return 0; + +assign_error: + while (--n >= 0) { + if (!virtio_queue_get_num(proxy->vdev, n)) { + continue; + } + + virtio_pci_set_host_notifier_fd_handler(proxy, n, false); + virtio_pci_set_host_notifier_internal(proxy, n, false); + } + proxy->ioeventfd_started = false; + proxy->flags &= ~VIRTIO_PCI_FLAG_USE_IOEVENTFD; + return r; +} + +static int virtio_pci_stop_ioeventfd(VirtIOPCIProxy *proxy) +{ + int n; + + if (!proxy->ioeventfd_started) { + return 0; + } + + for (n = 0; n < VIRTIO_PCI_QUEUE_MAX; n++) { + if (!virtio_queue_get_num(proxy->vdev, n)) { + continue; + } + + virtio_pci_set_host_notifier_fd_handler(proxy, n, false); + virtio_pci_set_host_notifier_internal(proxy, n, false); + } + 
proxy->ioeventfd_started = false; + return 0; +} + static void virtio_pci_reset(DeviceState *d) { VirtIOPCIProxy *proxy = container_of(d, VirtIOPCIProxy, pci_dev.qdev); + virtio_pci_stop_ioeventfd(proxy); virtio_reset(proxy->vdev); msix_reset(&proxy->pci_dev); - proxy->flags = 0; + proxy->flags &= ~VIRTIO_PCI_FLAG_BUS_MASTER_BUG; } static void virtio_ioport_write(void *opaque, uint32_t addr, uint32_t val) @@ -209,6 +335,7 @@ static void virtio_ioport_write(void *opaque, uint32_t addr, uint32_t val) case VIRTIO_PCI_QUEUE_PFN: pa = (target_phys_addr_t)val << VIRTIO_PCI_QUEUE_ADDR_SHIFT; if (pa == 0) { + virtio_pci_stop_ioeventfd(proxy); virtio_reset(proxy->vdev); msix_unuse_all_vectors(&proxy->pci_dev); } @@ -223,6 +350,12 @@ static void virtio_ioport_write(void *opaque, uint32_t addr, uint32_t val) virtio_queue_notify(vdev, val); break; case VIRTIO_PCI_STATUS: + if (val & VIRTIO_CONFIG_S_DRIVER_OK) { + virtio_pci_start_ioeventfd(proxy); + } else { + virtio_pci_stop_ioeventfd(proxy); + } + virtio_set_status(vdev, val & 0xFF); if (vdev->status == 0) { virtio_reset(proxy->vdev); @@ -403,6 +536,7 @@ static void virtio_write_config(PCIDevice *pci_dev, uint32_t address, if (PCI_COMMAND == address) { if (!(val & PCI_COMMAND_MASTER)) { if (!(proxy->flags & VIRTIO_PCI_FLAG_BUS_MASTER_BUG)) { + virtio_pci_stop_ioeventfd(proxy); virtio_set_status(proxy->vdev, proxy->vdev->status & ~VIRTIO_CONFIG_S_DRIVER_OK); } @@ -480,30 +614,27 @@ assign_error: static int virtio_pci_set_host_notifier(void *opaque, int n, bool assign) { VirtIOPCIProxy *proxy = opaque; - VirtQueue *vq = virtio_get_queue(proxy->vdev, n); - EventNotifier *notifier = virtio_queue_get_host_notifier(vq); - int r; - if (assign) { - r = event_notifier_init(notifier, 1); - if (r < 0) { - return r; - } - r = kvm_set_ioeventfd_pio_word(event_notifier_get_fd(notifier), - proxy->addr + VIRTIO_PCI_QUEUE_NOTIFY, - n, assign); - if (r < 0) { - event_notifier_cleanup(notifier); - } + + /* Stop using ioeventfd for virtqueue 
kick if the device starts using host + * notifiers. This makes it easy to avoid stepping on each others' toes. + */ + if (proxy->ioeventfd_started) { + virtio_pci_stop_ioeventfd(proxy); + proxy->flags &= ~VIRTIO_PCI_FLAG_USE_IOEVENTFD; + } + + return virtio_pci_set_host_notifier_internal(proxy, n, assign); +} + +static void virtio_pci_vm_change_state_handler(void *opaque, int running, int reason) +{ + VirtIOPCIProxy *proxy = opaque; + + if (running && (proxy->vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) { + virtio_pci_start_ioeventfd(proxy); } else { - r = kvm_set_ioeventfd_pio_word(event_notifier_get_fd(notifier), - proxy->addr + VIRTIO_PCI_QUEUE_NOTIFY, - n, assign); - if (r < 0) { - return r; - } - event_notifier_cleanup(notifier); + virtio_pci_stop_ioeventfd(proxy); } - return r; } static const VirtIOBindings virtio_pci_bindings = { @@ -563,6 +694,10 @@ static void virtio_init_pci(VirtIOPCIProxy *proxy, VirtIODevice *vdev, proxy->host_features |= 0x1 << VIRTIO_F_NOTIFY_ON_EMPTY; proxy->host_features |= 0x1 << VIRTIO_F_BAD_FEATURE; proxy->host_features = vdev->get_features(vdev, proxy->host_features); + + proxy->vm_change_state_entry = qemu_add_vm_change_state_handler( + virtio_pci_vm_change_state_handler, + proxy); } static int virtio_blk_init_pci(PCIDevice *pci_dev) @@ -590,6 +725,9 @@ static int virtio_blk_init_pci(PCIDevice *pci_dev) static int virtio_exit_pci(PCIDevice *pci_dev) { + VirtIOPCIProxy *proxy = DO_UPCAST(VirtIOPCIProxy, pci_dev, pci_dev); + + qemu_del_vm_change_state_handler(proxy->vm_change_state_entry); return msix_uninit(pci_dev); } @@ -597,6 +735,7 @@ static int virtio_blk_exit_pci(PCIDevice *pci_dev) { VirtIOPCIProxy *proxy = DO_UPCAST(VirtIOPCIProxy, pci_dev, pci_dev); + virtio_pci_stop_ioeventfd(proxy); virtio_blk_exit(proxy->vdev); blockdev_mark_auto_del(proxy->block.bs); return virtio_exit_pci(pci_dev); @@ -658,6 +797,7 @@ static int virtio_net_exit_pci(PCIDevice *pci_dev) { VirtIOPCIProxy *proxy = DO_UPCAST(VirtIOPCIProxy, pci_dev, 
pci_dev); + virtio_pci_stop_ioeventfd(proxy); virtio_net_exit(proxy->vdev); return virtio_exit_pci(pci_dev); } @@ -705,6 +845,8 @@ static PCIDeviceInfo virtio_info[] = { .qdev.props = (Property[]) { DEFINE_PROP_HEX32("class", VirtIOPCIProxy, class_code, 0), DEFINE_BLOCK_PROPERTIES(VirtIOPCIProxy, block), + DEFINE_PROP_BIT("ioeventfd", VirtIOPCIProxy, flags, + VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT, true), DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors, 2), DEFINE_VIRTIO_BLK_FEATURES(VirtIOPCIProxy, host_features), DEFINE_PROP_END_OF_LIST(), @@ -717,6 +859,8 @@ static PCIDeviceInfo virtio_info[] = { .exit = virtio_net_exit_pci, .romfile = "pxe-virtio.bin", .qdev.props = (Property[]) { + DEFINE_PROP_BIT("ioeventfd", VirtIOPCIProxy, flags, + VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT, true), DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors, 3), DEFINE_VIRTIO_NET_FEATURES(VirtIOPCIProxy, host_features), DEFINE_NIC_PROPERTIES(VirtIOPCIProxy, nic), diff --git a/hw/virtio.c b/hw/virtio.c index 07dbf86..e40296a 100644 --- a/hw/virtio.c +++ b/hw/virtio.c @@ -575,11 +575,19 @@ int virtio_queue_get_num(VirtIODevice *vdev, int n) return vdev->vq[n].vring.num; } +void virtio_queue_notify_vq(VirtQueue *vq) +{ + if (vq->vring.desc) { + VirtIODevice *vdev = vq->vdev; + trace_virtio_queue_notify(vdev, vq - vdev->vq, vq); + vq->handle_output(vdev, vq); + } +} + void virtio_queue_notify(VirtIODevice *vdev, int n) { - if (n < VIRTIO_PCI_QUEUE_MAX && vdev->vq[n].vring.desc) { - trace_virtio_queue_notify(vdev, n, &vdev->vq[n]); - vdev->vq[n].handle_output(vdev, &vdev->vq[n]); + if (n < VIRTIO_PCI_QUEUE_MAX) { + virtio_queue_notify_vq(&vdev->vq[n]); } } diff --git a/hw/virtio.h b/hw/virtio.h index 02fa312..5ae521c 100644 --- a/hw/virtio.h +++ b/hw/virtio.h @@ -219,5 +219,6 @@ void virtio_queue_set_last_avail_idx(VirtIODevice *vdev, int n, uint16_t idx); VirtQueue *virtio_get_queue(VirtIODevice *vdev, int n); EventNotifier *virtio_queue_get_guest_notifier(VirtQueue *vq); EventNotifier 
*virtio_queue_get_host_notifier(VirtQueue *vq); +void virtio_queue_notify_vq(VirtQueue *vq); void virtio_irq(VirtQueue *vq); #endif -- 1.7.2.3 ^ permalink raw reply related [flat|nested] 52+ messages in thread
* Re: [Qemu-devel] [PATCH v5 2/4] virtio-pci: Use ioeventfd for virtqueue notify 2010-12-12 15:02 ` [Qemu-devel] [PATCH v5 2/4] virtio-pci: Use ioeventfd for virtqueue notify Stefan Hajnoczi @ 2011-01-24 18:54 ` Kevin Wolf 2011-01-24 19:36 ` Michael S. Tsirkin 0 siblings, 1 reply; 52+ messages in thread From: Kevin Wolf @ 2011-01-24 18:54 UTC (permalink / raw) To: Stefan Hajnoczi; +Cc: Michael S. Tsirkin, qemu-devel, Stefan Hajnoczi Am 12.12.2010 16:02, schrieb Stefan Hajnoczi: > Virtqueue notify is currently handled synchronously in userspace virtio. This > prevents the vcpu from executing guest code while hardware emulation code > handles the notify. > > On systems that support KVM, the ioeventfd mechanism can be used to make > virtqueue notify a lightweight exit by deferring hardware emulation to the > iothread and allowing the VM to continue execution. This model is similar to > how vhost receives virtqueue notifies. > > The result of this change is improved performance for userspace virtio devices. > Virtio-blk throughput increases especially for multithreaded scenarios and > virtio-net transmit throughput increases substantially. > > Some virtio devices are known to have guest drivers which expect a notify to be > processed synchronously and spin waiting for completion. Only enable ioeventfd > for virtio-blk and virtio-net for now. > > Care must be taken not to interfere with vhost-net, which uses host > notifiers. If the set_host_notifier() API is used by a device > virtio-pci will disable virtio-ioeventfd and let the device deal with > host notifiers as it wishes. > > After migration and on VM change state (running/paused) virtio-ioeventfd > will enable/disable itself. 
> > * VIRTIO_CONFIG_S_DRIVER_OK -> enable virtio-ioeventfd > * !VIRTIO_CONFIG_S_DRIVER_OK -> disable virtio-ioeventfd > * virtio_pci_set_host_notifier() -> disable virtio-ioeventfd > * vm_change_state(running=0) -> disable virtio-ioeventfd > * vm_change_state(running=1) -> enable virtio-ioeventfd > > Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> On current git master I'm getting hangs when running iozone on a virtio-blk disk. "Hang" means that it's not responsive any more and has 100% CPU consumption. I bisected the problem to this patch. Any ideas? Kevin ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [Qemu-devel] [PATCH v5 2/4] virtio-pci: Use ioeventfd for virtqueue notify 2011-01-24 18:54 ` Kevin Wolf @ 2011-01-24 19:36 ` Michael S. Tsirkin 2011-01-24 19:48 ` Kevin Wolf 0 siblings, 1 reply; 52+ messages in thread From: Michael S. Tsirkin @ 2011-01-24 19:36 UTC (permalink / raw) To: Kevin Wolf; +Cc: Stefan Hajnoczi, qemu-devel, Stefan Hajnoczi On Mon, Jan 24, 2011 at 07:54:20PM +0100, Kevin Wolf wrote: > Am 12.12.2010 16:02, schrieb Stefan Hajnoczi: > > Virtqueue notify is currently handled synchronously in userspace virtio. This > > prevents the vcpu from executing guest code while hardware emulation code > > handles the notify. > > > > On systems that support KVM, the ioeventfd mechanism can be used to make > > virtqueue notify a lightweight exit by deferring hardware emulation to the > > iothread and allowing the VM to continue execution. This model is similar to > > how vhost receives virtqueue notifies. > > > > The result of this change is improved performance for userspace virtio devices. > > Virtio-blk throughput increases especially for multithreaded scenarios and > > virtio-net transmit throughput increases substantially. > > > > Some virtio devices are known to have guest drivers which expect a notify to be > > processed synchronously and spin waiting for completion. Only enable ioeventfd > > for virtio-blk and virtio-net for now. > > > > Care must be taken not to interfere with vhost-net, which uses host > > notifiers. If the set_host_notifier() API is used by a device > > virtio-pci will disable virtio-ioeventfd and let the device deal with > > host notifiers as it wishes. > > > > After migration and on VM change state (running/paused) virtio-ioeventfd > > will enable/disable itself. 
> > > > * VIRTIO_CONFIG_S_DRIVER_OK -> enable virtio-ioeventfd > > * !VIRTIO_CONFIG_S_DRIVER_OK -> disable virtio-ioeventfd > > * virtio_pci_set_host_notifier() -> disable virtio-ioeventfd > > * vm_change_state(running=0) -> disable virtio-ioeventfd > > * vm_change_state(running=1) -> enable virtio-ioeventfd > > > > Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> > > On current git master I'm getting hangs when running iozone on a > virtio-blk disk. "Hang" means that it's not responsive any more and has > 100% CPU consumption. > > I bisected the problem to this patch. Any ideas? > > Kevin Does it help if you set ioeventfd=off on command line? ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [Qemu-devel] [PATCH v5 2/4] virtio-pci: Use ioeventfd for virtqueue notify 2011-01-24 19:36 ` Michael S. Tsirkin @ 2011-01-24 19:48 ` Kevin Wolf 2011-01-24 19:47 ` Michael S. Tsirkin 0 siblings, 1 reply; 52+ messages in thread From: Kevin Wolf @ 2011-01-24 19:48 UTC (permalink / raw) To: Michael S. Tsirkin; +Cc: Stefan Hajnoczi, qemu-devel, Stefan Hajnoczi Am 24.01.2011 20:36, schrieb Michael S. Tsirkin: > On Mon, Jan 24, 2011 at 07:54:20PM +0100, Kevin Wolf wrote: >> Am 12.12.2010 16:02, schrieb Stefan Hajnoczi: >>> Virtqueue notify is currently handled synchronously in userspace virtio. This >>> prevents the vcpu from executing guest code while hardware emulation code >>> handles the notify. >>> >>> On systems that support KVM, the ioeventfd mechanism can be used to make >>> virtqueue notify a lightweight exit by deferring hardware emulation to the >>> iothread and allowing the VM to continue execution. This model is similar to >>> how vhost receives virtqueue notifies. >>> >>> The result of this change is improved performance for userspace virtio devices. >>> Virtio-blk throughput increases especially for multithreaded scenarios and >>> virtio-net transmit throughput increases substantially. >>> >>> Some virtio devices are known to have guest drivers which expect a notify to be >>> processed synchronously and spin waiting for completion. Only enable ioeventfd >>> for virtio-blk and virtio-net for now. >>> >>> Care must be taken not to interfere with vhost-net, which uses host >>> notifiers. If the set_host_notifier() API is used by a device >>> virtio-pci will disable virtio-ioeventfd and let the device deal with >>> host notifiers as it wishes. >>> >>> After migration and on VM change state (running/paused) virtio-ioeventfd >>> will enable/disable itself. 
>>> >>> * VIRTIO_CONFIG_S_DRIVER_OK -> enable virtio-ioeventfd >>> * !VIRTIO_CONFIG_S_DRIVER_OK -> disable virtio-ioeventfd >>> * virtio_pci_set_host_notifier() -> disable virtio-ioeventfd >>> * vm_change_state(running=0) -> disable virtio-ioeventfd >>> * vm_change_state(running=1) -> enable virtio-ioeventfd >>> >>> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> >> >> On current git master I'm getting hangs when running iozone on a >> virtio-blk disk. "Hang" means that it's not responsive any more and has >> 100% CPU consumption. >> >> I bisected the problem to this patch. Any ideas? >> >> Kevin > > Does it help if you set ioeventfd=off on command line? Yes, with ioeventfd=off it seems to work fine. Kevin ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [Qemu-devel] [PATCH v5 2/4] virtio-pci: Use ioeventfd for virtqueue notify 2011-01-24 19:48 ` Kevin Wolf @ 2011-01-24 19:47 ` Michael S. Tsirkin 2011-01-24 20:05 ` Kevin Wolf 0 siblings, 1 reply; 52+ messages in thread From: Michael S. Tsirkin @ 2011-01-24 19:47 UTC (permalink / raw) To: Kevin Wolf; +Cc: Stefan Hajnoczi, qemu-devel, Stefan Hajnoczi On Mon, Jan 24, 2011 at 08:48:05PM +0100, Kevin Wolf wrote: > Am 24.01.2011 20:36, schrieb Michael S. Tsirkin: > > On Mon, Jan 24, 2011 at 07:54:20PM +0100, Kevin Wolf wrote: > >> Am 12.12.2010 16:02, schrieb Stefan Hajnoczi: > >>> Virtqueue notify is currently handled synchronously in userspace virtio. This > >>> prevents the vcpu from executing guest code while hardware emulation code > >>> handles the notify. > >>> > >>> On systems that support KVM, the ioeventfd mechanism can be used to make > >>> virtqueue notify a lightweight exit by deferring hardware emulation to the > >>> iothread and allowing the VM to continue execution. This model is similar to > >>> how vhost receives virtqueue notifies. > >>> > >>> The result of this change is improved performance for userspace virtio devices. > >>> Virtio-blk throughput increases especially for multithreaded scenarios and > >>> virtio-net transmit throughput increases substantially. > >>> > >>> Some virtio devices are known to have guest drivers which expect a notify to be > >>> processed synchronously and spin waiting for completion. Only enable ioeventfd > >>> for virtio-blk and virtio-net for now. > >>> > >>> Care must be taken not to interfere with vhost-net, which uses host > >>> notifiers. If the set_host_notifier() API is used by a device > >>> virtio-pci will disable virtio-ioeventfd and let the device deal with > >>> host notifiers as it wishes. > >>> > >>> After migration and on VM change state (running/paused) virtio-ioeventfd > >>> will enable/disable itself. 
> >>> > >>> * VIRTIO_CONFIG_S_DRIVER_OK -> enable virtio-ioeventfd > >>> * !VIRTIO_CONFIG_S_DRIVER_OK -> disable virtio-ioeventfd > >>> * virtio_pci_set_host_notifier() -> disable virtio-ioeventfd > >>> * vm_change_state(running=0) -> disable virtio-ioeventfd > >>> * vm_change_state(running=1) -> enable virtio-ioeventfd > >>> > >>> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> > >> > >> On current git master I'm getting hangs when running iozone on a > >> virtio-blk disk. "Hang" means that it's not responsive any more and has > >> 100% CPU consumption. > >> > >> I bisected the problem to this patch. Any ideas? > >> > >> Kevin > > > > Does it help if you set ioeventfd=off on command line? > > Yes, with ioeventfd=off it seems to work fine. > > Kevin Then it's the ioeventfd that is to blame. Is it the io thread that consumes 100% CPU? Or the vcpu thread? -- MST ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [Qemu-devel] [PATCH v5 2/4] virtio-pci: Use ioeventfd for virtqueue notify 2011-01-24 19:47 ` Michael S. Tsirkin @ 2011-01-24 20:05 ` Kevin Wolf 2011-01-25 7:12 ` Stefan Hajnoczi 0 siblings, 1 reply; 52+ messages in thread From: Kevin Wolf @ 2011-01-24 20:05 UTC (permalink / raw) To: Michael S. Tsirkin; +Cc: Stefan Hajnoczi, qemu-devel, Stefan Hajnoczi Am 24.01.2011 20:47, schrieb Michael S. Tsirkin: > On Mon, Jan 24, 2011 at 08:48:05PM +0100, Kevin Wolf wrote: >> Am 24.01.2011 20:36, schrieb Michael S. Tsirkin: >>> On Mon, Jan 24, 2011 at 07:54:20PM +0100, Kevin Wolf wrote: >>>> Am 12.12.2010 16:02, schrieb Stefan Hajnoczi: >>>>> Virtqueue notify is currently handled synchronously in userspace virtio. This >>>>> prevents the vcpu from executing guest code while hardware emulation code >>>>> handles the notify. >>>>> >>>>> On systems that support KVM, the ioeventfd mechanism can be used to make >>>>> virtqueue notify a lightweight exit by deferring hardware emulation to the >>>>> iothread and allowing the VM to continue execution. This model is similar to >>>>> how vhost receives virtqueue notifies. >>>>> >>>>> The result of this change is improved performance for userspace virtio devices. >>>>> Virtio-blk throughput increases especially for multithreaded scenarios and >>>>> virtio-net transmit throughput increases substantially. >>>>> >>>>> Some virtio devices are known to have guest drivers which expect a notify to be >>>>> processed synchronously and spin waiting for completion. Only enable ioeventfd >>>>> for virtio-blk and virtio-net for now. >>>>> >>>>> Care must be taken not to interfere with vhost-net, which uses host >>>>> notifiers. If the set_host_notifier() API is used by a device >>>>> virtio-pci will disable virtio-ioeventfd and let the device deal with >>>>> host notifiers as it wishes. >>>>> >>>>> After migration and on VM change state (running/paused) virtio-ioeventfd >>>>> will enable/disable itself. 
>>>>> >>>>> * VIRTIO_CONFIG_S_DRIVER_OK -> enable virtio-ioeventfd >>>>> * !VIRTIO_CONFIG_S_DRIVER_OK -> disable virtio-ioeventfd >>>>> * virtio_pci_set_host_notifier() -> disable virtio-ioeventfd >>>>> * vm_change_state(running=0) -> disable virtio-ioeventfd >>>>> * vm_change_state(running=1) -> enable virtio-ioeventfd >>>>> >>>>> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> >>>> >>>> On current git master I'm getting hangs when running iozone on a >>>> virtio-blk disk. "Hang" means that it's not responsive any more and has >>>> 100% CPU consumption. >>>> >>>> I bisected the problem to this patch. Any ideas? >>>> >>>> Kevin >>> >>> Does it help if you set ioeventfd=off on command line? >> >> Yes, with ioeventfd=off it seems to work fine. >> >> Kevin > > Then it's the ioeventfd that is to blame. > Is it the io thread that consumes 100% CPU? > Or the vcpu thread? I was building with the default options, i.e. there is no IO thread. Now I'm just running the test with IO threads enabled, and so far everything looks good. So I can only reproduce the problem with IO threads disabled. Kevin ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [Qemu-devel] [PATCH v5 2/4] virtio-pci: Use ioeventfd for virtqueue notify 2011-01-24 20:05 ` Kevin Wolf @ 2011-01-25 7:12 ` Stefan Hajnoczi 2011-01-25 9:49 ` Stefan Hajnoczi 0 siblings, 1 reply; 52+ messages in thread From: Stefan Hajnoczi @ 2011-01-25 7:12 UTC (permalink / raw) To: Kevin Wolf; +Cc: qemu-devel, Stefan Hajnoczi, Michael S. Tsirkin On Mon, Jan 24, 2011 at 8:05 PM, Kevin Wolf <kwolf@redhat.com> wrote: > Am 24.01.2011 20:47, schrieb Michael S. Tsirkin: >> On Mon, Jan 24, 2011 at 08:48:05PM +0100, Kevin Wolf wrote: >>> Am 24.01.2011 20:36, schrieb Michael S. Tsirkin: >>>> On Mon, Jan 24, 2011 at 07:54:20PM +0100, Kevin Wolf wrote: >>>>> Am 12.12.2010 16:02, schrieb Stefan Hajnoczi: >>>>>> Virtqueue notify is currently handled synchronously in userspace virtio. This >>>>>> prevents the vcpu from executing guest code while hardware emulation code >>>>>> handles the notify. >>>>>> >>>>>> On systems that support KVM, the ioeventfd mechanism can be used to make >>>>>> virtqueue notify a lightweight exit by deferring hardware emulation to the >>>>>> iothread and allowing the VM to continue execution. This model is similar to >>>>>> how vhost receives virtqueue notifies. >>>>>> >>>>>> The result of this change is improved performance for userspace virtio devices. >>>>>> Virtio-blk throughput increases especially for multithreaded scenarios and >>>>>> virtio-net transmit throughput increases substantially. >>>>>> >>>>>> Some virtio devices are known to have guest drivers which expect a notify to be >>>>>> processed synchronously and spin waiting for completion. Only enable ioeventfd >>>>>> for virtio-blk and virtio-net for now. >>>>>> >>>>>> Care must be taken not to interfere with vhost-net, which uses host >>>>>> notifiers. If the set_host_notifier() API is used by a device >>>>>> virtio-pci will disable virtio-ioeventfd and let the device deal with >>>>>> host notifiers as it wishes. 
>>>>>> >>>>>> After migration and on VM change state (running/paused) virtio-ioeventfd >>>>>> will enable/disable itself. >>>>>> >>>>>> * VIRTIO_CONFIG_S_DRIVER_OK -> enable virtio-ioeventfd >>>>>> * !VIRTIO_CONFIG_S_DRIVER_OK -> disable virtio-ioeventfd >>>>>> * virtio_pci_set_host_notifier() -> disable virtio-ioeventfd >>>>>> * vm_change_state(running=0) -> disable virtio-ioeventfd >>>>>> * vm_change_state(running=1) -> enable virtio-ioeventfd >>>>>> >>>>>> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> >>>>> >>>>> On current git master I'm getting hangs when running iozone on a >>>>> virtio-blk disk. "Hang" means that it's not responsive any more and has >>>>> 100% CPU consumption. >>>>> >>>>> I bisected the problem to this patch. Any ideas? >>>>> >>>>> Kevin >>>> >>>> Does it help if you set ioeventfd=off on command line? >>> >>> Yes, with ioeventfd=off it seems to work fine. >>> >>> Kevin >> >> Then it's the ioeventfd that is to blame. >> Is it the io thread that consumes 100% CPU? >> Or the vcpu thread? > > I was building with the default options, i.e. there is no IO thread. > > Now I'm just running the test with IO threads enabled, and so far > everything looks good. So I can only reproduce the problem with IO > threads disabled. Hrm...aio uses SIGUSR2 to force the vcpu to process aio completions (relevant when --enable-io-thread is not used). I will take a look at that again and see why we're spinning without checking for ioeventfd completion. Stefan ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [Qemu-devel] [PATCH v5 2/4] virtio-pci: Use ioeventfd for virtqueue notify 2011-01-25 7:12 ` Stefan Hajnoczi @ 2011-01-25 9:49 ` Stefan Hajnoczi 2011-01-25 9:54 ` Stefan Hajnoczi ` (2 more replies) 0 siblings, 3 replies; 52+ messages in thread From: Stefan Hajnoczi @ 2011-01-25 9:49 UTC (permalink / raw) To: Kevin Wolf Cc: Anthony Liguori, Avi Kivity, qemu-devel, Stefan Hajnoczi, Michael S. Tsirkin On Tue, Jan 25, 2011 at 7:12 AM, Stefan Hajnoczi <stefanha@gmail.com> wrote: > On Mon, Jan 24, 2011 at 8:05 PM, Kevin Wolf <kwolf@redhat.com> wrote: >> Am 24.01.2011 20:47, schrieb Michael S. Tsirkin: >>> On Mon, Jan 24, 2011 at 08:48:05PM +0100, Kevin Wolf wrote: >>>> Am 24.01.2011 20:36, schrieb Michael S. Tsirkin: >>>>> On Mon, Jan 24, 2011 at 07:54:20PM +0100, Kevin Wolf wrote: >>>>>> Am 12.12.2010 16:02, schrieb Stefan Hajnoczi: >>>>>>> Virtqueue notify is currently handled synchronously in userspace virtio. This >>>>>>> prevents the vcpu from executing guest code while hardware emulation code >>>>>>> handles the notify. >>>>>>> >>>>>>> On systems that support KVM, the ioeventfd mechanism can be used to make >>>>>>> virtqueue notify a lightweight exit by deferring hardware emulation to the >>>>>>> iothread and allowing the VM to continue execution. This model is similar to >>>>>>> how vhost receives virtqueue notifies. >>>>>>> >>>>>>> The result of this change is improved performance for userspace virtio devices. >>>>>>> Virtio-blk throughput increases especially for multithreaded scenarios and >>>>>>> virtio-net transmit throughput increases substantially. >>>>>>> >>>>>>> Some virtio devices are known to have guest drivers which expect a notify to be >>>>>>> processed synchronously and spin waiting for completion. Only enable ioeventfd >>>>>>> for virtio-blk and virtio-net for now. >>>>>>> >>>>>>> Care must be taken not to interfere with vhost-net, which uses host >>>>>>> notifiers. 
If the set_host_notifier() API is used by a device >>>>>>> virtio-pci will disable virtio-ioeventfd and let the device deal with >>>>>>> host notifiers as it wishes. >>>>>>> >>>>>>> After migration and on VM change state (running/paused) virtio-ioeventfd >>>>>>> will enable/disable itself. >>>>>>> >>>>>>> * VIRTIO_CONFIG_S_DRIVER_OK -> enable virtio-ioeventfd >>>>>>> * !VIRTIO_CONFIG_S_DRIVER_OK -> disable virtio-ioeventfd >>>>>>> * virtio_pci_set_host_notifier() -> disable virtio-ioeventfd >>>>>>> * vm_change_state(running=0) -> disable virtio-ioeventfd >>>>>>> * vm_change_state(running=1) -> enable virtio-ioeventfd >>>>>>> >>>>>>> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> >>>>>> >>>>>> On current git master I'm getting hangs when running iozone on a >>>>>> virtio-blk disk. "Hang" means that it's not responsive any more and has >>>>>> 100% CPU consumption. >>>>>> >>>>>> I bisected the problem to this patch. Any ideas? >>>>>> >>>>>> Kevin >>>>> >>>>> Does it help if you set ioeventfd=off on command line? >>>> >>>> Yes, with ioeventfd=off it seems to work fine. >>>> >>>> Kevin >>> >>> Then it's the ioeventfd that is to blame. >>> Is it the io thread that consumes 100% CPU? >>> Or the vcpu thread? >> >> I was building with the default options, i.e. there is no IO thread. >> >> Now I'm just running the test with IO threads enabled, and so far >> everything looks good. So I can only reproduce the problem with IO >> threads disabled. > > Hrm...aio uses SIGUSR2 to force the vcpu to process aio completions > (relevant when --enable-io-thread is not used). I will take a look at > that again and see why we're spinning without checking for ioeventfd > completion. Here's my understanding of --disable-io-thread. Added Anthony on CC, please correct me. When I/O thread is disabled our only thread runs guest code until an exit request is made. There are synchronous exit cases like a halt instruction or single step. 
There are also asynchronous exit cases when signal handlers use qemu_notify_event(), which does cpu_exit(), to set env->exit_request = 1 and unlink the current tb. With this structure in mind, anything which needs to interrupt the vcpu in order to process events must use signals and qemu_notify_event(). Otherwise that event source may be starved and never processed. virtio-ioeventfd currently does not use signals and will therefore never interrupt the vcpu. However, you normally don't notice the missing signal handler because some other event interrupts the vcpu and we enter select(2) to process all pending handlers. So virtio-ioeventfd mostly gets a free ride on top of timer events. This is suboptimal because it adds latency to virtqueue kick - we're waiting for another event to interrupt the vcpu before we can process virtqueue-kick. If any other vcpu interruption makes virtio-ioeventfd chug along then why are you seeing 100% CPU livelock? My theory is that dynticks has a race condition which causes timers to stop working in QEMU. Here is an strace of QEMU --disable-io-thread entering live lock. I can trigger this by starting a VM and running "while true; do true; done" at the shell. 
Then strace the QEMU process:

08:04:34.985177 ioctl(11, KVM_RUN, 0) = 0
08:04:34.985242 --- SIGALRM (Alarm clock) @ 0 (0) ---
08:04:34.985319 write(6, "\1\0\0\0\0\0\0\0", 8) = 8
08:04:34.985368 rt_sigreturn(0x2758ad0) = 0
08:04:34.985423 select(15, [5 8 14], [], [], {0, 0}) = 1 (in [5], left {0, 0})
08:04:34.985484 read(5, "\1\0\0\0\0\0\0\0", 512) = 8
08:04:34.985538 timer_gettime(0, {it_interval={0, 0}, it_value={0, 0}}) = 0
08:04:34.985588 timer_settime(0, 0, {it_interval={0, 0}, it_value={0, 273000}}, NULL) = 0
08:04:34.985646 ioctl(11, KVM_RUN, 0) = -1 EINTR (Interrupted system call)
08:04:34.985928 --- SIGALRM (Alarm clock) @ 0 (0) ---
08:04:34.986007 write(6, "\1\0\0\0\0\0\0\0", 8) = 8
08:04:34.986063 rt_sigreturn(0x2758ad0) = -1 EINTR (Interrupted system call)
08:04:34.986124 select(15, [5 8 14], [], [], {0, 0}) = 1 (in [5], left {0, 0})
08:04:34.986188 read(5, "\1\0\0\0\0\0\0\0", 512) = 8
08:04:34.986246 timer_gettime(0, {it_interval={0, 0}, it_value={0, 0}}) = 0
08:04:34.986299 timer_settime(0, 0, {it_interval={0, 0}, it_value={0, 250000}}, NULL) = 0
08:04:34.986359 ioctl(11, KVM_INTERRUPT, 0x7fff90404ef0) = 0
08:04:34.986406 ioctl(11, KVM_RUN, 0) = 0
08:04:34.986465 ioctl(11, KVM_RUN, 0) = 0            <--- guest finishes execution

                v--- dynticks_rearm_timer() returns early because timer is already scheduled
08:04:34.986533 timer_gettime(0, {it_interval={0, 0}, it_value={0, 24203}}) = 0
08:04:34.986585 --- SIGALRM (Alarm clock) @ 0 (0) --- <--- timer expires
08:04:34.986661 write(6, "\1\0\0\0\0\0\0\0", 8) = 8
08:04:34.986710 rt_sigreturn(0x2758ad0) = 0

                v--- we re-enter the guest without rearming the timer!
08:04:34.986754 ioctl(11, KVM_RUN^C <unfinished ...>
[QEMU hang, 100% CPU]

So dynticks fails to rearm the timer before we enter the guest. This is a race condition: we check that there is already a timer scheduled and head on towards re-entering the guest, the timer expires before we enter the guest, we re-enter the guest without realizing the timer has expired.
Now we're inside the guest without the hope of a timer expiring - and the guest is running a CPU-bound workload that doesn't need to perform I/O.

The result is a hung QEMU (screen does not update) and a softlockup inside the guest once we do kick it to life again (by detaching strace).

I think the only way to avoid this race condition in dynticks is to mask SIGALRM, then check if the timer expired, and then ioctl(KVM_RUN) with atomic signal mask change back to SIGALRM enabled. Thoughts?

Back to virtio-ioeventfd, we really shouldn't use virtio-ioeventfd when there is no I/O thread. It doesn't make sense because there's no opportunity to process the virtqueue while the guest code is executing in parallel like there is with I/O thread. It will just degrade performance when QEMU only has one thread. I'll send a patch to disable it when we build without I/O thread.

Stefan

^ permalink raw reply	[flat|nested] 52+ messages in thread
* Re: [Qemu-devel] [PATCH v5 2/4] virtio-pci: Use ioeventfd for virtqueue notify 2011-01-25 9:49 ` Stefan Hajnoczi @ 2011-01-25 9:54 ` Stefan Hajnoczi 2011-01-25 11:27 ` Michael S. Tsirkin 2011-01-25 19:18 ` Anthony Liguori 2 siblings, 0 replies; 52+ messages in thread From: Stefan Hajnoczi @ 2011-01-25 9:54 UTC (permalink / raw) To: Kevin Wolf Cc: Anthony Liguori, Avi Kivity, qemu-devel, Stefan Hajnoczi, Michael S. Tsirkin On Tue, Jan 25, 2011 at 9:49 AM, Stefan Hajnoczi <stefanha@gmail.com> wrote: > If any other vcpu interruption makes virtio-ioeventfd chug along then > why are you seeing 100% CPU livelock? My theory is that dynticks has > a race condition which causes timers to stop working in QEMU. I forgot to mention that you can test this theory by building without I/O thread and running with -clock hpet. If the guest no longer hangs, then this suggests you're seeing the dynticks race condition. Stefan ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [Qemu-devel] [PATCH v5 2/4] virtio-pci: Use ioeventfd for virtqueue notify 2011-01-25 9:49 ` Stefan Hajnoczi 2011-01-25 9:54 ` Stefan Hajnoczi @ 2011-01-25 11:27 ` Michael S. Tsirkin 2011-01-25 13:20 ` Stefan Hajnoczi 2011-01-25 19:18 ` Anthony Liguori 2 siblings, 1 reply; 52+ messages in thread From: Michael S. Tsirkin @ 2011-01-25 11:27 UTC (permalink / raw) To: Stefan Hajnoczi Cc: Kevin Wolf, Anthony Liguori, Avi Kivity, qemu-devel, Stefan Hajnoczi On Tue, Jan 25, 2011 at 09:49:04AM +0000, Stefan Hajnoczi wrote: > On Tue, Jan 25, 2011 at 7:12 AM, Stefan Hajnoczi <stefanha@gmail.com> wrote: > > On Mon, Jan 24, 2011 at 8:05 PM, Kevin Wolf <kwolf@redhat.com> wrote: > >> Am 24.01.2011 20:47, schrieb Michael S. Tsirkin: > >>> On Mon, Jan 24, 2011 at 08:48:05PM +0100, Kevin Wolf wrote: > >>>> Am 24.01.2011 20:36, schrieb Michael S. Tsirkin: > >>>>> On Mon, Jan 24, 2011 at 07:54:20PM +0100, Kevin Wolf wrote: > >>>>>> Am 12.12.2010 16:02, schrieb Stefan Hajnoczi: > >>>>>>> Virtqueue notify is currently handled synchronously in userspace virtio. This > >>>>>>> prevents the vcpu from executing guest code while hardware emulation code > >>>>>>> handles the notify. > >>>>>>> > >>>>>>> On systems that support KVM, the ioeventfd mechanism can be used to make > >>>>>>> virtqueue notify a lightweight exit by deferring hardware emulation to the > >>>>>>> iothread and allowing the VM to continue execution. This model is similar to > >>>>>>> how vhost receives virtqueue notifies. > >>>>>>> > >>>>>>> The result of this change is improved performance for userspace virtio devices. > >>>>>>> Virtio-blk throughput increases especially for multithreaded scenarios and > >>>>>>> virtio-net transmit throughput increases substantially. > >>>>>>> > >>>>>>> Some virtio devices are known to have guest drivers which expect a notify to be > >>>>>>> processed synchronously and spin waiting for completion. Only enable ioeventfd > >>>>>>> for virtio-blk and virtio-net for now. 
> >>>>>>> > >>>>>>> Care must be taken not to interfere with vhost-net, which uses host > >>>>>>> notifiers. If the set_host_notifier() API is used by a device > >>>>>>> virtio-pci will disable virtio-ioeventfd and let the device deal with > >>>>>>> host notifiers as it wishes. > >>>>>>> > >>>>>>> After migration and on VM change state (running/paused) virtio-ioeventfd > >>>>>>> will enable/disable itself. > >>>>>>> > >>>>>>> * VIRTIO_CONFIG_S_DRIVER_OK -> enable virtio-ioeventfd > >>>>>>> * !VIRTIO_CONFIG_S_DRIVER_OK -> disable virtio-ioeventfd > >>>>>>> * virtio_pci_set_host_notifier() -> disable virtio-ioeventfd > >>>>>>> * vm_change_state(running=0) -> disable virtio-ioeventfd > >>>>>>> * vm_change_state(running=1) -> enable virtio-ioeventfd > >>>>>>> > >>>>>>> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> > >>>>>> > >>>>>> On current git master I'm getting hangs when running iozone on a > >>>>>> virtio-blk disk. "Hang" means that it's not responsive any more and has > >>>>>> 100% CPU consumption. > >>>>>> > >>>>>> I bisected the problem to this patch. Any ideas? > >>>>>> > >>>>>> Kevin > >>>>> > >>>>> Does it help if you set ioeventfd=off on command line? > >>>> > >>>> Yes, with ioeventfd=off it seems to work fine. > >>>> > >>>> Kevin > >>> > >>> Then it's the ioeventfd that is to blame. > >>> Is it the io thread that consumes 100% CPU? > >>> Or the vcpu thread? > >> > >> I was building with the default options, i.e. there is no IO thread. > >> > >> Now I'm just running the test with IO threads enabled, and so far > >> everything looks good. So I can only reproduce the problem with IO > >> threads disabled. > > > > Hrm...aio uses SIGUSR2 to force the vcpu to process aio completions > > (relevant when --enable-io-thread is not used). I will take a look at > > that again and see why we're spinning without checking for ioeventfd > > completion. > > Here's my understanding of --disable-io-thread. Added Anthony on CC, > please correct me. 
> > When I/O thread is disabled our only thread runs guest code until an > exit request is made. There are synchronous exit cases like a halt > instruction or single step. There are also asynchronous exit cases > when signal handlers use qemu_notify_event(), which does cpu_exit(), > to set env->exit_request = 1 and unlink the current tb. > > With this structure in mind, anything which needs to interrupt the > vcpu in order to process events must use signals and > qemu_notify_event(). Otherwise that event source may be starved and > never processed. > > virtio-ioeventfd currently does not use signals and will therefore > never interrupt the vcpu. > > However, you normally don't notice the missing signal handler because > some other event interrupts the vcpu and we enter select(2) to process > all pending handlers. So virtio-ioeventfd mostly gets a free ride on > top of timer events. This is suboptimal because it adds latency to > virtqueue kick - we're waiting for another event to interrupt the vcpu > before we can process virtqueue-kick. > > If any other vcpu interruption makes virtio-ioeventfd chug along then > why are you seeing 100% CPU livelock? My theory is that dynticks has > a race condition which causes timers to stop working in QEMU. Here is > an strace of QEMU --disable-io-thread entering live lock. I can > trigger this by starting a VM and running "while true; do true; done" > at the shell. 
Then strace the QEMU process: > > 08:04:34.985177 ioctl(11, KVM_RUN, 0) = 0 > 08:04:34.985242 --- SIGALRM (Alarm clock) @ 0 (0) --- > 08:04:34.985319 write(6, "\1\0\0\0\0\0\0\0", 8) = 8 > 08:04:34.985368 rt_sigreturn(0x2758ad0) = 0 > 08:04:34.985423 select(15, [5 8 14], [], [], {0, 0}) = 1 (in [5], left {0, 0}) > 08:04:34.985484 read(5, "\1\0\0\0\0\0\0\0", 512) = 8 > 08:04:34.985538 timer_gettime(0, {it_interval={0, 0}, it_value={0, 0}}) = 0 > 08:04:34.985588 timer_settime(0, 0, {it_interval={0, 0}, it_value={0, > 273000}}, NULL) = 0 > 08:04:34.985646 ioctl(11, KVM_RUN, 0) = -1 EINTR (Interrupted system call) > 08:04:34.985928 --- SIGALRM (Alarm clock) @ 0 (0) --- > 08:04:34.986007 write(6, "\1\0\0\0\0\0\0\0", 8) = 8 > 08:04:34.986063 rt_sigreturn(0x2758ad0) = -1 EINTR (Interrupted system call) > 08:04:34.986124 select(15, [5 8 14], [], [], {0, 0}) = 1 (in [5], left {0, 0}) > 08:04:34.986188 read(5, "\1\0\0\0\0\0\0\0", 512) = 8 > 08:04:34.986246 timer_gettime(0, {it_interval={0, 0}, it_value={0, 0}}) = 0 > 08:04:34.986299 timer_settime(0, 0, {it_interval={0, 0}, it_value={0, > 250000}}, NULL) = 0 > 08:04:34.986359 ioctl(11, KVM_INTERRUPT, 0x7fff90404ef0) = 0 > 08:04:34.986406 ioctl(11, KVM_RUN, 0) = 0 > 08:04:34.986465 ioctl(11, KVM_RUN, 0) = 0 <--- guest > finishes execution > > v--- dynticks_rearm_timer() returns early because > timer is already scheduled > 08:04:34.986533 timer_gettime(0, {it_interval={0, 0}, it_value={0, 24203}}) = 0 > 08:04:34.986585 --- SIGALRM (Alarm clock) @ 0 (0) --- <--- timer expires > 08:04:34.986661 write(6, "\1\0\0\0\0\0\0\0", 8) = 8 > 08:04:34.986710 rt_sigreturn(0x2758ad0) = 0 > > v--- we re-enter the guest without rearming the timer! > 08:04:34.986754 ioctl(11, KVM_RUN^C <unfinished ...> > [QEMU hang, 100% CPU] > > So dynticks fails to rearm the timer before we enter the guest. 
This
> is a race condition: we check that there is already a timer scheduled
> and head on towards re-entering the guest, the timer expires before we
> enter the guest, we re-enter the guest without realizing the timer has
> expired. Now we're inside the guest without the hope of a timer
> expiring - and the guest is running a CPU-bound workload that doesn't
> need to perform I/O.
>
> The result is a hung QEMU (screen does not update) and a softlockup
> inside the guest once we do kick it to life again (by detaching
> strace).
>
> I think the only way to avoid this race condition in dynticks is to
> mask SIGALRM, then check if the timer expired, and then ioctl(KVM_RUN)
> with atomic signal mask change back to SIGALRM enabled. Thoughts?
>
> Back to virtio-ioeventfd, we really shouldn't use virtio-ioeventfd
> when there is no I/O thread.

Can we make it work with SIGIO?

> It doesn't make sense because there's no
> opportunity to process the virtqueue while the guest code is executing
> in parallel like there is with I/O thread. It will just degrade
> performance when QEMU only has one thread.

Probably. But it's really better to check this than theorize about it.

> I'll send a patch to
> disable it when we build without I/O thread.
>
> Stefan

^ permalink raw reply	[flat|nested] 52+ messages in thread
* Re: [Qemu-devel] [PATCH v5 2/4] virtio-pci: Use ioeventfd for virtqueue notify 2011-01-25 11:27 ` Michael S. Tsirkin @ 2011-01-25 13:20 ` Stefan Hajnoczi 2011-01-25 14:07 ` Stefan Hajnoczi 0 siblings, 1 reply; 52+ messages in thread From: Stefan Hajnoczi @ 2011-01-25 13:20 UTC (permalink / raw) To: Michael S. Tsirkin Cc: Kevin Wolf, Anthony Liguori, Avi Kivity, qemu-devel, Stefan Hajnoczi On Tue, Jan 25, 2011 at 11:27 AM, Michael S. Tsirkin <mst@redhat.com> wrote: > On Tue, Jan 25, 2011 at 09:49:04AM +0000, Stefan Hajnoczi wrote: >> On Tue, Jan 25, 2011 at 7:12 AM, Stefan Hajnoczi <stefanha@gmail.com> wrote: >> > On Mon, Jan 24, 2011 at 8:05 PM, Kevin Wolf <kwolf@redhat.com> wrote: >> >> Am 24.01.2011 20:47, schrieb Michael S. Tsirkin: >> >>> On Mon, Jan 24, 2011 at 08:48:05PM +0100, Kevin Wolf wrote: >> >>>> Am 24.01.2011 20:36, schrieb Michael S. Tsirkin: >> >>>>> On Mon, Jan 24, 2011 at 07:54:20PM +0100, Kevin Wolf wrote: >> >>>>>> Am 12.12.2010 16:02, schrieb Stefan Hajnoczi: >> >>>>>>> Virtqueue notify is currently handled synchronously in userspace virtio. This >> >>>>>>> prevents the vcpu from executing guest code while hardware emulation code >> >>>>>>> handles the notify. >> >>>>>>> >> >>>>>>> On systems that support KVM, the ioeventfd mechanism can be used to make >> >>>>>>> virtqueue notify a lightweight exit by deferring hardware emulation to the >> >>>>>>> iothread and allowing the VM to continue execution. This model is similar to >> >>>>>>> how vhost receives virtqueue notifies. >> >>>>>>> >> >>>>>>> The result of this change is improved performance for userspace virtio devices. >> >>>>>>> Virtio-blk throughput increases especially for multithreaded scenarios and >> >>>>>>> virtio-net transmit throughput increases substantially. >> >>>>>>> >> >>>>>>> Some virtio devices are known to have guest drivers which expect a notify to be >> >>>>>>> processed synchronously and spin waiting for completion. 
Only enable ioeventfd >> >>>>>>> for virtio-blk and virtio-net for now. >> >>>>>>> >> >>>>>>> Care must be taken not to interfere with vhost-net, which uses host >> >>>>>>> notifiers. If the set_host_notifier() API is used by a device >> >>>>>>> virtio-pci will disable virtio-ioeventfd and let the device deal with >> >>>>>>> host notifiers as it wishes. >> >>>>>>> >> >>>>>>> After migration and on VM change state (running/paused) virtio-ioeventfd >> >>>>>>> will enable/disable itself. >> >>>>>>> >> >>>>>>> * VIRTIO_CONFIG_S_DRIVER_OK -> enable virtio-ioeventfd >> >>>>>>> * !VIRTIO_CONFIG_S_DRIVER_OK -> disable virtio-ioeventfd >> >>>>>>> * virtio_pci_set_host_notifier() -> disable virtio-ioeventfd >> >>>>>>> * vm_change_state(running=0) -> disable virtio-ioeventfd >> >>>>>>> * vm_change_state(running=1) -> enable virtio-ioeventfd >> >>>>>>> >> >>>>>>> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> >> >>>>>> >> >>>>>> On current git master I'm getting hangs when running iozone on a >> >>>>>> virtio-blk disk. "Hang" means that it's not responsive any more and has >> >>>>>> 100% CPU consumption. >> >>>>>> >> >>>>>> I bisected the problem to this patch. Any ideas? >> >>>>>> >> >>>>>> Kevin >> >>>>> >> >>>>> Does it help if you set ioeventfd=off on command line? >> >>>> >> >>>> Yes, with ioeventfd=off it seems to work fine. >> >>>> >> >>>> Kevin >> >>> >> >>> Then it's the ioeventfd that is to blame. >> >>> Is it the io thread that consumes 100% CPU? >> >>> Or the vcpu thread? >> >> >> >> I was building with the default options, i.e. there is no IO thread. >> >> >> >> Now I'm just running the test with IO threads enabled, and so far >> >> everything looks good. So I can only reproduce the problem with IO >> >> threads disabled. >> > >> > Hrm...aio uses SIGUSR2 to force the vcpu to process aio completions >> > (relevant when --enable-io-thread is not used). 
I will take a look at >> > that again and see why we're spinning without checking for ioeventfd >> > completion. >> >> Here's my understanding of --disable-io-thread. Added Anthony on CC, >> please correct me. >> >> When I/O thread is disabled our only thread runs guest code until an >> exit request is made. There are synchronous exit cases like a halt >> instruction or single step. There are also asynchronous exit cases >> when signal handlers use qemu_notify_event(), which does cpu_exit(), >> to set env->exit_request = 1 and unlink the current tb. >> >> With this structure in mind, anything which needs to interrupt the >> vcpu in order to process events must use signals and >> qemu_notify_event(). Otherwise that event source may be starved and >> never processed. >> >> virtio-ioeventfd currently does not use signals and will therefore >> never interrupt the vcpu. >> >> However, you normally don't notice the missing signal handler because >> some other event interrupts the vcpu and we enter select(2) to process >> all pending handlers. So virtio-ioeventfd mostly gets a free ride on >> top of timer events. This is suboptimal because it adds latency to >> virtqueue kick - we're waiting for another event to interrupt the vcpu >> before we can process virtqueue-kick. >> >> If any other vcpu interruption makes virtio-ioeventfd chug along then >> why are you seeing 100% CPU livelock? My theory is that dynticks has >> a race condition which causes timers to stop working in QEMU. Here is >> an strace of QEMU --disable-io-thread entering live lock. I can >> trigger this by starting a VM and running "while true; do true; done" >> at the shell. 
Then strace the QEMU process: >> >> 08:04:34.985177 ioctl(11, KVM_RUN, 0) = 0 >> 08:04:34.985242 --- SIGALRM (Alarm clock) @ 0 (0) --- >> 08:04:34.985319 write(6, "\1\0\0\0\0\0\0\0", 8) = 8 >> 08:04:34.985368 rt_sigreturn(0x2758ad0) = 0 >> 08:04:34.985423 select(15, [5 8 14], [], [], {0, 0}) = 1 (in [5], left {0, 0}) >> 08:04:34.985484 read(5, "\1\0\0\0\0\0\0\0", 512) = 8 >> 08:04:34.985538 timer_gettime(0, {it_interval={0, 0}, it_value={0, 0}}) = 0 >> 08:04:34.985588 timer_settime(0, 0, {it_interval={0, 0}, it_value={0, >> 273000}}, NULL) = 0 >> 08:04:34.985646 ioctl(11, KVM_RUN, 0) = -1 EINTR (Interrupted system call) >> 08:04:34.985928 --- SIGALRM (Alarm clock) @ 0 (0) --- >> 08:04:34.986007 write(6, "\1\0\0\0\0\0\0\0", 8) = 8 >> 08:04:34.986063 rt_sigreturn(0x2758ad0) = -1 EINTR (Interrupted system call) >> 08:04:34.986124 select(15, [5 8 14], [], [], {0, 0}) = 1 (in [5], left {0, 0}) >> 08:04:34.986188 read(5, "\1\0\0\0\0\0\0\0", 512) = 8 >> 08:04:34.986246 timer_gettime(0, {it_interval={0, 0}, it_value={0, 0}}) = 0 >> 08:04:34.986299 timer_settime(0, 0, {it_interval={0, 0}, it_value={0, >> 250000}}, NULL) = 0 >> 08:04:34.986359 ioctl(11, KVM_INTERRUPT, 0x7fff90404ef0) = 0 >> 08:04:34.986406 ioctl(11, KVM_RUN, 0) = 0 >> 08:04:34.986465 ioctl(11, KVM_RUN, 0) = 0 <--- guest >> finishes execution >> >> v--- dynticks_rearm_timer() returns early because >> timer is already scheduled >> 08:04:34.986533 timer_gettime(0, {it_interval={0, 0}, it_value={0, 24203}}) = 0 >> 08:04:34.986585 --- SIGALRM (Alarm clock) @ 0 (0) --- <--- timer expires >> 08:04:34.986661 write(6, "\1\0\0\0\0\0\0\0", 8) = 8 >> 08:04:34.986710 rt_sigreturn(0x2758ad0) = 0 >> >> v--- we re-enter the guest without rearming the timer! >> 08:04:34.986754 ioctl(11, KVM_RUN^C <unfinished ...> >> [QEMU hang, 100% CPU] >> >> So dynticks fails to rearm the timer before we enter the guest. 
This >> is a race condition: we check that there is already a timer scheduled >> and head on towards re-entering the guest, the timer expires before we >> enter the guest, we re-enter the guest without realizing the timer has >> expired. Now we're inside the guest without the hope of a timer >> expiring - and the guest is running a CPU-bound workload that doesn't >> need to perform I/O. >> >> The result is a hung QEMU (screen does not update) and a softlockup >> inside the guest once we do kick it to life again (by detaching >> strace). >> >> I think the only way to avoid this race condition in dynticks is to >> mask SIGALRM, then check if the timer expired, and then ioctl(KVM_RUN) >> with atomic signal mask change back to SIGALRM enabled. Thoughts? >> >> Back to virtio-ioeventfd, we really shouldn't use virtio-ioeventfd >> when there is no I/O thread. > > Can we make it work with SIGIO? > >> It doesn't make sense because there's no >> opportunity to process the virtqueue while the guest code is executing >> in parallel like there is with I/O thread. It will just degrade >> performance when QEMU only has one thread. > > Probably. But it's really better to check this than theorethise about > it. eventfd does not seem to support O_ASYNC. 
After adding the necessary code into QEMU no signals were firing so I wrote a test:

#define _GNU_SOURCE
#include <stdlib.h>
#include <stdio.h>
#include <fcntl.h>
#include <signal.h>
#include <unistd.h>       /* fork(), getpid(), sleep(), close() */
#include <sys/wait.h>     /* wait() */
#include <sys/eventfd.h>

int main(int argc, char **argv)
{
    int fd = eventfd(0, 0);
    if (fd < 0) {
        perror("eventfd");
        exit(1);
    }
    if (fcntl(fd, F_SETSIG, SIGTERM) < 0) {
        perror("fcntl(F_SETSIG)");
        exit(1);
    }
    if (fcntl(fd, F_SETOWN, getpid()) < 0) {
        perror("fcntl(F_SETOWN)");
        exit(1);
    }
    if (fcntl(fd, F_SETFL, O_NONBLOCK | O_ASYNC) < 0) {
        perror("fcntl(F_SETFL)");
        exit(1);
    }

    switch (fork()) {
    case -1:
        perror("fork");
        exit(1);
    case 0: /* child */
        eventfd_write(fd, 1);
        exit(0);
    default: /* parent */
        break;
    }

    sleep(5);
    wait(NULL);
    close(fd);
    return 0;
}

I'd expect the parent to get a SIGTERM but the process just sleeps and then exits. When replacing the eventfd with a pipe in this program the parent does receive the SIGTERM.

Stefan

^ permalink raw reply	[flat|nested] 52+ messages in thread
* Re: [Qemu-devel] [PATCH v5 2/4] virtio-pci: Use ioeventfd for virtqueue notify 2011-01-25 13:20 ` Stefan Hajnoczi @ 2011-01-25 14:07 ` Stefan Hajnoczi 0 siblings, 0 replies; 52+ messages in thread From: Stefan Hajnoczi @ 2011-01-25 14:07 UTC (permalink / raw) To: Michael S. Tsirkin Cc: Kevin Wolf, Anthony Liguori, Avi Kivity, qemu-devel, Stefan Hajnoczi On Tue, Jan 25, 2011 at 1:20 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote: > eventfd does not seem to support O_ASYNC. linux-2.6/fs/eventfd.c does not implement file_operations::fasync() so I'm convinced SIGIO is not possible here. I have sent a patch to disable virtio-ioeventfd when !CONFIG_IOTHREAD. Stefan ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [Qemu-devel] [PATCH v5 2/4] virtio-pci: Use ioeventfd for virtqueue notify 2011-01-25 9:49 ` Stefan Hajnoczi 2011-01-25 9:54 ` Stefan Hajnoczi 2011-01-25 11:27 ` Michael S. Tsirkin @ 2011-01-25 19:18 ` Anthony Liguori 2011-01-25 19:45 ` Stefan Hajnoczi 2 siblings, 1 reply; 52+ messages in thread From: Anthony Liguori @ 2011-01-25 19:18 UTC (permalink / raw) To: Stefan Hajnoczi Cc: Kevin Wolf, Anthony Liguori, Stefan Hajnoczi, Michael S. Tsirkin, qemu-devel, Avi Kivity On 01/25/2011 03:49 AM, Stefan Hajnoczi wrote: > On Tue, Jan 25, 2011 at 7:12 AM, Stefan Hajnoczi<stefanha@gmail.com> wrote: > >> On Mon, Jan 24, 2011 at 8:05 PM, Kevin Wolf<kwolf@redhat.com> wrote: >> >>> Am 24.01.2011 20:47, schrieb Michael S. Tsirkin: >>> >>>> On Mon, Jan 24, 2011 at 08:48:05PM +0100, Kevin Wolf wrote: >>>> >>>>> Am 24.01.2011 20:36, schrieb Michael S. Tsirkin: >>>>> >>>>>> On Mon, Jan 24, 2011 at 07:54:20PM +0100, Kevin Wolf wrote: >>>>>> >>>>>>> Am 12.12.2010 16:02, schrieb Stefan Hajnoczi: >>>>>>> >>>>>>>> Virtqueue notify is currently handled synchronously in userspace virtio. This >>>>>>>> prevents the vcpu from executing guest code while hardware emulation code >>>>>>>> handles the notify. >>>>>>>> >>>>>>>> On systems that support KVM, the ioeventfd mechanism can be used to make >>>>>>>> virtqueue notify a lightweight exit by deferring hardware emulation to the >>>>>>>> iothread and allowing the VM to continue execution. This model is similar to >>>>>>>> how vhost receives virtqueue notifies. >>>>>>>> >>>>>>>> The result of this change is improved performance for userspace virtio devices. >>>>>>>> Virtio-blk throughput increases especially for multithreaded scenarios and >>>>>>>> virtio-net transmit throughput increases substantially. >>>>>>>> >>>>>>>> Some virtio devices are known to have guest drivers which expect a notify to be >>>>>>>> processed synchronously and spin waiting for completion. Only enable ioeventfd >>>>>>>> for virtio-blk and virtio-net for now. 
>>>>>>>> >>>>>>>> Care must be taken not to interfere with vhost-net, which uses host >>>>>>>> notifiers. If the set_host_notifier() API is used by a device >>>>>>>> virtio-pci will disable virtio-ioeventfd and let the device deal with >>>>>>>> host notifiers as it wishes. >>>>>>>> >>>>>>>> After migration and on VM change state (running/paused) virtio-ioeventfd >>>>>>>> will enable/disable itself. >>>>>>>> >>>>>>>> * VIRTIO_CONFIG_S_DRIVER_OK -> enable virtio-ioeventfd >>>>>>>> * !VIRTIO_CONFIG_S_DRIVER_OK -> disable virtio-ioeventfd >>>>>>>> * virtio_pci_set_host_notifier() -> disable virtio-ioeventfd >>>>>>>> * vm_change_state(running=0) -> disable virtio-ioeventfd >>>>>>>> * vm_change_state(running=1) -> enable virtio-ioeventfd >>>>>>>> >>>>>>>> Signed-off-by: Stefan Hajnoczi<stefanha@linux.vnet.ibm.com> >>>>>>>> >>>>>>> On current git master I'm getting hangs when running iozone on a >>>>>>> virtio-blk disk. "Hang" means that it's not responsive any more and has >>>>>>> 100% CPU consumption. >>>>>>> >>>>>>> I bisected the problem to this patch. Any ideas? >>>>>>> >>>>>>> Kevin >>>>>>> >>>>>> Does it help if you set ioeventfd=off on command line? >>>>>> >>>>> Yes, with ioeventfd=off it seems to work fine. >>>>> >>>>> Kevin >>>>> >>>> Then it's the ioeventfd that is to blame. >>>> Is it the io thread that consumes 100% CPU? >>>> Or the vcpu thread? >>>> >>> I was building with the default options, i.e. there is no IO thread. >>> >>> Now I'm just running the test with IO threads enabled, and so far >>> everything looks good. So I can only reproduce the problem with IO >>> threads disabled. >>> >> Hrm...aio uses SIGUSR2 to force the vcpu to process aio completions >> (relevant when --enable-io-thread is not used). I will take a look at >> that again and see why we're spinning without checking for ioeventfd >> completion. >> > Here's my understanding of --disable-io-thread. Added Anthony on CC, > please correct me. 
> > When I/O thread is disabled our only thread runs guest code until an > exit request is made. There are synchronous exit cases like a halt > instruction or single step. There are also asynchronous exit cases > when signal handlers use qemu_notify_event(), which does cpu_exit(), > to set env->exit_request = 1 and unlink the current tb. > Correct. Note that this is a problem today. If you have a tight loop in TCG and you have nothing that would generate a signal (no pending disk I/O and no periodic timer) then the main loop is starved. This is a fundamental flaw in TCG. Regards, Anthony Liguori ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [Qemu-devel] [PATCH v5 2/4] virtio-pci: Use ioeventfd for virtqueue notify 2011-01-25 19:18 ` Anthony Liguori @ 2011-01-25 19:45 ` Stefan Hajnoczi 2011-01-25 19:51 ` Anthony Liguori 0 siblings, 1 reply; 52+ messages in thread From: Stefan Hajnoczi @ 2011-01-25 19:45 UTC (permalink / raw) To: Anthony Liguori Cc: Kevin Wolf, Anthony Liguori, Stefan Hajnoczi, Michael S. Tsirkin, qemu-devel, Avi Kivity On Tue, Jan 25, 2011 at 7:18 PM, Anthony Liguori <aliguori@linux.vnet.ibm.com> wrote: > On 01/25/2011 03:49 AM, Stefan Hajnoczi wrote: >> >> On Tue, Jan 25, 2011 at 7:12 AM, Stefan Hajnoczi<stefanha@gmail.com> >> wrote: >> >>> >>> On Mon, Jan 24, 2011 at 8:05 PM, Kevin Wolf<kwolf@redhat.com> wrote: >>> >>>> >>>> Am 24.01.2011 20:47, schrieb Michael S. Tsirkin: >>>> >>>>> >>>>> On Mon, Jan 24, 2011 at 08:48:05PM +0100, Kevin Wolf wrote: >>>>> >>>>>> >>>>>> Am 24.01.2011 20:36, schrieb Michael S. Tsirkin: >>>>>> >>>>>>> >>>>>>> On Mon, Jan 24, 2011 at 07:54:20PM +0100, Kevin Wolf wrote: >>>>>>> >>>>>>>> >>>>>>>> Am 12.12.2010 16:02, schrieb Stefan Hajnoczi: >>>>>>>> >>>>>>>>> >>>>>>>>> Virtqueue notify is currently handled synchronously in userspace >>>>>>>>> virtio. This >>>>>>>>> prevents the vcpu from executing guest code while hardware >>>>>>>>> emulation code >>>>>>>>> handles the notify. >>>>>>>>> >>>>>>>>> On systems that support KVM, the ioeventfd mechanism can be used to >>>>>>>>> make >>>>>>>>> virtqueue notify a lightweight exit by deferring hardware emulation >>>>>>>>> to the >>>>>>>>> iothread and allowing the VM to continue execution. This model is >>>>>>>>> similar to >>>>>>>>> how vhost receives virtqueue notifies. >>>>>>>>> >>>>>>>>> The result of this change is improved performance for userspace >>>>>>>>> virtio devices. >>>>>>>>> Virtio-blk throughput increases especially for multithreaded >>>>>>>>> scenarios and >>>>>>>>> virtio-net transmit throughput increases substantially. 
>>>>>>>>> >>>>>>>>> Some virtio devices are known to have guest drivers which expect a >>>>>>>>> notify to be >>>>>>>>> processed synchronously and spin waiting for completion. Only >>>>>>>>> enable ioeventfd >>>>>>>>> for virtio-blk and virtio-net for now. >>>>>>>>> >>>>>>>>> Care must be taken not to interfere with vhost-net, which uses host >>>>>>>>> notifiers. If the set_host_notifier() API is used by a device >>>>>>>>> virtio-pci will disable virtio-ioeventfd and let the device deal >>>>>>>>> with >>>>>>>>> host notifiers as it wishes. >>>>>>>>> >>>>>>>>> After migration and on VM change state (running/paused) >>>>>>>>> virtio-ioeventfd >>>>>>>>> will enable/disable itself. >>>>>>>>> >>>>>>>>> * VIRTIO_CONFIG_S_DRIVER_OK -> enable virtio-ioeventfd >>>>>>>>> * !VIRTIO_CONFIG_S_DRIVER_OK -> disable virtio-ioeventfd >>>>>>>>> * virtio_pci_set_host_notifier() -> disable virtio-ioeventfd >>>>>>>>> * vm_change_state(running=0) -> disable virtio-ioeventfd >>>>>>>>> * vm_change_state(running=1) -> enable virtio-ioeventfd >>>>>>>>> >>>>>>>>> Signed-off-by: Stefan Hajnoczi<stefanha@linux.vnet.ibm.com> >>>>>>>>> >>>>>>>> >>>>>>>> On current git master I'm getting hangs when running iozone on a >>>>>>>> virtio-blk disk. "Hang" means that it's not responsive any more and >>>>>>>> has >>>>>>>> 100% CPU consumption. >>>>>>>> >>>>>>>> I bisected the problem to this patch. Any ideas? >>>>>>>> >>>>>>>> Kevin >>>>>>>> >>>>>>> >>>>>>> Does it help if you set ioeventfd=off on command line? >>>>>>> >>>>>> >>>>>> Yes, with ioeventfd=off it seems to work fine. >>>>>> >>>>>> Kevin >>>>>> >>>>> >>>>> Then it's the ioeventfd that is to blame. >>>>> Is it the io thread that consumes 100% CPU? >>>>> Or the vcpu thread? >>>>> >>>> >>>> I was building with the default options, i.e. there is no IO thread. >>>> >>>> Now I'm just running the test with IO threads enabled, and so far >>>> everything looks good. So I can only reproduce the problem with IO >>>> threads disabled. 
>>>> >>> Hrm...aio uses SIGUSR2 to force the vcpu to process aio completions >>> (relevant when --enable-io-thread is not used). I will take a look at >>> that again and see why we're spinning without checking for ioeventfd >>> completion. >>> >> >> Here's my understanding of --disable-io-thread. Added Anthony on CC, >> please correct me. >> >> When I/O thread is disabled our only thread runs guest code until an >> exit request is made. There are synchronous exit cases like a halt >> instruction or single step. There are also asynchronous exit cases >> when signal handlers use qemu_notify_event(), which does cpu_exit(), >> to set env->exit_request = 1 and unlink the current tb. >> > > Correct. > > Note that this is a problem today. If you have a tight loop in TCG and you > have nothing that would generate a signal (no pending disk I/O and no > periodic timer) then the main loop is starved. Even with KVM we can spin inside the guest and get a softlockup due to the dynticks race condition shown above. In a CPU bound guest that's doing no I/O it's possible to go AWOL for extended periods of time. I can think of two solutions: 1. Block SIGALRM during critical regions, not sure if the necessary atomic signal mask capabilities are there in KVM. Haven't looked at TCG yet either. 2. Make a portion of the timer code signal-safe and rearm the timer from within the SIGALRM handler. Stefan ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [Qemu-devel] [PATCH v5 2/4] virtio-pci: Use ioeventfd for virtqueue notify 2011-01-25 19:45 ` Stefan Hajnoczi @ 2011-01-25 19:51 ` Anthony Liguori 2011-01-25 19:59 ` Stefan Hajnoczi 0 siblings, 1 reply; 52+ messages in thread From: Anthony Liguori @ 2011-01-25 19:51 UTC (permalink / raw) To: Stefan Hajnoczi Cc: Kevin Wolf, Anthony Liguori, Stefan Hajnoczi, Michael S. Tsirkin, qemu-devel, Avi Kivity On 01/25/2011 01:45 PM, Stefan Hajnoczi wrote: > On Tue, Jan 25, 2011 at 7:18 PM, Anthony Liguori > <aliguori@linux.vnet.ibm.com> wrote: > >> On 01/25/2011 03:49 AM, Stefan Hajnoczi wrote: >> >>> On Tue, Jan 25, 2011 at 7:12 AM, Stefan Hajnoczi<stefanha@gmail.com> >>> wrote: >>> >>> >>>> On Mon, Jan 24, 2011 at 8:05 PM, Kevin Wolf<kwolf@redhat.com> wrote: >>>> >>>> >>>>> Am 24.01.2011 20:47, schrieb Michael S. Tsirkin: >>>>> >>>>> >>>>>> On Mon, Jan 24, 2011 at 08:48:05PM +0100, Kevin Wolf wrote: >>>>>> >>>>>> >>>>>>> Am 24.01.2011 20:36, schrieb Michael S. Tsirkin: >>>>>>> >>>>>>> >>>>>>>> On Mon, Jan 24, 2011 at 07:54:20PM +0100, Kevin Wolf wrote: >>>>>>>> >>>>>>>> >>>>>>>>> Am 12.12.2010 16:02, schrieb Stefan Hajnoczi: >>>>>>>>> >>>>>>>>> >>>>>>>>>> Virtqueue notify is currently handled synchronously in userspace >>>>>>>>>> virtio. This >>>>>>>>>> prevents the vcpu from executing guest code while hardware >>>>>>>>>> emulation code >>>>>>>>>> handles the notify. >>>>>>>>>> >>>>>>>>>> On systems that support KVM, the ioeventfd mechanism can be used to >>>>>>>>>> make >>>>>>>>>> virtqueue notify a lightweight exit by deferring hardware emulation >>>>>>>>>> to the >>>>>>>>>> iothread and allowing the VM to continue execution. This model is >>>>>>>>>> similar to >>>>>>>>>> how vhost receives virtqueue notifies. >>>>>>>>>> >>>>>>>>>> The result of this change is improved performance for userspace >>>>>>>>>> virtio devices. 
>>>>>>>>>> Virtio-blk throughput increases especially for multithreaded >>>>>>>>>> scenarios and >>>>>>>>>> virtio-net transmit throughput increases substantially. >>>>>>>>>> >>>>>>>>>> Some virtio devices are known to have guest drivers which expect a >>>>>>>>>> notify to be >>>>>>>>>> processed synchronously and spin waiting for completion. Only >>>>>>>>>> enable ioeventfd >>>>>>>>>> for virtio-blk and virtio-net for now. >>>>>>>>>> >>>>>>>>>> Care must be taken not to interfere with vhost-net, which uses host >>>>>>>>>> notifiers. If the set_host_notifier() API is used by a device >>>>>>>>>> virtio-pci will disable virtio-ioeventfd and let the device deal >>>>>>>>>> with >>>>>>>>>> host notifiers as it wishes. >>>>>>>>>> >>>>>>>>>> After migration and on VM change state (running/paused) >>>>>>>>>> virtio-ioeventfd >>>>>>>>>> will enable/disable itself. >>>>>>>>>> >>>>>>>>>> * VIRTIO_CONFIG_S_DRIVER_OK -> enable virtio-ioeventfd >>>>>>>>>> * !VIRTIO_CONFIG_S_DRIVER_OK -> disable virtio-ioeventfd >>>>>>>>>> * virtio_pci_set_host_notifier() -> disable virtio-ioeventfd >>>>>>>>>> * vm_change_state(running=0) -> disable virtio-ioeventfd >>>>>>>>>> * vm_change_state(running=1) -> enable virtio-ioeventfd >>>>>>>>>> >>>>>>>>>> Signed-off-by: Stefan Hajnoczi<stefanha@linux.vnet.ibm.com> >>>>>>>>>> >>>>>>>>>> >>>>>>>>> On current git master I'm getting hangs when running iozone on a >>>>>>>>> virtio-blk disk. "Hang" means that it's not responsive any more and >>>>>>>>> has >>>>>>>>> 100% CPU consumption. >>>>>>>>> >>>>>>>>> I bisected the problem to this patch. Any ideas? >>>>>>>>> >>>>>>>>> Kevin >>>>>>>>> >>>>>>>>> >>>>>>>> Does it help if you set ioeventfd=off on command line? >>>>>>>> >>>>>>>> >>>>>>> Yes, with ioeventfd=off it seems to work fine. >>>>>>> >>>>>>> Kevin >>>>>>> >>>>>>> >>>>>> Then it's the ioeventfd that is to blame. >>>>>> Is it the io thread that consumes 100% CPU? >>>>>> Or the vcpu thread? 
>>>>>> >>>>>> >>>>> I was building with the default options, i.e. there is no IO thread. >>>>> >>>>> Now I'm just running the test with IO threads enabled, and so far >>>>> everything looks good. So I can only reproduce the problem with IO >>>>> threads disabled. >>>>> >>>>> >>>> Hrm...aio uses SIGUSR2 to force the vcpu to process aio completions >>>> (relevant when --enable-io-thread is not used). I will take a look at >>>> that again and see why we're spinning without checking for ioeventfd >>>> completion. >>>> >>>> >>> Here's my understanding of --disable-io-thread. Added Anthony on CC, >>> please correct me. >>> >>> When I/O thread is disabled our only thread runs guest code until an >>> exit request is made. There are synchronous exit cases like a halt >>> instruction or single step. There are also asynchronous exit cases >>> when signal handlers use qemu_notify_event(), which does cpu_exit(), >>> to set env->exit_request = 1 and unlink the current tb. >>> >>> >> Correct. >> >> Note that this is a problem today. If you have a tight loop in TCG and >> you >> have nothing that would generate a signal (no pending disk I/O and no >> periodic timer) then the main loop is starved. >> > Even with KVM we can spin inside the guest and get a softlockup due to > the dynticks race condition shown above. In a CPU bound guest that's > doing no I/O it's possible to go AWOL for extended periods of time. > This is a different race. I need to look more deeply into the code. > I can think of two solutions: > 1. Block SIGALRM during critical regions, not sure if the necessary > atomic signal mask capabilities are there in KVM. Haven't looked at > TCG yet either. > 2. Make a portion of the timer code signal-safe and rearm the timer > from within the SIGALRM handler. > Or, switch to timerfd and stop using a signal based alarm timer. Regards, Anthony Liguori > Stefan > ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [Qemu-devel] [PATCH v5 2/4] virtio-pci: Use ioeventfd for virtqueue notify 2011-01-25 19:51 ` Anthony Liguori @ 2011-01-25 19:59 ` Stefan Hajnoczi 2011-01-26 0:18 ` Anthony Liguori 0 siblings, 1 reply; 52+ messages in thread From: Stefan Hajnoczi @ 2011-01-25 19:59 UTC (permalink / raw) To: Anthony Liguori Cc: Kevin Wolf, Anthony Liguori, Stefan Hajnoczi, Michael S. Tsirkin, qemu-devel, Avi Kivity On Tue, Jan 25, 2011 at 7:51 PM, Anthony Liguori <aliguori@linux.vnet.ibm.com> wrote: > On 01/25/2011 01:45 PM, Stefan Hajnoczi wrote: >> >> On Tue, Jan 25, 2011 at 7:18 PM, Anthony Liguori >> <aliguori@linux.vnet.ibm.com> wrote: >> >>> >>> On 01/25/2011 03:49 AM, Stefan Hajnoczi wrote: >>> >>>> >>>> On Tue, Jan 25, 2011 at 7:12 AM, Stefan Hajnoczi<stefanha@gmail.com> >>>> wrote: >>>> >>>> >>>>> >>>>> On Mon, Jan 24, 2011 at 8:05 PM, Kevin Wolf<kwolf@redhat.com> wrote: >>>>> >>>>> >>>>>> >>>>>> Am 24.01.2011 20:47, schrieb Michael S. Tsirkin: >>>>>> >>>>>> >>>>>>> >>>>>>> On Mon, Jan 24, 2011 at 08:48:05PM +0100, Kevin Wolf wrote: >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> Am 24.01.2011 20:36, schrieb Michael S. Tsirkin: >>>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>> On Mon, Jan 24, 2011 at 07:54:20PM +0100, Kevin Wolf wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Am 12.12.2010 16:02, schrieb Stefan Hajnoczi: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Virtqueue notify is currently handled synchronously in userspace >>>>>>>>>>> virtio. This >>>>>>>>>>> prevents the vcpu from executing guest code while hardware >>>>>>>>>>> emulation code >>>>>>>>>>> handles the notify. >>>>>>>>>>> >>>>>>>>>>> On systems that support KVM, the ioeventfd mechanism can be used >>>>>>>>>>> to >>>>>>>>>>> make >>>>>>>>>>> virtqueue notify a lightweight exit by deferring hardware >>>>>>>>>>> emulation >>>>>>>>>>> to the >>>>>>>>>>> iothread and allowing the VM to continue execution. This model >>>>>>>>>>> is >>>>>>>>>>> similar to >>>>>>>>>>> how vhost receives virtqueue notifies. 
>>>>>>>>>>> >>>>>>>>>>> The result of this change is improved performance for userspace >>>>>>>>>>> virtio devices. >>>>>>>>>>> Virtio-blk throughput increases especially for multithreaded >>>>>>>>>>> scenarios and >>>>>>>>>>> virtio-net transmit throughput increases substantially. >>>>>>>>>>> >>>>>>>>>>> Some virtio devices are known to have guest drivers which expect >>>>>>>>>>> a >>>>>>>>>>> notify to be >>>>>>>>>>> processed synchronously and spin waiting for completion. Only >>>>>>>>>>> enable ioeventfd >>>>>>>>>>> for virtio-blk and virtio-net for now. >>>>>>>>>>> >>>>>>>>>>> Care must be taken not to interfere with vhost-net, which uses >>>>>>>>>>> host >>>>>>>>>>> notifiers. If the set_host_notifier() API is used by a device >>>>>>>>>>> virtio-pci will disable virtio-ioeventfd and let the device deal >>>>>>>>>>> with >>>>>>>>>>> host notifiers as it wishes. >>>>>>>>>>> >>>>>>>>>>> After migration and on VM change state (running/paused) >>>>>>>>>>> virtio-ioeventfd >>>>>>>>>>> will enable/disable itself. >>>>>>>>>>> >>>>>>>>>>> * VIRTIO_CONFIG_S_DRIVER_OK -> enable virtio-ioeventfd >>>>>>>>>>> * !VIRTIO_CONFIG_S_DRIVER_OK -> disable virtio-ioeventfd >>>>>>>>>>> * virtio_pci_set_host_notifier() -> disable virtio-ioeventfd >>>>>>>>>>> * vm_change_state(running=0) -> disable virtio-ioeventfd >>>>>>>>>>> * vm_change_state(running=1) -> enable virtio-ioeventfd >>>>>>>>>>> >>>>>>>>>>> Signed-off-by: Stefan Hajnoczi<stefanha@linux.vnet.ibm.com> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On current git master I'm getting hangs when running iozone on a >>>>>>>>>> virtio-blk disk. "Hang" means that it's not responsive any more >>>>>>>>>> and >>>>>>>>>> has >>>>>>>>>> 100% CPU consumption. >>>>>>>>>> >>>>>>>>>> I bisected the problem to this patch. Any ideas? >>>>>>>>>> >>>>>>>>>> Kevin >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> Does it help if you set ioeventfd=off on command line? >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> Yes, with ioeventfd=off it seems to work fine. 
>>>>>>>> >>>>>>>> Kevin >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> Then it's the ioeventfd that is to blame. >>>>>>> Is it the io thread that consumes 100% CPU? >>>>>>> Or the vcpu thread? >>>>>>> >>>>>>> >>>>>> >>>>>> I was building with the default options, i.e. there is no IO thread. >>>>>> >>>>>> Now I'm just running the test with IO threads enabled, and so far >>>>>> everything looks good. So I can only reproduce the problem with IO >>>>>> threads disabled. >>>>>> >>>>>> >>>>> >>>>> Hrm...aio uses SIGUSR2 to force the vcpu to process aio completions >>>>> (relevant when --enable-io-thread is not used). I will take a look at >>>>> that again and see why we're spinning without checking for ioeventfd >>>>> completion. >>>>> >>>>> >>>> >>>> Here's my understanding of --disable-io-thread. Added Anthony on CC, >>>> please correct me. >>>> >>>> When I/O thread is disabled our only thread runs guest code until an >>>> exit request is made. There are synchronous exit cases like a halt >>>> instruction or single step. There are also asynchronous exit cases >>>> when signal handlers use qemu_notify_event(), which does cpu_exit(), >>>> to set env->exit_request = 1 and unlink the current tb. >>>> >>>> >>> >>> Correct. >>> >>> Note that this is a problem today. If you have a tight loop in TCG and >>> you >>> have nothing that would generate a signal (no pending disk I/O and no >>> periodic timer) then the main loop is starved. >>> >> >> Even with KVM we can spin inside the guest and get a softlockup due to >> the dynticks race condition shown above. In a CPU bound guest that's >> doing no I/O it's possible to go AWOL for extended periods of time. >> > > This is a different race. I need to look more deeply into the code. 
int kvm_cpu_exec(CPUState *env)
{
    struct kvm_run *run = env->kvm_run;
    int ret;

    DPRINTF("kvm_cpu_exec()\n");

    do {

This is broken because a signal handler could change env->exit_request after this check:

#ifndef CONFIG_IOTHREAD
        if (env->exit_request) {
            DPRINTF("interrupt exit requested\n");
            ret = 0;
            break;
        }
#endif

        if (kvm_arch_process_irqchip_events(env)) {
            ret = 0;
            break;
        }

        if (env->kvm_vcpu_dirty) {
            kvm_arch_put_registers(env, KVM_PUT_RUNTIME_STATE);
            env->kvm_vcpu_dirty = 0;
        }

        kvm_arch_pre_run(env, run);
        cpu_single_env = NULL;
        qemu_mutex_unlock_iothread();

env->exit_request might be set but we still reenter, possibly without rearming the timer:

        ret = kvm_vcpu_ioctl(env, KVM_RUN, 0);

>> I can think of two solutions:
>> 1. Block SIGALRM during critical regions, not sure if the necessary
>> atomic signal mask capabilities are there in KVM. Haven't looked at
>> TCG yet either.
>> 2. Make a portion of the timer code signal-safe and rearm the timer
>> from within the SIGALRM handler.
>>
>
> Or, switch to timerfd and stop using a signal based alarm timer.

Doesn't work for !CONFIG_IOTHREAD.

Stefan

^ permalink raw reply	[flat|nested] 52+ messages in thread
* Re: [Qemu-devel] [PATCH v5 2/4] virtio-pci: Use ioeventfd for virtqueue notify 2011-01-25 19:59 ` Stefan Hajnoczi @ 2011-01-26 0:18 ` Anthony Liguori 0 siblings, 0 replies; 52+ messages in thread From: Anthony Liguori @ 2011-01-26 0:18 UTC (permalink / raw) To: Stefan Hajnoczi Cc: Kevin Wolf, qemu-devel, Avi Kivity, Stefan Hajnoczi, Michael S. Tsirkin On 01/25/2011 01:59 PM, Stefan Hajnoczi wrote:
> int kvm_cpu_exec(CPUState *env)
> {
>     struct kvm_run *run = env->kvm_run;
>     int ret;
>
>     DPRINTF("kvm_cpu_exec()\n");
>
>     do {
>
> This is broken because a signal handler could change env->exit_request
> after this check:
>
> #ifndef CONFIG_IOTHREAD
>         if (env->exit_request) {
>             DPRINTF("interrupt exit requested\n");
>             ret = 0;
>             break;
>         }
> #endif
>

Yeah, this is classic signal/select race with ioctl(KVM_RUN) subbing in for select. But this is supposed to be mitigated by the fact that we block SIG_IPI except for when we execute KVM_RUN which means that we can reliably send SIG_IPI. Of course, that doesn't help for SIGALRM unless we send a SIG_IPI from the SIGALRM handler which we do with the I/O thread but not w/o it. At any rate, post stable-0.14, I want to enable I/O thread by default so I don't know that we really need to fix this...

>         if (kvm_arch_process_irqchip_events(env)) {
>             ret = 0;
>             break;
>         }
>
>         if (env->kvm_vcpu_dirty) {
>             kvm_arch_put_registers(env, KVM_PUT_RUNTIME_STATE);
>             env->kvm_vcpu_dirty = 0;
>         }
>
>         kvm_arch_pre_run(env, run);
>         cpu_single_env = NULL;
>         qemu_mutex_unlock_iothread();
>
> env->exit_request might be set but we still reenter, possibly without
> rearming the timer:
>         ret = kvm_vcpu_ioctl(env, KVM_RUN, 0);
>
>
>>> I can think of two solutions:
>>> 1. Block SIGALRM during critical regions, not sure if the necessary
>>> atomic signal mask capabilities are there in KVM. Haven't looked at
>>> TCG yet either.
>>> 2. Make a portion of the timer code signal-safe and rearm the timer
>>> from within the SIGALRM handler.
>>> >>> >> Or, switch to timerfd and stop using a signal based alarm timer. >> > Doesn't work for !CONFIG_IOTHREAD. > Yeah, we need to get rid of !CONFIG_IOTHREAD. We need to run select() in parallel with TCG/KVM and interrupt the VCPUs appropriately when select() returns. Regards, Anthony Liguori > Stefan > > ^ permalink raw reply [flat|nested] 52+ messages in thread
* [Qemu-devel] [PATCH v5 3/4] virtio-pci: Don't use ioeventfd on old kernels 2010-12-12 15:02 [Qemu-devel] [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify Stefan Hajnoczi 2010-12-12 15:02 ` [Qemu-devel] [PATCH v5 1/4] virtio-pci: Rename bugs field to flags Stefan Hajnoczi 2010-12-12 15:02 ` [Qemu-devel] [PATCH v5 2/4] virtio-pci: Use ioeventfd for virtqueue notify Stefan Hajnoczi @ 2010-12-12 15:02 ` Stefan Hajnoczi 2010-12-12 15:02 ` [Qemu-devel] [PATCH v5 4/4] docs: Document virtio PCI -device ioeventfd=on|off Stefan Hajnoczi ` (2 subsequent siblings) 5 siblings, 0 replies; 52+ messages in thread From: Stefan Hajnoczi @ 2010-12-12 15:02 UTC (permalink / raw) To: qemu-devel; +Cc: Stefan Hajnoczi, Michael S. Tsirkin There used to be a limit of 6 KVM io bus devices inside the kernel. On such a kernel, don't use ioeventfd for virtqueue host notification since the limit is reached too easily. This ensures that existing vhost-net setups (which always use ioeventfd) have ioeventfds available so they can continue to work. 
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
---
 hw/virtio-pci.c |    4 ++++
 kvm-all.c       |   49 +++++++++++++++++++++++++++++++++++++++++++++++++
 kvm-stub.c      |    5 +++++
 kvm.h           |    1 +
 4 files changed, 59 insertions(+), 0 deletions(-)

diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
index f57c45a..db0df67 100644
--- a/hw/virtio-pci.c
+++ b/hw/virtio-pci.c
@@ -690,6 +690,10 @@ static void virtio_init_pci(VirtIOPCIProxy *proxy, VirtIODevice *vdev,
     pci_register_bar(&proxy->pci_dev, 0, size, PCI_BASE_ADDRESS_SPACE_IO,
                      virtio_map);

+    if (!kvm_has_many_ioeventfds()) {
+        proxy->flags &= ~VIRTIO_PCI_FLAG_USE_IOEVENTFD;
+    }
+
     virtio_bind_device(vdev, &virtio_pci_bindings, proxy);
     proxy->host_features |= 0x1 << VIRTIO_F_NOTIFY_ON_EMPTY;
     proxy->host_features |= 0x1 << VIRTIO_F_BAD_FEATURE;
diff --git a/kvm-all.c b/kvm-all.c
index cae24bb..255b6fa 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -28,6 +28,11 @@
 #include "kvm.h"
 #include "bswap.h"

+/* This check must be after config-host.h is included */
+#ifdef CONFIG_EVENTFD
+#include <sys/eventfd.h>
+#endif
+
 /* KVM uses PAGE_SIZE in it's definition of COALESCED_MMIO_MAX */
 #define PAGE_SIZE TARGET_PAGE_SIZE

@@ -72,6 +77,7 @@ struct KVMState
     int irqchip_in_kernel;
     int pit_in_kernel;
     int xsave, xcrs;
+    int many_ioeventfds;
 };

 static KVMState *kvm_state;
@@ -441,6 +447,39 @@ int kvm_check_extension(KVMState *s, unsigned int extension)
     return ret;
 }

+static int kvm_check_many_ioeventfds(void)
+{
+    /* Older kernels have a 6 device limit on the KVM io bus.  Find out so we
+     * can avoid creating too many ioeventfds.
+     */
+#ifdef CONFIG_EVENTFD
+    int ioeventfds[7];
+    int i, ret = 0;
+    for (i = 0; i < ARRAY_SIZE(ioeventfds); i++) {
+        ioeventfds[i] = eventfd(0, EFD_CLOEXEC);
+        if (ioeventfds[i] < 0) {
+            break;
+        }
+        ret = kvm_set_ioeventfd_pio_word(ioeventfds[i], 0, i, true);
+        if (ret < 0) {
+            close(ioeventfds[i]);
+            break;
+        }
+    }
+
+    /* Decide whether many devices are supported or not */
+    ret = i == ARRAY_SIZE(ioeventfds);
+
+    while (i-- > 0) {
+        kvm_set_ioeventfd_pio_word(ioeventfds[i], 0, i, false);
+        close(ioeventfds[i]);
+    }
+    return ret;
+#else
+    return 0;
+#endif
+}
+
 static void kvm_set_phys_mem(target_phys_addr_t start_addr,
                              ram_addr_t size,
                              ram_addr_t phys_offset)
@@ -717,6 +756,8 @@ int kvm_init(int smp_cpus)
     kvm_state = s;
     cpu_register_phys_memory_client(&kvm_cpu_phys_memory_client);

+    s->many_ioeventfds = kvm_check_many_ioeventfds();
+
     return 0;

 err:
@@ -1046,6 +1087,14 @@ int kvm_has_xcrs(void)
     return kvm_state->xcrs;
 }

+int kvm_has_many_ioeventfds(void)
+{
+    if (!kvm_enabled()) {
+        return 0;
+    }
+    return kvm_state->many_ioeventfds;
+}
+
 void kvm_setup_guest_memory(void *start, size_t size)
 {
     if (!kvm_has_sync_mmu()) {
diff --git a/kvm-stub.c b/kvm-stub.c
index 5384a4b..33d4476 100644
--- a/kvm-stub.c
+++ b/kvm-stub.c
@@ -99,6 +99,11 @@ int kvm_has_robust_singlestep(void)
     return 0;
 }

+int kvm_has_many_ioeventfds(void)
+{
+    return 0;
+}
+
 void kvm_setup_guest_memory(void *start, size_t size)
 {
 }
diff --git a/kvm.h b/kvm.h
index 60a9b42..ce08d42 100644
--- a/kvm.h
+++ b/kvm.h
@@ -42,6 +42,7 @@ int kvm_has_robust_singlestep(void);
 int kvm_has_debugregs(void);
 int kvm_has_xsave(void);
 int kvm_has_xcrs(void);
+int kvm_has_many_ioeventfds(void);

 #ifdef NEED_CPU_H
 int kvm_init_vcpu(CPUState *env);
--
1.7.2.3

^ permalink raw reply related	[flat|nested] 52+ messages in thread
* [Qemu-devel] [PATCH v5 4/4] docs: Document virtio PCI -device ioeventfd=on|off 2010-12-12 15:02 [Qemu-devel] [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify Stefan Hajnoczi ` (2 preceding siblings ...) 2010-12-12 15:02 ` [Qemu-devel] [PATCH v5 3/4] virtio-pci: Don't use ioeventfd on old kernels Stefan Hajnoczi @ 2010-12-12 15:02 ` Stefan Hajnoczi 2010-12-12 15:14 ` [Qemu-devel] Re: [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify Stefan Hajnoczi 2010-12-12 20:41 ` Michael S. Tsirkin 5 siblings, 0 replies; 52+ messages in thread From: Stefan Hajnoczi @ 2010-12-12 15:02 UTC (permalink / raw) To: qemu-devel; +Cc: Stefan Hajnoczi, Michael S. Tsirkin

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
---
 docs/qdev-device-use.txt |    8 +++++++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/docs/qdev-device-use.txt b/docs/qdev-device-use.txt
index f252c8e..84d0c82 100644
--- a/docs/qdev-device-use.txt
+++ b/docs/qdev-device-use.txt
@@ -97,10 +97,13 @@ The -device argument differs in detail for each kind of drive:

 * if=virtio

-      -device virtio-blk-pci,drive=DRIVE-ID,class=C,vectors=V
+      -device virtio-blk-pci,drive=DRIVE-ID,class=C,vectors=V,ioeventfd=IOEVENTFD

 This lets you control PCI device class and MSI-X vectors.

+IOEVENTFD controls whether or not ioeventfd is used for virtqueue notify.  It
+can be set to on (default) or off.
+
 As for all PCI devices, you can add bus=PCI-BUS,addr=DEVFN to control the PCI
 device address.

@@ -240,6 +243,9 @@ For PCI devices, you can add bus=PCI-BUS,addr=DEVFN to control the PCI device
 address, as usual.  The old -net nic provides parameter addr for that, it is
 silently ignored when the NIC is not a PCI device.

+For virtio-net-pci, you can control whether or not ioeventfd is used for
+virtqueue notify by setting ioeventfd= to on (default) or off.
+
 -net nic accepts vectors=V for all models, but it's silently ignored except for
 virtio-net-pci (model=virtio).
With -device, only devices that support it accept it. -- 1.7.2.3 ^ permalink raw reply related [flat|nested] 52+ messages in thread
* [Qemu-devel] Re: [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify 2010-12-12 15:02 [Qemu-devel] [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify Stefan Hajnoczi ` (3 preceding siblings ...) 2010-12-12 15:02 ` [Qemu-devel] [PATCH v5 4/4] docs: Document virtio PCI -device ioeventfd=on|off Stefan Hajnoczi @ 2010-12-12 15:14 ` Stefan Hajnoczi 2010-12-12 20:41 ` Michael S. Tsirkin 5 siblings, 0 replies; 52+ messages in thread From: Stefan Hajnoczi @ 2010-12-12 15:14 UTC (permalink / raw) To: qemu-devel; +Cc: Michael S. Tsirkin On Sun, Dec 12, 2010 at 3:02 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote: > Due to lack of connectivity I am sending from GMail. Git should retain my > stefanha@linux.vnet.ibm.com From address. The From address didn't come through correctly so I've pushed the commits here: git pull git://repo.or.cz/qemu/stefanha.git virtio-ioeventfd-2 Stefan ^ permalink raw reply [flat|nested] 52+ messages in thread
* [Qemu-devel] Re: [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify 2010-12-12 15:02 [Qemu-devel] [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify Stefan Hajnoczi ` (4 preceding siblings ...) 2010-12-12 15:14 ` [Qemu-devel] Re: [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify Stefan Hajnoczi @ 2010-12-12 20:41 ` Michael S. Tsirkin 2010-12-12 20:42 ` Michael S. Tsirkin 5 siblings, 1 reply; 52+ messages in thread From: Michael S. Tsirkin @ 2010-12-12 20:41 UTC (permalink / raw) To: Stefan Hajnoczi; +Cc: qemu-devel On Sun, Dec 12, 2010 at 03:02:04PM +0000, Stefan Hajnoczi wrote: > See below for the v5 changelog. > > Due to lack of connectivity I am sending from GMail. Git should retain my > stefanha@linux.vnet.ibm.com From address. > > Virtqueue notify is currently handled synchronously in userspace virtio. This > prevents the vcpu from executing guest code while hardware emulation code > handles the notify. > > On systems that support KVM, the ioeventfd mechanism can be used to make > virtqueue notify a lightweight exit by deferring hardware emulation to the > iothread and allowing the VM to continue execution. This model is similar to > how vhost receives virtqueue notifies. > > The result of this change is improved performance for userspace virtio devices. > Virtio-blk throughput increases especially for multithreaded scenarios and > virtio-net transmit throughput increases substantially.

Interestingly, I see decreased throughput for small message host to guest netperf runs.

The command that I used was:
netperf -H $vguest -- -m 200

And the results are:
- with ioeventfd=off
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 11.0.0.104 (11.0.0.104) port 0 AF_INET : demo
Recv   Send    Send                       Utilization       Service Demand
Socket Socket  Message  Elapsed           Send     Recv     Send    Recv
Size   Size    Size     Time   Throughput local    remote   local   remote
bytes  bytes   bytes    secs.  10^6bits/s % S      % S      us/KB   us/KB

 87380  16384    200    10.00    3035.48   15.50    99.30    6.695   2.680

- with ioeventfd=on
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 11.0.0.104 (11.0.0.104) port 0 AF_INET : demo
Recv   Send    Send                       Utilization       Service Demand
Socket Socket  Message  Elapsed           Send     Recv     Send    Recv
Size   Size    Size     Time   Throughput local    remote   local   remote
bytes  bytes   bytes    secs.  10^6bits/s % S      % S      us/KB   us/KB

 87380  16384    200    10.00   1770.95   18.16    51.65    13.442  2.389

Do you see this behaviour too?

--
MST

^ permalink raw reply	[flat|nested] 52+ messages in thread
* [Qemu-devel] Re: [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify 2010-12-12 20:41 ` Michael S. Tsirkin @ 2010-12-12 20:42 ` Michael S. Tsirkin 2010-12-12 20:56 ` Michael S. Tsirkin 0 siblings, 1 reply; 52+ messages in thread From: Michael S. Tsirkin @ 2010-12-12 20:42 UTC (permalink / raw) To: Stefan Hajnoczi; +Cc: qemu-devel On Sun, Dec 12, 2010 at 10:41:28PM +0200, Michael S. Tsirkin wrote: > On Sun, Dec 12, 2010 at 03:02:04PM +0000, Stefan Hajnoczi wrote: > > See below for the v5 changelog. > > > > Due to lack of connectivity I am sending from GMail. Git should retain my > > stefanha@linux.vnet.ibm.com From address. > > > > Virtqueue notify is currently handled synchronously in userspace virtio. This > > prevents the vcpu from executing guest code while hardware emulation code > > handles the notify. > > > > On systems that support KVM, the ioeventfd mechanism can be used to make > > virtqueue notify a lightweight exit by deferring hardware emulation to the > > iothread and allowing the VM to continue execution. This model is similar to > > how vhost receives virtqueue notifies. > > > > The result of this change is improved performance for userspace virtio devices. > > Virtio-blk throughput increases especially for multithreaded scenarios and > > virtio-net transmit throughput increases substantially. > > Interestingly, I see decreased throughput for small message > host to get netperf runs. > > The command that I used was: > netperf -H $vguest -- -m 200 > > And the results are: > - with ioeventfd=off > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 11.0.0.104 (11.0.0.104) port 0 AF_INET : demo > Recv Send Send Utilization Service Demand > Socket Socket Message Elapsed Send Recv Send Recv > Size Size Size Time Throughput local remote local remote > bytes bytes bytes secs. 
10^6bits/s % S % S us/KB us/KB > > 87380 16384 200 10.00 3035.48 15.50 99.30 6.695 2.680 > > - with ioeventfd=on > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 11.0.0.104 (11.0.0.104) port 0 AF_INET : demo > Recv Send Send Utilization Service Demand > Socket Socket Message Elapsed Send Recv Send Recv > Size Size Size Time Throughput local remote local remote > bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB > > 87380 16384 200 10.00 1770.95 18.16 51.65 13.442 2.389 > > > Do you see this behaviour too? Just a note: this is with the patchset ported to qemu-kvm. > -- > MST ^ permalink raw reply [flat|nested] 52+ messages in thread
* [Qemu-devel] Re: [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify 2010-12-12 20:42 ` Michael S. Tsirkin @ 2010-12-12 20:56 ` Michael S. Tsirkin 2010-12-12 21:09 ` Michael S. Tsirkin 0 siblings, 1 reply; 52+ messages in thread From: Michael S. Tsirkin @ 2010-12-12 20:56 UTC (permalink / raw) To: Stefan Hajnoczi; +Cc: qemu-devel On Sun, Dec 12, 2010 at 10:42:28PM +0200, Michael S. Tsirkin wrote: > On Sun, Dec 12, 2010 at 10:41:28PM +0200, Michael S. Tsirkin wrote: > > On Sun, Dec 12, 2010 at 03:02:04PM +0000, Stefan Hajnoczi wrote: > > > See below for the v5 changelog. > > > > > > Due to lack of connectivity I am sending from GMail. Git should retain my > > > stefanha@linux.vnet.ibm.com From address. > > > > > > Virtqueue notify is currently handled synchronously in userspace virtio. This > > > prevents the vcpu from executing guest code while hardware emulation code > > > handles the notify. > > > > > > On systems that support KVM, the ioeventfd mechanism can be used to make > > > virtqueue notify a lightweight exit by deferring hardware emulation to the > > > iothread and allowing the VM to continue execution. This model is similar to > > > how vhost receives virtqueue notifies. > > > > > > The result of this change is improved performance for userspace virtio devices. > > > Virtio-blk throughput increases especially for multithreaded scenarios and > > > virtio-net transmit throughput increases substantially. > > > > Interestingly, I see decreased throughput for small message > > host to get netperf runs. > > > > The command that I used was: > > netperf -H $vguest -- -m 200 > > > > And the results are: > > - with ioeventfd=off > > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 11.0.0.104 (11.0.0.104) port 0 AF_INET : demo > > Recv Send Send Utilization Service Demand > > Socket Socket Message Elapsed Send Recv Send Recv > > Size Size Size Time Throughput local remote local remote > > bytes bytes bytes secs. 
> > 10^6bits/s  % S      % S      us/KB   us/KB
> > 
> > 87380  16384    200    10.00      3035.48   15.50    99.30    6.695   2.680
> > 
> > - with ioeventfd=on
> > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 11.0.0.104 (11.0.0.104) port 0 AF_INET : demo
> > Recv   Send    Send                          Utilization       Service Demand
> > Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
> > Size   Size    Size     Time     Throughput  local    remote   local   remote
> > bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
> > 
> > 87380  16384    200    10.00      1770.95   18.16    51.65    13.442  2.389
> > 
> > 
> > Do you see this behaviour too?
> 
> Just a note: this is with the patchset ported to qemu-kvm.

And just another note: the trend is reversed for large messages,
e.g. with 1.5k messages ioeventfd=on outperforms ioeventfd=off.

> > -- 
> > MST

^ permalink raw reply	[flat|nested] 52+ messages in thread
* [Qemu-devel] Re: [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify 2010-12-12 20:56 ` Michael S. Tsirkin @ 2010-12-12 21:09 ` Michael S. Tsirkin 2010-12-13 10:24 ` Stefan Hajnoczi 0 siblings, 1 reply; 52+ messages in thread From: Michael S. Tsirkin @ 2010-12-12 21:09 UTC (permalink / raw) To: Stefan Hajnoczi; +Cc: qemu-devel On Sun, Dec 12, 2010 at 10:56:34PM +0200, Michael S. Tsirkin wrote: > On Sun, Dec 12, 2010 at 10:42:28PM +0200, Michael S. Tsirkin wrote: > > On Sun, Dec 12, 2010 at 10:41:28PM +0200, Michael S. Tsirkin wrote: > > > On Sun, Dec 12, 2010 at 03:02:04PM +0000, Stefan Hajnoczi wrote: > > > > See below for the v5 changelog. > > > > > > > > Due to lack of connectivity I am sending from GMail. Git should retain my > > > > stefanha@linux.vnet.ibm.com From address. > > > > > > > > Virtqueue notify is currently handled synchronously in userspace virtio. This > > > > prevents the vcpu from executing guest code while hardware emulation code > > > > handles the notify. > > > > > > > > On systems that support KVM, the ioeventfd mechanism can be used to make > > > > virtqueue notify a lightweight exit by deferring hardware emulation to the > > > > iothread and allowing the VM to continue execution. This model is similar to > > > > how vhost receives virtqueue notifies. > > > > > > > > The result of this change is improved performance for userspace virtio devices. > > > > Virtio-blk throughput increases especially for multithreaded scenarios and > > > > virtio-net transmit throughput increases substantially. > > > > > > Interestingly, I see decreased throughput for small message > > > host to get netperf runs. 
> > > > > > The command that I used was: > > > netperf -H $vguest -- -m 200 > > > > > > And the results are: > > > - with ioeventfd=off > > > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 11.0.0.104 (11.0.0.104) port 0 AF_INET : demo > > > Recv Send Send Utilization Service Demand > > > Socket Socket Message Elapsed Send Recv Send Recv > > > Size Size Size Time Throughput local remote local remote > > > bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB > > > > > > 87380 16384 200 10.00 3035.48 15.50 99.30 6.695 2.680 > > > > > > - with ioeventfd=on > > > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 11.0.0.104 (11.0.0.104) port 0 AF_INET : demo > > > Recv Send Send Utilization Service Demand > > > Socket Socket Message Elapsed Send Recv Send Recv > > > Size Size Size Time Throughput local remote local remote > > > bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB > > > > > > 87380 16384 200 10.00 1770.95 18.16 51.65 13.442 2.389 > > > > > > > > > Do you see this behaviour too? > > > > Just a note: this is with the patchset ported to qemu-kvm. > > And just another note: the trend is reversed for larged messages, > e.g. with 1.5k messages ioeventfd=on outputforms ioeventfd=off. Another datapoint where I see a regression is with 4000 byte messages for guest to host traffic. ioeventfd=off set_up_server could not establish a listen endpoint for port 12865 with family AF_UNSPEC TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 11.0.0.4 (11.0.0.4) port 0 AF_INET : demo Recv Send Send Utilization Service Demand Socket Socket Message Elapsed Send Recv Send Recv Size Size Size Time Throughput local remote local remote bytes bytes bytes secs. 
10^6bits/s  % S      % S      us/KB   us/KB

87380  16384   4000    10.00      7717.56   98.80    15.11    1.049   2.566

ioeventfd=on
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 11.0.0.4 (11.0.0.4) port 0 AF_INET : demo
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

87380  16384   4000    10.00      3965.86   87.69    15.29    1.811   5.055

-- 
MST

^ permalink raw reply	[flat|nested] 52+ messages in thread
* [Qemu-devel] Re: [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify 2010-12-12 21:09 ` Michael S. Tsirkin @ 2010-12-13 10:24 ` Stefan Hajnoczi 2010-12-13 10:38 ` Michael S. Tsirkin 0 siblings, 1 reply; 52+ messages in thread From: Stefan Hajnoczi @ 2010-12-13 10:24 UTC (permalink / raw) To: Michael S. Tsirkin; +Cc: qemu-devel On Sun, Dec 12, 2010 at 9:09 PM, Michael S. Tsirkin <mst@redhat.com> wrote: > On Sun, Dec 12, 2010 at 10:56:34PM +0200, Michael S. Tsirkin wrote: >> On Sun, Dec 12, 2010 at 10:42:28PM +0200, Michael S. Tsirkin wrote: >> > On Sun, Dec 12, 2010 at 10:41:28PM +0200, Michael S. Tsirkin wrote: >> > > On Sun, Dec 12, 2010 at 03:02:04PM +0000, Stefan Hajnoczi wrote: >> > > > See below for the v5 changelog. >> > > > >> > > > Due to lack of connectivity I am sending from GMail. Git should retain my >> > > > stefanha@linux.vnet.ibm.com From address. >> > > > >> > > > Virtqueue notify is currently handled synchronously in userspace virtio. This >> > > > prevents the vcpu from executing guest code while hardware emulation code >> > > > handles the notify. >> > > > >> > > > On systems that support KVM, the ioeventfd mechanism can be used to make >> > > > virtqueue notify a lightweight exit by deferring hardware emulation to the >> > > > iothread and allowing the VM to continue execution. This model is similar to >> > > > how vhost receives virtqueue notifies. >> > > > >> > > > The result of this change is improved performance for userspace virtio devices. >> > > > Virtio-blk throughput increases especially for multithreaded scenarios and >> > > > virtio-net transmit throughput increases substantially. >> > > >> > > Interestingly, I see decreased throughput for small message >> > > host to get netperf runs. 
>> > > >> > > The command that I used was: >> > > netperf -H $vguest -- -m 200 >> > > >> > > And the results are: >> > > - with ioeventfd=off >> > > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 11.0.0.104 (11.0.0.104) port 0 AF_INET : demo >> > > Recv Send Send Utilization Service Demand >> > > Socket Socket Message Elapsed Send Recv Send Recv >> > > Size Size Size Time Throughput local remote local remote >> > > bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB >> > > >> > > 87380 16384 200 10.00 3035.48 15.50 99.30 6.695 2.680 >> > > >> > > - with ioeventfd=on >> > > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 11.0.0.104 (11.0.0.104) port 0 AF_INET : demo >> > > Recv Send Send Utilization Service Demand >> > > Socket Socket Message Elapsed Send Recv Send Recv >> > > Size Size Size Time Throughput local remote local remote >> > > bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB >> > > >> > > 87380 16384 200 10.00 1770.95 18.16 51.65 13.442 2.389 >> > > >> > > >> > > Do you see this behaviour too? >> > >> > Just a note: this is with the patchset ported to qemu-kvm. >> >> And just another note: the trend is reversed for larged messages, >> e.g. with 1.5k messages ioeventfd=on outputforms ioeventfd=off. > > Another datapoint where I see a regression is with 4000 byte messages > for guest to host traffic. > > ioeventfd=off > set_up_server could not establish a listen endpoint for port 12865 with family AF_UNSPEC > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 11.0.0.4 (11.0.0.4) port 0 AF_INET : demo > Recv Send Send Utilization Service Demand > Socket Socket Message Elapsed Send Recv Send Recv > Size Size Size Time Throughput local remote local remote > bytes bytes bytes secs. 
> 10^6bits/s  % S      % S      us/KB   us/KB
> 
> 87380  16384   4000    10.00      7717.56   98.80    15.11    1.049   2.566
> 
> ioeventfd=on
> TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 11.0.0.4 (11.0.0.4) port 0 AF_INET : demo
> Recv   Send    Send                          Utilization       Service Demand
> Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
> Size   Size    Size     Time     Throughput  local    remote   local   remote
> bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
> 
> 87380  16384   4000    10.00      3965.86   87.69    15.29    1.811   5.055

Interesting.  I posted the following results in an earlier version of
this patch:

"Sridhar Samudrala <sri@us.ibm.com> collected the following data for
virtio-net with 2.6.36-rc1 on the host and 2.6.34 on the guest.

Guest to Host TCP_STREAM throughput(Mb/sec)
-------------------------------------------
Msg Size   vhost-net   virtio-net   virtio-net/ioeventfd
65536      12755       6430         7590
16384       8499       3084         5764
 4096       4723       1578         3659"

Here we got a throughput improvement where you got a regression.  Your
virtio-net ioeventfd=off throughput is much higher than what we got
(different hardware and configuration, but still I didn't know that
virtio-net reaches 7 Gbit/s!).

I have focussed on the block side of things.  Any thoughts about the
virtio-net performance we're seeing?

" 1024       1827        981         2060

Host to Guest TCP_STREAM throughput(Mb/sec)
-------------------------------------------
Msg Size   vhost-net   virtio-net   virtio-net/ioeventfd
65536      11156       5790         5853
16384      10787       5575         5691
 4096      10452       5556         4277
 1024       4437       3671         5277

Guest to Host TCP_RR latency(transactions/sec)
----------------------------------------------
Msg Size   vhost-net   virtio-net   virtio-net/ioeventfd
1           9903       3459         3425
4096        7185       1931         1899
16384       6108       2102         1923
65536       3161       1610         1744"

I'll also run the netperf tests you posted to check what I get.

Stefan

^ permalink raw reply	[flat|nested] 52+ messages in thread
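One way to read Sridhar's guest-to-host table is as the relative gain of ioeventfd over plain userspace virtio-net at each message size. A quick sketch, with the data transcribed from the table above:

```python
# msg size -> (vhost-net, virtio-net, virtio-net/ioeventfd), Mb/sec
rows = {
    65536: (12755, 6430, 7590),
    16384: (8499, 3084, 5764),
    4096:  (4723, 1578, 3659),
    1024:  (1827,  981, 2060),
}
for size, (vhost, plain, ioevent) in sorted(rows.items()):
    # gain peaks around mid-size messages (4096), not at the extremes
    print(f"{size:6d}: ioeventfd is {ioevent / plain:.2f}x plain virtio-net")
```

This is the opposite of MST's result, where small messages regressed with ioeventfd=on, which is what makes the discrepancy interesting.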
* [Qemu-devel] Re: [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify 2010-12-13 10:24 ` Stefan Hajnoczi @ 2010-12-13 10:38 ` Michael S. Tsirkin 2010-12-13 13:11 ` Stefan Hajnoczi 0 siblings, 1 reply; 52+ messages in thread From: Michael S. Tsirkin @ 2010-12-13 10:38 UTC (permalink / raw) To: Stefan Hajnoczi; +Cc: qemu-devel On Mon, Dec 13, 2010 at 10:24:51AM +0000, Stefan Hajnoczi wrote: > On Sun, Dec 12, 2010 at 9:09 PM, Michael S. Tsirkin <mst@redhat.com> wrote: > > On Sun, Dec 12, 2010 at 10:56:34PM +0200, Michael S. Tsirkin wrote: > >> On Sun, Dec 12, 2010 at 10:42:28PM +0200, Michael S. Tsirkin wrote: > >> > On Sun, Dec 12, 2010 at 10:41:28PM +0200, Michael S. Tsirkin wrote: > >> > > On Sun, Dec 12, 2010 at 03:02:04PM +0000, Stefan Hajnoczi wrote: > >> > > > See below for the v5 changelog. > >> > > > > >> > > > Due to lack of connectivity I am sending from GMail. Git should retain my > >> > > > stefanha@linux.vnet.ibm.com From address. > >> > > > > >> > > > Virtqueue notify is currently handled synchronously in userspace virtio. This > >> > > > prevents the vcpu from executing guest code while hardware emulation code > >> > > > handles the notify. > >> > > > > >> > > > On systems that support KVM, the ioeventfd mechanism can be used to make > >> > > > virtqueue notify a lightweight exit by deferring hardware emulation to the > >> > > > iothread and allowing the VM to continue execution. This model is similar to > >> > > > how vhost receives virtqueue notifies. > >> > > > > >> > > > The result of this change is improved performance for userspace virtio devices. > >> > > > Virtio-blk throughput increases especially for multithreaded scenarios and > >> > > > virtio-net transmit throughput increases substantially. > >> > > > >> > > Interestingly, I see decreased throughput for small message > >> > > host to get netperf runs. 
> >> > > > >> > > The command that I used was: > >> > > netperf -H $vguest -- -m 200 > >> > > > >> > > And the results are: > >> > > - with ioeventfd=off > >> > > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 11.0.0.104 (11.0.0.104) port 0 AF_INET : demo > >> > > Recv Send Send Utilization Service Demand > >> > > Socket Socket Message Elapsed Send Recv Send Recv > >> > > Size Size Size Time Throughput local remote local remote > >> > > bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB > >> > > > >> > > 87380 16384 200 10.00 3035.48 15.50 99.30 6.695 2.680 > >> > > > >> > > - with ioeventfd=on > >> > > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 11.0.0.104 (11.0.0.104) port 0 AF_INET : demo > >> > > Recv Send Send Utilization Service Demand > >> > > Socket Socket Message Elapsed Send Recv Send Recv > >> > > Size Size Size Time Throughput local remote local remote > >> > > bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB > >> > > > >> > > 87380 16384 200 10.00 1770.95 18.16 51.65 13.442 2.389 > >> > > > >> > > > >> > > Do you see this behaviour too? > >> > > >> > Just a note: this is with the patchset ported to qemu-kvm. > >> > >> And just another note: the trend is reversed for larged messages, > >> e.g. with 1.5k messages ioeventfd=on outputforms ioeventfd=off. > > > > Another datapoint where I see a regression is with 4000 byte messages > > for guest to host traffic. > > > > ioeventfd=off > > set_up_server could not establish a listen endpoint for port 12865 with family AF_UNSPEC > > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 11.0.0.4 (11.0.0.4) port 0 AF_INET : demo > > Recv Send Send Utilization Service Demand > > Socket Socket Message Elapsed Send Recv Send Recv > > Size Size Size Time Throughput local remote local remote > > bytes bytes bytes secs. 
10^6bits/s % S % S us/KB us/KB > > > > 87380 16384 4000 10.00 7717.56 98.80 15.11 1.049 2.566 > > > > ioeventfd=on > > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 11.0.0.4 (11.0.0.4) port 0 AF_INET : demo > > Recv Send Send Utilization Service Demand > > Socket Socket Message Elapsed Send Recv Send Recv > > Size Size Size Time Throughput local remote local remote > > bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB > > > > 87380 16384 4000 10.00 3965.86 87.69 15.29 1.811 5.055 > > Interesting. I posted the following results in an earlier version of > this patch: > > "Sridhar Samudrala <sri@us.ibm.com> collected the following data for > virtio-net with 2.6.36-rc1 on the host and 2.6.34 on the guest. > > Guest to Host TCP_STREAM throughput(Mb/sec) > ------------------------------------------- > Msg Size vhost-net virtio-net virtio-net/ioeventfd > 65536 12755 6430 7590 > 16384 8499 3084 5764 > 4096 4723 1578 3659" > > Here we got a throughput improvement where you got a regression. Your > virtio-net ioeventfd=off throughput is much higher than what we got > (different hardware and configuration, but still I didn't know that > virtio-net reaches 7 Gbit/s!). Which qemu are you running? Mine is upstream qemu-kvm + your patches v4 + my patch to port to qemu-kvm. Are you testing qemu.git? My cpu is Intel(R) Xeon(R) CPU X5560 @ 2.80GHz, I am running without any special flags, so IIRC kvm64 cpu type is emulated. Should really try +x2apic. > I have focussed on the block side of things. Any thoughts about the > virtio-net performance we're seeing? > > " 1024 1827 981 2060 I tried 1.5k, I am getting about 3000 guest to host, but in my testing I get about 2000 without ioeventfd as well. 
> Host to Guest TCP_STREAM throughput(Mb/sec) > ------------------------------------------- > Msg Size vhost-net virtio-net virtio-net/ioeventfd > 65536 11156 5790 5853 > 16384 10787 5575 5691 > 4096 10452 5556 4277 > 1024 4437 3671 5277 > > Guest to Host TCP_RR latency(transactions/sec) > ---------------------------------------------- > > Msg Size vhost-net virtio-net virtio-net/ioeventfd > 1 9903 3459 3425 > 4096 7185 1931 1899 > 16384 6108 2102 1923 > 65536 3161 1610 1744" > > I'll also run the netperf tests you posted to check what I get. > > Stefan ^ permalink raw reply [flat|nested] 52+ messages in thread
* [Qemu-devel] Re: [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify
  2010-12-13 10:38             ` Michael S. Tsirkin
@ 2010-12-13 13:11               ` Stefan Hajnoczi
  2010-12-13 13:35                 ` Michael S. Tsirkin
  0 siblings, 1 reply; 52+ messages in thread
From: Stefan Hajnoczi @ 2010-12-13 13:11 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: qemu-devel

Fresh results:

192.168.0.1 - host (runs netperf)
192.168.0.2 - guest (runs netserver)

host$ src/netperf -H 192.168.0.2 -- -m 200

ioeventfd=on
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.0.2
(192.168.0.2) port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

87380  16384    200    10.00      1759.25

ioeventfd=off
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.0.2
(192.168.0.2) port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

87380  16384    200    10.00      1757.15

The results vary approx +/- 3% between runs.

Invocation:
$ x86_64-softmmu/qemu-system-x86_64 -m 4096 -enable-kvm -netdev
type=tap,id=net0,ifname=tap0,script=no,downscript=no -device
virtio-net-pci,netdev=net0,ioeventfd=on|off -vnc :0 -drive
if=virtio,cache=none,file=$HOME/rhel6-autobench-raw.img

I am running qemu.git with v5 patches, based off
36888c6335422f07bbc50bf3443a39f24b90c7c6.

Host:
1 Quad-Core AMD Opteron(tm) Processor 2350 @ 2 GHz
8 GB RAM
RHEL 6 host

Next I will try the patches on latest qemu-kvm.git

Stefan

^ permalink raw reply	[flat|nested] 52+ messages in thread
* [Qemu-devel] Re: [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify 2010-12-13 13:11 ` Stefan Hajnoczi @ 2010-12-13 13:35 ` Michael S. Tsirkin 2010-12-13 13:36 ` Michael S. Tsirkin 0 siblings, 1 reply; 52+ messages in thread From: Michael S. Tsirkin @ 2010-12-13 13:35 UTC (permalink / raw) To: Stefan Hajnoczi; +Cc: qemu-devel On Mon, Dec 13, 2010 at 01:11:27PM +0000, Stefan Hajnoczi wrote: > Fresh results: > > 192.168.0.1 - host (runs netperf) > 192.168.0.2 - guest (runs netserver) > > host$ src/netperf -H 192.168.0.2 -- -m 200 > > ioeventfd=on > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.0.2 > (192.168.0.2) port 0 AF_INET > Recv Send Send > Socket Socket Message Elapsed > Size Size Size Time Throughput > bytes bytes bytes secs. 10^6bits/sec > 87380 16384 200 10.00 1759.25 > > ioeventfd=off > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.0.2 > (192.168.0.2) port 0 AF_INET > Recv Send Send > Socket Socket Message Elapsed > Size Size Size Time Throughput > bytes bytes bytes secs. 10^6bits/sec > > 87380 16384 200 10.00 1757.15 > > The results vary approx +/- 3% between runs. > > Invocation: > $ x86_64-softmmu/qemu-system-x86_64 -m 4096 -enable-kvm -netdev > type=tap,id=net0,ifname=tap0,script=no,downscript=no -device > virtio-net-pci,netdev=net0,ioeventfd=on|off -vnc :0 -drive > if=virtio,cache=none,file=$HOME/rhel6-autobench-raw.img > > I am running qemu.git with v5 patches, based off > 36888c6335422f07bbc50bf3443a39f24b90c7c6. > > Host: > 1 Quad-Core AMD Opteron(tm) Processor 2350 @ 2 GHz > 8 GB RAM > RHEL 6 host > > Next I will try the patches on latest qemu-kvm.git > > Stefan One interesting thing is that I put virtio-net earlier on command line. Since iobus scan is linear for now, I wonder if this might possibly matter. -- MST ^ permalink raw reply [flat|nested] 52+ messages in thread
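The concern about linear iobus scanning can be made concrete with a toy model: if eventfd-backed regions are probed in registration order on every exit, a notify address registered later costs more probes. This is illustrative only; the addresses and device names below are made up, and the real lookup lives in the kernel's io_bus code:

```python
class IOBus:
    """Toy model of a linear I/O bus scan: every registered region
    is checked in registration order until one matches."""
    def __init__(self):
        self.devs = []  # (start, length, name) in registration order

    def register(self, start, length, name):
        self.devs.append((start, length, name))

    def lookup(self, addr):
        probes = 0
        for start, length, name in self.devs:
            probes += 1
            if start <= addr < start + length:
                return name, probes
        return None, probes

bus = IOBus()
bus.register(0xc000, 64, "virtio-blk notify")   # registered first (e.g. -drive first)
bus.register(0xc040, 64, "virtio-net notify")   # registered second
print(bus.lookup(0xc050))  # ('virtio-net notify', 2) -- later device, more probes
```

With only two devices the difference is negligible, which is consistent with Stefan finding no measurable change when swapping -drive and -netdev; the cost would only matter with many registered regions.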
* [Qemu-devel] Re: [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify 2010-12-13 13:35 ` Michael S. Tsirkin @ 2010-12-13 13:36 ` Michael S. Tsirkin 2010-12-13 14:06 ` Stefan Hajnoczi 2010-12-13 15:27 ` Stefan Hajnoczi 0 siblings, 2 replies; 52+ messages in thread From: Michael S. Tsirkin @ 2010-12-13 13:36 UTC (permalink / raw) To: Stefan Hajnoczi; +Cc: qemu-devel On Mon, Dec 13, 2010 at 03:35:38PM +0200, Michael S. Tsirkin wrote: > On Mon, Dec 13, 2010 at 01:11:27PM +0000, Stefan Hajnoczi wrote: > > Fresh results: > > > > 192.168.0.1 - host (runs netperf) > > 192.168.0.2 - guest (runs netserver) > > > > host$ src/netperf -H 192.168.0.2 -- -m 200 > > > > ioeventfd=on > > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.0.2 > > (192.168.0.2) port 0 AF_INET > > Recv Send Send > > Socket Socket Message Elapsed > > Size Size Size Time Throughput > > bytes bytes bytes secs. 10^6bits/sec > > 87380 16384 200 10.00 1759.25 > > > > ioeventfd=off > > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.0.2 > > (192.168.0.2) port 0 AF_INET > > Recv Send Send > > Socket Socket Message Elapsed > > Size Size Size Time Throughput > > bytes bytes bytes secs. 10^6bits/sec > > > > 87380 16384 200 10.00 1757.15 > > > > The results vary approx +/- 3% between runs. > > > > Invocation: > > $ x86_64-softmmu/qemu-system-x86_64 -m 4096 -enable-kvm -netdev > > type=tap,id=net0,ifname=tap0,script=no,downscript=no -device > > virtio-net-pci,netdev=net0,ioeventfd=on|off -vnc :0 -drive > > if=virtio,cache=none,file=$HOME/rhel6-autobench-raw.img > > > > I am running qemu.git with v5 patches, based off > > 36888c6335422f07bbc50bf3443a39f24b90c7c6. > > > > Host: > > 1 Quad-Core AMD Opteron(tm) Processor 2350 @ 2 GHz > > 8 GB RAM > > RHEL 6 host > > > > Next I will try the patches on latest qemu-kvm.git > > > > Stefan > > One interesting thing is that I put virtio-net earlier on > command line. Sorry I mean I put it after disk, you put it before. 
> Since iobus scan is linear for now, I wonder if this might > possibly matter. > > -- > MST ^ permalink raw reply [flat|nested] 52+ messages in thread
* [Qemu-devel] Re: [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify
  2010-12-13 13:36                 ` Michael S. Tsirkin
@ 2010-12-13 14:06                   ` Stefan Hajnoczi
  0 siblings, 0 replies; 52+ messages in thread
From: Stefan Hajnoczi @ 2010-12-13 14:06 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: qemu-devel

Here are my results on qemu-kvm.git:

ioeventfd=on
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.0.2
(192.168.0.2) port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

87380  16384    200    10.00      1203.44

ioeventfd=off
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.0.2
(192.168.0.2) port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

87380  16384    200    10.00      1677.96

This is a 30% degradation that wasn't visible on qemu.git.

Same host.  qemu-kvm.git with v5 patches based on
cb1983b8809d0e06a97384a40bad1194a32fc814.

Stefan

^ permalink raw reply	[flat|nested] 52+ messages in thread
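The quoted degradation can be checked directly from the two throughput figures (a trivial sketch; numbers copied from the runs above):

```python
off_mbit, on_mbit = 1677.96, 1203.44  # ioeventfd=off vs ioeventfd=on, Mbit/s
drop = (off_mbit - on_mbit) / off_mbit * 100
print(f"ioeventfd=on is {drop:.1f}% slower")  # ~28.3%, i.e. the "30% degradation"
```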
* [Qemu-devel] Re: [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify 2010-12-13 13:36 ` Michael S. Tsirkin 2010-12-13 14:06 ` Stefan Hajnoczi @ 2010-12-13 15:27 ` Stefan Hajnoczi 2010-12-13 16:00 ` Michael S. Tsirkin 2010-12-13 16:12 ` Michael S. Tsirkin 1 sibling, 2 replies; 52+ messages in thread From: Stefan Hajnoczi @ 2010-12-13 15:27 UTC (permalink / raw) To: Michael S. Tsirkin; +Cc: qemu-devel On Mon, Dec 13, 2010 at 1:36 PM, Michael S. Tsirkin <mst@redhat.com> wrote: > On Mon, Dec 13, 2010 at 03:35:38PM +0200, Michael S. Tsirkin wrote: >> On Mon, Dec 13, 2010 at 01:11:27PM +0000, Stefan Hajnoczi wrote: >> > Fresh results: >> > >> > 192.168.0.1 - host (runs netperf) >> > 192.168.0.2 - guest (runs netserver) >> > >> > host$ src/netperf -H 192.168.0.2 -- -m 200 >> > >> > ioeventfd=on >> > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.0.2 >> > (192.168.0.2) port 0 AF_INET >> > Recv Send Send >> > Socket Socket Message Elapsed >> > Size Size Size Time Throughput >> > bytes bytes bytes secs. 10^6bits/sec >> > 87380 16384 200 10.00 1759.25 >> > >> > ioeventfd=off >> > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.0.2 >> > (192.168.0.2) port 0 AF_INET >> > Recv Send Send >> > Socket Socket Message Elapsed >> > Size Size Size Time Throughput >> > bytes bytes bytes secs. 10^6bits/sec >> > >> > 87380 16384 200 10.00 1757.15 >> > >> > The results vary approx +/- 3% between runs. >> > >> > Invocation: >> > $ x86_64-softmmu/qemu-system-x86_64 -m 4096 -enable-kvm -netdev >> > type=tap,id=net0,ifname=tap0,script=no,downscript=no -device >> > virtio-net-pci,netdev=net0,ioeventfd=on|off -vnc :0 -drive >> > if=virtio,cache=none,file=$HOME/rhel6-autobench-raw.img >> > >> > I am running qemu.git with v5 patches, based off >> > 36888c6335422f07bbc50bf3443a39f24b90c7c6. 
>> > >> > Host: >> > 1 Quad-Core AMD Opteron(tm) Processor 2350 @ 2 GHz >> > 8 GB RAM >> > RHEL 6 host >> > >> > Next I will try the patches on latest qemu-kvm.git >> > >> > Stefan >> >> One interesting thing is that I put virtio-net earlier on >> command line. > > Sorry I mean I put it after disk, you put it before. I can't find a measurable difference when swapping -drive and -netdev. Can you run the same test with vhost? I assume it still outperforms userspace virtio for small message sizes? I'm interested because that also uses ioeventfd. I am wondering if the iothread differences between qemu.git and qemu-kvm.git can explain the performance results we see. In particular, qemu.git produces the same (high) throughput whether ioeventfd is on or off. Stefan ^ permalink raw reply [flat|nested] 52+ messages in thread
* [Qemu-devel] Re: [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify 2010-12-13 15:27 ` Stefan Hajnoczi @ 2010-12-13 16:00 ` Michael S. Tsirkin 2010-12-13 16:29 ` Stefan Hajnoczi 2010-12-13 16:12 ` Michael S. Tsirkin 1 sibling, 1 reply; 52+ messages in thread From: Michael S. Tsirkin @ 2010-12-13 16:00 UTC (permalink / raw) To: Stefan Hajnoczi; +Cc: qemu-devel On Mon, Dec 13, 2010 at 03:27:06PM +0000, Stefan Hajnoczi wrote: > On Mon, Dec 13, 2010 at 1:36 PM, Michael S. Tsirkin <mst@redhat.com> wrote: > > On Mon, Dec 13, 2010 at 03:35:38PM +0200, Michael S. Tsirkin wrote: > >> On Mon, Dec 13, 2010 at 01:11:27PM +0000, Stefan Hajnoczi wrote: > >> > Fresh results: > >> > > >> > 192.168.0.1 - host (runs netperf) > >> > 192.168.0.2 - guest (runs netserver) > >> > > >> > host$ src/netperf -H 192.168.0.2 -- -m 200 > >> > > >> > ioeventfd=on > >> > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.0.2 > >> > (192.168.0.2) port 0 AF_INET > >> > Recv Send Send > >> > Socket Socket Message Elapsed > >> > Size Size Size Time Throughput > >> > bytes bytes bytes secs. 10^6bits/sec > >> > 87380 16384 200 10.00 1759.25 > >> > > >> > ioeventfd=off > >> > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.0.2 > >> > (192.168.0.2) port 0 AF_INET > >> > Recv Send Send > >> > Socket Socket Message Elapsed > >> > Size Size Size Time Throughput > >> > bytes bytes bytes secs. 10^6bits/sec > >> > > >> > 87380 16384 200 10.00 1757.15 > >> > > >> > The results vary approx +/- 3% between runs. > >> > > >> > Invocation: > >> > $ x86_64-softmmu/qemu-system-x86_64 -m 4096 -enable-kvm -netdev > >> > type=tap,id=net0,ifname=tap0,script=no,downscript=no -device > >> > virtio-net-pci,netdev=net0,ioeventfd=on|off -vnc :0 -drive > >> > if=virtio,cache=none,file=$HOME/rhel6-autobench-raw.img > >> > > >> > I am running qemu.git with v5 patches, based off > >> > 36888c6335422f07bbc50bf3443a39f24b90c7c6. 
> >> > > >> > Host: > >> > 1 Quad-Core AMD Opteron(tm) Processor 2350 @ 2 GHz > >> > 8 GB RAM > >> > RHEL 6 host > >> > > >> > Next I will try the patches on latest qemu-kvm.git > >> > > >> > Stefan > >> > >> One interesting thing is that I put virtio-net earlier on > >> command line. > > > > Sorry I mean I put it after disk, you put it before. > > I can't find a measurable difference when swapping -drive and -netdev. > > Can you run the same test with vhost? I assume it still outperforms > userspace virtio for small message sizes? I'm interested because that > also uses ioeventfd. Seems to work same as ioeventfd. > I am wondering if the iothread differences between qemu.git and > qemu-kvm.git can explain the performance results we see. In > particular, qemu.git produces the same (high) throughput whether > ioeventfd is on or off. > > Stefan ^ permalink raw reply [flat|nested] 52+ messages in thread
* [Qemu-devel] Re: [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify 2010-12-13 16:00 ` Michael S. Tsirkin @ 2010-12-13 16:29 ` Stefan Hajnoczi 2010-12-13 16:30 ` Michael S. Tsirkin 0 siblings, 1 reply; 52+ messages in thread From: Stefan Hajnoczi @ 2010-12-13 16:29 UTC (permalink / raw) To: Michael S. Tsirkin; +Cc: qemu-devel On Mon, Dec 13, 2010 at 4:00 PM, Michael S. Tsirkin <mst@redhat.com> wrote: > On Mon, Dec 13, 2010 at 03:27:06PM +0000, Stefan Hajnoczi wrote: >> On Mon, Dec 13, 2010 at 1:36 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >> > On Mon, Dec 13, 2010 at 03:35:38PM +0200, Michael S. Tsirkin wrote: >> >> On Mon, Dec 13, 2010 at 01:11:27PM +0000, Stefan Hajnoczi wrote: >> >> > Fresh results: >> >> > >> >> > 192.168.0.1 - host (runs netperf) >> >> > 192.168.0.2 - guest (runs netserver) >> >> > >> >> > host$ src/netperf -H 192.168.0.2 -- -m 200 >> >> > >> >> > ioeventfd=on >> >> > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.0.2 >> >> > (192.168.0.2) port 0 AF_INET >> >> > Recv Send Send >> >> > Socket Socket Message Elapsed >> >> > Size Size Size Time Throughput >> >> > bytes bytes bytes secs. 10^6bits/sec >> >> > 87380 16384 200 10.00 1759.25 >> >> > >> >> > ioeventfd=off >> >> > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.0.2 >> >> > (192.168.0.2) port 0 AF_INET >> >> > Recv Send Send >> >> > Socket Socket Message Elapsed >> >> > Size Size Size Time Throughput >> >> > bytes bytes bytes secs. 10^6bits/sec >> >> > >> >> > 87380 16384 200 10.00 1757.15 >> >> > >> >> > The results vary approx +/- 3% between runs. 
>> >> > >> >> > Invocation: >> >> > $ x86_64-softmmu/qemu-system-x86_64 -m 4096 -enable-kvm -netdev >> >> > type=tap,id=net0,ifname=tap0,script=no,downscript=no -device >> >> > virtio-net-pci,netdev=net0,ioeventfd=on|off -vnc :0 -drive >> >> > if=virtio,cache=none,file=$HOME/rhel6-autobench-raw.img >> >> > >> >> > I am running qemu.git with v5 patches, based off >> >> > 36888c6335422f07bbc50bf3443a39f24b90c7c6. >> >> > >> >> > Host: >> >> > 1 Quad-Core AMD Opteron(tm) Processor 2350 @ 2 GHz >> >> > 8 GB RAM >> >> > RHEL 6 host >> >> > >> >> > Next I will try the patches on latest qemu-kvm.git >> >> > >> >> > Stefan >> >> >> >> One interesting thing is that I put virtio-net earlier on >> >> command line. >> > >> > Sorry I mean I put it after disk, you put it before. >> >> I can't find a measurable difference when swapping -drive and -netdev. >> >> Can you run the same test with vhost? I assume it still outperforms >> userspace virtio for small message sizes? I'm interested because that >> also uses ioeventfd. > > Seems to work same as ioeventfd. vhost performs the same as ioeventfd=on? And that means slower than ioeventfd=off? Stefan ^ permalink raw reply [flat|nested] 52+ messages in thread
* [Qemu-devel] Re: [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify 2010-12-13 16:29 ` Stefan Hajnoczi @ 2010-12-13 16:30 ` Michael S. Tsirkin 0 siblings, 0 replies; 52+ messages in thread From: Michael S. Tsirkin @ 2010-12-13 16:30 UTC (permalink / raw) To: Stefan Hajnoczi; +Cc: qemu-devel On Mon, Dec 13, 2010 at 04:29:58PM +0000, Stefan Hajnoczi wrote: > On Mon, Dec 13, 2010 at 4:00 PM, Michael S. Tsirkin <mst@redhat.com> wrote: > > On Mon, Dec 13, 2010 at 03:27:06PM +0000, Stefan Hajnoczi wrote: > >> On Mon, Dec 13, 2010 at 1:36 PM, Michael S. Tsirkin <mst@redhat.com> wrote: > >> > On Mon, Dec 13, 2010 at 03:35:38PM +0200, Michael S. Tsirkin wrote: > >> >> On Mon, Dec 13, 2010 at 01:11:27PM +0000, Stefan Hajnoczi wrote: > >> >> > Fresh results: > >> >> > > >> >> > 192.168.0.1 - host (runs netperf) > >> >> > 192.168.0.2 - guest (runs netserver) > >> >> > > >> >> > host$ src/netperf -H 192.168.0.2 -- -m 200 > >> >> > > >> >> > ioeventfd=on > >> >> > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.0.2 > >> >> > (192.168.0.2) port 0 AF_INET > >> >> > Recv Send Send > >> >> > Socket Socket Message Elapsed > >> >> > Size Size Size Time Throughput > >> >> > bytes bytes bytes secs. 10^6bits/sec > >> >> > 87380 16384 200 10.00 1759.25 > >> >> > > >> >> > ioeventfd=off > >> >> > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.0.2 > >> >> > (192.168.0.2) port 0 AF_INET > >> >> > Recv Send Send > >> >> > Socket Socket Message Elapsed > >> >> > Size Size Size Time Throughput > >> >> > bytes bytes bytes secs. 10^6bits/sec > >> >> > > >> >> > 87380 16384 200 10.00 1757.15 > >> >> > > >> >> > The results vary approx +/- 3% between runs. 
> >> >> > > >> >> > Invocation: > >> >> > $ x86_64-softmmu/qemu-system-x86_64 -m 4096 -enable-kvm -netdev > >> >> > type=tap,id=net0,ifname=tap0,script=no,downscript=no -device > >> >> > virtio-net-pci,netdev=net0,ioeventfd=on|off -vnc :0 -drive > >> >> > if=virtio,cache=none,file=$HOME/rhel6-autobench-raw.img > >> >> > > >> >> > I am running qemu.git with v5 patches, based off > >> >> > 36888c6335422f07bbc50bf3443a39f24b90c7c6. > >> >> > > >> >> > Host: > >> >> > 1 Quad-Core AMD Opteron(tm) Processor 2350 @ 2 GHz > >> >> > 8 GB RAM > >> >> > RHEL 6 host > >> >> > > >> >> > Next I will try the patches on latest qemu-kvm.git > >> >> > > >> >> > Stefan > >> >> > >> >> One interesting thing is that I put virtio-net earlier on > >> >> command line. > >> > > >> > Sorry I mean I put it after disk, you put it before. > >> > >> I can't find a measurable difference when swapping -drive and -netdev. > >> > >> Can you run the same test with vhost? I assume it still outperforms > >> userspace virtio for small message sizes? I'm interested because that > >> also uses ioeventfd. > > > > Seems to work same as ioeventfd. > > vhost performs the same as ioeventfd=on? And that means slower than > ioeventfd=off? > > Stefan Yes. -- MST ^ permalink raw reply [flat|nested] 52+ messages in thread
* [Qemu-devel] Re: [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify 2010-12-13 15:27 ` Stefan Hajnoczi 2010-12-13 16:00 ` Michael S. Tsirkin @ 2010-12-13 16:12 ` Michael S. Tsirkin 2010-12-13 16:28 ` Stefan Hajnoczi 1 sibling, 1 reply; 52+ messages in thread From: Michael S. Tsirkin @ 2010-12-13 16:12 UTC (permalink / raw) To: Stefan Hajnoczi; +Cc: qemu-devel On Mon, Dec 13, 2010 at 03:27:06PM +0000, Stefan Hajnoczi wrote: > On Mon, Dec 13, 2010 at 1:36 PM, Michael S. Tsirkin <mst@redhat.com> wrote: > > On Mon, Dec 13, 2010 at 03:35:38PM +0200, Michael S. Tsirkin wrote: > >> On Mon, Dec 13, 2010 at 01:11:27PM +0000, Stefan Hajnoczi wrote: > >> > Fresh results: > >> > > >> > 192.168.0.1 - host (runs netperf) > >> > 192.168.0.2 - guest (runs netserver) > >> > > >> > host$ src/netperf -H 192.168.0.2 -- -m 200 > >> > > >> > ioeventfd=on > >> > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.0.2 > >> > (192.168.0.2) port 0 AF_INET > >> > Recv Send Send > >> > Socket Socket Message Elapsed > >> > Size Size Size Time Throughput > >> > bytes bytes bytes secs. 10^6bits/sec > >> > 87380 16384 200 10.00 1759.25 > >> > > >> > ioeventfd=off > >> > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.0.2 > >> > (192.168.0.2) port 0 AF_INET > >> > Recv Send Send > >> > Socket Socket Message Elapsed > >> > Size Size Size Time Throughput > >> > bytes bytes bytes secs. 10^6bits/sec > >> > > >> > 87380 16384 200 10.00 1757.15 > >> > > >> > The results vary approx +/- 3% between runs. > >> > > >> > Invocation: > >> > $ x86_64-softmmu/qemu-system-x86_64 -m 4096 -enable-kvm -netdev > >> > type=tap,id=net0,ifname=tap0,script=no,downscript=no -device > >> > virtio-net-pci,netdev=net0,ioeventfd=on|off -vnc :0 -drive > >> > if=virtio,cache=none,file=$HOME/rhel6-autobench-raw.img > >> > > >> > I am running qemu.git with v5 patches, based off > >> > 36888c6335422f07bbc50bf3443a39f24b90c7c6. 
> >> > > >> > Host: > >> > 1 Quad-Core AMD Opteron(tm) Processor 2350 @ 2 GHz > >> > 8 GB RAM > >> > RHEL 6 host > >> > > >> > Next I will try the patches on latest qemu-kvm.git > >> > > >> > Stefan > >> > >> One interesting thing is that I put virtio-net earlier on > >> command line. > > > > Sorry I mean I put it after disk, you put it before. > > I can't find a measurable difference when swapping -drive and -netdev. One other concern I have is that we are apparently using ioeventfd for all VQs. E.g. for virtio-net we probably should not use it for the control VQ - it's a waste of resources. > Can you run the same test with vhost? I assume it still outperforms > userspace virtio for small message sizes? I'm interested because that > also uses ioeventfd. > > I am wondering if the iothread differences between qemu.git and > qemu-kvm.git can explain the performance results we see. In > particular, qemu.git produces the same (high) throughput whether > ioeventfd is on or off. > > Stefan ^ permalink raw reply [flat|nested] 52+ messages in thread
* [Qemu-devel] Re: [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify 2010-12-13 16:12 ` Michael S. Tsirkin @ 2010-12-13 16:28 ` Stefan Hajnoczi 2010-12-13 17:57 ` Stefan Hajnoczi 0 siblings, 1 reply; 52+ messages in thread From: Stefan Hajnoczi @ 2010-12-13 16:28 UTC (permalink / raw) To: Michael S. Tsirkin; +Cc: qemu-devel On Mon, Dec 13, 2010 at 4:12 PM, Michael S. Tsirkin <mst@redhat.com> wrote: > On Mon, Dec 13, 2010 at 03:27:06PM +0000, Stefan Hajnoczi wrote: >> On Mon, Dec 13, 2010 at 1:36 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >> > On Mon, Dec 13, 2010 at 03:35:38PM +0200, Michael S. Tsirkin wrote: >> >> On Mon, Dec 13, 2010 at 01:11:27PM +0000, Stefan Hajnoczi wrote: >> >> > Fresh results: >> >> > >> >> > 192.168.0.1 - host (runs netperf) >> >> > 192.168.0.2 - guest (runs netserver) >> >> > >> >> > host$ src/netperf -H 192.168.0.2 -- -m 200 >> >> > >> >> > ioeventfd=on >> >> > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.0.2 >> >> > (192.168.0.2) port 0 AF_INET >> >> > Recv Send Send >> >> > Socket Socket Message Elapsed >> >> > Size Size Size Time Throughput >> >> > bytes bytes bytes secs. 10^6bits/sec >> >> > 87380 16384 200 10.00 1759.25 >> >> > >> >> > ioeventfd=off >> >> > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.0.2 >> >> > (192.168.0.2) port 0 AF_INET >> >> > Recv Send Send >> >> > Socket Socket Message Elapsed >> >> > Size Size Size Time Throughput >> >> > bytes bytes bytes secs. 10^6bits/sec >> >> > >> >> > 87380 16384 200 10.00 1757.15 >> >> > >> >> > The results vary approx +/- 3% between runs. 
>> >> > >> >> > Invocation: >> >> > $ x86_64-softmmu/qemu-system-x86_64 -m 4096 -enable-kvm -netdev >> >> > type=tap,id=net0,ifname=tap0,script=no,downscript=no -device >> >> > virtio-net-pci,netdev=net0,ioeventfd=on|off -vnc :0 -drive >> >> > if=virtio,cache=none,file=$HOME/rhel6-autobench-raw.img >> >> > >> >> > I am running qemu.git with v5 patches, based off >> >> > 36888c6335422f07bbc50bf3443a39f24b90c7c6. >> >> > >> >> > Host: >> >> > 1 Quad-Core AMD Opteron(tm) Processor 2350 @ 2 GHz >> >> > 8 GB RAM >> >> > RHEL 6 host >> >> > >> >> > Next I will try the patches on latest qemu-kvm.git >> >> > >> >> > Stefan >> >> >> >> One interesting thing is that I put virtio-net earlier on >> >> command line. >> > >> > Sorry I mean I put it after disk, you put it before. >> >> I can't find a measurable difference when swapping -drive and -netdev. > > One other concern I have is that we are apparently using > ioeventfd for all VQs. E.g. for virtio-net we probably should not > use it for the control VQ - it's a waste of resources. One option is a per-device (block, net, etc) bitmap that masks out virtqueues. Is that something you'd like to see? I'm tempted to mask out the RX vq too and see how that affects the qemu-kvm.git specific issue. Stefan ^ permalink raw reply [flat|nested] 52+ messages in thread
* [Qemu-devel] Re: [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify 2010-12-13 16:28 ` Stefan Hajnoczi @ 2010-12-13 17:57 ` Stefan Hajnoczi 2010-12-13 18:52 ` Michael S. Tsirkin 0 siblings, 1 reply; 52+ messages in thread From: Stefan Hajnoczi @ 2010-12-13 17:57 UTC (permalink / raw) To: Michael S. Tsirkin; +Cc: qemu-devel On Mon, Dec 13, 2010 at 4:28 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote: > On Mon, Dec 13, 2010 at 4:12 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >> On Mon, Dec 13, 2010 at 03:27:06PM +0000, Stefan Hajnoczi wrote: >>> On Mon, Dec 13, 2010 at 1:36 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >>> > On Mon, Dec 13, 2010 at 03:35:38PM +0200, Michael S. Tsirkin wrote: >>> >> On Mon, Dec 13, 2010 at 01:11:27PM +0000, Stefan Hajnoczi wrote: >>> >> > Fresh results: >>> >> > >>> >> > 192.168.0.1 - host (runs netperf) >>> >> > 192.168.0.2 - guest (runs netserver) >>> >> > >>> >> > host$ src/netperf -H 192.168.0.2 -- -m 200 >>> >> > >>> >> > ioeventfd=on >>> >> > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.0.2 >>> >> > (192.168.0.2) port 0 AF_INET >>> >> > Recv Send Send >>> >> > Socket Socket Message Elapsed >>> >> > Size Size Size Time Throughput >>> >> > bytes bytes bytes secs. 10^6bits/sec >>> >> > 87380 16384 200 10.00 1759.25 >>> >> > >>> >> > ioeventfd=off >>> >> > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.0.2 >>> >> > (192.168.0.2) port 0 AF_INET >>> >> > Recv Send Send >>> >> > Socket Socket Message Elapsed >>> >> > Size Size Size Time Throughput >>> >> > bytes bytes bytes secs. 10^6bits/sec >>> >> > >>> >> > 87380 16384 200 10.00 1757.15 >>> >> > >>> >> > The results vary approx +/- 3% between runs. 
>>> >> > >>> >> > Invocation: >>> >> > $ x86_64-softmmu/qemu-system-x86_64 -m 4096 -enable-kvm -netdev >>> >> > type=tap,id=net0,ifname=tap0,script=no,downscript=no -device >>> >> > virtio-net-pci,netdev=net0,ioeventfd=on|off -vnc :0 -drive >>> >> > if=virtio,cache=none,file=$HOME/rhel6-autobench-raw.img >>> >> > >>> >> > I am running qemu.git with v5 patches, based off >>> >> > 36888c6335422f07bbc50bf3443a39f24b90c7c6. >>> >> > >>> >> > Host: >>> >> > 1 Quad-Core AMD Opteron(tm) Processor 2350 @ 2 GHz >>> >> > 8 GB RAM >>> >> > RHEL 6 host >>> >> > >>> >> > Next I will try the patches on latest qemu-kvm.git >>> >> > >>> >> > Stefan >>> >> >>> >> One interesting thing is that I put virtio-net earlier on >>> >> command line. >>> > >>> > Sorry I mean I put it after disk, you put it before. >>> >>> I can't find a measurable difference when swapping -drive and -netdev. >> >> One other concern I have is that we are apparently using >> ioeventfd for all VQs. E.g. for virtio-net we probably should not >> use it for the control VQ - it's a waste of resources. > > One option is a per-device (block, net, etc) bitmap that masks out > virtqueues. Is that something you'd like to see? > > I'm tempted to mask out the RX vq too and see how that affects the > qemu-kvm.git specific issue. As expected, the rx virtqueue is involved in the degradation. I enabled ioeventfd only for the TX virtqueue and got the same good results as userspace virtio-net. When I enable only the rx virtqueue, performance decreases as we've seen above. Stefan ^ permalink raw reply [flat|nested] 52+ messages in thread
* [Qemu-devel] Re: [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify 2010-12-13 17:57 ` Stefan Hajnoczi @ 2010-12-13 18:52 ` Michael S. Tsirkin 2010-12-15 11:42 ` Stefan Hajnoczi 0 siblings, 1 reply; 52+ messages in thread From: Michael S. Tsirkin @ 2010-12-13 18:52 UTC (permalink / raw) To: Stefan Hajnoczi; +Cc: qemu-devel On Mon, Dec 13, 2010 at 05:57:28PM +0000, Stefan Hajnoczi wrote: > On Mon, Dec 13, 2010 at 4:28 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote: > > On Mon, Dec 13, 2010 at 4:12 PM, Michael S. Tsirkin <mst@redhat.com> wrote: > >> On Mon, Dec 13, 2010 at 03:27:06PM +0000, Stefan Hajnoczi wrote: > >>> On Mon, Dec 13, 2010 at 1:36 PM, Michael S. Tsirkin <mst@redhat.com> wrote: > >>> > On Mon, Dec 13, 2010 at 03:35:38PM +0200, Michael S. Tsirkin wrote: > >>> >> On Mon, Dec 13, 2010 at 01:11:27PM +0000, Stefan Hajnoczi wrote: > >>> >> > Fresh results: > >>> >> > > >>> >> > 192.168.0.1 - host (runs netperf) > >>> >> > 192.168.0.2 - guest (runs netserver) > >>> >> > > >>> >> > host$ src/netperf -H 192.168.0.2 -- -m 200 > >>> >> > > >>> >> > ioeventfd=on > >>> >> > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.0.2 > >>> >> > (192.168.0.2) port 0 AF_INET > >>> >> > Recv Send Send > >>> >> > Socket Socket Message Elapsed > >>> >> > Size Size Size Time Throughput > >>> >> > bytes bytes bytes secs. 10^6bits/sec > >>> >> > 87380 16384 200 10.00 1759.25 > >>> >> > > >>> >> > ioeventfd=off > >>> >> > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.0.2 > >>> >> > (192.168.0.2) port 0 AF_INET > >>> >> > Recv Send Send > >>> >> > Socket Socket Message Elapsed > >>> >> > Size Size Size Time Throughput > >>> >> > bytes bytes bytes secs. 10^6bits/sec > >>> >> > > >>> >> > 87380 16384 200 10.00 1757.15 > >>> >> > > >>> >> > The results vary approx +/- 3% between runs. 
> >>> >> > > >>> >> > Invocation: > >>> >> > $ x86_64-softmmu/qemu-system-x86_64 -m 4096 -enable-kvm -netdev > >>> >> > type=tap,id=net0,ifname=tap0,script=no,downscript=no -device > >>> >> > virtio-net-pci,netdev=net0,ioeventfd=on|off -vnc :0 -drive > >>> >> > if=virtio,cache=none,file=$HOME/rhel6-autobench-raw.img > >>> >> > > >>> >> > I am running qemu.git with v5 patches, based off > >>> >> > 36888c6335422f07bbc50bf3443a39f24b90c7c6. > >>> >> > > >>> >> > Host: > >>> >> > 1 Quad-Core AMD Opteron(tm) Processor 2350 @ 2 GHz > >>> >> > 8 GB RAM > >>> >> > RHEL 6 host > >>> >> > > >>> >> > Next I will try the patches on latest qemu-kvm.git > >>> >> > > >>> >> > Stefan > >>> >> > >>> >> One interesting thing is that I put virtio-net earlier on > >>> >> command line. > >>> > > >>> > Sorry I mean I put it after disk, you put it before. > >>> > >>> I can't find a measurable difference when swapping -drive and -netdev. > >> > >> One other concern I have is that we are apparently using > >> ioeventfd for all VQs. E.g. for virtio-net we probably should not > >> use it for the control VQ - it's a waste of resources. > > > > One option is a per-device (block, net, etc) bitmap that masks out > > virtqueues. Is that something you'd like to see? > > > > I'm tempted to mask out the RX vq too and see how that affects the > > qemu-kvm.git specific issue. > > As expected, the rx virtqueue is involved in the degradation. I > enabled ioeventfd only for the TX virtqueue and got the same good > results as userspace virtio-net. > > When I enable only the rx virtqueue, performs decreases as we've seen above. > > Stefan Interesting. In particular this implies something's wrong with the queue: we should not normally be getting notifications from rx queue at all. Is it running low on buffers? Does it help to increase the vq size? Any other explanation? -- MST ^ permalink raw reply [flat|nested] 52+ messages in thread
* [Qemu-devel] Re: [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify 2010-12-13 18:52 ` Michael S. Tsirkin @ 2010-12-15 11:42 ` Stefan Hajnoczi 2010-12-15 11:48 ` Stefan Hajnoczi 2010-12-15 12:14 ` Michael S. Tsirkin 0 siblings, 2 replies; 52+ messages in thread From: Stefan Hajnoczi @ 2010-12-15 11:42 UTC (permalink / raw) To: Michael S. Tsirkin; +Cc: qemu-devel On Mon, Dec 13, 2010 at 6:52 PM, Michael S. Tsirkin <mst@redhat.com> wrote: > On Mon, Dec 13, 2010 at 05:57:28PM +0000, Stefan Hajnoczi wrote: >> On Mon, Dec 13, 2010 at 4:28 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote: >> > On Mon, Dec 13, 2010 at 4:12 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >> >> On Mon, Dec 13, 2010 at 03:27:06PM +0000, Stefan Hajnoczi wrote: >> >>> On Mon, Dec 13, 2010 at 1:36 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >> >>> > On Mon, Dec 13, 2010 at 03:35:38PM +0200, Michael S. Tsirkin wrote: >> >>> >> On Mon, Dec 13, 2010 at 01:11:27PM +0000, Stefan Hajnoczi wrote: >> >>> >> > Fresh results: >> >>> >> > >> >>> >> > 192.168.0.1 - host (runs netperf) >> >>> >> > 192.168.0.2 - guest (runs netserver) >> >>> >> > >> >>> >> > host$ src/netperf -H 192.168.0.2 -- -m 200 >> >>> >> > >> >>> >> > ioeventfd=on >> >>> >> > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.0.2 >> >>> >> > (192.168.0.2) port 0 AF_INET >> >>> >> > Recv Send Send >> >>> >> > Socket Socket Message Elapsed >> >>> >> > Size Size Size Time Throughput >> >>> >> > bytes bytes bytes secs. 10^6bits/sec >> >>> >> > 87380 16384 200 10.00 1759.25 >> >>> >> > >> >>> >> > ioeventfd=off >> >>> >> > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.0.2 >> >>> >> > (192.168.0.2) port 0 AF_INET >> >>> >> > Recv Send Send >> >>> >> > Socket Socket Message Elapsed >> >>> >> > Size Size Size Time Throughput >> >>> >> > bytes bytes bytes secs. 10^6bits/sec >> >>> >> > >> >>> >> > 87380 16384 200 10.00 1757.15 >> >>> >> > >> >>> >> > The results vary approx +/- 3% between runs. 
>> >>> >> > >> >>> >> > Invocation: >> >>> >> > $ x86_64-softmmu/qemu-system-x86_64 -m 4096 -enable-kvm -netdev >> >>> >> > type=tap,id=net0,ifname=tap0,script=no,downscript=no -device >> >>> >> > virtio-net-pci,netdev=net0,ioeventfd=on|off -vnc :0 -drive >> >>> >> > if=virtio,cache=none,file=$HOME/rhel6-autobench-raw.img >> >>> >> > >> >>> >> > I am running qemu.git with v5 patches, based off >> >>> >> > 36888c6335422f07bbc50bf3443a39f24b90c7c6. >> >>> >> > >> >>> >> > Host: >> >>> >> > 1 Quad-Core AMD Opteron(tm) Processor 2350 @ 2 GHz >> >>> >> > 8 GB RAM >> >>> >> > RHEL 6 host >> >>> >> > >> >>> >> > Next I will try the patches on latest qemu-kvm.git >> >>> >> > >> >>> >> > Stefan >> >>> >> >> >>> >> One interesting thing is that I put virtio-net earlier on >> >>> >> command line. >> >>> > >> >>> > Sorry I mean I put it after disk, you put it before. >> >>> >> >>> I can't find a measurable difference when swapping -drive and -netdev. >> >> >> >> One other concern I have is that we are apparently using >> >> ioeventfd for all VQs. E.g. for virtio-net we probably should not >> >> use it for the control VQ - it's a waste of resources. >> > >> > One option is a per-device (block, net, etc) bitmap that masks out >> > virtqueues. Is that something you'd like to see? >> > >> > I'm tempted to mask out the RX vq too and see how that affects the >> > qemu-kvm.git specific issue. >> >> As expected, the rx virtqueue is involved in the degradation. I >> enabled ioeventfd only for the TX virtqueue and got the same good >> results as userspace virtio-net. >> >> When I enable only the rx virtqueue, performs decreases as we've seen above. >> >> Stefan > > Interesting. In particular this implies something's wrong with the > queue: we should not normally be getting notifications from rx queue > at all. Is it running low on buffers? Does it help to increase the vq > size? Any other explanation? 
I made a mistake: it is the *tx* vq that causes reduced performance on short packets with ioeventfd. I double-checked the results and the rx vq doesn't affect performance. Initially I thought the fix would be to adjust the tx mitigation mechanism since ioeventfd does its own mitigation of sorts. Multiple eventfd signals will be coalesced into one qemu-kvm event handler call if qemu-kvm didn't have a chance to handle the first event before the eventfd was signalled again. I added -device virtio-net-pci,tx=immediate to flush the TX queue immediately instead of scheduling a BH or timer. Unfortunately this had little measurable effect and performance stayed the same. This suggests most of the latency is between the guest's pio write and qemu-kvm getting around to handling the event. You mentioned that vhost-net has the same performance issue on this benchmark. I guess a solution for vhost-net may help virtio-ioeventfd and vice versa. Are you happy with this patchset if I remove virtio-net-pci ioeventfd=on|off so only virtio-blk-pci has ioeventfd=on|off (with default on)? For block we've found it to be a win and the initial results looked good for net too. Stefan ^ permalink raw reply [flat|nested] 52+ messages in thread
* [Qemu-devel] Re: [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify 2010-12-15 11:42 ` Stefan Hajnoczi @ 2010-12-15 11:48 ` Stefan Hajnoczi 2010-12-15 12:00 ` Michael S. Tsirkin 2010-12-15 12:14 ` Michael S. Tsirkin 1 sibling, 1 reply; 52+ messages in thread From: Stefan Hajnoczi @ 2010-12-15 11:48 UTC (permalink / raw) To: Michael S. Tsirkin; +Cc: qemu-devel For the record, here are the commits to selectively mask virtqueues for ioeventfd and to add -device virtio-net-pci,tx=immediate: http://repo.or.cz/w/qemu-kvm/stefanha.git/shortlog/refs/heads/virtio-ioeventfd-2 I'm posting this in case you want to try it out too. Stefan ^ permalink raw reply [flat|nested] 52+ messages in thread
* [Qemu-devel] Re: [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify 2010-12-15 11:48 ` Stefan Hajnoczi @ 2010-12-15 12:00 ` Michael S. Tsirkin 0 siblings, 0 replies; 52+ messages in thread From: Michael S. Tsirkin @ 2010-12-15 12:00 UTC (permalink / raw) To: Stefan Hajnoczi; +Cc: qemu-devel On Wed, Dec 15, 2010 at 11:48:50AM +0000, Stefan Hajnoczi wrote: > For the record, here are the commits to selectively mask virtqueues > for ioeventfd and to add -device virtio-net-pci,tx=immediate: > http://repo.or.cz/w/qemu-kvm/stefanha.git/shortlog/refs/heads/virtio-ioeventfd-2 > > I'm posting this in case you want to try it out too. > > Stefan Thanks! ^ permalink raw reply [flat|nested] 52+ messages in thread
* [Qemu-devel] Re: [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify 2010-12-15 11:42 ` Stefan Hajnoczi 2010-12-15 11:48 ` Stefan Hajnoczi @ 2010-12-15 12:14 ` Michael S. Tsirkin 2010-12-15 12:59 ` Stefan Hajnoczi 1 sibling, 1 reply; 52+ messages in thread From: Michael S. Tsirkin @ 2010-12-15 12:14 UTC (permalink / raw) To: Stefan Hajnoczi; +Cc: qemu-devel On Wed, Dec 15, 2010 at 11:42:12AM +0000, Stefan Hajnoczi wrote: > On Mon, Dec 13, 2010 at 6:52 PM, Michael S. Tsirkin <mst@redhat.com> wrote: > > On Mon, Dec 13, 2010 at 05:57:28PM +0000, Stefan Hajnoczi wrote: > >> On Mon, Dec 13, 2010 at 4:28 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote: > >> > On Mon, Dec 13, 2010 at 4:12 PM, Michael S. Tsirkin <mst@redhat.com> wrote: > >> >> On Mon, Dec 13, 2010 at 03:27:06PM +0000, Stefan Hajnoczi wrote: > >> >>> On Mon, Dec 13, 2010 at 1:36 PM, Michael S. Tsirkin <mst@redhat.com> wrote: > >> >>> > On Mon, Dec 13, 2010 at 03:35:38PM +0200, Michael S. Tsirkin wrote: > >> >>> >> On Mon, Dec 13, 2010 at 01:11:27PM +0000, Stefan Hajnoczi wrote: > >> >>> >> > Fresh results: > >> >>> >> > > >> >>> >> > 192.168.0.1 - host (runs netperf) > >> >>> >> > 192.168.0.2 - guest (runs netserver) > >> >>> >> > > >> >>> >> > host$ src/netperf -H 192.168.0.2 -- -m 200 > >> >>> >> > > >> >>> >> > ioeventfd=on > >> >>> >> > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.0.2 > >> >>> >> > (192.168.0.2) port 0 AF_INET > >> >>> >> > Recv Send Send > >> >>> >> > Socket Socket Message Elapsed > >> >>> >> > Size Size Size Time Throughput > >> >>> >> > bytes bytes bytes secs. 10^6bits/sec > >> >>> >> > 87380 16384 200 10.00 1759.25 > >> >>> >> > > >> >>> >> > ioeventfd=off > >> >>> >> > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.0.2 > >> >>> >> > (192.168.0.2) port 0 AF_INET > >> >>> >> > Recv Send Send > >> >>> >> > Socket Socket Message Elapsed > >> >>> >> > Size Size Size Time Throughput > >> >>> >> > bytes bytes bytes secs. 
10^6bits/sec > >> >>> >> > > >> >>> >> > 87380 16384 200 10.00 1757.15 > >> >>> >> > > >> >>> >> > The results vary approx +/- 3% between runs. > >> >>> >> > > >> >>> >> > Invocation: > >> >>> >> > $ x86_64-softmmu/qemu-system-x86_64 -m 4096 -enable-kvm -netdev > >> >>> >> > type=tap,id=net0,ifname=tap0,script=no,downscript=no -device > >> >>> >> > virtio-net-pci,netdev=net0,ioeventfd=on|off -vnc :0 -drive > >> >>> >> > if=virtio,cache=none,file=$HOME/rhel6-autobench-raw.img > >> >>> >> > > >> >>> >> > I am running qemu.git with v5 patches, based off > >> >>> >> > 36888c6335422f07bbc50bf3443a39f24b90c7c6. > >> >>> >> > > >> >>> >> > Host: > >> >>> >> > 1 Quad-Core AMD Opteron(tm) Processor 2350 @ 2 GHz > >> >>> >> > 8 GB RAM > >> >>> >> > RHEL 6 host > >> >>> >> > > >> >>> >> > Next I will try the patches on latest qemu-kvm.git > >> >>> >> > > >> >>> >> > Stefan > >> >>> >> > >> >>> >> One interesting thing is that I put virtio-net earlier on > >> >>> >> command line. > >> >>> > > >> >>> > Sorry I mean I put it after disk, you put it before. > >> >>> > >> >>> I can't find a measurable difference when swapping -drive and -netdev. > >> >> > >> >> One other concern I have is that we are apparently using > >> >> ioeventfd for all VQs. E.g. for virtio-net we probably should not > >> >> use it for the control VQ - it's a waste of resources. > >> > > >> > One option is a per-device (block, net, etc) bitmap that masks out > >> > virtqueues. Is that something you'd like to see? > >> > > >> > I'm tempted to mask out the RX vq too and see how that affects the > >> > qemu-kvm.git specific issue. > >> > >> As expected, the rx virtqueue is involved in the degradation. I > >> enabled ioeventfd only for the TX virtqueue and got the same good > >> results as userspace virtio-net. > >> > >> When I enable only the rx virtqueue, performs decreases as we've seen above. > >> > >> Stefan > > > > Interesting. 
In particular this implies something's wrong with the > > queue: we should not normally be getting notifications from rx queue > > at all. Is it running low on buffers? Does it help to increase the vq > > size? Any other explanation? > > I made a mistake, it is the *tx* vq that causes reduced performance on > short packets with ioeventfd. I double-checked the results and the rx > vq doesn't affect performance. > > Initially I thought the fix would be to adjust the tx mitigation > mechanism since ioeventfd does its own mitigation of sorts. Multiple > eventfd signals will be coalesced into one qemu-kvm event handler call > if qemu-kvm didn't have a chance to handle the first event before the > eventfd was signalled again. > > I added -device virtio-net-pci tx=immediate to flush the TX queue > immediately instead of scheduling a BH or timer. Unfortunately this > had little measurable effect and performance stayed the same. This > suggests most of the latency is between the guest's pio write and > qemu-kvm getting around to handling the event. > > You mentioned that vhost-net has the same performance issue on this > benchmark. I guess a solution for vhost-net may help virtio-ioeventfd > and vice versa. > > Are you happy with this patchset if I remove virtio-net-pci > ioeventfd=on|off so only virtio-blk-pci has ioeventfd=on|off (with > default on)? For block we've found it to be a win and the initial > results looked good for net too. > > Stefan I'm concerned that the tests were done on qemu.git. Could you check block with qemu-kvm too please? ^ permalink raw reply [flat|nested] 52+ messages in thread
* [Qemu-devel] Re: [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify 2010-12-15 12:14 ` Michael S. Tsirkin @ 2010-12-15 12:59 ` Stefan Hajnoczi 2010-12-16 16:40 ` Stefan Hajnoczi 2010-12-19 14:49 ` Michael S. Tsirkin 0 siblings, 2 replies; 52+ messages in thread From: Stefan Hajnoczi @ 2010-12-15 12:59 UTC (permalink / raw) To: Michael S. Tsirkin; +Cc: qemu-devel On Wed, Dec 15, 2010 at 12:14 PM, Michael S. Tsirkin <mst@redhat.com> wrote: > On Wed, Dec 15, 2010 at 11:42:12AM +0000, Stefan Hajnoczi wrote: >> On Mon, Dec 13, 2010 at 6:52 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >> > On Mon, Dec 13, 2010 at 05:57:28PM +0000, Stefan Hajnoczi wrote: >> >> On Mon, Dec 13, 2010 at 4:28 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote: >> >> > On Mon, Dec 13, 2010 at 4:12 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >> >> >> On Mon, Dec 13, 2010 at 03:27:06PM +0000, Stefan Hajnoczi wrote: >> >> >>> On Mon, Dec 13, 2010 at 1:36 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >> >> >>> > On Mon, Dec 13, 2010 at 03:35:38PM +0200, Michael S. Tsirkin wrote: >> >> >>> >> On Mon, Dec 13, 2010 at 01:11:27PM +0000, Stefan Hajnoczi wrote: >> >> >>> >> > Fresh results: >> >> >>> >> > >> >> >>> >> > 192.168.0.1 - host (runs netperf) >> >> >>> >> > 192.168.0.2 - guest (runs netserver) >> >> >>> >> > >> >> >>> >> > host$ src/netperf -H 192.168.0.2 -- -m 200 >> >> >>> >> > >> >> >>> >> > ioeventfd=on >> >> >>> >> > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.0.2 >> >> >>> >> > (192.168.0.2) port 0 AF_INET >> >> >>> >> > Recv Send Send >> >> >>> >> > Socket Socket Message Elapsed >> >> >>> >> > Size Size Size Time Throughput >> >> >>> >> > bytes bytes bytes secs. 
10^6bits/sec >> >> >>> >> > 87380 16384 200 10.00 1759.25 >> >> >>> >> > >> >> >>> >> > ioeventfd=off >> >> >>> >> > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.0.2 >> >> >>> >> > (192.168.0.2) port 0 AF_INET >> >> >>> >> > Recv Send Send >> >> >>> >> > Socket Socket Message Elapsed >> >> >>> >> > Size Size Size Time Throughput >> >> >>> >> > bytes bytes bytes secs. 10^6bits/sec >> >> >>> >> > >> >> >>> >> > 87380 16384 200 10.00 1757.15 >> >> >>> >> > >> >> >>> >> > The results vary approx +/- 3% between runs. >> >> >>> >> > >> >> >>> >> > Invocation: >> >> >>> >> > $ x86_64-softmmu/qemu-system-x86_64 -m 4096 -enable-kvm -netdev >> >> >>> >> > type=tap,id=net0,ifname=tap0,script=no,downscript=no -device >> >> >>> >> > virtio-net-pci,netdev=net0,ioeventfd=on|off -vnc :0 -drive >> >> >>> >> > if=virtio,cache=none,file=$HOME/rhel6-autobench-raw.img >> >> >>> >> > >> >> >>> >> > I am running qemu.git with v5 patches, based off >> >> >>> >> > 36888c6335422f07bbc50bf3443a39f24b90c7c6. >> >> >>> >> > >> >> >>> >> > Host: >> >> >>> >> > 1 Quad-Core AMD Opteron(tm) Processor 2350 @ 2 GHz >> >> >>> >> > 8 GB RAM >> >> >>> >> > RHEL 6 host >> >> >>> >> > >> >> >>> >> > Next I will try the patches on latest qemu-kvm.git >> >> >>> >> > >> >> >>> >> > Stefan >> >> >>> >> >> >> >>> >> One interesting thing is that I put virtio-net earlier on >> >> >>> >> command line. >> >> >>> > >> >> >>> > Sorry I mean I put it after disk, you put it before. >> >> >>> >> >> >>> I can't find a measurable difference when swapping -drive and -netdev. >> >> >> >> >> >> One other concern I have is that we are apparently using >> >> >> ioeventfd for all VQs. E.g. for virtio-net we probably should not >> >> >> use it for the control VQ - it's a waste of resources. >> >> > >> >> > One option is a per-device (block, net, etc) bitmap that masks out >> >> > virtqueues. Is that something you'd like to see? 
>> >> > >> >> > I'm tempted to mask out the RX vq too and see how that affects the >> >> > qemu-kvm.git specific issue. >> >> >> >> As expected, the rx virtqueue is involved in the degradation. I >> >> enabled ioeventfd only for the TX virtqueue and got the same good >> >> results as userspace virtio-net. >> >> >> >> When I enable only the rx virtqueue, performance decreases as we've seen above. >> >> >> >> Stefan >> > >> > Interesting. In particular this implies something's wrong with the >> > queue: we should not normally be getting notifications from rx queue >> > at all. Is it running low on buffers? Does it help to increase the vq >> > size? Any other explanation? >> >> I made a mistake, it is the *tx* vq that causes reduced performance on >> short packets with ioeventfd. I double-checked the results and the rx >> vq doesn't affect performance. >> >> Initially I thought the fix would be to adjust the tx mitigation >> mechanism since ioeventfd does its own mitigation of sorts. Multiple >> eventfd signals will be coalesced into one qemu-kvm event handler call >> if qemu-kvm didn't have a chance to handle the first event before the >> eventfd was signalled again. >> >> I added -device virtio-net-pci tx=immediate to flush the TX queue >> immediately instead of scheduling a BH or timer. Unfortunately this >> had little measurable effect and performance stayed the same. This >> suggests most of the latency is between the guest's pio write and >> qemu-kvm getting around to handling the event. >> >> You mentioned that vhost-net has the same performance issue on this >> benchmark. I guess a solution for vhost-net may help virtio-ioeventfd >> and vice versa. >> >> Are you happy with this patchset if I remove virtio-net-pci >> ioeventfd=on|off so only virtio-blk-pci has ioeventfd=on|off (with >> default on)? For block we've found it to be a win and the initial >> results looked good for net too. >> >> Stefan > > I'm concerned that the tests were done on qemu.git.
> Could you check block with qemu-kvm too please? The following results show qemu-kvm with virtio-ioeventfd v3 for both aio=native and aio=threads: http://wiki.qemu.org/Features/VirtioIoeventfd Stefan ^ permalink raw reply [flat|nested] 52+ messages in thread
* [Qemu-devel] Re: [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify 2010-12-15 12:59 ` Stefan Hajnoczi @ 2010-12-16 16:40 ` Stefan Hajnoczi 2010-12-16 23:39 ` Michael S. Tsirkin 2010-12-19 14:49 ` Michael S. Tsirkin 1 sibling, 1 reply; 52+ messages in thread From: Stefan Hajnoczi @ 2010-12-16 16:40 UTC (permalink / raw) To: Michael S. Tsirkin; +Cc: qemu-devel On Wed, Dec 15, 2010 at 12:59 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote: > On Wed, Dec 15, 2010 at 12:14 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >> On Wed, Dec 15, 2010 at 11:42:12AM +0000, Stefan Hajnoczi wrote: >>> Are you happy with this patchset if I remove virtio-net-pci >>> ioeventfd=on|off so only virtio-blk-pci has ioeventfd=on|off (with >>> default on)? For block we've found it to be a win and the initial >>> results looked good for net too. Please let me know if I should disable ioeventfd for virtio-net. Stefan ^ permalink raw reply [flat|nested] 52+ messages in thread
* [Qemu-devel] Re: [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify 2010-12-16 16:40 ` Stefan Hajnoczi @ 2010-12-16 23:39 ` Michael S. Tsirkin 0 siblings, 0 replies; 52+ messages in thread From: Michael S. Tsirkin @ 2010-12-16 23:39 UTC (permalink / raw) To: Stefan Hajnoczi; +Cc: qemu-devel On Thu, Dec 16, 2010 at 04:40:32PM +0000, Stefan Hajnoczi wrote: > On Wed, Dec 15, 2010 at 12:59 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote: > > On Wed, Dec 15, 2010 at 12:14 PM, Michael S. Tsirkin <mst@redhat.com> wrote: > >> On Wed, Dec 15, 2010 at 11:42:12AM +0000, Stefan Hajnoczi wrote: > >>> Are you happy with this patchset if I remove virtio-net-pci > >>> ioeventfd=on|off so only virtio-blk-pci has ioeventfd=on|off (with > >>> default on)? For block we've found it to be a win and the initial > >>> results looked good for net too. > > Please let me know if I should disable ioeventfd for virtio-net. > > Stefan Sure, if it slows us down, we should disable it. What bothers me is that the API issue that makes ioeventfd an all-or-nothing thing, so that it's enabled even for the control vq, needs to be resolved anyway. Still it does not affect block, so maybe we can merge as is and fix later ... I will try to think it over on the weekend. -- MST ^ permalink raw reply [flat|nested] 52+ messages in thread
* [Qemu-devel] Re: [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify 2010-12-15 12:59 ` Stefan Hajnoczi 2010-12-16 16:40 ` Stefan Hajnoczi @ 2010-12-19 14:49 ` Michael S. Tsirkin 2011-01-06 16:41 ` Stefan Hajnoczi 1 sibling, 1 reply; 52+ messages in thread From: Michael S. Tsirkin @ 2010-12-19 14:49 UTC (permalink / raw) To: Stefan Hajnoczi; +Cc: qemu-devel On Wed, Dec 15, 2010 at 12:59:45PM +0000, Stefan Hajnoczi wrote: > > I'm concerned that the tests were done on qemu.git. > > Could you check block with qemu-kvm too please? > > The following results show qemu-kvm with virtio-ioeventfd v3 for both > aio=native and aio=threads: > > http://wiki.qemu.org/Features/VirtioIoeventfd > > Stefan What were the flags used to run qemu here? One option that's known to affect speed significantly is x2apic. Did you try it? -- MST ^ permalink raw reply [flat|nested] 52+ messages in thread
* [Qemu-devel] Re: [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify 2010-12-19 14:49 ` Michael S. Tsirkin @ 2011-01-06 16:41 ` Stefan Hajnoczi 2011-01-06 17:04 ` Michael S. Tsirkin 2011-01-06 18:00 ` Michael S. Tsirkin 0 siblings, 2 replies; 52+ messages in thread From: Stefan Hajnoczi @ 2011-01-06 16:41 UTC (permalink / raw) To: Michael S. Tsirkin; +Cc: Khoa Huynh, qemu-devel Here are 4k sequential read results (cache=none) to check whether we see an ioeventfd performance regression with virtio-blk. The idea is to use a small blocksize with an I/O pattern (sequential reads) that is cheap and executes quickly. Therefore we're doing many iops and the cost of virtqueue kick/notify is especially important. We're not trying to stress the disk, we're trying to make the difference in ioeventfd=on/off apparent. I did 2 runs for both ioeventfd=off and ioeventfd=on. The results are similar: 1% and 2% degradation in MB/s or iops. We'd have to do more runs to see if the degradation is statistically significant, but the percentage value is so low that I'm satisfied. Are you happy to merge virtio-ioeventfd v6 + your fixups? 
Full results below: x86_64-softmmu/qemu-system-x86_64 -m 1024 -drive if=none,file=rhel6.img,cache=none,id=system -device virtio-blk-pci,drive=system -drive if=none,file=/dev/volumes/storage,cache=none,id=storage -device virtio-blk-pci,drive=storage -cpu kvm64,+x2apic -vnc :0 fio jobfile: [global] ioengine=libaio buffered=0 rw=read bs=4k iodepth=1 runtime=2m [job1] filename=/dev/vdb ioeventfd=off: job1: (groupid=0, jobs=1): err= 0: pid=2692 read : io=2,353MB, bw=20,080KB/s, iops=5,019, runt=120001msec slat (usec): min=20, max=1,424, avg=34.86, stdev= 7.62 clat (usec): min=1, max=11,547, avg=162.02, stdev=42.95 bw (KB/s) : min=16600, max=20328, per=100.03%, avg=20084.25, stdev=241.88 cpu : usr=1.14%, sys=13.40%, ctx=604918, majf=0, minf=29 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued r/w: total=602391/0, short=0/0 lat (usec): 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01% lat (usec): 100=0.01%, 250=99.89%, 500=0.07%, 750=0.01%, 1000=0.02% lat (msec): 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01% Run status group 0 (all jobs): READ: io=2,353MB, aggrb=20,079KB/s, minb=20,561KB/s, maxb=20,561KB/s, mint=120001msec, maxt=120001msec Disk stats (read/write): vdb: ios=601339/0, merge=0/0, ticks=112092/0, in_queue=111815, util=93.38% ioeventfd=on: job1: (groupid=0, jobs=1): err= 0: pid=2692 read : io=2,299MB, bw=19,619KB/s, iops=4,904, runt=120001msec slat (usec): min=9, max=2,257, avg=40.43, stdev=11.65 clat (usec): min=1, max=28,000, avg=161.12, stdev=61.46 bw (KB/s) : min=15720, max=19984, per=100.02%, avg=19623.26, stdev=290.76 cpu : usr=1.49%, sys=19.34%, ctx=591398, majf=0, minf=29 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% 
issued r/w: total=588578/0, short=0/0 lat (usec): 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01% lat (usec): 100=0.01%, 250=99.86%, 500=0.09%, 750=0.01%, 1000=0.02% lat (msec): 2=0.01%, 4=0.01%, 10=0.01%, 50=0.01% Run status group 0 (all jobs): READ: io=2,299MB, aggrb=19,619KB/s, minb=20,089KB/s, maxb=20,089KB/s, mint=120001msec, maxt=120001msec Disk stats (read/write): vdb: ios=587592/0, merge=0/0, ticks=110373/0, in_queue=110125, util=91.97% Stefan ^ permalink raw reply [flat|nested] 52+ messages in thread
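The degradation quoted above can be read straight off the fio output; for the single pair of runs reproduced here it works out to roughly 2.3% in both iops and bandwidth (the figures below are taken from the `read :` lines of the two jobs):

```python
# Numbers from the fio runs above (ioeventfd=off vs ioeventfd=on).
iops_off, iops_on = 5019, 4904
bw_off, bw_on = 20080, 19619  # KB/s

iops_drop = 100.0 * (iops_off - iops_on) / iops_off
bw_drop = 100.0 * (bw_off - bw_on) / bw_off

print(f"iops degradation: {iops_drop:.1f}%")
print(f"bandwidth degradation: {bw_drop:.1f}%")
```

With only one run shown per configuration, more samples would be needed to call the difference statistically significant, as the message itself notes.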
* [Qemu-devel] Re: [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify 2011-01-06 16:41 ` Stefan Hajnoczi @ 2011-01-06 17:04 ` Michael S. Tsirkin 2011-01-06 18:00 ` Michael S. Tsirkin 1 sibling, 0 replies; 52+ messages in thread From: Michael S. Tsirkin @ 2011-01-06 17:04 UTC (permalink / raw) To: Stefan Hajnoczi; +Cc: Khoa Huynh, qemu-devel On Thu, Jan 06, 2011 at 04:41:50PM +0000, Stefan Hajnoczi wrote: > Here are 4k sequential read results (cache=none) to check whether we > see an ioeventfd performance regression with virtio-blk. > > The idea is to use a small blocksize with an I/O pattern (sequential > reads) that is cheap and executes quickly. Therefore we're doing many > iops and the cost virtqueue kick/notify is especially important. > We're not trying to stress the disk, we're trying to make the > difference in ioeventfd=on/off apparent. > > I did 2 runs for both ioeventfd=off and ioeventfd=on. The results are > similar: 1% and 2% degradation in MB/s or iops. We'd have to do more > runs to see if the degradation is statistically significant, but the > percentage value is so low that I'm satisfied. > > Are you happy to merge virtio-ioeventfd v6 + your fixups? Think so. I would like to do a bit of testing of the whole thing with migration (ideally with virtio net and vhost too, even though we don't yet enable them). Hope to put it on my tree by next week. 
> Full results below: > > x86_64-softmmu/qemu-system-x86_64 -m 1024 -drive > if=none,file=rhel6.img,cache=none,id=system -device > virtio-blk-pci,drive=system -drive > if=none,file=/dev/volumes/storage,cache=none,id=storage -device > virtio-blk-pci,drive=storage -cpu kvm64,+x2apic -vnc :0 > > fio jobfile: > [global] > ioengine=libaio > buffered=0 > rw=read > bs=4k > iodepth=1 > runtime=2m > > [job1] > filename=/dev/vdb > > ioeventfd=off: > job1: (groupid=0, jobs=1): err= 0: pid=2692 > read : io=2,353MB, bw=20,080KB/s, iops=5,019, runt=120001msec > slat (usec): min=20, max=1,424, avg=34.86, stdev= 7.62 > clat (usec): min=1, max=11,547, avg=162.02, stdev=42.95 > bw (KB/s) : min=16600, max=20328, per=100.03%, avg=20084.25, stdev=241.88 > cpu : usr=1.14%, sys=13.40%, ctx=604918, majf=0, minf=29 > IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% > submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% > complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% > issued r/w: total=602391/0, short=0/0 > lat (usec): 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01% > lat (usec): 100=0.01%, 250=99.89%, 500=0.07%, 750=0.01%, 1000=0.02% > lat (msec): 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01% > > Run status group 0 (all jobs): > READ: io=2,353MB, aggrb=20,079KB/s, minb=20,561KB/s, > maxb=20,561KB/s, mint=120001msec, maxt=120001msec > > Disk stats (read/write): > vdb: ios=601339/0, merge=0/0, ticks=112092/0, in_queue=111815, util=93.38% > > ioeventfd=on: > job1: (groupid=0, jobs=1): err= 0: pid=2692 > read : io=2,299MB, bw=19,619KB/s, iops=4,904, runt=120001msec > slat (usec): min=9, max=2,257, avg=40.43, stdev=11.65 > clat (usec): min=1, max=28,000, avg=161.12, stdev=61.46 > bw (KB/s) : min=15720, max=19984, per=100.02%, avg=19623.26, stdev=290.76 > cpu : usr=1.49%, sys=19.34%, ctx=591398, majf=0, minf=29 > IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% > submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 
32=0.0%, 64=0.0%, >=64=0.0% > complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% > issued r/w: total=588578/0, short=0/0 > lat (usec): 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01% > lat (usec): 100=0.01%, 250=99.86%, 500=0.09%, 750=0.01%, 1000=0.02% > lat (msec): 2=0.01%, 4=0.01%, 10=0.01%, 50=0.01% > > Run status group 0 (all jobs): > READ: io=2,299MB, aggrb=19,619KB/s, minb=20,089KB/s, > maxb=20,089KB/s, mint=120001msec, maxt=120001msec > > Disk stats (read/write): > vdb: ios=587592/0, merge=0/0, ticks=110373/0, in_queue=110125, util=91.97% > > Stefan ^ permalink raw reply [flat|nested] 52+ messages in thread
* [Qemu-devel] Re: [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify 2011-01-06 16:41 ` Stefan Hajnoczi 2011-01-06 17:04 ` Michael S. Tsirkin @ 2011-01-06 18:00 ` Michael S. Tsirkin 2011-01-07 8:56 ` Stefan Hajnoczi 1 sibling, 1 reply; 52+ messages in thread From: Michael S. Tsirkin @ 2011-01-06 18:00 UTC (permalink / raw) To: Stefan Hajnoczi; +Cc: Khoa Huynh, qemu-devel On Thu, Jan 06, 2011 at 04:41:50PM +0000, Stefan Hajnoczi wrote: > Here are 4k sequential read results (cache=none) to check whether we > see an ioeventfd performance regression with virtio-blk. > > The idea is to use a small blocksize with an I/O pattern (sequential > reads) that is cheap and executes quickly. Therefore we're doing many > iops and the cost virtqueue kick/notify is especially important. > We're not trying to stress the disk, we're trying to make the > difference in ioeventfd=on/off apparent. > > I did 2 runs for both ioeventfd=off and ioeventfd=on. The results are > similar: 1% and 2% degradation in MB/s or iops. We'd have to do more > runs to see if the degradation is statistically significant, but the > percentage value is so low that I'm satisfied. > > Are you happy to merge virtio-ioeventfd v6 + your fixups? BTW if you could do some migration stress-testing too, would be nice. autotest has support for it now. -- MST ^ permalink raw reply [flat|nested] 52+ messages in thread
* [Qemu-devel] Re: [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify 2011-01-06 18:00 ` Michael S. Tsirkin @ 2011-01-07 8:56 ` Stefan Hajnoczi 0 siblings, 0 replies; 52+ messages in thread From: Stefan Hajnoczi @ 2011-01-07 8:56 UTC (permalink / raw) To: Michael S. Tsirkin; +Cc: Khoa Huynh, qemu-devel On Thu, Jan 6, 2011 at 6:00 PM, Michael S. Tsirkin <mst@redhat.com> wrote: > On Thu, Jan 06, 2011 at 04:41:50PM +0000, Stefan Hajnoczi wrote: >> Here are 4k sequential read results (cache=none) to check whether we >> see an ioeventfd performance regression with virtio-blk. >> >> The idea is to use a small blocksize with an I/O pattern (sequential >> reads) that is cheap and executes quickly. Therefore we're doing many >> iops and the cost virtqueue kick/notify is especially important. >> We're not trying to stress the disk, we're trying to make the >> difference in ioeventfd=on/off apparent. >> >> I did 2 runs for both ioeventfd=off and ioeventfd=on. The results are >> similar: 1% and 2% degradation in MB/s or iops. We'd have to do more >> runs to see if the degradation is statistically significant, but the >> percentage value is so low that I'm satisfied. >> >> Are you happy to merge virtio-ioeventfd v6 + your fixups? > > BTW if you could do some migration stress-testing too, > would be nice. autotest has support for it now. Okay, I'll let you know the results. Stefan ^ permalink raw reply [flat|nested] 52+ messages in thread
end of thread, other threads:[~2011-01-26 0:18 UTC | newest] Thread overview: 52+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2010-12-12 15:02 [Qemu-devel] [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify Stefan Hajnoczi 2010-12-12 15:02 ` [Qemu-devel] [PATCH v5 1/4] virtio-pci: Rename bugs field to flags Stefan Hajnoczi 2010-12-12 15:02 ` [Qemu-devel] [PATCH v5 2/4] virtio-pci: Use ioeventfd for virtqueue notify Stefan Hajnoczi 2011-01-24 18:54 ` Kevin Wolf 2011-01-24 19:36 ` Michael S. Tsirkin 2011-01-24 19:48 ` Kevin Wolf 2011-01-24 19:47 ` Michael S. Tsirkin 2011-01-24 20:05 ` Kevin Wolf 2011-01-25 7:12 ` Stefan Hajnoczi 2011-01-25 9:49 ` Stefan Hajnoczi 2011-01-25 9:54 ` Stefan Hajnoczi 2011-01-25 11:27 ` Michael S. Tsirkin 2011-01-25 13:20 ` Stefan Hajnoczi 2011-01-25 14:07 ` Stefan Hajnoczi 2011-01-25 19:18 ` Anthony Liguori 2011-01-25 19:45 ` Stefan Hajnoczi 2011-01-25 19:51 ` Anthony Liguori 2011-01-25 19:59 ` Stefan Hajnoczi 2011-01-26 0:18 ` Anthony Liguori 2010-12-12 15:02 ` [Qemu-devel] [PATCH v5 3/4] virtio-pci: Don't use ioeventfd on old kernels Stefan Hajnoczi 2010-12-12 15:02 ` [Qemu-devel] [PATCH v5 4/4] docs: Document virtio PCI -device ioeventfd=on|off Stefan Hajnoczi 2010-12-12 15:14 ` [Qemu-devel] Re: [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify Stefan Hajnoczi 2010-12-12 20:41 ` Michael S. Tsirkin 2010-12-12 20:42 ` Michael S. Tsirkin 2010-12-12 20:56 ` Michael S. Tsirkin 2010-12-12 21:09 ` Michael S. Tsirkin 2010-12-13 10:24 ` Stefan Hajnoczi 2010-12-13 10:38 ` Michael S. Tsirkin 2010-12-13 13:11 ` Stefan Hajnoczi 2010-12-13 13:35 ` Michael S. Tsirkin 2010-12-13 13:36 ` Michael S. Tsirkin 2010-12-13 14:06 ` Stefan Hajnoczi 2010-12-13 15:27 ` Stefan Hajnoczi 2010-12-13 16:00 ` Michael S. Tsirkin 2010-12-13 16:29 ` Stefan Hajnoczi 2010-12-13 16:30 ` Michael S. Tsirkin 2010-12-13 16:12 ` Michael S. 
Tsirkin 2010-12-13 16:28 ` Stefan Hajnoczi 2010-12-13 17:57 ` Stefan Hajnoczi 2010-12-13 18:52 ` Michael S. Tsirkin 2010-12-15 11:42 ` Stefan Hajnoczi 2010-12-15 11:48 ` Stefan Hajnoczi 2010-12-15 12:00 ` Michael S. Tsirkin 2010-12-15 12:14 ` Michael S. Tsirkin 2010-12-15 12:59 ` Stefan Hajnoczi 2010-12-16 16:40 ` Stefan Hajnoczi 2010-12-16 23:39 ` Michael S. Tsirkin 2010-12-19 14:49 ` Michael S. Tsirkin 2011-01-06 16:41 ` Stefan Hajnoczi 2011-01-06 17:04 ` Michael S. Tsirkin 2011-01-06 18:00 ` Michael S. Tsirkin 2011-01-07 8:56 ` Stefan Hajnoczi