kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/4] virtio-net: More configurability and bh handling for tx
@ 2010-09-02 15:00 Alex Williamson
  2010-09-02 15:00 ` [PATCH v2 1/4] virtio-net: Make tx_timer timeout configurable Alex Williamson
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: Alex Williamson @ 2010-09-02 15:00 UTC (permalink / raw)
  To: qemu-devel
  Cc: kvm, alex.williamson, anthony, mst, chrisw, quintela, jes.sorensen

Incorporated feedback.  txburst= and txtimer= are now "x-" prefixed
developer options.  I added a tx= option, because I do want there to
be a supported way to switch between TX strategies.  This also drops
the magic value of txtimer= 1 or 0 (setting default timeout or switching
modes).  I also dropped the trickiness around only enabling the bottom
half for the iothread since Anthony indicates we don't care about
performance for the non-iothread case.

New performance data against both qemu.git and qemu-kvm.git:

https://spreadsheets.google.com/pub?key=0AoEm50Bac2U7dGdlREhrWUpPVEdMcTJaX0RjSEgtc3c&hl=en&single=true&gid=0&output=html

This shows the base performance versus patched with new default for
various packet sizes and netperf test loads.  The guest is a 2-way
SMP, connected directly to the host via a tap (no bridge).  Thanks,

Alex

---

Alex Williamson (4):
      virtio-net: Introduce a new bottom half packet TX
      virtio-net: Rename tx_timer_active to tx_waiting
      virtio-net: Limit number of packets sent per TX flush
      virtio-net: Make tx_timer timeout configurable


 hw/s390-virtio-bus.c |    8 +++
 hw/s390-virtio-bus.h |    1 
 hw/syborg_virtio.c   |    8 +++
 hw/virtio-net.c      |  129 +++++++++++++++++++++++++++++++++++++++++---------
 hw/virtio-net.h      |   14 +++++
 hw/virtio-pci.c      |    8 +++
 hw/virtio.h          |    4 +-
 7 files changed, 144 insertions(+), 28 deletions(-)


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH v2 1/4] virtio-net: Make tx_timer timeout configurable
  2010-09-02 15:00 [PATCH v2 0/4] virtio-net: More configurability and bh handling for tx Alex Williamson
@ 2010-09-02 15:00 ` Alex Williamson
  2010-09-02 15:00 ` [PATCH v2 2/4] virtio-net: Limit number of packets sent per TX flush Alex Williamson
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Alex Williamson @ 2010-09-02 15:00 UTC (permalink / raw)
  To: qemu-devel
  Cc: kvm, alex.williamson, anthony, mst, chrisw, quintela, jes.sorensen

Add an option to make the TX mitigation timer adjustable as a device
option.  The 150us hard coded default used currently is reasonable,
but may not be suitable for all workloads, this gives us a way to
adjust it using a single binary.  We can't support any random option
though, so use the "x-" prefix to indicate this is a developer
option.  Usage:

-device virtio-net-pci,x-txtimer=500000,... # .5ms timeout

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---

 hw/s390-virtio-bus.c |    5 ++++-
 hw/s390-virtio-bus.h |    1 +
 hw/syborg_virtio.c   |    5 ++++-
 hw/virtio-net.c      |    9 ++++++---
 hw/virtio-net.h      |    5 +++++
 hw/virtio-pci.c      |    5 ++++-
 hw/virtio.h          |    4 +++-
 7 files changed, 27 insertions(+), 7 deletions(-)

diff --git a/hw/s390-virtio-bus.c b/hw/s390-virtio-bus.c
index fe6884d..d5cb24e 100644
--- a/hw/s390-virtio-bus.c
+++ b/hw/s390-virtio-bus.c
@@ -27,6 +27,7 @@
 #include "elf.h"
 #include "hw/virtio.h"
 #include "hw/virtio-serial.h"
+#include "hw/virtio-net.h"
 #include "hw/sysbus.h"
 #include "kvm.h"
 
@@ -110,7 +111,7 @@ static int s390_virtio_net_init(VirtIOS390Device *dev)
 {
     VirtIODevice *vdev;
 
-    vdev = virtio_net_init((DeviceState *)dev, &dev->nic);
+    vdev = virtio_net_init((DeviceState *)dev, &dev->nic, &dev->net);
     if (!vdev) {
         return -1;
     }
@@ -327,6 +328,8 @@ static VirtIOS390DeviceInfo s390_virtio_net = {
     .qdev.size = sizeof(VirtIOS390Device),
     .qdev.props = (Property[]) {
         DEFINE_NIC_PROPERTIES(VirtIOS390Device, nic),
+        DEFINE_PROP_UINT32("x-txtimer", VirtIOS390Device,
+                           net.txtimer, TX_TIMER_INTERVAL),
         DEFINE_PROP_END_OF_LIST(),
     },
 };
diff --git a/hw/s390-virtio-bus.h b/hw/s390-virtio-bus.h
index 333fea8..41558c9 100644
--- a/hw/s390-virtio-bus.h
+++ b/hw/s390-virtio-bus.h
@@ -43,6 +43,7 @@ typedef struct VirtIOS390Device {
     uint32_t host_features;
     /* Max. number of ports we can have for a the virtio-serial device */
     uint32_t max_virtserial_ports;
+    virtio_net_conf net;
 } VirtIOS390Device;
 
 typedef struct VirtIOS390Bus {
diff --git a/hw/syborg_virtio.c b/hw/syborg_virtio.c
index abf0370..5665189 100644
--- a/hw/syborg_virtio.c
+++ b/hw/syborg_virtio.c
@@ -68,6 +68,7 @@ typedef struct {
     uint32_t id;
     NICConf nic;
     uint32_t host_features;
+    virtio_net_conf net;
 } SyborgVirtIOProxy;
 
 static uint32_t syborg_virtio_readl(void *opaque, target_phys_addr_t offset)
@@ -284,7 +285,7 @@ static int syborg_virtio_net_init(SysBusDevice *dev)
     VirtIODevice *vdev;
     SyborgVirtIOProxy *proxy = FROM_SYSBUS(SyborgVirtIOProxy, dev);
 
-    vdev = virtio_net_init(&dev->qdev, &proxy->nic);
+    vdev = virtio_net_init(&dev->qdev, &proxy->nic, &proxy->net);
     return syborg_virtio_init(proxy, vdev);
 }
 
@@ -295,6 +296,8 @@ static SysBusDeviceInfo syborg_virtio_net_info = {
     .qdev.props = (Property[]) {
         DEFINE_NIC_PROPERTIES(SyborgVirtIOProxy, nic),
         DEFINE_VIRTIO_NET_FEATURES(SyborgVirtIOProxy, host_features),
+        DEFINE_PROP_UINT32("x-txtimer", SyborgVirtIOProxy,
+                           net.txtimer, TX_TIMER_INTERVAL),
         DEFINE_PROP_END_OF_LIST(),
     }
 };
diff --git a/hw/virtio-net.c b/hw/virtio-net.c
index 075f72d..d5b03ab 100644
--- a/hw/virtio-net.c
+++ b/hw/virtio-net.c
@@ -36,6 +36,7 @@ typedef struct VirtIONet
     VirtQueue *ctrl_vq;
     NICState *nic;
     QEMUTimer *tx_timer;
+    uint32_t tx_timeout;
     int tx_timer_active;
     uint32_t has_vnet_hdr;
     uint8_t has_ufo;
@@ -702,7 +703,7 @@ static void virtio_net_handle_tx(VirtIODevice *vdev, VirtQueue *vq)
         virtio_net_flush_tx(n, vq);
     } else {
         qemu_mod_timer(n->tx_timer,
-                       qemu_get_clock(vm_clock) + TX_TIMER_INTERVAL);
+                       qemu_get_clock(vm_clock) + n->tx_timeout);
         n->tx_timer_active = 1;
         virtio_queue_set_notification(vq, 0);
     }
@@ -842,7 +843,7 @@ static int virtio_net_load(QEMUFile *f, void *opaque, int version_id)
 
     if (n->tx_timer_active) {
         qemu_mod_timer(n->tx_timer,
-                       qemu_get_clock(vm_clock) + TX_TIMER_INTERVAL);
+                       qemu_get_clock(vm_clock) + n->tx_timeout);
     }
     return 0;
 }
@@ -903,7 +904,8 @@ static void virtio_net_vmstate_change(void *opaque, int running, int reason)
     virtio_net_set_status(&n->vdev, n->vdev.status & status);
 }
 
-VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf)
+VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf,
+                              virtio_net_conf *net)
 {
     VirtIONet *n;
 
@@ -931,6 +933,7 @@ VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf)
 
     n->tx_timer = qemu_new_timer(vm_clock, virtio_net_tx_timer, n);
     n->tx_timer_active = 0;
+    n->tx_timeout = net->txtimer;
     n->mergeable_rx_bufs = 0;
     n->promisc = 1; /* for compatibility */
 
diff --git a/hw/virtio-net.h b/hw/virtio-net.h
index 235f1a9..46a2e1c 100644
--- a/hw/virtio-net.h
+++ b/hw/virtio-net.h
@@ -49,6 +49,11 @@
 
 #define TX_TIMER_INTERVAL 150000 /* 150 us */
 
+typedef struct virtio_net_conf
+{
+    uint32_t txtimer;
+} virtio_net_conf;
+
 /* Maximum packet size we can receive from tap device: header + 64k */
 #define VIRTIO_NET_MAX_BUFSIZE (sizeof(struct virtio_net_hdr) + (64 << 10))
 
diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
index 6e8f88a..1af48e2 100644
--- a/hw/virtio-pci.c
+++ b/hw/virtio-pci.c
@@ -107,6 +107,7 @@ typedef struct {
 #endif
     /* Max. number of ports we can have for a the virtio-serial device */
     uint32_t max_virtserial_ports;
+    virtio_net_conf net;
 } VirtIOPCIProxy;
 
 /* virtio device */
@@ -613,7 +614,7 @@ static int virtio_net_init_pci(PCIDevice *pci_dev)
     VirtIOPCIProxy *proxy = DO_UPCAST(VirtIOPCIProxy, pci_dev, pci_dev);
     VirtIODevice *vdev;
 
-    vdev = virtio_net_init(&pci_dev->qdev, &proxy->nic);
+    vdev = virtio_net_init(&pci_dev->qdev, &proxy->nic, &proxy->net);
 
     vdev->nvectors = proxy->nvectors;
     virtio_init_pci(proxy, vdev,
@@ -690,6 +691,8 @@ static PCIDeviceInfo virtio_info[] = {
             DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors, 3),
             DEFINE_VIRTIO_NET_FEATURES(VirtIOPCIProxy, host_features),
             DEFINE_NIC_PROPERTIES(VirtIOPCIProxy, nic),
+            DEFINE_PROP_UINT32("x-txtimer", VirtIOPCIProxy,
+                               net.txtimer, TX_TIMER_INTERVAL),
             DEFINE_PROP_END_OF_LIST(),
         },
         .qdev.reset = virtio_pci_reset,
diff --git a/hw/virtio.h b/hw/virtio.h
index 5836ab6..a60d735 100644
--- a/hw/virtio.h
+++ b/hw/virtio.h
@@ -185,7 +185,9 @@ void virtio_bind_device(VirtIODevice *vdev, const VirtIOBindings *binding,
 
 /* Base devices.  */
 VirtIODevice *virtio_blk_init(DeviceState *dev, BlockConf *conf);
-VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf);
+struct virtio_net_conf;
+VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf,
+                              struct virtio_net_conf *net);
 VirtIODevice *virtio_serial_init(DeviceState *dev, uint32_t max_nr_ports);
 VirtIODevice *virtio_balloon_init(DeviceState *dev);
 #ifdef CONFIG_LINUX


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v2 2/4] virtio-net: Limit number of packets sent per TX flush
  2010-09-02 15:00 [PATCH v2 0/4] virtio-net: More configurability and bh handling for tx Alex Williamson
  2010-09-02 15:00 ` [PATCH v2 1/4] virtio-net: Make tx_timer timeout configurable Alex Williamson
@ 2010-09-02 15:00 ` Alex Williamson
  2010-09-02 15:01 ` [PATCH v2 3/4] virtio-net: Rename tx_timer_active to tx_waiting Alex Williamson
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Alex Williamson @ 2010-09-02 15:00 UTC (permalink / raw)
  To: qemu-devel
  Cc: kvm, alex.williamson, anthony, mst, chrisw, quintela, jes.sorensen

If virtio_net_flush_tx() is called with notification disabled, we can
race with the guest, processing packets at the same rate as they
get produced.  The trouble is that this means we have no guaranteed
exit condition from the function and can spend minutes in there.
Currently flush_tx is only called with notification on, which seems
to limit us to one pass through the queue per call.  An upcoming
patch changes this.

Also add an option to set this value on the command line as different
workloads may wish to use different values.  We can't necessarily
support any random value, so this is a developer option: x-txburst=
Usage:

-device virtio-net-pci,x-txburst=64 # 64 packets per tx flush

One pass through the queue (256) seems to be a good default value
for this, balancing latency with throughput.  We use a signed int
for x-txburst because 2^31 packets in a burst would take many, many
minutes to process and it allows us to easily return a negative
value value from virtio_net_flush_tx() to indicate a back-off
or error condition.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---

 hw/s390-virtio-bus.c |    2 ++
 hw/syborg_virtio.c   |    2 ++
 hw/virtio-net.c      |   21 +++++++++++++++------
 hw/virtio-net.h      |    8 ++++++++
 hw/virtio-pci.c      |    2 ++
 5 files changed, 29 insertions(+), 6 deletions(-)

diff --git a/hw/s390-virtio-bus.c b/hw/s390-virtio-bus.c
index d5cb24e..092e65f 100644
--- a/hw/s390-virtio-bus.c
+++ b/hw/s390-virtio-bus.c
@@ -330,6 +330,8 @@ static VirtIOS390DeviceInfo s390_virtio_net = {
         DEFINE_NIC_PROPERTIES(VirtIOS390Device, nic),
         DEFINE_PROP_UINT32("x-txtimer", VirtIOS390Device,
                            net.txtimer, TX_TIMER_INTERVAL),
+        DEFINE_PROP_INT32("x-txburst", VirtIOS390Device,
+                          net.txburst, TX_BURST),
         DEFINE_PROP_END_OF_LIST(),
     },
 };
diff --git a/hw/syborg_virtio.c b/hw/syborg_virtio.c
index 5665189..3c3f3b0 100644
--- a/hw/syborg_virtio.c
+++ b/hw/syborg_virtio.c
@@ -298,6 +298,8 @@ static SysBusDeviceInfo syborg_virtio_net_info = {
         DEFINE_VIRTIO_NET_FEATURES(SyborgVirtIOProxy, host_features),
         DEFINE_PROP_UINT32("x-txtimer", SyborgVirtIOProxy,
                            net.txtimer, TX_TIMER_INTERVAL),
+        DEFINE_PROP_INT32("x-txburst", SyborgVirtIOProxy,
+                          net.txburst, TX_BURST),
         DEFINE_PROP_END_OF_LIST(),
     }
 };
diff --git a/hw/virtio-net.c b/hw/virtio-net.c
index d5b03ab..55f3d94 100644
--- a/hw/virtio-net.c
+++ b/hw/virtio-net.c
@@ -37,6 +37,7 @@ typedef struct VirtIONet
     NICState *nic;
     QEMUTimer *tx_timer;
     uint32_t tx_timeout;
+    int32_t tx_burst;
     int tx_timer_active;
     uint32_t has_vnet_hdr;
     uint8_t has_ufo;
@@ -620,7 +621,7 @@ static ssize_t virtio_net_receive(VLANClientState *nc, const uint8_t *buf, size_
     return size;
 }
 
-static void virtio_net_flush_tx(VirtIONet *n, VirtQueue *vq);
+static int32_t virtio_net_flush_tx(VirtIONet *n, VirtQueue *vq);
 
 static void virtio_net_tx_complete(VLANClientState *nc, ssize_t len)
 {
@@ -636,16 +637,18 @@ static void virtio_net_tx_complete(VLANClientState *nc, ssize_t len)
 }
 
 /* TX */
-static void virtio_net_flush_tx(VirtIONet *n, VirtQueue *vq)
+static int32_t virtio_net_flush_tx(VirtIONet *n, VirtQueue *vq)
 {
     VirtQueueElement elem;
+    int32_t num_packets = 0;
 
-    if (!(n->vdev.status & VIRTIO_CONFIG_S_DRIVER_OK))
-        return;
+    if (!(n->vdev.status & VIRTIO_CONFIG_S_DRIVER_OK)) {
+        return num_packets;
+    }
 
     if (n->async_tx.elem.out_num) {
         virtio_queue_set_notification(n->tx_vq, 0);
-        return;
+        return num_packets;
     }
 
     while (virtqueue_pop(vq, &elem)) {
@@ -682,14 +685,19 @@ static void virtio_net_flush_tx(VirtIONet *n, VirtQueue *vq)
             virtio_queue_set_notification(n->tx_vq, 0);
             n->async_tx.elem = elem;
             n->async_tx.len  = len;
-            return;
+            return -EBUSY;
         }
 
         len += ret;
 
         virtqueue_push(vq, &elem, len);
         virtio_notify(&n->vdev, vq);
+
+        if (++num_packets >= n->tx_burst) {
+            break;
+        }
     }
+    return num_packets;
 }
 
 static void virtio_net_handle_tx(VirtIODevice *vdev, VirtQueue *vq)
@@ -934,6 +942,7 @@ VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf,
     n->tx_timer = qemu_new_timer(vm_clock, virtio_net_tx_timer, n);
     n->tx_timer_active = 0;
     n->tx_timeout = net->txtimer;
+    n->tx_burst = net->txburst;
     n->mergeable_rx_bufs = 0;
     n->promisc = 1; /* for compatibility */
 
diff --git a/hw/virtio-net.h b/hw/virtio-net.h
index 46a2e1c..a2d1545 100644
--- a/hw/virtio-net.h
+++ b/hw/virtio-net.h
@@ -49,9 +49,17 @@
 
 #define TX_TIMER_INTERVAL 150000 /* 150 us */
 
+/* Limit the number of packets that can be sent via a single flush
+ * of the TX queue.  This gives us a guaranteed exit condition and
+ * ensures fairness in the io path.  256 conveniently matches the
+ * length of the TX queue and shows a good balance of performance
+ * and latency. */
+#define TX_BURST 256
+
 typedef struct virtio_net_conf
 {
     uint32_t txtimer;
+    int32_t txburst;
 } virtio_net_conf;
 
 /* Maximum packet size we can receive from tap device: header + 64k */
diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
index 1af48e2..3a5b3e6 100644
--- a/hw/virtio-pci.c
+++ b/hw/virtio-pci.c
@@ -693,6 +693,8 @@ static PCIDeviceInfo virtio_info[] = {
             DEFINE_NIC_PROPERTIES(VirtIOPCIProxy, nic),
             DEFINE_PROP_UINT32("x-txtimer", VirtIOPCIProxy,
                                net.txtimer, TX_TIMER_INTERVAL),
+            DEFINE_PROP_INT32("x-txburst", VirtIOPCIProxy,
+                              net.txburst, TX_BURST),
             DEFINE_PROP_END_OF_LIST(),
         },
         .qdev.reset = virtio_pci_reset,


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v2 3/4] virtio-net: Rename tx_timer_active to tx_waiting
  2010-09-02 15:00 [PATCH v2 0/4] virtio-net: More configurability and bh handling for tx Alex Williamson
  2010-09-02 15:00 ` [PATCH v2 1/4] virtio-net: Make tx_timer timeout configurable Alex Williamson
  2010-09-02 15:00 ` [PATCH v2 2/4] virtio-net: Limit number of packets sent per TX flush Alex Williamson
@ 2010-09-02 15:01 ` Alex Williamson
  2010-09-02 15:01 ` [PATCH v2 4/4] virtio-net: Introduce a new bottom half packet TX Alex Williamson
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Alex Williamson @ 2010-09-02 15:01 UTC (permalink / raw)
  To: qemu-devel
  Cc: kvm, alex.williamson, anthony, mst, chrisw, quintela, jes.sorensen

De-couple this from the timer since we might want to use
different backends to send the packet.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---

 hw/virtio-net.c |   18 +++++++++---------
 1 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/hw/virtio-net.c b/hw/virtio-net.c
index 55f3d94..1a4ba21 100644
--- a/hw/virtio-net.c
+++ b/hw/virtio-net.c
@@ -38,7 +38,7 @@ typedef struct VirtIONet
     QEMUTimer *tx_timer;
     uint32_t tx_timeout;
     int32_t tx_burst;
-    int tx_timer_active;
+    int tx_waiting;
     uint32_t has_vnet_hdr;
     uint8_t has_ufo;
     struct {
@@ -704,15 +704,15 @@ static void virtio_net_handle_tx(VirtIODevice *vdev, VirtQueue *vq)
 {
     VirtIONet *n = to_virtio_net(vdev);
 
-    if (n->tx_timer_active) {
+    if (n->tx_waiting) {
         virtio_queue_set_notification(vq, 1);
         qemu_del_timer(n->tx_timer);
-        n->tx_timer_active = 0;
+        n->tx_waiting = 0;
         virtio_net_flush_tx(n, vq);
     } else {
         qemu_mod_timer(n->tx_timer,
                        qemu_get_clock(vm_clock) + n->tx_timeout);
-        n->tx_timer_active = 1;
+        n->tx_waiting = 1;
         virtio_queue_set_notification(vq, 0);
     }
 }
@@ -721,7 +721,7 @@ static void virtio_net_tx_timer(void *opaque)
 {
     VirtIONet *n = opaque;
 
-    n->tx_timer_active = 0;
+    n->tx_waiting = 0;
 
     /* Just in case the driver is not ready on more */
     if (!(n->vdev.status & VIRTIO_CONFIG_S_DRIVER_OK))
@@ -744,7 +744,7 @@ static void virtio_net_save(QEMUFile *f, void *opaque)
     virtio_save(&n->vdev, f);
 
     qemu_put_buffer(f, n->mac, ETH_ALEN);
-    qemu_put_be32(f, n->tx_timer_active);
+    qemu_put_be32(f, n->tx_waiting);
     qemu_put_be32(f, n->mergeable_rx_bufs);
     qemu_put_be16(f, n->status);
     qemu_put_byte(f, n->promisc);
@@ -773,7 +773,7 @@ static int virtio_net_load(QEMUFile *f, void *opaque, int version_id)
     virtio_load(&n->vdev, f);
 
     qemu_get_buffer(f, n->mac, ETH_ALEN);
-    n->tx_timer_active = qemu_get_be32(f);
+    n->tx_waiting = qemu_get_be32(f);
     n->mergeable_rx_bufs = qemu_get_be32(f);
 
     if (version_id >= 3)
@@ -849,7 +849,7 @@ static int virtio_net_load(QEMUFile *f, void *opaque, int version_id)
     }
     n->mac_table.first_multi = i;
 
-    if (n->tx_timer_active) {
+    if (n->tx_waiting) {
         qemu_mod_timer(n->tx_timer,
                        qemu_get_clock(vm_clock) + n->tx_timeout);
     }
@@ -940,7 +940,7 @@ VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf,
     qemu_format_nic_info_str(&n->nic->nc, conf->macaddr.a);
 
     n->tx_timer = qemu_new_timer(vm_clock, virtio_net_tx_timer, n);
-    n->tx_timer_active = 0;
+    n->tx_waiting = 0;
     n->tx_timeout = net->txtimer;
     n->tx_burst = net->txburst;
     n->mergeable_rx_bufs = 0;


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v2 4/4] virtio-net: Introduce a new bottom half packet TX
  2010-09-02 15:00 [PATCH v2 0/4] virtio-net: More configurability and bh handling for tx Alex Williamson
                   ` (2 preceding siblings ...)
  2010-09-02 15:01 ` [PATCH v2 3/4] virtio-net: Rename tx_timer_active to tx_waiting Alex Williamson
@ 2010-09-02 15:01 ` Alex Williamson
  2010-09-02 15:35 ` [PATCH v2 0/4] virtio-net: More configurability and bh handling for tx Michael S. Tsirkin
  2010-09-03 13:11 ` Juan Quintela
  5 siblings, 0 replies; 7+ messages in thread
From: Alex Williamson @ 2010-09-02 15:01 UTC (permalink / raw)
  To: qemu-devel
  Cc: kvm, alex.williamson, anthony, mst, chrisw, quintela, jes.sorensen

Based on a patch from Mark McLoughlin, this patch introduces a new
bottom half packet transmitter that avoids the latency imposed by
the tx_timer approach.  Rather than scheduling a timer when a TX
packet comes in, schedule a bottom half to be run from the iothread.
The bottom half handler first attempts to flush the queue with
notification disabled (this is where we could race with a guest
without txburst).  If we flush a full burst, reschedule immediately.
If we send short of a full burst, try to re-enable notification.
To avoid a race with TXs that may have occurred, we must then
flush again.  If we find some packets to send, the guest it probably
active, so we can reschedule again.

tx_timer and tx_bh are mutually exclusive, so we can re-use the
tx_waiting flag to indicate one or the other needs to be setup.
This allows us to seamlessly migrate between timer and bh TX
handling.

The bottom half handler becomes the new default and we add a new
tx= option to virtio-net-pci.  Usage:

-device virtio-net-pci,tx=timer # select timer mitigation vs "bh"

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---

 hw/s390-virtio-bus.c |    1 +
 hw/syborg_virtio.c   |    1 +
 hw/virtio-net.c      |   85 +++++++++++++++++++++++++++++++++++++++++++++-----
 hw/virtio-net.h      |    1 +
 hw/virtio-pci.c      |    1 +
 5 files changed, 81 insertions(+), 8 deletions(-)

diff --git a/hw/s390-virtio-bus.c b/hw/s390-virtio-bus.c
index 092e65f..784dc01 100644
--- a/hw/s390-virtio-bus.c
+++ b/hw/s390-virtio-bus.c
@@ -332,6 +332,7 @@ static VirtIOS390DeviceInfo s390_virtio_net = {
                            net.txtimer, TX_TIMER_INTERVAL),
         DEFINE_PROP_INT32("x-txburst", VirtIOS390Device,
                           net.txburst, TX_BURST),
+        DEFINE_PROP_STRING("tx", VirtIOS390Device, net.tx),
         DEFINE_PROP_END_OF_LIST(),
     },
 };
diff --git a/hw/syborg_virtio.c b/hw/syborg_virtio.c
index 3c3f3b0..4dfd1a8 100644
--- a/hw/syborg_virtio.c
+++ b/hw/syborg_virtio.c
@@ -300,6 +300,7 @@ static SysBusDeviceInfo syborg_virtio_net_info = {
                            net.txtimer, TX_TIMER_INTERVAL),
         DEFINE_PROP_INT32("x-txburst", SyborgVirtIOProxy,
                           net.txburst, TX_BURST),
+        DEFINE_PROP_STRING("tx", SyborgVirtIOProxy, net.tx),
         DEFINE_PROP_END_OF_LIST(),
     }
 };
diff --git a/hw/virtio-net.c b/hw/virtio-net.c
index 1a4ba21..0a9cae2 100644
--- a/hw/virtio-net.c
+++ b/hw/virtio-net.c
@@ -36,6 +36,7 @@ typedef struct VirtIONet
     VirtQueue *ctrl_vq;
     NICState *nic;
     QEMUTimer *tx_timer;
+    QEMUBH *tx_bh;
     uint32_t tx_timeout;
     int32_t tx_burst;
     int tx_waiting;
@@ -700,7 +701,7 @@ static int32_t virtio_net_flush_tx(VirtIONet *n, VirtQueue *vq)
     return num_packets;
 }
 
-static void virtio_net_handle_tx(VirtIODevice *vdev, VirtQueue *vq)
+static void virtio_net_handle_tx_timer(VirtIODevice *vdev, VirtQueue *vq)
 {
     VirtIONet *n = to_virtio_net(vdev);
 
@@ -717,6 +718,18 @@ static void virtio_net_handle_tx(VirtIODevice *vdev, VirtQueue *vq)
     }
 }
 
+static void virtio_net_handle_tx_bh(VirtIODevice *vdev, VirtQueue *vq)
+{
+    VirtIONet *n = to_virtio_net(vdev);
+
+    if (unlikely(n->tx_waiting)) {
+        return;
+    }
+    virtio_queue_set_notification(vq, 0);
+    qemu_bh_schedule(n->tx_bh);
+    n->tx_waiting = 1;
+}
+
 static void virtio_net_tx_timer(void *opaque)
 {
     VirtIONet *n = opaque;
@@ -731,6 +744,41 @@ static void virtio_net_tx_timer(void *opaque)
     virtio_net_flush_tx(n, n->tx_vq);
 }
 
+static void virtio_net_tx_bh(void *opaque)
+{
+    VirtIONet *n = opaque;
+    int32_t ret;
+
+    n->tx_waiting = 0;
+
+    /* Just in case the driver is not ready on more */
+    if (unlikely(!(n->vdev.status & VIRTIO_CONFIG_S_DRIVER_OK)))
+        return;
+
+    ret = virtio_net_flush_tx(n, n->tx_vq);
+    if (ret == -EBUSY) {
+        return; /* Notification re-enable handled by tx_complete */
+    }
+
+    /* If we flush a full burst of packets, assume there are
+     * more coming and immediately reschedule */
+    if (ret >= n->tx_burst) {
+        qemu_bh_schedule(n->tx_bh);
+        n->tx_waiting = 1;
+        return;
+    }
+
+    /* If less than a full burst, re-enable notification and flush
+     * anything that may have come in while we weren't looking.  If
+     * we find something, assume the guest is still active and reschedule */
+    virtio_queue_set_notification(n->tx_vq, 1);
+    if (virtio_net_flush_tx(n, n->tx_vq) > 0) {
+        virtio_queue_set_notification(n->tx_vq, 0);
+        qemu_bh_schedule(n->tx_bh);
+        n->tx_waiting = 1;
+    }
+}
+
 static void virtio_net_save(QEMUFile *f, void *opaque)
 {
     VirtIONet *n = opaque;
@@ -850,8 +898,12 @@ static int virtio_net_load(QEMUFile *f, void *opaque, int version_id)
     n->mac_table.first_multi = i;
 
     if (n->tx_waiting) {
-        qemu_mod_timer(n->tx_timer,
-                       qemu_get_clock(vm_clock) + n->tx_timeout);
+        if (n->tx_timer) {
+            qemu_mod_timer(n->tx_timer,
+                           qemu_get_clock(vm_clock) + n->tx_timeout);
+        } else {
+            qemu_bh_schedule(n->tx_bh);
+        }
     }
     return 0;
 }
@@ -929,7 +981,22 @@ VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf,
     n->vdev.reset = virtio_net_reset;
     n->vdev.set_status = virtio_net_set_status;
     n->rx_vq = virtio_add_queue(&n->vdev, 256, virtio_net_handle_rx);
-    n->tx_vq = virtio_add_queue(&n->vdev, 256, virtio_net_handle_tx);
+
+    if (net->tx && strcmp(net->tx, "timer") && strcmp(net->tx, "bh")) {
+        fprintf(stderr, "virtio-net: "
+                "Unknown option tx=%s, valid options: \"timer\" \"bh\"\n",
+                net->tx);
+        fprintf(stderr, "Defaulting to \"bh\"\n");
+    }
+
+    if (net->tx && !strcmp(net->tx, "timer")) {
+        n->tx_vq = virtio_add_queue(&n->vdev, 256, virtio_net_handle_tx_timer);
+        n->tx_timer = qemu_new_timer(vm_clock, virtio_net_tx_timer, n);
+        n->tx_timeout = net->txtimer;
+    } else {
+        n->tx_vq = virtio_add_queue(&n->vdev, 256, virtio_net_handle_tx_bh);
+        n->tx_bh = qemu_bh_new(virtio_net_tx_bh, n);
+    }
     n->ctrl_vq = virtio_add_queue(&n->vdev, 64, virtio_net_handle_ctrl);
     qemu_macaddr_default_if_unset(&conf->macaddr);
     memcpy(&n->mac[0], &conf->macaddr, sizeof(n->mac));
@@ -939,9 +1006,7 @@ VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf,
 
     qemu_format_nic_info_str(&n->nic->nc, conf->macaddr.a);
 
-    n->tx_timer = qemu_new_timer(vm_clock, virtio_net_tx_timer, n);
     n->tx_waiting = 0;
-    n->tx_timeout = net->txtimer;
     n->tx_burst = net->txburst;
     n->mergeable_rx_bufs = 0;
     n->promisc = 1; /* for compatibility */
@@ -974,8 +1039,12 @@ void virtio_net_exit(VirtIODevice *vdev)
     qemu_free(n->mac_table.macs);
     qemu_free(n->vlans);
 
-    qemu_del_timer(n->tx_timer);
-    qemu_free_timer(n->tx_timer);
+    if (n->tx_timer) {
+        qemu_del_timer(n->tx_timer);
+        qemu_free_timer(n->tx_timer);
+    } else {
+        qemu_bh_delete(n->tx_bh);
+    }
 
     virtio_cleanup(&n->vdev);
     qemu_del_vlan_client(&n->nic->nc);
diff --git a/hw/virtio-net.h b/hw/virtio-net.h
index a2d1545..8af9a1c 100644
--- a/hw/virtio-net.h
+++ b/hw/virtio-net.h
@@ -60,6 +60,7 @@ typedef struct virtio_net_conf
 {
     uint32_t txtimer;
     int32_t txburst;
+    char *tx;
 } virtio_net_conf;
 
 /* Maximum packet size we can receive from tap device: header + 64k */
diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
index 3a5b3e6..86e6b0a 100644
--- a/hw/virtio-pci.c
+++ b/hw/virtio-pci.c
@@ -695,6 +695,7 @@ static PCIDeviceInfo virtio_info[] = {
                                net.txtimer, TX_TIMER_INTERVAL),
             DEFINE_PROP_INT32("x-txburst", VirtIOPCIProxy,
                               net.txburst, TX_BURST),
+            DEFINE_PROP_STRING("tx", VirtIOPCIProxy, net.tx),
             DEFINE_PROP_END_OF_LIST(),
         },
         .qdev.reset = virtio_pci_reset,


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 0/4] virtio-net: More configurability and bh handling for tx
  2010-09-02 15:00 [PATCH v2 0/4] virtio-net: More configurability and bh handling for tx Alex Williamson
                   ` (3 preceding siblings ...)
  2010-09-02 15:01 ` [PATCH v2 4/4] virtio-net: Introduce a new bottom half packet TX Alex Williamson
@ 2010-09-02 15:35 ` Michael S. Tsirkin
  2010-09-03 13:11 ` Juan Quintela
  5 siblings, 0 replies; 7+ messages in thread
From: Michael S. Tsirkin @ 2010-09-02 15:35 UTC (permalink / raw)
  To: Alex Williamson; +Cc: qemu-devel, kvm, anthony, chrisw, quintela, jes.sorensen

On Thu, Sep 02, 2010 at 09:00:42AM -0600, Alex Williamson wrote:
> Incorporated feedback.  txburst= and txtimer= are now "x-" prefixed
> developer options.  I added a tx= option, because I do want there to
> be a supported way to switch between TX strategies.  This also drops
> the magic value of txtimer= 1 or 0 (setting default timeout or switching
> modes).  I also dropped the trickiness around only enabling the bottom
> half for the iothread since Anthony indicates we don't care about
> performance for the non-iothread case.
> 
> New performance data against both qemu.git and qemu-kvm.git:
> 
> https://spreadsheets.google.com/pub?key=0AoEm50Bac2U7dGdlREhrWUpPVEdMcTJaX0RjSEgtc3c&hl=en&single=true&gid=0&output=html
> 
> This shows the base performance versus patched with new default for
> various packet sizes and netperf test loads.  The guest is a 2-way
> SMP, connected directly to the host via a tap (no bridge).  Thanks,
> 
> Alex

ACK the series.

Please remember to clone bugzilla into libvirt and ask them
to support the new option on zstream/0day, so that we do have
a working backup plan.

> ---
> 
> Alex Williamson (4):
>       virtio-net: Introduce a new bottom half packet TX
>       virtio-net: Rename tx_timer_active to tx_waiting
>       virtio-net: Limit number of packets sent per TX flush
>       virtio-net: Make tx_timer timeout configurable
> 
> 
>  hw/s390-virtio-bus.c |    8 +++
>  hw/s390-virtio-bus.h |    1 
>  hw/syborg_virtio.c   |    8 +++
>  hw/virtio-net.c      |  129 +++++++++++++++++++++++++++++++++++++++++---------
>  hw/virtio-net.h      |   14 +++++
>  hw/virtio-pci.c      |    8 +++
>  hw/virtio.h          |    4 +-
>  7 files changed, 144 insertions(+), 28 deletions(-)

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 0/4] virtio-net: More configurability and bh handling for tx
  2010-09-02 15:00 [PATCH v2 0/4] virtio-net: More configurability and bh handling for tx Alex Williamson
                   ` (4 preceding siblings ...)
  2010-09-02 15:35 ` [PATCH v2 0/4] virtio-net: More configurability and bh handling for tx Michael S. Tsirkin
@ 2010-09-03 13:11 ` Juan Quintela
  5 siblings, 0 replies; 7+ messages in thread
From: Juan Quintela @ 2010-09-03 13:11 UTC (permalink / raw)
  To: Alex Williamson; +Cc: qemu-devel, kvm, anthony, mst, chrisw, jes.sorensen

Alex Williamson <alex.williamson@redhat.com> wrote:
> Incorporated feedback.  txburst= and txtimer= are now "x-" prefixed
> developer options.  I added a tx= option, because I do want there to
> be a supported way to switch between TX strategies.  This also drops
> the magic value of txtimer= 1 or 0 (setting default timeout or switching
> modes).  I also dropped the trickiness around only enabling the bottom
> half for the iothread since Anthony indicates we don't care about
> performance for the non-iothread case.
>
> New performance data against both qemu.git and qemu-kvm.git:
>
> https://spreadsheets.google.com/pub?key=0AoEm50Bac2U7dGdlREhrWUpPVEdMcTJaX0RjSEgtc3c&hl=en&single=true&gid=0&output=html
>
> This shows the base performance versus patched with new default for
> various packet sizes and netperf test loads.  The guest is a 2-way
> SMP, connected directly to the host via a tap (no bridge).  Thanks,
>
> Alex

Acked-by: Juan Quintela <quintela@redhat.com>

(whole series)

Thanks very much for the cleanups,  I think it is clearer nicer now.

Later, Juan.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2010-09-03 13:13 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-09-02 15:00 [PATCH v2 0/4] virtio-net: More configurability and bh handling for tx Alex Williamson
2010-09-02 15:00 ` [PATCH v2 1/4] virtio-net: Make tx_timer timeout configurable Alex Williamson
2010-09-02 15:00 ` [PATCH v2 2/4] virtio-net: Limit number of packets sent per TX flush Alex Williamson
2010-09-02 15:01 ` [PATCH v2 3/4] virtio-net: Rename tx_timer_active to tx_waiting Alex Williamson
2010-09-02 15:01 ` [PATCH v2 4/4] virtio-net: Introduce a new bottom half packet TX Alex Williamson
2010-09-02 15:35 ` [PATCH v2 0/4] virtio-net: More configurability and bh handling for tx Michael S. Tsirkin
2010-09-03 13:11 ` Juan Quintela

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).