* [PATCH V2 00/20] Multiqueue virtio-net
@ 2013-01-25 10:35 Jason Wang
  2013-01-25 10:35 ` [PATCH V2 01/20] net: introduce qemu_get_queue() Jason Wang
                   ` (20 more replies)
  0 siblings, 21 replies; 41+ messages in thread
From: Jason Wang @ 2013-01-25 10:35 UTC
  To: mst, qemu-devel, aliguori, shajnocz
  Cc: krkumar2, kvm, mprivozn, rusty, jwhan, shiyer, gaowanlong, Jason Wang

Hello all:

This series is an update of the last version of multiqueue virtio-net support.

This series brings multiqueue support to virtio-net through a
multiqueue-capable tap backend and multiple vhost threads.

To support this, multiqueue nic support was added to qemu. This is done by
introducing an array of NetClientStates in NICState and making each pair of
peers a queue of the nic. This is done in patches 1-7.
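
As a rough illustration of that layout, here is a minimal sketch in C. It uses
simplified stand-in types rather than the real qemu definitions: the field
names ncs, queues and queue_index and the helpers nic_new() and
nic_get_subqueue() are assumptions made for this example only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Simplified stand-in for qemu's NetClientState (hypothetical fields). */
typedef struct NetClientState {
    struct NetClientState *peer;   /* other end of this queue */
    unsigned int queue_index;      /* which queue of the nic this is */
    bool link_down;
} NetClientState;

/* Simplified stand-in for NICState: one NetClientState per queue. */
typedef struct NICState {
    NetClientState *ncs;
    unsigned int queues;
} NICState;

/* Pair NIC-side queue i with backends[i]; returns NULL on allocation failure. */
NICState *nic_new(NetClientState *backends, unsigned int queues)
{
    NICState *nic = malloc(sizeof(*nic));
    unsigned int i;

    if (!nic) {
        return NULL;
    }
    nic->ncs = calloc(queues, sizeof(*nic->ncs));
    if (!nic->ncs) {
        free(nic);
        return NULL;
    }
    nic->queues = queues;
    for (i = 0; i < queues; i++) {
        nic->ncs[i].queue_index = i;
        nic->ncs[i].peer = &backends[i];
        backends[i].queue_index = i;
        backends[i].peer = &nic->ncs[i];
    }
    return nic;
}

/* Return the NIC-side NetClientState of the given queue. */
NetClientState *nic_get_subqueue(NICState *nic, unsigned int index)
{
    assert(index < nic->queues);
    return &nic->ncs[index];
}
```

Each NIC-side NetClientState and its backend peer share a queue_index, so code
holding one side of a queue can reach the other side through ->peer.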

Tap was also converted to be able to create a multiqueue backend. Currently,
only Linux supports this, by issuing TUNSETIFF N times with the same device
name to create N queues. Each fd attached through TUNSETIFF is a queue
supported by the kernel. Three new command line options were introduced:
"queues" tells qemu how many queues to create; "fds" passes multiple
pre-created tap file descriptors to qemu; "vhostfds" passes multiple
pre-created vhost descriptors to qemu. This is done in patches 8-13.

A method for deleting a queue and a queue_index were also introduced for
virtio; this is done in patches 15-16.

Vhost was also changed to support multiqueue by introducing a start vq index,
which tracks the first virtqueue that will be used by vhost, instead of
assuming that vhost always uses virtqueues starting from index 0. This is done
in patch 14.
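
The start vq index idea can be sketched as below. The struct and function
names are hypothetical stand-ins, and the mapping of queue pair N to
virtqueues 2N (rx) and 2N+1 (tx) is an assumption made for the example.

```c
#include <assert.h>

/* Hypothetical, stripped-down view of one vhost device. */
typedef struct VhostDevSketch {
    int nvqs;      /* number of virtqueues this vhost device handles */
    int vq_index;  /* index of its first virtqueue in the virtio device */
} VhostDevSketch;

/* Assumption: queue pair N owns virtqueues 2N (rx) and 2N + 1 (tx). */
void vhost_setup_pair(VhostDevSketch *dev, int pair)
{
    dev->nvqs = 2;
    dev->vq_index = pair * 2;
}

/* Translate a vhost-local virtqueue number into the virtio device index. */
int vhost_global_vq_index(const VhostDevSketch *dev, int local)
{
    assert(local >= 0 && local < dev->nvqs);
    return dev->vq_index + local;
}
```

With a start index of 0 this reduces to the old behaviour, where local and
device-wide virtqueue numbers coincide.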

The last part is the multiqueue userspace changes; this is done in patches 17-20.

With these changes, a user can start a multiqueue virtio-net device through

./qemu -netdev tap,id=hn0,queues=2,vhost=on -device virtio-net-pci,netdev=hn0

Management tools such as libvirt can pass multiple pre-created fds/vhostfds through

./qemu -netdev tap,id=hn0,fds=X:Y,vhostfds=M:N -device virtio-net-pci,netdev=hn0
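
For illustration, a colon-separated list such as fds=X:Y could be split along
these lines. This is only a sketch: parse_fd_list() is a hypothetical helper,
and accepting purely numeric descriptors is an assumption of the example.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Parse a colon-separated list such as "3:4:5" into out[].
 * Returns the number of entries, or -1 on malformed input or when the
 * list holds more than max entries.
 */
int parse_fd_list(const char *str, int *out, int max)
{
    int n = 0;

    assert(str != NULL && out != NULL);
    while (*str != '\0') {
        char *end;
        long v;

        errno = 0;
        v = strtol(str, &end, 10);
        if (end == str || errno != 0 || v < 0 || v > INT_MAX || n == max) {
            return -1;
        }
        out[n++] = (int)v;
        if (*end == ':' && end[1] != '\0') {
            str = end + 1;          /* more entries follow */
        } else if (*end == '\0') {
            str = end;              /* end of list */
        } else {
            return -1;              /* trailing ':' or stray character */
        }
    }
    return n;
}
```

The number of parsed tap descriptors gives the queue count, and a matching
vhostfds list would be expected to have the same length.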

No git tree this round since github is unavailable in China...

Changes from V1:
- silence checkpatch (Blue)
- use fds/vhostfds instead of fd/vhostfd (Stefan)
- use fds="X:Y:Z" instead of fd=X,fd=Y,fd=Z (Anthony)
- split patches (Stefan)
- typos in commit log (Stefan)
- warn about 'queues=' when fds/vhostfds are used (Stefan)
- rename __net_init_tap to net_init_tap_one (Stefan)
- check the consistency of vnet_hdr of multiple tap fds (Stefan)
- disable multiqueue support for bridge-helper (Stefan)
- rename tap_attach()/tap_detach() to tap_enable()/tap_disable() (Stefan)
- fix booting with legacy guest (WanLong)
- don't bump the version when doing migration (Michael)
- simplify the interface between virtio-net and multiqueue vhost_net (Michael)
- rebase the patches to the latest tree
- re-order the patches so that the net part comes first, to simplify
  reviewing
- simplify the interface between virtio-net and multiqueue vhost_net
- move the guest notifiers setup from vhost to vhost_net
- fix a build issue in hw/mcf_fec.c

Changes from RFC v2:
- rebase the code to the latest qemu
- align the multiqueue virtio-net implementation to the virtio spec
- split the patches into smaller patches
- set_link and hotplug support

Changes from RFC V1:
- rebase to the latest
- fix memory leak in parse_netdev
- fix guest notifiers assignment/de-assignment
- change the command line to:
   qemu -netdev tap,queues=2 -device virtio-net-pci,queues=2

Reference:
V1: http://lists.nongnu.org/archive/html/qemu-devel/2012-12/msg03558.html
RFC v2: http://lists.gnu.org/archive/html/qemu-devel/2012-06/msg04108.html
RFC v1: http://comments.gmane.org/gmane.comp.emulators.qemu/100481

Perf Numbers:
- norm is short for normalized result
- trans.rate is short for transaction rate

Two Intel Xeon 5620 hosts with directly connected Intel 82599EB NICs
Host/Guest kernel: David net tree
vhost enabled

- large improvements in both latency and cpu utilization in the
  request-response tests
- a regression when the guest sends small packets, because TCP tends to batch
  less when latency improves

1q/2q/4q (each row lists a value/norm pair for 1, 2 and 4 queues, in that order)
TCP_RR
 size #sessions trans.rate  norm trans.rate  norm trans.rate  norm
1 1     9393.26   595.64  9408.18   597.34  9375.19   584.12
1 20    72162.1   2214.24 129880.22 2456.13 196949.81 2298.13
1 50    107513.38 2653.99 139721.93 2490.58 259713.82 2873.57
1 100   126734.63 2676.54 145553.5  2406.63 265252.68 2943
64 1    9453.42   632.33  9371.37   616.13  9338.19   615.97
64 20   70620.03  2093.68 125155.75 2409.15 191239.91 2253.32
64 50   106966    2448.29 146518.67 2514.47 242134.07 2720.91
64 100  117046.35 2394.56 190153.09 2696.82 238881.29 2704.41
256 1   8733.29   736.36  8701.07   680.83  8608.92   530.1
256 20  69279.89  2274.45 115103.07 2299.76 144555.16 1963.53
256 50  97676.02  2296.09 150719.57 2522.92 254510.5  3028.44
256 100 150221.55 2949.56 197569.3  2790.92 300695.78 3494.83
TCP_CRR
 size #sessions trans.rate  norm trans.rate  norm trans.rate  norm
1 1     2848.37  163.41 2230.39  130.89 2013.09  120.47
1 20    23434.5  562.11 31057.43 531.07 49488.28 564.41
1 50    28514.88 582.17 40494.23 605.92 60113.35 654.97
1 100   28827.22 584.73 48813.25 661.6  61783.62 676.56
64 1    2780.08  159.4  2201.07  127.96 2006.8   117.63
64 20   23318.51 564.47 30982.44 530.24 49734.95 566.13
64 50   28585.72 582.54 40576.7  610.08 60167.89 656.56
64 100  28747.37 584.17 49081.87 667.87 60612.94 662
256 1   2772.08  160.51 2231.84  131.05 2003.62  113.45
256 20  23086.35 559.8  30929.09 528.16 48454.9  555.22
256 50  28354.7  579.85 40578.31 607    60261.71 657.87
256 100 28844.55 585.67 48541.86 659.08 61941.07 676.72
TCP_STREAM guest receiving
 size #sessions throughput  norm throughput  norm throughput  norm
1 1     16.27   1.33   16.1    1.12   16.13   0.99
1 2     33.04   2.08   32.96   2.19   32.75   1.98
1 4     66.62   6.83   68.3    5.56   66.14   2.65
64 1    896.55  56.67  914.02  58.14  898.9   61.56
64 2    1830.46 91.02  1812.02 64.59  1835.57 66.26
64 4    3626.61 142.55 3636.25 100.64 3607.46 75.03
256 1   2619.49 131.23 2543.19 129.03 2618.69 132.39
256 2   5136.58 203.02 5163.31 141.11 5236.51 149.4
256 4   7063.99 242.83 9365.4  208.49 9421.03 159.94
512 1   3592.43 165.24 3603.12 167.19 3552.5  169.57
512 2   7042.62 246.59 7068.46 180.87 7258.52 186.3
512 4   6996.08 241.49 9298.34 206.12 9418.52 159.33
1024 1  4339.54 192.95 4370.2  191.92 4211.72 192.49
1024 2  7439.45 254.77 9403.99 215.24 9120.82 222.67
1024 4  7953.86 272.11 9403.87 208.23 9366.98 159.49
4096 1  7696.28 272.04 7611.41 270.38 7778.71 267.76
4096 2  7530.35 261.1  8905.43 246.27 8990.18 267.57
4096 4  7121.6  247.02 9411.75 206.71 9654.96 184.67
16384 1 7795.73 268.54 7780.94 267.2  7634.26 260.73
16384 2 7436.57 255.81 9381.86 220.85 9392    220.36
16384 4 7199.07 247.81 9420.96 205.87 9373.69 159.57
TCP_MAERTS guest sending
 size #sessions throughput  norm throughput  norm throughput  norm
1 1     15.94   0.62   15.55   0.61   15.13   0.59
1 2     36.11   0.83   32.46   0.69   32.28   0.69
1 4     71.59   1      68.91   0.94   61.52   0.77
64 1    630.71  22.52  622.11  22.35  605.09  21.84
64 2    1442.36 30.57  1292.15 25.82  1282.67 25.55
64 4    3186.79 42.59  2844.96 36.03  2529.69 30.06
256 1   1760.96 58.07  1738.44 57.43  1695.99 56.19
256 2   4834.23 95.19  3524.85 64.21  3511.94 64.45
256 4   9324.63 145.74 8956.49 116.39 6720.17 73.86
512 1   2678.03 84.1   2630.68 82.93  2636.54 82.57
512 2   9368.17 195.61 9408.82 204.53 5316.3  92.99
512 4   9186.34 209.68 9358.72 183.82 9489.29 160.42
1024 1  3620.71 109.88 3625.54 109.83 3606.61 112.35
1024 2  9429    258.32 7082.79 120.55 7403.53 134.78
1024 4  9430.66 290.44 9499.29 232.31 9414.6  190.92
4096 1  9339.28 296.48 9374.23 372.88 9348.76 298.49
4096 2  9410.53 378.69 9412.61 286.18 9409.75 278.31
4096 4  9487.35 374.1  9556.91 288.81 9441.94 221.64
16384 1 9380.43 403.8  9379.78 399.13 9382.42 393.55
16384 2 9367.69 406.93 9415.04 312.68 9409.29 300.9
16384 4 9391.96 405.17 9695.12 310.54 9423.76 223.47

Jason Wang (20):
  net: introduce qemu_get_queue()
  net: introduce qemu_get_nic()
  net: introduce qemu_del_nic()
  net: introduce qemu_find_net_clients_except()
  net: introduce qemu_net_client_setup()
  net: introduce NetClientState destructor
  net: multiqueue support
  tap: import linux multiqueue constants
  tap: factor out common tap initialization
  tap: add Linux multiqueue support
  tap: support enabling or disabling a queue
  tap: introduce a helper to get the name of an interface
  tap: multiqueue support
  vhost: multiqueue support
  virtio: introduce virtio_del_queue()
  virtio: add a queue_index to VirtQueue
  virtio-net: separate virtqueue from VirtIONet
  virtio-net: multiqueue support
  virtio-net: migration support for multiqueue
  virtio-net: compat multiqueue support

 hw/cadence_gem.c            |   17 +-
 hw/dp8393x.c                |   17 +-
 hw/e1000.c                  |   32 ++--
 hw/eepro100.c               |   18 +-
 hw/etraxfs_eth.c            |   10 +-
 hw/lan9118.c                |   16 +-
 hw/lance.c                  |    2 +-
 hw/mcf_fec.c                |   12 +-
 hw/milkymist-minimac2.c     |   10 +-
 hw/mipsnet.c                |   10 +-
 hw/musicpal.c               |    6 +-
 hw/ne2000-isa.c             |    4 +-
 hw/ne2000.c                 |   13 +-
 hw/opencores_eth.c          |   12 +-
 hw/pc_piix.c                |    4 +
 hw/pcnet-pci.c              |    4 +-
 hw/pcnet.c                  |   13 +-
 hw/qdev-properties-system.c |   46 ++++-
 hw/qdev-properties.h        |    6 +-
 hw/rtl8139.c                |   22 +-
 hw/smc91c111.c              |   10 +-
 hw/spapr_llan.c             |    8 +-
 hw/stellaris_enet.c         |   11 +-
 hw/usb/dev-network.c        |   16 +-
 hw/vhost.c                  |   82 +++----
 hw/vhost.h                  |    2 +
 hw/vhost_net.c              |   92 +++++++-
 hw/vhost_net.h              |    6 +-
 hw/virtio-net.c             |  519 +++++++++++++++++++++++++++++++------------
 hw/virtio-net.h             |   28 +++-
 hw/virtio.c                 |   17 ++
 hw/virtio.h                 |    3 +
 hw/xen_nic.c                |   17 +-
 hw/xgmac.c                  |   10 +-
 hw/xilinx_axienet.c         |   10 +-
 hw/xilinx_ethlite.c         |   10 +-
 include/net/net.h           |   26 ++-
 include/net/tap.h           |    2 +
 net/net.c                   |  198 +++++++++++++----
 net/tap-aix.c               |   19 ++-
 net/tap-bsd.c               |   18 ++-
 net/tap-haiku.c             |   18 ++-
 net/tap-linux.c             |   67 ++++++-
 net/tap-linux.h             |    4 +
 net/tap-solaris.c           |   18 ++-
 net/tap-win32.c             |   10 +
 net/tap.c                   |  295 ++++++++++++++++++-------
 net/tap_int.h               |    6 +-
 qapi-schema.json            |    5 +-
 savevm.c                    |    2 +-
 50 files changed, 1319 insertions(+), 484 deletions(-)



* [PATCH V2 01/20] net: introduce qemu_get_queue()
  2013-01-25 10:35 [PATCH V2 00/20] Multiqueue virtio-net Jason Wang
@ 2013-01-25 10:35 ` Jason Wang
  2013-01-25 10:35 ` [PATCH V2 02/20] net: introduce qemu_get_nic() Jason Wang
                   ` (19 subsequent siblings)
  20 siblings, 0 replies; 41+ messages in thread
From: Jason Wang @ 2013-01-25 10:35 UTC
  To: mst, qemu-devel, aliguori, shajnocz
  Cc: krkumar2, kvm, mprivozn, rusty, jwhan, shiyer, gaowanlong, Jason Wang

To support multiqueue, this patch introduces a helper, qemu_get_queue(),
which is used to get the NetClientState of a device. Later patches will
refactor this helper to support multiqueue.

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 hw/cadence_gem.c        |    9 +++--
 hw/dp8393x.c            |    9 +++--
 hw/e1000.c              |   24 ++++++++-------
 hw/eepro100.c           |   12 ++++----
 hw/etraxfs_eth.c        |    4 +-
 hw/lan9118.c            |   10 +++---
 hw/mcf_fec.c            |    4 +-
 hw/milkymist-minimac2.c |    4 +-
 hw/mipsnet.c            |    4 +-
 hw/musicpal.c           |    2 +-
 hw/ne2000-isa.c         |    2 +-
 hw/ne2000.c             |    7 ++--
 hw/opencores_eth.c      |    6 ++--
 hw/pcnet-pci.c          |    2 +-
 hw/pcnet.c              |    7 ++--
 hw/rtl8139.c            |   14 ++++----
 hw/smc91c111.c          |    4 +-
 hw/spapr_llan.c         |    4 +-
 hw/stellaris_enet.c     |    5 ++-
 hw/usb/dev-network.c    |   10 +++---
 hw/virtio-net.c         |   76 ++++++++++++++++++++++++++---------------------
 hw/xen_nic.c            |   13 +++++---
 hw/xgmac.c              |    4 +-
 hw/xilinx_axienet.c     |    4 +-
 hw/xilinx_ethlite.c     |    4 +-
 include/net/net.h       |    1 +
 net/net.c               |    5 +++
 savevm.c                |    2 +-
 28 files changed, 138 insertions(+), 114 deletions(-)

diff --git a/hw/cadence_gem.c b/hw/cadence_gem.c
index 0d83442..9de688f 100644
--- a/hw/cadence_gem.c
+++ b/hw/cadence_gem.c
@@ -389,10 +389,10 @@ static void gem_init_register_masks(GemState *s)
  */
 static void phy_update_link(GemState *s)
 {
-    DB_PRINT("down %d\n", s->nic->nc.link_down);
+    DB_PRINT("down %d\n", qemu_get_queue(s->nic)->link_down);
 
     /* Autonegotiation status mirrors link status.  */
-    if (s->nic->nc.link_down) {
+    if (qemu_get_queue(s->nic)->link_down) {
         s->phy_regs[PHY_REG_STATUS] &= ~(PHY_REG_STATUS_ANEGCMPL |
                                          PHY_REG_STATUS_LINK);
         s->phy_regs[PHY_REG_INT_ST] |= PHY_REG_INT_ST_LINKC;
@@ -906,9 +906,10 @@ static void gem_transmit(GemState *s)
 
             /* Send the packet somewhere */
             if (s->phy_loop) {
-                gem_receive(&s->nic->nc, tx_packet, total_bytes);
+                gem_receive(qemu_get_queue(s->nic), tx_packet, total_bytes);
             } else {
-                qemu_send_packet(&s->nic->nc, tx_packet, total_bytes);
+                qemu_send_packet(qemu_get_queue(s->nic), tx_packet,
+                                 total_bytes);
             }
 
             /* Prepare for next packet */
diff --git a/hw/dp8393x.c b/hw/dp8393x.c
index b501450..c2d0bc8 100644
--- a/hw/dp8393x.c
+++ b/hw/dp8393x.c
@@ -339,6 +339,7 @@ static void do_receiver_disable(dp8393xState *s)
 
 static void do_transmit_packets(dp8393xState *s)
 {
+    NetClientState *nc = qemu_get_queue(s->nic);
     uint16_t data[12];
     int width, size;
     int tx_len, len;
@@ -408,13 +409,13 @@ static void do_transmit_packets(dp8393xState *s)
         if (s->regs[SONIC_RCR] & (SONIC_RCR_LB1 | SONIC_RCR_LB0)) {
             /* Loopback */
             s->regs[SONIC_TCR] |= SONIC_TCR_CRSL;
-            if (s->nic->nc.info->can_receive(&s->nic->nc)) {
+            if (nc->info->can_receive(nc)) {
                 s->loopback_packet = 1;
-                s->nic->nc.info->receive(&s->nic->nc, s->tx_buffer, tx_len);
+                nc->info->receive(nc, s->tx_buffer, tx_len);
             }
         } else {
             /* Transmit packet */
-            qemu_send_packet(&s->nic->nc, s->tx_buffer, tx_len);
+            qemu_send_packet(nc, s->tx_buffer, tx_len);
         }
         s->regs[SONIC_TCR] |= SONIC_TCR_PTX;
 
@@ -903,7 +904,7 @@ void dp83932_init(NICInfo *nd, hwaddr base, int it_shift,
 
     s->nic = qemu_new_nic(&net_dp83932_info, &s->conf, nd->model, nd->name, s);
 
-    qemu_format_nic_info_str(&s->nic->nc, s->conf.macaddr.a);
+    qemu_format_nic_info_str(qemu_get_queue(s->nic), s->conf.macaddr.a);
     qemu_register_reset(nic_reset, s);
     nic_reset(s);
 
diff --git a/hw/e1000.c b/hw/e1000.c
index ef06ca1..7b310d7 100644
--- a/hw/e1000.c
+++ b/hw/e1000.c
@@ -167,11 +167,11 @@ set_phy_ctrl(E1000State *s, int index, uint16_t val)
 {
     if ((val & MII_CR_AUTO_NEG_EN) && (val & MII_CR_RESTART_AUTO_NEG)) {
         /* no need auto-negotiation if link was down */
-        if (s->nic->nc.link_down) {
+        if (qemu_get_queue(s->nic)->link_down) {
             s->phy_reg[PHY_STATUS] |= MII_SR_AUTONEG_COMPLETE;
             return;
         }
-        s->nic->nc.link_down = true;
+        qemu_get_queue(s->nic)->link_down = true;
         e1000_link_down(s);
         s->phy_reg[PHY_STATUS] &= ~MII_SR_AUTONEG_COMPLETE;
         DBGOUT(PHY, "Start link auto negotiation\n");
@@ -183,7 +183,7 @@ static void
 e1000_autoneg_timer(void *opaque)
 {
     E1000State *s = opaque;
-    s->nic->nc.link_down = false;
+    qemu_get_queue(s->nic)->link_down = false;
     e1000_link_up(s);
     s->phy_reg[PHY_STATUS] |= MII_SR_AUTONEG_COMPLETE;
     DBGOUT(PHY, "Auto negotiation is completed\n");
@@ -286,7 +286,7 @@ static void e1000_reset(void *opaque)
     d->rxbuf_min_shift = 1;
     memset(&d->tx, 0, sizeof d->tx);
 
-    if (d->nic->nc.link_down) {
+    if (qemu_get_queue(d->nic)->link_down) {
         e1000_link_down(d);
     }
 
@@ -314,7 +314,7 @@ set_rx_control(E1000State *s, int index, uint32_t val)
     s->rxbuf_min_shift = ((val / E1000_RCTL_RDMTS_QUAT) & 3) + 1;
     DBGOUT(RX, "RCTL: %d, mac_reg[RCTL] = 0x%x\n", s->mac_reg[RDT],
            s->mac_reg[RCTL]);
-    qemu_flush_queued_packets(&s->nic->nc);
+    qemu_flush_queued_packets(qemu_get_queue(s->nic));
 }
 
 static void
@@ -465,10 +465,11 @@ fcs_len(E1000State *s)
 static void
 e1000_send_packet(E1000State *s, const uint8_t *buf, int size)
 {
+    NetClientState *nc = qemu_get_queue(s->nic);
     if (s->phy_reg[PHY_CTRL] & MII_CR_LOOPBACK) {
-        s->nic->nc.info->receive(&s->nic->nc, buf, size);
+        nc->info->receive(nc, buf, size);
     } else {
-        qemu_send_packet(&s->nic->nc, buf, size);
+        qemu_send_packet(nc, buf, size);
     }
 }
 
@@ -953,7 +954,7 @@ set_rdt(E1000State *s, int index, uint32_t val)
 {
     s->mac_reg[index] = val & 0xffff;
     if (e1000_has_rxbufs(s, 1)) {
-        qemu_flush_queued_packets(&s->nic->nc);
+        qemu_flush_queued_packets(qemu_get_queue(s->nic));
     }
 }
 
@@ -1107,10 +1108,11 @@ static bool is_version_1(void *opaque, int version_id)
 static int e1000_post_load(void *opaque, int version_id)
 {
     E1000State *s = opaque;
+    NetClientState *nc = qemu_get_queue(s->nic);
 
     /* nc.link_down can't be migrated, so infer link_down according
      * to link status bit in mac_reg[STATUS] */
-    s->nic->nc.link_down = (s->mac_reg[STATUS] & E1000_STATUS_LU) == 0;
+    nc->link_down = (s->mac_reg[STATUS] & E1000_STATUS_LU) == 0;
 
     return 0;
 }
@@ -1242,7 +1244,7 @@ pci_e1000_uninit(PCIDevice *dev)
     qemu_free_timer(d->autoneg_timer);
     memory_region_destroy(&d->mmio);
     memory_region_destroy(&d->io);
-    qemu_del_net_client(&d->nic->nc);
+    qemu_del_net_client(qemu_get_queue(d->nic));
 }
 
 static NetClientInfo net_e1000_info = {
@@ -1289,7 +1291,7 @@ static int pci_e1000_init(PCIDevice *pci_dev)
     d->nic = qemu_new_nic(&net_e1000_info, &d->conf,
                           object_get_typename(OBJECT(d)), d->dev.qdev.id, d);
 
-    qemu_format_nic_info_str(&d->nic->nc, macaddr);
+    qemu_format_nic_info_str(qemu_get_queue(d->nic), macaddr);
 
     add_boot_device_path(d->conf.bootindex, &pci_dev->qdev, "/ethernet-phy@0");
 
diff --git a/hw/eepro100.c b/hw/eepro100.c
index 6bbefb5..5b77bdc 100644
--- a/hw/eepro100.c
+++ b/hw/eepro100.c
@@ -828,7 +828,7 @@ static void tx_command(EEPRO100State *s)
         }
     }
     TRACE(RXTX, logout("%p sending frame, len=%d,%s\n", s, size, nic_dump(buf, size)));
-    qemu_send_packet(&s->nic->nc, buf, size);
+    qemu_send_packet(qemu_get_queue(s->nic), buf, size);
     s->statistics.tx_good_frames++;
     /* Transmit with bad status would raise an CX/TNO interrupt.
      * (82557 only). Emulation never has bad status. */
@@ -1036,7 +1036,7 @@ static void eepro100_ru_command(EEPRO100State * s, uint8_t val)
         }
         set_ru_state(s, ru_ready);
         s->ru_offset = e100_read_reg4(s, SCBPointer);
-        qemu_flush_queued_packets(&s->nic->nc);
+        qemu_flush_queued_packets(qemu_get_queue(s->nic));
         TRACE(OTHER, logout("val=0x%02x (rx start)\n", val));
         break;
     case RX_RESUME:
@@ -1849,7 +1849,7 @@ static void pci_nic_uninit(PCIDevice *pci_dev)
     memory_region_destroy(&s->flash_bar);
     vmstate_unregister(&pci_dev->qdev, s->vmstate, s);
     eeprom93xx_free(&pci_dev->qdev, s->eeprom);
-    qemu_del_net_client(&s->nic->nc);
+    qemu_del_net_client(qemu_get_queue(s->nic));
 }
 
 static NetClientInfo net_eepro100_info = {
@@ -1895,14 +1895,14 @@ static int e100_nic_init(PCIDevice *pci_dev)
     s->nic = qemu_new_nic(&net_eepro100_info, &s->conf,
                           object_get_typename(OBJECT(pci_dev)), pci_dev->qdev.id, s);
 
-    qemu_format_nic_info_str(&s->nic->nc, s->conf.macaddr.a);
-    TRACE(OTHER, logout("%s\n", s->nic->nc.info_str));
+    qemu_format_nic_info_str(qemu_get_queue(s->nic), s->conf.macaddr.a);
+    TRACE(OTHER, logout("%s\n", qemu_get_queue(s->nic)->info_str));
 
     qemu_register_reset(nic_reset, s);
 
     s->vmstate = g_malloc(sizeof(vmstate_eepro100));
     memcpy(s->vmstate, &vmstate_eepro100, sizeof(vmstate_eepro100));
-    s->vmstate->name = s->nic->nc.model;
+    s->vmstate->name = qemu_get_queue(s->nic)->model;
     vmstate_register(&pci_dev->qdev, -1, s->vmstate, s);
 
     add_boot_device_path(s->conf.bootindex, &pci_dev->qdev, "/ethernet-phy@0");
diff --git a/hw/etraxfs_eth.c b/hw/etraxfs_eth.c
index ec23fa6..9df476a 100644
--- a/hw/etraxfs_eth.c
+++ b/hw/etraxfs_eth.c
@@ -545,7 +545,7 @@ static int eth_tx_push(void *opaque, unsigned char *buf, int len, bool eop)
 	struct fs_eth *eth = opaque;
 
 	D(printf("%s buf=%p len=%d\n", __func__, buf, len));
-	qemu_send_packet(&eth->nic->nc, buf, len);
+        qemu_send_packet(qemu_get_queue(eth->nic), buf, len);
 	return len;
 }
 
@@ -606,7 +606,7 @@ static int fs_eth_init(SysBusDevice *dev)
 	qemu_macaddr_default_if_unset(&s->conf.macaddr);
 	s->nic = qemu_new_nic(&net_etraxfs_info, &s->conf,
 			      object_get_typename(OBJECT(s)), dev->qdev.id, s);
-	qemu_format_nic_info_str(&s->nic->nc, s->conf.macaddr.a);
+        qemu_format_nic_info_str(qemu_get_queue(s->nic), s->conf.macaddr.a);
 
 	tdk_init(&s->phy);
 	mdio_attach(&s->mdio_bus, &s->phy, s->phyaddr);
diff --git a/hw/lan9118.c b/hw/lan9118.c
index 6596979..262f389 100644
--- a/hw/lan9118.c
+++ b/hw/lan9118.c
@@ -341,7 +341,7 @@ static void lan9118_update(lan9118_state *s)
 
 static void lan9118_mac_changed(lan9118_state *s)
 {
-    qemu_format_nic_info_str(&s->nic->nc, s->conf.macaddr.a);
+    qemu_format_nic_info_str(qemu_get_queue(s->nic), s->conf.macaddr.a);
 }
 
 static void lan9118_reload_eeprom(lan9118_state *s)
@@ -373,7 +373,7 @@ static void phy_update_irq(lan9118_state *s)
 static void phy_update_link(lan9118_state *s)
 {
     /* Autonegotiation status mirrors link status.  */
-    if (s->nic->nc.link_down) {
+    if (qemu_get_queue(s->nic)->link_down) {
         s->phy_status &= ~0x0024;
         s->phy_int |= PHY_INT_DOWN;
     } else {
@@ -657,9 +657,9 @@ static void do_tx_packet(lan9118_state *s)
     /* FIXME: Honor TX disable, and allow queueing of packets.  */
     if (s->phy_control & 0x4000)  {
         /* This assumes the receive routine doesn't touch the VLANClient.  */
-        lan9118_receive(&s->nic->nc, s->txp->data, s->txp->len);
+        lan9118_receive(qemu_get_queue(s->nic), s->txp->data, s->txp->len);
     } else {
-        qemu_send_packet(&s->nic->nc, s->txp->data, s->txp->len);
+        qemu_send_packet(qemu_get_queue(s->nic), s->txp->data, s->txp->len);
     }
     s->txp->fifo_used = 0;
 
@@ -1335,7 +1335,7 @@ static int lan9118_init1(SysBusDevice *dev)
 
     s->nic = qemu_new_nic(&net_lan9118_info, &s->conf,
                           object_get_typename(OBJECT(dev)), dev->qdev.id, s);
-    qemu_format_nic_info_str(&s->nic->nc, s->conf.macaddr.a);
+    qemu_format_nic_info_str(qemu_get_queue(s->nic), s->conf.macaddr.a);
     s->eeprom[0] = 0xa5;
     for (i = 0; i < 6; i++) {
         s->eeprom[i + 1] = s->conf.macaddr.a[i];
diff --git a/hw/mcf_fec.c b/hw/mcf_fec.c
index 2423f64..8a90bf8 100644
--- a/hw/mcf_fec.c
+++ b/hw/mcf_fec.c
@@ -174,7 +174,7 @@ static void mcf_fec_do_tx(mcf_fec_state *s)
         if (bd.flags & FEC_BD_L) {
             /* Last buffer in frame.  */
             DPRINTF("Sending packet\n");
-            qemu_send_packet(&s->nic->nc, frame, len);
+            qemu_send_packet(qemu_get_queue(s->nic), frame, len);
             ptr = frame;
             frame_size = 0;
             s->eir |= FEC_INT_TXF;
@@ -476,5 +476,5 @@ void mcf_fec_init(MemoryRegion *sysmem, NICInfo *nd,
 
     s->nic = qemu_new_nic(&net_mcf_fec_info, &s->conf, nd->model, nd->name, s);
 
-    qemu_format_nic_info_str(&s->nic->nc, s->conf.macaddr.a);
+    qemu_format_nic_info_str(qemu_get_queue(s->nic), s->conf.macaddr.a);
 }
diff --git a/hw/milkymist-minimac2.c b/hw/milkymist-minimac2.c
index 43d6c19..2a8a4ef 100644
--- a/hw/milkymist-minimac2.c
+++ b/hw/milkymist-minimac2.c
@@ -257,7 +257,7 @@ static void minimac2_tx(MilkymistMinimac2State *s)
     trace_milkymist_minimac2_tx_frame(txcount - 12);
 
     /* send packet, skipping preamble and sfd */
-    qemu_send_packet_raw(&s->nic->nc, buf + 8, txcount - 12);
+    qemu_send_packet_raw(qemu_get_queue(s->nic), buf + 8, txcount - 12);
 
     s->regs[R_TXCOUNT] = 0;
 
@@ -480,7 +480,7 @@ static int milkymist_minimac2_init(SysBusDevice *dev)
     qemu_macaddr_default_if_unset(&s->conf.macaddr);
     s->nic = qemu_new_nic(&net_milkymist_minimac2_info, &s->conf,
                           object_get_typename(OBJECT(dev)), dev->qdev.id, s);
-    qemu_format_nic_info_str(&s->nic->nc, s->conf.macaddr.a);
+    qemu_format_nic_info_str(qemu_get_queue(s->nic), s->conf.macaddr.a);
 
     return 0;
 }
diff --git a/hw/mipsnet.c b/hw/mipsnet.c
index feac815..15761b1 100644
--- a/hw/mipsnet.c
+++ b/hw/mipsnet.c
@@ -173,7 +173,7 @@ static void mipsnet_ioport_write(void *opaque, hwaddr addr,
         if (s->tx_written == s->tx_count) {
             /* Send buffer. */
             trace_mipsnet_send(s->tx_count);
-            qemu_send_packet(&s->nic->nc, s->tx_buffer, s->tx_count);
+            qemu_send_packet(qemu_get_queue(s->nic), s->tx_buffer, s->tx_count);
             s->tx_count = s->tx_written = 0;
             s->intctl |= MIPSNET_INTCTL_TXDONE;
             s->busy = 1;
@@ -241,7 +241,7 @@ static int mipsnet_sysbus_init(SysBusDevice *dev)
 
     s->nic = qemu_new_nic(&net_mipsnet_info, &s->conf,
                           object_get_typename(OBJECT(dev)), dev->qdev.id, s);
-    qemu_format_nic_info_str(&s->nic->nc, s->conf.macaddr.a);
+    qemu_format_nic_info_str(qemu_get_queue(s->nic), s->conf.macaddr.a);
 
     return 0;
 }
diff --git a/hw/musicpal.c b/hw/musicpal.c
index 7ac0a91..9e22f69 100644
--- a/hw/musicpal.c
+++ b/hw/musicpal.c
@@ -257,7 +257,7 @@ static void eth_send(mv88w8618_eth_state *s, int queue_index)
             len = desc.bytes;
             if (len < 2048) {
                 cpu_physical_memory_read(desc.buffer, buf, len);
-                qemu_send_packet(&s->nic->nc, buf, len);
+                qemu_send_packet(qemu_get_queue(s->nic), buf, len);
             }
             desc.cmdstat &= ~MP_ETH_TX_OWN;
             s->icr |= 1 << (MP_ETH_IRQ_TXLO_BIT - queue_index);
diff --git a/hw/ne2000-isa.c b/hw/ne2000-isa.c
index 7c11229..fa47e12 100644
--- a/hw/ne2000-isa.c
+++ b/hw/ne2000-isa.c
@@ -77,7 +77,7 @@ static int isa_ne2000_initfn(ISADevice *dev)
 
     s->nic = qemu_new_nic(&net_ne2000_isa_info, &s->c,
                           object_get_typename(OBJECT(dev)), dev->qdev.id, s);
-    qemu_format_nic_info_str(&s->nic->nc, s->c.macaddr.a);
+    qemu_format_nic_info_str(qemu_get_queue(s->nic), s->c.macaddr.a);
 
     return 0;
 }
diff --git a/hw/ne2000.c b/hw/ne2000.c
index 872115c..03c4209 100644
--- a/hw/ne2000.c
+++ b/hw/ne2000.c
@@ -300,7 +300,8 @@ static void ne2000_ioport_write(void *opaque, uint32_t addr, uint32_t val)
                     index -= NE2000_PMEM_SIZE;
                 /* fail safe: check range on the transmitted length  */
                 if (index + s->tcnt <= NE2000_PMEM_END) {
-                    qemu_send_packet(&s->nic->nc, s->mem + index, s->tcnt);
+                    qemu_send_packet(qemu_get_queue(s->nic), s->mem + index,
+                                     s->tcnt);
                 }
                 /* signal end of transfer */
                 s->tsr = ENTSR_PTX;
@@ -737,7 +738,7 @@ static int pci_ne2000_init(PCIDevice *pci_dev)
 
     s->nic = qemu_new_nic(&net_ne2000_info, &s->c,
                           object_get_typename(OBJECT(pci_dev)), pci_dev->qdev.id, s);
-    qemu_format_nic_info_str(&s->nic->nc, s->c.macaddr.a);
+    qemu_format_nic_info_str(qemu_get_queue(s->nic), s->c.macaddr.a);
 
     add_boot_device_path(s->c.bootindex, &pci_dev->qdev, "/ethernet-phy@0");
 
@@ -750,7 +751,7 @@ static void pci_ne2000_exit(PCIDevice *pci_dev)
     NE2000State *s = &d->ne2000;
 
     memory_region_destroy(&s->io);
-    qemu_del_net_client(&s->nic->nc);
+    qemu_del_net_client(qemu_get_queue(s->nic));
 }
 
 static Property ne2000_properties[] = {
diff --git a/hw/opencores_eth.c b/hw/opencores_eth.c
index 746a959..2496d4e 100644
--- a/hw/opencores_eth.c
+++ b/hw/opencores_eth.c
@@ -339,7 +339,7 @@ static void open_eth_reset(void *opaque)
     s->rx_desc = 0x40;
 
     mii_reset(&s->mii);
-    open_eth_set_link_status(&s->nic->nc);
+    open_eth_set_link_status(qemu_get_queue(s->nic));
 }
 
 static int open_eth_can_receive(NetClientState *nc)
@@ -499,7 +499,7 @@ static void open_eth_start_xmit(OpenEthState *s, desc *tx)
     if (tx_len > len) {
         memset(buf + len, 0, tx_len - len);
     }
-    qemu_send_packet(&s->nic->nc, buf, tx_len);
+    qemu_send_packet(qemu_get_queue(s->nic), buf, tx_len);
 
     if (tx->len_flags & TXD_WR) {
         s->tx_desc = 0;
@@ -606,7 +606,7 @@ static void open_eth_mii_command_host_write(OpenEthState *s, uint32_t val)
         } else {
             s->regs[MIIRX_DATA] = 0xffff;
         }
-        SET_REGFIELD(s, MIISTATUS, LINKFAIL, s->nic->nc.link_down);
+        SET_REGFIELD(s, MIISTATUS, LINKFAIL, qemu_get_queue(s->nic)->link_down);
     }
 }
 
diff --git a/hw/pcnet-pci.c b/hw/pcnet-pci.c
index a94f642..54a849d 100644
--- a/hw/pcnet-pci.c
+++ b/hw/pcnet-pci.c
@@ -279,7 +279,7 @@ static void pci_pcnet_uninit(PCIDevice *dev)
     memory_region_destroy(&d->io_bar);
     qemu_del_timer(d->state.poll_timer);
     qemu_free_timer(d->state.poll_timer);
-    qemu_del_net_client(&d->state.nic->nc);
+    qemu_del_net_client(qemu_get_queue(d->state.nic));
 }
 
 static NetClientInfo net_pci_pcnet_info = {
diff --git a/hw/pcnet.c b/hw/pcnet.c
index 30f1000..2126e22 100644
--- a/hw/pcnet.c
+++ b/hw/pcnet.c
@@ -1261,11 +1261,12 @@ static void pcnet_transmit(PCNetState *s)
                 if (BCR_SWSTYLE(s) == 1)
                     add_crc = !GET_FIELD(tmd.status, TMDS, NOFCS);
                 s->looptest = add_crc ? PCNET_LOOPTEST_CRC : PCNET_LOOPTEST_NOCRC;
-                pcnet_receive(&s->nic->nc, s->buffer, s->xmit_pos);
+                pcnet_receive(qemu_get_queue(s->nic), s->buffer, s->xmit_pos);
                 s->looptest = 0;
             } else
                 if (s->nic)
-                    qemu_send_packet(&s->nic->nc, s->buffer, s->xmit_pos);
+                    qemu_send_packet(qemu_get_queue(s->nic), s->buffer,
+                                     s->xmit_pos);
 
             s->csr[0] &= ~0x0008;   /* clear TDMD */
             s->csr[4] |= 0x0004;    /* set TXSTRT */
@@ -1730,7 +1731,7 @@ int pcnet_common_init(DeviceState *dev, PCNetState *s, NetClientInfo *info)
 
     qemu_macaddr_default_if_unset(&s->conf.macaddr);
     s->nic = qemu_new_nic(info, &s->conf, object_get_typename(OBJECT(dev)), dev->id, s);
-    qemu_format_nic_info_str(&s->nic->nc, s->conf.macaddr.a);
+    qemu_format_nic_info_str(qemu_get_queue(s->nic), s->conf.macaddr.a);
 
     add_boot_device_path(s->conf.bootindex, dev, "/ethernet-phy@0");
 
diff --git a/hw/rtl8139.c b/hw/rtl8139.c
index cfbf3f4..22d24ae 100644
--- a/hw/rtl8139.c
+++ b/hw/rtl8139.c
@@ -1259,7 +1259,7 @@ static void rtl8139_reset(DeviceState *d)
     //s->BasicModeStatus |= 0x0040; /* UTP medium */
     s->BasicModeStatus |= 0x0020; /* autonegotiation completed */
     /* preserve link state */
-    s->BasicModeStatus |= s->nic->nc.link_down ? 0 : 0x04;
+    s->BasicModeStatus |= qemu_get_queue(s->nic)->link_down ? 0 : 0x04;
 
     s->NWayAdvert    = 0x05e1; /* all modes, full duplex */
     s->NWayLPAR      = 0x05e1; /* all modes, full duplex */
@@ -1787,7 +1787,7 @@ static void rtl8139_transfer_frame(RTL8139State *s, uint8_t *buf, int size,
         }
 
         DPRINTF("+++ transmit loopback mode\n");
-        rtl8139_do_receive(&s->nic->nc, buf, size, do_interrupt);
+        rtl8139_do_receive(qemu_get_queue(s->nic), buf, size, do_interrupt);
 
         if (iov) {
             g_free(buf2);
@@ -1796,9 +1796,9 @@ static void rtl8139_transfer_frame(RTL8139State *s, uint8_t *buf, int size,
     else
     {
         if (iov) {
-            qemu_sendv_packet(&s->nic->nc, iov, 3);
+            qemu_sendv_packet(qemu_get_queue(s->nic), iov, 3);
         } else {
-            qemu_send_packet(&s->nic->nc, buf, size);
+            qemu_send_packet(qemu_get_queue(s->nic), buf, size);
         }
     }
 }
@@ -3230,7 +3230,7 @@ static int rtl8139_post_load(void *opaque, int version_id)
 
     /* nc.link_down can't be migrated, so infer link_down according
      * to link status bit in BasicModeStatus */
-    s->nic->nc.link_down = (s->BasicModeStatus & 0x04) == 0;
+    qemu_get_queue(s->nic)->link_down = (s->BasicModeStatus & 0x04) == 0;
 
     return 0;
 }
@@ -3446,7 +3446,7 @@ static void pci_rtl8139_uninit(PCIDevice *dev)
     }
     qemu_del_timer(s->timer);
     qemu_free_timer(s->timer);
-    qemu_del_net_client(&s->nic->nc);
+    qemu_del_net_client(qemu_get_queue(s->nic));
 }
 
 static void rtl8139_set_link_status(NetClientState *nc)
@@ -3503,7 +3503,7 @@ static int pci_rtl8139_init(PCIDevice *dev)
 
     s->nic = qemu_new_nic(&net_rtl8139_info, &s->conf,
                           object_get_typename(OBJECT(dev)), dev->qdev.id, s);
-    qemu_format_nic_info_str(&s->nic->nc, s->conf.macaddr.a);
+    qemu_format_nic_info_str(qemu_get_queue(s->nic), s->conf.macaddr.a);
 
     s->cplus_txbuffer = NULL;
     s->cplus_txbuffer_len = 0;
diff --git a/hw/smc91c111.c b/hw/smc91c111.c
index 36cb4ed..b097d6b 100644
--- a/hw/smc91c111.c
+++ b/hw/smc91c111.c
@@ -237,7 +237,7 @@ static void smc91c111_do_tx(smc91c111_state *s)
             smc91c111_release_packet(s, packetnum);
         else if (s->tx_fifo_done_len < NUM_PACKETS)
             s->tx_fifo_done[s->tx_fifo_done_len++] = packetnum;
-        qemu_send_packet(&s->nic->nc, p, len);
+        qemu_send_packet(qemu_get_queue(s->nic), p, len);
     }
     s->tx_fifo_len = 0;
     smc91c111_update(s);
@@ -753,7 +753,7 @@ static int smc91c111_init1(SysBusDevice *dev)
     qemu_macaddr_default_if_unset(&s->conf.macaddr);
     s->nic = qemu_new_nic(&net_smc91c111_info, &s->conf,
                           object_get_typename(OBJECT(dev)), dev->qdev.id, s);
-    qemu_format_nic_info_str(&s->nic->nc, s->conf.macaddr.a);
+    qemu_format_nic_info_str(qemu_get_queue(s->nic), s->conf.macaddr.a);
     /* ??? Save/restore.  */
     return 0;
 }
diff --git a/hw/spapr_llan.c b/hw/spapr_llan.c
index db34b48..d53d4ae 100644
--- a/hw/spapr_llan.c
+++ b/hw/spapr_llan.c
@@ -199,7 +199,7 @@ static int spapr_vlan_init(VIOsPAPRDevice *sdev)
 
     dev->nic = qemu_new_nic(&net_spapr_vlan_info, &dev->nicconf,
                             object_get_typename(OBJECT(sdev)), sdev->qdev.id, dev);
-    qemu_format_nic_info_str(&dev->nic->nc, dev->nicconf.macaddr.a);
+    qemu_format_nic_info_str(qemu_get_queue(dev->nic), dev->nicconf.macaddr.a);
 
     return 0;
 }
@@ -462,7 +462,7 @@ static target_ulong h_send_logical_lan(PowerPCCPU *cpu, sPAPREnvironment *spapr,
         p += VLAN_BD_LEN(bufs[i]);
     }
 
-    qemu_send_packet(&dev->nic->nc, lbuf, total_len);
+    qemu_send_packet(qemu_get_queue(dev->nic), lbuf, total_len);
 
     return H_SUCCESS;
 }
diff --git a/hw/stellaris_enet.c b/hw/stellaris_enet.c
index 5e9053f..99d4730 100644
--- a/hw/stellaris_enet.c
+++ b/hw/stellaris_enet.c
@@ -259,7 +259,8 @@ static void stellaris_enet_write(void *opaque, hwaddr offset,
                     memset(&s->tx_fifo[s->tx_frame_len], 0, 60 - s->tx_frame_len);
                     s->tx_fifo_len = 60;
                 }
-                qemu_send_packet(&s->nic->nc, s->tx_fifo, s->tx_frame_len);
+                qemu_send_packet(qemu_get_queue(s->nic), s->tx_fifo,
+                                 s->tx_frame_len);
                 s->tx_frame_len = -1;
                 s->ris |= SE_INT_TXEMP;
                 stellaris_enet_update(s);
@@ -412,7 +413,7 @@ static int stellaris_enet_init(SysBusDevice *dev)
 
     s->nic = qemu_new_nic(&net_stellaris_enet_info, &s->conf,
                           object_get_typename(OBJECT(dev)), dev->qdev.id, s);
-    qemu_format_nic_info_str(&s->nic->nc, s->conf.macaddr.a);
+    qemu_format_nic_info_str(qemu_get_queue(s->nic), s->conf.macaddr.a);
 
     stellaris_enet_reset(s);
     register_savevm(&s->busdev.qdev, "stellaris_enet", -1, 1,
diff --git a/hw/usb/dev-network.c b/hw/usb/dev-network.c
index 9dede4c..a131f9c 100644
--- a/hw/usb/dev-network.c
+++ b/hw/usb/dev-network.c
@@ -1012,7 +1012,7 @@ static int rndis_keepalive_response(USBNetState *s,
 static void usb_net_reset_in_buf(USBNetState *s)
 {
     s->in_ptr = s->in_len = 0;
-    qemu_flush_queued_packets(&s->nic->nc);
+    qemu_flush_queued_packets(qemu_get_queue(s->nic));
 }
 
 static int rndis_parse(USBNetState *s, uint8_t *data, int length)
@@ -1196,7 +1196,7 @@ static void usb_net_handle_dataout(USBNetState *s, USBPacket *p)
 
     if (!is_rndis(s)) {
         if (p->iov.size < 64) {
-            qemu_send_packet(&s->nic->nc, s->out_buf, s->out_ptr);
+            qemu_send_packet(qemu_get_queue(s->nic), s->out_buf, s->out_ptr);
             s->out_ptr = 0;
         }
         return;
@@ -1209,7 +1209,7 @@ static void usb_net_handle_dataout(USBNetState *s, USBPacket *p)
         uint32_t offs = 8 + le32_to_cpu(msg->DataOffset);
         uint32_t size = le32_to_cpu(msg->DataLength);
         if (offs + size <= len)
-            qemu_send_packet(&s->nic->nc, s->out_buf + offs, size);
+            qemu_send_packet(qemu_get_queue(s->nic), s->out_buf + offs, size);
     }
     s->out_ptr -= len;
     memmove(s->out_buf, &s->out_buf[len], s->out_ptr);
@@ -1330,7 +1330,7 @@ static void usb_net_handle_destroy(USBDevice *dev)
 
     /* TODO: remove the nd_table[] entry */
     rndis_clear_responsequeue(s);
-    qemu_del_net_client(&s->nic->nc);
+    qemu_del_net_client(qemu_get_queue(s->nic));
 }
 
 static NetClientInfo net_usbnet_info = {
@@ -1361,7 +1361,7 @@ static int usb_net_initfn(USBDevice *dev)
     qemu_macaddr_default_if_unset(&s->conf.macaddr);
     s->nic = qemu_new_nic(&net_usbnet_info, &s->conf,
                           object_get_typename(OBJECT(s)), s->dev.qdev.id, s);
-    qemu_format_nic_info_str(&s->nic->nc, s->conf.macaddr.a);
+    qemu_format_nic_info_str(qemu_get_queue(s->nic), s->conf.macaddr.a);
     snprintf(s->usbstring_mac, sizeof(s->usbstring_mac),
              "%02x%02x%02x%02x%02x%02x",
              0x40,
diff --git a/hw/virtio-net.c b/hw/virtio-net.c
index 3bb01b1..551f6dc 100644
--- a/hw/virtio-net.c
+++ b/hw/virtio-net.c
@@ -95,7 +95,7 @@ static void virtio_net_set_config(VirtIODevice *vdev, const uint8_t *config)
 
     if (memcmp(netcfg.mac, n->mac, ETH_ALEN)) {
         memcpy(n->mac, netcfg.mac, ETH_ALEN);
-        qemu_format_nic_info_str(&n->nic->nc, n->mac);
+        qemu_format_nic_info_str(qemu_get_queue(n->nic), n->mac);
     }
 }
 
@@ -107,34 +107,36 @@ static bool virtio_net_started(VirtIONet *n, uint8_t status)
 
 static void virtio_net_vhost_status(VirtIONet *n, uint8_t status)
 {
-    if (!n->nic->nc.peer) {
+    NetClientState *nc = qemu_get_queue(n->nic);
+
+    if (!nc->peer) {
         return;
     }
-    if (n->nic->nc.peer->info->type != NET_CLIENT_OPTIONS_KIND_TAP) {
+    if (nc->peer->info->type != NET_CLIENT_OPTIONS_KIND_TAP) {
         return;
     }
 
-    if (!tap_get_vhost_net(n->nic->nc.peer)) {
+    if (!tap_get_vhost_net(nc->peer)) {
         return;
     }
     if (!!n->vhost_started == virtio_net_started(n, status) &&
-                              !n->nic->nc.peer->link_down) {
+                              !nc->peer->link_down) {
         return;
     }
     if (!n->vhost_started) {
         int r;
-        if (!vhost_net_query(tap_get_vhost_net(n->nic->nc.peer), &n->vdev)) {
+        if (!vhost_net_query(tap_get_vhost_net(nc->peer), &n->vdev)) {
             return;
         }
         n->vhost_started = 1;
-        r = vhost_net_start(tap_get_vhost_net(n->nic->nc.peer), &n->vdev);
+        r = vhost_net_start(tap_get_vhost_net(nc->peer), &n->vdev);
         if (r < 0) {
             error_report("unable to start vhost net: %d: "
                          "falling back on userspace virtio", -r);
             n->vhost_started = 0;
         }
     } else {
-        vhost_net_stop(tap_get_vhost_net(n->nic->nc.peer), &n->vdev);
+        vhost_net_stop(tap_get_vhost_net(nc->peer), &n->vdev);
         n->vhost_started = 0;
     }
 }
@@ -204,13 +206,16 @@ static void virtio_net_reset(VirtIODevice *vdev)
 
 static void peer_test_vnet_hdr(VirtIONet *n)
 {
-    if (!n->nic->nc.peer)
+    NetClientState *nc = qemu_get_queue(n->nic);
+    if (!nc->peer) {
         return;
+    }
 
-    if (n->nic->nc.peer->info->type != NET_CLIENT_OPTIONS_KIND_TAP)
+    if (nc->peer->info->type != NET_CLIENT_OPTIONS_KIND_TAP) {
         return;
+    }
 
-    n->has_vnet_hdr = tap_has_vnet_hdr(n->nic->nc.peer);
+    n->has_vnet_hdr = tap_has_vnet_hdr(nc->peer);
 }
 
 static int peer_has_vnet_hdr(VirtIONet *n)
@@ -223,7 +228,7 @@ static int peer_has_ufo(VirtIONet *n)
     if (!peer_has_vnet_hdr(n))
         return 0;
 
-    n->has_ufo = tap_has_ufo(n->nic->nc.peer);
+    n->has_ufo = tap_has_ufo(qemu_get_queue(n->nic)->peer);
 
     return n->has_ufo;
 }
@@ -236,8 +241,8 @@ static void virtio_net_set_mrg_rx_bufs(VirtIONet *n, int mergeable_rx_bufs)
         sizeof(struct virtio_net_hdr_mrg_rxbuf) : sizeof(struct virtio_net_hdr);
 
     if (peer_has_vnet_hdr(n) &&
-        tap_has_vnet_hdr_len(n->nic->nc.peer, n->guest_hdr_len)) {
-        tap_set_vnet_hdr_len(n->nic->nc.peer, n->guest_hdr_len);
+        tap_has_vnet_hdr_len(qemu_get_queue(n->nic)->peer, n->guest_hdr_len)) {
+        tap_set_vnet_hdr_len(qemu_get_queue(n->nic)->peer, n->guest_hdr_len);
         n->host_hdr_len = n->guest_hdr_len;
     }
 }
@@ -245,6 +250,7 @@ static void virtio_net_set_mrg_rx_bufs(VirtIONet *n, int mergeable_rx_bufs)
 static uint32_t virtio_net_get_features(VirtIODevice *vdev, uint32_t features)
 {
     VirtIONet *n = to_virtio_net(vdev);
+    NetClientState *nc = qemu_get_queue(n->nic);
 
     features |= (1 << VIRTIO_NET_F_MAC);
 
@@ -265,14 +271,13 @@ static uint32_t virtio_net_get_features(VirtIODevice *vdev, uint32_t features)
         features &= ~(0x1 << VIRTIO_NET_F_HOST_UFO);
     }
 
-    if (!n->nic->nc.peer ||
-        n->nic->nc.peer->info->type != NET_CLIENT_OPTIONS_KIND_TAP) {
+    if (!nc->peer || nc->peer->info->type != NET_CLIENT_OPTIONS_KIND_TAP) {
         return features;
     }
-    if (!tap_get_vhost_net(n->nic->nc.peer)) {
+    if (!tap_get_vhost_net(nc->peer)) {
         return features;
     }
-    return vhost_net_get_features(tap_get_vhost_net(n->nic->nc.peer), features);
+    return vhost_net_get_features(tap_get_vhost_net(nc->peer), features);
 }
 
 static uint32_t virtio_net_bad_features(VirtIODevice *vdev)
@@ -293,25 +298,25 @@ static uint32_t virtio_net_bad_features(VirtIODevice *vdev)
 static void virtio_net_set_features(VirtIODevice *vdev, uint32_t features)
 {
     VirtIONet *n = to_virtio_net(vdev);
+    NetClientState *nc = qemu_get_queue(n->nic);
 
     virtio_net_set_mrg_rx_bufs(n, !!(features & (1 << VIRTIO_NET_F_MRG_RXBUF)));
 
     if (n->has_vnet_hdr) {
-        tap_set_offload(n->nic->nc.peer,
+        tap_set_offload(nc->peer,
                         (features >> VIRTIO_NET_F_GUEST_CSUM) & 1,
                         (features >> VIRTIO_NET_F_GUEST_TSO4) & 1,
                         (features >> VIRTIO_NET_F_GUEST_TSO6) & 1,
                         (features >> VIRTIO_NET_F_GUEST_ECN)  & 1,
                         (features >> VIRTIO_NET_F_GUEST_UFO)  & 1);
     }
-    if (!n->nic->nc.peer ||
-        n->nic->nc.peer->info->type != NET_CLIENT_OPTIONS_KIND_TAP) {
+    if (!nc->peer || nc->peer->info->type != NET_CLIENT_OPTIONS_KIND_TAP) {
         return;
     }
-    if (!tap_get_vhost_net(n->nic->nc.peer)) {
+    if (!tap_get_vhost_net(nc->peer)) {
         return;
     }
-    vhost_net_ack_features(tap_get_vhost_net(n->nic->nc.peer), features);
+    vhost_net_ack_features(tap_get_vhost_net(nc->peer), features);
 }
 
 static int virtio_net_handle_rx_mode(VirtIONet *n, uint8_t cmd,
@@ -463,7 +468,7 @@ static void virtio_net_handle_rx(VirtIODevice *vdev, VirtQueue *vq)
 {
     VirtIONet *n = to_virtio_net(vdev);
 
-    qemu_flush_queued_packets(&n->nic->nc);
+    qemu_flush_queued_packets(qemu_get_queue(n->nic));
 }
 
 static int virtio_net_can_receive(NetClientState *nc)
@@ -605,8 +610,9 @@ static ssize_t virtio_net_receive(NetClientState *nc, const uint8_t *buf, size_t
     unsigned mhdr_cnt = 0;
     size_t offset, i, guest_offset;
 
-    if (!virtio_net_can_receive(&n->nic->nc))
+    if (!virtio_net_can_receive(qemu_get_queue(n->nic))) {
         return -1;
+    }
 
     /* hdr_len refers to the header we supply to the guest */
     if (!virtio_net_has_buffers(n, size + n->guest_hdr_len - n->host_hdr_len))
@@ -754,7 +760,7 @@ static int32_t virtio_net_flush_tx(VirtIONet *n, VirtQueue *vq)
 
         len = n->guest_hdr_len;
 
-        ret = qemu_sendv_packet_async(&n->nic->nc, out_sg, out_num,
+        ret = qemu_sendv_packet_async(qemu_get_queue(n->nic), out_sg, out_num,
                                       virtio_net_tx_complete);
         if (ret == 0) {
             virtio_queue_set_notification(n->tx_vq, 0);
@@ -951,7 +957,7 @@ static int virtio_net_load(QEMUFile *f, void *opaque, int version_id)
         }
 
         if (n->has_vnet_hdr) {
-            tap_set_offload(n->nic->nc.peer,
+            tap_set_offload(qemu_get_queue(n->nic)->peer,
                     (n->vdev.guest_features >> VIRTIO_NET_F_GUEST_CSUM) & 1,
                     (n->vdev.guest_features >> VIRTIO_NET_F_GUEST_TSO4) & 1,
                     (n->vdev.guest_features >> VIRTIO_NET_F_GUEST_TSO6) & 1,
@@ -989,7 +995,7 @@ static int virtio_net_load(QEMUFile *f, void *opaque, int version_id)
 
     /* nc.link_down can't be migrated, so infer link_down according
      * to link status bit in n->status */
-    n->nic->nc.link_down = (n->status & VIRTIO_NET_S_LINK_UP) == 0;
+    qemu_get_queue(n->nic)->link_down = (n->status & VIRTIO_NET_S_LINK_UP) == 0;
 
     return 0;
 }
@@ -1013,16 +1019,18 @@ static NetClientInfo net_virtio_info = {
 static bool virtio_net_guest_notifier_pending(VirtIODevice *vdev, int idx)
 {
     VirtIONet *n = to_virtio_net(vdev);
+    NetClientState *nc = qemu_get_queue(n->nic);
     assert(n->vhost_started);
-    return vhost_net_virtqueue_pending(tap_get_vhost_net(n->nic->nc.peer), idx);
+    return vhost_net_virtqueue_pending(tap_get_vhost_net(nc->peer), idx);
 }
 
 static void virtio_net_guest_notifier_mask(VirtIODevice *vdev, int idx,
                                            bool mask)
 {
     VirtIONet *n = to_virtio_net(vdev);
+    NetClientState *nc = qemu_get_queue(n->nic);
     assert(n->vhost_started);
-    vhost_net_virtqueue_mask(tap_get_vhost_net(n->nic->nc.peer),
+    vhost_net_virtqueue_mask(tap_get_vhost_net(nc->peer),
                              vdev, idx, mask);
 }
 
@@ -1069,13 +1077,13 @@ VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf,
     n->nic = qemu_new_nic(&net_virtio_info, conf, object_get_typename(OBJECT(dev)), dev->id, n);
     peer_test_vnet_hdr(n);
     if (peer_has_vnet_hdr(n)) {
-        tap_using_vnet_hdr(n->nic->nc.peer, 1);
+        tap_using_vnet_hdr(qemu_get_queue(n->nic)->peer, 1);
         n->host_hdr_len = sizeof(struct virtio_net_hdr);
     } else {
         n->host_hdr_len = 0;
     }
 
-    qemu_format_nic_info_str(&n->nic->nc, conf->macaddr.a);
+    qemu_format_nic_info_str(qemu_get_queue(n->nic), conf->macaddr.a);
 
     n->tx_waiting = 0;
     n->tx_burst = net->txburst;
@@ -1102,7 +1110,7 @@ void virtio_net_exit(VirtIODevice *vdev)
     /* This will stop vhost backend if appropriate. */
     virtio_net_set_status(vdev, 0);
 
-    qemu_purge_queued_packets(&n->nic->nc);
+    qemu_purge_queued_packets(qemu_get_queue(n->nic));
 
     unregister_savevm(n->qdev, "virtio-net", n);
 
@@ -1116,6 +1124,6 @@ void virtio_net_exit(VirtIODevice *vdev)
         qemu_bh_delete(n->tx_bh);
     }
 
-    qemu_del_net_client(&n->nic->nc);
+    qemu_del_net_client(qemu_get_queue(n->nic));
     virtio_cleanup(&n->vdev);
 }
diff --git a/hw/xen_nic.c b/hw/xen_nic.c
index dc12110..d5b39ea 100644
--- a/hw/xen_nic.c
+++ b/hw/xen_nic.c
@@ -185,9 +185,11 @@ static void net_tx_packets(struct XenNetDev *netdev)
                 }
                 memcpy(tmpbuf, page + txreq.offset, txreq.size);
                 net_checksum_calculate(tmpbuf, txreq.size);
-                qemu_send_packet(&netdev->nic->nc, tmpbuf, txreq.size);
+                qemu_send_packet(qemu_get_queue(netdev->nic), tmpbuf,
+                                 txreq.size);
             } else {
-                qemu_send_packet(&netdev->nic->nc, page + txreq.offset, txreq.size);
+                qemu_send_packet(qemu_get_queue(netdev->nic),
+                                 page + txreq.offset, txreq.size);
             }
             xc_gnttab_munmap(netdev->xendev.gnttabdev, page, 1);
             net_tx_response(netdev, &txreq, NETIF_RSP_OKAY);
@@ -329,7 +331,8 @@ static int net_init(struct XenDevice *xendev)
     netdev->nic = qemu_new_nic(&net_xen_info, &netdev->conf,
                                "xen", NULL, netdev);
 
-    snprintf(netdev->nic->nc.info_str, sizeof(netdev->nic->nc.info_str),
+    snprintf(qemu_get_queue(netdev->nic)->info_str,
+             sizeof(qemu_get_queue(netdev->nic)->info_str),
              "nic: xenbus vif macaddr=%s", netdev->mac);
 
     /* fill info */
@@ -405,7 +408,7 @@ static void net_disconnect(struct XenDevice *xendev)
         netdev->rxs = NULL;
     }
     if (netdev->nic) {
-        qemu_del_net_client(&netdev->nic->nc);
+        qemu_del_net_client(qemu_get_queue(netdev->nic));
         netdev->nic = NULL;
     }
 }
@@ -414,7 +417,7 @@ static void net_event(struct XenDevice *xendev)
 {
     struct XenNetDev *netdev = container_of(xendev, struct XenNetDev, xendev);
     net_tx_packets(netdev);
-    qemu_flush_queued_packets(&netdev->nic->nc);
+    qemu_flush_queued_packets(qemu_get_queue(netdev->nic));
 }
 
 static int net_free(struct XenDevice *xendev)
diff --git a/hw/xgmac.c b/hw/xgmac.c
index 00dae77..4d7bb13 100644
--- a/hw/xgmac.c
+++ b/hw/xgmac.c
@@ -235,7 +235,7 @@ static void xgmac_enet_send(struct XgmacState *s)
         frame_size += len;
         if (bd.ctl_stat & 0x20000000) {
             /* Last buffer in frame.  */
-            qemu_send_packet(&s->nic->nc, frame, len);
+            qemu_send_packet(qemu_get_queue(s->nic), frame, len);
             ptr = frame;
             frame_size = 0;
             s->regs[DMA_STATUS] |= DMA_STATUS_TI | DMA_STATUS_NIS;
@@ -391,7 +391,7 @@ static int xgmac_enet_init(SysBusDevice *dev)
     qemu_macaddr_default_if_unset(&s->conf.macaddr);
     s->nic = qemu_new_nic(&net_xgmac_enet_info, &s->conf,
                           object_get_typename(OBJECT(dev)), dev->qdev.id, s);
-    qemu_format_nic_info_str(&s->nic->nc, s->conf.macaddr.a);
+    qemu_format_nic_info_str(qemu_get_queue(s->nic), s->conf.macaddr.a);
 
     s->regs[XGMAC_ADDR_HIGH(0)] = (s->conf.macaddr.a[5] << 8) |
                                    s->conf.macaddr.a[4];
diff --git a/hw/xilinx_axienet.c b/hw/xilinx_axienet.c
index 51c2896..a7e8e2c 100644
--- a/hw/xilinx_axienet.c
+++ b/hw/xilinx_axienet.c
@@ -826,7 +826,7 @@ axienet_stream_push(StreamSlave *obj, uint8_t *buf, size_t size, uint32_t *hdr)
         buf[write_off + 1] = csum & 0xff;
     }
 
-    qemu_send_packet(&s->nic->nc, buf, size);
+    qemu_send_packet(qemu_get_queue(s->nic), buf, size);
 
     s->stats.tx_bytes += size;
     s->regs[R_IS] |= IS_TX_COMPLETE;
@@ -853,7 +853,7 @@ static int xilinx_enet_init(SysBusDevice *dev)
     qemu_macaddr_default_if_unset(&s->conf.macaddr);
     s->nic = qemu_new_nic(&net_xilinx_enet_info, &s->conf,
                           object_get_typename(OBJECT(dev)), dev->qdev.id, s);
-    qemu_format_nic_info_str(&s->nic->nc, s->conf.macaddr.a);
+    qemu_format_nic_info_str(qemu_get_queue(s->nic), s->conf.macaddr.a);
 
     tdk_init(&s->TEMAC.phy);
     mdio_attach(&s->TEMAC.mdio_bus, &s->TEMAC.phy, s->c_phyaddr);
diff --git a/hw/xilinx_ethlite.c b/hw/xilinx_ethlite.c
index 2254851..5ab3915 100644
--- a/hw/xilinx_ethlite.c
+++ b/hw/xilinx_ethlite.c
@@ -117,7 +117,7 @@ eth_write(void *opaque, hwaddr addr,
 
             D(qemu_log("%s addr=%x val=%x\n", __func__, addr * 4, value));
             if ((value & (CTRL_P | CTRL_S)) == CTRL_S) {
-                qemu_send_packet(&s->nic->nc,
+                qemu_send_packet(qemu_get_queue(s->nic),
                                  (void *) &s->regs[base],
                                  s->regs[base + R_TX_LEN0]);
                 D(qemu_log("eth_tx %d\n", s->regs[base + R_TX_LEN0]));
@@ -223,7 +223,7 @@ static int xilinx_ethlite_init(SysBusDevice *dev)
     qemu_macaddr_default_if_unset(&s->conf.macaddr);
     s->nic = qemu_new_nic(&net_xilinx_ethlite_info, &s->conf,
                           object_get_typename(OBJECT(dev)), dev->qdev.id, s);
-    qemu_format_nic_info_str(&s->nic->nc, s->conf.macaddr.a);
+    qemu_format_nic_info_str(qemu_get_queue(s->nic), s->conf.macaddr.a);
     return 0;
 }
 
diff --git a/include/net/net.h b/include/net/net.h
index 4a92b6c..5d8aecf 100644
--- a/include/net/net.h
+++ b/include/net/net.h
@@ -77,6 +77,7 @@ NICState *qemu_new_nic(NetClientInfo *info,
                        const char *model,
                        const char *name,
                        void *opaque);
+NetClientState *qemu_get_queue(NICState *nic);
 void qemu_del_net_client(NetClientState *nc);
 NetClientState *qemu_find_vlan_client_by_name(Monitor *mon, int vlan_id,
                                               const char *client_str);
diff --git a/net/net.c b/net/net.c
index cdd9b04..e9a0d15 100644
--- a/net/net.c
+++ b/net/net.c
@@ -234,6 +234,11 @@ NICState *qemu_new_nic(NetClientInfo *info,
     return nic;
 }
 
+NetClientState *qemu_get_queue(NICState *nic)
+{
+    return &nic->nc;
+}
+
 static void qemu_cleanup_net_client(NetClientState *nc)
 {
     QTAILQ_REMOVE(&net_clients, nc, next);
diff --git a/savevm.c b/savevm.c
index 304d1ef..749b57e 100644
--- a/savevm.c
+++ b/savevm.c
@@ -81,7 +81,7 @@ static void qemu_announce_self_iter(NICState *nic, void *opaque)
 
     len = announce_self_create(buf, nic->conf->macaddr.a);
 
-    qemu_send_packet_raw(&nic->nc, buf, len);
+    qemu_send_packet_raw(qemu_get_queue(nic), buf, len);
 }
 
 
-- 
1.7.1
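
Note on the pattern used throughout this patch: every direct use of
&nic->nc is routed through qemu_get_queue(), so device code no longer
depends on how NICState stores its NetClientState. The following is a
minimal, self-contained sketch of that idea. The two struct definitions
are cut-down, hypothetical stand-ins for QEMU's real types, and
demo_link_is_up() is an invented caller; only the body of
qemu_get_queue() is taken from the net/net.c hunk above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-ins for QEMU's types (illustration only). */
typedef struct NetClientState {
    bool link_down;
} NetClientState;

typedef struct NICState {
    NetClientState nc;   /* single embedded queue at this point in the series */
    void *opaque;
} NICState;

/* Same body as the helper added to net/net.c by this patch. */
static NetClientState *qemu_get_queue(NICState *nic)
{
    return &nic->nc;
}

/* Hypothetical device code: reads link state through the accessor only. */
static bool demo_link_is_up(NICState *nic)
{
    return !qemu_get_queue(nic)->link_down;
}
```

Because callers such as demo_link_is_up() never name the nc field, a
later change to the storage inside NICState (the cover letter mentions
an array of NetClientStates) would only need to touch the accessor.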



* [PATCH V2 02/20] net: introduce qemu_get_nic()
  2013-01-25 10:35 [PATCH V2 00/20] Multiqueue virtio-net Jason Wang
  2013-01-25 10:35 ` [PATCH V2 01/20] net: introduce qemu_get_queue() Jason Wang
@ 2013-01-25 10:35 ` Jason Wang
  2013-01-25 10:35 ` [PATCH V2 03/20] net: intorduce qemu_del_nic() Jason Wang
                   ` (18 subsequent siblings)
  20 siblings, 0 replies; 41+ messages in thread
From: Jason Wang @ 2013-01-25 10:35 UTC (permalink / raw)
  To: mst, qemu-devel, aliguori, shajnocz
  Cc: krkumar2, kvm, mprivozn, rusty, jwhan, shiyer, gaowanlong, Jason Wang

To support multiqueue, this patch introduces a helper, qemu_get_nic(), to get
the NICState from a NetClientState, along with qemu_get_nic_opaque() to fetch
the NIC's opaque pointer directly. Later patches will refactor these helpers
to support multiqueue.

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
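Illustration only, not part of the commit: the net/net.c hunk that
defines the new helpers is not shown in this excerpt, so the sketch
below assumes they simply wrap the DO_UPCAST(NICState, nc, nc)
expression that the device hunks replace. The struct definitions are
cut-down, hypothetical stand-ins, and DEMO_UPCAST is an invented
offsetof-based equivalent written for this example rather than QEMU's
own macro.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-ins for QEMU's types (illustration only). */
typedef struct NetClientState {
    int link_down;
} NetClientState;

typedef struct NICState {
    NetClientState nc;   /* embedded client state */
    void *opaque;        /* device-private state, e.g. an E1000State */
} NICState;

/* Invented for this sketch: recover the containing struct from a member. */
#define DEMO_UPCAST(type, field, ptr) \
    ((type *)((char *)(ptr) - offsetof(type, field)))

/* Assumed shape of the helper: map a NetClientState back to its NICState. */
static NICState *qemu_get_nic(NetClientState *nc)
{
    return DEMO_UPCAST(NICState, nc, nc);
}

/* Assumed shape of the companion helper used by the device callbacks. */
static void *qemu_get_nic_opaque(NetClientState *nc)
{
    return qemu_get_nic(nc)->opaque;
}
```

With helpers of this shape, a callback such as e1000_can_receive() can
obtain its device state from nc without spelling out how a
NetClientState is embedded in NICState, which is what allows that
embedding to change later in the series.
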
 hw/cadence_gem.c        |    8 ++++----
 hw/dp8393x.c            |    6 +++---
 hw/e1000.c              |    8 ++++----
 hw/eepro100.c           |    6 +++---
 hw/etraxfs_eth.c        |    6 +++---
 hw/lan9118.c            |    6 +++---
 hw/lance.c              |    2 +-
 hw/mcf_fec.c            |    6 +++---
 hw/milkymist-minimac2.c |    6 +++---
 hw/mipsnet.c            |    6 +++---
 hw/musicpal.c           |    4 ++--
 hw/ne2000-isa.c         |    2 +-
 hw/ne2000.c             |    6 +++---
 hw/opencores_eth.c      |    6 +++---
 hw/pcnet-pci.c          |    2 +-
 hw/pcnet.c              |    6 +++---
 hw/rtl8139.c            |    8 ++++----
 hw/smc91c111.c          |    6 +++---
 hw/spapr_llan.c         |    4 ++--
 hw/stellaris_enet.c     |    6 +++---
 hw/usb/dev-network.c    |    6 +++---
 hw/virtio-net.c         |   10 +++++-----
 hw/xen_nic.c            |    4 ++--
 hw/xgmac.c              |    6 +++---
 hw/xilinx_axienet.c     |    6 +++---
 hw/xilinx_ethlite.c     |    6 +++---
 include/net/net.h       |    2 ++
 net/net.c               |   20 ++++++++++++++++----
 28 files changed, 92 insertions(+), 78 deletions(-)

diff --git a/hw/cadence_gem.c b/hw/cadence_gem.c
index 9de688f..ab35329 100644
--- a/hw/cadence_gem.c
+++ b/hw/cadence_gem.c
@@ -409,7 +409,7 @@ static int gem_can_receive(NetClientState *nc)
 {
     GemState *s;
 
-    s = DO_UPCAST(NICState, nc, nc)->opaque;
+    s = qemu_get_nic_opaque(nc);
 
     DB_PRINT("\n");
 
@@ -612,7 +612,7 @@ static ssize_t gem_receive(NetClientState *nc, const uint8_t *buf, size_t size)
     uint8_t    rxbuf[2048];
     uint8_t   *rxbuf_ptr;
 
-    s = DO_UPCAST(NICState, nc, nc)->opaque;
+    s = qemu_get_nic_opaque(nc);
 
     /* Do nothing if receive is not enabled. */
     if (!(s->regs[GEM_NWCTRL] & GEM_NWCTRL_RXENA)) {
@@ -1149,7 +1149,7 @@ static const MemoryRegionOps gem_ops = {
 
 static void gem_cleanup(NetClientState *nc)
 {
-    GemState *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    GemState *s = qemu_get_nic_opaque(nc);
 
     DB_PRINT("\n");
     s->nic = NULL;
@@ -1158,7 +1158,7 @@ static void gem_cleanup(NetClientState *nc)
 static void gem_set_link(NetClientState *nc)
 {
     DB_PRINT("\n");
-    phy_update_link(DO_UPCAST(NICState, nc, nc)->opaque);
+    phy_update_link(qemu_get_nic_opaque(nc));
 }
 
 static NetClientInfo net_gem_info = {
diff --git a/hw/dp8393x.c b/hw/dp8393x.c
index c2d0bc8..0273fad 100644
--- a/hw/dp8393x.c
+++ b/hw/dp8393x.c
@@ -676,7 +676,7 @@ static const MemoryRegionOps dp8393x_ops = {
 
 static int nic_can_receive(NetClientState *nc)
 {
-    dp8393xState *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    dp8393xState *s = qemu_get_nic_opaque(nc);
 
     if (!(s->regs[SONIC_CR] & SONIC_CR_RXEN))
         return 0;
@@ -725,7 +725,7 @@ static int receive_filter(dp8393xState *s, const uint8_t * buf, int size)
 
 static ssize_t nic_receive(NetClientState *nc, const uint8_t * buf, size_t size)
 {
-    dp8393xState *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    dp8393xState *s = qemu_get_nic_opaque(nc);
     uint16_t data[10];
     int packet_type;
     uint32_t available, address;
@@ -861,7 +861,7 @@ static void nic_reset(void *opaque)
 
 static void nic_cleanup(NetClientState *nc)
 {
-    dp8393xState *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    dp8393xState *s = qemu_get_nic_opaque(nc);
 
     memory_region_del_subregion(s->address_space, &s->mmio);
     memory_region_destroy(&s->mmio);
diff --git a/hw/e1000.c b/hw/e1000.c
index 7b310d7..36f4051 100644
--- a/hw/e1000.c
+++ b/hw/e1000.c
@@ -743,7 +743,7 @@ receive_filter(E1000State *s, const uint8_t *buf, int size)
 static void
 e1000_set_link_status(NetClientState *nc)
 {
-    E1000State *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    E1000State *s = qemu_get_nic_opaque(nc);
     uint32_t old_status = s->mac_reg[STATUS];
 
     if (nc->link_down) {
@@ -777,7 +777,7 @@ static bool e1000_has_rxbufs(E1000State *s, size_t total_size)
 static int
 e1000_can_receive(NetClientState *nc)
 {
-    E1000State *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    E1000State *s = qemu_get_nic_opaque(nc);
 
     return (s->mac_reg[RCTL] & E1000_RCTL_EN) && e1000_has_rxbufs(s, 1);
 }
@@ -793,7 +793,7 @@ static uint64_t rx_desc_base(E1000State *s)
 static ssize_t
 e1000_receive(NetClientState *nc, const uint8_t *buf, size_t size)
 {
-    E1000State *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    E1000State *s = qemu_get_nic_opaque(nc);
     struct e1000_rx_desc desc;
     dma_addr_t base;
     unsigned int n, rdt;
@@ -1230,7 +1230,7 @@ e1000_mmio_setup(E1000State *d)
 static void
 e1000_cleanup(NetClientState *nc)
 {
-    E1000State *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    E1000State *s = qemu_get_nic_opaque(nc);
 
     s->nic = NULL;
 }
diff --git a/hw/eepro100.c b/hw/eepro100.c
index 5b77bdc..f9856ae 100644
--- a/hw/eepro100.c
+++ b/hw/eepro100.c
@@ -1619,7 +1619,7 @@ static const MemoryRegionOps eepro100_ops = {
 
 static int nic_can_receive(NetClientState *nc)
 {
-    EEPRO100State *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    EEPRO100State *s = qemu_get_nic_opaque(nc);
     TRACE(RXTX, logout("%p\n", s));
     return get_ru_state(s) == ru_ready;
 #if 0
@@ -1633,7 +1633,7 @@ static ssize_t nic_receive(NetClientState *nc, const uint8_t * buf, size_t size)
      * - Magic packets should set bit 30 in power management driver register.
      * - Interesting packets should set bit 29 in power management driver register.
      */
-    EEPRO100State *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    EEPRO100State *s = qemu_get_nic_opaque(nc);
     uint16_t rfd_status = 0xa000;
 #if defined(CONFIG_PAD_RECEIVED_FRAMES)
     uint8_t min_buf[60];
@@ -1835,7 +1835,7 @@ static const VMStateDescription vmstate_eepro100 = {
 
 static void nic_cleanup(NetClientState *nc)
 {
-    EEPRO100State *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    EEPRO100State *s = qemu_get_nic_opaque(nc);
 
     s->nic = NULL;
 }
diff --git a/hw/etraxfs_eth.c b/hw/etraxfs_eth.c
index 9df476a..d426311 100644
--- a/hw/etraxfs_eth.c
+++ b/hw/etraxfs_eth.c
@@ -515,7 +515,7 @@ static int eth_can_receive(NetClientState *nc)
 static ssize_t eth_receive(NetClientState *nc, const uint8_t *buf, size_t size)
 {
 	unsigned char sa_bcast[6] = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff };
-	struct fs_eth *eth = DO_UPCAST(NICState, nc, nc)->opaque;
+        struct fs_eth *eth = qemu_get_nic_opaque(nc);
 	int use_ma0 = eth->regs[RW_REC_CTRL] & 1;
 	int use_ma1 = eth->regs[RW_REC_CTRL] & 2;
 	int r_bcast = eth->regs[RW_REC_CTRL] & 8;
@@ -551,7 +551,7 @@ static int eth_tx_push(void *opaque, unsigned char *buf, int len, bool eop)
 
 static void eth_set_link(NetClientState *nc)
 {
-	struct fs_eth *eth = DO_UPCAST(NICState, nc, nc)->opaque;
+    struct fs_eth *eth = qemu_get_nic_opaque(nc);
 	D(printf("%s %d\n", __func__, nc->link_down));
 	eth->phy.link = !nc->link_down;
 }
@@ -568,7 +568,7 @@ static const MemoryRegionOps eth_ops = {
 
 static void eth_cleanup(NetClientState *nc)
 {
-	struct fs_eth *eth = DO_UPCAST(NICState, nc, nc)->opaque;
+    struct fs_eth *eth = qemu_get_nic_opaque(nc);
 
 	/* Disconnect the client.  */
 	eth->dma_out->client.push = NULL;
diff --git a/hw/lan9118.c b/hw/lan9118.c
index 262f389..0e844e5 100644
--- a/hw/lan9118.c
+++ b/hw/lan9118.c
@@ -386,7 +386,7 @@ static void phy_update_link(lan9118_state *s)
 
 static void lan9118_set_link(NetClientState *nc)
 {
-    phy_update_link(DO_UPCAST(NICState, nc, nc)->opaque);
+    phy_update_link(qemu_get_nic_opaque(nc));
 }
 
 static void phy_reset(lan9118_state *s)
@@ -512,7 +512,7 @@ static int lan9118_filter(lan9118_state *s, const uint8_t *addr)
 static ssize_t lan9118_receive(NetClientState *nc, const uint8_t *buf,
                                size_t size)
 {
-    lan9118_state *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    lan9118_state *s = qemu_get_nic_opaque(nc);
     int fifo_len;
     int offset;
     int src_pos;
@@ -1306,7 +1306,7 @@ static const MemoryRegionOps lan9118_16bit_mem_ops = {
 
 static void lan9118_cleanup(NetClientState *nc)
 {
-    lan9118_state *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    lan9118_state *s = qemu_get_nic_opaque(nc);
 
     s->nic = NULL;
 }
diff --git a/hw/lance.c b/hw/lance.c
index a5997fd..4b92425 100644
--- a/hw/lance.c
+++ b/hw/lance.c
@@ -87,7 +87,7 @@ static const MemoryRegionOps lance_mem_ops = {
 
 static void lance_cleanup(NetClientState *nc)
 {
-    PCNetState *d = DO_UPCAST(NICState, nc, nc)->opaque;
+    PCNetState *d = qemu_get_nic_opaque(nc);
 
     pcnet_common_cleanup(d);
 }
diff --git a/hw/mcf_fec.c b/hw/mcf_fec.c
index 8a90bf8..909e32b 100644
--- a/hw/mcf_fec.c
+++ b/hw/mcf_fec.c
@@ -353,13 +353,13 @@ static void mcf_fec_write(void *opaque, hwaddr addr,
 
 static int mcf_fec_can_receive(NetClientState *nc)
 {
-    mcf_fec_state *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    mcf_fec_state *s = qemu_get_nic_opaque(nc);
     return s->rx_enabled;
 }
 
 static ssize_t mcf_fec_receive(NetClientState *nc, const uint8_t *buf, size_t size)
 {
-    mcf_fec_state *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    mcf_fec_state *s = qemu_get_nic_opaque(nc);
     mcf_fec_bd bd;
     uint32_t flags = 0;
     uint32_t addr;
@@ -441,7 +441,7 @@ static const MemoryRegionOps mcf_fec_ops = {
 
 static void mcf_fec_cleanup(NetClientState *nc)
 {
-    mcf_fec_state *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    mcf_fec_state *s = qemu_get_nic_opaque(nc);
 
     memory_region_del_subregion(s->sysmem, &s->iomem);
     memory_region_destroy(&s->iomem);
diff --git a/hw/milkymist-minimac2.c b/hw/milkymist-minimac2.c
index 2a8a4ef..9992dcc 100644
--- a/hw/milkymist-minimac2.c
+++ b/hw/milkymist-minimac2.c
@@ -280,7 +280,7 @@ static void update_rx_interrupt(MilkymistMinimac2State *s)
 
 static ssize_t minimac2_rx(NetClientState *nc, const uint8_t *buf, size_t size)
 {
-    MilkymistMinimac2State *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    MilkymistMinimac2State *s = qemu_get_nic_opaque(nc);
 
     uint32_t r_count;
     uint32_t r_state;
@@ -410,7 +410,7 @@ static const MemoryRegionOps minimac2_ops = {
 
 static int minimac2_can_rx(NetClientState *nc)
 {
-    MilkymistMinimac2State *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    MilkymistMinimac2State *s = qemu_get_nic_opaque(nc);
 
     if (s->regs[R_STATE0] == STATE_LOADED) {
         return 1;
@@ -424,7 +424,7 @@ static int minimac2_can_rx(NetClientState *nc)
 
 static void minimac2_cleanup(NetClientState *nc)
 {
-    MilkymistMinimac2State *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    MilkymistMinimac2State *s = qemu_get_nic_opaque(nc);
 
     s->nic = NULL;
 }
diff --git a/hw/mipsnet.c b/hw/mipsnet.c
index 15761b1..ff6bf7f 100644
--- a/hw/mipsnet.c
+++ b/hw/mipsnet.c
@@ -64,7 +64,7 @@ static int mipsnet_buffer_full(MIPSnetState *s)
 
 static int mipsnet_can_receive(NetClientState *nc)
 {
-    MIPSnetState *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    MIPSnetState *s = qemu_get_nic_opaque(nc);
 
     if (s->busy)
         return 0;
@@ -73,7 +73,7 @@ static int mipsnet_can_receive(NetClientState *nc)
 
 static ssize_t mipsnet_receive(NetClientState *nc, const uint8_t *buf, size_t size)
 {
-    MIPSnetState *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    MIPSnetState *s = qemu_get_nic_opaque(nc);
 
     trace_mipsnet_receive(size);
     if (!mipsnet_can_receive(nc))
@@ -211,7 +211,7 @@ static const VMStateDescription vmstate_mipsnet = {
 
 static void mipsnet_cleanup(NetClientState *nc)
 {
-    MIPSnetState *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    MIPSnetState *s = qemu_get_nic_opaque(nc);
 
     s->nic = NULL;
 }
diff --git a/hw/musicpal.c b/hw/musicpal.c
index 9e22f69..272cb80 100644
--- a/hw/musicpal.c
+++ b/hw/musicpal.c
@@ -190,7 +190,7 @@ static int eth_can_receive(NetClientState *nc)
 
 static ssize_t eth_receive(NetClientState *nc, const uint8_t *buf, size_t size)
 {
-    mv88w8618_eth_state *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    mv88w8618_eth_state *s = qemu_get_nic_opaque(nc);
     uint32_t desc_addr;
     mv88w8618_rx_desc desc;
     int i;
@@ -369,7 +369,7 @@ static const MemoryRegionOps mv88w8618_eth_ops = {
 
 static void eth_cleanup(NetClientState *nc)
 {
-    mv88w8618_eth_state *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    mv88w8618_eth_state *s = qemu_get_nic_opaque(nc);
 
     s->nic = NULL;
 }
diff --git a/hw/ne2000-isa.c b/hw/ne2000-isa.c
index fa47e12..342c6bd 100644
--- a/hw/ne2000-isa.c
+++ b/hw/ne2000-isa.c
@@ -38,7 +38,7 @@ typedef struct ISANE2000State {
 
 static void isa_ne2000_cleanup(NetClientState *nc)
 {
-    NE2000State *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    NE2000State *s = qemu_get_nic_opaque(nc);
 
     s->nic = NULL;
 }
diff --git a/hw/ne2000.c b/hw/ne2000.c
index 03c4209..c989190 100644
--- a/hw/ne2000.c
+++ b/hw/ne2000.c
@@ -167,7 +167,7 @@ static int ne2000_buffer_full(NE2000State *s)
 
 int ne2000_can_receive(NetClientState *nc)
 {
-    NE2000State *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    NE2000State *s = qemu_get_nic_opaque(nc);
 
     if (s->cmd & E8390_STOP)
         return 1;
@@ -178,7 +178,7 @@ int ne2000_can_receive(NetClientState *nc)
 
 ssize_t ne2000_receive(NetClientState *nc, const uint8_t *buf, size_t size_)
 {
-    NE2000State *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    NE2000State *s = qemu_get_nic_opaque(nc);
     int size = size_;
     uint8_t *p;
     unsigned int total_len, next, avail, len, index, mcast_idx;
@@ -706,7 +706,7 @@ void ne2000_setup_io(NE2000State *s, unsigned size)
 
 static void ne2000_cleanup(NetClientState *nc)
 {
-    NE2000State *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    NE2000State *s = qemu_get_nic_opaque(nc);
 
     s->nic = NULL;
 }
diff --git a/hw/opencores_eth.c b/hw/opencores_eth.c
index 2496d4e..f9ba5ee 100644
--- a/hw/opencores_eth.c
+++ b/hw/opencores_eth.c
@@ -313,7 +313,7 @@ static void open_eth_int_source_write(OpenEthState *s,
 
 static void open_eth_set_link_status(NetClientState *nc)
 {
-    OpenEthState *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    OpenEthState *s = qemu_get_nic_opaque(nc);
 
     if (GET_REGBIT(s, MIICOMMAND, SCANSTAT)) {
         SET_REGFIELD(s, MIISTATUS, LINKFAIL, nc->link_down);
@@ -344,7 +344,7 @@ static void open_eth_reset(void *opaque)
 
 static int open_eth_can_receive(NetClientState *nc)
 {
-    OpenEthState *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    OpenEthState *s = qemu_get_nic_opaque(nc);
 
     return GET_REGBIT(s, MODER, RXEN) &&
         (s->regs[TX_BD_NUM] < 0x80) &&
@@ -354,7 +354,7 @@ static int open_eth_can_receive(NetClientState *nc)
 static ssize_t open_eth_receive(NetClientState *nc,
         const uint8_t *buf, size_t size)
 {
-    OpenEthState *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    OpenEthState *s = qemu_get_nic_opaque(nc);
     size_t maxfl = GET_REGFIELD(s, PACKETLEN, MAXFL);
     size_t minfl = GET_REGFIELD(s, PACKETLEN, MINFL);
     size_t fcsl = 4;
diff --git a/hw/pcnet-pci.c b/hw/pcnet-pci.c
index 54a849d..26c90bf 100644
--- a/hw/pcnet-pci.c
+++ b/hw/pcnet-pci.c
@@ -266,7 +266,7 @@ static void pci_physical_memory_read(void *dma_opaque, hwaddr addr,
 
 static void pci_pcnet_cleanup(NetClientState *nc)
 {
-    PCNetState *d = DO_UPCAST(NICState, nc, nc)->opaque;
+    PCNetState *d = qemu_get_nic_opaque(nc);
 
     pcnet_common_cleanup(d);
 }
diff --git a/hw/pcnet.c b/hw/pcnet.c
index 2126e22..e0de1e3 100644
--- a/hw/pcnet.c
+++ b/hw/pcnet.c
@@ -1006,7 +1006,7 @@ static int pcnet_tdte_poll(PCNetState *s)
 
 int pcnet_can_receive(NetClientState *nc)
 {
-    PCNetState *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    PCNetState *s = qemu_get_nic_opaque(nc);
     if (CSR_STOP(s) || CSR_SPND(s))
         return 0;
 
@@ -1017,7 +1017,7 @@ int pcnet_can_receive(NetClientState *nc)
 
 ssize_t pcnet_receive(NetClientState *nc, const uint8_t *buf, size_t size_)
 {
-    PCNetState *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    PCNetState *s = qemu_get_nic_opaque(nc);
     int is_padr = 0, is_bcast = 0, is_ladr = 0;
     uint8_t buf1[60];
     int remaining;
@@ -1199,7 +1199,7 @@ ssize_t pcnet_receive(NetClientState *nc, const uint8_t *buf, size_t size_)
 
 void pcnet_set_link_status(NetClientState *nc)
 {
-    PCNetState *d = DO_UPCAST(NICState, nc, nc)->opaque;
+    PCNetState *d = qemu_get_nic_opaque(nc);
 
     d->lnkst = nc->link_down ? 0 : 0x40;
 }
diff --git a/hw/rtl8139.c b/hw/rtl8139.c
index 22d24ae..b825e83 100644
--- a/hw/rtl8139.c
+++ b/hw/rtl8139.c
@@ -786,7 +786,7 @@ static bool rtl8139_cp_rx_valid(RTL8139State *s)
 
 static int rtl8139_can_receive(NetClientState *nc)
 {
-    RTL8139State *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    RTL8139State *s = qemu_get_nic_opaque(nc);
     int avail;
 
     /* Receive (drop) packets if card is disabled.  */
@@ -808,7 +808,7 @@ static int rtl8139_can_receive(NetClientState *nc)
 
 static ssize_t rtl8139_do_receive(NetClientState *nc, const uint8_t *buf, size_t size_, int do_interrupt)
 {
-    RTL8139State *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    RTL8139State *s = qemu_get_nic_opaque(nc);
     /* size is the length of the buffer passed to the driver */
     int size = size_;
     const uint8_t *dot1q_buf = NULL;
@@ -3429,7 +3429,7 @@ static void rtl8139_timer(void *opaque)
 
 static void rtl8139_cleanup(NetClientState *nc)
 {
-    RTL8139State *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    RTL8139State *s = qemu_get_nic_opaque(nc);
 
     s->nic = NULL;
 }
@@ -3451,7 +3451,7 @@ static void pci_rtl8139_uninit(PCIDevice *dev)
 
 static void rtl8139_set_link_status(NetClientState *nc)
 {
-    RTL8139State *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    RTL8139State *s = qemu_get_nic_opaque(nc);
 
     if (nc->link_down) {
         s->BasicModeStatus &= ~0x04;
diff --git a/hw/smc91c111.c b/hw/smc91c111.c
index b097d6b..1c0ec8e 100644
--- a/hw/smc91c111.c
+++ b/hw/smc91c111.c
@@ -630,7 +630,7 @@ static uint32_t smc91c111_readl(void *opaque, hwaddr offset)
 
 static int smc91c111_can_receive(NetClientState *nc)
 {
-    smc91c111_state *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    smc91c111_state *s = qemu_get_nic_opaque(nc);
 
     if ((s->rcr & RCR_RXEN) == 0 || (s->rcr & RCR_SOFT_RST))
         return 1;
@@ -641,7 +641,7 @@ static int smc91c111_can_receive(NetClientState *nc)
 
 static ssize_t smc91c111_receive(NetClientState *nc, const uint8_t *buf, size_t size)
 {
-    smc91c111_state *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    smc91c111_state *s = qemu_get_nic_opaque(nc);
     int status;
     int packetsize;
     uint32_t crc;
@@ -730,7 +730,7 @@ static const MemoryRegionOps smc91c111_mem_ops = {
 
 static void smc91c111_cleanup(NetClientState *nc)
 {
-    smc91c111_state *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    smc91c111_state *s = qemu_get_nic_opaque(nc);
 
     s->nic = NULL;
 }
diff --git a/hw/spapr_llan.c b/hw/spapr_llan.c
index d53d4ae..6ef2936 100644
--- a/hw/spapr_llan.c
+++ b/hw/spapr_llan.c
@@ -85,7 +85,7 @@ typedef struct VIOsPAPRVLANDevice {
 
 static int spapr_vlan_can_receive(NetClientState *nc)
 {
-    VIOsPAPRVLANDevice *dev = DO_UPCAST(NICState, nc, nc)->opaque;
+    VIOsPAPRVLANDevice *dev = qemu_get_nic_opaque(nc);
 
     return (dev->isopen && dev->rx_bufs > 0);
 }
@@ -93,7 +93,7 @@ static int spapr_vlan_can_receive(NetClientState *nc)
 static ssize_t spapr_vlan_receive(NetClientState *nc, const uint8_t *buf,
                                   size_t size)
 {
-    VIOsPAPRDevice *sdev = DO_UPCAST(NICState, nc, nc)->opaque;
+    VIOsPAPRDevice *sdev = qemu_get_nic_opaque(nc);
     VIOsPAPRVLANDevice *dev = (VIOsPAPRVLANDevice *)sdev;
     vlan_bd_t rxq_bd = vio_ldq(sdev, dev->buf_list + VLAN_RXQ_BD_OFF);
     vlan_bd_t bd;
diff --git a/hw/stellaris_enet.c b/hw/stellaris_enet.c
index 99d4730..6c701fb 100644
--- a/hw/stellaris_enet.c
+++ b/hw/stellaris_enet.c
@@ -80,7 +80,7 @@ static void stellaris_enet_update(stellaris_enet_state *s)
 /* TODO: Implement MAC address filtering.  */
 static ssize_t stellaris_enet_receive(NetClientState *nc, const uint8_t *buf, size_t size)
 {
-    stellaris_enet_state *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    stellaris_enet_state *s = qemu_get_nic_opaque(nc);
     int n;
     uint8_t *p;
     uint32_t crc;
@@ -122,7 +122,7 @@ static ssize_t stellaris_enet_receive(NetClientState *nc, const uint8_t *buf, si
 
 static int stellaris_enet_can_receive(NetClientState *nc)
 {
-    stellaris_enet_state *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    stellaris_enet_state *s = qemu_get_nic_opaque(nc);
 
     if ((s->rctl & SE_RCTL_RXEN) == 0)
         return 1;
@@ -384,7 +384,7 @@ static int stellaris_enet_load(QEMUFile *f, void *opaque, int version_id)
 
 static void stellaris_enet_cleanup(NetClientState *nc)
 {
-    stellaris_enet_state *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    stellaris_enet_state *s = qemu_get_nic_opaque(nc);
 
     unregister_savevm(&s->busdev.qdev, "stellaris_enet", s);
 
diff --git a/hw/usb/dev-network.c b/hw/usb/dev-network.c
index a131f9c..abc6eac 100644
--- a/hw/usb/dev-network.c
+++ b/hw/usb/dev-network.c
@@ -1261,7 +1261,7 @@ static void usb_net_handle_data(USBDevice *dev, USBPacket *p)
 
 static ssize_t usbnet_receive(NetClientState *nc, const uint8_t *buf, size_t size)
 {
-    USBNetState *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    USBNetState *s = qemu_get_nic_opaque(nc);
     uint8_t *in_buf = s->in_buf;
     size_t total_size = size;
 
@@ -1308,7 +1308,7 @@ static ssize_t usbnet_receive(NetClientState *nc, const uint8_t *buf, size_t siz
 
 static int usbnet_can_receive(NetClientState *nc)
 {
-    USBNetState *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    USBNetState *s = qemu_get_nic_opaque(nc);
 
     if (is_rndis(s) && s->rndis_state != RNDIS_DATA_INITIALIZED) {
         return 1;
@@ -1319,7 +1319,7 @@ static int usbnet_can_receive(NetClientState *nc)
 
 static void usbnet_cleanup(NetClientState *nc)
 {
-    USBNetState *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    USBNetState *s = qemu_get_nic_opaque(nc);
 
     s->nic = NULL;
 }
diff --git a/hw/virtio-net.c b/hw/virtio-net.c
index 551f6dc..0b43add 100644
--- a/hw/virtio-net.c
+++ b/hw/virtio-net.c
@@ -169,7 +169,7 @@ static void virtio_net_set_status(struct VirtIODevice *vdev, uint8_t status)
 
 static void virtio_net_set_link_status(NetClientState *nc)
 {
-    VirtIONet *n = DO_UPCAST(NICState, nc, nc)->opaque;
+    VirtIONet *n = qemu_get_nic_opaque(nc);
     uint16_t old_status = n->status;
 
     if (nc->link_down)
@@ -473,7 +473,7 @@ static void virtio_net_handle_rx(VirtIODevice *vdev, VirtQueue *vq)
 
 static int virtio_net_can_receive(NetClientState *nc)
 {
-    VirtIONet *n = DO_UPCAST(NICState, nc, nc)->opaque;
+    VirtIONet *n = qemu_get_nic_opaque(nc);
     if (!n->vdev.vm_running) {
         return 0;
     }
@@ -604,7 +604,7 @@ static int receive_filter(VirtIONet *n, const uint8_t *buf, int size)
 
 static ssize_t virtio_net_receive(NetClientState *nc, const uint8_t *buf, size_t size)
 {
-    VirtIONet *n = DO_UPCAST(NICState, nc, nc)->opaque;
+    VirtIONet *n = qemu_get_nic_opaque(nc);
     struct iovec mhdr_sg[VIRTQUEUE_MAX_SIZE];
     struct virtio_net_hdr_mrg_rxbuf mhdr;
     unsigned mhdr_cnt = 0;
@@ -703,7 +703,7 @@ static int32_t virtio_net_flush_tx(VirtIONet *n, VirtQueue *vq);
 
 static void virtio_net_tx_complete(NetClientState *nc, ssize_t len)
 {
-    VirtIONet *n = DO_UPCAST(NICState, nc, nc)->opaque;
+    VirtIONet *n = qemu_get_nic_opaque(nc);
 
     virtqueue_push(n->tx_vq, &n->async_tx.elem, 0);
     virtio_notify(&n->vdev, n->tx_vq);
@@ -1002,7 +1002,7 @@ static int virtio_net_load(QEMUFile *f, void *opaque, int version_id)
 
 static void virtio_net_cleanup(NetClientState *nc)
 {
-    VirtIONet *n = DO_UPCAST(NICState, nc, nc)->opaque;
+    VirtIONet *n = qemu_get_nic_opaque(nc);
 
     n->nic = NULL;
 }
diff --git a/hw/xen_nic.c b/hw/xen_nic.c
index d5b39ea..55b7960 100644
--- a/hw/xen_nic.c
+++ b/hw/xen_nic.c
@@ -236,7 +236,7 @@ static void net_rx_response(struct XenNetDev *netdev,
 
 static int net_rx_ok(NetClientState *nc)
 {
-    struct XenNetDev *netdev = DO_UPCAST(NICState, nc, nc)->opaque;
+    struct XenNetDev *netdev = qemu_get_nic_opaque(nc);
     RING_IDX rc, rp;
 
     if (netdev->xendev.be_state != XenbusStateConnected) {
@@ -257,7 +257,7 @@ static int net_rx_ok(NetClientState *nc)
 
 static ssize_t net_rx_packet(NetClientState *nc, const uint8_t *buf, size_t size)
 {
-    struct XenNetDev *netdev = DO_UPCAST(NICState, nc, nc)->opaque;
+    struct XenNetDev *netdev = qemu_get_nic_opaque(nc);
     netif_rx_request_t rxreq;
     RING_IDX rc, rp;
     void *page;
diff --git a/hw/xgmac.c b/hw/xgmac.c
index 4d7bb13..5072298 100644
--- a/hw/xgmac.c
+++ b/hw/xgmac.c
@@ -310,7 +310,7 @@ static const MemoryRegionOps enet_mem_ops = {
 
 static int eth_can_rx(NetClientState *nc)
 {
-    struct XgmacState *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    struct XgmacState *s = qemu_get_nic_opaque(nc);
 
     /* RX enabled?  */
     return s->regs[DMA_CONTROL] & DMA_CONTROL_SR;
@@ -318,7 +318,7 @@ static int eth_can_rx(NetClientState *nc)
 
 static ssize_t eth_rx(NetClientState *nc, const uint8_t *buf, size_t size)
 {
-    struct XgmacState *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    struct XgmacState *s = qemu_get_nic_opaque(nc);
     static const unsigned char sa_bcast[6] = {0xff, 0xff, 0xff,
                                               0xff, 0xff, 0xff};
     int unicast, broadcast, multicast;
@@ -366,7 +366,7 @@ out:
 
 static void eth_cleanup(NetClientState *nc)
 {
-    struct XgmacState *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    struct XgmacState *s = qemu_get_nic_opaque(nc);
     s->nic = NULL;
 }
 
diff --git a/hw/xilinx_axienet.c b/hw/xilinx_axienet.c
index a7e8e2c..34e344c 100644
--- a/hw/xilinx_axienet.c
+++ b/hw/xilinx_axienet.c
@@ -617,7 +617,7 @@ static const MemoryRegionOps enet_ops = {
 
 static int eth_can_rx(NetClientState *nc)
 {
-    struct XilinxAXIEnet *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    struct XilinxAXIEnet *s = qemu_get_nic_opaque(nc);
 
     /* RX enabled?  */
     return !axienet_rx_resetting(s) && axienet_rx_enabled(s);
@@ -640,7 +640,7 @@ static int enet_match_addr(const uint8_t *buf, uint32_t f0, uint32_t f1)
 
 static ssize_t eth_rx(NetClientState *nc, const uint8_t *buf, size_t size)
 {
-    struct XilinxAXIEnet *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    struct XilinxAXIEnet *s = qemu_get_nic_opaque(nc);
     static const unsigned char sa_bcast[6] = {0xff, 0xff, 0xff,
                                               0xff, 0xff, 0xff};
     static const unsigned char sa_ipmcast[3] = {0x01, 0x00, 0x52};
@@ -785,7 +785,7 @@ static ssize_t eth_rx(NetClientState *nc, const uint8_t *buf, size_t size)
 static void eth_cleanup(NetClientState *nc)
 {
     /* FIXME.  */
-    struct XilinxAXIEnet *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    struct XilinxAXIEnet *s = qemu_get_nic_opaque(nc);
     g_free(s->rxmem);
     g_free(s);
 }
diff --git a/hw/xilinx_ethlite.c b/hw/xilinx_ethlite.c
index 5ab3915..ca07c3d 100644
--- a/hw/xilinx_ethlite.c
+++ b/hw/xilinx_ethlite.c
@@ -162,7 +162,7 @@ static const MemoryRegionOps eth_ops = {
 
 static int eth_can_rx(NetClientState *nc)
 {
-    struct xlx_ethlite *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    struct xlx_ethlite *s = qemu_get_nic_opaque(nc);
     int r;
     r = !(s->regs[R_RX_CTRL0] & CTRL_S);
     return r;
@@ -170,7 +170,7 @@ static int eth_can_rx(NetClientState *nc)
 
 static ssize_t eth_rx(NetClientState *nc, const uint8_t *buf, size_t size)
 {
-    struct xlx_ethlite *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    struct xlx_ethlite *s = qemu_get_nic_opaque(nc);
     unsigned int rxbase = s->rxbuf * (0x800 / 4);
 
     /* DA filter.  */
@@ -196,7 +196,7 @@ static ssize_t eth_rx(NetClientState *nc, const uint8_t *buf, size_t size)
 
 static void eth_cleanup(NetClientState *nc)
 {
-    struct xlx_ethlite *s = DO_UPCAST(NICState, nc, nc)->opaque;
+    struct xlx_ethlite *s = qemu_get_nic_opaque(nc);
 
     s->nic = NULL;
 }
diff --git a/include/net/net.h b/include/net/net.h
index 5d8aecf..96e05c4 100644
--- a/include/net/net.h
+++ b/include/net/net.h
@@ -78,6 +78,8 @@ NICState *qemu_new_nic(NetClientInfo *info,
                        const char *name,
                        void *opaque);
 NetClientState *qemu_get_queue(NICState *nic);
+NICState *qemu_get_nic(NetClientState *nc);
+void *qemu_get_nic_opaque(NetClientState *nc);
 void qemu_del_net_client(NetClientState *nc);
 NetClientState *qemu_find_vlan_client_by_name(Monitor *mon, int vlan_id,
                                               const char *client_str);
diff --git a/net/net.c b/net/net.c
index e9a0d15..41dc12c 100644
--- a/net/net.c
+++ b/net/net.c
@@ -227,7 +227,7 @@ NICState *qemu_new_nic(NetClientInfo *info,
 
     nc = qemu_new_net_client(info, conf->peer, model, name);
 
-    nic = DO_UPCAST(NICState, nc, nc);
+    nic = qemu_get_nic(nc);
     nic->conf = conf;
     nic->opaque = opaque;
 
@@ -239,6 +239,18 @@ NetClientState *qemu_get_queue(NICState *nic)
     return &nic->nc;
 }
 
+NICState *qemu_get_nic(NetClientState *nc)
+{
+    return DO_UPCAST(NICState, nc, nc);
+}
+
+void *qemu_get_nic_opaque(NetClientState *nc)
+{
+    NICState *nic = qemu_get_nic(nc);
+
+    return nic->opaque;
+}
+
 static void qemu_cleanup_net_client(NetClientState *nc)
 {
     QTAILQ_REMOVE(&net_clients, nc, next);
@@ -265,7 +277,7 @@ void qemu_del_net_client(NetClientState *nc)
 {
     /* If there is a peer NIC, delete and cleanup client, but do not free. */
     if (nc->peer && nc->peer->info->type == NET_CLIENT_OPTIONS_KIND_NIC) {
-        NICState *nic = DO_UPCAST(NICState, nc, nc->peer);
+        NICState *nic = qemu_get_nic(nc->peer);
         if (nic->peer_deleted) {
             return;
         }
@@ -281,7 +293,7 @@ void qemu_del_net_client(NetClientState *nc)
 
     /* If this is a peer NIC and peer has already been deleted, free it now. */
     if (nc->peer && nc->info->type == NET_CLIENT_OPTIONS_KIND_NIC) {
-        NICState *nic = DO_UPCAST(NICState, nc, nc);
+        NICState *nic = qemu_get_nic(nc);
         if (nic->peer_deleted) {
             qemu_free_net_client(nc->peer);
         }
@@ -297,7 +309,7 @@ void qemu_foreach_nic(qemu_nic_foreach func, void *opaque)
 
     QTAILQ_FOREACH(nc, &net_clients, next) {
         if (nc->info->type == NET_CLIENT_OPTIONS_KIND_NIC) {
-            func(DO_UPCAST(NICState, nc, nc), opaque);
+            func(qemu_get_nic(nc), opaque);
         }
     }
 }
-- 
1.7.1



* [PATCH V2 03/20] net: introduce qemu_del_nic()
  2013-01-25 10:35 [PATCH V2 00/20] Multiqueue virtio-net Jason Wang
  2013-01-25 10:35 ` [PATCH V2 01/20] net: introduce qemu_get_queue() Jason Wang
  2013-01-25 10:35 ` [PATCH V2 02/20] net: introduce qemu_get_nic() Jason Wang
@ 2013-01-25 10:35 ` Jason Wang
  2013-01-25 10:35 ` [PATCH V2 04/20] net: introduce qemu_find_net_clients_except() Jason Wang
                   ` (17 subsequent siblings)
  20 siblings, 0 replies; 41+ messages in thread
From: Jason Wang @ 2013-01-25 10:35 UTC (permalink / raw)
  To: mst, qemu-devel, aliguori, shajnocz
  Cc: krkumar2, kvm, mprivozn, rusty, jwhan, shiyer, gaowanlong, Jason Wang

To support multiqueue NICs, this patch separates the NIC destructor from
qemu_del_net_client() into a new helper, qemu_del_nic(), since the mapping
between NICState and NetClientState is no longer 1:1 in multiqueue. The
following patches will refactor this function to support multiqueue NICs.

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 hw/e1000.c           |    2 +-
 hw/eepro100.c        |    2 +-
 hw/ne2000.c          |    2 +-
 hw/pcnet-pci.c       |    2 +-
 hw/rtl8139.c         |    2 +-
 hw/usb/dev-network.c |    2 +-
 hw/virtio-net.c      |    2 +-
 hw/xen_nic.c         |    2 +-
 include/net/net.h    |    1 +
 net/net.c            |   15 ++++++++++++++-
 10 files changed, 23 insertions(+), 9 deletions(-)
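
A rough, standalone sketch of the split described above, for readers who want
to see the shape without the QEMU tree. This is a toy model, not QEMU code:
struct client, struct nic, del_client(), del_nic() and cleanup_all() are
made-up stand-ins for NetClientState, NICState, qemu_del_net_client(),
qemu_del_nic() and net_cleanup(), and "freed" is a flag rather than a real
free() so the result can be inspected.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for QEMU's types; every name here is hypothetical. */
enum kind { KIND_NIC, KIND_TAP };

struct client {
    enum kind kind;
    int freed;          /* set instead of calling free(), so it can be checked */
};

struct nic {
    struct client nc;   /* one embedded queue, as at this point in the series */
    void *opaque;
};

/* Generic destructor: backends only. NICs must go through del_nic(). */
static void del_client(struct client *c)
{
    assert(c->kind != KIND_NIC);
    c->freed = 1;
}

/* NIC destructor: takes the NIC itself rather than one of its queues. */
static void del_nic(struct nic *n)
{
    n->nc.freed = 1;
}

/* Cleanup loop that dispatches on the client type. Returns the NIC count. */
static int cleanup_all(struct client **cs, int n)
{
    int nics = 0;

    for (int i = 0; i < n; i++) {
        if (cs[i]->kind == KIND_NIC) {
            /* Recover the containing NIC from its embedded client. */
            struct nic *nic = (struct nic *)((char *)cs[i] -
                                             offsetof(struct nic, nc));
            del_nic(nic);
            nics++;
        } else {
            del_client(cs[i]);
        }
    }
    return nics;
}
```

Once a NIC owns several queues, only del_nic() would need to learn how to walk
them; callers that hold a NIC pointer stay unchanged.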

diff --git a/hw/e1000.c b/hw/e1000.c
index 36f4051..f3590a9 100644
--- a/hw/e1000.c
+++ b/hw/e1000.c
@@ -1244,7 +1244,7 @@ pci_e1000_uninit(PCIDevice *dev)
     qemu_free_timer(d->autoneg_timer);
     memory_region_destroy(&d->mmio);
     memory_region_destroy(&d->io);
-    qemu_del_net_client(qemu_get_queue(d->nic));
+    qemu_del_nic(d->nic);
 }
 
 static NetClientInfo net_e1000_info = {
diff --git a/hw/eepro100.c b/hw/eepro100.c
index f9856ae..5d23796 100644
--- a/hw/eepro100.c
+++ b/hw/eepro100.c
@@ -1849,7 +1849,7 @@ static void pci_nic_uninit(PCIDevice *pci_dev)
     memory_region_destroy(&s->flash_bar);
     vmstate_unregister(&pci_dev->qdev, s->vmstate, s);
     eeprom93xx_free(&pci_dev->qdev, s->eeprom);
-    qemu_del_net_client(qemu_get_queue(s->nic));
+    qemu_del_nic(s->nic);
 }
 
 static NetClientInfo net_eepro100_info = {
diff --git a/hw/ne2000.c b/hw/ne2000.c
index c989190..3dd1c84 100644
--- a/hw/ne2000.c
+++ b/hw/ne2000.c
@@ -751,7 +751,7 @@ static void pci_ne2000_exit(PCIDevice *pci_dev)
     NE2000State *s = &d->ne2000;
 
     memory_region_destroy(&s->io);
-    qemu_del_net_client(qemu_get_queue(s->nic));
+    qemu_del_nic(s->nic);
 }
 
 static Property ne2000_properties[] = {
diff --git a/hw/pcnet-pci.c b/hw/pcnet-pci.c
index 26c90bf..df63b22 100644
--- a/hw/pcnet-pci.c
+++ b/hw/pcnet-pci.c
@@ -279,7 +279,7 @@ static void pci_pcnet_uninit(PCIDevice *dev)
     memory_region_destroy(&d->io_bar);
     qemu_del_timer(d->state.poll_timer);
     qemu_free_timer(d->state.poll_timer);
-    qemu_del_net_client(qemu_get_queue(d->state.nic));
+    qemu_del_nic(d->state.nic);
 }
 
 static NetClientInfo net_pci_pcnet_info = {
diff --git a/hw/rtl8139.c b/hw/rtl8139.c
index b825e83..d7716be 100644
--- a/hw/rtl8139.c
+++ b/hw/rtl8139.c
@@ -3446,7 +3446,7 @@ static void pci_rtl8139_uninit(PCIDevice *dev)
     }
     qemu_del_timer(s->timer);
     qemu_free_timer(s->timer);
-    qemu_del_net_client(qemu_get_queue(s->nic));
+    qemu_del_nic(s->nic);
 }
 
 static void rtl8139_set_link_status(NetClientState *nc)
diff --git a/hw/usb/dev-network.c b/hw/usb/dev-network.c
index abc6eac..a01a5e7 100644
--- a/hw/usb/dev-network.c
+++ b/hw/usb/dev-network.c
@@ -1330,7 +1330,7 @@ static void usb_net_handle_destroy(USBDevice *dev)
 
     /* TODO: remove the nd_table[] entry */
     rndis_clear_responsequeue(s);
-    qemu_del_net_client(qemu_get_queue(s->nic));
+    qemu_del_nic(s->nic);
 }
 
 static NetClientInfo net_usbnet_info = {
diff --git a/hw/virtio-net.c b/hw/virtio-net.c
index 0b43add..47f4ab4 100644
--- a/hw/virtio-net.c
+++ b/hw/virtio-net.c
@@ -1124,6 +1124,6 @@ void virtio_net_exit(VirtIODevice *vdev)
         qemu_bh_delete(n->tx_bh);
     }
 
-    qemu_del_net_client(qemu_get_queue(n->nic));
+    qemu_del_nic(n->nic);
     virtio_cleanup(&n->vdev);
 }
diff --git a/hw/xen_nic.c b/hw/xen_nic.c
index 55b7960..4be077d 100644
--- a/hw/xen_nic.c
+++ b/hw/xen_nic.c
@@ -408,7 +408,7 @@ static void net_disconnect(struct XenDevice *xendev)
         netdev->rxs = NULL;
     }
     if (netdev->nic) {
-        qemu_del_net_client(qemu_get_queue(netdev->nic));
+        qemu_del_nic(netdev->nic);
         netdev->nic = NULL;
     }
 }
diff --git a/include/net/net.h b/include/net/net.h
index 96e05c4..f0d1aa2 100644
--- a/include/net/net.h
+++ b/include/net/net.h
@@ -77,6 +77,7 @@ NICState *qemu_new_nic(NetClientInfo *info,
                        const char *model,
                        const char *name,
                        void *opaque);
+void qemu_del_nic(NICState *nic);
 NetClientState *qemu_get_queue(NICState *nic);
 NICState *qemu_get_nic(NetClientState *nc);
 void *qemu_get_nic_opaque(NetClientState *nc);
diff --git a/net/net.c b/net/net.c
index 41dc12c..8999f8d 100644
--- a/net/net.c
+++ b/net/net.c
@@ -291,6 +291,15 @@ void qemu_del_net_client(NetClientState *nc)
         return;
     }
 
+    assert(nc->info->type != NET_CLIENT_OPTIONS_KIND_NIC);
+
+    qemu_cleanup_net_client(nc);
+    qemu_free_net_client(nc);
+}
+
+void qemu_del_nic(NICState *nic)
+{
+    NetClientState *nc = qemu_get_queue(nic);
     /* If this is a peer NIC and peer has already been deleted, free it now. */
     if (nc->peer && nc->info->type == NET_CLIENT_OPTIONS_KIND_NIC) {
         NICState *nic = qemu_get_nic(nc);
@@ -933,7 +942,11 @@ void net_cleanup(void)
     NetClientState *nc, *next_vc;
 
     QTAILQ_FOREACH_SAFE(nc, &net_clients, next, next_vc) {
-        qemu_del_net_client(nc);
+        if (nc->info->type == NET_CLIENT_OPTIONS_KIND_NIC) {
+            qemu_del_nic(qemu_get_nic(nc));
+        } else {
+            qemu_del_net_client(nc);
+        }
     }
 }
 
-- 
1.7.1



* [PATCH V2 04/20] net: introduce qemu_find_net_clients_except()
  2013-01-25 10:35 [PATCH V2 00/20] Multiqueue virtio-net Jason Wang
                   ` (2 preceding siblings ...)
  2013-01-25 10:35 ` [PATCH V2 03/20] net: introduce qemu_del_nic() Jason Wang
@ 2013-01-25 10:35 ` Jason Wang
  2013-01-25 10:35 ` [PATCH V2 05/20] net: introduce qemu_net_client_setup() Jason Wang
                   ` (16 subsequent siblings)
  20 siblings, 0 replies; 41+ messages in thread
From: Jason Wang @ 2013-01-25 10:35 UTC (permalink / raw)
  To: mst, qemu-devel, aliguori, shajnocz
  Cc: krkumar2, kvm, mprivozn, rusty, jwhan, shiyer, gaowanlong, Jason Wang

In multiqueue, all NetClientStates that belong to the same netdev or nic have
the same id. So this patch introduces a helper, qemu_find_net_clients_except(),
which finds all NetClientStates with a given id, skipping those of the given
type. This will be used by multiqueue networking.

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 include/net/net.h |    2 ++
 net/net.c         |   21 +++++++++++++++++++++
 2 files changed, 23 insertions(+), 0 deletions(-)
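
A small standalone sketch of the lookup semantics, assuming a plain array in
place of the net_clients tail queue. The names struct client and
find_clients_except() are hypothetical and exist only for this illustration.
The point it tries to show is that the return value is the total number of
matches, which may exceed max, while at most max pointers are stored.

```c
#include <assert.h>
#include <string.h>

/* Toy stand-ins; not QEMU's real types. */
enum kind { KIND_NIC, KIND_TAP };

struct client {
    const char *name;
    enum kind kind;
};

/*
 * Collect the clients named `id` whose kind is not `skip`.
 * Stores at most `max` matches in `out`, but returns the total match count,
 * so a caller can detect truncation by comparing the result against `max`.
 */
static int find_clients_except(struct client *all, int n, const char *id,
                               struct client **out, enum kind skip, int max)
{
    int ret = 0;

    for (int i = 0; i < n; i++) {
        if (all[i].kind == skip) {
            continue;
        }
        if (strcmp(all[i].name, id) == 0) {
            if (ret < max) {
                out[ret] = &all[i];
            }
            ret++;
        }
    }
    return ret;
}
```

A caller that sized its array for max entries should treat a return value
larger than max as "more queues exist than were stored".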

diff --git a/include/net/net.h b/include/net/net.h
index f0d1aa2..995df5c 100644
--- a/include/net/net.h
+++ b/include/net/net.h
@@ -68,6 +68,8 @@ typedef struct NICState {
 } NICState;
 
 NetClientState *qemu_find_netdev(const char *id);
+int qemu_find_net_clients_except(const char *id, NetClientState **ncs,
+                                 NetClientOptionsKind type, int max);
 NetClientState *qemu_new_net_client(NetClientInfo *info,
                                     NetClientState *peer,
                                     const char *model,
diff --git a/net/net.c b/net/net.c
index 8999f8d..6457fc0 100644
--- a/net/net.c
+++ b/net/net.c
@@ -508,6 +508,27 @@ NetClientState *qemu_find_netdev(const char *id)
     return NULL;
 }
 
+int qemu_find_net_clients_except(const char *id, NetClientState **ncs,
+                                 NetClientOptionsKind type, int max)
+{
+    NetClientState *nc;
+    int ret = 0;
+
+    QTAILQ_FOREACH(nc, &net_clients, next) {
+        if (nc->info->type == type) {
+            continue;
+        }
+        if (!strcmp(nc->name, id)) {
+            if (ret < max) {
+                ncs[ret] = nc;
+            }
+            ret++;
+        }
+    }
+
+    return ret;
+}
+
 static int nic_get_free_idx(void)
 {
     int index;
-- 
1.7.1



* [PATCH V2 05/20] net: introduce qemu_net_client_setup()
  2013-01-25 10:35 [PATCH V2 00/20] Multiqueue virtio-net Jason Wang
                   ` (3 preceding siblings ...)
  2013-01-25 10:35 ` [PATCH V2 04/20] net: introduce qemu_find_net_clients_except() Jason Wang
@ 2013-01-25 10:35 ` Jason Wang
  2013-01-25 10:35 ` [PATCH V2 06/20] net: introduce NetClientState destructor Jason Wang
                   ` (15 subsequent siblings)
  20 siblings, 0 replies; 41+ messages in thread
From: Jason Wang @ 2013-01-25 10:35 UTC (permalink / raw)
  To: mst, qemu-devel, aliguori, shajnocz
  Cc: krkumar2, kvm, mprivozn, rusty, jwhan, shiyer, gaowanlong, Jason Wang

This patch separates the setup of a NetClientState from its allocation. This
allows allocating an array of NetClientStates and initializing them one by one,
which is what multiqueue needs.
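
The split can be pictured with a small standalone sketch. It does not use
QEMU's real types: Client, client_setup(), client_new() and client_array_new()
are hypothetical stand-ins for NetClientState, qemu_net_client_setup() and the
allocation paths, and only illustrate why a setup function that takes an
already-allocated object is convenient for arrays.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for NetClientState; not QEMU's real layout. */
typedef struct Client {
    char *name;
    int queue_index;
} Client;

/* Initialize an already-allocated client; the struct itself is not allocated here. */
static void client_setup(Client *c, const char *name, int queue_index)
{
    size_t len = strlen(name) + 1;

    c->name = malloc(len);
    assert(c->name != NULL);
    memcpy(c->name, name, len);
    c->queue_index = queue_index;
}

/* Single-client path: allocate, then delegate to client_setup(). */
static Client *client_new(const char *name)
{
    Client *c = calloc(1, sizeof(*c));

    assert(c != NULL);
    client_setup(c, name, 0);
    return c;
}

/* Multiqueue path: one allocation for n clients, set up one by one. */
static Client *client_array_new(const char *name, int n)
{
    Client *cs = calloc((size_t)n, sizeof(*cs));

    assert(cs != NULL);
    for (int i = 0; i < n; i++) {
        client_setup(&cs[i], name, i);
    }
    return cs;
}
```

Both paths share the same initialization code; only the allocation differs.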

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 net/net.c |   29 +++++++++++++++++++----------
 1 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/net/net.c b/net/net.c
index 6457fc0..4e84d54 100644
--- a/net/net.c
+++ b/net/net.c
@@ -182,17 +182,12 @@ static char *assign_name(NetClientState *nc1, const char *model)
     return g_strdup(buf);
 }
 
-NetClientState *qemu_new_net_client(NetClientInfo *info,
-                                    NetClientState *peer,
-                                    const char *model,
-                                    const char *name)
+static void qemu_net_client_setup(NetClientState *nc,
+                                  NetClientInfo *info,
+                                  NetClientState *peer,
+                                  const char *model,
+                                  const char *name)
 {
-    NetClientState *nc;
-
-    assert(info->size >= sizeof(NetClientState));
-
-    nc = g_malloc0(info->size);
-
     nc->info = info;
     nc->model = g_strdup(model);
     if (name) {
@@ -210,6 +205,20 @@ NetClientState *qemu_new_net_client(NetClientInfo *info,
 
     nc->send_queue = qemu_new_net_queue(nc);
 
+}
+
+NetClientState *qemu_new_net_client(NetClientInfo *info,
+                                    NetClientState *peer,
+                                    const char *model,
+                                    const char *name)
+{
+    NetClientState *nc;
+
+    assert(info->size >= sizeof(NetClientState));
+
+    nc = g_malloc0(info->size);
+    qemu_net_client_setup(nc, info, peer, model, name);
+
     return nc;
 }
 
-- 
1.7.1



* [PATCH V2 06/20] net: introduce NetClientState destructor
  2013-01-25 10:35 [PATCH V2 00/20] Multiqueue virtio-net Jason Wang
                   ` (4 preceding siblings ...)
  2013-01-25 10:35 ` [PATCH V2 05/20] net: introduce qemu_net_client_setup() Jason Wang
@ 2013-01-25 10:35 ` Jason Wang
  2013-01-25 10:35 ` [PATCH V2 07/20] net: multiqueue support Jason Wang
                   ` (14 subsequent siblings)
  20 siblings, 0 replies; 41+ messages in thread
From: Jason Wang @ 2013-01-25 10:35 UTC (permalink / raw)
  To: mst, qemu-devel, aliguori, shajnocz
  Cc: krkumar2, kvm, mprivozn, rusty, jwhan, shiyer, gaowanlong, Jason Wang

To allow allocating an array of NetClientStates and freeing it in one go, this
patch introduces a destructor for NetClientState. The destructor can perform a
type-specific free, which multiqueue can use to free the whole array at once.
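
A rough standalone sketch of the idea, under stated assumptions: Client,
client_array_new(), client_free() and client_array_free() are hypothetical
names, and free_calls exists only to make the behaviour observable. Element 0
carries the destructor that releases the whole allocation, while the other
elements carry NULL, similar to how the multiqueue patch later in this series
passes NULL for the sub-queues.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Client Client;
typedef void ClientDestructor(Client *);

/* Hypothetical stand-in for NetClientState. */
struct Client {
    int queue_index;
    ClientDestructor *destructor; /* NULL: storage is owned by another element */
};

static int free_calls; /* counts destructor invocations, for illustration only */

/* Frees the whole array; must only be attached to element 0. */
static void client_array_destructor(Client *c)
{
    free_calls++;
    free(c);
}

static Client *client_array_new(int n)
{
    Client *cs = calloc((size_t)n, sizeof(*cs));

    assert(cs != NULL);
    for (int i = 0; i < n; i++) {
        cs[i].queue_index = i;
        cs[i].destructor = (i == 0) ? client_array_destructor : NULL;
    }
    return cs;
}

/* Per-client free: only calls the destructor if one was registered. */
static void client_free(Client *c)
{
    if (c->destructor) {
        c->destructor(c);
    }
}

/* Walk from the last element down so that element 0, which owns the
 * storage, is released last. */
static void client_array_free(Client *cs, int n)
{
    for (int i = n - 1; i >= 0; i--) {
        client_free(&cs[i]);
    }
}
```

With this layout the underlying memory is released exactly once, no matter how
many clients live in the array.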

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 include/net/net.h |    2 ++
 net/net.c         |   17 +++++++++++++----
 2 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/include/net/net.h b/include/net/net.h
index 995df5c..22adc99 100644
--- a/include/net/net.h
+++ b/include/net/net.h
@@ -35,6 +35,7 @@ typedef ssize_t (NetReceive)(NetClientState *, const uint8_t *, size_t);
 typedef ssize_t (NetReceiveIOV)(NetClientState *, const struct iovec *, int);
 typedef void (NetCleanup) (NetClientState *);
 typedef void (LinkStatusChanged)(NetClientState *);
+typedef void (NetClientDestructor)(NetClientState *);
 
 typedef struct NetClientInfo {
     NetClientOptionsKind type;
@@ -58,6 +59,7 @@ struct NetClientState {
     char *name;
     char info_str[256];
     unsigned receive_disabled : 1;
+    NetClientDestructor *destructor;
 };
 
 typedef struct NICState {
diff --git a/net/net.c b/net/net.c
index 4e84d54..6368896 100644
--- a/net/net.c
+++ b/net/net.c
@@ -182,11 +182,17 @@ static char *assign_name(NetClientState *nc1, const char *model)
     return g_strdup(buf);
 }
 
+static void qemu_net_client_destructor(NetClientState *nc)
+{
+    g_free(nc);
+}
+
 static void qemu_net_client_setup(NetClientState *nc,
                                   NetClientInfo *info,
                                   NetClientState *peer,
                                   const char *model,
-                                  const char *name)
+                                  const char *name,
+                                  NetClientDestructor *destructor)
 {
     nc->info = info;
     nc->model = g_strdup(model);
@@ -204,7 +210,7 @@ static void qemu_net_client_setup(NetClientState *nc,
     QTAILQ_INSERT_TAIL(&net_clients, nc, next);
 
     nc->send_queue = qemu_new_net_queue(nc);
-
+    nc->destructor = destructor;
 }
 
 NetClientState *qemu_new_net_client(NetClientInfo *info,
@@ -217,7 +223,8 @@ NetClientState *qemu_new_net_client(NetClientInfo *info,
     assert(info->size >= sizeof(NetClientState));
 
     nc = g_malloc0(info->size);
-    qemu_net_client_setup(nc, info, peer, model, name);
+    qemu_net_client_setup(nc, info, peer, model, name,
+                          qemu_net_client_destructor);
 
     return nc;
 }
@@ -279,7 +286,9 @@ static void qemu_free_net_client(NetClientState *nc)
     }
     g_free(nc->name);
     g_free(nc->model);
-    g_free(nc);
+    if (nc->destructor) {
+        nc->destructor(nc);
+    }
 }
 
 void qemu_del_net_client(NetClientState *nc)
-- 
1.7.1



* [PATCH V2 07/20] net: multiqueue support
  2013-01-25 10:35 [PATCH V2 00/20] Multiqueue virtio-net Jason Wang
                   ` (5 preceding siblings ...)
  2013-01-25 10:35 ` [PATCH V2 06/20] net: introduce NetClientState destructor Jason Wang
@ 2013-01-25 10:35 ` Jason Wang
  2013-01-25 10:35 ` [PATCH V2 08/20] tap: import linux multiqueue constants Jason Wang
                   ` (13 subsequent siblings)
  20 siblings, 0 replies; 41+ messages in thread
From: Jason Wang @ 2013-01-25 10:35 UTC (permalink / raw)
  To: mst, qemu-devel, aliguori, shajnocz
  Cc: krkumar2, kvm, mprivozn, Jason Wang, rusty, gaowanlong, jwhan, shiyer

This patch adds basic multiqueue support to qemu. The idea is simple: an array
of NetClientStates is introduced in NICState, and parse_netdev() is extended to
find all NetClientStates that belong to the backend and place their pointers in
NICConf. qemu_new_nic() can then set up an N:N mapping between the
NetClientStates that belong to a nic and the NetClientStates that belong to the
netdev. A queue_index is also introduced in NetClientState to track its index.
After this, each pair of peers of a NICState is abstracted as a queue.
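
Because the queues live in an array embedded in the NIC, queue_index is enough
to get from any queue back to its NIC. The following is only a sketch of that
pointer arithmetic with hypothetical types (Client, Nic, nic_init(),
nic_from_client()) and a small illustrative MAX_QUEUES; it is not QEMU's
implementation, which uses DO_UPCAST.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_QUEUES 8 /* illustrative only; the patch uses MAX_QUEUE_NUM 1024 */

/* Hypothetical stand-in for NetClientState. */
typedef struct Client {
    unsigned int queue_index;
} Client;

/* Hypothetical stand-in for NICState: the queues are embedded in the NIC. */
typedef struct Nic {
    Client ncs[MAX_QUEUES];
    void *opaque;
} Nic;

/* Record each queue's position in the embedded array. */
static void nic_init(Nic *nic, int queues)
{
    for (int i = 0; i < queues; i++) {
        nic->ncs[i].queue_index = (unsigned int)i;
    }
}

/* Recover the owning NIC from any of its queues. */
static Nic *nic_from_client(Client *nc)
{
    /* Step back to ncs[0], then from that member to the containing struct. */
    Client *nc0 = nc - nc->queue_index;

    return (Nic *)((char *)nc0 - offsetof(Nic, ncs));
}
```

This only works if queue_index always matches the element's real position in
the array, which is why it is assigned at setup time.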

After this change, all NetClientStates that belong to the same backend/nic have
the same id. When the user changes the link status, the link status of all
NetClientStates that belong to the same backend/nic is changed as well. When the
user deletes a device or netdev, all NetClientStates that belong to the same
backend/nic are deleted too. Changing or deleting a specific queue is not
allowed.
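
Operating on "all clients with the same id" boils down to a lookup that
collects every match. The sketch below imitates the shape of
qemu_find_net_clients_except() on a plain array instead of the net_clients
list; Client, Kind and find_clients_except() are hypothetical names. The point
it illustrates is that the return value is the total number of matches, even
when that exceeds max, so a caller can detect that its buffer was too small.

```c
#include <assert.h>
#include <string.h>

typedef enum { KIND_NIC, KIND_TAP } Kind;

/* Hypothetical stand-in for NetClientState. */
typedef struct Client {
    const char *name;
    Kind kind;
} Client;

/* Collect up to max clients named id whose kind is not skip.
 * Returns the total number of matches, which may be larger than max. */
static int find_clients_except(Client *all, int n_all, const char *id,
                               Kind skip, Client **out, int max)
{
    int found = 0;

    for (int i = 0; i < n_all; i++) {
        if (all[i].kind == skip) {
            continue;
        }
        if (strcmp(all[i].name, id) == 0) {
            if (found < max) {
                out[found] = &all[i]; /* never write past the buffer */
            }
            found++;
        }
    }
    return found;
}
```

A caller that passes max entries and gets back a larger count knows that more
queues exist than it can hold, which is the case parse_netdev() reports as
-E2BIG.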

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 hw/dp8393x.c                |    2 +-
 hw/mcf_fec.c                |    2 +-
 hw/qdev-properties-system.c |   46 +++++++++++++++---
 hw/qdev-properties.h        |    6 +-
 include/net/net.h           |   18 +++++--
 net/net.c                   |  113 +++++++++++++++++++++++++++++++------------
 6 files changed, 139 insertions(+), 48 deletions(-)

diff --git a/hw/dp8393x.c b/hw/dp8393x.c
index 0273fad..808157b 100644
--- a/hw/dp8393x.c
+++ b/hw/dp8393x.c
@@ -900,7 +900,7 @@ void dp83932_init(NICInfo *nd, hwaddr base, int it_shift,
     s->regs[SONIC_SR] = 0x0004; /* only revision recognized by Linux */
 
     s->conf.macaddr = nd->macaddr;
-    s->conf.peer = nd->netdev;
+    s->conf.peers.ncs[0] = nd->netdev;
 
     s->nic = qemu_new_nic(&net_dp83932_info, &s->conf, nd->model, nd->name, s);
 
diff --git a/hw/mcf_fec.c b/hw/mcf_fec.c
index 909e32b..8e60f09 100644
--- a/hw/mcf_fec.c
+++ b/hw/mcf_fec.c
@@ -472,7 +472,7 @@ void mcf_fec_init(MemoryRegion *sysmem, NICInfo *nd,
     memory_region_add_subregion(sysmem, base, &s->iomem);
 
     s->conf.macaddr = nd->macaddr;
-    s->conf.peer = nd->netdev;
+    s->conf.peers.ncs[0] = nd->netdev;
 
     s->nic = qemu_new_nic(&net_mcf_fec_info, &s->conf, nd->model, nd->name, s);
 
diff --git a/hw/qdev-properties-system.c b/hw/qdev-properties-system.c
index ce0f793..ce3af22 100644
--- a/hw/qdev-properties-system.c
+++ b/hw/qdev-properties-system.c
@@ -173,16 +173,47 @@ PropertyInfo qdev_prop_chr = {
 
 static int parse_netdev(DeviceState *dev, const char *str, void **ptr)
 {
-    NetClientState *netdev = qemu_find_netdev(str);
+    NICPeers *peers_ptr = (NICPeers *)ptr;
+    NICConf *conf = container_of(peers_ptr, NICConf, peers);
+    NetClientState **ncs = peers_ptr->ncs;
+    NetClientState *peers[MAX_QUEUE_NUM];
+    int queues, i = 0;
+    int ret;
 
-    if (netdev == NULL) {
-        return -ENOENT;
+    queues = qemu_find_net_clients_except(str, peers,
+                                          NET_CLIENT_OPTIONS_KIND_NIC,
+                                          MAX_QUEUE_NUM);
+    if (queues == 0) {
+        ret = -ENOENT;
+        goto err;
     }
-    if (netdev->peer) {
-        return -EEXIST;
+
+    if (queues > MAX_QUEUE_NUM) {
+        ret = -E2BIG;
+        goto err;
+    }
+
+    for (i = 0; i < queues; i++) {
+        if (peers[i] == NULL) {
+            ret = -ENOENT;
+            goto err;
+        }
+
+        if (peers[i]->peer) {
+            ret = -EEXIST;
+            goto err;
+        }
+
+        ncs[i] = peers[i];
+        ncs[i]->queue_index = i;
     }
-    *ptr = netdev;
+
+    conf->queues = queues;
+
     return 0;
+
+err:
+    return ret;
 }
 
 static const char *print_netdev(void *ptr)
@@ -249,7 +280,8 @@ static void set_vlan(Object *obj, Visitor *v, void *opaque,
 {
     DeviceState *dev = DEVICE(obj);
     Property *prop = opaque;
-    NetClientState **ptr = qdev_get_prop_ptr(dev, prop);
+    NICPeers *peers_ptr = qdev_get_prop_ptr(dev, prop);
+    NetClientState **ptr = &peers_ptr->ncs[0];
     Error *local_err = NULL;
     int32_t id;
     NetClientState *hubport;
diff --git a/hw/qdev-properties.h b/hw/qdev-properties.h
index ddcf774..20c67f3 100644
--- a/hw/qdev-properties.h
+++ b/hw/qdev-properties.h
@@ -31,7 +31,7 @@ extern PropertyInfo qdev_prop_pci_host_devaddr;
         .name      = (_name),                                    \
         .info      = &(_prop),                                   \
         .offset    = offsetof(_state, _field)                    \
-            + type_check(_type,typeof_field(_state, _field)),    \
+            + type_check(_type, typeof_field(_state, _field)),   \
         }
 #define DEFINE_PROP_DEFAULT(_name, _state, _field, _defval, _prop, _type) { \
         .name      = (_name),                                           \
@@ -77,9 +77,9 @@ extern PropertyInfo qdev_prop_pci_host_devaddr;
 #define DEFINE_PROP_STRING(_n, _s, _f)             \
     DEFINE_PROP(_n, _s, _f, qdev_prop_string, char*)
 #define DEFINE_PROP_NETDEV(_n, _s, _f)             \
-    DEFINE_PROP(_n, _s, _f, qdev_prop_netdev, NetClientState*)
+    DEFINE_PROP(_n, _s, _f, qdev_prop_netdev, NICPeers)
 #define DEFINE_PROP_VLAN(_n, _s, _f)             \
-    DEFINE_PROP(_n, _s, _f, qdev_prop_vlan, NetClientState*)
+    DEFINE_PROP(_n, _s, _f, qdev_prop_vlan, NICPeers)
 #define DEFINE_PROP_DRIVE(_n, _s, _f) \
     DEFINE_PROP(_n, _s, _f, qdev_prop_drive, BlockDriverState *)
 #define DEFINE_PROP_MACADDR(_n, _s, _f)         \
diff --git a/include/net/net.h b/include/net/net.h
index 22adc99..43a045e 100644
--- a/include/net/net.h
+++ b/include/net/net.h
@@ -9,24 +9,32 @@
 #include "migration/vmstate.h"
 #include "qapi-types.h"
 
+#define MAX_QUEUE_NUM 1024
+
 struct MACAddr {
     uint8_t a[6];
 };
 
 /* qdev nic properties */
 
+typedef struct NICPeers {
+    NetClientState *ncs[MAX_QUEUE_NUM];
+} NICPeers;
+
 typedef struct NICConf {
     MACAddr macaddr;
-    NetClientState *peer;
+    NICPeers peers;
     int32_t bootindex;
+    int32_t queues;
 } NICConf;
 
 #define DEFINE_NIC_PROPERTIES(_state, _conf)                            \
     DEFINE_PROP_MACADDR("mac",   _state, _conf.macaddr),                \
-    DEFINE_PROP_VLAN("vlan",     _state, _conf.peer),                   \
-    DEFINE_PROP_NETDEV("netdev", _state, _conf.peer),                   \
+    DEFINE_PROP_VLAN("vlan",     _state, _conf.peers),                   \
+    DEFINE_PROP_NETDEV("netdev", _state, _conf.peers),                   \
     DEFINE_PROP_INT32("bootindex", _state, _conf.bootindex, -1)
 
+
 /* Net clients */
 
 typedef void (NetPoll)(NetClientState *, bool enable);
@@ -60,10 +68,11 @@ struct NetClientState {
     char info_str[256];
     unsigned receive_disabled : 1;
     NetClientDestructor *destructor;
+    unsigned int queue_index;
 };
 
 typedef struct NICState {
-    NetClientState nc;
+    NetClientState ncs[MAX_QUEUE_NUM];
     NICConf *conf;
     void *opaque;
     bool peer_deleted;
@@ -82,6 +91,7 @@ NICState *qemu_new_nic(NetClientInfo *info,
                        const char *name,
                        void *opaque);
 void qemu_del_nic(NICState *nic);
+NetClientState *qemu_get_subqueue(NICState *nic, int queue_index);
 NetClientState *qemu_get_queue(NICState *nic);
 NICState *qemu_get_nic(NetClientState *nc);
 void *qemu_get_nic_opaque(NetClientState *nc);
diff --git a/net/net.c b/net/net.c
index 6368896..a71de70 100644
--- a/net/net.c
+++ b/net/net.c
@@ -236,28 +236,44 @@ NICState *qemu_new_nic(NetClientInfo *info,
                        void *opaque)
 {
     NetClientState *nc;
+    NetClientState **peers = conf->peers.ncs;
     NICState *nic;
+    int i;
 
     assert(info->type == NET_CLIENT_OPTIONS_KIND_NIC);
     assert(info->size >= sizeof(NICState));
 
-    nc = qemu_new_net_client(info, conf->peer, model, name);
+    nc = qemu_new_net_client(info, peers[0], model, name);
+    nc->queue_index = 0;
 
     nic = qemu_get_nic(nc);
     nic->conf = conf;
     nic->opaque = opaque;
 
+    for (i = 1; i < conf->queues; i++) {
+        qemu_net_client_setup(&nic->ncs[i], info, peers[i], model, nc->name,
+                              NULL);
+        nic->ncs[i].queue_index = i;
+    }
+
     return nic;
 }
 
+NetClientState *qemu_get_subqueue(NICState *nic, int queue_index)
+{
+    return &nic->ncs[queue_index];
+}
+
 NetClientState *qemu_get_queue(NICState *nic)
 {
-    return &nic->nc;
+    return qemu_get_subqueue(nic, 0);
 }
 
 NICState *qemu_get_nic(NetClientState *nc)
 {
-    return DO_UPCAST(NICState, nc, nc);
+    NetClientState *nc0 = nc - nc->queue_index;
+
+    return DO_UPCAST(NICState, ncs[0], nc0);
 }
 
 void *qemu_get_nic_opaque(NetClientState *nc)
@@ -271,9 +287,7 @@ static void qemu_cleanup_net_client(NetClientState *nc)
 {
     QTAILQ_REMOVE(&net_clients, nc, next);
 
-    if (nc->info->cleanup) {
-        nc->info->cleanup(nc);
-    }
+    nc->info->cleanup(nc);
 }
 
 static void qemu_free_net_client(NetClientState *nc)
@@ -293,6 +307,17 @@ static void qemu_free_net_client(NetClientState *nc)
 
 void qemu_del_net_client(NetClientState *nc)
 {
+    NetClientState *ncs[MAX_QUEUE_NUM];
+    int queues, i;
+
+    /* If the NetClientState belongs to a multiqueue backend, we will change all
+     * other NetClientStates also.
+     */
+    queues = qemu_find_net_clients_except(nc->name, ncs,
+                                          NET_CLIENT_OPTIONS_KIND_NIC,
+                                          MAX_QUEUE_NUM);
+    assert(queues != 0);
+
     /* If there is a peer NIC, delete and cleanup client, but do not free. */
     if (nc->peer && nc->peer->info->type == NET_CLIENT_OPTIONS_KIND_NIC) {
         NICState *nic = qemu_get_nic(nc->peer);
@@ -300,34 +325,47 @@ void qemu_del_net_client(NetClientState *nc)
             return;
         }
         nic->peer_deleted = true;
-        /* Let NIC know peer is gone. */
-        nc->peer->link_down = true;
+
+        for (i = 0; i < queues; i++) {
+            ncs[i]->peer->link_down = true;
+        }
+
         if (nc->peer->info->link_status_changed) {
             nc->peer->info->link_status_changed(nc->peer);
         }
-        qemu_cleanup_net_client(nc);
+
+        for (i = 0; i < queues; i++) {
+            qemu_cleanup_net_client(ncs[i]);
+        }
+
         return;
     }
 
     assert(nc->info->type != NET_CLIENT_OPTIONS_KIND_NIC);
 
-    qemu_cleanup_net_client(nc);
-    qemu_free_net_client(nc);
+    for (i = 0; i < queues; i++) {
+        qemu_cleanup_net_client(ncs[i]);
+        qemu_free_net_client(ncs[i]);
+    }
 }
 
 void qemu_del_nic(NICState *nic)
 {
-    NetClientState *nc = qemu_get_queue(nic);
+    int i, queues = nic->conf->queues;
+
     /* If this is a peer NIC and peer has already been deleted, free it now. */
-    if (nc->peer && nc->info->type == NET_CLIENT_OPTIONS_KIND_NIC) {
-        NICState *nic = qemu_get_nic(nc);
-        if (nic->peer_deleted) {
-            qemu_free_net_client(nc->peer);
+    if (nic->peer_deleted) {
+        for (i = 0; i < queues; i++) {
+            qemu_free_net_client(qemu_get_subqueue(nic, i)->peer);
         }
     }
 
-    qemu_cleanup_net_client(nc);
-    qemu_free_net_client(nc);
+    for (i = queues - 1; i >= 0; i--) {
+        NetClientState *nc = qemu_get_subqueue(nic, i);
+
+        qemu_cleanup_net_client(nc);
+        qemu_free_net_client(nc);
+    }
 }
 
 void qemu_foreach_nic(qemu_nic_foreach func, void *opaque)
@@ -336,7 +374,9 @@ void qemu_foreach_nic(qemu_nic_foreach func, void *opaque)
 
     QTAILQ_FOREACH(nc, &net_clients, next) {
         if (nc->info->type == NET_CLIENT_OPTIONS_KIND_NIC) {
-            func(qemu_get_nic(nc), opaque);
+            if (nc->queue_index == 0) {
+                func(qemu_get_nic(nc), opaque);
+            }
         }
     }
 }
@@ -913,8 +953,10 @@ void qmp_netdev_del(const char *id, Error **errp)
 
 void print_net_client(Monitor *mon, NetClientState *nc)
 {
-    monitor_printf(mon, "%s: type=%s,%s\n", nc->name,
-                   NetClientOptionsKind_lookup[nc->info->type], nc->info_str);
+    monitor_printf(mon, "%s: index=%d,type=%s,%s\n", nc->name,
+                   nc->queue_index,
+                   NetClientOptionsKind_lookup[nc->info->type],
+                   nc->info_str);
 }
 
 void do_info_network(Monitor *mon, const QDict *qdict)
@@ -945,20 +987,23 @@ void do_info_network(Monitor *mon, const QDict *qdict)
 
 void qmp_set_link(const char *name, bool up, Error **errp)
 {
-    NetClientState *nc = NULL;
+    NetClientState *ncs[MAX_QUEUE_NUM];
+    NetClientState *nc;
+    int queues, i;
 
-    QTAILQ_FOREACH(nc, &net_clients, next) {
-        if (!strcmp(nc->name, name)) {
-            goto done;
-        }
-    }
-done:
-    if (!nc) {
+    queues = qemu_find_net_clients_except(name, ncs,
+                                          NET_CLIENT_OPTIONS_KIND_MAX,
+                                          MAX_QUEUE_NUM);
+
+    if (queues == 0) {
         error_set(errp, QERR_DEVICE_NOT_FOUND, name);
         return;
     }
+    nc = ncs[0];
 
-    nc->link_down = !up;
+    for (i = 0; i < queues; i++) {
+        ncs[i]->link_down = !up;
+    }
 
     if (nc->info->link_status_changed) {
         nc->info->link_status_changed(nc);
@@ -978,9 +1023,13 @@ done:
 
 void net_cleanup(void)
 {
-    NetClientState *nc, *next_vc;
+    NetClientState *nc;
 
-    QTAILQ_FOREACH_SAFE(nc, &net_clients, next, next_vc) {
+    /* We may del multiple entries during qemu_del_net_client(),
+     * so QTAILQ_FOREACH_SAFE() is also not safe here.
+     */
+    while (!QTAILQ_EMPTY(&net_clients)) {
+        nc = QTAILQ_FIRST(&net_clients);
         if (nc->info->type == NET_CLIENT_OPTIONS_KIND_NIC) {
             qemu_del_nic(qemu_get_nic(nc));
         } else {
-- 
1.7.1


* [PATCH V2 08/20] tap: import linux multiqueue constants
  2013-01-25 10:35 [PATCH V2 00/20] Multiqueue virtio-net Jason Wang
                   ` (6 preceding siblings ...)
  2013-01-25 10:35 ` [PATCH V2 07/20] net: multiqueue support Jason Wang
@ 2013-01-25 10:35 ` Jason Wang
  2013-01-25 10:35 ` [PATCH V2 09/20] tap: factor out common tap initialization Jason Wang
                   ` (12 subsequent siblings)
  20 siblings, 0 replies; 41+ messages in thread
From: Jason Wang @ 2013-01-25 10:35 UTC (permalink / raw)
  To: mst, qemu-devel, aliguori, shajnocz
  Cc: krkumar2, kvm, mprivozn, rusty, jwhan, shiyer, gaowanlong, Jason Wang

Import the multiqueue constants from if_tun.h as of 3.8-rc3. A new ifr flag,
IFF_MULTI_QUEUE, was introduced to create a multiqueue backend by calling
TUNSETIFF with this flag and the same interface name multiple times.

A new ioctl, TUNSETQUEUE, was introduced. Issuing this ioctl with
IFF_DETACH_QUEUE disables the queue in the Linux kernel; issuing it with
IFF_ATTACH_QUEUE enables the queue in the Linux kernel.

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 net/tap-linux.h |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/net/tap-linux.h b/net/tap-linux.h
index cb2a6d4..65087e1 100644
--- a/net/tap-linux.h
+++ b/net/tap-linux.h
@@ -29,6 +29,7 @@
 #define TUNSETSNDBUF   _IOW('T', 212, int)
 #define TUNGETVNETHDRSZ _IOR('T', 215, int)
 #define TUNSETVNETHDRSZ _IOW('T', 216, int)
+#define TUNSETQUEUE  _IOW('T', 217, int)
 
 #endif
 
@@ -36,6 +37,9 @@
 #define IFF_TAP		0x0002
 #define IFF_NO_PI	0x1000
 #define IFF_VNET_HDR	0x4000
+#define IFF_MULTI_QUEUE 0x0100
+#define IFF_ATTACH_QUEUE 0x0200
+#define IFF_DETACH_QUEUE 0x0400
 
 /* Features for GSO (TUNSETOFFLOAD). */
 #define TUN_F_CSUM	0x01	/* You can hand me unchecksummed packets. */
-- 
1.7.1



* [PATCH V2 09/20] tap: factor out common tap initialization
  2013-01-25 10:35 [PATCH V2 00/20] Multiqueue virtio-net Jason Wang
                   ` (7 preceding siblings ...)
  2013-01-25 10:35 ` [PATCH V2 08/20] tap: import linux multiqueue constants Jason Wang
@ 2013-01-25 10:35 ` Jason Wang
  2013-01-25 10:35 ` [PATCH V2 10/20] tap: add Linux multiqueue support Jason Wang
                   ` (11 subsequent siblings)
  20 siblings, 0 replies; 41+ messages in thread
From: Jason Wang @ 2013-01-25 10:35 UTC (permalink / raw)
  To: mst, qemu-devel, aliguori, shajnocz
  Cc: krkumar2, kvm, mprivozn, rusty, jwhan, shiyer, gaowanlong, Jason Wang

This patch factors out the common initialization of tap into a new helper,
net_init_tap_one(). This will be used by the multiqueue tap patches.

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 net/tap.c |  130 ++++++++++++++++++++++++++++++++++---------------------------
 1 files changed, 73 insertions(+), 57 deletions(-)

diff --git a/net/tap.c b/net/tap.c
index eb40c42..67080f1 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -593,6 +593,73 @@ static int net_tap_init(const NetdevTapOptions *tap, int *vnet_hdr,
     return fd;
 }
 
+static int net_init_tap_one(const NetdevTapOptions *tap, NetClientState *peer,
+                            const char *model, const char *name,
+                            const char *ifname, const char *script,
+                            const char *downscript, const char *vhostfdname,
+                            int vnet_hdr, int fd)
+{
+    TAPState *s;
+
+    s = net_tap_fd_init(peer, model, name, fd, vnet_hdr);
+    if (!s) {
+        close(fd);
+        return -1;
+    }
+
+    if (tap_set_sndbuf(s->fd, tap) < 0) {
+        return -1;
+    }
+
+    if (tap->has_fd) {
+        snprintf(s->nc.info_str, sizeof(s->nc.info_str), "fd=%d", fd);
+    } else if (tap->has_helper) {
+        snprintf(s->nc.info_str, sizeof(s->nc.info_str), "helper=%s",
+                 tap->helper);
+    } else {
+        const char *downscript;
+
+        downscript = tap->has_downscript ? tap->downscript :
+            DEFAULT_NETWORK_DOWN_SCRIPT;
+
+        snprintf(s->nc.info_str, sizeof(s->nc.info_str),
+                 "ifname=%s,script=%s,downscript=%s", ifname, script,
+                 downscript);
+
+        if (strcmp(downscript, "no") != 0) {
+            snprintf(s->down_script, sizeof(s->down_script), "%s", downscript);
+            snprintf(s->down_script_arg, sizeof(s->down_script_arg),
+                     "%s", ifname);
+        }
+    }
+
+    if (tap->has_vhost ? tap->vhost :
+        vhostfdname || (tap->has_vhostforce && tap->vhostforce)) {
+        int vhostfd;
+
+        if (tap->has_vhostfd) {
+            vhostfd = monitor_handle_fd_param(cur_mon, vhostfdname);
+            if (vhostfd == -1) {
+                return -1;
+            }
+        } else {
+            vhostfd = -1;
+        }
+
+        s->vhost_net = vhost_net_init(&s->nc, vhostfd,
+                                      tap->has_vhostforce && tap->vhostforce);
+        if (!s->vhost_net) {
+            error_report("vhost-net requested but could not be initialized");
+            return -1;
+        }
+    } else if (tap->has_vhostfd) {
+        error_report("vhostfd= is not valid without vhost");
+        return -1;
+    }
+
+    return 0;
+}
+
 int net_init_tap(const NetClientOptions *opts, const char *name,
                  NetClientState *peer)
 {
@@ -600,10 +667,10 @@ int net_init_tap(const NetClientOptions *opts, const char *name,
 
     int fd, vnet_hdr = 0;
     const char *model;
-    TAPState *s;
 
     /* for the no-fd, no-helper case */
     const char *script = NULL; /* suppress wrong "uninit'd use" gcc warning */
+    const char *downscript = NULL;
     char ifname[128];
 
     assert(opts->kind == NET_CLIENT_OPTIONS_KIND_TAP);
@@ -649,6 +716,8 @@ int net_init_tap(const NetClientOptions *opts, const char *name,
 
     } else {
         script = tap->has_script ? tap->script : DEFAULT_NETWORK_SCRIPT;
+        downscript = tap->has_downscript ? tap->downscript :
+            DEFAULT_NETWORK_DOWN_SCRIPT;
         fd = net_tap_init(tap, &vnet_hdr, script, ifname, sizeof ifname);
         if (fd == -1) {
             return -1;
@@ -657,62 +726,9 @@ int net_init_tap(const NetClientOptions *opts, const char *name,
         model = "tap";
     }
 
-    s = net_tap_fd_init(peer, model, name, fd, vnet_hdr);
-    if (!s) {
-        close(fd);
-        return -1;
-    }
-
-    if (tap_set_sndbuf(s->fd, tap) < 0) {
-        return -1;
-    }
-
-    if (tap->has_fd) {
-        snprintf(s->nc.info_str, sizeof(s->nc.info_str), "fd=%d", fd);
-    } else if (tap->has_helper) {
-        snprintf(s->nc.info_str, sizeof(s->nc.info_str), "helper=%s",
-                 tap->helper);
-    } else {
-        const char *downscript;
-
-        downscript = tap->has_downscript ? tap->downscript :
-                                           DEFAULT_NETWORK_DOWN_SCRIPT;
-
-        snprintf(s->nc.info_str, sizeof(s->nc.info_str),
-                 "ifname=%s,script=%s,downscript=%s", ifname, script,
-                 downscript);
-
-        if (strcmp(downscript, "no") != 0) {
-            snprintf(s->down_script, sizeof(s->down_script), "%s", downscript);
-            snprintf(s->down_script_arg, sizeof(s->down_script_arg), "%s", ifname);
-        }
-    }
-
-    if (tap->has_vhost ? tap->vhost :
-        tap->has_vhostfd || (tap->has_vhostforce && tap->vhostforce)) {
-        int vhostfd;
-
-        if (tap->has_vhostfd) {
-            vhostfd = monitor_handle_fd_param(cur_mon, tap->vhostfd);
-            if (vhostfd == -1) {
-                return -1;
-            }
-        } else {
-            vhostfd = -1;
-        }
-
-        s->vhost_net = vhost_net_init(&s->nc, vhostfd,
-                                      tap->has_vhostforce && tap->vhostforce);
-        if (!s->vhost_net) {
-            error_report("vhost-net requested but could not be initialized");
-            return -1;
-        }
-    } else if (tap->has_vhostfd) {
-        error_report("vhostfd= is not valid without vhost");
-        return -1;
-    }
-
-    return 0;
+    return net_init_tap_one(tap, peer, model, name, ifname, script,
+                            downscript, tap->has_vhostfd ? tap->vhostfd : NULL,
+                            vnet_hdr, fd);
 }
 
 VHostNetState *tap_get_vhost_net(NetClientState *nc)
-- 
1.7.1



* [PATCH V2 10/20] tap: add Linux multiqueue support
  2013-01-25 10:35 [PATCH V2 00/20] Multiqueue virtio-net Jason Wang
                   ` (8 preceding siblings ...)
  2013-01-25 10:35 ` [PATCH V2 09/20] tap: factor out common tap initialization Jason Wang
@ 2013-01-25 10:35 ` Jason Wang
  2013-01-25 10:35 ` [PATCH V2 11/20] tap: support enabling or disabling a queue Jason Wang
                   ` (10 subsequent siblings)
  20 siblings, 0 replies; 41+ messages in thread
From: Jason Wang @ 2013-01-25 10:35 UTC (permalink / raw)
  To: mst, qemu-devel, aliguori, shajnocz
  Cc: krkumar2, kvm, mprivozn, rusty, jwhan, shiyer, gaowanlong, Jason Wang

This patch adds basic multiqueue support for Linux. When multiqueue is needed,
we first check whether the kernel supports multiqueue tap before creating more
queues. Two new functions, tap_fd_enable() and tap_fd_disable(), are introduced
to enable and disable a specific queue. Since multiqueue is only supported on
Linux, they return an error on other platforms.

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 net/tap-aix.c     |   10 ++++++++++
 net/tap-bsd.c     |   11 +++++++++++
 net/tap-haiku.c   |   11 +++++++++++
 net/tap-linux.c   |   52 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 net/tap-solaris.c |   11 +++++++++++
 net/tap_int.h     |    2 ++
 6 files changed, 97 insertions(+), 0 deletions(-)

diff --git a/net/tap-aix.c b/net/tap-aix.c
index aff6c52..66e0574 100644
--- a/net/tap-aix.c
+++ b/net/tap-aix.c
@@ -59,3 +59,13 @@ void tap_fd_set_offload(int fd, int csum, int tso4,
                         int tso6, int ecn, int ufo)
 {
 }
+
+int tap_fd_enable(int fd)
+{
+    return -1;
+}
+
+int tap_fd_disable(int fd)
+{
+    return -1;
+}
diff --git a/net/tap-bsd.c b/net/tap-bsd.c
index 01c705b..cfc7a28 100644
--- a/net/tap-bsd.c
+++ b/net/tap-bsd.c
@@ -145,3 +145,14 @@ void tap_fd_set_offload(int fd, int csum, int tso4,
                         int tso6, int ecn, int ufo)
 {
 }
+
+int tap_fd_enable(int fd)
+{
+    return -1;
+}
+
+int tap_fd_disable(int fd)
+{
+    return -1;
+}
+
diff --git a/net/tap-haiku.c b/net/tap-haiku.c
index 08cc034..664d40f 100644
--- a/net/tap-haiku.c
+++ b/net/tap-haiku.c
@@ -59,3 +59,14 @@ void tap_fd_set_offload(int fd, int csum, int tso4,
                         int tso6, int ecn, int ufo)
 {
 }
+
+int tap_fd_enable(int fd)
+{
+    return -1;
+}
+
+int tap_fd_disable(int fd)
+{
+    return -1;
+}
+
diff --git a/net/tap-linux.c b/net/tap-linux.c
index 059f5f3..60ea8d0 100644
--- a/net/tap-linux.c
+++ b/net/tap-linux.c
@@ -41,6 +41,7 @@ int tap_open(char *ifname, int ifname_size, int *vnet_hdr, int vnet_hdr_required
     struct ifreq ifr;
     int fd, ret;
     int len = sizeof(struct virtio_net_hdr);
+    int mq_required = 0;
 
     TFR(fd = open(PATH_NET_TUN, O_RDWR));
     if (fd < 0) {
@@ -76,6 +77,20 @@ int tap_open(char *ifname, int ifname_size, int *vnet_hdr, int vnet_hdr_required
         ioctl(fd, TUNSETVNETHDRSZ, &len);
     }
 
+    if (mq_required) {
+        unsigned int features;
+
+        if ((ioctl(fd, TUNGETFEATURES, &features) != 0) ||
+            !(features & IFF_MULTI_QUEUE)) {
+            error_report("multiqueue required, but no kernel "
+                         "support for IFF_MULTI_QUEUE available");
+            close(fd);
+            return -1;
+        } else {
+            ifr.ifr_flags |= IFF_MULTI_QUEUE;
+        }
+    }
+
     if (ifname[0] != '\0')
         pstrcpy(ifr.ifr_name, IFNAMSIZ, ifname);
     else
@@ -209,3 +224,40 @@ void tap_fd_set_offload(int fd, int csum, int tso4,
         }
     }
 }
+
+/* Enable a specific queue of tap. */
+int tap_fd_enable(int fd)
+{
+    struct ifreq ifr;
+    int ret;
+
+    memset(&ifr, 0, sizeof(ifr));
+
+    ifr.ifr_flags = IFF_ATTACH_QUEUE;
+    ret = ioctl(fd, TUNSETQUEUE, (void *) &ifr);
+
+    if (ret != 0) {
+        error_report("could not enable queue");
+    }
+
+    return ret;
+}
+
+/* Disable a specific queue of tap. */
+int tap_fd_disable(int fd)
+{
+    struct ifreq ifr;
+    int ret;
+
+    memset(&ifr, 0, sizeof(ifr));
+
+    ifr.ifr_flags = IFF_DETACH_QUEUE;
+    ret = ioctl(fd, TUNSETQUEUE, (void *) &ifr);
+
+    if (ret != 0) {
+        error_report("could not disable queue");
+    }
+
+    return ret;
+}
+
diff --git a/net/tap-solaris.c b/net/tap-solaris.c
index 486a7ea..12cc392 100644
--- a/net/tap-solaris.c
+++ b/net/tap-solaris.c
@@ -225,3 +225,14 @@ void tap_fd_set_offload(int fd, int csum, int tso4,
                         int tso6, int ecn, int ufo)
 {
 }
+
+int tap_fd_enable(int fd)
+{
+    return -1;
+}
+
+int tap_fd_disable(int fd)
+{
+    return -1;
+}
+
diff --git a/net/tap_int.h b/net/tap_int.h
index 1dffe12..ca1c21b 100644
--- a/net/tap_int.h
+++ b/net/tap_int.h
@@ -42,5 +42,7 @@ int tap_probe_vnet_hdr_len(int fd, int len);
 int tap_probe_has_ufo(int fd);
 void tap_fd_set_offload(int fd, int csum, int tso4, int tso6, int ecn, int ufo);
 void tap_fd_set_vnet_hdr_len(int fd, int len);
+int tap_fd_enable(int fd);
+int tap_fd_disable(int fd);
 
 #endif /* QEMU_TAP_H */
-- 
1.7.1



* [PATCH V2 11/20] tap: support enabling or disabling a queue
  2013-01-25 10:35 [PATCH V2 00/20] Multiqueue virtio-net Jason Wang
                   ` (9 preceding siblings ...)
  2013-01-25 10:35 ` [PATCH V2 10/20] tap: add Linux multiqueue support Jason Wang
@ 2013-01-25 10:35 ` Jason Wang
  2013-01-25 19:13   ` [Qemu-devel] " Blue Swirl
  2013-01-25 10:35 ` [PATCH V2 12/20] tap: introduce a helper to get the name of an interface Jason Wang
                   ` (9 subsequent siblings)
  20 siblings, 1 reply; 41+ messages in thread
From: Jason Wang @ 2013-01-25 10:35 UTC (permalink / raw)
  To: mst, qemu-devel, aliguori, shajnocz
  Cc: krkumar2, kvm, mprivozn, rusty, jwhan, shiyer, gaowanlong, Jason Wang

This patch introduces a new bit, enabled, in TAPState which tracks whether a
specific queue/fd is enabled. The tap/fd is enabled during initialization and
can be enabled/disabled by tap_enable() and tap_disable(), which call
platform-specific helpers to do the real work. A tap fd is polled only while
the tap is enabled.
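
As a rough illustration of the behaviour described above, the following
standalone sketch models the enabled bit outside of qemu. The struct and all
sketch_* names are hypothetical stand-ins, and the platform helper is a stub
that always succeeds; it is not the code added by this patch.

```c
/* Hypothetical stand-in for TAPState; only the fields needed here. */
struct tap_sketch {
    unsigned int read_poll : 1;
    unsigned int enabled : 1;
    int platform_calls;   /* counts calls into the stubbed platform helper */
};

/* Stub for the platform-specific helper; assumed to always succeed. */
static int sketch_platform_toggle(struct tap_sketch *s)
{
    s->platform_calls++;
    return 0;
}

/* A read handler is installed only if polling is wanted AND the queue is enabled. */
static int sketch_read_handler_installed(const struct tap_sketch *s)
{
    return s->read_poll && s->enabled;
}

/* Enabling an already enabled queue is a no-op that reports success. */
static int sketch_enable(struct tap_sketch *s)
{
    int ret;

    if (s->enabled) {
        return 0;
    }
    ret = sketch_platform_toggle(s);
    if (ret == 0) {
        s->enabled = 1;
    }
    return ret;
}

/* Disabling an already disabled queue is a no-op that reports success. */
static int sketch_disable(struct tap_sketch *s)
{
    int ret;

    if (!s->enabled) {
        return 0;
    }
    ret = sketch_platform_toggle(s);
    if (ret == 0) {
        s->enabled = 0;
    }
    return ret;
}
```

The state only changes when the platform call succeeds, so a failed toggle
leaves the fd handlers as they were.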

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 include/net/tap.h |    2 ++
 net/tap-win32.c   |   10 ++++++++++
 net/tap.c         |   43 ++++++++++++++++++++++++++++++++++++++++---
 3 files changed, 52 insertions(+), 3 deletions(-)

diff --git a/include/net/tap.h b/include/net/tap.h
index bb7efb5..0caf8c4 100644
--- a/include/net/tap.h
+++ b/include/net/tap.h
@@ -35,6 +35,8 @@ int tap_has_vnet_hdr_len(NetClientState *nc, int len);
 void tap_using_vnet_hdr(NetClientState *nc, int using_vnet_hdr);
 void tap_set_offload(NetClientState *nc, int csum, int tso4, int tso6, int ecn, int ufo);
 void tap_set_vnet_hdr_len(NetClientState *nc, int len);
+int tap_enable(NetClientState *nc);
+int tap_disable(NetClientState *nc);
 
 int tap_get_fd(NetClientState *nc);
 
diff --git a/net/tap-win32.c b/net/tap-win32.c
index 265369c..a2cd94b 100644
--- a/net/tap-win32.c
+++ b/net/tap-win32.c
@@ -764,3 +764,13 @@ void tap_set_vnet_hdr_len(NetClientState *nc, int len)
 {
     assert(0);
 }
+
+int tap_enable(NetClientState *nc)
+{
+    assert(0);
+}
+
+int tap_disable(NetClientState *nc)
+{
+    assert(0);
+}
diff --git a/net/tap.c b/net/tap.c
index 67080f1..95e557b 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -59,6 +59,7 @@ typedef struct TAPState {
     unsigned int write_poll : 1;
     unsigned int using_vnet_hdr : 1;
     unsigned int has_ufo: 1;
+    unsigned int enabled : 1;
     VHostNetState *vhost_net;
     unsigned host_vnet_hdr_len;
 } TAPState;
@@ -72,9 +73,9 @@ static void tap_writable(void *opaque);
 static void tap_update_fd_handler(TAPState *s)
 {
     qemu_set_fd_handler2(s->fd,
-                         s->read_poll  ? tap_can_send : NULL,
-                         s->read_poll  ? tap_send     : NULL,
-                         s->write_poll ? tap_writable : NULL,
+                         s->read_poll && s->enabled ? tap_can_send : NULL,
+                         s->read_poll && s->enabled ? tap_send     : NULL,
+                         s->write_poll && s->enabled ? tap_writable : NULL,
                          s);
 }
 
@@ -339,6 +340,7 @@ static TAPState *net_tap_fd_init(NetClientState *peer,
     s->host_vnet_hdr_len = vnet_hdr ? sizeof(struct virtio_net_hdr) : 0;
     s->using_vnet_hdr = 0;
     s->has_ufo = tap_probe_has_ufo(s->fd);
+    s->enabled = 1;
     tap_set_offload(&s->nc, 0, 0, 0, 0, 0);
     /*
      * Make sure host header length is set correctly in tap:
@@ -737,3 +739,38 @@ VHostNetState *tap_get_vhost_net(NetClientState *nc)
     assert(nc->info->type == NET_CLIENT_OPTIONS_KIND_TAP);
     return s->vhost_net;
 }
+
+int tap_enable(NetClientState *nc)
+{
+    TAPState *s = DO_UPCAST(TAPState, nc, nc);
+    int ret;
+
+    if (s->enabled) {
+        return 0;
+    } else {
+        ret = tap_fd_enable(s->fd);
+        if (ret == 0) {
+            s->enabled = 1;
+            tap_update_fd_handler(s);
+        }
+        return ret;
+    }
+}
+
+int tap_disable(NetClientState *nc)
+{
+    TAPState *s = DO_UPCAST(TAPState, nc, nc);
+    int ret;
+
+    if (s->enabled == 0) {
+        return 0;
+    } else {
+        ret = tap_fd_disable(s->fd);
+        if (ret == 0) {
+            qemu_purge_queued_packets(nc);
+            s->enabled = 0;
+            tap_update_fd_handler(s);
+        }
+        return ret;
+    }
+}
-- 
1.7.1



* [PATCH V2 12/20] tap: introduce a helper to get the name of an interface
  2013-01-25 10:35 [PATCH V2 00/20] Multiqueue virtio-net Jason Wang
                   ` (10 preceding siblings ...)
  2013-01-25 10:35 ` [PATCH V2 11/20] tap: support enabling or disabling a queue Jason Wang
@ 2013-01-25 10:35 ` Jason Wang
  2013-01-25 10:35 ` [PATCH V2 13/20] tap: multiqueue support Jason Wang
                   ` (8 subsequent siblings)
  20 siblings, 0 replies; 41+ messages in thread
From: Jason Wang @ 2013-01-25 10:35 UTC (permalink / raw)
  To: mst, qemu-devel, aliguori, shajnocz
  Cc: krkumar2, kvm, mprivozn, rusty, jwhan, shiyer, gaowanlong, Jason Wang

This patch introduces a helper tap_get_ifname() to get the device name of a
tap device. This is needed when ifname is not specified on the command line
and qemu is asked to create the tap device by itself. In this situation, the
name is allocated by the kernel, so if multiqueue is requested, we need to
fetch the name after creating the first queue.

Only Linux has this support since it is the only platform that supports
multiqueue tap.
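
For readers unfamiliar with TUNGETIFF, a minimal Linux-only sketch of the
lookup follows. The function name sketch_tap_ifname() and its explicit length
argument are assumptions made for illustration; the helper in this patch has
a different signature.

```c
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/if.h>
#include <linux/if_tun.h>

/*
 * Copy the interface name bound to an already configured tap fd into name
 * (at most len bytes, NUL-terminated). Returns 0 on success, -1 on failure.
 */
static int sketch_tap_ifname(int fd, char *name, size_t len)
{
    struct ifreq ifr;

    if (len == 0) {
        return -1;
    }

    memset(&ifr, 0, sizeof(ifr));
    if (ioctl(fd, TUNGETIFF, &ifr) != 0) {
        return -1;
    }

    /* snprintf truncates safely if the caller's buffer is short. */
    snprintf(name, len, "%s", ifr.ifr_name);
    return 0;
}
```

Passing the destination size explicitly keeps the copy bounded by the
caller's buffer rather than by the size of ifr.ifr_name.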

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 include/net/tap.h |    1 +
 net/tap-aix.c     |    6 ++++++
 net/tap-bsd.c     |    4 ++++
 net/tap-haiku.c   |    4 ++++
 net/tap-linux.c   |   13 +++++++++++++
 net/tap-solaris.c |    4 ++++
 net/tap_int.h     |    1 +
 7 files changed, 33 insertions(+), 0 deletions(-)

diff --git a/include/net/tap.h b/include/net/tap.h
index 0caf8c4..c523ff0 100644
--- a/include/net/tap.h
+++ b/include/net/tap.h
@@ -37,6 +37,7 @@ void tap_set_offload(NetClientState *nc, int csum, int tso4, int tso6, int ecn,
 void tap_set_vnet_hdr_len(NetClientState *nc, int len);
 int tap_enable(NetClientState *nc);
 int tap_disable(NetClientState *nc);
+int tap_get_ifname(NetClientState *nc, char *ifname);
 
 int tap_get_fd(NetClientState *nc);
 
diff --git a/net/tap-aix.c b/net/tap-aix.c
index 66e0574..e760e9a 100644
--- a/net/tap-aix.c
+++ b/net/tap-aix.c
@@ -69,3 +69,9 @@ int tap_fd_disable(int fd)
 {
     return -1;
 }
+
+int tap_fd_get_ifname(int fd, char *ifname)
+{
+    return -1;
+}
+
diff --git a/net/tap-bsd.c b/net/tap-bsd.c
index cfc7a28..4f22109 100644
--- a/net/tap-bsd.c
+++ b/net/tap-bsd.c
@@ -156,3 +156,7 @@ int tap_fd_disable(int fd)
     return -1;
 }
 
+int tap_fd_get_ifname(int fd, char *ifname)
+{
+    return -1;
+}
diff --git a/net/tap-haiku.c b/net/tap-haiku.c
index 664d40f..b3b5fbb 100644
--- a/net/tap-haiku.c
+++ b/net/tap-haiku.c
@@ -70,3 +70,7 @@ int tap_fd_disable(int fd)
     return -1;
 }
 
+int tap_fd_get_ifname(int fd, char *ifname)
+{
+    return -1;
+}
diff --git a/net/tap-linux.c b/net/tap-linux.c
index 60ea8d0..6827c2a 100644
--- a/net/tap-linux.c
+++ b/net/tap-linux.c
@@ -261,3 +261,16 @@ int tap_fd_disable(int fd)
     return ret;
 }
 
+int tap_fd_get_ifname(int fd, char *ifname)
+{
+    struct ifreq ifr;
+
+    if (ioctl(fd, TUNGETIFF, &ifr) != 0) {
+        error_report("TUNGETIFF ioctl() failed: %s",
+                     strerror(errno));
+        return -1;
+    }
+
+    pstrcpy(ifname, sizeof(ifr.ifr_name), ifr.ifr_name);
+    return 0;
+}
diff --git a/net/tap-solaris.c b/net/tap-solaris.c
index 12cc392..214d95e 100644
--- a/net/tap-solaris.c
+++ b/net/tap-solaris.c
@@ -236,3 +236,7 @@ int tap_fd_disable(int fd)
     return -1;
 }
 
+int tap_fd_get_ifname(int fd, char *ifname)
+{
+    return -1;
+}
diff --git a/net/tap_int.h b/net/tap_int.h
index ca1c21b..125f83d 100644
--- a/net/tap_int.h
+++ b/net/tap_int.h
@@ -44,5 +44,6 @@ void tap_fd_set_offload(int fd, int csum, int tso4, int tso6, int ecn, int ufo);
 void tap_fd_set_vnet_hdr_len(int fd, int len);
 int tap_fd_enable(int fd);
 int tap_fd_disable(int fd);
+int tap_fd_get_ifname(int fd, char *ifname);
 
 #endif /* QEMU_TAP_H */
-- 
1.7.1



* [PATCH V2 13/20] tap: multiqueue support
  2013-01-25 10:35 [PATCH V2 00/20] Multiqueue virtio-net Jason Wang
                   ` (11 preceding siblings ...)
  2013-01-25 10:35 ` [PATCH V2 12/20] tap: introduce a helper to get the name of an interface Jason Wang
@ 2013-01-25 10:35 ` Jason Wang
  2013-01-25 10:35 ` [PATCH V2 14/20] vhost: " Jason Wang
                   ` (7 subsequent siblings)
  20 siblings, 0 replies; 41+ messages in thread
From: Jason Wang @ 2013-01-25 10:35 UTC (permalink / raw)
  To: mst, qemu-devel, aliguori, shajnocz
  Cc: krkumar2, kvm, mprivozn, rusty, jwhan, shiyer, gaowanlong, Jason Wang

Linux recently gained multiqueue tap support, which lets userspace call
TUNSETIFF for a single device many times to create multiple file descriptors
that act as independent queues. Userspace can also enable/disable a specific
queue through TUNSETQUEUE.
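
The kernel interface described above can be sketched as follows. This is a
Linux-only illustration that assumes a kernel with IFF_MULTI_QUEUE support
and sufficient privileges; the sketch_* function names are hypothetical and
not part of qemu.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/if.h>
#include <linux/if_tun.h>

/* Flags requested for every queue of a multiqueue tap device. */
short sketch_mq_tap_flags(void)
{
    return IFF_TAP | IFF_NO_PI | IFF_MULTI_QUEUE;
}

/*
 * Open one queue of the tap device called name. Calling this N times with
 * the same name yields N fds, one per queue. Returns the fd, or -1 on error.
 */
int sketch_open_tap_queue(const char *name)
{
    struct ifreq ifr;
    int fd = open("/dev/net/tun", O_RDWR);

    if (fd < 0) {
        return -1;
    }

    memset(&ifr, 0, sizeof(ifr));
    ifr.ifr_flags = sketch_mq_tap_flags();
    strncpy(ifr.ifr_name, name, IFNAMSIZ - 1);

    if (ioctl(fd, TUNSETIFF, &ifr) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Re-attach (attach != 0) or detach (attach == 0) a queue via TUNSETQUEUE. */
int sketch_set_queue(int fd, int attach)
{
    struct ifreq ifr;

    memset(&ifr, 0, sizeof(ifr));
    ifr.ifr_flags = attach ? IFF_ATTACH_QUEUE : IFF_DETACH_QUEUE;
    return ioctl(fd, TUNSETQUEUE, &ifr);
}
```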

This patch adds the generic infrastructure to create multiqueue taps. To
achieve this, a new parameter "queues" is introduced to specify how many
queues qemu itself should create for the tap. Alternatively, management can
pass multiple pre-created tap file descriptors separated by ':' through a new
parameter "fds", e.g. -netdev tap,id=hn0,fds="X:Y:..:Z". Multiple vhost file
descriptors can be passed in the same way through "vhostfds".
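
To make the ':'-separated format concrete, here is a small standalone
splitter. It is only a sketch: it uses plain libc allocation instead of the
glib helpers used in the patch, and sketch_split_fds() is a hypothetical
name.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Split a ':'-separated list such as "3:4:5" into at most max newly
 * allocated strings. Returns the number of entries stored in out[];
 * the caller is expected to free() each entry.
 */
static int sketch_split_fds(const char *str, char *out[], int max)
{
    const char *ptr = str;
    int n = 0;

    while (n < max && *ptr != '\0') {
        const char *sep = strchr(ptr, ':');
        size_t len = sep ? (size_t)(sep - ptr) : strlen(ptr);
        char *item = malloc(len + 1);

        if (item == NULL) {
            break;
        }
        memcpy(item, ptr, len);
        item[len] = '\0';
        out[n++] = item;

        if (sep == NULL) {
            break;          /* last element, no further separator */
        }
        ptr = sep + 1;      /* continue after the ':' */
    }

    return n;
}
```

Entries beyond max are silently ignored in this sketch, so a caller that
cares about overlong lists has to check for that separately.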

Each TAPState is still associated with one tap fd, which means multiple
TAPStates are created when the user needs a multiqueue tap. Since each
TAPState contains one NetClientState, with the multiqueue nic support, N
NetClientState peers are built up.

A new parameter, mq_required, is introduced in tap_open() to create
multiqueue tap fds.

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 include/net/tap.h |    1 -
 net/tap-aix.c     |    3 +-
 net/tap-bsd.c     |    3 +-
 net/tap-haiku.c   |    3 +-
 net/tap-linux.c   |    4 +-
 net/tap-solaris.c |    3 +-
 net/tap.c         |  158 +++++++++++++++++++++++++++++++++++++++++------------
 net/tap_int.h     |    3 +-
 qapi-schema.json  |    5 +-
 9 files changed, 139 insertions(+), 44 deletions(-)

diff --git a/include/net/tap.h b/include/net/tap.h
index c523ff0..0caf8c4 100644
--- a/include/net/tap.h
+++ b/include/net/tap.h
@@ -37,7 +37,6 @@ void tap_set_offload(NetClientState *nc, int csum, int tso4, int tso6, int ecn,
 void tap_set_vnet_hdr_len(NetClientState *nc, int len);
 int tap_enable(NetClientState *nc);
 int tap_disable(NetClientState *nc);
-int tap_get_ifname(NetClientState *nc, char *ifname);
 
 int tap_get_fd(NetClientState *nc);
 
diff --git a/net/tap-aix.c b/net/tap-aix.c
index e760e9a..804d164 100644
--- a/net/tap-aix.c
+++ b/net/tap-aix.c
@@ -25,7 +25,8 @@
 #include "tap_int.h"
 #include <stdio.h>
 
-int tap_open(char *ifname, int ifname_size, int *vnet_hdr, int vnet_hdr_required)
+int tap_open(char *ifname, int ifname_size, int *vnet_hdr,
+             int vnet_hdr_required, int mq_required)
 {
     fprintf(stderr, "no tap on AIX\n");
     return -1;
diff --git a/net/tap-bsd.c b/net/tap-bsd.c
index 4f22109..bcdb268 100644
--- a/net/tap-bsd.c
+++ b/net/tap-bsd.c
@@ -33,7 +33,8 @@
 #include <net/if_tap.h>
 #endif
 
-int tap_open(char *ifname, int ifname_size, int *vnet_hdr, int vnet_hdr_required)
+int tap_open(char *ifname, int ifname_size, int *vnet_hdr,
+             int vnet_hdr_required, int mq_required)
 {
     int fd;
 #ifdef TAPGIFNAME
diff --git a/net/tap-haiku.c b/net/tap-haiku.c
index b3b5fbb..e5ce436 100644
--- a/net/tap-haiku.c
+++ b/net/tap-haiku.c
@@ -25,7 +25,8 @@
 #include "tap_int.h"
 #include <stdio.h>
 
-int tap_open(char *ifname, int ifname_size, int *vnet_hdr, int vnet_hdr_required)
+int tap_open(char *ifname, int ifname_size, int *vnet_hdr,
+             int vnet_hdr_required, int mq_required)
 {
     fprintf(stderr, "no tap on Haiku\n");
     return -1;
diff --git a/net/tap-linux.c b/net/tap-linux.c
index 6827c2a..a1a6128 100644
--- a/net/tap-linux.c
+++ b/net/tap-linux.c
@@ -36,12 +36,12 @@
 
 #define PATH_NET_TUN "/dev/net/tun"
 
-int tap_open(char *ifname, int ifname_size, int *vnet_hdr, int vnet_hdr_required)
+int tap_open(char *ifname, int ifname_size, int *vnet_hdr,
+             int vnet_hdr_required, int mq_required)
 {
     struct ifreq ifr;
     int fd, ret;
     int len = sizeof(struct virtio_net_hdr);
-    int mq_required = 0;
 
     TFR(fd = open(PATH_NET_TUN, O_RDWR));
     if (fd < 0) {
diff --git a/net/tap-solaris.c b/net/tap-solaris.c
index 214d95e..9c7278f 100644
--- a/net/tap-solaris.c
+++ b/net/tap-solaris.c
@@ -173,7 +173,8 @@ static int tap_alloc(char *dev, size_t dev_size)
     return tap_fd;
 }
 
-int tap_open(char *ifname, int ifname_size, int *vnet_hdr, int vnet_hdr_required)
+int tap_open(char *ifname, int ifname_size, int *vnet_hdr,
+             int vnet_hdr_required, int mq_required)
 {
     char  dev[10]="";
     int fd;
diff --git a/net/tap.c b/net/tap.c
index 95e557b..072e166 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -560,17 +560,10 @@ int net_init_bridge(const NetClientOptions *opts, const char *name,
 
 static int net_tap_init(const NetdevTapOptions *tap, int *vnet_hdr,
                         const char *setup_script, char *ifname,
-                        size_t ifname_sz)
+                        size_t ifname_sz, int mq_required)
 {
     int fd, vnet_hdr_required;
 
-    if (tap->has_ifname) {
-        pstrcpy(ifname, ifname_sz, tap->ifname);
-    } else {
-        assert(ifname_sz > 0);
-        ifname[0] = '\0';
-    }
-
     if (tap->has_vnet_hdr) {
         *vnet_hdr = tap->vnet_hdr;
         vnet_hdr_required = *vnet_hdr;
@@ -579,7 +572,8 @@ static int net_tap_init(const NetdevTapOptions *tap, int *vnet_hdr,
         vnet_hdr_required = 0;
     }
 
-    TFR(fd = tap_open(ifname, ifname_sz, vnet_hdr, vnet_hdr_required));
+    TFR(fd = tap_open(ifname, ifname_sz, vnet_hdr, vnet_hdr_required,
+                      mq_required));
     if (fd < 0) {
         return -1;
     }
@@ -595,6 +589,8 @@ static int net_tap_init(const NetdevTapOptions *tap, int *vnet_hdr,
     return fd;
 }
 
+#define MAX_TAP_QUEUES 1024
+
 static int net_init_tap_one(const NetdevTapOptions *tap, NetClientState *peer,
                             const char *model, const char *name,
                             const char *ifname, const char *script,
@@ -613,17 +609,12 @@ static int net_init_tap_one(const NetdevTapOptions *tap, NetClientState *peer,
         return -1;
     }
 
-    if (tap->has_fd) {
+    if (tap->has_fd || tap->has_fds) {
         snprintf(s->nc.info_str, sizeof(s->nc.info_str), "fd=%d", fd);
     } else if (tap->has_helper) {
         snprintf(s->nc.info_str, sizeof(s->nc.info_str), "helper=%s",
                  tap->helper);
     } else {
-        const char *downscript;
-
-        downscript = tap->has_downscript ? tap->downscript :
-            DEFAULT_NETWORK_DOWN_SCRIPT;
-
         snprintf(s->nc.info_str, sizeof(s->nc.info_str),
                  "ifname=%s,script=%s,downscript=%s", ifname, script,
                  downscript);
@@ -654,7 +645,7 @@ static int net_init_tap_one(const NetdevTapOptions *tap, NetClientState *peer,
             error_report("vhost-net requested but could not be initialized");
             return -1;
         }
-    } else if (tap->has_vhostfd) {
+    } else if (tap->has_vhostfd || tap->has_vhostfds) {
         error_report("vhostfd= is not valid without vhost");
         return -1;
     }
@@ -662,27 +653,54 @@ static int net_init_tap_one(const NetdevTapOptions *tap, NetClientState *peer,
     return 0;
 }
 
+static int get_fds(char *str, char *fds[], int max)
+{
+    char *ptr = str, *this;
+    size_t len = strlen(str);
+    int i = 0;
+
+    while (i < max && ptr < str + len) {
+        this = strchr(ptr, ':');
+
+        if (this == NULL) {
+            fds[i] = g_strdup(ptr);
+        } else {
+            fds[i] = g_strndup(ptr, this - ptr);
+        }
+
+        i++;
+        if (this == NULL) {
+            break;
+        } else {
+            ptr = this + 1;
+        }
+    }
+
+    return i;
+}
+
 int net_init_tap(const NetClientOptions *opts, const char *name,
                  NetClientState *peer)
 {
     const NetdevTapOptions *tap;
-
-    int fd, vnet_hdr = 0;
-    const char *model;
-
+    int fd, vnet_hdr = 0, i = 0, queues;
     /* for the no-fd, no-helper case */
     const char *script = NULL; /* suppress wrong "uninit'd use" gcc warning */
     const char *downscript = NULL;
+    const char *vhostfdname;
     char ifname[128];
 
     assert(opts->kind == NET_CLIENT_OPTIONS_KIND_TAP);
     tap = opts->tap;
+    queues = tap->has_queues ? tap->queues : 1;
+    vhostfdname = tap->has_vhostfd ? tap->vhostfd : NULL;
 
     if (tap->has_fd) {
         if (tap->has_ifname || tap->has_script || tap->has_downscript ||
-            tap->has_vnet_hdr || tap->has_helper) {
+            tap->has_vnet_hdr || tap->has_helper || tap->has_queues ||
+            tap->has_fds) {
             error_report("ifname=, script=, downscript=, vnet_hdr=, "
-                         "and helper= are invalid with fd=");
+                         "helper=, queues=, and fds= are invalid with fd=");
             return -1;
         }
 
@@ -695,13 +713,61 @@ int net_init_tap(const NetClientOptions *opts, const char *name,
 
         vnet_hdr = tap_probe_vnet_hdr(fd);
 
-        model = "tap";
+        if (net_init_tap_one(tap, peer, "tap", NULL, NULL,
+                             script, downscript,
+                             vhostfdname, vnet_hdr, fd)) {
+            return -1;
+        }
+    } else if (tap->has_fds) {
+        char *fds[MAX_TAP_QUEUES];
+        char *vhost_fds[MAX_TAP_QUEUES];
+        int nfds, nvhosts;
+
+        if (tap->has_ifname || tap->has_script || tap->has_downscript ||
+            tap->has_vnet_hdr || tap->has_helper || tap->has_queues ||
+            tap->has_fd) {
+            error_report("ifname=, script=, downscript=, vnet_hdr=, "
+                         "helper=, queues=, and fd= are invalid with fds=");
+            return -1;
+        }
+
+        nfds = get_fds(tap->fds, fds, MAX_TAP_QUEUES);
+        if (tap->has_vhostfds) {
+            nvhosts = get_fds(tap->vhostfds, vhost_fds, MAX_TAP_QUEUES);
+            if (nfds != nvhosts) {
+                error_report("The number of fds passed does not match the "
+                             "number of vhostfds passed");
+                return -1;
+            }
+        }
+
+        for (i = 0; i < nfds; i++) {
+            fd = monitor_handle_fd_param(cur_mon, fds[i]);
+            if (fd == -1) {
+                return -1;
+            }
+
+            fcntl(fd, F_SETFL, O_NONBLOCK);
 
+            if (i == 0) {
+                vnet_hdr = tap_probe_vnet_hdr(fd);
+            } else if (vnet_hdr != tap_probe_vnet_hdr(fd)) {
+                error_report("vnet_hdr not consistent across given tap fds");
+                return -1;
+            }
+
+            if (net_init_tap_one(tap, peer, "tap", name, ifname,
+                                 script, downscript,
+                                 tap->has_vhostfds ? vhost_fds[i] : NULL,
+                                 vnet_hdr, fd)) {
+                return -1;
+            }
+        }
     } else if (tap->has_helper) {
         if (tap->has_ifname || tap->has_script || tap->has_downscript ||
-            tap->has_vnet_hdr) {
+            tap->has_vnet_hdr || tap->has_queues || tap->has_fds) {
             error_report("ifname=, script=, downscript=, and vnet_hdr= "
-                         "are invalid with helper=");
+                         "queues=, and fds= are invalid with helper=");
             return -1;
         }
 
@@ -711,26 +777,48 @@ int net_init_tap(const NetClientOptions *opts, const char *name,
         }
 
         fcntl(fd, F_SETFL, O_NONBLOCK);
-
         vnet_hdr = tap_probe_vnet_hdr(fd);
 
-        model = "bridge";
-
+        if (net_init_tap_one(tap, peer, "bridge", name, ifname,
+                             script, downscript, vhostfdname,
+                             vnet_hdr, fd)) {
+            return -1;
+        }
     } else {
         script = tap->has_script ? tap->script : DEFAULT_NETWORK_SCRIPT;
         downscript = tap->has_downscript ? tap->downscript :
             DEFAULT_NETWORK_DOWN_SCRIPT;
-        fd = net_tap_init(tap, &vnet_hdr, script, ifname, sizeof ifname);
-        if (fd == -1) {
-            return -1;
+
+        if (tap->has_ifname) {
+            pstrcpy(ifname, sizeof ifname, tap->ifname);
+        } else {
+            ifname[0] = '\0';
         }
 
-        model = "tap";
+        for (i = 0; i < queues; i++) {
+            fd = net_tap_init(tap, &vnet_hdr, i >= 1 ? "no" : script,
+                              ifname, sizeof ifname, queues > 1);
+            if (fd == -1) {
+                return -1;
+            }
+
+            if (queues > 1 && i == 0 && !tap->has_ifname) {
+                if (tap_fd_get_ifname(fd, ifname)) {
+                    error_report("Fail to get ifname");
+                    return -1;
+                }
+            }
+
+            if (net_init_tap_one(tap, peer, "tap", name, ifname,
+                                 i >= 1 ? "no" : script,
+                                 i >= 1 ? "no" : downscript,
+                                 vhostfdname, vnet_hdr, fd)) {
+                return -1;
+            }
+        }
     }
 
-    return net_init_tap_one(tap, peer, model, name, ifname, script,
-                            downscript, tap->has_vhostfd ? tap->vhostfd : NULL,
-                            vnet_hdr, fd);
+    return 0;
 }
 
 VHostNetState *tap_get_vhost_net(NetClientState *nc)
diff --git a/net/tap_int.h b/net/tap_int.h
index 125f83d..86bb224 100644
--- a/net/tap_int.h
+++ b/net/tap_int.h
@@ -32,7 +32,8 @@
 #define DEFAULT_NETWORK_SCRIPT "/etc/qemu-ifup"
 #define DEFAULT_NETWORK_DOWN_SCRIPT "/etc/qemu-ifdown"
 
-int tap_open(char *ifname, int ifname_size, int *vnet_hdr, int vnet_hdr_required);
+int tap_open(char *ifname, int ifname_size, int *vnet_hdr,
+             int vnet_hdr_required, int mq_required);
 
 ssize_t tap_read_packet(int tapfd, uint8_t *buf, int maxlen);
 
diff --git a/qapi-schema.json b/qapi-schema.json
index 6d7252b..4737800 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -2466,6 +2466,7 @@
   'data': {
     '*ifname':     'str',
     '*fd':         'str',
+    '*fds':        'str',
     '*script':     'str',
     '*downscript': 'str',
     '*helper':     'str',
@@ -2473,7 +2474,9 @@
     '*vnet_hdr':   'bool',
     '*vhost':      'bool',
     '*vhostfd':    'str',
-    '*vhostforce': 'bool' } }
+    '*vhostfds':   'str',
+    '*vhostforce': 'bool',
+    '*queues':     'uint32'} }
 
 ##
 # @NetdevSocketOptions
-- 
1.7.1



* [PATCH V2 14/20] vhost: multiqueue support
  2013-01-25 10:35 [PATCH V2 00/20] Multiqueue virtio-net Jason Wang
                   ` (12 preceding siblings ...)
  2013-01-25 10:35 ` [PATCH V2 13/20] tap: multiqueue support Jason Wang
@ 2013-01-25 10:35 ` Jason Wang
  2013-01-29 13:53   ` Jason Wang
  2013-01-25 10:35 ` [PATCH V2 15/20] virtio: introduce virtio_del_queue() Jason Wang
                   ` (6 subsequent siblings)
  20 siblings, 1 reply; 41+ messages in thread
From: Jason Wang @ 2013-01-25 10:35 UTC (permalink / raw)
  To: mst, qemu-devel, aliguori, shajnocz
  Cc: krkumar2, kvm, mprivozn, rusty, jwhan, shiyer, gaowanlong, Jason Wang

This patch lets vhost support multiqueue. The idea is simple: launch multiple
vhost threads and let each vhost thread process a subset of the virtqueues of
the device. After this change, each emulated device can have multiple vhost
threads as its backend.

To do this, a virtqueue index is introduced to record the first virtqueue that
will be handled by this vhost_net device. Based on this and nvqs, vhost can
calculate its relative index to set up the vhost_net device.
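
The index translation can be sketched in isolation as below. The struct is a
hypothetical stand-in that carries only the vq_index and nvqs fields, and the
example values in the usage note assume two virtqueues per vhost device.

```c
#include <assert.h>

/* Hypothetical stand-in for the two vhost_dev fields used here. */
struct vhost_dev_sketch {
    int vq_index;   /* first device virtqueue handled by this vhost device */
    int nvqs;       /* number of virtqueues it handles */
};

/* Map a device-wide virtqueue index to the index used by this vhost device. */
static int sketch_vhost_vq_index(const struct vhost_dev_sketch *dev, int idx)
{
    /* The virtqueue must belong to this vhost device. */
    assert(idx >= dev->vq_index && idx < dev->vq_index + dev->nvqs);
    return idx - dev->vq_index;
}
```

With vq_index = 2 and nvqs = 2, device virtqueues 2 and 3 map to relative
indexes 0 and 1.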

Since we may have many vhost/net devices for one virtio-net device, the
setting of guest notifiers is moved out of the starting/stopping of a specific
vhost thread. vhost_net_{start|stop}() is renamed to
vhost_net_{start|stop}_one(), and a new vhost_net_{start|stop}() is introduced
to configure the guest notifiers and start/stop all vhost/vhost_net devices.

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 hw/vhost.c      |   82 +++++++++++++++++++++---------------------------
 hw/vhost.h      |    2 +
 hw/vhost_net.c  |   92 ++++++++++++++++++++++++++++++++++++++++++++++++++-----
 hw/vhost_net.h  |    6 ++-
 hw/virtio-net.c |    4 +-
 5 files changed, 128 insertions(+), 58 deletions(-)

diff --git a/hw/vhost.c b/hw/vhost.c
index cee8aad..38257b9 100644
--- a/hw/vhost.c
+++ b/hw/vhost.c
@@ -619,14 +619,17 @@ static int vhost_virtqueue_start(struct vhost_dev *dev,
 {
     hwaddr s, l, a;
     int r;
+    int vhost_vq_index = idx - dev->vq_index;
     struct vhost_vring_file file = {
-        .index = idx,
+        .index = vhost_vq_index
     };
     struct vhost_vring_state state = {
-        .index = idx,
+        .index = vhost_vq_index
     };
     struct VirtQueue *vvq = virtio_get_queue(vdev, idx);
 
+    assert(idx >= dev->vq_index && idx < dev->vq_index + dev->nvqs);
+
     vq->num = state.num = virtio_queue_get_num(vdev, idx);
     r = ioctl(dev->control, VHOST_SET_VRING_NUM, &state);
     if (r) {
@@ -669,11 +672,12 @@ static int vhost_virtqueue_start(struct vhost_dev *dev,
         goto fail_alloc_ring;
     }
 
-    r = vhost_virtqueue_set_addr(dev, vq, idx, dev->log_enabled);
+    r = vhost_virtqueue_set_addr(dev, vq, vhost_vq_index, dev->log_enabled);
     if (r < 0) {
         r = -errno;
         goto fail_alloc;
     }
+
     file.fd = event_notifier_get_fd(virtio_queue_get_host_notifier(vvq));
     r = ioctl(dev->control, VHOST_SET_VRING_KICK, &file);
     if (r) {
@@ -709,9 +713,10 @@ static void vhost_virtqueue_stop(struct vhost_dev *dev,
                                     unsigned idx)
 {
     struct vhost_vring_state state = {
-        .index = idx,
+        .index = idx - dev->vq_index
     };
     int r;
+    assert(idx >= dev->vq_index && idx < dev->vq_index + dev->nvqs);
     r = ioctl(dev->control, VHOST_GET_VRING_BASE, &state);
     if (r < 0) {
         fprintf(stderr, "vhost VQ %d ring restore failed: %d\n", idx, r);
@@ -867,7 +872,9 @@ int vhost_dev_enable_notifiers(struct vhost_dev *hdev, VirtIODevice *vdev)
     }
 
     for (i = 0; i < hdev->nvqs; ++i) {
-        r = vdev->binding->set_host_notifier(vdev->binding_opaque, i, true);
+        r = vdev->binding->set_host_notifier(vdev->binding_opaque,
+                                             hdev->vq_index + i,
+                                             true);
         if (r < 0) {
             fprintf(stderr, "vhost VQ %d notifier binding failed: %d\n", i, -r);
             goto fail_vq;
@@ -877,7 +884,9 @@ int vhost_dev_enable_notifiers(struct vhost_dev *hdev, VirtIODevice *vdev)
     return 0;
 fail_vq:
     while (--i >= 0) {
-        r = vdev->binding->set_host_notifier(vdev->binding_opaque, i, false);
+        r = vdev->binding->set_host_notifier(vdev->binding_opaque,
+                                             hdev->vq_index + i,
+                                             false);
         if (r < 0) {
             fprintf(stderr, "vhost VQ %d notifier cleanup error: %d\n", i, -r);
             fflush(stderr);
@@ -898,7 +907,9 @@ void vhost_dev_disable_notifiers(struct vhost_dev *hdev, VirtIODevice *vdev)
     int i, r;
 
     for (i = 0; i < hdev->nvqs; ++i) {
-        r = vdev->binding->set_host_notifier(vdev->binding_opaque, i, false);
+        r = vdev->binding->set_host_notifier(vdev->binding_opaque,
+                                             hdev->vq_index + i,
+                                             false);
         if (r < 0) {
             fprintf(stderr, "vhost VQ %d notifier cleanup failed: %d\n", i, -r);
             fflush(stderr);
@@ -912,8 +923,9 @@ void vhost_dev_disable_notifiers(struct vhost_dev *hdev, VirtIODevice *vdev)
  */
 bool vhost_virtqueue_pending(struct vhost_dev *hdev, int n)
 {
-    struct vhost_virtqueue *vq = hdev->vqs + n;
+    struct vhost_virtqueue *vq = hdev->vqs + n - hdev->vq_index;
     assert(hdev->started);
+    assert(n >= hdev->vq_index && n < hdev->vq_index + hdev->nvqs);
     return event_notifier_test_and_clear(&vq->masked_notifier);
 }
 
@@ -922,15 +934,16 @@ void vhost_virtqueue_mask(struct vhost_dev *hdev, VirtIODevice *vdev, int n,
                          bool mask)
 {
     struct VirtQueue *vvq = virtio_get_queue(vdev, n);
-    int r;
+    int r, index = n - hdev->vq_index;
 
     assert(hdev->started);
+    assert(n >= hdev->vq_index && n < hdev->vq_index + hdev->nvqs);
 
     struct vhost_vring_file file = {
-        .index = n,
+        .index = index
     };
     if (mask) {
-        file.fd = event_notifier_get_fd(&hdev->vqs[n].masked_notifier);
+        file.fd = event_notifier_get_fd(&hdev->vqs[index].masked_notifier);
     } else {
         file.fd = event_notifier_get_fd(virtio_queue_get_guest_notifier(vvq));
     }
@@ -945,20 +958,6 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
 
     hdev->started = true;
 
-    if (!vdev->binding->set_guest_notifiers) {
-        fprintf(stderr, "binding does not support guest notifiers\n");
-        r = -ENOSYS;
-        goto fail;
-    }
-
-    r = vdev->binding->set_guest_notifiers(vdev->binding_opaque,
-                                           hdev->nvqs,
-                                           true);
-    if (r < 0) {
-        fprintf(stderr, "Error binding guest notifier: %d\n", -r);
-        goto fail_notifiers;
-    }
-
     r = vhost_dev_set_features(hdev, hdev->log_enabled);
     if (r < 0) {
         goto fail_features;
@@ -970,9 +969,9 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
     }
     for (i = 0; i < hdev->nvqs; ++i) {
         r = vhost_virtqueue_start(hdev,
-                                 vdev,
-                                 hdev->vqs + i,
-                                 i);
+                                  vdev,
+                                  hdev->vqs + i,
+                                  hdev->vq_index + i);
         if (r < 0) {
             goto fail_vq;
         }
@@ -995,15 +994,13 @@ fail_log:
 fail_vq:
     while (--i >= 0) {
         vhost_virtqueue_stop(hdev,
-                                vdev,
-                                hdev->vqs + i,
-                                i);
+                             vdev,
+                             hdev->vqs + i,
+                             hdev->vq_index + i);
     }
+    i = hdev->nvqs;
 fail_mem:
 fail_features:
-    vdev->binding->set_guest_notifiers(vdev->binding_opaque, hdev->nvqs, false);
-fail_notifiers:
-fail:
 
     hdev->started = false;
     return r;
@@ -1012,29 +1009,22 @@ fail:
 /* Host notifiers must be enabled at this point. */
 void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev)
 {
-    int i, r;
+    int i;
 
     for (i = 0; i < hdev->nvqs; ++i) {
         vhost_virtqueue_stop(hdev,
-                                vdev,
-                                hdev->vqs + i,
-                                i);
+                             vdev,
+                             hdev->vqs + i,
+                             hdev->vq_index + i);
     }
     for (i = 0; i < hdev->n_mem_sections; ++i) {
         vhost_sync_dirty_bitmap(hdev, &hdev->mem_sections[i],
                                 0, (hwaddr)~0x0ull);
     }
-    r = vdev->binding->set_guest_notifiers(vdev->binding_opaque,
-                                           hdev->nvqs,
-                                           false);
-    if (r < 0) {
-        fprintf(stderr, "vhost guest notifier cleanup failed: %d\n", r);
-        fflush(stderr);
-    }
-    assert (r >= 0);
 
     hdev->started = false;
     g_free(hdev->log);
     hdev->log = NULL;
     hdev->log_size = 0;
 }
+
diff --git a/hw/vhost.h b/hw/vhost.h
index 44c61a5..f062d48 100644
--- a/hw/vhost.h
+++ b/hw/vhost.h
@@ -35,6 +35,8 @@ struct vhost_dev {
     MemoryRegionSection *mem_sections;
     struct vhost_virtqueue *vqs;
     int nvqs;
+    /* the first virtqueue which would be used by this vhost dev */
+    int vq_index;
     unsigned long long features;
     unsigned long long acked_features;
     unsigned long long backend_features;
diff --git a/hw/vhost_net.c b/hw/vhost_net.c
index d3a04ca..c955611 100644
--- a/hw/vhost_net.c
+++ b/hw/vhost_net.c
@@ -140,12 +140,21 @@ bool vhost_net_query(VHostNetState *net, VirtIODevice *dev)
     return vhost_dev_query(&net->dev, dev);
 }
 
-int vhost_net_start(struct vhost_net *net,
-                    VirtIODevice *dev)
+static int vhost_net_start_one(struct vhost_net *net,
+                               VirtIODevice *dev,
+                               int vq_index)
 {
     struct vhost_vring_file file = { };
     int r;
 
+    if (net->dev.started) {
+        return 0;
+    }
+
+    net->dev.nvqs = 2;
+    net->dev.vqs = net->vqs;
+    net->dev.vq_index = vq_index;
+
     r = vhost_dev_enable_notifiers(&net->dev, dev);
     if (r < 0) {
         goto fail_notifiers;
@@ -181,11 +190,15 @@ fail_notifiers:
     return r;
 }
 
-void vhost_net_stop(struct vhost_net *net,
-                    VirtIODevice *dev)
+static void vhost_net_stop_one(struct vhost_net *net,
+                               VirtIODevice *dev)
 {
     struct vhost_vring_file file = { .fd = -1 };
 
+    if (!net->dev.started) {
+        return;
+    }
+
     for (file.index = 0; file.index < net->dev.nvqs; ++file.index) {
         int r = ioctl(net->dev.control, VHOST_NET_SET_BACKEND, &file);
         assert(r >= 0);
@@ -195,6 +208,65 @@ void vhost_net_stop(struct vhost_net *net,
     vhost_dev_disable_notifiers(&net->dev, dev);
 }
 
+int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
+                    int start_queues, int total_queues)
+{
+    int r, i = 0;
+
+    if (!dev->binding->set_guest_notifiers) {
+        error_report("binding does not support guest notifiers");
+        r = -ENOSYS;
+        goto err;
+    }
+
+    for (i = start_queues; i < total_queues; i++) {
+        vhost_net_stop_one(tap_get_vhost_net(ncs[i].peer), dev);
+    }
+
+    for (i = 0; i < start_queues; i++) {
+        r = vhost_net_start_one(tap_get_vhost_net(ncs[i].peer), dev, i * 2);
+
+        if (r < 0) {
+            goto err;
+        }
+    }
+
+    r = dev->binding->set_guest_notifiers(dev->binding_opaque,
+                                          start_queues * 2,
+                                          true);
+    if (r < 0) {
+        error_report("Error binding guest notifier: %d", -r);
+        goto err;
+    }
+
+    return 0;
+
+err:
+    while (--i >= 0) {
+        vhost_net_stop_one(tap_get_vhost_net(ncs[i].peer), dev);
+    }
+    return r;
+}
+
+void vhost_net_stop(VirtIODevice *dev, NetClientState *ncs,
+                    int start_queues, int total_queues)
+{
+    int i, r;
+
+    r = dev->binding->set_guest_notifiers(dev->binding_opaque,
+                                          start_queues * 2,
+                                          false);
+    if (r < 0) {
+        fprintf(stderr, "vhost guest notifier cleanup failed: %d\n", r);
+        fflush(stderr);
+    }
+    assert(r >= 0);
+
+    for (i = 0; i < total_queues; i++) {
+        vhost_net_stop_one(tap_get_vhost_net(ncs[i].peer), dev);
+    }
+}
+
 void vhost_net_cleanup(struct vhost_net *net)
 {
     vhost_dev_cleanup(&net->dev);
@@ -224,13 +296,17 @@ bool vhost_net_query(VHostNetState *net, VirtIODevice *dev)
     return false;
 }
 
-int vhost_net_start(struct vhost_net *net,
-		    VirtIODevice *dev)
+int vhost_net_start(VirtIODevice *dev,
+                    NetClientState *ncs,
+                    int start_queues,
+                    int total_queues)
 {
     return -ENOSYS;
 }
-void vhost_net_stop(struct vhost_net *net,
-		    VirtIODevice *dev)
+void vhost_net_stop(VirtIODevice *dev,
+                    NetClientState *ncs,
+                    int start_queues,
+                    int total_queues)
 {
 }
 
diff --git a/hw/vhost_net.h b/hw/vhost_net.h
index 88912b8..9fbd79d 100644
--- a/hw/vhost_net.h
+++ b/hw/vhost_net.h
@@ -9,8 +9,10 @@ typedef struct vhost_net VHostNetState;
 VHostNetState *vhost_net_init(NetClientState *backend, int devfd, bool force);
 
 bool vhost_net_query(VHostNetState *net, VirtIODevice *dev);
-int vhost_net_start(VHostNetState *net, VirtIODevice *dev);
-void vhost_net_stop(VHostNetState *net, VirtIODevice *dev);
+int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
+                    int start_queues, int total_queues);
+void vhost_net_stop(VirtIODevice *dev, NetClientState *ncs,
+                    int start_queues, int total_queues);
 
 void vhost_net_cleanup(VHostNetState *net);
 
diff --git a/hw/virtio-net.c b/hw/virtio-net.c
index 47f4ab4..2f49fd8 100644
--- a/hw/virtio-net.c
+++ b/hw/virtio-net.c
@@ -129,14 +129,14 @@ static void virtio_net_vhost_status(VirtIONet *n, uint8_t status)
             return;
         }
         n->vhost_started = 1;
-        r = vhost_net_start(tap_get_vhost_net(nc->peer), &n->vdev);
+        r = vhost_net_start(&n->vdev, nc, 1, 1);
         if (r < 0) {
             error_report("unable to start vhost net: %d: "
                          "falling back on userspace virtio", -r);
             n->vhost_started = 0;
         }
     } else {
-        vhost_net_stop(tap_get_vhost_net(nc->peer), &n->vdev);
+        vhost_net_stop(&n->vdev, nc, 1, 1);
         n->vhost_started = 0;
     }
 }
-- 
1.7.1
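
The index translation that the vhost changes above rely on can be sketched in isolation. This is a minimal model, not the QEMU code: `toy_vhost_dev`, `toy_setup_pair()` and `toy_local_index()` are hypothetical names, and the only behaviour taken from the patch is that each vhost device owns two virtqueues starting at `vq_index = pair * 2` and that a device-wide index `n` maps to the local index `n - vq_index`.

```c
#include <assert.h>

/* Hypothetical stand-in for the two fields of struct vhost_dev used here. */
struct toy_vhost_dev {
    int nvqs;      /* number of virtqueues owned by this vhost device */
    int vq_index;  /* device-wide index of the first owned virtqueue */
};

/* Configure the vhost device that serves queue pair `pair`:
 * it owns two virtqueues, starting at device-wide index pair * 2. */
static void toy_setup_pair(struct toy_vhost_dev *hdev, int pair)
{
    hdev->nvqs = 2;
    hdev->vq_index = pair * 2;
}

/* Translate a device-wide virtqueue index into an index into the
 * vhost device's own array, with the same range check as the patch. */
static int toy_local_index(const struct toy_vhost_dev *hdev, int n)
{
    assert(n >= hdev->vq_index && n < hdev->vq_index + hdev->nvqs);
    return n - hdev->vq_index;
}
```

Under these assumptions, the vhost device for pair 1 owns device-wide virtqueues 2 and 3, which it addresses locally as 0 and 1.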



* [PATCH V2 15/20] virtio: introduce virtio_del_queue()
  2013-01-25 10:35 [PATCH V2 00/20] Multiqueue virtio-net Jason Wang
                   ` (13 preceding siblings ...)
  2013-01-25 10:35 ` [PATCH V2 14/20] vhost: " Jason Wang
@ 2013-01-25 10:35 ` Jason Wang
  2013-01-25 10:35 ` [PATCH V2 16/20] virtio: add a queue_index to VirtQueue Jason Wang
                   ` (5 subsequent siblings)
  20 siblings, 0 replies; 41+ messages in thread
From: Jason Wang @ 2013-01-25 10:35 UTC (permalink / raw)
  To: mst, qemu-devel, aliguori, shajnocz
  Cc: krkumar2, kvm, mprivozn, rusty, jwhan, shiyer, gaowanlong, Jason Wang

Some devices (such as virtio-net) need the ability to destroy or re-order their
virtqueues; this patch adds a helper to do this.
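
A minimal sketch of the idea follows, on a toy model rather than the real VirtIODevice. It assumes that a slot whose ring size is zero counts as unused and that adding a queue claims the first unused slot; the `toy_*` names and `TOY_QUEUE_MAX` are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_QUEUE_MAX 8   /* hypothetical stand-in for VIRTIO_PCI_QUEUE_MAX */

struct toy_vq { int num; };                       /* num == 0: slot unused */
struct toy_dev { struct toy_vq vq[TOY_QUEUE_MAX]; };

/* Claim the first unused slot and return its index, or -1 if none is free. */
static int toy_add_queue(struct toy_dev *dev, int queue_size)
{
    int i;

    assert(queue_size > 0);
    for (i = 0; i < TOY_QUEUE_MAX; i++) {
        if (dev->vq[i].num == 0) {
            dev->vq[i].num = queue_size;
            return i;
        }
    }
    return -1;
}

/* Same shape as the helper in this patch: check bounds, then mark the
 * slot unused by zeroing its ring size. */
static void toy_del_queue(struct toy_dev *dev, int n)
{
    if (n < 0 || n >= TOY_QUEUE_MAX) {
        abort();
    }
    dev->vq[n].num = 0;
}
```

In this model, deleting slot 1 and then adding a queue reuses slot 1, which is how a device could re-order its queues.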

Signed-off-by: Jason Wang <jasowang>
---
 hw/virtio.c |    9 +++++++++
 hw/virtio.h |    2 ++
 2 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/hw/virtio.c b/hw/virtio.c
index ca170c3..d8c77b0 100644
--- a/hw/virtio.c
+++ b/hw/virtio.c
@@ -701,6 +701,15 @@ VirtQueue *virtio_add_queue(VirtIODevice *vdev, int queue_size,
     return &vdev->vq[i];
 }
 
+void virtio_del_queue(VirtIODevice *vdev, int n)
+{
+    if (n < 0 || n >= VIRTIO_PCI_QUEUE_MAX) {
+        abort();
+    }
+
+    vdev->vq[n].vring.num = 0;
+}
+
 void virtio_irq(VirtQueue *vq)
 {
     trace_virtio_irq(vq);
diff --git a/hw/virtio.h b/hw/virtio.h
index 9cc7b85..d3da1d2 100644
--- a/hw/virtio.h
+++ b/hw/virtio.h
@@ -181,6 +181,8 @@ VirtQueue *virtio_add_queue(VirtIODevice *vdev, int queue_size,
                             void (*handle_output)(VirtIODevice *,
                                                   VirtQueue *));
 
+void virtio_del_queue(VirtIODevice *vdev, int n);
+
 void virtqueue_push(VirtQueue *vq, const VirtQueueElement *elem,
                     unsigned int len);
 void virtqueue_flush(VirtQueue *vq, unsigned int count);
-- 
1.7.1



* [PATCH V2 16/20] virtio: add a queue_index to VirtQueue
  2013-01-25 10:35 [PATCH V2 00/20] Multiqueue virtio-net Jason Wang
                   ` (14 preceding siblings ...)
  2013-01-25 10:35 ` [PATCH V2 15/20] virtio: introduce virtio_del_queue() Jason Wang
@ 2013-01-25 10:35 ` Jason Wang
  2013-01-25 10:35 ` [PATCH V2 17/20] virtio-net: separate virtqueue from VirtIONet Jason Wang
                   ` (4 subsequent siblings)
  20 siblings, 0 replies; 41+ messages in thread
From: Jason Wang @ 2013-01-25 10:35 UTC (permalink / raw)
  To: mst, qemu-devel, aliguori, shajnocz
  Cc: krkumar2, kvm, mprivozn, rusty, jwhan, shiyer, gaowanlong, Jason Wang

Add a queue_index to VirtQueue and a helper to fetch it; this can be used by
devices that support multiqueue.
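
As a rough illustration (a toy model with hypothetical `toy_*` names, not the QEMU structures): once each queue records its own position at init time, a handler that only receives a queue pointer can recover the index without knowing the array base.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_QUEUE_MAX 8   /* hypothetical stand-in for VIRTIO_PCI_QUEUE_MAX */

struct toy_vq { uint16_t queue_index; };
struct toy_dev { struct toy_vq vq[TOY_QUEUE_MAX]; };

/* Record each queue's position once, as virtio_init() does in this patch. */
static void toy_init(struct toy_dev *dev)
{
    int i;

    for (i = 0; i < TOY_QUEUE_MAX; i++) {
        dev->vq[i].queue_index = (uint16_t)i;
    }
}

/* Accessor in the style of virtio_get_queue_index(). */
static uint16_t toy_get_queue_index(const struct toy_vq *vq)
{
    assert(vq != NULL);
    return vq->queue_index;
}
```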

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 hw/virtio.c |    8 ++++++++
 hw/virtio.h |    1 +
 2 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/hw/virtio.c b/hw/virtio.c
index d8c77b0..e259348 100644
--- a/hw/virtio.c
+++ b/hw/virtio.c
@@ -73,6 +73,8 @@ struct VirtQueue
     /* Notification enabled? */
     bool notification;
 
+    uint16_t queue_index;
+
     int inuse;
 
     uint16_t vector;
@@ -931,6 +933,7 @@ void virtio_init(VirtIODevice *vdev, const char *name,
     for (i = 0; i < VIRTIO_PCI_QUEUE_MAX; i++) {
         vdev->vq[i].vector = VIRTIO_NO_VECTOR;
         vdev->vq[i].vdev = vdev;
+        vdev->vq[i].queue_index = i;
     }
 
     vdev->name = name;
@@ -1018,6 +1021,11 @@ VirtQueue *virtio_get_queue(VirtIODevice *vdev, int n)
     return vdev->vq + n;
 }
 
+uint16_t virtio_get_queue_index(VirtQueue *vq)
+{
+    return vq->queue_index;
+}
+
 static void virtio_queue_guest_notifier_read(EventNotifier *n)
 {
     VirtQueue *vq = container_of(n, VirtQueue, guest_notifier);
diff --git a/hw/virtio.h b/hw/virtio.h
index d3da1d2..a29a54d 100644
--- a/hw/virtio.h
+++ b/hw/virtio.h
@@ -280,6 +280,7 @@ hwaddr virtio_queue_get_ring_size(VirtIODevice *vdev, int n);
 uint16_t virtio_queue_get_last_avail_idx(VirtIODevice *vdev, int n);
 void virtio_queue_set_last_avail_idx(VirtIODevice *vdev, int n, uint16_t idx);
 VirtQueue *virtio_get_queue(VirtIODevice *vdev, int n);
+uint16_t virtio_get_queue_index(VirtQueue *vq);
 int virtio_queue_get_id(VirtQueue *vq);
 EventNotifier *virtio_queue_get_guest_notifier(VirtQueue *vq);
 void virtio_queue_set_guest_notifier_fd_handler(VirtQueue *vq, bool assign,
-- 
1.7.1



* [PATCH V2 17/20] virtio-net: separate virtqueue from VirtIONet
  2013-01-25 10:35 [PATCH V2 00/20] Multiqueue virtio-net Jason Wang
                   ` (15 preceding siblings ...)
  2013-01-25 10:35 ` [PATCH V2 16/20] virtio: add a queue_index to VirtQueue Jason Wang
@ 2013-01-25 10:35 ` Jason Wang
  2013-01-25 10:35 ` [PATCH V2 18/20] virtio-net: multiqueue support Jason Wang
                   ` (3 subsequent siblings)
  20 siblings, 0 replies; 41+ messages in thread
From: Jason Wang @ 2013-01-25 10:35 UTC (permalink / raw)
  To: mst, qemu-devel, aliguori, shajnocz
  Cc: krkumar2, kvm, mprivozn, rusty, jwhan, shiyer, gaowanlong, Jason Wang

To support multiqueue virtio-net, the first step is to move the
virtqueue-related fields from VirtIONet into a new structure, VirtIONetQueue.
The following patches build on this one to add an array of VirtIONetQueue to
VirtIONet.
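
The shape of the refactoring can be sketched on a reduced model. The `toy_*` types below are hypothetical and keep only two fields; the point is that per-queue state moves into its own structure with a back-pointer to the device, so timer and bottom-half callbacks can take the queue as their opaque argument and still reach device-wide settings.

```c
#include <assert.h>
#include <stddef.h>

struct toy_net;

/* Per-queue state, with a back-pointer to the owning device. */
struct toy_net_queue {
    int tx_waiting;
    struct toy_net *n;
};

/* Device-wide state; a single embedded queue, as at this step of the series. */
struct toy_net {
    int tx_burst;
    struct toy_net_queue vq;
};

static void toy_net_init(struct toy_net *n, int tx_burst)
{
    n->tx_burst = tx_burst;
    n->vq.tx_waiting = 0;
    n->vq.n = n;              /* wire up the back-pointer */
}

/* Callback in the style of a timer/BH handler: the opaque pointer is the
 * queue, and the device is recovered through q->n. Returns the burst
 * limit only so that the lookup is observable. */
static int toy_tx_callback(void *opaque)
{
    struct toy_net_queue *q = opaque;
    struct toy_net *n = q->n;

    assert(n != NULL);
    q->tx_waiting = 0;
    return n->tx_burst;
}
```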

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 hw/virtio-net.c |  195 ++++++++++++++++++++++++++++++++-----------------------
 1 files changed, 114 insertions(+), 81 deletions(-)

diff --git a/hw/virtio-net.c b/hw/virtio-net.c
index 2f49fd8..ef522d5 100644
--- a/hw/virtio-net.c
+++ b/hw/virtio-net.c
@@ -26,28 +26,33 @@
 #define MAC_TABLE_ENTRIES    64
 #define MAX_VLAN    (1 << 12)   /* Per 802.1Q definition */
 
+typedef struct VirtIONetQueue {
+    VirtQueue *rx_vq;
+    VirtQueue *tx_vq;
+    QEMUTimer *tx_timer;
+    QEMUBH *tx_bh;
+    int tx_waiting;
+    struct {
+        VirtQueueElement elem;
+        ssize_t len;
+    } async_tx;
+    struct VirtIONet *n;
+} VirtIONetQueue;
+
 typedef struct VirtIONet
 {
     VirtIODevice vdev;
     uint8_t mac[ETH_ALEN];
     uint16_t status;
-    VirtQueue *rx_vq;
-    VirtQueue *tx_vq;
+    VirtIONetQueue vq;
     VirtQueue *ctrl_vq;
     NICState *nic;
-    QEMUTimer *tx_timer;
-    QEMUBH *tx_bh;
     uint32_t tx_timeout;
     int32_t tx_burst;
-    int tx_waiting;
     uint32_t has_vnet_hdr;
     size_t host_hdr_len;
     size_t guest_hdr_len;
     uint8_t has_ufo;
-    struct {
-        VirtQueueElement elem;
-        ssize_t len;
-    } async_tx;
     int mergeable_rx_bufs;
     uint8_t promisc;
     uint8_t allmulti;
@@ -67,6 +72,12 @@ typedef struct VirtIONet
     DeviceState *qdev;
 } VirtIONet;
 
+static VirtIONetQueue *virtio_net_get_queue(NetClientState *nc)
+{
+    VirtIONet *n = qemu_get_nic_opaque(nc);
+
+    return &n->vq;
+}
 /* TODO
  * - we could suppress RX interrupt if we were so inclined.
  */
@@ -134,6 +145,8 @@ static void virtio_net_vhost_status(VirtIONet *n, uint8_t status)
             error_report("unable to start vhost net: %d: "
                          "falling back on userspace virtio", -r);
             n->vhost_started = 0;
+        } else {
+            n->vhost_started = 1;
         }
     } else {
         vhost_net_stop(&n->vdev, nc, 1, 1);
@@ -144,25 +157,26 @@ static void virtio_net_vhost_status(VirtIONet *n, uint8_t status)
 static void virtio_net_set_status(struct VirtIODevice *vdev, uint8_t status)
 {
     VirtIONet *n = to_virtio_net(vdev);
+    VirtIONetQueue *q = &n->vq;
 
     virtio_net_vhost_status(n, status);
 
-    if (!n->tx_waiting) {
+    if (!q->tx_waiting) {
         return;
     }
 
     if (virtio_net_started(n, status) && !n->vhost_started) {
-        if (n->tx_timer) {
-            qemu_mod_timer(n->tx_timer,
+        if (q->tx_timer) {
+            qemu_mod_timer(q->tx_timer,
                            qemu_get_clock_ns(vm_clock) + n->tx_timeout);
         } else {
-            qemu_bh_schedule(n->tx_bh);
+            qemu_bh_schedule(q->tx_bh);
         }
     } else {
-        if (n->tx_timer) {
-            qemu_del_timer(n->tx_timer);
+        if (q->tx_timer) {
+            qemu_del_timer(q->tx_timer);
         } else {
-            qemu_bh_cancel(n->tx_bh);
+            qemu_bh_cancel(q->tx_bh);
         }
     }
 }
@@ -474,35 +488,40 @@ static void virtio_net_handle_rx(VirtIODevice *vdev, VirtQueue *vq)
 static int virtio_net_can_receive(NetClientState *nc)
 {
     VirtIONet *n = qemu_get_nic_opaque(nc);
+    VirtIONetQueue *q = virtio_net_get_queue(nc);
+
     if (!n->vdev.vm_running) {
         return 0;
     }
 
-    if (!virtio_queue_ready(n->rx_vq) ||
-        !(n->vdev.status & VIRTIO_CONFIG_S_DRIVER_OK))
+    if (!virtio_queue_ready(q->rx_vq) ||
+        !(n->vdev.status & VIRTIO_CONFIG_S_DRIVER_OK)) {
         return 0;
+    }
 
     return 1;
 }
 
-static int virtio_net_has_buffers(VirtIONet *n, int bufsize)
+static int virtio_net_has_buffers(VirtIONetQueue *q, int bufsize)
 {
-    if (virtio_queue_empty(n->rx_vq) ||
+    VirtIONet *n = q->n;
+    if (virtio_queue_empty(q->rx_vq) ||
         (n->mergeable_rx_bufs &&
-         !virtqueue_avail_bytes(n->rx_vq, bufsize, 0))) {
-        virtio_queue_set_notification(n->rx_vq, 1);
+         !virtqueue_avail_bytes(q->rx_vq, bufsize, 0))) {
+        virtio_queue_set_notification(q->rx_vq, 1);
 
         /* To avoid a race condition where the guest has made some buffers
          * available after the above check but before notification was
          * enabled, check for available buffers again.
          */
-        if (virtio_queue_empty(n->rx_vq) ||
+        if (virtio_queue_empty(q->rx_vq) ||
             (n->mergeable_rx_bufs &&
-             !virtqueue_avail_bytes(n->rx_vq, bufsize, 0)))
+             !virtqueue_avail_bytes(q->rx_vq, bufsize, 0))) {
             return 0;
+        }
     }
 
-    virtio_queue_set_notification(n->rx_vq, 0);
+    virtio_queue_set_notification(q->rx_vq, 0);
     return 1;
 }
 
@@ -605,6 +624,7 @@ static int receive_filter(VirtIONet *n, const uint8_t *buf, int size)
 static ssize_t virtio_net_receive(NetClientState *nc, const uint8_t *buf, size_t size)
 {
     VirtIONet *n = qemu_get_nic_opaque(nc);
+    VirtIONetQueue *q = virtio_net_get_queue(nc);
     struct iovec mhdr_sg[VIRTQUEUE_MAX_SIZE];
     struct virtio_net_hdr_mrg_rxbuf mhdr;
     unsigned mhdr_cnt = 0;
@@ -615,8 +635,9 @@ static ssize_t virtio_net_receive(NetClientState *nc, const uint8_t *buf, size_t
     }
 
     /* hdr_len refers to the header we supply to the guest */
-    if (!virtio_net_has_buffers(n, size + n->guest_hdr_len - n->host_hdr_len))
+    if (!virtio_net_has_buffers(q, size + n->guest_hdr_len - n->host_hdr_len)) {
         return 0;
+    }
 
     if (!receive_filter(n, buf, size))
         return size;
@@ -630,7 +651,7 @@ static ssize_t virtio_net_receive(NetClientState *nc, const uint8_t *buf, size_t
 
         total = 0;
 
-        if (virtqueue_pop(n->rx_vq, &elem) == 0) {
+        if (virtqueue_pop(q->rx_vq, &elem) == 0) {
             if (i == 0)
                 return -1;
             error_report("virtio-net unexpected empty queue: "
@@ -683,7 +704,7 @@ static ssize_t virtio_net_receive(NetClientState *nc, const uint8_t *buf, size_t
         }
 
         /* signal other side */
-        virtqueue_fill(n->rx_vq, &elem, total, i++);
+        virtqueue_fill(q->rx_vq, &elem, total, i++);
     }
 
     if (mhdr_cnt) {
@@ -693,30 +714,32 @@ static ssize_t virtio_net_receive(NetClientState *nc, const uint8_t *buf, size_t
                      &mhdr.num_buffers, sizeof mhdr.num_buffers);
     }
 
-    virtqueue_flush(n->rx_vq, i);
-    virtio_notify(&n->vdev, n->rx_vq);
+    virtqueue_flush(q->rx_vq, i);
+    virtio_notify(&n->vdev, q->rx_vq);
 
     return size;
 }
 
-static int32_t virtio_net_flush_tx(VirtIONet *n, VirtQueue *vq);
+static int32_t virtio_net_flush_tx(VirtIONetQueue *q);
 
 static void virtio_net_tx_complete(NetClientState *nc, ssize_t len)
 {
     VirtIONet *n = qemu_get_nic_opaque(nc);
+    VirtIONetQueue *q = virtio_net_get_queue(nc);
 
-    virtqueue_push(n->tx_vq, &n->async_tx.elem, 0);
-    virtio_notify(&n->vdev, n->tx_vq);
+    virtqueue_push(q->tx_vq, &q->async_tx.elem, 0);
+    virtio_notify(&n->vdev, q->tx_vq);
 
-    n->async_tx.elem.out_num = n->async_tx.len = 0;
+    q->async_tx.elem.out_num = q->async_tx.len = 0;
 
-    virtio_queue_set_notification(n->tx_vq, 1);
-    virtio_net_flush_tx(n, n->tx_vq);
+    virtio_queue_set_notification(q->tx_vq, 1);
+    virtio_net_flush_tx(q);
 }
 
 /* TX */
-static int32_t virtio_net_flush_tx(VirtIONet *n, VirtQueue *vq)
+static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
 {
+    VirtIONet *n = q->n;
     VirtQueueElement elem;
     int32_t num_packets = 0;
     if (!(n->vdev.status & VIRTIO_CONFIG_S_DRIVER_OK)) {
@@ -725,12 +748,12 @@ static int32_t virtio_net_flush_tx(VirtIONet *n, VirtQueue *vq)
 
     assert(n->vdev.vm_running);
 
-    if (n->async_tx.elem.out_num) {
-        virtio_queue_set_notification(n->tx_vq, 0);
+    if (q->async_tx.elem.out_num) {
+        virtio_queue_set_notification(q->tx_vq, 0);
         return num_packets;
     }
 
-    while (virtqueue_pop(vq, &elem)) {
+    while (virtqueue_pop(q->tx_vq, &elem)) {
         ssize_t ret, len;
         unsigned int out_num = elem.out_num;
         struct iovec *out_sg = &elem.out_sg[0];
@@ -763,16 +786,16 @@ static int32_t virtio_net_flush_tx(VirtIONet *n, VirtQueue *vq)
         ret = qemu_sendv_packet_async(qemu_get_queue(n->nic), out_sg, out_num,
                                       virtio_net_tx_complete);
         if (ret == 0) {
-            virtio_queue_set_notification(n->tx_vq, 0);
-            n->async_tx.elem = elem;
-            n->async_tx.len  = len;
+            virtio_queue_set_notification(q->tx_vq, 0);
+            q->async_tx.elem = elem;
+            q->async_tx.len  = len;
             return -EBUSY;
         }
 
         len += ret;
 
-        virtqueue_push(vq, &elem, 0);
-        virtio_notify(&n->vdev, vq);
+        virtqueue_push(q->tx_vq, &elem, 0);
+        virtio_notify(&n->vdev, q->tx_vq);
 
         if (++num_packets >= n->tx_burst) {
             break;
@@ -784,22 +807,23 @@ static int32_t virtio_net_flush_tx(VirtIONet *n, VirtQueue *vq)
 static void virtio_net_handle_tx_timer(VirtIODevice *vdev, VirtQueue *vq)
 {
     VirtIONet *n = to_virtio_net(vdev);
+    VirtIONetQueue *q = &n->vq;
 
     /* This happens when device was stopped but VCPU wasn't. */
     if (!n->vdev.vm_running) {
-        n->tx_waiting = 1;
+        q->tx_waiting = 1;
         return;
     }
 
-    if (n->tx_waiting) {
+    if (q->tx_waiting) {
         virtio_queue_set_notification(vq, 1);
-        qemu_del_timer(n->tx_timer);
-        n->tx_waiting = 0;
-        virtio_net_flush_tx(n, vq);
+        qemu_del_timer(q->tx_timer);
+        q->tx_waiting = 0;
+        virtio_net_flush_tx(q);
     } else {
-        qemu_mod_timer(n->tx_timer,
+        qemu_mod_timer(q->tx_timer,
                        qemu_get_clock_ns(vm_clock) + n->tx_timeout);
-        n->tx_waiting = 1;
+        q->tx_waiting = 1;
         virtio_queue_set_notification(vq, 0);
     }
 }
@@ -807,48 +831,51 @@ static void virtio_net_handle_tx_timer(VirtIODevice *vdev, VirtQueue *vq)
 static void virtio_net_handle_tx_bh(VirtIODevice *vdev, VirtQueue *vq)
 {
     VirtIONet *n = to_virtio_net(vdev);
+    VirtIONetQueue *q = &n->vq;
 
-    if (unlikely(n->tx_waiting)) {
+    if (unlikely(q->tx_waiting)) {
         return;
     }
-    n->tx_waiting = 1;
+    q->tx_waiting = 1;
     /* This happens when device was stopped but VCPU wasn't. */
     if (!n->vdev.vm_running) {
         return;
     }
     virtio_queue_set_notification(vq, 0);
-    qemu_bh_schedule(n->tx_bh);
+    qemu_bh_schedule(q->tx_bh);
 }
 
 static void virtio_net_tx_timer(void *opaque)
 {
-    VirtIONet *n = opaque;
+    VirtIONetQueue *q = opaque;
+    VirtIONet *n = q->n;
     assert(n->vdev.vm_running);
 
-    n->tx_waiting = 0;
+    q->tx_waiting = 0;
 
     /* Just in case the driver is not ready on more */
     if (!(n->vdev.status & VIRTIO_CONFIG_S_DRIVER_OK))
         return;
 
-    virtio_queue_set_notification(n->tx_vq, 1);
-    virtio_net_flush_tx(n, n->tx_vq);
+    virtio_queue_set_notification(q->tx_vq, 1);
+    virtio_net_flush_tx(q);
 }
 
 static void virtio_net_tx_bh(void *opaque)
 {
-    VirtIONet *n = opaque;
+    VirtIONetQueue *q = opaque;
+    VirtIONet *n = q->n;
     int32_t ret;
 
     assert(n->vdev.vm_running);
 
-    n->tx_waiting = 0;
+    q->tx_waiting = 0;
 
     /* Just in case the driver is not ready on more */
     if (unlikely(!(n->vdev.status & VIRTIO_CONFIG_S_DRIVER_OK)))
         return;
 
-    ret = virtio_net_flush_tx(n, n->tx_vq);
+    ret = virtio_net_flush_tx(q);
     if (ret == -EBUSY) {
         return; /* Notification re-enable handled by tx_complete */
     }
@@ -856,25 +883,26 @@ static void virtio_net_tx_bh(void *opaque)
     /* If we flush a full burst of packets, assume there are
      * more coming and immediately reschedule */
     if (ret >= n->tx_burst) {
-        qemu_bh_schedule(n->tx_bh);
-        n->tx_waiting = 1;
+        qemu_bh_schedule(q->tx_bh);
+        q->tx_waiting = 1;
         return;
     }
 
     /* If less than a full burst, re-enable notification and flush
      * anything that may have come in while we weren't looking.  If
      * we find something, assume the guest is still active and reschedule */
-    virtio_queue_set_notification(n->tx_vq, 1);
-    if (virtio_net_flush_tx(n, n->tx_vq) > 0) {
-        virtio_queue_set_notification(n->tx_vq, 0);
-        qemu_bh_schedule(n->tx_bh);
-        n->tx_waiting = 1;
+    virtio_queue_set_notification(q->tx_vq, 1);
+    if (virtio_net_flush_tx(q) > 0) {
+        virtio_queue_set_notification(q->tx_vq, 0);
+        qemu_bh_schedule(q->tx_bh);
+        q->tx_waiting = 1;
     }
 }
 
 static void virtio_net_save(QEMUFile *f, void *opaque)
 {
     VirtIONet *n = opaque;
+    VirtIONetQueue *q = &n->vq;
 
     /* At this point, backend must be stopped, otherwise
      * it might keep writing to memory. */
@@ -882,7 +910,7 @@ static void virtio_net_save(QEMUFile *f, void *opaque)
     virtio_save(&n->vdev, f);
 
     qemu_put_buffer(f, n->mac, ETH_ALEN);
-    qemu_put_be32(f, n->tx_waiting);
+    qemu_put_be32(f, q->tx_waiting);
     qemu_put_be32(f, n->mergeable_rx_bufs);
     qemu_put_be16(f, n->status);
     qemu_put_byte(f, n->promisc);
@@ -903,6 +931,7 @@ static void virtio_net_save(QEMUFile *f, void *opaque)
 static int virtio_net_load(QEMUFile *f, void *opaque, int version_id)
 {
     VirtIONet *n = opaque;
+    VirtIONetQueue *q = &n->vq;
     int i;
     int ret;
 
@@ -915,7 +944,7 @@ static int virtio_net_load(QEMUFile *f, void *opaque, int version_id)
     }
 
     qemu_get_buffer(f, n->mac, ETH_ALEN);
-    n->tx_waiting = qemu_get_be32(f);
+    q->tx_waiting = qemu_get_be32(f);
 
     virtio_net_set_mrg_rx_bufs(n, qemu_get_be32(f));
 
@@ -1052,7 +1081,8 @@ VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf,
     n->vdev.set_status = virtio_net_set_status;
     n->vdev.guest_notifier_mask = virtio_net_guest_notifier_mask;
     n->vdev.guest_notifier_pending = virtio_net_guest_notifier_pending;
-    n->rx_vq = virtio_add_queue(&n->vdev, 256, virtio_net_handle_rx);
+    n->vq.rx_vq = virtio_add_queue(&n->vdev, 256, virtio_net_handle_rx);
+    n->vq.n = n;
 
     if (net->tx && strcmp(net->tx, "timer") && strcmp(net->tx, "bh")) {
         error_report("virtio-net: "
@@ -1062,12 +1092,14 @@ VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf,
     }
 
     if (net->tx && !strcmp(net->tx, "timer")) {
-        n->tx_vq = virtio_add_queue(&n->vdev, 256, virtio_net_handle_tx_timer);
-        n->tx_timer = qemu_new_timer_ns(vm_clock, virtio_net_tx_timer, n);
+        n->vq.tx_vq = virtio_add_queue(&n->vdev, 256,
+                                       virtio_net_handle_tx_timer);
+        n->vq.tx_timer = qemu_new_timer_ns(vm_clock,
+                                           virtio_net_tx_timer, &n->vq);
         n->tx_timeout = net->txtimer;
     } else {
-        n->tx_vq = virtio_add_queue(&n->vdev, 256, virtio_net_handle_tx_bh);
-        n->tx_bh = qemu_bh_new(virtio_net_tx_bh, n);
+        n->vq.tx_vq = virtio_add_queue(&n->vdev, 256, virtio_net_handle_tx_bh);
+        n->vq.tx_bh = qemu_bh_new(virtio_net_tx_bh, &n->vq);
     }
     n->ctrl_vq = virtio_add_queue(&n->vdev, 64, virtio_net_handle_ctrl);
     qemu_macaddr_default_if_unset(&conf->macaddr);
@@ -1085,7 +1117,7 @@ VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf,
 
     qemu_format_nic_info_str(qemu_get_queue(n->nic), conf->macaddr.a);
 
-    n->tx_waiting = 0;
+    n->vq.tx_waiting = 0;
     n->tx_burst = net->txburst;
     virtio_net_set_mrg_rx_bufs(n, 0);
     n->promisc = 1; /* for compatibility */
@@ -1106,6 +1138,7 @@ VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf,
 void virtio_net_exit(VirtIODevice *vdev)
 {
     VirtIONet *n = DO_UPCAST(VirtIONet, vdev, vdev);
+    VirtIONetQueue *q = &n->vq;
 
     /* This will stop vhost backend if appropriate. */
     virtio_net_set_status(vdev, 0);
@@ -1117,11 +1150,11 @@ void virtio_net_exit(VirtIODevice *vdev)
     g_free(n->mac_table.macs);
     g_free(n->vlans);
 
-    if (n->tx_timer) {
-        qemu_del_timer(n->tx_timer);
-        qemu_free_timer(n->tx_timer);
+    if (q->tx_timer) {
+        qemu_del_timer(q->tx_timer);
+        qemu_free_timer(q->tx_timer);
     } else {
-        qemu_bh_delete(n->tx_bh);
+        qemu_bh_delete(q->tx_bh);
     }
 
     qemu_del_nic(n->nic);
-- 
1.7.1



* [PATCH V2 18/20] virtio-net: multiqueue support
  2013-01-25 10:35 [PATCH V2 00/20] Multiqueue virtio-net Jason Wang
                   ` (16 preceding siblings ...)
  2013-01-25 10:35 ` [PATCH V2 17/20] virtio-net: separate virtqueue from VirtIONet Jason Wang
@ 2013-01-25 10:35 ` Jason Wang
  2013-04-13 13:17   ` [Qemu-devel] " Aurelien Jarno
  2013-01-25 10:35 ` [PATCH V2 19/20] virtio-net: migration support for multiqueue Jason Wang
                   ` (2 subsequent siblings)
  20 siblings, 1 reply; 41+ messages in thread
From: Jason Wang @ 2013-01-25 10:35 UTC (permalink / raw)
  To: mst, qemu-devel, aliguori, shajnocz
  Cc: krkumar2, kvm, mprivozn, rusty, jwhan, shiyer, gaowanlong, Jason Wang

This patch implements both userspace and vhost support for multiqueue
virtio-net (VIRTIO_NET_F_MQ). This is done by introducing an array of
VirtIONetQueue in VirtIONet.
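
The index arithmetic behind the array can be sketched as follows. `vq2q()` is the helper added by this patch; `q2rx()` and `q2tx()` are hypothetical inverses, and they assume that the rx virtqueue of pair i sits at index 2 * i with its tx virtqueue right after it, which matches the order in which the single-queue device adds its virtqueues.

```c
#include <assert.h>

/* Virtqueue index to queue-pair index, as in the patch. */
static int vq2q(int queue_index)
{
    return queue_index / 2;
}

/* Hypothetical inverse: rx virtqueue index of a queue pair (assumed layout). */
static int q2rx(int q)
{
    assert(q >= 0);
    return q * 2;
}

/* Hypothetical inverse: tx virtqueue index of a queue pair (assumed layout). */
static int q2tx(int q)
{
    assert(q >= 0);
    return q * 2 + 1;
}
```

With this layout, virtqueues 4 and 5 both belong to queue pair 2.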

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 hw/virtio-net.c |  317 +++++++++++++++++++++++++++++++++++++++++++------------
 hw/virtio-net.h |   28 +++++-
 2 files changed, 275 insertions(+), 70 deletions(-)

diff --git a/hw/virtio-net.c b/hw/virtio-net.c
index ef522d5..cec91a7 100644
--- a/hw/virtio-net.c
+++ b/hw/virtio-net.c
@@ -44,7 +44,7 @@ typedef struct VirtIONet
     VirtIODevice vdev;
     uint8_t mac[ETH_ALEN];
     uint16_t status;
-    VirtIONetQueue vq;
+    VirtIONetQueue vqs[MAX_QUEUE_NUM];
     VirtQueue *ctrl_vq;
     NICState *nic;
     uint32_t tx_timeout;
@@ -70,14 +70,24 @@ typedef struct VirtIONet
     } mac_table;
     uint32_t *vlans;
     DeviceState *qdev;
+    int multiqueue;
+    uint16_t max_queues;
+    uint16_t curr_queues;
+    bool queues_changed;
 } VirtIONet;
 
-static VirtIONetQueue *virtio_net_get_queue(NetClientState *nc)
+static VirtIONetQueue *virtio_net_get_subqueue(NetClientState *nc)
 {
     VirtIONet *n = qemu_get_nic_opaque(nc);
 
-    return &n->vq;
+    return &n->vqs[nc->queue_index];
 }
+
+static int vq2q(int queue_index)
+{
+    return queue_index / 2;
+}
+
 /* TODO
  * - we could suppress RX interrupt if we were so inclined.
  */
@@ -93,6 +103,7 @@ static void virtio_net_get_config(VirtIODevice *vdev, uint8_t *config)
     struct virtio_net_config netcfg;
 
     stw_p(&netcfg.status, n->status);
+    stw_p(&netcfg.max_virtqueue_pairs, n->max_queues);
     memcpy(netcfg.mac, n->mac, ETH_ALEN);
     memcpy(config, &netcfg, sizeof(netcfg));
 }
@@ -119,6 +130,7 @@ static bool virtio_net_started(VirtIONet *n, uint8_t status)
 static void virtio_net_vhost_status(VirtIONet *n, uint8_t status)
 {
     NetClientState *nc = qemu_get_queue(n->nic);
+    int queues = n->multiqueue ? n->max_queues : 1;
 
     if (!nc->peer) {
         return;
@@ -130,26 +142,27 @@ static void virtio_net_vhost_status(VirtIONet *n, uint8_t status)
     if (!tap_get_vhost_net(nc->peer)) {
         return;
     }
-    if (!!n->vhost_started == virtio_net_started(n, status) &&
-                              !nc->peer->link_down) {
+
+    if (!n->queues_changed &&
+        !!n->vhost_started ==
+        (virtio_net_started(n, status) && !nc->peer->link_down)) {
         return;
     }
-    if (!n->vhost_started) {
+    if (!n->vhost_started || n->queues_changed) {
         int r;
         if (!vhost_net_query(tap_get_vhost_net(nc->peer), &n->vdev)) {
             return;
         }
         n->vhost_started = 1;
-        r = vhost_net_start(&n->vdev, nc, 1, 1);
+        r = vhost_net_start(&n->vdev, n->nic->ncs, n->curr_queues, queues);
         if (r < 0) {
             error_report("unable to start vhost net: %d: "
                          "falling back on userspace virtio", -r);
             n->vhost_started = 0;
-        } else {
-            n->vhost_started = 1;
         }
+        n->queues_changed = false;
     } else {
-        vhost_net_stop(&n->vdev, nc, 1, 1);
+        vhost_net_stop(&n->vdev, n->nic->ncs, n->curr_queues, queues);
         n->vhost_started = 0;
     }
 }
@@ -157,26 +170,38 @@ static void virtio_net_vhost_status(VirtIONet *n, uint8_t status)
 static void virtio_net_set_status(struct VirtIODevice *vdev, uint8_t status)
 {
     VirtIONet *n = to_virtio_net(vdev);
-    VirtIONetQueue *q = &n->vq;
+    VirtIONetQueue *q;
+    int i;
+    uint8_t queue_status;
 
     virtio_net_vhost_status(n, status);
 
-    if (!q->tx_waiting) {
-        return;
-    }
+    for (i = 0; i < n->max_queues; i++) {
+        q = &n->vqs[i];
 
-    if (virtio_net_started(n, status) && !n->vhost_started) {
-        if (q->tx_timer) {
-            qemu_mod_timer(q->tx_timer,
-                           qemu_get_clock_ns(vm_clock) + n->tx_timeout);
+        if ((!n->multiqueue && i != 0) || i >= n->curr_queues) {
+            queue_status = 0;
         } else {
-            qemu_bh_schedule(q->tx_bh);
+            queue_status = status;
         }
-    } else {
-        if (q->tx_timer) {
-            qemu_del_timer(q->tx_timer);
+
+        if (!q->tx_waiting) {
+            continue;
+        }
+
+        if (virtio_net_started(n, queue_status) && !n->vhost_started) {
+            if (q->tx_timer) {
+                qemu_mod_timer(q->tx_timer,
+                               qemu_get_clock_ns(vm_clock) + n->tx_timeout);
+            } else {
+                qemu_bh_schedule(q->tx_bh);
+            }
         } else {
-            qemu_bh_cancel(q->tx_bh);
+            if (q->tx_timer) {
+                qemu_del_timer(q->tx_timer);
+            } else {
+                qemu_bh_cancel(q->tx_bh);
+            }
         }
     }
 }
@@ -208,6 +233,9 @@ static void virtio_net_reset(VirtIODevice *vdev)
     n->nomulti = 0;
     n->nouni = 0;
     n->nobcast = 0;
+    /* multiqueue is disabled by default */
+    n->curr_queues = 1;
+    n->queues_changed = 0;
 
     /* Flush any MAC and VLAN filter table state */
     n->mac_table.in_use = 0;
@@ -249,18 +277,72 @@ static int peer_has_ufo(VirtIONet *n)
 
 static void virtio_net_set_mrg_rx_bufs(VirtIONet *n, int mergeable_rx_bufs)
 {
+    int i;
+    NetClientState *nc;
+
     n->mergeable_rx_bufs = mergeable_rx_bufs;
 
     n->guest_hdr_len = n->mergeable_rx_bufs ?
         sizeof(struct virtio_net_hdr_mrg_rxbuf) : sizeof(struct virtio_net_hdr);
 
-    if (peer_has_vnet_hdr(n) &&
-        tap_has_vnet_hdr_len(qemu_get_queue(n->nic)->peer, n->guest_hdr_len)) {
-        tap_set_vnet_hdr_len(qemu_get_queue(n->nic)->peer, n->guest_hdr_len);
-        n->host_hdr_len = n->guest_hdr_len;
+    for (i = 0; i < n->max_queues; i++) {
+        nc = qemu_get_subqueue(n->nic, i);
+
+        if (peer_has_vnet_hdr(n) &&
+            tap_has_vnet_hdr_len(nc->peer, n->guest_hdr_len)) {
+            tap_set_vnet_hdr_len(nc->peer, n->guest_hdr_len);
+            n->host_hdr_len = n->guest_hdr_len;
+        }
+    }
+}
+
+static int peer_attach(VirtIONet *n, int index)
+{
+    NetClientState *nc = qemu_get_subqueue(n->nic, index);
+    int ret;
+
+    if (!nc->peer) {
+        ret = -1;
+    } else if (nc->peer->info->type != NET_CLIENT_OPTIONS_KIND_TAP) {
+        ret = -1;
+    } else {
+        ret = tap_enable(nc->peer);
+    }
+
+    return ret;
+}
+
+static int peer_detach(VirtIONet *n, int index)
+{
+    NetClientState *nc = qemu_get_subqueue(n->nic, index);
+    int ret;
+
+    if (!nc->peer) {
+        ret = -1;
+    } else if (nc->peer->info->type !=  NET_CLIENT_OPTIONS_KIND_TAP) {
+        ret = -1;
+    } else {
+        ret = tap_disable(nc->peer);
+    }
+
+    return ret;
+}
+
+static void virtio_net_set_queues(VirtIONet *n)
+{
+    int i;
+
+    for (i = 0; i < n->max_queues; i++) {
+        if (i < n->curr_queues) {
+            assert(!peer_attach(n, i));
+        } else {
+            assert(!peer_detach(n, i));
+        }
     }
 }
 
+static void virtio_net_set_multiqueue(VirtIONet *n, int multiqueue, int ctrl);
+
 static uint32_t virtio_net_get_features(VirtIODevice *vdev, uint32_t features)
 {
     VirtIONet *n = to_virtio_net(vdev);
@@ -312,25 +394,33 @@ static uint32_t virtio_net_bad_features(VirtIODevice *vdev)
 static void virtio_net_set_features(VirtIODevice *vdev, uint32_t features)
 {
     VirtIONet *n = to_virtio_net(vdev);
-    NetClientState *nc = qemu_get_queue(n->nic);
+    int i;
+
+    virtio_net_set_multiqueue(n, !!(features & (1 << VIRTIO_NET_F_MQ)),
+                              !!(features & (1 << VIRTIO_NET_F_CTRL_VQ)));
 
     virtio_net_set_mrg_rx_bufs(n, !!(features & (1 << VIRTIO_NET_F_MRG_RXBUF)));
 
     if (n->has_vnet_hdr) {
-        tap_set_offload(nc->peer,
+        tap_set_offload(qemu_get_subqueue(n->nic, 0)->peer,
                         (features >> VIRTIO_NET_F_GUEST_CSUM) & 1,
                         (features >> VIRTIO_NET_F_GUEST_TSO4) & 1,
                         (features >> VIRTIO_NET_F_GUEST_TSO6) & 1,
                         (features >> VIRTIO_NET_F_GUEST_ECN)  & 1,
                         (features >> VIRTIO_NET_F_GUEST_UFO)  & 1);
     }
-    if (!nc->peer || nc->peer->info->type != NET_CLIENT_OPTIONS_KIND_TAP) {
-        return;
-    }
-    if (!tap_get_vhost_net(nc->peer)) {
-        return;
+
+    for (i = 0;  i < n->max_queues; i++) {
+        NetClientState *nc = qemu_get_subqueue(n->nic, i);
+
+        if (!nc->peer || nc->peer->info->type != NET_CLIENT_OPTIONS_KIND_TAP) {
+            continue;
+        }
+        if (!tap_get_vhost_net(nc->peer)) {
+            continue;
+        }
+        vhost_net_ack_features(tap_get_vhost_net(nc->peer), features);
     }
-    vhost_net_ack_features(tap_get_vhost_net(nc->peer), features);
 }
 
 static int virtio_net_handle_rx_mode(VirtIONet *n, uint8_t cmd,
@@ -440,6 +530,39 @@ static int virtio_net_handle_vlan_table(VirtIONet *n, uint8_t cmd,
     return VIRTIO_NET_OK;
 }
 
+static int virtio_net_handle_mq(VirtIONet *n, uint8_t cmd,
+                                VirtQueueElement *elem)
+{
+    struct virtio_net_ctrl_mq s;
+
+    if (elem->out_num != 2 ||
+        elem->out_sg[1].iov_len != sizeof(struct virtio_net_ctrl_mq)) {
+        error_report("virtio-net ctrl invalid steering command");
+        return VIRTIO_NET_ERR;
+    }
+
+    if (cmd != VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET) {
+        return VIRTIO_NET_ERR;
+    }
+
+    memcpy(&s, elem->out_sg[1].iov_base, sizeof(struct virtio_net_ctrl_mq));
+
+    if (s.virtqueue_pairs < VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MIN ||
+        s.virtqueue_pairs > VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX ||
+        s.virtqueue_pairs > n->max_queues ||
+        !n->multiqueue) {
+        return VIRTIO_NET_ERR;
+    }
+
+    n->curr_queues = s.virtqueue_pairs;
+    n->queues_changed = true;
+    /* stop the backend before changing the number of queues to avoid handling a
+     * disabled queue */
+    virtio_net_set_status(&n->vdev, n->vdev.status);
+    virtio_net_set_queues(n);
+
+    return VIRTIO_NET_OK;
+}
 static void virtio_net_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
 {
     VirtIONet *n = to_virtio_net(vdev);
@@ -468,6 +591,9 @@ static void virtio_net_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
             status = virtio_net_handle_mac(n, ctrl.cmd, &elem);
         else if (ctrl.class == VIRTIO_NET_CTRL_VLAN)
             status = virtio_net_handle_vlan_table(n, ctrl.cmd, &elem);
+        else if (ctrl.class == VIRTIO_NET_CTRL_MQ) {
+            status = virtio_net_handle_mq(n, ctrl.cmd, &elem);
+        }
 
         stb_p(elem.in_sg[elem.in_num - 1].iov_base, status);
 
@@ -481,19 +607,24 @@ static void virtio_net_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
 static void virtio_net_handle_rx(VirtIODevice *vdev, VirtQueue *vq)
 {
     VirtIONet *n = to_virtio_net(vdev);
+    int queue_index = vq2q(virtio_get_queue_index(vq));
 
-    qemu_flush_queued_packets(qemu_get_queue(n->nic));
+    qemu_flush_queued_packets(qemu_get_subqueue(n->nic, queue_index));
 }
 
 static int virtio_net_can_receive(NetClientState *nc)
 {
     VirtIONet *n = qemu_get_nic_opaque(nc);
-    VirtIONetQueue *q = virtio_net_get_queue(nc);
+    VirtIONetQueue *q = virtio_net_get_subqueue(nc);
 
     if (!n->vdev.vm_running) {
         return 0;
     }
 
+    if (nc->queue_index >= n->curr_queues) {
+        return 0;
+    }
+
     if (!virtio_queue_ready(q->rx_vq) ||
         !(n->vdev.status & VIRTIO_CONFIG_S_DRIVER_OK)) {
         return 0;
@@ -624,13 +755,13 @@ static int receive_filter(VirtIONet *n, const uint8_t *buf, int size)
 static ssize_t virtio_net_receive(NetClientState *nc, const uint8_t *buf, size_t size)
 {
     VirtIONet *n = qemu_get_nic_opaque(nc);
-    VirtIONetQueue *q = virtio_net_get_queue(nc);
+    VirtIONetQueue *q = virtio_net_get_subqueue(nc);
     struct iovec mhdr_sg[VIRTQUEUE_MAX_SIZE];
     struct virtio_net_hdr_mrg_rxbuf mhdr;
     unsigned mhdr_cnt = 0;
     size_t offset, i, guest_offset;
 
-    if (!virtio_net_can_receive(qemu_get_queue(n->nic))) {
+    if (!virtio_net_can_receive(nc)) {
         return -1;
     }
 
@@ -725,7 +856,7 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q);
 static void virtio_net_tx_complete(NetClientState *nc, ssize_t len)
 {
     VirtIONet *n = qemu_get_nic_opaque(nc);
-    VirtIONetQueue *q = virtio_net_get_queue(nc);
+    VirtIONetQueue *q = virtio_net_get_subqueue(nc);
 
     virtqueue_push(q->tx_vq, &q->async_tx.elem, 0);
     virtio_notify(&n->vdev, q->tx_vq);
@@ -742,6 +873,7 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
     VirtIONet *n = q->n;
     VirtQueueElement elem;
     int32_t num_packets = 0;
+    int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
     if (!(n->vdev.status & VIRTIO_CONFIG_S_DRIVER_OK)) {
         return num_packets;
     }
@@ -783,8 +915,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
 
         len = n->guest_hdr_len;
 
-        ret = qemu_sendv_packet_async(qemu_get_queue(n->nic), out_sg, out_num,
-                                      virtio_net_tx_complete);
+        ret = qemu_sendv_packet_async(qemu_get_subqueue(n->nic, queue_index),
+                                      out_sg, out_num, virtio_net_tx_complete);
         if (ret == 0) {
             virtio_queue_set_notification(q->tx_vq, 0);
             q->async_tx.elem = elem;
@@ -807,7 +939,7 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
 static void virtio_net_handle_tx_timer(VirtIODevice *vdev, VirtQueue *vq)
 {
     VirtIONet *n = to_virtio_net(vdev);
-    VirtIONetQueue *q = &n->vq;
+    VirtIONetQueue *q = &n->vqs[vq2q(virtio_get_queue_index(vq))];
 
     /* This happens when device was stopped but VCPU wasn't. */
     if (!n->vdev.vm_running) {
@@ -831,7 +963,7 @@ static void virtio_net_handle_tx_timer(VirtIODevice *vdev, VirtQueue *vq)
 static void virtio_net_handle_tx_bh(VirtIODevice *vdev, VirtQueue *vq)
 {
     VirtIONet *n = to_virtio_net(vdev);
-    VirtIONetQueue *q = &n->vq;
+    VirtIONetQueue *q = &n->vqs[vq2q(virtio_get_queue_index(vq))];
 
     if (unlikely(q->tx_waiting)) {
         return;
@@ -899,10 +1031,46 @@ static void virtio_net_tx_bh(void *opaque)
     }
 }
 
+static void virtio_net_set_multiqueue(VirtIONet *n, int multiqueue, int ctrl)
+{
+    VirtIODevice *vdev = &n->vdev;
+    int i, max = multiqueue ? n->max_queues : 1;
+
+    n->multiqueue = multiqueue;
+
+    for (i = 2; i <= n->max_queues * 2 + 1; i++) {
+        virtio_del_queue(vdev, i);
+    }
+
+    for (i = 1; i < max; i++) {
+        n->vqs[i].rx_vq = virtio_add_queue(vdev, 256, virtio_net_handle_rx);
+        if (n->vqs[i].tx_timer) {
+            n->vqs[i].tx_vq =
+                virtio_add_queue(vdev, 256, virtio_net_handle_tx_timer);
+            n->vqs[i].tx_timer = qemu_new_timer_ns(vm_clock,
+                                                   virtio_net_tx_timer,
+                                                   &n->vqs[i]);
+        } else {
+            n->vqs[i].tx_vq =
+                virtio_add_queue(vdev, 256, virtio_net_handle_tx_bh);
+            n->vqs[i].tx_bh = qemu_bh_new(virtio_net_tx_bh, &n->vqs[i]);
+        }
+
+        n->vqs[i].tx_waiting = 0;
+        n->vqs[i].n = n;
+    }
+
+    if (ctrl) {
+        n->ctrl_vq = virtio_add_queue(vdev, 64, virtio_net_handle_ctrl);
+    }
+
+    virtio_net_set_queues(n);
+}
+
 static void virtio_net_save(QEMUFile *f, void *opaque)
 {
     VirtIONet *n = opaque;
-    VirtIONetQueue *q = &n->vq;
+    VirtIONetQueue *q = &n->vqs[0];
 
     /* At this point, backend must be stopped, otherwise
      * it might keep writing to memory. */
@@ -931,9 +1099,8 @@ static void virtio_net_save(QEMUFile *f, void *opaque)
 static int virtio_net_load(QEMUFile *f, void *opaque, int version_id)
 {
     VirtIONet *n = opaque;
-    VirtIONetQueue *q = &n->vq;
-    int i;
-    int ret;
+    VirtIONetQueue *q = &n->vqs[0];
+    int ret, i;
 
     if (version_id < 2 || version_id > VIRTIO_NET_VM_VERSION)
         return -EINVAL;
@@ -1048,7 +1215,7 @@ static NetClientInfo net_virtio_info = {
 static bool virtio_net_guest_notifier_pending(VirtIODevice *vdev, int idx)
 {
     VirtIONet *n = to_virtio_net(vdev);
-    NetClientState *nc = qemu_get_queue(n->nic);
+    NetClientState *nc = qemu_get_subqueue(n->nic, vq2q(idx));
     assert(n->vhost_started);
     return vhost_net_virtqueue_pending(tap_get_vhost_net(nc->peer), idx);
 }
@@ -1057,7 +1224,7 @@ static void virtio_net_guest_notifier_mask(VirtIODevice *vdev, int idx,
                                            bool mask)
 {
     VirtIONet *n = to_virtio_net(vdev);
-    NetClientState *nc = qemu_get_queue(n->nic);
+    NetClientState *nc = qemu_get_subqueue(n->nic, vq2q(idx));
     assert(n->vhost_started);
     vhost_net_virtqueue_mask(tap_get_vhost_net(nc->peer),
                              vdev, idx, mask);
@@ -1067,6 +1234,7 @@ VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf,
                               virtio_net_conf *net)
 {
     VirtIONet *n;
+    int i;
 
     n = (VirtIONet *)virtio_common_init("virtio-net", VIRTIO_ID_NET,
                                         sizeof(struct virtio_net_config),
@@ -1081,8 +1249,12 @@ VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf,
     n->vdev.set_status = virtio_net_set_status;
     n->vdev.guest_notifier_mask = virtio_net_guest_notifier_mask;
     n->vdev.guest_notifier_pending = virtio_net_guest_notifier_pending;
-    n->vq.rx_vq = virtio_add_queue(&n->vdev, 256, virtio_net_handle_rx);
-    n->vq.n = n;
+    n->vqs[0].rx_vq = virtio_add_queue(&n->vdev, 256, virtio_net_handle_rx);
+    n->max_queues = conf->queues;
+    n->curr_queues = 1;
+    n->queues_changed = false;
+    n->vqs[0].n = n;
+    n->tx_timeout = net->txtimer;
 
     if (net->tx && strcmp(net->tx, "timer") && strcmp(net->tx, "bh")) {
         error_report("virtio-net: "
@@ -1092,14 +1264,14 @@ VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf,
     }
 
     if (net->tx && !strcmp(net->tx, "timer")) {
-        n->vq.tx_vq = virtio_add_queue(&n->vdev, 256,
-                                       virtio_net_handle_tx_timer);
-        n->vq.tx_timer = qemu_new_timer_ns(vm_clock,
-                                           virtio_net_tx_timer, &n->vq);
-        n->tx_timeout = net->txtimer;
+        n->vqs[0].tx_vq = virtio_add_queue(&n->vdev, 256,
+                                           virtio_net_handle_tx_timer);
+        n->vqs[0].tx_timer = qemu_new_timer_ns(vm_clock, virtio_net_tx_timer,
+                                               &n->vqs[0]);
     } else {
-        n->vq.tx_vq = virtio_add_queue(&n->vdev, 256, virtio_net_handle_tx_bh);
-        n->vq.tx_bh = qemu_bh_new(virtio_net_tx_bh, &n->vq);
+        n->vqs[0].tx_vq = virtio_add_queue(&n->vdev, 256,
+                                           virtio_net_handle_tx_bh);
+        n->vqs[0].tx_bh = qemu_bh_new(virtio_net_tx_bh, &n->vqs[0]);
     }
     n->ctrl_vq = virtio_add_queue(&n->vdev, 64, virtio_net_handle_ctrl);
     qemu_macaddr_default_if_unset(&conf->macaddr);
@@ -1109,7 +1281,9 @@ VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf,
     n->nic = qemu_new_nic(&net_virtio_info, conf, object_get_typename(OBJECT(dev)), dev->id, n);
     peer_test_vnet_hdr(n);
     if (peer_has_vnet_hdr(n)) {
-        tap_using_vnet_hdr(qemu_get_queue(n->nic)->peer, 1);
+        for (i = 0; i < n->max_queues; i++) {
+            tap_using_vnet_hdr(qemu_get_subqueue(n->nic, i)->peer, 1);
+        }
         n->host_hdr_len = sizeof(struct virtio_net_hdr);
     } else {
         n->host_hdr_len = 0;
@@ -1117,7 +1291,7 @@ VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf,
 
     qemu_format_nic_info_str(qemu_get_queue(n->nic), conf->macaddr.a);
 
-    n->vq.tx_waiting = 0;
+    n->vqs[0].tx_waiting = 0;
     n->tx_burst = net->txburst;
     virtio_net_set_mrg_rx_bufs(n, 0);
     n->promisc = 1; /* for compatibility */
@@ -1138,23 +1312,28 @@ VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf,
 void virtio_net_exit(VirtIODevice *vdev)
 {
     VirtIONet *n = DO_UPCAST(VirtIONet, vdev, vdev);
-    VirtIONetQueue *q = &n->vq;
+    int i;
 
     /* This will stop vhost backend if appropriate. */
     virtio_net_set_status(vdev, 0);
 
-    qemu_purge_queued_packets(qemu_get_queue(n->nic));
-
     unregister_savevm(n->qdev, "virtio-net", n);
 
     g_free(n->mac_table.macs);
     g_free(n->vlans);
 
-    if (q->tx_timer) {
-        qemu_del_timer(q->tx_timer);
-        qemu_free_timer(q->tx_timer);
-    } else {
-        qemu_bh_delete(q->tx_bh);
+    for (i = 0; i < n->max_queues; i++) {
+        VirtIONetQueue *q = &n->vqs[i];
+        NetClientState *nc = qemu_get_subqueue(n->nic, i);
+
+        qemu_purge_queued_packets(nc);
+
+        if (q->tx_timer) {
+            qemu_del_timer(q->tx_timer);
+            qemu_free_timer(q->tx_timer);
+        } else {
+            qemu_bh_delete(q->tx_bh);
+        }
     }
 
     qemu_del_nic(n->nic);
diff --git a/hw/virtio-net.h b/hw/virtio-net.h
index d46fb98..d4fba23 100644
--- a/hw/virtio-net.h
+++ b/hw/virtio-net.h
@@ -43,6 +43,8 @@
 #define VIRTIO_NET_F_CTRL_RX    18      /* Control channel RX mode support */
 #define VIRTIO_NET_F_CTRL_VLAN  19      /* Control channel VLAN filtering */
 #define VIRTIO_NET_F_CTRL_RX_EXTRA 20   /* Extra RX mode control support */
+#define VIRTIO_NET_F_MQ         22      /* Device supports Receive Flow
+                                         * Steering */
 
 #define VIRTIO_NET_S_LINK_UP    1       /* Link is up */
 
@@ -71,6 +73,8 @@ struct virtio_net_config
     uint8_t mac[ETH_ALEN];
     /* See VIRTIO_NET_F_STATUS and VIRTIO_NET_S_* above */
     uint16_t status;
+    /* Max virtqueue pairs supported by the device */
+    uint16_t max_virtqueue_pairs;
 } QEMU_PACKED;
 
 /*
@@ -140,6 +144,26 @@ struct virtio_net_ctrl_mac {
  #define VIRTIO_NET_CTRL_VLAN_ADD             0
  #define VIRTIO_NET_CTRL_VLAN_DEL             1
 
+/*
+ * Control Multiqueue
+ *
+ * The command VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET
+ * enables multiqueue, specifying the number of the transmit and
+ * receive queues that will be used. After the command is consumed and acked by
+ * the device, the device will not steer new packets on receive virtqueues
+ * other than specified nor read from transmit virtqueues other than specified.
+ * Accordingly, driver should not transmit new packets  on virtqueues other than
+ * specified.
+ */
+struct virtio_net_ctrl_mq {
+    uint16_t virtqueue_pairs;
+};
+
+#define VIRTIO_NET_CTRL_MQ   4
+ #define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET        0
+ #define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MIN        1
+ #define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX        0x8000
+
 #define DEFINE_VIRTIO_NET_FEATURES(_state, _field) \
         DEFINE_VIRTIO_COMMON_FEATURES(_state, _field), \
         DEFINE_PROP_BIT("csum", _state, _field, VIRTIO_NET_F_CSUM, true), \
@@ -158,5 +182,7 @@ struct virtio_net_ctrl_mac {
         DEFINE_PROP_BIT("ctrl_vq", _state, _field, VIRTIO_NET_F_CTRL_VQ, true), \
         DEFINE_PROP_BIT("ctrl_rx", _state, _field, VIRTIO_NET_F_CTRL_RX, true), \
         DEFINE_PROP_BIT("ctrl_vlan", _state, _field, VIRTIO_NET_F_CTRL_VLAN, true), \
-        DEFINE_PROP_BIT("ctrl_rx_extra", _state, _field, VIRTIO_NET_F_CTRL_RX_EXTRA, true)
+        DEFINE_PROP_BIT("ctrl_rx_extra", _state, _field, \
+                        VIRTIO_NET_F_CTRL_RX_EXTRA, true),              \
+        DEFINE_PROP_BIT("mq", _state, _field, VIRTIO_NET_F_MQ, true)
 #endif
-- 
1.7.1



* [PATCH V2 19/20] virtio-net: migration support for multiqueue
  2013-01-25 10:35 [PATCH V2 00/20] Multiqueue virtio-net Jason Wang
                   ` (17 preceding siblings ...)
  2013-01-25 10:35 ` [PATCH V2 18/20] virtio-net: multiqueue support Jason Wang
@ 2013-01-25 10:35 ` Jason Wang
  2013-01-25 10:35 ` [PATCH V2 20/20] virtio-net: compat multiqueue support Jason Wang
  2013-01-28  3:27 ` [PATCH V2 00/20] Multiqueue virtio-net Wanlong Gao
  20 siblings, 0 replies; 41+ messages in thread
From: Jason Wang @ 2013-01-25 10:35 UTC (permalink / raw)
  To: mst, qemu-devel, aliguori, shajnocz
  Cc: krkumar2, kvm, mprivozn, rusty, jwhan, shiyer, gaowanlong, Jason Wang

This patch adds migration support for multiqueue virtio-net. Instead of bumping
the version, we send the multiqueue info only when the device supports more
than one queue, which maintains backward compatibility.

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
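Note on the stream layout: the sketch below is illustrative only and is not
part of the patch. It shows the idea of an optional trailing section that is
written and read only when max_queues > 1, so a single-queue device produces
the same stream as before. struct mq_state, MAX_Q, mq_save(), mq_load() and
the put/get helpers are hypothetical stand-ins for VirtIONet and the
qemu_put_be16()/qemu_get_be16() family. The bound check on curr_queues in
mq_load() is an addition made for this sketch; the patch itself does not
perform it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_Q 8 /* hypothetical bound, only for this sketch */

struct mq_state {
    uint16_t max_queues;
    uint16_t curr_queues;
    uint32_t tx_waiting[MAX_Q]; /* entry 0 travels in the legacy section */
};

/* Big-endian writers; each returns the offset after the written value. */
static size_t put_be16(uint8_t *buf, size_t off, uint16_t v)
{
    buf[off] = (uint8_t)(v >> 8);
    buf[off + 1] = (uint8_t)(v & 0xff);
    return off + 2;
}

static size_t put_be32(uint8_t *buf, size_t off, uint32_t v)
{
    off = put_be16(buf, off, (uint16_t)(v >> 16));
    return put_be16(buf, off, (uint16_t)(v & 0xffff));
}

/* Big-endian readers; each advances *off past the value it read. */
static uint16_t get_be16(const uint8_t *buf, size_t *off)
{
    uint16_t v = (uint16_t)((buf[*off] << 8) | buf[*off + 1]);
    *off += 2;
    return v;
}

static uint32_t get_be32(const uint8_t *buf, size_t *off)
{
    uint32_t hi = get_be16(buf, off);
    uint32_t lo = get_be16(buf, off);
    return (hi << 16) | lo;
}

/* Append the multiqueue section only for multiqueue devices. */
static size_t mq_save(const struct mq_state *s, uint8_t *buf, size_t off)
{
    uint16_t i;

    assert(s->max_queues <= MAX_Q);
    if (s->max_queues > 1) {
        off = put_be16(buf, off, s->max_queues);
        off = put_be16(buf, off, s->curr_queues);
        for (i = 1; i < s->curr_queues; i++) {
            off = put_be32(buf, off, s->tx_waiting[i]);
        }
    }
    return off;
}

/* Read the section back; the destination decides from its own max_queues
 * whether the section is present. Returns 0 on success, -1 on mismatch. */
static int mq_load(struct mq_state *s, const uint8_t *buf, size_t *off)
{
    uint16_t i;

    if (s->max_queues > 1) {
        if (get_be16(buf, off) != s->max_queues) {
            return -1;
        }
        s->curr_queues = get_be16(buf, off);
        if (s->curr_queues > s->max_queues || s->curr_queues > MAX_Q) {
            return -1; /* sketch-only sanity check, not in the patch */
        }
        for (i = 1; i < s->curr_queues; i++) {
            s->tx_waiting[i] = get_be32(buf, off);
        }
    }
    return 0;
}
```

Because the reader keys off its own max_queues, source and destination need
to be configured with the same number of queues, which is what the
"different max_queues" error in virtio_net_load() enforces.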
 hw/virtio-net.c |   35 +++++++++++++++++++++++++++++------
 1 files changed, 29 insertions(+), 6 deletions(-)

diff --git a/hw/virtio-net.c b/hw/virtio-net.c
index cec91a7..4eb191f 100644
--- a/hw/virtio-net.c
+++ b/hw/virtio-net.c
@@ -1069,8 +1069,8 @@ static void virtio_net_set_multiqueue(VirtIONet *n, int multiqueue, int ctrl)
 
 static void virtio_net_save(QEMUFile *f, void *opaque)
 {
+    int i;
     VirtIONet *n = opaque;
-    VirtIONetQueue *q = &n->vqs[0];
 
     /* At this point, backend must be stopped, otherwise
      * it might keep writing to memory. */
@@ -1078,7 +1078,7 @@ static void virtio_net_save(QEMUFile *f, void *opaque)
     virtio_save(&n->vdev, f);
 
     qemu_put_buffer(f, n->mac, ETH_ALEN);
-    qemu_put_be32(f, q->tx_waiting);
+    qemu_put_be32(f, n->vqs[0].tx_waiting);
     qemu_put_be32(f, n->mergeable_rx_bufs);
     qemu_put_be16(f, n->status);
     qemu_put_byte(f, n->promisc);
@@ -1094,13 +1094,19 @@ static void virtio_net_save(QEMUFile *f, void *opaque)
     qemu_put_byte(f, n->nouni);
     qemu_put_byte(f, n->nobcast);
     qemu_put_byte(f, n->has_ufo);
+    if (n->max_queues > 1) {
+        qemu_put_be16(f, n->max_queues);
+        qemu_put_be16(f, n->curr_queues);
+        for (i = 1; i < n->curr_queues; i++) {
+            qemu_put_be32(f, n->vqs[i].tx_waiting);
+        }
+    }
 }
 
 static int virtio_net_load(QEMUFile *f, void *opaque, int version_id)
 {
     VirtIONet *n = opaque;
-    VirtIONetQueue *q = &n->vqs[0];
-    int ret, i;
+    int ret, i, link_down;
 
     if (version_id < 2 || version_id > VIRTIO_NET_VM_VERSION)
         return -EINVAL;
@@ -1111,7 +1117,7 @@ static int virtio_net_load(QEMUFile *f, void *opaque, int version_id)
     }
 
     qemu_get_buffer(f, n->mac, ETH_ALEN);
-    q->tx_waiting = qemu_get_be32(f);
+    n->vqs[0].tx_waiting = qemu_get_be32(f);
 
     virtio_net_set_mrg_rx_bufs(n, qemu_get_be32(f));
 
@@ -1181,6 +1187,20 @@ static int virtio_net_load(QEMUFile *f, void *opaque, int version_id)
         }
     }
 
+    if (n->max_queues > 1) {
+        if (n->max_queues != qemu_get_be16(f)) {
+            error_report("virtio-net: different max_queues ");
+            return -1;
+        }
+
+        n->curr_queues = qemu_get_be16(f);
+        for (i = 1; i < n->curr_queues; i++) {
+            n->vqs[i].tx_waiting = qemu_get_be32(f);
+        }
+    }
+
+    virtio_net_set_queues(n);
+
     /* Find the first multicast entry in the saved MAC filter */
     for (i = 0; i < n->mac_table.in_use; i++) {
         if (n->mac_table.macs[i * ETH_ALEN] & 1) {
@@ -1191,7 +1211,10 @@ static int virtio_net_load(QEMUFile *f, void *opaque, int version_id)
 
     /* nc.link_down can't be migrated, so infer link_down according
      * to link status bit in n->status */
-    qemu_get_queue(n->nic)->link_down = (n->status & VIRTIO_NET_S_LINK_UP) == 0;
+    link_down = (n->status & VIRTIO_NET_S_LINK_UP) == 0;
+    for (i = 0; i < n->max_queues; i++) {
+        qemu_get_subqueue(n->nic, i)->link_down = link_down;
+    }
 
     return 0;
 }
-- 
1.7.1



* [PATCH V2 20/20] virtio-net: compat multiqueue support
  2013-01-25 10:35 [PATCH V2 00/20] Multiqueue virtio-net Jason Wang
                   ` (18 preceding siblings ...)
  2013-01-25 10:35 ` [PATCH V2 19/20] virtio-net: migration support for multiqueue Jason Wang
@ 2013-01-25 10:35 ` Jason Wang
  2013-01-28  3:27 ` [PATCH V2 00/20] Multiqueue virtio-net Wanlong Gao
  20 siblings, 0 replies; 41+ messages in thread
From: Jason Wang @ 2013-01-25 10:35 UTC (permalink / raw)
  To: mst, qemu-devel, aliguori, shajnocz
  Cc: krkumar2, kvm, mprivozn, rusty, jwhan, shiyer, gaowanlong, Jason Wang

Disable multiqueue support for machine types older than 1.4.

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 hw/pc_piix.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index 0a6923d..7bc3563 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -297,6 +297,10 @@ static QEMUMachine pc_i440fx_machine_v1_4 = {
             .driver   = "usb-tablet",\
             .property = "usb_version",\
             .value    = stringify(1),\
+        },{ \
+            .driver   = "virtio-net-pci", \
+            .property = "mq", \
+            .value    = "off", \
         }
 
 static QEMUMachine pc_machine_v1_3 = {
-- 
1.7.1



* Re: [Qemu-devel] [PATCH V2 11/20] tap: support enabling or disabling a queue
  2013-01-25 10:35 ` [PATCH V2 11/20] tap: support enabling or disabling a queue Jason Wang
@ 2013-01-25 19:13   ` Blue Swirl
  2013-01-29 13:50     ` Jason Wang
  0 siblings, 1 reply; 41+ messages in thread
From: Blue Swirl @ 2013-01-25 19:13 UTC (permalink / raw)
  To: Jason Wang
  Cc: mst, qemu-devel, aliguori, shajnocz, krkumar2, kvm, mprivozn,
	rusty, gaowanlong, jwhan, shiyer

On Fri, Jan 25, 2013 at 10:35 AM, Jason Wang <jasowang@redhat.com> wrote:
> This patch introduces a new bit, enabled, in TAPState which tracks whether a
> specific queue/fd is enabled. The tap/fd is enabled during initialization and
> can be enabled/disabled by tap_enable() and tap_disable(), which call
> platform-specific helpers to do the real work. Polling of a tap fd can only be
> done when the tap is enabled.
>
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  include/net/tap.h |    2 ++
>  net/tap-win32.c   |   10 ++++++++++
>  net/tap.c         |   43 ++++++++++++++++++++++++++++++++++++++++---
>  3 files changed, 52 insertions(+), 3 deletions(-)
>
> diff --git a/include/net/tap.h b/include/net/tap.h
> index bb7efb5..0caf8c4 100644
> --- a/include/net/tap.h
> +++ b/include/net/tap.h
> @@ -35,6 +35,8 @@ int tap_has_vnet_hdr_len(NetClientState *nc, int len);
>  void tap_using_vnet_hdr(NetClientState *nc, int using_vnet_hdr);
>  void tap_set_offload(NetClientState *nc, int csum, int tso4, int tso6, int ecn, int ufo);
>  void tap_set_vnet_hdr_len(NetClientState *nc, int len);
> +int tap_enable(NetClientState *nc);
> +int tap_disable(NetClientState *nc);
>
>  int tap_get_fd(NetClientState *nc);
>
> diff --git a/net/tap-win32.c b/net/tap-win32.c
> index 265369c..a2cd94b 100644
> --- a/net/tap-win32.c
> +++ b/net/tap-win32.c
> @@ -764,3 +764,13 @@ void tap_set_vnet_hdr_len(NetClientState *nc, int len)
>  {
>      assert(0);
>  }
> +
> +int tap_enable(NetClientState *nc)
> +{
> +    assert(0);

abort()

> +}
> +
> +int tap_disable(NetClientState *nc)
> +{
> +    assert(0);
> +}
> diff --git a/net/tap.c b/net/tap.c
> index 67080f1..95e557b 100644
> --- a/net/tap.c
> +++ b/net/tap.c
> @@ -59,6 +59,7 @@ typedef struct TAPState {
>      unsigned int write_poll : 1;
>      unsigned int using_vnet_hdr : 1;
>      unsigned int has_ufo: 1;
> +    unsigned int enabled : 1;

bool without bit field?

>      VHostNetState *vhost_net;
>      unsigned host_vnet_hdr_len;
>  } TAPState;
> @@ -72,9 +73,9 @@ static void tap_writable(void *opaque);
>  static void tap_update_fd_handler(TAPState *s)
>  {
>      qemu_set_fd_handler2(s->fd,
> -                         s->read_poll  ? tap_can_send : NULL,
> -                         s->read_poll  ? tap_send     : NULL,
> -                         s->write_poll ? tap_writable : NULL,
> +                         s->read_poll && s->enabled ? tap_can_send : NULL,
> +                         s->read_poll && s->enabled ? tap_send     : NULL,
> +                         s->write_poll && s->enabled ? tap_writable : NULL,
>                           s);
>  }
>
> @@ -339,6 +340,7 @@ static TAPState *net_tap_fd_init(NetClientState *peer,
>      s->host_vnet_hdr_len = vnet_hdr ? sizeof(struct virtio_net_hdr) : 0;
>      s->using_vnet_hdr = 0;
>      s->has_ufo = tap_probe_has_ufo(s->fd);
> +    s->enabled = 1;
>      tap_set_offload(&s->nc, 0, 0, 0, 0, 0);
>      /*
>       * Make sure host header length is set correctly in tap:
> @@ -737,3 +739,38 @@ VHostNetState *tap_get_vhost_net(NetClientState *nc)
>      assert(nc->info->type == NET_CLIENT_OPTIONS_KIND_TAP);
>      return s->vhost_net;
>  }
> +
> +int tap_enable(NetClientState *nc)
> +{
> +    TAPState *s = DO_UPCAST(TAPState, nc, nc);
> +    int ret;
> +
> +    if (s->enabled) {
> +        return 0;
> +    } else {
> +        ret = tap_fd_enable(s->fd);
> +        if (ret == 0) {
> +            s->enabled = 1;
> +            tap_update_fd_handler(s);
> +        }
> +        return ret;
> +    }
> +}
> +
> +int tap_disable(NetClientState *nc)
> +{
> +    TAPState *s = DO_UPCAST(TAPState, nc, nc);
> +    int ret;
> +
> +    if (s->enabled == 0) {
> +        return 0;
> +    } else {
> +        ret = tap_fd_disable(s->fd);
> +        if (ret == 0) {
> +            qemu_purge_queued_packets(nc);
> +            s->enabled = 0;
> +            tap_update_fd_handler(s);
> +        }
> +        return ret;
> +    }
> +}
> --
> 1.7.1
>
>
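
A short illustration of the abort() suggestion above. The reviewer's
reasoning is not spelled out in the thread, so the following is an
assumption: assert(0) expands to nothing when NDEBUG is defined, which would
leave a non-void stub falling off its end, while abort() terminates the
process in every build. unsupported_hook() and hook_aborts() are
hypothetical names used only for this sketch; the check relies on POSIX
fork()/waitpid().

```c
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a platform hook that must never be called.
 * abort() does not return, with or without NDEBUG, so no return value
 * is needed even though the function is declared to return int. */
static int unsupported_hook(void)
{
    abort();
}

/* Run the hook in a child process so the caller survives.
 * Returns 1 if the child was killed by SIGABRT, 0 if it ended any other
 * way, and -1 if fork() or waitpid() failed. */
static int hook_aborts(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        unsupported_hook();
        _exit(0); /* only reached if the hook unexpectedly returned */
    }
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFSIGNALED(status) && WTERMSIG(status) == SIGABRT;
}
```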


* Re: [PATCH V2 00/20] Multiqueue virtio-net
  2013-01-25 10:35 [PATCH V2 00/20] Multiqueue virtio-net Jason Wang
                   ` (19 preceding siblings ...)
  2013-01-25 10:35 ` [PATCH V2 20/20] virtio-net: compat multiqueue support Jason Wang
@ 2013-01-28  3:27 ` Wanlong Gao
  2013-01-28  4:24   ` [Qemu-devel] " Jason Wang
  20 siblings, 1 reply; 41+ messages in thread
From: Wanlong Gao @ 2013-01-28  3:27 UTC (permalink / raw)
  To: Jason Wang
  Cc: mst, qemu-devel, aliguori, shajnocz, krkumar2, kvm, mprivozn,
	rusty, jwhan, shiyer, Wanlong Gao

On 01/25/2013 06:35 PM, Jason Wang wrote:
> Hello all:
> 
> This series is an update of the last version of multiqueue virtio-net support.
> 
> This series brings multiqueue support to virtio-net through a
> multiqueue-capable tap backend and multiple vhost threads.
> 
> To support this, multiqueue nic support was added to qemu. This is done by
> introducing an array of NetClientStates in NICState and making each pair of
> peers a queue of the nic. This is done in patches 1-7.
> 
> Tap was also converted to be able to create a multiqueue backend. Currently,
> only Linux supports this, by issuing TUNSETIFF N times with the same device
> name to create N queues. Each fd configured by TUNSETIFF is a queue supported
> by the kernel. Three new command-line options were introduced: "queues" tells
> qemu how many queues to create; "fds" passes multiple pre-created tap file
> descriptors to qemu; "vhostfds" passes multiple pre-created vhost descriptors
> to qemu. This is done in patches 8-13.
> 
> A method of deleting a queue and a queue_index were also introduced for virtio;
> this is done in patches 14-15.
> 
> Vhost was also changed to support multiqueue by introducing a start vq index,
> which tracks the first virtqueue that will be used by vhost, instead of
> assuming that vhost always uses virtqueues from index 0. This is done in
> patch 16.
> 
> The last part is the multiqueue userspace changes; this is done in patches 17-20.
> 
> With these changes, a user can start a multiqueue virtio-net device through
> 
> ./qemu -netdev tap,id=hn0,queues=2,vhost=on -device virtio-net-pci,netdev=hn0
> 
> Management tools such as libvirt can pass multiple pre-created fds/vhostfds through
> 
> ./qemu -netdev tap,id=hn0,fds=X:Y,vhostfds=M:N -device virtio-net-pci,netdev=hn0
> 
> No git tree this round since github is unavailable in China...

It looks like github is accessible again; I can use it now.

Thanks,
Wanlong Gao


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH V2 00/20] Multiqueue virtio-net
  2013-01-28  3:27 ` [PATCH V2 00/20] Multiqueue virtio-net Wanlong Gao
@ 2013-01-28  4:24   ` Jason Wang
  2013-01-29  5:36     ` Wanlong Gao
  0 siblings, 1 reply; 41+ messages in thread
From: Jason Wang @ 2013-01-28  4:24 UTC (permalink / raw)
  To: gaowanlong
  Cc: krkumar2, aliguori, kvm, mst, mprivozn, rusty, qemu-devel,
	shajnocz, jwhan, shiyer

On 01/28/2013 11:27 AM, Wanlong Gao wrote:
> On 01/25/2013 06:35 PM, Jason Wang wrote:
>> No git tree this round since github is unavailable in China...
> I saw that github had already been opened again. I can use it.

Thanks for the reminder. I've pushed the new bits to
git://github.com/jasowang/qemu.git.
>
> Thanks,
> Wanlong Gao
>
>


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH V2 00/20] Multiqueue virtio-net
  2013-01-28  4:24   ` [Qemu-devel] " Jason Wang
@ 2013-01-29  5:36     ` Wanlong Gao
  2013-01-29  5:44       ` Jason Wang
  0 siblings, 1 reply; 41+ messages in thread
From: Wanlong Gao @ 2013-01-29  5:36 UTC (permalink / raw)
  To: Jason Wang, mst
  Cc: krkumar2, aliguori, kvm, mprivozn, rusty, qemu-devel, shajnocz,
	jwhan, shiyer, Wanlong Gao

On 01/28/2013 12:24 PM, Jason Wang wrote:
> On 01/28/2013 11:27 AM, Wanlong Gao wrote:
>> On 01/25/2013 06:35 PM, Jason Wang wrote:
>>> No git tree this round since github is unavailable in China...
>> I saw that github had already been opened again. I can use it.
> 
> Thanks for reminding, I've pushed the new bits to
> git://github.com/jasowang/qemu.git.

I got a host kernel oops using your qemu tree with a 3.8-rc5 kernel on the host:

[31499.754779] BUG: unable to handle kernel NULL pointer dereference at           (null)
[31499.757098] IP: [<ffffffff816475ef>] _raw_spin_lock_irqsave+0x1f/0x40
[31499.758304] PGD 0 
[31499.759498] Oops: 0002 [#1] SMP 
[31499.760704] Modules linked in: tcp_lp fuse xt_CHECKSUM lockd ipt_MASQUERADE sunrpc bnep bluetooth rfkill bridge stp llc iptable_nat nf_nat_ipv4 nf_nat iptable_mangle nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack snd_hda_codec_realtek snd_hda_intel snd_hda_codec vhost_net tun snd_hwdep macvtap snd_seq macvlan coretemp kvm_intel snd_seq_device kvm snd_pcm crc32c_intel r8169 snd_page_alloc snd_timer ghash_clmulni_intel snd mei iTCO_wdt mii microcode iTCO_vendor_support uinput serio_raw wmi i2c_i801 lpc_ich soundcore pcspkr mfd_core i915 video i2c_algo_bit drm_kms_helper drm i2c_core [last unloaded: ip6t_REJECT]
[31499.766412] CPU 2 
[31499.766426] Pid: 18742, comm: vhost-18728 Not tainted 3.8.0-rc5 #1 LENOVO QiTianM4300/To be filled by O.E.M.
[31499.769340] RIP: 0010:[<ffffffff816475ef>]  [<ffffffff816475ef>] _raw_spin_lock_irqsave+0x1f/0x40
[31499.770861] RSP: 0018:ffff8801b2f9dd08  EFLAGS: 00010086
[31499.772380] RAX: 0000000000000286 RBX: 0000000000000000 RCX: 0000000000000000
[31499.773916] RDX: 0000000000000100 RSI: 0000000000000286 RDI: 0000000000000000
[31499.775394] RBP: ffff8801b2f9dd08 R08: ffff880132ed4368 R09: 0000000000000000
[31499.776923] R10: 0000000000000001 R11: 0000000000000001 R12: ffff880132ed8590
[31499.778466] R13: ffff880232a6c290 R14: ffff880132ed42b0 R15: ffff880132ed0078
[31499.780012] FS:  0000000000000000(0000) GS:ffff88023fb00000(0000) knlGS:0000000000000000
[31499.781574] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[31499.783126] CR2: 0000000000000000 CR3: 0000000132d9c000 CR4: 00000000000427e0
[31499.784696] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[31499.786267] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[31499.787822] Process vhost-18728 (pid: 18742, threadinfo ffff8801b2f9c000, task ffff880036959740)
[31499.788821] Stack:
[31499.790392]  ffff8801b2f9dd38 ffffffff81082534 0000000000000000 0000000000000001
[31499.792029]  ffff880132ed0000 ffff880232a6c290 ffff8801b2f9dd48 ffffffffa023fab6
[31499.793677]  ffff8801b2f9de28 ffffffffa0242f64 ffff8801b2f9ddb8 ffffffff8109e0e0
[31499.795332] Call Trace:
[31499.796974]  [<ffffffff81082534>] remove_wait_queue+0x24/0x50
[31499.798641]  [<ffffffffa023fab6>] vhost_poll_stop+0x16/0x20 [vhost_net]
[31499.800313]  [<ffffffffa0242f64>] handle_tx+0x4c4/0x680 [vhost_net]
[31499.801995]  [<ffffffff8109e0e0>] ? idle_balance+0x1b0/0x2f0
[31499.803685]  [<ffffffffa0243155>] handle_tx_kick+0x15/0x20 [vhost_net]
[31499.805128]  [<ffffffffa023f95d>] vhost_worker+0xed/0x190 [vhost_net]
[31499.806842]  [<ffffffffa023f870>] ? vhost_work_flush+0x110/0x110 [vhost_net]
[31499.808553]  [<ffffffff81081b70>] kthread+0xc0/0xd0
[31499.810259]  [<ffffffff81010000>] ? ftrace_define_fields_xen_mc_entry+0x30/0xf0
[31499.811996]  [<ffffffff81081ab0>] ? kthread_create_on_node+0x120/0x120
[31499.813726]  [<ffffffff8164fb2c>] ret_from_fork+0x7c/0xb0
[31499.815442]  [<ffffffff81081ab0>] ? kthread_create_on_node+0x120/0x120
[31499.817168] Code: 08 61 cb ff 48 89 d0 5d c3 0f 1f 00 66 66 66 66 90 55 48 89 e5 9c 58 66 66 90 66 90 48 89 c6 fa 66 66 90 66 66 90 ba 00 01 00 00 <f0> 66 0f c1 17 0f b6 ce 38 d1 74 0e 0f 1f 44 00 00 f3 90 0f b6 
[31499.821098] RIP  [<ffffffff816475ef>] _raw_spin_lock_irqsave+0x1f/0x40
[31499.823040]  RSP <ffff8801b2f9dd08>
[31499.824976] CR2: 0000000000000000
[31499.844842] ---[ end trace b7130aab34f0ed9c ]---


After printing the values, I saw that the NULL pointer is poll->wqh in vhost_poll_stop():

[  136.616527] vhost_net: poll = ffff8802081f8578
[  136.616529] vhost_net: poll>wqh =           (null)
[  136.616530] vhost_net: &poll->wait = ffff8802081f8590
[  136.622478] Modules linked in: fuse ebtable_nat xt_CHECKSUM lockd sunrpc ipt_MASQUERADE nf_conntrack_netbios_ns bnep nf_conntrack_broadcast bluetooth bridge rfkill ip6table_mangle stp llc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat iptable_mangle nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables ip6table_filter ip6_tables snd_hda_codec_realtek snd_hda_intel vhost_net snd_hda_codec tun macvtap snd_hwdep macvlan snd_seq snd_seq_device coretemp snd_pcm kvm_intel kvm snd_page_alloc crc32c_intel snd_timer ghash_clmulni_intel snd r8169 iTCO_wdt microcode iTCO_vendor_support mei lpc_ich pcspkr mii soundcore mfd_core i2c_i801 serio_raw wmi uinput i915 video i2c_algo_bit drm_kms_helper drm i2c_core
[  136.663172]  [<ffffffffa0283afc>] vhost_poll_stop+0x5c/0x70 [vhost_net]
[  136.664880]  [<ffffffffa0286cf2>] handle_tx+0x262/0x650 [vhost_net]
[  136.668289]  [<ffffffffa0287115>] handle_tx_kick+0x15/0x20 [vhost_net]
[  136.670013]  [<ffffffffa028395d>] vhost_worker+0xed/0x190 [vhost_net]
[  136.671737]  [<ffffffffa0283870>] ? vhost_work_flush+0x110/0x110 [vhost_net]


But I don't know whether we should check poll->wqh here, or whether this is a qemu bug that causes the host kernel oops.

Thanks,
Wanlong Gao

>>
>> Thanks,
>> Wanlong Gao
>>
>>
> 
> 


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH V2 00/20] Multiqueue virtio-net
  2013-01-29  5:36     ` Wanlong Gao
@ 2013-01-29  5:44       ` Jason Wang
  0 siblings, 0 replies; 41+ messages in thread
From: Jason Wang @ 2013-01-29  5:44 UTC (permalink / raw)
  To: gaowanlong
  Cc: krkumar2, aliguori, kvm, mst, mprivozn, rusty, qemu-devel,
	shajnocz, jwhan, shiyer

On 01/29/2013 01:36 PM, Wanlong Gao wrote:
> On 01/28/2013 12:24 PM, Jason Wang wrote:
>> On 01/28/2013 11:27 AM, Wanlong Gao wrote:
>>> On 01/25/2013 06:35 PM, Jason Wang wrote:
>>>> No git tree this round since github is unavailable in China...
>>> I saw that github had already been opened again. I can use it.
>> Thanks for reminding, I've pushed the new bits to
>> git://github.com/jasowang/qemu.git.
> I got a host kernel oops using your qemu tree with a 3.8-rc5 kernel on the host.
>
>
> After printing the values, I saw that the NULL pointer is poll->wqh in vhost_poll_stop():
>
> [  136.616527] vhost_net: poll = ffff8802081f8578
> [  136.616529] vhost_net: poll>wqh =           (null)
> [  136.616530] vhost_net: &poll->wait = ffff8802081f8590
> [  136.622478] Modules linked in: fuse ebtable_nat xt_CHECKSUM lockd sunrpc ipt_MASQUERADE nf_conntrack_netbios_ns bnep nf_conntrack_broadcast bluetooth bridge rfkill ip6table_mangle stp llc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat iptable_mangle nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables ip6table_filter ip6_tables snd_hda_codec_realtek snd_hda_intel vhost_net snd_hda_codec tun macvtap snd_hwdep macvlan snd_seq snd_seq_device coretemp snd_pcm kvm_intel kvm snd_page_alloc crc32c_intel snd_timer ghash_clmulni_intel snd r8169 iTCO_wdt microcode iTCO_vendor_support mei lpc_ich pcspkr mii soundcore mfd_core i2c_i801 serio_raw wmi uinput i915 video i2c_algo_bit drm_kms_helper drm i2c_core
> [  136.663172]  [<ffffffffa0283afc>] vhost_poll_stop+0x5c/0x70 [vhost_net]
> [  136.664880]  [<ffffffffa0286cf2>] handle_tx+0x262/0x650 [vhost_net]
> [  136.668289]  [<ffffffffa0287115>] handle_tx_kick+0x15/0x20 [vhost_net]
> [  136.670013]  [<ffffffffa028395d>] vhost_worker+0xed/0x190 [vhost_net]
> [  136.671737]  [<ffffffffa0283870>] ? vhost_work_flush+0x110/0x110 [vhost_net]
>
>
> But I don't know whether we should check poll->wqh here, or whether this is a qemu bug that causes the host kernel oops.

Right, it's a bug in vhost, which should check poll->wqh and POLLERR.

I've posted the fixes in netdev:
http://marc.info/?l=linux-netdev&m=135937170929355&w=2

You can try the fixes there. It should fix your panic.
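
To illustrate the kind of check meant by "should check poll->wqh", here is a
minimal user-space sketch. It is not the posted patch: the struct and function
names below (wait_head_sketch, poll_sketch, poll_start_sketch,
poll_stop_sketch) are hypothetical stand-ins for the kernel's wait-queue types,
and the only point is that the stop path touches the wait queue only if the
start path ever registered on one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel wait-queue head. */
struct wait_head_sketch {
    int waiters;                    /* number of registered waiters */
};

/* Hypothetical stand-in for the poll state kept per virtqueue. */
struct poll_sketch {
    struct wait_head_sketch *wqh;   /* NULL until polling has started */
};

/* Start polling: register on the wait queue and remember which one. */
static void poll_start_sketch(struct poll_sketch *p,
                              struct wait_head_sketch *h)
{
    if (p->wqh != NULL) {
        return;                     /* already polling, nothing to do */
    }
    h->waiters++;
    p->wqh = h;
}

/*
 * Stop polling. If polling never started (or setup failed), wqh is still
 * NULL and there is nothing to remove; dereferencing it here would be the
 * user-space analogue of the oops reported above.
 * Returns 1 if a waiter was removed, 0 if there was nothing to do.
 */
static int poll_stop_sketch(struct poll_sketch *p)
{
    if (p->wqh == NULL) {
        return 0;
    }
    p->wqh->waiters--;
    p->wqh = NULL;
    return 1;
}
```

With this guard, calling the stop helper before any successful start is a
harmless no-op, and calling it twice in a row is safe as well.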

Thanks
>
> Thanks,
> Wanlong Gao
>
>>> Thanks,
>>> Wanlong Gao
>>>
>>>
>>

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH V2 11/20] tap: support enabling or disabling a queue
  2013-01-25 19:13   ` [Qemu-devel] " Blue Swirl
@ 2013-01-29 13:50     ` Jason Wang
  2013-01-29 20:10       ` Blue Swirl
  0 siblings, 1 reply; 41+ messages in thread
From: Jason Wang @ 2013-01-29 13:50 UTC (permalink / raw)
  To: Blue Swirl
  Cc: krkumar2, aliguori, kvm, mst, mprivozn, rusty, qemu-devel,
	shajnocz, gaowanlong, jwhan, shiyer

On 01/26/2013 03:13 AM, Blue Swirl wrote:
> On Fri, Jan 25, 2013 at 10:35 AM, Jason Wang <jasowang@redhat.com> wrote:
>> This patch introduces a new bit, enabled, in TAPState, which tracks whether a
>> specific queue/fd is enabled. The tap/fd is enabled during initialization and
>> can be enabled/disabled by tap_enable() and tap_disable(), which call
>> platform-specific helpers to do the real work. Polling of a tap fd can only be
>> done while the tap is enabled.
>>
>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>> ---
>>  include/net/tap.h |    2 ++
>>  net/tap-win32.c   |   10 ++++++++++
>>  net/tap.c         |   43 ++++++++++++++++++++++++++++++++++++++++---
>>  3 files changed, 52 insertions(+), 3 deletions(-)
>>
>> diff --git a/include/net/tap.h b/include/net/tap.h
>> index bb7efb5..0caf8c4 100644
>> --- a/include/net/tap.h
>> +++ b/include/net/tap.h
>> @@ -35,6 +35,8 @@ int tap_has_vnet_hdr_len(NetClientState *nc, int len);
>>  void tap_using_vnet_hdr(NetClientState *nc, int using_vnet_hdr);
>>  void tap_set_offload(NetClientState *nc, int csum, int tso4, int tso6, int ecn, int ufo);
>>  void tap_set_vnet_hdr_len(NetClientState *nc, int len);
>> +int tap_enable(NetClientState *nc);
>> +int tap_disable(NetClientState *nc);
>>
>>  int tap_get_fd(NetClientState *nc);
>>
>> diff --git a/net/tap-win32.c b/net/tap-win32.c
>> index 265369c..a2cd94b 100644
>> --- a/net/tap-win32.c
>> +++ b/net/tap-win32.c
>> @@ -764,3 +764,13 @@ void tap_set_vnet_hdr_len(NetClientState *nc, int len)
>>  {
>>      assert(0);
>>  }
>> +
>> +int tap_enable(NetClientState *nc)
>> +{
>> +    assert(0);
> abort()

This is just to be consistent with the rest of the helpers in this file.
>
>> +}
>> +
>> +int tap_disable(NetClientState *nc)
>> +{
>> +    assert(0);
>> +}
>> diff --git a/net/tap.c b/net/tap.c
>> index 67080f1..95e557b 100644
>> --- a/net/tap.c
>> +++ b/net/tap.c
>> @@ -59,6 +59,7 @@ typedef struct TAPState {
>>      unsigned int write_poll : 1;
>>      unsigned int using_vnet_hdr : 1;
>>      unsigned int has_ufo: 1;
>> +    unsigned int enabled : 1;
> bool without bit field?

This is also to be consistent with the other fields. If you wish, I can send
patches to convert all those bit fields to bool on top of this series.

Thanks
>>      VHostNetState *vhost_net;
>>      unsigned host_vnet_hdr_len;
>>  } TAPState;
>> @@ -72,9 +73,9 @@ static void tap_writable(void *opaque);
>>  static void tap_update_fd_handler(TAPState *s)
>>  {
>>      qemu_set_fd_handler2(s->fd,
>> -                         s->read_poll  ? tap_can_send : NULL,
>> -                         s->read_poll  ? tap_send     : NULL,
>> -                         s->write_poll ? tap_writable : NULL,
>> +                         s->read_poll && s->enabled ? tap_can_send : NULL,
>> +                         s->read_poll && s->enabled ? tap_send     : NULL,
>> +                         s->write_poll && s->enabled ? tap_writable : NULL,
>>                           s);
>>  }
>>
>> @@ -339,6 +340,7 @@ static TAPState *net_tap_fd_init(NetClientState *peer,
>>      s->host_vnet_hdr_len = vnet_hdr ? sizeof(struct virtio_net_hdr) : 0;
>>      s->using_vnet_hdr = 0;
>>      s->has_ufo = tap_probe_has_ufo(s->fd);
>> +    s->enabled = 1;
>>      tap_set_offload(&s->nc, 0, 0, 0, 0, 0);
>>      /*
>>       * Make sure host header length is set correctly in tap:
>> @@ -737,3 +739,38 @@ VHostNetState *tap_get_vhost_net(NetClientState *nc)
>>      assert(nc->info->type == NET_CLIENT_OPTIONS_KIND_TAP);
>>      return s->vhost_net;
>>  }
>> +
>> +int tap_enable(NetClientState *nc)
>> +{
>> +    TAPState *s = DO_UPCAST(TAPState, nc, nc);
>> +    int ret;
>> +
>> +    if (s->enabled) {
>> +        return 0;
>> +    } else {
>> +        ret = tap_fd_enable(s->fd);
>> +        if (ret == 0) {
>> +            s->enabled = 1;
>> +            tap_update_fd_handler(s);
>> +        }
>> +        return ret;
>> +    }
>> +}
>> +
>> +int tap_disable(NetClientState *nc)
>> +{
>> +    TAPState *s = DO_UPCAST(TAPState, nc, nc);
>> +    int ret;
>> +
>> +    if (s->enabled == 0) {
>> +        return 0;
>> +    } else {
>> +        ret = tap_fd_disable(s->fd);
>> +        if (ret == 0) {
>> +            qemu_purge_queued_packets(nc);
>> +            s->enabled = 0;
>> +            tap_update_fd_handler(s);
>> +        }
>> +        return ret;
>> +    }
>> +}
>> --
>> 1.7.1
>>
>>

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH V2 14/20] vhost: multiqueue support
  2013-01-25 10:35 ` [PATCH V2 14/20] vhost: " Jason Wang
@ 2013-01-29 13:53   ` Jason Wang
  0 siblings, 0 replies; 41+ messages in thread
From: Jason Wang @ 2013-01-29 13:53 UTC (permalink / raw)
  To: Jason Wang
  Cc: mst, qemu-devel, aliguori, shajnocz, krkumar2, kvm, mprivozn,
	rusty, jwhan, shiyer, gaowanlong

On 01/25/2013 06:35 PM, Jason Wang wrote:
> This patch lets vhost support multiqueue. The idea is simple: launch
> multiple vhost threads and let each vhost thread process a subset of
> the virtqueues of the device. After this change, each emulated device can have
> multiple vhost threads as its backend.
>
> To do this, a virtqueue index was introduced to record the first virtqueue that
> will be handled by this vhost_net device. Based on this and nvqs, vhost can
> calculate its relative index to set up the vhost_net device.
>
> Since we may have many vhost/net devices for a virtio-net device, the setting of
> guest notifiers was moved out of the starting/stopping of a specific vhost
> thread. vhost_net_{start|stop}() were renamed to
> vhost_net_{start|stop}_one(), and a new vhost_net_{start|stop}() was introduced
> to configure the guest notifiers and start/stop all vhost/vhost_net devices.
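
The index arithmetic described above can be sketched in isolation. This is a
simplified illustration, not code from the patch: vhost_dev_sketch,
vhost_owns_queue() and vhost_relative_index() are hypothetical names, and the
struct keeps only the two fields (vq_index and nvqs) that the translation
needs.

```c
#include <assert.h>

/* Hypothetical, trimmed-down stand-in for struct vhost_dev. */
struct vhost_dev_sketch {
    int vq_index;   /* first device-wide virtqueue index owned by this dev */
    int nvqs;       /* number of virtqueues owned by this dev */
};

/* Return 1 if the device-wide queue index idx belongs to this vhost dev. */
static int vhost_owns_queue(const struct vhost_dev_sketch *dev, int idx)
{
    return idx >= dev->vq_index && idx < dev->vq_index + dev->nvqs;
}

/*
 * Translate a device-wide queue index into the index the vhost device
 * itself uses, which always runs from 0 to nvqs - 1.
 */
static int vhost_relative_index(const struct vhost_dev_sketch *dev, int idx)
{
    assert(vhost_owns_queue(dev, idx));
    return idx - dev->vq_index;
}
```

As an example, assume (purely for illustration) that each vhost device owns one
rx/tx pair, so nvqs is 2. The second device then has vq_index 2, and the
device-wide virtqueue 3 is its relative virtqueue 1.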
>
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  hw/vhost.c      |   82 +++++++++++++++++++++---------------------------
>  hw/vhost.h      |    2 +
>  hw/vhost_net.c  |   92 ++++++++++++++++++++++++++++++++++++++++++++++++++-----
>  hw/vhost_net.h  |    6 ++-
>  hw/virtio-net.c |    4 +-
>  5 files changed, 128 insertions(+), 58 deletions(-)
>
> diff --git a/hw/vhost.c b/hw/vhost.c
> index cee8aad..38257b9 100644
> --- a/hw/vhost.c
> +++ b/hw/vhost.c
> @@ -619,14 +619,17 @@ static int vhost_virtqueue_start(struct vhost_dev *dev,
>  {
>      hwaddr s, l, a;
>      int r;
> +    int vhost_vq_index = idx - dev->vq_index;
>      struct vhost_vring_file file = {
> -        .index = idx,
> +        .index = vhost_vq_index
>      };
>      struct vhost_vring_state state = {
> -        .index = idx,
> +        .index = vhost_vq_index
>      };
>      struct VirtQueue *vvq = virtio_get_queue(vdev, idx);
>  
> +    assert(idx >= dev->vq_index && idx < dev->vq_index + dev->nvqs);
> +
>      vq->num = state.num = virtio_queue_get_num(vdev, idx);
>      r = ioctl(dev->control, VHOST_SET_VRING_NUM, &state);
>      if (r) {
> @@ -669,11 +672,12 @@ static int vhost_virtqueue_start(struct vhost_dev *dev,
>          goto fail_alloc_ring;
>      }
>  
> -    r = vhost_virtqueue_set_addr(dev, vq, idx, dev->log_enabled);
> +    r = vhost_virtqueue_set_addr(dev, vq, vhost_vq_index, dev->log_enabled);
>      if (r < 0) {
>          r = -errno;
>          goto fail_alloc;
>      }
> +
>      file.fd = event_notifier_get_fd(virtio_queue_get_host_notifier(vvq));
>      r = ioctl(dev->control, VHOST_SET_VRING_KICK, &file);
>      if (r) {
> @@ -709,9 +713,10 @@ static void vhost_virtqueue_stop(struct vhost_dev *dev,
>                                      unsigned idx)
>  {
>      struct vhost_vring_state state = {
> -        .index = idx,
> +        .index = idx - dev->vq_index
>      };
>      int r;
> +    assert(idx >= dev->vq_index && idx < dev->vq_index + dev->nvqs);
>      r = ioctl(dev->control, VHOST_GET_VRING_BASE, &state);
>      if (r < 0) {
>          fprintf(stderr, "vhost VQ %d ring restore failed: %d\n", idx, r);
> @@ -867,7 +872,9 @@ int vhost_dev_enable_notifiers(struct vhost_dev *hdev, VirtIODevice *vdev)
>      }
>  
>      for (i = 0; i < hdev->nvqs; ++i) {
> -        r = vdev->binding->set_host_notifier(vdev->binding_opaque, i, true);
> +        r = vdev->binding->set_host_notifier(vdev->binding_opaque,
> +                                             hdev->vq_index + i,
> +                                             true);
>          if (r < 0) {
>              fprintf(stderr, "vhost VQ %d notifier binding failed: %d\n", i, -r);
>              goto fail_vq;
> @@ -877,7 +884,9 @@ int vhost_dev_enable_notifiers(struct vhost_dev *hdev, VirtIODevice *vdev)
>      return 0;
>  fail_vq:
>      while (--i >= 0) {
> -        r = vdev->binding->set_host_notifier(vdev->binding_opaque, i, false);
> +        r = vdev->binding->set_host_notifier(vdev->binding_opaque,
> +                                             hdev->vq_index + i,
> +                                             false);
>          if (r < 0) {
>              fprintf(stderr, "vhost VQ %d notifier cleanup error: %d\n", i, -r);
>              fflush(stderr);
> @@ -898,7 +907,9 @@ void vhost_dev_disable_notifiers(struct vhost_dev *hdev, VirtIODevice *vdev)
>      int i, r;
>  
>      for (i = 0; i < hdev->nvqs; ++i) {
> -        r = vdev->binding->set_host_notifier(vdev->binding_opaque, i, false);
> +        r = vdev->binding->set_host_notifier(vdev->binding_opaque,
> +                                             hdev->vq_index + i,
> +                                             false);
>          if (r < 0) {
>              fprintf(stderr, "vhost VQ %d notifier cleanup failed: %d\n", i, -r);
>              fflush(stderr);
> @@ -912,8 +923,9 @@ void vhost_dev_disable_notifiers(struct vhost_dev *hdev, VirtIODevice *vdev)
>   */
>  bool vhost_virtqueue_pending(struct vhost_dev *hdev, int n)
>  {
> -    struct vhost_virtqueue *vq = hdev->vqs + n;
> +    struct vhost_virtqueue *vq = hdev->vqs + n - hdev->vq_index;
>      assert(hdev->started);
> +    assert(n >= hdev->vq_index && n < hdev->vq_index + hdev->nvqs);
>      return event_notifier_test_and_clear(&vq->masked_notifier);
>  }
>  
> @@ -922,15 +934,16 @@ void vhost_virtqueue_mask(struct vhost_dev *hdev, VirtIODevice *vdev, int n,
>                           bool mask)
>  {
>      struct VirtQueue *vvq = virtio_get_queue(vdev, n);
> -    int r;
> +    int r, index = n - hdev->vq_index;
>  
>      assert(hdev->started);
> +    assert(n >= hdev->vq_index && n < hdev->vq_index + hdev->nvqs);
>  
>      struct vhost_vring_file file = {
> -        .index = n,
> +        .index = index
>      };
>      if (mask) {
> -        file.fd = event_notifier_get_fd(&hdev->vqs[n].masked_notifier);
> +        file.fd = event_notifier_get_fd(&hdev->vqs[index].masked_notifier);
>      } else {
>          file.fd = event_notifier_get_fd(virtio_queue_get_guest_notifier(vvq));
>      }
> @@ -945,20 +958,6 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
>  
>      hdev->started = true;
>  
> -    if (!vdev->binding->set_guest_notifiers) {
> -        fprintf(stderr, "binding does not support guest notifiers\n");
> -        r = -ENOSYS;
> -        goto fail;
> -    }
> -
> -    r = vdev->binding->set_guest_notifiers(vdev->binding_opaque,
> -                                           hdev->nvqs,
> -                                           true);
> -    if (r < 0) {
> -        fprintf(stderr, "Error binding guest notifier: %d\n", -r);
> -        goto fail_notifiers;
> -    }
> -
>      r = vhost_dev_set_features(hdev, hdev->log_enabled);
>      if (r < 0) {
>          goto fail_features;
> @@ -970,9 +969,9 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
>      }
>      for (i = 0; i < hdev->nvqs; ++i) {
>          r = vhost_virtqueue_start(hdev,
> -                                 vdev,
> -                                 hdev->vqs + i,
> -                                 i);
> +                                  vdev,
> +                                  hdev->vqs + i,
> +                                  hdev->vq_index + i);
>          if (r < 0) {
>              goto fail_vq;
>          }
> @@ -995,15 +994,13 @@ fail_log:
>  fail_vq:
>      while (--i >= 0) {
>          vhost_virtqueue_stop(hdev,
> -                                vdev,
> -                                hdev->vqs + i,
> -                                i);
> +                             vdev,
> +                             hdev->vqs + i,
> +                             hdev->vq_index + i);
>      }
> +    i = hdev->nvqs;
>  fail_mem:
>  fail_features:
> -    vdev->binding->set_guest_notifiers(vdev->binding_opaque, hdev->nvqs, false);
> -fail_notifiers:
> -fail:
>  
>      hdev->started = false;
>      return r;
> @@ -1012,29 +1009,22 @@ fail:
>  /* Host notifiers must be enabled at this point. */
>  void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev)
>  {
> -    int i, r;
> +    int i;
>  
>      for (i = 0; i < hdev->nvqs; ++i) {
>          vhost_virtqueue_stop(hdev,
> -                                vdev,
> -                                hdev->vqs + i,
> -                                i);
> +                             vdev,
> +                             hdev->vqs + i,
> +                             hdev->vq_index + i);
>      }
>      for (i = 0; i < hdev->n_mem_sections; ++i) {
>          vhost_sync_dirty_bitmap(hdev, &hdev->mem_sections[i],
>                                  0, (hwaddr)~0x0ull);
>      }
> -    r = vdev->binding->set_guest_notifiers(vdev->binding_opaque,
> -                                           hdev->nvqs,
> -                                           false);
> -    if (r < 0) {
> -        fprintf(stderr, "vhost guest notifier cleanup failed: %d\n", r);
> -        fflush(stderr);
> -    }
> -    assert (r >= 0);
>  
>      hdev->started = false;
>      g_free(hdev->log);
>      hdev->log = NULL;
>      hdev->log_size = 0;
>  }
> +
> diff --git a/hw/vhost.h b/hw/vhost.h
> index 44c61a5..f062d48 100644
> --- a/hw/vhost.h
> +++ b/hw/vhost.h
> @@ -35,6 +35,8 @@ struct vhost_dev {
>      MemoryRegionSection *mem_sections;
>      struct vhost_virtqueue *vqs;
>      int nvqs;
> +    /* the first virtqueue which would be used by this vhost dev */
> +    int vq_index;
>      unsigned long long features;
>      unsigned long long acked_features;
>      unsigned long long backend_features;
> diff --git a/hw/vhost_net.c b/hw/vhost_net.c
> index d3a04ca..c955611 100644
> --- a/hw/vhost_net.c
> +++ b/hw/vhost_net.c
> @@ -140,12 +140,21 @@ bool vhost_net_query(VHostNetState *net, VirtIODevice *dev)
>      return vhost_dev_query(&net->dev, dev);
>  }
>  
> -int vhost_net_start(struct vhost_net *net,
> -                    VirtIODevice *dev)
> +static int vhost_net_start_one(struct vhost_net *net,
> +                               VirtIODevice *dev,
> +                               int vq_index)
>  {
>      struct vhost_vring_file file = { };
>      int r;
>  
> +    if (net->dev.started) {
> +        return 0;
> +    }
> +
> +    net->dev.nvqs = 2;
> +    net->dev.vqs = net->vqs;
> +    net->dev.vq_index = vq_index;
> +
>      r = vhost_dev_enable_notifiers(&net->dev, dev);
>      if (r < 0) {
>          goto fail_notifiers;
> @@ -181,11 +190,15 @@ fail_notifiers:
>      return r;
>  }
>  
> -void vhost_net_stop(struct vhost_net *net,
> -                    VirtIODevice *dev)
> +static void vhost_net_stop_one(struct vhost_net *net,
> +                               VirtIODevice *dev)
>  {
>      struct vhost_vring_file file = { .fd = -1 };
>  
> +    if (!net->dev.started) {
> +        return;
> +    }
> +
>      for (file.index = 0; file.index < net->dev.nvqs; ++file.index) {
>          int r = ioctl(net->dev.control, VHOST_NET_SET_BACKEND, &file);
>          assert(r >= 0);
> @@ -195,6 +208,65 @@ void vhost_net_stop(struct vhost_net *net,
>      vhost_dev_disable_notifiers(&net->dev, dev);
>  }
>  
> +int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
> +                    int start_queues, int total_queues)
> +{
> +    int r, i = 0;
> +
> +    if (!dev->binding->set_guest_notifiers) {
> +        error_report("binding does not support guest notifiers\n");
> +        r = -ENOSYS;
> +        goto err;
> +    }
> +
> +    for (i = start_queues; i < total_queues; i++) {
> +        vhost_net_stop_one(tap_get_vhost_net(ncs[i].peer), dev);
> +    }
> +

Since the kernel will support polling/writing when detached, there's no need
to stop the vhost threads that are polling the disabled queues here. This
can further simplify the interface between virtio-net and vhost.

Will send a new version.
> +    for (i = 0; i < start_queues; i++) {
> +        r = vhost_net_start_one(tap_get_vhost_net(ncs[i].peer), dev, i * 2);
> +
> +        if (r < 0) {
> +            goto err;
> +        }
> +    }
> +
> +    r = dev->binding->set_guest_notifiers(dev->binding_opaque,
> +                                          start_queues * 2,
> +                                          true);
> +    if (r < 0) {
> +        error_report("Error binding guest notifier: %d\n", -r);
> +        goto err;
> +    }
> +
> +    return 0;
> +
> +err:
> +    while (--i >= 0) {
> +        vhost_net_stop_one(tap_get_vhost_net(ncs[i].peer), dev);
> +    }
> +    return r;
> +}
> +
> +void vhost_net_stop(VirtIODevice *dev, NetClientState *ncs,
> +                    int start_queues, int total_queues)
> +{
> +    int i, r;
> +
> +    r = dev->binding->set_guest_notifiers(dev->binding_opaque,
> +                                          start_queues * 2,
> +                                          false);
> +    if (r < 0) {
> +        fprintf(stderr, "vhost guest notifier cleanup failed: %d\n", r);
> +        fflush(stderr);
> +    }
> +    assert(r >= 0);
> +
> +    for (i = 0; i < total_queues; i++) {
> +        vhost_net_stop_one(tap_get_vhost_net(ncs[i].peer), dev);
> +    }
> +}
> +
>  void vhost_net_cleanup(struct vhost_net *net)
>  {
>      vhost_dev_cleanup(&net->dev);
> @@ -224,13 +296,17 @@ bool vhost_net_query(VHostNetState *net, VirtIODevice *dev)
>      return false;
>  }
>  
> -int vhost_net_start(struct vhost_net *net,
> -		    VirtIODevice *dev)
> +int vhost_net_start(VirtIODevice *dev,
> +                    NetClientState *ncs,
> +                    int start_queues,
> +                    int total_queues)
>  {
>      return -ENOSYS;
>  }
> -void vhost_net_stop(struct vhost_net *net,
> -		    VirtIODevice *dev)
> +void vhost_net_stop(VirtIODevice *dev,
> +                    NetClientState *ncs,
> +                    int start_queues,
> +                    int total_queues)
>  {
>  }
>  
> diff --git a/hw/vhost_net.h b/hw/vhost_net.h
> index 88912b8..9fbd79d 100644
> --- a/hw/vhost_net.h
> +++ b/hw/vhost_net.h
> @@ -9,8 +9,10 @@ typedef struct vhost_net VHostNetState;
>  VHostNetState *vhost_net_init(NetClientState *backend, int devfd, bool force);
>  
>  bool vhost_net_query(VHostNetState *net, VirtIODevice *dev);
> -int vhost_net_start(VHostNetState *net, VirtIODevice *dev);
> -void vhost_net_stop(VHostNetState *net, VirtIODevice *dev);
> +int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
> +                    int start_queues, int total_queues);
> +void vhost_net_stop(VirtIODevice *dev, NetClientState *ncs,
> +                    int start_queues, int total_queues);
>  
>  void vhost_net_cleanup(VHostNetState *net);
>  
> diff --git a/hw/virtio-net.c b/hw/virtio-net.c
> index 47f4ab4..2f49fd8 100644
> --- a/hw/virtio-net.c
> +++ b/hw/virtio-net.c
> @@ -129,14 +129,14 @@ static void virtio_net_vhost_status(VirtIONet *n, uint8_t status)
>              return;
>          }
>          n->vhost_started = 1;
> -        r = vhost_net_start(tap_get_vhost_net(nc->peer), &n->vdev);
> +        r = vhost_net_start(&n->vdev, nc, 1, 1);
>          if (r < 0) {
>              error_report("unable to start vhost net: %d: "
>                           "falling back on userspace virtio", -r);
>              n->vhost_started = 0;
>          }
>      } else {
> -        vhost_net_stop(tap_get_vhost_net(nc->peer), &n->vdev);
> +        vhost_net_stop(&n->vdev, nc, 1, 1);
>          n->vhost_started = 0;
>      }
>  }
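
For readers following the index arithmetic in this patch, here is a minimal sketch of how a device-wide virtqueue index maps to an entry in a vhost device's own vqs array once vq_index is introduced. The struct and every demo_* name are hypothetical stand-ins for illustration, not QEMU code; only the arithmetic (n - vq_index, the bounds check, and i * 2 per queue pair) is taken from the hunks above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, trimmed-down stand-in for struct vhost_dev. */
struct demo_vhost_dev {
    int nvqs;      /* number of virtqueues owned by this vhost device */
    int vq_index;  /* first device-wide virtqueue index it owns */
};

/* True if device-wide index n falls inside this vhost device's window. */
bool demo_owns_vq(const struct demo_vhost_dev *hdev, int n)
{
    return n >= hdev->vq_index && n < hdev->vq_index + hdev->nvqs;
}

/* Translate a device-wide virtqueue index into an index into hdev->vqs. */
int demo_local_index(const struct demo_vhost_dev *hdev, int n)
{
    assert(demo_owns_vq(hdev, n));
    return n - hdev->vq_index;
}

/* Queue pair q starts at virtqueue 2q, matching the i * 2 passed above. */
int demo_pair_first_vq(int pair)
{
    return pair * 2;
}
```

With nvqs = 2 and the second queue pair (pair 1), the window covers device-wide indexes 2 and 3, which map to local indexes 0 and 1; anything outside that window trips the assertion, as in vhost_virtqueue_pending() and vhost_virtqueue_mask().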


* Re: [PATCH V2 11/20] tap: support enabling or disabling a queue
  2013-01-29 13:50     ` Jason Wang
@ 2013-01-29 20:10       ` Blue Swirl
  2013-01-29 22:11         ` [Qemu-devel] " Michael S. Tsirkin
  0 siblings, 1 reply; 41+ messages in thread
From: Blue Swirl @ 2013-01-29 20:10 UTC (permalink / raw)
  To: Jason Wang
  Cc: krkumar2, aliguori, kvm, mst, mprivozn, rusty, qemu-devel,
	shajnocz, gaowanlong, jwhan, shiyer

On Tue, Jan 29, 2013 at 1:50 PM, Jason Wang <jasowang@redhat.com> wrote:
> On 01/26/2013 03:13 AM, Blue Swirl wrote:
>> On Fri, Jan 25, 2013 at 10:35 AM, Jason Wang <jasowang@redhat.com> wrote:
>>> This patch introduces a new bit - enabled in TAPState which tracks whether a
>>> specific queue/fd is enabled. The tap/fd is enabled during initialization and
>>> could be enabled/disabled by tap_enable() and tap_disable() which call platform
>>> specific helpers to do the real work. Polling of a tap fd can only be done when
>>> the tap is enabled.
>>>
>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>> ---
>>>  include/net/tap.h |    2 ++
>>>  net/tap-win32.c   |   10 ++++++++++
>>>  net/tap.c         |   43 ++++++++++++++++++++++++++++++++++++++++---
>>>  3 files changed, 52 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/include/net/tap.h b/include/net/tap.h
>>> index bb7efb5..0caf8c4 100644
>>> --- a/include/net/tap.h
>>> +++ b/include/net/tap.h
>>> @@ -35,6 +35,8 @@ int tap_has_vnet_hdr_len(NetClientState *nc, int len);
>>>  void tap_using_vnet_hdr(NetClientState *nc, int using_vnet_hdr);
>>>  void tap_set_offload(NetClientState *nc, int csum, int tso4, int tso6, int ecn, int ufo);
>>>  void tap_set_vnet_hdr_len(NetClientState *nc, int len);
>>> +int tap_enable(NetClientState *nc);
>>> +int tap_disable(NetClientState *nc);
>>>
>>>  int tap_get_fd(NetClientState *nc);
>>>
>>> diff --git a/net/tap-win32.c b/net/tap-win32.c
>>> index 265369c..a2cd94b 100644
>>> --- a/net/tap-win32.c
>>> +++ b/net/tap-win32.c
>>> @@ -764,3 +764,13 @@ void tap_set_vnet_hdr_len(NetClientState *nc, int len)
>>>  {
>>>      assert(0);
>>>  }
>>> +
>>> +int tap_enable(NetClientState *nc)
>>> +{
>>> +    assert(0);
>> abort()
>
> This is just to be consistent with the rest of the helpers in this file.
>>
>>> +}
>>> +
>>> +int tap_disable(NetClientState *nc)
>>> +{
>>> +    assert(0);
>>> +}
>>> diff --git a/net/tap.c b/net/tap.c
>>> index 67080f1..95e557b 100644
>>> --- a/net/tap.c
>>> +++ b/net/tap.c
>>> @@ -59,6 +59,7 @@ typedef struct TAPState {
>>>      unsigned int write_poll : 1;
>>>      unsigned int using_vnet_hdr : 1;
>>>      unsigned int has_ufo: 1;
>>> +    unsigned int enabled : 1;
>> bool without bit field?
>
> Also to be consistent with the other fields. If you wish, I can send patches
> to convert all those bit fields to bool on top of this series.

That would be nice, likewise for the assert(0).

>
> Thanks
>>>      VHostNetState *vhost_net;
>>>      unsigned host_vnet_hdr_len;
>>>  } TAPState;
>>> @@ -72,9 +73,9 @@ static void tap_writable(void *opaque);
>>>  static void tap_update_fd_handler(TAPState *s)
>>>  {
>>>      qemu_set_fd_handler2(s->fd,
>>> -                         s->read_poll  ? tap_can_send : NULL,
>>> -                         s->read_poll  ? tap_send     : NULL,
>>> -                         s->write_poll ? tap_writable : NULL,
>>> +                         s->read_poll && s->enabled ? tap_can_send : NULL,
>>> +                         s->read_poll && s->enabled ? tap_send     : NULL,
>>> +                         s->write_poll && s->enabled ? tap_writable : NULL,
>>>                           s);
>>>  }
>>>
>>> @@ -339,6 +340,7 @@ static TAPState *net_tap_fd_init(NetClientState *peer,
>>>      s->host_vnet_hdr_len = vnet_hdr ? sizeof(struct virtio_net_hdr) : 0;
>>>      s->using_vnet_hdr = 0;
>>>      s->has_ufo = tap_probe_has_ufo(s->fd);
>>> +    s->enabled = 1;
>>>      tap_set_offload(&s->nc, 0, 0, 0, 0, 0);
>>>      /*
>>>       * Make sure host header length is set correctly in tap:
>>> @@ -737,3 +739,38 @@ VHostNetState *tap_get_vhost_net(NetClientState *nc)
>>>      assert(nc->info->type == NET_CLIENT_OPTIONS_KIND_TAP);
>>>      return s->vhost_net;
>>>  }
>>> +
>>> +int tap_enable(NetClientState *nc)
>>> +{
>>> +    TAPState *s = DO_UPCAST(TAPState, nc, nc);
>>> +    int ret;
>>> +
>>> +    if (s->enabled) {
>>> +        return 0;
>>> +    } else {
>>> +        ret = tap_fd_enable(s->fd);
>>> +        if (ret == 0) {
>>> +            s->enabled = 1;
>>> +            tap_update_fd_handler(s);
>>> +        }
>>> +        return ret;
>>> +    }
>>> +}
>>> +
>>> +int tap_disable(NetClientState *nc)
>>> +{
>>> +    TAPState *s = DO_UPCAST(TAPState, nc, nc);
>>> +    int ret;
>>> +
>>> +    if (s->enabled == 0) {
>>> +        return 0;
>>> +    } else {
>>> +        ret = tap_fd_disable(s->fd);
>>> +        if (ret == 0) {
>>> +            qemu_purge_queued_packets(nc);
>>> +            s->enabled = 0;
>>> +            tap_update_fd_handler(s);
>>> +        }
>>> +        return ret;
>>> +    }
>>> +}
>>> --
>>> 1.7.1
>>>
>>>
>
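
To make the two review points above concrete (bool instead of one-bit fields, abort() instead of assert(0)), here is a minimal sketch under stated assumptions. The struct and all demo_* names are hypothetical and are not the TAPState definition or the tap-win32.c stubs. The points it illustrates are standard C behaviour: assert() compiles to nothing when NDEBUG is defined, so an int function whose body is only assert(0) would then fall off its end without returning a value, whereas abort() never returns; and converting a nonzero int to bool yields true, whereas a one-bit unsigned field keeps only the low bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the flag portion of TAPState, using bool. */
struct demo_tap_flags {
    bool using_vnet_hdr;
    bool has_ufo;
    bool enabled;
};

/*
 * Stub for a platform without queue enable support. abort() does not
 * return and is not affected by NDEBUG, so this function never reaches
 * its end without a return value.
 */
int demo_tap_enable_unsupported(void)
{
    abort();
}

/* Initial state mirroring net_tap_fd_init(): vnet header off, queue enabled. */
void demo_tap_flags_init(struct demo_tap_flags *f, int ufo_probe_result)
{
    assert(f != NULL);
    f->using_vnet_hdr = false;
    /* Any nonzero probe result becomes true; a 1-bit field would keep bit 0 only. */
    f->has_ufo = ufo_probe_result;
    f->enabled = true;
}
```

A probe result of 2 leaves has_ufo true here, while the same assignment to an `unsigned int has_ufo : 1` field would store 0.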

* Re: [Qemu-devel] [PATCH V2 11/20] tap: support enabling or disabling a queue
  2013-01-29 20:10       ` Blue Swirl
@ 2013-01-29 22:11         ` Michael S. Tsirkin
  2013-01-29 22:55           ` Anthony Liguori
  0 siblings, 1 reply; 41+ messages in thread
From: Michael S. Tsirkin @ 2013-01-29 22:11 UTC (permalink / raw)
  To: Blue Swirl
  Cc: Jason Wang, krkumar2, aliguori, kvm, mprivozn, rusty, qemu-devel,
	shajnocz, shiyer, jwhan, gaowanlong

On Tue, Jan 29, 2013 at 08:10:26PM +0000, Blue Swirl wrote:
> On Tue, Jan 29, 2013 at 1:50 PM, Jason Wang <jasowang@redhat.com> wrote:
> > On 01/26/2013 03:13 AM, Blue Swirl wrote:
> >> On Fri, Jan 25, 2013 at 10:35 AM, Jason Wang <jasowang@redhat.com> wrote:
> >>> This patch introduces a new bit - enabled in TAPState which tracks whether a
> >>> specific queue/fd is enabled. The tap/fd is enabled during initialization and
> >>> could be enabled/disabled by tap_enable() and tap_disable() which call platform
> >>> specific helpers to do the real work. Polling of a tap fd can only be done when
> >>> the tap is enabled.
> >>>
> >>> Signed-off-by: Jason Wang <jasowang@redhat.com>
> >>> ---
> >>>  include/net/tap.h |    2 ++
> >>>  net/tap-win32.c   |   10 ++++++++++
> >>>  net/tap.c         |   43 ++++++++++++++++++++++++++++++++++++++++---
> >>>  3 files changed, 52 insertions(+), 3 deletions(-)
> >>>
> >>> diff --git a/include/net/tap.h b/include/net/tap.h
> >>> index bb7efb5..0caf8c4 100644
> >>> --- a/include/net/tap.h
> >>> +++ b/include/net/tap.h
> >>> @@ -35,6 +35,8 @@ int tap_has_vnet_hdr_len(NetClientState *nc, int len);
> >>>  void tap_using_vnet_hdr(NetClientState *nc, int using_vnet_hdr);
> >>>  void tap_set_offload(NetClientState *nc, int csum, int tso4, int tso6, int ecn, int ufo);
> >>>  void tap_set_vnet_hdr_len(NetClientState *nc, int len);
> >>> +int tap_enable(NetClientState *nc);
> >>> +int tap_disable(NetClientState *nc);
> >>>
> >>>  int tap_get_fd(NetClientState *nc);
> >>>
> >>> diff --git a/net/tap-win32.c b/net/tap-win32.c
> >>> index 265369c..a2cd94b 100644
> >>> --- a/net/tap-win32.c
> >>> +++ b/net/tap-win32.c
> >>> @@ -764,3 +764,13 @@ void tap_set_vnet_hdr_len(NetClientState *nc, int len)
> >>>  {
> >>>      assert(0);
> >>>  }
> >>> +
> >>> +int tap_enable(NetClientState *nc)
> >>> +{
> >>> +    assert(0);
> >> abort()
> >
> > This is just to be consistent with the rest of the helpers in this file.
> >>
> >>> +}
> >>> +
> >>> +int tap_disable(NetClientState *nc)
> >>> +{
> >>> +    assert(0);
> >>> +}
> >>> diff --git a/net/tap.c b/net/tap.c
> >>> index 67080f1..95e557b 100644
> >>> --- a/net/tap.c
> >>> +++ b/net/tap.c
> >>> @@ -59,6 +59,7 @@ typedef struct TAPState {
> >>>      unsigned int write_poll : 1;
> >>>      unsigned int using_vnet_hdr : 1;
> >>>      unsigned int has_ufo: 1;
> >>> +    unsigned int enabled : 1;
> >> bool without bit field?
> >
> > Also to be consistent with the other fields. If you wish, I can send patches
> > to convert all those bit fields to bool on top of this series.
> 
> That would be nice, likewise for the assert(0).

OK so let's go ahead with this patchset as is,
and a cleanup patch will be sent after 1.4 then.


> >
> > Thanks
> >>>      VHostNetState *vhost_net;
> >>>      unsigned host_vnet_hdr_len;
> >>>  } TAPState;
> >>> @@ -72,9 +73,9 @@ static void tap_writable(void *opaque);
> >>>  static void tap_update_fd_handler(TAPState *s)
> >>>  {
> >>>      qemu_set_fd_handler2(s->fd,
> >>> -                         s->read_poll  ? tap_can_send : NULL,
> >>> -                         s->read_poll  ? tap_send     : NULL,
> >>> -                         s->write_poll ? tap_writable : NULL,
> >>> +                         s->read_poll && s->enabled ? tap_can_send : NULL,
> >>> +                         s->read_poll && s->enabled ? tap_send     : NULL,
> >>> +                         s->write_poll && s->enabled ? tap_writable : NULL,
> >>>                           s);
> >>>  }
> >>>
> >>> @@ -339,6 +340,7 @@ static TAPState *net_tap_fd_init(NetClientState *peer,
> >>>      s->host_vnet_hdr_len = vnet_hdr ? sizeof(struct virtio_net_hdr) : 0;
> >>>      s->using_vnet_hdr = 0;
> >>>      s->has_ufo = tap_probe_has_ufo(s->fd);
> >>> +    s->enabled = 1;
> >>>      tap_set_offload(&s->nc, 0, 0, 0, 0, 0);
> >>>      /*
> >>>       * Make sure host header length is set correctly in tap:
> >>> @@ -737,3 +739,38 @@ VHostNetState *tap_get_vhost_net(NetClientState *nc)
> >>>      assert(nc->info->type == NET_CLIENT_OPTIONS_KIND_TAP);
> >>>      return s->vhost_net;
> >>>  }
> >>> +
> >>> +int tap_enable(NetClientState *nc)
> >>> +{
> >>> +    TAPState *s = DO_UPCAST(TAPState, nc, nc);
> >>> +    int ret;
> >>> +
> >>> +    if (s->enabled) {
> >>> +        return 0;
> >>> +    } else {
> >>> +        ret = tap_fd_enable(s->fd);
> >>> +        if (ret == 0) {
> >>> +            s->enabled = 1;
> >>> +            tap_update_fd_handler(s);
> >>> +        }
> >>> +        return ret;
> >>> +    }
> >>> +}
> >>> +
> >>> +int tap_disable(NetClientState *nc)
> >>> +{
> >>> +    TAPState *s = DO_UPCAST(TAPState, nc, nc);
> >>> +    int ret;
> >>> +
> >>> +    if (s->enabled == 0) {
> >>> +        return 0;
> >>> +    } else {
> >>> +        ret = tap_fd_disable(s->fd);
> >>> +        if (ret == 0) {
> >>> +            qemu_purge_queued_packets(nc);
> >>> +            s->enabled = 0;
> >>> +            tap_update_fd_handler(s);
> >>> +        }
> >>> +        return ret;
> >>> +    }
> >>> +}
> >>> --
> >>> 1.7.1
> >>>
> >>>
> >

* Re: [Qemu-devel] [PATCH V2 11/20] tap: support enabling or disabling a queue
  2013-01-29 22:11         ` [Qemu-devel] " Michael S. Tsirkin
@ 2013-01-29 22:55           ` Anthony Liguori
  2013-01-29 23:03             ` Michael S. Tsirkin
  0 siblings, 1 reply; 41+ messages in thread
From: Anthony Liguori @ 2013-01-29 22:55 UTC (permalink / raw)
  To: Michael S. Tsirkin, Blue Swirl
  Cc: Jason Wang, krkumar2, kvm, mprivozn, rusty, qemu-devel, shajnocz,
	shiyer, jwhan, gaowanlong

"Michael S. Tsirkin" <mst@redhat.com> writes:

> On Tue, Jan 29, 2013 at 08:10:26PM +0000, Blue Swirl wrote:
>> On Tue, Jan 29, 2013 at 1:50 PM, Jason Wang <jasowang@redhat.com> wrote:
>> > On 01/26/2013 03:13 AM, Blue Swirl wrote:
>> >> On Fri, Jan 25, 2013 at 10:35 AM, Jason Wang <jasowang@redhat.com> wrote:
>> >>> This patch introduces a new bit - enabled in TAPState which tracks whether a
>> >>> specific queue/fd is enabled. The tap/fd is enabled during initialization and
>> >>> could be enabled/disabled by tap_enable() and tap_disable() which call platform
>> >>> specific helpers to do the real work. Polling of a tap fd can only be done when
>> >>> the tap is enabled.
>> >>>
>> >>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>> >>> ---
>> >>>  include/net/tap.h |    2 ++
>> >>>  net/tap-win32.c   |   10 ++++++++++
>> >>>  net/tap.c         |   43 ++++++++++++++++++++++++++++++++++++++++---
>> >>>  3 files changed, 52 insertions(+), 3 deletions(-)
>> >>>
>> >>> diff --git a/include/net/tap.h b/include/net/tap.h
>> >>> index bb7efb5..0caf8c4 100644
>> >>> --- a/include/net/tap.h
>> >>> +++ b/include/net/tap.h
>> >>> @@ -35,6 +35,8 @@ int tap_has_vnet_hdr_len(NetClientState *nc, int len);
>> >>>  void tap_using_vnet_hdr(NetClientState *nc, int using_vnet_hdr);
>> >>>  void tap_set_offload(NetClientState *nc, int csum, int tso4, int tso6, int ecn, int ufo);
>> >>>  void tap_set_vnet_hdr_len(NetClientState *nc, int len);
>> >>> +int tap_enable(NetClientState *nc);
>> >>> +int tap_disable(NetClientState *nc);
>> >>>
>> >>>  int tap_get_fd(NetClientState *nc);
>> >>>
>> >>> diff --git a/net/tap-win32.c b/net/tap-win32.c
>> >>> index 265369c..a2cd94b 100644
>> >>> --- a/net/tap-win32.c
>> >>> +++ b/net/tap-win32.c
>> >>> @@ -764,3 +764,13 @@ void tap_set_vnet_hdr_len(NetClientState *nc, int len)
>> >>>  {
>> >>>      assert(0);
>> >>>  }
>> >>> +
>> >>> +int tap_enable(NetClientState *nc)
>> >>> +{
>> >>> +    assert(0);
>> >> abort()
>> >
>> > This is just to be consistent with the rest of the helpers in this file.
>> >>
>> >>> +}
>> >>> +
>> >>> +int tap_disable(NetClientState *nc)
>> >>> +{
>> >>> +    assert(0);
>> >>> +}
>> >>> diff --git a/net/tap.c b/net/tap.c
>> >>> index 67080f1..95e557b 100644
>> >>> --- a/net/tap.c
>> >>> +++ b/net/tap.c
>> >>> @@ -59,6 +59,7 @@ typedef struct TAPState {
>> >>>      unsigned int write_poll : 1;
>> >>>      unsigned int using_vnet_hdr : 1;
>> >>>      unsigned int has_ufo: 1;
>> >>> +    unsigned int enabled : 1;
>> >> bool without bit field?
>> >
>> > Also to be consistent with the other fields. If you wish, I can send patches
>> > to convert all those bit fields to bool on top of this series.
>> 
>> That would be nice, likewise for the assert(0).
>
> OK so let's go ahead with this patchset as is,
> and a cleanup patch will be sent after 1.4 then.

Why?  I'd prefer that we didn't rush things into 1.4 just because.
There's still ample time to respin a corrected series.

Regards,

Anthony Liguori

>
>
>> >
>> > Thanks
>> >>>      VHostNetState *vhost_net;
>> >>>      unsigned host_vnet_hdr_len;
>> >>>  } TAPState;
>> >>> @@ -72,9 +73,9 @@ static void tap_writable(void *opaque);
>> >>>  static void tap_update_fd_handler(TAPState *s)
>> >>>  {
>> >>>      qemu_set_fd_handler2(s->fd,
>> >>> -                         s->read_poll  ? tap_can_send : NULL,
>> >>> -                         s->read_poll  ? tap_send     : NULL,
>> >>> -                         s->write_poll ? tap_writable : NULL,
>> >>> +                         s->read_poll && s->enabled ? tap_can_send : NULL,
>> >>> +                         s->read_poll && s->enabled ? tap_send     : NULL,
>> >>> +                         s->write_poll && s->enabled ? tap_writable : NULL,
>> >>>                           s);
>> >>>  }
>> >>>
>> >>> @@ -339,6 +340,7 @@ static TAPState *net_tap_fd_init(NetClientState *peer,
>> >>>      s->host_vnet_hdr_len = vnet_hdr ? sizeof(struct virtio_net_hdr) : 0;
>> >>>      s->using_vnet_hdr = 0;
>> >>>      s->has_ufo = tap_probe_has_ufo(s->fd);
>> >>> +    s->enabled = 1;
>> >>>      tap_set_offload(&s->nc, 0, 0, 0, 0, 0);
>> >>>      /*
>> >>>       * Make sure host header length is set correctly in tap:
>> >>> @@ -737,3 +739,38 @@ VHostNetState *tap_get_vhost_net(NetClientState *nc)
>> >>>      assert(nc->info->type == NET_CLIENT_OPTIONS_KIND_TAP);
>> >>>      return s->vhost_net;
>> >>>  }
>> >>> +
>> >>> +int tap_enable(NetClientState *nc)
>> >>> +{
>> >>> +    TAPState *s = DO_UPCAST(TAPState, nc, nc);
>> >>> +    int ret;
>> >>> +
>> >>> +    if (s->enabled) {
>> >>> +        return 0;
>> >>> +    } else {
>> >>> +        ret = tap_fd_enable(s->fd);
>> >>> +        if (ret == 0) {
>> >>> +            s->enabled = 1;
>> >>> +            tap_update_fd_handler(s);
>> >>> +        }
>> >>> +        return ret;
>> >>> +    }
>> >>> +}
>> >>> +
>> >>> +int tap_disable(NetClientState *nc)
>> >>> +{
>> >>> +    TAPState *s = DO_UPCAST(TAPState, nc, nc);
>> >>> +    int ret;
>> >>> +
>> >>> +    if (s->enabled == 0) {
>> >>> +        return 0;
>> >>> +    } else {
>> >>> +        ret = tap_fd_disable(s->fd);
>> >>> +        if (ret == 0) {
>> >>> +            qemu_purge_queued_packets(nc);
>> >>> +            s->enabled = 0;
>> >>> +            tap_update_fd_handler(s);
>> >>> +        }
>> >>> +        return ret;
>> >>> +    }
>> >>> +}
>> >>> --
>> >>> 1.7.1
>> >>>
>> >>>
>> >
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [PATCH V2 11/20] tap: support enabling or disabling a queue
  2013-01-29 22:55           ` Anthony Liguori
@ 2013-01-29 23:03             ` Michael S. Tsirkin
  2013-01-30  9:46               ` Jason Wang
  0 siblings, 1 reply; 41+ messages in thread
From: Michael S. Tsirkin @ 2013-01-29 23:03 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: krkumar2, kvm, mprivozn, Jason Wang, rusty, qemu-devel,
	Blue Swirl, shajnocz, gaowanlong, jwhan, shiyer

On Tue, Jan 29, 2013 at 04:55:25PM -0600, Anthony Liguori wrote:
> "Michael S. Tsirkin" <mst@redhat.com> writes:
> 
> > On Tue, Jan 29, 2013 at 08:10:26PM +0000, Blue Swirl wrote:
> >> On Tue, Jan 29, 2013 at 1:50 PM, Jason Wang <jasowang@redhat.com> wrote:
> >> > On 01/26/2013 03:13 AM, Blue Swirl wrote:
> >> >> On Fri, Jan 25, 2013 at 10:35 AM, Jason Wang <jasowang@redhat.com> wrote:
> >> >>> This patch introduces a new bit - enabled in TAPState which tracks whether a
> >> >>> specific queue/fd is enabled. The tap/fd is enabled during initialization and
> >> >>> could be enabled/disabled by tap_enable() and tap_disable() which call platform
> >> >>> specific helpers to do the real work. Polling of a tap fd can only be done when
> >> >>> the tap is enabled.
> >> >>>
> >> >>> Signed-off-by: Jason Wang <jasowang@redhat.com>
> >> >>> ---
> >> >>>  include/net/tap.h |    2 ++
> >> >>>  net/tap-win32.c   |   10 ++++++++++
> >> >>>  net/tap.c         |   43 ++++++++++++++++++++++++++++++++++++++++---
> >> >>>  3 files changed, 52 insertions(+), 3 deletions(-)
> >> >>>
> >> >>> diff --git a/include/net/tap.h b/include/net/tap.h
> >> >>> index bb7efb5..0caf8c4 100644
> >> >>> --- a/include/net/tap.h
> >> >>> +++ b/include/net/tap.h
> >> >>> @@ -35,6 +35,8 @@ int tap_has_vnet_hdr_len(NetClientState *nc, int len);
> >> >>>  void tap_using_vnet_hdr(NetClientState *nc, int using_vnet_hdr);
> >> >>>  void tap_set_offload(NetClientState *nc, int csum, int tso4, int tso6, int ecn, int ufo);
> >> >>>  void tap_set_vnet_hdr_len(NetClientState *nc, int len);
> >> >>> +int tap_enable(NetClientState *nc);
> >> >>> +int tap_disable(NetClientState *nc);
> >> >>>
> >> >>>  int tap_get_fd(NetClientState *nc);
> >> >>>
> >> >>> diff --git a/net/tap-win32.c b/net/tap-win32.c
> >> >>> index 265369c..a2cd94b 100644
> >> >>> --- a/net/tap-win32.c
> >> >>> +++ b/net/tap-win32.c
> >> >>> @@ -764,3 +764,13 @@ void tap_set_vnet_hdr_len(NetClientState *nc, int len)
> >> >>>  {
> >> >>>      assert(0);
> >> >>>  }
> >> >>> +
> >> >>> +int tap_enable(NetClientState *nc)
> >> >>> +{
> >> >>> +    assert(0);
> >> >> abort()
> >> >
> >> > This is just to be consistent with the rest of the helpers in this file.
> >> >>
> >> >>> +}
> >> >>> +
> >> >>> +int tap_disable(NetClientState *nc)
> >> >>> +{
> >> >>> +    assert(0);
> >> >>> +}
> >> >>> diff --git a/net/tap.c b/net/tap.c
> >> >>> index 67080f1..95e557b 100644
> >> >>> --- a/net/tap.c
> >> >>> +++ b/net/tap.c
> >> >>> @@ -59,6 +59,7 @@ typedef struct TAPState {
> >> >>>      unsigned int write_poll : 1;
> >> >>>      unsigned int using_vnet_hdr : 1;
> >> >>>      unsigned int has_ufo: 1;
> >> >>> +    unsigned int enabled : 1;
> >> >> bool without bit field?
> >> >
> >> > Also to be consistent with the other fields. If you wish, I can send patches
> >> > to convert all those bit fields to bool on top of this series.
> >> 
> >> That would be nice, likewise for the assert(0).
> >
> > OK so let's go ahead with this patchset as is,
> > and a cleanup patch will be sent after 1.4 then.
> 
> Why?  I'd prefer that we didn't rush things into 1.4 just because.
> There's still ample time to respin a corrected series.
> 
> Regards,
> 
> Anthony Liguori

Confused.  Do you want the coding style rework of net/tap.c
switching it from assert(0)/bitfields to abort()/bool for 1.4?

> >
> >
> >> >
> >> > Thanks
> >> >>>      VHostNetState *vhost_net;
> >> >>>      unsigned host_vnet_hdr_len;
> >> >>>  } TAPState;
> >> >>> @@ -72,9 +73,9 @@ static void tap_writable(void *opaque);
> >> >>>  static void tap_update_fd_handler(TAPState *s)
> >> >>>  {
> >> >>>      qemu_set_fd_handler2(s->fd,
> >> >>> -                         s->read_poll  ? tap_can_send : NULL,
> >> >>> -                         s->read_poll  ? tap_send     : NULL,
> >> >>> -                         s->write_poll ? tap_writable : NULL,
> >> >>> +                         s->read_poll && s->enabled ? tap_can_send : NULL,
> >> >>> +                         s->read_poll && s->enabled ? tap_send     : NULL,
> >> >>> +                         s->write_poll && s->enabled ? tap_writable : NULL,
> >> >>>                           s);
> >> >>>  }
> >> >>>
> >> >>> @@ -339,6 +340,7 @@ static TAPState *net_tap_fd_init(NetClientState *peer,
> >> >>>      s->host_vnet_hdr_len = vnet_hdr ? sizeof(struct virtio_net_hdr) : 0;
> >> >>>      s->using_vnet_hdr = 0;
> >> >>>      s->has_ufo = tap_probe_has_ufo(s->fd);
> >> >>> +    s->enabled = 1;
> >> >>>      tap_set_offload(&s->nc, 0, 0, 0, 0, 0);
> >> >>>      /*
> >> >>>       * Make sure host header length is set correctly in tap:
> >> >>> @@ -737,3 +739,38 @@ VHostNetState *tap_get_vhost_net(NetClientState *nc)
> >> >>>      assert(nc->info->type == NET_CLIENT_OPTIONS_KIND_TAP);
> >> >>>      return s->vhost_net;
> >> >>>  }
> >> >>> +
> >> >>> +int tap_enable(NetClientState *nc)
> >> >>> +{
> >> >>> +    TAPState *s = DO_UPCAST(TAPState, nc, nc);
> >> >>> +    int ret;
> >> >>> +
> >> >>> +    if (s->enabled) {
> >> >>> +        return 0;
> >> >>> +    } else {
> >> >>> +        ret = tap_fd_enable(s->fd);
> >> >>> +        if (ret == 0) {
> >> >>> +            s->enabled = 1;
> >> >>> +            tap_update_fd_handler(s);
> >> >>> +        }
> >> >>> +        return ret;
> >> >>> +    }
> >> >>> +}
> >> >>> +
> >> >>> +int tap_disable(NetClientState *nc)
> >> >>> +{
> >> >>> +    TAPState *s = DO_UPCAST(TAPState, nc, nc);
> >> >>> +    int ret;
> >> >>> +
> >> >>> +    if (s->enabled == 0) {
> >> >>> +        return 0;
> >> >>> +    } else {
> >> >>> +        ret = tap_fd_disable(s->fd);
> >> >>> +        if (ret == 0) {
> >> >>> +            qemu_purge_queued_packets(nc);
> >> >>> +            s->enabled = 0;
> >> >>> +            tap_update_fd_handler(s);
> >> >>> +        }
> >> >>> +        return ret;
> >> >>> +    }
> >> >>> +}
> >> >>> --
> >> >>> 1.7.1
> >> >>>
> >> >>>
> >> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe kvm" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH V2 11/20] tap: support enabling or disabling a queue
  2013-01-29 23:03             ` Michael S. Tsirkin
@ 2013-01-30  9:46               ` Jason Wang
  0 siblings, 0 replies; 41+ messages in thread
From: Jason Wang @ 2013-01-30  9:46 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: krkumar2, Anthony Liguori, kvm, mprivozn, rusty, qemu-devel,
	Blue Swirl, shajnocz, shiyer, jwhan, gaowanlong

On 01/30/2013 07:03 AM, Michael S. Tsirkin wrote:
> On Tue, Jan 29, 2013 at 04:55:25PM -0600, Anthony Liguori wrote:
>> "Michael S. Tsirkin" <mst@redhat.com> writes:
>>
>>> On Tue, Jan 29, 2013 at 08:10:26PM +0000, Blue Swirl wrote:
>>>> On Tue, Jan 29, 2013 at 1:50 PM, Jason Wang <jasowang@redhat.com> wrote:
>>>>> On 01/26/2013 03:13 AM, Blue Swirl wrote:
>>>>>> On Fri, Jan 25, 2013 at 10:35 AM, Jason Wang <jasowang@redhat.com> wrote:
>>>>>>> This patch introduce a new bit - enabled in TAPState which tracks whether a
>>>>>>> specific queue/fd is enabled. The tap/fd is enabled during initialization and
>>>>>>> could be enabled/disabled by tap_enalbe() and tap_disable() which calls platform
>>>>>>> specific helpers to do the real work. Polling of a tap fd can only done when
>>>>>>> the tap was enabled.
>>>>>>>
>>>>>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>>>>>> ---
>>>>>>>  include/net/tap.h |    2 ++
>>>>>>>  net/tap-win32.c   |   10 ++++++++++
>>>>>>>  net/tap.c         |   43 ++++++++++++++++++++++++++++++++++++++++---
>>>>>>>  3 files changed, 52 insertions(+), 3 deletions(-)
>>>>>>>
>>>>>>> diff --git a/include/net/tap.h b/include/net/tap.h
>>>>>>> index bb7efb5..0caf8c4 100644
>>>>>>> --- a/include/net/tap.h
>>>>>>> +++ b/include/net/tap.h
>>>>>>> @@ -35,6 +35,8 @@ int tap_has_vnet_hdr_len(NetClientState *nc, int len);
>>>>>>>  void tap_using_vnet_hdr(NetClientState *nc, int using_vnet_hdr);
>>>>>>>  void tap_set_offload(NetClientState *nc, int csum, int tso4, int tso6, int ecn, int ufo);
>>>>>>>  void tap_set_vnet_hdr_len(NetClientState *nc, int len);
>>>>>>> +int tap_enable(NetClientState *nc);
>>>>>>> +int tap_disable(NetClientState *nc);
>>>>>>>
>>>>>>>  int tap_get_fd(NetClientState *nc);
>>>>>>>
>>>>>>> diff --git a/net/tap-win32.c b/net/tap-win32.c
>>>>>>> index 265369c..a2cd94b 100644
>>>>>>> --- a/net/tap-win32.c
>>>>>>> +++ b/net/tap-win32.c
>>>>>>> @@ -764,3 +764,13 @@ void tap_set_vnet_hdr_len(NetClientState *nc, int len)
>>>>>>>  {
>>>>>>>      assert(0);
>>>>>>>  }
>>>>>>> +
>>>>>>> +int tap_enable(NetClientState *nc)
>>>>>>> +{
>>>>>>> +    assert(0);
>>>>>> abort()
>>>>> This is just to be consistent with the reset of the helpers in this file.
>>>>>>> +}
>>>>>>> +
>>>>>>> +int tap_disable(NetClientState *nc)
>>>>>>> +{
>>>>>>> +    assert(0);
>>>>>>> +}
>>>>>>> diff --git a/net/tap.c b/net/tap.c
>>>>>>> index 67080f1..95e557b 100644
>>>>>>> --- a/net/tap.c
>>>>>>> +++ b/net/tap.c
>>>>>>> @@ -59,6 +59,7 @@ typedef struct TAPState {
>>>>>>>      unsigned int write_poll : 1;
>>>>>>>      unsigned int using_vnet_hdr : 1;
>>>>>>>      unsigned int has_ufo: 1;
>>>>>>> +    unsigned int enabled : 1;
>>>>>> bool without bit field?
>>>>> Also to be consistent with other field. If you wish I can send patches
>>>>> to convert all those bit field to bool on top of this series.
>>>> That would be nice, likewise for the assert(0).
>>> OK so let's go ahead with this patchset as is,
>>> and a cleanup patch will be send after 1.4 then.
>> Why?  I'd prefer that we didn't rush things into 1.4 just because.
>> There's still ample time to respin a corrected series.
>>
>> Regards,
>>
>> Anthony Liguori
> Confused.  Do you want the coding style rework of net/tap.c
> switching it from assert(0)/bitfields to abort()/bool for 1.4?

I will send a new series with patches that address Blue's comments
on assert(0) and bitfields.
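
For reference, a minimal sketch of the two style points under
discussion. It is not the respun patch: DemoTapState and every demo_*
name are hypothetical stand-ins, and the platform helpers are stubs
that always succeed. It only shows plain bool fields in place of 1-bit
bitfields, and abort() in place of assert(0) for an unsupported stub.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for TAPState: plain bool fields instead of
 * "unsigned int x : 1" bitfields. Not the real QEMU structure. */
typedef struct DemoTapState {
    int fd;
    bool read_poll;
    bool write_poll;
    bool enabled;
} DemoTapState;

/* Stub for a platform without queue enable/disable support.
 * abort() still terminates when NDEBUG is defined, whereas assert(0)
 * would be compiled out and the function would fall off its end. */
int demo_tap_enable_unsupported(DemoTapState *s)
{
    (void)s;
    abort();
}

/* Hypothetical platform helpers; they always succeed in this sketch. */
static int demo_fd_enable(int fd)  { (void)fd; return 0; }
static int demo_fd_disable(int fd) { (void)fd; return 0; }

/* Same idempotent shape as tap_enable() in the quoted patch. */
int demo_tap_enable(DemoTapState *s)
{
    assert(s != NULL);
    if (s->enabled) {
        return 0;
    }
    int ret = demo_fd_enable(s->fd);
    if (ret == 0) {
        s->enabled = true;
    }
    return ret;
}

/* Same idempotent shape as tap_disable() in the quoted patch. */
int demo_tap_disable(DemoTapState *s)
{
    assert(s != NULL);
    if (!s->enabled) {
        return 0;
    }
    int ret = demo_fd_disable(s->fd);
    if (ret == 0) {
        s->enabled = false;
    }
    return ret;
}

/* Mirrors the read-side condition passed to qemu_set_fd_handler2(). */
bool demo_should_poll_read(const DemoTapState *s)
{
    return s->read_poll && s->enabled;
}
```

With bool fields the comparisons become "if (!s->enabled)" rather than
"s->enabled == 0", and a disabled queue never reports that it should be
polled, whatever read_poll says.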

Thanks
>>>
>>>>> Thanks
>>>>>>>      VHostNetState *vhost_net;
>>>>>>>      unsigned host_vnet_hdr_len;
>>>>>>>  } TAPState;
>>>>>>> @@ -72,9 +73,9 @@ static void tap_writable(void *opaque);
>>>>>>>  static void tap_update_fd_handler(TAPState *s)
>>>>>>>  {
>>>>>>>      qemu_set_fd_handler2(s->fd,
>>>>>>> -                         s->read_poll  ? tap_can_send : NULL,
>>>>>>> -                         s->read_poll  ? tap_send     : NULL,
>>>>>>> -                         s->write_poll ? tap_writable : NULL,
>>>>>>> +                         s->read_poll && s->enabled ? tap_can_send : NULL,
>>>>>>> +                         s->read_poll && s->enabled ? tap_send     : NULL,
>>>>>>> +                         s->write_poll && s->enabled ? tap_writable : NULL,
>>>>>>>                           s);
>>>>>>>  }
>>>>>>>
>>>>>>> @@ -339,6 +340,7 @@ static TAPState *net_tap_fd_init(NetClientState *peer,
>>>>>>>      s->host_vnet_hdr_len = vnet_hdr ? sizeof(struct virtio_net_hdr) : 0;
>>>>>>>      s->using_vnet_hdr = 0;
>>>>>>>      s->has_ufo = tap_probe_has_ufo(s->fd);
>>>>>>> +    s->enabled = 1;
>>>>>>>      tap_set_offload(&s->nc, 0, 0, 0, 0, 0);
>>>>>>>      /*
>>>>>>>       * Make sure host header length is set correctly in tap:
>>>>>>> @@ -737,3 +739,38 @@ VHostNetState *tap_get_vhost_net(NetClientState *nc)
>>>>>>>      assert(nc->info->type == NET_CLIENT_OPTIONS_KIND_TAP);
>>>>>>>      return s->vhost_net;
>>>>>>>  }
>>>>>>> +
>>>>>>> +int tap_enable(NetClientState *nc)
>>>>>>> +{
>>>>>>> +    TAPState *s = DO_UPCAST(TAPState, nc, nc);
>>>>>>> +    int ret;
>>>>>>> +
>>>>>>> +    if (s->enabled) {
>>>>>>> +        return 0;
>>>>>>> +    } else {
>>>>>>> +        ret = tap_fd_enable(s->fd);
>>>>>>> +        if (ret == 0) {
>>>>>>> +            s->enabled = 1;
>>>>>>> +            tap_update_fd_handler(s);
>>>>>>> +        }
>>>>>>> +        return ret;
>>>>>>> +    }
>>>>>>> +}
>>>>>>> +
>>>>>>> +int tap_disable(NetClientState *nc)
>>>>>>> +{
>>>>>>> +    TAPState *s = DO_UPCAST(TAPState, nc, nc);
>>>>>>> +    int ret;
>>>>>>> +
>>>>>>> +    if (s->enabled == 0) {
>>>>>>> +        return 0;
>>>>>>> +    } else {
>>>>>>> +        ret = tap_fd_disable(s->fd);
>>>>>>> +        if (ret == 0) {
>>>>>>> +            qemu_purge_queued_packets(nc);
>>>>>>> +            s->enabled = 0;
>>>>>>> +            tap_update_fd_handler(s);
>>>>>>> +        }
>>>>>>> +        return ret;
>>>>>>> +    }
>>>>>>> +}
>>>>>>> --
>>>>>>> 1.7.1
>>>>>>>
>>>>>>>

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH V2 18/20] virtio-net: multiqueue support
  2013-01-25 10:35 ` [PATCH V2 18/20] virtio-net: multiqueue support Jason Wang
@ 2013-04-13 13:17   ` Aurelien Jarno
  2013-04-15  5:29     ` Jason Wang
  0 siblings, 1 reply; 41+ messages in thread
From: Aurelien Jarno @ 2013-04-13 13:17 UTC (permalink / raw)
  To: Jason Wang; +Cc: qemu-devel

On Fri, Jan 25, 2013 at 06:35:41PM +0800, Jason Wang wrote:
> This patch implements both userspace and vhost support for multiple queue
> virtio-net (VIRTIO_NET_F_MQ). This is done by introducing an array of
> VirtIONetQueue to VirtIONet.
> 
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  hw/virtio-net.c |  317 +++++++++++++++++++++++++++++++++++++++++++------------
>  hw/virtio-net.h |   28 +++++-
>  2 files changed, 275 insertions(+), 70 deletions(-)

This patch breaks virtio-net in Minix, even with multiqueue disabled. I
don't know virtio well enough to tell whether it is a Minix or a QEMU
problem. However, I have been able to identify the part of the commit
causing the failure:

> diff --git a/hw/virtio-net.c b/hw/virtio-net.c
> index ef522d5..cec91a7 100644
> --- a/hw/virtio-net.c
> +++ b/hw/virtio-net.c

...

> +static void virtio_net_set_multiqueue(VirtIONet *n, int multiqueue, int ctrl)
> +{
> +    VirtIODevice *vdev = &n->vdev;
> +    int i, max = multiqueue ? n->max_queues : 1;
> +
> +    n->multiqueue = multiqueue;
> +
> +    for (i = 2; i <= n->max_queues * 2 + 1; i++) {
> +        virtio_del_queue(vdev, i);
> +    }
> +

The for loop above is new, and it runs even with multiqueue
disabled. Even with max_queues=1 it calls virtio_del_queue with i = 2
and i = 3. Disabling this loop makes the code work as before.
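
To make the loop bounds concrete, here is a small sketch that only
enumerates the indices the quoted loop would pass to
virtio_del_queue(). demo_deleted_indices is a hypothetical helper
written for this illustration, not a QEMU function.

```c
#include <assert.h>
#include <stddef.h>

/* Collects the indices visited by
 *     for (i = 2; i <= max_queues * 2 + 1; i++)
 * which is the loop quoted from virtio_net_set_multiqueue().
 * Returns how many indices were written to out[]. */
size_t demo_deleted_indices(int max_queues, int *out, size_t cap)
{
    size_t n = 0;

    for (int i = 2; i <= max_queues * 2 + 1; i++) {
        assert(n < cap);   /* caller must provide enough room */
        out[n++] = i;
    }
    return n;
}
```

For max_queues = 1 this yields exactly the two indices 2 and 3
mentioned above; for max_queues = 2 it yields 2, 3, 4 and 5.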

On the Minix side it triggers the following assertion:

| virtio.c:370: assert "q->vaddr != NULL" failed, function "free_phys_queue"
| virtio_net(73141): panic: assert failed

This corresponds to this function in lib/libvirtio/virtio.c:

| static void
| free_phys_queue(struct virtio_queue *q)
| {
|         assert(q != NULL);
|         assert(q->vaddr != NULL);
| 
|         free_contig(q->vaddr, q->ring_size);
|         q->vaddr = NULL;
|         q->paddr = 0;
|         q->num = 0;
|         free_contig(q->data, sizeof(q->data[0]));
|         q->data = NULL;
| }

Do you have an idea if the problem is on the Minix side or on the QEMU
side?

-- 
Aurelien Jarno	                        GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH V2 18/20] virtio-net: multiqueue support
  2013-04-13 13:17   ` [Qemu-devel] " Aurelien Jarno
@ 2013-04-15  5:29     ` Jason Wang
  2013-04-15  8:28       ` Aurelien Jarno
  0 siblings, 1 reply; 41+ messages in thread
From: Jason Wang @ 2013-04-15  5:29 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: qemu-devel

On 04/13/2013 09:17 PM, Aurelien Jarno wrote:
> On Fri, Jan 25, 2013 at 06:35:41PM +0800, Jason Wang wrote:
>> This patch implements both userspace and vhost support for multiple queue
>> virtio-net (VIRTIO_NET_F_MQ). This is done by introducing an array of
>> VirtIONetQueue to VirtIONet.
>>
>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>> ---
>>  hw/virtio-net.c |  317 +++++++++++++++++++++++++++++++++++++++++++------------
>>  hw/virtio-net.h |   28 +++++-
>>  2 files changed, 275 insertions(+), 70 deletions(-)
> This patch breaks virtio-net in Minix, even with multiqueue disable. I
> don't know virtio enough to know if it is a Minix or a QEMU problem.
> However I have been able to identify the part of the commit causing the
> failure:

Hi Aurelien:

Thanks for the work.
>
>> diff --git a/hw/virtio-net.c b/hw/virtio-net.c
>> index ef522d5..cec91a7 100644
>> --- a/hw/virtio-net.c
>> +++ b/hw/virtio-net.c
> ...
>
>> +static void virtio_net_set_multiqueue(VirtIONet *n, int multiqueue, int ctrl)
>> +{
>> +    VirtIODevice *vdev = &n->vdev;
>> +    int i, max = multiqueue ? n->max_queues : 1;
>> +
>> +    n->multiqueue = multiqueue;
>> +
>> +    for (i = 2; i <= n->max_queues * 2 + 1; i++) {
>> +        virtio_del_queue(vdev, i);
>> +    }
>> +
> The for loop above is something which is new, even with multiqueue
> disabled. Even with max_queues=1 it calls virtio_del_queue with i = 2
> and i = 3. Disabling this loop makes the code to work as before.

Looks like a bug here: n->max_queues * 2 + 1 needs to change to
n->max_queues * 2. The reason we need to delete queue 2 each time is
that vq 2 has a different meaning in multiqueue and single queue modes.
In single queue mode vq 2 may be the ctrl vq, but in multiqueue mode it
is rx1.

Let's see whether this small change works.
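
The index layout behind this can be sketched as follows, assuming the
usual virtio-net ordering (rx/tx pairs first, ctrl vq last) and
assuming a ctrl vq is present. DemoVqRole and demo_vq_role are
hypothetical names used only for this illustration.

```c
#include <assert.h>

typedef enum {
    DEMO_VQ_RX,
    DEMO_VQ_TX,
    DEMO_VQ_CTRL,
    DEMO_VQ_NONE
} DemoVqRole;

/* Role of virtqueue "index" when "queue_pairs" rx/tx pairs are in use.
 * Pairs occupy indices 0 .. 2 * queue_pairs - 1 (even = rx, odd = tx),
 * and the ctrl vq, assumed present here, sits at 2 * queue_pairs. */
DemoVqRole demo_vq_role(int index, int queue_pairs)
{
    assert(queue_pairs >= 1);

    if (index < 0 || index > 2 * queue_pairs) {
        return DEMO_VQ_NONE;
    }
    if (index == 2 * queue_pairs) {
        return DEMO_VQ_CTRL;
    }
    return (index % 2 == 0) ? DEMO_VQ_RX : DEMO_VQ_TX;
}
```

With one pair, index 2 is the ctrl vq; with two pairs, index 2 is rx1
and the ctrl vq moves to index 4. The highest index in use is
therefore 2 * queue_pairs, which matches the proposed loop bound.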
>
> On the Minix side it triggers the following assertion:
>
> | virtio.c:370: assert "q->vaddr != NULL" failed, function "free_phys_queue"                                                                                                                                 
> | virtio_net(73141): panic: assert failed
>
> This correspond to this function in lib/libvirtio/virtio.c:
>
> | static void
> | free_phys_queue(struct virtio_queue *q)
> | {
> |         assert(q != NULL);
> |         assert(q->vaddr != NULL);
> | 
> |         free_contig(q->vaddr, q->ring_size);
> |         q->vaddr = NULL;
> |         q->paddr = 0;
> |         q->num = 0;
> |         free_contig(q->data, sizeof(q->data[0]));
> |         q->data = NULL;
> | }
>
> Do you have an idea if the problem is on the Minix side or on the QEMU
> side?
>

I haven't figured out the relationship between dynamic virtqueue del/add
and q->vaddr here. If the above change does not work, I guess this
problem happens only for the ctrl vq (vq2)? And when does it happen?
Rebooting?

Thanks a lot

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH V2 18/20] virtio-net: multiqueue support
  2013-04-15  5:29     ` Jason Wang
@ 2013-04-15  8:28       ` Aurelien Jarno
  2013-04-16  3:08         ` Jason Wang
  0 siblings, 1 reply; 41+ messages in thread
From: Aurelien Jarno @ 2013-04-15  8:28 UTC (permalink / raw)
  To: Jason Wang; +Cc: qemu-devel

On Mon, Apr 15, 2013 at 01:29:01PM +0800, Jason Wang wrote:
> On 04/13/2013 09:17 PM, Aurelien Jarno wrote:
> > On Fri, Jan 25, 2013 at 06:35:41PM +0800, Jason Wang wrote:
> >> This patch implements both userspace and vhost support for multiple queue
> >> virtio-net (VIRTIO_NET_F_MQ). This is done by introducing an array of
> >> VirtIONetQueue to VirtIONet.
> >>
> >> Signed-off-by: Jason Wang <jasowang@redhat.com>
> >> ---
> >>  hw/virtio-net.c |  317 +++++++++++++++++++++++++++++++++++++++++++------------
> >>  hw/virtio-net.h |   28 +++++-
> >>  2 files changed, 275 insertions(+), 70 deletions(-)
> > This patch breaks virtio-net in Minix, even with multiqueue disable. I
> > don't know virtio enough to know if it is a Minix or a QEMU problem.
> > However I have been able to identify the part of the commit causing the
> > failure:
> 
> Hi Aurelien:
> 
> Thanks for the work.
> >
> >> diff --git a/hw/virtio-net.c b/hw/virtio-net.c
> >> index ef522d5..cec91a7 100644
> >> --- a/hw/virtio-net.c
> >> +++ b/hw/virtio-net.c
> > ...
> >
> >> +static void virtio_net_set_multiqueue(VirtIONet *n, int multiqueue, int ctrl)
> >> +{
> >> +    VirtIODevice *vdev = &n->vdev;
> >> +    int i, max = multiqueue ? n->max_queues : 1;
> >> +
> >> +    n->multiqueue = multiqueue;
> >> +
> >> +    for (i = 2; i <= n->max_queues * 2 + 1; i++) {
> >> +        virtio_del_queue(vdev, i);
> >> +    }
> >> +
> > The for loop above is something which is new, even with multiqueue
> > disabled. Even with max_queues=1 it calls virtio_del_queue with i = 2
> > and i = 3. Disabling this loop makes the code to work as before.
> 
> Looks like a bug here, need to change n->max_queues * 2 + 1 to
> n->max_queues * 2. The reason we need to del queue 2 each time because
> vq 2 has different meaning is multiqueue and single queue. In single
> queue, vq 2 maybe ctrl vq, but in multiqueue mode it was rx1.
> 
> Let's see whether this small change works.

Unfortunately it doesn't fix the issue. I don't know a lot about virtio,
but would it be possible to delete only the queues that have been
enabled?

> >
> > On the Minix side it triggers the following assertion:
> >
> > | virtio.c:370: assert "q->vaddr != NULL" failed, function "free_phys_queue"                                                                                                                                 
> > | virtio_net(73141): panic: assert failed
> >
> > This correspond to this function in lib/libvirtio/virtio.c:
> >
> > | static void
> > | free_phys_queue(struct virtio_queue *q)
> > | {
> > |         assert(q != NULL);
> > |         assert(q->vaddr != NULL);
> > | 
> > |         free_contig(q->vaddr, q->ring_size);
> > |         q->vaddr = NULL;
> > |         q->paddr = 0;
> > |         q->num = 0;
> > |         free_contig(q->data, sizeof(q->data[0]));
> > |         q->data = NULL;
> > | }
> >
> > Do you have an idea if the problem is on the Minix side or on the QEMU
> > side?
> >
> 
> Haven't figured out the relationship between virtqueue dynamic del/add
> and q->vaddr here. If the above changes does not work, I guess this

Unfortunately I don't really know the Minix code either. I happened to
find the issue when testing other changes, and looked at the code to
try to understand the problem.

> problem happens only for ctrl vq (vq2)? And when does this happen?

I guess so, given that removing the call to virtio_del_queue() for vq2
fixes or works around the issue.

> Rebooting?

It happens at the initial boot.

Thanks,
Aurelien

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH V2 18/20] virtio-net: multiqueue support
  2013-04-15  8:28       ` Aurelien Jarno
@ 2013-04-16  3:08         ` Jason Wang
  2013-04-17  7:20           ` Aurelien Jarno
  0 siblings, 1 reply; 41+ messages in thread
From: Jason Wang @ 2013-04-16  3:08 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: qemu-devel

On 04/15/2013 04:28 PM, Aurelien Jarno wrote:
> On Mon, Apr 15, 2013 at 01:29:01PM +0800, Jason Wang wrote:
>> On 04/13/2013 09:17 PM, Aurelien Jarno wrote:
>>> On Fri, Jan 25, 2013 at 06:35:41PM +0800, Jason Wang wrote:
>>>> This patch implements both userspace and vhost support for multiple queue
>>>> virtio-net (VIRTIO_NET_F_MQ). This is done by introducing an array of
>>>> VirtIONetQueue to VirtIONet.
>>>>
>>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>>> ---
>>>>  hw/virtio-net.c |  317 +++++++++++++++++++++++++++++++++++++++++++------------
>>>>  hw/virtio-net.h |   28 +++++-
>>>>  2 files changed, 275 insertions(+), 70 deletions(-)
>>> This patch breaks virtio-net in Minix, even with multiqueue disable. I
>>> don't know virtio enough to know if it is a Minix or a QEMU problem.
>>> However I have been able to identify the part of the commit causing the
>>> failure:
>> Hi Aurelien:
>>
>> Thanks for the work.
>>>> diff --git a/hw/virtio-net.c b/hw/virtio-net.c
>>>> index ef522d5..cec91a7 100644
>>>> --- a/hw/virtio-net.c
>>>> +++ b/hw/virtio-net.c
>>> ...
>>>
>>>> +static void virtio_net_set_multiqueue(VirtIONet *n, int multiqueue, int ctrl)
>>>> +{
>>>> +    VirtIODevice *vdev = &n->vdev;
>>>> +    int i, max = multiqueue ? n->max_queues : 1;
>>>> +
>>>> +    n->multiqueue = multiqueue;
>>>> +
>>>> +    for (i = 2; i <= n->max_queues * 2 + 1; i++) {
>>>> +        virtio_del_queue(vdev, i);
>>>> +    }
>>>> +
>>> The for loop above is something which is new, even with multiqueue
>>> disabled. Even with max_queues=1 it calls virtio_del_queue with i = 2
>>> and i = 3. Disabling this loop makes the code to work as before.
>> Looks like a bug here, need to change n->max_queues * 2 + 1 to
>> n->max_queues * 2. The reason we need to del queue 2 each time because
>> vq 2 has different meaning is multiqueue and single queue. In single
>> queue, vq 2 maybe ctrl vq, but in multiqueue mode it was rx1.
>>
>> Let's see whether this small change works.
> Unfortunately it doesn't fix the issue. I don't know a lot about virtio,
> but would it be possible to only delete the queue that have been
> enabled?

The issue is that vq 2 has different meanings in the two modes, so it
must be deleted and reinitialized during feature negotiation.
>
>>> On the Minix side it triggers the following assertion:
>>>
>>> | virtio.c:370: assert "q->vaddr != NULL" failed, function "free_phys_queue"                                                                                                                                 
>>> | virtio_net(73141): panic: assert failed
>>>
>>> This correspond to this function in lib/libvirtio/virtio.c:
>>>
>>> | static void
>>> | free_phys_queue(struct virtio_queue *q)
>>> | {
>>> |         assert(q != NULL);
>>> |         assert(q->vaddr != NULL);
>>> | 
>>> |         free_contig(q->vaddr, q->ring_size);
>>> |         q->vaddr = NULL;
>>> |         q->paddr = 0;
>>> |         q->num = 0;
>>> |         free_contig(q->data, sizeof(q->data[0]));
>>> |         q->data = NULL;
>>> | }
>>>
>>> Do you have an idea if the problem is on the Minix side or on the QEMU
>>> side?
>>>
>> Haven't figured out the relationship between virtqueue dynamic del/add
>> and q->vaddr here. If the above changes does not work, I guess this
> Unfortunately I don't really know the minix code either. I happen to
> found the issue when testing other changes, and looked at the code to
> try to understand the problem.

Me too.
>> problem happens only for ctrl vq (vq2)? And when does this happen?
> I guess so given removing the call to virtio_del_queue() for vq2 fixes
> or workarounds the issue.

That will cause trouble for a multiqueue guest, which may think vq2 is
an rx queue.
>
>> Rebooting?
> It happens at the initial boot.

Looking at the code, it looks like vq2 is initialized unconditionally at
start. How about the following patch?

diff --git a/hw/virtio-net.c b/hw/virtio-net.c
index 4bb49eb..4886aa0 100644
--- a/hw/virtio-net.c
+++ b/hw/virtio-net.c
@@ -1300,7 +1300,6 @@ VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf,
                                            virtio_net_handle_tx_bh);
         n->vqs[0].tx_bh = qemu_bh_new(virtio_net_tx_bh, &n->vqs[0]);
     }
-    n->ctrl_vq = virtio_add_queue(&n->vdev, 64, virtio_net_handle_ctrl);
     qemu_macaddr_default_if_unset(&conf->macaddr);
     memcpy(&n->mac[0], &conf->macaddr, sizeof(n->mac));
     n->status = VIRTIO_NET_S_LINK_UP;

Thanks
>
> Thanks,
> Aurelien
>

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH V2 18/20] virtio-net: multiqueue support
  2013-04-16  3:08         ` Jason Wang
@ 2013-04-17  7:20           ` Aurelien Jarno
  2013-04-17  8:27             ` Jason Wang
  0 siblings, 1 reply; 41+ messages in thread
From: Aurelien Jarno @ 2013-04-17  7:20 UTC (permalink / raw)
  To: Jason Wang; +Cc: qemu-devel

On Tue, Apr 16, 2013 at 11:08:59AM +0800, Jason Wang wrote:
> On 04/15/2013 04:28 PM, Aurelien Jarno wrote:
> > On Mon, Apr 15, 2013 at 01:29:01PM +0800, Jason Wang wrote:
> >> On 04/13/2013 09:17 PM, Aurelien Jarno wrote:
> >>> On Fri, Jan 25, 2013 at 06:35:41PM +0800, Jason Wang wrote:
> >>>> This patch implements both userspace and vhost support for multiple queue
> >>>> virtio-net (VIRTIO_NET_F_MQ). This is done by introducing an array of
> >>>> VirtIONetQueue to VirtIONet.
> >>>>
> >>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
> >>>> ---
> >>>>  hw/virtio-net.c |  317 +++++++++++++++++++++++++++++++++++++++++++------------
> >>>>  hw/virtio-net.h |   28 +++++-
> >>>>  2 files changed, 275 insertions(+), 70 deletions(-)
> >>> This patch breaks virtio-net in Minix, even with multiqueue disable. I
> >>> don't know virtio enough to know if it is a Minix or a QEMU problem.
> >>> However I have been able to identify the part of the commit causing the
> >>> failure:
> >> Hi Aurelien:
> >>
> >> Thanks for the work.
> >>>> diff --git a/hw/virtio-net.c b/hw/virtio-net.c
> >>>> index ef522d5..cec91a7 100644
> >>>> --- a/hw/virtio-net.c
> >>>> +++ b/hw/virtio-net.c
> >>> ...
> >>>
> >>>> +static void virtio_net_set_multiqueue(VirtIONet *n, int multiqueue, int ctrl)
> >>>> +{
> >>>> +    VirtIODevice *vdev = &n->vdev;
> >>>> +    int i, max = multiqueue ? n->max_queues : 1;
> >>>> +
> >>>> +    n->multiqueue = multiqueue;
> >>>> +
> >>>> +    for (i = 2; i <= n->max_queues * 2 + 1; i++) {
> >>>> +        virtio_del_queue(vdev, i);
> >>>> +    }
> >>>> +
> >>> The for loop above is something which is new, even with multiqueue
> >>> disabled. Even with max_queues=1 it calls virtio_del_queue with i = 2
> >>> and i = 3. Disabling this loop makes the code to work as before.
> >> Looks like a bug here, need to change n->max_queues * 2 + 1 to
> >> n->max_queues * 2. The reason we need to del queue 2 each time because
> >> vq 2 has different meaning is multiqueue and single queue. In single
> >> queue, vq 2 maybe ctrl vq, but in multiqueue mode it was rx1.
> >>
> >> Let's see whether this small change works.
> > Unfortunately it doesn't fix the issue. I don't know a lot about virtio,
> > but would it be possible to only delete the queue that have been
> > enabled?
> 
> The issue is vq 2 has different meanings in two modes, so it must be
> deleted and reinitialized during feature negotiation.
> >
> >>> On the Minix side it triggers the following assertion:
> >>>
> >>> | virtio.c:370: assert "q->vaddr != NULL" failed, function "free_phys_queue"                                                                                                                                 
> >>> | virtio_net(73141): panic: assert failed
> >>>
> >>> This correspond to this function in lib/libvirtio/virtio.c:
> >>>
> >>> | static void
> >>> | free_phys_queue(struct virtio_queue *q)
> >>> | {
> >>> |         assert(q != NULL);
> >>> |         assert(q->vaddr != NULL);
> >>> | 
> >>> |         free_contig(q->vaddr, q->ring_size);
> >>> |         q->vaddr = NULL;
> >>> |         q->paddr = 0;
> >>> |         q->num = 0;
> >>> |         free_contig(q->data, sizeof(q->data[0]));
> >>> |         q->data = NULL;
> >>> | }
> >>>
> >>> Do you have an idea if the problem is on the Minix side or on the QEMU
> >>> side?
> >>>
> >> Haven't figured out the relationship between virtqueue dynamic del/add
> >> and q->vaddr here. If the above changes does not work, I guess this
> > Unfortunately I don't really know the minix code either. I happen to
> > found the issue when testing other changes, and looked at the code to
> > try to understand the problem.
> 
> Me too.
> >> problem happens only for ctrl vq (vq2)? And when does this happen?
> > I guess so given removing the call to virtio_del_queue() for vq2 fixes
> > or workarounds the issue.
> 
> It will cause trouble for multiqueue guest who may think vq2 is rx queue.
> >
> >> Rebooting?
> > It happens at the initial boot.
> 
> Looks at the codes, looks like vq2 was initialized unconditionally at
> start, how about the following patch?
> 
> diff --git a/hw/virtio-net.c b/hw/virtio-net.c
> index 4bb49eb..4886aa0 100644
> --- a/hw/virtio-net.c
> +++ b/hw/virtio-net.c
> @@ -1300,7 +1300,6 @@ VirtIODevice *virtio_net_init(DeviceState *dev,
> NICConf *conf,
>                                             virtio_net_handle_tx_bh);
>          n->vqs[0].tx_bh = qemu_bh_new(virtio_net_tx_bh, &n->vqs[0]);
>      }
> -    n->ctrl_vq = virtio_add_queue(&n->vdev, 64, virtio_net_handle_ctrl);
>      qemu_macaddr_default_if_unset(&conf->macaddr);
>      memcpy(&n->mac[0], &conf->macaddr, sizeof(n->mac));
>      n->status = VIRTIO_NET_S_LINK_UP;
> 

Unfortunately this patch doesn't work. I tried it with and without the
change you suggested previously, (max_queues * 2 + 1) => max_queues * 2.

Aurelien

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH V2 18/20] virtio-net: multiqueue support
  2013-04-17  7:20           ` Aurelien Jarno
@ 2013-04-17  8:27             ` Jason Wang
  2013-04-17  8:42               ` Aurelien Jarno
  0 siblings, 1 reply; 41+ messages in thread
From: Jason Wang @ 2013-04-17  8:27 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: qemu-devel

On 04/17/2013 03:20 PM, Aurelien Jarno wrote:
> On Tue, Apr 16, 2013 at 11:08:59AM +0800, Jason Wang wrote:
>> On 04/15/2013 04:28 PM, Aurelien Jarno wrote:
>>> On Mon, Apr 15, 2013 at 01:29:01PM +0800, Jason Wang wrote:
>>>> On 04/13/2013 09:17 PM, Aurelien Jarno wrote:
>>>>> On Fri, Jan 25, 2013 at 06:35:41PM +0800, Jason Wang wrote:
>>>>>> This patch implements both userspace and vhost support for multiple queue
>>>>>> virtio-net (VIRTIO_NET_F_MQ). This is done by introducing an array of
>>>>>> VirtIONetQueue to VirtIONet.
>>>>>>
>>>>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>>>>> ---
>>>>>>  hw/virtio-net.c |  317 +++++++++++++++++++++++++++++++++++++++++++------------
>>>>>>  hw/virtio-net.h |   28 +++++-
>>>>>>  2 files changed, 275 insertions(+), 70 deletions(-)
>>>>> This patch breaks virtio-net in Minix, even with multiqueue disable. I
>>>>> don't know virtio enough to know if it is a Minix or a QEMU problem.
>>>>> However I have been able to identify the part of the commit causing the
>>>>> failure:
>>>> Hi Aurelien:
>>>>
>>>> Thanks for the work.
>>>>>> diff --git a/hw/virtio-net.c b/hw/virtio-net.c
>>>>>> index ef522d5..cec91a7 100644
>>>>>> --- a/hw/virtio-net.c
>>>>>> +++ b/hw/virtio-net.c
>>>>> ...
>>>>>
>>>>>> +static void virtio_net_set_multiqueue(VirtIONet *n, int multiqueue, int ctrl)
>>>>>> +{
>>>>>> +    VirtIODevice *vdev = &n->vdev;
>>>>>> +    int i, max = multiqueue ? n->max_queues : 1;
>>>>>> +
>>>>>> +    n->multiqueue = multiqueue;
>>>>>> +
>>>>>> +    for (i = 2; i <= n->max_queues * 2 + 1; i++) {
>>>>>> +        virtio_del_queue(vdev, i);
>>>>>> +    }
>>>>>> +
>>>>> The for loop above is something which is new, even with multiqueue
>>>>> disabled. Even with max_queues=1 it calls virtio_del_queue with i = 2
>>>>> and i = 3. Disabling this loop makes the code to work as before.
>>>> Looks like a bug here, need to change n->max_queues * 2 + 1 to
>>>> n->max_queues * 2. The reason we need to del queue 2 each time because
>>>> vq 2 has different meaning is multiqueue and single queue. In single
>>>> queue, vq 2 maybe ctrl vq, but in multiqueue mode it was rx1.
>>>>
>>>> Let's see whether this small change works.
>>> Unfortunately it doesn't fix the issue. I don't know a lot about virtio,
>>> but would it be possible to only delete the queue that have been
>>> enabled?
>> The issue is vq 2 has different meanings in two modes, so it must be
>> deleted and reinitialized during feature negotiation.
>>>>> On the Minix side it triggers the following assertion:
>>>>>
>>>>> | virtio.c:370: assert "q->vaddr != NULL" failed, function "free_phys_queue"                                                                                                                                 
>>>>> | virtio_net(73141): panic: assert failed
>>>>>
>>>>> This correspond to this function in lib/libvirtio/virtio.c:
>>>>>
>>>>> | static void
>>>>> | free_phys_queue(struct virtio_queue *q)
>>>>> | {
>>>>> |         assert(q != NULL);
>>>>> |         assert(q->vaddr != NULL);
>>>>> | 
>>>>> |         free_contig(q->vaddr, q->ring_size);
>>>>> |         q->vaddr = NULL;
>>>>> |         q->paddr = 0;
>>>>> |         q->num = 0;
>>>>> |         free_contig(q->data, sizeof(q->data[0]));
>>>>> |         q->data = NULL;
>>>>> | }
>>>>>
>>>>> Do you have an idea if the problem is on the Minix side or on the QEMU
>>>>> side?
>>>>>
>>>> Haven't figured out the relationship between virtqueue dynamic del/add
>>>> and q->vaddr here. If the above changes does not work, I guess this
>>> Unfortunately I don't really know the minix code either. I happen to
>>> found the issue when testing other changes, and looked at the code to
>>> try to understand the problem.
>> Me too.
>>>> problem happens only for ctrl vq (vq2)? And when does this happen?
>>> I guess so given removing the call to virtio_del_queue() for vq2 fixes
>>> or workarounds the issue.
>> It will cause trouble for multiqueue guest who may think vq2 is rx queue.
>>>> Rebooting?
>>> It happens at the initial boot.
>> Looks at the codes, looks like vq2 was initialized unconditionally at
>> start, how about the following patch?
>>
>> diff --git a/hw/virtio-net.c b/hw/virtio-net.c
>> index 4bb49eb..4886aa0 100644
>> --- a/hw/virtio-net.c
>> +++ b/hw/virtio-net.c
>> @@ -1300,7 +1300,6 @@ VirtIODevice *virtio_net_init(DeviceState *dev,
>> NICConf *conf,
>>                                             virtio_net_handle_tx_bh);
>>          n->vqs[0].tx_bh = qemu_bh_new(virtio_net_tx_bh, &n->vqs[0]);
>>      }
>> -    n->ctrl_vq = virtio_add_queue(&n->vdev, 64, virtio_net_handle_ctrl);
>>      qemu_macaddr_default_if_unset(&conf->macaddr);
>>      memcpy(&n->mac[0], &conf->macaddr, sizeof(n->mac));
>>      n->status = VIRTIO_NET_S_LINK_UP;
>>
> Unfortunately this patch doesn't work, I tried it with and without your
> previous suggested change (max_queues * 2 + 1) => max_queues * 2.
>
> Aurelien
>

Thanks for testing. I will try to get a Minix setup to see what happens
and keep you updated.


* Re: [Qemu-devel] [PATCH V2 18/20] virtio-net: multiqueue support
  2013-04-17  8:27             ` Jason Wang
@ 2013-04-17  8:42               ` Aurelien Jarno
  2013-04-25  6:19                 ` Jason Wang
  0 siblings, 1 reply; 41+ messages in thread
From: Aurelien Jarno @ 2013-04-17  8:42 UTC (permalink / raw)
  To: Jason Wang; +Cc: qemu-devel

On Wed, Apr 17, 2013 at 04:27:13PM +0800, Jason Wang wrote:
> On 04/17/2013 03:20 PM, Aurelien Jarno wrote:
> > On Tue, Apr 16, 2013 at 11:08:59AM +0800, Jason Wang wrote:
> >> On 04/15/2013 04:28 PM, Aurelien Jarno wrote:
> >>> On Mon, Apr 15, 2013 at 01:29:01PM +0800, Jason Wang wrote:
> >>>> On 04/13/2013 09:17 PM, Aurelien Jarno wrote:
> >>>>> On Fri, Jan 25, 2013 at 06:35:41PM +0800, Jason Wang wrote:
> >>>>>> This patch implements both userspace and vhost support for multiple queue
> >>>>>> virtio-net (VIRTIO_NET_F_MQ). This is done by introducing an array of
> >>>>>> VirtIONetQueue to VirtIONet.
> >>>>>>
> >>>>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
> >>>>>> ---
> >>>>>>  hw/virtio-net.c |  317 +++++++++++++++++++++++++++++++++++++++++++------------
> >>>>>>  hw/virtio-net.h |   28 +++++-
> >>>>>>  2 files changed, 275 insertions(+), 70 deletions(-)
> >>>>> This patch breaks virtio-net in Minix, even with multiqueue disable. I
> >>>>> don't know virtio enough to know if it is a Minix or a QEMU problem.
> >>>>> However I have been able to identify the part of the commit causing the
> >>>>> failure:
> >>>> Hi Aurelien:
> >>>>
> >>>> Thanks for the work.
> >>>>>> diff --git a/hw/virtio-net.c b/hw/virtio-net.c
> >>>>>> index ef522d5..cec91a7 100644
> >>>>>> --- a/hw/virtio-net.c
> >>>>>> +++ b/hw/virtio-net.c
> >>>>> ...
> >>>>>
> >>>>>> +static void virtio_net_set_multiqueue(VirtIONet *n, int multiqueue, int ctrl)
> >>>>>> +{
> >>>>>> +    VirtIODevice *vdev = &n->vdev;
> >>>>>> +    int i, max = multiqueue ? n->max_queues : 1;
> >>>>>> +
> >>>>>> +    n->multiqueue = multiqueue;
> >>>>>> +
> >>>>>> +    for (i = 2; i <= n->max_queues * 2 + 1; i++) {
> >>>>>> +        virtio_del_queue(vdev, i);
> >>>>>> +    }
> >>>>>> +
> >>>>> The for loop above is something which is new, even with multiqueue
> >>>>> disabled. Even with max_queues=1 it calls virtio_del_queue with i = 2
> >>>>> and i = 3. Disabling this loop makes the code to work as before.
> >>>> Looks like a bug here, need to change n->max_queues * 2 + 1 to
> >>>> n->max_queues * 2. The reason we need to del queue 2 each time because
> >>>> vq 2 has different meaning is multiqueue and single queue. In single
> >>>> queue, vq 2 maybe ctrl vq, but in multiqueue mode it was rx1.
> >>>>
> >>>> Let's see whether this small change works.
> >>> Unfortunately it doesn't fix the issue. I don't know a lot about virtio,
> >>> but would it be possible to only delete the queue that have been
> >>> enabled?
> >> The issue is vq 2 has different meanings in two modes, so it must be
> >> deleted and reinitialized during feature negotiation.
> >>>>> On the Minix side it triggers the following assertion:
> >>>>>
> >>>>> | virtio.c:370: assert "q->vaddr != NULL" failed, function "free_phys_queue"
> >>>>> | virtio_net(73141): panic: assert failed
> >>>>>
> >>>>> This correspond to this function in lib/libvirtio/virtio.c:
> >>>>>
> >>>>> | static void
> >>>>> | free_phys_queue(struct virtio_queue *q)
> >>>>> | {
> >>>>> |         assert(q != NULL);
> >>>>> |         assert(q->vaddr != NULL);
> >>>>> | 
> >>>>> |         free_contig(q->vaddr, q->ring_size);
> >>>>> |         q->vaddr = NULL;
> >>>>> |         q->paddr = 0;
> >>>>> |         q->num = 0;
> >>>>> |         free_contig(q->data, sizeof(q->data[0]));
> >>>>> |         q->data = NULL;
> >>>>> | }
> >>>>>
> >>>>> Do you have an idea if the problem is on the Minix side or on the QEMU
> >>>>> side?
> >>>>>
> >>>> Haven't figured out the relationship between virtqueue dynamic del/add
> >>>> and q->vaddr here. If the above changes does not work, I guess this
> >>> Unfortunately I don't really know the minix code either. I happen to
> >>> found the issue when testing other changes, and looked at the code to
> >>> try to understand the problem.
> >> Me too.
> >>>> problem happens only for ctrl vq (vq2)? And when does this happen?
> >>> I guess so given removing the call to virtio_del_queue() for vq2 fixes
> >>> or workarounds the issue.
> >> It will cause trouble for multiqueue guest who may think vq2 is rx queue.
> >>>> Rebooting?
> >>> It happens at the initial boot.
> >> Looks at the codes, looks like vq2 was initialized unconditionally at
> >> start, how about the following patch?
> >>
> >> diff --git a/hw/virtio-net.c b/hw/virtio-net.c
> >> index 4bb49eb..4886aa0 100644
> >> --- a/hw/virtio-net.c
> >> +++ b/hw/virtio-net.c
> >> @@ -1300,7 +1300,6 @@ VirtIODevice *virtio_net_init(DeviceState *dev,
> >> NICConf *conf,
> >>                                             virtio_net_handle_tx_bh);
> >>          n->vqs[0].tx_bh = qemu_bh_new(virtio_net_tx_bh, &n->vqs[0]);
> >>      }
> >> -    n->ctrl_vq = virtio_add_queue(&n->vdev, 64, virtio_net_handle_ctrl);
> >>      qemu_macaddr_default_if_unset(&conf->macaddr);
> >>      memcpy(&n->mac[0], &conf->macaddr, sizeof(n->mac));
> >>      n->status = VIRTIO_NET_S_LINK_UP;
> >>
> > Unfortunately this patch doesn't work, I tried it with and without your
> > previous suggested change (max_queues * 2 + 1) => max_queues * 2.
> >
> > Aurelien
> >
> 
> Thanks for the test, I will try to get a minix to see what happens and
> keep you updated.
> 

Thanks! I used Minix 3.2.1, which includes virtio drivers. See step 5 of
the following page for how to enable virtio:

http://wiki.minix3.org/en/UsersGuide/RunningOnQemu

-- 
Aurelien Jarno	                        GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net


* Re: [Qemu-devel] [PATCH V2 18/20] virtio-net: multiqueue support
  2013-04-17  8:42               ` Aurelien Jarno
@ 2013-04-25  6:19                 ` Jason Wang
  0 siblings, 0 replies; 41+ messages in thread
From: Jason Wang @ 2013-04-25  6:19 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: qemu-devel

On 04/17/2013 04:42 PM, Aurelien Jarno wrote:
> On Wed, Apr 17, 2013 at 04:27:13PM +0800, Jason Wang wrote:
>> On 04/17/2013 03:20 PM, Aurelien Jarno wrote:
>>> On Tue, Apr 16, 2013 at 11:08:59AM +0800, Jason Wang wrote:
>>>> On 04/15/2013 04:28 PM, Aurelien Jarno wrote:
>>>>> On Mon, Apr 15, 2013 at 01:29:01PM +0800, Jason Wang wrote:
>>>>>> On 04/13/2013 09:17 PM, Aurelien Jarno wrote:
>>>>>>> On Fri, Jan 25, 2013 at 06:35:41PM +0800, Jason Wang wrote:
>>>>>>>> This patch implements both userspace and vhost support for multiple queue
>>>>>>>> virtio-net (VIRTIO_NET_F_MQ). This is done by introducing an array of
>>>>>>>> VirtIONetQueue to VirtIONet.
>>>>>>>>
>>>>>>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>>>>>>> ---
>>>>>>>>  hw/virtio-net.c |  317 +++++++++++++++++++++++++++++++++++++++++++------------
>>>>>>>>  hw/virtio-net.h |   28 +++++-
>>>>>>>>  2 files changed, 275 insertions(+), 70 deletions(-)
>>>>>>> This patch breaks virtio-net in Minix, even with multiqueue disable. I
>>>>>>> don't know virtio enough to know if it is a Minix or a QEMU problem.
>>>>>>> However I have been able to identify the part of the commit causing the
>>>>>>> failure:
>>>>>> Hi Aurelien:
>>>>>>
>>>>>> Thanks for the work.
>>>>>>>> diff --git a/hw/virtio-net.c b/hw/virtio-net.c
>>>>>>>> index ef522d5..cec91a7 100644
>>>>>>>> --- a/hw/virtio-net.c
>>>>>>>> +++ b/hw/virtio-net.c
>>>>>>> ...
>>>>>>>
>>>>>>>> +static void virtio_net_set_multiqueue(VirtIONet *n, int multiqueue, int ctrl)
>>>>>>>> +{
>>>>>>>> +    VirtIODevice *vdev = &n->vdev;
>>>>>>>> +    int i, max = multiqueue ? n->max_queues : 1;
>>>>>>>> +
>>>>>>>> +    n->multiqueue = multiqueue;
>>>>>>>> +
>>>>>>>> +    for (i = 2; i <= n->max_queues * 2 + 1; i++) {
>>>>>>>> +        virtio_del_queue(vdev, i);
>>>>>>>> +    }
>>>>>>>> +
>>>>>>> The for loop above is something which is new, even with multiqueue
>>>>>>> disabled. Even with max_queues=1 it calls virtio_del_queue with i = 2
>>>>>>> and i = 3. Disabling this loop makes the code to work as before.
>>>>>> Looks like a bug here, need to change n->max_queues * 2 + 1 to
>>>>>> n->max_queues * 2. The reason we need to del queue 2 each time because
>>>>>> vq 2 has different meaning is multiqueue and single queue. In single
>>>>>> queue, vq 2 maybe ctrl vq, but in multiqueue mode it was rx1.
>>>>>>
>>>>>> Let's see whether this small change works.
>>>>> Unfortunately it doesn't fix the issue. I don't know a lot about virtio,
>>>>> but would it be possible to only delete the queue that have been
>>>>> enabled?
>>>> The issue is vq 2 has different meanings in two modes, so it must be
>>>> deleted and reinitialized during feature negotiation.
>>>>>>> On the Minix side it triggers the following assertion:
>>>>>>>
>>>>>>> | virtio.c:370: assert "q->vaddr != NULL" failed, function "free_phys_queue"
>>>>>>> | virtio_net(73141): panic: assert failed
>>>>>>>
>>>>>>> This correspond to this function in lib/libvirtio/virtio.c:
>>>>>>>
>>>>>>> | static void
>>>>>>> | free_phys_queue(struct virtio_queue *q)
>>>>>>> | {
>>>>>>> |         assert(q != NULL);
>>>>>>> |         assert(q->vaddr != NULL);
>>>>>>> | 
>>>>>>> |         free_contig(q->vaddr, q->ring_size);
>>>>>>> |         q->vaddr = NULL;
>>>>>>> |         q->paddr = 0;
>>>>>>> |         q->num = 0;
>>>>>>> |         free_contig(q->data, sizeof(q->data[0]));
>>>>>>> |         q->data = NULL;
>>>>>>> | }
>>>>>>>
>>>>>>> Do you have an idea if the problem is on the Minix side or on the QEMU
>>>>>>> side?
>>>>>>>
>>>>>> Haven't figured out the relationship between virtqueue dynamic del/add
>>>>>> and q->vaddr here. If the above changes does not work, I guess this
>>>>> Unfortunately I don't really know the minix code either. I happen to
>>>>> found the issue when testing other changes, and looked at the code to
>>>>> try to understand the problem.
>>>> Me too.
>>>>>> problem happens only for ctrl vq (vq2)? And when does this happen?
>>>>> I guess so given removing the call to virtio_del_queue() for vq2 fixes
>>>>> or workarounds the issue.
>>>> It will cause trouble for multiqueue guest who may think vq2 is rx queue.
>>>>>> Rebooting?
>>>>> It happens at the initial boot.
>>>> Looks at the codes, looks like vq2 was initialized unconditionally at
>>>> start, how about the following patch?
>>>>
>>>> diff --git a/hw/virtio-net.c b/hw/virtio-net.c
>>>> index 4bb49eb..4886aa0 100644
>>>> --- a/hw/virtio-net.c
>>>> +++ b/hw/virtio-net.c
>>>> @@ -1300,7 +1300,6 @@ VirtIODevice *virtio_net_init(DeviceState *dev,
>>>> NICConf *conf,
>>>>                                             virtio_net_handle_tx_bh);
>>>>          n->vqs[0].tx_bh = qemu_bh_new(virtio_net_tx_bh, &n->vqs[0]);
>>>>      }
>>>> -    n->ctrl_vq = virtio_add_queue(&n->vdev, 64, virtio_net_handle_ctrl);
>>>>      qemu_macaddr_default_if_unset(&conf->macaddr);
>>>>      memcpy(&n->mac[0], &conf->macaddr, sizeof(n->mac));
>>>>      n->status = VIRTIO_NET_S_LINK_UP;
>>>>
>>> Unfortunately this patch doesn't work, I tried it with and without your
>>> previous suggested change (max_queues * 2 + 1) => max_queues * 2.
>>>
>>> Aurelien
>>>
>> Thanks for the test, I will try to get a minix to see what happens and
>> keep you updated.
>>
> Thanks! I have used minix 3.2.1 which include virtio drivers. See the
> following page, step 5 to know how to enable virtio:
>
> http://wiki.minix3.org/en/UsersGuide/RunningOnQemu
>

This may be a bug in the Minix guest, since it always tries to identify
the control vq even when it does not support it:

From virtio_net_probe():

    /* If the host supports the control queue, allocate it as well */
    if (virtio_host_supports(net_dev, VIRTIO_NET_F_CTRL_VQ))
        queues += 1;

This violates the virtio spec:
"If the VIRTIO_NET_F_CTRL_VQ feature bit is negotiated, identify the
control virtqueue."
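
To make the difference concrete, here is a minimal sketch of the two
checks. It is not the Minix or QEMU code: struct features and all
function names are hypothetical, and the base count of two queues
(rx + tx) is an assumption. Only the feature bit number (17) comes
from the virtio-net spec.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit number of the control vq feature in the virtio-net spec. */
#define VIRTIO_NET_F_CTRL_VQ 17

/* Hypothetical state: bits the host offers and bits the guest acked. */
struct features {
    uint32_t host_offered;
    uint32_t guest_acked;
};

/* True if the host merely offers the feature. */
static bool host_supports(const struct features *f, unsigned bit)
{
    return (f->host_offered >> bit) & 1u;
}

/* True only if the host offers the feature and the guest acked it. */
static bool negotiated(const struct features *f, unsigned bit)
{
    return ((f->host_offered & f->guest_acked) >> bit) & 1u;
}

/* Probe keyed on the host offer, like the snippet quoted above. */
static unsigned queues_probe_offered(const struct features *f)
{
    unsigned queues = 2; /* assumed: one rx and one tx queue */

    if (host_supports(f, VIRTIO_NET_F_CTRL_VQ))
        queues += 1;
    return queues;
}

/* Probe keyed on the negotiated feature, as the spec text requires. */
static unsigned queues_probe_negotiated(const struct features *f)
{
    unsigned queues = 2; /* assumed: one rx and one tx queue */

    if (negotiated(f, VIRTIO_NET_F_CTRL_VQ))
        queues += 1;
    return queues;
}
```

When the host offers the bit but the guest never acks it, the first
probe expects three queues and the second expects two. That mismatch is
what a device hits once it stops allocating the control vq
unconditionally.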

The multiqueue patchset allocates the control vq conditionally, depending
on whether the guest supports it, instead of always allocating it. I will
send a patch that keeps the always-allocate behaviour to preserve backward
compatibility.
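
For reference, a rough sketch of the vq index layout discussed earlier
in this thread. It assumes rx/tx pairs occupy indices 0..2*pairs-1 and
that the control vq, when present, takes the next free index. The
function names are made up for illustration and this is not the QEMU
implementation.

```c
/*
 * Index the control vq would occupy, or -1 when the guest did not
 * negotiate it. In single queue mode only one rx/tx pair exists, so
 * the control vq lands on index 2; in multiqueue mode index 2 is rx1
 * and the control vq follows the last pair.
 */
static int ctrl_vq_index(int multiqueue, int max_queues, int ctrl_negotiated)
{
    int pairs = multiqueue ? max_queues : 1;

    return ctrl_negotiated ? pairs * 2 : -1;
}

/*
 * Number of vqs deleted by a loop of the form
 *     for (i = 2; i <= upper_bound; i++)
 * as used in virtio_net_set_multiqueue().
 */
static int deleted_vq_count(int upper_bound)
{
    int i, count = 0;

    for (i = 2; i <= upper_bound; i++)
        count++;
    return count;
}
```

With max_queues = 1, the original bound (max_queues * 2 + 1) deletes
indices 2 and 3, while the corrected bound (max_queues * 2) deletes only
index 2.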

Thanks


end of thread, other threads:[~2013-04-25  6:20 UTC | newest]

Thread overview: 41+ messages
2013-01-25 10:35 [PATCH V2 00/20] Multiqueue virtio-net Jason Wang
2013-01-25 10:35 ` [PATCH V2 01/20] net: introduce qemu_get_queue() Jason Wang
2013-01-25 10:35 ` [PATCH V2 02/20] net: introduce qemu_get_nic() Jason Wang
2013-01-25 10:35 ` [PATCH V2 03/20] net: intorduce qemu_del_nic() Jason Wang
2013-01-25 10:35 ` [PATCH V2 04/20] net: introduce qemu_find_net_clients_except() Jason Wang
2013-01-25 10:35 ` [PATCH V2 05/20] net: introduce qemu_net_client_setup() Jason Wang
2013-01-25 10:35 ` [PATCH V2 06/20] net: introduce NetClientState destructor Jason Wang
2013-01-25 10:35 ` [PATCH V2 07/20] net: multiqueue support Jason Wang
2013-01-25 10:35 ` [PATCH V2 08/20] tap: import linux multiqueue constants Jason Wang
2013-01-25 10:35 ` [PATCH V2 09/20] tap: factor out common tap initialization Jason Wang
2013-01-25 10:35 ` [PATCH V2 10/20] tap: add Linux multiqueue support Jason Wang
2013-01-25 10:35 ` [PATCH V2 11/20] tap: support enabling or disabling a queue Jason Wang
2013-01-25 19:13   ` [Qemu-devel] " Blue Swirl
2013-01-29 13:50     ` Jason Wang
2013-01-29 20:10       ` Blue Swirl
2013-01-29 22:11         ` [Qemu-devel] " Michael S. Tsirkin
2013-01-29 22:55           ` Anthony Liguori
2013-01-29 23:03             ` Michael S. Tsirkin
2013-01-30  9:46               ` Jason Wang
2013-01-25 10:35 ` [PATCH V2 12/20] tap: introduce a helper to get the name of an interface Jason Wang
2013-01-25 10:35 ` [PATCH V2 13/20] tap: multiqueue support Jason Wang
2013-01-25 10:35 ` [PATCH V2 14/20] vhost: " Jason Wang
2013-01-29 13:53   ` Jason Wang
2013-01-25 10:35 ` [PATCH V2 15/20] virtio: introduce virtio_del_queue() Jason Wang
2013-01-25 10:35 ` [PATCH V2 16/20] virtio: add a queue_index to VirtQueue Jason Wang
2013-01-25 10:35 ` [PATCH V2 17/20] virtio-net: separate virtqueue from VirtIONet Jason Wang
2013-01-25 10:35 ` [PATCH V2 18/20] virtio-net: multiqueue support Jason Wang
2013-04-13 13:17   ` [Qemu-devel] " Aurelien Jarno
2013-04-15  5:29     ` Jason Wang
2013-04-15  8:28       ` Aurelien Jarno
2013-04-16  3:08         ` Jason Wang
2013-04-17  7:20           ` Aurelien Jarno
2013-04-17  8:27             ` Jason Wang
2013-04-17  8:42               ` Aurelien Jarno
2013-04-25  6:19                 ` Jason Wang
2013-01-25 10:35 ` [PATCH V2 19/20] virtio-net: migration support for multiqueue Jason Wang
2013-01-25 10:35 ` [PATCH V2 20/20] virtio-net: compat multiqueue support Jason Wang
2013-01-28  3:27 ` [PATCH V2 00/20] Multiqueue virtio-net Wanlong Gao
2013-01-28  4:24   ` [Qemu-devel] " Jason Wang
2013-01-29  5:36     ` Wanlong Gao
2013-01-29  5:44       ` Jason Wang
