* [RFC PATCH] vhost_net: should not use max_queue_pairs for non-mq guest
@ 2022-03-19 4:13 Si-Wei Liu
2022-03-22 3:47 ` Jason Wang
0 siblings, 1 reply; 5+ messages in thread
From: Si-Wei Liu @ 2022-03-19 4:13 UTC (permalink / raw)
To: qemu-devel, jasowang; +Cc: si-wei.liu, eperezma, eli, mst
With an MQ-enabled vdpa device and a guest that doesn't support MQ,
e.g. booting vdpa with mq=on under OVMF, whose virtio-net driver uses
a single vqp, it's easy to hit the following assert failure:
../hw/virtio/vhost-vdpa.c:560: vhost_vdpa_get_vq_index: Assertion `idx >= dev->vq_index && idx < dev->vq_index + dev->nvqs' failed.
0 0x00007f8ce3ff3387 in raise () at /lib64/libc.so.6
1 0x00007f8ce3ff4a78 in abort () at /lib64/libc.so.6
2 0x00007f8ce3fec1a6 in __assert_fail_base () at /lib64/libc.so.6
3 0x00007f8ce3fec252 in () at /lib64/libc.so.6
4 0x0000558f52d79421 in vhost_vdpa_get_vq_index (dev=<optimized out>, idx=<optimized out>) at ../hw/virtio/vhost-vdpa.c:563
5 0x0000558f52d79421 in vhost_vdpa_get_vq_index (dev=<optimized out>, idx=<optimized out>) at ../hw/virtio/vhost-vdpa.c:558
6 0x0000558f52d7329a in vhost_virtqueue_mask (hdev=0x558f55c01800, vdev=0x558f568f91f0, n=2, mask=<optimized out>) at ../hw/virtio/vhost.c:1557
7 0x0000558f52c6b89a in virtio_pci_set_guest_notifier (d=d@entry=0x558f568f0f60, n=n@entry=2, assign=assign@entry=true, with_irqfd=with_irqfd@entry=false)
at ../hw/virtio/virtio-pci.c:974
8 0x0000558f52c6c0d8 in virtio_pci_set_guest_notifiers (d=0x558f568f0f60, nvqs=3, assign=true) at ../hw/virtio/virtio-pci.c:1019
9 0x0000558f52bf091d in vhost_net_start (dev=dev@entry=0x558f568f91f0, ncs=0x558f56937cd0, data_queue_pairs=data_queue_pairs@entry=1, cvq=cvq@entry=1)
at ../hw/net/vhost_net.c:361
10 0x0000558f52d4e5e7 in virtio_net_set_status (status=<optimized out>, n=0x558f568f91f0) at ../hw/net/virtio-net.c:289
11 0x0000558f52d4e5e7 in virtio_net_set_status (vdev=0x558f568f91f0, status=15 '\017') at ../hw/net/virtio-net.c:370
12 0x0000558f52d6c4b2 in virtio_set_status (vdev=vdev@entry=0x558f568f91f0, val=val@entry=15 '\017') at ../hw/virtio/virtio.c:1945
13 0x0000558f52c69eff in virtio_pci_common_write (opaque=0x558f568f0f60, addr=<optimized out>, val=<optimized out>, size=<optimized out>) at ../hw/virtio/virtio-pci.c:1292
14 0x0000558f52d15d6e in memory_region_write_accessor (mr=0x558f568f19d0, addr=20, value=<optimized out>, size=1, shift=<optimized out>, mask=<optimized out>, attrs=...)
at ../softmmu/memory.c:492
15 0x0000558f52d127de in access_with_adjusted_size (addr=addr@entry=20, value=value@entry=0x7f8cdbffe748, size=size@entry=1, access_size_min=<optimized out>, access_size_max=<optimized out>, access_fn=0x558f52d15cf0 <memory_region_write_accessor>, mr=0x558f568f19d0, attrs=...) at ../softmmu/memory.c:554
16 0x0000558f52d157ef in memory_region_dispatch_write (mr=mr@entry=0x558f568f19d0, addr=20, data=<optimized out>, op=<optimized out>, attrs=attrs@entry=...)
at ../softmmu/memory.c:1504
17 0x0000558f52d078e7 in flatview_write_continue (fv=fv@entry=0x7f8accbc3b90, addr=addr@entry=103079215124, attrs=..., ptr=ptr@entry=0x7f8ce6300028, len=len@entry=1, addr1=<optimized out>, l=<optimized out>, mr=0x558f568f19d0) at /home/opc/qemu-upstream/include/qemu/host-utils.h:165
18 0x0000558f52d07b06 in flatview_write (fv=0x7f8accbc3b90, addr=103079215124, attrs=..., buf=0x7f8ce6300028, len=1) at ../softmmu/physmem.c:2822
19 0x0000558f52d0b36b in address_space_write (as=<optimized out>, addr=<optimized out>, attrs=..., buf=buf@entry=0x7f8ce6300028, len=<optimized out>)
at ../softmmu/physmem.c:2914
20 0x0000558f52d0b3da in address_space_rw (as=<optimized out>, addr=<optimized out>, attrs=...,
attrs@entry=..., buf=buf@entry=0x7f8ce6300028, len=<optimized out>, is_write=<optimized out>) at ../softmmu/physmem.c:2924
21 0x0000558f52dced09 in kvm_cpu_exec (cpu=cpu@entry=0x558f55c2da60) at ../accel/kvm/kvm-all.c:2903
22 0x0000558f52dcfabd in kvm_vcpu_thread_fn (arg=arg@entry=0x558f55c2da60) at ../accel/kvm/kvm-accel-ops.c:49
23 0x0000558f52f9f04a in qemu_thread_start (args=<optimized out>) at ../util/qemu-thread-posix.c:556
24 0x00007f8ce4392ea5 in start_thread () at /lib64/libpthread.so.0
25 0x00007f8ce40bb9fd in clone () at /lib64/libc.so.6
The assert failure is caused by the vhost_dev index for the ctrl vq
not being aligned with the one actually in use by the guest. Upon
multiqueue feature negotiation in virtio_net_set_multiqueue(), if the
guest doesn't support multiqueue, the guest vq layout shrinks to a
single queue pair of 3 vqs in total (rx, tx and ctrl). This results
in ctrl_vq taking a different vhost_dev group index than the default
n->max_queue_pairs, which is only valid for a multiqueue guest.
Meanwhile, on those additional vqs not exposed to the guest,
vhost_net_set_vq_index() never populates vq_index properly, hence
the assert failure.
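For illustration, the index math behind the mismatch (a standalone
sketch, not code from this patch; numbers follow the OVMF case above
with max_queue_pairs=8):

    #include <stdio.h>

    int main(void)
    {
        int max_queue_pairs = 8;

        /* guest view without VIRTIO_NET_F_MQ: rx=0, tx=1, ctrl=2 */
        int guest_ctrl_idx = 2;

        /* host side: the cvq peer/vhost_dev was laid out for the full
         * MQ topology, i.e. after all data queue pairs */
        int host_ctrl_idx = max_queue_pairs * 2;

        /* 2 vs 16: the guest notifies on an index that the cvq
         * vhost_dev never had its vq_index set for, tripping the
         * assert in vhost_vdpa_get_vq_index() */
        printf("guest ctrl idx %d, host ctrl idx %d\n",
               guest_ctrl_idx, host_ctrl_idx);
        return 0;
    }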
A possible fix is to pick the correct vhost_dev group for the control
vq according to this table [*]:
vdpa tool / QEMU arg / guest config / ctrl_vq group index
----------------------------------------------------------------
max_vqp 8 / mq=on / mq=off (UEFI) => data_queue_pairs
max_vqp 8 / mq=on / mq=on (Linux) => n->max_queue_pairs(>1)
max_vqp 8 / mq=off / mq=on (Linux) => n->max_queue_pairs(=1)
[*] Please see the FIXME in the code for open questions and discussion
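For illustration, the selection in the table boils down to something
like the following (a sketch of the proposed logic, not the final
code; mq here mirrors virtio_host_has_feature(dev, VIRTIO_NET_F_MQ)):

    #include <stdbool.h>

    static int ctrl_vq_peer_index(int data_queue_pairs,
                                  int max_queue_pairs, bool mq)
    {
        /* with mq=on the cvq peer follows the data queue pairs actually
         * in use by the guest; with mq=off it stays at max_queue_pairs,
         * which equals 1 in that configuration */
        return mq ? data_queue_pairs : max_queue_pairs;
    }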
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
---
hw/net/vhost_net.c | 13 +++++++++----
hw/virtio/vhost-vdpa.c | 25 ++++++++++++++++++++++++-
2 files changed, 33 insertions(+), 5 deletions(-)
diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index 30379d2..9a4479b 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -322,6 +322,7 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
     BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(dev)));
     VirtioBusState *vbus = VIRTIO_BUS(qbus);
     VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(vbus);
+    bool mq = virtio_host_has_feature(dev, VIRTIO_NET_F_MQ);
     int total_notifiers = data_queue_pairs * 2 + cvq;
     VirtIONet *n = VIRTIO_NET(dev);
     int nvhosts = data_queue_pairs + cvq;
@@ -343,7 +344,7 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
         if (i < data_queue_pairs) {
             peer = qemu_get_peer(ncs, i);
         } else { /* Control Virtqueue */
-            peer = qemu_get_peer(ncs, n->max_queue_pairs);
+            peer = qemu_get_peer(ncs, mq ? data_queue_pairs : n->max_queue_pairs);
         }

         net = get_vhost_net(peer);
@@ -368,7 +369,7 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
         if (i < data_queue_pairs) {
             peer = qemu_get_peer(ncs, i);
         } else {
-            peer = qemu_get_peer(ncs, n->max_queue_pairs);
+            peer = qemu_get_peer(ncs, mq ? data_queue_pairs : n->max_queue_pairs);
         }
         r = vhost_net_start_one(get_vhost_net(peer), dev);

@@ -390,7 +391,10 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,

 err_start:
     while (--i >= 0) {
-        peer = qemu_get_peer(ncs , i);
+        if (mq)
+            peer = qemu_get_peer(ncs, i < data_queue_pairs ? i : data_queue_pairs);
+        else
+            peer = qemu_get_peer(ncs, i < data_queue_pairs ? i : n->max_queue_pairs);
         vhost_net_stop_one(get_vhost_net(peer), dev);
     }
     e = k->set_guest_notifiers(qbus->parent, total_notifiers, false);
@@ -409,6 +413,7 @@ void vhost_net_stop(VirtIODevice *dev, NetClientState *ncs,
     VirtioBusState *vbus = VIRTIO_BUS(qbus);
     VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(vbus);
     VirtIONet *n = VIRTIO_NET(dev);
+    bool mq = virtio_host_has_feature(dev, VIRTIO_NET_F_MQ);
     NetClientState *peer;
     int total_notifiers = data_queue_pairs * 2 + cvq;
     int nvhosts = data_queue_pairs + cvq;
@@ -418,7 +423,7 @@ void vhost_net_stop(VirtIODevice *dev, NetClientState *ncs,
         if (i < data_queue_pairs) {
             peer = qemu_get_peer(ncs, i);
         } else {
-            peer = qemu_get_peer(ncs, n->max_queue_pairs);
+            peer = qemu_get_peer(ncs, mq ? data_queue_pairs : n->max_queue_pairs);
         }
         vhost_net_stop_one(get_vhost_net(peer), dev);
     }
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 27ea706..623476e 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -1097,7 +1097,30 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
         vhost_vdpa_host_notifiers_uninit(dev, dev->nvqs);
     }

-    if (dev->vq_index + dev->nvqs != dev->vq_index_end) {
+    /* FIXME the vhost_dev group for the control vq may have bogus nvqs=2
+     * value rather than nvqs=1. This can happen in case the guest doesn't
+     * support multiqueue: as a result of virtio_net_change_num_queue_pairs()
+     * destroying and rebuilding all the vqs, the guest index for the control
+     * vq will no longer align with the host's. Currently net_init_vhost_vdpa()
+     * only initializes all vhost_dev's and net_clients once during
+     * net_client_init1() time, way earlier before multiqueue feature
+     * negotiation can kick in.
+     *
+     * Discussion - some possible fixes so far I can think of:
+     *
+     * option 1: fix vhost_net->dev.nvqs and nc->is_datapath in place for
+     * vdpa's ctrl vq, or rebuild all vdpa's vhost_dev groups and the
+     * net_client array, in the virtio_net_set_multiqueue() path;
+     *
+     * option 2: fix vhost_dev->nvqs in place at vhost_vdpa_set_features()
+     * before coming down to vhost_vdpa_dev_start() (Q: nc->is_datapath
+     * seems only used in virtio_net_device_realize, is it relevant?);
+     *
+     * option 3: use host queue index all along in vhost-vdpa ioctls instead
+     * of using guest vq index, so that vhost_net_start/stop() can remain
+     * as-is today
+     */
+    if (dev->vq_index + dev->nvqs < dev->vq_index_end) {
         return 0;
     }
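For illustration, why the comparison above is relaxed from != to <
(numbers assume the non-MQ guest case described earlier, where the cvq
vhost_dev still carries the bogus nvqs=2):

    /* guest layout rx=0, tx=1, ctrl=2  ->  vq_index_end == 3;
     * the cvq vhost_dev got vq_index == 2 but kept nvqs == 2, so:
     *   old check: 2 + 2 != 3  ->  true, return 0, and the last
     *              vhost_dev never completes the start sequence;
     *   new check: 2 + 2 <  3  ->  false, fall through as last dev. */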
--
1.8.3.1
* Re: [RFC PATCH] vhost_net: should not use max_queue_pairs for non-mq guest
2022-03-19 4:13 [RFC PATCH] vhost_net: should not use max_queue_pairs for non-mq guest Si-Wei Liu
@ 2022-03-22 3:47 ` Jason Wang
2022-03-25 7:01 ` Si-Wei Liu
0 siblings, 1 reply; 5+ messages in thread
From: Jason Wang @ 2022-03-22 3:47 UTC (permalink / raw)
To: Si-Wei Liu; +Cc: eperezma, Eli Cohen, qemu-devel, mst
On Sat, Mar 19, 2022 at 12:14 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>
> [...]
>
> -    if (dev->vq_index + dev->nvqs != dev->vq_index_end) {
> +    /* FIXME the vhost_dev group for the control vq may have bogus nvqs=2
> +     * value rather than nvqs=1. This can happen in case the guest doesn't
> +     * support multiqueue: as a result of virtio_net_change_num_queue_pairs()
> +     * destroying and rebuilding all the vqs, the guest index for the control
> +     * vq will no longer align with the host's. Currently net_init_vhost_vdpa()
> +     * only initializes all vhost_dev's and net_clients once during
> +     * net_client_init1() time, way earlier before multiqueue feature
> +     * negotiation can kick in.
See below, it looks like the code doesn't find the correct vhost_dev.
> +     *
> +     * Discussion - some possible fixes so far I can think of:
> +     *
> +     * option 1: fix vhost_net->dev.nvqs and nc->is_datapath in place for
> +     * vdpa's ctrl vq, or rebuild all vdpa's vhost_dev groups and the
> +     * net_client array, in the virtio_net_set_multiqueue() path;
> +     *
> +     * option 2: fix vhost_dev->nvqs in place at vhost_vdpa_set_features()
> +     * before coming down to vhost_vdpa_dev_start() (Q: nc->is_datapath
> +     * seems only used in virtio_net_device_realize, is it relevant?);
Relevant but not directly related: for the vhost_dev where
nc->is_datapath is false, it is assumed to be backed by a single
queue, not a queue pair.
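For concreteness, the assumption is roughly the following (a sketch,
not actual QEMU code):

    /* a datapath net client is backed by an rx/tx vq pair; the
     * control net client (nc->is_datapath == false) is backed by a
     * single cvq */
    int nvqs = nc->is_datapath ? 2 : 1;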
> +     *
> +     * option 3: use host queue index all along in vhost-vdpa ioctls instead
> +     * of using guest vq index, so that vhost_net_start/stop() can remain
> +     * as-is today
> +     */
Note that the vq_index of each vhost_dev is assigned in
vhost_net_start() according to whether or not MQ or CVQ was negotiated:

    for (i = 0; i < nvhosts; i++) {
        if (i < data_queue_pairs) {
            peer = qemu_get_peer(ncs, i);
        } else { /* Control Virtqueue */
            peer = qemu_get_peer(ncs, n->max_queue_pairs);
        }

        net = get_vhost_net(peer);
        vhost_net_set_vq_index(net, i * 2, index_end);

It means some of the peers won't be used when MQ is not negotiated. So
it looks to me the evil comes from virtio_net_guest_notifier_mask(),
which doesn't pick the correct vhost_dev when the guest doesn't
support MQ but the host does. So we have option 4:
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 2087516253..5e9ac019cd 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -3179,7 +3179,13 @@ static void virtio_net_guest_notifier_mask(VirtIODevice *vdev, int idx,
                                            bool mask)
 {
     VirtIONet *n = VIRTIO_NET(vdev);
-    NetClientState *nc = qemu_get_subqueue(n->nic, vq2q(idx));
+    NetClientState *nc;
+
+    if (!virtio_vdev_has_feature(vdev, VIRTIO_NET_F_MQ) && idx == 2) {
+        nc = qemu_get_subqueue(n->nic, n->max_queue_pairs);
+    } else {
+        nc = qemu_get_subqueue(n->nic, vq2q(idx));
+    }
     assert(n->vhost_started);
     vhost_net_virtqueue_mask(get_vhost_net(nc->peer),
                              vdev, idx, mask);
Thanks
* Re: [RFC PATCH] vhost_net: should not use max_queue_pairs for non-mq guest
2022-03-22 3:47 ` Jason Wang
@ 2022-03-25 7:01 ` Si-Wei Liu
2022-03-25 7:59 ` Jason Wang
0 siblings, 1 reply; 5+ messages in thread
From: Si-Wei Liu @ 2022-03-25 7:01 UTC (permalink / raw)
To: Jason Wang; +Cc: eperezma, Eli Cohen, qemu-devel, mst
On 3/21/2022 8:47 PM, Jason Wang wrote:
> [...]
>
> Note that the vq_index of each vhost_dev is assigned in
> vhost_net_start() according to whether or not MQ or CVQ was negotiated:
>
>     for (i = 0; i < nvhosts; i++) {
>         if (i < data_queue_pairs) {
>             peer = qemu_get_peer(ncs, i);
>         } else { /* Control Virtqueue */
>             peer = qemu_get_peer(ncs, n->max_queue_pairs);
>         }
>
>         net = get_vhost_net(peer);
>         vhost_net_set_vq_index(net, i * 2, index_end);
>
> It means some of the peers won't be used when MQ is not negotiated. So
> it looks to me the evil comes from virtio_net_guest_notifier_mask(),
Yes, there it is: the place where the control virtqueue first needs a
guest notifier for its vhost_dev.
> which doesn't pick the correct vhost_dev when the guest doesn't
> support MQ but the host does. So we have option 4:
>
> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> index 2087516253..5e9ac019cd 100644
> --- a/hw/net/virtio-net.c
> +++ b/hw/net/virtio-net.c
> @@ -3179,7 +3179,13 @@ static void virtio_net_guest_notifier_mask(VirtIODevice *vdev, int idx,
>                                             bool mask)
>  {
>      VirtIONet *n = VIRTIO_NET(vdev);
> -    NetClientState *nc = qemu_get_subqueue(n->nic, vq2q(idx));
> +    NetClientState *nc;
> +
> +    if (!virtio_vdev_has_feature(vdev, VIRTIO_NET_F_MQ) && idx == 2) {
Hmmm, I thought it would be more natural to align the layout of
vhost_dev's with that of the virtqueues, not the other way around. Not
sure how this vhost_dev selection scheme may work with additional
queues discovered through transport-specific mechanisms, such as the
admin virtqueue, but I can live with it for now:
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -244,7 +244,8 @@ static void virtio_net_vhost_status(VirtIONet *n, uint8_t status)
     VirtIODevice *vdev = VIRTIO_DEVICE(n);
     NetClientState *nc = qemu_get_queue(n->nic);
     int queue_pairs = n->multiqueue ? n->max_queue_pairs : 1;
-    int cvq = n->max_ncs - n->max_queue_pairs;
+    int cvq = virtio_vdev_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ) ?
+              n->max_ncs - n->max_queue_pairs : 0;

     if (!get_vhost_net(nc->peer)) {
         return;
@@ -3161,8 +3162,14 @@ static NetClientInfo net_virtio_info = {
 static bool virtio_net_guest_notifier_pending(VirtIODevice *vdev, int idx)
 {
     VirtIONet *n = VIRTIO_NET(vdev);
-    NetClientState *nc = qemu_get_subqueue(n->nic, vq2q(idx));
+    NetClientState *nc;
     assert(n->vhost_started);
+    if (!virtio_vdev_has_feature(vdev, VIRTIO_NET_F_MQ) && idx == 2) {
+        assert(virtio_vdev_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ));
+        nc = qemu_get_subqueue(n->nic, n->max_queue_pairs);
+    } else {
+        nc = qemu_get_subqueue(n->nic, vq2q(idx));
+    }
     return vhost_net_virtqueue_pending(get_vhost_net(nc->peer), idx);
 }

@@ -3170,8 +3177,14 @@ static void virtio_net_guest_notifier_mask(VirtIODevice *vdev, int idx,
                                            bool mask)
 {
     VirtIONet *n = VIRTIO_NET(vdev);
-    NetClientState *nc = qemu_get_subqueue(n->nic, vq2q(idx));
+    NetClientState *nc;
     assert(n->vhost_started);
+    if (!virtio_vdev_has_feature(vdev, VIRTIO_NET_F_MQ) && idx == 2) {
+        assert(virtio_vdev_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ));
+        nc = qemu_get_subqueue(n->nic, n->max_queue_pairs);
+    } else {
+        nc = qemu_get_subqueue(n->nic, vq2q(idx));
+    }
     vhost_net_virtqueue_mask(get_vhost_net(nc->peer),
                              vdev, idx, mask);
 }
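For illustration, the index math the idx == 2 special case works
around (a standalone sketch; vq2q() in virtio-net.c is simply
queue_index / 2):

    int idx = 2;        /* guest notifier index of the cvq, non-MQ layout */
    int qp  = idx / 2;  /* vq2q(2) == 1: a data queue pair subqueue */
    /* but the cvq net client actually lives at subqueue
     * n->max_queue_pairs, hence the explicit branch above */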
Thanks,
-Siwei
* Re: [RFC PATCH] vhost_net: should not use max_queue_pairs for non-mq guest
2022-03-25 7:01 ` Si-Wei Liu
@ 2022-03-25 7:59 ` Jason Wang
2022-03-25 23:15 ` Si-Wei Liu
0 siblings, 1 reply; 5+ messages in thread
From: Jason Wang @ 2022-03-25 7:59 UTC (permalink / raw)
To: Si-Wei Liu; +Cc: eperezma, Eli Cohen, qemu-devel, mst
On Fri, Mar 25, 2022 at 3:02 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>
> [...]
>
> Hmmm, I thought it would be more natural to align the layout of
> vhost_dev's with that of the virtqueues, not the other way around.
The problem is that we need to make sure it works for vhost_net as
well, where cvq is not supported.
> Not sure how this vhost_dev selection scheme may work with additional
> queues discovered through transport-specific mechanisms, such as the
> admin virtqueue, but I can live with it for now:
>
> --- a/hw/net/virtio-net.c
> +++ b/hw/net/virtio-net.c
> @@ -244,7 +244,8 @@ static void virtio_net_vhost_status(VirtIONet *n, uint8_t status)
>      VirtIODevice *vdev = VIRTIO_DEVICE(n);
>      NetClientState *nc = qemu_get_queue(n->nic);
>      int queue_pairs = n->multiqueue ? n->max_queue_pairs : 1;
> -    int cvq = n->max_ncs - n->max_queue_pairs;
> +    int cvq = virtio_vdev_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ) ?
> +              n->max_ncs - n->max_queue_pairs : 0;
Any reason for this line?
Btw, would you mind posting a formal patch for this?
Thanks
* Re: [RFC PATCH] vhost_net: should not use max_queue_pairs for non-mq guest
2022-03-25 7:59 ` Jason Wang
@ 2022-03-25 23:15 ` Si-Wei Liu
0 siblings, 0 replies; 5+ messages in thread
From: Si-Wei Liu @ 2022-03-25 23:15 UTC (permalink / raw)
To: Jason Wang; +Cc: eperezma, Eli Cohen, qemu-devel, mst
On 3/25/2022 12:59 AM, Jason Wang wrote:
> On Fri, Mar 25, 2022 at 3:02 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>>
>>
>> On 3/21/2022 8:47 PM, Jason Wang wrote:
>>> On Sat, Mar 19, 2022 at 12:14 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>>>> With MQ enabled vdpa device and non-MQ supporting guest e.g.
>>>> booting vdpa with mq=on over OVMF of single vqp, it's easy
>>>> to hit assert failure as the following:
>>>>
>>>> ../hw/virtio/vhost-vdpa.c:560: vhost_vdpa_get_vq_index: Assertion `idx >= dev->vq_index && idx < dev->vq_index + dev->nvqs' failed.
>>>>
>>>> 0 0x00007f8ce3ff3387 in raise () at /lib64/libc.so.6
>>>> 1 0x00007f8ce3ff4a78 in abort () at /lib64/libc.so.6
>>>> 2 0x00007f8ce3fec1a6 in __assert_fail_base () at /lib64/libc.so.6
>>>> 3 0x00007f8ce3fec252 in () at /lib64/libc.so.6
>>>> 4 0x0000558f52d79421 in vhost_vdpa_get_vq_index (dev=<optimized out>, idx=<optimized out>) at ../hw/virtio/vhost-vdpa.c:563
>>>> 5 0x0000558f52d79421 in vhost_vdpa_get_vq_index (dev=<optimized out>, idx=<optimized out>) at ../hw/virtio/vhost-vdpa.c:558
>>>> 6 0x0000558f52d7329a in vhost_virtqueue_mask (hdev=0x558f55c01800, vdev=0x558f568f91f0, n=2, mask=<optimized out>) at ../hw/virtio/vhost.c:1557
>>>> 7 0x0000558f52c6b89a in virtio_pci_set_guest_notifier (d=d@entry=0x558f568f0f60, n=n@entry=2, assign=assign@entry=true, with_irqfd=with_irqfd@entry=false)
>>>> at ../hw/virtio/virtio-pci.c:974
>>>> 8 0x0000558f52c6c0d8 in virtio_pci_set_guest_notifiers (d=0x558f568f0f60, nvqs=3, assign=true) at ../hw/virtio/virtio-pci.c:1019
>>>> 9 0x0000558f52bf091d in vhost_net_start (dev=dev@entry=0x558f568f91f0, ncs=0x558f56937cd0, data_queue_pairs=data_queue_pairs@entry=1, cvq=cvq@entry=1)
>>>> at ../hw/net/vhost_net.c:361
>>>> 10 0x0000558f52d4e5e7 in virtio_net_set_status (status=<optimized out>, n=0x558f568f91f0) at ../hw/net/virtio-net.c:289
>>>> 11 0x0000558f52d4e5e7 in virtio_net_set_status (vdev=0x558f568f91f0, status=15 '\017') at ../hw/net/virtio-net.c:370
>>>> 12 0x0000558f52d6c4b2 in virtio_set_status (vdev=vdev@entry=0x558f568f91f0, val=val@entry=15 '\017') at ../hw/virtio/virtio.c:1945
>>>> 13 0x0000558f52c69eff in virtio_pci_common_write (opaque=0x558f568f0f60, addr=<optimized out>, val=<optimized out>, size=<optimized out>) at ../hw/virtio/virtio-pci.c:1292
>>>> 14 0x0000558f52d15d6e in memory_region_write_accessor (mr=0x558f568f19d0, addr=20, value=<optimized out>, size=1, shift=<optimized out>, mask=<optimized out>, attrs=...)
>>>> at ../softmmu/memory.c:492
>>>> 15 0x0000558f52d127de in access_with_adjusted_size (addr=addr@entry=20, value=value@entry=0x7f8cdbffe748, size=size@entry=1, access_size_min=<optimized out>, access_size_max=<optimized out>, access_fn=0x558f52d15cf0 <memory_region_write_accessor>, mr=0x558f568f19d0, attrs=...) at ../softmmu/memory.c:554
>>>> 16 0x0000558f52d157ef in memory_region_dispatch_write (mr=mr@entry=0x558f568f19d0, addr=20, data=<optimized out>, op=<optimized out>, attrs=attrs@entry=...)
>>>> at ../softmmu/memory.c:1504
>>>> 17 0x0000558f52d078e7 in flatview_write_continue (fv=fv@entry=0x7f8accbc3b90, addr=addr@entry=103079215124, attrs=..., ptr=ptr@entry=0x7f8ce6300028, len=len@entry=1, addr1=<optimized out>, l=<optimized out>, mr=0x558f568f19d0) at /home/opc/qemu-upstream/include/qemu/host-utils.h:165
>>>> 18 0x0000558f52d07b06 in flatview_write (fv=0x7f8accbc3b90, addr=103079215124, attrs=..., buf=0x7f8ce6300028, len=1) at ../softmmu/physmem.c:2822
>>>> 19 0x0000558f52d0b36b in address_space_write (as=<optimized out>, addr=<optimized out>, attrs=..., buf=buf@entry=0x7f8ce6300028, len=<optimized out>)
>>>> at ../softmmu/physmem.c:2914
>>>> 20 0x0000558f52d0b3da in address_space_rw (as=<optimized out>, addr=<optimized out>, attrs=...,
>>>> attrs@entry=..., buf=buf@entry=0x7f8ce6300028, len=<optimized out>, is_write=<optimized out>) at ../softmmu/physmem.c:2924
>>>> 21 0x0000558f52dced09 in kvm_cpu_exec (cpu=cpu@entry=0x558f55c2da60) at ../accel/kvm/kvm-all.c:2903
>>>> 22 0x0000558f52dcfabd in kvm_vcpu_thread_fn (arg=arg@entry=0x558f55c2da60) at ../accel/kvm/kvm-accel-ops.c:49
>>>> 23 0x0000558f52f9f04a in qemu_thread_start (args=<optimized out>) at ../util/qemu-thread-posix.c:556
>>>> 24 0x00007f8ce4392ea5 in start_thread () at /lib64/libpthread.so.0
>>>> 25 0x00007f8ce40bb9fd in clone () at /lib64/libc.so.6
>>>>
>>>> The cause for the assert failure is due to that the vhost_dev index
>>>> for the ctrl vq was not aligned with actual one in use by the guest.
>>>> Upon multiqueue feature negotiation in virtio_net_set_multiqueue(),
>>>> if guest doesn't support multiqueue, the guest vq layout would shrink
>>>> to single queue pair of 3 vqs in total (rx, tx and ctrl). This results
>>>> in ctrl_vq taking a different vhost_dev group index than the default
>>>> n->max_queue_pairs, the latter of which is only valid for multiqueue
>>>> guest. While on those additional vqs not exposed to the guest,
>>>> vhost_net_set_vq_index() never populated vq_index properly, hence
>>>> getting the assert failure.
>>>>
>>>> A possible fix is to pick the correct vhost_dev group for the control
>>>> vq according to this table [*]:
>>>>
>>>> vdpa tool / QEMU arg / guest config / ctrl_vq group index
>>>> ----------------------------------------------------------------
>>>> max_vqp 8 / mq=on / mq=off (UEFI) => data_queue_pairs
>>>> max_vqp 8 / mq=on / mq=on (Linux) => n->max_queue_pairs(>1)
>>>> max_vqp 8 / mq=off / mq=on (Linux) => n->max_queue_pairs(=1)
>>>>
>>>> [*] Please see FIXME in the code for open question and discussion
>>>>
>>>> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
>>>> ---
>>>> hw/net/vhost_net.c | 13 +++++++++----
>>>> hw/virtio/vhost-vdpa.c | 25 ++++++++++++++++++++++++-
>>>> 2 files changed, 33 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
>>>> index 30379d2..9a4479b 100644
>>>> --- a/hw/net/vhost_net.c
>>>> +++ b/hw/net/vhost_net.c
>>>> @@ -322,6 +322,7 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
>>>>      BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(dev)));
>>>>      VirtioBusState *vbus = VIRTIO_BUS(qbus);
>>>>      VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(vbus);
>>>> +    bool mq = virtio_host_has_feature(dev, VIRTIO_NET_F_MQ);
>>>>      int total_notifiers = data_queue_pairs * 2 + cvq;
>>>>      VirtIONet *n = VIRTIO_NET(dev);
>>>>      int nvhosts = data_queue_pairs + cvq;
>>>> @@ -343,7 +344,7 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
>>>>          if (i < data_queue_pairs) {
>>>>              peer = qemu_get_peer(ncs, i);
>>>>          } else { /* Control Virtqueue */
>>>> -            peer = qemu_get_peer(ncs, n->max_queue_pairs);
>>>> +            peer = qemu_get_peer(ncs, mq ? data_queue_pairs : n->max_queue_pairs);
>>>>          }
>>>>
>>>>          net = get_vhost_net(peer);
>>>> @@ -368,7 +369,7 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
>>>>          if (i < data_queue_pairs) {
>>>>              peer = qemu_get_peer(ncs, i);
>>>>          } else {
>>>> -            peer = qemu_get_peer(ncs, n->max_queue_pairs);
>>>> +            peer = qemu_get_peer(ncs, mq ? data_queue_pairs : n->max_queue_pairs);
>>>>          }
>>>>          r = vhost_net_start_one(get_vhost_net(peer), dev);
>>>>
>>>> @@ -390,7 +391,13 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
>>>>
>>>>  err_start:
>>>>      while (--i >= 0) {
>>>> -        peer = qemu_get_peer(ncs , i);
>>>> +        if (mq) {
>>>> +            peer = qemu_get_peer(ncs,
>>>> +                                 i < data_queue_pairs ? i : data_queue_pairs);
>>>> +        } else {
>>>> +            peer = qemu_get_peer(ncs,
>>>> +                                 i < data_queue_pairs ? i : n->max_queue_pairs);
>>>> +        }
>>>>          vhost_net_stop_one(get_vhost_net(peer), dev);
>>>>      }
>>>>      e = k->set_guest_notifiers(qbus->parent, total_notifiers, false);
>>>> @@ -409,6 +416,7 @@ void vhost_net_stop(VirtIODevice *dev, NetClientState *ncs,
>>>>      VirtioBusState *vbus = VIRTIO_BUS(qbus);
>>>>      VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(vbus);
>>>>      VirtIONet *n = VIRTIO_NET(dev);
>>>> +    bool mq = virtio_host_has_feature(dev, VIRTIO_NET_F_MQ);
>>>>      NetClientState *peer;
>>>>      int total_notifiers = data_queue_pairs * 2 + cvq;
>>>>      int nvhosts = data_queue_pairs + cvq;
>>>> @@ -418,7 +426,7 @@ void vhost_net_stop(VirtIODevice *dev, NetClientState *ncs,
>>>>          if (i < data_queue_pairs) {
>>>>              peer = qemu_get_peer(ncs, i);
>>>>          } else {
>>>> -            peer = qemu_get_peer(ncs, n->max_queue_pairs);
>>>> +            peer = qemu_get_peer(ncs, mq ? data_queue_pairs : n->max_queue_pairs);
>>>>          }
>>>>          vhost_net_stop_one(get_vhost_net(peer), dev);
>>>>      }
>>>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
>>>> index 27ea706..623476e 100644
>>>> --- a/hw/virtio/vhost-vdpa.c
>>>> +++ b/hw/virtio/vhost-vdpa.c
>>>> @@ -1097,7 +1097,30 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
>>>>          vhost_vdpa_host_notifiers_uninit(dev, dev->nvqs);
>>>>      }
>>>>
>>>> -    if (dev->vq_index + dev->nvqs != dev->vq_index_end) {
>>>> +    /* FIXME the vhost_dev group for the control vq may carry a bogus
>>>> +     * nvqs=2 value rather than nvqs=1. This can happen when the guest
>>>> +     * doesn't support multiqueue: virtio_net_change_num_queue_pairs()
>>>> +     * destroys and rebuilds all the vqs, so the guest index for the
>>>> +     * control vq no longer aligns with the host's. Currently,
>>>> +     * net_init_vhost_vdpa() initializes all vhost_devs and net clients
>>>> +     * only once, at net_client_init1() time, well before multiqueue
>>>> +     * feature negotiation can kick in.
>>> See below, it looks like the code doesn't find the correct vhost_dev.
>>>
>>>> +     *
>>>> +     * Discussion - some possible fixes so far I can think of:
>>>> +     *
>>>> +     * option 1: fix vhost_net->dev.nvqs and nc->is_datapath in place
>>>> +     * for vdpa's ctrl vq, or rebuild all of vdpa's vhost_dev groups
>>>> +     * and the net_client array, in the virtio_net_set_multiqueue() path;
>>>> +     *
>>>> +     * option 2: fix vhost_dev->nvqs in place at vhost_vdpa_set_features()
>>>> +     * before coming down to vhost_vdpa_dev_start() (Q: nc->is_datapath
>>>> +     * seems to be used only in virtio_net_device_realize(); relevant?);
>>> Relevant but not directly related: for a vhost_dev whose
>>> nc->is_datapath is false, the code assumes it is backed by a single
>>> queue rather than a queue pair.
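>>>
>>> (Roughly, and only as an illustration of that initialization — the
>>> actual code may differ:
>>>
>>>     nvqs = nc->is_datapath ? 2 : 1;
>>>
>>> so a non-datapath client such as the ctrl vq peer ends up backed by a
>>> single-vq vhost_dev.)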
>>>
>>>> +     *
>>>> +     * option 3: use the host queue index throughout the vhost-vdpa
>>>> +     * ioctls rather than the guest vq index, so that
>>>> +     * vhost_net_start/stop() can remain as-is today.
>>>> +     */
>>> Note that the vq_index of each vhost_dev is assigned during
>>> vhost_net_start() according to whether or not the MQ or CVQ is
>>> negotiated in vhost_net_start()
>>>
>>>     for (i = 0; i < nvhosts; i++) {
>>>
>>>         if (i < data_queue_pairs) {
>>>             peer = qemu_get_peer(ncs, i);
>>>         } else { /* Control Virtqueue */
>>>             peer = qemu_get_peer(ncs, n->max_queue_pairs);
>>>         }
>>>
>>>         net = get_vhost_net(peer);
>>>         vhost_net_set_vq_index(net, i * 2, index_end);
>>>
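>>> Schematically, with a non-MQ guest over an mq=on backend (a sketch,
>>> not a verbatim trace):
>>>
>>>     /* data_queue_pairs == 1, cvq == 1, so nvhosts == 2 */
>>>     i == 0: peer = ncs[0];                   /* vq_index = 0 */
>>>     i == 1: peer = ncs[n->max_queue_pairs];  /* vq_index = 2 */
>>>     /* ncs[1..n->max_queue_pairs - 1] never get a vq_index */
>>>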
>>> It means some of the peers won't be used when MQ is not negotiated. So
>>> it looks to me like the culprit is virtio_net_guest_notifier_mask().
>> Yes, there it is. That is where the control virtqueue first needs a
>> guest notifier for its vhost_dev.
>>> It doesn't mask the correct vhost_dev when the guest doesn't support
>>> MQ but the host does. So we have option 4:
>>>
>>> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
>>> index 2087516253..5e9ac019cd 100644
>>> --- a/hw/net/virtio-net.c
>>> +++ b/hw/net/virtio-net.c
>>> @@ -3179,7 +3179,13 @@ static void virtio_net_guest_notifier_mask(VirtIODevice *vdev, int idx,
>>>                                             bool mask)
>>>  {
>>>      VirtIONet *n = VIRTIO_NET(vdev);
>>> -    NetClientState *nc = qemu_get_subqueue(n->nic, vq2q(idx));
>>> +    NetClientState *nc;
>>> +
>>> +    if (!virtio_vdev_has_feature(vdev, VIRTIO_NET_F_MQ) && idx == 2) {
>> Hmmm, I thought it would be more natural to align the layout of the
>> vhost_devs with that of the virtqueues, not the other way around.
> The problem is that we need to make sure it also works for vhost_net,
> which doesn't support cvq.
>
>> Not sure
>> how this vhost_dev selection scheme would work with additional queues
>> discovered through a transport-specific mechanism, such as the admin
>> virtqueue, but I can live with it for now:
>>
>> --- a/hw/net/virtio-net.c
>> +++ b/hw/net/virtio-net.c
>> @@ -244,7 +244,8 @@ static void virtio_net_vhost_status(VirtIONet *n, uint8_t status)
>>      VirtIODevice *vdev = VIRTIO_DEVICE(n);
>>      NetClientState *nc = qemu_get_queue(n->nic);
>>      int queue_pairs = n->multiqueue ? n->max_queue_pairs : 1;
>> -    int cvq = n->max_ncs - n->max_queue_pairs;
>> +    int cvq = virtio_vdev_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ) ?
>> +              n->max_ncs - n->max_queue_pairs : 0;
> Any reason for this line?
This corresponds to the following assert:

    assert(virtio_vdev_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ));

If QEMU or the guest doesn't support the control vq, there's no need to
bother exposing a vhost_dev and guest notifier for it. Note that
vhost_net_start/stop() implies DRIVER_OK is set in the device status,
meaning feature negotiation is already complete (same for n->multiqueue).
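
As a sketch of the effect (hypothetical numbers, just for illustration):

    /* e.g. n->max_ncs == 9, n->max_queue_pairs == 8; the guest acked
     * CTRL_VQ but not MQ */
    int queue_pairs = n->multiqueue ? n->max_queue_pairs : 1;      /* 1 */
    int cvq = virtio_vdev_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ) ?
              n->max_ncs - n->max_queue_pairs : 0;                 /* 1 */
    /* vhost_net_start()/stop() then cover one data queue pair plus the
     * ctrl vq; without CTRL_VQ negotiated, cvq would be 0 and no
     * vhost_dev or guest notifier would be set up for a ctrl vq. */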
>
> Btw, would you mind posting a formal patch for this?
I would love to. There is a set of mq bug fixes sitting in my queue
pending paperwork, but I got dragged into other stuff earlier this week.
I will try to post it early next week.
-Siwei
>
> Thanks
>
>>      if (!get_vhost_net(nc->peer)) {
>>          return;
>> @@ -3161,8 +3162,14 @@ static NetClientInfo net_virtio_info = {
>>  static bool virtio_net_guest_notifier_pending(VirtIODevice *vdev, int idx)
>>  {
>>      VirtIONet *n = VIRTIO_NET(vdev);
>> -    NetClientState *nc = qemu_get_subqueue(n->nic, vq2q(idx));
>> +    NetClientState *nc;
>>      assert(n->vhost_started);
>> +    if (!virtio_vdev_has_feature(vdev, VIRTIO_NET_F_MQ) && idx == 2) {
>> +        assert(virtio_vdev_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ));
>> +        nc = qemu_get_subqueue(n->nic, n->max_queue_pairs);
>> +    } else {
>> +        nc = qemu_get_subqueue(n->nic, vq2q(idx));
>> +    }
>>      return vhost_net_virtqueue_pending(get_vhost_net(nc->peer), idx);
>>  }
>>
>> @@ -3170,8 +3177,14 @@ static void virtio_net_guest_notifier_mask(VirtIODevice *vdev, int idx,
>>                                             bool mask)
>>  {
>>      VirtIONet *n = VIRTIO_NET(vdev);
>> -    NetClientState *nc = qemu_get_subqueue(n->nic, vq2q(idx));
>> +    NetClientState *nc;
>>      assert(n->vhost_started);
>> +    if (!virtio_vdev_has_feature(vdev, VIRTIO_NET_F_MQ) && idx == 2) {
>> +        assert(virtio_vdev_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ));
>> +        nc = qemu_get_subqueue(n->nic, n->max_queue_pairs);
>> +    } else {
>> +        nc = qemu_get_subqueue(n->nic, vq2q(idx));
>> +    }
>>      vhost_net_virtqueue_mask(get_vhost_net(nc->peer),
>>                               vdev, idx, mask);
>>  }
>>
>>
>> Thanks,
>> -Siwei
>>
>>> +        nc = qemu_get_subqueue(n->nic, n->max_queue_pairs);
>>> +    } else {
>>> +        nc = qemu_get_subqueue(n->nic, vq2q(idx));
>>> +    }
>>>      assert(n->vhost_started);
>>>      vhost_net_virtqueue_mask(get_vhost_net(nc->peer),
>>>                               vdev, idx, mask);
>>>
>>> Thanks
>>>
>>>> +    if (dev->vq_index + dev->nvqs < dev->vq_index_end) {
>>>>          return 0;
>>>>      }
>>>>
>>>> --
>>>> 1.8.3.1
>>>>