All of lore.kernel.org
* [PATCH v1 0/2] virtio: fix for assertion failure: virtio_net_get_subqueue(nc)->async_tx.elem failed
@ 2023-01-29  2:51 Xuan Zhuo
  2023-01-29  2:51 ` [PATCH v1 1/2] virtio: struct VirtQueue introduce reset Xuan Zhuo
                   ` (2 more replies)
  0 siblings, 3 replies; 32+ messages in thread
From: Xuan Zhuo @ 2023-01-29  2:51 UTC (permalink / raw)
  To: qemu-devel; +Cc: Michael S. Tsirkin, Jason Wang

In the current design, we stop the device from operating on the vring
during per-queue reset by resetting the structure VirtQueue.

But before the reset operation, while recycling resources, we must stop
referencing new vring resources.

This is the root cause of the following bug:

    https://gitlab.com/qemu-project/qemu/-/issues/1451

Before we reset the structure, we call the ->queue_reset callback to let the
device reclaim resources. Here virtio-net tries to release the packets sent
asynchronously, but during this process virtio_net_flush_tx() is called,
and new data is sent again. This triggers the assertion:

     assert(!virtio_net_get_subqueue(nc)->async_tx.elem);

This patch set introduces a new field "reset" into struct VirtQueue, so the
device can know that a virtqueue is in the per-queue reset state.
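[Editor's note: for illustration only, the intended state machine can be
sketched as a standalone C model. The names mirror the patch, but this
harness is hypothetical and not QEMU code.]

```c
#include <stdbool.h>

/* Minimal stand-in for QEMU's VirtQueue: only the new flag matters here. */
struct VirtQueue {
    bool disabled_by_reset;
};

/* Device-side query, mirroring virtio_queue_reset_state() from patch 1. */
static bool virtio_queue_reset_state(struct VirtQueue *vq)
{
    return vq->disabled_by_reset;
}

/* Per-queue reset marks the queue before the device callback runs,
 * so code reached from inside that callback can observe the state. */
static void virtio_queue_reset(struct VirtQueue *vq)
{
    vq->disabled_by_reset = true;
    /* k->queue_reset(vdev, queue_index) would be invoked here */
}

/* Re-enabling the queue clears the flag again. */
static void virtio_queue_enable(struct VirtQueue *vq)
{
    vq->disabled_by_reset = false;
    /* k->queue_enable(vdev, queue_index) would be invoked here */
}

/* A flush path refuses to touch the vring while the queue is being reset. */
static int flush_tx(struct VirtQueue *vq)
{
    if (virtio_queue_reset_state(vq)) {
        return 0; /* send no new packets during per-queue reset */
    }
    return 1; /* pretend one packet was sent */
}
```

The key property is that the flag is set before the device's queue_reset
callback runs, so any flush triggered from inside that callback sees it.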

v1:
    1. rename "reset" to disabled_by_reset
    2. add api: virtio_queue_reset_state()

Xuan Zhuo (2):
  virtio: struct VirtQueue introduce reset
  virtio-net: virtio_net_flush_tx() check for per-queue reset

 hw/net/virtio-net.c        |  3 ++-
 hw/virtio/virtio.c         | 15 +++++++++++++++
 include/hw/virtio/virtio.h |  1 +
 3 files changed, 18 insertions(+), 1 deletion(-)

--
2.32.0.3.g01195cf9f



^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH v1 1/2] virtio: struct VirtQueue introduce reset
  2023-01-29  2:51 [PATCH v1 0/2] virtio: fix for assertion failure: virtio_net_get_subqueue(nc)->async_tx.elem failed Xuan Zhuo
@ 2023-01-29  2:51 ` Xuan Zhuo
  2023-01-29  7:12   ` Michael S. Tsirkin
  2023-01-29  2:51 ` [PATCH v1 2/2] virtio-net: virtio_net_flush_tx() check for per-queue reset Xuan Zhuo
  2023-01-29  7:26 ` [PATCH v1 0/2] virtio: fix for assertion failure: virtio_net_get_subqueue(nc)->async_tx.elem failed Michael S. Tsirkin
  2 siblings, 1 reply; 32+ messages in thread
From: Xuan Zhuo @ 2023-01-29  2:51 UTC (permalink / raw)
  To: qemu-devel; +Cc: Michael S. Tsirkin, Jason Wang

In the current design, we stop the device from operating on the vring
during per-queue reset by resetting the structure VirtQueue.

But before the reset operation, while recycling resources, we must stop
referencing new vring resources. For example, when recycling virtio-net's
asynchronous transmit resources, virtio-net must be able to detect that the
current queue is in the per-queue reset state, and stop sending new packets
from the tx queue.

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
 hw/virtio/virtio.c         | 15 +++++++++++++++
 include/hw/virtio/virtio.h |  1 +
 2 files changed, 16 insertions(+)

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index f35178f5fc..c954f2a2b3 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -142,6 +142,8 @@ struct VirtQueue
     /* Notification enabled? */
     bool notification;
 
+    bool disabled_by_reset;
+
     uint16_t queue_index;
 
     unsigned int inuse;
@@ -2079,6 +2081,12 @@ void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index)
 {
     VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
 
+    /*
+     * Mark this queue as being in per-queue reset state. The device should
+     * release its references to the vring and not acquire new vring items.
+     */
+    vdev->vq[queue_index].disabled_by_reset = true;
+
     if (k->queue_reset) {
         k->queue_reset(vdev, queue_index);
     }
@@ -2102,11 +2110,18 @@ void virtio_queue_enable(VirtIODevice *vdev, uint32_t queue_index)
     }
     */
 
+    vdev->vq[queue_index].disabled_by_reset = false;
+
     if (k->queue_enable) {
         k->queue_enable(vdev, queue_index);
     }
 }
 
+bool virtio_queue_reset_state(VirtQueue *vq)
+{
+    return vq->disabled_by_reset;
+}
+
 void virtio_reset(void *opaque)
 {
     VirtIODevice *vdev = opaque;
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index 77c6c55929..00e91af7c4 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -319,6 +319,7 @@ int virtio_set_status(VirtIODevice *vdev, uint8_t val);
 void virtio_reset(void *opaque);
 void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index);
 void virtio_queue_enable(VirtIODevice *vdev, uint32_t queue_index);
+bool virtio_queue_reset_state(VirtQueue *vq);
 void virtio_update_irq(VirtIODevice *vdev);
 int virtio_set_features(VirtIODevice *vdev, uint64_t val);
 
-- 
2.32.0.3.g01195cf9f




* [PATCH v1 2/2] virtio-net: virtio_net_flush_tx() check for per-queue reset
  2023-01-29  2:51 [PATCH v1 0/2] virtio: fix for assertion failure: virtio_net_get_subqueue(nc)->async_tx.elem failed Xuan Zhuo
  2023-01-29  2:51 ` [PATCH v1 1/2] virtio: struct VirtQueue introduce reset Xuan Zhuo
@ 2023-01-29  2:51 ` Xuan Zhuo
  2023-01-29  6:23   ` Jason Wang
  2023-01-29  7:25   ` Michael S. Tsirkin
  2023-01-29  7:26 ` [PATCH v1 0/2] virtio: fix for assertion failure: virtio_net_get_subqueue(nc)->async_tx.elem failed Michael S. Tsirkin
  2 siblings, 2 replies; 32+ messages in thread
From: Xuan Zhuo @ 2023-01-29  2:51 UTC (permalink / raw)
  To: qemu-devel; +Cc: Michael S. Tsirkin, Jason Wang, Alexander Bulekov

Check whether the queue is in the per-queue reset state in
virtio_net_flush_tx().

Before per-queue reset, we need to reclaim the async tx resources. During
this process virtio_net_flush_tx() is called, but we must not try to send
new packets, so virtio_net_flush_tx() now checks the current per-queue
reset state.

Fixes: 7dc6be52 ("virtio-net: support queue reset")
Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
 hw/net/virtio-net.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 3ae909041a..fba6451a50 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
     VirtQueueElement *elem;
     int32_t num_packets = 0;
     int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
-    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
+    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
+        virtio_queue_reset_state(q->tx_vq)) {
         return num_packets;
     }
 
-- 
2.32.0.3.g01195cf9f




* Re: [PATCH v1 2/2] virtio-net: virtio_net_flush_tx() check for per-queue reset
  2023-01-29  2:51 ` [PATCH v1 2/2] virtio-net: virtio_net_flush_tx() check for per-queue reset Xuan Zhuo
@ 2023-01-29  6:23   ` Jason Wang
  2023-01-29  7:43     ` Xuan Zhuo
  2023-01-29  7:25   ` Michael S. Tsirkin
  1 sibling, 1 reply; 32+ messages in thread
From: Jason Wang @ 2023-01-29  6:23 UTC (permalink / raw)
  To: Xuan Zhuo; +Cc: qemu-devel, Michael S. Tsirkin, Alexander Bulekov

On Sun, Jan 29, 2023 at 10:52 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
>
> Check whether it is per-queue reset state in virtio_net_flush_tx().
>
> Before per-queue reset, we need to recover async tx resources. At this
> time, virtio_net_flush_tx() is called, but we should not try to send
> new packets, so virtio_net_flush_tx() should check the current
> per-queue reset state.
>
> Fixes: 7dc6be52 ("virtio-net: support queue reset")
> Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> Reported-by: Alexander Bulekov <alxndr@bu.edu>
> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> ---
>  hw/net/virtio-net.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> index 3ae909041a..fba6451a50 100644
> --- a/hw/net/virtio-net.c
> +++ b/hw/net/virtio-net.c
> @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
>      VirtQueueElement *elem;
>      int32_t num_packets = 0;
>      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> +        virtio_queue_reset_state(q->tx_vq)) {

We have other places that check DRIVER_OK do we need to check queue
reset as well?

E.g:
virtio_net_can_receive()
virtio_net_tx_{timer|bh}()

Thanks

>          return num_packets;
>      }
>
> --
> 2.32.0.3.g01195cf9f
>




* Re: [PATCH v1 1/2] virtio: struct VirtQueue introduce reset
  2023-01-29  2:51 ` [PATCH v1 1/2] virtio: struct VirtQueue introduce reset Xuan Zhuo
@ 2023-01-29  7:12   ` Michael S. Tsirkin
  2023-01-29  7:15     ` Xuan Zhuo
  0 siblings, 1 reply; 32+ messages in thread
From: Michael S. Tsirkin @ 2023-01-29  7:12 UTC (permalink / raw)
  To: Xuan Zhuo; +Cc: qemu-devel, Jason Wang


subject seems wrong.

On Sun, Jan 29, 2023 at 10:51:49AM +0800, Xuan Zhuo wrote:
> In the current design, we stop the device from operating on the vring
> during per-queue reset by resetting the structure VirtQueue.
> 
> But before the reset operation, when recycling some resources, we should
> stop referencing new vring resources. For example, when recycling
> virtio-net's asynchronous sending resources, virtio-net should be able
> to perceive that the current queue is in the per-queue reset state, and
> stop sending new packets from the tx queue.
> 
> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> ---
>  hw/virtio/virtio.c         | 15 +++++++++++++++
>  include/hw/virtio/virtio.h |  1 +
>  2 files changed, 16 insertions(+)
> 
> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> index f35178f5fc..c954f2a2b3 100644
> --- a/hw/virtio/virtio.c
> +++ b/hw/virtio/virtio.c
> @@ -142,6 +142,8 @@ struct VirtQueue
>      /* Notification enabled? */
>      bool notification;
>  
> +    bool disabled_by_reset;
> +
>      uint16_t queue_index;
>  
>      unsigned int inuse;
> @@ -2079,6 +2081,12 @@ void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index)
>  {
>      VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
>  
> +    /*
> +     * Mark this queue is per-queue reset status. The device should release the
> +     * references of the vring, and not refer more new vring item.

items

> +     */
> +    vdev->vq[queue_index].disabled_by_reset = true;
> +
>      if (k->queue_reset) {
>          k->queue_reset(vdev, queue_index);
>      }

can we set this after calling queue_reset? For symmetry with enable.

> @@ -2102,11 +2110,18 @@ void virtio_queue_enable(VirtIODevice *vdev, uint32_t queue_index)
>      }
>      */
>  
> +    vdev->vq[queue_index].disabled_by_reset = false;
> +
>      if (k->queue_enable) {
>          k->queue_enable(vdev, queue_index);
>      }
>  }
>  
> +bool virtio_queue_reset_state(VirtQueue *vq)
> +{
> +    return vq->disabled_by_reset;
> +}
> +
>  void virtio_reset(void *opaque)
>  {
>      VirtIODevice *vdev = opaque;
> diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> index 77c6c55929..00e91af7c4 100644
> --- a/include/hw/virtio/virtio.h
> +++ b/include/hw/virtio/virtio.h
> @@ -319,6 +319,7 @@ int virtio_set_status(VirtIODevice *vdev, uint8_t val);
>  void virtio_reset(void *opaque);
>  void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index);
>  void virtio_queue_enable(VirtIODevice *vdev, uint32_t queue_index);
> +bool virtio_queue_reset_state(VirtQueue *vq);
>  void virtio_update_irq(VirtIODevice *vdev);
>  int virtio_set_features(VirtIODevice *vdev, uint64_t val);

OK I guess ... what about migration. This state won't be
set correctly will it?


>  
> -- 
> 2.32.0.3.g01195cf9f




* Re: [PATCH v1 1/2] virtio: struct VirtQueue introduce reset
  2023-01-29  7:12   ` Michael S. Tsirkin
@ 2023-01-29  7:15     ` Xuan Zhuo
  2023-01-29  7:37       ` Michael S. Tsirkin
  0 siblings, 1 reply; 32+ messages in thread
From: Xuan Zhuo @ 2023-01-29  7:15 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: qemu-devel, Jason Wang

On Sun, 29 Jan 2023 02:12:36 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
>
> subject seems wrong.


Will fix.


>
> On Sun, Jan 29, 2023 at 10:51:49AM +0800, Xuan Zhuo wrote:
> > In the current design, we stop the device from operating on the vring
> > during per-queue reset by resetting the structure VirtQueue.
> >
> > But before the reset operation, when recycling some resources, we should
> > stop referencing new vring resources. For example, when recycling
> > virtio-net's asynchronous sending resources, virtio-net should be able
> > to perceive that the current queue is in the per-queue reset state, and
> > stop sending new packets from the tx queue.
> >
> > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > ---
> >  hw/virtio/virtio.c         | 15 +++++++++++++++
> >  include/hw/virtio/virtio.h |  1 +
> >  2 files changed, 16 insertions(+)
> >
> > diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> > index f35178f5fc..c954f2a2b3 100644
> > --- a/hw/virtio/virtio.c
> > +++ b/hw/virtio/virtio.c
> > @@ -142,6 +142,8 @@ struct VirtQueue
> >      /* Notification enabled? */
> >      bool notification;
> >
> > +    bool disabled_by_reset;
> > +
> >      uint16_t queue_index;
> >
> >      unsigned int inuse;
> > @@ -2079,6 +2081,12 @@ void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index)
> >  {
> >      VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
> >
> > +    /*
> > +     * Mark this queue is per-queue reset status. The device should release the
> > +     * references of the vring, and not refer more new vring item.
>
> items


Will fix.

>
> > +     */
> > +    vdev->vq[queue_index].disabled_by_reset = true;
> > +
> >      if (k->queue_reset) {
> >          k->queue_reset(vdev, queue_index);
> >      }
>
> can we set this after calling queue_reset? For symmetry with enable.


In fact,  queue_reset() will check it.


>
> > @@ -2102,11 +2110,18 @@ void virtio_queue_enable(VirtIODevice *vdev, uint32_t queue_index)
> >      }
> >      */
> >
> > +    vdev->vq[queue_index].disabled_by_reset = false;
> > +
> >      if (k->queue_enable) {
> >          k->queue_enable(vdev, queue_index);
> >      }
> >  }
> >
> > +bool virtio_queue_reset_state(VirtQueue *vq)
> > +{
> > +    return vq->disabled_by_reset;
> > +}
> > +
> >  void virtio_reset(void *opaque)
> >  {
> >      VirtIODevice *vdev = opaque;
> > diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> > index 77c6c55929..00e91af7c4 100644
> > --- a/include/hw/virtio/virtio.h
> > +++ b/include/hw/virtio/virtio.h
> > @@ -319,6 +319,7 @@ int virtio_set_status(VirtIODevice *vdev, uint8_t val);
> >  void virtio_reset(void *opaque);
> >  void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index);
> >  void virtio_queue_enable(VirtIODevice *vdev, uint32_t queue_index);
> > +bool virtio_queue_reset_state(VirtQueue *vq);
> >  void virtio_update_irq(VirtIODevice *vdev);
> >  int virtio_set_features(VirtIODevice *vdev, uint64_t val);
>
> OK I guess ... what about migration. This state won't be
> set correctly will it?

I think it has no effect. After the reset, there is actually no state left,
so we can migrate.

The current variable is only used by ->queue_reset().

Thanks.


>
>
> >
> > --
> > 2.32.0.3.g01195cf9f
>



* Re: [PATCH v1 2/2] virtio-net: virtio_net_flush_tx() check for per-queue reset
  2023-01-29  2:51 ` [PATCH v1 2/2] virtio-net: virtio_net_flush_tx() check for per-queue reset Xuan Zhuo
  2023-01-29  6:23   ` Jason Wang
@ 2023-01-29  7:25   ` Michael S. Tsirkin
  2023-01-29  7:28     ` Xuan Zhuo
  1 sibling, 1 reply; 32+ messages in thread
From: Michael S. Tsirkin @ 2023-01-29  7:25 UTC (permalink / raw)
  To: Xuan Zhuo; +Cc: qemu-devel, Jason Wang, Alexander Bulekov

On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo wrote:
> Check whether it is per-queue reset state in virtio_net_flush_tx().
> 
> Before per-queue reset, we need to recover async tx resources. At this
> time, virtio_net_flush_tx() is called, but we should not try to send
> new packets, so virtio_net_flush_tx() should check the current
> per-queue reset state.


What does "at this time" mean here?
Do you in fact mean it's called from flush_or_purge_queued_packets?
What does the call stack look like?

If yes introducing a vq state just so virtio_net_flush_tx
knows we are in the process of reset would be a bad idea.
We want something much more local, ideally on stack even ...


> 
> Fixes: 7dc6be52 ("virtio-net: support queue reset")
> Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> Reported-by: Alexander Bulekov <alxndr@bu.edu>
> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> ---
>  hw/net/virtio-net.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> index 3ae909041a..fba6451a50 100644
> --- a/hw/net/virtio-net.c
> +++ b/hw/net/virtio-net.c
> @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
>      VirtQueueElement *elem;
>      int32_t num_packets = 0;
>      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> +        virtio_queue_reset_state(q->tx_vq)) {
>          return num_packets;
>      }
>  
> -- 
> 2.32.0.3.g01195cf9f




* Re: [PATCH v1 0/2] virtio: fix for assertion failure: virtio_net_get_subqueue(nc)->async_tx.elem failed
  2023-01-29  2:51 [PATCH v1 0/2] virtio: fix for assertion failure: virtio_net_get_subqueue(nc)->async_tx.elem failed Xuan Zhuo
  2023-01-29  2:51 ` [PATCH v1 1/2] virtio: struct VirtQueue introduce reset Xuan Zhuo
  2023-01-29  2:51 ` [PATCH v1 2/2] virtio-net: virtio_net_flush_tx() check for per-queue reset Xuan Zhuo
@ 2023-01-29  7:26 ` Michael S. Tsirkin
  2 siblings, 0 replies; 32+ messages in thread
From: Michael S. Tsirkin @ 2023-01-29  7:26 UTC (permalink / raw)
  To: Xuan Zhuo; +Cc: qemu-devel, Jason Wang

On Sun, Jan 29, 2023 at 10:51:48AM +0800, Xuan Zhuo wrote:
> In the current design, we stop the device from operating on the vring
> during per-queue reset by resetting the structure VirtQueue.
> 
> But before the reset operation, when recycling some resources, we should
> stop referencing new vring resources.
> 
> This bug is caused by this reason.
> 
>     https://gitlab.com/qemu-project/qemu/-/issues/1451
> 
> Before we reset the structure, we called the ->queue_reset callback to let the
> device reclaim resources. Here virtio-net tries to release the packets sent
> asynchronously, but during this process virtio_net_flush_tx() will be called,
> and new data will be sent again. This leads to asserted.
> 
>      assert(!virtio_net_get_subqueue(nc)->async_tx.elem);
> 
> This patch set introduce new item "reset" into struct VirtQueue, then device can
> know this virtqueue is per-queue reset state.

Better, but I still don't exactly understand what this state means.
Sent some questions on the patches themselves.
Thanks!


> v1:
>     1. rename "reset" to disabled_by_reset
>     2. add api: virtio_queue_reset_state()
> 
> Xuan Zhuo (2):
>   virtio: struct VirtQueue introduce reset
>   virtio-net: virtio_net_flush_tx() check for per-queue reset
> 
>  hw/net/virtio-net.c        |  3 ++-
>  hw/virtio/virtio.c         | 15 +++++++++++++++
>  include/hw/virtio/virtio.h |  1 +
>  3 files changed, 18 insertions(+), 1 deletion(-)
> 
> --
> 2.32.0.3.g01195cf9f




* Re: [PATCH v1 2/2] virtio-net: virtio_net_flush_tx() check for per-queue reset
  2023-01-29  7:25   ` Michael S. Tsirkin
@ 2023-01-29  7:28     ` Xuan Zhuo
  2023-01-29  8:12       ` Michael S. Tsirkin
  0 siblings, 1 reply; 32+ messages in thread
From: Xuan Zhuo @ 2023-01-29  7:28 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: qemu-devel, Jason Wang, Alexander Bulekov

On Sun, 29 Jan 2023 02:25:43 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo wrote:
> > Check whether it is per-queue reset state in virtio_net_flush_tx().
> >
> > Before per-queue reset, we need to recover async tx resources. At this
> > time, virtio_net_flush_tx() is called, but we should not try to send
> > new packets, so virtio_net_flush_tx() should check the current
> > per-queue reset state.
>
>
> What does "at this time" mean here?
> Do you in fact mean it's called from flush_or_purge_queued_packets?

Yes

virtio_queue_reset
	k->queue_reset
		virtio_net_queue_reset
			flush_or_purge_queued_packets
				qemu_flush_or_purge_queued_packets
					.....
					(callback) virtio_net_tx_complete
						virtio_net_flush_tx <-- here send new packet. We need stop it.


Because it is inside the callback, I can't pass information through the stack. I
originally thought it was a general situation, so I wanted to put it in
struct VirtQueue.

If it is not very suitable, it may be better to put it in VirtIONetQueue.

Thanks.
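
[Editor's note: for illustration, the re-entrant path in the call stack above
can be modeled as a standalone C sketch. The stand-in names are hypothetical,
not QEMU code; without the `resetting` guard in flush_tx(), the final
assertion would fail.]

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the re-entrant purge path (hypothetical names, not QEMU code). */
struct TxQueue {
    bool resetting;        /* stand-in for vq->disabled_by_reset */
    int async_tx_inflight; /* stand-in for async_tx.elem being non-NULL */
};

/* virtio_net_flush_tx(): must not start new async sends during reset. */
static void flush_tx(struct TxQueue *q)
{
    if (q->resetting) {
        return; /* the guard added by patch 2 */
    }
    q->async_tx_inflight = 1; /* starting a new async send */
}

/* virtio_net_tx_complete() callback: finishes one send, then flushes more. */
static void tx_complete(struct TxQueue *q)
{
    q->async_tx_inflight = 0;
    flush_tx(q); /* the re-entry that used to queue new packets */
}

/* flush_or_purge_queued_packets(): purging invokes the completion callback. */
static void purge(struct TxQueue *q)
{
    if (q->async_tx_inflight) {
        tx_complete(q);
    }
}

/* virtio_queue_reset(): mark the queue first, then reclaim resources; the
 * assert models the async_tx.elem assertion from issue 1451. */
static void queue_reset(struct TxQueue *q)
{
    q->resetting = true;
    purge(q);
    assert(q->async_tx_inflight == 0);
}
```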

> What does the call stack look like?
>
> If yes introducing a vq state just so virtio_net_flush_tx
> knows we are in the process of reset would be a bad idea.
> We want something much more local, ideally on stack even ...
>
>
> >
> > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > Reported-by: Alexander Bulekov <alxndr@bu.edu>
> > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > ---
> >  hw/net/virtio-net.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > index 3ae909041a..fba6451a50 100644
> > --- a/hw/net/virtio-net.c
> > +++ b/hw/net/virtio-net.c
> > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> >      VirtQueueElement *elem;
> >      int32_t num_packets = 0;
> >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > +        virtio_queue_reset_state(q->tx_vq)) {
> >          return num_packets;
> >      }
> >
> > --
> > 2.32.0.3.g01195cf9f
>



* Re: [PATCH v1 1/2] virtio: struct VirtQueue introduce reset
  2023-01-29  7:15     ` Xuan Zhuo
@ 2023-01-29  7:37       ` Michael S. Tsirkin
  2023-01-29  7:39         ` Xuan Zhuo
  0 siblings, 1 reply; 32+ messages in thread
From: Michael S. Tsirkin @ 2023-01-29  7:37 UTC (permalink / raw)
  To: Xuan Zhuo; +Cc: qemu-devel, Jason Wang

On Sun, Jan 29, 2023 at 03:15:16PM +0800, Xuan Zhuo wrote:
> On Sun, 29 Jan 2023 02:12:36 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> >
> > subject seems wrong.
> 
> 
> Will fix.
> 
> 
> >
> > On Sun, Jan 29, 2023 at 10:51:49AM +0800, Xuan Zhuo wrote:
> > > In the current design, we stop the device from operating on the vring
> > > during per-queue reset by resetting the structure VirtQueue.
> > >
> > > But before the reset operation, when recycling some resources, we should
> > > stop referencing new vring resources. For example, when recycling
> > > virtio-net's asynchronous sending resources, virtio-net should be able
> > > to perceive that the current queue is in the per-queue reset state, and
> > > stop sending new packets from the tx queue.
> > >
> > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > ---
> > >  hw/virtio/virtio.c         | 15 +++++++++++++++
> > >  include/hw/virtio/virtio.h |  1 +
> > >  2 files changed, 16 insertions(+)
> > >
> > > diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> > > index f35178f5fc..c954f2a2b3 100644
> > > --- a/hw/virtio/virtio.c
> > > +++ b/hw/virtio/virtio.c
> > > @@ -142,6 +142,8 @@ struct VirtQueue
> > >      /* Notification enabled? */
> > >      bool notification;
> > >
> > > +    bool disabled_by_reset;
> > > +
> > >      uint16_t queue_index;
> > >
> > >      unsigned int inuse;
> > > @@ -2079,6 +2081,12 @@ void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index)
> > >  {
> > >      VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
> > >
> > > +    /*
> > > +     * Mark this queue is per-queue reset status. The device should release the
> > > +     * references of the vring, and not refer more new vring item.
> >
> > items
> 
> 
> Will fix.
> 
> >
> > > +     */
> > > +    vdev->vq[queue_index].disabled_by_reset = true;
> > > +
> > >      if (k->queue_reset) {
> > >          k->queue_reset(vdev, queue_index);
> > >      }
> >
> > can we set this after calling queue_reset? For symmetry with enable.
> 
> 
> In fact,  queue_reset() will check it.
> 

when you disable you first set it then disable.
so when we are not 100% ready it's already set.
when you enable you first clear it then enable.
so we are not 100% ready but it's no longer set.
inconsistent.


> >
> > > @@ -2102,11 +2110,18 @@ void virtio_queue_enable(VirtIODevice *vdev, uint32_t queue_index)
> > >      }
> > >      */
> > >
> > > +    vdev->vq[queue_index].disabled_by_reset = false;
> > > +
> > >      if (k->queue_enable) {
> > >          k->queue_enable(vdev, queue_index);
> > >      }
> > >  }
> > >
> > > +bool virtio_queue_reset_state(VirtQueue *vq)
> > > +{
> > > +    return vq->disabled_by_reset;
> > > +}
> > > +
> > >  void virtio_reset(void *opaque)
> > >  {
> > >      VirtIODevice *vdev = opaque;
> > > diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> > > index 77c6c55929..00e91af7c4 100644
> > > --- a/include/hw/virtio/virtio.h
> > > +++ b/include/hw/virtio/virtio.h
> > > @@ -319,6 +319,7 @@ int virtio_set_status(VirtIODevice *vdev, uint8_t val);
> > >  void virtio_reset(void *opaque);
> > >  void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index);
> > >  void virtio_queue_enable(VirtIODevice *vdev, uint32_t queue_index);
> > > +bool virtio_queue_reset_state(VirtQueue *vq);
> > >  void virtio_update_irq(VirtIODevice *vdev);
> > >  int virtio_set_features(VirtIODevice *vdev, uint64_t val);
> >
> > OK I guess ... what about migration. This state won't be
> > set correctly will it?
> 
> I think it has no effect. After the reset, there is actually no state. We can
> migrate.
> 
> The current variable is only used by ->queue_reset().
> 
> Thanks.
> 

Yea maybe it works for this bug but ... yack. This means the state has
no logic consistency.  It's just there because you found a bug and
wanted to fix it.
An ultra specific
	bool this_weird_state_fuzzer_gets_in_issue_1451;
is hard to maintain, not happy :(


> >
> >
> > >
> > > --
> > > 2.32.0.3.g01195cf9f
> >




* Re: [PATCH v1 1/2] virtio: struct VirtQueue introduce reset
  2023-01-29  7:37       ` Michael S. Tsirkin
@ 2023-01-29  7:39         ` Xuan Zhuo
  0 siblings, 0 replies; 32+ messages in thread
From: Xuan Zhuo @ 2023-01-29  7:39 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: qemu-devel, Jason Wang

On Sun, 29 Jan 2023 02:37:22 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Sun, Jan 29, 2023 at 03:15:16PM +0800, Xuan Zhuo wrote:
> > On Sun, 29 Jan 2023 02:12:36 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > >
> > > subject seems wrong.
> >
> >
> > Will fix.
> >
> >
> > >
> > > On Sun, Jan 29, 2023 at 10:51:49AM +0800, Xuan Zhuo wrote:
> > > > In the current design, we stop the device from operating on the vring
> > > > during per-queue reset by resetting the structure VirtQueue.
> > > >
> > > > But before the reset operation, when recycling some resources, we should
> > > > stop referencing new vring resources. For example, when recycling
> > > > virtio-net's asynchronous sending resources, virtio-net should be able
> > > > to perceive that the current queue is in the per-queue reset state, and
> > > > stop sending new packets from the tx queue.
> > > >
> > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > ---
> > > >  hw/virtio/virtio.c         | 15 +++++++++++++++
> > > >  include/hw/virtio/virtio.h |  1 +
> > > >  2 files changed, 16 insertions(+)
> > > >
> > > > diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> > > > index f35178f5fc..c954f2a2b3 100644
> > > > --- a/hw/virtio/virtio.c
> > > > +++ b/hw/virtio/virtio.c
> > > > @@ -142,6 +142,8 @@ struct VirtQueue
> > > >      /* Notification enabled? */
> > > >      bool notification;
> > > >
> > > > +    bool disabled_by_reset;
> > > > +
> > > >      uint16_t queue_index;
> > > >
> > > >      unsigned int inuse;
> > > > @@ -2079,6 +2081,12 @@ void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index)
> > > >  {
> > > >      VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
> > > >
> > > > +    /*
> > > > +     * Mark this queue is per-queue reset status. The device should release the
> > > > +     * references of the vring, and not refer more new vring item.
> > >
> > > items
> >
> >
> > Will fix.
> >
> > >
> > > > +     */
> > > > +    vdev->vq[queue_index].disabled_by_reset = true;
> > > > +
> > > >      if (k->queue_reset) {
> > > >          k->queue_reset(vdev, queue_index);
> > > >      }
> > >
> > > can we set this after calling queue_reset? For symmetry with enable.
> >
> >
> > In fact,  queue_reset() will check it.
> >
>
> when you disable you first set it then disable.
> so when we are not 100% ready it's already set.
> when you enable you first clear it then enable.
> so we are not 100% ready but it's no longer set.
> inconsistent.
>
>
> > >
> > > > @@ -2102,11 +2110,18 @@ void virtio_queue_enable(VirtIODevice *vdev, uint32_t queue_index)
> > > >      }
> > > >      */
> > > >
> > > > +    vdev->vq[queue_index].disabled_by_reset = false;
> > > > +
> > > >      if (k->queue_enable) {
> > > >          k->queue_enable(vdev, queue_index);
> > > >      }
> > > >  }
> > > >
> > > > +bool virtio_queue_reset_state(VirtQueue *vq)
> > > > +{
> > > > +    return vq->disabled_by_reset;
> > > > +}
> > > > +
> > > >  void virtio_reset(void *opaque)
> > > >  {
> > > >      VirtIODevice *vdev = opaque;
> > > > diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> > > > index 77c6c55929..00e91af7c4 100644
> > > > --- a/include/hw/virtio/virtio.h
> > > > +++ b/include/hw/virtio/virtio.h
> > > > @@ -319,6 +319,7 @@ int virtio_set_status(VirtIODevice *vdev, uint8_t val);
> > > >  void virtio_reset(void *opaque);
> > > >  void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index);
> > > >  void virtio_queue_enable(VirtIODevice *vdev, uint32_t queue_index);
> > > > +bool virtio_queue_reset_state(VirtQueue *vq);
> > > >  void virtio_update_irq(VirtIODevice *vdev);
> > > >  int virtio_set_features(VirtIODevice *vdev, uint64_t val);
> > >
> > > OK I guess ... what about migration. This state won't be
> > > set correctly will it?
> >
> > I think it has no effect. After the reset, there is actually no state. We can
> > migrate.
> >
> > The current variable is only used by ->queue_reset().
> >
> > Thanks.
> >
>
> Yea maybe it works for this bug but ... yack. This means the state has
> no logic consistency.  It's just there because you found a bug and
> wanted to fix it.
> An ultra specific
> 	bool this_weird_state_fuzzer_gets_in_issue_1451;
> is hard to maintain, not happy :(


I agree.


Thanks.


>
>
> > >
> > >
> > > >
> > > > --
> > > > 2.32.0.3.g01195cf9f
> > >
>



* Re: [PATCH v1 2/2] virtio-net: virtio_net_flush_tx() check for per-queue reset
  2023-01-29  6:23   ` Jason Wang
@ 2023-01-29  7:43     ` Xuan Zhuo
  2023-01-30  3:01       ` Jason Wang
  0 siblings, 1 reply; 32+ messages in thread
From: Xuan Zhuo @ 2023-01-29  7:43 UTC (permalink / raw)
  To: Jason Wang; +Cc: qemu-devel, Michael S. Tsirkin, Alexander Bulekov

On Sun, 29 Jan 2023 14:23:21 +0800, Jason Wang <jasowang@redhat.com> wrote:
> On Sun, Jan 29, 2023 at 10:52 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> >
> > Check whether it is per-queue reset state in virtio_net_flush_tx().
> >
> > Before per-queue reset, we need to recover async tx resources. At this
> > time, virtio_net_flush_tx() is called, but we should not try to send
> > new packets, so virtio_net_flush_tx() should check the current
> > per-queue reset state.
> >
> > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > Reported-by: Alexander Bulekov <alxndr@bu.edu>
> > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > ---
> >  hw/net/virtio-net.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > index 3ae909041a..fba6451a50 100644
> > --- a/hw/net/virtio-net.c
> > +++ b/hw/net/virtio-net.c
> > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> >      VirtQueueElement *elem;
> >      int32_t num_packets = 0;
> >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > +        virtio_queue_reset_state(q->tx_vq)) {
>
> We have other places that check DRIVER_OK do we need to check queue
> reset as well?

I checked it again. I still think the other places that check DRIVER_OK do not
need to also check for per-queue reset.

Thanks.


>
> E.g:
> virtio_net_can_receive()
> virtio_net_tx_{timer|bh}()
>
> Thanks
>
> >          return num_packets;
> >      }
> >
> > --
> > 2.32.0.3.g01195cf9f
> >
>



* Re: [PATCH v1 2/2] virtio-net: virtio_net_flush_tx() check for per-queue reset
  2023-01-29  7:28     ` Xuan Zhuo
@ 2023-01-29  8:12       ` Michael S. Tsirkin
  2023-01-29  8:23         ` Xuan Zhuo
  0 siblings, 1 reply; 32+ messages in thread
From: Michael S. Tsirkin @ 2023-01-29  8:12 UTC (permalink / raw)
  To: Xuan Zhuo; +Cc: qemu-devel, Jason Wang, Alexander Bulekov

On Sun, Jan 29, 2023 at 03:28:28PM +0800, Xuan Zhuo wrote:
> On Sun, 29 Jan 2023 02:25:43 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo wrote:
> > > Check whether it is per-queue reset state in virtio_net_flush_tx().
> > >
> > > Before per-queue reset, we need to recover async tx resources. At this
> > > time, virtio_net_flush_tx() is called, but we should not try to send
> > > new packets, so virtio_net_flush_tx() should check the current
> > > per-queue reset state.
> >
> >
> > What does "at this time" mean here?
> > Do you in fact mean it's called from flush_or_purge_queued_packets?
> 
> Yes
> 
> virtio_queue_reset
> 	k->queue_reset
> 		virtio_net_queue_reset
> 			flush_or_purge_queued_packets
> 				qemu_flush_or_purge_queued_packets
> 					.....
> 					(callback) virtio_net_tx_complete
> 						virtio_net_flush_tx <-- here send new packet. We need stop it.
> 
> 
> Because it is inside the callback, I can't pass information through the stack. I
> originally thought it was a general situation, so I wanted to put it in
> struct VirtQueue.
> 
> If it is not very suitable, it may be better to put it in VirtIONetQueue.
> 
> Thanks.

Hmm maybe. Another idea: isn't virtio_net_tx_complete called
with length 0 here? Are there other cases where length is 0?


> > What does the call stack look like?
> >
> > If yes introducing a vq state just so virtio_net_flush_tx
> > knows we are in the process of reset would be a bad idea.
> > We want something much more local, ideally on stack even ...
> >
> >
> > >
> > > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > > Reported-by: Alexander Bulekov <alxndr@bu.edu>
> > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > ---
> > >  hw/net/virtio-net.c | 3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > index 3ae909041a..fba6451a50 100644
> > > --- a/hw/net/virtio-net.c
> > > +++ b/hw/net/virtio-net.c
> > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> > >      VirtQueueElement *elem;
> > >      int32_t num_packets = 0;
> > >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > > +        virtio_queue_reset_state(q->tx_vq)) {

btw this sounds like you are asking it to reset some state.

> > >          return num_packets;

and then

    ret = virtio_net_flush_tx(q);
    if (ret >= n->tx_burst)


will reschedule automatically won't it?

also why check in virtio_net_flush_tx and not virtio_net_tx_complete?


> > >      }
> > >
> > > --
> > > 2.32.0.3.g01195cf9f
> >




* Re: [PATCH v1 2/2] virtio-net: virtio_net_flush_tx() check for per-queue reset
  2023-01-29  8:12       ` Michael S. Tsirkin
@ 2023-01-29  8:23         ` Xuan Zhuo
  2023-01-29 11:57           ` Michael S. Tsirkin
  0 siblings, 1 reply; 32+ messages in thread
From: Xuan Zhuo @ 2023-01-29  8:23 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: qemu-devel, Jason Wang, Alexander Bulekov

On Sun, 29 Jan 2023 03:12:12 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Sun, Jan 29, 2023 at 03:28:28PM +0800, Xuan Zhuo wrote:
> > On Sun, 29 Jan 2023 02:25:43 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo wrote:
> > > > Check whether it is per-queue reset state in virtio_net_flush_tx().
> > > >
> > > > Before per-queue reset, we need to recover async tx resources. At this
> > > > time, virtio_net_flush_tx() is called, but we should not try to send
> > > > new packets, so virtio_net_flush_tx() should check the current
> > > > per-queue reset state.
> > >
> > >
> > > What does "at this time" mean here?
> > > Do you in fact mean it's called from flush_or_purge_queued_packets?
> >
> > Yes
> >
> > virtio_queue_reset
> > 	k->queue_reset
> > 		virtio_net_queue_reset
> > 			flush_or_purge_queued_packets
> > 				qemu_flush_or_purge_queued_packets
> > 					.....
> > 					(callback) virtio_net_tx_complete
> > 						virtio_net_flush_tx <-- here send new packet. We need stop it.
> >
> >
> > Because it is inside the callback, I can't pass information through the stack. I
> > originally thought it was a general situation, so I wanted to put it in
> > struct VirtQueue.
> >
> > If it is not very suitable, it may be better to put it in VirtIONetQueue.
> >
> > Thanks.
>
> Hmm maybe. Another idea: isn't virtio_net_tx_complete called
> with length 0 here? Are there other cases where length is 0?
>
>
> > > What does the call stack look like?
> > >
> > > If yes introducing a vq state just so virtio_net_flush_tx
> > > knows we are in the process of reset would be a bad idea.
> > > We want something much more local, ideally on stack even ...
> > >
> > >
> > > >
> > > > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > > > Reported-by: Alexander Bulekov <alxndr@bu.edu>
> > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > ---
> > > >  hw/net/virtio-net.c | 3 ++-
> > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > > index 3ae909041a..fba6451a50 100644
> > > > --- a/hw/net/virtio-net.c
> > > > +++ b/hw/net/virtio-net.c
> > > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> > > >      VirtQueueElement *elem;
> > > >      int32_t num_packets = 0;
> > > >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > > > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > > > +        virtio_queue_reset_state(q->tx_vq)) {
>
> btw this sounds like you are asking it to reset some state.
>
> > > >          return num_packets;
>
> and then
>
>     ret = virtio_net_flush_tx(q);
>     if (ret >= n->tx_burst)
>
>
> will reschedule automatically won't it?
>
> also why check in virtio_net_flush_tx and not virtio_net_tx_complete?

virtio_net_flush_tx may also be called by a timer.

Thanks.
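For reference, the ordering constraint the patch relies on can be shown in a
small standalone model. Only the name disabled_by_reset comes from the patch;
the structures and functions below (virt_queue, flush_tx, queue_reset) are
simplified stand-ins for the QEMU code, not the real API:

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal stand-in for struct VirtQueue (not the real QEMU layout). */
struct virt_queue {
    bool disabled_by_reset;   /* set while a per-queue reset is in flight */
    int pending;              /* descriptors the "driver" made available */
};

/* Models virtio_queue_reset_state() from patch 1/2. */
static bool queue_reset_state(const struct virt_queue *vq)
{
    return vq->disabled_by_reset;
}

/* Models virtio_net_flush_tx(): it must refuse to start new transmits
 * while the queue is being reset, which is what keeps the
 * async_tx.elem assertion from firing. Returns packets "sent". */
static int flush_tx(struct virt_queue *vq)
{
    if (queue_reset_state(vq)) {
        return 0;
    }
    int sent = vq->pending;
    vq->pending = 0;
    return sent;
}

/* Models virtio_queue_reset(): the flag must be visible before the
 * device callback runs, because that callback can re-enter flush_tx
 * via the tx-complete path; only afterwards is the rest of the
 * queue state cleared (the __virtio_queue_reset() step). */
static void queue_reset(struct virt_queue *vq)
{
    vq->disabled_by_reset = true;   /* before the k->queue_reset callback */
    (void)flush_tx(vq);             /* re-entry from the completion path */
    vq->pending = 0;                /* __virtio_queue_reset() equivalent */
}
```

With this ordering, a flush triggered from inside the reset callback sends
nothing, while a normal flush still drains the queue.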

>
>
> > > >      }
> > > >
> > > > --
> > > > 2.32.0.3.g01195cf9f
> > >
>



* Re: [PATCH v1 2/2] virtio-net: virtio_net_flush_tx() check for per-queue reset
  2023-01-29  8:23         ` Xuan Zhuo
@ 2023-01-29 11:57           ` Michael S. Tsirkin
  2023-01-29 12:03             ` Xuan Zhuo
  0 siblings, 1 reply; 32+ messages in thread
From: Michael S. Tsirkin @ 2023-01-29 11:57 UTC (permalink / raw)
  To: Xuan Zhuo; +Cc: qemu-devel, Jason Wang, Alexander Bulekov

On Sun, Jan 29, 2023 at 04:23:08PM +0800, Xuan Zhuo wrote:
> On Sun, 29 Jan 2023 03:12:12 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > On Sun, Jan 29, 2023 at 03:28:28PM +0800, Xuan Zhuo wrote:
> > > On Sun, 29 Jan 2023 02:25:43 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo wrote:
> > > > > Check whether it is per-queue reset state in virtio_net_flush_tx().
> > > > >
> > > > > Before per-queue reset, we need to recover async tx resources. At this
> > > > > time, virtio_net_flush_tx() is called, but we should not try to send
> > > > > new packets, so virtio_net_flush_tx() should check the current
> > > > > per-queue reset state.
> > > >
> > > >
> > > > What does "at this time" mean here?
> > > > Do you in fact mean it's called from flush_or_purge_queued_packets?
> > >
> > > Yes
> > >
> > > virtio_queue_reset
> > > 	k->queue_reset
> > > 		virtio_net_queue_reset
> > > 			flush_or_purge_queued_packets
> > > 				qemu_flush_or_purge_queued_packets
> > > 					.....
> > > 					(callback) virtio_net_tx_complete
> > > 						virtio_net_flush_tx <-- here send new packet. We need stop it.
> > >
> > >
> > > Because it is inside the callback, I can't pass information through the stack. I
> > > originally thought it was a general situation, so I wanted to put it in
> > > struct VirtQueue.
> > >
> > > If it is not very suitable, it may be better to put it in VirtIONetQueue.
> > >
> > > Thanks.
> >
> > Hmm maybe. Another idea: isn't virtio_net_tx_complete called
> > with length 0 here? Are there other cases where length is 0?
> >
> >
> > > > What does the call stack look like?
> > > >
> > > > If yes introducing a vq state just so virtio_net_flush_tx
> > > > knows we are in the process of reset would be a bad idea.
> > > > We want something much more local, ideally on stack even ...
> > > >
> > > >
> > > > >
> > > > > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > > > > Reported-by: Alexander Bulekov <alxndr@bu.edu>
> > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > > ---
> > > > >  hw/net/virtio-net.c | 3 ++-
> > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > > > index 3ae909041a..fba6451a50 100644
> > > > > --- a/hw/net/virtio-net.c
> > > > > +++ b/hw/net/virtio-net.c
> > > > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> > > > >      VirtQueueElement *elem;
> > > > >      int32_t num_packets = 0;
> > > > >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > > > > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > > > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > > > > +        virtio_queue_reset_state(q->tx_vq)) {
> >
> > btw this sounds like you are asking it to reset some state.
> >
> > > > >          return num_packets;
> >
> > and then
> >
> >     ret = virtio_net_flush_tx(q);
> >     if (ret >= n->tx_burst)
> >
> >
> > will reschedule automatically won't it?
> >
> > also why check in virtio_net_flush_tx and not virtio_net_tx_complete?
> 
> virtio_net_flush_tx may been called by timer.
> 
> Thanks.

timer won't run while flush_or_purge_queued_packets is in progress.

> >
> >
> > > > >      }
> > > > >
> > > > > --
> > > > > 2.32.0.3.g01195cf9f
> > > >
> >




* Re: [PATCH v1 2/2] virtio-net: virtio_net_flush_tx() check for per-queue reset
  2023-01-29 11:57           ` Michael S. Tsirkin
@ 2023-01-29 12:03             ` Xuan Zhuo
  2023-01-29 12:15               ` Michael S. Tsirkin
  0 siblings, 1 reply; 32+ messages in thread
From: Xuan Zhuo @ 2023-01-29 12:03 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: qemu-devel, Jason Wang, Alexander Bulekov

On Sun, 29 Jan 2023 06:57:29 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Sun, Jan 29, 2023 at 04:23:08PM +0800, Xuan Zhuo wrote:
> > On Sun, 29 Jan 2023 03:12:12 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > On Sun, Jan 29, 2023 at 03:28:28PM +0800, Xuan Zhuo wrote:
> > > > On Sun, 29 Jan 2023 02:25:43 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo wrote:
> > > > > > Check whether it is per-queue reset state in virtio_net_flush_tx().
> > > > > >
> > > > > > Before per-queue reset, we need to recover async tx resources. At this
> > > > > > time, virtio_net_flush_tx() is called, but we should not try to send
> > > > > > new packets, so virtio_net_flush_tx() should check the current
> > > > > > per-queue reset state.
> > > > >
> > > > >
> > > > > What does "at this time" mean here?
> > > > > Do you in fact mean it's called from flush_or_purge_queued_packets?
> > > >
> > > > Yes
> > > >
> > > > virtio_queue_reset
> > > > 	k->queue_reset
> > > > 		virtio_net_queue_reset
> > > > 			flush_or_purge_queued_packets
> > > > 				qemu_flush_or_purge_queued_packets
> > > > 					.....
> > > > 					(callback) virtio_net_tx_complete
> > > > 						virtio_net_flush_tx <-- here send new packet. We need stop it.
> > > >
> > > >
> > > > Because it is inside the callback, I can't pass information through the stack. I
> > > > originally thought it was a general situation, so I wanted to put it in
> > > > struct VirtQueue.
> > > >
> > > > If it is not very suitable, it may be better to put it in VirtIONetQueue.
> > > >
> > > > Thanks.
> > >
> > > Hmm maybe. Another idea: isn't virtio_net_tx_complete called
> > > with length 0 here? Are there other cases where length is 0?
> > >
> > >
> > > > > What does the call stack look like?
> > > > >
> > > > > If yes introducing a vq state just so virtio_net_flush_tx
> > > > > knows we are in the process of reset would be a bad idea.
> > > > > We want something much more local, ideally on stack even ...
> > > > >
> > > > >
> > > > > >
> > > > > > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > > > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > > > > > Reported-by: Alexander Bulekov <alxndr@bu.edu>
> > > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > > > ---
> > > > > >  hw/net/virtio-net.c | 3 ++-
> > > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > > >
> > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > > > > index 3ae909041a..fba6451a50 100644
> > > > > > --- a/hw/net/virtio-net.c
> > > > > > +++ b/hw/net/virtio-net.c
> > > > > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> > > > > >      VirtQueueElement *elem;
> > > > > >      int32_t num_packets = 0;
> > > > > >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > > > > > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > > > > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > > > > > +        virtio_queue_reset_state(q->tx_vq)) {
> > >
> > > btw this sounds like you are asking it to reset some state.
> > >
> > > > > >          return num_packets;
> > >
> > > and then
> > >
> > >     ret = virtio_net_flush_tx(q);
> > >     if (ret >= n->tx_burst)
> > >
> > >
> > > will reschedule automatically won't it?
> > >
> > > also why check in virtio_net_flush_tx and not virtio_net_tx_complete?
> >
> > virtio_net_flush_tx may been called by timer.
> >
> > Thanks.
>
> timer won't run while flush_or_purge_queued_packets is in progress.

Is the timer really not executed during the VMEXIT process? Otherwise, we still
have to consider the window after flush_or_purge_queued_packets but before the
VirtQueue structure is cleared.

Even if it could be handled in virtio_net_tx_complete, is there a good way to
do that? It is a callback, so it is not convenient to pass extra parameters.

Thanks


>
> > >
> > >
> > > > > >      }
> > > > > >
> > > > > > --
> > > > > > 2.32.0.3.g01195cf9f
> > > > >
> > >
>



* Re: [PATCH v1 2/2] virtio-net: virtio_net_flush_tx() check for per-queue reset
  2023-01-29 12:03             ` Xuan Zhuo
@ 2023-01-29 12:15               ` Michael S. Tsirkin
  2023-01-29 12:28                 ` Xuan Zhuo
  2023-01-30  2:15                 ` Xuan Zhuo
  0 siblings, 2 replies; 32+ messages in thread
From: Michael S. Tsirkin @ 2023-01-29 12:15 UTC (permalink / raw)
  To: Xuan Zhuo; +Cc: qemu-devel, Jason Wang, Alexander Bulekov

On Sun, Jan 29, 2023 at 08:03:42PM +0800, Xuan Zhuo wrote:
> On Sun, 29 Jan 2023 06:57:29 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > On Sun, Jan 29, 2023 at 04:23:08PM +0800, Xuan Zhuo wrote:
> > > On Sun, 29 Jan 2023 03:12:12 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > On Sun, Jan 29, 2023 at 03:28:28PM +0800, Xuan Zhuo wrote:
> > > > > On Sun, 29 Jan 2023 02:25:43 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo wrote:
> > > > > > > Check whether it is per-queue reset state in virtio_net_flush_tx().
> > > > > > >
> > > > > > > Before per-queue reset, we need to recover async tx resources. At this
> > > > > > > time, virtio_net_flush_tx() is called, but we should not try to send
> > > > > > > new packets, so virtio_net_flush_tx() should check the current
> > > > > > > per-queue reset state.
> > > > > >
> > > > > >
> > > > > > What does "at this time" mean here?
> > > > > > Do you in fact mean it's called from flush_or_purge_queued_packets?
> > > > >
> > > > > Yes
> > > > >
> > > > > virtio_queue_reset
> > > > > 	k->queue_reset
> > > > > 		virtio_net_queue_reset
> > > > > 			flush_or_purge_queued_packets
> > > > > 				qemu_flush_or_purge_queued_packets
> > > > > 					.....
> > > > > 					(callback) virtio_net_tx_complete
> > > > > 						virtio_net_flush_tx <-- here send new packet. We need stop it.
> > > > >
> > > > >
> > > > > Because it is inside the callback, I can't pass information through the stack. I
> > > > > originally thought it was a general situation, so I wanted to put it in
> > > > > struct VirtQueue.
> > > > >
> > > > > If it is not very suitable, it may be better to put it in VirtIONetQueue.
> > > > >
> > > > > Thanks.
> > > >
> > > > Hmm maybe. Another idea: isn't virtio_net_tx_complete called
> > > > with length 0 here? Are there other cases where length is 0?
> > > >
> > > >
> > > > > > What does the call stack look like?
> > > > > >
> > > > > > If yes introducing a vq state just so virtio_net_flush_tx
> > > > > > knows we are in the process of reset would be a bad idea.
> > > > > > We want something much more local, ideally on stack even ...
> > > > > >
> > > > > >
> > > > > > >
> > > > > > > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > > > > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > > > > > > Reported-by: Alexander Bulekov <alxndr@bu.edu>
> > > > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > > > > ---
> > > > > > >  hw/net/virtio-net.c | 3 ++-
> > > > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > > > >
> > > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > > > > > index 3ae909041a..fba6451a50 100644
> > > > > > > --- a/hw/net/virtio-net.c
> > > > > > > +++ b/hw/net/virtio-net.c
> > > > > > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> > > > > > >      VirtQueueElement *elem;
> > > > > > >      int32_t num_packets = 0;
> > > > > > >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > > > > > > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > > > > > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > > > > > > +        virtio_queue_reset_state(q->tx_vq)) {
> > > >
> > > > btw this sounds like you are asking it to reset some state.
> > > >
> > > > > > >          return num_packets;
> > > >
> > > > and then
> > > >
> > > >     ret = virtio_net_flush_tx(q);
> > > >     if (ret >= n->tx_burst)
> > > >
> > > >
> > > > will reschedule automatically won't it?
> > > >
> > > > also why check in virtio_net_flush_tx and not virtio_net_tx_complete?
> > >
> > > virtio_net_flush_tx may been called by timer.
> > >
> > > Thanks.
> >
> > timer won't run while flush_or_purge_queued_packets is in progress.
> 
> Is timer not executed during the VMEXIT process? Otherwise, we still have to
> consider that after the flush_or_purge_queued_packets, this process before the
> structure is cleared.



void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index)
{
    VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);

    if (k->queue_reset) {
        k->queue_reset(vdev, queue_index);
    }

    __virtio_queue_reset(vdev, queue_index);
}


No, timers do not run between k->queue_reset and __virtio_queue_reset.


> Even if it can be processed in virtio_net_tx_complete, is there any good way?
> This is a callback, it is not convenient to pass the parameters.
> 
> Thanks


How about checking that length is 0?

> 
> >
> > > >
> > > >
> > > > > > >      }
> > > > > > >
> > > > > > > --
> > > > > > > 2.32.0.3.g01195cf9f
> > > > > >
> > > >
> >




* Re: [PATCH v1 2/2] virtio-net: virtio_net_flush_tx() check for per-queue reset
  2023-01-29 12:15               ` Michael S. Tsirkin
@ 2023-01-29 12:28                 ` Xuan Zhuo
  2023-01-30  2:15                 ` Xuan Zhuo
  1 sibling, 0 replies; 32+ messages in thread
From: Xuan Zhuo @ 2023-01-29 12:28 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: qemu-devel, Jason Wang, Alexander Bulekov

On Sun, 29 Jan 2023 07:15:47 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Sun, Jan 29, 2023 at 08:03:42PM +0800, Xuan Zhuo wrote:
> > On Sun, 29 Jan 2023 06:57:29 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > On Sun, Jan 29, 2023 at 04:23:08PM +0800, Xuan Zhuo wrote:
> > > > On Sun, 29 Jan 2023 03:12:12 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > On Sun, Jan 29, 2023 at 03:28:28PM +0800, Xuan Zhuo wrote:
> > > > > > On Sun, 29 Jan 2023 02:25:43 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo wrote:
> > > > > > > > Check whether it is per-queue reset state in virtio_net_flush_tx().
> > > > > > > >
> > > > > > > > Before per-queue reset, we need to recover async tx resources. At this
> > > > > > > > time, virtio_net_flush_tx() is called, but we should not try to send
> > > > > > > > new packets, so virtio_net_flush_tx() should check the current
> > > > > > > > per-queue reset state.
> > > > > > >
> > > > > > >
> > > > > > > What does "at this time" mean here?
> > > > > > > Do you in fact mean it's called from flush_or_purge_queued_packets?
> > > > > >
> > > > > > Yes
> > > > > >
> > > > > > virtio_queue_reset
> > > > > > 	k->queue_reset
> > > > > > 		virtio_net_queue_reset
> > > > > > 			flush_or_purge_queued_packets
> > > > > > 				qemu_flush_or_purge_queued_packets
> > > > > > 					.....
> > > > > > 					(callback) virtio_net_tx_complete
> > > > > > 						virtio_net_flush_tx <-- here send new packet. We need stop it.
> > > > > >
> > > > > >
> > > > > > Because it is inside the callback, I can't pass information through the stack. I
> > > > > > originally thought it was a general situation, so I wanted to put it in
> > > > > > struct VirtQueue.
> > > > > >
> > > > > > If it is not very suitable, it may be better to put it in VirtIONetQueue.
> > > > > >
> > > > > > Thanks.
> > > > >
> > > > > Hmm maybe. Another idea: isn't virtio_net_tx_complete called
> > > > > with length 0 here? Are there other cases where length is 0?
> > > > >
> > > > >
> > > > > > > What does the call stack look like?
> > > > > > >
> > > > > > > If yes introducing a vq state just so virtio_net_flush_tx
> > > > > > > knows we are in the process of reset would be a bad idea.
> > > > > > > We want something much more local, ideally on stack even ...
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > > > > > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > > > > > > > Reported-by: Alexander Bulekov <alxndr@bu.edu>
> > > > > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > > > > > ---
> > > > > > > >  hw/net/virtio-net.c | 3 ++-
> > > > > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > > > > >
> > > > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > > > > > > index 3ae909041a..fba6451a50 100644
> > > > > > > > --- a/hw/net/virtio-net.c
> > > > > > > > +++ b/hw/net/virtio-net.c
> > > > > > > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> > > > > > > >      VirtQueueElement *elem;
> > > > > > > >      int32_t num_packets = 0;
> > > > > > > >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > > > > > > > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > > > > > > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > > > > > > > +        virtio_queue_reset_state(q->tx_vq)) {
> > > > >
> > > > > btw this sounds like you are asking it to reset some state.
> > > > >
> > > > > > > >          return num_packets;
> > > > >
> > > > > and then
> > > > >
> > > > >     ret = virtio_net_flush_tx(q);
> > > > >     if (ret >= n->tx_burst)
> > > > >
> > > > >
> > > > > will reschedule automatically won't it?
> > > > >
> > > > > also why check in virtio_net_flush_tx and not virtio_net_tx_complete?
> > > >
> > > > virtio_net_flush_tx may been called by timer.
> > > >
> > > > Thanks.
> > >
> > > timer won't run while flush_or_purge_queued_packets is in progress.
> >
> > Is timer not executed during the VMEXIT process? Otherwise, we still have to
> > consider that after the flush_or_purge_queued_packets, this process before the
> > structure is cleared.
>
>
>
> void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index)
> {
>     VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
>
>     if (k->queue_reset) {
>         k->queue_reset(vdev, queue_index);
>     }
>
>     __virtio_queue_reset(vdev, queue_index);
> }
>
>
> No timers do not run between  k->queue_reset and __virtio_queue_reset.
>
>
> > Even if it can be processed in virtio_net_tx_complete, is there any good way?
> > This is a callback, it is not convenient to pass the parameters.
> >
> > Thanks
>
>
> How about checking that length is 0?


I think you are referring to "len", the second parameter of
virtio_net_tx_complete.

void qemu_net_queue_purge(NetQueue *queue, NetClientState *from)
{
    NetPacket *packet, *next;

    QTAILQ_FOREACH_SAFE(packet, &queue->packets, entry, next) {
        if (packet->sender == from) {
            QTAILQ_REMOVE(&queue->packets, packet, entry);
            queue->nq_count--;
            if (packet->sent_cb) {
                packet->sent_cb(packet->sender, 0);
            }
            g_free(packet);
        }
    }
}

qemu_net_queue_purge passes 0 as len, and it is called from many places. I am
not sure whether keying off len == 0 would have unwanted side effects. I will
see whether there is a good way to pass the information on the stack instead.

Thanks.
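The ambiguity described above can be sketched in isolation. The callback
signature below only mirrors the shape of virtio_net_tx_complete (sender plus
completed length); everything here is an illustrative stand-in, not the QEMU
code:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical completion-callback shape: it only sees the completed
 * length, like virtio_net_tx_complete(sender, len). */
typedef void (*sent_cb_t)(int *flushes, size_t len);

/* Models the purge path: qemu_net_queue_purge() always reports
 * len == 0, whatever the reason for the purge. A per-queue-reset
 * purge and any other purge are indistinguishable inside the
 * callback. */
static void purge_packet(sent_cb_t cb, int *flushes)
{
    cb(flushes, 0);                 /* purge path: len is always 0 */
}

/* A callback that treats len == 0 as "reset in progress, skip the
 * flush" therefore also skips the flush for every ordinary purge. */
static void tx_complete(int *flushes, size_t len)
{
    if (len == 0) {
        return;                     /* cannot tell *why* len is 0 */
    }
    (*flushes)++;                   /* would call virtio_net_flush_tx() */
}
```

A normal completion still triggers a flush, but every zero-length completion,
reset-related or not, is silently swallowed, which is the semantic overload
being debated.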

>
> >
> > >
> > > > >
> > > > >
> > > > > > > >      }
> > > > > > > >
> > > > > > > > --
> > > > > > > > 2.32.0.3.g01195cf9f
> > > > > > >
> > > > >
> > >
>



* Re: [PATCH v1 2/2] virtio-net: virtio_net_flush_tx() check for per-queue reset
  2023-01-29 12:15               ` Michael S. Tsirkin
  2023-01-29 12:28                 ` Xuan Zhuo
@ 2023-01-30  2:15                 ` Xuan Zhuo
  2023-01-30  5:32                   ` Michael S. Tsirkin
  1 sibling, 1 reply; 32+ messages in thread
From: Xuan Zhuo @ 2023-01-30  2:15 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: qemu-devel, Jason Wang, Alexander Bulekov

On Sun, 29 Jan 2023 07:15:47 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Sun, Jan 29, 2023 at 08:03:42PM +0800, Xuan Zhuo wrote:
> > On Sun, 29 Jan 2023 06:57:29 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > On Sun, Jan 29, 2023 at 04:23:08PM +0800, Xuan Zhuo wrote:
> > > > On Sun, 29 Jan 2023 03:12:12 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > On Sun, Jan 29, 2023 at 03:28:28PM +0800, Xuan Zhuo wrote:
> > > > > > On Sun, 29 Jan 2023 02:25:43 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo wrote:
> > > > > > > > Check whether it is per-queue reset state in virtio_net_flush_tx().
> > > > > > > >
> > > > > > > > Before per-queue reset, we need to recover async tx resources. At this
> > > > > > > > time, virtio_net_flush_tx() is called, but we should not try to send
> > > > > > > > new packets, so virtio_net_flush_tx() should check the current
> > > > > > > > per-queue reset state.
> > > > > > >
> > > > > > >
> > > > > > > What does "at this time" mean here?
> > > > > > > Do you in fact mean it's called from flush_or_purge_queued_packets?
> > > > > >
> > > > > > Yes
> > > > > >
> > > > > > virtio_queue_reset
> > > > > > 	k->queue_reset
> > > > > > 		virtio_net_queue_reset
> > > > > > 			flush_or_purge_queued_packets
> > > > > > 				qemu_flush_or_purge_queued_packets
> > > > > > 					.....
> > > > > > 					(callback) virtio_net_tx_complete
> > > > > > 						virtio_net_flush_tx <-- here send new packet. We need stop it.
> > > > > >
> > > > > >
> > > > > > Because it is inside the callback, I can't pass information through the stack. I
> > > > > > originally thought it was a general situation, so I wanted to put it in
> > > > > > struct VirtQueue.
> > > > > >
> > > > > > If it is not very suitable, it may be better to put it in VirtIONetQueue.
> > > > > >
> > > > > > Thanks.
> > > > >
> > > > > Hmm maybe. Another idea: isn't virtio_net_tx_complete called
> > > > > with length 0 here? Are there other cases where length is 0?
> > > > >
> > > > >
> > > > > > > What does the call stack look like?
> > > > > > >
> > > > > > > If yes introducing a vq state just so virtio_net_flush_tx
> > > > > > > knows we are in the process of reset would be a bad idea.
> > > > > > > We want something much more local, ideally on stack even ...
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > > > > > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > > > > > > > Reported-by: Alexander Bulekov <alxndr@bu.edu>
> > > > > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > > > > > ---
> > > > > > > >  hw/net/virtio-net.c | 3 ++-
> > > > > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > > > > >
> > > > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > > > > > > index 3ae909041a..fba6451a50 100644
> > > > > > > > --- a/hw/net/virtio-net.c
> > > > > > > > +++ b/hw/net/virtio-net.c
> > > > > > > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> > > > > > > >      VirtQueueElement *elem;
> > > > > > > >      int32_t num_packets = 0;
> > > > > > > >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > > > > > > > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > > > > > > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > > > > > > > +        virtio_queue_reset_state(q->tx_vq)) {
> > > > >
> > > > > btw this sounds like you are asking it to reset some state.
> > > > >
> > > > > > > >          return num_packets;
> > > > >
> > > > > and then
> > > > >
> > > > >     ret = virtio_net_flush_tx(q);
> > > > >     if (ret >= n->tx_burst)
> > > > >
> > > > >
> > > > > will reschedule automatically won't it?
> > > > >
> > > > > also why check in virtio_net_flush_tx and not virtio_net_tx_complete?
> > > >
> > > > virtio_net_flush_tx may been called by timer.
> > > >
> > > > Thanks.
> > >
> > > timer won't run while flush_or_purge_queued_packets is in progress.
> >
> > Is timer not executed during the VMEXIT process? Otherwise, we still have to
> > consider that after the flush_or_purge_queued_packets, this process before the
> > structure is cleared.
>
>
>
> void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index)
> {
>     VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
>
>     if (k->queue_reset) {
>         k->queue_reset(vdev, queue_index);
>     }
>
>     __virtio_queue_reset(vdev, queue_index);
> }
>
>
> No timers do not run between  k->queue_reset and __virtio_queue_reset.
>
>
> > Even if it can be processed in virtio_net_tx_complete, is there any good way?
> > This is a callback, it is not convenient to pass the parameters.
> >
> > Thanks
>
>
> How about checking that length is 0?


I don't think checking the length is a good way: it changes the semantics of 0, which
is unfriendly to future maintenance. On the other hand, qemu_net_queue_purge()
passes 0, and that function is called from many places.

How about adding an API in queue.c to replace the sent_cb callback on the queue?

Thanks.


>
> >
> > >
> > > > >
> > > > >
> > > > > > > >      }
> > > > > > > >
> > > > > > > > --
> > > > > > > > 2.32.0.3.g01195cf9f
> > > > > > >
> > > > >
> > >
>


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v1 2/2] virtio-net: virtio_net_flush_tx() check for per-queue reset
  2023-01-29  7:43     ` Xuan Zhuo
@ 2023-01-30  3:01       ` Jason Wang
  2023-01-30  3:38         ` Xuan Zhuo
  0 siblings, 1 reply; 32+ messages in thread
From: Jason Wang @ 2023-01-30  3:01 UTC (permalink / raw)
  To: Xuan Zhuo; +Cc: qemu-devel, Michael S. Tsirkin, Alexander Bulekov

On Sun, Jan 29, 2023 at 3:44 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
>
> On Sun, 29 Jan 2023 14:23:21 +0800, Jason Wang <jasowang@redhat.com> wrote:
> > On Sun, Jan 29, 2023 at 10:52 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > >
> > > Check whether it is per-queue reset state in virtio_net_flush_tx().
> > >
> > > Before per-queue reset, we need to recover async tx resources. At this
> > > time, virtio_net_flush_tx() is called, but we should not try to send
> > > new packets, so virtio_net_flush_tx() should check the current
> > > per-queue reset state.
> > >
> > > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > > Reported-by: Alexander Bulekov <alxndr@bu.edu>
> > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > ---
> > >  hw/net/virtio-net.c | 3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > index 3ae909041a..fba6451a50 100644
> > > --- a/hw/net/virtio-net.c
> > > +++ b/hw/net/virtio-net.c
> > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> > >      VirtQueueElement *elem;
> > >      int32_t num_packets = 0;
> > >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > > +        virtio_queue_reset_state(q->tx_vq)) {
> >
> > We have other places that check DRIVER_OK do we need to check queue
> > reset as well?
>
> I checked it again. I still think that the location of other checking DRIVER_OK
> does not need to check the queue reset.

For example, if we don't disable can_receive() when the queue is
reset, rx may still go through virtio_net_receive_rcu(). That means
QEMU is still trying to process traffic from the network backend
(such as tap), which may waste CPU cycles.

I think the correct way is to return false when the queue is reset in
can_receive(), then the backend poll will be disabled (e.g TAP). When
the queue is enabled again, qemu_flush_queued_packets() will wake up
the backend polling.

I haven't had time to check the other places, but it would be better to
mention in the changelog why they don't need a check.

Thanks

>
> Thanks.
>
>
> >
> > E.g:
> > virtio_net_can_receive()
> > virtio_net_tx_{timer|bh}()
> >
> > Thanks
> >
> > >          return num_packets;
> > >      }
> > >
> > > --
> > > 2.32.0.3.g01195cf9f
> > >
> >
>




* Re: [PATCH v1 2/2] virtio-net: virtio_net_flush_tx() check for per-queue reset
  2023-01-30  3:01       ` Jason Wang
@ 2023-01-30  3:38         ` Xuan Zhuo
  2023-01-30  3:53           ` Jason Wang
  0 siblings, 1 reply; 32+ messages in thread
From: Xuan Zhuo @ 2023-01-30  3:38 UTC (permalink / raw)
  To: Jason Wang; +Cc: qemu-devel, Michael S. Tsirkin, Alexander Bulekov

On Mon, 30 Jan 2023 11:01:40 +0800, Jason Wang <jasowang@redhat.com> wrote:
> On Sun, Jan 29, 2023 at 3:44 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> >
> > On Sun, 29 Jan 2023 14:23:21 +0800, Jason Wang <jasowang@redhat.com> wrote:
> > > On Sun, Jan 29, 2023 at 10:52 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > > >
> > > > Check whether it is per-queue reset state in virtio_net_flush_tx().
> > > >
> > > > Before per-queue reset, we need to recover async tx resources. At this
> > > > time, virtio_net_flush_tx() is called, but we should not try to send
> > > > new packets, so virtio_net_flush_tx() should check the current
> > > > per-queue reset state.
> > > >
> > > > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > > > Reported-by: Alexander Bulekov <alxndr@bu.edu>
> > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > ---
> > > >  hw/net/virtio-net.c | 3 ++-
> > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > > index 3ae909041a..fba6451a50 100644
> > > > --- a/hw/net/virtio-net.c
> > > > +++ b/hw/net/virtio-net.c
> > > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> > > >      VirtQueueElement *elem;
> > > >      int32_t num_packets = 0;
> > > >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > > > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > > > +        virtio_queue_reset_state(q->tx_vq)) {
> > >
> > > We have other places that check DRIVER_OK do we need to check queue
> > > reset as well?
> >
> > I checked it again. I still think that the location of other checking DRIVER_OK
> > does not need to check the queue reset.
>
> For example, if we don't disable can_receive() when the queue is
> reset, it means rx may go for virtio_net_receive_rcu(). It means the
> Qemu is still trying to process the traffic from the network backend
> like tap which may waste cpu cycles.
>
> I think the correct way is to return false when the queue is reset in
> can_receive(), then the backend poll will be disabled (e.g TAP). When
> the queue is enabled again, qemu_flush_queued_packets() will wake up
> the backend polling.
>
> Having had time to check other places but it would be better to
> mention why it doesn't need a check in the changelog.


static bool virtio_net_can_receive(NetClientState *nc)
{
    VirtIONet *n = qemu_get_nic_opaque(nc);
    VirtIODevice *vdev = VIRTIO_DEVICE(n);
    VirtIONetQueue *q = virtio_net_get_subqueue(nc);

    if (!vdev->vm_running) {
        return false;
    }

    if (nc->queue_index >= n->curr_queue_pairs) {
        return false;
    }

    if (!virtio_queue_ready(q->rx_vq) ||
        !(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
        return false;
    }

    return true;
}

int virtio_queue_ready(VirtQueue *vq)
{
    return vq->vring.avail != 0;
}


static void __virtio_queue_reset(VirtIODevice *vdev, uint32_t i)
{
    vdev->vq[i].vring.desc = 0;
    vdev->vq[i].vring.avail = 0;
    vdev->vq[i].vring.used = 0;
    vdev->vq[i].last_avail_idx = 0;
    vdev->vq[i].shadow_avail_idx = 0;
    vdev->vq[i].used_idx = 0;
    vdev->vq[i].last_avail_wrap_counter = true;
    vdev->vq[i].shadow_avail_wrap_counter = true;
    vdev->vq[i].used_wrap_counter = true;
    virtio_queue_set_vector(vdev, i, VIRTIO_NO_VECTOR);
    vdev->vq[i].signalled_used = 0;
    vdev->vq[i].signalled_used_valid = false;
    vdev->vq[i].notification = true;
    vdev->vq[i].vring.num = vdev->vq[i].vring.num_default;
    vdev->vq[i].inuse = 0;
    virtio_virtqueue_reset_region_cache(&vdev->vq[i]);
}

In the per-queue reset implementation, for RX we stop reception by setting vdev->vq[i].vring.avail to 0.
The can_receive callback will then return false.


Thanks.


>
> Thanks
>
> >
> > Thanks.
> >
> >
> > >
> > > E.g:
> > > virtio_net_can_receive()
> > > virtio_net_tx_{timer|bh}()
> > >
> > > Thanks
> > >
> > > >          return num_packets;
> > > >      }
> > > >
> > > > --
> > > > 2.32.0.3.g01195cf9f
> > > >
> > >
> >
>



* Re: [PATCH v1 2/2] virtio-net: virtio_net_flush_tx() check for per-queue reset
  2023-01-30  3:38         ` Xuan Zhuo
@ 2023-01-30  3:53           ` Jason Wang
  2023-01-30  5:50             ` Michael S. Tsirkin
  0 siblings, 1 reply; 32+ messages in thread
From: Jason Wang @ 2023-01-30  3:53 UTC (permalink / raw)
  To: Xuan Zhuo; +Cc: qemu-devel, Michael S. Tsirkin, Alexander Bulekov

On Mon, Jan 30, 2023 at 11:42 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
>
> On Mon, 30 Jan 2023 11:01:40 +0800, Jason Wang <jasowang@redhat.com> wrote:
> > On Sun, Jan 29, 2023 at 3:44 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > >
> > > On Sun, 29 Jan 2023 14:23:21 +0800, Jason Wang <jasowang@redhat.com> wrote:
> > > > On Sun, Jan 29, 2023 at 10:52 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > > > >
> > > > > Check whether it is per-queue reset state in virtio_net_flush_tx().
> > > > >
> > > > > Before per-queue reset, we need to recover async tx resources. At this
> > > > > time, virtio_net_flush_tx() is called, but we should not try to send
> > > > > new packets, so virtio_net_flush_tx() should check the current
> > > > > per-queue reset state.
> > > > >
> > > > > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > > > > Reported-by: Alexander Bulekov <alxndr@bu.edu>
> > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > > ---
> > > > >  hw/net/virtio-net.c | 3 ++-
> > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > > > index 3ae909041a..fba6451a50 100644
> > > > > --- a/hw/net/virtio-net.c
> > > > > +++ b/hw/net/virtio-net.c
> > > > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> > > > >      VirtQueueElement *elem;
> > > > >      int32_t num_packets = 0;
> > > > >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > > > > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > > > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > > > > +        virtio_queue_reset_state(q->tx_vq)) {
> > > >
> > > > We have other places that check DRIVER_OK do we need to check queue
> > > > reset as well?
> > >
> > > I checked it again. I still think that the location of other checking DRIVER_OK
> > > does not need to check the queue reset.
> >
> > For example, if we don't disable can_receive() when the queue is
> > reset, it means rx may go for virtio_net_receive_rcu(). It means the
> > Qemu is still trying to process the traffic from the network backend
> > like tap which may waste cpu cycles.
> >
> > I think the correct way is to return false when the queue is reset in
> > can_receive(), then the backend poll will be disabled (e.g TAP). When
> > the queue is enabled again, qemu_flush_queued_packets() will wake up
> > the backend polling.
> >
> > Having had time to check other places but it would be better to
> > mention why it doesn't need a check in the changelog.
>
>
> static bool virtio_net_can_receive(NetClientState *nc)
> {
>     VirtIONet *n = qemu_get_nic_opaque(nc);
>     VirtIODevice *vdev = VIRTIO_DEVICE(n);
>     VirtIONetQueue *q = virtio_net_get_subqueue(nc);
>
>     if (!vdev->vm_running) {
>         return false;
>     }
>
>     if (nc->queue_index >= n->curr_queue_pairs) {
>         return false;
>     }
>
>     if (!virtio_queue_ready(q->rx_vq) ||
>         !(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
>         return false;
>     }
>
>     return true;
> }
>
> int virtio_queue_ready(VirtQueue *vq)
> {
>     return vq->vring.avail != 0;
> }
>
>
> static void __virtio_queue_reset(VirtIODevice *vdev, uint32_t i)
> {
>     vdev->vq[i].vring.desc = 0;
>     vdev->vq[i].vring.avail = 0;
>     vdev->vq[i].vring.used = 0;
>     vdev->vq[i].last_avail_idx = 0;
>     vdev->vq[i].shadow_avail_idx = 0;
>     vdev->vq[i].used_idx = 0;
>     vdev->vq[i].last_avail_wrap_counter = true;
>     vdev->vq[i].shadow_avail_wrap_counter = true;
>     vdev->vq[i].used_wrap_counter = true;
>     virtio_queue_set_vector(vdev, i, VIRTIO_NO_VECTOR);
>     vdev->vq[i].signalled_used = 0;
>     vdev->vq[i].signalled_used_valid = false;
>     vdev->vq[i].notification = true;
>     vdev->vq[i].vring.num = vdev->vq[i].vring.num_default;
>     vdev->vq[i].inuse = 0;
>     virtio_virtqueue_reset_region_cache(&vdev->vq[i]);
> }
>
> In the implementation of Per-Queue Reset, for RX, we stop RX by setting vdev->vq[i].vring.avail to 0.

Ok, but this is kind of fragile (especially when vIOMMU is enabled).
I'd add an explicit check for reset there.

(probably on top).

Thanks

> Then callback can_receive will return False.
>
>
> Thanks.
>
>
> >
> > Thanks
> >
> > >
> > > Thanks.
> > >
> > >
> > > >
> > > > E.g:
> > > > virtio_net_can_receive()
> > > > virtio_net_tx_{timer|bh}()
> > > >
> > > > Thanks
> > > >
> > > > >          return num_packets;
> > > > >      }
> > > > >
> > > > > --
> > > > > 2.32.0.3.g01195cf9f
> > > > >
> > > >
> > >
> >
>




* Re: [PATCH v1 2/2] virtio-net: virtio_net_flush_tx() check for per-queue reset
  2023-01-30  2:15                 ` Xuan Zhuo
@ 2023-01-30  5:32                   ` Michael S. Tsirkin
  2023-01-30  7:49                     ` Jason Wang
  0 siblings, 1 reply; 32+ messages in thread
From: Michael S. Tsirkin @ 2023-01-30  5:32 UTC (permalink / raw)
  To: Xuan Zhuo; +Cc: qemu-devel, Jason Wang, Alexander Bulekov

On Mon, Jan 30, 2023 at 10:15:12AM +0800, Xuan Zhuo wrote:
> On Sun, 29 Jan 2023 07:15:47 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > On Sun, Jan 29, 2023 at 08:03:42PM +0800, Xuan Zhuo wrote:
> > > On Sun, 29 Jan 2023 06:57:29 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > On Sun, Jan 29, 2023 at 04:23:08PM +0800, Xuan Zhuo wrote:
> > > > > On Sun, 29 Jan 2023 03:12:12 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > On Sun, Jan 29, 2023 at 03:28:28PM +0800, Xuan Zhuo wrote:
> > > > > > > On Sun, 29 Jan 2023 02:25:43 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > > On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo wrote:
> > > > > > > > > Check whether it is per-queue reset state in virtio_net_flush_tx().
> > > > > > > > >
> > > > > > > > > Before per-queue reset, we need to recover async tx resources. At this
> > > > > > > > > time, virtio_net_flush_tx() is called, but we should not try to send
> > > > > > > > > new packets, so virtio_net_flush_tx() should check the current
> > > > > > > > > per-queue reset state.
> > > > > > > >
> > > > > > > >
> > > > > > > > What does "at this time" mean here?
> > > > > > > > Do you in fact mean it's called from flush_or_purge_queued_packets?
> > > > > > >
> > > > > > > Yes
> > > > > > >
> > > > > > > virtio_queue_reset
> > > > > > > 	k->queue_reset
> > > > > > > 		virtio_net_queue_reset
> > > > > > > 			flush_or_purge_queued_packets
> > > > > > > 				qemu_flush_or_purge_queued_packets
> > > > > > > 					.....
> > > > > > > 					(callback) virtio_net_tx_complete
> > > > > > > 						virtio_net_flush_tx <-- here send new packet. We need stop it.
> > > > > > >
> > > > > > >
> > > > > > > Because it is inside the callback, I can't pass information through the stack. I
> > > > > > > originally thought it was a general situation, so I wanted to put it in
> > > > > > > struct VirtQueue.
> > > > > > >
> > > > > > > If it is not very suitable, it may be better to put it in VirtIONetQueue.
> > > > > > >
> > > > > > > Thanks.
> > > > > >
> > > > > > Hmm maybe. Another idea: isn't virtio_net_tx_complete called
> > > > > > with length 0 here? Are there other cases where length is 0?
> > > > > >
> > > > > >
> > > > > > > > What does the call stack look like?
> > > > > > > >
> > > > > > > > If yes introducing a vq state just so virtio_net_flush_tx
> > > > > > > > knows we are in the process of reset would be a bad idea.
> > > > > > > > We want something much more local, ideally on stack even ...
> > > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > > > > > > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > > > > > > > > Reported-by: Alexander Bulekov <alxndr@bu.edu>
> > > > > > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > > > > > > ---
> > > > > > > > >  hw/net/virtio-net.c | 3 ++-
> > > > > > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > > > > > >
> > > > > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > > > > > > > index 3ae909041a..fba6451a50 100644
> > > > > > > > > --- a/hw/net/virtio-net.c
> > > > > > > > > +++ b/hw/net/virtio-net.c
> > > > > > > > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> > > > > > > > >      VirtQueueElement *elem;
> > > > > > > > >      int32_t num_packets = 0;
> > > > > > > > >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > > > > > > > > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > > > > > > > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > > > > > > > > +        virtio_queue_reset_state(q->tx_vq)) {
> > > > > >
> > > > > > btw this sounds like you are asking it to reset some state.
> > > > > >
> > > > > > > > >          return num_packets;
> > > > > >
> > > > > > and then
> > > > > >
> > > > > >     ret = virtio_net_flush_tx(q);
> > > > > >     if (ret >= n->tx_burst)
> > > > > >
> > > > > >
> > > > > > will reschedule automatically won't it?
> > > > > >
> > > > > > also why check in virtio_net_flush_tx and not virtio_net_tx_complete?
> > > > >
> > > > > virtio_net_flush_tx may been called by timer.
> > > > >
> > > > > Thanks.
> > > >
> > > > timer won't run while flush_or_purge_queued_packets is in progress.
> > >
> > > Is timer not executed during the VMEXIT process? Otherwise, we still have to
> > > consider that after the flush_or_purge_queued_packets, this process before the
> > > structure is cleared.
> >
> >
> >
> > void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index)
> > {
> >     VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
> >
> >     if (k->queue_reset) {
> >         k->queue_reset(vdev, queue_index);
> >     }
> >
> >     __virtio_queue_reset(vdev, queue_index);
> > }
> >
> >
> > No timers do not run between  k->queue_reset and __virtio_queue_reset.
> >
> >
> > > Even if it can be processed in virtio_net_tx_complete, is there any good way?
> > > This is a callback, it is not convenient to pass the parameters.
> > >
> > > Thanks
> >
> >
> > How about checking that length is 0?
> 
> 
> I think that check length is not a good way. This modifys the semantics of 0. It is
> not friendly to the future maintenance. On the other hand, qemu_net_queue_purge()
> will pass 0, and this function is called by many places.
> 
> How about we add an api in queue.c to replace the sent_cb callback on queue?
> 
> Thanks.

OK I guess. Jason?

> 
> >
> > >
> > > >
> > > > > >
> > > > > >
> > > > > > > > >      }
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > 2.32.0.3.g01195cf9f
> > > > > > > >
> > > > > >
> > > >
> >




* Re: [PATCH v1 2/2] virtio-net: virtio_net_flush_tx() check for per-queue reset
  2023-01-30  3:53           ` Jason Wang
@ 2023-01-30  5:50             ` Michael S. Tsirkin
  2023-01-30  8:41               ` Jason Wang
  0 siblings, 1 reply; 32+ messages in thread
From: Michael S. Tsirkin @ 2023-01-30  5:50 UTC (permalink / raw)
  To: Jason Wang; +Cc: Xuan Zhuo, qemu-devel, Alexander Bulekov

On Mon, Jan 30, 2023 at 11:53:18AM +0800, Jason Wang wrote:
> On Mon, Jan 30, 2023 at 11:42 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> >
> > On Mon, 30 Jan 2023 11:01:40 +0800, Jason Wang <jasowang@redhat.com> wrote:
> > > On Sun, Jan 29, 2023 at 3:44 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > > >
> > > > On Sun, 29 Jan 2023 14:23:21 +0800, Jason Wang <jasowang@redhat.com> wrote:
> > > > > On Sun, Jan 29, 2023 at 10:52 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > > > > >
> > > > > > Check whether it is per-queue reset state in virtio_net_flush_tx().
> > > > > >
> > > > > > Before per-queue reset, we need to recover async tx resources. At this
> > > > > > time, virtio_net_flush_tx() is called, but we should not try to send
> > > > > > new packets, so virtio_net_flush_tx() should check the current
> > > > > > per-queue reset state.
> > > > > >
> > > > > > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > > > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > > > > > Reported-by: Alexander Bulekov <alxndr@bu.edu>
> > > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > > > ---
> > > > > >  hw/net/virtio-net.c | 3 ++-
> > > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > > >
> > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > > > > index 3ae909041a..fba6451a50 100644
> > > > > > --- a/hw/net/virtio-net.c
> > > > > > +++ b/hw/net/virtio-net.c
> > > > > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> > > > > >      VirtQueueElement *elem;
> > > > > >      int32_t num_packets = 0;
> > > > > >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > > > > > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > > > > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > > > > > +        virtio_queue_reset_state(q->tx_vq)) {
> > > > >
> > > > > We have other places that check DRIVER_OK do we need to check queue
> > > > > reset as well?
> > > >
> > > > I checked it again. I still think that the location of other checking DRIVER_OK
> > > > does not need to check the queue reset.
> > >
> > > For example, if we don't disable can_receive() when the queue is
> > > reset, it means rx may go for virtio_net_receive_rcu(). It means the
> > > Qemu is still trying to process the traffic from the network backend
> > > like tap which may waste cpu cycles.
> > >
> > > I think the correct way is to return false when the queue is reset in
> > > can_receive(), then the backend poll will be disabled (e.g TAP). When
> > > the queue is enabled again, qemu_flush_queued_packets() will wake up
> > > the backend polling.
> > >
> > > Having had time to check other places but it would be better to
> > > mention why it doesn't need a check in the changelog.
> >
> >
> > static bool virtio_net_can_receive(NetClientState *nc)
> > {
> >     VirtIONet *n = qemu_get_nic_opaque(nc);
> >     VirtIODevice *vdev = VIRTIO_DEVICE(n);
> >     VirtIONetQueue *q = virtio_net_get_subqueue(nc);
> >
> >     if (!vdev->vm_running) {
> >         return false;
> >     }
> >
> >     if (nc->queue_index >= n->curr_queue_pairs) {
> >         return false;
> >     }
> >
> >     if (!virtio_queue_ready(q->rx_vq) ||
> >         !(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> >         return false;
> >     }
> >
> >     return true;
> > }
> >
> > int virtio_queue_ready(VirtQueue *vq)
> > {
> >     return vq->vring.avail != 0;
> > }
> >
> >
> > static void __virtio_queue_reset(VirtIODevice *vdev, uint32_t i)
> > {
> >     vdev->vq[i].vring.desc = 0;
> >     vdev->vq[i].vring.avail = 0;
> >     vdev->vq[i].vring.used = 0;
> >     vdev->vq[i].last_avail_idx = 0;
> >     vdev->vq[i].shadow_avail_idx = 0;
> >     vdev->vq[i].used_idx = 0;
> >     vdev->vq[i].last_avail_wrap_counter = true;
> >     vdev->vq[i].shadow_avail_wrap_counter = true;
> >     vdev->vq[i].used_wrap_counter = true;
> >     virtio_queue_set_vector(vdev, i, VIRTIO_NO_VECTOR);
> >     vdev->vq[i].signalled_used = 0;
> >     vdev->vq[i].signalled_used_valid = false;
> >     vdev->vq[i].notification = true;
> >     vdev->vq[i].vring.num = vdev->vq[i].vring.num_default;
> >     vdev->vq[i].inuse = 0;
> >     virtio_virtqueue_reset_region_cache(&vdev->vq[i]);
> > }
> >
> > In the implementation of Per-Queue Reset, for RX, we stop RX by setting vdev->vq[i].vring.avail to 0.
> 
> Ok, but this is kind of fragile (especially when vIOMMU is enabled).
> I'd add an explicit check for reset there.

It's not great, in that the spec says an avail address of 0 is actually legal.
But I don't really want to see more and more checks.
If we are doing cleanups, the right way is probably a new "live" flag
that transports can set correctly from the combination of
DRIVER_OK, desc, kick, queue_enable, queue_reset and so on.

> (probably on top).
> 
> Thanks
> 
> > Then callback can_receive will return False.
> >
> >
> > Thanks.
> >
> >
> > >
> > > Thanks
> > >
> > > >
> > > > Thanks.
> > > >
> > > >
> > > > >
> > > > > E.g:
> > > > > virtio_net_can_receive()
> > > > > virtio_net_tx_{timer|bh}()
> > > > >
> > > > > Thanks
> > > > >
> > > > > >          return num_packets;
> > > > > >      }
> > > > > >
> > > > > > --
> > > > > > 2.32.0.3.g01195cf9f
> > > > > >
> > > > >
> > > >
> > >
> >




* Re: [PATCH v1 2/2] virtio-net: virtio_net_flush_tx() check for per-queue reset
  2023-01-30  5:32                   ` Michael S. Tsirkin
@ 2023-01-30  7:49                     ` Jason Wang
  2023-01-30  7:53                       ` Xuan Zhuo
  2023-01-31  7:38                       ` Xuan Zhuo
  0 siblings, 2 replies; 32+ messages in thread
From: Jason Wang @ 2023-01-30  7:49 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Xuan Zhuo, qemu-devel, Alexander Bulekov

On Mon, Jan 30, 2023 at 1:32 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Mon, Jan 30, 2023 at 10:15:12AM +0800, Xuan Zhuo wrote:
> > On Sun, 29 Jan 2023 07:15:47 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > On Sun, Jan 29, 2023 at 08:03:42PM +0800, Xuan Zhuo wrote:
> > > > On Sun, 29 Jan 2023 06:57:29 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > On Sun, Jan 29, 2023 at 04:23:08PM +0800, Xuan Zhuo wrote:
> > > > > > On Sun, 29 Jan 2023 03:12:12 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > On Sun, Jan 29, 2023 at 03:28:28PM +0800, Xuan Zhuo wrote:
> > > > > > > > On Sun, 29 Jan 2023 02:25:43 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > > > On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo wrote:
> > > > > > > > > > Check whether it is per-queue reset state in virtio_net_flush_tx().
> > > > > > > > > >
> > > > > > > > > > Before per-queue reset, we need to recover async tx resources. At this
> > > > > > > > > > time, virtio_net_flush_tx() is called, but we should not try to send
> > > > > > > > > > new packets, so virtio_net_flush_tx() should check the current
> > > > > > > > > > per-queue reset state.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > What does "at this time" mean here?
> > > > > > > > > Do you in fact mean it's called from flush_or_purge_queued_packets?
> > > > > > > >
> > > > > > > > Yes
> > > > > > > >
> > > > > > > > virtio_queue_reset
> > > > > > > >   k->queue_reset
> > > > > > > >           virtio_net_queue_reset
> > > > > > > >                   flush_or_purge_queued_packets
> > > > > > > >                           qemu_flush_or_purge_queued_packets
> > > > > > > >                                   .....
> > > > > > > >                                   (callback) virtio_net_tx_complete
> > > > > > > >                                           virtio_net_flush_tx <-- here send new packet. We need stop it.
> > > > > > > >
> > > > > > > >
> > > > > > > > Because it is inside the callback, I can't pass information through the stack. I
> > > > > > > > originally thought it was a general situation, so I wanted to put it in
> > > > > > > > struct VirtQueue.
> > > > > > > >
> > > > > > > > If it is not very suitable, it may be better to put it in VirtIONetQueue.
> > > > > > > >
> > > > > > > > Thanks.
> > > > > > >
> > > > > > > Hmm maybe. Another idea: isn't virtio_net_tx_complete called
> > > > > > > with length 0 here? Are there other cases where length is 0?
> > > > > > >
> > > > > > >
> > > > > > > > > What does the call stack look like?
> > > > > > > > >
> > > > > > > > > If yes introducing a vq state just so virtio_net_flush_tx
> > > > > > > > > knows we are in the process of reset would be a bad idea.
> > > > > > > > > We want something much more local, ideally on stack even ...
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > > > > > > > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > > > > > > > > > Reported-by: Alexander Bulekov <alxndr@bu.edu>
> > > > > > > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > > > > > > > ---
> > > > > > > > > >  hw/net/virtio-net.c | 3 ++-
> > > > > > > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > > > > > > >
> > > > > > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > > > > > > > > index 3ae909041a..fba6451a50 100644
> > > > > > > > > > --- a/hw/net/virtio-net.c
> > > > > > > > > > +++ b/hw/net/virtio-net.c
> > > > > > > > > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> > > > > > > > > >      VirtQueueElement *elem;
> > > > > > > > > >      int32_t num_packets = 0;
> > > > > > > > > >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > > > > > > > > > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > > > > > > > > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > > > > > > > > > +        virtio_queue_reset_state(q->tx_vq)) {
> > > > > > >
> > > > > > > btw this sounds like you are asking it to reset some state.
> > > > > > >
> > > > > > > > > >          return num_packets;
> > > > > > >
> > > > > > > and then
> > > > > > >
> > > > > > >     ret = virtio_net_flush_tx(q);
> > > > > > >     if (ret >= n->tx_burst)
> > > > > > >
> > > > > > >
> > > > > > > will reschedule automatically won't it?
> > > > > > >
> > > > > > > also why check in virtio_net_flush_tx and not virtio_net_tx_complete?
> > > > > >
> > > > > > virtio_net_flush_tx may also be called by a timer.

We stop timer/bh during device reset, do we need to do the same with vq reset?

> > > > > >
> > > > > > Thanks.
> > > > >
> > > > > timer won't run while flush_or_purge_queued_packets is in progress.
> > > >
> > > > Doesn't the timer run during the VMEXIT process? Otherwise, we still have to
> > > > consider this path running after flush_or_purge_queued_packets but before the
> > > > structure is cleared.
> > >
> > >
> > >
> > > void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index)
> > > {
> > >     VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
> > >
> > >     if (k->queue_reset) {
> > >         k->queue_reset(vdev, queue_index);
> > >     }
> > >
> > >     __virtio_queue_reset(vdev, queue_index);
> > > }
> > >
> > >
> > > No, timers do not run between k->queue_reset and __virtio_queue_reset.
> > >
> > >
> > > > Even if it can be handled in virtio_net_tx_complete, is there a good way to do it?
> > > > Since it is a callback, it is not convenient to pass parameters.
> > > >
> > > > Thanks
> > >
> > >
> > > How about checking that length is 0?
> >
> >
> > I think checking the length is not a good approach. It changes the semantics of 0.

0 seems to mean "purge" and

> It is
> > not friendly to future maintenance. On the other hand, qemu_net_queue_purge()
> > passes 0, and that function is called from many places.

That's actually exactly what we want: when we do a purge, we don't need a flush, right?

> >
> > How about we add an api in queue.c to replace the sent_cb callback on queue?
> >
> > Thanks.
>
> OK I guess. Jason?

Not sure; is it any different from adding a check in
virtio_net_tx_complete()? (assuming the bh and timer are cancelled or
deleted).

Thanks

>
> >
> > >
> > > >
> > > > >
> > > > > > >
> > > > > > >
> > > > > > > > > >      }
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > 2.32.0.3.g01195cf9f
> > > > > > > > >
> > > > > > >
> > > > >
> > >
>




* Re: [PATCH v1 2/2] virtio-net: virtio_net_flush_tx() check for per-queue reset
  2023-01-30  7:49                     ` Jason Wang
@ 2023-01-30  7:53                       ` Xuan Zhuo
  2023-01-30  8:40                         ` Jason Wang
  2023-01-31  7:38                       ` Xuan Zhuo
  1 sibling, 1 reply; 32+ messages in thread
From: Xuan Zhuo @ 2023-01-30  7:53 UTC (permalink / raw)
  To: Jason Wang; +Cc: qemu-devel, Alexander Bulekov, Michael S. Tsirkin

On Mon, 30 Jan 2023 15:49:36 +0800, Jason Wang <jasowang@redhat.com> wrote:
> On Mon, Jan 30, 2023 at 1:32 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Mon, Jan 30, 2023 at 10:15:12AM +0800, Xuan Zhuo wrote:
> > > On Sun, 29 Jan 2023 07:15:47 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > On Sun, Jan 29, 2023 at 08:03:42PM +0800, Xuan Zhuo wrote:
> > > > > On Sun, 29 Jan 2023 06:57:29 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > On Sun, Jan 29, 2023 at 04:23:08PM +0800, Xuan Zhuo wrote:
> > > > > > > On Sun, 29 Jan 2023 03:12:12 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > > On Sun, Jan 29, 2023 at 03:28:28PM +0800, Xuan Zhuo wrote:
> > > > > > > > > On Sun, 29 Jan 2023 02:25:43 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > > > > On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo wrote:
> > > > > > > > > > > Check whether it is per-queue reset state in virtio_net_flush_tx().
> > > > > > > > > > >
> > > > > > > > > > > Before per-queue reset, we need to recover async tx resources. At this
> > > > > > > > > > > time, virtio_net_flush_tx() is called, but we should not try to send
> > > > > > > > > > > new packets, so virtio_net_flush_tx() should check the current
> > > > > > > > > > > per-queue reset state.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > What does "at this time" mean here?
> > > > > > > > > > Do you in fact mean it's called from flush_or_purge_queued_packets?
> > > > > > > > >
> > > > > > > > > Yes
> > > > > > > > >
> > > > > > > > > virtio_queue_reset
> > > > > > > > >   k->queue_reset
> > > > > > > > >           virtio_net_queue_reset
> > > > > > > > >                   flush_or_purge_queued_packets
> > > > > > > > >                           qemu_flush_or_purge_queued_packets
> > > > > > > > >                                   .....
> > > > > > > > >                                   (callback) virtio_net_tx_complete
> > > > > > > > >                                           virtio_net_flush_tx <-- here a new packet is sent; we need to stop it.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Because it is inside the callback, I can't pass information through the stack. I
> > > > > > > > > originally thought it was a general situation, so I wanted to put it in
> > > > > > > > > struct VirtQueue.
> > > > > > > > >
> > > > > > > > > If it is not very suitable, it may be better to put it in VirtIONetQueue.
> > > > > > > > >
> > > > > > > > > Thanks.
> > > > > > > >
> > > > > > > > Hmm maybe. Another idea: isn't virtio_net_tx_complete called
> > > > > > > > with length 0 here? Are there other cases where length is 0?
> > > > > > > >
> > > > > > > >
> > > > > > > > > > What does the call stack look like?
> > > > > > > > > >
> > > > > > > > > > If yes introducing a vq state just so virtio_net_flush_tx
> > > > > > > > > > knows we are in the process of reset would be a bad idea.
> > > > > > > > > > We want something much more local, ideally on stack even ...
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > > > > > > > > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > > > > > > > > > > Reported-by: Alexander Bulekov <alxndr@bu.edu>
> > > > > > > > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > > > > > > > > ---
> > > > > > > > > > >  hw/net/virtio-net.c | 3 ++-
> > > > > > > > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > > > > > > > >
> > > > > > > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > > > > > > > > > index 3ae909041a..fba6451a50 100644
> > > > > > > > > > > --- a/hw/net/virtio-net.c
> > > > > > > > > > > +++ b/hw/net/virtio-net.c
> > > > > > > > > > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> > > > > > > > > > >      VirtQueueElement *elem;
> > > > > > > > > > >      int32_t num_packets = 0;
> > > > > > > > > > >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > > > > > > > > > > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > > > > > > > > > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > > > > > > > > > > +        virtio_queue_reset_state(q->tx_vq)) {
> > > > > > > >
> > > > > > > > btw this sounds like you are asking it to reset some state.
> > > > > > > >
> > > > > > > > > > >          return num_packets;
> > > > > > > >
> > > > > > > > and then
> > > > > > > >
> > > > > > > >     ret = virtio_net_flush_tx(q);
> > > > > > > >     if (ret >= n->tx_burst)
> > > > > > > >
> > > > > > > >
> > > > > > > > will reschedule automatically won't it?
> > > > > > > >
> > > > > > > > also why check in virtio_net_flush_tx and not virtio_net_tx_complete?
> > > > > > >
> > > > > > > virtio_net_flush_tx may also be called by a timer.
>
> We stop timer/bh during device reset, do we need to do the same with vq reset?
>
> > > > > > >
> > > > > > > Thanks.
> > > > > >
> > > > > > timer won't run while flush_or_purge_queued_packets is in progress.
> > > > >
> > > > > Doesn't the timer run during the VMEXIT process? Otherwise, we still have to
> > > > > consider this path running after flush_or_purge_queued_packets but before the
> > > > > structure is cleared.
> > > >
> > > >
> > > >
> > > > void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index)
> > > > {
> > > >     VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
> > > >
> > > >     if (k->queue_reset) {
> > > >         k->queue_reset(vdev, queue_index);
> > > >     }
> > > >
> > > >     __virtio_queue_reset(vdev, queue_index);
> > > > }
> > > >
> > > >
> > > > No, timers do not run between k->queue_reset and __virtio_queue_reset.
> > > >
> > > >
> > > > > Even if it can be handled in virtio_net_tx_complete, is there a good way to do it?
> > > > > Since it is a callback, it is not convenient to pass parameters.
> > > > >
> > > > > Thanks
> > > >
> > > >
> > > > How about checking that length is 0?
> > >
> > >
> > > I think checking the length is not a good approach. It changes the semantics of 0.
>
> 0 seems to mean "purge" and
>
> > It is
> > > not friendly to future maintenance. On the other hand, qemu_net_queue_purge()
> > > passes 0, and that function is called from many places.
>
> That's actually exactly what we want: when we do a purge, we don't need a flush, right?

Yes, but I'm not sure whether stopping the flush will have any other effects.

On the other hand, if we use "0" as the condition, do you mean checking it only in
the purge path of flush_or_purge_queued_packets()?

>
> > >
> > > How about we add an api in queue.c to replace the sent_cb callback on queue?
> > >
> > > Thanks.
> > > > OK I guess. Jason?
>
> Not sure; is it any different from adding a check in
> virtio_net_tx_complete()? (assuming the bh and timer are cancelled or
> deleted).

We would replace the sent_cb with a function that does not flush.

Thanks.


>
> Thanks
>
> >
> > >
> > > >
> > > > >
> > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > > > >      }
> > > > > > > > > > >
> > > > > > > > > > > --
> > > > > > > > > > > 2.32.0.3.g01195cf9f
> > > > > > > > > >
> > > > > > > >
> > > > > >
> > > >
> >
>



* Re: [PATCH v1 2/2] virtio-net: virtio_net_flush_tx() check for per-queue reset
  2023-01-30  7:53                       ` Xuan Zhuo
@ 2023-01-30  8:40                         ` Jason Wang
  2023-01-30 10:24                           ` Xuan Zhuo
  0 siblings, 1 reply; 32+ messages in thread
From: Jason Wang @ 2023-01-30  8:40 UTC (permalink / raw)
  To: Xuan Zhuo; +Cc: qemu-devel, Alexander Bulekov, Michael S. Tsirkin

On Mon, Jan 30, 2023 at 4:03 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
>
> On Mon, 30 Jan 2023 15:49:36 +0800, Jason Wang <jasowang@redhat.com> wrote:
> > On Mon, Jan 30, 2023 at 1:32 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Mon, Jan 30, 2023 at 10:15:12AM +0800, Xuan Zhuo wrote:
> > > > On Sun, 29 Jan 2023 07:15:47 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > On Sun, Jan 29, 2023 at 08:03:42PM +0800, Xuan Zhuo wrote:
> > > > > > On Sun, 29 Jan 2023 06:57:29 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > On Sun, Jan 29, 2023 at 04:23:08PM +0800, Xuan Zhuo wrote:
> > > > > > > > On Sun, 29 Jan 2023 03:12:12 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > > > On Sun, Jan 29, 2023 at 03:28:28PM +0800, Xuan Zhuo wrote:
> > > > > > > > > > On Sun, 29 Jan 2023 02:25:43 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > > > > > On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo wrote:
> > > > > > > > > > > > Check whether it is per-queue reset state in virtio_net_flush_tx().
> > > > > > > > > > > >
> > > > > > > > > > > > Before per-queue reset, we need to recover async tx resources. At this
> > > > > > > > > > > > time, virtio_net_flush_tx() is called, but we should not try to send
> > > > > > > > > > > > new packets, so virtio_net_flush_tx() should check the current
> > > > > > > > > > > > per-queue reset state.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > What does "at this time" mean here?
> > > > > > > > > > > Do you in fact mean it's called from flush_or_purge_queued_packets?
> > > > > > > > > >
> > > > > > > > > > Yes
> > > > > > > > > >
> > > > > > > > > > virtio_queue_reset
> > > > > > > > > >   k->queue_reset
> > > > > > > > > >           virtio_net_queue_reset
> > > > > > > > > >                   flush_or_purge_queued_packets
> > > > > > > > > >                           qemu_flush_or_purge_queued_packets
> > > > > > > > > >                                   .....
> > > > > > > > > >                                   (callback) virtio_net_tx_complete
> > > > > > > > > >                                           virtio_net_flush_tx <-- here a new packet is sent; we need to stop it.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Because it is inside the callback, I can't pass information through the stack. I
> > > > > > > > > > originally thought it was a general situation, so I wanted to put it in
> > > > > > > > > > struct VirtQueue.
> > > > > > > > > >
> > > > > > > > > > If it is not very suitable, it may be better to put it in VirtIONetQueue.
> > > > > > > > > >
> > > > > > > > > > Thanks.
> > > > > > > > >
> > > > > > > > > Hmm maybe. Another idea: isn't virtio_net_tx_complete called
> > > > > > > > > with length 0 here? Are there other cases where length is 0?
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > > What does the call stack look like?
> > > > > > > > > > >
> > > > > > > > > > > If yes introducing a vq state just so virtio_net_flush_tx
> > > > > > > > > > > knows we are in the process of reset would be a bad idea.
> > > > > > > > > > > We want something much more local, ideally on stack even ...
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > > > > > > > > > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > > > > > > > > > > > Reported-by: Alexander Bulekov <alxndr@bu.edu>
> > > > > > > > > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > > > > > > > > > ---
> > > > > > > > > > > >  hw/net/virtio-net.c | 3 ++-
> > > > > > > > > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > > > > > > > > >
> > > > > > > > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > > > > > > > > > > index 3ae909041a..fba6451a50 100644
> > > > > > > > > > > > --- a/hw/net/virtio-net.c
> > > > > > > > > > > > +++ b/hw/net/virtio-net.c
> > > > > > > > > > > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> > > > > > > > > > > >      VirtQueueElement *elem;
> > > > > > > > > > > >      int32_t num_packets = 0;
> > > > > > > > > > > >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > > > > > > > > > > > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > > > > > > > > > > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > > > > > > > > > > > +        virtio_queue_reset_state(q->tx_vq)) {
> > > > > > > > >
> > > > > > > > > btw this sounds like you are asking it to reset some state.
> > > > > > > > >
> > > > > > > > > > > >          return num_packets;
> > > > > > > > >
> > > > > > > > > and then
> > > > > > > > >
> > > > > > > > >     ret = virtio_net_flush_tx(q);
> > > > > > > > >     if (ret >= n->tx_burst)
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > will reschedule automatically won't it?
> > > > > > > > >
> > > > > > > > > also why check in virtio_net_flush_tx and not virtio_net_tx_complete?
> > > > > > > >
> > > > > > > > virtio_net_flush_tx may also be called by a timer.
> >
> > We stop timer/bh during device reset, do we need to do the same with vq reset?
> >
> > > > > > > >
> > > > > > > > Thanks.
> > > > > > >
> > > > > > > timer won't run while flush_or_purge_queued_packets is in progress.
> > > > > >
> > > > > > Doesn't the timer run during the VMEXIT process? Otherwise, we still have to
> > > > > > consider this path running after flush_or_purge_queued_packets but before the
> > > > > > structure is cleared.
> > > > >
> > > > >
> > > > >
> > > > > void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index)
> > > > > {
> > > > >     VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
> > > > >
> > > > >     if (k->queue_reset) {
> > > > >         k->queue_reset(vdev, queue_index);
> > > > >     }
> > > > >
> > > > >     __virtio_queue_reset(vdev, queue_index);
> > > > > }
> > > > >
> > > > >
> > > > > No, timers do not run between k->queue_reset and __virtio_queue_reset.
> > > > >
> > > > >
> > > > > > Even if it can be handled in virtio_net_tx_complete, is there a good way to do it?
> > > > > > Since it is a callback, it is not convenient to pass parameters.
> > > > > >
> > > > > > Thanks
> > > > >
> > > > >
> > > > > How about checking that length is 0?
> > > >
> > > >
> > > > I think checking the length is not a good approach. It changes the semantics of 0.
> >
> > 0 seems to mean "purge" and
> >
> > > It is
> > > > not friendly to future maintenance. On the other hand, qemu_net_queue_purge()
> > > > passes 0, and that function is called from many places.
> >
> > That's actually exactly what we want: when we do a purge, we don't need a flush, right?
>
> Yes, but I'm not sure whether stopping the flush will have any other effects.

So we did:

virtio_net_queue_reset():
    nc = qemu_get_subqueue(n->nic, vq2q(queue_index);
    flush_or_purge_queued_packets(nc);
        qemu_flush_or_purge_queued_packets(nc->peer, true); // [1]
            if (qemu_net_queue_flush(nc->incoming_queue)) {
            ....
            } else if (purge) {
                qemu_net_queue_purge(nc->incoming_queue, nc->peer);
                        packet->send_cb()
                            virtio_net_tx_complete()
                                virtio_net_flush_tx()
                                    qemu_sendv_packet_async() // [2]
            }

We try to flush the tap's incoming queue and if we fail we will purge
in [1]. But the sent_cb() tries to send more packets which could be
queued to the tap incoming queue [2]. This breaks the semantics of
qemu_flush_or_purge_queued_packets().

>
> On the other hand, if we use "0" as the condition, do you mean checking it only in
> the purge path of flush_or_purge_queued_packets()?

It should apply to all the users of qemu_net_queue_purge(). The remaining users
all seem fine:

virtio_net_vhost_status(): if we flushed there, we might end up touching the
vring while vhost is running.
filters: they all do a flush beforehand.

>
> >
> > > >
> > > > How about we add an api in queue.c to replace the sent_cb callback on queue?
> > > >
> > > > Thanks.
> > > > > OK I guess. Jason?
> >
> > Not sure; is it any different from adding a check in
> > virtio_net_tx_complete()? (assuming the bh and timer are cancelled or
> > deleted).
>
> We would replace the sent_cb with a function that does not flush.

I meant, would it be any different from adding a

if (virtio_queue_is_reset())

check somewhere in virtio_net_tx_complete()?

Thanks

>
> Thanks.
>
>
> >
> > Thanks
> >
> > >
> > > >
> > > > >
> > > > > >
> > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > > >      }
> > > > > > > > > > > >
> > > > > > > > > > > > --
> > > > > > > > > > > > 2.32.0.3.g01195cf9f
> > > > > > > > > > >
> > > > > > > > >
> > > > > > >
> > > > >
> > >
> >
>




* Re: [PATCH v1 2/2] virtio-net: virtio_net_flush_tx() check for per-queue reset
  2023-01-30  5:50             ` Michael S. Tsirkin
@ 2023-01-30  8:41               ` Jason Wang
  0 siblings, 0 replies; 32+ messages in thread
From: Jason Wang @ 2023-01-30  8:41 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Xuan Zhuo, qemu-devel, Alexander Bulekov

On Mon, Jan 30, 2023 at 1:50 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Mon, Jan 30, 2023 at 11:53:18AM +0800, Jason Wang wrote:
> > On Mon, Jan 30, 2023 at 11:42 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > >
> > > On Mon, 30 Jan 2023 11:01:40 +0800, Jason Wang <jasowang@redhat.com> wrote:
> > > > On Sun, Jan 29, 2023 at 3:44 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > > > >
> > > > > On Sun, 29 Jan 2023 14:23:21 +0800, Jason Wang <jasowang@redhat.com> wrote:
> > > > > > On Sun, Jan 29, 2023 at 10:52 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > > > > > >
> > > > > > > Check whether it is per-queue reset state in virtio_net_flush_tx().
> > > > > > >
> > > > > > > Before per-queue reset, we need to recover async tx resources. At this
> > > > > > > time, virtio_net_flush_tx() is called, but we should not try to send
> > > > > > > new packets, so virtio_net_flush_tx() should check the current
> > > > > > > per-queue reset state.
> > > > > > >
> > > > > > > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > > > > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > > > > > > Reported-by: Alexander Bulekov <alxndr@bu.edu>
> > > > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > > > > ---
> > > > > > >  hw/net/virtio-net.c | 3 ++-
> > > > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > > > >
> > > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > > > > > index 3ae909041a..fba6451a50 100644
> > > > > > > --- a/hw/net/virtio-net.c
> > > > > > > +++ b/hw/net/virtio-net.c
> > > > > > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> > > > > > >      VirtQueueElement *elem;
> > > > > > >      int32_t num_packets = 0;
> > > > > > >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > > > > > > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > > > > > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > > > > > > +        virtio_queue_reset_state(q->tx_vq)) {
> > > > > >
> > > > > > We have other places that check DRIVER_OK do we need to check queue
> > > > > > reset as well?
> > > > >
> > > > > I checked it again. I still think that the location of other checking DRIVER_OK
> > > > > does not need to check the queue reset.
> > > >
> > > > For example, if we don't disable can_receive() when the queue is
> > > > reset, it means rx may go for virtio_net_receive_rcu(). It means the
> > > > Qemu is still trying to process the traffic from the network backend
> > > > like tap which may waste cpu cycles.
> > > >
> > > > I think the correct way is to return false when the queue is reset in
> > > > can_receive(), then the backend poll will be disabled (e.g TAP). When
> > > > the queue is enabled again, qemu_flush_queued_packets() will wake up
> > > > the backend polling.
> > > >
> > > > Having had time to check other places but it would be better to
> > > > mention why it doesn't need a check in the changelog.
> > >
> > >
> > > static bool virtio_net_can_receive(NetClientState *nc)
> > > {
> > >     VirtIONet *n = qemu_get_nic_opaque(nc);
> > >     VirtIODevice *vdev = VIRTIO_DEVICE(n);
> > >     VirtIONetQueue *q = virtio_net_get_subqueue(nc);
> > >
> > >     if (!vdev->vm_running) {
> > >         return false;
> > >     }
> > >
> > >     if (nc->queue_index >= n->curr_queue_pairs) {
> > >         return false;
> > >     }
> > >
> > >     if (!virtio_queue_ready(q->rx_vq) ||
> > >         !(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > >         return false;
> > >     }
> > >
> > >     return true;
> > > }
> > >
> > > int virtio_queue_ready(VirtQueue *vq)
> > > {
> > >     return vq->vring.avail != 0;
> > > }
> > >
> > >
> > > static void __virtio_queue_reset(VirtIODevice *vdev, uint32_t i)
> > > {
> > >     vdev->vq[i].vring.desc = 0;
> > >     vdev->vq[i].vring.avail = 0;
> > >     vdev->vq[i].vring.used = 0;
> > >     vdev->vq[i].last_avail_idx = 0;
> > >     vdev->vq[i].shadow_avail_idx = 0;
> > >     vdev->vq[i].used_idx = 0;
> > >     vdev->vq[i].last_avail_wrap_counter = true;
> > >     vdev->vq[i].shadow_avail_wrap_counter = true;
> > >     vdev->vq[i].used_wrap_counter = true;
> > >     virtio_queue_set_vector(vdev, i, VIRTIO_NO_VECTOR);
> > >     vdev->vq[i].signalled_used = 0;
> > >     vdev->vq[i].signalled_used_valid = false;
> > >     vdev->vq[i].notification = true;
> > >     vdev->vq[i].vring.num = vdev->vq[i].vring.num_default;
> > >     vdev->vq[i].inuse = 0;
> > >     virtio_virtqueue_reset_region_cache(&vdev->vq[i]);
> > > }
> > >
> > > In the implementation of Per-Queue Reset, for RX, we stop RX by setting vdev->vq[i].vring.avail to 0.
> >
> > Ok, but this is kind of fragile (especially when vIOMMU is enabled).
> > I'd add an explicit check for reset there.
>
> It's not great in that spec says avail 0 is actually legal.
> But I don't really want to see more and more checks.
> If we are doing cleanups, the right way is probably a new "live" flag
> that transports can set correctly from the combination of
> DRIVER_OK, desc, kick, queue_enable, queue_reset and so on.

I second this, but for kick we can only do that if it is mandated by
the spec (otherwise we may break drivers silently).

Thanks

>
> > (probably on top).
> >
> > Thanks
> >
> > > Then callback can_receive will return False.
> > >
> > >
> > > Thanks.
> > >
> > >
> > > >
> > > > Thanks
> > > >
> > > > >
> > > > > Thanks.
> > > > >
> > > > >
> > > > > >
> > > > > > E.g:
> > > > > > virtio_net_can_receive()
> > > > > > virtio_net_tx_{timer|bh}()
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > > > >          return num_packets;
> > > > > > >      }
> > > > > > >
> > > > > > > --
> > > > > > > 2.32.0.3.g01195cf9f
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
>




* Re: [PATCH v1 2/2] virtio-net: virtio_net_flush_tx() check for per-queue reset
  2023-01-30  8:40                         ` Jason Wang
@ 2023-01-30 10:24                           ` Xuan Zhuo
  2023-01-31  3:27                             ` Jason Wang
  0 siblings, 1 reply; 32+ messages in thread
From: Xuan Zhuo @ 2023-01-30 10:24 UTC (permalink / raw)
  To: Jason Wang; +Cc: qemu-devel, Alexander Bulekov, Michael S. Tsirkin

On Mon, 30 Jan 2023 16:40:08 +0800, Jason Wang <jasowang@redhat.com> wrote:
> On Mon, Jan 30, 2023 at 4:03 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> >
> > On Mon, 30 Jan 2023 15:49:36 +0800, Jason Wang <jasowang@redhat.com> wrote:
> > > On Mon, Jan 30, 2023 at 1:32 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Mon, Jan 30, 2023 at 10:15:12AM +0800, Xuan Zhuo wrote:
> > > > > On Sun, 29 Jan 2023 07:15:47 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > On Sun, Jan 29, 2023 at 08:03:42PM +0800, Xuan Zhuo wrote:
> > > > > > > On Sun, 29 Jan 2023 06:57:29 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > > On Sun, Jan 29, 2023 at 04:23:08PM +0800, Xuan Zhuo wrote:
> > > > > > > > > On Sun, 29 Jan 2023 03:12:12 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > > > > On Sun, Jan 29, 2023 at 03:28:28PM +0800, Xuan Zhuo wrote:
> > > > > > > > > > > On Sun, 29 Jan 2023 02:25:43 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > > > > > > On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo wrote:
> > > > > > > > > > > > > Check whether it is per-queue reset state in virtio_net_flush_tx().
> > > > > > > > > > > > >
> > > > > > > > > > > > > Before per-queue reset, we need to recover async tx resources. At this
> > > > > > > > > > > > > time, virtio_net_flush_tx() is called, but we should not try to send
> > > > > > > > > > > > > new packets, so virtio_net_flush_tx() should check the current
> > > > > > > > > > > > > per-queue reset state.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > What does "at this time" mean here?
> > > > > > > > > > > > Do you in fact mean it's called from flush_or_purge_queued_packets?
> > > > > > > > > > >
> > > > > > > > > > > Yes
> > > > > > > > > > >
> > > > > > > > > > > virtio_queue_reset
> > > > > > > > > > >   k->queue_reset
> > > > > > > > > > >           virtio_net_queue_reset
> > > > > > > > > > >                   flush_or_purge_queued_packets
> > > > > > > > > > >                           qemu_flush_or_purge_queued_packets
> > > > > > > > > > >                                   .....
> > > > > > > > > > >                                   (callback) virtio_net_tx_complete
> > > > > > > > > > >                                           virtio_net_flush_tx <-- here a new packet is sent; we need to stop it.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Because it is inside the callback, I can't pass information through the stack. I
> > > > > > > > > > > originally thought it was a general situation, so I wanted to put it in
> > > > > > > > > > > struct VirtQueue.
> > > > > > > > > > >
> > > > > > > > > > > If it is not very suitable, it may be better to put it in VirtIONetQueue.
> > > > > > > > > > >
> > > > > > > > > > > Thanks.
> > > > > > > > > >
> > > > > > > > > > Hmm maybe. Another idea: isn't virtio_net_tx_complete called
> > > > > > > > > > with length 0 here? Are there other cases where length is 0?
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > > What does the call stack look like?
> > > > > > > > > > > >
> > > > > > > > > > > > If yes introducing a vq state just so virtio_net_flush_tx
> > > > > > > > > > > > knows we are in the process of reset would be a bad idea.
> > > > > > > > > > > > We want something much more local, ideally on stack even ...
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > > > > > > > > > > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > > > > > > > > > > > > Reported-by: Alexander Bulekov <alxndr@bu.edu>
> > > > > > > > > > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > > > > > > > > > > ---
> > > > > > > > > > > > >  hw/net/virtio-net.c | 3 ++-
> > > > > > > > > > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > > > > > > > > > >
> > > > > > > > > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > > > > > > > > > > > index 3ae909041a..fba6451a50 100644
> > > > > > > > > > > > > --- a/hw/net/virtio-net.c
> > > > > > > > > > > > > +++ b/hw/net/virtio-net.c
> > > > > > > > > > > > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> > > > > > > > > > > > >      VirtQueueElement *elem;
> > > > > > > > > > > > >      int32_t num_packets = 0;
> > > > > > > > > > > > >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > > > > > > > > > > > > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > > > > > > > > > > > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > > > > > > > > > > > > +        virtio_queue_reset_state(q->tx_vq)) {
> > > > > > > > > >
> > > > > > > > > > btw this sounds like you are asking it to reset some state.
> > > > > > > > > >
> > > > > > > > > > > > >          return num_packets;
> > > > > > > > > >
> > > > > > > > > > and then
> > > > > > > > > >
> > > > > > > > > >     ret = virtio_net_flush_tx(q);
> > > > > > > > > >     if (ret >= n->tx_burst)
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > will reschedule automatically won't it?
> > > > > > > > > >
> > > > > > > > > > also why check in virtio_net_flush_tx and not virtio_net_tx_complete?
> > > > > > > > >
> > > > > > > > > virtio_net_flush_tx may also be called by a timer.
> > >
> > > We stop timer/bh during device reset, do we need to do the same with vq reset?
> > >
> > > > > > > > >
> > > > > > > > > Thanks.
> > > > > > > >
> > > > > > > > timer won't run while flush_or_purge_queued_packets is in progress.
> > > > > > >
> > > > > > > Doesn't the timer run during the VMEXIT process? Otherwise, we still have to
> > > > > > > consider this path running after flush_or_purge_queued_packets but before the
> > > > > > > structure is cleared.
> > > > > >
> > > > > >
> > > > > >
> > > > > > void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index)
> > > > > > {
> > > > > >     VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
> > > > > >
> > > > > >     if (k->queue_reset) {
> > > > > >         k->queue_reset(vdev, queue_index);
> > > > > >     }
> > > > > >
> > > > > >     __virtio_queue_reset(vdev, queue_index);
> > > > > > }
> > > > > >
> > > > > >
> > > > > > No, timers do not run between k->queue_reset and __virtio_queue_reset.
> > > > > >
> > > > > >
> > > > > > > Even if it can be handled in virtio_net_tx_complete, is there a good way
> > > > > > > to do it? It is a callback, so it is not convenient to pass parameters.
> > > > > > >
> > > > > > > Thanks
> > > > > >
> > > > > >
> > > > > > How about checking that length is 0?
> > > > >
> > > > >
> > > > > I think checking the length is not a good way. It modifies the semantics of 0.
> > >
> > > 0 seems to mean "purge" and
> > >
> > > > It is
> > > > > not friendly to future maintenance. On the other hand, qemu_net_queue_purge()
> > > > > will pass 0, and this function is called by many places.
> > >
> > > That's exactly what we want actually: when we do a purge we don't need a flush.
> >
> > Yes, but I'm not sure whether stopping the flush will have any other effects.
>
> So we did:
>
> virtio_net_queue_reset():
>     nc = qemu_get_subqueue(n->nic, vq2q(queue_index));
>     flush_or_purge_queued_packets(nc);
>         qemu_flush_or_purge_queued_packets(nc->peer, true); // [1]
>             if (qemu_net_queue_flush(nc->incoming_queue)) {
>             ....
>             } else if (purge) {
>                 qemu_net_queue_purge(nc->incoming_queue, nc->peer);
>                         packet->send_cb()
>                             virtio_net_tx_complete()
>                                 virtio_net_flush_tx()
>                                     qemu_sendv_packet_async() // [2]
>             }
>
> We try to flush the tap's incoming queue and if we fail we will purge
> in [1]. But the sent_cb() tries to send more packets which could be
> queued to the tap incoming queue [2]. This breaks the semantic of
> qemu_flush_or_purge_queued_packets().

That analysis sounds right to me; I think so too.

>
> >
> > On the other hand, if we use "0" as a judgment condition, do you mean only the
> > implementation of the purge in the flush_or_purge_queued_packets()?
>
> It should be all the users of qemu_net_queue_purge(). The rest of the
> users seem fine:
>
> virtio_net_vhost_status(): if we did a flush, we might end up touching
> the vring while vhost is running.
> filters: all do a flush beforehand.
>
> >
> > >
> > > > >
> > > > > How about we add an API in queue.c to replace the sent_cb callback on the queue?
> > > > >
> > > > > Thanks.
> > > > > > OK I guess. Jason?
> > >
> > > Not sure, is it any different from adding a check in
> > > virtio_net_tx_complete()? (assuming the bh and timer are cancelled or
> > > deleted).
> >
> > We would replace the sent_cb with a function that does not flush.
>
> I meant it won't be different from adding a
>
> if (virtio_queue_is_reset())
>
> somewhere in virtio_net_tx_complete()?


It would only modify state on the stack, without using a variable like disabled_by_reset.

Thanks.


>
> Thanks
>
> >
> > Thanks.
> >
> >
> > >
> > > Thanks
> > >
> > > >
> > > > >
> > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > > >      }
> > > > > > > > > > > > >
> > > > > > > > > > > > > --
> > > > > > > > > > > > > 2.32.0.3.g01195cf9f
> > > > > > > > > > > >
> > > > > > > > > >
> > > > > > > >
> > > > > >
> > > >
> > >
> >
>


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v1 2/2] virtio-net: virtio_net_flush_tx() check for per-queue reset
  2023-01-30 10:24                           ` Xuan Zhuo
@ 2023-01-31  3:27                             ` Jason Wang
  2023-01-31  7:17                               ` Xuan Zhuo
  0 siblings, 1 reply; 32+ messages in thread
From: Jason Wang @ 2023-01-31  3:27 UTC (permalink / raw)
  To: Xuan Zhuo; +Cc: qemu-devel, Alexander Bulekov, Michael S. Tsirkin

On Mon, Jan 30, 2023 at 6:26 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
>
> On Mon, 30 Jan 2023 16:40:08 +0800, Jason Wang <jasowang@redhat.com> wrote:
> > On Mon, Jan 30, 2023 at 4:03 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > >
> > > On Mon, 30 Jan 2023 15:49:36 +0800, Jason Wang <jasowang@redhat.com> wrote:
> > > > On Mon, Jan 30, 2023 at 1:32 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > On Mon, Jan 30, 2023 at 10:15:12AM +0800, Xuan Zhuo wrote:
> > > > > > On Sun, 29 Jan 2023 07:15:47 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > On Sun, Jan 29, 2023 at 08:03:42PM +0800, Xuan Zhuo wrote:
> > > > > > > > On Sun, 29 Jan 2023 06:57:29 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > > > On Sun, Jan 29, 2023 at 04:23:08PM +0800, Xuan Zhuo wrote:
> > > > > > > > > > On Sun, 29 Jan 2023 03:12:12 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > > > > > On Sun, Jan 29, 2023 at 03:28:28PM +0800, Xuan Zhuo wrote:
> > > > > > > > > > > > On Sun, 29 Jan 2023 02:25:43 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > > > > > > > On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo wrote:
> > > > > > > > > > > > > > Check whether it is per-queue reset state in virtio_net_flush_tx().
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Before per-queue reset, we need to recover async tx resources. At this
> > > > > > > > > > > > > > time, virtio_net_flush_tx() is called, but we should not try to send
> > > > > > > > > > > > > > new packets, so virtio_net_flush_tx() should check the current
> > > > > > > > > > > > > > per-queue reset state.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > What does "at this time" mean here?
> > > > > > > > > > > > > Do you in fact mean it's called from flush_or_purge_queued_packets?
> > > > > > > > > > > >
> > > > > > > > > > > > Yes
> > > > > > > > > > > >
> > > > > > > > > > > > virtio_queue_reset
> > > > > > > > > > > >   k->queue_reset
> > > > > > > > > > > >           virtio_net_queue_reset
> > > > > > > > > > > >                   flush_or_purge_queued_packets
> > > > > > > > > > > >                           qemu_flush_or_purge_queued_packets
> > > > > > > > > > > >                                   .....
> > > > > > > > > > > >                                   (callback) virtio_net_tx_complete
> > > > > > > > > > > >                                           virtio_net_flush_tx <-- here send new packet. We need stop it.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Because it is inside the callback, I can't pass information through the stack. I
> > > > > > > > > > > > originally thought it was a general situation, so I wanted to put it in
> > > > > > > > > > > > struct VirtQueue.
> > > > > > > > > > > >
> > > > > > > > > > > > If it is not very suitable, it may be better to put it in VirtIONetQueue.
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks.
> > > > > > > > > > >
> > > > > > > > > > > Hmm maybe. Another idea: isn't virtio_net_tx_complete called
> > > > > > > > > > > with length 0 here? Are there other cases where length is 0?
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > > What does the call stack look like?
> > > > > > > > > > > > >
> > > > > > > > > > > > > If yes introducing a vq state just so virtio_net_flush_tx
> > > > > > > > > > > > > knows we are in the process of reset would be a bad idea.
> > > > > > > > > > > > > We want something much more local, ideally on stack even ...
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > > > > > > > > > > > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > > > > > > > > > > > > > Reported-by: Alexander Bulekov <alxndr@bu.edu>
> > > > > > > > > > > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > > > > > > > > > > > ---
> > > > > > > > > > > > > >  hw/net/virtio-net.c | 3 ++-
> > > > > > > > > > > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > > > > > > > > > > > > index 3ae909041a..fba6451a50 100644
> > > > > > > > > > > > > > --- a/hw/net/virtio-net.c
> > > > > > > > > > > > > > +++ b/hw/net/virtio-net.c
> > > > > > > > > > > > > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> > > > > > > > > > > > > >      VirtQueueElement *elem;
> > > > > > > > > > > > > >      int32_t num_packets = 0;
> > > > > > > > > > > > > >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > > > > > > > > > > > > > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > > > > > > > > > > > > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > > > > > > > > > > > > > +        virtio_queue_reset_state(q->tx_vq)) {
> > > > > > > > > > >
> > > > > > > > > > > btw this sounds like you are asking it to reset some state.
> > > > > > > > > > >
> > > > > > > > > > > > > >          return num_packets;
> > > > > > > > > > >
> > > > > > > > > > > and then
> > > > > > > > > > >
> > > > > > > > > > >     ret = virtio_net_flush_tx(q);
> > > > > > > > > > >     if (ret >= n->tx_burst)
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > will reschedule automatically won't it?
> > > > > > > > > > >
> > > > > > > > > > > also why check in virtio_net_flush_tx and not virtio_net_tx_complete?
> > > > > > > > > >
> > > > > > > > > > virtio_net_flush_tx may also be called by a timer.
> > > >
> > > > We stop timer/bh during device reset, do we need to do the same with vq reset?
> > > >
> > > > > > > > > >
> > > > > > > > > > Thanks.
> > > > > > > > >
> > > > > > > > > timer won't run while flush_or_purge_queued_packets is in progress.
> > > > > > > >
> > > > > > > > Are timers really not executed during the VMEXIT process? Otherwise, we
> > > > > > > > still have to consider the window after flush_or_purge_queued_packets and
> > > > > > > > before the structure is cleared.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index)
> > > > > > > {
> > > > > > >     VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
> > > > > > >
> > > > > > >     if (k->queue_reset) {
> > > > > > >         k->queue_reset(vdev, queue_index);
> > > > > > >     }
> > > > > > >
> > > > > > >     __virtio_queue_reset(vdev, queue_index);
> > > > > > > }
> > > > > > >
> > > > > > >
> > > > > > > No, timers do not run between k->queue_reset and __virtio_queue_reset.
> > > > > > >
> > > > > > >
> > > > > > > > Even if it can be handled in virtio_net_tx_complete, is there a good way
> > > > > > > > to do it? It is a callback, so it is not convenient to pass parameters.
> > > > > > > >
> > > > > > > > Thanks
> > > > > > >
> > > > > > >
> > > > > > > How about checking that length is 0?
> > > > > >
> > > > > >
> > > > > > I think checking the length is not a good way. It modifies the semantics of 0.
> > > >
> > > > 0 seems to mean "purge" and
> > > >
> > > > > It is
> > > > > > not friendly to future maintenance. On the other hand, qemu_net_queue_purge()
> > > > > > will pass 0, and this function is called by many places.
> > > >
> > > > That's exactly what we want actually: when we do a purge we don't need a flush.
> > >
> > > Yes, but I'm not sure whether stopping the flush will have any other effects.
> >
> > So we did:
> >
> > virtio_net_queue_reset():
> >     nc = qemu_get_subqueue(n->nic, vq2q(queue_index));
> >     flush_or_purge_queued_packets(nc);
> >         qemu_flush_or_purge_queued_packets(nc->peer, true); // [1]
> >             if (qemu_net_queue_flush(nc->incoming_queue)) {
> >             ....
> >             } else if (purge) {
> >                 qemu_net_queue_purge(nc->incoming_queue, nc->peer);
> >                         packet->send_cb()
> >                             virtio_net_tx_complete()
> >                                 virtio_net_flush_tx()
> >                                     qemu_sendv_packet_async() // [2]
> >             }
> >
> > We try to flush the tap's incoming queue and if we fail we will purge
> > in [1]. But the sent_cb() tries to send more packets which could be
> > queued to the tap incoming queue [2]. This breaks the semantic of
> > qemu_flush_or_purge_queued_packets().
>
> That analysis sounds right to me; I think so too.
>
> >
> > >
> > > On the other hand, if we use "0" as a judgment condition, do you mean only the
> > > implementation of the purge in the flush_or_purge_queued_packets()?
> >
> > It should be all the users of qemu_net_queue_purge(). The rest of the
> > users seem fine:
> >
> > virtio_net_vhost_status(): if we did a flush, we might end up touching
> > the vring while vhost is running.
> > filters: all do a flush beforehand.
> >
> > >
> > > >
> > > > > >
> > > > > > How about we add an API in queue.c to replace the sent_cb callback on the queue?
> > > > > >
> > > > > > Thanks.
> > > > > > > OK I guess. Jason?
> > > >
> > > > Not sure, is it any different from adding a check in
> > > > virtio_net_tx_complete()? (assuming the bh and timer are cancelled or
> > > > deleted).
> > >
> > > We would replace the sent_cb with a function that does not flush.
> >
> > I meant it won't be different from adding a
> >
> > if (virtio_queue_is_reset())
> >
> > somewhere in virtio_net_tx_complete()?
>
>
> It would only modify state on the stack, without using a variable like disabled_by_reset.

Ok, but per the discussion above, it looks to me we can simply check len
against zero, which seems simpler. (And it may fix other possible bugs.)

Thanks

>
> Thanks.
>
>
> >
> > Thanks
> >
> > >
> > > Thanks.
> > >
> > >
> > > >
> > > > Thanks
> > > >
> > > > >
> > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > > >      }
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > --
> > > > > > > > > > > > > > 2.32.0.3.g01195cf9f
> > > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > >
> > > > > > >
> > > > >
> > > >
> > >
> >
>




* Re: [PATCH v1 2/2] virtio-net: virtio_net_flush_tx() check for per-queue reset
  2023-01-31  3:27                             ` Jason Wang
@ 2023-01-31  7:17                               ` Xuan Zhuo
  0 siblings, 0 replies; 32+ messages in thread
From: Xuan Zhuo @ 2023-01-31  7:17 UTC (permalink / raw)
  To: Jason Wang; +Cc: qemu-devel, Alexander Bulekov, Michael S. Tsirkin

On Tue, 31 Jan 2023 11:27:42 +0800, Jason Wang <jasowang@redhat.com> wrote:
> On Mon, Jan 30, 2023 at 6:26 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> >
> > On Mon, 30 Jan 2023 16:40:08 +0800, Jason Wang <jasowang@redhat.com> wrote:
> > > On Mon, Jan 30, 2023 at 4:03 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > > >
> > > > On Mon, 30 Jan 2023 15:49:36 +0800, Jason Wang <jasowang@redhat.com> wrote:
> > > > > On Mon, Jan 30, 2023 at 1:32 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > >
> > > > > > On Mon, Jan 30, 2023 at 10:15:12AM +0800, Xuan Zhuo wrote:
> > > > > > > On Sun, 29 Jan 2023 07:15:47 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > > On Sun, Jan 29, 2023 at 08:03:42PM +0800, Xuan Zhuo wrote:
> > > > > > > > > On Sun, 29 Jan 2023 06:57:29 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > > > > On Sun, Jan 29, 2023 at 04:23:08PM +0800, Xuan Zhuo wrote:
> > > > > > > > > > > On Sun, 29 Jan 2023 03:12:12 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > > > > > > On Sun, Jan 29, 2023 at 03:28:28PM +0800, Xuan Zhuo wrote:
> > > > > > > > > > > > > On Sun, 29 Jan 2023 02:25:43 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > > > > > > > > On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo wrote:
> > > > > > > > > > > > > > > Check whether it is per-queue reset state in virtio_net_flush_tx().
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Before per-queue reset, we need to recover async tx resources. At this
> > > > > > > > > > > > > > > time, virtio_net_flush_tx() is called, but we should not try to send
> > > > > > > > > > > > > > > new packets, so virtio_net_flush_tx() should check the current
> > > > > > > > > > > > > > > per-queue reset state.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > What does "at this time" mean here?
> > > > > > > > > > > > > > Do you in fact mean it's called from flush_or_purge_queued_packets?
> > > > > > > > > > > > >
> > > > > > > > > > > > > Yes
> > > > > > > > > > > > >
> > > > > > > > > > > > > virtio_queue_reset
> > > > > > > > > > > > >   k->queue_reset
> > > > > > > > > > > > >           virtio_net_queue_reset
> > > > > > > > > > > > >                   flush_or_purge_queued_packets
> > > > > > > > > > > > >                           qemu_flush_or_purge_queued_packets
> > > > > > > > > > > > >                                   .....
> > > > > > > > > > > > >                                   (callback) virtio_net_tx_complete
> > > > > > > > > > > > >                                           virtio_net_flush_tx <-- here send new packet. We need stop it.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Because it is inside the callback, I can't pass information through the stack. I
> > > > > > > > > > > > > originally thought it was a general situation, so I wanted to put it in
> > > > > > > > > > > > > struct VirtQueue.
> > > > > > > > > > > > >
> > > > > > > > > > > > > If it is not very suitable, it may be better to put it in VirtIONetQueue.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks.
> > > > > > > > > > > >
> > > > > > > > > > > > Hmm maybe. Another idea: isn't virtio_net_tx_complete called
> > > > > > > > > > > > with length 0 here? Are there other cases where length is 0?
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > > > What does the call stack look like?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > If yes introducing a vq state just so virtio_net_flush_tx
> > > > > > > > > > > > > > knows we are in the process of reset would be a bad idea.
> > > > > > > > > > > > > > We want something much more local, ideally on stack even ...
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > > > > > > > > > > > > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > > > > > > > > > > > > > > Reported-by: Alexander Bulekov <alxndr@bu.edu>
> > > > > > > > > > > > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > > > > > > > > > > > > ---
> > > > > > > > > > > > > > >  hw/net/virtio-net.c | 3 ++-
> > > > > > > > > > > > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > > > > > > > > > > > > > index 3ae909041a..fba6451a50 100644
> > > > > > > > > > > > > > > --- a/hw/net/virtio-net.c
> > > > > > > > > > > > > > > +++ b/hw/net/virtio-net.c
> > > > > > > > > > > > > > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> > > > > > > > > > > > > > >      VirtQueueElement *elem;
> > > > > > > > > > > > > > >      int32_t num_packets = 0;
> > > > > > > > > > > > > > >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > > > > > > > > > > > > > > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > > > > > > > > > > > > > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > > > > > > > > > > > > > > +        virtio_queue_reset_state(q->tx_vq)) {
> > > > > > > > > > > >
> > > > > > > > > > > > btw this sounds like you are asking it to reset some state.
> > > > > > > > > > > >
> > > > > > > > > > > > > > >          return num_packets;
> > > > > > > > > > > >
> > > > > > > > > > > > and then
> > > > > > > > > > > >
> > > > > > > > > > > >     ret = virtio_net_flush_tx(q);
> > > > > > > > > > > >     if (ret >= n->tx_burst)
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > will reschedule automatically won't it?
> > > > > > > > > > > >
> > > > > > > > > > > > also why check in virtio_net_flush_tx and not virtio_net_tx_complete?
> > > > > > > > > > >
> > > > > > > > > > > virtio_net_flush_tx may also be called by a timer.
> > > > >
> > > > > We stop timer/bh during device reset, do we need to do the same with vq reset?
> > > > >
> > > > > > > > > > >
> > > > > > > > > > > Thanks.
> > > > > > > > > >
> > > > > > > > > > timer won't run while flush_or_purge_queued_packets is in progress.
> > > > > > > > >
> > > > > > > > > Are timers really not executed during the VMEXIT process? Otherwise, we
> > > > > > > > > still have to consider the window after flush_or_purge_queued_packets and
> > > > > > > > > before the structure is cleared.
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index)
> > > > > > > > {
> > > > > > > >     VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
> > > > > > > >
> > > > > > > >     if (k->queue_reset) {
> > > > > > > >         k->queue_reset(vdev, queue_index);
> > > > > > > >     }
> > > > > > > >
> > > > > > > >     __virtio_queue_reset(vdev, queue_index);
> > > > > > > > }
> > > > > > > >
> > > > > > > >
> > > > > > > > No, timers do not run between k->queue_reset and __virtio_queue_reset.
> > > > > > > >
> > > > > > > >
> > > > > > > > > Even if it can be handled in virtio_net_tx_complete, is there a good way
> > > > > > > > > to do it? It is a callback, so it is not convenient to pass parameters.
> > > > > > > > >
> > > > > > > > > Thanks
> > > > > > > >
> > > > > > > >
> > > > > > > > How about checking that length is 0?
> > > > > > >
> > > > > > >
> > > > > > > I think checking the length is not a good way. It modifies the semantics of 0.
> > > > >
> > > > > 0 seems to mean "purge" and
> > > > >
> > > > > > It is
> > > > > > > not friendly to future maintenance. On the other hand, qemu_net_queue_purge()
> > > > > > > will pass 0, and this function is called by many places.
> > > > >
> > > > > That's exactly what we want actually: when we do a purge we don't need a flush.
> > > >
> > > > Yes, but I'm not sure whether stopping the flush will have any other effects.
> > >
> > > So we did:
> > >
> > > virtio_net_queue_reset():
> > >     nc = qemu_get_subqueue(n->nic, vq2q(queue_index));
> > >     flush_or_purge_queued_packets(nc);
> > >         qemu_flush_or_purge_queued_packets(nc->peer, true); // [1]
> > >             if (qemu_net_queue_flush(nc->incoming_queue)) {
> > >             ....
> > >             } else if (purge) {
> > >                 qemu_net_queue_purge(nc->incoming_queue, nc->peer);
> > >                         packet->send_cb()
> > >                             virtio_net_tx_complete()
> > >                                 virtio_net_flush_tx()
> > >                                     qemu_sendv_packet_async() // [2]
> > >             }
> > >
> > > We try to flush the tap's incoming queue and if we fail we will purge
> > > in [1]. But the sent_cb() tries to send more packets which could be
> > > queued to the tap incoming queue [2]. This breaks the semantic of
> > > qemu_flush_or_purge_queued_packets().
> >
> > That analysis sounds right to me; I think so too.
> >
> > >
> > > >
> > > > On the other hand, if we use "0" as a judgment condition, do you mean only the
> > > > implementation of the purge in the flush_or_purge_queued_packets()?
> > >
> > > It should be all the users of qemu_net_queue_purge(). The rest of the
> > > users seem fine:
> > >
> > > virtio_net_vhost_status(): if we did a flush, we might end up touching
> > > the vring while vhost is running.
> > > filters: all do a flush beforehand.
> > >
> > > >
> > > > >
> > > > > > >
> > > > > > > How about we add an API in queue.c to replace the sent_cb callback on the queue?
> > > > > > >
> > > > > > > Thanks.
> > > > > > > > OK I guess. Jason?
> > > > >
> > > > > Not sure, is it any different from adding a check in
> > > > > virtio_net_tx_complete()? (assuming the bh and timer are cancelled or
> > > > > deleted).
> > > >
> > > > We would replace the sent_cb with a function that does not flush.
> > >
> > > I meant it won't be different from adding a
> > >
> > > if (virtio_queue_is_reset())
> > >
> > > somewhere in virtio_net_tx_complete()?
> >
> >
> > It would only modify state on the stack, without using a variable like disabled_by_reset.
>
> Ok, but per the discussion above, it looks to me we can simply check len
> against zero, which seems simpler. (And it may fix other possible bugs.)


Yes, if '0' means purge and we should not flush new packets during a purge,
that is a good way.

I will post a patch soon.

Thanks.


>
> Thanks
>
> >
> > Thanks.
> >
> >
> > >
> > > Thanks
> > >
> > > >
> > > > Thanks.
> > > >
> > > >
> > > > >
> > > > > Thanks
> > > > >
> > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > > > >      }
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > 2.32.0.3.g01195cf9f
> > > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > >
> > > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>



* Re: [PATCH v1 2/2] virtio-net: virtio_net_flush_tx() check for per-queue reset
  2023-01-30  7:49                     ` Jason Wang
  2023-01-30  7:53                       ` Xuan Zhuo
@ 2023-01-31  7:38                       ` Xuan Zhuo
  1 sibling, 0 replies; 32+ messages in thread
From: Xuan Zhuo @ 2023-01-31  7:38 UTC (permalink / raw)
  To: Jason Wang; +Cc: qemu-devel, Alexander Bulekov, Michael S. Tsirkin

On Mon, 30 Jan 2023 15:49:36 +0800, Jason Wang <jasowang@redhat.com> wrote:
> On Mon, Jan 30, 2023 at 1:32 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Mon, Jan 30, 2023 at 10:15:12AM +0800, Xuan Zhuo wrote:
> > > On Sun, 29 Jan 2023 07:15:47 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > On Sun, Jan 29, 2023 at 08:03:42PM +0800, Xuan Zhuo wrote:
> > > > > On Sun, 29 Jan 2023 06:57:29 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > On Sun, Jan 29, 2023 at 04:23:08PM +0800, Xuan Zhuo wrote:
> > > > > > > On Sun, 29 Jan 2023 03:12:12 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > > On Sun, Jan 29, 2023 at 03:28:28PM +0800, Xuan Zhuo wrote:
> > > > > > > > > On Sun, 29 Jan 2023 02:25:43 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > > > > On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo wrote:
> > > > > > > > > > > Check whether it is per-queue reset state in virtio_net_flush_tx().
> > > > > > > > > > >
> > > > > > > > > > > Before per-queue reset, we need to recover async tx resources. At this
> > > > > > > > > > > time, virtio_net_flush_tx() is called, but we should not try to send
> > > > > > > > > > > new packets, so virtio_net_flush_tx() should check the current
> > > > > > > > > > > per-queue reset state.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > What does "at this time" mean here?
> > > > > > > > > > Do you in fact mean it's called from flush_or_purge_queued_packets?
> > > > > > > > >
> > > > > > > > > Yes
> > > > > > > > >
> > > > > > > > > virtio_queue_reset
> > > > > > > > >   k->queue_reset
> > > > > > > > >           virtio_net_queue_reset
> > > > > > > > >                   flush_or_purge_queued_packets
> > > > > > > > >                           qemu_flush_or_purge_queued_packets
> > > > > > > > >                                   .....
> > > > > > > > >                                   (callback) virtio_net_tx_complete
> > > > > > > > >                                           virtio_net_flush_tx <-- here send new packet. We need stop it.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Because it is inside the callback, I can't pass information through the stack. I
> > > > > > > > > originally thought it was a general situation, so I wanted to put it in
> > > > > > > > > struct VirtQueue.
> > > > > > > > >
> > > > > > > > > If it is not very suitable, it may be better to put it in VirtIONetQueue.
> > > > > > > > >
> > > > > > > > > Thanks.
> > > > > > > >
> > > > > > > > Hmm maybe. Another idea: isn't virtio_net_tx_complete called
> > > > > > > > with length 0 here? Are there other cases where length is 0?
> > > > > > > >
> > > > > > > >
> > > > > > > > > > What does the call stack look like?
> > > > > > > > > >
> > > > > > > > > > If yes introducing a vq state just so virtio_net_flush_tx
> > > > > > > > > > knows we are in the process of reset would be a bad idea.
> > > > > > > > > > We want something much more local, ideally on stack even ...
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > > > > > > > > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > > > > > > > > > > Reported-by: Alexander Bulekov <alxndr@bu.edu>
> > > > > > > > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > > > > > > > > ---
> > > > > > > > > > >  hw/net/virtio-net.c | 3 ++-
> > > > > > > > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > > > > > > > >
> > > > > > > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > > > > > > > > > index 3ae909041a..fba6451a50 100644
> > > > > > > > > > > --- a/hw/net/virtio-net.c
> > > > > > > > > > > +++ b/hw/net/virtio-net.c
> > > > > > > > > > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> > > > > > > > > > >      VirtQueueElement *elem;
> > > > > > > > > > >      int32_t num_packets = 0;
> > > > > > > > > > >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > > > > > > > > > > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > > > > > > > > > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > > > > > > > > > > +        virtio_queue_reset_state(q->tx_vq)) {
> > > > > > > >
> > > > > > > > btw this sounds like you are asking it to reset some state.
> > > > > > > >
> > > > > > > > > > >          return num_packets;
> > > > > > > >
> > > > > > > > and then
> > > > > > > >
> > > > > > > >     ret = virtio_net_flush_tx(q);
> > > > > > > >     if (ret >= n->tx_burst)
> > > > > > > >
> > > > > > > >
> > > > > > > > will reschedule automatically won't it?
> > > > > > > >
> > > > > > > > also why check in virtio_net_flush_tx and not virtio_net_tx_complete?
> > > > > > >
> > > > > > > virtio_net_flush_tx may also be called by a timer.
>
> We stop timer/bh during device reset, do we need to do the same with vq reset?


Should I stop timer/bh?

Thanks.


>
> > > > > > >
> > > > > > > Thanks.
> > > > > >
> > > > > > timer won't run while flush_or_purge_queued_packets is in progress.
> > > > >
> > > > > Are timers really not executed during the VMEXIT process? Otherwise, we
> > > > > still have to consider the window after flush_or_purge_queued_packets and
> > > > > before the structure is cleared.
> > > >
> > > >
> > > >
> > > > void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index)
> > > > {
> > > >     VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
> > > >
> > > >     if (k->queue_reset) {
> > > >         k->queue_reset(vdev, queue_index);
> > > >     }
> > > >
> > > >     __virtio_queue_reset(vdev, queue_index);
> > > > }
> > > >
> > > >
> > > > No, timers do not run between k->queue_reset and __virtio_queue_reset.
> > > >
> > > >
> > > > > Even if it can be handled in virtio_net_tx_complete, is there a good way
> > > > > to do it? It is a callback, so it is not convenient to pass parameters.
> > > > >
> > > > > Thanks
> > > >
> > > >
> > > > How about checking that length is 0?
> > >
> > >
> > > I think checking the length is not a good way. It modifies the semantics of 0.
>
> 0 seems to mean "purge" and
>
> > It is
> > > not friendly to future maintenance. On the other hand, qemu_net_queue_purge()
> > > will pass 0, and this function is called by many places.
>
> That's exactly what we want actually: when we purge, we don't need a flush, do we?
>
> > >
> > > How about we add an API in queue.c to replace the sent_cb callback on the queue?
> > >
> > > Thanks.
> >
> > OK I guess. Jason?
>
> Not sure; is that any different from adding a check in
> virtio_net_tx_complete()? (Assuming the bh and timer are cancelled or
> deleted.)
>
> Thanks
>
> >
> > >
> > > >
> > > > >
> > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > > > >      }
> > > > > > > > > > >
> > > > > > > > > > > --
> > > > > > > > > > > 2.32.0.3.g01195cf9f
> > > > > > > > > >
> > > > > > > >
> > > > > >
> > > >
> >
>


^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2023-01-31  7:39 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-01-29  2:51 [PATCH v1 0/2] virtio: fix for assertion failure: virtio_net_get_subqueue(nc)->async_tx.elem failed Xuan Zhuo
2023-01-29  2:51 ` [PATCH v1 1/2] virtio: struct VirtQueue introduce reset Xuan Zhuo
2023-01-29  7:12   ` Michael S. Tsirkin
2023-01-29  7:15     ` Xuan Zhuo
2023-01-29  7:37       ` Michael S. Tsirkin
2023-01-29  7:39         ` Xuan Zhuo
2023-01-29  2:51 ` [PATCH v1 2/2] virtio-net: virtio_net_flush_tx() check for per-queue reset Xuan Zhuo
2023-01-29  6:23   ` Jason Wang
2023-01-29  7:43     ` Xuan Zhuo
2023-01-30  3:01       ` Jason Wang
2023-01-30  3:38         ` Xuan Zhuo
2023-01-30  3:53           ` Jason Wang
2023-01-30  5:50             ` Michael S. Tsirkin
2023-01-30  8:41               ` Jason Wang
2023-01-29  7:25   ` Michael S. Tsirkin
2023-01-29  7:28     ` Xuan Zhuo
2023-01-29  8:12       ` Michael S. Tsirkin
2023-01-29  8:23         ` Xuan Zhuo
2023-01-29 11:57           ` Michael S. Tsirkin
2023-01-29 12:03             ` Xuan Zhuo
2023-01-29 12:15               ` Michael S. Tsirkin
2023-01-29 12:28                 ` Xuan Zhuo
2023-01-30  2:15                 ` Xuan Zhuo
2023-01-30  5:32                   ` Michael S. Tsirkin
2023-01-30  7:49                     ` Jason Wang
2023-01-30  7:53                       ` Xuan Zhuo
2023-01-30  8:40                         ` Jason Wang
2023-01-30 10:24                           ` Xuan Zhuo
2023-01-31  3:27                             ` Jason Wang
2023-01-31  7:17                               ` Xuan Zhuo
2023-01-31  7:38                       ` Xuan Zhuo
2023-01-29  7:26 ` [PATCH v1 0/2] virtio: fix for assertion failure: virtio_net_get_subqueue(nc)->async_tx.elem failed Michael S. Tsirkin
