All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC 0/2] disable the configuration interrupt for the unsupported device
@ 2024-03-27  1:22 Cindy Lu
  2024-03-27  1:22 ` [RFC 1/2] virtio-net: disable the configure interrupt for not support device Cindy Lu
                   ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Cindy Lu @ 2024-03-27  1:22 UTC (permalink / raw)
  To: lulu, mst, jasowang, qemu-devel

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=y, Size: 2035 bytes --]

we need a crash in Non-standard image, here is the jira for this https://issues.redhat.com/browse/RHEL-28522
The root cause of the issue is that an IRQFD was used without initialization..

During the booting process of the Vyatta image, the behavior of the called function in qemu is as follows:

1. vhost_net_stop() was called, this will call the function
virtio_pci_set_guest_notifiers() with assgin= false, and
virtio_pci_set_guest_notifiers() will release the irqfd for vector 0

2. virtio_reset() was called -->set configure vector to VIRTIO_NO_VECTORt

3.vhost_net_start() was called (at this time the configure vector is
still VIRTIO_NO_VECTOR) and call virtio_pci_set_guest_notifiers() with
assgin= true, so the irqfd for vector 0 was not "init" during this process

4. The system continues to boot and msix_fire_vector_notifier() was
called unmask the vector 0 and then met the crash
[msix_fire_vector_notifier] 112 called vector 0 is_masked 1
[msix_fire_vector_notifier] 112 called vector 0 is_masked 0

The reason for not reproducing in RHEL/fedora guest image is because
REHL/Fedora doesn't have the behavior of calling vhost_net_stop and then virtio_reset, and also won't call msix_fire_vector_notifier for vector 0 during system boot.

The reason for not reproducing before configure interrupt support is because
vector 0 is for configure interrupt,  before the support for configure interrupts, the notifier process will not handle vector 0.

For the device Vyatta using, it doesn't support configure interrupts at all, So we plan to disable the configure interrupts in unsupported device

Signed-off-by: Cindy Lu <lulu@redhat.com>

Cindy Lu (2):
  virtio-net: disable the configure interrupt for not support device
  virtio-pci: check if the configure interrupt enable

 hw/net/virtio-net.c        |  5 ++++-
 hw/virtio/virtio-pci.c     | 41 +++++++++++++++++++++-----------------
 hw/virtio/virtio.c         |  1 +
 include/hw/virtio/virtio.h |  1 +
 4 files changed, 29 insertions(+), 19 deletions(-)

-- 
2.43.0



^ permalink raw reply	[flat|nested] 17+ messages in thread

* [RFC 1/2] virtio-net: disable the configure interrupt for not support device
  2024-03-27  1:22 [RFC 0/2] disable the configuration interrupt for the unsupported device Cindy Lu
@ 2024-03-27  1:22 ` Cindy Lu
  2024-03-27  2:54   ` Jason Wang
  2024-03-27  1:22 ` [RFC 2/2] virtio-pci: check if the configure interrupt enable Cindy Lu
  2024-03-27  3:05 ` [RFC 0/2] disable the configuration interrupt for the unsupported device Jason Wang
  2 siblings, 1 reply; 17+ messages in thread
From: Cindy Lu @ 2024-03-27  1:22 UTC (permalink / raw)
  To: lulu, mst, jasowang, qemu-devel

Only the vdpa device support configure interrupt, we need to disable the
configure interrupt process in all other device.
In order to achieve this, I added a check in the virtio_net_device_realize().
When the peer's type is vdpa, the value of config_irq_enabled in the structure
VirtIODevice will set to true.

Signed-off-by: Cindy Lu <lulu@redhat.com>
---
 hw/net/virtio-net.c        | 5 ++++-
 hw/virtio/virtio.c         | 1 +
 include/hw/virtio/virtio.h | 1 +
 3 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 80c56f0cfc..3b487864a8 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -3749,12 +3749,15 @@ static void virtio_net_device_realize(DeviceState *dev, Error **errp)
 
     nc = qemu_get_queue(n->nic);
     nc->rxfilter_notify_enabled = 1;
+    vdev->config_irq_enabled = false;
 
-   if (nc->peer && nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_VDPA) {
+    if (nc->peer && nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_VDPA) {
         struct virtio_net_config netcfg = {};
         memcpy(&netcfg.mac, &n->nic_conf.macaddr, ETH_ALEN);
         vhost_net_set_config(get_vhost_net(nc->peer),
             (uint8_t *)&netcfg, 0, ETH_ALEN, VHOST_SET_CONFIG_TYPE_FRONTEND);
+
+        vdev->config_irq_enabled = true;
     }
     QTAILQ_INIT(&n->rsc_chains);
     n->qdev = dev;
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 3a160f86ed..6b52a7190d 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -3255,6 +3255,7 @@ void virtio_init(VirtIODevice *vdev, uint16_t device_id, size_t config_size)
             virtio_vmstate_change, vdev);
     vdev->device_endian = virtio_default_endian();
     vdev->use_guest_notifier_mask = true;
+    vdev->config_irq_enabled = false;
 }
 
 /*
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index c8f72850bc..a7763b71e0 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -164,6 +164,7 @@ struct VirtIODevice
      */
     EventNotifier config_notifier;
     bool device_iotlb_enabled;
+    bool config_irq_enabled;
 };
 
 struct VirtioDeviceClass {
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [RFC 2/2] virtio-pci: check if the configure interrupt enable
  2024-03-27  1:22 [RFC 0/2] disable the configuration interrupt for the unsupported device Cindy Lu
  2024-03-27  1:22 ` [RFC 1/2] virtio-net: disable the configure interrupt for not support device Cindy Lu
@ 2024-03-27  1:22 ` Cindy Lu
  2024-03-27  3:05 ` [RFC 0/2] disable the configuration interrupt for the unsupported device Jason Wang
  2 siblings, 0 replies; 17+ messages in thread
From: Cindy Lu @ 2024-03-27  1:22 UTC (permalink / raw)
  To: lulu, mst, jasowang, qemu-devel

If config_irq_enabled is not true, it means that configure interrupt is
not supported. Therefore, the config vector will not be handled during
the interrupt process.

Signed-off-by: Cindy Lu <lulu@redhat.com>
---
 hw/virtio/virtio-pci.c | 41 +++++++++++++++++++++++------------------
 1 file changed, 23 insertions(+), 18 deletions(-)

diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index e433879542..36ad7da206 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -1065,7 +1065,7 @@ static int virtio_pci_vector_unmask(PCIDevice *dev, unsigned vector,
         vq = virtio_vector_next_queue(vq);
     }
     /* unmask config intr */
-    if (vector == vdev->config_vector) {
+    if ((vector == vdev->config_vector) && (true == vdev->config_irq_enabled)) {
         n = virtio_config_get_guest_notifier(vdev);
         ret = virtio_pci_one_vector_unmask(proxy, VIRTIO_CONFIG_IRQ_IDX, vector,
                                            msg, n);
@@ -1111,7 +1111,7 @@ static void virtio_pci_vector_mask(PCIDevice *dev, unsigned vector)
         vq = virtio_vector_next_queue(vq);
     }
 
-    if (vector == vdev->config_vector) {
+    if ((vector == vdev->config_vector) && (true == vdev->config_irq_enabled)) {
         n = virtio_config_get_guest_notifier(vdev);
         virtio_pci_one_vector_mask(proxy, VIRTIO_CONFIG_IRQ_IDX, vector, n);
     }
@@ -1147,21 +1147,24 @@ static void virtio_pci_vector_poll(PCIDevice *dev,
         }
     }
     /* poll the config intr */
-    ret = virtio_pci_get_notifier(proxy, VIRTIO_CONFIG_IRQ_IDX, &notifier,
-                                  &vector);
-    if (ret < 0) {
-        return;
-    }
-    if (vector < vector_start || vector >= vector_end ||
-        !msix_is_masked(dev, vector)) {
-        return;
-    }
-    if (k->guest_notifier_pending) {
-        if (k->guest_notifier_pending(vdev, VIRTIO_CONFIG_IRQ_IDX)) {
+    if (true == vdev->config_irq_enabled) {
+        ret = virtio_pci_get_notifier(proxy, VIRTIO_CONFIG_IRQ_IDX, &notifier,
+                                      &vector);
+        if (ret < 0) {
+            return;
+        }
+
+        if (vector < vector_start || vector >= vector_end ||
+            !msix_is_masked(dev, vector)) {
+            return;
+        }
+        if (k->guest_notifier_pending) {
+            if (k->guest_notifier_pending(vdev, VIRTIO_CONFIG_IRQ_IDX)) {
+                msix_set_pending(dev, vector);
+            }
+        } else if (event_notifier_test_and_clear(notifier)) {
             msix_set_pending(dev, vector);
         }
-    } else if (event_notifier_test_and_clear(notifier)) {
-        msix_set_pending(dev, vector);
     }
 }
 
@@ -1282,9 +1285,11 @@ static int virtio_pci_set_guest_notifiers(DeviceState *d, int nvqs, bool assign)
             if (r < 0) {
                 goto config_assign_error;
             }
-            r = kvm_virtio_pci_vector_config_use(proxy);
-            if (r < 0) {
-                goto config_error;
+            if (true == vdev->config_irq_enabled) {
+                r = kvm_virtio_pci_vector_config_use(proxy);
+                if (r < 0) {
+                    goto config_error;
+                }
             }
         }
 
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [RFC 1/2] virtio-net: disable the configure interrupt for not support device
  2024-03-27  1:22 ` [RFC 1/2] virtio-net: disable the configure interrupt for not support device Cindy Lu
@ 2024-03-27  2:54   ` Jason Wang
  0 siblings, 0 replies; 17+ messages in thread
From: Jason Wang @ 2024-03-27  2:54 UTC (permalink / raw)
  To: Cindy Lu; +Cc: mst, qemu-devel

On Wed, Mar 27, 2024 at 9:29 AM Cindy Lu <lulu@redhat.com> wrote:
>
> Only the vdpa device support configure interrupt, we need to disable the
> configure interrupt process in all other device.

I think we need to tweak the terminology here at least.

It's not about configure interrupt, it's about whether or not we can
try to use irqfd for configure interrupt.

Btw, have you tried this on the old kernel that doesn't support
configure interrupt for vDPA?

> In order to achieve this, I added a check in the virtio_net_device_realize().
> When the peer's type is vdpa, the value of config_irq_enabled in the structure
> VirtIODevice will set to true.
>
> Signed-off-by: Cindy Lu <lulu@redhat.com>
> ---
>  hw/net/virtio-net.c        | 5 ++++-
>  hw/virtio/virtio.c         | 1 +
>  include/hw/virtio/virtio.h | 1 +
>  3 files changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> index 80c56f0cfc..3b487864a8 100644
> --- a/hw/net/virtio-net.c
> +++ b/hw/net/virtio-net.c
> @@ -3749,12 +3749,15 @@ static void virtio_net_device_realize(DeviceState *dev, Error **errp)
>
>      nc = qemu_get_queue(n->nic);
>      nc->rxfilter_notify_enabled = 1;
> +    vdev->config_irq_enabled = false;

Let's tweak the name of the variable.

But in another thought, there's no easy way to know if vDPA support
configure interrupt at device realization.

We need a graceful fallback or just disable irqfd to configure irq.

>
> -   if (nc->peer && nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_VDPA) {
> +    if (nc->peer && nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_VDPA) {
>          struct virtio_net_config netcfg = {};
>          memcpy(&netcfg.mac, &n->nic_conf.macaddr, ETH_ALEN);
>          vhost_net_set_config(get_vhost_net(nc->peer),
>              (uint8_t *)&netcfg, 0, ETH_ALEN, VHOST_SET_CONFIG_TYPE_FRONTEND);
> +
> +        vdev->config_irq_enabled = true;
>      }
>      QTAILQ_INIT(&n->rsc_chains);
>      n->qdev = dev;
> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> index 3a160f86ed..6b52a7190d 100644
> --- a/hw/virtio/virtio.c
> +++ b/hw/virtio/virtio.c
> @@ -3255,6 +3255,7 @@ void virtio_init(VirtIODevice *vdev, uint16_t device_id, size_t config_size)
>              virtio_vmstate_change, vdev);
>      vdev->device_endian = virtio_default_endian();
>      vdev->use_guest_notifier_mask = true;
> +    vdev->config_irq_enabled = false;
>  }
>
>  /*
> diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> index c8f72850bc..a7763b71e0 100644
> --- a/include/hw/virtio/virtio.h
> +++ b/include/hw/virtio/virtio.h
> @@ -164,6 +164,7 @@ struct VirtIODevice
>       */
>      EventNotifier config_notifier;
>      bool device_iotlb_enabled;
> +    bool config_irq_enabled;
>  };
>
>  struct VirtioDeviceClass {

Thanks

> --
> 2.43.0
>



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC 0/2] disable the configuration interrupt for the unsupported device
  2024-03-27  1:22 [RFC 0/2] disable the configuration interrupt for the unsupported device Cindy Lu
  2024-03-27  1:22 ` [RFC 1/2] virtio-net: disable the configure interrupt for not support device Cindy Lu
  2024-03-27  1:22 ` [RFC 2/2] virtio-pci: check if the configure interrupt enable Cindy Lu
@ 2024-03-27  3:05 ` Jason Wang
  2024-03-27  6:02   ` Cindy Lu
  2 siblings, 1 reply; 17+ messages in thread
From: Jason Wang @ 2024-03-27  3:05 UTC (permalink / raw)
  To: Cindy Lu; +Cc: mst, qemu-devel

Hi Cindy:

On Wed, Mar 27, 2024 at 9:29 AM Cindy Lu <lulu@redhat.com> wrote:
>
> we need a crash in Non-standard image, here is the jira for this https://issues.redhat.com/browse/RHEL-28522
> The root cause of the issue is that an IRQFD was used without initialization..
>
> During the booting process of the Vyatta image, the behavior of the called function in qemu is as follows:
>
> 1. vhost_net_stop() was called, this will call the function
> virtio_pci_set_guest_notifiers() with assgin= false, and
> virtio_pci_set_guest_notifiers() will release the irqfd for vector 0

Before vhost_net_stop(), do we know which vector is used by which queue?

>
> 2. virtio_reset() was called -->set configure vector to VIRTIO_NO_VECTORt
>
> 3.vhost_net_start() was called (at this time the configure vector is
> still VIRTIO_NO_VECTOR) and call virtio_pci_set_guest_notifiers() with
> assgin= true, so the irqfd for vector 0 was not "init" during this process

How does the configure vector differ from the virtqueue vector here?

>
> 4. The system continues to boot and msix_fire_vector_notifier() was
> called unmask the vector 0 and then met the crash
> [msix_fire_vector_notifier] 112 called vector 0 is_masked 1
> [msix_fire_vector_notifier] 112 called vector 0 is_masked 0
>
> The reason for not reproducing in RHEL/fedora guest image is because
> REHL/Fedora doesn't have the behavior of calling vhost_net_stop and then virtio_reset, and also won't call msix_fire_vector_notifier for vector 0 during system boot.
>
> The reason for not reproducing before configure interrupt support is because
> vector 0 is for configure interrupt,  before the support for configure interrupts, the notifier process will not handle vector 0.
>
> For the device Vyatta using, it doesn't support configure interrupts at all, So we plan to disable the configure interrupts in unsupported device

Btw, let's tweak the changelog, it's a little bit hard to understand.

Thanks

>
> Signed-off-by: Cindy Lu <lulu@redhat.com>
>
> Cindy Lu (2):
>   virtio-net: disable the configure interrupt for not support device
>   virtio-pci: check if the configure interrupt enable
>
>  hw/net/virtio-net.c        |  5 ++++-
>  hw/virtio/virtio-pci.c     | 41 +++++++++++++++++++++-----------------
>  hw/virtio/virtio.c         |  1 +
>  include/hw/virtio/virtio.h |  1 +
>  4 files changed, 29 insertions(+), 19 deletions(-)
>
> --
> 2.43.0
>



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC 0/2] disable the configuration interrupt for the unsupported device
  2024-03-27  3:05 ` [RFC 0/2] disable the configuration interrupt for the unsupported device Jason Wang
@ 2024-03-27  6:02   ` Cindy Lu
  2024-03-27  7:54     ` Jason Wang
  0 siblings, 1 reply; 17+ messages in thread
From: Cindy Lu @ 2024-03-27  6:02 UTC (permalink / raw)
  To: Jason Wang; +Cc: mst, qemu-devel

On Wed, Mar 27, 2024 at 11:05 AM Jason Wang <jasowang@redhat.com> wrote:
>
> Hi Cindy:
>
> On Wed, Mar 27, 2024 at 9:29 AM Cindy Lu <lulu@redhat.com> wrote:
> >
> > we need a crash in Non-standard image, here is the jira for this https://issues.redhat.com/browse/RHEL-28522
> > The root cause of the issue is that an IRQFD was used without initialization..
> >
> > During the booting process of the Vyatta image, the behavior of the called function in qemu is as follows:
> >
> > 1. vhost_net_stop() was called, this will call the function
> > virtio_pci_set_guest_notifiers() with assgin= false, and
> > virtio_pci_set_guest_notifiers() will release the irqfd for vector 0
>
> Before vhost_net_stop(), do we know which vector is used by which queue?
>
before this stop, vdev->config_verctor is get from
virtio_pci_common_read/virtio_pci_common_write
it was set to vector 0
> >
> > 2. virtio_reset() was called -->set configure vector to VIRTIO_NO_VECTORt
> >
> > 3.vhost_net_start() was called (at this time the configure vector is
> > still VIRTIO_NO_VECTOR) and call virtio_pci_set_guest_notifiers() with
> > assgin= true, so the irqfd for vector 0 was not "init" during this process
>
> How does the configure vector differ from the virtqueue vector here?
>
All the vectors are VIRTIO_NO_VECTOR (including vq). any
msix_fire_vector_notifier()
been called will cause the crash at this time.  So I think this should
be a bug in this guest image
> >
> > 4. The system continues to boot and msix_fire_vector_notifier() was
> > called unmask the vector 0 and then met the crash
> > [msix_fire_vector_notifier] 112 called vector 0 is_masked 1
> > [msix_fire_vector_notifier] 112 called vector 0 is_masked 0
> >
> > The reason for not reproducing in RHEL/fedora guest image is because
> > REHL/Fedora doesn't have the behavior of calling vhost_net_stop and then virtio_reset, and also won't call msix_fire_vector_notifier for vector 0 during system boot.
> >
> > The reason for not reproducing before configure interrupt support is because
> > vector 0 is for configure interrupt,  before the support for configure interrupts, the notifier process will not handle vector 0.
> >
> > For the device Vyatta using, it doesn't support configure interrupts at all, So we plan to disable the configure interrupts in unsupported device
>
> Btw, let's tweak the changelog, it's a little bit hard to understand.
>
sure will do
thanks
Cindy
> Thanks
>
> >
> > Signed-off-by: Cindy Lu <lulu@redhat.com>
> >
> > Cindy Lu (2):
> >   virtio-net: disable the configure interrupt for not support device
> >   virtio-pci: check if the configure interrupt enable
> >
> >  hw/net/virtio-net.c        |  5 ++++-
> >  hw/virtio/virtio-pci.c     | 41 +++++++++++++++++++++-----------------
> >  hw/virtio/virtio.c         |  1 +
> >  include/hw/virtio/virtio.h |  1 +
> >  4 files changed, 29 insertions(+), 19 deletions(-)
> >
> > --
> > 2.43.0
> >
>



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC 0/2] disable the configuration interrupt for the unsupported device
  2024-03-27  6:02   ` Cindy Lu
@ 2024-03-27  7:54     ` Jason Wang
  2024-03-27  8:28       ` Cindy Lu
  0 siblings, 1 reply; 17+ messages in thread
From: Jason Wang @ 2024-03-27  7:54 UTC (permalink / raw)
  To: Cindy Lu; +Cc: mst, qemu-devel

On Wed, Mar 27, 2024 at 2:03 PM Cindy Lu <lulu@redhat.com> wrote:
>
> On Wed, Mar 27, 2024 at 11:05 AM Jason Wang <jasowang@redhat.com> wrote:
> >
> > Hi Cindy:
> >
> > On Wed, Mar 27, 2024 at 9:29 AM Cindy Lu <lulu@redhat.com> wrote:
> > >
> > > we need a crash in Non-standard image, here is the jira for this https://issues.redhat.com/browse/RHEL-28522
> > > The root cause of the issue is that an IRQFD was used without initialization..
> > >
> > > During the booting process of the Vyatta image, the behavior of the called function in qemu is as follows:
> > >
> > > 1. vhost_net_stop() was called, this will call the function
> > > virtio_pci_set_guest_notifiers() with assgin= false, and
> > > virtio_pci_set_guest_notifiers() will release the irqfd for vector 0
> >
> > Before vhost_net_stop(), do we know which vector is used by which queue?
> >
> before this stop, vdev->config_verctor is get from
> virtio_pci_common_read/virtio_pci_common_write
> it was set to vector 0

I basically meant if vector 0 is shared with some virtqueues here.

> > >
> > > 2. virtio_reset() was called -->set configure vector to VIRTIO_NO_VECTORt
> > >
> > > 3.vhost_net_start() was called (at this time the configure vector is
> > > still VIRTIO_NO_VECTOR) and call virtio_pci_set_guest_notifiers() with
> > > assgin= true, so the irqfd for vector 0 was not "init" during this process
> >
> > How does the configure vector differ from the virtqueue vector here?
> >
> All the vectors are VIRTIO_NO_VECTOR (including vq). any
> msix_fire_vector_notifier()
> been called will cause the crash at this time.

Won't virtio_pci_set_guest_notifiers() will try to allocate irqfd when
the assignment is true?

> So I think this should
> be a bug in this guest image

The point is Qemu should not crash even if the guest driver is buggy.

It would be nice if we can have a qtest for this on top.

Thanks

> > >
> > > 4. The system continues to boot and msix_fire_vector_notifier() was
> > > called unmask the vector 0 and then met the crash
> > > [msix_fire_vector_notifier] 112 called vector 0 is_masked 1
> > > [msix_fire_vector_notifier] 112 called vector 0 is_masked 0
> > >
> > > The reason for not reproducing in RHEL/fedora guest image is because
> > > REHL/Fedora doesn't have the behavior of calling vhost_net_stop and then virtio_reset, and also won't call msix_fire_vector_notifier for vector 0 during system boot.
> > >
> > > The reason for not reproducing before configure interrupt support is because
> > > vector 0 is for configure interrupt,  before the support for configure interrupts, the notifier process will not handle vector 0.
> > >
> > > For the device Vyatta using, it doesn't support configure interrupts at all, So we plan to disable the configure interrupts in unsupported device
> >
> > Btw, let's tweak the changelog, it's a little bit hard to understand.
> >
> sure will do
> thanks
> Cindy
> > Thanks
> >
> > >
> > > Signed-off-by: Cindy Lu <lulu@redhat.com>
> > >
> > > Cindy Lu (2):
> > >   virtio-net: disable the configure interrupt for not support device
> > >   virtio-pci: check if the configure interrupt enable
> > >
> > >  hw/net/virtio-net.c        |  5 ++++-
> > >  hw/virtio/virtio-pci.c     | 41 +++++++++++++++++++++-----------------
> > >  hw/virtio/virtio.c         |  1 +
> > >  include/hw/virtio/virtio.h |  1 +
> > >  4 files changed, 29 insertions(+), 19 deletions(-)
> > >
> > > --
> > > 2.43.0
> > >
> >
>



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC 0/2] disable the configuration interrupt for the unsupported device
  2024-03-27  7:54     ` Jason Wang
@ 2024-03-27  8:28       ` Cindy Lu
  2024-03-27  9:12         ` Jason Wang
  0 siblings, 1 reply; 17+ messages in thread
From: Cindy Lu @ 2024-03-27  8:28 UTC (permalink / raw)
  To: Jason Wang; +Cc: mst, qemu-devel

On Wed, Mar 27, 2024 at 3:54 PM Jason Wang <jasowang@redhat.com> wrote:
>
> On Wed, Mar 27, 2024 at 2:03 PM Cindy Lu <lulu@redhat.com> wrote:
> >
> > On Wed, Mar 27, 2024 at 11:05 AM Jason Wang <jasowang@redhat.com> wrote:
> > >
> > > Hi Cindy:
> > >
> > > On Wed, Mar 27, 2024 at 9:29 AM Cindy Lu <lulu@redhat.com> wrote:
> > > >
> > > > we need a crash in Non-standard image, here is the jira for this https://issues.redhat.com/browse/RHEL-28522
> > > > The root cause of the issue is that an IRQFD was used without initialization..
> > > >
> > > > During the booting process of the Vyatta image, the behavior of the called function in qemu is as follows:
> > > >
> > > > 1. vhost_net_stop() was called, this will call the function
> > > > virtio_pci_set_guest_notifiers() with assgin= false, and
> > > > virtio_pci_set_guest_notifiers() will release the irqfd for vector 0
> > >
> > > Before vhost_net_stop(), do we know which vector is used by which queue?
> > >
> > before this stop, vdev->config_verctor is get from
> > virtio_pci_common_read/virtio_pci_common_write
> > it was set to vector 0
>
> I basically meant if vector 0 is shared with some virtqueues here.
>
Really sorry for this, vq's vector is 1,2, and will not share with the
configure vector
> > > >
> > > > 2. virtio_reset() was called -->set configure vector to VIRTIO_NO_VECTORt
> > > >
> > > > 3.vhost_net_start() was called (at this time the configure vector is
> > > > still VIRTIO_NO_VECTOR) and call virtio_pci_set_guest_notifiers() with
> > > > assgin= true, so the irqfd for vector 0 was not "init" during this process
> > >
> > > How does the configure vector differ from the virtqueue vector here?
> > >
> > All the vectors are VIRTIO_NO_VECTOR (including vq). any
> > msix_fire_vector_notifier()
> > been called will cause the crash at this time.
>
> Won't virtio_pci_set_guest_notifiers() will try to allocate irqfd when
> the assignment is true?
>
It will allocate, but  the vector is VIRTIO_NO_VECTOR (0xffff)

then it will called kvm_virtio_pci_vector_use_one()

in this function, there is a check for

    if (vector >= msix_nr_vectors_allocated(dev))

{         return 0;     }

So it will return.

> > So I think this should
> > be a bug in this guest image
>
> The point is Qemu should not crash even if the guest driver is buggy.
>
> It would be nice if we can have a qtest for this on top.
>
> Thanks
>
sure, got it, I have done the Qtest, and it passed
here is the result

Ok:                 794
Expected Fail:      0
Fail:               0
Unexpected Pass:    0
Skipped:            32
Timeout:            0

> > > >
> > > > 4. The system continues to boot and msix_fire_vector_notifier() was
> > > > called unmask the vector 0 and then met the crash
> > > > [msix_fire_vector_notifier] 112 called vector 0 is_masked 1
> > > > [msix_fire_vector_notifier] 112 called vector 0 is_masked 0
> > > >
> > > > The reason for not reproducing in RHEL/fedora guest image is because
> > > > REHL/Fedora doesn't have the behavior of calling vhost_net_stop and then virtio_reset, and also won't call msix_fire_vector_notifier for vector 0 during system boot.
> > > >
> > > > The reason for not reproducing before configure interrupt support is because
> > > > vector 0 is for configure interrupt,  before the support for configure interrupts, the notifier process will not handle vector 0.
> > > >
> > > > For the device Vyatta using, it doesn't support configure interrupts at all, So we plan to disable the configure interrupts in unsupported device
> > >
> > > Btw, let's tweak the changelog, it's a little bit hard to understand.
> > >
> > sure will do
> > thanks
> > Cindy
> > > Thanks
> > >
> > > >
> > > > Signed-off-by: Cindy Lu <lulu@redhat.com>
> > > >
> > > > Cindy Lu (2):
> > > >   virtio-net: disable the configure interrupt for not support device
> > > >   virtio-pci: check if the configure interrupt enable
> > > >
> > > >  hw/net/virtio-net.c        |  5 ++++-
> > > >  hw/virtio/virtio-pci.c     | 41 +++++++++++++++++++++-----------------
> > > >  hw/virtio/virtio.c         |  1 +
> > > >  include/hw/virtio/virtio.h |  1 +
> > > >  4 files changed, 29 insertions(+), 19 deletions(-)
> > > >
> > > > --
> > > > 2.43.0
> > > >
> > >
> >
>



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC 0/2] disable the configuration interrupt for the unsupported device
  2024-03-27  8:28       ` Cindy Lu
@ 2024-03-27  9:12         ` Jason Wang
  2024-03-27  9:12           ` Jason Wang
  2024-03-27  9:32           ` Cindy Lu
  0 siblings, 2 replies; 17+ messages in thread
From: Jason Wang @ 2024-03-27  9:12 UTC (permalink / raw)
  To: Cindy Lu; +Cc: mst, qemu-devel

On Wed, Mar 27, 2024 at 4:28 PM Cindy Lu <lulu@redhat.com> wrote:
>
> On Wed, Mar 27, 2024 at 3:54 PM Jason Wang <jasowang@redhat.com> wrote:
> >
> > On Wed, Mar 27, 2024 at 2:03 PM Cindy Lu <lulu@redhat.com> wrote:
> > >
> > > On Wed, Mar 27, 2024 at 11:05 AM Jason Wang <jasowang@redhat.com> wrote:
> > > >
> > > > Hi Cindy:
> > > >
> > > > On Wed, Mar 27, 2024 at 9:29 AM Cindy Lu <lulu@redhat.com> wrote:
> > > > >
> > > > > we need a crash in Non-standard image, here is the jira for this https://issues.redhat.com/browse/RHEL-28522
> > > > > The root cause of the issue is that an IRQFD was used without initialization..
> > > > >
> > > > > During the booting process of the Vyatta image, the behavior of the called function in qemu is as follows:
> > > > >
> > > > > 1. vhost_net_stop() was called, this will call the function
> > > > > virtio_pci_set_guest_notifiers() with assgin= false, and
> > > > > virtio_pci_set_guest_notifiers() will release the irqfd for vector 0
> > > >
> > > > Before vhost_net_stop(), do we know which vector is used by which queue?
> > > >
> > > before this stop, vdev->config_verctor is get from
> > > virtio_pci_common_read/virtio_pci_common_write
> > > it was set to vector 0
> >
> > I basically meant if vector 0 is shared with some virtqueues here.
> >
> Really sorry for this, vq's vector is 1,2, and will not share with the
> configure vector
> > > > >
> > > > > 2. virtio_reset() was called -->set configure vector to VIRTIO_NO_VECTORt
> > > > >
> > > > > 3.vhost_net_start() was called (at this time the configure vector is
> > > > > still VIRTIO_NO_VECTOR) and call virtio_pci_set_guest_notifiers() with
> > > > > assgin= true, so the irqfd for vector 0 was not "init" during this process
> > > >
> > > > How does the configure vector differ from the virtqueue vector here?
> > > >
> > > All the vectors are VIRTIO_NO_VECTOR (including vq). any
> > > msix_fire_vector_notifier()
> > > been called will cause the crash at this time.
> >
> > Won't virtio_pci_set_guest_notifiers() will try to allocate irqfd when
> > the assignment is true?
> >
> It will allocate, but  the vector is VIRTIO_NO_VECTOR (0xffff)
>
> then it will called kvm_virtio_pci_vector_use_one()
>
> in this function, there is a check for
>
>     if (vector >= msix_nr_vectors_allocated(dev))
>
> {         return 0;     }
>
> So it will return.

How about let's just fix this?

Btw, it's better to explain in detail like the above in the next version.

Thanks

>
> > > So I think this should
> > > be a bug in this guest image
> >
> > The point is Qemu should not crash even if the guest driver is buggy.
> >
> > It would be nice if we can have a qtest for this on top.
> >
> > Thanks
> >
> sure, got it, I have done the Qtest, and it passed
> here is the result
>
> Ok:                 794
> Expected Fail:      0
> Fail:               0
> Unexpected Pass:    0
> Skipped:            32
> Timeout:            0
>
> > > > >
> > > > > 4. The system continues to boot and msix_fire_vector_notifier() was
> > > > > called unmask the vector 0 and then met the crash
> > > > > [msix_fire_vector_notifier] 112 called vector 0 is_masked 1
> > > > > [msix_fire_vector_notifier] 112 called vector 0 is_masked 0
> > > > >
> > > > > The reason for not reproducing in RHEL/fedora guest image is because
> > > > > REHL/Fedora doesn't have the behavior of calling vhost_net_stop and then virtio_reset, and also won't call msix_fire_vector_notifier for vector 0 during system boot.
> > > > >
> > > > > The reason for not reproducing before configure interrupt support is because
> > > > > vector 0 is for configure interrupt,  before the support for configure interrupts, the notifier process will not handle vector 0.
> > > > >
> > > > > For the device Vyatta using, it doesn't support configure interrupts at all, So we plan to disable the configure interrupts in unsupported device
> > > >
> > > > Btw, let's tweak the changelog, it's a little bit hard to understand.
> > > >
> > > sure will do
> > > thanks
> > > Cindy
> > > > Thanks
> > > >
> > > > >
> > > > > Signed-off-by: Cindy Lu <lulu@redhat.com>
> > > > >
> > > > > Cindy Lu (2):
> > > > >   virtio-net: disable the configure interrupt for not support device
> > > > >   virtio-pci: check if the configure interrupt enable
> > > > >
> > > > >  hw/net/virtio-net.c        |  5 ++++-
> > > > >  hw/virtio/virtio-pci.c     | 41 +++++++++++++++++++++-----------------
> > > > >  hw/virtio/virtio.c         |  1 +
> > > > >  include/hw/virtio/virtio.h |  1 +
> > > > >  4 files changed, 29 insertions(+), 19 deletions(-)
> > > > >
> > > > > --
> > > > > 2.43.0
> > > > >
> > > >
> > >
> >
>



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC 0/2] disable the configuration interrupt for the unsupported device
  2024-03-27  9:12         ` Jason Wang
@ 2024-03-27  9:12           ` Jason Wang
  2024-03-27  9:43             ` Cindy Lu
  2024-03-27  9:32           ` Cindy Lu
  1 sibling, 1 reply; 17+ messages in thread
From: Jason Wang @ 2024-03-27  9:12 UTC (permalink / raw)
  To: Cindy Lu; +Cc: mst, qemu-devel

On Wed, Mar 27, 2024 at 5:12 PM Jason Wang <jasowang@redhat.com> wrote:
>
> On Wed, Mar 27, 2024 at 4:28 PM Cindy Lu <lulu@redhat.com> wrote:
> >
> > On Wed, Mar 27, 2024 at 3:54 PM Jason Wang <jasowang@redhat.com> wrote:
> > >
> > > On Wed, Mar 27, 2024 at 2:03 PM Cindy Lu <lulu@redhat.com> wrote:
> > > >
> > > > On Wed, Mar 27, 2024 at 11:05 AM Jason Wang <jasowang@redhat.com> wrote:
> > > > >
> > > > > Hi Cindy:
> > > > >
> > > > > On Wed, Mar 27, 2024 at 9:29 AM Cindy Lu <lulu@redhat.com> wrote:
> > > > > >
> > > > > > we need a crash in Non-standard image, here is the jira for this https://issues.redhat.com/browse/RHEL-28522
> > > > > > The root cause of the issue is that an IRQFD was used without initialization..
> > > > > >
> > > > > > During the booting process of the Vyatta image, the behavior of the called function in qemu is as follows:
> > > > > >
> > > > > > 1. vhost_net_stop() was called, this will call the function
> > > > > > virtio_pci_set_guest_notifiers() with assgin= false, and
> > > > > > virtio_pci_set_guest_notifiers() will release the irqfd for vector 0
> > > > >
> > > > > Before vhost_net_stop(), do we know which vector is used by which queue?
> > > > >
> > > > before this stop, vdev->config_verctor is get from
> > > > virtio_pci_common_read/virtio_pci_common_write
> > > > it was set to vector 0
> > >
> > > I basically meant if vector 0 is shared with some virtqueues here.
> > >
> > Really sorry for this, vq's vector is 1,2, and will not share with the
> > configure vector
> > > > > >
> > > > > > 2. virtio_reset() was called -->set configure vector to VIRTIO_NO_VECTORt
> > > > > >
> > > > > > 3.vhost_net_start() was called (at this time the configure vector is
> > > > > > still VIRTIO_NO_VECTOR) and call virtio_pci_set_guest_notifiers() with
> > > > > > assgin= true, so the irqfd for vector 0 was not "init" during this process
> > > > >
> > > > > How does the configure vector differ from the virtqueue vector here?
> > > > >
> > > > All the vectors are VIRTIO_NO_VECTOR (including vq). any
> > > > msix_fire_vector_notifier()
> > > > been called will cause the crash at this time.
> > >
> > > Won't virtio_pci_set_guest_notifiers() will try to allocate irqfd when
> > > the assignment is true?
> > >
> > It will allocate, but  the vector is VIRTIO_NO_VECTOR (0xffff)
> >
> > then it will called kvm_virtio_pci_vector_use_one()
> >
> > in this function, there is a check for
> >
> >     if (vector >= msix_nr_vectors_allocated(dev))
> >
> > {         return 0;     }
> >
> > So it will return.
>
> How about let's just fix this?

Btw, another question, how does vDPA work here?

Thanks

>
> Btw, it's better to explain in detail like the above in the next version.
>
> Thanks
>
> >
> > > > So I think this should
> > > > be a bug in this guest image
> > >
> > > The point is Qemu should not crash even if the guest driver is buggy.
> > >
> > > It would be nice if we can have a qtest for this on top.
> > >
> > > Thanks
> > >
> > sure, got it, I have done the Qtest, and it passed
> > here is the result
> >
> > Ok:                 794
> > Expected Fail:      0
> > Fail:               0
> > Unexpected Pass:    0
> > Skipped:            32
> > Timeout:            0
> >
> > > > > >
> > > > > > 4. The system continues to boot and msix_fire_vector_notifier() was
> > > > > > called unmask the vector 0 and then met the crash
> > > > > > [msix_fire_vector_notifier] 112 called vector 0 is_masked 1
> > > > > > [msix_fire_vector_notifier] 112 called vector 0 is_masked 0
> > > > > >
> > > > > > The reason for not reproducing in RHEL/fedora guest image is because
> > > > > > REHL/Fedora doesn't have the behavior of calling vhost_net_stop and then virtio_reset, and also won't call msix_fire_vector_notifier for vector 0 during system boot.
> > > > > >
> > > > > > The reason for not reproducing before configure interrupt support is because
> > > > > > vector 0 is for configure interrupt,  before the support for configure interrupts, the notifier process will not handle vector 0.
> > > > > >
> > > > > > For the device Vyatta using, it doesn't support configure interrupts at all, So we plan to disable the configure interrupts in unsupported device
> > > > >
> > > > > Btw, let's tweak the changelog, it's a little bit hard to understand.
> > > > >
> > > > sure will do
> > > > thanks
> > > > Cindy
> > > > > Thanks
> > > > >
> > > > > >
> > > > > > Signed-off-by: Cindy Lu <lulu@redhat.com>
> > > > > >
> > > > > > Cindy Lu (2):
> > > > > >   virtio-net: disable the configure interrupt for not support device
> > > > > >   virtio-pci: check if the configure interrupt enable
> > > > > >
> > > > > >  hw/net/virtio-net.c        |  5 ++++-
> > > > > >  hw/virtio/virtio-pci.c     | 41 +++++++++++++++++++++-----------------
> > > > > >  hw/virtio/virtio.c         |  1 +
> > > > > >  include/hw/virtio/virtio.h |  1 +
> > > > > >  4 files changed, 29 insertions(+), 19 deletions(-)
> > > > > >
> > > > > > --
> > > > > > 2.43.0
> > > > > >
> > > > >
> > > >
> > >
> >



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC 0/2] disable the configuration interrupt for the unsupported device
  2024-03-27  9:12         ` Jason Wang
  2024-03-27  9:12           ` Jason Wang
@ 2024-03-27  9:32           ` Cindy Lu
  2024-03-28  4:12             ` Jason Wang
  1 sibling, 1 reply; 17+ messages in thread
From: Cindy Lu @ 2024-03-27  9:32 UTC (permalink / raw)
  To: Jason Wang; +Cc: mst, qemu-devel

On Wed, Mar 27, 2024 at 5:12 PM Jason Wang <jasowang@redhat.com> wrote:
>
> On Wed, Mar 27, 2024 at 4:28 PM Cindy Lu <lulu@redhat.com> wrote:
> >
> > On Wed, Mar 27, 2024 at 3:54 PM Jason Wang <jasowang@redhat.com> wrote:
> > >
> > > On Wed, Mar 27, 2024 at 2:03 PM Cindy Lu <lulu@redhat.com> wrote:
> > > >
> > > > On Wed, Mar 27, 2024 at 11:05 AM Jason Wang <jasowang@redhat.com> wrote:
> > > > >
> > > > > Hi Cindy:
> > > > >
> > > > > On Wed, Mar 27, 2024 at 9:29 AM Cindy Lu <lulu@redhat.com> wrote:
> > > > > >
> > > > > > we need a crash in Non-standard image, here is the jira for this https://issues.redhat.com/browse/RHEL-28522
> > > > > > The root cause of the issue is that an IRQFD was used without initialization..
> > > > > >
> > > > > > During the booting process of the Vyatta image, the behavior of the called function in qemu is as follows:
> > > > > >
> > > > > > 1. vhost_net_stop() was called, this will call the function
> > > > > > virtio_pci_set_guest_notifiers() with assgin= false, and
> > > > > > virtio_pci_set_guest_notifiers() will release the irqfd for vector 0
> > > > >
> > > > > Before vhost_net_stop(), do we know which vector is used by which queue?
> > > > >
> > > > before this stop, vdev->config_verctor is get from
> > > > virtio_pci_common_read/virtio_pci_common_write
> > > > it was set to vector 0
> > >
> > > I basically meant if vector 0 is shared with some virtqueues here.
> > >
> > Really sorry for this, vq's vector is 1,2, and will not share with the
> > configure vector
> > > > > >
> > > > > > 2. virtio_reset() was called -->set configure vector to VIRTIO_NO_VECTORt
> > > > > >
> > > > > > 3.vhost_net_start() was called (at this time the configure vector is
> > > > > > still VIRTIO_NO_VECTOR) and call virtio_pci_set_guest_notifiers() with
> > > > > > assgin= true, so the irqfd for vector 0 was not "init" during this process
> > > > >
> > > > > How does the configure vector differ from the virtqueue vector here?
> > > > >
> > > > All the vectors are VIRTIO_NO_VECTOR (including vq). any
> > > > msix_fire_vector_notifier()
> > > > been called will cause the crash at this time.
> > >
> > > Won't virtio_pci_set_guest_notifiers() will try to allocate irqfd when
> > > the assignment is true?
> > >
> > It will allocate, but  the vector is VIRTIO_NO_VECTOR (0xffff)
> >
> > then it will called kvm_virtio_pci_vector_use_one()
> >
> > in this function, there is a check for
> >
> >     if (vector >= msix_nr_vectors_allocated(dev))
> >
> > {         return 0;     }
> >
> > So it will return.
>
> How about let's just fix this?
>
> Btw, it's better to explain in detail like the above in the next version.
>
> Thanks
>
The problem is I think the behavior here is correct, The vector here is
 VIRTIO_NO_VECTOR and we should return,
the fix could work maybe is we try get to know if this was changed
from another value
and use that one? this seems strange.
Thanks
cindy
> >
> > > > So I think this should
> > > > be a bug in this guest image
> > >
> > > The point is Qemu should not crash even if the guest driver is buggy.
> > >
> > > It would be nice if we can have a qtest for this on top.
> > >
> > > Thanks
> > >
> > sure, got it, I have done the Qtest, and it passed
> > here is the result
> >
> > Ok:                 794
> > Expected Fail:      0
> > Fail:               0
> > Unexpected Pass:    0
> > Skipped:            32
> > Timeout:            0
> >
> > > > > >
> > > > > > 4. The system continues to boot and msix_fire_vector_notifier() was
> > > > > > called unmask the vector 0 and then met the crash
> > > > > > [msix_fire_vector_notifier] 112 called vector 0 is_masked 1
> > > > > > [msix_fire_vector_notifier] 112 called vector 0 is_masked 0
> > > > > >
> > > > > > The reason for not reproducing in RHEL/fedora guest image is because
> > > > > > REHL/Fedora doesn't have the behavior of calling vhost_net_stop and then virtio_reset, and also won't call msix_fire_vector_notifier for vector 0 during system boot.
> > > > > >
> > > > > > The reason for not reproducing before configure interrupt support is because
> > > > > > vector 0 is for configure interrupt,  before the support for configure interrupts, the notifier process will not handle vector 0.
> > > > > >
> > > > > > For the device Vyatta using, it doesn't support configure interrupts at all, So we plan to disable the configure interrupts in unsupported device
> > > > >
> > > > > Btw, let's tweak the changelog, it's a little bit hard to understand.
> > > > >
> > > > sure will do
> > > > thanks
> > > > Cindy
> > > > > Thanks
> > > > >
> > > > > >
> > > > > > Signed-off-by: Cindy Lu <lulu@redhat.com>
> > > > > >
> > > > > > Cindy Lu (2):
> > > > > >   virtio-net: disable the configure interrupt for not support device
> > > > > >   virtio-pci: check if the configure interrupt enable
> > > > > >
> > > > > >  hw/net/virtio-net.c        |  5 ++++-
> > > > > >  hw/virtio/virtio-pci.c     | 41 +++++++++++++++++++++-----------------
> > > > > >  hw/virtio/virtio.c         |  1 +
> > > > > >  include/hw/virtio/virtio.h |  1 +
> > > > > >  4 files changed, 29 insertions(+), 19 deletions(-)
> > > > > >
> > > > > > --
> > > > > > 2.43.0
> > > > > >
> > > > >
> > > >
> > >
> >
>



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC 0/2] disable the configuration interrupt for the unsupported device
  2024-03-27  9:12           ` Jason Wang
@ 2024-03-27  9:43             ` Cindy Lu
  2024-03-28  4:14               ` Jason Wang
  0 siblings, 1 reply; 17+ messages in thread
From: Cindy Lu @ 2024-03-27  9:43 UTC (permalink / raw)
  To: Jason Wang; +Cc: mst, qemu-devel

On Wed, Mar 27, 2024 at 5:13 PM Jason Wang <jasowang@redhat.com> wrote:
>
> On Wed, Mar 27, 2024 at 5:12 PM Jason Wang <jasowang@redhat.com> wrote:
> >
> > On Wed, Mar 27, 2024 at 4:28 PM Cindy Lu <lulu@redhat.com> wrote:
> > >
> > > On Wed, Mar 27, 2024 at 3:54 PM Jason Wang <jasowang@redhat.com> wrote:
> > > >
> > > > On Wed, Mar 27, 2024 at 2:03 PM Cindy Lu <lulu@redhat.com> wrote:
> > > > >
> > > > > On Wed, Mar 27, 2024 at 11:05 AM Jason Wang <jasowang@redhat.com> wrote:
> > > > > >
> > > > > > Hi Cindy:
> > > > > >
> > > > > > On Wed, Mar 27, 2024 at 9:29 AM Cindy Lu <lulu@redhat.com> wrote:
> > > > > > >
> > > > > > > we need a crash in Non-standard image, here is the jira for this https://issues.redhat.com/browse/RHEL-28522
> > > > > > > The root cause of the issue is that an IRQFD was used without initialization..
> > > > > > >
> > > > > > > During the booting process of the Vyatta image, the behavior of the called function in qemu is as follows:
> > > > > > >
> > > > > > > 1. vhost_net_stop() was called, this will call the function
> > > > > > > virtio_pci_set_guest_notifiers() with assgin= false, and
> > > > > > > virtio_pci_set_guest_notifiers() will release the irqfd for vector 0
> > > > > >
> > > > > > Before vhost_net_stop(), do we know which vector is used by which queue?
> > > > > >
> > > > > before this stop, vdev->config_verctor is get from
> > > > > virtio_pci_common_read/virtio_pci_common_write
> > > > > it was set to vector 0
> > > >
> > > > I basically meant if vector 0 is shared with some virtqueues here.
> > > >
> > > Really sorry for this, vq's vector is 1,2, and will not share with the
> > > configure vector
> > > > > > >
> > > > > > > 2. virtio_reset() was called -->set configure vector to VIRTIO_NO_VECTORt
> > > > > > >
> > > > > > > 3.vhost_net_start() was called (at this time the configure vector is
> > > > > > > still VIRTIO_NO_VECTOR) and call virtio_pci_set_guest_notifiers() with
> > > > > > > assgin= true, so the irqfd for vector 0 was not "init" during this process
> > > > > >
> > > > > > How does the configure vector differ from the virtqueue vector here?
> > > > > >
> > > > > All the vectors are VIRTIO_NO_VECTOR (including vq). any
> > > > > msix_fire_vector_notifier()
> > > > > been called will cause the crash at this time.
> > > >
> > > > Won't virtio_pci_set_guest_notifiers() will try to allocate irqfd when
> > > > the assignment is true?
> > > >
> > > It will allocate, but  the vector is VIRTIO_NO_VECTOR (0xffff)
> > >
> > > then it will called kvm_virtio_pci_vector_use_one()
> > >
> > > in this function, there is a check for
> > >
> > >     if (vector >= msix_nr_vectors_allocated(dev))
> > >
> > > {         return 0;     }
> > >
> > > So it will return.
> >
> > How about let's just fix this?
>
> Btw, another question, how does vDPA work here?
>
> Thanks
>
the rhel/fedroa guest image will not call  vrtio_stop and virtio_reset
during the boot
So vector will not change to  VIRTIO_NO_VECTOR. So the vdpa's
configure interrupt
Should work ok and there is no crash
Thanks
cindy

> >
> > Btw, it's better to explain in detail like the above in the next version.
> >
> > Thanks
> >
> > >
> > > > > So I think this should
> > > > > be a bug in this guest image
> > > >
> > > > The point is Qemu should not crash even if the guest driver is buggy.
> > > >
> > > > It would be nice if we can have a qtest for this on top.
> > > >
> > > > Thanks
> > > >
> > > sure, got it, I have done the Qtest, and it passed
> > > here is the result
> > >
> > > Ok:                 794
> > > Expected Fail:      0
> > > Fail:               0
> > > Unexpected Pass:    0
> > > Skipped:            32
> > > Timeout:            0
> > >
> > > > > > >
> > > > > > > 4. The system continues to boot and msix_fire_vector_notifier() was
> > > > > > > called unmask the vector 0 and then met the crash
> > > > > > > [msix_fire_vector_notifier] 112 called vector 0 is_masked 1
> > > > > > > [msix_fire_vector_notifier] 112 called vector 0 is_masked 0
> > > > > > >
> > > > > > > The reason for not reproducing in RHEL/fedora guest image is because
> > > > > > > REHL/Fedora doesn't have the behavior of calling vhost_net_stop and then virtio_reset, and also won't call msix_fire_vector_notifier for vector 0 during system boot.
> > > > > > >
> > > > > > > The reason for not reproducing before configure interrupt support is because
> > > > > > > vector 0 is for configure interrupt,  before the support for configure interrupts, the notifier process will not handle vector 0.
> > > > > > >
> > > > > > > For the device Vyatta using, it doesn't support configure interrupts at all, So we plan to disable the configure interrupts in unsupported device
> > > > > >
> > > > > > Btw, let's tweak the changelog, it's a little bit hard to understand.
> > > > > >
> > > > > sure will do
> > > > > thanks
> > > > > Cindy
> > > > > > Thanks
> > > > > >
> > > > > > >
> > > > > > > Signed-off-by: Cindy Lu <lulu@redhat.com>
> > > > > > >
> > > > > > > Cindy Lu (2):
> > > > > > >   virtio-net: disable the configure interrupt for not support device
> > > > > > >   virtio-pci: check if the configure interrupt enable
> > > > > > >
> > > > > > >  hw/net/virtio-net.c        |  5 ++++-
> > > > > > >  hw/virtio/virtio-pci.c     | 41 +++++++++++++++++++++-----------------
> > > > > > >  hw/virtio/virtio.c         |  1 +
> > > > > > >  include/hw/virtio/virtio.h |  1 +
> > > > > > >  4 files changed, 29 insertions(+), 19 deletions(-)
> > > > > > >
> > > > > > > --
> > > > > > > 2.43.0
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
>



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC 0/2] disable the configuration interrupt for the unsupported device
  2024-03-27  9:32           ` Cindy Lu
@ 2024-03-28  4:12             ` Jason Wang
  2024-03-29  3:02               ` Cindy Lu
  0 siblings, 1 reply; 17+ messages in thread
From: Jason Wang @ 2024-03-28  4:12 UTC (permalink / raw)
  To: Cindy Lu; +Cc: mst, qemu-devel

On Wed, Mar 27, 2024 at 5:33 PM Cindy Lu <lulu@redhat.com> wrote:
>
> On Wed, Mar 27, 2024 at 5:12 PM Jason Wang <jasowang@redhat.com> wrote:
> >
> > On Wed, Mar 27, 2024 at 4:28 PM Cindy Lu <lulu@redhat.com> wrote:
> > >
> > > On Wed, Mar 27, 2024 at 3:54 PM Jason Wang <jasowang@redhat.com> wrote:
> > > >
> > > > On Wed, Mar 27, 2024 at 2:03 PM Cindy Lu <lulu@redhat.com> wrote:
> > > > >
> > > > > On Wed, Mar 27, 2024 at 11:05 AM Jason Wang <jasowang@redhat.com> wrote:
> > > > > >
> > > > > > Hi Cindy:
> > > > > >
> > > > > > On Wed, Mar 27, 2024 at 9:29 AM Cindy Lu <lulu@redhat.com> wrote:
> > > > > > >
> > > > > > > we need a crash in Non-standard image, here is the jira for this https://issues.redhat.com/browse/RHEL-28522
> > > > > > > The root cause of the issue is that an IRQFD was used without initialization..
> > > > > > >
> > > > > > > During the booting process of the Vyatta image, the behavior of the called function in qemu is as follows:
> > > > > > >
> > > > > > > 1. vhost_net_stop() was called, this will call the function
> > > > > > > virtio_pci_set_guest_notifiers() with assgin= false, and
> > > > > > > virtio_pci_set_guest_notifiers() will release the irqfd for vector 0
> > > > > >
> > > > > > Before vhost_net_stop(), do we know which vector is used by which queue?
> > > > > >
> > > > > before this stop, vdev->config_verctor is get from
> > > > > virtio_pci_common_read/virtio_pci_common_write
> > > > > it was set to vector 0
> > > >
> > > > I basically meant if vector 0 is shared with some virtqueues here.
> > > >
> > > Really sorry for this, vq's vector is 1,2, and will not share with the
> > > configure vector
> > > > > > >
> > > > > > > 2. virtio_reset() was called -->set configure vector to VIRTIO_NO_VECTORt
> > > > > > >
> > > > > > > 3.vhost_net_start() was called (at this time the configure vector is
> > > > > > > still VIRTIO_NO_VECTOR) and call virtio_pci_set_guest_notifiers() with
> > > > > > > assgin= true, so the irqfd for vector 0 was not "init" during this process
> > > > > >
> > > > > > How does the configure vector differ from the virtqueue vector here?
> > > > > >
> > > > > All the vectors are VIRTIO_NO_VECTOR (including vq). any
> > > > > msix_fire_vector_notifier()
> > > > > been called will cause the crash at this time.
> > > >
> > > > Won't virtio_pci_set_guest_notifiers() will try to allocate irqfd when
> > > > the assignment is true?
> > > >
> > > It will allocate, but  the vector is VIRTIO_NO_VECTOR (0xffff)
> > >
> > > then it will called kvm_virtio_pci_vector_use_one()
> > >
> > > in this function, there is a check for
> > >
> > >     if (vector >= msix_nr_vectors_allocated(dev))
> > >
> > > {         return 0;     }
> > >
> > > So it will return.
> >
> > How about let's just fix this?
> >
> > Btw, it's better to explain in detail like the above in the next version.
> >
> > Thanks
> >
> The problem is I think the behavior here is correct, The vector here is
>  VIRTIO_NO_VECTOR and we should return,

So if I understand correctly, the configure vector is configured after
DRIVER_OK?

Spec doesn't forbid this, this is something we need to support.

It looks to me the correct fix is to kvm_virtio_pci_vector_use_one()
when guest is writing to msix_vector after DRIVER_OK?

Thanks

> the fix could work maybe is we try get to know if this was changed
> from another value
> and use that one? this seems strange.
> Thanks
> cindy
> > >
> > > > > So I think this should
> > > > > be a bug in this guest image
> > > >
> > > > The point is Qemu should not crash even if the guest driver is buggy.
> > > >
> > > > It would be nice if we can have a qtest for this on top.
> > > >
> > > > Thanks
> > > >
> > > sure, got it, I have done the Qtest, and it passed
> > > here is the result
> > >
> > > Ok:                 794
> > > Expected Fail:      0
> > > Fail:               0
> > > Unexpected Pass:    0
> > > Skipped:            32
> > > Timeout:            0
> > >
> > > > > > >
> > > > > > > 4. The system continues to boot and msix_fire_vector_notifier() was
> > > > > > > called unmask the vector 0 and then met the crash
> > > > > > > [msix_fire_vector_notifier] 112 called vector 0 is_masked 1
> > > > > > > [msix_fire_vector_notifier] 112 called vector 0 is_masked 0
> > > > > > >
> > > > > > > The reason for not reproducing in RHEL/fedora guest image is because
> > > > > > > REHL/Fedora doesn't have the behavior of calling vhost_net_stop and then virtio_reset, and also won't call msix_fire_vector_notifier for vector 0 during system boot.
> > > > > > >
> > > > > > > The reason for not reproducing before configure interrupt support is because
> > > > > > > vector 0 is for configure interrupt,  before the support for configure interrupts, the notifier process will not handle vector 0.
> > > > > > >
> > > > > > > For the device Vyatta using, it doesn't support configure interrupts at all, So we plan to disable the configure interrupts in unsupported device
> > > > > >
> > > > > > Btw, let's tweak the changelog, it's a little bit hard to understand.
> > > > > >
> > > > > sure will do
> > > > > thanks
> > > > > Cindy
> > > > > > Thanks
> > > > > >
> > > > > > >
> > > > > > > Signed-off-by: Cindy Lu <lulu@redhat.com>
> > > > > > >
> > > > > > > Cindy Lu (2):
> > > > > > >   virtio-net: disable the configure interrupt for not support device
> > > > > > >   virtio-pci: check if the configure interrupt enable
> > > > > > >
> > > > > > >  hw/net/virtio-net.c        |  5 ++++-
> > > > > > >  hw/virtio/virtio-pci.c     | 41 +++++++++++++++++++++-----------------
> > > > > > >  hw/virtio/virtio.c         |  1 +
> > > > > > >  include/hw/virtio/virtio.h |  1 +
> > > > > > >  4 files changed, 29 insertions(+), 19 deletions(-)
> > > > > > >
> > > > > > > --
> > > > > > > 2.43.0
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC 0/2] disable the configuration interrupt for the unsupported device
  2024-03-27  9:43             ` Cindy Lu
@ 2024-03-28  4:14               ` Jason Wang
  2024-03-28  7:07                 ` Cindy Lu
  0 siblings, 1 reply; 17+ messages in thread
From: Jason Wang @ 2024-03-28  4:14 UTC (permalink / raw)
  To: Cindy Lu; +Cc: mst, qemu-devel

On Wed, Mar 27, 2024 at 5:44 PM Cindy Lu <lulu@redhat.com> wrote:
>
> On Wed, Mar 27, 2024 at 5:13 PM Jason Wang <jasowang@redhat.com> wrote:
> >
> > On Wed, Mar 27, 2024 at 5:12 PM Jason Wang <jasowang@redhat.com> wrote:
> > >
> > > On Wed, Mar 27, 2024 at 4:28 PM Cindy Lu <lulu@redhat.com> wrote:
> > > >
> > > > On Wed, Mar 27, 2024 at 3:54 PM Jason Wang <jasowang@redhat.com> wrote:
> > > > >
> > > > > On Wed, Mar 27, 2024 at 2:03 PM Cindy Lu <lulu@redhat.com> wrote:
> > > > > >
> > > > > > On Wed, Mar 27, 2024 at 11:05 AM Jason Wang <jasowang@redhat.com> wrote:
> > > > > > >
> > > > > > > Hi Cindy:
> > > > > > >
> > > > > > > On Wed, Mar 27, 2024 at 9:29 AM Cindy Lu <lulu@redhat.com> wrote:
> > > > > > > >
> > > > > > > > we need a crash in Non-standard image, here is the jira for this https://issues.redhat.com/browse/RHEL-28522
> > > > > > > > The root cause of the issue is that an IRQFD was used without initialization..
> > > > > > > >
> > > > > > > > During the booting process of the Vyatta image, the behavior of the called function in qemu is as follows:
> > > > > > > >
> > > > > > > > 1. vhost_net_stop() was called, this will call the function
> > > > > > > > virtio_pci_set_guest_notifiers() with assgin= false, and
> > > > > > > > virtio_pci_set_guest_notifiers() will release the irqfd for vector 0
> > > > > > >
> > > > > > > Before vhost_net_stop(), do we know which vector is used by which queue?
> > > > > > >
> > > > > > before this stop, vdev->config_verctor is get from
> > > > > > virtio_pci_common_read/virtio_pci_common_write
> > > > > > it was set to vector 0
> > > > >
> > > > > I basically meant if vector 0 is shared with some virtqueues here.
> > > > >
> > > > Really sorry for this, vq's vector is 1,2, and will not share with the
> > > > configure vector
> > > > > > > >
> > > > > > > > 2. virtio_reset() was called -->set configure vector to VIRTIO_NO_VECTORt
> > > > > > > >
> > > > > > > > 3.vhost_net_start() was called (at this time the configure vector is
> > > > > > > > still VIRTIO_NO_VECTOR) and call virtio_pci_set_guest_notifiers() with
> > > > > > > > assgin= true, so the irqfd for vector 0 was not "init" during this process
> > > > > > >
> > > > > > > How does the configure vector differ from the virtqueue vector here?
> > > > > > >
> > > > > > All the vectors are VIRTIO_NO_VECTOR (including vq). any
> > > > > > msix_fire_vector_notifier()
> > > > > > been called will cause the crash at this time.
> > > > >
> > > > > Won't virtio_pci_set_guest_notifiers() will try to allocate irqfd when
> > > > > the assignment is true?
> > > > >
> > > > It will allocate, but  the vector is VIRTIO_NO_VECTOR (0xffff)
> > > >
> > > > then it will called kvm_virtio_pci_vector_use_one()
> > > >
> > > > in this function, there is a check for
> > > >
> > > >     if (vector >= msix_nr_vectors_allocated(dev))
> > > >
> > > > {         return 0;     }
> > > >
> > > > So it will return.
> > >
> > > How about let's just fix this?
> >
> > Btw, another question, how does vDPA work here?
> >
> > Thanks
> >
> the rhel/fedroa guest image will not call  vrtio_stop and virtio_reset
> during the boot
> So vector will not change to  VIRTIO_NO_VECTOR. So the vdpa's
> configure interrupt
> Should work ok and there is no crash

I mean:

1) if vDPA can work with the image you used to reproduce the issue
2) if current Qemu can work on old kernel without configure interrupt
support for vDPA

Thanks

> Thanks
> cindy
>
> > >
> > > Btw, it's better to explain in detail like the above in the next version.
> > >
> > > Thanks
> > >
> > > >
> > > > > > So I think this should
> > > > > > be a bug in this guest image
> > > > >
> > > > > The point is Qemu should not crash even if the guest driver is buggy.
> > > > >
> > > > > It would be nice if we can have a qtest for this on top.
> > > > >
> > > > > Thanks
> > > > >
> > > > sure, got it, I have done the Qtest, and it passed
> > > > here is the result
> > > >
> > > > Ok:                 794
> > > > Expected Fail:      0
> > > > Fail:               0
> > > > Unexpected Pass:    0
> > > > Skipped:            32
> > > > Timeout:            0
> > > >
> > > > > > > >
> > > > > > > > 4. The system continues to boot and msix_fire_vector_notifier() was
> > > > > > > > called unmask the vector 0 and then met the crash
> > > > > > > > [msix_fire_vector_notifier] 112 called vector 0 is_masked 1
> > > > > > > > [msix_fire_vector_notifier] 112 called vector 0 is_masked 0
> > > > > > > >
> > > > > > > > The reason for not reproducing in RHEL/fedora guest image is because
> > > > > > > > REHL/Fedora doesn't have the behavior of calling vhost_net_stop and then virtio_reset, and also won't call msix_fire_vector_notifier for vector 0 during system boot.
> > > > > > > >
> > > > > > > > The reason for not reproducing before configure interrupt support is because
> > > > > > > > vector 0 is for configure interrupt,  before the support for configure interrupts, the notifier process will not handle vector 0.
> > > > > > > >
> > > > > > > > For the device Vyatta using, it doesn't support configure interrupts at all, So we plan to disable the configure interrupts in unsupported device
> > > > > > >
> > > > > > > Btw, let's tweak the changelog, it's a little bit hard to understand.
> > > > > > >
> > > > > > sure will do
> > > > > > thanks
> > > > > > Cindy
> > > > > > > Thanks
> > > > > > >
> > > > > > > >
> > > > > > > > Signed-off-by: Cindy Lu <lulu@redhat.com>
> > > > > > > >
> > > > > > > > Cindy Lu (2):
> > > > > > > >   virtio-net: disable the configure interrupt for not support device
> > > > > > > >   virtio-pci: check if the configure interrupt enable
> > > > > > > >
> > > > > > > >  hw/net/virtio-net.c        |  5 ++++-
> > > > > > > >  hw/virtio/virtio-pci.c     | 41 +++++++++++++++++++++-----------------
> > > > > > > >  hw/virtio/virtio.c         |  1 +
> > > > > > > >  include/hw/virtio/virtio.h |  1 +
> > > > > > > >  4 files changed, 29 insertions(+), 19 deletions(-)
> > > > > > > >
> > > > > > > > --
> > > > > > > > 2.43.0
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> >
>



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC 0/2] disable the configuration interrupt for the unsupported device
  2024-03-28  4:14               ` Jason Wang
@ 2024-03-28  7:07                 ` Cindy Lu
  0 siblings, 0 replies; 17+ messages in thread
From: Cindy Lu @ 2024-03-28  7:07 UTC (permalink / raw)
  To: Jason Wang; +Cc: mst, qemu-devel

On Thu, Mar 28, 2024 at 12:14 PM Jason Wang <jasowang@redhat.com> wrote:
>
> On Wed, Mar 27, 2024 at 5:44 PM Cindy Lu <lulu@redhat.com> wrote:
> >
> > On Wed, Mar 27, 2024 at 5:13 PM Jason Wang <jasowang@redhat.com> wrote:
> > >
> > > On Wed, Mar 27, 2024 at 5:12 PM Jason Wang <jasowang@redhat.com> wrote:
> > > >
> > > > On Wed, Mar 27, 2024 at 4:28 PM Cindy Lu <lulu@redhat.com> wrote:
> > > > >
> > > > > On Wed, Mar 27, 2024 at 3:54 PM Jason Wang <jasowang@redhat.com> wrote:
> > > > > >
> > > > > > On Wed, Mar 27, 2024 at 2:03 PM Cindy Lu <lulu@redhat.com> wrote:
> > > > > > >
> > > > > > > On Wed, Mar 27, 2024 at 11:05 AM Jason Wang <jasowang@redhat.com> wrote:
> > > > > > > >
> > > > > > > > Hi Cindy:
> > > > > > > >
> > > > > > > > On Wed, Mar 27, 2024 at 9:29 AM Cindy Lu <lulu@redhat.com> wrote:
> > > > > > > > >
> > > > > > > > > we need a crash in Non-standard image, here is the jira for this https://issues.redhat.com/browse/RHEL-28522
> > > > > > > > > The root cause of the issue is that an IRQFD was used without initialization..
> > > > > > > > >
> > > > > > > > > During the booting process of the Vyatta image, the behavior of the called function in qemu is as follows:
> > > > > > > > >
> > > > > > > > > 1. vhost_net_stop() was called, this will call the function
> > > > > > > > > virtio_pci_set_guest_notifiers() with assgin= false, and
> > > > > > > > > virtio_pci_set_guest_notifiers() will release the irqfd for vector 0
> > > > > > > >
> > > > > > > > Before vhost_net_stop(), do we know which vector is used by which queue?
> > > > > > > >
> > > > > > > before this stop, vdev->config_verctor is get from
> > > > > > > virtio_pci_common_read/virtio_pci_common_write
> > > > > > > it was set to vector 0
> > > > > >
> > > > > > I basically meant if vector 0 is shared with some virtqueues here.
> > > > > >
> > > > > Really sorry for this, vq's vector is 1,2, and will not share with the
> > > > > configure vector
> > > > > > > > >
> > > > > > > > > 2. virtio_reset() was called -->set configure vector to VIRTIO_NO_VECTORt
> > > > > > > > >
> > > > > > > > > 3.vhost_net_start() was called (at this time the configure vector is
> > > > > > > > > still VIRTIO_NO_VECTOR) and call virtio_pci_set_guest_notifiers() with
> > > > > > > > > assgin= true, so the irqfd for vector 0 was not "init" during this process
> > > > > > > >
> > > > > > > > How does the configure vector differ from the virtqueue vector here?
> > > > > > > >
> > > > > > > All the vectors are VIRTIO_NO_VECTOR (including vq). any
> > > > > > > msix_fire_vector_notifier()
> > > > > > > been called will cause the crash at this time.
> > > > > >
> > > > > > Won't virtio_pci_set_guest_notifiers() will try to allocate irqfd when
> > > > > > the assignment is true?
> > > > > >
> > > > > It will allocate, but  the vector is VIRTIO_NO_VECTOR (0xffff)
> > > > >
> > > > > then it will called kvm_virtio_pci_vector_use_one()
> > > > >
> > > > > in this function, there is a check for
> > > > >
> > > > >     if (vector >= msix_nr_vectors_allocated(dev))
> > > > >
> > > > > {         return 0;     }
> > > > >
> > > > > So it will return.
> > > >
> > > > How about let's just fix this?
> > >
> > > Btw, another question, how does vDPA work here?
> > >
> > > Thanks
> > >
> > the rhel/fedroa guest image will not call  vrtio_stop and virtio_reset
> > during the boot
> > So vector will not change to  VIRTIO_NO_VECTOR. So the vdpa's
> > configure interrupt
> > Should work ok and there is no crash
>
> I mean:
>
> 1) if vDPA can work with the image you used to reproduce the issue
> 2) if current Qemu can work on old kernel without configure interrupt
> support for vDPA
>
Really Sorry for ,  I tried to answer this
1. the vDPA device also can not working in this image,
 Because the irqfd for vector 0 is released. and then guest image
called unmask this vector. These code are all not dependent on vdpa
related processes

2, the current qemu can work with old kernel
because as I debug the old kernel  won't call
msix_fire_vector_notifier for vector 0, and it will not unmask vector
0
Thansk
cindy

> Thanks
>
> > Thanks
> > cindy
> >
> > > >
> > > > Btw, it's better to explain in detail like the above in the next version.
> > > >
> > > > Thanks
> > > >
> > > > >
> > > > > > > So I think this should
> > > > > > > be a bug in this guest image
> > > > > >
> > > > > > The point is Qemu should not crash even if the guest driver is buggy.
> > > > > >
> > > > > > It would be nice if we can have a qtest for this on top.
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > > sure, got it, I have done the Qtest, and it passed
> > > > > here is the result
> > > > >
> > > > > Ok:                 794
> > > > > Expected Fail:      0
> > > > > Fail:               0
> > > > > Unexpected Pass:    0
> > > > > Skipped:            32
> > > > > Timeout:            0
> > > > >
> > > > > > > > >
> > > > > > > > > 4. The system continues to boot and msix_fire_vector_notifier() was
> > > > > > > > > called unmask the vector 0 and then met the crash
> > > > > > > > > [msix_fire_vector_notifier] 112 called vector 0 is_masked 1
> > > > > > > > > [msix_fire_vector_notifier] 112 called vector 0 is_masked 0
> > > > > > > > >
> > > > > > > > > The reason for not reproducing in RHEL/fedora guest image is because
> > > > > > > > > REHL/Fedora doesn't have the behavior of calling vhost_net_stop and then virtio_reset, and also won't call msix_fire_vector_notifier for vector 0 during system boot.
> > > > > > > > >
> > > > > > > > > The reason for not reproducing before configure interrupt support is because
> > > > > > > > > vector 0 is for configure interrupt,  before the support for configure interrupts, the notifier process will not handle vector 0.
> > > > > > > > >
> > > > > > > > > For the device Vyatta using, it doesn't support configure interrupts at all, So we plan to disable the configure interrupts in unsupported device
> > > > > > > >
> > > > > > > > Btw, let's tweak the changelog, it's a little bit hard to understand.
> > > > > > > >
> > > > > > > sure will do
> > > > > > > thanks
> > > > > > > Cindy
> > > > > > > > Thanks
> > > > > > > >
> > > > > > > > >
> > > > > > > > > Signed-off-by: Cindy Lu <lulu@redhat.com>
> > > > > > > > >
> > > > > > > > > Cindy Lu (2):
> > > > > > > > >   virtio-net: disable the configure interrupt for not support device
> > > > > > > > >   virtio-pci: check if the configure interrupt enable
> > > > > > > > >
> > > > > > > > >  hw/net/virtio-net.c        |  5 ++++-
> > > > > > > > >  hw/virtio/virtio-pci.c     | 41 +++++++++++++++++++++-----------------
> > > > > > > > >  hw/virtio/virtio.c         |  1 +
> > > > > > > > >  include/hw/virtio/virtio.h |  1 +
> > > > > > > > >  4 files changed, 29 insertions(+), 19 deletions(-)
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > 2.43.0
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > >
> >
>



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC 0/2] disable the configuration interrupt for the unsupported device
  2024-03-28  4:12             ` Jason Wang
@ 2024-03-29  3:02               ` Cindy Lu
  2024-03-29  3:27                 ` Jason Wang
  0 siblings, 1 reply; 17+ messages in thread
From: Cindy Lu @ 2024-03-29  3:02 UTC (permalink / raw)
  To: Jason Wang; +Cc: mst, qemu-devel

On Thu, Mar 28, 2024 at 12:12 PM Jason Wang <jasowang@redhat.com> wrote:
>
> On Wed, Mar 27, 2024 at 5:33 PM Cindy Lu <lulu@redhat.com> wrote:
> >
> > On Wed, Mar 27, 2024 at 5:12 PM Jason Wang <jasowang@redhat.com> wrote:
> > >
> > > On Wed, Mar 27, 2024 at 4:28 PM Cindy Lu <lulu@redhat.com> wrote:
> > > >
> > > > On Wed, Mar 27, 2024 at 3:54 PM Jason Wang <jasowang@redhat.com> wrote:
> > > > >
> > > > > On Wed, Mar 27, 2024 at 2:03 PM Cindy Lu <lulu@redhat.com> wrote:
> > > > > >
> > > > > > On Wed, Mar 27, 2024 at 11:05 AM Jason Wang <jasowang@redhat.com> wrote:
> > > > > > >
> > > > > > > Hi Cindy:
> > > > > > >
> > > > > > > On Wed, Mar 27, 2024 at 9:29 AM Cindy Lu <lulu@redhat.com> wrote:
> > > > > > > >
> > > > > > > > we need a crash in Non-standard image, here is the jira for this https://issues.redhat.com/browse/RHEL-28522
> > > > > > > > The root cause of the issue is that an IRQFD was used without initialization..
> > > > > > > >
> > > > > > > > During the booting process of the Vyatta image, the behavior of the called function in qemu is as follows:
> > > > > > > >
> > > > > > > > 1. vhost_net_stop() was called, this will call the function
> > > > > > > > virtio_pci_set_guest_notifiers() with assgin= false, and
> > > > > > > > virtio_pci_set_guest_notifiers() will release the irqfd for vector 0
> > > > > > >
> > > > > > > Before vhost_net_stop(), do we know which vector is used by which queue?
> > > > > > >
> > > > > > before this stop, vdev->config_verctor is get from
> > > > > > virtio_pci_common_read/virtio_pci_common_write
> > > > > > it was set to vector 0
> > > > >
> > > > > I basically meant if vector 0 is shared with some virtqueues here.
> > > > >
> > > > Really sorry for this, vq's vector is 1,2, and will not share with the
> > > > configure vector
> > > > > > > >
> > > > > > > > 2. virtio_reset() was called -->set configure vector to VIRTIO_NO_VECTORt
> > > > > > > >
> > > > > > > > 3.vhost_net_start() was called (at this time the configure vector is
> > > > > > > > still VIRTIO_NO_VECTOR) and call virtio_pci_set_guest_notifiers() with
> > > > > > > > assgin= true, so the irqfd for vector 0 was not "init" during this process
> > > > > > >
> > > > > > > How does the configure vector differ from the virtqueue vector here?
> > > > > > >
> > > > > > All the vectors are VIRTIO_NO_VECTOR (including vq). any
> > > > > > msix_fire_vector_notifier()
> > > > > > been called will cause the crash at this time.
> > > > >
> > > > > Won't virtio_pci_set_guest_notifiers() will try to allocate irqfd when
> > > > > the assignment is true?
> > > > >
> > > > It will allocate, but  the vector is VIRTIO_NO_VECTOR (0xffff)
> > > >
> > > > then it will called kvm_virtio_pci_vector_use_one()
> > > >
> > > > in this function, there is a check for
> > > >
> > > >     if (vector >= msix_nr_vectors_allocated(dev))
> > > >
> > > > {         return 0;     }
> > > >
> > > > So it will return.
> > >
> > > How about let's just fix this?
> > >
> > > Btw, it's better to explain in detail like the above in the next version.
> > >
> > > Thanks
> > >
> > The problem is I think the behavior here is correct, The vector here is
> >  VIRTIO_NO_VECTOR and we should return,
>
> So if I understand correctly, the configure vector is configured after
> DRIVER_OK?
>
sorry I didn't get your point, Do you mean set_guest_notifiers()?,
this was called during the system boot
 but for the value of vdev->config_vector/vq vector, this is changed
by virtio_pci_common_read/virtio_pci_common_write
and these function will not check the process  DRIVER_OK.
> Spec doesn't forbid this, this is something we need to support.
>
> It looks to me the correct fix is to kvm_virtio_pci_vector_use_one()
> when guest is writing to msix_vector after DRIVER_OK?
>
if I understand correctly. do you mean
when  function   virtio_pci_common_read/virtio_pci_common_write was called
we need to check the number of  vdev->config_vector/vq vector, if this
was changed and also DRIVER_OK was set
then we need to call virtio_pci_set_guest_notifiers() to re-init the irqfd?
Thanks
cindy
> Thanks
>
> > the fix could work maybe is we try get to know if this was changed
> > from another value
> > and use that one? this seems strange.
> > Thanks
> > cindy
> > > >
> > > > > > So I think this should
> > > > > > be a bug in this guest image
> > > > >
> > > > > The point is Qemu should not crash even if the guest driver is buggy.
> > > > >
> > > > > It would be nice if we can have a qtest for this on top.
> > > > >
> > > > > Thanks
> > > > >
> > > > sure, got it, I have done the Qtest, and it passed
> > > > here is the result
> > > >
> > > > Ok:                 794
> > > > Expected Fail:      0
> > > > Fail:               0
> > > > Unexpected Pass:    0
> > > > Skipped:            32
> > > > Timeout:            0
> > > >
> > > > > > > >
> > > > > > > > 4. The system continues to boot and msix_fire_vector_notifier() was
> > > > > > > > called unmask the vector 0 and then met the crash
> > > > > > > > [msix_fire_vector_notifier] 112 called vector 0 is_masked 1
> > > > > > > > [msix_fire_vector_notifier] 112 called vector 0 is_masked 0
> > > > > > > >
> > > > > > > > The reason for not reproducing in RHEL/fedora guest image is because
> > > > > > > > REHL/Fedora doesn't have the behavior of calling vhost_net_stop and then virtio_reset, and also won't call msix_fire_vector_notifier for vector 0 during system boot.
> > > > > > > >
> > > > > > > > The reason for not reproducing before configure interrupt support is because
> > > > > > > > vector 0 is for configure interrupt,  before the support for configure interrupts, the notifier process will not handle vector 0.
> > > > > > > >
> > > > > > > > For the device Vyatta using, it doesn't support configure interrupts at all, So we plan to disable the configure interrupts in unsupported device
> > > > > > >
> > > > > > > Btw, let's tweak the changelog, it's a little bit hard to understand.
> > > > > > >
> > > > > > sure will do
> > > > > > thanks
> > > > > > Cindy
> > > > > > > Thanks
> > > > > > >
> > > > > > > >
> > > > > > > > Signed-off-by: Cindy Lu <lulu@redhat.com>
> > > > > > > >
> > > > > > > > Cindy Lu (2):
> > > > > > > >   virtio-net: disable the configure interrupt for not support device
> > > > > > > >   virtio-pci: check if the configure interrupt enable
> > > > > > > >
> > > > > > > >  hw/net/virtio-net.c        |  5 ++++-
> > > > > > > >  hw/virtio/virtio-pci.c     | 41 +++++++++++++++++++++-----------------
> > > > > > > >  hw/virtio/virtio.c         |  1 +
> > > > > > > >  include/hw/virtio/virtio.h |  1 +
> > > > > > > >  4 files changed, 29 insertions(+), 19 deletions(-)
> > > > > > > >
> > > > > > > > --
> > > > > > > > 2.43.0
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC 0/2] disable the configuration interrupt for the unsupported device
  2024-03-29  3:02               ` Cindy Lu
@ 2024-03-29  3:27                 ` Jason Wang
  0 siblings, 0 replies; 17+ messages in thread
From: Jason Wang @ 2024-03-29  3:27 UTC (permalink / raw)
  To: Cindy Lu; +Cc: mst, qemu-devel

On Fri, Mar 29, 2024 at 11:02 AM Cindy Lu <lulu@redhat.com> wrote:
>
> On Thu, Mar 28, 2024 at 12:12 PM Jason Wang <jasowang@redhat.com> wrote:
> >
> > On Wed, Mar 27, 2024 at 5:33 PM Cindy Lu <lulu@redhat.com> wrote:
> > >
> > > On Wed, Mar 27, 2024 at 5:12 PM Jason Wang <jasowang@redhat.com> wrote:
> > > >
> > > > On Wed, Mar 27, 2024 at 4:28 PM Cindy Lu <lulu@redhat.com> wrote:
> > > > >
> > > > > On Wed, Mar 27, 2024 at 3:54 PM Jason Wang <jasowang@redhat.com> wrote:
> > > > > >
> > > > > > On Wed, Mar 27, 2024 at 2:03 PM Cindy Lu <lulu@redhat.com> wrote:
> > > > > > >
> > > > > > > On Wed, Mar 27, 2024 at 11:05 AM Jason Wang <jasowang@redhat.com> wrote:
> > > > > > > >
> > > > > > > > Hi Cindy:
> > > > > > > >
> > > > > > > > On Wed, Mar 27, 2024 at 9:29 AM Cindy Lu <lulu@redhat.com> wrote:
> > > > > > > > >
> > > > > > > > > we need a crash in Non-standard image, here is the jira for this https://issues.redhat.com/browse/RHEL-28522
> > > > > > > > > The root cause of the issue is that an IRQFD was used without initialization..
> > > > > > > > >
> > > > > > > > > During the booting process of the Vyatta image, the behavior of the called function in qemu is as follows:
> > > > > > > > >
> > > > > > > > > 1. vhost_net_stop() was called, this will call the function
> > > > > > > > > virtio_pci_set_guest_notifiers() with assgin= false, and
> > > > > > > > > virtio_pci_set_guest_notifiers() will release the irqfd for vector 0
> > > > > > > >
> > > > > > > > Before vhost_net_stop(), do we know which vector is used by which queue?
> > > > > > > >
> > > > > > > before this stop, vdev->config_verctor is get from
> > > > > > > virtio_pci_common_read/virtio_pci_common_write
> > > > > > > it was set to vector 0
> > > > > >
> > > > > > I basically meant if vector 0 is shared with some virtqueues here.
> > > > > >
> > > > > Really sorry for this, vq's vector is 1,2, and will not share with the
> > > > > configure vector
> > > > > > > > >
> > > > > > > > > 2. virtio_reset() was called -->set configure vector to VIRTIO_NO_VECTORt
> > > > > > > > >
> > > > > > > > > 3.vhost_net_start() was called (at this time the configure vector is
> > > > > > > > > still VIRTIO_NO_VECTOR) and call virtio_pci_set_guest_notifiers() with
> > > > > > > > > assgin= true, so the irqfd for vector 0 was not "init" during this process
> > > > > > > >
> > > > > > > > How does the configure vector differ from the virtqueue vector here?
> > > > > > > >
> > > > > > > All the vectors are VIRTIO_NO_VECTOR (including vq). any
> > > > > > > msix_fire_vector_notifier()
> > > > > > > been called will cause the crash at this time.
> > > > > >
> > > > > > Won't virtio_pci_set_guest_notifiers() will try to allocate irqfd when
> > > > > > the assignment is true?
> > > > > >
> > > > > It will allocate, but  the vector is VIRTIO_NO_VECTOR (0xffff)
> > > > >
> > > > > then it will called kvm_virtio_pci_vector_use_one()
> > > > >
> > > > > in this function, there is a check for
> > > > >
> > > > >     if (vector >= msix_nr_vectors_allocated(dev))
> > > > >
> > > > > {         return 0;     }
> > > > >
> > > > > So it will return.
> > > >
> > > > How about let's just fix this?
> > > >
> > > > Btw, it's better to explain in detail like the above in the next version.
> > > >
> > > > Thanks
> > > >
> > > The problem is I think the behavior here is correct, The vector here is
> > >  VIRTIO_NO_VECTOR and we should return,
> >
> > So if I understand correctly, the configure vector is configured after
> > DRIVER_OK?
> >
> sorry I didn't get your point, Do you mean set_guest_notifiers()?,
> this was called during the system boot
>  but for the value of vdev->config_vector/vq vector, this is changed
> by virtio_pci_common_read/virtio_pci_common_write
> and these function will not check the process  DRIVER_OK.

I basically mean Qemu behave based on the guest's behaviour.

So what you've described looks like a guest trying to configure the
config vector after it sets DRIVER_OK. So Qemu tries to use the irqfd
without initializaiton.

> > Spec doesn't forbid this, this is something we need to support.
> >
> > It looks to me the correct fix is to kvm_virtio_pci_vector_use_one()
> > when guest is writing to msix_vector after DRIVER_OK?
> >
> if I understand correctly. do you mean
> when  function   virtio_pci_common_read/virtio_pci_common_write was called
> we need to check the number of  vdev->config_vector/vq vector, if this
> was changed and also DRIVER_OK was set
> then we need to call virtio_pci_set_guest_notifiers() to re-init the irqfd?

It is not re-init, as it has been freed.

A quick fix would be, call kvm_virtio_pci_vector_use/unuse_one() when
a guest assign/deassign a vector after DRIVER_OK.

Thanks

> Thanks
> cindy
> > Thanks
> >
> > > the fix could work maybe is we try get to know if this was changed
> > > from another value
> > > and use that one? this seems strange.
> > > Thanks
> > > cindy
> > > > >
> > > > > > > So I think this should
> > > > > > > be a bug in this guest image
> > > > > >
> > > > > > The point is Qemu should not crash even if the guest driver is buggy.
> > > > > >
> > > > > > It would be nice if we can have a qtest for this on top.
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > > sure, got it, I have done the Qtest, and it passed
> > > > > here is the result
> > > > >
> > > > > Ok:                 794
> > > > > Expected Fail:      0
> > > > > Fail:               0
> > > > > Unexpected Pass:    0
> > > > > Skipped:            32
> > > > > Timeout:            0
> > > > >
> > > > > > > > >
> > > > > > > > > 4. The system continues to boot and msix_fire_vector_notifier() was
> > > > > > > > > called unmask the vector 0 and then met the crash
> > > > > > > > > [msix_fire_vector_notifier] 112 called vector 0 is_masked 1
> > > > > > > > > [msix_fire_vector_notifier] 112 called vector 0 is_masked 0
> > > > > > > > >
> > > > > > > > > The reason for not reproducing in RHEL/fedora guest image is because
> > > > > > > > > REHL/Fedora doesn't have the behavior of calling vhost_net_stop and then virtio_reset, and also won't call msix_fire_vector_notifier for vector 0 during system boot.
> > > > > > > > >
> > > > > > > > > The reason for not reproducing before configure interrupt support is because
> > > > > > > > > vector 0 is for configure interrupt,  before the support for configure interrupts, the notifier process will not handle vector 0.
> > > > > > > > >
> > > > > > > > > For the device Vyatta using, it doesn't support configure interrupts at all, So we plan to disable the configure interrupts in unsupported device
> > > > > > > >
> > > > > > > > Btw, let's tweak the changelog, it's a little bit hard to understand.
> > > > > > > >
> > > > > > > sure will do
> > > > > > > thanks
> > > > > > > Cindy
> > > > > > > > Thanks
> > > > > > > >
> > > > > > > > >
> > > > > > > > > Signed-off-by: Cindy Lu <lulu@redhat.com>
> > > > > > > > >
> > > > > > > > > Cindy Lu (2):
> > > > > > > > >   virtio-net: disable the configure interrupt for not support device
> > > > > > > > >   virtio-pci: check if the configure interrupt enable
> > > > > > > > >
> > > > > > > > >  hw/net/virtio-net.c        |  5 ++++-
> > > > > > > > >  hw/virtio/virtio-pci.c     | 41 +++++++++++++++++++++-----------------
> > > > > > > > >  hw/virtio/virtio.c         |  1 +
> > > > > > > > >  include/hw/virtio/virtio.h |  1 +
> > > > > > > > >  4 files changed, 29 insertions(+), 19 deletions(-)
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > 2.43.0
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>



^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2024-03-29  3:28 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-03-27  1:22 [RFC 0/2] disable the configuration interrupt for the unsupported device Cindy Lu
2024-03-27  1:22 ` [RFC 1/2] virtio-net: disable the configure interrupt for not support device Cindy Lu
2024-03-27  2:54   ` Jason Wang
2024-03-27  1:22 ` [RFC 2/2] virtio-pci: check if the configure interrupt enable Cindy Lu
2024-03-27  3:05 ` [RFC 0/2] disable the configuration interrupt for the unsupported device Jason Wang
2024-03-27  6:02   ` Cindy Lu
2024-03-27  7:54     ` Jason Wang
2024-03-27  8:28       ` Cindy Lu
2024-03-27  9:12         ` Jason Wang
2024-03-27  9:12           ` Jason Wang
2024-03-27  9:43             ` Cindy Lu
2024-03-28  4:14               ` Jason Wang
2024-03-28  7:07                 ` Cindy Lu
2024-03-27  9:32           ` Cindy Lu
2024-03-28  4:12             ` Jason Wang
2024-03-29  3:02               ` Cindy Lu
2024-03-29  3:27                 ` Jason Wang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.