All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/5] vfio/spapr: Handle changes of master irq chip for VFIO devices
@ 2019-11-21  0:56 ` David Gibson
  0 siblings, 0 replies; 30+ messages in thread
From: David Gibson @ 2019-11-21  0:56 UTC (permalink / raw)
  To: Alex Williamson, clg
  Cc: groug, philmd, qemu-ppc, Paolo Bonzini, Jason Wang,
	Laurent Vivier, David Gibson, kvm, qemu-devel, Riku Voipio,
	Marc-André Lureau

Due to the way feature negotiation works in PAPR (which is a
paravirtualized platform), we can end up changing the global irq chip
at runtime, including it's KVM accelerate model.  That causes
complications for VFIO devices with INTx, which wire themselves up
directly to the KVM irqchip for performance.

This series introduces a new notifier to let VFIO devices (and
anything else that needs to in the future) know about changes to the
master irqchip.  It modifies VFIO to respond to the notifier,
reconnecting itself to the new KVM irqchip as necessary.

In particular this removes a misleading (though not wholly inaccurate)
warning that occurs when using VFIO devices on a pseries machine type
guest.

Open question: should this go into qemu-4.2 or wait until 5.0?  It's
has medium complexity / intrusiveness, but it *is* a bugfix that I
can't see a simpler way to fix.  It's effectively a regression from
qemu-4.0 to qemu-4.1 (because that introduced XIVE support by
default), although not from 4.1 to 4.2.

Changes since RFC:
 * Fixed some incorrect error paths pointed by aw in 3/5
 * 5/5 had some problems previously, but they have been obsoleted by
   other changes merged in the meantime

David Gibson (5):
  kvm: Introduce KVM irqchip change notifier
  vfio/pci: Split vfio_intx_update()
  vfio/pci: Respond to KVM irqchip change notifier
  spapr: Handle irq backend changes with VFIO PCI devices
  spapr: Work around spurious warnings from vfio INTx initialization

 accel/kvm/kvm-all.c    | 18 ++++++++++++
 accel/stubs/kvm-stub.c | 12 ++++++++
 hw/ppc/spapr_irq.c     | 17 +++++++++++-
 hw/vfio/pci.c          | 62 +++++++++++++++++++++++++++---------------
 hw/vfio/pci.h          |  1 +
 include/sysemu/kvm.h   |  5 ++++
 6 files changed, 92 insertions(+), 23 deletions(-)

-- 
2.23.0


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 0/5] vfio/spapr: Handle changes of master irq chip for VFIO devices
@ 2019-11-21  0:56 ` David Gibson
  0 siblings, 0 replies; 30+ messages in thread
From: David Gibson @ 2019-11-21  0:56 UTC (permalink / raw)
  To: Alex Williamson, clg
  Cc: kvm, qemu-devel, Jason Wang, Riku Voipio, Laurent Vivier, groug,
	qemu-ppc, Marc-André Lureau, Paolo Bonzini, philmd,
	David Gibson

Due to the way feature negotiation works in PAPR (which is a
paravirtualized platform), we can end up changing the global irq chip
at runtime, including it's KVM accelerate model.  That causes
complications for VFIO devices with INTx, which wire themselves up
directly to the KVM irqchip for performance.

This series introduces a new notifier to let VFIO devices (and
anything else that needs to in the future) know about changes to the
master irqchip.  It modifies VFIO to respond to the notifier,
reconnecting itself to the new KVM irqchip as necessary.

In particular this removes a misleading (though not wholly inaccurate)
warning that occurs when using VFIO devices on a pseries machine type
guest.

Open question: should this go into qemu-4.2 or wait until 5.0?  It's
has medium complexity / intrusiveness, but it *is* a bugfix that I
can't see a simpler way to fix.  It's effectively a regression from
qemu-4.0 to qemu-4.1 (because that introduced XIVE support by
default), although not from 4.1 to 4.2.

Changes since RFC:
 * Fixed some incorrect error paths pointed by aw in 3/5
 * 5/5 had some problems previously, but they have been obsoleted by
   other changes merged in the meantime

David Gibson (5):
  kvm: Introduce KVM irqchip change notifier
  vfio/pci: Split vfio_intx_update()
  vfio/pci: Respond to KVM irqchip change notifier
  spapr: Handle irq backend changes with VFIO PCI devices
  spapr: Work around spurious warnings from vfio INTx initialization

 accel/kvm/kvm-all.c    | 18 ++++++++++++
 accel/stubs/kvm-stub.c | 12 ++++++++
 hw/ppc/spapr_irq.c     | 17 +++++++++++-
 hw/vfio/pci.c          | 62 +++++++++++++++++++++++++++---------------
 hw/vfio/pci.h          |  1 +
 include/sysemu/kvm.h   |  5 ++++
 6 files changed, 92 insertions(+), 23 deletions(-)

-- 
2.23.0



^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 1/5] kvm: Introduce KVM irqchip change notifier
  2019-11-21  0:56 ` David Gibson
@ 2019-11-21  0:56   ` David Gibson
  -1 siblings, 0 replies; 30+ messages in thread
From: David Gibson @ 2019-11-21  0:56 UTC (permalink / raw)
  To: Alex Williamson, clg
  Cc: groug, philmd, qemu-ppc, Paolo Bonzini, Jason Wang,
	Laurent Vivier, David Gibson, kvm, qemu-devel, Riku Voipio,
	Marc-André Lureau, Alexey Kardashevskiy

Awareness of an in kernel irqchip is usually local to the machine and its
top-level interrupt controller.  However, in a few cases other things need
to know about it.  In particular vfio devices need this in order to
accelerate interrupt delivery.

If interrupt routing is changed, such devices may need to readjust their
connection to the KVM irqchip.  pci_bus_fire_intx_routing_notifier() exists
to do just this.

However, for the pseries machine type we have a situation where the routing
remains constant but the top-level irq chip itself is changed.  This occurs
because of PAPR feature negotiation which allows the guest to decide
between the older XICS and newer XIVE irq chip models (both of which are
paravirtualized).

To allow devices like vfio to adjust to this change, introduce a new
notifier for the purpose kvm_irqchip_change_notify().

Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Alexey Kardashevskiy <aik@ozlabs.ru>

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 accel/kvm/kvm-all.c    | 18 ++++++++++++++++++
 accel/stubs/kvm-stub.c | 12 ++++++++++++
 include/sysemu/kvm.h   |  5 +++++
 3 files changed, 35 insertions(+)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 140b0bd8f6..ca00daa2f5 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -149,6 +149,9 @@ static const KVMCapabilityInfo kvm_required_capabilites[] = {
     KVM_CAP_LAST_INFO
 };
 
+static NotifierList kvm_irqchip_change_notifiers =
+    NOTIFIER_LIST_INITIALIZER(kvm_irqchip_change_notifiers);
+
 #define kvm_slots_lock(kml)      qemu_mutex_lock(&(kml)->slots_lock)
 #define kvm_slots_unlock(kml)    qemu_mutex_unlock(&(kml)->slots_lock)
 
@@ -1396,6 +1399,21 @@ void kvm_irqchip_release_virq(KVMState *s, int virq)
     trace_kvm_irqchip_release_virq(virq);
 }
 
+void kvm_irqchip_add_change_notifier(Notifier *n)
+{
+    notifier_list_add(&kvm_irqchip_change_notifiers, n);
+}
+
+void kvm_irqchip_remove_change_notifier(Notifier *n)
+{
+    notifier_remove(n);
+}
+
+void kvm_irqchip_change_notify(void)
+{
+    notifier_list_notify(&kvm_irqchip_change_notifiers, NULL);
+}
+
 static unsigned int kvm_hash_msi(uint32_t data)
 {
     /* This is optimized for IA32 MSI layout. However, no other arch shall
diff --git a/accel/stubs/kvm-stub.c b/accel/stubs/kvm-stub.c
index 6feb66ed80..82f118d2df 100644
--- a/accel/stubs/kvm-stub.c
+++ b/accel/stubs/kvm-stub.c
@@ -138,6 +138,18 @@ void kvm_irqchip_commit_routes(KVMState *s)
 {
 }
 
+void kvm_irqchip_add_change_notifier(Notifier *n)
+{
+}
+
+void kvm_irqchip_remove_change_notifier(Notifier *n)
+{
+}
+
+void kvm_irqchip_change_notify(void)
+{
+}
+
 int kvm_irqchip_add_adapter_route(KVMState *s, AdapterInfo *adapter)
 {
     return -ENOSYS;
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index 9d143282bc..9fe233b9bf 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -201,6 +201,7 @@ typedef struct KVMCapabilityInfo {
 struct KVMState;
 typedef struct KVMState KVMState;
 extern KVMState *kvm_state;
+typedef struct Notifier Notifier;
 
 /* external API */
 
@@ -401,6 +402,10 @@ int kvm_irqchip_send_msi(KVMState *s, MSIMessage msg);
 
 void kvm_irqchip_add_irq_route(KVMState *s, int gsi, int irqchip, int pin);
 
+void kvm_irqchip_add_change_notifier(Notifier *n);
+void kvm_irqchip_remove_change_notifier(Notifier *n);
+void kvm_irqchip_change_notify(void);
+
 void kvm_get_apic_state(DeviceState *d, struct kvm_lapic_state *kapic);
 
 struct kvm_guest_debug;
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 1/5] kvm: Introduce KVM irqchip change notifier
@ 2019-11-21  0:56   ` David Gibson
  0 siblings, 0 replies; 30+ messages in thread
From: David Gibson @ 2019-11-21  0:56 UTC (permalink / raw)
  To: Alex Williamson, clg
  Cc: kvm, qemu-devel, Jason Wang, Riku Voipio, Laurent Vivier, groug,
	Alexey Kardashevskiy, qemu-ppc, Marc-André Lureau,
	Paolo Bonzini, philmd, David Gibson

Awareness of an in kernel irqchip is usually local to the machine and its
top-level interrupt controller.  However, in a few cases other things need
to know about it.  In particular vfio devices need this in order to
accelerate interrupt delivery.

If interrupt routing is changed, such devices may need to readjust their
connection to the KVM irqchip.  pci_bus_fire_intx_routing_notifier() exists
to do just this.

However, for the pseries machine type we have a situation where the routing
remains constant but the top-level irq chip itself is changed.  This occurs
because of PAPR feature negotiation which allows the guest to decide
between the older XICS and newer XIVE irq chip models (both of which are
paravirtualized).

To allow devices like vfio to adjust to this change, introduce a new
notifier for the purpose kvm_irqchip_change_notify().

Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Alexey Kardashevskiy <aik@ozlabs.ru>

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 accel/kvm/kvm-all.c    | 18 ++++++++++++++++++
 accel/stubs/kvm-stub.c | 12 ++++++++++++
 include/sysemu/kvm.h   |  5 +++++
 3 files changed, 35 insertions(+)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 140b0bd8f6..ca00daa2f5 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -149,6 +149,9 @@ static const KVMCapabilityInfo kvm_required_capabilites[] = {
     KVM_CAP_LAST_INFO
 };
 
+static NotifierList kvm_irqchip_change_notifiers =
+    NOTIFIER_LIST_INITIALIZER(kvm_irqchip_change_notifiers);
+
 #define kvm_slots_lock(kml)      qemu_mutex_lock(&(kml)->slots_lock)
 #define kvm_slots_unlock(kml)    qemu_mutex_unlock(&(kml)->slots_lock)
 
@@ -1396,6 +1399,21 @@ void kvm_irqchip_release_virq(KVMState *s, int virq)
     trace_kvm_irqchip_release_virq(virq);
 }
 
+void kvm_irqchip_add_change_notifier(Notifier *n)
+{
+    notifier_list_add(&kvm_irqchip_change_notifiers, n);
+}
+
+void kvm_irqchip_remove_change_notifier(Notifier *n)
+{
+    notifier_remove(n);
+}
+
+void kvm_irqchip_change_notify(void)
+{
+    notifier_list_notify(&kvm_irqchip_change_notifiers, NULL);
+}
+
 static unsigned int kvm_hash_msi(uint32_t data)
 {
     /* This is optimized for IA32 MSI layout. However, no other arch shall
diff --git a/accel/stubs/kvm-stub.c b/accel/stubs/kvm-stub.c
index 6feb66ed80..82f118d2df 100644
--- a/accel/stubs/kvm-stub.c
+++ b/accel/stubs/kvm-stub.c
@@ -138,6 +138,18 @@ void kvm_irqchip_commit_routes(KVMState *s)
 {
 }
 
+void kvm_irqchip_add_change_notifier(Notifier *n)
+{
+}
+
+void kvm_irqchip_remove_change_notifier(Notifier *n)
+{
+}
+
+void kvm_irqchip_change_notify(void)
+{
+}
+
 int kvm_irqchip_add_adapter_route(KVMState *s, AdapterInfo *adapter)
 {
     return -ENOSYS;
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index 9d143282bc..9fe233b9bf 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -201,6 +201,7 @@ typedef struct KVMCapabilityInfo {
 struct KVMState;
 typedef struct KVMState KVMState;
 extern KVMState *kvm_state;
+typedef struct Notifier Notifier;
 
 /* external API */
 
@@ -401,6 +402,10 @@ int kvm_irqchip_send_msi(KVMState *s, MSIMessage msg);
 
 void kvm_irqchip_add_irq_route(KVMState *s, int gsi, int irqchip, int pin);
 
+void kvm_irqchip_add_change_notifier(Notifier *n);
+void kvm_irqchip_remove_change_notifier(Notifier *n);
+void kvm_irqchip_change_notify(void);
+
 void kvm_get_apic_state(DeviceState *d, struct kvm_lapic_state *kapic);
 
 struct kvm_guest_debug;
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 2/5] vfio/pci: Split vfio_intx_update()
  2019-11-21  0:56 ` David Gibson
@ 2019-11-21  0:56   ` David Gibson
  -1 siblings, 0 replies; 30+ messages in thread
From: David Gibson @ 2019-11-21  0:56 UTC (permalink / raw)
  To: Alex Williamson, clg
  Cc: groug, philmd, qemu-ppc, Paolo Bonzini, Jason Wang,
	Laurent Vivier, David Gibson, kvm, qemu-devel, Riku Voipio,
	Marc-André Lureau, Alexey Kardashevskiy

This splits the vfio_intx_update() function into one part doing the actual
reconnection with the KVM irqchip (vfio_intx_update(), now taking an
argument with the new routing) and vfio_intx_routing_notifier() which
handles calls to the pci device intx routing notifier and calling
vfio_intx_update() when necessary.  This will make adding support for the
irqchip change notifier easier.

Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Alexey Kardashevskiy <aik@ozlabs.ru>

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/vfio/pci.c | 39 ++++++++++++++++++++++-----------------
 1 file changed, 22 insertions(+), 17 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 0c55883bba..521289aa7d 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -216,30 +216,18 @@ static void vfio_intx_disable_kvm(VFIOPCIDevice *vdev)
 #endif
 }
 
-static void vfio_intx_update(PCIDevice *pdev)
+static void vfio_intx_update(VFIOPCIDevice *vdev, PCIINTxRoute *route)
 {
-    VFIOPCIDevice *vdev = PCI_VFIO(pdev);
-    PCIINTxRoute route;
     Error *err = NULL;
 
-    if (vdev->interrupt != VFIO_INT_INTx) {
-        return;
-    }
-
-    route = pci_device_route_intx_to_irq(&vdev->pdev, vdev->intx.pin);
-
-    if (!pci_intx_route_changed(&vdev->intx.route, &route)) {
-        return; /* Nothing changed */
-    }
-
     trace_vfio_intx_update(vdev->vbasedev.name,
-                           vdev->intx.route.irq, route.irq);
+                           vdev->intx.route.irq, route->irq);
 
     vfio_intx_disable_kvm(vdev);
 
-    vdev->intx.route = route;
+    vdev->intx.route = *route;
 
-    if (route.mode != PCI_INTX_ENABLED) {
+    if (route->mode != PCI_INTX_ENABLED) {
         return;
     }
 
@@ -252,6 +240,22 @@ static void vfio_intx_update(PCIDevice *pdev)
     vfio_intx_eoi(&vdev->vbasedev);
 }
 
+static void vfio_intx_routing_notifier(PCIDevice *pdev)
+{
+    VFIOPCIDevice *vdev = PCI_VFIO(pdev);
+    PCIINTxRoute route;
+
+    if (vdev->interrupt != VFIO_INT_INTx) {
+        return;
+    }
+
+    route = pci_device_route_intx_to_irq(&vdev->pdev, vdev->intx.pin);
+
+    if (pci_intx_route_changed(&vdev->intx.route, &route)) {
+        vfio_intx_update(vdev, &route);
+    }
+}
+
 static int vfio_intx_enable(VFIOPCIDevice *vdev, Error **errp)
 {
     uint8_t pin = vfio_pci_read_config(&vdev->pdev, PCI_INTERRUPT_PIN, 1);
@@ -2967,7 +2971,8 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
     if (vfio_pci_read_config(&vdev->pdev, PCI_INTERRUPT_PIN, 1)) {
         vdev->intx.mmap_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL,
                                                   vfio_intx_mmap_enable, vdev);
-        pci_device_set_intx_routing_notifier(&vdev->pdev, vfio_intx_update);
+        pci_device_set_intx_routing_notifier(&vdev->pdev,
+                                             vfio_intx_routing_notifier);
         ret = vfio_intx_enable(vdev, errp);
         if (ret) {
             goto out_teardown;
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 2/5] vfio/pci: Split vfio_intx_update()
@ 2019-11-21  0:56   ` David Gibson
  0 siblings, 0 replies; 30+ messages in thread
From: David Gibson @ 2019-11-21  0:56 UTC (permalink / raw)
  To: Alex Williamson, clg
  Cc: kvm, qemu-devel, Jason Wang, Riku Voipio, Laurent Vivier, groug,
	Alexey Kardashevskiy, qemu-ppc, Marc-André Lureau,
	Paolo Bonzini, philmd, David Gibson

This splits the vfio_intx_update() function into one part doing the actual
reconnection with the KVM irqchip (vfio_intx_update(), now taking an
argument with the new routing) and vfio_intx_routing_notifier() which
handles calls to the pci device intx routing notifier and calling
vfio_intx_update() when necessary.  This will make adding support for the
irqchip change notifier easier.

Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Alexey Kardashevskiy <aik@ozlabs.ru>

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/vfio/pci.c | 39 ++++++++++++++++++++++-----------------
 1 file changed, 22 insertions(+), 17 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 0c55883bba..521289aa7d 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -216,30 +216,18 @@ static void vfio_intx_disable_kvm(VFIOPCIDevice *vdev)
 #endif
 }
 
-static void vfio_intx_update(PCIDevice *pdev)
+static void vfio_intx_update(VFIOPCIDevice *vdev, PCIINTxRoute *route)
 {
-    VFIOPCIDevice *vdev = PCI_VFIO(pdev);
-    PCIINTxRoute route;
     Error *err = NULL;
 
-    if (vdev->interrupt != VFIO_INT_INTx) {
-        return;
-    }
-
-    route = pci_device_route_intx_to_irq(&vdev->pdev, vdev->intx.pin);
-
-    if (!pci_intx_route_changed(&vdev->intx.route, &route)) {
-        return; /* Nothing changed */
-    }
-
     trace_vfio_intx_update(vdev->vbasedev.name,
-                           vdev->intx.route.irq, route.irq);
+                           vdev->intx.route.irq, route->irq);
 
     vfio_intx_disable_kvm(vdev);
 
-    vdev->intx.route = route;
+    vdev->intx.route = *route;
 
-    if (route.mode != PCI_INTX_ENABLED) {
+    if (route->mode != PCI_INTX_ENABLED) {
         return;
     }
 
@@ -252,6 +240,22 @@ static void vfio_intx_update(PCIDevice *pdev)
     vfio_intx_eoi(&vdev->vbasedev);
 }
 
+static void vfio_intx_routing_notifier(PCIDevice *pdev)
+{
+    VFIOPCIDevice *vdev = PCI_VFIO(pdev);
+    PCIINTxRoute route;
+
+    if (vdev->interrupt != VFIO_INT_INTx) {
+        return;
+    }
+
+    route = pci_device_route_intx_to_irq(&vdev->pdev, vdev->intx.pin);
+
+    if (pci_intx_route_changed(&vdev->intx.route, &route)) {
+        vfio_intx_update(vdev, &route);
+    }
+}
+
 static int vfio_intx_enable(VFIOPCIDevice *vdev, Error **errp)
 {
     uint8_t pin = vfio_pci_read_config(&vdev->pdev, PCI_INTERRUPT_PIN, 1);
@@ -2967,7 +2971,8 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
     if (vfio_pci_read_config(&vdev->pdev, PCI_INTERRUPT_PIN, 1)) {
         vdev->intx.mmap_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL,
                                                   vfio_intx_mmap_enable, vdev);
-        pci_device_set_intx_routing_notifier(&vdev->pdev, vfio_intx_update);
+        pci_device_set_intx_routing_notifier(&vdev->pdev,
+                                             vfio_intx_routing_notifier);
         ret = vfio_intx_enable(vdev, errp);
         if (ret) {
             goto out_teardown;
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 3/5] vfio/pci: Respond to KVM irqchip change notifier
  2019-11-21  0:56 ` David Gibson
@ 2019-11-21  0:56   ` David Gibson
  -1 siblings, 0 replies; 30+ messages in thread
From: David Gibson @ 2019-11-21  0:56 UTC (permalink / raw)
  To: Alex Williamson, clg
  Cc: groug, philmd, qemu-ppc, Paolo Bonzini, Jason Wang,
	Laurent Vivier, David Gibson, kvm, qemu-devel, Riku Voipio,
	Marc-André Lureau, Alexey Kardashevskiy

VFIO PCI devices already respond to the pci intx routing notifier, in order
to update kernel irqchip mappings when routing is updated.  However this
won't handle the case where the irqchip itself is replaced by a different
model while retaining the same routing.  This case can happen on
the pseries machine type due to PAPR feature negotiation.

To handle that case, add a handler for the irqchip change notifier, which
does much the same thing as the routing notifier, but is unconditional,
rather than being a no-op when the routing hasn't changed.

Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Alexey Kardashevskiy <aik@ozlabs.ru>

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/vfio/pci.c | 23 ++++++++++++++++++-----
 hw/vfio/pci.h |  1 +
 2 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 521289aa7d..95478c2c55 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -256,6 +256,14 @@ static void vfio_intx_routing_notifier(PCIDevice *pdev)
     }
 }
 
+static void vfio_irqchip_change(Notifier *notify, void *data)
+{
+    VFIOPCIDevice *vdev = container_of(notify, VFIOPCIDevice,
+                                       irqchip_change_notifier);
+
+    vfio_intx_update(vdev, &vdev->intx.route);
+}
+
 static int vfio_intx_enable(VFIOPCIDevice *vdev, Error **errp)
 {
     uint8_t pin = vfio_pci_read_config(&vdev->pdev, PCI_INTERRUPT_PIN, 1);
@@ -2973,16 +2981,18 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
                                                   vfio_intx_mmap_enable, vdev);
         pci_device_set_intx_routing_notifier(&vdev->pdev,
                                              vfio_intx_routing_notifier);
+        vdev->irqchip_change_notifier.notify = vfio_irqchip_change;
+        kvm_irqchip_add_change_notifier(&vdev->irqchip_change_notifier);
         ret = vfio_intx_enable(vdev, errp);
         if (ret) {
-            goto out_teardown;
+            goto out_deregister;
         }
     }
 
     if (vdev->display != ON_OFF_AUTO_OFF) {
         ret = vfio_display_probe(vdev, errp);
         if (ret) {
-            goto out_teardown;
+            goto out_deregister;
         }
     }
     if (vdev->enable_ramfb && vdev->dpy == NULL) {
@@ -2992,11 +3002,11 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
     if (vdev->display_xres || vdev->display_yres) {
         if (vdev->dpy == NULL) {
             error_setg(errp, "xres and yres properties require display=on");
-            goto out_teardown;
+            goto out_deregister;
         }
         if (vdev->dpy->edid_regs == NULL) {
             error_setg(errp, "xres and yres properties need edid support");
-            goto out_teardown;
+            goto out_deregister;
         }
     }
 
@@ -3020,8 +3030,10 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
 
     return;
 
-out_teardown:
+out_deregister:
     pci_device_set_intx_routing_notifier(&vdev->pdev, NULL);
+    kvm_irqchip_remove_change_notifier(&vdev->irqchip_change_notifier);
+out_teardown:
     vfio_teardown_msi(vdev);
     vfio_bars_exit(vdev);
 error:
@@ -3064,6 +3076,7 @@ static void vfio_exitfn(PCIDevice *pdev)
     vfio_unregister_req_notifier(vdev);
     vfio_unregister_err_notifier(vdev);
     pci_device_set_intx_routing_notifier(&vdev->pdev, NULL);
+    kvm_irqchip_remove_change_notifier(&vdev->irqchip_change_notifier);
     vfio_disable_interrupts(vdev);
     if (vdev->intx.mmap_timer) {
         timer_free(vdev->intx.mmap_timer);
diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
index b329d50338..35626cd63e 100644
--- a/hw/vfio/pci.h
+++ b/hw/vfio/pci.h
@@ -169,6 +169,7 @@ typedef struct VFIOPCIDevice {
     bool enable_ramfb;
     VFIODisplay *dpy;
     Error *migration_blocker;
+    Notifier irqchip_change_notifier;
 } VFIOPCIDevice;
 
 uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len);
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 3/5] vfio/pci: Respond to KVM irqchip change notifier
@ 2019-11-21  0:56   ` David Gibson
  0 siblings, 0 replies; 30+ messages in thread
From: David Gibson @ 2019-11-21  0:56 UTC (permalink / raw)
  To: Alex Williamson, clg
  Cc: kvm, qemu-devel, Jason Wang, Riku Voipio, Laurent Vivier, groug,
	Alexey Kardashevskiy, qemu-ppc, Marc-André Lureau,
	Paolo Bonzini, philmd, David Gibson

VFIO PCI devices already respond to the pci intx routing notifier, in order
to update kernel irqchip mappings when routing is updated.  However this
won't handle the case where the irqchip itself is replaced by a different
model while retaining the same routing.  This case can happen on
the pseries machine type due to PAPR feature negotiation.

To handle that case, add a handler for the irqchip change notifier, which
does much the same thing as the routing notifier, but is unconditional,
rather than being a no-op when the routing hasn't changed.

Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Alexey Kardashevskiy <aik@ozlabs.ru>

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/vfio/pci.c | 23 ++++++++++++++++++-----
 hw/vfio/pci.h |  1 +
 2 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 521289aa7d..95478c2c55 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -256,6 +256,14 @@ static void vfio_intx_routing_notifier(PCIDevice *pdev)
     }
 }
 
+static void vfio_irqchip_change(Notifier *notify, void *data)
+{
+    VFIOPCIDevice *vdev = container_of(notify, VFIOPCIDevice,
+                                       irqchip_change_notifier);
+
+    vfio_intx_update(vdev, &vdev->intx.route);
+}
+
 static int vfio_intx_enable(VFIOPCIDevice *vdev, Error **errp)
 {
     uint8_t pin = vfio_pci_read_config(&vdev->pdev, PCI_INTERRUPT_PIN, 1);
@@ -2973,16 +2981,18 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
                                                   vfio_intx_mmap_enable, vdev);
         pci_device_set_intx_routing_notifier(&vdev->pdev,
                                              vfio_intx_routing_notifier);
+        vdev->irqchip_change_notifier.notify = vfio_irqchip_change;
+        kvm_irqchip_add_change_notifier(&vdev->irqchip_change_notifier);
         ret = vfio_intx_enable(vdev, errp);
         if (ret) {
-            goto out_teardown;
+            goto out_deregister;
         }
     }
 
     if (vdev->display != ON_OFF_AUTO_OFF) {
         ret = vfio_display_probe(vdev, errp);
         if (ret) {
-            goto out_teardown;
+            goto out_deregister;
         }
     }
     if (vdev->enable_ramfb && vdev->dpy == NULL) {
@@ -2992,11 +3002,11 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
     if (vdev->display_xres || vdev->display_yres) {
         if (vdev->dpy == NULL) {
             error_setg(errp, "xres and yres properties require display=on");
-            goto out_teardown;
+            goto out_deregister;
         }
         if (vdev->dpy->edid_regs == NULL) {
             error_setg(errp, "xres and yres properties need edid support");
-            goto out_teardown;
+            goto out_deregister;
         }
     }
 
@@ -3020,8 +3030,10 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
 
     return;
 
-out_teardown:
+out_deregister:
     pci_device_set_intx_routing_notifier(&vdev->pdev, NULL);
+    kvm_irqchip_remove_change_notifier(&vdev->irqchip_change_notifier);
+out_teardown:
     vfio_teardown_msi(vdev);
     vfio_bars_exit(vdev);
 error:
@@ -3064,6 +3076,7 @@ static void vfio_exitfn(PCIDevice *pdev)
     vfio_unregister_req_notifier(vdev);
     vfio_unregister_err_notifier(vdev);
     pci_device_set_intx_routing_notifier(&vdev->pdev, NULL);
+    kvm_irqchip_remove_change_notifier(&vdev->irqchip_change_notifier);
     vfio_disable_interrupts(vdev);
     if (vdev->intx.mmap_timer) {
         timer_free(vdev->intx.mmap_timer);
diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
index b329d50338..35626cd63e 100644
--- a/hw/vfio/pci.h
+++ b/hw/vfio/pci.h
@@ -169,6 +169,7 @@ typedef struct VFIOPCIDevice {
     bool enable_ramfb;
     VFIODisplay *dpy;
     Error *migration_blocker;
+    Notifier irqchip_change_notifier;
 } VFIOPCIDevice;
 
 uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len);
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 4/5] spapr: Handle irq backend changes with VFIO PCI devices
  2019-11-21  0:56 ` David Gibson
@ 2019-11-21  0:56   ` David Gibson
  -1 siblings, 0 replies; 30+ messages in thread
From: David Gibson @ 2019-11-21  0:56 UTC (permalink / raw)
  To: Alex Williamson, clg
  Cc: groug, philmd, qemu-ppc, Paolo Bonzini, Jason Wang,
	Laurent Vivier, David Gibson, kvm, qemu-devel, Riku Voipio,
	Marc-André Lureau, Alexey Kardashevskiy

pseries machine type can have one of two different interrupt controllers in
use depending on feature negotiation with the guest.  Usually this is
invisible to devices, because they route to a common set of qemu_irqs which
in turn dispatch to the correct back end.

VFIO passthrough devices, however, wire themselves up directly to the KVM
irqchip for performance, which means they are affected by this change in
interrupt controller.  To get them to adjust correctly for the change in
irqchip, we need to fire the kvm irqchip change notifier.

Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Alexey Kardashevskiy <aik@ozlabs.ru>

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/ppc/spapr_irq.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index 168044be85..1d27034962 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -508,6 +508,12 @@ static void set_active_intc(SpaprMachineState *spapr,
     }
 
     spapr->active_intc = new_intc;
+
+    /*
+     * We've changed the kernel irqchip, let VFIO devices know they
+     * need to readjust.
+     */
+    kvm_irqchip_change_notify();
 }
 
 void spapr_irq_update_active_intc(SpaprMachineState *spapr)
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 4/5] spapr: Handle irq backend changes with VFIO PCI devices
@ 2019-11-21  0:56   ` David Gibson
  0 siblings, 0 replies; 30+ messages in thread
From: David Gibson @ 2019-11-21  0:56 UTC (permalink / raw)
  To: Alex Williamson, clg
  Cc: kvm, qemu-devel, Jason Wang, Riku Voipio, Laurent Vivier, groug,
	Alexey Kardashevskiy, qemu-ppc, Marc-André Lureau,
	Paolo Bonzini, philmd, David Gibson

pseries machine type can have one of two different interrupt controllers in
use depending on feature negotiation with the guest.  Usually this is
invisible to devices, because they route to a common set of qemu_irqs which
in turn dispatch to the correct back end.

VFIO passthrough devices, however, wire themselves up directly to the KVM
irqchip for performance, which means they are affected by this change in
interrupt controller.  To get them to adjust correctly for the change in
irqchip, we need to fire the kvm irqchip change notifier.

Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Alexey Kardashevskiy <aik@ozlabs.ru>

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/ppc/spapr_irq.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index 168044be85..1d27034962 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -508,6 +508,12 @@ static void set_active_intc(SpaprMachineState *spapr,
     }
 
     spapr->active_intc = new_intc;
+
+    /*
+     * We've changed the kernel irqchip, let VFIO devices know they
+     * need to readjust.
+     */
+    kvm_irqchip_change_notify();
 }
 
 void spapr_irq_update_active_intc(SpaprMachineState *spapr)
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 5/5] spapr: Work around spurious warnings from vfio INTx initialization
  2019-11-21  0:56 ` David Gibson
@ 2019-11-21  0:56   ` David Gibson
  -1 siblings, 0 replies; 30+ messages in thread
From: David Gibson @ 2019-11-21  0:56 UTC (permalink / raw)
  To: Alex Williamson, clg
  Cc: groug, philmd, qemu-ppc, Paolo Bonzini, Jason Wang,
	Laurent Vivier, David Gibson, kvm, qemu-devel, Riku Voipio,
	Marc-André Lureau, Alexey Kardashevskiy

Traditional PCI INTx for vfio devices can only perform well if using
an in-kernel irqchip.  Therefore, vfio_intx_update() issues a warning
if an in kernel irqchip is not available.

We usually do have an in-kernel irqchip available for pseries machines
on POWER hosts.  However, because the platform allows feature
negotiation of what interrupt controller model to use, we don't
currently initialize it until machine reset.  vfio_intx_update() is
called (first) from vfio_realize() before that, so it can issue a
spurious warning, even if we will have an in kernel irqchip by the
time we need it.

To workaround this, make a call to spapr_irq_update_active_intc() from
spapr_irq_init() which is called at machine realize time, before the
vfio realize.  This call will be pretty much obsoleted by the later
call at reset time, but it serves to suppress the spurious warning
from VFIO.

Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Alexey Kardashevskiy <aik@ozlabs.ru>

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/ppc/spapr_irq.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index 1d27034962..d6bb7fd2d6 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -373,6 +373,14 @@ void spapr_irq_init(SpaprMachineState *spapr, Error **errp)
 
     spapr->qirqs = qemu_allocate_irqs(spapr_set_irq, spapr,
                                       smc->nr_xirqs + SPAPR_XIRQ_BASE);
+
+    /*
+     * Mostly we don't actually need this until reset, except that not
+     * having this set up can cause VFIO devices to issue a
+     * false-positive warning during realize(), because they don't yet
+     * have an in-kernel irq chip.
+     */
+    spapr_irq_update_active_intc(spapr);
 }
 
 int spapr_irq_claim(SpaprMachineState *spapr, int irq, bool lsi, Error **errp)
@@ -528,7 +536,8 @@ void spapr_irq_update_active_intc(SpaprMachineState *spapr)
          * this.
          */
         new_intc = SPAPR_INTC(spapr->xive);
-    } else if (spapr_ovec_test(spapr->ov5_cas, OV5_XIVE_EXPLOIT)) {
+    } else if (spapr->ov5_cas
+               && spapr_ovec_test(spapr->ov5_cas, OV5_XIVE_EXPLOIT)) {
         new_intc = SPAPR_INTC(spapr->xive);
     } else {
         new_intc = SPAPR_INTC(spapr->ics);
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 5/5] spapr: Work around spurious warnings from vfio INTx initialization
@ 2019-11-21  0:56   ` David Gibson
  0 siblings, 0 replies; 30+ messages in thread
From: David Gibson @ 2019-11-21  0:56 UTC (permalink / raw)
  To: Alex Williamson, clg
  Cc: kvm, qemu-devel, Jason Wang, Riku Voipio, Laurent Vivier, groug,
	Alexey Kardashevskiy, qemu-ppc, Marc-André Lureau,
	Paolo Bonzini, philmd, David Gibson

Traditional PCI INTx for vfio devices can only perform well if using
an in-kernel irqchip.  Therefore, vfio_intx_update() issues a warning
if an in kernel irqchip is not available.

We usually do have an in-kernel irqchip available for pseries machines
on POWER hosts.  However, because the platform allows feature
negotiation of what interrupt controller model to use, we don't
currently initialize it until machine reset.  vfio_intx_update() is
called (first) from vfio_realize() before that, so it can issue a
spurious warning, even if we will have an in kernel irqchip by the
time we need it.

To workaround this, make a call to spapr_irq_update_active_intc() from
spapr_irq_init() which is called at machine realize time, before the
vfio realize.  This call will be pretty much obsoleted by the later
call at reset time, but it serves to suppress the spurious warning
from VFIO.

Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Alexey Kardashevskiy <aik@ozlabs.ru>

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/ppc/spapr_irq.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index 1d27034962..d6bb7fd2d6 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -373,6 +373,14 @@ void spapr_irq_init(SpaprMachineState *spapr, Error **errp)
 
     spapr->qirqs = qemu_allocate_irqs(spapr_set_irq, spapr,
                                       smc->nr_xirqs + SPAPR_XIRQ_BASE);
+
+    /*
+     * Mostly we don't actually need this until reset, except that not
+     * having this set up can cause VFIO devices to issue a
+     * false-positive warning during realize(), because they don't yet
+     * have an in-kernel irq chip.
+     */
+    spapr_irq_update_active_intc(spapr);
 }
 
 int spapr_irq_claim(SpaprMachineState *spapr, int irq, bool lsi, Error **errp)
@@ -528,7 +536,8 @@ void spapr_irq_update_active_intc(SpaprMachineState *spapr)
          * this.
          */
         new_intc = SPAPR_INTC(spapr->xive);
-    } else if (spapr_ovec_test(spapr->ov5_cas, OV5_XIVE_EXPLOIT)) {
+    } else if (spapr->ov5_cas
+               && spapr_ovec_test(spapr->ov5_cas, OV5_XIVE_EXPLOIT)) {
         new_intc = SPAPR_INTC(spapr->xive);
     } else {
         new_intc = SPAPR_INTC(spapr->ics);
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH 4/5] spapr: Handle irq backend changes with VFIO PCI devices
  2019-11-21  0:56   ` David Gibson
@ 2019-11-21 16:35     ` Cédric Le Goater
  -1 siblings, 0 replies; 30+ messages in thread
From: Cédric Le Goater @ 2019-11-21 16:35 UTC (permalink / raw)
  To: David Gibson, Alex Williamson
  Cc: groug, philmd, qemu-ppc, Paolo Bonzini, Jason Wang,
	Laurent Vivier, kvm, qemu-devel, Riku Voipio,
	Marc-André Lureau, Alexey Kardashevskiy

On 21/11/2019 01:56, David Gibson wrote:
> pseries machine type can have one of two different interrupt controllers in
> use depending on feature negotiation with the guest.  Usually this is
> invisible to devices, because they route to a common set of qemu_irqs which
> in turn dispatch to the correct back end.
> 
> VFIO passthrough devices, however, wire themselves up directly to the KVM
> irqchip for performance, which means they are affected by this change in
> interrupt controller.  To get them to adjust correctly for the change in
> irqchip, we need to fire the kvm irqchip change notifier.
> 
> Cc: Alex Williamson <alex.williamson@redhat.com>
> Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>


Reviewed-by: Cédric Le Goater <clg@kaod.org>

> ---
>  hw/ppc/spapr_irq.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> index 168044be85..1d27034962 100644
> --- a/hw/ppc/spapr_irq.c
> +++ b/hw/ppc/spapr_irq.c
> @@ -508,6 +508,12 @@ static void set_active_intc(SpaprMachineState *spapr,
>      }
>  
>      spapr->active_intc = new_intc;
> +
> +    /*
> +     * We've changed the kernel irqchip, let VFIO devices know they
> +     * need to readjust.
> +     */
> +    kvm_irqchip_change_notify();
>  }
>  
>  void spapr_irq_update_active_intc(SpaprMachineState *spapr)
> 


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 4/5] spapr: Handle irq backend changes with VFIO PCI devices
@ 2019-11-21 16:35     ` Cédric Le Goater
  0 siblings, 0 replies; 30+ messages in thread
From: Cédric Le Goater @ 2019-11-21 16:35 UTC (permalink / raw)
  To: David Gibson, Alex Williamson
  Cc: kvm, qemu-devel, Jason Wang, Riku Voipio, Laurent Vivier, groug,
	Alexey Kardashevskiy, qemu-ppc, Marc-André Lureau,
	Paolo Bonzini, philmd

On 21/11/2019 01:56, David Gibson wrote:
> pseries machine type can have one of two different interrupt controllers in
> use depending on feature negotiation with the guest.  Usually this is
> invisible to devices, because they route to a common set of qemu_irqs which
> in turn dispatch to the correct back end.
> 
> VFIO passthrough devices, however, wire themselves up directly to the KVM
> irqchip for performance, which means they are affected by this change in
> interrupt controller.  To get them to adjust correctly for the change in
> irqchip, we need to fire the kvm irqchip change notifier.
> 
> Cc: Alex Williamson <alex.williamson@redhat.com>
> Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>


Reviewed-by: Cédric Le Goater <clg@kaod.org>

> ---
>  hw/ppc/spapr_irq.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> index 168044be85..1d27034962 100644
> --- a/hw/ppc/spapr_irq.c
> +++ b/hw/ppc/spapr_irq.c
> @@ -508,6 +508,12 @@ static void set_active_intc(SpaprMachineState *spapr,
>      }
>  
>      spapr->active_intc = new_intc;
> +
> +    /*
> +     * We've changed the kernel irqchip, let VFIO devices know they
> +     * need to readjust.
> +     */
> +    kvm_irqchip_change_notify();
>  }
>  
>  void spapr_irq_update_active_intc(SpaprMachineState *spapr)
> 



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 5/5] spapr: Work around spurious warnings from vfio INTx initialization
  2019-11-21  0:56   ` David Gibson
@ 2019-11-21 16:35     ` Cédric Le Goater
  -1 siblings, 0 replies; 30+ messages in thread
From: Cédric Le Goater @ 2019-11-21 16:35 UTC (permalink / raw)
  To: David Gibson, Alex Williamson
  Cc: groug, philmd, qemu-ppc, Paolo Bonzini, Jason Wang,
	Laurent Vivier, kvm, qemu-devel, Riku Voipio,
	Marc-André Lureau, Alexey Kardashevskiy

On 21/11/2019 01:56, David Gibson wrote:
> Traditional PCI INTx for vfio devices can only perform well if using
> an in-kernel irqchip.  Therefore, vfio_intx_update() issues a warning
> if an in kernel irqchip is not available.
> 
> We usually do have an in-kernel irqchip available for pseries machines
> on POWER hosts.  However, because the platform allows feature
> negotiation of what interrupt controller model to use, we don't
> currently initialize it until machine reset.  vfio_intx_update() is
> called (first) from vfio_realize() before that, so it can issue a
> spurious warning, even if we will have an in kernel irqchip by the
> time we need it.
> 
> To workaround this, make a call to spapr_irq_update_active_intc() from
> spapr_irq_init() which is called at machine realize time, before the
> vfio realize.  This call will be pretty much obsoleted by the later
> call at reset time, but it serves to suppress the spurious warning
> from VFIO.
> 
> Cc: Alex Williamson <alex.williamson@redhat.com>
> Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

Reviewed-by: Cédric Le Goater <clg@kaod.org>

> ---
>  hw/ppc/spapr_irq.c | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> index 1d27034962..d6bb7fd2d6 100644
> --- a/hw/ppc/spapr_irq.c
> +++ b/hw/ppc/spapr_irq.c
> @@ -373,6 +373,14 @@ void spapr_irq_init(SpaprMachineState *spapr, Error **errp)
>  
>      spapr->qirqs = qemu_allocate_irqs(spapr_set_irq, spapr,
>                                        smc->nr_xirqs + SPAPR_XIRQ_BASE);
> +
> +    /*
> +     * Mostly we don't actually need this until reset, except that not
> +     * having this set up can cause VFIO devices to issue a
> +     * false-positive warning during realize(), because they don't yet
> +     * have an in-kernel irq chip.
> +     */
> +    spapr_irq_update_active_intc(spapr);
>  }
>  
>  int spapr_irq_claim(SpaprMachineState *spapr, int irq, bool lsi, Error **errp)
> @@ -528,7 +536,8 @@ void spapr_irq_update_active_intc(SpaprMachineState *spapr)
>           * this.
>           */
>          new_intc = SPAPR_INTC(spapr->xive);
> -    } else if (spapr_ovec_test(spapr->ov5_cas, OV5_XIVE_EXPLOIT)) {
> +    } else if (spapr->ov5_cas
> +               && spapr_ovec_test(spapr->ov5_cas, OV5_XIVE_EXPLOIT)) {
>          new_intc = SPAPR_INTC(spapr->xive);
>      } else {
>          new_intc = SPAPR_INTC(spapr->ics);
> 


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 5/5] spapr: Work around spurious warnings from vfio INTx initialization
@ 2019-11-21 16:35     ` Cédric Le Goater
  0 siblings, 0 replies; 30+ messages in thread
From: Cédric Le Goater @ 2019-11-21 16:35 UTC (permalink / raw)
  To: David Gibson, Alex Williamson
  Cc: kvm, qemu-devel, Jason Wang, Riku Voipio, Laurent Vivier, groug,
	Alexey Kardashevskiy, qemu-ppc, Marc-André Lureau,
	Paolo Bonzini, philmd

On 21/11/2019 01:56, David Gibson wrote:
> Traditional PCI INTx for vfio devices can only perform well if using
> an in-kernel irqchip.  Therefore, vfio_intx_update() issues a warning
> if an in kernel irqchip is not available.
> 
> We usually do have an in-kernel irqchip available for pseries machines
> on POWER hosts.  However, because the platform allows feature
> negotiation of what interrupt controller model to use, we don't
> currently initialize it until machine reset.  vfio_intx_update() is
> called (first) from vfio_realize() before that, so it can issue a
> spurious warning, even if we will have an in kernel irqchip by the
> time we need it.
> 
> To workaround this, make a call to spapr_irq_update_active_intc() from
> spapr_irq_init() which is called at machine realize time, before the
> vfio realize.  This call will be pretty much obsoleted by the later
> call at reset time, but it serves to suppress the spurious warning
> from VFIO.
> 
> Cc: Alex Williamson <alex.williamson@redhat.com>
> Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

Reviewed-by: Cédric Le Goater <clg@kaod.org>

> ---
>  hw/ppc/spapr_irq.c | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> index 1d27034962..d6bb7fd2d6 100644
> --- a/hw/ppc/spapr_irq.c
> +++ b/hw/ppc/spapr_irq.c
> @@ -373,6 +373,14 @@ void spapr_irq_init(SpaprMachineState *spapr, Error **errp)
>  
>      spapr->qirqs = qemu_allocate_irqs(spapr_set_irq, spapr,
>                                        smc->nr_xirqs + SPAPR_XIRQ_BASE);
> +
> +    /*
> +     * Mostly we don't actually need this until reset, except that not
> +     * having this set up can cause VFIO devices to issue a
> +     * false-positive warning during realize(), because they don't yet
> +     * have an in-kernel irq chip.
> +     */
> +    spapr_irq_update_active_intc(spapr);
>  }
>  
>  int spapr_irq_claim(SpaprMachineState *spapr, int irq, bool lsi, Error **errp)
> @@ -528,7 +536,8 @@ void spapr_irq_update_active_intc(SpaprMachineState *spapr)
>           * this.
>           */
>          new_intc = SPAPR_INTC(spapr->xive);
> -    } else if (spapr_ovec_test(spapr->ov5_cas, OV5_XIVE_EXPLOIT)) {
> +    } else if (spapr->ov5_cas
> +               && spapr_ovec_test(spapr->ov5_cas, OV5_XIVE_EXPLOIT)) {
>          new_intc = SPAPR_INTC(spapr->xive);
>      } else {
>          new_intc = SPAPR_INTC(spapr->ics);
> 



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/5] vfio/spapr: Handle changes of master irq chip for VFIO devices
  2019-11-21  0:56 ` David Gibson
@ 2019-11-21 16:57   ` Alex Williamson
  -1 siblings, 0 replies; 30+ messages in thread
From: Alex Williamson @ 2019-11-21 16:57 UTC (permalink / raw)
  To: David Gibson
  Cc: clg, groug, philmd, qemu-ppc, Paolo Bonzini, Jason Wang,
	Laurent Vivier, kvm, qemu-devel, Riku Voipio,
	Marc-André Lureau

On Thu, 21 Nov 2019 11:56:02 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> Due to the way feature negotiation works in PAPR (which is a
> paravirtualized platform), we can end up changing the global irq chip
> at runtime, including it's KVM accelerate model.  That causes
> complications for VFIO devices with INTx, which wire themselves up
> directly to the KVM irqchip for performance.
> 
> This series introduces a new notifier to let VFIO devices (and
> anything else that needs to in the future) know about changes to the
> master irqchip.  It modifies VFIO to respond to the notifier,
> reconnecting itself to the new KVM irqchip as necessary.
> 
> In particular this removes a misleading (though not wholly inaccurate)
> warning that occurs when using VFIO devices on a pseries machine type
> guest.
> 
> Open question: should this go into qemu-4.2 or wait until 5.0?  It's
> has medium complexity / intrusiveness, but it *is* a bugfix that I
> can't see a simpler way to fix.  It's effectively a regression from
> qemu-4.0 to qemu-4.1 (because that introduced XIVE support by
> default), although not from 4.1 to 4.2.

Looks reasonable to me for 4.2, the vfio changes are not as big as they
appear.  If Paolo approves this week, I can send a pull request,
otherwise I can leave my ack for someone else as I'll be on PTO/holiday
next week.  Thanks,

Alex
 
> Changes since RFC:
>  * Fixed some incorrect error paths pointed by aw in 3/5
>  * 5/5 had some problems previously, but they have been obsoleted by
>    other changes merged in the meantime
> 
> David Gibson (5):
>   kvm: Introduce KVM irqchip change notifier
>   vfio/pci: Split vfio_intx_update()
>   vfio/pci: Respond to KVM irqchip change notifier
>   spapr: Handle irq backend changes with VFIO PCI devices
>   spapr: Work around spurious warnings from vfio INTx initialization
> 
>  accel/kvm/kvm-all.c    | 18 ++++++++++++
>  accel/stubs/kvm-stub.c | 12 ++++++++
>  hw/ppc/spapr_irq.c     | 17 +++++++++++-
>  hw/vfio/pci.c          | 62 +++++++++++++++++++++++++++---------------
>  hw/vfio/pci.h          |  1 +
>  include/sysemu/kvm.h   |  5 ++++
>  6 files changed, 92 insertions(+), 23 deletions(-)
> 


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/5] vfio/spapr: Handle changes of master irq chip for VFIO devices
@ 2019-11-21 16:57   ` Alex Williamson
  0 siblings, 0 replies; 30+ messages in thread
From: Alex Williamson @ 2019-11-21 16:57 UTC (permalink / raw)
  To: David Gibson
  Cc: kvm, qemu-devel, Jason Wang, Riku Voipio, groug, Laurent Vivier,
	qemu-ppc, clg, Marc-André Lureau, Paolo Bonzini, philmd

On Thu, 21 Nov 2019 11:56:02 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> Due to the way feature negotiation works in PAPR (which is a
> paravirtualized platform), we can end up changing the global irq chip
> at runtime, including it's KVM accelerate model.  That causes
> complications for VFIO devices with INTx, which wire themselves up
> directly to the KVM irqchip for performance.
> 
> This series introduces a new notifier to let VFIO devices (and
> anything else that needs to in the future) know about changes to the
> master irqchip.  It modifies VFIO to respond to the notifier,
> reconnecting itself to the new KVM irqchip as necessary.
> 
> In particular this removes a misleading (though not wholly inaccurate)
> warning that occurs when using VFIO devices on a pseries machine type
> guest.
> 
> Open question: should this go into qemu-4.2 or wait until 5.0?  It's
> has medium complexity / intrusiveness, but it *is* a bugfix that I
> can't see a simpler way to fix.  It's effectively a regression from
> qemu-4.0 to qemu-4.1 (because that introduced XIVE support by
> default), although not from 4.1 to 4.2.

Looks reasonable to me for 4.2, the vfio changes are not as big as they
appear.  If Paolo approves this week, I can send a pull request,
otherwise I can leave my ack for someone else as I'll be on PTO/holiday
next week.  Thanks,

Alex
 
> Changes since RFC:
>  * Fixed some incorrect error paths pointed by aw in 3/5
>  * 5/5 had some problems previously, but they have been obsoleted by
>    other changes merged in the meantime
> 
> David Gibson (5):
>   kvm: Introduce KVM irqchip change notifier
>   vfio/pci: Split vfio_intx_update()
>   vfio/pci: Respond to KVM irqchip change notifier
>   spapr: Handle irq backend changes with VFIO PCI devices
>   spapr: Work around spurious warnings from vfio INTx initialization
> 
>  accel/kvm/kvm-all.c    | 18 ++++++++++++
>  accel/stubs/kvm-stub.c | 12 ++++++++
>  hw/ppc/spapr_irq.c     | 17 +++++++++++-
>  hw/vfio/pci.c          | 62 +++++++++++++++++++++++++++---------------
>  hw/vfio/pci.h          |  1 +
>  include/sysemu/kvm.h   |  5 ++++
>  6 files changed, 92 insertions(+), 23 deletions(-)
> 



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/5] vfio/spapr: Handle changes of master irq chip for VFIO devices
  2019-11-21 16:57   ` Alex Williamson
@ 2019-11-22  1:18     ` David Gibson
  -1 siblings, 0 replies; 30+ messages in thread
From: David Gibson @ 2019-11-22  1:18 UTC (permalink / raw)
  To: Alex Williamson
  Cc: clg, groug, philmd, qemu-ppc, Paolo Bonzini, Jason Wang,
	Laurent Vivier, kvm, qemu-devel, Riku Voipio,
	Marc-André Lureau

[-- Attachment #1: Type: text/plain, Size: 1895 bytes --]

On Thu, Nov 21, 2019 at 09:57:38AM -0700, Alex Williamson wrote:
> On Thu, 21 Nov 2019 11:56:02 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > Due to the way feature negotiation works in PAPR (which is a
> > paravirtualized platform), we can end up changing the global irq chip
> > at runtime, including it's KVM accelerate model.  That causes
> > complications for VFIO devices with INTx, which wire themselves up
> > directly to the KVM irqchip for performance.
> > 
> > This series introduces a new notifier to let VFIO devices (and
> > anything else that needs to in the future) know about changes to the
> > master irqchip.  It modifies VFIO to respond to the notifier,
> > reconnecting itself to the new KVM irqchip as necessary.
> > 
> > In particular this removes a misleading (though not wholly inaccurate)
> > warning that occurs when using VFIO devices on a pseries machine type
> > guest.
> > 
> > Open question: should this go into qemu-4.2 or wait until 5.0?  It's
> > has medium complexity / intrusiveness, but it *is* a bugfix that I
> > can't see a simpler way to fix.  It's effectively a regression from
> > qemu-4.0 to qemu-4.1 (because that introduced XIVE support by
> > default), although not from 4.1 to 4.2.
> 
> Looks reasonable to me for 4.2, the vfio changes are not as big as they
> appear.  If Paolo approves this week, I can send a pull request,
> otherwise I can leave my ack for someone else as I'll be on PTO/holiday
> next week.  Thanks,

I'm happy to take it through my tree, and expect to be sending a PR in
that timescale, so an ack sounds good.

I've pulled the series into my ppc-for-4.2 branch tentatively.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/5] vfio/spapr: Handle changes of master irq chip for VFIO devices
@ 2019-11-22  1:18     ` David Gibson
  0 siblings, 0 replies; 30+ messages in thread
From: David Gibson @ 2019-11-22  1:18 UTC (permalink / raw)
  To: Alex Williamson
  Cc: kvm, qemu-devel, Jason Wang, Riku Voipio, groug, Laurent Vivier,
	qemu-ppc, clg, Marc-André Lureau, Paolo Bonzini, philmd

[-- Attachment #1: Type: text/plain, Size: 1895 bytes --]

On Thu, Nov 21, 2019 at 09:57:38AM -0700, Alex Williamson wrote:
> On Thu, 21 Nov 2019 11:56:02 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > Due to the way feature negotiation works in PAPR (which is a
> > paravirtualized platform), we can end up changing the global irq chip
> > at runtime, including it's KVM accelerate model.  That causes
> > complications for VFIO devices with INTx, which wire themselves up
> > directly to the KVM irqchip for performance.
> > 
> > This series introduces a new notifier to let VFIO devices (and
> > anything else that needs to in the future) know about changes to the
> > master irqchip.  It modifies VFIO to respond to the notifier,
> > reconnecting itself to the new KVM irqchip as necessary.
> > 
> > In particular this removes a misleading (though not wholly inaccurate)
> > warning that occurs when using VFIO devices on a pseries machine type
> > guest.
> > 
> > Open question: should this go into qemu-4.2 or wait until 5.0?  It's
> > has medium complexity / intrusiveness, but it *is* a bugfix that I
> > can't see a simpler way to fix.  It's effectively a regression from
> > qemu-4.0 to qemu-4.1 (because that introduced XIVE support by
> > default), although not from 4.1 to 4.2.
> 
> Looks reasonable to me for 4.2, the vfio changes are not as big as they
> appear.  If Paolo approves this week, I can send a pull request,
> otherwise I can leave my ack for someone else as I'll be on PTO/holiday
> next week.  Thanks,

I'm happy to take it through my tree, and expect to be sending a PR in
that timescale, so an ack sounds good.

I've pulled the series into my ppc-for-4.2 branch tentatively.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/5] vfio/spapr: Handle changes of master irq chip for VFIO devices
  2019-11-22  1:18     ` David Gibson
@ 2019-11-22  1:28       ` Alex Williamson
  -1 siblings, 0 replies; 30+ messages in thread
From: Alex Williamson @ 2019-11-22  1:28 UTC (permalink / raw)
  To: David Gibson
  Cc: clg, groug, philmd, qemu-ppc, Paolo Bonzini, Jason Wang,
	Laurent Vivier, kvm, qemu-devel, Riku Voipio,
	Marc-André Lureau

On Fri, 22 Nov 2019 12:18:24 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Thu, Nov 21, 2019 at 09:57:38AM -0700, Alex Williamson wrote:
> > On Thu, 21 Nov 2019 11:56:02 +1100
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> >   
> > > Due to the way feature negotiation works in PAPR (which is a
> > > paravirtualized platform), we can end up changing the global irq chip
> > > at runtime, including it's KVM accelerate model.  That causes
> > > complications for VFIO devices with INTx, which wire themselves up
> > > directly to the KVM irqchip for performance.
> > > 
> > > This series introduces a new notifier to let VFIO devices (and
> > > anything else that needs to in the future) know about changes to the
> > > master irqchip.  It modifies VFIO to respond to the notifier,
> > > reconnecting itself to the new KVM irqchip as necessary.
> > > 
> > > In particular this removes a misleading (though not wholly inaccurate)
> > > warning that occurs when using VFIO devices on a pseries machine type
> > > guest.
> > > 
> > > Open question: should this go into qemu-4.2 or wait until 5.0?  It's
> > > has medium complexity / intrusiveness, but it *is* a bugfix that I
> > > can't see a simpler way to fix.  It's effectively a regression from
> > > qemu-4.0 to qemu-4.1 (because that introduced XIVE support by
> > > default), although not from 4.1 to 4.2.  
> > 
> > Looks reasonable to me for 4.2, the vfio changes are not as big as they
> > appear.  If Paolo approves this week, I can send a pull request,
> > otherwise I can leave my ack for someone else as I'll be on PTO/holiday
> > next week.  Thanks,  
> 
> I'm happy to take it through my tree, and expect to be sending a PR in
> that timescale, so an ack sounds good.
> 
> I've pulled the series into my ppc-for-4.2 branch tentatively.
> 

Tested-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/5] vfio/spapr: Handle changes of master irq chip for VFIO devices
@ 2019-11-22  1:28       ` Alex Williamson
  0 siblings, 0 replies; 30+ messages in thread
From: Alex Williamson @ 2019-11-22  1:28 UTC (permalink / raw)
  To: David Gibson
  Cc: kvm, qemu-devel, Jason Wang, Riku Voipio, groug, Laurent Vivier,
	qemu-ppc, clg, Marc-André Lureau, Paolo Bonzini, philmd

On Fri, 22 Nov 2019 12:18:24 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Thu, Nov 21, 2019 at 09:57:38AM -0700, Alex Williamson wrote:
> > On Thu, 21 Nov 2019 11:56:02 +1100
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> >   
> > > Due to the way feature negotiation works in PAPR (which is a
> > > paravirtualized platform), we can end up changing the global irq chip
> > > at runtime, including it's KVM accelerate model.  That causes
> > > complications for VFIO devices with INTx, which wire themselves up
> > > directly to the KVM irqchip for performance.
> > > 
> > > This series introduces a new notifier to let VFIO devices (and
> > > anything else that needs to in the future) know about changes to the
> > > master irqchip.  It modifies VFIO to respond to the notifier,
> > > reconnecting itself to the new KVM irqchip as necessary.
> > > 
> > > In particular this removes a misleading (though not wholly inaccurate)
> > > warning that occurs when using VFIO devices on a pseries machine type
> > > guest.
> > > 
> > > Open question: should this go into qemu-4.2 or wait until 5.0?  It's
> > > has medium complexity / intrusiveness, but it *is* a bugfix that I
> > > can't see a simpler way to fix.  It's effectively a regression from
> > > qemu-4.0 to qemu-4.1 (because that introduced XIVE support by
> > > default), although not from 4.1 to 4.2.  
> > 
> > Looks reasonable to me for 4.2, the vfio changes are not as big as they
> > appear.  If Paolo approves this week, I can send a pull request,
> > otherwise I can leave my ack for someone else as I'll be on PTO/holiday
> > next week.  Thanks,  
> 
> I'm happy to take it through my tree, and expect to be sending a PR in
> that timescale, so an ack sounds good.
> 
> I've pulled the series into my ppc-for-4.2 branch tentatively.
> 

Tested-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/5] vfio/spapr: Handle changes of master irq chip for VFIO devices
  2019-11-22  1:28       ` Alex Williamson
@ 2019-11-22  1:35         ` David Gibson
  -1 siblings, 0 replies; 30+ messages in thread
From: David Gibson @ 2019-11-22  1:35 UTC (permalink / raw)
  To: Alex Williamson
  Cc: clg, groug, philmd, qemu-ppc, Paolo Bonzini, Jason Wang,
	Laurent Vivier, kvm, qemu-devel, Riku Voipio,
	Marc-André Lureau

[-- Attachment #1: Type: text/plain, Size: 2391 bytes --]

On Thu, Nov 21, 2019 at 06:28:07PM -0700, Alex Williamson wrote:
> On Fri, 22 Nov 2019 12:18:24 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > On Thu, Nov 21, 2019 at 09:57:38AM -0700, Alex Williamson wrote:
> > > On Thu, 21 Nov 2019 11:56:02 +1100
> > > David Gibson <david@gibson.dropbear.id.au> wrote:
> > >   
> > > > Due to the way feature negotiation works in PAPR (which is a
> > > > paravirtualized platform), we can end up changing the global irq chip
> > > > at runtime, including it's KVM accelerate model.  That causes
> > > > complications for VFIO devices with INTx, which wire themselves up
> > > > directly to the KVM irqchip for performance.
> > > > 
> > > > This series introduces a new notifier to let VFIO devices (and
> > > > anything else that needs to in the future) know about changes to the
> > > > master irqchip.  It modifies VFIO to respond to the notifier,
> > > > reconnecting itself to the new KVM irqchip as necessary.
> > > > 
> > > > In particular this removes a misleading (though not wholly inaccurate)
> > > > warning that occurs when using VFIO devices on a pseries machine type
> > > > guest.
> > > > 
> > > > Open question: should this go into qemu-4.2 or wait until 5.0?  It's
> > > > has medium complexity / intrusiveness, but it *is* a bugfix that I
> > > > can't see a simpler way to fix.  It's effectively a regression from
> > > > qemu-4.0 to qemu-4.1 (because that introduced XIVE support by
> > > > default), although not from 4.1 to 4.2.  
> > > 
> > > Looks reasonable to me for 4.2, the vfio changes are not as big as they
> > > appear.  If Paolo approves this week, I can send a pull request,
> > > otherwise I can leave my ack for someone else as I'll be on PTO/holiday
> > > next week.  Thanks,  
> > 
> > I'm happy to take it through my tree, and expect to be sending a PR in
> > that timescale, so an ack sounds good.
> > 
> > I've pulled the series into my ppc-for-4.2 branch tentatively.
> 
> Tested-by: Alex Williamson <alex.williamson@redhat.com>
> Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
> Acked-by: Alex Williamson <alex.williamson@redhat.com>

Thanks!

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/5] vfio/spapr: Handle changes of master irq chip for VFIO devices
@ 2019-11-22  1:35         ` David Gibson
  0 siblings, 0 replies; 30+ messages in thread
From: David Gibson @ 2019-11-22  1:35 UTC (permalink / raw)
  To: Alex Williamson
  Cc: kvm, qemu-devel, Jason Wang, Riku Voipio, groug, Laurent Vivier,
	qemu-ppc, clg, Marc-André Lureau, Paolo Bonzini, philmd

[-- Attachment #1: Type: text/plain, Size: 2391 bytes --]

On Thu, Nov 21, 2019 at 06:28:07PM -0700, Alex Williamson wrote:
> On Fri, 22 Nov 2019 12:18:24 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > On Thu, Nov 21, 2019 at 09:57:38AM -0700, Alex Williamson wrote:
> > > On Thu, 21 Nov 2019 11:56:02 +1100
> > > David Gibson <david@gibson.dropbear.id.au> wrote:
> > >   
> > > > Due to the way feature negotiation works in PAPR (which is a
> > > > paravirtualized platform), we can end up changing the global irq chip
> > > > at runtime, including it's KVM accelerate model.  That causes
> > > > complications for VFIO devices with INTx, which wire themselves up
> > > > directly to the KVM irqchip for performance.
> > > > 
> > > > This series introduces a new notifier to let VFIO devices (and
> > > > anything else that needs to in the future) know about changes to the
> > > > master irqchip.  It modifies VFIO to respond to the notifier,
> > > > reconnecting itself to the new KVM irqchip as necessary.
> > > > 
> > > > In particular this removes a misleading (though not wholly inaccurate)
> > > > warning that occurs when using VFIO devices on a pseries machine type
> > > > guest.
> > > > 
> > > > Open question: should this go into qemu-4.2 or wait until 5.0?  It's
> > > > has medium complexity / intrusiveness, but it *is* a bugfix that I
> > > > can't see a simpler way to fix.  It's effectively a regression from
> > > > qemu-4.0 to qemu-4.1 (because that introduced XIVE support by
> > > > default), although not from 4.1 to 4.2.  
> > > 
> > > Looks reasonable to me for 4.2, the vfio changes are not as big as they
> > > appear.  If Paolo approves this week, I can send a pull request,
> > > otherwise I can leave my ack for someone else as I'll be on PTO/holiday
> > > next week.  Thanks,  
> > 
> > I'm happy to take it through my tree, and expect to be sending a PR in
> > that timescale, so an ack sounds good.
> > 
> > I've pulled the series into my ppc-for-4.2 branch tentatively.
> 
> Tested-by: Alex Williamson <alex.williamson@redhat.com>
> Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
> Acked-by: Alex Williamson <alex.williamson@redhat.com>

Thanks!

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3/5] vfio/pci: Respond to KVM irqchip change notifier
  2019-11-21  0:56   ` David Gibson
@ 2019-11-22  5:12     ` Greg Kurz
  -1 siblings, 0 replies; 30+ messages in thread
From: Greg Kurz @ 2019-11-22  5:12 UTC (permalink / raw)
  To: David Gibson
  Cc: Alex Williamson, clg, philmd, qemu-ppc, Paolo Bonzini,
	Jason Wang, Laurent Vivier, kvm, qemu-devel, Riku Voipio,
	Marc-André Lureau, Alexey Kardashevskiy

On Thu, 21 Nov 2019 11:56:05 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> VFIO PCI devices already respond to the pci intx routing notifier, in order
> to update kernel irqchip mappings when routing is updated.  However this
> won't handle the case where the irqchip itself is replaced by a different
> model while retaining the same routing.  This case can happen on
> the pseries machine type due to PAPR feature negotiation.
> 
> To handle that case, add a handler for the irqchip change notifier, which
> does much the same thing as the routing notifier, but is unconditional,
> rather than being a no-op when the routing hasn't changed.
> 
> Cc: Alex Williamson <alex.williamson@redhat.com>
> Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  hw/vfio/pci.c | 23 ++++++++++++++++++-----
>  hw/vfio/pci.h |  1 +
>  2 files changed, 19 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> index 521289aa7d..95478c2c55 100644
> --- a/hw/vfio/pci.c
> +++ b/hw/vfio/pci.c
> @@ -256,6 +256,14 @@ static void vfio_intx_routing_notifier(PCIDevice *pdev)
>      }
>  }
>  
> +static void vfio_irqchip_change(Notifier *notify, void *data)
> +{
> +    VFIOPCIDevice *vdev = container_of(notify, VFIOPCIDevice,
> +                                       irqchip_change_notifier);
> +
> +    vfio_intx_update(vdev, &vdev->intx.route);
> +}
> +
>  static int vfio_intx_enable(VFIOPCIDevice *vdev, Error **errp)
>  {
>      uint8_t pin = vfio_pci_read_config(&vdev->pdev, PCI_INTERRUPT_PIN, 1);
> @@ -2973,16 +2981,18 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
>                                                    vfio_intx_mmap_enable, vdev);
>          pci_device_set_intx_routing_notifier(&vdev->pdev,
>                                               vfio_intx_routing_notifier);
> +        vdev->irqchip_change_notifier.notify = vfio_irqchip_change;
> +        kvm_irqchip_add_change_notifier(&vdev->irqchip_change_notifier);
>          ret = vfio_intx_enable(vdev, errp);
>          if (ret) {
> -            goto out_teardown;
> +            goto out_deregister;
>          }
>      }
>  
>      if (vdev->display != ON_OFF_AUTO_OFF) {
>          ret = vfio_display_probe(vdev, errp);
>          if (ret) {
> -            goto out_teardown;
> +            goto out_deregister;
>          }
>      }
>      if (vdev->enable_ramfb && vdev->dpy == NULL) {
> @@ -2992,11 +3002,11 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
>      if (vdev->display_xres || vdev->display_yres) {
>          if (vdev->dpy == NULL) {
>              error_setg(errp, "xres and yres properties require display=on");
> -            goto out_teardown;
> +            goto out_deregister;
>          }
>          if (vdev->dpy->edid_regs == NULL) {
>              error_setg(errp, "xres and yres properties need edid support");
> -            goto out_teardown;
> +            goto out_deregister;
>          }
>      }
>  

After this change, we end up with:

    if (vfio_pci_read_config(&vdev->pdev, PCI_INTERRUPT_PIN, 1)) {
        vdev->intx.mmap_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL,
                                                  vfio_intx_mmap_enable, vdev);
        pci_device_set_intx_routing_notifier(&vdev->pdev,
                                             vfio_intx_routing_notifier);
        vdev->irqchip_change_notifier.notify = vfio_irqchip_change;
        kvm_irqchip_add_change_notifier(&vdev->irqchip_change_notifier);
        ret = vfio_intx_enable(vdev, errp);
        if (ret) {
            goto out_deregister;
        }
    }

    if (vdev->display != ON_OFF_AUTO_OFF) {
        ret = vfio_display_probe(vdev, errp);
        if (ret) {
            goto out_deregister;
        }
    }
    if (vdev->enable_ramfb && vdev->dpy == NULL) {
        error_setg(errp, "ramfb=on requires display=on");
        goto out_teardown;
             ^^^^^^^^^^^^

This should be out_deregister.

The enable_ramfb property belongs to the nohotplug variant. It
means QEMU is going to terminate and we probably don't really
care to leak notifiers, but this still looks weird and fragile,
if enable_ramfb ever becomes usable by hotpluggable devices
one day.

    }
    if (vdev->display_xres || vdev->display_yres) {
        if (vdev->dpy == NULL) {
            error_setg(errp, "xres and yres properties require display=on");
            goto out_deregister;
        }
        if (vdev->dpy->edid_regs == NULL) {
            error_setg(errp, "xres and yres properties need edid support");
            goto out_deregister;
        }
    }


> @@ -3020,8 +3030,10 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
>  
>      return;
>  
> -out_teardown:
> +out_deregister:
>      pci_device_set_intx_routing_notifier(&vdev->pdev, NULL);
> +    kvm_irqchip_remove_change_notifier(&vdev->irqchip_change_notifier);
> +out_teardown:
>      vfio_teardown_msi(vdev);
>      vfio_bars_exit(vdev);
>  error:
> @@ -3064,6 +3076,7 @@ static void vfio_exitfn(PCIDevice *pdev)
>      vfio_unregister_req_notifier(vdev);
>      vfio_unregister_err_notifier(vdev);
>      pci_device_set_intx_routing_notifier(&vdev->pdev, NULL);
> +    kvm_irqchip_remove_change_notifier(&vdev->irqchip_change_notifier);
>      vfio_disable_interrupts(vdev);
>      if (vdev->intx.mmap_timer) {
>          timer_free(vdev->intx.mmap_timer);
> diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
> index b329d50338..35626cd63e 100644
> --- a/hw/vfio/pci.h
> +++ b/hw/vfio/pci.h
> @@ -169,6 +169,7 @@ typedef struct VFIOPCIDevice {
>      bool enable_ramfb;
>      VFIODisplay *dpy;
>      Error *migration_blocker;
> +    Notifier irqchip_change_notifier;
>  } VFIOPCIDevice;
>  
>  uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len);


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3/5] vfio/pci: Respond to KVM irqchip change notifier
@ 2019-11-22  5:12     ` Greg Kurz
  0 siblings, 0 replies; 30+ messages in thread
From: Greg Kurz @ 2019-11-22  5:12 UTC (permalink / raw)
  To: David Gibson
  Cc: kvm, Alexey Kardashevskiy, Jason Wang, Riku Voipio,
	Laurent Vivier, qemu-devel, Alex Williamson, qemu-ppc, clg,
	Marc-André Lureau, Paolo Bonzini, philmd

On Thu, 21 Nov 2019 11:56:05 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> VFIO PCI devices already respond to the pci intx routing notifier, in order
> to update kernel irqchip mappings when routing is updated.  However this
> won't handle the case where the irqchip itself is replaced by a different
> model while retaining the same routing.  This case can happen on
> the pseries machine type due to PAPR feature negotiation.
> 
> To handle that case, add a handler for the irqchip change notifier, which
> does much the same thing as the routing notifier, but is unconditional,
> rather than being a no-op when the routing hasn't changed.
> 
> Cc: Alex Williamson <alex.williamson@redhat.com>
> Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  hw/vfio/pci.c | 23 ++++++++++++++++++-----
>  hw/vfio/pci.h |  1 +
>  2 files changed, 19 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> index 521289aa7d..95478c2c55 100644
> --- a/hw/vfio/pci.c
> +++ b/hw/vfio/pci.c
> @@ -256,6 +256,14 @@ static void vfio_intx_routing_notifier(PCIDevice *pdev)
>      }
>  }
>  
> +static void vfio_irqchip_change(Notifier *notify, void *data)
> +{
> +    VFIOPCIDevice *vdev = container_of(notify, VFIOPCIDevice,
> +                                       irqchip_change_notifier);
> +
> +    vfio_intx_update(vdev, &vdev->intx.route);
> +}
> +
>  static int vfio_intx_enable(VFIOPCIDevice *vdev, Error **errp)
>  {
>      uint8_t pin = vfio_pci_read_config(&vdev->pdev, PCI_INTERRUPT_PIN, 1);
> @@ -2973,16 +2981,18 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
>                                                    vfio_intx_mmap_enable, vdev);
>          pci_device_set_intx_routing_notifier(&vdev->pdev,
>                                               vfio_intx_routing_notifier);
> +        vdev->irqchip_change_notifier.notify = vfio_irqchip_change;
> +        kvm_irqchip_add_change_notifier(&vdev->irqchip_change_notifier);
>          ret = vfio_intx_enable(vdev, errp);
>          if (ret) {
> -            goto out_teardown;
> +            goto out_deregister;
>          }
>      }
>  
>      if (vdev->display != ON_OFF_AUTO_OFF) {
>          ret = vfio_display_probe(vdev, errp);
>          if (ret) {
> -            goto out_teardown;
> +            goto out_deregister;
>          }
>      }
>      if (vdev->enable_ramfb && vdev->dpy == NULL) {
> @@ -2992,11 +3002,11 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
>      if (vdev->display_xres || vdev->display_yres) {
>          if (vdev->dpy == NULL) {
>              error_setg(errp, "xres and yres properties require display=on");
> -            goto out_teardown;
> +            goto out_deregister;
>          }
>          if (vdev->dpy->edid_regs == NULL) {
>              error_setg(errp, "xres and yres properties need edid support");
> -            goto out_teardown;
> +            goto out_deregister;
>          }
>      }
>  

After this change, we end up with:

    if (vfio_pci_read_config(&vdev->pdev, PCI_INTERRUPT_PIN, 1)) {
        vdev->intx.mmap_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL,
                                                  vfio_intx_mmap_enable, vdev);
        pci_device_set_intx_routing_notifier(&vdev->pdev,
                                             vfio_intx_routing_notifier);
        vdev->irqchip_change_notifier.notify = vfio_irqchip_change;
        kvm_irqchip_add_change_notifier(&vdev->irqchip_change_notifier);
        ret = vfio_intx_enable(vdev, errp);
        if (ret) {
            goto out_deregister;
        }
    }

    if (vdev->display != ON_OFF_AUTO_OFF) {
        ret = vfio_display_probe(vdev, errp);
        if (ret) {
            goto out_deregister;
        }
    }
    if (vdev->enable_ramfb && vdev->dpy == NULL) {
        error_setg(errp, "ramfb=on requires display=on");
        goto out_teardown;
             ^^^^^^^^^^^^

This should be out_deregister.

The enable_ramfb property belongs to the nohotplug variant. It
means QEMU is going to terminate and we probably don't really
care to leak notifiers, but this still looks weird and fragile,
if enable_ramfb ever becomes usable by hotpluggable devices
one day.

    }
    if (vdev->display_xres || vdev->display_yres) {
        if (vdev->dpy == NULL) {
            error_setg(errp, "xres and yres properties require display=on");
            goto out_deregister;
        }
        if (vdev->dpy->edid_regs == NULL) {
            error_setg(errp, "xres and yres properties need edid support");
            goto out_deregister;
        }
    }


> @@ -3020,8 +3030,10 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
>  
>      return;
>  
> -out_teardown:
> +out_deregister:
>      pci_device_set_intx_routing_notifier(&vdev->pdev, NULL);
> +    kvm_irqchip_remove_change_notifier(&vdev->irqchip_change_notifier);
> +out_teardown:
>      vfio_teardown_msi(vdev);
>      vfio_bars_exit(vdev);
>  error:
> @@ -3064,6 +3076,7 @@ static void vfio_exitfn(PCIDevice *pdev)
>      vfio_unregister_req_notifier(vdev);
>      vfio_unregister_err_notifier(vdev);
>      pci_device_set_intx_routing_notifier(&vdev->pdev, NULL);
> +    kvm_irqchip_remove_change_notifier(&vdev->irqchip_change_notifier);
>      vfio_disable_interrupts(vdev);
>      if (vdev->intx.mmap_timer) {
>          timer_free(vdev->intx.mmap_timer);
> diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
> index b329d50338..35626cd63e 100644
> --- a/hw/vfio/pci.h
> +++ b/hw/vfio/pci.h
> @@ -169,6 +169,7 @@ typedef struct VFIOPCIDevice {
>      bool enable_ramfb;
>      VFIODisplay *dpy;
>      Error *migration_blocker;
> +    Notifier irqchip_change_notifier;
>  } VFIOPCIDevice;
>  
>  uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len);



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3/5] vfio/pci: Respond to KVM irqchip change notifier
  2019-11-22  5:12     ` Greg Kurz
@ 2019-11-22  5:50       ` David Gibson
  -1 siblings, 0 replies; 30+ messages in thread
From: David Gibson @ 2019-11-22  5:50 UTC (permalink / raw)
  To: Greg Kurz
  Cc: Alex Williamson, clg, philmd, qemu-ppc, Paolo Bonzini,
	Jason Wang, Laurent Vivier, kvm, qemu-devel, Riku Voipio,
	Marc-André Lureau, Alexey Kardashevskiy

[-- Attachment #1: Type: text/plain, Size: 6587 bytes --]

On Fri, Nov 22, 2019 at 06:12:57AM +0100, Greg Kurz wrote:
> On Thu, 21 Nov 2019 11:56:05 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > VFIO PCI devices already respond to the pci intx routing notifier, in order
> > to update kernel irqchip mappings when routing is updated.  However this
> > won't handle the case where the irqchip itself is replaced by a different
> > model while retaining the same routing.  This case can happen on
> > the pseries machine type due to PAPR feature negotiation.
> > 
> > To handle that case, add a handler for the irqchip change notifier, which
> > does much the same thing as the routing notifier, but is unconditional,
> > rather than being a no-op when the routing hasn't changed.
> > 
> > Cc: Alex Williamson <alex.williamson@redhat.com>
> > Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
> > 
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> >  hw/vfio/pci.c | 23 ++++++++++++++++++-----
> >  hw/vfio/pci.h |  1 +
> >  2 files changed, 19 insertions(+), 5 deletions(-)
> > 
> > diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> > index 521289aa7d..95478c2c55 100644
> > --- a/hw/vfio/pci.c
> > +++ b/hw/vfio/pci.c
> > @@ -256,6 +256,14 @@ static void vfio_intx_routing_notifier(PCIDevice *pdev)
> >      }
> >  }
> >  
> > +static void vfio_irqchip_change(Notifier *notify, void *data)
> > +{
> > +    VFIOPCIDevice *vdev = container_of(notify, VFIOPCIDevice,
> > +                                       irqchip_change_notifier);
> > +
> > +    vfio_intx_update(vdev, &vdev->intx.route);
> > +}
> > +
> >  static int vfio_intx_enable(VFIOPCIDevice *vdev, Error **errp)
> >  {
> >      uint8_t pin = vfio_pci_read_config(&vdev->pdev, PCI_INTERRUPT_PIN, 1);
> > @@ -2973,16 +2981,18 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
> >                                                    vfio_intx_mmap_enable, vdev);
> >          pci_device_set_intx_routing_notifier(&vdev->pdev,
> >                                               vfio_intx_routing_notifier);
> > +        vdev->irqchip_change_notifier.notify = vfio_irqchip_change;
> > +        kvm_irqchip_add_change_notifier(&vdev->irqchip_change_notifier);
> >          ret = vfio_intx_enable(vdev, errp);
> >          if (ret) {
> > -            goto out_teardown;
> > +            goto out_deregister;
> >          }
> >      }
> >  
> >      if (vdev->display != ON_OFF_AUTO_OFF) {
> >          ret = vfio_display_probe(vdev, errp);
> >          if (ret) {
> > -            goto out_teardown;
> > +            goto out_deregister;
> >          }
> >      }
> >      if (vdev->enable_ramfb && vdev->dpy == NULL) {
> > @@ -2992,11 +3002,11 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
> >      if (vdev->display_xres || vdev->display_yres) {
> >          if (vdev->dpy == NULL) {
> >              error_setg(errp, "xres and yres properties require display=on");
> > -            goto out_teardown;
> > +            goto out_deregister;
> >          }
> >          if (vdev->dpy->edid_regs == NULL) {
> >              error_setg(errp, "xres and yres properties need edid support");
> > -            goto out_teardown;
> > +            goto out_deregister;
> >          }
> >      }
> >  
> 
> After this change, we end up with:
> 
>     if (vfio_pci_read_config(&vdev->pdev, PCI_INTERRUPT_PIN, 1)) {
>         vdev->intx.mmap_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL,
>                                                   vfio_intx_mmap_enable, vdev);
>         pci_device_set_intx_routing_notifier(&vdev->pdev,
>                                              vfio_intx_routing_notifier);
>         vdev->irqchip_change_notifier.notify = vfio_irqchip_change;
>         kvm_irqchip_add_change_notifier(&vdev->irqchip_change_notifier);
>         ret = vfio_intx_enable(vdev, errp);
>         if (ret) {
>             goto out_deregister;
>         }
>     }
> 
>     if (vdev->display != ON_OFF_AUTO_OFF) {
>         ret = vfio_display_probe(vdev, errp);
>         if (ret) {
>             goto out_deregister;
>         }
>     }
>     if (vdev->enable_ramfb && vdev->dpy == NULL) {
>         error_setg(errp, "ramfb=on requires display=on");
>         goto out_teardown;
>              ^^^^^^^^^^^^
> 
> This should be out_deregister.

Oops, fixed in my tree.

> The enable_ramfb property belongs to the nohotplug variant. It
> means QEMU is going to terminate and we probably don't really
> care to leak notifiers, but this still looks weird and fragile,
> if enable_ramfb ever becomes usable by hotpluggable devices
> one day.
> 
>     }
>     if (vdev->display_xres || vdev->display_yres) {
>         if (vdev->dpy == NULL) {
>             error_setg(errp, "xres and yres properties require display=on");
>             goto out_deregister;
>         }
>         if (vdev->dpy->edid_regs == NULL) {
>             error_setg(errp, "xres and yres properties need edid support");
>             goto out_deregister;
>         }
>     }
> 
> 
> > @@ -3020,8 +3030,10 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
> >  
> >      return;
> >  
> > -out_teardown:
> > +out_deregister:
> >      pci_device_set_intx_routing_notifier(&vdev->pdev, NULL);
> > +    kvm_irqchip_remove_change_notifier(&vdev->irqchip_change_notifier);
> > +out_teardown:
> >      vfio_teardown_msi(vdev);
> >      vfio_bars_exit(vdev);
> >  error:
> > @@ -3064,6 +3076,7 @@ static void vfio_exitfn(PCIDevice *pdev)
> >      vfio_unregister_req_notifier(vdev);
> >      vfio_unregister_err_notifier(vdev);
> >      pci_device_set_intx_routing_notifier(&vdev->pdev, NULL);
> > +    kvm_irqchip_remove_change_notifier(&vdev->irqchip_change_notifier);
> >      vfio_disable_interrupts(vdev);
> >      if (vdev->intx.mmap_timer) {
> >          timer_free(vdev->intx.mmap_timer);
> > diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
> > index b329d50338..35626cd63e 100644
> > --- a/hw/vfio/pci.h
> > +++ b/hw/vfio/pci.h
> > @@ -169,6 +169,7 @@ typedef struct VFIOPCIDevice {
> >      bool enable_ramfb;
> >      VFIODisplay *dpy;
> >      Error *migration_blocker;
> > +    Notifier irqchip_change_notifier;
> >  } VFIOPCIDevice;
> >  
> >  uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len);
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3/5] vfio/pci: Respond to KVM irqchip change notifier
@ 2019-11-22  5:50       ` David Gibson
  0 siblings, 0 replies; 30+ messages in thread
From: David Gibson @ 2019-11-22  5:50 UTC (permalink / raw)
  To: Greg Kurz
  Cc: kvm, Alexey Kardashevskiy, Jason Wang, Riku Voipio,
	Laurent Vivier, qemu-devel, Alex Williamson, qemu-ppc, clg,
	Marc-André Lureau, Paolo Bonzini, philmd

[-- Attachment #1: Type: text/plain, Size: 6587 bytes --]

On Fri, Nov 22, 2019 at 06:12:57AM +0100, Greg Kurz wrote:
> On Thu, 21 Nov 2019 11:56:05 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > VFIO PCI devices already respond to the pci intx routing notifier, in order
> > to update kernel irqchip mappings when routing is updated.  However this
> > won't handle the case where the irqchip itself is replaced by a different
> > model while retaining the same routing.  This case can happen on
> > the pseries machine type due to PAPR feature negotiation.
> > 
> > To handle that case, add a handler for the irqchip change notifier, which
> > does much the same thing as the routing notifier, but is unconditional,
> > rather than being a no-op when the routing hasn't changed.
> > 
> > Cc: Alex Williamson <alex.williamson@redhat.com>
> > Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
> > 
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> >  hw/vfio/pci.c | 23 ++++++++++++++++++-----
> >  hw/vfio/pci.h |  1 +
> >  2 files changed, 19 insertions(+), 5 deletions(-)
> > 
> > diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> > index 521289aa7d..95478c2c55 100644
> > --- a/hw/vfio/pci.c
> > +++ b/hw/vfio/pci.c
> > @@ -256,6 +256,14 @@ static void vfio_intx_routing_notifier(PCIDevice *pdev)
> >      }
> >  }
> >  
> > +static void vfio_irqchip_change(Notifier *notify, void *data)
> > +{
> > +    VFIOPCIDevice *vdev = container_of(notify, VFIOPCIDevice,
> > +                                       irqchip_change_notifier);
> > +
> > +    vfio_intx_update(vdev, &vdev->intx.route);
> > +}
> > +
> >  static int vfio_intx_enable(VFIOPCIDevice *vdev, Error **errp)
> >  {
> >      uint8_t pin = vfio_pci_read_config(&vdev->pdev, PCI_INTERRUPT_PIN, 1);
> > @@ -2973,16 +2981,18 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
> >                                                    vfio_intx_mmap_enable, vdev);
> >          pci_device_set_intx_routing_notifier(&vdev->pdev,
> >                                               vfio_intx_routing_notifier);
> > +        vdev->irqchip_change_notifier.notify = vfio_irqchip_change;
> > +        kvm_irqchip_add_change_notifier(&vdev->irqchip_change_notifier);
> >          ret = vfio_intx_enable(vdev, errp);
> >          if (ret) {
> > -            goto out_teardown;
> > +            goto out_deregister;
> >          }
> >      }
> >  
> >      if (vdev->display != ON_OFF_AUTO_OFF) {
> >          ret = vfio_display_probe(vdev, errp);
> >          if (ret) {
> > -            goto out_teardown;
> > +            goto out_deregister;
> >          }
> >      }
> >      if (vdev->enable_ramfb && vdev->dpy == NULL) {
> > @@ -2992,11 +3002,11 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
> >      if (vdev->display_xres || vdev->display_yres) {
> >          if (vdev->dpy == NULL) {
> >              error_setg(errp, "xres and yres properties require display=on");
> > -            goto out_teardown;
> > +            goto out_deregister;
> >          }
> >          if (vdev->dpy->edid_regs == NULL) {
> >              error_setg(errp, "xres and yres properties need edid support");
> > -            goto out_teardown;
> > +            goto out_deregister;
> >          }
> >      }
> >  
> 
> After this change, we end up with:
> 
>     if (vfio_pci_read_config(&vdev->pdev, PCI_INTERRUPT_PIN, 1)) {
>         vdev->intx.mmap_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL,
>                                                   vfio_intx_mmap_enable, vdev);
>         pci_device_set_intx_routing_notifier(&vdev->pdev,
>                                              vfio_intx_routing_notifier);
>         vdev->irqchip_change_notifier.notify = vfio_irqchip_change;
>         kvm_irqchip_add_change_notifier(&vdev->irqchip_change_notifier);
>         ret = vfio_intx_enable(vdev, errp);
>         if (ret) {
>             goto out_deregister;
>         }
>     }
> 
>     if (vdev->display != ON_OFF_AUTO_OFF) {
>         ret = vfio_display_probe(vdev, errp);
>         if (ret) {
>             goto out_deregister;
>         }
>     }
>     if (vdev->enable_ramfb && vdev->dpy == NULL) {
>         error_setg(errp, "ramfb=on requires display=on");
>         goto out_teardown;
>              ^^^^^^^^^^^^
> 
> This should be out_deregister.

Oops, fixed in my tree.

> The enable_ramfb property belongs to the nohotplug variant. It
> means QEMU is going to terminate and we probably don't really
> care to leak notifiers, but this still looks weird and fragile,
> if enable_ramfb ever becomes usable by hotpluggable devices
> one day.
> 
>     }
>     if (vdev->display_xres || vdev->display_yres) {
>         if (vdev->dpy == NULL) {
>             error_setg(errp, "xres and yres properties require display=on");
>             goto out_deregister;
>         }
>         if (vdev->dpy->edid_regs == NULL) {
>             error_setg(errp, "xres and yres properties need edid support");
>             goto out_deregister;
>         }
>     }
> 
> 
> > @@ -3020,8 +3030,10 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
> >  
> >      return;
> >  
> > -out_teardown:
> > +out_deregister:
> >      pci_device_set_intx_routing_notifier(&vdev->pdev, NULL);
> > +    kvm_irqchip_remove_change_notifier(&vdev->irqchip_change_notifier);
> > +out_teardown:
> >      vfio_teardown_msi(vdev);
> >      vfio_bars_exit(vdev);
> >  error:
> > @@ -3064,6 +3076,7 @@ static void vfio_exitfn(PCIDevice *pdev)
> >      vfio_unregister_req_notifier(vdev);
> >      vfio_unregister_err_notifier(vdev);
> >      pci_device_set_intx_routing_notifier(&vdev->pdev, NULL);
> > +    kvm_irqchip_remove_change_notifier(&vdev->irqchip_change_notifier);
> >      vfio_disable_interrupts(vdev);
> >      if (vdev->intx.mmap_timer) {
> >          timer_free(vdev->intx.mmap_timer);
> > diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
> > index b329d50338..35626cd63e 100644
> > --- a/hw/vfio/pci.h
> > +++ b/hw/vfio/pci.h
> > @@ -169,6 +169,7 @@ typedef struct VFIOPCIDevice {
> >      bool enable_ramfb;
> >      VFIODisplay *dpy;
> >      Error *migration_blocker;
> > +    Notifier irqchip_change_notifier;
> >  } VFIOPCIDevice;
> >  
> >  uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len);
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/5] vfio/spapr: Handle changes of master irq chip for VFIO devices
  2019-11-21  0:56 ` David Gibson
@ 2019-11-22  6:09   ` Greg Kurz
  -1 siblings, 0 replies; 30+ messages in thread
From: Greg Kurz @ 2019-11-22  6:09 UTC (permalink / raw)
  To: David Gibson
  Cc: Alex Williamson, clg, philmd, qemu-ppc, Paolo Bonzini,
	Jason Wang, Laurent Vivier, kvm, qemu-devel, Riku Voipio,
	Marc-André Lureau

On Thu, 21 Nov 2019 11:56:02 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> Due to the way feature negotiation works in PAPR (which is a
> paravirtualized platform), we can end up changing the global irq chip
> at runtime, including it's KVM accelerate model.  That causes
> complications for VFIO devices with INTx, which wire themselves up
> directly to the KVM irqchip for performance.
> 
> This series introduces a new notifier to let VFIO devices (and
> anything else that needs to in the future) know about changes to the
> master irqchip.  It modifies VFIO to respond to the notifier,
> reconnecting itself to the new KVM irqchip as necessary.
> 
> In particular this removes a misleading (though not wholly inaccurate)
> warning that occurs when using VFIO devices on a pseries machine type
> guest.
> 
> Open question: should this go into qemu-4.2 or wait until 5.0?  It's
> has medium complexity / intrusiveness, but it *is* a bugfix that I
> can't see a simpler way to fix.  It's effectively a regression from
> qemu-4.0 to qemu-4.1 (because that introduced XIVE support by
> default), although not from 4.1 to 4.2.
> 
> Changes since RFC:
>  * Fixed some incorrect error paths pointed by aw in 3/5
>  * 5/5 had some problems previously, but they have been obsoleted by
>    other changes merged in the meantime
> 
> David Gibson (5):
>   kvm: Introduce KVM irqchip change notifier
>   vfio/pci: Split vfio_intx_update()
>   vfio/pci: Respond to KVM irqchip change notifier
>   spapr: Handle irq backend changes with VFIO PCI devices
>   spapr: Work around spurious warnings from vfio INTx initialization
> 
>  accel/kvm/kvm-all.c    | 18 ++++++++++++
>  accel/stubs/kvm-stub.c | 12 ++++++++
>  hw/ppc/spapr_irq.c     | 17 +++++++++++-
>  hw/vfio/pci.c          | 62 +++++++++++++++++++++++++++---------------
>  hw/vfio/pci.h          |  1 +
>  include/sysemu/kvm.h   |  5 ++++
>  6 files changed, 92 insertions(+), 23 deletions(-)
> 

With the issue spotted in patch 3/5 fixed, the series looks good:

Reviewed-by: Greg Kurz <groug@kaod.org>

Then I've tried passthrough of a BCM5719 gigabit adapter to a guest. It works
as expected with MSIs but if I force LSI, either through /sys or with the
pci=nomsi kernel command line, I get no interrupts for the device in the guest.
Note that the same device works ok with LSI in the host.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/5] vfio/spapr: Handle changes of master irq chip for VFIO devices
@ 2019-11-22  6:09   ` Greg Kurz
  0 siblings, 0 replies; 30+ messages in thread
From: Greg Kurz @ 2019-11-22  6:09 UTC (permalink / raw)
  To: David Gibson
  Cc: kvm, Jason Wang, Riku Voipio, Laurent Vivier, qemu-devel,
	Alex Williamson, qemu-ppc, clg, Marc-André Lureau,
	Paolo Bonzini, philmd

On Thu, 21 Nov 2019 11:56:02 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> Due to the way feature negotiation works in PAPR (which is a
> paravirtualized platform), we can end up changing the global irq chip
> at runtime, including it's KVM accelerate model.  That causes
> complications for VFIO devices with INTx, which wire themselves up
> directly to the KVM irqchip for performance.
> 
> This series introduces a new notifier to let VFIO devices (and
> anything else that needs to in the future) know about changes to the
> master irqchip.  It modifies VFIO to respond to the notifier,
> reconnecting itself to the new KVM irqchip as necessary.
> 
> In particular this removes a misleading (though not wholly inaccurate)
> warning that occurs when using VFIO devices on a pseries machine type
> guest.
> 
> Open question: should this go into qemu-4.2 or wait until 5.0?  It's
> has medium complexity / intrusiveness, but it *is* a bugfix that I
> can't see a simpler way to fix.  It's effectively a regression from
> qemu-4.0 to qemu-4.1 (because that introduced XIVE support by
> default), although not from 4.1 to 4.2.
> 
> Changes since RFC:
>  * Fixed some incorrect error paths pointed by aw in 3/5
>  * 5/5 had some problems previously, but they have been obsoleted by
>    other changes merged in the meantime
> 
> David Gibson (5):
>   kvm: Introduce KVM irqchip change notifier
>   vfio/pci: Split vfio_intx_update()
>   vfio/pci: Respond to KVM irqchip change notifier
>   spapr: Handle irq backend changes with VFIO PCI devices
>   spapr: Work around spurious warnings from vfio INTx initialization
> 
>  accel/kvm/kvm-all.c    | 18 ++++++++++++
>  accel/stubs/kvm-stub.c | 12 ++++++++
>  hw/ppc/spapr_irq.c     | 17 +++++++++++-
>  hw/vfio/pci.c          | 62 +++++++++++++++++++++++++++---------------
>  hw/vfio/pci.h          |  1 +
>  include/sysemu/kvm.h   |  5 ++++
>  6 files changed, 92 insertions(+), 23 deletions(-)
> 

With the issue spotted in patch 3/5 fixed, the series looks good:

Reviewed-by: Greg Kurz <groug@kaod.org>

Then I've tried passthrough of a BCM5719 gigabit adapter to a guest. It works
as expected with MSIs but if I force LSI, either through /sys or with the
pci=nomsi kernel command line, I get no interrupts for the device in the guest.
Note that the same device works ok with LSI in the host.


^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2019-11-22  6:28 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-21  0:56 [PATCH 0/5] vfio/spapr: Handle changes of master irq chip for VFIO devices David Gibson
2019-11-21  0:56 ` David Gibson
2019-11-21  0:56 ` [PATCH 1/5] kvm: Introduce KVM irqchip change notifier David Gibson
2019-11-21  0:56   ` David Gibson
2019-11-21  0:56 ` [PATCH 2/5] vfio/pci: Split vfio_intx_update() David Gibson
2019-11-21  0:56   ` David Gibson
2019-11-21  0:56 ` [PATCH 3/5] vfio/pci: Respond to KVM irqchip change notifier David Gibson
2019-11-21  0:56   ` David Gibson
2019-11-22  5:12   ` Greg Kurz
2019-11-22  5:12     ` Greg Kurz
2019-11-22  5:50     ` David Gibson
2019-11-22  5:50       ` David Gibson
2019-11-21  0:56 ` [PATCH 4/5] spapr: Handle irq backend changes with VFIO PCI devices David Gibson
2019-11-21  0:56   ` David Gibson
2019-11-21 16:35   ` Cédric Le Goater
2019-11-21 16:35     ` Cédric Le Goater
2019-11-21  0:56 ` [PATCH 5/5] spapr: Work around spurious warnings from vfio INTx initialization David Gibson
2019-11-21  0:56   ` David Gibson
2019-11-21 16:35   ` Cédric Le Goater
2019-11-21 16:35     ` Cédric Le Goater
2019-11-21 16:57 ` [PATCH 0/5] vfio/spapr: Handle changes of master irq chip for VFIO devices Alex Williamson
2019-11-21 16:57   ` Alex Williamson
2019-11-22  1:18   ` David Gibson
2019-11-22  1:18     ` David Gibson
2019-11-22  1:28     ` Alex Williamson
2019-11-22  1:28       ` Alex Williamson
2019-11-22  1:35       ` David Gibson
2019-11-22  1:35         ` David Gibson
2019-11-22  6:09 ` Greg Kurz
2019-11-22  6:09   ` Greg Kurz

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.