All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v2 0/2] spapr: reduce the number of IRQ
@ 2019-09-11 13:39 Cédric Le Goater
  2019-09-11 13:39 ` [Qemu-devel] [PATCH v2 1/2] spapr/irq: Introduce an ics_irq_free() helper Cédric Le Goater
  2019-09-11 13:39 ` [Qemu-devel] [PATCH v2 2/2] spapr/irq: Only claim VALID interrupts at the KVM level Cédric Le Goater
  0 siblings, 2 replies; 8+ messages in thread
From: Cédric Le Goater @ 2019-09-11 13:39 UTC (permalink / raw)
  To: David Gibson
  Cc: Satheesh Rajendran, Cédric Le Goater, qemu-ppc, qemu-devel,
	Greg Kurz

Hello, 

A typical pseries VM with 16 vCPUs, one disk, one network adapater
uses less than 100 interrupts but the whole IRQ number space of the
QEMU machine is allocated at reset time and it is 8K wide. This is
wasting a considerable amount of interrupt numbers in the global IRQ
space which has 1M interrupts per socket on a POWER9.

To optimise the HW resources, only request at the KVM level interrupts
which have been claimed by the guest. This will help to increase the
maximum number of VMs per system and also help supporting nested
guests using the XIVE interrupt mode.

Thanks,

C.

Changes since v1:

 - split the patch 
 - removed useless 'reset_all' machine flag which supposed to preserve
   migration compatibility

Cédric Le Goater (2):
  spapr/irq: Introduce an ics_irq_free() helper
  spapr/irq: Only claim VALID interrupts at the KVM level

 include/hw/ppc/xics.h    |  5 +++++
 hw/intc/spapr_xive_kvm.c | 29 ++++++++++++++++++++++++++---
 hw/intc/xics_kvm.c       |  8 ++++++++
 hw/ppc/spapr_irq.c       |  9 +++------
 4 files changed, 42 insertions(+), 9 deletions(-)

-- 
2.21.0



^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Qemu-devel] [PATCH v2 1/2] spapr/irq: Introduce an ics_irq_free() helper
  2019-09-11 13:39 [Qemu-devel] [PATCH v2 0/2] spapr: reduce the number of IRQ Cédric Le Goater
@ 2019-09-11 13:39 ` Cédric Le Goater
  2019-09-11 14:09   ` Greg Kurz
  2019-09-16  0:38   ` David Gibson
  2019-09-11 13:39 ` [Qemu-devel] [PATCH v2 2/2] spapr/irq: Only claim VALID interrupts at the KVM level Cédric Le Goater
  1 sibling, 2 replies; 8+ messages in thread
From: Cédric Le Goater @ 2019-09-11 13:39 UTC (permalink / raw)
  To: David Gibson
  Cc: Satheesh Rajendran, Cédric Le Goater, qemu-ppc, qemu-devel,
	Greg Kurz

It will help us to discard interrupt numbers which have not been
claimed in the next patch.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
 include/hw/ppc/xics.h | 5 +++++
 hw/ppc/spapr_irq.c    | 9 +++------
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
index f2a8d6a4b4f9..64a2c8862a72 100644
--- a/include/hw/ppc/xics.h
+++ b/include/hw/ppc/xics.h
@@ -179,6 +179,11 @@ void ics_simple_write_xive(ICSState *ics, int nr, int server,
                            uint8_t priority, uint8_t saved_priority);
 void ics_simple_set_irq(void *opaque, int srcno, int val);
 
+static inline bool ics_irq_free(ICSState *ics, uint32_t srcno)
+{
+    return !(ics->irqs[srcno].flags & XICS_FLAGS_IRQ_MASK);
+}
+
 void ics_set_irq_type(ICSState *ics, int srcno, bool lsi);
 void icp_pic_print_info(ICPState *icp, Monitor *mon);
 void ics_pic_print_info(ICSState *ics, Monitor *mon);
diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index 06fe2432bae5..d8f46b6797f8 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -114,9 +114,6 @@ static void spapr_irq_init_xics(SpaprMachineState *spapr, int nr_irqs,
     xics_spapr_init(spapr);
 }
 
-#define ICS_IRQ_FREE(ics, srcno)   \
-    (!((ics)->irqs[(srcno)].flags & (XICS_FLAGS_IRQ_MASK)))
-
 static int spapr_irq_claim_xics(SpaprMachineState *spapr, int irq, bool lsi,
                                 Error **errp)
 {
@@ -129,7 +126,7 @@ static int spapr_irq_claim_xics(SpaprMachineState *spapr, int irq, bool lsi,
         return -1;
     }
 
-    if (!ICS_IRQ_FREE(ics, irq - ics->offset)) {
+    if (!ics_irq_free(ics, irq - ics->offset)) {
         error_setg(errp, "IRQ %d is not free", irq);
         return -1;
     }
@@ -147,7 +144,7 @@ static void spapr_irq_free_xics(SpaprMachineState *spapr, int irq, int num)
     if (ics_valid_irq(ics, irq)) {
         trace_spapr_irq_free(0, irq, num);
         for (i = srcno; i < srcno + num; ++i) {
-            if (ICS_IRQ_FREE(ics, i)) {
+            if (ics_irq_free(ics, i)) {
                 trace_spapr_irq_free_warn(0, i);
             }
             memset(&ics->irqs[i], 0, sizeof(ICSIRQState));
@@ -767,7 +764,7 @@ static int ics_find_free_block(ICSState *ics, int num, int alignnum)
             return -1;
         }
         for (i = first; i < first + num; ++i) {
-            if (!ICS_IRQ_FREE(ics, i)) {
+            if (!ics_irq_free(ics, i)) {
                 break;
             }
         }
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [Qemu-devel] [PATCH v2 2/2] spapr/irq: Only claim VALID interrupts at the KVM level
  2019-09-11 13:39 [Qemu-devel] [PATCH v2 0/2] spapr: reduce the number of IRQ Cédric Le Goater
  2019-09-11 13:39 ` [Qemu-devel] [PATCH v2 1/2] spapr/irq: Introduce an ics_irq_free() helper Cédric Le Goater
@ 2019-09-11 13:39 ` Cédric Le Goater
  2019-09-11 14:14   ` Greg Kurz
  2019-09-16  0:44   ` David Gibson
  1 sibling, 2 replies; 8+ messages in thread
From: Cédric Le Goater @ 2019-09-11 13:39 UTC (permalink / raw)
  To: David Gibson
  Cc: Satheesh Rajendran, Cédric Le Goater, qemu-ppc, qemu-devel,
	Greg Kurz

A typical pseries VM with 16 vCPUs, one disk, one network adapater
uses less than 100 interrupts but the whole IRQ number space of the
QEMU machine is allocated at reset time and it is 8K wide. This is
wasting a considerable amount of interrupt numbers in the global IRQ
space which has 1M interrupts per socket on a POWER9.

To optimise the HW resources, only request at the KVM level interrupts
which have been claimed by the guest. This will help to increase the
maximum number of VMs per system and also help supporting nested guests
using the XIVE interrupt mode.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
 hw/intc/spapr_xive_kvm.c | 29 ++++++++++++++++++++++++++---
 hw/intc/xics_kvm.c       |  8 ++++++++
 2 files changed, 34 insertions(+), 3 deletions(-)

diff --git a/hw/intc/spapr_xive_kvm.c b/hw/intc/spapr_xive_kvm.c
index 17af4d19f54e..71b88d7797bc 100644
--- a/hw/intc/spapr_xive_kvm.c
+++ b/hw/intc/spapr_xive_kvm.c
@@ -255,11 +255,16 @@ void kvmppc_xive_source_reset_one(XiveSource *xsrc, int srcno, Error **errp)
 
 static void kvmppc_xive_source_reset(XiveSource *xsrc, Error **errp)
 {
+    SpaprXive *xive = SPAPR_XIVE(xsrc->xive);
     int i;
 
     for (i = 0; i < xsrc->nr_irqs; i++) {
         Error *local_err = NULL;
 
+        if (!xive_eas_is_valid(&xive->eat[i])) {
+            continue;
+        }
+
         kvmppc_xive_source_reset_one(xsrc, i, &local_err);
         if (local_err) {
             error_propagate(errp, local_err);
@@ -328,11 +333,18 @@ uint64_t kvmppc_xive_esb_rw(XiveSource *xsrc, int srcno, uint32_t offset,
 
 static void kvmppc_xive_source_get_state(XiveSource *xsrc)
 {
+    SpaprXive *xive = SPAPR_XIVE(xsrc->xive);
     int i;
 
     for (i = 0; i < xsrc->nr_irqs; i++) {
+        uint8_t pq;
+
+        if (!xive_eas_is_valid(&xive->eat[i])) {
+            continue;
+        }
+
         /* Perform a load without side effect to retrieve the PQ bits */
-        uint8_t pq = xive_esb_read(xsrc, i, XIVE_ESB_GET);
+        pq = xive_esb_read(xsrc, i, XIVE_ESB_GET);
 
         /* and save PQ locally */
         xive_source_esb_set(xsrc, i, pq);
@@ -521,9 +533,14 @@ static void kvmppc_xive_change_state_handler(void *opaque, int running,
      */
     if (running) {
         for (i = 0; i < xsrc->nr_irqs; i++) {
-            uint8_t pq = xive_source_esb_get(xsrc, i);
+            uint8_t pq;
             uint8_t old_pq;
 
+            if (!xive_eas_is_valid(&xive->eat[i])) {
+                continue;
+            }
+
+            pq = xive_source_esb_get(xsrc, i);
             old_pq = xive_esb_read(xsrc, i, XIVE_ESB_SET_PQ_00 + (pq << 8));
 
             /*
@@ -545,7 +562,13 @@ static void kvmppc_xive_change_state_handler(void *opaque, int running,
      * migration is in progress.
      */
     for (i = 0; i < xsrc->nr_irqs; i++) {
-        uint8_t pq = xive_esb_read(xsrc, i, XIVE_ESB_GET);
+        uint8_t pq;
+
+        if (!xive_eas_is_valid(&xive->eat[i])) {
+            continue;
+        }
+
+        pq = xive_esb_read(xsrc, i, XIVE_ESB_GET);
 
         /*
          * PQ is set to PENDING to possibly catch a triggered
diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
index a4d2e876cc5f..ba90d6dc966c 100644
--- a/hw/intc/xics_kvm.c
+++ b/hw/intc/xics_kvm.c
@@ -190,6 +190,10 @@ void ics_get_kvm_state(ICSState *ics)
     for (i = 0; i < ics->nr_irqs; i++) {
         ICSIRQState *irq = &ics->irqs[i];
 
+        if (ics_irq_free(ics, i)) {
+            continue;
+        }
+
         kvm_device_access(kernel_xics_fd, KVM_DEV_XICS_GRP_SOURCES,
                           i + ics->offset, &state, false, &error_fatal);
 
@@ -301,6 +305,10 @@ int ics_set_kvm_state(ICSState *ics, Error **errp)
         Error *local_err = NULL;
         int ret;
 
+        if (ics_irq_free(ics, i)) {
+            continue;
+        }
+
         ret = ics_set_kvm_state_one(ics, i, &local_err);
         if (ret < 0) {
             error_propagate(errp, local_err);
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [Qemu-devel] [PATCH v2 1/2] spapr/irq: Introduce an ics_irq_free() helper
  2019-09-11 13:39 ` [Qemu-devel] [PATCH v2 1/2] spapr/irq: Introduce an ics_irq_free() helper Cédric Le Goater
@ 2019-09-11 14:09   ` Greg Kurz
  2019-09-16  0:38   ` David Gibson
  1 sibling, 0 replies; 8+ messages in thread
From: Greg Kurz @ 2019-09-11 14:09 UTC (permalink / raw)
  To: Cédric Le Goater
  Cc: Satheesh Rajendran, qemu-ppc, qemu-devel, David Gibson

On Wed, 11 Sep 2019 15:39:36 +0200
Cédric Le Goater <clg@kaod.org> wrote:

> It will help us to discard interrupt numbers which have not been
> claimed in the next patch.
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---

Reviewed-by: Greg Kurz <groug@kaod.org>

>  include/hw/ppc/xics.h | 5 +++++
>  hw/ppc/spapr_irq.c    | 9 +++------
>  2 files changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
> index f2a8d6a4b4f9..64a2c8862a72 100644
> --- a/include/hw/ppc/xics.h
> +++ b/include/hw/ppc/xics.h
> @@ -179,6 +179,11 @@ void ics_simple_write_xive(ICSState *ics, int nr, int server,
>                             uint8_t priority, uint8_t saved_priority);
>  void ics_simple_set_irq(void *opaque, int srcno, int val);
>  
> +static inline bool ics_irq_free(ICSState *ics, uint32_t srcno)
> +{
> +    return !(ics->irqs[srcno].flags & XICS_FLAGS_IRQ_MASK);
> +}
> +
>  void ics_set_irq_type(ICSState *ics, int srcno, bool lsi);
>  void icp_pic_print_info(ICPState *icp, Monitor *mon);
>  void ics_pic_print_info(ICSState *ics, Monitor *mon);
> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> index 06fe2432bae5..d8f46b6797f8 100644
> --- a/hw/ppc/spapr_irq.c
> +++ b/hw/ppc/spapr_irq.c
> @@ -114,9 +114,6 @@ static void spapr_irq_init_xics(SpaprMachineState *spapr, int nr_irqs,
>      xics_spapr_init(spapr);
>  }
>  
> -#define ICS_IRQ_FREE(ics, srcno)   \
> -    (!((ics)->irqs[(srcno)].flags & (XICS_FLAGS_IRQ_MASK)))
> -
>  static int spapr_irq_claim_xics(SpaprMachineState *spapr, int irq, bool lsi,
>                                  Error **errp)
>  {
> @@ -129,7 +126,7 @@ static int spapr_irq_claim_xics(SpaprMachineState *spapr, int irq, bool lsi,
>          return -1;
>      }
>  
> -    if (!ICS_IRQ_FREE(ics, irq - ics->offset)) {
> +    if (!ics_irq_free(ics, irq - ics->offset)) {
>          error_setg(errp, "IRQ %d is not free", irq);
>          return -1;
>      }
> @@ -147,7 +144,7 @@ static void spapr_irq_free_xics(SpaprMachineState *spapr, int irq, int num)
>      if (ics_valid_irq(ics, irq)) {
>          trace_spapr_irq_free(0, irq, num);
>          for (i = srcno; i < srcno + num; ++i) {
> -            if (ICS_IRQ_FREE(ics, i)) {
> +            if (ics_irq_free(ics, i)) {
>                  trace_spapr_irq_free_warn(0, i);
>              }
>              memset(&ics->irqs[i], 0, sizeof(ICSIRQState));
> @@ -767,7 +764,7 @@ static int ics_find_free_block(ICSState *ics, int num, int alignnum)
>              return -1;
>          }
>          for (i = first; i < first + num; ++i) {
> -            if (!ICS_IRQ_FREE(ics, i)) {
> +            if (!ics_irq_free(ics, i)) {
>                  break;
>              }
>          }



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/2] spapr/irq: Only claim VALID interrupts at the KVM level
  2019-09-11 13:39 ` [Qemu-devel] [PATCH v2 2/2] spapr/irq: Only claim VALID interrupts at the KVM level Cédric Le Goater
@ 2019-09-11 14:14   ` Greg Kurz
  2019-09-16  0:44   ` David Gibson
  1 sibling, 0 replies; 8+ messages in thread
From: Greg Kurz @ 2019-09-11 14:14 UTC (permalink / raw)
  To: Cédric Le Goater
  Cc: Satheesh Rajendran, qemu-ppc, qemu-devel, David Gibson

On Wed, 11 Sep 2019 15:39:37 +0200
Cédric Le Goater <clg@kaod.org> wrote:

> A typical pseries VM with 16 vCPUs, one disk, one network adapater
> uses less than 100 interrupts but the whole IRQ number space of the
> QEMU machine is allocated at reset time and it is 8K wide. This is
> wasting a considerable amount of interrupt numbers in the global IRQ
> space which has 1M interrupts per socket on a POWER9.
> 
> To optimise the HW resources, only request at the KVM level interrupts
> which have been claimed by the guest. This will help to increase the
> maximum number of VMs per system and also help supporting nested guests
> using the XIVE interrupt mode.
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---

Reviewed-by: Greg Kurz <groug@kaod.org>

>  hw/intc/spapr_xive_kvm.c | 29 ++++++++++++++++++++++++++---
>  hw/intc/xics_kvm.c       |  8 ++++++++
>  2 files changed, 34 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/intc/spapr_xive_kvm.c b/hw/intc/spapr_xive_kvm.c
> index 17af4d19f54e..71b88d7797bc 100644
> --- a/hw/intc/spapr_xive_kvm.c
> +++ b/hw/intc/spapr_xive_kvm.c
> @@ -255,11 +255,16 @@ void kvmppc_xive_source_reset_one(XiveSource *xsrc, int srcno, Error **errp)
>  
>  static void kvmppc_xive_source_reset(XiveSource *xsrc, Error **errp)
>  {
> +    SpaprXive *xive = SPAPR_XIVE(xsrc->xive);
>      int i;
>  
>      for (i = 0; i < xsrc->nr_irqs; i++) {
>          Error *local_err = NULL;
>  
> +        if (!xive_eas_is_valid(&xive->eat[i])) {
> +            continue;
> +        }
> +
>          kvmppc_xive_source_reset_one(xsrc, i, &local_err);
>          if (local_err) {
>              error_propagate(errp, local_err);
> @@ -328,11 +333,18 @@ uint64_t kvmppc_xive_esb_rw(XiveSource *xsrc, int srcno, uint32_t offset,
>  
>  static void kvmppc_xive_source_get_state(XiveSource *xsrc)
>  {
> +    SpaprXive *xive = SPAPR_XIVE(xsrc->xive);
>      int i;
>  
>      for (i = 0; i < xsrc->nr_irqs; i++) {
> +        uint8_t pq;
> +
> +        if (!xive_eas_is_valid(&xive->eat[i])) {
> +            continue;
> +        }
> +
>          /* Perform a load without side effect to retrieve the PQ bits */
> -        uint8_t pq = xive_esb_read(xsrc, i, XIVE_ESB_GET);
> +        pq = xive_esb_read(xsrc, i, XIVE_ESB_GET);
>  
>          /* and save PQ locally */
>          xive_source_esb_set(xsrc, i, pq);
> @@ -521,9 +533,14 @@ static void kvmppc_xive_change_state_handler(void *opaque, int running,
>       */
>      if (running) {
>          for (i = 0; i < xsrc->nr_irqs; i++) {
> -            uint8_t pq = xive_source_esb_get(xsrc, i);
> +            uint8_t pq;
>              uint8_t old_pq;
>  
> +            if (!xive_eas_is_valid(&xive->eat[i])) {
> +                continue;
> +            }
> +
> +            pq = xive_source_esb_get(xsrc, i);
>              old_pq = xive_esb_read(xsrc, i, XIVE_ESB_SET_PQ_00 + (pq << 8));
>  
>              /*
> @@ -545,7 +562,13 @@ static void kvmppc_xive_change_state_handler(void *opaque, int running,
>       * migration is in progress.
>       */
>      for (i = 0; i < xsrc->nr_irqs; i++) {
> -        uint8_t pq = xive_esb_read(xsrc, i, XIVE_ESB_GET);
> +        uint8_t pq;
> +
> +        if (!xive_eas_is_valid(&xive->eat[i])) {
> +            continue;
> +        }
> +
> +        pq = xive_esb_read(xsrc, i, XIVE_ESB_GET);
>  
>          /*
>           * PQ is set to PENDING to possibly catch a triggered
> diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
> index a4d2e876cc5f..ba90d6dc966c 100644
> --- a/hw/intc/xics_kvm.c
> +++ b/hw/intc/xics_kvm.c
> @@ -190,6 +190,10 @@ void ics_get_kvm_state(ICSState *ics)
>      for (i = 0; i < ics->nr_irqs; i++) {
>          ICSIRQState *irq = &ics->irqs[i];
>  
> +        if (ics_irq_free(ics, i)) {
> +            continue;
> +        }
> +
>          kvm_device_access(kernel_xics_fd, KVM_DEV_XICS_GRP_SOURCES,
>                            i + ics->offset, &state, false, &error_fatal);
>  
> @@ -301,6 +305,10 @@ int ics_set_kvm_state(ICSState *ics, Error **errp)
>          Error *local_err = NULL;
>          int ret;
>  
> +        if (ics_irq_free(ics, i)) {
> +            continue;
> +        }
> +
>          ret = ics_set_kvm_state_one(ics, i, &local_err);
>          if (ret < 0) {
>              error_propagate(errp, local_err);



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Qemu-devel] [PATCH v2 1/2] spapr/irq: Introduce an ics_irq_free() helper
  2019-09-11 13:39 ` [Qemu-devel] [PATCH v2 1/2] spapr/irq: Introduce an ics_irq_free() helper Cédric Le Goater
  2019-09-11 14:09   ` Greg Kurz
@ 2019-09-16  0:38   ` David Gibson
  1 sibling, 0 replies; 8+ messages in thread
From: David Gibson @ 2019-09-16  0:38 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: Satheesh Rajendran, qemu-ppc, qemu-devel, Greg Kurz

[-- Attachment #1: Type: text/plain, Size: 2980 bytes --]

On Wed, Sep 11, 2019 at 03:39:36PM +0200, Cédric Le Goater wrote:
> It will help us to discard interrupt numbers which have not been
> claimed in the next patch.
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>

Applied to ppc-for-4.2, thanks.

> ---
>  include/hw/ppc/xics.h | 5 +++++
>  hw/ppc/spapr_irq.c    | 9 +++------
>  2 files changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
> index f2a8d6a4b4f9..64a2c8862a72 100644
> --- a/include/hw/ppc/xics.h
> +++ b/include/hw/ppc/xics.h
> @@ -179,6 +179,11 @@ void ics_simple_write_xive(ICSState *ics, int nr, int server,
>                             uint8_t priority, uint8_t saved_priority);
>  void ics_simple_set_irq(void *opaque, int srcno, int val);
>  
> +static inline bool ics_irq_free(ICSState *ics, uint32_t srcno)
> +{
> +    return !(ics->irqs[srcno].flags & XICS_FLAGS_IRQ_MASK);
> +}
> +
>  void ics_set_irq_type(ICSState *ics, int srcno, bool lsi);
>  void icp_pic_print_info(ICPState *icp, Monitor *mon);
>  void ics_pic_print_info(ICSState *ics, Monitor *mon);
> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> index 06fe2432bae5..d8f46b6797f8 100644
> --- a/hw/ppc/spapr_irq.c
> +++ b/hw/ppc/spapr_irq.c
> @@ -114,9 +114,6 @@ static void spapr_irq_init_xics(SpaprMachineState *spapr, int nr_irqs,
>      xics_spapr_init(spapr);
>  }
>  
> -#define ICS_IRQ_FREE(ics, srcno)   \
> -    (!((ics)->irqs[(srcno)].flags & (XICS_FLAGS_IRQ_MASK)))
> -
>  static int spapr_irq_claim_xics(SpaprMachineState *spapr, int irq, bool lsi,
>                                  Error **errp)
>  {
> @@ -129,7 +126,7 @@ static int spapr_irq_claim_xics(SpaprMachineState *spapr, int irq, bool lsi,
>          return -1;
>      }
>  
> -    if (!ICS_IRQ_FREE(ics, irq - ics->offset)) {
> +    if (!ics_irq_free(ics, irq - ics->offset)) {
>          error_setg(errp, "IRQ %d is not free", irq);
>          return -1;
>      }
> @@ -147,7 +144,7 @@ static void spapr_irq_free_xics(SpaprMachineState *spapr, int irq, int num)
>      if (ics_valid_irq(ics, irq)) {
>          trace_spapr_irq_free(0, irq, num);
>          for (i = srcno; i < srcno + num; ++i) {
> -            if (ICS_IRQ_FREE(ics, i)) {
> +            if (ics_irq_free(ics, i)) {
>                  trace_spapr_irq_free_warn(0, i);
>              }
>              memset(&ics->irqs[i], 0, sizeof(ICSIRQState));
> @@ -767,7 +764,7 @@ static int ics_find_free_block(ICSState *ics, int num, int alignnum)
>              return -1;
>          }
>          for (i = first; i < first + num; ++i) {
> -            if (!ICS_IRQ_FREE(ics, i)) {
> +            if (!ics_irq_free(ics, i)) {
>                  break;
>              }
>          }

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/2] spapr/irq: Only claim VALID interrupts at the KVM level
  2019-09-11 13:39 ` [Qemu-devel] [PATCH v2 2/2] spapr/irq: Only claim VALID interrupts at the KVM level Cédric Le Goater
  2019-09-11 14:14   ` Greg Kurz
@ 2019-09-16  0:44   ` David Gibson
  2019-09-25 15:41     ` Greg Kurz
  1 sibling, 1 reply; 8+ messages in thread
From: David Gibson @ 2019-09-16  0:44 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: Satheesh Rajendran, qemu-ppc, qemu-devel, Greg Kurz

[-- Attachment #1: Type: text/plain, Size: 4594 bytes --]

On Wed, Sep 11, 2019 at 03:39:37PM +0200, Cédric Le Goater wrote:
> A typical pseries VM with 16 vCPUs, one disk, one network adapater
> uses less than 100 interrupts but the whole IRQ number space of the
> QEMU machine is allocated at reset time and it is 8K wide. This is
> wasting a considerable amount of interrupt numbers in the global IRQ
> space which has 1M interrupts per socket on a POWER9.
> 
> To optimise the HW resources, only request at the KVM level interrupts
> which have been claimed by the guest. This will help to increase the
> maximum number of VMs per system and also help supporting nested guests
> using the XIVE interrupt mode.
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>

Applied to ppc-for-4.2, thanks.

> ---
>  hw/intc/spapr_xive_kvm.c | 29 ++++++++++++++++++++++++++---
>  hw/intc/xics_kvm.c       |  8 ++++++++
>  2 files changed, 34 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/intc/spapr_xive_kvm.c b/hw/intc/spapr_xive_kvm.c
> index 17af4d19f54e..71b88d7797bc 100644
> --- a/hw/intc/spapr_xive_kvm.c
> +++ b/hw/intc/spapr_xive_kvm.c
> @@ -255,11 +255,16 @@ void kvmppc_xive_source_reset_one(XiveSource *xsrc, int srcno, Error **errp)
>  
>  static void kvmppc_xive_source_reset(XiveSource *xsrc, Error **errp)
>  {
> +    SpaprXive *xive = SPAPR_XIVE(xsrc->xive);
>      int i;
>  
>      for (i = 0; i < xsrc->nr_irqs; i++) {
>          Error *local_err = NULL;
>  
> +        if (!xive_eas_is_valid(&xive->eat[i])) {
> +            continue;
> +        }
> +
>          kvmppc_xive_source_reset_one(xsrc, i, &local_err);
>          if (local_err) {
>              error_propagate(errp, local_err);
> @@ -328,11 +333,18 @@ uint64_t kvmppc_xive_esb_rw(XiveSource *xsrc, int srcno, uint32_t offset,
>  
>  static void kvmppc_xive_source_get_state(XiveSource *xsrc)
>  {
> +    SpaprXive *xive = SPAPR_XIVE(xsrc->xive);
>      int i;
>  
>      for (i = 0; i < xsrc->nr_irqs; i++) {
> +        uint8_t pq;
> +
> +        if (!xive_eas_is_valid(&xive->eat[i])) {
> +            continue;
> +        }
> +
>          /* Perform a load without side effect to retrieve the PQ bits */
> -        uint8_t pq = xive_esb_read(xsrc, i, XIVE_ESB_GET);
> +        pq = xive_esb_read(xsrc, i, XIVE_ESB_GET);
>  
>          /* and save PQ locally */
>          xive_source_esb_set(xsrc, i, pq);
> @@ -521,9 +533,14 @@ static void kvmppc_xive_change_state_handler(void *opaque, int running,
>       */
>      if (running) {
>          for (i = 0; i < xsrc->nr_irqs; i++) {
> -            uint8_t pq = xive_source_esb_get(xsrc, i);
> +            uint8_t pq;
>              uint8_t old_pq;
>  
> +            if (!xive_eas_is_valid(&xive->eat[i])) {
> +                continue;
> +            }
> +
> +            pq = xive_source_esb_get(xsrc, i);
>              old_pq = xive_esb_read(xsrc, i, XIVE_ESB_SET_PQ_00 + (pq << 8));
>  
>              /*
> @@ -545,7 +562,13 @@ static void kvmppc_xive_change_state_handler(void *opaque, int running,
>       * migration is in progress.
>       */
>      for (i = 0; i < xsrc->nr_irqs; i++) {
> -        uint8_t pq = xive_esb_read(xsrc, i, XIVE_ESB_GET);
> +        uint8_t pq;
> +
> +        if (!xive_eas_is_valid(&xive->eat[i])) {
> +            continue;
> +        }
> +
> +        pq = xive_esb_read(xsrc, i, XIVE_ESB_GET);
>  
>          /*
>           * PQ is set to PENDING to possibly catch a triggered
> diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
> index a4d2e876cc5f..ba90d6dc966c 100644
> --- a/hw/intc/xics_kvm.c
> +++ b/hw/intc/xics_kvm.c
> @@ -190,6 +190,10 @@ void ics_get_kvm_state(ICSState *ics)
>      for (i = 0; i < ics->nr_irqs; i++) {
>          ICSIRQState *irq = &ics->irqs[i];
>  
> +        if (ics_irq_free(ics, i)) {
> +            continue;
> +        }
> +
>          kvm_device_access(kernel_xics_fd, KVM_DEV_XICS_GRP_SOURCES,
>                            i + ics->offset, &state, false, &error_fatal);
>  
> @@ -301,6 +305,10 @@ int ics_set_kvm_state(ICSState *ics, Error **errp)
>          Error *local_err = NULL;
>          int ret;
>  
> +        if (ics_irq_free(ics, i)) {
> +            continue;
> +        }
> +
>          ret = ics_set_kvm_state_one(ics, i, &local_err);
>          if (ret < 0) {
>              error_propagate(errp, local_err);

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 2/2] spapr/irq: Only claim VALID interrupts at the KVM level
  2019-09-16  0:44   ` David Gibson
@ 2019-09-25 15:41     ` Greg Kurz
  0 siblings, 0 replies; 8+ messages in thread
From: Greg Kurz @ 2019-09-25 15:41 UTC (permalink / raw)
  To: David Gibson
  Cc: Satheesh Rajendran, qemu-ppc, Cédric Le Goater, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 5420 bytes --]

On Mon, 16 Sep 2019 10:44:32 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Wed, Sep 11, 2019 at 03:39:37PM +0200, Cédric Le Goater wrote:
> > A typical pseries VM with 16 vCPUs, one disk, one network adapater
> > uses less than 100 interrupts but the whole IRQ number space of the
> > QEMU machine is allocated at reset time and it is 8K wide. This is
> > wasting a considerable amount of interrupt numbers in the global IRQ
> > space which has 1M interrupts per socket on a POWER9.
> > 
> > To optimise the HW resources, only request at the KVM level interrupts
> > which have been claimed by the guest. This will help to increase the
> > maximum number of VMs per system and also help supporting nested guests
> > using the XIVE interrupt mode.
> > 
> > Signed-off-by: Cédric Le Goater <clg@kaod.org>
> 
> Applied to ppc-for-4.2, thanks.
> 

While experimenting 4.1->4.2 migration with your irq cleanup series,
I've hit this:

qemu-system-ppc64: KVM_SET_DEVICE_ATTR failed: Group 3 attr 0x0000000000001300: Invalid argument
qemu-system-ppc64: error while loading state for instance 0x0 of device 'spapr'
qemu-system-ppc64: load of migration failed: Operation not permitted

Failing to restore the source config with EINVAL seems to come from the
following check in kvmppc_xive_native_set_source_config():

	if (!state->valid)
		return -EINVAL;

which makes sense since we haven't requested any interrupt yet.

We should hence do it at post load before restoring the source
config.

I'll send a patch ASAP.

> > ---
> >  hw/intc/spapr_xive_kvm.c | 29 ++++++++++++++++++++++++++---
> >  hw/intc/xics_kvm.c       |  8 ++++++++
> >  2 files changed, 34 insertions(+), 3 deletions(-)
> > 
> > diff --git a/hw/intc/spapr_xive_kvm.c b/hw/intc/spapr_xive_kvm.c
> > index 17af4d19f54e..71b88d7797bc 100644
> > --- a/hw/intc/spapr_xive_kvm.c
> > +++ b/hw/intc/spapr_xive_kvm.c
> > @@ -255,11 +255,16 @@ void kvmppc_xive_source_reset_one(XiveSource *xsrc, int srcno, Error **errp)
> >  
> >  static void kvmppc_xive_source_reset(XiveSource *xsrc, Error **errp)
> >  {
> > +    SpaprXive *xive = SPAPR_XIVE(xsrc->xive);
> >      int i;
> >  
> >      for (i = 0; i < xsrc->nr_irqs; i++) {
> >          Error *local_err = NULL;
> >  
> > +        if (!xive_eas_is_valid(&xive->eat[i])) {
> > +            continue;
> > +        }
> > +
> >          kvmppc_xive_source_reset_one(xsrc, i, &local_err);
> >          if (local_err) {
> >              error_propagate(errp, local_err);
> > @@ -328,11 +333,18 @@ uint64_t kvmppc_xive_esb_rw(XiveSource *xsrc, int srcno, uint32_t offset,
> >  
> >  static void kvmppc_xive_source_get_state(XiveSource *xsrc)
> >  {
> > +    SpaprXive *xive = SPAPR_XIVE(xsrc->xive);
> >      int i;
> >  
> >      for (i = 0; i < xsrc->nr_irqs; i++) {
> > +        uint8_t pq;
> > +
> > +        if (!xive_eas_is_valid(&xive->eat[i])) {
> > +            continue;
> > +        }
> > +
> >          /* Perform a load without side effect to retrieve the PQ bits */
> > -        uint8_t pq = xive_esb_read(xsrc, i, XIVE_ESB_GET);
> > +        pq = xive_esb_read(xsrc, i, XIVE_ESB_GET);
> >  
> >          /* and save PQ locally */
> >          xive_source_esb_set(xsrc, i, pq);
> > @@ -521,9 +533,14 @@ static void kvmppc_xive_change_state_handler(void *opaque, int running,
> >       */
> >      if (running) {
> >          for (i = 0; i < xsrc->nr_irqs; i++) {
> > -            uint8_t pq = xive_source_esb_get(xsrc, i);
> > +            uint8_t pq;
> >              uint8_t old_pq;
> >  
> > +            if (!xive_eas_is_valid(&xive->eat[i])) {
> > +                continue;
> > +            }
> > +
> > +            pq = xive_source_esb_get(xsrc, i);
> >              old_pq = xive_esb_read(xsrc, i, XIVE_ESB_SET_PQ_00 + (pq << 8));
> >  
> >              /*
> > @@ -545,7 +562,13 @@ static void kvmppc_xive_change_state_handler(void *opaque, int running,
> >       * migration is in progress.
> >       */
> >      for (i = 0; i < xsrc->nr_irqs; i++) {
> > -        uint8_t pq = xive_esb_read(xsrc, i, XIVE_ESB_GET);
> > +        uint8_t pq;
> > +
> > +        if (!xive_eas_is_valid(&xive->eat[i])) {
> > +            continue;
> > +        }
> > +
> > +        pq = xive_esb_read(xsrc, i, XIVE_ESB_GET);
> >  
> >          /*
> >           * PQ is set to PENDING to possibly catch a triggered
> > diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
> > index a4d2e876cc5f..ba90d6dc966c 100644
> > --- a/hw/intc/xics_kvm.c
> > +++ b/hw/intc/xics_kvm.c
> > @@ -190,6 +190,10 @@ void ics_get_kvm_state(ICSState *ics)
> >      for (i = 0; i < ics->nr_irqs; i++) {
> >          ICSIRQState *irq = &ics->irqs[i];
> >  
> > +        if (ics_irq_free(ics, i)) {
> > +            continue;
> > +        }
> > +
> >          kvm_device_access(kernel_xics_fd, KVM_DEV_XICS_GRP_SOURCES,
> >                            i + ics->offset, &state, false, &error_fatal);
> >  
> > @@ -301,6 +305,10 @@ int ics_set_kvm_state(ICSState *ics, Error **errp)
> >          Error *local_err = NULL;
> >          int ret;
> >  
> > +        if (ics_irq_free(ics, i)) {
> > +            continue;
> > +        }
> > +
> >          ret = ics_set_kvm_state_one(ics, i, &local_err);
> >          if (ret < 0) {
> >              error_propagate(errp, local_err);
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2019-09-25 16:05 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-11 13:39 [Qemu-devel] [PATCH v2 0/2] spapr: reduce the number of IRQ Cédric Le Goater
2019-09-11 13:39 ` [Qemu-devel] [PATCH v2 1/2] spapr/irq: Introduce an ics_irq_free() helper Cédric Le Goater
2019-09-11 14:09   ` Greg Kurz
2019-09-16  0:38   ` David Gibson
2019-09-11 13:39 ` [Qemu-devel] [PATCH v2 2/2] spapr/irq: Only claim VALID interrupts at the KVM level Cédric Le Goater
2019-09-11 14:14   ` Greg Kurz
2019-09-16  0:44   ` David Gibson
2019-09-25 15:41     ` Greg Kurz

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.