All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 0/2] Further VSMT fixes
@ 2018-01-15  7:27 David Gibson
  2018-01-15  7:27 ` [Qemu-devel] [PATCH 1/2] target/ppc: Clarify compat mode max_threads value David Gibson
  2018-01-15  7:27 ` [Qemu-devel] [PATCH 2/2] spapr: Adjust default VSMT value for better migration compatibility David Gibson
  0 siblings, 2 replies; 10+ messages in thread
From: David Gibson @ 2018-01-15  7:27 UTC (permalink / raw)
  To: groug, joserz, surajjs, sam.bobroff
  Cc: lvivier, qemu-ppc, qemu-devel, David Gibson

Here are some follow on fixes to Ziviani's proposed changes to VSMT
handling.  This should fix migration of POWER8 compat mode guests
between POWER8 and POWER9 hosts.

The changes are simple, the rationale's rather more complex.

David Gibson (2):
  target/ppc: Clarify compat mode max_threads value
  spapr: Adjust default VSMT value for better migration compatibility

 hw/ppc/spapr.c      | 15 ++++++++++-----
 target/ppc/compat.c | 25 +++++++++++++++++--------
 target/ppc/cpu.h    |  2 +-
 3 files changed, 28 insertions(+), 14 deletions(-)

-- 
2.14.3

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Qemu-devel] [PATCH 1/2] target/ppc: Clarify compat mode max_threads value
  2018-01-15  7:27 [Qemu-devel] [PATCH 0/2] Further VSMT fixes David Gibson
@ 2018-01-15  7:27 ` David Gibson
  2018-01-15  7:52   ` Laurent Vivier
                     ` (2 more replies)
  2018-01-15  7:27 ` [Qemu-devel] [PATCH 2/2] spapr: Adjust default VSMT value for better migration compatibility David Gibson
  1 sibling, 3 replies; 10+ messages in thread
From: David Gibson @ 2018-01-15  7:27 UTC (permalink / raw)
  To: groug, joserz, surajjs, sam.bobroff
  Cc: lvivier, qemu-ppc, qemu-devel, David Gibson

We recently had some discussions that were sidetracked for a while, because
nearly everyone misapprehended the purpose of the 'max_threads' field in
the compatiblity modes table.  It's all about guest expectations, not host
expectations or support (that's handled elsewhere).

In an attempt to avoid a repeat of that confusion, rename the field to
'max_vthreads' and add an explanatory comment.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/ppc/spapr.c      |  4 ++--
 target/ppc/compat.c | 25 +++++++++++++++++--------
 target/ppc/cpu.h    |  2 +-
 3 files changed, 20 insertions(+), 11 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 3e528fe91e..e35214bfc3 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -345,7 +345,7 @@ static int spapr_fixup_cpu_dt(void *fdt, sPAPRMachineState *spapr)
         PowerPCCPU *cpu = POWERPC_CPU(cs);
         DeviceClass *dc = DEVICE_GET_CLASS(cs);
         int index = spapr_vcpu_id(cpu);
-        int compat_smt = MIN(smp_threads, ppc_compat_max_threads(cpu));
+        int compat_smt = MIN(smp_threads, ppc_compat_max_vthreads(cpu));
 
         if ((index % smt) != 0) {
             continue;
@@ -503,7 +503,7 @@ static void spapr_populate_cpu_dt(CPUState *cs, void *fdt, int offset,
     size_t page_sizes_prop_size;
     uint32_t vcpus_per_socket = smp_threads * smp_cores;
     uint32_t pft_size_prop[] = {0, cpu_to_be32(spapr->htab_shift)};
-    int compat_smt = MIN(smp_threads, ppc_compat_max_threads(cpu));
+    int compat_smt = MIN(smp_threads, ppc_compat_max_vthreads(cpu));
     sPAPRDRConnector *drc;
     int drc_index;
     uint32_t radix_AP_encodings[PPC_PAGE_SIZES_MAX_SZ];
diff --git a/target/ppc/compat.c b/target/ppc/compat.c
index 276b5b52c2..807c906f68 100644
--- a/target/ppc/compat.c
+++ b/target/ppc/compat.c
@@ -32,7 +32,16 @@ typedef struct {
     uint32_t pvr;
     uint64_t pcr;
     uint64_t pcr_level;
-    int max_threads;
+
+    /*
+     * Maximum allowed virtual threads per virtual core
+     *
+     * This is to stop older guests getting confused by seeing more
+     * threads than they think the cpu can support.  Usually it's
+     * equal to the number of threads supported on bare metal
+     * hardware, but not always (see POWER9).
+     */
+    int max_vthreads;
 } CompatInfo;
 
 static const CompatInfo compat_table[] = {
@@ -45,28 +54,28 @@ static const CompatInfo compat_table[] = {
         .pcr = PCR_COMPAT_3_00 | PCR_COMPAT_2_07 | PCR_COMPAT_2_06 |
                PCR_COMPAT_2_05 | PCR_TM_DIS | PCR_VSX_DIS,
         .pcr_level = PCR_COMPAT_2_05,
-        .max_threads = 2,
+        .max_vthreads = 2,
     },
     { /* POWER7, ISA2.06 */
         .name = "power7",
         .pvr = CPU_POWERPC_LOGICAL_2_06,
         .pcr = PCR_COMPAT_3_00 | PCR_COMPAT_2_07 | PCR_COMPAT_2_06 | PCR_TM_DIS,
         .pcr_level = PCR_COMPAT_2_06,
-        .max_threads = 4,
+        .max_vthreads = 4,
     },
     {
         .name = "power7+",
         .pvr = CPU_POWERPC_LOGICAL_2_06_PLUS,
         .pcr = PCR_COMPAT_3_00 | PCR_COMPAT_2_07 | PCR_COMPAT_2_06 | PCR_TM_DIS,
         .pcr_level = PCR_COMPAT_2_06,
-        .max_threads = 4,
+        .max_vthreads = 4,
     },
     { /* POWER8, ISA2.07 */
         .name = "power8",
         .pvr = CPU_POWERPC_LOGICAL_2_07,
         .pcr = PCR_COMPAT_3_00 | PCR_COMPAT_2_07,
         .pcr_level = PCR_COMPAT_2_07,
-        .max_threads = 8,
+        .max_vthreads = 8,
     },
     { /* POWER9, ISA3.00 */
         .name = "power9",
@@ -80,7 +89,7 @@ static const CompatInfo compat_table[] = {
          * confusing if half of the threads disappear from the guest
          * if it announces it's POWER9 aware at CAS time.
          */
-        .max_threads = 8,
+        .max_vthreads = 8,
     },
 };
 
@@ -192,14 +201,14 @@ void ppc_set_compat_all(uint32_t compat_pvr, Error **errp)
     }
 }
 
-int ppc_compat_max_threads(PowerPCCPU *cpu)
+int ppc_compat_max_vthreads(PowerPCCPU *cpu)
 {
     const CompatInfo *compat = compat_by_pvr(cpu->compat_pvr);
     int n_threads = CPU(cpu)->nr_threads;
 
     if (cpu->compat_pvr) {
         g_assert(compat);
-        n_threads = MIN(n_threads, compat->max_threads);
+        n_threads = MIN(n_threads, compat->max_vthreads);
     }
 
     return n_threads;
diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index a5e49f23e9..dc6820c5eb 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -1395,7 +1395,7 @@ void ppc_set_compat(PowerPCCPU *cpu, uint32_t compat_pvr, Error **errp);
 #if !defined(CONFIG_USER_ONLY)
 void ppc_set_compat_all(uint32_t compat_pvr, Error **errp);
 #endif
-int ppc_compat_max_threads(PowerPCCPU *cpu);
+int ppc_compat_max_vthreads(PowerPCCPU *cpu);
 void ppc_compat_add_property(Object *obj, const char *name,
                              uint32_t *compat_pvr, const char *basedesc,
                              Error **errp);
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [Qemu-devel] [PATCH 2/2] spapr: Adjust default VSMT value for better migration compatibility
  2018-01-15  7:27 [Qemu-devel] [PATCH 0/2] Further VSMT fixes David Gibson
  2018-01-15  7:27 ` [Qemu-devel] [PATCH 1/2] target/ppc: Clarify compat mode max_threads value David Gibson
@ 2018-01-15  7:27 ` David Gibson
  2018-01-15  7:53   ` Laurent Vivier
                     ` (2 more replies)
  1 sibling, 3 replies; 10+ messages in thread
From: David Gibson @ 2018-01-15  7:27 UTC (permalink / raw)
  To: groug, joserz, surajjs, sam.bobroff
  Cc: lvivier, qemu-ppc, qemu-devel, David Gibson

fa98fbfc "PC: KVM: Support machine option to set VSMT mode" introduced the
"vsmt" parameter for the pseries machine type, which controls the spacing
of the vcpu ids of thread 0 for each virtual core.  This was done to bring
some consistency and stability to how that was done, while still allowing
backwards compatibility for migration and otherwise.

The default value we used for vsmt was set to the max of the host's
advertised default number of threads and the number of vthreads per vcore
in the guest.  This was done to continue running without extra parameters
on older KVM versions which don't allow the VSMT value to be changed.

Unfortunately, even that smaller than before leakage of host configuration
into guest visible configuration still breaks things.  Specifically a guest
with 4 (or less) vthread/vcore will get a different vsmt value when
running on a POWER8 (vsmt==8) and POWER9 (vsmt==4) host.  That means the
vcpu ids don't line up so you can't migrate between them, though you should
be able to.

Long term we really want to make vsmt == smp_threads for sufficiently
new machine types.  However, that means that qemu will then require a
sufficiently recent KVM (one which supports changing VSMT) - that's still
not widely enough deployed to be really comfortable to do.

In the meantime we some default that will work as often as possible.
This patch changes that default to 8 in all circumstances.  This does
change guest visible behaviour (including for existing machine versions)
for many cases - just not the most common/important case.

Following is case by case justification for why this is still the least
worst option.  Note that any of the old behaviours can still be duplicated
after this patch, it's just that it requires manual intervention by
setting the vsmt property on the command line.

KVM HV on POWER8 host:
   This is the overwhelmingly common case in production setups, and is
   unchanged by design.  POWER8 hosts will advertise a default VSMT mode
   of 8, and > 8 vthreads/vcore isn't permitted

KVM HV on POWER7 host:
   Will break, but POWER7s allowing KVM were never released to the public.

KVM HV on POWER9 host:
   Not yet released to the public, breaking this now will reduce other
   breakage later.

KVM HV on PowerPC 970:
   Will theoretically break it, but it was barely supported to begin with
   and already required various user visible hacks to work.  Also so old
   that I just don't care.

TCG:
   This is the nastiest one; it means migration of TCG guests (without
   manual vsmt setting) will break.  Since TCG is rarely used in production
   I think this is worth it for the other benefits.  It does also remove
   one more barrier to TCG<->KVM migration which could be interesting for
   debugging applications.

KVM PR:
   As with TCG, this will break migration of existing configurations,
   without adding extra manual vsmt options.  As with TCG, it is rare in
   production so I think the benefits outweigh breakages.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/ppc/spapr.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index e35214bfc3..8e5ef7c9de 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2305,9 +2305,14 @@ static void spapr_set_vsmt_mode(sPAPRMachineState *spapr, Error **errp)
         }
         /* In this case, spapr->vsmt has been set by the command line */
     } else {
-        /* Choose a VSMT mode that may be higher than necessary but is
-         * likely to be compatible with hosts that don't have VSMT. */
-        spapr->vsmt = MAX(kvm_smt, smp_threads);
+        /*
+         * Default VSMT value is tricky, because we need it to be as
+         * consistent as possible (for migration), but this requires
+         * changing it for at least some existing cases.  We pick 8 as
+         * the value that we'd get with KVM on POWER8, the
+         * overwhelmingly common case in production systems.
+         */
+        spapr->vsmt = 8;
     }
 
     /* KVM: If necessary, set the SMT mode: */
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [PATCH 1/2] target/ppc: Clarify compat mode max_threads value
  2018-01-15  7:27 ` [Qemu-devel] [PATCH 1/2] target/ppc: Clarify compat mode max_threads value David Gibson
@ 2018-01-15  7:52   ` Laurent Vivier
  2018-01-15  8:45   ` Greg Kurz
  2018-01-15 10:40   ` joserz
  2 siblings, 0 replies; 10+ messages in thread
From: Laurent Vivier @ 2018-01-15  7:52 UTC (permalink / raw)
  To: David Gibson, groug, joserz, surajjs, sam.bobroff; +Cc: qemu-ppc, qemu-devel

On 15/01/2018 08:27, David Gibson wrote:
> We recently had some discussions that were sidetracked for a while, because
> nearly everyone misapprehended the purpose of the 'max_threads' field in
> the compatiblity modes table.  It's all about guest expectations, not host
> expectations or support (that's handled elsewhere).
> 
> In an attempt to avoid a repeat of that confusion, rename the field to
> 'max_vthreads' and add an explanatory comment.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  hw/ppc/spapr.c      |  4 ++--
>  target/ppc/compat.c | 25 +++++++++++++++++--------
>  target/ppc/cpu.h    |  2 +-
>  3 files changed, 20 insertions(+), 11 deletions(-)
> 
Reviewed-by: Laurent Vivier <lvivier@redhat.com>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [PATCH 2/2] spapr: Adjust default VSMT value for better migration compatibility
  2018-01-15  7:27 ` [Qemu-devel] [PATCH 2/2] spapr: Adjust default VSMT value for better migration compatibility David Gibson
@ 2018-01-15  7:53   ` Laurent Vivier
  2018-01-15  9:48   ` Greg Kurz
  2018-01-15 10:38   ` joserz
  2 siblings, 0 replies; 10+ messages in thread
From: Laurent Vivier @ 2018-01-15  7:53 UTC (permalink / raw)
  To: David Gibson, groug, joserz, surajjs, sam.bobroff; +Cc: qemu-ppc, qemu-devel

On 15/01/2018 08:27, David Gibson wrote:
> fa98fbfc "PC: KVM: Support machine option to set VSMT mode" introduced the
> "vsmt" parameter for the pseries machine type, which controls the spacing
> of the vcpu ids of thread 0 for each virtual core.  This was done to bring
> some consistency and stability to how that was done, while still allowing
> backwards compatibility for migration and otherwise.
> 
> The default value we used for vsmt was set to the max of the host's
> advertised default number of threads and the number of vthreads per vcore
> in the guest.  This was done to continue running without extra parameters
> on older KVM versions which don't allow the VSMT value to be changed.
> 
> Unfortunately, even that smaller than before leakage of host configuration
> into guest visible configuration still breaks things.  Specifically a guest
> with 4 (or less) vthread/vcore will get a different vsmt value when
> running on a POWER8 (vsmt==8) and POWER9 (vsmt==4) host.  That means the
> vcpu ids don't line up so you can't migrate between them, though you should
> be able to.
> 
> Long term we really want to make vsmt == smp_threads for sufficiently
> new machine types.  However, that means that qemu will then require a
> sufficiently recent KVM (one which supports changing VSMT) - that's still
> not widely enough deployed to be really comfortable to do.
> 
> In the meantime we some default that will work as often as possible.
> This patch changes that default to 8 in all circumstances.  This does
> change guest visible behaviour (including for existing machine versions)
> for many cases - just not the most common/important case.
> 
> Following is case by case justification for why this is still the least
> worst option.  Note that any of the old behaviours can still be duplicated
> after this patch, it's just that it requires manual intervention by
> setting the vsmt property on the command line.
> 
> KVM HV on POWER8 host:
>    This is the overwhelmingly common case in production setups, and is
>    unchanged by design.  POWER8 hosts will advertise a default VSMT mode
>    of 8, and > 8 vthreads/vcore isn't permitted
> 
> KVM HV on POWER7 host:
>    Will break, but POWER7s allowing KVM were never released to the public.
> 
> KVM HV on POWER9 host:
>    Not yet released to the public, breaking this now will reduce other
>    breakage later.
> 
> KVM HV on PowerPC 970:
>    Will theoretically break it, but it was barely supported to begin with
>    and already required various user visible hacks to work.  Also so old
>    that I just don't care.
> 
> TCG:
>    This is the nastiest one; it means migration of TCG guests (without
>    manual vsmt setting) will break.  Since TCG is rarely used in production
>    I think this is worth it for the other benefits.  It does also remove
>    one more barrier to TCG<->KVM migration which could be interesting for
>    debugging applications.
> 
> KVM PR:
>    As with TCG, this will break migration of existing configurations,
>    without adding extra manual vsmt options.  As with TCG, it is rare in
>    production so I think the benefits outweigh breakages.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  hw/ppc/spapr.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)

Reviewed-by: Laurent Vivier <lvivier@redhat.com>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [PATCH 1/2] target/ppc: Clarify compat mode max_threads value
  2018-01-15  7:27 ` [Qemu-devel] [PATCH 1/2] target/ppc: Clarify compat mode max_threads value David Gibson
  2018-01-15  7:52   ` Laurent Vivier
@ 2018-01-15  8:45   ` Greg Kurz
  2018-01-15 10:40   ` joserz
  2 siblings, 0 replies; 10+ messages in thread
From: Greg Kurz @ 2018-01-15  8:45 UTC (permalink / raw)
  To: David Gibson; +Cc: joserz, surajjs, sam.bobroff, lvivier, qemu-ppc, qemu-devel

On Mon, 15 Jan 2018 18:27:14 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> We recently had some discussions that were sidetracked for a while, because
> nearly everyone misapprehended the purpose of the 'max_threads' field in
> the compatiblity modes table.  It's all about guest expectations, not host
> expectations or support (that's handled elsewhere).
> 
> In an attempt to avoid a repeat of that confusion, rename the field to
> 'max_vthreads' and add an explanatory comment.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---

Reviewed-by: Greg Kurz <groug@kaod.org>

>  hw/ppc/spapr.c      |  4 ++--
>  target/ppc/compat.c | 25 +++++++++++++++++--------
>  target/ppc/cpu.h    |  2 +-
>  3 files changed, 20 insertions(+), 11 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 3e528fe91e..e35214bfc3 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -345,7 +345,7 @@ static int spapr_fixup_cpu_dt(void *fdt, sPAPRMachineState *spapr)
>          PowerPCCPU *cpu = POWERPC_CPU(cs);
>          DeviceClass *dc = DEVICE_GET_CLASS(cs);
>          int index = spapr_vcpu_id(cpu);
> -        int compat_smt = MIN(smp_threads, ppc_compat_max_threads(cpu));
> +        int compat_smt = MIN(smp_threads, ppc_compat_max_vthreads(cpu));
>  
>          if ((index % smt) != 0) {
>              continue;
> @@ -503,7 +503,7 @@ static void spapr_populate_cpu_dt(CPUState *cs, void *fdt, int offset,
>      size_t page_sizes_prop_size;
>      uint32_t vcpus_per_socket = smp_threads * smp_cores;
>      uint32_t pft_size_prop[] = {0, cpu_to_be32(spapr->htab_shift)};
> -    int compat_smt = MIN(smp_threads, ppc_compat_max_threads(cpu));
> +    int compat_smt = MIN(smp_threads, ppc_compat_max_vthreads(cpu));
>      sPAPRDRConnector *drc;
>      int drc_index;
>      uint32_t radix_AP_encodings[PPC_PAGE_SIZES_MAX_SZ];
> diff --git a/target/ppc/compat.c b/target/ppc/compat.c
> index 276b5b52c2..807c906f68 100644
> --- a/target/ppc/compat.c
> +++ b/target/ppc/compat.c
> @@ -32,7 +32,16 @@ typedef struct {
>      uint32_t pvr;
>      uint64_t pcr;
>      uint64_t pcr_level;
> -    int max_threads;
> +
> +    /*
> +     * Maximum allowed virtual threads per virtual core
> +     *
> +     * This is to stop older guests getting confused by seeing more
> +     * threads than they think the cpu can support.  Usually it's
> +     * equal to the number of threads supported on bare metal
> +     * hardware, but not always (see POWER9).
> +     */
> +    int max_vthreads;
>  } CompatInfo;
>  
>  static const CompatInfo compat_table[] = {
> @@ -45,28 +54,28 @@ static const CompatInfo compat_table[] = {
>          .pcr = PCR_COMPAT_3_00 | PCR_COMPAT_2_07 | PCR_COMPAT_2_06 |
>                 PCR_COMPAT_2_05 | PCR_TM_DIS | PCR_VSX_DIS,
>          .pcr_level = PCR_COMPAT_2_05,
> -        .max_threads = 2,
> +        .max_vthreads = 2,
>      },
>      { /* POWER7, ISA2.06 */
>          .name = "power7",
>          .pvr = CPU_POWERPC_LOGICAL_2_06,
>          .pcr = PCR_COMPAT_3_00 | PCR_COMPAT_2_07 | PCR_COMPAT_2_06 | PCR_TM_DIS,
>          .pcr_level = PCR_COMPAT_2_06,
> -        .max_threads = 4,
> +        .max_vthreads = 4,
>      },
>      {
>          .name = "power7+",
>          .pvr = CPU_POWERPC_LOGICAL_2_06_PLUS,
>          .pcr = PCR_COMPAT_3_00 | PCR_COMPAT_2_07 | PCR_COMPAT_2_06 | PCR_TM_DIS,
>          .pcr_level = PCR_COMPAT_2_06,
> -        .max_threads = 4,
> +        .max_vthreads = 4,
>      },
>      { /* POWER8, ISA2.07 */
>          .name = "power8",
>          .pvr = CPU_POWERPC_LOGICAL_2_07,
>          .pcr = PCR_COMPAT_3_00 | PCR_COMPAT_2_07,
>          .pcr_level = PCR_COMPAT_2_07,
> -        .max_threads = 8,
> +        .max_vthreads = 8,
>      },
>      { /* POWER9, ISA3.00 */
>          .name = "power9",
> @@ -80,7 +89,7 @@ static const CompatInfo compat_table[] = {
>           * confusing if half of the threads disappear from the guest
>           * if it announces it's POWER9 aware at CAS time.
>           */
> -        .max_threads = 8,
> +        .max_vthreads = 8,
>      },
>  };
>  
> @@ -192,14 +201,14 @@ void ppc_set_compat_all(uint32_t compat_pvr, Error **errp)
>      }
>  }
>  
> -int ppc_compat_max_threads(PowerPCCPU *cpu)
> +int ppc_compat_max_vthreads(PowerPCCPU *cpu)
>  {
>      const CompatInfo *compat = compat_by_pvr(cpu->compat_pvr);
>      int n_threads = CPU(cpu)->nr_threads;
>  
>      if (cpu->compat_pvr) {
>          g_assert(compat);
> -        n_threads = MIN(n_threads, compat->max_threads);
> +        n_threads = MIN(n_threads, compat->max_vthreads);
>      }
>  
>      return n_threads;
> diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
> index a5e49f23e9..dc6820c5eb 100644
> --- a/target/ppc/cpu.h
> +++ b/target/ppc/cpu.h
> @@ -1395,7 +1395,7 @@ void ppc_set_compat(PowerPCCPU *cpu, uint32_t compat_pvr, Error **errp);
>  #if !defined(CONFIG_USER_ONLY)
>  void ppc_set_compat_all(uint32_t compat_pvr, Error **errp);
>  #endif
> -int ppc_compat_max_threads(PowerPCCPU *cpu);
> +int ppc_compat_max_vthreads(PowerPCCPU *cpu);
>  void ppc_compat_add_property(Object *obj, const char *name,
>                               uint32_t *compat_pvr, const char *basedesc,
>                               Error **errp);

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [PATCH 2/2] spapr: Adjust default VSMT value for better migration compatibility
  2018-01-15  7:27 ` [Qemu-devel] [PATCH 2/2] spapr: Adjust default VSMT value for better migration compatibility David Gibson
  2018-01-15  7:53   ` Laurent Vivier
@ 2018-01-15  9:48   ` Greg Kurz
  2018-01-16  4:42     ` David Gibson
  2018-01-15 10:38   ` joserz
  2 siblings, 1 reply; 10+ messages in thread
From: Greg Kurz @ 2018-01-15  9:48 UTC (permalink / raw)
  To: David Gibson; +Cc: joserz, surajjs, sam.bobroff, lvivier, qemu-ppc, qemu-devel

On Mon, 15 Jan 2018 18:27:15 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> fa98fbfc "PC: KVM: Support machine option to set VSMT mode" introduced the
> "vsmt" parameter for the pseries machine type, which controls the spacing
> of the vcpu ids of thread 0 for each virtual core.  This was done to bring
> some consistency and stability to how that was done, while still allowing
> backwards compatibility for migration and otherwise.
> 
> The default value we used for vsmt was set to the max of the host's
> advertised default number of threads and the number of vthreads per vcore
> in the guest.  This was done to continue running without extra parameters
> on older KVM versions which don't allow the VSMT value to be changed.
> 
> Unfortunately, even that smaller than before leakage of host configuration
> into guest visible configuration still breaks things.  Specifically a guest
> with 4 (or less) vthread/vcore will get a different vsmt value when
> running on a POWER8 (vsmt==8) and POWER9 (vsmt==4) host.  That means the
> vcpu ids don't line up so you can't migrate between them, though you should
> be able to.
> 
> Long term we really want to make vsmt == smp_threads for sufficiently
> new machine types.  However, that means that qemu will then require a
> sufficiently recent KVM (one which supports changing VSMT) - that's still
> not widely enough deployed to be really comfortable to do.
> 
> In the meantime we some default that will work as often as possible.

s/we some/we need some/ ?

> This patch changes that default to 8 in all circumstances.  This does
> change guest visible behaviour (including for existing machine versions)
> for many cases - just not the most common/important case.
> 
> Following is case by case justification for why this is still the least
> worst option.  Note that any of the old behaviours can still be duplicated
> after this patch, it's just that it requires manual intervention by
> setting the vsmt property on the command line.
> 

IIUC this unconditionally breaks existing setups that rely on static
Micro-Threading on a POWER8 host (eg, subcores-per-core=2 on the host
and smp_threads=4). I have no evidence this is a widely used setup,
but FWIW it is documented in some IBM RedBooks:

"Performance Optimization and Tuning Techniques for IBM Power Systems
 Processors Including IBM POWER8"

http://www.redbooks.ibm.com/abstracts/sg248171.html?Open

"IBM PowerKVM: Configuration and Use"

http://www.redbooks.ibm.com/abstracts/sg248231.html?Open

Maybe the new behaviour could be added for new machine types only ?

Anyway, in case you don't want extra complexity,

Reviewed-by: Greg Kurz <groug@kaod.org>

> KVM HV on POWER8 host:
>    This is the overwhelmingly common case in production setups, and is
>    unchanged by design.  POWER8 hosts will advertise a default VSMT mode
>    of 8, and > 8 vthreads/vcore isn't permitted
> 
> KVM HV on POWER7 host:
>    Will break, but POWER7s allowing KVM were never released to the public.
> 
> KVM HV on POWER9 host:
>    Not yet released to the public, breaking this now will reduce other
>    breakage later.
> 
> KVM HV on PowerPC 970:
>    Will theoretically break it, but it was barely supported to begin with
>    and already required various user visible hacks to work.  Also so old
>    that I just don't care.
> 
> TCG:
>    This is the nastiest one; it means migration of TCG guests (without
>    manual vsmt setting) will break.  Since TCG is rarely used in production
>    I think this is worth it for the other benefits.  It does also remove
>    one more barrier to TCG<->KVM migration which could be interesting for
>    debugging applications.
> 
> KVM PR:
>    As with TCG, this will break migration of existing configurations,
>    without adding extra manual vsmt options.  As with TCG, it is rare in
>    production so I think the benefits outweigh breakages.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  hw/ppc/spapr.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index e35214bfc3..8e5ef7c9de 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -2305,9 +2305,14 @@ static void spapr_set_vsmt_mode(sPAPRMachineState *spapr, Error **errp)
>          }
>          /* In this case, spapr->vsmt has been set by the command line */
>      } else {
> -        /* Choose a VSMT mode that may be higher than necessary but is
> -         * likely to be compatible with hosts that don't have VSMT. */
> -        spapr->vsmt = MAX(kvm_smt, smp_threads);
> +        /*
> +         * Default VSMT value is tricky, because we need it to be as
> +         * consistent as possible (for migration), but this requires
> +         * changing it for at least some existing cases.  We pick 8 as
> +         * the value that we'd get with KVM on POWER8, the
> +         * overwhelmingly common case in production systems.
> +         */
> +        spapr->vsmt = 8;
>      }
>  
>      /* KVM: If necessary, set the SMT mode: */

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [PATCH 2/2] spapr: Adjust default VSMT value for better migration compatibility
  2018-01-15  7:27 ` [Qemu-devel] [PATCH 2/2] spapr: Adjust default VSMT value for better migration compatibility David Gibson
  2018-01-15  7:53   ` Laurent Vivier
  2018-01-15  9:48   ` Greg Kurz
@ 2018-01-15 10:38   ` joserz
  2 siblings, 0 replies; 10+ messages in thread
From: joserz @ 2018-01-15 10:38 UTC (permalink / raw)
  To: David Gibson; +Cc: groug, surajjs, sam.bobroff, lvivier, qemu-ppc, qemu-devel

On Mon, Jan 15, 2018 at 06:27:15PM +1100, David Gibson wrote:
> fa98fbfc "PC: KVM: Support machine option to set VSMT mode" introduced the
> "vsmt" parameter for the pseries machine type, which controls the spacing
> of the vcpu ids of thread 0 for each virtual core.  This was done to bring
> some consistency and stability to how that was done, while still allowing
> backwards compatibility for migration and otherwise.
> 
> The default value we used for vsmt was set to the max of the host's
> advertised default number of threads and the number of vthreads per vcore
> in the guest.  This was done to continue running without extra parameters
> on older KVM versions which don't allow the VSMT value to be changed.
> 
> Unfortunately, even that smaller than before leakage of host configuration
> into guest visible configuration still breaks things.  Specifically a guest
> with 4 (or less) vthread/vcore will get a different vsmt value when
> running on a POWER8 (vsmt==8) and POWER9 (vsmt==4) host.  That means the
> vcpu ids don't line up so you can't migrate between them, though you should
> be able to.
> 
> Long term we really want to make vsmt == smp_threads for sufficiently
> new machine types.  However, that means that qemu will then require a
> sufficiently recent KVM (one which supports changing VSMT) - that's still
> not widely enough deployed to be really comfortable to do.
> 
> In the meantime we some default that will work as often as possible.
> This patch changes that default to 8 in all circumstances.  This does
> change guest visible behaviour (including for existing machine versions)
> for many cases - just not the most common/important case.
> 
> Following is case by case justification for why this is still the least
> worst option.  Note that any of the old behaviours can still be duplicated
> after this patch, it's just that it requires manual intervention by
> setting the vsmt property on the command line.
> 
> KVM HV on POWER8 host:
>    This is the overwhelmingly common case in production setups, and is
>    unchanged by design.  POWER8 hosts will advertise a default VSMT mode
>    of 8, and > 8 vthreads/vcore isn't permitted
> 
> KVM HV on POWER7 host:
>    Will break, but POWER7s allowing KVM were never released to the public.
> 
> KVM HV on POWER9 host:
>    Not yet released to the public, breaking this now will reduce other
>    breakage later.
> 
> KVM HV on PowerPC 970:
>    Will theoretically break it, but it was barely supported to begin with
>    and already required various user visible hacks to work.  Also so old
>    that I just don't care.
> 
> TCG:
>    This is the nastiest one; it means migration of TCG guests (without
>    manual vsmt setting) will break.  Since TCG is rarely used in production
>    I think this is worth it for the other benefits.  It does also remove
>    one more barrier to TCG<->KVM migration which could be interesting for
>    debugging applications.
> 
> KVM PR:
>    As with TCG, this will break migration of existing configurations,
>    without adding extra manual vsmt options.  As with TCG, it is rare in
>    production so I think the benefits outweigh breakages.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  hw/ppc/spapr.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index e35214bfc3..8e5ef7c9de 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -2305,9 +2305,14 @@ static void spapr_set_vsmt_mode(sPAPRMachineState *spapr, Error **errp)
>          }
>          /* In this case, spapr->vsmt has been set by the command line */
>      } else {
> -        /* Choose a VSMT mode that may be higher than necessary but is
> -         * likely to be compatible with hosts that don't have VSMT. */
> -        spapr->vsmt = MAX(kvm_smt, smp_threads);
> +        /*
> +         * Default VSMT value is tricky, because we need it to be as
> +         * consistent as possible (for migration), but this requires
> +         * changing it for at least some existing cases.  We pick 8 as
> +         * the value that we'd get with KVM on POWER8, the
> +         * overwhelmingly common case in production systems.
> +         */
> +        spapr->vsmt = 8;
>      }
> 
>      /* KVM: If necessary, set the SMT mode: */
> -- 
> 2.14.3
> 

Great rationale.

Reviewed-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [PATCH 1/2] target/ppc: Clarify compat mode max_threads value
  2018-01-15  7:27 ` [Qemu-devel] [PATCH 1/2] target/ppc: Clarify compat mode max_threads value David Gibson
  2018-01-15  7:52   ` Laurent Vivier
  2018-01-15  8:45   ` Greg Kurz
@ 2018-01-15 10:40   ` joserz
  2 siblings, 0 replies; 10+ messages in thread
From: joserz @ 2018-01-15 10:40 UTC (permalink / raw)
  To: David Gibson; +Cc: groug, surajjs, sam.bobroff, lvivier, qemu-ppc, qemu-devel

On Mon, Jan 15, 2018 at 06:27:14PM +1100, David Gibson wrote:
> We recently had some discussions that were sidetracked for a while, because
> nearly everyone misapprehended the purpose of the 'max_threads' field in
> the compatiblity modes table.  It's all about guest expectations, not host
> expectations or support (that's handled elsewhere).
> 
> In an attempt to avoid a repeat of that confusion, rename the field to
> 'max_vthreads' and add an explanatory comment.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  hw/ppc/spapr.c      |  4 ++--
>  target/ppc/compat.c | 25 +++++++++++++++++--------
>  target/ppc/cpu.h    |  2 +-
>  3 files changed, 20 insertions(+), 11 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 3e528fe91e..e35214bfc3 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -345,7 +345,7 @@ static int spapr_fixup_cpu_dt(void *fdt, sPAPRMachineState *spapr)
>          PowerPCCPU *cpu = POWERPC_CPU(cs);
>          DeviceClass *dc = DEVICE_GET_CLASS(cs);
>          int index = spapr_vcpu_id(cpu);
> -        int compat_smt = MIN(smp_threads, ppc_compat_max_threads(cpu));
> +        int compat_smt = MIN(smp_threads, ppc_compat_max_vthreads(cpu));
> 
>          if ((index % smt) != 0) {
>              continue;
> @@ -503,7 +503,7 @@ static void spapr_populate_cpu_dt(CPUState *cs, void *fdt, int offset,
>      size_t page_sizes_prop_size;
>      uint32_t vcpus_per_socket = smp_threads * smp_cores;
>      uint32_t pft_size_prop[] = {0, cpu_to_be32(spapr->htab_shift)};
> -    int compat_smt = MIN(smp_threads, ppc_compat_max_threads(cpu));
> +    int compat_smt = MIN(smp_threads, ppc_compat_max_vthreads(cpu));
>      sPAPRDRConnector *drc;
>      int drc_index;
>      uint32_t radix_AP_encodings[PPC_PAGE_SIZES_MAX_SZ];
> diff --git a/target/ppc/compat.c b/target/ppc/compat.c
> index 276b5b52c2..807c906f68 100644
> --- a/target/ppc/compat.c
> +++ b/target/ppc/compat.c
> @@ -32,7 +32,16 @@ typedef struct {
>      uint32_t pvr;
>      uint64_t pcr;
>      uint64_t pcr_level;
> -    int max_threads;
> +
> +    /*
> +     * Maximum allowed virtual threads per virtual core
> +     *
> +     * This is to stop older guests getting confused by seeing more
> +     * threads than they think the cpu can support.  Usually it's
> +     * equal to the number of threads supported on bare metal
> +     * hardware, but not always (see POWER9).
> +     */
> +    int max_vthreads;
>  } CompatInfo;
> 
>  static const CompatInfo compat_table[] = {
> @@ -45,28 +54,28 @@ static const CompatInfo compat_table[] = {
>          .pcr = PCR_COMPAT_3_00 | PCR_COMPAT_2_07 | PCR_COMPAT_2_06 |
>                 PCR_COMPAT_2_05 | PCR_TM_DIS | PCR_VSX_DIS,
>          .pcr_level = PCR_COMPAT_2_05,
> -        .max_threads = 2,
> +        .max_vthreads = 2,
>      },
>      { /* POWER7, ISA2.06 */
>          .name = "power7",
>          .pvr = CPU_POWERPC_LOGICAL_2_06,
>          .pcr = PCR_COMPAT_3_00 | PCR_COMPAT_2_07 | PCR_COMPAT_2_06 | PCR_TM_DIS,
>          .pcr_level = PCR_COMPAT_2_06,
> -        .max_threads = 4,
> +        .max_vthreads = 4,
>      },
>      {
>          .name = "power7+",
>          .pvr = CPU_POWERPC_LOGICAL_2_06_PLUS,
>          .pcr = PCR_COMPAT_3_00 | PCR_COMPAT_2_07 | PCR_COMPAT_2_06 | PCR_TM_DIS,
>          .pcr_level = PCR_COMPAT_2_06,
> -        .max_threads = 4,
> +        .max_vthreads = 4,
>      },
>      { /* POWER8, ISA2.07 */
>          .name = "power8",
>          .pvr = CPU_POWERPC_LOGICAL_2_07,
>          .pcr = PCR_COMPAT_3_00 | PCR_COMPAT_2_07,
>          .pcr_level = PCR_COMPAT_2_07,
> -        .max_threads = 8,
> +        .max_vthreads = 8,
>      },
>      { /* POWER9, ISA3.00 */
>          .name = "power9",
> @@ -80,7 +89,7 @@ static const CompatInfo compat_table[] = {
>           * confusing if half of the threads disappear from the guest
>           * if it announces it's POWER9 aware at CAS time.
>           */
> -        .max_threads = 8,
> +        .max_vthreads = 8,
>      },
>  };
> 
> @@ -192,14 +201,14 @@ void ppc_set_compat_all(uint32_t compat_pvr, Error **errp)
>      }
>  }
> 
> -int ppc_compat_max_threads(PowerPCCPU *cpu)
> +int ppc_compat_max_vthreads(PowerPCCPU *cpu)
>  {
>      const CompatInfo *compat = compat_by_pvr(cpu->compat_pvr);
>      int n_threads = CPU(cpu)->nr_threads;
> 
>      if (cpu->compat_pvr) {
>          g_assert(compat);
> -        n_threads = MIN(n_threads, compat->max_threads);
> +        n_threads = MIN(n_threads, compat->max_vthreads);
>      }
> 
>      return n_threads;
> diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
> index a5e49f23e9..dc6820c5eb 100644
> --- a/target/ppc/cpu.h
> +++ b/target/ppc/cpu.h
> @@ -1395,7 +1395,7 @@ void ppc_set_compat(PowerPCCPU *cpu, uint32_t compat_pvr, Error **errp);
>  #if !defined(CONFIG_USER_ONLY)
>  void ppc_set_compat_all(uint32_t compat_pvr, Error **errp);
>  #endif
> -int ppc_compat_max_threads(PowerPCCPU *cpu);
> +int ppc_compat_max_vthreads(PowerPCCPU *cpu);
>  void ppc_compat_add_property(Object *obj, const char *name,
>                               uint32_t *compat_pvr, const char *basedesc,
>                               Error **errp);
> -- 
> 2.14.3
> 

Reviewed-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [PATCH 2/2] spapr: Adjust default VSMT value for better migration compatibility
  2018-01-15  9:48   ` Greg Kurz
@ 2018-01-16  4:42     ` David Gibson
  0 siblings, 0 replies; 10+ messages in thread
From: David Gibson @ 2018-01-16  4:42 UTC (permalink / raw)
  To: Greg Kurz; +Cc: joserz, surajjs, sam.bobroff, lvivier, qemu-ppc, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 3551 bytes --]

On Mon, Jan 15, 2018 at 10:48:47AM +0100, Greg Kurz wrote:
> On Mon, 15 Jan 2018 18:27:15 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > fa98fbfc "PC: KVM: Support machine option to set VSMT mode" introduced the
> > "vsmt" parameter for the pseries machine type, which controls the spacing
> > of the vcpu ids of thread 0 for each virtual core.  This was done to bring
> > some consistency and stability to how that was done, while still allowing
> > backwards compatibility for migration and otherwise.
> > 
> > The default value we used for vsmt was set to the max of the host's
> > advertised default number of threads and the number of vthreads per vcore
> > in the guest.  This was done to continue running without extra parameters
> > on older KVM versions which don't allow the VSMT value to be changed.
> > 
> > Unfortunately, even that smaller than before leakage of host configuration
> > into guest visible configuration still breaks things.  Specifically a guest
> > with 4 (or less) vthread/vcore will get a different vsmt value when
> > running on a POWER8 (vsmt==8) and POWER9 (vsmt==4) host.  That means the
> > vcpu ids don't line up so you can't migrate between them, though you should
> > be able to.
> > 
> > Long term we really want to make vsmt == smp_threads for sufficiently
> > new machine types.  However, that means that qemu will then require a
> > sufficiently recent KVM (one which supports changing VSMT) - that's still
> > not widely enough deployed to be really comfortable to do.
> > 
> > In the meantime we some default that will work as often as possible.
> 
> s/we some/we need some/ ?

Corrected.

> > This patch changes that default to 8 in all circumstances.  This does
> > change guest visible behaviour (including for existing machine versions)
> > for many cases - just not the most common/important case.
> > 
> > Following is case by case justification for why this is still the least
> > worst option.  Note that any of the old behaviours can still be duplicated
> > after this patch, it's just that it requires manual intervention by
> > setting the vsmt property on the command line.
> > 
> 
> IIUC this unconditionally breaks existing setups that rely on static
> Micro-Threading on a POWER8 host (eg, subcores-per-core=2 on the host
> and smp_threads=4). I have no evidence this is a widely used setup,
> but FWIW it is documented in some IBM RedBooks:

Well.. it will break migration between old and new qemu on the
microthreaded setup,  but fix it between new qemu on microthreaded
setup and new qemu on non-microthreaded setup (old qemu on
microthreaded to old qemu on non-microthreaded was already broken for
the same reasons as p8<->p9).  It's not really obvious to me which is
preferable.

> "Performance Optimization and Tuning Techniques for IBM Power Systems
>  Processors Including IBM POWER8"
> 
> http://www.redbooks.ibm.com/abstracts/sg248171.html?Open
> 
> "IBM PowerKVM: Configuration and Use"
> 
> http://www.redbooks.ibm.com/abstracts/sg248231.html?Open
>
> Maybe the new behaviour could be added for new machine types only ?

I'd really prefer not to.  It makes some existing cases work, but
breaks some other cases.  Given that the old behaviour is inherently
wrong, I'm more inclined to change it.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2018-01-16  4:47 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-15  7:27 [Qemu-devel] [PATCH 0/2] Further VSMT fixes David Gibson
2018-01-15  7:27 ` [Qemu-devel] [PATCH 1/2] target/ppc: Clarify compat mode max_threads value David Gibson
2018-01-15  7:52   ` Laurent Vivier
2018-01-15  8:45   ` Greg Kurz
2018-01-15 10:40   ` joserz
2018-01-15  7:27 ` [Qemu-devel] [PATCH 2/2] spapr: Adjust default VSMT value for better migration compatibility David Gibson
2018-01-15  7:53   ` Laurent Vivier
2018-01-15  9:48   ` Greg Kurz
2018-01-16  4:42     ` David Gibson
2018-01-15 10:38   ` joserz

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.