* [Qemu-devel] [PATCH 0/9] spapr: Clean up pagesize handling
@ 2018-06-18  6:35 David Gibson
  2018-06-18  6:35 ` [Qemu-devel] [PATCH 1/9] target/ppc: Allow cpu compatiblity checks based on type, not instance David Gibson
                   ` (10 more replies)
  0 siblings, 11 replies; 43+ messages in thread
From: David Gibson @ 2018-06-18  6:35 UTC
  To: groug, abologna; +Cc: clg, qemu-ppc, qemu-devel, aik, David Gibson

Currently the "pseries" machine type will (usually) advertise
different pagesizes to the guest when running under KVM and TCG, which
is not how things are supposed to work.

This comes from poor handling of hardware limitations which mean that
under KVM HV the guest is unable to use pagesizes larger than those
backing the guest's RAM on the host side.

The new scheme turns things around by having an explicit machine
parameter controlling the largest page size that the guest is allowed
to use.  This limitation applies regardless of accelerator.  When
we're running on KVM HV we ensure that our backing pages are adequate
to supply the requested guest page sizes, rather than adjusting the
guest page sizes based on what KVM can supply.

This means that in order to use hugepages in a PAPR guest it's
necessary to add a "cap-hpt-max-page-size=16m" machine parameter as
well as setting the mem-path correctly.  This is a bit more work on
the user and/or management side, but results in consistent behaviour
so I think it's worth it.

Longer term, we might also use this parameter to control IOMMU page
sizes.  But, I'm still working out how restrictions deriving from the
guest kernel, host kernel and hardware capabilities all interact here.

This applies on top of my ppc-for-3.0 tree.

Changes since RFC:
 * Add preliminary cleanups to allow us to evaluate effective
   capabilities levels earlier.
 * Don't try to remove double resetting of cpus.  It doesn't quite
   work, and is no longer necessary with the above.
 * Some user-friendliness improvements: use "hpt-max-page-size"
   instead of the cryptic "hpt-mps", and take an actual page size
   (allowing k/m/g suffixes) instead of a shift

David Gibson (9):
  target/ppc: Allow cpu compatiblity checks based on type, not instance
  spapr: Compute effective capability values earlier
  spapr: Add cpu_apply hook to capabilities
  target/ppc: Add kvmppc_hpt_needs_host_contiguous_pages() helper
  spapr: Maximum (HPT) pagesize property
  spapr: Use maximum page size capability to simplify memory backend
    checking
  target/ppc: Add ppc_hash64_filter_pagesizes()
  spapr: Limit available pagesizes to provide a consistent guest
    environment
  spapr: Don't rewrite mmu capabilities in KVM mode

 hw/ppc/spapr.c          |  45 +++++++-----
 hw/ppc/spapr_caps.c     | 156 ++++++++++++++++++++++++++++++++++++----
 hw/ppc/spapr_cpu_core.c |   4 ++
 include/hw/ppc/spapr.h  |  11 ++-
 target/ppc/compat.c     |  27 +++++--
 target/ppc/cpu.h        |   4 ++
 target/ppc/kvm.c        | 146 ++++++++++++++++++-------------------
 target/ppc/kvm_ppc.h    |  11 ++-
 target/ppc/mmu-hash64.c |  59 +++++++++++++++
 target/ppc/mmu-hash64.h |   3 +
 10 files changed, 349 insertions(+), 117 deletions(-)

-- 
2.17.1


* [Qemu-devel] [PATCH 1/9] target/ppc: Allow cpu compatiblity checks based on type, not instance
  2018-06-18  6:35 [Qemu-devel] [PATCH 0/9] spapr: Clean up pagesize handling David Gibson
@ 2018-06-18  6:35 ` David Gibson
  2018-06-18 13:22   ` Greg Kurz
  2018-06-21  5:20   ` Cédric Le Goater
  2018-06-18  6:35 ` [Qemu-devel] [PATCH 2/9] spapr: Compute effective capability values earlier David Gibson
                   ` (9 subsequent siblings)
  10 siblings, 2 replies; 43+ messages in thread
From: David Gibson @ 2018-06-18  6:35 UTC
  To: groug, abologna; +Cc: clg, qemu-ppc, qemu-devel, aik, David Gibson

ppc_check_compat() is used in a number of places to check if a cpu object
supports a certain compatibility mode, subject to various constraints.

It takes a PowerPCCPU *, however it really only depends on the cpu's class.
We have upcoming cases where it would be useful to make compatibility
checks before we fully instantiate the cpu objects.

ppc_type_check_compat() will now make an equivalent check, but based on a
CPU's QOM typename instead of an instantiated CPU object.

We make use of the new interface in several places in spapr, where we're
essentially making a global check, rather than one specific to a particular
cpu.  This avoids some ugly uses of first_cpu to grab a "representative"
instance.
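
The shape of this refactoring can be sketched outside QEMU roughly as
follows.  This is only an illustration under stated assumptions, not the
QEMU code: CpuClass, Cpu, class_by_name() and the numeric levels are
hypothetical stand-ins for PowerPCCPUClass, PowerPCCPU,
object_class_by_name() and the compat PVRs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for PowerPCCPUClass and PowerPCCPU. */
typedef struct CpuClass {
    const char *name;
    unsigned max_level;     /* highest compat level this CPU type supports */
} CpuClass;

typedef struct Cpu {
    const CpuClass *klass;
} Cpu;

static const CpuClass cpu_classes[] = {
    { "power8", 207 },
    { "power9", 300 },
};

/* Hypothetical lookup, playing the role of object_class_by_name(). */
const CpuClass *class_by_name(const char *name)
{
    for (size_t i = 0; i < sizeof(cpu_classes) / sizeof(cpu_classes[0]); i++) {
        if (strcmp(cpu_classes[i].name, name) == 0) {
            return &cpu_classes[i];
        }
    }
    return NULL;
}

/* Shared core: depends only on the class, never on an instance. */
bool class_check_compat(const CpuClass *klass, unsigned level,
                        unsigned max_allowed)
{
    if (klass == NULL || level > klass->max_level) {
        return false;
    }
    /* max_allowed == 0 means "no upper limit" in this sketch. */
    return max_allowed == 0 || level <= max_allowed;
}

/* Instance-based wrapper: needs a fully constructed cpu object. */
bool cpu_check_compat(const Cpu *cpu, unsigned level, unsigned max_allowed)
{
    assert(cpu != NULL);
    return class_check_compat(cpu->klass, level, max_allowed);
}

/* Type-based wrapper: usable before any cpu has been instantiated. */
bool type_check_compat(const char *type, unsigned level, unsigned max_allowed)
{
    return class_check_compat(class_by_name(type), level, max_allowed);
}
```

Both wrappers give the same answer for the same CPU type, which is what
lets machine-wide checks drop their dependency on first_cpu.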

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/ppc/spapr.c      | 10 ++++------
 hw/ppc/spapr_caps.c | 19 +++++++++----------
 target/ppc/compat.c | 27 +++++++++++++++++++++------
 target/ppc/cpu.h    |  4 ++++
 4 files changed, 38 insertions(+), 22 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index db0fb385d4..b0b94fc1f0 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1616,8 +1616,8 @@ static void spapr_machine_reset(void)
 
     first_ppc_cpu = POWERPC_CPU(first_cpu);
     if (kvm_enabled() && kvmppc_has_cap_mmu_radix() &&
-        ppc_check_compat(first_ppc_cpu, CPU_POWERPC_LOGICAL_3_00, 0,
-                         spapr->max_compat_pvr)) {
+        ppc_type_check_compat(machine->cpu_type, CPU_POWERPC_LOGICAL_3_00, 0,
+                              spapr->max_compat_pvr)) {
         /* If using KVM with radix mode available, VCPUs can be started
          * without a HPT because KVM will start them in radix mode.
          * Set the GR bit in PATB so that we know there is no HPT. */
@@ -2520,7 +2520,6 @@ static void spapr_machine_init(MachineState *machine)
     long load_limit, fw_size;
     char *filename;
     Error *resize_hpt_err = NULL;
-    PowerPCCPU *first_ppc_cpu;
 
     msi_nonbroken = true;
 
@@ -2618,10 +2617,9 @@ static void spapr_machine_init(MachineState *machine)
     /* init CPUs */
     spapr_init_cpus(spapr);
 
-    first_ppc_cpu = POWERPC_CPU(first_cpu);
     if ((!kvm_enabled() || kvmppc_has_cap_mmu_radix()) &&
-        ppc_check_compat(first_ppc_cpu, CPU_POWERPC_LOGICAL_3_00, 0,
-                         spapr->max_compat_pvr)) {
+        ppc_type_check_compat(machine->cpu_type, CPU_POWERPC_LOGICAL_3_00, 0,
+                              spapr->max_compat_pvr)) {
         /* KVM and TCG always allow GTSE with radix... */
         spapr_ovec_set(spapr->ov5, OV5_MMU_RADIX_GTSE);
     }
diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
index 00e43a9ba7..469f38f0ef 100644
--- a/hw/ppc/spapr_caps.c
+++ b/hw/ppc/spapr_caps.c
@@ -327,27 +327,26 @@ sPAPRCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
 };
 
 static sPAPRCapabilities default_caps_with_cpu(sPAPRMachineState *spapr,
-                                               CPUState *cs)
+                                               const char *cputype)
 {
     sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
-    PowerPCCPU *cpu = POWERPC_CPU(cs);
     sPAPRCapabilities caps;
 
     caps = smc->default_caps;
 
-    if (!ppc_check_compat(cpu, CPU_POWERPC_LOGICAL_2_07,
-                          0, spapr->max_compat_pvr)) {
+    if (!ppc_type_check_compat(cputype, CPU_POWERPC_LOGICAL_2_07,
+                               0, spapr->max_compat_pvr)) {
         caps.caps[SPAPR_CAP_HTM] = SPAPR_CAP_OFF;
         caps.caps[SPAPR_CAP_CFPC] = SPAPR_CAP_BROKEN;
     }
 
-    if (!ppc_check_compat(cpu, CPU_POWERPC_LOGICAL_2_06_PLUS,
-                          0, spapr->max_compat_pvr)) {
+    if (!ppc_type_check_compat(cputype, CPU_POWERPC_LOGICAL_2_06_PLUS,
+                               0, spapr->max_compat_pvr)) {
         caps.caps[SPAPR_CAP_SBBC] = SPAPR_CAP_BROKEN;
     }
 
-    if (!ppc_check_compat(cpu, CPU_POWERPC_LOGICAL_2_06,
-                          0, spapr->max_compat_pvr)) {
+    if (!ppc_type_check_compat(cputype, CPU_POWERPC_LOGICAL_2_06,
+                               0, spapr->max_compat_pvr)) {
         caps.caps[SPAPR_CAP_VSX] = SPAPR_CAP_OFF;
         caps.caps[SPAPR_CAP_DFP] = SPAPR_CAP_OFF;
         caps.caps[SPAPR_CAP_IBS] = SPAPR_CAP_BROKEN;
@@ -384,7 +383,7 @@ int spapr_caps_post_migration(sPAPRMachineState *spapr)
     sPAPRCapabilities dstcaps = spapr->eff;
     sPAPRCapabilities srccaps;
 
-    srccaps = default_caps_with_cpu(spapr, first_cpu);
+    srccaps = default_caps_with_cpu(spapr, MACHINE(spapr)->cpu_type);
     for (i = 0; i < SPAPR_CAP_NUM; i++) {
         /* If not default value then assume came in with the migration */
         if (spapr->mig.caps[i] != spapr->def.caps[i]) {
@@ -446,7 +445,7 @@ void spapr_caps_reset(sPAPRMachineState *spapr)
     int i;
 
     /* First compute the actual set of caps we're running with.. */
-    default_caps = default_caps_with_cpu(spapr, first_cpu);
+    default_caps = default_caps_with_cpu(spapr, MACHINE(spapr)->cpu_type);
 
     for (i = 0; i < SPAPR_CAP_NUM; i++) {
         /* Store the defaults */
diff --git a/target/ppc/compat.c b/target/ppc/compat.c
index 807c906f68..7de4bf3122 100644
--- a/target/ppc/compat.c
+++ b/target/ppc/compat.c
@@ -105,17 +105,13 @@ static const CompatInfo *compat_by_pvr(uint32_t pvr)
     return NULL;
 }
 
-bool ppc_check_compat(PowerPCCPU *cpu, uint32_t compat_pvr,
-                      uint32_t min_compat_pvr, uint32_t max_compat_pvr)
+static bool pcc_compat(PowerPCCPUClass *pcc, uint32_t compat_pvr,
+                       uint32_t min_compat_pvr, uint32_t max_compat_pvr)
 {
-    PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
     const CompatInfo *compat = compat_by_pvr(compat_pvr);
     const CompatInfo *min = compat_by_pvr(min_compat_pvr);
     const CompatInfo *max = compat_by_pvr(max_compat_pvr);
 
-#if !defined(CONFIG_USER_ONLY)
-    g_assert(cpu->vhyp);
-#endif
     g_assert(!min_compat_pvr || min);
     g_assert(!max_compat_pvr || max);
 
@@ -134,6 +130,25 @@ bool ppc_check_compat(PowerPCCPU *cpu, uint32_t compat_pvr,
     return true;
 }
 
+bool ppc_check_compat(PowerPCCPU *cpu, uint32_t compat_pvr,
+                      uint32_t min_compat_pvr, uint32_t max_compat_pvr)
+{
+    PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
+
+#if !defined(CONFIG_USER_ONLY)
+    g_assert(cpu->vhyp);
+#endif
+
+    return pcc_compat(pcc, compat_pvr, min_compat_pvr, max_compat_pvr);
+}
+
+bool ppc_type_check_compat(const char *cputype, uint32_t compat_pvr,
+                           uint32_t min_compat_pvr, uint32_t max_compat_pvr)
+{
+    PowerPCCPUClass *pcc = POWERPC_CPU_CLASS(object_class_by_name(cputype));
+    return pcc_compat(pcc, compat_pvr, min_compat_pvr, max_compat_pvr);
+}
+
 void ppc_set_compat(PowerPCCPU *cpu, uint32_t compat_pvr, Error **errp)
 {
     const CompatInfo *compat = compat_by_pvr(compat_pvr);
diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index 874da6efbc..c7f3fb6b73 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -1369,7 +1369,11 @@ static inline int cpu_mmu_index (CPUPPCState *env, bool ifetch)
 #if defined(TARGET_PPC64)
 bool ppc_check_compat(PowerPCCPU *cpu, uint32_t compat_pvr,
                       uint32_t min_compat_pvr, uint32_t max_compat_pvr);
+bool ppc_type_check_compat(const char *cputype, uint32_t compat_pvr,
+                           uint32_t min_compat_pvr, uint32_t max_compat_pvr);
+
 void ppc_set_compat(PowerPCCPU *cpu, uint32_t compat_pvr, Error **errp);
+
 #if !defined(CONFIG_USER_ONLY)
 void ppc_set_compat_all(uint32_t compat_pvr, Error **errp);
 #endif
-- 
2.17.1


* [Qemu-devel] [PATCH 2/9] spapr: Compute effective capability values earlier
  2018-06-18  6:35 [Qemu-devel] [PATCH 0/9] spapr: Clean up pagesize handling David Gibson
  2018-06-18  6:35 ` [Qemu-devel] [PATCH 1/9] target/ppc: Allow cpu compatiblity checks based on type, not instance David Gibson
@ 2018-06-18  6:35 ` David Gibson
  2018-06-18 13:37   ` Greg Kurz
  2018-06-21  5:32   ` Cédric Le Goater
  2018-06-18  6:36 ` [Qemu-devel] [PATCH 3/9] spapr: Add cpu_apply hook to capabilities David Gibson
                   ` (8 subsequent siblings)
  10 siblings, 2 replies; 43+ messages in thread
From: David Gibson @ 2018-06-18  6:35 UTC
  To: groug, abologna; +Cc: clg, qemu-ppc, qemu-devel, aik, David Gibson

Previously, the effective values of the various spapr capability flags
were only determined at machine reset time.  That was a lazy way of making
sure it was after cpu initialization so it could use the cpu object to
inform the defaults.

But we've now improved the compat checking code so that we don't need to
instantiate the cpus to use it.  That lets us move the resolution of the
capability defaults much earlier.

This is going to be necessary for some future capabilities.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/ppc/spapr.c         | 6 ++++--
 hw/ppc/spapr_caps.c    | 9 ++++++---
 include/hw/ppc/spapr.h | 3 ++-
 3 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index b0b94fc1f0..40858d047c 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1612,7 +1612,7 @@ static void spapr_machine_reset(void)
     void *fdt;
     int rc;
 
-    spapr_caps_reset(spapr);
+    spapr_caps_apply(spapr);
 
     first_ppc_cpu = POWERPC_CPU(first_cpu);
     if (kvm_enabled() && kvmppc_has_cap_mmu_radix() &&
@@ -2526,7 +2526,9 @@ static void spapr_machine_init(MachineState *machine)
     QLIST_INIT(&spapr->phbs);
     QTAILQ_INIT(&spapr->pending_dimm_unplugs);
 
-    /* Check HPT resizing availability */
+    /* Determine capabilities to run with */
+    spapr_caps_init(spapr);
+
     kvmppc_check_papr_resize_hpt(&resize_hpt_err);
     if (spapr->resize_hpt == SPAPR_RESIZE_HPT_DEFAULT) {
         /*
diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
index 469f38f0ef..dabed817d1 100644
--- a/hw/ppc/spapr_caps.c
+++ b/hw/ppc/spapr_caps.c
@@ -439,12 +439,12 @@ SPAPR_CAP_MIG_STATE(cfpc, SPAPR_CAP_CFPC);
 SPAPR_CAP_MIG_STATE(sbbc, SPAPR_CAP_SBBC);
 SPAPR_CAP_MIG_STATE(ibs, SPAPR_CAP_IBS);
 
-void spapr_caps_reset(sPAPRMachineState *spapr)
+void spapr_caps_init(sPAPRMachineState *spapr)
 {
     sPAPRCapabilities default_caps;
     int i;
 
-    /* First compute the actual set of caps we're running with.. */
+    /* Compute the actual set of caps we should run with */
     default_caps = default_caps_with_cpu(spapr, MACHINE(spapr)->cpu_type);
 
     for (i = 0; i < SPAPR_CAP_NUM; i++) {
@@ -455,8 +455,11 @@ void spapr_caps_reset(sPAPRMachineState *spapr)
             spapr->eff.caps[i] = default_caps.caps[i];
         }
     }
+}
 
-    /* .. then apply those caps to the virtual hardware */
+void spapr_caps_apply(sPAPRMachineState *spapr)
+{
+    int i;
 
     for (i = 0; i < SPAPR_CAP_NUM; i++) {
         sPAPRCapabilityInfo *info = &capability_table[i];
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 3388750fc7..9dbd6010f5 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -798,7 +798,8 @@ static inline uint8_t spapr_get_cap(sPAPRMachineState *spapr, int cap)
     return spapr->eff.caps[cap];
 }
 
-void spapr_caps_reset(sPAPRMachineState *spapr);
+void spapr_caps_init(sPAPRMachineState *spapr);
+void spapr_caps_apply(sPAPRMachineState *spapr);
 void spapr_caps_add_properties(sPAPRMachineClass *smc, Error **errp);
 int spapr_caps_post_migration(sPAPRMachineState *spapr);
 
-- 
2.17.1


* [Qemu-devel] [PATCH 3/9] spapr: Add cpu_apply hook to capabilities
  2018-06-18  6:35 [Qemu-devel] [PATCH 0/9] spapr: Clean up pagesize handling David Gibson
  2018-06-18  6:35 ` [Qemu-devel] [PATCH 1/9] target/ppc: Allow cpu compatiblity checks based on type, not instance David Gibson
  2018-06-18  6:35 ` [Qemu-devel] [PATCH 2/9] spapr: Compute effective capability values earlier David Gibson
@ 2018-06-18  6:36 ` David Gibson
  2018-06-18 15:28   ` Greg Kurz
  2018-06-21  5:34   ` Cédric Le Goater
  2018-06-18  6:36 ` [Qemu-devel] [PATCH 4/9] target/ppc: Add kvmppc_hpt_needs_host_contiguous_pages() helper David Gibson
                   ` (7 subsequent siblings)
  10 siblings, 2 replies; 43+ messages in thread
From: David Gibson @ 2018-06-18  6:36 UTC
  To: groug, abologna; +Cc: clg, qemu-ppc, qemu-devel, aik, David Gibson

spapr capabilities have an apply hook to actually activate (or deactivate)
the feature in the system at reset time.  However, a number of capabilities
affect the setup of cpus, and need to be applied to each of them -
including hotplugged cpus for extra complication.  To make this simpler,
add an optional cpu_apply hook that is called from spapr_cpu_reset().
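
A minimal sketch of an optional per-cpu hook, assuming a table with one
machine-wide capability and one per-cpu capability.  VCpu, HookCapInfo,
record_on_cpu() and hook_caps_cpu_apply() are hypothetical names, not
QEMU interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HOOK_CAP_NUM 2   /* hypothetical number of capabilities */

typedef struct VCpu {
    uint8_t seen[HOOK_CAP_NUM];  /* last value applied per capability */
} VCpu;

typedef struct HookCapInfo {
    const char *name;
    /* Optional: NULL when the capability needs no per-cpu setup. */
    void (*cpu_apply)(VCpu *cpu, int index, uint8_t val);
} HookCapInfo;

void record_on_cpu(VCpu *cpu, int index, uint8_t val)
{
    cpu->seen[index] = val;
}

const HookCapInfo hook_cap_table[HOOK_CAP_NUM] = {
    { "machine-wide-only", NULL },
    { "per-cpu", record_on_cpu },
};

/*
 * Meant to be called from each cpu's reset path, so cpus that are
 * hotplugged later go through exactly the same code as boot cpus.
 */
void hook_caps_cpu_apply(const uint8_t eff[HOOK_CAP_NUM], VCpu *cpu)
{
    assert(cpu != NULL);
    for (int i = 0; i < HOOK_CAP_NUM; i++) {
        if (hook_cap_table[i].cpu_apply) {
            hook_cap_table[i].cpu_apply(cpu, i, eff[i]);
        }
    }
}
```

Capabilities without a hook are simply skipped, so existing entries in
the table need no changes.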

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/ppc/spapr_caps.c     | 19 +++++++++++++++++++
 hw/ppc/spapr_cpu_core.c |  2 ++
 include/hw/ppc/spapr.h  |  1 +
 3 files changed, 22 insertions(+)

diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
index dabed817d1..68a4243efc 100644
--- a/hw/ppc/spapr_caps.c
+++ b/hw/ppc/spapr_caps.c
@@ -59,6 +59,8 @@ typedef struct sPAPRCapabilityInfo {
     sPAPRCapPossible *possible;
     /* Make sure the virtual hardware can support this capability */
     void (*apply)(sPAPRMachineState *spapr, uint8_t val, Error **errp);
+    void (*cpu_apply)(sPAPRMachineState *spapr, PowerPCCPU *cpu,
+                      uint8_t val, Error **errp);
 } sPAPRCapabilityInfo;
 
 static void spapr_cap_get_bool(Object *obj, Visitor *v, const char *name,
@@ -472,6 +474,23 @@ void spapr_caps_apply(sPAPRMachineState *spapr)
     }
 }
 
+void spapr_caps_cpu_apply(sPAPRMachineState *spapr, PowerPCCPU *cpu)
+{
+    int i;
+
+    for (i = 0; i < SPAPR_CAP_NUM; i++) {
+        sPAPRCapabilityInfo *info = &capability_table[i];
+
+        /*
+         * If the apply function can't set the desired level and thinks it's
+         * fatal, it should cause that.
+         */
+        if (info->cpu_apply) {
+            info->cpu_apply(spapr, cpu, spapr->eff.caps[i], &error_fatal);
+        }
+    }
+}
+
 void spapr_caps_add_properties(sPAPRMachineClass *smc, Error **errp)
 {
     Error *local_err = NULL;
diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
index aef3be33a3..324623190d 100644
--- a/hw/ppc/spapr_cpu_core.c
+++ b/hw/ppc/spapr_cpu_core.c
@@ -76,6 +76,8 @@ static void spapr_cpu_reset(void *opaque)
     spapr_cpu->slb_shadow_size = 0;
     spapr_cpu->dtl_addr = 0;
     spapr_cpu->dtl_size = 0;
+
+    spapr_caps_cpu_apply(SPAPR_MACHINE(qdev_get_machine()), cpu);
 }
 
 void spapr_cpu_set_entry_state(PowerPCCPU *cpu, target_ulong nip, target_ulong r3)
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 9dbd6010f5..9dd46a72f6 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -800,6 +800,7 @@ static inline uint8_t spapr_get_cap(sPAPRMachineState *spapr, int cap)
 
 void spapr_caps_init(sPAPRMachineState *spapr);
 void spapr_caps_apply(sPAPRMachineState *spapr);
+void spapr_caps_cpu_apply(sPAPRMachineState *spapr, PowerPCCPU *cpu);
 void spapr_caps_add_properties(sPAPRMachineClass *smc, Error **errp);
 int spapr_caps_post_migration(sPAPRMachineState *spapr);
 
-- 
2.17.1


* [Qemu-devel] [PATCH 4/9] target/ppc: Add kvmppc_hpt_needs_host_contiguous_pages() helper
  2018-06-18  6:35 [Qemu-devel] [PATCH 0/9] spapr: Clean up pagesize handling David Gibson
                   ` (2 preceding siblings ...)
  2018-06-18  6:36 ` [Qemu-devel] [PATCH 3/9] spapr: Add cpu_apply hook to capabilities David Gibson
@ 2018-06-18  6:36 ` David Gibson
  2018-06-18 15:32   ` Greg Kurz
  2018-06-21  5:56   ` Cédric Le Goater
  2018-06-18  6:36 ` [Qemu-devel] [PATCH 5/9] spapr: Maximum (HPT) pagesize property David Gibson
                   ` (6 subsequent siblings)
  10 siblings, 2 replies; 43+ messages in thread
From: David Gibson @ 2018-06-18  6:36 UTC
  To: groug, abologna; +Cc: clg, qemu-ppc, qemu-devel, aik, David Gibson

KVM HV has a restriction that for HPT mode guests, guest pages must be hpa
contiguous as well as gpa contiguous.  We have to account for that in
various places.  We determine whether we're subject to this restriction
from the SMMU information exposed by KVM.

Planned cleanups to the way we handle this will require knowing whether
this restriction is in play in wider parts of the code.  So, expose a
helper function which returns it.

This does mean some redundant calls to kvm_get_smmu_info(), but they'll go
away again with future cleanups.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 target/ppc/kvm.c     | 17 +++++++++++++++--
 target/ppc/kvm_ppc.h |  6 ++++++
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
index 5c0e313ca6..50b5d01432 100644
--- a/target/ppc/kvm.c
+++ b/target/ppc/kvm.c
@@ -406,9 +406,22 @@ target_ulong kvmppc_configure_v3_mmu(PowerPCCPU *cpu,
     }
 }
 
+bool kvmppc_hpt_needs_host_contiguous_pages(void)
+{
+    PowerPCCPU *cpu = POWERPC_CPU(first_cpu);
+    static struct kvm_ppc_smmu_info smmu_info;
+
+    if (!kvm_enabled()) {
+        return false;
+    }
+
+    kvm_get_smmu_info(cpu, &smmu_info);
+    return !!(smmu_info.flags & KVM_PPC_PAGE_SIZES_REAL);
+}
+
 static bool kvm_valid_page_size(uint32_t flags, long rampgsize, uint32_t shift)
 {
-    if (!(flags & KVM_PPC_PAGE_SIZES_REAL)) {
+    if (!kvmppc_hpt_needs_host_contiguous_pages()) {
         return true;
     }
 
@@ -445,7 +458,7 @@ static void kvm_fixup_page_sizes(PowerPCCPU *cpu)
     /* If we have HV KVM, we need to forbid CI large pages if our
      * host page size is smaller than 64K.
      */
-    if (smmu_info.flags & KVM_PPC_PAGE_SIZES_REAL) {
+    if (kvmppc_hpt_needs_host_contiguous_pages()) {
         if (getpagesize() >= 0x10000) {
             cpu->hash64_opts->flags |= PPC_HASH64_CI_LARGEPAGE;
         } else {
diff --git a/target/ppc/kvm_ppc.h b/target/ppc/kvm_ppc.h
index e2840e1d33..a7ddb8a5d6 100644
--- a/target/ppc/kvm_ppc.h
+++ b/target/ppc/kvm_ppc.h
@@ -70,6 +70,7 @@ int kvmppc_resize_hpt_prepare(PowerPCCPU *cpu, target_ulong flags, int shift);
 int kvmppc_resize_hpt_commit(PowerPCCPU *cpu, target_ulong flags, int shift);
 bool kvmppc_pvr_workaround_required(PowerPCCPU *cpu);
 
+bool kvmppc_hpt_needs_host_contiguous_pages(void);
 bool kvmppc_is_mem_backend_page_size_ok(const char *obj_path);
 
 #else
@@ -222,6 +223,11 @@ static inline uint64_t kvmppc_rma_size(uint64_t current_size,
     return ram_size;
 }
 
+static inline bool kvmppc_hpt_needs_host_contiguous_pages(void)
+{
+    return false;
+}
+
 static inline bool kvmppc_is_mem_backend_page_size_ok(const char *obj_path)
 {
     return true;
-- 
2.17.1


* [Qemu-devel] [PATCH 5/9] spapr: Maximum (HPT) pagesize property
  2018-06-18  6:35 [Qemu-devel] [PATCH 0/9] spapr: Clean up pagesize handling David Gibson
                   ` (3 preceding siblings ...)
  2018-06-18  6:36 ` [Qemu-devel] [PATCH 4/9] target/ppc: Add kvmppc_hpt_needs_host_contiguous_pages() helper David Gibson
@ 2018-06-18  6:36 ` David Gibson
  2018-06-19  9:23   ` Cédric Le Goater
                     ` (2 more replies)
  2018-06-18  6:36 ` [Qemu-devel] [PATCH 6/9] spapr: Use maximum page size capability to simplify memory backend checking David Gibson
                   ` (5 subsequent siblings)
  10 siblings, 3 replies; 43+ messages in thread
From: David Gibson @ 2018-06-18  6:36 UTC
  To: groug, abologna; +Cc: clg, qemu-ppc, qemu-devel, aik, David Gibson

The way the POWER Hash Page Table (HPT) MMU is virtualized by KVM HV means
that every page that the guest puts in the pagetables must be truly
physically contiguous, not just GPA-contiguous.  In effect this means that
an HPT guest can't use any pagesizes greater than the host page size used
to back its memory.

At present we handle this by changing what we advertise to the guest based
on the backing pagesizes.  This is pretty bad, because it means the guest
sees a different environment depending on what should be host configuration
details.

As a start on fixing this, we add a new capability parameter to the pseries
machine type which gives the maximum allowed pagesize for an HPT guest
(given as a page size, stored internally as a shift).  For now we just
create and validate the parameter without making it do anything.

For backwards compatibility, on older machine types we set it to the max
available page size for the host.  For the 3.0 machine type, we fix it to
16, the intention being to only allow HPT pagesizes up to 64kiB by default
in future.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/ppc/spapr.c         | 12 +++++++++
 hw/ppc/spapr_caps.c    | 56 ++++++++++++++++++++++++++++++++++++++++++
 include/hw/ppc/spapr.h |  4 ++-
 3 files changed, 71 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 40858d047c..74a76e7e09 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -63,6 +63,7 @@
 #include "hw/virtio/vhost-scsi-common.h"
 
 #include "exec/address-spaces.h"
+#include "exec/ram_addr.h"
 #include "hw/usb.h"
 #include "qemu/config-file.h"
 #include "qemu/error-report.h"
@@ -4043,6 +4044,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
     smc->default_caps.caps[SPAPR_CAP_CFPC] = SPAPR_CAP_BROKEN;
     smc->default_caps.caps[SPAPR_CAP_SBBC] = SPAPR_CAP_BROKEN;
     smc->default_caps.caps[SPAPR_CAP_IBS] = SPAPR_CAP_BROKEN;
+    smc->default_caps.caps[SPAPR_CAP_HPT_MAXPAGESIZE] = 16; /* 64kiB */
     spapr_caps_add_properties(smc, &error_abort);
 }
 
@@ -4126,8 +4128,18 @@ static void spapr_machine_2_12_instance_options(MachineState *machine)
 
 static void spapr_machine_2_12_class_options(MachineClass *mc)
 {
+    sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(mc);
+    uint8_t mps;
+
     spapr_machine_3_0_class_options(mc);
     SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_12);
+
+    if (kvmppc_hpt_needs_host_contiguous_pages()) {
+        mps = ctz64(qemu_getrampagesize());
+    } else {
+        mps = 34; /* allow everything up to 16GiB, i.e. everything */
+    }
+    smc->default_caps.caps[SPAPR_CAP_HPT_MAXPAGESIZE] = mps;
 }
 
 DEFINE_SPAPR_MACHINE(2_12, "2.12", false);
diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
index 68a4243efc..6cdc0c94e7 100644
--- a/hw/ppc/spapr_caps.c
+++ b/hw/ppc/spapr_caps.c
@@ -27,6 +27,7 @@
 #include "qapi/visitor.h"
 #include "sysemu/hw_accel.h"
 #include "target/ppc/cpu.h"
+#include "target/ppc/mmu-hash64.h"
 #include "cpu-models.h"
 #include "kvm_ppc.h"
 
@@ -144,6 +145,42 @@ out:
     g_free(val);
 }
 
+static void spapr_cap_get_pagesize(Object *obj, Visitor *v, const char *name,
+                                   void *opaque, Error **errp)
+{
+    sPAPRCapabilityInfo *cap = opaque;
+    sPAPRMachineState *spapr = SPAPR_MACHINE(obj);
+    uint8_t val = spapr_get_cap(spapr, cap->index);
+    uint64_t pagesize = (1ULL << val);
+
+    visit_type_size(v, name, &pagesize, errp);
+}
+
+static void spapr_cap_set_pagesize(Object *obj, Visitor *v, const char *name,
+                                   void *opaque, Error **errp)
+{
+    sPAPRCapabilityInfo *cap = opaque;
+    sPAPRMachineState *spapr = SPAPR_MACHINE(obj);
+    uint64_t pagesize;
+    uint8_t val;
+    Error *local_err = NULL;
+
+    visit_type_size(v, name, &pagesize, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return;
+    }
+
+    if (!is_power_of_2(pagesize)) {
+        error_setg(errp, "cap-%s must be a power of 2", cap->name);
+        return;
+    }
+
+    val = ctz64(pagesize);
+    spapr->cmd_line_caps[cap->index] = true;
+    spapr->eff.caps[cap->index] = val;
+}
+
 static void cap_htm_apply(sPAPRMachineState *spapr, uint8_t val, Error **errp)
 {
     if (!val) {
@@ -267,6 +304,16 @@ static void cap_safe_indirect_branch_apply(sPAPRMachineState *spapr,
 
 #define VALUE_DESC_TRISTATE     " (broken, workaround, fixed)"
 
+static void cap_hpt_maxpagesize_apply(sPAPRMachineState *spapr,
+                                      uint8_t val, Error **errp)
+{
+    if (val < 12) {
+        error_setg(errp, "Require at least 4kiB hpt-max-page-size");
+    } else if (val < 16) {
+        warn_report("Many guests require at least 64kiB hpt-max-page-size");
+    }
+}
+
 sPAPRCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
     [SPAPR_CAP_HTM] = {
         .name = "htm",
@@ -326,6 +373,15 @@ sPAPRCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
         .possible = &cap_ibs_possible,
         .apply = cap_safe_indirect_branch_apply,
     },
+    [SPAPR_CAP_HPT_MAXPAGESIZE] = {
+        .name = "hpt-max-page-size",
+        .description = "Maximum page size for Hash Page Table guests",
+        .index = SPAPR_CAP_HPT_MAXPAGESIZE,
+        .get = spapr_cap_get_pagesize,
+        .set = spapr_cap_set_pagesize,
+        .type = "int",
+        .apply = cap_hpt_maxpagesize_apply,
+    },
 };
 
 static sPAPRCapabilities default_caps_with_cpu(sPAPRMachineState *spapr,
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 9dd46a72f6..c97593d032 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -66,8 +66,10 @@ typedef enum {
 #define SPAPR_CAP_SBBC                  0x04
 /* Indirect Branch Serialisation */
 #define SPAPR_CAP_IBS                   0x05
+/* HPT Maximum Page Size (encoded as a shift) */
+#define SPAPR_CAP_HPT_MAXPAGESIZE       0x06
 /* Num Caps */
-#define SPAPR_CAP_NUM                   (SPAPR_CAP_IBS + 1)
+#define SPAPR_CAP_NUM                   (SPAPR_CAP_HPT_MAXPAGESIZE + 1)
 
 /*
  * Capability Values
-- 
2.17.1


* [Qemu-devel] [PATCH 6/9] spapr: Use maximum page size capability to simplify memory backend checking
  2018-06-18  6:35 [Qemu-devel] [PATCH 0/9] spapr: Clean up pagesize handling David Gibson
                   ` (4 preceding siblings ...)
  2018-06-18  6:36 ` [Qemu-devel] [PATCH 5/9] spapr: Maximum (HPT) pagesize property David Gibson
@ 2018-06-18  6:36 ` David Gibson
  2018-06-21  6:29   ` Cédric Le Goater
  2018-06-21 10:29   ` Greg Kurz
  2018-06-18  6:36 ` [Qemu-devel] [PATCH 7/9] target/ppc: Add ppc_hash64_filter_pagesizes() David Gibson
                   ` (4 subsequent siblings)
  10 siblings, 2 replies; 43+ messages in thread
From: David Gibson @ 2018-06-18  6:36 UTC
  To: groug, abologna; +Cc: clg, qemu-ppc, qemu-devel, aik, David Gibson

The way we used to handle KVM allowable guest pagesizes for PAPR guests
required some convoluted checking of memory attached to the guest.

The allowable pagesizes advertised to the guest cpus depended on the memory
which was attached at boot, but then we needed to ensure that any memory
later hotplugged didn't change which pagesizes were allowed.

Now that we have an explicit machine option to control the allowable
maximum pagesize we can simplify this.  We just check all memory backends
against that declared pagesize.  We check base and cold-plugged memory at
reset time, and hotplugged memory at pre_plug() time.
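
The per-backend check reduces to a single comparison, sketched below
under the assumption that the caller already knows whether the
accelerator needs host-contiguous pages.  backing_pagesize_ok() is a
hypothetical name, and it returns a bool where the real code reports
through an Error **.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if a memory backend using host pages of backing_pagesize
 * bytes can back guest pages of up to (1 << max_shift) bytes.
 */
bool backing_pagesize_ok(uint8_t max_shift, uint64_t backing_pagesize,
                         bool needs_host_contiguous)
{
    assert(max_shift < 64);
    if (!needs_host_contiguous) {
        /* No constraint from the host side in this case. */
        return true;
    }
    return (UINT64_C(1) << max_shift) <= backing_pagesize;
}
```

The same function serves base memory, cold-plugged memory and
hotplugged memory, because the limit no longer depends on what happened
to be attached at boot.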

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/ppc/spapr.c         | 17 +++++++----------
 hw/ppc/spapr_caps.c    | 20 ++++++++++++++++++++
 include/hw/ppc/spapr.h |  3 +++
 target/ppc/kvm.c       | 14 --------------
 target/ppc/kvm_ppc.h   |  6 ------
 5 files changed, 30 insertions(+), 30 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 74a76e7e09..efd36e92e2 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -3192,11 +3192,13 @@ static void spapr_memory_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
                                   Error **errp)
 {
     const sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(hotplug_dev);
+    sPAPRMachineState *spapr = SPAPR_MACHINE(hotplug_dev);
     PCDIMMDevice *dimm = PC_DIMM(dev);
     PCDIMMDeviceClass *ddc = PC_DIMM_GET_CLASS(dimm);
     MemoryRegion *mr;
     uint64_t size;
-    char *mem_dev;
+    Object *memdev;
+    hwaddr pagesize;
 
     if (!smc->dr_lmb_enabled) {
         error_setg(errp, "Memory hotplug not supported for this machine");
@@ -3215,15 +3217,10 @@ static void spapr_memory_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
         return;
     }
 
-    mem_dev = object_property_get_str(OBJECT(dimm), PC_DIMM_MEMDEV_PROP, NULL);
-    if (mem_dev && !kvmppc_is_mem_backend_page_size_ok(mem_dev)) {
-        error_setg(errp, "Memory backend has bad page size. "
-                   "Use 'memory-backend-file' with correct mem-path.");
-        goto out;
-    }
-
-out:
-    g_free(mem_dev);
+    memdev = object_property_get_link(OBJECT(dimm), PC_DIMM_MEMDEV_PROP,
+                                      &error_abort);
+    pagesize = host_memory_backend_pagesize(MEMORY_BACKEND(memdev));
+    spapr_check_pagesize(spapr, pagesize, errp);
 }
 
 struct sPAPRDIMMState {
diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
index 6cdc0c94e7..9fc739b3f5 100644
--- a/hw/ppc/spapr_caps.c
+++ b/hw/ppc/spapr_caps.c
@@ -26,6 +26,7 @@
 #include "qapi/error.h"
 #include "qapi/visitor.h"
 #include "sysemu/hw_accel.h"
+#include "exec/ram_addr.h"
 #include "target/ppc/cpu.h"
 #include "target/ppc/mmu-hash64.h"
 #include "cpu-models.h"
@@ -304,6 +305,23 @@ static void cap_safe_indirect_branch_apply(sPAPRMachineState *spapr,
 
 #define VALUE_DESC_TRISTATE     " (broken, workaround, fixed)"
 
+void spapr_check_pagesize(sPAPRMachineState *spapr, hwaddr pagesize,
+                          Error **errp)
+{
+    hwaddr maxpagesize = (1ULL << spapr->eff.caps[SPAPR_CAP_HPT_MPS]);
+
+    if (!kvmppc_hpt_needs_host_contiguous_pages()) {
+        return;
+    }
+
+    if (maxpagesize > pagesize) {
+        error_setg(errp,
+                   "Can't support %"HWADDR_PRIu" kiB guest pages with %"
+                   HWADDR_PRIu" kiB host pages with this KVM implementation",
+                   maxpagesize >> 10, pagesize >> 10);
+    }
+}
+
 static void cap_hpt_maxpagesize_apply(sPAPRMachineState *spapr,
                                       uint8_t val, Error **errp)
 {
@@ -312,6 +330,8 @@ static void cap_hpt_maxpagesize_apply(sPAPRMachineState *spapr,
     } else if (val < 16) {
         warn_report("Many guests require at least 64kiB hpt-max-page-size");
     }
+
+    spapr_check_pagesize(spapr, qemu_getrampagesize(), errp);
 }
 
 sPAPRCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index c97593d032..75e2cf2687 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -806,4 +806,7 @@ void spapr_caps_cpu_apply(sPAPRMachineState *spapr, PowerPCCPU *cpu);
 void spapr_caps_add_properties(sPAPRMachineClass *smc, Error **errp);
 int spapr_caps_post_migration(sPAPRMachineState *spapr);
 
+void spapr_check_pagesize(sPAPRMachineState *spapr, hwaddr pagesize,
+                          Error **errp);
+
 #endif /* HW_SPAPR_H */
diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
index 50b5d01432..9cfbd388ad 100644
--- a/target/ppc/kvm.c
+++ b/target/ppc/kvm.c
@@ -500,26 +500,12 @@ static void kvm_fixup_page_sizes(PowerPCCPU *cpu)
         cpu->hash64_opts->flags &= ~PPC_HASH64_1TSEG;
     }
 }
-
-bool kvmppc_is_mem_backend_page_size_ok(const char *obj_path)
-{
-    Object *mem_obj = object_resolve_path(obj_path, NULL);
-    long pagesize = host_memory_backend_pagesize(MEMORY_BACKEND(mem_obj));
-
-    return pagesize >= max_cpu_page_size;
-}
-
 #else /* defined (TARGET_PPC64) */
 
 static inline void kvm_fixup_page_sizes(PowerPCCPU *cpu)
 {
 }
 
-bool kvmppc_is_mem_backend_page_size_ok(const char *obj_path)
-{
-    return true;
-}
-
 #endif /* !defined (TARGET_PPC64) */
 
 unsigned long kvm_arch_vcpu_id(CPUState *cpu)
diff --git a/target/ppc/kvm_ppc.h b/target/ppc/kvm_ppc.h
index a7ddb8a5d6..443fca0a4e 100644
--- a/target/ppc/kvm_ppc.h
+++ b/target/ppc/kvm_ppc.h
@@ -71,7 +71,6 @@ int kvmppc_resize_hpt_commit(PowerPCCPU *cpu, target_ulong flags, int shift);
 bool kvmppc_pvr_workaround_required(PowerPCCPU *cpu);
 
 bool kvmppc_hpt_needs_host_contiguous_pages(void);
-bool kvmppc_is_mem_backend_page_size_ok(const char *obj_path);
 
 #else
 
@@ -228,11 +227,6 @@ static inline bool kvmppc_hpt_needs_host_contiguous_pages(void)
     return false;
 }
 
-static inline bool kvmppc_is_mem_backend_page_size_ok(const char *obj_path)
-{
-    return true;
-}
-
 static inline bool kvmppc_has_cap_spapr_vfio(void)
 {
     return false;
-- 
2.17.1


* [Qemu-devel] [PATCH 7/9] target/ppc: Add ppc_hash64_filter_pagesizes()
  2018-06-18  6:35 [Qemu-devel] [PATCH 0/9] spapr: Clean up pagesize handling David Gibson
                   ` (5 preceding siblings ...)
  2018-06-18  6:36 ` [Qemu-devel] [PATCH 6/9] spapr: Use maximum page size capability to simplify memory backend checking David Gibson
@ 2018-06-18  6:36 ` David Gibson
  2018-06-21  6:38   ` Cédric Le Goater
  2018-06-21 11:48   ` Greg Kurz
  2018-06-18  6:36 ` [Qemu-devel] [PATCH 8/9] spapr: Limit available pagesizes to provide a consistent guest environment David Gibson
                   ` (3 subsequent siblings)
  10 siblings, 2 replies; 43+ messages in thread
From: David Gibson @ 2018-06-18  6:36 UTC (permalink / raw)
  To: groug, abologna; +Cc: clg, qemu-ppc, qemu-devel, aik, David Gibson

The paravirtualized PAPR platform sometimes needs to restrict the guest to
using only some of the page sizes actually supported by the host's MMU.
At the moment this is handled in KVM specific code, but for consistency we
want to apply the same limitations to all accelerators.

This makes a start on this by providing a helper function in the cpu code
to allow platform code to remove some of the cpu's page size definitions
via a caller supplied callback.
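
For readers unfamiliar with the pattern, here is a minimal standalone
sketch of filtering a two-level page size table in place through a
caller supplied callback.  The types, MAX_SZ and the function names are
simplified stand-ins invented for this example, not the QEMU
definitions.  Unlike the helper in the diff below, this sketch also
moves surviving rows down so the table stays densely packed, and it
ignores the CI largepage flag.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAX_SZ 8    /* illustrative table dimension (an assumption) */

struct page_enc {
    uint32_t page_shift;            /* 0 marks an unused slot */
    uint32_t pte_enc;
};

struct seg_sizes {
    uint32_t page_shift;            /* base page shift, 0 = unused row */
    uint32_t slb_enc;
    struct page_enc enc[MAX_SZ];    /* actual page sizes for this base */
};

typedef bool (*filter_cb)(void *opaque, uint32_t seg_shift, uint32_t shift);

/* Example callback: accept page sizes up to the shift in *opaque */
static bool max_shift_cb(void *opaque, uint32_t seg_shift, uint32_t shift)
{
    (void)seg_shift;
    return shift <= *(unsigned *)opaque;
}

/*
 * Keep only the entries the callback accepts.  tbl must have MAX_SZ
 * rows.  Returns the number of rows that still have at least one entry.
 */
static int filter_pagesizes(struct seg_sizes *tbl, filter_cb cb, void *opaque)
{
    int n = 0;

    for (int i = 0; i < MAX_SZ && tbl[i].page_shift; i++) {
        struct seg_sizes row = tbl[i];  /* copy, so tbl[n] can be reused */
        int m = 0;

        for (int j = 0; j < MAX_SZ && row.enc[j].page_shift; j++) {
            if (cb(opaque, row.page_shift, row.enc[j].page_shift)) {
                row.enc[m++] = row.enc[j];
            }
        }
        /* Clear the rest of the row */
        memset(&row.enc[m], 0, (MAX_SZ - m) * sizeof(row.enc[0]));

        if (m) {
            tbl[n++] = row;
        }
    }
    /* Clear the rest of the table */
    memset(&tbl[n], 0, (MAX_SZ - n) * sizeof(tbl[0]));
    return n;
}
```

Passing the policy as a callback keeps the cpu code free of platform
knowledge: the platform decides what to drop, the helper only rewrites
the table.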

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 target/ppc/mmu-hash64.c | 59 +++++++++++++++++++++++++++++++++++++++++
 target/ppc/mmu-hash64.h |  3 +++
 2 files changed, 62 insertions(+)

diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
index aa200cba4c..276d9015e7 100644
--- a/target/ppc/mmu-hash64.c
+++ b/target/ppc/mmu-hash64.c
@@ -1166,3 +1166,62 @@ const PPCHash64Options ppc_hash64_opts_POWER7 = {
         },
     }
 };
+
+void ppc_hash64_filter_pagesizes(PowerPCCPU *cpu,
+                                 bool (*cb)(void *, uint32_t, uint32_t),
+                                 void *opaque)
+{
+    PPCHash64Options *opts = cpu->hash64_opts;
+    int i;
+    int n = 0;
+    bool ci_largepage = false;
+
+    assert(opts);
+
+    n = 0;
+    for (i = 0; i < ARRAY_SIZE(opts->sps); i++) {
+        PPCHash64SegmentPageSizes *sps = &opts->sps[i];
+        int j;
+        int m = 0;
+
+        assert(n <= i);
+
+        if (!sps->page_shift) {
+            break;
+        }
+
+        for (j = 0; j < ARRAY_SIZE(sps->enc); j++) {
+            PPCHash64PageSize *ps = &sps->enc[j];
+
+            assert(m <= j);
+            if (!ps->page_shift) {
+                break;
+            }
+
+            if (cb(opaque, sps->page_shift, ps->page_shift)) {
+                if (ps->page_shift >= 16) {
+                    ci_largepage = true;
+                }
+                sps->enc[m++] = *ps;
+            }
+        }
+
+        /* Clear rest of the row */
+        for (j = m; j < ARRAY_SIZE(sps->enc); j++) {
+            memset(&sps->enc[j], 0, sizeof(sps->enc[j]));
+        }
+
+        if (m) {
+            n++;
+        }
+    }
+
+    /* Clear the rest of the table */
+    for (i = n; i < ARRAY_SIZE(opts->sps); i++) {
+        memset(&opts->sps[i], 0, sizeof(opts->sps[i]));
+    }
+
+    if (!ci_largepage) {
+        opts->flags &= ~PPC_HASH64_CI_LARGEPAGE;
+    }
+}
diff --git a/target/ppc/mmu-hash64.h b/target/ppc/mmu-hash64.h
index 53dcec5b93..f11efc9cbc 100644
--- a/target/ppc/mmu-hash64.h
+++ b/target/ppc/mmu-hash64.h
@@ -20,6 +20,9 @@ unsigned ppc_hash64_hpte_page_shift_noslb(PowerPCCPU *cpu,
 void ppc_store_lpcr(PowerPCCPU *cpu, target_ulong val);
 void ppc_hash64_init(PowerPCCPU *cpu);
 void ppc_hash64_finalize(PowerPCCPU *cpu);
+void ppc_hash64_filter_pagesizes(PowerPCCPU *cpu,
+                                 bool (*cb)(void *, uint32_t, uint32_t),
+                                 void *opaque);
 #endif
 
 /*
-- 
2.17.1


* [Qemu-devel] [PATCH 8/9] spapr: Limit available pagesizes to provide a consistent guest environment
  2018-06-18  6:35 [Qemu-devel] [PATCH 0/9] spapr: Clean up pagesize handling David Gibson
                   ` (6 preceding siblings ...)
  2018-06-18  6:36 ` [Qemu-devel] [PATCH 7/9] target/ppc: Add ppc_hash64_filter_pagesizes() David Gibson
@ 2018-06-18  6:36 ` David Gibson
  2018-06-21  7:01   ` Cédric Le Goater
  2018-06-21 12:24   ` Greg Kurz
  2018-06-18  6:36 ` [Qemu-devel] [PATCH 9/9] spapr: Don't rewrite mmu capabilities in KVM mode David Gibson
                   ` (2 subsequent siblings)
  10 siblings, 2 replies; 43+ messages in thread
From: David Gibson @ 2018-06-18  6:36 UTC (permalink / raw)
  To: groug, abologna; +Cc: clg, qemu-ppc, qemu-devel, aik, David Gibson

KVM HV has some limitations (deriving from the hardware) that mean not all
host-cpu supported pagesizes may be usable in the guest.  At present this
means that KVM guests and TCG guests may see different available page sizes
even if they notionally have the same vcpu model.  This is confusing and
also prevents migration between TCG and KVM.

This patch makes the environment consistent by always allowing the same set
of pagesizes.  Since we can't remove the KVM limitations, we do this by
always applying the same limitations it has, even to TCG guests.
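
The rule being applied can be read as a pure predicate over (maximum
shift, segment base shift, actual page shift).  The sketch below mirrors
the logic of spapr_pagesize_cb() in the diff that follows; the function
name is invented for this example and it takes the maximum directly
rather than through an opaque pointer.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical restatement of the filtering rule:
 *  - nothing larger than the configured maximum is allowed;
 *  - a page size different from the segment's base size is allowed
 *    only for 16 MiB pages (shift 24).
 */
static bool guest_pagesize_allowed(unsigned max_shift,
                                   uint32_t seg_shift, uint32_t page_shift)
{
    if (page_shift > max_shift) {
        return false;
    }
    if (page_shift != seg_shift && page_shift != 24) {
        return false;
    }
    return true;
}
```

So with a 16 MiB maximum, 16 MiB pages in a 4 kiB segment pass, while
64 kiB pages in a 4 kiB segment are dropped regardless of accelerator.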

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/ppc/spapr_caps.c | 33 +++++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)

diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
index 9fc739b3f5..0584c7c6ab 100644
--- a/hw/ppc/spapr_caps.c
+++ b/hw/ppc/spapr_caps.c
@@ -334,6 +334,38 @@ static void cap_hpt_maxpagesize_apply(sPAPRMachineState *spapr,
     spapr_check_pagesize(spapr, qemu_getrampagesize(), errp);
 }
 
+static bool spapr_pagesize_cb(void *opaque, uint32_t seg_pshift, uint32_t pshift)
+{
+    unsigned maxshift = *((unsigned *)opaque);
+
+    assert(pshift >= seg_pshift);
+
+    /* Don't allow the guest to use pages bigger than the configured
+     * maximum size */
+    if (pshift > maxshift) {
+        return false;
+    }
+
+    /* For whatever reason, KVM doesn't allow multiple pagesizes
+     * within a segment, *except* for the case of 16M pages in a 4k or
+     * 64k segment.  Always exclude other cases, so that TCG and KVM
+     * guests see a consistent environment */
+    if ((pshift != seg_pshift) && (pshift != 24)) {
+        return false;
+    }
+
+    return true;
+}
+
+static void cap_hpt_maxpagesize_cpu_apply(sPAPRMachineState *spapr,
+                                          PowerPCCPU *cpu,
+                                          uint8_t val, Error **errp)
+{
+    unsigned maxshift = val;
+
+    ppc_hash64_filter_pagesizes(cpu, spapr_pagesize_cb, &maxshift);
+}
+
 sPAPRCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
     [SPAPR_CAP_HTM] = {
         .name = "htm",
@@ -401,6 +433,7 @@ sPAPRCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
         .set = spapr_cap_set_pagesize,
         .type = "int",
         .apply = cap_hpt_maxpagesize_apply,
+        .cpu_apply = cap_hpt_maxpagesize_cpu_apply,
     },
 };
 
-- 
2.17.1


* [Qemu-devel] [PATCH 9/9] spapr: Don't rewrite mmu capabilities in KVM mode
  2018-06-18  6:35 [Qemu-devel] [PATCH 0/9] spapr: Clean up pagesize handling David Gibson
                   ` (7 preceding siblings ...)
  2018-06-18  6:36 ` [Qemu-devel] [PATCH 8/9] spapr: Limit available pagesizes to provide a consistent guest environment David Gibson
@ 2018-06-18  6:36 ` David Gibson
  2018-06-21  7:53   ` Cédric Le Goater
  2018-06-21  1:08 ` [Qemu-devel] [PATCH 0/9] spapr: Clean up pagesize handling David Gibson
  2018-06-21  6:52 ` no-reply
  10 siblings, 1 reply; 43+ messages in thread
From: David Gibson @ 2018-06-18  6:36 UTC (permalink / raw)
  To: groug, abologna; +Cc: clg, qemu-ppc, qemu-devel, aik, David Gibson

Currently during KVM initialization on POWER, kvm_fixup_page_sizes()
rewrites a bunch of information in the cpu state to reflect the
capabilities of the host MMU and KVM.  This overwrites the information
that's already there reflecting how the TCG implementation of the MMU will
operate.

This means that we can get guest-visibly different behaviour between KVM
and TCG (and between different KVM implementations).  That's bad.  It also
prevents migration between KVM and TCG.

The pseries machine type now has filtering of the pagesizes it allows the
guest to use which means it can present a consistent model of the MMU
across all accelerators.

So, we can now replace kvm_fixup_page_sizes() with kvm_check_mmu() which
merely verifies that the expected cpu model can be faithfully handled by
KVM, rather than updating the cpu model to match KVM.
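
To make the "verify, don't rewrite" idea concrete, here is a small
standalone sketch: each page size the guest model expects must be
offered by the host with the same encoding, otherwise we report why.
The flat struct, the enum and the function name are simplifications
invented for this example; the real check in the diff below walks the
two-level sps tables and reports through Error **.  Skipping unused
(zero shift) guest slots is a choice made by this sketch.

```c
#include <stddef.h>
#include <stdint.h>

struct size_enc {
    uint32_t page_shift;    /* 0 marks an unused slot */
    uint32_t enc;
};

enum check_result { CHECK_OK, CHECK_MISSING, CHECK_ENC_MISMATCH };

/* Verify the guest table against the host table; never modify either */
static enum check_result check_sizes(const struct size_enc *guest, size_t ng,
                                     const struct size_enc *host, size_t nh)
{
    for (size_t i = 0; i < ng; i++) {
        size_t k;

        if (!guest[i].page_shift) {
            continue;               /* unused slot, nothing to verify */
        }
        for (k = 0; k < nh; k++) {
            if (host[k].page_shift == guest[i].page_shift) {
                break;
            }
        }
        if (k == nh) {
            return CHECK_MISSING;       /* host lacks this page size */
        }
        if (host[k].enc != guest[i].enc) {
            return CHECK_ENC_MISMATCH;  /* same size, different encoding */
        }
    }
    return CHECK_OK;
}
```

The host may offer more than the guest model uses; only the guest's
expectations are checked, which is what lets the same cpu model be
presented under every accelerator.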

We call kvm_check_mmu() from the spapr cpu reset code.  This is a hack:
conceptually it makes more sense where kvm_fixup_page_sizes() was - in the KVM
cpu init path.  However, doing that would require moving the platform's
pagesize filtering much earlier, which would require a lot of work making
further adjustments.  There wouldn't be a lot of concrete point to doing
that, since the only KVM implementation which has the awkward MMU
restrictions is KVM HV, which can only work with an spapr guest anyway.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/ppc/spapr_caps.c     |   2 +-
 hw/ppc/spapr_cpu_core.c |   2 +
 target/ppc/kvm.c        | 133 ++++++++++++++++++++--------------------
 target/ppc/kvm_ppc.h    |   5 ++
 4 files changed, 73 insertions(+), 69 deletions(-)

diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
index 0584c7c6ab..bc89a4cd70 100644
--- a/hw/ppc/spapr_caps.c
+++ b/hw/ppc/spapr_caps.c
@@ -308,7 +308,7 @@ static void cap_safe_indirect_branch_apply(sPAPRMachineState *spapr,
 void spapr_check_pagesize(sPAPRMachineState *spapr, hwaddr pagesize,
                           Error **errp)
 {
-    hwaddr maxpagesize = (1ULL << spapr->eff.caps[SPAPR_CAP_HPT_MPS]);
+    hwaddr maxpagesize = (1ULL << spapr->eff.caps[SPAPR_CAP_HPT_MAXPAGESIZE]);
 
     if (!kvmppc_hpt_needs_host_contiguous_pages()) {
         return;
diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
index 324623190d..4e8fa28796 100644
--- a/hw/ppc/spapr_cpu_core.c
+++ b/hw/ppc/spapr_cpu_core.c
@@ -78,6 +78,8 @@ static void spapr_cpu_reset(void *opaque)
     spapr_cpu->dtl_size = 0;
 
     spapr_caps_cpu_apply(SPAPR_MACHINE(qdev_get_machine()), cpu);
+
+    kvm_check_mmu(cpu, &error_fatal);
 }
 
 void spapr_cpu_set_entry_state(PowerPCCPU *cpu, target_ulong nip, target_ulong r3)
diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
index 9cfbd388ad..b386335014 100644
--- a/target/ppc/kvm.c
+++ b/target/ppc/kvm.c
@@ -419,93 +419,93 @@ bool kvmppc_hpt_needs_host_contiguous_pages(void)
     return !!(smmu_info.flags & KVM_PPC_PAGE_SIZES_REAL);
 }
 
-static bool kvm_valid_page_size(uint32_t flags, long rampgsize, uint32_t shift)
+void kvm_check_mmu(PowerPCCPU *cpu, Error **errp)
 {
-    if (!kvmppc_hpt_needs_host_contiguous_pages()) {
-        return true;
-    }
-
-    return (1ul << shift) <= rampgsize;
-}
-
-static long max_cpu_page_size;
-
-static void kvm_fixup_page_sizes(PowerPCCPU *cpu)
-{
-    static struct kvm_ppc_smmu_info smmu_info;
-    static bool has_smmu_info;
-    CPUPPCState *env = &cpu->env;
+    struct kvm_ppc_smmu_info smmu_info;
     int iq, ik, jq, jk;
 
-    /* We only handle page sizes for 64-bit server guests for now */
-    if (!(env->mmu_model & POWERPC_MMU_64)) {
+    /* For now, we only have anything to check on hash64 MMUs */
+    if (!cpu->hash64_opts || !kvm_enabled()) {
         return;
     }
 
-    /* Collect MMU info from kernel if not already */
-    if (!has_smmu_info) {
-        kvm_get_smmu_info(cpu, &smmu_info);
-        has_smmu_info = true;
-    }
+    kvm_get_smmu_info(cpu, &smmu_info);
 
-    if (!max_cpu_page_size) {
-        max_cpu_page_size = qemu_getrampagesize();
+    if (ppc_hash64_has(cpu, PPC_HASH64_1TSEG)
+        && !(smmu_info.flags & KVM_PPC_1T_SEGMENTS)) {
+        error_setg(errp,
+                   "KVM does not support 1TiB segments which guest expects");
+        return;
     }
 
-    /* Convert to QEMU form */
-    memset(cpu->hash64_opts->sps, 0, sizeof(*cpu->hash64_opts->sps));
-
-    /* If we have HV KVM, we need to forbid CI large pages if our
-     * host page size is smaller than 64K.
-     */
-    if (kvmppc_hpt_needs_host_contiguous_pages()) {
-        if (getpagesize() >= 0x10000) {
-            cpu->hash64_opts->flags |= PPC_HASH64_CI_LARGEPAGE;
-        } else {
-            cpu->hash64_opts->flags &= ~PPC_HASH64_CI_LARGEPAGE;
-        }
+    if (smmu_info.slb_size < cpu->hash64_opts->slb_size) {
+        error_setg(errp, "KVM only supports %u SLB entries, but guest needs %u",
+                   smmu_info.slb_size, cpu->hash64_opts->slb_size);
+        return;
     }
 
     /*
-     * XXX This loop should be an entry wide AND of the capabilities that
-     *     the selected CPU has with the capabilities that KVM supports.
+     * Verify that every pagesize supported by the cpu model is
+     * supported by KVM with the same encodings
      */
-    for (ik = iq = 0; ik < KVM_PPC_PAGE_SIZES_MAX_SZ; ik++) {
+    for (iq = 0; iq < ARRAY_SIZE(cpu->hash64_opts->sps); iq++) {
         PPCHash64SegmentPageSizes *qsps = &cpu->hash64_opts->sps[iq];
-        struct kvm_ppc_one_seg_page_size *ksps = &smmu_info.sps[ik];
+        struct kvm_ppc_one_seg_page_size *ksps;
 
-        if (!kvm_valid_page_size(smmu_info.flags, max_cpu_page_size,
-                                 ksps->page_shift)) {
-            continue;
-        }
-        qsps->page_shift = ksps->page_shift;
-        qsps->slb_enc = ksps->slb_enc;
-        for (jk = jq = 0; jk < KVM_PPC_PAGE_SIZES_MAX_SZ; jk++) {
-            if (!kvm_valid_page_size(smmu_info.flags, max_cpu_page_size,
-                                     ksps->enc[jk].page_shift)) {
-                continue;
-            }
-            qsps->enc[jq].page_shift = ksps->enc[jk].page_shift;
-            qsps->enc[jq].pte_enc = ksps->enc[jk].pte_enc;
-            if (++jq >= PPC_PAGE_SIZES_MAX_SZ) {
+        for (ik = 0; ik < ARRAY_SIZE(smmu_info.sps); ik++) {
+            if (qsps->page_shift == smmu_info.sps[ik].page_shift) {
                 break;
             }
         }
-        if (++iq >= PPC_PAGE_SIZES_MAX_SZ) {
-            break;
+        if (ik >= ARRAY_SIZE(smmu_info.sps)) {
+            error_setg(errp, "KVM doesn't support base page shift %u",
+                       qsps->page_shift);
+            return;
+        }
+
+        ksps = &smmu_info.sps[ik];
+        if (ksps->slb_enc != qsps->slb_enc) {
+            error_setg(errp,
+"KVM uses SLB encoding 0x%x for page shift %u, but guest expects 0x%x",
+                       ksps->slb_enc, ksps->page_shift, qsps->slb_enc);
+            return;
+        }
+
+        for (jq = 0; jq < ARRAY_SIZE(qsps->enc); jq++) {
+            for (jk = 0; jk < ARRAY_SIZE(ksps->enc); jk++) {
+                if (qsps->enc[jq].page_shift == ksps->enc[jk].page_shift) {
+                    break;
+                }
+            }
+
+            if (jk >= ARRAY_SIZE(ksps->enc)) {
+                error_setg(errp, "KVM doesn't support page shift %u/%u",
+                           qsps->enc[jq].page_shift, qsps->page_shift);
+                return;
+            }
+            if (qsps->enc[jq].pte_enc != ksps->enc[jk].pte_enc) {
+                error_setg(errp,
+"KVM uses PTE encoding 0x%x for page shift %u/%u, but guest expects 0x%x",
+                           ksps->enc[jk].pte_enc, qsps->enc[jq].page_shift,
+                           qsps->page_shift, qsps->enc[jq].pte_enc);
+                return;
+            }
         }
     }
-    cpu->hash64_opts->slb_size = smmu_info.slb_size;
-    if (!(smmu_info.flags & KVM_PPC_1T_SEGMENTS)) {
-        cpu->hash64_opts->flags &= ~PPC_HASH64_1TSEG;
-    }
-}
-#else /* defined (TARGET_PPC64) */
 
-static inline void kvm_fixup_page_sizes(PowerPCCPU *cpu)
-{
+    if (ppc_hash64_has(cpu, PPC_HASH64_CI_LARGEPAGE)) {
+        /* Mostly what guest pagesizes we can use are related to the
+         * host pages used to map guest RAM, which is handled in the
+         * platform code. Cache-Inhibited largepages (64k) however are
+         * used for I/O, so if they're mapped to the host at all it
+         * will be a normal mapping, not a special hugepage one used
+         * for RAM. */
+        if (getpagesize() < 0x10000) {
+            error_setg(errp,
+"KVM can't supply 64kiB CI pages, which guest expects");
+        }
+    }
 }
-
 #endif /* !defined (TARGET_PPC64) */
 
 unsigned long kvm_arch_vcpu_id(CPUState *cpu)
@@ -551,9 +551,6 @@ int kvm_arch_init_vcpu(CPUState *cs)
     CPUPPCState *cenv = &cpu->env;
     int ret;
 
-    /* Gather server mmu info from KVM and update the CPU state */
-    kvm_fixup_page_sizes(cpu);
-
     /* Synchronize sregs with kvm */
     ret = kvm_arch_sync_sregs(cpu);
     if (ret) {
diff --git a/target/ppc/kvm_ppc.h b/target/ppc/kvm_ppc.h
index 443fca0a4e..657582bb32 100644
--- a/target/ppc/kvm_ppc.h
+++ b/target/ppc/kvm_ppc.h
@@ -71,6 +71,7 @@ int kvmppc_resize_hpt_commit(PowerPCCPU *cpu, target_ulong flags, int shift);
 bool kvmppc_pvr_workaround_required(PowerPCCPU *cpu);
 
 bool kvmppc_hpt_needs_host_contiguous_pages(void);
+void kvm_check_mmu(PowerPCCPU *cpu, Error **errp);
 
 #else
 
@@ -227,6 +228,10 @@ static inline bool kvmppc_hpt_needs_host_contiguous_pages(void)
     return false;
 }
 
+static inline void kvm_check_mmu(PowerPCCPU *cpu, Error **errp)
+{
+}
+
 static inline bool kvmppc_has_cap_spapr_vfio(void)
 {
     return false;
-- 
2.17.1


* Re: [Qemu-devel] [PATCH 1/9] target/ppc: Allow cpu compatiblity checks based on type, not instance
  2018-06-18  6:35 ` [Qemu-devel] [PATCH 1/9] target/ppc: Allow cpu compatiblity checks based on type, not instance David Gibson
@ 2018-06-18 13:22   ` Greg Kurz
  2018-06-21  5:20   ` Cédric Le Goater
  1 sibling, 0 replies; 43+ messages in thread
From: Greg Kurz @ 2018-06-18 13:22 UTC (permalink / raw)
  To: David Gibson; +Cc: abologna, clg, qemu-ppc, qemu-devel, aik

On Mon, 18 Jun 2018 16:35:58 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> ppc_check_compat() is used in a number of places to check if a cpu object
> supports a certain compatibility mode, subject to various constraints.
> 
> It takes a PowerPCCPU *, however it really only depends on the cpu's class.
> We have upcoming cases where it would be useful to make compatibility
> checks before we fully instantiate the cpu objects.
> 
> ppc_type_check_compat() will now make an equivalent check, but based on a
> CPU's QOM typename instead of an instantiated CPU object.
> 
> We make use of the new interface in several places in spapr, where we're
> essentially making a global check, rather than one specific to a particular
> cpu.  This avoids some ugly uses of first_cpu to grab a "representative"
> instance.
> 

Nice cleanup !

> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---

Reviewed-by: Greg Kurz <groug@kaod.org>

>  hw/ppc/spapr.c      | 10 ++++------
>  hw/ppc/spapr_caps.c | 19 +++++++++----------
>  target/ppc/compat.c | 27 +++++++++++++++++++++------
>  target/ppc/cpu.h    |  4 ++++
>  4 files changed, 38 insertions(+), 22 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index db0fb385d4..b0b94fc1f0 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1616,8 +1616,8 @@ static void spapr_machine_reset(void)
>  
>      first_ppc_cpu = POWERPC_CPU(first_cpu);
>      if (kvm_enabled() && kvmppc_has_cap_mmu_radix() &&
> -        ppc_check_compat(first_ppc_cpu, CPU_POWERPC_LOGICAL_3_00, 0,
> -                         spapr->max_compat_pvr)) {
> +        ppc_type_check_compat(machine->cpu_type, CPU_POWERPC_LOGICAL_3_00, 0,
> +                              spapr->max_compat_pvr)) {
>          /* If using KVM with radix mode available, VCPUs can be started
>           * without a HPT because KVM will start them in radix mode.
>           * Set the GR bit in PATB so that we know there is no HPT. */
> @@ -2520,7 +2520,6 @@ static void spapr_machine_init(MachineState *machine)
>      long load_limit, fw_size;
>      char *filename;
>      Error *resize_hpt_err = NULL;
> -    PowerPCCPU *first_ppc_cpu;
>  
>      msi_nonbroken = true;
>  
> @@ -2618,10 +2617,9 @@ static void spapr_machine_init(MachineState *machine)
>      /* init CPUs */
>      spapr_init_cpus(spapr);
>  
> -    first_ppc_cpu = POWERPC_CPU(first_cpu);
>      if ((!kvm_enabled() || kvmppc_has_cap_mmu_radix()) &&
> -        ppc_check_compat(first_ppc_cpu, CPU_POWERPC_LOGICAL_3_00, 0,
> -                         spapr->max_compat_pvr)) {
> +        ppc_type_check_compat(machine->cpu_type, CPU_POWERPC_LOGICAL_3_00, 0,
> +                              spapr->max_compat_pvr)) {
>          /* KVM and TCG always allow GTSE with radix... */
>          spapr_ovec_set(spapr->ov5, OV5_MMU_RADIX_GTSE);
>      }
> diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
> index 00e43a9ba7..469f38f0ef 100644
> --- a/hw/ppc/spapr_caps.c
> +++ b/hw/ppc/spapr_caps.c
> @@ -327,27 +327,26 @@ sPAPRCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
>  };
>  
>  static sPAPRCapabilities default_caps_with_cpu(sPAPRMachineState *spapr,
> -                                               CPUState *cs)
> +                                               const char *cputype)
>  {
>      sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
> -    PowerPCCPU *cpu = POWERPC_CPU(cs);
>      sPAPRCapabilities caps;
>  
>      caps = smc->default_caps;
>  
> -    if (!ppc_check_compat(cpu, CPU_POWERPC_LOGICAL_2_07,
> -                          0, spapr->max_compat_pvr)) {
> +    if (!ppc_type_check_compat(cputype, CPU_POWERPC_LOGICAL_2_07,
> +                               0, spapr->max_compat_pvr)) {
>          caps.caps[SPAPR_CAP_HTM] = SPAPR_CAP_OFF;
>          caps.caps[SPAPR_CAP_CFPC] = SPAPR_CAP_BROKEN;
>      }
>  
> -    if (!ppc_check_compat(cpu, CPU_POWERPC_LOGICAL_2_06_PLUS,
> -                          0, spapr->max_compat_pvr)) {
> +    if (!ppc_type_check_compat(cputype, CPU_POWERPC_LOGICAL_2_06_PLUS,
> +                               0, spapr->max_compat_pvr)) {
>          caps.caps[SPAPR_CAP_SBBC] = SPAPR_CAP_BROKEN;
>      }
>  
> -    if (!ppc_check_compat(cpu, CPU_POWERPC_LOGICAL_2_06,
> -                          0, spapr->max_compat_pvr)) {
> +    if (!ppc_type_check_compat(cputype, CPU_POWERPC_LOGICAL_2_06,
> +                               0, spapr->max_compat_pvr)) {
>          caps.caps[SPAPR_CAP_VSX] = SPAPR_CAP_OFF;
>          caps.caps[SPAPR_CAP_DFP] = SPAPR_CAP_OFF;
>          caps.caps[SPAPR_CAP_IBS] = SPAPR_CAP_BROKEN;
> @@ -384,7 +383,7 @@ int spapr_caps_post_migration(sPAPRMachineState *spapr)
>      sPAPRCapabilities dstcaps = spapr->eff;
>      sPAPRCapabilities srccaps;
>  
> -    srccaps = default_caps_with_cpu(spapr, first_cpu);
> +    srccaps = default_caps_with_cpu(spapr, MACHINE(spapr)->cpu_type);
>      for (i = 0; i < SPAPR_CAP_NUM; i++) {
>          /* If not default value then assume came in with the migration */
>          if (spapr->mig.caps[i] != spapr->def.caps[i]) {
> @@ -446,7 +445,7 @@ void spapr_caps_reset(sPAPRMachineState *spapr)
>      int i;
>  
>      /* First compute the actual set of caps we're running with.. */
> -    default_caps = default_caps_with_cpu(spapr, first_cpu);
> +    default_caps = default_caps_with_cpu(spapr, MACHINE(spapr)->cpu_type);
>  
>      for (i = 0; i < SPAPR_CAP_NUM; i++) {
>          /* Store the defaults */
> diff --git a/target/ppc/compat.c b/target/ppc/compat.c
> index 807c906f68..7de4bf3122 100644
> --- a/target/ppc/compat.c
> +++ b/target/ppc/compat.c
> @@ -105,17 +105,13 @@ static const CompatInfo *compat_by_pvr(uint32_t pvr)
>      return NULL;
>  }
>  
> -bool ppc_check_compat(PowerPCCPU *cpu, uint32_t compat_pvr,
> -                      uint32_t min_compat_pvr, uint32_t max_compat_pvr)
> +static bool pcc_compat(PowerPCCPUClass *pcc, uint32_t compat_pvr,
> +                       uint32_t min_compat_pvr, uint32_t max_compat_pvr)
>  {
> -    PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
>      const CompatInfo *compat = compat_by_pvr(compat_pvr);
>      const CompatInfo *min = compat_by_pvr(min_compat_pvr);
>      const CompatInfo *max = compat_by_pvr(max_compat_pvr);
>  
> -#if !defined(CONFIG_USER_ONLY)
> -    g_assert(cpu->vhyp);
> -#endif
>      g_assert(!min_compat_pvr || min);
>      g_assert(!max_compat_pvr || max);
>  
> @@ -134,6 +130,25 @@ bool ppc_check_compat(PowerPCCPU *cpu, uint32_t compat_pvr,
>      return true;
>  }
>  
> +bool ppc_check_compat(PowerPCCPU *cpu, uint32_t compat_pvr,
> +                      uint32_t min_compat_pvr, uint32_t max_compat_pvr)
> +{
> +    PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
> +
> +#if !defined(CONFIG_USER_ONLY)
> +    g_assert(cpu->vhyp);
> +#endif
> +
> +    return pcc_compat(pcc, compat_pvr, min_compat_pvr, max_compat_pvr);
> +}
> +
> +bool ppc_type_check_compat(const char *cputype, uint32_t compat_pvr,
> +                           uint32_t min_compat_pvr, uint32_t max_compat_pvr)
> +{
> +    PowerPCCPUClass *pcc = POWERPC_CPU_CLASS(object_class_by_name(cputype));
> +    return pcc_compat(pcc, compat_pvr, min_compat_pvr, max_compat_pvr);
> +}
> +
>  void ppc_set_compat(PowerPCCPU *cpu, uint32_t compat_pvr, Error **errp)
>  {
>      const CompatInfo *compat = compat_by_pvr(compat_pvr);
> diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
> index 874da6efbc..c7f3fb6b73 100644
> --- a/target/ppc/cpu.h
> +++ b/target/ppc/cpu.h
> @@ -1369,7 +1369,11 @@ static inline int cpu_mmu_index (CPUPPCState *env, bool ifetch)
>  #if defined(TARGET_PPC64)
>  bool ppc_check_compat(PowerPCCPU *cpu, uint32_t compat_pvr,
>                        uint32_t min_compat_pvr, uint32_t max_compat_pvr);
> +bool ppc_type_check_compat(const char *cputype, uint32_t compat_pvr,
> +                           uint32_t min_compat_pvr, uint32_t max_compat_pvr);
> +
>  void ppc_set_compat(PowerPCCPU *cpu, uint32_t compat_pvr, Error **errp);
> +
>  #if !defined(CONFIG_USER_ONLY)
>  void ppc_set_compat_all(uint32_t compat_pvr, Error **errp);
>  #endif


* Re: [Qemu-devel] [PATCH 2/9] spapr: Compute effective capability values earlier
  2018-06-18  6:35 ` [Qemu-devel] [PATCH 2/9] spapr: Compute effective capability values earlier David Gibson
@ 2018-06-18 13:37   ` Greg Kurz
  2018-06-21  5:32   ` Cédric Le Goater
  1 sibling, 0 replies; 43+ messages in thread
From: Greg Kurz @ 2018-06-18 13:37 UTC (permalink / raw)
  To: David Gibson; +Cc: abologna, clg, qemu-ppc, qemu-devel, aik

On Mon, 18 Jun 2018 16:35:59 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> Previously, the effective values of the various spapr capability flags
> were only determined at machine reset time.  That was a lazy way of making
> sure it was after cpu initialization so it could use the cpu object to
> inform the defaults.
> 
> But we've now improved the compat checking code so that we don't need to
> instantiate the cpus to use it.  That lets us move the resolution of the
> capability defaults much earlier.
> 
> This is going to be necessary for some future capabilities.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---

Reviewed-by: Greg Kurz <groug@kaod.org>

>  hw/ppc/spapr.c         | 6 ++++--
>  hw/ppc/spapr_caps.c    | 9 ++++++---
>  include/hw/ppc/spapr.h | 3 ++-
>  3 files changed, 12 insertions(+), 6 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index b0b94fc1f0..40858d047c 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1612,7 +1612,7 @@ static void spapr_machine_reset(void)
>      void *fdt;
>      int rc;
>  
> -    spapr_caps_reset(spapr);
> +    spapr_caps_apply(spapr);
>  
>      first_ppc_cpu = POWERPC_CPU(first_cpu);
>      if (kvm_enabled() && kvmppc_has_cap_mmu_radix() &&
> @@ -2526,7 +2526,9 @@ static void spapr_machine_init(MachineState *machine)
>      QLIST_INIT(&spapr->phbs);
>      QTAILQ_INIT(&spapr->pending_dimm_unplugs);
>  
> -    /* Check HPT resizing availability */
> +    /* Determine capabilities to run with */
> +    spapr_caps_init(spapr);
> +
>      kvmppc_check_papr_resize_hpt(&resize_hpt_err);
>      if (spapr->resize_hpt == SPAPR_RESIZE_HPT_DEFAULT) {
>          /*
> diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
> index 469f38f0ef..dabed817d1 100644
> --- a/hw/ppc/spapr_caps.c
> +++ b/hw/ppc/spapr_caps.c
> @@ -439,12 +439,12 @@ SPAPR_CAP_MIG_STATE(cfpc, SPAPR_CAP_CFPC);
>  SPAPR_CAP_MIG_STATE(sbbc, SPAPR_CAP_SBBC);
>  SPAPR_CAP_MIG_STATE(ibs, SPAPR_CAP_IBS);
>  
> -void spapr_caps_reset(sPAPRMachineState *spapr)
> +void spapr_caps_init(sPAPRMachineState *spapr)
>  {
>      sPAPRCapabilities default_caps;
>      int i;
>  
> -    /* First compute the actual set of caps we're running with.. */
> +    /* Compute the actual set of caps we should run with */
>      default_caps = default_caps_with_cpu(spapr, MACHINE(spapr)->cpu_type);
>  
>      for (i = 0; i < SPAPR_CAP_NUM; i++) {
> @@ -455,8 +455,11 @@ void spapr_caps_reset(sPAPRMachineState *spapr)
>              spapr->eff.caps[i] = default_caps.caps[i];
>          }
>      }
> +}
>  
> -    /* .. then apply those caps to the virtual hardware */
> +void spapr_caps_apply(sPAPRMachineState *spapr)
> +{
> +    int i;
>  
>      for (i = 0; i < SPAPR_CAP_NUM; i++) {
>          sPAPRCapabilityInfo *info = &capability_table[i];
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 3388750fc7..9dbd6010f5 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -798,7 +798,8 @@ static inline uint8_t spapr_get_cap(sPAPRMachineState *spapr, int cap)
>      return spapr->eff.caps[cap];
>  }
>  
> -void spapr_caps_reset(sPAPRMachineState *spapr);
> +void spapr_caps_init(sPAPRMachineState *spapr);
> +void spapr_caps_apply(sPAPRMachineState *spapr);
>  void spapr_caps_add_properties(sPAPRMachineClass *smc, Error **errp);
>  int spapr_caps_post_migration(sPAPRMachineState *spapr);
>  
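
For readers following the split above, here is a minimal standalone sketch of the two-phase idea: resolve the effective capability values once at init, then apply them at every reset. Everything named demo_* or Demo* is hypothetical and only mirrors the shape of the patch; it is not the QEMU code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the spapr capability state; not QEMU types. */
enum { DEMO_CAP_A, DEMO_CAP_B, DEMO_CAP_NUM };

typedef struct DemoMachine {
    bool cmd_line_caps[DEMO_CAP_NUM]; /* true if the user set this cap */
    uint8_t eff_caps[DEMO_CAP_NUM];   /* effective values to run with */
    uint8_t applied[DEMO_CAP_NUM];    /* what the virtual hardware uses */
    int apply_count;                  /* how many resets have applied caps */
} DemoMachine;

/* Phase 1, once at machine init: fill in defaults the user did not override. */
void demo_caps_init(DemoMachine *m, const uint8_t defaults[DEMO_CAP_NUM])
{
    assert(m && defaults);
    for (int i = 0; i < DEMO_CAP_NUM; i++) {
        if (!m->cmd_line_caps[i]) {
            m->eff_caps[i] = defaults[i];
        }
    }
}

/* Phase 2, at every reset: push the already-resolved values to the hardware. */
void demo_caps_apply(DemoMachine *m)
{
    assert(m);
    for (int i = 0; i < DEMO_CAP_NUM; i++) {
        m->applied[i] = m->eff_caps[i];
    }
    m->apply_count++;
}
```

Because demo_caps_init() no longer needs an instantiated CPU, anything that runs after init but before the first reset can already read the effective values.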

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH 3/9] spapr: Add cpu_apply hook to capabilities
  2018-06-18  6:36 ` [Qemu-devel] [PATCH 3/9] spapr: Add cpu_apply hook to capabilities David Gibson
@ 2018-06-18 15:28   ` Greg Kurz
  2018-06-21  5:34   ` Cédric Le Goater
  1 sibling, 0 replies; 43+ messages in thread
From: Greg Kurz @ 2018-06-18 15:28 UTC (permalink / raw)
  To: David Gibson; +Cc: abologna, clg, qemu-ppc, qemu-devel, aik

On Mon, 18 Jun 2018 16:36:00 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> spapr capabilities have an apply hook to actually activate (or deactivate)
> the feature in the system at reset time.  However, a number of capabilities
> affect the setup of cpus, and need to be applied to each of them -
> including hotplugged cpus for extra complication.  To make this simpler,
> add an optional cpu_apply hook that is called from spapr_cpu_reset().
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---

Reviewed-by: Greg Kurz <groug@kaod.org>

>  hw/ppc/spapr_caps.c     | 19 +++++++++++++++++++
>  hw/ppc/spapr_cpu_core.c |  2 ++
>  include/hw/ppc/spapr.h  |  1 +
>  3 files changed, 22 insertions(+)
> 
> diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
> index dabed817d1..68a4243efc 100644
> --- a/hw/ppc/spapr_caps.c
> +++ b/hw/ppc/spapr_caps.c
> @@ -59,6 +59,8 @@ typedef struct sPAPRCapabilityInfo {
>      sPAPRCapPossible *possible;
>      /* Make sure the virtual hardware can support this capability */
>      void (*apply)(sPAPRMachineState *spapr, uint8_t val, Error **errp);
> +    void (*cpu_apply)(sPAPRMachineState *spapr, PowerPCCPU *cpu,
> +                      uint8_t val, Error **errp);
>  } sPAPRCapabilityInfo;
>  
>  static void spapr_cap_get_bool(Object *obj, Visitor *v, const char *name,
> @@ -472,6 +474,23 @@ void spapr_caps_apply(sPAPRMachineState *spapr)
>      }
>  }
>  
> +void spapr_caps_cpu_apply(sPAPRMachineState *spapr, PowerPCCPU *cpu)
> +{
> +    int i;
> +
> +    for (i = 0; i < SPAPR_CAP_NUM; i++) {
> +        sPAPRCapabilityInfo *info = &capability_table[i];
> +
> +        /*
> +         * If the apply function can't set the desired level and thinks it's
> +         * fatal, it should cause that.
> +         */
> +        if (info->cpu_apply) {
> +            info->cpu_apply(spapr, cpu, spapr->eff.caps[i], &error_fatal);
> +        }
> +    }
> +}
> +
>  void spapr_caps_add_properties(sPAPRMachineClass *smc, Error **errp)
>  {
>      Error *local_err = NULL;
> diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
> index aef3be33a3..324623190d 100644
> --- a/hw/ppc/spapr_cpu_core.c
> +++ b/hw/ppc/spapr_cpu_core.c
> @@ -76,6 +76,8 @@ static void spapr_cpu_reset(void *opaque)
>      spapr_cpu->slb_shadow_size = 0;
>      spapr_cpu->dtl_addr = 0;
>      spapr_cpu->dtl_size = 0;
> +
> +    spapr_caps_cpu_apply(SPAPR_MACHINE(qdev_get_machine()), cpu);
>  }
>  
>  void spapr_cpu_set_entry_state(PowerPCCPU *cpu, target_ulong nip, target_ulong r3)
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 9dbd6010f5..9dd46a72f6 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -800,6 +800,7 @@ static inline uint8_t spapr_get_cap(sPAPRMachineState *spapr, int cap)
>  
>  void spapr_caps_init(sPAPRMachineState *spapr);
>  void spapr_caps_apply(sPAPRMachineState *spapr);
> +void spapr_caps_cpu_apply(sPAPRMachineState *spapr, PowerPCCPU *cpu);
>  void spapr_caps_add_properties(sPAPRMachineClass *smc, Error **errp);
>  int spapr_caps_post_migration(sPAPRMachineState *spapr);
>  
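
The optional per-cpu hook can be illustrated outside QEMU with a small table of function pointers where a NULL entry simply means "nothing to do for this capability". This is only a sketch under that assumption; the Hook*/hook_* names and the max_shift field are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability indices and cpu state; not QEMU definitions. */
enum { HOOK_CAP_X, HOOK_CAP_Y, HOOK_CAP_NUM };

typedef struct HookCpu {
    uint8_t max_shift; /* example per-cpu setting driven by a capability */
} HookCpu;

typedef struct HookCapInfo {
    const char *name;
    void (*cpu_apply)(HookCpu *cpu, uint8_t val); /* optional, may be NULL */
} HookCapInfo;

/* Example hook: copy the capability value into the cpu state. */
static void hook_y_cpu_apply(HookCpu *cpu, uint8_t val)
{
    cpu->max_shift = val;
}

static const HookCapInfo hook_table[HOOK_CAP_NUM] = {
    [HOOK_CAP_X] = { .name = "x", .cpu_apply = NULL },
    [HOOK_CAP_Y] = { .name = "y", .cpu_apply = hook_y_cpu_apply },
};

/* Call every non-NULL hook for one cpu; returns how many hooks ran. */
int hook_caps_cpu_apply(const uint8_t eff[HOOK_CAP_NUM], HookCpu *cpu)
{
    int called = 0;

    assert(eff && cpu);
    for (int i = 0; i < HOOK_CAP_NUM; i++) {
        if (hook_table[i].cpu_apply) {
            hook_table[i].cpu_apply(cpu, eff[i]);
            called++;
        }
    }
    return called;
}
```

Calling this from the per-cpu reset path is what lets hotplugged cpus pick up the same settings as the boot cpus without any special casing.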

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH 4/9] target/ppc: Add kvmppc_hpt_needs_host_contiguous_pages() helper
  2018-06-18  6:36 ` [Qemu-devel] [PATCH 4/9] target/ppc: Add kvmppc_hpt_needs_host_contiguous_pages() helper David Gibson
@ 2018-06-18 15:32   ` Greg Kurz
  2018-06-21  5:56   ` Cédric Le Goater
  1 sibling, 0 replies; 43+ messages in thread
From: Greg Kurz @ 2018-06-18 15:32 UTC (permalink / raw)
  To: David Gibson; +Cc: abologna, clg, qemu-ppc, qemu-devel, aik

On Mon, 18 Jun 2018 16:36:01 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> KVM HV has a restriction that for HPT mode guests, guest pages must be hpa
> contiguous as well as gpa contiguous.  We have to account for that in
> various places.  We determine whether we're subject to this restriction
> from the SMMU information exposed by KVM.
> 
> Planned cleanups to the way we handle this will require knowing whether
> this restriction is in play in wider parts of the code.  So, expose a
> helper function which returns it.
> 
> This does mean some redundant calls to kvm_get_smmu_info(), but they'll go
> away again with future cleanups.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---

Reviewed-by: Greg Kurz <groug@kaod.org>

>  target/ppc/kvm.c     | 17 +++++++++++++++--
>  target/ppc/kvm_ppc.h |  6 ++++++
>  2 files changed, 21 insertions(+), 2 deletions(-)
> 
> diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
> index 5c0e313ca6..50b5d01432 100644
> --- a/target/ppc/kvm.c
> +++ b/target/ppc/kvm.c
> @@ -406,9 +406,22 @@ target_ulong kvmppc_configure_v3_mmu(PowerPCCPU *cpu,
>      }
>  }
>  
> +bool kvmppc_hpt_needs_host_contiguous_pages(void)
> +{
> +    PowerPCCPU *cpu = POWERPC_CPU(first_cpu);
> +    static struct kvm_ppc_smmu_info smmu_info;
> +
> +    if (!kvm_enabled()) {
> +        return false;
> +    }
> +
> +    kvm_get_smmu_info(cpu, &smmu_info);
> +    return !!(smmu_info.flags & KVM_PPC_PAGE_SIZES_REAL);
> +}
> +
>  static bool kvm_valid_page_size(uint32_t flags, long rampgsize, uint32_t shift)
>  {
> -    if (!(flags & KVM_PPC_PAGE_SIZES_REAL)) {
> +    if (!kvmppc_hpt_needs_host_contiguous_pages()) {
>          return true;
>      }
>  
> @@ -445,7 +458,7 @@ static void kvm_fixup_page_sizes(PowerPCCPU *cpu)
>      /* If we have HV KVM, we need to forbid CI large pages if our
>       * host page size is smaller than 64K.
>       */
> -    if (smmu_info.flags & KVM_PPC_PAGE_SIZES_REAL) {
> +    if (kvmppc_hpt_needs_host_contiguous_pages()) {
>          if (getpagesize() >= 0x10000) {
>              cpu->hash64_opts->flags |= PPC_HASH64_CI_LARGEPAGE;
>          } else {
> diff --git a/target/ppc/kvm_ppc.h b/target/ppc/kvm_ppc.h
> index e2840e1d33..a7ddb8a5d6 100644
> --- a/target/ppc/kvm_ppc.h
> +++ b/target/ppc/kvm_ppc.h
> @@ -70,6 +70,7 @@ int kvmppc_resize_hpt_prepare(PowerPCCPU *cpu, target_ulong flags, int shift);
>  int kvmppc_resize_hpt_commit(PowerPCCPU *cpu, target_ulong flags, int shift);
>  bool kvmppc_pvr_workaround_required(PowerPCCPU *cpu);
>  
> +bool kvmppc_hpt_needs_host_contiguous_pages(void);
>  bool kvmppc_is_mem_backend_page_size_ok(const char *obj_path);
>  
>  #else
> @@ -222,6 +223,11 @@ static inline uint64_t kvmppc_rma_size(uint64_t current_size,
>      return ram_size;
>  }
>  
> +static inline bool kvmppc_hpt_needs_host_contiguous_pages(void)
> +{
> +    return false;
> +}
> +
>  static inline bool kvmppc_is_mem_backend_page_size_ok(const char *obj_path)
>  {
>      return true;
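
To make the restriction from the commit message concrete: when host-contiguous pages are required, a guest page size is only usable if it is no larger than the page size backing guest RAM. The sketch below assumes exactly that rule; CONTIG_FLAG_REAL is a placeholder bit rather than the real KVM ABI value, and the contig_* helpers are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder flag bit for the example; NOT the KVM ABI constant. */
#define CONTIG_FLAG_REAL (1u << 0)

/* Does the (hypothetical) flags word say pages must be host-contiguous? */
bool contig_needs_host_pages(uint32_t flags)
{
    return !!(flags & CONTIG_FLAG_REAL);
}

/*
 * Is a guest page of size (1 << shift) usable, given the page size
 * backing guest RAM on the host?
 */
bool contig_page_shift_ok(uint32_t flags, uint64_t backing_pagesize,
                          uint32_t shift)
{
    assert(shift < 64);
    if (!contig_needs_host_pages(flags)) {
        return true; /* no restriction: any size is fine */
    }
    return (UINT64_C(1) << shift) <= backing_pagesize;
}
```

For example, with 64kiB backing pages and the restriction active, a 16MiB guest page (shift 24) is rejected while a 64kiB one (shift 16) is accepted.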

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH 5/9] spapr: Maximum (HPT) pagesize property
  2018-06-18  6:36 ` [Qemu-devel] [PATCH 5/9] spapr: Maximum (HPT) pagesize property David Gibson
@ 2018-06-19  9:23   ` Cédric Le Goater
  2018-06-19 11:22     ` David Gibson
  2018-06-21  6:22   ` Cédric Le Goater
  2018-06-21  9:19   ` Greg Kurz
  2 siblings, 1 reply; 43+ messages in thread
From: Cédric Le Goater @ 2018-06-19  9:23 UTC (permalink / raw)
  To: David Gibson, groug, abologna; +Cc: qemu-ppc, qemu-devel, aik

On 06/18/2018 08:36 AM, David Gibson wrote:
> The way the POWER Hash Page Table (HPT) MMU is virtualized by KVM HV means
> that every page that the guest puts in the pagetables must be truly
> physically contiguous, not just GPA-contiguous.  In effect this means that
> an HPT guest can't use any pagesizes greater than the host page size used
> to back its memory.
> 
> At present we handle this by changing what we advertise to the guest based
> on the backing pagesizes.  This is pretty bad, because it means the guest
> sees a different environment depending on what should be host configuration
> details.
> 
> As a start on fixing this, we add a new capability parameter to the pseries
> machine type which gives the maximum allowed pagesizes for an HPT guest (as
> a shift).  For now we just create and validate the parameter without making
> it do anything.
> 
> For backwards compatibility, on older machine types we set it to the max
> available page size for the host.  For the 3.0 machine type, we fix it to
> 16, the intention being to only allow HPT pagesizes up to 64kiB by default
> in future.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  hw/ppc/spapr.c         | 12 +++++++++
>  hw/ppc/spapr_caps.c    | 56 ++++++++++++++++++++++++++++++++++++++++++
>  include/hw/ppc/spapr.h |  4 ++-
>  3 files changed, 71 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 40858d047c..74a76e7e09 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -63,6 +63,7 @@
>  #include "hw/virtio/vhost-scsi-common.h"
>  
>  #include "exec/address-spaces.h"
> +#include "exec/ram_addr.h"
>  #include "hw/usb.h"
>  #include "qemu/config-file.h"
>  #include "qemu/error-report.h"
> @@ -4043,6 +4044,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
>      smc->default_caps.caps[SPAPR_CAP_CFPC] = SPAPR_CAP_BROKEN;
>      smc->default_caps.caps[SPAPR_CAP_SBBC] = SPAPR_CAP_BROKEN;
>      smc->default_caps.caps[SPAPR_CAP_IBS] = SPAPR_CAP_BROKEN;
> +    smc->default_caps.caps[SPAPR_CAP_HPT_MAXPAGESIZE] = 16; /* 64kiB */
>      spapr_caps_add_properties(smc, &error_abort);
>  }
>  
> @@ -4126,8 +4128,18 @@ static void spapr_machine_2_12_instance_options(MachineState *machine)
>  
>  static void spapr_machine_2_12_class_options(MachineClass *mc)
>  {
> +    sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(mc);
> +    uint8_t mps;
> +
>      spapr_machine_3_0_class_options(mc);
>      SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_12);
> +
> +    if (kvmppc_hpt_needs_host_contiguous_pages()) {
> +        mps = ctz64(qemu_getrampagesize());
> +    } else {
> +        mps = 34; /* allow everything up to 16GiB, i.e. everything */
> +    }
> +    smc->default_caps.caps[SPAPR_CAP_HPT_MAXPAGESIZE] = mps;
>  }
>  
>  DEFINE_SPAPR_MACHINE(2_12, "2.12", false);
> diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
> index 68a4243efc..6cdc0c94e7 100644
> --- a/hw/ppc/spapr_caps.c
> +++ b/hw/ppc/spapr_caps.c
> @@ -27,6 +27,7 @@
>  #include "qapi/visitor.h"
>  #include "sysemu/hw_accel.h"
>  #include "target/ppc/cpu.h"
> +#include "target/ppc/mmu-hash64.h"
>  #include "cpu-models.h"
>  #include "kvm_ppc.h"
>  
> @@ -144,6 +145,42 @@ out:
>      g_free(val);
>  }
>  
> +static void spapr_cap_get_pagesize(Object *obj, Visitor *v, const char *name,
> +                                   void *opaque, Error **errp)
> +{
> +    sPAPRCapabilityInfo *cap = opaque;
> +    sPAPRMachineState *spapr = SPAPR_MACHINE(obj);
> +    uint8_t val = spapr_get_cap(spapr, cap->index);
> +    uint64_t pagesize = (1ULL << val);
> +
> +    visit_type_size(v, name, &pagesize, errp);
> +}
> +
> +static void spapr_cap_set_pagesize(Object *obj, Visitor *v, const char *name,
> +                                   void *opaque, Error **errp)
> +{
> +    sPAPRCapabilityInfo *cap = opaque;
> +    sPAPRMachineState *spapr = SPAPR_MACHINE(obj);
> +    uint64_t pagesize;
> +    uint8_t val;
> +    Error *local_err = NULL;
> +
> +    visit_type_size(v, name, &pagesize, &local_err);
> +    if (local_err) {
> +        error_propagate(errp, local_err);
> +        return;
> +    }
> +
> +    if (!is_power_of_2(pagesize)) {
> +        error_setg(errp, "cap-%s must be a power of 2", cap->name);
> +        return;
> +    }
> +
> +    val = ctz64(pagesize);
> +    spapr->cmd_line_caps[cap->index] = true;
> +    spapr->eff.caps[cap->index] = val;
> +}
> +
>  static void cap_htm_apply(sPAPRMachineState *spapr, uint8_t val, Error **errp)
>  {
>      if (!val) {
> @@ -267,6 +304,16 @@ static void cap_safe_indirect_branch_apply(sPAPRMachineState *spapr,
>  
>  #define VALUE_DESC_TRISTATE     " (broken, workaround, fixed)"
>  
> +static void cap_hpt_maxpagesize_apply(sPAPRMachineState *spapr,
> +                                      uint8_t val, Error **errp)
> +{
> +    if (val < 12) {
> +        error_setg(errp, "Require at least 4kiB hpt-max-page-size");
> +    } else if (val < 16) {
> +        warn_report("Many guests require at least 64kiB hpt-max-page-size");
> +    }
> +}
> +
>  sPAPRCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
>      [SPAPR_CAP_HTM] = {
>          .name = "htm",
> @@ -326,6 +373,15 @@ sPAPRCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
>          .possible = &cap_ibs_possible,
>          .apply = cap_safe_indirect_branch_apply,
>      },
> +    [SPAPR_CAP_HPT_MAXPAGESIZE] = {
> +        .name = "hpt-max-page-size",
> +        .description = "Maximum page size for Hash Page Table guests",
> +        .index = SPAPR_CAP_HPT_MAXPAGESIZE,
> +        .get = spapr_cap_get_pagesize,
> +        .set = spapr_cap_set_pagesize,
> +        .type = "int",
> +        .apply = cap_hpt_maxpagesize_apply,
> +    },
>  };

Why not use a "PAGESHIFT" name instead? That would also simplify
'set_pagesize' by requiring a page shift rather than a page size.

C.


>  static sPAPRCapabilities default_caps_with_cpu(sPAPRMachineState *spapr,
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 9dd46a72f6..c97593d032 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -66,8 +66,10 @@ typedef enum {
>  #define SPAPR_CAP_SBBC                  0x04
>  /* Indirect Branch Serialisation */
>  #define SPAPR_CAP_IBS                   0x05
> +/* HPT Maximum Page Size (encoded as a shift) */
> +#define SPAPR_CAP_HPT_MAXPAGESIZE       0x06
>  /* Num Caps */
> -#define SPAPR_CAP_NUM                   (SPAPR_CAP_IBS + 1)
> +#define SPAPR_CAP_NUM                   (SPAPR_CAP_HPT_MAXPAGESIZE + 1)
>  
>  /*
>   * Capability Values
> 

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH 5/9] spapr: Maximum (HPT) pagesize property
  2018-06-19  9:23   ` Cédric Le Goater
@ 2018-06-19 11:22     ` David Gibson
  0 siblings, 0 replies; 43+ messages in thread
From: David Gibson @ 2018-06-19 11:22 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: groug, abologna, qemu-ppc, qemu-devel, aik


On Tue, Jun 19, 2018 at 11:23:04AM +0200, Cédric Le Goater wrote:
> On 06/18/2018 08:36 AM, David Gibson wrote:
> > The way the POWER Hash Page Table (HPT) MMU is virtualized by KVM HV means
> > that every page that the guest puts in the pagetables must be truly
> > physically contiguous, not just GPA-contiguous.  In effect this means that
> > an HPT guest can't use any pagesizes greater than the host page size used
> > to back its memory.
> > 
> > At present we handle this by changing what we advertise to the guest based
> > on the backing pagesizes.  This is pretty bad, because it means the guest
> > sees a different environment depending on what should be host configuration
> > details.
> > 
> > As a start on fixing this, we add a new capability parameter to the pseries
> > machine type which gives the maximum allowed pagesizes for an HPT guest (as
> > a shift).  For now we just create and validate the parameter without making
> > it do anything.
> > 
> > For backwards compatibility, on older machine types we set it to the max
> > available page size for the host.  For the 3.0 machine type, we fix it to
> > 16, the intention being to only allow HPT pagesizes up to 64kiB by default
> > in future.
> > 
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> >  hw/ppc/spapr.c         | 12 +++++++++
> >  hw/ppc/spapr_caps.c    | 56 ++++++++++++++++++++++++++++++++++++++++++
> >  include/hw/ppc/spapr.h |  4 ++-
> >  3 files changed, 71 insertions(+), 1 deletion(-)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 40858d047c..74a76e7e09 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -63,6 +63,7 @@
> >  #include "hw/virtio/vhost-scsi-common.h"
> >  
> >  #include "exec/address-spaces.h"
> > +#include "exec/ram_addr.h"
> >  #include "hw/usb.h"
> >  #include "qemu/config-file.h"
> >  #include "qemu/error-report.h"
> > @@ -4043,6 +4044,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
> >      smc->default_caps.caps[SPAPR_CAP_CFPC] = SPAPR_CAP_BROKEN;
> >      smc->default_caps.caps[SPAPR_CAP_SBBC] = SPAPR_CAP_BROKEN;
> >      smc->default_caps.caps[SPAPR_CAP_IBS] = SPAPR_CAP_BROKEN;
> > +    smc->default_caps.caps[SPAPR_CAP_HPT_MAXPAGESIZE] = 16; /* 64kiB */
> >      spapr_caps_add_properties(smc, &error_abort);
> >  }
> >  
> > @@ -4126,8 +4128,18 @@ static void spapr_machine_2_12_instance_options(MachineState *machine)
> >  
> >  static void spapr_machine_2_12_class_options(MachineClass *mc)
> >  {
> > +    sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(mc);
> > +    uint8_t mps;
> > +
> >      spapr_machine_3_0_class_options(mc);
> >      SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_12);
> > +
> > +    if (kvmppc_hpt_needs_host_contiguous_pages()) {
> > +        mps = ctz64(qemu_getrampagesize());
> > +    } else {
> > +        mps = 34; /* allow everything up to 16GiB, i.e. everything */
> > +    }
> > +    smc->default_caps.caps[SPAPR_CAP_HPT_MAXPAGESIZE] = mps;
> >  }
> >  
> >  DEFINE_SPAPR_MACHINE(2_12, "2.12", false);
> > diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
> > index 68a4243efc..6cdc0c94e7 100644
> > --- a/hw/ppc/spapr_caps.c
> > +++ b/hw/ppc/spapr_caps.c
> > @@ -27,6 +27,7 @@
> >  #include "qapi/visitor.h"
> >  #include "sysemu/hw_accel.h"
> >  #include "target/ppc/cpu.h"
> > +#include "target/ppc/mmu-hash64.h"
> >  #include "cpu-models.h"
> >  #include "kvm_ppc.h"
> >  
> > @@ -144,6 +145,42 @@ out:
> >      g_free(val);
> >  }
> >  
> > +static void spapr_cap_get_pagesize(Object *obj, Visitor *v, const char *name,
> > +                                   void *opaque, Error **errp)
> > +{
> > +    sPAPRCapabilityInfo *cap = opaque;
> > +    sPAPRMachineState *spapr = SPAPR_MACHINE(obj);
> > +    uint8_t val = spapr_get_cap(spapr, cap->index);
> > +    uint64_t pagesize = (1ULL << val);
> > +
> > +    visit_type_size(v, name, &pagesize, errp);
> > +}
> > +
> > +static void spapr_cap_set_pagesize(Object *obj, Visitor *v, const char *name,
> > +                                   void *opaque, Error **errp)
> > +{
> > +    sPAPRCapabilityInfo *cap = opaque;
> > +    sPAPRMachineState *spapr = SPAPR_MACHINE(obj);
> > +    uint64_t pagesize;
> > +    uint8_t val;
> > +    Error *local_err = NULL;
> > +
> > +    visit_type_size(v, name, &pagesize, &local_err);
> > +    if (local_err) {
> > +        error_propagate(errp, local_err);
> > +        return;
> > +    }
> > +
> > +    if (!is_power_of_2(pagesize)) {
> > +        error_setg(errp, "cap-%s must be a power of 2", cap->name);
> > +        return;
> > +    }
> > +
> > +    val = ctz64(pagesize);
> > +    spapr->cmd_line_caps[cap->index] = true;
> > +    spapr->eff.caps[cap->index] = val;
> > +}
> > +
> >  static void cap_htm_apply(sPAPRMachineState *spapr, uint8_t val, Error **errp)
> >  {
> >      if (!val) {
> > @@ -267,6 +304,16 @@ static void cap_safe_indirect_branch_apply(sPAPRMachineState *spapr,
> >  
> >  #define VALUE_DESC_TRISTATE     " (broken, workaround, fixed)"
> >  
> > +static void cap_hpt_maxpagesize_apply(sPAPRMachineState *spapr,
> > +                                      uint8_t val, Error **errp)
> > +{
> > +    if (val < 12) {
> > +        error_setg(errp, "Require at least 4kiB hpt-max-page-size");
> > +    } else if (val < 16) {
> > +        warn_report("Many guests require at least 64kiB hpt-max-page-size");
> > +    }
> > +}
> > +
> >  sPAPRCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
> >      [SPAPR_CAP_HTM] = {
> >          .name = "htm",
> > @@ -326,6 +373,15 @@ sPAPRCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
> >          .possible = &cap_ibs_possible,
> >          .apply = cap_safe_indirect_branch_apply,
> >      },
> > +    [SPAPR_CAP_HPT_MAXPAGESIZE] = {
> > +        .name = "hpt-max-page-size",
> > +        .description = "Maximum page size for Hash Page Table guests",
> > +        .index = SPAPR_CAP_HPT_MAXPAGESIZE,
> > +        .get = spapr_cap_get_pagesize,
> > +        .set = spapr_cap_set_pagesize,
> > +        .type = "int",
> > +        .apply = cap_hpt_maxpagesize_apply,
> > +    },
> >  };
> 
> Why not use a "PAGESHIFT" name instead? That would also simplify
> 'set_pagesize' by requiring a page shift rather than a page size.

I had that in my previous version.  Andrew suggested this was a
friendlier interface, and on reflection, I agree.
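
As an illustration of what the size-based interface costs in code, here is a standalone sketch of the size-to-shift conversion plus the 4kiB/64kiB thresholds from the patch. The pgsz_* names and the verdict enum are hypothetical, and the loop stands in for ctz64() so the example has no QEMU dependencies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True for 1, 2, 4, ...; false for 0 and anything with two bits set. */
bool pgsz_is_power_of_2(uint64_t v)
{
    return v && !(v & (v - 1));
}

/* Convert a page size in bytes to a shift, or -1 if not a power of 2. */
int pgsz_to_shift(uint64_t pagesize)
{
    int shift = 0;

    if (!pgsz_is_power_of_2(pagesize)) {
        return -1;
    }
    while (!(pagesize & 1)) { /* count trailing zero bits */
        pagesize >>= 1;
        shift++;
    }
    return shift;
}

/* Inverse direction, as used when reporting the property back. */
uint64_t pgsz_from_shift(uint8_t shift)
{
    assert(shift < 64);
    return UINT64_C(1) << shift;
}

typedef enum { PGSZ_OK, PGSZ_WARN, PGSZ_ERROR } PgszVerdict;

/* Thresholds taken from the patch: below 4kiB is an error, below 64kiB warns. */
PgszVerdict pgsz_check(int shift)
{
    if (shift < 12) {
        return PGSZ_ERROR;
    } else if (shift < 16) {
        return PGSZ_WARN;
    }
    return PGSZ_OK;
}
```

So "16m" on the command line ends up stored as shift 24, and the extra work in the setter is limited to the power-of-2 check and the conversion.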


> 
> C.
> 
> 
> >  static sPAPRCapabilities default_caps_with_cpu(sPAPRMachineState *spapr,
> > diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> > index 9dd46a72f6..c97593d032 100644
> > --- a/include/hw/ppc/spapr.h
> > +++ b/include/hw/ppc/spapr.h
> > @@ -66,8 +66,10 @@ typedef enum {
> >  #define SPAPR_CAP_SBBC                  0x04
> >  /* Indirect Branch Serialisation */
> >  #define SPAPR_CAP_IBS                   0x05
> > +/* HPT Maximum Page Size (encoded as a shift) */
> > +#define SPAPR_CAP_HPT_MAXPAGESIZE       0x06
> >  /* Num Caps */
> > -#define SPAPR_CAP_NUM                   (SPAPR_CAP_IBS + 1)
> > +#define SPAPR_CAP_NUM                   (SPAPR_CAP_HPT_MAXPAGESIZE + 1)
> >  
> >  /*
> >   * Capability Values
> > 
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH 0/9] spapr: Clean up pagesize handling
  2018-06-18  6:35 [Qemu-devel] [PATCH 0/9] spapr: Clean up pagesize handling David Gibson
                   ` (8 preceding siblings ...)
  2018-06-18  6:36 ` [Qemu-devel] [PATCH 9/9] spapr: Don't rewrite mmu capabilities in KVM mode David Gibson
@ 2018-06-21  1:08 ` David Gibson
  2018-06-21  6:52 ` no-reply
  10 siblings, 0 replies; 43+ messages in thread
From: David Gibson @ 2018-06-21  1:08 UTC (permalink / raw)
  To: groug, abologna; +Cc: clg, qemu-ppc, qemu-devel, aik


On Mon, Jun 18, 2018 at 04:35:57PM +1000, David Gibson wrote:
> Currently the "pseries" machine type will (usually) advertise
> different pagesizes to the guest when running under KVM and TCG, which
> is not how things are supposed to work.
> 
> This comes from poor handling of hardware limitations which mean that
> under KVM HV the guest is unable to use pagesizes larger than those
> backing the guest's RAM on the host side.
> 
> The new scheme turns things around by having an explicit machine
> parameter controlling the largest page size that the guest is allowed
> to use.  This limitation applies regardless of accelerator.  When
> we're running on KVM HV we ensure that our backing pages are adequate
> to supply the requested guest page sizes, rather than adjusting the
> guest page sizes based on what KVM can supply.
> 
> This means that in order to use hugepages in a PAPR guest it's
> necessary to add a "cap-hpt-max-page-size=16m" machine parameter as
> well as setting the mem-path correctly.  This is a bit more work on
> the user and/or management side, but results in consistent behaviour
> so I think it's worth it.
> 
> Longer term, we might also use this parameter to control IOMMU page
> sizes.  But, I'm still working out how restrictions deriving from the
> guest kernel, host kernel and hardware capabilities all interact here.
> 
> This applies on top of my ppc-for-3.0 tree.

Greg, Cédric, could you try to review this series pretty soon?

I'd really like to get it merged, because it's the basis for a number
of fixes for assorted problems with hugepage behaviour.

> 
> Changes since RFC:
>  * Add preliminary cleanups to allow us to evaluate effective
>    capabilities levels earlier.
>  * Don't try to remove double resetting of cpus.  It doesn't quite
>    work, and is no longer necessary with the above.
>  * Some user-friendliness improvements: use "hpt-max-page-size"
>    instead of the cryptic "hpt-mps", and take an actual page size
>    (allowing k/m/g suffixes) instead of a shift
> 
> David Gibson (9):
>   target/ppc: Allow cpu compatiblity checks based on type, not instance
>   spapr: Compute effective capability values earlier
>   spapr: Add cpu_apply hook to capabilities
>   target/ppc: Add kvmppc_hpt_needs_host_contiguous_pages() helper
>   spapr: Maximum (HPT) pagesize property
>   spapr: Use maximum page size capability to simplify memory backend
>     checking
>   target/ppc: Add ppc_hash64_filter_pagesizes()
>   spapr: Limit available pagesizes to provide a consistent guest
>     environment
>   spapr: Don't rewrite mmu capabilities in KVM mode
> 
>  hw/ppc/spapr.c          |  45 +++++++-----
>  hw/ppc/spapr_caps.c     | 156 ++++++++++++++++++++++++++++++++++++----
>  hw/ppc/spapr_cpu_core.c |   4 ++
>  include/hw/ppc/spapr.h  |  11 ++-
>  target/ppc/compat.c     |  27 +++++--
>  target/ppc/cpu.h        |   4 ++
>  target/ppc/kvm.c        | 146 ++++++++++++++++++-------------------
>  target/ppc/kvm_ppc.h    |  11 ++-
>  target/ppc/mmu-hash64.c |  59 +++++++++++++++
>  target/ppc/mmu-hash64.h |   3 +
>  10 files changed, 349 insertions(+), 117 deletions(-)
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH 1/9] target/ppc: Allow cpu compatiblity checks based on type, not instance
  2018-06-18  6:35 ` [Qemu-devel] [PATCH 1/9] target/ppc: Allow cpu compatiblity checks based on type, not instance David Gibson
  2018-06-18 13:22   ` Greg Kurz
@ 2018-06-21  5:20   ` Cédric Le Goater
  1 sibling, 0 replies; 43+ messages in thread
From: Cédric Le Goater @ 2018-06-21  5:20 UTC (permalink / raw)
  To: David Gibson, groug, abologna
  Cc: qemu-ppc, qemu-devel, aik, Cédric Le Goater

On 06/18/2018 08:35 AM, David Gibson wrote:
> ppc_check_compat() is used in a number of places to check if a cpu object
> supports a certain compatibility mode, subject to various constraints.
> 
> It takes a PowerPCCPU *, however it really only depends on the cpu's class.
> We have upcoming cases where it would be useful to make compatibility
> checks before we fully instantiate the cpu objects.
> 
> ppc_type_check_compat() will now make an equivalent check, but based on a
> CPU's QOM typename instead of an instantiated CPU object.
> 
> We make use of the new interface in several places in spapr, where we're
> essentially making a global check, rather than one specific to a particular
> cpu.  This avoids some ugly uses of first_cpu to grab a "representative"
> instance.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

Reviewed-by: Cédric Le Goater <clg@kaod.org>

Looks good to me,

Thanks,

C.
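
A rough standalone sketch of the idea, for anyone skimming the thread: since the check depends only on class-level data, an instance-based wrapper and a type-name-based wrapper can share one helper. The Tc*/tc_* names, the class table and the level bitmask are all invented for this example and are much simpler than the real compat logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical class-level data: a type name plus supported level bits. */
typedef struct TcCpuClass {
    const char *type_name;
    uint32_t supported_levels;
} TcCpuClass;

/* Hypothetical instance: only carries a pointer to its class. */
typedef struct TcCpu {
    const TcCpuClass *cls;
} TcCpu;

static const TcCpuClass tc_classes[] = {
    { "demo-cpu-old", 0x1 },
    { "demo-cpu-new", 0x7 },
};

/* The shared check: needs nothing beyond the class. */
static bool tc_class_check(const TcCpuClass *cls, uint32_t level_bit)
{
    return (cls->supported_levels & level_bit) != 0;
}

/* Instance-based entry point. */
bool tc_cpu_check(const TcCpu *cpu, uint32_t level_bit)
{
    assert(cpu && cpu->cls);
    return tc_class_check(cpu->cls, level_bit);
}

/* Type-name-based entry point: usable before any cpu object exists. */
bool tc_type_check(const char *type_name, uint32_t level_bit)
{
    assert(type_name);
    for (size_t i = 0; i < sizeof(tc_classes) / sizeof(tc_classes[0]); i++) {
        if (strcmp(tc_classes[i].type_name, type_name) == 0) {
            return tc_class_check(&tc_classes[i], level_bit);
        }
    }
    return false; /* unknown type name */
}
```

Both entry points give the same answer for a given class, which is the property that removes the need to grab a "representative" instance.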


> ---
>  hw/ppc/spapr.c      | 10 ++++------
>  hw/ppc/spapr_caps.c | 19 +++++++++----------
>  target/ppc/compat.c | 27 +++++++++++++++++++++------
>  target/ppc/cpu.h    |  4 ++++
>  4 files changed, 38 insertions(+), 22 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index db0fb385d4..b0b94fc1f0 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1616,8 +1616,8 @@ static void spapr_machine_reset(void)
>  
>      first_ppc_cpu = POWERPC_CPU(first_cpu);
>      if (kvm_enabled() && kvmppc_has_cap_mmu_radix() &&
> -        ppc_check_compat(first_ppc_cpu, CPU_POWERPC_LOGICAL_3_00, 0,
> -                         spapr->max_compat_pvr)) {
> +        ppc_type_check_compat(machine->cpu_type, CPU_POWERPC_LOGICAL_3_00, 0,
> +                              spapr->max_compat_pvr)) {
>          /* If using KVM with radix mode available, VCPUs can be started
>           * without a HPT because KVM will start them in radix mode.
>           * Set the GR bit in PATB so that we know there is no HPT. */
> @@ -2520,7 +2520,6 @@ static void spapr_machine_init(MachineState *machine)
>      long load_limit, fw_size;
>      char *filename;
>      Error *resize_hpt_err = NULL;
> -    PowerPCCPU *first_ppc_cpu;
>  
>      msi_nonbroken = true;
>  
> @@ -2618,10 +2617,9 @@ static void spapr_machine_init(MachineState *machine)
>      /* init CPUs */
>      spapr_init_cpus(spapr);
>  
> -    first_ppc_cpu = POWERPC_CPU(first_cpu);
>      if ((!kvm_enabled() || kvmppc_has_cap_mmu_radix()) &&
> -        ppc_check_compat(first_ppc_cpu, CPU_POWERPC_LOGICAL_3_00, 0,
> -                         spapr->max_compat_pvr)) {
> +        ppc_type_check_compat(machine->cpu_type, CPU_POWERPC_LOGICAL_3_00, 0,
> +                              spapr->max_compat_pvr)) {
>          /* KVM and TCG always allow GTSE with radix... */
>          spapr_ovec_set(spapr->ov5, OV5_MMU_RADIX_GTSE);
>      }
> diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
> index 00e43a9ba7..469f38f0ef 100644
> --- a/hw/ppc/spapr_caps.c
> +++ b/hw/ppc/spapr_caps.c
> @@ -327,27 +327,26 @@ sPAPRCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
>  };
>  
>  static sPAPRCapabilities default_caps_with_cpu(sPAPRMachineState *spapr,
> -                                               CPUState *cs)
> +                                               const char *cputype)
>  {
>      sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
> -    PowerPCCPU *cpu = POWERPC_CPU(cs);
>      sPAPRCapabilities caps;
>  
>      caps = smc->default_caps;
>  
> -    if (!ppc_check_compat(cpu, CPU_POWERPC_LOGICAL_2_07,
> -                          0, spapr->max_compat_pvr)) {
> +    if (!ppc_type_check_compat(cputype, CPU_POWERPC_LOGICAL_2_07,
> +                               0, spapr->max_compat_pvr)) {
>          caps.caps[SPAPR_CAP_HTM] = SPAPR_CAP_OFF;
>          caps.caps[SPAPR_CAP_CFPC] = SPAPR_CAP_BROKEN;
>      }
>  
> -    if (!ppc_check_compat(cpu, CPU_POWERPC_LOGICAL_2_06_PLUS,
> -                          0, spapr->max_compat_pvr)) {
> +    if (!ppc_type_check_compat(cputype, CPU_POWERPC_LOGICAL_2_06_PLUS,
> +                               0, spapr->max_compat_pvr)) {
>          caps.caps[SPAPR_CAP_SBBC] = SPAPR_CAP_BROKEN;
>      }
>  
> -    if (!ppc_check_compat(cpu, CPU_POWERPC_LOGICAL_2_06,
> -                          0, spapr->max_compat_pvr)) {
> +    if (!ppc_type_check_compat(cputype, CPU_POWERPC_LOGICAL_2_06,
> +                               0, spapr->max_compat_pvr)) {
>          caps.caps[SPAPR_CAP_VSX] = SPAPR_CAP_OFF;
>          caps.caps[SPAPR_CAP_DFP] = SPAPR_CAP_OFF;
>          caps.caps[SPAPR_CAP_IBS] = SPAPR_CAP_BROKEN;
> @@ -384,7 +383,7 @@ int spapr_caps_post_migration(sPAPRMachineState *spapr)
>      sPAPRCapabilities dstcaps = spapr->eff;
>      sPAPRCapabilities srccaps;
>  
> -    srccaps = default_caps_with_cpu(spapr, first_cpu);
> +    srccaps = default_caps_with_cpu(spapr, MACHINE(spapr)->cpu_type);
>      for (i = 0; i < SPAPR_CAP_NUM; i++) {
>          /* If not default value then assume came in with the migration */
>          if (spapr->mig.caps[i] != spapr->def.caps[i]) {
> @@ -446,7 +445,7 @@ void spapr_caps_reset(sPAPRMachineState *spapr)
>      int i;
>  
>      /* First compute the actual set of caps we're running with.. */
> -    default_caps = default_caps_with_cpu(spapr, first_cpu);
> +    default_caps = default_caps_with_cpu(spapr, MACHINE(spapr)->cpu_type);
>  
>      for (i = 0; i < SPAPR_CAP_NUM; i++) {
>          /* Store the defaults */
> diff --git a/target/ppc/compat.c b/target/ppc/compat.c
> index 807c906f68..7de4bf3122 100644
> --- a/target/ppc/compat.c
> +++ b/target/ppc/compat.c
> @@ -105,17 +105,13 @@ static const CompatInfo *compat_by_pvr(uint32_t pvr)
>      return NULL;
>  }
>  
> -bool ppc_check_compat(PowerPCCPU *cpu, uint32_t compat_pvr,
> -                      uint32_t min_compat_pvr, uint32_t max_compat_pvr)
> +static bool pcc_compat(PowerPCCPUClass *pcc, uint32_t compat_pvr,
> +                       uint32_t min_compat_pvr, uint32_t max_compat_pvr)
>  {
> -    PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
>      const CompatInfo *compat = compat_by_pvr(compat_pvr);
>      const CompatInfo *min = compat_by_pvr(min_compat_pvr);
>      const CompatInfo *max = compat_by_pvr(max_compat_pvr);
>  
> -#if !defined(CONFIG_USER_ONLY)
> -    g_assert(cpu->vhyp);
> -#endif
>      g_assert(!min_compat_pvr || min);
>      g_assert(!max_compat_pvr || max);
>  
> @@ -134,6 +130,25 @@ bool ppc_check_compat(PowerPCCPU *cpu, uint32_t compat_pvr,
>      return true;
>  }
>  
> +bool ppc_check_compat(PowerPCCPU *cpu, uint32_t compat_pvr,
> +                      uint32_t min_compat_pvr, uint32_t max_compat_pvr)
> +{
> +    PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
> +
> +#if !defined(CONFIG_USER_ONLY)
> +    g_assert(cpu->vhyp);
> +#endif
> +
> +    return pcc_compat(pcc, compat_pvr, min_compat_pvr, max_compat_pvr);
> +}
> +
> +bool ppc_type_check_compat(const char *cputype, uint32_t compat_pvr,
> +                           uint32_t min_compat_pvr, uint32_t max_compat_pvr)
> +{
> +    PowerPCCPUClass *pcc = POWERPC_CPU_CLASS(object_class_by_name(cputype));
> +    return pcc_compat(pcc, compat_pvr, min_compat_pvr, max_compat_pvr);
> +}
> +
>  void ppc_set_compat(PowerPCCPU *cpu, uint32_t compat_pvr, Error **errp)
>  {
>      const CompatInfo *compat = compat_by_pvr(compat_pvr);
> diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
> index 874da6efbc..c7f3fb6b73 100644
> --- a/target/ppc/cpu.h
> +++ b/target/ppc/cpu.h
> @@ -1369,7 +1369,11 @@ static inline int cpu_mmu_index (CPUPPCState *env, bool ifetch)
>  #if defined(TARGET_PPC64)
>  bool ppc_check_compat(PowerPCCPU *cpu, uint32_t compat_pvr,
>                        uint32_t min_compat_pvr, uint32_t max_compat_pvr);
> +bool ppc_type_check_compat(const char *cputype, uint32_t compat_pvr,
> +                           uint32_t min_compat_pvr, uint32_t max_compat_pvr);
> +
>  void ppc_set_compat(PowerPCCPU *cpu, uint32_t compat_pvr, Error **errp);
> +
>  #if !defined(CONFIG_USER_ONLY)
>  void ppc_set_compat_all(uint32_t compat_pvr, Error **errp);
>  #endif
> 
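
For readers following along, the shape of this refactoring can be sketched on its own. This is only an illustration under stated assumptions, not QEMU code: DemoCPUClass, DemoCPU, demo_class_by_name() and the numeric levels are hypothetical stand-ins for PowerPCCPUClass, PowerPCCPU, object_class_by_name() and the logical PVRs, and the real comparison in pcc_compat() is simplified here to a plain numeric level.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for PowerPCCPUClass: per-type data only. */
typedef struct DemoCPUClass {
    const char *name;
    unsigned max_level;     /* highest compat level this cpu type supports */
} DemoCPUClass;

/* Hypothetical stand-in for PowerPCCPU: an instance pointing at its class. */
typedef struct DemoCPU {
    const DemoCPUClass *klass;
    bool has_vhyp;          /* instance-only state, in the role of cpu->vhyp */
} DemoCPU;

static const DemoCPUClass demo_classes[] = {
    { "demo-v2.06", 206 },
    { "demo-v2.07", 207 },
    { "demo-v3.00", 300 },
};
#define DEMO_NUM_CLASSES (sizeof(demo_classes) / sizeof(demo_classes[0]))

/* Hypothetical lookup, playing the role of object_class_by_name(). */
static const DemoCPUClass *demo_class_by_name(const char *name)
{
    for (size_t i = 0; i < DEMO_NUM_CLASSES; i++) {
        if (strcmp(demo_classes[i].name, name) == 0) {
            return &demo_classes[i];
        }
    }
    return NULL;
}

/* Shared core: depends on the class only.  max_allowed == 0 means no limit. */
static bool class_compat(const DemoCPUClass *klass, unsigned level,
                         unsigned max_allowed)
{
    if (max_allowed && level > max_allowed) {
        return false;
    }
    return level <= klass->max_level;
}

/* Instance-based entry point: keeps the instance-only assertion. */
static bool demo_check_compat(const DemoCPU *cpu, unsigned level,
                              unsigned max_allowed)
{
    assert(cpu->has_vhyp);
    return class_compat(cpu->klass, level, max_allowed);
}

/* Type-based entry point: usable before any cpu has been instantiated. */
static bool demo_type_check_compat(const char *cputype, unsigned level,
                                   unsigned max_allowed)
{
    const DemoCPUClass *klass = demo_class_by_name(cputype);

    assert(klass != NULL);
    return class_compat(klass, level, max_allowed);
}
```

Both entry points answer the same question; only the type-based one can be called before any cpu object exists, which is what lets callers stop reaching for first_cpu.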


* Re: [Qemu-devel] [PATCH 2/9] spapr: Compute effective capability values earlier
  2018-06-18  6:35 ` [Qemu-devel] [PATCH 2/9] spapr: Compute effective capability values earlier David Gibson
  2018-06-18 13:37   ` Greg Kurz
@ 2018-06-21  5:32   ` Cédric Le Goater
  1 sibling, 0 replies; 43+ messages in thread
From: Cédric Le Goater @ 2018-06-21  5:32 UTC (permalink / raw)
  To: David Gibson, groug, abologna; +Cc: qemu-ppc, qemu-devel, aik

On 06/18/2018 08:35 AM, David Gibson wrote:
> Previously, the effective values of the various spapr capability flags
> were only determined at machine reset time.  That was a lazy way of making
> sure it was after cpu initialization so it could use the cpu object to
> inform the defaults.
> 
> But we've now improved the compat checking code so that we don't need to
> instantiate the cpus to use it.  That lets us move the resolution of the
> capability defaults much earlier.
> 
> This is going to be necessary for some future capabilities.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

Reviewed-by: Cédric Le Goater <clg@kaod.org>

Thanks,

C.
> ---
>  hw/ppc/spapr.c         | 6 ++++--
>  hw/ppc/spapr_caps.c    | 9 ++++++---
>  include/hw/ppc/spapr.h | 3 ++-
>  3 files changed, 12 insertions(+), 6 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index b0b94fc1f0..40858d047c 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1612,7 +1612,7 @@ static void spapr_machine_reset(void)
>      void *fdt;
>      int rc;
>  
> -    spapr_caps_reset(spapr);
> +    spapr_caps_apply(spapr);
>  
>      first_ppc_cpu = POWERPC_CPU(first_cpu);
>      if (kvm_enabled() && kvmppc_has_cap_mmu_radix() &&
> @@ -2526,7 +2526,9 @@ static void spapr_machine_init(MachineState *machine)
>      QLIST_INIT(&spapr->phbs);
>      QTAILQ_INIT(&spapr->pending_dimm_unplugs);
>  
> -    /* Check HPT resizing availability */
> +    /* Determine capabilities to run with */
> +    spapr_caps_init(spapr);
> +
>      kvmppc_check_papr_resize_hpt(&resize_hpt_err);
>      if (spapr->resize_hpt == SPAPR_RESIZE_HPT_DEFAULT) {
>          /*
> diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
> index 469f38f0ef..dabed817d1 100644
> --- a/hw/ppc/spapr_caps.c
> +++ b/hw/ppc/spapr_caps.c
> @@ -439,12 +439,12 @@ SPAPR_CAP_MIG_STATE(cfpc, SPAPR_CAP_CFPC);
>  SPAPR_CAP_MIG_STATE(sbbc, SPAPR_CAP_SBBC);
>  SPAPR_CAP_MIG_STATE(ibs, SPAPR_CAP_IBS);
>  
> -void spapr_caps_reset(sPAPRMachineState *spapr)
> +void spapr_caps_init(sPAPRMachineState *spapr)
>  {
>      sPAPRCapabilities default_caps;
>      int i;
>  
> -    /* First compute the actual set of caps we're running with.. */
> +    /* Compute the actual set of caps we should run with */
>      default_caps = default_caps_with_cpu(spapr, MACHINE(spapr)->cpu_type);
>  
>      for (i = 0; i < SPAPR_CAP_NUM; i++) {
> @@ -455,8 +455,11 @@ void spapr_caps_reset(sPAPRMachineState *spapr)
>              spapr->eff.caps[i] = default_caps.caps[i];
>          }
>      }
> +}
>  
> -    /* .. then apply those caps to the virtual hardware */
> +void spapr_caps_apply(sPAPRMachineState *spapr)
> +{
> +    int i;
>  
>      for (i = 0; i < SPAPR_CAP_NUM; i++) {
>          sPAPRCapabilityInfo *info = &capability_table[i];
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 3388750fc7..9dbd6010f5 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -798,7 +798,8 @@ static inline uint8_t spapr_get_cap(sPAPRMachineState *spapr, int cap)
>      return spapr->eff.caps[cap];
>  }
>  
> -void spapr_caps_reset(sPAPRMachineState *spapr);
> +void spapr_caps_init(sPAPRMachineState *spapr);
> +void spapr_caps_apply(sPAPRMachineState *spapr);
>  void spapr_caps_add_properties(sPAPRMachineClass *smc, Error **errp);
>  int spapr_caps_post_migration(sPAPRMachineState *spapr);
>  
> 
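
The init/apply split described in the commit message can be sketched roughly as below. This is a simplified model, not the spapr code: DemoMachine and its fields are hypothetical, and the "hw" array merely stands in for whatever the real apply hooks configure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_CAP_NUM 3

/* Hypothetical, much reduced stand-in for sPAPRMachineState. */
typedef struct DemoMachine {
    uint8_t defaults[DEMO_CAP_NUM];     /* defaults derived from the cpu type */
    uint8_t eff[DEMO_CAP_NUM];          /* effective capability values */
    bool cmd_line_caps[DEMO_CAP_NUM];   /* true if the user set the cap */
    uint8_t hw[DEMO_CAP_NUM];           /* state of the "virtual hardware" */
    bool caps_resolved;
    unsigned resets;
} DemoMachine;

/*
 * Resolve the effective values once, at machine init time.  Values set
 * on the command line win; everything else takes the default.  No cpu
 * object is needed for this step.
 */
static void demo_caps_init(DemoMachine *m)
{
    for (int i = 0; i < DEMO_CAP_NUM; i++) {
        if (!m->cmd_line_caps[i]) {
            m->eff[i] = m->defaults[i];
        }
    }
    m->caps_resolved = true;
}

/* Act on the already resolved values; this part runs at every reset. */
static void demo_caps_apply(DemoMachine *m)
{
    assert(m->caps_resolved);
    for (int i = 0; i < DEMO_CAP_NUM; i++) {
        m->hw[i] = m->eff[i];
    }
    m->resets++;
}
```

With this split, anything that runs between machine init and the first reset can already read the effective values.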


* Re: [Qemu-devel] [PATCH 3/9] spapr: Add cpu_apply hook to capabilities
  2018-06-18  6:36 ` [Qemu-devel] [PATCH 3/9] spapr: Add cpu_apply hook to capabilities David Gibson
  2018-06-18 15:28   ` Greg Kurz
@ 2018-06-21  5:34   ` Cédric Le Goater
  1 sibling, 0 replies; 43+ messages in thread
From: Cédric Le Goater @ 2018-06-21  5:34 UTC (permalink / raw)
  To: David Gibson, groug, abologna; +Cc: qemu-ppc, qemu-devel, aik

On 06/18/2018 08:36 AM, David Gibson wrote:
> spapr capabilities have an apply hook to actually activate (or deactivate)
> the feature in the system at reset time.  However, a number of capabilities
> affect the setup of cpus, and need to be applied to each of them -
> including hotplugged cpus for extra complication.  To make this simpler,
> add an optional cpu_apply hook that is called from spapr_cpu_reset().
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

Reviewed-by: Cédric Le Goater <clg@kaod.org>


Thanks,

C.

> ---
>  hw/ppc/spapr_caps.c     | 19 +++++++++++++++++++
>  hw/ppc/spapr_cpu_core.c |  2 ++
>  include/hw/ppc/spapr.h  |  1 +
>  3 files changed, 22 insertions(+)
> 
> diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
> index dabed817d1..68a4243efc 100644
> --- a/hw/ppc/spapr_caps.c
> +++ b/hw/ppc/spapr_caps.c
> @@ -59,6 +59,8 @@ typedef struct sPAPRCapabilityInfo {
>      sPAPRCapPossible *possible;
>      /* Make sure the virtual hardware can support this capability */
>      void (*apply)(sPAPRMachineState *spapr, uint8_t val, Error **errp);
> +    void (*cpu_apply)(sPAPRMachineState *spapr, PowerPCCPU *cpu,
> +                      uint8_t val, Error **errp);
>  } sPAPRCapabilityInfo;
>  
>  static void spapr_cap_get_bool(Object *obj, Visitor *v, const char *name,
> @@ -472,6 +474,23 @@ void spapr_caps_apply(sPAPRMachineState *spapr)
>      }
>  }
>  
> +void spapr_caps_cpu_apply(sPAPRMachineState *spapr, PowerPCCPU *cpu)
> +{
> +    int i;
> +
> +    for (i = 0; i < SPAPR_CAP_NUM; i++) {
> +        sPAPRCapabilityInfo *info = &capability_table[i];
> +
> +        /*
> +         * If the apply function can't set the desired level and thinks it's
> +         * fatal, it should cause that.
> +         */
> +        if (info->cpu_apply) {
> +            info->cpu_apply(spapr, cpu, spapr->eff.caps[i], &error_fatal);
> +        }
> +    }
> +}
> +
>  void spapr_caps_add_properties(sPAPRMachineClass *smc, Error **errp)
>  {
>      Error *local_err = NULL;
> diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
> index aef3be33a3..324623190d 100644
> --- a/hw/ppc/spapr_cpu_core.c
> +++ b/hw/ppc/spapr_cpu_core.c
> @@ -76,6 +76,8 @@ static void spapr_cpu_reset(void *opaque)
>      spapr_cpu->slb_shadow_size = 0;
>      spapr_cpu->dtl_addr = 0;
>      spapr_cpu->dtl_size = 0;
> +
> +    spapr_caps_cpu_apply(SPAPR_MACHINE(qdev_get_machine()), cpu);
>  }
>  
>  void spapr_cpu_set_entry_state(PowerPCCPU *cpu, target_ulong nip, target_ulong r3)
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 9dbd6010f5..9dd46a72f6 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -800,6 +800,7 @@ static inline uint8_t spapr_get_cap(sPAPRMachineState *spapr, int cap)
>  
>  void spapr_caps_init(sPAPRMachineState *spapr);
>  void spapr_caps_apply(sPAPRMachineState *spapr);
> +void spapr_caps_cpu_apply(sPAPRMachineState *spapr, PowerPCCPU *cpu);
>  void spapr_caps_add_properties(sPAPRMachineClass *smc, Error **errp);
>  int spapr_caps_post_migration(sPAPRMachineState *spapr);
>  
> 
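
A stripped-down sketch of the optional per-cpu hook, for illustration only. The types and the "per-cpu-demo" capability are hypothetical; in the real series the table is capability_table[] and the hook is invoked from spapr_cpu_reset().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_CAP_NUM 2

/* Hypothetical cpu with a single field that a capability configures. */
typedef struct DemoCPU {
    uint8_t setting;
} DemoCPU;

typedef struct DemoCapInfo {
    const char *name;
    /* Optional: NULL when the capability has no per-cpu effect. */
    void (*cpu_apply)(DemoCPU *cpu, uint8_t val);
} DemoCapInfo;

static void demo_percpu_apply(DemoCPU *cpu, uint8_t val)
{
    cpu->setting = val;
}

static const DemoCapInfo demo_cap_table[DEMO_CAP_NUM] = {
    { "machine-only-demo", NULL },
    { "per-cpu-demo", demo_percpu_apply },
};

/*
 * Meant to be called from each cpu's reset handler, so that cpus which
 * are hotplugged later get the same treatment as the boot cpus.
 */
static void demo_caps_cpu_apply(const uint8_t eff[DEMO_CAP_NUM], DemoCPU *cpu)
{
    assert(cpu != NULL);
    for (int i = 0; i < DEMO_CAP_NUM; i++) {
        if (demo_cap_table[i].cpu_apply) {
            demo_cap_table[i].cpu_apply(cpu, eff[i]);
        }
    }
}
```

Capabilities without a per-cpu effect simply leave the hook NULL and are skipped by the loop.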


* Re: [Qemu-devel] [PATCH 4/9] target/ppc: Add kvmppc_hpt_needs_host_contiguous_pages() helper
  2018-06-18  6:36 ` [Qemu-devel] [PATCH 4/9] target/ppc: Add kvmppc_hpt_needs_host_contiguous_pages() helper David Gibson
  2018-06-18 15:32   ` Greg Kurz
@ 2018-06-21  5:56   ` Cédric Le Goater
  2018-06-21  6:34     ` David Gibson
  1 sibling, 1 reply; 43+ messages in thread
From: Cédric Le Goater @ 2018-06-21  5:56 UTC (permalink / raw)
  To: David Gibson, groug, abologna; +Cc: qemu-ppc, qemu-devel, aik

On 06/18/2018 08:36 AM, David Gibson wrote:
> KVM HV has a restriction that for HPT mode guests, guest pages must be hpa
> contiguous as well as gpa contiguous.  We have to account for that in
> various places.  We determine whether we're subject to this restriction
> from the SMMU information exposed by KVM.
> 
> Planned cleanups to the way we handle this will require knowing whether
> this restriction is in play in wider parts of the code.  So, expose a
> helper function which returns it.
> 
> This does mean some redundant calls to kvm_get_smmu_info(), but they'll go
> away again with future cleanups.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

Reviewed-by: Cédric Le Goater <clg@kaod.org>

But this patch is already committed, it seems.

C. 

> ---
>  target/ppc/kvm.c     | 17 +++++++++++++++--
>  target/ppc/kvm_ppc.h |  6 ++++++
>  2 files changed, 21 insertions(+), 2 deletions(-)
> 
> diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
> index 5c0e313ca6..50b5d01432 100644
> --- a/target/ppc/kvm.c
> +++ b/target/ppc/kvm.c
> @@ -406,9 +406,22 @@ target_ulong kvmppc_configure_v3_mmu(PowerPCCPU *cpu,
>      }
>  }
>  
> +bool kvmppc_hpt_needs_host_contiguous_pages(void)
> +{
> +    PowerPCCPU *cpu = POWERPC_CPU(first_cpu);
> +    static struct kvm_ppc_smmu_info smmu_info;
> +
> +    if (!kvm_enabled()) {
> +        return false;
> +    }
> +
> +    kvm_get_smmu_info(cpu, &smmu_info);
> +    return !!(smmu_info.flags & KVM_PPC_PAGE_SIZES_REAL);


> +}
> +
>  static bool kvm_valid_page_size(uint32_t flags, long rampgsize, uint32_t shift)
>  {
> -    if (!(flags & KVM_PPC_PAGE_SIZES_REAL)) {
> +    if (!kvmppc_hpt_needs_host_contiguous_pages()) {
>          return true;
>      }
>  
> @@ -445,7 +458,7 @@ static void kvm_fixup_page_sizes(PowerPCCPU *cpu)
>      /* If we have HV KVM, we need to forbid CI large pages if our
>       * host page size is smaller than 64K.
>       */
> -    if (smmu_info.flags & KVM_PPC_PAGE_SIZES_REAL) {
> +    if (kvmppc_hpt_needs_host_contiguous_pages()) {
>          if (getpagesize() >= 0x10000) {
>              cpu->hash64_opts->flags |= PPC_HASH64_CI_LARGEPAGE;
>          } else {
> diff --git a/target/ppc/kvm_ppc.h b/target/ppc/kvm_ppc.h
> index e2840e1d33..a7ddb8a5d6 100644
> --- a/target/ppc/kvm_ppc.h
> +++ b/target/ppc/kvm_ppc.h
> @@ -70,6 +70,7 @@ int kvmppc_resize_hpt_prepare(PowerPCCPU *cpu, target_ulong flags, int shift);
>  int kvmppc_resize_hpt_commit(PowerPCCPU *cpu, target_ulong flags, int shift);
>  bool kvmppc_pvr_workaround_required(PowerPCCPU *cpu);
>  
> +bool kvmppc_hpt_needs_host_contiguous_pages(void);
>  bool kvmppc_is_mem_backend_page_size_ok(const char *obj_path);
>  
>  #else
> @@ -222,6 +223,11 @@ static inline uint64_t kvmppc_rma_size(uint64_t current_size,
>      return ram_size;
>  }
>  
> +static inline bool kvmppc_hpt_needs_host_contiguous_pages(void)
> +{
> +    return false;
> +}
> +
>  static inline bool kvmppc_is_mem_backend_page_size_ok(const char *obj_path)
>  {
>      return true;
> 
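
The helper boils down to "ask the accelerator, test one flag bit". A rough standalone model is below; the DemoAccel type, the flag values and demo_get_smmu_info() are assumptions made for the sketch and are not the KVM API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits, in the role of the KVM SMMU info flags. */
#define DEMO_PAGE_SIZES_REAL 0x1u
#define DEMO_1T_SEGMENTS     0x2u

typedef struct DemoSmmuInfo {
    uint32_t flags;
} DemoSmmuInfo;

/* Hypothetical accelerator state; "smmu" is what the host would report. */
typedef struct DemoAccel {
    bool kvm_enabled;
    DemoSmmuInfo smmu;
    unsigned queries;       /* counts how often the info was fetched */
} DemoAccel;

static void demo_get_smmu_info(DemoAccel *a, DemoSmmuInfo *out)
{
    a->queries++;
    *out = a->smmu;
}

/* True only when KVM is in use and it reports the "real page sizes" flag. */
static bool demo_needs_host_contiguous_pages(DemoAccel *a)
{
    DemoSmmuInfo info;

    assert(a != NULL);
    if (!a->kvm_enabled) {
        return false;
    }
    demo_get_smmu_info(a, &info);
    return !!(info.flags & DEMO_PAGE_SIZES_REAL);
}
```

The query counter makes the cost visible: every call re-fetches the info, which matches the redundant calls the commit message mentions.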


* Re: [Qemu-devel] [PATCH 5/9] spapr: Maximum (HPT) pagesize property
  2018-06-18  6:36 ` [Qemu-devel] [PATCH 5/9] spapr: Maximum (HPT) pagesize property David Gibson
  2018-06-19  9:23   ` Cédric Le Goater
@ 2018-06-21  6:22   ` Cédric Le Goater
  2018-06-21 11:00     ` David Gibson
  2018-06-21  9:19   ` Greg Kurz
  2 siblings, 1 reply; 43+ messages in thread
From: Cédric Le Goater @ 2018-06-21  6:22 UTC (permalink / raw)
  To: David Gibson, groug, abologna; +Cc: qemu-ppc, qemu-devel, aik

On 06/18/2018 08:36 AM, David Gibson wrote:
> The way the POWER Hash Page Table (HPT) MMU is virtualized by KVM HV means
> that every page that the guest puts in the pagetables must be truly
> physically contiguous, not just GPA-contiguous.  In effect this means that
> an HPT guest can't use any pagesizes greater than the host page size used
> to back its memory.
> 
> At present we handle this by changing what we advertise to the guest based
> on the backing pagesizes.  This is pretty bad, because it means the guest
> sees a different environment depending on what should be host configuration
> details.
> 
> As a start on fixing this, we add a new capability parameter to the pseries
> machine type which gives the maximum allowed pagesizes for an HPT guest (as
> a shift).  For now we just create and validate the parameter without making
> it do anything.
> 
> For backwards compatibility, on older machine types we set it to the max
> available page size for the host.  For the 3.0 machine type, we fix it to
> 16, the intention being to only allow HPT pagesizes up to 64kiB by default
> in future.

Why not do it now? I don't think the pseries machine supports 4k pages
anyway, so you could change the warn_report() below into an error, I think.
 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

Reviewed-by: Cédric Le Goater <clg@kaod.org>

C.

> ---
>  hw/ppc/spapr.c         | 12 +++++++++
>  hw/ppc/spapr_caps.c    | 56 ++++++++++++++++++++++++++++++++++++++++++
>  include/hw/ppc/spapr.h |  4 ++-
>  3 files changed, 71 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 40858d047c..74a76e7e09 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -63,6 +63,7 @@
>  #include "hw/virtio/vhost-scsi-common.h"
>  
>  #include "exec/address-spaces.h"
> +#include "exec/ram_addr.h"
>  #include "hw/usb.h"
>  #include "qemu/config-file.h"
>  #include "qemu/error-report.h"
> @@ -4043,6 +4044,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
>      smc->default_caps.caps[SPAPR_CAP_CFPC] = SPAPR_CAP_BROKEN;
>      smc->default_caps.caps[SPAPR_CAP_SBBC] = SPAPR_CAP_BROKEN;
>      smc->default_caps.caps[SPAPR_CAP_IBS] = SPAPR_CAP_BROKEN;
> +    smc->default_caps.caps[SPAPR_CAP_HPT_MAXPAGESIZE] = 16; /* 64kiB */
>      spapr_caps_add_properties(smc, &error_abort);
>  }
>  
> @@ -4126,8 +4128,18 @@ static void spapr_machine_2_12_instance_options(MachineState *machine)
>  
>  static void spapr_machine_2_12_class_options(MachineClass *mc)
>  {
> +    sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(mc);
> +    uint8_t mps;
> +
>      spapr_machine_3_0_class_options(mc);
>      SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_12);
> +
> +    if (kvmppc_hpt_needs_host_contiguous_pages()) {
> +        mps = ctz64(qemu_getrampagesize());
> +    } else {
> +        mps = 34; /* allow everything up to 16GiB, i.e. everything */
> +    }
> +    smc->default_caps.caps[SPAPR_CAP_HPT_MAXPAGESIZE] = mps;
>  }
>  
>  DEFINE_SPAPR_MACHINE(2_12, "2.12", false);
> diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
> index 68a4243efc..6cdc0c94e7 100644
> --- a/hw/ppc/spapr_caps.c
> +++ b/hw/ppc/spapr_caps.c
> @@ -27,6 +27,7 @@
>  #include "qapi/visitor.h"
>  #include "sysemu/hw_accel.h"
>  #include "target/ppc/cpu.h"
> +#include "target/ppc/mmu-hash64.h"
>  #include "cpu-models.h"
>  #include "kvm_ppc.h"
>  
> @@ -144,6 +145,42 @@ out:
>      g_free(val);
>  }
>  
> +static void spapr_cap_get_pagesize(Object *obj, Visitor *v, const char *name,
> +                                   void *opaque, Error **errp)
> +{
> +    sPAPRCapabilityInfo *cap = opaque;
> +    sPAPRMachineState *spapr = SPAPR_MACHINE(obj);
> +    uint8_t val = spapr_get_cap(spapr, cap->index);
> +    uint64_t pagesize = (1ULL << val);
> +
> +    visit_type_size(v, name, &pagesize, errp);
> +}
> +
> +static void spapr_cap_set_pagesize(Object *obj, Visitor *v, const char *name,
> +                                   void *opaque, Error **errp)
> +{
> +    sPAPRCapabilityInfo *cap = opaque;
> +    sPAPRMachineState *spapr = SPAPR_MACHINE(obj);
> +    uint64_t pagesize;
> +    uint8_t val;
> +    Error *local_err = NULL;
> +
> +    visit_type_size(v, name, &pagesize, &local_err);
> +    if (local_err) {
> +        error_propagate(errp, local_err);
> +        return;
> +    }
> +
> +    if (!is_power_of_2(pagesize)) {
> +        error_setg(errp, "cap-%s must be a power of 2", cap->name);
> +        return;
> +    }
> +
> +    val = ctz64(pagesize);
> +    spapr->cmd_line_caps[cap->index] = true;
> +    spapr->eff.caps[cap->index] = val;
> +}
> +
>  static void cap_htm_apply(sPAPRMachineState *spapr, uint8_t val, Error **errp)
>  {
>      if (!val) {
> @@ -267,6 +304,16 @@ static void cap_safe_indirect_branch_apply(sPAPRMachineState *spapr,
>  
>  #define VALUE_DESC_TRISTATE     " (broken, workaround, fixed)"
>  
> +static void cap_hpt_maxpagesize_apply(sPAPRMachineState *spapr,
> +                                      uint8_t val, Error **errp)
> +{
> +    if (val < 12) {
> +        error_setg(errp, "Require at least 4kiB hpt-max-page-size");
> +    } else if (val < 16) {
> +        warn_report("Many guests require at least 64kiB hpt-max-page-size");
> +    }
> +}
> +
>  sPAPRCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
>      [SPAPR_CAP_HTM] = {
>          .name = "htm",
> @@ -326,6 +373,15 @@ sPAPRCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
>          .possible = &cap_ibs_possible,
>          .apply = cap_safe_indirect_branch_apply,
>      },
> +    [SPAPR_CAP_HPT_MAXPAGESIZE] = {
> +        .name = "hpt-max-page-size",
> +        .description = "Maximum page size for Hash Page Table guests",
> +        .index = SPAPR_CAP_HPT_MAXPAGESIZE,
> +        .get = spapr_cap_get_pagesize,
> +        .set = spapr_cap_set_pagesize,
> +        .type = "int",
> +        .apply = cap_hpt_maxpagesize_apply,
> +    },
>  };
>  
>  static sPAPRCapabilities default_caps_with_cpu(sPAPRMachineState *spapr,
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 9dd46a72f6..c97593d032 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -66,8 +66,10 @@ typedef enum {
>  #define SPAPR_CAP_SBBC                  0x04
>  /* Indirect Branch Serialisation */
>  #define SPAPR_CAP_IBS                   0x05
> +/* HPT Maximum Page Size (encoded as a shift) */
> +#define SPAPR_CAP_HPT_MAXPAGESIZE       0x06
>  /* Num Caps */
> -#define SPAPR_CAP_NUM                   (SPAPR_CAP_IBS + 1)
> +#define SPAPR_CAP_NUM                   (SPAPR_CAP_HPT_MAXPAGESIZE + 1)
>  
>  /*
>   * Capability Values
> 
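
Since the capability is stored as a shift, the arithmetic is worth spelling out: 64 kiB is 1 << 16, 16 MiB is 1 << 24 and 16 GiB is 1 << 34. A small standalone sketch of the conversion and of the two thresholds checked in cap_hpt_maxpagesize_apply() follows. The demo_* helpers are hypothetical reimplementations written for this example; they are not QEMU's is_power_of_2() or ctz64().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum { DEMO_OK, DEMO_WARN, DEMO_ERROR } DemoResult;

static bool demo_is_power_of_2(uint64_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/* Count trailing zero bits; for a power of two this equals log2(v). */
static uint8_t demo_ctz64(uint64_t v)
{
    uint8_t n = 0;

    assert(v != 0);
    while ((v & 1) == 0) {
        v >>= 1;
        n++;
    }
    return n;
}

/*
 * Convert a page size in bytes into the shift stored in the capability.
 * Returns false, leaving *shift untouched, if the size is not a power of 2.
 */
static bool demo_pagesize_to_shift(uint64_t pagesize, uint8_t *shift)
{
    if (!demo_is_power_of_2(pagesize)) {
        return false;
    }
    *shift = demo_ctz64(pagesize);
    return true;
}

/* Same thresholds as the patch: below 4 kiB is an error, below 64 kiB warns. */
static DemoResult demo_validate_shift(uint8_t shift)
{
    if (shift < 12) {
        return DEMO_ERROR;
    }
    if (shift < 16) {
        return DEMO_WARN;
    }
    return DEMO_OK;
}
```

So a 16 MiB setting is stored as 24, and a 4 kiB setting (shift 12) is accepted with a warning.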


* Re: [Qemu-devel] [PATCH 6/9] spapr: Use maximum page size capability to simplify memory backend checking
  2018-06-18  6:36 ` [Qemu-devel] [PATCH 6/9] spapr: Use maximum page size capability to simplify memory backend checking David Gibson
@ 2018-06-21  6:29   ` Cédric Le Goater
  2018-06-21 11:06     ` David Gibson
  2018-06-21 10:29   ` Greg Kurz
  1 sibling, 1 reply; 43+ messages in thread
From: Cédric Le Goater @ 2018-06-21  6:29 UTC (permalink / raw)
  To: David Gibson, groug, abologna; +Cc: qemu-ppc, qemu-devel, aik

On 06/18/2018 08:36 AM, David Gibson wrote:
> The way we used to handle KVM allowable guest pagesizes for PAPR guests
> required some convoluted checking of memory attached to the guest.
> 
> The allowable pagesizes advertised to the guest cpus depended on the memory
> which was attached at boot, but then we needed to ensure that any memory
> later hotplugged didn't change which pagesizes were allowed.
> 
> Now that we have an explicit machine option to control the allowable
> maximum pagesize we can simplify this.  We just check all memory backends
> against that declared pagesize.  We check base and cold-plugged memory at
> reset time, and hotplugged memory at pre_plug() time.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

One minor question below.

Nevertheless,

Reviewed-by: Cédric Le Goater <clg@kaod.org>

Thanks,

C.

> ---
>  hw/ppc/spapr.c         | 17 +++++++----------
>  hw/ppc/spapr_caps.c    | 20 ++++++++++++++++++++
>  include/hw/ppc/spapr.h |  3 +++
>  target/ppc/kvm.c       | 14 --------------
>  target/ppc/kvm_ppc.h   |  6 ------
>  5 files changed, 30 insertions(+), 30 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 74a76e7e09..efd36e92e2 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -3192,11 +3192,13 @@ static void spapr_memory_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>                                    Error **errp)
>  {
>      const sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(hotplug_dev);
> +    sPAPRMachineState *spapr = SPAPR_MACHINE(hotplug_dev);
>      PCDIMMDevice *dimm = PC_DIMM(dev);
>      PCDIMMDeviceClass *ddc = PC_DIMM_GET_CLASS(dimm);
>      MemoryRegion *mr;
>      uint64_t size;
> -    char *mem_dev;
> +    Object *memdev;
> +    hwaddr pagesize;
>  
>      if (!smc->dr_lmb_enabled) {
>          error_setg(errp, "Memory hotplug not supported for this machine");
> @@ -3215,15 +3217,10 @@ static void spapr_memory_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>          return;
>      }
>  
> -    mem_dev = object_property_get_str(OBJECT(dimm), PC_DIMM_MEMDEV_PROP, NULL);
> -    if (mem_dev && !kvmppc_is_mem_backend_page_size_ok(mem_dev)) {
> -        error_setg(errp, "Memory backend has bad page size. "
> -                   "Use 'memory-backend-file' with correct mem-path.");
> -        goto out;
> -    }
> -
> -out:
> -    g_free(mem_dev);
> +    memdev = object_property_get_link(OBJECT(dimm), PC_DIMM_MEMDEV_PROP,
> +                                      &error_abort);
> +    pagesize = host_memory_backend_pagesize(MEMORY_BACKEND(memdev));
> +    spapr_check_pagesize(spapr, pagesize, errp);
>  }
>  
>  struct sPAPRDIMMState {
> diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
> index 6cdc0c94e7..9fc739b3f5 100644
> --- a/hw/ppc/spapr_caps.c
> +++ b/hw/ppc/spapr_caps.c
> @@ -26,6 +26,7 @@
>  #include "qapi/error.h"
>  #include "qapi/visitor.h"
>  #include "sysemu/hw_accel.h"
> +#include "exec/ram_addr.h"
>  #include "target/ppc/cpu.h"
>  #include "target/ppc/mmu-hash64.h"
>  #include "cpu-models.h"
> @@ -304,6 +305,23 @@ static void cap_safe_indirect_branch_apply(sPAPRMachineState *spapr,
>  
>  #define VALUE_DESC_TRISTATE     " (broken, workaround, fixed)"
>  
> +void spapr_check_pagesize(sPAPRMachineState *spapr, hwaddr pagesize,
> +                          Error **errp)
> +{
> +    hwaddr maxpagesize = (1ULL << spapr->eff.caps[SPAPR_CAP_HPT_MPS]);

I suppose this is SPAPR_CAP_HPT_MAXPAGESIZE now?

> +
> +    if (!kvmppc_hpt_needs_host_contiguous_pages()) {
> +        return;
> +    }
> +
> +    if (maxpagesize > pagesize) {
> +        error_setg(errp,
> +                   "Can't support %"HWADDR_PRIu" kiB guest pages with %"
> +                   HWADDR_PRIu" kiB host pages with this KVM implementation",
> +                   maxpagesize >> 10, pagesize >> 10);
> +    }
> +}
> +
>  static void cap_hpt_maxpagesize_apply(sPAPRMachineState *spapr,
>                                        uint8_t val, Error **errp)
>  {
> @@ -312,6 +330,8 @@ static void cap_hpt_maxpagesize_apply(sPAPRMachineState *spapr,
>      } else if (val < 16) {
>          warn_report("Many guests require at least 64kiB hpt-max-page-size");
>      }
> +
> +    spapr_check_pagesize(spapr, qemu_getrampagesize(), errp);
>  }
>  
>  sPAPRCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index c97593d032..75e2cf2687 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -806,4 +806,7 @@ void spapr_caps_cpu_apply(sPAPRMachineState *spapr, PowerPCCPU *cpu);
>  void spapr_caps_add_properties(sPAPRMachineClass *smc, Error **errp);
>  int spapr_caps_post_migration(sPAPRMachineState *spapr);
>  
> +void spapr_check_pagesize(sPAPRMachineState *spapr, hwaddr pagesize,
> +                          Error **errp);
> +
>  #endif /* HW_SPAPR_H */
> diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
> index 50b5d01432..9cfbd388ad 100644
> --- a/target/ppc/kvm.c
> +++ b/target/ppc/kvm.c
> @@ -500,26 +500,12 @@ static void kvm_fixup_page_sizes(PowerPCCPU *cpu)
>          cpu->hash64_opts->flags &= ~PPC_HASH64_1TSEG;
>      }
>  }
> -
> -bool kvmppc_is_mem_backend_page_size_ok(const char *obj_path)
> -{
> -    Object *mem_obj = object_resolve_path(obj_path, NULL);
> -    long pagesize = host_memory_backend_pagesize(MEMORY_BACKEND(mem_obj));
> -
> -    return pagesize >= max_cpu_page_size;
> -}
> -
>  #else /* defined (TARGET_PPC64) */
>  
>  static inline void kvm_fixup_page_sizes(PowerPCCPU *cpu)
>  {
>  }
>  
> -bool kvmppc_is_mem_backend_page_size_ok(const char *obj_path)
> -{
> -    return true;
> -}
> -
>  #endif /* !defined (TARGET_PPC64) */
>  
>  unsigned long kvm_arch_vcpu_id(CPUState *cpu)
> diff --git a/target/ppc/kvm_ppc.h b/target/ppc/kvm_ppc.h
> index a7ddb8a5d6..443fca0a4e 100644
> --- a/target/ppc/kvm_ppc.h
> +++ b/target/ppc/kvm_ppc.h
> @@ -71,7 +71,6 @@ int kvmppc_resize_hpt_commit(PowerPCCPU *cpu, target_ulong flags, int shift);
>  bool kvmppc_pvr_workaround_required(PowerPCCPU *cpu);
>  
>  bool kvmppc_hpt_needs_host_contiguous_pages(void);
> -bool kvmppc_is_mem_backend_page_size_ok(const char *obj_path);
>  
>  #else
>  
> @@ -228,11 +227,6 @@ static inline bool kvmppc_hpt_needs_host_contiguous_pages(void)
>      return false;
>  }
>  
> -static inline bool kvmppc_is_mem_backend_page_size_ok(const char *obj_path)
> -{
> -    return true;
> -}
> -
>  static inline bool kvmppc_has_cap_spapr_vfio(void)
>  {
>      return false;
> 
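
The rule enforced by spapr_check_pagesize() can be modelled in a few lines. This is a sketch under assumptions: demo_check_pagesize() is a hypothetical function, the KVM restriction is passed in as a plain bool, and errors go to a caller-supplied buffer instead of an Error **.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Returns true if memory backed by host pages of 'backend_pagesize' bytes
 * may be given to an HPT guest allowed to use pages up to 1 << max_shift.
 * On failure a message is written to 'err'.
 */
static bool demo_check_pagesize(bool needs_contiguous, uint8_t max_shift,
                                uint64_t backend_pagesize,
                                char *err, size_t errlen)
{
    uint64_t maxpagesize;

    assert(max_shift < 64);
    maxpagesize = 1ULL << max_shift;

    /* Without the host-contiguity restriction any backing page size works. */
    if (!needs_contiguous) {
        return true;
    }

    if (maxpagesize > backend_pagesize) {
        snprintf(err, errlen,
                 "Can't support %llu kiB guest pages with %llu kiB host pages",
                 (unsigned long long)(maxpagesize >> 10),
                 (unsigned long long)(backend_pagesize >> 10));
        return false;
    }
    return true;
}
```

The same check serves both cases in the commit message: base and cold-plugged memory at reset time, and each hotplugged DIMM's backend at pre_plug() time.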


* Re: [Qemu-devel] [PATCH 4/9] target/ppc: Add kvmppc_hpt_needs_host_contiguous_pages() helper
  2018-06-21  5:56   ` Cédric Le Goater
@ 2018-06-21  6:34     ` David Gibson
  0 siblings, 0 replies; 43+ messages in thread
From: David Gibson @ 2018-06-21  6:34 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: groug, abologna, qemu-ppc, qemu-devel, aik


On Thu, Jun 21, 2018 at 07:56:58AM +0200, Cédric Le Goater wrote:
> On 06/18/2018 08:36 AM, David Gibson wrote:
> > KVM HV has a restriction that for HPT mode guests, guest pages must be hpa
> > contiguous as well as gpa contiguous.  We have to account for that in
> > various places.  We determine whether we're subject to this restriction
> > from the SMMU information exposed by KVM.
> > 
> > Planned cleanups to the way we handle this will require knowing whether
> > this restriction is in play in wider parts of the code.  So, expose a
> > helper function which returns it.
> > 
> > This does mean some redundant calls to kvm_get_smmu_info(), but they'll go
> > away again with future cleanups.
> > 
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> 
> Reviewed-by: Cédric Le Goater <clg@kaod.org>
> 
> but this patch is already committed it seems.

Yeah.  I was pretty confident of these earlier cleanups, so I merged
them with just Greg's R-b.  It's patches 5-9 that I particularly need
reviewed.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

* Re: [Qemu-devel] [PATCH 7/9] target/ppc: Add ppc_hash64_filter_pagesizes()
  2018-06-18  6:36 ` [Qemu-devel] [PATCH 7/9] target/ppc: Add ppc_hash64_filter_pagesizes() David Gibson
@ 2018-06-21  6:38   ` Cédric Le Goater
  2018-06-21 11:48   ` Greg Kurz
  1 sibling, 0 replies; 43+ messages in thread
From: Cédric Le Goater @ 2018-06-21  6:38 UTC (permalink / raw)
  To: David Gibson, groug, abologna; +Cc: qemu-ppc, qemu-devel, aik

On 06/18/2018 08:36 AM, David Gibson wrote:
> The paravirtualized PAPR platform sometimes needs to restrict the guest to
> using only some of the page sizes actually supported by the host's MMU.
> At the moment this is handled in KVM specific code, but for consistency we
> want to apply the same limitations to all accelerators.
> 
> This makes a start on this by providing a helper function in the cpu code
> to allow platform code to remove some of the cpu's page size definitions
> via a caller supplied callback.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

It looks correct.
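
For anyone following along without the QEMU tree at hand, the
filter-and-compact idea can be sketched standalone as below.  This is
only an illustration under assumptions: the struct names, MAX_SZ and
max_shift_cb() are made up for the sketch and are not the QEMU
definitions.  Note that the sketch explicitly moves each surviving row
down into slot n, which the hunk as posted does not visibly do; that
only matters if a row in the middle of the table is emptied.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_SZ 8 /* arbitrary table bound for this sketch */

/* Hypothetical stand-ins for the per-page and per-segment entries */
struct page_size {
    uint32_t page_shift;
    uint32_t pte_enc;
};

struct seg_page_sizes {
    uint32_t page_shift; /* base page shift; 0 marks an unused row */
    uint32_t slb_enc;
    struct page_size enc[MAX_SZ];
};

typedef bool (*filter_cb)(void *opaque, uint32_t seg_pshift, uint32_t pshift);

/* Example callback: keep only pages no larger than *(uint32_t *)opaque */
static bool max_shift_cb(void *opaque, uint32_t seg_pshift, uint32_t pshift)
{
    (void)seg_pshift;
    return pshift <= *(uint32_t *)opaque;
}

/*
 * Drop every (segment, page) pair the callback rejects, compact the
 * survivors to the front of each row and of the table, and zero the
 * rest.  Returns the number of rows kept.
 */
static size_t filter_pagesizes(struct seg_page_sizes *sps, filter_cb cb,
                               void *opaque)
{
    size_t i, n = 0;

    for (i = 0; i < MAX_SZ; i++) {
        struct seg_page_sizes *row = &sps[i];
        size_t j, m = 0;

        if (!row->page_shift) {
            break;
        }

        for (j = 0; j < MAX_SZ; j++) {
            struct page_size *ps = &row->enc[j];

            assert(m <= j);
            if (!ps->page_shift) {
                break;
            }
            if (cb(opaque, row->page_shift, ps->page_shift)) {
                row->enc[m++] = *ps;
            }
        }

        /* Clear the rest of the row */
        memset(&row->enc[m], 0, (MAX_SZ - m) * sizeof(row->enc[0]));

        if (m) {
            if (n != i) {
                sps[n] = *row; /* move the surviving row down */
            }
            n++;
        }
    }

    /* Clear the rest of the table */
    memset(&sps[n], 0, (MAX_SZ - n) * sizeof(sps[0]));
    return n;
}
```

With a 4k/64k/16M table and a maximum shift of 16, the 16M row is
dropped entirely and the 4k row keeps only its 4k and 64k entries.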

Reviewed-by: Cédric Le Goater <clg@kaod.org>

Thanks,

C.

> ---
>  target/ppc/mmu-hash64.c | 59 +++++++++++++++++++++++++++++++++++++++++
>  target/ppc/mmu-hash64.h |  3 +++
>  2 files changed, 62 insertions(+)
> 
> diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
> index aa200cba4c..276d9015e7 100644
> --- a/target/ppc/mmu-hash64.c
> +++ b/target/ppc/mmu-hash64.c
> @@ -1166,3 +1166,62 @@ const PPCHash64Options ppc_hash64_opts_POWER7 = {
>          },
>      }
>  };
> +
> +void ppc_hash64_filter_pagesizes(PowerPCCPU *cpu,
> +                                 bool (*cb)(void *, uint32_t, uint32_t),
> +                                 void *opaque)
> +{
> +    PPCHash64Options *opts = cpu->hash64_opts;
> +    int i;
> +    int n = 0;
> +    bool ci_largepage = false;
> +
> +    assert(opts);
> +
> +    n = 0;
> +    for (i = 0; i < ARRAY_SIZE(opts->sps); i++) {
> +        PPCHash64SegmentPageSizes *sps = &opts->sps[i];
> +        int j;
> +        int m = 0;
> +
> +        assert(n <= i);
> +
> +        if (!sps->page_shift) {
> +            break;
> +        }
> +
> +        for (j = 0; j < ARRAY_SIZE(sps->enc); j++) {
> +            PPCHash64PageSize *ps = &sps->enc[j];
> +
> +            assert(m <= j);
> +            if (!ps->page_shift) {
> +                break;
> +            }
> +
> +            if (cb(opaque, sps->page_shift, ps->page_shift)) {
> +                if (ps->page_shift >= 16) {
> +                    ci_largepage = true;
> +                }
> +                sps->enc[m++] = *ps;
> +            }
> +        }
> +
> +        /* Clear rest of the row */
> +        for (j = m; j < ARRAY_SIZE(sps->enc); j++) {
> +            memset(&sps->enc[j], 0, sizeof(sps->enc[j]));
> +        }
> +
> +        if (m) {
> +            n++;
> +        }
> +    }
> +
> +    /* Clear the rest of the table */
> +    for (i = n; i < ARRAY_SIZE(opts->sps); i++) {
> +        memset(&opts->sps[i], 0, sizeof(opts->sps[i]));
> +    }
> +
> +    if (!ci_largepage) {
> +        opts->flags &= ~PPC_HASH64_CI_LARGEPAGE;
> +    }
> +}
> diff --git a/target/ppc/mmu-hash64.h b/target/ppc/mmu-hash64.h
> index 53dcec5b93..f11efc9cbc 100644
> --- a/target/ppc/mmu-hash64.h
> +++ b/target/ppc/mmu-hash64.h
> @@ -20,6 +20,9 @@ unsigned ppc_hash64_hpte_page_shift_noslb(PowerPCCPU *cpu,
>  void ppc_store_lpcr(PowerPCCPU *cpu, target_ulong val);
>  void ppc_hash64_init(PowerPCCPU *cpu);
>  void ppc_hash64_finalize(PowerPCCPU *cpu);
> +void ppc_hash64_filter_pagesizes(PowerPCCPU *cpu,
> +                                 bool (*cb)(void *, uint32_t, uint32_t),
> +                                 void *opaque);
>  #endif
>  
>  /*
> 

* Re: [Qemu-devel] [PATCH 0/9] spapr: Clean up pagesize handling
  2018-06-18  6:35 [Qemu-devel] [PATCH 0/9] spapr: Clean up pagesize handling David Gibson
                   ` (9 preceding siblings ...)
  2018-06-21  1:08 ` [Qemu-devel] [PATCH 0/9] spapr: Clean up pagesize handling David Gibson
@ 2018-06-21  6:52 ` no-reply
  10 siblings, 0 replies; 43+ messages in thread
From: no-reply @ 2018-06-21  6:52 UTC (permalink / raw)
  To: david; +Cc: famz, groug, abologna, aik, qemu-ppc, clg, qemu-devel

Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20180618063606.2513-1-david@gibson.dropbear.id.au
Subject: [Qemu-devel] [PATCH 0/9] spapr: Clean up pagesize handling

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
    echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
    if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
        failed=1
        echo
    fi
    n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag]               patchew/20180618063606.2513-1-david@gibson.dropbear.id.au -> patchew/20180618063606.2513-1-david@gibson.dropbear.id.au
Switched to a new branch 'test'
0c66386fb1 spapr: Don't rewrite mmu capabilities in KVM mode
787addd355 spapr: Limit available pagesizes to provide a consistent guest environment
f6b4366e24 target/ppc: Add ppc_hash64_filter_pagesizes()
143ee46ef0 spapr: Use maximum page size capability to simplify memory backend checking
7bb6e85e44 spapr: Maximum (HPT) pagesize property
7a43ed59ce target/ppc: Add kvmppc_hpt_needs_host_contiguous_pages() helper
9bff9c15e0 spapr: Add cpu_apply hook to capabilities
4c9c79b967 spapr: Compute effective capability values earlier
1321ab423b target/ppc: Allow cpu compatiblity checks based on type, not instance

=== OUTPUT BEGIN ===
Checking PATCH 1/9: target/ppc: Allow cpu compatiblity checks based on type, not instance...
Checking PATCH 2/9: spapr: Compute effective capability values earlier...
Checking PATCH 3/9: spapr: Add cpu_apply hook to capabilities...
Checking PATCH 4/9: target/ppc: Add kvmppc_hpt_needs_host_contiguous_pages() helper...
Checking PATCH 5/9: spapr: Maximum (HPT) pagesize property...
Checking PATCH 6/9: spapr: Use maximum page size capability to simplify memory backend checking...
Checking PATCH 7/9: target/ppc: Add ppc_hash64_filter_pagesizes()...
Checking PATCH 8/9: spapr: Limit available pagesizes to provide a consistent guest environment...
WARNING: line over 80 characters
#28: FILE: hw/ppc/spapr_caps.c:337:
+static bool spapr_pagesize_cb(void *opaque, uint32_t seg_pshift, uint32_t pshift)

total: 0 errors, 1 warnings, 45 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
Checking PATCH 9/9: spapr: Don't rewrite mmu capabilities in KVM mode...
ERROR: Error messages should not contain newlines
#212: FILE: target/ppc/kvm.c:505:
+"KVM can't supply 64kiB CI pages, which guest expects\n");

total: 1 errors, 0 warnings, 201 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

=== OUTPUT END ===

Test command exited with code: 1


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

* Re: [Qemu-devel] [PATCH 8/9] spapr: Limit available pagesizes to provide a consistent guest environment
  2018-06-18  6:36 ` [Qemu-devel] [PATCH 8/9] spapr: Limit available pagesizes to provide a consistent guest environment David Gibson
@ 2018-06-21  7:01   ` Cédric Le Goater
  2018-06-21 11:52     ` David Gibson
  2018-06-21 12:24   ` Greg Kurz
  1 sibling, 1 reply; 43+ messages in thread
From: Cédric Le Goater @ 2018-06-21  7:01 UTC (permalink / raw)
  To: David Gibson, groug, abologna; +Cc: qemu-ppc, qemu-devel, aik

On 06/18/2018 08:36 AM, David Gibson wrote:
> KVM HV has some limitations (deriving from the hardware) that mean not all
> host-cpu supported pagesizes may be usable in the guest.  At present this
> means that KVM guests and TCG guests may see different available page sizes
> even if they notionally have the same vcpu model.  This is confusing and
> also prevents migration between TCG and KVM.
> 
> This patch makes the environment consistent by always allowing the same set
> of pagesizes.  Since we can't remove the KVM limitations, we do this by
> always applying the same limitations it has, even to TCG guests.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
>
> ---
>  hw/ppc/spapr_caps.c | 33 +++++++++++++++++++++++++++++++++
>  1 file changed, 33 insertions(+)
> 
> diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
> index 9fc739b3f5..0584c7c6ab 100644
> --- a/hw/ppc/spapr_caps.c
> +++ b/hw/ppc/spapr_caps.c
> @@ -334,6 +334,38 @@ static void cap_hpt_maxpagesize_apply(sPAPRMachineState *spapr,
>      spapr_check_pagesize(spapr, qemu_getrampagesize(), errp);
>  }
>  
> +static bool spapr_pagesize_cb(void *opaque, uint32_t seg_pshift, uint32_t pshift)
> +{
> +    unsigned maxshift = *((unsigned *)opaque);
> +
> +    assert(pshift >= seg_pshift);

You could check that elsewhere.

> +    /* Don't allow the guest to use pages bigger than the configured
> +     * maximum size */
> +    if (pshift > maxshift) {
> +        return false;
> +    }
> +
> +    /* For whatever reason, KVM doesn't allow multiple pagesizes
> +     * within a segment, *except* for the case of 16M pages in a 4k or
> +     * 64k segment.  Always exclude other cases, so that TCG and KVM
> +     * guests see a consistent environment */
> +    if ((pshift != seg_pshift) && (pshift != 24)) {
> +        return false;
> +    }
> +
> +    return true;
> +}

So, do we really need ppc_hash64_filter_pagesizes() to have a callback?

It seems that we only use the routine once in the patchset and that the
only thing we need to check is 'maxshift'.

Do you envision other uses of the routine?
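
For reference, the callback as posted makes two checks, not only the
'maxshift' one.  A standalone restatement, as a sketch only (the name
pagesize_allowed() is made up here and takes maxshift directly instead
of through an opaque pointer):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Sketch of the decision made by the quoted spapr_pagesize_cb().
 * All arguments are page shifts (12 = 4k, 16 = 64k, 24 = 16M).
 */
static bool pagesize_allowed(uint32_t maxshift, uint32_t seg_pshift,
                             uint32_t pshift)
{
    /* Same precondition as the posted code: actual page >= base page */
    assert(pshift >= seg_pshift);

    /* Nothing bigger than the configured maximum */
    if (pshift > maxshift) {
        return false;
    }

    /* Mixed sizes within a segment only for 16M pages */
    if (pshift != seg_pshift && pshift != 24) {
        return false;
    }

    return true;
}
```

So a 64k page in a 4k segment is rejected even when maxshift is 16,
while a 16M page in a 4k or 64k segment is accepted once maxshift is
at least 24.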

Thanks,

C.

> +static void cap_hpt_maxpagesize_cpu_apply(sPAPRMachineState *spapr,
> +                                          PowerPCCPU *cpu,
> +                                          uint8_t val, Error **errp)
> +{
> +    unsigned maxshift = val;
> +
> +    ppc_hash64_filter_pagesizes(cpu, spapr_pagesize_cb, &maxshift);
> +}
> +
>  sPAPRCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
>      [SPAPR_CAP_HTM] = {
>          .name = "htm",
> @@ -401,6 +433,7 @@ sPAPRCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
>          .set = spapr_cap_set_pagesize,
>          .type = "int",
>          .apply = cap_hpt_maxpagesize_apply,
> +        .cpu_apply = cap_hpt_maxpagesize_cpu_apply,
>      },
>  };
>  
> 

* Re: [Qemu-devel] [PATCH 9/9] spapr: Don't rewrite mmu capabilities in KVM mode
  2018-06-18  6:36 ` [Qemu-devel] [PATCH 9/9] spapr: Don't rewrite mmu capabilities in KVM mode David Gibson
@ 2018-06-21  7:53   ` Cédric Le Goater
  2018-06-21 12:01     ` David Gibson
  0 siblings, 1 reply; 43+ messages in thread
From: Cédric Le Goater @ 2018-06-21  7:53 UTC (permalink / raw)
  To: David Gibson, groug, abologna; +Cc: qemu-ppc, qemu-devel, aik

On 06/18/2018 08:36 AM, David Gibson wrote:
> Currently during KVM initialization on POWER, kvm_fixup_page_sizes()
> rewrites a bunch of information in the cpu state to reflect the
> capabilities of the host MMU and KVM.  This overwrites the information
> that's already there reflecting how the TCG implementation of the MMU will
> operate.
> 
> This means that we can get guest-visibly different behaviour between KVM
> and TCG (and between different KVM implementations).  That's bad.  It also
> prevents migration between KVM and TCG.
> 
> The pseries machine type now has filtering of the pagesizes it allows the
> guest to use which means it can present a consistent model of the MMU
> across all accelerators.
> 
> So, we can now replace kvm_fixup_page_sizes() with kvm_check_mmu() which
> merely verifies that the expected cpu model can be faithfully handled by
> KVM, rather than updating the cpu model to match KVM.
> 
> We call kvm_check_mmu() from the spapr cpu reset code.  This is a hack:

I think this is fine, but we are still doing some MMU checks in
kvm_arch_init_vcpu() that we might want to gather into a single routine.

> conceptually it makes more sense where fixup_page_sizes() was - in the KVM
> cpu init path.  However, doing that would require moving the platform's
> pagesize filtering much earlier, which would require a lot of work making
> further adjustments.  There wouldn't be a lot of concrete point to doing
> that, since the only KVM implementation which has the awkward MMU
> restrictions is KVM HV, which can only work with an spapr guest anyway.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  hw/ppc/spapr_caps.c     |   2 +-
>  hw/ppc/spapr_cpu_core.c |   2 +
>  target/ppc/kvm.c        | 133 ++++++++++++++++++++--------------------
>  target/ppc/kvm_ppc.h    |   5 ++
>  4 files changed, 73 insertions(+), 69 deletions(-)
> 
> diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
> index 0584c7c6ab..bc89a4cd70 100644
> --- a/hw/ppc/spapr_caps.c
> +++ b/hw/ppc/spapr_caps.c
> @@ -308,7 +308,7 @@ static void cap_safe_indirect_branch_apply(sPAPRMachineState *spapr,
>  void spapr_check_pagesize(sPAPRMachineState *spapr, hwaddr pagesize,
>                            Error **errp)
>  {
> -    hwaddr maxpagesize = (1ULL << spapr->eff.caps[SPAPR_CAP_HPT_MPS]);
> +    hwaddr maxpagesize = (1ULL << spapr->eff.caps[SPAPR_CAP_HPT_MAXPAGESIZE]);

There might be some renames I missed. No big issue.

>  
>      if (!kvmppc_hpt_needs_host_contiguous_pages()) {
>          return;
> diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
> index 324623190d..4e8fa28796 100644
> --- a/hw/ppc/spapr_cpu_core.c
> +++ b/hw/ppc/spapr_cpu_core.c
> @@ -78,6 +78,8 @@ static void spapr_cpu_reset(void *opaque)
>      spapr_cpu->dtl_size = 0;
>  
>      spapr_caps_cpu_apply(SPAPR_MACHINE(qdev_get_machine()), cpu);
> +
> +    kvm_check_mmu(cpu, &error_fatal);
>  }
>  
>  void spapr_cpu_set_entry_state(PowerPCCPU *cpu, target_ulong nip, target_ulong r3)
> diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
> index 9cfbd388ad..b386335014 100644
> --- a/target/ppc/kvm.c
> +++ b/target/ppc/kvm.c
> @@ -419,93 +419,93 @@ bool kvmppc_hpt_needs_host_contiguous_pages(void)
>      return !!(smmu_info.flags & KVM_PPC_PAGE_SIZES_REAL);
>  }
>  
> -static bool kvm_valid_page_size(uint32_t flags, long rampgsize, uint32_t shift)
> +void kvm_check_mmu(PowerPCCPU *cpu, Error **errp)
>  {
> -    if (!kvmppc_hpt_needs_host_contiguous_pages()) {
> -        return true;
> -    }
> -
> -    return (1ul << shift) <= rampgsize;
> -}
> -
> -static long max_cpu_page_size;
> -
> -static void kvm_fixup_page_sizes(PowerPCCPU *cpu)
> -{
> -    static struct kvm_ppc_smmu_info smmu_info;
> -    static bool has_smmu_info;
> -    CPUPPCState *env = &cpu->env;
> +    struct kvm_ppc_smmu_info smmu_info;
>      int iq, ik, jq, jk;
>  
> -    /* We only handle page sizes for 64-bit server guests for now */
> -    if (!(env->mmu_model & POWERPC_MMU_64)) {
> +    /* For now, we only have anything to check on hash64 MMUs */
> +    if (!cpu->hash64_opts || !kvm_enabled()) {
>          return;
>      }
>  
> -    /* Collect MMU info from kernel if not already */
> -    if (!has_smmu_info) {
> -        kvm_get_smmu_info(cpu, &smmu_info);
> -        has_smmu_info = true;
> -    }
> +    kvm_get_smmu_info(cpu, &smmu_info);

kvm_ppc_smmu_info and PPCHash64Options really are dual objects, and the
routine below checks that they are in sync. It's a pity that we have to
maintain two different structs. I guess we can't do otherwise.

> -    if (!max_cpu_page_size) {
> -        max_cpu_page_size = qemu_getrampagesize();
> +    if (ppc_hash64_has(cpu, PPC_HASH64_1TSEG)
> +        && !(smmu_info.flags & KVM_PPC_1T_SEGMENTS)) {
> +        error_setg(errp,
> +                   "KVM does not support 1TiB segments which guest expects");
> +        return;
>      }
>  
> -    /* Convert to QEMU form */
> -    memset(cpu->hash64_opts->sps, 0, sizeof(*cpu->hash64_opts->sps));
> -
> -    /* If we have HV KVM, we need to forbid CI large pages if our
> -     * host page size is smaller than 64K.
> -     */
> -    if (kvmppc_hpt_needs_host_contiguous_pages()) {
> -        if (getpagesize() >= 0x10000) {
> -            cpu->hash64_opts->flags |= PPC_HASH64_CI_LARGEPAGE;
> -        } else {
> -            cpu->hash64_opts->flags &= ~PPC_HASH64_CI_LARGEPAGE;
> -        }
> +    if (smmu_info.slb_size < cpu->hash64_opts->slb_size) {
> +        error_setg(errp, "KVM only supports %u SLB entries, but guest needs %u",
> +                   smmu_info.slb_size, cpu->hash64_opts->slb_size);
> +        return;
>      }
>

The routine below is doing a simple PPCHash64SegmentPageSizes compare.
Would it be possible to move it into mmu-hash64.c? That would mean
introducing KVM notions into mmu-hash64.c, though.
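
To illustrate what such a compare could look like if both sides were
first converted to one table type, here is a rough sketch.  Everything
in it is hypothetical (struct seg_sizes, MAX_SZ, enum mmu_check and
check_page_sizes() do not exist in QEMU), and unlike the posted loop it
stops at the first unused model entry:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SZ 8 /* arbitrary table bound for this sketch */

struct page_enc {
    uint32_t page_shift;
    uint32_t pte_enc;
};

struct seg_sizes {
    uint32_t page_shift; /* 0 marks an unused entry */
    uint32_t slb_enc;
    struct page_enc enc[MAX_SZ];
};

enum mmu_check {
    MMU_OK = 0,
    MMU_NO_SEGMENT,       /* host lacks the base page size */
    MMU_SLB_ENC_MISMATCH, /* same base page size, different SLB encoding */
    MMU_NO_PAGE,          /* host lacks the actual page size */
    MMU_PTE_ENC_MISMATCH, /* same page size, different PTE encoding */
};

/* Every page size in 'model' must exist in 'host' with equal encodings */
static enum mmu_check check_page_sizes(const struct seg_sizes *model,
                                       const struct seg_sizes *host)
{
    size_t iq, ik, jq, jk;

    assert(model && host);

    for (iq = 0; iq < MAX_SZ && model[iq].page_shift; iq++) {
        const struct seg_sizes *q = &model[iq];
        const struct seg_sizes *k;

        for (ik = 0; ik < MAX_SZ; ik++) {
            if (host[ik].page_shift == q->page_shift) {
                break;
            }
        }
        if (ik >= MAX_SZ) {
            return MMU_NO_SEGMENT;
        }

        k = &host[ik];
        if (k->slb_enc != q->slb_enc) {
            return MMU_SLB_ENC_MISMATCH;
        }

        for (jq = 0; jq < MAX_SZ && q->enc[jq].page_shift; jq++) {
            for (jk = 0; jk < MAX_SZ; jk++) {
                if (k->enc[jk].page_shift == q->enc[jq].page_shift) {
                    break;
                }
            }
            if (jk >= MAX_SZ) {
                return MMU_NO_PAGE;
            }
            if (k->enc[jk].pte_enc != q->enc[jq].pte_enc) {
                return MMU_PTE_ENC_MISMATCH;
            }
        }
    }

    return MMU_OK;
}
```

The check is one-directional on purpose: the host may offer more page
sizes than the model, it just must not lack or re-encode any that the
model advertises.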

>      /*
> -     * XXX This loop should be an entry wide AND of the capabilities that
> -     *     the selected CPU has with the capabilities that KVM supports.
> +     * Verify that every pagesize supported by the cpu model is
> +     * supported by KVM with the same encodings
>       */
> -    for (ik = iq = 0; ik < KVM_PPC_PAGE_SIZES_MAX_SZ; ik++) {
> +    for (iq = 0; iq < ARRAY_SIZE(cpu->hash64_opts->sps); iq++) {
>          PPCHash64SegmentPageSizes *qsps = &cpu->hash64_opts->sps[iq];
> -        struct kvm_ppc_one_seg_page_size *ksps = &smmu_info.sps[ik];
> +        struct kvm_ppc_one_seg_page_size *ksps;
>  
> -        if (!kvm_valid_page_size(smmu_info.flags, max_cpu_page_size,
> -                                 ksps->page_shift)) {
> -            continue;
> -        }
> -        qsps->page_shift = ksps->page_shift;
> -        qsps->slb_enc = ksps->slb_enc;
> -        for (jk = jq = 0; jk < KVM_PPC_PAGE_SIZES_MAX_SZ; jk++) {
> -            if (!kvm_valid_page_size(smmu_info.flags, max_cpu_page_size,
> -                                     ksps->enc[jk].page_shift)) {
> -                continue;
> -            }
> -            qsps->enc[jq].page_shift = ksps->enc[jk].page_shift;
> -            qsps->enc[jq].pte_enc = ksps->enc[jk].pte_enc;
> -            if (++jq >= PPC_PAGE_SIZES_MAX_SZ) {
> +        for (ik = 0; ik < ARRAY_SIZE(smmu_info.sps); ik++) {
> +            if (qsps->page_shift == smmu_info.sps[ik].page_shift) {
>                  break;
>              }
>          }
> -        if (++iq >= PPC_PAGE_SIZES_MAX_SZ) {
> -            break;
> +        if (ik >= ARRAY_SIZE(smmu_info.sps)) {
> +            error_setg(errp, "KVM doesn't support for base page shift %u",
> +                       qsps->page_shift);
> +            return;
> +        }
> +
> +        ksps = &smmu_info.sps[ik];
> +        if (ksps->slb_enc != qsps->slb_enc) {
> +            error_setg(errp,
> +"KVM uses SLB encoding 0x%x for page shift %u, but guest expects 0x%x",
> +                       ksps->slb_enc, ksps->page_shift, qsps->slb_enc);
> +            return;
> +        }
> +
> +        for (jq = 0; jq < ARRAY_SIZE(qsps->enc); jq++) {
> +            for (jk = 0; jk < ARRAY_SIZE(ksps->enc); jk++) {
> +                if (qsps->enc[jq].page_shift == ksps->enc[jk].page_shift) {
> +                    break;
> +                }
> +            }
> +
> +            if (jk >= ARRAY_SIZE(ksps->enc)) {
> +                error_setg(errp, "KVM doesn't support page shift %u/%u",
> +                           qsps->enc[jq].page_shift, qsps->page_shift);
> +                return;
> +            }
> +            if (qsps->enc[jq].pte_enc != ksps->enc[jk].pte_enc) {
> +                error_setg(errp,
> +"KVM uses PTE encoding 0x%x for page shift %u/%u, but guest expects 0x%x",
> +                           ksps->enc[jk].pte_enc, qsps->enc[jq].page_shift,
> +                           qsps->page_shift, qsps->enc[jq].pte_enc);
> +                return;
> +            }
>          }
>      }
> -    cpu->hash64_opts->slb_size = smmu_info.slb_size;
> -    if (!(smmu_info.flags & KVM_PPC_1T_SEGMENTS)) {
> -        cpu->hash64_opts->flags &= ~PPC_HASH64_1TSEG;
> -    }
> -}
> -#else /* defined (TARGET_PPC64) */
>  
> -static inline void kvm_fixup_page_sizes(PowerPCCPU *cpu)
> -{
> +    if (ppc_hash64_has(cpu, PPC_HASH64_CI_LARGEPAGE)) {
> +        /* Mostly what guest pagesizes we can use are related to the
> +         * host pages used to map guest RAM, which is handled in the
> +         * platform code. Cache-Inhibited largepages (64k) however are
> +         * used for I/O, so if they're mapped to the host at all it
> +         * will be a normal mapping, not a special hugepage one used
> +         * for RAM. */
> +        if (getpagesize() < 0x10000) {
> +            error_setg(errp,
> +"KVM can't supply 64kiB CI pages, which guest expects\n");
> +        }
> +    }
>  }
> -
>  #endif /* !defined (TARGET_PPC64) */
>  
>  unsigned long kvm_arch_vcpu_id(CPUState *cpu)
> @@ -551,9 +551,6 @@ int kvm_arch_init_vcpu(CPUState *cs)
>      CPUPPCState *cenv = &cpu->env;
>      int ret;
>  
> -    /* Gather server mmu info from KVM and update the CPU state */
> -    kvm_fixup_page_sizes(cpu);
> -
>      /* Synchronize sregs with kvm */
>      ret = kvm_arch_sync_sregs(cpu);
>      if (ret) {
> diff --git a/target/ppc/kvm_ppc.h b/target/ppc/kvm_ppc.h
> index 443fca0a4e..657582bb32 100644
> --- a/target/ppc/kvm_ppc.h
> +++ b/target/ppc/kvm_ppc.h
> @@ -71,6 +71,7 @@ int kvmppc_resize_hpt_commit(PowerPCCPU *cpu, target_ulong flags, int shift);
>  bool kvmppc_pvr_workaround_required(PowerPCCPU *cpu);
>  
>  bool kvmppc_hpt_needs_host_contiguous_pages(void);
> +void kvm_check_mmu(PowerPCCPU *cpu, Error **errp);
>  
>  #else
>  
> @@ -227,6 +228,10 @@ static inline bool kvmppc_hpt_needs_host_contiguous_pages(void)
>      return false;
>  }
>  
> +static inline void kvm_check_mmu(PowerPCCPU *cpu, Error **errp)
> +{
> +}
> +
>  static inline bool kvmppc_has_cap_spapr_vfio(void)
>  {
>      return false;
> 

* Re: [Qemu-devel] [PATCH 5/9] spapr: Maximum (HPT) pagesize property
  2018-06-18  6:36 ` [Qemu-devel] [PATCH 5/9] spapr: Maximum (HPT) pagesize property David Gibson
  2018-06-19  9:23   ` Cédric Le Goater
  2018-06-21  6:22   ` Cédric Le Goater
@ 2018-06-21  9:19   ` Greg Kurz
  2018-06-21 11:01     ` David Gibson
  2 siblings, 1 reply; 43+ messages in thread
From: Greg Kurz @ 2018-06-21  9:19 UTC (permalink / raw)
  To: David Gibson; +Cc: abologna, clg, qemu-ppc, qemu-devel, aik

On Mon, 18 Jun 2018 16:36:02 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> The way the POWER Hash Page Table (HPT) MMU is virtualized by KVM HV means
> that every page that the guest puts in the pagetables must be truly
> physically contiguous, not just GPA-contiguous.  In effect this means that
> an HPT guest can't use any pagesizes greater than the host page size used
> to back its memory.
> 
> At present we handle this by changing what we advertise to the guest based
> on the backing pagesizes.  This is pretty bad, because it means the guest
> sees a different environment depending on what should be host configuration
> details.
> 
> As a start on fixing this, we add a new capability parameter to the pseries
> machine type which gives the maximum allowed pagesizes for an HPT guest (as
> a shift).

Maybe you can mention that it is exposed to the user as a genuine pagesize,
not a shift, because it is a friendlier interface.
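
To make the size/shift relationship concrete, a minimal standalone
sketch follows.  The helper names are invented for illustration; the
patch itself uses is_power_of_2() and ctz64() in the setter and
1ULL << val in the getter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * User-visible page size (bytes) to the internally stored shift.
 * Returns false if the size is zero or not a power of two.
 */
static bool pagesize_to_shift(uint64_t pagesize, uint8_t *shift)
{
    uint8_t s = 0;

    if (pagesize == 0 || (pagesize & (pagesize - 1)) != 0) {
        return false;
    }

    /* Exactly one bit is set, so count how far it sits from bit 0 */
    while ((pagesize >> s) != 1) {
        s++;
    }
    *shift = s;
    return true;
}

/* Stored shift back to the page size shown to the user */
static uint64_t shift_to_pagesize(uint8_t shift)
{
    assert(shift < 64);
    return UINT64_C(1) << shift;
}
```

For example, a 64kiB page size maps to a shift of 16 and a 16MiB page
size to a shift of 24.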

> For now we just create and validate the parameter without making
> it do anything.
> 
> For backwards compatibility, on older machine types we set it to the max
> available page size for the host.  For the 3.0 machine type, we fix it to
> 16, the intention being to only allow HPT pagesizes up to 64kiB by default
> in future.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---

Reviewed-by: Greg Kurz <groug@kaod.org>

>  hw/ppc/spapr.c         | 12 +++++++++
>  hw/ppc/spapr_caps.c    | 56 ++++++++++++++++++++++++++++++++++++++++++
>  include/hw/ppc/spapr.h |  4 ++-
>  3 files changed, 71 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 40858d047c..74a76e7e09 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -63,6 +63,7 @@
>  #include "hw/virtio/vhost-scsi-common.h"
>  
>  #include "exec/address-spaces.h"
> +#include "exec/ram_addr.h"
>  #include "hw/usb.h"
>  #include "qemu/config-file.h"
>  #include "qemu/error-report.h"
> @@ -4043,6 +4044,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
>      smc->default_caps.caps[SPAPR_CAP_CFPC] = SPAPR_CAP_BROKEN;
>      smc->default_caps.caps[SPAPR_CAP_SBBC] = SPAPR_CAP_BROKEN;
>      smc->default_caps.caps[SPAPR_CAP_IBS] = SPAPR_CAP_BROKEN;
> +    smc->default_caps.caps[SPAPR_CAP_HPT_MAXPAGESIZE] = 16; /* 64kiB */
>      spapr_caps_add_properties(smc, &error_abort);
>  }
>  
> @@ -4126,8 +4128,18 @@ static void spapr_machine_2_12_instance_options(MachineState *machine)
>  
>  static void spapr_machine_2_12_class_options(MachineClass *mc)
>  {
> +    sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(mc);
> +    uint8_t mps;
> +
>      spapr_machine_3_0_class_options(mc);
>      SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_12);
> +
> +    if (kvmppc_hpt_needs_host_contiguous_pages()) {
> +        mps = ctz64(qemu_getrampagesize());
> +    } else {
> +        mps = 34; /* allow everything up to 16GiB, i.e. everything */
> +    }
> +    smc->default_caps.caps[SPAPR_CAP_HPT_MAXPAGESIZE] = mps;
>  }
>  
>  DEFINE_SPAPR_MACHINE(2_12, "2.12", false);
> diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
> index 68a4243efc..6cdc0c94e7 100644
> --- a/hw/ppc/spapr_caps.c
> +++ b/hw/ppc/spapr_caps.c
> @@ -27,6 +27,7 @@
>  #include "qapi/visitor.h"
>  #include "sysemu/hw_accel.h"
>  #include "target/ppc/cpu.h"
> +#include "target/ppc/mmu-hash64.h"
>  #include "cpu-models.h"
>  #include "kvm_ppc.h"
>  
> @@ -144,6 +145,42 @@ out:
>      g_free(val);
>  }
>  
> +static void spapr_cap_get_pagesize(Object *obj, Visitor *v, const char *name,
> +                                   void *opaque, Error **errp)
> +{
> +    sPAPRCapabilityInfo *cap = opaque;
> +    sPAPRMachineState *spapr = SPAPR_MACHINE(obj);
> +    uint8_t val = spapr_get_cap(spapr, cap->index);
> +    uint64_t pagesize = (1ULL << val);
> +
> +    visit_type_size(v, name, &pagesize, errp);
> +}
> +
> +static void spapr_cap_set_pagesize(Object *obj, Visitor *v, const char *name,
> +                                   void *opaque, Error **errp)
> +{
> +    sPAPRCapabilityInfo *cap = opaque;
> +    sPAPRMachineState *spapr = SPAPR_MACHINE(obj);
> +    uint64_t pagesize;
> +    uint8_t val;
> +    Error *local_err = NULL;
> +
> +    visit_type_size(v, name, &pagesize, &local_err);
> +    if (local_err) {
> +        error_propagate(errp, local_err);
> +        return;
> +    }
> +
> +    if (!is_power_of_2(pagesize)) {
> +        error_setg(errp, "cap-%s must be a power of 2", cap->name);
> +        return;
> +    }
> +
> +    val = ctz64(pagesize);
> +    spapr->cmd_line_caps[cap->index] = true;
> +    spapr->eff.caps[cap->index] = val;
> +}
> +
>  static void cap_htm_apply(sPAPRMachineState *spapr, uint8_t val, Error **errp)
>  {
>      if (!val) {
> @@ -267,6 +304,16 @@ static void cap_safe_indirect_branch_apply(sPAPRMachineState *spapr,
>  
>  #define VALUE_DESC_TRISTATE     " (broken, workaround, fixed)"
>  
> +static void cap_hpt_maxpagesize_apply(sPAPRMachineState *spapr,
> +                                      uint8_t val, Error **errp)
> +{
> +    if (val < 12) {
> +        error_setg(errp, "Require at least 4kiB hpt-max-page-size");
> +    } else if (val < 16) {
> +        warn_report("Many guests require at least 64kiB hpt-max-page-size");
> +    }
> +}
> +
>  sPAPRCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
>      [SPAPR_CAP_HTM] = {
>          .name = "htm",
> @@ -326,6 +373,15 @@ sPAPRCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
>          .possible = &cap_ibs_possible,
>          .apply = cap_safe_indirect_branch_apply,
>      },
> +    [SPAPR_CAP_HPT_MAXPAGESIZE] = {
> +        .name = "hpt-max-page-size",
> +        .description = "Maximum page size for Hash Page Table guests",
> +        .index = SPAPR_CAP_HPT_MAXPAGESIZE,
> +        .get = spapr_cap_get_pagesize,
> +        .set = spapr_cap_set_pagesize,
> +        .type = "int",
> +        .apply = cap_hpt_maxpagesize_apply,
> +    },
>  };
>  
>  static sPAPRCapabilities default_caps_with_cpu(sPAPRMachineState *spapr,
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 9dd46a72f6..c97593d032 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -66,8 +66,10 @@ typedef enum {
>  #define SPAPR_CAP_SBBC                  0x04
>  /* Indirect Branch Serialisation */
>  #define SPAPR_CAP_IBS                   0x05
> +/* HPT Maximum Page Size (encoded as a shift) */
> +#define SPAPR_CAP_HPT_MAXPAGESIZE       0x06
>  /* Num Caps */
> -#define SPAPR_CAP_NUM                   (SPAPR_CAP_IBS + 1)
> +#define SPAPR_CAP_NUM                   (SPAPR_CAP_HPT_MAXPAGESIZE + 1)
>  
>  /*
>   * Capability Values

* Re: [Qemu-devel] [PATCH 6/9] spapr: Use maximum page size capability to simplify memory backend checking
  2018-06-18  6:36 ` [Qemu-devel] [PATCH 6/9] spapr: Use maximum page size capability to simplify memory backend checking David Gibson
  2018-06-21  6:29   ` Cédric Le Goater
@ 2018-06-21 10:29   ` Greg Kurz
  2018-06-21 11:11     ` David Gibson
  1 sibling, 1 reply; 43+ messages in thread
From: Greg Kurz @ 2018-06-21 10:29 UTC (permalink / raw)
  To: David Gibson; +Cc: abologna, clg, qemu-ppc, qemu-devel, aik

On Mon, 18 Jun 2018 16:36:03 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> The way we used to handle KVM allowable guest pagesizes for PAPR guests
> required some convoluted checking of memory attached to the guest.
> 
> The allowable pagesizes advertised to the guest cpus depended on the memory
> which was attached at boot, but then we needed to ensure that any memory
> later hotplugged didn't change which pagesizes were allowed.
> 
> Now that we have an explicit machine option to control the allowable
> maximum pagesize we can simplify this.  We just check all memory backends
> against that declared pagesize.  We check base and cold-plugged memory at
> reset time, and hotplugged memory at pre_plug() time.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  hw/ppc/spapr.c         | 17 +++++++----------
>  hw/ppc/spapr_caps.c    | 20 ++++++++++++++++++++
>  include/hw/ppc/spapr.h |  3 +++
>  target/ppc/kvm.c       | 14 --------------
>  target/ppc/kvm_ppc.h   |  6 ------
>  5 files changed, 30 insertions(+), 30 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 74a76e7e09..efd36e92e2 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -3192,11 +3192,13 @@ static void spapr_memory_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>                                    Error **errp)
>  {
>      const sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(hotplug_dev);
> +    sPAPRMachineState *spapr = SPAPR_MACHINE(hotplug_dev);
>      PCDIMMDevice *dimm = PC_DIMM(dev);
>      PCDIMMDeviceClass *ddc = PC_DIMM_GET_CLASS(dimm);
>      MemoryRegion *mr;
>      uint64_t size;
> -    char *mem_dev;
> +    Object *memdev;
> +    hwaddr pagesize;
>  
>      if (!smc->dr_lmb_enabled) {
>          error_setg(errp, "Memory hotplug not supported for this machine");
> @@ -3215,15 +3217,10 @@ static void spapr_memory_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>          return;
>      }
>  
> -    mem_dev = object_property_get_str(OBJECT(dimm), PC_DIMM_MEMDEV_PROP, NULL);
> -    if (mem_dev && !kvmppc_is_mem_backend_page_size_ok(mem_dev)) {
> -        error_setg(errp, "Memory backend has bad page size. "
> -                   "Use 'memory-backend-file' with correct mem-path.");
> -        goto out;
> -    }
> -
> -out:
> -    g_free(mem_dev);
> +    memdev = object_property_get_link(OBJECT(dimm), PC_DIMM_MEMDEV_PROP,
> +                                      &error_abort);
> +    pagesize = host_memory_backend_pagesize(MEMORY_BACKEND(memdev));
> +    spapr_check_pagesize(spapr, pagesize, errp);
>  }
>  
>  struct sPAPRDIMMState {
> diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
> index 6cdc0c94e7..9fc739b3f5 100644
> --- a/hw/ppc/spapr_caps.c
> +++ b/hw/ppc/spapr_caps.c
> @@ -26,6 +26,7 @@
>  #include "qapi/error.h"
>  #include "qapi/visitor.h"
>  #include "sysemu/hw_accel.h"
> +#include "exec/ram_addr.h"
>  #include "target/ppc/cpu.h"
>  #include "target/ppc/mmu-hash64.h"
>  #include "cpu-models.h"
> @@ -304,6 +305,23 @@ static void cap_safe_indirect_branch_apply(sPAPRMachineState *spapr,
>  
>  #define VALUE_DESC_TRISTATE     " (broken, workaround, fixed)"
>  
> +void spapr_check_pagesize(sPAPRMachineState *spapr, hwaddr pagesize,
> +                          Error **errp)
> +{
> +    hwaddr maxpagesize = (1ULL << spapr->eff.caps[SPAPR_CAP_HPT_MPS]);

s/SPAPR_CAP_HPT_MPS/SPAPR_CAP_HPT_MAXPAGESIZE

> +
> +    if (!kvmppc_hpt_needs_host_contiguous_pages()) {
> +        return;
> +    }
> +
> +    if (maxpagesize > pagesize) {
> +        error_setg(errp,
> +                   "Can't support %"HWADDR_PRIu" kiB guest pages with %"
> +                   HWADDR_PRIu" kiB host pages with this KVM implementation",
> +                   maxpagesize >> 10, pagesize >> 10);
> +    }
> +}
> +
>  static void cap_hpt_maxpagesize_apply(sPAPRMachineState *spapr,
>                                        uint8_t val, Error **errp)
>  {
> @@ -312,6 +330,8 @@ static void cap_hpt_maxpagesize_apply(sPAPRMachineState *spapr,
>      } else if (val < 16) {
>          warn_report("Many guests require at least 64kiB hpt-max-page-size");
>      }
> +
> +    spapr_check_pagesize(spapr, qemu_getrampagesize(), errp);

Even though QEMU will always exit gracefully in this precise case, since
errp == &error_fatal, passing the same errp to several calls that can each
set an error is a fragile pattern.  It may cause a crash if *errp has
already been set.

Maybe use a local_err variable and error_propagate(), or at least return
in the (val < 12) block above.
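
To illustrate the local_err suggestion, here is a minimal standalone sketch.
It does not use QEMU's real error API: Error, set_error() and
propagate_error() are simplified, hypothetical stand-ins for Error,
error_setg() and error_propagate(), and the two check functions and the
contrived host_shift argument exist only for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for QEMU's Error object (not the real type). */
typedef struct Error {
    char msg[128];
} Error;

/* Stand-in for error_setg(): the destination must not already hold an error. */
static void set_error(Error **errp, const char *msg)
{
    if (!errp) {
        return;                 /* caller is not interested in errors */
    }
    assert(*errp == NULL);      /* setting an error twice is a caller bug */
    *errp = calloc(1, sizeof(Error));
    assert(*errp != NULL);
    snprintf((*errp)->msg, sizeof((*errp)->msg), "%s", msg);
}

/* Stand-in for error_propagate(): hand a local error over to the caller. */
static void propagate_error(Error **dst, Error *local)
{
    if (!local) {
        return;
    }
    if (dst && !*dst) {
        *dst = local;
    } else {
        free(local);            /* nobody wants it, or an error is already set */
    }
}

static void check_min_pagesize(unsigned shift, Error **errp)
{
    if (shift < 12) {
        set_error(errp, "Require at least 4kiB hpt-max-page-size");
    }
}

static void check_backing_pagesize(unsigned shift, unsigned host_shift,
                                   Error **errp)
{
    if (shift > host_shift) {
        set_error(errp, "Guest page size exceeds host backing page size");
    }
}

/* Each check reports into local_err; errp is written at most once. */
static void apply_max_pagesize(unsigned shift, unsigned host_shift,
                               Error **errp)
{
    Error *local_err = NULL;

    check_min_pagesize(shift, &local_err);
    if (local_err) {
        propagate_error(errp, local_err);
        return;
    }

    check_backing_pagesize(shift, host_shift, &local_err);
    propagate_error(errp, local_err);
}
```

With this shape, the caller's errp is only touched through
propagate_error(), so a second failing check can never hit the assertion in
set_error().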

Rest looks good. With the two issues addressed:

Reviewed-by: Greg Kurz <groug@kaod.org>

>  }
>  
>  sPAPRCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index c97593d032..75e2cf2687 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -806,4 +806,7 @@ void spapr_caps_cpu_apply(sPAPRMachineState *spapr, PowerPCCPU *cpu);
>  void spapr_caps_add_properties(sPAPRMachineClass *smc, Error **errp);
>  int spapr_caps_post_migration(sPAPRMachineState *spapr);
>  
> +void spapr_check_pagesize(sPAPRMachineState *spapr, hwaddr pagesize,
> +                          Error **errp);
> +
>  #endif /* HW_SPAPR_H */
> diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
> index 50b5d01432..9cfbd388ad 100644
> --- a/target/ppc/kvm.c
> +++ b/target/ppc/kvm.c
> @@ -500,26 +500,12 @@ static void kvm_fixup_page_sizes(PowerPCCPU *cpu)
>          cpu->hash64_opts->flags &= ~PPC_HASH64_1TSEG;
>      }
>  }
> -
> -bool kvmppc_is_mem_backend_page_size_ok(const char *obj_path)
> -{
> -    Object *mem_obj = object_resolve_path(obj_path, NULL);
> -    long pagesize = host_memory_backend_pagesize(MEMORY_BACKEND(mem_obj));
> -
> -    return pagesize >= max_cpu_page_size;
> -}
> -
>  #else /* defined (TARGET_PPC64) */
>  
>  static inline void kvm_fixup_page_sizes(PowerPCCPU *cpu)
>  {
>  }
>  
> -bool kvmppc_is_mem_backend_page_size_ok(const char *obj_path)
> -{
> -    return true;
> -}
> -
>  #endif /* !defined (TARGET_PPC64) */
>  
>  unsigned long kvm_arch_vcpu_id(CPUState *cpu)
> diff --git a/target/ppc/kvm_ppc.h b/target/ppc/kvm_ppc.h
> index a7ddb8a5d6..443fca0a4e 100644
> --- a/target/ppc/kvm_ppc.h
> +++ b/target/ppc/kvm_ppc.h
> @@ -71,7 +71,6 @@ int kvmppc_resize_hpt_commit(PowerPCCPU *cpu, target_ulong flags, int shift);
>  bool kvmppc_pvr_workaround_required(PowerPCCPU *cpu);
>  
>  bool kvmppc_hpt_needs_host_contiguous_pages(void);
> -bool kvmppc_is_mem_backend_page_size_ok(const char *obj_path);
>  
>  #else
>  
> @@ -228,11 +227,6 @@ static inline bool kvmppc_hpt_needs_host_contiguous_pages(void)
>      return false;
>  }
>  
> -static inline bool kvmppc_is_mem_backend_page_size_ok(const char *obj_path)
> -{
> -    return true;
> -}
> -
>  static inline bool kvmppc_has_cap_spapr_vfio(void)
>  {
>      return false;


* Re: [Qemu-devel] [PATCH 5/9] spapr: Maximum (HPT) pagesize property
  2018-06-21  6:22   ` Cédric Le Goater
@ 2018-06-21 11:00     ` David Gibson
  0 siblings, 0 replies; 43+ messages in thread
From: David Gibson @ 2018-06-21 11:00 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: groug, abologna, qemu-ppc, qemu-devel, aik

On Thu, Jun 21, 2018 at 08:22:15AM +0200, Cédric Le Goater wrote:
> On 06/18/2018 08:36 AM, David Gibson wrote:
> > The way the POWER Hash Page Table (HPT) MMU is virtualized by KVM HV means
> > that every page that the guest puts in the pagetables must be truly
> > physically contiguous, not just GPA-contiguous.  In effect this means that
> > an HPT guest can't use any pagesizes greater than the host page size used
> > to back its memory.
> > 
> > At present we handle this by changing what we advertise to the guest based
> > on the backing pagesizes.  This is pretty bad, because it means the guest
> > sees a different environment depending on what should be host configuration
> > details.
> > 
> > As a start on fixing this, we add a new capability parameter to the pseries
> > machine type which gives the maximum allowed pagesizes for an HPT guest (as
> > a shift).  For now we just create and validate the parameter without making
> > it do anything.
> > 
> > For backwards compatibility, on older machine types we set it to the max
> > available page size for the host.  For the 3.0 machine type, we fix it to
> > 16, the intention being to only allow HPT pagesizes up to 64kiB by default
> > in future.
> 
> Why not do it now ?

Uh.. do what now?  Essentially this *is* doing it now, except that we
don't have the mechanism to actually enforce it until a couple of
patches further in.

> I don't think the pseries machine supports 4k pages
> anyway, so you could change the warn_report() below into an error, I think.

Uh.. I think pseries does technically still support 4k pages,
although it might not have been tested much recently, since none of
the distros configure it that way.

> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> 
> Reviewed-by: Cédric Le Goater <clg@kaod.org>
> 
> C.
> 
> > ---
> >  hw/ppc/spapr.c         | 12 +++++++++
> >  hw/ppc/spapr_caps.c    | 56 ++++++++++++++++++++++++++++++++++++++++++
> >  include/hw/ppc/spapr.h |  4 ++-
> >  3 files changed, 71 insertions(+), 1 deletion(-)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 40858d047c..74a76e7e09 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -63,6 +63,7 @@
> >  #include "hw/virtio/vhost-scsi-common.h"
> >  
> >  #include "exec/address-spaces.h"
> > +#include "exec/ram_addr.h"
> >  #include "hw/usb.h"
> >  #include "qemu/config-file.h"
> >  #include "qemu/error-report.h"
> > @@ -4043,6 +4044,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
> >      smc->default_caps.caps[SPAPR_CAP_CFPC] = SPAPR_CAP_BROKEN;
> >      smc->default_caps.caps[SPAPR_CAP_SBBC] = SPAPR_CAP_BROKEN;
> >      smc->default_caps.caps[SPAPR_CAP_IBS] = SPAPR_CAP_BROKEN;
> > +    smc->default_caps.caps[SPAPR_CAP_HPT_MAXPAGESIZE] = 16; /* 64kiB */
> >      spapr_caps_add_properties(smc, &error_abort);
> >  }
> >  
> > @@ -4126,8 +4128,18 @@ static void spapr_machine_2_12_instance_options(MachineState *machine)
> >  
> >  static void spapr_machine_2_12_class_options(MachineClass *mc)
> >  {
> > +    sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(mc);
> > +    uint8_t mps;
> > +
> >      spapr_machine_3_0_class_options(mc);
> >      SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_12);
> > +
> > +    if (kvmppc_hpt_needs_host_contiguous_pages()) {
> > +        mps = ctz64(qemu_getrampagesize());
> > +    } else {
> > +        mps = 34; /* allow everything up to 16GiB, i.e. everything */
> > +    }
> > +    smc->default_caps.caps[SPAPR_CAP_HPT_MAXPAGESIZE] = mps;
> >  }
> >  
> >  DEFINE_SPAPR_MACHINE(2_12, "2.12", false);
> > diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
> > index 68a4243efc..6cdc0c94e7 100644
> > --- a/hw/ppc/spapr_caps.c
> > +++ b/hw/ppc/spapr_caps.c
> > @@ -27,6 +27,7 @@
> >  #include "qapi/visitor.h"
> >  #include "sysemu/hw_accel.h"
> >  #include "target/ppc/cpu.h"
> > +#include "target/ppc/mmu-hash64.h"
> >  #include "cpu-models.h"
> >  #include "kvm_ppc.h"
> >  
> > @@ -144,6 +145,42 @@ out:
> >      g_free(val);
> >  }
> >  
> > +static void spapr_cap_get_pagesize(Object *obj, Visitor *v, const char *name,
> > +                                   void *opaque, Error **errp)
> > +{
> > +    sPAPRCapabilityInfo *cap = opaque;
> > +    sPAPRMachineState *spapr = SPAPR_MACHINE(obj);
> > +    uint8_t val = spapr_get_cap(spapr, cap->index);
> > +    uint64_t pagesize = (1ULL << val);
> > +
> > +    visit_type_size(v, name, &pagesize, errp);
> > +}
> > +
> > +static void spapr_cap_set_pagesize(Object *obj, Visitor *v, const char *name,
> > +                                   void *opaque, Error **errp)
> > +{
> > +    sPAPRCapabilityInfo *cap = opaque;
> > +    sPAPRMachineState *spapr = SPAPR_MACHINE(obj);
> > +    uint64_t pagesize;
> > +    uint8_t val;
> > +    Error *local_err = NULL;
> > +
> > +    visit_type_size(v, name, &pagesize, &local_err);
> > +    if (local_err) {
> > +        error_propagate(errp, local_err);
> > +        return;
> > +    }
> > +
> > +    if (!is_power_of_2(pagesize)) {
> > +        error_setg(errp, "cap-%s must be a power of 2", cap->name);
> > +        return;
> > +    }
> > +
> > +    val = ctz64(pagesize);
> > +    spapr->cmd_line_caps[cap->index] = true;
> > +    spapr->eff.caps[cap->index] = val;
> > +}
> > +
> >  static void cap_htm_apply(sPAPRMachineState *spapr, uint8_t val, Error **errp)
> >  {
> >      if (!val) {
> > @@ -267,6 +304,16 @@ static void cap_safe_indirect_branch_apply(sPAPRMachineState *spapr,
> >  
> >  #define VALUE_DESC_TRISTATE     " (broken, workaround, fixed)"
> >  
> > +static void cap_hpt_maxpagesize_apply(sPAPRMachineState *spapr,
> > +                                      uint8_t val, Error **errp)
> > +{
> > +    if (val < 12) {
> > +        error_setg(errp, "Require at least 4kiB hpt-max-page-size");
> > +    } else if (val < 16) {
> > +        warn_report("Many guests require at least 64kiB hpt-max-page-size");
> > +    }
> > +}
> > +
> >  sPAPRCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
> >      [SPAPR_CAP_HTM] = {
> >          .name = "htm",
> > @@ -326,6 +373,15 @@ sPAPRCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
> >          .possible = &cap_ibs_possible,
> >          .apply = cap_safe_indirect_branch_apply,
> >      },
> > +    [SPAPR_CAP_HPT_MAXPAGESIZE] = {
> > +        .name = "hpt-max-page-size",
> > +        .description = "Maximum page size for Hash Page Table guests",
> > +        .index = SPAPR_CAP_HPT_MAXPAGESIZE,
> > +        .get = spapr_cap_get_pagesize,
> > +        .set = spapr_cap_set_pagesize,
> > +        .type = "int",
> > +        .apply = cap_hpt_maxpagesize_apply,
> > +    },
> >  };
> >  
> >  static sPAPRCapabilities default_caps_with_cpu(sPAPRMachineState *spapr,
> > diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> > index 9dd46a72f6..c97593d032 100644
> > --- a/include/hw/ppc/spapr.h
> > +++ b/include/hw/ppc/spapr.h
> > @@ -66,8 +66,10 @@ typedef enum {
> >  #define SPAPR_CAP_SBBC                  0x04
> >  /* Indirect Branch Serialisation */
> >  #define SPAPR_CAP_IBS                   0x05
> > +/* HPT Maximum Page Size (encoded as a shift) */
> > +#define SPAPR_CAP_HPT_MAXPAGESIZE       0x06
> >  /* Num Caps */
> > -#define SPAPR_CAP_NUM                   (SPAPR_CAP_IBS + 1)
> > +#define SPAPR_CAP_NUM                   (SPAPR_CAP_HPT_MAXPAGESIZE + 1)
> >  
> >  /*
> >   * Capability Values
> > 
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [Qemu-devel] [PATCH 5/9] spapr: Maximum (HPT) pagesize property
  2018-06-21  9:19   ` Greg Kurz
@ 2018-06-21 11:01     ` David Gibson
  0 siblings, 0 replies; 43+ messages in thread
From: David Gibson @ 2018-06-21 11:01 UTC (permalink / raw)
  To: Greg Kurz; +Cc: abologna, clg, qemu-ppc, qemu-devel, aik

On Thu, Jun 21, 2018 at 11:19:41AM +0200, Greg Kurz wrote:
> On Mon, 18 Jun 2018 16:36:02 +1000
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > The way the POWER Hash Page Table (HPT) MMU is virtualized by KVM HV means
> > that every page that the guest puts in the pagetables must be truly
> > physically contiguous, not just GPA-contiguous.  In effect this means that
> > an HPT guest can't use any pagesizes greater than the host page size used
> > to back its memory.
> > 
> > At present we handle this by changing what we advertise to the guest based
> > on the backing pagesizes.  This is pretty bad, because it means the guest
> > sees a different environment depending on what should be host configuration
> > details.
> > 
> > As a start on fixing this, we add a new capability parameter to the pseries
> > machine type which gives the maximum allowed pagesizes for an HPT guest (as
> > a shift).
> 
> Maybe you can mention that it is exposed to the user as a genuine pagesize,
> not a shift, because it is a friendlier interface.

Oops, I fixed most of the shift references in the commit messages, but
missed this one.  I removed the "as a shift" text.
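
For reference, the relationship between the user-visible page size and the
internally stored shift can be sketched as below.  This is a standalone
illustration only: is_pow2(), pagesize_to_shift() and shift_to_pagesize()
are hypothetical helper names, not functions from the patch, which uses
is_power_of_2() and ctz64() for the same purpose.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if exactly one bit is set in v. */
static bool is_pow2(uint64_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/*
 * Convert a page size in bytes to its shift (log2).
 * Returns false and leaves *shift untouched if pagesize is not a
 * power of two, mirroring the validation done in the property setter.
 */
static bool pagesize_to_shift(uint64_t pagesize, uint8_t *shift)
{
    uint8_t s = 0;

    if (!is_pow2(pagesize)) {
        return false;
    }
    while ((pagesize >> s) != 1) {
        s++;
    }
    *shift = s;
    return true;
}

/* Convert a stored shift back to the page size shown to the user. */
static uint64_t shift_to_pagesize(uint8_t shift)
{
    assert(shift < 64);
    return UINT64_C(1) << shift;
}
```

So a 64kiB setting corresponds to a stored value of 16, and the 2.12
fallback of 34 corresponds to 16GiB.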

> > For now we just create and validate the parameter without making
> > it do anything.
> > 
> > For backwards compatibility, on older machine types we set it to the max
> > available page size for the host.  For the 3.0 machine type, we fix it to
> > 16, the intention being to only allow HPT pagesizes up to 64kiB by default
> > in future.
> > 
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> 
> Reviewed-by: Greg Kurz <groug@kaod.org>
> 
> >  hw/ppc/spapr.c         | 12 +++++++++
> >  hw/ppc/spapr_caps.c    | 56 ++++++++++++++++++++++++++++++++++++++++++
> >  include/hw/ppc/spapr.h |  4 ++-
> >  3 files changed, 71 insertions(+), 1 deletion(-)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 40858d047c..74a76e7e09 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -63,6 +63,7 @@
> >  #include "hw/virtio/vhost-scsi-common.h"
> >  
> >  #include "exec/address-spaces.h"
> > +#include "exec/ram_addr.h"
> >  #include "hw/usb.h"
> >  #include "qemu/config-file.h"
> >  #include "qemu/error-report.h"
> > @@ -4043,6 +4044,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
> >      smc->default_caps.caps[SPAPR_CAP_CFPC] = SPAPR_CAP_BROKEN;
> >      smc->default_caps.caps[SPAPR_CAP_SBBC] = SPAPR_CAP_BROKEN;
> >      smc->default_caps.caps[SPAPR_CAP_IBS] = SPAPR_CAP_BROKEN;
> > +    smc->default_caps.caps[SPAPR_CAP_HPT_MAXPAGESIZE] = 16; /* 64kiB */
> >      spapr_caps_add_properties(smc, &error_abort);
> >  }
> >  
> > @@ -4126,8 +4128,18 @@ static void spapr_machine_2_12_instance_options(MachineState *machine)
> >  
> >  static void spapr_machine_2_12_class_options(MachineClass *mc)
> >  {
> > +    sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(mc);
> > +    uint8_t mps;
> > +
> >      spapr_machine_3_0_class_options(mc);
> >      SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_12);
> > +
> > +    if (kvmppc_hpt_needs_host_contiguous_pages()) {
> > +        mps = ctz64(qemu_getrampagesize());
> > +    } else {
> > +        mps = 34; /* allow everything up to 16GiB, i.e. everything */
> > +    }
> > +    smc->default_caps.caps[SPAPR_CAP_HPT_MAXPAGESIZE] = mps;
> >  }
> >  
> >  DEFINE_SPAPR_MACHINE(2_12, "2.12", false);
> > diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
> > index 68a4243efc..6cdc0c94e7 100644
> > --- a/hw/ppc/spapr_caps.c
> > +++ b/hw/ppc/spapr_caps.c
> > @@ -27,6 +27,7 @@
> >  #include "qapi/visitor.h"
> >  #include "sysemu/hw_accel.h"
> >  #include "target/ppc/cpu.h"
> > +#include "target/ppc/mmu-hash64.h"
> >  #include "cpu-models.h"
> >  #include "kvm_ppc.h"
> >  
> > @@ -144,6 +145,42 @@ out:
> >      g_free(val);
> >  }
> >  
> > +static void spapr_cap_get_pagesize(Object *obj, Visitor *v, const char *name,
> > +                                   void *opaque, Error **errp)
> > +{
> > +    sPAPRCapabilityInfo *cap = opaque;
> > +    sPAPRMachineState *spapr = SPAPR_MACHINE(obj);
> > +    uint8_t val = spapr_get_cap(spapr, cap->index);
> > +    uint64_t pagesize = (1ULL << val);
> > +
> > +    visit_type_size(v, name, &pagesize, errp);
> > +}
> > +
> > +static void spapr_cap_set_pagesize(Object *obj, Visitor *v, const char *name,
> > +                                   void *opaque, Error **errp)
> > +{
> > +    sPAPRCapabilityInfo *cap = opaque;
> > +    sPAPRMachineState *spapr = SPAPR_MACHINE(obj);
> > +    uint64_t pagesize;
> > +    uint8_t val;
> > +    Error *local_err = NULL;
> > +
> > +    visit_type_size(v, name, &pagesize, &local_err);
> > +    if (local_err) {
> > +        error_propagate(errp, local_err);
> > +        return;
> > +    }
> > +
> > +    if (!is_power_of_2(pagesize)) {
> > +        error_setg(errp, "cap-%s must be a power of 2", cap->name);
> > +        return;
> > +    }
> > +
> > +    val = ctz64(pagesize);
> > +    spapr->cmd_line_caps[cap->index] = true;
> > +    spapr->eff.caps[cap->index] = val;
> > +}
> > +
> >  static void cap_htm_apply(sPAPRMachineState *spapr, uint8_t val, Error **errp)
> >  {
> >      if (!val) {
> > @@ -267,6 +304,16 @@ static void cap_safe_indirect_branch_apply(sPAPRMachineState *spapr,
> >  
> >  #define VALUE_DESC_TRISTATE     " (broken, workaround, fixed)"
> >  
> > +static void cap_hpt_maxpagesize_apply(sPAPRMachineState *spapr,
> > +                                      uint8_t val, Error **errp)
> > +{
> > +    if (val < 12) {
> > +        error_setg(errp, "Require at least 4kiB hpt-max-page-size");
> > +    } else if (val < 16) {
> > +        warn_report("Many guests require at least 64kiB hpt-max-page-size");
> > +    }
> > +}
> > +
> >  sPAPRCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
> >      [SPAPR_CAP_HTM] = {
> >          .name = "htm",
> > @@ -326,6 +373,15 @@ sPAPRCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
> >          .possible = &cap_ibs_possible,
> >          .apply = cap_safe_indirect_branch_apply,
> >      },
> > +    [SPAPR_CAP_HPT_MAXPAGESIZE] = {
> > +        .name = "hpt-max-page-size",
> > +        .description = "Maximum page size for Hash Page Table guests",
> > +        .index = SPAPR_CAP_HPT_MAXPAGESIZE,
> > +        .get = spapr_cap_get_pagesize,
> > +        .set = spapr_cap_set_pagesize,
> > +        .type = "int",
> > +        .apply = cap_hpt_maxpagesize_apply,
> > +    },
> >  };
> >  
> >  static sPAPRCapabilities default_caps_with_cpu(sPAPRMachineState *spapr,
> > diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> > index 9dd46a72f6..c97593d032 100644
> > --- a/include/hw/ppc/spapr.h
> > +++ b/include/hw/ppc/spapr.h
> > @@ -66,8 +66,10 @@ typedef enum {
> >  #define SPAPR_CAP_SBBC                  0x04
> >  /* Indirect Branch Serialisation */
> >  #define SPAPR_CAP_IBS                   0x05
> > +/* HPT Maximum Page Size (encoded as a shift) */
> > +#define SPAPR_CAP_HPT_MAXPAGESIZE       0x06
> >  /* Num Caps */
> > -#define SPAPR_CAP_NUM                   (SPAPR_CAP_IBS + 1)
> > +#define SPAPR_CAP_NUM                   (SPAPR_CAP_HPT_MAXPAGESIZE + 1)
> >  
> >  /*
> >   * Capability Values
> 


* Re: [Qemu-devel] [PATCH 6/9] spapr: Use maximum page size capability to simplify memory backend checking
  2018-06-21  6:29   ` Cédric Le Goater
@ 2018-06-21 11:06     ` David Gibson
  0 siblings, 0 replies; 43+ messages in thread
From: David Gibson @ 2018-06-21 11:06 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: groug, abologna, qemu-ppc, qemu-devel, aik

On Thu, Jun 21, 2018 at 08:29:36AM +0200, Cédric Le Goater wrote:
> On 06/18/2018 08:36 AM, David Gibson wrote:
> > The way we used to handle KVM allowable guest pagesizes for PAPR guests
> > required some convoluted checking of memory attached to the guest.
> > 
> > The allowable pagesizes advertised to the guest cpus depended on the memory
> > which was attached at boot, but then we needed to ensure that any memory
> > later hotplugged didn't change which pagesizes were allowed.
> > 
> > Now that we have an explicit machine option to control the allowable
> > maximum pagesize we can simplify this.  We just check all memory backends
> > against that declared pagesize.  We check base and cold-plugged memory at
> > reset time, and hotplugged memory at pre_plug() time.
> > 
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> 
> One minor question below.
> 
> Nevertheless,
> 
> Reviewed-by: Cédric Le Goater <clg@kaod.org>

[snip]
> > @@ -304,6 +305,23 @@ static void cap_safe_indirect_branch_apply(sPAPRMachineState *spapr,
> >  
> >  #define VALUE_DESC_TRISTATE     " (broken, workaround, fixed)"
> >  
> > +void spapr_check_pagesize(sPAPRMachineState *spapr, hwaddr pagesize,
> > +                          Error **errp)
> > +{
> > +    hwaddr maxpagesize = (1ULL << spapr->eff.caps[SPAPR_CAP_HPT_MPS]);
> 
> I suppose this is SPAPR_CAP_HPT_MAXPAGESIZE now ? 

Oops.  I thought I'd compile-tested all the intermediate patches,
which would have caught this, but apparently not.

Fixed now.

> > +
> > +    if (!kvmppc_hpt_needs_host_contiguous_pages()) {
> > +        return;
> > +    }
> > +
> > +    if (maxpagesize > pagesize) {
> > +        error_setg(errp,
> > +                   "Can't support %"HWADDR_PRIu" kiB guest pages with %"
> > +                   HWADDR_PRIu" kiB host pages with this KVM implementation",
> > +                   maxpagesize >> 10, pagesize >> 10);
> > +    }
> > +}
> > +
> >  static void cap_hpt_maxpagesize_apply(sPAPRMachineState *spapr,
> >                                        uint8_t val, Error **errp)
> >  {
> > @@ -312,6 +330,8 @@ static void cap_hpt_maxpagesize_apply(sPAPRMachineState *spapr,
> >      } else if (val < 16) {
> >          warn_report("Many guests require at least 64kiB hpt-max-page-size");
> >      }
> > +
> > +    spapr_check_pagesize(spapr, qemu_getrampagesize(), errp);
> >  }
> >  
> >  sPAPRCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
> > diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> > index c97593d032..75e2cf2687 100644
> > --- a/include/hw/ppc/spapr.h
> > +++ b/include/hw/ppc/spapr.h
> > @@ -806,4 +806,7 @@ void spapr_caps_cpu_apply(sPAPRMachineState *spapr, PowerPCCPU *cpu);
> >  void spapr_caps_add_properties(sPAPRMachineClass *smc, Error **errp);
> >  int spapr_caps_post_migration(sPAPRMachineState *spapr);
> >  
> > +void spapr_check_pagesize(sPAPRMachineState *spapr, hwaddr pagesize,
> > +                          Error **errp);
> > +
> >  #endif /* HW_SPAPR_H */
> > diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
> > index 50b5d01432..9cfbd388ad 100644
> > --- a/target/ppc/kvm.c
> > +++ b/target/ppc/kvm.c
> > @@ -500,26 +500,12 @@ static void kvm_fixup_page_sizes(PowerPCCPU *cpu)
> >          cpu->hash64_opts->flags &= ~PPC_HASH64_1TSEG;
> >      }
> >  }
> > -
> > -bool kvmppc_is_mem_backend_page_size_ok(const char *obj_path)
> > -{
> > -    Object *mem_obj = object_resolve_path(obj_path, NULL);
> > -    long pagesize = host_memory_backend_pagesize(MEMORY_BACKEND(mem_obj));
> > -
> > -    return pagesize >= max_cpu_page_size;
> > -}
> > -
> >  #else /* defined (TARGET_PPC64) */
> >  
> >  static inline void kvm_fixup_page_sizes(PowerPCCPU *cpu)
> >  {
> >  }
> >  
> > -bool kvmppc_is_mem_backend_page_size_ok(const char *obj_path)
> > -{
> > -    return true;
> > -}
> > -
> >  #endif /* !defined (TARGET_PPC64) */
> >  
> >  unsigned long kvm_arch_vcpu_id(CPUState *cpu)
> > diff --git a/target/ppc/kvm_ppc.h b/target/ppc/kvm_ppc.h
> > index a7ddb8a5d6..443fca0a4e 100644
> > --- a/target/ppc/kvm_ppc.h
> > +++ b/target/ppc/kvm_ppc.h
> > @@ -71,7 +71,6 @@ int kvmppc_resize_hpt_commit(PowerPCCPU *cpu, target_ulong flags, int shift);
> >  bool kvmppc_pvr_workaround_required(PowerPCCPU *cpu);
> >  
> >  bool kvmppc_hpt_needs_host_contiguous_pages(void);
> > -bool kvmppc_is_mem_backend_page_size_ok(const char *obj_path);
> >  
> >  #else
> >  
> > @@ -228,11 +227,6 @@ static inline bool kvmppc_hpt_needs_host_contiguous_pages(void)
> >      return false;
> >  }
> >  
> > -static inline bool kvmppc_is_mem_backend_page_size_ok(const char *obj_path)
> > -{
> > -    return true;
> > -}
> > -
> >  static inline bool kvmppc_has_cap_spapr_vfio(void)
> >  {
> >      return false;
> > 
> 


* Re: [Qemu-devel] [PATCH 6/9] spapr: Use maximum page size capability to simplify memory backend checking
  2018-06-21 10:29   ` Greg Kurz
@ 2018-06-21 11:11     ` David Gibson
  0 siblings, 0 replies; 43+ messages in thread
From: David Gibson @ 2018-06-21 11:11 UTC (permalink / raw)
  To: Greg Kurz; +Cc: abologna, clg, qemu-ppc, qemu-devel, aik

On Thu, Jun 21, 2018 at 12:29:14PM +0200, Greg Kurz wrote:
> On Mon, 18 Jun 2018 16:36:03 +1000
> David Gibson <david@gibson.dropbear.id.au> wrote:
[snip]
> > @@ -304,6 +305,23 @@ static void cap_safe_indirect_branch_apply(sPAPRMachineState *spapr,
> >  
> >  #define VALUE_DESC_TRISTATE     " (broken, workaround, fixed)"
> >  
> > +void spapr_check_pagesize(sPAPRMachineState *spapr, hwaddr pagesize,
> > +                          Error **errp)
> > +{
> > +    hwaddr maxpagesize = (1ULL << spapr->eff.caps[SPAPR_CAP_HPT_MPS]);
> 
> s/SPAPR_CAP_HPT_MPS/SPAPR_CAP_HPT_MAXPAGESIZE

Fixed.

> > +
> > +    if (!kvmppc_hpt_needs_host_contiguous_pages()) {
> > +        return;
> > +    }
> > +
> > +    if (maxpagesize > pagesize) {
> > +        error_setg(errp,
> > +                   "Can't support %"HWADDR_PRIu" kiB guest pages with %"
> > +                   HWADDR_PRIu" kiB host pages with this KVM implementation",
> > +                   maxpagesize >> 10, pagesize >> 10);
> > +    }
> > +}
> > +
> >  static void cap_hpt_maxpagesize_apply(sPAPRMachineState *spapr,
> >                                        uint8_t val, Error **errp)
> >  {
> > @@ -312,6 +330,8 @@ static void cap_hpt_maxpagesize_apply(sPAPRMachineState *spapr,
> >      } else if (val < 16) {
> >          warn_report("Many guests require at least 64kiB hpt-max-page-size");
> >      }
> > +
> > +    spapr_check_pagesize(spapr, qemu_getrampagesize(), errp);
> 
> Even if in this precise case QEMU will always exit gracefully since
> errp == &error_fatal, passing errp several times is a fragile pattern.
> It may cause a crash if *errp was already allocated.
> 
> Maybe use a local_err variable and error_propagate() or at least return
> in the (val < 12) block above.

Actually, just a return; after the first error should be sufficient.
I've put that in.
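
Roughly, the early-return shape looks like the standalone sketch below.  It
is an illustration under assumptions, not the updated patch: ErrMsg and
set_error() are hypothetical, simplified stand-ins for Error and
error_setg(), and host_shift is a made-up parameter so that both checks can
be exercised without KVM.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, minimal error slot: NULL means "no error yet". */
typedef const char *ErrMsg;

/* Stand-in for error_setg(): refuses to overwrite an existing error. */
static void set_error(ErrMsg *errp, const char *msg)
{
    assert(*errp == NULL);   /* mimics the "error set twice" failure mode */
    *errp = msg;
}

/* Second check: the guest maximum must not exceed the host backing size. */
static void check_backing(unsigned max_shift, unsigned host_shift,
                          ErrMsg *errp)
{
    if (max_shift > host_shift) {
        set_error(errp, "guest pages larger than host backing pages");
    }
}

static void apply_maxpagesize(unsigned max_shift, unsigned host_shift,
                              ErrMsg *errp)
{
    if (max_shift < 12) {
        set_error(errp, "Require at least 4kiB hpt-max-page-size");
        return;              /* stop here so errp is never written twice */
    }

    check_backing(max_shift, host_shift, errp);
}
```

Without the return, a call such as apply_maxpagesize(10, 9, ...) would
reach the second set_error() with an error already stored.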

> 
> Rest looks good. With the two issues addressed:
> 
> Reviewed-by: Greg Kurz <groug@kaod.org>
> 
> >  }
> >  
> >  sPAPRCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
> > diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> > index c97593d032..75e2cf2687 100644
> > --- a/include/hw/ppc/spapr.h
> > +++ b/include/hw/ppc/spapr.h
> > @@ -806,4 +806,7 @@ void spapr_caps_cpu_apply(sPAPRMachineState *spapr, PowerPCCPU *cpu);
> >  void spapr_caps_add_properties(sPAPRMachineClass *smc, Error **errp);
> >  int spapr_caps_post_migration(sPAPRMachineState *spapr);
> >  
> > +void spapr_check_pagesize(sPAPRMachineState *spapr, hwaddr pagesize,
> > +                          Error **errp);
> > +
> >  #endif /* HW_SPAPR_H */
> > diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
> > index 50b5d01432..9cfbd388ad 100644
> > --- a/target/ppc/kvm.c
> > +++ b/target/ppc/kvm.c
> > @@ -500,26 +500,12 @@ static void kvm_fixup_page_sizes(PowerPCCPU *cpu)
> >          cpu->hash64_opts->flags &= ~PPC_HASH64_1TSEG;
> >      }
> >  }
> > -
> > -bool kvmppc_is_mem_backend_page_size_ok(const char *obj_path)
> > -{
> > -    Object *mem_obj = object_resolve_path(obj_path, NULL);
> > -    long pagesize = host_memory_backend_pagesize(MEMORY_BACKEND(mem_obj));
> > -
> > -    return pagesize >= max_cpu_page_size;
> > -}
> > -
> >  #else /* defined (TARGET_PPC64) */
> >  
> >  static inline void kvm_fixup_page_sizes(PowerPCCPU *cpu)
> >  {
> >  }
> >  
> > -bool kvmppc_is_mem_backend_page_size_ok(const char *obj_path)
> > -{
> > -    return true;
> > -}
> > -
> >  #endif /* !defined (TARGET_PPC64) */
> >  
> >  unsigned long kvm_arch_vcpu_id(CPUState *cpu)
> > diff --git a/target/ppc/kvm_ppc.h b/target/ppc/kvm_ppc.h
> > index a7ddb8a5d6..443fca0a4e 100644
> > --- a/target/ppc/kvm_ppc.h
> > +++ b/target/ppc/kvm_ppc.h
> > @@ -71,7 +71,6 @@ int kvmppc_resize_hpt_commit(PowerPCCPU *cpu, target_ulong flags, int shift);
> >  bool kvmppc_pvr_workaround_required(PowerPCCPU *cpu);
> >  
> >  bool kvmppc_hpt_needs_host_contiguous_pages(void);
> > -bool kvmppc_is_mem_backend_page_size_ok(const char *obj_path);
> >  
> >  #else
> >  
> > @@ -228,11 +227,6 @@ static inline bool kvmppc_hpt_needs_host_contiguous_pages(void)
> >      return false;
> >  }
> >  
> > -static inline bool kvmppc_is_mem_backend_page_size_ok(const char *obj_path)
> > -{
> > -    return true;
> > -}
> > -
> >  static inline bool kvmppc_has_cap_spapr_vfio(void)
> >  {
> >      return false;
> 


* Re: [Qemu-devel] [PATCH 7/9] target/ppc: Add ppc_hash64_filter_pagesizes()
  2018-06-18  6:36 ` [Qemu-devel] [PATCH 7/9] target/ppc: Add ppc_hash64_filter_pagesizes() David Gibson
  2018-06-21  6:38   ` Cédric Le Goater
@ 2018-06-21 11:48   ` Greg Kurz
  1 sibling, 0 replies; 43+ messages in thread
From: Greg Kurz @ 2018-06-21 11:48 UTC (permalink / raw)
  To: David Gibson; +Cc: abologna, clg, qemu-ppc, qemu-devel, aik

On Mon, 18 Jun 2018 16:36:04 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> The paravirtualized PAPR platform sometimes needs to restrict the guest to
> using only some of the page sizes actually supported by the host's MMU.
> At the moment this is handled in KVM specific code, but for consistency we
> want to apply the same limitations to all accelerators.
> 
> This makes a start on this by providing a helper function in the cpu code
> to allow platform code to remove some of the cpu's page size definitions
> via a caller supplied callback.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---

Reviewed-by: Greg Kurz <groug@kaod.org>

>  target/ppc/mmu-hash64.c | 59 +++++++++++++++++++++++++++++++++++++++++
>  target/ppc/mmu-hash64.h |  3 +++
>  2 files changed, 62 insertions(+)
> 
> diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
> index aa200cba4c..276d9015e7 100644
> --- a/target/ppc/mmu-hash64.c
> +++ b/target/ppc/mmu-hash64.c
> @@ -1166,3 +1166,62 @@ const PPCHash64Options ppc_hash64_opts_POWER7 = {
>          },
>      }
>  };
> +
> +void ppc_hash64_filter_pagesizes(PowerPCCPU *cpu,
> +                                 bool (*cb)(void *, uint32_t, uint32_t),
> +                                 void *opaque)
> +{
> +    PPCHash64Options *opts = cpu->hash64_opts;
> +    int i;
> +    int n = 0;
> +    bool ci_largepage = false;
> +
> +    assert(opts);
> +
> +    n = 0;
> +    for (i = 0; i < ARRAY_SIZE(opts->sps); i++) {
> +        PPCHash64SegmentPageSizes *sps = &opts->sps[i];
> +        int j;
> +        int m = 0;
> +
> +        assert(n <= i);
> +
> +        if (!sps->page_shift) {
> +            break;
> +        }
> +
> +        for (j = 0; j < ARRAY_SIZE(sps->enc); j++) {
> +            PPCHash64PageSize *ps = &sps->enc[j];
> +
> +            assert(m <= j);
> +            if (!ps->page_shift) {
> +                break;
> +            }
> +
> +            if (cb(opaque, sps->page_shift, ps->page_shift)) {
> +                if (ps->page_shift >= 16) {
> +                    ci_largepage = true;
> +                }
> +                sps->enc[m++] = *ps;
> +            }
> +        }
> +
> +        /* Clear rest of the row */
> +        for (j = m; j < ARRAY_SIZE(sps->enc); j++) {
> +            memset(&sps->enc[j], 0, sizeof(sps->enc[j]));
> +        }
> +
> +        if (m) {
> +            n++;
> +        }
> +    }
> +
> +    /* Clear the rest of the table */
> +    for (i = n; i < ARRAY_SIZE(opts->sps); i++) {
> +        memset(&opts->sps[i], 0, sizeof(opts->sps[i]));
> +    }
> +
> +    if (!ci_largepage) {
> +        opts->flags &= ~PPC_HASH64_CI_LARGEPAGE;
> +    }
> +}
> diff --git a/target/ppc/mmu-hash64.h b/target/ppc/mmu-hash64.h
> index 53dcec5b93..f11efc9cbc 100644
> --- a/target/ppc/mmu-hash64.h
> +++ b/target/ppc/mmu-hash64.h
> @@ -20,6 +20,9 @@ unsigned ppc_hash64_hpte_page_shift_noslb(PowerPCCPU *cpu,
>  void ppc_store_lpcr(PowerPCCPU *cpu, target_ulong val);
>  void ppc_hash64_init(PowerPCCPU *cpu);
>  void ppc_hash64_finalize(PowerPCCPU *cpu);
> +void ppc_hash64_filter_pagesizes(PowerPCCPU *cpu,
> +                                 bool (*cb)(void *, uint32_t, uint32_t),
> +                                 void *opaque);
>  #endif
>  
>  /*

* Re: [Qemu-devel] [PATCH 8/9] spapr: Limit available pagesizes to provide a consistent guest environment
  2018-06-21  7:01   ` Cédric Le Goater
@ 2018-06-21 11:52     ` David Gibson
  2018-06-21 12:50       ` Cédric Le Goater
  0 siblings, 1 reply; 43+ messages in thread
From: David Gibson @ 2018-06-21 11:52 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: groug, abologna, qemu-ppc, qemu-devel, aik

On Thu, Jun 21, 2018 at 09:01:27AM +0200, Cédric Le Goater wrote:
> On 06/18/2018 08:36 AM, David Gibson wrote:
> > KVM HV has some limitations (deriving from the hardware) that mean not all
> > host-cpu supported pagesizes may be usable in the guest.  At present this
> > means that KVM guests and TCG guests may see different available page sizes
> > even if they notionally have the same vcpu model.  This is confusing and
> > also prevents migration between TCG and KVM.
> > 
> > This patch makes the environment consistent by always allowing the same set
> > of pagesizes.  Since we can't remove the KVM limitations, we do this by
> > always applying the same limitations it has, even to TCG guests.
> > 
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> >
> > ---
> >  hw/ppc/spapr_caps.c | 33 +++++++++++++++++++++++++++++++++
> >  1 file changed, 33 insertions(+)
> > 
> > diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
> > index 9fc739b3f5..0584c7c6ab 100644
> > --- a/hw/ppc/spapr_caps.c
> > +++ b/hw/ppc/spapr_caps.c
> > @@ -334,6 +334,38 @@ static void cap_hpt_maxpagesize_apply(sPAPRMachineState *spapr,
> >      spapr_check_pagesize(spapr, qemu_getrampagesize(), errp);
> >  }
> >  
> > +static bool spapr_pagesize_cb(void *opaque, uint32_t seg_pshift, uint32_t pshift)
> > +{
> > +    unsigned maxshift = *((unsigned *)opaque);
> > +
> > +    assert(pshift >= seg_pshift);
> 
> you could check that elsewhere.

Um.. I'm not sure what you're getting at.

> > +    /* Don't allow the guest to use pages bigger than the configured
> > +     * maximum size */
> > +    if (pshift > maxshift) {
> > +        return false;
> > +    }
> > +
> > +    /* For whatever reason, KVM doesn't allow multiple pagesizes
> > +     * within a segment, *except* for the case of 16M pages in a 4k or
> > +     * 64k segment.  Always exclude other cases, so that TCG and KVM
> > +     * guests see a consistent environment */
> > +    if ((pshift != seg_pshift) && (pshift != 24)) {
> > +        return false;
> > +    }

Note the stanza above, I'll refer to it below.

> > +
> > +    return true;
> > +}
> 
> So, do we really need ppc_hash64_filter_pagesizes() to have a callback ? 

I agree that it seems overly involved, but it was the best way I could
see to logically separate the TCG / softmmu specific logic from the
spapr specific logic.

> It seems that we only use the routine once in the patchset and that the
> only thing we need to check is 'maxshift'.

Not quite.  An earlier draft had this routine just take a max page
size and clamp accordingly.  But that failed when I wrote the code to
check against the KVM capabilities, because KVM also excludes some
other pagesize combinations.  That's what the stanza I point out above
is about.
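
For illustration, here is a minimal standalone sketch of that two-part
test. It is not the QEMU code itself: pagesize_allowed() is a
hypothetical name, and the maximum shift is passed directly rather than
through an opaque pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical standalone version of the callback's logic. */
static bool pagesize_allowed(unsigned maxshift,
                             uint32_t seg_pshift, uint32_t pshift)
{
    /* A page can't be smaller than its segment's base page size. */
    assert(pshift >= seg_pshift);

    /* Clamp to the configured maximum page size. */
    if (pshift > maxshift) {
        return false;
    }

    /* Mixed sizes within a segment are only kept for 16M pages. */
    if (pshift != seg_pshift && pshift != 24) {
        return false;
    }

    return true;
}
```

With maxshift = 24, a 64k page (shift 16) in a 4k segment (shift 12)
passes the size clamp but is rejected by the second test, which a plain
maximum could not express.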

> Do you envision other usage of the routine ?

Not really, no.

> 
> Thanks,
> 
> C.
> 
> > +static void cap_hpt_maxpagesize_cpu_apply(sPAPRMachineState *spapr,
> > +                                          PowerPCCPU *cpu,
> > +                                          uint8_t val, Error **errp)
> > +{
> > +    unsigned maxshift = val;
> > +
> > +    ppc_hash64_filter_pagesizes(cpu, spapr_pagesize_cb, &maxshift);
> > +}
> > +
> >  sPAPRCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
> >      [SPAPR_CAP_HTM] = {
> >          .name = "htm",
> > @@ -401,6 +433,7 @@ sPAPRCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
> >          .set = spapr_cap_set_pagesize,
> >          .type = "int",
> >          .apply = cap_hpt_maxpagesize_apply,
> > +        .cpu_apply = cap_hpt_maxpagesize_cpu_apply,
> >      },
> >  };
> >  
> > 
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [Qemu-devel] [PATCH 9/9] spapr: Don't rewrite mmu capabilities in KVM mode
  2018-06-21  7:53   ` Cédric Le Goater
@ 2018-06-21 12:01     ` David Gibson
  2018-06-21 12:51       ` Cédric Le Goater
  0 siblings, 1 reply; 43+ messages in thread
From: David Gibson @ 2018-06-21 12:01 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: groug, abologna, qemu-ppc, qemu-devel, aik

On Thu, Jun 21, 2018 at 09:53:10AM +0200, Cédric Le Goater wrote:
> On 06/18/2018 08:36 AM, David Gibson wrote:
> > Currently during KVM initialization on POWER, kvm_fixup_page_sizes()
> > rewrites a bunch of information in the cpu state to reflect the
> > capabilities of the host MMU and KVM.  This overwrites the information
> > that's already there reflecting how the TCG implementation of the MMU will
> > operate.
> > 
> > This means that we can get guest-visibly different behaviour between KVM
> > and TCG (and between different KVM implementations).  That's bad.  It also
> > prevents migration between KVM and TCG.
> > 
> > The pseries machine type now has filtering of the pagesizes it allows the
> > guest to use which means it can present a consistent model of the MMU
> > across all accelerators.
> > 
> > So, we can now replace kvm_fixup_page_sizes() with kvm_check_mmu() which
> > merely verifies that the expected cpu model can be faithfully handled by
> > KVM, rather than updating the cpu model to match KVM.
> > 
> > We call kvm_check_mmu() from the spapr cpu reset code.  This is a hack:
> 
> I think this is fine but we are still doing some MMU checks in 
> kvm_arch_init_vcpu() we might want to do in a single routine.

Uh.. sort of.  We do do some messing around for BookE 2.06.  That
probably should move into the check_mmu routine.  Actually, it
probably needs to be turned around to give consistent behaviour
between TCG and KVM.  But in any case that'll require more looking at
how BookE works, so it's a project for another day.

The other check is about transactional memory and doesn't actually
have to do with the MMU at all.  It's keyed off env->mmu_model, but
that's an abuse, we should be doing a compat check instead.  Yes,
something to clean up, but not really in scope here.

> 
> > conceptually it makes more sense where fixup_page_sizes() was - in the KVM
> > cpu init path.  However, doing that would require moving the platform's
> > pagesize filtering much earlier, which would require a lot of work making
> > further adjustments.  There wouldn't be a lot of concrete point to doing
> > that, since the only KVM implementation which has the awkward MMU
> > restrictions is KVM HV, which can only work with an spapr guest anyway.
> > 
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> >  hw/ppc/spapr_caps.c     |   2 +-
> >  hw/ppc/spapr_cpu_core.c |   2 +
> >  target/ppc/kvm.c        | 133 ++++++++++++++++++++--------------------
> >  target/ppc/kvm_ppc.h    |   5 ++
> >  4 files changed, 73 insertions(+), 69 deletions(-)
> > 
> > diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
> > index 0584c7c6ab..bc89a4cd70 100644
> > --- a/hw/ppc/spapr_caps.c
> > +++ b/hw/ppc/spapr_caps.c
> > @@ -308,7 +308,7 @@ static void cap_safe_indirect_branch_apply(sPAPRMachineState *spapr,
> >  void spapr_check_pagesize(sPAPRMachineState *spapr, hwaddr pagesize,
> >                            Error **errp)
> >  {
> > -    hwaddr maxpagesize = (1ULL << spapr->eff.caps[SPAPR_CAP_HPT_MPS]);
> > +    hwaddr maxpagesize = (1ULL << spapr->eff.caps[SPAPR_CAP_HPT_MAXPAGESIZE]);
> 
> There might be some renames I missed. no big issue.

Looks like this fixup hunk ended up in the wrong patch.  I've folded
it into the right place now.

> 
> >  
> >      if (!kvmppc_hpt_needs_host_contiguous_pages()) {
> >          return;
> > diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
> > index 324623190d..4e8fa28796 100644
> > --- a/hw/ppc/spapr_cpu_core.c
> > +++ b/hw/ppc/spapr_cpu_core.c
> > @@ -78,6 +78,8 @@ static void spapr_cpu_reset(void *opaque)
> >      spapr_cpu->dtl_size = 0;
> >  
> >      spapr_caps_cpu_apply(SPAPR_MACHINE(qdev_get_machine()), cpu);
> > +
> > +    kvm_check_mmu(cpu, &error_fatal);
> >  }
> >  
> >  void spapr_cpu_set_entry_state(PowerPCCPU *cpu, target_ulong nip, target_ulong r3)
> > diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
> > index 9cfbd388ad..b386335014 100644
> > --- a/target/ppc/kvm.c
> > +++ b/target/ppc/kvm.c
> > @@ -419,93 +419,93 @@ bool kvmppc_hpt_needs_host_contiguous_pages(void)
> >      return !!(smmu_info.flags & KVM_PPC_PAGE_SIZES_REAL);
> >  }
> >  
> > -static bool kvm_valid_page_size(uint32_t flags, long rampgsize, uint32_t shift)
> > +void kvm_check_mmu(PowerPCCPU *cpu, Error **errp)
> >  {
> > -    if (!kvmppc_hpt_needs_host_contiguous_pages()) {
> > -        return true;
> > -    }
> > -
> > -    return (1ul << shift) <= rampgsize;
> > -}
> > -
> > -static long max_cpu_page_size;
> > -
> > -static void kvm_fixup_page_sizes(PowerPCCPU *cpu)
> > -{
> > -    static struct kvm_ppc_smmu_info smmu_info;
> > -    static bool has_smmu_info;
> > -    CPUPPCState *env = &cpu->env;
> > +    struct kvm_ppc_smmu_info smmu_info;
> >      int iq, ik, jq, jk;
> >  
> > -    /* We only handle page sizes for 64-bit server guests for now */
> > -    if (!(env->mmu_model & POWERPC_MMU_64)) {
> > +    /* For now, we only have anything to check on hash64 MMUs */
> > +    if (!cpu->hash64_opts || !kvm_enabled()) {
> >          return;
> >      }
> >  
> > -    /* Collect MMU info from kernel if not already */
> > -    if (!has_smmu_info) {
> > -        kvm_get_smmu_info(cpu, &smmu_info);
> > -        has_smmu_info = true;
> > -    }
> > +    kvm_get_smmu_info(cpu, &smmu_info);
> 
> kvm_ppc_smmu_info and PPCHash64Options really are dual objects, and the 
> routine below checks that they are in sync. Pity that we have to maintain
> two different structs. I guess we can't do differently.

No, and I don't think it really makes sense to try.  kvm_ppc_smmu_info
is about the host+KVM capabilities, PPCHash64Options is about the
guest capabilities.  The guest options need to be supportable by the
host, but they *don't* need to be identical (and no longer will be
after this series).
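
As a rough sketch of "supportable but not identical", the check amounts
to a subset test: every guest entry must be found on the host with the
same encoding, while the host is free to offer more.  The struct and
function names below are hypothetical simplifications, not the real
kvm_ppc_smmu_info or PPCHash64Options layouts, and the encoding values
used with them are arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one page size entry. */
struct pgsz {
    uint32_t page_shift;
    uint32_t enc;
};

/*
 * Return true if every guest entry has a host entry with the same
 * page shift and the same encoding.  The host may offer extra sizes.
 */
static bool host_covers_guest(const struct pgsz *guest, size_t nguest,
                              const struct pgsz *host, size_t nhost)
{
    assert(guest && host);

    for (size_t g = 0; g < nguest; g++) {
        size_t h;

        /* Look for a host entry with the same page shift. */
        for (h = 0; h < nhost; h++) {
            if (guest[g].page_shift == host[h].page_shift) {
                break;
            }
        }

        /* Missing on the host, or present with a different encoding. */
        if (h == nhost || guest[g].enc != host[h].enc) {
            return false;
        }
    }
    return true;
}
```

A guest table that is a strict subset of the host table passes; a guest
entry the host lacks, or one with a mismatched encoding, fails.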


> > -    if (!max_cpu_page_size) {
> > -        max_cpu_page_size = qemu_getrampagesize();
> > +    if (ppc_hash64_has(cpu, PPC_HASH64_1TSEG)
> > +        && !(smmu_info.flags & KVM_PPC_1T_SEGMENTS)) {
> > +        error_setg(errp,
> > +                   "KVM does not support 1TiB segments which guest expects");
> > +        return;
> >      }
> >  
> > -    /* Convert to QEMU form */
> > -    memset(cpu->hash64_opts->sps, 0, sizeof(*cpu->hash64_opts->sps));
> > -
> > -    /* If we have HV KVM, we need to forbid CI large pages if our
> > -     * host page size is smaller than 64K.
> > -     */
> > -    if (kvmppc_hpt_needs_host_contiguous_pages()) {
> > -        if (getpagesize() >= 0x10000) {
> > -            cpu->hash64_opts->flags |= PPC_HASH64_CI_LARGEPAGE;
> > -        } else {
> > -            cpu->hash64_opts->flags &= ~PPC_HASH64_CI_LARGEPAGE;
> > -        }
> > +    if (smmu_info.slb_size < cpu->hash64_opts->slb_size) {
> > +        error_setg(errp, "KVM only supports %u SLB entries, but guest needs %u",
> > +                   smmu_info.slb_size, cpu->hash64_opts->slb_size);
> > +        return;
> >      }
> >
> 
> The routine below is doing a simple PPCHash64SegmentPageSizes compare. 
> Is it possible to move it in the mmu-hash64.c file ? It means introducing
> kvm notions under mmu-hash64.c

Yes it would, which is why I didn't put it in mmu-hash64.c.  Moreover
it would involve including KVM specific struct definitions from kernel
arch-specific header files into files that don't expect to use kernel
arch specific header files.

> 
> >      /*
> > -     * XXX This loop should be an entry wide AND of the capabilities that
> > -     *     the selected CPU has with the capabilities that KVM supports.
> > +     * Verify that every pagesize supported by the cpu model is
> > +     * supported by KVM with the same encodings
> >       */
> > -    for (ik = iq = 0; ik < KVM_PPC_PAGE_SIZES_MAX_SZ; ik++) {
> > +    for (iq = 0; iq < ARRAY_SIZE(cpu->hash64_opts->sps); iq++) {
> >          PPCHash64SegmentPageSizes *qsps = &cpu->hash64_opts->sps[iq];
> > -        struct kvm_ppc_one_seg_page_size *ksps = &smmu_info.sps[ik];
> > +        struct kvm_ppc_one_seg_page_size *ksps;
> >  
> > -        if (!kvm_valid_page_size(smmu_info.flags, max_cpu_page_size,
> > -                                 ksps->page_shift)) {
> > -            continue;
> > -        }
> > -        qsps->page_shift = ksps->page_shift;
> > -        qsps->slb_enc = ksps->slb_enc;
> > -        for (jk = jq = 0; jk < KVM_PPC_PAGE_SIZES_MAX_SZ; jk++) {
> > -            if (!kvm_valid_page_size(smmu_info.flags, max_cpu_page_size,
> > -                                     ksps->enc[jk].page_shift)) {
> > -                continue;
> > -            }
> > -            qsps->enc[jq].page_shift = ksps->enc[jk].page_shift;
> > -            qsps->enc[jq].pte_enc = ksps->enc[jk].pte_enc;
> > -            if (++jq >= PPC_PAGE_SIZES_MAX_SZ) {
> > +        for (ik = 0; ik < ARRAY_SIZE(smmu_info.sps); ik++) {
> > +            if (qsps->page_shift == smmu_info.sps[ik].page_shift) {
> >                  break;
> >              }
> >          }
> > -        if (++iq >= PPC_PAGE_SIZES_MAX_SZ) {
> > -            break;
> > +        if (ik >= ARRAY_SIZE(smmu_info.sps)) {
> > +            error_setg(errp, "KVM doesn't support for base page shift %u",
> > +                       qsps->page_shift);
> > +            return;
> > +        }
> > +
> > +        ksps = &smmu_info.sps[ik];
> > +        if (ksps->slb_enc != qsps->slb_enc) {
> > +            error_setg(errp,
> > +"KVM uses SLB encoding 0x%x for page shift %u, but guest expects 0x%x",
> > +                       ksps->slb_enc, ksps->page_shift, qsps->slb_enc);
> > +            return;
> > +        }
> > +
> > +        for (jq = 0; jq < ARRAY_SIZE(qsps->enc); jq++) {
> > +            for (jk = 0; jk < ARRAY_SIZE(ksps->enc); jk++) {
> > +                if (qsps->enc[jq].page_shift == ksps->enc[jk].page_shift) {
> > +                    break;
> > +                }
> > +            }
> > +
> > +            if (jk >= ARRAY_SIZE(ksps->enc)) {
> > +                error_setg(errp, "KVM doesn't support page shift %u/%u",
> > +                           qsps->enc[jq].page_shift, qsps->page_shift);
> > +                return;
> > +            }
> > +            if (qsps->enc[jq].pte_enc != ksps->enc[jk].pte_enc) {
> > +                error_setg(errp,
> > +"KVM uses PTE encoding 0x%x for page shift %u/%u, but guest expects 0x%x",
> > +                           ksps->enc[jk].pte_enc, qsps->enc[jq].page_shift,
> > +                           qsps->page_shift, qsps->enc[jq].pte_enc);
> > +                return;
> > +            }
> >          }
> >      }
> > -    cpu->hash64_opts->slb_size = smmu_info.slb_size;
> > -    if (!(smmu_info.flags & KVM_PPC_1T_SEGMENTS)) {
> > -        cpu->hash64_opts->flags &= ~PPC_HASH64_1TSEG;
> > -    }
> > -}
> > -#else /* defined (TARGET_PPC64) */
> >  
> > -static inline void kvm_fixup_page_sizes(PowerPCCPU *cpu)
> > -{
> > +    if (ppc_hash64_has(cpu, PPC_HASH64_CI_LARGEPAGE)) {
> > +        /* Mostly what guest pagesizes we can use are related to the
> > +         * host pages used to map guest RAM, which is handled in the
> > +         * platform code. Cache-Inhibited largepages (64k) however are
> > +         * used for I/O, so if they're mapped to the host at all it
> > +         * will be a normal mapping, not a special hugepage one used
> > +         * for RAM. */
> > +        if (getpagesize() < 0x10000) {
> > +            error_setg(errp,
> > +"KVM can't supply 64kiB CI pages, which guest expects\n");
> > +        }
> > +    }
> >  }
> > -
> >  #endif /* !defined (TARGET_PPC64) */
> >  
> >  unsigned long kvm_arch_vcpu_id(CPUState *cpu)
> > @@ -551,9 +551,6 @@ int kvm_arch_init_vcpu(CPUState *cs)
> >      CPUPPCState *cenv = &cpu->env;
> >      int ret;
> >  
> > -    /* Gather server mmu info from KVM and update the CPU state */
> > -    kvm_fixup_page_sizes(cpu);
> > -
> >      /* Synchronize sregs with kvm */
> >      ret = kvm_arch_sync_sregs(cpu);
> >      if (ret) {
> > diff --git a/target/ppc/kvm_ppc.h b/target/ppc/kvm_ppc.h
> > index 443fca0a4e..657582bb32 100644
> > --- a/target/ppc/kvm_ppc.h
> > +++ b/target/ppc/kvm_ppc.h
> > @@ -71,6 +71,7 @@ int kvmppc_resize_hpt_commit(PowerPCCPU *cpu, target_ulong flags, int shift);
> >  bool kvmppc_pvr_workaround_required(PowerPCCPU *cpu);
> >  
> >  bool kvmppc_hpt_needs_host_contiguous_pages(void);
> > +void kvm_check_mmu(PowerPCCPU *cpu, Error **errp);
> >  
> >  #else
> >  
> > @@ -227,6 +228,10 @@ static inline bool kvmppc_hpt_needs_host_contiguous_pages(void)
> >      return false;
> >  }
> >  
> > +static inline void kvm_check_mmu(PowerPCCPU *cpu, Error **errp)
> > +{
> > +}
> > +
> >  static inline bool kvmppc_has_cap_spapr_vfio(void)
> >  {
> >      return false;
> > 
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [Qemu-devel] [PATCH 8/9] spapr: Limit available pagesizes to provide a consistent guest environment
  2018-06-18  6:36 ` [Qemu-devel] [PATCH 8/9] spapr: Limit available pagesizes to provide a consistent guest environment David Gibson
  2018-06-21  7:01   ` Cédric Le Goater
@ 2018-06-21 12:24   ` Greg Kurz
  2018-06-21 14:01     ` David Gibson
  1 sibling, 1 reply; 43+ messages in thread
From: Greg Kurz @ 2018-06-21 12:24 UTC (permalink / raw)
  To: David Gibson; +Cc: abologna, clg, qemu-ppc, qemu-devel, aik

On Mon, 18 Jun 2018 16:36:05 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> KVM HV has some limitations (deriving from the hardware) that mean not all
> host-cpu supported pagesizes may be usable in the guest.  At present this
> means that KVM guests and TCG guests may see different available page sizes
> even if they notionally have the same vcpu model.  This is confusing and
> also prevents migration between TCG and KVM.
> 
> This patch makes the environment consistent by always allowing the same set
> of pagesizes.  Since we can't remove the KVM limitations, we do this by
> always applying the same limitations it has, even to TCG guests.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  hw/ppc/spapr_caps.c | 33 +++++++++++++++++++++++++++++++++
>  1 file changed, 33 insertions(+)
> 
> diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
> index 9fc739b3f5..0584c7c6ab 100644
> --- a/hw/ppc/spapr_caps.c
> +++ b/hw/ppc/spapr_caps.c
> @@ -334,6 +334,38 @@ static void cap_hpt_maxpagesize_apply(sPAPRMachineState *spapr,
>      spapr_check_pagesize(spapr, qemu_getrampagesize(), errp);
>  }
>  
> +static bool spapr_pagesize_cb(void *opaque, uint32_t seg_pshift, uint32_t pshift)
> +{
> +    unsigned maxshift = *((unsigned *)opaque);
> +
> +    assert(pshift >= seg_pshift);
> +
> +    /* Don't allow the guest to use pages bigger than the configured
> +     * maximum size */
> +    if (pshift > maxshift) {
> +        return false;
> +    }
> +
> +    /* For whatever reason, KVM doesn't allow multiple pagesizes
> +     * within a segment, *except* for the case of 16M pages in a 4k or
> +     * 64k segment.  Always exclude other cases, so that TCG and KVM
> +     * guests see a consistent environment */

Unless I'm missing something, I don't see how we could get "other cases"
with TCG, at least with the current content of ppc_hash64_opts_POWER7.

> +    if ((pshift != seg_pshift) && (pshift != 24)) {
> +        return false;
> +    }
> +
> +    return true;
> +}
> +
> +static void cap_hpt_maxpagesize_cpu_apply(sPAPRMachineState *spapr,
> +                                          PowerPCCPU *cpu,
> +                                          uint8_t val, Error **errp)
> +{
> +    unsigned maxshift = val;
> +
> +    ppc_hash64_filter_pagesizes(cpu, spapr_pagesize_cb, &maxshift);
> +}
> +
>  sPAPRCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
>      [SPAPR_CAP_HTM] = {
>          .name = "htm",
> @@ -401,6 +433,7 @@ sPAPRCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
>          .set = spapr_cap_set_pagesize,
>          .type = "int",
>          .apply = cap_hpt_maxpagesize_apply,
> +        .cpu_apply = cap_hpt_maxpagesize_cpu_apply,
>      },
>  };
>  

* Re: [Qemu-devel] [PATCH 8/9] spapr: Limit available pagesizes to provide a consistent guest environment
  2018-06-21 11:52     ` David Gibson
@ 2018-06-21 12:50       ` Cédric Le Goater
  2018-06-21 13:58         ` David Gibson
  0 siblings, 1 reply; 43+ messages in thread
From: Cédric Le Goater @ 2018-06-21 12:50 UTC (permalink / raw)
  To: David Gibson; +Cc: groug, abologna, qemu-ppc, qemu-devel, aik

On 06/21/2018 01:52 PM, David Gibson wrote:
> On Thu, Jun 21, 2018 at 09:01:27AM +0200, Cédric Le Goater wrote:
>> On 06/18/2018 08:36 AM, David Gibson wrote:
>>> KVM HV has some limitations (deriving from the hardware) that mean not all
>>> host-cpu supported pagesizes may be usable in the guest.  At present this
>>> means that KVM guests and TCG guests may see different available page sizes
>>> even if they notionally have the same vcpu model.  This is confusing and
>>> also prevents migration between TCG and KVM.
>>>
>>> This patch makes the environment consistent by always allowing the same set
>>> of pagesizes.  Since we can't remove the KVM limitations, we do this by
>>> always applying the same limitations it has, even to TCG guests.
>>>
>>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
>>>
>>> ---
>>>  hw/ppc/spapr_caps.c | 33 +++++++++++++++++++++++++++++++++
>>>  1 file changed, 33 insertions(+)
>>>
>>> diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
>>> index 9fc739b3f5..0584c7c6ab 100644
>>> --- a/hw/ppc/spapr_caps.c
>>> +++ b/hw/ppc/spapr_caps.c
>>> @@ -334,6 +334,38 @@ static void cap_hpt_maxpagesize_apply(sPAPRMachineState *spapr,
>>>      spapr_check_pagesize(spapr, qemu_getrampagesize(), errp);
>>>  }
>>>  
>>> +static bool spapr_pagesize_cb(void *opaque, uint32_t seg_pshift, uint32_t pshift)
>>> +{
>>> +    unsigned maxshift = *((unsigned *)opaque);
>>> +
>>> +    assert(pshift >= seg_pshift);
>>
>> you could check that elsewhere.
> 
> Um.. I'm not sure what you're getting at.

you could put the assert in ppc_hash64_filter_pagesizes(), that is where
the parameters are coming from.
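
A minimal sketch of that idea, assuming a simplified row layout (struct
seg_row, MAX_ENC, filter_row() and max_shift_cb() are hypothetical names,
not the QEMU definitions): the invariant is asserted once in the filter
loop, so callbacks no longer need to repeat it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_ENC 4  /* hypothetical row width */

/* Hypothetical, simplified stand-in for one segment's row. */
struct seg_row {
    uint32_t page_shift;          /* base page shift of the segment */
    uint32_t enc_shift[MAX_ENC];  /* actual page shifts, 0-terminated */
};

typedef bool (*pgsz_cb)(void *opaque, uint32_t seg_pshift, uint32_t pshift);

/* Keep only the entries the callback accepts; return how many remain. */
static size_t filter_row(struct seg_row *row, pgsz_cb cb, void *opaque)
{
    size_t m = 0;

    for (size_t j = 0; j < MAX_ENC && row->enc_shift[j]; j++) {
        uint32_t pshift = row->enc_shift[j];

        /* Invariant checked once here rather than in every callback. */
        assert(pshift >= row->page_shift);

        if (cb(opaque, row->page_shift, pshift)) {
            row->enc_shift[m++] = pshift;
        }
    }

    /* Clear the rest of the row. */
    memset(&row->enc_shift[m], 0, (MAX_ENC - m) * sizeof(row->enc_shift[0]));
    return m;
}

/* Example callback: accept sizes up to the maximum shift in *opaque. */
static bool max_shift_cb(void *opaque, uint32_t seg_pshift, uint32_t pshift)
{
    (void)seg_pshift;
    return pshift <= *(unsigned *)opaque;
}
```

Filtering a 4k-segment row holding shifts 12, 16 and 24 with a maximum
shift of 16 leaves the 12 and 16 entries and zeroes the rest.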

>>> +    /* Don't allow the guest to use pages bigger than the configured
>>> +     * maximum size */
>>> +    if (pshift > maxshift) {
>>> +        return false;
>>> +    }
>>> +
>>> +    /* For whatever reason, KVM doesn't allow multiple pagesizes
>>> +     * within a segment, *except* for the case of 16M pages in a 4k or
>>> +     * 64k segment.  Always exclude other cases, so that TCG and KVM
>>> +     * guests see a consistent environment */
>>> +    if ((pshift != seg_pshift) && (pshift != 24)) {
>>> +        return false;
>>> +    }
> 
> Note the stanza above, I'll refer to it below.

ok.

> 
>>> +
>>> +    return true;
>>> +}
>>
>> So, do we really need ppc_hash64_filter_pagesizes() to have a callback ? 
> 
> I agree that it seems overly involved, but it was the best way I could
> see to logically separate the TCG / softmmu specific logic from the
> spapr specific logic.

ok. I agree then.

Reviewed-by: Cédric Le Goater <clg@kaod.org>

Thanks,

C.

>> It seems that we only use the routine once in the patchset and that the
>> only thing we need to check is 'maxshift'.
> 
> Not quite.  An earlier draft had this routine just take a max page
> size and clamp accordingly.  But that failed when I wrote the code to
> check against the KVM capabilities, because KVM also excludes some
> other pagesize combinations.  That's what the stanza I point out above
> is about.
> 
>> Do you envision other usage of the routine ?
> 
> Not really, no.
> 
>>
>> Thanks,
>>
>> C.
>>
>>> +static void cap_hpt_maxpagesize_cpu_apply(sPAPRMachineState *spapr,
>>> +                                          PowerPCCPU *cpu,
>>> +                                          uint8_t val, Error **errp)
>>> +{
>>> +    unsigned maxshift = val;
>>> +
>>> +    ppc_hash64_filter_pagesizes(cpu, spapr_pagesize_cb, &maxshift);
>>> +}
>>> +
>>>  sPAPRCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
>>>      [SPAPR_CAP_HTM] = {
>>>          .name = "htm",
>>> @@ -401,6 +433,7 @@ sPAPRCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
>>>          .set = spapr_cap_set_pagesize,
>>>          .type = "int",
>>>          .apply = cap_hpt_maxpagesize_apply,
>>> +        .cpu_apply = cap_hpt_maxpagesize_cpu_apply,
>>>      },
>>>  };
>>>  
>>>
>>
> 

* Re: [Qemu-devel] [PATCH 9/9] spapr: Don't rewrite mmu capabilities in KVM mode
  2018-06-21 12:01     ` David Gibson
@ 2018-06-21 12:51       ` Cédric Le Goater
  0 siblings, 0 replies; 43+ messages in thread
From: Cédric Le Goater @ 2018-06-21 12:51 UTC (permalink / raw)
  To: David Gibson; +Cc: groug, abologna, qemu-ppc, qemu-devel, aik

On 06/21/2018 02:01 PM, David Gibson wrote:
> On Thu, Jun 21, 2018 at 09:53:10AM +0200, Cédric Le Goater wrote:
>> On 06/18/2018 08:36 AM, David Gibson wrote:
>>> Currently during KVM initialization on POWER, kvm_fixup_page_sizes()
>>> rewrites a bunch of information in the cpu state to reflect the
>>> capabilities of the host MMU and KVM.  This overwrites the information
>>> that's already there reflecting how the TCG implementation of the MMU will
>>> operate.
>>>
>>> This means that we can get guest-visibly different behaviour between KVM
>>> and TCG (and between different KVM implementations).  That's bad.  It also
>>> prevents migration between KVM and TCG.
>>>
>>> The pseries machine type now has filtering of the pagesizes it allows the
>>> guest to use which means it can present a consistent model of the MMU
>>> across all accelerators.
>>>
>>> So, we can now replace kvm_fixup_page_sizes() with kvm_check_mmu() which
>>> merely verifies that the expected cpu model can be faithfully handled by
>>> KVM, rather than updating the cpu model to match KVM.
>>>
>>> We call kvm_check_mmu() from the spapr cpu reset code.  This is a hack:
>>
>> I think this is fine but we are still doing some MMU checks in 
>> kvm_arch_init_vcpu() we might want to do in a single routine.
> 
> Uh.. sort of.  We do do some messing around for BookE 2.06.  That
> probably should move into the check_mmu routine.  Actually, it
> probably needs to be turned around to give consistent behaviour
> between TCG and KVM.  But in any case that'll require more looking at
> how BookE works, so it's a project for another day.
> 
> The other check is about transactional memory and doesn't actually
> have to do with the MMU at all.  It's keyed off env->mmu_model, but
> that's an abuse, we should be doing a compat check instead.  Yes,
> something to clean up, but not really in scope here.
> 
>>
>>> conceptually it makes more sense where fixup_page_sizes() was - in the KVM
>>> cpu init path.  However, doing that would require moving the platform's
>>> pagesize filtering much earlier, which would require a lot of work making
>>> further adjustments.  There wouldn't be a lot of concrete point to doing
>>> that, since the only KVM implementation which has the awkward MMU
>>> restrictions is KVM HV, which can only work with an spapr guest anyway.
>>>
>>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
>>> ---
>>>  hw/ppc/spapr_caps.c     |   2 +-
>>>  hw/ppc/spapr_cpu_core.c |   2 +
>>>  target/ppc/kvm.c        | 133 ++++++++++++++++++++--------------------
>>>  target/ppc/kvm_ppc.h    |   5 ++
>>>  4 files changed, 73 insertions(+), 69 deletions(-)
>>>
>>> diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
>>> index 0584c7c6ab..bc89a4cd70 100644
>>> --- a/hw/ppc/spapr_caps.c
>>> +++ b/hw/ppc/spapr_caps.c
>>> @@ -308,7 +308,7 @@ static void cap_safe_indirect_branch_apply(sPAPRMachineState *spapr,
>>>  void spapr_check_pagesize(sPAPRMachineState *spapr, hwaddr pagesize,
>>>                            Error **errp)
>>>  {
>>> -    hwaddr maxpagesize = (1ULL << spapr->eff.caps[SPAPR_CAP_HPT_MPS]);
>>> +    hwaddr maxpagesize = (1ULL << spapr->eff.caps[SPAPR_CAP_HPT_MAXPAGESIZE]);
>>
>> There might be some renames I missed. No big issue.
> 
> Looks like this fixup hunk ended up in the wrong patch.  I've folded
> it into the right place now.
> 
>>
>>>  
>>>      if (!kvmppc_hpt_needs_host_contiguous_pages()) {
>>>          return;
>>> diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
>>> index 324623190d..4e8fa28796 100644
>>> --- a/hw/ppc/spapr_cpu_core.c
>>> +++ b/hw/ppc/spapr_cpu_core.c
>>> @@ -78,6 +78,8 @@ static void spapr_cpu_reset(void *opaque)
>>>      spapr_cpu->dtl_size = 0;
>>>  
>>>      spapr_caps_cpu_apply(SPAPR_MACHINE(qdev_get_machine()), cpu);
>>> +
>>> +    kvm_check_mmu(cpu, &error_fatal);
>>>  }
>>>  
>>>  void spapr_cpu_set_entry_state(PowerPCCPU *cpu, target_ulong nip, target_ulong r3)
>>> diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
>>> index 9cfbd388ad..b386335014 100644
>>> --- a/target/ppc/kvm.c
>>> +++ b/target/ppc/kvm.c
>>> @@ -419,93 +419,93 @@ bool kvmppc_hpt_needs_host_contiguous_pages(void)
>>>      return !!(smmu_info.flags & KVM_PPC_PAGE_SIZES_REAL);
>>>  }
>>>  
>>> -static bool kvm_valid_page_size(uint32_t flags, long rampgsize, uint32_t shift)
>>> +void kvm_check_mmu(PowerPCCPU *cpu, Error **errp)
>>>  {
>>> -    if (!kvmppc_hpt_needs_host_contiguous_pages()) {
>>> -        return true;
>>> -    }
>>> -
>>> -    return (1ul << shift) <= rampgsize;
>>> -}
>>> -
>>> -static long max_cpu_page_size;
>>> -
>>> -static void kvm_fixup_page_sizes(PowerPCCPU *cpu)
>>> -{
>>> -    static struct kvm_ppc_smmu_info smmu_info;
>>> -    static bool has_smmu_info;
>>> -    CPUPPCState *env = &cpu->env;
>>> +    struct kvm_ppc_smmu_info smmu_info;
>>>      int iq, ik, jq, jk;
>>>  
>>> -    /* We only handle page sizes for 64-bit server guests for now */
>>> -    if (!(env->mmu_model & POWERPC_MMU_64)) {
>>> +    /* For now, we only have anything to check on hash64 MMUs */
>>> +    if (!cpu->hash64_opts || !kvm_enabled()) {
>>>          return;
>>>      }
>>>  
>>> -    /* Collect MMU info from kernel if not already */
>>> -    if (!has_smmu_info) {
>>> -        kvm_get_smmu_info(cpu, &smmu_info);
>>> -        has_smmu_info = true;
>>> -    }
>>> +    kvm_get_smmu_info(cpu, &smmu_info);
>>
>> kvm_ppc_smmu_info and PPCHash64Options really are dual objects, and the 
>> routine below checks that they are in sync. Pity that we have to maintain
>> two different structs. I guess we can't do differently.
> 
> No, and I don't think it really makes sense to try.  kvm_ppc_smmu_info
> is about the host+KVM capabilities, PPCHash64Options is about the
> guest capabilities.  The guest options need to be supportable by the
> host, but they *don't* need to be identical (and no longer will be
> after this series).
> 
> 
>>> -    if (!max_cpu_page_size) {
>>> -        max_cpu_page_size = qemu_getrampagesize();
>>> +    if (ppc_hash64_has(cpu, PPC_HASH64_1TSEG)
>>> +        && !(smmu_info.flags & KVM_PPC_1T_SEGMENTS)) {
>>> +        error_setg(errp,
>>> +                   "KVM does not support 1TiB segments which guest expects");
>>> +        return;
>>>      }
>>>  
>>> -    /* Convert to QEMU form */
>>> -    memset(cpu->hash64_opts->sps, 0, sizeof(*cpu->hash64_opts->sps));
>>> -
>>> -    /* If we have HV KVM, we need to forbid CI large pages if our
>>> -     * host page size is smaller than 64K.
>>> -     */
>>> -    if (kvmppc_hpt_needs_host_contiguous_pages()) {
>>> -        if (getpagesize() >= 0x10000) {
>>> -            cpu->hash64_opts->flags |= PPC_HASH64_CI_LARGEPAGE;
>>> -        } else {
>>> -            cpu->hash64_opts->flags &= ~PPC_HASH64_CI_LARGEPAGE;
>>> -        }
>>> +    if (smmu_info.slb_size < cpu->hash64_opts->slb_size) {
>>> +        error_setg(errp, "KVM only supports %u SLB entries, but guest needs %u",
>>> +                   smmu_info.slb_size, cpu->hash64_opts->slb_size);
>>> +        return;
>>>      }
>>>
>>
>> The routine below is doing a simple PPCHash64SegmentPageSizes compare.
>> Is it possible to move it into the mmu-hash64.c file? It would mean
>> introducing KVM notions into mmu-hash64.c.
> 
> Yes it would, which is why I didn't put it in mmu-hash64.c.  Moreover
> it would involve including KVM specific struct definitions from kernel
> arch-specific header files into files that don't expect to use kernel
> arch specific header files.

Yes. This is true.

Reviewed-by: Cédric Le Goater <clg@kaod.org>

Thanks,

C.

> 
>>
>>>      /*
>>> -     * XXX This loop should be an entry wide AND of the capabilities that
>>> -     *     the selected CPU has with the capabilities that KVM supports.
>>> +     * Verify that every pagesize supported by the cpu model is
>>> +     * supported by KVM with the same encodings
>>>       */
>>> -    for (ik = iq = 0; ik < KVM_PPC_PAGE_SIZES_MAX_SZ; ik++) {
>>> +    for (iq = 0; iq < ARRAY_SIZE(cpu->hash64_opts->sps); iq++) {
>>>          PPCHash64SegmentPageSizes *qsps = &cpu->hash64_opts->sps[iq];
>>> -        struct kvm_ppc_one_seg_page_size *ksps = &smmu_info.sps[ik];
>>> +        struct kvm_ppc_one_seg_page_size *ksps;
>>>  
>>> -        if (!kvm_valid_page_size(smmu_info.flags, max_cpu_page_size,
>>> -                                 ksps->page_shift)) {
>>> -            continue;
>>> -        }
>>> -        qsps->page_shift = ksps->page_shift;
>>> -        qsps->slb_enc = ksps->slb_enc;
>>> -        for (jk = jq = 0; jk < KVM_PPC_PAGE_SIZES_MAX_SZ; jk++) {
>>> -            if (!kvm_valid_page_size(smmu_info.flags, max_cpu_page_size,
>>> -                                     ksps->enc[jk].page_shift)) {
>>> -                continue;
>>> -            }
>>> -            qsps->enc[jq].page_shift = ksps->enc[jk].page_shift;
>>> -            qsps->enc[jq].pte_enc = ksps->enc[jk].pte_enc;
>>> -            if (++jq >= PPC_PAGE_SIZES_MAX_SZ) {
>>> +        for (ik = 0; ik < ARRAY_SIZE(smmu_info.sps); ik++) {
>>> +            if (qsps->page_shift == smmu_info.sps[ik].page_shift) {
>>>                  break;
>>>              }
>>>          }
>>> -        if (++iq >= PPC_PAGE_SIZES_MAX_SZ) {
>>> -            break;
>>> +        if (ik >= ARRAY_SIZE(smmu_info.sps)) {
>>> +            error_setg(errp, "KVM doesn't support base page shift %u",
>>> +                       qsps->page_shift);
>>> +            return;
>>> +        }
>>> +
>>> +        ksps = &smmu_info.sps[ik];
>>> +        if (ksps->slb_enc != qsps->slb_enc) {
>>> +            error_setg(errp,
>>> +"KVM uses SLB encoding 0x%x for page shift %u, but guest expects 0x%x",
>>> +                       ksps->slb_enc, ksps->page_shift, qsps->slb_enc);
>>> +            return;
>>> +        }
>>> +
>>> +        for (jq = 0; jq < ARRAY_SIZE(qsps->enc); jq++) {
>>> +            for (jk = 0; jk < ARRAY_SIZE(ksps->enc); jk++) {
>>> +                if (qsps->enc[jq].page_shift == ksps->enc[jk].page_shift) {
>>> +                    break;
>>> +                }
>>> +            }
>>> +
>>> +            if (jk >= ARRAY_SIZE(ksps->enc)) {
>>> +                error_setg(errp, "KVM doesn't support page shift %u/%u",
>>> +                           qsps->enc[jq].page_shift, qsps->page_shift);
>>> +                return;
>>> +            }
>>> +            if (qsps->enc[jq].pte_enc != ksps->enc[jk].pte_enc) {
>>> +                error_setg(errp,
>>> +"KVM uses PTE encoding 0x%x for page shift %u/%u, but guest expects 0x%x",
>>> +                           ksps->enc[jk].pte_enc, qsps->enc[jq].page_shift,
>>> +                           qsps->page_shift, qsps->enc[jq].pte_enc);
>>> +                return;
>>> +            }
>>>          }
>>>      }
>>> -    cpu->hash64_opts->slb_size = smmu_info.slb_size;
>>> -    if (!(smmu_info.flags & KVM_PPC_1T_SEGMENTS)) {
>>> -        cpu->hash64_opts->flags &= ~PPC_HASH64_1TSEG;
>>> -    }
>>> -}
>>> -#else /* defined (TARGET_PPC64) */
>>>  
>>> -static inline void kvm_fixup_page_sizes(PowerPCCPU *cpu)
>>> -{
>>> +    if (ppc_hash64_has(cpu, PPC_HASH64_CI_LARGEPAGE)) {
>>> +        /* Mostly what guest pagesizes we can use are related to the
>>> +         * host pages used to map guest RAM, which is handled in the
>>> +         * platform code. Cache-Inhibited largepages (64k) however are
>>> +         * used for I/O, so if they're mapped to the host at all it
>>> +         * will be a normal mapping, not a special hugepage one used
>>> +         * for RAM. */
>>> +        if (getpagesize() < 0x10000) {
>>> +            error_setg(errp,
>>> +"KVM can't supply 64kiB CI pages, which guest expects");
>>> +        }
>>> +    }
>>>  }
>>> -
>>>  #endif /* !defined (TARGET_PPC64) */
>>>  
>>>  unsigned long kvm_arch_vcpu_id(CPUState *cpu)
>>> @@ -551,9 +551,6 @@ int kvm_arch_init_vcpu(CPUState *cs)
>>>      CPUPPCState *cenv = &cpu->env;
>>>      int ret;
>>>  
>>> -    /* Gather server mmu info from KVM and update the CPU state */
>>> -    kvm_fixup_page_sizes(cpu);
>>> -
>>>      /* Synchronize sregs with kvm */
>>>      ret = kvm_arch_sync_sregs(cpu);
>>>      if (ret) {
>>> diff --git a/target/ppc/kvm_ppc.h b/target/ppc/kvm_ppc.h
>>> index 443fca0a4e..657582bb32 100644
>>> --- a/target/ppc/kvm_ppc.h
>>> +++ b/target/ppc/kvm_ppc.h
>>> @@ -71,6 +71,7 @@ int kvmppc_resize_hpt_commit(PowerPCCPU *cpu, target_ulong flags, int shift);
>>>  bool kvmppc_pvr_workaround_required(PowerPCCPU *cpu);
>>>  
>>>  bool kvmppc_hpt_needs_host_contiguous_pages(void);
>>> +void kvm_check_mmu(PowerPCCPU *cpu, Error **errp);
>>>  
>>>  #else
>>>  
>>> @@ -227,6 +228,10 @@ static inline bool kvmppc_hpt_needs_host_contiguous_pages(void)
>>>      return false;
>>>  }
>>>  
>>> +static inline void kvm_check_mmu(PowerPCCPU *cpu, Error **errp)
>>> +{
>>> +}
>>> +
>>>  static inline bool kvmppc_has_cap_spapr_vfio(void)
>>>  {
>>>      return false;
>>>
>>
> 
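
For illustration, the "check, don't fix up" idea discussed in this patch
can be sketched in standalone C.  This is not the QEMU code: struct
seg_size and guest_supported_by_host() are hypothetical stand-ins for
PPCHash64SegmentPageSizes, kvm_ppc_one_seg_page_size and the outer loop of
kvm_check_mmu(), and skipping zero-shift slots is an assumption of this
sketch.  It only shows that the guest table must be a subset of the host
table with matching encodings, not identical to it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, simplified entry: one base page shift and its SLB encoding. */
struct seg_size {
    uint32_t page_shift;   /* 0 marks an unused slot (assumption of this sketch) */
    uint32_t slb_enc;
};

/*
 * Return true if every used guest entry has a host entry with the same
 * page shift and the same encoding.  The host may offer more sizes than
 * the guest uses.  Nothing in the guest table is modified.
 */
static bool guest_supported_by_host(const struct seg_size *guest, size_t nguest,
                                    const struct seg_size *host, size_t nhost)
{
    assert(guest != NULL && host != NULL);

    for (size_t i = 0; i < nguest; i++) {
        size_t k;

        if (guest[i].page_shift == 0) {
            continue;               /* unused guest slot */
        }

        /* Look for a host entry with the same base page shift. */
        for (k = 0; k < nhost; k++) {
            if (host[k].page_shift == guest[i].page_shift) {
                break;
            }
        }
        if (k == nhost) {
            fprintf(stderr, "host lacks base page shift %u\n",
                    (unsigned)guest[i].page_shift);
            return false;
        }

        /* Same size, but the guest-visible encoding must match too. */
        if (host[k].slb_enc != guest[i].slb_enc) {
            fprintf(stderr,
                    "encoding mismatch for page shift %u: host 0x%x, guest 0x%x\n",
                    (unsigned)guest[i].page_shift,
                    (unsigned)host[k].slb_enc, (unsigned)guest[i].slb_enc);
            return false;
        }
    }
    return true;
}
```

A caller would pass the guest's table and the table reported by the host,
and treat a false return as a fatal configuration error, in the same
spirit as the &error_fatal used in spapr_cpu_reset() above.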


* Re: [Qemu-devel] [PATCH 8/9] spapr: Limit available pagesizes to provide a consistent guest environment
  2018-06-21 12:50       ` Cédric Le Goater
@ 2018-06-21 13:58         ` David Gibson
  0 siblings, 0 replies; 43+ messages in thread
From: David Gibson @ 2018-06-21 13:58 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: groug, abologna, qemu-ppc, qemu-devel, aik

On Thu, Jun 21, 2018 at 02:50:32PM +0200, Cédric Le Goater wrote:
> On 06/21/2018 01:52 PM, David Gibson wrote:
> > On Thu, Jun 21, 2018 at 09:01:27AM +0200, Cédric Le Goater wrote:
> >> On 06/18/2018 08:36 AM, David Gibson wrote:
> >>> KVM HV has some limitations (deriving from the hardware) that mean not all
> >>> host-cpu supported pagesizes may be usable in the guest.  At present this
> >>> means that KVM guests and TCG guests may see different available page sizes
> >>> even if they notionally have the same vcpu model.  This is confusing and
> >>> also prevents migration between TCG and KVM.
> >>>
> >>> This patch makes the environment consistent by always allowing the same set
> >>> of pagesizes.  Since we can't remove the KVM limitations, we do this by
> >>> always applying the same limitations it has, even to TCG guests.
> >>>
> >>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> >>>
> >>> ---
> >>>  hw/ppc/spapr_caps.c | 33 +++++++++++++++++++++++++++++++++
> >>>  1 file changed, 33 insertions(+)
> >>>
> >>> diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
> >>> index 9fc739b3f5..0584c7c6ab 100644
> >>> --- a/hw/ppc/spapr_caps.c
> >>> +++ b/hw/ppc/spapr_caps.c
> >>> @@ -334,6 +334,38 @@ static void cap_hpt_maxpagesize_apply(sPAPRMachineState *spapr,
> >>>      spapr_check_pagesize(spapr, qemu_getrampagesize(), errp);
> >>>  }
> >>>  
> >>> +static bool spapr_pagesize_cb(void *opaque, uint32_t seg_pshift, uint32_t pshift)
> >>> +{
> >>> +    unsigned maxshift = *((unsigned *)opaque);
> >>> +
> >>> +    assert(pshift >= seg_pshift);
> >>
> >> you could check that elsewhere.
> > 
> > Um.. I'm not sure what you're getting at.
> 
> you could put the assert in ppc_hash64_filter_pagesizes(), that is where
> the parameters are coming from.

Yes.. but it's here that we're relying on that fact.  That's kind of
the point with assert()s.

> 
> >>> +    /* Don't allow the guest to use pages bigger than the configured
> >>> +     * maximum size */
> >>> +    if (pshift > maxshift) {
> >>> +        return false;
> >>> +    }
> >>> +
> >>> +    /* For whatever reason, KVM doesn't allow multiple pagesizes
> >>> +     * within a segment, *except* for the case of 16M pages in a 4k or
> >>> +     * 64k segment.  Always exclude other cases, so that TCG and KVM
> >>> +     * guests see a consistent environment */
> >>> +    if ((pshift != seg_pshift) && (pshift != 24)) {
> >>> +        return false;
> >>> +    }
> > 
> > Note the stanza above, I'll refer to it below.
> 
> ok.
> 
> > 
> >>> +
> >>> +    return true;
> >>> +}
> >>
> >> So, do we really need ppc_hash64_filter_pagesizes() to have a callback ? 
> > 
> > I agree that it seems overly involved, but it was the best way I could
> > see to logically separate the TCG / softmmu specific logic from the
> > spapr specific logic.
> 
> ok. I agree then.
> 
> Reviewed-by: Cédric Le Goater <clg@kaod.org>
> 
> Thanks,
> 
> C.
> 
> >> It seems that we only use the routine once in the patchset and that the
> >> only thing we need to check is 'maxshift'.
> > 
> > Not quite.  An earlier draft had this routine just take a max page
> > size and clamp accordingly.  But that failed when I wrote the code to
> > check against the KVM capabilities, because KVM also excludes some
> > other pagesize combinations.  That's what the stanza I point out above
> > is about.
> > 
> >> Do you envision other usage of the routine ?
> > 
> > Not really, no.
> > 
> >>
> >> Thanks,
> >>
> >> C.
> >>
> >>> +static void cap_hpt_maxpagesize_cpu_apply(sPAPRMachineState *spapr,
> >>> +                                          PowerPCCPU *cpu,
> >>> +                                          uint8_t val, Error **errp)
> >>> +{
> >>> +    unsigned maxshift = val;
> >>> +
> >>> +    ppc_hash64_filter_pagesizes(cpu, spapr_pagesize_cb, &maxshift);
> >>> +}
> >>> +
> >>>  sPAPRCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
> >>>      [SPAPR_CAP_HTM] = {
> >>>          .name = "htm",
> >>> @@ -401,6 +433,7 @@ sPAPRCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
> >>>          .set = spapr_cap_set_pagesize,
> >>>          .type = "int",
> >>>          .apply = cap_hpt_maxpagesize_apply,
> >>> +        .cpu_apply = cap_hpt_maxpagesize_cpu_apply,
> >>>      },
> >>>  };
> >>>  
> >>>
> >>
> > 
> 
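
As a rough sketch of the callback arrangement discussed above, the
standalone C below filters one segment's list of page sizes through a
caller-supplied predicate.  All names here (seg_sizes, filter_encodings,
clamp_to_max, MAX_ENC) are hypothetical and are not the actual
ppc_hash64_filter_pagesizes() implementation.  The policy is deliberately
the trivial "clamp to a maximum" rule from the earlier draft, just to show
the mechanism; the real policy is the spapr_pagesize_cb() quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ENC 4   /* hypothetical table width */

struct page_enc {
    uint32_t page_shift;            /* 0 terminates the used entries */
};

struct seg_sizes {
    uint32_t page_shift;            /* base (segment) page shift */
    struct page_enc enc[MAX_ENC];   /* actual page sizes usable in the segment */
};

/* Same shape as the callback in the patch: opaque, segment shift, page shift. */
typedef bool (*pagesize_cb)(void *opaque, uint32_t seg_pshift, uint32_t pshift);

/*
 * Keep only the encodings the callback accepts, compacting the array in
 * place and zeroing the freed tail.  Returns the number of entries kept.
 */
static size_t filter_encodings(struct seg_sizes *seg, pagesize_cb cb,
                               void *opaque)
{
    size_t kept = 0;

    assert(seg != NULL && cb != NULL);

    for (size_t j = 0; j < MAX_ENC; j++) {
        uint32_t ps = seg->enc[j].page_shift;

        if (ps == 0) {
            break;                  /* end of used entries */
        }
        if (cb(opaque, seg->page_shift, ps)) {
            seg->enc[kept++] = seg->enc[j];
        }
    }
    for (size_t j = kept; j < MAX_ENC; j++) {
        seg->enc[j].page_shift = 0;
    }
    return kept;
}

/* Stand-in policy: accept anything up to the maximum shift in *opaque. */
static bool clamp_to_max(void *opaque, uint32_t seg_pshift, uint32_t pshift)
{
    (void)seg_pshift;               /* this simple policy ignores the segment */
    return pshift <= *(unsigned *)opaque;
}
```

The point of the indirection is that the generic walk knows nothing about
the policy, so a machine type can supply a stricter rule without touching
the MMU code.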

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [Qemu-devel] [PATCH 8/9] spapr: Limit available pagesizes to provide a consistent guest environment
  2018-06-21 12:24   ` Greg Kurz
@ 2018-06-21 14:01     ` David Gibson
  2018-06-21 14:18       ` Greg Kurz
  0 siblings, 1 reply; 43+ messages in thread
From: David Gibson @ 2018-06-21 14:01 UTC (permalink / raw)
  To: Greg Kurz; +Cc: abologna, clg, qemu-ppc, qemu-devel, aik

On Thu, Jun 21, 2018 at 02:24:19PM +0200, Greg Kurz wrote:
> On Mon, 18 Jun 2018 16:36:05 +1000
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > KVM HV has some limitations (deriving from the hardware) that mean not all
> > host-cpu supported pagesizes may be usable in the guest.  At present this
> > means that KVM guests and TCG guests may see different available page sizes
> > even if they notionally have the same vcpu model.  This is confusing and
> > also prevents migration between TCG and KVM.
> > 
> > This patch makes the environment consistent by always allowing the same set
> > of pagesizes.  Since we can't remove the KVM limitations, we do this by
> > always applying the same limitations it has, even to TCG guests.
> > 
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> >  hw/ppc/spapr_caps.c | 33 +++++++++++++++++++++++++++++++++
> >  1 file changed, 33 insertions(+)
> > 
> > diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
> > index 9fc739b3f5..0584c7c6ab 100644
> > --- a/hw/ppc/spapr_caps.c
> > +++ b/hw/ppc/spapr_caps.c
> > @@ -334,6 +334,38 @@ static void cap_hpt_maxpagesize_apply(sPAPRMachineState *spapr,
> >      spapr_check_pagesize(spapr, qemu_getrampagesize(), errp);
> >  }
> >  
> > +static bool spapr_pagesize_cb(void *opaque, uint32_t seg_pshift, uint32_t pshift)
> > +{
> > +    unsigned maxshift = *((unsigned *)opaque);
> > +
> > +    assert(pshift >= seg_pshift);
> > +
> > +    /* Don't allow the guest to use pages bigger than the configured
> > +     * maximum size */
> > +    if (pshift > maxshift) {
> > +        return false;
> > +    }
> > +
> > +    /* For whatever reason, KVM doesn't allow multiple pagesizes
> > +     * within a segment, *except* for the case of 16M pages in a 4k or
> > +     * 64k segment.  Always exclude other cases, so that TCG and KVM
> > +     * guests see a consistent environment */
> 
> Unless I'm missing something, I don't see how we could get "other cases"
> with TCG, at least with the current content of ppc_hash64_opts_POWER7.

You're missing something.  hash64_opts_POWER7 includes 64k pages in a
segment with base page size of 4k.
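
To make that concrete, here is the same rule as the quoted
spapr_pagesize_cb(), rewritten as a standalone predicate so individual
(segment shift, page shift) pairs can be tried by hand.  The name
pagesize_allowed() and the plain maxshift argument are only for this
sketch; the logic is copied from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Shifts: 12 = 4k, 16 = 64k, 24 = 16M.
 * Returns true if a page of size (1 << pshift) may be used in a segment
 * whose base page size is (1 << seg_pshift).
 */
static bool pagesize_allowed(unsigned maxshift, uint32_t seg_pshift,
                             uint32_t pshift)
{
    /* The filter never offers a page smaller than the segment's base. */
    assert(pshift >= seg_pshift);

    /* Nothing bigger than the configured maximum. */
    if (pshift > maxshift) {
        return false;
    }

    /* Mixed sizes within a segment are allowed only for 16M pages. */
    if (pshift != seg_pshift && pshift != 24) {
        return false;
    }

    return true;
}
```

With maxshift = 24, the pairs (12, 12), (12, 24) and (16, 24) pass, while
(12, 16), i.e. 64k pages in a 4k segment, is rejected by the second test
even though it is well under the maximum.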

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [Qemu-devel] [PATCH 8/9] spapr: Limit available pagesizes to provide a consistent guest environment
  2018-06-21 14:01     ` David Gibson
@ 2018-06-21 14:18       ` Greg Kurz
  0 siblings, 0 replies; 43+ messages in thread
From: Greg Kurz @ 2018-06-21 14:18 UTC (permalink / raw)
  To: David Gibson; +Cc: abologna, clg, qemu-ppc, qemu-devel, aik

On Fri, 22 Jun 2018 00:01:13 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Thu, Jun 21, 2018 at 02:24:19PM +0200, Greg Kurz wrote:
> > On Mon, 18 Jun 2018 16:36:05 +1000
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> >   
> > > KVM HV has some limitations (deriving from the hardware) that mean not all
> > > host-cpu supported pagesizes may be usable in the guest.  At present this
> > > means that KVM guests and TCG guests may see different available page sizes
> > > even if they notionally have the same vcpu model.  This is confusing and
> > > also prevents migration between TCG and KVM.
> > > 
> > > This patch makes the environment consistent by always allowing the same set
> > > of pagesizes.  Since we can't remove the KVM limitations, we do this by
> > > always applying the same limitations it has, even to TCG guests.
> > > 
> > > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > > ---
> > >  hw/ppc/spapr_caps.c | 33 +++++++++++++++++++++++++++++++++
> > >  1 file changed, 33 insertions(+)
> > > 
> > > diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
> > > index 9fc739b3f5..0584c7c6ab 100644
> > > --- a/hw/ppc/spapr_caps.c
> > > +++ b/hw/ppc/spapr_caps.c
> > > @@ -334,6 +334,38 @@ static void cap_hpt_maxpagesize_apply(sPAPRMachineState *spapr,
> > >      spapr_check_pagesize(spapr, qemu_getrampagesize(), errp);
> > >  }
> > >  
> > > +static bool spapr_pagesize_cb(void *opaque, uint32_t seg_pshift, uint32_t pshift)
> > > +{
> > > +    unsigned maxshift = *((unsigned *)opaque);
> > > +
> > > +    assert(pshift >= seg_pshift);
> > > +
> > > +    /* Don't allow the guest to use pages bigger than the configured
> > > +     * maximum size */
> > > +    if (pshift > maxshift) {
> > > +        return false;
> > > +    }
> > > +
> > > +    /* For whatever reason, KVM doesn't allow multiple pagesizes
> > > +     * within a segment, *except* for the case of 16M pages in a 4k or
> > > +     * 64k segment.  Always exclude other cases, so that TCG and KVM
> > > +     * guests see a consistent environment */  
> > 
> > Unless I'm missing something, I don't see how we could get "other cases"
> > with TCG, at least with the current content of ppc_hash64_opts_POWER7.  
> 
> You're missing something.  hash64_opts_POWER7 includes 64k pages in a
> segment with base page size of 4k.
> 

/me should read more carefully... :(


Reviewed-by: Greg Kurz <groug@kaod.org>


