[Qemu-devel] [PATCH 0/3] spapr: fix regression with older machine types

All of lore.kernel.org
 help / color / mirror / Atom feed

* [Qemu-devel] [PATCH 0/3] spapr: fix regression with older machine types
@ 2018-06-28 10:14 Greg Kurz
  2018-06-28 10:14 ` [Qemu-devel] [PATCH 1/3] target/ppc/kvm: don't pass cpu to kvm_get_smmu_info() Greg Kurz
                   ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: Greg Kurz @ 2018-06-28 10:14 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-ppc, David Gibson, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Cédric Le Goater

Since the recent cleanups to hide host configuration details from guests,
it isn't possible to start an older machine type with HV KVM [*]:

qemu-system-ppc64: KVM doesn't support for base page shift 34

This basically boils down to the fact that it isn't safe to call
the kvmppc_hpt_needs_host_contiguous_pages() helper from a class
init function because:
- KVM isn't initialized yet, and kvm_enabled() always return false
  in this case. This causes kvmppc_hpt_needs_host_contiguous_pages()
  to do nothing and we end up choosing a 16G default page size
  which is not supported by KVM.
- even if we drop kvm_enabled() we then have the issue that
  kvmppc_hpt_needs_host_contiguous_pages() assumes CPUs are
  created, which isn't the case either.

The choice was made to initialize capabilities during machine
init before creating the CPUs, and I don't think we should
revert to the previous behavior. Let's go forward instead and
ensure we can retrieve the MMU information from KVM before
CPUs are created.

To fix this, we first change kvm_get_smmu_info() so that it
doesn't need a CPU object. This allows to stop using first_cpu
in kvmppc_hpt_needs_host_contiguous_pages(). Then we delay
the setting of the default value to machine init time, so
that we're sure that KVM is fully initialized.

As a bonus, the last patch is a tentative to be able to detect
such misuse of *_enabled() accelerator helpers earlier.

Please comment.

[*] it also breaks PR KVM actually, but the error is different and
    I need to dig some more.

--
Greg

---

Greg Kurz (3):
      target/ppc/kvm: don't pass cpu to kvm_get_smmu_info()
      spapr: compute default value of "hpt-max-page-size" later
      accel: forbid early use of kvm_enabled() and friends


 accel/accel.c           |    7 +++++++
 hw/ppc/spapr.c          |   25 ++++++++++++++++++-------
 include/qemu-common.h   |    3 ++-
 include/sysemu/accel.h  |    1 +
 include/sysemu/kvm.h    |    3 ++-
 qom/cpu.c               |    1 +
 stubs/Makefile.objs     |    1 +
 stubs/accel.c           |   14 ++++++++++++++
 target/i386/hax-all.c   |    2 +-
 target/i386/whpx-all.c  |    2 +-
 target/ppc/kvm.c        |   37 ++++++++++++++++++-------------------
 target/ppc/mmu-hash64.h |    8 +++++++-
 12 files changed, 73 insertions(+), 31 deletions(-)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Qemu-devel] [PATCH 1/3] target/ppc/kvm: don't pass cpu to kvm_get_smmu_info()
  2018-06-28 10:14 [Qemu-devel] [PATCH 0/3] spapr: fix regression with older machine types Greg Kurz
@ 2018-06-28 10:14 ` Greg Kurz
  2018-06-28 11:56   ` Cédric Le Goater
  2018-06-29  5:16   ` David Gibson
  2018-06-28 10:15 ` [Qemu-devel] [PATCH 2/3] spapr: compute default value of "hpt-max-page-size" later Greg Kurz
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 15+ messages in thread
From: Greg Kurz @ 2018-06-28 10:14 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-ppc, David Gibson, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Cédric Le Goater

In a future patch the machine code will need to retrieve the MMU
information from KVM during machine initialization before the CPUs
are created.

Actually, KVM_PPC_GET_SMMU_INFO is a VM class ioctl, and thus, it only
needs the kvm_state global variable to be set. The fallback code only
needs informations from the machine's CPU model class.

So this patch basically drops the CPU argument to kvm_get_smmu_info()
and kvm_get_fallback_smmu_info(). The kvm_state and current_machine
globals are used instead to reach out to KVM and CPU model details
respectively.

Signed-off-by: Greg Kurz <groug@kaod.org>
---
 target/ppc/kvm.c        |   37 ++++++++++++++++++-------------------
 target/ppc/mmu-hash64.h |    8 +++++++-
 2 files changed, 25 insertions(+), 20 deletions(-)

diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
index 4df4ff6cbff2..9fae89ff8175 100644
--- a/target/ppc/kvm.c
+++ b/target/ppc/kvm.c
@@ -248,11 +248,12 @@ static int kvm_booke206_tlb_init(PowerPCCPU *cpu)
 
 
 #if defined(TARGET_PPC64)
-static void kvm_get_fallback_smmu_info(PowerPCCPU *cpu,
-                                       struct kvm_ppc_smmu_info *info)
+static void kvm_get_fallback_smmu_info(struct kvm_ppc_smmu_info *info)
 {
-    CPUPPCState *env = &cpu->env;
-    CPUState *cs = CPU(cpu);
+    const PowerPCCPUClass *pcc;
+
+    assert(current_machine != NULL);
+    pcc = POWERPC_CPU_CLASS(object_class_by_name(current_machine->cpu_type));
 
     memset(info, 0, sizeof(*info));
 
@@ -278,7 +279,7 @@ static void kvm_get_fallback_smmu_info(PowerPCCPU *cpu,
      *   implements KVM_CAP_PPC_GET_SMMU_INFO and thus doesn't hit
      *   this fallback.
      */
-    if (kvmppc_is_pr(cs->kvm_state)) {
+    if (kvmppc_is_pr(kvm_state)) {
         /* No flags */
         info->flags = 0;
         info->slb_size = 64;
@@ -300,12 +301,12 @@ static void kvm_get_fallback_smmu_info(PowerPCCPU *cpu,
         /* HV KVM has backing store size restrictions */
         info->flags = KVM_PPC_PAGE_SIZES_REAL;
 
-        if (ppc_hash64_has(cpu, PPC_HASH64_1TSEG)) {
+        if (hash64_opts_has(pcc->hash64_opts, PPC_HASH64_1TSEG)) {
             info->flags |= KVM_PPC_1T_SEGMENTS;
         }
 
-        if (env->mmu_model == POWERPC_MMU_2_06 ||
-            env->mmu_model == POWERPC_MMU_2_07) {
+        if (pcc->mmu_model == POWERPC_MMU_2_06 ||
+            pcc->mmu_model == POWERPC_MMU_2_07) {
             info->slb_size = 32;
         } else {
             info->slb_size = 64;
@@ -319,8 +320,8 @@ static void kvm_get_fallback_smmu_info(PowerPCCPU *cpu,
         i++;
 
         /* 64K on MMU 2.06 and later */
-        if (env->mmu_model == POWERPC_MMU_2_06 ||
-            env->mmu_model == POWERPC_MMU_2_07) {
+        if (pcc->mmu_model == POWERPC_MMU_2_06 ||
+            pcc->mmu_model == POWERPC_MMU_2_07) {
             info->sps[i].page_shift = 16;
             info->sps[i].slb_enc = 0x110;
             info->sps[i].enc[0].page_shift = 16;
@@ -336,19 +337,18 @@ static void kvm_get_fallback_smmu_info(PowerPCCPU *cpu,
     }
 }
 
-static void kvm_get_smmu_info(PowerPCCPU *cpu, struct kvm_ppc_smmu_info *info)
+static void kvm_get_smmu_info(struct kvm_ppc_smmu_info *info)
 {
-    CPUState *cs = CPU(cpu);
     int ret;
 
-    if (kvm_check_extension(cs->kvm_state, KVM_CAP_PPC_GET_SMMU_INFO)) {
-        ret = kvm_vm_ioctl(cs->kvm_state, KVM_PPC_GET_SMMU_INFO, info);
+    if (kvm_check_extension(kvm_state, KVM_CAP_PPC_GET_SMMU_INFO)) {
+        ret = kvm_vm_ioctl(kvm_state, KVM_PPC_GET_SMMU_INFO, info);
         if (ret == 0) {
             return;
         }
     }
 
-    kvm_get_fallback_smmu_info(cpu, info);
+    kvm_get_fallback_smmu_info(info);
 }
 
 struct ppc_radix_page_info *kvm_get_radix_page_info(void)
@@ -408,14 +408,13 @@ target_ulong kvmppc_configure_v3_mmu(PowerPCCPU *cpu,
 
 bool kvmppc_hpt_needs_host_contiguous_pages(void)
 {
-    PowerPCCPU *cpu = POWERPC_CPU(first_cpu);
     static struct kvm_ppc_smmu_info smmu_info;
 
     if (!kvm_enabled()) {
         return false;
     }
 
-    kvm_get_smmu_info(cpu, &smmu_info);
+    kvm_get_smmu_info(&smmu_info);
     return !!(smmu_info.flags & KVM_PPC_PAGE_SIZES_REAL);
 }
 
@@ -429,7 +428,7 @@ void kvm_check_mmu(PowerPCCPU *cpu, Error **errp)
         return;
     }
 
-    kvm_get_smmu_info(cpu, &smmu_info);
+    kvm_get_smmu_info(&smmu_info);
 
     if (ppc_hash64_has(cpu, PPC_HASH64_1TSEG)
         && !(smmu_info.flags & KVM_PPC_1T_SEGMENTS)) {
@@ -2168,7 +2167,7 @@ uint64_t kvmppc_rma_size(uint64_t current_size, unsigned int hash_shift)
 
     /* Find the largest hardware supported page size that's less than
      * or equal to the (logical) backing page size of guest RAM */
-    kvm_get_smmu_info(POWERPC_CPU(first_cpu), &info);
+    kvm_get_smmu_info(&info);
     rampagesize = qemu_getrampagesize();
     best_page_shift = 0;
 
diff --git a/target/ppc/mmu-hash64.h b/target/ppc/mmu-hash64.h
index f11efc9cbc1f..00a249f26bc3 100644
--- a/target/ppc/mmu-hash64.h
+++ b/target/ppc/mmu-hash64.h
@@ -169,9 +169,15 @@ struct PPCHash64Options {
 extern const PPCHash64Options ppc_hash64_opts_basic;
 extern const PPCHash64Options ppc_hash64_opts_POWER7;
 
+static inline bool hash64_opts_has(const PPCHash64Options *opts,
+                                   unsigned feature)
+{
+    return !!(opts->flags & feature);
+}
+
 static inline bool ppc_hash64_has(PowerPCCPU *cpu, unsigned feature)
 {
-    return !!(cpu->hash64_opts->flags & feature);
+    return hash64_opts_has(cpu->hash64_opts, feature);
 }
 
 #endif /* CONFIG_USER_ONLY */

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [Qemu-devel] [PATCH 2/3] spapr: compute default value of "hpt-max-page-size" later
  2018-06-28 10:14 [Qemu-devel] [PATCH 0/3] spapr: fix regression with older machine types Greg Kurz
  2018-06-28 10:14 ` [Qemu-devel] [PATCH 1/3] target/ppc/kvm: don't pass cpu to kvm_get_smmu_info() Greg Kurz
@ 2018-06-28 10:15 ` Greg Kurz
  2018-06-29  5:16   ` David Gibson
  2018-06-29 19:08   ` Eduardo Habkost
  2018-06-28 10:15 ` [Qemu-devel] [PATCH 3/3] accel: forbid early use of kvm_enabled() and friends Greg Kurz
  2018-06-28 19:48 ` [Qemu-devel] [Qemu-ppc] [PATCH 0/3] spapr: fix regression with older machine types Greg Kurz
  3 siblings, 2 replies; 15+ messages in thread
From: Greg Kurz @ 2018-06-28 10:15 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-ppc, David Gibson, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Cédric Le Goater

It is currently not possible to run a pseries-2.12 or older machine
with HV KVM. QEMU prints the following and exits right away.

qemu-system-ppc64: KVM doesn't support for base page shift 34

The "hpt-max-page-size" capability was recently added to spapr to hide
host configuration details from HPT mode guests. Its default value for
newer machine types is 64k.

For backwards compatibility, pseries-2.12 and older machine types need
a different value. This is handled as usual in a class init function.
The default value is 16G, ie, all page sizes supported by POWER7 and
newer CPUs, but HV KVM requires guest pages to be hpa contiguous as
well as gpa contiguous. The default value is the page size used to
back the guest RAM in this case.

Unfortunately kvmppc_hpt_needs_host_contiguous_pages()->kvm_enabled() is
called way before KVM init and returns false, even if the user requested
KVM. We thus end up selecting 16G, which isn't supported by HV KVM.

We fix this by moving the logic to spapr_machine_init() because this
is the earliest call where we're sure kvm_enabled() can be trusted.
Since the user cannot pass cap-hpt-max-page-size=0, we set the default
to 0 in the pseries-2.12 class init function and use that as a flag
in spapr_machine_init() to do the real work.

Signed-off-by: Greg Kurz <groug@kaod.org>
---
 hw/ppc/spapr.c |   25 ++++++++++++++++++-------
 1 file changed, 18 insertions(+), 7 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 8cc996d0b822..1eb45cd8c424 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2527,6 +2527,18 @@ static void spapr_machine_init(MachineState *machine)
     QLIST_INIT(&spapr->phbs);
     QTAILQ_INIT(&spapr->pending_dimm_unplugs);
 
+    /* This is for pseries-2.12 and older machine types */
+    if (!smc->default_caps.caps[SPAPR_CAP_HPT_MAXPAGESIZE]) {
+        uint8_t mps;
+
+        if (kvmppc_hpt_needs_host_contiguous_pages()) {
+            mps = ctz64(qemu_getrampagesize());
+        } else {
+            mps = 34; /* allow everything up to 16GiB, i.e. everything */
+        }
+        smc->default_caps.caps[SPAPR_CAP_HPT_MAXPAGESIZE] = mps;
+    }
+
     /* Determine capabilities to run with */
     spapr_caps_init(spapr);
 
@@ -4103,17 +4115,16 @@ static void spapr_machine_2_12_instance_options(MachineState *machine)
 static void spapr_machine_2_12_class_options(MachineClass *mc)
 {
     sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(mc);
-    uint8_t mps;
 
     spapr_machine_3_0_class_options(mc);
     SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_12);
 
-    if (kvmppc_hpt_needs_host_contiguous_pages()) {
-        mps = ctz64(qemu_getrampagesize());
-    } else {
-        mps = 34; /* allow everything up to 16GiB, i.e. everything */
-    }
-    smc->default_caps.caps[SPAPR_CAP_HPT_MAXPAGESIZE] = mps;
+    /* We depend on kvm_enabled() to choose a default value for the
+     * hpt-max-page-size capability. Of course we can't do it here
+     * because this is too early and the HW accelerator isn't initialzed
+     * yet. Postpone this to spapr_machine_init().
+     */
+    smc->default_caps.caps[SPAPR_CAP_HPT_MAXPAGESIZE] = 0;
 }
 
 DEFINE_SPAPR_MACHINE(2_12, "2.12", false);

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [Qemu-devel] [PATCH 3/3] accel: forbid early use of kvm_enabled() and friends
  2018-06-28 10:14 [Qemu-devel] [PATCH 0/3] spapr: fix regression with older machine types Greg Kurz
  2018-06-28 10:14 ` [Qemu-devel] [PATCH 1/3] target/ppc/kvm: don't pass cpu to kvm_get_smmu_info() Greg Kurz
  2018-06-28 10:15 ` [Qemu-devel] [PATCH 2/3] spapr: compute default value of "hpt-max-page-size" later Greg Kurz
@ 2018-06-28 10:15 ` Greg Kurz
  2018-06-29  5:18   ` David Gibson
  2018-06-29 19:58   ` Eduardo Habkost
  2018-06-28 19:48 ` [Qemu-devel] [Qemu-ppc] [PATCH 0/3] spapr: fix regression with older machine types Greg Kurz
  3 siblings, 2 replies; 15+ messages in thread
From: Greg Kurz @ 2018-06-28 10:15 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-ppc, David Gibson, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Cédric Le Goater

It is unsafe to rely on *_enabled() helpers before the accelerator has
been initialized, ie, accel_init_machine() has succeeded, because they
always return false. But it is still possible to end up calling them
indirectly by inadvertance, and cause QEMU to misbehave.

This patch causes QEMU to abort if we try to check for an accelerator
before it has been set up. This will help to catch bugs earlier.

Signed-off-by: Greg Kurz <groug@kaod.org>
---
 accel/accel.c          |    7 +++++++
 include/qemu-common.h  |    3 ++-
 include/sysemu/accel.h |    1 +
 include/sysemu/kvm.h   |    3 ++-
 qom/cpu.c              |    1 +
 stubs/Makefile.objs    |    1 +
 stubs/accel.c          |   14 ++++++++++++++
 target/i386/hax-all.c  |    2 +-
 target/i386/whpx-all.c |    2 +-
 9 files changed, 30 insertions(+), 4 deletions(-)
 create mode 100644 stubs/accel.c

diff --git a/accel/accel.c b/accel/accel.c
index 966b2d8f536c..27900aac9cc5 100644
--- a/accel/accel.c
+++ b/accel/accel.c
@@ -51,6 +51,13 @@ static AccelClass *accel_find(const char *opt_name)
     return ac;
 }
 
+bool assert_accelerator_initialized(bool allowed)
+{
+    assert(current_machine != NULL);
+    assert(current_machine->accelerator != NULL);
+    return allowed;
+}
+
 static int accel_init_machine(AccelClass *acc, MachineState *ms)
 {
     ObjectClass *oc = OBJECT_CLASS(acc);
diff --git a/include/qemu-common.h b/include/qemu-common.h
index 85f4749aefb7..01d5e4d97dbf 100644
--- a/include/qemu-common.h
+++ b/include/qemu-common.h
@@ -82,7 +82,8 @@ int qemu_openpty_raw(int *aslave, char *pty_name);
 extern bool tcg_allowed;
 void tcg_exec_init(unsigned long tb_size);
 #ifdef CONFIG_TCG
-#define tcg_enabled() (tcg_allowed)
+#include "sysemu/accel.h"
+#define tcg_enabled() (assert_accelerator_initialized(tcg_allowed))
 #else
 #define tcg_enabled() 0
 #endif
diff --git a/include/sysemu/accel.h b/include/sysemu/accel.h
index 637358f43014..76965cb69cc9 100644
--- a/include/sysemu/accel.h
+++ b/include/sysemu/accel.h
@@ -71,5 +71,6 @@ void configure_accelerator(MachineState *ms);
 void accel_register_compat_props(AccelState *accel);
 /* Called just before os_setup_post (ie just before drop OS privs) */
 void accel_setup_post(MachineState *ms);
+bool assert_accelerator_initialized(bool allowed);
 
 #endif
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index 0b64b8e06786..ac4dbb2d6d6d 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -18,6 +18,7 @@
 #include "qom/cpu.h"
 #include "exec/memattrs.h"
 #include "hw/irq.h"
+#include "sysemu/accel.h"
 
 #ifdef NEED_CPU_H
 # ifdef CONFIG_KVM
@@ -46,7 +47,7 @@ extern bool kvm_direct_msi_allowed;
 extern bool kvm_ioeventfd_any_length_allowed;
 extern bool kvm_msi_use_devid;
 
-#define kvm_enabled()           (kvm_allowed)
+#define kvm_enabled()           (assert_accelerator_initialized(kvm_allowed))
 /**
  * kvm_irqchip_in_kernel:
  *
diff --git a/qom/cpu.c b/qom/cpu.c
index 92599f35413b..65a8f03a66a4 100644
--- a/qom/cpu.c
+++ b/qom/cpu.c
@@ -23,6 +23,7 @@
 #include "qemu-common.h"
 #include "qom/cpu.h"
 #include "sysemu/hw_accel.h"
+#include "sysemu/accel.h"
 #include "qemu/notify.h"
 #include "qemu/log.h"
 #include "exec/log.h"
diff --git a/stubs/Makefile.objs b/stubs/Makefile.objs
index 53d3f32cb258..2d5142287525 100644
--- a/stubs/Makefile.objs
+++ b/stubs/Makefile.objs
@@ -43,3 +43,4 @@ stub-obj-y += xen-common.o
 stub-obj-y += xen-hvm.o
 stub-obj-y += pci-host-piix.o
 stub-obj-y += ram-block.o
+stub-obj-y += accel.o
diff --git a/stubs/accel.c b/stubs/accel.c
new file mode 100644
index 000000000000..4f480f2d3f29
--- /dev/null
+++ b/stubs/accel.c
@@ -0,0 +1,14 @@
+/*
+ * accel stubs
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "sysemu/accel.h"
+
+bool assert_accelerator_initialized(bool allowed)
+{
+    return allowed;
+}
diff --git a/target/i386/hax-all.c b/target/i386/hax-all.c
index d2e512856bb8..7c78bd7d094d 100644
--- a/target/i386/hax-all.c
+++ b/target/i386/hax-all.c
@@ -57,7 +57,7 @@ static int hax_arch_get_registers(CPUArchState *env);
 
 int hax_enabled(void)
 {
-    return hax_allowed;
+    return assert_accelerator_initialized(hax_allowed);
 }
 
 int valid_hax_tunnel_size(uint16_t size)
diff --git a/target/i386/whpx-all.c b/target/i386/whpx-all.c
index 6b42096698ee..e7f6bc5958e7 100644
--- a/target/i386/whpx-all.c
+++ b/target/i386/whpx-all.c
@@ -1422,7 +1422,7 @@ static int whpx_accel_init(MachineState *ms)
 
 int whpx_enabled(void)
 {
-    return whpx_allowed;
+    return assert_accelerator_initialized(whpx_allowed);
 }
 
 static void whpx_accel_class_init(ObjectClass *oc, void *data)

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] target/ppc/kvm: don't pass cpu to kvm_get_smmu_info()
  2018-06-28 10:14 ` [Qemu-devel] [PATCH 1/3] target/ppc/kvm: don't pass cpu to kvm_get_smmu_info() Greg Kurz
@ 2018-06-28 11:56   ` Cédric Le Goater
  2018-06-28 12:14     ` Greg Kurz
  2018-06-29  5:16   ` David Gibson
  1 sibling, 1 reply; 15+ messages in thread
From: Cédric Le Goater @ 2018-06-28 11:56 UTC (permalink / raw)
  To: Greg Kurz, qemu-devel
  Cc: qemu-ppc, David Gibson, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost

On 06/28/2018 12:14 PM, Greg Kurz wrote:
> In a future patch the machine code will need to retrieve the MMU
> information from KVM during machine initialization before the CPUs
> are created.
> 
> Actually, KVM_PPC_GET_SMMU_INFO is a VM class ioctl, and thus, it only
> needs the kvm_state global variable to be set. The fallback code only
> needs informations from the machine's CPU model class.
> 
> So this patch basically drops the CPU argument to kvm_get_smmu_info()
> and kvm_get_fallback_smmu_info(). The kvm_state and current_machine
> globals are used instead to reach out to KVM and CPU model details
> respectively.
> 
> Signed-off-by: Greg Kurz <groug@kaod.org>

Reviewed-by: Cédric Le Goater <clg@kaod.org>

one question below,

Thanks,

C.

> ---
>  target/ppc/kvm.c        |   37 ++++++++++++++++++-------------------
>  target/ppc/mmu-hash64.h |    8 +++++++-
>  2 files changed, 25 insertions(+), 20 deletions(-)
> 
> diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
> index 4df4ff6cbff2..9fae89ff8175 100644
> --- a/target/ppc/kvm.c
> +++ b/target/ppc/kvm.c
> @@ -248,11 +248,12 @@ static int kvm_booke206_tlb_init(PowerPCCPU *cpu)
>  
>  
>  #if defined(TARGET_PPC64)
> -static void kvm_get_fallback_smmu_info(PowerPCCPU *cpu,
> -                                       struct kvm_ppc_smmu_info *info)
> +static void kvm_get_fallback_smmu_info(struct kvm_ppc_smmu_info *info)
>  {
> -    CPUPPCState *env = &cpu->env;
> -    CPUState *cs = CPU(cpu);
> +    const PowerPCCPUClass *pcc;
> +
> +    assert(current_machine != NULL);

can that ever happen ? 

> +    pcc = POWERPC_CPU_CLASS(object_class_by_name(current_machine->cpu_type));
>  
>      memset(info, 0, sizeof(*info));
>  
> @@ -278,7 +279,7 @@ static void kvm_get_fallback_smmu_info(PowerPCCPU *cpu,
>       *   implements KVM_CAP_PPC_GET_SMMU_INFO and thus doesn't hit
>       *   this fallback.
>       */
> -    if (kvmppc_is_pr(cs->kvm_state)) {
> +    if (kvmppc_is_pr(kvm_state)) {
>          /* No flags */
>          info->flags = 0;
>          info->slb_size = 64;
> @@ -300,12 +301,12 @@ static void kvm_get_fallback_smmu_info(PowerPCCPU *cpu,
>          /* HV KVM has backing store size restrictions */
>          info->flags = KVM_PPC_PAGE_SIZES_REAL;
>  
> -        if (ppc_hash64_has(cpu, PPC_HASH64_1TSEG)) {
> +        if (hash64_opts_has(pcc->hash64_opts, PPC_HASH64_1TSEG)) {
>              info->flags |= KVM_PPC_1T_SEGMENTS;
>          }
>  
> -        if (env->mmu_model == POWERPC_MMU_2_06 ||
> -            env->mmu_model == POWERPC_MMU_2_07) {
> +        if (pcc->mmu_model == POWERPC_MMU_2_06 ||
> +            pcc->mmu_model == POWERPC_MMU_2_07) {
>              info->slb_size = 32;
>          } else {
>              info->slb_size = 64;
> @@ -319,8 +320,8 @@ static void kvm_get_fallback_smmu_info(PowerPCCPU *cpu,
>          i++;
>  
>          /* 64K on MMU 2.06 and later */
> -        if (env->mmu_model == POWERPC_MMU_2_06 ||
> -            env->mmu_model == POWERPC_MMU_2_07) {
> +        if (pcc->mmu_model == POWERPC_MMU_2_06 ||
> +            pcc->mmu_model == POWERPC_MMU_2_07) {
>              info->sps[i].page_shift = 16;
>              info->sps[i].slb_enc = 0x110;
>              info->sps[i].enc[0].page_shift = 16;
> @@ -336,19 +337,18 @@ static void kvm_get_fallback_smmu_info(PowerPCCPU *cpu,
>      }
>  }
>  
> -static void kvm_get_smmu_info(PowerPCCPU *cpu, struct kvm_ppc_smmu_info *info)
> +static void kvm_get_smmu_info(struct kvm_ppc_smmu_info *info)
>  {
> -    CPUState *cs = CPU(cpu);
>      int ret;
>  
> -    if (kvm_check_extension(cs->kvm_state, KVM_CAP_PPC_GET_SMMU_INFO)) {
> -        ret = kvm_vm_ioctl(cs->kvm_state, KVM_PPC_GET_SMMU_INFO, info);
> +    if (kvm_check_extension(kvm_state, KVM_CAP_PPC_GET_SMMU_INFO)) {
> +        ret = kvm_vm_ioctl(kvm_state, KVM_PPC_GET_SMMU_INFO, info);
>          if (ret == 0) {
>              return;
>          }
>      }
>  
> -    kvm_get_fallback_smmu_info(cpu, info);
> +    kvm_get_fallback_smmu_info(info);
>  }
>  
>  struct ppc_radix_page_info *kvm_get_radix_page_info(void)
> @@ -408,14 +408,13 @@ target_ulong kvmppc_configure_v3_mmu(PowerPCCPU *cpu,
>  
>  bool kvmppc_hpt_needs_host_contiguous_pages(void)
>  {
> -    PowerPCCPU *cpu = POWERPC_CPU(first_cpu);
>      static struct kvm_ppc_smmu_info smmu_info;
>  
>      if (!kvm_enabled()) {
>          return false;
>      }
>  
> -    kvm_get_smmu_info(cpu, &smmu_info);
> +    kvm_get_smmu_info(&smmu_info);
>      return !!(smmu_info.flags & KVM_PPC_PAGE_SIZES_REAL);
>  }
>  
> @@ -429,7 +428,7 @@ void kvm_check_mmu(PowerPCCPU *cpu, Error **errp)
>          return;
>      }
>  
> -    kvm_get_smmu_info(cpu, &smmu_info);
> +    kvm_get_smmu_info(&smmu_info);
>  
>      if (ppc_hash64_has(cpu, PPC_HASH64_1TSEG)
>          && !(smmu_info.flags & KVM_PPC_1T_SEGMENTS)) {
> @@ -2168,7 +2167,7 @@ uint64_t kvmppc_rma_size(uint64_t current_size, unsigned int hash_shift)
>  
>      /* Find the largest hardware supported page size that's less than
>       * or equal to the (logical) backing page size of guest RAM */
> -    kvm_get_smmu_info(POWERPC_CPU(first_cpu), &info);
> +    kvm_get_smmu_info(&info);
>      rampagesize = qemu_getrampagesize();
>      best_page_shift = 0;
>  
> diff --git a/target/ppc/mmu-hash64.h b/target/ppc/mmu-hash64.h
> index f11efc9cbc1f..00a249f26bc3 100644
> --- a/target/ppc/mmu-hash64.h
> +++ b/target/ppc/mmu-hash64.h
> @@ -169,9 +169,15 @@ struct PPCHash64Options {
>  extern const PPCHash64Options ppc_hash64_opts_basic;
>  extern const PPCHash64Options ppc_hash64_opts_POWER7;
>  
> +static inline bool hash64_opts_has(const PPCHash64Options *opts,
> +                                   unsigned feature)
> +{
> +    return !!(opts->flags & feature);
> +}
> +
>  static inline bool ppc_hash64_has(PowerPCCPU *cpu, unsigned feature)
>  {
> -    return !!(cpu->hash64_opts->flags & feature);
> +    return hash64_opts_has(cpu->hash64_opts, feature);
>  }
>  
>  #endif /* CONFIG_USER_ONLY */
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] target/ppc/kvm: don't pass cpu to kvm_get_smmu_info()
  2018-06-28 11:56   ` Cédric Le Goater
@ 2018-06-28 12:14     ` Greg Kurz
  0 siblings, 0 replies; 15+ messages in thread
From: Greg Kurz @ 2018-06-28 12:14 UTC (permalink / raw)
  To: Cédric Le Goater
  Cc: qemu-devel, qemu-ppc, David Gibson, Paolo Bonzini,
	Richard Henderson, Eduardo Habkost

On Thu, 28 Jun 2018 13:56:58 +0200
Cédric Le Goater <clg@kaod.org> wrote:

> On 06/28/2018 12:14 PM, Greg Kurz wrote:
> > In a future patch the machine code will need to retrieve the MMU
> > information from KVM during machine initialization before the CPUs
> > are created.
> > 
> > Actually, KVM_PPC_GET_SMMU_INFO is a VM class ioctl, and thus, it only
> > needs the kvm_state global variable to be set. The fallback code only
> > needs informations from the machine's CPU model class.
> > 
> > So this patch basically drops the CPU argument to kvm_get_smmu_info()
> > and kvm_get_fallback_smmu_info(). The kvm_state and current_machine
> > globals are used instead to reach out to KVM and CPU model details
> > respectively.
> > 
> > Signed-off-by: Greg Kurz <groug@kaod.org>  
> 
> Reviewed-by: Cédric Le Goater <clg@kaod.org>
> 
> one question below,
> 
> Thanks,
> 
> C.
> 
> > ---
> >  target/ppc/kvm.c        |   37 ++++++++++++++++++-------------------
> >  target/ppc/mmu-hash64.h |    8 +++++++-
> >  2 files changed, 25 insertions(+), 20 deletions(-)
> > 
> > diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
> > index 4df4ff6cbff2..9fae89ff8175 100644
> > --- a/target/ppc/kvm.c
> > +++ b/target/ppc/kvm.c
> > @@ -248,11 +248,12 @@ static int kvm_booke206_tlb_init(PowerPCCPU *cpu)
> >  
> >  
> >  #if defined(TARGET_PPC64)
> > -static void kvm_get_fallback_smmu_info(PowerPCCPU *cpu,
> > -                                       struct kvm_ppc_smmu_info *info)
> > +static void kvm_get_fallback_smmu_info(struct kvm_ppc_smmu_info *info)
> >  {
> > -    CPUPPCState *env = &cpu->env;
> > -    CPUState *cs = CPU(cpu);
> > +    const PowerPCCPUClass *pcc;
> > +
> > +    assert(current_machine != NULL);  
> 
> can that ever happen ? 
> 

It shouldn't, hence the assert() rather than merely SEGV'ing below.

> > +    pcc = POWERPC_CPU_CLASS(object_class_by_name(current_machine->cpu_type));
> >  
> >      memset(info, 0, sizeof(*info));
> >  
> > @@ -278,7 +279,7 @@ static void kvm_get_fallback_smmu_info(PowerPCCPU *cpu,
> >       *   implements KVM_CAP_PPC_GET_SMMU_INFO and thus doesn't hit
> >       *   this fallback.
> >       */
> > -    if (kvmppc_is_pr(cs->kvm_state)) {
> > +    if (kvmppc_is_pr(kvm_state)) {
> >          /* No flags */
> >          info->flags = 0;
> >          info->slb_size = 64;
> > @@ -300,12 +301,12 @@ static void kvm_get_fallback_smmu_info(PowerPCCPU *cpu,
> >          /* HV KVM has backing store size restrictions */
> >          info->flags = KVM_PPC_PAGE_SIZES_REAL;
> >  
> > -        if (ppc_hash64_has(cpu, PPC_HASH64_1TSEG)) {
> > +        if (hash64_opts_has(pcc->hash64_opts, PPC_HASH64_1TSEG)) {
> >              info->flags |= KVM_PPC_1T_SEGMENTS;
> >          }
> >  
> > -        if (env->mmu_model == POWERPC_MMU_2_06 ||
> > -            env->mmu_model == POWERPC_MMU_2_07) {
> > +        if (pcc->mmu_model == POWERPC_MMU_2_06 ||
> > +            pcc->mmu_model == POWERPC_MMU_2_07) {
> >              info->slb_size = 32;
> >          } else {
> >              info->slb_size = 64;
> > @@ -319,8 +320,8 @@ static void kvm_get_fallback_smmu_info(PowerPCCPU *cpu,
> >          i++;
> >  
> >          /* 64K on MMU 2.06 and later */
> > -        if (env->mmu_model == POWERPC_MMU_2_06 ||
> > -            env->mmu_model == POWERPC_MMU_2_07) {
> > +        if (pcc->mmu_model == POWERPC_MMU_2_06 ||
> > +            pcc->mmu_model == POWERPC_MMU_2_07) {
> >              info->sps[i].page_shift = 16;
> >              info->sps[i].slb_enc = 0x110;
> >              info->sps[i].enc[0].page_shift = 16;
> > @@ -336,19 +337,18 @@ static void kvm_get_fallback_smmu_info(PowerPCCPU *cpu,
> >      }
> >  }
> >  
> > -static void kvm_get_smmu_info(PowerPCCPU *cpu, struct kvm_ppc_smmu_info *info)
> > +static void kvm_get_smmu_info(struct kvm_ppc_smmu_info *info)
> >  {
> > -    CPUState *cs = CPU(cpu);
> >      int ret;
> >  
> > -    if (kvm_check_extension(cs->kvm_state, KVM_CAP_PPC_GET_SMMU_INFO)) {
> > -        ret = kvm_vm_ioctl(cs->kvm_state, KVM_PPC_GET_SMMU_INFO, info);
> > +    if (kvm_check_extension(kvm_state, KVM_CAP_PPC_GET_SMMU_INFO)) {
> > +        ret = kvm_vm_ioctl(kvm_state, KVM_PPC_GET_SMMU_INFO, info);
> >          if (ret == 0) {
> >              return;
> >          }
> >      }
> >  
> > -    kvm_get_fallback_smmu_info(cpu, info);
> > +    kvm_get_fallback_smmu_info(info);
> >  }
> >  
> >  struct ppc_radix_page_info *kvm_get_radix_page_info(void)
> > @@ -408,14 +408,13 @@ target_ulong kvmppc_configure_v3_mmu(PowerPCCPU *cpu,
> >  
> >  bool kvmppc_hpt_needs_host_contiguous_pages(void)
> >  {
> > -    PowerPCCPU *cpu = POWERPC_CPU(first_cpu);
> >      static struct kvm_ppc_smmu_info smmu_info;
> >  
> >      if (!kvm_enabled()) {
> >          return false;
> >      }
> >  
> > -    kvm_get_smmu_info(cpu, &smmu_info);
> > +    kvm_get_smmu_info(&smmu_info);
> >      return !!(smmu_info.flags & KVM_PPC_PAGE_SIZES_REAL);
> >  }
> >  
> > @@ -429,7 +428,7 @@ void kvm_check_mmu(PowerPCCPU *cpu, Error **errp)
> >          return;
> >      }
> >  
> > -    kvm_get_smmu_info(cpu, &smmu_info);
> > +    kvm_get_smmu_info(&smmu_info);
> >  
> >      if (ppc_hash64_has(cpu, PPC_HASH64_1TSEG)
> >          && !(smmu_info.flags & KVM_PPC_1T_SEGMENTS)) {
> > @@ -2168,7 +2167,7 @@ uint64_t kvmppc_rma_size(uint64_t current_size, unsigned int hash_shift)
> >  
> >      /* Find the largest hardware supported page size that's less than
> >       * or equal to the (logical) backing page size of guest RAM */
> > -    kvm_get_smmu_info(POWERPC_CPU(first_cpu), &info);
> > +    kvm_get_smmu_info(&info);
> >      rampagesize = qemu_getrampagesize();
> >      best_page_shift = 0;
> >  
> > diff --git a/target/ppc/mmu-hash64.h b/target/ppc/mmu-hash64.h
> > index f11efc9cbc1f..00a249f26bc3 100644
> > --- a/target/ppc/mmu-hash64.h
> > +++ b/target/ppc/mmu-hash64.h
> > @@ -169,9 +169,15 @@ struct PPCHash64Options {
> >  extern const PPCHash64Options ppc_hash64_opts_basic;
> >  extern const PPCHash64Options ppc_hash64_opts_POWER7;
> >  
> > +static inline bool hash64_opts_has(const PPCHash64Options *opts,
> > +                                   unsigned feature)
> > +{
> > +    return !!(opts->flags & feature);
> > +}
> > +
> >  static inline bool ppc_hash64_has(PowerPCCPU *cpu, unsigned feature)
> >  {
> > -    return !!(cpu->hash64_opts->flags & feature);
> > +    return hash64_opts_has(cpu->hash64_opts, feature);
> >  }
> >  
> >  #endif /* CONFIG_USER_ONLY */
> >   
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 0/3] spapr: fix regression with older machine types
  2018-06-28 10:14 [Qemu-devel] [PATCH 0/3] spapr: fix regression with older machine types Greg Kurz
                   ` (2 preceding siblings ...)
  2018-06-28 10:15 ` [Qemu-devel] [PATCH 3/3] accel: forbid early use of kvm_enabled() and friends Greg Kurz
@ 2018-06-28 19:48 ` Greg Kurz
  2018-06-29  5:21   ` David Gibson
  3 siblings, 1 reply; 15+ messages in thread
From: Greg Kurz @ 2018-06-28 19:48 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, qemu-ppc, Cédric Le Goater, Paolo Bonzini,
	David Gibson, Richard Henderson

On Thu, 28 Jun 2018 12:14:25 +0200
Greg Kurz <groug@kaod.org> wrote:

> Since the recent cleanups to hide host configuration details from guests,
> it isn't possible to start an older machine type with HV KVM [*]:
> 
> qemu-system-ppc64: KVM doesn't support for base page shift 34
> 
> This basically boils down to the fact that it isn't safe to call
> the kvmppc_hpt_needs_host_contiguous_pages() helper from a class
> init function because:
> - KVM isn't initialized yet, and kvm_enabled() always return false
>   in this case. This causes kvmppc_hpt_needs_host_contiguous_pages()
>   to do nothing and we end up choosing a 16G default page size
>   which is not supported by KVM.
> - even if we drop kvm_enabled() we then have the issue that
>   kvmppc_hpt_needs_host_contiguous_pages() assumes CPUs are
>   created, which isn't the case either.
> 
> The choice was made to initialize capabilities during machine
> init before creating the CPUs, and I don't think we should
> revert to the previous behavior. Let's go forward instead and
> ensure we can retrieve the MMU information from KVM before
> CPUs are created.
> 
> To fix this, we first change kvm_get_smmu_info() so that it
> doesn't need a CPU object. This allows to stop using first_cpu
> in kvmppc_hpt_needs_host_contiguous_pages(). Then we delay
> the setting of the default value to machine init time, so
> that we're sure that KVM is fully initialized.
> 
> As a bonus, the last patch is a tentative to be able to detect
> such misuse of *_enabled() accelerator helpers earlier.
> 
> Please comment.
> 
> [*] it also breaks PR KVM actually, but the error is different and
>     I need to dig some more.
> 

With current master:

1) qemu-system-ppc64 -machine pseries,accel=kvm,kvm-type=PR

The guest starts but its kernel oopses at some point:

[    0.011328] kernel tried to execute exec-protected page (c000000001611244) -exploit attempt? (uid: 0)
[    0.011379] Unable to handle kernel paging request for instruction fetch
[    0.011416] Faulting instruction address: 0xc000000001611244
[    0.011453] Oops: Kernel access of bad area, sig: 11 [#1]
[    0.011482] LE SMP NR_CPUS=1024 NUMA pSeries
[    0.011512] Modules linked in:
[    0.011557] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.17.2-200.fc28.ppc64le #1
[    0.011600] NIP:  c000000001611244 LR: c00000000000acec CTR: 0000000000000000
[    0.011643] REGS: c00000003fffba90 TRAP: 0400   Not tainted  (4.17.2-200.fc28.ppc64le)
[    0.011694] MSR:  b000000010001033 <SF,HV,ME,IR,DR,RI,LE>  CR: 28000848  XER: 20000000
[    0.011741] CFAR: 0000000000000000 SOFTE: 1 
[    0.011741] GPR00: 0000000000000000 c00000003fffbd10 c000000001570b00 c00000003fffbd80 
[    0.011741] GPR04: c000000000034418 0000000048000000 000000000000000a 000000004aa21de8 
[    0.011741] GPR08: 000000007d410164 0000000000000000 0000000000000002 0000000000000900 
[    0.011741] GPR12: b000000002009033 c000000001840000 c000000000071a2c 00000000495de1a4 
[    0.011741] GPR16: 0000000000000078 c00000000160fd10 c000000000e705e0 000000007c1b03a6 
[    0.011741] GPR20: 000000007c1ffaa6 c0000000016125b8 c0000000014253e8 000000007c1303a6 
[    0.011741] GPR24: 000000007c1643a6 000000007c1a03a6 c00000000160fd08 ffffffffebc0f008 
[    0.011741] GPR28: ffffffffebc0f000 c0000000000345d8 c0000000000345d8 0000000000000000 
[    0.012138] NIP [c000000001611244] kvm_tmp+0x1534/0x100000
[    0.012170] LR [c00000000000acec] soft_nmi_common+0xcc/0xd0
[    0.012199] Call Trace:
[    0.012214] Instruction dump:
[    0.012236] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX 
[    0.012289] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX 
[    0.012334] ---[ end trace d2ee28832d481d2d ]---
[    0.012362] 
[    1.012387] kernel tried to execute exec-protected page (c000000001611808) -exploit attempt? (uid: 0)
[    1.012433] Unable to handle kernel paging request for instruction fetch
[    1.012468] Faulting instruction address: 0xc000000001611808
[    1.012504] Oops: Kernel access of bad area, sig: 11 [#2]
[    1.012532] LE SMP NR_CPUS=1024 NUMA pSeries
[    1.012561] Modules linked in:
[    1.012583] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G      D           4.17.2-200.fc28.ppc64le #1
[    1.012641] NIP:  c000000001611808 LR: c0000000001247fc CTR: c000000001840000
[    1.012684] REGS: c00000003fffb5d0 TRAP: 0400   Tainted: G      D            (4.17.2-200.fc28.ppc64le)
[    1.012740] MSR:  b000000010001033 <SF,HV,ME,IR,DR,RI,LE>  CR: 48000224  XER: 20000000
[    1.012785] CFAR: 0000000000000000 SOFTE: 0 
[    1.012785] GPR00: c0000000001247fc c00000003fffb850 c000000001570b00 0000000000000000 
[    1.012785] GPR04: 0000000000000000 c0000000fe9e4900 fffffffffffffffd c0000000fe9e4900 
[    1.012785] GPR08: 00000000fed50000 b000000000001033 0000000000000009 c00000003fffb55f 
[    1.012785] GPR12: 0000000000000000 c000000001840000 c000000000071a2c 00000000495de1a4 
[    1.012785] GPR16: 0000000000000078 c00000000160fd10 c000000000e705e0 000000007c1b03a6 
[    1.012785] GPR20: 000000007c1ffaa6 c0000000016125b8 c0000000014253e8 000000007c1303a6 
[    1.012785] GPR24: 000000007c1643a6 000000007c1a03a6 c00000000160fd08 ffffffffebc0f008 
[    1.012785] GPR28: 0000000000000000 000000000000000b 000000000000000b c0000000fe9e4900 
[    1.013166] NIP [c000000001611808] kvm_tmp+0x1af8/0x100000
[    1.013196] LR [c0000000001247fc] do_exit+0x12c/0xd30
[    1.013224] Call Trace:
[    1.013238] Instruction dump:
[    1.013260] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX 
[    1.013303] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX 
[    1.013348] ---[ end trace d2ee28832d481d2e ]---
[    1.013375] 
[    2.013391] Fixing recursive fault but reboot is needed!

and the guest gets unresponsive.

2) qemu-system-ppc64 -machine pseries-2.12,accel=kvm,kvm-type=PR

prints an error message and terminates right away:

qemu-system-ppc64: KVM doesn't support page shift 24/12

This error is expected: since PR KVM doesn't set KVM_PPC_PAGE_SIZES_REAL,
ie, we choose to support all possible page sizes, but PR KVM doesn't
support this page shift combination indeed. Unsurprisingly we get the
same error with:

-machine pseries,accel-kvm,kvm-type=PR,cap-hpt-max-page-size=${pagesize}

if ${pagesize} is >= 16m. This is the result of PR KVM not supporting
MPSS at all, even though it supports 16m pages in a 16m segment. We
cannot really fix this in QEMU, unless we completely filter out MPSS
in spapr_pagesize_cb() but I'm pretty sure we don't want that. :)

But then, if we go for a 64k limit, we hit 1).

An obvious change in the DT since the page size cleanup is:

                            [4k seg    [4k pg]] [64k seg      [64k pg]] [16m seg      [16m pg]]
- ibm,segment-page-sizes = <0xc 0x0 0x1 0xc 0x0 0x10 0x110 0x1 0x10 0x1 0x18 0x100 0x1 0x18 0x0>;
+ ibm,segment-page-sizes = <0xc 0x0 0x1 0xc 0x0 0x10 0x110 0x1 0x10 0x1>;
                            [4k seg    [4k pg]] [64k seg      [64k pg]]

If I add the 16m entry back, the guest boots just fine.

Not sure yet what's happening... any idea ?

Cheers,

--
Greg


> --
> Greg
> 
> ---
> 
> Greg Kurz (3):
>       target/ppc/kvm: don't pass cpu to kvm_get_smmu_info()
>       spapr: compute default value of "hpt-max-page-size" later
>       accel: forbid early use of kvm_enabled() and friends
> 
> 
>  accel/accel.c           |    7 +++++++
>  hw/ppc/spapr.c          |   25 ++++++++++++++++++-------
>  include/qemu-common.h   |    3 ++-
>  include/sysemu/accel.h  |    1 +
>  include/sysemu/kvm.h    |    3 ++-
>  qom/cpu.c               |    1 +
>  stubs/Makefile.objs     |    1 +
>  stubs/accel.c           |   14 ++++++++++++++
>  target/i386/hax-all.c   |    2 +-
>  target/i386/whpx-all.c  |    2 +-
>  target/ppc/kvm.c        |   37 ++++++++++++++++++-------------------
>  target/ppc/mmu-hash64.h |    8 +++++++-
>  12 files changed, 73 insertions(+), 31 deletions(-)
> 
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] target/ppc/kvm: don't pass cpu to kvm_get_smmu_info()
  2018-06-28 10:14 ` [Qemu-devel] [PATCH 1/3] target/ppc/kvm: don't pass cpu to kvm_get_smmu_info() Greg Kurz
  2018-06-28 11:56   ` Cédric Le Goater
@ 2018-06-29  5:16   ` David Gibson
  1 sibling, 0 replies; 15+ messages in thread
From: David Gibson @ 2018-06-29  5:16 UTC (permalink / raw)
  To: Greg Kurz
  Cc: qemu-devel, qemu-ppc, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Cédric Le Goater

[-- Attachment #1: Type: text/plain, Size: 6983 bytes --]

On Thu, Jun 28, 2018 at 12:14:51PM +0200, Greg Kurz wrote:
> In a future patch the machine code will need to retrieve the MMU
> information from KVM during machine initialization before the CPUs
> are created.
> 
> Actually, KVM_PPC_GET_SMMU_INFO is a VM class ioctl, and thus, it only
> needs the kvm_state global variable to be set. The fallback code only
> needs informations from the machine's CPU model class.
> 
> So this patch basically drops the CPU argument to kvm_get_smmu_info()
> and kvm_get_fallback_smmu_info(). The kvm_state and current_machine
> globals are used instead to reach out to KVM and CPU model details
> respectively.
> 
> Signed-off-by: Greg Kurz <groug@kaod.org>

This looks more or less correct, but I think there's a better way to
approach it.

Something I meant to do once the pagesize cleanups were made, but
didn't get around to is to eliminate kvm_get_fallback_smmu_info()
entirely.  Instead we can allow kvm_get_smmu_info() to fail.  Since
we're now just checking the configured setup against KVM's
capabilities, rather than adjusting the configured setup according to
KVM's capabilities, we can just skip the checking if we can't get the
smmu info.

That might result in slightly more cryptic failure modes errors for
really old kernels, but I think that's worth it for simpler logic.

> ---
>  target/ppc/kvm.c        |   37 ++++++++++++++++++-------------------
>  target/ppc/mmu-hash64.h |    8 +++++++-
>  2 files changed, 25 insertions(+), 20 deletions(-)
> 
> diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
> index 4df4ff6cbff2..9fae89ff8175 100644
> --- a/target/ppc/kvm.c
> +++ b/target/ppc/kvm.c
> @@ -248,11 +248,12 @@ static int kvm_booke206_tlb_init(PowerPCCPU *cpu)
>  
>  
>  #if defined(TARGET_PPC64)
> -static void kvm_get_fallback_smmu_info(PowerPCCPU *cpu,
> -                                       struct kvm_ppc_smmu_info *info)
> +static void kvm_get_fallback_smmu_info(struct kvm_ppc_smmu_info *info)
>  {
> -    CPUPPCState *env = &cpu->env;
> -    CPUState *cs = CPU(cpu);
> +    const PowerPCCPUClass *pcc;
> +
> +    assert(current_machine != NULL);
> +    pcc = POWERPC_CPU_CLASS(object_class_by_name(current_machine->cpu_type));
>  
>      memset(info, 0, sizeof(*info));
>  
> @@ -278,7 +279,7 @@ static void kvm_get_fallback_smmu_info(PowerPCCPU *cpu,
>       *   implements KVM_CAP_PPC_GET_SMMU_INFO and thus doesn't hit
>       *   this fallback.
>       */
> -    if (kvmppc_is_pr(cs->kvm_state)) {
> +    if (kvmppc_is_pr(kvm_state)) {
>          /* No flags */
>          info->flags = 0;
>          info->slb_size = 64;
> @@ -300,12 +301,12 @@ static void kvm_get_fallback_smmu_info(PowerPCCPU *cpu,
>          /* HV KVM has backing store size restrictions */
>          info->flags = KVM_PPC_PAGE_SIZES_REAL;
>  
> -        if (ppc_hash64_has(cpu, PPC_HASH64_1TSEG)) {
> +        if (hash64_opts_has(pcc->hash64_opts, PPC_HASH64_1TSEG)) {
>              info->flags |= KVM_PPC_1T_SEGMENTS;
>          }
>  
> -        if (env->mmu_model == POWERPC_MMU_2_06 ||
> -            env->mmu_model == POWERPC_MMU_2_07) {
> +        if (pcc->mmu_model == POWERPC_MMU_2_06 ||
> +            pcc->mmu_model == POWERPC_MMU_2_07) {
>              info->slb_size = 32;
>          } else {
>              info->slb_size = 64;
> @@ -319,8 +320,8 @@ static void kvm_get_fallback_smmu_info(PowerPCCPU *cpu,
>          i++;
>  
>          /* 64K on MMU 2.06 and later */
> -        if (env->mmu_model == POWERPC_MMU_2_06 ||
> -            env->mmu_model == POWERPC_MMU_2_07) {
> +        if (pcc->mmu_model == POWERPC_MMU_2_06 ||
> +            pcc->mmu_model == POWERPC_MMU_2_07) {
>              info->sps[i].page_shift = 16;
>              info->sps[i].slb_enc = 0x110;
>              info->sps[i].enc[0].page_shift = 16;
> @@ -336,19 +337,18 @@ static void kvm_get_fallback_smmu_info(PowerPCCPU *cpu,
>      }
>  }
>  
> -static void kvm_get_smmu_info(PowerPCCPU *cpu, struct kvm_ppc_smmu_info *info)
> +static void kvm_get_smmu_info(struct kvm_ppc_smmu_info *info)
>  {
> -    CPUState *cs = CPU(cpu);
>      int ret;
>  
> -    if (kvm_check_extension(cs->kvm_state, KVM_CAP_PPC_GET_SMMU_INFO)) {
> -        ret = kvm_vm_ioctl(cs->kvm_state, KVM_PPC_GET_SMMU_INFO, info);
> +    if (kvm_check_extension(kvm_state, KVM_CAP_PPC_GET_SMMU_INFO)) {
> +        ret = kvm_vm_ioctl(kvm_state, KVM_PPC_GET_SMMU_INFO, info);
>          if (ret == 0) {
>              return;
>          }
>      }
>  
> -    kvm_get_fallback_smmu_info(cpu, info);
> +    kvm_get_fallback_smmu_info(info);
>  }
>  
>  struct ppc_radix_page_info *kvm_get_radix_page_info(void)
> @@ -408,14 +408,13 @@ target_ulong kvmppc_configure_v3_mmu(PowerPCCPU *cpu,
>  
>  bool kvmppc_hpt_needs_host_contiguous_pages(void)
>  {
> -    PowerPCCPU *cpu = POWERPC_CPU(first_cpu);
>      static struct kvm_ppc_smmu_info smmu_info;
>  
>      if (!kvm_enabled()) {
>          return false;
>      }
>  
> -    kvm_get_smmu_info(cpu, &smmu_info);
> +    kvm_get_smmu_info(&smmu_info);
>      return !!(smmu_info.flags & KVM_PPC_PAGE_SIZES_REAL);
>  }
>  
> @@ -429,7 +428,7 @@ void kvm_check_mmu(PowerPCCPU *cpu, Error **errp)
>          return;
>      }
>  
> -    kvm_get_smmu_info(cpu, &smmu_info);
> +    kvm_get_smmu_info(&smmu_info);
>  
>      if (ppc_hash64_has(cpu, PPC_HASH64_1TSEG)
>          && !(smmu_info.flags & KVM_PPC_1T_SEGMENTS)) {
> @@ -2168,7 +2167,7 @@ uint64_t kvmppc_rma_size(uint64_t current_size, unsigned int hash_shift)
>  
>      /* Find the largest hardware supported page size that's less than
>       * or equal to the (logical) backing page size of guest RAM */
> -    kvm_get_smmu_info(POWERPC_CPU(first_cpu), &info);
> +    kvm_get_smmu_info(&info);
>      rampagesize = qemu_getrampagesize();
>      best_page_shift = 0;
>  
> diff --git a/target/ppc/mmu-hash64.h b/target/ppc/mmu-hash64.h
> index f11efc9cbc1f..00a249f26bc3 100644
> --- a/target/ppc/mmu-hash64.h
> +++ b/target/ppc/mmu-hash64.h
> @@ -169,9 +169,15 @@ struct PPCHash64Options {
>  extern const PPCHash64Options ppc_hash64_opts_basic;
>  extern const PPCHash64Options ppc_hash64_opts_POWER7;
>  
> +static inline bool hash64_opts_has(const PPCHash64Options *opts,
> +                                   unsigned feature)
> +{
> +    return !!(opts->flags & feature);
> +}

Any exported function from mmu-hash64.c shoud have a name prefixed
with "ppc_hash64_".

>  static inline bool ppc_hash64_has(PowerPCCPU *cpu, unsigned feature)
>  {
> -    return !!(cpu->hash64_opts->flags & feature);
> +    return hash64_opts_has(cpu->hash64_opts, feature);
>  }
>  
>  #endif /* CONFIG_USER_ONLY */
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH 2/3] spapr: compute default value of "hpt-max-page-size" later
  2018-06-28 10:15 ` [Qemu-devel] [PATCH 2/3] spapr: compute default value of "hpt-max-page-size" later Greg Kurz
@ 2018-06-29  5:16   ` David Gibson
  2018-06-29 19:08   ` Eduardo Habkost
  1 sibling, 0 replies; 15+ messages in thread
From: David Gibson @ 2018-06-29  5:16 UTC (permalink / raw)
  To: Greg Kurz
  Cc: qemu-devel, qemu-ppc, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Cédric Le Goater

[-- Attachment #1: Type: text/plain, Size: 3738 bytes --]

On Thu, Jun 28, 2018 at 12:15:14PM +0200, Greg Kurz wrote:
> It is currently not possible to run a pseries-2.12 or older machine
> with HV KVM. QEMU prints the following and exits right away.
> 
> qemu-system-ppc64: KVM doesn't support for base page shift 34
> 
> The "hpt-max-page-size" capability was recently added to spapr to hide
> host configuration details from HPT mode guests. Its default value for
> newer machine types is 64k.
> 
> For backwards compatibility, pseries-2.12 and older machine types need
> a different value. This is handled as usual in a class init function.
> The default value is 16G, ie, all page sizes supported by POWER7 and
> newer CPUs, but HV KVM requires guest pages to be hpa contiguous as
> well as gpa contiguous. The default value is the page size used to
> back the guest RAM in this case.
> 
> Unfortunately kvmppc_hpt_needs_host_contiguous_pages()->kvm_enabled() is
> called way before KVM init and returns false, even if the user requested
> KVM. We thus end up selecting 16G, which isn't supported by HV KVM.
> 
> We fix this by moving the logic to spapr_machine_init() because this
> is the earliest call where we're sure kvm_enabled() can be trusted.
> Since the user cannot pass cap-hpt-max-page-size=0, we set the default
> to 0 in the pseries-2.12 class init function and use that as a flag
> in spapr_machine_init() to do the real work.
> 
> Signed-off-by: Greg Kurz <groug@kaod.org>
> ---
>  hw/ppc/spapr.c |   25 ++++++++++++++++++-------
>  1 file changed, 18 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 8cc996d0b822..1eb45cd8c424 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -2527,6 +2527,18 @@ static void spapr_machine_init(MachineState *machine)
>      QLIST_INIT(&spapr->phbs);
>      QTAILQ_INIT(&spapr->pending_dimm_unplugs);
>  
> +    /* This is for pseries-2.12 and older machine types */
> +    if (!smc->default_caps.caps[SPAPR_CAP_HPT_MAXPAGESIZE]) {
> +        uint8_t mps;
> +
> +        if (kvmppc_hpt_needs_host_contiguous_pages()) {
> +            mps = ctz64(qemu_getrampagesize());
> +        } else {
> +            mps = 34; /* allow everything up to 16GiB, i.e. everything */
> +        }
> +        smc->default_caps.caps[SPAPR_CAP_HPT_MAXPAGESIZE] = mps;
> +    }
> +

The logic above should be folded into spapr_caps_init().

>      /* Determine capabilities to run with */
>      spapr_caps_init(spapr);
>  
> @@ -4103,17 +4115,16 @@ static void spapr_machine_2_12_instance_options(MachineState *machine)
>  static void spapr_machine_2_12_class_options(MachineClass *mc)
>  {
>      sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(mc);
> -    uint8_t mps;
>  
>      spapr_machine_3_0_class_options(mc);
>      SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_12);
>  
> -    if (kvmppc_hpt_needs_host_contiguous_pages()) {
> -        mps = ctz64(qemu_getrampagesize());
> -    } else {
> -        mps = 34; /* allow everything up to 16GiB, i.e. everything */
> -    }
> -    smc->default_caps.caps[SPAPR_CAP_HPT_MAXPAGESIZE] = mps;
> +    /* We depend on kvm_enabled() to choose a default value for the
> +     * hpt-max-page-size capability. Of course we can't do it here
> +     * because this is too early and the HW accelerator isn't initialzed
> +     * yet. Postpone this to spapr_machine_init().
> +     */
> +    smc->default_caps.caps[SPAPR_CAP_HPT_MAXPAGESIZE] = 0;
>  }
>  
>  DEFINE_SPAPR_MACHINE(2_12, "2.12", false);
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH 3/3] accel: forbid early use of kvm_enabled() and friends
  2018-06-28 10:15 ` [Qemu-devel] [PATCH 3/3] accel: forbid early use of kvm_enabled() and friends Greg Kurz
@ 2018-06-29  5:18   ` David Gibson
  2018-06-29 10:23     ` Greg Kurz
  2018-06-29 19:58   ` Eduardo Habkost
  1 sibling, 1 reply; 15+ messages in thread
From: David Gibson @ 2018-06-29  5:18 UTC (permalink / raw)
  To: Greg Kurz
  Cc: qemu-devel, qemu-ppc, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Cédric Le Goater

[-- Attachment #1: Type: text/plain, Size: 5666 bytes --]

On Thu, Jun 28, 2018 at 12:15:33PM +0200, Greg Kurz wrote:
> It is unsafe to rely on *_enabled() helpers before the accelerator has
> been initialized, ie, accel_init_machine() has succeeded, because they
> always return false. But it is still possible to end up calling them
> indirectly by inadvertance, and cause QEMU to misbehave.
> 
> This patch causes QEMU to abort if we try to check for an accelerator
> before it has been set up. This will help to catch bugs earlier.
> 
> Signed-off-by: Greg Kurz <groug@kaod.org>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

I think this is a good idea, but it has pretty widereaching impact, so
it can't go through my tree.  You'll need to send this separately to
the list.

> ---
>  accel/accel.c          |    7 +++++++
>  include/qemu-common.h  |    3 ++-
>  include/sysemu/accel.h |    1 +
>  include/sysemu/kvm.h   |    3 ++-
>  qom/cpu.c              |    1 +
>  stubs/Makefile.objs    |    1 +
>  stubs/accel.c          |   14 ++++++++++++++
>  target/i386/hax-all.c  |    2 +-
>  target/i386/whpx-all.c |    2 +-
>  9 files changed, 30 insertions(+), 4 deletions(-)
>  create mode 100644 stubs/accel.c
> 
> diff --git a/accel/accel.c b/accel/accel.c
> index 966b2d8f536c..27900aac9cc5 100644
> --- a/accel/accel.c
> +++ b/accel/accel.c
> @@ -51,6 +51,13 @@ static AccelClass *accel_find(const char *opt_name)
>      return ac;
>  }
>  
> +bool assert_accelerator_initialized(bool allowed)
> +{
> +    assert(current_machine != NULL);
> +    assert(current_machine->accelerator != NULL);
> +    return allowed;
> +}
> +
>  static int accel_init_machine(AccelClass *acc, MachineState *ms)
>  {
>      ObjectClass *oc = OBJECT_CLASS(acc);
> diff --git a/include/qemu-common.h b/include/qemu-common.h
> index 85f4749aefb7..01d5e4d97dbf 100644
> --- a/include/qemu-common.h
> +++ b/include/qemu-common.h
> @@ -82,7 +82,8 @@ int qemu_openpty_raw(int *aslave, char *pty_name);
>  extern bool tcg_allowed;
>  void tcg_exec_init(unsigned long tb_size);
>  #ifdef CONFIG_TCG
> -#define tcg_enabled() (tcg_allowed)
> +#include "sysemu/accel.h"
> +#define tcg_enabled() (assert_accelerator_initialized(tcg_allowed))
>  #else
>  #define tcg_enabled() 0
>  #endif
> diff --git a/include/sysemu/accel.h b/include/sysemu/accel.h
> index 637358f43014..76965cb69cc9 100644
> --- a/include/sysemu/accel.h
> +++ b/include/sysemu/accel.h
> @@ -71,5 +71,6 @@ void configure_accelerator(MachineState *ms);
>  void accel_register_compat_props(AccelState *accel);
>  /* Called just before os_setup_post (ie just before drop OS privs) */
>  void accel_setup_post(MachineState *ms);
> +bool assert_accelerator_initialized(bool allowed);
>  
>  #endif
> diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
> index 0b64b8e06786..ac4dbb2d6d6d 100644
> --- a/include/sysemu/kvm.h
> +++ b/include/sysemu/kvm.h
> @@ -18,6 +18,7 @@
>  #include "qom/cpu.h"
>  #include "exec/memattrs.h"
>  #include "hw/irq.h"
> +#include "sysemu/accel.h"
>  
>  #ifdef NEED_CPU_H
>  # ifdef CONFIG_KVM
> @@ -46,7 +47,7 @@ extern bool kvm_direct_msi_allowed;
>  extern bool kvm_ioeventfd_any_length_allowed;
>  extern bool kvm_msi_use_devid;
>  
> -#define kvm_enabled()           (kvm_allowed)
> +#define kvm_enabled()           (assert_accelerator_initialized(kvm_allowed))
>  /**
>   * kvm_irqchip_in_kernel:
>   *
> diff --git a/qom/cpu.c b/qom/cpu.c
> index 92599f35413b..65a8f03a66a4 100644
> --- a/qom/cpu.c
> +++ b/qom/cpu.c
> @@ -23,6 +23,7 @@
>  #include "qemu-common.h"
>  #include "qom/cpu.h"
>  #include "sysemu/hw_accel.h"
> +#include "sysemu/accel.h"
>  #include "qemu/notify.h"
>  #include "qemu/log.h"
>  #include "exec/log.h"
> diff --git a/stubs/Makefile.objs b/stubs/Makefile.objs
> index 53d3f32cb258..2d5142287525 100644
> --- a/stubs/Makefile.objs
> +++ b/stubs/Makefile.objs
> @@ -43,3 +43,4 @@ stub-obj-y += xen-common.o
>  stub-obj-y += xen-hvm.o
>  stub-obj-y += pci-host-piix.o
>  stub-obj-y += ram-block.o
> +stub-obj-y += accel.o
> diff --git a/stubs/accel.c b/stubs/accel.c
> new file mode 100644
> index 000000000000..4f480f2d3f29
> --- /dev/null
> +++ b/stubs/accel.c
> @@ -0,0 +1,14 @@
> +/*
> + * accel stubs
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "sysemu/accel.h"
> +
> +bool assert_accelerator_initialized(bool allowed)
> +{
> +    return allowed;
> +}
> diff --git a/target/i386/hax-all.c b/target/i386/hax-all.c
> index d2e512856bb8..7c78bd7d094d 100644
> --- a/target/i386/hax-all.c
> +++ b/target/i386/hax-all.c
> @@ -57,7 +57,7 @@ static int hax_arch_get_registers(CPUArchState *env);
>  
>  int hax_enabled(void)
>  {
> -    return hax_allowed;
> +    return assert_accelerator_initialized(hax_allowed);
>  }
>  
>  int valid_hax_tunnel_size(uint16_t size)
> diff --git a/target/i386/whpx-all.c b/target/i386/whpx-all.c
> index 6b42096698ee..e7f6bc5958e7 100644
> --- a/target/i386/whpx-all.c
> +++ b/target/i386/whpx-all.c
> @@ -1422,7 +1422,7 @@ static int whpx_accel_init(MachineState *ms)
>  
>  int whpx_enabled(void)
>  {
> -    return whpx_allowed;
> +    return assert_accelerator_initialized(whpx_allowed);
>  }
>  
>  static void whpx_accel_class_init(ObjectClass *oc, void *data)
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 0/3] spapr: fix regression with older machine types
  2018-06-28 19:48 ` [Qemu-devel] [Qemu-ppc] [PATCH 0/3] spapr: fix regression with older machine types Greg Kurz
@ 2018-06-29  5:21   ` David Gibson
  0 siblings, 0 replies; 15+ messages in thread
From: David Gibson @ 2018-06-29  5:21 UTC (permalink / raw)
  To: Greg Kurz
  Cc: qemu-devel, Eduardo Habkost, qemu-ppc, Cédric Le Goater,
	Paolo Bonzini, Richard Henderson

[-- Attachment #1: Type: text/plain, Size: 7990 bytes --]

On Thu, Jun 28, 2018 at 09:48:25PM +0200, Greg Kurz wrote:
> On Thu, 28 Jun 2018 12:14:25 +0200
> Greg Kurz <groug@kaod.org> wrote:
> 
> > Since the recent cleanups to hide host configuration details from guests,
> > it isn't possible to start an older machine type with HV KVM [*]:
> > 
> > qemu-system-ppc64: KVM doesn't support for base page shift 34
> > 
> > This basically boils down to the fact that it isn't safe to call
> > the kvmppc_hpt_needs_host_contiguous_pages() helper from a class
> > init function because:
> > - KVM isn't initialized yet, and kvm_enabled() always return false
> >   in this case. This causes kvmppc_hpt_needs_host_contiguous_pages()
> >   to do nothing and we end up choosing a 16G default page size
> >   which is not supported by KVM.
> > - even if we drop kvm_enabled() we then have the issue that
> >   kvmppc_hpt_needs_host_contiguous_pages() assumes CPUs are
> >   created, which isn't the case either.
> > 
> > The choice was made to initialize capabilities during machine
> > init before creating the CPUs, and I don't think we should
> > revert to the previous behavior. Let's go forward instead and
> > ensure we can retrieve the MMU information from KVM before
> > CPUs are created.
> > 
> > To fix this, we first change kvm_get_smmu_info() so that it
> > doesn't need a CPU object. This allows to stop using first_cpu
> > in kvmppc_hpt_needs_host_contiguous_pages(). Then we delay
> > the setting of the default value to machine init time, so
> > that we're sure that KVM is fully initialized.
> > 
> > As a bonus, the last patch is a tentative to be able to detect
> > such misuse of *_enabled() accelerator helpers earlier.
> > 
> > Please comment.
> > 
> > [*] it also breaks PR KVM actually, but the error is different and
> >     I need to dig some more.
> > 
> 
> With current master:
> 
> 1) qemu-system-ppc64 -machine pseries,accel=kvm,kvm-type=PR
> 
> The guest starts but its kernel oopses at some point:
> 
> [    0.011328] kernel tried to execute exec-protected page (c000000001611244) -exploit attempt? (uid: 0)
> [    0.011379] Unable to handle kernel paging request for instruction fetch
> [    0.011416] Faulting instruction address: 0xc000000001611244
> [    0.011453] Oops: Kernel access of bad area, sig: 11 [#1]
> [    0.011482] LE SMP NR_CPUS=1024 NUMA pSeries
> [    0.011512] Modules linked in:
> [    0.011557] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.17.2-200.fc28.ppc64le #1
> [    0.011600] NIP:  c000000001611244 LR: c00000000000acec CTR: 0000000000000000
> [    0.011643] REGS: c00000003fffba90 TRAP: 0400   Not tainted  (4.17.2-200.fc28.ppc64le)
> [    0.011694] MSR:  b000000010001033 <SF,HV,ME,IR,DR,RI,LE>  CR: 28000848  XER: 20000000
> [    0.011741] CFAR: 0000000000000000 SOFTE: 1 
> [    0.011741] GPR00: 0000000000000000 c00000003fffbd10 c000000001570b00 c00000003fffbd80 
> [    0.011741] GPR04: c000000000034418 0000000048000000 000000000000000a 000000004aa21de8 
> [    0.011741] GPR08: 000000007d410164 0000000000000000 0000000000000002 0000000000000900 
> [    0.011741] GPR12: b000000002009033 c000000001840000 c000000000071a2c 00000000495de1a4 
> [    0.011741] GPR16: 0000000000000078 c00000000160fd10 c000000000e705e0 000000007c1b03a6 
> [    0.011741] GPR20: 000000007c1ffaa6 c0000000016125b8 c0000000014253e8 000000007c1303a6 
> [    0.011741] GPR24: 000000007c1643a6 000000007c1a03a6 c00000000160fd08 ffffffffebc0f008 
> [    0.011741] GPR28: ffffffffebc0f000 c0000000000345d8 c0000000000345d8 0000000000000000 
> [    0.012138] NIP [c000000001611244] kvm_tmp+0x1534/0x100000
> [    0.012170] LR [c00000000000acec] soft_nmi_common+0xcc/0xd0
> [    0.012199] Call Trace:
> [    0.012214] Instruction dump:
> [    0.012236] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX 
> [    0.012289] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX 
> [    0.012334] ---[ end trace d2ee28832d481d2d ]---
> [    0.012362] 
> [    1.012387] kernel tried to execute exec-protected page (c000000001611808) -exploit attempt? (uid: 0)
> [    1.012433] Unable to handle kernel paging request for instruction fetch
> [    1.012468] Faulting instruction address: 0xc000000001611808
> [    1.012504] Oops: Kernel access of bad area, sig: 11 [#2]
> [    1.012532] LE SMP NR_CPUS=1024 NUMA pSeries
> [    1.012561] Modules linked in:
> [    1.012583] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G      D           4.17.2-200.fc28.ppc64le #1
> [    1.012641] NIP:  c000000001611808 LR: c0000000001247fc CTR: c000000001840000
> [    1.012684] REGS: c00000003fffb5d0 TRAP: 0400   Tainted: G      D            (4.17.2-200.fc28.ppc64le)
> [    1.012740] MSR:  b000000010001033 <SF,HV,ME,IR,DR,RI,LE>  CR: 48000224  XER: 20000000
> [    1.012785] CFAR: 0000000000000000 SOFTE: 0 
> [    1.012785] GPR00: c0000000001247fc c00000003fffb850 c000000001570b00 0000000000000000 
> [    1.012785] GPR04: 0000000000000000 c0000000fe9e4900 fffffffffffffffd c0000000fe9e4900 
> [    1.012785] GPR08: 00000000fed50000 b000000000001033 0000000000000009 c00000003fffb55f 
> [    1.012785] GPR12: 0000000000000000 c000000001840000 c000000000071a2c 00000000495de1a4 
> [    1.012785] GPR16: 0000000000000078 c00000000160fd10 c000000000e705e0 000000007c1b03a6 
> [    1.012785] GPR20: 000000007c1ffaa6 c0000000016125b8 c0000000014253e8 000000007c1303a6 
> [    1.012785] GPR24: 000000007c1643a6 000000007c1a03a6 c00000000160fd08 ffffffffebc0f008 
> [    1.012785] GPR28: 0000000000000000 000000000000000b 000000000000000b c0000000fe9e4900 
> [    1.013166] NIP [c000000001611808] kvm_tmp+0x1af8/0x100000
> [    1.013196] LR [c0000000001247fc] do_exit+0x12c/0xd30
> [    1.013224] Call Trace:
> [    1.013238] Instruction dump:
> [    1.013260] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX 
> [    1.013303] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX 
> [    1.013348] ---[ end trace d2ee28832d481d2e ]---
> [    1.013375] 
> [    2.013391] Fixing recursive fault but reboot is needed!
> 
> and the guest gets unresponsive.

Huh, that's a bit weird.

> 2) qemu-system-ppc64 -machine pseries-2.12,accel=kvm,kvm-type=PR
> 
> prints an error message and terminates right away:
> 
> qemu-system-ppc64: KVM doesn't support page shift 24/12
> 
> This error is expected: since PR KVM doesn't set KVM_PPC_PAGE_SIZES_REAL,
> ie, we choose to support all possible page sizes, but PR KVM doesn't
> support this page shift combination indeed. Unsurprisingly we get the
> same error with:
> 
> -machine pseries,accel-kvm,kvm-type=PR,cap-hpt-max-page-size=${pagesize}
> 
> if ${pagesize} is >= 16m. This is the result of PR KVM not supporting
> MPSS at all, even though it supports 16m pages in a 16m segment. We
> cannot really fix this in QEMU, unless we completely filter out MPSS
> in spapr_pagesize_cb() but I'm pretty sure we don't want that. :)

Yeah.  I think sacrificing PR without special options (or fixing PR)
is the price we have to pay for sane behaviour otherwise here.

> But then, if we go for a 64k limit, we hit 1).
> 
> An obvious change in the DT since the page size cleanup is:
> 
>                             [4k seg    [4k pg]] [64k seg      [64k pg]] [16m seg      [16m pg]]
> - ibm,segment-page-sizes = <0xc 0x0 0x1 0xc 0x0 0x10 0x110 0x1 0x10 0x1 0x18 0x100 0x1 0x18 0x0>;
> + ibm,segment-page-sizes = <0xc 0x0 0x1 0xc 0x0 0x10 0x110 0x1 0x10 0x1>;
>                             [4k seg    [4k pg]] [64k seg      [64k pg]]
> 
> If I add the 16m entry back, the guest boots just fine.
> 
> Not sure yet what's happening... any idea ?

No, not sure why lacking 16m pages would break PR.


-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH 3/3] accel: forbid early use of kvm_enabled() and friends
  2018-06-29  5:18   ` David Gibson
@ 2018-06-29 10:23     ` Greg Kurz
  0 siblings, 0 replies; 15+ messages in thread
From: Greg Kurz @ 2018-06-29 10:23 UTC (permalink / raw)
  To: David Gibson
  Cc: qemu-devel, qemu-ppc, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Cédric Le Goater

[-- Attachment #1: Type: text/plain, Size: 6178 bytes --]

On Fri, 29 Jun 2018 15:18:22 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Thu, Jun 28, 2018 at 12:15:33PM +0200, Greg Kurz wrote:
> > It is unsafe to rely on *_enabled() helpers before the accelerator has
> > been initialized, ie, accel_init_machine() has succeeded, because they
> > always return false. But it is still possible to end up calling them
> > indirectly by inadvertance, and cause QEMU to misbehave.
> > 
> > This patch causes QEMU to abort if we try to check for an accelerator
> > before it has been set up. This will help to catch bugs earlier.
> > 
> > Signed-off-by: Greg Kurz <groug@kaod.org>  
> 
> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
> 
> I think this is a good idea, but it has pretty widereaching impact, so
> it can't go through my tree.  You'll need to send this separately to
> the list.
> 

Ok, and BTW a v2 is needed. See below.

> > ---
> >  accel/accel.c          |    7 +++++++
> >  include/qemu-common.h  |    3 ++-
> >  include/sysemu/accel.h |    1 +
> >  include/sysemu/kvm.h   |    3 ++-
> >  qom/cpu.c              |    1 +
> >  stubs/Makefile.objs    |    1 +
> >  stubs/accel.c          |   14 ++++++++++++++
> >  target/i386/hax-all.c  |    2 +-
> >  target/i386/whpx-all.c |    2 +-
> >  9 files changed, 30 insertions(+), 4 deletions(-)
> >  create mode 100644 stubs/accel.c
> > 
> > diff --git a/accel/accel.c b/accel/accel.c
> > index 966b2d8f536c..27900aac9cc5 100644
> > --- a/accel/accel.c
> > +++ b/accel/accel.c
> > @@ -51,6 +51,13 @@ static AccelClass *accel_find(const char *opt_name)
> >      return ac;
> >  }
> >  
> > +bool assert_accelerator_initialized(bool allowed)
> > +{
> > +    assert(current_machine != NULL);
> > +    assert(current_machine->accelerator != NULL);
> > +    return allowed;
> > +}
> > +
> >  static int accel_init_machine(AccelClass *acc, MachineState *ms)
> >  {
> >      ObjectClass *oc = OBJECT_CLASS(acc);
> > diff --git a/include/qemu-common.h b/include/qemu-common.h
> > index 85f4749aefb7..01d5e4d97dbf 100644
> > --- a/include/qemu-common.h
> > +++ b/include/qemu-common.h
> > @@ -82,7 +82,8 @@ int qemu_openpty_raw(int *aslave, char *pty_name);
> >  extern bool tcg_allowed;
> >  void tcg_exec_init(unsigned long tb_size);
> >  #ifdef CONFIG_TCG
> > -#define tcg_enabled() (tcg_allowed)
> > +#include "sysemu/accel.h"
> > +#define tcg_enabled() (assert_accelerator_initialized(tcg_allowed))
> >  #else
> >  #define tcg_enabled() 0
> >  #endif
> > diff --git a/include/sysemu/accel.h b/include/sysemu/accel.h
> > index 637358f43014..76965cb69cc9 100644
> > --- a/include/sysemu/accel.h
> > +++ b/include/sysemu/accel.h
> > @@ -71,5 +71,6 @@ void configure_accelerator(MachineState *ms);
> >  void accel_register_compat_props(AccelState *accel);
> >  /* Called just before os_setup_post (ie just before drop OS privs) */
> >  void accel_setup_post(MachineState *ms);
> > +bool assert_accelerator_initialized(bool allowed);
> >  
> >  #endif
> > diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
> > index 0b64b8e06786..ac4dbb2d6d6d 100644
> > --- a/include/sysemu/kvm.h
> > +++ b/include/sysemu/kvm.h
> > @@ -18,6 +18,7 @@
> >  #include "qom/cpu.h"
> >  #include "exec/memattrs.h"
> >  #include "hw/irq.h"
> > +#include "sysemu/accel.h"
> >  

This header should only be included if we actually need it...

> >  #ifdef NEED_CPU_H
> >  # ifdef CONFIG_KVM
> > @@ -46,7 +47,7 @@ extern bool kvm_direct_msi_allowed;
> >  extern bool kvm_ioeventfd_any_length_allowed;
> >  extern bool kvm_msi_use_devid;
> >  
> > -#define kvm_enabled()           (kvm_allowed)

... ie, here (like we do with TCG).

> > +#define kvm_enabled()           (assert_accelerator_initialized(kvm_allowed))
> >  /**
> >   * kvm_irqchip_in_kernel:
> >   *
> > diff --git a/qom/cpu.c b/qom/cpu.c
> > index 92599f35413b..65a8f03a66a4 100644
> > --- a/qom/cpu.c
> > +++ b/qom/cpu.c
> > @@ -23,6 +23,7 @@
> >  #include "qemu-common.h"
> >  #include "qom/cpu.h"
> >  #include "sysemu/hw_accel.h"
> > +#include "sysemu/accel.h"

This change is a leftover from an intermediate version of this patch.

I plan to keep your R-b since these changes are minor.

> >  #include "qemu/notify.h"
> >  #include "qemu/log.h"
> >  #include "exec/log.h"
> > diff --git a/stubs/Makefile.objs b/stubs/Makefile.objs
> > index 53d3f32cb258..2d5142287525 100644
> > --- a/stubs/Makefile.objs
> > +++ b/stubs/Makefile.objs
> > @@ -43,3 +43,4 @@ stub-obj-y += xen-common.o
> >  stub-obj-y += xen-hvm.o
> >  stub-obj-y += pci-host-piix.o
> >  stub-obj-y += ram-block.o
> > +stub-obj-y += accel.o
> > diff --git a/stubs/accel.c b/stubs/accel.c
> > new file mode 100644
> > index 000000000000..4f480f2d3f29
> > --- /dev/null
> > +++ b/stubs/accel.c
> > @@ -0,0 +1,14 @@
> > +/*
> > + * accel stubs
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> > + * See the COPYING file in the top-level directory.
> > + */
> > +
> > +#include "qemu/osdep.h"
> > +#include "sysemu/accel.h"
> > +
> > +bool assert_accelerator_initialized(bool allowed)
> > +{
> > +    return allowed;
> > +}
> > diff --git a/target/i386/hax-all.c b/target/i386/hax-all.c
> > index d2e512856bb8..7c78bd7d094d 100644
> > --- a/target/i386/hax-all.c
> > +++ b/target/i386/hax-all.c
> > @@ -57,7 +57,7 @@ static int hax_arch_get_registers(CPUArchState *env);
> >  
> >  int hax_enabled(void)
> >  {
> > -    return hax_allowed;
> > +    return assert_accelerator_initialized(hax_allowed);
> >  }
> >  
> >  int valid_hax_tunnel_size(uint16_t size)
> > diff --git a/target/i386/whpx-all.c b/target/i386/whpx-all.c
> > index 6b42096698ee..e7f6bc5958e7 100644
> > --- a/target/i386/whpx-all.c
> > +++ b/target/i386/whpx-all.c
> > @@ -1422,7 +1422,7 @@ static int whpx_accel_init(MachineState *ms)
> >  
> >  int whpx_enabled(void)
> >  {
> > -    return whpx_allowed;
> > +    return assert_accelerator_initialized(whpx_allowed);
> >  }
> >  
> >  static void whpx_accel_class_init(ObjectClass *oc, void *data)
> >   
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH 2/3] spapr: compute default value of "hpt-max-page-size" later
  2018-06-28 10:15 ` [Qemu-devel] [PATCH 2/3] spapr: compute default value of "hpt-max-page-size" later Greg Kurz
  2018-06-29  5:16   ` David Gibson
@ 2018-06-29 19:08   ` Eduardo Habkost
  2018-07-02  4:06     ` David Gibson
  1 sibling, 1 reply; 15+ messages in thread
From: Eduardo Habkost @ 2018-06-29 19:08 UTC (permalink / raw)
  To: Greg Kurz
  Cc: qemu-devel, qemu-ppc, David Gibson, Paolo Bonzini,
	Richard Henderson, Cédric Le Goater

On Thu, Jun 28, 2018 at 12:15:14PM +0200, Greg Kurz wrote:
[...]
> @@ -2527,6 +2527,18 @@ static void spapr_machine_init(MachineState *machine)
>      QLIST_INIT(&spapr->phbs);
>      QTAILQ_INIT(&spapr->pending_dimm_unplugs);
>  
> +    /* This is for pseries-2.12 and older machine types */
> +    if (!smc->default_caps.caps[SPAPR_CAP_HPT_MAXPAGESIZE]) {
> +        uint8_t mps;
> +
> +        if (kvmppc_hpt_needs_host_contiguous_pages()) {
> +            mps = ctz64(qemu_getrampagesize());
> +        } else {
> +            mps = 34; /* allow everything up to 16GiB, i.e. everything */
> +        }
> +        smc->default_caps.caps[SPAPR_CAP_HPT_MAXPAGESIZE] = mps;

Initializing class fields on instance init functions is a sign
that the information doesn't belong to the class struct.  Can't
this be handled at default_caps_with_cpu()?

> +    }
> +
>      /* Determine capabilities to run with */
>      spapr_caps_init(spapr);
>  
> @@ -4103,17 +4115,16 @@ static void spapr_machine_2_12_instance_options(MachineState *machine)
>  static void spapr_machine_2_12_class_options(MachineClass *mc)
>  {
>      sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(mc);
> -    uint8_t mps;
>  
>      spapr_machine_3_0_class_options(mc);
>      SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_12);
>  
> -    if (kvmppc_hpt_needs_host_contiguous_pages()) {
> -        mps = ctz64(qemu_getrampagesize());
> -    } else {
> -        mps = 34; /* allow everything up to 16GiB, i.e. everything */
> -    }
> -    smc->default_caps.caps[SPAPR_CAP_HPT_MAXPAGESIZE] = mps;
> +    /* We depend on kvm_enabled() to choose a default value for the
> +     * hpt-max-page-size capability. Of course we can't do it here
> +     * because this is too early and the HW accelerator isn't initialzed
> +     * yet. Postpone this to spapr_machine_init().
> +     */
> +    smc->default_caps.caps[SPAPR_CAP_HPT_MAXPAGESIZE] = 0;
>  }
>  
>  DEFINE_SPAPR_MACHINE(2_12, "2.12", false);
> 

-- 
Eduardo

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH 3/3] accel: forbid early use of kvm_enabled() and friends
  2018-06-28 10:15 ` [Qemu-devel] [PATCH 3/3] accel: forbid early use of kvm_enabled() and friends Greg Kurz
  2018-06-29  5:18   ` David Gibson
@ 2018-06-29 19:58   ` Eduardo Habkost
  1 sibling, 0 replies; 15+ messages in thread
From: Eduardo Habkost @ 2018-06-29 19:58 UTC (permalink / raw)
  To: Greg Kurz
  Cc: qemu-devel, qemu-ppc, David Gibson, Paolo Bonzini,
	Richard Henderson, Cédric Le Goater

On Thu, Jun 28, 2018 at 12:15:33PM +0200, Greg Kurz wrote:
> It is unsafe to rely on *_enabled() helpers before the accelerator has
> been initialized, ie, accel_init_machine() has succeeded, because they
> always return false. But it is still possible to end up calling them
> indirectly by inadvertance, and cause QEMU to misbehave.
> 
> This patch causes QEMU to abort if we try to check for an accelerator
> before it has been set up. This will help to catch bugs earlier.
> 
> Signed-off-by: Greg Kurz <groug@kaod.org>
> ---
>  accel/accel.c          |    7 +++++++
>  include/qemu-common.h  |    3 ++-
>  include/sysemu/accel.h |    1 +
>  include/sysemu/kvm.h   |    3 ++-
>  qom/cpu.c              |    1 +
>  stubs/Makefile.objs    |    1 +
>  stubs/accel.c          |   14 ++++++++++++++
>  target/i386/hax-all.c  |    2 +-
>  target/i386/whpx-all.c |    2 +-
>  9 files changed, 30 insertions(+), 4 deletions(-)
>  create mode 100644 stubs/accel.c
> 
> diff --git a/accel/accel.c b/accel/accel.c
> index 966b2d8f536c..27900aac9cc5 100644
> --- a/accel/accel.c
> +++ b/accel/accel.c
> @@ -51,6 +51,13 @@ static AccelClass *accel_find(const char *opt_name)
>      return ac;
>  }
>  
> +bool assert_accelerator_initialized(bool allowed)
> +{
> +    assert(current_machine != NULL);
> +    assert(current_machine->accelerator != NULL);
> +    return allowed;
> +}
> +
>  static int accel_init_machine(AccelClass *acc, MachineState *ms)
>  {
>      ObjectClass *oc = OBJECT_CLASS(acc);
> diff --git a/include/qemu-common.h b/include/qemu-common.h
> index 85f4749aefb7..01d5e4d97dbf 100644
> --- a/include/qemu-common.h
> +++ b/include/qemu-common.h
> @@ -82,7 +82,8 @@ int qemu_openpty_raw(int *aslave, char *pty_name);
>  extern bool tcg_allowed;
>  void tcg_exec_init(unsigned long tb_size);
>  #ifdef CONFIG_TCG
> -#define tcg_enabled() (tcg_allowed)
> +#include "sysemu/accel.h"
> +#define tcg_enabled() (assert_accelerator_initialized(tcg_allowed))
>  #else
>  #define tcg_enabled() 0

It would be nice to catch mistakes even if
the CONFIG_{TCG,KVM,HAX,XEN} is disabled.  That would require making
assert_accelerator_initialized() a macro or inline function,
though.


>  #endif
> diff --git a/include/sysemu/accel.h b/include/sysemu/accel.h
> index 637358f43014..76965cb69cc9 100644
> --- a/include/sysemu/accel.h
> +++ b/include/sysemu/accel.h
> @@ -71,5 +71,6 @@ void configure_accelerator(MachineState *ms);
>  void accel_register_compat_props(AccelState *accel);
>  /* Called just before os_setup_post (ie just before drop OS privs) */
>  void accel_setup_post(MachineState *ms);
> +bool assert_accelerator_initialized(bool allowed);
>  
>  #endif
> diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
> index 0b64b8e06786..ac4dbb2d6d6d 100644
> --- a/include/sysemu/kvm.h
> +++ b/include/sysemu/kvm.h
> @@ -18,6 +18,7 @@
>  #include "qom/cpu.h"
>  #include "exec/memattrs.h"
>  #include "hw/irq.h"
> +#include "sysemu/accel.h"
>  
>  #ifdef NEED_CPU_H
>  # ifdef CONFIG_KVM
> @@ -46,7 +47,7 @@ extern bool kvm_direct_msi_allowed;
>  extern bool kvm_ioeventfd_any_length_allowed;
>  extern bool kvm_msi_use_devid;
>  
> -#define kvm_enabled()           (kvm_allowed)
> +#define kvm_enabled()           (assert_accelerator_initialized(kvm_allowed))
>  /**
>   * kvm_irqchip_in_kernel:
>   *
> diff --git a/qom/cpu.c b/qom/cpu.c
> index 92599f35413b..65a8f03a66a4 100644
> --- a/qom/cpu.c
> +++ b/qom/cpu.c
> @@ -23,6 +23,7 @@
>  #include "qemu-common.h"
>  #include "qom/cpu.h"
>  #include "sysemu/hw_accel.h"
> +#include "sysemu/accel.h"
>  #include "qemu/notify.h"
>  #include "qemu/log.h"
>  #include "exec/log.h"
> diff --git a/stubs/Makefile.objs b/stubs/Makefile.objs
> index 53d3f32cb258..2d5142287525 100644
> --- a/stubs/Makefile.objs
> +++ b/stubs/Makefile.objs
> @@ -43,3 +43,4 @@ stub-obj-y += xen-common.o
>  stub-obj-y += xen-hvm.o
>  stub-obj-y += pci-host-piix.o
>  stub-obj-y += ram-block.o
> +stub-obj-y += accel.o
> diff --git a/stubs/accel.c b/stubs/accel.c
> new file mode 100644
> index 000000000000..4f480f2d3f29
> --- /dev/null
> +++ b/stubs/accel.c
> @@ -0,0 +1,14 @@
> +/*
> + * accel stubs
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "sysemu/accel.h"
> +
> +bool assert_accelerator_initialized(bool allowed)
> +{
> +    return allowed;
> +}
> diff --git a/target/i386/hax-all.c b/target/i386/hax-all.c
> index d2e512856bb8..7c78bd7d094d 100644
> --- a/target/i386/hax-all.c
> +++ b/target/i386/hax-all.c
> @@ -57,7 +57,7 @@ static int hax_arch_get_registers(CPUArchState *env);
>  
>  int hax_enabled(void)
>  {
> -    return hax_allowed;
> +    return assert_accelerator_initialized(hax_allowed);
>  }
>  
>  int valid_hax_tunnel_size(uint16_t size)
> diff --git a/target/i386/whpx-all.c b/target/i386/whpx-all.c
> index 6b42096698ee..e7f6bc5958e7 100644
> --- a/target/i386/whpx-all.c
> +++ b/target/i386/whpx-all.c
> @@ -1422,7 +1422,7 @@ static int whpx_accel_init(MachineState *ms)
>  
>  int whpx_enabled(void)
>  {
> -    return whpx_allowed;
> +    return assert_accelerator_initialized(whpx_allowed);
>  }
>  
>  static void whpx_accel_class_init(ObjectClass *oc, void *data)
> 

-- 
Eduardo

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH 2/3] spapr: compute default value of "hpt-max-page-size" later
  2018-06-29 19:08   ` Eduardo Habkost
@ 2018-07-02  4:06     ` David Gibson
  0 siblings, 0 replies; 15+ messages in thread
From: David Gibson @ 2018-07-02  4:06 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Greg Kurz, qemu-devel, qemu-ppc, Paolo Bonzini,
	Richard Henderson, Cédric Le Goater

[-- Attachment #1: Type: text/plain, Size: 2450 bytes --]

On Fri, Jun 29, 2018 at 04:08:22PM -0300, Eduardo Habkost wrote:
> On Thu, Jun 28, 2018 at 12:15:14PM +0200, Greg Kurz wrote:
> [...]
> > @@ -2527,6 +2527,18 @@ static void spapr_machine_init(MachineState *machine)
> >      QLIST_INIT(&spapr->phbs);
> >      QTAILQ_INIT(&spapr->pending_dimm_unplugs);
> >  
> > +    /* This is for pseries-2.12 and older machine types */
> > +    if (!smc->default_caps.caps[SPAPR_CAP_HPT_MAXPAGESIZE]) {
> > +        uint8_t mps;
> > +
> > +        if (kvmppc_hpt_needs_host_contiguous_pages()) {
> > +            mps = ctz64(qemu_getrampagesize());
> > +        } else {
> > +            mps = 34; /* allow everything up to 16GiB, i.e. everything */
> > +        }
> > +        smc->default_caps.caps[SPAPR_CAP_HPT_MAXPAGESIZE] = mps;
> 
> Initializing class fields on instance init functions is a sign
> that the information doesn't belong to the class struct.  Can't
> this be handled at default_caps_with_cpu()?

Yeah, doing it inside default_caps_with_cpu() probably makes more
sense.

> > +    }
> > +
> >      /* Determine capabilities to run with */
> >      spapr_caps_init(spapr);
> >  
> > @@ -4103,17 +4115,16 @@ static void spapr_machine_2_12_instance_options(MachineState *machine)
> >  static void spapr_machine_2_12_class_options(MachineClass *mc)
> >  {
> >      sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(mc);
> > -    uint8_t mps;
> >  
> >      spapr_machine_3_0_class_options(mc);
> >      SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_12);
> >  
> > -    if (kvmppc_hpt_needs_host_contiguous_pages()) {
> > -        mps = ctz64(qemu_getrampagesize());
> > -    } else {
> > -        mps = 34; /* allow everything up to 16GiB, i.e. everything */
> > -    }
> > -    smc->default_caps.caps[SPAPR_CAP_HPT_MAXPAGESIZE] = mps;
> > +    /* We depend on kvm_enabled() to choose a default value for the
> > +     * hpt-max-page-size capability. Of course we can't do it here
> > +     * because this is too early and the HW accelerator isn't initialzed
> > +     * yet. Postpone this to spapr_machine_init().
> > +     */
> > +    smc->default_caps.caps[SPAPR_CAP_HPT_MAXPAGESIZE] = 0;
> >  }
> >  
> >  DEFINE_SPAPR_MACHINE(2_12, "2.12", false);
> > 
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2018-07-02  4:06 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-06-28 10:14 [Qemu-devel] [PATCH 0/3] spapr: fix regression with older machine types Greg Kurz
2018-06-28 10:14 ` [Qemu-devel] [PATCH 1/3] target/ppc/kvm: don't pass cpu to kvm_get_smmu_info() Greg Kurz
2018-06-28 11:56   ` Cédric Le Goater
2018-06-28 12:14     ` Greg Kurz
2018-06-29  5:16   ` David Gibson
2018-06-28 10:15 ` [Qemu-devel] [PATCH 2/3] spapr: compute default value of "hpt-max-page-size" later Greg Kurz
2018-06-29  5:16   ` David Gibson
2018-06-29 19:08   ` Eduardo Habkost
2018-07-02  4:06     ` David Gibson
2018-06-28 10:15 ` [Qemu-devel] [PATCH 3/3] accel: forbid early use of kvm_enabled() and friends Greg Kurz
2018-06-29  5:18   ` David Gibson
2018-06-29 10:23     ` Greg Kurz
2018-06-29 19:58   ` Eduardo Habkost
2018-06-28 19:48 ` [Qemu-devel] [Qemu-ppc] [PATCH 0/3] spapr: fix regression with older machine types Greg Kurz
2018-06-29  5:21   ` David Gibson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.