* [Qemu-devel] [RFC PATCH v2 00/12] ISA 3.00 KVM guest support
@ 2017-02-23  5:59 Sam Bobroff
  2017-02-23  5:59 ` [Qemu-devel] [RFC PATCH v2 01/12] spapr: Small cleanup of PPC MMU enums Sam Bobroff
                   ` (11 more replies)
  0 siblings, 12 replies; 28+ messages in thread
From: Sam Bobroff @ 2017-02-23  5:59 UTC (permalink / raw)
  To: qemu-ppc; +Cc: qemu-devel, david, sjitindarsingh


Update notes:

Since the last version, the specification of the values used during client
architecture support has changed with respect to the bits in option vector 5,
so some bit definitions and the related processing have changed.

This version has not been as well tested as the last. Testing is ongoing.

General intro:

Because KVM will soon provide the necessary infrastructure for KVM guests to
run on POWER9 CPUs, we can now start exploiting this new functionality from
QEMU. See:
https://lists.ozlabs.org/pipermail/linuxppc-dev/2017-January/153433.html

This work is not yet complete but it is functional and is presented for early
review. It overlaps in some places with current work supporting the same guests
under full emulation.

This set aims to support only the following scenarios:
* A POWER9 host running in radix mode, running a guest in radix mode.
* A POWER9 host running in hash mode, running a guest in hash mode.
* A POWER9 host running in hash mode, running a guest in legacy(+) mode.
(+) Where legacy means that the guest does not support ISA 3.00.

Hash or radix mode for the host is controlled via the "disable_radix" kernel
command line parameter: the host will use radix unless disable_radix is given.
For the guest it should be automatically selected to match the host.

Bad legacy guests: There are some recent kernels (e.g. 4.9) that will, when run
as a KVM guest and if the ibm,pa-features entry in the device tree has the
Radix MMU bit set, attempt to initialize the MMU as if they were a host (which
will cause them to crash). To avoid exposing this problem, the Radix MMU bit
is removed from ibm,pa-features when a legacy guest is detected.

Final Notes:
* Migration/snapshots are not yet investigated.
* This set is based on the ppc-for-2.9 branch of David Gibson's tree
at https://github.com/dgibson/qemu.git
* It also relies on some work already posted here:
https://lists.gnu.org/archive/html/qemu-devel/2017-01/msg02527.html
Specifically patches 1..4 which set up the new CPU and MMU models.

Changes v1 -> v2:
Patch 2/12: scripts/update-linux-headers.sh: refactor extra files

I've factored the script to make it easier to add new files.

Patch 3/12: scripts/update-linux-headers.sh: add new files for ARM

* Added the two new arm headers.

Patch 4/12: Move virtio_mmio.h to fix update-linux-headers.sh
* FWIW, here's one way of fixing it.

Patch 5/12: Update headers using update-linux-headers.sh

* Added information about where the headers came from.

Patch 6/12: spapr: Add ibm,processor-radix-AP-encodings to the device tree

* ppc_radix_page_info now kept in native format, conversion to BE done when adding to the device tree.
* radix_page_info moved into the CPU class, cleaning up some code.

Patch 7/12: target-ppc: support KVM_CAP_PPC_MMU_RADIX, KVM_CAP_PPC_MMU_HASH_V3

* cap_mmu_hash renamed to cap_mmu_hash_v3.

Patch 8/12: spapr: Only setup HTP if necessary.

* This patch has been mostly rewritten to move the late HPT allocation to CAS.
This allows a guest to start in radix mode (when it's in real mode) and then
change to hash, even if it is a legacy guest and will not call
h_register_process_table().
* Added an exported function to spapr.c to perform HPT allocation and adjust
the vrma if necessary. This makes it possible to allocate the HPT from
h_client_architecture_support() in spapr_hcall.c.

Patch 9/12: spapr: Add h_register_process_table() hypercall

* I haven't addressed review comments for this patch because it overlaps with
Suraj's implementation of the same function and we'll work together to
integrate them.

Patch 11/12: spapr: Enable ISA 3.0 MMU mode selection via CAS

* Unused bits removed.
* Logic and bit definitions changed due to architectural change.
* Cleanly terminate QEMU if the guest requests an unavailable mode (as required
  by the new architecture).
* Legacy guest workaround moved to its own patch.
* I'm sorry for the bitfield constants in spapr_dt_ov5_platform_support() but
  there don't seem to be convenient macros for converting an option vector
  specifier (OV_BIT(x,y)) into a byte-mask. I'm open to suggestions.
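
For discussion, one possible shape for such conversion macros is sketched
below. This is only a sketch under stated assumptions: it assumes a specifier
that encodes a 1-based byte number and a bit number where bit 0 is the most
significant bit of that byte. All EX_* and ex_* names are hypothetical and are
not existing QEMU definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EX_BITS_PER_BYTE 8

/* Hypothetical specifier: 1-based byte number, bit 0 = most significant bit. */
#define EX_OV_BIT(byte, bit) (((byte) - 1) * EX_BITS_PER_BYTE + (bit))

/* 0-based index of the byte that holds the specified bit. */
#define EX_OV_BYTE_INDEX(ovbit) ((ovbit) / EX_BITS_PER_BYTE)

/* Mask selecting the specified bit within its byte (MSB-first numbering). */
#define EX_OV_BYTE_MASK(ovbit) (0x80u >> ((ovbit) % EX_BITS_PER_BYTE))

/* Set one option bit in a raw byte vector of length len. */
static void ex_ov_set(uint8_t *vec, size_t len, unsigned ovbit)
{
    assert(EX_OV_BYTE_INDEX(ovbit) < len);
    vec[EX_OV_BYTE_INDEX(ovbit)] |= EX_OV_BYTE_MASK(ovbit);
}

/* Return non-zero if the option bit is set in the raw byte vector. */
static int ex_ov_test(const uint8_t *vec, size_t len, unsigned ovbit)
{
    assert(EX_OV_BYTE_INDEX(ovbit) < len);
    return (vec[EX_OV_BYTE_INDEX(ovbit)] & EX_OV_BYTE_MASK(ovbit)) != 0;
}
```

With helpers of this kind, the byte constants in a device tree property could
be built from the same specifiers that the option vector code uses, rather
than from hand-written hex masks.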


Sam Bobroff (12):
  spapr: Small cleanup of PPC MMU enums
  scripts/update-linux-headers.sh: refactor extra files
  scripts/update-linux-headers.sh: add new files for ARM
  Move virtio_mmio.h to fix update-linux-headers.sh
  Update headers using update-linux-headers.sh
  spapr: Add ibm,processor-radix-AP-encodings to the device tree
  target-ppc: support KVM_CAP_PPC_MMU_RADIX, KVM_CAP_PPC_MMU_HASH_V3
  spapr: Only setup HTP if necessary.
  spapr: Add h_register_process_table() hypercall
  spapr: move spapr_populate_pa_features()
  spapr: Enable ISA 3.0 MMU mode selection via CAS
  spapr: Workaround for broken radix guests

 hw/ppc/spapr.c                                     | 190 +++++++---
 hw/ppc/spapr_hcall.c                               |  84 ++++-
 hw/virtio/virtio-mmio.c                            |   2 +-
 include/hw/ppc/spapr.h                             |   3 +
 include/hw/ppc/spapr_ovec.h                        |   8 +
 .../linux/virtio_mmio.h                            |   0
 include/standard-headers/linux/input-event-codes.h |   2 +-
 include/standard-headers/linux/pci_regs.h          |   8 +
 include/standard-headers/linux/virtio_ids.h        |   1 +
 include/sysemu/kvm.h                               |   1 +
 linux-headers/asm-arm/kvm.h                        |   2 +
 linux-headers/asm-arm/unistd-eabi.h                |   5 +
 linux-headers/asm-arm/unistd-oabi.h                |  17 +
 linux-headers/asm-arm/unistd.h                     | 419 +--------------------
 linux-headers/asm-powerpc/kvm.h                    |  27 ++
 linux-headers/asm-powerpc/unistd.h                 |   1 +
 linux-headers/asm-x86/kvm_para.h                   |   4 +-
 linux-headers/linux/kvm.h                          |  20 +-
 linux-headers/linux/vfio.h                         |  10 +
 scripts/update-linux-headers.sh                    |  26 +-
 target/ppc/cpu-qom.h                               |  13 +-
 target/ppc/cpu.h                                   |   4 +
 target/ppc/kvm.c                                   |  61 ++-
 target/ppc/kvm_ppc.h                               |  13 +
 target/ppc/mmu-hash64.c                            |  10 +-
 target/ppc/mmu_helper.c                            |  67 ++--
 target/ppc/translate.c                             |  12 +-
 27 files changed, 457 insertions(+), 553 deletions(-)
 rename include/{standard-headers => kernel-headers}/linux/virtio_mmio.h (100%)
 create mode 100644 linux-headers/asm-arm/unistd-eabi.h
 create mode 100644 linux-headers/asm-arm/unistd-oabi.h

-- 
2.11.0


* [Qemu-devel] [RFC PATCH v2 01/12] spapr: Small cleanup of PPC MMU enums
  2017-02-23  5:59 [Qemu-devel] [RFC PATCH v2 00/12] ISA 3.00 KVM guest support Sam Bobroff
@ 2017-02-23  5:59 ` Sam Bobroff
  2017-02-27  6:22   ` David Gibson
  2017-02-23  5:59 ` [Qemu-devel] [RFC PATCH v2 02/12] scripts/update-linux-headers.sh: refactor extra files Sam Bobroff
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 28+ messages in thread
From: Sam Bobroff @ 2017-02-23  5:59 UTC (permalink / raw)
  To: qemu-ppc; +Cc: qemu-devel, david, sjitindarsingh

The PPC MMU types are sometimes treated as if they were a bit field
and sometimes as if they were an enum, which causes maintenance
problems: flipping bits in the MMU type (which is done on both the 1TB
segment and 64K segment bits) currently produces new MMU type
values that are not handled in every "switch" on it, sometimes causing
an abort().

This patch provides some macros that can be used to filter out the
"bit field-like" bits so that the remainder of the value can be
switched on, like an enum. This allows removal of all of the
"degraded" types from the list and should ease maintenance.
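
As a standalone illustration of the idea (a sketch only, not part of the
patch): the EX_* constants below mirror the shape of enum powerpc_mmu_t, but
their values and all ex_* names are assumptions made for the example. The
point is that clearing a feature bit no longer changes which "case" is
selected.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative constants: same shape as enum powerpc_mmu_t, values assumed. */
enum {
    EX_MMU_64    = 0x00010000, /* 64-bit MMU family flag, kept by the mask */
    EX_MMU_1TSEG = 0x00020000, /* feature bit: 1TB segments */
    EX_MMU_AMR   = 0x00040000, /* feature bit: AMR support */
    EX_MMU_64K   = 0x00080000, /* feature bit: 64K pages */
    EX_MMU_2_06  = EX_MMU_64 | EX_MMU_1TSEG | EX_MMU_64K | EX_MMU_AMR | 0x3,
    EX_MMU_2_07  = EX_MMU_64 | EX_MMU_1TSEG | EX_MMU_64K | EX_MMU_AMR | 0x4,
};

/* Keep only the family flag and the low version bits. */
#define EX_MMU_VER(x)   ((x) & (EX_MMU_64 | 0xFFFF))
#define EX_MMU_VER_2_06 EX_MMU_VER(EX_MMU_2_06)
#define EX_MMU_VER_2_07 EX_MMU_VER(EX_MMU_2_07)

/* Clear a feature bit (not the family flag); the version must not change. */
static uint32_t ex_clear_feature(uint32_t model, uint32_t feature)
{
    uint32_t degraded = model & ~feature;
    assert(EX_MMU_VER(degraded) == EX_MMU_VER(model));
    return degraded;
}

/* Switch on the filtered value, so "degraded" models need no extra cases. */
static int ex_slb_size(uint32_t model)
{
    switch (EX_MMU_VER(model)) {
    case EX_MMU_VER_2_06:
    case EX_MMU_VER_2_07:
        return 32;
    default:
        return 64;
    }
}
```

Here a 2.06 model with the 1TB segment bit cleared still takes the 2.06
branch, which is the behaviour the removed "2_06a"/"2_07a" values used to
provide by hand.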

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
---
 hw/ppc/spapr.c          |  8 +++---
 target/ppc/cpu-qom.h    | 12 ++++-----
 target/ppc/kvm.c        |  8 +++---
 target/ppc/mmu-hash64.c | 10 ++++----
 target/ppc/mmu_helper.c | 67 ++++++++++++++++++++-----------------------------
 target/ppc/translate.c  | 12 ++++-----
 6 files changed, 50 insertions(+), 67 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 5904e6498f..cceb35f083 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -359,14 +359,12 @@ static void spapr_populate_pa_features(CPUPPCState *env, void *fdt, int offset)
     uint8_t *pa_features;
     size_t pa_size;
 
-    switch (env->mmu_model) {
-    case POWERPC_MMU_2_06:
-    case POWERPC_MMU_2_06a:
+    switch (POWERPC_MMU_VER(env->mmu_model)) {
+    case POWERPC_MMU_VER_2_06:
         pa_features = pa_features_206;
         pa_size = sizeof(pa_features_206);
         break;
-    case POWERPC_MMU_2_07:
-    case POWERPC_MMU_2_07a:
+    case POWERPC_MMU_VER_2_07:
         pa_features = pa_features_207;
         pa_size = sizeof(pa_features_207);
         break;
diff --git a/target/ppc/cpu-qom.h b/target/ppc/cpu-qom.h
index 4e3132b56b..4807f4d86c 100644
--- a/target/ppc/cpu-qom.h
+++ b/target/ppc/cpu-qom.h
@@ -79,21 +79,21 @@ enum powerpc_mmu_t {
     POWERPC_MMU_2_06       = POWERPC_MMU_64 | POWERPC_MMU_1TSEG
                              | POWERPC_MMU_64K
                              | POWERPC_MMU_AMR | 0x00000003,
-    /* Architecture 2.06 "degraded" (no 1T segments)           */
-    POWERPC_MMU_2_06a      = POWERPC_MMU_64 | POWERPC_MMU_AMR
-                             | 0x00000003,
     /* Architecture 2.07 variant                               */
     POWERPC_MMU_2_07       = POWERPC_MMU_64 | POWERPC_MMU_1TSEG
                              | POWERPC_MMU_64K
                              | POWERPC_MMU_AMR | 0x00000004,
-    /* Architecture 2.07 "degraded" (no 1T segments)           */
-    POWERPC_MMU_2_07a      = POWERPC_MMU_64 | POWERPC_MMU_AMR
-                             | 0x00000004,
     /* Architecture 3.00 variant                               */
     POWERPC_MMU_3_00       = POWERPC_MMU_64 | POWERPC_MMU_1TSEG
                              | POWERPC_MMU_64K
                              | POWERPC_MMU_AMR | 0x00000005,
 };
+#define POWERPC_MMU_VER(x) ((x) & (POWERPC_MMU_64 | 0xFFFF))
+#define POWERPC_MMU_VER_64B POWERPC_MMU_VER(POWERPC_MMU_64B)
+#define POWERPC_MMU_VER_2_03 POWERPC_MMU_VER(POWERPC_MMU_2_03)
+#define POWERPC_MMU_VER_2_06 POWERPC_MMU_VER(POWERPC_MMU_2_06)
+#define POWERPC_MMU_VER_2_07 POWERPC_MMU_VER(POWERPC_MMU_2_07)
+#define POWERPC_MMU_VER_3_00 POWERPC_MMU_VER(POWERPC_MMU_3_00)
 
 /*****************************************************************************/
 /* Exception model                                                           */
diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
index 52bbea514a..d53ede8b4a 100644
--- a/target/ppc/kvm.c
+++ b/target/ppc/kvm.c
@@ -282,8 +282,8 @@ static void kvm_get_fallback_smmu_info(PowerPCCPU *cpu,
             info->flags |= KVM_PPC_1T_SEGMENTS;
         }
 
-        if (env->mmu_model == POWERPC_MMU_2_06 ||
-            env->mmu_model == POWERPC_MMU_2_07) {
+        if (POWERPC_MMU_VER(env->mmu_model) == POWERPC_MMU_VER_2_06 ||
+           POWERPC_MMU_VER(env->mmu_model) == POWERPC_MMU_VER_2_07) {
             info->slb_size = 32;
         } else {
             info->slb_size = 64;
@@ -297,8 +297,8 @@ static void kvm_get_fallback_smmu_info(PowerPCCPU *cpu,
         i++;
 
         /* 64K on MMU 2.06 and later */
-        if (env->mmu_model == POWERPC_MMU_2_06 ||
-            env->mmu_model == POWERPC_MMU_2_07) {
+        if (POWERPC_MMU_VER(env->mmu_model) == POWERPC_MMU_VER_2_06 ||
+            POWERPC_MMU_VER(env->mmu_model) == POWERPC_MMU_VER_2_07) {
             info->sps[i].page_shift = 16;
             info->sps[i].slb_enc = 0x110;
             info->sps[i].enc[0].page_shift = 16;
diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
index 76669ed82c..6346167b48 100644
--- a/target/ppc/mmu-hash64.c
+++ b/target/ppc/mmu-hash64.c
@@ -1032,8 +1032,8 @@ void helper_store_lpcr(CPUPPCState *env, target_ulong val)
     uint64_t lpcr = 0;
 
     /* Filter out bits */
-    switch (env->mmu_model) {
-    case POWERPC_MMU_64B: /* 970 */
+    switch (POWERPC_MMU_VER(env->mmu_model)) {
+    case POWERPC_MMU_VER_64B: /* 970 */
         if (val & 0x40) {
             lpcr |= LPCR_LPES0;
         }
@@ -1059,19 +1059,19 @@ void helper_store_lpcr(CPUPPCState *env, target_ulong val)
          * to dig HRMOR out of HID5
          */
         break;
-    case POWERPC_MMU_2_03: /* P5p */
+    case POWERPC_MMU_VER_2_03: /* P5p */
         lpcr = val & (LPCR_RMLS | LPCR_ILE |
                       LPCR_LPES0 | LPCR_LPES1 |
                       LPCR_RMI | LPCR_HDICE);
         break;
-    case POWERPC_MMU_2_06: /* P7 */
+    case POWERPC_MMU_VER_2_06: /* P7 */
         lpcr = val & (LPCR_VPM0 | LPCR_VPM1 | LPCR_ISL | LPCR_DPFD |
                       LPCR_VRMASD | LPCR_RMLS | LPCR_ILE |
                       LPCR_P7_PECE0 | LPCR_P7_PECE1 | LPCR_P7_PECE2 |
                       LPCR_MER | LPCR_TC |
                       LPCR_LPES0 | LPCR_LPES1 | LPCR_HDICE);
         break;
-    case POWERPC_MMU_2_07: /* P8 */
+    case POWERPC_MMU_VER_2_07: /* P8 */
         lpcr = val & (LPCR_VPM0 | LPCR_VPM1 | LPCR_ISL | LPCR_KBV |
                       LPCR_DPFD | LPCR_VRMASD | LPCR_RMLS | LPCR_ILE |
                       LPCR_AIL | LPCR_ONL | LPCR_P8_PECE0 | LPCR_P8_PECE1 |
diff --git a/target/ppc/mmu_helper.c b/target/ppc/mmu_helper.c
index eb2d482ef7..0f6016ff0d 100644
--- a/target/ppc/mmu_helper.c
+++ b/target/ppc/mmu_helper.c
@@ -1260,7 +1260,7 @@ static void mmu6xx_dump_mmu(FILE *f, fprintf_function cpu_fprintf,
 
 void dump_mmu(FILE *f, fprintf_function cpu_fprintf, CPUPPCState *env)
 {
-    switch (env->mmu_model) {
+    switch (POWERPC_MMU_VER(env->mmu_model)) {
     case POWERPC_MMU_BOOKE:
         mmubooke_dump_mmu(f, cpu_fprintf, env);
         break;
@@ -1272,12 +1272,10 @@ void dump_mmu(FILE *f, fprintf_function cpu_fprintf, CPUPPCState *env)
         mmu6xx_dump_mmu(f, cpu_fprintf, env);
         break;
 #if defined(TARGET_PPC64)
-    case POWERPC_MMU_64B:
-    case POWERPC_MMU_2_03:
-    case POWERPC_MMU_2_06:
-    case POWERPC_MMU_2_06a:
-    case POWERPC_MMU_2_07:
-    case POWERPC_MMU_2_07a:
+    case POWERPC_MMU_VER_64B:
+    case POWERPC_MMU_VER_2_03:
+    case POWERPC_MMU_VER_2_06:
+    case POWERPC_MMU_VER_2_07:
         dump_slb(f, cpu_fprintf, ppc_env_get_cpu(env));
         break;
 #endif
@@ -1412,14 +1410,12 @@ hwaddr ppc_cpu_get_phys_page_debug(CPUState *cs, vaddr addr)
     CPUPPCState *env = &cpu->env;
     mmu_ctx_t ctx;
 
-    switch (env->mmu_model) {
+    switch (POWERPC_MMU_VER(env->mmu_model)) {
 #if defined(TARGET_PPC64)
-    case POWERPC_MMU_64B:
-    case POWERPC_MMU_2_03:
-    case POWERPC_MMU_2_06:
-    case POWERPC_MMU_2_06a:
-    case POWERPC_MMU_2_07:
-    case POWERPC_MMU_2_07a:
+    case POWERPC_MMU_VER_64B:
+    case POWERPC_MMU_VER_2_03:
+    case POWERPC_MMU_VER_2_06:
+    case POWERPC_MMU_VER_2_07:
         return ppc_hash64_get_phys_page_debug(cpu, addr);
 #endif
 
@@ -1904,6 +1900,12 @@ void ppc_tlb_invalidate_all(CPUPPCState *env)
 {
     PowerPCCPU *cpu = ppc_env_get_cpu(env);
 
+#if defined(TARGET_PPC64)
+    if (env->mmu_model & POWERPC_MMU_64) {
+        env->tlb_need_flush = 0;
+        tlb_flush(CPU(cpu));
+    } else
+#endif /* defined(TARGET_PPC64) */
     switch (env->mmu_model) {
     case POWERPC_MMU_SOFT_6xx:
     case POWERPC_MMU_SOFT_74xx:
@@ -1928,21 +1930,12 @@ void ppc_tlb_invalidate_all(CPUPPCState *env)
         break;
     case POWERPC_MMU_32B:
     case POWERPC_MMU_601:
-#if defined(TARGET_PPC64)
-    case POWERPC_MMU_64B:
-    case POWERPC_MMU_2_03:
-    case POWERPC_MMU_2_06:
-    case POWERPC_MMU_2_06a:
-    case POWERPC_MMU_2_07:
-    case POWERPC_MMU_2_07a:
-    case POWERPC_MMU_3_00:
-#endif /* defined(TARGET_PPC64) */
         env->tlb_need_flush = 0;
         tlb_flush(CPU(cpu));
         break;
     default:
         /* XXX: TODO */
-        cpu_abort(CPU(cpu), "Unknown MMU model %d\n", env->mmu_model);
+        cpu_abort(CPU(cpu), "Unknown MMU model %x\n", env->mmu_model);
         break;
     }
 }
@@ -1951,6 +1944,16 @@ void ppc_tlb_invalidate_one(CPUPPCState *env, target_ulong addr)
 {
 #if !defined(FLUSH_ALL_TLBS)
     addr &= TARGET_PAGE_MASK;
+#if defined(TARGET_PPC64)
+    if (env->mmu_model & POWERPC_MMU_64) {
+        /* tlbie invalidate TLBs for all segments */
+        /* XXX: given the fact that there are too many segments to invalidate,
+         *      and we still don't have a tlb_flush_mask(env, n, mask) in QEMU,
+         *      we just invalidate all TLBs
+         */
+        env->tlb_need_flush |= TLB_NEED_LOCAL_FLUSH;
+    } else
+#endif /* defined(TARGET_PPC64) */
     switch (env->mmu_model) {
     case POWERPC_MMU_SOFT_6xx:
     case POWERPC_MMU_SOFT_74xx:
@@ -1968,22 +1971,6 @@ void ppc_tlb_invalidate_one(CPUPPCState *env, target_ulong addr)
          */
         env->tlb_need_flush |= TLB_NEED_LOCAL_FLUSH;
         break;
-#if defined(TARGET_PPC64)
-    case POWERPC_MMU_64B:
-    case POWERPC_MMU_2_03:
-    case POWERPC_MMU_2_06:
-    case POWERPC_MMU_2_06a:
-    case POWERPC_MMU_2_07:
-    case POWERPC_MMU_2_07a:
-    case POWERPC_MMU_3_00:
-        /* tlbie invalidate TLBs for all segments */
-        /* XXX: given the fact that there are too many segments to invalidate,
-         *      and we still don't have a tlb_flush_mask(env, n, mask) in QEMU,
-         *      we just invalidate all TLBs
-         */
-        env->tlb_need_flush |= TLB_NEED_LOCAL_FLUSH;
-        break;
-#endif /* defined(TARGET_PPC64) */
     default:
         /* Should never reach here with other MMU models */
         assert(0);
diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index b09e16ff76..2a24d1de67 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -6988,18 +6988,16 @@ void ppc_cpu_dump_state(CPUState *cs, FILE *f, fprintf_function cpu_fprintf,
     if (env->spr_cb[SPR_LPCR].name)
         cpu_fprintf(f, " LPCR " TARGET_FMT_lx "\n", env->spr[SPR_LPCR]);
 
-    switch (env->mmu_model) {
+    switch (POWERPC_MMU_VER(env->mmu_model)) {
     case POWERPC_MMU_32B:
     case POWERPC_MMU_601:
     case POWERPC_MMU_SOFT_6xx:
     case POWERPC_MMU_SOFT_74xx:
 #if defined(TARGET_PPC64)
-    case POWERPC_MMU_64B:
-    case POWERPC_MMU_2_03:
-    case POWERPC_MMU_2_06:
-    case POWERPC_MMU_2_06a:
-    case POWERPC_MMU_2_07:
-    case POWERPC_MMU_2_07a:
+    case POWERPC_MMU_VER_64B:
+    case POWERPC_MMU_VER_2_03:
+    case POWERPC_MMU_VER_2_06:
+    case POWERPC_MMU_VER_2_07:
 #endif
         cpu_fprintf(f, " SDR1 " TARGET_FMT_lx "   DAR " TARGET_FMT_lx
                        "  DSISR " TARGET_FMT_lx "\n", env->spr[SPR_SDR1],
-- 
2.11.0


* [Qemu-devel] [RFC PATCH v2 02/12] scripts/update-linux-headers.sh: refactor extra files
  2017-02-23  5:59 [Qemu-devel] [RFC PATCH v2 00/12] ISA 3.00 KVM guest support Sam Bobroff
  2017-02-23  5:59 ` [Qemu-devel] [RFC PATCH v2 01/12] spapr: Small cleanup of PPC MMU enums Sam Bobroff
@ 2017-02-23  5:59 ` Sam Bobroff
  2017-02-27  6:24   ` David Gibson
  2017-02-23  5:59 ` [Qemu-devel] [RFC PATCH v2 03/12] scripts/update-linux-headers.sh: add new files for ARM Sam Bobroff
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 28+ messages in thread
From: Sam Bobroff @ 2017-02-23  5:59 UTC (permalink / raw)
  To: qemu-ppc; +Cc: qemu-devel, david, sjitindarsingh

Refactor the architecture specific code to make it easier
to add new special case files.

There should be no change in functionality.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
---
v2:

I've factored the script to make it easier to add new files.

 scripts/update-linux-headers.sh | 25 +++++++++++--------------
 1 file changed, 11 insertions(+), 14 deletions(-)

diff --git a/scripts/update-linux-headers.sh b/scripts/update-linux-headers.sh
index ef11a8ab42..c75c30da1b 100755
--- a/scripts/update-linux-headers.sh
+++ b/scripts/update-linux-headers.sh
@@ -76,28 +76,25 @@ for arch in $ARCHLIST; do
     fi
 
     make -C "$linux" INSTALL_HDR_PATH="$tmpdir" ARCH=$arch headers_install
+    ARCH_EXTRA=
+    ARCH_STD_EXTRA=
+    case "$arch" in
+        powerpc) ARCH_EXTRA=epapr_hcalls.h ;;
+        s390) ARCH_STD_EXTRA="kvm_virtio.h virtio-ccw.h" ;;
+        x86) ARCH_EXTRA="unistd_32.h unistd_x32.h unistd_64.h" ARCH_STD_EXTRA="hyperv.h" ;;
+    esac
 
     rm -rf "$output/linux-headers/asm-$arch"
     mkdir -p "$output/linux-headers/asm-$arch"
-    for header in kvm.h kvm_para.h unistd.h; do
+    for header in kvm.h kvm_para.h unistd.h $ARCH_EXTRA; do
         cp "$tmpdir/include/asm/$header" "$output/linux-headers/asm-$arch"
     done
-    if [ $arch = powerpc ]; then
-        cp "$tmpdir/include/asm/epapr_hcalls.h" "$output/linux-headers/asm-powerpc/"
-    fi
 
     rm -rf "$output/include/standard-headers/asm-$arch"
     mkdir -p "$output/include/standard-headers/asm-$arch"
-    if [ $arch = s390 ]; then
-        cp_portable "$tmpdir/include/asm/kvm_virtio.h" "$output/include/standard-headers/asm-s390/"
-        cp_portable "$tmpdir/include/asm/virtio-ccw.h" "$output/include/standard-headers/asm-s390/"
-    fi
-    if [ $arch = x86 ]; then
-        cp_portable "$tmpdir/include/asm/hyperv.h" "$output/include/standard-headers/asm-x86/"
-        cp "$tmpdir/include/asm/unistd_32.h" "$output/linux-headers/asm-x86/"
-        cp "$tmpdir/include/asm/unistd_x32.h" "$output/linux-headers/asm-x86/"
-        cp "$tmpdir/include/asm/unistd_64.h" "$output/linux-headers/asm-x86/"
-    fi
+    for header in $ARCH_STD_EXTRA; do
+        cp_portable "$tmpdir/include/asm/$header" "$output/include/standard-headers/asm-$arch/"
+    done
 done
 
 rm -rf "$output/linux-headers/linux"
-- 
2.11.0


* [Qemu-devel] [RFC PATCH v2 03/12] scripts/update-linux-headers.sh: add new files for ARM
  2017-02-23  5:59 [Qemu-devel] [RFC PATCH v2 00/12] ISA 3.00 KVM guest support Sam Bobroff
  2017-02-23  5:59 ` [Qemu-devel] [RFC PATCH v2 01/12] spapr: Small cleanup of PPC MMU enums Sam Bobroff
  2017-02-23  5:59 ` [Qemu-devel] [RFC PATCH v2 02/12] scripts/update-linux-headers.sh: refactor extra files Sam Bobroff
@ 2017-02-23  5:59 ` Sam Bobroff
  2017-02-23  5:59 ` [Qemu-devel] [RFC PATCH v2 04/12] Move virtio_mmio.h to fix update-linux-headers.sh Sam Bobroff
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 28+ messages in thread
From: Sam Bobroff @ 2017-02-23  5:59 UTC (permalink / raw)
  To: qemu-ppc; +Cc: qemu-devel, david, sjitindarsingh

The kernel has added some new headers for ARM; add them to the script so
that it can be run successfully.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
---
v2:

* Added the two new arm headers.

 scripts/update-linux-headers.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/scripts/update-linux-headers.sh b/scripts/update-linux-headers.sh
index c75c30da1b..d8a178b070 100755
--- a/scripts/update-linux-headers.sh
+++ b/scripts/update-linux-headers.sh
@@ -79,6 +79,7 @@ for arch in $ARCHLIST; do
     ARCH_EXTRA=
     ARCH_STD_EXTRA=
     case "$arch" in
+        arm) ARCH_EXTRA="unistd-eabi.h unistd-oabi.h" ;;
         powerpc) ARCH_EXTRA=epapr_hcalls.h ;;
         s390) ARCH_STD_EXTRA="kvm_virtio.h virtio-ccw.h" ;;
         x86) ARCH_EXTRA="unistd_32.h unistd_x32.h unistd_64.h" ARCH_STD_EXTRA="hyperv.h" ;;
-- 
2.11.0


* [Qemu-devel] [RFC PATCH v2 04/12] Move virtio_mmio.h to fix update-linux-headers.sh
  2017-02-23  5:59 [Qemu-devel] [RFC PATCH v2 00/12] ISA 3.00 KVM guest support Sam Bobroff
                   ` (2 preceding siblings ...)
  2017-02-23  5:59 ` [Qemu-devel] [RFC PATCH v2 03/12] scripts/update-linux-headers.sh: add new files for ARM Sam Bobroff
@ 2017-02-23  5:59 ` Sam Bobroff
  2017-02-24 16:40   ` Michael S. Tsirkin
  2017-02-24 16:47   ` Michael S. Tsirkin
  2017-02-23  5:59 ` [Qemu-devel] [RFC PATCH v2 05/12] Update headers using update-linux-headers.sh Sam Bobroff
                   ` (7 subsequent siblings)
  11 siblings, 2 replies; 28+ messages in thread
From: Sam Bobroff @ 2017-02-23  5:59 UTC (permalink / raw)
  To: qemu-ppc; +Cc: qemu-devel, david, sjitindarsingh

Currently, running update-linux-headers.sh will produce a patch that
deletes virtio_mmio.h, which is still needed. This happens because
virtio_mmio.h is in the directory used to store headers from the linux
kernel that are copied by the kernel's "make headers_install" target
(used by the update script) but it is not one of the files in that
set.

Fix this by moving that file into a new directory.

In the future, if that file is added to the "headers_install" target,
then this change should be reverted.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
---
v2:
* FWIW, here's one way of fixing it.

 hw/virtio/virtio-mmio.c                                          | 2 +-
 include/{standard-headers => kernel-headers}/linux/virtio_mmio.h | 0
 2 files changed, 1 insertion(+), 1 deletion(-)
 rename include/{standard-headers => kernel-headers}/linux/virtio_mmio.h (100%)

diff --git a/hw/virtio/virtio-mmio.c b/hw/virtio/virtio-mmio.c
index 5807aa87fe..cc6afa9da1 100644
--- a/hw/virtio/virtio-mmio.c
+++ b/hw/virtio/virtio-mmio.c
@@ -20,7 +20,7 @@
  */
 
 #include "qemu/osdep.h"
-#include "standard-headers/linux/virtio_mmio.h"
+#include "kernel-headers/linux/virtio_mmio.h"
 #include "hw/sysbus.h"
 #include "hw/virtio/virtio.h"
 #include "qemu/host-utils.h"
diff --git a/include/standard-headers/linux/virtio_mmio.h b/include/kernel-headers/linux/virtio_mmio.h
similarity index 100%
rename from include/standard-headers/linux/virtio_mmio.h
rename to include/kernel-headers/linux/virtio_mmio.h
-- 
2.11.0


* [Qemu-devel] [RFC PATCH v2 05/12] Update headers using update-linux-headers.sh
  2017-02-23  5:59 [Qemu-devel] [RFC PATCH v2 00/12] ISA 3.00 KVM guest support Sam Bobroff
                   ` (3 preceding siblings ...)
  2017-02-23  5:59 ` [Qemu-devel] [RFC PATCH v2 04/12] Move virtio_mmio.h to fix update-linux-headers.sh Sam Bobroff
@ 2017-02-23  5:59 ` Sam Bobroff
  2017-02-23  5:59 ` [Qemu-devel] [RFC PATCH v2 06/12] spapr: Add ibm,processor-radix-AP-encodings to the device tree Sam Bobroff
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 28+ messages in thread
From: Sam Bobroff @ 2017-02-23  5:59 UTC (permalink / raw)
  To: qemu-ppc; +Cc: qemu-devel, david, sjitindarsingh

Updated against Paul's kvm-ppc-next tree:
git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc.git
... at commit:
5982f0849e08fe4e4e7df5e345c4539ce9780b1b
... in order to provide some new definitions needed by ISA 3.00
guests.

This is a large change because it is the first import since
some kernel header files have become autogenerated.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
---
v2:

* Added information about where the headers came from.

 include/standard-headers/linux/input-event-codes.h |   2 +-
 include/standard-headers/linux/pci_regs.h          |   8 +
 include/standard-headers/linux/virtio_ids.h        |   1 +
 linux-headers/asm-arm/kvm.h                        |   2 +
 linux-headers/asm-arm/unistd-eabi.h                |   5 +
 linux-headers/asm-arm/unistd-oabi.h                |  17 +
 linux-headers/asm-arm/unistd.h                     | 419 +--------------------
 linux-headers/asm-powerpc/kvm.h                    |  27 ++
 linux-headers/asm-powerpc/unistd.h                 |   1 +
 linux-headers/asm-x86/kvm_para.h                   |   4 +-
 linux-headers/linux/kvm.h                          |  20 +-
 linux-headers/linux/vfio.h                         |  10 +
 12 files changed, 97 insertions(+), 419 deletions(-)
 create mode 100644 linux-headers/asm-arm/unistd-eabi.h
 create mode 100644 linux-headers/asm-arm/unistd-oabi.h

diff --git a/include/standard-headers/linux/input-event-codes.h b/include/standard-headers/linux/input-event-codes.h
index 5c10f7e25d..c8b3338375 100644
--- a/include/standard-headers/linux/input-event-codes.h
+++ b/include/standard-headers/linux/input-event-codes.h
@@ -640,7 +640,7 @@
  * Control a data application associated with the currently viewed channel,
  * e.g. teletext or data broadcast application (MHEG, MHP, HbbTV, etc.)
  */
-#define KEY_DATA			0x275
+#define KEY_DATA			0x277
 
 #define BTN_TRIGGER_HAPPY		0x2c0
 #define BTN_TRIGGER_HAPPY1		0x2c0
diff --git a/include/standard-headers/linux/pci_regs.h b/include/standard-headers/linux/pci_regs.h
index e5a2e68b22..174d114708 100644
--- a/include/standard-headers/linux/pci_regs.h
+++ b/include/standard-headers/linux/pci_regs.h
@@ -23,6 +23,14 @@
 #define LINUX_PCI_REGS_H
 
 /*
+ * Conventional PCI and PCI-X Mode 1 devices have 256 bytes of
+ * configuration space.  PCI-X Mode 2 and PCIe devices have 4096 bytes of
+ * configuration space.
+ */
+#define PCI_CFG_SPACE_SIZE	256
+#define PCI_CFG_SPACE_EXP_SIZE	4096
+
+/*
  * Under PCI, each device has 256 bytes of configuration address space,
  * of which the first 64 bytes are standardized as follows:
  */
diff --git a/include/standard-headers/linux/virtio_ids.h b/include/standard-headers/linux/virtio_ids.h
index fe74e422d4..6d5c3b2d4f 100644
--- a/include/standard-headers/linux/virtio_ids.h
+++ b/include/standard-headers/linux/virtio_ids.h
@@ -43,4 +43,5 @@
 #define VIRTIO_ID_INPUT        18 /* virtio input */
 #define VIRTIO_ID_VSOCK        19 /* virtio vsock transport */
 #define VIRTIO_ID_CRYPTO       20 /* virtio crypto */
+
 #endif /* _LINUX_VIRTIO_IDS_H */
diff --git a/linux-headers/asm-arm/kvm.h b/linux-headers/asm-arm/kvm.h
index 2fb7859465..09a555cc83 100644
--- a/linux-headers/asm-arm/kvm.h
+++ b/linux-headers/asm-arm/kvm.h
@@ -87,9 +87,11 @@ struct kvm_regs {
 /* Supported VGICv3 address types  */
 #define KVM_VGIC_V3_ADDR_TYPE_DIST	2
 #define KVM_VGIC_V3_ADDR_TYPE_REDIST	3
+#define KVM_VGIC_ITS_ADDR_TYPE		4
 
 #define KVM_VGIC_V3_DIST_SIZE		SZ_64K
 #define KVM_VGIC_V3_REDIST_SIZE		(2 * SZ_64K)
+#define KVM_VGIC_V3_ITS_SIZE		(2 * SZ_64K)
 
 #define KVM_ARM_VCPU_POWER_OFF		0 /* CPU is started in OFF state */
 #define KVM_ARM_VCPU_PSCI_0_2		1 /* CPU uses PSCI v0.2 */
diff --git a/linux-headers/asm-arm/unistd-eabi.h b/linux-headers/asm-arm/unistd-eabi.h
new file mode 100644
index 0000000000..266f1fcdfb
--- /dev/null
+++ b/linux-headers/asm-arm/unistd-eabi.h
@@ -0,0 +1,5 @@
+#ifndef _ASM_ARM_UNISTD_EABI_H
+#define _ASM_ARM_UNISTD_EABI_H 1
+
+
+#endif /* _ASM_ARM_UNISTD_EABI_H */
diff --git a/linux-headers/asm-arm/unistd-oabi.h b/linux-headers/asm-arm/unistd-oabi.h
new file mode 100644
index 0000000000..47d9afb96d
--- /dev/null
+++ b/linux-headers/asm-arm/unistd-oabi.h
@@ -0,0 +1,17 @@
+#ifndef _ASM_ARM_UNISTD_OABI_H
+#define _ASM_ARM_UNISTD_OABI_H 1
+
+#define __NR_time (__NR_SYSCALL_BASE + 13)
+#define __NR_umount (__NR_SYSCALL_BASE + 22)
+#define __NR_stime (__NR_SYSCALL_BASE + 25)
+#define __NR_alarm (__NR_SYSCALL_BASE + 27)
+#define __NR_utime (__NR_SYSCALL_BASE + 30)
+#define __NR_getrlimit (__NR_SYSCALL_BASE + 76)
+#define __NR_select (__NR_SYSCALL_BASE + 82)
+#define __NR_readdir (__NR_SYSCALL_BASE + 89)
+#define __NR_mmap (__NR_SYSCALL_BASE + 90)
+#define __NR_socketcall (__NR_SYSCALL_BASE + 102)
+#define __NR_syscall (__NR_SYSCALL_BASE + 113)
+#define __NR_ipc (__NR_SYSCALL_BASE + 117)
+
+#endif /* _ASM_ARM_UNISTD_OABI_H */
diff --git a/linux-headers/asm-arm/unistd.h b/linux-headers/asm-arm/unistd.h
index ceb5450c81..155571b874 100644
--- a/linux-headers/asm-arm/unistd.h
+++ b/linux-headers/asm-arm/unistd.h
@@ -17,409 +17,14 @@
 
 #if defined(__thumb__) || defined(__ARM_EABI__)
 #define __NR_SYSCALL_BASE	0
+#include <asm/unistd-eabi.h>
 #else
 #define __NR_SYSCALL_BASE	__NR_OABI_SYSCALL_BASE
+#include <asm/unistd-oabi.h>
 #endif
 
-/*
- * This file contains the system call numbers.
- */
-
-#define __NR_restart_syscall		(__NR_SYSCALL_BASE+  0)
-#define __NR_exit			(__NR_SYSCALL_BASE+  1)
-#define __NR_fork			(__NR_SYSCALL_BASE+  2)
-#define __NR_read			(__NR_SYSCALL_BASE+  3)
-#define __NR_write			(__NR_SYSCALL_BASE+  4)
-#define __NR_open			(__NR_SYSCALL_BASE+  5)
-#define __NR_close			(__NR_SYSCALL_BASE+  6)
-					/* 7 was sys_waitpid */
-#define __NR_creat			(__NR_SYSCALL_BASE+  8)
-#define __NR_link			(__NR_SYSCALL_BASE+  9)
-#define __NR_unlink			(__NR_SYSCALL_BASE+ 10)
-#define __NR_execve			(__NR_SYSCALL_BASE+ 11)
-#define __NR_chdir			(__NR_SYSCALL_BASE+ 12)
-#define __NR_time			(__NR_SYSCALL_BASE+ 13)
-#define __NR_mknod			(__NR_SYSCALL_BASE+ 14)
-#define __NR_chmod			(__NR_SYSCALL_BASE+ 15)
-#define __NR_lchown			(__NR_SYSCALL_BASE+ 16)
-					/* 17 was sys_break */
-					/* 18 was sys_stat */
-#define __NR_lseek			(__NR_SYSCALL_BASE+ 19)
-#define __NR_getpid			(__NR_SYSCALL_BASE+ 20)
-#define __NR_mount			(__NR_SYSCALL_BASE+ 21)
-#define __NR_umount			(__NR_SYSCALL_BASE+ 22)
-#define __NR_setuid			(__NR_SYSCALL_BASE+ 23)
-#define __NR_getuid			(__NR_SYSCALL_BASE+ 24)
-#define __NR_stime			(__NR_SYSCALL_BASE+ 25)
-#define __NR_ptrace			(__NR_SYSCALL_BASE+ 26)
-#define __NR_alarm			(__NR_SYSCALL_BASE+ 27)
-					/* 28 was sys_fstat */
-#define __NR_pause			(__NR_SYSCALL_BASE+ 29)
-#define __NR_utime			(__NR_SYSCALL_BASE+ 30)
-					/* 31 was sys_stty */
-					/* 32 was sys_gtty */
-#define __NR_access			(__NR_SYSCALL_BASE+ 33)
-#define __NR_nice			(__NR_SYSCALL_BASE+ 34)
-					/* 35 was sys_ftime */
-#define __NR_sync			(__NR_SYSCALL_BASE+ 36)
-#define __NR_kill			(__NR_SYSCALL_BASE+ 37)
-#define __NR_rename			(__NR_SYSCALL_BASE+ 38)
-#define __NR_mkdir			(__NR_SYSCALL_BASE+ 39)
-#define __NR_rmdir			(__NR_SYSCALL_BASE+ 40)
-#define __NR_dup			(__NR_SYSCALL_BASE+ 41)
-#define __NR_pipe			(__NR_SYSCALL_BASE+ 42)
-#define __NR_times			(__NR_SYSCALL_BASE+ 43)
-					/* 44 was sys_prof */
-#define __NR_brk			(__NR_SYSCALL_BASE+ 45)
-#define __NR_setgid			(__NR_SYSCALL_BASE+ 46)
-#define __NR_getgid			(__NR_SYSCALL_BASE+ 47)
-					/* 48 was sys_signal */
-#define __NR_geteuid			(__NR_SYSCALL_BASE+ 49)
-#define __NR_getegid			(__NR_SYSCALL_BASE+ 50)
-#define __NR_acct			(__NR_SYSCALL_BASE+ 51)
-#define __NR_umount2			(__NR_SYSCALL_BASE+ 52)
-					/* 53 was sys_lock */
-#define __NR_ioctl			(__NR_SYSCALL_BASE+ 54)
-#define __NR_fcntl			(__NR_SYSCALL_BASE+ 55)
-					/* 56 was sys_mpx */
-#define __NR_setpgid			(__NR_SYSCALL_BASE+ 57)
-					/* 58 was sys_ulimit */
-					/* 59 was sys_olduname */
-#define __NR_umask			(__NR_SYSCALL_BASE+ 60)
-#define __NR_chroot			(__NR_SYSCALL_BASE+ 61)
-#define __NR_ustat			(__NR_SYSCALL_BASE+ 62)
-#define __NR_dup2			(__NR_SYSCALL_BASE+ 63)
-#define __NR_getppid			(__NR_SYSCALL_BASE+ 64)
-#define __NR_getpgrp			(__NR_SYSCALL_BASE+ 65)
-#define __NR_setsid			(__NR_SYSCALL_BASE+ 66)
-#define __NR_sigaction			(__NR_SYSCALL_BASE+ 67)
-					/* 68 was sys_sgetmask */
-					/* 69 was sys_ssetmask */
-#define __NR_setreuid			(__NR_SYSCALL_BASE+ 70)
-#define __NR_setregid			(__NR_SYSCALL_BASE+ 71)
-#define __NR_sigsuspend			(__NR_SYSCALL_BASE+ 72)
-#define __NR_sigpending			(__NR_SYSCALL_BASE+ 73)
-#define __NR_sethostname		(__NR_SYSCALL_BASE+ 74)
-#define __NR_setrlimit			(__NR_SYSCALL_BASE+ 75)
-#define __NR_getrlimit			(__NR_SYSCALL_BASE+ 76)	/* Back compat 2GB limited rlimit */
-#define __NR_getrusage			(__NR_SYSCALL_BASE+ 77)
-#define __NR_gettimeofday		(__NR_SYSCALL_BASE+ 78)
-#define __NR_settimeofday		(__NR_SYSCALL_BASE+ 79)
-#define __NR_getgroups			(__NR_SYSCALL_BASE+ 80)
-#define __NR_setgroups			(__NR_SYSCALL_BASE+ 81)
-#define __NR_select			(__NR_SYSCALL_BASE+ 82)
-#define __NR_symlink			(__NR_SYSCALL_BASE+ 83)
-					/* 84 was sys_lstat */
-#define __NR_readlink			(__NR_SYSCALL_BASE+ 85)
-#define __NR_uselib			(__NR_SYSCALL_BASE+ 86)
-#define __NR_swapon			(__NR_SYSCALL_BASE+ 87)
-#define __NR_reboot			(__NR_SYSCALL_BASE+ 88)
-#define __NR_readdir			(__NR_SYSCALL_BASE+ 89)
-#define __NR_mmap			(__NR_SYSCALL_BASE+ 90)
-#define __NR_munmap			(__NR_SYSCALL_BASE+ 91)
-#define __NR_truncate			(__NR_SYSCALL_BASE+ 92)
-#define __NR_ftruncate			(__NR_SYSCALL_BASE+ 93)
-#define __NR_fchmod			(__NR_SYSCALL_BASE+ 94)
-#define __NR_fchown			(__NR_SYSCALL_BASE+ 95)
-#define __NR_getpriority		(__NR_SYSCALL_BASE+ 96)
-#define __NR_setpriority		(__NR_SYSCALL_BASE+ 97)
-					/* 98 was sys_profil */
-#define __NR_statfs			(__NR_SYSCALL_BASE+ 99)
-#define __NR_fstatfs			(__NR_SYSCALL_BASE+100)
-					/* 101 was sys_ioperm */
-#define __NR_socketcall			(__NR_SYSCALL_BASE+102)
-#define __NR_syslog			(__NR_SYSCALL_BASE+103)
-#define __NR_setitimer			(__NR_SYSCALL_BASE+104)
-#define __NR_getitimer			(__NR_SYSCALL_BASE+105)
-#define __NR_stat			(__NR_SYSCALL_BASE+106)
-#define __NR_lstat			(__NR_SYSCALL_BASE+107)
-#define __NR_fstat			(__NR_SYSCALL_BASE+108)
-					/* 109 was sys_uname */
-					/* 110 was sys_iopl */
-#define __NR_vhangup			(__NR_SYSCALL_BASE+111)
-					/* 112 was sys_idle */
-#define __NR_syscall			(__NR_SYSCALL_BASE+113) /* syscall to call a syscall! */
-#define __NR_wait4			(__NR_SYSCALL_BASE+114)
-#define __NR_swapoff			(__NR_SYSCALL_BASE+115)
-#define __NR_sysinfo			(__NR_SYSCALL_BASE+116)
-#define __NR_ipc			(__NR_SYSCALL_BASE+117)
-#define __NR_fsync			(__NR_SYSCALL_BASE+118)
-#define __NR_sigreturn			(__NR_SYSCALL_BASE+119)
-#define __NR_clone			(__NR_SYSCALL_BASE+120)
-#define __NR_setdomainname		(__NR_SYSCALL_BASE+121)
-#define __NR_uname			(__NR_SYSCALL_BASE+122)
-					/* 123 was sys_modify_ldt */
-#define __NR_adjtimex			(__NR_SYSCALL_BASE+124)
-#define __NR_mprotect			(__NR_SYSCALL_BASE+125)
-#define __NR_sigprocmask		(__NR_SYSCALL_BASE+126)
-					/* 127 was sys_create_module */
-#define __NR_init_module		(__NR_SYSCALL_BASE+128)
-#define __NR_delete_module		(__NR_SYSCALL_BASE+129)
-					/* 130 was sys_get_kernel_syms */
-#define __NR_quotactl			(__NR_SYSCALL_BASE+131)
-#define __NR_getpgid			(__NR_SYSCALL_BASE+132)
-#define __NR_fchdir			(__NR_SYSCALL_BASE+133)
-#define __NR_bdflush			(__NR_SYSCALL_BASE+134)
-#define __NR_sysfs			(__NR_SYSCALL_BASE+135)
-#define __NR_personality		(__NR_SYSCALL_BASE+136)
-					/* 137 was sys_afs_syscall */
-#define __NR_setfsuid			(__NR_SYSCALL_BASE+138)
-#define __NR_setfsgid			(__NR_SYSCALL_BASE+139)
-#define __NR__llseek			(__NR_SYSCALL_BASE+140)
-#define __NR_getdents			(__NR_SYSCALL_BASE+141)
-#define __NR__newselect			(__NR_SYSCALL_BASE+142)
-#define __NR_flock			(__NR_SYSCALL_BASE+143)
-#define __NR_msync			(__NR_SYSCALL_BASE+144)
-#define __NR_readv			(__NR_SYSCALL_BASE+145)
-#define __NR_writev			(__NR_SYSCALL_BASE+146)
-#define __NR_getsid			(__NR_SYSCALL_BASE+147)
-#define __NR_fdatasync			(__NR_SYSCALL_BASE+148)
-#define __NR__sysctl			(__NR_SYSCALL_BASE+149)
-#define __NR_mlock			(__NR_SYSCALL_BASE+150)
-#define __NR_munlock			(__NR_SYSCALL_BASE+151)
-#define __NR_mlockall			(__NR_SYSCALL_BASE+152)
-#define __NR_munlockall			(__NR_SYSCALL_BASE+153)
-#define __NR_sched_setparam		(__NR_SYSCALL_BASE+154)
-#define __NR_sched_getparam		(__NR_SYSCALL_BASE+155)
-#define __NR_sched_setscheduler		(__NR_SYSCALL_BASE+156)
-#define __NR_sched_getscheduler		(__NR_SYSCALL_BASE+157)
-#define __NR_sched_yield		(__NR_SYSCALL_BASE+158)
-#define __NR_sched_get_priority_max	(__NR_SYSCALL_BASE+159)
-#define __NR_sched_get_priority_min	(__NR_SYSCALL_BASE+160)
-#define __NR_sched_rr_get_interval	(__NR_SYSCALL_BASE+161)
-#define __NR_nanosleep			(__NR_SYSCALL_BASE+162)
-#define __NR_mremap			(__NR_SYSCALL_BASE+163)
-#define __NR_setresuid			(__NR_SYSCALL_BASE+164)
-#define __NR_getresuid			(__NR_SYSCALL_BASE+165)
-					/* 166 was sys_vm86 */
-					/* 167 was sys_query_module */
-#define __NR_poll			(__NR_SYSCALL_BASE+168)
-#define __NR_nfsservctl			(__NR_SYSCALL_BASE+169)
-#define __NR_setresgid			(__NR_SYSCALL_BASE+170)
-#define __NR_getresgid			(__NR_SYSCALL_BASE+171)
-#define __NR_prctl			(__NR_SYSCALL_BASE+172)
-#define __NR_rt_sigreturn		(__NR_SYSCALL_BASE+173)
-#define __NR_rt_sigaction		(__NR_SYSCALL_BASE+174)
-#define __NR_rt_sigprocmask		(__NR_SYSCALL_BASE+175)
-#define __NR_rt_sigpending		(__NR_SYSCALL_BASE+176)
-#define __NR_rt_sigtimedwait		(__NR_SYSCALL_BASE+177)
-#define __NR_rt_sigqueueinfo		(__NR_SYSCALL_BASE+178)
-#define __NR_rt_sigsuspend		(__NR_SYSCALL_BASE+179)
-#define __NR_pread64			(__NR_SYSCALL_BASE+180)
-#define __NR_pwrite64			(__NR_SYSCALL_BASE+181)
-#define __NR_chown			(__NR_SYSCALL_BASE+182)
-#define __NR_getcwd			(__NR_SYSCALL_BASE+183)
-#define __NR_capget			(__NR_SYSCALL_BASE+184)
-#define __NR_capset			(__NR_SYSCALL_BASE+185)
-#define __NR_sigaltstack		(__NR_SYSCALL_BASE+186)
-#define __NR_sendfile			(__NR_SYSCALL_BASE+187)
-					/* 188 reserved */
-					/* 189 reserved */
-#define __NR_vfork			(__NR_SYSCALL_BASE+190)
-#define __NR_ugetrlimit			(__NR_SYSCALL_BASE+191)	/* SuS compliant getrlimit */
-#define __NR_mmap2			(__NR_SYSCALL_BASE+192)
-#define __NR_truncate64			(__NR_SYSCALL_BASE+193)
-#define __NR_ftruncate64		(__NR_SYSCALL_BASE+194)
-#define __NR_stat64			(__NR_SYSCALL_BASE+195)
-#define __NR_lstat64			(__NR_SYSCALL_BASE+196)
-#define __NR_fstat64			(__NR_SYSCALL_BASE+197)
-#define __NR_lchown32			(__NR_SYSCALL_BASE+198)
-#define __NR_getuid32			(__NR_SYSCALL_BASE+199)
-#define __NR_getgid32			(__NR_SYSCALL_BASE+200)
-#define __NR_geteuid32			(__NR_SYSCALL_BASE+201)
-#define __NR_getegid32			(__NR_SYSCALL_BASE+202)
-#define __NR_setreuid32			(__NR_SYSCALL_BASE+203)
-#define __NR_setregid32			(__NR_SYSCALL_BASE+204)
-#define __NR_getgroups32		(__NR_SYSCALL_BASE+205)
-#define __NR_setgroups32		(__NR_SYSCALL_BASE+206)
-#define __NR_fchown32			(__NR_SYSCALL_BASE+207)
-#define __NR_setresuid32		(__NR_SYSCALL_BASE+208)
-#define __NR_getresuid32		(__NR_SYSCALL_BASE+209)
-#define __NR_setresgid32		(__NR_SYSCALL_BASE+210)
-#define __NR_getresgid32		(__NR_SYSCALL_BASE+211)
-#define __NR_chown32			(__NR_SYSCALL_BASE+212)
-#define __NR_setuid32			(__NR_SYSCALL_BASE+213)
-#define __NR_setgid32			(__NR_SYSCALL_BASE+214)
-#define __NR_setfsuid32			(__NR_SYSCALL_BASE+215)
-#define __NR_setfsgid32			(__NR_SYSCALL_BASE+216)
-#define __NR_getdents64			(__NR_SYSCALL_BASE+217)
-#define __NR_pivot_root			(__NR_SYSCALL_BASE+218)
-#define __NR_mincore			(__NR_SYSCALL_BASE+219)
-#define __NR_madvise			(__NR_SYSCALL_BASE+220)
-#define __NR_fcntl64			(__NR_SYSCALL_BASE+221)
-					/* 222 for tux */
-					/* 223 is unused */
-#define __NR_gettid			(__NR_SYSCALL_BASE+224)
-#define __NR_readahead			(__NR_SYSCALL_BASE+225)
-#define __NR_setxattr			(__NR_SYSCALL_BASE+226)
-#define __NR_lsetxattr			(__NR_SYSCALL_BASE+227)
-#define __NR_fsetxattr			(__NR_SYSCALL_BASE+228)
-#define __NR_getxattr			(__NR_SYSCALL_BASE+229)
-#define __NR_lgetxattr			(__NR_SYSCALL_BASE+230)
-#define __NR_fgetxattr			(__NR_SYSCALL_BASE+231)
-#define __NR_listxattr			(__NR_SYSCALL_BASE+232)
-#define __NR_llistxattr			(__NR_SYSCALL_BASE+233)
-#define __NR_flistxattr			(__NR_SYSCALL_BASE+234)
-#define __NR_removexattr		(__NR_SYSCALL_BASE+235)
-#define __NR_lremovexattr		(__NR_SYSCALL_BASE+236)
-#define __NR_fremovexattr		(__NR_SYSCALL_BASE+237)
-#define __NR_tkill			(__NR_SYSCALL_BASE+238)
-#define __NR_sendfile64			(__NR_SYSCALL_BASE+239)
-#define __NR_futex			(__NR_SYSCALL_BASE+240)
-#define __NR_sched_setaffinity		(__NR_SYSCALL_BASE+241)
-#define __NR_sched_getaffinity		(__NR_SYSCALL_BASE+242)
-#define __NR_io_setup			(__NR_SYSCALL_BASE+243)
-#define __NR_io_destroy			(__NR_SYSCALL_BASE+244)
-#define __NR_io_getevents		(__NR_SYSCALL_BASE+245)
-#define __NR_io_submit			(__NR_SYSCALL_BASE+246)
-#define __NR_io_cancel			(__NR_SYSCALL_BASE+247)
-#define __NR_exit_group			(__NR_SYSCALL_BASE+248)
-#define __NR_lookup_dcookie		(__NR_SYSCALL_BASE+249)
-#define __NR_epoll_create		(__NR_SYSCALL_BASE+250)
-#define __NR_epoll_ctl			(__NR_SYSCALL_BASE+251)
-#define __NR_epoll_wait			(__NR_SYSCALL_BASE+252)
-#define __NR_remap_file_pages		(__NR_SYSCALL_BASE+253)
-					/* 254 for set_thread_area */
-					/* 255 for get_thread_area */
-#define __NR_set_tid_address		(__NR_SYSCALL_BASE+256)
-#define __NR_timer_create		(__NR_SYSCALL_BASE+257)
-#define __NR_timer_settime		(__NR_SYSCALL_BASE+258)
-#define __NR_timer_gettime		(__NR_SYSCALL_BASE+259)
-#define __NR_timer_getoverrun		(__NR_SYSCALL_BASE+260)
-#define __NR_timer_delete		(__NR_SYSCALL_BASE+261)
-#define __NR_clock_settime		(__NR_SYSCALL_BASE+262)
-#define __NR_clock_gettime		(__NR_SYSCALL_BASE+263)
-#define __NR_clock_getres		(__NR_SYSCALL_BASE+264)
-#define __NR_clock_nanosleep		(__NR_SYSCALL_BASE+265)
-#define __NR_statfs64			(__NR_SYSCALL_BASE+266)
-#define __NR_fstatfs64			(__NR_SYSCALL_BASE+267)
-#define __NR_tgkill			(__NR_SYSCALL_BASE+268)
-#define __NR_utimes			(__NR_SYSCALL_BASE+269)
-#define __NR_arm_fadvise64_64		(__NR_SYSCALL_BASE+270)
-#define __NR_pciconfig_iobase		(__NR_SYSCALL_BASE+271)
-#define __NR_pciconfig_read		(__NR_SYSCALL_BASE+272)
-#define __NR_pciconfig_write		(__NR_SYSCALL_BASE+273)
-#define __NR_mq_open			(__NR_SYSCALL_BASE+274)
-#define __NR_mq_unlink			(__NR_SYSCALL_BASE+275)
-#define __NR_mq_timedsend		(__NR_SYSCALL_BASE+276)
-#define __NR_mq_timedreceive		(__NR_SYSCALL_BASE+277)
-#define __NR_mq_notify			(__NR_SYSCALL_BASE+278)
-#define __NR_mq_getsetattr		(__NR_SYSCALL_BASE+279)
-#define __NR_waitid			(__NR_SYSCALL_BASE+280)
-#define __NR_socket			(__NR_SYSCALL_BASE+281)
-#define __NR_bind			(__NR_SYSCALL_BASE+282)
-#define __NR_connect			(__NR_SYSCALL_BASE+283)
-#define __NR_listen			(__NR_SYSCALL_BASE+284)
-#define __NR_accept			(__NR_SYSCALL_BASE+285)
-#define __NR_getsockname		(__NR_SYSCALL_BASE+286)
-#define __NR_getpeername		(__NR_SYSCALL_BASE+287)
-#define __NR_socketpair			(__NR_SYSCALL_BASE+288)
-#define __NR_send			(__NR_SYSCALL_BASE+289)
-#define __NR_sendto			(__NR_SYSCALL_BASE+290)
-#define __NR_recv			(__NR_SYSCALL_BASE+291)
-#define __NR_recvfrom			(__NR_SYSCALL_BASE+292)
-#define __NR_shutdown			(__NR_SYSCALL_BASE+293)
-#define __NR_setsockopt			(__NR_SYSCALL_BASE+294)
-#define __NR_getsockopt			(__NR_SYSCALL_BASE+295)
-#define __NR_sendmsg			(__NR_SYSCALL_BASE+296)
-#define __NR_recvmsg			(__NR_SYSCALL_BASE+297)
-#define __NR_semop			(__NR_SYSCALL_BASE+298)
-#define __NR_semget			(__NR_SYSCALL_BASE+299)
-#define __NR_semctl			(__NR_SYSCALL_BASE+300)
-#define __NR_msgsnd			(__NR_SYSCALL_BASE+301)
-#define __NR_msgrcv			(__NR_SYSCALL_BASE+302)
-#define __NR_msgget			(__NR_SYSCALL_BASE+303)
-#define __NR_msgctl			(__NR_SYSCALL_BASE+304)
-#define __NR_shmat			(__NR_SYSCALL_BASE+305)
-#define __NR_shmdt			(__NR_SYSCALL_BASE+306)
-#define __NR_shmget			(__NR_SYSCALL_BASE+307)
-#define __NR_shmctl			(__NR_SYSCALL_BASE+308)
-#define __NR_add_key			(__NR_SYSCALL_BASE+309)
-#define __NR_request_key		(__NR_SYSCALL_BASE+310)
-#define __NR_keyctl			(__NR_SYSCALL_BASE+311)
-#define __NR_semtimedop			(__NR_SYSCALL_BASE+312)
-#define __NR_vserver			(__NR_SYSCALL_BASE+313)
-#define __NR_ioprio_set			(__NR_SYSCALL_BASE+314)
-#define __NR_ioprio_get			(__NR_SYSCALL_BASE+315)
-#define __NR_inotify_init		(__NR_SYSCALL_BASE+316)
-#define __NR_inotify_add_watch		(__NR_SYSCALL_BASE+317)
-#define __NR_inotify_rm_watch		(__NR_SYSCALL_BASE+318)
-#define __NR_mbind			(__NR_SYSCALL_BASE+319)
-#define __NR_get_mempolicy		(__NR_SYSCALL_BASE+320)
-#define __NR_set_mempolicy		(__NR_SYSCALL_BASE+321)
-#define __NR_openat			(__NR_SYSCALL_BASE+322)
-#define __NR_mkdirat			(__NR_SYSCALL_BASE+323)
-#define __NR_mknodat			(__NR_SYSCALL_BASE+324)
-#define __NR_fchownat			(__NR_SYSCALL_BASE+325)
-#define __NR_futimesat			(__NR_SYSCALL_BASE+326)
-#define __NR_fstatat64			(__NR_SYSCALL_BASE+327)
-#define __NR_unlinkat			(__NR_SYSCALL_BASE+328)
-#define __NR_renameat			(__NR_SYSCALL_BASE+329)
-#define __NR_linkat			(__NR_SYSCALL_BASE+330)
-#define __NR_symlinkat			(__NR_SYSCALL_BASE+331)
-#define __NR_readlinkat			(__NR_SYSCALL_BASE+332)
-#define __NR_fchmodat			(__NR_SYSCALL_BASE+333)
-#define __NR_faccessat			(__NR_SYSCALL_BASE+334)
-#define __NR_pselect6			(__NR_SYSCALL_BASE+335)
-#define __NR_ppoll			(__NR_SYSCALL_BASE+336)
-#define __NR_unshare			(__NR_SYSCALL_BASE+337)
-#define __NR_set_robust_list		(__NR_SYSCALL_BASE+338)
-#define __NR_get_robust_list		(__NR_SYSCALL_BASE+339)
-#define __NR_splice			(__NR_SYSCALL_BASE+340)
-#define __NR_arm_sync_file_range	(__NR_SYSCALL_BASE+341)
+#include <asm/unistd-common.h>
 #define __NR_sync_file_range2		__NR_arm_sync_file_range
-#define __NR_tee			(__NR_SYSCALL_BASE+342)
-#define __NR_vmsplice			(__NR_SYSCALL_BASE+343)
-#define __NR_move_pages			(__NR_SYSCALL_BASE+344)
-#define __NR_getcpu			(__NR_SYSCALL_BASE+345)
-#define __NR_epoll_pwait		(__NR_SYSCALL_BASE+346)
-#define __NR_kexec_load			(__NR_SYSCALL_BASE+347)
-#define __NR_utimensat			(__NR_SYSCALL_BASE+348)
-#define __NR_signalfd			(__NR_SYSCALL_BASE+349)
-#define __NR_timerfd_create		(__NR_SYSCALL_BASE+350)
-#define __NR_eventfd			(__NR_SYSCALL_BASE+351)
-#define __NR_fallocate			(__NR_SYSCALL_BASE+352)
-#define __NR_timerfd_settime		(__NR_SYSCALL_BASE+353)
-#define __NR_timerfd_gettime		(__NR_SYSCALL_BASE+354)
-#define __NR_signalfd4			(__NR_SYSCALL_BASE+355)
-#define __NR_eventfd2			(__NR_SYSCALL_BASE+356)
-#define __NR_epoll_create1		(__NR_SYSCALL_BASE+357)
-#define __NR_dup3			(__NR_SYSCALL_BASE+358)
-#define __NR_pipe2			(__NR_SYSCALL_BASE+359)
-#define __NR_inotify_init1		(__NR_SYSCALL_BASE+360)
-#define __NR_preadv			(__NR_SYSCALL_BASE+361)
-#define __NR_pwritev			(__NR_SYSCALL_BASE+362)
-#define __NR_rt_tgsigqueueinfo		(__NR_SYSCALL_BASE+363)
-#define __NR_perf_event_open		(__NR_SYSCALL_BASE+364)
-#define __NR_recvmmsg			(__NR_SYSCALL_BASE+365)
-#define __NR_accept4			(__NR_SYSCALL_BASE+366)
-#define __NR_fanotify_init		(__NR_SYSCALL_BASE+367)
-#define __NR_fanotify_mark		(__NR_SYSCALL_BASE+368)
-#define __NR_prlimit64			(__NR_SYSCALL_BASE+369)
-#define __NR_name_to_handle_at		(__NR_SYSCALL_BASE+370)
-#define __NR_open_by_handle_at		(__NR_SYSCALL_BASE+371)
-#define __NR_clock_adjtime		(__NR_SYSCALL_BASE+372)
-#define __NR_syncfs			(__NR_SYSCALL_BASE+373)
-#define __NR_sendmmsg			(__NR_SYSCALL_BASE+374)
-#define __NR_setns			(__NR_SYSCALL_BASE+375)
-#define __NR_process_vm_readv		(__NR_SYSCALL_BASE+376)
-#define __NR_process_vm_writev		(__NR_SYSCALL_BASE+377)
-#define __NR_kcmp			(__NR_SYSCALL_BASE+378)
-#define __NR_finit_module		(__NR_SYSCALL_BASE+379)
-#define __NR_sched_setattr		(__NR_SYSCALL_BASE+380)
-#define __NR_sched_getattr		(__NR_SYSCALL_BASE+381)
-#define __NR_renameat2			(__NR_SYSCALL_BASE+382)
-#define __NR_seccomp			(__NR_SYSCALL_BASE+383)
-#define __NR_getrandom			(__NR_SYSCALL_BASE+384)
-#define __NR_memfd_create		(__NR_SYSCALL_BASE+385)
-#define __NR_bpf			(__NR_SYSCALL_BASE+386)
-#define __NR_execveat			(__NR_SYSCALL_BASE+387)
-#define __NR_userfaultfd		(__NR_SYSCALL_BASE+388)
-#define __NR_membarrier			(__NR_SYSCALL_BASE+389)
-#define __NR_mlock2			(__NR_SYSCALL_BASE+390)
-#define __NR_copy_file_range		(__NR_SYSCALL_BASE+391)
-#define __NR_preadv2			(__NR_SYSCALL_BASE+392)
-#define __NR_pwritev2			(__NR_SYSCALL_BASE+393)
 
 /*
  * The following SWIs are ARM private.
@@ -431,22 +36,4 @@
 #define __ARM_NR_usr32			(__ARM_NR_BASE+4)
 #define __ARM_NR_set_tls		(__ARM_NR_BASE+5)
 
-/*
- * The following syscalls are obsolete and no longer available for EABI.
- */
-#if defined(__ARM_EABI__)
-#undef __NR_time
-#undef __NR_umount
-#undef __NR_stime
-#undef __NR_alarm
-#undef __NR_utime
-#undef __NR_getrlimit
-#undef __NR_select
-#undef __NR_readdir
-#undef __NR_mmap
-#undef __NR_socketcall
-#undef __NR_syscall
-#undef __NR_ipc
-#endif
-
 #endif /* __ASM_ARM_UNISTD_H */
diff --git a/linux-headers/asm-powerpc/kvm.h b/linux-headers/asm-powerpc/kvm.h
index c93cf35ce3..4edbe4bb0e 100644
--- a/linux-headers/asm-powerpc/kvm.h
+++ b/linux-headers/asm-powerpc/kvm.h
@@ -413,6 +413,26 @@ struct kvm_get_htab_header {
 	__u16	n_invalid;
 };
 
+/* For KVM_PPC_CONFIGURE_V3_MMU */
+struct kvm_ppc_mmuv3_cfg {
+	__u64	flags;
+	__u64	process_table;	/* second doubleword of partition table entry */
+};
+
+/* Flag values for KVM_PPC_CONFIGURE_V3_MMU */
+#define KVM_PPC_MMUV3_RADIX	1	/* 1 = radix mode, 0 = HPT */
+#define KVM_PPC_MMUV3_GTSE	2	/* global translation shootdown enb. */
+
+/* For KVM_PPC_GET_RMMU_INFO */
+struct kvm_ppc_rmmu_info {
+	struct kvm_ppc_radix_geom {
+		__u8	page_shift;
+		__u8	level_bits[4];
+		__u8	pad[3];
+	}	geometries[8];
+	__u32	ap_encodings[8];
+};
+
 /* Per-vcpu XICS interrupt controller state */
 #define KVM_REG_PPC_ICP_STATE	(KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x8c)
 
@@ -573,6 +593,10 @@ struct kvm_get_htab_header {
 #define KVM_REG_PPC_SPRG9	(KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xba)
 #define KVM_REG_PPC_DBSR	(KVM_REG_PPC | KVM_REG_SIZE_U32 | 0xbb)
 
+/* POWER9 registers */
+#define KVM_REG_PPC_TIDR	(KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xbc)
+#define KVM_REG_PPC_PSSCR	(KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xbd)
+
 /* Transactional Memory checkpointed state:
  * This is all GPRs, all VSX regs and a subset of SPRs
  */
@@ -596,6 +620,7 @@ struct kvm_get_htab_header {
 #define KVM_REG_PPC_TM_VSCR	(KVM_REG_PPC_TM | KVM_REG_SIZE_U32 | 0x67)
 #define KVM_REG_PPC_TM_DSCR	(KVM_REG_PPC_TM | KVM_REG_SIZE_U64 | 0x68)
 #define KVM_REG_PPC_TM_TAR	(KVM_REG_PPC_TM | KVM_REG_SIZE_U64 | 0x69)
+#define KVM_REG_PPC_TM_XER	(KVM_REG_PPC_TM | KVM_REG_SIZE_U64 | 0x6a)
 
 /* PPC64 eXternal Interrupt Controller Specification */
 #define KVM_DEV_XICS_GRP_SOURCES	1	/* 64-bit source attributes */
@@ -608,5 +633,7 @@ struct kvm_get_htab_header {
 #define  KVM_XICS_LEVEL_SENSITIVE	(1ULL << 40)
 #define  KVM_XICS_MASKED		(1ULL << 41)
 #define  KVM_XICS_PENDING		(1ULL << 42)
+#define  KVM_XICS_PRESENTED		(1ULL << 43)
+#define  KVM_XICS_QUEUED		(1ULL << 44)
 
 #endif /* __LINUX_KVM_POWERPC_H */
diff --git a/linux-headers/asm-powerpc/unistd.h b/linux-headers/asm-powerpc/unistd.h
index 1e66eba4c6..598043c7b6 100644
--- a/linux-headers/asm-powerpc/unistd.h
+++ b/linux-headers/asm-powerpc/unistd.h
@@ -392,5 +392,6 @@
 #define __NR_copy_file_range	379
 #define __NR_preadv2		380
 #define __NR_pwritev2		381
+#define __NR_kexec_file_load	382
 
 #endif /* _ASM_POWERPC_UNISTD_H_ */
diff --git a/linux-headers/asm-x86/kvm_para.h b/linux-headers/asm-x86/kvm_para.h
index e41c5c1a28..0739a74626 100644
--- a/linux-headers/asm-x86/kvm_para.h
+++ b/linux-headers/asm-x86/kvm_para.h
@@ -45,7 +45,9 @@ struct kvm_steal_time {
 	__u64 steal;
 	__u32 version;
 	__u32 flags;
-	__u32 pad[12];
+	__u8  preempted;
+	__u8  u8_pad[3];
+	__u32 pad[11];
 };
 
 #define KVM_STEAL_ALIGNMENT_BITS 5
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index bb0ed71223..8391bbd21b 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -651,6 +651,9 @@ struct kvm_enable_cap {
 };
 
 /* for KVM_PPC_GET_PVINFO */
+
+#define KVM_PPC_PVINFO_FLAGS_EV_IDLE   (1<<0)
+
 struct kvm_ppc_pvinfo {
 	/* out */
 	__u32 flags;
@@ -682,7 +685,12 @@ struct kvm_ppc_smmu_info {
 	struct kvm_ppc_one_seg_page_size sps[KVM_PPC_PAGE_SIZES_MAX_SZ];
 };
 
-#define KVM_PPC_PVINFO_FLAGS_EV_IDLE   (1<<0)
+/* for KVM_PPC_RESIZE_HPT_{PREPARE,COMMIT} */
+struct kvm_ppc_resize_hpt {
+	__u64 flags;
+	__u32 shift;
+	__u32 pad;
+};
 
 #define KVMIO 0xAE
 
@@ -870,6 +878,9 @@ struct kvm_ppc_smmu_info {
 #define KVM_CAP_S390_USER_INSTR0 130
 #define KVM_CAP_MSI_DEVID 131
 #define KVM_CAP_PPC_HTM 132
+#define KVM_CAP_SPAPR_RESIZE_HPT 133
+#define KVM_CAP_PPC_MMU_RADIX 134
+#define KVM_CAP_PPC_MMU_HASH_V3 135
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -1186,6 +1197,13 @@ struct kvm_s390_ucas_mapping {
 #define KVM_ARM_SET_DEVICE_ADDR	  _IOW(KVMIO,  0xab, struct kvm_arm_device_addr)
 /* Available with KVM_CAP_PPC_RTAS */
 #define KVM_PPC_RTAS_DEFINE_TOKEN _IOW(KVMIO,  0xac, struct kvm_rtas_token_args)
+/* Available with KVM_CAP_SPAPR_RESIZE_HPT */
+#define KVM_PPC_RESIZE_HPT_PREPARE _IOR(KVMIO, 0xad, struct kvm_ppc_resize_hpt)
+#define KVM_PPC_RESIZE_HPT_COMMIT  _IOR(KVMIO, 0xae, struct kvm_ppc_resize_hpt)
+/* Available with KVM_CAP_PPC_MMU_RADIX or KVM_CAP_PPC_MMU_HASH_V3 */
+#define KVM_PPC_CONFIGURE_V3_MMU  _IOW(KVMIO,  0xaf, struct kvm_ppc_mmuv3_cfg)
+/* Available with KVM_CAP_PPC_MMU_RADIX */
+#define KVM_PPC_GET_RMMU_INFO	  _IOW(KVMIO,  0xb0, struct kvm_ppc_rmmu_info)
 
 /* ioctl for vm fd */
 #define KVM_CREATE_DEVICE	  _IOWR(KVMIO,  0xe0, struct kvm_create_device)
diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
index 759b850a3e..531cb2eda9 100644
--- a/linux-headers/linux/vfio.h
+++ b/linux-headers/linux/vfio.h
@@ -203,6 +203,16 @@ struct vfio_device_info {
 };
 #define VFIO_DEVICE_GET_INFO		_IO(VFIO_TYPE, VFIO_BASE + 7)
 
+/*
+ * Vendor driver using Mediated device framework should provide device_api
+ * attribute in supported type attribute groups. Device API string should be one
+ * of the following corresponding to device flags in vfio_device_info structure.
+ */
+
+#define VFIO_DEVICE_API_PCI_STRING		"vfio-pci"
+#define VFIO_DEVICE_API_PLATFORM_STRING		"vfio-platform"
+#define VFIO_DEVICE_API_AMBA_STRING		"vfio-amba"
+
 /**
  * VFIO_DEVICE_GET_REGION_INFO - _IOWR(VFIO_TYPE, VFIO_BASE + 8,
  *				       struct vfio_region_info)
-- 
2.11.0


* [Qemu-devel] [RFC PATCH v2 06/12] spapr: Add ibm,processor-radix-AP-encodings to the device tree
  2017-02-23  5:59 [Qemu-devel] [RFC PATCH v2 00/12] ISA 3.00 KVM guest support Sam Bobroff
                   ` (4 preceding siblings ...)
  2017-02-23  5:59 ` [Qemu-devel] [RFC PATCH v2 05/12] Update headers using update-linux-headers.sh Sam Bobroff
@ 2017-02-23  5:59 ` Sam Bobroff
  2017-02-28  0:12   ` David Gibson
  2017-02-23  6:00 ` [Qemu-devel] [RFC PATCH v2 07/12] target-ppc: support KVM_CAP_PPC_MMU_RADIX, KVM_CAP_PPC_MMU_HASH_V3 Sam Bobroff
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 28+ messages in thread
From: Sam Bobroff @ 2017-02-23  5:59 UTC (permalink / raw)
  To: qemu-ppc; +Cc: qemu-devel, david, sjitindarsingh

Use the new KVM_PPC_GET_RMMU_INFO ioctl to fetch radix MMU information
from KVM and present the page encodings in the device tree under
ibm,processor-radix-AP-encodings. This gives the guest the page size
information it needs in order to use radix mode.
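
For readers unfamiliar with the property format: the kernel's KVM API
documentation describes each ap_encodings word as holding the AP value
in the top 3 bits and log2 of the page size in the bottom 6 bits. A
minimal sketch of decoding one word under that assumption follows. The
helper names are hypothetical and are not part of this patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helpers, not part of this patch.
 * Assumed layout of one ap_encodings word:
 *   bits 31..29: AP field value
 *   bits  5..0 : log2 of the page size
 */
static inline unsigned radix_ap_value(uint32_t enc)
{
    return enc >> 29;
}

static inline unsigned radix_ap_page_shift(uint32_t enc)
{
    return enc & 0x3f;
}

static inline uint64_t radix_ap_page_size(uint32_t enc)
{
    unsigned shift = radix_ap_page_shift(enc);

    assert(shift != 0);   /* an all-zero word is an unused slot */
    return UINT64_C(1) << shift;
}
```

Under this layout, a word of 0xa0000010 would decode to AP = 5 with a
64 KiB page (shift 16), and 0x0000000c to AP = 0 with a 4 KiB page.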

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
---
v2:

* ppc_radix_page_info is now kept in native format; conversion to BE is done when adding it to the device tree.
* radix_page_info moved into the CPU class, cleaning up some code.
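
The native-format point above can be sketched in isolation as follows.
This is only an illustration: store_be32() is a hypothetical stand-in
for QEMU's cpu_to_be32(), and pack_ap_encodings() is not a function in
this patch. It shows host-order entries being serialized as big-endian
cells, which is the byte order the device tree expects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for cpu_to_be32(): write a host-order value as
 * four big-endian bytes, regardless of host endianness.
 */
static void store_be32(uint8_t *dst, uint32_t v)
{
    dst[0] = (uint8_t)(v >> 24);
    dst[1] = (uint8_t)(v >> 16);
    dst[2] = (uint8_t)(v >> 8);
    dst[3] = (uint8_t)v;
}

/*
 * Serialize 'count' host-order encodings into 'prop' as big-endian
 * cells and return the property length in bytes. 'prop' must have room
 * for 4 * count bytes.
 */
static size_t pack_ap_encodings(const uint32_t *native, size_t count,
                                uint8_t *prop)
{
    for (size_t i = 0; i < count; i++) {
        store_be32(prop + 4 * i, native[i]);
    }
    return 4 * count;
}
```

Keeping the cached copy in host order means any code that inspects the
entries does not need to swap them back; only the device tree writer
has to care about byte order.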

 hw/ppc/spapr.c       | 12 ++++++++++++
 include/sysemu/kvm.h |  1 +
 target/ppc/cpu-qom.h |  1 +
 target/ppc/cpu.h     |  4 ++++
 target/ppc/kvm.c     | 27 +++++++++++++++++++++++++++
 5 files changed, 45 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index cceb35f083..ca3812555f 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -409,6 +409,8 @@ static void spapr_populate_cpu_dt(CPUState *cs, void *fdt, int offset,
     sPAPRDRConnector *drc;
     sPAPRDRConnectorClass *drck;
     int drc_index;
+    uint32_t radix_AP_encodings[PPC_PAGE_SIZES_MAX_SZ];
+    int i;
 
     drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, index);
     if (drc) {
@@ -494,6 +496,16 @@ static void spapr_populate_cpu_dt(CPUState *cs, void *fdt, int offset,
     _FDT(spapr_fixup_cpu_numa_dt(fdt, offset, cs));
 
     _FDT(spapr_fixup_cpu_smt_dt(fdt, offset, cpu, compat_smt));
+
+    if (pcc->radix_page_info) {
+        for (i = 0; i < pcc->radix_page_info->count; i++) {
+            radix_AP_encodings[i] = cpu_to_be32(pcc->radix_page_info->entries[i]);
+        }
+        _FDT((fdt_setprop(fdt, offset, "ibm,processor-radix-AP-encodings",
+                          radix_AP_encodings,
+                          pcc->radix_page_info->count *
+                          sizeof(radix_AP_encodings[0]))));
+    }
 }
 
 static void spapr_populate_cpus_dt_node(void *fdt, sPAPRMachineState *spapr)
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index 3045ee7678..01a8db1180 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -526,5 +526,6 @@ int kvm_set_one_reg(CPUState *cs, uint64_t id, void *source);
  * Returns: 0 on success, or a negative errno on failure.
  */
 int kvm_get_one_reg(CPUState *cs, uint64_t id, void *target);
+struct ppc_radix_page_info *kvm_get_radix_page_info(void);
 int kvm_get_max_memslots(void);
 #endif
diff --git a/target/ppc/cpu-qom.h b/target/ppc/cpu-qom.h
index 4807f4d86c..0efb543912 100644
--- a/target/ppc/cpu-qom.h
+++ b/target/ppc/cpu-qom.h
@@ -195,6 +195,7 @@ typedef struct PowerPCCPUClass {
     int bfd_mach;
     uint32_t l1_dcache_size, l1_icache_size;
     const struct ppc_segment_page_sizes *sps;
+    struct ppc_radix_page_info *radix_page_info;
     void (*init_proc)(CPUPPCState *env);
     int  (*check_pow)(CPUPPCState *env);
     int (*handle_mmu_fault)(PowerPCCPU *cpu, vaddr eaddr, int rwx, int mmu_idx);
diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index b559b67073..a6c8c5ff4c 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -934,6 +934,10 @@ struct ppc_segment_page_sizes {
     struct ppc_one_seg_page_size sps[PPC_PAGE_SIZES_MAX_SZ];
 };
 
+struct ppc_radix_page_info {
+    uint32_t count;
+    uint32_t entries[PPC_PAGE_SIZES_MAX_SZ];
+};
 
 /*****************************************************************************/
 /* The whole PowerPC CPU context */
diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
index d53ede8b4a..cf62a42c1f 100644
--- a/target/ppc/kvm.c
+++ b/target/ppc/kvm.c
@@ -48,6 +48,7 @@
 #if defined(TARGET_PPC64)
 #include "hw/ppc/spapr_cpu_core.h"
 #endif
+#include "sysemu/kvm_int.h"
 
 //#define DEBUG_KVM
 
@@ -329,6 +330,30 @@ static void kvm_get_smmu_info(PowerPCCPU *cpu, struct kvm_ppc_smmu_info *info)
     kvm_get_fallback_smmu_info(cpu, info);
 }
 
+struct ppc_radix_page_info *kvm_get_radix_page_info(void)
+{
+    KVMState *s = KVM_STATE(current_machine->accelerator);
+    struct ppc_radix_page_info *radix_page_info;
+    struct kvm_ppc_rmmu_info rmmu_info;
+    int i;
+
+    if (!kvm_check_extension(s, KVM_CAP_PPC_MMU_RADIX)) {
+        return NULL;
+    }
+    if (kvm_vm_ioctl(s, KVM_PPC_GET_RMMU_INFO, &rmmu_info)) {
+        return NULL;
+    }
+    radix_page_info = g_malloc0(sizeof(*radix_page_info));
+    radix_page_info->count = 0;
+    for (i = 0; i < PPC_PAGE_SIZES_MAX_SZ; i++) {
+        if (rmmu_info.ap_encodings[i]) {
+            radix_page_info->entries[radix_page_info->count++] =
+                rmmu_info.ap_encodings[i];
+        }
+    }
+    return radix_page_info;
+}
+
 static long gethugepagesize(const char *mem_path)
 {
     struct statfs fs;
@@ -2379,6 +2404,8 @@ static void kvmppc_host_cpu_class_init(ObjectClass *oc, void *data)
         pcc->l1_icache_size = icache_size;
     }
 
+    pcc->radix_page_info = kvm_enabled() ? kvm_get_radix_page_info() : NULL;
+
     /* Reason: kvmppc_host_cpu_initfn() dies when !kvm_enabled() */
     dc->cannot_destroy_with_object_finalize_yet = true;
 }
-- 
2.11.0


* [Qemu-devel] [RFC PATCH v2 07/12] target-ppc: support KVM_CAP_PPC_MMU_RADIX, KVM_CAP_PPC_MMU_HASH_V3
  2017-02-23  5:59 [Qemu-devel] [RFC PATCH v2 00/12] ISA 3.00 KVM guest support Sam Bobroff
                   ` (5 preceding siblings ...)
  2017-02-23  5:59 ` [Qemu-devel] [RFC PATCH v2 06/12] spapr: Add ibm,processor-radix-AP-encodings to the device tree
@ 2017-02-23  6:00 ` Sam Bobroff
  2017-02-28  0:13   ` David Gibson
  2017-02-23  6:00 ` [Qemu-devel] [RFC PATCH v2 08/12] spapr: Only setup HTP if necessary Sam Bobroff
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 28+ messages in thread
From: Sam Bobroff @ 2017-02-23  6:00 UTC (permalink / raw)
  To: qemu-ppc; +Cc: qemu-devel, david, sjitindarsingh

Query and cache the value of two new KVM capabilities that indicate
KVM's support for new radix and hash modes of the MMU.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
---
v2:

* cap_mmu_hash renamed to cap_mmu_hash_v3.

 target/ppc/kvm.c     | 14 ++++++++++++++
 target/ppc/kvm_ppc.h | 12 ++++++++++++
 2 files changed, 26 insertions(+)

diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
index cf62a42c1f..8b153808fd 100644
--- a/target/ppc/kvm.c
+++ b/target/ppc/kvm.c
@@ -83,6 +83,8 @@ static int cap_papr;
 static int cap_htab_fd;
 static int cap_fixup_hcalls;
 static int cap_htm;             /* Hardware transactional memory support */
+static int cap_mmu_radix;
+static int cap_mmu_hash_v3;
 
 static uint32_t debug_inst_opcode;
 
@@ -136,6 +138,8 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
     cap_htab_fd = kvm_check_extension(s, KVM_CAP_PPC_HTAB_FD);
     cap_fixup_hcalls = kvm_check_extension(s, KVM_CAP_PPC_FIXUP_HCALL);
     cap_htm = kvm_vm_check_extension(s, KVM_CAP_PPC_HTM);
+    cap_mmu_radix = kvm_vm_check_extension(s, KVM_CAP_PPC_MMU_RADIX);
+    cap_mmu_hash_v3 = kvm_vm_check_extension(s, KVM_CAP_PPC_MMU_HASH_V3);
 
     if (!cap_interrupt_level) {
         fprintf(stderr, "KVM: Couldn't find level irq capability. Expect the "
@@ -2430,6 +2434,16 @@ bool kvmppc_has_cap_htm(void)
     return cap_htm;
 }
 
+bool kvmppc_has_cap_mmu_radix(void)
+{
+    return cap_mmu_radix;
+}
+
+bool kvmppc_has_cap_mmu_hash_v3(void)
+{
+    return cap_mmu_hash_v3;
+}
+
 static PowerPCCPUClass *ppc_cpu_get_family_class(PowerPCCPUClass *pcc)
 {
     ObjectClass *oc = OBJECT_CLASS(pcc);
diff --git a/target/ppc/kvm_ppc.h b/target/ppc/kvm_ppc.h
index 8da2ee418a..56e222dfc2 100644
--- a/target/ppc/kvm_ppc.h
+++ b/target/ppc/kvm_ppc.h
@@ -56,6 +56,8 @@ void kvmppc_hash64_write_pte(CPUPPCState *env, target_ulong pte_index,
                              target_ulong pte0, target_ulong pte1);
 bool kvmppc_has_cap_fixup_hcalls(void);
 bool kvmppc_has_cap_htm(void);
+bool kvmppc_has_cap_mmu_radix(void);
+bool kvmppc_has_cap_mmu_hash_v3(void);
 int kvmppc_enable_hwrng(void);
 int kvmppc_put_books_sregs(PowerPCCPU *cpu);
 PowerPCCPUClass *kvm_ppc_get_host_cpu_class(void);
@@ -262,6 +264,16 @@ static inline bool kvmppc_has_cap_htm(void)
     return false;
 }
 
+static inline bool kvmppc_has_cap_mmu_radix(void)
+{
+    return false;
+}
+
+static inline bool kvmppc_has_cap_mmu_hash_v3(void)
+{
+    return false;
+}
+
 static inline int kvmppc_enable_hwrng(void)
 {
     return -1;
-- 
2.11.0


* [Qemu-devel] [RFC PATCH v2 08/12] spapr: Only setup HTP if necessary.
  2017-02-23  5:59 [Qemu-devel] [RFC PATCH v2 00/12] ISA 3.00 KVM guest support Sam Bobroff
                   ` (6 preceding siblings ...)
  2017-02-23  6:00 ` [Qemu-devel] [RFC PATCH v2 07/12] target-ppc: support KVM_CAP_PPC_MMU_RADIX, KVM_CAP_PPC_MMU_HASH_V3 Sam Bobroff
@ 2017-02-23  6:00 ` Sam Bobroff
  2017-02-28  0:28   ` David Gibson
  2017-02-23  6:00 ` [Qemu-devel] [RFC PATCH v2 09/12] spapr: Add h_register_process_table() hypercall Sam Bobroff
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 28+ messages in thread
From: Sam Bobroff @ 2017-02-23  6:00 UTC (permalink / raw)
  To: qemu-ppc; +Cc: qemu-devel, david, sjitindarsingh

If QEMU is using KVM, and KVM is capable of running in radix mode,
guests can be run in real mode without allocating an HPT (because KVM
will use a minimal RPT). So in this case, we avoid creating the HPT
at reset time and create it later (during CAS) if it is necessary.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
---
v2:

* This patch has been mostly rewritten to move the late HPT allocation to CAS.
This allows a guest to start in radix mode (when it's in real mode) and then
change to hash, even if it is a legacy guest and will not call
h_register_process_table().
* Added an exported function to spapr.c to perform HPT allocation and adjust
the vrma if necessary. This makes it possible to allocate the HPT from
h_client_architecture_support() in spapr_hcall.c.
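
To summarise the intended behaviour for review, here is a minimal
standalone sketch of when the HPT gets allocated. It is not the QEMU
code: hpt_alloc_point() and the HptAllocPoint enum are hypothetical
names, and guest_radix simply stands for "CAS determined that the guest
will use radix".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical enum: the point at which the HPT is allocated. */
typedef enum {
    HPT_ALLOC_AT_RESET,  /* ppc_spapr_reset() allocates it, as before */
    HPT_ALLOC_AT_CAS,    /* deferred to h_client_architecture_support() */
    HPT_ALLOC_NEVER,     /* guest runs in radix mode, no HPT needed */
} HptAllocPoint;

/* Mirrors the two conditions used in this patch (illustrative only). */
static HptAllocPoint hpt_alloc_point(bool kvm, bool radix_cap,
                                     bool guest_radix)
{
    if (!(kvm && radix_cap)) {
        /* TCG, or KVM without radix: VCPUs cannot start without an HPT. */
        return HPT_ALLOC_AT_RESET;
    }
    /* KVM starts the VCPUs in radix mode; decide once CAS has run. */
    return guest_radix ? HPT_ALLOC_NEVER : HPT_ALLOC_AT_CAS;
}

/* Small usage example. */
static void hpt_alloc_point_selfcheck(void)
{
    assert(hpt_alloc_point(false, false, false) == HPT_ALLOC_AT_RESET);
    assert(hpt_alloc_point(true, true, false) == HPT_ALLOC_AT_CAS);
}
```

Note that a legacy guest on a radix-capable KVM host lands in the
"at CAS" case, which is why the allocation had to move there.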

 hw/ppc/spapr.c         | 24 +++++++++++++++---------
 hw/ppc/spapr_hcall.c   | 10 ++++++++++
 include/hw/ppc/spapr.h |  1 +
 3 files changed, 26 insertions(+), 9 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index ca3812555f..dfee0f685f 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1123,6 +1123,17 @@ static void spapr_reallocate_hpt(sPAPRMachineState *spapr, int shift,
     }
 }
 
+void spapr_setup_hpt_and_vrma(sPAPRMachineState *spapr)
+{
+    spapr_reallocate_hpt(spapr,
+                     spapr_hpt_shift_for_ramsize(MACHINE(qdev_get_machine())->maxram_size),
+                     &error_fatal);
+    if (spapr->vrma_adjust) {
+        spapr->rma_size = kvmppc_rma_size(spapr_node0_size(),
+                                          spapr->htab_shift);
+    }
+}
+
 static void find_unknown_sysbus_device(SysBusDevice *sbdev, void *opaque)
 {
     bool matched = false;
@@ -1151,15 +1162,10 @@ static void ppc_spapr_reset(void)
     /* Check for unknown sysbus devices */
     foreach_dynamic_sysbus_device(find_unknown_sysbus_device, NULL);
 
-    /* Allocate and/or reset the hash page table */
-    spapr_reallocate_hpt(spapr,
-                         spapr_hpt_shift_for_ramsize(machine->maxram_size),
-                         &error_fatal);
-
-    /* Update the RMA size if necessary */
-    if (spapr->vrma_adjust) {
-        spapr->rma_size = kvmppc_rma_size(spapr_node0_size(),
-                                          spapr->htab_shift);
+    /* If using KVM with radix mode available, VCPUs can be started
+     * without a HPT because KVM will start them in radix mode. */
+    if (!(kvm_enabled() && kvmppc_has_cap_mmu_radix())) {
+        spapr_setup_hpt_and_vrma(spapr);
     }
 
     qemu_devices_reset();
diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index 42d20e0b92..cea34073aa 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -1002,6 +1002,16 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu,
     ov5_updates = spapr_ovec_new();
     spapr->cas_reboot = spapr_ovec_diff(ov5_updates,
                                         ov5_cas_old, spapr->ov5_cas);
+    if (kvm_enabled()) {
+        if (kvmppc_has_cap_mmu_radix()) {
+            /* If the HPT hasn't yet been set up (see
+             * ppc_spapr_reset()), and it's needed, do it now: */
+            if (!spapr_ovec_test(ov5_updates, OV5_MMU_RADIX)) {
+                /* legacy hash or new hash: */
+                spapr_setup_hpt_and_vrma(spapr);
+            }
+        }
+    }
 
     if (!spapr->cas_reboot) {
         spapr->cas_reboot =
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index f9b17d860a..a30cbc485c 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -590,6 +590,7 @@ void spapr_dt_events(sPAPRMachineState *sm, void *fdt);
 int spapr_h_cas_compose_response(sPAPRMachineState *sm,
                                  target_ulong addr, target_ulong size,
                                  sPAPROptionVector *ov5_updates);
+void spapr_setup_hpt_and_vrma(sPAPRMachineState *spapr);
 sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn);
 void spapr_tce_table_enable(sPAPRTCETable *tcet,
                             uint32_t page_shift, uint64_t bus_offset,
-- 
2.11.0


* [Qemu-devel] [RFC PATCH v2 09/12] spapr: Add h_register_process_table() hypercall
  2017-02-23  5:59 [Qemu-devel] [RFC PATCH v2 00/12] ISA 3.00 KVM guest support Sam Bobroff
                   ` (7 preceding siblings ...)
  2017-02-23  6:00 ` [Qemu-devel] [RFC PATCH v2 08/12] spapr: Only setup HTP if necessary Sam Bobroff
@ 2017-02-23  6:00 ` Sam Bobroff
  2017-02-23  6:00 ` [Qemu-devel] [RFC PATCH v2 10/12] spapr: move spapr_populate_pa_features() Sam Bobroff
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 28+ messages in thread
From: Sam Bobroff @ 2017-02-23  6:00 UTC (permalink / raw)
  To: qemu-ppc; +Cc: qemu-devel, david, sjitindarsingh

Both radix and hash modes require guests to use
h_register_process_table() to set up the MMU. Implement it using the
new KVM ioctl KVM_PPC_CONFIGURE_V3_MMU.

This hypercall is also necessary for fully emulated guests, so it will
need to be reworked to integrate with Suraj's TCG patchset.
---
v2:

* I haven't addressed review comments for this patch because it overlaps with
Suraj's implementation of the same function and we'll work together to
integrate them.
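
For anyone reviewing the argument handling, the flag decoding in
h_register_process_table() can be read as the standalone sketch below.
It mirrors the checks in this patch but is not the QEMU code:
encode_proc_tbl() and the FLAG_* names are hypothetical, and the
meanings suggested by those names are only inferred from how the patch
tests the bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; the meanings are guesses based on the patch. */
#define FLAG_GTSE      0x01ULL  /* patch maps this to KVM_PPC_MMUV3_GTSE */
#define FLAG_RADIX     0x04ULL  /* patch maps this to KVM_PPC_MMUV3_RADIX */
#define FLAG_NEW_TABLE 0x08ULL  /* table arguments are to be encoded */
#define FLAG_UPDATE    0x10ULL  /* clear: reuse the last registered value */

#define RADIX_BIT (1ULL << 63)

/*
 * Returns true and stores the encoded process table value in *out, or
 * returns false in the cases where the patch returns H_PARAMETER.
 */
static bool encode_proc_tbl(uint64_t flags, uint64_t proc_tbl,
                            uint64_t page_size, uint64_t table_size,
                            uint64_t last, uint64_t *out)
{
    uint64_t cproc = (flags & FLAG_RADIX) ? RADIX_BIT : 0;

    assert(out != NULL);
    if (!(flags & FLAG_UPDATE)) {
        /* Reuse the last value, but only if the MMU mode matches. */
        if ((last & RADIX_BIT) != cproc) {
            return false;
        }
        cproc = last;
    } else if (!(flags & FLAG_NEW_TABLE)) {
        /* Nothing to encode beyond the mode bit. */
    } else if (flags & FLAG_RADIX) {
        /* Radix: 4K-aligned table address, top four bits clear. */
        if (table_size > 24 || (proc_tbl & 0xfff) || (proc_tbl >> 60)) {
            return false;
        }
        cproc |= proc_tbl | table_size;
    } else {
        /* Hash, possibly with process table. */
        if (table_size > 24 || (proc_tbl >> 38) || page_size > 7) {
            return false;
        }
        cproc = (proc_tbl << 25) | (page_size << 5) | table_size;
    }
    *out = cproc;
    return true;
}
```

The sketch uses 1ULL rather than 1ul for the top bit so that it does
not depend on the width of long on the host.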

 hw/ppc/spapr_hcall.c   | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
 include/hw/ppc/spapr.h |  1 +
 target/ppc/kvm.c       | 12 ++++++++++++
 target/ppc/kvm_ppc.h   |  1 +
 4 files changed, 62 insertions(+)

diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index cea34073aa..9391619ed6 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -1027,6 +1027,50 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu,
     return H_SUCCESS;
 }
 
+static target_ulong h_register_process_table(PowerPCCPU *cpu,
+                                             sPAPRMachineState *spapr,
+                                             target_ulong opcode,
+                                             target_ulong *args)
+{
+    static target_ulong last_process_table;
+    target_ulong flags = args[0];
+    target_ulong proc_tbl = args[1];
+    target_ulong page_size = args[2];
+    target_ulong table_size = args[3];
+    uint64_t cflags, cproc;
+
+    cflags = (flags & 4) ? KVM_PPC_MMUV3_RADIX : 0;
+    cflags |= (flags & 1) ? KVM_PPC_MMUV3_GTSE : 0;
+    cproc = (flags & 4) ? (1ul << 63) : 0;
+    if (!(flags & 0x10)) {
+        if ((last_process_table & (1ul << 63)) != cproc) {
+            return H_PARAMETER;
+        }
+        cproc = last_process_table;
+    } else if (!(flags & 0x8)) {
+        ; /* do nothing */
+    } else if (flags & 4) {
+        /* radix */
+        if (table_size > 24 || (proc_tbl & 0xfff) || (proc_tbl >> 60)) {
+            return H_PARAMETER;
+        }
+        cproc |= proc_tbl | table_size;
+    } else {
+        /* hash, possibly with process table */
+        if (table_size > 24 || (proc_tbl >> 38) || page_size > 7) {
+            return H_PARAMETER;
+        }
+        cproc = (proc_tbl << 25) | (page_size << 5) | table_size;
+    }
+    last_process_table = cproc;
+    fprintf(stderr, "calling config mmu flags=%" PRIx64
+            " proctbl=%" PRIx64 "\n", cflags, cproc);
+    if (!kvmppc_configure_v3_mmu(cpu, cflags, cproc)) {
+        return H_HARDWARE;
+    }
+    return H_SUCCESS;
+}
+
 static spapr_hcall_fn papr_hypercall_table[(MAX_HCALL_OPCODE / 4) + 1];
 static spapr_hcall_fn kvmppc_hypercall_table[KVMPPC_HCALL_MAX - KVMPPC_HCALL_BASE + 1];
 
@@ -1115,6 +1159,10 @@ static void hypercall_register_types(void)
 
     /* ibm,client-architecture-support support */
     spapr_register_hypercall(KVMPPC_H_CAS, h_client_architecture_support);
+
+    /* Power9 MMU support */
+    spapr_register_hypercall(H_REGISTER_PROC_TBL,
+                             h_register_process_table);
 }
 
 type_init(hypercall_register_types)
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index a30cbc485c..d523db3b4a 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -346,6 +346,7 @@ struct sPAPRMachineState {
 #define H_XIRR_X                0x2FC
 #define H_RANDOM                0x300
 #define H_SET_MODE              0x31C
+#define H_REGISTER_PROC_TBL     0x37C
 #define H_SIGNAL_SYS_RESET      0x380
 #define MAX_HCALL_OPCODE        H_SIGNAL_SYS_RESET
 
diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
index 8b153808fd..34dde45eef 100644
--- a/target/ppc/kvm.c
+++ b/target/ppc/kvm.c
@@ -358,6 +358,18 @@ struct ppc_radix_page_info *kvm_get_radix_page_info(void)
     return radix_page_info;
 }
 
+bool kvmppc_configure_v3_mmu(PowerPCCPU *cpu, uint64_t flags, uint64_t proc_tbl)
+{
+    CPUState *cs = CPU(cpu);
+    int ret;
+    struct kvm_ppc_mmuv3_cfg cfg;
+
+    cfg.flags = flags;
+    cfg.process_table = proc_tbl;
+    ret = kvm_vm_ioctl(cs->kvm_state, KVM_PPC_CONFIGURE_V3_MMU, &cfg);
+    return ret == 0;
+}
+
 static long gethugepagesize(const char *mem_path)
 {
     struct statfs fs;
diff --git a/target/ppc/kvm_ppc.h b/target/ppc/kvm_ppc.h
index 56e222dfc2..441fa6a2db 100644
--- a/target/ppc/kvm_ppc.h
+++ b/target/ppc/kvm_ppc.h
@@ -33,6 +33,7 @@ int kvmppc_clear_tsr_bits(PowerPCCPU *cpu, uint32_t tsr_bits);
 int kvmppc_or_tsr_bits(PowerPCCPU *cpu, uint32_t tsr_bits);
 int kvmppc_set_tcr(PowerPCCPU *cpu);
 int kvmppc_booke_watchdog_enable(PowerPCCPU *cpu);
+bool kvmppc_configure_v3_mmu(PowerPCCPU *cpu, uint64_t flags, uint64_t proctbl);
 #ifndef CONFIG_USER_ONLY
 off_t kvmppc_alloc_rma(void **rma);
 bool kvmppc_spapr_use_multitce(void);
-- 
2.11.0


* [Qemu-devel] [RFC PATCH v2 10/12] spapr: move spapr_populate_pa_features()
  2017-02-23  5:59 [Qemu-devel] [RFC PATCH v2 00/12] ISA 3.00 KVM guest support Sam Bobroff
                   ` (8 preceding siblings ...)
  2017-02-23  6:00 ` [Qemu-devel] [RFC PATCH v2 09/12] spapr: Add h_register_process_table() hypercall Sam Bobroff
@ 2017-02-23  6:00 ` Sam Bobroff
  2017-02-28  0:29   ` David Gibson
  2017-02-23  6:00 ` [Qemu-devel] [RFC PATCH v2 11/12] spapr: Enable ISA 3.0 MMU mode selection via CAS Sam Bobroff
  2017-02-23  6:00 ` [Qemu-devel] [RFC PATCH v2 12/12] spapr: Workaround for broken radix guests Sam Bobroff
  11 siblings, 1 reply; 28+ messages in thread
From: Sam Bobroff @ 2017-02-23  6:00 UTC (permalink / raw)
  To: qemu-ppc; +Cc: qemu-devel, david, sjitindarsingh

In the next patch, spapr_fixup_cpu_dt() will need to call
spapr_populate_pa_features(), so move its definition up without making
any other changes.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
---
 hw/ppc/spapr.c | 86 +++++++++++++++++++++++++++++-----------------------------
 1 file changed, 43 insertions(+), 43 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index dfee0f685f..0c0782b558 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -194,6 +194,49 @@ static int spapr_fixup_cpu_numa_dt(void *fdt, int offset, CPUState *cs)
     return ret;
 }
 
+/* Populate the "ibm,pa-features" property */
+static void spapr_populate_pa_features(CPUPPCState *env, void *fdt, int offset)
+{
+    uint8_t pa_features_206[] = { 6, 0,
+        0xf6, 0x1f, 0xc7, 0x00, 0x80, 0xc0 };
+    uint8_t pa_features_207[] = { 24, 0,
+        0xf6, 0x1f, 0xc7, 0xc0, 0x80, 0xf0,
+        0x80, 0x00, 0x00, 0x00, 0x00, 0x00,
+        0x00, 0x00, 0x00, 0x00, 0x80, 0x00,
+        0x80, 0x00, 0x80, 0x00, 0x00, 0x00 };
+    uint8_t *pa_features;
+    size_t pa_size;
+
+    switch (POWERPC_MMU_VER(env->mmu_model)) {
+    case POWERPC_MMU_VER_2_06:
+        pa_features = pa_features_206;
+        pa_size = sizeof(pa_features_206);
+        break;
+    case POWERPC_MMU_VER_2_07:
+        pa_features = pa_features_207;
+        pa_size = sizeof(pa_features_207);
+        break;
+    default:
+        return;
+    }
+
+    if (env->ci_large_pages) {
+        /*
+         * Note: we keep CI large pages off by default because a 64K capable
+         * guest provisioned with large pages might otherwise try to map a qemu
+         * framebuffer (or other kind of memory mapped PCI BAR) using 64K pages
+         * even if that qemu runs on a 4k host.
+         * We dd this bit back here if we are confident this is not an issue
+         */
+        pa_features[3] |= 0x20;
+    }
+    if (kvmppc_has_cap_htm() && pa_size > 24) {
+        pa_features[24] |= 0x80;    /* Transactional memory support */
+    }
+
+    _FDT((fdt_setprop(fdt, offset, "ibm,pa-features", pa_features, pa_size)));
+}
+
 static int spapr_fixup_cpu_dt(void *fdt, sPAPRMachineState *spapr)
 {
     int ret = 0, offset, cpus_offset;
@@ -346,49 +389,6 @@ static int spapr_populate_memory(sPAPRMachineState *spapr, void *fdt)
     return 0;
 }
 
-/* Populate the "ibm,pa-features" property */
-static void spapr_populate_pa_features(CPUPPCState *env, void *fdt, int offset)
-{
-    uint8_t pa_features_206[] = { 6, 0,
-        0xf6, 0x1f, 0xc7, 0x00, 0x80, 0xc0 };
-    uint8_t pa_features_207[] = { 24, 0,
-        0xf6, 0x1f, 0xc7, 0xc0, 0x80, 0xf0,
-        0x80, 0x00, 0x00, 0x00, 0x00, 0x00,
-        0x00, 0x00, 0x00, 0x00, 0x80, 0x00,
-        0x80, 0x00, 0x80, 0x00, 0x00, 0x00 };
-    uint8_t *pa_features;
-    size_t pa_size;
-
-    switch (POWERPC_MMU_VER(env->mmu_model)) {
-    case POWERPC_MMU_VER_2_06:
-        pa_features = pa_features_206;
-        pa_size = sizeof(pa_features_206);
-        break;
-    case POWERPC_MMU_VER_2_07:
-        pa_features = pa_features_207;
-        pa_size = sizeof(pa_features_207);
-        break;
-    default:
-        return;
-    }
-
-    if (env->ci_large_pages) {
-        /*
-         * Note: we keep CI large pages off by default because a 64K capable
-         * guest provisioned with large pages might otherwise try to map a qemu
-         * framebuffer (or other kind of memory mapped PCI BAR) using 64K pages
-         * even if that qemu runs on a 4k host.
-         * We dd this bit back here if we are confident this is not an issue
-         */
-        pa_features[3] |= 0x20;
-    }
-    if (kvmppc_has_cap_htm() && pa_size > 24) {
-        pa_features[24] |= 0x80;    /* Transactional memory support */
-    }
-
-    _FDT((fdt_setprop(fdt, offset, "ibm,pa-features", pa_features, pa_size)));
-}
-
 static void spapr_populate_cpu_dt(CPUState *cs, void *fdt, int offset,
                                   sPAPRMachineState *spapr)
 {
-- 
2.11.0


* [Qemu-devel] [RFC PATCH v2 11/12] spapr: Enable ISA 3.0 MMU mode selection via CAS
  2017-02-23  5:59 [Qemu-devel] [RFC PATCH v2 00/12] ISA 3.00 KVM guest support Sam Bobroff
                   ` (9 preceding siblings ...)
  2017-02-23  6:00 ` [Qemu-devel] [RFC PATCH v2 10/12] spapr: move spapr_populate_pa_features() Sam Bobroff
@ 2017-02-23  6:00 ` Sam Bobroff
  2017-02-23  6:00 ` [Qemu-devel] [RFC PATCH v2 12/12] spapr: Workaround for broken radix guests Sam Bobroff
  11 siblings, 0 replies; 28+ messages in thread
From: Sam Bobroff @ 2017-02-23  6:00 UTC (permalink / raw)
  To: qemu-ppc; +Cc: qemu-devel, david, sjitindarsingh

Add the new property, /chosen/ibm,arch-vec-5-platform-support, to the
device tree. This allows the guest to determine which modes are
supported by the hypervisor.

Update the option vector processing in h_client_architecture_support()
to handle the new MMU bits. This allows guests to request hash or
radix mode, and allows QEMU to create the guest's HPT at this time if
it is necessary but hasn't yet been done.  QEMU will terminate the
guest if it requests an unavailable mode, as required by the
architecture.

Extend the ibm,pa-features property with the new ISA 3.0 values
and set the radix bit if KVM supports radix mode. This probably won't
be used directly by guests to determine the availability of radix mode
(that is indicated by the new property added above) but the
architecture requires that it be set when the hardware supports it.

ISA 3.0 guests will now begin to call h_register_process_table(),
which has been added previously.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
---
v2:

* Unused bits removed.
* Logic and bit definitions changed due to architectural change.
* Cleanly terminate QEMU if the guest requests an unavailable mode (as required
  by the new architecture).
* Legacy guest workaround moved to its own patch.
* I'm sorry for the bitfield constants in spapr_dt_ov5_platform_support() but
  there don't seem to be convenient macros for converting an option vector
  specifier (OV_BIT(x,y)) into a byte-mask. I'm open to suggestions.
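
  One possible shape for such helpers is sketched below. OV_BYTE_NR(),
  OV_BYTE_MASK() and ov5_mmu_byte24() are hypothetical names, not
  existing QEMU macros or functions. The sketch assumes bit 0 is the
  most significant bit of each byte, which matches the 0x80/0x40
  constants used in this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BITS_PER_BYTE 8

/* As in spapr_ovec.h: bytes are numbered from 1, bits from 0. */
#define OV_BIT(byte, bit) ((byte - 1) * BITS_PER_BYTE + bit)

#define OV5_MMU_BOTH        OV_BIT(24, 0)
#define OV5_MMU_RADIX_300   OV_BIT(24, 1)
#define OV5_MMU_RADIX_GTSE  OV_BIT(26, 1)

/* Hypothetical helpers: recover the byte number and the in-byte mask. */
#define OV_BYTE_NR(bitnr)   ((bitnr) / BITS_PER_BYTE + 1)
#define OV_BYTE_MASK(bitnr) (0x80 >> ((bitnr) % BITS_PER_BYTE))

/*
 * Value for byte 24 of ibm,arch-vec-5-platform-support under KVM,
 * following the three cases in spapr_dt_ov5_platform_support().
 */
static uint8_t ov5_mmu_byte24(bool radix, bool hash_v3)
{
    /* Mirrors the assert in the KVM branch: some mode must exist. */
    assert(radix || hash_v3);
    if (radix && hash_v3) {
        return OV_BYTE_MASK(OV5_MMU_BOTH);
    }
    if (radix) {
        return OV_BYTE_MASK(OV5_MMU_RADIX_300);
    }
    return 0x00; /* hash only */
}
```

  With these, the literal 24/0x80/0x40 pairs could be derived from the
  OV5_* definitions instead of being repeated by hand.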

 hw/ppc/spapr.c              | 53 +++++++++++++++++++++++++++++++++++++++++++++
 hw/ppc/spapr_hcall.c        | 37 ++++++++++++++++++++++++-------
 include/hw/ppc/spapr_ovec.h |  5 +++++
 3 files changed, 87 insertions(+), 8 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 0c0782b558..e83468a8d3 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -204,6 +204,20 @@ static void spapr_populate_pa_features(CPUPPCState *env, void *fdt, int offset)
         0x80, 0x00, 0x00, 0x00, 0x00, 0x00,
         0x00, 0x00, 0x00, 0x00, 0x80, 0x00,
         0x80, 0x00, 0x80, 0x00, 0x00, 0x00 };
+    uint8_t pa_features_300[70 + 2] = { 70, 0,
+        0xf6, 0x3f, 0xc7, 0xc0, 0x80, 0xf0, /* 0 - 5 */
+        0x80, 0x00, 0x00, 0x00, 0x00, 0x00, /* 6 - 11 */
+        0x00, 0x00, 0x00, 0x00, 0x80, 0x00, /* 12 - 17 */
+        0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 18 - 23 */
+        0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 24 - 29 */
+        0x80, 0x00, 0x80, 0x00, 0xC0, 0x00, /* 30 - 35 */
+        0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 36 - 41 */
+        0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 42 - 47 */
+        0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 48 - 53 */
+        0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 54 - 59 */
+        0x80, 0x00, 0x80, 0x00, 0x00, 0x00, /* 60 - 65 */
+        0x00, 0x00, 0x00, 0x00,             /* 66 - 69 */
+        };
     uint8_t *pa_features;
     size_t pa_size;
 
@@ -216,6 +230,10 @@ static void spapr_populate_pa_features(CPUPPCState *env, void *fdt, int offset)
         pa_features = pa_features_207;
         pa_size = sizeof(pa_features_207);
         break;
+    case POWERPC_MMU_VER_3_00:
+        pa_features = pa_features_300;
+        pa_size = sizeof(pa_features_300);
+        break;
     default:
         return;
     }
@@ -804,6 +822,34 @@ static void spapr_dt_rtas(sPAPRMachineState *spapr, void *fdt)
     spapr_dt_rtas_tokens(fdt, rtas);
 }
 
+/* Prepare ibm,arch-vec-5-platform-support, which indicates the MMU features
+ * that the guest may request and thus the valid values for bytes 24..26 of
+ * option vector 5: */
+static void spapr_dt_ov5_platform_support(void *fdt, int chosen)
+{
+    char val[2 * 3] = {
+        24, 0x00, /* Hash/Radix, filled in below. */
+        25, 0x40, /* Hash options: Segment Tables == no, GTSE == no. */
+        26, 0x40, /* Radix options: GTSE == yes. */
+    };
+
+    if (kvm_enabled()) {
+        if (kvmppc_has_cap_mmu_radix() && kvmppc_has_cap_mmu_hash_v3()) {
+            val[1] = 0x80; /* OV5_MMU_BOTH */
+        } else if (kvmppc_has_cap_mmu_radix()) {
+            val[1] = 0x40; /* OV5_MMU_RADIX_300 */
+        } else {
+            assert(kvmppc_has_cap_mmu_hash_v3());
+            val[1] = 0x00; /* Hash */
+        }
+    } else {
+        /* TODO: TCG case, hash */
+        val[1] = 0x00;
+    }
+    _FDT(fdt_setprop(fdt, chosen, "ibm,arch-vec-5-platform-support",
+                     val, sizeof(val)));
+}
+
 static void spapr_dt_chosen(sPAPRMachineState *spapr, void *fdt)
 {
     MachineState *machine = MACHINE(spapr);
@@ -857,6 +903,8 @@ static void spapr_dt_chosen(sPAPRMachineState *spapr, void *fdt)
         _FDT(fdt_setprop_string(fdt, chosen, "linux,stdout-path", stdout_path));
     }
 
+    spapr_dt_ov5_platform_support(fdt, chosen);
+
     g_free(stdout_path);
     g_free(bootlist);
 }
@@ -1929,6 +1977,11 @@ static void ppc_spapr_init(MachineState *machine)
     }
 
     spapr_ovec_set(spapr->ov5, OV5_FORM1_AFFINITY);
+    if (kvmppc_has_cap_mmu_radix()) {
+        /* KVM always allows GTSE with radix... */
+        spapr_ovec_set(spapr->ov5, OV5_MMU_RADIX_GTSE);
+    }
+    /* ... but not with hash (currently). */
 
     /* advertise support for dedicated HP event source to guests */
     if (spapr->use_hotplug_event_source) {
diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index 9391619ed6..efaa1a1b19 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -12,6 +12,7 @@
 #include "trace.h"
 #include "kvm_ppc.h"
 #include "hw/ppc/spapr_ovec.h"
+#include "qemu/error-report.h"
 
 struct SPRSyncState {
     int spr;
@@ -933,6 +934,7 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu,
     uint32_t best_compat = 0;
     int i;
     sPAPROptionVector *ov5_guest, *ov5_cas_old, *ov5_updates;
+    bool guest_radix;
 
     /*
      * We scan the supplied table of PVRs looking for two things
@@ -984,6 +986,13 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu,
     ov_table = list;
 
     ov5_guest = spapr_ovec_parse_vector(ov_table, 5);
+    if (spapr_ovec_test(ov5_guest, OV5_MMU_BOTH)) {
+        error_report("qemu: guest requested hash and radix MMU, which is invalid.");
+        exit(EXIT_FAILURE);
+    }
+    /* The radix/hash bit in byte 24 requires special handling: */
+    guest_radix = spapr_ovec_test(ov5_guest, OV5_MMU_RADIX_300);
+    spapr_ovec_clear(ov5_guest, OV5_MMU_RADIX_300);
 
     /* NOTE: there are actually a number of ov5 bits where input from the
      * guest is always zero, and the platform/QEMU enables them independently
@@ -1002,14 +1011,18 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu,
     ov5_updates = spapr_ovec_new();
     spapr->cas_reboot = spapr_ovec_diff(ov5_updates,
                                         ov5_cas_old, spapr->ov5_cas);
-    if (kvm_enabled()) {
-        if (kvmppc_has_cap_mmu_radix()) {
-            /* If the HPT hasn't yet been set up (see
-             * ppc_spapr_reset()), and it's needed, do it now: */
-            if (!spapr_ovec_test(ov5_updates, OV5_MMU_RADIX)) {
-                /* legacy hash or new hash: */
-                spapr_setup_hpt_and_vrma(spapr);
-            }
+    /* Now that processing is finished, set the radix/hash bit for the
+     * guest if it requested a valid mode; otherwise terminate the boot. */
+    if (guest_radix) {
+        if (kvm_enabled() && !kvmppc_has_cap_mmu_radix()) {
+            error_report("qemu: Guest requested radix MMU mode when it is not available.");
+            exit(EXIT_FAILURE);
+        }
+        spapr_ovec_set(spapr->ov5_cas, OV5_MMU_RADIX_300);
+    } else {
+        if (kvm_enabled() && !kvmppc_has_cap_mmu_hash_v3()) {
+            error_report("qemu: Guest requested hash MMU mode when it is not available.");
+            exit(EXIT_FAILURE);
         }
     }
 
@@ -1022,6 +1035,14 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu,
 
     if (spapr->cas_reboot) {
         qemu_system_reset_request();
+    } else {
+        /* If ppc_spapr_reset() did not set up a HPT but one is necessary
+         * (because the guest isn't going to use radix) then set it up here. */
+        if (kvm_enabled()) {
+            if (kvmppc_has_cap_mmu_radix() && !guest_radix) {
+                spapr_setup_hpt_and_vrma(spapr);
+            }
+        }
     }
 
     return H_SUCCESS;
diff --git a/include/hw/ppc/spapr_ovec.h b/include/hw/ppc/spapr_ovec.h
index 355a34411f..e2dfbac558 100644
--- a/include/hw/ppc/spapr_ovec.h
+++ b/include/hw/ppc/spapr_ovec.h
@@ -48,6 +48,11 @@ typedef struct sPAPROptionVector sPAPROptionVector;
 #define OV5_FORM1_AFFINITY      OV_BIT(5, 0)
 #define OV5_HP_EVT              OV_BIT(6, 5)
 
+/* ISA 3.00 MMU features: */
+#define OV5_MMU_BOTH            OV_BIT(24, 0) /* Radix and hash */
+#define OV5_MMU_RADIX_300       OV_BIT(24, 1) /* 1 => Radix only, 0 => Hash only */
+#define OV5_MMU_RADIX_GTSE      OV_BIT(26, 1) /* Radix GTSE */
+
 /* interfaces */
 sPAPROptionVector *spapr_ovec_new(void);
 sPAPROptionVector *spapr_ovec_clone(sPAPROptionVector *ov_orig);
-- 
2.11.0


* [Qemu-devel] [RFC PATCH v2 12/12] spapr: Workaround for broken radix guests
  2017-02-23  5:59 [Qemu-devel] [RFC PATCH v2 00/12] ISA 3.00 KVM guest support Sam Bobroff
                   ` (10 preceding siblings ...)
  2017-02-23  6:00 ` [Qemu-devel] [RFC PATCH v2 11/12] spapr: Enable ISA 3.0 MMU mode selection via CAS Sam Bobroff
@ 2017-02-23  6:00 ` Sam Bobroff
  2017-02-28  0:36   ` David Gibson
  11 siblings, 1 reply; 28+ messages in thread
From: Sam Bobroff @ 2017-02-23  6:00 UTC (permalink / raw)
  To: qemu-ppc; +Cc: qemu-devel, david, sjitindarsingh

For a little while around 4.9, Linux kernels that saw the radix bit in
ibm,pa-features would attempt to set up the MMU as if they were a
hypervisor, even if they were a guest, which would cause them to
crash.

Work around this by detecting pre-ISA 3.0 guests by the absence of the
ISA 3.00 support bit in option vector 1, and then removing the radix
bit from ibm,pa-features. Note: This now requires regeneration of that
property after CAS negotiation.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
---
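
The masking step of the workaround, as a standalone sketch.
pa_features_hide_radix() and the PA_* names are hypothetical and not
part of the patch. Unlike the patch, the sketch bounds-checks against
the index that is actually written (attribute byte 40 plus the two
header bytes); that is an assumption about the intent, not a claim
about the original.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PA_HEADER_LEN  2     /* two bytes precede the attribute bytes */
#define PA_RADIX_BYTE  40    /* attribute byte holding the radix bit */
#define PA_RADIX_MASK  0x80  /* Radix MMU */

/* Clear the radix bit for legacy guests, if the array is long enough. */
static void pa_features_hide_radix(uint8_t *pa, size_t size,
                                   bool legacy_guest)
{
    assert(pa != NULL);
    if (legacy_guest && size > PA_HEADER_LEN + PA_RADIX_BYTE) {
        pa[PA_HEADER_LEN + PA_RADIX_BYTE] &= (uint8_t)~PA_RADIX_MASK;
    }
}
```

Only the ISA 3.00 array (72 bytes) is long enough to be affected; the
2.06 and 2.07 arrays are left untouched.
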
 hw/ppc/spapr.c              | 15 +++++++++++++--
 hw/ppc/spapr_hcall.c        |  5 +++--
 include/hw/ppc/spapr.h      |  1 +
 include/hw/ppc/spapr_ovec.h |  3 +++
 4 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index e83468a8d3..c47600b8ee 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -195,7 +195,8 @@ static int spapr_fixup_cpu_numa_dt(void *fdt, int offset, CPUState *cs)
 }
 
 /* Populate the "ibm,pa-features" property */
-static void spapr_populate_pa_features(CPUPPCState *env, void *fdt, int offset)
+static void spapr_populate_pa_features(CPUPPCState *env, void *fdt, int offset,
+                                      bool legacy_guest)
 {
     uint8_t pa_features_206[] = { 6, 0,
         0xf6, 0x1f, 0xc7, 0x00, 0x80, 0xc0 };
@@ -251,6 +252,12 @@ static void spapr_populate_pa_features(CPUPPCState *env, void *fdt, int offset)
     if (kvmppc_has_cap_htm() && pa_size > 24) {
         pa_features[24] |= 0x80;    /* Transactional memory support */
     }
+    if (legacy_guest && pa_size > 40) {
+        /* Workaround for broken kernels that attempt (guest) radix
+         * mode when they can't handle it, if they see the radix bit set
+         * in pa-features. So hide it from them. */
+        pa_features[40 + 2] &= ~0x80; /* Radix MMU */
+    }
 
     _FDT((fdt_setprop(fdt, offset, "ibm,pa-features", pa_features, pa_size)));
 }
@@ -265,6 +272,7 @@ static int spapr_fixup_cpu_dt(void *fdt, sPAPRMachineState *spapr)
 
     CPU_FOREACH(cs) {
         PowerPCCPU *cpu = POWERPC_CPU(cs);
+        CPUPPCState *env = &cpu->env;
         DeviceClass *dc = DEVICE_GET_CLASS(cs);
         int index = ppc_get_vcpu_dt_id(cpu);
         int compat_smt = MIN(smp_threads, ppc_compat_max_threads(cpu));
@@ -306,6 +314,9 @@ static int spapr_fixup_cpu_dt(void *fdt, sPAPRMachineState *spapr)
         if (ret < 0) {
             return ret;
         }
+
+        spapr_populate_pa_features(env, fdt, offset,
+                                         spapr->cas_legacy_guest_workaround);
     }
     return ret;
 }
@@ -503,7 +514,7 @@ static void spapr_populate_cpu_dt(CPUState *cs, void *fdt, int offset,
                           page_sizes_prop, page_sizes_prop_size)));
     }
 
-    spapr_populate_pa_features(env, fdt, offset);
+    spapr_populate_pa_features(env, fdt, offset, false);
 
     _FDT((fdt_setprop_cell(fdt, offset, "ibm,chip-id",
                            cs->cpu_index / vcpus_per_socket)));
diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index efaa1a1b19..7660cd7d64 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -933,7 +933,7 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu,
     uint32_t max_compat = cpu->max_compat;
     uint32_t best_compat = 0;
     int i;
-    sPAPROptionVector *ov5_guest, *ov5_cas_old, *ov5_updates;
+    sPAPROptionVector *ov1_guest, *ov5_guest, *ov5_cas_old, *ov5_updates;
     bool guest_radix;
 
     /*
@@ -985,6 +985,7 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu,
     /* For the future use: here @ov_table points to the first option vector */
     ov_table = list;
 
+    ov1_guest = spapr_ovec_parse_vector(ov_table, 1);
     ov5_guest = spapr_ovec_parse_vector(ov_table, 5);
     if (spapr_ovec_test(ov5_guest, OV5_MMU_BOTH)) {
         error_report("qemu: guest requested hash and radix MMU, which is invalid.");
@@ -1025,7 +1026,7 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu,
             exit(EXIT_FAILURE);
         }
     }
-
+    spapr->cas_legacy_guest_workaround = !spapr_ovec_test(ov1_guest, OV1_PPC_3_00);
     if (!spapr->cas_reboot) {
         spapr->cas_reboot =
             (spapr_h_cas_compose_response(spapr, args[1], args[2],
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index d523db3b4a..1e64e3ada8 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -77,6 +77,7 @@ struct sPAPRMachineState {
     sPAPROptionVector *ov5;         /* QEMU-supported option vectors */
     sPAPROptionVector *ov5_cas;     /* negotiated (via CAS) option vectors */
     bool cas_reboot;
+    bool cas_legacy_guest_workaround;
 
     Notifier epow_notifier;
     QTAILQ_HEAD(, sPAPREventLogEntry) pending_events;
diff --git a/include/hw/ppc/spapr_ovec.h b/include/hw/ppc/spapr_ovec.h
index e2dfbac558..8807c753e0 100644
--- a/include/hw/ppc/spapr_ovec.h
+++ b/include/hw/ppc/spapr_ovec.h
@@ -43,6 +43,9 @@ typedef struct sPAPROptionVector sPAPROptionVector;
 
 #define OV_BIT(byte, bit) ((byte - 1) * BITS_PER_BYTE + bit)
 
+/* option vector 1 */
+#define OV1_PPC_3_00            OV_BIT(3, 0) /* set if we support PowerPC 3.00 */
+
 /* option vector 5 */
 #define OV5_DRCONF_MEMORY       OV_BIT(2, 2)
 #define OV5_FORM1_AFFINITY      OV_BIT(5, 0)
-- 
2.11.0


* Re: [Qemu-devel] [RFC PATCH v2 04/12] Move virtio_mmio.h to fix update-linux-headers.sh
  2017-02-23  5:59 ` [Qemu-devel] [RFC PATCH v2 04/12] Move virtio_mmio.h to fix update-linux-headers.sh Sam Bobroff
@ 2017-02-24 16:40   ` Michael S. Tsirkin
  2017-02-24 16:47   ` Michael S. Tsirkin
  1 sibling, 0 replies; 28+ messages in thread
From: Michael S. Tsirkin @ 2017-02-24 16:40 UTC (permalink / raw)
  To: Sam Bobroff; +Cc: qemu-ppc, qemu-devel, sjitindarsingh, david

On Thu, Feb 23, 2017 at 04:59:57PM +1100, Sam Bobroff wrote:
> Currently, running update-linux-headers.sh will produce a patch that
> deletes virtio_mmio.h, which is still needed. This happens because
> virtio_mmio.h is in the directory used to store headers from the linux
> kernel that are copied by the kernel's "make headers_install" target
> (used by the update script) but it is not one of the files in that
> set.
> 
> Fix this by moving that file into a new directory.
> 
> In the future if that file is added to the "headers_install" target
> then this change should be reverted.
> 
> Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>

This is a temporary condition. I'm merging a patch
exporting virtio_mmio.h for next linux and we'll
want to stay in sync, so I don't think we should make this
change.

> ---
> v2:
> * FWIW, here's one way of fixing it.
> 
>  hw/virtio/virtio-mmio.c                                          | 2 +-
>  include/{standard-headers => kernel-headers}/linux/virtio_mmio.h | 0
>  2 files changed, 1 insertion(+), 1 deletion(-)
>  rename include/{standard-headers => kernel-headers}/linux/virtio_mmio.h (100%)
> 
> diff --git a/hw/virtio/virtio-mmio.c b/hw/virtio/virtio-mmio.c
> index 5807aa87fe..cc6afa9da1 100644
> --- a/hw/virtio/virtio-mmio.c
> +++ b/hw/virtio/virtio-mmio.c
> @@ -20,7 +20,7 @@
>   */
>  
>  #include "qemu/osdep.h"
> -#include "standard-headers/linux/virtio_mmio.h"
> +#include "kernel-headers/linux/virtio_mmio.h"
>  #include "hw/sysbus.h"
>  #include "hw/virtio/virtio.h"
>  #include "qemu/host-utils.h"
> diff --git a/include/standard-headers/linux/virtio_mmio.h b/include/kernel-headers/linux/virtio_mmio.h
> similarity index 100%
> rename from include/standard-headers/linux/virtio_mmio.h
> rename to include/kernel-headers/linux/virtio_mmio.h
> -- 
> 2.11.0
> 


* Re: [Qemu-devel] [RFC PATCH v2 04/12] Move virtio_mmio.h to fix update-linux-headers.sh
  2017-02-23  5:59 ` [Qemu-devel] [RFC PATCH v2 04/12] Move virtio_mmio.h to fix update-linux-headers.sh Sam Bobroff
  2017-02-24 16:40   ` Michael S. Tsirkin
@ 2017-02-24 16:47   ` Michael S. Tsirkin
  2017-02-28  2:23     ` Sam Bobroff
  1 sibling, 1 reply; 28+ messages in thread
From: Michael S. Tsirkin @ 2017-02-24 16:47 UTC (permalink / raw)
  To: Sam Bobroff; +Cc: qemu-ppc, qemu-devel, sjitindarsingh, david

On Thu, Feb 23, 2017 at 04:59:57PM +1100, Sam Bobroff wrote:
> Currently, running update-linux-headers.sh will produce a patch that
> deletes virtio_mmio.h, which is still needed. This happens because
> virtio_mmio.h is in the directory used to store headers from the linux
> kernel that are copied by the kernel's "make headers_install" target
> (used by the update script) but it is not one of the files in that
> set.
> 
> Fix this by moving that file into a new directory.
> 
> In the future if that file is added to the "headers_install" target
> then this change should be reverted.
> 
> Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>

This is a temporary condition - I'm adding a patch exporting
this header for next linux. So I don't think we should merge this.

> ---
> v2:
> * FWIW, here's one way of fixing it.
> 
>  hw/virtio/virtio-mmio.c                                          | 2 +-
>  include/{standard-headers => kernel-headers}/linux/virtio_mmio.h | 0
>  2 files changed, 1 insertion(+), 1 deletion(-)
>  rename include/{standard-headers => kernel-headers}/linux/virtio_mmio.h (100%)
> 
> diff --git a/hw/virtio/virtio-mmio.c b/hw/virtio/virtio-mmio.c
> index 5807aa87fe..cc6afa9da1 100644
> --- a/hw/virtio/virtio-mmio.c
> +++ b/hw/virtio/virtio-mmio.c
> @@ -20,7 +20,7 @@
>   */
>  
>  #include "qemu/osdep.h"
> -#include "standard-headers/linux/virtio_mmio.h"
> +#include "kernel-headers/linux/virtio_mmio.h"
>  #include "hw/sysbus.h"
>  #include "hw/virtio/virtio.h"
>  #include "qemu/host-utils.h"
> diff --git a/include/standard-headers/linux/virtio_mmio.h b/include/kernel-headers/linux/virtio_mmio.h
> similarity index 100%
> rename from include/standard-headers/linux/virtio_mmio.h
> rename to include/kernel-headers/linux/virtio_mmio.h
> -- 
> 2.11.0
> 


* Re: [Qemu-devel] [RFC PATCH v2 01/12] spapr: Small cleanup of PPC MMU enums
  2017-02-23  5:59 ` [Qemu-devel] [RFC PATCH v2 01/12] spapr: Small cleanup of PPC MMU enums Sam Bobroff
@ 2017-02-27  6:22   ` David Gibson
  0 siblings, 0 replies; 28+ messages in thread
From: David Gibson @ 2017-02-27  6:22 UTC (permalink / raw)
  To: Sam Bobroff; +Cc: qemu-ppc, qemu-devel, sjitindarsingh

On Thu, Feb 23, 2017 at 04:59:54PM +1100, Sam Bobroff wrote:
> The PPC MMU types are sometimes treated as if they were a bit field
> and sometimes as if they were an enum, which causes maintenance
> problems: flipping bits in the MMU type (which is done on both the 1TB
> segment and 64K segment bits) currently produces new MMU type
> values that are not handled in every "switch" on it, sometimes causing
> an abort().
> 
> This patch provides some macros that can be used to filter out the
> "bit field-like" bits so that the remainder of the value can be
> switched on, like an enum. This allows removal of all of the
> "degraded" types from the list and should ease maintenance.
> 
> Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>

Seems like a good idea.

Reviewed-by: David Gibson <David@gibson.dropbear.id.au>

> ---
>  hw/ppc/spapr.c          |  8 +++---
>  target/ppc/cpu-qom.h    | 12 ++++-----
>  target/ppc/kvm.c        |  8 +++---
>  target/ppc/mmu-hash64.c | 10 ++++----
>  target/ppc/mmu_helper.c | 67 ++++++++++++++++++++-----------------------------
>  target/ppc/translate.c  | 12 ++++-----
>  6 files changed, 50 insertions(+), 67 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 5904e6498f..cceb35f083 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -359,14 +359,12 @@ static void spapr_populate_pa_features(CPUPPCState *env, void *fdt, int offset)
>      uint8_t *pa_features;
>      size_t pa_size;
>  
> -    switch (env->mmu_model) {
> -    case POWERPC_MMU_2_06:
> -    case POWERPC_MMU_2_06a:
> +    switch (POWERPC_MMU_VER(env->mmu_model)) {
> +    case POWERPC_MMU_VER_2_06:
>          pa_features = pa_features_206;
>          pa_size = sizeof(pa_features_206);
>          break;
> -    case POWERPC_MMU_2_07:
> -    case POWERPC_MMU_2_07a:
> +    case POWERPC_MMU_VER_2_07:
>          pa_features = pa_features_207;
>          pa_size = sizeof(pa_features_207);
>          break;
> diff --git a/target/ppc/cpu-qom.h b/target/ppc/cpu-qom.h
> index 4e3132b56b..4807f4d86c 100644
> --- a/target/ppc/cpu-qom.h
> +++ b/target/ppc/cpu-qom.h
> @@ -79,21 +79,21 @@ enum powerpc_mmu_t {
>      POWERPC_MMU_2_06       = POWERPC_MMU_64 | POWERPC_MMU_1TSEG
>                               | POWERPC_MMU_64K
>                               | POWERPC_MMU_AMR | 0x00000003,
> -    /* Architecture 2.06 "degraded" (no 1T segments)           */
> -    POWERPC_MMU_2_06a      = POWERPC_MMU_64 | POWERPC_MMU_AMR
> -                             | 0x00000003,
>      /* Architecture 2.07 variant                               */
>      POWERPC_MMU_2_07       = POWERPC_MMU_64 | POWERPC_MMU_1TSEG
>                               | POWERPC_MMU_64K
>                               | POWERPC_MMU_AMR | 0x00000004,
> -    /* Architecture 2.07 "degraded" (no 1T segments)           */
> -    POWERPC_MMU_2_07a      = POWERPC_MMU_64 | POWERPC_MMU_AMR
> -                             | 0x00000004,
>      /* Architecture 3.00 variant                               */
>      POWERPC_MMU_3_00       = POWERPC_MMU_64 | POWERPC_MMU_1TSEG
>                               | POWERPC_MMU_64K
>                               | POWERPC_MMU_AMR | 0x00000005,
>  };
> +#define POWERPC_MMU_VER(x) ((x) & (POWERPC_MMU_64 | 0xFFFF))
> +#define POWERPC_MMU_VER_64B POWERPC_MMU_VER(POWERPC_MMU_64B)
> +#define POWERPC_MMU_VER_2_03 POWERPC_MMU_VER(POWERPC_MMU_2_03)
> +#define POWERPC_MMU_VER_2_06 POWERPC_MMU_VER(POWERPC_MMU_2_06)
> +#define POWERPC_MMU_VER_2_07 POWERPC_MMU_VER(POWERPC_MMU_2_07)
> +#define POWERPC_MMU_VER_3_00 POWERPC_MMU_VER(POWERPC_MMU_3_00)
>  
>  /*****************************************************************************/
>  /* Exception model                                                           */
> diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
> index 52bbea514a..d53ede8b4a 100644
> --- a/target/ppc/kvm.c
> +++ b/target/ppc/kvm.c
> @@ -282,8 +282,8 @@ static void kvm_get_fallback_smmu_info(PowerPCCPU *cpu,
>              info->flags |= KVM_PPC_1T_SEGMENTS;
>          }
>  
> -        if (env->mmu_model == POWERPC_MMU_2_06 ||
> -            env->mmu_model == POWERPC_MMU_2_07) {
> +        if (POWERPC_MMU_VER(env->mmu_model) == POWERPC_MMU_VER_2_06 ||
> +           POWERPC_MMU_VER(env->mmu_model) == POWERPC_MMU_VER_2_07) {
>              info->slb_size = 32;
>          } else {
>              info->slb_size = 64;
> @@ -297,8 +297,8 @@ static void kvm_get_fallback_smmu_info(PowerPCCPU *cpu,
>          i++;
>  
>          /* 64K on MMU 2.06 and later */
> -        if (env->mmu_model == POWERPC_MMU_2_06 ||
> -            env->mmu_model == POWERPC_MMU_2_07) {
> +        if (POWERPC_MMU_VER(env->mmu_model) == POWERPC_MMU_VER_2_06 ||
> +            POWERPC_MMU_VER(env->mmu_model) == POWERPC_MMU_VER_2_07) {
>              info->sps[i].page_shift = 16;
>              info->sps[i].slb_enc = 0x110;
>              info->sps[i].enc[0].page_shift = 16;
> diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
> index 76669ed82c..6346167b48 100644
> --- a/target/ppc/mmu-hash64.c
> +++ b/target/ppc/mmu-hash64.c
> @@ -1032,8 +1032,8 @@ void helper_store_lpcr(CPUPPCState *env, target_ulong val)
>      uint64_t lpcr = 0;
>  
>      /* Filter out bits */
> -    switch (env->mmu_model) {
> -    case POWERPC_MMU_64B: /* 970 */
> +    switch (POWERPC_MMU_VER(env->mmu_model)) {
> +    case POWERPC_MMU_VER_64B: /* 970 */
>          if (val & 0x40) {
>              lpcr |= LPCR_LPES0;
>          }
> @@ -1059,19 +1059,19 @@ void helper_store_lpcr(CPUPPCState *env, target_ulong val)
>           * to dig HRMOR out of HID5
>           */
>          break;
> -    case POWERPC_MMU_2_03: /* P5p */
> +    case POWERPC_MMU_VER_2_03: /* P5p */
>          lpcr = val & (LPCR_RMLS | LPCR_ILE |
>                        LPCR_LPES0 | LPCR_LPES1 |
>                        LPCR_RMI | LPCR_HDICE);
>          break;
> -    case POWERPC_MMU_2_06: /* P7 */
> +    case POWERPC_MMU_VER_2_06: /* P7 */
>          lpcr = val & (LPCR_VPM0 | LPCR_VPM1 | LPCR_ISL | LPCR_DPFD |
>                        LPCR_VRMASD | LPCR_RMLS | LPCR_ILE |
>                        LPCR_P7_PECE0 | LPCR_P7_PECE1 | LPCR_P7_PECE2 |
>                        LPCR_MER | LPCR_TC |
>                        LPCR_LPES0 | LPCR_LPES1 | LPCR_HDICE);
>          break;
> -    case POWERPC_MMU_2_07: /* P8 */
> +    case POWERPC_MMU_VER_2_07: /* P8 */
>          lpcr = val & (LPCR_VPM0 | LPCR_VPM1 | LPCR_ISL | LPCR_KBV |
>                        LPCR_DPFD | LPCR_VRMASD | LPCR_RMLS | LPCR_ILE |
>                        LPCR_AIL | LPCR_ONL | LPCR_P8_PECE0 | LPCR_P8_PECE1 |
> diff --git a/target/ppc/mmu_helper.c b/target/ppc/mmu_helper.c
> index eb2d482ef7..0f6016ff0d 100644
> --- a/target/ppc/mmu_helper.c
> +++ b/target/ppc/mmu_helper.c
> @@ -1260,7 +1260,7 @@ static void mmu6xx_dump_mmu(FILE *f, fprintf_function cpu_fprintf,
>  
>  void dump_mmu(FILE *f, fprintf_function cpu_fprintf, CPUPPCState *env)
>  {
> -    switch (env->mmu_model) {
> +    switch (POWERPC_MMU_VER(env->mmu_model)) {
>      case POWERPC_MMU_BOOKE:
>          mmubooke_dump_mmu(f, cpu_fprintf, env);
>          break;
> @@ -1272,12 +1272,10 @@ void dump_mmu(FILE *f, fprintf_function cpu_fprintf, CPUPPCState *env)
>          mmu6xx_dump_mmu(f, cpu_fprintf, env);
>          break;
>  #if defined(TARGET_PPC64)
> -    case POWERPC_MMU_64B:
> -    case POWERPC_MMU_2_03:
> -    case POWERPC_MMU_2_06:
> -    case POWERPC_MMU_2_06a:
> -    case POWERPC_MMU_2_07:
> -    case POWERPC_MMU_2_07a:
> +    case POWERPC_MMU_VER_64B:
> +    case POWERPC_MMU_VER_2_03:
> +    case POWERPC_MMU_VER_2_06:
> +    case POWERPC_MMU_VER_2_07:
>          dump_slb(f, cpu_fprintf, ppc_env_get_cpu(env));
>          break;
>  #endif
> @@ -1412,14 +1410,12 @@ hwaddr ppc_cpu_get_phys_page_debug(CPUState *cs, vaddr addr)
>      CPUPPCState *env = &cpu->env;
>      mmu_ctx_t ctx;
>  
> -    switch (env->mmu_model) {
> +    switch (POWERPC_MMU_VER(env->mmu_model)) {
>  #if defined(TARGET_PPC64)
> -    case POWERPC_MMU_64B:
> -    case POWERPC_MMU_2_03:
> -    case POWERPC_MMU_2_06:
> -    case POWERPC_MMU_2_06a:
> -    case POWERPC_MMU_2_07:
> -    case POWERPC_MMU_2_07a:
> +    case POWERPC_MMU_VER_64B:
> +    case POWERPC_MMU_VER_2_03:
> +    case POWERPC_MMU_VER_2_06:
> +    case POWERPC_MMU_VER_2_07:
>          return ppc_hash64_get_phys_page_debug(cpu, addr);
>  #endif
>  
> @@ -1904,6 +1900,12 @@ void ppc_tlb_invalidate_all(CPUPPCState *env)
>  {
>      PowerPCCPU *cpu = ppc_env_get_cpu(env);
>  
> +#if defined(TARGET_PPC64)
> +    if (env->mmu_model & POWERPC_MMU_64) {
> +        env->tlb_need_flush = 0;
> +        tlb_flush(CPU(cpu));
> +    } else
> +#endif /* defined(TARGET_PPC64) */
>      switch (env->mmu_model) {
>      case POWERPC_MMU_SOFT_6xx:
>      case POWERPC_MMU_SOFT_74xx:
> @@ -1928,21 +1930,12 @@ void ppc_tlb_invalidate_all(CPUPPCState *env)
>          break;
>      case POWERPC_MMU_32B:
>      case POWERPC_MMU_601:
> -#if defined(TARGET_PPC64)
> -    case POWERPC_MMU_64B:
> -    case POWERPC_MMU_2_03:
> -    case POWERPC_MMU_2_06:
> -    case POWERPC_MMU_2_06a:
> -    case POWERPC_MMU_2_07:
> -    case POWERPC_MMU_2_07a:
> -    case POWERPC_MMU_3_00:
> -#endif /* defined(TARGET_PPC64) */
>          env->tlb_need_flush = 0;
>          tlb_flush(CPU(cpu));
>          break;
>      default:
>          /* XXX: TODO */
> -        cpu_abort(CPU(cpu), "Unknown MMU model %d\n", env->mmu_model);
> +        cpu_abort(CPU(cpu), "Unknown MMU model %x\n", env->mmu_model);
>          break;
>      }
>  }
> @@ -1951,6 +1944,16 @@ void ppc_tlb_invalidate_one(CPUPPCState *env, target_ulong addr)
>  {
>  #if !defined(FLUSH_ALL_TLBS)
>      addr &= TARGET_PAGE_MASK;
> +#if defined(TARGET_PPC64)
> +    if (env->mmu_model & POWERPC_MMU_64) {
> +        /* tlbie invalidate TLBs for all segments */
> +        /* XXX: given the fact that there are too many segments to invalidate,
> +         *      and we still don't have a tlb_flush_mask(env, n, mask) in QEMU,
> +         *      we just invalidate all TLBs
> +         */
> +        env->tlb_need_flush |= TLB_NEED_LOCAL_FLUSH;
> +    } else
> +#endif /* defined(TARGET_PPC64) */
>      switch (env->mmu_model) {
>      case POWERPC_MMU_SOFT_6xx:
>      case POWERPC_MMU_SOFT_74xx:
> @@ -1968,22 +1971,6 @@ void ppc_tlb_invalidate_one(CPUPPCState *env, target_ulong addr)
>           */
>          env->tlb_need_flush |= TLB_NEED_LOCAL_FLUSH;
>          break;
> -#if defined(TARGET_PPC64)
> -    case POWERPC_MMU_64B:
> -    case POWERPC_MMU_2_03:
> -    case POWERPC_MMU_2_06:
> -    case POWERPC_MMU_2_06a:
> -    case POWERPC_MMU_2_07:
> -    case POWERPC_MMU_2_07a:
> -    case POWERPC_MMU_3_00:
> -        /* tlbie invalidate TLBs for all segments */
> -        /* XXX: given the fact that there are too many segments to invalidate,
> -         *      and we still don't have a tlb_flush_mask(env, n, mask) in QEMU,
> -         *      we just invalidate all TLBs
> -         */
> -        env->tlb_need_flush |= TLB_NEED_LOCAL_FLUSH;
> -        break;
> -#endif /* defined(TARGET_PPC64) */
>      default:
>          /* Should never reach here with other MMU models */
>          assert(0);
> diff --git a/target/ppc/translate.c b/target/ppc/translate.c
> index b09e16ff76..2a24d1de67 100644
> --- a/target/ppc/translate.c
> +++ b/target/ppc/translate.c
> @@ -6988,18 +6988,16 @@ void ppc_cpu_dump_state(CPUState *cs, FILE *f, fprintf_function cpu_fprintf,
>      if (env->spr_cb[SPR_LPCR].name)
>          cpu_fprintf(f, " LPCR " TARGET_FMT_lx "\n", env->spr[SPR_LPCR]);
>  
> -    switch (env->mmu_model) {
> +    switch (POWERPC_MMU_VER(env->mmu_model)) {
>      case POWERPC_MMU_32B:
>      case POWERPC_MMU_601:
>      case POWERPC_MMU_SOFT_6xx:
>      case POWERPC_MMU_SOFT_74xx:
>  #if defined(TARGET_PPC64)
> -    case POWERPC_MMU_64B:
> -    case POWERPC_MMU_2_03:
> -    case POWERPC_MMU_2_06:
> -    case POWERPC_MMU_2_06a:
> -    case POWERPC_MMU_2_07:
> -    case POWERPC_MMU_2_07a:
> +    case POWERPC_MMU_VER_64B:
> +    case POWERPC_MMU_VER_2_03:
> +    case POWERPC_MMU_VER_2_06:
> +    case POWERPC_MMU_VER_2_07:
>  #endif
>          cpu_fprintf(f, " SDR1 " TARGET_FMT_lx "   DAR " TARGET_FMT_lx
>                         "  DSISR " TARGET_FMT_lx "\n", env->spr[SPR_SDR1],

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [Qemu-devel] [RFC PATCH v2 02/12] scripts/update-linux-headers.sh: refactor extra files
  2017-02-23  5:59 ` [Qemu-devel] [RFC PATCH v2 02/12] scripts/update-linux-headers.sh: refactor extra files Sam Bobroff
@ 2017-02-27  6:24   ` David Gibson
  0 siblings, 0 replies; 28+ messages in thread
From: David Gibson @ 2017-02-27  6:24 UTC (permalink / raw)
  To: Sam Bobroff; +Cc: qemu-ppc, qemu-devel, sjitindarsingh

On Thu, Feb 23, 2017 at 04:59:55PM +1100, Sam Bobroff wrote:
> Refactor the architecture specific code to make it easier
> to add new special case files.
> 
> There should be no change in functionality.
> 
> Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

> ---
> v2:
> 
> I've factored the script to make it easier to add new files.
> 
>  scripts/update-linux-headers.sh | 25 +++++++++++--------------
>  1 file changed, 11 insertions(+), 14 deletions(-)
> 
> diff --git a/scripts/update-linux-headers.sh b/scripts/update-linux-headers.sh
> index ef11a8ab42..c75c30da1b 100755
> --- a/scripts/update-linux-headers.sh
> +++ b/scripts/update-linux-headers.sh
> @@ -76,28 +76,25 @@ for arch in $ARCHLIST; do
>      fi
>  
>      make -C "$linux" INSTALL_HDR_PATH="$tmpdir" ARCH=$arch headers_install
> +    ARCH_EXTRA=
> +    ARCH_STD_EXTRA=
> +    case "$arch" in
> +        powerpc) ARCH_EXTRA=epapr_hcalls.h ;;
> +        s390) ARCH_STD_EXTRA="kvm_virtio.h virtio-ccw.h" ;;
> +        x86) ARCH_EXTRA="unistd_32.h unistd_x32.h unistd_64.h" ARCH_STD_EXTRA="hyperv.h" ;;
> +    esac
>  
>      rm -rf "$output/linux-headers/asm-$arch"
>      mkdir -p "$output/linux-headers/asm-$arch"
> -    for header in kvm.h kvm_para.h unistd.h; do
> +    for header in kvm.h kvm_para.h unistd.h $ARCH_EXTRA; do
>          cp "$tmpdir/include/asm/$header" "$output/linux-headers/asm-$arch"
>      done
> -    if [ $arch = powerpc ]; then
> -        cp "$tmpdir/include/asm/epapr_hcalls.h" "$output/linux-headers/asm-powerpc/"
> -    fi
>  
>      rm -rf "$output/include/standard-headers/asm-$arch"
>      mkdir -p "$output/include/standard-headers/asm-$arch"
> -    if [ $arch = s390 ]; then
> -        cp_portable "$tmpdir/include/asm/kvm_virtio.h" "$output/include/standard-headers/asm-s390/"
> -        cp_portable "$tmpdir/include/asm/virtio-ccw.h" "$output/include/standard-headers/asm-s390/"
> -    fi
> -    if [ $arch = x86 ]; then
> -        cp_portable "$tmpdir/include/asm/hyperv.h" "$output/include/standard-headers/asm-x86/"
> -        cp "$tmpdir/include/asm/unistd_32.h" "$output/linux-headers/asm-x86/"
> -        cp "$tmpdir/include/asm/unistd_x32.h" "$output/linux-headers/asm-x86/"
> -        cp "$tmpdir/include/asm/unistd_64.h" "$output/linux-headers/asm-x86/"
> -    fi
> +    for header in $ARCH_STD_EXTRA; do
> +        cp_portable "$tmpdir/include/asm/$header" "$output/include/standard-headers/asm-$arch/"
> +    done
>  done
>  
>  rm -rf "$output/linux-headers/linux"

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [Qemu-devel] [RFC PATCH v2 06/12] spapr: Add ibm,processor-radix-AP-encodings to the device tree
  2017-02-23  5:59 ` [Qemu-devel] [RFC PATCH v2 06/12] spapr: Add ibm,processor-radix-AP-encodings to the device tree Sam Bobroff
@ 2017-02-28  0:12   ` David Gibson
  2017-02-28  2:27     ` Suraj Jitindar Singh
  0 siblings, 1 reply; 28+ messages in thread
From: David Gibson @ 2017-02-28  0:12 UTC (permalink / raw)
  To: Sam Bobroff; +Cc: qemu-ppc, qemu-devel, sjitindarsingh

On Thu, Feb 23, 2017 at 04:59:59PM +1100, Sam Bobroff wrote:
> Use the new ioctl, KVM_PPC_GET_RMMU_INFO, to fetch radix MMU
> information from KVM and present the page encodings in the device tree
> under ibm,processor-radix-AP-encodings. This provides page size
> information to the guest which is necessary for it to use radix mode.
> 
> Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
> ---
> v2:
> 
> * ppc_radix_page_info now kept in native format, conversion to BE done when adding to the device tree.
> * radix_page_info moved into the CPU class, cleaning up some code.

Looks pretty good, although I imagine it will need a little rework to
rebase on top of the TCG radix stuff.  Also, one comment below.

> 
>  hw/ppc/spapr.c       | 12 ++++++++++++
>  include/sysemu/kvm.h |  1 +
>  target/ppc/cpu-qom.h |  1 +
>  target/ppc/cpu.h     |  4 ++++
>  target/ppc/kvm.c     | 27 +++++++++++++++++++++++++++
>  5 files changed, 45 insertions(+)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index cceb35f083..ca3812555f 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -409,6 +409,8 @@ static void spapr_populate_cpu_dt(CPUState *cs, void *fdt, int offset,
>      sPAPRDRConnector *drc;
>      sPAPRDRConnectorClass *drck;
>      int drc_index;
> +    uint32_t radix_AP_encodings[PPC_PAGE_SIZES_MAX_SZ];
> +    int i;
>  
>      drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, index);
>      if (drc) {
> @@ -494,6 +496,16 @@ static void spapr_populate_cpu_dt(CPUState *cs, void *fdt, int offset,
>      _FDT(spapr_fixup_cpu_numa_dt(fdt, offset, cs));
>  
>      _FDT(spapr_fixup_cpu_smt_dt(fdt, offset, cpu, compat_smt));
> +
> +    if (pcc->radix_page_info) {
> +        for (i = 0; i < pcc->radix_page_info->count; i++) {
> +            radix_AP_encodings[i] = cpu_to_be32(pcc->radix_page_info->entries[i]);
> +        }
> +        _FDT((fdt_setprop(fdt, offset, "ibm,processor-radix-AP-encodings",
> +                          radix_AP_encodings,
> +                          pcc->radix_page_info->count *
> +                          sizeof(radix_AP_encodings[0]))));
> +    }
>  }
>  
>  static void spapr_populate_cpus_dt_node(void *fdt, sPAPRMachineState *spapr)
> diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
> index 3045ee7678..01a8db1180 100644
> --- a/include/sysemu/kvm.h
> +++ b/include/sysemu/kvm.h
> @@ -526,5 +526,6 @@ int kvm_set_one_reg(CPUState *cs, uint64_t id, void *source);
>   * Returns: 0 on success, or a negative errno on failure.
>   */
>  int kvm_get_one_reg(CPUState *cs, uint64_t id, void *target);
> +struct ppc_radix_page_info *kvm_get_radix_page_info(void);
>  int kvm_get_max_memslots(void);
>  #endif
> diff --git a/target/ppc/cpu-qom.h b/target/ppc/cpu-qom.h
> index 4807f4d86c..0efb543912 100644
> --- a/target/ppc/cpu-qom.h
> +++ b/target/ppc/cpu-qom.h
> @@ -195,6 +195,7 @@ typedef struct PowerPCCPUClass {
>      int bfd_mach;
>      uint32_t l1_dcache_size, l1_icache_size;
>      const struct ppc_segment_page_sizes *sps;
> +    struct ppc_radix_page_info *radix_page_info;
>      void (*init_proc)(CPUPPCState *env);
>      int  (*check_pow)(CPUPPCState *env);
>      int (*handle_mmu_fault)(PowerPCCPU *cpu, vaddr eaddr, int rwx, int mmu_idx);
> diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
> index b559b67073..a6c8c5ff4c 100644
> --- a/target/ppc/cpu.h
> +++ b/target/ppc/cpu.h
> @@ -934,6 +934,10 @@ struct ppc_segment_page_sizes {
>      struct ppc_one_seg_page_size sps[PPC_PAGE_SIZES_MAX_SZ];
>  };
>  
> +struct ppc_radix_page_info {
> +    uint32_t count;
> +    uint32_t entries[PPC_PAGE_SIZES_MAX_SZ];
> +};
>  
>  /*****************************************************************************/
>  /* The whole PowerPC CPU context */
> diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
> index d53ede8b4a..cf62a42c1f 100644
> --- a/target/ppc/kvm.c
> +++ b/target/ppc/kvm.c
> @@ -48,6 +48,7 @@
>  #if defined(TARGET_PPC64)
>  #include "hw/ppc/spapr_cpu_core.h"
>  #endif
> +#include "sysemu/kvm_int.h"
>  
>  //#define DEBUG_KVM
>  
> @@ -329,6 +330,30 @@ static void kvm_get_smmu_info(PowerPCCPU *cpu, struct kvm_ppc_smmu_info *info)
>      kvm_get_fallback_smmu_info(cpu, info);
>  }
>  
> +struct ppc_radix_page_info *kvm_get_radix_page_info(void)
> +{
> +    KVMState *s = KVM_STATE(current_machine->accelerator);
> +    struct ppc_radix_page_info *radix_page_info;
> +    struct kvm_ppc_rmmu_info rmmu_info;
> +    int i;
> +
> +    if (!kvm_check_extension(s, KVM_CAP_PPC_MMU_RADIX)) {
> +        return NULL;
> +    }
> +    if (kvm_vm_ioctl(s, KVM_PPC_GET_RMMU_INFO, &rmmu_info)) {
> +        return NULL;
> +    }
> +    radix_page_info = g_malloc0(sizeof(*radix_page_info));
> +    radix_page_info->count = 0;
> +    for (i = 0; i < PPC_PAGE_SIZES_MAX_SZ; i++) {
> +        if (rmmu_info.ap_encodings[i]) {
> +            radix_page_info->entries[i] = rmmu_info.ap_encodings[i];
> +            radix_page_info->count++;
> +        }
> +    }
> +    return radix_page_info;
> +}
> +
>  static long gethugepagesize(const char *mem_path)
>  {
>      struct statfs fs;
> @@ -2379,6 +2404,8 @@ static void kvmppc_host_cpu_class_init(ObjectClass *oc, void *data)
>          pcc->l1_icache_size = icache_size;
>      }
>  
> +    pcc->radix_page_info = kvm_enabled() ? kvm_get_radix_page_info() : NULL;

This whole function is only called in the kvm case: no need to check
kvm_enabled() here.

>      /* Reason: kvmppc_host_cpu_initfn() dies when !kvm_enabled() */
>      dc->cannot_destroy_with_object_finalize_yet = true;
>  }

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [Qemu-devel] [RFC PATCH v2 07/12] target-ppc: support KVM_CAP_PPC_MMU_RADIX, KVM_CAP_PPC_MMU_HASH_V3
  2017-02-23  6:00 ` [Qemu-devel] [RFC PATCH v2 07/12] target-ppc: support KVM_CAP_PPC_MMU_RADIX, KVM_CAP_PPC_MMU_HASH_V3 Sam Bobroff
@ 2017-02-28  0:13   ` David Gibson
  0 siblings, 0 replies; 28+ messages in thread
From: David Gibson @ 2017-02-28  0:13 UTC (permalink / raw)
  To: Sam Bobroff; +Cc: qemu-ppc, qemu-devel, sjitindarsingh

On Thu, Feb 23, 2017 at 05:00:00PM +1100, Sam Bobroff wrote:
> Query and cache the value of two new KVM capabilities that indicate
> KVM's support for new radix and hash modes of the MMU.
> 
> Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

> ---
> v2:
> 
> * cap_mmu_hash renamed to cap_mmu_hash_v3.
> 
>  target/ppc/kvm.c     | 14 ++++++++++++++
>  target/ppc/kvm_ppc.h | 12 ++++++++++++
>  2 files changed, 26 insertions(+)
> 
> diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
> index cf62a42c1f..8b153808fd 100644
> --- a/target/ppc/kvm.c
> +++ b/target/ppc/kvm.c
> @@ -83,6 +83,8 @@ static int cap_papr;
>  static int cap_htab_fd;
>  static int cap_fixup_hcalls;
>  static int cap_htm;             /* Hardware transactional memory support */
> +static int cap_mmu_radix;
> +static int cap_mmu_hash_v3;
>  
>  static uint32_t debug_inst_opcode;
>  
> @@ -136,6 +138,8 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
>      cap_htab_fd = kvm_check_extension(s, KVM_CAP_PPC_HTAB_FD);
>      cap_fixup_hcalls = kvm_check_extension(s, KVM_CAP_PPC_FIXUP_HCALL);
>      cap_htm = kvm_vm_check_extension(s, KVM_CAP_PPC_HTM);
> +    cap_mmu_radix = kvm_vm_check_extension(s, KVM_CAP_PPC_MMU_RADIX);
> +    cap_mmu_hash_v3 = kvm_vm_check_extension(s, KVM_CAP_PPC_MMU_HASH_V3);
>  
>      if (!cap_interrupt_level) {
>          fprintf(stderr, "KVM: Couldn't find level irq capability. Expect the "
> @@ -2430,6 +2434,16 @@ bool kvmppc_has_cap_htm(void)
>      return cap_htm;
>  }
>  
> +bool kvmppc_has_cap_mmu_radix(void)
> +{
> +    return cap_mmu_radix;
> +}
> +
> +bool kvmppc_has_cap_mmu_hash_v3(void)
> +{
> +    return cap_mmu_hash_v3;
> +}
> +
>  static PowerPCCPUClass *ppc_cpu_get_family_class(PowerPCCPUClass *pcc)
>  {
>      ObjectClass *oc = OBJECT_CLASS(pcc);
> diff --git a/target/ppc/kvm_ppc.h b/target/ppc/kvm_ppc.h
> index 8da2ee418a..56e222dfc2 100644
> --- a/target/ppc/kvm_ppc.h
> +++ b/target/ppc/kvm_ppc.h
> @@ -56,6 +56,8 @@ void kvmppc_hash64_write_pte(CPUPPCState *env, target_ulong pte_index,
>                               target_ulong pte0, target_ulong pte1);
>  bool kvmppc_has_cap_fixup_hcalls(void);
>  bool kvmppc_has_cap_htm(void);
> +bool kvmppc_has_cap_mmu_radix(void);
> +bool kvmppc_has_cap_mmu_hash_v3(void);
>  int kvmppc_enable_hwrng(void);
>  int kvmppc_put_books_sregs(PowerPCCPU *cpu);
>  PowerPCCPUClass *kvm_ppc_get_host_cpu_class(void);
> @@ -262,6 +264,16 @@ static inline bool kvmppc_has_cap_htm(void)
>      return false;
>  }
>  
> +static inline bool kvmppc_has_cap_mmu_radix(void)
> +{
> +    return false;
> +}
> +
> +static inline bool kvmppc_has_cap_mmu_hash_v3(void)
> +{
> +    return false;
> +}
> +
>  static inline int kvmppc_enable_hwrng(void)
>  {
>      return -1;

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [Qemu-devel] [RFC PATCH v2 08/12] spapr: Only setup HTP if necessary.
  2017-02-23  6:00 ` [Qemu-devel] [RFC PATCH v2 08/12] spapr: Only setup HTP if necessary Sam Bobroff
@ 2017-02-28  0:28   ` David Gibson
  2017-02-28  2:25     ` Suraj Jitindar Singh
  0 siblings, 1 reply; 28+ messages in thread
From: David Gibson @ 2017-02-28  0:28 UTC (permalink / raw)
  To: Sam Bobroff; +Cc: qemu-ppc, qemu-devel, sjitindarsingh

s/HTP/HPT/ in subject line.


On Thu, Feb 23, 2017 at 05:00:01PM +1100, Sam Bobroff wrote:
> If QEMU is using KVM, and KVM is capable of running in radix mode,
> guests can be run in real-mode without allocating a HPT (because KVM
> will use a minimal RPT). So in this case, we avoid creating the HPT
> at reset time and later (during CAS) create it if it is necessary.
> 
> Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>

So, IIRC, we discussed previously that the logical way to do things
was to, by default, delay HPT allocation until CAS time, and just do
it at reset time for the case that needs it: hash host with KVM.

Did you hit a problem with that approach, or is there still work to be
done here?

> ---
> v2:
> 
> * This patch has been mostly rewritten to move the late HPT allocation to CAS.
> This allows a guest to start in radix mode (when it's in real mode) and then
> change to hash, even if it is a legacy guest and will not call
> h_register_process_table().
> * Added an exported function to spapr.c to perform HPT allocation and adjust
> the vrma if necessary. This makes it possible to allocate the HPT from
> h_client_architecture_support() in spapr_hcall.c.
> 
>  hw/ppc/spapr.c         | 24 +++++++++++++++---------
>  hw/ppc/spapr_hcall.c   | 10 ++++++++++
>  include/hw/ppc/spapr.h |  1 +
>  3 files changed, 26 insertions(+), 9 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index ca3812555f..dfee0f685f 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1123,6 +1123,17 @@ static void spapr_reallocate_hpt(sPAPRMachineState *spapr, int shift,
>      }
>  }
>  
> +void spapr_setup_hpt_and_vrma(sPAPRMachineState *spapr)
> +{
> +    spapr_reallocate_hpt(spapr,
> +                     spapr_hpt_shift_for_ramsize(MACHINE(qdev_get_machine())->maxram_size),
> +                     &error_fatal);
> +    if (spapr->vrma_adjust) {
> +        spapr->rma_size = kvmppc_rma_size(spapr_node0_size(),
> +                                          spapr->htab_shift);
> +    }
> +}
> +
>  static void find_unknown_sysbus_device(SysBusDevice *sbdev, void *opaque)
>  {
>      bool matched = false;
> @@ -1151,15 +1162,10 @@ static void ppc_spapr_reset(void)
>      /* Check for unknown sysbus devices */
>      foreach_dynamic_sysbus_device(find_unknown_sysbus_device, NULL);
>  
> -    /* Allocate and/or reset the hash page table */
> -    spapr_reallocate_hpt(spapr,
> -                         spapr_hpt_shift_for_ramsize(machine->maxram_size),
> -                         &error_fatal);
> -
> -    /* Update the RMA size if necessary */
> -    if (spapr->vrma_adjust) {
> -        spapr->rma_size = kvmppc_rma_size(spapr_node0_size(),
> -                                          spapr->htab_shift);
> +    /* If using KVM with radix mode available, VCPUs can be started
> +     * without a HPT because KVM will start them in radix mode. */
> +    if (!(kvm_enabled() && kvmppc_has_cap_mmu_radix())) {
> +        spapr_setup_hpt_and_vrma(spapr);
>      }
>  
>      qemu_devices_reset();
> diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
> index 42d20e0b92..cea34073aa 100644
> --- a/hw/ppc/spapr_hcall.c
> +++ b/hw/ppc/spapr_hcall.c
> @@ -1002,6 +1002,16 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu,
>      ov5_updates = spapr_ovec_new();
>      spapr->cas_reboot = spapr_ovec_diff(ov5_updates,
>                                          ov5_cas_old, spapr->ov5_cas);
> +    if (kvm_enabled()) {
> +        if (kvmppc_has_cap_mmu_radix()) {
> +            /* If the HPT hasn't yet been set up (see
> +             * ppc_spapr_reset()), and it's needed, do it now: */

I think it's a bit fragile to have this explicitly mirror the logic
which determines whether the HPT is allocated early.  I'd prefer to
explicitly test here whether we have allocated an HPT - adding a flag,
if we have to.

> +            if (!spapr_ovec_test(ov5_updates, OV5_MMU_RADIX)) {
> +                /* legacy hash or new hash: */
> +                spapr_setup_hpt_and_vrma(spapr);
> +            }
> +        }
> +    }
>  
>      if (!spapr->cas_reboot) {
>          spapr->cas_reboot =
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index f9b17d860a..a30cbc485c 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -590,6 +590,7 @@ void spapr_dt_events(sPAPRMachineState *sm, void *fdt);
>  int spapr_h_cas_compose_response(sPAPRMachineState *sm,
>                                   target_ulong addr, target_ulong size,
>                                   sPAPROptionVector *ov5_updates);
> +void spapr_setup_hpt_and_vrma(sPAPRMachineState *spapr);
>  sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn);
>  void spapr_tce_table_enable(sPAPRTCETable *tcet,
>                              uint32_t page_shift, uint64_t bus_offset,

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [Qemu-devel] [RFC PATCH v2 10/12] spapr: move spapr_populate_pa_features()
  2017-02-23  6:00 ` [Qemu-devel] [RFC PATCH v2 10/12] spapr: move spapr_populate_pa_features() Sam Bobroff
@ 2017-02-28  0:29   ` David Gibson
  0 siblings, 0 replies; 28+ messages in thread
From: David Gibson @ 2017-02-28  0:29 UTC (permalink / raw)
  To: Sam Bobroff; +Cc: qemu-ppc, qemu-devel, sjitindarsingh

On Thu, Feb 23, 2017 at 05:00:03PM +1100, Sam Bobroff wrote:
> In the next patch, spapr_fixup_cpu_dt() will need to call
> spapr_populate_pa_features() so move it's definition up without making
> any other changes.

s/it's/its/

> 
> Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
> ---
>  hw/ppc/spapr.c | 86 +++++++++++++++++++++++++++++-----------------------------
>  1 file changed, 43 insertions(+), 43 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index dfee0f685f..0c0782b558 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -194,6 +194,49 @@ static int spapr_fixup_cpu_numa_dt(void *fdt, int offset, CPUState *cs)
>      return ret;
>  }
>  
> +/* Populate the "ibm,pa-features" property */
> +static void spapr_populate_pa_features(CPUPPCState *env, void *fdt, int offset)
> +{
> +    uint8_t pa_features_206[] = { 6, 0,
> +        0xf6, 0x1f, 0xc7, 0x00, 0x80, 0xc0 };
> +    uint8_t pa_features_207[] = { 24, 0,
> +        0xf6, 0x1f, 0xc7, 0xc0, 0x80, 0xf0,
> +        0x80, 0x00, 0x00, 0x00, 0x00, 0x00,
> +        0x00, 0x00, 0x00, 0x00, 0x80, 0x00,
> +        0x80, 0x00, 0x80, 0x00, 0x00, 0x00 };
> +    uint8_t *pa_features;
> +    size_t pa_size;
> +
> +    switch (POWERPC_MMU_VER(env->mmu_model)) {
> +    case POWERPC_MMU_VER_2_06:
> +        pa_features = pa_features_206;
> +        pa_size = sizeof(pa_features_206);
> +        break;
> +    case POWERPC_MMU_VER_2_07:
> +        pa_features = pa_features_207;
> +        pa_size = sizeof(pa_features_207);
> +        break;
> +    default:
> +        return;
> +    }
> +
> +    if (env->ci_large_pages) {
> +        /*
> +         * Note: we keep CI large pages off by default because a 64K capable
> +         * guest provisioned with large pages might otherwise try to map a qemu
> +         * framebuffer (or other kind of memory mapped PCI BAR) using 64K pages
> +         * even if that qemu runs on a 4k host.
> +         * We dd this bit back here if we are confident this is not an issue
> +         */
> +        pa_features[3] |= 0x20;
> +    }
> +    if (kvmppc_has_cap_htm() && pa_size > 24) {
> +        pa_features[24] |= 0x80;    /* Transactional memory support */
> +    }
> +
> +    _FDT((fdt_setprop(fdt, offset, "ibm,pa-features", pa_features, pa_size)));
> +}
> +
>  static int spapr_fixup_cpu_dt(void *fdt, sPAPRMachineState *spapr)
>  {
>      int ret = 0, offset, cpus_offset;
> @@ -346,49 +389,6 @@ static int spapr_populate_memory(sPAPRMachineState *spapr, void *fdt)
>      return 0;
>  }
>  
> -/* Populate the "ibm,pa-features" property */
> -static void spapr_populate_pa_features(CPUPPCState *env, void *fdt, int offset)
> -{
> -    uint8_t pa_features_206[] = { 6, 0,
> -        0xf6, 0x1f, 0xc7, 0x00, 0x80, 0xc0 };
> -    uint8_t pa_features_207[] = { 24, 0,
> -        0xf6, 0x1f, 0xc7, 0xc0, 0x80, 0xf0,
> -        0x80, 0x00, 0x00, 0x00, 0x00, 0x00,
> -        0x00, 0x00, 0x00, 0x00, 0x80, 0x00,
> -        0x80, 0x00, 0x80, 0x00, 0x00, 0x00 };
> -    uint8_t *pa_features;
> -    size_t pa_size;
> -
> -    switch (POWERPC_MMU_VER(env->mmu_model)) {
> -    case POWERPC_MMU_VER_2_06:
> -        pa_features = pa_features_206;
> -        pa_size = sizeof(pa_features_206);
> -        break;
> -    case POWERPC_MMU_VER_2_07:
> -        pa_features = pa_features_207;
> -        pa_size = sizeof(pa_features_207);
> -        break;
> -    default:
> -        return;
> -    }
> -
> -    if (env->ci_large_pages) {
> -        /*
> -         * Note: we keep CI large pages off by default because a 64K capable
> -         * guest provisioned with large pages might otherwise try to map a qemu
> -         * framebuffer (or other kind of memory mapped PCI BAR) using 64K pages
> -         * even if that qemu runs on a 4k host.
> -         * We dd this bit back here if we are confident this is not an issue
> -         */
> -        pa_features[3] |= 0x20;
> -    }
> -    if (kvmppc_has_cap_htm() && pa_size > 24) {
> -        pa_features[24] |= 0x80;    /* Transactional memory support */
> -    }
> -
> -    _FDT((fdt_setprop(fdt, offset, "ibm,pa-features", pa_features, pa_size)));
> -}
> -
>  static void spapr_populate_cpu_dt(CPUState *cs, void *fdt, int offset,
>                                    sPAPRMachineState *spapr)
>  {

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [Qemu-devel] [RFC PATCH v2 12/12] spapr: Workaround for broken radix guests
  2017-02-23  6:00 ` [Qemu-devel] [RFC PATCH v2 12/12] spapr: Workaround for broken radix guests Sam Bobroff
@ 2017-02-28  0:36   ` David Gibson
  0 siblings, 0 replies; 28+ messages in thread
From: David Gibson @ 2017-02-28  0:36 UTC (permalink / raw)
  To: Sam Bobroff; +Cc: qemu-ppc, qemu-devel, sjitindarsingh

On Thu, Feb 23, 2017 at 05:00:05PM +1100, Sam Bobroff wrote:
> For a little while around 4.9, Linux kernels that saw the radix bit in
> ibm,pa-features would attempt to set up the MMU as if they were a
> hypervisor, even if they were a guest, which would cause them to
> crash.
> 
> Work around this by detecting pre-ISA 3.0 guests by their lack of that
> bit in option vector 1, and then removing the radix bit from
> ibm,pa-features. Note: This now requires regeneration of that node
> after CAS negotiation.
> 
> Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>

A bit ugly, but not any more so than it needs to be, given what we're
dealing with AFAICT.

I'll save more detailed review until the rebase in conjunction with the
TCG bits.
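
For reference, the legacy-guest test in the commit message boils down to
checking one bit of option vector 1. Below is a minimal standalone sketch
of that check, not the QEMU code: ov_bytes_test() and guest_is_legacy()
are hypothetical helpers, and the sketch assumes that bit 0 is the most
significant bit of each byte. Only the OV_BIT arithmetic and the
OV1_PPC_3_00 position are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BITS_PER_BYTE 8
/* Same arithmetic as OV_BIT in the patch: bytes count from 1, bits from 0. */
#define OV_BIT(byte, bit) (((byte) - 1) * BITS_PER_BYTE + (bit))
#define OV1_PPC_3_00 OV_BIT(3, 0)

/*
 * Hypothetical helper: test one bit in the raw data bytes of an option
 * vector. Assumes bit 0 is the most significant bit of each byte.
 */
bool ov_bytes_test(const uint8_t *bytes, size_t len, unsigned bitnr)
{
    size_t byte = bitnr / BITS_PER_BYTE;
    unsigned bit = bitnr % BITS_PER_BYTE;

    assert(bytes != NULL);
    if (byte >= len) {
        return false;   /* a vector too short to hold the bit leaves it unset */
    }
    return (bytes[byte] & (0x80u >> bit)) != 0;
}

/* Hypothetical helper: a guest is "legacy" if it does not advertise ISA 3.00. */
bool guest_is_legacy(const uint8_t *ov1, size_t len)
{
    return !ov_bytes_test(ov1, len, OV1_PPC_3_00);
}
```

With this numbering OV1_PPC_3_00 is bit 16, i.e. the top bit of the third
byte, so a guest that sends a shorter vector 1 is treated as legacy.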

> ---
>  hw/ppc/spapr.c              | 15 +++++++++++++--
>  hw/ppc/spapr_hcall.c        |  5 +++--
>  include/hw/ppc/spapr.h      |  1 +
>  include/hw/ppc/spapr_ovec.h |  3 +++
>  4 files changed, 20 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index e83468a8d3..c47600b8ee 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -195,7 +195,8 @@ static int spapr_fixup_cpu_numa_dt(void *fdt, int offset, CPUState *cs)
>  }
>  
>  /* Populate the "ibm,pa-features" property */
> -static void spapr_populate_pa_features(CPUPPCState *env, void *fdt, int offset)
> +static void spapr_populate_pa_features(CPUPPCState *env, void *fdt, int offset,
> +                                      bool legacy_guest)
>  {
>      uint8_t pa_features_206[] = { 6, 0,
>          0xf6, 0x1f, 0xc7, 0x00, 0x80, 0xc0 };
> @@ -251,6 +252,12 @@ static void spapr_populate_pa_features(CPUPPCState *env, void *fdt, int offset)
>      if (kvmppc_has_cap_htm() && pa_size > 24) {
>          pa_features[24] |= 0x80;    /* Transactional memory support */
>      }
> +    if (legacy_guest && pa_size > 40) {
> +        /* Workaround for broken kernels that attempt (guest) radix
> +         * mode when they can't handle it, if they see the radix bit set
> +         * in pa-features. So hide it from them. */
> +        pa_features[40 + 2] &= ~0x80; /* Radix MMU */
> +    }
>  
>      _FDT((fdt_setprop(fdt, offset, "ibm,pa-features", pa_features, pa_size)));
>  }
> @@ -265,6 +272,7 @@ static int spapr_fixup_cpu_dt(void *fdt, sPAPRMachineState *spapr)
>  
>      CPU_FOREACH(cs) {
>          PowerPCCPU *cpu = POWERPC_CPU(cs);
> +        CPUPPCState *env = &cpu->env;
>          DeviceClass *dc = DEVICE_GET_CLASS(cs);
>          int index = ppc_get_vcpu_dt_id(cpu);
>          int compat_smt = MIN(smp_threads, ppc_compat_max_threads(cpu));
> @@ -306,6 +314,9 @@ static int spapr_fixup_cpu_dt(void *fdt, sPAPRMachineState *spapr)
>          if (ret < 0) {
>              return ret;
>          }
> +
> +        spapr_populate_pa_features(env, fdt, offset,
> +                                         spapr->cas_legacy_guest_workaround);
>      }
>      return ret;
>  }
> @@ -503,7 +514,7 @@ static void spapr_populate_cpu_dt(CPUState *cs, void *fdt, int offset,
>                            page_sizes_prop, page_sizes_prop_size)));
>      }
>  
> -    spapr_populate_pa_features(env, fdt, offset);
> +    spapr_populate_pa_features(env, fdt, offset, false);
>  
>      _FDT((fdt_setprop_cell(fdt, offset, "ibm,chip-id",
>                             cs->cpu_index / vcpus_per_socket)));
> diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
> index efaa1a1b19..7660cd7d64 100644
> --- a/hw/ppc/spapr_hcall.c
> +++ b/hw/ppc/spapr_hcall.c
> @@ -933,7 +933,7 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu,
>      uint32_t max_compat = cpu->max_compat;
>      uint32_t best_compat = 0;
>      int i;
> -    sPAPROptionVector *ov5_guest, *ov5_cas_old, *ov5_updates;
> +    sPAPROptionVector *ov1_guest, *ov5_guest, *ov5_cas_old, *ov5_updates;
>      bool guest_radix;
>  
>      /*
> @@ -985,6 +985,7 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu,
>      /* For the future use: here @ov_table points to the first option vector */
>      ov_table = list;
>  
> +    ov1_guest = spapr_ovec_parse_vector(ov_table, 1);
>      ov5_guest = spapr_ovec_parse_vector(ov_table, 5);
>      if (spapr_ovec_test(ov5_guest, OV5_MMU_BOTH)) {
>          error_report("qemu: guest requested hash and radix MMU, which is invalid.");
> @@ -1025,7 +1026,7 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu,
>              exit(EXIT_FAILURE);
>          }
>      }
> -
> +    spapr->cas_legacy_guest_workaround = !spapr_ovec_test(ov1_guest, OV1_PPC_3_00);
>      if (!spapr->cas_reboot) {
>          spapr->cas_reboot =
>              (spapr_h_cas_compose_response(spapr, args[1], args[2],
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index d523db3b4a..1e64e3ada8 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -77,6 +77,7 @@ struct sPAPRMachineState {
>      sPAPROptionVector *ov5;         /* QEMU-supported option vectors */
>      sPAPROptionVector *ov5_cas;     /* negotiated (via CAS) option vectors */
>      bool cas_reboot;
> +    bool cas_legacy_guest_workaround;
>  
>      Notifier epow_notifier;
>      QTAILQ_HEAD(, sPAPREventLogEntry) pending_events;
> diff --git a/include/hw/ppc/spapr_ovec.h b/include/hw/ppc/spapr_ovec.h
> index e2dfbac558..8807c753e0 100644
> --- a/include/hw/ppc/spapr_ovec.h
> +++ b/include/hw/ppc/spapr_ovec.h
> @@ -43,6 +43,9 @@ typedef struct sPAPROptionVector sPAPROptionVector;
>  
>  #define OV_BIT(byte, bit) ((byte - 1) * BITS_PER_BYTE + bit)
>  
> +/* option vector 1 */
> +#define OV1_PPC_3_00            OV_BIT(3, 0) /* set if we support PowerPC 3.00 */
> +
>  /* option vector 5 */
>  #define OV5_DRCONF_MEMORY       OV_BIT(2, 2)
>  #define OV5_FORM1_AFFINITY      OV_BIT(5, 0)

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [Qemu-devel] [RFC PATCH v2 04/12] Move virtio_mmio.h to fix update-linux-headers.sh
  2017-02-24 16:47   ` Michael S. Tsirkin
@ 2017-02-28  2:23     ` Sam Bobroff
  0 siblings, 0 replies; 28+ messages in thread
From: Sam Bobroff @ 2017-02-28  2:23 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: qemu-ppc, qemu-devel, sjitindarsingh, david

On Fri, Feb 24, 2017 at 06:47:03PM +0200, Michael S. Tsirkin wrote:
> On Thu, Feb 23, 2017 at 04:59:57PM +1100, Sam Bobroff wrote:
> > Currently, running update-linux-headers.sh will produce a patch that
> > deletes virtio_mmio.h, which is still needed. This happens because
> > virtio_mmio.h is in the directory used to store headers from the linux
> > kernel that are copied by the kernel's "make headers_install" target
> > (used by the update script) but it is not one of the files in that
> > set.
> > 
> > Fix this by moving that file into a new directory.
> > 
> > In the future if that file is added to the "headers_install" target
> > then this change should be reverted.
> > 
> > Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
> 
> This is a temporary condition - I'm adding a patch exporting
> this header for next linux. So I don't think we should merge this.

Great :-)

I'll drop this patch from the next version.

Cheers,
Sam.

> > ---
> > v2:
> > * FWIW, here's one way of fixing it.
> > 
> >  hw/virtio/virtio-mmio.c                                          | 2 +-
> >  include/{standard-headers => kernel-headers}/linux/virtio_mmio.h | 0
> >  2 files changed, 1 insertion(+), 1 deletion(-)
> >  rename include/{standard-headers => kernel-headers}/linux/virtio_mmio.h (100%)
> > 
> > diff --git a/hw/virtio/virtio-mmio.c b/hw/virtio/virtio-mmio.c
> > index 5807aa87fe..cc6afa9da1 100644
> > --- a/hw/virtio/virtio-mmio.c
> > +++ b/hw/virtio/virtio-mmio.c
> > @@ -20,7 +20,7 @@
> >   */
> >  
> >  #include "qemu/osdep.h"
> > -#include "standard-headers/linux/virtio_mmio.h"
> > +#include "kernel-headers/linux/virtio_mmio.h"
> >  #include "hw/sysbus.h"
> >  #include "hw/virtio/virtio.h"
> >  #include "qemu/host-utils.h"
> > diff --git a/include/standard-headers/linux/virtio_mmio.h b/include/kernel-headers/linux/virtio_mmio.h
> > similarity index 100%
> > rename from include/standard-headers/linux/virtio_mmio.h
> > rename to include/kernel-headers/linux/virtio_mmio.h
> > -- 
> > 2.11.0
> > 

* Re: [Qemu-devel] [RFC PATCH v2 08/12] spapr: Only setup HTP if necessary.
  2017-02-28  0:28   ` David Gibson
@ 2017-02-28  2:25     ` Suraj Jitindar Singh
  2017-02-28  3:19       ` David Gibson
  0 siblings, 1 reply; 28+ messages in thread
From: Suraj Jitindar Singh @ 2017-02-28  2:25 UTC (permalink / raw)
  To: David Gibson, Sam Bobroff; +Cc: qemu-ppc, qemu-devel

On Tue, 2017-02-28 at 11:28 +1100, David Gibson wrote:
> s/HTP/HPT/ in subject line.
> 
> 
> On Thu, Feb 23, 2017 at 05:00:01PM +1100, Sam Bobroff wrote:
> > 
> > If QEMU is using KVM, and KVM is capable of running in radix mode,
> > guests can be run in real-mode without allocating a HPT (because
> > KVM
> > will use a minimal RPT). So in this case, we avoid creating the HPT
> > at reset time and later (during CAS) create it if it is necessary.
> > 
> > Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
> So, IIRC, we discussed previously that the logical way to do things
> was to, by default, delay HPT allocation until CAS time, and just do
> it at reset time for the case that needs it: hash host with KVM.
> 
> Did you hit a problem with that approach, or is there still work to
> be
> done here?

So what we're doing is assuming radix: allocate the HPT at reset time if
the host is hash, otherwise delay until CAS time and allocate only if the
guest chose hash.
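
Roughly, that policy can be written as a single predicate. This is only a
sketch under stated assumptions: the enum, the function name and the
parameters are made up for illustration and are not part of the patch.
"kvm_radix_available" stands for kvm_enabled() && kvmppc_has_cap_mmu_radix().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical: the two points at which the HPT may be allocated. */
enum boot_phase {
    PHASE_RESET,    /* machine reset, before the guest has run */
    PHASE_CAS,      /* client architecture support negotiation */
};

/*
 * Hypothetical predicate: should the HPT be allocated in this phase?
 * guest_chose_radix is only meaningful at CAS time.
 */
bool should_allocate_hpt(enum boot_phase phase, bool kvm_radix_available,
                         bool guest_chose_radix)
{
    switch (phase) {
    case PHASE_RESET:
        /* Without radix, VCPUs cannot start without an HPT. */
        return !kvm_radix_available;
    case PHASE_CAS:
        /* Allocation was delayed; do it now only if the guest wants hash. */
        return kvm_radix_available && !guest_chose_radix;
    }
    assert(!"unknown boot phase");
    return false;
}
```

Note that the CAS case returns false when radix is not available, because
in that case the HPT was already set up at reset time.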

> 
> > 
> > ---
> > v2:
> > 
> > * This patch has been mostly rewritten to move the late HPT
> > allocation to CAS.
> > This allows a guest to start in radix mode (when it's in real mode)
> > and then
> > change to hash, even if it is a legacy guest and will not call
> > h_register_process_table().
> > * Added an exported function to spapr.c to perform HPT allocation
> > and adjust
> > the vrma if necessary. This makes it possible to allocate the HPT
> > from
> > h_client_architecture_support() in spapr_hcall.c.
> > 
> >  hw/ppc/spapr.c         | 24 +++++++++++++++---------
> >  hw/ppc/spapr_hcall.c   | 10 ++++++++++
> >  include/hw/ppc/spapr.h |  1 +
> >  3 files changed, 26 insertions(+), 9 deletions(-)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index ca3812555f..dfee0f685f 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -1123,6 +1123,17 @@ static void
> > spapr_reallocate_hpt(sPAPRMachineState *spapr, int shift,
> >      }
> >  }
> >  
> > +void spapr_setup_hpt_and_vrma(sPAPRMachineState *spapr)
> > +{
> > +    spapr_reallocate_hpt(spapr,
> > +                     spapr_hpt_shift_for_ramsize(MACHINE(qdev_get_
> > machine())->maxram_size),
> > +                     &error_fatal);
> > +    if (spapr->vrma_adjust) {
> > +        spapr->rma_size = kvmppc_rma_size(spapr_node0_size(),
> > +                                          spapr->htab_shift);
> > +    }
> > +}
> > +
> >  static void find_unknown_sysbus_device(SysBusDevice *sbdev, void
> > *opaque)
> >  {
> >      bool matched = false;
> > @@ -1151,15 +1162,10 @@ static void ppc_spapr_reset(void)
> >      /* Check for unknown sysbus devices */
> >      foreach_dynamic_sysbus_device(find_unknown_sysbus_device,
> > NULL);
> >  
> > -    /* Allocate and/or reset the hash page table */
> > -    spapr_reallocate_hpt(spapr,
> > -                         spapr_hpt_shift_for_ramsize(machine-
> > >maxram_size),
> > -                         &error_fatal);
> > -
> > -    /* Update the RMA size if necessary */
> > -    if (spapr->vrma_adjust) {
> > -        spapr->rma_size = kvmppc_rma_size(spapr_node0_size(),
> > -                                          spapr->htab_shift);
> > +    /* If using KVM with radix mode available, VCPUs can be
> > started
> > +     * without a HPT because KVM will start them in radix mode. */
> > +    if (!(kvm_enabled() && kvmppc_has_cap_mmu_radix())) {
> > +        spapr_setup_hpt_and_vrma(spapr);
> >      }
> >  
> >      qemu_devices_reset();
> > diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
> > index 42d20e0b92..cea34073aa 100644
> > --- a/hw/ppc/spapr_hcall.c
> > +++ b/hw/ppc/spapr_hcall.c
> > @@ -1002,6 +1002,16 @@ static target_ulong
> > h_client_architecture_support(PowerPCCPU *cpu,
> >      ov5_updates = spapr_ovec_new();
> >      spapr->cas_reboot = spapr_ovec_diff(ov5_updates,
> >                                          ov5_cas_old, spapr-
> > >ov5_cas);
> > +    if (kvm_enabled()) {
> > +        if (kvmppc_has_cap_mmu_radix()) {
> > +            /* If the HPT hasn't yet been set up (see
> > +             * ppc_spapr_reset()), and it's needed, do it now: */
> I think it's a bit fragile to have this explicitly mirror the logic
> which determines whether the HPT is allocated early.  I'd prefer to
> explicitly test here whether we have allocated an HPT - adding a
> flag, if we have to.

We can use the MSB of patb_entry as that flag.

(patb_entry & GUEST_RADIX) == GUEST_RADIX -> radix, so assume an HPT
hasn't been allocated.

When we do allocate an HPT we know we're not radix, so set
patb_entry &= ~GUEST_RADIX;

where GUEST_RADIX is the MSB of patb_entry, which indicates that a guest
is radix.

Essentially, GUEST_RADIX cleared in patb_entry means hash with an HPT
allocated, and GUEST_RADIX set means radix, so assume an HPT hasn't been
allocated. On the HPT allocation path we clear GUEST_RADIX in patb_entry,
and when we set GUEST_RADIX we free the HPT.
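
As a standalone illustration of that invariant (a sketch only, not the
QEMU implementation): struct machine_sketch and the three functions are
hypothetical stand-ins, hpt_present merely models the real table, and the
value of GUEST_RADIX is assumed to be bit 63 as described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the MSB of patb_entry marks a radix guest. */
#define GUEST_RADIX (UINT64_C(1) << 63)

/* Hypothetical stand-in for the relevant machine state. */
struct machine_sketch {
    uint64_t patb_entry;
    bool hpt_present;       /* models the real HPT allocation */
};

/* The flag test: GUEST_RADIX clear means hash with an HPT allocated. */
bool hpt_allocated(const struct machine_sketch *m)
{
    assert(m);
    /* Parentheses matter: in C, == binds tighter than &. */
    return (m->patb_entry & GUEST_RADIX) == 0;
}

/* HPT allocation path: we know we are not radix, so clear the flag. */
void allocate_hpt(struct machine_sketch *m)
{
    assert(m);
    m->hpt_present = true;
    m->patb_entry &= ~GUEST_RADIX;
}

/* Switching to radix: free the HPT and set the flag. */
void set_guest_radix(struct machine_sketch *m)
{
    assert(m);
    m->hpt_present = false;
    m->patb_entry |= GUEST_RADIX;
}
```

The point of routing every transition through these two paths is that the
flag and the allocation state cannot drift apart, which is what the CAS
code would then rely on instead of mirroring the reset-time condition.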

> 
> > 
> > +            if (!spapr_ovec_test(ov5_updates, OV5_MMU_RADIX)) {
> > +                /* legacy hash or new hash: */
> > +                spapr_setup_hpt_and_vrma(spapr);
> > +            }
> > +        }
> > +    }
> >  
> >      if (!spapr->cas_reboot) {
> >          spapr->cas_reboot =
> > diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> > index f9b17d860a..a30cbc485c 100644
> > --- a/include/hw/ppc/spapr.h
> > +++ b/include/hw/ppc/spapr.h
> > @@ -590,6 +590,7 @@ void spapr_dt_events(sPAPRMachineState *sm,
> > void *fdt);
> >  int spapr_h_cas_compose_response(sPAPRMachineState *sm,
> >                                   target_ulong addr, target_ulong
> > size,
> >                                   sPAPROptionVector *ov5_updates);
> > +void spapr_setup_hpt_and_vrma(sPAPRMachineState *spapr);
> >  sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t
> > liobn);
> >  void spapr_tce_table_enable(sPAPRTCETable *tcet,
> >                              uint32_t page_shift, uint64_t
> > bus_offset,

* Re: [Qemu-devel] [RFC PATCH v2 06/12] spapr: Add ibm, processor-radix-AP-encodings to the device tree
  2017-02-28  0:12   ` David Gibson
@ 2017-02-28  2:27     ` Suraj Jitindar Singh
  0 siblings, 0 replies; 28+ messages in thread
From: Suraj Jitindar Singh @ 2017-02-28  2:27 UTC (permalink / raw)
  To: David Gibson, Sam Bobroff; +Cc: qemu-ppc, qemu-devel

On Tue, 2017-02-28 at 11:12 +1100, David Gibson wrote:
> On Thu, Feb 23, 2017 at 04:59:59PM +1100, Sam Bobroff wrote:
> > 
> > Use the new ioctl, KVM_PPC_GET_RMMU_INFO, to fetch radix MMU
> > information from KVM and present the page encodings in the device
> > tree
> > under ibm,processor-radix-AP-encodings. This provides page size
> > information to the guest which is necessary for it to use radix
> > mode.
> > 
> > Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
> > ---
> > v2:
> > 
> > * ppc_radix_page_info now kept in native format, conversion to BE
> > done when adding to the device tree.
> > * radix_page_info moved into the CPU class, cleaning up some code.
> > Looks pretty good, although I imagine it will need a little rework
> to
> rebase on top of the TCG radix stuff.  Also one comment below..

I've reworked a bit so my TCG stuff applies cleanly on top.

> 
> > 
> > 
> >  hw/ppc/spapr.c       | 12 ++++++++++++
> >  include/sysemu/kvm.h |  1 +
> >  target/ppc/cpu-qom.h |  1 +
> >  target/ppc/cpu.h     |  4 ++++
> >  target/ppc/kvm.c     | 27 +++++++++++++++++++++++++++
> >  5 files changed, 45 insertions(+)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index cceb35f083..ca3812555f 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -409,6 +409,8 @@ static void spapr_populate_cpu_dt(CPUState *cs,
> > void *fdt, int offset,
> >      sPAPRDRConnector *drc;
> >      sPAPRDRConnectorClass *drck;
> >      int drc_index;
> > +    uint32_t radix_AP_encodings[PPC_PAGE_SIZES_MAX_SZ];
> > +    int i;
> >  
> >      drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU,
> > index);
> >      if (drc) {
> > @@ -494,6 +496,16 @@ static void spapr_populate_cpu_dt(CPUState
> > *cs, void *fdt, int offset,
> >      _FDT(spapr_fixup_cpu_numa_dt(fdt, offset, cs));
> >  
> >      _FDT(spapr_fixup_cpu_smt_dt(fdt, offset, cpu, compat_smt));
> > +
> > +    if (pcc->radix_page_info) {
> > +        for (i = 0; i < pcc->radix_page_info->count; i++) {
> > +            radix_AP_encodings[i] = cpu_to_be32(pcc-
> > >radix_page_info->entries[i]);
> > +        }
> > +        _FDT((fdt_setprop(fdt, offset, "ibm,processor-radix-AP-
> > encodings",
> > +                          radix_AP_encodings,
> > +                          pcc->radix_page_info->count *
> > +                          sizeof(radix_AP_encodings[0]))));
> > +    }
> >  }
> >  
> >  static void spapr_populate_cpus_dt_node(void *fdt,
> > sPAPRMachineState *spapr)
> > diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
> > index 3045ee7678..01a8db1180 100644
> > --- a/include/sysemu/kvm.h
> > +++ b/include/sysemu/kvm.h
> > @@ -526,5 +526,6 @@ int kvm_set_one_reg(CPUState *cs, uint64_t id,
> > void *source);
> >   * Returns: 0 on success, or a negative errno on failure.
> >   */
> >  int kvm_get_one_reg(CPUState *cs, uint64_t id, void *target);
> > +struct ppc_radix_page_info *kvm_get_radix_page_info(void);
> >  int kvm_get_max_memslots(void);
> >  #endif
> > diff --git a/target/ppc/cpu-qom.h b/target/ppc/cpu-qom.h
> > index 4807f4d86c..0efb543912 100644
> > --- a/target/ppc/cpu-qom.h
> > +++ b/target/ppc/cpu-qom.h
> > @@ -195,6 +195,7 @@ typedef struct PowerPCCPUClass {
> >      int bfd_mach;
> >      uint32_t l1_dcache_size, l1_icache_size;
> >      const struct ppc_segment_page_sizes *sps;
> > +    struct ppc_radix_page_info *radix_page_info;
> >      void (*init_proc)(CPUPPCState *env);
> >      int  (*check_pow)(CPUPPCState *env);
> >      int (*handle_mmu_fault)(PowerPCCPU *cpu, vaddr eaddr, int rwx,
> > int mmu_idx);
> > diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
> > index b559b67073..a6c8c5ff4c 100644
> > --- a/target/ppc/cpu.h
> > +++ b/target/ppc/cpu.h
> > @@ -934,6 +934,10 @@ struct ppc_segment_page_sizes {
> >      struct ppc_one_seg_page_size sps[PPC_PAGE_SIZES_MAX_SZ];
> >  };
> >  
> > +struct ppc_radix_page_info {
> > +    uint32_t count;
> > +    uint32_t entries[PPC_PAGE_SIZES_MAX_SZ];
> > +};
> >  
> >  /*****************************************************************
> > ************/
> >  /* The whole PowerPC CPU context */
> > diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
> > index d53ede8b4a..cf62a42c1f 100644
> > --- a/target/ppc/kvm.c
> > +++ b/target/ppc/kvm.c
> > @@ -48,6 +48,7 @@
> >  #if defined(TARGET_PPC64)
> >  #include "hw/ppc/spapr_cpu_core.h"
> >  #endif
> > +#include "sysemu/kvm_int.h"
> >  
> >  //#define DEBUG_KVM
> >  
> > @@ -329,6 +330,30 @@ static void kvm_get_smmu_info(PowerPCCPU *cpu,
> > struct kvm_ppc_smmu_info *info)
> >      kvm_get_fallback_smmu_info(cpu, info);
> >  }
> >  
> > +struct ppc_radix_page_info *kvm_get_radix_page_info(void)
> > +{
> > +    KVMState *s = KVM_STATE(current_machine->accelerator);
> > +    struct ppc_radix_page_info *radix_page_info;
> > +    struct kvm_ppc_rmmu_info rmmu_info;
> > +    int i;
> > +
> > +    if (!kvm_check_extension(s, KVM_CAP_PPC_MMU_RADIX)) {
> > +        return NULL;
> > +    }
> > +    if (kvm_vm_ioctl(s, KVM_PPC_GET_RMMU_INFO, &rmmu_info)) {
> > +        return NULL;
> > +    }
> > +    radix_page_info = g_malloc0(sizeof(*radix_page_info));
> > +    radix_page_info->count = 0;
> > +    for (i = 0; i < PPC_PAGE_SIZES_MAX_SZ; i++) {
> > +        if (rmmu_info.ap_encodings[i]) {
> > +            radix_page_info->entries[i] =
> > rmmu_info.ap_encodings[i];
> > +            radix_page_info->count++;
> > +        }
> > +    }
> > +    return radix_page_info;
> > +}
> > +
> >  static long gethugepagesize(const char *mem_path)
> >  {
> >      struct statfs fs;
> > @@ -2379,6 +2404,8 @@ static void
> > kvmppc_host_cpu_class_init(ObjectClass *oc, void *data)
> >          pcc->l1_icache_size = icache_size;
> >      }
> >  
> > +    pcc->radix_page_info = kvm_enabled() ?
> > kvm_get_radix_page_info() : NULL;
> This whole function is only called in the kvm case: no need to check
> kvm_enabled() here.

I've reworked this, so this comment no longer applies. I've added a generic
function in mmu-radix64.h which calls kvm_get_radix_page_info() if radix is
enabled and otherwise returns the default for the TCG case.
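
For reference, a minimal standalone sketch of what such a generic accessor
could look like. This is not the reworked patch: the accessor name, the
fake_* stand-ins for kvm_enabled() and kvm_get_radix_page_info(), and the
default table values are all assumptions made so the example compiles on its
own, and it assumes the switch is on whether KVM is in use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PPC_PAGE_SIZES_MAX_SZ 8

struct ppc_radix_page_info {
    uint32_t count;
    uint32_t entries[PPC_PAGE_SIZES_MAX_SZ];
};

/* Stand-ins (assumptions) for kvm_enabled() and kvm_get_radix_page_info(). */
static bool fake_kvm_enabled;
static struct ppc_radix_page_info *fake_kvm_info;

/* Illustrative TCG default; the AP encodings here are placeholders. */
static const struct ppc_radix_page_info tcg_default_radix_page_info = {
    .count = 4,
    .entries = { 0x0000000c, 0xa0000010, 0x20000015, 0x4000001e },
};

/* Hypothetical accessor: host-provided info under KVM, default otherwise. */
static const struct ppc_radix_page_info *radix_page_info_for_accel(void)
{
    if (fake_kvm_enabled) {
        /* May be NULL when the host has no radix support. */
        return fake_kvm_info;
    }
    assert(tcg_default_radix_page_info.count <= PPC_PAGE_SIZES_MAX_SZ);
    return &tcg_default_radix_page_info;
}
```

Callers would still need to handle a NULL result in the KVM case, since the
quoted kvm_get_radix_page_info() returns NULL when the capability is missing.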

> 
> > 
> >      /* Reason: kvmppc_host_cpu_initfn() dies when !kvm_enabled()
> > */
> >      dc->cannot_destroy_with_object_finalize_yet = true;
> >  }

* Re: [Qemu-devel] [RFC PATCH v2 08/12] spapr: Only setup HTP if necessary.
  2017-02-28  2:25     ` Suraj Jitindar Singh
@ 2017-02-28  3:19       ` David Gibson
  2017-03-01  5:17         ` Suraj Jitindar Singh
  0 siblings, 1 reply; 28+ messages in thread
From: David Gibson @ 2017-02-28  3:19 UTC (permalink / raw)
  To: Suraj Jitindar Singh; +Cc: Sam Bobroff, qemu-ppc, qemu-devel

On Tue, Feb 28, 2017 at 01:25:17PM +1100, Suraj Jitindar Singh wrote:
> On Tue, 2017-02-28 at 11:28 +1100, David Gibson wrote:
> > s/HTP/HPT/ in subject line.
> > 
> > 
> > On Thu, Feb 23, 2017 at 05:00:01PM +1100, Sam Bobroff wrote:
> > > 
> > > If QEMU is using KVM, and KVM is capable of running in radix mode,
> > > guests can be run in real-mode without allocating a HPT (because
> > > KVM
> > > will use a minimal RPT). So in this case, we avoid creating the HPT
> > > at reset time and later (during CAS) create it if it is necessary.
> > > 
> > > Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
> > So, IIRC, we discussed previously that the logical way to do things
> > was to, by default, delay HPT allocation until CAS time, and just do
> > it at reset time for the case that needs it: hash host with KVM.
> > 
> > Did you hit a problem with that approach, or is there still work to
> > be
> > done here?
> 
> So what we're doing is assuming radix. Allocate hpt if hash host,
> otherwise delay til CAS time and allocate only if guest chose hash.
> 
> > 
> > > 
> > > ---
> > > v2:
> > > 
> > > * This patch has been mostly rewritten to move the late HPT
> > > allocation to CAS.
> > > This allows a guest to start in radix mode (when it's in real mode)
> > > and then
> > > change to hash, even if it is a legacy guest and will not call
> > > h_register_process_table().
> > > * Added an exported function to spapr.c to perform HPT allocation
> > > and adjust
> > > the vrma if necessary. This makes it possible to allocate the HPT
> > > from
> > > h_client_architecture_support() in spapr_hcall.c.
> > > 
> > >  hw/ppc/spapr.c         | 24 +++++++++++++++---------
> > >  hw/ppc/spapr_hcall.c   | 10 ++++++++++
> > >  include/hw/ppc/spapr.h |  1 +
> > >  3 files changed, 26 insertions(+), 9 deletions(-)
> > > 
> > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > > index ca3812555f..dfee0f685f 100644
> > > --- a/hw/ppc/spapr.c
> > > +++ b/hw/ppc/spapr.c
> > > @@ -1123,6 +1123,17 @@ static void
> > > spapr_reallocate_hpt(sPAPRMachineState *spapr, int shift,
> > >      }
> > >  }
> > >  
> > > +void spapr_setup_hpt_and_vrma(sPAPRMachineState *spapr)
> > > +{
> > > +    spapr_reallocate_hpt(spapr,
> > > +                     spapr_hpt_shift_for_ramsize(MACHINE(qdev_get_
> > > machine())->maxram_size),
> > > +                     &error_fatal);
> > > +    if (spapr->vrma_adjust) {
> > > +        spapr->rma_size = kvmppc_rma_size(spapr_node0_size(),
> > > +                                          spapr->htab_shift);
> > > +    }
> > > +}
> > > +
> > >  static void find_unknown_sysbus_device(SysBusDevice *sbdev, void
> > > *opaque)
> > >  {
> > >      bool matched = false;
> > > @@ -1151,15 +1162,10 @@ static void ppc_spapr_reset(void)
> > >      /* Check for unknown sysbus devices */
> > >      foreach_dynamic_sysbus_device(find_unknown_sysbus_device,
> > > NULL);
> > >  
> > > -    /* Allocate and/or reset the hash page table */
> > > -    spapr_reallocate_hpt(spapr,
> > > -                         spapr_hpt_shift_for_ramsize(machine-
> > > >maxram_size),
> > > -                         &error_fatal);
> > > -
> > > -    /* Update the RMA size if necessary */
> > > -    if (spapr->vrma_adjust) {
> > > -        spapr->rma_size = kvmppc_rma_size(spapr_node0_size(),
> > > -                                          spapr->htab_shift);
> > > +    /* If using KVM with radix mode available, VCPUs can be
> > > started
> > > +     * without a HPT because KVM will start them in radix mode. */
> > > +    if (!(kvm_enabled() && kvmppc_has_cap_mmu_radix())) {
> > > +        spapr_setup_hpt_and_vrma(spapr);
> > >      }
> > >  
> > >      qemu_devices_reset();
> > > diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
> > > index 42d20e0b92..cea34073aa 100644
> > > --- a/hw/ppc/spapr_hcall.c
> > > +++ b/hw/ppc/spapr_hcall.c
> > > @@ -1002,6 +1002,16 @@ static target_ulong
> > > h_client_architecture_support(PowerPCCPU *cpu,
> > >      ov5_updates = spapr_ovec_new();
> > >      spapr->cas_reboot = spapr_ovec_diff(ov5_updates,
> > >                                          ov5_cas_old, spapr-
> > > >ov5_cas);
> > > +    if (kvm_enabled()) {
> > > +        if (kvmppc_has_cap_mmu_radix()) {
> > > +            /* If the HPT hasn't yet been set up (see
> > > +             * ppc_spapr_reset()), and it's needed, do it now: */
> > > I think it's a bit fragile to have it here explicitly mirror the logic
> > which determines whether the HPT is allocated early.  I'd prefer to
> > explicitly test here whether we have allocated an HPT - adding a
> > flag,
> > if we have to.
> 
> We can use the MSB of patb_entry as that flag.

Uh.. only for POWER9..

> patb_entry & GUEST_RADIX == GUEST_RADIX -> radix, so assume a hpt
> hasn't been allocated.
> 
> When we do allocate a hpt we know we're not radix, so set
> patb_entry &= ~GUEST_RADIX;
> 
> Where GUEST_RADIX is the msb in patb_entry which indicates that a guest
> is radix.
> 
> Essentially patb_entry & GUEST_RADIX cleared mean hash with hpt
> allocated, patb_entry & GUEST_RADIX set means radix so assume an hpt
> hasn't been allocated. On the hpt allocation path we clear GUEST_RADIX
> in patb_entry and when we set GUEST_RADIX we free the hpt.
> 
> > 
> > > 
> > > +            if (!spapr_ovec_test(ov5_updates, OV5_MMU_RADIX)) {
> > > +                /* legacy hash or new hash: */
> > > +                spapr_setup_hpt_and_vrma(spapr);
> > > +            }
> > > +        }
> > > +    }
> > >  
> > >      if (!spapr->cas_reboot) {
> > >          spapr->cas_reboot =
> > > diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> > > index f9b17d860a..a30cbc485c 100644
> > > --- a/include/hw/ppc/spapr.h
> > > +++ b/include/hw/ppc/spapr.h
> > > @@ -590,6 +590,7 @@ void spapr_dt_events(sPAPRMachineState *sm,
> > > void *fdt);
> > >  int spapr_h_cas_compose_response(sPAPRMachineState *sm,
> > >                                   target_ulong addr, target_ulong
> > > size,
> > >                                   sPAPROptionVector *ov5_updates);
> > > +void spapr_setup_hpt_and_vrma(sPAPRMachineState *spapr);
> > >  sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t
> > > liobn);
> > >  void spapr_tce_table_enable(sPAPRTCETable *tcet,
> > >                              uint32_t page_shift, uint64_t
> > > bus_offset,
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

* Re: [Qemu-devel] [RFC PATCH v2 08/12] spapr: Only setup HTP if necessary.
  2017-02-28  3:19       ` David Gibson
@ 2017-03-01  5:17         ` Suraj Jitindar Singh
  2017-03-03  5:04           ` David Gibson
  0 siblings, 1 reply; 28+ messages in thread
From: Suraj Jitindar Singh @ 2017-03-01  5:17 UTC (permalink / raw)
  To: David Gibson; +Cc: Sam Bobroff, qemu-ppc, qemu-devel

On Tue, 2017-02-28 at 14:19 +1100, David Gibson wrote:
> On Tue, Feb 28, 2017 at 01:25:17PM +1100, Suraj Jitindar Singh wrote:
> > 
> > On Tue, 2017-02-28 at 11:28 +1100, David Gibson wrote:
> > > 
> > > s/HTP/HPT/ in subject line.
> > > 
> > > 
> > > On Thu, Feb 23, 2017 at 05:00:01PM +1100, Sam Bobroff wrote:
> > > > 
> > > > 
> > > > If QEMU is using KVM, and KVM is capable of running in radix
> > > > mode,
> > > > guests can be run in real-mode without allocating a HPT
> > > > (because
> > > > KVM
> > > > will use a minimal RPT). So in this case, we avoid creating the
> > > > HPT
> > > > at reset time and later (during CAS) create it if it is
> > > > necessary.
> > > > 
> > > > Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
> > > So, IIRC, we discussed previously that the logical way to do
> > > things
> > > was to, by default, delay HPT allocation until CAS time, and just
> > > do
> > > it at reset time for the case that needs it: hash host with KVM.
> > > 
> > > Did you hit a problem with that approach, or is there still work
> > > to
> > > be
> > > done here?
> > So what we're doing is assuming radix. Allocate hpt if hash host,
> > otherwise delay til CAS time and allocate only if guest chose hash.
> > 
> > > 
> > > 
> > > > 
> > > > 
> > > > ---
> > > > v2:
> > > > 
> > > > * This patch has been mostly rewritten to move the late HPT
> > > > allocation to CAS.
> > > > This allows a guest to start in radix mode (when it's in real
> > > > mode)
> > > > and then
> > > > change to hash, even if it is a legacy guest and will not call
> > > > h_register_process_table().
> > > > * Added an exported function to spapr.c to perform HPT
> > > > allocation
> > > > and adjust
> > > > the vrma if necessary. This makes it possible to allocate the
> > > > HPT
> > > > from
> > > > h_client_architecture_support() in spapr_hcall.c.
> > > > 
> > > >  hw/ppc/spapr.c         | 24 +++++++++++++++---------
> > > >  hw/ppc/spapr_hcall.c   | 10 ++++++++++
> > > >  include/hw/ppc/spapr.h |  1 +
> > > >  3 files changed, 26 insertions(+), 9 deletions(-)
> > > > 
> > > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > > > index ca3812555f..dfee0f685f 100644
> > > > --- a/hw/ppc/spapr.c
> > > > +++ b/hw/ppc/spapr.c
> > > > @@ -1123,6 +1123,17 @@ static void
> > > > spapr_reallocate_hpt(sPAPRMachineState *spapr, int shift,
> > > >      }
> > > >  }
> > > >  
> > > > +void spapr_setup_hpt_and_vrma(sPAPRMachineState *spapr)
> > > > +{
> > > > +    spapr_reallocate_hpt(spapr,
> > > > +                     spapr_hpt_shift_for_ramsize(MACHINE(qdev_
> > > > get_
> > > > machine())->maxram_size),
> > > > +                     &error_fatal);
> > > > +    if (spapr->vrma_adjust) {
> > > > +        spapr->rma_size = kvmppc_rma_size(spapr_node0_size(),
> > > > +                                          spapr->htab_shift);
> > > > +    }
> > > > +}
> > > > +
> > > >  static void find_unknown_sysbus_device(SysBusDevice *sbdev,
> > > > void
> > > > *opaque)
> > > >  {
> > > >      bool matched = false;
> > > > @@ -1151,15 +1162,10 @@ static void ppc_spapr_reset(void)
> > > >      /* Check for unknown sysbus devices */
> > > >      foreach_dynamic_sysbus_device(find_unknown_sysbus_device,
> > > > NULL);
> > > >  
> > > > -    /* Allocate and/or reset the hash page table */
> > > > -    spapr_reallocate_hpt(spapr,
> > > > -                         spapr_hpt_shift_for_ramsize(machine-
> > > > > 
> > > > > maxram_size),
> > > > -                         &error_fatal);
> > > > -
> > > > -    /* Update the RMA size if necessary */
> > > > -    if (spapr->vrma_adjust) {
> > > > -        spapr->rma_size = kvmppc_rma_size(spapr_node0_size(),
> > > > -                                          spapr->htab_shift);
> > > > +    /* If using KVM with radix mode available, VCPUs can be
> > > > started
> > > > +     * without a HPT because KVM will start them in radix
> > > > mode. */
> > > > +    if (!(kvm_enabled() && kvmppc_has_cap_mmu_radix())) {
> > > > +        spapr_setup_hpt_and_vrma(spapr);
> > > >      }
> > > >  
> > > >      qemu_devices_reset();
> > > > diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
> > > > index 42d20e0b92..cea34073aa 100644
> > > > --- a/hw/ppc/spapr_hcall.c
> > > > +++ b/hw/ppc/spapr_hcall.c
> > > > @@ -1002,6 +1002,16 @@ static target_ulong
> > > > h_client_architecture_support(PowerPCCPU *cpu,
> > > >      ov5_updates = spapr_ovec_new();
> > > >      spapr->cas_reboot = spapr_ovec_diff(ov5_updates,
> > > >                                          ov5_cas_old, spapr-
> > > > > 
> > > > > ov5_cas);
> > > > +    if (kvm_enabled()) {
> > > > +        if (kvmppc_has_cap_mmu_radix()) {
> > > > +            /* If the HPT hasn't yet been set up (see
> > > > +             * ppc_spapr_reset()), and it's needed, do it now:
> > > > */
> > > > I think it's a bit fragile to have it here explicitly mirror the logic
> > > which determines whether the HPT is allocated early.  I'd prefer
> > > to
> > > explicitly test here whether we have allocated an HPT - adding a
> > > flag,
> > > if we have to.
> > We can use the MSB of patb_entry as that flag.
> Uh.. only for POWER9..

Well, on pre-POWER9 CPUs patb_entry will always be zero, which we're taking
to mean an HPT has been allocated.

We could always just check spapr->htab == NULL...
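
To make the patb_entry idea concrete, here is a rough standalone sketch. The
struct, the helper names and the GUEST_RADIX definition are assumptions for
illustration only (the MSB is modelled as bit 63 of a 64-bit value); none of
this is taken from the patch itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* MSB of patb_entry, using the name from this discussion (assumed value). */
#define GUEST_RADIX (1ULL << 63)

struct machine_sketch {
    uint64_t patb_entry;  /* stays 0 on pre-POWER9 CPUs */
    void *hpt;            /* stand-in for the HPT storage */
};

/* Bit clear: hash guest, so an HPT is assumed to have been allocated. */
static bool hpt_assumed_allocated(const struct machine_sketch *m)
{
    return (m->patb_entry & GUEST_RADIX) == 0;
}

/* HPT allocation path: we know we're not radix, so clear the bit. */
static void sketch_allocate_hpt(struct machine_sketch *m, size_t size)
{
    free(m->hpt);
    m->hpt = calloc(1, size);
    assert(m->hpt != NULL);
    m->patb_entry &= ~GUEST_RADIX;
}

/* Switching to radix: set the bit and free the HPT. */
static void sketch_set_radix(struct machine_sketch *m)
{
    free(m->hpt);
    m->hpt = NULL;
    m->patb_entry |= GUEST_RADIX;
}
```

Note that in C the test needs the parentheses shown in
hpt_assumed_allocated(), because == binds more tightly than &.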

> 
> > 
> > patb_entry & GUEST_RADIX == GUEST_RADIX -> radix, so assume a hpt
> > hasn't been allocated.
> > 
> > When we do allocate a hpt we know we're not radix, so set
> > patb_entry &= ~GUEST_RADIX;
> > 
> > Where GUEST_RADIX is the msb in patb_entry which indicates that a
> > guest
> > is radix.
> > 
> > Essentially patb_entry & GUEST_RADIX cleared mean hash with hpt
> > allocated, patb_entry & GUEST_RADIX set means radix so assume an
> > hpt
> > hasn't been allocated. On the hpt allocation path we clear
> > GUEST_RADIX
> > in patb_entry and when we set GUEST_RADIX we free the hpt.
> > 
> > > 
> > > 
> > > > 
> > > > 
> > > > +            if (!spapr_ovec_test(ov5_updates, OV5_MMU_RADIX))
> > > > {
> > > > +                /* legacy hash or new hash: */
> > > > +                spapr_setup_hpt_and_vrma(spapr);
> > > > +            }
> > > > +        }
> > > > +    }
> > > >  
> > > >      if (!spapr->cas_reboot) {
> > > >          spapr->cas_reboot =
> > > > diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> > > > index f9b17d860a..a30cbc485c 100644
> > > > --- a/include/hw/ppc/spapr.h
> > > > +++ b/include/hw/ppc/spapr.h
> > > > @@ -590,6 +590,7 @@ void spapr_dt_events(sPAPRMachineState *sm,
> > > > void *fdt);
> > > >  int spapr_h_cas_compose_response(sPAPRMachineState *sm,
> > > >                                   target_ulong addr,
> > > > target_ulong
> > > > size,
> > > >                                   sPAPROptionVector
> > > > *ov5_updates);
> > > > +void spapr_setup_hpt_and_vrma(sPAPRMachineState *spapr);
> > > >  sPAPRTCETable *spapr_tce_new_table(DeviceState *owner,
> > > > uint32_t
> > > > liobn);
> > > >  void spapr_tce_table_enable(sPAPRTCETable *tcet,
> > > >                              uint32_t page_shift, uint64_t
> > > > bus_offset,

* Re: [Qemu-devel] [RFC PATCH v2 08/12] spapr: Only setup HTP if necessary.
  2017-03-01  5:17         ` Suraj Jitindar Singh
@ 2017-03-03  5:04           ` David Gibson
  0 siblings, 0 replies; 28+ messages in thread
From: David Gibson @ 2017-03-03  5:04 UTC (permalink / raw)
  To: Suraj Jitindar Singh; +Cc: Sam Bobroff, qemu-ppc, qemu-devel

On Wed, Mar 01, 2017 at 04:17:43PM +1100, Suraj Jitindar Singh wrote:
> On Tue, 2017-02-28 at 14:19 +1100, David Gibson wrote:
> > On Tue, Feb 28, 2017 at 01:25:17PM +1100, Suraj Jitindar Singh wrote:
> > > 
> > > On Tue, 2017-02-28 at 11:28 +1100, David Gibson wrote:
> > > > 
> > > > s/HTP/HPT/ in subject line.
> > > > 
> > > > 
> > > > On Thu, Feb 23, 2017 at 05:00:01PM +1100, Sam Bobroff wrote:
> > > > > 
> > > > > 
> > > > > If QEMU is using KVM, and KVM is capable of running in radix
> > > > > mode,
> > > > > guests can be run in real-mode without allocating a HPT
> > > > > (because
> > > > > KVM
> > > > > will use a minimal RPT). So in this case, we avoid creating the
> > > > > HPT
> > > > > at reset time and later (during CAS) create it if it is
> > > > > necessary.
> > > > > 
> > > > > Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
> > > > So, IIRC, we discussed previously that the logical way to do
> > > > things
> > > > was to, by default, delay HPT allocation until CAS time, and just
> > > > do
> > > > it at reset time for the case that needs it: hash host with KVM.
> > > > 
> > > > Did you hit a problem with that approach, or is there still work
> > > > to
> > > > be
> > > > done here?
> > > So what we're doing is assuming radix. Allocate hpt if hash host,
> > > otherwise delay til CAS time and allocate only if guest chose hash.
> > > 
> > > > 
> > > > 
> > > > > 
> > > > > 
> > > > > ---
> > > > > v2:
> > > > > 
> > > > > * This patch has been mostly rewritten to move the late HPT
> > > > > allocation to CAS.
> > > > > This allows a guest to start in radix mode (when it's in real
> > > > > mode)
> > > > > and then
> > > > > change to hash, even if it is a legacy guest and will not call
> > > > > h_register_process_table().
> > > > > * Added an exported function to spapr.c to perform HPT
> > > > > allocation
> > > > > and adjust
> > > > > the vrma if necessary. This makes it possible to allocate the
> > > > > HPT
> > > > > from
> > > > > h_client_architecture_support() in spapr_hcall.c.
> > > > > 
> > > > >  hw/ppc/spapr.c         | 24 +++++++++++++++---------
> > > > >  hw/ppc/spapr_hcall.c   | 10 ++++++++++
> > > > >  include/hw/ppc/spapr.h |  1 +
> > > > >  3 files changed, 26 insertions(+), 9 deletions(-)
> > > > > 
> > > > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > > > > index ca3812555f..dfee0f685f 100644
> > > > > --- a/hw/ppc/spapr.c
> > > > > +++ b/hw/ppc/spapr.c
> > > > > @@ -1123,6 +1123,17 @@ static void
> > > > > spapr_reallocate_hpt(sPAPRMachineState *spapr, int shift,
> > > > >      }
> > > > >  }
> > > > >  
> > > > > +void spapr_setup_hpt_and_vrma(sPAPRMachineState *spapr)
> > > > > +{
> > > > > +    spapr_reallocate_hpt(spapr,
> > > > > +                     spapr_hpt_shift_for_ramsize(MACHINE(qdev_
> > > > > get_
> > > > > machine())->maxram_size),
> > > > > +                     &error_fatal);
> > > > > +    if (spapr->vrma_adjust) {
> > > > > +        spapr->rma_size = kvmppc_rma_size(spapr_node0_size(),
> > > > > +                                          spapr->htab_shift);
> > > > > +    }
> > > > > +}
> > > > > +
> > > > >  static void find_unknown_sysbus_device(SysBusDevice *sbdev,
> > > > > void
> > > > > *opaque)
> > > > >  {
> > > > >      bool matched = false;
> > > > > @@ -1151,15 +1162,10 @@ static void ppc_spapr_reset(void)
> > > > >      /* Check for unknown sysbus devices */
> > > > >      foreach_dynamic_sysbus_device(find_unknown_sysbus_device,
> > > > > NULL);
> > > > >  
> > > > > -    /* Allocate and/or reset the hash page table */
> > > > > -    spapr_reallocate_hpt(spapr,
> > > > > -                         spapr_hpt_shift_for_ramsize(machine-
> > > > > > 
> > > > > > maxram_size),
> > > > > -                         &error_fatal);
> > > > > -
> > > > > -    /* Update the RMA size if necessary */
> > > > > -    if (spapr->vrma_adjust) {
> > > > > -        spapr->rma_size = kvmppc_rma_size(spapr_node0_size(),
> > > > > -                                          spapr->htab_shift);
> > > > > +    /* If using KVM with radix mode available, VCPUs can be
> > > > > started
> > > > > +     * without a HPT because KVM will start them in radix
> > > > > mode. */
> > > > > +    if (!(kvm_enabled() && kvmppc_has_cap_mmu_radix())) {
> > > > > +        spapr_setup_hpt_and_vrma(spapr);
> > > > >      }
> > > > >  
> > > > >      qemu_devices_reset();
> > > > > diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
> > > > > index 42d20e0b92..cea34073aa 100644
> > > > > --- a/hw/ppc/spapr_hcall.c
> > > > > +++ b/hw/ppc/spapr_hcall.c
> > > > > @@ -1002,6 +1002,16 @@ static target_ulong
> > > > > h_client_architecture_support(PowerPCCPU *cpu,
> > > > >      ov5_updates = spapr_ovec_new();
> > > > >      spapr->cas_reboot = spapr_ovec_diff(ov5_updates,
> > > > >                                          ov5_cas_old, spapr-
> > > > > > 
> > > > > > ov5_cas);
> > > > > +    if (kvm_enabled()) {
> > > > > +        if (kvmppc_has_cap_mmu_radix()) {
> > > > > +            /* If the HPT hasn't yet been set up (see
> > > > > +             * ppc_spapr_reset()), and it's needed, do it now:
> > > > > */
> > > > > I think it's a bit fragile to have it here explicitly mirror the logic
> > > > which determines whether the HPT is allocated early.  I'd prefer
> > > > to
> > > > explicitly test here whether we have allocated an HPT - adding a
> > > > flag,
> > > > if we have to.
> > > We can use the MSB of patb_entry as that flag.
> > Uh.. only for POWER9..
> 
> Well on <POWER9, patb_entry will always be zero, which we're taking to
> mean a hpt has been allocated.

Hrm, I suppose so.

> We could always just check spapr->htab == NULL...

Ah.. no.  spapr->htab will also be NULL when there is an HPT but it's
managed inside KVM.
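
A small standalone sketch of why the pointer test is ambiguous and what an
explicit flag would look like. The struct, the hpt_allocated field and both
helper names are hypothetical; the kernel_managed parameter stands in for
the case where KVM owns the table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct spapr_sketch {
    void *htab;          /* userspace table; NULL when KVM owns the HPT */
    int htab_shift;
    bool hpt_allocated;  /* hypothetical explicit flag */
};

/* Models (re)allocation: a kernel-managed HPT leaves htab NULL. */
static void sketch_reallocate_hpt(struct spapr_sketch *s, int shift,
                                  bool kernel_managed)
{
    free(s->htab);
    s->htab = NULL;
    if (!kernel_managed) {
        s->htab = calloc(1, (size_t)1 << shift);
        assert(s->htab != NULL);
    }
    s->htab_shift = shift;
    s->hpt_allocated = true;  /* set on both paths */
}

/* CAS-time check: allocate only for a hash guest with no HPT yet. */
static bool sketch_need_hpt_at_cas(const struct spapr_sketch *s,
                                   bool guest_chose_radix)
{
    return !guest_chose_radix && !s->hpt_allocated;
}
```

With a kernel-managed table, htab is NULL while hpt_allocated is true, which
is exactly the case a plain htab == NULL test would get wrong.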

> 
> > 
> > > 
> > > patb_entry & GUEST_RADIX == GUEST_RADIX -> radix, so assume a hpt
> > > hasn't been allocated.
> > > 
> > > When we do allocate a hpt we know we're not radix, so set
> > > patb_entry &= ~GUEST_RADIX;
> > > 
> > > Where GUEST_RADIX is the msb in patb_entry which indicates that a
> > > guest
> > > is radix.
> > > 
> > > Essentially patb_entry & GUEST_RADIX cleared mean hash with hpt
> > > allocated, patb_entry & GUEST_RADIX set means radix so assume an
> > > hpt
> > > hasn't been allocated. On the hpt allocation path we clear
> > > GUEST_RADIX
> > > in patb_entry and when we set GUEST_RADIX we free the hpt.
> > > 
> > > > 
> > > > 
> > > > > 
> > > > > 
> > > > > +            if (!spapr_ovec_test(ov5_updates, OV5_MMU_RADIX))
> > > > > {
> > > > > +                /* legacy hash or new hash: */
> > > > > +                spapr_setup_hpt_and_vrma(spapr);
> > > > > +            }
> > > > > +        }
> > > > > +    }
> > > > >  
> > > > >      if (!spapr->cas_reboot) {
> > > > >          spapr->cas_reboot =
> > > > > diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> > > > > index f9b17d860a..a30cbc485c 100644
> > > > > --- a/include/hw/ppc/spapr.h
> > > > > +++ b/include/hw/ppc/spapr.h
> > > > > @@ -590,6 +590,7 @@ void spapr_dt_events(sPAPRMachineState *sm,
> > > > > void *fdt);
> > > > >  int spapr_h_cas_compose_response(sPAPRMachineState *sm,
> > > > >                                   target_ulong addr,
> > > > > target_ulong
> > > > > size,
> > > > >                                   sPAPROptionVector
> > > > > *ov5_updates);
> > > > > +void spapr_setup_hpt_and_vrma(sPAPRMachineState *spapr);
> > > > >  sPAPRTCETable *spapr_tce_new_table(DeviceState *owner,
> > > > > uint32_t
> > > > > liobn);
> > > > >  void spapr_tce_table_enable(sPAPRTCETable *tcet,
> > > > >                              uint32_t page_shift, uint64_t
> > > > > bus_offset,
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

end of thread (last message 2017-03-03  5:05 UTC)

Thread overview: 28+ messages
2017-02-23  5:59 [Qemu-devel] [RFC PATCH v2 00/12] ISA 3.00 KVM guest support Sam Bobroff
2017-02-23  5:59 ` [Qemu-devel] [RFC PATCH v2 01/12] spapr: Small cleanup of PPC MMU enums Sam Bobroff
2017-02-27  6:22   ` David Gibson
2017-02-23  5:59 ` [Qemu-devel] [RFC PATCH v2 02/12] scripts/update-linux-headers.sh: refactor extra files Sam Bobroff
2017-02-27  6:24   ` David Gibson
2017-02-23  5:59 ` [Qemu-devel] [RFC PATCH v2 03/12] scripts/update-linux-headers.sh: add new files for ARM Sam Bobroff
2017-02-23  5:59 ` [Qemu-devel] [RFC PATCH v2 04/12] Move virtio_mmio.h to fix update-linux-headers.sh Sam Bobroff
2017-02-24 16:40   ` Michael S. Tsirkin
2017-02-24 16:47   ` Michael S. Tsirkin
2017-02-28  2:23     ` Sam Bobroff
2017-02-23  5:59 ` [Qemu-devel] [RFC PATCH v2 05/12] Update headers using update-linux-headers.sh Sam Bobroff
2017-02-23  5:59 ` [Qemu-devel] [RFC PATCH v2 06/12] spapr: Add ibm,processor-radix-AP-encodings to the device tree Sam Bobroff
2017-02-28  0:12   ` David Gibson
2017-02-28  2:27     ` Suraj Jitindar Singh
2017-02-23  6:00 ` [Qemu-devel] [RFC PATCH v2 07/12] target-ppc: support KVM_CAP_PPC_MMU_RADIX, KVM_CAP_PPC_MMU_HASH_V3 Sam Bobroff
2017-02-28  0:13   ` David Gibson
2017-02-23  6:00 ` [Qemu-devel] [RFC PATCH v2 08/12] spapr: Only setup HTP if necessary Sam Bobroff
2017-02-28  0:28   ` David Gibson
2017-02-28  2:25     ` Suraj Jitindar Singh
2017-02-28  3:19       ` David Gibson
2017-03-01  5:17         ` Suraj Jitindar Singh
2017-03-03  5:04           ` David Gibson
2017-02-23  6:00 ` [Qemu-devel] [RFC PATCH v2 09/12] spapr: Add h_register_process_table() hypercall Sam Bobroff
2017-02-23  6:00 ` [Qemu-devel] [RFC PATCH v2 10/12] spapr: move spapr_populate_pa_features() Sam Bobroff
2017-02-28  0:29   ` David Gibson
2017-02-23  6:00 ` [Qemu-devel] [RFC PATCH v2 11/12] spapr: Enable ISA 3.0 MMU mode selection via CAS Sam Bobroff
2017-02-23  6:00 ` [Qemu-devel] [RFC PATCH v2 12/12] spapr: Workaround for broken radix guests Sam Bobroff
2017-02-28  0:36   ` David Gibson