All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 00/11] pseries: migration and QOM support
@ 2013-07-15 15:11 Anthony Liguori
  2013-07-15 15:11 ` [Qemu-devel] [PATCH 01/11] target-ppc: Convert ppc cpu savevm to VMStateDescription Anthony Liguori
                   ` (11 more replies)
  0 siblings, 12 replies; 25+ messages in thread
From: Anthony Liguori @ 2013-07-15 15:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: Alexey Kardashevskiy, Paolo Bonzini, qemu-ppc, David Gibson

This series is based on Alexey's series:

  spapr: migration, pci, msi, power8

Which in turn was based on work by David Gibson.

I've removed the bits not related to migration and made the
following changes:

 1) QOMify TCE tables and XICS

 2) Do everything in terms of VMStateDescriptions

 3) Fix endianness problem with TCE table translation
    a) Drop the VMSTATE_DIVIDE thing in the process

I've tested this with a TCG pseries guest on an x86_64 host.

Regards,

Anthony Liguori

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH 01/11] target-ppc: Convert ppc cpu savevm to VMStateDescription
  2013-07-15 15:11 [Qemu-devel] [PATCH 00/11] pseries: migration and QOM support Anthony Liguori
@ 2013-07-15 15:11 ` Anthony Liguori
  2013-07-15 15:11 ` [Qemu-devel] [PATCH 02/11] pseries: savevm support for VIO devices Anthony Liguori
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 25+ messages in thread
From: Anthony Liguori @ 2013-07-15 15:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: Alexey Kardashevskiy, Paolo Bonzini, qemu-ppc, David Gibson

From: Alexey Kardashevskiy <aik@ozlabs.ru>

The savevm code for the powerpc cpu emulation is currently based around
the old register_savevm() rather than register_vmstate() method.  It's also
rather broken, missing some important state on some CPU models.

This patch completely rewrites the savevm for target-ppc, using the new
VMStateDescription approach.  Exactly what needs to be saved in what
configurations has been more carefully examined, too.  This introduces a
new version (5) of the cpu save format.  The old load function is retained
to support version 4 images.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
[aik: ppc cpu savevm convertion fixed to use PowerPCCPU instead of CPUPPCState]
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
 target-ppc/cpu-qom.h        |   4 +
 target-ppc/cpu.h            |   8 +-
 target-ppc/machine.c        | 531 ++++++++++++++++++++++++++++++++++++--------
 target-ppc/translate_init.c |   2 +
 4 files changed, 452 insertions(+), 93 deletions(-)

diff --git a/target-ppc/cpu-qom.h b/target-ppc/cpu-qom.h
index 7132599..c660e3c 100644
--- a/target-ppc/cpu-qom.h
+++ b/target-ppc/cpu-qom.h
@@ -106,4 +106,8 @@ void ppc_cpu_dump_state(CPUState *cpu, FILE *f, fprintf_function cpu_fprintf,
 void ppc_cpu_dump_statistics(CPUState *cpu, FILE *f,
                              fprintf_function cpu_fprintf, int flags);
 
+#ifndef CONFIG_USER_ONLY
+extern const struct VMStateDescription vmstate_ppc_cpu;
+#endif
+
 #endif
diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index 7a7b1bf..454ea13 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -948,7 +948,7 @@ struct CPUPPCState {
 #if defined(TARGET_PPC64)
     /* PowerPC 64 SLB area */
     ppc_slb_t slb[64];
-    int slb_nr;
+    int32_t slb_nr;
 #endif
     /* segment registers */
     hwaddr htab_base;
@@ -957,11 +957,11 @@ struct CPUPPCState {
     /* externally stored hash table */
     uint8_t *external_htab;
     /* BATs */
-    int nb_BATs;
+    uint32_t nb_BATs;
     target_ulong DBAT[2][8];
     target_ulong IBAT[2][8];
     /* PowerPC TLB registers (for 4xx, e500 and 60x software driven TLBs) */
-    int nb_tlb;      /* Total number of TLB                                  */
+    int32_t nb_tlb;      /* Total number of TLB                              */
     int tlb_per_way; /* Speed-up helper: used to avoid divisions at run time */
     int nb_ways;     /* Number of ways in the TLB set                        */
     int last_way;    /* Last used way used to allocate TLB in a LRU way      */
@@ -1176,8 +1176,6 @@ static inline CPUPPCState *cpu_init(const char *cpu_model)
 #define cpu_signal_handler cpu_ppc_signal_handler
 #define cpu_list ppc_cpu_list
 
-#define CPU_SAVE_VERSION 4
-
 /* MMU modes definitions */
 #define MMU_MODE0_SUFFIX _user
 #define MMU_MODE1_SUFFIX _kernel
diff --git a/target-ppc/machine.c b/target-ppc/machine.c
index 2d10adb..12e1512 100644
--- a/target-ppc/machine.c
+++ b/target-ppc/machine.c
@@ -1,96 +1,12 @@
 #include "hw/hw.h"
 #include "hw/boards.h"
 #include "sysemu/kvm.h"
+#include "helper_regs.h"
 
-void cpu_save(QEMUFile *f, void *opaque)
+static int cpu_load_old(QEMUFile *f, void *opaque, int version_id)
 {
-    CPUPPCState *env = (CPUPPCState *)opaque;
-    unsigned int i, j;
-    uint32_t fpscr;
-    target_ulong xer;
-
-    for (i = 0; i < 32; i++)
-        qemu_put_betls(f, &env->gpr[i]);
-#if !defined(TARGET_PPC64)
-    for (i = 0; i < 32; i++)
-        qemu_put_betls(f, &env->gprh[i]);
-#endif
-    qemu_put_betls(f, &env->lr);
-    qemu_put_betls(f, &env->ctr);
-    for (i = 0; i < 8; i++)
-        qemu_put_be32s(f, &env->crf[i]);
-    xer = cpu_read_xer(env);
-    qemu_put_betls(f, &xer);
-    qemu_put_betls(f, &env->reserve_addr);
-    qemu_put_betls(f, &env->msr);
-    for (i = 0; i < 4; i++)
-        qemu_put_betls(f, &env->tgpr[i]);
-    for (i = 0; i < 32; i++) {
-        union {
-            float64 d;
-            uint64_t l;
-        } u;
-        u.d = env->fpr[i];
-        qemu_put_be64(f, u.l);
-    }
-    fpscr = env->fpscr;
-    qemu_put_be32s(f, &fpscr);
-    qemu_put_sbe32s(f, &env->access_type);
-#if defined(TARGET_PPC64)
-    qemu_put_betls(f, &env->spr[SPR_ASR]);
-    qemu_put_sbe32s(f, &env->slb_nr);
-#endif
-    qemu_put_betls(f, &env->spr[SPR_SDR1]);
-    for (i = 0; i < 32; i++)
-        qemu_put_betls(f, &env->sr[i]);
-    for (i = 0; i < 2; i++)
-        for (j = 0; j < 8; j++)
-            qemu_put_betls(f, &env->DBAT[i][j]);
-    for (i = 0; i < 2; i++)
-        for (j = 0; j < 8; j++)
-            qemu_put_betls(f, &env->IBAT[i][j]);
-    qemu_put_sbe32s(f, &env->nb_tlb);
-    qemu_put_sbe32s(f, &env->tlb_per_way);
-    qemu_put_sbe32s(f, &env->nb_ways);
-    qemu_put_sbe32s(f, &env->last_way);
-    qemu_put_sbe32s(f, &env->id_tlbs);
-    qemu_put_sbe32s(f, &env->nb_pids);
-    if (env->tlb.tlb6) {
-        // XXX assumes 6xx
-        for (i = 0; i < env->nb_tlb; i++) {
-            qemu_put_betls(f, &env->tlb.tlb6[i].pte0);
-            qemu_put_betls(f, &env->tlb.tlb6[i].pte1);
-            qemu_put_betls(f, &env->tlb.tlb6[i].EPN);
-        }
-    }
-    for (i = 0; i < 4; i++)
-        qemu_put_betls(f, &env->pb[i]);
-    for (i = 0; i < 1024; i++)
-        qemu_put_betls(f, &env->spr[i]);
-    qemu_put_be32s(f, &env->vscr);
-    qemu_put_be64s(f, &env->spe_acc);
-    qemu_put_be32s(f, &env->spe_fscr);
-    qemu_put_betls(f, &env->msr_mask);
-    qemu_put_be32s(f, &env->flags);
-    qemu_put_sbe32s(f, &env->error_code);
-    qemu_put_be32s(f, &env->pending_interrupts);
-    qemu_put_be32s(f, &env->irq_input_state);
-    for (i = 0; i < POWERPC_EXCP_NB; i++)
-        qemu_put_betls(f, &env->excp_vectors[i]);
-    qemu_put_betls(f, &env->excp_prefix);
-    qemu_put_betls(f, &env->ivor_mask);
-    qemu_put_betls(f, &env->ivpr_mask);
-    qemu_put_betls(f, &env->hreset_vector);
-    qemu_put_betls(f, &env->nip);
-    qemu_put_betls(f, &env->hflags);
-    qemu_put_betls(f, &env->hflags_nmsr);
-    qemu_put_sbe32s(f, &env->mmu_idx);
-    qemu_put_sbe32(f, 0);
-}
-
-int cpu_load(QEMUFile *f, void *opaque, int version_id)
-{
-    CPUPPCState *env = (CPUPPCState *)opaque;
+    PowerPCCPU *cpu = opaque;
+    CPUPPCState *env = &cpu->env;
     unsigned int i, j;
     target_ulong sdr1;
     uint32_t fpscr;
@@ -177,3 +93,442 @@ int cpu_load(QEMUFile *f, void *opaque, int version_id)
 
     return 0;
 }
+
+static int get_avr(QEMUFile *f, void *pv, size_t size)
+{
+    ppc_avr_t *v = pv;
+
+    v->u64[0] = qemu_get_be64(f);
+    v->u64[1] = qemu_get_be64(f);
+
+    return 0;
+}
+
+static void put_avr(QEMUFile *f, void *pv, size_t size)
+{
+    ppc_avr_t *v = pv;
+
+    qemu_put_be64(f, v->u64[0]);
+    qemu_put_be64(f, v->u64[1]);
+}
+
+const VMStateInfo vmstate_info_avr = {
+    .name = "avr",
+    .get  = get_avr,
+    .put  = put_avr,
+};
+
+#define VMSTATE_AVR_ARRAY_V(_f, _s, _n, _v)                       \
+    VMSTATE_ARRAY(_f, _s, _n, _v, vmstate_info_avr, ppc_avr_t)
+
+#define VMSTATE_AVR_ARRAY(_f, _s, _n)                             \
+    VMSTATE_AVR_ARRAY_V(_f, _s, _n, 0)
+
+static void cpu_pre_save(void *opaque)
+{
+    PowerPCCPU *cpu = opaque;
+    CPUPPCState *env = &cpu->env;
+    int i;
+
+    env->spr[SPR_LR] = env->lr;
+    env->spr[SPR_CTR] = env->ctr;
+    env->spr[SPR_XER] = env->xer;
+#if defined(TARGET_PPC64)
+    env->spr[SPR_CFAR] = env->cfar;
+#endif
+    env->spr[SPR_BOOKE_SPEFSCR] = env->spe_fscr;
+
+    for (i = 0; (i < 4) && (i < env->nb_BATs); i++) {
+        env->spr[SPR_DBAT0U + 2*i] = env->DBAT[0][i];
+        env->spr[SPR_DBAT0U + 2*i + 1] = env->DBAT[1][i];
+        env->spr[SPR_IBAT0U + 2*i] = env->IBAT[0][i];
+        env->spr[SPR_IBAT0U + 2*i + 1] = env->IBAT[1][i];
+    }
+    for (i = 0; (i < 4) && ((i+4) < env->nb_BATs); i++) {
+        env->spr[SPR_DBAT4U + 2*i] = env->DBAT[0][i+4];
+        env->spr[SPR_DBAT4U + 2*i + 1] = env->DBAT[1][i+4];
+        env->spr[SPR_IBAT4U + 2*i] = env->IBAT[0][i+4];
+        env->spr[SPR_IBAT4U + 2*i + 1] = env->IBAT[1][i+4];
+    }
+}
+
+static int cpu_post_load(void *opaque, int version_id)
+{
+    PowerPCCPU *cpu = opaque;
+    CPUPPCState *env = &cpu->env;
+    int i;
+
+    env->lr = env->spr[SPR_LR];
+    env->ctr = env->spr[SPR_CTR];
+    env->xer = env->spr[SPR_XER];
+#if defined(TARGET_PPC64)
+    env->cfar = env->spr[SPR_CFAR];
+#endif
+    env->spe_fscr = env->spr[SPR_BOOKE_SPEFSCR];
+
+    for (i = 0; (i < 4) && (i < env->nb_BATs); i++) {
+        env->DBAT[0][i] = env->spr[SPR_DBAT0U + 2*i];
+        env->DBAT[1][i] = env->spr[SPR_DBAT0U + 2*i + 1];
+        env->IBAT[0][i] = env->spr[SPR_IBAT0U + 2*i];
+        env->IBAT[1][i] = env->spr[SPR_IBAT0U + 2*i + 1];
+    }
+    for (i = 0; (i < 4) && ((i+4) < env->nb_BATs); i++) {
+        env->DBAT[0][i+4] = env->spr[SPR_DBAT4U + 2*i];
+        env->DBAT[1][i+4] = env->spr[SPR_DBAT4U + 2*i + 1];
+        env->IBAT[0][i+4] = env->spr[SPR_IBAT4U + 2*i];
+        env->IBAT[1][i+4] = env->spr[SPR_IBAT4U + 2*i + 1];
+    }
+
+    /* Restore htab_base and htab_mask variables */
+    ppc_store_sdr1(env, env->spr[SPR_SDR1]);
+
+    hreg_compute_hflags(env);
+    hreg_compute_mem_idx(env);
+
+    return 0;
+}
+
+static bool fpu_needed(void *opaque)
+{
+    PowerPCCPU *cpu = opaque;
+
+    return (cpu->env.insns_flags & PPC_FLOAT);
+}
+
+static const VMStateDescription vmstate_fpu = {
+    .name = "cpu/fpu",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .minimum_version_id_old = 1,
+    .fields      = (VMStateField []) {
+        VMSTATE_FLOAT64_ARRAY(env.fpr, PowerPCCPU, 32),
+        VMSTATE_UINTTL(env.fpscr, PowerPCCPU),
+        VMSTATE_END_OF_LIST()
+    },
+};
+
+static bool altivec_needed(void *opaque)
+{
+    PowerPCCPU *cpu = opaque;
+
+    return (cpu->env.insns_flags & PPC_ALTIVEC);
+}
+
+static const VMStateDescription vmstate_altivec = {
+    .name = "cpu/altivec",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .minimum_version_id_old = 1,
+    .fields      = (VMStateField []) {
+        VMSTATE_AVR_ARRAY(env.avr, PowerPCCPU, 32),
+        VMSTATE_UINT32(env.vscr, PowerPCCPU),
+        VMSTATE_END_OF_LIST()
+    },
+};
+
+static bool vsx_needed(void *opaque)
+{
+    PowerPCCPU *cpu = opaque;
+
+    return (cpu->env.insns_flags2 & PPC2_VSX);
+}
+
+static const VMStateDescription vmstate_vsx = {
+    .name = "cpu/vsx",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .minimum_version_id_old = 1,
+    .fields      = (VMStateField []) {
+        VMSTATE_UINT64_ARRAY(env.vsr, PowerPCCPU, 32),
+        VMSTATE_END_OF_LIST()
+    },
+};
+
+static bool sr_needed(void *opaque)
+{
+#ifdef TARGET_PPC64
+    PowerPCCPU *cpu = opaque;
+
+    return !(cpu->env.mmu_model & POWERPC_MMU_64);
+#else
+    return true;
+#endif
+}
+
+static const VMStateDescription vmstate_sr = {
+    .name = "cpu/sr",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .minimum_version_id_old = 1,
+    .fields      = (VMStateField []) {
+        VMSTATE_UINTTL_ARRAY(env.sr, PowerPCCPU, 32),
+        VMSTATE_END_OF_LIST()
+    },
+};
+
+#ifdef TARGET_PPC64
+static int get_slbe(QEMUFile *f, void *pv, size_t size)
+{
+    ppc_slb_t *v = pv;
+
+    v->esid = qemu_get_be64(f);
+    v->vsid = qemu_get_be64(f);
+
+    return 0;
+}
+
+static void put_slbe(QEMUFile *f, void *pv, size_t size)
+{
+    ppc_slb_t *v = pv;
+
+    qemu_put_be64(f, v->esid);
+    qemu_put_be64(f, v->vsid);
+}
+
+const VMStateInfo vmstate_info_slbe = {
+    .name = "slbe",
+    .get  = get_slbe,
+    .put  = put_slbe,
+};
+
+#define VMSTATE_SLB_ARRAY_V(_f, _s, _n, _v)                       \
+    VMSTATE_ARRAY(_f, _s, _n, _v, vmstate_info_slbe, ppc_slb_t)
+
+#define VMSTATE_SLB_ARRAY(_f, _s, _n)                             \
+    VMSTATE_SLB_ARRAY_V(_f, _s, _n, 0)
+
+static bool slb_needed(void *opaque)
+{
+    PowerPCCPU *cpu = opaque;
+
+    /* We don't support any of the old segment table based 64-bit CPUs */
+    return (cpu->env.mmu_model & POWERPC_MMU_64);
+}
+
+static const VMStateDescription vmstate_slb = {
+    .name = "cpu/slb",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .minimum_version_id_old = 1,
+    .fields      = (VMStateField []) {
+        VMSTATE_INT32_EQUAL(env.slb_nr, PowerPCCPU),
+        VMSTATE_SLB_ARRAY(env.slb, PowerPCCPU, 64),
+        VMSTATE_END_OF_LIST()
+    }
+};
+#endif /* TARGET_PPC64 */
+
+static const VMStateDescription vmstate_tlb6xx_entry = {
+    .name = "cpu/tlb6xx_entry",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .minimum_version_id_old = 1,
+    .fields      = (VMStateField []) {
+        VMSTATE_UINTTL(pte0, ppc6xx_tlb_t),
+        VMSTATE_UINTTL(pte1, ppc6xx_tlb_t),
+        VMSTATE_UINTTL(EPN, ppc6xx_tlb_t),
+        VMSTATE_END_OF_LIST()
+    },
+};
+
+static bool tlb6xx_needed(void *opaque)
+{
+    PowerPCCPU *cpu = opaque;
+    CPUPPCState *env = &cpu->env;
+
+    return env->nb_tlb && (env->tlb_type == TLB_6XX);
+}
+
+static const VMStateDescription vmstate_tlb6xx = {
+    .name = "cpu/tlb6xx",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .minimum_version_id_old = 1,
+    .fields      = (VMStateField []) {
+        VMSTATE_INT32_EQUAL(env.nb_tlb, PowerPCCPU),
+        VMSTATE_STRUCT_VARRAY_POINTER_INT32(env.tlb.tlb6, PowerPCCPU,
+                                            env.nb_tlb,
+                                            vmstate_tlb6xx_entry,
+                                            ppc6xx_tlb_t),
+        VMSTATE_UINTTL_ARRAY(env.tgpr, PowerPCCPU, 4),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
+static const VMStateDescription vmstate_tlbemb_entry = {
+    .name = "cpu/tlbemb_entry",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .minimum_version_id_old = 1,
+    .fields      = (VMStateField []) {
+        VMSTATE_UINT64(RPN, ppcemb_tlb_t),
+        VMSTATE_UINTTL(EPN, ppcemb_tlb_t),
+        VMSTATE_UINTTL(PID, ppcemb_tlb_t),
+        VMSTATE_UINTTL(size, ppcemb_tlb_t),
+        VMSTATE_UINT32(prot, ppcemb_tlb_t),
+        VMSTATE_UINT32(attr, ppcemb_tlb_t),
+        VMSTATE_END_OF_LIST()
+    },
+};
+
+static bool tlbemb_needed(void *opaque)
+{
+    PowerPCCPU *cpu = opaque;
+    CPUPPCState *env = &cpu->env;
+
+    return env->nb_tlb && (env->tlb_type == TLB_EMB);
+}
+
+static bool pbr403_needed(void *opaque)
+{
+    PowerPCCPU *cpu = opaque;
+    uint32_t pvr = cpu->env.spr[SPR_PVR];
+
+    return (pvr & 0xffff0000) == 0x00200000;
+}
+
+static const VMStateDescription vmstate_pbr403 = {
+    .name = "cpu/pbr403",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .minimum_version_id_old = 1,
+    .fields      = (VMStateField []) {
+        VMSTATE_UINTTL_ARRAY(env.pb, PowerPCCPU, 4),
+        VMSTATE_END_OF_LIST()
+    },
+};
+
+static const VMStateDescription vmstate_tlbemb = {
+    .name = "cpu/tlb6xx",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .minimum_version_id_old = 1,
+    .fields      = (VMStateField []) {
+        VMSTATE_INT32_EQUAL(env.nb_tlb, PowerPCCPU),
+        VMSTATE_STRUCT_VARRAY_POINTER_INT32(env.tlb.tlbe, PowerPCCPU,
+                                            env.nb_tlb,
+                                            vmstate_tlbemb_entry,
+                                            ppcemb_tlb_t),
+        /* 403 protection registers */
+        VMSTATE_END_OF_LIST()
+    },
+    .subsections = (VMStateSubsection []) {
+        {
+            .vmsd = &vmstate_pbr403,
+            .needed = pbr403_needed,
+        } , {
+            /* empty */
+        }
+    }
+};
+
+static const VMStateDescription vmstate_tlbmas_entry = {
+    .name = "cpu/tlbmas_entry",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .minimum_version_id_old = 1,
+    .fields      = (VMStateField []) {
+        VMSTATE_UINT32(mas8, ppcmas_tlb_t),
+        VMSTATE_UINT32(mas1, ppcmas_tlb_t),
+        VMSTATE_UINT64(mas2, ppcmas_tlb_t),
+        VMSTATE_UINT64(mas7_3, ppcmas_tlb_t),
+        VMSTATE_END_OF_LIST()
+    },
+};
+
+static bool tlbmas_needed(void *opaque)
+{
+    PowerPCCPU *cpu = opaque;
+    CPUPPCState *env = &cpu->env;
+
+    return env->nb_tlb && (env->tlb_type == TLB_MAS);
+}
+
+static const VMStateDescription vmstate_tlbmas = {
+    .name = "cpu/tlbmas",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .minimum_version_id_old = 1,
+    .fields      = (VMStateField []) {
+        VMSTATE_INT32_EQUAL(env.nb_tlb, PowerPCCPU),
+        VMSTATE_STRUCT_VARRAY_POINTER_INT32(env.tlb.tlbm, PowerPCCPU,
+                                            env.nb_tlb,
+                                            vmstate_tlbmas_entry,
+                                            ppcmas_tlb_t),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
+const VMStateDescription vmstate_ppc_cpu = {
+    .name = "cpu",
+    .version_id = 5,
+    .minimum_version_id = 5,
+    .minimum_version_id_old = 4,
+    .load_state_old = cpu_load_old,
+    .pre_save = cpu_pre_save,
+    .post_load = cpu_post_load,
+    .fields      = (VMStateField []) {
+        /* Verify we haven't changed the pvr */
+        VMSTATE_UINTTL_EQUAL(env.spr[SPR_PVR], PowerPCCPU),
+
+        /* User mode architected state */
+        VMSTATE_UINTTL_ARRAY(env.gpr, PowerPCCPU, 32),
+#if !defined(TARGET_PPC64)
+        VMSTATE_UINTTL_ARRAY(env.gprh, PowerPCCPU, 32),
+#endif
+        VMSTATE_UINT32_ARRAY(env.crf, PowerPCCPU, 8),
+        VMSTATE_UINTTL(env.nip, PowerPCCPU),
+
+        /* SPRs */
+        VMSTATE_UINTTL_ARRAY(env.spr, PowerPCCPU, 1024),
+        VMSTATE_UINT64(env.spe_acc, PowerPCCPU),
+
+        /* Reservation */
+        VMSTATE_UINTTL(env.reserve_addr, PowerPCCPU),
+
+        /* Supervisor mode architected state */
+        VMSTATE_UINTTL(env.msr, PowerPCCPU),
+
+        /* Internal state */
+        VMSTATE_UINTTL(env.hflags_nmsr, PowerPCCPU),
+        /* FIXME: access_type? */
+
+        /* Sanity checking */
+        VMSTATE_UINTTL_EQUAL(env.msr_mask, PowerPCCPU),
+        VMSTATE_UINT64_EQUAL(env.insns_flags, PowerPCCPU),
+        VMSTATE_UINT64_EQUAL(env.insns_flags2, PowerPCCPU),
+        VMSTATE_UINT32_EQUAL(env.nb_BATs, PowerPCCPU),
+        VMSTATE_END_OF_LIST()
+    },
+    .subsections = (VMStateSubsection []) {
+        {
+            .vmsd = &vmstate_fpu,
+            .needed = fpu_needed,
+        } , {
+            .vmsd = &vmstate_altivec,
+            .needed = altivec_needed,
+        } , {
+            .vmsd = &vmstate_vsx,
+            .needed = vsx_needed,
+        } , {
+            .vmsd = &vmstate_sr,
+            .needed = sr_needed,
+        } , {
+#ifdef TARGET_PPC64
+            .vmsd = &vmstate_slb,
+            .needed = slb_needed,
+        } , {
+#endif /* TARGET_PPC64 */
+            .vmsd = &vmstate_tlb6xx,
+            .needed = tlb6xx_needed,
+        } , {
+            .vmsd = &vmstate_tlbemb,
+            .needed = tlbemb_needed,
+        } , {
+            .vmsd = &vmstate_tlbmas,
+            .needed = tlbmas_needed,
+        } , {
+            /* empty */
+        }
+    }
+};
diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index 79bfcd8..09ea944 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -8449,6 +8449,8 @@ static void ppc_cpu_class_init(ObjectClass *oc, void *data)
     cc->do_interrupt = ppc_cpu_do_interrupt;
     cc->dump_state = ppc_cpu_dump_state;
     cc->dump_statistics = ppc_cpu_dump_statistics;
+
+    cpu_class_set_vmsd(cc, &vmstate_ppc_cpu);
 }
 
 static const TypeInfo ppc_cpu_type_info = {
-- 
1.8.0

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH 02/11] pseries: savevm support for VIO devices
  2013-07-15 15:11 [Qemu-devel] [PATCH 00/11] pseries: migration and QOM support Anthony Liguori
  2013-07-15 15:11 ` [Qemu-devel] [PATCH 01/11] target-ppc: Convert ppc cpu savevm to VMStateDescription Anthony Liguori
@ 2013-07-15 15:11 ` Anthony Liguori
  2013-07-15 15:11 ` [Qemu-devel] [PATCH 03/11] pseries: savevm support for PAPR VIO logical lan Anthony Liguori
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 25+ messages in thread
From: Anthony Liguori @ 2013-07-15 15:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: Alexey Kardashevskiy, Paolo Bonzini, qemu-ppc, David Gibson

From: David Gibson <david@gibson.dropbear.id.au>

This patch adds helpers to allow PAPR VIO devices to save state common
to all VIO devices during savevm.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
---
 hw/ppc/spapr_vio.c         | 20 ++++++++++++++++++++
 include/hw/ppc/spapr_vio.h |  5 +++++
 2 files changed, 25 insertions(+)

diff --git a/hw/ppc/spapr_vio.c b/hw/ppc/spapr_vio.c
index 7c6f6e4..75ce19f 100644
--- a/hw/ppc/spapr_vio.c
+++ b/hw/ppc/spapr_vio.c
@@ -542,6 +542,26 @@ static const TypeInfo spapr_vio_bridge_info = {
     .class_init    = spapr_vio_bridge_class_init,
 };
 
+const VMStateDescription vmstate_spapr_vio = {
+    .name = "spapr_vio",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .minimum_version_id_old = 1,
+    .fields      = (VMStateField []) {
+        /* Sanity check */
+        VMSTATE_UINT32_EQUAL(reg, VIOsPAPRDevice),
+        VMSTATE_UINT32_EQUAL(irq, VIOsPAPRDevice),
+
+        /* General VIO device state */
+        VMSTATE_UINTTL(signal_state, VIOsPAPRDevice),
+        VMSTATE_UINT64(crq.qladdr, VIOsPAPRDevice),
+        VMSTATE_UINT32(crq.qsize, VIOsPAPRDevice),
+        VMSTATE_UINT32(crq.qnext, VIOsPAPRDevice),
+
+        VMSTATE_END_OF_LIST()
+    },
+};
+
 static void vio_spapr_device_class_init(ObjectClass *klass, void *data)
 {
     DeviceClass *k = DEVICE_CLASS(klass);
diff --git a/include/hw/ppc/spapr_vio.h b/include/hw/ppc/spapr_vio.h
index 3609327..46edc2a 100644
--- a/include/hw/ppc/spapr_vio.h
+++ b/include/hw/ppc/spapr_vio.h
@@ -134,4 +134,9 @@ VIOsPAPRDevice *spapr_vty_get_default(VIOsPAPRBus *bus);
 
 void spapr_vio_quiesce(void);
 
+extern const VMStateDescription vmstate_spapr_vio;
+
+#define VMSTATE_SPAPR_VIO(_f, _s) \
+    VMSTATE_STRUCT(_f, _s, 0, vmstate_spapr_vio, VIOsPAPRDevice)
+
 #endif /* _HW_SPAPR_VIO_H */
-- 
1.8.0

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH 03/11] pseries: savevm support for PAPR VIO logical lan
  2013-07-15 15:11 [Qemu-devel] [PATCH 00/11] pseries: migration and QOM support Anthony Liguori
  2013-07-15 15:11 ` [Qemu-devel] [PATCH 01/11] target-ppc: Convert ppc cpu savevm to VMStateDescription Anthony Liguori
  2013-07-15 15:11 ` [Qemu-devel] [PATCH 02/11] pseries: savevm support for VIO devices Anthony Liguori
@ 2013-07-15 15:11 ` Anthony Liguori
  2013-07-15 15:11 ` [Qemu-devel] [PATCH 04/11] pseries: savevm support for PAPR VIO logical tty Anthony Liguori
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 25+ messages in thread
From: Anthony Liguori @ 2013-07-15 15:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: Alexey Kardashevskiy, Paolo Bonzini, qemu-ppc, David Gibson

From: David Gibson <david@gibson.dropbear.id.au>

This patch adds the necessary VMStateDescription information to support
savevm/loadvm for the spapr_llan (PAPR logical lan) device.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
---
 hw/net/spapr_llan.c | 24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/hw/net/spapr_llan.c b/hw/net/spapr_llan.c
index 03a09f2..46f7d5f 100644
--- a/hw/net/spapr_llan.c
+++ b/hw/net/spapr_llan.c
@@ -81,9 +81,9 @@ typedef struct VIOsPAPRVLANDevice {
     VIOsPAPRDevice sdev;
     NICConf nicconf;
     NICState *nic;
-    int isopen;
+    bool isopen;
     target_ulong buf_list;
-    int add_buf_ptr, use_buf_ptr, rx_bufs;
+    uint32_t add_buf_ptr, use_buf_ptr, rx_bufs;
     target_ulong rxq_ptr;
 } VIOsPAPRVLANDevice;
 
@@ -500,6 +500,25 @@ static Property spapr_vlan_properties[] = {
     DEFINE_PROP_END_OF_LIST(),
 };
 
+static const VMStateDescription vmstate_spapr_llan = {
+    .name = "spapr_llan",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .minimum_version_id_old = 1,
+    .fields      = (VMStateField []) {
+        VMSTATE_SPAPR_VIO(sdev, VIOsPAPRVLANDevice),
+        /* LLAN state */
+        VMSTATE_BOOL(isopen, VIOsPAPRVLANDevice),
+        VMSTATE_UINTTL(buf_list, VIOsPAPRVLANDevice),
+        VMSTATE_UINT32(add_buf_ptr, VIOsPAPRVLANDevice),
+        VMSTATE_UINT32(use_buf_ptr, VIOsPAPRVLANDevice),
+        VMSTATE_UINT32(rx_bufs, VIOsPAPRVLANDevice),
+        VMSTATE_UINTTL(rxq_ptr, VIOsPAPRVLANDevice),
+
+        VMSTATE_END_OF_LIST()
+    },
+};
+
 static void spapr_vlan_class_init(ObjectClass *klass, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(klass);
@@ -514,6 +533,7 @@ static void spapr_vlan_class_init(ObjectClass *klass, void *data)
     k->signal_mask = 0x1;
     dc->props = spapr_vlan_properties;
     k->rtce_window_size = 0x10000000;
+    dc->vmsd = &vmstate_spapr_llan;
 }
 
 static const TypeInfo spapr_vlan_info = {
-- 
1.8.0

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH 04/11] pseries: savevm support for PAPR VIO logical tty
  2013-07-15 15:11 [Qemu-devel] [PATCH 00/11] pseries: migration and QOM support Anthony Liguori
                   ` (2 preceding siblings ...)
  2013-07-15 15:11 ` [Qemu-devel] [PATCH 03/11] pseries: savevm support for PAPR VIO logical lan Anthony Liguori
@ 2013-07-15 15:11 ` Anthony Liguori
  2013-07-15 15:11 ` [Qemu-devel] [PATCH 05/11] spapr-tce: make sPAPRTCETable a proper device Anthony Liguori
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 25+ messages in thread
From: Anthony Liguori @ 2013-07-15 15:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: Alexey Kardashevskiy, Paolo Bonzini, qemu-ppc, David Gibson

From: David Gibson <david@gibson.dropbear.id.au>

This patch adds the necessary VMStateDescription information to support
savevm/loadvm for the spapr_tty (PAPR logical serial) device.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
---
 hw/char/spapr_vty.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/hw/char/spapr_vty.c b/hw/char/spapr_vty.c
index 2993848..a799721 100644
--- a/hw/char/spapr_vty.c
+++ b/hw/char/spapr_vty.c
@@ -142,6 +142,21 @@ static Property spapr_vty_properties[] = {
     DEFINE_PROP_END_OF_LIST(),
 };
 
+static const VMStateDescription vmstate_spapr_vty = {
+    .name = "spapr_vty",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .minimum_version_id_old = 1,
+    .fields      = (VMStateField []) {
+        VMSTATE_SPAPR_VIO(sdev, VIOsPAPRVTYDevice),
+
+        VMSTATE_UINT32(in, VIOsPAPRVTYDevice),
+        VMSTATE_UINT32(out, VIOsPAPRVTYDevice),
+        VMSTATE_BUFFER(buf, VIOsPAPRVTYDevice),
+        VMSTATE_END_OF_LIST()
+    },
+};
+
 static void spapr_vty_class_init(ObjectClass *klass, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(klass);
@@ -152,6 +167,7 @@ static void spapr_vty_class_init(ObjectClass *klass, void *data)
     k->dt_type = "serial";
     k->dt_compatible = "hvterm1";
     dc->props = spapr_vty_properties;
+    dc->vmsd = &vmstate_spapr_vty;
 }
 
 static const TypeInfo spapr_vty_info = {
-- 
1.8.0

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH 05/11] spapr-tce: make sPAPRTCETable a proper device
  2013-07-15 15:11 [Qemu-devel] [PATCH 00/11] pseries: migration and QOM support Anthony Liguori
                   ` (3 preceding siblings ...)
  2013-07-15 15:11 ` [Qemu-devel] [PATCH 04/11] pseries: savevm support for PAPR VIO logical tty Anthony Liguori
@ 2013-07-15 15:11 ` Anthony Liguori
  2013-07-16 10:00   ` [Qemu-devel] [PATCH] ppc kvm: fix to compile Alexey Kardashevskiy
  2013-07-17  7:01   ` [Qemu-devel] [PATCH 05/11] spapr-tce: make sPAPRTCETable a proper device Alexey Kardashevskiy
  2013-07-15 15:11 ` [Qemu-devel] [PATCH 06/11] pseries: rework PAPR virtual SCSI Anthony Liguori
                   ` (6 subsequent siblings)
  11 siblings, 2 replies; 25+ messages in thread
From: Anthony Liguori @ 2013-07-15 15:11 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alexey Kardashevskiy, Paolo Bonzini, Anthony Liguori, qemu-ppc,
	David Gibson

Model TCE tables as a device that's hooked up as a child object to
the owner.  Besides the code cleanup, we get a few nice benefits:

1) free actually works now (it was dead code before)

2) the TCE information is visible in the device tree

3) we can expose table information as properties such that if we
   change the window_size, we can use globals to keep migration
   working.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
[dwg: pseries: savevm support for PAPR TCE tables]
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
---
 hw/ppc/spapr.c         |   3 -
 hw/ppc/spapr_iommu.c   | 145 ++++++++++++++++++++++++++++++++-----------------
 hw/ppc/spapr_pci.c     |   2 +-
 hw/ppc/spapr_vio.c     |   2 +-
 include/hw/ppc/spapr.h |  23 +++++---
 5 files changed, 114 insertions(+), 61 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 48ae092..e340708 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -848,9 +848,6 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
     /* Set up EPOW events infrastructure */
     spapr_events_init(spapr);
 
-    /* Set up IOMMU */
-    spapr_iommu_init();
-
     /* Set up VIO bus */
     spapr->vio_bus = spapr_vio_bus_init();
 
diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c
index 89b33a5..709cc34 100644
--- a/hw/ppc/spapr_iommu.c
+++ b/hw/ppc/spapr_iommu.c
@@ -36,17 +36,6 @@ enum sPAPRTCEAccess {
     SPAPR_TCE_RW = 3,
 };
 
-struct sPAPRTCETable {
-    uint32_t liobn;
-    uint32_t window_size;
-    sPAPRTCE *table;
-    bool bypass;
-    int fd;
-    MemoryRegion iommu;
-    QLIST_ENTRY(sPAPRTCETable) list;
-};
-
-
 QLIST_HEAD(spapr_tce_tables, sPAPRTCETable) spapr_tce_tables;
 
 static sPAPRTCETable *spapr_tce_find_by_liobn(uint32_t liobn)
@@ -96,7 +85,7 @@ static IOMMUTLBEntry spapr_tce_translate_iommu(MemoryRegion *iommu, hwaddr addr)
         return (IOMMUTLBEntry) { .perm = IOMMU_NONE };
     }
 
-    tce = tcet->table[addr >> SPAPR_TCE_PAGE_SHIFT].tce;
+    tce = tcet->table[addr >> SPAPR_TCE_PAGE_SHIFT];
 
 #ifdef DEBUG_TCE
     fprintf(stderr, " ->  *paddr=0x%llx, *len=0x%llx\n",
@@ -112,37 +101,51 @@ static IOMMUTLBEntry spapr_tce_translate_iommu(MemoryRegion *iommu, hwaddr addr)
     };
 }
 
+static int spapr_tce_table_pre_load(void *opaque)
+{
+    sPAPRTCETable *tcet = SPAPR_TCE_TABLE(opaque);
+
+    tcet->nb_table = tcet->window_size >> SPAPR_TCE_PAGE_SHIFT;
+
+    return 0;
+}
+
+static const VMStateDescription vmstate_spapr_tce_table = {
+    .name = "spapr_iommu",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .minimum_version_id_old = 1,
+    .pre_load = spapr_tce_table_pre_load,
+    .fields      = (VMStateField []) {
+        /* Sanity check */
+        VMSTATE_UINT32_EQUAL(liobn, sPAPRTCETable),
+        VMSTATE_UINT32_EQUAL(window_size, sPAPRTCETable),
+
+        /* IOMMU state */
+        VMSTATE_BOOL(bypass, sPAPRTCETable),
+        VMSTATE_VARRAY_UINT32(table, sPAPRTCETable, nb_table, 0, vmstate_info_uint64, uint64_t),
+
+        VMSTATE_END_OF_LIST()
+    },
+};
+
 static MemoryRegionIOMMUOps spapr_iommu_ops = {
     .translate = spapr_tce_translate_iommu,
 };
 
-sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn, size_t window_size)
+static int spapr_tce_table_realize(DeviceState *dev)
 {
-    sPAPRTCETable *tcet;
-
-    if (spapr_tce_find_by_liobn(liobn)) {
-        fprintf(stderr, "Attempted to create TCE table with duplicate"
-                " LIOBN 0x%x\n", liobn);
-        return NULL;
-    }
-
-    if (!window_size) {
-        return NULL;
-    }
-
-    tcet = g_malloc0(sizeof(*tcet));
-    tcet->liobn = liobn;
-    tcet->window_size = window_size;
+    sPAPRTCETable *tcet = SPAPR_TCE_TABLE(dev);
 
     if (kvm_enabled()) {
-        tcet->table = kvmppc_create_spapr_tce(liobn,
-                                              window_size,
+        tcet->table = kvmppc_create_spapr_tce(tcet->liobn,
+                                              tcet->window_size,
                                               &tcet->fd);
     }
 
     if (!tcet->table) {
-        size_t table_size = (window_size >> SPAPR_TCE_PAGE_SHIFT)
-            * sizeof(sPAPRTCE);
+        size_t table_size = (tcet->window_size >> SPAPR_TCE_PAGE_SHIFT)
+            * sizeof(uint64_t);
         tcet->table = g_malloc0(table_size);
     }
 
@@ -151,16 +154,43 @@ sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn, size_t wi
             "table @ %p, fd=%d\n", tcet, liobn, tcet->table, tcet->fd);
 #endif
 
-    memory_region_init_iommu(&tcet->iommu, OBJECT(owner), &spapr_iommu_ops,
+    memory_region_init_iommu(&tcet->iommu, OBJECT(dev), &spapr_iommu_ops,
                              "iommu-spapr", UINT64_MAX);
 
     QLIST_INSERT_HEAD(&spapr_tce_tables, tcet, list);
 
+    return 0;
+}
+
+sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn, size_t window_size)
+{
+    sPAPRTCETable *tcet;
+
+    if (spapr_tce_find_by_liobn(liobn)) {
+        fprintf(stderr, "Attempted to create TCE table with duplicate"
+                " LIOBN 0x%x\n", liobn);
+        return NULL;
+    }
+
+    if (!window_size) {
+        return NULL;
+    }
+
+    tcet = SPAPR_TCE_TABLE(object_new(TYPE_SPAPR_TCE_TABLE));
+    tcet->liobn = liobn;
+    tcet->window_size = window_size;
+
+    object_property_add_child(OBJECT(owner), "tce-table", OBJECT(tcet), NULL);
+
+    qdev_init_nofail(DEVICE(tcet));
+
     return tcet;
 }
 
-void spapr_tce_free(sPAPRTCETable *tcet)
+static void spapr_tce_table_finalize(Object *obj)
 {
+    sPAPRTCETable *tcet = SPAPR_TCE_TABLE(obj);
+
     QLIST_REMOVE(tcet, list);
 
     if (!kvm_enabled() ||
@@ -168,8 +198,6 @@ void spapr_tce_free(sPAPRTCETable *tcet)
                                  tcet->window_size) != 0)) {
         g_free(tcet->table);
     }
-
-    g_free(tcet);
 }
 
 MemoryRegion *spapr_tce_get_iommu(sPAPRTCETable *tcet)
@@ -182,10 +210,11 @@ void spapr_tce_set_bypass(sPAPRTCETable *tcet, bool bypass)
     tcet->bypass = bypass;
 }
 
-void spapr_tce_reset(sPAPRTCETable *tcet)
+static void spapr_tce_reset(DeviceState *dev)
 {
+    sPAPRTCETable *tcet = SPAPR_TCE_TABLE(dev);
     size_t table_size = (tcet->window_size >> SPAPR_TCE_PAGE_SHIFT)
-        * sizeof(sPAPRTCE);
+        * sizeof(uint64_t);
 
     tcet->bypass = false;
     memset(tcet->table, 0, table_size);
@@ -194,7 +223,6 @@ void spapr_tce_reset(sPAPRTCETable *tcet)
 static target_ulong put_tce_emu(sPAPRTCETable *tcet, target_ulong ioba,
                                 target_ulong tce)
 {
-    sPAPRTCE *tcep;
     IOMMUTLBEntry entry;
 
     if (ioba >= tcet->window_size) {
@@ -203,8 +231,7 @@ static target_ulong put_tce_emu(sPAPRTCETable *tcet, target_ulong ioba,
         return H_PARAMETER;
     }
 
-    tcep = tcet->table + (ioba >> SPAPR_TCE_PAGE_SHIFT);
-    tcep->tce = tce;
+    tcet->table[ioba >> SPAPR_TCE_PAGE_SHIFT] = tce;
 
     entry.target_as = &address_space_memory,
     entry.iova = ioba & ~SPAPR_TCE_PAGE_MASK;
@@ -238,14 +265,6 @@ static target_ulong h_put_tce(PowerPCCPU *cpu, sPAPREnvironment *spapr,
     return H_PARAMETER;
 }
 
-void spapr_iommu_init(void)
-{
-    QLIST_INIT(&spapr_tce_tables);
-
-    /* hcall-tce */
-    spapr_register_hypercall(H_PUT_TCE, h_put_tce);
-}
-
 int spapr_dma_dt(void *fdt, int node_off, const char *propname,
                  uint32_t liobn, uint64_t window, uint32_t size)
 {
@@ -286,3 +305,31 @@ int spapr_tcet_dma_dt(void *fdt, int node_off, const char *propname,
     return spapr_dma_dt(fdt, node_off, propname,
                         tcet->liobn, 0, tcet->window_size);
 }
+
+static void spapr_tce_table_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+    dc->vmsd = &vmstate_spapr_tce_table;
+    dc->init = spapr_tce_table_realize;
+    dc->reset = spapr_tce_reset;
+
+    QLIST_INIT(&spapr_tce_tables);
+
+    /* hcall-tce */
+    spapr_register_hypercall(H_PUT_TCE, h_put_tce);
+}
+
+static TypeInfo spapr_tce_table_info = {
+    .name = TYPE_SPAPR_TCE_TABLE,
+    .parent = TYPE_DEVICE,
+    .instance_size = sizeof(sPAPRTCETable),
+    .class_init = spapr_tce_table_class_init,
+    .instance_finalize = spapr_tce_table_finalize,
+};
+
+static void register_types(void)
+{
+    type_register_static(&spapr_tce_table_info);
+}
+
+type_init(register_types);
diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 318bc9d..dd9e74e 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -682,7 +682,7 @@ static void spapr_phb_reset(DeviceState *qdev)
     sPAPRPHBState *sphb = SPAPR_PCI_HOST_BRIDGE(s);
 
     /* Reset the IOMMU state */
-    spapr_tce_reset(sphb->tcet);
+    device_reset(DEVICE(sphb->tcet));
 }
 
 static Property spapr_phb_properties[] = {
diff --git a/hw/ppc/spapr_vio.c b/hw/ppc/spapr_vio.c
index 75ce19f..c3f85bf 100644
--- a/hw/ppc/spapr_vio.c
+++ b/hw/ppc/spapr_vio.c
@@ -316,7 +316,7 @@ int spapr_vio_send_crq(VIOsPAPRDevice *dev, uint8_t *crq)
 static void spapr_vio_quiesce_one(VIOsPAPRDevice *dev)
 {
     if (dev->tcet) {
-        spapr_tce_reset(dev->tcet);
+        device_reset(DEVICE(dev->tcet));
     }
     free_crq(dev);
 }
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index de95480..82ad7c0 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -334,10 +334,6 @@ int spapr_rtas_device_tree_setup(void *fdt, hwaddr rtas_addr,
 #define SPAPR_TCE_PAGE_SIZE    (1ULL << SPAPR_TCE_PAGE_SHIFT)
 #define SPAPR_TCE_PAGE_MASK    (SPAPR_TCE_PAGE_SIZE - 1)
 
-typedef struct sPAPRTCE {
-    uint64_t tce;
-} sPAPRTCE;
-
 #define SPAPR_VIO_BASE_LIOBN    0x00000000
 #define SPAPR_PCI_BASE_LIOBN    0x80000000
 
@@ -345,14 +341,27 @@ typedef struct sPAPRTCE {
 
 typedef struct sPAPRTCETable sPAPRTCETable;
 
-void spapr_iommu_init(void);
+#define TYPE_SPAPR_TCE_TABLE "spapr-tce-table"
+#define SPAPR_TCE_TABLE(obj) \
+    OBJECT_CHECK(sPAPRTCETable, (obj), TYPE_SPAPR_TCE_TABLE)
+
+struct sPAPRTCETable {
+    DeviceState parent;
+    uint32_t liobn;
+    uint32_t window_size;
+    uint32_t nb_table;
+    uint64_t *table;
+    bool bypass;
+    int fd;
+    MemoryRegion iommu;
+    QLIST_ENTRY(sPAPRTCETable) list;
+};
+
 void spapr_events_init(sPAPREnvironment *spapr);
 void spapr_events_fdt_skel(void *fdt, uint32_t epow_irq);
 sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn,
                                    size_t window_size);
 MemoryRegion *spapr_tce_get_iommu(sPAPRTCETable *tcet);
-void spapr_tce_free(sPAPRTCETable *tcet);
-void spapr_tce_reset(sPAPRTCETable *tcet);
 void spapr_tce_set_bypass(sPAPRTCETable *tcet, bool bypass);
 int spapr_dma_dt(void *fdt, int node_off, const char *propname,
                  uint32_t liobn, uint64_t window, uint32_t size);
-- 
1.8.0

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH 06/11] pseries: rework PAPR virtual SCSI
  2013-07-15 15:11 [Qemu-devel] [PATCH 00/11] pseries: migration and QOM support Anthony Liguori
                   ` (4 preceding siblings ...)
  2013-07-15 15:11 ` [Qemu-devel] [PATCH 05/11] spapr-tce: make sPAPRTCETable a proper device Anthony Liguori
@ 2013-07-15 15:11 ` Anthony Liguori
  2013-07-15 15:11 ` [Qemu-devel] [PATCH 07/11] pseries: savevm support for " Anthony Liguori
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 25+ messages in thread
From: Anthony Liguori @ 2013-07-15 15:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: Alexey Kardashevskiy, Paolo Bonzini, qemu-ppc, David Gibson

From: Alexey Kardashevskiy <aik@ozlabs.ru>

The patch reimplements handling of indirect requests in order to
simplify upcoming live migration support.
- all pointers (except SCSIRequest*) were replaces with integer
indexes and offsets;
- DMA'ed srp_direct_buf kept untouched (ie. BE format);
- vscsi_fetch_desc() is added, now it is the only place where
descriptors are fetched and byteswapped;
- vscsi_req struct fields converted to migration-friendly types;
- many dprintf()'s fixed.

This also removed an unused field 'lun' from the spapr_vscsi device
which is assigned, but never used.  So, remove it.

[David Gibson: removed unused 'lun']
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
Changes:
2013/08/07:
* fixed handling of indirect requests with an additional table descriptor
---
 hw/scsi/spapr_vscsi.c | 223 +++++++++++++++++++++++++++++---------------------
 1 file changed, 130 insertions(+), 93 deletions(-)

diff --git a/hw/scsi/spapr_vscsi.c b/hw/scsi/spapr_vscsi.c
index e8978bf..e2740fb 100644
--- a/hw/scsi/spapr_vscsi.c
+++ b/hw/scsi/spapr_vscsi.c
@@ -75,20 +75,19 @@ typedef struct vscsi_req {
     /* SCSI request tracking */
     SCSIRequest             *sreq;
     uint32_t                qtag; /* qemu tag != srp tag */
-    int                     lun;
-    int                     active;
-    long                    data_len;
-    int                     writing;
-    int                     senselen;
+    bool                    active;
+    uint32_t                data_len;
+    bool                    writing;
+    uint32_t                senselen;
     uint8_t                 sense[SCSI_SENSE_BUF_SIZE];
 
     /* RDMA related bits */
     uint8_t                 dma_fmt;
-    struct srp_direct_buf   ext_desc;
-    struct srp_direct_buf   *cur_desc;
-    struct srp_indirect_buf *ind_desc;
-    int                     local_desc;
-    int                     total_desc;
+    uint16_t                local_desc;
+    uint16_t                total_desc;
+    uint16_t                cdb_offset;
+    uint16_t                cur_desc_num;
+    uint16_t                cur_desc_offset;
 } vscsi_req;
 
 #define TYPE_VIO_SPAPR_VSCSI_DEVICE "spapr-vscsi"
@@ -264,93 +263,138 @@ static int vscsi_send_rsp(VSCSIState *s, vscsi_req *req,
     return 0;
 }
 
-static inline void vscsi_swap_desc(struct srp_direct_buf *desc)
+static inline struct srp_direct_buf vscsi_swap_desc(struct srp_direct_buf desc)
 {
-    desc->va = be64_to_cpu(desc->va);
-    desc->len = be32_to_cpu(desc->len);
+    desc.va = be64_to_cpu(desc.va);
+    desc.len = be32_to_cpu(desc.len);
+    return desc;
+}
+
+static int vscsi_fetch_desc(VSCSIState *s, struct vscsi_req *req,
+                            unsigned n, unsigned buf_offset,
+                            struct srp_direct_buf *ret)
+{
+    struct srp_cmd *cmd = &req->iu.srp.cmd;
+
+    switch (req->dma_fmt) {
+    case SRP_NO_DATA_DESC: {
+        dprintf("VSCSI: no data descriptor\n");
+        return 0;
+    }
+    case SRP_DATA_DESC_DIRECT: {
+        memcpy(ret, cmd->add_data + req->cdb_offset, sizeof(*ret));
+        assert(req->cur_desc_num == 0);
+        dprintf("VSCSI: direct segment\n");
+        break;
+    }
+    case SRP_DATA_DESC_INDIRECT: {
+        struct srp_indirect_buf *tmp = (struct srp_indirect_buf *)
+                                       (cmd->add_data + req->cdb_offset);
+        if (n < req->local_desc) {
+            *ret = tmp->desc_list[n];
+            dprintf("VSCSI: indirect segment local tag=0x%x desc#%d/%d\n",
+                    req->qtag, n, req->local_desc);
+
+        } else if (n < req->total_desc) {
+            int rc;
+            struct srp_direct_buf tbl_desc = vscsi_swap_desc(tmp->table_desc);
+            unsigned desc_offset = n * sizeof(struct srp_direct_buf);
+
+            if (desc_offset >= tbl_desc.len) {
+                dprintf("VSCSI:   #%d is ouf of range (%d bytes)\n",
+                        n, desc_offset);
+                return -1;
+            }
+            rc = spapr_vio_dma_read(&s->vdev, tbl_desc.va + desc_offset,
+                                    ret, sizeof(struct srp_direct_buf));
+            if (rc) {
+                dprintf("VSCSI: spapr_vio_dma_read -> %d reading ext_desc\n",
+                        rc);
+                return -1;
+            }
+            dprintf("VSCSI: indirect segment ext. tag=0x%x desc#%d/%d { va=%"PRIx64" len=%x }\n",
+                    req->qtag, n, req->total_desc, tbl_desc.va, tbl_desc.len);
+        } else {
+            dprintf("VSCSI:   Out of descriptors !\n");
+            return 0;
+        }
+        break;
+    }
+    default:
+        fprintf(stderr, "VSCSI:   Unknown format %x\n", req->dma_fmt);
+        return -1;
+    }
+
+    *ret = vscsi_swap_desc(*ret);
+    if (buf_offset > ret->len) {
+        dprintf("   offset=%x is out of a descriptor #%d boundary=%x\n",
+                buf_offset, req->cur_desc_num, ret->len);
+        return -1;
+    }
+    ret->va += buf_offset;
+    ret->len -= buf_offset;
+
+    dprintf("   cur=%d offs=%x ret { va=%"PRIx64" len=%x }\n",
+            req->cur_desc_num, req->cur_desc_offset, ret->va, ret->len);
+
+    return ret->len ? 1 : 0;
 }
 
 static int vscsi_srp_direct_data(VSCSIState *s, vscsi_req *req,
                                  uint8_t *buf, uint32_t len)
 {
-    struct srp_direct_buf *md = req->cur_desc;
+    struct srp_direct_buf md;
     uint32_t llen;
     int rc = 0;
 
-    dprintf("VSCSI: direct segment 0x%x bytes, va=0x%llx desc len=0x%x\n",
-            len, (unsigned long long)md->va, md->len);
+    rc = vscsi_fetch_desc(s, req, req->cur_desc_num, req->cur_desc_offset, &md);
+    if (rc < 0) {
+        return -1;
+    } else if (rc == 0) {
+        return 0;
+    }
 
-    llen = MIN(len, md->len);
+    llen = MIN(len, md.len);
     if (llen) {
         if (req->writing) { /* writing = to device = reading from memory */
-            rc = spapr_vio_dma_read(&s->vdev, md->va, buf, llen);
+            rc = spapr_vio_dma_read(&s->vdev, md.va, buf, llen);
         } else {
-            rc = spapr_vio_dma_write(&s->vdev, md->va, buf, llen);
+            rc = spapr_vio_dma_write(&s->vdev, md.va, buf, llen);
         }
     }
-    md->len -= llen;
-    md->va += llen;
 
     if (rc) {
         return -1;
     }
+    req->cur_desc_offset += llen;
+
     return llen;
 }
 
 static int vscsi_srp_indirect_data(VSCSIState *s, vscsi_req *req,
                                    uint8_t *buf, uint32_t len)
 {
-    struct srp_direct_buf *td = &req->ind_desc->table_desc;
-    struct srp_direct_buf *md = req->cur_desc;
+    struct srp_direct_buf md;
     int rc = 0;
     uint32_t llen, total = 0;
 
-    dprintf("VSCSI: indirect segment 0x%x bytes, td va=0x%llx len=0x%x\n",
-            len, (unsigned long long)td->va, td->len);
+    dprintf("VSCSI: indirect segment 0x%x bytes\n", len);
 
     /* While we have data ... */
     while (len) {
-        /* If we have a descriptor but it's empty, go fetch a new one */
-        if (md && md->len == 0) {
-            /* More local available, use one */
-            if (req->local_desc) {
-                md = ++req->cur_desc;
-                --req->local_desc;
-                --req->total_desc;
-                td->va += sizeof(struct srp_direct_buf);
-            } else {
-                md = req->cur_desc = NULL;
-            }
-        }
-        /* No descriptor at hand, fetch one */
-        if (!md) {
-            if (!req->total_desc) {
-                dprintf("VSCSI:   Out of descriptors !\n");
-                break;
-            }
-            md = req->cur_desc = &req->ext_desc;
-            dprintf("VSCSI:   Reading desc from 0x%llx\n",
-                    (unsigned long long)td->va);
-            rc = spapr_vio_dma_read(&s->vdev, td->va, md,
-                                    sizeof(struct srp_direct_buf));
-            if (rc) {
-                dprintf("VSCSI: spapr_vio_dma_read -> %d reading ext_desc\n",
-                        rc);
-                break;
-            }
-            vscsi_swap_desc(md);
-            td->va += sizeof(struct srp_direct_buf);
-            --req->total_desc;
+        rc = vscsi_fetch_desc(s, req, req->cur_desc_num, req->cur_desc_offset, &md);
+        if (rc < 0) {
+            return -1;
+        } else if (rc == 0) {
+            break;
         }
-        dprintf("VSCSI:   [desc va=0x%llx,len=0x%x] remaining=0x%x\n",
-                (unsigned long long)md->va, md->len, len);
 
         /* Perform transfer */
-        llen = MIN(len, md->len);
+        llen = MIN(len, md.len);
         if (req->writing) { /* writing = to device = reading from memory */
-            rc = spapr_vio_dma_read(&s->vdev, md->va, buf, llen);
+            rc = spapr_vio_dma_read(&s->vdev, md.va, buf, llen);
         } else {
-            rc = spapr_vio_dma_write(&s->vdev, md->va, buf, llen);
+            rc = spapr_vio_dma_write(&s->vdev, md.va, buf, llen);
         }
         if (rc) {
             dprintf("VSCSI: spapr_vio_dma_r/w(%d) -> %d\n", req->writing, rc);
@@ -361,10 +405,18 @@ static int vscsi_srp_indirect_data(VSCSIState *s, vscsi_req *req,
 
         len -= llen;
         buf += llen;
+
         total += llen;
-        md->va += llen;
-        md->len -= llen;
+
+        /* Update current position in the current descriptor */
+        req->cur_desc_offset += llen;
+        if (md.len == llen) {
+            /* Go to the next descriptor if the current one finished */
+            ++req->cur_desc_num;
+            req->cur_desc_offset = 0;
+        }
     }
+
     return rc ? -1 : total;
 }
 
@@ -412,14 +464,13 @@ static int data_out_desc_size(struct srp_cmd *cmd)
 static int vscsi_preprocess_desc(vscsi_req *req)
 {
     struct srp_cmd *cmd = &req->iu.srp.cmd;
-    int offset, i;
 
-    offset = cmd->add_cdb_len & ~3;
+    req->cdb_offset = cmd->add_cdb_len & ~3;
 
     if (req->writing) {
         req->dma_fmt = cmd->buf_fmt >> 4;
     } else {
-        offset += data_out_desc_size(cmd);
+        req->cdb_offset += data_out_desc_size(cmd);
         req->dma_fmt = cmd->buf_fmt & ((1U << 4) - 1);
     }
 
@@ -427,31 +478,18 @@ static int vscsi_preprocess_desc(vscsi_req *req)
     case SRP_NO_DATA_DESC:
         break;
     case SRP_DATA_DESC_DIRECT:
-        req->cur_desc = (struct srp_direct_buf *)(cmd->add_data + offset);
         req->total_desc = req->local_desc = 1;
-        vscsi_swap_desc(req->cur_desc);
-        dprintf("VSCSI: using direct RDMA %s, 0x%x bytes MD: 0x%llx\n",
-                req->writing ? "write" : "read",
-                req->cur_desc->len, (unsigned long long)req->cur_desc->va);
         break;
-    case SRP_DATA_DESC_INDIRECT:
-        req->ind_desc = (struct srp_indirect_buf *)(cmd->add_data + offset);
-        vscsi_swap_desc(&req->ind_desc->table_desc);
-        req->total_desc = req->ind_desc->table_desc.len /
-            sizeof(struct srp_direct_buf);
+    case SRP_DATA_DESC_INDIRECT: {
+        struct srp_indirect_buf *ind_tmp = (struct srp_indirect_buf *)
+                (cmd->add_data + req->cdb_offset);
+
+        req->total_desc = be32_to_cpu(ind_tmp->table_desc.len) /
+                          sizeof(struct srp_direct_buf);
         req->local_desc = req->writing ? cmd->data_out_desc_cnt :
-            cmd->data_in_desc_cnt;
-        for (i = 0; i < req->local_desc; i++) {
-            vscsi_swap_desc(&req->ind_desc->desc_list[i]);
-        }
-        req->cur_desc = req->local_desc ? &req->ind_desc->desc_list[0] : NULL;
-        dprintf("VSCSI: using indirect RDMA %s, 0x%x bytes %d descs "
-                "(%d local) VA: 0x%llx\n",
-                req->writing ? "read" : "write",
-                be32_to_cpu(req->ind_desc->len),
-                req->total_desc, req->local_desc,
-                (unsigned long long)req->ind_desc->table_desc.va);
+                          cmd->data_in_desc_cnt;
         break;
+    }
     default:
         fprintf(stderr,
                 "vscsi_preprocess_desc: Unknown format %x\n", req->dma_fmt);
@@ -499,8 +537,8 @@ static void vscsi_command_complete(SCSIRequest *sreq, uint32_t status, size_t re
     vscsi_req *req = sreq->hba_private;
     int32_t res_in = 0, res_out = 0;
 
-    dprintf("VSCSI: SCSI cmd complete, r=0x%x tag=0x%x status=0x%x, req=%p\n",
-            reason, sreq->tag, status, req);
+    dprintf("VSCSI: SCSI cmd complete, tag=0x%x status=0x%x, req=%p\n",
+            sreq->tag, status, req);
     if (req == NULL) {
         fprintf(stderr, "VSCSI: Can't find request for tag 0x%x\n", sreq->tag);
         return;
@@ -509,7 +547,7 @@ static void vscsi_command_complete(SCSIRequest *sreq, uint32_t status, size_t re
     if (status == CHECK_CONDITION) {
         req->senselen = scsi_req_get_sense(req->sreq, req->sense,
                                            sizeof(req->sense));
-        dprintf("VSCSI: Sense data, %d bytes:\n", len);
+        dprintf("VSCSI: Sense data, %d bytes:\n", req->senselen);
         dprintf("       %02x  %02x  %02x  %02x  %02x  %02x  %02x  %02x\n",
                 req->sense[0], req->sense[1], req->sense[2], req->sense[3],
                 req->sense[4], req->sense[5], req->sense[6], req->sense[7]);
@@ -621,12 +659,11 @@ static int vscsi_queue_cmd(VSCSIState *s, vscsi_req *req)
         } return 1;
     }
 
-    req->lun = lun;
     req->sreq = scsi_req_new(sdev, req->qtag, lun, srp->cmd.cdb, req);
     n = scsi_req_enqueue(req->sreq);
 
-    dprintf("VSCSI: Queued command tag 0x%x CMD 0x%x ID %d LUN %d ret: %d\n",
-            req->qtag, srp->cmd.cdb[0], id, lun, n);
+    dprintf("VSCSI: Queued command tag 0x%x CMD 0x%x LUN %d ret: %d\n",
+            req->qtag, srp->cmd.cdb[0], lun, n);
 
     if (n) {
         /* Transfer direction must be set before preprocessing the
-- 
1.8.0

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH 07/11] pseries: savevm support for PAPR virtual SCSI
  2013-07-15 15:11 [Qemu-devel] [PATCH 00/11] pseries: migration and QOM support Anthony Liguori
                   ` (5 preceding siblings ...)
  2013-07-15 15:11 ` [Qemu-devel] [PATCH 06/11] pseries: rework PAPR virtual SCSI Anthony Liguori
@ 2013-07-15 15:11 ` Anthony Liguori
  2013-07-15 15:11 ` [Qemu-devel] [PATCH 08/11] pseries: savevm support for pseries machine Anthony Liguori
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 25+ messages in thread
From: Anthony Liguori @ 2013-07-15 15:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: Alexey Kardashevskiy, Paolo Bonzini, qemu-ppc, David Gibson

From: David Gibson <david@gibson.dropbear.id.au>

This patch adds the necessary support for saving the state of the PAPR VIO
virtual SCSI device. This also saves and restores active SCSI requests.

[aik: implemented vscsi_req save/restore]
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: David Gibson <david@gibson.dropbear.id.au>
---
 hw/scsi/spapr_vscsi.c | 82 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 81 insertions(+), 1 deletion(-)

diff --git a/hw/scsi/spapr_vscsi.c b/hw/scsi/spapr_vscsi.c
index e2740fb..a86199b 100644
--- a/hw/scsi/spapr_vscsi.c
+++ b/hw/scsi/spapr_vscsi.c
@@ -578,6 +578,69 @@ static void vscsi_request_cancelled(SCSIRequest *sreq)
     vscsi_put_req(req);
 }
 
+static const VMStateDescription vmstate_spapr_vscsi_req = {
+    .name = "spapr_vscsi_req",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .minimum_version_id_old = 1,
+    .fields      = (VMStateField []) {
+        VMSTATE_BUFFER(crq.raw, vscsi_req),
+        VMSTATE_BUFFER(iu.srp.reserved, vscsi_req),
+        VMSTATE_UINT32(qtag, vscsi_req),
+        VMSTATE_BOOL(active, vscsi_req),
+        VMSTATE_UINT32(data_len, vscsi_req),
+        VMSTATE_BOOL(writing, vscsi_req),
+        VMSTATE_UINT32(senselen, vscsi_req),
+        VMSTATE_BUFFER(sense, vscsi_req),
+        VMSTATE_UINT8(dma_fmt, vscsi_req),
+        VMSTATE_UINT16(local_desc, vscsi_req),
+        VMSTATE_UINT16(total_desc, vscsi_req),
+        VMSTATE_UINT16(cdb_offset, vscsi_req),
+      /*Restart SCSI request from the beginning for now */
+      /*VMSTATE_UINT16(cur_desc_num, vscsi_req),
+        VMSTATE_UINT16(cur_desc_offset, vscsi_req),*/
+        VMSTATE_END_OF_LIST()
+    },
+};
+
+static void vscsi_save_request(QEMUFile *f, SCSIRequest *sreq)
+{
+    vscsi_req *req = sreq->hba_private;
+    assert(req->active);
+
+    vmstate_save_state(f, &vmstate_spapr_vscsi_req, req);
+
+    dprintf("VSCSI: saving tag=%u, current desc#%d, offset=%x\n",
+            req->qtag, req->cur_desc_num, req->cur_desc_offset);
+}
+
+static void *vscsi_load_request(QEMUFile *f, SCSIRequest *sreq)
+{
+    SCSIBus *bus = sreq->bus;
+    VSCSIState *s = VIO_SPAPR_VSCSI_DEVICE(bus->qbus.parent);
+    vscsi_req *req;
+    int rc;
+
+    assert(sreq->tag < VSCSI_REQ_LIMIT);
+    req = &s->reqs[sreq->tag];
+    assert(!req->active);
+
+    memset(req, 0, sizeof(*req));
+    rc = vmstate_load_state(f, &vmstate_spapr_vscsi_req, req, 1);
+    if (rc) {
+        fprintf(stderr, "VSCSI: failed loading request tag#%u\n", sreq->tag);
+        return NULL;
+    }
+    assert(req->active);
+
+    req->sreq = scsi_req_ref(sreq);
+
+    dprintf("VSCSI: restoring tag=%u, current desc#%d, offset=%x\n",
+            req->qtag, req->cur_desc_num, req->cur_desc_offset);
+
+    return req;
+}
+
 static void vscsi_process_login(VSCSIState *s, vscsi_req *req)
 {
     union viosrp_iu *iu = &req->iu;
@@ -932,7 +995,9 @@ static const struct SCSIBusInfo vscsi_scsi_info = {
 
     .transfer_data = vscsi_transfer_data,
     .complete = vscsi_command_complete,
-    .cancel = vscsi_request_cancelled
+    .cancel = vscsi_request_cancelled,
+    .save_request = vscsi_save_request,
+    .load_request = vscsi_load_request,
 };
 
 static void spapr_vscsi_reset(VIOsPAPRDevice *dev)
@@ -991,6 +1056,20 @@ static Property spapr_vscsi_properties[] = {
     DEFINE_PROP_END_OF_LIST(),
 };
 
+static const VMStateDescription vmstate_spapr_vscsi = {
+    .name = "spapr_vscsi",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .minimum_version_id_old = 1,
+    .fields      = (VMStateField []) {
+        VMSTATE_SPAPR_VIO(vdev, VSCSIState),
+        /* VSCSI state */
+        /* ???? */
+
+        VMSTATE_END_OF_LIST()
+    },
+};
+
 static void spapr_vscsi_class_init(ObjectClass *klass, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(klass);
@@ -1005,6 +1084,7 @@ static void spapr_vscsi_class_init(ObjectClass *klass, void *data)
     k->signal_mask = 0x00000001;
     dc->props = spapr_vscsi_properties;
     k->rtce_window_size = 0x10000000;
+    dc->vmsd = &vmstate_spapr_vscsi;
 }
 
 static const TypeInfo spapr_vscsi_info = {
-- 
1.8.0

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH 08/11] pseries: savevm support for pseries machine
  2013-07-15 15:11 [Qemu-devel] [PATCH 00/11] pseries: migration and QOM support Anthony Liguori
                   ` (6 preceding siblings ...)
  2013-07-15 15:11 ` [Qemu-devel] [PATCH 07/11] pseries: savevm support for " Anthony Liguori
@ 2013-07-15 15:11 ` Anthony Liguori
  2013-07-15 15:11 ` [Qemu-devel] [PATCH 09/11] pseries: savevm support for PCI host bridge Anthony Liguori
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 25+ messages in thread
From: Anthony Liguori @ 2013-07-15 15:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: Alexey Kardashevskiy, Paolo Bonzini, qemu-ppc, David Gibson

From: David Gibson <david@gibson.dropbear.id.au>

This adds the necessary pieces to implement savevm / migration for the
pseries machine.  The most complex part here is migrating the hash
table - for the paravirtualized pseries machine the guest's hash page
table is not stored within guest memory, but externally and the guest
accesses it via hypercalls.

This patch uses a hypervisor reserved bit of the HPTE as a dirty bit
(tracking changes to the HPTE itself, not the page it references).
This is used to implement a live migration style incremental save and
restore of the hash table contents.

Normally a hash table is 16MB but it can get bigger depending on how
much RAM the guest has. Due to its nature, updates to it are random so
the live migration style is used for it.

In addition it adds VMStateDescription information to save and restore
the (few) remaining pieces of state information needed by the pseries
machine.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>

---
Changes:
2013/09/07:
* added a comment about HPT size and migration style choice
---
 hw/ppc/spapr.c         | 269 ++++++++++++++++++++++++++++++++++++++++++++++++-
 hw/ppc/spapr_hcall.c   |   8 +-
 include/hw/ppc/spapr.h |  12 ++-
 3 files changed, 281 insertions(+), 8 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index e340708..734a782 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -32,6 +32,7 @@
 #include "sysemu/cpus.h"
 #include "sysemu/kvm.h"
 #include "kvm_ppc.h"
+#include "mmu-hash64.h"
 
 #include "hw/boards.h"
 #include "hw/ppc/ppc.h"
@@ -666,7 +667,7 @@ static void spapr_cpu_reset(void *opaque)
 
     env->spr[SPR_HIOR] = 0;
 
-    env->external_htab = spapr->htab;
+    env->external_htab = (uint8_t *)spapr->htab;
     env->htab_base = -1;
     env->htab_mask = HTAB_SIZE(spapr) - 1;
     env->spr[SPR_SDR1] = (target_ulong)(uintptr_t)spapr->htab |
@@ -710,6 +711,268 @@ static int spapr_vga_init(PCIBus *pci_bus)
     }
 }
 
+static const VMStateDescription vmstate_spapr = {
+    .name = "spapr",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .minimum_version_id_old = 1,
+    .fields      = (VMStateField []) {
+        VMSTATE_UINT32(next_irq, sPAPREnvironment),
+
+        /* RTC offset */
+        VMSTATE_UINT64(rtc_offset, sPAPREnvironment),
+
+        VMSTATE_END_OF_LIST()
+    },
+};
+
+#define HPTE(_table, _i)   (void *)(((uint64_t *)(_table)) + ((_i) * 2))
+#define HPTE_VALID(_hpte)  (tswap64(*((uint64_t *)(_hpte))) & HPTE64_V_VALID)
+#define HPTE_DIRTY(_hpte)  (tswap64(*((uint64_t *)(_hpte))) & HPTE64_V_HPTE_DIRTY)
+#define CLEAN_HPTE(_hpte)  ((*(uint64_t *)(_hpte)) &= tswap64(~HPTE64_V_HPTE_DIRTY))
+
+static int htab_save_setup(QEMUFile *f, void *opaque)
+{
+    sPAPREnvironment *spapr = opaque;
+
+    spapr->htab_save_index = 0;
+    spapr->htab_first_pass = true;
+
+    /* "Iteration" header */
+    qemu_put_be32(f, spapr->htab_shift);
+
+    return 0;
+}
+
+#define MAX_ITERATION_NS    5000000 /* 5 ms */
+
+static void htab_save_first_pass(QEMUFile *f, sPAPREnvironment *spapr,
+                                 int64_t max_ns)
+{
+    int htabslots = HTAB_SIZE(spapr) / HASH_PTE_SIZE_64;
+    int index = spapr->htab_save_index;
+    int64_t starttime = qemu_get_clock_ns(rt_clock);
+
+    assert(spapr->htab_first_pass);
+
+    do {
+        int chunkstart;
+
+        /* Consume invalid HPTEs */
+        while ((index < htabslots)
+               && !HPTE_VALID(HPTE(spapr->htab, index))) {
+            index++;
+            CLEAN_HPTE(HPTE(spapr->htab, index));
+        }
+
+        /* Consume valid HPTEs */
+        chunkstart = index;
+        while ((index < htabslots)
+               && HPTE_VALID(HPTE(spapr->htab, index))) {
+            index++;
+            CLEAN_HPTE(HPTE(spapr->htab, index));
+        }
+
+        if (index > chunkstart) {
+            int n_valid = index - chunkstart;
+
+            qemu_put_be32(f, chunkstart);
+            qemu_put_be16(f, n_valid);
+            qemu_put_be16(f, 0);
+            qemu_put_buffer(f, HPTE(spapr->htab, chunkstart),
+                            HASH_PTE_SIZE_64 * n_valid);
+
+            if ((qemu_get_clock_ns(rt_clock) - starttime) > max_ns) {
+                break;
+            }
+        }
+    } while ((index < htabslots) && !qemu_file_rate_limit(f));
+
+    if (index >= htabslots) {
+        assert(index == htabslots);
+        index = 0;
+        spapr->htab_first_pass = false;
+    }
+    spapr->htab_save_index = index;
+}
+
+static bool htab_save_later_pass(QEMUFile *f, sPAPREnvironment *spapr,
+                                 int64_t max_ns)
+{
+    bool final = max_ns < 0;
+    int htabslots = HTAB_SIZE(spapr) / HASH_PTE_SIZE_64;
+    int examined = 0, sent = 0;
+    int index = spapr->htab_save_index;
+    int64_t starttime = qemu_get_clock_ns(rt_clock);
+
+    assert(!spapr->htab_first_pass);
+
+    do {
+        int chunkstart, invalidstart;
+
+        /* Consume non-dirty HPTEs */
+        while ((index < htabslots)
+               && !HPTE_DIRTY(HPTE(spapr->htab, index))) {
+            index++;
+            examined++;
+        }
+
+        chunkstart = index;
+        /* Consume valid dirty HPTEs */
+        while ((index < htabslots)
+               && HPTE_DIRTY(HPTE(spapr->htab, index))
+               && HPTE_VALID(HPTE(spapr->htab, index))) {
+            CLEAN_HPTE(HPTE(spapr->htab, index));
+            index++;
+            examined++;
+        }
+
+        invalidstart = index;
+        /* Consume invalid dirty HPTEs */
+        while ((index < htabslots)
+               && HPTE_DIRTY(HPTE(spapr->htab, index))
+               && !HPTE_VALID(HPTE(spapr->htab, index))) {
+            CLEAN_HPTE(HPTE(spapr->htab, index));
+            index++;
+            examined++;
+        }
+
+        if (index > chunkstart) {
+            int n_valid = invalidstart - chunkstart;
+            int n_invalid = index - invalidstart;
+
+            qemu_put_be32(f, chunkstart);
+            qemu_put_be16(f, n_valid);
+            qemu_put_be16(f, n_invalid);
+            qemu_put_buffer(f, HPTE(spapr->htab, chunkstart),
+                            HASH_PTE_SIZE_64 * n_valid);
+            sent += index - chunkstart;
+
+            if (!final && (qemu_get_clock_ns(rt_clock) - starttime) > max_ns) {
+                break;
+            }
+        }
+
+        if (examined >= htabslots) {
+            break;
+        }
+
+        if (index >= htabslots) {
+            assert(index == htabslots);
+            index = 0;
+        }
+    } while ((examined < htabslots) && (!qemu_file_rate_limit(f) || final));
+
+    if (index >= htabslots) {
+        assert(index == htabslots);
+        index = 0;
+    }
+
+    spapr->htab_save_index = index;
+
+    return (examined >= htabslots) && (sent == 0);
+}
+
+static int htab_save_iterate(QEMUFile *f, void *opaque)
+{
+    sPAPREnvironment *spapr = opaque;
+    bool nothingleft = false;;
+
+    /* Iteration header */
+    qemu_put_be32(f, 0);
+
+    if (spapr->htab_first_pass) {
+        htab_save_first_pass(f, spapr, MAX_ITERATION_NS);
+    } else {
+        nothingleft = htab_save_later_pass(f, spapr, MAX_ITERATION_NS);
+    }
+
+    /* End marker */
+    qemu_put_be32(f, 0);
+    qemu_put_be16(f, 0);
+    qemu_put_be16(f, 0);
+
+    return nothingleft ? 1 : 0;
+}
+
+static int htab_save_complete(QEMUFile *f, void *opaque)
+{
+    sPAPREnvironment *spapr = opaque;
+
+    /* Iteration header */
+    qemu_put_be32(f, 0);
+
+    htab_save_later_pass(f, spapr, -1);
+
+    /* End marker */
+    qemu_put_be32(f, 0);
+    qemu_put_be16(f, 0);
+    qemu_put_be16(f, 0);
+
+    return 0;
+}
+
+static int htab_load(QEMUFile *f, void *opaque, int version_id)
+{
+    sPAPREnvironment *spapr = opaque;
+    uint32_t section_hdr;
+
+    if (version_id < 1 || version_id > 1) {
+        fprintf(stderr, "htab_load() bad version\n");
+        return -EINVAL;
+    }
+
+    section_hdr = qemu_get_be32(f);
+
+    if (section_hdr) {
+        /* First section, just the hash shift */
+        if (spapr->htab_shift != section_hdr) {
+            return -EINVAL;
+        }
+        return 0;
+    }
+
+    while (true) {
+        uint32_t index;
+        uint16_t n_valid, n_invalid;
+
+        index = qemu_get_be32(f);
+        n_valid = qemu_get_be16(f);
+        n_invalid = qemu_get_be16(f);
+
+        if ((index == 0) && (n_valid == 0) && (n_invalid == 0)) {
+            /* End of Stream */
+            break;
+        }
+
+        if ((index + n_valid + n_invalid) >=
+            (HTAB_SIZE(spapr) / HASH_PTE_SIZE_64)) {
+            /* Bad index in stream */
+            fprintf(stderr, "htab_load() bad index %d (%hd+%hd entries) "
+                    "in htab stream\n", index, n_valid, n_invalid);
+            return -EINVAL;
+        }
+
+        if (n_valid) {
+            qemu_get_buffer(f, HPTE(spapr->htab, index),
+                            HASH_PTE_SIZE_64 * n_valid);
+        }
+        if (n_invalid) {
+            memset(HPTE(spapr->htab, index + n_valid), 0,
+                   HASH_PTE_SIZE_64 * n_invalid);
+        }
+    }
+
+    return 0;
+}
+
+static SaveVMHandlers savevm_htab_handlers = {
+    .save_live_setup = htab_save_setup,
+    .save_live_iterate = htab_save_iterate,
+    .save_live_complete = htab_save_complete,
+    .load_state = htab_load,
+};
+
 /* pSeries LPAR / sPAPR hardware init */
 static void ppc_spapr_init(QEMUMachineInitArgs *args)
 {
@@ -950,6 +1213,10 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
 
     spapr->entry_point = 0x100;
 
+    vmstate_register(NULL, 0, &vmstate_spapr, spapr);
+    register_savevm_live(NULL, "spapr/htab", -1, 1,
+                         &savevm_htab_handlers, spapr);
+
     /* Prepare the device tree */
     spapr->fdt_skel = spapr_create_fdt_skel(cpu_model,
                                             initrd_base, initrd_size,
diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index ed32dec..67d6cd9 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -115,7 +115,7 @@ static target_ulong h_enter(PowerPCCPU *cpu, sPAPREnvironment *spapr,
     }
     ppc_hash64_store_hpte1(env, hpte, ptel);
     /* eieio();  FIXME: need some sort of barrier for smp? */
-    ppc_hash64_store_hpte0(env, hpte, pteh);
+    ppc_hash64_store_hpte0(env, hpte, pteh | HPTE64_V_HPTE_DIRTY);
 
     args[0] = pte_index + i;
     return H_SUCCESS;
@@ -152,7 +152,7 @@ static RemoveResult remove_hpte(CPUPPCState *env, target_ulong ptex,
     }
     *vp = v;
     *rp = r;
-    ppc_hash64_store_hpte0(env, hpte, 0);
+    ppc_hash64_store_hpte0(env, hpte, HPTE64_V_HPTE_DIRTY);
     rb = compute_tlbie_rb(v, r, ptex);
     ppc_tlb_invalidate_one(env, rb);
     return REMOVE_SUCCESS;
@@ -282,11 +282,11 @@ static target_ulong h_protect(PowerPCCPU *cpu, sPAPREnvironment *spapr,
     r |= (flags << 48) & HPTE64_R_KEY_HI;
     r |= flags & (HPTE64_R_PP | HPTE64_R_N | HPTE64_R_KEY_LO);
     rb = compute_tlbie_rb(v, r, pte_index);
-    ppc_hash64_store_hpte0(env, hpte, v & ~HPTE64_V_VALID);
+    ppc_hash64_store_hpte0(env, hpte, (v & ~HPTE64_V_VALID) | HPTE64_V_HPTE_DIRTY);
     ppc_tlb_invalidate_one(env, rb);
     ppc_hash64_store_hpte1(env, hpte, r);
     /* Don't need a memory barrier, due to qemu's global lock */
-    ppc_hash64_store_hpte0(env, hpte, v);
+    ppc_hash64_store_hpte0(env, hpte, v | HPTE64_V_HPTE_DIRTY);
     return H_SUCCESS;
 }
 
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 82ad7c0..00a6f58 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -9,6 +9,8 @@ struct sPAPRPHBState;
 struct sPAPRNVRAM;
 struct icp_state;
 
+#define HPTE64_V_HPTE_DIRTY     0x0000000000000040ULL
+
 typedef struct sPAPREnvironment {
     struct VIOsPAPRBus *vio_bus;
     QLIST_HEAD(, sPAPRPHBState) phbs;
@@ -17,20 +19,24 @@ typedef struct sPAPREnvironment {
 
     hwaddr ram_limit;
     void *htab;
-    long htab_shift;
+    uint32_t htab_shift;
     hwaddr rma_size;
     int vrma_adjust;
     hwaddr fdt_addr, rtas_addr;
     long rtas_size;
     void *fdt_skel;
     target_ulong entry_point;
-    int next_irq;
-    int rtc_offset;
+    uint32_t next_irq;
+    uint64_t rtc_offset;
     char *cpu_model;
     bool has_graphics;
 
     uint32_t epow_irq;
     Notifier epow_notifier;
+
+    /* Migration state */
+    int htab_save_index;
+    bool htab_first_pass;
 } sPAPREnvironment;
 
 #define H_SUCCESS         0
-- 
1.8.0

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH 09/11] pseries: savevm support for PCI host bridge
  2013-07-15 15:11 [Qemu-devel] [PATCH 00/11] pseries: migration and QOM support Anthony Liguori
                   ` (7 preceding siblings ...)
  2013-07-15 15:11 ` [Qemu-devel] [PATCH 08/11] pseries: savevm support for pseries machine Anthony Liguori
@ 2013-07-15 15:11 ` Anthony Liguori
  2013-07-15 15:11 ` [Qemu-devel] [PATCH 10/11] pseries: savevm support with KVM Anthony Liguori
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 25+ messages in thread
From: Anthony Liguori @ 2013-07-15 15:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: Alexey Kardashevskiy, Paolo Bonzini, qemu-ppc, David Gibson

From: David Gibson <david@gibson.dropbear.id.au>

This adds the necessary support for saving the state of the PAPR virtual
PCI host bridge (or host bridges).

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
---
 hw/ppc/spapr_pci.c          | 49 +++++++++++++++++++++++++++++++++++++++++++++
 include/hw/pci-host/spapr.h |  6 +++---
 2 files changed, 52 insertions(+), 3 deletions(-)

diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index dd9e74e..560d375c 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -699,6 +699,54 @@ static Property spapr_phb_properties[] = {
     DEFINE_PROP_END_OF_LIST(),
 };
 
+static const VMStateDescription vmstate_spapr_pci_lsi = {
+    .name = "spapr_pci/lsi",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .minimum_version_id_old = 1,
+    .fields      = (VMStateField []) {
+        VMSTATE_UINT32_EQUAL(irq, struct spapr_pci_lsi),
+
+        VMSTATE_END_OF_LIST()
+    },
+};
+
+static const VMStateDescription vmstate_spapr_pci_msi = {
+    .name = "spapr_pci/lsi",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .minimum_version_id_old = 1,
+    .fields      = (VMStateField []) {
+        VMSTATE_UINT32(config_addr, struct spapr_pci_msi),
+        VMSTATE_UINT32(irq, struct spapr_pci_msi),
+        VMSTATE_UINT32(nvec, struct spapr_pci_msi),
+
+        VMSTATE_END_OF_LIST()
+    },
+};
+
+static const VMStateDescription vmstate_spapr_pci = {
+    .name = "spapr_pci",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .minimum_version_id_old = 1,
+    .fields      = (VMStateField []) {
+        VMSTATE_UINT64_EQUAL(buid, sPAPRPHBState),
+        VMSTATE_UINT32_EQUAL(dma_liobn, sPAPRPHBState),
+        VMSTATE_UINT64_EQUAL(mem_win_addr, sPAPRPHBState),
+        VMSTATE_UINT64_EQUAL(mem_win_size, sPAPRPHBState),
+        VMSTATE_UINT64_EQUAL(io_win_addr, sPAPRPHBState),
+        VMSTATE_UINT64_EQUAL(io_win_size, sPAPRPHBState),
+        VMSTATE_UINT64_EQUAL(msi_win_addr, sPAPRPHBState),
+        VMSTATE_STRUCT_ARRAY(lsi_table, sPAPRPHBState, PCI_NUM_PINS, 0,
+                             vmstate_spapr_pci_lsi, struct spapr_pci_lsi),
+        VMSTATE_STRUCT_ARRAY(msi_table, sPAPRPHBState, SPAPR_MSIX_MAX_DEVS, 0,
+                             vmstate_spapr_pci_msi, struct spapr_pci_msi),
+
+        VMSTATE_END_OF_LIST()
+    },
+};
+
 static const char *spapr_phb_root_bus_path(PCIHostState *host_bridge,
                                            PCIBus *rootbus)
 {
@@ -717,6 +765,7 @@ static void spapr_phb_class_init(ObjectClass *klass, void *data)
     sdc->init = spapr_phb_init;
     dc->props = spapr_phb_properties;
     dc->reset = spapr_phb_reset;
+    dc->vmsd = &vmstate_spapr_pci;
 }
 
 static const TypeInfo spapr_phb_info = {
diff --git a/include/hw/pci-host/spapr.h b/include/hw/pci-host/spapr.h
index 1e23dbf..93f9511 100644
--- a/include/hw/pci-host/spapr.h
+++ b/include/hw/pci-host/spapr.h
@@ -52,14 +52,14 @@ typedef struct sPAPRPHBState {
     sPAPRTCETable *tcet;
     AddressSpace iommu_as;
 
-    struct {
+    struct spapr_pci_lsi {
         uint32_t irq;
     } lsi_table[PCI_NUM_PINS];
 
-    struct {
+    struct spapr_pci_msi {
         uint32_t config_addr;
         uint32_t irq;
-        int nvec;
+        uint32_t nvec;
     } msi_table[SPAPR_MSIX_MAX_DEVS];
 
     QLIST_ENTRY(sPAPRPHBState) list;
-- 
1.8.0

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH 10/11] pseries: savevm support with KVM
  2013-07-15 15:11 [Qemu-devel] [PATCH 00/11] pseries: migration and QOM support Anthony Liguori
                   ` (8 preceding siblings ...)
  2013-07-15 15:11 ` [Qemu-devel] [PATCH 09/11] pseries: savevm support for PCI host bridge Anthony Liguori
@ 2013-07-15 15:11 ` Anthony Liguori
  2013-07-15 15:11 ` [Qemu-devel] [PATCH 11/11] xics: rename types to be sane and follow coding style Anthony Liguori
  2013-07-16 10:10 ` [Qemu-devel] [PATCH 00/11] pseries: migration and QOM support Alexey Kardashevskiy
  11 siblings, 0 replies; 25+ messages in thread
From: Anthony Liguori @ 2013-07-15 15:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: Alexey Kardashevskiy, Paolo Bonzini, qemu-ppc, David Gibson

From: Alexey Kardashevskiy <aik@ozlabs.ru>

At present, the savevm / migration support for the pseries machine will not
work when KVM is enabled.  That's because KVM manages the guest's hash page
table in the host kernel, so qemu has no visibility of it.  This patch
fixes this by using new kernel interfaces to extract and reinsert the
guest's hash table during the migration process.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/ppc/spapr.c         | 106 +++++++++++++++++++++++++++++++++++++++----------
 include/hw/ppc/spapr.h |   1 +
 target-ppc/kvm.c       |  69 ++++++++++++++++++++++++++++++++
 target-ppc/kvm_ppc.h   |  22 ++++++++++
 4 files changed, 176 insertions(+), 22 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 734a782..7144829 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -735,17 +735,27 @@ static int htab_save_setup(QEMUFile *f, void *opaque)
 {
     sPAPREnvironment *spapr = opaque;
 
-    spapr->htab_save_index = 0;
-    spapr->htab_first_pass = true;
-
     /* "Iteration" header */
     qemu_put_be32(f, spapr->htab_shift);
 
+    if (spapr->htab) {
+        spapr->htab_save_index = 0;
+        spapr->htab_first_pass = true;
+    } else {
+        assert(kvm_enabled());
+
+        spapr->htab_fd = kvmppc_get_htab_fd(false);
+        if (spapr->htab_fd < 0) {
+            fprintf(stderr, "Unable to open fd for reading hash table from KVM: %s\n",
+                    strerror(errno));
+            return -1;
+        }
+    }
+
+
     return 0;
 }
 
-#define MAX_ITERATION_NS    5000000 /* 5 ms */
-
 static void htab_save_first_pass(QEMUFile *f, sPAPREnvironment *spapr,
                                  int64_t max_ns)
 {
@@ -796,8 +806,8 @@ static void htab_save_first_pass(QEMUFile *f, sPAPREnvironment *spapr,
     spapr->htab_save_index = index;
 }
 
-static bool htab_save_later_pass(QEMUFile *f, sPAPREnvironment *spapr,
-                                 int64_t max_ns)
+static int htab_save_later_pass(QEMUFile *f, sPAPREnvironment *spapr,
+                                int64_t max_ns)
 {
     bool final = max_ns < 0;
     int htabslots = HTAB_SIZE(spapr) / HASH_PTE_SIZE_64;
@@ -870,21 +880,32 @@ static bool htab_save_later_pass(QEMUFile *f, sPAPREnvironment *spapr,
 
     spapr->htab_save_index = index;
 
-    return (examined >= htabslots) && (sent == 0);
+    return (examined >= htabslots) && (sent == 0) ? 1 : 0;
 }
 
+#define MAX_ITERATION_NS    5000000 /* 5 ms */
+#define MAX_KVM_BUF_SIZE    2048
+
 static int htab_save_iterate(QEMUFile *f, void *opaque)
 {
     sPAPREnvironment *spapr = opaque;
-    bool nothingleft = false;;
+    int rc = 0;
 
     /* Iteration header */
     qemu_put_be32(f, 0);
 
-    if (spapr->htab_first_pass) {
+    if (!spapr->htab) {
+        assert(kvm_enabled());
+
+        rc = kvmppc_save_htab(f, spapr->htab_fd,
+                              MAX_KVM_BUF_SIZE, MAX_ITERATION_NS);
+        if (rc < 0) {
+            return rc;
+        }
+    } else  if (spapr->htab_first_pass) {
         htab_save_first_pass(f, spapr, MAX_ITERATION_NS);
     } else {
-        nothingleft = htab_save_later_pass(f, spapr, MAX_ITERATION_NS);
+        rc = htab_save_later_pass(f, spapr, MAX_ITERATION_NS);
     }
 
     /* End marker */
@@ -892,7 +913,7 @@ static int htab_save_iterate(QEMUFile *f, void *opaque)
     qemu_put_be16(f, 0);
     qemu_put_be16(f, 0);
 
-    return nothingleft ? 1 : 0;
+    return rc;
 }
 
 static int htab_save_complete(QEMUFile *f, void *opaque)
@@ -902,7 +923,20 @@ static int htab_save_complete(QEMUFile *f, void *opaque)
     /* Iteration header */
     qemu_put_be32(f, 0);
 
-    htab_save_later_pass(f, spapr, -1);
+    if (!spapr->htab) {
+        int rc;
+
+        assert(kvm_enabled());
+
+        rc = kvmppc_save_htab(f, spapr->htab_fd, MAX_KVM_BUF_SIZE, -1);
+        if (rc < 0) {
+            return rc;
+        }
+        close(spapr->htab_fd);
+        spapr->htab_fd = -1;
+    } else {
+        htab_save_later_pass(f, spapr, -1);
+    }
 
     /* End marker */
     qemu_put_be32(f, 0);
@@ -916,6 +950,7 @@ static int htab_load(QEMUFile *f, void *opaque, int version_id)
 {
     sPAPREnvironment *spapr = opaque;
     uint32_t section_hdr;
+    int fd = -1;
 
     if (version_id < 1 || version_id > 1) {
         fprintf(stderr, "htab_load() bad version\n");
@@ -932,6 +967,16 @@ static int htab_load(QEMUFile *f, void *opaque, int version_id)
         return 0;
     }
 
+    if (!spapr->htab) {
+        assert(kvm_enabled());
+
+        fd = kvmppc_get_htab_fd(true);
+        if (fd < 0) {
+            fprintf(stderr, "Unable to open fd to restore KVM hash table: %s\n",
+                    strerror(errno));
+        }
+    }
+
     while (true) {
         uint32_t index;
         uint16_t n_valid, n_invalid;
@@ -945,24 +990,41 @@ static int htab_load(QEMUFile *f, void *opaque, int version_id)
             break;
         }
 
-        if ((index + n_valid + n_invalid) >=
+        if ((index + n_valid + n_invalid) >
             (HTAB_SIZE(spapr) / HASH_PTE_SIZE_64)) {
             /* Bad index in stream */
             fprintf(stderr, "htab_load() bad index %d (%hd+%hd entries) "
-                    "in htab stream\n", index, n_valid, n_invalid);
+                    "in htab stream (htab_shift=%d)\n", index, n_valid, n_invalid,
+                    spapr->htab_shift);
             return -EINVAL;
         }
 
-        if (n_valid) {
-            qemu_get_buffer(f, HPTE(spapr->htab, index),
-                            HASH_PTE_SIZE_64 * n_valid);
-        }
-        if (n_invalid) {
-            memset(HPTE(spapr->htab, index + n_valid), 0,
-                   HASH_PTE_SIZE_64 * n_invalid);
+        if (spapr->htab) {
+            if (n_valid) {
+                qemu_get_buffer(f, HPTE(spapr->htab, index),
+                                HASH_PTE_SIZE_64 * n_valid);
+            }
+            if (n_invalid) {
+                memset(HPTE(spapr->htab, index + n_valid), 0,
+                       HASH_PTE_SIZE_64 * n_invalid);
+            }
+        } else {
+            int rc;
+
+            assert(fd >= 0);
+
+            rc = kvmppc_load_htab_chunk(f, fd, index, n_valid, n_invalid);
+            if (rc < 0) {
+                return rc;
+            }
         }
     }
 
+    if (!spapr->htab) {
+        assert(fd >= 0);
+        close(fd);
+    }
+
     return 0;
 }
 
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 00a6f58..b06ce79 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -37,6 +37,7 @@ typedef struct sPAPREnvironment {
     /* Migration state */
     int htab_save_index;
     bool htab_first_pass;
+    int htab_fd;
 } sPAPREnvironment;
 
 #define H_SUCCESS         0
diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index b0099e1..8df88b7 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -65,6 +65,7 @@ static int cap_one_reg;
 static int cap_epr;
 static int cap_ppc_watchdog;
 static int cap_papr;
+static int cap_htab_fd;
 
 /* XXX We have a race condition where we actually have a level triggered
  *     interrupt, but the infrastructure can't expose that yet, so the guest
@@ -101,6 +102,7 @@ int kvm_arch_init(KVMState *s)
     cap_ppc_watchdog = kvm_check_extension(s, KVM_CAP_PPC_BOOKE_WATCHDOG);
     /* Note: we don't set cap_papr here, because this capability is
      * only activated after this by kvmppc_set_papr() */
+    cap_htab_fd = kvm_check_extension(s, KVM_CAP_PPC_HTAB_FD);
 
     if (!cap_interrupt_level) {
         fprintf(stderr, "KVM: Couldn't find level irq capability. Expect the "
@@ -1788,6 +1790,73 @@ static int kvm_ppc_register_host_cpu_type(void)
 }
 
 
+int kvmppc_get_htab_fd(bool write)
+{
+    struct kvm_get_htab_fd s = {
+        .flags = write ? KVM_GET_HTAB_WRITE : 0,
+        .start_index = 0,
+    };
+
+    if (!cap_htab_fd) {
+        fprintf(stderr, "KVM version doesn't support saving the hash table\n");
+        return -1;
+    }
+
+    return kvm_vm_ioctl(kvm_state, KVM_PPC_GET_HTAB_FD, &s);
+}
+
+int kvmppc_save_htab(QEMUFile *f, int fd, size_t bufsize, int64_t max_ns)
+{
+    int64_t starttime = qemu_get_clock_ns(rt_clock);
+    uint8_t buf[bufsize];
+    ssize_t rc;
+
+    do {
+        rc = read(fd, buf, bufsize);
+        if (rc < 0) {
+            fprintf(stderr, "Error reading data from KVM HTAB fd: %s\n",
+                    strerror(errno));
+            return rc;
+        } else if (rc) {
+            /* Kernel already retuns data in BE format for the file */
+            qemu_put_buffer(f, buf, rc);
+        }
+    } while ((rc != 0)
+             && ((max_ns < 0)
+                 || ((qemu_get_clock_ns(rt_clock) - starttime) < max_ns)));
+
+    return (rc == 0) ? 1 : 0;
+}
+
+int kvmppc_load_htab_chunk(QEMUFile *f, int fd, uint32_t index,
+                           uint16_t n_valid, uint16_t n_invalid)
+{
+    struct kvm_get_htab_header *buf;
+    size_t chunksize = sizeof(*buf) + n_valid*HASH_PTE_SIZE_64;
+    ssize_t rc;
+
+    buf = alloca(chunksize);
+    /* This is KVM on ppc, so this is all big-endian */
+    buf->index = index;
+    buf->n_valid = n_valid;
+    buf->n_invalid = n_invalid;
+
+    qemu_get_buffer(f, (void *)(buf + 1), HASH_PTE_SIZE_64*n_valid);
+
+    rc = write(fd, buf, chunksize);
+    if (rc < 0) {
+        fprintf(stderr, "Error writing KVM hash table: %s\n",
+                strerror(errno));
+        return rc;
+    }
+    if (rc != chunksize) {
+        /* We should never get a short write on a single chunk */
+        fprintf(stderr, "Short write, restoring KVM hash table\n");
+        return -1;
+    }
+    return 0;
+}
+
 bool kvm_arch_stop_on_emulation_error(CPUState *cpu)
 {
     return true;
diff --git a/target-ppc/kvm_ppc.h b/target-ppc/kvm_ppc.h
index 771cfbe..4ae7bf2 100644
--- a/target-ppc/kvm_ppc.h
+++ b/target-ppc/kvm_ppc.h
@@ -38,6 +38,10 @@ uint64_t kvmppc_rma_size(uint64_t current_size, unsigned int hash_shift);
 #endif /* !CONFIG_USER_ONLY */
 int kvmppc_fixup_cpu(PowerPCCPU *cpu);
 bool kvmppc_has_cap_epr(void);
+int kvmppc_get_htab_fd(bool write);
+int kvmppc_save_htab(QEMUFile *f, int fd, size_t bufsize, int64_t max_ns);
+int kvmppc_load_htab_chunk(QEMUFile *f, int fd, uint32_t index,
+                           uint16_t n_valid, uint16_t n_invalid);
 
 #else
 
@@ -159,6 +163,24 @@ static inline bool kvmppc_has_cap_epr(void)
 {
     return false;
 }
+
+static inline int kvmppc_get_htab_fd(bool write)
+{
+    return -1;
+}
+
+static inline int kvmppc_save_htab(QEMUFile *f, int fd, size_t bufsize,
+                                   int64_t max_ns)
+{
+    abort();
+}
+
+static inline int kvmppc_load_htab_chunk(QEMUFile *f, int fd, uint32_t index,
+                                         uint16_t n_valid, uint16_t n_invalid)
+{
+    abort();
+}
+
 #endif
 
 #ifndef CONFIG_KVM
-- 
1.8.0

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH 11/11] xics: rename types to be sane and follow coding style
  2013-07-15 15:11 [Qemu-devel] [PATCH 00/11] pseries: migration and QOM support Anthony Liguori
                   ` (9 preceding siblings ...)
  2013-07-15 15:11 ` [Qemu-devel] [PATCH 10/11] pseries: savevm support with KVM Anthony Liguori
@ 2013-07-15 15:11 ` Anthony Liguori
  2013-07-16 10:10 ` [Qemu-devel] [PATCH 00/11] pseries: migration and QOM support Alexey Kardashevskiy
  11 siblings, 0 replies; 25+ messages in thread
From: Anthony Liguori @ 2013-07-15 15:11 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alexey Kardashevskiy, Paolo Bonzini, Anthony Liguori, qemu-ppc,
	David Gibson

>From Ben:

    Basically, in HW the layout of the interrupt network is:

     - One ICP per processor thread (the "presenter"). This contains the
    registers to fetch a pending interrupt (ack), EOI, and control the
    processor priority.

     - One ICS per logical source of interrupts (ie, one per PCI host
    bridge, and a few others here or there). This contains the per-interrupt
    source configuration (target processor(s), priority, mask) and the
    per-interrupt internal state.

    Under PAPR, there is a single "virtual" ICS ... somewhat (it's a bit
    oddball what pHyp does here, arguably there are two but we can ignore
    that distinction). There is no register level access. A pair of firmware
    (RTAS) calls is used to configure each virtual interrupt.

    So our model here is somewhat the same. We have one ICS in the emulated
    XICS which arguably *is* the emulated XICS, there's no point making it a
    separate "device", that would just be gross, and each VCPU has an
    associated ICP.

Yet we call the "XICS" struct icp_state and then the ICPs
'struct icp_server_state'.  It's particularly confusing when all of the
functions have xics_prefixes yet take *icp arguments.

Rename:

  struct icp_state -> XICSState
  struct icp_server_state -> ICPState
  struct ics_state -> ICSState
  struct ics_irq_state -> ICSIRQState

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
[aik: added ics_resend() on post_load]
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
---
 hw/intc/xics.c         | 345 +++++++++++++++++++++++++++++++++----------------
 hw/ppc/spapr.c         |  28 ++++
 include/hw/ppc/spapr.h |   3 +-
 include/hw/ppc/xics.h  |  74 ++++++++++-
 4 files changed, 330 insertions(+), 120 deletions(-)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index 091912e..6b3c071 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -34,34 +34,19 @@
  * ICP: Presentation layer
  */
 
-struct icp_server_state {
-    uint32_t xirr;
-    uint8_t pending_priority;
-    uint8_t mfrr;
-    qemu_irq output;
-};
-
 #define XISR_MASK  0x00ffffff
 #define CPPR_MASK  0xff000000
 
 #define XISR(ss)   (((ss)->xirr) & XISR_MASK)
 #define CPPR(ss)   (((ss)->xirr) >> 24)
 
-struct ics_state;
-
-struct icp_state {
-    long nr_servers;
-    struct icp_server_state *ss;
-    struct ics_state *ics;
-};
-
-static void ics_reject(struct ics_state *ics, int nr);
-static void ics_resend(struct ics_state *ics);
-static void ics_eoi(struct ics_state *ics, int nr);
+static void ics_reject(ICSState *ics, int nr);
+static void ics_resend(ICSState *ics);
+static void ics_eoi(ICSState *ics, int nr);
 
-static void icp_check_ipi(struct icp_state *icp, int server)
+static void icp_check_ipi(XICSState *icp, int server)
 {
-    struct icp_server_state *ss = icp->ss + server;
+    ICPState *ss = icp->ss + server;
 
     if (XISR(ss) && (ss->pending_priority <= ss->mfrr)) {
         return;
@@ -78,9 +63,9 @@ static void icp_check_ipi(struct icp_state *icp, int server)
     qemu_irq_raise(ss->output);
 }
 
-static void icp_resend(struct icp_state *icp, int server)
+static void icp_resend(XICSState *icp, int server)
 {
-    struct icp_server_state *ss = icp->ss + server;
+    ICPState *ss = icp->ss + server;
 
     if (ss->mfrr < CPPR(ss)) {
         icp_check_ipi(icp, server);
@@ -88,9 +73,9 @@ static void icp_resend(struct icp_state *icp, int server)
     ics_resend(icp->ics);
 }
 
-static void icp_set_cppr(struct icp_state *icp, int server, uint8_t cppr)
+static void icp_set_cppr(XICSState *icp, int server, uint8_t cppr)
 {
-    struct icp_server_state *ss = icp->ss + server;
+    ICPState *ss = icp->ss + server;
     uint8_t old_cppr;
     uint32_t old_xisr;
 
@@ -112,9 +97,9 @@ static void icp_set_cppr(struct icp_state *icp, int server, uint8_t cppr)
     }
 }
 
-static void icp_set_mfrr(struct icp_state *icp, int server, uint8_t mfrr)
+static void icp_set_mfrr(XICSState *icp, int server, uint8_t mfrr)
 {
-    struct icp_server_state *ss = icp->ss + server;
+    ICPState *ss = icp->ss + server;
 
     ss->mfrr = mfrr;
     if (mfrr < CPPR(ss)) {
@@ -122,7 +107,7 @@ static void icp_set_mfrr(struct icp_state *icp, int server, uint8_t mfrr)
     }
 }
 
-static uint32_t icp_accept(struct icp_server_state *ss)
+static uint32_t icp_accept(ICPState *ss)
 {
     uint32_t xirr = ss->xirr;
 
@@ -135,9 +120,9 @@ static uint32_t icp_accept(struct icp_server_state *ss)
     return xirr;
 }
 
-static void icp_eoi(struct icp_state *icp, int server, uint32_t xirr)
+static void icp_eoi(XICSState *icp, int server, uint32_t xirr)
 {
-    struct icp_server_state *ss = icp->ss + server;
+    ICPState *ss = icp->ss + server;
 
     /* Send EOI -> ICS */
     ss->xirr = (ss->xirr & ~CPPR_MASK) | (xirr & CPPR_MASK);
@@ -148,9 +133,9 @@ static void icp_eoi(struct icp_state *icp, int server, uint32_t xirr)
     }
 }
 
-static void icp_irq(struct icp_state *icp, int server, int nr, uint8_t priority)
+static void icp_irq(XICSState *icp, int server, int nr, uint8_t priority)
 {
-    struct icp_server_state *ss = icp->ss + server;
+    ICPState *ss = icp->ss + server;
 
     trace_xics_icp_irq(server, nr, priority);
 
@@ -168,39 +153,59 @@ static void icp_irq(struct icp_state *icp, int server, int nr, uint8_t priority)
     }
 }
 
-/*
- * ICS: Source layer
- */
-
-struct ics_irq_state {
-    int server;
-    uint8_t priority;
-    uint8_t saved_priority;
-#define XICS_STATUS_ASSERTED           0x1
-#define XICS_STATUS_SENT               0x2
-#define XICS_STATUS_REJECTED           0x4
-#define XICS_STATUS_MASKED_PENDING     0x8
-    uint8_t status;
+static const VMStateDescription vmstate_icp_server = {
+    .name = "icp/server",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .minimum_version_id_old = 1,
+    .fields      = (VMStateField []) {
+        /* Sanity check */
+        VMSTATE_UINT32(xirr, ICPState),
+        VMSTATE_UINT8(pending_priority, ICPState),
+        VMSTATE_UINT8(mfrr, ICPState),
+        VMSTATE_END_OF_LIST()
+    },
 };
 
-struct ics_state {
-    int nr_irqs;
-    int offset;
-    qemu_irq *qirqs;
-    bool *islsi;
-    struct ics_irq_state *irqs;
-    struct icp_state *icp;
+static void icp_reset(DeviceState *dev)
+{
+    ICPState *icp = ICP(dev);
+
+    icp->xirr = 0;
+    icp->pending_priority = 0xff;
+    icp->mfrr = 0xff;
+
+    /* Make all outputs are deasserted */
+    qemu_set_irq(icp->output, 0);
+}
+
+static void icp_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+
+    dc->reset = icp_reset;
+    dc->vmsd = &vmstate_icp_server;
+}
+
+static TypeInfo icp_info = {
+    .name = TYPE_ICP,
+    .parent = TYPE_DEVICE,
+    .instance_size = sizeof(ICPState),
+    .class_init = icp_class_init,
 };
 
-static int ics_valid_irq(struct ics_state *ics, uint32_t nr)
+/*
+ * ICS: Source layer
+ */
+static int ics_valid_irq(ICSState *ics, uint32_t nr)
 {
     return (nr >= ics->offset)
         && (nr < (ics->offset + ics->nr_irqs));
 }
 
-static void resend_msi(struct ics_state *ics, int srcno)
+static void resend_msi(ICSState *ics, int srcno)
 {
-    struct ics_irq_state *irq = ics->irqs + srcno;
+    ICSIRQState *irq = ics->irqs + srcno;
 
     /* FIXME: filter by server#? */
     if (irq->status & XICS_STATUS_REJECTED) {
@@ -212,9 +217,9 @@ static void resend_msi(struct ics_state *ics, int srcno)
     }
 }
 
-static void resend_lsi(struct ics_state *ics, int srcno)
+static void resend_lsi(ICSState *ics, int srcno)
 {
-    struct ics_irq_state *irq = ics->irqs + srcno;
+    ICSIRQState *irq = ics->irqs + srcno;
 
     if ((irq->priority != 0xff)
         && (irq->status & XICS_STATUS_ASSERTED)
@@ -224,9 +229,9 @@ static void resend_lsi(struct ics_state *ics, int srcno)
     }
 }
 
-static void set_irq_msi(struct ics_state *ics, int srcno, int val)
+static void set_irq_msi(ICSState *ics, int srcno, int val)
 {
-    struct ics_irq_state *irq = ics->irqs + srcno;
+    ICSIRQState *irq = ics->irqs + srcno;
 
     trace_xics_set_irq_msi(srcno, srcno + ics->offset);
 
@@ -240,9 +245,9 @@ static void set_irq_msi(struct ics_state *ics, int srcno, int val)
     }
 }
 
-static void set_irq_lsi(struct ics_state *ics, int srcno, int val)
+static void set_irq_lsi(ICSState *ics, int srcno, int val)
 {
-    struct ics_irq_state *irq = ics->irqs + srcno;
+    ICSIRQState *irq = ics->irqs + srcno;
 
     trace_xics_set_irq_lsi(srcno, srcno + ics->offset);
     if (val) {
@@ -255,7 +260,7 @@ static void set_irq_lsi(struct ics_state *ics, int srcno, int val)
 
 static void ics_set_irq(void *opaque, int srcno, int val)
 {
-    struct ics_state *ics = (struct ics_state *)opaque;
+    ICSState *ics = (ICSState *)opaque;
 
     if (ics->islsi[srcno]) {
         set_irq_lsi(ics, srcno, val);
@@ -264,9 +269,9 @@ static void ics_set_irq(void *opaque, int srcno, int val)
     }
 }
 
-static void write_xive_msi(struct ics_state *ics, int srcno)
+static void write_xive_msi(ICSState *ics, int srcno)
 {
-    struct ics_irq_state *irq = ics->irqs + srcno;
+    ICSIRQState *irq = ics->irqs + srcno;
 
     if (!(irq->status & XICS_STATUS_MASKED_PENDING)
         || (irq->priority == 0xff)) {
@@ -277,16 +282,16 @@ static void write_xive_msi(struct ics_state *ics, int srcno)
     icp_irq(ics->icp, irq->server, srcno + ics->offset, irq->priority);
 }
 
-static void write_xive_lsi(struct ics_state *ics, int srcno)
+static void write_xive_lsi(ICSState *ics, int srcno)
 {
     resend_lsi(ics, srcno);
 }
 
-static void ics_write_xive(struct ics_state *ics, int nr, int server,
+static void ics_write_xive(ICSState *ics, int nr, int server,
                            uint8_t priority, uint8_t saved_priority)
 {
     int srcno = nr - ics->offset;
-    struct ics_irq_state *irq = ics->irqs + srcno;
+    ICSIRQState *irq = ics->irqs + srcno;
 
     irq->server = server;
     irq->priority = priority;
@@ -301,16 +306,16 @@ static void ics_write_xive(struct ics_state *ics, int nr, int server,
     }
 }
 
-static void ics_reject(struct ics_state *ics, int nr)
+static void ics_reject(ICSState *ics, int nr)
 {
-    struct ics_irq_state *irq = ics->irqs + nr - ics->offset;
+    ICSIRQState *irq = ics->irqs + nr - ics->offset;
 
     trace_xics_ics_reject(nr, nr - ics->offset);
     irq->status |= XICS_STATUS_REJECTED; /* Irrelevant but harmless for LSI */
     irq->status &= ~XICS_STATUS_SENT; /* Irrelevant but harmless for MSI */
 }
 
-static void ics_resend(struct ics_state *ics)
+static void ics_resend(ICSState *ics)
 {
     int i;
 
@@ -324,10 +329,10 @@ static void ics_resend(struct ics_state *ics)
     }
 }
 
-static void ics_eoi(struct ics_state *ics, int nr)
+static void ics_eoi(ICSState *ics, int nr)
 {
     int srcno = nr - ics->offset;
-    struct ics_irq_state *irq = ics->irqs + srcno;
+    ICSIRQState *irq = ics->irqs + srcno;
 
     trace_xics_ics_eoi(nr);
 
@@ -336,11 +341,92 @@ static void ics_eoi(struct ics_state *ics, int nr)
     }
 }
 
+static void ics_reset(DeviceState *dev)
+{
+    ICSState *ics = ICS(dev);
+    int i;
+
+    memset(ics->irqs, 0, sizeof(ICSIRQState) * ics->nr_irqs);
+    for (i = 0; i < ics->nr_irqs; i++) {
+        ics->irqs[i].priority = 0xff;
+        ics->irqs[i].saved_priority = 0xff;
+    }
+}
+
+static int ics_post_load(void *opaque, int version_id)
+{
+    int i;
+    ICSState *ics = opaque;
+
+    for (i = 0; i < ics->icp->nr_servers; i++) {
+        icp_resend(ics->icp, i);
+    }
+
+    return 0;
+}
+
+static const VMStateDescription vmstate_ics_irq = {
+    .name = "ics/irq",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .minimum_version_id_old = 1,
+    .fields      = (VMStateField []) {
+        VMSTATE_UINT32(server, ICSIRQState),
+        VMSTATE_UINT8(priority, ICSIRQState),
+        VMSTATE_UINT8(saved_priority, ICSIRQState),
+        VMSTATE_UINT8(status, ICSIRQState),
+        VMSTATE_END_OF_LIST()
+    },
+};
+
+static const VMStateDescription vmstate_ics = {
+    .name = "ics",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .minimum_version_id_old = 1,
+    .post_load = ics_post_load,
+    .fields      = (VMStateField []) {
+        /* Sanity check */
+        VMSTATE_UINT32_EQUAL(nr_irqs, ICSState),
+
+        VMSTATE_STRUCT_VARRAY_POINTER_UINT32(irqs, ICSState, nr_irqs,
+                                             vmstate_ics_irq, ICSIRQState),
+        VMSTATE_END_OF_LIST()
+    },
+};
+
+static int ics_realize(DeviceState *dev)
+{
+    ICSState *ics = ICS(dev);
+
+    ics->irqs = g_malloc0(ics->nr_irqs * sizeof(ICSIRQState));
+    ics->islsi = g_malloc0(ics->nr_irqs * sizeof(bool));
+    ics->qirqs = qemu_allocate_irqs(ics_set_irq, ics, ics->nr_irqs);
+
+    return 0;
+}
+
+static void ics_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+
+    dc->init = ics_realize;
+    dc->vmsd = &vmstate_ics;
+    dc->reset = ics_reset;
+}
+
+static TypeInfo ics_info = {
+    .name = TYPE_ICS,
+    .parent = TYPE_DEVICE,
+    .instance_size = sizeof(ICSState),
+    .class_init = ics_class_init,
+};
+
 /*
  * Exported functions
  */
 
-qemu_irq xics_get_qirq(struct icp_state *icp, int irq)
+qemu_irq xics_get_qirq(XICSState *icp, int irq)
 {
     if (!ics_valid_irq(icp->ics, irq)) {
         return NULL;
@@ -349,13 +435,17 @@ qemu_irq xics_get_qirq(struct icp_state *icp, int irq)
     return icp->ics->qirqs[irq - icp->ics->offset];
 }
 
-void xics_set_irq_type(struct icp_state *icp, int irq, bool lsi)
+void xics_set_irq_type(XICSState *icp, int irq, bool lsi)
 {
     assert(ics_valid_irq(icp->ics, irq));
 
     icp->ics->islsi[irq - icp->ics->offset] = lsi;
 }
 
+/*
+ * Guest interfaces
+ */
+
 static target_ulong h_cppr(PowerPCCPU *cpu, sPAPREnvironment *spapr,
                            target_ulong opcode, target_ulong *args)
 {
@@ -405,7 +495,7 @@ static void rtas_set_xive(PowerPCCPU *cpu, sPAPREnvironment *spapr,
                           uint32_t nargs, target_ulong args,
                           uint32_t nret, target_ulong rets)
 {
-    struct ics_state *ics = spapr->icp->ics;
+    ICSState *ics = spapr->icp->ics;
     uint32_t nr, server, priority;
 
     if ((nargs != 3) || (nret != 1)) {
@@ -433,7 +523,7 @@ static void rtas_get_xive(PowerPCCPU *cpu, sPAPREnvironment *spapr,
                           uint32_t nargs, target_ulong args,
                           uint32_t nret, target_ulong rets)
 {
-    struct ics_state *ics = spapr->icp->ics;
+    ICSState *ics = spapr->icp->ics;
     uint32_t nr;
 
     if ((nargs != 1) || (nret != 3)) {
@@ -458,7 +548,7 @@ static void rtas_int_off(PowerPCCPU *cpu, sPAPREnvironment *spapr,
                          uint32_t nargs, target_ulong args,
                          uint32_t nret, target_ulong rets)
 {
-    struct ics_state *ics = spapr->icp->ics;
+    ICSState *ics = spapr->icp->ics;
     uint32_t nr;
 
     if ((nargs != 1) || (nret != 1)) {
@@ -484,7 +574,7 @@ static void rtas_int_on(PowerPCCPU *cpu, sPAPREnvironment *spapr,
                         uint32_t nargs, target_ulong args,
                         uint32_t nret, target_ulong rets)
 {
-    struct ics_state *ics = spapr->icp->ics;
+    ICSState *ics = spapr->icp->ics;
     uint32_t nr;
 
     if ((nargs != 1) || (nret != 1)) {
@@ -506,32 +596,27 @@ static void rtas_int_on(PowerPCCPU *cpu, sPAPREnvironment *spapr,
     rtas_st(rets, 0, 0); /* Success */
 }
 
-static void xics_reset(void *opaque)
+/*
+ * XICS
+ */
+
+static void xics_reset(DeviceState *d)
 {
-    struct icp_state *icp = (struct icp_state *)opaque;
-    struct ics_state *ics = icp->ics;
+    XICSState *icp = XICS(d);
     int i;
 
     for (i = 0; i < icp->nr_servers; i++) {
-        icp->ss[i].xirr = 0;
-        icp->ss[i].pending_priority = 0xff;
-        icp->ss[i].mfrr = 0xff;
-        /* Make all outputs are deasserted */
-        qemu_set_irq(icp->ss[i].output, 0);
+        device_reset(DEVICE(&icp->ss[i]));
     }
 
-    memset(ics->irqs, 0, sizeof(struct ics_irq_state) * ics->nr_irqs);
-    for (i = 0; i < ics->nr_irqs; i++) {
-        ics->irqs[i].priority = 0xff;
-        ics->irqs[i].saved_priority = 0xff;
-    }
+    device_reset(DEVICE(icp->ics));
 }
 
-void xics_cpu_setup(struct icp_state *icp, PowerPCCPU *cpu)
+void xics_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
 {
     CPUState *cs = CPU(cpu);
     CPUPPCState *env = &cpu->env;
-    struct icp_server_state *ss = &icp->ss[cs->cpu_index];
+    ICPState *ss = &icp->ss[cs->cpu_index];
 
     assert(cs->cpu_index < icp->nr_servers);
 
@@ -551,37 +636,73 @@ void xics_cpu_setup(struct icp_state *icp, PowerPCCPU *cpu)
     }
 }
 
-struct icp_state *xics_system_init(int nr_servers, int nr_irqs)
+static void xics_realize(DeviceState *dev, Error **errp)
 {
-    struct icp_state *icp;
-    struct ics_state *ics;
-
-    icp = g_malloc0(sizeof(*icp));
-    icp->nr_servers = nr_servers;
-    icp->ss = g_malloc0(icp->nr_servers*sizeof(struct icp_server_state));
+    XICSState *icp = XICS(dev);
+    ICSState *ics = icp->ics;
+    int i;
 
-    ics = g_malloc0(sizeof(*ics));
-    ics->nr_irqs = nr_irqs;
+    ics->nr_irqs = icp->nr_irqs;
     ics->offset = XICS_IRQ_BASE;
-    ics->irqs = g_malloc0(nr_irqs * sizeof(struct ics_irq_state));
-    ics->islsi = g_malloc0(nr_irqs * sizeof(bool));
-
-    icp->ics = ics;
     ics->icp = icp;
+    qdev_init_nofail(DEVICE(ics));
 
-    ics->qirqs = qemu_allocate_irqs(ics_set_irq, ics, nr_irqs);
+    icp->ss = g_malloc0(icp->nr_servers*sizeof(ICPState));
+    for (i = 0; i < icp->nr_servers; i++) {
+        char buffer[32];
+        object_initialize(&icp->ss[i], TYPE_ICP);
+        snprintf(buffer, sizeof(buffer), "icp[%d]", i);
+        object_property_add_child(OBJECT(icp), buffer, OBJECT(&icp->ss[i]), NULL);
+        qdev_init_nofail(DEVICE(&icp->ss[i]));
+    }
+}
 
-    spapr_register_hypercall(H_CPPR, h_cppr);
-    spapr_register_hypercall(H_IPI, h_ipi);
-    spapr_register_hypercall(H_XIRR, h_xirr);
-    spapr_register_hypercall(H_EOI, h_eoi);
+static void xics_initfn(Object *obj)
+{
+    XICSState *xics = XICS(obj);
+
+    xics->ics = ICS(object_new(TYPE_ICS));
+    object_property_add_child(obj, "ics", OBJECT(xics->ics), NULL);
+}
+
+static Property xics_properties[] = {
+    DEFINE_PROP_UINT32("nr_servers", XICSState, nr_servers, -1),
+    DEFINE_PROP_UINT32("nr_irqs", XICSState, nr_irqs, -1),
+    DEFINE_PROP_END_OF_LIST(),
+};
+
+static void xics_class_init(ObjectClass *oc, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(oc);
+
+    dc->realize = xics_realize;
+    dc->props = xics_properties;
+    dc->reset = xics_reset;
 
     spapr_rtas_register("ibm,set-xive", rtas_set_xive);
     spapr_rtas_register("ibm,get-xive", rtas_get_xive);
     spapr_rtas_register("ibm,int-off", rtas_int_off);
     spapr_rtas_register("ibm,int-on", rtas_int_on);
 
-    qemu_register_reset(xics_reset, icp);
+    spapr_register_hypercall(H_CPPR, h_cppr);
+    spapr_register_hypercall(H_IPI, h_ipi);
+    spapr_register_hypercall(H_XIRR, h_xirr);
+    spapr_register_hypercall(H_EOI, h_eoi);
+}
+
+static const TypeInfo xics_info = {
+    .name          = TYPE_XICS,
+    .parent        = TYPE_SYS_BUS_DEVICE,
+    .instance_size = sizeof(XICSState),
+    .class_init    = xics_class_init,
+    .instance_init = xics_initfn,
+};
 
-    return icp;
+static void xics_register_types(void)
+{
+    type_register_static(&xics_info);
+    type_register_static(&ics_info);
+    type_register_static(&icp_info);
 }
+
+type_init(xics_register_types)
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 7144829..16bfab9 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -129,6 +129,34 @@ int spapr_allocate_irq_block(int num, bool lsi)
     return first;
 }
 
+static XICSState *try_create_xics(const char *type, int nr_servers,
+                                  int nr_irqs)
+{
+    DeviceState *dev;
+
+    dev = qdev_create(NULL, type);
+    qdev_prop_set_uint32(dev, "nr_servers", nr_servers);
+    qdev_prop_set_uint32(dev, "nr_irqs", nr_irqs);
+    if (qdev_init(dev) < 0) {
+        return NULL;
+    }
+
+    return XICS(dev);
+}
+
+static XICSState *xics_system_init(int nr_servers, int nr_irqs)
+{
+    XICSState *icp = NULL;
+
+    icp = try_create_xics(TYPE_XICS, nr_servers, nr_irqs);
+    if (!icp) {
+        perror("Failed to create XICS\n");
+        abort();
+    }
+
+    return icp;
+}
+
 static int spapr_fixup_cpu_dt(void *fdt, sPAPREnvironment *spapr)
 {
     int ret = 0, offset;
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index b06ce79..9fc1972 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -7,7 +7,6 @@
 struct VIOsPAPRBus;
 struct sPAPRPHBState;
 struct sPAPRNVRAM;
-struct icp_state;
 
 #define HPTE64_V_HPTE_DIRTY     0x0000000000000040ULL
 
@@ -15,7 +14,7 @@ typedef struct sPAPREnvironment {
     struct VIOsPAPRBus *vio_bus;
     QLIST_HEAD(, sPAPRPHBState) phbs;
     struct sPAPRNVRAM *nvram;
-    struct icp_state *icp;
+    XICSState *icp;
 
     hwaddr ram_limit;
     void *htab;
diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
index 6bce042..66364c5 100644
--- a/include/hw/ppc/xics.h
+++ b/include/hw/ppc/xics.h
@@ -27,15 +27,77 @@
 #if !defined(__XICS_H__)
 #define __XICS_H__
 
+#include "hw/sysbus.h"
+
+#define TYPE_XICS "xics"
+#define XICS(obj) OBJECT_CHECK(XICSState, (obj), TYPE_XICS)
+
 #define XICS_IPI        0x2
-#define XICS_IRQ_BASE   0x10
+#define XICS_BUID       0x1
+#define XICS_IRQ_BASE   (XICS_BUID << 12)
+
+/*
+ * We currently only support one BUID which is our interrupt base
+ * (the kernel implementation supports more but we don't exploit
+ *  that yet)
+ */
+typedef struct XICSState XICSState;
+typedef struct ICPState ICPState;
+typedef struct ICSState ICSState;
+typedef struct ICSIRQState ICSIRQState;
+
+struct XICSState {
+    /*< private >*/
+    SysBusDevice parent_obj;
+    /*< public >*/
+    uint32_t nr_servers;
+    uint32_t nr_irqs;
+    ICPState *ss;
+    ICSState *ics;
+};
+
+#define TYPE_ICP "icp"
+#define ICP(obj) OBJECT_CHECK(ICPState, (obj), TYPE_ICP)
+
+struct ICPState {
+    /*< private >*/
+    DeviceState parent_obj;
+    /*< public >*/
+    uint32_t xirr;
+    uint8_t pending_priority;
+    uint8_t mfrr;
+    qemu_irq output;
+};
+
+#define TYPE_ICS "ics"
+#define ICS(obj) OBJECT_CHECK(ICSState, (obj), TYPE_ICS)
+
+struct ICSState {
+    /*< private >*/
+    DeviceState parent_obj;
+    /*< public >*/
+    uint32_t nr_irqs;
+    uint32_t offset;
+    qemu_irq *qirqs;
+    bool *islsi;
+    ICSIRQState *irqs;
+    XICSState *icp;
+};
 
-struct icp_state;
+struct ICSIRQState {
+    uint32_t server;
+    uint8_t priority;
+    uint8_t saved_priority;
+#define XICS_STATUS_ASSERTED           0x1
+#define XICS_STATUS_SENT               0x2
+#define XICS_STATUS_REJECTED           0x4
+#define XICS_STATUS_MASKED_PENDING     0x8
+    uint8_t status;
+};
 
-qemu_irq xics_get_qirq(struct icp_state *icp, int irq);
-void xics_set_irq_type(struct icp_state *icp, int irq, bool lsi);
+qemu_irq xics_get_qirq(XICSState *icp, int irq);
+void xics_set_irq_type(XICSState *icp, int irq, bool lsi);
 
-struct icp_state *xics_system_init(int nr_servers, int nr_irqs);
-void xics_cpu_setup(struct icp_state *icp, PowerPCCPU *cpu);
+void xics_cpu_setup(XICSState *icp, PowerPCCPU *cpu);
 
 #endif /* __XICS_H__ */
-- 
1.8.0

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH] ppc kvm: fix to compile
  2013-07-15 15:11 ` [Qemu-devel] [PATCH 05/11] spapr-tce: make sPAPRTCETable a proper device Anthony Liguori
@ 2013-07-16 10:00   ` Alexey Kardashevskiy
  2013-07-16 12:32     ` Anthony Liguori
  2013-07-16 23:12     ` Alexander Graf
  2013-07-17  7:01   ` [Qemu-devel] [PATCH 05/11] spapr-tce: make sPAPRTCETable a proper device Alexey Kardashevskiy
  1 sibling, 2 replies; 25+ messages in thread
From: Alexey Kardashevskiy @ 2013-07-16 10:00 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alexey Kardashevskiy, Anthony Liguori, qemu-ppc, Alexander Graf

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
 target-ppc/kvm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index 61f2737..606bdb9 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -1628,7 +1628,7 @@ void *kvmppc_create_spapr_tce(uint32_t liobn, uint32_t window_size, int *pfd)
         return NULL;
     }
 
-    len = (window_size / SPAPR_TCE_PAGE_SIZE) * sizeof(sPAPRTCE);
+    len = (window_size / SPAPR_TCE_PAGE_SIZE) * sizeof(uint64_t);
     /* FIXME: round this up to page size */
 
     table = mmap(NULL, len, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
@@ -1651,7 +1651,7 @@ int kvmppc_remove_spapr_tce(void *table, int fd, uint32_t window_size)
         return -1;
     }
 
-    len = (window_size / SPAPR_TCE_PAGE_SIZE)*sizeof(sPAPRTCE);
+    len = (window_size / SPAPR_TCE_PAGE_SIZE)*sizeof(uint64_t);
     if ((munmap(table, len) < 0) ||
         (close(fd) < 0)) {
         fprintf(stderr, "KVM: Unexpected error removing TCE table: %s",
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH 00/11] pseries: migration and QOM support
  2013-07-15 15:11 [Qemu-devel] [PATCH 00/11] pseries: migration and QOM support Anthony Liguori
                   ` (10 preceding siblings ...)
  2013-07-15 15:11 ` [Qemu-devel] [PATCH 11/11] xics: rename types to be sane and follow coding style Anthony Liguori
@ 2013-07-16 10:10 ` Alexey Kardashevskiy
  2013-07-16 12:33   ` Anthony Liguori
  11 siblings, 1 reply; 25+ messages in thread
From: Alexey Kardashevskiy @ 2013-07-16 10:10 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Paolo Bonzini, qemu-ppc, qemu-devel, David Gibson

On 07/16/2013 01:11 AM, Anthony Liguori wrote:
> This series is based on Alexey's series:
> 
>   spapr: migration, pci, msi, power8
> 
> Which in turn was based on work by David Gibson.
> 
> I've removed the bits not related to migration and made the
> following changes:
> 
>  1) QOMify TCE tables and XICS
> 
>  2) Do everything in terms of VMStateDescriptions
> 
>  3) Fix endianness problem with TCE table translation
>     a) Drop the VMSTATE_DIVIDE thing in the process
> 
> I've tested this with a TCG pseries guest on an x86_64 host.


It did not compile (fixed, patch is posted) and it fails to migrate with
enabled KVM. Don't know why yet, will have a look tomorrow.


-- 
Alexey

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH] ppc kvm: fix to compile
  2013-07-16 10:00   ` [Qemu-devel] [PATCH] ppc kvm: fix to compile Alexey Kardashevskiy
@ 2013-07-16 12:32     ` Anthony Liguori
  2013-07-16 23:12     ` Alexander Graf
  1 sibling, 0 replies; 25+ messages in thread
From: Anthony Liguori @ 2013-07-16 12:32 UTC (permalink / raw)
  To: Alexey Kardashevskiy, qemu-devel; +Cc: qemu-ppc, Alexander Graf

Alexey Kardashevskiy <aik@ozlabs.ru> writes:

> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
>  target-ppc/kvm.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
> index 61f2737..606bdb9 100644
> --- a/target-ppc/kvm.c
> +++ b/target-ppc/kvm.c
> @@ -1628,7 +1628,7 @@ void *kvmppc_create_spapr_tce(uint32_t liobn, uint32_t window_size, int *pfd)
>          return NULL;
>      }
>  
> -    len = (window_size / SPAPR_TCE_PAGE_SIZE) * sizeof(sPAPRTCE);
> +    len = (window_size / SPAPR_TCE_PAGE_SIZE) * sizeof(uint64_t);
>      /* FIXME: round this up to page size */
>  
>      table = mmap(NULL, len, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
> @@ -1651,7 +1651,7 @@ int kvmppc_remove_spapr_tce(void *table, int fd, uint32_t window_size)
>          return -1;
>      }
>  
> -    len = (window_size / SPAPR_TCE_PAGE_SIZE)*sizeof(sPAPRTCE);
> +    len = (window_size / SPAPR_TCE_PAGE_SIZE)*sizeof(uint64_t);
>      if ((munmap(table, len) < 0) ||
>          (close(fd) < 0)) {
>          fprintf(stderr, "KVM: Unexpected error removing TCE table:
> %s",

Thanks, I'm running on x86 so this code didn't get built for me.

Regards,

Anthony Liguori

> -- 
> 1.8.3.2

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH 00/11] pseries: migration and QOM support
  2013-07-16 10:10 ` [Qemu-devel] [PATCH 00/11] pseries: migration and QOM support Alexey Kardashevskiy
@ 2013-07-16 12:33   ` Anthony Liguori
  2013-07-16 12:35     ` Alexey Kardashevskiy
  0 siblings, 1 reply; 25+ messages in thread
From: Anthony Liguori @ 2013-07-16 12:33 UTC (permalink / raw)
  To: Alexey Kardashevskiy; +Cc: Paolo Bonzini, qemu-ppc, qemu-devel, David Gibson

Alexey Kardashevskiy <aik@ozlabs.ru> writes:

> On 07/16/2013 01:11 AM, Anthony Liguori wrote:
>> This series is based on Alexey's series:
>> 
>>   spapr: migration, pci, msi, power8
>> 
>> Which in turn was based on work by David Gibson.
>> 
>> I've removed the bits not related to migration and made the
>> following changes:
>> 
>>  1) QOMify TCE tables and XICS
>> 
>>  2) Do everything in terms of VMStateDescriptions
>> 
>>  3) Fix endianness problem with TCE table translation
>>     a) Drop the VMSTATE_DIVIDE thing in the process
>> 
>> I've tested this with a TCG pseries guest on an x86_64 host.
>
>
> It did not compile (fixed, patch is posted) and it fails to migrate with
> enabled KVM.

With in-kernel XICS?  That's not in this series..

Regards,

Anthony Liguori


> Don't know why yet, will have a look tomorrow.
>
>
> -- 
> Alexey

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH 00/11] pseries: migration and QOM support
  2013-07-16 12:33   ` Anthony Liguori
@ 2013-07-16 12:35     ` Alexey Kardashevskiy
  2013-07-16 12:48       ` Alexey Kardashevskiy
  0 siblings, 1 reply; 25+ messages in thread
From: Alexey Kardashevskiy @ 2013-07-16 12:35 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Paolo Bonzini, qemu-ppc, qemu-devel, David Gibson

On 07/16/2013 10:33 PM, Anthony Liguori wrote:
> Alexey Kardashevskiy <aik@ozlabs.ru> writes:
> 
>> On 07/16/2013 01:11 AM, Anthony Liguori wrote:
>>> This series is based on Alexey's series:
>>>
>>>   spapr: migration, pci, msi, power8
>>>
>>> Which in turn was based on work by David Gibson.
>>>
>>> I've removed the bits not related to migration and made the
>>> following changes:
>>>
>>>  1) QOMify TCE tables and XICS
>>>
>>>  2) Do everything in terms of VMStateDescriptions
>>>
>>>  3) Fix endianness problem with TCE table translation
>>>     a) Drop the VMSTATE_DIVIDE thing in the process
>>>
>>> I've tested this with a TCG pseries guest on an x86_64 host.
>>
>>
>> It did not compile (fixed, patch is posted) and it fails to migrate with
>> enabled KVM.
> 
> With in-kernel XICS?  That's not in this series..

No, as is, without any of my patches. I suspect rather HPTE than XICS though.


-- 
Alexey

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH 00/11] pseries: migration and QOM support
  2013-07-16 12:35     ` Alexey Kardashevskiy
@ 2013-07-16 12:48       ` Alexey Kardashevskiy
  2013-07-16 13:24         ` Alexey Kardashevskiy
  0 siblings, 1 reply; 25+ messages in thread
From: Alexey Kardashevskiy @ 2013-07-16 12:48 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Paolo Bonzini, qemu-ppc, qemu-devel, David Gibson

On 07/16/2013 10:35 PM, Alexey Kardashevskiy wrote:
> On 07/16/2013 10:33 PM, Anthony Liguori wrote:
>> Alexey Kardashevskiy <aik@ozlabs.ru> writes:
>>
>>> On 07/16/2013 01:11 AM, Anthony Liguori wrote:
>>>> This series is based on Alexey's series:
>>>>
>>>>   spapr: migration, pci, msi, power8
>>>>
>>>> Which in turn was based on work by David Gibson.
>>>>
>>>> I've removed the bits not related to migration and made the
>>>> following changes:
>>>>
>>>>  1) QOMify TCE tables and XICS
>>>>
>>>>  2) Do everything in terms of VMStateDescriptions
>>>>
>>>>  3) Fix endianness problem with TCE table translation
>>>>     a) Drop the VMSTATE_DIVIDE thing in the process
>>>>
>>>> I've tested this with a TCG pseries guest on an x86_64 host.
>>>
>>>
>>> It did not compile (fixed, patch is posted) and it fails to migrate with
>>> enabled KVM.
>>
>> With in-kernel XICS?  That's not in this series..
> 
> No, as is, without any of my patches. I suspect rather HPTE than XICS though.

I was wrong. vmstate_spapr_tce_table is broken :)



-- 
Alexey

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH 00/11] pseries: migration and QOM support
  2013-07-16 12:48       ` Alexey Kardashevskiy
@ 2013-07-16 13:24         ` Alexey Kardashevskiy
  2013-07-16 14:12           ` Anthony Liguori
  0 siblings, 1 reply; 25+ messages in thread
From: Alexey Kardashevskiy @ 2013-07-16 13:24 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Paolo Bonzini, qemu-ppc, qemu-devel, David Gibson

On 07/16/2013 10:48 PM, Alexey Kardashevskiy wrote:
> On 07/16/2013 10:35 PM, Alexey Kardashevskiy wrote:
>> On 07/16/2013 10:33 PM, Anthony Liguori wrote:
>>> Alexey Kardashevskiy <aik@ozlabs.ru> writes:
>>>
>>>> On 07/16/2013 01:11 AM, Anthony Liguori wrote:
>>>>> This series is based on Alexey's series:
>>>>>
>>>>>   spapr: migration, pci, msi, power8
>>>>>
>>>>> Which in turn was based on work by David Gibson.
>>>>>
>>>>> I've removed the bits not related to migration and made the
>>>>> following changes:
>>>>>
>>>>>  1) QOMify TCE tables and XICS
>>>>>
>>>>>  2) Do everything in terms of VMStateDescriptions
>>>>>
>>>>>  3) Fix endianness problem with TCE table translation
>>>>>     a) Drop the VMSTATE_DIVIDE thing in the process
>>>>>
>>>>> I've tested this with a TCG pseries guest on an x86_64 host.
>>>>
>>>>
>>>> It did not compile (fixed, patch is posted) and it fails to migrate with
>>>> enabled KVM.
>>>
>>> With in-kernel XICS?  That's not in this series..
>>
>> No, as is, without any of my patches. I suspect rather HPTE than XICS though.
> 
> I was wrong. vmstate_spapr_tce_table is broken :)

Here it is:

diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c
index 709cc34..3d4a1fc 100644
--- a/hw/ppc/spapr_iommu.c
+++ b/hw/ppc/spapr_iommu.c
@@ -148,6 +148,7 @@ static int spapr_tce_table_realize(DeviceState *dev)
             * sizeof(uint64_t);
         tcet->table = g_malloc0(table_size);
     }
+    tcet->nb_table = tcet->window_size >> SPAPR_TCE_PAGE_SHIFT;

 #ifdef DEBUG_TCE
     fprintf(stderr, "spapr_iommu: New TCE table @ %p, liobn=0x%x, "


Honestly, I liked David's approach more when we did not need any extra
parameter to sync  :(



-- 
Alexey

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH 00/11] pseries: migration and QOM support
  2013-07-16 13:24         ` Alexey Kardashevskiy
@ 2013-07-16 14:12           ` Anthony Liguori
  0 siblings, 0 replies; 25+ messages in thread
From: Anthony Liguori @ 2013-07-16 14:12 UTC (permalink / raw)
  To: Alexey Kardashevskiy; +Cc: Paolo Bonzini, qemu-ppc, qemu-devel, David Gibson

Alexey Kardashevskiy <aik@ozlabs.ru> writes:

> On 07/16/2013 10:48 PM, Alexey Kardashevskiy wrote:
>> On 07/16/2013 10:35 PM, Alexey Kardashevskiy wrote:
>>> On 07/16/2013 10:33 PM, Anthony Liguori wrote:
>>>> Alexey Kardashevskiy <aik@ozlabs.ru> writes:
>>>>
>>>>> On 07/16/2013 01:11 AM, Anthony Liguori wrote:
>>>>>> This series is based on Alexey's series:
>>>>>>
>>>>>>   spapr: migration, pci, msi, power8
>>>>>>
>>>>>> Which in turn was based on work by David Gibson.
>>>>>>
>>>>>> I've removed the bits not related to migration and made the
>>>>>> following changes:
>>>>>>
>>>>>>  1) QOMify TCE tables and XICS
>>>>>>
>>>>>>  2) Do everything in terms of VMStateDescriptions
>>>>>>
>>>>>>  3) Fix endianness problem with TCE table translation
>>>>>>     a) Drop the VMSTATE_DIVIDE thing in the process
>>>>>>
>>>>>> I've tested this with a TCG pseries guest on an x86_64 host.
>>>>>
>>>>>
>>>>> It did not compile (fixed, patch is posted) and it fails to migrate with
>>>>> enabled KVM.
>>>>
>>>> With in-kernel XICS?  That's not in this series..
>>>
>>> No, as is, without any of my patches. I suspect rather HPTE than XICS though.
>> 
>> I was wrong. vmstate_spapr_tce_table is broken :)
>
> Here it is:
>
> diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c
> index 709cc34..3d4a1fc 100644
> --- a/hw/ppc/spapr_iommu.c
> +++ b/hw/ppc/spapr_iommu.c
> @@ -148,6 +148,7 @@ static int spapr_tce_table_realize(DeviceState *dev)
>              * sizeof(uint64_t);
>          tcet->table = g_malloc0(table_size);
>      }
> +    tcet->nb_table = tcet->window_size >> SPAPR_TCE_PAGE_SHIFT;

Ah, right, pre_load isn't enough :-/  Sorry about that.  Thanks for
digging in to this.

>
>  #ifdef DEBUG_TCE
>      fprintf(stderr, "spapr_iommu: New TCE table @ %p, liobn=0x%x, "
>
>
> Honestly, I liked David's approach more when we did not need any extra
> parameter to sync  :(

I resisted the urge to refactor but I'm confident that nb_table will end
up with nicer code.

The 'tcet->window_size >> SPAPR_TCE_PAGE_SHIFT' is used all over the
place.  It's not very clear without carefully thinking about it that the
table size == the number of pages in the window.

Regards,

Anthony Liguori

>
>
>
> -- 
> Alexey

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH] ppc kvm: fix to compile
  2013-07-16 10:00   ` [Qemu-devel] [PATCH] ppc kvm: fix to compile Alexey Kardashevskiy
  2013-07-16 12:32     ` Anthony Liguori
@ 2013-07-16 23:12     ` Alexander Graf
  2013-07-16 23:13       ` [Qemu-devel] [Qemu-ppc] " Alexander Graf
  1 sibling, 1 reply; 25+ messages in thread
From: Alexander Graf @ 2013-07-16 23:12 UTC (permalink / raw)
  To: Alexey Kardashevskiy; +Cc: Anthony Liguori, qemu-ppc, qemu-devel


On 16.07.2013, at 12:00, Alexey Kardashevskiy wrote:

> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>

Why? sPAPRTCE is a typedef struct that's defined in include/hw/ppc/spapr.h which is included from target-ppc/kvm.c. So it should work just fine.


Alex

> ---
> target-ppc/kvm.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
> index 61f2737..606bdb9 100644
> --- a/target-ppc/kvm.c
> +++ b/target-ppc/kvm.c
> @@ -1628,7 +1628,7 @@ void *kvmppc_create_spapr_tce(uint32_t liobn, uint32_t window_size, int *pfd)
>         return NULL;
>     }
> 
> -    len = (window_size / SPAPR_TCE_PAGE_SIZE) * sizeof(sPAPRTCE);
> +    len = (window_size / SPAPR_TCE_PAGE_SIZE) * sizeof(uint64_t);
>     /* FIXME: round this up to page size */
> 
>     table = mmap(NULL, len, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
> @@ -1651,7 +1651,7 @@ int kvmppc_remove_spapr_tce(void *table, int fd, uint32_t window_size)
>         return -1;
>     }
> 
> -    len = (window_size / SPAPR_TCE_PAGE_SIZE)*sizeof(sPAPRTCE);
> +    len = (window_size / SPAPR_TCE_PAGE_SIZE)*sizeof(uint64_t);
>     if ((munmap(table, len) < 0) ||
>         (close(fd) < 0)) {
>         fprintf(stderr, "KVM: Unexpected error removing TCE table: %s",
> -- 
> 1.8.3.2
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH] ppc kvm: fix to compile
  2013-07-16 23:12     ` Alexander Graf
@ 2013-07-16 23:13       ` Alexander Graf
  2013-07-16 23:26         ` Alexey Kardashevskiy
  0 siblings, 1 reply; 25+ messages in thread
From: Alexander Graf @ 2013-07-16 23:13 UTC (permalink / raw)
  To: Alexey Kardashevskiy; +Cc: Anthony Liguori, qemu-ppc, qemu-devel


On 17.07.2013, at 01:12, Alexander Graf wrote:

> 
> On 16.07.2013, at 12:00, Alexey Kardashevskiy wrote:
> 
>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> 
> Why? sPAPRTCE is a typedef struct that's defined in include/hw/ppc/spapr.h which is included from target-ppc/kvm.c. So it should work just fine.

Ah, my mailer screwed up threading. This is in reply to Anthony's cleanups?


Alex

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH] ppc kvm: fix to compile
  2013-07-16 23:13       ` [Qemu-devel] [Qemu-ppc] " Alexander Graf
@ 2013-07-16 23:26         ` Alexey Kardashevskiy
  0 siblings, 0 replies; 25+ messages in thread
From: Alexey Kardashevskiy @ 2013-07-16 23:26 UTC (permalink / raw)
  To: Alexander Graf; +Cc: Anthony Liguori, qemu-ppc, qemu-devel

On 07/17/2013 09:13 AM, Alexander Graf wrote:
> 
> On 17.07.2013, at 01:12, Alexander Graf wrote:
> 
>>
>> On 16.07.2013, at 12:00, Alexey Kardashevskiy wrote:
>>
>>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>>
>> Why? sPAPRTCE is a typedef struct that's defined in include/hw/ppc/spapr.h which is included from target-ppc/kvm.c. So it should work just fine.
> 
> Ah, my mailer screwed up threading. This is in reply to Anthony's cleanups?

Yes. He removed the struct.


-- 
Alexey

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH 05/11] spapr-tce: make sPAPRTCETable a proper device
  2013-07-15 15:11 ` [Qemu-devel] [PATCH 05/11] spapr-tce: make sPAPRTCETable a proper device Anthony Liguori
  2013-07-16 10:00   ` [Qemu-devel] [PATCH] ppc kvm: fix to compile Alexey Kardashevskiy
@ 2013-07-17  7:01   ` Alexey Kardashevskiy
  2013-07-17 13:05     ` Anthony Liguori
  1 sibling, 1 reply; 25+ messages in thread
From: Alexey Kardashevskiy @ 2013-07-17  7:01 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Paolo Bonzini, qemu-ppc, qemu-devel, David Gibson

On 07/16/2013 01:11 AM, Anthony Liguori wrote:
> Model TCE tables as a device that's hooked up as a child object to
> the owner.  Besides the code cleanup, we get a few nice benefits:
> 
> 1) free actually works now (it was dead code before)
> 
> 2) the TCE information is visible in the device tree
> 
> 3) we can expose table information as properties such that if we
>    change the window_size, we can use globals to keep migration
>    working.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> [dwg: pseries: savevm support for PAPR TCE tables]
> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
> ---
>  hw/ppc/spapr.c         |   3 -
>  hw/ppc/spapr_iommu.c   | 145 ++++++++++++++++++++++++++++++++-----------------
>  hw/ppc/spapr_pci.c     |   2 +-
>  hw/ppc/spapr_vio.c     |   2 +-
>  include/hw/ppc/spapr.h |  23 +++++---
>  5 files changed, 114 insertions(+), 61 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 48ae092..e340708 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -848,9 +848,6 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
>      /* Set up EPOW events infrastructure */
>      spapr_events_init(spapr);
>  
> -    /* Set up IOMMU */
> -    spapr_iommu_init();
> -
>      /* Set up VIO bus */
>      spapr->vio_bus = spapr_vio_bus_init();
>  
> diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c
> index 89b33a5..709cc34 100644
> --- a/hw/ppc/spapr_iommu.c
> +++ b/hw/ppc/spapr_iommu.c
> @@ -36,17 +36,6 @@ enum sPAPRTCEAccess {
>      SPAPR_TCE_RW = 3,
>  };
>  
> -struct sPAPRTCETable {
> -    uint32_t liobn;
> -    uint32_t window_size;
> -    sPAPRTCE *table;
> -    bool bypass;
> -    int fd;
> -    MemoryRegion iommu;
> -    QLIST_ENTRY(sPAPRTCETable) list;
> -};
> -
> -
>  QLIST_HEAD(spapr_tce_tables, sPAPRTCETable) spapr_tce_tables;
>  
>  static sPAPRTCETable *spapr_tce_find_by_liobn(uint32_t liobn)
> @@ -96,7 +85,7 @@ static IOMMUTLBEntry spapr_tce_translate_iommu(MemoryRegion *iommu, hwaddr addr)
>          return (IOMMUTLBEntry) { .perm = IOMMU_NONE };
>      }
>  
> -    tce = tcet->table[addr >> SPAPR_TCE_PAGE_SHIFT].tce;
> +    tce = tcet->table[addr >> SPAPR_TCE_PAGE_SHIFT];
>  
>  #ifdef DEBUG_TCE
>      fprintf(stderr, " ->  *paddr=0x%llx, *len=0x%llx\n",
> @@ -112,37 +101,51 @@ static IOMMUTLBEntry spapr_tce_translate_iommu(MemoryRegion *iommu, hwaddr addr)
>      };
>  }
>  
> +static int spapr_tce_table_pre_load(void *opaque)
> +{
> +    sPAPRTCETable *tcet = SPAPR_TCE_TABLE(opaque);
> +
> +    tcet->nb_table = tcet->window_size >> SPAPR_TCE_PAGE_SHIFT;
> +
> +    return 0;
> +}
> +
> +static const VMStateDescription vmstate_spapr_tce_table = {
> +    .name = "spapr_iommu",
> +    .version_id = 1,
> +    .minimum_version_id = 1,
> +    .minimum_version_id_old = 1,
> +    .pre_load = spapr_tce_table_pre_load,
> +    .fields      = (VMStateField []) {
> +        /* Sanity check */
> +        VMSTATE_UINT32_EQUAL(liobn, sPAPRTCETable),
> +        VMSTATE_UINT32_EQUAL(window_size, sPAPRTCETable),
> +
> +        /* IOMMU state */
> +        VMSTATE_BOOL(bypass, sPAPRTCETable),
> +        VMSTATE_VARRAY_UINT32(table, sPAPRTCETable, nb_table, 0, vmstate_info_uint64, uint64_t),
> +
> +        VMSTATE_END_OF_LIST()
> +    },
> +};
> +
>  static MemoryRegionIOMMUOps spapr_iommu_ops = {
>      .translate = spapr_tce_translate_iommu,
>  };
>  
> -sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn, size_t window_size)
> +static int spapr_tce_table_realize(DeviceState *dev)
>  {
> -    sPAPRTCETable *tcet;
> -
> -    if (spapr_tce_find_by_liobn(liobn)) {
> -        fprintf(stderr, "Attempted to create TCE table with duplicate"
> -                " LIOBN 0x%x\n", liobn);
> -        return NULL;
> -    }
> -
> -    if (!window_size) {
> -        return NULL;
> -    }
> -
> -    tcet = g_malloc0(sizeof(*tcet));
> -    tcet->liobn = liobn;
> -    tcet->window_size = window_size;
> +    sPAPRTCETable *tcet = SPAPR_TCE_TABLE(dev);
>  
>      if (kvm_enabled()) {
> -        tcet->table = kvmppc_create_spapr_tce(liobn,
> -                                              window_size,
> +        tcet->table = kvmppc_create_spapr_tce(tcet->liobn,
> +                                              tcet->window_size,
>                                                &tcet->fd);
>      }
>  
>      if (!tcet->table) {
> -        size_t table_size = (window_size >> SPAPR_TCE_PAGE_SHIFT)
> -            * sizeof(sPAPRTCE);
> +        size_t table_size = (tcet->window_size >> SPAPR_TCE_PAGE_SHIFT)
> +            * sizeof(uint64_t);
>          tcet->table = g_malloc0(table_size);
>      }
>  
> @@ -151,16 +154,43 @@ sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn, size_t wi
>              "table @ %p, fd=%d\n", tcet, liobn, tcet->table, tcet->fd);
>  #endif
>  
> -    memory_region_init_iommu(&tcet->iommu, OBJECT(owner), &spapr_iommu_ops,
> +    memory_region_init_iommu(&tcet->iommu, OBJECT(dev), &spapr_iommu_ops,
>                               "iommu-spapr", UINT64_MAX);
>  
>      QLIST_INSERT_HEAD(&spapr_tce_tables, tcet, list);
>  
> +    return 0;
> +}
> +
> +sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn, size_t window_size)
> +{
> +    sPAPRTCETable *tcet;
> +
> +    if (spapr_tce_find_by_liobn(liobn)) {
> +        fprintf(stderr, "Attempted to create TCE table with duplicate"
> +                " LIOBN 0x%x\n", liobn);
> +        return NULL;
> +    }
> +
> +    if (!window_size) {
> +        return NULL;
> +    }
> +
> +    tcet = SPAPR_TCE_TABLE(object_new(TYPE_SPAPR_TCE_TABLE));
> +    tcet->liobn = liobn;
> +    tcet->window_size = window_size;



I am trying to understand the QOM. How do you understand what
initialization should go to .realize and what should stay in
spapr_tce_new_table?

In this particular case having the .realize implementation does not make
much sense to me. If you made .liobn and .window_size members of
sPAPRTCETable then it would be ok but you did not. What do I miss? Thanks.



-- 
Alexey

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH 05/11] spapr-tce: make sPAPRTCETable a proper device
  2013-07-17  7:01   ` [Qemu-devel] [PATCH 05/11] spapr-tce: make sPAPRTCETable a proper device Alexey Kardashevskiy
@ 2013-07-17 13:05     ` Anthony Liguori
  0 siblings, 0 replies; 25+ messages in thread
From: Anthony Liguori @ 2013-07-17 13:05 UTC (permalink / raw)
  To: Alexey Kardashevskiy; +Cc: Paolo Bonzini, qemu-ppc, qemu-devel, David Gibson

Alexey Kardashevskiy <aik@ozlabs.ru> writes:

> On 07/16/2013 01:11 AM, Anthony Liguori wrote:
>> Model TCE tables as a device that's hooked up as a child object to
>> the owner.  Besides the code cleanup, we get a few nice benefits:
>> 
>> 1) free actually works now (it was dead code before)
>> 
>> 2) the TCE information is visible in the device tree
>> 
>> 3) we can expose table information as properties such that if we
>>    change the window_size, we can use globals to keep migration
>>    working.
>> 
>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
>> [dwg: pseries: savevm support for PAPR TCE tables]
>> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
>> ---
>>  hw/ppc/spapr.c         |   3 -
>>  hw/ppc/spapr_iommu.c   | 145 ++++++++++++++++++++++++++++++++-----------------
>>  hw/ppc/spapr_pci.c     |   2 +-
>>  hw/ppc/spapr_vio.c     |   2 +-
>>  include/hw/ppc/spapr.h |  23 +++++---
>>  5 files changed, 114 insertions(+), 61 deletions(-)
>> 
>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> index 48ae092..e340708 100644
>> --- a/hw/ppc/spapr.c
>> +++ b/hw/ppc/spapr.c
>> @@ -848,9 +848,6 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
>>      /* Set up EPOW events infrastructure */
>>      spapr_events_init(spapr);
>>  
>> -    /* Set up IOMMU */
>> -    spapr_iommu_init();
>> -
>>      /* Set up VIO bus */
>>      spapr->vio_bus = spapr_vio_bus_init();
>>  
>> diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c
>> index 89b33a5..709cc34 100644
>> --- a/hw/ppc/spapr_iommu.c
>> +++ b/hw/ppc/spapr_iommu.c
>> @@ -36,17 +36,6 @@ enum sPAPRTCEAccess {
>>      SPAPR_TCE_RW = 3,
>>  };
>>  
>> -struct sPAPRTCETable {
>> -    uint32_t liobn;
>> -    uint32_t window_size;
>> -    sPAPRTCE *table;
>> -    bool bypass;
>> -    int fd;
>> -    MemoryRegion iommu;
>> -    QLIST_ENTRY(sPAPRTCETable) list;
>> -};
>> -
>> -
>>  QLIST_HEAD(spapr_tce_tables, sPAPRTCETable) spapr_tce_tables;
>>  
>>  static sPAPRTCETable *spapr_tce_find_by_liobn(uint32_t liobn)
>> @@ -96,7 +85,7 @@ static IOMMUTLBEntry spapr_tce_translate_iommu(MemoryRegion *iommu, hwaddr addr)
>>          return (IOMMUTLBEntry) { .perm = IOMMU_NONE };
>>      }
>>  
>> -    tce = tcet->table[addr >> SPAPR_TCE_PAGE_SHIFT].tce;
>> +    tce = tcet->table[addr >> SPAPR_TCE_PAGE_SHIFT];
>>  
>>  #ifdef DEBUG_TCE
>>      fprintf(stderr, " ->  *paddr=0x%llx, *len=0x%llx\n",
>> @@ -112,37 +101,51 @@ static IOMMUTLBEntry spapr_tce_translate_iommu(MemoryRegion *iommu, hwaddr addr)
>>      };
>>  }
>>  
>> +static int spapr_tce_table_pre_load(void *opaque)
>> +{
>> +    sPAPRTCETable *tcet = SPAPR_TCE_TABLE(opaque);
>> +
>> +    tcet->nb_table = tcet->window_size >> SPAPR_TCE_PAGE_SHIFT;
>> +
>> +    return 0;
>> +}
>> +
>> +static const VMStateDescription vmstate_spapr_tce_table = {
>> +    .name = "spapr_iommu",
>> +    .version_id = 1,
>> +    .minimum_version_id = 1,
>> +    .minimum_version_id_old = 1,
>> +    .pre_load = spapr_tce_table_pre_load,
>> +    .fields      = (VMStateField []) {
>> +        /* Sanity check */
>> +        VMSTATE_UINT32_EQUAL(liobn, sPAPRTCETable),
>> +        VMSTATE_UINT32_EQUAL(window_size, sPAPRTCETable),
>> +
>> +        /* IOMMU state */
>> +        VMSTATE_BOOL(bypass, sPAPRTCETable),
>> +        VMSTATE_VARRAY_UINT32(table, sPAPRTCETable, nb_table, 0, vmstate_info_uint64, uint64_t),
>> +
>> +        VMSTATE_END_OF_LIST()
>> +    },
>> +};
>> +
>>  static MemoryRegionIOMMUOps spapr_iommu_ops = {
>>      .translate = spapr_tce_translate_iommu,
>>  };
>>  
>> -sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn, size_t window_size)
>> +static int spapr_tce_table_realize(DeviceState *dev)
>>  {
>> -    sPAPRTCETable *tcet;
>> -
>> -    if (spapr_tce_find_by_liobn(liobn)) {
>> -        fprintf(stderr, "Attempted to create TCE table with duplicate"
>> -                " LIOBN 0x%x\n", liobn);
>> -        return NULL;
>> -    }
>> -
>> -    if (!window_size) {
>> -        return NULL;
>> -    }
>> -
>> -    tcet = g_malloc0(sizeof(*tcet));
>> -    tcet->liobn = liobn;
>> -    tcet->window_size = window_size;
>> +    sPAPRTCETable *tcet = SPAPR_TCE_TABLE(dev);
>>  
>>      if (kvm_enabled()) {
>> -        tcet->table = kvmppc_create_spapr_tce(liobn,
>> -                                              window_size,
>> +        tcet->table = kvmppc_create_spapr_tce(tcet->liobn,
>> +                                              tcet->window_size,
>>                                                &tcet->fd);
>>      }
>>  
>>      if (!tcet->table) {
>> -        size_t table_size = (window_size >> SPAPR_TCE_PAGE_SHIFT)
>> -            * sizeof(sPAPRTCE);
>> +        size_t table_size = (tcet->window_size >> SPAPR_TCE_PAGE_SHIFT)
>> +            * sizeof(uint64_t);
>>          tcet->table = g_malloc0(table_size);
>>      }
>>  
>> @@ -151,16 +154,43 @@ sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn, size_t wi
>>              "table @ %p, fd=%d\n", tcet, liobn, tcet->table, tcet->fd);
>>  #endif
>>  
>> -    memory_region_init_iommu(&tcet->iommu, OBJECT(owner), &spapr_iommu_ops,
>> +    memory_region_init_iommu(&tcet->iommu, OBJECT(dev), &spapr_iommu_ops,
>>                               "iommu-spapr", UINT64_MAX);
>>  
>>      QLIST_INSERT_HEAD(&spapr_tce_tables, tcet, list);
>>  
>> +    return 0;
>> +}
>> +
>> +sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn, size_t window_size)
>> +{
>> +    sPAPRTCETable *tcet;
>> +
>> +    if (spapr_tce_find_by_liobn(liobn)) {
>> +        fprintf(stderr, "Attempted to create TCE table with duplicate"
>> +                " LIOBN 0x%x\n", liobn);
>> +        return NULL;
>> +    }
>> +
>> +    if (!window_size) {
>> +        return NULL;
>> +    }
>> +
>> +    tcet = SPAPR_TCE_TABLE(object_new(TYPE_SPAPR_TCE_TABLE));
>> +    tcet->liobn = liobn;
>> +    tcet->window_size = window_size;
>
>
>
> I am trying to understand the QOM. How do you understand what
> initialization should go to .realize and what should stay in
> spapr_tce_new_table?

Really nothing should be in spapr_tce_new_table().   It should strictly
be an helper function to create a device and set properties (via
qdev_prop_set..).  We really should make liobn and window_size proper
properties.

Regards,

Anthony Liguori

>
> In this particular case having the .realize implementation does not make
> much sense to me. If you made .liobn and .window_size members of
> sPAPRTCETable then it would be ok but you did not. What do I miss?
> Thanks.



>
>
>
> -- 
> Alexey

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2013-07-17 13:07 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-07-15 15:11 [Qemu-devel] [PATCH 00/11] pseries: migration and QOM support Anthony Liguori
2013-07-15 15:11 ` [Qemu-devel] [PATCH 01/11] target-ppc: Convert ppc cpu savevm to VMStateDescription Anthony Liguori
2013-07-15 15:11 ` [Qemu-devel] [PATCH 02/11] pseries: savevm support for VIO devices Anthony Liguori
2013-07-15 15:11 ` [Qemu-devel] [PATCH 03/11] pseries: savevm support for PAPR VIO logical lan Anthony Liguori
2013-07-15 15:11 ` [Qemu-devel] [PATCH 04/11] pseries: savevm support for PAPR VIO logical tty Anthony Liguori
2013-07-15 15:11 ` [Qemu-devel] [PATCH 05/11] spapr-tce: make sPAPRTCETable a proper device Anthony Liguori
2013-07-16 10:00   ` [Qemu-devel] [PATCH] ppc kvm: fix to compile Alexey Kardashevskiy
2013-07-16 12:32     ` Anthony Liguori
2013-07-16 23:12     ` Alexander Graf
2013-07-16 23:13       ` [Qemu-devel] [Qemu-ppc] " Alexander Graf
2013-07-16 23:26         ` Alexey Kardashevskiy
2013-07-17  7:01   ` [Qemu-devel] [PATCH 05/11] spapr-tce: make sPAPRTCETable a proper device Alexey Kardashevskiy
2013-07-17 13:05     ` Anthony Liguori
2013-07-15 15:11 ` [Qemu-devel] [PATCH 06/11] pseries: rework PAPR virtual SCSI Anthony Liguori
2013-07-15 15:11 ` [Qemu-devel] [PATCH 07/11] pseries: savevm support for " Anthony Liguori
2013-07-15 15:11 ` [Qemu-devel] [PATCH 08/11] pseries: savevm support for pseries machine Anthony Liguori
2013-07-15 15:11 ` [Qemu-devel] [PATCH 09/11] pseries: savevm support for PCI host bridge Anthony Liguori
2013-07-15 15:11 ` [Qemu-devel] [PATCH 10/11] pseries: savevm support with KVM Anthony Liguori
2013-07-15 15:11 ` [Qemu-devel] [PATCH 11/11] xics: rename types to be sane and follow coding style Anthony Liguori
2013-07-16 10:10 ` [Qemu-devel] [PATCH 00/11] pseries: migration and QOM support Alexey Kardashevskiy
2013-07-16 12:33   ` Anthony Liguori
2013-07-16 12:35     ` Alexey Kardashevskiy
2013-07-16 12:48       ` Alexey Kardashevskiy
2013-07-16 13:24         ` Alexey Kardashevskiy
2013-07-16 14:12           ` Anthony Liguori

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.