[PATCH v4 00/10] qemu-kvm: Hook cleanups and yet more use of upstream code

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH v4 00/10] qemu-kvm: Hook cleanups and yet more use of upstream code
@ 2010-03-01 17:17 ` Jan Kiszka
  0 siblings, 0 replies; 48+ messages in thread
From: Jan Kiszka @ 2010-03-01 17:17 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, qemu-devel, Gleb Natapov

Rebased over latest master.

Pull URL is

	git://git.kiszka.org/qemu-kvm.git queues/queues-kvm-merge

Jan Kiszka (10):
  qemu-kvm: Add KVM_CAP_X86_ROBUST_SINGLESTEP-awareness
  qemu-kvm: Rework VCPU state writeback API
  x86: Extend validity of cpu_is_bsp
  qemu-kvm: Clean up mpstate synchronization
  KVM: x86: Restrict writeback of VCPU state
  qemu-kvm: Use VCPU event state for reset and vmsave/load
  qemu-kvm: Cleanup/fix TSC and PV clock writeback
  qemu-kvm: Clean up KVM's APIC hooks
  qemu-kvm: Move kvm_set_boot_cpu_id
  qemu-kvm: Bring qemu_init_vcpu back home

 exec.c                |   17 -----
 hw/apic.c             |   47 ++-------------
 hw/pc.c               |   12 +---
 hw/ppc_newworld.c     |    3 -
 hw/ppc_oldworld.c     |    3 -
 hw/s390-virtio.c      |    1 -
 kvm-all.c             |   31 ++++++++--
 kvm.h                 |   28 ++++++++-
 qemu-kvm-ia64.c       |    6 +-
 qemu-kvm-x86.c        |  164 ++++++++++++++++++++++++-------------------------
 qemu-kvm.c            |   46 ++++----------
 qemu-kvm.h            |   27 +-------
 savevm.c              |    4 +
 sysemu.h              |    4 +
 target-i386/helper.c  |    2 +
 target-i386/kvm.c     |   94 +++++++++++++++++-----------
 target-i386/machine.c |   27 --------
 target-ia64/machine.c |    5 +-
 target-ppc/kvm.c      |    2 +-
 target-ppc/machine.c  |    4 -
 target-s390x/kvm.c    |    3 +-
 vl.c                  |   29 +++++++++
 22 files changed, 259 insertions(+), 300 deletions(-)


^ permalink raw reply	[flat|nested] 48+ messages in thread

* [Qemu-devel] [PATCH v4 00/10] qemu-kvm: Hook cleanups and yet more use of upstream code
@ 2010-03-01 17:17 ` Jan Kiszka
  0 siblings, 0 replies; 48+ messages in thread
From: Jan Kiszka @ 2010-03-01 17:17 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: Gleb Natapov, qemu-devel, kvm

Rebased over latest master.

Pull URL is

	git://git.kiszka.org/qemu-kvm.git queues/queues-kvm-merge

Jan Kiszka (10):
  qemu-kvm: Add KVM_CAP_X86_ROBUST_SINGLESTEP-awareness
  qemu-kvm: Rework VCPU state writeback API
  x86: Extend validity of cpu_is_bsp
  qemu-kvm: Clean up mpstate synchronization
  KVM: x86: Restrict writeback of VCPU state
  qemu-kvm: Use VCPU event state for reset and vmsave/load
  qemu-kvm: Cleanup/fix TSC and PV clock writeback
  qemu-kvm: Clean up KVM's APIC hooks
  qemu-kvm: Move kvm_set_boot_cpu_id
  qemu-kvm: Bring qemu_init_vcpu back home

 exec.c                |   17 -----
 hw/apic.c             |   47 ++-------------
 hw/pc.c               |   12 +---
 hw/ppc_newworld.c     |    3 -
 hw/ppc_oldworld.c     |    3 -
 hw/s390-virtio.c      |    1 -
 kvm-all.c             |   31 ++++++++--
 kvm.h                 |   28 ++++++++-
 qemu-kvm-ia64.c       |    6 +-
 qemu-kvm-x86.c        |  164 ++++++++++++++++++++++++-------------------------
 qemu-kvm.c            |   46 ++++----------
 qemu-kvm.h            |   27 +-------
 savevm.c              |    4 +
 sysemu.h              |    4 +
 target-i386/helper.c  |    2 +
 target-i386/kvm.c     |   94 +++++++++++++++++-----------
 target-i386/machine.c |   27 --------
 target-ia64/machine.c |    5 +-
 target-ppc/kvm.c      |    2 +-
 target-ppc/machine.c  |    4 -
 target-s390x/kvm.c    |    3 +-
 vl.c                  |   29 +++++++++
 22 files changed, 259 insertions(+), 300 deletions(-)

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH v4 01/10] qemu-kvm: Add KVM_CAP_X86_ROBUST_SINGLESTEP-awareness
  2010-03-01 17:17 ` [Qemu-devel] " Jan Kiszka
@ 2010-03-01 17:17   ` Jan Kiszka
  -1 siblings, 0 replies; 48+ messages in thread
From: Jan Kiszka @ 2010-03-01 17:17 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, qemu-devel, Gleb Natapov

This add-on patch to recent guest debugging refactorings adds the
requested awareness for KVM_CAP_X86_ROBUST_SINGLESTEP to both the
upstream as well as qemu-kvm's own code. Fortunately, code sharing
increased once again.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 kvm-all.c         |   12 ++++++++++
 kvm.h             |    1 +
 qemu-kvm-x86.c    |   27 +----------------------
 qemu-kvm.h        |    1 +
 target-i386/kvm.c |   60 ++++++++++++++++++++++++++++++----------------------
 5 files changed, 51 insertions(+), 50 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index d1542e3..06708a5 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -65,6 +65,7 @@ struct KVMState
     int broken_set_mem_region;
     int migration_log;
     int vcpu_events;
+    int robust_singlestep;
 #ifdef KVM_CAP_SET_GUEST_DEBUG
     struct kvm_sw_breakpoint_head kvm_sw_breakpoints;
 #endif
@@ -673,6 +674,12 @@ int kvm_init(int smp_cpus)
     s->vcpu_events = kvm_check_extension(s, KVM_CAP_VCPU_EVENTS);
 #endif
 
+    s->robust_singlestep = 0;
+#ifdef KVM_CAP_X86_ROBUST_SINGLESTEP
+    s->robust_singlestep =
+        kvm_check_extension(s, KVM_CAP_X86_ROBUST_SINGLESTEP);
+#endif
+
     ret = kvm_arch_init(s, smp_cpus);
     if (ret < 0)
         goto err;
@@ -933,6 +940,11 @@ int kvm_has_vcpu_events(void)
     return kvm_state->vcpu_events;
 }
 
+int kvm_has_robust_singlestep(void)
+{
+    return kvm_state->robust_singlestep;
+}
+
 void kvm_setup_guest_memory(void *start, size_t size)
 {
     if (!kvm_has_sync_mmu()) {
diff --git a/kvm.h b/kvm.h
index 1b498d7..888dfcb 100644
--- a/kvm.h
+++ b/kvm.h
@@ -43,6 +43,7 @@ int kvm_log_stop(target_phys_addr_t phys_addr, ram_addr_t size);
 
 int kvm_has_sync_mmu(void);
 int kvm_has_vcpu_events(void);
+int kvm_has_robust_singlestep(void);
 int kvm_put_vcpu_events(CPUState *env);
 int kvm_get_vcpu_events(CPUState *env);
 
diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
index 5af9ce1..5db95f8 100644
--- a/qemu-kvm-x86.c
+++ b/qemu-kvm-x86.c
@@ -922,31 +922,8 @@ void kvm_arch_load_regs(CPUState *env)
     if (rc == -1)
         perror("kvm_set_msrs FAILED");
 
-    /*
-     * Kernels before 2.6.33 (which correlates with !kvm_has_vcpu_events())
-     * overwrote flags.TF injected via SET_GUEST_DEBUG while updating GP regs.
-     * Work around this by updating the debug state once again if
-     * single-stepping is on.
-     * Another reason to call kvm_update_guest_debug here is a pending debug
-     * trap raise by the guest. On kernels without SET_VCPU_EVENTS we have to
-     * reinject them via SET_GUEST_DEBUG.
-     */
-    if (!kvm_has_vcpu_events() &&
-        (env->exception_injected != -1 || env->singlestep_enabled)) {
-        unsigned long reinject_trap = 0;
-
-        if (env->exception_injected == 1) {
-            reinject_trap = KVM_GUESTDBG_INJECT_DB;
-        } else if (env->exception_injected == 3) {
-            reinject_trap = KVM_GUESTDBG_INJECT_BP;
-        }
-        env->exception_injected = -1;
-
-        rc = kvm_update_guest_debug(env, reinject_trap);
-        if (rc < 0) {
-            perror("kvm_update_guest_debug FAILED");
-        }
-    }
+    /* must be last */
+    kvm_guest_debug_workarounds(env);
 }
 
 void kvm_load_tsc(CPUState *env)
diff --git a/qemu-kvm.h b/qemu-kvm.h
index ecb52c1..ed00665 100644
--- a/qemu-kvm.h
+++ b/qemu-kvm.h
@@ -1012,6 +1012,7 @@ struct KVMState {
     int broken_set_mem_region;
     int migration_log;
     int vcpu_events;
+    int robust_singlestep;
 #ifdef KVM_CAP_SET_GUEST_DEBUG
     QTAILQ_HEAD(, kvm_sw_breakpoint) kvm_sw_breakpoints;
 #endif
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 4e1358c..dd636f1 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -861,6 +861,37 @@ int kvm_get_vcpu_events(CPUState *env)
     return 0;
 }
 
+static int kvm_guest_debug_workarounds(CPUState *env)
+{
+    int ret = 0;
+#ifdef KVM_CAP_SET_GUEST_DEBUG
+    unsigned long reinject_trap = 0;
+
+    if (!kvm_has_vcpu_events()) {
+        if (env->exception_injected == 1) {
+            reinject_trap = KVM_GUESTDBG_INJECT_DB;
+        } else if (env->exception_injected == 3) {
+            reinject_trap = KVM_GUESTDBG_INJECT_BP;
+        }
+        env->exception_injected = -1;
+    }
+
+    /*
+     * Kernels before KVM_CAP_X86_ROBUST_SINGLESTEP overwrote flags.TF
+     * injected via SET_GUEST_DEBUG while updating GP regs. Work around this
+     * by updating the debug state once again if single-stepping is on.
+     * Another reason to call kvm_update_guest_debug here is a pending debug
+     * trap raise by the guest. On kernels without SET_VCPU_EVENTS we have to
+     * reinject them via SET_GUEST_DEBUG.
+     */
+    if (reinject_trap ||
+        (!kvm_has_robust_singlestep() && env->singlestep_enabled)) {
+        ret = kvm_update_guest_debug(env, reinject_trap);
+    }
+#endif /* KVM_CAP_SET_GUEST_DEBUG */
+    return ret;
+}
+
 #ifdef KVM_UPSTREAM
 int kvm_arch_put_registers(CPUState *env)
 {
@@ -890,31 +921,10 @@ int kvm_arch_put_registers(CPUState *env)
     if (ret < 0)
         return ret;
 
-    /*
-     * Kernels before 2.6.33 (which correlates with !kvm_has_vcpu_events())
-     * overwrote flags.TF injected via SET_GUEST_DEBUG while updating GP regs.
-     * Work around this by updating the debug state once again if
-     * single-stepping is on.
-     * Another reason to call kvm_update_guest_debug here is a pending debug
-     * trap raise by the guest. On kernels without SET_VCPU_EVENTS we have to
-     * reinject them via SET_GUEST_DEBUG.
-     */
-    if (!kvm_has_vcpu_events() &&
-        (env->exception_injected != -1 || env->singlestep_enabled)) {
-        unsigned long reinject_trap = 0;
-
-        if (env->exception_injected == 1) {
-            reinject_trap = KVM_GUESTDBG_INJECT_DB;
-        } else if (env->exception_injected == 3) {
-            reinject_trap = KVM_GUESTDBG_INJECT_BP;
-        }
-        env->exception_injected = -1;
-
-        ret = kvm_update_guest_debug(env, reinject_trap);
-        if (ret < 0) {
-            return ret;
-        }
-    }
+    /* must be last */
+    ret = kvm_guest_debug_workarounds(env);
+    if (ret < 0)
+        return ret;
 
     return 0;
 }
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [Qemu-devel] [PATCH v4 01/10] qemu-kvm: Add KVM_CAP_X86_ROBUST_SINGLESTEP-awareness
@ 2010-03-01 17:17   ` Jan Kiszka
  0 siblings, 0 replies; 48+ messages in thread
From: Jan Kiszka @ 2010-03-01 17:17 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: Gleb Natapov, qemu-devel, kvm

This add-on patch to recent guest debugging refactorings adds the
requested awareness for KVM_CAP_X86_ROBUST_SINGLESTEP to both the
upstream as well as qemu-kvm's own code. Fortunately, code sharing
increased once again.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 kvm-all.c         |   12 ++++++++++
 kvm.h             |    1 +
 qemu-kvm-x86.c    |   27 +----------------------
 qemu-kvm.h        |    1 +
 target-i386/kvm.c |   60 ++++++++++++++++++++++++++++++----------------------
 5 files changed, 51 insertions(+), 50 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index d1542e3..06708a5 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -65,6 +65,7 @@ struct KVMState
     int broken_set_mem_region;
     int migration_log;
     int vcpu_events;
+    int robust_singlestep;
 #ifdef KVM_CAP_SET_GUEST_DEBUG
     struct kvm_sw_breakpoint_head kvm_sw_breakpoints;
 #endif
@@ -673,6 +674,12 @@ int kvm_init(int smp_cpus)
     s->vcpu_events = kvm_check_extension(s, KVM_CAP_VCPU_EVENTS);
 #endif
 
+    s->robust_singlestep = 0;
+#ifdef KVM_CAP_X86_ROBUST_SINGLESTEP
+    s->robust_singlestep =
+        kvm_check_extension(s, KVM_CAP_X86_ROBUST_SINGLESTEP);
+#endif
+
     ret = kvm_arch_init(s, smp_cpus);
     if (ret < 0)
         goto err;
@@ -933,6 +940,11 @@ int kvm_has_vcpu_events(void)
     return kvm_state->vcpu_events;
 }
 
+int kvm_has_robust_singlestep(void)
+{
+    return kvm_state->robust_singlestep;
+}
+
 void kvm_setup_guest_memory(void *start, size_t size)
 {
     if (!kvm_has_sync_mmu()) {
diff --git a/kvm.h b/kvm.h
index 1b498d7..888dfcb 100644
--- a/kvm.h
+++ b/kvm.h
@@ -43,6 +43,7 @@ int kvm_log_stop(target_phys_addr_t phys_addr, ram_addr_t size);
 
 int kvm_has_sync_mmu(void);
 int kvm_has_vcpu_events(void);
+int kvm_has_robust_singlestep(void);
 int kvm_put_vcpu_events(CPUState *env);
 int kvm_get_vcpu_events(CPUState *env);
 
diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
index 5af9ce1..5db95f8 100644
--- a/qemu-kvm-x86.c
+++ b/qemu-kvm-x86.c
@@ -922,31 +922,8 @@ void kvm_arch_load_regs(CPUState *env)
     if (rc == -1)
         perror("kvm_set_msrs FAILED");
 
-    /*
-     * Kernels before 2.6.33 (which correlates with !kvm_has_vcpu_events())
-     * overwrote flags.TF injected via SET_GUEST_DEBUG while updating GP regs.
-     * Work around this by updating the debug state once again if
-     * single-stepping is on.
-     * Another reason to call kvm_update_guest_debug here is a pending debug
-     * trap raise by the guest. On kernels without SET_VCPU_EVENTS we have to
-     * reinject them via SET_GUEST_DEBUG.
-     */
-    if (!kvm_has_vcpu_events() &&
-        (env->exception_injected != -1 || env->singlestep_enabled)) {
-        unsigned long reinject_trap = 0;
-
-        if (env->exception_injected == 1) {
-            reinject_trap = KVM_GUESTDBG_INJECT_DB;
-        } else if (env->exception_injected == 3) {
-            reinject_trap = KVM_GUESTDBG_INJECT_BP;
-        }
-        env->exception_injected = -1;
-
-        rc = kvm_update_guest_debug(env, reinject_trap);
-        if (rc < 0) {
-            perror("kvm_update_guest_debug FAILED");
-        }
-    }
+    /* must be last */
+    kvm_guest_debug_workarounds(env);
 }
 
 void kvm_load_tsc(CPUState *env)
diff --git a/qemu-kvm.h b/qemu-kvm.h
index ecb52c1..ed00665 100644
--- a/qemu-kvm.h
+++ b/qemu-kvm.h
@@ -1012,6 +1012,7 @@ struct KVMState {
     int broken_set_mem_region;
     int migration_log;
     int vcpu_events;
+    int robust_singlestep;
 #ifdef KVM_CAP_SET_GUEST_DEBUG
     QTAILQ_HEAD(, kvm_sw_breakpoint) kvm_sw_breakpoints;
 #endif
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 4e1358c..dd636f1 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -861,6 +861,37 @@ int kvm_get_vcpu_events(CPUState *env)
     return 0;
 }
 
+static int kvm_guest_debug_workarounds(CPUState *env)
+{
+    int ret = 0;
+#ifdef KVM_CAP_SET_GUEST_DEBUG
+    unsigned long reinject_trap = 0;
+
+    if (!kvm_has_vcpu_events()) {
+        if (env->exception_injected == 1) {
+            reinject_trap = KVM_GUESTDBG_INJECT_DB;
+        } else if (env->exception_injected == 3) {
+            reinject_trap = KVM_GUESTDBG_INJECT_BP;
+        }
+        env->exception_injected = -1;
+    }
+
+    /*
+     * Kernels before KVM_CAP_X86_ROBUST_SINGLESTEP overwrote flags.TF
+     * injected via SET_GUEST_DEBUG while updating GP regs. Work around this
+     * by updating the debug state once again if single-stepping is on.
+     * Another reason to call kvm_update_guest_debug here is a pending debug
+     * trap raise by the guest. On kernels without SET_VCPU_EVENTS we have to
+     * reinject them via SET_GUEST_DEBUG.
+     */
+    if (reinject_trap ||
+        (!kvm_has_robust_singlestep() && env->singlestep_enabled)) {
+        ret = kvm_update_guest_debug(env, reinject_trap);
+    }
+#endif /* KVM_CAP_SET_GUEST_DEBUG */
+    return ret;
+}
+
 #ifdef KVM_UPSTREAM
 int kvm_arch_put_registers(CPUState *env)
 {
@@ -890,31 +921,10 @@ int kvm_arch_put_registers(CPUState *env)
     if (ret < 0)
         return ret;
 
-    /*
-     * Kernels before 2.6.33 (which correlates with !kvm_has_vcpu_events())
-     * overwrote flags.TF injected via SET_GUEST_DEBUG while updating GP regs.
-     * Work around this by updating the debug state once again if
-     * single-stepping is on.
-     * Another reason to call kvm_update_guest_debug here is a pending debug
-     * trap raise by the guest. On kernels without SET_VCPU_EVENTS we have to
-     * reinject them via SET_GUEST_DEBUG.
-     */
-    if (!kvm_has_vcpu_events() &&
-        (env->exception_injected != -1 || env->singlestep_enabled)) {
-        unsigned long reinject_trap = 0;
-
-        if (env->exception_injected == 1) {
-            reinject_trap = KVM_GUESTDBG_INJECT_DB;
-        } else if (env->exception_injected == 3) {
-            reinject_trap = KVM_GUESTDBG_INJECT_BP;
-        }
-        env->exception_injected = -1;
-
-        ret = kvm_update_guest_debug(env, reinject_trap);
-        if (ret < 0) {
-            return ret;
-        }
-    }
+    /* must be last */
+    ret = kvm_guest_debug_workarounds(env);
+    if (ret < 0)
+        return ret;
 
     return 0;
 }
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v4 02/10] qemu-kvm: Rework VCPU state writeback API
  2010-03-01 17:17 ` [Qemu-devel] " Jan Kiszka
@ 2010-03-01 17:17   ` Jan Kiszka
  -1 siblings, 0 replies; 48+ messages in thread
From: Jan Kiszka @ 2010-03-01 17:17 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, qemu-devel, Gleb Natapov

This grand cleanup drops all reset and vmsave/load related
synchronization points in favor of four(!) generic hooks:

- cpu_synchronize_all_states in qemu_savevm_state_complete
  (initial sync from kernel before vmsave)
- cpu_synchronize_all_post_init in qemu_loadvm_state
  (writeback after vmload)
- cpu_synchronize_all_post_init in main after machine init
- cpu_synchronize_all_post_reset in qemu_system_reset
  (writeback after system reset)

These writeback points + the existing one of VCPU exec after
cpu_synchronize_state map on three levels of writeback:

- KVM_PUT_RUNTIME_STATE (during runtime, other VCPUs continue to run)
- KVM_PUT_RESET_STATE   (on synchronous system reset, all VCPUs stopped)
- KVM_PUT_FULL_STATE    (on init or vmload, all VCPUs stopped as well)

This level is passed to the arch-specific VCPU state writing function
that will decide which concrete substates need to be written. That way,
no writer of load, save or reset functions that interact with in-kernel
KVM states will ever have to worry about synchronization again. That
also means that a lot of reasons for races, segfaults and deadlocks are
eliminated.

cpu_synchronize_state remains untouched, just as Anthony suggested. We
continue to need it before reading or writing of VCPU states that are
also tracked by in-kernel KVM subsystems.

Consequently, this patch removes many cpu_synchronize_state calls that
are now redundant, just like remaining explicit register syncs. It does
not touch qemu-kvm's special hooks for mpstate, vcpu_events, or tsc
loading. They will be cleaned up by individual patches.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 exec.c                |   17 -----------------
 hw/apic.c             |    3 ---
 hw/pc.c               |    1 -
 hw/ppc_newworld.c     |    3 ---
 hw/ppc_oldworld.c     |    3 ---
 hw/s390-virtio.c      |    1 -
 kvm-all.c             |   19 +++++++++++++------
 kvm.h                 |   25 ++++++++++++++++++++++++-
 qemu-kvm-ia64.c       |    2 +-
 qemu-kvm-x86.c        |    3 +--
 qemu-kvm.c            |   16 +++++++++++++---
 qemu-kvm.h            |    2 +-
 savevm.c              |    4 ++++
 sysemu.h              |    4 ++++
 target-i386/kvm.c     |    2 +-
 target-i386/machine.c |   10 ----------
 target-ia64/machine.c |    2 --
 target-ppc/kvm.c      |    2 +-
 target-ppc/machine.c  |    4 ----
 target-s390x/kvm.c    |    3 +--
 vl.c                  |   29 +++++++++++++++++++++++++++++
 21 files changed, 93 insertions(+), 62 deletions(-)

diff --git a/exec.c b/exec.c
index b647512..69003c2 100644
--- a/exec.c
+++ b/exec.c
@@ -530,21 +530,6 @@ void cpu_exec_init_all(unsigned long tb_size)
 
 #if defined(CPU_SAVE_VERSION) && !defined(CONFIG_USER_ONLY)
 
-static void cpu_common_pre_save(void *opaque)
-{
-    CPUState *env = opaque;
-
-    cpu_synchronize_state(env);
-}
-
-static int cpu_common_pre_load(void *opaque)
-{
-    CPUState *env = opaque;
-
-    cpu_synchronize_state(env);
-    return 0;
-}
-
 static int cpu_common_post_load(void *opaque, int version_id)
 {
     CPUState *env = opaque;
@@ -562,8 +547,6 @@ static const VMStateDescription vmstate_cpu_common = {
     .version_id = 1,
     .minimum_version_id = 1,
     .minimum_version_id_old = 1,
-    .pre_save = cpu_common_pre_save,
-    .pre_load = cpu_common_pre_load,
     .post_load = cpu_common_post_load,
     .fields      = (VMStateField []) {
         VMSTATE_UINT32(halted, CPUState),
diff --git a/hw/apic.c b/hw/apic.c
index ae805dc..3e03e10 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -488,7 +488,6 @@ void apic_init_reset(CPUState *env)
     if (!s)
         return;
 
-    cpu_synchronize_state(env);
     s->tpr = 0;
     s->spurious_vec = 0xff;
     s->log_dest = 0;
@@ -1070,8 +1069,6 @@ static void apic_reset(void *opaque)
     APICState *s = opaque;
     int bsp;
 
-    cpu_synchronize_state(s->cpu_env);
-
     bsp = cpu_is_bsp(s->cpu_env);
     s->apicbase = 0xfee00000 |
         (bsp ? MSR_IA32_APICBASE_BSP : 0) | MSR_IA32_APICBASE_ENABLE;
diff --git a/hw/pc.c b/hw/pc.c
index 67af649..b90a79e 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -779,7 +779,6 @@ CPUState *pc_new_cpu(const char *cpu_model)
         fprintf(stderr, "Unable to find x86 CPU definition\n");
         exit(1);
     }
-    env->kvm_vcpu_dirty = 1;
     if ((env->cpuid_features & CPUID_APIC) || smp_cpus > 1) {
         env->cpuid_apic_id = env->cpu_index;
         /* APIC reset callback resets cpu */
diff --git a/hw/ppc_newworld.c b/hw/ppc_newworld.c
index bc86c85..d4f9013 100644
--- a/hw/ppc_newworld.c
+++ b/hw/ppc_newworld.c
@@ -167,9 +167,6 @@ static void ppc_core99_init (ram_addr_t ram_size,
         envs[i] = env;
     }
 
-    /* Make sure all register sets take effect */
-    cpu_synchronize_state(env);
-
     /* allocate RAM */
     ram_offset = qemu_ram_alloc(ram_size);
     cpu_register_physical_memory(0, ram_size, ram_offset);
diff --git a/hw/ppc_oldworld.c b/hw/ppc_oldworld.c
index 04a7835..93c95ba 100644
--- a/hw/ppc_oldworld.c
+++ b/hw/ppc_oldworld.c
@@ -165,9 +165,6 @@ static void ppc_heathrow_init (ram_addr_t ram_size,
         envs[i] = env;
     }
 
-    /* Make sure all register sets take effect */
-    cpu_synchronize_state(env);
-
     /* allocate RAM */
     if (ram_size > (2047 << 20)) {
         fprintf(stderr,
diff --git a/hw/s390-virtio.c b/hw/s390-virtio.c
index 3582728..ad3386f 100644
--- a/hw/s390-virtio.c
+++ b/hw/s390-virtio.c
@@ -185,7 +185,6 @@ static void s390_init(ram_addr_t ram_size,
             exit(1);
         }
 
-        cpu_synchronize_state(env);
         env->psw.addr = KERN_IMAGE_START;
         env->psw.mask = 0x0000000180000000ULL;
     }
diff --git a/kvm-all.c b/kvm-all.c
index 06708a5..7bb3f7b 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -156,10 +156,6 @@ static void kvm_reset_vcpu(void *opaque)
     CPUState *env = opaque;
 
     kvm_arch_reset_vcpu(env);
-    if (kvm_arch_put_registers(env)) {
-        fprintf(stderr, "Fatal: kvm vcpu reset failed\n");
-        abort();
-    }
 }
 #endif
 
@@ -216,7 +212,6 @@ int kvm_init_vcpu(CPUState *env)
     if (ret == 0) {
         qemu_register_reset(kvm_reset_vcpu, env);
         kvm_arch_reset_vcpu(env);
-        ret = kvm_arch_put_registers(env);
     }
 err:
     return ret;
@@ -770,6 +765,18 @@ void kvm_cpu_synchronize_state(CPUState *env)
     }
 }
 
+void kvm_cpu_synchronize_post_reset(CPUState *env)
+{
+    kvm_arch_put_registers(env, KVM_PUT_RESET_STATE);
+    env->kvm_vcpu_dirty = 0;
+}
+
+void kvm_cpu_synchronize_post_init(CPUState *env)
+{
+    kvm_arch_put_registers(env, KVM_PUT_FULL_STATE);
+    env->kvm_vcpu_dirty = 0;
+}
+
 int kvm_cpu_exec(CPUState *env)
 {
     struct kvm_run *run = env->kvm_run;
@@ -785,7 +792,7 @@ int kvm_cpu_exec(CPUState *env)
         }
 
         if (env->kvm_vcpu_dirty) {
-            kvm_arch_put_registers(env);
+            kvm_arch_put_registers(env, KVM_PUT_RUNTIME_STATE);
             env->kvm_vcpu_dirty = 0;
         }
 
diff --git a/kvm.h b/kvm.h
index 888dfcb..874506b 100644
--- a/kvm.h
+++ b/kvm.h
@@ -87,7 +87,14 @@ int kvm_arch_pre_run(CPUState *env, struct kvm_run *run);
 
 int kvm_arch_get_registers(CPUState *env);
 
-int kvm_arch_put_registers(CPUState *env);
+/* state subset only touched by the VCPU itself during runtime */
+#define KVM_PUT_RUNTIME_STATE   1
+/* state subset modified during VCPU reset */
+#define KVM_PUT_RESET_STATE     2
+/* full state set, modified during initialization or on vmload */
+#define KVM_PUT_FULL_STATE      3
+
+int kvm_arch_put_registers(CPUState *env, int level);
 
 int kvm_arch_init(KVMState *s, int smp_cpus);
 
@@ -131,6 +138,8 @@ int kvm_check_extension(KVMState *s, unsigned int extension);
 uint32_t kvm_arch_get_supported_cpuid(CPUState *env, uint32_t function,
                                       int reg);
 void kvm_cpu_synchronize_state(CPUState *env);
+void kvm_cpu_synchronize_post_reset(CPUState *env);
+void kvm_cpu_synchronize_post_init(CPUState *env);
 
 /* generic hooks - to be moved/refactored once there are more users */
 
@@ -141,4 +150,18 @@ static inline void cpu_synchronize_state(CPUState *env)
     }
 }
 
+static inline void cpu_synchronize_post_reset(CPUState *env)
+{
+    if (kvm_enabled()) {
+        kvm_cpu_synchronize_post_reset(env);
+    }
+}
+
+static inline void cpu_synchronize_post_init(CPUState *env)
+{
+    if (kvm_enabled()) {
+        kvm_cpu_synchronize_post_init(env);
+    }
+}
+
 #endif
diff --git a/qemu-kvm-ia64.c b/qemu-kvm-ia64.c
index a11fde8..fc8110e 100644
--- a/qemu-kvm-ia64.c
+++ b/qemu-kvm-ia64.c
@@ -16,7 +16,7 @@ int kvm_arch_qemu_create_context(void)
     return 0;
 }
 
-void kvm_arch_load_regs(CPUState *env)
+void kvm_arch_load_regs(CPUState *env, int level)
 {
 }
 
diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
index 5db95f8..47667b3 100644
--- a/qemu-kvm-x86.c
+++ b/qemu-kvm-x86.c
@@ -803,7 +803,7 @@ static void get_seg(SegmentCache *lhs, const struct kvm_segment *rhs)
 	| (rhs->avl * DESC_AVL_MASK);
 }
 
-void kvm_arch_load_regs(CPUState *env)
+void kvm_arch_load_regs(CPUState *env, int level)
 {
     struct kvm_regs regs;
     struct kvm_fpu fpu;
@@ -1362,7 +1362,6 @@ void kvm_arch_push_nmi(void *opaque)
 void kvm_arch_cpu_reset(CPUState *env)
 {
     kvm_arch_reset_vcpu(env);
-    kvm_arch_load_regs(env);
     kvm_put_vcpu_events(env);
     if (!cpu_is_bsp(env)) {
 	if (kvm_irqchip_in_kernel()) {
diff --git a/qemu-kvm.c b/qemu-kvm.c
index 29365a9..365bb37 100644
--- a/qemu-kvm.c
+++ b/qemu-kvm.c
@@ -871,7 +871,7 @@ int pre_kvm_run(kvm_context_t kvm, CPUState *env)
     kvm_arch_pre_run(env, env->kvm_run);
 
     if (env->kvm_vcpu_dirty) {
-        kvm_arch_load_regs(env);
+        kvm_arch_load_regs(env, KVM_PUT_RUNTIME_STATE);
         env->kvm_vcpu_dirty = 0;
     }
 
@@ -1529,6 +1529,18 @@ void kvm_cpu_synchronize_state(CPUState *env)
         on_vcpu(env, do_kvm_cpu_synchronize_state, env);
 }
 
+void kvm_cpu_synchronize_post_reset(CPUState *env)
+{
+    kvm_arch_load_regs(env, KVM_PUT_RESET_STATE);
+    env->kvm_vcpu_dirty = 0;
+}
+
+void kvm_cpu_synchronize_post_init(CPUState *env)
+{
+    kvm_arch_load_regs(env, KVM_PUT_FULL_STATE);
+    env->kvm_vcpu_dirty = 0;
+}
+
 static void inject_interrupt(void *data)
 {
     cpu_interrupt(current_env, (long) data);
@@ -1874,8 +1886,6 @@ static void *ap_main_loop(void *_env)
 
     kvm_arch_init_vcpu(env);
 
-    kvm_arch_load_regs(env);
-
     /* signal VCPU creation */
     current_env->created = 1;
     pthread_cond_signal(&qemu_vcpu_cond);
diff --git a/qemu-kvm.h b/qemu-kvm.h
index ed00665..94b6763 100644
--- a/qemu-kvm.h
+++ b/qemu-kvm.h
@@ -893,7 +893,7 @@ int kvm_qemu_destroy_memory_alias(uint64_t phys_start);
 int kvm_arch_qemu_create_context(void);
 
 void kvm_arch_save_regs(CPUState *env);
-void kvm_arch_load_regs(CPUState *env);
+void kvm_arch_load_regs(CPUState *env, int level);
 void kvm_arch_load_mpstate(CPUState *env);
 void kvm_arch_save_mpstate(CPUState *env);
 int kvm_arch_has_work(CPUState *env);
diff --git a/savevm.c b/savevm.c
index 2fd3de6..f1e675c 100644
--- a/savevm.c
+++ b/savevm.c
@@ -1368,6 +1368,8 @@ int qemu_savevm_state_complete(Monitor *mon, QEMUFile *f)
 {
     SaveStateEntry *se;
 
+    cpu_synchronize_all_states();
+
     QTAILQ_FOREACH(se, &savevm_handlers, entry) {
         if (se->save_live_state == NULL)
             continue;
@@ -1568,6 +1570,8 @@ int qemu_loadvm_state(QEMUFile *f)
         }
     }
 
+    cpu_synchronize_all_post_init();
+
     ret = 0;
 
 out:
diff --git a/sysemu.h b/sysemu.h
index 647a468..01405e1 100644
--- a/sysemu.h
+++ b/sysemu.h
@@ -59,6 +59,10 @@ int load_vmstate(Monitor *mon, const char *name);
 void do_delvm(Monitor *mon, const QDict *qdict);
 void do_info_snapshots(Monitor *mon);
 
+void cpu_synchronize_all_states(void);
+void cpu_synchronize_all_post_reset(void);
+void cpu_synchronize_all_post_init(void);
+
 void qemu_announce_self(void);
 
 void main_loop_wait(int timeout);
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index dd636f1..23729ef 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -893,7 +893,7 @@ static int kvm_guest_debug_workarounds(CPUState *env)
 }
 
 #ifdef KVM_UPSTREAM
-int kvm_arch_put_registers(CPUState *env)
+int kvm_arch_put_registers(CPUState *env, int level)
 {
     int ret;
 
diff --git a/target-i386/machine.c b/target-i386/machine.c
index 0b8a33a..bebde82 100644
--- a/target-i386/machine.c
+++ b/target-i386/machine.c
@@ -322,7 +322,6 @@ static void cpu_pre_save(void *opaque)
     CPUState *env = opaque;
     int i;
 
-    cpu_synchronize_state(env);
     if (kvm_enabled()) {
         kvm_save_mpstate(env);
         kvm_get_vcpu_events(env);
@@ -342,14 +341,6 @@ static void cpu_pre_save(void *opaque)
 #endif
 }
 
-static int cpu_pre_load(void *opaque)
-{
-    CPUState *env = opaque;
-
-    cpu_synchronize_state(env);
-    return 0;
-}
-
 static int cpu_post_load(void *opaque, int version_id)
 {
     CPUState *env = opaque;
@@ -389,7 +380,6 @@ static const VMStateDescription vmstate_cpu = {
     .minimum_version_id = 3,
     .minimum_version_id_old = 3,
     .pre_save = cpu_pre_save,
-    .pre_load = cpu_pre_load,
     .post_load = cpu_post_load,
     .fields      = (VMStateField []) {
         VMSTATE_UINTTL_ARRAY(regs, CPUState, CPU_NB_REGS),
diff --git a/target-ia64/machine.c b/target-ia64/machine.c
index 7d29575..fdbeeef 100644
--- a/target-ia64/machine.c
+++ b/target-ia64/machine.c
@@ -9,7 +9,6 @@ void cpu_save(QEMUFile *f, void *opaque)
     CPUState *env = opaque;
 
     if (kvm_enabled()) {
-        kvm_arch_save_regs(env);
         kvm_arch_save_mpstate(env);
     }
 }
@@ -19,7 +18,6 @@ int cpu_load(QEMUFile *f, void *opaque, int version_id)
     CPUState *env = opaque;
 
     if (kvm_enabled()) {
-        kvm_arch_load_regs(env);
         kvm_arch_load_mpstate(env);
     }
     return 0;
diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index 8ad0037..aa3d432 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -73,7 +73,7 @@ void kvm_arch_reset_vcpu(CPUState *env)
 {
 }
 
-int kvm_arch_put_registers(CPUState *env)
+int kvm_arch_put_registers(CPUState *env, int level)
 {
     struct kvm_regs regs;
     int ret;
diff --git a/target-ppc/machine.c b/target-ppc/machine.c
index ead38e1..81d3f72 100644
--- a/target-ppc/machine.c
+++ b/target-ppc/machine.c
@@ -8,8 +8,6 @@ void cpu_save(QEMUFile *f, void *opaque)
     CPUState *env = (CPUState *)opaque;
     unsigned int i, j;
 
-    cpu_synchronize_state(env);
-
     for (i = 0; i < 32; i++)
         qemu_put_betls(f, &env->gpr[i]);
 #if !defined(TARGET_PPC64)
@@ -97,8 +95,6 @@ int cpu_load(QEMUFile *f, void *opaque, int version_id)
     CPUState *env = (CPUState *)opaque;
     unsigned int i, j;
 
-    cpu_synchronize_state(env);
-
     for (i = 0; i < 32; i++)
         qemu_get_betls(f, &env->gpr[i]);
 #if !defined(TARGET_PPC64)
diff --git a/target-s390x/kvm.c b/target-s390x/kvm.c
index 0199a65..72e77b0 100644
--- a/target-s390x/kvm.c
+++ b/target-s390x/kvm.c
@@ -91,7 +91,7 @@ void kvm_arch_reset_vcpu(CPUState *env)
     /* FIXME: add code to reset vcpu. */
 }
 
-int kvm_arch_put_registers(CPUState *env)
+int kvm_arch_put_registers(CPUState *env, int level)
 {
     struct kvm_regs regs;
     int ret;
@@ -296,7 +296,6 @@ static int handle_hypercall(CPUState *env, struct kvm_run *run)
 
     cpu_synchronize_state(env);
     r = s390_virtio_hypercall(env);
-    kvm_arch_put_registers(env);
 
     return r;
 }
diff --git a/vl.c b/vl.c
index 729c955..8847b70 100644
--- a/vl.c
+++ b/vl.c
@@ -3031,6 +3031,33 @@ static void nographic_update(void *opaque)
     qemu_mod_timer(nographic_timer, interval + qemu_get_clock(rt_clock));
 }
 
+void cpu_synchronize_all_states(void)
+{
+    CPUState *cpu;
+
+    for (cpu = first_cpu; cpu; cpu = cpu->next_cpu) {
+        cpu_synchronize_state(cpu);
+    }
+}
+
+void cpu_synchronize_all_post_reset(void)
+{
+    CPUState *cpu;
+
+    for (cpu = first_cpu; cpu; cpu = cpu->next_cpu) {
+        cpu_synchronize_post_reset(cpu);
+    }
+}
+
+void cpu_synchronize_all_post_init(void)
+{
+    CPUState *cpu;
+
+    for (cpu = first_cpu; cpu; cpu = cpu->next_cpu) {
+        cpu_synchronize_post_init(cpu);
+    }
+}
+
 struct vm_change_state_entry {
     VMChangeStateHandler *cb;
     void *opaque;
@@ -3179,6 +3206,7 @@ void qemu_system_reset(void)
     QTAILQ_FOREACH_SAFE(re, &reset_handlers, entry, nre) {
         re->func(re->opaque);
     }
+    cpu_synchronize_all_post_reset();
 }
 
 void qemu_system_reset_request(void)
@@ -6030,6 +6058,7 @@ int main(int argc, char **argv, char **envp)
     machine->init(ram_size, boot_devices,
                   kernel_filename, kernel_cmdline, initrd_filename, cpu_model);
 
+    cpu_synchronize_all_post_init();
 
 #ifndef _WIN32
     /* must be after terminal init, SDL library changes signal handlers */
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [Qemu-devel] [PATCH v4 02/10] qemu-kvm: Rework VCPU state writeback API
@ 2010-03-01 17:17   ` Jan Kiszka
  0 siblings, 0 replies; 48+ messages in thread
From: Jan Kiszka @ 2010-03-01 17:17 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: Gleb Natapov, qemu-devel, kvm

This grand cleanup drops all reset and vmsave/load related
synchronization points in favor of four(!) generic hooks:

- cpu_synchronize_all_states in qemu_savevm_state_complete
  (initial sync from kernel before vmsave)
- cpu_synchronize_all_post_init in qemu_loadvm_state
  (writeback after vmload)
- cpu_synchronize_all_post_init in main after machine init
- cpu_synchronize_all_post_reset in qemu_system_reset
  (writeback after system reset)

These writeback points + the existing one of VCPU exec after
cpu_synchronize_state map on three levels of writeback:

- KVM_PUT_RUNTIME_STATE (during runtime, other VCPUs continue to run)
- KVM_PUT_RESET_STATE   (on synchronous system reset, all VCPUs stopped)
- KVM_PUT_FULL_STATE    (on init or vmload, all VCPUs stopped as well)

This level is passed to the arch-specific VCPU state writing function
that will decide which concrete substates need to be written. That way,
no writer of load, save or reset functions that interact with in-kernel
KVM states will ever have to worry about synchronization again. That
also means that a lot of reasons for races, segfaults and deadlocks are
eliminated.

cpu_synchronize_state remains untouched, just as Anthony suggested. We
continue to need it before reading or writing of VCPU states that are
also tracked by in-kernel KVM subsystems.

Consequently, this patch removes many cpu_synchronize_state calls that
are now redundant, just like remaining explicit register syncs. It does
not touch qemu-kvm's special hooks for mpstate, vcpu_events, or tsc
loading. They will be cleaned up by individual patches.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 exec.c                |   17 -----------------
 hw/apic.c             |    3 ---
 hw/pc.c               |    1 -
 hw/ppc_newworld.c     |    3 ---
 hw/ppc_oldworld.c     |    3 ---
 hw/s390-virtio.c      |    1 -
 kvm-all.c             |   19 +++++++++++++------
 kvm.h                 |   25 ++++++++++++++++++++++++-
 qemu-kvm-ia64.c       |    2 +-
 qemu-kvm-x86.c        |    3 +--
 qemu-kvm.c            |   16 +++++++++++++---
 qemu-kvm.h            |    2 +-
 savevm.c              |    4 ++++
 sysemu.h              |    4 ++++
 target-i386/kvm.c     |    2 +-
 target-i386/machine.c |   10 ----------
 target-ia64/machine.c |    2 --
 target-ppc/kvm.c      |    2 +-
 target-ppc/machine.c  |    4 ----
 target-s390x/kvm.c    |    3 +--
 vl.c                  |   29 +++++++++++++++++++++++++++++
 21 files changed, 93 insertions(+), 62 deletions(-)

diff --git a/exec.c b/exec.c
index b647512..69003c2 100644
--- a/exec.c
+++ b/exec.c
@@ -530,21 +530,6 @@ void cpu_exec_init_all(unsigned long tb_size)
 
 #if defined(CPU_SAVE_VERSION) && !defined(CONFIG_USER_ONLY)
 
-static void cpu_common_pre_save(void *opaque)
-{
-    CPUState *env = opaque;
-
-    cpu_synchronize_state(env);
-}
-
-static int cpu_common_pre_load(void *opaque)
-{
-    CPUState *env = opaque;
-
-    cpu_synchronize_state(env);
-    return 0;
-}
-
 static int cpu_common_post_load(void *opaque, int version_id)
 {
     CPUState *env = opaque;
@@ -562,8 +547,6 @@ static const VMStateDescription vmstate_cpu_common = {
     .version_id = 1,
     .minimum_version_id = 1,
     .minimum_version_id_old = 1,
-    .pre_save = cpu_common_pre_save,
-    .pre_load = cpu_common_pre_load,
     .post_load = cpu_common_post_load,
     .fields      = (VMStateField []) {
         VMSTATE_UINT32(halted, CPUState),
diff --git a/hw/apic.c b/hw/apic.c
index ae805dc..3e03e10 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -488,7 +488,6 @@ void apic_init_reset(CPUState *env)
     if (!s)
         return;
 
-    cpu_synchronize_state(env);
     s->tpr = 0;
     s->spurious_vec = 0xff;
     s->log_dest = 0;
@@ -1070,8 +1069,6 @@ static void apic_reset(void *opaque)
     APICState *s = opaque;
     int bsp;
 
-    cpu_synchronize_state(s->cpu_env);
-
     bsp = cpu_is_bsp(s->cpu_env);
     s->apicbase = 0xfee00000 |
         (bsp ? MSR_IA32_APICBASE_BSP : 0) | MSR_IA32_APICBASE_ENABLE;
diff --git a/hw/pc.c b/hw/pc.c
index 67af649..b90a79e 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -779,7 +779,6 @@ CPUState *pc_new_cpu(const char *cpu_model)
         fprintf(stderr, "Unable to find x86 CPU definition\n");
         exit(1);
     }
-    env->kvm_vcpu_dirty = 1;
     if ((env->cpuid_features & CPUID_APIC) || smp_cpus > 1) {
         env->cpuid_apic_id = env->cpu_index;
         /* APIC reset callback resets cpu */
diff --git a/hw/ppc_newworld.c b/hw/ppc_newworld.c
index bc86c85..d4f9013 100644
--- a/hw/ppc_newworld.c
+++ b/hw/ppc_newworld.c
@@ -167,9 +167,6 @@ static void ppc_core99_init (ram_addr_t ram_size,
         envs[i] = env;
     }
 
-    /* Make sure all register sets take effect */
-    cpu_synchronize_state(env);
-
     /* allocate RAM */
     ram_offset = qemu_ram_alloc(ram_size);
     cpu_register_physical_memory(0, ram_size, ram_offset);
diff --git a/hw/ppc_oldworld.c b/hw/ppc_oldworld.c
index 04a7835..93c95ba 100644
--- a/hw/ppc_oldworld.c
+++ b/hw/ppc_oldworld.c
@@ -165,9 +165,6 @@ static void ppc_heathrow_init (ram_addr_t ram_size,
         envs[i] = env;
     }
 
-    /* Make sure all register sets take effect */
-    cpu_synchronize_state(env);
-
     /* allocate RAM */
     if (ram_size > (2047 << 20)) {
         fprintf(stderr,
diff --git a/hw/s390-virtio.c b/hw/s390-virtio.c
index 3582728..ad3386f 100644
--- a/hw/s390-virtio.c
+++ b/hw/s390-virtio.c
@@ -185,7 +185,6 @@ static void s390_init(ram_addr_t ram_size,
             exit(1);
         }
 
-        cpu_synchronize_state(env);
         env->psw.addr = KERN_IMAGE_START;
         env->psw.mask = 0x0000000180000000ULL;
     }
diff --git a/kvm-all.c b/kvm-all.c
index 06708a5..7bb3f7b 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -156,10 +156,6 @@ static void kvm_reset_vcpu(void *opaque)
     CPUState *env = opaque;
 
     kvm_arch_reset_vcpu(env);
-    if (kvm_arch_put_registers(env)) {
-        fprintf(stderr, "Fatal: kvm vcpu reset failed\n");
-        abort();
-    }
 }
 #endif
 
@@ -216,7 +212,6 @@ int kvm_init_vcpu(CPUState *env)
     if (ret == 0) {
         qemu_register_reset(kvm_reset_vcpu, env);
         kvm_arch_reset_vcpu(env);
-        ret = kvm_arch_put_registers(env);
     }
 err:
     return ret;
@@ -770,6 +765,18 @@ void kvm_cpu_synchronize_state(CPUState *env)
     }
 }
 
+void kvm_cpu_synchronize_post_reset(CPUState *env)
+{
+    kvm_arch_put_registers(env, KVM_PUT_RESET_STATE);
+    env->kvm_vcpu_dirty = 0;
+}
+
+void kvm_cpu_synchronize_post_init(CPUState *env)
+{
+    kvm_arch_put_registers(env, KVM_PUT_FULL_STATE);
+    env->kvm_vcpu_dirty = 0;
+}
+
 int kvm_cpu_exec(CPUState *env)
 {
     struct kvm_run *run = env->kvm_run;
@@ -785,7 +792,7 @@ int kvm_cpu_exec(CPUState *env)
         }
 
         if (env->kvm_vcpu_dirty) {
-            kvm_arch_put_registers(env);
+            kvm_arch_put_registers(env, KVM_PUT_RUNTIME_STATE);
             env->kvm_vcpu_dirty = 0;
         }
 
diff --git a/kvm.h b/kvm.h
index 888dfcb..874506b 100644
--- a/kvm.h
+++ b/kvm.h
@@ -87,7 +87,14 @@ int kvm_arch_pre_run(CPUState *env, struct kvm_run *run);
 
 int kvm_arch_get_registers(CPUState *env);
 
-int kvm_arch_put_registers(CPUState *env);
+/* state subset only touched by the VCPU itself during runtime */
+#define KVM_PUT_RUNTIME_STATE   1
+/* state subset modified during VCPU reset */
+#define KVM_PUT_RESET_STATE     2
+/* full state set, modified during initialization or on vmload */
+#define KVM_PUT_FULL_STATE      3
+
+int kvm_arch_put_registers(CPUState *env, int level);
 
 int kvm_arch_init(KVMState *s, int smp_cpus);
 
@@ -131,6 +138,8 @@ int kvm_check_extension(KVMState *s, unsigned int extension);
 uint32_t kvm_arch_get_supported_cpuid(CPUState *env, uint32_t function,
                                       int reg);
 void kvm_cpu_synchronize_state(CPUState *env);
+void kvm_cpu_synchronize_post_reset(CPUState *env);
+void kvm_cpu_synchronize_post_init(CPUState *env);
 
 /* generic hooks - to be moved/refactored once there are more users */
 
@@ -141,4 +150,18 @@ static inline void cpu_synchronize_state(CPUState *env)
     }
 }
 
+static inline void cpu_synchronize_post_reset(CPUState *env)
+{
+    if (kvm_enabled()) {
+        kvm_cpu_synchronize_post_reset(env);
+    }
+}
+
+static inline void cpu_synchronize_post_init(CPUState *env)
+{
+    if (kvm_enabled()) {
+        kvm_cpu_synchronize_post_init(env);
+    }
+}
+
 #endif
diff --git a/qemu-kvm-ia64.c b/qemu-kvm-ia64.c
index a11fde8..fc8110e 100644
--- a/qemu-kvm-ia64.c
+++ b/qemu-kvm-ia64.c
@@ -16,7 +16,7 @@ int kvm_arch_qemu_create_context(void)
     return 0;
 }
 
-void kvm_arch_load_regs(CPUState *env)
+void kvm_arch_load_regs(CPUState *env, int level)
 {
 }
 
diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
index 5db95f8..47667b3 100644
--- a/qemu-kvm-x86.c
+++ b/qemu-kvm-x86.c
@@ -803,7 +803,7 @@ static void get_seg(SegmentCache *lhs, const struct kvm_segment *rhs)
 	| (rhs->avl * DESC_AVL_MASK);
 }
 
-void kvm_arch_load_regs(CPUState *env)
+void kvm_arch_load_regs(CPUState *env, int level)
 {
     struct kvm_regs regs;
     struct kvm_fpu fpu;
@@ -1362,7 +1362,6 @@ void kvm_arch_push_nmi(void *opaque)
 void kvm_arch_cpu_reset(CPUState *env)
 {
     kvm_arch_reset_vcpu(env);
-    kvm_arch_load_regs(env);
     kvm_put_vcpu_events(env);
     if (!cpu_is_bsp(env)) {
 	if (kvm_irqchip_in_kernel()) {
diff --git a/qemu-kvm.c b/qemu-kvm.c
index 29365a9..365bb37 100644
--- a/qemu-kvm.c
+++ b/qemu-kvm.c
@@ -871,7 +871,7 @@ int pre_kvm_run(kvm_context_t kvm, CPUState *env)
     kvm_arch_pre_run(env, env->kvm_run);
 
     if (env->kvm_vcpu_dirty) {
-        kvm_arch_load_regs(env);
+        kvm_arch_load_regs(env, KVM_PUT_RUNTIME_STATE);
         env->kvm_vcpu_dirty = 0;
     }
 
@@ -1529,6 +1529,18 @@ void kvm_cpu_synchronize_state(CPUState *env)
         on_vcpu(env, do_kvm_cpu_synchronize_state, env);
 }
 
+void kvm_cpu_synchronize_post_reset(CPUState *env)
+{
+    kvm_arch_load_regs(env, KVM_PUT_RESET_STATE);
+    env->kvm_vcpu_dirty = 0;
+}
+
+void kvm_cpu_synchronize_post_init(CPUState *env)
+{
+    kvm_arch_load_regs(env, KVM_PUT_FULL_STATE);
+    env->kvm_vcpu_dirty = 0;
+}
+
 static void inject_interrupt(void *data)
 {
     cpu_interrupt(current_env, (long) data);
@@ -1874,8 +1886,6 @@ static void *ap_main_loop(void *_env)
 
     kvm_arch_init_vcpu(env);
 
-    kvm_arch_load_regs(env);
-
     /* signal VCPU creation */
     current_env->created = 1;
     pthread_cond_signal(&qemu_vcpu_cond);
diff --git a/qemu-kvm.h b/qemu-kvm.h
index ed00665..94b6763 100644
--- a/qemu-kvm.h
+++ b/qemu-kvm.h
@@ -893,7 +893,7 @@ int kvm_qemu_destroy_memory_alias(uint64_t phys_start);
 int kvm_arch_qemu_create_context(void);
 
 void kvm_arch_save_regs(CPUState *env);
-void kvm_arch_load_regs(CPUState *env);
+void kvm_arch_load_regs(CPUState *env, int level);
 void kvm_arch_load_mpstate(CPUState *env);
 void kvm_arch_save_mpstate(CPUState *env);
 int kvm_arch_has_work(CPUState *env);
diff --git a/savevm.c b/savevm.c
index 2fd3de6..f1e675c 100644
--- a/savevm.c
+++ b/savevm.c
@@ -1368,6 +1368,8 @@ int qemu_savevm_state_complete(Monitor *mon, QEMUFile *f)
 {
     SaveStateEntry *se;
 
+    cpu_synchronize_all_states();
+
     QTAILQ_FOREACH(se, &savevm_handlers, entry) {
         if (se->save_live_state == NULL)
             continue;
@@ -1568,6 +1570,8 @@ int qemu_loadvm_state(QEMUFile *f)
         }
     }
 
+    cpu_synchronize_all_post_init();
+
     ret = 0;
 
 out:
diff --git a/sysemu.h b/sysemu.h
index 647a468..01405e1 100644
--- a/sysemu.h
+++ b/sysemu.h
@@ -59,6 +59,10 @@ int load_vmstate(Monitor *mon, const char *name);
 void do_delvm(Monitor *mon, const QDict *qdict);
 void do_info_snapshots(Monitor *mon);
 
+void cpu_synchronize_all_states(void);
+void cpu_synchronize_all_post_reset(void);
+void cpu_synchronize_all_post_init(void);
+
 void qemu_announce_self(void);
 
 void main_loop_wait(int timeout);
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index dd636f1..23729ef 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -893,7 +893,7 @@ static int kvm_guest_debug_workarounds(CPUState *env)
 }
 
 #ifdef KVM_UPSTREAM
-int kvm_arch_put_registers(CPUState *env)
+int kvm_arch_put_registers(CPUState *env, int level)
 {
     int ret;
 
diff --git a/target-i386/machine.c b/target-i386/machine.c
index 0b8a33a..bebde82 100644
--- a/target-i386/machine.c
+++ b/target-i386/machine.c
@@ -322,7 +322,6 @@ static void cpu_pre_save(void *opaque)
     CPUState *env = opaque;
     int i;
 
-    cpu_synchronize_state(env);
     if (kvm_enabled()) {
         kvm_save_mpstate(env);
         kvm_get_vcpu_events(env);
@@ -342,14 +341,6 @@ static void cpu_pre_save(void *opaque)
 #endif
 }
 
-static int cpu_pre_load(void *opaque)
-{
-    CPUState *env = opaque;
-
-    cpu_synchronize_state(env);
-    return 0;
-}
-
 static int cpu_post_load(void *opaque, int version_id)
 {
     CPUState *env = opaque;
@@ -389,7 +380,6 @@ static const VMStateDescription vmstate_cpu = {
     .minimum_version_id = 3,
     .minimum_version_id_old = 3,
     .pre_save = cpu_pre_save,
-    .pre_load = cpu_pre_load,
     .post_load = cpu_post_load,
     .fields      = (VMStateField []) {
         VMSTATE_UINTTL_ARRAY(regs, CPUState, CPU_NB_REGS),
diff --git a/target-ia64/machine.c b/target-ia64/machine.c
index 7d29575..fdbeeef 100644
--- a/target-ia64/machine.c
+++ b/target-ia64/machine.c
@@ -9,7 +9,6 @@ void cpu_save(QEMUFile *f, void *opaque)
     CPUState *env = opaque;
 
     if (kvm_enabled()) {
-        kvm_arch_save_regs(env);
         kvm_arch_save_mpstate(env);
     }
 }
@@ -19,7 +18,6 @@ int cpu_load(QEMUFile *f, void *opaque, int version_id)
     CPUState *env = opaque;
 
     if (kvm_enabled()) {
-        kvm_arch_load_regs(env);
         kvm_arch_load_mpstate(env);
     }
     return 0;
diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index 8ad0037..aa3d432 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -73,7 +73,7 @@ void kvm_arch_reset_vcpu(CPUState *env)
 {
 }
 
-int kvm_arch_put_registers(CPUState *env)
+int kvm_arch_put_registers(CPUState *env, int level)
 {
     struct kvm_regs regs;
     int ret;
diff --git a/target-ppc/machine.c b/target-ppc/machine.c
index ead38e1..81d3f72 100644
--- a/target-ppc/machine.c
+++ b/target-ppc/machine.c
@@ -8,8 +8,6 @@ void cpu_save(QEMUFile *f, void *opaque)
     CPUState *env = (CPUState *)opaque;
     unsigned int i, j;
 
-    cpu_synchronize_state(env);
-
     for (i = 0; i < 32; i++)
         qemu_put_betls(f, &env->gpr[i]);
 #if !defined(TARGET_PPC64)
@@ -97,8 +95,6 @@ int cpu_load(QEMUFile *f, void *opaque, int version_id)
     CPUState *env = (CPUState *)opaque;
     unsigned int i, j;
 
-    cpu_synchronize_state(env);
-
     for (i = 0; i < 32; i++)
         qemu_get_betls(f, &env->gpr[i]);
 #if !defined(TARGET_PPC64)
diff --git a/target-s390x/kvm.c b/target-s390x/kvm.c
index 0199a65..72e77b0 100644
--- a/target-s390x/kvm.c
+++ b/target-s390x/kvm.c
@@ -91,7 +91,7 @@ void kvm_arch_reset_vcpu(CPUState *env)
     /* FIXME: add code to reset vcpu. */
 }
 
-int kvm_arch_put_registers(CPUState *env)
+int kvm_arch_put_registers(CPUState *env, int level)
 {
     struct kvm_regs regs;
     int ret;
@@ -296,7 +296,6 @@ static int handle_hypercall(CPUState *env, struct kvm_run *run)
 
     cpu_synchronize_state(env);
     r = s390_virtio_hypercall(env);
-    kvm_arch_put_registers(env);
 
     return r;
 }
diff --git a/vl.c b/vl.c
index 729c955..8847b70 100644
--- a/vl.c
+++ b/vl.c
@@ -3031,6 +3031,33 @@ static void nographic_update(void *opaque)
     qemu_mod_timer(nographic_timer, interval + qemu_get_clock(rt_clock));
 }
 
+void cpu_synchronize_all_states(void)
+{
+    CPUState *cpu;
+
+    for (cpu = first_cpu; cpu; cpu = cpu->next_cpu) {
+        cpu_synchronize_state(cpu);
+    }
+}
+
+void cpu_synchronize_all_post_reset(void)
+{
+    CPUState *cpu;
+
+    for (cpu = first_cpu; cpu; cpu = cpu->next_cpu) {
+        cpu_synchronize_post_reset(cpu);
+    }
+}
+
+void cpu_synchronize_all_post_init(void)
+{
+    CPUState *cpu;
+
+    for (cpu = first_cpu; cpu; cpu = cpu->next_cpu) {
+        cpu_synchronize_post_init(cpu);
+    }
+}
+
 struct vm_change_state_entry {
     VMChangeStateHandler *cb;
     void *opaque;
@@ -3179,6 +3206,7 @@ void qemu_system_reset(void)
     QTAILQ_FOREACH_SAFE(re, &reset_handlers, entry, nre) {
         re->func(re->opaque);
     }
+    cpu_synchronize_all_post_reset();
 }
 
 void qemu_system_reset_request(void)
@@ -6030,6 +6058,7 @@ int main(int argc, char **argv, char **envp)
     machine->init(ram_size, boot_devices,
                   kernel_filename, kernel_cmdline, initrd_filename, cpu_model);
 
+    cpu_synchronize_all_post_init();
 
 #ifndef _WIN32
     /* must be after terminal init, SDL library changes signal handlers */
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v4 03/10] x86: Extend validity of cpu_is_bsp
  2010-03-01 17:17 ` [Qemu-devel] " Jan Kiszka
@ 2010-03-01 17:17   ` Jan Kiszka
  -1 siblings, 0 replies; 48+ messages in thread
From: Jan Kiszka @ 2010-03-01 17:17 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, qemu-devel, Gleb Natapov

As we hard-wire the BSP to CPU 0 anyway and cpuid_apic_id equals
cpu_index, cpu_is_bsp can also be based on the latter directly. This
will help an early user of it: KVM while initializing mp_state.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 hw/pc.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/hw/pc.c b/hw/pc.c
index b90a79e..58c32ea 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -767,7 +767,8 @@ static void pc_init_ne2k_isa(NICInfo *nd)
 
 int cpu_is_bsp(CPUState *env)
 {
-    return env->cpuid_apic_id == 0;
+    /* We hard-wire the BSP to the first CPU. */
+    return env->cpu_index == 0;
 }
 
 CPUState *pc_new_cpu(const char *cpu_model)
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [Qemu-devel] [PATCH v4 03/10] x86: Extend validity of cpu_is_bsp
@ 2010-03-01 17:17   ` Jan Kiszka
  0 siblings, 0 replies; 48+ messages in thread
From: Jan Kiszka @ 2010-03-01 17:17 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: Gleb Natapov, qemu-devel, kvm

As we hard-wire the BSP to CPU 0 anyway and cpuid_apic_id equals
cpu_index, cpu_is_bsp can also be based on the latter directly. This
will help an early user of it: KVM while initializing mp_state.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 hw/pc.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/hw/pc.c b/hw/pc.c
index b90a79e..58c32ea 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -767,7 +767,8 @@ static void pc_init_ne2k_isa(NICInfo *nd)
 
 int cpu_is_bsp(CPUState *env)
 {
-    return env->cpuid_apic_id == 0;
+    /* We hard-wire the BSP to the first CPU. */
+    return env->cpu_index == 0;
 }
 
 CPUState *pc_new_cpu(const char *cpu_model)
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v4 04/10] qemu-kvm: Clean up mpstate synchronization
  2010-03-01 17:17 ` [Qemu-devel] " Jan Kiszka
@ 2010-03-01 17:17   ` Jan Kiszka
  -1 siblings, 0 replies; 48+ messages in thread
From: Jan Kiszka @ 2010-03-01 17:17 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, qemu-devel, Gleb Natapov

Push mpstate reading/writing into kvm_arch_load/save_regs and, on x86,
properly synchronize with halted in the accessor functions.

At this chance, drop the special reset of interrupt_request and halted
in kvm_arch_cpu_reset. The former is done via memset in cpu_reset, the
latter in apic_init_reset anyway.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 hw/apic.c             |    7 ---
 qemu-kvm-ia64.c       |    4 +-
 qemu-kvm-x86.c        |  100 +++++++++++++++++++++++++++++-------------------
 qemu-kvm.c            |   30 ---------------
 qemu-kvm.h            |   15 -------
 target-i386/machine.c |    6 ---
 target-ia64/machine.c |    3 +
 7 files changed, 66 insertions(+), 99 deletions(-)

diff --git a/hw/apic.c b/hw/apic.c
index 3e03e10..092c61e 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -507,13 +507,6 @@ void apic_init_reset(CPUState *env)
     s->wait_for_sipi = 1;
 
     env->halted = !(s->apicbase & MSR_IA32_APICBASE_BSP);
-#ifdef KVM_CAP_MP_STATE
-    if (kvm_enabled() && kvm_irqchip_in_kernel()) {
-        env->mp_state
-            = env->halted ? KVM_MP_STATE_UNINITIALIZED : KVM_MP_STATE_RUNNABLE;
-        kvm_load_mpstate(env);
-    }
-#endif
 }
 
 static void apic_startup(APICState *s, int vector_num)
diff --git a/qemu-kvm-ia64.c b/qemu-kvm-ia64.c
index fc8110e..39bcbeb 100644
--- a/qemu-kvm-ia64.c
+++ b/qemu-kvm-ia64.c
@@ -124,7 +124,9 @@ void kvm_arch_cpu_reset(CPUState *env)
 {
     if (kvm_irqchip_in_kernel(kvm_context)) {
 #ifdef KVM_CAP_MP_STATE
-	kvm_reset_mpstate(env->kvm_cpu_state.vcpu_ctx);
+        struct kvm_mp_state mp_state = {.mp_state = KVM_MP_STATE_UNINITIALIZED
+        };
+        kvm_set_mpstate(env, &mp_state);
 #endif
     } else {
 	env->interrupt_request &= ~CPU_INTERRUPT_HARD;
diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
index 47667b3..61b3dde 100644
--- a/qemu-kvm-x86.c
+++ b/qemu-kvm-x86.c
@@ -754,6 +754,56 @@ static int get_msr_entry(struct kvm_msr_entry *entry, CPUState *env)
         return 0;
 }
 
+static void kvm_arch_save_mpstate(CPUState *env)
+{
+#ifdef KVM_CAP_MP_STATE
+    int r;
+    struct kvm_mp_state mp_state;
+
+    r = kvm_get_mpstate(env, &mp_state);
+    if (r < 0) {
+        env->mp_state = -1;
+    } else {
+        env->mp_state = mp_state.mp_state;
+        if (kvm_irqchip_in_kernel()) {
+            env->halted = (env->mp_state == KVM_MP_STATE_HALTED);
+        }
+    }
+#else
+    env->mp_state = -1;
+#endif
+}
+
+static void kvm_arch_load_mpstate(CPUState *env)
+{
+#ifdef KVM_CAP_MP_STATE
+    struct kvm_mp_state mp_state;
+
+    /*
+     * -1 indicates that the host did not support GET_MP_STATE ioctl,
+     *  so don't touch it.
+     */
+    if (env->mp_state != -1) {
+        mp_state.mp_state = env->mp_state;
+        kvm_set_mpstate(env, &mp_state);
+    }
+#endif
+}
+
+static void kvm_reset_mpstate(CPUState *env)
+{
+#ifdef KVM_CAP_MP_STATE
+    if (kvm_check_extension(kvm_state, KVM_CAP_MP_STATE)) {
+        if (kvm_irqchip_in_kernel()) {
+            env->mp_state = cpu_is_bsp(env) ? KVM_MP_STATE_RUNNABLE :
+                                              KVM_MP_STATE_UNINITIALIZED;
+        } else {
+            env->mp_state = KVM_MP_STATE_RUNNABLE;
+        }
+    }
+#endif
+}
+
 static void set_v8086_seg(struct kvm_segment *lhs, const SegmentCache *rhs)
 {
     lhs->selector = rhs->selector;
@@ -922,6 +972,14 @@ void kvm_arch_load_regs(CPUState *env, int level)
     if (rc == -1)
         perror("kvm_set_msrs FAILED");
 
+    if (level >= KVM_PUT_RESET_STATE) {
+        kvm_arch_load_mpstate(env);
+    }
+    if (kvm_irqchip_in_kernel()) {
+        /* Avoid deadlock: no user space IRQ will ever clear it. */
+        env->halted = 0;
+    }
+
     /* must be last */
     kvm_guest_debug_workarounds(env);
 }
@@ -938,36 +996,6 @@ void kvm_load_tsc(CPUState *env)
         perror("kvm_set_tsc FAILED.\n");
 }
 
-void kvm_arch_save_mpstate(CPUState *env)
-{
-#ifdef KVM_CAP_MP_STATE
-    int r;
-    struct kvm_mp_state mp_state;
-
-    r = kvm_get_mpstate(env, &mp_state);
-    if (r < 0)
-        env->mp_state = -1;
-    else
-        env->mp_state = mp_state.mp_state;
-#else
-    env->mp_state = -1;
-#endif
-}
-
-void kvm_arch_load_mpstate(CPUState *env)
-{
-#ifdef KVM_CAP_MP_STATE
-    struct kvm_mp_state mp_state = { .mp_state = env->mp_state };
-
-    /*
-     * -1 indicates that the host did not support GET_MP_STATE ioctl,
-     *  so don't touch it.
-     */
-    if (env->mp_state != -1)
-        kvm_set_mpstate(env, &mp_state);
-#endif
-}
-
 void kvm_arch_save_regs(CPUState *env)
 {
     struct kvm_regs regs;
@@ -1290,6 +1318,7 @@ int kvm_arch_init_vcpu(CPUState *cenv)
 #ifdef KVM_EXIT_TPR_ACCESS
     kvm_tpr_vcpu_start(cenv);
 #endif
+    kvm_reset_mpstate(cenv);
     return 0;
 }
 
@@ -1363,16 +1392,7 @@ void kvm_arch_cpu_reset(CPUState *env)
 {
     kvm_arch_reset_vcpu(env);
     kvm_put_vcpu_events(env);
-    if (!cpu_is_bsp(env)) {
-	if (kvm_irqchip_in_kernel()) {
-#ifdef KVM_CAP_MP_STATE
-	    kvm_reset_mpstate(env);
-#endif
-	} else {
-	    env->interrupt_request &= ~CPU_INTERRUPT_HARD;
-	    env->halted = 1;
-	}
-    }
+    kvm_reset_mpstate(env);
 }
 
 #ifdef CONFIG_KVM_DEVICE_ASSIGNMENT
diff --git a/qemu-kvm.c b/qemu-kvm.c
index 365bb37..222ca97 100644
--- a/qemu-kvm.c
+++ b/qemu-kvm.c
@@ -1573,36 +1573,6 @@ void kvm_update_interrupt_request(CPUState *env)
     }
 }
 
-static void kvm_do_load_mpstate(void *_env)
-{
-    CPUState *env = _env;
-
-    kvm_arch_load_mpstate(env);
-}
-
-void kvm_load_mpstate(CPUState *env)
-{
-    if (kvm_enabled() && qemu_system_ready && kvm_vcpu_inited(env))
-        on_vcpu(env, kvm_do_load_mpstate, env);
-}
-
-static void kvm_do_save_mpstate(void *_env)
-{
-    CPUState *env = _env;
-
-    kvm_arch_save_mpstate(env);
-#ifdef KVM_CAP_MP_STATE
-    if (kvm_irqchip_in_kernel())
-        env->halted = (env->mp_state == KVM_MP_STATE_HALTED);
-#endif
-}
-
-void kvm_save_mpstate(CPUState *env)
-{
-    if (kvm_enabled())
-        on_vcpu(env, kvm_do_save_mpstate, env);
-}
-
 int kvm_cpu_exec(CPUState *env)
 {
     int r;
diff --git a/qemu-kvm.h b/qemu-kvm.h
index 94b6763..7ad8f42 100644
--- a/qemu-kvm.h
+++ b/qemu-kvm.h
@@ -299,16 +299,6 @@ int kvm_get_mpstate(CPUState *env, struct kvm_mp_state *mp_state);
  *
  */
 int kvm_set_mpstate(CPUState *env, struct kvm_mp_state *mp_state);
-/*!
- *  * \brief Reset VCPU MP state
- *
- */
-static inline int kvm_reset_mpstate(CPUState *env)
-{
-    struct kvm_mp_state mp_state = {.mp_state = KVM_MP_STATE_UNINITIALIZED
-    };
-    return kvm_set_mpstate(env, &mp_state);
-}
 #endif
 
 /*!
@@ -859,8 +849,6 @@ static inline void kvm_inject_x86_mce(CPUState *cenv, int bank,
 int kvm_main_loop(void);
 int kvm_init_ap(void);
 int kvm_vcpu_inited(CPUState *env);
-void kvm_load_mpstate(CPUState *env);
-void kvm_save_mpstate(CPUState *env);
 void kvm_apic_init(CPUState *env);
 /* called from vcpu initialization */
 void qemu_kvm_load_lapic(CPUState *env);
@@ -894,8 +882,6 @@ int kvm_arch_qemu_create_context(void);
 
 void kvm_arch_save_regs(CPUState *env);
 void kvm_arch_load_regs(CPUState *env, int level);
-void kvm_arch_load_mpstate(CPUState *env);
-void kvm_arch_save_mpstate(CPUState *env);
 int kvm_arch_has_work(CPUState *env);
 void kvm_arch_process_irqchip_events(CPUState *env);
 int kvm_arch_try_push_interrupts(void *opaque);
@@ -964,7 +950,6 @@ void kvm_load_tsc(CPUState *env);
 #ifdef TARGET_I386
 #define qemu_kvm_has_pit_state2() (0)
 #endif
-#define kvm_save_mpstate(env)   do {} while(0)
 #define qemu_kvm_cpu_stop(env) do {} while(0)
 static inline void kvm_load_tsc(CPUState *env)
 {
diff --git a/target-i386/machine.c b/target-i386/machine.c
index bebde82..61e6a87 100644
--- a/target-i386/machine.c
+++ b/target-i386/machine.c
@@ -323,7 +323,6 @@ static void cpu_pre_save(void *opaque)
     int i;
 
     if (kvm_enabled()) {
-        kvm_save_mpstate(env);
         kvm_get_vcpu_events(env);
     }
 
@@ -362,12 +361,7 @@ static int cpu_post_load(void *opaque, int version_id)
     tlb_flush(env, 1);
 
     if (kvm_enabled()) {
-        /* when in-kernel irqchip is used, env->halted causes deadlock
-           because no userspace IRQs will ever clear this flag */
-        env->halted = 0;
-
         kvm_load_tsc(env);
-        kvm_load_mpstate(env);
         kvm_put_vcpu_events(env);
     }
 
diff --git a/target-ia64/machine.c b/target-ia64/machine.c
index fdbeeef..8cf5bdd 100644
--- a/target-ia64/machine.c
+++ b/target-ia64/machine.c
@@ -4,6 +4,9 @@
 #include "exec-all.h"
 #include "qemu-kvm.h"
 
+void kvm_arch_save_mpstate(CPUState *env);
+void kvm_arch_load_mpstate(CPUState *env);
+
 void cpu_save(QEMUFile *f, void *opaque)
 {
     CPUState *env = opaque;
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [Qemu-devel] [PATCH v4 04/10] qemu-kvm: Clean up mpstate synchronization
@ 2010-03-01 17:17   ` Jan Kiszka
  0 siblings, 0 replies; 48+ messages in thread
From: Jan Kiszka @ 2010-03-01 17:17 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: Gleb Natapov, qemu-devel, kvm

Push mpstate reading/writing into kvm_arch_load/save_regs and, on x86,
properly synchronize with halted in the accessor functions.

At this chance, drop the special reset of interrupt_request and halted
in kvm_arch_cpu_reset. The former is done via memset in cpu_reset, the
latter in apic_init_reset anyway.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 hw/apic.c             |    7 ---
 qemu-kvm-ia64.c       |    4 +-
 qemu-kvm-x86.c        |  100 +++++++++++++++++++++++++++++-------------------
 qemu-kvm.c            |   30 ---------------
 qemu-kvm.h            |   15 -------
 target-i386/machine.c |    6 ---
 target-ia64/machine.c |    3 +
 7 files changed, 66 insertions(+), 99 deletions(-)

diff --git a/hw/apic.c b/hw/apic.c
index 3e03e10..092c61e 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -507,13 +507,6 @@ void apic_init_reset(CPUState *env)
     s->wait_for_sipi = 1;
 
     env->halted = !(s->apicbase & MSR_IA32_APICBASE_BSP);
-#ifdef KVM_CAP_MP_STATE
-    if (kvm_enabled() && kvm_irqchip_in_kernel()) {
-        env->mp_state
-            = env->halted ? KVM_MP_STATE_UNINITIALIZED : KVM_MP_STATE_RUNNABLE;
-        kvm_load_mpstate(env);
-    }
-#endif
 }
 
 static void apic_startup(APICState *s, int vector_num)
diff --git a/qemu-kvm-ia64.c b/qemu-kvm-ia64.c
index fc8110e..39bcbeb 100644
--- a/qemu-kvm-ia64.c
+++ b/qemu-kvm-ia64.c
@@ -124,7 +124,9 @@ void kvm_arch_cpu_reset(CPUState *env)
 {
     if (kvm_irqchip_in_kernel(kvm_context)) {
 #ifdef KVM_CAP_MP_STATE
-	kvm_reset_mpstate(env->kvm_cpu_state.vcpu_ctx);
+        struct kvm_mp_state mp_state = {.mp_state = KVM_MP_STATE_UNINITIALIZED
+        };
+        kvm_set_mpstate(env, &mp_state);
 #endif
     } else {
 	env->interrupt_request &= ~CPU_INTERRUPT_HARD;
diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
index 47667b3..61b3dde 100644
--- a/qemu-kvm-x86.c
+++ b/qemu-kvm-x86.c
@@ -754,6 +754,56 @@ static int get_msr_entry(struct kvm_msr_entry *entry, CPUState *env)
         return 0;
 }
 
+static void kvm_arch_save_mpstate(CPUState *env)
+{
+#ifdef KVM_CAP_MP_STATE
+    int r;
+    struct kvm_mp_state mp_state;
+
+    r = kvm_get_mpstate(env, &mp_state);
+    if (r < 0) {
+        env->mp_state = -1;
+    } else {
+        env->mp_state = mp_state.mp_state;
+        if (kvm_irqchip_in_kernel()) {
+            env->halted = (env->mp_state == KVM_MP_STATE_HALTED);
+        }
+    }
+#else
+    env->mp_state = -1;
+#endif
+}
+
+static void kvm_arch_load_mpstate(CPUState *env)
+{
+#ifdef KVM_CAP_MP_STATE
+    struct kvm_mp_state mp_state;
+
+    /*
+     * -1 indicates that the host did not support GET_MP_STATE ioctl,
+     *  so don't touch it.
+     */
+    if (env->mp_state != -1) {
+        mp_state.mp_state = env->mp_state;
+        kvm_set_mpstate(env, &mp_state);
+    }
+#endif
+}
+
+static void kvm_reset_mpstate(CPUState *env)
+{
+#ifdef KVM_CAP_MP_STATE
+    if (kvm_check_extension(kvm_state, KVM_CAP_MP_STATE)) {
+        if (kvm_irqchip_in_kernel()) {
+            env->mp_state = cpu_is_bsp(env) ? KVM_MP_STATE_RUNNABLE :
+                                              KVM_MP_STATE_UNINITIALIZED;
+        } else {
+            env->mp_state = KVM_MP_STATE_RUNNABLE;
+        }
+    }
+#endif
+}
+
 static void set_v8086_seg(struct kvm_segment *lhs, const SegmentCache *rhs)
 {
     lhs->selector = rhs->selector;
@@ -922,6 +972,14 @@ void kvm_arch_load_regs(CPUState *env, int level)
     if (rc == -1)
         perror("kvm_set_msrs FAILED");
 
+    if (level >= KVM_PUT_RESET_STATE) {
+        kvm_arch_load_mpstate(env);
+    }
+    if (kvm_irqchip_in_kernel()) {
+        /* Avoid deadlock: no user space IRQ will ever clear it. */
+        env->halted = 0;
+    }
+
     /* must be last */
     kvm_guest_debug_workarounds(env);
 }
@@ -938,36 +996,6 @@ void kvm_load_tsc(CPUState *env)
         perror("kvm_set_tsc FAILED.\n");
 }
 
-void kvm_arch_save_mpstate(CPUState *env)
-{
-#ifdef KVM_CAP_MP_STATE
-    int r;
-    struct kvm_mp_state mp_state;
-
-    r = kvm_get_mpstate(env, &mp_state);
-    if (r < 0)
-        env->mp_state = -1;
-    else
-        env->mp_state = mp_state.mp_state;
-#else
-    env->mp_state = -1;
-#endif
-}
-
-void kvm_arch_load_mpstate(CPUState *env)
-{
-#ifdef KVM_CAP_MP_STATE
-    struct kvm_mp_state mp_state = { .mp_state = env->mp_state };
-
-    /*
-     * -1 indicates that the host did not support GET_MP_STATE ioctl,
-     *  so don't touch it.
-     */
-    if (env->mp_state != -1)
-        kvm_set_mpstate(env, &mp_state);
-#endif
-}
-
 void kvm_arch_save_regs(CPUState *env)
 {
     struct kvm_regs regs;
@@ -1290,6 +1318,7 @@ int kvm_arch_init_vcpu(CPUState *cenv)
 #ifdef KVM_EXIT_TPR_ACCESS
     kvm_tpr_vcpu_start(cenv);
 #endif
+    kvm_reset_mpstate(cenv);
     return 0;
 }
 
@@ -1363,16 +1392,7 @@ void kvm_arch_cpu_reset(CPUState *env)
 {
     kvm_arch_reset_vcpu(env);
     kvm_put_vcpu_events(env);
-    if (!cpu_is_bsp(env)) {
-	if (kvm_irqchip_in_kernel()) {
-#ifdef KVM_CAP_MP_STATE
-	    kvm_reset_mpstate(env);
-#endif
-	} else {
-	    env->interrupt_request &= ~CPU_INTERRUPT_HARD;
-	    env->halted = 1;
-	}
-    }
+    kvm_reset_mpstate(env);
 }
 
 #ifdef CONFIG_KVM_DEVICE_ASSIGNMENT
diff --git a/qemu-kvm.c b/qemu-kvm.c
index 365bb37..222ca97 100644
--- a/qemu-kvm.c
+++ b/qemu-kvm.c
@@ -1573,36 +1573,6 @@ void kvm_update_interrupt_request(CPUState *env)
     }
 }
 
-static void kvm_do_load_mpstate(void *_env)
-{
-    CPUState *env = _env;
-
-    kvm_arch_load_mpstate(env);
-}
-
-void kvm_load_mpstate(CPUState *env)
-{
-    if (kvm_enabled() && qemu_system_ready && kvm_vcpu_inited(env))
-        on_vcpu(env, kvm_do_load_mpstate, env);
-}
-
-static void kvm_do_save_mpstate(void *_env)
-{
-    CPUState *env = _env;
-
-    kvm_arch_save_mpstate(env);
-#ifdef KVM_CAP_MP_STATE
-    if (kvm_irqchip_in_kernel())
-        env->halted = (env->mp_state == KVM_MP_STATE_HALTED);
-#endif
-}
-
-void kvm_save_mpstate(CPUState *env)
-{
-    if (kvm_enabled())
-        on_vcpu(env, kvm_do_save_mpstate, env);
-}
-
 int kvm_cpu_exec(CPUState *env)
 {
     int r;
diff --git a/qemu-kvm.h b/qemu-kvm.h
index 94b6763..7ad8f42 100644
--- a/qemu-kvm.h
+++ b/qemu-kvm.h
@@ -299,16 +299,6 @@ int kvm_get_mpstate(CPUState *env, struct kvm_mp_state *mp_state);
  *
  */
 int kvm_set_mpstate(CPUState *env, struct kvm_mp_state *mp_state);
-/*!
- *  * \brief Reset VCPU MP state
- *
- */
-static inline int kvm_reset_mpstate(CPUState *env)
-{
-    struct kvm_mp_state mp_state = {.mp_state = KVM_MP_STATE_UNINITIALIZED
-    };
-    return kvm_set_mpstate(env, &mp_state);
-}
 #endif
 
 /*!
@@ -859,8 +849,6 @@ static inline void kvm_inject_x86_mce(CPUState *cenv, int bank,
 int kvm_main_loop(void);
 int kvm_init_ap(void);
 int kvm_vcpu_inited(CPUState *env);
-void kvm_load_mpstate(CPUState *env);
-void kvm_save_mpstate(CPUState *env);
 void kvm_apic_init(CPUState *env);
 /* called from vcpu initialization */
 void qemu_kvm_load_lapic(CPUState *env);
@@ -894,8 +882,6 @@ int kvm_arch_qemu_create_context(void);
 
 void kvm_arch_save_regs(CPUState *env);
 void kvm_arch_load_regs(CPUState *env, int level);
-void kvm_arch_load_mpstate(CPUState *env);
-void kvm_arch_save_mpstate(CPUState *env);
 int kvm_arch_has_work(CPUState *env);
 void kvm_arch_process_irqchip_events(CPUState *env);
 int kvm_arch_try_push_interrupts(void *opaque);
@@ -964,7 +950,6 @@ void kvm_load_tsc(CPUState *env);
 #ifdef TARGET_I386
 #define qemu_kvm_has_pit_state2() (0)
 #endif
-#define kvm_save_mpstate(env)   do {} while(0)
 #define qemu_kvm_cpu_stop(env) do {} while(0)
 static inline void kvm_load_tsc(CPUState *env)
 {
diff --git a/target-i386/machine.c b/target-i386/machine.c
index bebde82..61e6a87 100644
--- a/target-i386/machine.c
+++ b/target-i386/machine.c
@@ -323,7 +323,6 @@ static void cpu_pre_save(void *opaque)
     int i;
 
     if (kvm_enabled()) {
-        kvm_save_mpstate(env);
         kvm_get_vcpu_events(env);
     }
 
@@ -362,12 +361,7 @@ static int cpu_post_load(void *opaque, int version_id)
     tlb_flush(env, 1);
 
     if (kvm_enabled()) {
-        /* when in-kernel irqchip is used, env->halted causes deadlock
-           because no userspace IRQs will ever clear this flag */
-        env->halted = 0;
-
         kvm_load_tsc(env);
-        kvm_load_mpstate(env);
         kvm_put_vcpu_events(env);
     }
 
diff --git a/target-ia64/machine.c b/target-ia64/machine.c
index fdbeeef..8cf5bdd 100644
--- a/target-ia64/machine.c
+++ b/target-ia64/machine.c
@@ -4,6 +4,9 @@
 #include "exec-all.h"
 #include "qemu-kvm.h"
 
+void kvm_arch_save_mpstate(CPUState *env);
+void kvm_arch_load_mpstate(CPUState *env);
+
 void cpu_save(QEMUFile *f, void *opaque)
 {
     CPUState *env = opaque;
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v4 05/10] KVM: x86: Restrict writeback of VCPU state
  2010-03-01 17:17 ` [Qemu-devel] " Jan Kiszka
@ 2010-03-01 17:17   ` Jan Kiszka
  -1 siblings, 0 replies; 48+ messages in thread
From: Jan Kiszka @ 2010-03-01 17:17 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, qemu-devel, Gleb Natapov

Do not write nmi_pending, sipi_vector, and mpstate unless we at least go
through a reset. And TSC as well as KVM wallclocks should only be
written on full sync, otherwise we risk to drop some time on during
state read-modify-write.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 kvm.h                 |    2 +-
 qemu-kvm-x86.c        |    2 +-
 target-i386/kvm.c     |   32 ++++++++++++++++++++------------
 target-i386/machine.c |    2 +-
 4 files changed, 23 insertions(+), 15 deletions(-)

diff --git a/kvm.h b/kvm.h
index 874506b..afc7ee0 100644
--- a/kvm.h
+++ b/kvm.h
@@ -44,7 +44,7 @@ int kvm_log_stop(target_phys_addr_t phys_addr, ram_addr_t size);
 int kvm_has_sync_mmu(void);
 int kvm_has_vcpu_events(void);
 int kvm_has_robust_singlestep(void);
-int kvm_put_vcpu_events(CPUState *env);
+int kvm_put_vcpu_events(CPUState *env, int level);
 int kvm_get_vcpu_events(CPUState *env);
 
 void kvm_cpu_register_phys_memory_client(void);
diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
index 61b3dde..d1a8f3b 100644
--- a/qemu-kvm-x86.c
+++ b/qemu-kvm-x86.c
@@ -1391,7 +1391,7 @@ void kvm_arch_push_nmi(void *opaque)
 void kvm_arch_cpu_reset(CPUState *env)
 {
     kvm_arch_reset_vcpu(env);
-    kvm_put_vcpu_events(env);
+    kvm_put_vcpu_events(env, KVM_PUT_RESET_STATE);
     kvm_reset_mpstate(env);
 }
 
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 23729ef..377829c 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -549,7 +549,7 @@ static void kvm_msr_entry_set(struct kvm_msr_entry *entry,
     entry->data = value;
 }
 
-static int kvm_put_msrs(CPUState *env)
+static int kvm_put_msrs(CPUState *env, int level)
 {
     struct {
         struct kvm_msrs info;
@@ -563,7 +563,6 @@ static int kvm_put_msrs(CPUState *env)
     kvm_msr_entry_set(&msrs[n++], MSR_IA32_SYSENTER_EIP, env->sysenter_eip);
     if (kvm_has_msr_star(env))
 	kvm_msr_entry_set(&msrs[n++], MSR_STAR, env->star);
-    kvm_msr_entry_set(&msrs[n++], MSR_IA32_TSC, env->tsc);
     kvm_msr_entry_set(&msrs[n++], MSR_VM_HSAVE_PA, env->vm_hsave);
 #ifdef TARGET_X86_64
     /* FIXME if lm capable */
@@ -572,8 +571,12 @@ static int kvm_put_msrs(CPUState *env)
     kvm_msr_entry_set(&msrs[n++], MSR_FMASK, env->fmask);
     kvm_msr_entry_set(&msrs[n++], MSR_LSTAR, env->lstar);
 #endif
-    kvm_msr_entry_set(&msrs[n++], MSR_KVM_SYSTEM_TIME,  env->system_time_msr);
-    kvm_msr_entry_set(&msrs[n++], MSR_KVM_WALL_CLOCK,  env->wall_clock_msr);
+    if (level == KVM_PUT_FULL_STATE) {
+        kvm_msr_entry_set(&msrs[n++], MSR_IA32_TSC, env->tsc);
+        kvm_msr_entry_set(&msrs[n++], MSR_KVM_SYSTEM_TIME,
+                          env->system_time_msr);
+        kvm_msr_entry_set(&msrs[n++], MSR_KVM_WALL_CLOCK, env->wall_clock_msr);
+    }
 
     msr_data.info.nmsrs = n;
 
@@ -791,7 +794,7 @@ static int kvm_get_mp_state(CPUState *env)
 }
 #endif
 
-int kvm_put_vcpu_events(CPUState *env)
+int kvm_put_vcpu_events(CPUState *env, int level)
 {
 #ifdef KVM_CAP_VCPU_EVENTS
     struct kvm_vcpu_events events;
@@ -815,8 +818,11 @@ int kvm_put_vcpu_events(CPUState *env)
 
     events.sipi_vector = env->sipi_vector;
 
-    events.flags =
-        KVM_VCPUEVENT_VALID_NMI_PENDING | KVM_VCPUEVENT_VALID_SIPI_VECTOR;
+    events.flags = 0;
+    if (level >= KVM_PUT_RESET_STATE) {
+        events.flags |=
+            KVM_VCPUEVENT_VALID_NMI_PENDING | KVM_VCPUEVENT_VALID_SIPI_VECTOR;
+    }
 
     return kvm_vcpu_ioctl(env, KVM_SET_VCPU_EVENTS, &events);
 #else
@@ -909,15 +915,17 @@ int kvm_arch_put_registers(CPUState *env, int level)
     if (ret < 0)
         return ret;
 
-    ret = kvm_put_msrs(env);
+    ret = kvm_put_msrs(env, level);
     if (ret < 0)
         return ret;
 
-    ret = kvm_put_mp_state(env);
-    if (ret < 0)
-        return ret;
+    if (level >= KVM_PUT_RESET_STATE) {
+        ret = kvm_put_mp_state(env);
+        if (ret < 0)
+            return ret;
+    }
 
-    ret = kvm_put_vcpu_events(env);
+    ret = kvm_put_vcpu_events(env, level);
     if (ret < 0)
         return ret;
 
diff --git a/target-i386/machine.c b/target-i386/machine.c
index 61e6a87..6fca559 100644
--- a/target-i386/machine.c
+++ b/target-i386/machine.c
@@ -362,7 +362,7 @@ static int cpu_post_load(void *opaque, int version_id)
 
     if (kvm_enabled()) {
         kvm_load_tsc(env);
-        kvm_put_vcpu_events(env);
+        kvm_put_vcpu_events(env, KVM_PUT_FULL_STATE);
     }
 
     return 0;
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [Qemu-devel] [PATCH v4 05/10] KVM: x86: Restrict writeback of VCPU state
@ 2010-03-01 17:17   ` Jan Kiszka
  0 siblings, 0 replies; 48+ messages in thread
From: Jan Kiszka @ 2010-03-01 17:17 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: Gleb Natapov, qemu-devel, kvm

Do not write nmi_pending, sipi_vector, and mpstate unless we at least go
through a reset. And TSC as well as KVM wallclocks should only be
written on full sync, otherwise we risk to drop some time on during
state read-modify-write.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 kvm.h                 |    2 +-
 qemu-kvm-x86.c        |    2 +-
 target-i386/kvm.c     |   32 ++++++++++++++++++++------------
 target-i386/machine.c |    2 +-
 4 files changed, 23 insertions(+), 15 deletions(-)

diff --git a/kvm.h b/kvm.h
index 874506b..afc7ee0 100644
--- a/kvm.h
+++ b/kvm.h
@@ -44,7 +44,7 @@ int kvm_log_stop(target_phys_addr_t phys_addr, ram_addr_t size);
 int kvm_has_sync_mmu(void);
 int kvm_has_vcpu_events(void);
 int kvm_has_robust_singlestep(void);
-int kvm_put_vcpu_events(CPUState *env);
+int kvm_put_vcpu_events(CPUState *env, int level);
 int kvm_get_vcpu_events(CPUState *env);
 
 void kvm_cpu_register_phys_memory_client(void);
diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
index 61b3dde..d1a8f3b 100644
--- a/qemu-kvm-x86.c
+++ b/qemu-kvm-x86.c
@@ -1391,7 +1391,7 @@ void kvm_arch_push_nmi(void *opaque)
 void kvm_arch_cpu_reset(CPUState *env)
 {
     kvm_arch_reset_vcpu(env);
-    kvm_put_vcpu_events(env);
+    kvm_put_vcpu_events(env, KVM_PUT_RESET_STATE);
     kvm_reset_mpstate(env);
 }
 
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 23729ef..377829c 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -549,7 +549,7 @@ static void kvm_msr_entry_set(struct kvm_msr_entry *entry,
     entry->data = value;
 }
 
-static int kvm_put_msrs(CPUState *env)
+static int kvm_put_msrs(CPUState *env, int level)
 {
     struct {
         struct kvm_msrs info;
@@ -563,7 +563,6 @@ static int kvm_put_msrs(CPUState *env)
     kvm_msr_entry_set(&msrs[n++], MSR_IA32_SYSENTER_EIP, env->sysenter_eip);
     if (kvm_has_msr_star(env))
 	kvm_msr_entry_set(&msrs[n++], MSR_STAR, env->star);
-    kvm_msr_entry_set(&msrs[n++], MSR_IA32_TSC, env->tsc);
     kvm_msr_entry_set(&msrs[n++], MSR_VM_HSAVE_PA, env->vm_hsave);
 #ifdef TARGET_X86_64
     /* FIXME if lm capable */
@@ -572,8 +571,12 @@ static int kvm_put_msrs(CPUState *env)
     kvm_msr_entry_set(&msrs[n++], MSR_FMASK, env->fmask);
     kvm_msr_entry_set(&msrs[n++], MSR_LSTAR, env->lstar);
 #endif
-    kvm_msr_entry_set(&msrs[n++], MSR_KVM_SYSTEM_TIME,  env->system_time_msr);
-    kvm_msr_entry_set(&msrs[n++], MSR_KVM_WALL_CLOCK,  env->wall_clock_msr);
+    if (level == KVM_PUT_FULL_STATE) {
+        kvm_msr_entry_set(&msrs[n++], MSR_IA32_TSC, env->tsc);
+        kvm_msr_entry_set(&msrs[n++], MSR_KVM_SYSTEM_TIME,
+                          env->system_time_msr);
+        kvm_msr_entry_set(&msrs[n++], MSR_KVM_WALL_CLOCK, env->wall_clock_msr);
+    }
 
     msr_data.info.nmsrs = n;
 
@@ -791,7 +794,7 @@ static int kvm_get_mp_state(CPUState *env)
 }
 #endif
 
-int kvm_put_vcpu_events(CPUState *env)
+int kvm_put_vcpu_events(CPUState *env, int level)
 {
 #ifdef KVM_CAP_VCPU_EVENTS
     struct kvm_vcpu_events events;
@@ -815,8 +818,11 @@ int kvm_put_vcpu_events(CPUState *env)
 
     events.sipi_vector = env->sipi_vector;
 
-    events.flags =
-        KVM_VCPUEVENT_VALID_NMI_PENDING | KVM_VCPUEVENT_VALID_SIPI_VECTOR;
+    events.flags = 0;
+    if (level >= KVM_PUT_RESET_STATE) {
+        events.flags |=
+            KVM_VCPUEVENT_VALID_NMI_PENDING | KVM_VCPUEVENT_VALID_SIPI_VECTOR;
+    }
 
     return kvm_vcpu_ioctl(env, KVM_SET_VCPU_EVENTS, &events);
 #else
@@ -909,15 +915,17 @@ int kvm_arch_put_registers(CPUState *env, int level)
     if (ret < 0)
         return ret;
 
-    ret = kvm_put_msrs(env);
+    ret = kvm_put_msrs(env, level);
     if (ret < 0)
         return ret;
 
-    ret = kvm_put_mp_state(env);
-    if (ret < 0)
-        return ret;
+    if (level >= KVM_PUT_RESET_STATE) {
+        ret = kvm_put_mp_state(env);
+        if (ret < 0)
+            return ret;
+    }
 
-    ret = kvm_put_vcpu_events(env);
+    ret = kvm_put_vcpu_events(env, level);
     if (ret < 0)
         return ret;
 
diff --git a/target-i386/machine.c b/target-i386/machine.c
index 61e6a87..6fca559 100644
--- a/target-i386/machine.c
+++ b/target-i386/machine.c
@@ -362,7 +362,7 @@ static int cpu_post_load(void *opaque, int version_id)
 
     if (kvm_enabled()) {
         kvm_load_tsc(env);
-        kvm_put_vcpu_events(env);
+        kvm_put_vcpu_events(env, KVM_PUT_FULL_STATE);
     }
 
     return 0;
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v4 06/10] qemu-kvm: Use VCPU event state for reset and vmsave/load
  2010-03-01 17:17 ` [Qemu-devel] " Jan Kiszka
@ 2010-03-01 17:17   ` Jan Kiszka
  -1 siblings, 0 replies; 48+ messages in thread
From: Jan Kiszka @ 2010-03-01 17:17 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, qemu-devel, Gleb Natapov

Push reading/writing of vcpu_events into kvm_arch_load/save_regs to
avoid KVM-specific hooks in generic code.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 kvm.h                 |    2 --
 qemu-kvm-x86.c        |    6 ++++--
 target-i386/kvm.c     |    4 ++--
 target-i386/machine.c |    6 ------
 4 files changed, 6 insertions(+), 12 deletions(-)

diff --git a/kvm.h b/kvm.h
index afc7ee0..c1c1ae8 100644
--- a/kvm.h
+++ b/kvm.h
@@ -44,8 +44,6 @@ int kvm_log_stop(target_phys_addr_t phys_addr, ram_addr_t size);
 int kvm_has_sync_mmu(void);
 int kvm_has_vcpu_events(void);
 int kvm_has_robust_singlestep(void);
-int kvm_put_vcpu_events(CPUState *env, int level);
-int kvm_get_vcpu_events(CPUState *env);
 
 void kvm_cpu_register_phys_memory_client(void);
 
diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
index d1a8f3b..c838a8b 100644
--- a/qemu-kvm-x86.c
+++ b/qemu-kvm-x86.c
@@ -980,6 +980,8 @@ void kvm_arch_load_regs(CPUState *env, int level)
         env->halted = 0;
     }
 
+    kvm_put_vcpu_events(env, level);
+
     /* must be last */
     kvm_guest_debug_workarounds(env);
 }
@@ -1153,6 +1155,7 @@ void kvm_arch_save_regs(CPUState *env)
         }
     }
     kvm_arch_save_mpstate(env);
+    kvm_get_vcpu_events(env);
 }
 
 static void do_cpuid_ent(struct kvm_cpuid_entry2 *e, uint32_t function,
@@ -1224,7 +1227,7 @@ int kvm_arch_init_vcpu(CPUState *cenv)
 
     qemu_kvm_load_lapic(cenv);
 
-    cenv->interrupt_injected = -1;
+    kvm_arch_reset_vcpu(cenv);
 
 #ifdef KVM_CPUID_SIGNATURE
     /* Paravirtualization CPUIDs */
@@ -1391,7 +1394,6 @@ void kvm_arch_push_nmi(void *opaque)
 void kvm_arch_cpu_reset(CPUState *env)
 {
     kvm_arch_reset_vcpu(env);
-    kvm_put_vcpu_events(env, KVM_PUT_RESET_STATE);
     kvm_reset_mpstate(env);
 }
 
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 377829c..2a160fb 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -794,7 +794,7 @@ static int kvm_get_mp_state(CPUState *env)
 }
 #endif
 
-int kvm_put_vcpu_events(CPUState *env, int level)
+static int kvm_put_vcpu_events(CPUState *env, int level)
 {
 #ifdef KVM_CAP_VCPU_EVENTS
     struct kvm_vcpu_events events;
@@ -830,7 +830,7 @@ int kvm_put_vcpu_events(CPUState *env, int level)
 #endif
 }
 
-int kvm_get_vcpu_events(CPUState *env)
+static int kvm_get_vcpu_events(CPUState *env)
 {
 #ifdef KVM_CAP_VCPU_EVENTS
     struct kvm_vcpu_events events;
diff --git a/target-i386/machine.c b/target-i386/machine.c
index 6fca559..bcc315b 100644
--- a/target-i386/machine.c
+++ b/target-i386/machine.c
@@ -5,7 +5,6 @@
 
 #include "exec-all.h"
 #include "kvm.h"
-#include "qemu-kvm.h"
 
 static const VMStateDescription vmstate_segment = {
     .name = "segment",
@@ -322,10 +321,6 @@ static void cpu_pre_save(void *opaque)
     CPUState *env = opaque;
     int i;
 
-    if (kvm_enabled()) {
-        kvm_get_vcpu_events(env);
-    }
-
     /* FPU */
     env->fpus_vmstate = (env->fpus & ~0x3800) | (env->fpstt & 0x7) << 11;
     env->fptag_vmstate = 0;
@@ -362,7 +357,6 @@ static int cpu_post_load(void *opaque, int version_id)
 
     if (kvm_enabled()) {
         kvm_load_tsc(env);
-        kvm_put_vcpu_events(env, KVM_PUT_FULL_STATE);
     }
 
     return 0;
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [Qemu-devel] [PATCH v4 06/10] qemu-kvm: Use VCPU event state for reset and vmsave/load
@ 2010-03-01 17:17   ` Jan Kiszka
  0 siblings, 0 replies; 48+ messages in thread
From: Jan Kiszka @ 2010-03-01 17:17 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: Gleb Natapov, qemu-devel, kvm

Push reading/writing of vcpu_events into kvm_arch_load/save_regs to
avoid KVM-specific hooks in generic code.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 kvm.h                 |    2 --
 qemu-kvm-x86.c        |    6 ++++--
 target-i386/kvm.c     |    4 ++--
 target-i386/machine.c |    6 ------
 4 files changed, 6 insertions(+), 12 deletions(-)

diff --git a/kvm.h b/kvm.h
index afc7ee0..c1c1ae8 100644
--- a/kvm.h
+++ b/kvm.h
@@ -44,8 +44,6 @@ int kvm_log_stop(target_phys_addr_t phys_addr, ram_addr_t size);
 int kvm_has_sync_mmu(void);
 int kvm_has_vcpu_events(void);
 int kvm_has_robust_singlestep(void);
-int kvm_put_vcpu_events(CPUState *env, int level);
-int kvm_get_vcpu_events(CPUState *env);
 
 void kvm_cpu_register_phys_memory_client(void);
 
diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
index d1a8f3b..c838a8b 100644
--- a/qemu-kvm-x86.c
+++ b/qemu-kvm-x86.c
@@ -980,6 +980,8 @@ void kvm_arch_load_regs(CPUState *env, int level)
         env->halted = 0;
     }
 
+    kvm_put_vcpu_events(env, level);
+
     /* must be last */
     kvm_guest_debug_workarounds(env);
 }
@@ -1153,6 +1155,7 @@ void kvm_arch_save_regs(CPUState *env)
         }
     }
     kvm_arch_save_mpstate(env);
+    kvm_get_vcpu_events(env);
 }
 
 static void do_cpuid_ent(struct kvm_cpuid_entry2 *e, uint32_t function,
@@ -1224,7 +1227,7 @@ int kvm_arch_init_vcpu(CPUState *cenv)
 
     qemu_kvm_load_lapic(cenv);
 
-    cenv->interrupt_injected = -1;
+    kvm_arch_reset_vcpu(cenv);
 
 #ifdef KVM_CPUID_SIGNATURE
     /* Paravirtualization CPUIDs */
@@ -1391,7 +1394,6 @@ void kvm_arch_push_nmi(void *opaque)
 void kvm_arch_cpu_reset(CPUState *env)
 {
     kvm_arch_reset_vcpu(env);
-    kvm_put_vcpu_events(env, KVM_PUT_RESET_STATE);
     kvm_reset_mpstate(env);
 }
 
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 377829c..2a160fb 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -794,7 +794,7 @@ static int kvm_get_mp_state(CPUState *env)
 }
 #endif
 
-int kvm_put_vcpu_events(CPUState *env, int level)
+static int kvm_put_vcpu_events(CPUState *env, int level)
 {
 #ifdef KVM_CAP_VCPU_EVENTS
     struct kvm_vcpu_events events;
@@ -830,7 +830,7 @@ int kvm_put_vcpu_events(CPUState *env, int level)
 #endif
 }
 
-int kvm_get_vcpu_events(CPUState *env)
+static int kvm_get_vcpu_events(CPUState *env)
 {
 #ifdef KVM_CAP_VCPU_EVENTS
     struct kvm_vcpu_events events;
diff --git a/target-i386/machine.c b/target-i386/machine.c
index 6fca559..bcc315b 100644
--- a/target-i386/machine.c
+++ b/target-i386/machine.c
@@ -5,7 +5,6 @@
 
 #include "exec-all.h"
 #include "kvm.h"
-#include "qemu-kvm.h"
 
 static const VMStateDescription vmstate_segment = {
     .name = "segment",
@@ -322,10 +321,6 @@ static void cpu_pre_save(void *opaque)
     CPUState *env = opaque;
     int i;
 
-    if (kvm_enabled()) {
-        kvm_get_vcpu_events(env);
-    }
-
     /* FPU */
     env->fpus_vmstate = (env->fpus & ~0x3800) | (env->fpstt & 0x7) << 11;
     env->fptag_vmstate = 0;
@@ -362,7 +357,6 @@ static int cpu_post_load(void *opaque, int version_id)
 
     if (kvm_enabled()) {
         kvm_load_tsc(env);
-        kvm_put_vcpu_events(env, KVM_PUT_FULL_STATE);
     }
 
     return 0;
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v4 07/10] qemu-kvm: Cleanup/fix TSC and PV clock writeback
  2010-03-01 17:17 ` [Qemu-devel] " Jan Kiszka
@ 2010-03-01 17:17   ` Jan Kiszka
  -1 siblings, 0 replies; 48+ messages in thread
From: Jan Kiszka @ 2010-03-01 17:17 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, qemu-devel, Gleb Natapov

Drop kvm_load_tsc in favor of level-dependent writeback in
kvm_arch_load_regs. KVM's PV clock MSRs fall in the same category and
should therefore only be written back on full sync.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 qemu-kvm-x86.c        |   27 +++++++++++++--------------
 qemu-kvm.h            |    4 ----
 target-i386/machine.c |    5 -----
 3 files changed, 13 insertions(+), 23 deletions(-)

diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
index c838a8b..8d30b05 100644
--- a/qemu-kvm-x86.c
+++ b/qemu-kvm-x86.c
@@ -965,8 +965,19 @@ void kvm_arch_load_regs(CPUState *env, int level)
         set_msr_entry(&msrs[n++], MSR_LSTAR  ,           env->lstar);
     }
 #endif
-    set_msr_entry(&msrs[n++], MSR_KVM_SYSTEM_TIME,  env->system_time_msr);
-    set_msr_entry(&msrs[n++], MSR_KVM_WALL_CLOCK,  env->wall_clock_msr);
+    if (level == KVM_PUT_FULL_STATE) {
+        /*
+         * KVM is yet unable to synchronize TSC values of multiple VCPUs on
+         * writeback. Until this is fixed, we only write the offset to SMP
+         * guests after migration, desynchronizing the VCPUs, but avoiding
+         * huge jump-backs that would occur without any writeback at all.
+         */
+        if (smp_cpus == 1 || env->tsc != 0) {
+            set_msr_entry(&msrs[n++], MSR_IA32_TSC, env->tsc);
+        }
+        set_msr_entry(&msrs[n++], MSR_KVM_SYSTEM_TIME, env->system_time_msr);
+        set_msr_entry(&msrs[n++], MSR_KVM_WALL_CLOCK, env->wall_clock_msr);
+    }
 
     rc = kvm_set_msrs(env, msrs, n);
     if (rc == -1)
@@ -986,18 +997,6 @@ void kvm_arch_load_regs(CPUState *env, int level)
     kvm_guest_debug_workarounds(env);
 }
 
-void kvm_load_tsc(CPUState *env)
-{
-    int rc;
-    struct kvm_msr_entry msr;
-
-    set_msr_entry(&msr, MSR_IA32_TSC, env->tsc);
-
-    rc = kvm_set_msrs(env, &msr, 1);
-    if (rc == -1)
-        perror("kvm_set_tsc FAILED.\n");
-}
-
 void kvm_arch_save_regs(CPUState *env)
 {
     struct kvm_regs regs;
diff --git a/qemu-kvm.h b/qemu-kvm.h
index 7ad8f42..5c0b660 100644
--- a/qemu-kvm.h
+++ b/qemu-kvm.h
@@ -943,7 +943,6 @@ int handle_tpr_access(void *opaque, CPUState *env, uint64_t rip,
 #ifdef TARGET_I386
 #define qemu_kvm_has_pit_state2() kvm_has_pit_state2(kvm_context)
 #endif
-void kvm_load_tsc(CPUState *env);
 #else
 #define kvm_nested 0
 #define qemu_kvm_has_gsi_routing() (0)
@@ -951,9 +950,6 @@ void kvm_load_tsc(CPUState *env);
 #define qemu_kvm_has_pit_state2() (0)
 #endif
 #define qemu_kvm_cpu_stop(env) do {} while(0)
-static inline void kvm_load_tsc(CPUState *env)
-{
-}
 #endif
 
 void kvm_mutex_unlock(void);
diff --git a/target-i386/machine.c b/target-i386/machine.c
index bcc315b..b547e2a 100644
--- a/target-i386/machine.c
+++ b/target-i386/machine.c
@@ -354,11 +354,6 @@ static int cpu_post_load(void *opaque, int version_id)
         hw_breakpoint_insert(env, i);
 
     tlb_flush(env, 1);
-
-    if (kvm_enabled()) {
-        kvm_load_tsc(env);
-    }
-
     return 0;
 }
 
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [Qemu-devel] [PATCH v4 07/10] qemu-kvm: Cleanup/fix TSC and PV clock writeback
@ 2010-03-01 17:17   ` Jan Kiszka
  0 siblings, 0 replies; 48+ messages in thread
From: Jan Kiszka @ 2010-03-01 17:17 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: Gleb Natapov, qemu-devel, kvm

Drop kvm_load_tsc in favor of level-dependent writeback in
kvm_arch_load_regs. KVM's PV clock MSRs fall in the same category and
should therefore only be written back on full sync.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 qemu-kvm-x86.c        |   27 +++++++++++++--------------
 qemu-kvm.h            |    4 ----
 target-i386/machine.c |    5 -----
 3 files changed, 13 insertions(+), 23 deletions(-)

diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
index c838a8b..8d30b05 100644
--- a/qemu-kvm-x86.c
+++ b/qemu-kvm-x86.c
@@ -965,8 +965,19 @@ void kvm_arch_load_regs(CPUState *env, int level)
         set_msr_entry(&msrs[n++], MSR_LSTAR  ,           env->lstar);
     }
 #endif
-    set_msr_entry(&msrs[n++], MSR_KVM_SYSTEM_TIME,  env->system_time_msr);
-    set_msr_entry(&msrs[n++], MSR_KVM_WALL_CLOCK,  env->wall_clock_msr);
+    if (level == KVM_PUT_FULL_STATE) {
+        /*
+         * KVM is yet unable to synchronize TSC values of multiple VCPUs on
+         * writeback. Until this is fixed, we only write the offset to SMP
+         * guests after migration, desynchronizing the VCPUs, but avoiding
+         * huge jump-backs that would occur without any writeback at all.
+         */
+        if (smp_cpus == 1 || env->tsc != 0) {
+            set_msr_entry(&msrs[n++], MSR_IA32_TSC, env->tsc);
+        }
+        set_msr_entry(&msrs[n++], MSR_KVM_SYSTEM_TIME, env->system_time_msr);
+        set_msr_entry(&msrs[n++], MSR_KVM_WALL_CLOCK, env->wall_clock_msr);
+    }
 
     rc = kvm_set_msrs(env, msrs, n);
     if (rc == -1)
@@ -986,18 +997,6 @@ void kvm_arch_load_regs(CPUState *env, int level)
     kvm_guest_debug_workarounds(env);
 }
 
-void kvm_load_tsc(CPUState *env)
-{
-    int rc;
-    struct kvm_msr_entry msr;
-
-    set_msr_entry(&msr, MSR_IA32_TSC, env->tsc);
-
-    rc = kvm_set_msrs(env, &msr, 1);
-    if (rc == -1)
-        perror("kvm_set_tsc FAILED.\n");
-}
-
 void kvm_arch_save_regs(CPUState *env)
 {
     struct kvm_regs regs;
diff --git a/qemu-kvm.h b/qemu-kvm.h
index 7ad8f42..5c0b660 100644
--- a/qemu-kvm.h
+++ b/qemu-kvm.h
@@ -943,7 +943,6 @@ int handle_tpr_access(void *opaque, CPUState *env, uint64_t rip,
 #ifdef TARGET_I386
 #define qemu_kvm_has_pit_state2() kvm_has_pit_state2(kvm_context)
 #endif
-void kvm_load_tsc(CPUState *env);
 #else
 #define kvm_nested 0
 #define qemu_kvm_has_gsi_routing() (0)
@@ -951,9 +950,6 @@ void kvm_load_tsc(CPUState *env);
 #define qemu_kvm_has_pit_state2() (0)
 #endif
 #define qemu_kvm_cpu_stop(env) do {} while(0)
-static inline void kvm_load_tsc(CPUState *env)
-{
-}
 #endif
 
 void kvm_mutex_unlock(void);
diff --git a/target-i386/machine.c b/target-i386/machine.c
index bcc315b..b547e2a 100644
--- a/target-i386/machine.c
+++ b/target-i386/machine.c
@@ -354,11 +354,6 @@ static int cpu_post_load(void *opaque, int version_id)
         hw_breakpoint_insert(env, i);
 
     tlb_flush(env, 1);
-
-    if (kvm_enabled()) {
-        kvm_load_tsc(env);
-    }
-
     return 0;
 }
 
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v4 08/10] qemu-kvm: Clean up KVM's APIC hooks
  2010-03-01 17:17 ` [Qemu-devel] " Jan Kiszka
@ 2010-03-01 17:17   ` Jan Kiszka
  -1 siblings, 0 replies; 48+ messages in thread
From: Jan Kiszka @ 2010-03-01 17:17 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, qemu-devel, Gleb Natapov

The APIC is part of the VCPU state, so trigger its readout and writeback
from kvm_arch_save/load_regs. Thanks to the transparent sync on reset
and vmsave/load, we can also drop explicit sync code, reducing the diff
to upstream.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 hw/apic.c      |   37 +++++--------------------------------
 qemu-kvm-x86.c |    4 ++--
 qemu-kvm.h     |    5 ++---
 3 files changed, 9 insertions(+), 37 deletions(-)

diff --git a/hw/apic.c b/hw/apic.c
index 092c61e..d8c4f7c 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -24,8 +24,6 @@
 #include "host-utils.h"
 #include "kvm.h"
 
-#include "qemu-kvm.h"
-
 //#define DEBUG_APIC
 
 /* APIC Local Vector Table */
@@ -951,36 +949,22 @@ static void kvm_kernel_lapic_load_from_user(APICState *s)
 
 #endif
 
-void qemu_kvm_load_lapic(CPUState *env)
+void kvm_load_lapic(CPUState *env)
 {
 #ifdef KVM_CAP_IRQCHIP
-    if (kvm_enabled() && kvm_vcpu_inited(env) && kvm_irqchip_in_kernel()) {
-        kvm_kernel_lapic_load_from_user(env->apic_state);
-    }
-#endif
-}
-
-static void apic_pre_save(void *opaque)
-{
-#ifdef KVM_CAP_IRQCHIP
-    APICState *s = (void *)opaque;
-
     if (kvm_enabled() && kvm_irqchip_in_kernel()) {
-        kvm_kernel_lapic_save_to_user(s);
+        kvm_kernel_lapic_load_from_user(env->apic_state);
     }
 #endif
 }
 
-static int apic_post_load(void *opaque, int version_id)
+void kvm_save_lapic(CPUState *env)
 {
 #ifdef KVM_CAP_IRQCHIP
-    APICState *s = opaque;
-
     if (kvm_enabled() && kvm_irqchip_in_kernel()) {
-        kvm_kernel_lapic_load_from_user(s);
+        kvm_kernel_lapic_save_to_user(env->apic_state);
     }
 #endif
-    return 0;
 }
 
 /* This function is only used for old state version 1 and 2 */
@@ -1019,9 +1003,6 @@ static int apic_load_old(QEMUFile *f, void *opaque, int version_id)
 
     if (version_id >= 2)
         qemu_get_timer(f, s->timer);
-
-    qemu_kvm_load_lapic(s->cpu_env);
-
     return 0;
 }
 
@@ -1052,9 +1033,7 @@ static const VMStateDescription vmstate_apic = {
         VMSTATE_INT64(next_time, APICState),
         VMSTATE_TIMER(timer, APICState),
         VMSTATE_END_OF_LIST()
-    },
-    .pre_save = apic_pre_save,
-    .post_load = apic_post_load,
+    }
 };
 
 static void apic_reset(void *opaque)
@@ -1077,7 +1056,6 @@ static void apic_reset(void *opaque)
          */
         s->lvt[APIC_LVT_LINT0] = 0x700;
     }
-    qemu_kvm_load_lapic(s->cpu_env);
 }
 
 static CPUReadMemoryFunc * const apic_mem_read[3] = {
@@ -1121,11 +1099,6 @@ int apic_init(CPUState *env)
     vmstate_register(s->idx, &vmstate_apic, s);
     qemu_register_reset(apic_reset, s);
 
-    /* apic_reset must be called before the vcpu threads are initialized and load
-     * registers, in qemu-kvm.
-     */
-    apic_reset(s);
-
     local_apics[s->idx] = s;
     return 0;
 }
diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
index 8d30b05..0efa700 100644
--- a/qemu-kvm-x86.c
+++ b/qemu-kvm-x86.c
@@ -985,6 +985,7 @@ void kvm_arch_load_regs(CPUState *env, int level)
 
     if (level >= KVM_PUT_RESET_STATE) {
         kvm_arch_load_mpstate(env);
+        kvm_load_lapic(env);
     }
     if (kvm_irqchip_in_kernel()) {
         /* Avoid deadlock: no user space IRQ will ever clear it. */
@@ -1154,6 +1155,7 @@ void kvm_arch_save_regs(CPUState *env)
         }
     }
     kvm_arch_save_mpstate(env);
+    kvm_save_lapic(env);
     kvm_get_vcpu_events(env);
 }
 
@@ -1224,8 +1226,6 @@ int kvm_arch_init_vcpu(CPUState *cenv)
     CPUState copy;
     uint32_t i, j, limit;
 
-    qemu_kvm_load_lapic(cenv);
-
     kvm_arch_reset_vcpu(cenv);
 
 #ifdef KVM_CPUID_SIGNATURE
diff --git a/qemu-kvm.h b/qemu-kvm.h
index 5c0b660..ae2f864 100644
--- a/qemu-kvm.h
+++ b/qemu-kvm.h
@@ -849,9 +849,8 @@ static inline void kvm_inject_x86_mce(CPUState *cenv, int bank,
 int kvm_main_loop(void);
 int kvm_init_ap(void);
 int kvm_vcpu_inited(CPUState *env);
-void kvm_apic_init(CPUState *env);
-/* called from vcpu initialization */
-void qemu_kvm_load_lapic(CPUState *env);
+void kvm_save_lapic(CPUState *env);
+void kvm_load_lapic(CPUState *env);
 
 void kvm_hpet_enable_kpit(void);
 void kvm_hpet_disable_kpit(void);
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [Qemu-devel] [PATCH v4 08/10] qemu-kvm: Clean up KVM's APIC hooks
@ 2010-03-01 17:17   ` Jan Kiszka
  0 siblings, 0 replies; 48+ messages in thread
From: Jan Kiszka @ 2010-03-01 17:17 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: Gleb Natapov, qemu-devel, kvm

The APIC is part of the VCPU state, so trigger its readout and writeback
from kvm_arch_save/load_regs. Thanks to the transparent sync on reset
and vmsave/load, we can also drop explicit sync code, reducing the diff
to upstream.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 hw/apic.c      |   37 +++++--------------------------------
 qemu-kvm-x86.c |    4 ++--
 qemu-kvm.h     |    5 ++---
 3 files changed, 9 insertions(+), 37 deletions(-)

diff --git a/hw/apic.c b/hw/apic.c
index 092c61e..d8c4f7c 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -24,8 +24,6 @@
 #include "host-utils.h"
 #include "kvm.h"
 
-#include "qemu-kvm.h"
-
 //#define DEBUG_APIC
 
 /* APIC Local Vector Table */
@@ -951,36 +949,22 @@ static void kvm_kernel_lapic_load_from_user(APICState *s)
 
 #endif
 
-void qemu_kvm_load_lapic(CPUState *env)
+void kvm_load_lapic(CPUState *env)
 {
 #ifdef KVM_CAP_IRQCHIP
-    if (kvm_enabled() && kvm_vcpu_inited(env) && kvm_irqchip_in_kernel()) {
-        kvm_kernel_lapic_load_from_user(env->apic_state);
-    }
-#endif
-}
-
-static void apic_pre_save(void *opaque)
-{
-#ifdef KVM_CAP_IRQCHIP
-    APICState *s = (void *)opaque;
-
     if (kvm_enabled() && kvm_irqchip_in_kernel()) {
-        kvm_kernel_lapic_save_to_user(s);
+        kvm_kernel_lapic_load_from_user(env->apic_state);
     }
 #endif
 }
 
-static int apic_post_load(void *opaque, int version_id)
+void kvm_save_lapic(CPUState *env)
 {
 #ifdef KVM_CAP_IRQCHIP
-    APICState *s = opaque;
-
     if (kvm_enabled() && kvm_irqchip_in_kernel()) {
-        kvm_kernel_lapic_load_from_user(s);
+        kvm_kernel_lapic_save_to_user(env->apic_state);
     }
 #endif
-    return 0;
 }
 
 /* This function is only used for old state version 1 and 2 */
@@ -1019,9 +1003,6 @@ static int apic_load_old(QEMUFile *f, void *opaque, int version_id)
 
     if (version_id >= 2)
         qemu_get_timer(f, s->timer);
-
-    qemu_kvm_load_lapic(s->cpu_env);
-
     return 0;
 }
 
@@ -1052,9 +1033,7 @@ static const VMStateDescription vmstate_apic = {
         VMSTATE_INT64(next_time, APICState),
         VMSTATE_TIMER(timer, APICState),
         VMSTATE_END_OF_LIST()
-    },
-    .pre_save = apic_pre_save,
-    .post_load = apic_post_load,
+    }
 };
 
 static void apic_reset(void *opaque)
@@ -1077,7 +1056,6 @@ static void apic_reset(void *opaque)
          */
         s->lvt[APIC_LVT_LINT0] = 0x700;
     }
-    qemu_kvm_load_lapic(s->cpu_env);
 }
 
 static CPUReadMemoryFunc * const apic_mem_read[3] = {
@@ -1121,11 +1099,6 @@ int apic_init(CPUState *env)
     vmstate_register(s->idx, &vmstate_apic, s);
     qemu_register_reset(apic_reset, s);
 
-    /* apic_reset must be called before the vcpu threads are initialized and load
-     * registers, in qemu-kvm.
-     */
-    apic_reset(s);
-
     local_apics[s->idx] = s;
     return 0;
 }
diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
index 8d30b05..0efa700 100644
--- a/qemu-kvm-x86.c
+++ b/qemu-kvm-x86.c
@@ -985,6 +985,7 @@ void kvm_arch_load_regs(CPUState *env, int level)
 
     if (level >= KVM_PUT_RESET_STATE) {
         kvm_arch_load_mpstate(env);
+        kvm_load_lapic(env);
     }
     if (kvm_irqchip_in_kernel()) {
         /* Avoid deadlock: no user space IRQ will ever clear it. */
@@ -1154,6 +1155,7 @@ void kvm_arch_save_regs(CPUState *env)
         }
     }
     kvm_arch_save_mpstate(env);
+    kvm_save_lapic(env);
     kvm_get_vcpu_events(env);
 }
 
@@ -1224,8 +1226,6 @@ int kvm_arch_init_vcpu(CPUState *cenv)
     CPUState copy;
     uint32_t i, j, limit;
 
-    qemu_kvm_load_lapic(cenv);
-
     kvm_arch_reset_vcpu(cenv);
 
 #ifdef KVM_CPUID_SIGNATURE
diff --git a/qemu-kvm.h b/qemu-kvm.h
index 5c0b660..ae2f864 100644
--- a/qemu-kvm.h
+++ b/qemu-kvm.h
@@ -849,9 +849,8 @@ static inline void kvm_inject_x86_mce(CPUState *cenv, int bank,
 int kvm_main_loop(void);
 int kvm_init_ap(void);
 int kvm_vcpu_inited(CPUState *env);
-void kvm_apic_init(CPUState *env);
-/* called from vcpu initialization */
-void qemu_kvm_load_lapic(CPUState *env);
+void kvm_save_lapic(CPUState *env);
+void kvm_load_lapic(CPUState *env);
 
 void kvm_hpet_enable_kpit(void);
 void kvm_hpet_disable_kpit(void);
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v4 09/10] qemu-kvm: Move kvm_set_boot_cpu_id
  2010-03-01 17:17 ` [Qemu-devel] " Jan Kiszka
@ 2010-03-01 17:17   ` Jan Kiszka
  -1 siblings, 0 replies; 48+ messages in thread
From: Jan Kiszka @ 2010-03-01 17:17 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, qemu-devel, Gleb Natapov

Setting the boot CPU ID is arch-specific KVM stuff. So push it where it
belongs to.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 hw/pc.c        |    3 ---
 qemu-kvm-x86.c |    3 ++-
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/hw/pc.c b/hw/pc.c
index 58c32ea..d819fca 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -839,9 +839,6 @@ static void pc_init1(ram_addr_t ram_size,
 #endif
     }
 
-    if (kvm_enabled()) {
-        kvm_set_boot_cpu_id(0);
-    }
     for (i = 0; i < smp_cpus; i++) {
         env = pc_new_cpu(cpu_model);
     }
diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
index 0efa700..7a5925a 100644
--- a/qemu-kvm-x86.c
+++ b/qemu-kvm-x86.c
@@ -695,7 +695,8 @@ int kvm_arch_qemu_create_context(void)
     if (kvm_check_extension(kvm_state, KVM_CAP_ADJUST_CLOCK))
         vmstate_register(0, &vmstate_kvmclock, &kvmclock_data);
 #endif
-    return 0;
+
+    return kvm_set_boot_cpu_id(0);
 }
 
 static void set_msr_entry(struct kvm_msr_entry *entry, uint32_t index,
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [Qemu-devel] [PATCH v4 09/10] qemu-kvm: Move kvm_set_boot_cpu_id
@ 2010-03-01 17:17   ` Jan Kiszka
  0 siblings, 0 replies; 48+ messages in thread
From: Jan Kiszka @ 2010-03-01 17:17 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: Gleb Natapov, qemu-devel, kvm

Setting the boot CPU ID is arch-specific KVM stuff. So push it where it
belongs to.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 hw/pc.c        |    3 ---
 qemu-kvm-x86.c |    3 ++-
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/hw/pc.c b/hw/pc.c
index 58c32ea..d819fca 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -839,9 +839,6 @@ static void pc_init1(ram_addr_t ram_size,
 #endif
     }
 
-    if (kvm_enabled()) {
-        kvm_set_boot_cpu_id(0);
-    }
     for (i = 0; i < smp_cpus; i++) {
         env = pc_new_cpu(cpu_model);
     }
diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
index 0efa700..7a5925a 100644
--- a/qemu-kvm-x86.c
+++ b/qemu-kvm-x86.c
@@ -695,7 +695,8 @@ int kvm_arch_qemu_create_context(void)
     if (kvm_check_extension(kvm_state, KVM_CAP_ADJUST_CLOCK))
         vmstate_register(0, &vmstate_kvmclock, &kvmclock_data);
 #endif
-    return 0;
+
+    return kvm_set_boot_cpu_id(0);
 }
 
 static void set_msr_entry(struct kvm_msr_entry *entry, uint32_t index,
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v4 10/10] qemu-kvm: Bring qemu_init_vcpu back home
  2010-03-01 17:17 ` [Qemu-devel] " Jan Kiszka
@ 2010-03-01 17:17   ` Jan Kiszka
  -1 siblings, 0 replies; 48+ messages in thread
From: Jan Kiszka @ 2010-03-01 17:17 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, qemu-devel, Gleb Natapov

There is no need for the this hack anymore, initialization is now robust
against reordering as it doesn't try to write the VCPU state on its own.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 hw/pc.c              |    5 -----
 target-i386/helper.c |    2 ++
 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/hw/pc.c b/hw/pc.c
index d819fca..0aebae9 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -787,11 +787,6 @@ CPUState *pc_new_cpu(const char *cpu_model)
     } else {
         qemu_register_reset((QEMUResetHandler*)cpu_reset, env);
     }
-
-    /* kvm needs this to run after the apic is initialized. Otherwise,
-     * it can access invalid state and crash.
-     */
-    qemu_init_vcpu(env);
     return env;
 }
 
diff --git a/target-i386/helper.c b/target-i386/helper.c
index 73381e2..e595a3e 100644
--- a/target-i386/helper.c
+++ b/target-i386/helper.c
@@ -2285,6 +2285,8 @@ CPUX86State *cpu_x86_init(const char *cpu_model)
     }
     mce_init(env);
 
+    qemu_init_vcpu(env);
+
     return env;
 }
 
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [Qemu-devel] [PATCH v4 10/10] qemu-kvm: Bring qemu_init_vcpu back home
@ 2010-03-01 17:17   ` Jan Kiszka
  0 siblings, 0 replies; 48+ messages in thread
From: Jan Kiszka @ 2010-03-01 17:17 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: Gleb Natapov, qemu-devel, kvm

There is no need for the this hack anymore, initialization is now robust
against reordering as it doesn't try to write the VCPU state on its own.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 hw/pc.c              |    5 -----
 target-i386/helper.c |    2 ++
 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/hw/pc.c b/hw/pc.c
index d819fca..0aebae9 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -787,11 +787,6 @@ CPUState *pc_new_cpu(const char *cpu_model)
     } else {
         qemu_register_reset((QEMUResetHandler*)cpu_reset, env);
     }
-
-    /* kvm needs this to run after the apic is initialized. Otherwise,
-     * it can access invalid state and crash.
-     */
-    qemu_init_vcpu(env);
     return env;
 }
 
diff --git a/target-i386/helper.c b/target-i386/helper.c
index 73381e2..e595a3e 100644
--- a/target-i386/helper.c
+++ b/target-i386/helper.c
@@ -2285,6 +2285,8 @@ CPUX86State *cpu_x86_init(const char *cpu_model)
     }
     mce_init(env);
 
+    qemu_init_vcpu(env);
+
     return env;
 }
 
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH v4 00/10] qemu-kvm: Hook cleanups and yet more use of upstream code
  2010-03-01 17:17 ` [Qemu-devel] " Jan Kiszka
@ 2010-03-01 17:21   ` Jan Kiszka
  -1 siblings, 0 replies; 48+ messages in thread
From: Jan Kiszka @ 2010-03-01 17:21 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, qemu-devel, Gleb Natapov

Jan Kiszka wrote:
> Rebased over latest master.

Oh, and I added that TSC issue workaround to patch 7. My memory...

Jan

> 
> Pull URL is
> 
> 	git://git.kiszka.org/qemu-kvm.git queues/queues-kvm-merge
> 
> Jan Kiszka (10):
>   qemu-kvm: Add KVM_CAP_X86_ROBUST_SINGLESTEP-awareness
>   qemu-kvm: Rework VCPU state writeback API
>   x86: Extend validity of cpu_is_bsp
>   qemu-kvm: Clean up mpstate synchronization
>   KVM: x86: Restrict writeback of VCPU state
>   qemu-kvm: Use VCPU event state for reset and vmsave/load
>   qemu-kvm: Cleanup/fix TSC and PV clock writeback
>   qemu-kvm: Clean up KVM's APIC hooks
>   qemu-kvm: Move kvm_set_boot_cpu_id
>   qemu-kvm: Bring qemu_init_vcpu back home
> 
>  exec.c                |   17 -----
>  hw/apic.c             |   47 ++-------------
>  hw/pc.c               |   12 +---
>  hw/ppc_newworld.c     |    3 -
>  hw/ppc_oldworld.c     |    3 -
>  hw/s390-virtio.c      |    1 -
>  kvm-all.c             |   31 ++++++++--
>  kvm.h                 |   28 ++++++++-
>  qemu-kvm-ia64.c       |    6 +-
>  qemu-kvm-x86.c        |  164 ++++++++++++++++++++++++-------------------------
>  qemu-kvm.c            |   46 ++++----------
>  qemu-kvm.h            |   27 +-------
>  savevm.c              |    4 +
>  sysemu.h              |    4 +
>  target-i386/helper.c  |    2 +
>  target-i386/kvm.c     |   94 +++++++++++++++++-----------
>  target-i386/machine.c |   27 --------
>  target-ia64/machine.c |    5 +-
>  target-ppc/kvm.c      |    2 +-
>  target-ppc/machine.c  |    4 -
>  target-s390x/kvm.c    |    3 +-
>  vl.c                  |   29 +++++++++
>  22 files changed, 259 insertions(+), 300 deletions(-)
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [Qemu-devel] Re: [PATCH v4 00/10] qemu-kvm: Hook cleanups and yet more use of upstream code
@ 2010-03-01 17:21   ` Jan Kiszka
  0 siblings, 0 replies; 48+ messages in thread
From: Jan Kiszka @ 2010-03-01 17:21 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: Gleb Natapov, qemu-devel, kvm

Jan Kiszka wrote:
> Rebased over latest master.

Oh, and I added that TSC issue workaround to patch 7. My memory...

Jan

> 
> Pull URL is
> 
> 	git://git.kiszka.org/qemu-kvm.git queues/queues-kvm-merge
> 
> Jan Kiszka (10):
>   qemu-kvm: Add KVM_CAP_X86_ROBUST_SINGLESTEP-awareness
>   qemu-kvm: Rework VCPU state writeback API
>   x86: Extend validity of cpu_is_bsp
>   qemu-kvm: Clean up mpstate synchronization
>   KVM: x86: Restrict writeback of VCPU state
>   qemu-kvm: Use VCPU event state for reset and vmsave/load
>   qemu-kvm: Cleanup/fix TSC and PV clock writeback
>   qemu-kvm: Clean up KVM's APIC hooks
>   qemu-kvm: Move kvm_set_boot_cpu_id
>   qemu-kvm: Bring qemu_init_vcpu back home
> 
>  exec.c                |   17 -----
>  hw/apic.c             |   47 ++-------------
>  hw/pc.c               |   12 +---
>  hw/ppc_newworld.c     |    3 -
>  hw/ppc_oldworld.c     |    3 -
>  hw/s390-virtio.c      |    1 -
>  kvm-all.c             |   31 ++++++++--
>  kvm.h                 |   28 ++++++++-
>  qemu-kvm-ia64.c       |    6 +-
>  qemu-kvm-x86.c        |  164 ++++++++++++++++++++++++-------------------------
>  qemu-kvm.c            |   46 ++++----------
>  qemu-kvm.h            |   27 +-------
>  savevm.c              |    4 +
>  sysemu.h              |    4 +
>  target-i386/helper.c  |    2 +
>  target-i386/kvm.c     |   94 +++++++++++++++++-----------
>  target-i386/machine.c |   27 --------
>  target-ia64/machine.c |    5 +-
>  target-ppc/kvm.c      |    2 +-
>  target-ppc/machine.c  |    4 -
>  target-s390x/kvm.c    |    3 +-
>  vl.c                  |   29 +++++++++
>  22 files changed, 259 insertions(+), 300 deletions(-)
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v4 00/10] qemu-kvm: Hook cleanups and yet more use of upstream code
  2010-03-01 17:17 ` [Qemu-devel] " Jan Kiszka
@ 2010-03-01 20:01   ` Marcelo Tosatti
  -1 siblings, 0 replies; 48+ messages in thread
From: Marcelo Tosatti @ 2010-03-01 20:01 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Avi Kivity, kvm, qemu-devel, Gleb Natapov

On Mon, Mar 01, 2010 at 06:17:19PM +0100, Jan Kiszka wrote:
> Rebased over latest master.
> 
> Pull URL is
> 
> 	git://git.kiszka.org/qemu-kvm.git queues/queues-kvm-merge
> 
> Jan Kiszka (10):
>   qemu-kvm: Add KVM_CAP_X86_ROBUST_SINGLESTEP-awareness
>   qemu-kvm: Rework VCPU state writeback API
>   x86: Extend validity of cpu_is_bsp
>   qemu-kvm: Clean up mpstate synchronization
>   KVM: x86: Restrict writeback of VCPU state
>   qemu-kvm: Use VCPU event state for reset and vmsave/load
>   qemu-kvm: Cleanup/fix TSC and PV clock writeback
>   qemu-kvm: Clean up KVM's APIC hooks
>   qemu-kvm: Move kvm_set_boot_cpu_id
>   qemu-kvm: Bring qemu_init_vcpu back home

Applied, thanks.


^ permalink raw reply	[flat|nested] 48+ messages in thread

* [Qemu-devel] Re: [PATCH v4 00/10] qemu-kvm: Hook cleanups and yet more use of upstream code
@ 2010-03-01 20:01   ` Marcelo Tosatti
  0 siblings, 0 replies; 48+ messages in thread
From: Marcelo Tosatti @ 2010-03-01 20:01 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Gleb Natapov, Avi Kivity, kvm, qemu-devel

On Mon, Mar 01, 2010 at 06:17:19PM +0100, Jan Kiszka wrote:
> Rebased over latest master.
> 
> Pull URL is
> 
> 	git://git.kiszka.org/qemu-kvm.git queues/queues-kvm-merge
> 
> Jan Kiszka (10):
>   qemu-kvm: Add KVM_CAP_X86_ROBUST_SINGLESTEP-awareness
>   qemu-kvm: Rework VCPU state writeback API
>   x86: Extend validity of cpu_is_bsp
>   qemu-kvm: Clean up mpstate synchronization
>   KVM: x86: Restrict writeback of VCPU state
>   qemu-kvm: Use VCPU event state for reset and vmsave/load
>   qemu-kvm: Cleanup/fix TSC and PV clock writeback
>   qemu-kvm: Clean up KVM's APIC hooks
>   qemu-kvm: Move kvm_set_boot_cpu_id
>   qemu-kvm: Bring qemu_init_vcpu back home

Applied, thanks.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v4 03/10] x86: Extend validity of cpu_is_bsp
  2010-03-01 17:17   ` [Qemu-devel] " Jan Kiszka
@ 2010-03-03 16:00     ` Gleb Natapov
  -1 siblings, 0 replies; 48+ messages in thread
From: Gleb Natapov @ 2010-03-03 16:00 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Avi Kivity, Marcelo Tosatti, kvm, qemu-devel

On Mon, Mar 01, 2010 at 06:17:22PM +0100, Jan Kiszka wrote:
> As we hard-wire the BSP to CPU 0 anyway and cpuid_apic_id equals
> cpu_index, cpu_is_bsp can also be based on the latter directly. This
> will help an early user of it: KVM while initializing mp_state.
> 
> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> ---
>  hw/pc.c |    3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
> 
> diff --git a/hw/pc.c b/hw/pc.c
> index b90a79e..58c32ea 100644
> --- a/hw/pc.c
> +++ b/hw/pc.c
> @@ -767,7 +767,8 @@ static void pc_init_ne2k_isa(NICInfo *nd)
>  
>  int cpu_is_bsp(CPUState *env)
>  {
> -    return env->cpuid_apic_id == 0;
> +    /* We hard-wire the BSP to the first CPU. */
> +    return env->cpu_index == 0;
>  }
We should not assume that. The function was written like that
specifically so the code around it will not rely on this assumption.
Now you change that specifically to write code that will do incorrect
assumptions. I don't see the logic here.

>  
>  CPUState *pc_new_cpu(const char *cpu_model)
> -- 
> 1.6.0.2

--
			Gleb.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [Qemu-devel] Re: [PATCH v4 03/10] x86: Extend validity of cpu_is_bsp
@ 2010-03-03 16:00     ` Gleb Natapov
  0 siblings, 0 replies; 48+ messages in thread
From: Gleb Natapov @ 2010-03-03 16:00 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Marcelo Tosatti, Avi Kivity, kvm, qemu-devel

On Mon, Mar 01, 2010 at 06:17:22PM +0100, Jan Kiszka wrote:
> As we hard-wire the BSP to CPU 0 anyway and cpuid_apic_id equals
> cpu_index, cpu_is_bsp can also be based on the latter directly. This
> will help an early user of it: KVM while initializing mp_state.
> 
> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> ---
>  hw/pc.c |    3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
> 
> diff --git a/hw/pc.c b/hw/pc.c
> index b90a79e..58c32ea 100644
> --- a/hw/pc.c
> +++ b/hw/pc.c
> @@ -767,7 +767,8 @@ static void pc_init_ne2k_isa(NICInfo *nd)
>  
>  int cpu_is_bsp(CPUState *env)
>  {
> -    return env->cpuid_apic_id == 0;
> +    /* We hard-wire the BSP to the first CPU. */
> +    return env->cpu_index == 0;
>  }
We should not assume that. The function was written like that
specifically so the code around it will not rely on this assumption.
Now you change that specifically to write code that will do incorrect
assumptions. I don't see the logic here.

>  
>  CPUState *pc_new_cpu(const char *cpu_model)
> -- 
> 1.6.0.2

--
			Gleb.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v4 03/10] x86: Extend validity of cpu_is_bsp
  2010-03-03 16:00     ` [Qemu-devel] " Gleb Natapov
@ 2010-03-03 23:34       ` Jan Kiszka
  -1 siblings, 0 replies; 48+ messages in thread
From: Jan Kiszka @ 2010-03-03 23:34 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Avi Kivity, Marcelo Tosatti, kvm, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 1350 bytes --]

Gleb Natapov wrote:
> On Mon, Mar 01, 2010 at 06:17:22PM +0100, Jan Kiszka wrote:
>> As we hard-wire the BSP to CPU 0 anyway and cpuid_apic_id equals
>> cpu_index, cpu_is_bsp can also be based on the latter directly. This
>> will help an early user of it: KVM while initializing mp_state.
>>
>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>> ---
>>  hw/pc.c |    3 ++-
>>  1 files changed, 2 insertions(+), 1 deletions(-)
>>
>> diff --git a/hw/pc.c b/hw/pc.c
>> index b90a79e..58c32ea 100644
>> --- a/hw/pc.c
>> +++ b/hw/pc.c
>> @@ -767,7 +767,8 @@ static void pc_init_ne2k_isa(NICInfo *nd)
>>  
>>  int cpu_is_bsp(CPUState *env)
>>  {
>> -    return env->cpuid_apic_id == 0;
>> +    /* We hard-wire the BSP to the first CPU. */
>> +    return env->cpu_index == 0;
>>  }
> We should not assume that. The function was written like that
> specifically so the code around it will not rely on this assumption.
> Now you change that specifically to write code that will do incorrect
> assumptions. I don't see the logic here.

The logic is that we do not support any other mapping yet - with or
without this change. Without it, we complicate the APIC initialization
for (so far) no good reason. Once we want to support different BSP
assignments, we need to go through the code and rework some parts anyway.

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 257 bytes --]

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [Qemu-devel] Re: [PATCH v4 03/10] x86: Extend validity of cpu_is_bsp
@ 2010-03-03 23:34       ` Jan Kiszka
  0 siblings, 0 replies; 48+ messages in thread
From: Jan Kiszka @ 2010-03-03 23:34 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Marcelo Tosatti, Avi Kivity, kvm, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 1350 bytes --]

Gleb Natapov wrote:
> On Mon, Mar 01, 2010 at 06:17:22PM +0100, Jan Kiszka wrote:
>> As we hard-wire the BSP to CPU 0 anyway and cpuid_apic_id equals
>> cpu_index, cpu_is_bsp can also be based on the latter directly. This
>> will help an early user of it: KVM while initializing mp_state.
>>
>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>> ---
>>  hw/pc.c |    3 ++-
>>  1 files changed, 2 insertions(+), 1 deletions(-)
>>
>> diff --git a/hw/pc.c b/hw/pc.c
>> index b90a79e..58c32ea 100644
>> --- a/hw/pc.c
>> +++ b/hw/pc.c
>> @@ -767,7 +767,8 @@ static void pc_init_ne2k_isa(NICInfo *nd)
>>  
>>  int cpu_is_bsp(CPUState *env)
>>  {
>> -    return env->cpuid_apic_id == 0;
>> +    /* We hard-wire the BSP to the first CPU. */
>> +    return env->cpu_index == 0;
>>  }
> We should not assume that. The function was written like that
> specifically so the code around it will not rely on this assumption.
> Now you change that specifically to write code that will do incorrect
> assumptions. I don't see the logic here.

The logic is that we do not support any other mapping yet - with or
without this change. Without it, we complicate the APIC initialization
for (so far) no good reason. Once we want to support different BSP
assignments, we need to go through the code and rework some parts anyway.

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 257 bytes --]

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v4 03/10] x86: Extend validity of cpu_is_bsp
  2010-03-03 23:34       ` [Qemu-devel] " Jan Kiszka
@ 2010-03-04  6:47         ` Gleb Natapov
  -1 siblings, 0 replies; 48+ messages in thread
From: Gleb Natapov @ 2010-03-04  6:47 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Avi Kivity, Marcelo Tosatti, kvm, qemu-devel

On Thu, Mar 04, 2010 at 12:34:22AM +0100, Jan Kiszka wrote:
> Gleb Natapov wrote:
> > On Mon, Mar 01, 2010 at 06:17:22PM +0100, Jan Kiszka wrote:
> >> As we hard-wire the BSP to CPU 0 anyway and cpuid_apic_id equals
> >> cpu_index, cpu_is_bsp can also be based on the latter directly. This
> >> will help an early user of it: KVM while initializing mp_state.
> >>
> >> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> >> ---
> >>  hw/pc.c |    3 ++-
> >>  1 files changed, 2 insertions(+), 1 deletions(-)
> >>
> >> diff --git a/hw/pc.c b/hw/pc.c
> >> index b90a79e..58c32ea 100644
> >> --- a/hw/pc.c
> >> +++ b/hw/pc.c
> >> @@ -767,7 +767,8 @@ static void pc_init_ne2k_isa(NICInfo *nd)
> >>  
> >>  int cpu_is_bsp(CPUState *env)
> >>  {
> >> -    return env->cpuid_apic_id == 0;
> >> +    /* We hard-wire the BSP to the first CPU. */
> >> +    return env->cpu_index == 0;
> >>  }
> > We should not assume that. The function was written like that
> > specifically so the code around it will not rely on this assumption.
> > Now you change that specifically to write code that will do incorrect
> > assumptions. I don't see the logic here.
> 
> The logic is that we do not support any other mapping yet - with or
> without this change. Without it, we complicate the APIC initialization
> for (so far) no good reason. Once we want to support different BSP
> assignments, we need to go through the code and rework some parts anyway.
> 
As far as I remember the only part that was missing was a command line to
specify apic IDs for each CPU and what CPU is BSP. The code was ready
otherwise. I's very sad if this was broken by other modifications. But
changes like that actually pushes us back from our goal. Why not rework
code so it will work with correct cpu_is_bsp() function instead of
introducing this hack?

--
			Gleb.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [Qemu-devel] Re: [PATCH v4 03/10] x86: Extend validity of cpu_is_bsp
@ 2010-03-04  6:47         ` Gleb Natapov
  0 siblings, 0 replies; 48+ messages in thread
From: Gleb Natapov @ 2010-03-04  6:47 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Marcelo Tosatti, Avi Kivity, kvm, qemu-devel

On Thu, Mar 04, 2010 at 12:34:22AM +0100, Jan Kiszka wrote:
> Gleb Natapov wrote:
> > On Mon, Mar 01, 2010 at 06:17:22PM +0100, Jan Kiszka wrote:
> >> As we hard-wire the BSP to CPU 0 anyway and cpuid_apic_id equals
> >> cpu_index, cpu_is_bsp can also be based on the latter directly. This
> >> will help an early user of it: KVM while initializing mp_state.
> >>
> >> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> >> ---
> >>  hw/pc.c |    3 ++-
> >>  1 files changed, 2 insertions(+), 1 deletions(-)
> >>
> >> diff --git a/hw/pc.c b/hw/pc.c
> >> index b90a79e..58c32ea 100644
> >> --- a/hw/pc.c
> >> +++ b/hw/pc.c
> >> @@ -767,7 +767,8 @@ static void pc_init_ne2k_isa(NICInfo *nd)
> >>  
> >>  int cpu_is_bsp(CPUState *env)
> >>  {
> >> -    return env->cpuid_apic_id == 0;
> >> +    /* We hard-wire the BSP to the first CPU. */
> >> +    return env->cpu_index == 0;
> >>  }
> > We should not assume that. The function was written like that
> > specifically so the code around it will not rely on this assumption.
> > Now you change that specifically to write code that will do incorrect
> > assumptions. I don't see the logic here.
> 
> The logic is that we do not support any other mapping yet - with or
> without this change. Without it, we complicate the APIC initialization
> for (so far) no good reason. Once we want to support different BSP
> assignments, we need to go through the code and rework some parts anyway.
> 
As far as I remember the only part that was missing was a command line to
specify apic IDs for each CPU and what CPU is BSP. The code was ready
otherwise. I's very sad if this was broken by other modifications. But
changes like that actually pushes us back from our goal. Why not rework
code so it will work with correct cpu_is_bsp() function instead of
introducing this hack?

--
			Gleb.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v4 03/10] x86: Extend validity of cpu_is_bsp
  2010-03-04  6:47         ` [Qemu-devel] " Gleb Natapov
@ 2010-03-04  8:23           ` Jan Kiszka
  -1 siblings, 0 replies; 48+ messages in thread
From: Jan Kiszka @ 2010-03-04  8:23 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Avi Kivity, Marcelo Tosatti, kvm, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 2025 bytes --]

Gleb Natapov wrote:
> On Thu, Mar 04, 2010 at 12:34:22AM +0100, Jan Kiszka wrote:
>> Gleb Natapov wrote:
>>> On Mon, Mar 01, 2010 at 06:17:22PM +0100, Jan Kiszka wrote:
>>>> As we hard-wire the BSP to CPU 0 anyway and cpuid_apic_id equals
>>>> cpu_index, cpu_is_bsp can also be based on the latter directly. This
>>>> will help an early user of it: KVM while initializing mp_state.
>>>>
>>>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>>>> ---
>>>>  hw/pc.c |    3 ++-
>>>>  1 files changed, 2 insertions(+), 1 deletions(-)
>>>>
>>>> diff --git a/hw/pc.c b/hw/pc.c
>>>> index b90a79e..58c32ea 100644
>>>> --- a/hw/pc.c
>>>> +++ b/hw/pc.c
>>>> @@ -767,7 +767,8 @@ static void pc_init_ne2k_isa(NICInfo *nd)
>>>>  
>>>>  int cpu_is_bsp(CPUState *env)
>>>>  {
>>>> -    return env->cpuid_apic_id == 0;
>>>> +    /* We hard-wire the BSP to the first CPU. */
>>>> +    return env->cpu_index == 0;
>>>>  }
>>> We should not assume that. The function was written like that
>>> specifically so the code around it will not rely on this assumption.
>>> Now you change that specifically to write code that will do incorrect
>>> assumptions. I don't see the logic here.
>> The logic is that we do not support any other mapping yet - with or
>> without this change. Without it, we complicate the APIC initialization
>> for (so far) no good reason. Once we want to support different BSP
>> assignments, we need to go through the code and rework some parts anyway.
>>
> As far as I remember the only part that was missing was a command line to
> specify apic IDs for each CPU and what CPU is BSP. The code was ready
> otherwise. I's very sad if this was broken by other modifications. But
> changes like that actually pushes us back from our goal. Why not rework
> code so it will work with correct cpu_is_bsp() function instead of
> introducing this hack?

If you can confirm that there is a serious use case behind it, I will
look into this again. But so far, I did not find it.

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 257 bytes --]

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [Qemu-devel] Re: [PATCH v4 03/10] x86: Extend validity of cpu_is_bsp
@ 2010-03-04  8:23           ` Jan Kiszka
  0 siblings, 0 replies; 48+ messages in thread
From: Jan Kiszka @ 2010-03-04  8:23 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Marcelo Tosatti, Avi Kivity, kvm, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 2025 bytes --]

Gleb Natapov wrote:
> On Thu, Mar 04, 2010 at 12:34:22AM +0100, Jan Kiszka wrote:
>> Gleb Natapov wrote:
>>> On Mon, Mar 01, 2010 at 06:17:22PM +0100, Jan Kiszka wrote:
>>>> As we hard-wire the BSP to CPU 0 anyway and cpuid_apic_id equals
>>>> cpu_index, cpu_is_bsp can also be based on the latter directly. This
>>>> will help an early user of it: KVM while initializing mp_state.
>>>>
>>>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>>>> ---
>>>>  hw/pc.c |    3 ++-
>>>>  1 files changed, 2 insertions(+), 1 deletions(-)
>>>>
>>>> diff --git a/hw/pc.c b/hw/pc.c
>>>> index b90a79e..58c32ea 100644
>>>> --- a/hw/pc.c
>>>> +++ b/hw/pc.c
>>>> @@ -767,7 +767,8 @@ static void pc_init_ne2k_isa(NICInfo *nd)
>>>>  
>>>>  int cpu_is_bsp(CPUState *env)
>>>>  {
>>>> -    return env->cpuid_apic_id == 0;
>>>> +    /* We hard-wire the BSP to the first CPU. */
>>>> +    return env->cpu_index == 0;
>>>>  }
>>> We should not assume that. The function was written like that
>>> specifically so the code around it will not rely on this assumption.
>>> Now you change that specifically to write code that will do incorrect
>>> assumptions. I don't see the logic here.
>> The logic is that we do not support any other mapping yet - with or
>> without this change. Without it, we complicate the APIC initialization
>> for (so far) no good reason. Once we want to support different BSP
>> assignments, we need to go through the code and rework some parts anyway.
>>
> As far as I remember the only part that was missing was a command line to
> specify apic IDs for each CPU and what CPU is BSP. The code was ready
> otherwise. I's very sad if this was broken by other modifications. But
> changes like that actually pushes us back from our goal. Why not rework
> code so it will work with correct cpu_is_bsp() function instead of
> introducing this hack?

If you can confirm that there is a serious use case behind it, I will
look into this again. But so far, I did not find it.

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 257 bytes --]

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v4 03/10] x86: Extend validity of cpu_is_bsp
  2010-03-04  8:23           ` [Qemu-devel] " Jan Kiszka
@ 2010-03-04  8:35             ` Gleb Natapov
  -1 siblings, 0 replies; 48+ messages in thread
From: Gleb Natapov @ 2010-03-04  8:35 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Avi Kivity, Marcelo Tosatti, kvm, qemu-devel

On Thu, Mar 04, 2010 at 09:23:46AM +0100, Jan Kiszka wrote:
> Gleb Natapov wrote:
> > On Thu, Mar 04, 2010 at 12:34:22AM +0100, Jan Kiszka wrote:
> >> Gleb Natapov wrote:
> >>> On Mon, Mar 01, 2010 at 06:17:22PM +0100, Jan Kiszka wrote:
> >>>> As we hard-wire the BSP to CPU 0 anyway and cpuid_apic_id equals
> >>>> cpu_index, cpu_is_bsp can also be based on the latter directly. This
> >>>> will help an early user of it: KVM while initializing mp_state.
> >>>>
> >>>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> >>>> ---
> >>>>  hw/pc.c |    3 ++-
> >>>>  1 files changed, 2 insertions(+), 1 deletions(-)
> >>>>
> >>>> diff --git a/hw/pc.c b/hw/pc.c
> >>>> index b90a79e..58c32ea 100644
> >>>> --- a/hw/pc.c
> >>>> +++ b/hw/pc.c
> >>>> @@ -767,7 +767,8 @@ static void pc_init_ne2k_isa(NICInfo *nd)
> >>>>  
> >>>>  int cpu_is_bsp(CPUState *env)
> >>>>  {
> >>>> -    return env->cpuid_apic_id == 0;
> >>>> +    /* We hard-wire the BSP to the first CPU. */
> >>>> +    return env->cpu_index == 0;
> >>>>  }
> >>> We should not assume that. The function was written like that
> >>> specifically so the code around it will not rely on this assumption.
> >>> Now you change that specifically to write code that will do incorrect
> >>> assumptions. I don't see the logic here.
> >> The logic is that we do not support any other mapping yet - with or
> >> without this change. Without it, we complicate the APIC initialization
> >> for (so far) no good reason. Once we want to support different BSP
> >> assignments, we need to go through the code and rework some parts anyway.
> >>
> > As far as I remember the only part that was missing was a command line to
> > specify apic IDs for each CPU and what CPU is BSP. The code was ready
> > otherwise. I's very sad if this was broken by other modifications. But
> > changes like that actually pushes us back from our goal. Why not rework
> > code so it will work with correct cpu_is_bsp() function instead of
> > introducing this hack?
> 
> If you can confirm that there is a serious use case behind it, I will
> look into this again. But so far, I did not find it.
> 
Firs of all it is correctness issue. We should emulate x86 platform and
nothing there says that BSP apic id is zero. Second part of CPU topology
information is encoded in apic id. i.e when socket/core/ht topology is
used we can't just arbitrary specify apic ids for each logical cpu, we
should follow the rules described in SDM. For instance when more then 16
CPUs are present AMD advices to start numbering apic ids from 16 and leave
first 16 IDs for IOAPICs. And third introduction of this hack shows that
something is done wrong in other places of the code. Somewhere
initialization order is incorrect.

--
			Gleb.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [Qemu-devel] Re: [PATCH v4 03/10] x86: Extend validity of cpu_is_bsp
@ 2010-03-04  8:35             ` Gleb Natapov
  0 siblings, 0 replies; 48+ messages in thread
From: Gleb Natapov @ 2010-03-04  8:35 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Marcelo Tosatti, Avi Kivity, kvm, qemu-devel

On Thu, Mar 04, 2010 at 09:23:46AM +0100, Jan Kiszka wrote:
> Gleb Natapov wrote:
> > On Thu, Mar 04, 2010 at 12:34:22AM +0100, Jan Kiszka wrote:
> >> Gleb Natapov wrote:
> >>> On Mon, Mar 01, 2010 at 06:17:22PM +0100, Jan Kiszka wrote:
> >>>> As we hard-wire the BSP to CPU 0 anyway and cpuid_apic_id equals
> >>>> cpu_index, cpu_is_bsp can also be based on the latter directly. This
> >>>> will help an early user of it: KVM while initializing mp_state.
> >>>>
> >>>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> >>>> ---
> >>>>  hw/pc.c |    3 ++-
> >>>>  1 files changed, 2 insertions(+), 1 deletions(-)
> >>>>
> >>>> diff --git a/hw/pc.c b/hw/pc.c
> >>>> index b90a79e..58c32ea 100644
> >>>> --- a/hw/pc.c
> >>>> +++ b/hw/pc.c
> >>>> @@ -767,7 +767,8 @@ static void pc_init_ne2k_isa(NICInfo *nd)
> >>>>  
> >>>>  int cpu_is_bsp(CPUState *env)
> >>>>  {
> >>>> -    return env->cpuid_apic_id == 0;
> >>>> +    /* We hard-wire the BSP to the first CPU. */
> >>>> +    return env->cpu_index == 0;
> >>>>  }
> >>> We should not assume that. The function was written like that
> >>> specifically so the code around it will not rely on this assumption.
> >>> Now you change that specifically to write code that will do incorrect
> >>> assumptions. I don't see the logic here.
> >> The logic is that we do not support any other mapping yet - with or
> >> without this change. Without it, we complicate the APIC initialization
> >> for (so far) no good reason. Once we want to support different BSP
> >> assignments, we need to go through the code and rework some parts anyway.
> >>
> > As far as I remember the only part that was missing was a command line to
> > specify apic IDs for each CPU and what CPU is BSP. The code was ready
> > otherwise. I's very sad if this was broken by other modifications. But
> > changes like that actually pushes us back from our goal. Why not rework
> > code so it will work with correct cpu_is_bsp() function instead of
> > introducing this hack?
> 
> If you can confirm that there is a serious use case behind it, I will
> look into this again. But so far, I did not find it.
> 
Firs of all it is correctness issue. We should emulate x86 platform and
nothing there says that BSP apic id is zero. Second part of CPU topology
information is encoded in apic id. i.e when socket/core/ht topology is
used we can't just arbitrary specify apic ids for each logical cpu, we
should follow the rules described in SDM. For instance when more then 16
CPUs are present AMD advices to start numbering apic ids from 16 and leave
first 16 IDs for IOAPICs. And third introduction of this hack shows that
something is done wrong in other places of the code. Somewhere
initialization order is incorrect.

--
			Gleb.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v4 03/10] x86: Extend validity of cpu_is_bsp
  2010-03-04  8:35             ` [Qemu-devel] " Gleb Natapov
@ 2010-03-04 11:35               ` Jan Kiszka
  -1 siblings, 0 replies; 48+ messages in thread
From: Jan Kiszka @ 2010-03-04 11:35 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Avi Kivity, Marcelo Tosatti, kvm, qemu-devel

Gleb Natapov wrote:
> On Thu, Mar 04, 2010 at 09:23:46AM +0100, Jan Kiszka wrote:
>> Gleb Natapov wrote:
>>> On Thu, Mar 04, 2010 at 12:34:22AM +0100, Jan Kiszka wrote:
>>>> Gleb Natapov wrote:
>>>>> On Mon, Mar 01, 2010 at 06:17:22PM +0100, Jan Kiszka wrote:
>>>>>> As we hard-wire the BSP to CPU 0 anyway and cpuid_apic_id equals
>>>>>> cpu_index, cpu_is_bsp can also be based on the latter directly. This
>>>>>> will help an early user of it: KVM while initializing mp_state.
>>>>>>
>>>>>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>>>>>> ---
>>>>>>  hw/pc.c |    3 ++-
>>>>>>  1 files changed, 2 insertions(+), 1 deletions(-)
>>>>>>
>>>>>> diff --git a/hw/pc.c b/hw/pc.c
>>>>>> index b90a79e..58c32ea 100644
>>>>>> --- a/hw/pc.c
>>>>>> +++ b/hw/pc.c
>>>>>> @@ -767,7 +767,8 @@ static void pc_init_ne2k_isa(NICInfo *nd)
>>>>>>  
>>>>>>  int cpu_is_bsp(CPUState *env)
>>>>>>  {
>>>>>> -    return env->cpuid_apic_id == 0;
>>>>>> +    /* We hard-wire the BSP to the first CPU. */
>>>>>> +    return env->cpu_index == 0;
>>>>>>  }
>>>>> We should not assume that. The function was written like that
>>>>> specifically so the code around it will not rely on this assumption.
>>>>> Now you change that specifically to write code that will do incorrect
>>>>> assumptions. I don't see the logic here.
>>>> The logic is that we do not support any other mapping yet - with or
>>>> without this change. Without it, we complicate the APIC initialization
>>>> for (so far) no good reason. Once we want to support different BSP
>>>> assignments, we need to go through the code and rework some parts anyway.
>>>>
>>> As far as I remember the only part that was missing was a command line to
>>> specify apic IDs for each CPU and what CPU is BSP. The code was ready
>>> otherwise. I's very sad if this was broken by other modifications. But
>>> changes like that actually pushes us back from our goal. Why not rework
>>> code so it will work with correct cpu_is_bsp() function instead of
>>> introducing this hack?
>> If you can confirm that there is a serious use case behind it, I will
>> look into this again. But so far, I did not find it.
>>
> Firs of all it is correctness issue. We should emulate x86 platform and
> nothing there says that BSP apic id is zero. Second part of CPU topology
> information is encoded in apic id. i.e when socket/core/ht topology is
> used we can't just arbitrary specify apic ids for each logical cpu, we
> should follow the rules described in SDM. For instance when more then 16
> CPUs are present AMD advices to start numbering apic ids from 16 and leave
> first 16 IDs for IOAPICs. And third introduction of this hack shows that
> something is done wrong in other places of the code. Somewhere
> initialization order is incorrect.

Well, it looks like we need to answer two questions: How shall to user
specify the BSP? And how to reliably map this on QEMU's internal
cpu_index? Depending on this, apic numbering may or may not be an
orthogonal issue.

BTW, do real systems allow to hot plug BSP as well? Or how is the case
handled when you unplug the BSP and then reboot the box?

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [Qemu-devel] Re: [PATCH v4 03/10] x86: Extend validity of cpu_is_bsp
@ 2010-03-04 11:35               ` Jan Kiszka
  0 siblings, 0 replies; 48+ messages in thread
From: Jan Kiszka @ 2010-03-04 11:35 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Marcelo Tosatti, Avi Kivity, kvm, qemu-devel

Gleb Natapov wrote:
> On Thu, Mar 04, 2010 at 09:23:46AM +0100, Jan Kiszka wrote:
>> Gleb Natapov wrote:
>>> On Thu, Mar 04, 2010 at 12:34:22AM +0100, Jan Kiszka wrote:
>>>> Gleb Natapov wrote:
>>>>> On Mon, Mar 01, 2010 at 06:17:22PM +0100, Jan Kiszka wrote:
>>>>>> As we hard-wire the BSP to CPU 0 anyway and cpuid_apic_id equals
>>>>>> cpu_index, cpu_is_bsp can also be based on the latter directly. This
>>>>>> will help an early user of it: KVM while initializing mp_state.
>>>>>>
>>>>>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>>>>>> ---
>>>>>>  hw/pc.c |    3 ++-
>>>>>>  1 files changed, 2 insertions(+), 1 deletions(-)
>>>>>>
>>>>>> diff --git a/hw/pc.c b/hw/pc.c
>>>>>> index b90a79e..58c32ea 100644
>>>>>> --- a/hw/pc.c
>>>>>> +++ b/hw/pc.c
>>>>>> @@ -767,7 +767,8 @@ static void pc_init_ne2k_isa(NICInfo *nd)
>>>>>>  
>>>>>>  int cpu_is_bsp(CPUState *env)
>>>>>>  {
>>>>>> -    return env->cpuid_apic_id == 0;
>>>>>> +    /* We hard-wire the BSP to the first CPU. */
>>>>>> +    return env->cpu_index == 0;
>>>>>>  }
>>>>> We should not assume that. The function was written like that
>>>>> specifically so the code around it will not rely on this assumption.
>>>>> Now you change that specifically to write code that will do incorrect
>>>>> assumptions. I don't see the logic here.
>>>> The logic is that we do not support any other mapping yet - with or
>>>> without this change. Without it, we complicate the APIC initialization
>>>> for (so far) no good reason. Once we want to support different BSP
>>>> assignments, we need to go through the code and rework some parts anyway.
>>>>
>>> As far as I remember the only part that was missing was a command line to
>>> specify apic IDs for each CPU and what CPU is BSP. The code was ready
>>> otherwise. I's very sad if this was broken by other modifications. But
>>> changes like that actually pushes us back from our goal. Why not rework
>>> code so it will work with correct cpu_is_bsp() function instead of
>>> introducing this hack?
>> If you can confirm that there is a serious use case behind it, I will
>> look into this again. But so far, I did not find it.
>>
> Firs of all it is correctness issue. We should emulate x86 platform and
> nothing there says that BSP apic id is zero. Second part of CPU topology
> information is encoded in apic id. i.e when socket/core/ht topology is
> used we can't just arbitrary specify apic ids for each logical cpu, we
> should follow the rules described in SDM. For instance when more then 16
> CPUs are present AMD advices to start numbering apic ids from 16 and leave
> first 16 IDs for IOAPICs. And third introduction of this hack shows that
> something is done wrong in other places of the code. Somewhere
> initialization order is incorrect.

Well, it looks like we need to answer two questions: How shall to user
specify the BSP? And how to reliably map this on QEMU's internal
cpu_index? Depending on this, apic numbering may or may not be an
orthogonal issue.

BTW, do real systems allow to hot plug BSP as well? Or how is the case
handled when you unplug the BSP and then reboot the box?

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v4 03/10] x86: Extend validity of cpu_is_bsp
  2010-03-04 11:35               ` [Qemu-devel] " Jan Kiszka
@ 2010-03-04 12:03                 ` Gleb Natapov
  -1 siblings, 0 replies; 48+ messages in thread
From: Gleb Natapov @ 2010-03-04 12:03 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Avi Kivity, Marcelo Tosatti, kvm, qemu-devel

On Thu, Mar 04, 2010 at 12:35:45PM +0100, Jan Kiszka wrote:
> Gleb Natapov wrote:
> > On Thu, Mar 04, 2010 at 09:23:46AM +0100, Jan Kiszka wrote:
> >> Gleb Natapov wrote:
> >>> On Thu, Mar 04, 2010 at 12:34:22AM +0100, Jan Kiszka wrote:
> >>>> Gleb Natapov wrote:
> >>>>> On Mon, Mar 01, 2010 at 06:17:22PM +0100, Jan Kiszka wrote:
> >>>>>> As we hard-wire the BSP to CPU 0 anyway and cpuid_apic_id equals
> >>>>>> cpu_index, cpu_is_bsp can also be based on the latter directly. This
> >>>>>> will help an early user of it: KVM while initializing mp_state.
> >>>>>>
> >>>>>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> >>>>>> ---
> >>>>>>  hw/pc.c |    3 ++-
> >>>>>>  1 files changed, 2 insertions(+), 1 deletions(-)
> >>>>>>
> >>>>>> diff --git a/hw/pc.c b/hw/pc.c
> >>>>>> index b90a79e..58c32ea 100644
> >>>>>> --- a/hw/pc.c
> >>>>>> +++ b/hw/pc.c
> >>>>>> @@ -767,7 +767,8 @@ static void pc_init_ne2k_isa(NICInfo *nd)
> >>>>>>  
> >>>>>>  int cpu_is_bsp(CPUState *env)
> >>>>>>  {
> >>>>>> -    return env->cpuid_apic_id == 0;
> >>>>>> +    /* We hard-wire the BSP to the first CPU. */
> >>>>>> +    return env->cpu_index == 0;
> >>>>>>  }
> >>>>> We should not assume that. The function was written like that
> >>>>> specifically so the code around it will not rely on this assumption.
> >>>>> Now you change that specifically to write code that will do incorrect
> >>>>> assumptions. I don't see the logic here.
> >>>> The logic is that we do not support any other mapping yet - with or
> >>>> without this change. Without it, we complicate the APIC initialization
> >>>> for (so far) no good reason. Once we want to support different BSP
> >>>> assignments, we need to go through the code and rework some parts anyway.
> >>>>
> >>> As far as I remember the only part that was missing was a command line to
> >>> specify apic IDs for each CPU and what CPU is BSP. The code was ready
> >>> otherwise. I's very sad if this was broken by other modifications. But
> >>> changes like that actually pushes us back from our goal. Why not rework
> >>> code so it will work with correct cpu_is_bsp() function instead of
> >>> introducing this hack?
> >> If you can confirm that there is a serious use case behind it, I will
> >> look into this again. But so far, I did not find it.
> >>
> > Firs of all it is correctness issue. We should emulate x86 platform and
> > nothing there says that BSP apic id is zero. Second part of CPU topology
> > information is encoded in apic id. i.e when socket/core/ht topology is
> > used we can't just arbitrary specify apic ids for each logical cpu, we
> > should follow the rules described in SDM. For instance when more then 16
> > CPUs are present AMD advices to start numbering apic ids from 16 and leave
> > first 16 IDs for IOAPICs. And third introduction of this hack shows that
> > something is done wrong in other places of the code. Somewhere
> > initialization order is incorrect.
> 
> Well, it looks like we need to answer two questions: How shall to user
> specify the BSP? And how to reliably map this on QEMU's internal
> cpu_index? Depending on this, apic numbering may or may not be an
> orthogonal issue.
> 
Two good question :) We can extend -cpu command to let as specify base
apic id for each socket. Apic ids of logical cpus are derived from this
base acpi id depending on where in hierarchy the logical cpu resided.
cpu_index thing in QEMU is pretty messy. The way non x86 arches use it
make it hard to cleanup, so this is why I didn't want to rely on it at
all and use acpi id instead. Thinking about it cpu_is_bsp() should
really check BSP bit in apic base register. 


> BTW, do real systems allow to hot plug BSP as well? Or how is the case
> handled when you unplug the BSP and then reboot the box?
> 
Did you mean hot unplug BSP? OS determines what CPU is BSP by checking
BSP bit in APIC base register. My guess is that there is some pin on CPU
which value is mirrored as BSP bit in APIC base register. Board may have
some logic to check what sockets are populated and chose one of them as
BSP by pulling its pin up. But this is only guess.

--
			Gleb.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [Qemu-devel] Re: [PATCH v4 03/10] x86: Extend validity of cpu_is_bsp
@ 2010-03-04 12:03                 ` Gleb Natapov
  0 siblings, 0 replies; 48+ messages in thread
From: Gleb Natapov @ 2010-03-04 12:03 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Marcelo Tosatti, Avi Kivity, kvm, qemu-devel

On Thu, Mar 04, 2010 at 12:35:45PM +0100, Jan Kiszka wrote:
> Gleb Natapov wrote:
> > On Thu, Mar 04, 2010 at 09:23:46AM +0100, Jan Kiszka wrote:
> >> Gleb Natapov wrote:
> >>> On Thu, Mar 04, 2010 at 12:34:22AM +0100, Jan Kiszka wrote:
> >>>> Gleb Natapov wrote:
> >>>>> On Mon, Mar 01, 2010 at 06:17:22PM +0100, Jan Kiszka wrote:
> >>>>>> As we hard-wire the BSP to CPU 0 anyway and cpuid_apic_id equals
> >>>>>> cpu_index, cpu_is_bsp can also be based on the latter directly. This
> >>>>>> will help an early user of it: KVM while initializing mp_state.
> >>>>>>
> >>>>>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> >>>>>> ---
> >>>>>>  hw/pc.c |    3 ++-
> >>>>>>  1 files changed, 2 insertions(+), 1 deletions(-)
> >>>>>>
> >>>>>> diff --git a/hw/pc.c b/hw/pc.c
> >>>>>> index b90a79e..58c32ea 100644
> >>>>>> --- a/hw/pc.c
> >>>>>> +++ b/hw/pc.c
> >>>>>> @@ -767,7 +767,8 @@ static void pc_init_ne2k_isa(NICInfo *nd)
> >>>>>>  
> >>>>>>  int cpu_is_bsp(CPUState *env)
> >>>>>>  {
> >>>>>> -    return env->cpuid_apic_id == 0;
> >>>>>> +    /* We hard-wire the BSP to the first CPU. */
> >>>>>> +    return env->cpu_index == 0;
> >>>>>>  }
> >>>>> We should not assume that. The function was written like that
> >>>>> specifically so the code around it will not rely on this assumption.
> >>>>> Now you change that specifically to write code that will do incorrect
> >>>>> assumptions. I don't see the logic here.
> >>>> The logic is that we do not support any other mapping yet - with or
> >>>> without this change. Without it, we complicate the APIC initialization
> >>>> for (so far) no good reason. Once we want to support different BSP
> >>>> assignments, we need to go through the code and rework some parts anyway.
> >>>>
> >>> As far as I remember the only part that was missing was a command line to
> >>> specify apic IDs for each CPU and what CPU is BSP. The code was ready
> >>> otherwise. I's very sad if this was broken by other modifications. But
> >>> changes like that actually pushes us back from our goal. Why not rework
> >>> code so it will work with correct cpu_is_bsp() function instead of
> >>> introducing this hack?
> >> If you can confirm that there is a serious use case behind it, I will
> >> look into this again. But so far, I did not find it.
> >>
> > Firs of all it is correctness issue. We should emulate x86 platform and
> > nothing there says that BSP apic id is zero. Second part of CPU topology
> > information is encoded in apic id. i.e when socket/core/ht topology is
> > used we can't just arbitrary specify apic ids for each logical cpu, we
> > should follow the rules described in SDM. For instance when more then 16
> > CPUs are present AMD advices to start numbering apic ids from 16 and leave
> > first 16 IDs for IOAPICs. And third introduction of this hack shows that
> > something is done wrong in other places of the code. Somewhere
> > initialization order is incorrect.
> 
> Well, it looks like we need to answer two questions: How shall to user
> specify the BSP? And how to reliably map this on QEMU's internal
> cpu_index? Depending on this, apic numbering may or may not be an
> orthogonal issue.
> 
Two good question :) We can extend -cpu command to let as specify base
apic id for each socket. Apic ids of logical cpus are derived from this
base acpi id depending on where in hierarchy the logical cpu resided.
cpu_index thing in QEMU is pretty messy. The way non x86 arches use it
make it hard to cleanup, so this is why I didn't want to rely on it at
all and use acpi id instead. Thinking about it cpu_is_bsp() should
really check BSP bit in apic base register. 


> BTW, do real systems allow to hot plug BSP as well? Or how is the case
> handled when you unplug the BSP and then reboot the box?
> 
Did you mean hot unplug BSP? OS determines what CPU is BSP by checking
BSP bit in APIC base register. My guess is that there is some pin on CPU
which value is mirrored as BSP bit in APIC base register. Board may have
some logic to check what sockets are populated and chose one of them as
BSP by pulling its pin up. But this is only guess.

--
			Gleb.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v4 03/10] x86: Extend validity of cpu_is_bsp
  2010-03-04 12:03                 ` [Qemu-devel] " Gleb Natapov
@ 2010-03-07  6:36                   ` Gleb Natapov
  -1 siblings, 0 replies; 48+ messages in thread
From: Gleb Natapov @ 2010-03-07  6:36 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Avi Kivity, Marcelo Tosatti, kvm, qemu-devel

On Thu, Mar 04, 2010 at 02:03:54PM +0200, Gleb Natapov wrote:
> > BTW, do real systems allow to hot plug BSP as well? Or how is the case
> > handled when you unplug the BSP and then reboot the box?
> > 
> Did you mean hot unplug BSP? OS determines what CPU is BSP by checking
> BSP bit in APIC base register. My guess is that there is some pin on CPU
> which value is mirrored as BSP bit in APIC base register. Board may have
> some logic to check what sockets are populated and chose one of them as
> BSP by pulling its pin up. But this is only guess.
> 
Actually this is much more simple:
SDM 8.4.1:
 The MP initialization protocol defines two classes of processors: the
 bootstrap processor (BSP) and the application processors (APs). Following
 a power-up or RESET of an MP system, system hardware dynamically selects
 one of the processors on the system bus as the BSP. The remaining
 processors are designated as APs.
And by "hardware" they mean CPUs themselves over apic BUS.

--
			Gleb.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [Qemu-devel] Re: [PATCH v4 03/10] x86: Extend validity of cpu_is_bsp
@ 2010-03-07  6:36                   ` Gleb Natapov
  0 siblings, 0 replies; 48+ messages in thread
From: Gleb Natapov @ 2010-03-07  6:36 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Marcelo Tosatti, Avi Kivity, kvm, qemu-devel

On Thu, Mar 04, 2010 at 02:03:54PM +0200, Gleb Natapov wrote:
> > BTW, do real systems allow to hot plug BSP as well? Or how is the case
> > handled when you unplug the BSP and then reboot the box?
> > 
> Did you mean hot unplug BSP? OS determines what CPU is BSP by checking
> BSP bit in APIC base register. My guess is that there is some pin on CPU
> which value is mirrored as BSP bit in APIC base register. Board may have
> some logic to check what sockets are populated and chose one of them as
> BSP by pulling its pin up. But this is only guess.
> 
Actually this is much more simple:
SDM 8.4.1:
 The MP initialization protocol defines two classes of processors: the
 bootstrap processor (BSP) and the application processors (APs). Following
 a power-up or RESET of an MP system, system hardware dynamically selects
 one of the processors on the system bus as the BSP. The remaining
 processors are designated as APs.
And by "hardware" they mean CPUs themselves over apic BUS.

--
			Gleb.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v4 03/10] x86: Extend validity of cpu_is_bsp
  2010-03-07  6:36                   ` [Qemu-devel] " Gleb Natapov
@ 2010-03-07 10:48                     ` Jan Kiszka
  -1 siblings, 0 replies; 48+ messages in thread
From: Jan Kiszka @ 2010-03-07 10:48 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Avi Kivity, Marcelo Tosatti, kvm, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 1279 bytes --]

Gleb Natapov wrote:
> On Thu, Mar 04, 2010 at 02:03:54PM +0200, Gleb Natapov wrote:
>>> BTW, do real systems allow to hot plug BSP as well? Or how is the case
>>> handled when you unplug the BSP and then reboot the box?
>>>
>> Did you mean hot unplug BSP? OS determines what CPU is BSP by checking
>> BSP bit in APIC base register. My guess is that there is some pin on CPU
>> which value is mirrored as BSP bit in APIC base register. Board may have
>> some logic to check what sockets are populated and chose one of them as
>> BSP by pulling its pin up. But this is only guess.
>>
> Actually this is much more simple:
> SDM 8.4.1:
>  The MP initialization protocol defines two classes of processors: the
>  bootstrap processor (BSP) and the application processors (APs). Following
>  a power-up or RESET of an MP system, system hardware dynamically selects
>  one of the processors on the system bus as the BSP. The remaining
>  processors are designated as APs.
> And by "hardware" they mean CPUs themselves over apic BUS.

That should be straightforward, will have a look. I just want to ensure
that cpu_is_bsp delivers a valid result all the time. Otherwise we will
run into ordering issues again once some new user of this services shows up.

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 257 bytes --]

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [Qemu-devel] Re: [PATCH v4 03/10] x86: Extend validity of cpu_is_bsp
@ 2010-03-07 10:48                     ` Jan Kiszka
  0 siblings, 0 replies; 48+ messages in thread
From: Jan Kiszka @ 2010-03-07 10:48 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Marcelo Tosatti, Avi Kivity, kvm, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 1279 bytes --]

Gleb Natapov wrote:
> On Thu, Mar 04, 2010 at 02:03:54PM +0200, Gleb Natapov wrote:
>>> BTW, do real systems allow to hot plug BSP as well? Or how is the case
>>> handled when you unplug the BSP and then reboot the box?
>>>
>> Did you mean hot unplug BSP? OS determines what CPU is BSP by checking
>> BSP bit in APIC base register. My guess is that there is some pin on CPU
>> which value is mirrored as BSP bit in APIC base register. Board may have
>> some logic to check what sockets are populated and chose one of them as
>> BSP by pulling its pin up. But this is only guess.
>>
> Actually this is much more simple:
> SDM 8.4.1:
>  The MP initialization protocol defines two classes of processors: the
>  bootstrap processor (BSP) and the application processors (APs). Following
>  a power-up or RESET of an MP system, system hardware dynamically selects
>  one of the processors on the system bus as the BSP. The remaining
>  processors are designated as APs.
> And by "hardware" they mean CPUs themselves over apic BUS.

That should be straightforward, will have a look. I just want to ensure
that cpu_is_bsp delivers a valid result all the time. Otherwise we will
run into ordering issues again once some new user of this services shows up.

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 257 bytes --]

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v4 03/10] x86: Extend validity of cpu_is_bsp
  2010-03-07 10:48                     ` [Qemu-devel] " Jan Kiszka
@ 2010-03-07 13:44                       ` Jan Kiszka
  -1 siblings, 0 replies; 48+ messages in thread
From: Jan Kiszka @ 2010-03-07 13:44 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Avi Kivity, Marcelo Tosatti, kvm, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 1701 bytes --]

Jan Kiszka wrote:
> Gleb Natapov wrote:
>> On Thu, Mar 04, 2010 at 02:03:54PM +0200, Gleb Natapov wrote:
>>>> BTW, do real systems allow to hot plug BSP as well? Or how is the case
>>>> handled when you unplug the BSP and then reboot the box?
>>>>
>>> Did you mean hot unplug BSP? OS determines what CPU is BSP by checking
>>> BSP bit in APIC base register. My guess is that there is some pin on CPU
>>> which value is mirrored as BSP bit in APIC base register. Board may have
>>> some logic to check what sockets are populated and chose one of them as
>>> BSP by pulling its pin up. But this is only guess.
>>>
>> Actually this is much more simple:
>> SDM 8.4.1:
>>  The MP initialization protocol defines two classes of processors: the
>>  bootstrap processor (BSP) and the application processors (APs). Following
>>  a power-up or RESET of an MP system, system hardware dynamically selects
>>  one of the processors on the system bus as the BSP. The remaining
>>  processors are designated as APs.
>> And by "hardware" they mean CPUs themselves over apic BUS.
> 
> That should be straightforward, will have a look. I just want to ensure
> that cpu_is_bsp delivers a valid result all the time. Otherwise we will
> run into ordering issues again once some new user of this services shows up.
> 

...but you have to explain to me how CPU hotplugging in current qemu-kvm
is supposed to work. 'cpu_set n offline' does not have any visible
effect: Linux still allows to bring such a CPU online again, and it is
also brought up again after reset.

Besides this, I would probably need some other OS to test offlining the
BSP - Linux does not appear to support it.

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 257 bytes --]

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [Qemu-devel] Re: [PATCH v4 03/10] x86: Extend validity of cpu_is_bsp
@ 2010-03-07 13:44                       ` Jan Kiszka
  0 siblings, 0 replies; 48+ messages in thread
From: Jan Kiszka @ 2010-03-07 13:44 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Marcelo Tosatti, Avi Kivity, kvm, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 1701 bytes --]

Jan Kiszka wrote:
> Gleb Natapov wrote:
>> On Thu, Mar 04, 2010 at 02:03:54PM +0200, Gleb Natapov wrote:
>>>> BTW, do real systems allow to hot plug BSP as well? Or how is the case
>>>> handled when you unplug the BSP and then reboot the box?
>>>>
>>> Did you mean hot unplug BSP? OS determines what CPU is BSP by checking
>>> BSP bit in APIC base register. My guess is that there is some pin on CPU
>>> which value is mirrored as BSP bit in APIC base register. Board may have
>>> some logic to check what sockets are populated and chose one of them as
>>> BSP by pulling its pin up. But this is only guess.
>>>
>> Actually this is much more simple:
>> SDM 8.4.1:
>>  The MP initialization protocol defines two classes of processors: the
>>  bootstrap processor (BSP) and the application processors (APs). Following
>>  a power-up or RESET of an MP system, system hardware dynamically selects
>>  one of the processors on the system bus as the BSP. The remaining
>>  processors are designated as APs.
>> And by "hardware" they mean CPUs themselves over apic BUS.
> 
> That should be straightforward, will have a look. I just want to ensure
> that cpu_is_bsp delivers a valid result all the time. Otherwise we will
> run into ordering issues again once some new user of this services shows up.
> 

...but you have to explain to me how CPU hotplugging in current qemu-kvm
is supposed to work. 'cpu_set n offline' does not have any visible
effect: Linux still allows to bring such a CPU online again, and it is
also brought up again after reset.

Besides this, I would probably need some other OS to test offlining the
BSP - Linux does not appear to support it.

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 257 bytes --]

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v4 03/10] x86: Extend validity of cpu_is_bsp
  2010-03-07 13:44                       ` [Qemu-devel] " Jan Kiszka
@ 2010-03-07 13:50                         ` Gleb Natapov
  -1 siblings, 0 replies; 48+ messages in thread
From: Gleb Natapov @ 2010-03-07 13:50 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Avi Kivity, Marcelo Tosatti, kvm, qemu-devel

On Sun, Mar 07, 2010 at 02:44:50PM +0100, Jan Kiszka wrote:
> Jan Kiszka wrote:
> > Gleb Natapov wrote:
> >> On Thu, Mar 04, 2010 at 02:03:54PM +0200, Gleb Natapov wrote:
> >>>> BTW, do real systems allow to hot plug BSP as well? Or how is the case
> >>>> handled when you unplug the BSP and then reboot the box?
> >>>>
> >>> Did you mean hot unplug BSP? OS determines what CPU is BSP by checking
> >>> BSP bit in APIC base register. My guess is that there is some pin on CPU
> >>> which value is mirrored as BSP bit in APIC base register. Board may have
> >>> some logic to check what sockets are populated and chose one of them as
> >>> BSP by pulling its pin up. But this is only guess.
> >>>
> >> Actually this is much more simple:
> >> SDM 8.4.1:
> >>  The MP initialization protocol defines two classes of processors: the
> >>  bootstrap processor (BSP) and the application processors (APs). Following
> >>  a power-up or RESET of an MP system, system hardware dynamically selects
> >>  one of the processors on the system bus as the BSP. The remaining
> >>  processors are designated as APs.
> >> And by "hardware" they mean CPUs themselves over apic BUS.
> > 
> > That should be straightforward, will have a look. I just want to ensure
> > that cpu_is_bsp delivers a valid result all the time. Otherwise we will
> > run into ordering issues again once some new user of this services shows up.
> > 
> 
> ...but you have to explain to me how CPU hotplugging in current qemu-kvm
> is supposed to work. 'cpu_set n offline' does not have any visible
> effect: Linux still allows to bring such a CPU online again, and it is
> also brought up again after reset.
Current BIOS does not support cpu hotplug/unplug. If after reset
unplugged vcpu is brought up again then there is a bug there somewhere.

> 
> Besides this, I would probably need some other OS to test offlining the
> BSP - Linux does not appear to support it.
> 

--
			Gleb.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [Qemu-devel] Re: [PATCH v4 03/10] x86: Extend validity of cpu_is_bsp
@ 2010-03-07 13:50                         ` Gleb Natapov
  0 siblings, 0 replies; 48+ messages in thread
From: Gleb Natapov @ 2010-03-07 13:50 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Marcelo Tosatti, Avi Kivity, kvm, qemu-devel

On Sun, Mar 07, 2010 at 02:44:50PM +0100, Jan Kiszka wrote:
> Jan Kiszka wrote:
> > Gleb Natapov wrote:
> >> On Thu, Mar 04, 2010 at 02:03:54PM +0200, Gleb Natapov wrote:
> >>>> BTW, do real systems allow to hot plug BSP as well? Or how is the case
> >>>> handled when you unplug the BSP and then reboot the box?
> >>>>
> >>> Did you mean hot unplug BSP? OS determines what CPU is BSP by checking
> >>> BSP bit in APIC base register. My guess is that there is some pin on CPU
> >>> which value is mirrored as BSP bit in APIC base register. Board may have
> >>> some logic to check what sockets are populated and chose one of them as
> >>> BSP by pulling its pin up. But this is only guess.
> >>>
> >> Actually this is much more simple:
> >> SDM 8.4.1:
> >>  The MP initialization protocol defines two classes of processors: the
> >>  bootstrap processor (BSP) and the application processors (APs). Following
> >>  a power-up or RESET of an MP system, system hardware dynamically selects
> >>  one of the processors on the system bus as the BSP. The remaining
> >>  processors are designated as APs.
> >> And by "hardware" they mean CPUs themselves over apic BUS.
> > 
> > That should be straightforward, will have a look. I just want to ensure
> > that cpu_is_bsp delivers a valid result all the time. Otherwise we will
> > run into ordering issues again once some new user of this services shows up.
> > 
> 
> ...but you have to explain to me how CPU hotplugging in current qemu-kvm
> is supposed to work. 'cpu_set n offline' does not have any visible
> effect: Linux still allows to bring such a CPU online again, and it is
> also brought up again after reset.
Current BIOS does not support cpu hotplug/unplug. If after reset
unplugged vcpu is brought up again then there is a bug there somewhere.

> 
> Besides this, I would probably need some other OS to test offlining the
> BSP - Linux does not appear to support it.
> 

--
			Gleb.

^ permalink raw reply	[flat|nested] 48+ messages in thread

end of thread, other threads:[~2010-03-07 13:50 UTC | newest]

Thread overview: 48+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-03-01 17:17 [PATCH v4 00/10] qemu-kvm: Hook cleanups and yet more use of upstream code Jan Kiszka
2010-03-01 17:17 ` [Qemu-devel] " Jan Kiszka
2010-03-01 17:17 ` [PATCH v4 01/10] qemu-kvm: Add KVM_CAP_X86_ROBUST_SINGLESTEP-awareness Jan Kiszka
2010-03-01 17:17   ` [Qemu-devel] " Jan Kiszka
2010-03-01 17:17 ` [PATCH v4 02/10] qemu-kvm: Rework VCPU state writeback API Jan Kiszka
2010-03-01 17:17   ` [Qemu-devel] " Jan Kiszka
2010-03-01 17:17 ` [PATCH v4 03/10] x86: Extend validity of cpu_is_bsp Jan Kiszka
2010-03-01 17:17   ` [Qemu-devel] " Jan Kiszka
2010-03-03 16:00   ` Gleb Natapov
2010-03-03 16:00     ` [Qemu-devel] " Gleb Natapov
2010-03-03 23:34     ` Jan Kiszka
2010-03-03 23:34       ` [Qemu-devel] " Jan Kiszka
2010-03-04  6:47       ` Gleb Natapov
2010-03-04  6:47         ` [Qemu-devel] " Gleb Natapov
2010-03-04  8:23         ` Jan Kiszka
2010-03-04  8:23           ` [Qemu-devel] " Jan Kiszka
2010-03-04  8:35           ` Gleb Natapov
2010-03-04  8:35             ` [Qemu-devel] " Gleb Natapov
2010-03-04 11:35             ` Jan Kiszka
2010-03-04 11:35               ` [Qemu-devel] " Jan Kiszka
2010-03-04 12:03               ` Gleb Natapov
2010-03-04 12:03                 ` [Qemu-devel] " Gleb Natapov
2010-03-07  6:36                 ` Gleb Natapov
2010-03-07  6:36                   ` [Qemu-devel] " Gleb Natapov
2010-03-07 10:48                   ` Jan Kiszka
2010-03-07 10:48                     ` [Qemu-devel] " Jan Kiszka
2010-03-07 13:44                     ` Jan Kiszka
2010-03-07 13:44                       ` [Qemu-devel] " Jan Kiszka
2010-03-07 13:50                       ` Gleb Natapov
2010-03-07 13:50                         ` [Qemu-devel] " Gleb Natapov
2010-03-01 17:17 ` [PATCH v4 04/10] qemu-kvm: Clean up mpstate synchronization Jan Kiszka
2010-03-01 17:17   ` [Qemu-devel] " Jan Kiszka
2010-03-01 17:17 ` [PATCH v4 05/10] KVM: x86: Restrict writeback of VCPU state Jan Kiszka
2010-03-01 17:17   ` [Qemu-devel] " Jan Kiszka
2010-03-01 17:17 ` [PATCH v4 06/10] qemu-kvm: Use VCPU event state for reset and vmsave/load Jan Kiszka
2010-03-01 17:17   ` [Qemu-devel] " Jan Kiszka
2010-03-01 17:17 ` [PATCH v4 07/10] qemu-kvm: Cleanup/fix TSC and PV clock writeback Jan Kiszka
2010-03-01 17:17   ` [Qemu-devel] " Jan Kiszka
2010-03-01 17:17 ` [PATCH v4 08/10] qemu-kvm: Clean up KVM's APIC hooks Jan Kiszka
2010-03-01 17:17   ` [Qemu-devel] " Jan Kiszka
2010-03-01 17:17 ` [PATCH v4 09/10] qemu-kvm: Move kvm_set_boot_cpu_id Jan Kiszka
2010-03-01 17:17   ` [Qemu-devel] " Jan Kiszka
2010-03-01 17:17 ` [PATCH v4 10/10] qemu-kvm: Bring qemu_init_vcpu back home Jan Kiszka
2010-03-01 17:17   ` [Qemu-devel] " Jan Kiszka
2010-03-01 17:21 ` [PATCH v4 00/10] qemu-kvm: Hook cleanups and yet more use of upstream code Jan Kiszka
2010-03-01 17:21   ` [Qemu-devel] " Jan Kiszka
2010-03-01 20:01 ` Marcelo Tosatti
2010-03-01 20:01   ` [Qemu-devel] " Marcelo Tosatti

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.